Mathematische Geodäsie/Mathematical Geodesy: Handbuch der Geodäsie, herausgegeben von Willi Freeden und Reiner Rummel [1. Aufl.] 9783662558539, 9783662558546

Self-consistent presentation of key and transfer methodologies from the reality space of geodetic measurements and obs…


Language: German; English · Pages: XV, 1463 [1459] · Year: 2020


Table of contents :
Front Matter ....Pages I-XV
Front Matter ....Pages 1-1
Mathematical Geodesy (Willi Freeden, Michael Schreiner)....Pages 3-63
Inverse Probleme der Geodäsie (Willi Freeden, Helga Nutz)....Pages 65-90
Up and Down Through the Gravity Field (F. Sansó, M. Capponi, D. Sampietro)....Pages 91-144
Front Matter ....Pages 145-145
Spherical Harmonics, Splines, and Wavelets (Willi Freeden, Michael Schreiner)....Pages 147-193
A Mathematical View on Spin-Weighted Spherical Harmonics and Their Applications in Geodesy (Volker Michel, Katrin Seibert)....Pages 195-307
Reconstruction and Decomposition of Scalar and Vectorial Potential Fields on the Sphere (Christian Gerhards, Roger Telschow)....Pages 309-349
Ellipsoidal-Spheroidal Representation of the Gravity Field (Erik W. Grafarend)....Pages 351-442
Front Matter ....Pages 443-443
Monte Carlo Methods (Karl-Rudolf Koch)....Pages 445-475
Parameter Estimation, Variance Components and Statistical Analysis in Errors-in-Variables Models (Peiliang Xu)....Pages 477-533
Front Matter ....Pages 535-535
Fast Harmonic/Spherical Splines and Parameter Choice Methods (Martin Gutting)....Pages 537-574
Numerical Methods for Solving the Oblique Derivative Boundary Value Problems in Geodesy (Róbert Čunderlík, Marek Macák, Matej Medl’a, Karol Mikula, Zuzana Minarechová)....Pages 575-622
Front Matter ....Pages 623-623
Geodetic Methods for Monitoring Crustal Motion and Deformation (Athanasios Dermanis)....Pages 625-695
Theory and Realization of Reference Systems (Athanasios Dermanis)....Pages 697-823
Front Matter ....Pages 825-825
From Gaussian Least Squares Approximation to Today’s Operator-Theoretic Regularization of Ill-Posed Problems (Willi Freeden, Bertold Witte)....Pages 827-929
The Numerical Treatment of Covariance Stationary Processes in Least Squares Collocation (Wolf-Dieter Schuh, Jan Martin Brockmann)....Pages 931-966
Front Matter ....Pages 967-967
Inverse Gravimetry: Density Signatures from Gravitational Potential Data (Willi Freeden, M. Zuhair Nashed)....Pages 969-1052
From Newton’s Law of Gravitation to Multiscale Geoidal Determination and Ocean Circulation Modeling (Willi Freeden, Helga Nutz)....Pages 1053-1116
Gravimetric Measurements, Gravity Anomalies, Geoid, Quasigeoid: Theoretical Background and Multiscale Modeling (Gerhard Berg, Christian Blick, Matthias Cieslack, Willi Freeden, Zita Hauler, Helga Nutz)....Pages 1117-1181
Front Matter ....Pages 1183-1183
Satellite Gravitational Gradiometry: Methodological Foundation and Geomathematical Advances (Willi Freeden, Helga Nutz, Reiner Rummel, Michael Schreiner)....Pages 1185-1255
Very Long Baseline Interferometry (Axel Nothnagel)....Pages 1257-1314
Elementary Mathematical Models for GNSS Positioning (Bernhard Hofmann-Wellenhof)....Pages 1315-1448
Back Matter ....Pages 1449-1463


Springer Reference Naturwissenschaften

Willi Freeden (Ed.)

Mathematische Geodäsie/ Mathematical Geodesy Handbuch der Geodäsie, herausgegeben von Willi Freeden und Reiner Rummel

Springer Reference Naturwissenschaften

Springer Reference Naturwissenschaften offers practitioners, scientists, and students targeted expert knowledge in an up-to-date, modularized, and accessible form. Whereas traditional handbooks have presented their contents only in bundled, static printed editions, Springer Reference Naturwissenschaften complements them with a dynamic online presence: permanent digital availability, early "online first" publication of new contributions, and continuous extension and updating of the contents. The works and contributions of the series represent the current state of knowledge in the respective field, in particular also in its applications and in the interlocking of its various areas. Review processes ensure quality through the active involvement of renowned editors and selected authors. Springer Reference Naturwissenschaften grows continuously with new chapters and subject areas. Further volumes in the series: http://www.springer.com/series/15072

Willi Freeden (Ed.)

Mathematische Geodäsie/ Mathematical Geodesy Handbuch der Geodäsie, herausgegeben von Willi Freeden und Reiner Rummel

With 271 figures and 55 tables

Editor: Willi Freeden, Geomathematics Group, Mathematics Department, University of Kaiserslautern, Kaiserslautern, Germany

ISSN 2522-8161    ISSN 2522-817X (eBook)
Springer Reference Naturwissenschaften
ISBN 978-3-662-55853-9    ISBN 978-3-662-55854-6 (eBook)
https://doi.org/10.1007/978-3-662-55854-6

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

Springer Spektrum
© Springer-Verlag GmbH Deutschland, part of Springer Nature 2020
This work, including all of its parts, is protected by copyright. Any use not expressly permitted by copyright law requires the prior consent of the publisher. This applies in particular to reproduction, adaptation, translation, microfilming, and storage and processing in electronic systems. The reproduction of generally descriptive designations, trademarks, company names, etc. in this work does not imply that these may be used freely by anyone. The right of use is subject to the rules of trademark law, even without a separate notice to this effect; the rights of the respective trademark owner must be observed. The publisher, the authors, and the editors assume that the statements and information in this work are complete and correct at the time of publication. Neither the publisher nor the authors or editors give any warranty, express or implied, for the content of the work or for any errors or statements. The publisher remains neutral with regard to geographic designations and territorial names in published maps and institutional addresses.

Editorial: Stephanie Preuß
Springer Spektrum is an imprint of the registered company Springer-Verlag GmbH, DE, and is part of Springer Nature. The address of the company is: Heidelberger Platz 3, 14197 Berlin, Germany.

Preface

In our twenty-first century, geodesy is strongly influenced by two scenarios: First, technological progress, in particular space observation, has opened up fundamentally new measurement methods. Second, high-speed computers have led to a strong "mathematization". As a consequence, today's geodesy is in a great transitional phase. In future, however, the width and depth of new geodetic challenges will simultaneously require an intense understanding and a broad analysis of all technologically as well as mathematically driven components. These requirements are inevitable in order to provide essential results and improvements in diverse research fields of geodetically significant public concern for our planet, such as ice mass balancing, ocean current systems, climate change, the expected shortage of natural resources, etc. The "Handbook of Mathematical Geodesy" deals with mathematics as the key technology for modeling and simulation purposes as well as for the analysis and interpretation of geodetic measurements and observations. It supplies cutting-edge and pioneering knowledge as a transfer methodology from the "reality space" of measurements to the "model space" of mathematical structures and solutions, and vice versa. In accordance with the essence of geodetic research, interest in studying the gravitational field is restricted to the macroscopic sense, so that the quantum behavior of gravitation need not be taken into account. Hence, in geodetic studies of the Earth's gravity field, only velocities substantially smaller than the speed of light are considered, so that Newtonian physics can be used safely. In detail, this handbook is concerned with the following selection of topical areas: general scientific background of mathematical geodesy, inverse problems, least squares methods, multiscale methods, statistical methods, special-functions-based methods, constructive approximation and numerical methods, reference systems and monitoring methods, satellite and space techniques.
The objective of the handbook is twofold: on the one hand, the book serves as a reference work for geodesists interested in applying innovative modeling and simulation tools to promising data sets and structures. On the other hand, it serves as a collection of newsworthy material for all members of the mathematical community concerned with any of the diverse problems relevant in modern geodesy.


For both groups the handbook offers important perspectives and challenges in crossing the traditional scientific frontiers.

The editor wishes to express his deep gratitude to the people who not only made this handbook possible but also brought it to a most satisfying completion:

• The contributors to the handbook, who dedicated much time, effort, and creative energy to the project.
• The staff at Springer, particularly Dr. Stephanie Preuß, Kavitha Janarthanan, Steffen Rieck, Feray Steinhart, Julia Körting, and Gabriele McLemore, who gave a great deal of encouragement and advice.

Thank you very much for all the exceptional efforts and support in creating a work that offers exciting discoveries and impressive progress. Indeed, it is our hope that the "Handbook of Mathematical Geodesy" will stimulate powerful inspiration and new research achievements in geodesy as well as in mathematics.

Kaiserslautern, August 2019

Willi Freeden

Vorwort (Preface)

In our twenty-first century, geodesy is strongly influenced by two scenarios: on the one hand, technological progress, in particular space-based observation, has opened up fundamentally new measurement methods; on the other hand, high-performance computers have led to a strong "mathematization". As a consequence, geodesy is in a great phase of transition and development. In future, however, the extent and depth of new geodetic challenges will simultaneously require an intense understanding and a broad analysis of all technological and mathematical driving components. These requirements will become inevitable in order to achieve essential results and improvements in diverse matters of public interest in which geodesy is significantly involved, such as ice mass balancing, ocean current systems, climate change, the expected shortage of natural resources, etc. The "Handbuch der Mathematischen Geodäsie" deals with mathematics as the key technology for modeling and simulation purposes as well as for the analysis and interpretation of geodetic measurements and observations. It supplies pioneering and groundbreaking mathematical knowledge as a transfer methodology from the "reality space" of geodetic measurements to the "model space" of mathematical structures and solutions, and vice versa. In accordance with the essence of geodetic research, interest in the gravitational field is devoted exclusively to the macroscopic sense, without taking its quantum behavior into account. In geodetically reflected studies of the gravity field, therefore, only velocities substantially smaller than the speed of light are considered, so that Newtonian physics can be applied without reservation.
In detail, this handbook is concerned with the following selection of topics: general scientific background of mathematical geodesy, inverse problems, least squares methods, multiscale methods, statistical methods, reference systems and monitoring methods, satellite and space techniques. The objective of the handbook is twofold: on the one hand, it represents a reference work for geodesists interested in applying innovative modeling and simulation tools to promising data sets and structures. On the other hand, it serves as a collection of up-to-date material for mathematicians concerned with the diverse problems of modern geodesy. For both groups the handbook offers important perspectives and challenges in crossing the traditional scientific frontiers.

The editor wishes to express his special thanks to all who not only made the handbook possible but also realized it most satisfactorily:

• The authors of the contributions to the handbook, who invested much time, effort, and creative energy in the project.
• The staff at Springer, in particular Dr. Stephanie Preuß, Kavitha Janarthanan, Steffen Rieck, Feray Steinhart, Julia Körting, and Gabriele McLemore, who gave much encouragement and advice.

Many thanks for the exceptional efforts and support in creating a work that opens up exciting discoveries and impressive progress. Indeed, it is our hope that the "Handbuch der Mathematischen Geodäsie" will stimulate powerful inspiration and new research achievements in geodesy as well as in mathematics.

Kaiserslautern, August 2019

Willi Freeden

Contents

Band 1

Part I  Background . . . 1

1  Mathematical Geodesy: Its Role, Its Aim, and Its Potential . . . 3
   Willi Freeden and Michael Schreiner

2  Inverse Probleme der Geodäsie: Ein Abriss mathematischer Lösungsstrategien . . . 65
   Willi Freeden and Helga Nutz

3  Up and Down Through the Gravity Field . . . 91
   F. Sansó, M. Capponi, and D. Sampietro

Part II  Special Functions Based Methods . . . 145

4  Spherical Harmonics, Splines, and Wavelets: Definitoric Constituents, Strategic Perspectives, Specific Applicabilities and Applications . . . 147
   Willi Freeden and Michael Schreiner

5  A Mathematical View on Spin-Weighted Spherical Harmonics and Their Applications in Geodesy . . . 195
   Volker Michel and Katrin Seibert

6  Reconstruction and Decomposition of Scalar and Vectorial Potential Fields on the Sphere: A Brief Overview . . . 309
   Christian Gerhards and Roger Telschow

7  Ellipsoidal-Spheroidal Representation of the Gravity Field . . . 351
   Erik W. Grafarend

Part III  Statistical Methods . . . 443

8  Monte Carlo Methods . . . 445
   Karl-Rudolf Koch

9  Parameter Estimation, Variance Components and Statistical Analysis in Errors-in-Variables Models . . . 477
   Peiliang Xu

Part IV  Approximation and Numerical Methods . . . 535

10  Fast Harmonic/Spherical Splines and Parameter Choice Methods . . . 537
    Martin Gutting

11  Numerical Methods for Solving the Oblique Derivative Boundary Value Problems in Geodesy . . . 575
    Róbert Čunderlík, Marek Macák, Matej Medl’a, Karol Mikula, and Zuzana Minarechová

Part V  Reference Systems and Monitoring Methods . . . 623

12  Geodetic Methods for Monitoring Crustal Motion and Deformation . . . 625
    Athanasios Dermanis

13  Theory and Realization of Reference Systems . . . 697
    Athanasios Dermanis

Band 2

Part VI  Inverse Problems and Least Squares Methods . . . 825

14  From Gaussian Least Squares Approximation to Today’s Operator-Theoretic Regularization of Ill-Posed Problems . . . 827
    Willi Freeden and Bertold Witte

15  The Numerical Treatment of Covariance Stationary Processes in Least Squares Collocation . . . 931
    Wolf-Dieter Schuh and Jan Martin Brockmann

Part VII  Inverse Problems and Multiscale Methods . . . 967

16  Inverse Gravimetry: Density Signatures from Gravitational Potential Data . . . 969
    Willi Freeden and M. Zuhair Nashed

17  From Newton’s Law of Gravitation to Multiscale Geoidal Determination and Ocean Circulation Modeling . . . 1053
    Willi Freeden and Helga Nutz

18  Gravimetric Measurements, Gravity Anomalies, Geoid, Quasigeoid: Theoretical Background and Multiscale Modeling . . . 1117
    Gerhard Berg, Christian Blick, Matthias Cieslack, Willi Freeden, Zita Hauler, and Helga Nutz

Part VIII  Methods for Satellite and Space Techniques . . . 1183

19  Satellite Gravitational Gradiometry: Methodological Foundation and Geomathematical Advances . . . 1185
    Willi Freeden, Helga Nutz, Reiner Rummel, and Michael Schreiner

20  Very Long Baseline Interferometry . . . 1257
    Axel Nothnagel

21  Elementary Mathematical Models for GNSS Positioning . . . 1315
    Bernhard Hofmann-Wellenhof

Index . . . 1449

List of Contributors

Gerhard Berg  Landesamt für Vermessung und Geobasisinformation Rheinland-Pfalz, Koblenz, Germany
Christian Blick  CBM – Gesellschaft für Consulting, Business und Management mbH, Bexbach, Germany
Jan Martin Brockmann  Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany
M. Capponi  Department of Civil, Constructional and Environmental Engineering, Università di Roma La Sapienza, Rome, Italy; Department of Civil and Environmental Engineering, Politecnico di Milano, Milan, Italy
Matthias Cieslack  Landesamt für Vermessung und Geobasisinformation Rheinland-Pfalz, Koblenz, Germany
Róbert Čunderlík  Faculty of Civil Engineering, Department of Mathematics and Descriptive Geometry, Slovak University of Technology, Bratislava, Slovakia
Athanasios Dermanis  Department of Geodesy and Surveying (DGS), Aristotle University of Thessaloniki, Thessaloniki, Greece
Willi Freeden  Geomathematics Group, Mathematics Department, University of Kaiserslautern, Kaiserslautern, Germany
Christian Gerhards  Geomathematics and Geoinformatics Group, TU Bergakademie Freiberg, Freiberg, Germany
Erik W. Grafarend  Department of Geodesy and Geoinformatics, Faculty of Aerospace Technology and Geodesy, Faculty of Mathematics and Physics, Faculty of Civil Engineering and Environment Engineering Sciences, Stuttgart University, Stuttgart, Germany
Martin Gutting  Department of Biotechnology, Mannheim University of Applied Sciences, Mannheim, Germany
Zita Hauler  CBM – Gesellschaft für Consulting, Business und Management mbH, Bexbach, Germany
Bernhard Hofmann-Wellenhof  Working Group Navigation, Institute of Geodesy, Graz University of Technology, Graz, Austria
Karl-Rudolf Koch  Institute for Geodesy and Geoinformation, Theoretical Geodesy Group, University of Bonn, Bonn, Germany
Marek Macák  Faculty of Civil Engineering, Department of Mathematics and Descriptive Geometry, Slovak University of Technology, Bratislava, Slovakia
Matej Medl’a  Faculty of Civil Engineering, Department of Mathematics and Descriptive Geometry, Slovak University of Technology, Bratislava, Slovakia
Volker Michel  Geomathematics Group, University of Siegen, Siegen, Germany
Karol Mikula  Faculty of Civil Engineering, Department of Mathematics and Descriptive Geometry, Slovak University of Technology, Bratislava, Slovakia
Zuzana Minarechová  Faculty of Civil Engineering, Department of Mathematics and Descriptive Geometry, Slovak University of Technology, Bratislava, Slovakia
M. Zuhair Nashed  Department of Mathematics, University of Central Florida, Orlando, FL, USA
Axel Nothnagel  VLBI Research Group, Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany
Helga Nutz  CBM – Gesellschaft für Consulting, Business und Management mbH, Bexbach, Germany
Reiner Rummel  Astronomical and Physical Geodesy, TU Munich, Munich, Germany
D. Sampietro  Geomatics Research & Development s.r.l., Como, Italy
F. Sansó  Department of Civil and Environmental Engineering, Politecnico di Milano, Milan, Italy
Michael Schreiner  Institute for Computational Engineering, University of Applied Sciences of Technology NTB, Buchs, Switzerland
Wolf-Dieter Schuh  Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany
Katrin Seibert  Geomathematics Group, University of Siegen, Siegen, Germany
Roger Telschow  Computational Science Center, University of Vienna, Vienna, Austria
Bertold Witte  Institute for Geodesy and Geoinformation, University of Bonn, Bonn, Germany
Peiliang Xu  Disaster Prevention Research Institute, Kyoto University, Uji, Kyoto, Japan

Part I Background

1 Mathematical Geodesy: Its Role, Its Aim, and Its Potential

Willi Freeden and Michael Schreiner

Contents

1 Introduction . . . 4
2 Geodesy: Its Role, Its Aim, Its Potential . . . 8
3 Geomathematics: Its Role, Its Aim, Its Potential . . . 35
4 Mathematical Geodesy: The Circuit . . . 56
Literature . . . 60

Abstract During the last decades, geodesy was influenced by two essential scenarios: First, technological progress has completely changed the observational and measurement techniques; modern high-speed computers and satellite-based techniques have increasingly taken hold. Second, there has been a growing public concern about the future of our planet, i.e., the change of its climate, the concerns of its environment, and an expected shortage of its natural resources.

This chapter is part of the series Handbuch der Geodäsie, volume “Mathematische Geodäsie/ Mathematical Geodesy”, edited by Willi Freeden, Kaiserslautern. W. Freeden () Geomathematics Group, Mathematics Department, University of Kaiserslautern, Kaiserslautern, Germany E-Mail: [email protected]; [email protected] M. Schreiner Institute for Computational Engineering, University of Applied Sciences of Technology NTB, Buchs, Switzerland E-Mail: [email protected] © Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020 W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_91


Simultaneously, all these aspects have implied, and still imply, a strong need for adequate mathematical structures, tools, and methods, i.e., geomathematics. This contribution deals with today's methodological components of the circuit of mathematical geodesy, which characterizes the interrelations between geodesy and geomathematics with respect to origin and foundation, constituting ingredients, scientific role, and perspective potential. This introductory contribution represents a collection of known ideas and concepts from different sources in the geodetic and geomathematical literature, however in a new, consistent setup and an innovatively structured form.

Zusammenfassung (Abstract) During the last decades, geodesy was influenced by two essential scenarios: on the one hand, technological progress has completely changed the observation and measurement methods; modern high-performance computers and satellite-based techniques have increasingly taken hold. On the other hand, there has been a growing public concern about the future of our planet, i.e., the change of its climate, the concerns of its environment, and the expected shortage of its natural resources. Simultaneously, all these aspects have implied, and still imply, a strong need for adequate mathematical structures, tools, and methods, in short, for geomathematics. The present contribution deals with today's methodological components of the circuit of mathematical geodesy, which characterizes the mutual interrelation of geodesy and geomathematics with respect to origin and foundation, constituting ingredients, scientific role, and perspective potential. This introductory contribution represents a collection of known ideas and concepts from various sources in the geodetic and geomathematical literature, however in a novel, consistent compilation and an innovatively structured form.

Keywords Earth system · Geodesy · Geomathematics · Mutual scientific interplay · Mathematical geodesy as circuit

1 Introduction

Modern humankind is changing essential components of the Earth’s system including the climate system, without sufficient understanding of the constituting
ingredients today. There is an urgent need for a better knowledge of the Earth’s system and the interrelations among its components. This need cannot be satisfied without two requirements: geodetic measurements and observations of a large set of influencing parameters, and mathematical concepts and models characterizing the Earth’s system (Fig. 1). In this respect, the United Nations General Assembly adopted the resolution on a Global Geodetic Reference Frame for Sustainable Development (A/RES/69/266) on February 26, 2015. The purpose of this document is a description of the Global Geodetic Reference Frame (GGRF), along with a brief description of its key components, as a realization of the Global Geodetic Reference System: The GGRF is intended to support the increasing demand for positioning, navigation, timing, mapping, and geoscience applications. The GGRF is essential for a reliable determination of changes in the Earth system, for natural disaster management, for monitoring sea-level rise and climate change, and to provide accurate information for decision-makers. Furthermore, due to globalization and interoperability requirements, there is a growing demand for spatial data infrastructure. Precise spatial information is needed in many areas of benefit to society, including transportation, construction, infrastructure, process control, surveying and mapping, and Earth sciences, and is especially important for monitoring progress towards the UN’s Sustainable Development Goals.

A constituting element of geodetic Earth system research is to explore and exploit the close connections between the process of geodetic observation and the mathematical modeling of Earth system constituents.

Fig. 1 The Earth (as seen in 1972 from the Apollo 17 mission) is in change. Current research topics essentially include its mantle and crust, the ocean currents, hydrology, and deformations of the Earth [63]

The processes and phenomena
of the Earth system manifest themselves in geodetic parameters. Geodetic observations determine parameter sets and time series about the topographic geometry, the physical geometry including the gravity field, and the Earth’s rotation and orientation. These geodetic parameters are incorporated into models and simulations of the Earth system processes and thereby help to understand and improve the knowledge about the Earth system. Conversely, models and simulations enable an improved mathematical analysis and a proper interpretation and application of geodetic measurements, which in turn provide innovations in geodetic measurement and observation techniques. There is thus a circuit of geodetic measurements and observations on the one hand and geomathematical structures and concepts on the other. Its overall aim is to get more accurate and more consistent results for Earth system research. Concerning the topographic geometry, the surface of the Earth is an almost perfect sphere. Deviations from its spherical shape are less than 0.4% of its radius (6371 km). The deviations essentially arise from its rotation, producing an almost ellipsoidal shape symmetric about the rotation axis. Relative to an ellipsoid the deviations are less than 10 km. The topographic features, mountains and valleys, are very irregular. They can be captured in any detail only by direct measurement. No simple generic mechanism exists that permits inference of the topography as a whole from a few selected points. In this respect the shape of the Earth, as defined by its gravity field, is more representative. It should be noted that gravity as observed on the Earth’s surface is the combined effect of the gravitational mass attraction and the centrifugal force due to the Earth’s rotation. The force of gravity provides a directional structure to the space above the Earth’s surface. It is tangential to the vertical plumb lines and perpendicular to all level surfaces.
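The magnitudes quoted above can be verified with a few lines of arithmetic. The sketch below assumes the standard WGS84 ellipsoid axes and the conventional Earth rotation rate; these constants are not given in the text itself and serve purely as illustration:

```python
# A minimal numerical sketch (assumption: standard WGS84 ellipsoid parameters
# and the conventional Earth rotation rate) checking two magnitudes from the
# text: the deviation of the Earth's shape from a sphere, and the size of the
# centrifugal contribution to gravity.

A = 6378.137          # WGS84 semi-major (equatorial) axis in km
B = 6356.752          # WGS84 semi-minor (polar) axis in km
MEAN_RADIUS = 6371.0  # mean Earth radius in km

# Equator-to-pole difference of the radii, relative to the mean radius:
deviation_km = A - B
relative = deviation_km / MEAN_RADIUS
print(f"equator-pole radius difference: {deviation_km:.1f} km "
      f"({100 * relative:.2f} % of the mean radius)")

# Centrifugal acceleration at the equator, omega^2 * a, compared with the
# mean surface gravity of roughly 9.81 m/s^2:
OMEGA = 7.292115e-5   # Earth rotation rate in rad/s
centrifugal = OMEGA**2 * A * 1000.0   # in m/s^2
print(f"centrifugal acceleration at the equator: {centrifugal:.4f} m/s^2")
```

The equator-to-pole radius difference of roughly 21 km is indeed well below the 0.4% bound stated above, and the centrifugal term is only a few thousandths of the gravitational attraction, which is why it appears as a small directional correction in the combined force of gravity.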
Any water surface at rest is part of a level surface, characterizing the physical geometry of the Earth. As a consequence, the measurement and representation of the gravity field are a major object of geodetic activities. In this respect it must be emphasized that it is not the objective of mathematical geodesy to go into the nature of gravitation. This would be a subject of physics and would require a study of Newton’s and Einstein’s theories of gravitation and of more recent developments in this field. Geodesy would rather like to get a feeling of how observations are affected by the gravitational field and how we can make use of this fact. Thereby, two basic attitudes can be taken in mathematical geodesy: either the influencing effect of the gravitational field is considered a calamity and one wants to eliminate it, or one takes a more positive attitude and tries to make use of the gravitational effect on geodetic observations and models. In both cases the goal is to come to a more detailed understanding. Future contributions will essentially be required to provide a survey of innovative mathematical instruments and concepts in concrete adaptation and application to recent demands and to the progress of gravity-field-related measurements in geodesy. Today, the knowledge of the gravity field of the Earth is of great importance for many applications (see Fig. 2), from which we mention only five significant examples (cf. [7, 53]):


• Geodesy and Civil Engineering. Accurate heights are needed for civil constructions, mapping, etc. They are obtained by leveling, a very time-consuming and expensive procedure. Today’s geometric heights can be obtained fast and efficiently from space positioning (for example, by the global navigation satellite systems, GNSS). The geometric heights are convertible to leveled heights by subtracting the precise geoid, which is implied by a high-resolution gravitational potential. To be more specific, in those areas where good gravity information is already available, the future data situation will eliminate all medium- and long-wavelength distortions in unsurveyed areas. GNSS (i.e., NAVSTAR GPS, USA; GLONASS, Russian Federation; GALILEO, EU; BeiDou, People’s Republic of China) together with the planned explorer satellite missions will provide high-quality height information at global scale.
• Solid Earth Physics. The gravity anomaly field has its origin mainly in mass inhomogeneities of the continental and oceanic lithosphere. Together with height information and regional tomography, a much deeper understanding of tectonic processes should be obtainable.

Fig. 2 The knowledge of the Earth’s gravity field and its essential importance for many applications (a diagram linking the gravity field to geodesy and civil engineering, solid Earth physics, physical oceanography, the Earth climate system, and exploration geophysics)


• Physical Oceanography. Altimeter satellites in combination with a precise geoid will deliver global dynamic ocean topography. From the global surface circulation and its variations in time, the ocean topography can be computed, resulting in a completely new dimension of accuracy. Surface circulation allows the determination of transport processes of, e.g., polluted material.
• Earth Climate System. There is a growing awareness of global environmental problems (for example, the CO2 question, the rapid decrease of rain forests, global sea level changes, etc.). What is the role of the future airborne methods and satellite missions in this context? They do not tell us the reasons for physical processes, but it is essential to bring the phenomena into one system (e.g., to make sea level records comparable in different parts of the world). In other words, the geoid is viewed as an almost static reference for many rapidly changing processes and at the same time as a “frozen picture” of tectonic processes that evolved over geological time spans.
• Exploration Geophysics. Airborne and satellite gravity measurements have usually been used together with aeromagnetic surveys, but the poor precision of airborne as well as spaceborne gravity measurements has hindered a wider use of this type of measurement. Strong improvements can be expected from the future scenario. Airborne as well as spaceborne gravity, of course, has the great advantage that measurements of the gravity field are not restricted to certain areas. Furthermore, knowledge of regional geologic structures can be gained by means of gravity data. For purposes of exploration, however, the determination of the absolute gravity field is of little significance, as are gravity anomalies of a dimension very much greater than the gravity anomalies caused by, e.g., oil, gas, and water structures. The fundamental interest in gravitational methods in exploration today is based on terrestrial measurements of small variations.

2 Geodesy: Its Role, Its Aim, Its Potential

The word “geodesy” etymologically comes from the Ancient Greek γεωδαισία (literally, “division of the Earth”). Following F.R. Helmert’s definition [31], geodesy is the discipline that deals with the measurement and representation of the Earth, including the gravity field, in a three-dimensional time-varying space (this definition encompasses all of surveying and its techniques). Geodesy is usually subdivided into geometrical geodesy, physical geodesy, and satellite geodesy, although alternative subdivisions are recognized as well. Geometrical geodesy is concerned with describing locations in terms of geometry. The topographic figure is represented by coordinates of points and by maps. Coordinate systems are one of the primary products of geometrical geodesy. Physical geodesy deals with the determination of the Earth’s gravity field, which, amongst others, is necessary for establishing heights (cf. Fig. 3). The physical shape of the Earth, as expressed by its gravity field, is represented in the form of level surfaces relative to a global reference body. The directions to the stars and the orbits of satellites provide an external reference. The changing orientation of the Earth with respect to

1 Mathematical Geodesy

9

the directions to stars is expressed by the Earth’s slightly variable spin rate, polar motion, and precession-nutation. Satellite geodesy is concerned with using orbiting satellites to obtain data for geodetic purposes. Once again, it should be pointed out that the Earth is a dynamic planet in permanent change, due to large-scale internal convective material and energy rearrangement processes, as well as manifold external effects. We can, therefore, only understand the Earth as our living environment if we consider it as a complex system of all its interacting components. The processes running on the Earth are coupled with one another, forming ramified chains of cause and effect which are additionally influenced by men who intervene in the natural balances and circuits. However, knowledge of these chains of cause and effect remains incomplete to a large extent. In adequate time, substantial improvements can only be reached by the exploitation of new measurement and observation methods. The subsystem originated and influenced by the gravity field, giving information about mass distribution and mass transport (see Table 1), is an eminently important issue. The gravity field is one of the key components of future investigation. Even better, it can be expected that the design and development of new sensors such as optical clocks and quantum gravimeters, and the use of glass fibers, open an auspicious geodetic epoch, thereby offering the combined utilization of relativistic geodesy with related classical measurement techniques in the near future. Seen from a numerical point of view, it must be remarked for future work that combining data from different sensors and sources is the way forward. Only coordinated research between geodesy, geophysics, and geomathematics will provide a breakthrough in understanding and modeling of important processes in the system Earth.

There is an increasing interest of geodesy in anthropogenic effects penetrating the system Earth. Specific interest lies in the separation of anthropogenic and natural variations and in the detection of their interdependencies. Traditional geodetic methods will not be able to master the new amounts of data, either theoretically or numerically – especially considering the important aspect of a more intensively localized treatment with respect to space and time, consistently embedded into a global concept. The integration demands three essential characteristics: good modeling properties, appropriate decorrelation ability, and fast algorithms. These characteristics are the key to a variety of abilities and new research directions. In particular, they explain the strong need for geomathematical research.

Table 1 Scientific use of gravity field observables
• Solid Earth: crustal density, post-glacial rebound, exploration
• Oceanography: dynamic topography, heat transport
• Glaciology: bedrock topography, mass transport, flux
• Geodesy: levelling (GNSS), height systems, orbit determination
• Climate: sea level changes, coastal zones

2.1 Geometrical Geodesy

Cartographic products are flat, whether they are paper maps or images on a computer monitor. Humans tend to think of distances as straight-line distances. The convenience of a flat model of space is clear. Regardless of our perceptions, however, the Earth is round. A map is a planar depiction of a round reality, and it is a mathematical fact that such a depiction cannot be free of distortion. Planes are fundamentally different from spheres. So, we are led to say that the role of geometric geodesy is to bring the world of the round and the world of the flat together. Surveying (see, e.g., [1]) is a process of direct measurement and analysis specifically designed to document the existence, the identity, the location, and the dimension or size of natural or artificial features on land or in the air, space, or water for the purpose of mapping, i.e., producing accurate and reliable maps, suitable for visualization if needed. Following almost literally the description by R. Rummel [53], the classical method of determination of the topographic surface is by triangulation (“Bruns’ polyhedron”). Characteristic terrain points are connected by triangles. The measurement of angles or, equivalently, of distance ratios delivers the form of the triangles. Observing more than the minimally required number of elements permits, after adjustment due to Gauss’s ideas, an evaluation of the precision and reliability of the network of triangles. The physically defined plumb line through a triangulation point serves as a reference for the measurement of the vertical angles with the sides to all adjacent network points. Thus, the geometric form of a network is determined horizontally and vertically, involving physical means. For larger point configurations it is convenient to introduce a coordinate system. The orientation is determined by fixing two coordinates of a second and one coordinate of a third point. One arbitrary side length defines the scale of the network.
Independent networks (e.g., separated by oceans) can be positioned and oriented on the globe by means of a common external reference, the stars. Measurements are made at selected points of the vertical angles of stars and the horizontal angle between the direction to a star and to a terrestrial point. However, these angles continuously change due to the Earth’s rotation, polar motion, and precession-nutation. Consequently, these effects have to be carefully monitored and modeled. Only under this circumstance can the star observations be converted into astronomical latitude, longitude, and azimuth (i.e., in geodetic nomenclature, the angle of a point with respect to north). Usually they are defined with respect to a conventional terrestrial coordinate system relative to the equator plane, the Greenwich meridian, and the North Pole, respectively. For mapping purposes a projection surface is chosen (e.g., a best-fitting ellipsoid) and the point coordinates are expressed, for example, by their geographical latitude, longitude, and height above the ellipsoid.
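The last step above, expressing a point by geographical latitude, longitude, and height above the ellipsoid, has a standard Cartesian counterpart. The following minimal sketch converts such ellipsoidal coordinates to Earth-centered Cartesian (ECEF) coordinates; the WGS84 defaults and the function name are our own illustrative choices, not taken from the source:

```python
import math

def geodetic_to_ecef(lat_deg, lon_deg, h,
                     a=6378137.0, f=1.0 / 298.257223563):
    """Convert geographic latitude/longitude (degrees) and ellipsoidal
    height h (metres) to Cartesian ECEF coordinates (metres).
    Defaults are the WGS84 semi-major axis a and flattening f."""
    e2 = f * (2.0 - f)  # first eccentricity squared
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # prime-vertical radius of curvature at latitude lat
    n = a / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + h) * math.sin(lat)
    return x, y, z
```

At the equator this returns the semi-major axis on the x-axis; at the pole it returns the semi-minor axis on the z-axis, as expected from the geometry.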

2.2 Physical Geodesy

So far, in our approach to geodesy, the gravity field has not played any real part, except for its role in defining horizontal and vertical. As a matter of fact, if the Earth were a homogeneous, spherical body, gravity would be almost constant all over the Earth’s surface, at the well-known 9.81 ms−2. In reality, the slightly curved plumb lines are directed toward the Earth’s center of mass, which implies that all level surfaces are nearly spherical too and that the density distribution inside the Earth must consist of approximately spherical layers. Gravity decreases from the poles to the equator by about 0.05 ms−2. This is caused by the flattening of the Earth’s figure and the negative effect of the centrifugal force, which is maximal at the equator. Almost 70% of the Earth’s surface is practically a level surface. The level surface at mean sea level is called the geoid (cf. Fig. 4). As expressed by Newton’s law of gravitation, the trajectory of any object in free fall is determined by the Earth’s mass distribution, whether it is the orbit of a satellite or the fall of a test mass of an absolute gravity apparatus. It is a major objective of mathematical geodesy to determine the Earth’s surface and the gravity field at its surface and outside it, employing terrestrial, airborne, as well as spaceborne measurements and observations. Angles and distances provide the geometric shape of a point configuration; the astronomically determined latitude, longitude, and azimuth provide the proper location on the Earth’s surface. Figure 3 shows the basic situation. One drawback is that geometric heights (e.g., above a chosen ellipsoid) carry no physical information relating to the flux of water. For this purpose the difference in gravity potential between the level surfaces passing through each of the terrain points has to be determined (see, e.g., [29, 32, 53, 61] for more details).

Fig. 3 Geodetically relevant heights: ellipsoidal height h, orthometric height H, and geoid height N (note that the ellipsoidal height is equal to the sum of the orthometric height and the geoid height, i.e., h = H + N)
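The height relation h = H + N of Fig. 3 is trivially invertible: given a GNSS-derived ellipsoidal height and a geoid model, the orthometric height follows by subtraction. A minimal sketch (the numerical values are hypothetical, chosen purely for illustration):

```python
def orthometric_height(h_ellipsoidal, geoid_height):
    """Convert an ellipsoidal height h (e.g., from GNSS) into an
    orthometric height H using the geoid undulation N:
    h = H + N, hence H = h - N."""
    return h_ellipsoidal - geoid_height

# hypothetical values: h = 125.30 m from GNSS, N = 47.12 m from a geoid model
H = orthometric_height(125.30, 47.12)  # height above the geoid in metres
```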


Fig. 4 Geodetically relevant geometries: ellipsoid/spheroid, geoid, mean sea level, and topographic geometry

The potential differences are converted into metric height differences, e.g., orthometric or normal heights (cf. Fig. 3). The heights of all points of a certain area are referred to one fundamental benchmark, the height datum. As the datum points are usually chosen close to sea level, one can speak of heights above sea level. In what follows, we frequently follow the seminal work by R. Rummel (given, e.g., in [52, 53, 55, 56]) to characterize the role, aim, and potential of geodesy. Following [53] we are led to the following conclusion concerning the classical method of determination of the geometric figure by triangulation: This concept of the determination of the geometric figure of the Earth was formulated by H. Bruns [6] and uses all measurement types available at that time. It is still valid today, although with significantly improved measurement techniques. The concept suffered in the past from two limitations: vertical angles are usually distorted by atmospheric refraction, and no direct connection between the geodetic networks of different continents was possible. Both drawbacks have been overcome by means of modern space techniques. Indeed, today, a truly global polyhedron covering the entire Earth can be established. Crustal deformation studies in the horizontal and vertical directions require repeated application of the above principles.

2.2.1 Newton’s Foundation of Physical Geodesy

First of all it should be noted that the geodetic foundation involving gravitation is still closely related to Isaac Newton’s original work. Newton’s physics is meant to be universal: it applies both to a planet and to a falling particle (in our approach illustrated by an “apple”). Newton himself often told the story that he was inspired to develop his theory of gravitation by watching the fall of an apple from a tree. Although it has been said that the apple story is a myth, some acquaintances of Newton do, in fact, confirm the incident. For example, Voltaire (1727) wrote in his Essay on Epic Poetry: Sir Isaac Newton walking in his gardens, had the first thought of his system of gravitation, upon seeing an apple falling from a tree.

Fig. 5 Test mass (“apple”) in free fall: the principle of an absolute gravimeter (successive positions x(t1), x(t2), x(t3) of the falling mass are recorded)

In his Philosophiae Naturalis Principia Mathematica, I. Newton (1687) gives the famous three laws of motion, briefly recapitulated here in modern nomenclature:

• Newton’s First Law (also known as the Law of Inertia) states that an object at rest tends to stay at rest and that an object in uniform motion tends to stay in uniform motion unless acted upon by a net external force. The meaning of this law is the existence of reference frames (called inertial frames) in which objects not acted upon by forces move in uniform motion (in particular, they may be at rest).
• Newton’s Second Law states that the force applied to an object equals the rate of change of its momentum. Using the definition of acceleration, the law can be written in the iconic form that the force, apart from a factor characterizing the (constant) mass, is equal to the acceleration (see Fig. 5).
• Newton’s Third Law states that, for every action, there is an equal and opposite reaction. This means that any force exerted on an object has a counterpart force that is exerted in the opposite direction back onto the first object.

In conclusion, Newton’s famous law about the mutual attraction of two masses tells us that the attractive force, called gravitation, is directed along the line connecting the centers of the two objects and is proportional to both masses as well as to the inverse of the squared distance between the two objects (see Fig. 6).
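Newton’s law of gravitation stated above can be put into a few lines of code; the following sketch uses approximate textbook constants (our own illustrative values) and recovers the familiar free-fall acceleration of roughly 9.8 ms−2 for a unit test mass at the Earth’s surface:

```python
G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2 (approximate)

def gravitational_force(m1, m2, r):
    """Magnitude of the mutual attraction of two point masses,
    F = G * m1 * m2 / r^2 (Newton's law of gravitation)."""
    return G * m1 * m2 / r ** 2

# free fall of a unit test mass ("apple") at the Earth's surface:
M_EARTH = 5.972e24  # kg (approximate)
R_EARTH = 6.371e6   # m  (mean radius, approximate)
g = gravitational_force(1.0, M_EARTH, R_EARTH)  # force per unit mass
```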

2.2.2 Gravity Field Constituents

The gravity acceleration (gravity) w is the resultant of the gravitation v and the centrifugal acceleration c, such that w = v + c. The centrifugal acceleration c arises as a result of the rotation of the Earth about its axis. In this work we assume a rotation of constant angular velocity. The centrifugal acceleration acting on a unit mass is directed outward, perpendicularly to the spin


Fig. 6 Gravitation between point masses. Two mass points m1 and m2 at a distance r attract each other with a force. The moduli of the forces fulfill F1 = F2 = G m1 m2 /r², where G is the gravitational constant

Fig. 7 Illustration of the constituents of the gravity acceleration at a certain location, about 9.8072467... ms−2 in total (cf. ESA medialab, ESA communication production SP−1314): 10⁰ spherical Earth; 10⁻³ flattening and centrifugal acceleration; 10⁻⁴ mountains, oceanic ridges, subduction; 10⁻⁵ density variations in crust and mantle; 10⁻⁶ salt domes, sedimentary basins, deposits; 10⁻⁷ tides, atmospheric pressure; 10⁻⁸ temporal variations: oceans, hydrology; 10⁻⁹ ocean topography, movements of the poles. Nowadays all gravity measurements are expressed in SI units; however, wide use is still made of some older, traditional standards. The SI unit of gravity acceleration is ms−2; the traditional unit is the Gal (after G. Galilei, 1564–1642), with 1 Gal = 0.01 ms−2

axis. Introducing the so-called centrifugal potential C, such that c = ∇C, the function C is non-harmonic. The direction of the gravity w is known as the direction of the plumb line; the quantity |w| is called the gravity intensity (often also just called gravity). Altogether, the gravity potential of the Earth can be expressed in the form W = V + C, and the gravity acceleration w is given by w = ∇W = ∇V + ∇C. The surfaces of constant gravity potential, i.e., W = const., are designated as equipotential (level, or geopotential) surfaces of gravity (for more details, the reader is referred to monographs in geodesy, e.g., [26, 29, 30, 32, 44]). See Fig. 7 for an illustration of the significance of gravity acceleration. If the Earth had a spherical shape and if the mass inside the Earth were distributed homogeneously or rotationally symmetrically, the line along which an apple fell would indeed be a straight line, directed radially and going exactly through the Earth’s center of mass. The gravitational field obtained in this way would be spherically symmetric. In reality, however, the situation is of higher complexity.
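The centrifugal part of the decomposition w = ∇V + ∇C above can be written down explicitly for a constant angular velocity about the x3-axis. The following toy sketch (variable names and constants are our own choices) evaluates C and c = ∇C; at the equator the magnitude of c comes out at about 0.034 ms−2, consistent with the pole-to-equator gravity decrease mentioned above:

```python
import numpy as np

OMEGA = 7.292115e-5  # Earth's angular velocity, rad/s (approximate)

def centrifugal_potential(x):
    """Centrifugal potential C at position x (Earth-fixed Cartesian, metres);
    the spin axis is the x3-axis, so C depends only on the distance
    to that axis: C = (omega^2 / 2) * (x1^2 + x2^2)."""
    x = np.asarray(x, dtype=float)
    return 0.5 * OMEGA ** 2 * (x[0] ** 2 + x[1] ** 2)

def centrifugal_acceleration(x):
    """c = grad C: directed outward, perpendicular to the spin axis."""
    x = np.asarray(x, dtype=float)
    return OMEGA ** 2 * np.array([x[0], x[1], 0.0])
```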


The topographic features, mountains, and valleys are very irregular. The actual gravitational field is influenced by strong irregularities in density within the Earth. As a result, the gravitational force deviates from one place to the other from that of a homogeneous sphere. Internal density signatures are reflected by gravitational field signatures, and gravitational field signatures spectrally smooth out exponentially with increasing distance from the Earth’s body. As a consequence, we are confronted with the situation that positioning systems are ideally located as far as possible from the Earth, whereas gravitational field sensors are ideally located as close as possible to the Earth. Following these basic principles, various positioning and gravitational field determination techniques have been designed during the last decades. Considering the spatial location of the data, we may differentiate between terrestrial (surface), airborne, and spaceborne methods. The actual Earth’s surface (available from modern satellite techniques such as GNSS, LLR, SLR, VLBI, etc.) does not coincide with a level surface (i.e., an equipotential surface). The force of gravity is generally not perpendicular to the actual Earth’s surface. Instead we are confronted with the gravity intensity as an oblique derivative (see Fig. 8). The gravity vector is an oblique vector at any point on the Earth’s surface and generally not parallel to the normal vector. The force of gravity provides a directional structure. It is tangential to the vertical plumb lines and perpendicular to all (level) equipotential surfaces. Any water surface at rest is part of a level surface. Level (equipotential) surfaces are ideal reference surfaces, for example, for heights. As the level surfaces are, so to speak, “horizontal”, they play an important part in our daily life (e.g., in civil engineering for the purpose of height determination). The geoid (cf. 
[39]) is defined as that level surface of the gravity field which best fits the mean sea level (cf. Fig. 8). The direction of the gravity vector can be obtained by astronomical positioning. Measurements are possible with sufficient accuracy on the Earth’s surface only. Observations of the gravity vector are converted into so-called deflections of the vertical by subtracting a corresponding reference direction derived from a simple gravity field model associated with, for example, an ellipsoidal surface. Gravitational field determination based on the observation of deflections of the vertical, combined with gravity, is feasible in smaller areas with good data coverage. The determination of equipotential surfaces of the potential W is strongly related to the knowledge of the potential V. The gravity vector w given by w = ∇W is normal to the equipotential surface passing through the same point. Thus, equipotential surfaces such as the geoid intuitively express the notion of tangential surfaces, as they are normal to the plumb lines given by the direction of the gravity vector (cf. Fig. 9).

Fig. 8 Earth’s surface, geoid, ellipsoid (λ = w/|w| is the normalized (oblique) gravity vector, normal to the geoid; ν is the normal vector to the Earth’s surface)

Fig. 9 Level surfaces and plumb lines for a homogeneous ball (left) and an Earth-like body (right)

There are two types of gravimeters, viz. absolute gravimeters (Fig. 5) and relative gravimeters (Fig. 10). Relative gravimeters compare the value of gravity at one point with that at another. They must be calibrated at a location where the gravity is known accurately; they measure the ratio of the gravity at the two points. Absolute gravimeters are based directly on measuring the acceleration of free fall (for example, of a test mass in a vacuum tube). Most common relative gravimeters are spring-based: by determining the amount by which the weight stretches the spring, gravity becomes available. The highest-accuracy relative gravity measurements are conducted at the Earth’s surface. Measurements on ships and in aircraft deliver reasonably good data only after the removal of inertial noise (see Fig. 10).
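The principle of the absolute gravimeter of Fig. 5, recovering g from recorded positions x(ti) of a freely falling test mass, amounts to a least-squares fit of a parabola. The sketch below uses synthetic, noise-free data generated with g = 9.81 ms−2 (illustrative values, not real instrument output):

```python
import numpy as np

# Synthetic free-fall record of a test mass in a vacuum tube:
# positions x(t_i) generated with g = 9.81 m/s^2, an initial offset
# x0 = 1 mm and an initial velocity v0 = 2 cm/s (hypothetical values).
t = np.linspace(0.0, 0.2, 50)               # drop times in seconds
x = 0.001 + 0.02 * t + 0.5 * 9.81 * t ** 2  # fallen distance in metres

# Least-squares fit of x(t) = x0 + v0*t + (g/2)*t^2; the quadratic
# coefficient of the fitted polynomial equals g/2.
coeffs = np.polyfit(t, x, 2)
g_est = 2.0 * coeffs[0]
```

With noisy real data the same fit applies; only the residuals, and hence the uncertainty of g_est, grow.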

2.2.3 Terrestrial Geodesy and Boundary Value Problems

Terrestrial geodesy aims at the combined determination of the figure of the Earth and its exterior gravitational field via boundary value problems. It is based on the assumption that all over the Earth’s surface the astronomical latitude and longitude as well as the scalar gravity intensity are available, so that the direction and length of the gravity vector are known. In the standard geodetic approach it is also assumed that the gravitational effects of the sun and moon and of the Earth’s atmosphere are accounted for by means of certain corrections. As a consequence, the gravitational part of the gravity potential can be regarded as a harmonic function. Hence, the determination of the gravity potential in the exterior of the Earth from continuous boundary values given on the Earth’s surface could be considered a classical Dirichlet boundary value problem of potential theory. However, as the boundary surface itself is also unknown, this leads to a complicated non-linear problem (see, e.g., [33]). In the classical approach of physical geodesy (cf. [29, 32]), for purposes

Fig. 10 The principle of a relative gravimeter

of linearization, a globally ellipsoidal reference gravity field, called the normal field, is introduced. The choice of the approximating surface characterizes the two alternative ways to the solution of the so-called geodetic boundary-value problem (GBVP). One early ansatz was presented by G.G. Stokes [60]; the other was followed by M.S. Molodensky [42]: G.G. Stokes [60] proposed reducing the given gravity data from the Earth’s surface to the geoid. As the geoid is understood to be the level surface at sea level, its potential value is constant. The difference between the reduced gravity on the geoid and the reference gravity on the reference ellipsoid is called the gravity anomaly. The difference between the actual and the reference potential, the so-called disturbing potential, and the geoid height (i.e., the height difference between geoid and ellipsoid) are unknown. The relation between gravity anomalies and the disturbing potential enables the formulation of the so-called Stokes boundary value problem. M.S. Molodensky [42] maps the heights derived from the potential differences directly onto the ellipsoid to define a surface that closely approximates the Earth’s surface; it is called the telluroid. In this case the geometric unknown is the height anomaly, i.e., the deviation between the Earth’s surface and the telluroid. The major assumption of the GBVP is that the boundary functions are given all over the Earth. In fact, observations are only available at discrete terrain points, and there are large areas, particularly at sea, where no data are given at all.
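The normal field mentioned above admits a closed representation. As an illustration of how reference gravity on the ellipsoid is evaluated in practice, the following sketch implements the Somigliana formula with the standard GRS80 constants (the code itself is our own sketch, not from the source):

```python
import math

# GRS80 reference values (Geodetic Reference System 1980)
GAMMA_E = 9.7803267715  # normal gravity at the equator, m/s^2
GAMMA_P = 9.8321863685  # normal gravity at the poles,   m/s^2
A = 6378137.0           # semi-major axis, m
B = 6356752.3141        # semi-minor axis, m
E2 = 1.0 - (B / A) ** 2  # first eccentricity squared

def normal_gravity(lat_deg):
    """Normal gravity on the GRS80 ellipsoid at geodetic latitude
    lat_deg (degrees), via the closed Somigliana formula
    gamma = gamma_e * (1 + k sin^2(phi)) / sqrt(1 - e^2 sin^2(phi))."""
    s2 = math.sin(math.radians(lat_deg)) ** 2
    k = (B * GAMMA_P - A * GAMMA_E) / (A * GAMMA_E)
    return GAMMA_E * (1.0 + k * s2) / math.sqrt(1.0 - E2 * s2)
```

Subtracting such a value from a measured gravity intensity yields the gravity anomaly (after reduction to the geoid) or the gravity disturbance (at the observation point), as described in the text.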


• The first problem involving geodetic boundary value problems is to compensate the phenomenon of topographic loads using isostatic models (see, e.g., [27, 57] and the references therein). The aim is to obtain representative data. The selection of the isostatic model should be based on four criteria (cf. [29]): it should be (1) geophysically meaningful, (2) easy to compute, (3) small, smooth, and therefore easy to interpolate, and (4) the indirect effect, i.e., the change of potential and gravity due to the isostatic mass replacement, should be small.
• The second calamity with boundary value problems is approached using methods such as numerical integration or spline interpolation (in the jargon of geodesy called least squares collocation; mathematically it amounts to minimum norm interpolation).
• The third limitation, concerning data coverage, can be partly overcome by combining terrestrial gravity material with gravity field information deduced from satellite orbit analysis.

Usually, the task of determining the disturbing potential T from gravity anomalies is reduced to boundary value problems corresponding to a spherical boundary. Numerical realizations of such boundary value problems have a long tradition, starting from F. Neumann [47] and G.G. Stokes [60]. The conceptual vagueness of disturbing potential and geoid via boundary problems based on heterogeneously distributed boundary data can be overcome by combining terrestrial and satellite data. With precise positioning by satellites, for example, intercontinental height links can be established between local geopotential realizations. Moreover, nowadays, GNSS can be used to determine the real Earth’s surface. This fact led K.-R. Koch, A.J. Pope [36] to reformulate the geodetic boundary value problem as an exterior oblique derivative problem without any need to introduce an auxiliary surface such as the telluroid or the ellipsoid. Conventionally, a solution of an oblique derivative problem corresponding to the boundary values on the real Earth’s surface is represented by a strongly singular (Fredholm) integral equation based on a potential of a single layer as ansatz function, which causes difficulties for numerical realizations. W. Freeden, C. Gerhards [12], W. Freeden, H. Kersten [14, 15], and W. Freeden, V. Michel [17] show that the geodetic boundary value problem using the gravitational vector on the known Earth’s surface can be deduced from discrete data sets by a constructive Runge theorem in terms of, e.g., mono- and/or multipoles situated on an arbitrary closed surface lying completely inside the Earth. Indeed, in today’s geodesy, concerned with terrestrial datasets, a result first motivated by C. Runge [58] in one-dimensional complex analysis and later generalized, e.g., by J.L. Walsh [62] and many others to potential theory in three-dimensional Euclidean space R3 is of basic interest. The Runge-Walsh concept reflects constructive approximation capabilities of the Earth’s gravitational (and not gravity) potential, so that geoscientifically realistic, and not necessarily spherical or ellipsoidal, geometries have to come into play. For numerical computations, the Runge-Walsh concept may be regarded as the justification why boundary value problems in geodesy can be solved without any need to use specific telluroidal


reflected trial function systems such as ellipsoidal harmonics, Lamé functions, etc., which are numerically more difficult to handle. Instead, outer spherical harmonics (i.e., multi-poles) and/or mono-poles showing a larger harmonicity domain than the exterior of a telluroid (in the linear Molodensky problem) or the actual Earth (in the oblique derivative problem) can be taken into account for gravitational potential modeling (see, e.g., [16]).
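The mono-pole ansatz mentioned above, superposing point-mass potentials whose sources lie on a surface completely inside the Earth, can be sketched in a few lines. This is a toy illustration of the Runge-type trial system, not an operational gravitational model; all names and values are our own:

```python
import numpy as np

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2 (approximate)

def monopole_potential(x, sources, masses):
    """Superposition of point-mass (mono-pole) potentials,
    V(x) = sum_i G * m_i / |x - x_i|, which is harmonic everywhere
    outside the internal surface carrying the sources x_i."""
    x = np.asarray(x, dtype=float)
    sources = np.asarray(sources, dtype=float)
    r = np.linalg.norm(x - sources, axis=1)  # distances |x - x_i|
    return G * np.sum(np.asarray(masses, dtype=float) / r)
```

In a Runge-type approximation the masses m_i would be determined by fitting discrete gravitational data given on or above the Earth's surface.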

2.2.4 Terrestrial Geodesy and Inverse Problems

Geodesy (in the sense of Helmert [31], see also [53]) provides a representation of the geometric shape of the Earth, its gravity field (indicated, for example, by the geoid), and of the variable orientation of the Earth’s body in space. Since the surface of the Earth is directly accessible to measurement, more and better measurements, together with a suitably improved mathematical model, enable a better approximation of the Earth’s geodetic characteristics. In principle, since we are confronted with well-posed problems guaranteeing existence, uniqueness, and stability of the solution, this improvement can be continued to almost any degree of perfection. The situation, however, changes drastically if we are interested in the Earth’s interior. This should be explained briefly: Equipotential surfaces of the Earth’s gravity potential W allow, in general, no simple mathematical representation. This is the reason why physical geodesy and geophysics choose a suitable reference surface for modeling the geoid, i.e., the equipotential surface at sea level. The reference surface is constructed as an equipotential surface of an artificial normal gravity potential U. Its gradient field, i.e., u = ∇U, is called normal gravity. For reasons of simplicity, physical geodesy usually uses an ellipsoid of revolution in such a way that a good adaptation to the Earth’s surface is guaranteed. Closed representations of normal gravity potentials, in consideration of the centrifugal force, can be found extensively in the geodetic literature starting from [60]. The deviations of the gravity field of the Earth from the normal field of such an ellipsoid are small. The remaining parts of the gravity field are gathered in a so-called disturbing gravity field ∇T corresponding to the disturbing potential, usually denoted by T in geodetic nomenclature.
Knowing the gravity potential, all equipotential surfaces – including the geoid – are given by an equation of the form W(x) = const. By introducing U as the normal gravity potential corresponding to the ellipsoidal field, the disturbing potential T is the difference between the gravity potential W and the normal gravity potential U, i.e., we are led to a decomposition of the gravity potential in the form W = U + T. According to the concept developed by [50, 51, 60], we may assume that (i) the center of the reference surface (classically an ellipsoid) coincides with the center of gravity of the Earth, and (ii) the difference between the mass of the Earth and the mass of the reference surface is zero. A point x on the geoid can be projected onto its associated point y of the reference surface by means of the normal. The distance N(x) between x and y is


W. Freeden and M. Schreiner

the geoid height or geoidal undulation in x (cf. Fig. 11). The gravity anomaly vector at the point x of the geoid is the difference between the gravity vector w(x) and the normal gravity vector u(y). Another possibility is to form the difference between the vectors w and u at the same point x, which yields the gravity disturbance vector. The gravity disturbance, besides being the difference in magnitude of the actual and the normal gravity vector, is also the normal component of the gravity disturbance vector. Illustrations of the disturbing potential, the gravity disturbances, and the gravity anomalies are given in Figs. 12, 13, and 14. So far, in geodetic practice, much more data on gravity anomalies |w(x)| − |u(y)| are available than on gravity disturbances |w(x)| − |u(x)|. However, by modern GNSS technology, the point x is available rather than y. Therefore, in future, it can be expected that the gravity disturbances |w(x)| − |u(x)| will become more important than the gravity anomalies |w(x)| − |u(y)| (as [32] point out in their monograph on physical geodesy). This is the reason why modern geodesy mostly continues to work with an exterior Neumann boundary value problem corresponding to the gravity disturbances |w(x)| − |u(x)| as boundary values. The deflection of the vertical at a point x on the geoid is understood to be the angular (i.e., tangential) difference between the directions ν(x) and ν′(x) (cf. Fig. 11). Thus, the deflection of the vertical is related to the angle between the plumb line and the normal of the reference surface through the same point. In accordance with these settings it is common practice (cf. [44]) to accept the following facts: The geoidal heights, i.e., the deviations of the equipotential surface at mean ocean level from a reference surface (ellipsoidal surface), are extremely small. Their order is only a factor 10−5 of the Earth’s radius or even smaller (see W.A. Heiskanen, H. Moritz [29] for more details).
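In spherical approximation, the quantities just introduced are linked by simple formulas: Bruns’ formula N = T/γ relates the disturbing potential to the geoid undulation, the gravity disturbance is δg = −∂T/∂r, and the fundamental equation of physical geodesy gives the gravity anomaly Δg = −∂T/∂r − (2/r)T. The following minimal sketch illustrates these relations numerically; the point-mass anomaly is a purely hypothetical example, not part of the text:

```python
GM = 3.986004418e14   # geocentric gravitational constant, m^3/s^2
R = 6.371e6           # mean Earth radius, m (spherical approximation)
G = 6.674e-11         # Newton's gravitational constant, m^3 kg^-1 s^-2

def normal_gravity(r=R):
    """Spherical normal gravity gamma = GM/r^2 (centrifugal term neglected)."""
    return GM / r**2

def geoid_height(T, r=R):
    """Bruns' formula: N = T / gamma."""
    return T / normal_gravity(r)

def gravity_disturbance(dT_dr):
    """delta g = -dT/dr."""
    return -dT_dr

def gravity_anomaly(T, dT_dr, r=R):
    """Fundamental equation of physical geodesy (spherical approximation):
    Delta g = -dT/dr - (2/r) * T."""
    return -dT_dr - 2.0 * T / r

# Hypothetical disturbing signal of a buried point-mass anomaly dm at depth d,
# evaluated at the surface point directly above it (illustration values only):
dm, d = 1.0e16, 5.0e4           # kg, m
T = G * dm / d                  # disturbing potential, m^2/s^2
dT_dr = -G * dm / d**2          # its radial derivative, m/s^2

print(geoid_height(T))          # geoid undulation of roughly 1.4 m
print(gravity_disturbance(dT_dr), gravity_anomaly(T, dT_dr))
```

With these conventions a positive mass surplus produces a positive undulation, and the anomaly differs from the disturbance only by the small term (2/r)T.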
Even more, the reference ellipsoid differs from a sphere with (mean Earth’s) radius R only in the order of the flattening of about 3 · 10−3. Therefore, since the time of [60], it has been common practice that, in theory, an ellipsoidal reference surface is taken into account, whereas, in practice, the reference surface (ellipsoidal surface) is treated as a sphere for reasons of numerical efficiency and economy. In doing so, a relative error of the order of the flattening of


Fig. 11 Illustration of the gravity anomaly vector w(x) − u(y), the gravity disturbance vector w(x) − u(x), and the geoidal height N (x). (From [9])

1 Mathematical Geodesy


the Earth’s body at the poles, i.e., a relative error of 10−3, is accepted in all equations containing the disturbing potential. Considering appropriately performed reductions in numerical calculations, this error seems to be quite permissible. The disturbing potential enables us to make the following geophysical interpretations (developable from graphical illustrations such as Figs. 12, 13, 14, and 15): • Gravity disturbances and/or gravity anomalies, which relate the real Earth to a reference (usually ellipsoidal) Earth model, represent the expression of an imbalance of forces in the interior of the Earth. In accordance with Newton’s Law of Gravitation they point to an irregular density distribution inside the Earth. Clearly, gravity anomalies and/or gravity disturbances do not uniquely determine the interior density distribution of the Earth. They may be interpreted as filtered signatures, which give major weight to the density contrasts close to the surface and simultaneously suppress the influence of deeper structures inside the Earth (for more details see [11] and [4, 5]). • Geoid undulations provide a measure for the perturbations of the Earth from hydrostatic equilibrium. They form the deviations of the equipotential surfaces at mean sea level from the reference ellipsoid. Geoid undulations show no essential correlation to the distribution of the continents (see Fig. 15). They seem to be generated by density contrasts much deeper inside the Earth. Large-scale

Fig. 12 Illustration of the gravity disturbances in 10−3 m/s2 computed from EGM2008 from degree 2 up to degree 720

Fig. 13 Illustration of gravity anomalies in 10−3 m/s2 computed from EGM2008 from degree 2 up to degree 720


Fig. 14 Illustration of the disturbing potential in m2 /s2 computed from EGM2008 (Earth Gravitational Model due to [38]) from degree 2 up to degree 720

Fig. 15 Illustration of the geoid undulations with respect to the WGS84 ellipsoid in m

geoidal structures are locally superposed by signatures originating from geophysical phenomena (such as deglaciation, postglacial uplift, sea level fluctuations, etc.). For more details the reader is referred, e.g., to R. Rummel [54] and the references therein. An illustration of the geoid as the isosurface of the gravity potential at sea level is given in Fig. 16. In conclusion, while geoidal undulations reveal essential information about the composition and viscosity distribution of the Earth’s mantle, gravity anomalies or gravity disturbances display density contrasts inside the Earth. The gravity field is therefore able to provide geological information about the Earth’s interior. Gravimetry, besides seismics and magnetics, can be seen as a significant tool to improve our knowledge of the Earth’s interior (see, e.g., [5]). In particular, gravimetric multiscale techniques (see, e.g., [2–4]) are able to decorrelate geological signatures as, e.g., required in geothermal exploration. Gravimetry, besides magnetics, is particularly important in mining areas, where reflection seismics cannot be applied because of strong anthropogenic influences. It should be noted that the key ideas and concepts concerning the gravity and the magnetic field can be treated in parallel (see [12] for explicit details). Magnetics, however, needs a dipole manifestation in contrast to the monopole character of gravitation. All in all, the gravity field reflects all mass inhomogeneities, the long spatial wavelengths being mainly due to mantle convection. Earth rotation and polar motion


Fig. 16 Geoidal surface (more concretely, GFZ-EIGEN-CG01C geoid (2005)) as an illustration of the physical Earth’s geometry

time series reveal the global response of the Earth to the gravitational attraction of the sun and moon and to internal and external mass rearrangement. Gravimetric exploration needs appropriately dense surface measurements of gravity at local scale. However, it should be remarked that the Earth’s interior is usually not directly accessible to measurement. So, the acquisition of information has to be inferred indirectly from surface observations. This is the reason why we are canonically led to inverse problems (IP) in mathematics. Indeed, inverse gravimetry (see, e.g., [21] and the references therein) is a central research area of geodesy, geophysics, and geoexploration. It deals with the seriously ill-posed problem of transferring input gravitational potential information in the form of Newtonian volume integral quantities on the Earth’s surface to geological output characteristics of the density contrast function.
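The "filtered signature" character of surface gravity data described above can be made concrete with a toy Newton potential: the surface signal of a buried point mass weakens and broadens as the source is moved deeper, which is why surface observations weight shallow density contrasts more strongly than deep ones. The masses and depths below are hypothetical illustration values:

```python
G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def surface_signal(x, depth, mass):
    """Vertical attraction at horizontal surface offset x of a point mass
    at the given depth (flat-Earth approximation):
    g_z = G * m * d / (x^2 + d^2)^(3/2)."""
    return G * mass * depth / (x**2 + depth**2) ** 1.5

def half_width(depth, mass):
    """Horizontal offset at which the signal drops to half its peak value
    (found by simple stepping; g_z decreases monotonically in |x|)."""
    peak = surface_signal(0.0, depth, mass)
    x = 0.0
    while surface_signal(x, depth, mass) > 0.5 * peak:
        x += depth / 1000.0
    return x

m = 1.0e12                      # kg, hypothetical mass anomaly
shallow, deep = 1.0e3, 1.0e4    # 1 km vs. 10 km depth

# Deeper source: peak weaker by the depth ratio squared ...
print(surface_signal(0.0, shallow, m), surface_signal(0.0, deep, m))
# ... and a broader (smoother) signature: the half-width scales with depth.
print(half_width(shallow, m), half_width(deep, m))
```

The half-width is proportional to the depth (analytically, x_half = d·(2^{2/3} − 1)^{1/2} ≈ 0.77 d), so deep structures leave only smooth, weak traces at the surface.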

2.2.5 Contribution by Geodesy to Geodynamic Modeling The essential parts of the Earth system relevant to geodetic activities are the solid Earth, the oceans, and the atmosphere. The dynamics of the ocean and atmosphere evolve on much shorter time scales. As a consequence, measurement of the dynamic topography of the oceans requires a high precision and repetition rate. A secular change of sea level, which also affects ocean circulation, is caused by the melting of the ice caps. The post-glacial uplift due to the missing weight of the ice not only changes the topographic surface but also results in a secular variation of gravity. The effect of the atmosphere on geodetic observables is more indirect. Half of the mass of the atmosphere is found below 6000 m altitude. Thus the topographic relief significantly interferes with the atmospheric circulation. The resulting exchange of angular momentum is clearly visible in Earth rotation observations. The gravity field also contains a component from the gravitational attraction of the atmosphere. However, in contrast to the effect of atmospheric refraction on geodetic measurements, these phenomena can be modeled very well.


Tectonic Deformations. The relative drift rates of lithospheric plates, based on paleo-magnetic evidence, range typically from 3 to 5 cm year−1 to maximum values of 18 cm year−1. Despite the small rates and usually large distances between points on different plates, two geodetic measurement techniques are capable of providing in situ verification. These are laser ranging to artificial satellites and very long baseline interferometry. In the case of satellite laser ranging (SLR) the distance from ground station to satellite is derived from the travel time of emitted laser pulses. Reflectors mounted on the satellite direct the pulses back to the receiver at the ground station. The polyhedron principle can be followed here in a standard way. Assume that the configuration of three stations located on one plate is known. Simultaneous distance measurements to a satellite yield its position. Inclusion of a fourth station located on a second plate permits the determination of the relative movement of the latter with respect to the other three. SLR attains centimeter precision. A complication arises from the fact that direct visibility is required (clouds, for example, prohibit measurements). This means that simultaneous measurements of a satellite from four stations are hardly ever possible. Consequently, non-simultaneous measurements have to be used with the computed orbit as intermediary, as shown in Fig. 17. As the dynamics of the satellite’s motion also play a part, the simple geometric concept is lost. Very long baseline interferometry (VLBI) is a space technique operating at radio frequencies. It is not weather-dependent. Radio telescopes located on different plates track signals from quasi-stellar objects (quasars). The signal is received at slightly different times at the individual stations (Fig. 18).
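The basic observables of both techniques reduce to simple geometry: SLR converts a two-way pulse travel time into a range, and VLBI converts the differential arrival time of a quasar signal into the projection of the baseline onto the source direction. A minimal sketch under these textbook relations (the numerical station and satellite values are invented):

```python
import math

C = 299792458.0  # speed of light, m/s

def slr_range(two_way_time):
    """SLR: one-way station-satellite distance from the two-way
    laser pulse travel time."""
    return 0.5 * C * two_way_time

def vlbi_delay(baseline, source_dir):
    """VLBI: geometric delay tau = -(b . s)/c for baseline vector b (m)
    and unit vector s pointing toward the quasar."""
    dot = sum(b * s for b, s in zip(baseline, source_dir))
    return -dot / C

# Hypothetical numbers: a satellite pass with a two-way time of 5.34 ms ...
print(slr_range(5.34e-3))        # range of about 800 km
# ... and a 6000-km baseline observed 60 degrees off the source direction.
b = (6.0e6, 0.0, 0.0)
s = (math.cos(math.radians(60.0)), math.sin(math.radians(60.0)), 0.0)
print(vlbi_delay(b, s))          # geometric delay of about -0.01 s
```

As the Earth rotates, the delay varies, and fitting the delay time series yields both the baseline length and its orientation, which is how interplate motion is monitored.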


Fig. 17 The principle of satellite laser ranging (SLR). A laser beam created at the surface of the Earth is reflected from the satellite. From the travel time, the actual heights of the satellites are measured, so that the deviations of their paths due to gravity influences can be obtained


Fig. 18 Principle of very long baseline interferometry (VLBI)


Rates of plate motion, as derived so far from VLBI and SLR, show an overall agreement with geophysical tectonic rate models. For some more complex regions, such as the Mediterranean and Southern California, more stations and more sophisticated modeling are required. Transportable VLBI and SLR stations provide more flexibility in this respect. It is expected that the station density and ease of measurement will improve significantly with, e.g., the GNSS. In fact, modern GNSS enables the simultaneous tracking on radio frequencies of four or more satellites anywhere on Earth. In this instance the geometric principle as described for the case of SLR is inverted, with the ground station taking the place of the satellite. The precision of absolute positioning depends on the precision of the satellite orbit ephemerides. At distances of 100 km and less, traditional geodetic triangulation and leveling are still very successful in monitoring horizontal and vertical crustal movements. Long measurement records have been established in this way for certain tectonically active zones. Oceanic Flow. Any deviation of the topographic surface from the geoid implies imbalance, and this imbalance results in dynamic processes. This applies both for land topography and for the topography of the oceans. Before the space age the oceans were scarcely accessible to geodetic activity. Not until the 1970s, with the launch of the first altimeter satellites, did the situation change profoundly. Now satellite radar altimeters measure with centimeter precision the distance from the spacecraft to the ocean surface. Combined with the accurately determined orbit, the geometric height h of the instantaneous sea surface above a chosen reference ellipsoid is obtained, as shown in Fig. 19. The choice of the orbit elements determines the spatial and temporal coverage of the oceans by measurements and their northern and southern latitude bounds.
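The altimetric chain just described amounts to two subtractions: the ellipsoidal sea surface height h is the satellite's height above the reference ellipsoid minus the measured radar range, and the sea surface topography is the residual after subtracting the geoid undulation N. A minimal sketch with invented sample values:

```python
def sea_surface_height(orbit_height, altimeter_range):
    """Ellipsoidal height h of the instantaneous sea surface: satellite
    height above the reference ellipsoid minus the radar range."""
    return orbit_height - altimeter_range

def sea_surface_topography(h, geoid_undulation):
    """SST: sea surface height minus geoid height N."""
    return h - geoid_undulation

# Hypothetical example: satellite 1347.591 km above the ellipsoid,
# measured range 1347.520 km, local geoid undulation 70.4 m.
h = sea_surface_height(1_347_591.0, 1_347_520.0)   # 71.0 m
sst = sea_surface_topography(h, 70.4)              # about 0.6 m
print(h, sst)
```

Since both the orbit and the range are known to centimeters, the accuracy of the extracted SST is limited mainly by the quality of the geoid model, in line with the remark below on improved geoid models.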
What type of information can be extracted from the measured sea surface heights? First, they approximate the geoid, since the sea surface topography (SST) is small and can partly be corrected for. Second, repeated measurements of the same ground track configuration provide the temporal variations of the ocean surface, which are useful for studies of changes in global circulation patterns, meandering of ocean currents, and eddy motion, and also for tidal modeling in open oceans and shelf seas. Finally, the combination of sea surface heights with an independently determined precise geoid


Fig. 19 The principle of satellite altimetry


Fig. 20 Sea surface topography (SST) [cm] (left) and geostrophic oceanic flow [cm/s] (right) of the gulf stream computed by the use of smoothed Haar wavelets via the surface curl gradient equation relating the scalar dynamical topography to the vectorial oceanic geostrophic flow. (Graphical illustration from [25])

yields the SST, a measure of the surface dynamics of the oceans that is essential, for example, for the analysis of heat transport. The full capability of the latter for ocean modeling will become apparent when geoid models are further improved (see Fig. 20 for an example). Sea-level rise or fall has been studied in the past in conjunction with height system definition and research on regional recent crustal movement. The tide-gauge measurement records used for this purpose extend back more than one hundred years for some stations. Currently there is a growing awareness that anthropogenic climate change might lead to a more rapid melting of the remaining ice caps. The implications of this are not yet well understood. Any study of the consequences, however, must rely on highly precise and reliable measurement records of a worldwide sea-level monitoring system. Only then can the global pattern be studied, crustal motion be separated from oceanic effects, and trend changes be recognized early. The integration of tide gauges into one system will be achieved by precise satellite positioning and VLBI (Fig. 21).


Fig. 21 Polar motion for different time ranges. (Data from the International Earth Rotation and Reference Systems Service, http://www.iers.org)

Earth’s Rotation. The Earth does not rotate uniformly. The deviation of its angular velocity vector (see, e.g., [45]) from a constant one in an Earth-fixed reference system is characterized by time-dependent Earth rotation parameters. The deviation of the direction of the angular velocity vector is called polar motion. It describes the position of the “true” rotation pole compared to a terrestrial pole “fixed” to the Earth and related to the Earth-fixed reference system. In more detail, polar motion is defined relative to a conventionally defined reference axis, the so-called CIO (Conventional International Origin), being the pole’s average location over the year 1900. It consists of three major components: a free oscillation called Chandler wobble with a period of about 435 days, an annual oscillation, and an irregular drift in the direction of the 80th meridian west, which has lately been shifted toward the east. The slow drift, about 20 m since 1900, is partly due to motions in the Earth’s core and mantle, and partly to the redistribution of water mass as the Greenland ice sheet melts, and to isostatic rebound, i.e., the slow rise of land that was formerly burdened with ice sheets or glaciers. The mean displacement far exceeds the magnitude of the wobbles. This can lead to errors in software for Earth-observing spacecraft, since analysts may read off a 5-m circular motion and ignore it, while a 20-m offset exists, fouling the accuracy of the calculated latitude and longitude. The latter are determined based on the International Terrestrial Reference System, which follows the polar motion (for more details see, e.g., [37]). Today, Earth rotation parameters can be observed with high precision. Observations of stars and quasars and also of the orbit planes of the moon and artificial satellites are used in geodesy to determine the orientation of networks and to help reduce distortions. For this purpose the variable orientation of the Earth in space must be known.
More precisely, a celestial coordinate system, represented by the positions of stars, has to be related to a terrestrial coordinate system linked to some fundamental points on Earth. In view of today’s high measurement standards even such small effects as plate tectonic motions have to be taken into account in the definition of these systems.
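The three polar-motion components named above (Chandler wobble, annual oscillation, secular drift) can be mimicked by superposing two circular oscillations and a linear trend; the amplitudes and the drift rate below are rough illustrative values, not fitted to IERS data:

```python
import math

def pole_position(t_years, a_chandler=0.15, a_annual=0.10, drift=0.0035):
    """Toy model of the pole path in arcseconds (1 arcsec corresponds to
    roughly 31 m on the ground): Chandler wobble (period ~435 days),
    annual term, and a slow linear drift in x."""
    w_ch = 2.0 * math.pi / (435.0 / 365.25)   # Chandler frequency, rad/year
    w_an = 2.0 * math.pi                      # annual frequency, rad/year
    x = (a_chandler * math.cos(w_ch * t_years)
         + a_annual * math.cos(w_an * t_years)
         + drift * t_years)
    y = (a_chandler * math.sin(w_ch * t_years)
         + a_annual * math.sin(w_an * t_years))
    return x, y

# The beat between the 435-day and 365-day periods modulates the wobble
# amplitude with a period of roughly 6.2 years.
for t in (0.0, 3.1, 6.2):
    print(pole_position(t))
```

Plotting such a track for different time ranges reproduces the qualitative spiral patterns shown in Fig. 21.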

2.3 Satellite Geodesy

The possibilities for geodesy changed fundamentally with the advent of satellites. Only by the use of satellites did geodesy become truly three-dimensional and global. A satellite can serve in geodesy as
• a measurement beacon at a high altitude with which points far apart can be connected.
• a gyroscope, the orbit plane of which is stable enough to serve as an external direction reference.
• a gravity probe in free fall, from the orbit of which the structure of the Earth’s gravitational field can be extracted.
• a remote sensing platform, for measurement from space globally and with a high repetition rate.
A variety of observational techniques exploiting satellites from the ground have been used to determine distances. Two of them have also provided essential inputs to the recent elaboration of global gravity models: Satellite laser ranging (SLR) delivers the distance of a satellite from a ground station, see Fig. 17. SLR data contain information about the orbit of the satellite, the position of the measurement site, and Earth rotation and plate tectonic parameters. Range Rate Measurements are based on the observation of the Doppler effect, by which the frequency of a transmitted signal is received with a value shifted in proportion to the line-of-sight velocity between the transmitter and the observer. Plenty of such measurements have been collected between satellite-borne transmitters and ground stations, but only to a limited precision. The system can also be inverted, with the transmitters at the stations and the receiver on board a satellite.
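The range-rate principle rests on the first-order Doppler relation f = f0 (1 − v/c), where v is the line-of-sight velocity, so the observed frequency shift measures v directly. A minimal sketch (the carrier frequency and velocity are invented sample values):

```python
C = 299792458.0  # speed of light, m/s

def received_frequency(f0, v_los):
    """First-order Doppler: positive v_los means transmitter and
    receiver are separating, which lowers the received frequency."""
    return f0 * (1.0 - v_los / C)

def line_of_sight_velocity(f0, f_received):
    """Invert the Doppler relation to recover the range rate."""
    return C * (1.0 - f_received / f0)

f0 = 2.2e9       # Hz, hypothetical S-band carrier
v = 5400.0       # m/s, hypothetical line-of-sight velocity
f_rx = received_frequency(f0, v)
print(f_rx - f0)                         # shift of roughly -40 kHz
print(line_of_sight_velocity(f0, f_rx))  # recovers the range rate
```

Integrating such range rates along a pass yields range differences, which is the form in which Doppler data enter orbit and gravity field determination.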

2.3.1 Satellite-to-Satellite Tracking (SST) From Newton’s approach it is known that the motion of any freely falling body is determined purely by the Earth’s gravitational field. It is logical to try to invert this principle and to determine the structure of the Earth’s gravitational field from the study of the motion of bodies in free fall. This is the idea of all measurements and orbit analyses of Earth satellites for the purpose of global gravitational field recovery. In fact, the concept of gravitational field determination from orbit perturbations, i.e., satellite-to-satellite tracking (SST), goes back almost five decades. The original idea was to fly two satellites in an identical low orbit with a separation of a few hundred kilometres between the spacecraft (low-low SST). Between the satellites the distance and the Doppler frequency shift can be measured. Admittedly, the data represent, to some degree of approximation, first-order tangential derivatives of the gravitational potential. The alternative to low-low SST is high-low SST: Nowadays the global navigation satellite system (GNSS) is fully operational with tens of satellites in space which can track a Low Earth Orbiter (LEO). From


Fig. 22 The principle of satellite-to-satellite tracking (low-low SST), as realized in the GRACE mission


continuous carrier phase measurements of all visible GNSS satellites the orbit can be determined to an accuracy of a few centimeters. As the GNSS receiver mounted on the LEO always “contacts” four or even more of the GNSS satellites, the relative motion of the LEO can be monitored three-dimensionally, i.e., in all three coordinate directions. The lower the orbit of the LEO, the higher is its sensitivity with respect to the spatial variations of the gravitational forces, but also with respect to skin forces (atmospheric drag, solar radiation, albedo, etc.); the latter have either to be compensated for by a drag-free mechanism or to be measured by a three-axis accelerometer. Of course, the high orbiters, the GNSS satellites, are also affected by non-gravitational forces. However, the latter can be modeled quite well. They affect mainly the very long spatial scales, and to a large extent their effect averages out. In addition, the ephemerides of the GNSS satellites are determined very accurately by the large network of ground stations. In the case of SST lo-lo, the relative motion between two LEOs, chasing each other, is measured with highest precision. The quantity of interest is the relative motion of the centers of mass of the two satellites. Again, the effect of non-gravitational forces on the two spacecraft either has to be compensated actively or has to be measured (see Figs. 22 and 23). On the basis of all tracking data, collected over several decades from satellites in orbits at different altitudes and inclinations, only long-wavelength components of the global gravity field can be recovered. There are two reasons for this fact: An orbit as such is rather insensitive to local features of the gravitational field, and this insensitivity increases with increasing orbit altitude; and the satellites which can be and are being used are flying at altitudes which are too high for acceptable modeling purposes.
Therefore, satellite-only global gravity field models are reliable to a certain moderate maximum degree expressed in a potential representation in terms of spherical harmonics.
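The altitude dependence can be quantified: in an outer spherical harmonic expansion, a potential coefficient of degree n observed at altitude h is attenuated by the upward-continuation factor (R/(R+h))^{n+1}, so high degrees (short spatial wavelengths) are lost rapidly with height. A small illustration (the altitudes are arbitrary sample values):

```python
def attenuation(n, h, R=6.371e6):
    """Upward-continuation factor (R/(R+h))**(n+1) for a spherical
    harmonic coefficient of degree n at satellite altitude h (meters)."""
    return (R / (R + h)) ** (n + 1)

# The same degree-200 signal (about 200 km half-wavelength) seen from
# three different orbital altitudes:
for h in (250e3, 500e3, 1000e3):
    print(h, attenuation(200, h))
```

At 250 km altitude a degree-200 coefficient is already damped by more than three orders of magnitude, and doubling the altitude damps it by several orders more, which is why satellite-only models saturate at moderate maximum degrees.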

2.3.2 Satellite Gravity Gradiometry (SGG) Spacecraft are held in orbit by the gravity of the planet which they are orbiting, in the geodetic case the Earth. In Newtonian physics, the sensation of weightlessness



Fig. 23 Satellite-to-satellite tracking (SST) and internal mass distribution

experienced by astronauts is not the result of there being zero gravitational acceleration (as seen from the Earth), but of there being zero difference between the acceleration of the spacecraft and the acceleration of the astronaut. Space journalist J. Oberg [48] explains the phenomenon this way: The myth that satellites remain in orbit because they have “escaped Earth’s gravity” is perpetuated further (and falsely) by almost universal misuse of the word “zero gravity” to describe the free-falling conditions aboard orbiting space vehicles. Of course, this is not true; gravity still exists in space. It keeps satellites from flying straight off into interstellar emptiness. What’s missing is “weight”, the resistance of gravitational attraction by an anchored structure or a counterforce. Satellites stay in space because of their tremendous horizontal speed, which allows them—while being unavoidably pulled toward Earth by gravity—to fall “over the horizon.” The ground’s curved withdrawal along the Earth’s round surface offsets the satellites’ fall toward the ground. Speed, not position or lack of gravity, keeps satellites in orbit around the Earth.

In conclusion, if we consider the relative motion of a cluster of freely falling proof masses (e.g., satellites), they follow somewhat different paths in space because of their different initial conditions with respect to the gravitating body, in our case the Earth. From the measurement of the relative motion of the proof masses the gravitational tensor (Hesse tensor) can be deduced. In geometrical terms it is the local curvature of space (see, e.g., [40, 41] for a more detailed explanation). The measurement technique itself is called satellite gravity gradiometry (SGG), because the components of the tensor represent the gradients of the gravitational (relative) acceleration vector. In the real world, the relative motion of freely falling proof masses is not observed; instead, their positions are constrained to fixed positions by an electronic spring (cf. Figs. 24 and 25). The components of the tensor are


Fig. 24 The principle of a gradiometer (note that in GOCE, the gradiometer is a full tensor gradiometer, measuring also in the flight direction of the satellite)

derivable from the electronic signal required to keep the masses in a fixed relative position. Altogether, seen from an observational point of view, satellite gravity gradiometry is the measurement of the relative acceleration, not between free falling test masses like satellites, but of test masses at different locations inside one satellite. Each test mass is enclosed in a housing and kept levitated (floating, without ever touching the walls) by a capacitive or inductive feedback mechanism. The difference in feedback signals between two test masses is proportional to their relative acceleration, which is exerted purely by the differential gravitational field. Non-gravitational acceleration of the spacecraft affects all accelerometers inside the satellite in the same manner and thus ideally drops out during differencing. The rotational motion of the satellite affects the measured differences. However, the rotational signal (angular velocities and accelerations) can be separated from the gravitational signal if acceleration differences are taken in all possible (spatial) combinations (= full tensor gradiometer). Again, a low orbit means high sensitivity. Seen from a mathematical point of view, “downward continuation” of SGG-data in its simplest formulation (see, e.g., [7, 12, 17, 25] and the references therein) amounts to the study of operator-theoretic techniques to provide the disturbing potential at height 0 km from data of the second-order radial partial derivative at the satellite orbit (note that the second-order derivative admits an interpretation as one-dimensional (linearized) curvature). In accordance with the classification of Hadamard [28], this leads to an exponentially ill-posed inverse problem, since the potential (Fourier) coefficients of the disturbing potential show an exponential decay with increasing height.
As a consequence, the SGG-data contain “exponentially weakened” information of the disturbing potential (for illustrations explaining the ill-posed character of the SGG-problem graphically, Fig. 26 is a good demonstration; in fact, the first column of pictures impressively visualizes the decay of the disturbing potential with increasing height, while the third column of pictures shows



Fig. 25 Satellite gravity gradiometry (SGG) and internal mass distribution

the compensating effect obtained by additionally considering the curvature, i.e., by taking the second derivative at satellite height). So, “SGG-downward continuation” means to collect as much gravitational information as possible from the “exponentially smoothed curvature signature” at satellite height. Because of the ill-posedness of this problem it immediately turns out that the SGG-downward continuation operator, representing the mapping from the satellite data at the orbit to the terrestrial potential, must be regularized. The critical point of any regularization procedure (cf. [20]) is a stopping strategy providing a suitable parameter, so that the true signal in the SGG-data can be separated from the artefacts caused by the “exponential height-smoothing with subsequent curvature roughening”. The regularization problem becomes even harder when noise in measured SGG-data must be handled, usually without deeper knowledge of its specific character (for more details the reader is referred, e.g., to [24] and the literature therein). All in all, the three geodetically relevant satellite concepts in operation leading to downward continuation problems are satellite–to–satellite tracking in the high–low mode (SST hi–lo), satellite–to–satellite tracking in the low–low mode (SST lo–lo), and satellite gravity gradiometry (SGG). Common to all three concepts is that the determination of the Earth’s gravity field is based on the measurement of the relative motion (in the Earth’s gravity field) of test masses (“apples”) (Fig. 27). One can now argue that the basic observable in all three cases is gravitational acceleration. In the case of SST hi–lo, with the motion of the high orbiting


Fig. 26 Disturbing potential data generated with the EGM 2008 model [49], evaluated up to degree and order 720. From top to bottom: heights 500, 250 and 0 km. From left to right: disturbing potential in m2 /s2 , minus the first radial derivative in 10−6 m/s2 and the second order radial derivative in 10−12 /s2 . (From [22])

GPS satellites assumed to be perfectly known, this corresponds to an in situ 3-D acceleration measurement in the LEO. For SST lo-lo it is the measurement of the acceleration difference over the intersatellite distance and in the line-of-sight (LOS) of the LEOs. Finally, in the case of gradiometry, it is the measurement of acceleration differences in 3-D over the tiny baseline of the gradiometer. In short, we are confronted with the situation sketched in Table 2. Thus, in the mathematical sense, it is a transition from the first derivative of the gravitational potential via a difference in the first derivative to the second derivative. The guiding parameter that determines sensitivity with respect to the spatial scales of the Earth’s gravitational potential is the distance between the test masses, being almost infinity for SST hi-lo and almost zero for gradiometry.
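The exponential ill-posedness of downward continuation discussed above can be illustrated in a single-degree toy computation: a potential coefficient of degree n is damped by (R/(R+h))^{n+1} at altitude h, naive downward continuation divides by this factor and thereby blows up any data noise, and a Tikhonov-type spectral filter keeps the amplification bounded. This is only a schematic sketch of the regularization idea, not the specific methods cited in the text:

```python
def upward_factor(n, h, R=6.371e6):
    """Attenuation of a degree-n potential coefficient at height h; the
    radial derivatives add only polynomial factors in n, so the
    exponential decay in n dominates for SGG as well."""
    return (R / (R + h)) ** (n + 1)

def naive_downward(coeff_at_h, n, h):
    """Unregularized downward continuation: divide by the attenuation,
    which amplifies noise by its (exponentially large) inverse."""
    return coeff_at_h / upward_factor(n, h)

def tikhonov_downward(coeff_at_h, n, h, alpha):
    """Tikhonov-filtered downward continuation: the filter k/(k^2 + alpha)
    behaves like 1/k for well-determined (low) degrees and tends to 0
    for strongly attenuated (high) degrees."""
    k = upward_factor(n, h)
    return coeff_at_h * k / (k**2 + alpha)

h, n, alpha = 250e3, 300, 1e-8
noise = 1e-9     # hypothetical noise on the measured coefficient
print(naive_downward(noise, n, h))            # noise blown up by ~1/k
print(tikhonov_downward(noise, n, h, alpha))  # amplification kept bounded
```

The choice of the parameter alpha plays the role of the stopping strategy mentioned above: too small and the noise artefacts survive, too large and genuine high-degree signal is filtered away.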


Fig. 27 Gravitation and the story of the apple: The tale of two ants walking on the surface of an apple: “They start at A and A’ and walk on two adjacent paths along shortest distance (geodesics) on the curved apple to B and B’. We measure the changing distance between the two ants. From these measured distances we deduce the local curvature of the apple.” This situation is analogous to the SST and SGG missions, i.e., the determination of the local curvature of the space caused by the gravitational field of the Earth from the measurement of the relative movement of masses, which move “in free fall” along geodetic lines

Table 2 Satellite methods and their observables
SST hi-lo: 3-D acceleration (gradient)
SST lo-lo: acceleration difference (difference of gradients)
SGG: acceleration differential (gradient of gradient, i.e., Hesse tensor)

Final Conclusion. Over the years geoscientists and especially geodesists have realized the great complexity of the Earth and its environment. In particular, the knowledge of the gravity potential and its level (equipotential) surfaces has become an important issue. It was realized that dedicated highly accurate gravity field sensors, when operating in an isolated manner, have their shortcomings, and combining data from different sensors is therefore the way forward. At this stage of development the determination of the Earth’s gravitational field is a mathematical challenge which should include the numerical progress obtainable by downward continuation of spaceborne data, preferably in appropriate combination with terrestrial data.

1 Mathematical Geodesy

3 Geomathematics: Its Role, Its Aim, Its Potential

In what follows, in order to characterize the role, aim, and potential of geomathematics, we frequently follow the work given, e.g., in [8–10]. Mathematics (from Greek μάθημα, "knowledge, learning") intends to study topics such as quantity, structure, space, and change. Correspondingly, γεωμάθημα (geomathematics) is μάθημα (mathematics) concerned with geoscientific obligations. In our times, geomathematics is thought of as a very young science and a modern area in the realms of mathematics. However, nothing is farther from the truth (cf. [59]). Geomathematics began as man realized that he walked across a sphere-like Earth and that this observation had to be taken into account in measurements and calculations. In consequence, we can only do justice to geomathematics if we look, at least briefly, at its historic importance: According to the oldest evidence which has survived in written form, geomathematics was developed in Sumerian Babylon and ancient Egypt (see Fig. 28) on the basis of practical tasks concerning measuring, counting, and calculation for reasons of agriculture and stock keeping. In the ancient world, mathematics dealing with problems of geoscientific relevance flourished for the first time, for example, when Eratosthenes (276–195 BC) of Alexandria calculated the radius of the Earth.

Fig. 28 Papyrus scroll containing indications of algebra, geometry, and trigonometry due to Ahmose (nineteenth century BC) [Department of Ancient Egypt and Sudan, British Museum EA 10057, London, Creative Commons licence CC-BY-SA 2.0] taken from T. Sonar (2011): 3000 Jahre Analysis, Springer

We also have evidence that the Arabs carried out an arc measurement northwest of Baghdad in the year 827 AD. Further key results of geomathematical research led us from the Orient across the occidental Middle Ages to modern times. N. Copernicus (1473–1543) successfully made the transition from the Ptolemaic geocentric system to the heliocentric system. J. Kepler (1571–1630) determined the laws of planetary motion. Further milestones from a historical point of view are, for example, the theory of geomagnetism developed by W. Gilbert (1544–1608), the development of triangulation methods for the determination of meridians by T. Brahe (1547–1601) and W. Snellius (1580–1626), the laws of falling bodies by G. Galilei (1564–1642), and the basic theory on the propagation of seismic waves by C. Huygens (1629–1695). The laws of gravitation formulated by I. Newton (1643–1727) have taught us that gravitation decreases with increasing distance from the Earth. In the seventeenth and eighteenth centuries, France took over an essential role through the foundation of the Academy in Paris (1666). Successful discoveries were the theory of the isostatic balance of mass distribution in the Earth's crust by P. Bouguer (1698–1758), the calculation of the Earth's shape and especially of the pole flattening by P. L. Maupertuis (1698–1759) and A. C. Clairaut (1713–1765), and the development of the calculus of spherical harmonics by A. M. Legendre (1752–1833) and P. S. Laplace (1749–1829). The nineteenth century was essentially characterized by C. F. Gauss (1777–1855). Especially important were the calculation of the lower Fourier coefficients of the Earth's magnetic field, the hypothesis of electric currents in the ionosphere, and the definition of the level set of the geoid (however, the term "geoid" was introduced by J. B. Listing (1808–1882), a disciple of C. F. Gauss). In 1849, G. G. Stokes (1819–1903) laid the foundation of the fundamental boundary value problems of physical geodesy (cf. [60]).
At the end of the nineteenth century, the basic idea of the dynamo theory in geomagnetics was developed by B. Stewart (1851–1935), among others. This very incomplete list (which does not even include the last century) already shows that geomathematics may be understood as a cultural asset. Indeed, from a historic point of view, it is one of the great achievements of mankind.

3.1 Geomathematics As Task and Objective

When mathematics yields good models of real phenomena, mathematical reasoning can provide insight or predictions about nature. Through the use of abstraction and logic, mathematics developed from counting, calculation, measurement, and the systematic study of the shapes and motions of physical objects. Accordingly, geomathematics deals with the qualitative and quantitative properties of the current or possible structures of the system Earth. It provides concepts for scientific research concerning the system Earth, and it is simultaneously the force behind it. The system Earth consists of a number of elements which represent individual systems themselves. The complexity of the entire system Earth is determined by interacting physical, biological, and chemical processes transforming and transporting energy, material, and information. It is characterized by natural, social, and economic processes influencing one another. In most instances, a simple theory of cause and effect is therefore completely inappropriate if we want to understand the system. We have to think in dynamical structures and to account for multiple, unforeseen, and sometimes even undesired effects in the case of interventions. Inherent networks must be recognized and used, and self-regulation must be accounted for. All these aspects require a type of mathematics which must be more than a mere collection of theories and numerical methods. Mathematics dedicated to the geosciences, i.e., geomathematics, deals with nothing more than the organization of the complexity of the system Earth (Fig. 29). Descriptive thinking is required in order to clarify abstract complex situations. We also need a correct simplification of complicated interactions, an appropriate system of mathematical concepts for their description, and exact thinking and formulations. Geomathematics has thus become the key science of the complex system Earth. Wherever there are data and observations to be processed, e.g., the diverse scalar, vectorial, and tensorial clusters of satellite data, we need mathematics. For example, statistics serves for noise reduction, constructive approximation for compression and evaluation, and the theory of special function systems yields georelevant graphical and numerical representations – there are mathematical algorithms everywhere. The specific task of geomathematics is to build a bridge between mathematical theory and geophysical as well as geotechnical applications.

[Figure: the spheres of the Earth system – biosphere, hydrosphere, atmosphere, anthroposphere, geosphere, pedosphere – arranged around the label "Geosystems Mathematics"]

Fig. 29 Geosystems Mathematics as the key technology penetrating the complex system Earth


[Figure: word cloud pairing geoscientific fields (gravitation, magnetism, seismology, climate and weather, ocean circulation, imaging sciences, plate tectonics, geomechanics/dynamics, satellite technology, navigation, remote sensing) with mathematical disciplines (special functions, integral equations, differential equations, constructive approximation, integral transforms, numerics and data sciences, signal processing, inverse/ill-posed problems, applied functional analysis, applied harmonic analysis, geostatistics, optimization and control) around the label "geomathematics"]

Fig. 30 Geomathematics, its range of fields, and its disciplines

The special attraction of this branch of mathematics is therefore based on the vivid communication between applied mathematicians, who are more interested in model development, theoretical foundation, and the approximate as well as computational solution of problems, and geoengineers and geophysicists, who are more familiar with measuring technology, methods of data analysis, implementation of routines, and software applications. There is a very wide range of modern geosciences on which geomathematics is focused (see Fig. 30), not least because of the continuously increasing diversity of observations. Simultaneously, the mathematical "toolbox" is becoming larger. A special feature is that geomathematics primarily deals with those regions of the Earth which are only insufficiently, or not at all, accessible for direct measurements (even by remote sensing methods). In these cases, indirect mathematical methods of evaluation are absolutely essential. Mostly, a physical quantity is measured in the vicinity of the Earth's surface, and it is then continued downward or upward by mathematical methods until one reaches the depths or heights of interest.

3.2 Geomathematics As Generically Constituted Discipline

No branch of mathematics has influenced general scientific thinking more than the mathematical theories originating from fields of physics such as mechanics. In the mathematical community, there is no doubt that the philosophy and the outcome of this influence on "reality" are the most characteristic feature of our technical civilization. However, a statement of such generality, though accepted by most mathematicians, is most likely questioned by many physicists. The reason for this difference of opinion is the lack of a generally accepted definition of what is meant by the mathematical way of thinking.


In what follows, influenced by a note presented by C. Müller [46], we first make the attempt to show the different tendencies of the lines of thought in the interrelation and interaction of the mathematical and physical sciences. In a second step, we also characterize the specific interplay of geophysics and geomathematics (Fig. 31). Seen from the historic point of view, one of the most important contributions to mathematics is geometry (from the Ancient Greek γεωμετρία: geo- "Earth", -metron "measurement").

Fig. 31 Euclid, Elements 10, Appendix, in the manuscript written in 888: Oxford, Bodleian Library, MS. D'Orville 301, fol. 268r

The earliest recorded beginnings of geometry can be traced to ancient Mesopotamia and Egypt in the 2nd millennium BC. Early geometry was a collection of empirically discovered principles concerning lengths, angles, areas, and volumes, which were developed to meet some practical need in surveying and various crafts. The earliest known texts on geometry (cf. Fig. 28) are the Egyptian Rhind Papyrus (2000–1800 BC), the Moscow Papyrus (roughly 1890 BC), and the Babylonian clay tablets (roughly 1900 BC). Geometry arose independently in a number of early cultures as a practical way of dealing with lengths, areas, and volumes. Geometry began to see elements of formal mathematical science emerging in the West as early as the sixth century BC. By the third century BC, geometry was put into an axiomatic form by Euclid, whose treatment, Euclid's Elements, set a standard for many centuries to follow. It brought the heritage of mathematics from antiquity to our time. Some centuries later, the Greeks themselves replaced the term "geometry", which had meanwhile lost its original meaning of "measuring the Earth", by "geodesy" as the new name for an abstract theory of the "Earth's shape", while geometry now reflected mathematical rigor through its axiomatic method. In fact, it is the earliest example of the format still used in mathematics today, that of definition, axiom, theorem, and proof. Although most of the contents of the "Elements" were already known, Euclid arranged them into a single, coherent logical framework. The "Elements" were known to all educated people in the West until the middle of the twentieth century, and their contents are still taught in geometry classes today. The wealth of knowledge, the high standard of thinking, and the demand for intellectual rigor prove the great development of mathematics in classical Greece.
In this respect, it should be mentioned that Islamic scientists preserved Greek ideas and expanded on them during the Middle Ages (see Fig. 32). By the early seventeenth century, geometry had been put on a solid analytic footing by mathematicians such as René Descartes (1596–1650) and Pierre de Fermat (1607–1665). Since then, and into modern times, geometry has expanded into non-Euclidean geometry and manifolds, describing spaces that lie beyond the normal range of human experience. Hence, while geometry has evolved significantly throughout the years, there are some general concepts that remain more or less fundamental. These include the concepts of points, lines, planes, surfaces, angles, and curves, as well as the more advanced notions of manifolds, topology, and metric. Historians very often use the term modern (also in the geosciences) for the period starting with the Italian Renaissance. Scientists probably call I. Newton (1643–1727) the first modern scientist. The emphasis on a systematic study of laws, such as the Newtonian approach in the "Philosophiae Naturalis Principia Mathematica" (cf. Fig. 33), may be regarded as the most obvious characteristic of modern times. It also initiated a new branch of mathematics, called analysis, which from its very beginning has been so closely related to physics, for example in the field of differential equations, that some people find it difficult to say where mathematics starts and physics ends. The history of the interchange of ideas between the areas of mathematics and physics is a fascinating period in the history of scientific thought. Unfortunately, it is so intricate that its complete account is still to be investigated.

Fig. 32 Euclid's Elements is by far the most famous mathematical work of classical antiquity, and also has the distinction of being the world's oldest continuously used mathematical textbook. Little is known about the author, beyond the fact that he lived in Alexandria around 300 BC. The main subjects of the work are geometry, proportion, and number theory

Fig. 33 Sir Isaac Newton's own first edition copy of his Philosophiae Naturalis Principia Mathematica with his handwritten corrections for the second edition. The first edition was published under the imprint of Samuel Pepys, who was president of the Royal Society. By the time of the second edition, Newton himself had become president of the Royal Society, as noted in his corrections. It has been digitised by Cambridge University Library and can be seen in the Cambridge Digital Library along with other original works by Isaac Newton. The book can be seen in the Wren Library of Trinity College, Cambridge

Nevertheless, there is a large variety of basically different ideas and concepts and an increasing change of the aspects involved in the problems. So it may be concluded that one great contribution of mathematics to modern scientific progress is the setting of the pattern for the theories describing the laws of the real world ("reality"). Without mathematical theory there is no science in a modern sense. Apart from the mathematical aspect of the value of a theory, there is a practical side which is even more important. Once a theory is recognized to be in accordance with experience, it renders the experiment unnecessary. This, of course, is very important where experiments are not possible. The best example of a geodetic field in which experiments could not be carried out was probably astronomical geodesy. So it was not surprising that a new approach to the features of the "real world" had its striking success when it was possible to deduce Kepler's empirical laws on the movements of the planets from Newton's general laws of mechanics. Even more astonishingly, the same rules also apply to today's space research, and nowadays satellite technology is a field in which experiments are the usual practice. One of the best examples to show the difference in emphasis between physics, mathematics, and engineering is gravitational theory, which has played a major part in the history of the sciences from every angle. In fact, it is well suited for investigating different trends in thinking as represented in various fields of science, for example, electrostatics, geomagnetics, etc.

• The first or practically oriented step in every approach to "reality" is to gain experience and collect empirical data on certain aspects of our environment. Here, mathematics is used as a kind of registration or catalog to obtain concise descriptions of the laws of nature in empirical formulas.
• The second or theoretical step is subsequently to set up a system of mathematical definitions and laws from which these formulas and further results may be derived by mathematical argument, which, in turn, are intended to be verified by measurements or experiments. It must be confessed that this stage of theoretical experimentation is most confusing to a mathematician, because it uses his language and formalism without accepting his basically critical attitude. It may, however, be very exciting and stimulating, because a new mathematical concept may be found.

In order to realize the second step in physical geodesy, geodetically reflected potential determination was in a fortunate position, as the Newtonian theory of gravitation had already developed a mathematical status which could be naturally extended so as to give a complete description of the laws of this field. As a consequence, potential theory, as the scientific collection of ideas, concepts, and structures involving Laplace's equation, gained new aspects, and it was challenged with new problems, of which the boundary value problems are probably the best known. Potential theory actually guarantees that, if certain values of a potential under consideration are given on the boundary of a closed body, the potential is determined via the boundary value problem in the interior (or in the exterior, when an additional regularity condition at infinity is supposed to hold true). This assertion, of course, has been checked in many experiments, also under geodetic auspices, but naturally it cannot be verified experimentally in the generality in which it can be stated mathematically. More concretely, at a stage where the theory is regarded as satisfactory from the physicist's point of view, it is a system of fundamental laws, definitions, and problems, of which some, under certain conditions, have been solved mathematically. The problems in their full generality, however, are given to mathematics as conjectures, in a sense, to be proved. They become the object of a study of well-posedness, i.e., existence, uniqueness, and stability proofs, which therefore aim at establishing the consistency of the general physical theory (cf. [28]). Altogether, the aspects of potential theory changed considerably when further constituents could be described by means of Laplace's equation, just as scientific tasks arose from the theory of stationary flow, which indeed uses the same differential equation. It could thus be observed how new physical applications developed new aspects of potential theory and the theory of partial differential equations, primarily originating in geodetic gravitational developments.
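The exterior boundary value problem just described can be made concrete in a small axisymmetric sketch (the boundary data, truncation degree, and radius are hypothetical choices): given values on a sphere of radius R, the exterior harmonic potential regular at infinity is V(r, t) = Σ_n c_n (R/r)^(n+1) P_n(t) with t = cos θ.

```python
import numpy as np
from numpy.polynomial.legendre import legval, leggauss

# Exterior Dirichlet problem for a sphere of radius R (axisymmetric sketch):
# boundary values f(t), t = cos(theta), on r = R determine the harmonic
# potential for r > R via V(r, t) = sum_n c_n (R/r)^(n+1) P_n(t).
R = 1.0
n_max = 8

def boundary_values(t):
    # assumed smooth boundary data (a low-degree polynomial in t)
    return 1.0 + 0.3 * t + 0.1 * (3.0 * t**2 - 1.0) / 2.0

# Legendre coefficients c_n = (2n+1)/2 * int_{-1}^{1} f(t) P_n(t) dt,
# computed with Gauss-Legendre quadrature (exact for our polynomial data).
nodes, weights = leggauss(64)
f = boundary_values(nodes)
c = np.array([(2 * n + 1) / 2.0
              * np.sum(weights * f * legval(nodes, [0.0] * n + [1.0]))
              for n in range(n_max + 1)])

def potential(r, t):
    return sum(c[n] * (R / r)**(n + 1) * legval(t, [0.0] * n + [1.0])
               for n in range(n_max + 1))

# On the boundary the series reproduces the data; far away it decays like 1/r.
print(potential(R, 0.5), boundary_values(0.5))  # both are 1.1375
```

On the boundary the truncated series reproduces the data; for r → ∞ the potential decays like 1/r, which is exactly the regularity condition at infinity mentioned above.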
In the geodetic approach to geopotential theory, as in other scientific fields, the purely mathematical aspect is met by the engineer's need to handle a great number of applications which are not contained in the typical cases by which the physicist verified the theory. It is, for instance, certainly not enough for geodetic engineering to be able to calculate the "geopotential" for spheres and ellipsoids; more complicated bodies (e.g., the real Earth and/or the geoid) must be taken into account. The boundary value problem, which provides a mathematical apparatus for these questions, is therefore not only a purely mathematical subtlety, but also a problem of great practical importance. It is remarkable that the link between mathematics and engineering seems to be much stronger than that between mathematics and physics. Even more remarkably, engineering (especially, methods of non-destructive material testing) encouraged mathematics to deal not only with well-posed problems, but also with ill-posed problems. As a consequence, in potential theory a geodetically relevant example, namely the gravimetry problem, became attackable: it aims at the determination of the density contrast function from Newtonian volume potential values on and outside the integration area (note that the determination of the density contrast function from internal volume potential values (cf. [21]) may be handled in a well-posed framework of partial differential equations, viz. the Poisson differential equation, at least if the density contrast function is assumed to be sufficiently smooth).
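The well-posed direction mentioned in the parenthesis, recovering a smooth density from internal potential values via the Poisson equation ΔV = −4πGρ, can be sketched for a radially symmetric model (the constant density and radius are assumed purely for illustration):

```python
import numpy as np

G = 6.674e-11   # gravitational constant [m^3 kg^-1 s^-2]
rho = 5500.0    # assumed constant density [kg/m^3]
R = 6.371e6     # sphere radius [m]

# Interior potential of a homogeneous sphere (closed form):
# V(r) = 2*pi*G*rho*(R^2 - r^2/3)
r = np.linspace(1.0e5, 5.0e6, 400)
V = 2 * np.pi * G * rho * (R**2 - r**2 / 3)

# Recover the density from the internal potential via Poisson's equation
# Delta V = -4*pi*G*rho, using the radial Laplacian (1/r^2) d/dr (r^2 dV/dr).
h = r[1] - r[0]
dV = np.gradient(V, h)
lap = np.gradient(r**2 * dV, h) / r**2
rho_rec = -lap / (4 * np.pi * G)

print(rho_rec[200])  # ~5500, the assumed density
```

The finite-difference Laplacian reproduces the assumed density away from the interval boundaries; the well-posedness is visible in the fact that small errors in V produce only comparably small errors in ρ.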

44

W. Freeden and M. Schreiner

Looking at the development of the interrelations between mathematics, engineering, and physics during the last century, we are led to the following résumé: The aspects of the mathematically reflected problems of physics as well as engineering brought about that in these fields more proofs are constructive, aiming at actual calculation at least in principle, than is the case in many other fields of mathematics. However, the aspect of a mathematical theory, the way of looking at its problems, is not only changed by contact with outside areas. A theory may get innovative incentives from a change of emphasis and the discovery of a new framework based on mathematical equipment exclusively. For example, the theory of linear spaces and operators, which originated from the Fredholm theory of boundary value problems, is now regarded as the common heading for a large class of problems in analysis. In fact, these ideas created a new mathematical discipline, today called functional analysis. In this respect, C. Müller [46] states: It achieves a general formulation of a variety of problems, brings out the mathematical structure and thus simplifies the theory by emphasizing the essential. By reducing the assumptions to a minimum, it increases the range of applications. As a matter of fact, there can be no doubt that the historic core of the discipline "functional analysis" is the Fredholm theory of integral equations and the Hilbert theory of eigenfunctions. These approaches provided the basis that was then simplified and extended by more abstract concepts.

Nowadays, operator theory of functional analysis, in the manifestation of regularization methods, even helps to handle ill-posed inverse problems for a large variety of applications. In geodesy, two essential areas for ill-posed inverse problem types may be specified, namely the "downward continuation" of satellite data to the Earth's surface and of terrestrial data to the Earth's interior. The rationale in most methods for the resolution (approximate solvability) of ill-posed inverse problems is to construct a "solution" that is physically acceptable as a meaningful approximation and is sufficiently stable from the computational standpoint; hence, an emphasis is put on the distinction between "solution" and "resolution". The main dilemma in the modeling of ill-posed problems (IPPs) is that the closer the mathematical model describes the IPP, the worse is the "condition number" of the associated computational problem (i.e., the more sensitive it is to errors). For ill-posed problems, the difficulty is to bring in additional information about the desired solution, compromises, or new outlooks as aids to the resolution of the IPP. It is conventional to use the phrase "regularization of an ill-posed problem" to refer to various approaches to circumvent the lack of continuous dependence (as well as to bring about existence and uniqueness if necessary). Roughly speaking, this entails an analysis of an IPP via an associated well-posed problem, i.e., a system (usually a sequence or a family) of well-posed problems, yielding meaningful answers to the IPP. We distinguish three aspects of regularization:
• strategy of resolution and reconstruction,
• regularization-approximation schemata,
• regularization algorithms.


One of the purposes of geomathematical work is to dramatize this delineation with reference to specific methods and results. The strategy of resolution and reconstruction of ill-posed problems involves one or more of the following intuitive ideas (cf. [19, 20]):
• change the notion of what is meant by a solution (e.g., ε-approximate solution: ‖Au − y‖ ≤ ε, where ε > 0 is prescribed; quasi-solution: ‖Au − y‖ ≤ ‖Ax − y‖ for all x ∈ M, a prescribed subset of the domain of A; least-squares solution of minimal norm, etc.),
• modify the operator equation or the problem itself,
• change the spaces and/or topologies,
• specify the type of involved noise ("strong" or "weak" noise).
The philosophy of resolution leads to the use of algebraic methods versus function space methods, statistical versus deterministic approaches, strong versus weak noise, etc. A regularization-approximation scheme refers to a variety of methods, such as Tikhonov's regularization, projection methods, multiscale methods, iterative approximation, etc., that can be applied to ill-posed problems. These schemes turn into algorithms once a resolution strategy can be effectively implemented. Unfortunately, this requires the determination of a suitable value of a certain parameter associated with the scheme (e.g., regularization parameter, mesh size, dimension of the subspace in the projection scheme, specification of the level of a scale space, classification of noise, etc.). This is not a trivial problem, since it involves a trade-off between accuracy and numerical stability, a situation that does not usually arise in well-posed problems. From the standpoint of mathematical and numerical analysis one can roughly group "regularization methods" into three categories (cf. [19, 20]):
(i) Regularization methods in function spaces form one category. This includes Tikhonov-type regularization, the method of quasi-reversibility, the use of certain function spaces such as scale spaces in multiresolution analysis, the method of generalized inverses (pseudoinverses) in reproducing kernel Hilbert spaces, and multiscale wavelet regularization.
(ii) The resolution of ill-posed problems by "control of dimensionality" is another category. This includes projection methods and moment-discretization schemes. The success of these methods hinges on the possibility of obtaining an approximate solution while keeping the dimensionality of the finite-dimensional problem within the "range of numerical stability". It also hinges on deriving error estimates for the approximate solutions, which is crucial to the control of the dimensionality.
(iii) A third category consists of iterative and filter methods, which can be applied either to the problem in function spaces or to a discrete version of it. The crucial ingredient in iterative methods is to stop the iteration before instability creeps into the process. Thus iterative methods have to be modified or accelerated so as to provide a desirable accuracy by the time a stopping rule is applied. Filter methods refer to procedures where, for example, singular functions and values producing highly oscillatory solutions are eliminated. Various "low-pass" filters can, of course, be used. The error estimates mentioned in (ii) are also crucial for the determination of a stopping rule.
Finally, it should be noted that there are two main trends in mathematics which cannot exist one without the other: the first is to discover the laws for relations between mathematical concepts, and the second is to make these structures, by a process of generalization and extension, as applicable and powerful as possible. As a consequence, geomathematics as a generic discipline is interpenetrated by an alternative that may be characterized as follows:
• On the one hand, the range of mathematics contains many results which are noted for their generality, like functional analysis and its manifestation in ill-posed problems.
• On the other hand, many of the famous unsolved problems have a very special nature. All these problems deal with questions of great individuality in the sense that the theories generally developed so far do not yet contain the information which is necessary to elucidate the specific ingredients of the conjectures in question.
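The filter idea of category (iii) can be sketched on a synthetic, discretized problem (the smoothing kernel, noise level, and regularization parameter α are illustrative assumptions): the naive inversion divides by all singular values and is destroyed by noise, whereas the Tikhonov filter s/(s² + α) damps the small singular values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Discretized ill-posed problem Au = y: A is a smoothing (hence severely
# ill-conditioned) integral operator on [0, 1]; u_true is the sought function.
n = 60
t = np.linspace(0.0, 1.0, n)
A = np.exp(-20.0 * (t[:, None] - t[None, :])**2) / n  # Gaussian kernel matrix
u_true = np.sin(2 * np.pi * t)
y_noisy = A @ u_true + 1e-4 * rng.standard_normal(n)  # noisy data

U, s, Vt = np.linalg.svd(A)

# Naive inversion: divide by every singular value; tiny s_i amplify the noise.
u_naive = Vt.T @ ((U.T @ y_noisy) / s)

# Tikhonov filter: replace 1/s_i by s_i / (s_i^2 + alpha), damping small s_i.
alpha = 1e-6
u_tik = Vt.T @ ((s / (s**2 + alpha)) * (U.T @ y_noisy))

err_naive = np.linalg.norm(u_naive - u_true) / np.linalg.norm(u_true)
err_tik = np.linalg.norm(u_tik - u_true) / np.linalg.norm(u_true)
print(err_naive, err_tik)  # the regularized error is far smaller
```

Choosing α is exactly the nontrivial parameter-choice problem described above: too small, and noise is amplified; too large, and the solution is over-smoothed.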

3.3 Geomathematics As Interdisciplinary Discipline

The canonical outcome of our considerations is that mathematics as an interdisciplinary science can be found in almost every area of our lives. Mathematics is closely interacting with almost every other science, even medicine and parts of the arts (“mathematization of sciences”). The use of computers allows for the handling of complicated models for real data sets. Modeling, computation, and visualization yield reliable simulations of processes and products. Mathematics is the “raw material” for the models and the essence of each computer simulation. As the key technology, it translates the images of the real world to models of the virtual world, and vice versa (cf. Fig. 34). As a matter of fact, the special importance of mathematics as an interdisciplinary science has been acknowledged increasingly within the last few years in technology, economy, and commerce. However, this process does not remain without effects on mathematics itself. New mathematical disciplines, such as scientific computing, financial and business mathematics, industrial mathematics, biomathematics, and also geomathematics have complemented the traditional disciplines. Interdisciplinarity also implies the interdisciplinary character of mathematics at school. Relations and references to other disciplines (especially informatics, physics, chemistry, biology, and also economy and geography) become more important, more interesting, and more expandable. Problem areas of mathematics become explicit and observable, and they can be visualized. Of course, this undoubtedly also holds for the system Earth.


[Figure: "Real World" connected to "Virtual World" via geomathematics, with arrows labeled "Modeling" and "Observations"]

Fig. 34 Geomathematics as a key technology bridging the real and virtual world

Clearly, geomathematics is closely interconnected with geostatistics and geoinformatics: Statistics (usually seen as a basic subdiscipline of mathematics) is generally devoted to the analysis and interpretation of uncertainties caused by limited sampling of a property under study. In consequence, the focus of geostatistics is the development and statistical validation of models to describe the distribution in time and space of Earth science phenomena. Geostatistical realizations aim at integrating physical constraints, combining heterogeneous data sources, and characterizing uncertainty. Applications cover a large palette of areas, for example, groundwater hydrology, air quality, and land use change, using terrestrial as well as satellite data. Because of both the statistical distribution of sample data and the spatial correlation among the sample data, a large variety of Earth science problems are effectively addressed using statistical procedures. Geomathematics basically differs from geoinformatics. Geomathematics also deals with the further development of the mathematical language itself. Geoinformatics, by contrast, concentrates on the design and architecture of processors and computers, databases and programming languages, etc., in a georeflecting environment. In geomathematics, computers do not represent the objects to be studied, but instead represent technical auxiliaries for the solution of mathematical problems of georeality.

3.4 Geomathematics As Challenge

From a scientific and technological point of view, the past 20th century was a period with two entirely different faces concerning research and its consequences. The first two thirds of the century were characterized by a movement towards a seemingly inexhaustible future of science and technology; they were marked by the absolute belief in technical progress which would make everything achievable in the end. Up to the Sixties, mankind believed to have become the master of the Earth (note that, in geosciences as well as other sciences, to master is also a synonym for
to understand). Geoscience was able to understand plate tectonics on the basis of Wegener's theory of continental drift, geoscientific research began to deal with the Arctic and Antarctic, and man started to conquer the universe by satellites, so that for the first time in mankind's history the Earth became measurable on a global scale. Then, during the last third of the past century, there was a growing skepticism as to whether scientific and technical progress had really taken us forward and whether the consequences of our achievements were justifiable. As a consequence of the specter of a shortage in raw materials (mineral oil and natural gas reserves), predicted by the Club of Rome, geological/geophysical research with the objective of exploring new reservoirs was stimulated during the Seventies. Moreover, the last two decades of the century sensitized us to the global problems resulting from our behavior with respect to climate and environment. Our senses have been sharpened to the dangers caused by the forces of nature, from earthquakes and volcanic eruptions to the development of temperatures and the hole in the ozone layer. Man has become aware of his environment. The image of the Earth as a potato drenched by rainfall (which is sometimes drawn by oceanographers) is not a false one. The humid layer on this potato, maybe only a fraction of a millimeter thick, is the ocean. The entire atmosphere hosting the weather and climate events is only a little bit thicker. Flat bumps protruding from the humid layer represent the continents. The entire human life takes place in a very narrow region of the outer peel (only a few kilometers in vertical extension).
However, the basically excellent comparison of the Earth with a huge potato does not give explicit information about essential ingredients and processes of the system Earth, for example, gravitation, magnetic field, deformation, wind and heat distribution, ocean currents, internal structures, etc. (Fig. 35). In our 21st century, geoproblems currently seem to overwhelm the scientific programs and solution proposals. "How much more will our planet Earth be able to take?" has become an appropriate and very urgent question. Indeed, there has been a large number of far-reaching changes during the last few decades, e.g.,
species extinction, climate change, formation of deserts, ocean currents, structure of the atmosphere, transition of the dipole structure of the magnetic field to a quadrupole structure, etc. These changes have been accelerated dramatically. The main reasons for most of these phenomena are the unrestricted growth in the industrial societies (population and consumption, especially of resources, territory, and energy) and severe poverty in the developing and newly industrialized countries. The dangerous aspect is that extreme changes have taken place within a very short time; there has been no comparable development in the dynamics of the system Earth in the past. Changes brought about by man are much faster than changes due to natural fluctuations. Besides, the current financial crisis shows that a model of affluence (which holds for approximately 1 billion people) cannot be transferred globally to 5–8 billion people. Massive effects on mankind are inevitable. The appalling résumé is that the geoscientific problems collected over the decades must now all be solved simultaneously. Interdisciplinary solutions including intensive mathematical research are urgently required as answers to an increasingly complex world. Geomathematics is absolutely essential for a sustainable development in the future. However, the scientific challenge does not only consist of increasing the leading role of mathematics within the current "scientific consortium Earth". The significance of the subject "Earth" must also be acknowledged (again) within mathematics itself, so that mathematicians will become more enthusiastic about it. Up to now, it has been usual and good practice in application-oriented mathematical departments and research institutions to present applications in technology, economy, finances, and even medicine as being very career-enhancing for young scientists.

Fig. 35 Four significant reasons for the increasing importance of geomathematics: modern high-speed computers are entering all geodisciplines; there is growing public concern about the future of our planet, its climate, its environment, and about an expected shortage of natural resources; there is a strong need for strategies of protection against the threats of a changing Earth; and there is an exceptional situation of obtaining data of better and better quality.

Geomathematics can be integrated smoothly into this phalanx with current subjects like climate change, exploration, navigation, and so on. Of course, basic research is indispensable. Geomathematics should not hide behind the other geosciences! Neither should we wait for the next horrible disaster! Now is the time to turn expressly towards georelevant applications. The Earth as a complex but limited system (with its global problems concerning climate, environment, resources, and population) needs new political strategies. Consequently, these will step by step also demand changes in research due to our modified concept of "well-being" (e.g., concerning milieu, health, work, independence, financial situation, security, etc.). The time has come to realize that geomathematics is also indispensable as a constituting discipline within universities (instead of "ivory tower"-like parity thinking following traditional structures). All in all, mathematics should be one of the leading sciences for the solution of these complex and economically very interesting problems, instead of fulfilling mere service functions.

3.5 Geomathematics As Solution Method

Up to now, trial functions for the description of geoscientifically relevant parameters have frequently been based on the (almost) spherical geometry of the Earth.
Although a mathematical formulation in a spherical context may be a restricted simplification, it is at least acceptable for a large number of problems. In fact, ellipsoidal nomenclature is much closer to geophysical and/or geodetic purposes, at least when the effects of the Earth's centrifugal potential have to be considered, but the computational and numerical amount of work is usually a tremendous obstacle. Usually, in geosciences, we consider a separable Hilbert space such as the L2-space (of functions with finite signal energy) with a (known) polynomial basis as reference space for ansatz functions. However, there is a striking difference between the L2-space over the Earth's body and that over its surface. Continuous "surface functions" can be described to arbitrary accuracy, for example, with respect to the C- and L2-topology, by restrictions of harmonic functions (such as spherical harmonics), whereas continuous "volume functions" contain anharmonic ingredients (for more details see, e.g., [12, 20]). This fact has serious consequences for the reconstruction of signatures. Since the times of C.F. Gauss (1835), a standard method for globally reflected surface approximation involving equidistributed data has been the Fourier series in an orthogonal basis in terms of spherical harmonics. It is characteristic of such an approach that these polynomial ansatz functions do not show any localization in space (cf. Table 3). In the momentum domain (throughout this work called frequency domain), each spherical harmonic corresponds to exactly one single Fourier coefficient reflecting a certain frequency. We call this ideal frequency localization. Due to the ideal frequency localization and the simultaneous absence of localization in space, local data modifications influence all the Fourier coefficients (which have to be determined by global integration). Consequently, local changes also lead to global modifications of the data representations.
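This global sensitivity of the Fourier coefficients to local changes can be illustrated numerically. The following sketch restricts itself to a zonal (rotationally symmetric) signal, so that only Legendre coefficients occur; the cap widths and amplitudes are arbitrary illustrative choices, not values from this chapter:

```python
import numpy as np
from numpy.polynomial import legendre as L

# Gauss-Legendre quadrature on [-1, 1]; t = cos(theta) for a zonal signal.
nodes, weights = L.leggauss(300)

def legendre_coeffs(f, nmax):
    """Fourier (Legendre) coefficients f_n with f(t) = sum_n f_n P_n(t),
    computed via f_n = (2n+1)/2 * integral_{-1}^{1} f(t) P_n(t) dt."""
    ft = f(nodes)
    return np.array([
        (2 * n + 1) / 2 * np.sum(weights * ft * L.legval(nodes, [0] * n + [1]))
        for n in range(nmax + 1)
    ])

# A smooth, spatially localized cap around the north pole (t = 1) ...
f = lambda t: np.exp(-50.0 * (1.0 - t))
# ... and the same signal with an even more localized perturbation added.
g = lambda t: f(t) + 0.1 * np.exp(-200.0 * (1.0 - t))

cf = legendre_coeffs(f, 30)
cg = legendre_coeffs(g, 30)

# Although the modification is purely local, it changes the Fourier
# (Legendre) coefficients of every computed degree.
print((np.abs(cf - cg) > 1e-6).all())
```

The converse effect is implied as well: truncating or perturbing a single coefficient modifies the representation everywhere on the sphere, since each basis function is globally supported.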
Fourier expansions provide approximation by oscillation, i.e., the oscillations grow in number, while the amplitudes become smaller and smaller. Nevertheless, we may state that

Table 3 Uncertainty principle and its consequences for space–frequency localization

  space localization:      no space localization  <-->  ideal space localization
  frequency localization:  ideal frequency localization  <-->  no frequency localization
  kernel type:             Legendre kernel | bandlimited | spacelimited | Dirac kernel

ideal frequency localization has proved to be extraordinarily advantageous due to the important physical interpretability (as multipole moments) of the model and due to the simple comparability of the Fourier coefficients with observables in geophysical and/or geodetic interrelations (see, e.g., [23] and the references therein). From a mathematical and physical point of view, however, ansatz functions would be desirable which show ideal frequency localization as well as localization in space. Such an ideal system of ansatz functions would allow for models of highest resolution in space; simultaneously, individual frequencies would remain interpretable. However, the uncertainty principle, which connects frequency localization and space localization qualitatively and quantitatively, teaches us that both properties are mutually exclusive (except for the trivial case). Extremal ansatz functions in the sense of this uncertainty principle are, on the one hand, spherical polynomials (see Fig. 36), i.e., spherical harmonics (no space localization, ideal frequency localization) and, on the other hand, the Dirac kernel function(al)s (ideal space localization, no frequency localization). In consequence, (spherical harmonic) Fourier methods are surely well suited to resolve low and medium frequency phenomena, while their application becomes critical when high resolution models are required. This difficulty is also well known in theoretical physics, e.g., when describing monochromatic electromagnetic waves or considering the quantum-mechanical treatment of free particles. In this case, plane waves with fixed frequencies (ideal frequency localization, no space localization) are the solutions of the corresponding differential equations, but they certainly do not reflect the physical reality.
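The extremal kernels just described, and the intermediate zonal kernels obtained by superposition, can be written down explicitly. The following is a sketch in the notation customary in this context, assuming an L2-orthonormal system $\{Y_{n,k}\}_{k=1,\dots,2n+1}$ of spherical harmonics of degree $n$ and writing $K^\wedge(n)$ for the symbol of a kernel:

```latex
% Addition theorem: summing over all orders of a fixed degree
% yields a zonal (Legendre) kernel
\sum_{k=1}^{2n+1} Y_{n,k}(\xi)\,Y_{n,k}(\eta)
   \;=\; \frac{2n+1}{4\pi}\,P_n(\xi\cdot\eta),
   \qquad \xi,\eta \in \mathbb{S}^2 .

% General zonal kernel generated by superposition, with symbol K^\wedge(n)
K(\xi,\eta) \;=\; \sum_{n=0}^{\infty} K^\wedge(n)\,\frac{2n+1}{4\pi}\,
   P_n(\xi\cdot\eta).
```

In these terms, the Legendre kernel of degree $m$ corresponds to the symbol $K^\wedge(n) = \delta_{nm}$ (ideal frequency localization), the formal Dirac kernel to $K^\wedge(n) = 1$ for all $n$ (ideal space localization), and bandlimited kernels to symbols vanishing beyond some maximal degree.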
As a remedy, plane waves of different frequencies are superposed to form so-called wave packets, which gain a certain amount of space localization while losing their ideal frequency (spectral) localization. A suitable superposition of polynomial ansatz functions (see [18]) leads to so-called kernel functions/kernels with a reduced frequency but increased space localization (cf. Fig. 37). These kernels can be constructed so as to cover various spectral bands and, hence, can show all intermediate stages of frequency and space localization. The width of the corresponding frequency and space localization is usually controlled by a so-called scale parameter. If the kernel is given by a finite superposition of polynomial ansatz functions, it is said to be bandlimited, while in the case of infinitely many ansatz functions the kernel is called non-bandlimited. It turns out that, due to their higher frequency localization (short frequency band),

Fig. 36 Spherical harmonics of low degrees


Fig. 37 Weighted summation of spherical harmonics involving the addition theorem leading to the generation of space localized zonal kernels

the bandlimited kernels show less space localization than their non-bandlimited counterparts (infinite frequency band). This leads to the following characterization of ansatz functions: Fourier methods with polynomial trial functions are the canonical point of departure for approximations of low-frequency phenomena (global to regional modeling). Because of their excellent localization properties in the space domain, bandlimited and non-bandlimited kernels with increasing space localization can be used for modeling shorter and shorter wavelength phenomena (local modeling). Using kernels of different scales, the modeling approach can be adapted to the localization properties of the physical process under consideration. By use of sequences of scale-dependent kernels tending to the Dirac kernel, i.e., so-called Dirac sequences, a multiscale approximation (i.e., a "zooming in" process) can be established appropriately. Our concept also enables us to deal with simple (scalar, vectorial, and/or tensorial) wavelet techniques, i.e., with multiscale techniques based on special kernel functions: (spherical) scaling functions and wavelets. Typically, the generators of the scaling functions have the characteristics of low-pass filters, i.e., the polynomial basis functions of higher frequencies are attenuated or even completely left out. The generators of the wavelets, however, have the typical properties of band-pass filters, i.e., polynomial basis functions of low and high frequency are attenuated or even completely left out when constructing the wavelet. Thus, wavelet techniques usually lead to a multiresolution of the Hilbert space under consideration, i.e., a certain two-parameter splitting with respect to scale and space. To be more concrete, the Hilbert space under consideration can be decomposed into a nested sequence of
approximating subspaces – the scale spaces – corresponding to the scale parameter. In each scale space, a model of the data function can usually be calculated using the respective scaling functions, thus leading to an approximation of the data at a certain resolution. For increasing scales, the approximation improves, and the information obtained on coarse levels is contained in all levels of approximation above. The difference between two successive approximations is called the detail information, and it is contained in the so-called detail spaces. The wavelets constitute the basis functions of the detail spaces and, summarizing the subject, every element of the Hilbert space can be represented as a structured linear combination of scaling functions and wavelets of different scales and at different positions ("multiscale approximation") (cf. [13, 17, 22] and the references therein). Hence, we are able to break up complicated functions like the geomagnetic field, electric currents, gravitational field, deformation field, oceanic currents, etc., into single pieces of different resolutions and to analyze these pieces separately. This helps to find adaptive methods (cf. Fig. 38) that take into account the specific structure of the data, i.e., in areas where the data show only a few coarse spatial structures, the resolution of the model can be chosen to be rather low; in areas of complicated data structures, the resolution can be increased accordingly. In areas where the accuracy inherent in the measurements is reached, the solution process can be stopped by some kind of thresholding. That is, using scaling functions and wavelets at different scales, the corresponding approximation techniques can be constructed so as to be suitable for the particular data situation.
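The telescoping structure of scale and detail spaces can be sketched in a few lines. The following toy example uses a hypothetical zonal signal on the sphere, represented by its Legendre coefficients, and assumes Shannon-type low-pass symbols for the scaling functions; it shows that the coarse approximation plus all detail parts recovers the signal exactly:

```python
import numpy as np
from numpy.polynomial import legendre as L

rng = np.random.default_rng(1)
J = 5                                  # finest scale: degrees 0 .. 2^J - 1
coeffs = rng.normal(size=2**J)         # Legendre coefficients of a zonal signal

def lowpass(c, j):
    """Scale-space projection P_j: keep degrees n < 2^j (Shannon-type symbol)."""
    out = np.zeros_like(c)
    out[:2**j] = c[:2**j]
    return out

# Detail (wavelet) parts: band-pass filters between consecutive scales.
details = [lowpass(coeffs, j + 1) - lowpass(coeffs, j) for j in range(2, J)]

# Multiresolution: the coarse approximation at scale 2 plus all detail
# information recovers the signal exactly (telescoping sum).
reconstruction = lowpass(coeffs, 2) + sum(details)
t = np.linspace(-1.0, 1.0, 101)
print(np.allclose(L.legval(t, reconstruction), L.legval(t, coeffs)))  # True
```

The thresholding mentioned above would simply drop those detail coefficients whose size stays below the measurement accuracy, so that the sum is truncated adaptively in scale and space.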
Consequently, although most data show correlation in space as well as in frequency, the kernel functions with their simultaneous space and frequency localization allow for the efficient detection and approximation of essential features in the data structure by using only fractions of the original information (decorrelation of signatures) (Fig. 39).

Fig. 38 Scaling functions (upper row) and wavelet functions (lower row) in mutual relation (“tree structure”) within a multiscale approximation


Fig. 39 Earth's Gravitational Model (EGM): Scaling function (upper row) and wavelet function (lower row) reconstruction in mutual relation ("tree structure"). On the one hand, the multiscale approximation of EGM shows that the gravitational potential is a rather smooth function, so that most of its structure can be modeled by low-pass filtering. On the other hand, EGM is sharply decorrelated by band-pass filtering, so that the detail structure for larger scales concentrates on geophysically relevant zones (e.g., subduction zones, areas of strong orogenesis, etc.)

Finally, it is worth mentioning that future spaceborne observation combined with terrestrial and airborne activities will provide huge datasets of the order of millions of data to be continued downward to the Earth's surface. Standard mathematical theory and numerical methods are not at all adequate for the solution of data systems with such a structure, because these methods are simply not adapted to the specific character of the spaceborne problems. They quickly reach their capacity limit even on very powerful computers. In our opinion, a reconstruction of significant geophysical quantities from future data material requires much more: for example, a careful analysis, fast solution techniques, and a proper stabilization of the solution, usually including procedures of regularization (see [7] and the references therein). In order to achieve these objectives, various strategies and structures must be introduced reflecting different aspects. As already pointed out, while global long-wavelength modeling can be adequately done by the use of polynomial expansions, it becomes more and more obvious that splines and/or wavelets are most likely the candidates for medium and short-wavelength approximation. But the multiscale concept of wavelets demands a nature of its own which – in most cases – cannot be developed from the well-known theory in Euclidean spaces. In consequence, the stage is also set to present the essential ideas and results involving a multiscale framework to the geoscientific community.
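The need for stabilization in downward continuation can be made concrete in the simplest (zonal, spherical) setting. In the harmonic-coefficient domain, upward continuation from the Earth's radius R to a satellite radius r damps the degree-n coefficient by the factor (R/r)^(n+1), so naive downward continuation amplifies data noise exponentially with the degree. The sketch below contrasts naive inversion with a Tikhonov-regularized one; all radii, degrees, noise levels, and the regularization parameter are illustrative assumptions, not values from this chapter:

```python
import numpy as np

rng = np.random.default_rng(7)
R, r = 6371.0, 6671.0            # Earth radius, satellite radius in km (~300 km altitude)
n = np.arange(0, 301)            # spherical harmonic degrees 0 .. 300
true = rng.normal(size=n.size) / (1.0 + n)   # synthetic surface coefficients

# Upward continuation damps degree n by (R/r)^(n+1); at n = 300 this factor
# is of the order 1e-6, so the inverse problem is exponentially ill-posed.
d = (R / r) ** (n + 1)
data = d * true + 1e-4 * rng.normal(size=n.size)   # noisy satellite data

naive = data / d                       # unregularized downward continuation
alpha = 1e-6                           # regularization parameter (illustrative)
tikhonov = d * data / (d**2 + alpha)   # Tikhonov filter: damps unstable degrees

err_naive = np.linalg.norm(naive - true)
err_tik = np.linalg.norm(tikhonov - true)
print(err_tik < err_naive)             # regularization clearly pays off
```

The Tikhonov filter factors d_n^2/(d_n^2 + alpha) leave the stable low degrees essentially untouched and suppress the degrees at which the noise amplification 1/d_n exceeds roughly 1/sqrt(alpha); the multiscale regularization methods referred to above achieve a comparable stabilization scale by scale.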

3.6 Geomathematics As Methodology

Current methods of applied measurement and evaluation processes vary strongly, depending on the examined measurement parameters (gravity, electric or magnetic field force, temperature and heat flow, etc.), the observed frequency domain, and the occurring basic "field characteristic" (potential field, diffusion field, or wave
field, depending on the basic differential equations). In particular, the differential equation strongly influences the evaluation processes. The typical mathematical methods are therefore listed here according to the respective "field characteristic" – as it is usually done in geomathematics.

• Potential methods (potential fields, elliptic differential equations) in geomagnetics, geoelectrics, gravimetry, geothermal research, . . .
• Diffusion methods (diffusion fields, parabolic differential equations) in flow and heat transport, magnetotellurics, geoelectromagnetics, . . .
• Wave methods (wave fields, hyperbolic differential equations) in seismics, georadar, . . .

The diversity of mathematical methods will increase in the future due to new technological developments in computer and measurement technology. More intensively than before, we must aim for the creation of models and simulations for combinations and networks of data and observable structures. The process for the solution of practical problems usually has the following components:

• Mathematical modeling: the practical problem is translated into the language of mathematics, requiring the cooperation between application-oriented scientists and mathematicians.
• Mathematical analysis: the resulting mathematical problem is examined as to its "well-posedness" (i.e., existence, uniqueness, dependence on the input data).
• Development of a mathematical solution method: appropriate analytical, algebraic, and/or numerical methods and processes for a specific solution must be adapted to the problem; if necessary, new methods must be developed. The solution process is carried out efficiently and economically by the decomposition into individual operations, usually on computers.
• "Back-transfer" from the language of mathematics to applications: the results are illustrated adequately in order to ensure their evaluation. The mathematical model is validated on the basis of real data and modified, if necessary.
We aim for a good accordance of model and reality. Often, the process must be applied several times in an iterative way in order to get a sufficient insight into the system Earth. Nonetheless, the advantage and benefit of the mathematical processes are a better, faster, cheaper, and more secure problem solution on the basis of the mentioned means of simulation, visualization, and reduction of large amounts of data. So, what is it exactly that enables mathematicians to build a bridge between the different disciplines? The mathematical world of numbers and shapes contains very efficient tokens by which we can describe the rule-like aspect of real problems. This description includes a simplification by abstraction: essential properties of a problem are separated from unimportant ones and included into a solution scheme. Their "eye for similarities" often enables mathematicians to recognize a posteriori that an adequately reduced problem may also arise from very different situations, so that the
resulting solutions may be applicable to multiple cases after an appropriate adaptation or concretization. Without this second step, abstraction remains essentially useless. The interaction between abstraction and concretization characterizes the history of mathematics and its current rapid development as a common language and independent science. A problem reduced by abstraction is considered as a new "concrete" problem to be solved within a general framework, which determines the validity of a possible solution. The more examples one knows, the more one recognizes the causality between the abstractness of mathematical concepts and their impact and cross-sectional importance.

Final Conclusion. The Earth is a dynamic planet in permanent change, due to large-scale internal convective material and energy rearrangement processes as well as manifold external effects. We can, therefore, only understand the Earth as our living environment if we consider it as a complex system of all its interacting components. The processes running on the Earth are coupled with one another, forming ramified chains of cause and effect which are additionally influenced by man, who intervenes in the natural balances and circuits. However, knowledge of these chains of cause and effect is still incomplete to a large extent. Within adequate time, substantial improvements can only be reached by the exploitation of new measurement and observation methods, e.g., by satellite missions, and by innovative mathematical concepts of modeling and simulation – all in all, by geomathematics. As far as future data evaluation is concerned, traditional mathematical methods will be able to master the new amounts of data neither theoretically nor numerically – especially considering the important aspect of a more intensively localized treatment with respect to space and time, embedded into a global concept.
Instead, geoscientifically relevant parameters must be integrated into constituting modules; the integration must be characterized by three essential features: good approximation properties, an appropriate decorrelation ability, and fast algorithms. These features are the key to a variety of abilities and new research directions.

4 Mathematical Geodesy: The Circuit

In what follows, in order to characterize the circuit of mathematical geodesy, we follow the work given in [10]. The basis of geodetic science is its measurements, i.e., scalar numbers, vectors, and tensors, such as distances, angles, directions, velocities, and accelerations. In this respect, the relevance of the gravity field manifests itself in a twofold sense: from the need to handle heights and from the determination of the Earth's shape. Consequently, geodesy realizes a physical rather than a geometrical understanding of height by observing that a point is higher than another if water flows from the first to the second. In other words, "geometric" obligations cannot be separated from physical ones. The gravity field is always present as the driving force.


Nowadays, geodesy as a measuring discipline is in great shape. Computer facilities and measurement and observation methods have opened new research areas and opportunities. However, it is a geodetic trademark to present measured values always together with a suitable modeling procedure for interpretation and an appropriate knowledge and estimation of reliability and accuracy. Following R. Rummel [56], this diligence demonstrates the geodesist's role as a notary of the Earth. As an evident consequence, however, this notarial role explains that geodesy is more than a geoengineering discipline concerned only with measurements. Inherently, as we have seen, mathematics is implied as a key technology bridging the real world of measurements and the virtual world of handling datasets, modeling geodetic quantities and processes, and providing illustrations and interpretations. Once more, the results of measurements are numbers, vectors, and tensors, i.e., raw material. Mathematical handling and approximation of datasets as well as modeling techniques are necessary to connect the "reality space" with the "virtuality space". In this sense, a model represents the result of the transfer; it is intended to be an image of reality, expressed in mathematical language, so that an interaction between abstraction and concretization is involved. The mathematical world of numbers and structures contains efficient tokens by which we are able to describe the rule-like aspect of a real problem. This description includes a simplification by abstraction: essential properties of, e.g., a geodetic problem are separated from unimportant ones, and a solution scheme is set up. The "eye for similarities" enables mathematicians to recognize a posteriori that the resulting solutions become applicable to multiple cases not only in geodesy but also in other scientific disciplines after an appropriate adaptation.
Summarizing, we are led to the following conclusion: Mathematical geodesy is characterized by a twofold interaction. An input problem from reality space ("measurements"), reduced by abstraction and transferred into virtuality space, results in a mathematical output model which, following a "circuit" (cf. Fig. 40), becomes a new "concrete" input problem in reality space (usually in geodesy, but also in other sciences). As a consequence, the ideal geodetic process (circuit) for the solution of geodetic problems (as proposed recently by R. Rummel [56]) canonically shows the following steps to be handled (see also the approach sketched in [9]):

• Transfer from reality to virtuality space: Measurements and observational events in reality space lead to mathematical tokens and quantities as "raw material" for modeling and processing in virtuality space. The observational input is translated into the language of the virtuality space, i.e., mathematics, requiring close cooperation between application-oriented and mathematical scientists.
• Retransfer from virtuality to reality space: Appropriate analytical, algebraic, statistical, stochastic, and/or numerical methods must be taken into account; if necessary, new solution techniques must be proposed. The mathematical model is validated; the aim is a good accordance of model and measurement. If necessary, on the basis of new measurements, the model must be improved by use of modified "raw material".


Fig. 40 The circuit: mathematical geodesy as explained in this work. Observation (= measurement) of events in reality space (distances, directions, run times, velocities, accelerations, . . .) is connected by a transfer (bridging process) to modeling (= solution) on the basis of datasets in virtuality space (potential theory, approximation, statistics, inverse problems, numerics, . . .), followed by a retransfer (interpretation, validation)

Obviously, the benefit of a circuit is a better, faster, cheaper, and more secure problem solution on the basis of the mentioned processes of modeling, simulation, visualization, as well as reduction, decorrelation, and denoising of large amounts of data. The more measurements are available, the more one recognizes the causality between abstraction by mathematical concepts and their impact and cross-sectional importance to reality. Evidently, the circuit in its ideal manifestation (as illustrated in Fig. 40) has to follow an obligatory line, namely to provide an appropriate platform within which mathematically/geodetically interrelated features can be better motivated and understood, thereby canonically leading to an interdisciplinary palette of solution procedures in geodetic areas of application. In accordance with this intention, criteria must be found relative to which the scope and limitations of the various methods can be assessed. This is important both in theory and practice, since there generally is no cure-all method for most geodetic problems. The interaction between abstraction and concretization characterizes the history of geodesy and its efforts as an independent science. The questions, however, are

• why today's geodesists commonly restrict themselves to the reality space ("measurements"), with a necessity to accept some "service fundamentals" of the virtuality space,
• why today's mathematicians are, apart from rare exceptions, not interested in the appropriate handling of geodetically relevant obligations, including specific model developments.

Following an article about the interconnecting roles of geodesy and mathematics presented by H. Moritz [43], a prominent representative of today's geodesy, the actual interrelationship shows a twofold appraisal from history:


• First, H. Moritz [43] states that the old days are gone when Carl Friedrich Gauss (1777–1855) himself developed his epoch-making theories inspired by his geodetic concerns. Gone also are the days when Felix Klein (1849–1925), one of the leading mathematicians of his time, called geodesy "that geometrical discipline in which the idea of approximation mathematics has found its clearest and most consequent expression" (see [35], p. 128). Gone are the times when H. Poincaré (1854–1912) investigated problems of astronomy and geodesy and actively participated in geodetic life. So, are we led to the conclusion that it is apparently the fault of today's mathematicians that they provide mathematics in an increasingly abstract way, without any regard to possible geodetic applications and, so to say in the scheme of Fig. 40, out of touch with reality? Moritz' opinion is as follows: "In part, certainly, they are out of reality."
• Second, H. Moritz [43] is deeply convinced that an increasing abstraction is necessary to achieve progress, not only in mathematics, but also in today's geodesy. What is frequently overlooked by potential geodetic users of mathematical theory is that the modern abstract methods of mathematics, if properly understood, provide an extremely powerful tool for the solution of applied problems which could not be solved otherwise: the more abstract a method is, the more suitable it sometimes is for a concrete problem. Thus, are we also to conclude that it is apparently the fault of modern geodesists to be restricted to measurement tasks, without any regard to the virtuality space providing valuable mathematical concepts and, so to say, out of touch with virtuality? As a consequence, in the sense of Moritz' explications, today's circuits should follow the ideal way – at least to a considerable extent – that was initiated by Gauss, one of history's most influential mathematicians and geodesists, for an extremely fruitful interdisciplinary exchange.
The heritage of Gauss’s work still has much to offer today for building a strong scientific bridge between mathematics and geodesy through a consistent continuation of the interplay between abstraction and concretization. However, it must be conceded that today’s circuits (in the sense depicted in Fig. 40) have become too complex in their transfer demands from reality to virtuality space and vice versa to be handled by a single ingenious geoscientist. In addition, the pace of change in geodesy has accelerated dramatically. A last “tour de force” consolidating a circuit by a single scientist in the aforementioned classical sense was probably the work by L. Hörmander [34] on the “Geodetic Boundary Value Problem”; only a few years later, however, geodetic space observation by GPS rendered Hörmander’s deep model approach unrealistic, since he started from the traditional assumption that the actual Earth’s surface is unknown. Nowadays, the sobering conclusion is that a large number of geodetic problems, in their specific changes and modifications over recent years, must be solved simultaneously. Interdisciplinary solutions are urgently required as an answer to an increasingly complex geodetic world. In the opinion of the authors, the scientific challenge is a “geodetic consortium”, in which mathematics does not remain, for geodesists, an unloved external partner, and geodesy offers problems of high attractiveness to applied mathematicians. Indeed, the leading role of mathematics for obligations in virtuality space must be


W. Freeden and M. Schreiner

acknowledged (again) within today’s geodesy, so that mathematicians will become more enthusiastic about working on geodetic programs. A “geodetic consortium” reflecting the cross-sectional demands in reality as well as virtuality space is absolutely essential for a sustainable development in the future. No doubt, as this handbook will show, mathematicians can and should be integrated smoothly into the geodetic phalanx instead of restricting geodesists exclusively to measurements and mathematicians to mere service functions. Only a “geodetic consortium” consisting of scientists with equal standing, rights, and research position will be able to promote the significance of geodesy in its responsibility for society, similarly to the Gaussian epoch. This “Handbook of Mathematical Geodesy” (HbMG) aims at providing innovative mathematical instruments in virtuality space in concrete adaptation to recent demands of gravity-field-oriented geodesy. It presents geodetically relevant tools and techniques from functional analysis, potential theory, constructive approximation, inverse theory, and numerics. A selected list of topics includes the mathematical handling of geodetic inverse problems by least squares as well as multiscale regularization, in particular mollifier techniques in gravimetry, geodetic statistical methods, special-function-based methods, approximation and numerical methods, e.g., for oblique boundary value problems, reference systems and monitoring methods, and the mathematical foundation of satellite and space techniques, including “downward continuation problems” such as satellite gravity gradiometry. The handbook is meant as a mathematical addendum to the foundations in use within today’s virtuality space (cf. Fig. 40).
The goal of the handbook is twofold:

• to make mathematicians aware of the particular mathematical developments and calamities occurring in modern geodetic concretizations,
• to make geodesists conscious of new tools, means, structures, methods, and procedures for handling recent measurements and observations by mathematical abstraction.

All in all, the handbook “Mathematical Geodesy” is understood as an essential step towards modern manifestations of “geodetic consortia” realizing the cross-sectional demands and requirements of today’s circuits in a well-balanced interdisciplinary way. In addition, it represents the missing volume in Springer’s synopsis of geodetic science today.

Literature

1. Anderson, J.M., Mikhail, E.M.: Surveying: Theory and Practice. McGraw Hill, Boston (1998)
2. Blick, C.: Multiscale Potential Methods in Geothermal Research: Decorrelation Reflected Post-Processing and Locally Based Inversion. Ph.D. Thesis, University of Kaiserslautern, Geomathematics Group, Verlag Dr. Hut, Munich (2015)
3. Blick, C., Freeden, W.: Spherical spline application to radio occultation data. J. Geodetic Sci. 1, 379–396 (2011)



4. Blick, C., Freeden, W., Nutz, H.: Feature extraction of geological signatures by multiscale gravimetry. GEM Int. J. Geomath. 8, 57–83 (2017)
5. Blick, C., Freeden, W., Nutz, H.: Gravimetry and exploration. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics. Birkhäuser/Springer International Publishing, Basel/New York/Heidelberg (2018)
6. Bruns, H.: Die Figur der Erde. Ein Beitrag zur europäischen Gradmessung. P. Stankiewicz, Berlin (1878)
7. Freeden, W.: Multiscale Modelling of Spaceborne Geodata. B.G. Teubner, Stuttgart/Leipzig (1999)
8. Freeden, W.: Geomathematik, was ist das überhaupt? Jahresber. Deutsch. Math.-Vereinigung (DMV) 111, 125–152 (2009)
9. Freeden, W.: Geomathematics: its role, its aim, and its potential. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn., pp. 3–78. Springer, Heidelberg (2015)
10. Freeden, W.: Introduction. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy, pp. X–XIV. Birkhäuser/Springer International Publishing, Basel/New York/Heidelberg (2018)
11. Freeden, W., Blick, C.: Signal decorrelation by means of multiscale methods. World of Mining 65, 1–15 (2013)
12. Freeden, W., Gerhards, C.: Geomathematically Oriented Potential Theory. Chapman and Hall/CRC Press, Boca Raton/London/New York (2013)
13. Freeden, W., Gervens, T., Schreiner, M.: Constructive Approximation on the Sphere (With Applications to Geomathematics). Oxford Science Publications, Clarendon/Oxford (1998)
14. Freeden, W., Kersten, H.: The Geodetic Boundary-Value Problem Using the Known Surface of the Earth. Veröff. Geod. Inst. RWTH Aachen, vol. 29, Aachen (1980)
15. Freeden, W., Kersten, H.: A constructive approximation theorem for the oblique derivative problem in potential theory. Math. Methods Appl. Sci. 4, 104–114 (1981)
16. Freeden, W., Mayer, C.: Multiscale solution for the Molodensky problem on regular telluroidal surfaces. Acta Geod. Geophys. Hung. 41, 55–86 (2006)
17. Freeden, W., Michel, V.: Multiscale Potential Theory (With Applications to Geoscience). Birkhäuser, Boston (2004)
18. Freeden, W., Michel, V., Simons, F.J.: Spherical harmonics based special function systems and constructive approximation methods. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 753–820. Springer International Publishing, Basel/New York/Heidelberg (2018)
19. Freeden, W., Nashed, M.Z.: Operator-theoretic and regularization approaches to ill-posed problems. GEM Int. J. Geomath. 9, 1–115 (2018)
20. Freeden, W., Nashed, M.Z.: Ill-posed problems: operator methodologies of resolution and regularization. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 201–314. Springer International Publishing, Basel/New York/Heidelberg (2018)
21. Freeden, W., Nashed, M.Z.: Gravimetry as an ill-posed problem in mathematical geodesy. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 641–686. Springer International Publishing, Basel/New York/Heidelberg (2018)
22. Freeden, W., Nashed, M.Z., Schreiner, M.: Spherical Sampling. Geosystems Mathematics. Springer International Publishing, Basel/New York/Heidelberg (2018)
23. Freeden, W., Nutz, H.: Geodetic observables and their mathematical treatment in multiscale framework. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 315–458. Springer International Publishing, Basel/New York/Heidelberg (2018)
24. Freeden, W., Nutz, H., Schreiner, M.: Geomathematical advances in satellite gravity gradiometry (SGG). In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 501–604. Springer International Publishing, Basel/New York/Heidelberg (2018)



25. Freeden, W., Schreiner, M.: Spherical Functions of Mathematical Geosciences – A Scalar, Vectorial, and Tensorial Setup. Springer, Heidelberg (2009)
26. Groten, E.: Geodesy and the Earth’s Gravity Field I + II. Dümmler, Bonn (1979)
27. Göttl, F., Rummel, R.: A geodetic view on isostatic models. Pure Appl. Geophys. 166, 1247–1260 (2009)
28. Hadamard, J.: Sur les problèmes aux dérivées partielles et leur signification physique. Princeton University Bulletin, 49–52 (1902)
29. Heiskanen, W.A., Moritz, H.: Physical Geodesy. Freeman, San Francisco (1967)
30. Heiskanen, W.A., Moritz, H.: Physical Geodesy. Reprint, Institute of Physical Geodesy, Technical University Graz (1981)
31. Helmert, F.R.: Die Mathematischen und Physikalischen Theorien der Höheren Geodäsie, vol. 1. Teubner, Leipzig (1880)
32. Hofmann-Wellenhof, B., Moritz, H.: Physical Geodesy. Springer, Wien/New York (2005)
33. Hörmander, L.: Pseudodifferential operators. Commun. Pure Appl. Math. 18, 501–517 (1965)
34. Hörmander, L.: The Boundary Problems of Physical Geodesy. The Royal Institute of Technology, Division of Geodesy, Report 9, Stockholm (1975)
35. Klein, F.: Elementarmathematik III. Die Grundlagen der Mathematischen Wissenschaften, Band 16. Springer, Berlin (1928)
36. Koch, K.R., Pope, A.J.: Uniqueness and existence for the geodetic boundary value problem using the known surface of the Earth. Bulletin Géodésique 106, 467–476 (1972)
37. Lambeck, K.: The Earth’s Variable Rotation: Geophysical Causes and Consequences. Cambridge University Press, Cambridge (2005)
38. Lemoine, F.G., Kenyon, S.C., Factor, J.K., Trimmer, R.G., Pavlis, N.K., Shinn, D.S., Cox, C.M., Klosko, S.M., Luthcke, S.B., Torrence, M.H., Wang, Y.M., Williamson, R.G., Pavlis, E.C., Rapp, R.H., Olson, T.R.: The Development of the Joint NASA GSFC and NIMA Geopotential Model EGM96. NASA/TP-1998-206861, NASA Goddard Space Flight Center, Greenbelt (1998)
39. Listing, J.B.: Über unsere jetzige Kenntnis der Gestalt und Größe der Erde. Dietrichsche Verlagsbuchhandlung, Göttingen (1873)
40. Marussi, A.: Intrinsic Geodesy. Springer, Berlin/Heidelberg (1985)
41. Misner, C.W., Thorne, K.S., Wheeler, J.A.: Gravitation. Freeman and Company, San Francisco (1973)
42. Molodensky, M.S., Eremeev, V.F., Yurkina, M.I.: Methods for Study of the External Gravitational Field and Figure of the Earth. Trudy TSNIIGAiK 131, Geodezizdat, Moscow (1960). English translation: Israel Program for Scientific Translation, Jerusalem (1962)
43. Moritz, H.: Geodesy and mathematics. Zeszyty Naukowe Akademii Górniczo-Hutniczej im. Stanisława Staszica, No. 780, Geodezja 63, pp. 38–43, Krakow (1981)
44. Moritz, H.: Classical physical geodesy. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, pp. 253–290. Springer, Heidelberg (2015)
45. Moritz, H., Mueller, I.I.: Earth Rotation. Ungar, New York (1987)
46. Müller, C.: Aspects of differential equations in mathematical physics. In: Langer, R.E. (ed.) Partial Differential Equations and Continuum Mechanics, pp. 3–8. The University of Wisconsin Press, Madison (1961)
47. Neumann, F.: Vorlesungen über die Theorie des Potentials und der Kugelfunktionen, pp. 135–154. Teubner, Leipzig (1887)
48. Oberg, J.: Space myths and misconceptions. Omni 15(7), 38–40 (1993)
49. Pavlis, N.K., Holmes, S.A., Kenyon, S.C., Factor, J.K.: The development and evaluation of the Earth Gravitational Model 2008 (EGM2008). J. Geophys. Res.: Solid Earth (1978–2012) 117(B4) (2012). https://doi.org/10.1029/2011JB008916
50. Pizzetti, P.: Corpi equivalenti rispetto alla attrazione newtoniana esterna. Rom. Acc. L. Rend. 18, 211–215 (1909)
51. Pizzetti, P.: Sopra il calcolo teorico delle deviazioni del geoide dall’ellissoide. Atti R. Accad. Sci. Torino 46, 331–350 (1910)



52. Rummel, R.: Geodesy’s contribution to geophysics. ISR Interdiscipl. Sci. Rev. 9(2), 113–122 (1984)
53. Rummel, R.: Geodesy. In: Nierenberg, W.A. (ed.) Encyclopedia of Earth System Science, vol. 2, pp. 253–262. Academic Press, New York (1992)
54. Rummel, R.: Spherical spectral properties of the Earth’s gravitational potential and its first and second derivatives. In: Rummel, R., Sansò, F. (eds.) Lecture Notes in Earth Science, vol. 65, pp. 359–404. Springer, Berlin (1997)
55. Rummel, R.: Dynamik aus der Schwere – Globales Gravitationsfeld. In: An den Fronten der Forschung. Kosmos – Erde – Leben, Verhandlungen der Gesellschaft Deutscher Naturforscher und Ärzte (122. Versammlung, Halle), 69–77 (2002)
56. Rummel, R.: Geodäsie in Zeiten des Wandels – Versuch einer Standortbestimmung. Zeitschrift für Geodäsie, Geoinformation und Landmanagement (ZfV) 139, 211–216 (2014)
57. Rummel, R., Sünkel, H., Tscherning, C.C.: Comparison of Global Topographic/Isostatic Models to the Earth’s Observed Gravity Field. Department of Geodetic Science and Surveying, Ohio State University, Report No. 388 (1988)
58. Runge, C.: Zur Theorie der eindeutigen analytischen Funktionen. Acta Math. 6, 229–234 (1885)
59. Sonar, T.: 3000 Jahre Analysis. Springer, Berlin/Heidelberg/New York (2011)
60. Stokes, G.G.: On the variation of gravity at the surface of the Earth. Trans. Cambr. Phil. Soc. 148, 672–712 (1849)
61. Torge, W., Müller, J.: Geodesy. De Gruyter, Berlin (2012)
62. Walsh, J.L.: The approximation of harmonic functions by harmonic polynomials and by harmonic rational functions. Bull. Am. Math. Soc. 35, 499–544 (1929)
63. https://web.archive.org/web/20160112123725/http://grin.hq.nasa.gov/ABSTRACTS/GPN-2000-001138.html

2 Inverse Problems of Geodesy

An Outline of Mathematical Solution Strategies

Willi Freeden and Helga Nutz

Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
2 Basic Equations of the Theory of Inverse Problems . . . . . . . . . . . . . . . . . . . 68
3 Strategies and Dilemmas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4 Summary and Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88


This contribution is part of the Handbuch der Geodäsie, volume “Mathematische Geodäsie/Mathematical Geodesy”, edited by Willi Freeden, Kaiserslautern.

W. Freeden
Geomathematics Group, Mathematics Department, University of Kaiserslautern, Kaiserslautern, Germany
E-mail: [email protected]

H. Nutz
CBM – Gesellschaft für Consulting, Business und Management mbH, Bexbach, Germany
E-mail: [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020
W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_92



W. Freeden und H. Nutz

Abstract

This contribution is concerned with mathematical potentials and strategies for the solution of inverse problems in geodesy. The dilemmas with respect to existence, uniqueness, and stability of a solution process are described. The need for regularization is pointed out, and specific properties of regularization procedures are briefly sketched.

Keywords

Geomathematics as a key technology of geodetic exploration · Gradiometry · Gravimetry · Problem settings and solution potentials · Strategies and dilemmas

1 Introduction

Geodesists often hold the view that inverse problems such as gravimetry or gradiometry no longer pose a mathematical problem today, and that the risk of false statements and misinterpretations is largely under control. To support this view, spherical-harmonic-based methods are invoked, which suppress high-frequency signal components and can indeed be applied globally with good success to the trend representation of the Earth’s gravitational potential, for example from satellite data. They thus tempt one to conclude, for instance, that no serious mathematical problems are encountered when continuing satellite data down to Earth level (the “downward continuation problem”). However, even in the successful application of spherical harmonic methods in finite-dimensional “polynomial spaces” (as reference spaces), the formal apparatus underlying the geophysical basic equations is largely disregarded. On closer analysis we are, in fact, immediately and inevitably led to infinite-dimensional Hilbert spaces as reference spaces. For a mathematician, the following closely related questions arise in this context:

• Are the specific Hilbert space ingredients, such as the notions of distance, angle, and convergence, sufficient structures to describe solution strategies for an inverse problem?
• Is it conceivable that inverse problems such as gravimetry or gradiometry, by virtue of their Hilbert space formulation, lead to the dilemma of being considered unsolvable?



• Is a solution – insofar as it exists under certain restrictions – unique at all?
• Is an approximation process toward a (substitute) solution stable?

Answering these questions is the subject of this contribution. The answers are based strictly on the intrinsic structure of the mathematical tools employed and on the context appearing in the model, so as to enable a scientifically irrefutable objectification of statements. The governing principle is that simplifying conclusions must substantially reflect the mathematical complexity of the solution; otherwise they are by their very nature worthless. With such a mathematically founded claim to objectivity, the authors of this contribution are, of course, aware that a certain basic knowledge of mathematics must be assumed on the part of the reader. In particular, the presentation of the basic equations occurring in physical geodesy (cf. Sect. 2) is inconceivable without a certain formal apparatus. In this contribution, intended as an introduction, we have nevertheless attempted to arrange the overall presentation in such a way that the formula-laden Sect. 2 can largely be skipped without impairing the content and significance of the mathematical facts and conclusions drawn in Sect. 3. The essential contents of the contribution, i.e., strategies and dilemmas of the inverse problems of geodesy, are treated in Sect. 3. The aim there is a comprehensible overview of the solution potential available today, in a typically mathematical manner of proceeding (cf. also [18, 19, 22]). The bridge between reality and model canonically involves a simplification by abstraction: essential properties of an applied problem are separated from unimportant ones and enter a solution scheme.
The mathematical eye for commonalities makes it possible to recognize afterwards that a suitably reduced problem can also arise from entirely different contexts and procedures. Accordingly, the resulting solutions become versatile when appropriately adapted and concretized; however, they also exhibit the same solution dilemmas. Nevertheless, it is this interplay of abstraction and concretization that characterizes the genesis of mathematical solution methods in geodesy as well, with mathematics, by means of its unifying language, including computer methods and suitable processing of data systems, undoubtedly contributing to their high standard. The contribution closes in Sect. 4 with a synopsis of the current mathematical components in the theory of inverse problems of geodesy and an appreciation of the causal connection between the abstractness of mathematical concepts and their power and cross-sectional capability, not only for the concerns of geodesy.


2 Basic Equations of the Theory of Inverse Problems

Fundamentally, two types of inverse problems can be distinguished in mathematical geodesy:

(i) Downward computation: determination of the density inside a region B of the Earth’s interior (and thus of information on geological formations in this region) from gravitational field information measured on the Earth’s surface.
(ii) Downward continuation: determination of gravitational field information for a region B of the Earth’s surface from gravitational field information measured at satellite altitude (and/or aircraft altitude).

A combination of both problems, i.e., the determination of density information from gravitational field information in the exterior of the Earth, is also possible. Common to both types is that they lead to a (Fredholm) integral equation of the (first) kind

∫_B K(x, y) F(y) dy = P(x),    x ∈ B,    (1)

with a prescribed kernel function K : B × B → R. Here, B is the region under investigation, and B̄ = B ∪ ∂B is the topological closure of B, where, by convention, ∂B denotes the boundary of B (depending on the problem, dy is the volume element dV(y) or the surface element dS(y)). Characteristic is the inverse nature of the integral equation (1): it leads from a measured “effect” P to a desired “cause” F. A measuring system delivers P(x) discretely at points x in the “measurement region”; sought is F, ideally continuously, at all points of the “cause region”. For the integral equations of interest in geodesy there is, particularly for noisy data, often no fully satisfactory solution theory. We will address the specific features of these questions in Sect. 3. In the following, we describe the three problem settings with which, according to the considerations at the beginning of this section, we are confronted in geodesy:

(i) Inverse gravimetry by “downward computation” is, following Newton’s law, given by a volume integral equation representing a convolution of the density distribution F with the Newton kernel, i.e., the reciprocal distance between the integration variable and the evaluation point (see Sect. 2.1).



(ii) Inverse gradiometry by “downward continuation” leads, under the assumption that the satellite orbit and the Earth’s surface are each referred to spherical reference surfaces in the spirit of Runge’s idea (cf. [3, 15, 20, 21]), to a (surface) spherical integral equation whose kernel K(·, ·) consists of certain second derivatives of the so-called Poisson kernel (see Sect. 2.2).
(iii) Inverse gradiometry/gravimetry by “downward computation” is either a composition of (i) and (ii), or it results canonically from a modification of the Newtonian volume potential (see Sect. 2.3).
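Before turning to these settings individually, the common difficulty of the first-kind equation (1) can be made visible with a minimal numerical sketch. It assumes a hypothetical smooth Gaussian kernel on B = [0, 1] as a stand-in for the geodetic kernels of Sects. 2.1–2.3; the condition number of the discretized operator grows rapidly under refinement, so small errors in the “effect” P are amplified enormously in the reconstructed “cause” F.

```python
import numpy as np

# Midpoint-rule discretization of the first-kind equation (1) on B = [0, 1]:
# sum_j K(x_i, y_j) F(y_j) h = P(x_i).  The Gaussian kernel is a hypothetical
# stand-in chosen only to exhibit the typical ill-conditioning.
def kernel_matrix(n, sigma=0.1):
    h = 1.0 / n
    t = (np.arange(n) + 0.5) * h                 # midpoints of the cells
    x, y = np.meshgrid(t, t, indexing="ij")
    return np.exp(-((x - y) ** 2) / (2.0 * sigma**2)) * h

# The condition number explodes as the discretization is refined.
for n in (10, 20, 40):
    print(n, np.linalg.cond(kernel_matrix(n)))
```

The growth of the condition number is the finite-dimensional shadow of the unboundedness of the inverse operator discussed in Sect. 3.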

2.1 Classical Gravimetry

Our starting point in the case of classical gravimetry is the Poisson differential equation, which, by means of the Laplace operator Δ, is given for a geometrically distinguished region B ⊂ R³, whose boundary ∂B permits the application of the Gauss theorem known from vector analysis, by

F = −ΔP,    (2)

where, according to the Newtonian theory, P is the gravitational potential of the (sufficiently smooth) density distribution F in B. The fundamental solution of the Laplace operator Δ, i.e., the Newton kernel, is given by the rotationally symmetric kernel function

K(Δ; |x − y|) = 1/(4π |x − y|),    x, y ∈ R³, x ≠ y    (3)

(see, e.g., [8, 16, 17] for further potential-theoretic details). The general solution of the Poisson equation (2) can be stated by means of the fundamental solution of the Laplace operator Δ in the form of the (volume) Newtonian integral representation

P(x) = ∫_B K(Δ; |x − y|) F(y) dV(y),    x ∈ R³,    (4)

with the Newton kernel K(Δ; |x − y|) under the integral, where dV is the volume element in R³ in the case of a region B ⊂ R³. The classical application in geodesy arises via the so-called inverse gravimetry problem, i.e., the determination of the density function F in B appearing in the Newton integral (4) from information on the gravitational potential P, usually on the boundary ∂B of the region, but also from the exterior space (e.g., measurements from low-flying aircraft). Further mathematical literature on inverse gravimetry as one of the classical exploration methods can be found in [1, 6, 7, 15, 17, 21, 29, 34, 45, 54]



or also in contributions to this handbook. Innovative methods supporting and improving the classical evaluation procedures can be found, e.g., in [8, 9].

Observable-based gravimetry: In this method, other geodetic measured values (such as gravity anomalies) are used instead of the potential values P(x) for points x ∈ ∂B (or, more generally, in R³\B). The methodological procedure is largely analogous in this case.
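The forward direction of the Newton integral (4) can be illustrated on a simple hypothetical test case: for a homogeneous ball of radius a with unit density, the exterior potential has the closed form a³/(3|x|), which a Monte Carlo approximation of (4) reproduces.

```python
import numpy as np

# Monte Carlo sketch of the Newton integral (4) for the unit-density ball of
# radius a: the exterior potential is a^3 / (3 |x|).
rng = np.random.default_rng(0)
a = 1.0
x = np.array([0.0, 0.0, 2.0])            # exterior evaluation point, |x| = 2

# Uniform samples in the ball B of radius a (rejection sampling from a cube).
pts = rng.uniform(-a, a, size=(400_000, 3))
pts = pts[np.linalg.norm(pts, axis=1) <= a]

vol = 4.0 / 3.0 * np.pi * a**3
newton_kernel = 1.0 / (4.0 * np.pi * np.linalg.norm(x - pts, axis=1))
P_mc = vol * newton_kernel.mean()        # estimate of (4) with F = 1
P_exact = a**3 / (3.0 * np.linalg.norm(x))
print(P_mc, P_exact)                     # the two values agree closely
```

The inverse direction, recovering F from such potential values, is the ill-posed problem discussed in Sect. 3.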

2.2 Classical Gradiometry

Preliminary remarks: As is customary in present-day mathematics, we denote the sphere in Euclidean space R³ around the origin with radius R by S²_R, i.e.,

S²_R = {x ∈ R³ : |x| = R}.    (5)

For simplification, we further assume that the Earth’s surface S²_R and the satellite orbit S²_r, R < r, are spheres around the origin 0 (i.e., the center of mass of the spherical Earth model) with radii R and r, respectively. The flight altitude of the satellite is thus given by r − R. The ball around the origin with radius R is defined by

B³_R = {x ∈ R³ : |x| < R}.    (6)

Upward continuation: The so-called “upward continuation” assigns to a function F on the sphere S²_R (the Earth’s surface) the corresponding values in the exterior space (e.g., on the satellite orbit). Formulated mathematically more explicitly, we consider the mapping A that assigns to a function F ∈ L²(S²_R), i.e., a square-integrable density, the function P:

A : F ↦ P = A[F]    (7)

with

P(x) = A[F](x) = 1/(4πR) ∫_{S²_R} (|x|² − R²)/|x − y|³ F(y) dS(y),    |x| > R,    (8)

where dS denotes the surface element and (|x|² − R²)/(4πR |x − y|³) is the Poisson kernel. The result of this mapping is a function P harmonic in the exterior of the ball B³_R, i.e., in R³\B³_R; that is,

x ↦ P(x) = A[F](x),    x ∈ R³\B³_R,    (9)

satisfies the Laplace equation in the exterior space,

Δ_x P(x) = 0,    x ∈ R³\B³_R.    (10)
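The upward continuation operator (8) can be checked on a simple example: for the constant function F ≡ 1 on S²_R the Poisson integral yields P(x) = R/|x| for |x| > R. The sketch below approximates the surface integral by Monte Carlo over uniformly distributed points on the sphere.

```python
import numpy as np

# Monte Carlo check of the upward continuation (8) with F = 1 on S²_R,
# for which P(x) = R/|x| holds for |x| > R.
rng = np.random.default_rng(1)
R, r = 1.0, 1.5
x = np.array([0.0, 0.0, r])                        # point on the "orbit"

y = rng.normal(size=(200_000, 3))
y *= R / np.linalg.norm(y, axis=1, keepdims=True)  # uniform points on S²_R

kernel = (np.dot(x, x) - R**2) / (4.0 * np.pi * R *
                                  np.linalg.norm(x - y, axis=1) ** 3)
P_mc = 4.0 * np.pi * R**2 * kernel.mean()          # estimate of (8)
print(P_mc, R / r)                                 # both close to 2/3
```

This forward step is stable; the instability enters only when the direction is reversed, as described next.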

The space A[L²(S²_R)] forms a Hilbert space of Sobolev type (see [15, 23]). For every r > R, A[L²(S²_R)]|_{S²_r} is a proper subspace of L²(S²_r).

Downward continuation: The inverse problem consists of determining the function F, square-integrable on the Earth’s surface S²_R, from knowledge of the values of the function P on the satellite orbit S²_r, i.e.,

P(x) = A[F](x),    x ∈ S²_r.    (11)

Since, as already mentioned above, A[L²(S²_R)]|_{S²_r} is a proper subspace of L²(S²_r), the problem of downward continuation is not solvable for all functions in L²(S²_r). Downward continuation can only be carried out if the integral equation A[F] = P has a right-hand side that is the restriction to S²_r of a potential P. This fact, however, is of no practical relevance, since for the modeling of F one may start from discretely given values of P on the satellite orbit S²_r. Of essential importance for the modeling, however, is the following fact: the operator A is a bounded mapping from the space L²(S²_R) into the Sobolev space A[L²(S²_R)]|_{S²_r} ⊂ L²(S²_r). On A[L²(S²_R)]|_{S²_r} it even possesses a uniquely defined inverse. Unfortunately, by the laws of functional analysis, this inverse is not bounded on the space A[L²(S²_R)]|_{S²_r}, i.e., the inversion process cannot be carried out stably. It requires a so-called “regularization”, in which the unbounded inverse operator is approximated by a family of bounded operators. In the modeling one then suitably replaces (ideally following a “stopping rule”) the unbounded operator by a bounded regularization operator. The assumption of a spherical satellite orbit is not strictly necessary for the facts described above. Downward continuation can thus also be understood more generally as the inverse problem of determining F ∈ L²(S²_R) from the prescription of a potential P of the Sobolev space A[L²(S²_R)] (entire exterior space of the Earth) instead of A[L²(S²_R)]|_{S²_r} (spherical satellite orbit). Consequently, discrete potential values from the entire exterior space of the Earth can enter the inverse problem. The uniquely determined inverse, of course, remains unbounded.
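The unboundedness of the inverse and the effect of regularization can be sketched in spectral terms: in a spherical harmonic basis (a diagonalization not derived here), upward continuation damps a degree-n coefficient by (R/r)^(n+1), so the inverse amplifies by (r/R)^(n+1), without bound as n grows. A Tikhonov-type regularization replaces the unbounded factor 1/σ_n by the bounded factor σ_n/(σ_n² + α). The radii below are illustrative values.

```python
import numpy as np

# Spectral sketch of downward continuation: sigma_n = (R/r)^(n+1) are the
# damping factors of the upward continuation operator A per degree n; the
# naive inverse 1/sigma_n grows without bound, while the Tikhonov factor
# sigma_n/(sigma_n^2 + alpha) is capped at 1/(2*sqrt(alpha)).
R, r, alpha = 6371.0, 6621.0, 1e-6       # illustrative radii in km
n = np.arange(0, 201)
sigma = (R / r) ** (n + 1)               # damping factors of A

naive = 1.0 / sigma                      # unbounded inverse
tikhonov = sigma / (sigma**2 + alpha)    # bounded regularized inverse

print(naive[-1])                         # amplification of order 10^3 at degree 200
print(tikhonov.max())                    # never exceeds 1/(2*sqrt(alpha)) = 500 here
```

The parameter α plays the role of the stopping rule mentioned above: it trades amplification of data noise against fidelity to the unregularized inverse.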

2.2.1 Exterior-Space-Based Gradiometry

In the following, we specifically denote the Earth by B, assuming in the modeling that the Earth’s center of mass coincides with the origin of the coordinate system. The Earth’s surface is then given by ∂B.



The so-called “Runge sphere” B³_R is a ball lying entirely in the interior of B. By Runge’s theorem (see, e.g., [2, 3, 15, 23]) we may assume that the actual gravitational potential of the Earth B, as a function harmonic in the exterior space R³\B and continuous there, can be approximated (arbitrarily closely in uniform approximation) in the region of investigation R³\B by a function P harmonic in the exterior of the Runge sphere R³\B³_R.

Scalar gradiometry: The (classical) scalar gradiometry problem consists of determining F = P|_∂B on the Earth’s surface from second radial derivatives of P on the actual satellite orbit. Known is thus the function

x ↦ (−x/|x| · ∇_x)(−x/|x| · ∇_x) P(x) = G(x),    x ∈ R³\B³_R,    (12)

for the measurement points x on the satellite orbit. Sought is the function F = P|_∂B on the Earth’s surface. The solution is obtained by inversion of the integral equation

(−x/|x| · ∇_x)(−x/|x| · ∇_x) 1/(4πR) ∫_{S²_R} (|x|² − R²)/|x − y|³ P(y) dS(y) = G(x),    (13)

and subsequently, stably, by “upward continuation” from P|_{S²_R} to P|_∂B via the Dirichlet problem of potential theory. The operator A of the scalar gradiometry problem is thus defined by

A[F](x) = (−x/|x| · ∇_x)(−x/|x| · ∇_x) 1/(4πR) ∫_{S²_R} (|x|² − R²)/|x − y|³ F(y) dS(y).    (14)

A is invertible with an unbounded inverse A⁻¹, so that, as in the case of gravimetry, regularization techniques come into play. Regularizations by means of the calculus of spherical harmonics in finite-dimensional spaces are standard in geodesy (see, e.g., [47, 48]). Alternative solution strategies for the scalar gradiometry problem, in particular multiscale regularizations, can be found, for example, in [15, 20].
Tensorielle Gradiometrie: Im Gegensatz zum klassischen skalaren Gradiometrieproblem, bei dem zur Berechnung ausschließlich skalare (radiale) Ableitungen verwendet werden, zieht man bei der Betrachtung des tensoriellen Gradiometrieproblems den gesamten Hessetensor zur Berechnung heran. Das tensorielle Gradiometrieproblem besteht somit aus der Ermittlung von F =P |∂B an der Erdoberfläche aus (allen) zweiten Ableitungen von P , also dem Hessetensor, auf dem tatsächlichen Satellitenorbit. Bekannt ist somit die Funktion

2 Inverse Probleme der Geodäsie


\[ x \mapsto (\nabla_x\otimes\nabla_x)P(x) = \mathbf{g}(x), \qquad x \in \mathbb{R}^3\setminus\mathbb{B}^3_R, \tag{15} \]

i.e., the tensor values g obtained by measurement at the points x on the actual satellite orbit. Sought is again the function F = P|_∂B on the Earth's surface. The integral formula of the tensorial gradiometry problem reads

\[ (\nabla_x\otimes\nabla_x)\,\frac{1}{4\pi R}\int_{\mathbb{S}^2_R}\frac{|x|^2-R^2}{|x-y|^3}\,P(y)\,dS(y) = \mathbf{g}(x). \tag{16} \]

The difficulty with this integral equation, too, is the stability of the inverse. Tensorial spherical harmonic, spline, and wavelet strategies for the solution by regularization can be found, e.g., in [20, 24] and in this handbook.

2.3 Exploration Gradiometry

The aim of exploration gradiometry is to determine the density distribution in the Earth's interior from measurements of the second derivatives of the gravitational potential in the exterior space (e.g., on the satellite orbit). The problem to be solved is thus characterized by a combination of the (classical) gravimetry problem described in Sect. 2.1 with the gradiometry problem in the exterior space from Sect. 2.2. In this section, F denotes the density distribution in B, and P is the potential in the exterior of the Earth generated by this density distribution. The measured values (derivatives of the potential) are denoted by G in the scalar case and by g in the tensorial case.

Scalar exploration gradiometry: The (scalar) measured values G at points x on the actual satellite orbit are in this case given by

\[ G(x) = \left(-\frac{x}{|x|}\cdot\nabla_x\right)\left(-\frac{x}{|x|}\cdot\nabla_x\right)P(x), \qquad x \in \mathbb{R}^3\setminus\mathbb{B}^3_R. \tag{17} \]

The basis for computing the solution is the integral equation

\[ G(x) = \int_B \underbrace{\left(-\frac{x}{|x|}\cdot\nabla_x\right)\left(-\frac{x}{|x|}\cdot\nabla_x\right)\frac{1}{4\pi|x-y|}}_{=K(x,y)}\,F(y)\,dV(y) \tag{18} \]

with the scalar kernel function K.

Tensorial exploration gradiometry: An analogous consideration, with tensorial measured values g at points x on the actual satellite orbit given by


W. Freeden und H. Nutz

\[ \mathbf{g}(x) = (\nabla_x\otimes\nabla_x)P(x), \qquad x \in \mathbb{R}^3\setminus\mathbb{B}^3_R, \tag{19} \]

leads to the integral equation

\[ \mathbf{g}(x) = \int_B \underbrace{(\nabla_x\otimes\nabla_x)\,\frac{1}{4\pi|x-y|}}_{=\mathbf{k}(x,y)}\,F(y)\,dV(y) \tag{20} \]

with the tensorial kernel function k.
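For reference, the tensorial kernel can be written out in closed form; differentiating the Newton kernel twice gives (a standard computation, not spelled out in the text):

```latex
\mathbf{k}(x,y) \;=\; (\nabla_x\otimes\nabla_x)\,\frac{1}{4\pi|x-y|}
\;=\; \frac{3\,(x-y)\otimes(x-y) \;-\; |x-y|^2\,\mathbf{i}}{4\pi\,|x-y|^{5}},
\qquad x \neq y,
```

where \(\mathbf{i}\) denotes the unit tensor. The trace of this expression vanishes, reflecting the harmonicity of the Newton kernel away from the singularity.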

2.4 Mathematical Exploration Methods

Throughout the geosciences, the integral equations resulting from the measurement and evaluation procedures vary greatly. The mathematical exploration methods are usually classified according to the underlying differential equations and the resulting fields as follows:

• potential methods (potential fields, elliptic differential equations) in gravimetry, geomagnetics, geoelastics, . . .
• diffusion methods (diffusion fields, parabolic differential equations) in magnetotellurics, geoelectromagnetics, . . .
• wave methods (wave fields, hyperbolic differential equations) in seismology and seismics, ground-penetrating radar, . . .

The geodetic exploration methods are, as a rule, potential methods. It should be mentioned, however, that the diversity of the mathematical exploration methods in geodesy will increase in the future with the technological development of measurement and computer technology. In addition, airborne and satellite-based methods are becoming increasingly usable for exploration, in particular in the near-surface regime. A combination of terrestrially based gravimetry with satellite methods, for example, is extremely promising for future exploration projects.

3 Strategies and Dilemmas

We now turn to the abstract formulation of the basic geodetic equations. In mathematical abstraction (cf., e.g., [18, 19, 40]), technical or scientific problems such as those listed in Sect. 2 are described by an operator equation of the form

\[ Ax = y, \qquad x \in X,\ y \in Y. \tag{21} \]


Here x denotes the cause from the "set" of causes X, and y the effect from the "set" of effects Y. As shown in the preceding Sect. 2, A is an integral operator linking cause and effect on the basis of the underlying (technical or scientific) problem. In geodesy, X and Y are usually Hilbert spaces, i.e., complete spaces equipped with a notion of distance and angle. In a direct problem, one derives the effect y ∈ Y from the measurable, and hence known, cause x ∈ X; that is, one computes it (directly) by means of the underlying operator equation. If the cause x is not directly measurable, or only in a very inadequate form, one must infer the cause x from the observed, measurable effect y. Problems of this kind are called inverse problems. Inverse problems are harder to solve than direct problems because they are, as a rule, "ill posed". An important insight is that this property is inherent in the problem itself and does not arise from deficient or erroneous mathematical modeling. Consequently, this difficulty cannot be overcome by improved modeling of the underlying technical or scientific problem; instead, one needs additional, preferably complementary, strategies and techniques to improve the solvability of inverse problems.

Simple examples explain the (limited) power of the basic equation (21) through the appearance of so-called "ghosts":

(1) Let x denote the mass density of a body and y (in the sense of Newtonian theory) the measured value (output) of the resulting gravitational potential. It is possible to redistribute masses within a body in such a way that neither the total mass nor the gravitational potential changes. This can be done in infinitely many ways. The total mass is therefore only meaningfully usable if, beyond the gravitational potential values, additional properties of the body are known. Indeed, ever since Stokes in the 19th century, numerous mathematicians and physicists have pointed out that the unique determination of the mass density x inside an object from measured values at the surface is not possible.

(2) Another example, which also occurs in medicine, concerns measurements of electric currents y, which can be measured very accurately as output on the surface of a body. As with the Earth's gravitational field, infinitely many different charge distributions x inside the body can be specified that produce the same measured values y.

We denote by x the "real world solution" of (21). Furthermore, let z be a solution of Az = 0, which thus produces no effect on the surface and hence does not influence the measurement of y. Then, in examples (1) and (2) above, the linearity of the operator A yields


\[ A(x+z) = \underbrace{Ax}_{=y} + \underbrace{Az}_{=0} = y. \tag{22} \]

In this equation, z is a so-called "ghost" in the measurement of y, i.e., a phantom or artifact. Such ghosts arise in measurement physics, especially when only finitely many measured values are available, and not only in the examples mentioned. The problem of "ghosts" leads us to the conclusion that additional information is required to determine the real solution x, and that the operator equation (21) alone does not suffice. The appearance of ghosts, i.e., mathematically, the non-uniqueness of the solution of (21), is only one of the difficulties of inverse problems. We now address the question of which further causes lead to inverse problems frequently being called "ill posed". J. Hadamard [26, 27] introduced the notion of so-called well-posed problems. A well-posed problem exhibits the following three properties:

• The equation Ax = y has a solution x ∈ X for every y ∈ Y (existence).
• This solution is uniquely determined (uniqueness).
• The inverse mapping \(A^{-1}: Y \to X\) is continuous (stability).

If one of these three conditions is violated, the problem is called ill posed ("ill-posed problem", IPP). Besides the appearance of ghosts (i.e., uniqueness), it is the third condition that proves particularly critical in practice. The reason is that measured data are never exact but subject to measurement noise. If the inverse mapping is not continuous, the solution x does not depend continuously on the data y. As a consequence, the smallest measurement errors can lead to drastic errors in the solution or, in the worst case, render the solution completely useless. We emphasize once more that in the case of ill-posed inverse problems we do not obtain poor results because of an incompletely or clumsily formulated model; rather, this difficulty is inherent in the respective model itself.
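The practical consequence of a violated stability condition can be illustrated with a discretized integral equation of the first kind (the smoothing kernel and all numerical values below are hypothetical, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Discretize a first-kind integral equation with a smooth (hence smoothing) kernel:
n = 20
t = np.linspace(0, 1, n)
A = np.exp(-(t[:, None] - t[None, :]) ** 2 / 0.1) / n   # Gaussian kernel matrix

x_true = np.sin(np.pi * t)            # the "real world" cause
y = A @ x_true                        # the exact effect

# Perturb the data by a tiny relative noise and invert naively:
y_noisy = y + 1e-10 * rng.standard_normal(n)
x_naive = np.linalg.solve(A, y_noisy)

print("condition number:        ", np.linalg.cond(A))
print("relative data error:     ", np.linalg.norm(y_noisy - y) / np.linalg.norm(y))
print("relative solution error: ", np.linalg.norm(x_naive - x_true) / np.linalg.norm(x_true))
```

The relative error in the naive solution exceeds the relative error in the data by many orders of magnitude, although the forward model is exact.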
As a consequence, the inverse problems of Newtonian geodesy, owing to their ill-posedness, always pose a genuine challenge. The geomathematical task of extracting geological information about the Earth's interior from, e.g., gravimetric or gradiometric data on the Earth's surface or in the exterior space always leads, irrespective of the target depth, to an inverse problem that is ill posed in the sense defined above, indeed one in which all three conditions (existence, uniqueness, and stability) are violated.

How, then, can we still extract usable information despite the ill-posedness of a problem (21)? A multitude of methods for treating ill-posed inverse problems has in fact been developed over the past decades. The basic idea of most solution methods (for the approximate solution) of IPPs is essentially to construct a "solution" that is acceptable as a physically meaningful approximation and is sufficiently stable from the numerical point of view. The fundamental problem in the modeling of ill-posed problems is the following relationship: the more accurately the mathematical model describes the IPP, the worse the "condition number" of the associated computational problem, i.e., the more susceptible the procedure is to errors. The particular difficulty with ill-posed problems is to incorporate additional information, compromises, or new perspectives into the existing solution in a helpful and goal-directed way, such that the accuracy of the solution is actually improved.

As already mentioned in Sect. 2 without further explanation, the various solution approaches for circumventing this lack of continuous dependence (together with the study of existence and uniqueness statements) are usually referred to as the "regularization of an ill-posed problem". Roughly speaking, this comprises the analysis of the IPP via the analysis of associated well-posed problems, i.e., a family (usually a sequence or a system) of well-posed problems that provide meaningful answers to the ill-posed problem. According to M. Z. Nashed [40], one of the pioneers of the theory of inverse problems, three aspects are relevant in regularization:

1. strategy of solution and reconstruction,
2. regularization-approximation schemes,
3. algorithms for regularization.

In this article we essentially pursue the goal of delineating the various methods and results from one another.

Ad 1: The strategy of solution and reconstruction of ill-posed problems involves at least one of the following intuitive ideas (for further details we refer to the current textbooks on the theory of inverse problems):

• adaptation of the requirement to the accuracy of a solution u, e.g.,
  – ε-accuracy of the solution: ‖Au − y‖ ≤ ε for prescribed ε,
  – quasi-solution: ‖Au − y‖ ≤ ‖Ax − y‖ for all x ∈ M, where M is a prescribed subset of the domain of A,
  – least-squares solution: ‖Ax − y‖_Y = inf{‖Az − y‖_Y : z ∈ X},
  – best-approximate solution x† = A†y with ‖x†‖_X = inf{‖z‖_X : z is a least-squares solution of Ax = y},
  – etc.
• modification of the operator equation or of the problem itself,
• specification of the type of noise and its incorporation into the proposed solution.

The philosophy of the solution leads to the use of algebraic solutions as opposed to solutions in function spaces, statistical as opposed to deterministic solutions, strong as opposed to weak noise, and so on.
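The least-squares and best-approximate solutions from the list above can be illustrated with the Moore–Penrose inverse; the small system below is a hypothetical example:

```python
import numpy as np

# Underdetermined system: infinitely many least-squares solutions exist.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
y = np.array([2.0, 2.0])

# Best-approximate solution x† = A†y: the least-squares solution of minimal norm.
x_dagger = np.linalg.pinv(A) @ y

# Any z in the null space of A yields another least-squares solution x† + z
# (a "ghost" in the sense above), but with a strictly larger norm:
z = np.array([1.0, -1.0, 1.0])
assert np.allclose(A @ z, 0)
other = x_dagger + z

print("x† =", x_dagger, " |x†| =", np.linalg.norm(x_dagger))
print("|x† + z| =", np.linalg.norm(other))
```

Since x† is orthogonal to the null space of A, every other least-squares solution has a larger norm, which is precisely the selection principle behind A†.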


Ad 2: Concerning regularization-approximation schemes, we refer to a number of methods employed for ill-posed inverse problems, such as Tikhonov regularization, projection methods, multiscale methods, iterative approximation, etc. We dispense with a detailed description of these procedures here; it can be found elsewhere in this handbook.

Ad 3: When implemented, these schemes lead to specific algorithms for regularization. Unfortunately, this entails fixing a suitable value of a certain parameter associated with the scheme (e.g., regularization parameter, mesh width, dimension of the subspace in a projection scheme, choice of the degree of the scale spaces, classification of the noise, etc.). This turns out to be a nontrivial problem, since there is, as a consequence, a trade-off between accuracy and numerical stability. Such a conflict does not, as a rule, occur in well-posed problems.

3.1 Regularization and Error Specification

The concept of regularizers plays a decisive role in obtaining an approximate solution of an IPP from noisy data. We will show that it is indeed a generic concept unifying several principles occurring in numerous regularization-approximation schemes (Tikhonov regularization, truncated or filtered singular value decomposition, iterative methods, projection methods, multiscale techniques, etc.).

Let X, Y be Hilbert spaces and A : X → Y a one-to-one mapping (not necessarily linear). A regularizer for the equation Ax = y is a one-parameter family of operators {R_γ : γ ∈ Γ}, where Γ is an index set of real numbers with 0 ∈ Γ̄ (closure of Γ), satisfying the following conditions:

(C1) For every γ ∈ Γ, R_γ is a continuous operator from Y to X.
(C2) For every x ∈ X, \(\lim_{\gamma\to 0}\|R_\gamma A x - x\|_X = 0\).

In mathematical practice, ill-posed problems are as a rule regularizable in the following sense: for each y from the range R(A) of A, let \(y^\varepsilon\) be known with "noise level" \(\|y^\varepsilon - y\|_Y \le \varepsilon\). We consider the operator equation perturbed by noisy data, compared with the original operator equation (21),

\[ Ax = y^\varepsilon \tag{23} \]

between the normed spaces X, Y. The existence of a regularizer enables us to compute an "approximate solution" \(x^\varepsilon\) with \(\|x^\varepsilon - A^{-1}y\|_X \to 0\) and \(\|Ax^\varepsilon - y\|_Y \to 0\). We explain this fact in more detail: let \(x^\varepsilon = R_\gamma y^\varepsilon\) be the "solution" computed from noisy data by means of the regularizer R_γ. Then the inequality

\[ \|x^\varepsilon - A^{-1}y\|_X \le \|R_\gamma y - A^{-1}y\|_X + \|R_\gamma y^\varepsilon - R_\gamma y\|_X \tag{24} \]

holds, where we assume that both norms on the right-hand side of (24) are known. The first term, \(\|R_\gamma y - A^{-1}y\|_X\), can be interpreted as the rate of convergence of \(R_\gamma A x\) to x, while the second term, \(\|R_\gamma y^\varepsilon - R_\gamma y\|_X\), provides information on the degree of continuity of the operator R_γ on Y. Furthermore, for every fixed γ, \(\|R_\gamma y^\varepsilon - R_\gamma y\|_X \to 0\) as ε → 0. For a given ε > 0 we choose γ = γ(ε) specifically so as to minimize the right-hand side of (24). Then \(x^\varepsilon = R_{\gamma(\varepsilon)} y^\varepsilon\) satisfies the required condition, because \(\|R_{\gamma(\varepsilon)} y^\varepsilon - A^{-1}y\|_X \to 0\) as ε → 0. We are now first interested in the case that A is linear. For simplicity we additionally assume that R_γ is linear as well. Then, because \(\|y^\varepsilon - y\|_Y \le \varepsilon\),

\[ \|x^\varepsilon - A^{-1}y\|_X \le \|R_\gamma y - A^{-1}y\|_X + M(\gamma)\,\varepsilon, \tag{25} \]

where the norm of the operator, \(\|R_\gamma\|\), can be bounded by a constant M(γ) depending on γ, i.e., \(\|R_\gamma\| \le M(\gamma)\). Note that the operators R_γ are not uniformly bounded, because \(A^{-1}\) is unbounded, so that M(γ) → ∞ as γ → 0.

The regularizer approach can easily be extended to least-squares problems. We call a problem as in Eq. (23), or equivalently A†, regularizable if there exists a one-parameter family of linear operators {R_γ : γ ∈ Γ} with R(R_γ) ⊂ M satisfying \(\lim_{\gamma\to 0}\|R_\gamma A x - x\|_X = 0\) for x ∈ M and such that R_γ is bounded for every γ > 0. Here, too, the family {R_γ} is not uniformly bounded in γ, because R(A) is not closed. The existence of a regularizer nevertheless guarantees a family of approximate solutions determined by means of an associated well-posed problem. For noisy data y, say \(\|y^\varepsilon - y\| \le \varepsilon\), the error satisfies \(\|R_\gamma y^\varepsilon - A^\dagger y\|_X \to \infty\) as γ → 0 (it blows up, so to speak). If one now demands that the error be minimized, it follows that

\[ \|R_\gamma y^\varepsilon - A^\dagger y\|_X \le \|R_\gamma y - A^\dagger y\|_X + \|R_\gamma (y^\varepsilon - y)\|_X \le \|R_\gamma y - A^\dagger y\|_X + C_\gamma\,\varepsilon, \tag{26} \]

where, in analogy to the considerations above, the constant C_γ depending on γ is to be chosen such that \(\|R_\gamma\| \le C_\gamma\). The first term (the so-called regularization error) tends to zero as γ → 0, whereas the second term (the so-called magnification of the noise error due to the ill-posedness) tends to ∞ (see also Fig. 1):


Fig. 1 Behavior of the regularization error, the noise error, and the total error, with the optimal regularization parameter γ marked

\[ \|A^\dagger y - R_\gamma y\|_X \longrightarrow 0, \qquad \|R_\gamma (y - y^\varepsilon)\|_X \le \|R_\gamma\|\,\varepsilon \longrightarrow \infty \qquad (\gamma \to 0), \tag{27} \]

\[ \|A^\dagger y - R_\gamma y\|_X \longrightarrow \infty, \qquad \|R_\gamma (y - y^\varepsilon)\|_X \le \|R_\gamma\|\,\varepsilon \longrightarrow 0 \qquad (\gamma \to \infty). \tag{28} \]

If we know an error estimate for the first term as well as a bound on the growth of C_γ, we can fix a suitable γ. Such an estimate can be derived for particular regularizers by incorporating additional information on the solution A†y, for instance smoothness, or robustness conditions on C_γ (e.g., an estimate C_γ ≤ d(γ) with a known function d tending to ∞ as γ → 0, and \(\|R_\gamma y - A^\dagger y\|_X \le b(\gamma)\) with b(γ) → 0 as γ → 0). In this case an optimal γ can be computed for which \(R_\gamma y^\varepsilon \to A^\dagger y\) as ε → 0 (cf. Fig. 1). Numerous concrete realizations of regularizers are given in the form of regularization and projection methods as well as iterative methods. In the case of an iterative scheme, with γ = 1/n, \(x_n = R_\gamma y\) is the n-th iterate of the process, which in the absence of noise in the data y converges to A†y (or \(A^{-1}y\)). In projection methods, likewise γ = 1/n, where n is the dimension of the approximating subspace. In finite differences, γ represents the mesh width h. The preceding considerations show how one can obtain stable approximate solutions from noisy data y while guaranteeing the estimate (26), frequently, however, without knowing whether the conventions adopted lead to closeness to reality.
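The interplay of the two terms in (26), i.e., the U-shaped total error sketched in Fig. 1, can be reproduced numerically with the Tikhonov regularizer \(R_\gamma = (A^*A + \gamma I)^{-1}A^*\). The diagonal model operator and the deterministic noise pattern below are illustrative assumptions, not taken from the text:

```python
import numpy as np

# Ill-conditioned diagonal operator: singular values decaying over many decades.
m = 12
sigma = 10.0 ** -np.arange(m)
A = np.diag(sigma)
x_true = np.ones(m)
eps = 1e-4
# Deterministic alternating "noise" of level eps (illustrative):
y_eps = A @ x_true + eps * (-1.0) ** np.arange(m) / np.sqrt(m)

def tikhonov(gamma):
    # R_gamma y = (A*A + gamma I)^(-1) A* y
    return np.linalg.solve(A.T @ A + gamma * np.eye(m), A.T @ y_eps)

gammas = 10.0 ** np.linspace(-12, 0, 49)
errors = [np.linalg.norm(tikhonov(g) - x_true) for g in gammas]
best = int(np.argmin(errors))
print("optimal gamma ~", gammas[best], " total error:", errors[best])
```

For very small γ the noise term dominates (left branch of Fig. 1), for large γ the regularization error dominates (right branch), and the minimum lies in between.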

3.2 Continuous and Discrete Regularization

The numerical analysis of ill-posed problems ultimately involves the solution of finite-dimensional problems in order to obtain numerical approximations. This frequently results in a two-stage regularization. One first "regularizes" the problem in function spaces, say, and subsequently applies numerical methods to approximate the solution of a well-posed problem (or a family of such problems). Alternatively, one "discretizes" or approximates the IPP by finite-dimensional problems and then removes the numerical instability of these problems by methods of numerical linear algebra suited to discrete ill-posed problems (see, e.g., the early works [10, 37, 53]). These two alternative routes are represented in the following diagram:

P --(c r)--> P_γ --(p)--> P_{γ,n}
P --(p)--> F_n --> P_n --(d r)--> P_{n,γ}

Here P is the given ill-posed problem and P_γ is a "regularized" version of P obtained by applying a certain regularization scheme in a certain function space. F_n is a problem in a finite-dimensional function space and P_n is a finite-dimensional algebraic problem, both obtained by approximation of P; furthermore, P_{n,γ} denotes a "regularization" of the (numerically unstable) problem P_n, and P_{γ,n} is a numerical approximation of the problem P_γ, which is numerically stable for γ not too small. In the diagram, "c r" stands for continuous regularization, "d r" for discrete regularization, and "p" denotes a generic "projection" (discretization, projection method, etc.). With the present state of knowledge about regularization methods, we can at this point only mention by name some procedures that provide concrete realizations of this scheme (for more detailed studies we refer to the specialized mathematical literature):


P → P_γ: Tikhonov regularization, regularizer operators in function spaces, quasi-reversibility methods, replacement of the IPP by a stable minimization problem depending on a parameter, iterative methods in function spaces;
P → F_n: truncated singular value decomposition (TSVD), truncated series expansions, moment discretization, projection methods;
P → P_n: finite difference methods with collocation, reduction of F_n to algebraic equations;
P_n → P_{n,γ}: decomposition methods or regularization for linear algebraic equations, TSVD for matrices;
P_γ → P_{γ,n}: numerous numerical methods for solving well-posed problems, e.g., discretization, projection methods, multiscale techniques, etc.

We now return to the general scheme and denote by x, x_γ, x_n, x_{γ,n}, x_{n,γ} the "solution" (in the classical or least-squares sense) of the problems P, P_γ, P_n, P_{γ,n}, P_{n,γ}, respectively, for unperturbed data, while \(x^\varepsilon, x^\varepsilon_\gamma\), etc. denote the corresponding solutions for noisy data (y being replaced by \(y^\varepsilon\) with \(\|y^\varepsilon - y\|_Y \le \varepsilon\) for some ε > 0). If we assume that the various regularization schemes are convergent in the absence of errors (e.g., x_γ → x as γ → 0, x_{n,γ} → x_n as γ → 0 for every fixed n, x_n → x as n → ∞, etc.), then, similarly to (25) and (26), estimates can be used to compute an "approximate solution" \(x^\varepsilon\) as before, for instance

\[ \|x^\varepsilon_{\gamma,n} - x\|_X \le \|x^\varepsilon_{\gamma,n} - x^\varepsilon_\gamma\|_X + \|x^\varepsilon_\gamma - x_\gamma\|_X + \|x_\gamma - x\|_X. \tag{29} \]

Here \(\|x^\varepsilon_{\gamma,n} - x^\varepsilon_\gamma\|_X\) represents an approximation error estimate, providing a convergence range of the approximation scheme for the well-posed problems P_γ at fixed γ; \(\|x^\varepsilon_\gamma - x_\gamma\|_X\) is an estimate of the error due to the noise, which can be specified if the robustness of P_γ is known; and \(\|x_\gamma - x\|_X\) is the regularization error. Likewise,

\[ \|x^\varepsilon_{n,\gamma} - x\|_X \le \|x^\varepsilon_{n,\gamma} - x^\gamma_n\|_X + \|x^\gamma_n - x_n\|_X + \|x_n - x\|_X. \tag{30} \]

Note that in this case \(\|x^\varepsilon_{n,\gamma} - x^\gamma_n\|_X\) represents an estimate of the regularization error of the problem P_n, and \(\|x^\varepsilon_n - x_n\|_X\) is the error due to the transmission of the noise into the discrete system, etc.
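The step P_n → P_{n,γ} via truncated singular value decomposition can be sketched numerically: the small singular values, which carry the instability, are simply discarded before inversion. The diagonal test problem below is an illustrative assumption, not an example from the text:

```python
import numpy as np

def tsvd_solve(A, y, k):
    """Truncated SVD solution: keep only the k largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    coeffs = (U.T @ y)[:k] / s[:k]
    return Vt[:k].T @ coeffs

# Ill-conditioned discrete system (illustrative):
n = 12
s_true = 10.0 ** -np.arange(n)
A = np.diag(s_true)
x_true = np.ones(n)
eps = 1e-6
y_eps = A @ x_true + eps * (-1.0) ** np.arange(n) / np.sqrt(n)

err_full = np.linalg.norm(tsvd_solve(A, y_eps, n) - x_true)    # no filtering
err_trunc = np.linalg.norm(tsvd_solve(A, y_eps, 5) - x_true)   # truncation level k = 5
print("full inversion error:", err_full, "  truncated (k=5):", err_trunc)
```

Truncation trades a (bounded) projection error for the removal of the explosive noise amplification in the smallest singular values; the truncation level k plays the role of the regularization parameter.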

3.3 Dilemmas and Methodologies

The dilemmas and methodologies of the mathematical and numerical analysis of IPPs comprise the following findings and observations:

• For most regularization and approximation schemes, in the noise-free case,

\[ \lim_{\gamma\to 0} x_\gamma = x \tag{31} \]

and

\[ \lim_{n\to\infty} x_n = x. \tag{32} \]

For some approximation schemes (projection schemes), convergence may fail even without noise.

• Neither the double limit

\[ \lim_{\substack{\gamma\to 0 \\ n\to\infty}} x^\varepsilon_{\gamma,n}, \tag{33} \]

nor the iterated limits (\(\lim_{n\to\infty}\lim_{\gamma\to 0}\) and \(\lim_{\gamma\to 0}\lim_{n\to\infty}\)) of \(x^\varepsilon_{\gamma,n}\) and \(x^\varepsilon_{n,\gamma}\) exist. Indeed, \(\|x^\varepsilon_{\gamma,n}\|_X\) and \(\|x^\varepsilon_{n,\gamma}\|_X\) blow up as n → ∞ and γ → 0. It should be emphasized in this context that this blow-up is intrinsic to all IPPs, independently of any regularization-approximation scheme. Thus, in a numerical solution of an IPP, the best one can do is to minimize the error \(\|x - x^\varepsilon_{\gamma,n}\|_X\) or \(\|x - x^\varepsilon_{n,\gamma}\|_X\), or to seek "paths" along which x_{γ,n} and x_{n,γ} converge to x as ε → 0.

• The diagram of alternative routes is in general non-commutative. It is not always clear which route along the diagram is the most effective. One has to compare the minimal errors of \(\|x - x^\varepsilon_{\gamma,n}\|_X\) and \(\|x - x^\varepsilon_{n,\gamma}\|_X\) and, in addition, weigh the computational complexity of the two paths. For some simple schemes the diagram is commutative (e.g., Tikhonov regularization and TSVD commute for compact operators).

• Precise solutions of an ill-posed problem (i.e., an optimal compromise between accuracy and numerical stability) depend on a "criterion for the optimal choice of the parameters" (for γ and n). Often there are (owing to the lack of convergence radii of sufficient accuracy in the preceding estimates) no analytical criteria for this choice, with the exception of Tikhonov regularization or similar methods resting on simple variational principles. The parameters are often determined by an interactive computational scheme based on rough analytical estimates.

3.4 Rough Classification of the Regularization Methods

From the standpoint of mathematical and numerical analysis, the "regularization methods" can be grouped into three categories.


• Regularization methods in function spaces. These include regularization of Tikhonov type, the method of quasi-reversibility, the use of certain function spaces such as the scale spaces of multiresolution, the method of the generalized inverse in reproducing kernel Hilbert spaces, as well as regularization within a multiscale technique with wavelets.
• Solution of ill-posed problems by "control of dimensionality". This includes projection methods, discretization, and moment discretization schemes. The success of these methods depends on the possibility of obtaining an approximate solution while the dimension of the finite-dimensional problem remains within the range of numerical stability.
• Iterative methods and filter methods. These methods can be applied either to problems in function spaces or to a discrete version. The decisive factor in iterative methods is to terminate the iteration before numerical instabilities set in. For this reason, iterative methods have to be modified or improved in order to guarantee the required accuracy at the time of the stopping rule. Filter methods are procedures in which, for instance, singular functions and singular values that produce strongly oscillating solutions are removed. Various low-pass filters can of course be applied.

The use of a priori bounds (or, more generally, a priori information) in the solution of an ill-posed problem has long been regarded as essential for improving the continuous dependence (i.e., for achieving a regularizing effect). At the beginning of the study of ill-posed problems, Tikhonov made the decisive observation (see [51, 52]) that restriction to a compact set guarantees well-posedness. More precisely, if X and Y are metric spaces, F : X → Y a continuous injection, and C ⊂ X compact, then \(F^{-1}: F(C) \to C\) is continuous.
The method of successive approximation, gradient methods, and related iterative methods can be used to seek an approximate solution of ill-posed problems. A convergence analysis of some iterative methods for integral and operator equations of the first kind, as well as for equations of the second kind at a characteristic value, can be found, e.g., in [28]. These methods include as a special case what is sometimes called Landweber iteration [50], which is in fact identical to the method of successive approximation applied to the normal equations, i.e., we have \(x_{n+1} = x_n - (A^*A x_n - A^*y)\) for the operator equation Ax = y. Whenever an iterative method is applied for the numerical approximation of a problem (whether well or ill posed), a stopping rule for terminating the procedure has to be devised. Round-off errors and the tolerance of the approximation error (i.e., what is regarded as "zero" from the standpoint of approximation) have to be taken into account. For ill-posed problems, the stopping rule is considerably more complicated because of the numerical instability caused by the ill-posedness. In the case that y is noisy, i.e., \(y^\varepsilon = y + u\) with ‖u‖ ≤ ε, the approximations

\[ x_{n+1} = x_n - \beta\,(A^*A x_n - A^*y - A^*u) \tag{34} \]

(where β is a parameter for which this iteration converges when u = 0) do not converge to A†y. The same holds for the method of steepest descent when γ is replaced by γ_n, the value at which \(\|Ax_{n+1} - y^\varepsilon\|\) is minimized. Naturally, the stopping criterion depends on the residual as an observable quantity. Choosing δ > 0 (stopping level), one obtains at the beginning of the iteration

\[ \|Ax_n - y^\varepsilon\|_Y > \delta, \quad n < m, \qquad \|Ax_m - y^\varepsilon\|_Y \le \delta. \tag{35} \]

We assume that the stopping level δ = δ(ε) of this iterative process satisfies the condition δ(ε) > ε with δ(ε) → 0. Then it is easy to show that

\[ \lim_{\varepsilon\to 0}\ \sup_{\|u\|\le\varepsilon}\ \|x_M - A^\dagger y\| = 0, \tag{36} \]

where x_M = x_M(δ, u, x_0) is the approximate solution obtained from (34) with stopping level δ. This shows that, under these assumptions, the iterative process can be applied as a regularization scheme. The result (36) means that the family {B_n}, where B_n is the operator (i.e., the iteration rule) mapping y to x_{n+1}, is a regularizer. In this case 1/n takes over the role of the parameter γ. Finally, let us also mention that in iterative methods the function x_n(δ(ε), u, x_0) can be used as an "optimal" approximate solution representing a compromise between accuracy and stability. Since x_n is a linear combination of a finite number of functions (introduced recursively in [31]), it follows that regularizing iterative schemes can be viewed within the framework of resolution by control of dimensionality. In this case, however, the functions are not given a priori; rather, they are generated recursively by repeated filtering of an initial approximation x_0. The emphasis is then on the filter operators via the use of the adjoint and on the choice of x_0. One approach to the solution of ill-posed problems is to seek an approximation of the solution (of the inverse problem or the identification problem) by a linear combination of a "small" number of judiciously selected functions. The problem then reduces to best approximation with "few" parameters. For certain ill-posed problems a clever choice of the functions can be made. The idea of regularization by discretization or projection methods is very popular in the numerical analysis of ill-posed inverse problems. The idea is (in principle) simpler than the sophisticated method of Tikhonov regularization. The application of such methods predates the introduction of regularization.
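A hedged sketch of the Landweber iteration with the residual stopping rule (35); the diagonal test problem, the step size, and the stopping level are illustrative assumptions, not values from the text:

```python
import numpy as np

def landweber(A, y_eps, beta, delta, max_iter=200000):
    """Landweber iteration x_{n+1} = x_n - beta * A^T (A x_n - y_eps),
    stopped by the discrepancy rule ||A x_n - y_eps|| <= delta."""
    x = np.zeros(A.shape[1])
    for n in range(max_iter):
        r = A @ x - y_eps
        if np.linalg.norm(r) <= delta:
            return x, n
        x = x - beta * (A.T @ r)
    return x, max_iter

# Mildly ill-conditioned test problem (illustrative):
s = 2.0 ** -np.arange(8)
A = np.diag(s)
x_true = np.ones(8)
eps = 1e-3
y_eps = A @ x_true + eps * (-1.0) ** np.arange(8) / np.sqrt(8)

beta = 1.0 / np.max(s) ** 2          # step size: beta < 2 / ||A||^2 ensures convergence
x_stop, n_stop = landweber(A, y_eps, beta, delta=2 * eps)
print("stopped after", n_stop, "iterations; error:", np.linalg.norm(x_stop - x_true))
```

Stopping as soon as the residual falls to the order of the noise level (here δ = 2ε > ε, in accordance with the assumption before (36)) prevents the later iterates from fitting the noise; the iteration index plays the role of 1/γ.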
Bei dieser Gelegenheit soll nicht unerwähnt bleiben, dass Maxwell unter den ersten Wissenschaftlern war, die ein inverses Problem formulierten und lösten: 1879, beim Versuch, die Kapazität einer quadratischen Metallplatte unter Verwendung von Integralgleichungen zu bestimmen. Sein numerischer Ansatz war im Grunde eine Kollokation. Der Aufbau von Antennensystemen und zahlreichen elektromagnetischen Apparaten war ebenfalls mit der Entwicklung von Projektionsmethoden verknüpft.

Ein Rahmenkonzept zur Reduktion inverser Probleme und von Identifikationsproblemen auf endlich-dimensionale Probleme existiert innerhalb der klassischen Projektionsmethoden. Wir wählen eine Familie von endlich-dimensionalen Unterräumen X_n ⊂ X, Y_n ⊂ Y′, die dem Problem Ax = y zugeordnet sind, wobei A : X → Y und Y′ der Dualraum von Y ist (also der Raum aller stetigen linearen Abbildungen von Y nach ℝ). Wir nehmen der Einfachheit halber an, dass R(A) dicht in Y liegt und y ∈ R(A) ist. Die Approximationen x_n ∈ X_n für x = A^{-1} y werden durch die Bedingung

\psi_n (A x_n - y) = 0 , \quad \psi_n \in Y_n ,   (37)

festgelegt. Bei Einsatz von verrauschten Daten, also Ersetzung von y durch y^ε, haben wir bei Verwendung der Projektionsmethode die folgende grundlegende Abschätzung:

\| x_n^\varepsilon - x \|_X \le (1 + \| P_n \|_X) \, d(x, X_n) + \| Q_n \|_X \, \varepsilon .   (38)

Hierbei sind die linearen Abbildungen P_n : X → X_n und Q_n : Y → X_n definiert durch ψ_n A P_n = ψ_n A und ψ_n A Q_n = ψ_n für alle ψ_n ∈ Y_n, und es ist d(x, X_n) = inf{‖x − u‖_X : u ∈ X_n}. Der erste Teil der rechten Seite von (38) ist der Projektions- (oder Diskretisierungs-)Fehler. Wenn ‖P_n‖ gleichmäßig beschränkt in n ist, d. h. wenn das Verfahren in der Sprache der Finite-Elemente-Projektion quasi-optimal ist, dann ist die Ordnung unabhängig von ψ_n und hängt nur davon ab, wie gut x in X_n approximiert werden kann. Der zweite Teil von (38) ist der Fehler aufgrund des Rauschens, verstärkt durch die Schlechtgestelltheit. Man kann zeigen, dass ‖Q_n‖ ≥ a(n) ist, wobei

a(n) = \sup \{ \| u \| : u \in X_n , \; \| A u \| = 1 \}   (39)

von X_n, aber nicht von ψ_n abhängt. Für schlecht gestellte Probleme gilt a(n) → ∞ für n → ∞, weil die Norm der inversen Matrix des aus (37) resultierenden algebraischen linearen Gleichungssystems nicht gleichmäßig beschränkt ist. Wenn ‖Q_n‖ für n → ∞ nicht schneller wächst als a(n), dann heißt das Projektionsschema robust. Wenn ein Projektionsschema sowohl quasi-optimal als auch robust ist, so gilt

\| x_n^\varepsilon - x \|_X \le C \{ d(x, X_n) + \varepsilon \, a(n) \} .   (40)

Diese Abschätzung kann durch eine geschicktere Wahl von ψ_n nicht verbessert werden. Weiterhin kann durch Minimierung der rechten Seite im Hinblick auf n ein Kompromiss erzielt werden. Zahlreiche konkrete Realisierungen innerhalb dieses Schemas von Projektionsmethoden werden z. B. in [46] erzielt; in [18, 19, 41] findet sich eine detaillierte Analyse verschiedener Projektionsmethoden.

Projektionsmethoden (unter Verwendung von Spline-, Wavelet- oder Finite-Elemente-Funktionen) können, wie im oben beschriebenen Fall, entweder direkt auf das IPP angewandt werden oder auf das regularisierte Problem, d. h. auf das Problem der Minimierung des Funktionals J(x) = ‖Ax − y‖² + γ ‖Lx‖², γ > 0. Verschiedene Konvergenzergebnisse werden schon im frühen Stadium der IPP-Theorie in [33, 46, 49] gezeigt. In der Literaturliste finden sich ebenso noch weitere nützliche Referenzen zu Projektions- und Finite-Elemente-Methoden für IPPs.

Die Momentendiskretisierungs- (oder Semi-Diskretisierungs-)Methode für Integralgleichungen der ersten Art und für IPPs in Hilberträumen mit Reprokernstruktur (siehe [38, 42, 43]) ist im Sinne einer Projektionsmethode sowohl quasi-optimal als auch robust. Sie ist besonders geeignet, wenn nur an endlich vielen Punkten Werte der Datenfunktion bekannt sind. Diese Eigenschaften sind, in Verbindung mit Konvergenz- und Kommutativitätseigenschaften, die in [39] gezeigt werden, verantwortlich für das unkomplizierte Verhalten dieser Methoden bei der numerischen Implementierung. Mittlerweile gibt es eine Vielzahl von Lehrbüchern über IPPs mit umfassenden und tiefgehenden Resultaten, z. B. [1, 5, 11–13, 25, 26, 30, 32, 35, 36, 44], um nur wenige zu nennen.
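Ein minimales, hypothetisches Zahlenbeispiel (Matrix und Daten frei gewählt, nicht dem Text entnommen) zur Regularisierung durch Dimensionskontrolle: Projektion auf X_n = span{v_1, …, v_n} der ersten n rechten Singulärvektoren (abgebrochene SVD). Für diese Wahl ist a(n) = sup{‖u‖ : u ∈ X_n, ‖Au‖ = 1} = 1/s_n, und der Gesamtfehler zerfällt wie in (40) in d(x, X_n) + ε a(n):

```python
import numpy as np

rng = np.random.default_rng(1)

m = 40
# Hilbert-Matrix: klassisches, extrem schlecht konditioniertes Beispiel.
A = np.array([[1.0 / (i + j + 1) for j in range(m)] for i in range(m)])
U, s, Vt = np.linalg.svd(A)

x_true = np.ones(m)
eps = 1e-8                                  # Datenfehler (Annahme)
y_eps = A @ x_true + eps * rng.standard_normal(m)

errors = []
for k in range(1, m + 1):
    # Projektionsloesung: x_k = sum_{i<=k} (u_i . y_eps)/s_i * v_i
    coeff = (U[:, :k].T @ y_eps) / s[:k]
    errors.append(np.linalg.norm(Vt[:k, :].T @ coeff - x_true))

n_best = int(np.argmin(errors)) + 1         # Kompromissdimension
```

Für kleine n dominiert der Approximationsfehler d(x, X_n), für große n der verstärkte Datenfehler ε/s_n; das Minimum liegt dazwischen und realisiert gerade den im Text beschriebenen Kompromiss.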

4 Zusammenfassung und Ausblick

Inverse Probleme liefern aufgrund ihrer mathematischen Besonderheit nie eine absolute Gewissheit bei der Lösung. Der Grund liegt im spezifischen Charakter inverser Probleme (wie etwa gravimetrischer und/oder gradiometrischer Potentialverfahren), die den Schluss von indirekt gemessener Wirkung auf die geodätisch/geologisch reale Ursache für ein Gebiet der Erde vollziehen müssen, das sich bei Eintritt in ein Projekt gänzlich der unmittelbaren Beobachtung entzieht. Dieser Umstand impliziert unabdingbar das „Auftreten methodisch bedingter Geister (ghosts)“ bei der Lösung eines inversen Problems, die den „Schrecken“ jeglicher Lösungsbemühungen darstellen. Als Idealzustand muss gelten, diese methodisch bedingten Geister eines inversen Verfahrens komplementär durch ein alternatives Verfahren sichtbar zu machen. Nur in vollständiger Komplementarität reduziert sich das Restrisiko in dem Maße, dass Falschaussagen und Fehlinterpretationen vermieden werden könnten.

Mehr noch, für die inverse Datenauswertung ist festzuhalten, dass die überkommenen mathematischen Verfahren in Zukunft weder in Theorie noch in Numerik in der Lage sein werden, die neuen Datenmengen – insbesondere unter dem wichtigen Aspekt einer vertieften räumlich-zeitlichen Behandlung – zu beherrschen. Gefordert ist vielmehr eine Aufbereitung der geowissenschaftlich relevanten Größen in konstituierende Bausteine, die durch drei wesentliche Merkmale charakterisiert ist: gute Approximationseigenschaft, geeignetes Dekorrelationsvermögen und schnelle sowie stabile Algorithmen. Diese Eigenschaften sind der Schlüssel für eine Vielzahl von Fertigkeiten, insbesondere Datenkompression und -transmission, Entrauschung (Denoising) sowie selektive Multiresolution beispielsweise geologischer Strukturen.

Alles in allem bleibt festzuhalten: Geomathematik ist und bleibt eine Schlüsseltechnologie, auch in der Behandlung inverser Probleme der Geodäsie. Auch wenn große Fortschritte in den letzten Dekaden gemacht wurden und für die Zukunft zu erwarten sind, so ist doch auf absehbare Zeit der mathematische Kreislauf (vgl. [14]), bestehend aus

• mathematischer Modellbildung,
• mathematischer Analyse der „Wohlgestelltheit“,
• Entwicklung und Ausführung eines mathematischen Lösungsverfahrens,
• Rückübertragung in die Anwendung,

unabdingbar und unverzichtbar sowie gegebenenfalls hinreichend oft iterativ anzuwenden.

Bemerkung. Ein ähnlich gelagerter Beitrag [22] findet sich bereits im „Handbuch Oberflächennahe Geothermie“ [4], der sich aber speziell den spezifischen Obliegenheiten der Geothermie widmet.

Literatur

1. Anger, G.: A characterization of the inverse gravimetric source problem through extremal measures. Rev. Geophys. Space Phys. 19, 299–306 (1981)
2. Augustin, M., Bauer, M., Blick, C., Eberle, S., Freeden, W., Gerhards, C., Ilyasov, M., Kahnt, R., Klug, M., Möhringer, S., Neu, T., Nutz, H., Ostermann, I., Punzi, A.: Modeling deep geothermal reservoirs: Recent advances and future perspectives. In: Freeden, W., Sonar, T., Nashed, Z. (Hrsg.) Handbook of Geomathematics, 2. Aufl., S. 1547–1629. Springer, Heidelberg (2015)
3. Augustin, M., Freeden, W., Nutz, H.: About the importance of the Runge-Walsh concept for gravitational field determination. In: Freeden, W., Nashed, M.Z. (Hrsg.) Handbook of Mathematical Geodesy. Geosystems Mathematics. Springer, Basel/New York/Heidelberg (2018)
4. Bauer, M., Freeden, W., Jacobi, H., Neu, T. (Hrsg.): Handbuch Tiefe Geothermie: Prospektion, Exploration, Realisierung, Nutzung. Springer, Berlin/Heidelberg (2014)
5. Baumeister, J.: Stable Solution of Inverse Problems. Vieweg, Braunschweig (1987)
6. Blakely, R.J.: Potential Theory in Gravity and Magnetic Application. Cambridge University Press, Cambridge (1996)
7. Blick, C.: Multiscale potential methods in geothermal research: Decorrelation reflected post-processing and locally based inversion. Ph.D.-Thesis, AG Geomathematik, University of Kaiserslautern (2015)
8. Blick, C., Freeden, W., Nutz, H.: Feature extraction of geological signatures by multiscale gravimetry. GEM Int. J. Geomath. 8(1), 57–83 (2017)
9. Blick, C., Freeden, W., Nutz, H.: Innovative Explorationsmethoden am Beispiel der Gravimetrie und Reflexionsseismik. In: Bauer, M., Freeden, W., Jacobi, H., Neu, T. (Hrsg.) Handbuch Oberflächennahe Geothermie. Springer Spektrum, Heidelberg (2018)
10. Björck, A., Elden, L.: Methods in numerical algebra for ill-posed problems. In: Ill-Posed Problems: Theory and Practice. Reidel, Dordrecht/Boston (1981)


11. Engl, H.: Integralgleichungen. Springer, Berlin (1997)
12. Engl, H.W., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Kluwer, Dordrecht (1996)
13. Engl, H., Louis, A.K., Rundell, W. (Hrsg.): Inverse Problems in Geophysical Applications. SIAM, Philadelphia (1997)
14. Freeden, W.: Geomathematics: Its role, its aim, and its potential. In: Freeden, W., Nashed, M.Z., Sonar, T. (Hrsg.) Handbook of Geomathematics, Bd. 1, 2. Aufl., S. 3–78. Springer, Heidelberg (2015)
15. Freeden, W.: Multiscale Modelling of Spaceborne Geodata. Teubner, Stuttgart (1999)
16. Freeden, W., Blick, C.: Signal decorrelation by means of multiscale methods. World Min. 65, 1–15 (2013)
17. Freeden, W., Gerhards, C.: Geomathematically Oriented Potential Theory. CRC Press/Taylor & Francis, Boca Raton (2013)
18. Freeden, W., Nashed, M.Z.: Operator-theoretic and regularization approaches to ill-posed problems. GEM Int. J. Geomath. (2017). https://doi.org/10.1007/s13137-017-0100-0
19. Freeden, W., Nashed, M.Z. (Hrsg.): Ill-posed problems: Operator methodologies of resolution and regularization approaches. In: Handbook of Mathematical Geodesy. Geosystems Mathematics. Springer, Basel/New York/Heidelberg (2018)
20. Freeden, W., Nutz, H.: Satellite gravity gradiometry as tensorial inverse problem. GEM Int. J. Geomath. 2, 177–218 (2011)
21. Freeden, W., Nutz, H.: Mathematik als Schlüsseltechnologie zum Verständnis des Systems „Tiefe Geothermie“. Jahresber. Deutsch. Math. Vereinigung (DMV) 117, 45–84 (2015)
22. Freeden, W., Nutz, H.: Mathematische Lösungspotentiale, Strategien und Dilemmata. In: Bauer, M., Freeden, W., Jacobi, H., Neu, T. (Hrsg.) Handbuch Oberflächennahe Geothermie, S. 257–279. Springer Spektrum, Heidelberg (2018)
23. Freeden, W., Nutz, H.: Geodetic observables and their mathematical treatment in multiscale framework. In: Freeden, W., Nashed, M.Z. (Hrsg.) Handbook of Mathematical Geodesy. Geosystems Mathematics. Springer, Basel/New York/Heidelberg (2018)
24. Freeden, W., Nutz, H., Schreiner, M.: Geomathematical advances in satellite gravity gradiometry. In: Freeden, W., Nashed, M.Z. (Hrsg.) Handbook of Mathematical Geodesy. Geosystems Mathematics. Springer, Basel/New York/Heidelberg (2018)
25. Groetsch, C.W.: The Theory of Tikhonov Regularization for Fredholm Equations of the First Kind. Pitman, Boston (1984)
26. Hadamard, J.: Sur les problèmes aux dérivées partielles et leur signification physique. Princeton Univ. Bull. 13, 49–52 (1902)
27. Hadamard, J.: Lectures on the Cauchy Problem in Linear Partial Differential Equations. Yale University Press, New Haven (1923)
28. Kammerer, W.J., Nashed, M.Z.: Iterative methods for best approximate solutions of linear integral equations of the first and second kind. J. Math. Anal. Appl. 40, 547–573 (1972)
29. Kellogg, O.D.: Foundations of Potential Theory. Frederick Ungar Publishing Company, New York (1929)
30. Kirsch, A.: An Introduction to the Mathematical Theory of Inverse Problems. Springer, Heidelberg (1996)
31. Liusternik, L.A., Sobolev, V.J.: Elements of Functional Analysis. Ungar, New York (1961)
32. Louis, A.K.: Inverse und schlecht gestellte Probleme. Teubner, Stuttgart (1989)
33. Marti, J.T.: On the convergence of an algorithm computing minimum-norm solutions of ill-posed problems. Math. Comput. 34, 521–527 (1980)
34. Möhringer, S.: Decorrelation of gravimetric data. Ph.D.-Thesis, AG Geomathematik, TU Kaiserslautern (2014)
35. Morozov, V.A.: Methods for Solving Incorrectly Posed Problems. Springer, New York (1984)
36. Nashed, M.Z.: Generalized inverse, normal solvability and iteration for singular operator equations. In: Rall, L.B. (Hrsg.) Nonlinear Functional Analysis and Applications, S. 311–359. Academic, New York (1971)
37. Nashed, M.Z. (Hrsg.): Generalized Inverses and Applications. Academic, New York (1976)


38. Nashed, M.Z.: Aspects of generalized inverses in analysis and regularization. In: Generalized Inverses and Applications, S. 193–244. Academic Press, New York (1976)
39. Nashed, M.Z.: On moment-discretization and least-squares solutions of linear integral equations of the first kind. J. Math. Anal. Appl. 53, 359–366 (1976)
40. Nashed, M.Z.: Operator-theoretic and computational approaches to ill-posed problems with application to antenna theory. IEEE Trans. Antennas Propag. 29, 220–231 (1981)
41. Nashed, M.Z.: Analysis and implementation of projection-regularization methods for ill-posed problems. In: Ill-Posed Problems: Theory and Practice. Reidel, Dordrecht/Boston (1981)
42. Nashed, M.Z., Wahba, G.: Convergence rate of approximate solutions to linear operator equations of the first kind. Math. Comput. 28, 69–80 (1974)
43. Nashed, M.Z., Wahba, G.: Regularization and approximation of linear operator equations in reproducing kernel spaces. Bull. Am. Math. Soc. 80, 1213–1218 (1974)
44. Nashed, M.Z., Scherzer, O.: Inverse Problems, Image Analysis and Medical Imaging (Contemporary Mathematics), Bd. 313. American Mathematical Society, Providence (2002)
45. Nettleton, L.L.: Gravity and Magnetics in Oil Prospecting. McGraw-Hill Book Company, Inc., New York (1976)
46. Natterer, F.: The finite element method for ill-posed problems. RAIRO Anal. Numer. 11, 271–278 (1977)
47. Rummel, R.: Spherical spectral properties of the Earth's gravitational potential and its first and second derivatives. In: Rummel, R., Sansò, F. (Hrsg.) Lecture Notes in Earth Science, Bd. 65, S. 359–404. Springer, Berlin (1997)
48. Rummel, R., van Gelderen, M., Koop, R., Schrama, E., Sansó, F., Brovelli, M., Migliaccio, F., Sacerdote, F.: Spherical harmonic analysis of satellite gradiometry. New Series, Bd. 39. Netherlands Geodetic Commission, Delft (1993)
49. Song, M.: Regularization-projection methods and finite element approximations for ill-posed linear operator equations. PhD thesis, University of Michigan (1978)
50. Strand, O.N.: Theory and methods related to the singular function expansion and Landweber's iteration for integral equations of the first kind. SIAM J. Numer. Anal. 11, 798–825 (1974)
51. Tikhonov, A.N.: On the stability of inverse problems. Dokl. Akad. Nauk SSSR 39, 195–198 (1943)
52. Tikhonov, A.N.: On the solution of incorrectly formulated problems and the regularization method. Dokl. Akad. Nauk SSSR 151, 501–504 (1963)
53. Varah, J.: On the numerical solution of ill-conditioned linear systems with applications to ill-posed problems. SIAM J. Numer. Anal. 10, 257–267 (1973)
54. Zidarov, D.P.: Inverse Gravimetric Problem in Geoprospecting and Geodesy. Developments in Solid Earth Geophysics, Bd. 19. Elsevier, Amsterdam/Oxford/New York (1990)

3 Up and Down Through the Gravity Field

F. Sansó, M. Capponi, and D. Sampietro

Contents
1 Gravity Field: Why and How . . . . . . 93
2 Principles of Upward Continuation . . . . . . 95
3 Geodetic Boundary Value Problems (GBVP's) . . . . . . 101
4 Global Models as Approximate Solutions of the GBVP . . . . . . 108
5 Principles of Downward Continuation . . . . . . 114
6 The Constant Density Layer with Unknown Geometry . . . . . . 123
7 Some Conclusions . . . . . . 135
Literature . . . . . . 143

This chapter is part of the series Handbuch der Geodäsie, volume "Mathematische Geodäsie/Mathematical Geodesy", edited by Willi Freeden, Kaiserslautern.

F. Sansó, Department of Civil and Environmental Engineering, Politecnico di Milano, Milan, Italy. E-Mail: [email protected]
M. Capponi, Department of Civil, Constructional and Environmental Engineering, Università di Roma La Sapienza, Rome, Italy; Department of Civil and Environmental Engineering, Politecnico di Milano, Milan, Italy. E-Mail: [email protected]
D. Sampietro, Geomatics Research & Development s.r.l., Como, Italy. E-Mail: [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020
W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_93

Abstract
The knowledge of the gravity field has widespread applications in geosciences, in particular in Geodesy and Geophysics. The point of view of the paper is to describe the properties of the propagation of the potential, or of its relevant functionals, while moving upward or downward. The upward propagation is always a properly posed problem, in fact a smoothing, and somehow related to the Newton integral and to the solution of boundary value problems (BVP). The downward propagation is always improperly posed, not only due to its intrinsic numerical instability but also because of the nonuniqueness that is created as soon as we penetrate layers of unknown mass density. So the paper focuses on recent results on the Geodetic Boundary Value Problems on the one side and on the inverse gravimetric problem on the other, trying to highlight the significance of mathematical theory for numerical applications. Hence, on the one hand we examine the application of the BVP theory to the construction of global gravity models; on the other hand the inverse gravimetric problem is studied for layers, together with proper regularization techniques. The level of the mathematics employed in the paper is deliberately kept at a medium level, often resorting to spherical examples in support of the theory. Most of the material is already present in the literature, but for a few parts concerning global models and the inverse gravimetric problem for layers.

Zusammenfassung
Die Kenntnis des Schwerefeldes hat weitreichende Anwendungen in den Geowissenschaften, insbesondere in Geodäsie und Geophysik. Unser Anliegen in diesem Beitrag ist die Beschreibung von Eigenschaften zur Fortpflanzung des Potentials oder seiner relevanten Funktionale nach oben und nach unten. Die Fortpflanzung des Potentials nach oben („upward continuation“) ist stets ein wohlgestelltes Problem. „Downward Continuation“ ist stets ein schlechtgestelltes Problem, nicht nur wegen der numerischen Instabilitäten, sondern vor allem wegen der Nichteindeutigkeit der Bestimmung von Massenschichtung aus Potentialwerten. Als Konsequenz fokussiert sich der Beitrag zum einen auf neuere Resultate aus dem Bereich geodätischer Randwertprobleme und zum anderen auf das inverse Gravimetrieproblem. Dabei machen wir den Versuch, die Bedeutung von mathematischer Theorie für numerische Anwendungen herauszustreichen. Das Paper ist vom mathematischen Anspruch her schlicht gehalten. Dazu bedienen wir uns oftmals der Rückführung auf sphärische Beispiele. Der größte Teil des Materials ist bereits in der Literatur vorhanden, bis auf Teile zu den globalen Modellen und zum inversen Gravimetrieproblem für Schichtungen.

Keywords Upward continuation · Downward continuation · Geodetic boundary value problem · Inverse gravimetric problem

1 Gravity Field: Why and How

The knowledge of the gravity field of the Earth is a scientific achievement of fundamental interest for almost all Earth sciences, but with distinct applications according to whether we need it on the Earth surface and in space above it, or below the Earth surface.

To the first category belong all geodetic applications, primarily the knowledge of the geoid and of various derivatives of the potential, which are essential in modern surveying, cartography as well as oceanography and in navigation, especially in space close to the Earth, where satellites and shuttles do their job.

To the second category belong all solid Earth geophysical applications at different scales and depths, from the global ones, where different geological models of the Earth system can find a confirmation or a disproof, to the local ones, where the presence of a short wavelength gravity anomaly can reflect the existence of a shallow density anomaly, corresponding to valuable natural resources.

For historical reasons, the manipulation of the gravity field in the geodetic realm has been considered as an (almost) well posed problem, dominated as it is by the work of transforming gravity anomalies into anomalous potential, or geoid undulations. Theoretically this is the domain of the Geodetic Boundary Value Problem (GBVP) (or better "Problems"), which has received a significant mathematical assessment in the very last decades (see [27] for a review) after the fundamental impulse imparted by Hörmander's publication [8]. Essentially we know that if we want to determine the gravity potential, from the surface of the Earth, S, upward, from some kind of gravity anomalies given all over S, this can be done under fairly general conditions, providing unique, stable solutions.
Indeed things become different when we pose the problem of determining the gravity field on S from spatial observations, like those collected in recent years by the three gravity satellite missions CHAMP, GRACE, and GOCE [5], and we will explain why, shortly. On the contrary the use of the gravity field in the realm of solid Earth geophysics, namely inferring the mass distribution from the knowledge of the gravity field on S (or outside it), belongs to the domain of improperly posed problems. In fact its solution is generally not unique and its determination leads systematically to numerical instabilities, even when the class of acceptable geological models is so restricted as to guarantee uniqueness, as happens with the two constant density layers problem.

As a side remark on the above coarse and imprecise classification, let us notice that it is based on the concept: "surface of the Earth". This is also a rather fishy object requiring a good deal of thought to be defined in an acceptable form (see [25]). Here we report the definition given in [28]: the surface of the Earth "is any surface S on which we can assume (after a limit process) to know both the value of the gravity potential W and the modulus of its gradient g to a predefined degree of accuracy"; in the present context one should add "solid (or liquid) masses above S should have a known density and a thickness below 100 m, such that their effects can be corrected for, at the above mentioned level of accuracy". The limit of 100 m is not sharp, but not arbitrary either; it is related to the so-called linearization band, of which one can read in [28].

Even with the above specifications one has to realize that it is not possible to avoid, in a layer close to S, a certain overlapping of the geodetic and geophysical use of gravity. This is the topographic layer (roughly between S and the geoid), where we assume to know the mass density too, though with unavoidable errors that, due to the thickness of such a layer, can be significantly reflected into our knowledge of the gravity field. This is not a big problem for geodetic applications, since we can remove and then restore the effects of sources below S, as far as linearization procedures are permitted. On the contrary, errors in the density of the topographic layer can distort the subsequent geophysical gravity inversion. This is why terrain corrections, beyond numerical aspects, are still an object of investigation.

Apart from this remark, let us now introduce the usual notation (B, S, Ω) to represent the body of the Earth B, which we assume to be simply connected and bounded, its surface S, that we shall assume star-shaped and Lipschitz, namely to be represented in spherical coordinates by the equation

r = R(\sigma) \equiv R(\lambda, \varphi)   (1)

with (λ, φ) spherical angular coordinates and R(λ, φ) a function defined on the unit sphere with an essentially bounded gradient,

| \nabla_\sigma R(\sigma) | \le C , \quad \sigma \; \text{a.e.,}   (2)

and finally Ω is the space exterior to S (see Fig. 1).

Fig. 1 Notation for the geometry of the analyzed scenario

Indeed to determine the gravity field in Ω is the geodetic problem, and to determine it, or (almost) equivalently the density field ρ, in B, is the geophysical problem. Let us note en passant that, due to known properties of the density ρ (it has to be measurable, positive and bounded) and to the assumption on S, it is also known that the gravity potential has to be continuous together with its first derivatives across S. However what we want primarily to reason on is that problems are well posed or not depending essentially on two other factors: whether we move upward or downward from the area where we have data to the area where we want to know the gravity field; and whether along this path we meet or not sources of the gravity field, namely ρ ≠ 0.


We will reason under a linearized, spherical approximation hypothesis, although our conclusions are valid under much more general conditions. So we assume that we can describe the anomalous gravity by means of the anomalous gravity potential u(P), that u is generated by an anomalous mass density distribution that we will call again ρ, obtained by subtracting from the actual density some density model compatible with the normal potential (see [18]). So the relation between the source anomaly ρ and the anomalous potential u is just Newton's integral:

u(P) = G \int_B \frac{\rho(Q)}{l_{PQ}} \, dB_Q .   (3)

Likewise we shall assume that the gravity "anomalies" available are just the gravity disturbances δg, which in spherical approximation are related to u by (see [27])

\delta g = - \frac{\partial u}{\partial r} .   (4)

2 Principles of Upward Continuation

In this section we will recall the relations that bind a mass distribution in a thin layer to the gravity field, in terms of its potential u, generated on the layer itself, and then the propagation of u from the level of the layer upward. In other words we shall study the operators that make us move up along the chain (Fig. 2). To keep things easy and evident we shall consider spherical layers or spherical surfaces, to make our description self-evident, although our conclusions are valid under general conditions. So in this context up and down have to be understood in the radial direction.

The advantage of the use of a spherical geometry is the possibility of exploiting the spherical harmonic representation of functions in L²(S_r),

f(r, \sigma) = \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} f_{nm}(r) \, Y_{nm}(\sigma) ;   (5)

in particular when f is known to be harmonic in a spherical domain, B_R ≡ {r ≤ R}, we recall that Eq. 5 has the form (see [27], Cap. 3, 4)

Fig. 2 Operators to move up along the chain


r \le R , \quad u(r, \sigma) = \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} u_{nm} \left( \frac{r}{R} \right)^{n} Y_{nm}(\sigma) .   (6)

The functions

S^{i}_{nm}(r, \sigma) = \left( \frac{r}{R} \right)^{n} Y_{nm}(\sigma)   (7)

are called internal solid spherical harmonics. Likewise a function harmonic in Ω_R ≡ {r ≥ R} can be expanded into the form

r \ge R , \quad u(r, \sigma) = \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} u_{nm} \left( \frac{R}{r} \right)^{n+1} Y_{nm}(\sigma) ,   (8)

and the functions

S^{e}_{nm}(r, \sigma) = \left( \frac{R}{r} \right)^{n+1} Y_{nm}(\sigma)   (9)

are called external solid spherical harmonics. The functions Y_{nm}(σ) are as usual surface spherical harmonics of degree n and order m ([27], Cap. 13.3). Incidentally we note that

S^{i}_{nm}(R, \sigma) \equiv S^{e}_{nm}(R, \sigma) \equiv Y_{nm}(\sigma) ,   (10)

which is, by a suitable choice of the normalization constants, an orthonormal complete basis of L²(σ) ([27], Cap. 3.4), namely the space of f(σ) with bounded norm:

\| f \|_{L^2(\sigma)} = \left( \frac{1}{4\pi} \int f^2(\sigma) \, d\sigma \right)^{1/2} .   (11)

To continue it is convenient to introduce a few definitions:

Definition 1. We say that an operator S is an (isotropic) smoother in L²(σ) if, for f ∈ L²(σ),

S f = \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} S_n \, f_{nm} \, Y_{nm}(\sigma)   (12)

and

S_n > 0 , \quad \lim_{n \to \infty} S_n = 0 .   (13)


Here isotropic is related to the fact that the weights S_n depend only on the degree n.

Definition 2. We say that S has power strength N if

n^N S_n \le C .   (14)

Definition 3. We say that S has exponential strength if there exists q > 1 such that

q^n S_n \le C .   (15)
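As a small numerical illustration of Definitions 1–3 (the two multiplier families below are chosen for illustration only; they do not come from the chapter), consider S_n = (1 + n)^{-2}, which has power strength N = 2, and S_n = 2^{-n}, which has exponential strength with q = 2:

```python
import numpy as np

n = np.arange(0, 61)

S_pow = (1.0 + n) ** (-2.0)  # power strength N = 2: n^N * S_n stays bounded, cf. (14)
S_exp = 2.0 ** (-n)          # exponential strength, q = 2: q^n * S_n stays bounded, cf. (15)

# Both families are positive and tend to zero, cf. (13), so each defines a
# compact smoother; the exponential family damps high degrees far more strongly:
ratio_at_60 = S_exp[-1] / S_pow[-1]
```

Applied degree-wise to the coefficients f_nm as in (12), the first family gains two orders of differentiability, while the second produces a C^∞ output; the ratio above quantifies how much more aggressively the exponential smoother attenuates degree 60.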

We notice that, due to Eq. 13, the operator S is always bounded in L²(σ) and even compact (see [32], X.1). In addition if f ∈ L²(σ) and S has power strength N, then all derivatives of S f up to order N,

\partial^\alpha f = \partial_1^{\alpha_1} \partial_2^{\alpha_2} \partial_3^{\alpha_3} f , \quad | \alpha | = \alpha_1 + \alpha_2 + \alpha_3 \le N ,   (16)

are also in L²(σ), and we say that S f ∈ H^{N,2}(σ). Here and in the sequel H^{n,2} on any set A denotes the usual Sobolev Hilbert space of degree n and power 2 (see [14]). If S has exponential strength, then

S f \in C^\infty(\sigma) ,   (17)

i.e., equivalently, the derivatives of S f of any order are in L²(σ) and in fact also continuous. It is worth noticing that due to the first condition in Eq. 13 the operator S is always invertible, as

S f = 0 \;\Rightarrow\; S_n f_{nm} = 0 \;\Rightarrow\; f_{nm} = 0 \;\Rightarrow\; f = 0 ;   (18)

indeed the inverse S^{-1} is always an unbounded operator in L²(σ).

Now coming to the question posed at the beginning of the section, we only need to write the Newton integral for a thin layer of width Δ around a sphere of radius R_0 and linearize the formula with respect to Δ:

u(r, \sigma) = G \int d\sigma' \int_{R_0 - \Delta/2}^{R_0 + \Delta/2} \frac{\rho(\sigma')}{l} \, r'^2 \, dr' \cong G \, R_0^2 \, \Delta \int d\sigma' \, \frac{\rho(\sigma')}{l}   (19)

with l = \left( r^2 + R_0^2 - 2 \, r \, R_0 \cos \psi_{\sigma\sigma'} \right)^{1/2} and ψ_{σσ'} the spherical angle between the two directions σ, σ'. It is worth noting that u is essentially the potential of a single layer deposited on S_{R_0} with surface density ω(σ') = Δ ρ(σ'). Then taking for instance r > R_0 and using the well known representation (see [18], Cap. 1.15)

\frac{1}{l} = \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} \frac{1}{2n+1} \, \frac{R_0^n}{r^{n+1}} \, Y_{nm}(\sigma) \, Y_{nm}(\sigma')   (20)

one gets

r \ge R_0 , \quad u(r, \sigma) = 4 \pi G R_0 \Delta \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} \frac{\rho_{nm}}{2n+1} \left( \frac{R_0}{r} \right)^{n+1} Y_{nm}(\sigma) ,   (21)

where

\rho_{nm} = \frac{1}{4\pi} \int d\sigma' \, \rho(\sigma') \, Y_{nm}(\sigma') .   (22)

Letting r → R_0, we get the spectral relation

u_{nm}(R_0) = 4 \pi G R_0 \Delta \, \frac{\rho_{nm}}{2n+1} ,   (23)

where u_{nm}(R_0) are the harmonic coefficients of u(R_0, σ). As we see the Newton operator N that transforms ρ into u at the same level R_0 is a smoother with power strength N = 1, i.e., it transforms a ρ ∈ L²(σ) into a potential u ∈ H^{1,2}(σ). This is confirmed by the well known formula:

\delta g |_{S_{R_0}} = - u' |_{S_{R_0}} = 4 \pi G \Delta \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} \frac{n+1}{2n+1} \, \rho_{nm} \, Y_{nm}(\sigma)   (24)

or, in spectral form,

\delta g_{nm}(R_0) = 4 \pi G \Delta \, \frac{n+1}{2n+1} \, \rho_{nm} ,   (25)

showing that if ρ ∈ L²(σ), then δg = −u′ ∈ L²(σ) too. Let us note that Eq. 24 gives the limit from outside (r → R_0^+) of the gravity anomaly generated by the single layer (Eq. 19); as we know such a function has a sharp jump across S_{R_0}, equal to −4πG ω(σ) = −4πG Δ ρ(σ) (see [18], Cap. 1, 3).

Among other things Eq. 25 means that, given the gravity anomaly δg on the upper surface of the thin layer, one can retrieve at once the body density ρ, apparently without numerical instabilities; namely the operator −(∂/∂r) N(·)|_{S_{R_0}} is not a smoother and its inverse is bounded.

Now to conclude our analysis we have to propagate the potential u from the level R_0 to the level R (> R_0), because we know that it is only on the surface S_R that we can get information on u. Since for r > R_0, u has the form of Eq. 21, by using Eq. 23 we find

$$u(R,\sigma) = \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} u_{nm}(R_0) \left( \frac{R_0}{R} \right)^{n+1} Y_{nm}(\sigma) \qquad (26)$$

or

$$u_{nm}(R) = u_{nm}(R_0) \left( \frac{R_0}{R} \right)^{n+1}. \qquad (27)$$

As we see, the upward continuation operator $U$ is a smoother with exponential strength. An analogous computation for $\delta g$ leads to

$$\delta g(R,\sigma) = \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} \frac{n+1}{R_0} \left( \frac{R_0}{R} \right)^{n+2} u_{nm}(R_0)\, Y_{nm}(\sigma) \qquad (28)$$

or

$$\delta g_{nm}(R) = \frac{n+1}{R_0} \left( \frac{R_0}{R} \right)^{n+2} u_{nm}(R_0) = 4\pi G \Delta\, \frac{n+1}{2n+1} \left( \frac{R_0}{R} \right)^{n+2} \rho_{nm}. \qquad (29)$$

In this case the density coefficients $\rho_{nm}(R_0)$ (or the single layer coefficients $\omega_{nm} = \rho_{nm}\, \Delta$) can still be uniquely retrieved from $\delta g_{nm}(R)$, but the numerical instability, due to the exponential factor $\left( \frac{R}{R_0} \right)^{n+2}$ present in $D = U^{-1}$, is unbeatable, unless we use a regularization method, e.g., of the Tikhonov type (see for instance the recent paper [7]).

Remark 1. We can observe that, combining Eqs. 23 and 27, one can write, for any $R'$, $R_0 < R' < R$,

$$u_{nm}(R) = 4\pi G R_0 \Delta\, \frac{\rho_{nm}}{2n+1} \left( \frac{R_0}{R} \right)^{n+1} = u_{nm}(R_0) \left( \frac{R_0}{R} \right)^{n+1} = u_{nm}(R') \left( \frac{R'}{R} \right)^{n+1} \qquad (30)$$

This is the spectral representation of a property of the family of upward continuation operators. If we denote by $U(R, R_0)$ the upward continuation of the harmonic function $u$ from the level $r = R_0$ to the level $r = R$, Eq. 30 can be written as, for $R_0 < R' < R$,

$$U(R, R_0) = U(R, R')\, U(R', R_0) \qquad (31)$$

highlighting the semi-group property of this family. But this means also that, once we know the potential $u(R',\sigma)$, we don't need anything else to propagate it to higher levels; in particular, the relation between the density $\rho(R_0,\sigma)$ at level $R_0$ and the field $u(r,\sigma)$, for any $r > R'$, is broken. In fact the same $u(r,\sigma)$ in $\{r > R'\}$ could be generated by a single layer, at the same level $R'$, with density $\omega'(\sigma)$ such that

$$\omega'_{nm} = \omega_{nm} \left( \frac{R_0}{R'} \right)^{n+1} = \Delta\, \rho_{nm} \left( \frac{R_0}{R'} \right)^{n+1}$$

which is nothing else but the famous Poincaré sweeping out (see [33]). Notice that by suitably tuning the surface density $\omega'(\sigma)$ we could have substituted the layer on $S_{R_0}$ with any other layer on $S_{R''}$ with $R_0 < R'' < R'$. An analogous reasoning holds for $\delta g(r,\sigma)$ too, with the spectral upward continuation factor $\left( \frac{R_0}{R} \right)^{n+2}$. It is time now to add the effects of many thin layers, going to the continuous limit, which is obtained from Eqs. 21, 22, and 23 by assuming that $\rho$ is a function varying with $r$, $\rho(r,\sigma)$, so that also the $\rho_{nm}$ turn out to be functions of $r$, $\rho_{nm}(r)$, and taking $\Delta = dr$; we get then

$$r \ge R: \quad u(r,\sigma) = 4\pi G \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} \frac{Y_{nm}(\sigma)}{2n+1} \int_0^R \left( \frac{r'}{r} \right)^{n+1} \rho_{nm}(r')\, r'\, dr' =$$
$$= 4\pi G \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} \left( \frac{R}{r} \right)^{n+1} \frac{Y_{nm}(\sigma)}{2n+1} \int_0^R \left( \frac{r'}{R} \right)^{n+1} \rho_{nm}(r')\, r'\, dr' \qquad (32)$$

where

$$\rho_{nm}(r) = \frac{1}{4\pi} \int \rho(r,\sigma')\, Y_{nm}(\sigma')\, d\sigma'$$

and its spectral counterpart, in terms of potential, is

$$u_{nm}(r) = \frac{4\pi G}{2n+1} \int_0^R \left( \frac{r'}{r} \right)^{n+1} \rho_{nm}(r')\, r'\, dr'. \qquad (33)$$

Notice that Eq. 32 is valid for $r \ge R$, since if we wanted $u(r,\sigma)$ in $\{r < R\}$ we should have used the development of $\frac{1}{l}$ in terms of internal spherical harmonics in the region $\{r \le r' \le R\}$. In any event we underline that, thanks to the previous Remark 1, for any $R_0$, $0 \le R_0 \le R$, we can split the effect of the mass density into two parts: one inside $B_{R_0} \equiv \{r \le R_0\}$, which generates the internal potential $u_I(R_0,\sigma)$ that is then upward continued up to $S_R \equiv \{r = R\}$; the other one expressed by the Newton integral in spherical form, namely:

$$r \ge R: \quad u(r,\sigma) = U(R, R_0)\, [u_I(R_0,\sigma)]\, +$$
$$+\, 4\pi G \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} \left( \frac{R}{r} \right)^{n+1} \frac{Y_{nm}(\sigma)}{2n+1} \int_{R_0}^{R} \left( \frac{r'}{R} \right)^{n+1} \rho_{nm}(r')\, r'\, dr' \qquad (34)$$

Obviously analogous formulas could be worked out for $\delta g$. So, summarizing, we could say that from the above, rather rudimentary, example we have understood the two main rules of moving "upward" through the gravity field:

(a) if we start from a surface $S_0$ (in the example the sphere $S_{R_0}$) and want to bring the gravity field to the level of the surface $S$ encompassing $S_0$ (in the example the sphere $S_R$, $R > R_0$), we need to know:
(i) the potential $u_I$ of the masses inside $B_0$ ($\partial B_0 = S_0$) on $S_0$, or any other boundary data on $S_0$ that determine uniquely $u_I|_{S_0}$, like for instance $\delta g|_{S_0}$,
(ii) the mass distribution between $S_0$ and $S$,
(iii) that no other masses exist outside $S$;
(b) the gravity field from $S$ outside (in $\Omega$) can be obtained:
(i) by solving a boundary value problem with the given data on $S$, computing a regular harmonic solution and then restricting it to $S$ (in our example by using the upward operator $U(R, R_0)$),
(ii) by computing the Newton integral for the known density $\rho$ between $S_0$ and $S$ (in our example the second term in Eq. 34).

From the comments made in this section we already know that the harmonic upward continuation, namely the operator $U$, is one to one, while the Newton integral $\rho(r,\sigma) \longrightarrow u|_S$ is not one to one, because we have already shown, by the spherical example, that layers can be moved up and down by suitably changing their (surface) density without changing the potential on $S$. More on the upward continuation operator can be found in [28], Appendix A.
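The spectral mechanics of these rules can be illustrated with a small numerical sketch of the upward continuation factor of Eq. 27 and of the semi-group property of Eq. 31; the radii and degrees below are arbitrary illustrative values, not taken from the text.

```python
# Spectral upward continuation (Eq. 27): u_nm(R) = u_nm(R0) * (R0/R)^(n+1).
# Radii (in km) and degrees are illustrative choices only.

def up(u_nm, n, r_from, r_to):
    """Upward-continue a single degree-n coefficient from r_from to r_to."""
    return u_nm * (r_from / r_to) ** (n + 1)

R0, Rmid, R = 6370.0, 6400.0, 6600.0   # R0 < R' < R
n, u0 = 20, 1.0

# Semi-group property (Eq. 31): U(R, R0) = U(R, R') U(R', R0)
one_step = up(u0, n, R0, R)
two_steps = up(up(u0, n, R0, Rmid), n, Rmid, R)
print(abs(one_step - two_steps))   # agree up to floating point rounding

# The inverse (downward continuation) multiplies by (R/R0)^(n+1), which
# grows without bound in the degree n: the instability of D = U^{-1}.
for deg in (10, 100, 1000):
    print(deg, (R / R0) ** (deg + 1))
```

The loop makes the unboundedness of $D = U^{-1}$ tangible: the amplification factor grows by many orders of magnitude between degree 10 and degree 1000.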

3 Geodetic Boundary Value Problems (GBVP's)

This is the geodetic part of the use of gravity data for the determination of the gravity field on $S$ (the surface of the Earth) and in $\Omega$ (the exterior space). The significance of GBVP's stems from the fact that we can determine the form of a minimal set of information to be given on $S$ in order to find the corresponding (anomalous) gravity potential $u$ in a unique and stable way. Speaking of the anomalous potential implies that we can safely go to the linearized versions of the GBVP's (see [27], 15 and [28], 2); when we say that we can find $u$ in a unique and stable way, it means that we are able to identify spaces $Y$, for the data, and $X$, for the unknowns, such that for any $f \in Y$ we can find one and only one $u \in X$ satisfying the BVP, and that $u$ depends continuously on $f$ with respect to the corresponding topologies. So we can say that a GBVP has to be a typical well posed problem, according to Hadamard, for the determination of $u$ in $\Omega$.

The standard form of a linearized BVP is:

$$\begin{cases} \Delta u = 0 & \text{in } \Omega \\ B\, u|_S = f & \\ u \to 0, & r \to \infty \end{cases} \qquad (35)$$

where $B\, u$ represents the linear operator linking $u$ to the observations $f$, taken on the boundary $S$. The two terms $f$ and $S$ depend on the kind of data we consider as given; the same happens to the shape of the boundary operator $B$. In any event we shall assume $S$ to be star-shaped, i.e., that it can be described by the equation $r = R(\sigma) = R_\sigma$. We notice that maybe the most "natural" GBVP would be the so called Altimetry Gravimetry (AG) problem, in which the data are split in two parts:

(a) on the ocean it is supposed that the stationary sea surface height (SSH) is known from satellite altimetry, namely the height $h(\sigma)$, with respect to the ellipsoid, of the foot of the radar pulses reflected by the sea. Furthermore it is assumed that oceanographers can provide the so called dynamic height, namely the height $\xi_D$ of the stationary sea surface with respect to the reference equipotential surface, the geoid ($G$). Such dynamic height is supported by the steady circulation in the oceans. So if we call $N$ the geoid undulation, i.e., the height of the geoid on the ellipsoid, we have the notable relation

$$h(\sigma) = \xi_D(\sigma) + N(\sigma) \qquad (36)$$

Obviously $h$, $\xi_D$, $N$ are functions of the horizontal coordinates $\sigma = (\lambda, \varphi)$. Since $N(\sigma)$ is related to $u$ by Bruns' relation $N(\sigma) = \gamma(\sigma)^{-1}\, u(\sigma)$, where $\gamma$ is as usual the modulus of normal gravity, Eq. 36 implies that on the ocean we can consider $u(\sigma)$ as directly known; in this case $S$ is just the projection of the ocean on the corresponding part of the ellipsoid;

(b) on land we assume to know the modulus of the gravity vector $g(P)$ at any point, and some kind of altimetric information. If we know $h_P$, the ellipsoidal height of $P$, then the linearization leads to the land boundary relation:

$$\delta g(\sigma) = g(\sigma) - \gamma(\sigma) = -\frac{\partial u}{\partial h} \qquad (37)$$

$\delta g$ is known as the gravity disturbance, and the surface $S$ in this case is directly the Earth surface for the land part. If on the contrary we assume to know the total gravity potential at $P$, $W(P) = U(P) + u(P)$ (with $U(P)$ the normal potential), then the linearization process leads to the boundary relation:

$$\Delta g(P) = g(P) - \gamma(P^*) = -\frac{\partial u}{\partial h} + \frac{1}{\gamma} \frac{\partial \gamma}{\partial h}\, u \qquad (38)$$

also known as the fundamental equation of physical geodesy; $\Delta g$ is known as the free air gravity anomaly, $P^*$ is a known point at a height related to $W(P)$, and in this case the surface $S$ is the telluroid, i.e., the surface swept by $P^*$. Such linearization processes are described in detail in [18], 2–13 and in [27], 2. The (AG) problem is not an easy one as for its quantitative analysis, so the actual state of the art is still an analysis for a spherical surface $S_0$ divided into ocean and land; this can be found in [24]. Fortunately, after some years, it was understood that marine geoid data can be manipulated to produce also on the ocean a dataset of gravity anomalies (see [1]), providing a high resolution gravity map. This is thanks to the old formulas of Vening Meinesz ([18], 2–22), combined with the calculation of the horizontal gradient of $u$ and the physical fact that the geoid on the ocean is a quite smooth function. Therefore we could arrive at GBVP's of two types, according to whether we consider as known the gravity disturbance $\delta g$, in which case $S$ is directly the surface of the Earth, or the free air gravity anomaly $\Delta g$, in which case $S$ is the telluroid; this second problem is also known under the name of scalar Molodensky problem. The analysis of such problems has been carried out to a satisfactory point; the most recent results are collected in [26]. Here we shall concentrate on the first BVP, namely the so called Fixed Boundary gravimetric BVP:

$$\begin{cases} \Delta u = 0 & \text{in } \Omega \\ \frac{\partial u}{\partial h}\big|_S = f(\sigma) & \text{on } S \\ u = O\!\left(\frac{1}{r}\right), & r \to \infty \end{cases} \qquad (39)$$

This is to avoid the technicalities related to the Molodensky problem and its non-unique solution. Even more, since our purpose is to discuss the relation of the GBVP to global gravity models, we shall further simplify Eq. 39, transforming the boundary relation into the corresponding spherical approximation formula:

$$\frac{\partial u}{\partial r}\bigg|_S = u'(\sigma)|_S = u'(R_\sigma, \sigma) \equiv f(\sigma) \qquad (40)$$

For such a simplified problem:

$$\begin{cases} \Delta u = 0 & \text{in } \Omega \\ u'|_S = f(\sigma) & \text{on } S \\ u \to 0, & r \to \infty \end{cases} \qquad (41)$$

the classical analysis leading to existence, uniqueness and stability of the solution is remarkably simple. For this reason we sketch it here, in a slightly modified form; for more details one can consult [27], 15.4, where the extension to the non-spherical approximation, i.e., to the boundary relation of Eq. 37, is performed. The functional spaces involved are $L^2(S)$ for the data $f(\sigma)$, with norm

$$\| f \|^2_{L^2(S)} = \int_S f^2(\sigma)\, dS \qquad (42)$$

and $H^{1,2}(S)$ for $u$, with norm

$$\| u \|^2_{H^{1,2}(S)} = \int_S |\nabla u|^2\, dS. \qquad (43)$$

Note that Eq. 43 gives a true norm for the harmonic function $u$ (i.e., it is 0 if and only if $u = 0$). If $\int_S |\nabla u|^2\, dS = 0$ then $|\nabla u||_S = 0$ almost everywhere, namely both the tangential gradient of $u$ on $S$ and its normal derivative should be zero. From the first statement one derives that $u$ should be constant on $S$. But then $S$ is an equipotential surface of $u$, and $\frac{\partial u}{\partial n} = 0$ implies, by the unique solvability of the exterior Neumann problem (e.g., [31], Chaps. 7 and 9), that $u$ has to be zero everywhere on $S$ and in $\Omega$. Now we can start from the differential identity

$$\nabla \cdot [\, r\, u'\, \nabla u\, ] = |\nabla u|^2 + \frac{r}{2}\, \frac{\partial}{\partial r} |\nabla u|^2 \qquad (44)$$

that the reader is invited to verify directly, recalling that $r \frac{\partial}{\partial r} = \sum x_i \frac{\partial}{\partial x_i}$. Integrating Eq. 44 on $\Omega$ and using the divergence theorem, as well as an integration by parts on $r$ between $R_\sigma$ and $+\infty$, we arrive at the identity (see [8, 26])

$$\int_\Omega |\nabla u|^2\, d\Omega + \int R_\sigma^3\, |\nabla u|^2\, d\sigma \equiv 2 \int_S R_\sigma\, u'\, \frac{\partial u}{\partial n}\, dS. \qquad (45)$$

Then we use the following inequalities:

$$\int_\Omega |\nabla u|^2\, d\Omega \ge 0 \qquad (46)$$

$$\left| \int_S R_\sigma\, u'\, \frac{\partial u}{\partial n}\, dS \right| \le R_+ \left( \int_S f^2(\sigma)\, dS \right)^{1/2} \left( \int_S u_n^2\, dS \right)^{1/2} \le R_+\, \| f \|_{L^2(S)}\, \| u \|_{H^{1,2}(S)} \qquad (47)$$

and, noticing that $\cos I\, dS = R_\sigma^2\, d\sigma$ ($\cos I = n \cdot e_r$),

$$\int R_\sigma^3\, |\nabla u|^2\, d\sigma \ge R_- \int_S |\nabla u|^2 \cos I\, dS \ge R_- \cos I_+\, \| u \|^2_{H^{1,2}(S)}, \qquad (48)$$

where $I$ is the inclination of the surface $S$ with respect to the main direction of the vertical, approximated here by $e_r$; $R_\pm$ are the max and min of $R_\sigma$; $I_+$ is the maximum inclination of $S$, which is supposed to be less than $90°$, so that $(\cos I_+)^{-1} = J_+ < +\infty$. Wrapping up Eqs. 46, 47, and 48 in Eq. 45 we find

$$R_- \cos I_+\, \| u \|^2_{H^{1,2}(S)} \le 2 R_+\, \| f \|_{L^2(S)}\, \| u \|_{H^{1,2}(S)}$$

namely

$$\| u \|_{H^{1,2}(S)} \le 2\, \frac{R_+}{R_-}\, J_+\, \| f \|_{L^2(S)}. \qquad (49)$$

As we can see, the stability of the solution of this GBVP in $H^{1,2}(S)$ depends essentially on the inclination of the terrain with respect to the vertical. Uniqueness obviously follows from Eq. 49; existence requires some further reasoning, for which we refer to the literature (e.g., see [26]).

Remark 2. Naturally we would like not only to know that, given $\delta g$, there is one and only one harmonic $u$ such that $u'|_S = -\delta g$, but also to know how to compute it, e.g., at the level of the boundary $S$ itself. This can be done, as we will shortly see, by globally modelling $u$ with finite combinations of external spherical harmonics, but also by the representation of $u$ by means of boundary layers, which has occupied geodesists for several decades. Here we want to return to the so called Prague method, introduced in geodesy by T. Krarup in the first of his famous letters on Molodensky's problem [10]. We build on Krarup's ideas, adapting them to the case of the simple (i.e., spherical) Fixed Boundary GBVP. The concept is that, noticing that



$$v = -r\, u' = -\sum x_i\, \frac{\partial u}{\partial x_i} \qquad (50)$$

has to be harmonic when $u$ is such, to determine $v$ from its boundary values

$$v|_S = f(\sigma) = R_\sigma\, \delta g(\sigma) \qquad (51)$$

means solving a Dirichlet problem. A classical approach to the Dirichlet problem is to represent $v(r,\sigma)$ by means of a double layer potential, namely

$$v(x) = \int_S \mu(y)\, \frac{\partial}{\partial n_y} \frac{1}{l_{xy}}\, dS_y, \qquad x \in \Omega \qquad (52)$$

with $l_{xy} = |x - y|$. Then taking the limit for $x \to x_S \equiv (R_\sigma, \sigma)$, i.e., approaching the boundary $S$ from outside along the normal to $S$, one gets the well known relation (see [18], 1-4)

$$2\pi\, \mu(x_S) + \int_S \mu(y)\, \frac{\partial}{\partial n_y} \frac{1}{l_{xy}}\, dS_y = f(\sigma) \qquad (53)$$

that can serve as an integral equation for the unknown $\mu(y)$ ($y \in S$). Once $\mu(y)$ is obtained, Eq. 52 allows one to compute $v$ in $\Omega$. Once $v$ is known, one can observe that from Eq. 50 we have:

$$u(r,\sigma) = \int_r^{+\infty} \frac{1}{s}\, v(s,\sigma)\, ds, \qquad x \equiv (r,\sigma) \in \Omega; \qquad (54)$$

notice that in fact, when $v(s,\sigma)$ is a regular harmonic function, it is $v(s,\sigma) = O(\frac{1}{s})$, so that the integral in Eq. 54 is convergent and in fact provides a regular potential $u = O(\frac{1}{r})$. The Prague method consists essentially in making the two operations of Eqs. 52 and 54 in one shot only, namely setting

$$u(r,\sigma) = \int_S dS_y\, \mu(y)\, n_y \cdot \nabla_y K(x,y) \qquad (55)$$

$$K(x,y) = \int_r^{+\infty} \frac{1}{s_\xi}\, \frac{1}{l_{\xi y}}\, ds_\xi; \qquad (56)$$

in the above formulas the following notation has been used:

$$x = r\, e_\sigma, \qquad y = \rho\, e_{\sigma'}, \qquad \xi = s_\xi\, e_\sigma,$$
$$l_{\xi y} = \sqrt{s_\xi^2 + \rho^2 - 2\, \rho\, s_\xi \cos\psi}, \qquad \cos\psi = e_\sigma \cdot e_{\sigma'}.$$

A remarkable point is that the integral of Eq. 56 has an explicit form, namely

$$K(x,y) = K(r, \rho, \psi_{\sigma\sigma'}) = \frac{1}{\rho}\, \log \frac{l_{xy} + \rho - r \cos\psi}{r\, (1 - \cos\psi)}. \qquad (57)$$

Therefore one has (see Fig. 3 for the meaning of $e_\psi$):

$$\nabla_y K(x,y) = \frac{\partial K}{\partial \rho}\, e_{\sigma'} + \frac{1}{\rho}\, \frac{\partial K}{\partial \psi}\, e_\psi \qquad (58)$$

It is easy to see that

Fig. 3 Unit vectors for the computation of $\nabla_y K(x,y)$; $e_\psi = \frac{\partial}{\partial \psi}\, e_{\sigma'}$

$$e_\psi = -\frac{1}{\sin\psi}\, e_\sigma + \cot\psi\, e_{\sigma'} \qquad (59)$$

with $e_\sigma = \frac{1}{r}\, x$, $e_{\sigma'} = \frac{1}{\rho}\, y$.

Moreover

$$\frac{\partial K}{\partial \rho} = -\frac{1}{\rho}\, K + \frac{1}{\rho\, l_{xy}} \qquad (60)$$

$$\frac{\partial K}{\partial \psi} = -\frac{\sin\psi}{\rho} \left[ \frac{1}{1 - \cos\psi} - \frac{r\, (l_{xy} + \rho)}{l_{xy}\, (l_{xy} + \rho - r \cos\psi)} \right] \qquad (61)$$
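Since the closed form of Eq. 57 is the crux of the method, it is reassuring to check it numerically against the defining integral of Eq. 56; the sample geometry $(r, \rho, \psi)$ below is an arbitrary illustrative choice.

```python
import math

# Check that the closed form of Eq. 57 reproduces the integral of Eq. 56,
# K = int_r^inf ds / (s * l(s)),  l(s)^2 = s^2 + rho^2 - 2 rho s cos(psi).

def K_closed(r, rho, psi):
    lxy = math.sqrt(r * r + rho * rho - 2.0 * r * rho * math.cos(psi))
    return math.log((lxy + rho - r * math.cos(psi))
                    / (r * (1.0 - math.cos(psi)))) / rho

def K_integral(r, rho, psi, N=4000):
    # the substitution u = 1/s maps the integral onto [0, 1/r]; there
    # u * l(1/u) = sqrt(1 + rho^2 u^2 - 2 rho u cos(psi)), so the integrand
    # is smooth and a composite Simpson rule converges quickly
    def f(u):
        return 1.0 / math.sqrt(1.0 + (rho * u) ** 2
                               - 2.0 * rho * u * math.cos(psi))
    a, b = 0.0, 1.0 / r
    h = (b - a) / N
    s = f(a) + f(b)
    for i in range(1, N):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

r, rho, psi = 1.2, 1.0, 0.8   # x strictly outside the sphere of y, psi != 0
print(K_closed(r, rho, psi), K_integral(r, rho, psi))
```

The two numbers agree to roughly machine precision, provided $x \ne y$ (i.e., $l_{xy} > 0$) and $\psi \ne 0$, where the kernel is singular.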

It is interesting to observe that   ∂K ∂K 1 , =O ∂ρ ∂ψ lxy so they are integrable functions on S. Therefore the relation in Eq. 55 can be used even if we take x ∈ S; so once μ(y) has been reckoned by solving Eq. 53, u(x)|S is retrieved by Eq. 55 with Eqs. 60 and 61. As a final comment on this Remark, one can raise the question of the computability of a solution in two steps; μ from Eq. 53 and u from Eq. 55. In fact if the solution method goes through the representation of μ in spherical harmonics, then a direct approximation method can only be superior. On the other hand the solution of Eq. 53 can also be pursed by a multiscale method ( [6], Cap. 3) and the corresponding discretization that in principle is capable of reducing the calculations taking advantage of the large areas of the Earth surface (e.g., the oceans) on which the gravity field is smoother. In such a case the Prague method could be taken into consideration. Remark 3. it is interesting to observe that the above results valid for the simple fixed boundary GBVP, can be easily carried on by a perturbative technique to the true linearized fixed boundary GBVP, namely to Eq. 39. The analysis can be found in [27], Cap. 15,4 and in synthesis one can say (with a slight different definition of the norms) that the stability constant, that in Eq. 49 was about 2J+ , now becomes:


Fig. 4 The geometric setting of Runge-Krarup’s theorem



C∼ = 2 J+ (1 − 2 + J+ ) + ∼ = 12 e2 ∼ = 6.72 · 10−3

(62)

2 + J+ < 6.72 J+ < 1.

(63)

on condition that

An easy calculation, recalling that $J_+ = (\cos I_+)^{-1}$, shows that Eq. 63 is satisfied if the maximum inclination of the topography, $I_+$, satisfies $I_+ < 89.6°$, certainly not a very restrictive condition. The above can be generalized to the Molodensky linearized problem, though with a much more involved analysis.
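The arithmetic behind this bound is immediate to verify; the only input is the rounded value $e^2 \cong 6.72 \cdot 10^{-3}$ quoted above.

```python
import math

# Limiting inclination from condition (63): with J+ = 1/cos(I+) and
# 2*eps+ ~ e^2, the condition 2*eps+*J+ < 1 requires cos(I+) > e^2.
e2 = 6.72e-3                      # rounded eccentricity squared, as in the text
I_max_deg = math.degrees(math.acos(e2))
print(round(I_max_deg, 1))        # maximum admissible terrain inclination
```

The result, about 89.6 degrees, matches the figure in the text: only near-vertical topography violates the condition.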

4 Global Models as Approximate Solutions of the GBVP

A global model $u_N(r,\sigma)$ is an approximate representation of the actual anomalous potential by means of a truncated series of external spherical harmonics, i.e. (cf. Eq. 9),

$$u_N(r,\sigma) = \sum_{n=0}^{N} \sum_{m=-n}^{n} u_{nm}\, S_{enm}(r,\sigma) = \sum_{n=0}^{N} \sum_{m=-n}^{n} u_{nm} \left( \frac{R}{r} \right)^{n+1} Y_{nm}(\sigma); \qquad (64)$$

since the summation in Eq. 64 is finite, the choice of $R$ is just a matter of convenience. In the present context we choose $R = R_0$, a Bjerhammar radius, namely a radius such that the sphere $S_0 \equiv \{r = R_0\}$ is totally inside the body $B$ of the Earth (Fig. 4). In order to say that $u_N$ is an approximation of $u$, we need a norm to measure the residual $u - u_N$. Given the frame of the theory of the GBVP presented in Sect. 3, in our application we shall use the norm in $H^{1,2}(S)$ in a form essentially equivalent to Eq. 43, namely

$$\| u \|^2_{H^{1,2}(S)} = \int \big|\, \nabla u|_S\, \big|^2\, d\sigma \qquad (65)$$

We underline that here $u \in H^{1,2}(S)$ means also that $u$ is harmonic in $\Omega$ (Fig. 1), namely, if we call $H(\Omega)$ the space of all functions harmonic in $\Omega$, we understand

$$u \in H^{1,2}(S) \iff \| u \|_{H^{1,2}(S)} < +\infty, \quad u \in H(\Omega) \qquad (66)$$

An analogous notation will be used for functions in $H^{1,2}(S_0)$ that are also harmonic in $\Omega_0$.

Remark 4. We have given to the norms of Eqs. 43 and 65 the same name because the two are equivalent from the functional point of view. In fact from

$$dS = \frac{R_\sigma^2}{\cos I}\, d\sigma = R_\sigma^2\, J\, d\sigma \qquad (67)$$

one sees that, with obvious symbolism,

$$R_-^2\, \| u \|^2_{(65)} \le \| u \|^2_{(43)} \le R_+^2\, J_+\, \| u \|^2_{(65)}. \qquad (68)$$

Nevertheless the two norms assign, so to say, different weights to different areas; in view of Eq. 67, the norm in Eq. 43 gives more weight to rugged areas, where $\cos I$ can become smaller. This may not be wise from the approximation point of view. To give a mathematical basis to our statement that $u_N$ is an approximate representation of $u$, in this case in $H^{1,2}(S)$, we just need to know that the space

$$H_0 = \mathrm{Span}\{ S_{enm}(r,\sigma) \} \qquad (69)$$

of finite linear combinations of the external spherical harmonics is dense in $H^{1,2}(S)$, denoted

$$H_0 \subset_d H^{1,2}(S), \qquad (70)$$

and that the coefficients $u_{nm}$ of $u_N$ are judiciously chosen according to some convergent approximation principle. This can be seen in two steps. At first consider the two spaces $H^{1,2}(S_0)$ and $H^{1,2}(S)$ and the restriction operator $R_\Omega$, that restricts to $\Omega$ a function harmonic in $\Omega_0$; it is then obvious that

$$R_\Omega(H^{1,2}(S_0)) \subset H^{1,2}(S). \qquad (71)$$


On the other hand, in view of the very general Runge-Krarup theorem ([11], Chap. 9), we know that, as a particular case, Eq. 71 holds densely, i.e.,

$$R_\Omega(H^{1,2}(S_0)) \subset_d H^{1,2}(S). \qquad (72)$$

As a second step, we can see that

$$H_0 \subset_d H^{1,2}(S_0); \qquad (73)$$

in fact, since $S_0$ is a sphere, we have

$$\forall u \in H^{1,2}(S_0), \qquad u = \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} u_{nm}\, S_{enm}(r,\sigma), \qquad (74)$$

$$\| u \|^2_{H^{1,2}(S_0)} = \int_{S_0} \left( u'^2 + \frac{1}{R_0^2}\, |\nabla_\sigma u|^2 \right) d\sigma = \frac{4\pi}{R_0^2} \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} u_{nm}^2\, (n+1)(2n+1). \qquad (75)$$

This formula is easily explained by taking into account that

$$\int_\sigma |\nabla_\sigma u(R_0,\sigma)|^2\, d\sigma = -\int_\sigma u(R_0,\sigma)\, \Delta_\sigma u(R_0,\sigma)\, d\sigma = 4\pi \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} n\, (n+1)\, u_{nm}^2.$$

From Eq. 75 it is obvious that Eq. 73 holds, because $\| u - u_N \|^2_{H^{1,2}}$ is just the residual of a convergent series of positive terms. But then, combining Eqs. 72 and 73,

$$R_\Omega H_0 = \mathrm{Span}\{ R_\Omega\, S_{enm}(r,\sigma) \} \subset_d H^{1,2}(S), \qquad (76)$$

which is basically what we wanted to prove. Now notice that the theory of Sect. 3 tells us that the operator $B: H^{1,2}(S) \longrightarrow L^2(S)$ is in fact one to one, bounded and also with a bounded inverse $B^{-1}: L^2(S) \longrightarrow H^{1,2}(S)$. In this case we know ([32], Chap. VII.1) that we can define a transpose operator $B^T$,

$$B^T: L^2(S) \longrightarrow H^{1,2}(S),$$


which is also bounded, invertible and with bounded inverse. As a consequence, identifying $H_0$ and $R_\Omega(H_0)$, we can claim that

$$B(H_0) \subset_d L^2(S). \qquad (77)$$

In fact, if $\varphi \in L^2(S)$ is such that

$$\forall u \in H_0, \qquad \langle \varphi,\, B u \rangle_{L^2(S)} = 0,$$

we have as well, thanks to Eq. 76,

$$\forall u \in H_0, \qquad \langle B^T \varphi,\, u \rangle_{H^{1,2}(S)} = 0 \ \Rightarrow\ B^T \varphi = 0 \ \Rightarrow\ \varphi = 0,$$

and Eq. 77 is proved. At this point it is expedient to set up an approximation procedure, by using the data $f(\sigma) = B\, u|_S$ and setting up, for instance, a least squares principle

$$\min_{\{u_{nm}\}}\ \| f(\sigma) - B\, u_N(\sigma) \|^2_{L^2(S)} = \varepsilon^2(N) \qquad (78)$$

allowing one to determine the set of coefficients $\{u_{nm}\}$ that gives the "best" approximation of $f$ among the linear combinations of $\{B\, S_{enm}\}$ complete up to degree $N$. Indeed Eq. 77 implies that

$$\lim_{N \to \infty} \varepsilon^2(N) = 0, \qquad (79)$$

i.e., a least squares estimate is such that $B\, u_N \longrightarrow B\, u$. But then, as a consequence of Eq. 49, we have as well

$$\lim_{N \to \infty} \| u - u_N \|_{H^{1,2}(S)} = 0, \qquad (80)$$

i.e., $u_N$ converges to the exact solution $u$. Note that in this context the least squares principle has nothing to do with the stochastic errors that always affect observations, like $\delta g$; in our case the principle has a purely deterministic meaning, as a functional approximation criterion. Another important remark is that the functions $B\, S_{enm}(R_\sigma, \sigma)$ are not orthogonal in $L^2(S)$, nor are they if we take a modified norm substituting $dS$ with $d\sigma$. Therefore the least squares principle (Eq. 78) will lead to estimates of the coefficients, $u_{nm}(N)$, that change if we change the maximum degree $N$. Two issues related to this non-orthogonality are worth discussing. The first is of numerical character; in fact a least squares solution requires the computation of a very large normal matrix with entries

$$(A^T A)_{nmjk} = \int_S B\, S_{enm}(R_\sigma, \sigma)\, B\, S_{ejk}(R_\sigma, \sigma)\, dS \qquad (81)$$

which for a maximum degree 2000 requires some $10^{13}$ integrals. Indeed there are methods for a direct iterative solution of the normal system and among them, accepting some approximations, we can count the actual geodetic solution, passing through an approximate downward continuation of the datum $\delta g$ (or $\Delta g$ in the case of Molodensky's problem) down to the ellipsoid, followed by numerical quadrature with functions that are orthogonal on such a surface. The approach is discussed in both [27], Chap. 15.5 and [28], Chap. 5. Beyond the method in fact employed to compute the actual global model EGM2008 (see [19]), this "change of boundary" approach has been systematically studied, from the numerical point of view, in a recent paper ([4]), where clear improvements in terms of decrease of biases are shown with respect to more traditional approaches. A second important question concerns the significance of $u_{nm}(N)$ when $N \longrightarrow \infty$; do such limits exist? Are the limits related somehow to the internal moments of the mass distribution? To answer this question we can make the following reasoning. Let us take a sphere $S_e$ (with radius $R_e$) external to $S$ and consider, for our $u \in H^{1,2}(S)$, the function of $\sigma$ only $u(R_e, \sigma)$; such a function has a convergent harmonic series:

$$u(R_e, \sigma) = \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} u_{nm}(R_e)\, Y_{nm}(\sigma) \qquad (82)$$

and since it is a quite smooth function (on the bounded set $S_e$), we certainly have

$$u_{nm}(R_e) = \frac{1}{4\pi} \int u(R_e, \sigma)\, Y_{nm}(\sigma)\, d\sigma \qquad (83)$$

and

$$\| u(R_e, \sigma) \|^2_{L^2(\sigma)} = \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} u_{nm}(R_e)^2 < +\infty. \qquad (84)$$

Since the norm squared of Eq. 84 is controlled by $\| u \|^2_{H^{1,2}(S)}$, i.e., $\| u(R_e, \sigma) \|_{L^2(\sigma)} \le C\, \| u \|_{H^{1,2}(S)}$, the linear functionals of Eq. 83 are continuous on $H^{1,2}(S)$ and therefore there must be functions $f_{nm} \in H^{1,2}(S)$ such that


$$u_{nm}(R_e) = \langle f_{nm},\, u \rangle_{H^{1,2}(S)}; \qquad (85)$$

the reasoning is identical to that presented in [27], Chap. 15.2, Propositions 1 and 2. Now consider applying the same functionals to the least squares solution (see Eq. 64 with $R = R_0$); one obviously has

$$\langle f_{nm},\, u_N \rangle_{H^{1,2}(S)} = u_{nm}(N) \left( \frac{R_0}{R_e} \right)^{n+1}. \qquad (86)$$

On the other hand, since $u_N$ is known to converge to $u$ in $H^{1,2}(S)$, one must also have

$$\lim_{N \to \infty} u_{nm}(N) = \left( \frac{R_e}{R_0} \right)^{n+1} u_{nm}(R_e). \qquad (87)$$

So we see that the individual coefficients of the approximation do converge to harmonic coefficients that represent $u(r,\sigma)$ from the level $r = R_e$ up. Notice however that, although $u_N \to u$ in $H^{1,2}(S)$ and $u_{nm}(N) \to \left( \frac{R_e}{R_0} \right)^{n+1} u_{nm}(R_e)$, this does not mean that the truncated series has a limit in the form of such a series convergent down to the surface $S$ (or, even worse, down to the surface $S_0$). This is due to the fact that one cannot presume that

$$\sum_{n=0}^{+\infty} \sum_{m=-n}^{n} \left( \frac{R_e}{R_0} \right)^{2n+2} u_{nm}(R_e)^2$$

be bounded, unless the original function $u(r,\sigma)$ can be continued down to the level $r = R_0$. As a matter of fact, every time we see an expression like

$$u(R_\sigma, \sigma) = \sum_{n=0}^{+\infty} \sum_{m=-n}^{n} u_{nm}\, S_{enm}(R_\sigma, \sigma)$$

we are using just a symbol, devoid of a precise mathematical meaning. This argument might be important when we want to combine gravity measurements from satellite with boundary information given on $S$. In fact, assume that the satellite surveying at the altitude $r = R_e$ is transformed into estimates of the coefficients $u_{nm}(R_e)$, as has been done in the so called spacewise approach for the GOCE mission [15]; then we would like to know how to write "observation equations" for $u_{nm}(R_e)$, to be combined with the "terrestrial" global model. This in principle should be done in the form of Eq. 85; namely, we should know explicitly the functions $f_{nm}$. This is difficult to do if we want to preserve one of


the forms of the scalar product in $H^{1,2}(S)$ that we have already introduced, namely Eqs. 43 and 65. Nevertheless, as proved in Appendix A, it is not difficult to find functions $\{F_{nm}(\sigma)\}$, $\{G_{nm}(\sigma)\}$ such that the identity holds

$$u_{nm}(R_e) = \frac{1}{4\pi} \int F_{nm}(\sigma)\, (u'|_S)\, d\sigma + \frac{1}{4\pi} \int G_{nm}(\sigma)\, (u|_S)\, d\sigma; \qquad (88)$$

the explicit analytical form of $F_{nm}(\sigma)$ and $G_{nm}(\sigma)$ is also given in Appendix A.

Remark 5. As a matter of fact, Eq. 88 suggests that we should inquire into a third equivalent version of the norm in $H^{1,2}(S)$, namely

$$\| u \|^2_{\tilde{H}^{1,2}(S)} = \frac{1}{4\pi} \int \left[ (u'^2|_S) + (u^2|_S) \right] d\sigma. \qquad (89)$$

However, given the result of Remark 2, and considering the way in which it has been proved, the equivalence of Eq. 89 with Eqs. 43 and 65 is immediate. In any event, Eq. 88 is already an answer to our question, and it can be used in a least squares approximation process, e.g., in combination with ground observations $\delta g = -u'|_S$. Alternatively, one could use the above relation to transform Eq. 88 into

$$u_{nm}(R_e) + \frac{1}{4\pi} \int F_{nm}(\sigma)\, \delta g(\sigma)\, d\sigma \equiv \frac{1}{4\pi} \int G_{nm}(\sigma)\, (u|_S)\, d\sigma, \qquad (90)$$

which is again an observation equation for $u$, maybe in a simpler form. It has to be underlined, however, that if we want to use Eq. 90 in a least squares process in the Gauss-Markov sense, one should properly propagate the covariance of the observations, taking into account the cross-covariances between $\delta g$ and the left hand side of Eq. 90. Whether Eq. 88 or Eq. 90 is more convenient depends on the way in which we approximate $u$. For instance, when $u$ is approximated by a global model $u_N$, certainly the use of Eq. 88 is easier; if, on the contrary, we use boundary elements to approximate $u$, maybe Eq. 90 could be convenient.
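The dependence of the least squares estimates on the truncation degree, noted below Eq. 78, can be illustrated with a deliberately simplified one-dimensional analogue: monomials on an interval stand in for the non-orthogonal base functions $B\,S_{enm}$ on $S$, and a toy "observation" function replaces the geodetic data. This is an illustration of the phenomenon only, not the geodetic computation itself.

```python
import math

# Toy analogue of the non-orthogonality issue of Eq. 78: with base functions
# that are not orthogonal on the data domain, least squares estimates of the
# low-degree coefficients shift when the truncation degree N changes.

def lstsq_poly(xs, ys, N):
    """Fit sum_{n<=N} c_n x^n via the normal equations (A^T A) c = A^T y."""
    A = [[x ** n for n in range(N + 1)] for x in xs]
    M = [[sum(A[k][i] * A[k][j] for k in range(len(xs)))
          for j in range(N + 1)] for i in range(N + 1)]
    b = [sum(A[k][i] * ys[k] for k in range(len(xs))) for i in range(N + 1)]
    # plain Gaussian elimination with partial pivoting
    for i in range(N + 1):
        p = max(range(i, N + 1), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, N + 1):
            f = M[r][i] / M[i][i]
            for c in range(i, N + 1):
                M[r][c] -= f * M[i][c]
            b[r] -= f * b[i]
    c = [0.0] * (N + 1)
    for i in range(N, -1, -1):
        c[i] = (b[i] - sum(M[i][j] * c[j]
                           for j in range(i + 1, N + 1))) / M[i][i]
    return c

xs = [i / 50.0 for i in range(51)]     # data sites on [0, 1]
ys = [math.exp(x) for x in xs]         # toy "observations" f = B u
c2 = lstsq_poly(xs, ys, 2)             # truncation at N = 2
c4 = lstsq_poly(xs, ys, 4)             # truncation at N = 4
print(c2[1], c4[1])                    # the degree-1 estimate moves with N
```

With an orthogonal basis the low-degree coefficients would be unaffected by raising $N$; here the estimate of the linear coefficient shifts visibly, exactly the behavior of the $u_{nm}(N)$ discussed above.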

5 Principles of Downward Continuation

In this section we invert the considerations of Sect. 2, since now we want to move downward through the gravity field, from an upper surface $S_U$ to a lower surface $S_L$. We assume that there are no masses above $S_U$, so that the exterior set $\Omega_U$ is by definition a domain of harmonicity of $u$; the downward continuation is then dominated by two principles:


(a) as far as we move in the harmonicity domain of u, namely if we assume that SL coincides with or encloses the Earth surface S, the downward continuation (D.C.) operator

D(SL , SU ) ≡ U (SU , SL )−1

(91)

is unbounded, but, as the inverse of a one to one operator, it is one to one too;
(b) when we dive into the masses there are two alternatives:
(i) either we know the mass density $\rho$, as a function of the position, in which case the D.C. can be reduced to the rules of (a),
(ii) or, as soon as we penetrate an unknown spatial mass distribution, a strong non-uniqueness of the D.C. starts, so that we cannot say anymore that there is a D.C. operator, like Eq. 91, but rather there is a whole family of mass distributions that can generate the same field given on $S_U$. In fact, in this case the very concept of D.C. is lost.

We start with the first principle, (a). The existence of the operator $D = U^{-1}$ for the spherical case, as well as its unboundedness, has already been commented in Sect. 2. In fact, if $S_U$ and $S_L$ are two spheres with radii $R_U$ and $R_L$ respectively ($R_U > R_L$), inverting Eq. 27 we see that

$$u_{nm}(R_L) = u_{nm}(R_U) \left( \frac{R_U}{R_L} \right)^{n+1}, \qquad (92)$$

which means that the coefficients $u_{nm}(R_L)$ are uniquely identified by the coefficients $u_{nm}(R_U)$, although the latter are increasingly magnified with the degree $n$, because $\frac{R_U}{R_L} > 1$. If we want to get out of the spherical example, we get the same result (at least for uniqueness) by recalling the "identity principle" of harmonic functions (see [27], 13). In fact, two harmonic functions that coincide in a neighborhood of a point of their harmonicity domain coincide everywhere in it. Then, if $u$ is given on $S_U$ and therefore on $\Omega_U$, any harmonic function coinciding with $u$ on $S_U$ will also coincide with it in $\Omega_U$, and therefore everywhere in the harmonicity domain of $u$; since $S_L$ is by hypothesis in such a domain, the D.C. of $u$ from $S_U$ to $S_L$ is unique, i.e., there exists the operator $D = U^{-1}$. That such an operator is unbounded comes from the fact that $U$ is compact, although we will not dwell on the proof, which can be found in [27], Appendix A.

Remark 6 (uniqueness of the inverse single layer problem with fixed geometry). Given the discussion of Sect. 2, it is also interesting to know whether, given that $u$ can be downward continued to $S_L$, we can assume too that the potential $u$ is generated by a single layer supported by $S_L$. The answer is in the positive under


general conditions; here however, to simplify matters, we stipulate that $S_L$ is fairly smooth and fully contained in the harmonicity domain of $u$, so that $u_L = u|_{S_L}$ is a smooth function too. In other words, we would like to know whether, once $u$ is continued down to $S_L$, the equation

$$u_L(P) = \int_{S_L} \frac{\omega(Q)}{l_{PQ}}\, dS_Q, \qquad P \in S_L, \qquad (93)$$

has a solution for the density $\omega(Q)$. We show how to construct $\omega(Q)$ from $u$, and the procedure automatically proves existence and uniqueness of the solution. Let us consider that from $u$ we can compute not only $u_L$ but also $\frac{\partial u}{\partial n}|_{S_L}$, and this is a smooth function too, given the hypothesis made on $S_L$. So from $u_L(P)$ ($P \in S_L$) we can find as well the solution of the Dirichlet problem in $B_L$ ($S_L \equiv \partial B_L$), namely a function $u_I$ harmonic inside $S_L$ and coinciding with $u_L$ on $S_L$. Since $u_L$ is smooth, so is $u_I$; so we can readily compute the function $\frac{\partial u_I}{\partial n}|_{S_L}$. But then, by applying the well known jump relations of potentials of single layers (e.g., see [27], 1.5), we have

$$\omega(Q) = -\frac{1}{4\pi} \left[ \frac{\partial u}{\partial n}\bigg|_{S_L}(Q) - \frac{\partial u_I}{\partial n}\bigg|_{S_L}(Q) \right], \qquad Q \in S_L. \qquad (94)$$

Now such a surface density generates an outer potential, in $\Omega_L$, that has a normal derivative on the upper face of $S_L$ equal to $\partial u/\partial n|_{S_L}$. But then, by the uniqueness of the solution of the exterior Neumann problem (see [22], n. 81), such a potential coincides with $u$, i.e., it satisfies Eq. 93. Note that in this reasoning $S_L$ is fixed. As claimed before, the above reasoning can be generalized to the case that $S_L$ is at the boundary of the harmonicity domain of $u$, so much so that we could roughly summarize the Remark by stating that the inverse problem for a single layer with fixed geometry $S_L$ has a unique solution if we are able to downward continue $u$ from $S_U$ to $S_L$.

We can come now to the principle (b); the case (i) is particularly simple. In fact if we know the mass distribution between $S_U$ and $S_L$, we can apply the Newton integral to it, subtract the so-derived potential $u_T$ from $u$, and then apply the rules of principle (a) to the remaining potential $v = u - u_T$, which now has a D.C. down to $S_L$. This is the solution typically applied in Geodesy to subtract the influence of topographic masses; whence the index $T$ used above. So we are left with principle (b), the case (ii). Remember that $S$ is the boundary of $B$, where $\rho \not\equiv 0$; if $S_U$ is exterior to $S$, we have already described the operator $D$ that downward continues the potential $u$ from $S_U$ to $S$. Since we want to know what happens when we penetrate inside $S$, to simplify the setting of the problem we can assume that $S_U \equiv S$ and that $u$ is given directly on such a surface.
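To make the degreewise magnification at the start of this discussion concrete, here is a minimal numerical sketch; the two radii are assumed round values, not taken from the text, and the exponent is written as $n$ (up to a convention-dependent shift).

```python
# Downward continuation from R_U to R_L magnifies the degree-n coefficients
# roughly by (R_U / R_L)^n; since the ratio exceeds 1, a fixed data error at
# R_U is amplified without bound as the degree grows. Radii below are assumed.
R_U, R_L = 6378.0e3, 6371.0e3   # metres, illustrative values only
ratio = R_U / R_L               # > 1

for n in (10, 100, 1000, 5000):
    print(n, ratio ** n)        # grows without bound with n
```

This is exactly the unboundedness of the operator $D = U^{-1}$ seen degree by degree.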

3 Up and Down Through the Gravity Field

Fig. 5 The setting of the problem of inverting u|S into a mass density in the layer L and several other ancillary surfaces

Notice that if instead of $u|_S$ we assume that $\delta g = -u'|_S$ is given, we can always invoke the GBVP solution to transform such a datum into $u$. On the other hand we have to notice that, as for the inner surface $S_L$, if we don't know the mass distribution internal to it, or at least the potential $u_L$ generated by such internal masses on $S_L$ (see comments in Remark 1), we are compelled to shrink $S_L$ to the origin, i.e., to consider the problem of inverting $u|_S$ into the whole $B$. So our general setting is as in Fig. 5, in which we mean that $S_L$ can be reduced to the origin, namely to disappear as a surface. When $S_L$ does not coincide with $O$, we assume to know the internal density, or equivalently that the body $B_L$ (internal to $S_L$) is hollow, since we can always subtract from the overall picture the potential generated by the known masses in $B_L$. So basically we would like to study the solution $\rho$ of the equation

$$u(P) = G \int_{\mathcal{L}} \frac{\rho(Q)}{l_{PQ}}\, dB_Q\,, \qquad P \in S\,, \tag{95}$$

when we know that there exists a $\rho \in L^2(\mathcal{L})$ that really generates $u|_S$ and we have fixed the geometry of $\mathcal{L}$. The first comment we have to make on Eq. 95 is that indeed such an equation does not have a unique solution. The spherical example is particularly clear.

Example 1. Assume $S$ is a sphere of radius $R$ and $S_L$ a sphere of radius $R_L < R$; decide further to look for $\rho$ constant in $\mathcal{L}$. Then (see [27], 1.3), if $u(R) = u_0$, for every $\rho > \dfrac{3 u_0}{4\pi G R^2}$ we get an

$$R_L = \left[\, R^3 - \frac{u_0\, R}{\tfrac{4}{3}\pi G \rho} \,\right]^{1/3}$$

such that $u_0 = \tfrac{4}{3}\pi G \rho\, \dfrac{R^3 - R_L^3}{R}$.

But even more generally we can use Remark 6 to show that masses inside $\mathcal{L}$ can be shifted without changing the field outside $S$. In fact if we take a thin layer around $S'$ (see Fig. 5), and we consider the part of the masses in this layer and the associated potential, we can use Remark 6 to show that such a potential can be as well generated by a suitable thin layer around $S''$ $(\supset S')$, at least outside $S''$ itself. So using repeatedly such "sweeping out" for many layers we can produce quite different density functions that generate the same $u$ outside $S$.

Given the above reasoning, the task we have is double: to characterize the full set $H_0$ of densities $\{\rho_0\}$ that produce a zero outer potential, namely the null space of the operator of Eq. 95; and to show how to construct at least one solution of Eq. 95. The problem, dating back a century to the works of [12, 20], has been solved long ago under the hypothesis that $u$ is in fact generated by an $L^2(\mathcal{L})$ density: we quote only [23] and [2]. A more recent and quite general analysis of the problem can be found in [29]. Another slightly different approach can be found in [30]. Here, nevertheless, we follow a more elementary approach, which has the merit of highlighting that when we deal with a true layer ($\mathcal{L} \neq B$) the minimum $L^2$-norm solution, which we are going to study, belongs to a space more restricted than that of functions harmonic in $\mathcal{L}$, and, in particular, there are functions harmonic in $\mathcal{L}$ that can generate a zero outer potential. This is not possible when $\mathcal{L} \equiv B$, as is the case studied in the quoted literature. So let us start from the fact that Eq. 95, as a linear equation between two Hilbert spaces, has in general as solution a linear manifold

$$\mathcal{S} = \{\rho + \rho_0\}\,, \qquad \rho_0 \in H_0:\ \int_{\mathcal{L}} \frac{\rho_0(Q)}{l_{PQ}}\, dB_Q = 0\,. \tag{96}$$
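Before moving on, the closed form of Example 1 can be checked numerically; in this sketch $G$ is the gravitational constant, while $R$ and $u_0$ are assumed illustrative values.

```python
import math

# Numerical check of Example 1 (R and u0 are assumed, not from the text):
# any rho > 3 u0 / (4 pi G R^2) admits a homogeneous layer R_L <= r <= R with
# u0 = (4/3) pi G rho (R^3 - R_L^3) / R on the sphere S.
G = 6.674e-11                     # m^3 kg^-1 s^-2
R = 6.371e6                       # m
u0 = 6.0e7                        # m^2 s^-2, assumed potential value on S
rho_min = 3.0 * u0 / (4.0 * math.pi * G * R**2)

rho = 2.0 * rho_min               # any density above the threshold works
R_L = (R**3 - u0 * R / ((4.0 / 3.0) * math.pi * G * rho)) ** (1.0 / 3.0)
u0_check = (4.0 / 3.0) * math.pi * G * rho * (R**3 - R_L**3) / R

print(R_L, u0_check)              # 0 < R_L < R, and u0_check reproduces u0
```

Each admissible $\rho$ gives a different $R_L$, which is the non-uniqueness in its simplest form.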

The two Hilbert spaces in question are precisely

$$\rho \in L^2(\mathcal{L})\,; \qquad u(P)|_S \in H^{3/2}(S)\,. \tag{97}$$

The situation is described in more detail in [23] and [29]; nevertheless, to grasp the idea of $H^{3/2}(S)$ we suggest that the reader develop the following spherical example.

Example 2. Let $S$ be a sphere with radius $R$, $S_L$ a sphere with radius $R_L < R$ and $\mathcal{L}$ the layer between the two. By using the same notation as in Eq. 33, we can write

$$u_{nm}(R) = \frac{4\pi G}{2n+1} \int_{R_L}^{R} \left(\frac{r'}{R}\right)^{n} \rho_{nm}(r')\, r'^{\,2}\, dr' \tag{98}$$

or, by the Schwarz inequality, putting $q = \dfrac{R_L}{R} < 1$,

$$|u_{nm}(R)|^2 \le \frac{(4\pi G)^2\, R^3\, (1 - q^{2n+3})}{(2n+3)\,(2n+1)^2} \int_{R_L}^{R} \rho_{nm}^2(r')\, r'^{\,2}\, dr'\,. \tag{99}$$
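The Schwarz-type bound of Eq. 99 can be probed with a simple midpoint rule; the normalized units and the particular radial profile are assumptions of the sketch, and the normalization follows Eq. 98 as written here.

```python
import math

# Numerical probe of the Schwarz inequality behind Eq. 99, normalized units:
# |u_nm|^2 <= (4 pi G)^2 R^3 (1-q^(2n+3)) / ((2n+3)(2n+1)^2) * int rho_nm^2 r^2 dr.
G, R, R_L, n = 1.0, 1.0, 0.5, 4
q = R_L / R
rho_nm = lambda r: 1.0 + r            # an arbitrary radial coefficient profile

N = 20000
h = (R - R_L) / N
rs = [R_L + (k + 0.5) * h for k in range(N)]      # midpoint-rule nodes
u_nm = 4.0 * math.pi * G / (2 * n + 1) * h * sum((r / R) ** n * rho_nm(r) * r * r for r in rs)
rho_int = h * sum(rho_nm(r) ** 2 * r * r for r in rs)
bound = (4.0 * math.pi * G) ** 2 * R ** 3 * (1.0 - q ** (2 * n + 3)) \
        / ((2 * n + 3) * (2 * n + 1) ** 2) * rho_int

print(u_nm ** 2 <= bound)   # True; equality would require rho_nm ~ (r/R)^n
```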

The above inequality is not strict: it becomes an equality when $\rho_{nm}(r) = C_{nm}\left(\frac{r}{R}\right)^{n}$. We see then that, for some constant $C$,

$$\sum_{n,m} |u_{nm}(R)|^2\, (2n+1)^2 (2n+3) \le C \sum_{n,m} \int_{R_L}^{R} \rho_{nm}^2(r')\, r'^{\,2}\, dr' = C\, \frac{1}{4\pi} \int_{\mathcal{L}} \rho^2(Q)\, dB_Q\,, \tag{100}$$

with the possibility of an equality.

So for functions $u(R, \sigma)$ defined on $S$, if we define the norm in $H^{3/2}(S)$ by

$$\| u \|^2_{H^{3/2}(S)} = \sum_{n,m} |u_{nm}|^2\, (2n+1)^2 (2n+3)\,, \tag{101}$$

we find that $\rho \in L^2(\mathcal{L}) \Rightarrow u \in H^{3/2}(S)$.

Remark 7. From Eq. 101 it is particularly clear that, with an obvious notation,

$$H^{3/2}(S) \subset H^{1}(S) \subset H^{1/2}(S) \subset L^2(S)\,; \tag{102}$$

each embedding is in fact not only continuous but even compact. Moreover we notice that if $u \in H^{3/2}(S)$ and $f \in H^{-3/2}(S)$, namely

$$\| f \|^2_{H^{-3/2}(S)} = \sum_{n,m} |f_{nm}|^2\, (2n+1)^{-2} (2n+3)^{-1} < +\infty\,, \tag{103}$$

then indeed the following bilinear expression and the subsequent inequality make sense:

$$\left| \langle f, u \rangle_{L^2(S)} \right|^2 = \Big| \sum_{n,m} f_{nm}\, u_{nm} \Big|^2 \le \| f \|^2_{H^{-3/2}(S)}\; \| u \|^2_{H^{3/2}(S)}\,. \tag{104}$$

Actually $f \in H^{-3/2}(S)$ represents a general continuous functional on $H^{3/2}(S)$, by means of the $L^2(S)$ product. The above statements generalize to any surface with a reasonable degree of smoothness, e.g., so-called Lipschitz surfaces (see [14], p. 96).

At this point, let us notice that Eq. 95 can be written in the form

$$G\, \Big\langle \frac{1}{l_{PQ}}\,,\; \rho \Big\rangle_{L^2(\mathcal{L})} = u(P)\,, \qquad \forall\, P \in S\,; \tag{105}$$

in fact, $\forall\, P \in S$, $\dfrac{1}{l_{PQ}} \in L^2(\mathcal{L})$, since, calling $\Delta$ the diameter of $S$,

$$\int_{\mathcal{L}} \frac{1}{l_{PQ}^2}\, dB_Q \le \int d\sigma \int_0^{\Delta} \frac{r^2}{r^2}\, dr \le 4\pi \Delta\,. \tag{106}$$

See Fig. 6 for a graphic explanation.


Fig. 6 For every $P$ the volume element $dB_Q = d\sigma\, l_{PQ}^2\, dl$ along the radius in direction $e_\sigma(P)$ runs at most from $l = 0$ to $l = \Delta$

Fig. 7 The closed manifold of all solutions of Eq. 95, S , and the minimum norm solution ρh

From Eq. 105 we easily read out that the subspace $H_0$, defined by Eq. 96, is closed in $L^2(\mathcal{L})$. In fact if $\rho_{0n} \in H_0$ and $\rho_{0n} \to \overline{\rho}_0$ in $L^2(\mathcal{L})$, it is obvious that

$$u_0(P) = \Big\langle \overline{\rho}_0\,,\; \frac{1}{l_{PQ}} \Big\rangle_{L^2(\mathcal{L})} = \lim_{n \to \infty} \Big\langle \rho_{0n}\,,\; \frac{1}{l_{PQ}} \Big\rangle_{L^2(\mathcal{L})} \equiv 0\,, \tag{107}$$

namely $\overline{\rho}_0 \in H_0$. So the picture of the set $\mathcal{S}$ (see Eq. 96) is like that in Fig. 7, with $\mathcal{S}$ a closed linear manifold in $L^2(\mathcal{L})$. In such a situation, when we know that $\rho \in L^2(\mathcal{L})$ exists but we don't know it, it is only natural to look for a geometrically significant element as a representative of $\mathcal{S}$; this is the minimum norm element $\rho_h \in \mathcal{S}$, which does coincide with the pseudosolution of our problem (look for instance into [7] for the definition of the pseudoinverse of a continuous operator with non-trivial null space). That such a $\rho_h$ exists (and is unique), as the orthogonal projection of the origin $O$ on $\mathcal{S}$, descends from the fundamental projection theorem in a Hilbert space (see for instance [27], 12.3). Nevertheless we use the knowledge of the existence of $\rho_h$ to derive its peculiar shape by the principle of Lagrange multipliers. Namely, we know that $\rho_h$, the solution of the principle

$$\min_{\rho \in \mathcal{S}}\; \| \rho \|^2_{L^2(\mathcal{L})}\,, \tag{108}$$

is also the unconstrained solution $\rho_\lambda$ of the other principle

$$\min \left\{ \| \rho \|^2_{L^2(\mathcal{L})} + \Big\langle \lambda\,,\; u - G \int_{\mathcal{L}} \frac{\rho}{l_{PQ}}\, dB_Q \Big\rangle_{L^2(S)} \right\} \tag{109}$$

on condition that the multiplier $\lambda$ satisfies the equation

$$G \int_{\mathcal{L}} \frac{\rho_\lambda}{l_{PQ}}\, dB_Q = u(P)\,, \qquad P \in S\,. \tag{110}$$

In Eq. 109 the scalar product $\langle\,,\rangle_{L^2(S)}$ is meant as a continuous linear functional of the member to the right. Since $u - G \int_{\mathcal{L}} \frac{\rho_\lambda}{l_{PQ}}\, dB_Q$ has to belong to $H^{3/2}(S)$, thanks to Remark 7 we know that $\lambda$, also as a solution of Eq. 110, has to be sought in $H^{-3/2}(S)$. The free variation of Eq. 109 with respect to $\rho$ gives

$$\rho_\lambda(Q) - G \int_S \frac{\lambda(P)}{l_{PQ}}\, dS_P = 0\,, \qquad Q \in \mathcal{L}\,, \tag{111}$$

namely $\rho_\lambda(Q)$ is a function harmonic in $\mathcal{L}$, generated by a single layer with density $\lambda(P)$ deposited only on the upper surface $S$. Substituting Eq. 111 into Eq. 110 we get the equation for $\lambda$

$$G^2 \int_S \lambda(P')\, \mathcal{G}(P', Q)\, dS_{P'} = u(Q)\,, \qquad \mathcal{G}(P', Q) = \int_{\mathcal{L}} \frac{1}{l_{P'P''}}\, \frac{1}{l_{P''Q}}\, dB_{P''}\,, \qquad Q \in S\,. \tag{112}$$

Recurring to the spherical case, as in Example 1, we can easily compute, when $S$ is a sphere of radius $R$ and $S_L$ a sphere of radius $R_L$,

$$\mathcal{G}(P, Q) = 4\pi R \sum_{n,m} \frac{(1 - q^{2n+3})}{(2n+1)^2\,(2n+3)}\; Y_{nm}(\sigma_P)\, Y_{nm}(\sigma_Q)\,, \qquad q = \frac{R_L}{R}\,, \tag{113}$$

showing clearly that $\mathcal{G}$ is a strictly positive definite kernel, with power strength $N = 3$. This is compliant with the fact that if $u \in H^{3/2}(S)$, then $\lambda \in H^{-3/2}(S)$. Again by using our spherical harmonic representation we can further compute, from Eq. 111, the norm

$$\| \rho_\lambda \|^2_{L^2(\mathcal{L})} = \frac{4\pi}{R^2} \sum_{n,m} \frac{\lambda_{nm}^2}{(2n+1)^2\,(2n+3)} < +\infty\,, \tag{114}$$

showing that the solution $\rho_h = \rho_\lambda$ so found does belong to $L^2(\mathcal{L})$, as it should.

At this point we have a coherent scheme for the solution of the inverse gravimetric problem for the layer $\mathcal{L}$:

• we are given the potential $u|_S$ and we assume it to belong to $H^{3/2}(S)$;
• then we solve Eq. 112 for $\lambda$, and we know that there is one and only one solution in $H^{-3/2}(S)$;
• finally we compute the minimum norm solution $\rho_h = \rho_\lambda$ from Eq. 111, obtaining in fact an $L^2(\mathcal{L})$ bounded function; $\rho_h$ results to be equal to the harmonic potential of a single layer on $S$; we call

$$H_h = \left\{ \rho_h = \int_S \frac{\lambda}{l_{PQ}}\, dS_Q\,,\ \lambda \in H^{-3/2}(S) \right\}; \tag{115}$$

it is obvious that $H_h$ is a closed subspace of $L^2(\mathcal{L})$;

• the space $H_0$, such that $\mathcal{S} = \rho_h \oplus H_0$, is just the orthogonal complement of $H_h$ in $L^2(\mathcal{L})$.

Two remarks are in order to conclude the above analysis.

Remark 8. The subspace $H_h$ is not the full subspace of functions in $L^2(\mathcal{L})$ that are also harmonic in $\mathcal{L}$, which we call $HL^2(\mathcal{L})$. In fact this full subspace can be proved to be generated by the sum of all single layer potentials with density on $S$ plus all single layer potentials with density on $S_L$. If $H_h \subset HL^2(\mathcal{L})$ strictly, then we should be able to find a harmonic function $h_0$ which is also in $H_0$, namely one that generates a zero outer potential; in other words, $HL^2(\mathcal{L}) \cap H_0 \neq \{0\}$. For the usual spherical example, this is the case if one takes

$$\rho = h_0 = \frac{3}{2R}\; \frac{1 - q^2}{1 - q^3} - \frac{1}{r}\,, \qquad R_L \le r \le R\,,$$

as the reader is invited to verify.

Remark 9. When we have $\mathcal{L} \equiv B$, i.e., the inversion domain is the whole body $B$, we have one further characterization of $H_0$, which has proved to be useful in many instances. First of all we observe that if we eliminate the internal surface $S_L$, then the densities of the form of Eq. 111 with $\lambda \in H^{-3/2}(S)$ are in fact spanning $HL^2(B)$, namely $H_h \equiv HL^2(B)$. In this case therefore $H_0$ is characterized by functions $\rho_0$ such that

$$\langle \rho_0\,, h \rangle_{L^2(B)} = 0\,, \qquad \forall\, h \in HL^2(B)\,. \tag{116}$$

To proceed further, notice that ∀ ρ0 ∈ H0 ⊂ L2 (B) we can define a ϕ such that 

Δϕ = ρ0 ϕ|S = 0

(117)

3 Up and Down Through the Gravity Field

123

It turns out that the solution of Eq. 117 is a ϕ ∈ H 2,2 (B). But in this case one must have form of Eq. 116 and from Δh = 0, 





ρ0 h dBP = B

Δϕ h dBP = B

S

∂ϕ h dSP = 0; ∂n

(118)

since Eq. 118 has to hold ∀ h ∈ H L2 (B) one may conclude that it has to be too  ∂ϕ  ≡0 ∂n S

(119)

our statement then becomes: ρ0 ∈ H0 ⇐⇒ ρ0 = Δϕ ; ϕ ∈ H02,2 ≡ {ϕ ∈ H 2,2 , ϕ|S = 0 ϕn |S = 0}

(120)

i.e., ρ0 ∈ H0 if it is the laplacian of a function square integrable together with its second derivatives, and such that it goes to zero, together with its normal derivative at the boundary S. Not such a similar characterization is possible when L ⊂ B.

6

The Constant Density Layer with Unknown Geometry

In the previous section we have analyzed the inversion of the Newton operator, with a family of densities supported by a layer, L ≡ {RLσ ≤ r ≤ Rσ }, contiguous to the known outer surface S ≡ {r = Rσ }, that generate the same potential u in Ω. By hypothesis the shape of L , namely S and SL , was supposed to be given while the density ρ was considered as unknown. In this section we will discuss the inversion of Newton’s operator, by exchanging the above hypotheses; namely we shall assume that ρ is a given function of the point Q ∈ B and viceversa the layer surface SL , cutting the domain where ρ has to be considered different from zero, is unknown. For the sake of simplicity we will assume that ρ is constant, although for instance some ρ depending on r in the form of a decreasing function could be treated very much in the same way (see for instance [9]). In any event the problem of one layer only, is less cumbersome than one might believe, from the geophysical point of view. As a matter of fact this is an idealization of the so called Moho problem. On such items one can consult [13, 17]. Although this is not completely exact, we will consider the Moho as the boundary between crust and mantle. If we idealize the interior of the Earth as a distribution of layers of constant density like in Fig. 8, and if we further assume that the layer interfaces are known, we could write:

124

F. Sansó et al.

Fig. 8 A planetary model with 3 layers of constant density, core, mantel and crust: interfaces are star-shaped



dσ 

u(P )|S = G ρc



RCσ  0

 + G ρCr







r 2 dr + G ρM lP Q

Rσ 

r2

RMσ 

dr

lP Q



dσ 



RMσ  RCσ 

r 2 dr + lP Q

(121)

.

Equation 121 can be rearranged as 

  R 2 σ r dr r 2 dr  u(P )|S = G ρC dσ + G ρM dσ + lP Q 0 RCσ  lP Q   R 2 σ r dr − G (ρM − ρCr ) dσ  RMσ  lP Q 



RCσ 

(122)

and finally  G (ρM − ρCr )

dσ 



Rσ 

RMσ 

  R  2 Cσ r dr r 2 dr = −u(P ) + G ρC dσ  + lP Q lP Q 0   R 2 σ r r  ≡ f (σ ). + G ρM dσ RCσ  lP Q (123)

Equation 123 shows the relation existing between a known function f (σ ), a known constant density ρM −ρCr , that in the sequel we will denote simply as ρ, and the unknown function r = RMσ ≡ RLσ , representing the Moho surface. The above model can be indeed complicated with many layers as far as we keep the hypothesis that all the other interfaces, beyond RLσ , and all the constant densities of the layers are known.

3 Up and Down Through the Gravity Field

125

One might argue that indeed the interfaces of the various layers in the Earth interior are not precisely known, nor the density of the layers are really constant. Nevertheless it is also known that such perturbations of our simplistic model, do have a visible effect on a global gravity model, only in low degrees, so that one could think that considering the model only for degrees higher than a certain Lmax one could consider it mainly due to the effects of the first layer, the crust. Naturally it is unrealistic to think of cutting a global model precisely in two at degree Lmax , nevertheless there are arguments saying that if we choose Lmax somewhere between 36 and 72, we get a model for which Eq. 123 becomes sufficiently representative to make of its solution an interesting geophysical problem. So we are left with the task of solving the equation  Gρ

L

dBQ = f (σ ) lP Q

P ∈ S , L ≡ {S, SL }

(124)

where the unknown is SL . Notice that Eq. 124 is slightly more general than Eq. 123 in that in Eq. 124 we don’t need to make the hypothesis that SL is star-shaped, although in the sequel we will always accept such a restriction, in particular to simplify the result concerning the uniqueness of the solution. So we write Eq. 124 in the form Q ≡ (r, σ  ) ,

 Gρ

dσ 





RLσ

r 2 dr = f (σ ) lP Q

(125)

and we would like to know whether solving such an equation constitutes a properly posed problem or not and whether its solution is unique or not. We will respond to the first question on the ill-posedness, so that once we have stated that the solution is unstable it is more important to know whether in any way it is unique or not. In fact for an improperly posed problem it makes little sense to try to prove conditions for the existence of the solution, because known data will not in general satisfy these conditions, but we would rather like to know whether some approximation method, like Tikhonov regularization (see [7]), will provide a quasi solution that ultimately, when the perturbation of data is going to zero, converges to the exact solution. We proceed to the analysis of Eq. 125 in two steps: first we will consider its linearized version, then we will derive some results for the non linear form. At least we will discuss the most appropriate form of a Tikhonov principle for the Eq. 124. So assume that we know an approximate solution R˜ σ of Eq. 125, in the sense that if we compute f˜(σ ) = G ρ



dσ 



Rσ 

R˜ σ 

r 2 dr lP Q

we find an f˜(σ ) that we consider close to f (σ ). Then, assuming that

(126)

126

F. Sansó et al.

RLσ = R˜ σ + δRσ

(127)

we develop to the first order the integral of Eq. 125, namely we put  dσ





Rσ  RLσ 

r 2 dr ∼ = lP Q

 dσ





Rσ 

r 2 dr + lP Q

R˜ σ 



dσ 

R˜ σ2  δRσ  , lP Q˜

˜ = (R˜ σ  , σ  ) Q

(128) From the geodetic point of view this is nothing more than mass coating, namely ˜ from squeezing along the radius the masses in the layer (S˜ , SL ) onto the surface S; the mathematical point of view this is computing the Gâteaux differential which is in fact coinciding with the Frechet differential if the hypothesis that ˜ S) = Dist (S,

min

˜ ∈S Q∈S,P

PQ = δ > 0

(129)

is satisfied. If we recall that cos I˜σ  d S˜Q = R˜ σ  dσ 

(cos I˜σ  = eσ  · n˜ σ  )

and we put ω(σ  ) = cos I˜σ  δ R˜ σ 

(130)

we see that Eq. 125 combined with Eq. 128, can be written as δf (σ ) = f (σ ) − f˜(σ ) = G ρ

 S˜

d S˜Q

˜ ω(Q) , l˜ ˜

˜ ∈ S). ˜ (P ∈ S , Q

(131)

PQ

But this is exactly the problem of inverting a single layer with fixed geometry, upward continued from S˜ to S. This problem has already been discussed in Sect. 5 and we have stated that it is improperly posed, but that the solution ω(σ  ), when it exists, is unique (see Remark 6). If we further assume that cos I˜ ≥ C ≥ 0 ,

(132)

thanks to Eq. 130 we can conclude that δRσ  is also the unique solution of the linearized problem (Eq. 128). Naturally we have to assume that R˜ + δRL in any case is completely inside S to guarantee a reasonable solution. This concludes the analysis of the linearized problem. Now we come to the non linear problem (Eq. 125). First of all we would like to know whether also this problem could be transformed into that of inverting a single

3 Up and Down Through the Gravity Field

127

layer but with unknown geometry. To this aim we set up the following reasoning. Consider that f (σ ) is as a matter of fact the trace on the boundary S of a potential v harmonic in Ω, v|S = f (σ ). Now by solving the exterior Dirichlet problem, when f (σ ) and S are sufficiently smooth, we are in condition to compute v  and its trace on S; analogously we can compute g(σ  ) = (r

∂ − 2) u|S . ∂r

(133)

On the other hand the following identity holds ∂ 1 2 1 ∂ s3 − = 2 , ∂r lP Q lP Q s ∂s LP Q

(134)

P ≡ (r, σ ) , Q ≡ (s, σ  ) , cosψ = eσ · eσ  1 lP Q = [r  + s 2 − 2 r s cosψ] 2

(135)

    R 2  σ s ds  ∂  − 2 G ρ dσ  g(σ ) = r =  ∂r RLσ  lP Q r=Rσ  3    R  σ  s ∂  ds  = = G ρ dσ ∂s l PQ RLσ  r=Rσ  3   R3  Rσ − Lσ = G ρ dσ  lP Qe lP QL

(136)

r where 

Therefore it is

with Qe = (Rσ  , σ  ) , QL = (RLσ  , σ  ).

(137)

Equation 136 can be arranged as  Gρ

dσ 

3 RLσ  =Gρ lP QL



dσ 

Rσ3  − g(σ ) = g(σ ). lP Qe

(138)

This is a non linear Fredholm equation of the first kind, with RLσ as unknown and g(σ ) as known term. To see its relation to a single layer potential it is enough to 2 dσ  , to realize that Eq. 138 can be written in the form recall that dSL cosIL = RLσ 

128

F. Sansó et al.

 Gρ

dSL SL

cosILσ  RLσ  = g(σ ), lP QL

(139)

showing that g(σ ) is the boundary value (on S) of the potential of a single layer with unknown geometry (SL ), with density ω(σ  ) = ρ cosILσ  RLσ  , upward continued from SL to S. As always the upward continuation implies a smoothing with exponential strength, so solving Eq. 139 is a (strongly) improperly posed nonlinear problem. As for the uniqueness of the solution we return to the problem in the form of Eq. 123 that we rewrite  Gρ

L

dBQ = δf (σ ) , lP Q

P ≡ (Rσ , σ ) ,

L ≡ {RLσ ≤ r ≤ Rσ }.

(140)

The first proof of uniqueness of the solution of Eq. 140 is an old one, due to P.S. Novikov, dating back to 1938 (see [9], 3.1). Many years later one of the authors of this paper, unaware of Novikov’s work, has found exactly the same proof, as one can read in [3]. The authors are glad to restore the correct attribution of the proof. Here we will follow a different path, taking advantage of our Eq. 136 and of Remark 6. Assume to have two solutions RLσ = R1σ , RLσ = R2σ producing the same potential, for the respective layers L1 , L2 , on the surface S  u1 (P ) = G ρ

L1

dBQ ≡ u2 (P ) = G ρ lP Q

 L2

dBQ , P ∈ S. lP Q

(141)

But then, by the unique continuation principle u(P ) = u2 (P ) − u1 (P ) ≡ 0

(142)

for every P in the common domain of harmonicity. So if we define R+σ = max R1σ , R2σ ,

(143)

and correspondingly Ω+ , B+ the exterior and the interior of S+ ≡ {r = R+σ } (see Fig. 9), Eq. 142 has to hold in P ∈ Ω+ . If we call the differential body δB, δB = L1 ÷ L2 = (L1 ∪ L2 )  (L1 ∩ L2 ) and S± its boundaries

(144)

3 Up and Down Through the Gravity Field

129

Fig. 9 In this figure are represented the lower surfaces S1 , S2 (continuous lines) of the two layers L1 , L2 that generate the same potential (on and outside) S; the differential body δB, segmented into subset δBk where the density is +ρ or −ρ; the external and internal boundaries S± (dashed lines) of δB. Notice that the number of δBk could even be +∞



S+ ≡ {r = R+σ } , R+σ = max R1σ R2σ S− ≡ {r = R−σ } , R−σ = min R1σ R2σ

(145)

and, referring to Fig. 9, δBk+ = subset of δB where R1σ > R2σ , δBk− = subset of δB where R1σ < R2σ we find that the difference potential (Eq. 141) u(P ) can be written as  u(P ) = G ρ =Gρ

L1 ÷L2



±ρ dB = lP Q

δBk

k

=Gρ

(±ρ) dB = lP Q

  ρ k

1

δBk+ lP Q

dB −

(146)  ρ j

1

δBj − lP Q

 dB .

So the density  ρ0 (Q) = (±ρ) =

ρ −ρ

P ∈ δBk+ , ∀k P ∈ δBj − , ∀j

(147)

must be such as to guarantee a zero Newtonian potential outside S+ , i.e., in Ω+ . We observe that from the definition of Eq. 147 ρ0 (Q) = ±ρ depends basically on σ but not on r, so that inside δB the identity holds:

130

F. Sansó et al.

∂ρ(Q) ≡0 ∂r

Q ∈ ∂B.

(148)

Now we want to return for a minute to our formula of Eq. 136. It is obvious from the way in which it has been derived, that if ρ is a function of σ ; it can be modified taking into account that here the layer is between S− and S+ , as    3  3  R+σ  R−σ ∂    r − 2 u(P ) ≡ G dσ ρ(σ ) − ∂r lP Q+ lP Q− P ≡ (R+σ , σ ) ∈ S+ ,

Q− ≡ (R−σ  ) ∈ S− ,

(149)

Q+ ≡ (R+σ  , σ  ) ∈ S + .

On the other hand u(P ) is continuous with its first derivatives in Ω+ ∪S+ , because it is generated by a bounded mass density in δB, i.e., | ρ | = ρ. So we have ∂u (P ) ≡ 0 , ∂r

u(P ) ≡ 0

P ∈ S+

(150)

and Eq. 149 becomes  P ∈ S+

0≡





dσ ρ(σ )



3 R+σ 

lP Q+



3 R−σ 

lP Q

 .

(151)

Now we multiply Eq. 151 by some function ω(P ) and integrate on S+ ; moreover we denote  ω(P ) Q ∈ S+ ∪ B (152) v(Q) = dSP lP Q and we note that as a single layer on S+ , v(P ) is a harmonic function in B. So when Q ∈ S+ , v(Q+ ) is just the boundary value on S+ of a function harmonic in B. We have already discussed in Remark 6 that the correspondence between v|S+ and ω is one to one and we have shown how to derive one from the other. So we can rewrite Eq. 151 as  0≡

  3 dσ  ρ(σ  ) R+σ  v(Q+ ) − R−σ  v(Q− ) ,

(153)

having at our disposal the trace of v on S+ . So we can choose, with the same notation as in Eq. 147  v(Q+ ) = (±1) =

1 −1

Q+ ∈ δBk+ , ∀k Q+ ∈ δBj − , ∀j

(154)

3 Up and Down Through the Gravity Field

131

and we find, since v(Q+ ) ρ(σ  ) = ρ,   3 3  dσ R+σ  ρ − R−σ  ρ(σ )v(Q− ) .





0≡

(155)

On the other hand, by the maximum principle that all harmonic functions satisfy, | v(Q− ) | ≤ sup | v(Q+ ) | = 1

(156)

− ρ(σ  ) v(Q− )  −ρ.

(157)

and so

Therefore Eq. 155 now becomes  0

  3 3 − R dσ  ρ R+σ  −σ  , ρ > 0.

(158)

Since R+σ   R−σ  ,

(159)

Eq. 158 can be true only if R+σ  ≡ R−σ 

σ a. e.

(160)

On the other hand | R1σ  − R2σ  | = | R+σ  − R−σ  |

(161)

so Eq. 160 implies that R1σ  and R2σ  are equal almost everywhere, i.e., SL1 ≡ SL2 , namely uniqueness is proved. Now that we have achieved a uniqueness result, we can pass to discuss the application of some regularization method to get an approximate solution of Eq. 124. Instead of the original Eq. 124, we can use the equivalent formulation of Eq. 138, that we write as (recall, P ∈ (Rσ , σ ) , QL = (RLσ  , σ  )) 

F [RL ; σ ] = g(σ )  F [RL ; σ ] = G ρ dσ 

3 RLσ  lP QL .

(162)

A classical regularization of Eq. 162 has the form of the Tikhonov variational principle, (see [7]), namely we search RL in such a way that the minimum of the functional

132

F. Sansó et al.

T (RL ) =  g − F [RL ] 2L2 (σ ) + λJ (RL )

(163)

is achieved. Usually the regularizing term in T (RL ) is taken to be the square of a norm or a semi-norm like  2 (164) J0 (RL ) = dσ RLσ  J1 (RL ) =

dσ | ∇σ RLσ |2

(165)

dσ (Δσ RLσ )2 .

(166)

 J2 (RL ) =

While J0 (RL ) is nothing but the L2 (σ ) norm, J1 (RL ), J2 (RL ) are semi-norms, i.e., they enjoy all the properties of norms, but for strict positivity, namely J2 (RL ) = 0  RL = 0 and the same is true from J1 (RL ). Moreover the set C of admissible solutions RLσ , in our case is quite naturally 0 ≤ RLσ ≤ Rσ ,

(167)

because the surface SL should be wrapped by S. Indeed T (RL ) defined as in Eq. 163 is never negative, so we can always define m = inf T (RL ) RL ∈C

(168)

and it will be m ≥ 0. Nevertheless to see whether m is a real minimum of T on C, i.e., whether there is a R L ∈ C such that m = T (R L ) ,

(169)

is a different story requiring a deeper mathematical analysis. We list here sufficient conditions for m to be a true minimum: (1) C has to be bounded and closed in H , (2) F [RL ] has to be continuous in H , (3) the sets C ∩ {T (RL ) ≤ a} ,

(170)

a > m, have to be compact in C, i.e., ∀ {RLn ∈ C} we can find a subsequence {RLnk } that is convergent in H , indeed the limit of RLnk belongs to C because of (1), (4) J (RL ) has to be lower semi-continuous, i.e.,

3 Up and Down Through the Gravity Field

133

− → RLn H R L ⇒ J (R L ) ≤ lim J (RLn ).

That such conditions are satisfied in our case, is a problem examined in detail in the Appendix B, concentrating on the choice of J1 (RL ) as a regularizer. In fact we can argue that J0 (RL ) is too weak, for our purposes; in other words solutions with this choice can oscillate too much. In particular condition (3) is not satisfied because spheres in L2 (σ ) are certainly not compact, nor are their intersections with C. As for the choice of J2 (RL ), which is however successfully present in literature (see [21], 4.2), we consider it too strong; in other words solutions, with such a choice, become too smooth and loose the kind of look that is expected for geological interfaces. So we shall concentrate on the classical choice of J1 (RL ). To have a pictorial perception of the above discussion we provide in Fig. 10 an example of bounded surfaces which satisfy J0 < a, J1 < a, J2 < a. Once the existence of the minimum m is granted, we would also like to know how to compute the solution R L .This can be done by direct methods, namely using the variational principle (Eq. 168) or by solving, with some discretization technique, the corresponding Euler equation. Observing that the following two variational relations holds: ⎧  ⎨ δF [RL ; σ ] = K[(RL ; σ, σ  ] δRLσ  dσ  2 ⎩ K[RL ; σ, σ  ] = RLσ 

2 +3R 2 −5R 2 RLσ  σ Lσ  Rσ cosψ

lP3 Q

,

(171)

with P ≡ (Rσ , σ ) , Q ≡ (Rσ  , σ  ) , cosψ = eσ · eσ  ;

Fig. 10 Examples of functions f (σ ) which satisfy respectively f ∈ L2 (σ ), ∇f ∈ L2 (σ ), Δf ∈ L2 (σ ); in case (a) f can make jumps, (b) n exists a.e. but can make jumps, (c) curvature exists a.e. but can make jumps

134

F. Sansó et al.

 δ

| ∇σ  RLσ  |2 dσ  = 2



∇σ  RLσ  · ∇σ  δRLσ  dσ  = 

= −2

(172) 

(Δσ  RLσ  )δRLσ  dσ ,

it is not difficult to derive the Euler equation from Eq. 168, with J (RL ) ≡ J1 (RL ), namely 

dσ F [RL ; σ ] K[RL ; σ, σ  ] − λ Δσ  RLσ  =



dσ g(σ ) K[RL ; σ, σ  ].

(173)

Remark 10. It is interesting to try to linearize Eq. 173, to get perturbation equations for a spherical layer, namely setting Rσ ≡ R(const)

RLσ = R0 + δRLσ

(R0 = const).

(174)

Without entering into the detailed computation we observe that with this spherical symmetry, calling 1

l0 = [R 2 + R02 − 2 R R0 cosψ] 2 ,

(175)

one has  F [R0 ; σ ] = G ρR03 K[R0 ; σ, σ  ] =

R3 dσ  =4π Gρ 0 , l0 R

R02 (2 R02 + 3 R 2 − 5 R0 R cosψ) l03

(176) ,

(177)

⎧ ⎨ K[RL ; σ, σ  ] ∼ = K[R0 ; σ, σ  ] + G(σ, σ  ) δRLσ  ⎩ G(σ, σ  ) =

R0 [2 R04 +5 R02 R 2 +6 R 4 −10 R03 R cosψ−18 R0 R 3 cosψ+15 R02 R 2 cos 2 ψ] l05

(178) so that the linearized version of Eq. 173 reads 





dσ δR  +

Lσ 

 dσ K[R0 ; σ, σ ] K[R0 ; σ, σ ] + 



   dσ g(σ ) G(σ, σ  ) δRLσ  + dσ F [R0 ; σ ] G(σ, σ  ) δRLσ  − 

+ λ Δσ  δRLσ  =





dσ g(σ ) K[R0 ; σ, σ ] −



dσ F [R0 ; σ ] K[R0 ; σ, σ  ].

(179) Despite its awful appearance, in reality taking into account Eq. 176 and the definition of g(σ ), Eq. 138, it turns out that

3 Up and Down Through the Gravity Field





 dσ

135 

F [R0 , σ ] − g(σ ) G(σ, σ ) =



dσ g(σ ) G(σ, σ  )

(180)

with g(σ ) = r ∂r u − 2 u|S .

(181)

Since in the left hand side of Eq. 179, such a term multiplies δRLσ  , neglecting second order terms, we can substitute g(σ ) with the constant  (R 3 − R03 )  R 3 − R03 4 . g 0 = (r ∂r − 2) G π ρ = −4 π G ρ  3 r R r=R

(182)

Moreover we have too 

dσ G(σ, σ  ) = G0

(constant)

(183)

so that the second and third terms put together are just a constant multiplying δRLσ  . Therefore Eq. 179 becomes an equation, where all integral kernels are isotropic, namely depending on (σ, σ  ) only through cosψσ σ  , so that its solution is not difficult if we look for it in terms of a series of spherical harmonics. We leave the exercise to the patient reader.

7

Some Conclusions

The paper has examined, after recurring to spherical examples, how to move up and down through the gravity field. To move up the theory of Newton’s integral accompanied by the solution of the GBVP has to be applied. To move down, one finds a unique continuation of the field as far as we stay outside its source, namely the mass density. When we start penetrating the masses, a strong non uniqueness of the downward continuation starts and a choice has to be made in order to guarantee the uniqueness of the solution. The classical cases of the minimum L2 -norm solution and that of a layer with constant density, known upper surface and unknown lower surface, are discussed. In any way when we come to numbers and try to downward continue the gravity field, we always have to apply a regularization method. Some discussion on applying the Tikhonov principle, has been presented, although this is not yet finished since the need to find a regularization stronger that L2 but milder that H 1,2 suggests to perform an analysis in the space of functions of bounded variation. This however will be object of a forthcoming work.

F. Sansó et al.

Appendix A: On the relation between terrestrial and satellite global models

In this Appendix we aim at proving Eq. 88, giving at the same time the explicit form of F_nm(σ), G_nm(σ). We do that not only by adapting the second Green identity to an oblique-derivative, as opposed to a normal-derivative, problem (in our case e_r ≠ n), but also by taking advantage of the hypothesized star-shaped geometry of the boundary S. While the first step is perfectly known in the literature (see [16], I.6), the second is not, at least to the knowledge of the authors, although of chief interest in geodesy. So let P ≡ x ≡ (r, σ) be any point in Ω; likewise we will use Q ≡ y ≡ (s, σ') for any Q in Ω or on S. We start then from the known formula ([18], 1–5)

$$u(P) = \frac{1}{4\pi}\int_S \left[\, u\,\frac{\partial}{\partial n_Q}\frac{1}{l_{PQ}} - \frac{1}{l_{PQ}}\,\frac{\partial u}{\partial n_Q} \right] dS_Q. \tag{184}$$

If the equation of S is written as

$$r = R_\sigma, \tag{185}$$

it is easy to verify that the vector

$$n = \frac{1}{p}\left(R_\sigma\, e_r - \nabla_\sigma R_\sigma\right), \qquad p = \sqrt{R_\sigma^2 + |\nabla_\sigma R_\sigma|^2}, \tag{186}$$

is the outer normal, so that

$$\cos I = n\cdot e_r = \frac{R_\sigma}{p} > 0. \tag{187}$$

At the same time,

$$dS = \frac{R_\sigma^2\, d\sigma}{\cos I} = p\, R_\sigma\, d\sigma. \tag{188}$$
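The geometry of Eqs. 186–188 can be checked numerically. The sketch below is illustrative only: it assumes an axisymmetric star-shaped surface r = R(θ), for which the surface gradient reduces to ∇_σ R = R′(θ) e_θ, and confirms that the stated n is orthogonal to the surface tangent and that cos I = R_σ/p.

```python
# Finite-difference check of n = (R e_r - R' e_theta)/p, p = sqrt(R^2 + R'^2),
# cos I = R/p, for a hypothetical axisymmetric boundary r = R(theta).
import numpy as np

R  = lambda th: 1.0 + 0.1 * np.cos(2 * th)       # illustrative boundary
dR = lambda th: -0.2 * np.sin(2 * th)

th = 0.8
e_r  = np.array([np.sin(th),  np.cos(th)])       # meridian-plane basis vectors
e_th = np.array([np.cos(th), -np.sin(th)])

p = np.hypot(R(th), dR(th))
n_formula = (R(th) * e_r - dR(th) * e_th) / p    # Eq. 186

# tangent of the meridian curve x(th) = R(th) e_r(th) by central differences
h = 1e-6
x = lambda t: R(t) * np.array([np.sin(t), np.cos(t)])
tangent = (x(th + h) - x(th - h)) / (2 * h)

assert abs(n_formula @ tangent) < 1e-6           # n is orthogonal to the surface
assert abs(n_formula @ e_r - R(th) / p) < 1e-12  # cos I = R/p  (Eq. 187)
```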

We introduce now a slightly different notation for the gradient with respect to the spherical angular coordinates (σ): for any differentiable function f(r, σ) we write ∂_σ f(r, σ) instead of ∇_σ f(r, σ); this is because we will need to restrict ∂_σ f(r, σ) to the surface S, so obtaining

$$\partial_\sigma f(r,\sigma)|_S = \partial_\sigma f(R_\sigma,\sigma), \tag{189}$$

but we want to avoid any confusion with ∇_σ[f(R_σ, σ)]. In fact we have

$$\nabla_\sigma \left[f(r,\sigma)|_S\right] = \nabla_\sigma f(R_\sigma,\sigma) \equiv f'(R_\sigma,\sigma)\,\nabla_\sigma R_\sigma + \partial_\sigma f(R_\sigma,\sigma), \tag{190}$$

where, as usual, we have denoted by f' the radial derivative of f(r, σ). Finally we recall that in spherical coordinates it holds

$$\nabla = e_r\,\frac{\partial}{\partial r} + \frac{1}{r}\,\partial_\sigma. \tag{191}$$

So for any regular function f(r, σ) we can write

$$\left.\frac{\partial f}{\partial n}\right|_S = n\cdot\nabla f|_S = \frac{R_\sigma}{p}\, f'(R_\sigma,\sigma) - \frac{1}{p\,R_\sigma}\,\nabla_\sigma R_\sigma\cdot\partial_\sigma f(R_\sigma,\sigma) \tag{192}$$

and

$$\left.\frac{\partial f}{\partial n}\right|_S dS = \left[\, R_\sigma^{2}\, f'(R_\sigma,\sigma) - \nabla_\sigma R_\sigma\cdot\partial_\sigma f(R_\sigma,\sigma) \right] d\sigma. \tag{193}$$
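Equation 190 is the chain rule for the restriction to S; a quick finite-difference confirmation in the same axisymmetric setting (f and R below are arbitrary illustrative choices, not taken from the text):

```python
# Check of Eq. 190 for r = R(theta):
#   d/dtheta [f(R(theta), theta)] = f'(R, theta) R'(theta) + f_theta(R, theta).
import numpy as np

f   = lambda r, th: r**3 * np.cos(th)            # hypothetical regular function
f_r = lambda r, th: 3 * r**2 * np.cos(th)        # its radial derivative f'
f_t = lambda r, th: -r**3 * np.sin(th)           # its angular derivative
R   = lambda th: 1.0 + 0.1 * np.cos(2 * th)
dR  = lambda th: -0.2 * np.sin(2 * th)

th, h = 0.6, 1e-6
lhs = (f(R(th + h), th + h) - f(R(th - h), th - h)) / (2 * h)  # gradient of f|_S
rhs = f_r(R(th), th) * dR(th) + f_t(R(th), th)                 # Eq. 190, term by term
assert abs(lhs - rhs) < 1e-8
```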

Note that we continue to use the notation ∇_σ R_σ because R_σ is a function of σ only and not of r, so that no ambiguity can arise. Now we observe that if we take P ∈ S_e, namely r = R_e, and Q ∈ S, namely Q = y ≡ (R_{σ'}, σ'), then, since R_e > R_{σ'} (∀ σ'), the function l_{PQ}^{-1} is analytic in Q ≡ y, so that Eq. 193 applies and we can write Eq. 184 in the form

$$u(P) = u(R_e,\sigma) = \frac{1}{4\pi}\int \left\{ u(R_{\sigma'},\sigma') \left[ \frac{\partial}{\partial s}\frac{1}{l_{PQ}}\, R_{\sigma'}^{2} - \nabla_{\sigma'}R_{\sigma'}\cdot\partial_{\sigma'}\frac{1}{l_{PQ}} \right] - \frac{1}{l_{PQ}} \left[ R_{\sigma'}^{2}\, u'(R_{\sigma'},\sigma') - \nabla_{\sigma'}R_{\sigma'}\cdot\partial_{\sigma'}u(R_{\sigma'},\sigma') \right] \right\} d\sigma'. \tag{194}$$

We multiply Eq. 194 by Y_{nm}(σ) and compute the integral over σ, recalling that

$$u_{nm}(R_e) = \frac{1}{4\pi}\int u(R_e,\sigma)\, Y_{nm}(\sigma)\, d\sigma. \tag{195}$$

On the other hand we also have

$$\frac{1}{4\pi}\int \frac{1}{l_{PQ}}\, Y_{nm}(\sigma)\, d\sigma = \frac{1}{4\pi}\int \frac{1}{|x-y|}\, Y_{nm}(\sigma)\, d\sigma = \frac{1}{4\pi}\int \sum_{l=0}^{+\infty}\sum_{k=-l}^{l} \frac{R_{\sigma'}^{\,l}}{R_e^{\,l+1}}\, \frac{Y_{lk}(\sigma)\, Y_{lk}(\sigma')}{2l+1}\, Y_{nm}(\sigma)\, d\sigma = \frac{R_{\sigma'}^{\,n}}{R_e^{\,n+1}}\, \frac{Y_{nm}(\sigma')}{2n+1}, \tag{196}$$

so that, using Eqs. 195 and 196 in Eq. 194, we get the identity

$$u_{nm}(R_e) = \frac{1}{4\pi(2n+1)} \int \left[\, n\,\frac{R_{\sigma'}^{\,n+1}}{R_e^{\,n+1}}\, Y_{nm}(\sigma') - \frac{R_{\sigma'}^{\,n}}{R_e^{\,n+1}}\,\nabla_{\sigma'}R_{\sigma'}\cdot\nabla_{\sigma'}Y_{nm}(\sigma') \right] u \, d\sigma' - \frac{1}{4\pi(2n+1)} \int \frac{R_{\sigma'}^{\,n}}{R_e^{\,n+1}}\, Y_{nm}(\sigma') \left[\, R_{\sigma'}^{2}\, u' - \nabla_{\sigma'}R_{\sigma'}\cdot\partial_{\sigma'}u \right] d\sigma'. \tag{197}$$

With Eq. 197 we are almost done; however, we want to eliminate in it the dependence on ∂_{σ'}u(R_{σ'}, σ'). For this purpose we use Eq. 190 in the integral

$$J = \int R_{\sigma'}^{\,n}\, Y_{nm}(\sigma')\, \nabla_{\sigma'}R_{\sigma'}\cdot\partial_{\sigma'}u \, d\sigma' = \int R_{\sigma'}^{\,n}\, Y_{nm}(\sigma') \left[\, \nabla_{\sigma'}R_{\sigma'}\cdot\nabla_{\sigma'}(u|_S) - u'\,|\nabla_{\sigma'}R_{\sigma'}|^2 \right] d\sigma'. \tag{198}$$

Now it is enough to recall the Gauss theorem for the spherical surface, namely

$$\int h(\sigma)\cdot\nabla_\sigma f(\sigma)\, d\sigma = -\int \nabla_\sigma\cdot h(\sigma)\, f(\sigma)\, d\sigma, \tag{199}$$

which applied to Eq. 198 gives

$$J = -\int \left\{ u'\,|\nabla_{\sigma'}R_{\sigma'}|^2\, R_{\sigma'}^{\,n}\, Y_{nm}(\sigma') + \left[\, n\,R_{\sigma'}^{\,n-1}\,|\nabla_{\sigma'}R_{\sigma'}|^2\, Y_{nm}(\sigma') + R_{\sigma'}^{\,n}\,\Delta_{\sigma'}R_{\sigma'}\, Y_{nm}(\sigma') + R_{\sigma'}^{\,n}\,\nabla_{\sigma'}R_{\sigma'}\cdot\nabla_{\sigma'}Y_{nm}(\sigma') \right] u \right\} d\sigma', \tag{200}$$

where Δ_{σ'} = ∇_{σ'}·∇_{σ'} is the well-known Laplace–Beltrami operator. Introducing the notation

$$q_\sigma = \frac{R_\sigma}{R_e}$$

and wrapping up, we finally obtain

$$u_{nm}(R_e) = \frac{1}{4\pi}\int F_{nm}(\sigma')\, u'(R_{\sigma'},\sigma')\, d\sigma' + \frac{1}{4\pi}\int G_{nm}(\sigma')\, u(R_{\sigma'},\sigma')\, d\sigma' \tag{201}$$

with

$$F_{nm}(\sigma') = -R_e\, q_{\sigma'}^{\,n}\left(q_{\sigma'}^{2} + |\nabla_{\sigma'}q_{\sigma'}|^2\right) \frac{Y_{nm}(\sigma')}{2n+1}, \tag{202}$$

$$G_{nm}(\sigma') = \left[\, n\, q_{\sigma'}^{\,n+1}\, Y_{nm}(\sigma') - 2\, q_{\sigma'}^{\,n}\,\nabla_{\sigma'}q_{\sigma'}\cdot\nabla_{\sigma'}Y_{nm}(\sigma') - q_{\sigma'}^{\,n-1}\left( n\,|\nabla_{\sigma'}q_{\sigma'}|^2 + q_{\sigma'}\,\Delta_{\sigma'}q_{\sigma'} \right) Y_{nm}(\sigma') \right] (2n+1)^{-1}. \tag{203}$$

After such somewhat cumbersome calculations, it seems not unuseful to verify the correctness of Eq. 201 at least when R_σ = R_0 and q_σ = R_0/R_e = q. In such a case ∇_σ q_σ = 0 and Δ_σ q_σ = 0, so that we have

$$F_{nm}(\sigma') = -R_e\, q^{\,n+2}\, \frac{Y_{nm}(\sigma')}{2n+1}, \qquad G_{nm}(\sigma') = n\, q^{\,n+1}\, \frac{Y_{nm}(\sigma')}{2n+1}.$$

Therefore Eq. 201 reads

$$u_{nm}(R_e) = -\frac{R_e\, q^{\,n+2}}{2n+1}\, \frac{1}{4\pi}\int u'(R_0,\sigma')\, Y_{nm}(\sigma')\, d\sigma' + \frac{n\, q^{\,n+1}}{2n+1}\,\frac{1}{4\pi}\int u(R_0,\sigma')\, Y_{nm}(\sigma')\, d\sigma'. \tag{204}$$

If we put

$$u(r,\sigma) = \sum_{n=0}^{+\infty}\sum_{m=-n}^{n} \left(\frac{R_0}{r}\right)^{n+1} u_{nm}(R_0)\, Y_{nm}(\sigma),$$

so that

$$-u'(R_0,\sigma') = \sum_{n=0}^{+\infty}\sum_{m=-n}^{n} \frac{n+1}{R_0}\, u_{nm}(R_0)\, Y_{nm}(\sigma'),$$

we see that Eq. 204 reduces to

$$u_{nm}(R_e) = u_{nm}(R_0)\, q^{\,n+1} = u_{nm}(R_0) \left(\frac{R_0}{R_e}\right)^{n+1},$$

which is nothing but Eq. 27 and is known to be correct.
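The reduction just obtained can be reproduced numerically; the sketch below (with arbitrary radii and coefficients) evaluates the right-hand side of Eq. 204 for a single degree-n term, for which u′(R_0) carries the factor −(n+1)/R_0, and confirms u_nm(R_e) = q^{n+1} u_nm(R_0).

```python
# Numerical confirmation of the spherical-case reduction Eq. 204 -> Eq. 27.
import numpy as np

R0, Re = 6371e3, 7071e3                      # illustrative radii (metres)
q = R0 / Re
for n in range(10):
    u0 = 1.234                               # arbitrary degree-n coefficient u_nm(R0)
    du = -(n + 1) / R0 * u0                  # u'(R0) for a (R0/r)^(n+1) term
    unm_Re = (-Re * q**(n + 2) * du + n * q**(n + 1) * u0) / (2 * n + 1)  # Eq. 204
    assert abs(unm_Re - q**(n + 1) * u0) < 1e-12 * abs(u0)               # Eq. 27
```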

Appendix B: On the application of Tikhonov regularization to the inverse problem of a layer with constant density

We first show the mathematical reasoning proving that conditions (1) to (4) in Sect. 6 are sufficient to guarantee that

$$m = \inf_{R_L\in C} T(R_L) \tag{205}$$

is a true minimum, namely that there exists $\bar R_L \in C$ such that

$$m = T(\bar R_L). \tag{206}$$

Then we will show that, with T(R_L) given by Eq. 163, F[R_L] given by Eq. 162 and J(R_L) = J_1(R_L) given by Eq. 165, such conditions are satisfied. Given the definition of m, Eq. 205, and the fact that T(R_L) ≥ 0, there must be a sequence R_{L_n} ∈ C (⊂ H) such that

$$m = \lim_{n\to\infty} T(R_{L_n}). \tag{207}$$

Given Eq. 207, there must be a constant $\bar T$ such that

$$T(R_{L_n}) < \bar T; \tag{208}$$

Eq. 208 implies that

$$J(R_{L_n}) < \frac{\bar T}{\lambda}. \tag{209}$$

So

$$\{R_{L_n}\} \subset C \cap \left\{ J(R_L) < \frac{\bar T}{\lambda} \right\} \tag{210}$$

and, according to condition (3), there must be a subsequence, which we call again {R_{L_n}}, convergent in H,

$$R_{L_n} \xrightarrow{\;H\;} \bar R_L. \tag{211}$$

Since C is closed, by (1), it has to be

$$\bar R_L \in C. \tag{212}$$

On the other hand, F(R_L) is continuous H → H, and therefore

$$F(R_{L_n}) \xrightarrow{\;H\;} F(\bar R_L), \tag{213}$$

$$\|g - F(R_{L_n})\|^2_H \to \|g - F(\bar R_L)\|^2_H. \tag{214}$$

But then one has too

$$\lambda \lim J(R_{L_n}) = \lim \left[\, T(R_{L_n}) - \|g - F(R_{L_n})\|^2_H \right] = m - \|g - F(\bar R_L)\|^2_H. \tag{215}$$

At the same time,

$$\lambda \lim J(R_{L_n}) \equiv \lambda \liminf J(R_{L_n}) \ge \lambda\, J(\bar R_L). \tag{216}$$

Considering Eqs. 215 and 216, we see that

$$\lambda\, J(\bar R_L) + \|g - F(\bar R_L)\|^2_H \equiv T(\bar R_L) \le m; \tag{217}$$

since $\bar R_L \in C$ and m is the infimum of T(R_L) on C, (217) can only be an equality, and $\bar R_L$ is a minimum point of T(R_L) in C, as was to be proven. In our specific case, H ≡ L²(σ) and C is given by

$$R_L \in C: \quad 0 \le R_{L\sigma} \le R_\sigma. \tag{218}$$

Since B is bounded and {r = R_σ} is ∂B, one has, for some $\bar R < +\infty$,

$$R_\sigma \le \bar R \tag{219}$$

and therefore

$$R_{L\sigma} \le \bar R \tag{220}$$

too, so that

$$R_L \in C \;\Rightarrow\; \|R_{L\sigma}\|_{L^2} \le \bar R\, \mu(B) < +\infty, \tag{221}$$

i.e., C is bounded. Furthermore, let {R_{L_n}} be a Cauchy sequence in C ⊂ L²(σ), converging to $\bar R_L$; then we know there is a subsequence, {R_{L_{n_k}}}, that is pointwise convergent to $\bar R_L$ a.e. But then

$$0 \le \bar R_L = \lim R_{L_{n_k}} \le R_\sigma, \quad \sigma\ \text{a.e.}, \tag{222}$$

i.e., $\bar R_L \in C$. So C is closed. Now we want to prove that F(R_L) is continuous L²(σ) → L²(σ). We will adopt the notation of Fig. 9 and of Eqs. 144, 145, and 147. So we consider

$$|F(R_{L_2}) - F(R_{L_1})|^2 = \left| G \int_{L_1 \div L_2} \frac{\rho_0(Q)}{l_{PQ}}\, dB_Q \right|^2; \tag{223}$$

we have already observed (see Eq. 106 and Fig. 6) that for whatever layer L ⊂ B one has

$$\int_L \frac{1}{l_{PQ}^2}\, dB \le 4\pi\Delta. \tag{224}$$

So, by applying the Schwarz inequality to Eq. 223, one gets

$$|F(R_{L_2}) - F(R_{L_1})|^2 \le C_0^2\, 4\pi\Delta \int_{L_1 \div L_2} \rho_0^2\, dB. \tag{225}$$

But ρ_0² ≡ ρ̄² (constant) by definition of ρ_0 (see Eq. 147), and so

$$|F(R_{L_2}) - F(R_{L_1})|^2 \le C_0^2\, 4\pi\Delta\,\bar\rho^{\,2}\, \mu(L_1 \div L_2). \tag{226}$$

On the other hand, again by the Schwarz inequality,

$$\mu(L_1 \div L_2) = \int d\sigma \int_{R_{-\sigma}}^{R_{+\sigma}} s^2\, ds \le \bar R^2 \int d\sigma\, |R_{+\sigma} - R_{-\sigma}| \le \bar R^2\, 2\sqrt{\pi} \left( \int d\sigma\, |R_{+\sigma} - R_{-\sigma}|^2 \right)^{1/2}. \tag{227}$$

By its definition it is |R_{+σ} − R_{−σ}| ≡ |R_{L_2} − R_{L_1}|, so that, collecting Eqs. 226 and 227, one gets

$$|F(R_{L_2}) - F(R_{L_1})|^2 \le \text{const.}\; \|R_{L_2} - R_{L_1}\|_{L^2(\sigma)}. \tag{228}$$

As we see, Eq. 228 implies that F(R_L) is continuous from L²(σ) to the space of continuous functions, C(σ), and therefore also L²(σ) → L²(σ), since

$$\|f\|_{L^2(\sigma)} = \left( \int f^2(\sigma)\, d\sigma \right)^{1/2} \le \sup |f(\sigma)|\,\sqrt{4\pi} \equiv \|f\|_{C(\sigma)}\,\sqrt{4\pi}. \tag{229}$$

Finally we have to verify that the set

$$\Gamma_a = C \cap \{ J(R_L) < a \} \tag{230}$$

is compact in L²(σ), ∀ a. But we have already observed that C is bounded, so that, for some positive constant c,

$$R_L \in C \;\Rightarrow\; \|R_L\|^2_{L^2(\sigma)} \le c. \tag{231}$$

Then we have

$$R_L \in \Gamma_a \;\Rightarrow\; \|R_L\|^2_{H^{1,2}(\sigma)} = \|R_L\|^2_{L^2(\sigma)} + J_1(R_L) \le c + a, \tag{232}$$

namely Γ_a is a bounded set in H^{1,2}(σ). We have already observed that the embedding of H^{1,2}(σ) into L²(σ) is compact, and in any case this is quite obvious because

$$f \in H^{1,2}; \qquad f = \sum f_{nm}\, Y_{nm}(\sigma) \tag{233}$$

implies

$$\|f\|^2_{H^{1,2}} = \int f^2\, d\sigma + \int |\nabla_\sigma f|^2\, d\sigma \equiv \int f^2\, d\sigma - \int f\,\Delta_\sigma f\, d\sigma \equiv 4\pi \sum f_{nm}^2\, \left[1 + n(n+1)\right] < +\infty. \tag{234}$$

So Γ_a is compact in L²(σ), as was to be proved.
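The minimization whose well-posedness was just established can be imitated on a toy discretization. Everything below is illustrative — the matrix A, the data g, and the first-difference penalty stand in for F, the observations, and J_1; it is a sketch of the Tikhonov functional T(R_L) = ‖g − F(R_L)‖² + λJ_1(R_L) minimized over the box constraint 0 ≤ R_L ≤ R_σ, not the actual Newtonian forward operator.

```python
# Toy box-constrained Tikhonov minimization (stand-in for Eqs. 162-165, 218).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
N = 40
A = rng.normal(size=(N, N)) / N                  # hypothetical forward operator
r_true = np.clip(np.sin(np.linspace(0, np.pi, N)), 0.0, 0.9)
g = A @ r_true + 1e-3 * rng.normal(size=N)       # noisy synthetic data
D = np.diff(np.eye(N), axis=0)                   # discrete gradient for the J_1 penalty
lam = 1e-3

T = lambda r: np.sum((g - A @ r)**2) + lam * np.sum((D @ r)**2)
res = minimize(T, np.full(N, 0.5), bounds=[(0.0, 1.0)] * N)   # the convex set C
assert res.success and T(res.x) <= T(np.full(N, 0.5))
```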

Literature

1. Andersen, O.B.: Marine gravity and geoid from satellite altimetry. In: Sansò, F., Sideris, M. (eds.) Geoid Determination: Theory and Methods, Chapter 9. LNESS. Springer, Berlin/Heidelberg (2013)
2. Ballani, L., Stromeyer, D.: The inverse gravimetric problem: a Hilbert space approach. In: Proceedings of the International Symposium on Figure of the Earth, the Moon, and Other Planets, Prague, pp. 359–373 (1982)
3. Barzaghi, R., Sansò, F.: Remarks on the inverse gravimetric problem. Bollettino di geodesia e scienze affini 45(2), 203–216 (1986)
4. Bucha, B., Hirt, C., Kuhn, M.: Runge–Krarup type gravity field solutions to avoid divergence in traditional external spherical harmonic modelling. J. Geod. (2018, under review)
5. Flechtner, F., Sneeuw, N., Schuh, W.D.: Observation of the System Earth from Space: CHAMP, GRACE, GOCE and Future Missions. Springer, Berlin/Heidelberg (2014)
6. Freeden, W., Michel, V.: Multiscale Potential Theory: With Applications to Geoscience. Springer, Berlin/Heidelberg (2012)
7. Freeden, W., Nashed, M.Z.: Operator-theoretic and regularization approaches to ill-posed problems. Int. J. Geomath. 9(1), 1–115 (2018)
8. Hörmander, L.: The boundary problems of physical geodesy. Arch. Ration. Mech. Anal. 62(1), 1–52 (1976)
9. Isakov, V.: Inverse Source Problems, vol. 34. A.M.S., Providence (1990)
10. Krarup, T.: Letters on Molodensky's problem. I: the single Molodensky problem. In: Borre, K. (ed.) Mathematical Foundation of Geodesy: Selected Papers of Torben Krarup. Springer, Berlin/Heidelberg (2006)
11. Krarup, T.: On potential theory. In: Borre, K. (ed.) Mathematical Foundation of Geodesy: Selected Papers of Torben Krarup. Springer, Berlin/Heidelberg (2006)
12. Lauricella, G.: Sulla distribuzione di massa all'interno dei pianeti. Rend. Acc. Lincei XII, 18–21 (1912)
13. Leliwa-Ropystynski, J.: Distribution of selected physical parameters in the Earth and planetary interiors. In: Leliwa-Ropystynski, J., Teisseyre, R. (eds.) Constitution of the Earth's Interior. Elsevier, New York (1984)
14. McLean, W.: Strongly Elliptic Systems and Boundary Integral Equations. Cambridge University Press, Cambridge/New York (2000)
15. Migliaccio, F., Reguzzoni, M., Gatti, A., Sansò, F., Herceg, M.: A GOCE-only global gravity field model by the space-wise approach. In: Proceedings of the 4th International GOCE User Workshop. ESA SP-696 (2011)
16. Miranda, C.: Partial Differential Equations of Elliptic Type. Springer, Berlin/Heidelberg (1970)
17. Moritz, H.: The Figure of the Earth: Theoretical Geodesy and the Earth's Interior. H. Wichmann, Karlsruhe (1990)
18. Moritz, H., Heiskanen, W.A.: Physical Geodesy. W.H. Freeman, San Francisco (1967)
19. Pavlis, N.: Global gravitational models. In: Sansò, F., Sideris, M. (eds.) Geoid Determination: Theory and Methods, Chapter 9. LNESS. Springer, Berlin/Heidelberg (2013)
20. Pizzetti, P.: Intorno alle possibili distribuzioni della massa nell'interno della Terra. Annali di Mat. Milano XVII, 225–258 (1910)
21. Richter, M.: Inverse Problems. Lecture Notes in Geosystems Mathematics and Computing. Birkhäuser, Cham (2016)
22. Riesz, F., Nagy, B.S.: Leçons d'Analyse Fonctionnelle. Gauthier-Villars, Paris (1968)
23. Sansò, F.: Internal Collocation. Memorie dell'Accademia dei Lincei, vol. 16, Rome (1980)
24. Sansò, F.: Theory on GBVP's applied to the analysis of altimetric data. In: Rummel, R., Sansò, F. (eds.) Satellite Altimetry in Geodesy and Oceanography. LNESS, vol. 50. Springer, Berlin/Heidelberg (1993)
25. Sansò, F.: The long road from measurements to boundary value problems in physical geodesy. Manuscripta Geodaetica 20(5), 326–344 (1995)
26. Sansò, F.: The analysis of the GBVP: state and perspectives. In: Handbook of Math-Geodesy, pp. 463–493. Springer, Berlin/Heidelberg (2017)
27. Sansò, F., Sideris, M.G.: Geoid Determination: Theory and Methods. LNESS. Springer, Berlin/Heidelberg (2013)
28. Sansò, F., Sideris, M.G.: Geodetic Boundary Value Problem: The Equivalence Between Molodensky's and Helmert's Solutions. Springer, Cham (2016)
29. Sansò, F.: On the regular decomposition of the inverse gravimetric problem in non-L² spaces. Int. J. Geomath. 5(1), 33–61 (2014)
30. Vaníček, P., Kleusberg, A.: What an external gravitational potential can really tell us about mass distribution. Boll. Geof. Teor. ed Appl. 102(108), 243–250 (1985)
31. Wermer, J.: Potential Theory. Lecture Notes in Mathematics, vol. 408. Springer, Berlin/Heidelberg (1974)
32. Yosida, K.: Functional Analysis. Springer, Berlin/Heidelberg (1995)
33. Zidarov, D.P.: Inverse Gravimetric Problem in Geoprospecting and Geodesy. Publishing House of the Bulgarian Academy of Sciences / Elsevier, Sofia (1990)

Part II Special Functions Based Methods

4 Spherical Harmonics, Splines, and Wavelets: Definitoric Constituents, Strategic Perspectives, Specific Applicability and Applications

Willi Freeden and Michael Schreiner

Contents
1 Motivation and Justification
2 Signals, Spherical Harmonics, and Pseudodifferential Operators
3 Zonal Kernel Functions
4 Uncertainty Principle
5 Spline Approximation
6 Wavelet Approximation
7 Goal of Spherical Approximation
8 Sampling Methods and Recovery Problems
9 Selective Publication List
Literature

Abstract  This contribution represents a geodetically relevant collection of particularly valuable material in the diverse approximation areas involving spherical harmonics, splines, and wavelets, thereby establishing a consistent and unified setup. The goal of the work is to convince members of the geodetic community that spherically oriented approximation provides a rich mathematical cornucopia that has much to offer to a large palette of applications. Geomathematically it reflects both the approximate shape of the Earth's surface and the typical satellite geometry of a low Earth orbiter (LEO). Our essential interest is in reconstruction and decomposition characteristics corresponding to different types of data on spheres and various observables naturally occurring in the geodetic context, whenever efficient and economic numerical realizations are required. Another objective is to provide an addition to the library of any individual interested in geodetically reflected local as well as global spherical approximation theory.

This chapter is part of the series Handbuch der Geodäsie, volume "Mathematische Geodäsie/Mathematical Geodesy", edited by Willi Freeden, Kaiserslautern.

W. Freeden, Geomathematics Group, Mathematics Department, University of Kaiserslautern, Kaiserslautern, Germany. E-Mail: [email protected]; [email protected]

M. Schreiner, Institute for Computational Engineering, University of Applied Sciences of Technology NTB, Buchs, Switzerland. E-Mail: [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020. W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_101

Zusammenfassung  This contribution presents a geodetically relevant collection of particularly valuable material in the diverse approximation areas involving spherical harmonics, splines, and wavelets, within a consistent and unified framework. The goal of the work is, above all, to convince geodesists that spherically oriented approximation provides a rich mathematical cornucopia with much to offer for a broad palette of applications. Geomathematically it reflects both the approximate figure of the Earth and the typical satellite geometry of a low-flying Earth orbiter. Our essential interest lies in the reconstruction and decomposition characteristics of the various data types on spheres and of the manifold observables naturally occurring in the geodetic context, whenever efficient and economical numerical realizations are required. A further concern is to make an additional library available to those interested in locally as well as globally oriented spherical approximation theory.

Keywords Spherical harmonics · Spherical splines · Spherical wavelets · Philosophies · Definitions · Strategies · Applicabilities · Applications

1 Motivation and Justification

An essential objective of mathematics is to create settings and concepts to understand better our world. Mathematics is present in everyday life. Even more, almost all sciences undergo a process of “mathematization” due to increasing technological progress. An example is geomathematics. It is a key discipline for observing, understanding, and forecasting the complexity of the system Earth, in particular for the responsible subsystem of geodesy. We are living in a world of rapid climate, environmental, and societal change. Emphasis must be also put on the interactions with the ecosystem. For all these interactions to be managed there is a strong need for geomathematical research that becomes increasingly apparent. Understanding phenomena requires the handling and analyzing of data. Usually,


data sets are discrete manifestations of continuous processes of the system Earth. As an urgent consequence, geosciences and, in particular, geodesy, have to take care in maintaining and improving the links between the Earth’s system and its simplifications by theories, models, and simulations based on discrete data sets. The building bridge between the real world and its virtual counterpart based on mappings in form of measurements and observations consequently is one of the fundamental roles of mathematical geodesy. Since the Earth’s surface is an almost perfect sphere – deviations from its spherical shape are less than 0.4% of its radius and essentially arise from (sufficiently known) rotation effects – we are immediately led to the work presented in this contribution, namely “sphericity”. As a matter of fact, spherical approximation of a signal should be considered in various parts: the first stating the fact that a function can be completely determined by space or frequency properties, the second describing how to reconstruct the function (signal) using its space and frequency constituents from discrete data, the third decomposing a continuous signal into certain space and band details by decorrelation of specific signature ingredients. All these aspects of approximation can be discussed mathematically by means of a variety of trial functions such as spherical harmonics, splines, and wavelets in different topologies and methodologies. In other words, the overall objective is to handle discrete measurements and observations on the sphere in an appropriate space and/or frequency way, so that they subsequently allow the transfer from discrete to continuous framework, and vice versa. In turn, the choice of discretization methods has an important impact on the approximation solutions of the models under consideration. 
Particular questions have to be answered, e.g., about the type, width, and character of data sets, estimating sizes or parameters, comparisons and validation of scientific hypotheses. So, progress in approximation often leads to progress in mathematics as well as applications. All in all, the goal of spherical approximation is not only to make continuous theory based on discrete data knowledge more useful, but also to describe concrete problems in geoapplications, in particular geodesy, adequately by anchoring them specifically in mathematical formalisms. What is exactly that enables the mathematicians to provide the transfer from concrete measurements and observables to abstract mathematical formalisms and models such as provided by approximation theory? Some programmatic answers should be given already at this early stage essentially inspired by theses in [64, 65, 90]: 1. The mathematical world of numbers and structures contains efficient tokens by which the rule-like aspect of (e.g., sampling) problems can be described appropriately. This description includes as an essential step a simplification by abstraction. In spherical approximation, the principal impact of abstraction is to allow the replacement of a continuous signal by a discrete set of its samples preferably with minimal loss of any information. So, simplification by abstraction consists of specifying the criteria under which the original continuous signal may be reproduced.


2. Essential properties of an approximation problem are separated from unimportant ones, further specified, and afterwards included into a solution scheme. The “eye for similarities” often enables mathematicians to recognize a posteriori that an adequately reduced problem may also arise from very different situations in various application areas, so that the resulting solutions may be applicable to multiple settings after an adequate adaptation or concretization. Without this ingredient, the abstraction remains essentially useless. More concretely, bivariate polar coordinates are the cutting edge of non-isotropic spherical approximation. However, as we shall see in our work, isotropic solutions are geoscientifically important because of the characteristics of the occurring observables. Isotropic solutions, however, require a different point of departure, for example, based on rotation-invariant spherical polynomials involving Legendre polynomials, and their sum conglomerates to zonal kernels. 3. The interaction between abstraction and concretization characterizes the history of mathematics and its current development as a common language and an independent standard. A problem reduced by abstraction has to be considered as a new “concrete” problem to be solved within a general framework, that determines the validity of a possible solution. In the Euclidean approximation theory, the frequency localization of a signal characterized by Fourier theoretic means is the standard assumption. In fact, it plays an eminent role in the process of abstraction. In the spherical context, there are different ways of how frequency localization may be realized, for example, as bivariate extension starting from Euclidean Fourier settings or intrinsically on the sphere, e.g., in terms of the system of spherical harmonics. 4. 
The more examples and facets one knows, the more one recognizes the causality between the abstractness of mathematical concepts and their impact and crossdisciplinary significance. Validation of examples is an overall guiding principle in geodetically reflected mathematics, particularly in approximation. It is the preassumption for the mathematical “eye for similarities” and the applicability to diverse areas of research.

2 Spherical Signals, Spherical Harmonics, and Pseudodifferential Operators

As already pointed out, there is a palette of signals in mathematical geodesy to be studied on the sphere. For instance, the space variation of a spherical signal is fundamental for many applications. However, if we are interested in gaining a deeper understanding of the space variation, it is often advantageous to study the signal in different representations as, e.g., for EGM models. For example, the signal can be obtained from a complete system of polynomials, e.g., spherical harmonics, providing a spectral (frequency) representation. From a mathematical point of view there is an infinite number of ways this can be done. Even more, the idea that a discontinuous signal (function) on the sphere may be expressed as a sum of arbitrarily often differentiable polynomials on the sphere turned out to be one of the great innovations since the time of Laplace, Legendre, and Gauss. How much energy a spherical signal has and how much energy it takes to produce are central problems in geosciences. Signal analysis has been extended to many diverse types of data with different understandings of energy. Today, the usual understanding of the total energy of a signal F is achieved by "continuous summation", i.e., integration over all space (the unit sphere) S² of the pointwise "fractional energy" |F(ξ)|², ξ ∈ S², in the form

$$\|F\|^2_{L^2(\mathbb{S}^2)} = \int_{\mathbb{S}^2} |F(\xi)|^2\, dS(\xi), \tag{1}$$

where dS is the surface element. The space L²(S²) of all signals (functions) having finite energy may be appropriately characterized by certain systems of restrictions of homogeneous harmonic polynomials to the sphere, in fact leading canonically to a spherical harmonics system constituting the Hilbert space L²(S²). The polynomial structure of spherical harmonics {Y_{n,k}}_{n=0,1,…; k=−n,…,n} in the Hilbert space L²(S²) has tremendous advantages: First, spherical harmonics of different degrees are orthogonal (in the topology implied by (1)). Second, the space

$$\mathrm{Harm}_n = \operatorname{span}_{k=-n,\dots,n} Y_{n,k} \tag{2}$$

of spherical harmonics of degree (frequency) n is finite-dimensional. Its dimension is given by dim(Harm_n) = 2n + 1, so that

$$\mathrm{Harm}_{0,\dots,m} = \operatorname{span}_{\substack{n=0,\dots,m\\ k=-n,\dots,n}} Y_{n,k} = \bigoplus_{n=0}^{m} \mathrm{Harm}_n \tag{3}$$

implies

$$\dim(\mathrm{Harm}_{0,\dots,m}) = \sum_{n=0}^{m} (2n+1) = (m+1)^2. \tag{4}$$

The basis property of {Y_{n,k}}_{n=0,1,…; k=−n,…,n} in the space L²(S²) of finite-energy signals is equivalently characterized by the completion of the orthogonal direct sum ⊕_{n=0}^{∞} Harm_n, i.e.,

$$L^2(\mathbb{S}^2) = \overline{\bigoplus_{n=0}^{\infty} \mathrm{Harm}_n}^{\,\|\cdot\|_{L^2(\mathbb{S}^2)}}. \tag{5}$$

This is the natural reason why spherical harmonic expansions are essential tools in the theory of gravitational and geomagnetic potentials.
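Both the dimension formula (4) and the orthonormality underlying (5) are easy to confirm numerically. The sketch below builds the orthonormal complex spherical harmonics from associated Legendre functions (a standard construction, not taken from this text) and integrates with a Gauss–Legendre × trapezoidal product rule on S².

```python
# Check of Eq. (4) and of L2(S2)-orthonormality of spherical harmonics.
import numpy as np
from math import factorial
from scipy.special import lpmv

m_max = 4
assert sum(2 * n + 1 for n in range(m_max + 1)) == (m_max + 1)**2   # Eq. (4)

def Y(n, m, theta, phi):
    # orthonormal complex spherical harmonic (theta = polar, phi = azimuth)
    if m < 0:
        return (-1.0)**(-m) * np.conj(Y(n, -m, theta, phi))
    norm = np.sqrt((2 * n + 1) / (4 * np.pi) * factorial(n - m) / factorial(n + m))
    return norm * lpmv(m, n, np.cos(theta)) * np.exp(1j * m * phi)

t, w = np.polynomial.legendre.leggauss(40)       # Gauss nodes in cos(theta)
phi = np.linspace(0.0, 2 * np.pi, 80, endpoint=False)
TH, PH = np.meshgrid(np.arccos(t), phi, indexing="ij")
W = np.outer(w, np.full(phi.size, 2 * np.pi / phi.size))   # surface quadrature weights

def inner(n1, k1, n2, k2):                       # <Y_{n1,k1}, Y_{n2,k2}>_{L2(S2)}
    return np.sum(W * Y(n1, k1, TH, PH) * np.conj(Y(n2, k2, TH, PH)))

assert abs(inner(3, 2, 3, 2) - 1.0) < 1e-12      # unit norm
assert abs(inner(3, 2, 4, 2)) < 1e-12            # orthogonality across degrees
```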


Spectral analysis in terms of spherical harmonics {Y_{n,k}}_{n=0,1,…; k=−n,…,n} has led to the discovery of basic laws of nature. It allows us to understand the composition and ingredients of features of the Earth (for more details about the space and frequency description of (one-dimensional) signals see, e.g., [32] and the references therein). The formalism of a spherical harmonics approximation system is essentially based on the following principles (cf. [102]):

1. The spherical harmonics are obtainable in a twofold way, namely as restrictions of three-dimensional homogeneous harmonic polynomials or intrinsically on the unit sphere S² as eigenfunctions of, e.g., the Beltrami operator or certain pseudodifferential operators.
2. The Legendre kernels (polynomials) are obtainable as the outcome of sums extended over a maximal horizontal orthonormal system of spherical harmonics (cf. Fig. 1) of degree (i.e., frequency) n.
3. The Legendre kernels are rotation-invariant with respect to orthogonal transformations.
4. Each Legendre kernel implies an associated Funk–Hecke formula that determines the constituting features of the convolution (filtering) of a square-integrable field against the Legendre kernel.
5. The orthogonal Fourier expansion of a square-integrable field is the sum of the convolutions of the field against the Legendre kernels, extended over all frequencies.

Fig. 1 Spacelimited zonal kernel generated as weighted infinite sum over spherical harmonics

In fact, the theory of spherical harmonics provides a powerful spectral framework to unify, review, and supplement the different approaches in spaces over the unit sphere S², where distance (norm) and angle are at hand in suitable reference (pre-)Hilbert spaces such as the space C^(0)(S²) of continuous functions on S². The essential tools in these (pre-)Hilbert spaces are the Legendre functions, used in orthogonal Fourier expansions and endowed with rotational invariance. The coordinate-free construction yields a number of formulas and theorems by use of coordinate-free spherical harmonics that previously were derived only in problem-affected coordinate (more precisely, polar coordinate) representations (note that there is no singularity-free global coordinate system on S²). As a consequence, any kind of singularity is avoided at points being fixed under orthogonal transformations. Finally, the transition from the scalar to the vectorial as well as the tensorial case opens new promising perspectives of constructing important zonal classes of spherical trial functions by summing up Legendre kernel expressions (see [90, 102]), thereby providing (geo-)physical relevance and increasing local applicability (see, e.g., [69, 72, 74, 75, 90, 102, 127, 188] and the references therein for a variety of aspects of constructive spherical harmonics approximation).

Any signal F ∈ L²(S²) can be split (cf. Table 1) into "orthogonal contributions" involving the Fourier transforms F^∧(n,k) defined by

$$F^{\wedge}(n,k) = \int_{\mathbb{S}^2} F(\xi)\, Y_{n,k}(\xi)\, dS(\xi), \tag{6}$$

in terms of L²(S²)-orthonormal spherical harmonics {Y_{n,k}}_{n=0,1,…; k=−n,…,n}. The total energy of a signal should be independent of the method used to calculate it. Hence, ‖F‖²_{L²(S²)} as defined by (1) should be the sum of (F^∧(n,k))² over all frequencies. So, Parseval's identity identifies the spatial energy of a signal with the spectral energy, decomposed orthogonally into single frequency contributions:

$$\|F\|^2_{L^2(\mathbb{S}^2)} = \langle F, F\rangle_{L^2(\mathbb{S}^2)} = \sum_{n=0}^{\infty}\sum_{k=-n}^{n} \left(F^{\wedge}(n,k)\right)^2.$$

Parseval's identity also explains why the (global) geosciences more often work with the frequency energy, i.e., the amplitude spectrum {F^∧(n,k)}_{n=0,1,…; k=−n,…,n}, than with the original space signal F ∈ L²(S²). As a consequence, the "inverse Fourier transform"

$$F = \sum_{n=0}^{\infty}\sum_{k=-n}^{n} F^{\wedge}(n,k)\, Y_{n,k} \tag{7}$$

154

W. Freeden and M. Schreiner

Table 1 Fourier expansion of L²(S²)-functions

Spherical harmonics {Y_{n,j}}_{n=0,1,…; j=−n,…,n} as polynomial system on the unit sphere S² ⊂ ℝ³: orthonormality and invariance.

Addition theorem (one-dimensional Legendre polynomial P_n):

$$P_n(\xi\cdot\eta) = \frac{4\pi}{2n+1}\sum_{j=-n}^{n} Y_{n,j}(\xi)\, Y_{n,j}(\eta), \qquad \xi,\eta \in \mathbb{S}^2.$$

Convolution by the Legendre kernel (Funk–Hecke formula; Legendre transform of F):

$$(P_n * F)(\xi) = \frac{2n+1}{4\pi}\int_{\mathbb{S}^2} P_n(\xi\cdot\eta)\, F(\eta)\, dS(\eta) = \sum_{j=-n}^{n} F^{\wedge}(n,j)\, Y_{n,j}(\xi), \qquad \xi \in \mathbb{S}^2.$$

Fourier coefficients of F ∈ L²(S²):

$$F^{\wedge}(n,j) = \int_{\mathbb{S}^2} F(\xi)\, Y_{n,j}(\xi)\, dS(\xi).$$

Superposition over frequencies (orthogonal series expansion; Fourier series of F ∈ L²(S²)):

$$F(\xi) = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi}\int_{\mathbb{S}^2} P_n(\xi\cdot\eta)\, F(\eta)\, dS(\eta) = \sum_{n=0}^{\infty}\sum_{j=-n}^{n} F^{\wedge}(n,j)\, Y_{n,j}(\xi), \qquad \xi \in \mathbb{S}^2.$$

allows the geoscientists to think of the function (signal) F as a weighted superposition of "wave functions" Y_{n,k} corresponding to different frequencies. This property has been used in geodesy since Gauss [110] for establishing Earth Gravitational Models (such as EGM96 [157] and EGM2008 [207]).

One can think of measurements as operating on an "input signal" F to produce an output signal G = ΛF, where Λ is an operator acting on L²(S²). Fortunately, a large portion of the operators of interest can be well approximated by pseudodifferential operators (see, e.g., [72, 77, 246]). If Λ is such an operator on L²(S²), this means that

Λ Y_{n,k} = Λ^∧(n,k) Y_{n,k},    n = 0, 1, ...; k = −n, ..., n,

(8)

where, in geodesy and geophysics, the "symbol" {Λ^∧(n,k)}, n ∈ N₀; k = −n, ..., n, usually is a sequence of real values independent of the order k, i.e., Λ^∧(n,k) = Λ^∧(n) for all n. A pseudodifferential operator Λ satisfying Λ^∧(n,k) = Λ^∧(n) for all n is called rotation-invariant (or isotropic). Equation (8) allows the interpretation that the spherical harmonics are the eigenfunctions of the operator Λ. Different pseudodifferential operators Λ (for example, the Beltrami operator Δ*) are characterized by their eigenvalues Λ^∧(n). Moreover, the amplitude spectrum {G^∧(n,k)} of the response of Λ is obtained from the amplitude spectrum of the function (signal) by a simple multiplication with the "transfer function" Λ^∧(n).
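In spectral terms, an isotropic pseudodifferential operator is simply a multiplier on the Fourier coefficients. A minimal sketch of Eq. (8), using the upward continuation symbol of Table 2 (R, H, and the toy spectrum are illustrative assumptions, not values from the chapter):

```python
import numpy as np

# Hedged sketch: a rotation-invariant pseudodifferential operator acts on the
# spectrum by multiplication with its symbol, cf. Eq. (8). Illustrative values.
R, H = 6371.0, 400.0                                  # km, example numbers
upward = lambda n: (R / (R + H))**(n + 1)             # upward continuation, Table 2

F_hat = {(n, k): 1.0 for n in range(5) for k in range(-n, n + 1)}   # toy spectrum
G_hat = {nk: upward(nk[0]) * v for nk, v in F_hat.items()}          # G = Lambda F

assert G_hat[(2, 1)] == upward(2)                     # symbol is independent of k
assert upward(4) < upward(3) < upward(2)              # high degrees are attenuated
```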

4 Spherical Harmonics, Splines, and Wavelets

Table 2 Geodetic operators in spherical nomenclature

  Operator/quantity Λ             Λ^∧(n)                                  Order
  Gravity anomaly                 (n−1)/R                                  1
  Geoid undulations               R²                                       0
  Stokes operator                 R/(n−1)                                 −1
  Single layer                    R/(n+1/2)                               −1
  Double layer                    −R/(2n+1)                               −1
  First normal derivative         −(n+1)/R                                 1
  Second normal derivative        (n+1)(n+2)/R²                            2
  Upward continuation             (R/(R+H))^{n+1}                         −∞
  Satellite gravity gradiometry   ((n+1)(n+2)/(R+H)²) (R/(R+H))^{n+1}     −∞

Various important pseudodifferential operators are listed in Table 2. Here R designates the Earth’s mean radius, and H is the satellite height.
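The "Order" column can be sanity-checked numerically: a symbol of order s behaves like n^s for large n, so doubling n multiplies it by roughly 2^s, while the order −∞ entries decay faster than any fixed power. A hedged sketch (R and H are illustrative example values):

```python
# Sketch: check the asymptotic order of some symbols from Table 2.
R, H = 6371.0, 400.0   # illustrative Earth radius and satellite height in km

symbols = [                                   # (symbol, expected order)
    (lambda n: (n - 1) / R, 1),               # gravity anomaly
    (lambda n: R / (n - 1), -1),              # Stokes operator
    (lambda n: (n + 1)*(n + 2) / R**2, 2),    # second normal derivative
]
n = 1000
for lam, order in symbols:
    assert abs(lam(2*n) / lam(n) - 2.0**order) < 0.01

upward = lambda n: (R / (R + H))**(n + 1)     # order -infinity
assert upward(2*n) / upward(n) < 1e-20        # decays faster than any fixed power
```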

3 Zonal Kernel Functions

Any kernel function K : S² × S² → R that is characterized by the property that there exists a function K̃ : [0,2] → R such that

K(ξ,η) = K̃(|ξ − η|) = K̃(√(2 − 2ξ·η)) = K̂(ξ·η),    ξ, η ∈ S²,    (9)

is called a (spherical) radial basis function (at least in the theory of constructive approximation). The application of a rotation (i.e., a 3 × 3 orthogonal matrix t with tT = t−1 ) leads to ˆ · η) = K(ξ, η). ˆ ˆ · (tT tη)) = K(ξ K(tξ, tη) = K((tξ ) · (tη)) = K(ξ

(10)

In particular, a rotation around the axis ξ ∈ S² (i.e., tξ = ξ) yields K(ξ,η) = K(ξ,tη) for all η ∈ S². Hence, K(ξ,·) possesses rotational symmetry with respect to the axis ξ. A kernel K̂ : S² × S² → R satisfying

K̂(ξ·η) = K̂(tξ · tη),    ξ, η ∈ S²,    (11)

for all orthogonal transformations t is known as a zonal kernel function. To highlight the reducibility of K̂ to a function defined on the interval [−1,1], the notation (ξ,η) ↦ K̂(ξ·η), (ξ,η) ∈ S² × S², is used.

From the theory of spherical harmonics we get a representation of any L²(S²)-zonal kernel function K in terms of a Legendre expansion


K(ξ·) = Σ_{n=0}^{∞} ((2n+1)/(4π)) K^∧(n) P_n(ξ·)    (12)

(in the ‖·‖_{L²(S²)}-sense), where the sequence {K^∧(n)}, n ∈ N₀, given by

K^∧(n) = 2π ∫_{−1}^{1} K(t) P_n(t) dt    (13)

is called the Legendre symbol of the zonal kernel K(ξ·). A simple but extreme example (with optimal frequency localization and no space localization) is the Legendre kernel, for which K^∧(n) = 1 for one particular n and K^∧(m) = 0 for m ≠ n, i.e., the Legendre kernel is given by

S² × S² ∋ (ξ,η) ↦ ((2n+1)/(4π)) P_n(ξ·η).
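The Legendre symbol (13) can be evaluated by quadrature. As a hedged sketch (h and the node count are illustrative), we use the Abel-Poisson kernel that appears later in the chapter (Eq. (34)), whose symbol is K^∧(n) = h^n:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

# Sketch: Legendre symbol via Eq. (13) for the Abel-Poisson kernel (Eq. (34)),
# whose symbol is h^n; h = 0.6 and 200 nodes are illustrative choices.
t, w = leggauss(200)
h = 0.6
K = (1 - h**2) / (4*np.pi * (1 + h**2 - 2*h*t)**1.5)
for n in range(8):
    Pn = legval(t, [0]*n + [1])
    K_n = 2*np.pi * np.sum(w * K * Pn)      # Eq. (13)
    assert abs(K_n - h**n) < 1e-10
```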

An additive clustering of weighted Legendre kernels generates zonal kernel functions (cf. Fig. 1).

Physical devices do not transmit spherical harmonics of arbitrarily high frequency without severe attenuation. The transfer function Λ^∧(n) usually tends to zero with increasing n. It follows that the amplitude spectra of the responses (observations) to functions (signals) of finite energy are also negligibly small beyond some finite frequency. Thus, both because of the frequency-limiting nature of the devices used and because of the nature of the "transmitted signals", the geoscientist is soon led to consider bandlimited functions. These are the functions F ∈ L²(S²) whose "amplitude spectra" vanish for all n ≥ N (for some fixed N ∈ N₀). In fact, we are able to distinguish bandlimited kernels (i.e., K^∧(n) = 0 for all n > N) and non-bandlimited ones, for which infinitely many numbers K^∧(n) do not vanish. A bandlimited function F ∈ L²(S²) can be written as a finite Fourier series. So, any function F of the form

F = Σ_{n=0}^{N} Σ_{k=−n}^{n} F^∧(n,k) Y_{n,k}    (14)

is said to be bandlimited with the band N if F^∧(N,k) ≠ 0 for at least one k. If there exists a region Γ ⊊ S² such that F ∈ L²(S²) vanishes on S²\Γ, then F is said to be spacelimited (locally supported). F ∈ L²(S²) is called spacelimited (locally supported) with spacewidth ρ ∈ (−1,1) around an axis η ∈ S² if F vanishes on the set of all ξ ∈ S² satisfying −1 ≤ ξ·η ≤ ρ, where ρ is the largest number for which this is the case.


Zonal kernel function theory relies on the following principles:

1. Weighted Legendre kernels are the summands of zonal kernel functions.
2. The Legendre kernel is ideally localized in frequency. The Dirac kernel is ideally localized in space.
3. The only frequency- and spacelimited zonal kernel is the zero function.

All in all, zonal kernels exist as bandlimited and non-bandlimited functions. Every bandlimited kernel refers to a cluster of a finite number of polynomials, i.e., spherical harmonics; hence, it corresponds to a certain band of frequencies. In contrast to a single polynomial, which is localized in frequency but not in space, a bandlimited kernel such as the Shannon kernel already shows a certain amount of space localization. If we move from bandlimited to non-bandlimited kernels, the frequency localization decreases and the space localization increases, in accordance with the relationship provided by the uncertainty principle.

Bandlimited functions are infinitely often differentiable everywhere. Moreover, it is clear that any bandlimited function F is an analytic function. From the analyticity it follows immediately that a non-trivial bandlimited function cannot vanish on any (non-degenerate) subset of S². The only function that is both bandlimited and spacelimited is the zero function: functions cannot have finite support in the spatial as well as in the spectral domain. A certain way out is the bandlimited context of (spherical harmonics based) Shannon kernels, which allows a spatiospectral concentration in terms of Slepian functions, where the measure of concentration is invariably a quadratic energy ratio adapted to the local area under consideration (see, e.g., [90] and the references therein). Numerical analysis would like to deal with spacelimited functions. However, such a function (signal) of finite (space) support turns out to be non-bandlimited, so that it must contain spherical harmonics of arbitrarily large frequencies.
Thus we face a dilemma, an uncertainty principle, in seeking functions that are somehow concentrated in both space and frequency.

4 Uncertainty Principle

There is a way of mathematically expressing the impossibility of simultaneous confinement of a function to space and frequency (more accurately, angular momentum), namely the uncertainty principle. If we consider |F(ξ)|², ξ ∈ S², as a density in space, so that ‖F‖²_{L²(S²)} = 1, the average space (space expectation) can be defined in the usual way as any average is understood:

g_F^space = ∫_{S²} ξ (F(ξ))² dS(ξ).    (15)

The reason for introducing an average is that it may give a gross characterization of the density. Moreover, it may indicate where the density is concentrated around the average. Various measures can be used to make certain whether the density is concentrated around the average. The most common measure is the standard space deviation, σ_F^space, given by

(σ_F^space)² = ∫_{S²} (ξ − g_F^space)² (F(ξ))² dS(ξ).    (16)

The standard deviation is an indication of the space localization of the signal. If the standard deviation is small, then most of the signal is concentrated around the average space. If (F^∧(n,j))² represents the density in frequency, then we may use it to calculate averages, the motivation being the same as in the space domain. It also gives a rough idea of the main characteristics of the spectral density. The average frequency (frequency expectation), g_F^frequency, and its standard deviation, σ_F^frequency (sometimes also called bandwidth), are given by

g_F^frequency = Σ_{n=0}^{∞} Σ_{j=−n}^{n} n(n+1) (F^∧(n,j))²    (17)

and

(σ_F^frequency)² = Σ_{n=0}^{∞} Σ_{j=−n}^{n} (n(n+1) − g_F^frequency)² (F^∧(n,j))².    (18)
Note (see, e.g., [32]) that, for reasons of consistency with the theory of spherical harmonics, we chose n(n+1) instead of the 1D-standard choice n (observe that Δ*Y_{n,j} = −n(n+1)Y_{n,j} for all n, j). The square roots of the variances, σ^space and σ^frequency, quantify the localization in space and in frequency, respectively. It turns out that for these quantities we obtain an estimate of the following type (for the proof the reader is referred to [63]):

Δ_F^space Δ_F^frequency ≥ 1,    (19)

where the so-called uncertainties Δ_F^space, Δ_F^frequency are given by

Δ_F^space = σ_F^space / |g_F^space|    (20)

and

Δ_F^frequency = (g_F^frequency)^{1/2}.    (21)

In other words, space localization comes at the cost of frequency localization, and vice versa. The discovery of the uncertainty principle by W. Heisenberg (1927) is one of the great achievements of the last century. For signal analysis it roughly states that a narrow spatial "waveform" implies a wide frequency spectrum, and a wide spatial "waveform" yields a narrow spectrum; the spatial waveform and the frequency spectrum cannot both be made arbitrarily small simultaneously.

The uncertainty principle enables us to understand the transition from the theory of spherical harmonics through zonal kernel functions to the Dirac kernel. To this end we have to realize the relative advantages of the classical Fourier expansion method by means of spherical harmonics, not only in the frequency domain but also in the space domain. It is characteristic for Fourier techniques that the spherical harmonics, as polynomial trial functions, admit no localization in the space domain, while in the frequency domain they always correspond to exactly one degree, i.e., frequency, and are therefore said to show ideal frequency localization. Because of the ideal frequency localization and the simultaneous absence of space localization, local changes of fields (signals) in the space domain affect the whole table of orthogonal (Fourier) coefficients. This, in turn, causes global changes of the corresponding (truncated) Fourier series in the space domain. Nevertheless, ideal frequency localization is often helpful for meaningful physical interpretations by relating the different observables of a geopotential to each other at a fixed frequency, e.g., in the Meissl scheme of physical geodesy (see [200, 218], and the references therein). Taking these aspects of spherical harmonics modeling by Fourier series into account, trial functions which simultaneously show ideal frequency localization as well as ideal space localization would be a desirable choice.
In fact, such an ideal system of trial functions would admit models of highest spatial resolution that would be expressible in terms of single frequencies. However, by the uncertainty principle, space and frequency localization are mutually exclusive. In conclusion, Fourier expansion methods are well suited to resolve low- and medium-frequency phenomena, i.e., the "trends" of a signal, while their application to obtain high resolution in global or local models is critical. This difficulty is also well known in theoretical physics, e.g., when describing monochromatic electromagnetic waves or considering the quantum-mechanical treatment of free particles. There, plane waves with fixed frequencies (ideal frequency localization, no space localization) are the solutions of the corresponding differential equations, but they certainly do not reflect the physical reality. As a remedy, plane waves of different frequencies are superposed into "wave-packages" that gain a certain amount of space localization while losing their ideal spectral localization. In a similar way, we are confronted with the following situation: a suitable superposition of spherical harmonics (cf. Fig. 1) leads to kernel functions with a reduced frequency, but increased space, localization.

Example: Localization of the Legendre Kernel. Recalling the setting P_n* = √((2n+1)/(4π)) P_n (see Fig. 2) we have


Fig. 2 The Legendre kernel P_n* for n = 2, 5, 9: space representation ϑ ↦ P_n*(cos ϑ), ϑ ∈ [−π, π] (left), and frequency representation m ↦ (P_n*)^∧(m) (right)

∫_{S²} (P_n*(ξ·ζ))² dS(ζ) = 1    (22)

for all ξ ∈ S², such that the uncertainty principle is reflected by

g^space_{P_n*(ξ·)} = 0,    σ^space_{P_n*(ξ·)} = 1,    (23)

g^frequency_{P_n*(ξ·)} = n(n+1),    σ^frequency_{P_n*(ξ·)} = 0.    (24)

Example: Localization of Bandlimited/Non-spacelimited Shannon Kernels. The dyadic Shannon scaling function is given by

SH_{0,...,2^J−1}(ξ·η) = Σ_{n=0}^{2^J−1} SH^∧_{0,...,2^J−1}(n) ((2n+1)/(4π)) P_n(ξ·η),    ξ, η ∈ S²,    (25)

where

SH^∧_{0,...,2^J−1}(n) = 1 if n ∈ [0, 2^J), and 0 else.    (26)

The dyadic Shannon scaling function SH_{0,...,2^J−1} shows undesired oscillation phenomena, which increase heavily with increasing scale J (see Fig. 3). Mathematically, oscillations are understood as sign changes of the slope. In the Shannon case it is not hard to see in a more qualitative way that SH_{0,...,2^J−1} possesses only simple zeros, which imply slope changes. It is not difficult to see (cf., e.g., [90]) that the space uncertainty of the L²(S²)-normalized Shannon scaling function S̃H_{0,...,2^J−1}(·,η) of scale J is given by

Fig. 3 The normalized dyadic Shannon kernel S̃H_{0,...,2^J−1}(cos ϑ) for J = 2, J = 3, and J = 4. The illustrations show the space representation ϑ ↦ S̃H_{0,...,2^J−1}(cos ϑ), ϑ ∈ [−π, π] (left) and the frequency representation n ↦ (S̃H_{0,...,2^J−1})^∧(n), n ∈ N₀ (right)

Δ^space_{S̃H_{0,...,2^J−1}(·,η)} = √(2^{J+1} − 1) / (2^J − 1),    (27)

while the uncertainty in frequency is given by

Δ^frequency_{S̃H_{0,...,2^J−1}(·,η)} = √(2^{2J} − 1) / √2.    (28)

Thus it follows that the uncertainty product is given by

Δ^space_{S̃H_{0,...,2^J−1}(·,η)} Δ^frequency_{S̃H_{0,...,2^J−1}(·,η)} = ((2^{J+1} − 1)(2^J + 1) / (2^{J+1} − 2))^{1/2}.    (29)
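Equations (27)-(29) can be verified numerically. The following sketch computes the uncertainties of the normalized Shannon kernel by Gauss-Legendre quadrature, evaluating the frequency expectation as 2π∫(1−t²) f′(t)² dt (an identity assumed here):

```python
import numpy as np
from numpy.polynomial import legendre as L

# Sketch: uncertainties of the normalized Shannon scaling function, Eqs. (27)-(29).
def uncertainties(c, nodes=400):
    t, w = L.leggauss(nodes)
    f, df = L.legval(t, c), L.legval(t, L.legder(c))
    norm2 = 2*np.pi * np.sum(w * f**2)
    g = 2*np.pi * np.sum(w * t * f**2) / norm2             # space expectation
    gf = 2*np.pi * np.sum(w * (1 - t**2) * df**2) / norm2  # frequency expectation
    return np.sqrt(1 - g**2) / g, np.sqrt(gf)              # Eqs. (20), (21)

for J in (2, 3, 4):
    c = [(2*n + 1) / (4*np.pi) for n in range(2**J)]       # Shannon coefficients
    ds, dfreq = uncertainties(c)
    assert abs(ds - np.sqrt(2**(J+1) - 1) / (2**J - 1)) < 1e-9    # Eq. (27)
    assert abs(dfreq - np.sqrt((2**(2*J) - 1) / 2)) < 1e-9        # Eq. (28)
    # the product (29) grows roughly like 2^(J/2)
```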

It is obvious that this product diverges for J → ∞, as shown in Fig. 4. It is remarkable that the situation for the Bernstein kernel is different, as shown in the next example.

Example: Localization of Bandlimited/Non-spacelimited Bernstein Kernels. The dyadic Bernstein scaling function (ξ,η) ↦ BE_{0,...,2^J−1}(ξ·η), ξ, η ∈ S², is given by

BE_{0,...,2^J−1}(ξ·η) = (2^{J−2}/π) ((1 + ξ·η)/2)^{2^J−1},    ξ, η ∈ S².    (30)

Clearly, we are able to understand the kernel BE_{0,...,2^J−1} as a bandlimited radial basis function. The name Bernstein is motivated by the observation that the kernel


Fig. 4 Uncertainty classification of the dyadic Shannon kernel. Shown are the functions J ↦ Δ^space_{S̃H_{0,...,2^J−1}}, J ↦ Δ^frequency_{S̃H_{0,...,2^J−1}}, and J ↦ Δ^space_{S̃H_{0,...,2^J−1}} Δ^frequency_{S̃H_{0,...,2^J−1}}. Note that the product diverges for J → ∞

Fig. 5 The normalized dyadic Bernstein kernel B̃E_{0,...,2^J−1}(cos ϑ) for J = 2, J = 3, and J = 4. The illustrations show the space representation ϑ ↦ B̃E_{0,...,2^J−1}(cos ϑ), ϑ ∈ [−π, π] (left) and the frequency representation n ↦ (B̃E_{0,...,2^J−1})^∧(n), n ∈ N₀ (right)

BE_{0,...,2^J−1}(ξ·η), ξ, η ∈ S², is proportional to the Bernstein polynomials. From [90] we know the explicit representation of the symbol (cf. [50]). For an illustration of the dyadic Bernstein kernel and its symbol, see Fig. 5.

The space uncertainty of the L²(S²)-normalized dyadic Bernstein scaling function B̃E_{0,...,2^J−1}(·,η) of scale J is given by

Δ^space_{B̃E_{0,...,2^J−1}(·,η)} = √(2^{J+1} − 1) / (2^J − 1).    (31)

The uncertainty in frequency is given by

Δ^frequency_{B̃E_{0,...,2^J−1}(·,η)} = √(2^J − 1) / √2.    (32)

For the uncertainty product we therefore obtain

Δ^space_{B̃E_{0,...,2^J−1}(·,η)} Δ^frequency_{B̃E_{0,...,2^J−1}(·,η)} = ((2^{J+1} − 1) / (2^{J+1} − 2))^{1/2}.    (33)
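The same quadrature check as for the Shannon kernel applies here; the sketch below confirms (31)-(33), again computing the frequency expectation via 2π∫(1−t²) f′² dt (an assumed identity):

```python
import numpy as np

# Sketch: uncertainties of the Bernstein kernel of Eq. (30), cf. Eqs. (31)-(33).
t, w = np.polynomial.legendre.leggauss(600)
for J in (2, 4, 6):
    K = 2**J - 1
    f = (2**(J - 2) / np.pi) * ((1 + t) / 2)**K
    df = (2**(J - 2) / np.pi) * (K / 2) * ((1 + t) / 2)**(K - 1)
    norm2 = 2*np.pi * np.sum(w * f**2)
    g = 2*np.pi * np.sum(w * t * f**2) / norm2
    gf = 2*np.pi * np.sum(w * (1 - t**2) * df**2) / norm2
    d_space, d_freq = np.sqrt(1 - g**2) / g, np.sqrt(gf)
    assert abs(d_space - np.sqrt(2**(J+1) - 1) / (2**J - 1)) < 1e-8   # Eq. (31)
    assert abs(d_freq - np.sqrt((2**J - 1) / 2)) < 1e-8               # Eq. (32)
    assert d_space * d_freq < np.sqrt(2)    # product stays bounded, Eq. (33)
```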

This product tends to 1 for J → ∞. For an illustration of the uncertainty for the dyadic Bernstein kernel, see Fig. 6. The dyadic Bernstein scaling function constitutes a bandlimited scaling function that is free of oscillations. Moreover, the Bernstein kernel minimizes its uncertainty product in the limit; but unlike the Gaussian kernel, which also possesses this property (see, e.g., [72, 102]), this minimizer is bandlimited. In detail, the uncertainty product of the Bernstein scaling function tends to 1.

Example: Localization of Non-bandlimited/Non-spacelimited Abel-Poisson Kernels. Let us consider the Abel-Poisson kernel Q_h : S² × S² → R, h < 1, given by

Q_h(ξ·η) = (1/(4π)) (1 − h²)/(1 + h² − 2h ξ·η)^{3/2} = Σ_{n=0}^{∞} ((2n+1)/(4π)) h^n P_n(ξ·η).    (34)

Fig. 6 Uncertainty classification of the dyadic Bernstein kernel. Shown are the functions J ↦ Δ^space_{B̃E_{0,...,2^J−1}}, J ↦ Δ^frequency_{B̃E_{0,...,2^J−1}}, and J ↦ Δ^space_{B̃E_{0,...,2^J−1}} Δ^frequency_{B̃E_{0,...,2^J−1}}. Note that the product converges to 1 for J → ∞


An easy calculation gives us

‖Q_h‖_{L²[−1,1]} = (Q_{h²}(1))^{1/2} = ((1 + h²)/(4π))^{1/2} (1/(1 − h²)).    (35)

Furthermore, for Q̃_h(t) = ‖Q_h‖⁻¹_{L²[−1,1]} Q_h(t), t ∈ [−1,1] (see also Fig. 7), we obtain after an elementary calculation

Δ^space_{Q̃_h} = (1 − h²)/(2h),    Δ^frequency_{Q̃_h} = √6 h/(1 − h²).    (36)

Thus we finally obtain (see Fig. 8)

Δ^space_{Q̃_h} Δ^frequency_{Q̃_h} = √6/2 > 1.    (37)
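A numerical confirmation of (36) and (37) (a sketch; the derivative of Q_h and the quadrature parameters are spelled out explicitly, and the frequency expectation is evaluated as 2π∫(1−t²) q′² dt, an assumed identity):

```python
import numpy as np

# Sketch: uncertainties of the normalized Abel-Poisson kernel, Eqs. (36)-(37).
t, w = np.polynomial.legendre.leggauss(800)
for h in (0.2, 0.5, 0.8):
    q = (1 - h**2) / (4*np.pi * (1 + h**2 - 2*h*t)**1.5)
    dq = 3*h*(1 - h**2) / (4*np.pi * (1 + h**2 - 2*h*t)**2.5)   # dq/dt
    norm2 = 2*np.pi * np.sum(w * q**2)
    g = 2*np.pi * np.sum(w * t * q**2) / norm2
    gf = 2*np.pi * np.sum(w * (1 - t**2) * dq**2) / norm2
    d_space, d_freq = np.sqrt(1 - g**2) / g, np.sqrt(gf)
    assert abs(d_space - (1 - h**2) / (2*h)) < 1e-8             # Eq. (36)
    assert abs(d_freq - np.sqrt(6)*h / (1 - h**2)) < 1e-8       # Eq. (36)
    assert abs(d_space * d_freq - np.sqrt(6)/2) < 1e-8          # Eq. (37)
```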

Note that, in this case, the value Δ^space_{Q̃_h} Δ^frequency_{Q̃_h} is independent of h. All intermediate cases of "space-frequency localization" are realized by the Abel-Poisson kernel, but it should be emphasized that the Abel-Poisson kernel does not satisfy a minimum uncertainty state.

Example: Localization of the Dirac Kernel. Letting h formally tend to 1 in the results provided by the uncertainty principle for the Abel-Poisson kernel function, we are able to interpret the localization properties of the Dirac kernel on S² satisfying δ^∧(n) = 1 for all n ∈ N₀, so that

Fig. 7 The Abel-Poisson kernel Q̃_h for h = 0.2, 0.5, 0.8: space representation ϑ ↦ Q̃_h(cos ϑ), t = ξ·η = cos ϑ, ϑ ∈ [−π, π] (left), and frequency representation m ↦ (Q̃_h)^∧(m) (right)


Fig. 8 Abel-Poisson kernel uncertainty classification. The illustration shows the functions h ↦ Δ^space_{Q̃_h}, h ↦ Δ^frequency_{Q̃_h}, and h ↦ Δ^space_{Q̃_h} Δ^frequency_{Q̃_h}

Fig. 9 The Haar kernel B̃^(0)_h for h = 0.3, 0.7, 0.9. The illustrations show the space representation ϑ ↦ B̃^(0)_h(cos ϑ), ϑ ∈ [−π, π] (left) and the frequency representation n ↦ (B̃^(0)_h)^∧(n), n ∈ N₀ (right)

δ(ξ·η) = Σ_{n=0}^{∞} ((2n+1)/(4π)) P_n(ξ·η),    ξ, η ∈ S²,    (38)

where the convergence is understood in the distributional sense. As a matter of fact, letting h tend to 1 shows us that the variances in the space domain take the constant value 0. On the other hand, the variances in the frequency domain converge to ∞. Hence, the Dirac kernel shows ideal space localization, but no frequency localization (Fig. 9).


Example: Localization of Non-bandlimited/Non-spacelimited Gaussian Kernels. The minimum uncertainty state within the uncertainty relation is provided by the Gaussian probability density function (see [72, 153]). Consider the function G_λ given by

G_λ(t) = e^{−(λ/2)(1−t)},    t ∈ [−1,1],    λ > 0.    (39)

An elementary calculation shows us that

G̃_λ(t) = γ(λ) e^{−(λ/2)(1−t)},    (40)

with

γ(λ) = (1/√(4π)) ((1/(2λ))(1 − e^{−2λ}))^{−1/2},    (41)

satisfies ‖G̃_λ‖_{L²[−1,1]} = 1. It is not difficult to deduce that Δ^space_{G̃_λ} Δ^frequency_{G̃_λ} → 1 as λ → ∞ (note that the best value of the uncertainty principle is 1).

Example: Localization of Non-bandlimited/Spacelimited Haar Kernels. The Haar kernel is given by

B^(0)_h(t) = 0 for t ∈ [−1, h), and B^(0)_h(t) = 1 for t ∈ [h, 1].    (42)

The Haar kernel is discontinuous, so that we also consider smoothed versions of it. More explicitly, for k ∈ N, the smoothed Haar kernels are defined by

B^(k)_h(t) = 0 for t ∈ [−1, h), and B^(k)_h(t) = (t − h)^k/(1 − h)^k for t ∈ [h, 1].    (43)

For k ∈ N₀, an elementary calculation gives

‖B^(k)_h‖²_{L²(S²)} = 2π ∫_{−1}^{1} [B^(k)_h(t)]² dt = 2π (1 − h)/(2k + 1).    (44)

This leads to the normalized Haar kernels

B̃^(k)_h = √((2k + 1)/(2π(1 − h))) B^(k)_h,    (45)

since our uncertainty properties are formulated for kernels with norm 1. An elementary calculation (see [90] for more details) yields

Δ^space_{B̃^(k)_h} = (1/(1 + h + 2k)) √((1 − h)(h + 4k + 3)).    (46)

For the localization in frequency we have

Δ^frequency_{B̃^(k)_h} = √(k(h + 2k)/((1 − h)(2k − 1))).    (47)
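Equations (46) and (47) can likewise be checked numerically (a sketch; h, k, and the node count are illustrative, and the frequency expectation is again computed via 2π∫(1−t²) f′² dt, an assumed identity):

```python
import numpy as np

# Sketch: uncertainties of the smoothed Haar kernel of Eq. (43), Eqs. (46)-(47).
t, w = np.polynomial.legendre.leggauss(2000)
h, k = 0.25, 2
f = np.where(t >= h, (t - h)**k / (1 - h)**k, 0.0)
df = np.where(t >= h, k * (t - h)**(k - 1) / (1 - h)**k, 0.0)
norm2 = 2*np.pi * np.sum(w * f**2)
g = 2*np.pi * np.sum(w * t * f**2) / norm2
gf = 2*np.pi * np.sum(w * (1 - t**2) * df**2) / norm2
d_space, d_freq = np.sqrt(1 - g**2) / g, np.sqrt(gf)
assert abs(d_space - np.sqrt((1 - h)*(h + 4*k + 3)) / (1 + h + 2*k)) < 1e-6  # (46)
assert abs(d_freq - np.sqrt(k*(h + 2*k) / ((1 - h)*(2*k - 1)))) < 1e-6       # (47)
```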

Figure 10 gives a graphical impression of these results for the special cases k = 1 and k = 3. For particular choices of the values h, k, we obtain the expected results: non-bandlimited kernels show a much stronger space localization than their bandlimited counterparts.

Empirically, if K^∧(n) ≈ K^∧(n+1) ≈ 1 for many successive large integers n, then the support of the series (12) in the space domain is small, i.e., the kernel is spacelimited (once more, in the jargon of approximation theory, "locally supported"). This leads to the other extremal kernel (in contrast to the Legendre kernel), which is the Dirac kernel with optimal space localization but no frequency localization and K^∧(n) = 1 for all n. However, the Dirac kernel does not exist as a classical function in the mathematical sense; it is a generalized function (i.e., a distribution). Nevertheless, it is well known that, if we have a family of kernels {K_J}, J = 0, 1, ..., where lim_{J→∞} K^∧_J(n) = 1 for each n and an additional (technical) condition holds, then {K_J} is an approximate identity, i.e., K_J ∗ F tends to F in the sense of L²(S²) for all F ∈ L²(S²), or to F in C^(0)(S²) for all F ∈ C^(0)(S²), respectively, so that the Dirac kernel may be formally understood (in a certain topology) as the limit of the family {K_J}:

lim_{J→∞} K_J(ξ·η) = δ(ξ·η),    ξ, η ∈ S²,    (48)

Fig. 10 Uncertainty classification of the normalized smoothed Haar kernel function B̃^(k)_h (k = 1, left; k = 3, right). Δ^space_{B̃^(k)_h}, Δ^frequency_{B̃^(k)_h}, and Δ^space_{B̃^(k)_h} Δ^frequency_{B̃^(k)_h} are shown as functions of h ∈ [−1, 1]


Table 3 Different types of zonal kernels: bandlimited, spacelimited, and non-spacelimited/non-bandlimited

  General case: from the Legendre kernels (K^∧(n) = δ_{n,k}) to the Dirac kernel (K^∧(n) = 1, n = 0, 1, ...)
  Bandlimited: K^∧(n) = 0 for n > N;  example: Shannon, K^∧(n) = 1 for n ≤ N
  Spacelimited: K(ξ·η) = 0 for 1 − ξ·η ≥ δ;  example: Haar, K(ξ·η) = 1 for 1 − ξ·η ≤ δ

in the space domain or, equivalently,

lim_{J→∞} K^∧_J(n) = 1,    n ∈ N₀,    (49)

in the frequency domain. Assuming lim_{n→∞} K^∧(n) = 0, we are led to the assertion that the slower the sequence {K^∧(n)}, n = 0, 1, ..., converges to zero, the lower the frequency localization and the higher the space localization. A unifying scheme can be found in Table 3. Kernel function approximation comes in spline and wavelet specifications, naturally based on certain realizations of frequency and space localization. Obviously, if a certain accuracy is to be guaranteed in kernel function approximation, adaptive sample point grids are required for the resulting extension of the spatial area determined by the kernels under investigation.
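A tiny spectral sketch of the approximate identity property (48)/(49), using an assumed Abel-Poisson parametrization h_J = 1 − 2^{−J} and a hypothetical toy coefficient vector (by Parseval, the L² error of K_J ∗ F is computable from the coefficients alone):

```python
import numpy as np

# Sketch: approximate identity, Eqs. (48)-(49), with K_J = Q_{h_J}, h_J = 1 - 2^-J
# (an assumed parametrization). (K_J * F)^(n,k) = h_J^n F^(n,k).
F_hat = np.array([1.0, 0.5, -0.3, 0.2, 0.1])       # toy coefficients, degrees 0..4
n = np.arange(F_hat.size)

def l2_error(J):
    hJ = 1.0 - 2.0**(-J)
    return np.sqrt(np.sum(((hJ**n - 1.0) * F_hat)**2))   # ||K_J * F - F||

errs = [l2_error(J) for J in range(1, 8)]
assert all(a > b for a, b in zip(errs, errs[1:]))  # error decreases with the scale
assert errs[-1] < 1e-2                             # K_J * F -> F
```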

5 Spline Approximation

If data sets on the sphere are localized, approximation problems can be attacked by applying tensor product techniques in terms of coordinates originally designed for bivariate Euclidean nomenclature. However, global problems like the determination of the gravitational field, the magnetic field, tectonic movements, ocean circulation, climate change, and hydrological and meteorological quantities involve essentially the entire surface of the sphere, so that modeling the data as arising in Euclidean two-space via latitude-longitude coordinate separation is no longer appropriate. Even more, since there is no differentiable mapping of the entire sphere to a bounded planar region, there is a need to develop approximations such as sampling methods over the sphere itself, thereby avoiding (artificially occurring) singularities. Looking at a numerically efficient and economic global model in today's literature, a spherical (and usually not a physically more suitable ellipsoidal)


Table 4 The uncertainty principle and its consequences

  Space localization:      no space localization → ideal space localization
  Frequency localization:  ideal frequency localization → no frequency localization
  Kernel type:             Legendre kernel → bandlimited → spacelimited → Dirac kernel

reference shape of the Earth has been taken into account in almost all practical approximations because of its conceptual simplicity, numerical computability, and approximate accuracy (Table 4). Starting from the time of Laplace, Legendre, and Gauss (see, e.g., [110, 154, 156]), the context of spherical harmonics is a well-established tool, particularly for access to the inversion of problems under the assumption of a bandlimited model of the Earth's gravitational and/or magnetic field. Nowadays, spectral reference models, i.e., Fourier expansions in terms of spherical harmonics of the Earth's gravitational and magnetic potential, are widely known by tables of expansion coefficients as frequency-determined constituents.

In this respect, it should be mentioned that geoscientific modeling demands its own nature. Concerning modeling, one is usually not interested in the separation into scalar Cartesian component functions involving product ingredients. Instead, inherent physical properties should be observed. For example, the deflections of the vertical form a vector-isotropic surface gradient field on the Earth's surface, the equations for (geostrophic) ocean (surface) flow involving geoidal undulations (heights) imply a divergence-free vector-isotropic nature, and satellite gradiometer data lead to tensor-isotropic Hesse fields. As a consequence, in a geoscientifically reflected spherical framework, all these physical constraints result in a formulation by rotation-invariant pseudodifferential operators. Hence, rotational symmetry (isotropy) is an indispensable ingredient (see, e.g., [72, 246]) in the bridging transformation that relates geophysical and/or geodetic quantities, i.e., the object parameters, to the observed and/or measured data sets, and vice versa.

Commonly, zonal functions on the sphere, recognized as positive definite kernels, may be interpreted as generating reproducing kernels of Sobolev spaces.
This observation is the reason why spherical splines may be based on a variational


approach (cf. [19, 55, 56, 62, 72, 253, 254]) that minimizes a weighted Sobolev norm of the interpolant, with a large class of spline manifestations provided by pseudodifferential operators being at the disposal of the user. A Sobolev space framework involving rotation-invariant pseudodifferential operators (as originated by observables, e.g., in physical geodesy) shows some important benefits of spline interpolation as a preparatory tool for spherical sampling (see [90]). Accordingly, we are confronted with the following situation:

1. Interpolating/smoothing splines are canonically based on a variational approach (see [72, 90]) that minimizes a weighted Sobolev norm of the interpolating/smoothing spline, with a large class of spline manifestations provided by pseudodifferential operators being at the disposal of the user. Regularly distributed as well as scattered data systems can be handled.
2. Artificial singularities caused by the use of (polar) coordinates in global approximation can be avoided totally.
3. The rotational invariance of observables (such as gravity anomalies, gravity disturbances, and disturbing potentials in geodetic theory) is perfectly maintained and can be adequately handled with Meissl schemes for geodetic observables (see [200, 221]).
4. Measurement errors can be handled by an adapted interpolation/smoothing spline method. Error bounds can be derived that include computable constants rather than only being given in terms of the order of convergence of the maximum distance from any point in the domain to the nearest data point.
5. Spline spaces serve as canonical reference spaces for purposes of spherical sampling relative to finite as well as infinite scattered data distributions.
6. Spherical splines provide approximations using kernel functions with a fixed "window", i.e., a preassigned frequency and space relation.
7. The accuracy of spline approximation can be controlled by decreasing the sampling width.
8. Global spherical spline interpolation in terms of zonal, i.e., radial basis, functions has its roots in physically motivated problems of minimizing a "(linearized) curvature energy" variational problem consistent with the data points. Numerical experience with the linear system of equations has shown that the system tends to be ill-conditioned when the number of data points is large. Clearly, oscillation phenomena in spline interpolation may occur for larger data gaps.

Spherical splines were independently developed in [53, 56], and [253]. A spline survey on the unit sphere S², including aspects of spherical sampling, is given, e.g., in [63, 65, 69, 72, 74, 102, 189, 191, 254] (see also the references therein). The numerical computation may be organized very efficiently and economically by fast multipole methods in combination with near/far field procedures (for more details the reader is referred to [117–120]). For generalizations to non-spherical manifolds and the use in boundary value problems of elliptic partial differential equations the reader is, e.g., referred to [51,


56,57,59–63,71,74,84,85,94,234], and the references therein. Applications in flow problems of meteorology and ocean circulation are given, e.g., in [49, 65, 82]. A multivariate manifestation of spline interpolation in Paley-Wiener spaces with relations to Shannon-type sampling can be found in [88].
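A minimal collocation sketch of zonal-kernel interpolation on the sphere (not the chapter's full variational machinery): the Abel-Poisson kernel is strictly positive definite on S², since its symbol h^n is positive, so the collocation matrix is invertible. The node construction, the kernel parameter h, and the test signal below are illustrative assumptions:

```python
import numpy as np

# Sketch: interpolation with a zonal (radial basis) kernel on S^2; illustrative setup.
N = 30
i = np.arange(N)
phi = np.pi * (3.0 - np.sqrt(5.0)) * i                  # golden-angle longitudes
z = 1.0 - (2.0*i + 1.0) / N                             # evenly spaced heights
r = np.sqrt(1.0 - z**2)
X = np.column_stack((r*np.cos(phi), r*np.sin(phi), z))  # scattered nodes on S^2

h = 0.8
qh = lambda t: (1 - h**2) / (4*np.pi * (1 + h**2 - 2*h*t)**1.5)  # Abel-Poisson
F = lambda P: np.exp(P[:, 2])                           # hypothetical test signal

A = qh(X @ X.T)                 # collocation matrix K(xi_i . xi_j), positive definite
a = np.linalg.solve(A, F(X))    # spline coefficients
S = lambda P: qh(P @ X.T) @ a   # spline evaluation at arbitrary points

assert np.max(np.abs(S(X) - F(X))) < 1e-6               # interpolates the data
```

The choice of h trades off conditioning against locality, mirroring the fixed "window" of item 6 above.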

6 Wavelet Approximation

The wavelet approach is a more flexible approximation method than the Fourier expansion approach in terms of spherical harmonics or the variational spline method using zonal kernels. Due to the fact that variable “window kernel functions”, i.e., zooming-in, are applied, a substantially better representation of the high-frequency “short-wavelength” part of a function is achievable (global to scale-dependent local approximation). The zooming-in procedure allows higher global resolutions and, therefore, gives a better exposure of the strong correlation between the function (signal) under consideration and the local phenomena that should be modeled. Furthermore, the multiscale analysis can be used to modify and improve the standard approach in the sense that a local approximation can be established within a global orthogonal (Fourier) series and/or spline concept (see [72]). In essence, the characterization of spherical wavelets comprises three basic attributes:
1. Basis Property. Wavelets are building blocks for the approximation of arbitrary functions (signals). In mathematical terms this formulation expresses that the set of wavelets forms a “frame” (see, e.g., [38] and the references therein for details in classical one-dimensional theory).
2. Decorrelation. Wavelets possess the ability to decorrelate the signal. This means that the representation of the signal via wavelet coefficients occurs in a “more constituting” form than in the original form, reflecting a certain amount of space and frequency information. The decorrelation enables the extraction of specific information contained in a signal through a particular locally reflected number of coefficients. Signals usually show a correlation in the frequency domain as well as in the space domain. Obviously, since data points in a local neighborhood are more strongly correlated than those data points far-off from each other, signal characteristics often appear in certain frequency bands.
In order to analyze and reconstruct such signals, we need “auxiliary functions” providing localized information in the space as well as in the frequency domain. Within a “zooming-in process”, the amount of frequency as well as space contribution can be specified in a quantitative way. A so-called scaling function forms a compromise in which a certain balanced amount of frequency and space localization in the sense of the uncertainty principle is realized. As a consequence, each scaling function on the sphere depends on two variables, namely a rotational parameter (defining the position) and a dilational (scaling) parameter, which controls the amount of space localization to be available at the price of the


W. Freeden and M. Schreiner

frequency localization, and vice versa. Associated with each scaling function is a wavelet function, which in its simplest manifestation may be understood as the difference of two successive scaling functions. Wavelets exist as bandlimited and non-bandlimited variants. Filtering (convolution) with a scaling function acts as a low-pass filter, while convolution with the corresponding wavelet function provides a band-pass filtering. A multiscale approximation of a signal is the successive realization of an efficient (approximate identity realizing) evaluation process by use of scaling and wavelet functions which show more and more space localization at the cost of frequency localization. The wavelet transform within a multiscale approximation lays the foundation for the decorrelation of a signal.
3. Efficient Algorithms. Wavelet transforms provide efficient algorithms because of their space localizing character. The successive decomposition of the signal by use of wavelets at different scales offers the advantage of efficient and economic numerical calculations. The detail information stored in the wavelet coefficients leads to a reconstruction from a rough to a fine resolution and to a decomposition from a fine to a rough resolution in the form of tree algorithms. In particular, the decomposition algorithm is an excellent tool for the post-processing of a signal into “constituting blocks” by decorrelation, e.g., the specification of signature bands corresponding to certain signal specifics. As a consequence, spherical wavelets may be regarded as constituting multiscale building blocks, which provide a fast and efficient way to decorrelate a given signal data set.
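In symbol (frequency) space, the construction of a wavelet as the difference of two successive scaling functions can be sketched in a few lines. The Abel–Poisson-type scaling symbol and the dyadic scale sequence h_j = 1 − 2^(−j) below are illustrative choices, not the only ones found in the cited literature:

```python
import numpy as np

def scaling_symbol(n, j):
    # Low-pass symbol at scale j: (h_j)^n with h_j = 1 - 2^(-j) -> 1 as j grows,
    # so more and more spherical harmonic degrees n pass the filter
    return (1.0 - 2.0 ** (-j)) ** n

def wavelet_symbol(n, j):
    # Simplest difference-type wavelet: two successive scaling symbols
    # subtracted, i.e., a band-pass between scales j and j+1
    return scaling_symbol(n, j + 1) - scaling_symbol(n, j)

degrees = np.arange(64)
# Telescoping multiscale reconstruction: coarse low-pass plus band-pass details
recon = scaling_symbol(degrees, 2) + sum(
    wavelet_symbol(degrees, j) for j in range(2, 8)
)
# recon coincides with the scaling symbol at the finest scale used (j = 8)
```

The telescoping identity mirrors the tree algorithms mentioned above: summing the detail (wavelet) contributions to a coarse low-pass approximation yields the low-pass approximation at the finest scale.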
The properties (basis property, decorrelation, and efficient algorithms) are common features of all wavelets, so that these attributes form the key for a variety of applications, particularly for signal reconstruction and decomposition, thresholding, data compression, and denoising by use of, e.g., a multiscale signal-to-noise ratio. The essential power of spherical wavelets is based on the “zooming-in” property, i.e., scale-dependent varying amounts of both frequency and space localization. This multiscale structure is the reason why spherical wavelets can be used as a mathematical means for breaking up a complicated structure of a function into many simple pieces at different scales and positions. It should be pointed out that several approaches involving spherical wavelets have been established, all of them providing multiscale approximation, but not all of them showing a structural “breakthrough” in the form of a multiresolution of the whole reference space by nested detail spaces. In all cases, however, wavelet modeling is provided by a two-parameter family reflecting the different levels of localization and scale resolution. From a historical point of view, early multiscale procedures based on spherical harmonics “sum conglomerates” in the form of scale-dependent zonal kernels (i.e., scaling functions) can be split into two types:
1. A continuous wavelet transform (and its scale discretizations) was obtained by taking particular advantage of the space/frequency concept of spherical singular integrals, more concretely, singular integrals generating (scale-dependent) approximate identities (cf. [76, 99]). In the case of the continuous wavelet transform (see [72]), the question of a complete characterization of a function


can be answered immediately using the classical continuous concept of singular integrals by the resolution of the identity, at least if the wavelets are “admissible” in some sense. The scale relation between the singular integral kernel and the wavelet kernel is determined by certain spectral differential equations in terms of their Legendre coefficients (see, e.g., [72]). Within this framework the wavelets turn out to be (not necessarily isotropic) “product” kernel functions generated by summing up certain clusters of spherical harmonic expressions. The techniques of exact space discretization by use of spherical bandlimited wavelets were first presented in [106, 107]. The bandlimited/non-bandlimited wavelet constructions are definable in two ways, either by increasing space localization of the kernels or by decreasing frequency localization of their corresponding symbols (i.e., spherical Fourier transforms).
2. The monograph [72] presents a scale discrete wavelet transform involving bandlimited as well as non-bandlimited kernel representations by forming the so-called P-scale or M-scale (or other type) wavelet representations. With the help of approximate or exact (spherical harmonics or spherical splines) interpolatory formulas all wavelet transforms allow the transfer to “fully discrete” variants, e.g., via tree algorithms (see [63, 72]).

Often in geosciences fast approximation procedures are required for large amounts of data. Whereas global methods like approximation with spherical harmonics have proven to be reliable for global trend resolution [220, 223], the focus for local reconstructions has shifted to “zooming-in” techniques involving wavelets [72, 237, 262]. It is evident that wavelet approximation relating to (parts of) R2 is much better studied than wavelet approximation on the sphere (see, e.g., [163]). In fact, there exists a huge number of wavelets for very different purposes.
In particular, there is a variety of wavelets which are orthogonal or show other very interesting properties such as compact support (cf. [38]). Specifically, the usage of filter banks makes a 2D wavelet transform very economical. A disadvantage is that one cannot easily deal with data which are not given on a regular grid. Nonetheless there are, e.g., lifting schemes for scattered data (cf. [135]), so that the speed disadvantage in comparison to grid-based methods is manageable. Another possibility of a plane reduction is the use of non-grid-based FFT techniques [211] and the realization of the wavelet transform via the known representation in the 2D Fourier domain. At this point, it should be noted that intrinsic situations on the sphere are rather different. All wavelet methods usually suffer from the property that they are not canonically parallelizable. Usually, the characteristic feature of wavelets is that a single function, the so-called mother wavelet (see, e.g., [38]), is dilated to cover different frequency bands and is shifted to cover the spatial domain. Any function possessing a cap as its support cannot be shifted around on the sphere so as to cover it without overlap. This observation considerably limits the use of an isotropic mother wavelet because one has to solve larger systems of equations to counter the non-orthogonality of the wavelets. Due to the complexity of introducing wavelets on the sphere, a number of different approaches have been proposed over the last three decades. An incomplete list can be found below (see also the schematic illustration at the end of this paper):


• Many contributions create wavelets based on tensor products [36], operate on subdomains [35, 131, 258], or use other analytical construction methods [121], see also [90].
• The papers [6, 7] develop abstract conditions for wavelets and construct corresponding wavelets. These types of wavelets, like many others, need approximate integration on the sphere, which causes additional work due to the right choice of the integration weights (unless one is working on very special grids). Furthermore, it is not immediately clear how these wavelet types can be made physically meaningful, for instance in physical geodesy, to allow for an easy harmonic continuation to the three-dimensional space.
• Other approaches relate to implementations involving certain wavelet kernels (see, e.g., [131, 153, 179, 190, 212, 248, 263]). The bandlimited approaches lack fast algorithms of the kind provided, e.g., by spacelimited wavelets, but they offer excellent spectral possibilities to compare features of geophysical and geodetic quantities (note that an important example is the already mentioned Meissl scheme in physical geodesy).
• The constructions [72, 99, 106, 107] yield frequency reflected wavelets by using the specific spectral properties involving the theory of spherical harmonics. This spectral approach can be extended to the three-dimensional space, e.g., for solving boundary value problems for the Laplace operator, the Helmholtz operator, and the Cauchy-Navier operator.
• The contribution [80] gives a construction of spherical wavelets based on jump and limit relations of single and double layer surface potentials for the Laplace equation. All the resulting (linear and bilinear) wavelet types are explicitly available in closed form in the space as well as the frequency domain.
Moreover, this ansatz can be extended to other elliptic partial differential equations in R3 (such as the Helmholtz equation [81], the Cauchy-Navier equation of elasticity [1], the Stokes equation [170]) for use in boundary value problems corresponding to geoscientifically relevant boundaries (such as the sphere, the ellipsoid, and the actual Earth’s surface).
• The vast field of boundary integral equations has also led to the creation of wavelet methods and fast wavelet algorithms even on manifolds (see, e.g., [80, 124, 152]).
• A dilemma is the construction of isotropic non-bandlimited wavelets on the sphere which are orthogonal (note that different wavelet spaces for bandlimited isotropic wavelets can be orthogonal when properly constructed, cf. [72]). The bandlimited Shannon-type wavelets (see, e.g., [101]) are orthogonal and isotropic, but spacelimited Haar-type wavelets (cf. [76]) show orthogonality on the sphere only to some extent. By accepting non-isotropy of the wavelet type, more orthogonality can be achieved (an example is discussed in [83], see also [90]).

Roughly speaking, the wavelet transform is a space localized replacement of the Fourier transform, providing a space-varying frequency distribution in banded form. Wavelets provide sampling by using only a small fraction of the original information of a function. Typically the decorrelation is achieved by wavelets which have a


characteristic local band (localization in frequency). Different types of wavelets can be established from certain constructions of space/frequency localization. It turns out that the decay towards long and short wavelengths (i.e., in information-theoretic jargon, bandpass filtering) can be assured without any difficulty. Moreover, vanishing moments of wavelets enable us to combine polynomial (orthogonal) expansions (responsible for the long-wavelength part of a function) with wavelet expansions responsible for the medium-to-short-wavelength contributions (see, e.g., [72, 107]). Because of the rotation-invariant nature of a large number of geodetic quantities, the resulting approximation methods of zonal spline and wavelet nature have much to offer. This is the reason why the authors of [90] decided to add to the plane-involved matter some of the significant sphere-intrinsic features and results of spherical sampling in a unifying concept (note that vector- and tensor-isotropic sampling follows by obvious arguments from the context provided in [90] based on settings developed in [69, 74, 102]). Furthermore, the fundamental idea of handling inverse problems of satellite technology (such as satellite-to-satellite tracking and satellite gravity gradiometry) is to understand the regularization of inverse (unbounded) pseudodifferential operators by a multiresolution analysis using certain kernel function expressions as regularizing wavelets.
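The regularization idea can be illustrated in the frequency (symbol) domain with a toy downward-continuation problem. The symbol h^n, the noise level, and the truncation scale below are all invented for illustration; the truncation filter is the crudest (Shannon-type) member of the multiscale filter families discussed above:

```python
import numpy as np

rng = np.random.default_rng(0)
n = np.arange(60)                 # spherical harmonic degrees
h = 0.8                           # illustrative ratio Earth radius / satellite radius
upward = h ** n                   # symbol of upward continuation (exponential damping)
c_true = 1.0 / (1.0 + n) ** 2     # invented "potential" coefficients
data = upward * c_true + 1e-6 * rng.normal(size=n.size)  # noisy satellite data

naive = data / upward             # unbounded inverse: noise blows up at high degrees
J = 25                            # cut-off scale of the Shannon-type regularizing filter
regularized = np.where(n <= J, data / upward, 0.0)

err_naive = np.linalg.norm(naive - c_true)
err_reg = np.linalg.norm(regularized - c_true)
# err_reg is far smaller than err_naive: the filter trades a small truncation
# bias against the exponentially amplified data noise
```

Smoother scale-discrete filters (e.g., the difference-type wavelet symbols sketched earlier) replace the hard cut-off by a gradual damping of the high degrees.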

7 Goal of Spherical Approximation

Spherical approximation essentially amounts to the reconstruction and decomposition of a spherical function from a certain sum of its functional values at a discrete set of points. In this respect, sampling (cf. [90]) is a keystone of constructive approximation and signal processing. Its manifestation can be given in a polynomial, spline, and wavelet framework for scattered as well as regularly structured data systems (note that the most common variant, bandlimited Shannon polynomial sampling, is illustrated in Table 5). Our interest here is to explain that the different forms of spherical sampling reflect the specific circumstances of space/frequency localization of a signal under scattered or regularly structured data configurations. The special interest in writing this contribution is to point out the classification of general approximation types of polynomial (i.e., spherical harmonics) expansion, spline interpolation and smoothing, and wavelet approximation on the unit sphere corresponding to structured and scattered data sets. The large class of possible weights that are permitted in bandlimited polynomial as well as non-bandlimited spline sampling should allow users to build up desirable characteristics for their own specific application under consideration. Moreover, in parallel to finite- and infinite-dimensional spline interpolation, spherical multiscale sampling procedures are obtained, e.g., by a consequent use of the reproducing structure inherently contained in the corresponding kernel manifestations, e.g., relative to rotation-invariant pseudodifferential operators. As a consequence, multiresolution approximation in the form of tree algorithms opens new perspectives of decorrelating spherical signals, thereby specifying characteristic features inherent in the signal. This aspect provides a new component in geodetic interpretation and understanding.

Table 5 Shannon sampling in terms of polynomials, i.e., spherical harmonics (see [90] for more details)

F bandlimited, i.e., F ∈ Harm_{0,...,m} = span{Y_{n,j} : n = 0, . . . , m, j = −n, . . . , n}
• Orthonormality of the Y_{n,j} → Fourier coefficients F^∧(n, j) → Fourier expansion:
  F = Σ_{n=0}^{m} Σ_{j=−n}^{n} F^∧(n, j) Y_{n,j}
• Point set X_N = {η_1, . . . , η_N} → unisolvence → solution of the linear system:
  Σ_{k=1}^{N} a_k^{n,j} Y_{p,q}(η_k) = δ_{p,n} δ_{q,j}, p = 0, . . . , m, q = −p, . . . , p
• Shannon-type Lagrangians → spherical harmonic representation:
  L_k = Σ_{n=0}^{m} Σ_{j=−n}^{n} a_k^{n,j} Y_{n,j}, k = 1, . . . , N
• Functional values F(η_k) → Lagrange formula → Shannon sampling theorem:
  F = Σ_{k=1}^{N} F(η_k) L_k
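The scheme of Table 5 can be traced numerically for the lowest degree m = 1, where the (m+1)² = 4 real orthonormal spherical harmonics have elementary closed forms. The random nodes, the evaluation point, and all names in the following sketch are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Real orthonormal spherical harmonics up to degree m = 1 (explicit formulas)
c0 = 1.0 / np.sqrt(4.0 * np.pi)
c1 = np.sqrt(3.0 / (4.0 * np.pi))
basis = [
    lambda p: c0 * np.ones(p.shape[:-1]),  # Y_{0,0}
    lambda p: c1 * p[..., 0],              # degree-1 harmonics ~ x, y, z
    lambda p: c1 * p[..., 1],
    lambda p: c1 * p[..., 2],
]

def y(p):
    # Evaluate all basis functions at point(s) p
    return np.stack([b(p) for b in basis], axis=-1)

# N = (m+1)^2 = 4 random nodes on S^2 (unisolvent almost surely)
eta = rng.normal(size=(4, 3))
eta /= np.linalg.norm(eta, axis=1, keepdims=True)
A = y(eta)                         # A[k, b] = Y_b(eta_k)

coeffs = rng.normal(size=4)        # a bandlimited "signal" F
samples = A @ coeffs               # functional values F(eta_k)

x = np.array([0.0, 0.6, 0.8])      # evaluation point on S^2
L = np.linalg.solve(A.T, y(x))     # Lagrangians L_k(x), i.e., L_k(eta_i) = delta_ik
F_sampled = samples @ L            # Shannon-type reconstruction from the samples
F_true = y(x) @ coeffs
```

By construction F_sampled and F_true agree up to machine precision: the reconstruction from the N samples is exact for bandlimited F, as the sampling theorem asserts.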

The overall intention of spherical sampling (see also [90]) is to build a viable, sustainable bridge for recognizing the causality between the abstractness of a geodetically relevant model and the impact as well as the cross-disciplinary importance of practical realizations. This paper is written under the seminal auspices that there is no universal method in spherical approximation that is optimal in all aspects. A method that is economical usually lacks accuracy; a method that is accurate usually lacks economy. The structure of regularly distributed/scattered data systems requires different approaches and methodologies. The type of observables demands the type of constructive isotropic/non-isotropic approximation. As a consequence, every single sampling problem has its own characteristic nature. It can be optimally solved only under specific circumstances, pre-determined under certain constraints. This fundamental conclusion is the reason why the palette of spherical approximation techniques has to be proposed under diverse theoretical as well as practical aspects and demands. The thrust of a paper such as the one presented here is devoted to the interdisciplinary character of approximation problems. It is hoped that the viewpoints and approaches for geoscientifically relevant obligations will be found useful in connection with other approximation problems in diverse areas of application. In fact, our purpose is to provide a new outlook in which technical results can be better motivated and understood. Within this framework, criteria can be given relative to which the scope and limitations of the various methods can be assessed. This is important both in theory and practice since we are aware of the fact that there is no


cure-all method for approximating signals. So it is imperative to learn to clarify why a certain method works in some context as well as when not to use that method. The work [90] discusses at length the intuitive assumptions and principles that underlie the various methods. Altogether, it is evident that spherical tools, structures, and methods play an important part particularly in the treatment of the geosciences. An essential application involves measurements of the gravitational force at points on the surface of the Earth and/or at satellite heights, based on discrete values. Their objective is the investigation of qualitative and quantitative structures of the Earth’s gravitational field to deepen the understanding of the Earth’s figure and complexity by the specific investigation of mass distributions and motions. Usually this is undertaken even today, e.g., in geodesy following Newton’s celebrated theory “de mundi systemate” (1715). As a matter of fact, approximation procedures constitute essential instruments and act as the key technology in gravitative research. Ideally, a good understanding of the Earth’s system implies the solution of gravitationally reflected problems, and vice versa. The exemplary prototype “geodesy” based on Newtonian concepts obviously demonstrates that mathematical sampling is able to provide information by reduction of the complexity in a theoretical as well as numerical sense such that a “new”, more concrete situation can be efficiently and economically attacked within a simpler-structured framework. The mathematical disciplines to be involved are the theory of special functions (especially spherical harmonics), (rotation-invariant) pseudodifferential operators and integral equation theory, Fourier and distribution theory, and functional analysis (providing the indispensable tools for applying the machinery of the theory of ill-posed and inverse problems).

8 Sampling Methods and Recovery Problems

In accordance with J. R. Higgins [128], sampling a signal is considered in two variants:
• the first stating the fact that a bandlimited function is completely determined by its samples,
• the second describing how to decompose or reconstruct a function using its samples in an appropriate way.
In addition, recent multiscale philosophy (cf. [90]) enables us to realize another variant:
• the third decorrelating a signal into constituting ingredients characterizing specific features of interest for an applicant.
In what follows we explain some important aspects leading to the sampling scenario of today and predict promising perspectives for the near future. We begin with the steps from classical one-dimensional theory to multidimensional settings.


Early in classical one-dimensional theory, both variants of the sampling theorem pointed out by J. R. Higgins [128] were given by E. T. Whittaker [259–261] in a more precise but somewhat different form. In later years it became known that the classical sampling theorem had been presented before C. E. Shannon [231, 232] to the Russian communication community by V. A. Kotel’nikov [149]. In a more implicit verbal form, it had also been described in the German literature by H. Raabe [213] (for more details concerning the history see [160]). As a consequence of the discovery of the several independent introductions of the sampling theorem, people started to refer to the theorem by including the names of the aforementioned authors, resulting in such phrases as the “Whittaker–Kotel’nikov–Shannon (WKS) sampling theorem”. In the WKS context, the theorem is directly applicable to time-dependent signals and is normally formulated in that framework (see also [37]). Nonetheless, already this time-dependent form of sampling is of great practical importance and the basis of many developments. It has many appealing applications in physics and engineering (for additional information and references concerning the applications see, e.g., [128, 138, 139, 269]). The classical versions of the sampling theorem can be extended to functions of arbitrarily many variables in an iterated one-dimensional way. For example (see, e.g., the descriptions in Wikipedia for more details), gray scale images are often represented as two-dimensional arrays (or matrices) of real numbers representing the relative intensities of pixels (picture elements) located at the intersections of row and column sample locations. As a result, images require two independent variables, or indices, to specify each pixel uniquely – one for the row, and one for the column.
Color images typically consist of a composite of three separate gray scale images, one to represent each of the three primary colors – red, green, and blue. Similar to one-dimensional discrete time signals, images can also suffer from aliasing if the sampling resolution, or pixel density, is inadequate. Generally, sampling theory in Euclidean theory (in the sense of the representation of a continuous signal in terms of its discrete samples) has attracted considerable interest in the past three decades. Major advances in Euclidean theory and applications of sampling expansions in different function spaces have been made by mathematicians and engineers. There are several journals on signal processing and computational harmonic analysis that publish papers on sampling theory and its applications. The field also sports its own journal “Sampling Theory in Signal and Image Processing” and has its own “professional society”. We quote from Wikipedia: “SampTA (Sampling Theory and Applications) is a biennial interdisciplinary conference for mathematicians, engineers, and applied scientists. The main purpose of SampTA is to exchange recent advances in sampling theory and to explore new trends and directions in related areas of applications. The SampTA conference series began as a small workshop in 1995 in Riga, Latvia, but the meetings grew into full-fledged conferences attracting an even mix of mathematicians and engineers as the interest in sampling theory blossomed.” The most prominent realization in Euclidean spaces is the Shannon sampling theory resulting in a huge number of publications (from which we list only a limited number, [25, 27, 164, 165, 197, 198, 269] and the references therein). Many authors


developed sampling theorems based on the univariate theory of regular and singular boundary value problems and also by use of transforms other than the Fourier transform, including such techniques as the Sturm-Liouville, Legendre, Laguerre, and Jacobi transforms (see, e.g., [193, 195, 269]), originated by, e.g., Kramer’s approach [151]. Another field of research has been non-uniform sampling for non-bandlimited signals. More recently, methods of functional analysis and harmonic analysis have played a pivotal role in providing major advances in sampling theory and its foundational aspects. In particular, new directions have been proposed involving various function spaces that admit sampling expansions such as Sobolev spaces, reproducing kernel Hilbert spaces, Wiener amalgam spaces, shift-invariant spaces, translation-invariant spaces, convolution sampling, non-linear sampling, and other fundamental issues in sampling theory. For various perspectives the reader is referred to, e.g., [2–5, 15, 17, 18, 20, 25–29, 48, 52, 123, 128–130, 134, 137, 139, 155, 164–166, 175–177, 193, 194, 196, 197, 204, 215, 233, 239, 241–246, 251, 252, 255–257, 268–271] and the references therein. The papers [122, 205, 255] and many references therein deal with sampling aspects of engineering realization. Nevertheless, even today, sampling has a number of problems: for example, the bandlimiting operation generates Gibbs oscillations, and the sinc function has a very slow decay at infinity, which makes computation in the signal domain very difficult. Multidimensional Euclidean sampling can be found in, e.g., [2, 15, 31, 140, 164, 165, 206, 209], and many other contributions. Usually, it is restricted to the fundamental cell of the lattice Zq, q ≥ 2, and consequently based on an iterated one-dimensional framework. The close relationship of Shannon-type sampling to lattice point identities of analytic number theory seems to be a new field of interest.
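The classical one-dimensional WKS reconstruction, and the practical nuisance of the slowly decaying sinc kernel, can be seen in a few lines of Python. The test frequencies, the sampling rate, and the truncation range are invented for illustration:

```python
import numpy as np

fs = 10.0  # sampling rate (Hz), above twice the highest frequency 3.3 Hz
f = lambda t: np.sin(2 * np.pi * 1.0 * t) + 0.5 * np.cos(2 * np.pi * 3.3 * t)

n = np.arange(-2000, 2001)      # truncated sample index range
samples = f(n / fs)

def wks_reconstruct(t):
    # Truncated Whittaker-Kotel'nikov-Shannon cardinal series;
    # np.sinc(x) = sin(pi x) / (pi x)
    return np.sum(samples * np.sinc(fs * t - n))

t0 = 0.37
# wks_reconstruct(t0) approximates f(t0); the residual error stems purely from
# truncating the series, since sinc decays only like 1/|n|
```

The slow 1/|n| decay mentioned above is why thousands of samples are needed here for a small truncation error, and why spline and wavelet kernels with better localization are attractive replacements.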
The works [88, 91] provide the point of departure for a palette of Shannon-type sampling formulas and over- and undersampling characterizations in higher dimensions. The rudiments of sampling are covered in almost any engineering textbook on signal analysis. In the last three decades, however, with the surge of new techniques in analysis, especially multiscale methods, sampling theory has started to take a new prominent role within the traditional branches of mathematics, thereby leading to discoveries in other areas of mathematics, such as inverse problems, signal processing, and image analysis. Researchers in one of these areas were often unfamiliar with the techniques and relevance of the other two areas. In fact, the connections among all the areas of inverse problems, signal processing, and image analysis were rather tenuous. Today the situation, however, has changed drastically because of the bridging links of the sampling methodology. Subsequently, the areas of inverse problems, signal processing, and image analysis changed themselves. Briefly formulated, our present-day understanding of these areas is as follows (cf. [193]): Inverse problems (IPs) deal with determining for a given input-output system an input that produces an observed output, or with determining an input that produces a desired output (or comes as close to it as possible), often in the presence of noise. Signal Analysis/Processing (SA) deals with digital representations of signals and their analog reconstructions from digital representations.


Image Analysis and Processing are concerned with image refinement and recovery, and particularly include geophysical as well as medical imaging. Nowadays, the common thread among inverse problems, signal analysis, and imaging is seen in a unifying way as the recovery of an object (function, signal, picture) from certain known information about the object. In this respect, sampling expansions, filters, reproducing kernel spaces, various function spaces, and techniques of functional, computational, and harmonic analysis play pivotal roles in the recovery of an object from discrete information. So we are canonically led to conclude that sampling methodology takes on a new role as a unifying bridge in the solution processes of recovery problems. In turn, the applicability of sampling theory to practical recovery problems shows a tremendous increase. The application of sampling techniques arose in many branches, including computer vision, natural language processing, machine learning, statistics, medical imaging (such as tomography and EEG/ERP), remote sensing, non-destructive testing, astronomy, geosystems mathematics (especially involving, e.g., gravimetric, geomagnetic, elasticity and deformation, seismic exploration, and satellite observational technology), and many other fields. The rationale in most numerical methods involving sampling is to construct a “solution” that is acceptable physically as a meaningful approximation and is sufficiently stable from the computational standpoint. The vivid research activity has led to a vast literature on inverse and ill-posed problems. Standard textbooks are, e.g., [40, 42, 116, 146, 158, 192, 194, 249, 250]. In addition to the books we mention the journals Inverse Problems, Inverse Problems in Science and Engineering, Journal on Inverse and Ill-Posed Problems, Inverse Problems and Imaging, and GEM International Journal on Geomathematics.
In our work we restrict ourselves to a selection of solution methods involving spherical sampling. One of the major purposes of [90] is to dramatize the delineation with reference to specific methods and results in geomathematically relevant geodetic reality, for which [89] is meant to be an appropriate preparation. Recovery problems also constitute the essential instruments and the key technology for the mathematical interaction of abstraction as well as concretization in geoscientific research. In fact, recovery problems from partial or indirect information are characterized by the fact that the transformation from data to object constituents (or vice versa) is a result of the interaction of a physical system with the object that we wish to infer properties about. As a consequence, a good understanding of the Earth’s system using recovery methods enables the formulation of a geomathematical problem by reduction in a theoretical as well as numerical sense such that a new, more concrete situation can be efficiently attacked within a well-structured framework, usually in the context of special function theory, differential equations, integral equations, Fourier theory, and functional analysis. Coherently, by the solution processes of recovery problems, the capacity is provided to recognize the causality between the abstractness of the geomathematical context and the impact as well as cross-sectional importance to the geoscientific reality. Because of the rapid progress in computer technology and the recent developments in geosciences, e.g., in measurement techniques and observational technology, there is a strong need for

4 Spherical Harmonics, Splines, and Wavelets

181

contributions on spherical (polynomial, spline, and wavelet) sampling, for which [90] intends to give a consistent consolidation.
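To make the notion of recovery from discrete spherical samples tangible, the following toy sketch (an illustrative example, not a method taken from [90]) fits a truncated expansion in the real spherical harmonics of degree at most 2, written in unnormalized Cartesian form, to scattered samples on the unit sphere by least squares:

```python
import numpy as np

rng = np.random.default_rng(1)

def sh_basis(x, y, z):
    """Real spherical harmonics of degree <= 2 on the unit sphere,
    in unnormalized Cartesian form (9 basis functions)."""
    return np.stack([np.ones_like(x), x, y, z,
                     x * y, x * z, y * z,
                     x * x - y * y, 3 * z * z - 1], axis=-1)

# Scattered sample points on the unit sphere (normalized Gaussian vectors)
p = rng.standard_normal((200, 3))
p /= np.linalg.norm(p, axis=1, keepdims=True)
x, y, z = p.T

# "Unknown" object: a degree-2 expansion with these (hypothetical) coefficients
c_true = np.array([0.5, -1.0, 0.2, 0.0, 0.7, 0.0, -0.3, 0.4, 0.1])
samples = sh_basis(x, y, z) @ c_true

# Recovery from discrete information: least-squares fit of the coefficients
c_fit, *_ = np.linalg.lstsq(sh_basis(x, y, z), samples, rcond=None)
err = np.max(np.abs(c_fit - c_true))  # essentially zero for noise-free data
```

With noise-free data from a degree-2 target, the least-squares system is well conditioned and the coefficients are recovered essentially exactly; noisy or incomplete data would call for the regularization techniques discussed in [89].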

9 Selective Publication List

This contribution allows the reader to concentrate on essential topics of sampling theory in geoscientifically relevant recovery problems. Another objective is to show how advances in the field of sampling lead to new discoveries in other mathematical as well as geoscientific branches. Clearly, spherical approximation can be used not only in diverse fields of geoapplications, but also in other sciences such as medicine, computer vision, etc. Table 6, however, is restricted to activities in geosystems mathematics. It only gives an overview of some geomathematical aspects of sampling in the georeality, without being complete.

Table 6 Selective publication list of sampling methods (spherical harmonics, splines, and wavelets) and application areas

Gravitation: [9], [12], [13], [30], [33], [39], [43], [44], [45], [54], [55], [57], [58], [60], [62], [63], [65], [67], [69], [70], [77], [79], [80], [84], [87], [96], [100], [101], [102], [104], [105], [107], [108], [109], [117], [118], [119], [120], [125], [126], [132], [147], [148], [157], [170], [180], [182], [184], [186], [187], [200], [201], [208], [210], [214], [217], [218], [219], [224], [226], [228], [229], [235], [236], [237], [246], [247], [265], [267], [272]

Geomagnetics: [10], [11], [14], [34], [41], [68], [69], [77], [87], [111], [112], [113], [114], [115], [150], [161], [162], [167], [168], [169], [202], [203], [225], [236], [238], [264]

Elasticity and deformation: [1], [71], [73], [84], [85], [95]

Flow (ocean circulation, meteorology, optical flow): [47], [49], [64], [65], [82], [132], [143], [144], [145], [170], [178], [199], [219], [220]

Satellite technology (ionospheric currents, SST, SGG, radio occultation): [22], [23], [76], [83], [84], [86], [92], [97], [98], [100], [103], [104], [126], [132], [141], [150], [161], [162], [167], [168], [169], [171], [219], [222], [223], [225], [226], [227], [229], [240]

Exploration, prospecting: [8], [18], [21], [24], [41], [51], [61], [65], [66], [93], [180], [181], [182], [183], [185], [216]

Digital terrain modeling, image processing, pattern recognition: [15], [16], [20], [33], [46], [78], [105], [121], [133], [136], [142], [159], [172], [173], [174], [211], [230], [266]

Moreover, the authors are only able to give a subjective list reflecting their own research activities during the last two decades. This strongly limits the significance and the value of Table 6. The authors apologize for not citing many other important contributions, even in geosystems mathematics. As a matter of fact, the citations in the list are only meant as entry points into the huge area of significant applications of geoscientifically relevant sampling. Nevertheless, we hope that these entry points turn out to be helpful for the reader in finding the particular publications of interest.

Acknowledgements This contribution represents a synopsis of ideas and concepts presented in the textbook “Spherical Sampling”, Geosystem Mathematics, Birkhäuser, Basel [90]. For more mathematical details and algorithmic aspects the interested reader is referred to the contents of this work.

W. Freeden and M. Schreiner

Literature

1. Abeyratne, M.K., Freeden, W., Mayer, C.: Multiscale deformation analysis by Cauchy-Navier wavelets. J. Appl. Math. 12, 605–645 (2003)
2. Aldroubi, A., Gröchenig, K.: Nonuniform sampling and reconstruction in shift invariant spaces. SIAM Rev. 43, 585–620 (2001)
3. Aldroubi, A., Sun, Q., Tang, W.-S.: p-frames and shift-invariant subspaces of Lp. J. Fourier Anal. Appl. 7, 1–21 (2001)
4. Aldroubi, A., Sun, Q., Tang, W.-S.: Nonuniform average sampling and reconstruction in multiply generated shift-invariant spaces. Constr. Approx. 20, 173–189 (2004)
5. Aldroubi, A., Sun, Q., Tang, W.-S.: Convolution, average sampling and a Calderón resolution of the identity for shift-invariant spaces. J. Fourier Anal. Appl. 22, 215–244 (2005)
6. Antoine, J.-P., Demanet, L., Jacques, L., Vandergheynst, P.: Wavelets on the sphere: implementations and approximations. Appl. Comput. Harmon. Anal. (ACHA) 13, 177–200 (2002)
7. Antoine, J.-P., Vandergheynst, P.: Wavelets on the 2-sphere: a group-theoretical approach. Appl. Comput. Harmon. Anal. (ACHA) 7, 1–30 (1999)
8. Augustin, M., Bauer, M., Blick, C., Eberle, S., Freeden, W., Gerhards, C., Ilyasov, M., Kahnt, R., Klug, M., Möhringer, S., Neu, T., Nutz, H., Ostermann, I., Punzi, A.: Modeling deep geothermal reservoirs: recent advances and future perspectives. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 2, 2nd edn., pp. 1547–1629. Springer, Berlin/Heidelberg (2015)
9. Augustin, M., Freeden, W.: A survey on classical boundary value problems in physical geodesy. In: Grafarend, E.W. (ed.) Encyclopedia of Geodesy. Springer International Publishing, Switzerland (2016). https://doi.org/10.1007/978-3-319-02370-0-1117-1
10. Backus, G.E., Parker, R., Constable, C.: Foundations of Geomagnetism. Cambridge University Press, Cambridge (1996)
11. Baratchart, L., Gerhards, C.: On the recovery of crustal and core components of geomagnetic potential fields. SIAM Appl. Math. (2017, submitted)
12. Barzaghi, R., Sansò, F.: Remarks on the inverse gravimetric problem. Boll. Geod. Scienze Affini 45, 203–216 (1986)
13. Bauer, F., Gutting, M.: Spherical fast multiscale approximation by locally compact orthogonal wavelets. GEM Int. J. Geomath. 2, 69–85 (2011)
14. Bayer, M., Freeden, W., Maier, T.: A vector wavelet approach to iono- and magnetospheric geomagnetic satellite data. J. Atmos. Sol. Terr. Phys. 63, 581–597 (2001)

15. Behmard, H., Faridani, A.: Sampling of bandlimited functions on unions of shifted lattices. J. Fourier Anal. Appl. 8, 43–58 (2001)
16. Behmard, H., Faridani, A., Walnut, D.: Construction of sampling theorems for unions of shifted lattices. Sampling Theory Signal Image Process. 5, 297–319 (2006)
17. Benedetto, J.J., Ferreira, P.J.S.G. (eds.): Modern Sampling Theory: Mathematics and Applications. Birkhäuser, Boston (2001)
18. Benedetto, J.J., Zayed, A.I. (eds.): Sampling, Wavelets, and Tomography. Birkhäuser, Boston (2003)
19. Bezhaev, A.Y., Vasilenko, V.A.: Variational Spline Theory. Russian Academy of Sciences, Siberian Branch. Bulletin of the Novosibirsk Computing Center, NCC Publisher, Novosibirsk (1993)
20. Bi, N., Nashed, M.Z., Sun, Q.: Reconstructing signals with finite rate of innovation from noisy samples. Acta Appl. Math. 107, 309–372 (2009)
21. Blick, C.: Multiscale potential methods in geothermal research: decorrelation reflected post-processing and locally based inversion. Ph.D.-Thesis, University of Kaiserslautern, Geomathematics Group. Verlag Dr. Hut, Munich (2015)
22. Blick, C., Eberle, S.: Radio occultation via satellites. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn., pp. 1089–1126. Springer, Berlin/Heidelberg (2015)
23. Blick, C., Freeden, W.: Spherical spline application to radio occultation data. J. Geodetic Sci. 1, 379–396 (2011)
24. Blick, C., Freeden, W., Nutz, H.: Feature extraction of geological signatures by multiscale gravimetry. GEM Int. J. Geomath. 8, 57–83 (2017)
25. Butzer, P.L.: A survey of the Whittaker–Shannon sampling theorem and some of its extensions. J. Math. Res. Exposition 3, 185–212 (1983)
26. Butzer, P.L., Splettstößer, W., Stens, R.L.: The sampling theorem and linear prediction in signal analysis. Jahresber. Deutsch. Math. Vereinigung (DMV) 90, 1–60 (1988)
27. Butzer, P.L., Stens, R.L.: The Euler–MacLaurin summation formula, the sampling theorem, and approximate integration over the real axis. Linear Algebra Appl. 52/53, 141–155 (1983)
28. Butzer, P.L., Stens, R.L.: Sampling theory for not necessarily band-limited functions: a historical overview. SIAM Rev. 34, 40–53 (1992)
29. Campbell, L.L.: A comparison of the sampling theorem of Kramer and Whittaker. SIAM 12, 117–130 (1964)
30. Choirat, C., Seri, R.: Computational aspects of Cui-Freeden statistics for equidistribution on the sphere. Math. Comput. 82, 2137–2156 (2013)
31. Cluny, F., Costarelli, D., Minotti, A., Vinti, G.: Enhancement of thermographic images as tool for structural analysis in earthquake engineering. NDT & E Int. 70, 60–72 (2015)
32. Cohen, L.: Time-Frequency Analysis. Prentice Hall, Englewood Cliffs (1995)
33. Cui, J., Freeden, W., Witte, B.: Gleichmäßige Approximation mittels sphärischer Finite-Elemente und ihre Anwendung auf die Geodäsie. Zeitschrift für Vermessungswesen 117, 266–278 (1992)
34. Dahlen, F.A., Simons, F.J.: Spectral estimation on a sphere in geophysics and cosmology. Geophys. J. Int. 174, 774–807 (2008)
35. Dahlke, S., Dahmen, W., Schmitt, E., Weinreich, I.: Multiresolution analysis and wavelets on S2 and S3. Numer. Funct. Anal. Optim. 16, 19–41 (1995)
36. Dahlke, S., Maass, P.: Continuous wavelet transforms with application to analyzing functions on spheres. J. Fourier Anal. Appl. 2, 379–396 (1996)
37. Daubechies, I.: Time-frequency localization operators: a geometric phase space approach. IEEE Trans. Inform. Theory 34, 961–1005 (1988)
38. Daubechies, I.: Ten Lectures on Wavelets. CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 61. SIAM, Philadelphia (1992)
39. Daubechies, I., Sweldens, W.: Factoring wavelet transforms into lifting steps. In: Klees, R., Haagmans, R. (eds.) Wavelets in the Geosciences. Lect. Notes Earth Sci. 90, 131–157 (1999)
40. Engl, H.: Integralgleichungen. Springer, Berlin (1997)

41. Engl, H., Louis, A.K., Rundell, W. (eds.): Inverse Problems in Geophysical Applications. SIAM, Philadelphia (1997)
42. Engl, H.W., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Kluwer, Dordrecht (1996)
43. ESA (European Space Agency): The Nine Candidate Earth Explorer Missions. Publications Division ESTEC, Noordwijk, SP-1196(1) (1996)
44. ESA (European Space Agency): European Views on Dedicated Gravity Field Missions: GRACE and GOCE. ESD-MAG-REP-CON-001 (1998)
45. ESA (European Space Agency): Gravity Field and Steady-State Ocean Circulation Mission. ESTEC, Noordwijk, ESA SP-1233(1) (1999)
46. Fasshauer, G.E., Schumaker, L.L.: Scattered data fitting on the sphere. In: Dahlen, M., Lyche, T., Schumaker, L.L. (eds.) Mathematical Methods for Curves and Surfaces II, pp. 117–166. Vanderbilt University, Nashville (1998)
47. Fehlinger, T., Freeden, W., Gramsch, S., Mayer, C., Michel, D., Schreiner, M.: Local modelling of sea surface topography from (geostrophic) ocean flow. ZAMM 87, 775–791 (2007)
48. Feichtinger, H.G., Gröchenig, K.: Theory and practice of irregular sampling. In: Benedetto, J.J., Frazier, M.W. (eds.) Wavelets: Mathematics and Applications, pp. 305–363. CRC Press, Boca Raton (1993)
49. Fengler, M., Freeden, W.: A nonlinear Galerkin scheme involving vector and tensor spherical harmonics for solving the incompressible Navier-Stokes equation on the sphere. SIAM J. Sci. Comput. 27, 967–994 (2005)
50. Fengler, M., Freeden, W., Gutting, M.: The spherical Bernstein wavelets. Int. J. Pure Appl. Math. 31, 209–230 (2006)
51. Fengler, M.J., Michel, D., Michel, V.: Harmonic spline-wavelets on the 3-dimensional ball and their application to the reconstruction of the earth's density distribution from gravitational data at arbitrarily shaped satellite orbits. ZAMM 86, 856–873 (2006)
52. Fogel, L.J.: A note on the sampling theorem. IRE Trans. Inf. Theory IT 1, 47–48 (1955)
53. Freeden, W.: An application of a summation formula to numerical computation of integrals over the sphere. Bull. Géod. 52, 165–175 (1978)
54. Freeden, W.: On the approximation of external gravitational potential with closed systems of (trial) functions. Bull. Géod. 54, 1–20 (1980)
55. Freeden, W.: On spherical spline interpolation and approximation. Math. Meth. Appl. Sci. 3, 551–575 (1981)
56. Freeden, W.: On approximation by harmonic splines. Manuscr. Geod. 6, 193–244 (1981)
57. Freeden, W.: Interpolation and best approximation by harmonic spline functions – theoretical and computational aspects. Boll. Geod. Scienze Affini 41, 106–120 (1982)
58. Freeden, W.: On the permanence property in spherical spline interpolation. Department of Geodetic Science, The Ohio State University, Columbus, Ohio, OSU Report No. 341 (1982)
59. Freeden, W.: On spline methods in geodetic approximation problems. Math. Methods Appl. Sci. 4, 382–396 (1982)
60. Freeden, W.: A spline interpolation method for solving boundary value problems of potential theory from discretely known data. Numer. Methods Partial Diff. Equ. 3, 375–398 (1987)
61. Freeden, W.: Metaharmonic splines for solving the exterior Dirichlet problem of the Helmholtz equation. In: Utreras, F., Chui, C.K., Schumaker, L.L. (eds.) Topics in Approximation Theory, pp. 99–110. Academic, Boston (1987)
62. Freeden, W.: Spherical spline approximation and its application in physical geodesy. In: Vogel, A., Ofoegbu, C.O., Gorenflo, R., Ursin, B. (eds.) Geophysical Data Inversion Methods and Applications, pp. 79–104. Vieweg, Braunschweig (1990)
63. Freeden, W.: Multiscale Modelling of Spaceborne Geodata. B.G. Teubner, Stuttgart/Leipzig (1999)
64. Freeden, W.: Geomathematik, was ist das überhaupt? Jahresber. Deutsch. Math. Vereinigung (DMV) 111, 125–152 (2009)

65. Freeden, W.: Geomathematics: its role, its aim, and its potential. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn., pp. 3–78. Springer, Heidelberg (2015)
66. Freeden, W., Blick, C.: Signal decorrelation by means of multiscale methods. World of Mining 65, 1–15 (2013)
67. Freeden, W., Fehlinger, T., Klug, M., Mathar, D., Wolf, K.: Classical globally reflected gravity field determination in modern locally oriented multiscale framework. J. Geod. 83, 1171–1191 (2009)
68. Freeden, W., Gerhards, C.: Poloidal and toroidal fields in terms of locally supported vector wavelets. Math. Geosc. 42, 817–838 (2010)
69. Freeden, W., Gerhards, C.: Geomathematically Oriented Potential Theory. Chapman and Hall/CRC Press, Boca Raton/New York/London (2013)
70. Freeden, W., Gerhards, C., Nutz, H., Schreiner, M.: Disturbing potential from deflections of the vertical: from globally reflected surface gradient equation to locally oriented multiscale modeling. In: Grafarend, E.W. (ed.) Encyclopedia of Geodesy. Springer International Publishing, Switzerland (2016)
71. Freeden, W., Gervens, T., Mason, J.C.: A minimum norm interpolation method for determining the displacement field of a homogeneous isotropic elastic body from discrete data. IMA J. Appl. Math. 44, 55–76 (1990)
72. Freeden, W., Gervens, T., Schreiner, M.: Constructive Approximation on the Sphere (With Applications to Geomathematics). Oxford Science Publications, Clarendon/Oxford (1998)
73. Freeden, W., Groten, E., Schreiner, M., Söhne, M., Tücks, M.: Deformation analysis using Navier spline interpolation (with an application to the Lake Blåsjö area). Allgemeine Vermessungs-Nachrichten (AVN) 3, 120–146 (1996)
74. Freeden, W., Gutting, M.: Special Functions of Mathematical (Geo)Physics. Birkhäuser, Basel (2013)
75. Freeden, W., Gutting, M.: Integration and Cubature Methods – A Geomathematically Oriented Course. Chapman and Hall/CRC Press, Boca Raton/New York/London (2018)
76. Freeden, W., Hesse, K.: On the multiscale solution of satellite problems by use of locally supported kernel functions corresponding to equidistributed data on spherical orbits. Studia Scient. Math. Hungarica 39, 37–74 (2002)
77. Freeden, W., Maier, T.: On multiscale denoising of spherical functions: basic theory and numerical aspects. Electron. Trans. Numer. Anal. (ETNA) 14, 40–62 (2002)
78. Freeden, W., Mason, J.C.: Uniform piecewise approximation on the sphere. In: Mason, J.C., Cox, M.G. (eds.) Algorithms for Approximation II, pp. 320–333. Chapman and Hall, New York (1990)
79. Freeden, W., Mayer, C.: Wavelets generated by layer potentials. Appl. Comput. Harmon. Anal. (ACHA) 14, 195–237 (2003)
80. Freeden, W., Mayer, C.: Multiscale solution for the Molodensky problem on regular telluroidal surfaces. Acta Geod. Geophys. Hung. 41, 55–86 (2006)
81. Freeden, W., Mayer, C., Schreiner, M.: Tree algorithms in wavelet approximation by Helmholtz potential operators. Numer. Funct. Anal. Optim. 4, 747–782 (2003)
82. Freeden, W., Michel, D., Michel, V.: Local multiscale approximation of geostrophic oceanic flow: theoretical background and aspects of scientific computing. Marine Geod. 28, 313–329 (2005)
83. Freeden, W., Michel, V.: Orthogonal zonal, tesseral, and sectorial wavelets on the sphere for the analysis of satellite data. Adv. Comput. Math. 21, 187–217 (2004)
84. Freeden, W., Michel, V.: Multiscale Potential Theory (With Applications to Geoscience). Birkhäuser, Boston (2004)
85. Freeden, W., Michel, V.: Wavelet deformation analysis for spherical bodies. Int. J. Wavelets Multiresolution Inf. Process. (IJWMIP) 3, 523–558 (2005)
86. Freeden, W., Michel, V., Nutz, H.: Satellite-to-satellite tracking and satellite gravity gradiometry (advanced techniques for high-resolution geopotential field determination). J. Eng. Math. 43, 19–56 (2002)

87. Freeden, W., Michel, V., Simons, F.J.: Spherical harmonics based special function systems and constructive approximation methods. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics. Springer International Publishing, Basel/New York/Heidelberg (2018)
88. Freeden, W., Nashed, M.Z.: Multivariate Hardy-type lattice point summation and Shannon-type sampling. GEM Int. J. Geomath. 6, 163–249 (2015)
89. Freeden, W., Nashed, M.Z.: Operator-theoretic and regularization approaches to ill-posed problems. GEM Int. J. Geomath. (2017). https://doi.org/10.1007/s13137-017-0100-0
90. Freeden, W., Nashed, M.Z., Schreiner, M.: Spherical Sampling. Geosystem Mathematics. Birkhäuser, Basel (2018)
91. Freeden, W., Nashed, M.Z.: Lattice Point Identities and Shannon-Type Sampling. Chapman and Hall/CRC Press, Boca Raton/New York/London (2020)
92. Freeden, W., Nutz, H.: Satellite gravity gradiometry as tensorial inverse problem. GEM Int. J. Geomath. 2, 177–218 (2011)
93. Freeden, W., Nutz, H.: Mathematik als Schlüsseltechnologie zum Verständnis des Systems “Tiefe Geothermie”. Jahresber. Deutsch. Math. Vereinigung (DMV) 117, 45–84 (2015)
94. Freeden, W., Reuter, R.: Spherical harmonic splines: theoretical and computational aspects. Meth. u. Verf. d. Math. Physik 27, 79–103 (1988)
95. Freeden, W., Reuter, R.: A constructive method for solving the displacement boundary value problem of elastostatics by use of global basis systems. Math. Methods Appl. Sci. 12, 105–128 (1990)
96. Freeden, W., Schneider, F.: Wavelet approximation on closed surfaces and their application to boundary value problems of potential theory. Math. Methods Appl. Sci. 21, 129–163 (1998)
97. Freeden, W., Schneider, F.: Regularization wavelets and multiresolution. Inverse Prob. 14, 225–243 (1998)
98. Freeden, W., Schneider, F., Schreiner, M.: Gradiometry – an inverse problem in modern satellite geodesy. In: Engl, H.W., Louis, A., Rundell, W. (eds.) GAMM-SIAM Symposium on Inverse Problems: Geophysical Applications, pp. 179–239 (1997)
99. Freeden, W., Schreiner, M.: Non-orthogonal expansions on the sphere. Math. Methods Appl. Sci. 18, 83–120 (1995)
100. Freeden, W., Schreiner, M.: Spaceborne gravitational field determination by means of locally supported wavelets. J. Geod. 79, 431–446 (2005)
101. Freeden, W., Schreiner, M.: Multiresolution analysis by spherical up functions. Constr. Approx. 23, 241–259 (2006)
102. Freeden, W., Schreiner, M.: Spherical Functions of Mathematical Geosciences – A Scalar, Vectorial, and Tensorial Setup. Springer, Heidelberg (2009)
103. Freeden, W., Schreiner, M.: Satellite gravity gradiometry (SGG): from scalar to tensorial solution. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, pp. 269–302. Springer, Heidelberg (2010)
104. Freeden, W., Schreiner, M.: Satellite gravity gradiometry (SGG): from scalar to tensorial solution. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn. Springer, New York/Heidelberg (2015)
105. Freeden, W., Schreiner, M., Franke, R.: A survey on spherical spline approximation. Surv. Math. Ind. 7, 29–85 (1996)
106. Freeden, W., Windheuser, U.: Spherical wavelet transform and its discretization. Adv. Comput. Math. 5, 51–94 (1996)
107. Freeden, W., Windheuser, U.: Combined spherical harmonic and wavelet expansion. Appl. Comput. Harmon. Anal. (ACHA) 4, 1–37 (1997)
108. Freeden, W., Witte, B.: A combined (spline-)interpolation and smoothing method for the determination of the gravitational potential from heterogeneous data. Bull. Géod. 56, 53–62 (1982)
109. Freeden, W., Wolf, K.: Klassische Erdschwerefeldbestimmung aus der Sicht moderner Geomathematik. Math. Semesterb. 56, 53–77 (2008)
110. Gauß, C.F.: Allgemeine Theorie des Erdmagnetismus. Resultate aus den Beobachtungen des magnetischen Vereins (1838)

111. Gerhards, C.: Spherical multiscale methods in terms of locally supported wavelets: theory and application to geomagnetic modeling. Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern (2011)
112. Gerhards, C.: Spherical decompositions in a global and local framework: theory and an application to geomagnetic modeling. GEM Int. J. Geomath. 1, 205–256 (2011)
113. Gerhards, C.: A combination of downward continuation and local approximation for harmonic potentials. Inverse Prob. 30, 085004 (2014)
114. Gerhards, C.: A multiscale power spectrum for the analysis of the lithospheric magnetic field. GEM Int. J. Geomath. 5, 63–79 (2014)
115. Gerhards, C.: Multiscale modeling of the geomagnetic field and ionospheric currents. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn. Springer (2015). https://doi.org/10.1007/978-3-642-27793-118-4
116. Groetsch, C.W.: Inverse Problems in the Mathematical Science. Vieweg, Braunschweig (1993)
117. Gutting, M.: Fast multipole methods for oblique derivative problems. Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern. Shaker, Aachen (2008)
118. Gutting, M.: Fast multipole accelerated solution of the oblique derivative boundary value problem. GEM Int. J. Geomath. 3, 223–252 (2012)
119. Gutting, M.: Fast spherical/harmonic spline modeling. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn. Springer, Heidelberg (2014)
120. Gutting, M.: Parameter choices for fast harmonic spline approximation. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics. Birkhäuser, Basel (2018)
121. Göttelmann, J.: Locally supported wavelets on manifolds with applications to the 2D sphere. Appl. Comput. Harmon. Anal. (ACHA) 7, 1–33 (1999)
122. Haddad, R.A., Parsons, T.W.: Digital Signal Processing: Theory, Applications and Hardware. Computer Science Press, New York (1991)
123. Han, D., Nashed, M.Z., Sun, Q.: Sampling expansions in reproducing kernel Hilbert and Banach spaces. Numer. Funct. Anal. Optim. 30, 971–987 (2009)
124. Harbrecht, H., Schneider, R.: Wavelet Galerkin schemes for boundary integral equations – implementation and quadrature. SIAM J. Sci. Comput. 27, 1347–1370 (2006)
125. Heiskanen, W.A., Moritz, H.: Physical Geodesy. Freeman, San Francisco (1967)
126. Hesse, K.: Domain decomposition methods in multiscale geopotential determination from SST and SGG. Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern, Shaker (2002)
127. Hesse, K., Sloan, I.H., Womersley, R.S.: Numerical integration on the sphere. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 3, 2nd edn., pp. 2671–2710. Springer, Heidelberg (2015)
128. Higgins, J.R.: Five short stories about the cardinal series. Bull. Am. Math. Soc. 12, 45–89 (1985)
129. Higgins, J.R.: Sampling Theory in Fourier and Signal Analysis, Volume 1: Foundations. Oxford University Press, Oxford (1996)
130. Higgins, J.R., Stens, R.L.: Sampling Theory in Fourier and Signal Analysis, Volume 2: Advanced Topics. Oxford Science Publications, Oxford (2000)
131. Holschneider, M.: Continuous wavelet transforms on the sphere. J. Math. Phys. 37, 4156–4165 (1996)
132. Ilk, K., Flury, J., Rummel, R., Schwintzer, P., Bosch, W., Haas, C., Schröter, J., Stammer, D., Zahel, W., Miller, H., Dietrich, R., Huybrechts, P., Schmeling, H.D., Wolf, H.G., Rieger, J., Bardossy, A., Güntner, A., Gruber, T.: Mass transport and mass distribution in the earth system, contribution of the new generation of satellite gravity and altimetry missions to geosciences. GOCE-Projektbüro TU Munich, GeoForschungsZentrum Potsdam (2005). http://gfzpublic.gfz-potsdam.de/pubman/faces/viewItemOverview, Cited 01 Oct 2017

133. Imiya, A., Sugaya, H., Torii, A., Mochizuki, Y.: Variational analysis of spherical images. In: Gagalowicz, W., Philips, W. (eds.) Computer Analysis of Images and Patterns. Lecture Notes in Computer Science, vol. 3691, pp. 104–111. Springer, Berlin/Heidelberg (2005)
134. Ismail, M., Nashed, M.Z., Zayed, A., Ghaleb, A.: Mathematical Analysis, Wavelets and Signal Processing. Contemporary Mathematics, vol. 190. American Mathematical Society, Providence (1995)
135. Jansen, M., Oonincx, P.: Second Generation Wavelets and Applications. Springer, Berlin (2005)
136. Jarosz, W., Carr, N.A., Jensen, W.W.: Importance sampling spherical harmonics. Eurographics 28, 2 (2009)
137. Jerri, J.A.: On the application of some interpolating functions in physics. J. Res. Nat. Bur. Standards Sect. B 73, 241–245 (1969)
138. Jerri, J.A.: Sampling expansion for Laguerre L2-transforms. J. Res. Nat. Bur. Standards Sect. B 80, 415–418 (1976)
139. Jerri, J.A.: The Shannon sampling theorem – its various extensions and applications: a tutorial review. Proc. IEEE 65, 1565–1596 (1977)
140. Kalker, T.: On multidimensional sampling. In: Madisetti, V.K., Williams, D.B. (eds.) Digital Signal Processing Handbook. CRC Press, Boca Raton (1999)
141. Keller, W.: Satellite-to-satellite tracking (low-low/high-low). In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn., pp. 171–210. Springer, Berlin/Heidelberg (2015)
142. Khalid, Z., Durrani, S., Sadeghi, P., Kennedy, R.A.: Spatio-spectral analysis on the sphere using spatially localized spherical harmonics transform. IEEE Trans. Signal Process. 60, 1487–1492 (2012)
143. Kirisits, C., Lang, L.F., Scherzer, O.: Optical flow on evolving surfaces with an application to the analysis of 4D microscopy data. In: Kuijper, A., Bredies, K., Pock, T., Bischof, H. (eds.) SSVM'13: Proceedings of the Fourth International Conference on Scale Space and Variational Methods in Computer Vision. Lecture Notes in Computer Science, vol. 7893, pp. 246–257. Springer, Berlin/Heidelberg (2013)
144. Kirisits, C., Lang, L.F., Scherzer, O.: Decomposition of optical flow on the sphere. GEM Int. J. Geomath. 5, 117–141 (2014)
145. Kirisits, C., Lang, L.F., Scherzer, O.: Optical flow on evolving surfaces with space and time regularisation. J. Math. Imaging Vision 52, 55–70 (2015)
146. Kirsch, A.: An Introduction to the Mathematical Theory of Inverse Problems. Springer, Heidelberg (1996)
147. Klees, R., Haagmans, R. (eds.): Wavelets in the Geosciences. Lecture Notes in Earth Sciences, vol. 90. Springer, Berlin/Heidelberg (2000)
148. Konopliv, A.S., Asmar, S.W., Carranza, E., Sjogren, W.L., Yuan, D.N.: Recent gravity models as a result of the lunar prospector mission. Icarus 150, 1–18 (2001)
149. Kotel'nikov, V.A.: On the carrying capacity of the “Ether” and “Wire” in telecommunications. In: Material for the First All-Union Conference on Questions of Communications, Izd. Red. Upr. Svyazi RKKA, Moscow (1933)
150. Kotsiaros, S., Olsen, N.: The geomagnetic field gradient tensor. GEM Int. J. Geomath. 3, 297–314 (2012)
151. Kramer, H.P.: A generalized sampling theorem. J. Math. Phys. 38, 68–72 (1959)
152. Kunoth, A., Sahner, J.: Wavelets on manifolds: an optimized construction. Math. Comput. 75, 1319–1349 (2006)
153. Lain Fernández, N.: Polynomial bases on the sphere. Ph.D.-Thesis, University of Lübeck, Logos, Berlin (2003)
154. de Laplace, P.S.: Théorie des attractions des sphéroïdes et de la figure des planètes. Mém. de l'Acad. Paris (1785)
155. Larson, D., Massopust, P., Nashed, M.Z., Nguyen, M.C., Papadakis, M., Zayed, A. (eds.): Frames and Operator Theory in Analysis and Signal Processing. Contemporary Mathematics, vol. 451. American Mathematical Society, Providence (2008)

156. Legendre, A.M.: Recherches sur l'attraction des sphéroïdes homogènes. Mém. math. phys. prés. à l'Acad. Sci. par divers savants 10, 411–434 (1785)
157. Lemoine, F.G., Kenyon, S.C., Factor, J.K., Trimmer, R.G., Pavlis, N.K., Shinn, D.S., Cox, C.M., Klosko, S.M., Luthcke, S.B., Torrence, M.H., Wang, Y.M., Williamson, R.G., Pavlis, E.C., Rapp, R.H., Olson, T.R.: The development of the joint NASA GSFC and NIMA geopotential model EGM96. NASA/TP-1998-206861. NASA Goddard Space Flight Center, Greenbelt (1998)
158. Louis, A.K.: Inverse und schlecht gestellte Probleme. Teubner, Stuttgart (1989)
159. Lyche, T., Schumaker, L.L.: A multiresolution tensor spline method for fitting functions on the sphere. SIAM J. Sci. Comput. 22, 724–74 (2000)
160. Lüke, H.D.: The origins of the sampling theorem. IEEE Commun. Mag. 37, 106–108 (1999)
161. Maier, T.: Multiscale geomagnetic field modelling from satellite data: theoretical aspects and numerical applications. Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern (2003)
162. Maier, T.: Wavelet-Mie-representations for solenoidal vector fields with applications to ionospheric geomagnetic data. SIAM J. Appl. Math. 65, 1888–1912 (2005)
163. Mallat, S.: Applied mathematics meets signal processing. In: Proceedings of the International Congress of Mathematicians, Berlin, vol. I, pp. 319–338. Documenta Mathematica (1998)
164. Marks II, R.J.: Introduction to Shannon Sampling and Interpolation Theory. Springer, Berlin (1991)
165. Marks II, R.J.: Advanced Topics in Shannon Sampling and Interpolation Theory. Springer, Berlin/Heidelberg (1993)
166. Marvasti, F.A. (ed.): Nonuniform Sampling: Theory and Practice. Information Technology Transmission, Processing, and Storage. Plenum Publishing Corporation, New York (2001)
167. Mayer, C.: Wavelet modelling of ionospheric currents and induced magnetic fields from satellite data. Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern (2003)
168. Mayer, C.: Wavelet modelling of the spherical inverse source problem with application to geomagnetism. Inverse Prob. 20, 1713–1728 (2004)
169. Mayer, C.: Wavelet decomposition of spherical vector fields with respect to sources. J. Fourier Anal. Appl. 12, 345–369 (2006)
170. Mayer, C., Freeden, W.: Stokes problems, layer potentials and regularizations, and multiscale applications. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 2, 2nd edn., pp. 1155–1253. Springer, Berlin/Heidelberg (2015)
171. Mayer, C., Maier, T.: Separating inner and outer Earth's magnetic field from CHAMP satellite measurements by means of vector scaling functions and wavelets. Geophys. J. Int. 167, 1188–1203 (2006)
172. McEwen, J.D., Puy, G., Thiran, J.-P., Vandergheynst, P.: Sampling theorems and compressive sensing on the sphere. In: Papadakis, M., van de Ville, D., Goyal, V.K. (eds.) Wavelets and Sparsity XIV. Proceedings of SPIE, vol. 8138, pp. F1–F9 (2011)
173. McEwen, J.D., Puy, G., Thiran, J.-P., Vandergheynst, P.: Sparse image reconstruction on the sphere: implications of a new sampling theorem. IEEE Trans. Image Process. 22, 1–11 (2013)
174. McEwen, J.D., Wiaux, Y.: A novel sampling theorem on the sphere. IEEE Trans. Sig. Process. 59, 1–13 (2011)
175. van der Mee, C.V., Nashed, M.Z., Seatzu, S.: Sampling expansions and interpolation in unitarily translation invariant reproducing kernel Hilbert space. Adv. Comput. Math. 19, 355–372 (2003)
176. Meeks, R.: Introduction to Shannon Sampling and Interpolation Theory. Springer, New York (1991)
177. Meeks, R.: Advanced Topics in Shannon Sampling and Interpolation Theory. Springer, New York (1993)
178. Melani, C., Campana, M., Lombardot, B., Rizzi, B., Veronesi, F., Zanella, C., Bourgine, P., Mikula, K., Peyrieras, N., Sarti, A.: Cells tracking in a live Zebrafish embryo. In: Proceedings of the 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS 2007), pp. 1631–1634 (2007)

190

W. Freeden and M. Schreiner

179. Mhaskar, H.N., Narcowich, F.J., Prestin, J., Ward, J.D.: Polynomial frames on the sphere. Adv. Comput. Math. 3, 378–403 (2003) 180. Michel, V.: A multiscale method for the gravimetry problem: theoretical and numerical aspects of harmonic and anharmonic modelling. Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern. Shaker, Aachen (1999) 181. Michel, V.: A multiscale approximation for operator equations in separable Hilbert spaces – case study: reconstruction and description of the earth’s interior. Habilitation Thesis, Geomathematics Group, University of Kaiserslautern. Shaker, Aachen (2002) 182. Michel, V.: Scale continuous, scale discretized and scale discrete harmonic wavelets for the outer and the inner space of a sphere and their application to an inverse problem in geomathematics. Appl. Comput. Harm. Anal. (ACHA) 12, 77–99 (2002) 183. Michel, V.: Regularized wavelet-based multiresolution recovery of the harmonic mass density distribution from data of the earth’s gravitational field at satellite height. Inverse Prob. 21, 997–1025 (2005) 184. Michel, V.: Lectures on Constructive Approximation – Fourier, Spline, and Wavelet Methods on the Real Line, the Sphere, and the Ball. Birkhäuser, Boston (2013) 185. Michel, V., Wolf, K.: Numerical aspects of a spline-based multiresolution recovery of the harmonic mass density out of gravity functionals. Geophys. J. Int. 173, 1–16 (2008) 186. Moritz, H.: Classical physical geodesy. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, pp. 253–290. Springer, Heidelberg (2015) 187. Moritz, H., Sünkel, H. (eds.): Approximation Methods in Geodesy. Lectures delivered at 2nd International Summer School in the Mountains on Mathematical Methods in Physical Geodesy. Wichmann, Karlsruhe (1978) 188. Müller, C.: Analysis of Spherical Symmetries in Euclidean Spaces. Springer, New York/Berlin/Heidelberg (1998) 189. Narcowich, F., Petrushev, P., Ward, J.: Localized tight frames on spheres. SIAM J. 
Math. Anal. 38, 574–594 (2006) 190. Narcowich, F.J., Ward, J.D.: Nonstationary wavelets on the m-sphere for scattered data. Appl. Comput. Harm. Anal. (ACHA) 3, 324–336 (1996) 191. Narcowich, F.J., Ward, J.D.: Scattered data interpolation on spheres: error estimates and locally supported basis functions. SIAM J. Math. Anal. 33, 1393–1410 (2002) 192. Nashed, M.Z. (ed.): Generalized Inverses and Applications. Academic, New York (1976) 193. Nashed, M.Z.: Inverse problems, moment problems. In: Siddiqi, A.H., Singh, R.C., Manchanda, P. (eds.) Signal Processing: Un Menage a Trois. Mathematics in Science and Technology, pp. 1–19. World Scientific, Singapore (2010) 194. Nashed, M.Z., Scherzer, O.: Inverse Problems, Image Analysis and Medical Imaging. Contemporary Mathematics, vol. 313. American Mathematical Society, Providence (2002) 195. Nashed, M.Z., Sun, Q.: Sampling and reconstruction of signals in a reproducing Kernel subspace of L (Rd ). J. Function. Anal. 258, 2422–2452 (2010) 196. Nashed, M.Z., Sun, Q., Tang, W.S.: Average sampling in L2 . Can. Acad. Sci. Ser. 1 347, 1007–1010 (2009) 197. Nashed, M.Z., Walter, G.G.: General sampling theorems for functions in reproducing Kernel Hilbert space. Math. Contr. Signals Syst. 4, 363–390 (1991) 198. Nashed, M.Z., Walter, G.G.: Reproducing Kernel Hilbert space from sampling expansions. Contemp. Math. 190, 221–226 (1995) 199. Nerem, R.S., Tapley, B.D., Shum, C.K.: A general ocean circulation model determined in a simultaneous solution with the Earth’s gravity field. In: Sünkel, H., Baker, T. (eds.) Sea Surface Topography and the Geoid. International Association of Geodesy Symposia, vol. 104, pp. 158–199. Springer, New York (1990) 200. Nutz, H.: A unified setup of gravitational field observables. Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern. Shaker, Aachen (2002) 201. Nutz, H., Wolf, K.: Multiresolution analysis of hydrology and satellite gravitational data. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) 
Handbook of Geomathematics, vol. 1, 2nd edn., pp. 497–518. Springer, Berlin/Heidelberg (2015)

4 Spherical Harmonics, Splines, and Wavelets

191

202. Olsen, N., Glassmeier, K.-H., Jia, X.: Separation of the magnetic field into external and internal parts. Space Sci. Rev. 152, 159–222 (2010) 203. Olsen, N., Hulot, G., Sabaka, T.J.: Sources of the geomagnetic field and the modern data that enable their investigation. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, pp. 106–124. Springer, Heidelberg (2010) 204. Papoulis, A.: Error analysis in sampling theory. Proc. IEEE 54, 947–955 (1966) 205. Papoulis, A.: Generalized sampling expansion. IEEE Trans. Circ. Syst. 24, 652–654 (1977) 206. Parzen, E.: A simple proof and some extensions of the sampling theorem. Technical Report No. 7, Department of Statistics, Stanford University, pp. 1–10 (1956) 207. Pavlis, N.K., Holmes, S.A., Kenyon, S.C., John K., Factor, J.K.: The development and evaluation of the earth gravitational model 2008 (EGM2008). J. Geophys. Res. Solid Earth (1978–2012) 117(B4), 04406 (2012) 208. Pesenson, I.: Splines and wavelets on geophysically relevant manifolds. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 3, 2nd edn., pp. 2527– 2562. Springer, New York/Berlin/Heidelberg (2015) 209. Petersen, D.P., Middleton, D.: Sampling and reconstruction of wave-number-limited functions in N -dimensional Euclidean spaces. Inf. Control. 5, 279–323 (1962) 210. Plattner, A., Simons, F.J.: Spatiospectral concentration of vector fields on a sphere. Appl. Comput. Harmon. Anal. (ACHA) 36, 1–22 (2014) 211. Potts, D., Steidl, G., Tasche, M.: Fast Fourier transforms for nonequispaced data: a tutorial, modern sampling theory. Appl. Numer. Harmon. Anal. (ACHA) 22, 247–270 (2001) 212. Potts, D., Tasche, M.: Interpolatory wavelets on the sphere. In: Chui, C.K., Schumaker, L.L. (eds.) Approximation Theory VIII, vol. 2, pp. 335–342. World Scientific, Singapore (1995) 213. Raabe, H.: Untersuchungen an der wechselseitigen Mehrfachübertragung (Multiplexübertragung). 
Elektrische Nachrichtentechnik 16, 213–228 (1939) 214. Rauhut, H., Ward, R.: Sparse recovery for spherical harmonic expansions. In: Proceedings of the SampTA, pp. 1–4 (2011) 215. Rawn, M.D.: Generalized sampling theorems for bessel-type transformations of band-limited functions and distributions. SIAM J. Appl. Math. 49, 638–649 (1989) 216. Reimer, M.: Multivariate Polynomial Approximation. Birkhäuser, Basel/Boston/Berlin (2003) 217. Reuter, R.: Über Integralformeln der Einheitssphäre und harmonische Splinefunktionen. PhDThesis, Veröff. Geod. Inst. RWTH Aachen, Report No. 33 (1982) 218. Rummel, R.: Spherical spectral properties of the Earth’s gravitational potential and its first and second derivatives. In: Rummel, R., Sanso, F. (eds.) Lecture Notes in Earth Science, vol. 65, pp. 359–404. Springer, Berlin (1997) 219. Rummel, R.: Geodetic boundary value problems in view of the one centimeter geoid. In: Rummel, R., Sansò, F. (eds.) Lecture Notes in Earth Sciences, vol. 65. Springer, Berlin/Heidelberg (1997) 220. Rummel, R., Balmino, G., Johannessen, J., Visser, P., Woodworth P.: Dedicated gravity field missions – principles and aims. J. Geodyn. 33, 3–20 (2002) 221. Rummel, R., van Gelderen, M.: Meissl scheme-spectral characteristics of physical geodesy. Manuscr. Geod. 20, 379–385 (1995) 222. Rummel, R., van Gelderen, M., Koop, R., Schrama, E., Sansó, F., Brovelli, M., Miggliaccio, F., Sacerdote, F.: Spherical Harmonic Analysis of Satellite Gradiometry. Netherlands Geodetic Commission. New Series, vol. 39 (1993) 223. Rummel, R., Reigber, C., Ilk, K.: The use of satellite-to-satellite tracking for gravity parameter recovery. ESA Workshop on Space Oceanography, Navigation, and Geodynamics (SONG), ESA-SP-137, pp. 151–161 (1978) 224. Sansò, F., Tscherning, C.C.: The inverse gravimetric problem in gravity modelling. In: Kejlsø, E., Poder, K., Tscherning, C.C. (eds.) Festschrift to Torben Krarup, pp. 299–334. Geodätisk Institute, Copenhagen (1989)

192

W. Freeden and M. Schreiner

225. Schmidt, M., Dettmering, D., Seitz, F.: Using B-splines expansions for ionosphere modeling. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.), Handbook of Geomathematics, vol. 1, 2nd edn., pp. 939–984. Springer, Berlin/Heidelberg (2015) 226. Schneider, F.: Inverse problems in satellite geodesy and their approximate solution by splines and wavelets. Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern. Shaker, Aachen (1997) 227. Schreiner, M.: Tensor spherical harmonics and their application in satellite gradiometry. Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern (1994) 228. Schreiner, M.: Locally supported kernels for spherical spline interpolation. J. Approx. Theory 89, 172–194 (1997) 229. Schreiner, M.: Wavelet approximation by spherical up functions. Habilitation Thesis, Geomathematics Group, University of Kaiserslautern. Shaker, Aachen (2004) 230. Schröder, P., Sweldens, W.: Spherical wavelets. In: Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH’95), pp. 161–172. ACM, New York (1995) 231. Shannon, C.E.: Communication in the presence of noise. Proc. Inst. Radio Eng. 37, 10–21 (1949) 232. Shannon, C.E.: The Mathematical Theory of Communication. University of Illinois Press, Urbana (1949) 233. Shen, X., Zayed, A.I. (eds.): Multiscale signal analysis and modeling. Springer, New York (2013) 234. Shure, L., Parker, R.L., Backus, G.E.: Harmonic splines for geomagnetic modelling. Phys. Earth Planet. Int. 28, 215–229 (1982) 235. Simons, F.J.: Slepian functions and their use in signal estimation and spectral analysis. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 1st edn., pp. 891–923. Springer, Heidelberg (2010) 236. Simons, F.J., Dahlen, F.A., Wieczorek, M.: Spatiospectral concentration on a sphere. SIAM Rev. 48, 504–536 (2006) 237. Simons, F.J., Dahlen, F.A.: Spherical Slepian functions and the polar gap in geodesy. Geoph. J. Int. 
166, 1039–1061 (2006) 238. Simons, F.J., Plattner, A.: Scalar and vector Slepian functions, spherical signal estimation and spectral analysis. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn., pp. 2563–2608. Springer, Berlin/Heidelberg (2015) 239. Smale, S., Zhou, D.-X.: Shannon sampling and function reconstruction from point values. Bull. Am. Math. Soc. 41, 279–305 (2004) 240. Steiner, A.K., Hunt, D., Ho, S.-P., Kirchengast, G., Mannucci, A.J., Scherllin-Pirscher, B., Gleisner, H., von Engeln, A., Schmidt, T., Ao, C.O., Leroy, S.S., Herman, B., Kursinski, E.R., Foelsche, U., Gorbunov, M., Heise, S., Kuo, Y.-H., Lauritsen, K.B., Marquardt, C., Rocken, C., Schreiner, W., Sokolovskiy, S., Syndergaard, S., Wickert, J.: Quantification of structural uncertainty in climate data records from GPS radio occultation. Atmos. Chem. Phys. 13, 1469–1484 (2013) 241. Stenger, F.: Approximations via Whittaker’s cardinal function. J. Approx. Theory 17, 222–240 (1976) 242. Stenger, F.: Numerical methods based on Whittaker cardinal, or sinc functions. SIAM Rev. 23, 165–224 (1981) 243. Sun, Q.: Non-uniform sampling and reconstruction for signals with finite rate of innovations. SIAM J. Math. Anal. 38, 1389–1422 (2006) 244. Sun, Q.: Frames in spaces with finite rate of innovation. Adv. Comput. Math. 28, 301–329 (2008) 245. Sun, Q.: Local reconstruction for sampling in shift-invariant spaces. Adv. Comput. Math. 32, 335–352 (2010) 246. Svensson, S.L.: Pseudodifferential operators. A new approach to the boundary value problems of Physical Geodesy. Manuscr. Geod. 8, 1–40 (1983) 247. Svensson, S.L.: Finite elements on the sphere. J. Approx. Theory 40, 246–260 (1984)

4 Spherical Harmonics, Splines, and Wavelets

193

248. Sweldens, W.: The lifting scheme: a construction of second generation wavelets. SIAM J. Math. Anal. 29, 511–546 (1997) 249. Tikhonov, A.N., Arsenin, V.Y.: Solutions of Ill-Posed Problems. Wiley, New York (1977) 250. Tikhonov, A.N., Goncharsky, A.V., Stepanov, V.V., Yagola, A.G.: Numerical Methods for the Solution of Ill-Posed Problems. Kluwer, Dordrecht (1995) 251. Unser, M.: Sampling – 50 years after Shannon. Proc. IEEE 88, 569–587 (2000) 252. Vetterli, M., Marziliano, P. Blu, T.: Sampling signals with finite rate of innovation. IEEE Trans. Signal Process. 50, 1417–1428 (2002) 253. Wahba, G.: Spline interpolation and smoothing on the sphere. SIAM J. Sci. Stat. Comput. 2, 5–16 (also: Errata SIAM J. Sci. Stat. Comput. 3, 385–386 1981) (1981) 254. Wahba, G.: Spline Models for Observational Data. CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 59. SIAM, Philadelphia (1990) 255. Walter, A.V., Schäfer, R.W.: Digital Signal Processing. Prentice-Hall, New York (1989) 256. Walter, G.G.: A sampling theorem for wavelet subspace. IEEE Trans. Inform. Theor. 38, 881–884 (1992) 257. Weber, E.: The geometry of sampling on unions of lattices. Proc. Am. Math. Soc. (2002) 258. Weinreich, I.: A construction of C(1) –wavelets on the two-dimensional sphere. Appl. Comput. Harmon. Anal. (ACHA) 10, 1–26 (2001) 259. Whittaker, E.T.: On the functions which are represented by the expansions of the interpolation theory. Proc. R. Soc. Edin. Sec. A 35, 181–194 (1915) 260. Whittaker, E.T.: On a new method of graduation. Proc. Edin. Math. Soc. 41, 63–75 (1923) 261. Whittaker, E.T.: The Fourier theory of cardinal functions. Proc. Math. Cox. Edin. 1, 169–176 (1929) 262. Wiaux, Y., Jacques, L., Vandergheynst, P.: Correspondence principle between spherical and Euclidean wavelets. Astrophys. J. 632, 15–28 (2005) 263. Wiaux, Y., McEwen, J.D., Vandergheynst, P., Blanc, O.: Exact reconstruction with directional wavelets on the sphere. Mon. Not. R. Astron. Soc. 
388, 770–788 (2008) 264. Wicht, J., Stellmach, S., Harder, H.: Numerical dynamo simulations: from basic concepts to realistic models. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 1st edn., pp. 459–502. Springer, Heidelberg (2010) 265. Wieczorek, M.A.: The gravity and topography of the terrestrial planets. In: Spohn, T. (ed.) Treatise on Geophysics, vol.10, 2nd edn., pp. 153–2193. Elsevier, Amsterdam (2015) 266. Wojciech, J., Carr, N.A., Jensen, H.W.: Importance sampling spherical harmonics. Eurographics 28 (2009) 267. Wolf, K.: Multiscale modeling of classical boundary value problems in physical geodesy by locally supported wavelets. Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern (2009) 268. Yao, K.: Applications of reproducing Kernel Hilbert spaces-bandlimited signal models. Inf. Control 11, 429–444 (1967) 269. Zayed, A.: Advances in Shannon’s Sampling Theory. CRC Press/Chapman and Hall, Boca Raton/New York/London (1993) 270. Zayed, A., Hinsen, G., Butzer, P.L.: On Lagrange interpolation and Kramer-type sampling theorems associated with Sturm-Liouville problems. SIAM J. Appl. Math. 50, 893–909 (1990) 271. Zayed, A.I., Schmeisser, G.: New Perspectives on Approximation and Sampling. Applied and Harmonic Analysis. Birkhäuser, Basel (2014) 272. Zidarov, D.P.: Inverse Gravimetric Problem in Geoprospecting and Geodesy. Developments in Solid Earth Geophysics, vol. 19. Elsevier, Amsterdam (1990)

5 A Mathematical View on Spin-Weighted Spherical Harmonics and Their Applications in Geodesy

Volker Michel and Katrin Seibert

Contents

1 Introduction
2 Preliminaries
3 The Spin Weight
4 Definition by Newman and Penrose
5 Properties and Relations
6 Definition by the Wigner D-Function
7 The Operator ð
8 The Uniqueness of the Eigenfunctions of Δ∗,N
9 Additional Properties
10 Relation to the Scalar, Vector, and Tensor Spherical Harmonics
Literature

Abstract The spin-weighted spherical harmonics (introduced by Newman and Penrose) form an orthonormal basis of L²(Ω) on the unit sphere Ω and have a wide range of applications. They are used mainly in quantum mechanics and in geophysics for the theory of gravitation, as well as in early-universe and classical cosmology. Furthermore, they also have applications in geodesy. This versatility is made possible by the variety of available formulations. The formulations we use are, for example, given by the Wigner D-function, by spin raising and spin lowering operators, or as functions of a given spin weight.

This chapter is part of the series Handbuch der Geodäsie, volume “Mathematische Geodäsie/ Mathematical Geodesy”, edited by Willi Freeden, Kaiserslautern. V. Michel () · K. Seibert Geomathematics Group, University of Siegen, Siegen, Germany E-Mail: [email protected]; [email protected] © Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020 W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_102


We present a unified mathematical theory which comprises the previously known properties of the spin-weighted spherical harmonics. We recapitulate these in a mathematically rigorous way and connect them to the notation of the theory of spherical harmonics. Here, the fact that the spherical harmonics are the spin-weighted spherical harmonics with spin weight zero is useful. Furthermore, our novel mathematical approach enables us to prove some previously unknown properties. For example, we formulate new recursion relations and a Christoffel-Darboux formula. Moreover, it is known that the spin-weighted spherical harmonics are eigenfunctions of a differential operator. In this context, we derive Green's second surface identity for this differential operator and show that the spin-weighted spherical harmonics are its only eigenfunctions.

Keywords addition theorem · Christoffel-Darboux formula · Green's second surface identity · recursion relations · scalar, vector, and tensor spherical harmonics · spin lowering and raising operators · spin weight · spin-weighted Beltrami operator · spin-weighted spherical harmonics · Sturm-Liouville differential equation · survey article · unified setup · Wigner D-function

1 Introduction

This article addresses the spin-weighted spherical harmonics defined by Newman and Penrose [22]. It comprises the treatise on these special functions from the thesis [27]. As a survey article, it presents a unified mathematical theory of the spin-weighted spherical harmonics. In this context, a rigorous setup is developed and utilized to prove known and previously unknown theorems for these basis functions. Furthermore, the fact that the spin-weighted spherical harmonics of spin weight zero are the well-known fully normalized scalar spherical harmonics (in the notation of Freeden, Gervens, and Schreiner [14, 15]) provides a connection to a widely used function system. Due to this link, the spin-weighted spherical harmonics are in some literature also called the generalized spherical harmonics (with minor differences in the notations) [6, 29].

The spin-weighted spherical harmonics are remarkably versatile. They can be formulated in different ways, for example in terms of the Wigner D-functions, the Jacobi polynomials, and the generalized associated Legendre functions [31]. In addition, scalar, vector, and tensor spherical harmonics can be represented in terms of the spin-weighted spherical harmonics. It is shown that this interplay can be exploited to define a basis transformation of the vector spherical harmonics of Hill [17] and the tensor spherical harmonics of Freeden, Gervens, and Schreiner [13], with the advantage that the recombined basis functions become orthonormal already with respect to the dot or the double dot product [30].

Due to the mentioned connection with other popular systems of trial functions, the spin-weighted spherical harmonics have a wide-ranging field of applications in physics, geophysics, and geodesy. They are used for the theory of gravitation [22], for example for the study of the asymptotic behavior of the gravitational field in null directions [8, 31], and in early-universe and classical cosmology, for instance for the analysis of the temperature and polarization anisotropies of the cosmic microwave background [34].

The outline of this article is as follows. We start with the definition of the spin weight. First, we define the functions o_i and ô_i, i = 1, 2, which deliver the spin weight of a function. Then, we explain the definition of the spin-weighted spherical harmonics with the help of the spin raising and lowering operators ð and ð̄. This approach corresponds to the definition of Newman and Penrose [22], which connects the spin-weighted spherical harmonics to the spherical harmonics by an iterative use of these operators. Moreover, this definition is reformulated into a recursion which connects the spin-weighted spherical harmonics of spin weight N to those of spin weight N ± 1. Next, we introduce several properties of the spin-weighted spherical harmonics, for example known and new recursion relations. Furthermore, we formulate a Christoffel-Darboux formula and define the spin-weighted Beltrami operator with the help of a recapitulation of a Sturm-Liouville differential equation.

In the next section, we deal with the definition of the spin-weighted spherical harmonics by the Wigner D-function. This is one of the most important properties of the spin-weighted spherical harmonics, since the Wigner D-function is already very well investigated, for example in [10, 32]. With the help of this representation, we can find further formulations of the spin-weighted spherical harmonics, such as a finite series expansion and, consequently, a finite series expansion in the functions o_i and ô_i, i = 1, 2. From the finite series expansion, we prove that the spin-weighted spherical harmonics are bounded. Moreover, Green's second surface identity for the spin-weighted Beltrami operator is formulated subsequently. Furthermore, the definition of the spin-weighted spherical harmonics by the Wigner D-function allows us to express them in terms of Jacobi polynomials. In addition, the formulation by the Wigner D-function enables us to prove the orthonormality of the spin-weighted spherical harmonics on L²(Ω) and to formulate the addition theorem for the spin-weighted spherical harmonics.

Additionally, we pay attention to the properties of the spin raising and lowering operators. Here, we connect, for example, these operators to the spin-weighted Beltrami operator. In the subsequent section, the kernel of the spin raising and lowering operators is described. Also, we show the completeness of the spin-weighted spherical harmonics and the Parseval identity. Then, we define the set of the (∗, N)-harmonic functions of spin weight N and degree n, which is spanned by the spin-weighted spherical harmonics. With all these properties, we show the uniqueness of the spin-weighted spherical harmonics as the only eigenfunctions of the spin-weighted Beltrami operator. In the section about additional properties of the spin-weighted spherical harmonics, we mainly collect recursion relations for the spin-weighted spherical harmonics, which are given in [32] for the Wigner D-function.


In the last section, we deal with the relation between the spin-weighted spherical harmonics and the scalar, vector, and tensor spherical harmonics.

2 Preliminaries

In this section, we present the basics and notations which we use in this article. We start with the definition of function spaces and their associated norms.

Definition 1. Let D ⊂ Rⁿ be compact and W ⊂ R be given. Then, $\mathrm{C}^{(k)}(D, W)$ is the space of all functions F : D → W which are at least differentiable to order k ∈ N₀, where the k-th derivative is continuous. If W = R, then we denote $\mathrm{C}^{(k)}(D, \mathbb{R}) := \mathrm{C}^{(k)}(D)$, and if k = 0, then we write $\mathrm{C}^{(0)}(D, W) := \mathrm{C}(D, W)$.

Definition 2. The norm for $F \in \mathrm{C}^{(p)}(D)$, 0 ≤ p ≤ ∞, D ⊂ Rⁿ compact, is given by

$$\|F\|_{\mathrm{C}(D)} := \sup_{x \in D} |F(x)|,$$

where |y| denotes the Euclidean norm for y ∈ Rⁿ, n ∈ N.

Next, we look at the notations for spherical calculations from [14, 15, 21]. We start with contents borrowed mainly from [21]. We begin with basic definitions.

Definition 3. The unit sphere Ω of the three-dimensional Euclidean space R³ is represented by

$$\Omega = \left\{ x \in \mathbb{R}^3 \,:\, |x| = 1 \right\}.$$

Definition 4. With $\varepsilon^r, \varepsilon^\varphi, \varepsilon^t$ we denote a local orthonormal basis on the unit sphere Ω given by

$$\xi = \varepsilon^r = \begin{pmatrix} \sqrt{1-t^2}\,\cos\varphi \\ \sqrt{1-t^2}\,\sin\varphi \\ t \end{pmatrix}, \qquad \varepsilon^\varphi = \begin{pmatrix} -\sin\varphi \\ \cos\varphi \\ 0 \end{pmatrix}, \qquad \varepsilon^t = \begin{pmatrix} -t\,\cos\varphi \\ -t\,\sin\varphi \\ \sqrt{1-t^2} \end{pmatrix},$$

where t ∈ [−1, 1] is the polar distance and ϕ ∈ [0, 2π) is the longitude. Note that $\varepsilon^r$ points radially outward, $\varepsilon^\varphi$ eastward, and $\varepsilon^t$ northward. Moreover, for t = −1 we obtain the South Pole and for t = 1 the North Pole.

Definition 5. We define the tensors [15]

$$\mathbf{i}_{\tan}(\xi) := \varepsilon^\varphi \otimes \varepsilon^\varphi + \varepsilon^t \otimes \varepsilon^t$$

199

and jtan (ξ ) := εt ⊗ εϕ − εϕ ⊗ εt for ξ = ξ(t, ϕ) ∈ Ω. We use the following notation for partial differentiation: ∂x :=

∂ . ∂x

Then, we can define the following well-known differential operators. Definition 6. The gradient is defined by ⎛ ⎞ ∂x1   ∇x := ∂xi i=1,2,3 = ⎝∂x2 ⎠ ∂x 3 and the Laplace operator by Δx := ∂x21 + ∂x22 + ∂x23 for x = (x1 , x2 , x3 )T ∈ D ⊂ R3 . In the following, we introduce the spherical differential operators. Definition 7. The surface gradient  1 ∇ξ∗ := εϕ √ ∂ϕ + ε t 1 − t 2 ∂t 1 − t2 and the surface curl gradient L∗ξ := −εϕ



1 − t 2 ∂t + ε t √

1 1 − t2

∂ϕ

for " ξ = ξ(t,#ϕ) ∈ Ω are differential operators on the sphere such that ∇rξ = ξ ∂r + 1r ∇ξ∗ for r ∈ R+ , ξ ∈ Ω and L∗ξ = ξ ∧ ∇ξ∗ . ξ = ξ(t, ϕ) is the polar coordinate representation of ξ ∈ Ω. Moreover, Δ∗ξ := ∂t

"" # # 1 − t 2 ∂t +

1 ∂2 1 − t2 ϕ

is the Beltrami operator such that Δ∗ = ∇ ∗ · ∇ ∗ = L∗ · L∗ .
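The basis conventions above lend themselves to a quick numerical plausibility check. The following sketch (an illustration added here, not part of the original chapter; Python with NumPy assumed) verifies that the vectors of Definition 4 are orthonormal and right-handed in the sense $\varepsilon^\varphi \wedge \varepsilon^t = \varepsilon^r$, which is the orientation behind $L^*_\xi = \xi \wedge \nabla^*_\xi$.

```python
import numpy as np

def local_basis(t, phi):
    """Local orthonormal basis of Definition 4 at xi = xi(t, phi)."""
    s = np.sqrt(1.0 - t**2)
    eps_r   = np.array([s * np.cos(phi),  s * np.sin(phi),  t])
    eps_phi = np.array([-np.sin(phi),     np.cos(phi),      0.0])
    eps_t   = np.array([-t * np.cos(phi), -t * np.sin(phi), s])
    return eps_r, eps_phi, eps_t

rng = np.random.default_rng(42)
for _ in range(100):
    t, phi = rng.uniform(-1.0, 1.0), rng.uniform(0.0, 2.0 * np.pi)
    er, ep, et = local_basis(t, phi)
    B = np.column_stack([er, ep, et])
    assert np.allclose(B.T @ B, np.eye(3), atol=1e-12)    # orthonormality
    assert np.allclose(np.cross(ep, et), er, atol=1e-12)  # eps_phi x eps_t = eps_r
```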


Lemma 1. The Laplace operator can be decomposed into spherical components, given for x = rξ ∈ R³, r = |x| ∈ R, and ξ ∈ Ω by

$$\Delta_x = \partial_r^2 + \frac{2}{r}\, \partial_r + \frac{1}{r^2}\, \Delta^*_\xi.$$

Now, we can write down the well-known Green's second surface identity [21]. Note that all integrals in this article are Lebesgue integrals.

Theorem 1. Green's second surface identity is given by

$$\int_\Gamma \left( F(\xi)\, \Delta^*_\xi G(\xi) - G(\xi)\, \Delta^*_\xi F(\xi) \right) \mathrm{d}\omega(\xi) = \int_{\partial\Gamma} \left( F(\xi)\, \frac{\partial}{\partial\nu(\xi)}\, G(\xi) - G(\xi)\, \frac{\partial}{\partial\nu(\xi)}\, F(\xi) \right) \mathrm{d}\sigma(\xi),$$

where $F, G \in \mathrm{C}^{(2)}(\overline{\Gamma})$, Γ ⊂ Ω has a sufficiently smooth boundary, and ν is the outward unit normal vector field to ∂Γ. Then, Green's second surface identity on the unit sphere is given by

$$\int_\Omega \left( F(\xi)\, \Delta^*_\xi G(\xi) - G(\xi)\, \Delta^*_\xi F(\xi) \right) \mathrm{d}\omega(\xi) = 0$$

for $F, G \in \mathrm{C}^{(2)}(\Omega)$.

Next, we deal with spherical function spaces.

Definition 8. For 1 ≤ p < ∞ and a (Lebesgue) measurable set D ⊂ Rⁿ, we denote by $\mathcal{L}^p(D, \mathbb{R}^m)$ the space of all (Lebesgue) measurable functions F : D → Rᵐ with

$$\int_D |F(x)|^p\, \mathrm{d}x < \infty.$$

Let $\mathcal{N}^p(D, \mathbb{R}^m)$ be the space of all (Lebesgue) measurable functions F : D → Rᵐ with

$$\int_D |F(x)|^p\, \mathrm{d}x = 0.$$

Then, we define $\left( L^p(D, \mathbb{R}^m), \|\cdot\|_p \right)$ by

$$L^p(D, \mathbb{R}^m) := \mathcal{L}^p(D, \mathbb{R}^m) \,/\, \mathcal{N}^p(D, \mathbb{R}^m),$$

where

$$\|F\|_p := \|F\|_{L^p(D,\mathbb{R}^m)} := \left( \int_D |F(x)|^p\, \mathrm{d}x \right)^{1/p} < \infty.$$

Theorem 2. The function space $\left( L^p(D, \mathbb{R}^m), \|\cdot\|_p \right)$ is a Banach space.

Note that functions which are almost everywhere equal are considered to be identical. So, all functions F with $\|F\|_{L^p(D)} = 0$ are identified with the zero function. In this article, we mostly use p = 2 and m = 1. This yields scalar function spaces, and we write L²(D, R) = L²(D). The set $\left( L^2(D), \|\cdot\|_2 \right)$ is a Hilbert space.

Theorem 3. We get

$$\overline{\mathrm{C}(\Omega)}^{\,\|\cdot\|_{L^2(\Omega)}} = L^2(\Omega).$$

Definition 9. We define the inner product on L²(Ω) for scalar functions F, G ∈ L²(Ω) by

$$\langle F, G \rangle_{L^2(\Omega)} := \int_\Omega F(\xi)\, \overline{G(\xi)}\, \mathrm{d}\omega(\xi).$$
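Definition 9 translates directly into numerical quadrature. The following sketch (an illustration added here, not part of the original chapter; Python with NumPy/SciPy assumed) approximates the inner product by Gauss–Legendre quadrature in t and the trapezoidal rule in ϕ, and checks it on constants and on fully normalized spherical harmonics built from SciPy's `lpmv` (which already contains the Condon–Shortley factor (−1)^j, matching the convention used later in Definition 16).

```python
import numpy as np
from math import factorial
from scipy.special import lpmv

def inner_product(F, G, n_t=60, n_phi=128):
    """Approximate <F, G>_{L2(Omega)} = int_Omega F conj(G) d_omega."""
    t, w = np.polynomial.legendre.leggauss(n_t)            # nodes, weights on [-1, 1]
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)
    T, PHI = np.meshgrid(t, phi, indexing="ij")
    vals = F(T, PHI) * np.conj(G(T, PHI))
    return (2.0 * np.pi / n_phi) * np.sum(w[:, None] * vals)

def Y(n, j):
    """Fully normalized complex spherical harmonic Y_{n,j}(xi(t, phi))."""
    if j < 0:
        return lambda t, phi: (-1)**j * np.conj(Y(n, -j)(t, phi))
    c = np.sqrt((2*n + 1) / (4*np.pi) * factorial(n - j) / factorial(n + j))
    return lambda t, phi: c * lpmv(j, n, t) * np.exp(1j * j * phi)

# <1, 1> = |Omega| = 4*pi
one = lambda t, phi: np.ones_like(t, dtype=complex)
assert np.isclose(inner_product(one, one).real, 4.0 * np.pi)

# orthonormality: <Y_{n,j}, Y_{n',j'}> = delta_{n,n'} delta_{j,j'}
assert np.isclose(inner_product(Y(3, 2), Y(3, 2)), 1.0)
assert abs(inner_product(Y(3, 2), Y(4, 2))) < 1e-10
assert abs(inner_product(Y(3, 2), Y(3, -2))) < 1e-10
```

Since the integrands here are trigonometric polynomials in ϕ and polynomials in t, both quadrature rules are exact up to round-off for moderate degrees.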

Now, we define the spherical harmonics, where we follow again mainly [21]. We start with the function spaces for the spherical harmonics.

Definition 10. Let D ⊂ R³ be open and connected. A twice continuously differentiable function $F \in \mathrm{C}^{(2)}(D)$ is called harmonic if the equation $\Delta_x F(x) = 0$ is fulfilled for all x ∈ D.

Definition 11. An, in general, complex-valued polynomial P on Rᵐ, m ∈ N, is called homogeneous of degree n ∈ N₀ if there exist constants $C_\alpha$, which do not all vanish, such that

$$P(x) = \sum_{|\alpha|=n} C_\alpha\, x^\alpha$$

for all x ∈ Rᵐ, where $\alpha = (\alpha_1, \ldots, \alpha_m) \in \mathbb{N}_0^m$, $|\alpha| := \sum_{i=1}^m \alpha_i$, and $x^\alpha := \prod_{i=1}^m x_i^{\alpha_i}$. $\mathrm{Hom}_n(\mathbb{R}^m)$ denotes the set of the homogeneous polynomials of degree n.
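Definitions 10 and 11 can be made concrete with a small computer-algebra check (an added sketch, not part of the original chapter; SymPy assumed): x² − y² is homogeneous of degree 2 and harmonic, while x² + y² + z² is homogeneous but not harmonic. For homogeneity, the equivalent scaling characterization P(λx) = λⁿ P(x) is used.

```python
import sympy as sp

x, y, z = sp.symbols("x y z", real=True)
lam = sp.symbols("lam", positive=True)

def is_harmonic(P):
    """Definition 10: Delta P = 0 on R^3."""
    return sp.simplify(sp.diff(P, x, 2) + sp.diff(P, y, 2) + sp.diff(P, z, 2)) == 0

def is_homogeneous(P, n):
    """Definition 11, via the equivalent scaling property P(lam*x) = lam**n * P(x)."""
    return sp.simplify(P.subs({x: lam*x, y: lam*y, z: lam*z}) - lam**n * P) == 0

P = x**2 - y**2           # homogeneous of degree 2 and harmonic
Q = x**2 + y**2 + z**2    # homogeneous of degree 2, but Delta Q = 6 != 0

assert is_harmonic(P) and is_homogeneous(P, 2)
assert is_homogeneous(Q, 2) and not is_harmonic(Q)
```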


Definition 12. $\mathrm{Harm}_n(\mathbb{R}^3)$, n ∈ N₀, denotes the space of all homogeneous harmonic polynomials of degree n. We also define the spaces

$$\mathrm{Harm}_{0\ldots n}\left(\mathbb{R}^3\right) := \bigoplus_{i=0}^{n} \mathrm{Harm}_i\left(\mathbb{R}^3\right)$$

and

$$\mathrm{Harm}_{0\ldots\infty}\left(\mathbb{R}^3\right) := \bigcup_{i=0}^{\infty} \mathrm{Harm}_{0\ldots i}\left(\mathbb{R}^3\right).$$

For D ⊂ R³, we denote

$$\mathrm{Harm}_n(D) := \left\{ P|_D \,:\, P \in \mathrm{Harm}_n\left(\mathbb{R}^3\right) \right\}$$

and

$$\mathrm{Harm}_{0\ldots\infty}(D) := \left\{ P|_D \,:\, P \in \mathrm{Harm}_{0\ldots\infty}\left(\mathbb{R}^3\right) \right\}.$$

From these definitions, we obtain the space of the spherical harmonics. In the following, we have a closer look at these special functions [21].

Definition 13. The elements of $\mathrm{Harm}_n(\Omega)$, n ∈ N₀, are called the spherical harmonics.

Theorem 4. For the dimension of $\mathrm{Harm}_n(\Omega)$, we get $\dim \mathrm{Harm}_n(\Omega) = 2n + 1$.

Now, we can introduce the spherical harmonics.

Theorem 5. For $Y_n \in \mathrm{Harm}_n(\Omega)$, $Y_m \in \mathrm{Harm}_m(\Omega)$, n, m ∈ N₀, m ≠ n, we receive $\langle Y_n, Y_m \rangle_{L^2(\Omega)} = 0$.

Definition 14. The set $\{Y_{n,j}\}_{j=-n,\ldots,n}$, n ∈ N₀, is a complete L²(Ω)-orthonormal set in the function space $\left( \mathrm{Harm}_n(\Omega), \langle\cdot,\cdot\rangle_{L^2(\Omega)} \right)$; this means that

1. $\langle Y_{n,j}, Y_{n,j'} \rangle_{L^2(\Omega)} = \delta_{j,j'}$.
2. If $\langle F, Y_{n,j} \rangle_{L^2(\Omega)} = 0$ for all j = −n, …, n and $F \in \mathrm{Harm}_n(\Omega)$, then F = 0.

Remark 1. From the previous definition, we can conclude that all $Y_n \in \mathrm{Harm}_n(\Omega)$, n ∈ N₀, can be written as

$$Y_n = \sum_{j=-n}^{n} \langle Y_n, Y_{n,j} \rangle_{L^2(\Omega)}\, Y_{n,j}.$$

Definition 15. The index n is called the degree and the index j is called the order of $Y_{n,j}$.

Remark 2. The set of the spherical harmonics $\{Y_{n,j}\}_{n \in \mathbb{N}_0,\, j=-n,\ldots,n}$ is an orthonormal set in the function space $\left( \mathrm{Harm}_{0\ldots\infty}(\Omega), \langle\cdot,\cdot\rangle_{L^2(\Omega)} \right)$; this means that $\langle Y_{n,j}, Y_{n',j'} \rangle_{L^2(\Omega)} = \delta_{n,n'}\, \delta_{j,j'}$.

Remark 3. It is obvious that for 0 ≤ p ≤ q ≤ ∞ [15]

$$\mathrm{Harm}_{p\ldots q}(\Omega) = \mathrm{span}\left\{ Y_{n,j} \right\}_{n=p,\ldots,q,\; j=-n,\ldots,n}.$$
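Theorem 4's dimension formula can be verified computationally: impose ΔP = 0 on a general homogeneous polynomial of degree n and count the remaining degrees of freedom. The sketch below (an illustration added here, not part of the original chapter; SymPy assumed) reproduces dim Harm_n = 2n + 1.

```python
import sympy as sp

x, y, z = sp.symbols("x y z")

def dim_harm(n):
    """Dimension of the homogeneous harmonic polynomials of degree n in R^3."""
    monoms = [x**i * y**j * z**(n - i - j) for i in range(n + 1) for j in range(n + 1 - i)]
    coeffs = sp.symbols(f"c0:{len(monoms)}")
    P = sum(c * m for c, m in zip(coeffs, monoms))
    lap = sp.expand(sp.diff(P, x, 2) + sp.diff(P, y, 2) + sp.diff(P, z, 2))
    if lap == 0:                            # n = 0, 1: every such polynomial is harmonic
        return len(monoms)
    eqs = sp.Poly(lap, x, y, z).coeffs()    # each coefficient of Delta P must vanish
    A, _ = sp.linear_eq_to_matrix(eqs, coeffs)
    return len(monoms) - A.rank()           # free parameters after Delta P = 0

assert [dim_harm(n) for n in range(5)] == [1, 3, 5, 7, 9]   # = 2n + 1
```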

The following theorems are very important for the use of the spherical harmonics.

Theorem 6. The spherical harmonics are eigenfunctions of the Beltrami operator; this means that for ξ ∈ Ω,

$$\Delta^*_\xi\, Y_{n,j}(\xi) = -n(n+1)\, Y_{n,j}(\xi).$$

Theorem 7 (Spherical Addition Theorem). For n ∈ N₀ and ξ, η ∈ Ω, we obtain

$$\sum_{j=-n}^{n} Y_{n,j}(\xi)\, \overline{Y_{n,j}(\eta)} = \frac{2n+1}{4\pi}\, P_n(\xi \cdot \eta).$$

Theorem 8. The set $\{Y_{n,j}\}_{n \in \mathbb{N}_0,\, j=-n,\ldots,n}$ is complete in $\left( L^2(\Omega), \langle\cdot,\cdot\rangle_{L^2(\Omega)} \right)$.
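Theorem 7 can be checked numerically for arbitrary point pairs. The sketch below (an illustration added here, not part of the original chapter; NumPy/SciPy assumed) builds the fully normalized harmonics from SciPy's `lpmv` and compares both sides of the addition theorem; the identity holds for any orthonormal basis of the space of spherical harmonics of degree n, so it is independent of the phase convention.

```python
import numpy as np
from math import factorial
from scipy.special import lpmv, eval_legendre

def Ynj(n, j, t, phi):
    """Fully normalized complex spherical harmonic (Condon-Shortley phase via lpmv)."""
    if j < 0:
        return (-1)**j * np.conj(Ynj(n, -j, t, phi))
    c = np.sqrt((2*n + 1) / (4*np.pi) * factorial(n - j) / factorial(n + j))
    return c * lpmv(j, n, t) * np.exp(1j * j * phi)

def xi(t, phi):
    """xi(t, phi) on the unit sphere as in Definition 4."""
    s = np.sqrt(1.0 - t**2)
    return np.array([s * np.cos(phi), s * np.sin(phi), t])

n = 5
(t1, p1), (t2, p2) = (0.3, 1.2), (-0.7, 4.0)

lhs = sum(Ynj(n, j, t1, p1) * np.conj(Ynj(n, j, t2, p2)) for j in range(-n, n + 1))
rhs = (2*n + 1) / (4*np.pi) * eval_legendre(n, xi(t1, p1) @ xi(t2, p2))

assert np.isclose(lhs, rhs)   # the left-hand side is real up to round-off
```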

Consequently, for F, G ∈ L²(Ω), we get

$$\lim_{L\to\infty} \left\| F - \sum_{n=0}^{L} \sum_{j=-n}^{n} \langle F, Y_{n,j} \rangle_{L^2(\Omega)}\, Y_{n,j} \right\|_{L^2(\Omega)} = 0.$$

This means that every F ∈ L²(Ω) can be written uniquely in the L²(Ω)-sense in terms of a Fourier series

$$F = \sum_{n=0}^{\infty} \sum_{j=-n}^{n} \langle F, Y_{n,j} \rangle_{L^2(\Omega)}\, Y_{n,j}.$$

Moreover, the Parseval identity

$$\langle F, G \rangle_{L^2(\Omega)} = \sum_{n=0}^{\infty} \sum_{j=-n}^{n} \langle F, Y_{n,j} \rangle_{L^2(\Omega)}\, \overline{\langle G, Y_{n,j} \rangle_{L^2(\Omega)}}$$

holds true. Theorem 9. The spherical harmonics are the only eigenfunctions of the Beltrami operator Δ∗ to the eigenvalues −n(n + 1).  That Yn,j n∈N , j =−n,...,n is a complete L2 (Ω)-orthonormal set, does not lead to 0 a unique basis. Therefore, the fully normalized spherical harmonics are introduced. The derivation can be done as shown in the proof of Theorem 14 for the spinweighted case. Theorem 14 also holds true for the fully normalized spherical harmonics, as we will see in the proof of the corresponding theorem. Definition 16. We denote the (scalar) fully normalized spherical harmonics by Yn,j (ξ(t, ϕ)) := Xn,j (t)eij ϕ

⎧   ⎨(−1)j 2n+1 (n−j )! P (t)eij ϕ , j ≥ 0 4π (n+j )! n,j := ⎩(−1)j Y j −1 are given by [21] Pn(α,β) (t)

(−1)n := n (1 − t)−α (1 + t)−β 2 n!



d dt

n

(1 − t)α+n (1 + t)β+n .

The spin-weighted spherical harmonics can be written with help of the Jacobi polynomials for j + N, j − N ≥ 0 with ξ = ξ(t, ϕ) ∈ Ω, N ∈ Z, n ∈ N0 , n ≥ |N |, j = −n, . . . , n by [10, 28, 32] N Yn,j (ξ )

(−1)j = 2j

2

2n + 1 4π

%

j +N j −N (n + j )!(n − j )! ij ϕ (j +N,j −N ) e (1 − t) 2 (1 + t) 2 Pn−j (t). (n + N)!(n − N)!

Proof. For ϑ ∈ [0, π ], we know from (2) from Remark 13 that n n n dj,−N (ϑ) = (−1)n−N +n+j d−j,N (ϑ) = (−1)j +N d−j,N (ϑ).

Then, we get with Definition 27 for t = cos ϑ the representation n dj,−N (ϑ)

(−1)j +N +n−j = 2n

%

j +N j −N (n + j )! (1 − t)− 2 (1 + t)− 2 (n − j )!(n − N)!(n + N)!

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

 ×

d dt

253

n−j   (1 − t)n+N (1 + t)n−N

(−1)j +N = 2j

%

j +N j −N (n + j )!(n − j )! (1 − t) 2 (1 + t) 2 (n − N)!(n + N)!  n−j   d (−1)n−j −(j +N ) −(j −N ) (1 − t)n+N (1 + t)n−N , × n−j (1 + t) (1 − t) 2 (n − j )! dt    (j +N,j −N)

=Pn−j

if j + N, j − N Theorem 14.

>

(t)

−1. Then, the proposition follows directly with ) (

In this context, we introduce the spin-weighted Legendre polynomials. Definition 29. We define the spin-weighted Legendre polynomials of spin weight N ∈ Z and degree n ∈ N0 , n ≥ |N |, by (−1)n N Pn (t) := n √ 2 (n + N)!(n − N)!



d dt

n   (1 − t)n−N (1 + t)n+N

for t ∈ [−1, 1]. The spin-weighted Legendre polynomials are polynomials of degree n. So, they are infinitely differentiable on [−1, 1]. Furthermore, it is obvious that we get the Legendre polynomials 0 Pn = Pn for spin weight zero. Definition 30. In addition to the considerations above, we define the associated spin-weighted Legendre functions of spin weight N ∈ Z, degree n ∈ N0 , n ≥ |N |, and order j = −n, . . . , n by

N Pn,j (t)

:=

⎧ " #j j −N ⎨(1 − t) j +N d 2 (1 + t) 2 ⎩(−1)N +j

dt

N Pn (t),

j ≥0 j ðN +1 ðN F, G

 L2 (Ω)

= 

Ω

=

Ω

= =

# " F (ξ ) ðN +1 ðN G(ξ ) dω(ξ ) # " F (ξ ) ðN +1 ðN G(ξ ) dω(ξ )

N F, ðN +1 ðN

> G

L2 (Ω)

.

) (

Due to Corollary 14, the previous lemma holds true for all N Yn,j . Now, we prove a more general version of Lemma 25. Lemma 28. Let N ∈ Z, n ∈ N0 , n ≥ |N |, j = −n, . . . , n, and p ∈ N0 . We get p

p

ðN +p ðN

N Yn,j

= (−1)p

(n − N)! (n + N + p)! (n − N − p)! (n + N )!

N Yn,j

N Yn,j

= (−1)p

(n + N)! (n − N + p)! (n + N − p)! (n − N )!

N Yn,j

for p ≤ n − N and p

p

ðN −p ðN for p ≤ n + N.

Proof. We know from Lemma 26 that for all N ∈ Z, all n ∈ N0 , and all j = −n, . . . , n, we have

280

V. Michel and K. Seibert

ðN

N Yn,j

=



n(n + 1) − N(N + 1) N +1 Yn,j

and ðN

N Yn,j

 = − n(n + 1) − N(N − 1) N −1 Yn,j .

Let p ∈ N0 . With (1), we obtain for p ≤ n − N the identity p

ðN

N Yn,j

= = =

  %

(n − N )(n + N + 1) ðp−1 N +1 Yn,j (n − N) . . . (n − N − p + 1)(n + N + 1) . . . (n + N + p) N +p Yn,j (n − N)!(n + N + p)! (n − N − p)!(n + N)!

N +p Yn,j .

We are able to derive a similar result for the case p ≤ n + N which is given by p

ðN

N Yn,j

 p−1 = − (n + N )(n − N + 1) ð N −1 Yn,j  = (−1)p (n+N) . . . (n+N −p+1)(n−N +1) . . . (n−N +p) N −p Yn,j % p (n + N)!(n − N + p)! = (−1) N −p Yn,j . (n + N − p)!(n − N)!

Then, we get for p ≤ n − N the representation % p ðN +p N +p Yn,j

= (−1)p

(n + N + p)!(n − (N + p) + p)! (n + N + p − p)!(n − (N + p))!

N Yn,j

and for p ≤ n + N the corresponding formula % p ðN −p N −p Yn,j

=

(n − N + p)!(n + N )! (n − N)!(n + N − p)!

N Yn,j .

All in all, we get for p ≤ n − N the result p

p

ðN +p ðN N Yn,j % (n − N )!(n + N + p)! p ð = N +p Yn,j (n − N − p)!(n + N)! N +p % % (n − N )!(n + N + p)! (n + N + p)!(n − N − p + p)! (−1)p = (n − N − p)!(n + N)! (n + N + p − p)!(n − N − p)!

N Yn,j

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

= (−1)p

(n − N)!(n + N + p)! (n − N − p)!(n + N)!

281

N Yn,j ,

where the case p ≤ n + N leads us to % p p ðN +p ðN N Yn,j

= (−1)

p

= (−1)p

(n + N)!(n − N + p)! (n + N − p)!(n − N)!

(n + N)!(n − N + p)! )N − p)! (n − N)!(n +(

%

(n − N + p)!(n + N )! (n − N )!(n + N − p)!

N Yn,j

N Yn,j

) (

Lemma 29. Let N ∈ Z, n ∈ N0 , and j = −n, . . . , n. For n < N, we get [22] N

ð0 Yn,j = 0. Proof. From the alternative definition of the spin-weighted spherical harmonics given in Lemma 4, we know that ðN

N Yn,j

 = − n(n + 1) − N(N − 1) N −1 Yn,j

for all N ∈ Z, all n ∈ N0 , and all j = −n, . . . , n. Then, we conclude that N

ð0 Yn,j = ð−(N −1) . . . ð0 Yn,j ⎛ = (−1) ⎝ N

⎞1 2

0 :

(n(n + 1) − k(k − 1))⎠

−N Yn,j

k=−(N −1)

= (−1)N

6N −1 :

5 12 (n(n + 1) − k(k + 1))

−N Yn,j .

k=0

The obtained product vanishes, if k = n. Because k = 0, . . . , N −1 and n < N, that is n ∈ {0, . . . , N − 1}, the value k = n occurs. Therefore, for n < N, we observe that N

ð0 Yn,j = 0.

) (

Next, we deal with the kernel of the operators ð and ð. Theorem 20. The kernel of the operator ðN is given for N ∈ Z by [9] ⎧&  ⎨ F ∈ C(1) (Ω ) ∩ L2 (Ω)  F = N 0 j =−N cj ker (ðN ) = ⎩{0},

N YN,j ,

' cj ∈ C , N ≥ 0 else

,

282

V. Michel and K. Seibert

where for ξ = ξ(t, ϕ) ∈ Ω0 , j = −N, . . . , N , we have % N+j N−j 2N + 1 (2N)! j 1 (−1) N eij ϕ (1 − t) 2 (1 + t) 2 . N YN,j (ξ ) = 4π 2 (N − j )!(N + j )!    2

=:cN,j

Proof. To calculate the kernel of the operator ðN , we have to solve the equation   N t − i∂ϕ 2 0 = ðN F (ξ ) = 1 − t ∂t + √ F (ξ ), 1 − t2 where F ∈ C(1) (Ω0 ) ∩ L2 (Ω) and N ∈ Z. We use separation of variables and set F (ξ(t, ϕ)) := f (t)g(ϕ), ξ ∈ Ω0 . Then, we obtain  Nt i 1 − t 2 f  (t)g(ϕ) + √ f (t)g(ϕ) − √ f (t)g  (ϕ) 1 − t2 1 − t2 # " 0 = 1 − t 2 f  (t)g(ϕ) + Ntf (t)g(ϕ) − if (t)g  (ϕ)

0= ⇔

and consequently, we get # f  (t) " g  (ϕ) + Nt = i . 1 − t2 f (t) g(ϕ) Now, the left-hand side depends only on t and the right-hand side only on ϕ. Consequently, both sides must be equal to a constant c. Then, we get for the dependency on ϕ that c=i

g g

⇔ g  = −icg

and, thus, we get the solution for g given by g(ϕ) = a · e−icϕ , where g(0) = g(2π ). Hence, a = ae−2iπ c



e−2iπ c = 1

and, therefore, −2iπ c = 2iπj



c = −j,

Then, we obtain for the dependency on t for all j ∈ Z

j ∈ Z.

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

" # f + Nt = −j 1 − t2 f

283

1 df j + Nt =− . f dt 1 − t2



So, we get by integration 

1 dt 1 − t2   



t dt 1 − t2    = 12 (ln(1+t)−ln(1−t)) =− 12 ln(1−t 2 )   # 1+t N " j + ln 1 − t 2 + b = − ln 2 1−t 2 # " # " # " j j N N = ln (1 + t)− 2 (1 − t) 2 + ln (1 + t) 2 (1 − t) 2 + ln b˜   N−j N+j = ln b˜ (1 + t) 2 (1 − t) 2 .   

ln (|f (t)|) = −j

−N

≥0

As a result, the solution for f (without loss of generality) is given by ˜ − t) f (t) = b(1

N+j 2

(1 + t)

N−j 2

.

Altogether, we get the representation F (ξ ) = f (t)g(ϕ) = ce ˜ ij ϕ (1 − t) 

N+j 2

(1 + t) 

=:I

N−j 2



.

For N ≥ 0 and N ≥ |j |, the exponents of I are non-negative. Therefore, I ∈ L2 (Ω). For N < 0, at least one of the exponents is negative. Therefore, I ∈ / L2 (Ω) and the only solution is zero. So, we only have to look at the case N ≥ 0 further. From Definition 27 and Theorem 14, we conclude that % 2 j +N j −N 1 (N − j )! 2N + 1 ij ϕ j e (1 − t) 2 (1 + t) 2 N YN,j (ξ ) = (−1) N 2 (N + j )!(2N)! 4π  N +j d × (1 + t)2N dt    (2N)! 2N−N−j = (2N−N−j )! (1+t) (2N)! N−j = (N−j )! (1+t)

284

V. Michel and K. Seibert

% 1 = (−1)j N 2 

(2N)! (N + j )!(N − j )!  =cN,j

2

j +N N−j 2N + 1 ij ϕ e (1 − t) 2 (1 + t) 2 . 4π 

As a consequence, we obtain that F (ξ ) =

c˜ cN,j

N YN,j (ξ )

satisfies 0 = ðN F (ξ ) for all j = −N, . . . , N and consequently, all linear combinations of the spin-weighted spherical harmonics N YN,j of spin weight and degree N . Therefore, F is a function of spin weight and degree N. ) ( Theorem 21. Analogously, the kernel of the operator ðN is given for N ∈ Z by [9] " # ker ðN =

=

 ' &   F ∈ C(1) (Ω0 ) ∩ L2 (Ω)  F = N j =−N cj −N Y−N,j , cj ∈ C , N ≤ 0 {0}, else  ' &   F ∈ C(1) (Ω0 ) ∩ L2 (Ω)  F = N j =−N cj N Y−N,j , cj ∈ C , N ≤ 0 {0},

.

else

Proof. With the kernel of the operator ðN , we see directly that for N ∈ Z ker (ð−N ) ⎧&  ⎨ F ∈ C(1) (Ω ) ∩ L2 (Ω)  F = −N c 0 j =N j = ⎩{0}, ⎧&  ⎨ F ∈ C(1) (Ω ) ∩ L2 (Ω)  F = −N c 0 j =N j = ⎩{0},

−N Y−N,j ,

' cj ∈ C ,

−N ≥ 0 else

'

−N Y−N,j ,

cj ∈ C ,

N ≤0

.

else

With Remark 8, this is equal to

" # ker ðN =

 ' &   F ∈ C(1) (Ω0 ) ∩ L2 (Ω)  F = −N j =N cj −N Y−N,j , cj ∈ C , N ≤ 0 {0},

else

.

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

285

From Definition 27 and Theorem 14, we know for ξ = ξ(t, ϕ) ∈ Ω0 that N Y−N,−j (ξ )

2

=

% j −N j +N 1 − 2N (−N + j )! −j N (−1) 2 (1 − t)− 2 (1 + t)− 2 e−ij ϕ 4π (−N − j )!(−2N)!  −j −N d × (1 − t)−2N dt    (−2N)! −2N+N+j (−1)−j −N = (−2N+N+j )! (1−t) (−2N)! j −N (−1)−j −N = (j −N)! (1−t)

2 = (−1)N

1 − 2N N 2 4π

%

N−j N+j (−2N)! (1 − t)− 2 (1 + t)− 2 e−ij ϕ (−N + j )!(−j − N)!

and, therefore, we see with help of the previous proof that −N Y−N,j (ξ )

2 =

1 − 2N (−1)j 2N 4π

= (−1)j

% N−j N+j (−2N)! (1 − t)− 2 (1 + t)− 2 e−ij ϕ (j − N)!(−j − N)!

N Y−N,−j (ξ ).

Altogether, we obtain " # ker ðN =

 ' &   F ∈ C(1) (Ω0 ) ∩ L2 (Ω)  F = N j =−N cj N Y−N,−j , cj ∈ C , N ≤ 0 {0},

else

and, consequently, we get " # ker ðN =

 ' &   F ∈ C(1) (Ω0 ) ∩ L2 (Ω)  F = N c Y , c ∈ C , N ≤0 j N −N,j j j =−N {0},

.

else

) ( Corollary 15. The operator ðN is injective for N < 0 and the operator ðN is injective for N > 0. Lemma 30. Let N ∈ Z. We recapitulate that for all functions F ∈ C(1) (Ω0 ) ∩ L2 (Ω) and for N < 0 the identity [22]

286

V. Michel and K. Seibert

ðN F = 0 implies that F = 0. Moreover, we obtain that for all functions F ∈ C(1) (Ω0 ) ∩ L2 (Ω) and for N > 0 where ðN F = 0 it also holds true that F = 0. Lemma 31. Let F ∈ L2 (Ω) and N ∈ Z be given. If . / F, N Yn,j L2 (Ω) = 0 for all n ∈ N0 , n ≥ |N|, and all j = −n, . . . , n, then F = 0. Proof. Let N ∈ Z, n ∈ N0 , n ≥ |N|, and j = −n, . . . , n. Furthermore, let without loss of generality F ∈ C(∞) (Ω). It suffices to consider only such functions, because C(∞) (Ω) is dense in L2 (Ω) [33]. • For N = 0, we get . / 0 = F, Yn,j L2 (Ω) =

 F (ξ )Yn,j (ξ ) dω(ξ ) Ω

and see directly, for example from [21], that F = 0 due to the completeness of spherical harmonics. • For N > 0, we obtain with Definition 26 that . / 0 = F, N Yn,j L2 (Ω) % > (n − N)! = F, ðN = Y 0 n,j 2 L (Ω) (n + N)! %  (n − N)! = F (ξ )ðN 0 Yn,j (ξ ) dω(ξ ) (n + N)! Ω

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

% = % =

(n − N)! (n + N)! (n − N)! (n + N)!

287

 Ω

F (ξ )ðN −1 . . . ð1 ð0 Yn,j (ξ ) dω(ξ )

 Ω

F (ξ )ð−N +1 . . . ð−1 ð0 Yn,j (ξ ) dω(ξ ).

These integrals exist, because we know from Corollary 5 that the integrands are bounded. Now, we apply N times Lemma 18 such that % 0= % = % =

(n − N)! (−1)N (n + N)! (n − N)! (−1)N (n + N)!

 "

# ð1 ð2 . . . ðN F (ξ ) Yn,j (ξ ) dω(ξ )

Ω

 " Ω

# N ðN F (ξ ) Yn,j (ξ ) dω(ξ )

= N > (n − N)! (−1)N ðN F, Yn,j 2 . L (Ω) (n + N)!

Therefore, we get > = N ðN F, Yn,j

L2 (Ω)

= N > for n ≥ N. Now, we look at ðN F, Yn,j

L2 (Ω)

=0

for n < N. Here, we see the other

way around to above that > = N ðN F, Yn,j

L2 (Ω)

=

 " # ð1 ð2 . . . ðN F (ξ ) Yn,j (ξ ) dω(ξ ) Ω



= (−1)

N Ω

F (ξ )ðN −1 . . . ð1 ð0 Yn,j (ξ ) dω(ξ ).

We obtain for ξ ∈ Ω the equation ðN −1 . . . ð1 ð0 Yn,j (ξ ) = 0 for all n = 0, . . . , N − 1. This is obvious, because for n = 0, we know from Definition 16 and from Definition 24 that 1 ð0 Y0,0 (ξ ) = ð0 √ = 0. 4π For n = 1, we use Definition 26 and get

288

V. Michel and K. Seibert

ð0 Y1,j (ξ ) =



2 1 Y1,j (ξ ),

where we know from Theorem 20 that 1 Y1,j ∈ ker (ð1 ). This means that ð1 ð0 Y1,j (ξ ) =

√ 2 ð1 1 Y1,j (ξ ) = 0.

Inductively, we get the proposition for all n ∈ N0 up to n = N − 1, where we conclude from Definition 26 and from Theorem 20 the identity ðN −1 . . . ð1 ð0 YN −1,j (ξ ) =

 (2(N − 1))! ðN −1

N −1 YN −1,j (ξ )



  ∈ker(ðN−1 )

= 0.

Altogether, this leads us to = N > ðN F, Yn,j

L2 (Ω)

=0

for all n ∈ N0 and j = −n, . . . , n. Then, we have the same case like for N = 0 and we get from [21] that N

ðN F = 0. Now, we apply N times Lemma 30. So, we get with N > 0 the proposition that F = 0. • For N < 0, we get with Definition 26 that . / 0 = F, N Yn,j L2 (Ω) % > (n + N)! = −N F, ð0 Yn,j 2 = (−1)N L (Ω) (n − N)! %  (n + N)! −N = (−1)N F (ξ )ð0 Yn,j (ξ ) dω(ξ ) (n − N)! Ω %  (n + N)! = (−1)N F (ξ )ðN +1 . . . ð−1 ð0 Yn,j (ξ ) dω(ξ ) (n − N)! Ω %  N (n + N)! F (ξ )ð−(N +1) . . . ð1 ð0 Yn,j (ξ ) dω(ξ ). = (−1) (n − N)! Ω In analogy to above, these integrals exist, because we know from Corollary 5 that the integrands are bounded. Then, we apply N times Lemma 18. This results in

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

% 0 = (−1) % = % =

N

(n + N)! (−1)N (n − N)!

(n + N)! (n − N)!

289

 Ω

(ð−1 ð−2 . . . ðN F (ξ )) Yn,j (ξ ) dω(ξ )

 " # ð−N F (ξ ) Yn,j (ξ ) dω(ξ ) N Ω

> (n + N)! = −N ðN F, Yn,j 2 . L (Ω) (n − N)!

So, we see that = > ð−N N F, Yn,j

L2 (Ω)

> = F, Y for n ≥ −N. Now, we look at ð−N n,j N

=0

L2 (Ω)

for n < −N . Here, we get the

other way around to above =

ð−N N F, Yn,j



>

=

L2 (Ω)

Ω

(ð−1 ð−2 . . . ðN F (ξ )) Yn,j (ξ ) dω(ξ )

= (−1)

 N Ω

F (ξ )ðN +1 . . . ð−1 ð0 Yn,j (ξ ) dω(ξ ).

We obtain that for ξ ∈ Ω ðN +1 . . . ð−1 ð0 Yn,j (ξ ) = 0 for all n = 0, . . . , −N − 1 (see also Lemma 29). This is obvious, because for n = 0, we receive from Definition 16 and from Definition 24 1 ð0 Y0,0 (ξ ) = ð0 √ = 0. 4π For n = 1, we can use Definition 26 such that √ ð0 Y1,j (ξ ) = − 2 −1 Y1,j (ξ ), # " where we know from Theorem 21 that −1 Y1,j ∈ ker ð−1 . This means that √ ð−1 ð0 Y1,j (ξ ) = − 2 ð−1

−1 Y1,j (ξ )

= 0.

Inductively, we get the proposition for all n ∈ N0 up to n = −N − 1, where we get from Definition 26 and from Theorem 21

290

V. Michel and K. Seibert

 ðN +1 . . . ð−1 ð0 Y−(N +1),j (ξ ) = (−1)−(N +1) (2(−N − 1))! × ðN +1 N +1 Y−(N +1),j (ξ )    " # ∈ker ð−(N+1)

= 0. All in all, we obtain the result =

ð−N N F, Yn,j

> L2 (Ω)

=0

for all n ∈ N0 and j = −n, . . . , n. Hence, we know from [21] that ð−N N F = 0. Now, we apply N times Lemma 30. Then, we get also in the case N < 0 the proposition that F = 0. ( ) From the previous lemma, we can directly conclude the completeness of the spinweighted spherical harmonics and hence, the following theorem. Theorem 22. The system Consequently, for F ∈



N Yn,j n≥|N |,j =−n,...,n 2 L (Ω), we obtain

is complete in (L2 (Ω), "., .#L2 (Ω) ).

0 0 0 0 L n 0 0 0 F − lim 0 "F, Y # Y 2 N n,j L (Ω) N n,j 0 0 L→∞ 0 0 n=|N | j =−n

= 0.

L2 (Ω)

This means that, for every function F ∈ L2 (Ω) and for every N ∈ Z, there exist coefficients N Fn,j such that F =

∞ n

N Fn,j N Yn,j ,

n=|N | j =−n

where the convergence is understood in the L2 (Ω)-sense. Thus, in combination with the orthonormality of the spin-weighted spherical har monics from Theorem 15, we obtain that N Yn,j n≥|N |,j =−n,...,n is an orthonormal basis of L2 (Ω). Therefore, the coefficients N Fn,j are unique.

Remark 18. There is an alternative proof of the previous theorem given by [22]. However, the proposition is shown for a “suitably regular” [22] function N F of spin weight N. If we want to perform the proof for functions in L2 (Ω), we have to show

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

291

N

that (ðN N F ) ∈ L2 (Ω) at least for all N F in a dense subset of L2 (Ω). Then, we can continue like in [22]. Corollary 16 (Parseval Identity for the Spin-Weighted Spherical Harmonics). For every function F, G ∈ L2 (Ω) and every N ∈ Z, the Parseval identity for the spin-weighted spherical is fulfilled such that "F, G#L2 (Ω) =

∞ n . / . / F, N Yn,j L2 (Ω) G, N Yn,j L2 (Ω)

n=|N | j =−n

and consequently, F 2L2 (Ω) =

∞ n  2 / .   F, N Yn,j L2 (Ω)  .

n=|N | j =−n

Next, we define the space which is spanned by the spin-weighted spherical harmonics. Therefore, we first have to make further assumptions. Lemma 32. The definition of a homogeneous polynomial from [21] is equivalent to P (x) = |x|n

2 i1 ,...,i2n

di1 ,...,i2n o i1 . . . oin oˆ in+1  . . . oˆ i2n n

n

for x ∈ R3 . As a consequence, all homogeneous polynomials have spin weight zero [31]. Proof. A homogeneous polynomial of degree n is defined for x = (x1 , x2 , x3 )T ∈ R3 by [21] P (x) =



Cα x α

|α|=n

=



Cα x1α1 x2α2 x3α3

|α|=n

=

3

C˜ i xi1 . . . xin ,

i1 ,...,in =1

 where α = (α1 , α2 , α3 ) ∈ N30 , |α| := 3k=1 αk , Cα a constant, i = (i1 , . . . , in ) and without loss of generality C˜ i is totally symmetric. With x = |x|ξ and ξ ∈ Ω, we get

292

V. Michel and K. Seibert

P (x) = |x|n



Cα ξ α

|α|=n 3

= |x|n

C˜ i ξi1 . . . ξin .

i1 ,...,in =1

Because ⎛ ⎞ ⎛√ ⎞ 2 ξ1 √1 − t cos ϕ ξ = ⎝ξ2 ⎠ = ⎝ 1 − t 2 sin ϕ ⎠ , ξ3 t     cos ϕ = 12 eiϕ + e−iϕ , and sin ϕ = 2i1 eiϕ − e−iϕ , we reformulate the components of the vector ξ with help of Definition 20. Moreover, we represent coordinates in terms of the functions oi and oˆ i , i = 1, 2. For the polar distance, we have 1 ((1 + t) − (1 − t)) 2 # √ √ √ 1 "√ 1+t 1+t − 1−t 1−t = 2 5 62 5 62 56 2 55 6 62 1 + t −i ϕ 1 + t iϕ 1 − t iϕ 1 − t −i ϕ 1 2 2 2 2 e e e e = +2 − 2 2 2 2 2 2

t=

= oξ1 oˆ ξ2 + oξ2 oˆ ξ1 . For the cartesian coordinates, we obtain  1 − t 2 cos ϕ 2 2 # 1 − t 1 + t 1 " iϕ =2 e + e−iϕ 2 2 2 62 5 62 5 62 56 2 5 1 − t iϕ 1 + t iϕ 1 + t −i ϕ 1 − t −i ϕ = e 2 e 2 − e 2 e 2 − 2 2 2 2

ξ1 =

= oξ2 oˆ ξ2 − oξ1 oˆ ξ1 = −oξ1 oˆ ξ1 + oξ2 oˆ ξ2 ,  ξ2 = 1 − t 2 sin ϕ 2 2 # 1 − t 1 + t 1 " iϕ =2 e − e−iϕ 2 2 2i

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

62 = −i

1 − t iϕ e 2 2

5 62

1 + t iϕ e 2 2

5

62 +

1 + t −i ϕ e 2 2

293

56 2

1 − t −i ϕ − e 2 2

5

# " = −i oξ2 oˆ ξ2 + oξ1 oˆ ξ1 # " = −i oξ1 oˆ ξ1 + oξ2 oˆ ξ2 , and ξ3 = t = oξ1 oˆ ξ2 + oξ2 oˆ ξ1 . The latter equations can be simplified to ξi =

2

i σA,B oξA oˆ ξB ,

A,B=1

where  σ1 =



 −1 0 , 0 1

σ 2 = −i

 10 , 01

σ3 =

  01 . 10

Furthermore, we see that the spin weight satisfies sw(ξi ) = 0 for i = 1, 2, 3 and therefore, sw(ξi1 · · · · · ξin ) = 0 + . . . + 0 = 0. Altogether, we conclude that

P (x) = |x|

n



3 i1 ,...,in =1

2

2

= |x|n

A1 ,...,An ,B1 ,...,Bn =1

2

= |x|n







2 σAi11 ,B1 oξA1 oˆ ξB1 ⎠ . . . ⎝ σAinn ,Bn oξAn oˆ ξBn ⎠ A1 ,B1 =1 An ,Bn =1

C˜ i ⎝

⎛ ⎝ 

3

i1 ,...,in =1



C˜ i σAi11 ,B1 . . . σAinn ,Bn ⎠ oξA1 oξA2 . . . oξAn oˆ ξB1 oˆ ξB2 . . . oˆ ξBn 

=:dA1 ,B1 ,...,An ,Bn totally symmetric



dA1 ,...,An ,B1 ,...,Bn oξA1 oξA2 . . . oξAn oˆ ξB1 oˆ ξB2 . . . oˆ ξBn

A1 ,...,An ,B1 ,...,Bn =1

= |x|n

2

i

i

di1 ,...,i2n oξi1 . . . oξin oˆ ξn+1 . . . oˆ ξ2n .       i1 ,...,i2n =1 n

n

) (

294

V. Michel and K. Seibert

Now, we can show that the spin-weighted spherical harmonics are continuously differentiable. Remark 19. For N = 0, we get from the previous lemma that the spin-weighted spherical harmonics are homogeneous polynomials. So, they are infinitely differentiable on the unit sphere Ω. This means that they are in C(∞) (Ω). For N = 0, this is not the case. We know already from Corollary 2 that the spin-weighted spherical harmonics are infinitely differentiable on Ω0 . Hence, there are problems for t = ±1. We see this in the following examples. Let ξ = ξ(t, ϕ) ∈ Ω. Then, we get from Definition 16 the examples 2 Y2,2 (ξ ) =

5 (−1)4 4π 8

2

# 1 2iϕ " e 1 − t2 24

 

d dt

4 " #2 1 − t2  

" #4 = dtd (1−2t 2 +t 4 )=24

2 =

# 15 " 1 − t 2 e2iϕ 32π

and 2 Y2,−2 (ξ ) = (−1) Y2,2 (ξ ) = 2

# 15 " 1 − t 2 e−2iϕ . 32π

So, we calculate the according spin-weighted spherical harmonics of spin weight 2 by 2 2 Y2,±2 (ξ ) =

2

" # # 1 " 5 e±2iϕ 1 − t 2 (−2) − 1 − t 2 (±2i)2 2 256π 1−t   # t " 2 1 − t (±2i) −2i (±2i)(−2t) + 1 − t2

" # 5 e±2iϕ −2 + 2t 2 + 4 ± 4 (−2t + t) 256π 2 " # 5 e±2iϕ 2 + 2t 2 ∓ 4t = 256π 2 5 = (1 ∓ t)2 e±2iϕ . 64π

=

Then, we see that for t → −1, we obtain the limit 2 2 Y2,2 (ξ ) →

5 2iϕ e 4π

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

295

and for t → 1, correspondingly, 2 2 Y2,−2 (ξ )



5 −2iϕ . e 4π

Since these limits depend on ϕ, the spin-weighted spherical harmonics are not in C(∞) (Ω). Now, we formulate a set for the spin-weighted spherical harmonics. Remark 20. It is well known that a function F ∈ C(2) (R3 ) is called harmonic, if for all x ∈ R3 Δx F (x) = 0. This is equivalent to the proposition that the polynomial of degree n ∈ N0 given by F (rξ ) = r n Yn (ξ ), r ∈ R+ , ξ ∈ Ω, is harmonic, if Δ∗ξ Yn (ξ ) = −n(n + 1)Yn (ξ ). Therefore, we formulate the following definition for the spin-weighted spherical harmonics. Definition 31. A function Yn ∈ C(2) (Ω0 ) of degree n ∈ N0 is called (∗, N)harmonic for N ∈ Z, if for all ξ ∈ Ω0 the equation Δ∗,N ξ Yn (ξ ) = −n(n + 1)Yn (ξ ) is fulfilled for the spin-weighted Beltrami operator from Corollary 3. Definition 32. With HarmN n (Ω0 ), we denote the set of the (∗, N)-harmonic polynomials of degree n ∈ N0 and spin weight N ∈ Z. This means that HarmN n (Ω0 ) : =

&

  (2) P N n  N Pn ∈ C (Ω0 ) is a (∗, N)-harmonic function of spin ' weight N and degree n .

Definition 33. Analogously to Definition 12, we define the spaces HarmN 0...n (Ω0 ) :=

n ! i=0

and

HarmN i (Ω0 )

296

V. Michel and K. Seibert

HarmN 0...∞ (Ω0 ) :=

∞ -

HarmN 0...i (Ω0 ).

i=0

Note that a (∗, N)-harmonic function gets, in general, not harmonic for N = 0 by multiplication with r n , r ∈ R, n ∈ N0 . Similarly a function of spin weight N = 0 is, in general, not homogeneous. Only for the case of spin weight zero, we get by HarmN n (Ω0 ) = Harmn (Ω0 ) the set of the harmonic and homogeneous polynomials of degree n ∈ N0 which are restricted to the unit sphere. Corollary 17. The spin-weighted spherical harmonics N Yn of spin  weight N ∈ Z and degree n ∈ N0 , n ≥ |N|, form the set HarmN n (Ω0 ). The system N Yn,j j =−n,...,n   forms an orthonormal system of HarmN n (Ω0 ), "·, ·#L2 (Ω) .  Proof. From Theorem 22, we know that the system N Yn,j n≥|N |,j =−n,...,n is com plete for all N ∈ Z. Therefore, N Yn,j j =−n,...,n is also complete in HarmN n (Ω0 ) for all N ∈ Z and all n ∈ N0 , n ≥ |N |. Furthermore, we conclude from Theorem 15 that .

/

N Yn,j , N Yn,j  L2 (Ω)

= δj,j 

for all j, j  ∈ {−n, . . . , n}. Finally, we get from Lemma 31 that "F, N Yn,j #L2 (Ω) = 0 for all j = −n, . . . , n and F ∈ HarmN n (Ω0 ) implies that F = 0.

) (

Remark 21. It is obvious that for 0 ≤ p ≤ q ≤ ∞, we have HarmN p...q (Ω0 ) = span



N Yn,j n=p,...,q,j =−n,...,n .

So, we obtain the next lemma for the C(Ω0 )-norm of the spin-weighted spherical harmonics. Lemma 33. For every n ≥ |N|

N Yn

∈ HarmN n (Ω0 ), we get for n ∈ N0 and N ∈ Z with 2

N Yn C(Ω0 ) ≤

2n + 1 N Yn L2 (Ω) . 4π

In particular, we obtain 0 0 0N Yn,j 0

C(Ω0 )

2 ≤

2n + 1 . 4π

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

297

Proof. For every N Yn ∈ HarmN n (Ω0 ), n ∈ N0 , N ∈ Z, n ≥ |N |, we know from Corollary 17 that for ξ ∈ Ω0 ,

N Yn (ξ )

=

n . j =−n

/

N Yn , N Yn,j L2 (Ω) N Yn,j (ξ ).

Then, we get with the Cauchy-Schwarz inequality    n . / |N Yn (ξ )| =  N Yn , N Yn,j L2 (Ω) j =−n

    Y (ξ ) N n,j  



⎞1 ⎛ ⎞1 2 n  n 2 2   . / 2     =⎝  N Yn , N Yn,j L2 (Ω)  ⎠ ⎝ N Yn,j (ξ ) ⎠ . j =−n

j =−n

With Corollary 11, this leads us directly to the proposition 2 N Yn C(Ω0 ) ≤

2n + 1 N Yn L2 (Ω) 4π

and particularly to 0 0 0N Yn,j 0

2

C(Ω0 )



2n + 1 . 4π

) (

Finally, we can show that the spin-weighted spherical harmonics are the only eigenfunctions of the operator Δ∗,N . Theorem 23. The spin-weighted spherical harmonics N Yn,j and their linear combinations are the only eigenfunctions of the differential operator Δ∗,N to the eigenvalues −n(n + 1) for all N ∈ Z, all n ∈ N0 , n ≥ |N |, and all j = −n, . . . , n in X2 (Ω0 ). Proof. Let ξ ∈ Ω0 . From Corollary 3, we know already that for all N ∈ Z, all n ∈ N0 , n ≥ |N|, and all j = −n, . . . , n, the property Δ∗,N ξ is valid.

N Yn,j (ξ )

= −n(n + 1) N Yn,j (ξ )

298

V. Michel and K. Seibert

(1) Determination of the eigenvalues We assume that there exists a number λ which satisfies λ = −n(n + 1) for all n ∈ N0 , n ≥ |N|, and there exists a function K ∈ X2 (Ω0 ) such that Δ∗,N K = λK. We apply Green’s second surface identity on the unit sphere for the operator Δ∗,N , Theorem 17, so that for all N ∈ Z, n ∈ N0 , n ≥ |N |, and j = −n, . . . , n, we obtain   Ω

∗,N K(ξ ) Δ∗,N N Yn,j (ξ ) −N Yn,j (ξ ) Δξ K(ξ ) ξ





=−n(n+1)





N Yn,j

⇔ (−n(n + 1) − λ)   

=0





=λK(ξ )





dω(ξ ) = 0

K(ξ ) N Yn,j (ξ ) dω(ξ ) = 0 Ω

⇔ "K, N Yn,j #L2 (Ω) = 0. From Theorem 22, we know that



N Yn,j n,j

is an orthonormal basis. Consequently,

and therefore, λ is not an eigenvalue. we obtain K = 0 in (2) Determination of the eigenfunctions Let us now assume that we have an eigenvalue of the form λ = −k(k + 1) for a fixed k ∈ N0 , k ≥ |N|, and a corresponding eigenfunction K ∈ X2 (Ω0 ), K = 0, such that Δ∗,N K = λK. In analogy to the previous considerations, we get for all N ∈ N0 , n ∈ N0 , n ≥ |N|, and j = −n, . . . , n the equation L2 (Ω),

(−n(n + 1) − λ)"K, N Yn,j #L2 (Ω) = 0, where −n(n + 1) − λ = 0 for n = k. Then, for all N ∈ Z, n ∈ N0 , n ≥ |N|, n = k, and j = −n, . . . , n, we can obviously deduce that "K, N Yn,j #L2 (Ω) = 0. From Theorem 22, we know again that {N Yn,j }n,j is an orthonormal basis. So, we obtain

K=

∞ n

"K, N Yn,j #L2 (Ω) N Yn,j

n=|N | j =−n

=

k j =−k

"K, N Yk,j #L2 (Ω) N Yk,j ∈ span



N Yk,j j =−k,...,k

. ) (

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

9

299

Additional Properties

As mentioned before, the spin-weighted spherical harmonics have a huge field of applications. For a better illustration, we summarize here further additional properties. Remark 22. From [32], we get the confirmation of Corollary 3 with Theorem 14. This is because n n J 2 Dj,N (α, β, γ ) = n(n + 1)Dj,N (α, β, γ ),

where α, γ ∈ [0, 2π ], β ∈ [0, π ] and  J =− 2

∂β2

+ cot β∂β +

1 sin2 β

# " 2 2 ∂α − 2 cos β∂α ∂γ + ∂γ .

With t = cos β and with Definition 27, we obtain J

2

n Dj,−N (α, β, γ )

 "" # # ∂ 2 + 2iNt∂ − N 2  α 2 n = − ∂t 1 − t ∂t + α Dj,−N (α, β, γ ) 1 − t2

for ξ = ξ(t, α) ∈ Ω and hence, with Theorem 14 J 2 N Yn,j (ξ ) = −Δ∗,N ξ

N Yn,j (ξ ).

Next, we collect additional recursion relations for the spin-weighted spherical harmonics. For further details, see [32], where they are listed for the Wigner D-function. Now, we recapitulate them directly for the spin-weighted spherical harmonics with help of Theorem 14. Lemma 34. For the spin-weighted spherical harmonics, the following five differential relations are given [32] for ξ = ξ(t, ϕ) ∈ Ω and for n ∈ N0 , N, j ∈ Z with n ≥ |N| and n ≥ |j |: " # t 2 − 1 ∂t

N Yn,j (ξ )

N = −(n + 1)αn,j N + nαn+1,j



1 − t 2 ∂t

N Yn,j (ξ )

=

N Yn−1,j (ξ ) +

jN n(n + 1)

N Yn,j (ξ )

N Yn+1,j (ξ ),

1 (n + j )(n − j + 1) e−iϕ N Yn,j −1 (ξ ) 2 1 (n − j )(n + j + 1) eiϕ N Yn,j +1 (ξ ), − 2

300

V. Michel and K. Seibert

1 (n − N )(n + N + 1) N+1 Yn,j (ξ ) 2 1 − (n + N )(n − N + 1) N−1 Yn,j (ξ ), 2  N + jt ∓iϕ =±√ N Yn,j ∓1 (ξ ), N Yn,j (ξ ) ± (n ± j )(n ∓ j + 1) e 2 1−t  j + Nt = ∓√ N Yn,j (ξ ) ± (n ∓ N )(n ± N + 1) N±1 Yn,j (ξ ), 2 1−t

=

where N Y−1,j (ξ ) := 0 and N Yn,j (ξ ) := 0 for n < |N| or n < |j |. The second relation is also mentioned in [6] and the third one in [6, 25]. Furthermore, the fifth differential relation is equivalent to the alternative definition of the spin-weighted spherical harmonics from Lemma 26. Lemma 35. For the spin-weighted spherical harmonics, we get the following recursion relations [32] for ξ = ξ(t, ϕ) ∈ Ω and for n ∈ N0 , N, j ∈ Z with n ≥ |N | and n ≥ |j |: 1 (n − N )(n + N + 1) N +1 Yn,j (ξ ) 2 1 + (n + N )(n − N + 1) N −1 Yn,j (ξ ), 2 N + jt 1 −√ (n + j )(n − j + 1) e−iϕ N Yn,j −1 (ξ ) N Yn,j (ξ ) = 2 1 − t2 1 + (n − j )(n + j + 1) eiϕ N Yn,j +1 (ξ ), 2 % (n − N )(n + N + 1) 1 ± t ±iϕ e N Yn,j ±1 (ξ ) = ∓ N +1 Yn,j (ξ ) (n ∓ j )(n ± j + 1) 2 √ N 1 − t2 −√ N Yn,j (ξ ) (n ∓ j )(n ± j + 1) % (n + N )(n − N + 1) 1 ∓ t ± N −1 Yn,j (ξ ), (n ∓ j )(n ± j + 1) 2 % (n + j )(n − j + 1) 1 ± t −iϕ − N ∓1 Yn,j (ξ ) = ± e N Yn,j −1 (ξ ) (n ± N )(n ∓ N + 1) 2 √ j 1 − t2 −√ N Yn,j (ξ ) (n ± N )(n ∓ N + 1) j + Nt √ 1 − t2

N Yn,j (ξ )

=

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

% ∓  1 − t 2 e±iϕ

%

(n − j )(n + j + 1) 1 ∓ t iϕ e (n ± N )(n ∓ N + 1) 2

301

N Yn,j +1 (ξ ),

n±j +1 N αn,j N Yn−1,j (ξ ) n∓j √ N (n ∓ j )(n ± j + 1) − N Yn,j (ξ ) n(n + 1) % n∓j ± αN N Yn+1,j (ξ ), n ± j + 1 n+1,j 2  n∓N +1 N 2 αn,j N Yn−1,j (ξ ) − 1 − t N ∓1 Yn,j (ξ ) = ± n±N √ j (n ± N )(n ∓ N + 1) − N Yn,j (ξ ) n(n + 1) 2 n±N ∓ αN N Yn+1,j (ξ ), n ∓ N + 1 n+1,j % (n + j + 1)(n ∓ N + 1) N iϕ αn,j N Yn−1,j (ξ ) −(1 ± t) e N ∓1 Yn,j +1 (ξ ) = (n − j )(n ± N ) √ (n−j )(n+j +1)(n ± N )(n ∓ N +1) ± N Yn,j (ξ ) n(n + 1) % (n − j )(n ± N ) αN + N Yn+1,j (ξ ), (n + j + 1)(n ∓ N + 1) n+1,j % (n − j + 1)(n ± N + 1) N −iϕ αn,j N Yn−1,j (ξ ) −(1 ± t) e N ±1 Yn,j −1 (ξ ) = (n + j )(n ∓ N ) √ (n+j )(n−j +1)(n ∓ N )(n ± N +1) ± N Yn,j (ξ ) n(n + 1) % (n + j )(n ∓ N ) αN + N Yn+1,j (ξ ), (n − j + 1)(n ± N + 1) n+1,j N Yn,j ±1 (ξ )

=∓

where N Y−1,j (ξ ) := 0 and N Yn,j (ξ ) := 0 for n < |N| or n < |j |. The first relation is also mentioned in [6, 25] and the second one in [6]. Lemma 36. Another recursion relation is given for ξ = ξ(t, ϕ) ∈ Ω, n ∈ N0 , and N, j ∈ Z with n ≥ |N | and n ≥ |j | by [34]

302

V. Michel and K. Seibert

2

N Yn,j (ξ )

=

n ∓ N + 1 j ∓ nt √ N ∓1 Yn,j (ξ ) n ± N n 1 − t2 %   1 2n + 1 (n ± N − 1) n2 − j 2 ± √ 2n − 1 n±N n 1 − t2

N ∓1 Yn−1,j (ξ ),

where N Y−1,j (ξ ) := 0 and N Yn,j (ξ ) := 0 for n < |N| or n < |j |.

10

Relation to the Scalar, Vector, and Tensor Spherical Harmonics

One reason for the applicability of the spin-weighted spherical harmonics is the possibility to represent scalar, vector, and tensor spherical harmonics in term of the spin-weighted spherical harmonics [30]. The corresponding relations are derived in this section. From the definition of the spin-weighted spherical harmonics, we already know that 0 Yn,j

= Yn,j .

For the vector spherical harmonics, we get the following theorem. Theorem 24. The tangential vector spherical harmonics can be combined such that # 1 " (2) (3) −yn,j (ξ ) ± iyn,j (ξ ) = ±1 Yn,j (ξ )τ± , ±√ 2 where  1  τ± := − √ εt ± iεϕ 2 with τ± · τ± = 1 and τ± · τ∓ = 0 for ξ ∈ Ω. Proof. For ξ = ξ(t, ϕ) ∈ Ω, we obtain # 1 " (2) (3) ±√ −yn,j (ξ ) ± iyn,j (ξ ) 2    1 1 −εϕ √ = ±√ ∂ϕ − εt 1 − t 2 ∂t ∓ iεϕ 1 − t 2 ∂t 2n(n + 1) 1 − t2  1 ±iεt √ ∂ϕ Yn,j (ξ ) 1 − t2

5 A Mathematical View on Spin-Weighted Spherical Harmonics and. . .

1 = ±√ n(n + 1)



1 − t 2∂

t

−√

i 1 − t2

303

      −1  t ϕ ±∂ϕ Yn,j (ξ ) √ ε ± iε . 2

The proposition follows from Lemma 3 such that
$$
\pm\frac{1}{\sqrt{2}}\left(-y_{n,j}^{(2)}(\xi) \pm i\,y_{n,j}^{(3)}(\xi)\right) = {}_{\pm 1}Y_{n,j}(\xi)\,\tau_{\pm}.
$$
The orthonormality of the vectors $\tau_{\pm}$ is obviously given by the orthonormality of the unit sphere vectors $\varepsilon^{t}$ and $\varepsilon^{\varphi}$. $\Box$

Now, the tensor spherical harmonics are represented by the spin-weighted spherical harmonics.

Corollary 18. In the same way, we get for the left normal/right tangential tensor spherical harmonics the relation
$$
\pm\frac{1}{\sqrt{2}}\left(-\mathbf{y}_{n,j}^{(1,2)}(\xi) \pm i\,\mathbf{y}_{n,j}^{(1,3)}(\xi)\right) = {}_{\pm 1}Y_{n,j}(\xi)\,(\xi \otimes \tau_{\pm})
$$
and for the left tangential/right normal tensor spherical harmonics the identity
$$
\pm\frac{1}{\sqrt{2}}\left(-\mathbf{y}_{n,j}^{(2,1)}(\xi) \pm i\,\mathbf{y}_{n,j}^{(3,1)}(\xi)\right) = {}_{\pm 1}Y_{n,j}(\xi)\,(\tau_{\pm} \otimes \xi),
$$
where $\xi \in \Omega$.

Theorem 25. In analogy to the considerations above, we get the following identity:
$$
-\frac{1}{\sqrt{2}}\left(-\mathbf{y}_{n,j}^{(2,3)}(\xi) \pm i\,\mathbf{y}_{n,j}^{(3,2)}(\xi)\right) = {}_{\pm 2}Y_{n,j}(\xi)\,(\tau_{\pm} \otimes \tau_{\pm}), \qquad \xi \in \Omega.
$$

Proof. For $\xi = \xi(t,\varphi) \in \Omega$, we get
$$
-\frac{1}{\sqrt{2}}\left(-\mathbf{y}_{n,j}^{(2,3)}(\xi) \pm i\,\mathbf{y}_{n,j}^{(3,2)}(\xi)\right)
= \frac{1}{\sqrt{2}}\left(\mathbf{y}_{n,j}^{(2,3)}(\xi) \mp i\,\mathbf{y}_{n,j}^{(3,2)}(\xi)\right)
$$
$$
= \frac{1}{\sqrt{2}}\,\frac{1}{\sqrt{n(n+1)(n(n+1)-2)}}\Big(\nabla_{\xi}^{*}\otimes\nabla_{\xi}^{*}\,Y_{n,j}(\xi) - L_{\xi}^{*}\otimes L_{\xi}^{*}\,Y_{n,j}(\xi) + 2\,\nabla_{\xi}^{*}Y_{n,j}(\xi)\otimes\xi \mp i\,\nabla_{\xi}^{*}\otimes L_{\xi}^{*}\,Y_{n,j}(\xi) \mp i\,L_{\xi}^{*}\otimes\nabla_{\xi}^{*}\,Y_{n,j}(\xi) \mp 2i\,L_{\xi}^{*}Y_{n,j}(\xi)\otimes\xi\Big)
$$


V. Michel and K. Seibert

Writing out the operators $\nabla_{\xi}^{*}$ and $L_{\xi}^{*}$ in the coordinates $t$ and $\varphi$ and collecting all terms with respect to the dyadic products of $\varepsilon^{t}$, $\varepsilon^{\varphi}$, and $\xi$, a lengthy but elementary computation yields
$$
= \frac{1}{\sqrt{n(n+1)(n(n+1)-2)}}\,\bigg[\left((1-t^{2})\,\partial_{t}^{2} - \frac{1}{1-t^{2}}\,\partial_{\varphi}^{2} \mp i\left(\partial_{t}\partial_{\varphi} + \partial_{\varphi}\partial_{t} + \frac{2t}{1-t^{2}}\,\partial_{\varphi}\right)\right)Y_{n,j}(\xi)\,\big(\tau_{\pm}\otimes\tau_{\pm}\big)
$$
$$
\qquad +\,\frac{1}{2}\left(\frac{1}{1-t^{2}}\,\partial_{\varphi} \pm i\,\partial_{t}\right)Y_{n,j}(\xi)\,\big(\varepsilon^{\varphi} \mp i\,\varepsilon^{t}\big)\otimes\Big(\partial_{\varphi}\varepsilon^{\varphi} - t\,\varepsilon^{t} + \sqrt{1-t^{2}}\,\xi\Big)\bigg].
$$

We observe that
$$
\partial_{\varphi}\varepsilon^{\varphi} - t\,\varepsilon^{t} + \sqrt{1-t^{2}}\,\xi
= \begin{pmatrix}-\cos(\varphi)\\-\sin(\varphi)\\0\end{pmatrix}
- t\begin{pmatrix}-t\cos(\varphi)\\-t\sin(\varphi)\\\sqrt{1-t^{2}}\end{pmatrix}
+ \sqrt{1-t^{2}}\begin{pmatrix}\sqrt{1-t^{2}}\cos(\varphi)\\\sqrt{1-t^{2}}\sin(\varphi)\\t\end{pmatrix}
= 0
$$
leads us to the identity
$$
-\frac{1}{\sqrt{2}}\left(-\mathbf{y}_{n,j}^{(2,3)}(\xi) \pm i\,\mathbf{y}_{n,j}^{(3,2)}(\xi)\right)
= \frac{1}{\sqrt{n(n+1)(n(n+1)-2)}}\left((1-t^{2})\,\partial_{t}^{2} - \frac{1}{1-t^{2}}\,\partial_{\varphi}^{2} \mp 2i\left(\partial_{t} + \frac{t}{1-t^{2}}\right)\partial_{\varphi}\right)Y_{n,j}(\xi)\,(\tau_{\pm}\otimes\tau_{\pm}).
$$
Hence, we obtain due to Lemma 3 the result
$$
-\frac{1}{\sqrt{2}}\left(-\mathbf{y}_{n,j}^{(2,3)}(\xi) \pm i\,\mathbf{y}_{n,j}^{(3,2)}(\xi)\right) = {}_{\pm 2}Y_{n,j}(\xi)\,(\tau_{\pm}\otimes\tau_{\pm}). \qquad \Box
$$
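The vector identity $\partial_{\varphi}\varepsilon^{\varphi} - t\,\varepsilon^{t} + \sqrt{1-t^{2}}\,\xi = 0$ used in the proof above can also be confirmed numerically; a minimal NumPy sketch (added here purely for illustration, not part of the original derivation):

```python
import numpy as np

t, phi = 0.37, 2.1   # arbitrary sample point with t = cos(theta)
xi        = np.array([np.sqrt(1 - t**2)*np.cos(phi), np.sqrt(1 - t**2)*np.sin(phi), t])
eps_t     = np.array([-t*np.cos(phi), -t*np.sin(phi), np.sqrt(1 - t**2)])
d_eps_phi = np.array([-np.cos(phi), -np.sin(phi), 0.0])  # d(eps^phi)/dphi

# the three contributions cancel componentwise
residual = d_eps_phi - t*eps_t + np.sqrt(1 - t**2)*xi
assert np.allclose(residual, 0.0)
```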

Literature

1. Bieberbach, L.: Theorie der Differentialgleichungen. Springer, Heidelberg (1979)
2. Bouzas, A.O.: Addition theorems for spin spherical harmonics: I. Preliminaries. J. Phys. A Math. Theor. 44(16), 165301 (2011)
3. Bouzas, A.O.: Addition theorems for spin spherical harmonics: II. Results. J. Phys. A Math. Theor. 44(16), 165302 (2011)
4. Burridge, R.: Spherically symmetric differential equations, the rotation group, and tensor spherical functions. Math. Proc. Camb. Philos. Soc. 65(1), 157–175 (1969)
5. Campbell, W.B.: Tensor and spinor spherical harmonics and the spin-s harmonics sYlm(θ, φ). J. Math. Phys. 12(8), 1763–1770 (1971)


6. Dahlen, F.A., Tromp, J.: Theoretical Global Seismology. Princeton University Press, Princeton (1998)
7. Dhurandhar, S.V., Tinto, M.: Astronomical observations with a network of detectors of gravitational waves – I. Mathematical framework and solution of the five detector problem. Mon. Not. R. Astron. Soc. 234(3), 663–676 (1988)
8. Dray, T.: The relationship between monopole harmonics and spin-weighted spherical harmonics. J. Math. Phys. 26(5), 1030–1033 (1985)
9. Eastwood, M., Tod, P.: Edth – a differential operator on the sphere. Math. Proc. Camb. Philos. Soc. 92(2), 317–330 (1982)
10. Edmonds, A.R.: Angular Momentum in Quantum Mechanics. Princeton University Press, Princeton (1957)
11. Fengler, M.J., Freeden, W.: A nonlinear Galerkin scheme involving vector and tensor spherical harmonics for solving the incompressible Navier-Stokes equation on the sphere. SIAM J. Sci. Comput. 27(3), 967–994 (2005)
12. Freeden, W., Gervens, T.: Vector spherical spline interpolation – basic theory and computational aspects. Math. Methods Appl. Sci. 16(3), 151–183 (1993)
13. Freeden, W., Gervens, T., Schreiner, M.: Tensor spherical harmonics and tensor spherical splines. Manuscr. Geodaet. 19, 70–100 (1994)
14. Freeden, W., Gervens, T., Schreiner, M.: Constructive Approximation on the Sphere with Applications to Geomathematics. Oxford University Press, Oxford (1998)
15. Freeden, W., Schreiner, M.: Spherical Functions of Mathematical Geosciences. Springer, Heidelberg (2009)
16. Goldberg, J.N., Macfarlane, A.J., Newman, E.T., Rohrlich, F., Sudarshan, E.C.G.: Spin-s spherical harmonics and ð. J. Math. Phys. 8(11), 2155–2161 (1967)
17. Hill, E.L.: The theory of vector spherical harmonics. Am. J. Phys. 22(4), 211–214 (1954)
18. Hu, W., White, M.: CMB anisotropies: total angular momentum method. Phys. Rev. D 56(2), 596–615 (1997)
19. Kostelec, P.J., Maslen, D.K., Healy, D.M., Rockmore, D.N.: Computational harmonic analysis for tensor fields on the two-sphere. J. Comput. Phys. 162(2), 514–535 (2000)
20. Lewis, A., Challinor, A., Turok, N.: Analysis of CMB polarization on an incomplete sky. Phys. Rev. D 65(2), 023505 (2002)
21. Michel, V.: Lectures on Constructive Approximation. Fourier, Spline, and Wavelet Methods on the Real Line, the Sphere, and the Ball. Birkhäuser Verlag, New York (2013)
22. Newman, E.T., Penrose, R.: Note on the Bondi-Metzner-Sachs group. J. Math. Phys. 7(5), 863–870 (1966)
23. Penrose, R., Rindler, W.: Spinors and Space-Time Volume 1: Two-Spinor Calculus and Relativistic Fields. Cambridge University Press, Cambridge (1984)
24. Penrose, R., Rindler, W.: Spinors and Space-Time Volume 2: Spinor and Twistor Methods in Space-Time Geometry. Cambridge University Press, Cambridge (1986)
25. Phinney, R.A., Burridge, R.: Representation of the elastic-gravitational excitation of a spherical Earth model by generalized spherical harmonics. Geophys. J. Int. 34(4), 451–487 (1973)
26. Schreiner, M.: Tensor Spherical Harmonics and Their Application in Satellite Gradiometry. Ph.D. thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern (1994)
27. Seibert, K.: Spin-Weighted Spherical Harmonics and Their Application for the Construction of Tensor Slepian Functions on the Spherical Cap. Ph.D. thesis, Geomathematics Group, Department of Mathematics, University of Siegen (2018)
28. Sharma, S.K., Khanal, U.: Perturbation of FRW spacetime in NP formalism. Int. J. Mod. Phys. D 23(2), 1450017 (2014)
29. Tamm, I.: Die verallgemeinerten Kugelfunktionen und die Wellenfunktionen eines Elektrons im Felde eines Magnetpoles. Z. Phys. 71(3–4), 141–150 (1931)
30. Thorne, K.S.: Multipole expansions of gravitational radiation. Rev. Mod. Phys. 52(2), 299–339 (1980)


31. Torres del Castillo, G.F.: Spin-weighted spherical harmonics and their applications. Rev. Mex. Fis. 53(2), 125–134 (2007)
32. Varshalovich, D.A., Moskalev, A., Khersonskii, V.: Quantum Theory of Angular Momentum. World Scientific Publishing Co Pte Ltd, Singapore (1988)
33. Voigt, A., Wloka, J.: Hilberträume und elliptische Differentialoperatoren. Bibliographisches Institut, Mannheim (1975)
34. Wiaux, Y., Jacques, L., Vielva, P., Vandergheynst, P.: Fast directional correlation on the sphere with steerable filters. Astrophys. J. 652(1), 820–832 (2006)
35. Wiaux, Y., Jacques, L., Vandergheynst, P.: Fast spin ±2 spherical harmonics transforms and application in cosmology. J. Comput. Phys. 226(2), 2359–2371 (2007)
36. Wu, T.T., Yang, C.N.: Some properties of monopole harmonics. Phys. Rev. D 16(4), 1018–1021 (1977)
37. Zaldarriaga, M., Seljak, U.: All-sky analysis of polarization in the microwave background. Phys. Rev. D 55(4), 1830–1840 (1997)

6 Reconstruction and Decomposition of Scalar and Vectorial Potential Fields on the Sphere

A Brief Overview

Christian Gerhards and Roger Telschow

Contents

1 Introduction 310
2 Basic Function Systems 312
3 Green's Functions for the Beltrami Operator 326
4 Vector Field Decompositions and Vectorial Function Systems 329
5 Examples and Conclusions 340
Literature 346

Abstract We give a brief overview on approximation methods on the sphere that can be used in a variety of geophysical setups. A particular focus is on methods related to potential field problems and spatial localization, such as spherical splines, multiscale methods, and Slepian functions. Furthermore, we introduce the common Helmholtz and Hardy-Hodge decompositions of spherical vector fields together with some related recent results. The methods are illustrated for two different examples: determination of the disturbing potential from deflections of the vertical and approximation of magnetic fields induced by oceanic tides.

This chapter is part of the series Handbuch der Geodäsie, volume “Mathematische Geodäsie/ Mathematical Geodesy”, edited by Willi Freeden, Kaiserslautern. C. Gerhards () TU Bergakademie Freiberg, Geomathematics and Geoinformatics Group, Freiberg, Germany E-Mail: [email protected] R. Telschow Computational Science Center, University of Vienna, Vienna, Austria E-Mail: [email protected] © Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020 W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_103


C. Gerhards and R. Telschow

Zusammenfassung Wir geben einen kurzen Überblick über Approximationsmethoden auf der Sphäre, welche Anwendung in verschiedenen geophysikalischen Fragestellungen finden. Im Speziellen geht es um Methoden mit Bezug zu Potentialfeldproblemen und Lokalisierung auf der Sphäre (z. B. Splines, Multiskalenmethoden und Slepian Funktionen). Des Weiteren führen wir zwei bekannte Vektorfeldzerlegungen (Helmholtz und Hardy-Hodge) ein und stellen die Verbindung zu einigen neueren Resultaten her. Abschliessend illustrieren wir unsere Ansätze an zwei geophysikalischen Beispielen: der Bestimmung des Störpotentials aus Lotabweichungen und der Approximation des Magnetfelds, welches durch Ozeangezeiten erzeugt wird.

Keywords Approximation on the sphere · Spatial localization on the sphere · Spherical multiscale expansions · Spherical function systems · Spherical vector field decompositions · Potential theory on the sphere

1 Introduction

This chapter gives an overview on tools that can be used for the approximation of solutions to problems that are in one way or another restricted to subdomains of the sphere. The sphere itself appears naturally in many geoscientifically relevant situations. Naturally in the sense that the Earth is sphere-like but, e.g., also in the sense that a function that is harmonic in the exterior of a smooth surface (a situation regularly occurring with potential field data in geodesy and geomagnetism) can be approximated by functions that are harmonic in the exterior of some sphere (in a geodetic context often called Bjerhammar sphere). In the paper at hand, however, we do not focus (in the first place) on tools for the entire sphere but rather on tools for problems on subdomains. In other words, we are interested in spatial localization on the sphere. The need for spatial localization might stem from the limited availability of data, from natural constraints such as oceanic or continental boundaries, or simply from the fact that one is particularly interested in the closer investigation of a particular region. The foundation for handling those problems is the availability of adequate function systems which can be used for approximation and discretization. Spherical harmonics are clearly the most widely used of such systems but are not particularly suited for spatially localized investigations. Nonetheless, they form the basis for the construction of many spatially localizing function systems and

6 Reconstruction and Decomposition of Scalar and Vectorial. . .


are introduced in Sect. 2. In the last decades, a wide variety of approximation methods that focus on spatial localization has been developed. Among them are spherical splines and radial basis functions (e.g., [17, 24, 50, 75, 82, 84]), Slepian functions (e.g., [71, 76, 77]), and spherical wavelets and multiscale methods (e.g., [10, 28, 31, 49, 55]). A brief introduction to these function systems is given in Sect. 2. We aim at providing an overview rather than giving a detailed analysis. For the latter we refer to the cited literature. In the case of wavelets and multiscale approaches, we do not further elaborate on the Fourier domain approach of [28, 31] but rather on an approach via regularizing scaling kernels in spatial domain (e.g., [12, 19, 29, 34, 36]). The latter has the advantage that for simple geometries like spherical caps, one can design explicit closed-form kernels that yield multiscale procedures that only require information from a fixed pre-defined region, while other multiscale approaches typically require global information on the entire sphere at the lowest scale. This involves some basic boundary value problems for the Beltrami operator on the sphere and Green's functions that are described in Sect. 3. In Sect. 4, we formulate similar concepts for a vectorial setup. In particular, we introduce the Hardy-Hodge decomposition of spherical vector fields, which on the one hand allows the definition of certain vectorial function systems and on the other hand the characterization and interpretation of gravity and magnetic potential field data with respect to the origin of their sources. Although vectorial problems can be regarded componentwise as scalar problems, it is often advisable to use a vectorial formulation since this preserves the characteristics of the underlying vectorial equations. Two numerical examples illustrating applications of the previous considerations are provided in Sect. 5, namely, the reconstruction of the disturbing potential from deflections of the vertical and, more recently, the extraction of ocean tidal magnetic fields from satellite magnetic field data. We also summarize the pros and cons of the different function systems and their areas of application in this section. Finally, it should be said that there exists a plethora of numerical schemes that can be used in combination with these function systems. They range from collocation and Galerkin-type methods (which both involve the solution of systems of linear equations), over cubature rules (e.g., [2, 11, 16, 21, 47]), to more flexible greedy-type methods (e.g., [14, 15, 57, 62, 63]). The latter two are used in the two examples presented in Sect. 5. The goal of this paper is to give a brief overview on some existing function systems and their possible use in addressing spatially localized problems on the sphere for efficient numerical use in geodesy. A particular focus is on kernels and multiscale methods motivated by potential field methods. More extensive mathematical treatments on the covered topics can be found, e.g., in the monographs [20, 25, 30, 39, 60]. Our notation will, up to a few exceptions, closely resemble that used in these monographs.


2 Basic Function Systems

We start by introducing some basic notations used throughout the course of this paper. By $S_R = \{x \in \mathbb{R}^3 : |x| = R\}$ we mean the sphere of radius $R > 0$ centered at the origin, while $\Gamma_R \subset S_R$ denotes a subset of the sphere (typically simply connected and with a smooth boundary). The unit sphere $S_1$ is simply abbreviated by $S$. The open ball of radius $R > 0$ is denoted by $B_R = \{x \in \mathbb{R}^3 : |x| < R\}$ and the unit ball again by $B$. The Euclidean gradient in $\mathbb{R}^3$ is given by $\nabla = (\partial_{x_1}, \partial_{x_2}, \partial_{x_3})^T$, its tangential component by $\nabla_S$, and the derivative in radial direction by $\partial_r$. In other words, the Euclidean gradient at a point $x \in \mathbb{R}^3 \setminus \{0\}$ can be expressed as
$$
\nabla_x = \xi\,\partial_r + \frac{1}{r}\nabla_{S,\xi}, \qquad \xi = \frac{x}{|x|} \in S,\ r = |x| > 0.
$$
In general, Greek letters $\xi, \eta, \dots$ denote unit vectors in $S$ while $x, y, \dots$ denote any vector in $\mathbb{R}^3$. By $\Delta = \nabla\cdot\nabla = \partial_{x_1}^2 + \partial_{x_2}^2 + \partial_{x_3}^2$ we mean the classical Laplacian in $\mathbb{R}^3$ while $\Delta_S$ denotes the Laplace-Beltrami operator on the sphere $S$, i.e.,
$$
\Delta_x = \partial_r^2 + \frac{2}{r}\,\partial_r + \frac{1}{r^2}\,\Delta_{S,\xi}, \qquad \xi = \frac{x}{|x|} \in S,\ r = |x| > 0.
$$

In spherical coordinates, $t = \cos(\theta) \in [-1,1]$ denotes the polar distance, $\theta \in [0,\pi]$ the co-latitude, and $\varphi \in [0,2\pi)$ the longitude of a vector in the unit sphere $S$. The operators $\nabla_S$ and $\Delta_S$ can then be expressed as
$$
\nabla_S = e^{\varphi}\,\frac{1}{\sqrt{1-t^2}}\,\partial_\varphi + e^{t}\,\sqrt{1-t^2}\,\partial_t,
$$
$$
\Delta_S = \partial_t\left((1-t^2)\,\partial_t\right) + \frac{1}{1-t^2}\,\partial_\varphi^2,
$$
in these coordinates, where $e^{r}$, $e^{t}$, and $e^{\varphi}$ denote the unit vectors
$$
e^{r} = \begin{pmatrix}\sqrt{1-t^2}\cos(\varphi)\\\sqrt{1-t^2}\sin(\varphi)\\t\end{pmatrix}, \qquad
e^{t} = \begin{pmatrix}-t\cos(\varphi)\\-t\sin(\varphi)\\\sqrt{1-t^2}\end{pmatrix}, \qquad
e^{\varphi} = \begin{pmatrix}-\sin(\varphi)\\\cos(\varphi)\\0\end{pmatrix}.
$$
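As a quick sanity check of these coordinate formulas (a small NumPy sketch added for illustration, not part of the original text), the three vectors form an orthonormal moving frame at every point of the sphere:

```python
import numpy as np

t, phi = -0.2, 0.7  # sample point: t = cos(theta), longitude phi
e_r   = np.array([np.sqrt(1 - t**2)*np.cos(phi), np.sqrt(1 - t**2)*np.sin(phi), t])
e_t   = np.array([-t*np.cos(phi), -t*np.sin(phi), np.sqrt(1 - t**2)])
e_phi = np.array([-np.sin(phi), np.cos(phi), 0.0])

frame = np.stack([e_r, e_t, e_phi])
# all pairwise dot products give the identity matrix: orthonormal frame
assert np.allclose(frame @ frame.T, np.eye(3))
```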

Finally, we denote scalar-valued functions mapping from $S_R$ or $\mathbb{R}^3$ into $\mathbb{R}$ by lower case letters $f, g, \dots$ and vector-valued functions mapping from $S_R$ or $\mathbb{R}^3$ into $\mathbb{R}^3$ by boldface letters $\mathbf{f}, \mathbf{g}, \dots$. Occasionally we need tensor-valued functions mapping into $\mathbb{R}^{3\times3}$, which we denote by upper case boldface letters $\mathbf{F}, \mathbf{G}, \dots$. Applying $\nabla_S$ to a scalar-valued function $f$ leads to a curl-free, tangential vector field $\nabla_S f$. In analogy, we obtain a divergence-free, tangential vector field by applying the operator $L_S$, which is at a point $\xi \in S$ defined via $L_{S,\xi} = \xi \times \nabla_{S,\xi}$ (where '$\times$' denotes the classical vector product) such that
$$
L_S = -e^{\varphi}\,\sqrt{1-t^2}\,\partial_t + e^{t}\,\frac{1}{\sqrt{1-t^2}}\,\partial_\varphi.
$$
The Cartesian unit vectors are denoted by $e_1 = (1,0,0)^T$, $e_2 = (0,1,0)^T$, and $e_3 = (0,0,1)^T$. For $\mathbf{f} = \sum_{i=1}^{3} f_i e_i$, the operator $\nabla_S\cdot$ is defined by $\nabla_S\cdot\mathbf{f} = \sum_{i=1}^{3} e_i\cdot\nabla_S f_i$ such that $-\nabla_S$ and $\nabla_S\cdot$ applied to tangential vector fields are adjoint to each other. Analogously, the operator $L_S\cdot$, which is defined by $L_S\cdot\mathbf{f} = \sum_{i=1}^{3} e_i\cdot L_S f_i$, is the adjoint operator to $-L_S$, i.e.,
$$
\int_S \big(L_S\cdot\mathbf{f}(\eta)\big)\,g(\eta)\,d\omega(\eta) = -\int_S \mathbf{f}(\eta)\cdot L_{S,\eta}\,g(\eta)\,d\omega(\eta) \tag{1}
$$
and we have $\nabla_S\cdot\nabla_S = \Delta_S = L_S\cdot L_S$. Furthermore, every vector field $\mathbf{f}$ can be decomposed into its radial and tangential components, i.e., $\mathbf{f} = p_{\mathrm{rad}}\mathbf{f} + p_{\mathrm{tan}}\mathbf{f}$ with projections
$$
p_{\mathrm{rad}}\mathbf{f} = (e^{r}\cdot\mathbf{f})\,e^{r}, \qquad p_{\mathrm{tan}}\mathbf{f} = (e^{t}\cdot\mathbf{f})\,e^{t} + (e^{\varphi}\cdot\mathbf{f})\,e^{\varphi}.
$$

This should suffice to be equipped with all necessary notations for the remainder of this chapter.

2.1 Spherical Harmonics and Legendre Polynomials

If $H_n$ is a homogeneous, harmonic polynomial of degree $n$ in $\mathbb{R}^3$ (i.e., $\Delta H_n(x) = 0$ and $H_n(\lambda x) = \lambda^n H_n(x)$, for $x \in \mathbb{R}^3$ and $\lambda \in \mathbb{R}$), then we call the restriction $Y_n = H_n|_S$ to the unit sphere a spherical harmonic of degree $n$. It can be shown that the dimension of the space of spherical harmonics of degree $n$ is $2n+1$. From now on, we use the following notation for a particular set of spherical harmonics $Y_{n,k}$ of degree $n$ and order $k$.

Definition 1. Let $n \in \mathbb{N}_0$ and $k = -n, \dots, n$. Then we define
$$
Y_{n,k}(\xi) = \sqrt{\frac{2n+1}{4\pi}\,\frac{(n-|k|)!}{(n+|k|)!}}\;P_{n,|k|}(t)\;
\begin{cases}
\sqrt{2}\cos(k\varphi), & k < 0,\\
1, & k = 0,\\
\sqrt{2}\sin(k\varphi), & k > 0,
\end{cases}
$$
for $t \in [-1,1]$, $\varphi \in [0,2\pi)$ being the spherical coordinates of $\xi \in S$, where
$$
P_{n,j}(t) = (1-t^2)^{\frac{j}{2}}\,\partial_t^{\,j} P_n(t), \qquad t \in [-1,1],\ j \in \mathbb{N}_0,
$$
denotes the associated Legendre function (ALF) of degree $n$ and order $j$. Furthermore, $P_n$ denotes the Legendre polynomial of degree $n$ that is uniquely determined by the properties


(i) $P_n : [-1,1] \to \mathbb{R}$ is a polynomial of degree $n$,
(ii) $\int_{-1}^{1} P_n(t)\,P_k(t)\,dt = 0$, for all $k \neq n$,
(iii) $P_n(1) = 1$.

The spherical harmonics as given in Definition 1 are only one of many possible choices that satisfy $\int_S Y_{n,k}(\eta)\,Y_{m,l}(\eta)\,d\omega(\eta) = 0$ if $n \neq m$ or $k \neq l$ and $\int_S |Y_{n,k}(\eta)|^2\,d\omega(\eta) = 1$ (an illustration of the spherical harmonics from Definition 1 is provided in Fig. 1). A very useful property of any choice of such orthonormal spherical harmonics is the addition theorem
$$
\sum_{k=-n}^{n} Y_{n,k}(\xi)\,Y_{n,k}(\eta) = \frac{2n+1}{4\pi}\,P_n(\xi\cdot\eta), \qquad \xi, \eta \in S, \tag{2}
$$

and the following closed-form representations of power series of Legendre polynomials
$$
\sum_{n=0}^{\infty} h^n P_n(t) = \frac{1}{\sqrt{1+h^2-2ht}}, \qquad t \in [-1,1],\ h \in (-1,1), \tag{3}
$$
$$
\sum_{n=0}^{\infty} (2n+1)\,h^n P_n(t) = \frac{1-h^2}{(1+h^2-2ht)^{\frac{3}{2}}}, \qquad t \in [-1,1],\ h \in (-1,1). \tag{4}
$$
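Both the addition theorem (2) and the closed forms (3), (4) are easy to check numerically. The following sketch (assuming SciPy is available; `real_sph_harm` is our own helper implementing Definition 1, not a library routine) verifies (2) for n = 3 at two sample points and (3), (4) by truncated summation:

```python
import math
from scipy.special import lpmv, eval_legendre

def real_sph_harm(n, k, t, phi):
    # Y_{n,k} from Definition 1; scipy's lpmv includes the Condon-Shortley
    # phase (-1)^{|k|}, which P_{n,j} = (1-t^2)^{j/2} d^j P_n/dt^j does not,
    # so we undo it
    norm = math.sqrt((2*n + 1)/(4*math.pi)
                     * math.factorial(n - abs(k))/math.factorial(n + abs(k)))
    p = (-1)**abs(k) * lpmv(abs(k), n, t)
    if k < 0:
        ang = math.sqrt(2)*math.cos(k*phi)
    elif k == 0:
        ang = 1.0
    else:
        ang = math.sqrt(2)*math.sin(k*phi)
    return norm * p * ang

# addition theorem (2) for n = 3 at two sample points
n = 3
t1, p1, t2, p2 = 0.3, 1.1, -0.5, 2.0
xi_dot_eta = math.sqrt(1 - t1**2)*math.sqrt(1 - t2**2)*math.cos(p1 - p2) + t1*t2
lhs = sum(real_sph_harm(n, k, t1, p1)*real_sph_harm(n, k, t2, p2)
          for k in range(-n, n + 1))
rhs = (2*n + 1)/(4*math.pi) * eval_legendre(n, xi_dot_eta)
assert abs(lhs - rhs) < 1e-12

# generating functions (3) and (4), truncated after 400 terms
h, t = 0.7, 0.2
s3 = sum(h**m * eval_legendre(m, t) for m in range(400))
s4 = sum((2*m + 1)*h**m * eval_legendre(m, t) for m in range(400))
assert abs(s3 - 1/math.sqrt(1 + h**2 - 2*h*t)) < 1e-12
assert abs(s4 - (1 - h**2)/(1 + h**2 - 2*h*t)**1.5) < 1e-12
```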

Finally, any square-integrable function $f : S \to \mathbb{R}$ can be expressed in the form
$$
f = \sum_{n=0}^{\infty}\sum_{k=-n}^{n} f^\wedge(n,k)\,Y_{n,k}, \tag{5}
$$
where the above equality is meant in the $L^2(S)$-sense (we call two functions $f, g$ equal in the $L^2(S)$-sense if $\int_S |f(\eta)-g(\eta)|^2\,d\omega(\eta) = 0$). The quantity $f^\wedge(n,k)$ is given by

Fig. 1 Illustration of the absolute values of the spherical harmonics $Y_{2,0}$ and $Y_{6,3}$


$$
f^\wedge(n,k) = \int_S f(\eta)\,Y_{n,k}(\eta)\,d\omega(\eta)
$$

and is called Fourier coefficient of $f$ for degree $n$ and order $k$. Whenever we talk about 'Fourier domain', we mean the representation of a function $f$ by its Fourier coefficients $f^\wedge(n,k)$, $n \in \mathbb{N}_0$, $k = -n, \dots, n$. In contrast, we talk about 'spatial domain' when we are interested in the function $f$ at points $\xi \in S$. Combining (2), (4), and (5) motivates the following representation of a square-integrable function $f$:
$$
f(\xi) = \lim_{h\to1-}\frac{1}{4\pi}\int_S \frac{1-h^2}{\left(1+h^2-2h(\xi\cdot\eta)\right)^{\frac{3}{2}}}\,f(\eta)\,d\omega(\eta), \qquad \xi \in S, \tag{6}
$$

where equality is again meant in the $L^2(S)$-sense. While one might say that (5) is a representation that is Fourier domain oriented, the representation (6) is more spatial domain oriented and, therefore, reflects the properties that we are interested in throughout this chapter. Finally, it can also be shown that all spherical harmonics are eigenfunctions of the Beltrami operator, more precisely,
$$
\Delta_S Y_{n,k} = -n(n+1)\,Y_{n,k}, \qquad n \in \mathbb{N}_0,\ k = -n, \dots, n.
$$

In other words, differentiation in spatial domain could be loosely interpreted as multiplication by powers of $n$ in Fourier domain. In order to obtain invertibility, we typically identify $k$-fold differentiation with multiplication by $\left(n+\frac{1}{2}\right)^k$ (see, e.g., [60]). Precise mathematical derivations of the statements in this section can be found, e.g., in the monographs [25, 30, 60, 65].
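The eigenfunction property can be checked numerically with the coordinate representation of $\Delta_S$ given earlier in this section. The sketch below (our own illustration; the unnormalized harmonic $Y_{3,2}$ is hard-coded, and the derivatives are approximated by central finite differences) confirms the eigenvalue $-n(n+1) = -12$ for $n = 3$:

```python
import numpy as np

def Y32(t, phi):
    # unnormalized Y_{3,2}: P_{3,2}(t) sin(2 phi) = 15 t (1 - t^2) sin(2 phi)
    return 15.0 * t * (1.0 - t**2) * np.sin(2.0 * phi)

def beltrami(f, t, phi, h=1e-5):
    # Delta_S f = d/dt[(1 - t^2) df/dt] + (1/(1 - t^2)) d^2 f / dphi^2,
    # approximated by central finite differences
    dfdt = lambda tt: (f(tt + h, phi) - f(tt - h, phi)) / (2.0*h)
    g = lambda tt: (1.0 - tt**2) * dfdt(tt)
    term_t = (g(t + h) - g(t - h)) / (2.0*h)
    term_p = (f(t, phi + h) - 2.0*f(t, phi) + f(t, phi - h)) / (h*h*(1.0 - t**2))
    return term_t + term_p

t0, p0 = 0.3, 1.2
# Y_{3,2} is an eigenfunction of Delta_S with eigenvalue -3*(3+1) = -12
assert abs(beltrami(Y32, t0, p0) + 12.0 * Y32(t0, p0)) < 1e-3
```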

2.2 Spherical Splines and Radial Basis Functions

A radial basis function $\Phi : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ is a function satisfying that $\Phi(x,y)$ only depends on the distance $|x-y|$ of two points $x, y \in \mathbb{R}^3$. Analogously, in the spherical context of the chapter at hand, a function $\Phi : S \times S \to \mathbb{R}$ is called a spherical radial basis function (or simply spherical basis function; cf. [24, 50]) if $\Phi(\xi,\eta)$ depends only on the spherical distance $\xi\cdot\eta$ of $\xi, \eta \in S$, given that
$$
\xi\cdot\eta = 1 - \frac{1}{2}|\xi-\eta|^2.
$$
From the addition theorem (2) it can be seen that such a $\Phi$ is representable by a function $\phi : [-1,1] \to \mathbb{R}$ in the way


$$
\Phi(\xi,\eta) = \phi(\xi\cdot\eta) = \sum_{n=0}^{\infty}\frac{2n+1}{4\pi}\,\phi^\wedge(n)\,P_n(\xi\cdot\eta) = \sum_{n=0}^{\infty}\sum_{k=-n}^{n}\phi^\wedge(n)\,Y_{n,k}(\xi)\,Y_{n,k}(\eta), \tag{7}
$$
where $\phi^\wedge(n) = 2\pi\int_{-1}^{1}\phi(t)\,P_n(t)\,dt$. The choice of the Legendre coefficients $\phi^\wedge(n)$ influences the spatial localization of $\phi$. We say that $\phi$ has 'good' spatial localization if $|\phi(t)|$ is small for $t < h$, and we say that $\phi$ is 'strictly' spatially localized if actually $\phi(t) = 0$ for all $t < h$, for some fixed $h \in (-1,1)$. However, the uncertainty principle (e.g., [27]) tells us that strict localization requires infinitely many Legendre coefficients to be non-zero. From a numerical point of view, the latter means that the choice of coefficients should be such that $\phi$ has a closed-form representation that can be evaluated easily, or that we sacrifice strict localization and just require good spatial localization, which would allow to use a finite sum in (7). Popular choices that allow closed-form representations are related to the expressions (3), (4):
$$
\text{Abel-Poisson kernel:}\quad \phi(t) = \frac{1}{4\pi}\,\frac{1-h^2}{(1+h^2-2ht)^{\frac{3}{2}}}, \qquad \phi^\wedge(n) = h^n,\quad h \in (0,1)\ \text{fixed},
$$
$$
\text{Singularity kernel:}\quad \phi(t) = \frac{1}{2\pi}\,\frac{1}{\sqrt{1+h^2-2ht}}, \qquad \phi^\wedge(n) = \frac{h^n}{n+\frac{1}{2}},\quad h \in (0,1)\ \text{fixed}.
$$
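The relation between the closed forms and their Legendre coefficients can be verified by Gauss-Legendre quadrature. The sketch below (our own illustration, assuming NumPy/SciPy) recovers $\phi^\wedge(n) = h^n$ from the closed-form Abel-Poisson kernel:

```python
import numpy as np
from scipy.special import eval_legendre

h = 0.6
nodes, weights = np.polynomial.legendre.leggauss(200)
# closed-form Abel-Poisson kernel evaluated at the quadrature nodes
phi_vals = (1 - h**2) / (4*np.pi * (1 + h**2 - 2*h*nodes)**1.5)

# phi^(n) = 2 pi * int_{-1}^{1} phi(t) P_n(t) dt should equal h^n
coeffs = [2*np.pi * np.sum(weights * phi_vals * eval_legendre(n, nodes))
          for n in range(6)]
assert all(abs(c - h**n) < 1e-10 for n, c in enumerate(coeffs))
```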

Yet, the examples above yield 'good' spatial localization but not strict localization. For strictly localized examples with closed-form representations the reader is referred to, e.g., [74, 83, 84]. Popular choices with only finitely many non-zero Legendre coefficients (where $M$ denotes the degree of truncation of the sum over $n$ in (7)) are
$$
\text{Cubic-Polynomial kernel:}\quad \phi^\wedge(n) = \begin{cases}\left(1-\dfrac{n}{M+1}\right)^2\left(1+\dfrac{2n}{M+1}\right), & n = 0,\dots,M,\\[4pt] 0, & n \geq M+1,\end{cases}
$$
$$
\text{Dirichlet kernel:}\quad \phi^\wedge(n) = \begin{cases}1, & n = 0,\dots,M,\\ 0, & n \geq M+1.\end{cases}
$$

For interpolation purposes, however, the Dirichlet kernel is not favourable from either a numerical or a theoretical point of view (e.g., [78] and earlier references therein). Localization of $\phi$ close to 1 implies localization of the corresponding radial basis function $\Phi(\cdot,\eta)$ close to $\eta \in S$. The top row of Fig. 2 illustrates $\Phi$ for the Abel-Poisson case and the bottom row for the Cubic-Polynomial case. The advantage of

Fig. 2 Illustration of the absolute values of the superposition of two spherical radial basis functions $\Phi(\cdot,\eta_1)$ and $\Phi(\cdot,\eta_2)$ centered at nodal points $\eta_1, \eta_2 \in S$ that are indicated by white dots. The black circles indicate an exemplary situation for the data distribution. Top row: Abel-Poisson kernels for parameters h = 0.5 and h = 0.8. Bottom row: Cubic-Polynomial (CP) kernels for truncation degrees M = 5 and M = 10

spherical radial basis functions over general function systems on the sphere is that their analysis and evaluation can be reduced to the one-dimensional interval $[-1,1]$. Furthermore, the choice of nodal points $\eta_1, \dots, \eta_N \in S$ that determine the centers of the functions $\Phi(\cdot,\eta_1), \dots, \Phi(\cdot,\eta_N)$ can be adapted to the given data situation. From the introduction of spherical radial basis functions above it is only a small step to the definition of spherical splines in a reproducing kernel Hilbert space setting.

Definition 2. Let $K^\wedge(n) \in \mathbb{R}$, $n \in \mathbb{N}_0$, be a non-zero and summable sequence (i.e., $\sum_{n=0}^{\infty}\frac{2n+1}{4\pi}|K^\wedge(n)|^{-2} < \infty$) and
$$
H_K(S) = \left\{f \in L^2(S) : \|f\|_{H_K} < \infty\right\},
$$
with the norm given by $\|f\|_{H_K}^2 = \sum_{n=0}^{\infty}\sum_{k=-n}^{n}|f^\wedge(n,k)|^2\,|K^\wedge(n)|^2$. Furthermore, set
$$
K(\eta,\xi) = \sum_{n=0}^{\infty}\sum_{k=-n}^{n}\frac{Y_{n,k}(\eta)\,Y_{n,k}(\xi)}{K^\wedge(n)^2}, \qquad \xi, \eta \in S.
$$

We call HK (S) a reproducing kernel Hilbert space and K the corresponding reproducing kernel. Then, any function S : S → R of the form


$$
S(\xi) = \sum_{i=1}^{N}\alpha_i\,K(\eta_i,\xi), \qquad \xi \in S,
$$

with coefficients $\alpha_i \in \mathbb{R}$, is called a spherical spline relative to the system of nodal points $\{\eta_1, \dots, \eta_N\} \subset S$.

The choice $K^\wedge(n) = \left(\phi^\wedge(n)\right)^{-\frac{1}{2}}$, with $\phi^\wedge(n)$ given, e.g., by the Abel-Poisson or the singularity kernel, satisfies the conditions of Definition 2, and the spherical radial basis function $\Phi$ corresponding to $\phi$ is exactly the reproducing kernel $K$. Another example for spherical splines coming from the fundamental solution of the Beltrami operator is discussed in Sect. 2.3 (however, there the Legendre coefficient $\phi^\wedge(0)$ of degree zero vanishes, which requires the inclusion of the constant function in the spherical spline setting).

Interpolation with spherical splines of the form given in Definition 2 yields interpolating functions with minimal $H_K$-norm. Therefore, remembering the interpretation of, e.g., two-fold differentiation as multiplication by $\left(n+\frac{1}{2}\right)^2$ in Fourier domain, we see that spline interpolation on the sphere in the space $H_K$ with $K^\wedge(n) = \left(n+\frac{1}{2}\right)^2$ relates closely to classical cubic spline interpolation on the line. An example reflecting the latter situation is the one based on the fundamental solution of the Beltrami operator mentioned in the previous paragraph. Interpolation with spherical splines based on the Abel-Poisson kernel, on the other hand, would lead to infinitely often differentiable interpolating functions (due to the asymptotic behaviour of the corresponding $K^\wedge(n) = h^{-n/2}$ with respect to $n$). This may be too smooth for some applications but, e.g., for potential field problems in geodesy, Abel-Poisson type splines are the natural choice (since they reflect the harmonic upward continuation of a signal on the sphere $S$ into its exterior; compare Sect. 5 for more details).

In the following, we make the procedure of spherical spline interpolation more precise: Given a set of pairwise distinct nodal points $\{\eta_1, \dots, \eta_N\} \subset S$ and a set of measurement values $\{f_1, \dots, f_N\} \subset \mathbb{R}$ at these nodal points, a function $g \in H_K$ is called an interpolating function if
$$
g(\eta_j) = f_j, \qquad j = 1, \dots, N.
$$

The set of all interpolating functions in $H_K$ is denoted by $I_N$. Among them is the uniquely determined spline $S$ of the form given in Definition 2 whose coefficients $\alpha_1, \dots, \alpha_N$ satisfy the system of linear equations
$$
\sum_{i=1}^{N}\alpha_i\,K(\eta_i,\eta_j) = f_j, \qquad j = 1, \dots, N. \tag{8}
$$
This spline has the desirable norm minimizing property


$$
\|S\|_{H_K} = \min_{g \in I_N}\|g\|_{H_K}. \tag{9}
$$

Apart from possible noise in the data, the quality of the approximation of the true underlying function $f$ (that determines the values $f_1, \dots, f_N$) by the spline $S$ depends significantly on the distribution of the nodal points (i.e., the points $\eta_1, \dots, \eta_N$ where the data has been collected). For some explicit error estimates with respect to the fill distance $\sup_{\xi\in S}\min_{j=1,\dots,N}|\xi-\eta_j|$ the reader is referred to, e.g., [24, 50, 84]. Additionally, from a numerical point of view, the condition of the system of linear equations (8) becomes the most important aspect. This is again influenced by the nodal points but also by the choice of the space $H_K$ (and, therefore, by the choice of the reproducing kernel $K$). For the case of Abel-Poisson splines, e.g., the parameter $h \in (-1,1)$ allows some control over the localization, which also affects the condition of the problem. A parameter $h$ close to one yields good spatial localization of $K$ and a good condition of the system of linear equations (8). However, a too narrow spatial localization of the reproducing kernel with respect to the fill distance of the nodal points also bears the risk of solely reproducing the data in the nodal points without obtaining a reasonable interpolation in between. This effect is illustrated, e.g., in [44, 60].

Remark 1. In order to address the condition of the interpolation problem described above, it is common practice to substitute pure interpolation by regularized approximation (smoothing). This means that instead of minimizing (9), one searches for a spline $S$ that satisfies
$$
\sum_{j=1}^{N}|f_j - S(\eta_j)|^2 + \lambda\|S\|_{H_K}^2 = \min_{g \in H_K}\sum_{j=1}^{N}|f_j - g(\eta_j)|^2 + \lambda\|g\|_{H_K}^2,
$$

for some regularization parameter $\lambda > 0$. A comparison of different methods for the choice of $\lambda$ for geodetically relevant problems can be found, e.g., in [6, 7, 45]. Additionally, the interpolation/approximation problem can be generalized by including linear functionals $F_j$, $j = 1, \dots, N$, such that the goal is to find splines that satisfy $F_j S = f_j$, $j = 1, \dots, N$, or minimize $\sum_{j=1}^{N}|f_j - F_j S|^2 + \lambda\|S\|_{H_K}^2$, respectively. For details, the reader is referred, e.g., to [17, 24, 60, 82].

To summarize, spherical splines of the form presented in Definition 2 offer the nice handling properties known from radial basis functions and they can be adapted to the data situation as well as to the structure of the underlying (geo)physical problem by choosing adequate reproducing kernels $K$. However, they require the solution of possibly very large systems of linear equations that can be severely ill-conditioned. A more thorough mathematical analysis of this reproducing kernel based approach can be found, e.g., in the early papers [17, 75, 82] and the already mentioned monographs [24, 50, 84]. An approach based on triangulations that is more directly oriented along the lines of spline interpolation in the plane is presented, e.g., in [1].
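In the spline setting of Definition 2, the regularized problem again has a solution of spline form, with coefficients now solving $(K(\eta_i,\eta_j) + \lambda\delta_{ij})\,\alpha = f$ (a standard reproducing-kernel result, stated here as our own addition). A minimal sketch with an Abel-Poisson kernel and synthetic noisy data:

```python
import numpy as np

rng = np.random.default_rng(1)
eta = rng.normal(size=(40, 3))
eta /= np.linalg.norm(eta, axis=1, keepdims=True)   # nodal points on S

def kernel(tau, h=0.8):
    # Abel-Poisson reproducing kernel
    return (1 - h**2) / (4*np.pi * (1 + h**2 - 2*h*tau)**1.5)

K = kernel(eta @ eta.T)
f = eta[:, 0]*eta[:, 2] + 0.01*rng.normal(size=40)  # noisy samples

lam = 1e-3
alpha  = np.linalg.solve(K + lam*np.eye(40), f)     # smoothing spline
alpha0 = np.linalg.solve(K, f)                      # pure interpolation

# smoothing trades exact data fit for a smaller H_K-norm of the solution
assert np.linalg.norm(K @ alpha - f) > 1e-12        # no longer interpolates
assert alpha @ K @ alpha < alpha0 @ K @ alpha0      # smaller squared norm
```

As $\lambda \to 0$ the regularized coefficients tend to the interpolating ones, so $\lambda$ balances data fit against the norm (and thus the smoothness) of the spline.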

320  C. Gerhards and R. Telschow

2.3 Fundamental Solutions and Multiscale Regularization

The fundamental solution $\frac{1}{4\pi|x-y|}$ of the Laplacian in $\mathbb{R}^3$ plays a central role in approximation methods for potential field problems in geoscience. Since many such problems can be reduced to considerations intrinsic on the sphere, the fundamental solution of the Laplace-Beltrami operator,
$$G(\xi,\eta)=\frac{1}{4\pi}\ln(1-\xi\cdot\eta)+\frac{1}{4\pi}(1-\ln(2)),\qquad \xi,\eta\in\mathbb{S},\ \xi\neq\eta,\qquad(10)$$
plays an equally important role (without further mention, $G$ always indicates the fundamental solution for the Laplace-Beltrami operator). In particular, solutions of the differential equations $\Delta_{\mathbb{S}}f=h$ or $\nabla_{\mathbb{S}}\bar{f}=h$ on $\mathbb{S}$ can be expressed in the form
$$f(\xi)=\int_{\mathbb{S}}G(\xi,\eta)h(\eta)\,d\omega(\eta)+\mathrm{const.},\qquad \xi\in\mathbb{S},\qquad(11)$$
$$\bar{f}(\xi)=-\int_{\mathbb{S}}\nabla_{\mathbb{S},\eta}G(\xi,\eta)\cdot h(\eta)\,d\omega(\eta)+\mathrm{const.},\qquad \xi\in\mathbb{S},\qquad(12)$$

respectively. Although these two differential equations are rather specific, many geoscientific problems can be modeled in such a way that they reduce to equations of this type (some examples are provided in Sect. 5). Given data for $h$ in a set of points $\eta_1,\dots,\eta_N\in\mathbb{S}$, an approximation of $f$ relies on adequate cubature rules for the evaluation of the integrals $\int_{\mathbb{S}}\dots\,d\omega$ rather than on solvers for systems of linear equations as in the case of spherical splines. For scattered data, such cubature rules are typically not directly available, but for particular point configurations there exist various methods (e.g., [11, 47]). Thus, the data either has to be collected at a numerically suitable set of points in the first place (if that is feasible) or it has to be interpolated onto a cubature grid afterwards. An issue that occurs with the numerical integration in (11) and (12) is instability induced by the singularity of the fundamental solution at points $\xi=\eta$. Regularization around these points can resolve this issue and at the same time enables a multiscale procedure that reflects the spatial localization properties that we are aiming at in this chapter (see, e.g., [12, 19, 29, 34]). This procedure does not only hold for the fundamental solution of the Beltrami operator but for convolutions with any spherical radial basis function possessing a weak singularity at $\xi=\eta$.

Definition 3. Let $\Phi:\mathbb{S}\times\mathbb{S}\setminus\{(\xi,\xi):\xi\in\mathbb{S}\}\to\mathbb{R}$ be a continuous spherical radial basis function that possesses a weak singularity at points $(\xi,\xi)\in\mathbb{S}\times\mathbb{S}$ (i.e., there exists a continuous function $\phi:[-1,1)\to\mathbb{R}$ with a weak singularity at $t=1$ such that $\Phi(\xi,\eta)=\phi(\xi\cdot\eta)$, for $\xi\neq\eta\in\mathbb{S}$). A set of continuous functions $R_\rho:[-1,1]\to\mathbb{R}$, with parameter $\rho>0$, that satisfies $R_\rho(1-\rho)=\phi(1-\rho)$ and $\lim_{\rho\to0+}\int_{1-\rho}^{1}|R_\rho(t)|\,dt=0$ is called a set of auxiliary regularization functions. Provided a sequence of radii $\rho_0>\rho_1>\rho_2>\dots>0$ with $\lim_{J\to\infty}\rho_J=0$, the corresponding scaling kernel at scale $J$ is defined by $\Phi_J:\mathbb{S}\times\mathbb{S}\to\mathbb{R}$,
$$\Phi_J(\xi,\eta)=\begin{cases}\phi(\xi\cdot\eta),&\xi\cdot\eta\leq1-\rho_J,\\ R_{\rho_J}(\xi\cdot\eta),&\xi\cdot\eta>1-\rho_J,\end{cases}$$
and the wavelet kernel at scale $J$ by $\Psi_J:\mathbb{S}\times\mathbb{S}\to\mathbb{R}$,
$$\Psi_J=\Phi_{J+1}-\Phi_J.$$

The radius $\rho_J$ is called scaling parameter and a typical choice is $\rho_J=2^{-J}$. Now, let the setup be given as in Definition 3, with
$$\Phi_J(\xi,\eta)=\begin{cases}G(\xi,\eta),&\xi\cdot\eta\leq1-\rho_J,\\ \frac{1}{4\pi}\ln(\rho_J)+\frac{1}{4\pi}(1-\ln(2)),&\xi\cdot\eta>1-\rho_J.\end{cases}\qquad(13)$$
Then the solution of $\Delta_{\mathbb{S}}f=h$ as indicated in (11) can be expressed by the multiscale expansion
$$f(\xi)=\int_{\mathbb{S}}\Phi_0(\xi,\eta)h(\eta)\,d\omega(\eta)+\sum_{J=0}^{\infty}\int_{\mathbb{S}}\Psi_J(\xi,\eta)h(\eta)\,d\omega(\eta)+\mathrm{const.},\qquad\xi\in\mathbb{S}.\qquad(14)$$
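For the particular choice (13), the regularization simply truncates $\phi(t)=\frac{1}{4\pi}\ln(1-t)+\frac{1}{4\pi}(1-\ln 2)$ at the constant value it attains for $\xi\cdot\eta=1-\rho_J$ (note that $\ln(\rho_J)=\ln(1-(1-\rho_J))$, so the kernel is continuous). A small numerical sketch, using the typical choice $\rho_J=2^{-J}$, illustrates the construction and the fact that the wavelet $\Psi_J=\Phi_{J+1}-\Phi_J$ vanishes outside the cap $\xi\cdot\eta>1-\rho_J$:

```python
import numpy as np

def phi(t):
    # G(xi, eta) as a function of t = xi.eta  (cf. Eq. (10))
    return np.log(1 - t)/(4*np.pi) + (1 - np.log(2))/(4*np.pi)

def scaling_kernel(t, J):
    # Phi_J from Eq. (13) with rho_J = 2**(-J): phi is cut off at the
    # constant value (1/4pi) ln(rho_J) + (1/4pi)(1 - ln 2) = phi(1 - rho_J)
    rho = 2.0**(-J)
    cutoff = np.log(rho)/(4*np.pi) + (1 - np.log(2))/(4*np.pi)
    t = np.asarray(t, dtype=float)
    return np.where(t <= 1 - rho, phi(np.minimum(t, 1 - rho)), cutoff)

def wavelet_kernel(t, J):
    # Psi_J = Phi_{J+1} - Phi_J vanishes for t <= 1 - rho_J
    return scaling_kernel(t, J + 1) - scaling_kernel(t, J)

t = np.linspace(-1.0, 1.0, 10001)
psi3 = wavelet_kernel(t, 3)   # supported only on the cap t > 1 - 2**-3
```

Evaluating the integrals in (14) against such kernels then only touches data inside shrinking caps as the scale $J$ grows, which is exactly the localization property discussed next.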

The crucial aspect of the representation above is that the wavelet kernel $\Psi_J(\xi,\cdot)$ vanishes outside the spherical cap $\Gamma_{\rho_J}(\xi)=\{\eta\in\mathbb{S}:\xi\cdot\eta>1-\rho_J\}$ with center $\xi$ and (polar) radius $\rho_J$. This means that, due to the monotonic decrease of $\rho_J$, the evaluation of the wavelet contributions at higher scales $J$ in (14) requires data in smaller and smaller caps $\Gamma_{\rho_J}(\xi)\subset\mathbb{S}$. In other words, the resolution of the quantity of interest $f$ can be adapted to the data situation, i.e., in regions with higher data density the multiscale expansion can be evaluated numerically up to a higher scale $J$. The scale-dependent spatial localization of the wavelet kernels is illustrated in Fig. 3. The effect on the reconstruction of $f$ itself is illustrated in more detail for the particular example of approximating the disturbing gravity potential in Sect. 5. In general, it can be said that the wavelet contribution $\int_{\mathbb{S}}\Psi_J(\xi,\eta)h(\eta)\,d\omega(\eta)$ at increasing scales $J$ reflects more and more localized features of $f$. Thus, if one is not interested in $f$ as a whole but only in the qualitative behaviour of local or regional variations, it suffices to solely investigate the wavelet contributions based on locally or regionally available data. If a quantitative study of $f$ is required, also the scaling contribution $\int_{\mathbb{S}}\Phi_0(\xi,\eta)h(\eta)\,d\omega(\eta)$ has to be evaluated. The latter typically requires


Fig. 3 Illustration of the absolute values of the scaling kernel $\Phi_0(\xi,\cdot)$ and the wavelet kernels $\Psi_J(\xi,\cdot)$ used in (14), for scales $J=0,2,4$, centered at the point $\xi\in\mathbb{S}$ that is indicated by the white dot

global data, however, only at a fairly coarse resolution. For some particular problems (e.g., the one for the disturbing gravity potential in Sect. 5), there are possibilities to circumvent the requirement of global data entirely by the construction of Green's functions based on the fundamental solution $G(\cdot,\cdot)$. This is discussed in Sect. 3. For the vectorial equation $\nabla_{\mathbb{S}}\bar f=h$ and the integral representation (12) of its solution, the procedure is similar and explained in more detail in Sect. 4.

Remark 2. Similar to the multiscale expansion (14) of solutions $f$ to the underlying differential equation $\Delta_{\mathbb{S}}f=h$ (or of solutions to any other spherical differential equation for which the fundamental solution is known), such a multiscale expansion can be formulated as well for the simple approximation of $f$ from discrete data on $f$ itself:
$$f(\xi)=\int_{\mathbb{S}}\Phi_0(\xi,\eta)f(\eta)\,d\omega(\eta)+\sum_{J=0}^{\infty}\int_{\mathbb{S}}\Psi_J(\xi,\eta)f(\eta)\,d\omega(\eta),\qquad\xi\in\mathbb{S}.$$
The scaling kernel $\Phi_J$ then reflects a so-called approximate identity, and a variety of possible choices for $\Phi_J$ exist, e.g., the Abel-Poisson kernel from Sect. 2.2, i.e., $\Phi_J(\xi,\eta)=\frac{1}{4\pi}(1-h_J^2)(1+h_J^2-2h_J(\xi\cdot\eta))^{-\frac32}$ (where the parameter $h_J$ plays a role similar to that of $\rho_J$ in Definition 3). However, the Abel-Poisson based wavelets $\Psi_J=\Phi_{J+1}-\Phi_J$ offer good spatial localization but not strict localization. A choice for $\Phi_J$ that offers strict spatial localization can be based, e.g., on the Haar kernel (cf. [26, 46]) or on the fundamental solution for the Beltrami operator


and the construction principle in Definition 3. For brevity, we desist from giving examples for the latter (more details can be found, e.g., in [36]). Concerning the particular application of approximating $f$ from discrete data of $f$ itself, the multiscale approach should be regarded mainly as a post-processing tool to study scale-dependent variations of $f$. If one is just interested in a precise approximation of $f$, then it is more advisable to use methods such as spline interpolation, since the evaluation of the integrals in the multiscale expansion often requires interpolation of the data onto a suitable cubature grid in the first place. The situation is slightly different when dealing with the approximation of solutions to differential equations, because then the scaling and wavelet kernels $\Phi_J$ and $\Psi_J$ as used, e.g., in (14) additionally reflect information from the underlying equations.

Remark 3. A relation of the current section to spherical splines is given when considering the convolution of the fundamental solution of the Beltrami operator with itself. This yields an 'almost' reproducing kernel which, by use of the series expansion
$$G(\xi,\eta)=\sum_{n=1}^{\infty}\sum_{k=-n}^{n}\frac{1}{-n(n+1)}Y_{n,k}(\xi)Y_{n,k}(\eta),\qquad\xi,\eta\in\mathbb{S},$$
has the representation
$$K(\eta,\xi)=\int_{\mathbb{S}}G(\eta,\zeta)G(\zeta,\xi)\,d\omega(\zeta)=\sum_{n=1}^{\infty}\sum_{k=-n}^{n}\frac{1}{n^2(n+1)^2}Y_{n,k}(\eta)Y_{n,k}(\xi),\qquad\xi,\eta\in\mathbb{S}.$$
It can be directly seen that this kernel corresponds to the space $\mathcal{H}_K$ with $K^\wedge(n)=n(n+1)$. For details on the derivation of a closed-form expression of $K$, the reader is referred to [30]. The only inconsistency with Definition 2 is the fact that $K^\wedge(0)=0$, which is why we call $K$ an 'almost' reproducing kernel. Thus, in order to obtain a proper spherical spline, we have to include an additive constant:
$$S(\xi)=\sum_{i=1}^{N}\alpha_iK(\eta_i,\xi)+\mathrm{const.},\qquad\xi\in\mathbb{S}.$$

Again, we refer the reader to [30] for more details. As we have mentioned before, multiplication in the Fourier domain with coefficients $K^\wedge(n)=n(n+1)$ reflects the application of the Beltrami operator $-\Delta_{\mathbb{S}}$, so that interpolation with splines in the space $\mathcal{H}_K$ of the form above yields the interpolating function of minimal 'bending energy'.
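The approximate-identity property of the Abel-Poisson kernel mentioned in Remark 2 can be verified numerically: by rotational symmetry, the total mass reduces to $\int_{\mathbb{S}}\Phi(\xi,\eta)\,d\omega(\eta)=2\pi\int_{-1}^{1}\frac{1}{4\pi}(1-h^2)(1+h^2-2ht)^{-\frac32}\,dt=1$ for every $h\in(0,1)$, while the mass concentrates near $\xi\cdot\eta=1$ as $h\to1$. A short Gauss-Legendre check (the node count and the sample values of $h$ are arbitrary choices for illustration):

```python
import numpy as np

def abel_poisson(t, h):
    # Abel-Poisson kernel as a function of t = xi.eta
    return (1 - h**2)/(4*np.pi*(1 + h**2 - 2*h*t)**1.5)

def total_mass(h, n=400):
    # surface integral over the sphere reduces, by rotational symmetry,
    # to 2*pi * int_{-1}^{1} kernel(t) dt  (Gauss-Legendre in t)
    t, w = np.polynomial.legendre.leggauss(n)
    return 2*np.pi*np.sum(w*abel_poisson(t, h))

masses = [total_mass(h) for h in (0.3, 0.6, 0.9)]   # all close to 1
```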


Summarizing, splines typically reveal a fixed spatial localization and their adaptation to the data is achieved by the choice of the nodal points $\eta_1,\dots,\eta_N$. With the multiscale setup described here, the variation of the scale $J$ offers a further means of adaptation. Beyond taking into account the data situation, it allows one to obtain information on the variation of a signal at different spatial scales. Combinations of spherical spline and multiscale methods are less common but do exist (e.g., [8, 64]). The procedure from Definition 3 and (14) is rather generic, but it requires knowledge of closed-form expressions for the fundamental solutions of the underlying differential equations. However, with the availability of the fundamental solution for the Beltrami equation, a wide range of geoscientific applications can be handled. For other approaches to spherical multiscale methods that have found their way into the geosciences, the reader is referred, e.g., to the early papers [9, 10, 28, 31, 49].

2.4 Slepian Functions

The methods of the two previous sections offer spatial localization properties based on the choice of the underlying reproducing kernel or scaling kernel, respectively, together with the choice of the scale $J$ in the case of multiscale methods. However, the involved function systems are spherical radial basis functions, meaning that spatial localization is restricted to regions of the shape of spherical caps $\Gamma_\rho(\xi)$. In some geophysical situations, the regions of interest may have a more complex geometry, e.g., determined by continental or oceanic boundaries. Slepian functions on the sphere aim at providing an orthonormal system of functions that are localized in such a general predefined region $\Omega\subset\mathbb{S}$. The function $g$ with the best localization is the one that maximizes the energy ratio
$$\lambda_\Omega(g)=\frac{\|g\|^2_{L^2(\Omega)}}{\|g\|^2_{L^2(\mathbb{S})}}=\frac{\int_\Omega|g(\eta)|^2\,d\omega(\eta)}{\int_{\mathbb{S}}|g(\eta)|^2\,d\omega(\eta)}.$$
Moreover, let us assume that $g$ is bandlimited with bandlimit $N$, i.e., it is of the form
$$g=\sum_{n=0}^{N}\sum_{k=-n}^{n}g^\wedge(n,k)Y_{n,k}$$
such that the corresponding norms are given by
$$\|g\|^2_{L^2(\mathbb{S})}=\sum_{n=0}^{N}\sum_{k=-n}^{n}\left(g^\wedge(n,k)\right)^2\qquad(15)$$


as well as
$$\|g\|^2_{L^2(\Omega)}=\sum_{n=0}^{N}\sum_{m=0}^{N}\sum_{k=-n}^{n}\sum_{j=-m}^{m}g^\wedge(n,k)\,g^\wedge(m,j)\int_\Omega Y_{n,k}(\eta)Y_{m,j}(\eta)\,d\omega(\eta).\qquad(16)$$
Let now $A=(A_{(n,k),(m,j)})\in\mathbb{R}^{(N+1)^2\times(N+1)^2}$ be the matrix consisting of the (properly sorted¹) inner products
$$A_{(n,k),(m,j)}=\int_\Omega Y_{n,k}(\eta)Y_{m,j}(\eta)\,d\omega(\eta),$$
and let $b=(b_{(n,k)})^T\in\mathbb{R}^{(N+1)^2}$, with $n=0,\dots,N$ and $k=-n,\dots,n$, be the vector of Fourier coefficients $b_{(n,k)}=g^\wedge(n,k)$; then the energy ratio of $g$ with respect to $\Omega$ is given by
$$\lambda_\Omega(g)=\frac{b^TAb}{b^Tb}.\qquad(17)$$

In order to obtain the function which is best localized in $\Omega$, we have to maximize this ratio. That is, we have a finite-dimensional optimization problem where we want to find the vector $b^*=(b^*_{(n,k)})^T$ that maximizes (17), yielding the corresponding optimally localized function $g^*$ given by
$$g^*=\sum_{n=0}^{N}\sum_{k=-n}^{n}b^*_{n,k}Y_{n,k}.\qquad(18)$$

Obviously, the energy ratio is invariant with respect to scaling, since $\lambda_\Omega(rg)=\lambda_\Omega(g)$ for arbitrary $r\neq0$. Thus, it suffices to take into account only normalized functions that fulfill $\|g\|^2_{L^2(\mathbb{S})}=1$. In this case, (17) becomes
$$\lambda_\Omega(g)=\|g\|^2_{L^2(\Omega)}=b^TAb.\qquad(19)$$

The underlying problem is now a constrained optimization problem with associated Lagrangian function $\mathcal{L}(b,\mu):=b^TAb+\mu(b^Tb-1)$. Due to the symmetry of $A$, the necessary condition for a maximum of the Lagrangian yields $Ab+\mu b=0$ as the equation to be solved in order to find the optimal coefficient vector $b$. Eventually, this means we want to find eigenvalues and eigenvectors of the matrix $A$ by solving
$$Ab=\lambda b.\qquad(20)$$

¹Typically, the order is $(0,0),(1,-1),(1,0),(1,1),(2,-2),\dots$, such that the pair $(n,k)$ is at position $n^2+n+k+1$ in a row or column.
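For a concrete region $\Omega$, the matrix $A$ can be assembled by numerical quadrature and handed to a symmetric eigensolver. The sketch below does this for the simple case of a polar cap, where Gauss-Legendre quadrature in $t=\cos\theta$ on the sub-interval covering the cap is even exact (products of harmonics up to degree $N$ are polynomials of degree at most $2N$ in $t$). Real orthonormal spherical harmonics are built from associated Legendre functions; the bandlimit $N=4$ and the cap radius are illustrative assumptions, not values from the text.

```python
import numpy as np
from math import factorial
from scipy.special import lpmv

def real_sph_harm(n, k, theta, phi):
    # orthonormal real spherical harmonics (theta: colatitude, phi: longitude)
    m = abs(k)
    c = np.sqrt((2*n + 1)/(4*np.pi)*factorial(n - m)/factorial(n + m))
    P = lpmv(m, n, np.cos(theta))
    if k > 0:
        return np.sqrt(2)*c*P*np.cos(k*phi)
    if k < 0:
        return np.sqrt(2)*c*P*np.sin(m*phi)
    return c*P

def slepian_cap(N, theta_cap):
    pairs = [(n, k) for n in range(N + 1) for k in range(-n, n + 1)]
    # Gauss-Legendre in t = cos(theta) on [cos(theta_cap), 1]
    tg, wg = np.polynomial.legendre.leggauss(2*N + 2)
    a = np.cos(theta_cap)
    t = 0.5*(1 - a)*tg + 0.5*(1 + a)
    wt = 0.5*(1 - a)*wg
    n_phi = 4*N + 5                    # uniform longitudes: exact for trig.
    phi = 2*np.pi*np.arange(n_phi)/n_phi
    TH, PH = np.meshgrid(np.arccos(t), phi, indexing="ij")
    Y = np.array([real_sph_harm(n, k, TH, PH) for (n, k) in pairs])
    W = wt[:, None]*(2*np.pi/n_phi)*np.ones_like(PH)
    A = np.einsum("iab,jab,ab->ij", Y, Y, W)  # A_{(n,k),(m,j)} = int_Omega Y Y
    lam, B = np.linalg.eigh(A)                # eigenvalues = energy ratios
    return lam[::-1], B[:, ::-1], pairs

lam, B, pairs = slepian_cap(N=4, theta_cap=np.pi/4)
```

The eigenvalues lie in $[0,1]$ and their sum equals the 'Shannon number' $(N+1)^2|\Omega|/(4\pi)$, which for a cap of radius $\theta_c$ is $(N+1)^2(1-\cos\theta_c)/2$ — a convenient consistency check on an implementation.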

Fig. 4 Let $\Omega$ denote the Earth's continental areas: Illustration of the absolute values of the Slepian function $g_1$ with the best localization in $\Omega$ (left) and the Slepian function $g_{256}$ with the worst localization in $\Omega$, i.e., the best localization in $\mathbb{S}\setminus\Omega$, (right) for bandlimit $N=15$ (the code used for the generation of the Slepian functions has been kindly supplied by [68]; a thorough description of the theoretical setup is given in [69])

In other words, the optimization problem leads to an eigenvalue problem where the eigenvalues $\lambda$ are the possible energy ratios and the corresponding eigenvectors are the Fourier coefficients of bandlimited functions $g$ attaining this particular energy ratio $\lambda=\lambda_\Omega(g)$. Obviously, we have $0\leq\lambda_\Omega(f)\leq1$ for every square-integrable function $f\neq0$. In typical scenarios, it turns out that the eigenvalues are clustered close to one and close to zero. Those eigenvalues $\lambda_1,\dots,\lambda_L$ closer to one determine the subset $\{g_i\}_{i=1,\dots,L}$ of Slepian functions that should be used for approximation in $\Omega$, while the remaining set $\{g_i\}_{i=L+1,\dots,(N+1)^2}$ is suitable for approximation in the complementary region $\mathbb{S}\setminus\Omega$ (however, it should be noted that the bandlimitedness of the functions makes strict localization in $\Omega$ or $\mathbb{S}\setminus\Omega$ impossible). An exemplary illustration of Slepian functions is provided in Fig. 4. For details on the construction and properties of spherical Slepian functions in a scalar and vectorial setup, the reader is referred to [70, 71, 76, 77]. More recently, Slepian functions have also been considered in the context of inverse problems, i.e., including operators other than a simple restriction operator in the construction procedure (e.g., [61, 71]). For the particular case of upward/downward continuation of potential field data, alternative methods with optimized kernels for localization in predefined regions have been treated in [35, 40, 41]. Generally, one can say that Slepian functions are a useful tool when considering fixed, predefined subregions of the sphere. There, they play a somewhat similar role as spherical harmonics do for the entire sphere. However, as opposed to splines or multiscale methods, they lack the ability to adapt to non-uniform data distributions or to allow a scale-dependent study of spatial variations of a signal. The latter, however, has been addressed recently by proposing a combination of multiscale methods with Slepian functions (cf. [61]).

3 Green's Functions for the Beltrami Operator

In Sect. 2.3 we have mentioned that the scaling kernel $\Phi_0$ that is used in the multiscale representation (14) of the solution $f$ of $\Delta_{\mathbb{S}}f=h$ (and is illustrated in the top left corner of Fig. 3) reveals no particular spatial localization. Thus, if we are interested in a multiscale procedure solely based on local data in a region $\Omega\subset\mathbb{S}$, we either have to accept the error made by neglecting data outside $\Omega$ at scale zero, or we have to modify the scaling kernel in such a way that it is adapted to the region of interest. The latter can be achieved by substituting the underlying fundamental solution by Green's functions for Dirichlet or Neumann boundary values. Such Green's functions have the form $G_D=G-D$ in the case of Dirichlet boundary values and $G_N=G-N$ in the case of Neumann boundary values, where the auxiliary functions $D$, $N$ satisfy
$$\Delta_{\mathbb{S}}D(\xi,\eta)=-\frac{1}{4\pi},\qquad \Delta_{\mathbb{S}}N(\xi,\eta)=\frac{1}{|\Omega|}-\frac{1}{4\pi},\qquad \xi,\eta\in\Omega,$$
$$D(\xi,\eta)=G(\xi,\eta),\qquad \partial_{\nu(\eta)}N(\xi,\eta)=\partial_{\nu(\eta)}G(\xi,\eta),\qquad \xi\in\Omega,\ \eta\in\partial\Omega.$$

Here, $\partial_\nu$ denotes the normal derivative at a point on the boundary $\partial\Omega$. The crucial aspect is the construction of the functions $D$ and $N$ for the particular region $\Omega$. Some general construction principles are indicated in [43, 52, 53]. Explicit derivations for spherical caps $\Gamma_\rho(\bar\xi)$ can be found in [20, 39], namely,
$$D(\xi,\eta)=\frac{1}{4\pi}\ln\!\left(r^*(1-\xi^*\cdot\eta)\right)+\frac{1}{4\pi}(1-\ln(2)),\qquad(21)$$
$$N(\xi,\eta)=-\frac{1}{4\pi}\ln\!\left(r^*(1-\xi^*\cdot\eta)\right)-\frac{1-\rho}{2\pi\rho}\ln(1+\bar\xi\cdot\eta)+\frac{1}{4\pi}(1-\ln(2)),\qquad(22)$$
with $\xi^*=(r^*)^{-1}\left(\xi-(\rho-1)^{-1}(r^*-1)\bar\xi\right)$ and $r^*=\left(-\rho^{-1}(\rho-2)^{-1}\left(1+\xi\cdot\bar\xi(\rho-1)\right)+(\rho-1)^2\right)^{\frac12}$. A solution of $\Delta_{\mathbb{S}}f=h$ can then be expressed as
$$f(\xi)=\int_\Omega G_D(\xi,\eta)h(\eta)\,d\omega(\eta)+\int_{\partial\Omega}\partial_{\nu(\eta)}G_D(\xi,\eta)\,f(\eta)\,d\sigma(\eta),\qquad\xi\in\Omega,$$
or, alternatively, as
$$f(\xi)=\int_\Omega G_N(\xi,\eta)h(\eta)\,d\omega(\eta)-\int_{\partial\Omega}G_N(\xi,\eta)\,\partial_\nu f(\eta)\,d\sigma(\eta)+\mathrm{const.},\qquad\xi\in\Omega.$$
The price that we pay for the restriction to $\Omega$ is that we additionally need information on the normal derivative $\partial_\nu f$ or the function $f$ itself at the boundary $\partial\Omega$. The latter is, however, not necessary if we are dealing with first-order problems such as $\nabla_{\mathbb{S}}\bar f=h$. In that case, the solution reads
$$\bar f(\xi)=-\int_\Omega\nabla_{\mathbb{S},\eta}G_N(\xi,\eta)\cdot h(\eta)\,d\omega(\eta)+\mathrm{const.},\qquad\xi\in\Omega,$$


which will be important in our example on deflections of the vertical in Sect. 5. Based on this, we can represent $\bar f$ by a multiscale representation similar to (14) for the global case:
$$\bar f(\xi)=-\int_\Omega\nabla_{\mathbb{S},\eta}\Phi^N_0(\xi,\eta)\cdot h(\eta)\,d\omega(\eta)-\sum_{J=0}^{\infty}\int_\Omega\nabla_{\mathbb{S},\eta}\Psi^N_J(\xi,\eta)\cdot h(\eta)\,d\omega(\eta)+\mathrm{const.},\qquad(23)$$
where $\xi\in\Omega$ and where, with $\Phi_J$ as in (13), the underlying scaling kernel is given by $\Phi^N_J=\Phi_J-N$. The auxiliary function $N$ does not require any regularization since it is twice continuously differentiable in $\Omega$. For the difference wavelets $\Psi^N_J=\Phi^N_{J+1}-\Phi^N_J$, this means that they are actually identical to the wavelets $\Psi_J$ in (14). This does not hold true for $\Phi^N_J$ itself. In particular, one can see that $\Phi^N_J$ is not a spherical radial basis function. An illustration for spherical caps is given in Fig. 5. In conclusion, for certain problems, the procedure above can combine the multiscale concept from Sect. 2.3 with spatial localization in a predefined fixed subregion $\Omega\subset\mathbb{S}$. As opposed to the Slepian approach, the involved kernels are actually strictly localized within $\Omega$. Their explicit construction is, however, only possible for simple regions such as spherical caps or spherical triangles. For the Dirichlet case (but not for the Neumann case), known representations of Green's

Fig. 5 Illustration of the absolute values of the scaling kernels $\Phi^N_J(\xi,\cdot)$, for scales $J=0,4$ (top row), and the wavelet kernels $\Psi^N_J(\xi,\cdot)$, for scales $J=2,4$ (bottom row), used in (23), centered at the point $\xi\in\mathbb{S}$ that is indicated by the white dot. The black dot and the black curve indicate the center $\bar\xi\in\mathbb{S}$ and the boundary of the underlying spherical cap $\Gamma_\rho(\bar\xi)$ of polar radius $\rho=0.6$


functions for the Laplace operator in the Euclidean plane can be directly transferred to the sphere by the stereographic projection. In fact, many of the concepts from classical potential theory in Euclidean space and its applications have analogous formulations intrinsic on the sphere (e.g., the method of fundamental solutions or boundary integral methods). For details, the reader is referred, e.g., to [20, 32, 39] and the references therein. Here, we focus on exemplary illustrations of multiscale procedures.

4 Vector Field Decompositions and Vectorial Function Systems

In this section, we want to formulate the procedures of the previous Sects. 2 and 3 in a vectorial framework. The differential equation $\nabla_{\mathbb{S}}\bar f=h$ that we have treated before is already of vectorial type, but the solution $\bar f$ that we want to approximate is scalar. Now, we are interested in approximating vectorial functions $f$ on the sphere. For that purpose, we introduce two decompositions, the well-known spherical Helmholtz decomposition and the Hardy-Hodge decomposition. Both lead to multiscale representations similar to Sect. 2.3, but with scaling and wavelet kernels that are tensorial. Afterwards, we use these two decompositions to define vector spherical harmonics and vector spherical splines. We do not include a section on vector Slepian functions, since the basic concepts are very similar to the scalar case in Sect. 2.4. For more details, we refer the reader to the more extensive articles [69–71, 77].

4.1 Helmholtz Decomposition and Vectorial Multiscale Regularization

Decompositions of vector fields $f$ are useful tools in order to extract specific contributions of interest as well as to reduce vector fields to scalar quantities to which already known methods can be applied. A particularly well-known decomposition is the spherical Helmholtz decomposition.

Definition 4. Let $f$ be a square-integrable vector field on the unit sphere $\mathbb{S}$. Then there exist square-integrable scalar-valued functions $f_1$, $f_2$, and $f_3$ such that
$$f=f^{(1)}+f^{(2)}+f^{(3)}=o^{(1)}f_1+o^{(2)}f_2+o^{(3)}f_3.$$
The operators $o^{(i)}$, $i=1,2,3$, denote abbreviations either for the multiplication by the unit normal vector, for the surface gradient, or for the surface curl gradient, respectively. That is, on a scalar-valued function $f$ they act via
$$o^{(1)}f(\xi)=\xi f(\xi),\qquad o^{(2)}f(\xi)=\nabla_{\mathbb{S}}f(\xi),\qquad o^{(3)}f(\xi)=L_{\mathbb{S}}f(\xi),\qquad\xi\in\mathbb{S}.$$


The vectorial functions $f^{(1)}$, $f^{(2)}$, $f^{(3)}$, and the scalar function $f_1$ are determined uniquely, while $f_2$ and $f_3$ are only determined up to an additive constant. The operators $O^{(i)}$, $i=1,2,3$, are the adjoint operators to $o^{(i)}$, i.e.,
$$\int_{\mathbb{S}}\left(O^{(i)}_\eta f(\eta)\right)g(\eta)\,d\omega(\eta)=\int_{\mathbb{S}}f(\eta)\cdot o^{(i)}_\eta g(\eta)\,d\omega(\eta),$$
and they are given by
$$O^{(1)}_\xi f=\xi\cdot f(\xi),\qquad O^{(2)}_\xi f=-\nabla_{\mathbb{S},\xi}\cdot p_{\mathrm{tan}}f(\xi),\qquad O^{(3)}_\xi f=-L_{\mathbb{S},\xi}\cdot p_{\mathrm{tan}}f(\xi).$$

The function $f^{(1)}$ denotes the radial contribution of $f$ and is easily computable, while the tangential curl-free contribution $f^{(2)}$ and the tangential divergence-free contribution $f^{(3)}$ require some more effort (an illustration of the decomposition can be found in Fig. 6). However, one is often interested in separating the curl-free from the divergence-free contributions of a given spherical vector field $f$, e.g., to describe depth-integrated ocean currents, which are approximately surface divergence-free. In order to obtain the scalar-valued function $f_3$, we have to solve the Beltrami equation $\Delta_{\mathbb{S}}f_3=L_{\mathbb{S}}\cdot f$. Then, Eq. (11) in connection with (1) yields
$$f_3(\xi)=\int_{\mathbb{S}}G(\xi,\eta)\,L_{\mathbb{S}}\cdot f(\eta)\,d\omega(\eta)+\mathrm{const.}=-\int_{\mathbb{S}}L_{\mathbb{S},\eta}G(\xi,\eta)\cdot f(\eta)\,d\omega(\eta)+\mathrm{const.},\qquad\xi\in\mathbb{S}.$$

Fig. 6 Illustration of the Helmholtz decomposition of a vector field $f$ (top left) into $f^{(1)}$, $f^{(2)}$, and $f^{(3)}$. Colours indicate the absolute value of the radial parts, the arrows indicate the orientation of the tangential parts


Since we are mainly interested in the vectorial quantity $f^{(3)}$, we can neglect the additive constant from above and typically set it to zero for the computation of $f_3$. The involved convolution kernel can be regularized and we obtain a multiscale representation similar to (14) and (23):
$$f_3(\xi)=-\int_{\mathbb{S}}L_{\mathbb{S},\eta}\Phi_0(\xi,\eta)\cdot f(\eta)\,d\omega(\eta)-\sum_{J=0}^{\infty}\int_{\mathbb{S}}L_{\mathbb{S},\eta}\Psi_J(\xi,\eta)\cdot f(\eta)\,d\omega(\eta),\qquad\xi\in\mathbb{S},\qquad(24)$$

with $\Phi_0$ and $\Psi_J$ as given in (14) and Definition 3. The computation of the vectorial quantity $f^{(3)}$, which we are actually interested in, requires another application of $o^{(3)}$. Hence, for $\xi\in\mathbb{S}$, we obtain
$$f^{(3)}(\xi)=-\int_{\mathbb{S}}\left(L_{\mathbb{S},\xi}\otimes L_{\mathbb{S},\eta}\Phi_0(\xi,\eta)\right)f(\eta)\,d\omega(\eta)-\sum_{J=0}^{\infty}\int_{\mathbb{S}}\left(L_{\mathbb{S},\xi}\otimes L_{\mathbb{S},\eta}\Psi_J(\xi,\eta)\right)f(\eta)\,d\omega(\eta),\qquad(25)$$
i.e., the scaling and wavelet kernels are now of tensorial nature. By $\otimes$ we denote the tensor product of two vectors $x,y\in\mathbb{R}^3$ (or of two vectorial operators),
$$x\otimes y=\begin{pmatrix}x_1y_1&x_1y_2&x_1y_3\\x_2y_1&x_2y_2&x_2y_3\\x_3y_1&x_3y_2&x_3y_3\end{pmatrix}.$$
An illustration of the kernels at different scales can be found in Fig. 7. For clarity, and in order to point out the divergence-free character, we only indicate the vectorial kernels $L_{\mathbb{S},\eta}\Phi_0$ and $L_{\mathbb{S},\eta}\Psi_J$ from (24) and not the actual tensorial kernels from (25). Substituting the surface curl gradient $L_{\mathbb{S}}$ by the surface gradient $\nabla_{\mathbb{S}}$ in (25), we get an analogous multiscale representation for the curl-free contribution $f^{(2)}$ of $f$. The radial contribution $f^{(1)}$ can easily be computed via the auxiliary scalar function $f_1(\xi)=\xi\cdot f(\xi)$, $\xi\in\mathbb{S}$, and a multiscale representation for scalar functions as indicated in Remark 2. If data is only available in a subregion $\Omega\subset\mathbb{S}$, the multiscale representation (25) suffers from the same problem as described for the scalar case in Sect. 2.3, namely that the scaling kernel $\Phi_0$ is non-zero almost everywhere. Green's functions as described in Sect. 3 address this issue. However, as opposed to the global case, the contributions $f^{(2)}$ and $f^{(3)}$ are no longer determined uniquely in $\Omega$ unless one has knowledge of $f^{(2)}$ or $f^{(3)}$ at the boundary $\partial\Omega$. If one of the latter is available, multiscale expansions based on Green's functions can be obtained similarly as in Sect. 3, together with some further considerations on the Helmholtz decomposition. The reader who is interested in more details is referred to [20, 33].
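In coordinates, the tensor product is precisely the outer product, and applying the resulting matrix to a vector contracts over the second factor, $(x\otimes y)v=x\,(y\cdot v)$ — which is also what makes the tensorial kernels in (25) cheap to apply. A two-line illustration with arbitrary sample vectors:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 2.0])
v = np.array([2.0, 0.0, 1.0])

T = np.outer(x, y)        # (x (tensor) y)_{ij} = x_i y_j
applied = T @ v           # equals x * (y . v): rank-one action
```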


Fig. 7 Illustration of the scaling kernel $L_{\mathbb{S}}\Phi_0(\xi,\cdot)$ and the wavelet kernels $L_{\mathbb{S}}\Psi_J(\xi,\cdot)$ used in (25), for scales $J=0,2,4$, centered at the point $\xi\in\mathbb{S}$ that is indicated by the white dot. Colours indicate the absolute value and arrows the orientation

All in all, we see that the same regularization approach that yielded multiscale expansions for the scalar case in Sect. 2.3 also works in a vectorial setup. The main difference is that the scaling and wavelet kernels are now required to be tensors. The Helmholtz decomposition itself is actually not necessary for the multiscale representation but it supplies a useful interpretation in terms of radial, tangential curl-free, and tangential divergence-free contributions.

4.2 Hardy-Hodge Decomposition and its Geophysical Interpretation

Another useful vector field decomposition is the so-called Hardy-Hodge decomposition. It does not decompose into radial and tangential contributions but into contributions that are gradients of functions that are harmonic in the interior and exterior of the sphere under consideration, respectively. This makes it particularly useful for the investigation of potential field problems as occurring in gravity and magnetic field studies. In terms of the set of vector spherical harmonics in Definition 7 from the next section, this decomposition has been used fairly frequently, e.g., in geomagnetism to distinguish the contributions of the geomagnetic field with respect to sources in the interior and the exterior of the satellite’s orbit where the measurements have taken place (cf. [3, 59, 66]). More recently, it has turned out that the decomposition can also be used to characterize the contributions of magnetizations that can be determined uniquely from magnetic field measurements


(cf. [42, 58] for a vector spherical harmonic approach, [38, 81] for the reconstruction of inducing dipole directions, and [5, 37, 56] for more general considerations) and to separate core from crustal contributions (under certain assumptions; cf. [4]).

Definition 5. Let $f$ be a square-integrable vector field over the unit sphere $\mathbb{S}$. Then there exist square-integrable scalar-valued functions $\tilde f_1$, $\tilde f_2$, and $\tilde f_3$ such that
$$f=\tilde f^{(1)}+\tilde f^{(2)}+\tilde f^{(3)}=\tilde o^{(1)}\tilde f_1+\tilde o^{(2)}\tilde f_2+\tilde o^{(3)}\tilde f_3.$$
This decomposition is called Hardy-Hodge decomposition. The operators $\tilde o^{(i)}$, $i=1,2,3$, are defined in the following way by their relation to the operators $o^{(i)}$, $i=1,2,3$, from the Helmholtz decomposition in Definition 4:
$$\tilde o^{(1)}=o^{(1)}\left(D+\frac12\right)-o^{(2)},\qquad \tilde o^{(2)}=o^{(1)}\left(D-\frac12\right)+o^{(2)},\qquad \tilde o^{(3)}=o^{(3)},$$
where $D=\left(-\Delta_{\mathbb{S}}+\frac14\right)^{\frac12}$. The vectorial functions $\tilde f^{(1)}$, $\tilde f^{(2)}$, $\tilde f^{(3)}$ are determined uniquely, while $\tilde f_1-\tilde f_2$ and $\tilde f_3$ are only determined up to an additive constant.

Analogously to above, the operators $\tilde O^{(i)}$, $i=1,2,3$, are the adjoint operators to $\tilde o^{(i)}$, given by
$$\tilde O^{(1)}=\left(D+\frac12\right)O^{(1)}-O^{(2)},\qquad \tilde O^{(2)}=\left(D-\frac12\right)O^{(1)}+O^{(2)},\qquad \tilde O^{(3)}=O^{(3)}.$$

A crucial property of the Hardy-Hodge decomposition is that any sufficiently smooth function $\tilde f^{(1)}=\tilde o^{(1)}\tilde f_1$ on $\mathbb{S}_R$ can be expressed as the gradient of a function $h$ that is harmonic in the exterior $\mathbb{R}^3\setminus B_R$, i.e.,
$$\tilde f^{(1)}(\xi)=\nabla h(\xi),\quad\xi\in\mathbb{S}_R,\qquad \Delta h(x)=0,\quad x\in\mathbb{R}^3\setminus B_R.$$
Analogously, any function of the form $\tilde f^{(2)}=\tilde o^{(2)}\tilde f_2$ on $\mathbb{S}_R$ can be expressed as the gradient of a harmonic function $h$ in the interior $B_R$. For the gravity field $\nabla V$ at a satellite's orbit, this means that it can be expressed solely by functions of the form $\tilde f^{(1)}$. The latter motivates the use of the corresponding vector spherical harmonics from Definition 7 rather than those from Definition 6 for gravity field studies. The same properties of the Hardy-Hodge decomposition also allow to characterize spherical magnetizations $m$ on a sphere $\mathbb{S}_r$ (e.g., the Earth's surface) from knowledge of the corresponding magnetic field $b$, given by
$$b(x)=\frac{1}{4\pi}\nabla_x\int_{\mathbb{S}_r}m(y)\cdot\frac{x-y}{|x-y|^3}\,d\omega(y),\qquad x\in\mathbb{S}_R,$$


on a sphere $\mathbb{S}_R$, $R>r$ (e.g., a satellite's orbit). If $\tilde m^{(i)}$, $i=1,2,3$, are the Hardy-Hodge contributions of $m$, then $\tilde m^{(1)}$ and $\tilde m^{(3)}$ produce no magnetic field on $\mathbb{S}_R$. That is, knowing the magnetic field on an exterior orbit, one can only reconstruct the $\tilde m^{(2)}$ contribution of the actual magnetization $m$. The situation improves slightly if one knows in advance that $m$ is strictly spatially localized. Then one can reconstruct the contributions $\tilde m^{(1)}$ and $\tilde m^{(2)}$. For details on the decomposition of magnetizations, the reader is referred to [5, 37, 42]. An illustration of the Hardy-Hodge decomposition is provided in Fig. 8. Analogous to (25), multiscale representations can be derived for the Hardy-Hodge decomposition. The representations for $f^{(3)}$ and $\tilde f^{(3)}$ are identical since the operators $o^{(3)}$ and $\tilde o^{(3)}$ are identical. For the other two contributions, the computations are somewhat more tedious, but one can fall back on an explicit connection between the Helmholtz and the Hardy-Hodge decomposition, namely,
$$\tilde f_1=\frac12 D^{-1}\left(f_1+\frac12 f_2\right)-\frac12 f_2,\qquad(26)$$
$$\tilde f_2=\frac12 D^{-1}\left(f_1+\frac12 f_2\right)+\frac12 f_2.\qquad(27)$$
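Since $D=(-\Delta_{\mathbb{S}}+\frac14)^{1/2}$ acts on spherical harmonics of degree $n$ as multiplication by $(n(n+1)+\frac14)^{1/2}=n+\frac12$, the relations (26), (27) become simple algebraic maps between the degree-$n$ Fourier coefficients of $f_1,f_2$ and $\tilde f_1,\tilde f_2$. The sketch below encodes them per degree, together with the inverse map $f_1=(D+\frac12)\tilde f_1+(D-\frac12)\tilde f_2$, $f_2=\tilde f_2-\tilde f_1$ (which follows from matching Definitions 4 and 5), used as a round-trip check:

```python
import numpy as np

def helmholtz_to_hardy_hodge(n, f1, f2):
    # per-degree version of (26), (27): D acts on degree n as
    # multiplication by sqrt(n(n+1) + 1/4) = n + 1/2
    d = n + 0.5
    common = 0.5*(f1 + 0.5*f2)/d
    return common - 0.5*f2, common + 0.5*f2        # f~1, f~2

def hardy_hodge_to_helmholtz(n, g1, g2):
    # inverse: f1 = (D + 1/2) f~1 + (D - 1/2) f~2,  f2 = f~2 - f~1
    d = n + 0.5
    return (d + 0.5)*g1 + (d - 0.5)*g2, g2 - g1

# round trip on random degree-n coefficients
rng = np.random.default_rng(1)
for n in range(1, 6):
    f1, f2 = rng.normal(size=2)
    g1, g2 = helmholtz_to_hardy_hodge(n, f1, f2)
    assert np.allclose(hardy_hodge_to_helmholtz(n, g1, g2), (f1, f2))
```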

For $\tilde f^{(1)}$, this leads to an expansion
$$\tilde f^{(1)}(\xi)=\int_{\mathbb{S}}\tilde\Phi_0(\xi,\eta)f(\eta)\,d\omega(\eta)+\sum_{J=0}^{\infty}\int_{\mathbb{S}}\tilde\Psi_J(\xi,\eta)f(\eta)\,d\omega(\eta),\qquad\xi\in\mathbb{S},$$

Fig. 8 Illustration of the Hardy-Hodge decomposition of a vector field $f$ (top left). Colours indicate the absolute value of the radial parts, the arrows indicate the orientation of the tangential parts


where the tensorial scaling and wavelet kernels $\tilde\Phi_0$ and $\tilde\Psi_J$ are derived from (26) and from multiscale representations for $f^{(1)}$ and $f^{(2)}$ of the Helmholtz decomposition similar to (25). For the somewhat lengthy computations, we refer the interested reader to [34]. In the end, it comes down to the regularization concept from Definition 3 and knowledge of the fundamental solutions for the operators $\Delta_{\mathbb{S}}$ and $D$. To summarize, the Hardy-Hodge decomposition is a particularly useful tool for geophysical potential field problems, due to the interpretation of its first two contributions as gradients of harmonic functions. From a computational point of view, the Helmholtz decomposition from Sect. 4.1 might be easier to handle, but the relations (26), (27) allow a transfer of the representations from one decomposition to the other.

4.3 Vector Spherical Harmonics

Given the operators o^(i), i = 1, 2, 3, from the previous sections, we can now define a set of orthonormal vector spherical harmonics based on the Helmholtz decomposition.

Definition 6. Let {Y_{n,k}}_{n∈N_0, k=−n,...,n} be an orthonormal set of scalar spherical harmonics, e.g., the one from Definition 1. Then we define vector spherical harmonics of type i = 1, 2, 3 by

  y^(i)_{n,k} = (μ^(i)_n)^{−1/2} o^(i) Y_{n,k},    n ∈ N_0, k = −n, . . . , n.

The normalization coefficients are given by μ^(1)_n = 1 as well as μ^(2)_n = μ^(3)_n = n(n+1), and we set y^(i)_{0,0} := 0, i = 2, 3. Tensorial Legendre functions of type (i, j), with i, j = 1, 2, 3, are defined by

  P^(i,j)_n(ξ, η) = (μ^(i)_n)^{−1/2} (μ^(j)_n)^{−1/2} o^(i)_ξ ⊗ o^(j)_η P_n(ξ · η),    ξ, η ∈ S, n ∈ N_0.

These vector spherical harmonics are orthonormal in the sense that ∫_S |y^(i)_{n,k}(η)|² dω(η) = 1 and ∫_S y^(i)_{n,k}(η) · y^(j)_{m,l}(η) dω(η) = 0 if i ≠ j, n ≠ m, or k ≠ l. Furthermore, they satisfy the vectorial addition theorem

  Σ_{k=−n}^{n} y^(i)_{n,k}(ξ) ⊗ y^(j)_{n,k}(η) = P^(i,j)_n(ξ, η),    ξ, η ∈ S, i, j = 1, 2, 3,

and they inherit the completeness property from the scalar case. The latter means that any square-integrable vector-valued function f can be expressed in the form

  f = Σ_{i=1}^{3} Σ_{n=0}^{∞} Σ_{k=−n}^{n} (f^(i))^∧(n, k) y^(i)_{n,k},    (28)


C. Gerhards and R. Telschow

where equality is meant in the L²(S, R³)-sense. The Fourier coefficients are given by

  (f^(i))^∧(n, k) = ∫_S f(η) · y^(i)_{n,k}(η) dω(η).
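Such Fourier coefficients can be approximated by product quadrature on the sphere. As a small sanity check (a sketch only), the coefficient of a vector spherical harmonic with respect to itself must be 1; we use y^(3)_{1,0} written out explicitly, assuming the fully normalized Y_{1,0}(ξ) = √(3/(4π)) ξ_3, so that y^(3)_{1,0} = −√(3/(8π)) sin(θ) e_φ:

```python
import numpy as np

# Gauss-Legendre x trapezoidal product quadrature on the sphere
nt, nphi = 32, 64
t, w = np.polynomial.legendre.leggauss(nt)     # t = cos(theta)
phi = np.linspace(0.0, 2*np.pi, nphi, endpoint=False)
T, PHI = np.meshgrid(t, phi, indexing="ij")
W = np.repeat(w[:, None], nphi, axis=1) * (2*np.pi/nphi)
sin_theta = np.sqrt(1.0 - T**2)

# tangential unit vector e_phi and the type-3 harmonic
# y(3)_{1,0} = mu^{-1/2} xi x grad_S Y_{1,0} = -sqrt(3/(8 pi)) sin(theta) e_phi
e_phi = np.stack([-np.sin(PHI), np.cos(PHI), np.zeros_like(PHI)])
y310 = -np.sqrt(3.0/(8.0*np.pi)) * sin_theta * e_phi

# Fourier coefficient (f^(3))^(1, 0) of f = y(3)_{1,0}: should equal 1
coeff = np.sum(W * np.sum(y310 * y310, axis=0))
print(round(coeff, 6))   # -> 1.0
```

The Gauss-Legendre rule in cos(θ) combined with a uniform rule in φ integrates such low-degree integrands exactly, up to rounding.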

Just as in the scalar case, we talk about the 'Fourier domain' when we are dealing with a representation of f by its Fourier coefficients, and we talk about the 'spatial domain' when we are interested in the evaluation of f at a point ξ ∈ S. An expansion of f in the form (28) reflects exactly the Helmholtz decomposition of f by the identification of f^(i) with Σ_{n=0}^{∞} Σ_{k=−n}^{n} (f^(i))^∧(n, k) y^(i)_{n,k}. However, opposed to (scalar) spherical harmonics, vector spherical harmonics lose the property of being eigenfunctions of the Beltrami operator Δ_S (when applied componentwise). An alternative set of vector spherical harmonics that preserves the eigenfunction property is based on the Hardy-Hodge decomposition.

Definition 7. Let {Y_{n,k}}_{n∈N_0, k=−n,...,n} be an orthonormal set of (scalar) spherical harmonics, e.g., the one from Definition 1. Then we define vector spherical harmonics of type i = 1, 2, 3 by

  ỹ^(i)_{n,k} = (μ̃^(i)_n)^{−1/2} õ^(i) Y_{n,k},    n ∈ N_0, k = −n, . . . , n.

The normalization coefficients are given by μ̃^(1)_n = (n+1)(2n+1) and μ̃^(2)_n = n(2n+1) as well as μ̃^(3)_n = n(n+1), and we set ỹ^(i)_{0,0} := 0, i = 2, 3. Tensorial Legendre functions of type (i, j), with i, j = 1, 2, 3, are defined by

  P̃^(i,j)_n(ξ, η) = (μ̃^(i)_n)^{−1/2} (μ̃^(j)_n)^{−1/2} õ^(i)_ξ ⊗ õ^(j)_η P_n(ξ · η),    ξ, η ∈ S, n ∈ N_0.

The Hardy-Hodge based vector spherical harmonics (i.e., the vector spherical harmonics from Definition 7, indicated by a tilde) can alternatively be expressed as linear combinations of the Helmholtz based vector spherical harmonics from Definition 6:

  ỹ^(1)_{n,k} = √((n+1)/(2n+1)) y^(1)_{n,k} − √(n/(2n+1)) y^(2)_{n,k},
  ỹ^(2)_{n,k} = √(n/(2n+1)) y^(1)_{n,k} + √((n+1)/(2n+1)) y^(2)_{n,k},
  ỹ^(3)_{n,k} = y^(3)_{n,k}.

Thus, the previously mentioned properties for the Helmholtz based vector spherical harmonics canonically transfer to the Hardy-Hodge based vector spherical harmonics. This holds, in particular, for the expansion

  f = Σ_{i=1}^{3} Σ_{n=0}^{∞} Σ_{k=−n}^{n} (f̃^(i))^∧(n, k) ỹ^(i)_{n,k}    (29)

of square-integrable vector-valued functions. Additionally, they represent eigenfunctions of the Beltrami operator, namely,

  Δ_S ỹ^(1)_{n,k} = −(n+1)(n+2) ỹ^(1)_{n,k},
  Δ_S ỹ^(2)_{n,k} = −n(n−1) ỹ^(2)_{n,k},
  Δ_S ỹ^(3)_{n,k} = −n(n+1) ỹ^(3)_{n,k},

and, therefore, allow a simple characterization of the differentiability of a vector field based on the asymptotic decay of its Fourier coefficients (similar to the scalar case). However, a more important aspect for geophysical applications is that the Hardy-Hodge based vector spherical harmonics relate to inner and outer spherical harmonics. The latter denote the harmonic continuation of (scalar) spherical harmonics on the sphere S_R into the interior B_R and the exterior R³ \ B_R, respectively. More precisely, (scalar) inner and outer harmonics are defined via

  H^int_{n,k}(R; x) = (1/R) (|x|/R)^n Y_{n,k}(x/|x|),    x ∈ B_R,
  H^ext_{n,k}(R; x) = (1/R) (R/|x|)^{n+1} Y_{n,k}(x/|x|),    x ∈ R³ \ B_R,

and we obtain

  ∇H^int_{n,k}(R; x) = (1/R²) (|x|/R)^{n−1} √(μ̃^(2)_n) ỹ^(2)_{n,k}(x/|x|),    x ∈ B_R,
  −∇H^ext_{n,k}(R; x) = (1/R²) (R/|x|)^{n+2} √(μ̃^(1)_n) ỹ^(1)_{n,k}(x/|x|),    x ∈ R³ \ B_R.

Summarizing, due to this connection, Hardy-Hodge based vector spherical harmonics reflect typical properties of potential field problems. In particular, the set {ỹ^(1)_{n,k}}_{n∈N, k=−n,...,n} is suited for the expansion of satellite gravity field data and the internal contributions of satellite magnetic field data (see, e.g., [18, 59, 66, 71]) while, according to Sect. 4.2, the set {ỹ^(2)_{n,k}}_{n∈N, k=−n,...,n} is suited for the expansion of the uniquely determined contributions of spherical magnetizations (e.g., [42, 58, 81]). Nonetheless, vector spherical harmonics still reflect global structures and, therefore, only serve as auxiliary functions for our goal of constructing spatially localized function systems. For more details on the mathematical properties of vector spherical harmonics, the reader is referred, e.g., to the monographs [24, 30, 60].
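The characteristic behaviour of outer harmonics can be probed numerically: H^ext_{n,k}(R; ·) decays like |x|^{−(n+1)} along fixed directions and is harmonic in the exterior. A quick sketch for n = 1, k = 0, assuming the fully normalized Y_{1,0}(ξ) = √(3/(4π)) ξ_3:

```python
import numpy as np

R = 1.0
def Y10(xi):                       # fully normalized scalar harmonic Y_{1,0}
    return np.sqrt(3.0/(4.0*np.pi)) * xi[2]

def H_ext(x, n=1):                 # outer harmonic H^ext_{1,0}(R; x)
    r = np.linalg.norm(x)
    return (1.0/R) * (R/r)**(n+1) * Y10(x/r)

x = np.array([0.6, -0.8, 1.1])     # a point in the exterior, |x| > R

# decay: doubling |x| along a fixed direction scales H^ext by 2^{-(n+1)}
ratio = H_ext(2*x) / H_ext(x)
print(round(ratio, 6))             # -> 0.25

# harmonicity: second-order finite-difference Laplacian vanishes
h, lap = 1e-4, 0.0
for d in np.eye(3):
    lap += (H_ext(x + h*d) - 2*H_ext(x) + H_ext(x - h*d)) / h**2
print(abs(lap) < 1e-5)             # -> True
```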

4.4 Vector Spherical Splines and Radial Basis Functions

Vectorial spline interpolation can be introduced as simple componentwise interpolation by scalar splines. However, if the available data is not given in Cartesian components but, e.g., reflects the radial or the curl- and divergence-free tangential components of a vector field, then another, more general approach should be used. We particularly formulate the approach based on the Helmholtz decomposition, i.e., involving the operators o^(i) and their adjoints O^(i). But it works analogously for any other orthogonal vector field decomposition, e.g., the Hardy-Hodge decomposition and the corresponding operators õ^(i) and their adjoints Õ^(i). In particular, if we substitute o^(i) by multiplication with the i-th Cartesian unit vector e_i and O^(i) by the scalar product e_i · with the i-th Cartesian unit vector, then we essentially end up with componentwise interpolation. This allows one to adapt the approach to the problem under consideration (see, e.g., [23, 51]).

Definition 8. Let K^∧_i(n) ∈ R, n ∈ N_0, i = 1, 2, 3, be non-zero and summable sequences (i.e., Σ_{n=0}^{∞} ((2n+1)/(4π)) |K^∧_i(n)|^{−2} < ∞, i = 1, 2, 3) and

  H_K(S) = {f ∈ L²(S, R³) : ‖f‖_{H_K} < ∞},

with the norm given by

  ‖f‖²_{H_K} = Σ_{i=1}^{3} Σ_{n=0}^{∞} Σ_{k=−n}^{n} |(f^(i))^∧(n, k)|² |K^∧_i(n)|².

Furthermore, we set

  K(η, ξ) = Σ_{i=1}^{3} Σ_{n=0}^{∞} Σ_{k=−n}^{n} (1/K^∧_i(n)²) y^(i)_{n,k}(η) ⊗ y^(i)_{n,k}(ξ),    η, ξ ∈ S.

Then, any function s^(i): S → R³ of the form

  s^(i)(ξ) = Σ_{j=1}^{N} α^(i)_j O^(i)_{η_j} K(η_j, ξ),    ξ ∈ S, i = 1, 2, 3,

with coefficients α^(i)_j ∈ R, is called a spherical spline of type i relative to the system of nodal points {η_1, . . . , η_N} ⊂ S. The application of the operator O^(i) to the tensor field K is meant columnwise, i.e.,

  O^(i)_η K(η, ξ) = Σ_{l=1}^{3} ( O^(i)_η Σ_{k=1}^{3} K_{k,l}(η, ξ) e_k ) e_l,

with K = (K_{k,l})_{k,l=1,2,3}.

Interpolation based on vector spherical splines from above yields minimum norm interpolating functions similar to the scalar case in Sect. 2.2. Namely, given a set


of measurements {f_1, . . . , f_N} ⊂ R³, f_j = (f_{j,1}, f_{j,2}, f_{j,3})^T, at the nodal points {η_1, . . . , η_N} ⊂ S, the set of interpolating functions g ∈ H_K(S) with

  O^(i) g(η_j) = f_{j,i},    j = 1, . . . , N, i = 1, 2, 3,

is denoted by I_N. The interpolating spline s is defined as the superposition s = s^(1) + s^(2) + s^(3) of vector spherical splines of type i = 1, 2, 3 that solves the system of linear equations

  Σ_{k=1}^{N} α^(i)_k O^(i)_{η_j} O^(i)_{η_k} K(η_k, η_j) = f_{j,i},    j = 1, . . . , N, i = 1, 2, 3.    (30)

It has the desired minimum norm property

  ‖s‖_{H_K} = min_{g∈I_N} ‖g‖_{H_K},

in close relation to the scalar case (9).

in close relation to the scalar case (9). Data does not necessarily have to be available for all three contributions. However, if, e.g., only {f1,1 , . . . , fN,1 } is given, then only the type 1 contribution of s can be reconstructed. The contributions of type 2 and 3 would become zero due to the norm minimizing property. A common choice for vector spherical splines is the tensorial Abel-Poisson based n n kernel (i.e., K1∧ (n) = h− 2 and K2∧ (n) = K3∧ (n) = (n(n + 1)h)− 2 ) which allows the closed-form representation K(η, ξ ) =

3 1 (i) 1 − h2 oη ⊗ oξ(i) , 3 2π (1 + h2 − 2h(η · ξ )) 2

η, ξ ∈ S.

i=1
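In the simplest, componentwise setting mentioned at the beginning of this section, the interpolation system (30) reduces to a scalar linear system with the scalar Abel-Poisson kernel. The following is only a hedged sketch with synthetic nodal points and data; the 1/(4π) normalization used here differs from the tensorial formula above by a constant factor only, which is absorbed by the coefficients:

```python
import numpy as np

def abel_poisson(t, h=0.7):
    """Scalar Abel-Poisson kernel as a function of eta . xi; the constant
    factor is immaterial for interpolation (it rescales the coefficients)."""
    return (1.0 - h*h) / (4.0*np.pi * (1.0 + h*h - 2.0*h*t)**1.5)

rng = np.random.default_rng(1)
nodes = rng.normal(size=(8, 3))
nodes /= np.linalg.norm(nodes, axis=1, keepdims=True)   # nodal points on S
data = np.sin(3.0 * nodes[:, 2])                        # synthetic scalar data

# spline coefficients from the Gram system Q(eta_j . eta_k) alpha = data,
# the scalar analogue of (30)
gram = abel_poisson(nodes @ nodes.T)
alpha = np.linalg.solve(gram, data)

def spline(xi):
    return abel_poisson(nodes @ xi) @ alpha

# the spline interpolates the data at the nodal points
print(np.allclose([spline(p) for p in nodes], data))    # -> True
```

As noted above, such Gram systems can become severely ill-conditioned for dense nodal systems and may then require regularization.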

For the approximation of external gravity potential field data, one would rather use the operators õ^(i) and substitute y^(i)_{n,k} by ỹ^(i)_{n,k} in Definition 8 (compare the considerations in Sect. 4.2). Similar to what was mentioned for the scalar case in Remark 1, interpolation can be replaced by regularized approximation, and linear functionals can be included for more complex relations between the spline s and the data f_1, . . . , f_N. It does not hold true anymore that K is a spherical radial basis function in the sense that it depends only on the polar distance η · ξ. But it has the property of rotational invariance in the sense that R^T K(Rη, Rξ) R = K(η, ξ), for all η, ξ ∈ S and all rotation matrices R ∈ R^{3×3} that leave η invariant (i.e., Rη = η). Functions that reveal this invariance property are called tensor spherical radial basis functions. The naming makes sense when we consider that (scalar) spherical basis functions Φ as defined in Sect. 2.2 satisfy the (scalar) rotational invariance Φ(Rξ, Rη) = Φ(ξ, η), for all ξ, η ∈ S and all rotation matrices R ∈ R^{3×3} that leave one of the arguments, either ξ or η, invariant. The Euclidean norm |K| of the tensorial spherical radial basis functions is actually a (scalar) spherical radial


basis function in the sense of Sect. 2.2. The tensorial Abel-Poisson based kernel additionally preserves the spatial localization of its scalar counterpart. The property that spatial localization of the (scalar) reproducing kernel transfers to its tensorial counterpart holds true for most of the relevant cases, but in general it is dependent on the choice of the underlying scalar kernel and the choice of the linear operators used for the generation of K. For early references and more details on the presented approach to vector spherical splines, the reader is referred to, e.g., [23, 24].

5 Examples and Conclusions

We have briefly introduced spherical splines, multiscale scaling and wavelet kernels, and Slepian functions as possible function systems that could be used for approximation and reconstruction procedures in geoscience. Classical spherical harmonics are a good choice when global, uniformly-distributed data is available. If the latter is not the case, spherical splines and multiscale procedures form a promising alternative. Splines can be adapted to the data by the choice of nodal points. However, one has to be aware of the fact that the corresponding system of linear equations might be (severely) ill-conditioned and has to be handled appropriately (e.g., by regularization).

Multiscale procedures allow a scale-dependent adaptation to the data. Regions with higher data density can be resolved up to higher scales, while the approximation in regions with sparse data coverage should be stopped at lower scales, without influencing the resolution in other regions. No systems of linear equations need to be solved, but multiscale methods as we have introduced them in Sects. 2.3 and 4.1 rely on adequate cubature rules. Additionally, depending on the particular application, the scaling kernels need not be spatially localized, i.e., approximations at the lowest scale might suffer from the lack of global data. This can be addressed, e.g., by the construction of suitable Green's functions as indicated in Sect. 3. An example where such an adaptation is useful is provided in Sect. 5.1. Apart from the capability of adapting to non-uniform data situations, multiscale procedures have the advantage that they allow a more thorough study of spatially varying features due to the evolution of spatial resolution through the different scales.

If data is available at a fairly uniform distribution, but only in a particular region, Slepian functions form a useful alternative. The same holds true if the signal under consideration is naturally constrained to a certain region. The latter is the case, e.g., in Sect. 5.2, where we study the contribution to the geomagnetic field which is induced by ocean tides. There, we construct a set of trial functions that is based on Slepian functions which are localized over the oceans. The combination of such trial functions together with a greedy-type algorithm also allows the application to problems with a non-uniform spatio-temporal data distribution.
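The greedy-type strategy mentioned above can be illustrated in a discretized setting, where the columns of a matrix play the role of sampled trial functions and, in each step, the column best matching the current residual is selected. This is only a plain orthogonal-matching-pursuit sketch on synthetic data; the actual Regularized Orthogonal Functional Matching Pursuit used in Sect. 5.2 additionally involves regularization and works with linear functionals (cf. [63, 79]):

```python
import numpy as np

def greedy_pursuit(A, y, n_iter=25):
    """Plain orthogonal matching pursuit on a discretized dictionary:
    each column of A is a (sampled, normalized) trial function. In each
    step, the column best matching the current residual is selected and
    the data is refit by least squares on all selected columns."""
    idx, resid = [], y.copy()
    for _ in range(n_iter):
        j = int(np.argmax(np.abs(A.T @ resid)))   # best-matching trial function
        if j not in idx:
            idx.append(j)
        coef, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
        resid = y - A[:, idx] @ coef
    return idx, resid

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 20))
A /= np.linalg.norm(A, axis=0)        # normalize the dictionary columns
y = A @ rng.normal(size=20)           # synthetic data in the dictionary's span
idx, resid = greedy_pursuit(A, y)
print(np.linalg.norm(resid) < 1e-8 * np.linalg.norm(y))   # -> True
```

Since the residual is kept orthogonal to all selected columns, a new trial function is picked in every step until the data is (numerically) exhausted.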

5.1 Deflections of the Vertical

The Earth’s gravity force is tangential to the vertical plumb lines and perpendicular to all equipotential surfaces. The geoid is defined as that equipotential surface of the gravity potential W which best fits the mean sea level. Gravity data (i.e., gravity vectors g at various locations of measurement) can be converted into gravity anomalies by subtracting a reference gravity vector γ corresponding to a reference potential U derived from a simple gravity field model such as that of an ellipsoid with homogeneous mass density (see, e.g., [48]). Deflections of the vertical  are tangential vectors that describe the deviation between the (real) gravity vector and the reference gravity vector at a location x on the geoid. Namely,   (x) = ν  (x) − ν  (x) · ν(x) ν(x), where ν and ν  are the outer unit normal vectors corresponding to g and γ , respectively (cf. Fig. 9). In this section we are interested in the determination of the disturbing potential T = W − U from given deflections of the vertical. By various steps of approximation, the following spherical connection between the disturbing potential and the deflections of the vertical can be obtained: ∇S T (Rξ ) = −

kM (Rξ ), R

ξ ∈ S,

(31)

where k is the gravitational constant, M the Earth’s mass, and R the radius of the spherical approximation of the geoid. For details on the derivation, the reader is referred to [13, 22] and references therein. From the considerations in Sects. 2.3 and 3, we obtain the following two multiscale representations of T , based on Eq. (31): For ξ ∈ S,

Fig. 9 Illustration of the gravity vector g and the reference gravity vector γ , as well as the corresponding outer unit normal vectors. (Figure courtesy of [22])


  T(Rξ) = (kM/R) ∫_S ∇_{S,η} Φ_0(ξ, η) · Ξ(Rη) dω(η) + (kM/R) Σ_{J=0}^{∞} ∫_S ∇_{S,η} Ψ_J(ξ, η) · Ξ(Rη) dω(η),    (32)

and in case of Ω ⊂ S as the region under investigation (in our example, a spherical cap covering South America), for ξ ∈ Ω,

  T(Rξ) = (kM/R) ∫_Ω ∇_{S,η} Φ^N_0(ξ, η) · Ξ(Rη) dω(η) + (kM/R) Σ_{J=0}^{∞} ∫_Ω ∇_{S,η} Ψ^N_J(ξ, η) · Ξ(Rη) dω(η).    (33)

The expression (33) solely requires local data of Ξ in Ω, while (32) requires global data on S. An illustration of the scaling and wavelet contributions of the two different representations is provided in Figs. 10 and 11, under the assumption that data is only available in Ω (i.e., the integration over S in (32) has to be restricted artificially to Ω). The input data Ξ for the numerical example has been computed from EGM2008 on an equiangular grid within Ω. The fact that the involved scaling and wavelet kernels of (32) may have contributions outside of Ω (opposed to the scaling and wavelet kernels used in (33)) leads to the boundary disturbances that can be observed in Fig. 10, while such disturbances are not present in Fig. 11. Away from the boundary, both multiscale approaches yield similar results.
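Pointwise, the deflection vector Ξ(x) = ν'(x) − (ν'(x) · ν(x)) ν(x) defined at the beginning of this section is straightforward to evaluate; a small sketch with synthetic unit normals (the tilt angle is hypothetical):

```python
import numpy as np

def deflection(nu_prime, nu):
    """Deflection of the vertical: tangential deviation of the unit
    normal nu' from the reference unit normal nu."""
    return nu_prime - (nu_prime @ nu) * nu

# synthetic example: reference normal and a slightly tilted direction
nu = np.array([0.0, 0.0, 1.0])
nu_prime = np.array([np.sin(1e-4), 0.0, np.cos(1e-4)])   # ~20 arcsec tilt

xi = deflection(nu_prime, nu)
print(abs(xi @ nu) < 1e-15)          # -> True (tangential to the reference normal)
print(np.isclose(np.linalg.norm(xi), np.sin(1e-4)))   # -> True
```

By construction, Ξ is always perpendicular to ν, which is exactly why (31) relates it to the surface gradient of the disturbing potential.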

5.2 Ocean Tide Generated Magnetic Fields

The Earth’s magnetic field is composed of various contributions, one of them being the magnetic field induced by ocean tides. In recent years it has become possible to extract the signal which is due to the semi-diurnal lunar M2 tide in satellite magnetic field data (e.g., [72, 73, 80]). So far, the used extraction procedures are solely based on the temporal periodicity of the tidal signal. However this is a classical example where one has the a priori knowledge that the sources, i.e., tidal ocean currents, are restricted to a subdomain  of the (spherical) Earth’s surface. Here, we indicate an approach that in addition to the temporal periodicity also reflects this spatial localization of the source. After a Fourier transform in time domain, the governing Maxwell equations take the form ∇ × boc = μ0 σ (e + u × bmain ), ∇ ×e

= iωboc ,

∇ · boc = 0,

Fig. 10 Illustration of the multiscale reconstruction of T (units m²/s²) from knowledge of Ξ within Ω via (32); wavelet contributions at scales J = 8, 9, 10, 11. (Figure courtesy of [13])

where b_oc denotes the unknown ocean tidal magnetic field that we are interested in, b_main the main/core magnetic field, σ the Earth's conductivity (we assume both of the latter to be known), e the electric field, and u the depth-integrated velocities of the tidal ocean currents (both of the latter are assumed to be unknown). The fixed frequency ω represents the M2 tidal period of approximately 12.42 h. Instead of solving the time-harmonic Maxwell equations with a given model of depth-integrated velocities u, we choose a set of functions g_1, . . . , g_L suitable for the approximation of u. For each such g_i, we set u = g_i and solve the above equations to obtain a set of trial functions b_1, . . . , b_L suitable for the approximation of b_oc. Since we know that u is localized over the oceans and approximately divergence-free, we choose g_1, . . . , g_L to be divergence-free vector Slepian functions that are localized over the oceans (in our example, we assume the bandlimit N = 40 and use the first L = 2500 best-localized Slepian functions; cf. [68, 69] for the numerical generation). For the numerical solution of the time-harmonic Maxwell equations, we have used the frequency-domain 3-D integral equation solver (X3DG) of [54]; the underlying model of the surface shell conductivity σ is shown in Fig. 12, the model for b_main was taken from [67]. The resulting trial functions now contain

Fig. 11 Illustration of the multiscale reconstruction of T (units m²/s²) from knowledge of Ξ within Ω via (33); scaling contribution at scale J = 4 and wavelet contributions at scales J = 6, 8, 10, 15


Fig. 12 Illustration of the accumulated energy of the divergence-free vector Slepian functions g_1, . . . , g_L (top left) and the corresponding trial functions b_1, . . . , b_L (top right) at time t = 0. Absolute value of the underlying main/core magnetic field b_main (unit nT; bottom left) and surface shell conductivity (unit S; bottom right) assumed for the computation of the trial functions


Fig. 13 Illustration of the absolute values of the radial part of the M2 tidal magnetic field from the CM5 model (top left) and the approximation based on the trial functions b_1, . . . , b_L (top right). The bottom right image shows the absolute values of the residual between the two approximations (unit nT)


information on the temporal periodicity as well as on the spatial origin of the sources of the M2 tidal magnetic signal. The accumulated energies Σ_{i=1}^{L} |g_i(ξ)|² of the vector Slepian functions and Σ_{i=1}^{L} |b_i(t, ξ)|² of the corresponding trial functions at points ξ ∈ S and for a fixed time t are indicated in Fig. 12. Both cases exhibit the desired spatial localization over the oceans. The local variations of the accumulated energy Σ_{i=1}^{L} |b_i(t, ξ)|² over the oceans additionally reflect the variations of the underlying conductivity and main/core magnetic field.

Eventually, the trial functions b_1, . . . , b_L from above form a suitable foundation for the approximation of ocean tide induced magnetic field signals. In our test example, we have combined them with the Regularized Orthogonal Functional Matching Pursuit (a greedy-type algorithm described in detail in [63, 79]) to approximate the M2 tidal magnetic field contribution of the CM5 geomagnetic field model (cf. [73]). The results are shown in Fig. 13. The M2 tidal signal in the CM5 model has been extracted from satellite data solely based on the temporal periodicity of the ocean tides. The effect of the approximation based on the trial functions b_1, . . . , b_L, which reflect the temporal periodicity as well as the spatial localization of the sources, becomes apparent in the residual (bottom right image in Fig. 13): differences between the CM5 model and our approximation exist mainly over the continents.

Acknowledgements This work was partly supported by DFG grant GE 2781/1-1.

Literature

1. Alfeld, P., Neamtu, M., Schumaker, L.L.: Fitting data on sphere-like surfaces using splines. J. Comput. Appl. Math. 73, 5–43 (1995)
2. Atkinson, K.: Numerical integration on the sphere. J. Austr. Math. Soc. 23, 332–347 (1982)
3. Backus, G., Parker, R., Constable, C.: Foundations of Geomagnetism. Cambridge University Press, Cambridge (1996)
4. Baratchart, L., Gerhards, C.: On the recovery of crustal and core contributions in geomagnetic potential fields. SIAM J. Appl. Math. 77, 1756–1780 (2017)
5. Baratchart, L., Hardin, D.P., Lima, E.A., Saff, E.B., Weiss, B.P.: Characterizing kernels of operators related to thin plate magnetizations via generalizations of Hodge decompositions. Inverse Prob. 29, 015004 (2013)
6. Bauer, F., Gutting, M., Lukas, M.A.: Evaluation of parameter choice methods for regularization of ill-posed problems in geomathematics. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn. Springer, Berlin (2015)
7. Bauer, F., Reiß, M.: Regularization independent of the noise level: an analysis of quasi-optimality. Inverse Prob. 24, 055009 (2008)
8. Berkel, P., Fischer, D., Michel, V.: Spline multiresolution and numerical results for joint gravitation and normal-mode inversion with an outlook on sparse regularisation. GEM Int. J. Geomath. 1, 167–204 (2011)
9. Chambodut, A., Panet, I., Mandea, M., Diament, M., Holschneider, M., Jamet, O.: Wavelet frames: an alternative to spherical harmonic representation of potential fields. Geophys. J. Int. 163, 875–899 (2005)
10. Dahlke, S., Dahmen, W., Schmitt, E., Weinreich, I.: Multiresolution analysis and wavelets on S² and S³. Num. Func. Anal. Appl. 16, 19–41 (1995)
11. Driscoll, J.R., Healy, D.M. Jr.: Computing Fourier transforms and convolutions on the 2-sphere. Adv. Appl. Math. 15, 202–250 (1994)


12. Fehlinger, T., Freeden, W., Mayer, C., Michel, D., Schreiner, M.: Local modelling of sea surface topography from (geostrophic) ocean flow. ZAMM 87, 775–791 (2007)
13. Fehlinger, T., Freeden, W., Mayer, C., Schreiner, M.: On the local multiscale determination of the Earth's disturbing potential from discrete deflections of the vertical. Comput. Geosci. 12, 473–490 (2009)
14. Fischer, D., Michel, V.: Sparse regularization of inverse gravimetry – case study: spatial and temporal mass variations in South America. Inverse Prob. 28, 065012 (2012)
15. Fischer, D., Michel, V.: Automatic best-basis selection for geophysical tomographic inverse problems. Geophys. J. Int. 193, 1291–1299 (2013)
16. Freeden, W.: On integral formulas of the (unit) sphere and their application to numerical computation of integrals. Computing 25, 131–146 (1980)
17. Freeden, W.: On approximation by harmonic splines. Manuscr. Geod. 6, 193–244 (1981)
18. Freeden, W.: Multiscale Modelling of Spaceborne Geodata. Teubner, Leipzig (1999)
19. Freeden, W., Gerhards, C.: Poloidal and toroidal field modeling in terms of locally supported vector wavelets. Math. Geosc. 42, 818–838 (2010)
20. Freeden, W., Gerhards, C.: Geomathematically Oriented Potential Theory. Pure and Applied Mathematics. Chapman & Hall/CRC, Boca Raton (2012)
21. Freeden, W., Gerhards, C.: Romberg extrapolation for Euler summation-based cubature on regular regions. GEM Int. J. Geomath. 8, 169–182 (2017)
22. Freeden, W., Gerhards, C., Schreiner, M.: Disturbing potential from deflections of the vertical: from globally reflected surface gradient equation to locally oriented multiscale modeling. In: Grafarend, E. (ed.) Encyclopedia of Geodesy. Springer International Publishing (2015)
23. Freeden, W., Gervens, T.: Vector spherical spline interpolation – basic theory and computational aspects. Math. Methods Appl. Sci. 16, 151–183 (1993)
24. Freeden, W., Gervens, T., Schreiner, M.: Constructive Approximation on the Sphere (with Applications to Geomathematics). Oxford Science Publications. Clarendon Press, Oxford (1998)
25. Freeden, W., Gutting, M.: Special Functions of Mathematical (Geo-)Physics. Applied and Numerical Harmonic Analysis. Springer, Basel (2013)
26. Freeden, W., Hesse, K.: On the multiscale solution of satellite problems by use of locally supported kernel functions corresponding to equidistributed data on spherical orbits. Stud. Sci. Math. Hungar. 39, 37–74 (2002)
27. Freeden, W., Michel, V.: Constructive approximation and numerical methods in geodetic research today – an attempt at a categorization based on an uncertainty principle. J. Geod. 73, 452–465 (1999)
28. Freeden, W., Schneider, F.: Regularization wavelets and multiresolution. Inverse Prob. 14, 225–243 (1998)
29. Freeden, W., Schreiner, M.: Local multiscale modeling of geoidal undulations from deflections of the vertical. J. Geod. 78, 641–651 (2006)
30. Freeden, W., Schreiner, M.: Spherical Functions of Mathematical Geosciences. Springer, Berlin/Heidelberg (2009)
31. Freeden, W., Windheuser, U.: Combined spherical harmonics and wavelet expansion – a future concept in Earth's gravitational potential determination. Appl. Comput. Harm. Anal. 4, 1–37 (1997)
32. Gemmrich, S., Nigam, N., Steinbach, O.: Boundary integral equations for the Laplace-Beltrami operator. In: Munthe-Kaas, H., Owren, B. (eds.) Mathematics and Computation, a Contemporary View. Proceedings of the Abel Symposium 2006. Springer, Berlin (2008)
33. Gerhards, C.: Spherical decompositions in a global and local framework: theory and an application to geomagnetic modeling. GEM Int. J. Geomath. 1, 205–256 (2011)
34. Gerhards, C.: Locally supported wavelets for the separation of spherical vector fields with respect to their sources. Int. J. Wavel. Multires. Inf. Process. 10, 1250034 (2012)
35. Gerhards, C.: A combination of downward continuation and local approximation for harmonic potentials. Inverse Prob. 30, 085004 (2014)


36. Gerhards, C.: Multiscale modeling of the geomagnetic field and ionospheric currents. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn. Springer, Berlin (2015)
37. Gerhards, C.: On the unique reconstruction of induced spherical magnetizations. Inverse Prob. 32, 015002 (2016)
38. Gerhards, C.: On the reconstruction of inducing dipole directions and susceptibilities from knowledge of the magnetic field on a sphere. Inv. Probl. Sci. Engin. https://doi.org/10.1080/17415977.2018.1438426, to appear
39. Gerhards, C.: Spherical potential theory: tools and applications. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy – Functional Analytic and Potential Theoretic Methods. Birkhäuser, Basel (2018)
40. Gerhards, C., Pereverzyev, S. Jr., Tkachenko, P.: A parameter choice strategy for the inversion of multiple observations. Adv. Comp. Math. 43, 101–112 (2017)
41. Gerhards, C., Pereverzyev, S. Jr., Tkachenko, P.: Joint inversion of multiple observations. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy – Functional Analytic and Potential Theoretic Methods. Birkhäuser, Basel (2018)
42. Gubbins, D., Ivers, D., Masterton, S.M., Winch, D.E.: Analysis of lithospheric magnetization in vector spherical harmonics. Geophys. J. Int. 187, 99–117 (2011)
43. Gutkin, E., Newton, K.P.: The method of images and Green's function for spherical domains. J. Phys. A: Math. Gen. 37, 11989–12003 (2004)
44. Gutting, M.: Fast spherical/harmonic spline modeling. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn. Springer, Berlin (2015)
45. Gutting, M., Kretz, B., Michel, V., Telschow, R.: Study on parameter choice methods for the RFMP with respect to downward continuation. Front. Appl. Math. Stat. 3, 10 (2017)
46. Haar, A.: Zur Theorie der orthogonalen Funktionensysteme. Math. Ann. 69, 331–371 (1910)
47. Hesse, K., Sloan, I., Womersley, R.: Numerical integration on the sphere. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn. Springer, Berlin (2015)
48. Hofmann-Wellenhof, B., Moritz, H.: Physical Geodesy, 2nd edn. Springer, Vienna (2006)
49. Holschneider, M.: Continuous wavelet transforms on the sphere. J. Math. Phys. 37, 4156–4165 (1996)
50. Hubbert, S., LeGia, Q.T., Morton, T.: Spherical Radial Basis Functions, Theory and Applications. Springer International Publishing (2015)
51. Kamman, P., Michel, V.: Time-dependent Cauchy-Navier splines and their application to seismic wave front propagation. ZAMM J. Appl. Math. Mech. 88, 155–178 (2008)
52. Kidambi, R., Newton, K.P.: Motion of three point vortices on a sphere. Phys. D 116, 143–175 (1998)
53. Kidambi, R., Newton, K.P.: Point vortex motion on a sphere with solid boundaries. Phys. Fluids 12, 581–588 (2000)
54. Kuvshinov, A.V.: 3-D global induction in the ocean and solid Earth: recent progress in modeling magnetic and electric fields from sources of magnetospheric, ionospheric, and ocean origin. Surv. Geophys. 29, 139–186 (2008)
55. LeGia, Q.T., Sloan, I., Wendland, H.: Multiscale analysis on Sobolev spaces on the sphere. SIAM J. Num. Anal. 48, 2065–2090 (2010)
56. Lima, E.A., Weiss, B.P., Baratchart, L., Hardin, D.P., Saff, E.B.: Fast inversion of magnetic field maps of unidirectional planar geological magnetization. J. Geophys. Res. Solid Earth 118, 1–30 (2013)
57. Mallat, S., Zhang, Z.: Matching pursuits with time-frequency dictionaries. IEEE Trans. Signal Process. 41, 3397–3415 (1993)
58. Masterton, S., Gubbins, D., Müller, R.D., Singh, K.H.: Forward modelling of oceanic lithospheric magnetization. Geophys. J. Int. 192, 951–962 (2013)
59. Mayer, C., Maier, T.: Separating inner and outer Earth's magnetic field from CHAMP satellite measurements by means of vector scaling functions and wavelets. Geophys. J. Int. 167, 1188–1203 (2006)

6 Reconstruction and Decomposition of Scalar and Vectorial. . .

349

60. Michel, V.: Lectures on Constructive Approximation – Fourier, Spline, and Wavelet Methods on the Real Line, the Sphere, and the Ball. Birkhäuser, Boston (2013) 61. Michel, V., Simons, F.: A general approach to regularizing inverse problems with regional data using Slepian wavelets. Inverser Prob. 33, 125016 (2018) 62. Michel, V., Telschow, R.: A non-linear approximation method on the sphere. GEM Int. J. Geomath. 5, 195–224 (2014) 63. Michel, V., Telschow, R.: The regularized orthogonal functional matching pursuit for ill-posed inverse problems. SIAM J. Num. Anal. 54, 262–287 (2016) 64. Michel, V., Wolf, K.: Numerical aspects of a spline-based multiresolution recovery of the harmonic mass density out of gravity functionals. Geophys. J. Int. 173, 1–16 (2008) 65. Müller, C.: Spherical Harmonics. Springer, New York (1966) 66. Olsen, N., Glassmeier, K-H., Jia, X.: Separation of the magnetic field into external and internal parts. Space Sci. Rev. 152, 135–157 (2010) 67. Olsen, N., Lühr, H., Finlay, C.C., Sabaka, T.J., Michaelis, I., Rauberg, J., Tøffner-Clausen, L.: The CHAOS-4 geomagnetic field model. Geophys. J. Int. 197, 815–827 (2014) 68. Plattner, A., Simons, F.J.: slepian_golf version 1.0.0. https://doi.org/10.5281/zenodo.583627 69. Plattner, A., Simons, F.J.: Spatiospectral concentration of vector fields on a sphere. Appl. Comp. Harm. Anal. 36, 1–22 (2014) 70. Plattner, A., Simons, F.J.: Potential-field estimation from satellite data using scalar and vector Slepian functions. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn. Springer, Berlin (2015) 71. Plattner, A., Simons, F.J.: Internal and external potential-field estimation from regional vector data at varying satellite altitude. Geophys. J. Int. 211, 207–238 (2017) 72. Sabaka, T., Tyler, R., Olsen, N.: Extracting ocean-generated tidalmagnetic signals from Swarm data through satellite gradiometry. Geophys. Res. Lett. 43, 3237–3245 (2016) 73. 
Sabaka, T.J., Olsen, N., Tyler, R.H., Kuvshinov, A.: CM5, a pre-Swarm comprehensive geomagnetic field model derived from over 12 years of CHAMP, ørsted, SAC-C and observatory data. Geophys. J. Int. 200, 1596–1626 (2015) 74. Schreiner, M. Locally supported kernels for spherical spline interpolation. J. Approx. Theory 89, 172–194 (1997) 75. Shure, L., Parker, R.L., Backus, G.E.: Harmonic splines for geomagnetic modeling. Phys. Earth Planet. Inter. 28, 215–229 (1982) 76. Simons, F.J., Dahlen, F.A., Wieczorek, M.A.: Spatiospectral localization on a sphere. SIAM Rev. 48, 505–536 (2006) 77. Simons, F.J., Plattner, A.: Scalar and vector slepian functions, spherical signal estimation and spectral analysis. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn. Springer, Berlin (2015) 78. Sloan, I., Womersley, R.: Filtered hyperinterpolation: a constructive polynomial approximation on the sphere. GEM Int. J. Geomath. 3, 95–117 (2012) 79. Telschow, R.: An Orthogonal Matching Pursuit for the Regularization of Spherical Inverse Problems. PhD thesis, University of Siegen (2014) 80. Tyler, R., Maus, S., Lühr, H.: Satellite observations of magnetic fields due to ocean tidal flow. Science 299, 239–240 (2003) 81. Vervelidou, F, Lesur, V., Morschhauser, A.,Grott, M., Thomas, P.: On the accuracy of paleopole estimations from magnetic field measurements. Geophys. J. Int. 211, 1669–1678 (2017) 82. Wahba, G.: Spline inteprolation and smoothing on the sphere. SIAM J. Sci. Stat. Comput. 2, 5–16 (1981) 83. Wendland, H.: Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree. Adv. Comput. Math. 4, 389–396 (1995) 84. Wendland, H.: Scattered Data Approximation. Cambridge University Press, Cambridge (2005)

7 Ellipsoidal-Spheroidal Representation of the Gravity Field

Erik W. Grafarend

Ellipsoidal-spheroidal representation of the gravity field of a gravitating and rotating Earth: the anharmonic as well as the harmonic part, zero, first and second derivatives of its potential field, deformable bodies.

Contents
1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
2 The Definition of Ellipsoidal-Spheroidal Coordinates: The Ellipsoidal-Spheroidal Gravity Field . . . . . . . . . . . . . . . . . . . . . . . . . 359
3 The Anharmonic Somigliana-Pizzetti Reference Gravity Field of a Rotating Earth, the World Geodetic Datum . . . . . . . . . . . . . . . . . 365

This chapter is part of the series Handbuch der Geodäsie, volume "Mathematische Geodäsie/Mathematical Geodesy", edited by Willi Freeden, Kaiserslautern.
This contribution is dedicated to the late Helmut Wolf, founder of the one and only Institute of Theoretical Geodesy at Bonn University, Germany, for his courage to make me the youngest German Professor of Geodetic Sciences, as well as to Friedrich W. Hehl, Theoretical Physicist, University of Cologne, Germany, for his critical accompanying of my research and for his substantial advice.
E. W. Grafarend ()
Faculty of Aerospace Technology and Geodesy, Faculty of Mathematics and Physics, Faculty of Civil Engineering and Environmental Engineering Sciences, Department of Geodesy and Geoinformatics, Stuttgart University, Stuttgart, Germany
E-Mail: [email protected]
© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020
W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_104


4 Geostationary and Geosynchronous Satellite Orbits: Analysis of the Gravitational Force and of the Centrifugal Force, the Null Space of the Gravity Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
5 Molodensky Heights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
6 The MacCullagh Representation: The Time-Variation of Multipoles of the Gravity Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439

Abstract We begin with a chapter on motivation, namely why the Earth cannot be a ball, due to the Earth rotation which we experience daily. As a consequence, the Earth's gravity field is axially symmetric in a first-order approximation, not spherically symmetric. The same axially symmetric gravity field applies to all planets and mini-planets, and of course to the Moon, the Sun and other space objects which intrinsically rotate. The second chapter is therefore devoted to the definition of ellipsoidal-spheroidal coordinates, which allow separation of variables. The mixed elliptic-trigonometric elliptic coordinates are generated by the intersection of a family of confocal, oblate spheroids, a family of confocal half hyperboloids and a family of half planes: in this coordinate system {λ, φ, u} we give the forward transformation of spheroidal coordinates into Cartesian coordinates {x, y, z} and its unique inverse, the backward transformation {x, y, z} → {λ, φ, u}. In such a coordinate system we represent the eigenspace of the potential field in terms of the gravitational potential field, being harmonic, as well as the centrifugal potential field, being anharmonic. Such an eigenspace is described by normalized associated Legendre functions of the first and second kind. The normalization is based on the global area element of the spheroid E²_{a,b}. The third chapter is a short introduction to the Somigliana-Pizzetti level ellipsoid in terms of its semi-major axis and its semi-minor axis as well as the best estimation of the fundamental geodetic parameters {W₀, GM, J₂, Ω} approximating the physical surface of the Planet Earth, namely the Gauss-Listing Geoid. These parameters determine the World Geodetic Datum for a fixed reference epoch. They are called (i) the potential value of the equilibrium figure close to Mean Sea Level, (ii) the gravitational mass, (iii) the degree/order (2, 0) coefficient of the gravitational field and finally (iv) the Mean Rotation Speed.
These numerical values of the Planet Earth are given. The best estimations of the form parameters derived from two constraints are presented for the Somigliana-Pizzetti level ellipsoid. In the case of real observations we have to decide whether or not to reduce the constant tide effect. For this reason we have computed the "zero-frequency tidal reference system" and the "tide-free reference system", which differ by about 40 cm. The radii are {a = 6,378,136.572 m, b = 6,356,751.920 m} for the tide-free Geoid of Reference, but {a = 6,378,136.602 m, b = 6,356,751.860 m} for the zero-frequency tide Geoid


of Reference. These results, presented in the Datum 2000, differ significantly from the data of the standard Geodetic Reference System 1980. The geostationary orbit balances the gravitational force and the centrifugal force to zero, the so-called Null Space. Its value of 42,164 km distance from the Earth center has been calculated in the quasi-spherical referenced coordinate system introduced by T. Krarup. This Null Space evaluates the degree/order term (0, 0) of the gravitational field and the degree/order terms (0, 0) and (2, 0) of the centrifugal field. A careful treatment of the axially symmetric gravity field representing the gravitational and centrifugal field of this degree/order amounts to solving a polynomial equation of order ten. The intersection point of these two forces has been calculated with considerable effort! Referring to the Somigliana-Pizzetti reference gravity field we compute Molodensky heights in all detail. Using the World Geodetic Datum 2000 we present the telluroid, telluroid heights and the highlight "Molodensky heights". The highlight is our quasi-Geoid map of East Germany, based on the minimum distance of the physical surface of the Earth to the Somigliana-Pizzetti telluroid. We build up the theory of the time-varying gravity field for excitation functions of various types: (i) tidal potential, (ii) loading potential, (iii) centrifugal potential and (iv) transverse stress. The mass density variation in time, namely caused by (i) the initial mass density and (ii) the divergence of the time displacement vectors, is represented in terms of (i) radial, (ii) spheroidal and (iii) toroidal displacement coefficients under the spherical Love-Shida hypothesis. For the various excitation functions we compute those coefficients.

Keywords Gravitational and centrifugal force · Gauss-Green potential · Coriolis and Euler force · Axisymmetric gravity field · Critical radius · Harmonic and anharmonic parts · Elliptic, spheroidal and toroidal parts · Jacobi functions · Level ellipsoid and the Geoid · Zero-frequency Geoid · Zund theorem · Bifurcation · Love-Shida hypothesis

Zusammenfassung We begin the first chapter with the argument that, because of the Earth's rotation, the Earth cannot be a ball of constant radius; rather, it is approximately an oblate ellipsoid, in accordance with our daily experience: the Earth rotates in roughly 24 hours. We give an introduction to the flattening and rotation, with approximate values for all terrestrial planets, as well as the characteristic data of the Planet Earth and the various definitions of the year, the month and the day. The gravity field of the Earth consists of (i) a harmonic part and


(ii) an anharmonic part. The second chapter is devoted to the precise definition of the elliptic-spheroidal coordinates of an elliptic Earth. Ellipsoidal-harmonic series expansions of the three-dimensional Laplace operator round off the chapter. The third chapter concentrates on the anharmonic Somigliana-Pizzetti reference field; the best estimates of the so-called form parameters of the Earth are a first highlight. The fourth chapter derives the geostationary and geosynchronous satellite radius of 42,164 km, the so-called null space of the gravity field outside the rotating Earth. The Molodensky height system, the central height system in Russia and Europe, is the focus of the fifth chapter. It is based on the Somigliana-Pizzetti reference gravity field; central is the quasi-Geoid as a GPS information system of the Earth parameter data. The temporal variation of the terrestrial gravity field, the so-called MacCullagh representation, is central to the sixth chapter. We close with our summary and an outlook: Geodesy in the 21st century.

1 Motivation

This is the title of a remarkable contribution by the science journalist Arbesman which appeared in the science section of the daily newspaper "Süddeutsche Zeitung", Tuesday, 5 February 2013. Indeed, "the Earth's irregular surface is globally best approximated by an oblate ellipsoid-of-revolution, but NOT by a sphere". Why is this result true? Because the Earth, as well as all planets, rotates: the rotation flattens the Earth. Please refer to Table 1 for reviewing gravitational and rotational effects in an Earth-fixed reference system; we arrive at four types of forces: the gravitational force and the centrifugal force (derived from the Gauss-Green potential), and the Coriolis force and the Euler-Darwin force (vector field components). The strongest forces are the gravitational force and the centrifugal force. The Coriolis force drives the currents at sea; the Euler-Darwin force, proportional to the time derivative of the rotational vector, is responsible for Polar Motion and Length-of-Day variations as well as the precession-nutation elements of Angular Momentum, namely the kinematical Euler equations and the dynamical Euler equations. If we consider the Earth deformable, we arrive instead at the Euler-Liouville equations. All terrestrial planets rotate and are affected by gravitational forces as well as centrifugal forces in their orbits. For instance, Table 1 informs us about the obliquity, the rotational period, the orbital revolution as well as the equatorial radius in km.


Table 1 Obliquity, rotational period, orbital revolution, equatorial radius of the planets

Planet     Obliquity (°)   Rotational period (days)   Orbital revolution (days)   Equatorial radius (km)   Moons
Mercury        0.1                58.68                       87.97                       2,439               0
Venus        177.4               243.0                       224.7                       6,052               0
Earth         23.4                 0.99                      365.26                      6,378               1
Mars          25.2                 1.03                      686.98                      3,393               2
Jupiter        3.1                 0.41                    4,332.59                     71,492              63
Saturn        26.7                 0.45                   10,759.22                     60,268              60
Uranus        97.9                 0.72                   30,688.5                      25,554              27
Neptune       29.6                 0.67                   60,182                        24,769              13

For the Earth, we have an obliquity of 23.4°, a rotational period of 0.99 days, an orbital revolution of 365.26 days and an equatorial radius of 6,378 km. The Moon and the Earth form a coupled system. There exist three coupling systems in general: spin-spin, orbit-orbit and spin-orbit. While the planetary system was well known in the past, contemporary mathematicians seem to have forgotten these items, believing in a spherical, gravitational Earth nowadays. Another up-to-date example is NASA's space mission Dawn: its target was to investigate the space parameters of the asteroid belt, namely Ceres and Vesta, mini-planets between the planets Mars and Jupiter. Our interest here is to present results of the coupling of gravitation and rotation in terms of two-axis and three-axis reference ellipsoids. Observations in 2015 have proven that Ceres is a slightly flattened rotational ellipsoid with an equatorial radius of 963 km and a polar radius of 891 km; the difference is 72 km. The rotational period amounts to 9.007 h, and the mean density has been determined as 2.077 ± 0.036 g/cm³. The asteroid Vesta has been characterized by a triaxial ellipsoid of radii 280 km, 272 km and 227 km. Its mean density is 3.7 ± 0.23 g/cm³; the rotational period is 5.3 h. The asteroid Ceres has been famous for the prediction of its orbit by C.F. Gauss. He also gave the name to the asteroid Vesta, a Roman goddess. Here we begin with two key references. Our first reference has been taken from the 2016 Yearly Calendar of the British Royal Astronomical Society, of which the author is a member; it lists data of the Planet Earth:


Earth
Perihelion: 2 January 2016, 22:49 (147.1 million km)
Aphelion: 4 July 2016, 16:24 (152.1 million km)
Equinoxes: 20 March 2016, 04:30; 22 September 2016, 14:21
Solstices: 20 June 2016, 22:34; 21 December 2016, 10:44
Obliquity of ecliptic (2016.0): 23°.43720
Perihelion distance: 0.983303941 au
Aphelion distance: 1.016750939 au
1 au = 149,597,871 km
Equatorial radius: 6378.136 km
Polar radius: 6356.755 km
Mass: 5.9722 × 10^24 kg
Mean Earth-Moon distance: 384,400 km
Mean Earth-Sun distance: 1.00000106 au
Mean orbital speed: 29.785 km/s
Solar parallax: 8".79414

Length of Year, Month, Day
Year (days): Tropical (equinoxes) 365.24219; Sidereal (fixed stars) 365.25636; Anomalistic (apsides) 365.25964; Eclipse (Moon's nodes) 346.62003; Gaussian (Kepler's law, a = 1) 365.25690
Month (days): Tropical (equinoxes) 27.32158; Sidereal (fixed stars) 27.32166; Anomalistic (apsides) 27.55455; Draconic (nodes) 27.21222; Synodic (New Moons) 29.53059
Day: Mean solar day = 1.00273791 mean sidereal days; Mean sidereal day = 0.99726957 mean solar days
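As a quick cross-check on the tabulated radii, the flattening they imply can be computed directly; a minimal Python sketch (the two radii are the values listed above):

```python
# Flattening and first eccentricity implied by the radii listed above.
a = 6378.136  # equatorial radius [km]
b = 6356.755  # polar radius [km]

f = (a - b) / a              # geometric flattening
e2 = (a**2 - b**2) / a**2    # first eccentricity squared

print(f"1/f = {1.0 / f:.2f}")   # ≈ 298.31
print(f"e^2 = {e2:.8f}")
```

The reciprocal flattening near 298.3 is the familiar signature of the oblate Earth that the following paragraphs discuss.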

Question Should Geodesists learn about the equatorial radius 6378.136 km and about the polar radius 6356.755 km?

Answer Of course, every Geodesist should learn all these Earth data as well as the lengths of the {Year, Month, Day}. There are no other geoscientists who know these details better.

Our second reference has been taken from my colleagues F. W. Hehl and Y. N. Obukhov (2002, page 378):


Kerr-Newman Solution "When a source is rotating, its electromagnetic and gravitational fields are no longer spherically symmetric. Instead, the Reissner-Nordström geometry discussed above is replaced by an axially symmetric configuration described by the co-frame . . . The latter is directly related to the angular momentum of the source." F. W. Hehl is Professor of Theoretical Physics at Cologne University, Germany. He raised the basic question why the terrestrial gravity field cannot be "spherically symmetric, but axially symmetric" due to Earth rotation.

Question Should Geodesists learn that the gravity field of the Earth is not spherically symmetric, but at least axially symmetric? Should Geodesists learn that the Earth rotates, causing Angular Momentum?

Answer Of course, a Geodesist should know that the Earth (and all terrestrial planets) rotates. They should know Polar Motion, Length-of-Day variations and Precession-Nutation, in detail the kinematical Euler equations as well as the dynamical Euler equations (rigid bodies) and the Euler-Liouville equations (deformable bodies). There are no other specialists than the geodesists who know these facts for the Planet Earth better.

Exercise If we neglect Polar Motion and Length-of-Day variations, the centrifugal potential allows the representation V(x, y, z) = ½ω²(x² + y²) if we assume rotation around the z-axis, for instance for the Earth. We want to test "harmonicity" by a simple derivation:

V(x, y, z) = \frac{1}{2}\,\omega^2 \left( x^2 + y^2 \right)   (1)

\frac{\partial V}{\partial x} = \omega^2 x, \quad \frac{\partial V}{\partial y} = \omega^2 y, \quad \frac{\partial V}{\partial z} = 0   (2)

\frac{\partial^2 V}{\partial x^2} = \omega^2, \quad \frac{\partial^2 V}{\partial y^2} = \omega^2, \quad \frac{\partial^2 V}{\partial z^2} = 0   (3)

\left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \right) V(x, y, z) = 2\omega^2   (4)

We refer to a harmonic field by the condition \Delta_3 V(x, y, z) = \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) V(x, y, z) = 0; obviously the centrifugal potential is not a harmonic field, due to \Delta_3 V(x, y, z) = 2\omega^2. As the differential equation is of linear type, we enjoy the property that we may (i) first solve the harmonic field equation in general and (ii) secondly add the special solution of the anharmonic field equation.


The figure of the Earth, the Moon, the Sun and the terrestrial planets has been at the focal interest of the geodetic sciences, both in geometry space as well as in gravity space, in particular changing in time. Many textbooks have been written, especially to analyze the irregular figure of the Earth topography. Up-to-date contributions argue that the irregular figure of the Earth cannot be described as a differential manifold, but perhaps by fractals. Seen historically, various figures have been assumed, of type plane, sphere, ellipsoid, regular topography, irregular topography and fractal geometry.

Relativity has taught us that any frame of reference is suitable to describe spacetime. In practice, geodesists use the approximation of type "weak field, slow motion" or "post-Newtonian of second order". 99% of the colleagues in Geodesy believe in frames of reference of type body-fixed or space-fixed ("inertial"), moving in Newtonian spacetime, extended in the case of a deformable body. The Earth, the Moon, the Sun and the terrestrial planets rotate. Table 1 is a summary of the rotational periods of the terrestrial planets as well as the orbital periods in a first-order approximation. We repeat: "Rotation flattens the Earth and the Planets". In a body-fixed frame of reference we describe the rotational motion by forces of type:

(i) the gravitational force and (ii) the centrifugal force, both derived from a scalar potential (Gauss-Green), as well as
(iii) the Coriolis force and (iv) the Euler-Darwin force, which cannot be derived from a scalar potential.

Traditionally, Geodesists use only the forces (i) and (ii), which are summed up additively towards W = U + V, where W is called the gravity potential, U the gravitational potential and V the centrifugal potential. They leave the Coriolis force (the driving mechanism of the ocean currents) and the Euler-Darwin force (initiating Polar Motion/Length-of-Day variations resp. Precession/Nutation) to Oceanography and Geophysics.

What is our topic? Here we present to you first the arguments that the "Earth is not a ball". First, we define the ellipsoidal-spheroidal coordinates, namely


the intersection of a family of confocal oblate spheroids, a family of confocal half hyperboloids and a family of half planes, called {λ, φ, u}. A special topic is the spheroidal eigenspace of the three-dimensional Laplace partial differential equation as well as the external gravity field of the Earth. The highlight is the weighted orthogonality condition with respect to the reference ellipsoid of revolution E²_{a,b}. Second, we study the anharmonic Somigliana-Pizzetti reference gravity field of the rotating Earth, part of the World Geodetic Datum, namely the 4 parameters {W₀, GM, J₂, Ω} characterizing the Somigliana-Pizzetti gravity field. The best estimates of the form parameters {a, b} or {a, ε} are subject to two conditional equations. We give these data for the "zero-frequency tide system" as well as the "tide-free reference system": they differ by −0.39 m (semi-major axis) and −0.45 m (semi-minor axis), a quantity which cannot be neglected. Third, we study the geostationary orbit, namely the analysis of the gravitational force and of the centrifugal force, their resultant being zero. It is the radius of the "geostationary orbit" of 42,164 km computed by Bode and Grafarend [9], recently by Iorio [35]. Fourth, Molodensky heights with respect to the Somigliana-Pizzetti potential reference field are a central subject of this chapter. The illustrations in Figs. 8, 9, 10, and 11 refer to the quasi-Geoid of East Germany in the EGG97 reference system. Fifth, space gradiometry, namely tensor-valued ellipsoidal harmonics and the datum problem, are key topics. We treat the upward and the problematic downward continuation of the components: (i) tangential shear, (ii) tangential dilatation, (iii) normal components and finally (iv) mixed components. The downward continuation of the gravity gradients to an ellipsoidal boundary has to be regularized, split into the symmetric and the antisymmetric second-order tensor part.
The three phases in applying ellipsoidal anharmonic and harmonic parts are illustrated. The disturbing gravity field is represented up to degree/order (360/360) in ellipsoidal harmonics. Basic is the Lusternik-Schnirelmann category: we need at least two charts to cover the ellipsoid (or sphere), not one as the naïve map-maker believes. Various numerical studies have proven that the approximation of the Earth's figure by an ellipsoid-of-revolution is better than that by the sphere by a factor of three.
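The quoted 42,164 km null-space radius can be reproduced to spherical approximation from the balance GM/r² = ω²r, i.e. r = (GM/ω²)^(1/3). A sketch with conventional numerical constants (GM and ω below are the standard values, assumptions rather than the chapter's own estimates):

```python
# Spherical-approximation radius of the geostationary orbit:
# the gravitational attraction GM/r^2 balances the centrifugal
# acceleration w^2 r, so r = (GM / w^2)^(1/3).
GM = 3.986004418e14    # geocentric gravitational constant [m^3/s^2]
OMEGA = 7.2921150e-5   # Earth rotation rate [rad/s], sidereal

r = (GM / OMEGA**2) ** (1.0 / 3.0)  # [m]
print(f"r = {r / 1000.0:.0f} km")   # ≈ 42164 km
```

The full treatment in the chapter replaces this spherical balance by the axially symmetric field, which leads to the polynomial equation of order ten mentioned above.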

2 The Definition of Ellipsoidal-Spheroidal Coordinates: The Ellipsoidal-Spheroidal Gravity Field

A detailed analysis of ellipsoidal-spheroidal coordinates has been presented by Thong and Grafarend [63], to which we refer. There you find a review of various types of spheroidal coordinates as well as spheroidal eigenfunctions/eigenvalues which operate on the spheroidal (not spherical) solution space of the three-dimensional Laplace partial differential equation for the external gravity field of the Earth. Here we have chosen {λ, φ, u} as spheroidal coordinates, which are generated as elliptic coordinates by the intersection of a family of confocal oblate spheroids, a family of confocal half hyperboloids and a family of half planes according to the


following definitions. (Other coordinate systems, like those in use for GPS, suffer from the problem that they do not allow a separation of variables.)

Definition 1 (Spheroidal coordinates {λ, φ, u}). The mixed elliptic-trigonometric elliptic coordinates are generated by the intersection of:

(i) the family of confocal, oblate spheroids

\mathbb{E}^2_{\sqrt{u^2+\varepsilon^2},\,u} := \left\{ \mathbf{x} \in \mathbb{R}^3 \;\middle|\; \frac{x^2 + y^2}{u^2 + \varepsilon^2} + \frac{z^2}{u^2} = 1 \right\}, \quad u \in (0, +\infty), \quad \varepsilon^2 := a^2 - b^2   (5)

(ii) the family of confocal half hyperboloids

\mathbb{E}^2_{\varepsilon\cos\phi,\,\varepsilon\sin\phi} := \left\{ \mathbf{x} \in \mathbb{R}^3 \;\middle|\; \frac{x^2 + y^2}{\varepsilon^2 \cos^2\phi} - \frac{z^2}{\varepsilon^2 \sin^2\phi} = 1 \right\}, \quad \phi \in \left( -\frac{\pi}{2}, +\frac{\pi}{2} \right), \; \phi \neq 0   (6)

(iii) the family of half planes

\mathbb{P}^2_{\cos\lambda,\,\sin\lambda} := \left\{ \mathbf{x} \in \mathbb{R}^3 \;\middle|\; y = x \tan\lambda \right\}, \quad \lambda \in [0, 2\pi]   (7)

These coordinates are called spheroidal. The longitude λ (cf. (148)) gives the orientation of the half planes; the latitude φ relates to the inclination of the asymptotes of the confocal half hyperboloids; the elliptic coordinate u coincides with the semi-minor axis of the confocal oblate spheroids (confocal, oblate ellipsoids of revolution). In addition, let us assume that the pre-relativistic space near the Earth is three-dimensional Euclidean, {R³, g_kl}, with the matrix of the metric [g_kl]. If R³ is covered by Cartesian coordinates, the three-dimensional Euclidean space {R³, δ_kl} is completely covered by one chart. In contrast, if R³ is covered by curvilinear coordinates of spheroidal type, a minimal atlas of the three-dimensional Euclidean space is established by three charts. Accordingly, the following corollary generates only one chart of {R³, g_kl} due to the demanded domain λ ∈ {λ ∈ R | 0 < λ < 2π}, φ ∈ {φ ∈ R | −π/2 < φ < +π/2}, u ∈ {u ∈ R | u > 0}, avoiding the singularities at the North Pole as well as at the South Pole of the confocal oblate spheroids.


Corollary 1 (Conversion of Cartesian coordinates {x, y, z} into spheroidal coordinates {λ, φ, u}). The forward transformation of spheroidal coordinates {λ, φ, u} into Cartesian coordinates {x, y, z}, namely

x = \sqrt{u^2 + \varepsilon^2}\, \cos\phi \cos\lambda, \quad y = \sqrt{u^2 + \varepsilon^2}\, \cos\phi \sin\lambda, \quad z = u \sin\phi   (8)

can be uniquely inverted into the backward transformation of Cartesian coordinates {x, y, z} into spheroidal coordinates {λ, φ, u}, namely

\lambda = \arctan\frac{y}{x} + \left[ \frac{1}{2} + \frac{1}{2}\left( 1 - \operatorname{sgn} y - \operatorname{sgn} y \,\operatorname{sgn} x \right) \right] \pi   (9)

\phi = (\operatorname{sgn} z) \arcsin \left\{ \frac{1}{2\varepsilon^2} \left[ \varepsilon^2 - \left( x^2 + y^2 + z^2 \right) + \sqrt{\left( x^2 + y^2 + z^2 - \varepsilon^2 \right)^2 + 4\varepsilon^2 z^2} \right] \right\}^{1/2}   (10)

u = \left\{ \frac{1}{2} \left[ \left( x^2 + y^2 + z^2 - \varepsilon^2 \right) + \sqrt{\left( x^2 + y^2 + z^2 - \varepsilon^2 \right)^2 + 4\varepsilon^2 z^2} \right] \right\}^{1/2}   (11)

if

\lambda \in \{ \lambda \in \mathbb{R} \mid 0 < \lambda < 2\pi \}, \quad \phi \in \{ \phi \in \mathbb{R} \mid -\pi/2 < \phi < +\pi/2 \}, \quad u \in \{ u \in \mathbb{R} \mid u > 0 \}   (12)

holds.

Spheroidal coordinates enjoy the property of decomposing the three-dimensional Laplace partial differential equation into separable functions. We collect this basic result in:

Lemma 1 (Spheroidal eigenspace of the three-dimensional Laplace partial differential equation, external gravity field of the Earth). For a static, uniformly rotating Earth the gravity potential field W(λ, φ, u) can be additively decomposed into the gravitational potential field U(λ, φ, u) and the centrifugal potential field V(λ, φ, u), namely

W(\lambda, \phi, u) = U(\lambda, \phi, u) + V(\lambda, \phi, u)   (13)

The multiplicative decomposition of the gravitational potential field into separable functions, U(λ, φ, u) = Λ(λ) Φ(φ) U(u), generates the solution of the three-dimensional Laplace partial differential equation

U(\lambda, \phi, u) = \sum_{n=0}^{\infty} \sum_{m=-n}^{+n} u_{nm}\, \frac{Q^{*}_{n|m|}(u/\varepsilon)}{Q^{*}_{n|m|}(b/\varepsilon)}\; e_{nm}(\lambda, \phi)   (14)

in terms of surface spherical harmonics

e_{nm}(\lambda, \phi) = P^{*}_{n|m|}(\sin\phi) \cdot \begin{cases} \cos m\lambda & \forall\, m \ge 0 \\ \sin|m|\lambda & \forall\, m < 0 \end{cases}   (15)

in the space {R³, E²_{a,b}} which is external to the ellipsoid of reference (x² + y²)/(b² + ε²) + z²/b² = 1. The eigenfunctions e_{nm}(λ, φ) are orthonormal on the ellipsoid of reference with a proper choice of a weighting function. A representation of the centrifugal potential in (i) Cartesian coordinates, (ii) spheroidal coordinates and (iii) surface spherical harmonics is

V(\phi, u) = \frac{1}{2}\,\omega^2 \left( x^2 + y^2 \right) = \frac{1}{2}\,\omega^2 \left( u^2 + \varepsilon^2 \right) \cos^2\phi   (16)

\cos^2\phi = \frac{2}{3} \left[ P^{*}_{00}(\sin\phi) - \frac{1}{\sqrt{5}}\, P^{*}_{20}(\sin\phi) \right]   (17)

u^2 + \varepsilon^2 = \frac{2}{3} \left[ \frac{\varepsilon^2}{\sqrt{5}}\, P^{*}_{20}\!\left(\frac{u}{\varepsilon}\right) + 2\varepsilon^2 \right]   (18)

V(\phi, u) = \frac{2}{9}\,\omega^2 \left[ \frac{\varepsilon^2}{\sqrt{5}}\, P^{*}_{20}\!\left(\frac{u}{\varepsilon}\right) + 2\varepsilon^2 \right] e_{00}(\lambda, \phi) - \frac{2}{9\sqrt{5}}\,\omega^2 \left[ \frac{\varepsilon^2}{\sqrt{5}}\, P^{*}_{20}\!\left(\frac{u}{\varepsilon}\right) + 2\varepsilon^2 \right] e_{20}(\lambda, \phi)   (19)

So far we have not defined the normalized associated Legendre functions of the first kind as well as of the second kind as they appear in (14)-(19).

Definition 2 (Normalized associated Legendre functions of the first and second kind). The fully normalized associated Legendre functions of the first kind are defined by means of recurrence relations of type

P^{*}_{nn}(\sin\phi) = \frac{\sqrt{2n+1}}{\sqrt{2n}}\, \cos\phi\; P^{*}_{n-1,n-1}(\sin\phi)   (20)

P^{*}_{n,n-1}(\sin\phi) = \frac{\sqrt{2n+1}}{\sqrt{2(n-1)}}\, \cos\phi\; P^{*}_{n-1,n-2}(\sin\phi)   (21)

P^{*}_{nm}(\sin\phi) = \frac{\sqrt{4n^2-1}}{\sqrt{n^2-m^2}}\, \sin\phi\; P^{*}_{n-1,m}(\sin\phi) - \sqrt{\frac{(2n+1)(n+m-1)(n-m-1)}{(n^2-m^2)(2n-3)}}\; P^{*}_{n-2,m}(\sin\phi)   (22)

subject to n ∈ [3, ∞) and m ∈ [0, n−2], with starting values

P^{*}_{00}(\sin\phi) = 1   (23)
P^{*}_{10}(\sin\phi) = \sqrt{3}\, \sin\phi   (24)
P^{*}_{11}(\sin\phi) = \sqrt{3}\, \cos\phi   (25)
P^{*}_{20}(\sin\phi) = \frac{\sqrt{5}}{2} \left( 3\sin^2\phi - 1 \right)   (26)
P^{*}_{21}(\sin\phi) = \sqrt{15}\, \sin\phi \cos\phi   (27)
P^{*}_{22}(\sin\phi) = \frac{\sqrt{15}}{2}\, \cos^2\phi   (28)
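The recurrences (20)-(22) together with the starting values (23)-(28) translate directly into code. A minimal sketch; the spot checks use the closed form P*₃₀(t) = (√7/2)(5t³ − 3t) and the identity cos²φ = (2/3)[P*₀₀ − P*₂₀/√5], both of which can be derived from the definitions above (they are stated here as checks, not quoted from the text):

```python
import math

def pbar(nmax, t):
    """Fully normalized associated Legendre functions P*_nm(t), t = sin(phi),
    built from the starting values (23)-(28) and the recurrences (20)-(22)."""
    c = math.sqrt(1.0 - t * t)  # cos(phi)
    P = {(0, 0): 1.0,
         (1, 0): math.sqrt(3.0) * t, (1, 1): math.sqrt(3.0) * c,
         (2, 0): 0.5 * math.sqrt(5.0) * (3.0 * t * t - 1.0),
         (2, 1): math.sqrt(15.0) * t * c,
         (2, 2): 0.5 * math.sqrt(15.0) * c * c}
    for n in range(3, nmax + 1):
        P[n, n] = math.sqrt((2.0 * n + 1.0) / (2.0 * n)) * c * P[n - 1, n - 1]        # (20)
        P[n, n - 1] = math.sqrt((2.0 * n + 1.0) / (2.0 * (n - 1))) * c * P[n - 1, n - 2]  # (21)
        for m in range(0, n - 1):                                                     # (22)
            anm = math.sqrt((4.0 * n * n - 1.0) / (n * n - m * m))
            bnm = math.sqrt((2.0 * n + 1.0) * (n + m - 1.0) * (n - m - 1.0)
                            / ((n * n - m * m) * (2.0 * n - 3.0)))
            P[n, m] = anm * t * P[n - 1, m] - bnm * P[n - 2, m]
    return P

t = 0.3
P = pbar(3, t)
# closed-form spot check: P*_30(t) = (sqrt(7)/2)(5 t^3 - 3 t)
print(P[3, 0] - 0.5 * math.sqrt(7.0) * (5.0 * t**3 - 3.0 * t))           # ~0
# identity (17) spot check: cos^2(phi) = (2/3)[P*_00 - P*_20 / sqrt(5)]
print((1.0 - t * t) - (2.0 / 3.0) * (P[0, 0] - P[2, 0] / math.sqrt(5.0)))  # ~0
```

Both differences vanish to machine precision, confirming the consistency of the recurrences with the starting values.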

If i is the imaginary root of minus unity, the associated Legendre functions of the second kind are defined by an integral relation of type

Q^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right) = i^{\,n+1}\, Q_{nm}\!\left(i\,\frac{u}{\varepsilon}\right)   (29)

Q_{nm}\!\left(i\,\frac{u}{\varepsilon}\right) = \frac{(-1)^m\, 2^n\, (n+m)!\, m!}{i^{\,n+1}\, (n-m)!\, (2m)!} \left( \frac{u^2}{\varepsilon^2} + 1 \right)^{m/2} \int_0^{\infty} \frac{\sinh^{2m}\tau \; d\tau}{\left( \frac{u}{\varepsilon} + \sqrt{\frac{u^2}{\varepsilon^2} + 1}\, \cosh\tau \right)^{n+m+1}}   (30)

with starting values for n = 0, 1, 2 and m = 0:

Q^{*}_{0}\!\left(\frac{u}{\varepsilon}\right) = \operatorname{arccot}\frac{u}{\varepsilon}   (31)

Q^{*}_{1}\!\left(\frac{u}{\varepsilon}\right) = 1 - \frac{u}{\varepsilon}\, \operatorname{arccot}\frac{u}{\varepsilon}   (32)

Q^{*}_{2}\!\left(\frac{u}{\varepsilon}\right) = \frac{1}{2} \left[ \left( 3\,\frac{u^2}{\varepsilon^2} + 1 \right) \operatorname{arccot}\frac{u}{\varepsilon} - 3\,\frac{u}{\varepsilon} \right]   (33)
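The starting values (31)-(33) can be cross-checked against the classical three-term recurrence for Legendre functions of the second kind, which under the substitution (29) becomes (n+1)Q*_{n+1}(x) = −(2n+1)x Q*_n(x) + n Q*_{n−1}(x), with x = u/ε. (This recurrence form is derived here from the substitution, not quoted from the text.) A minimal sketch:

```python
import math

# Starting values (31)-(33), x = u/eps; arccot(x) = atan2(1, x) for x > 0.
def q0(x):
    return math.atan2(1.0, x)

def q1(x):
    return 1.0 - x * math.atan2(1.0, x)

def q2(x):
    return 0.5 * ((3.0 * x * x + 1.0) * math.atan2(1.0, x) - 3.0 * x)

# Consistency check: with n = 1 the recurrence
#   (n+1) Q*_{n+1}(x) = -(2n+1) x Q*_n(x) + n Q*_{n-1}(x)
# must reproduce q2 from q0 and q1.
x = 1.7
lhs = 2.0 * q2(x)
rhs = -3.0 * x * q1(x) + q0(x)
print(lhs - rhs)  # ~0
```

Expanding the right-hand side reproduces (3x² + 1) arccot x − 3x exactly, so the difference vanishes identically.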

The reader may wonder about the ratio Q*_{n|m|}(u/ε) / Q*_{n|m|}(b/ε) of normalized associated Legendre functions of the second kind as they appear in the series expansion (14) of the gravitational potential field U(λ, φ, u) with respect


Corollary 2 (Local and global area element of the reference ellipsoid of revolution E^2_{a,b}). The local area element of the spheroid E^2_{a,b} is given by

dS = d\,\mathrm{area}(E^2_{a,b}) = \sqrt{g_{\lambda\lambda}\,g_{\varphi\varphi}}\;d\varphi\,d\lambda   (34)

dS = d\,\mathrm{area}(E^2_{a,b}) = a\,\sqrt{b^2 + \varepsilon^2\sin^2\varphi}\;\cos\varphi\;d\lambda\,d\varphi   (35)

while the global area element of the spheroid E^2_{a,b} amounts to

S = \mathrm{area}(E^2_{a,b}) = 4\pi a^2 \Big[\frac{1}{2} + \frac{b^2}{4a\varepsilon}\,\ln\frac{a+\varepsilon}{a-\varepsilon}\Big]   (36)
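The closed-form area (36) can be verified against a direct quadrature of the local area element (35); a minimal sketch with GRS80-like semi-axes (the numerical values are illustrative assumptions of this sketch):

```python
import math

a, b = 6378137.0, 6356752.3141          # illustrative semi-axes (GRS80-like)
eps = math.sqrt(a * a - b * b)          # linear eccentricity

# closed form (36)
S_closed = 4.0 * math.pi * a * a * (0.5 + b * b / (4.0 * a * eps)
                                    * math.log((a + eps) / (a - eps)))

# composite Simpson quadrature of dS = a sqrt(b^2 + eps^2 sin^2 phi) cos phi dphi dlambda
N = 20000                               # even number of subintervals
h = math.pi / N
total = 0.0
for k in range(N + 1):
    phi = -math.pi / 2.0 + k * h
    f = math.sqrt(b * b + eps * eps * math.sin(phi) ** 2) * math.cos(phi)
    w = 1.0 if k in (0, N) else (4.0 if k % 2 == 1 else 2.0)
    total += w * f
S_quad = 2.0 * math.pi * a * total * h / 3.0
```

Quadrature and closed form agree to better than one part in 10^9, which supports the bracket structure of (36).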

The proof of Corollary 2 follows from straightforward integration of dS.

Corollary 3 ("Weighted orthonormality", "weighted scalar product" with respect to the reference ellipsoid of revolution E^2_{a,b}). The base functions or eigenfunctions are orthonormal with respect to the weighted scalar product

\langle e_{pq}(\lambda,\varphi)\,|\,e_{nm}(\lambda,\varphi)\rangle_w := \frac{1}{S}\int_{E^2_{a,b}} dS\;w(\varphi)\,e_{pq}(\lambda,\varphi)\,e_{nm}(\lambda,\varphi) = \delta_{pn}\,\delta_{qm}   (37)

with the "quantum numbers" p, n = 0, 1, \ldots, \infty; q = -p, -p+1, \ldots, -1, 0, +1, \ldots, p-1, p; m = -n, -n+1, \ldots, -1, 0, +1, \ldots, n-1, n. The weighting function is defined by

w(\varphi) := \frac{a}{\sqrt{b^2 + \varepsilon^2\sin^2\varphi}}\,\Big[\frac{1}{2} + \frac{b^2}{4a\varepsilon}\,\ln\frac{a+\varepsilon}{a-\varepsilon}\Big]   (38)
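The weight (38) is built so that w(φ) times the surface measure factor a·sqrt(b² + ε² sin²φ) is the constant S/(4π); the weighted product (37) therefore reduces to the standard orthonormality of surface harmonics on the unit sphere. A quick numerical sketch of that constancy (variable names and sample latitudes are ours):

```python
import math

a, b = 6378137.0, 6356752.3141      # illustrative semi-axes
eps = math.sqrt(a * a - b * b)
bracket = 0.5 + b * b / (4.0 * a * eps) * math.log((a + eps) / (a - eps))
S = 4.0 * math.pi * a * a * bracket  # global area, Eq. (36)

def w(phi):
    """Weighting function (38)."""
    return a / math.sqrt(b * b + eps * eps * math.sin(phi) ** 2) * bracket

# w(phi) * a * sqrt(b^2 + eps^2 sin^2 phi) should equal S/(4 pi) for every phi
vals = [w(phi) * a * math.sqrt(b * b + eps * eps * math.sin(phi) ** 2)
        for phi in (-1.2, -0.3, 0.0, 0.7, 1.5)]
```

The product is latitude-independent to machine precision, so the weighted integral (37) collapses to (1/4π)∫∫ e_pq e_nm cosφ dφ dλ.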






The weighted scalar product has the reproducing property

\langle U(\lambda,\varphi,u=b)\,|\,e_{nm}(\lambda,\varphi)\rangle_w = \Big[4\pi a^2\Big(\frac{1}{2} + \frac{b^2}{4a\varepsilon}\ln\frac{a+\varepsilon}{a-\varepsilon}\Big)\Big]^{-1} \int_0^{2\pi} d\lambda \int_{-\pi/2}^{+\pi/2} d\varphi\;a\sqrt{b^2+\varepsilon^2\sin^2\varphi}\,\cos\varphi\;w(\varphi)\;U(\lambda,\varphi,u=b)\,e_{nm}(\lambda,\varphi)
= \sum_{p=0}^{\infty}\sum_{q=-p}^{+p} u_{pq}\,\langle e_{pq}(\lambda,\varphi)\,|\,e_{nm}(\lambda,\varphi)\rangle_w
= \sum_{p=0}^{\infty}\sum_{q=-p}^{+p} u_{pq}\,\delta_{pn}\,\delta_{qm} = u_{nm}   (39)

3

The Anharmonic Somigliana-Pizzetti Reference Gravity Field of a Rotating Earth, the World Geodetic Datum

Recent years have seen tremendous progress in the high-resolution determination of the space-time terrestrial gravity field, which demands an update of the key form parameters of the Somigliana-Pizzetti level ellipsoid, in particular the semi-major axis a and the semi-minor axis b, or equivalently the linear eccentricity \varepsilon := \sqrt{a^2 - b^2}, from best estimates of the fundamental geodetic parameters \{W_0, GM, J_2, \Omega\} approximating the physical surface of the planet Earth, namely the Gauss-Listing Geoid. These values build up the World Geodetic Datum. The idea of generating the gravity field of a rotating level ellipsoid according to Somigliana [59] and Pizzetti [53] is the following: given is the general representation of the gravity field in terms of surface spheroidal harmonics, namely normalized associated Legendre functions of the first and second kind. We use the representation of the gravity field in terms of spheroidal coordinates which is specified to a


particular ellipsoid of revolution E^2_{a,b} of semi-major axis a := \sqrt{b^2 + \varepsilon^2} and semi-minor axis b = u_0, which is at the same time a level ellipsoid of revolution gauged to the gravity potential value w_0 = W_0 of the Geoid. The solution of this problem in the eigenspace of surface spheroidal harmonics is presented in the following.

3.1

The Somigliana-Pizzetti Gravity of a Level Ellipsoid, Two Constraints

Lemma 2 (The gravity field of a level ellipsoid, gauged to the Geoid). If the spheroidal gravity potential field is specified to the level ellipsoid (i) u = b and (ii) w_0 = W_0 = const., the eigenfunctions of the three-dimensional Laplace partial differential equation are restricted according to

W(\lambda,\varphi,b) = \sum_{n=0}^{\infty}\sum_{m=-n}^{+n} u_{nm}\,e_{nm}(\lambda,\varphi) + \frac{1}{3}\,\omega^2 a^2\,e_{00} - \frac{1}{3\sqrt{5}}\,\omega^2 a^2\,e_{20}(\lambda,\varphi) = w_0   (40)

W(\lambda,\varphi,b) = \Big[u_{00} + \frac{1}{3}\,\omega^2 a^2\Big]\,e_{00} + \Big[u_{20} - \frac{1}{3\sqrt{5}}\,\omega^2 a^2\Big]\,e_{20}(\lambda,\varphi) + \sum_{n=1}^{\infty}\sum_{\substack{m=-n \\ (n,m)\neq(2,0)}}^{+n} u_{nm}\,e_{nm}(\lambda,\varphi) = w_0 = \mathrm{const.}   (41)

u_{00} + \frac{1}{3}\,\omega^2 a^2 = w_0   (42)

u_{20} - \frac{1}{3\sqrt{5}}\,\omega^2 a^2 = 0   (43)

u_{nm} = 0 \quad \forall\, n \ge 1,\; (n,m) \neq (2,0)   (44)

Since only the first term on the left-hand side is a constant, the constant w_0 of the level ellipsoid is balanced by (42). The terms of degree/order (2, 0), as well as (n, m) \neq (2, 0), n \ge 1, are not constant. Accordingly, by means of (43) and (44) they have to vanish. The proof of Lemma 2 is straightforward. In order to identify the terms of degree/order (0, 0) and (2, 0), u_{00} and u_{20} respectively, we introduce the Newton gravitational potential field in terms of spheroidal coordinates, namely its spheroidal harmonic expansion, generated by the mass density field \rho(\lambda',\varphi',u').
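The constraints (42)-(43) fix u00 and u20 once w0, ω and a are given; a small illustrative sketch with GRS80-like numbers (all values are assumptions for illustration only):

```python
import math

w0 = 62636860.850          # gauge value of the geoid potential, m^2/s^2 (illustrative)
omega = 7.292115e-5        # Earth rotation rate, rad/s
a = 6378137.0              # semi-major axis, m

u00 = w0 - omega ** 2 * a ** 2 / 3.0                 # from constraint (42)
u20 = omega ** 2 * a ** 2 / (3.0 * math.sqrt(5.0))   # from constraint (43)
```

The centrifugal contribution ω²a²/3 is only about 7.2 × 10^4 m²/s², three orders of magnitude below w0, but it decides the meter-level size of the level ellipsoid.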


Corollary 4 (Spheroidal harmonic expansion of the inverse distance function in the domain u' < u). If the inverse distance function, represented in Cartesian as well as in spheroidal coordinates,

\frac{1}{\|x-x'\|} = \big[(x-x')^2 + (y-y')^2 + (z-z')^2\big]^{-1/2}
= \Big[\big(\sqrt{u^2+\varepsilon^2}\cos\varphi\cos\lambda - \sqrt{u'^2+\varepsilon^2}\cos\varphi'\cos\lambda'\big)^2 + \big(\sqrt{u^2+\varepsilon^2}\cos\varphi\sin\lambda - \sqrt{u'^2+\varepsilon^2}\cos\varphi'\sin\lambda'\big)^2 + \big(u\sin\varphi - u'\sin\varphi'\big)^2\Big]^{-1/2}   (45)

is expanded in the domain u' < u into surface spheroidal harmonics, we receive

\frac{1}{\|x-x'\|} = \frac{1}{\varepsilon}\sum_{n=0}^{\infty}(2n+1)\Big[P_n(\sin\varphi)\,P_n(\sin\varphi')\,Q_n\big(i\tfrac{u}{\varepsilon}\big)\,P_n\big(i\tfrac{u'}{\varepsilon}\big) + 2\sum_{m=1}^{n}(-1)^m\frac{(n-m)!}{(n+m)!}\,P_{nm}(\sin\varphi)\,P_{nm}(\sin\varphi')\,Q_{nm}\big(i\tfrac{u}{\varepsilon}\big)\,P_{nm}\big(i\tfrac{u'}{\varepsilon}\big)\cos m(\lambda-\lambda')\Big]   (46)

subject to

e_{nm}(\lambda,\varphi) = P^*_{n|m|}(\sin\varphi)\cdot\begin{cases}\cos m\lambda & \forall\, m \ge 0\\ \sin|m|\lambda & \forall\, m < 0\end{cases}   (47)

P^*_n(\sin\varphi) = \sqrt{2n+1}\,P_n(\sin\varphi)   (48)

P^*_{nm}(\sin\varphi) = \sqrt{2(2n+1)\,\frac{(n-|m|)!}{(n+|m|)!}}\;P_{n|m|}(\sin\varphi)   (49)

P^*_{nm}\big(\tfrac{u}{\varepsilon}\big) = i^{-n}\,P_{nm}\big(i\tfrac{u}{\varepsilon}\big)   (50)

Q^*_{nm}\big(\tfrac{u}{\varepsilon}\big) = i^{\,n+1}\,Q_{nm}\big(i\tfrac{u}{\varepsilon}\big)   (51)

such that

\frac{1}{\|x-x'\|} = \frac{1}{\varepsilon}\sum_{n=0}^{\infty}\sum_{m=-n}^{+n}(-1)^m\,\frac{(n-|m|)!}{(n+|m|)!}\;\frac{1}{2}\;P^*_{n|m|}\big(\tfrac{u'}{\varepsilon}\big)\,Q^*_{n|m|}\big(\tfrac{u}{\varepsilon}\big)\,e_{nm}(\lambda',\varphi')\,e_{nm}(\lambda,\varphi)   (52)


holds. For the proof of Corollary 4 we refer to Neumann [51] or [34, pp. 424-430].

Lemma 3 (Spheroidal harmonic expansion of the Newton gravitational potential). With respect to the spheroidal harmonic expansion of the inverse distance function in the domain u' < u, the Newton gravitational potential field can be represented by

U(\lambda,\varphi,u) = G\int_0^{2\pi} d\lambda' \int_{-\pi/2}^{\pi/2} d\varphi'\,\cos\varphi' \int_0^{u'(\lambda',\varphi')} du'\,(u'^2+\varepsilon^2\sin^2\varphi')\;\frac{\rho(\lambda',\varphi',u')}{\|x(\lambda,\varphi,u)-x(\lambda',\varphi',u')\|}   (53)

U(\lambda,\varphi,u) = \frac{G}{\varepsilon}\int_0^{2\pi} d\lambda' \int_{-\pi/2}^{\pi/2} d\varphi'\,\cos\varphi' \int_0^{u'(\lambda',\varphi')} du'\,(u'^2+\varepsilon^2\sin^2\varphi')\,\rho(\lambda',\varphi',u') \sum_{n=0}^{\infty}\sum_{m=-n}^{+n}(-1)^m\,\frac{(n-|m|)!}{(n+|m|)!}\;P^*_{n|m|}\big(\tfrac{u'}{\varepsilon}\big)\,Q^*_{n|m|}\big(\tfrac{u}{\varepsilon}\big)\,e_{nm}(\lambda',\varphi')\,e_{nm}(\lambda,\varphi)   (54)

namely with uniform convergence of the spheroidal harmonic expansion of the inverse distance function, such that summation and integration can be interchanged:

U(\lambda,\varphi,u) = \frac{G}{\varepsilon}\sum_{n=0}^{\infty}\sum_{m=-n}^{+n}(-1)^m\,\frac{(n-|m|)!}{(n+|m|)!}\,\Big[\int_0^{2\pi} d\lambda' \int_{-\pi/2}^{\pi/2} d\varphi'\,\cos\varphi' \int_0^{u'(\lambda',\varphi')} du'\,(u'^2+\varepsilon^2\sin^2\varphi')\,\rho(\lambda',\varphi',u')\,P^*_{n|m|}\big(\tfrac{u'}{\varepsilon}\big)\,e_{nm}(\lambda',\varphi')\Big]\,Q^*_{n|m|}\big(\tfrac{u}{\varepsilon}\big)\,e_{nm}(\lambda,\varphi)   (55)

The spheroidal harmonic coefficients amount to

u_{nm} := \frac{G}{\varepsilon}\,(-1)^m\,\frac{(n-|m|)!}{(n+|m|)!}\;Q^*_{n|m|}\big(\tfrac{b}{\varepsilon}\big)\int_0^{2\pi} d\lambda' \int_{-\pi/2}^{\pi/2} d\varphi'\,\cos\varphi' \int_0^{u'(\lambda',\varphi')} du'\,(u'^2+\varepsilon^2\sin^2\varphi')\,\rho(\lambda',\varphi',u')\,P^*_{n|m|}\big(\tfrac{u'}{\varepsilon}\big)\,e_{nm}(\lambda',\varphi')
\quad \forall\;n = 0, 1, \ldots, \infty;\; m = -n, -n+1, \ldots, n-1, n   (56)


Here is a sketch of the proof. The Newton gravitational potential (53) contains four factors. (i) G denotes the Newton gravitational constant, also called the "coupling constant" between the inertial force and the gravitational force. (ii) The volume element in spheroidal coordinates is generated by \sqrt{\det G}, defined as the square-rooted determinant of the matrix G of the metric. In particular, g_{11} = g_{\lambda\lambda} = (u^2+\varepsilon^2)\cos^2\varphi, g_{22} = g_{\varphi\varphi} = u^2+\varepsilon^2\sin^2\varphi, g_{33} = g_{uu} = (u^2+\varepsilon^2\sin^2\varphi)/(u^2+\varepsilon^2), g_{kl} = 0 \forall k \neq l, k, l \in \{1, 2, 3\}, such that \sqrt{\det G} = \cos\varphi\,(u^2+\varepsilon^2\sin^2\varphi) holds. The volume of the terrestrial body is bounded by its surface, here represented by the function u'(\lambda,\varphi). Accordingly, the integration over the third spheroidal coordinate u' extends from zero to u'(\lambda,\varphi). (iii) \rho(\lambda,\varphi,u) denotes the mass density field, expressed in terms of the spheroidal coordinates \{\lambda,\varphi,u\}. (iv) 1/\|x(\lambda,\varphi,u)-x(\lambda',\varphi',u')\| denotes the inverse of the Euclidean distance between the points x(\lambda,\varphi,u) and x(\lambda',\varphi',u'). Next, (54) is generated by implementing the inverse distance function as the spheroidal harmonic expansion with respect to the domain u' < u from (52) into the kernel of the Newton gravitational potential integral. In the external domain u' < u the series expansion of the Newton kernel is uniformly convergent, a prerequisite in order to interchange integration and summation. Accordingly, Eq. (55) is generated by this operation. Finally, if we compare the external solution of the three-dimensional Laplace partial differential equation in terms of spheroidal coordinates, (8), and the spheroidal harmonic expansion of the Newton integral, we are led to the spheroidal harmonic coefficients (56).

These spheroidal harmonic coefficients, also called spheroidal multipoles, are now specified for degree/order (0, 0) and (2, 0), respectively, since they appear in the two constraints (42), (43) of the Somigliana-Pizzetti gravity field.

Corollary 5 (Spheroidal multipoles of degree/order (0, 0) and (2, 0), respectively). By means of (20)-(28) and (29)-(33), the spheroidal harmonic coefficients/spheroidal multipoles u_{00} and u_{20} are represented by

u_{00} = \frac{G}{\varepsilon}\,Q^*_{00}\big(\tfrac{b}{\varepsilon}\big)\int_0^{2\pi} d\lambda' \int_{-\pi/2}^{\pi/2} d\varphi'\,\cos\varphi' \int_0^{u'(\lambda',\varphi')} du'\,(u'^2+\varepsilon^2\sin^2\varphi')\,\rho(\lambda',\varphi',u')\,P^*_{00}\big(\tfrac{u'}{\varepsilon}\big)\,e_{00}(\lambda',\varphi')   (57)


u_{20} = \frac{G}{\varepsilon}\,Q^*_{20}\big(\tfrac{b}{\varepsilon}\big)\int_0^{2\pi} d\lambda' \int_{-\pi/2}^{\pi/2} d\varphi'\,\cos\varphi' \int_0^{u'(\lambda',\varphi')} du'\,(u'^2+\varepsilon^2\sin^2\varphi')\,\rho(\lambda',\varphi',u')\,P^*_{20}\big(\tfrac{u'}{\varepsilon}\big)\,e_{20}(\lambda',\varphi')   (58)

The proof of Lemma 3 is straightforward. In order to make the terms of degree/order (0, 0) and (2, 0), u_{00} and u_{20} respectively, explicit, we evaluate the spheroidal multipoles generated by the mass density field \rho(\lambda',\varphi',u').

Corollary 6 (Explicit representation of the spheroidal multipoles u_{00} and u_{20}).

u_{00} = \frac{G}{\varepsilon}\,\mathrm{arccot}\,\frac{b}{\varepsilon}\int_0^{2\pi} d\lambda' \int_{-\pi/2}^{\pi/2} d\varphi'\,\cos\varphi' \int_0^{u'(\lambda',\varphi')} du'\,(u'^2+\varepsilon^2\sin^2\varphi')\,\rho(\lambda',\varphi',u')\,P^*_{00}\big(\tfrac{u'}{\varepsilon}\big)\,e_{00}(\lambda',\varphi')   (59)

or

u_{00} = \frac{GM}{\varepsilon}\,\mathrm{arccot}\,\frac{b}{\varepsilon}   (60)

u_{20} = \frac{G}{\varepsilon}\,\frac{1}{2}\Big[\Big(3\,\frac{b^2}{\varepsilon^2}+1\Big)\,\mathrm{arccot}\,\frac{b}{\varepsilon} - 3\,\frac{b}{\varepsilon}\Big]\int_0^{2\pi} d\lambda' \int_{-\pi/2}^{\pi/2} d\varphi'\,\cos\varphi' \int_0^{u'(\lambda',\varphi')} du'\,(u'^2+\varepsilon^2\sin^2\varphi')\,\rho(\lambda',\varphi',u')\;\frac{1}{2}\Big(3\,\frac{u'^2}{\varepsilon^2}+1\Big)\,\frac{\sqrt{5}}{2}\,(3\sin^2\varphi'-1)   (61)

u_{20} = \frac{\sqrt{5}\,G}{8\,\varepsilon^3}\Big[\Big(3\,\frac{b^2}{\varepsilon^2}+1\Big)\,\mathrm{arccot}\,\frac{b}{\varepsilon} - 3\,\frac{b}{\varepsilon}\Big]\int_0^{2\pi} d\lambda' \int_{-\pi/2}^{\pi/2} d\varphi'\,\cos\varphi' \int_0^{u'(\lambda',\varphi')} du'\,(u'^2+\varepsilon^2\sin^2\varphi')\,\rho(\lambda',\varphi',u')\,\big[9u'^2\sin^2\varphi' + 3\varepsilon^2\sin^2\varphi' - 3u'^2 - \varepsilon^2\big]   (62)

u_{20} = \frac{\sqrt{5}\,G}{8\,\varepsilon^3}\Big[\Big(3\,\frac{b^2}{\varepsilon^2}+1\Big)\,\mathrm{arccot}\,\frac{b}{\varepsilon} - 3\,\frac{b}{\varepsilon}\Big]\int_{x'_1}^{x'_2} dx' \int_{y'_1}^{y'_2} dy' \int_{z'_1(x',y')}^{z'_2(x',y')} dz'\,\rho(\lambda',\varphi',u')\,\big[-3(x'^2+y'^2) + 6z'^2 + 2\varepsilon^2\big]   (63)

u_{20} = \frac{\sqrt{5}\,G}{8\,\varepsilon^3}\Big[\Big(3\,\frac{b^2}{\varepsilon^2}+1\Big)\,\mathrm{arccot}\,\frac{b}{\varepsilon} - 3\,\frac{b}{\varepsilon}\Big]\Big\{6\Big[\frac{l^{11}+l^{22}}{2} - l^{33}\Big] + 2M\varepsilon^2\Big\}   (64)

subject to

M := \int_0^{2\pi} d\lambda' \int_{-\pi/2}^{\pi/2} d\varphi'\,\cos\varphi' \int_0^{u'(\lambda',\varphi')} du'\,(u'^2+\varepsilon^2\sin^2\varphi')\,\rho(\lambda',\varphi',u') = \int_{x'_1}^{x'_2} dx' \int_{y'_1}^{y'_2} dy' \int_{z'_1(x',y')}^{z'_2(x',y')} dz'\,\rho(\lambda',\varphi',u')   (65)

l^{11} = \int_{x'_1}^{x'_2} dx' \int_{y'_1}^{y'_2} dy' \int_{z'_1(x',y')}^{z'_2(x',y')} dz'\,\rho(\lambda',\varphi',u')\,(y'^2+z'^2)   (66)

l^{22} = \int_{x'_1}^{x'_2} dx' \int_{y'_1}^{y'_2} dy' \int_{z'_1(x',y')}^{z'_2(x',y')} dz'\,\rho(\lambda',\varphi',u')\,(z'^2+x'^2)   (67)

l^{33} = \int_{x'_1}^{x'_2} dx' \int_{y'_1}^{y'_2} dy' \int_{z'_1(x',y')}^{z'_2(x',y')} dz'\,\rho(\lambda',\varphi',u')\,(x'^2+y'^2)   (68)

9u'^2\sin^2\varphi' + 3\varepsilon^2\sin^2\varphi' - 3u'^2 - \varepsilon^2 = -3(x'^2+y'^2) + 6z'^2 + 2\varepsilon^2   (69)
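Identity (69) between the spheroidal and the Cartesian kernel can be checked pointwise with z' = u' sin φ' and x'² + y'² = (u'² + ε²) cos²φ'; a minimal sketch (function name and sample points are ours):

```python
import math

def kernels(u, phi, eps):
    """Return the spheroidal and the Cartesian form of kernel (69)."""
    s = math.sin(phi)
    lhs = 9.0 * u * u * s * s + 3.0 * eps * eps * s * s - 3.0 * u * u - eps * eps
    z2 = (u * s) ** 2                                   # z'^2
    x2y2 = (u * u + eps * eps) * math.cos(phi) ** 2     # x'^2 + y'^2
    rhs = -3.0 * x2y2 + 6.0 * z2 + 2.0 * eps * eps
    return lhs, rhs

pairs = [kernels(u, phi, 0.52)
         for u in (0.3, 1.0, 2.5) for phi in (-1.1, 0.2, 0.9)]
```

Both forms agree identically, which is exactly the transformation step that carries (62) into (63).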

as the moment of zero order, called the mass of the body, and the moments of second order with respect to Cartesian coordinates. The identity (69) generates the transformation of the spheroidal kernel of u_{20} into its Cartesian kernel. The proof of Corollary 6 is straightforward: if we substitute into the right-hand side of (62) the relations u'^2\sin^2\varphi' = z'^2, x'^2+y'^2 = (u'^2+\varepsilon^2)\cos^2\varphi' = u'^2+\varepsilon^2-(u'^2+\varepsilon^2)\sin^2\varphi' and \varepsilon^2\sin^2\varphi' = u'^2+\varepsilon^2-u'^2\sin^2\varphi'-(u'^2+\varepsilon^2)\cos^2\varphi', we arrive at the right-hand side of (69). In summarizing, we have succeeded to represent the spheroidal harmonic coefficients (u_{00}, u_{20}), as they appear in the Somigliana-Pizzetti gravity potential field, in terms of the mass of the terrestrial body and the spheroidal/Cartesian mass moments of second order. Why did we bother you with the Cartesian moment representation of the spheroidal harmonic coefficient of second order, u_{20}? The reason is to finally bridge the gap to the conventional spherical harmonic expansion of the terrestrial gravitational potential.


3.2


Best Estimates of the Form Parameters {a, b} of a Somigliana-Pizzetti Related Level Ellipsoid

By means of (42), (43) we shall establish the conditional equations of the spheroidal harmonic coefficients of degree/order (0, 0) and (2, 0) in case of a level ellipsoid of the Somigliana-Pizzetti gravity field. As soon as we take advantage of the transformation of spheroidal harmonic coefficients into spherical harmonic coefficients, we arrive at (42), (43) as the conditional equations, now in terms of the spherical harmonic coefficients GM and J_2, namely for the conventional datum R = a as well as w_0 = W_0 and \omega = \Omega. These final conditional equations (42), (43) nonlinearly relate the six parameters of the Somigliana-Pizzetti gravity field \{a, b, W_0, GM, J_2^*, \Omega\}. Given the four parameters \{W_0, GM, J_2^*, \Omega\} = \{y_1, y_2, y_3, y_4\}, called pseudo-observations, the conditional equations are linearized with respect to the semi-major axis a and the semi-minor axis b of the level ellipsoid, \{a, b\} = \{x_1, x_2\}, namely

f_1(x_1,x_2,y_1,y_2,y_3,y_4) = \frac{y_2}{\sqrt{x_1^2-x_2^2}}\,\mathrm{arccot}\,\frac{x_2}{\sqrt{x_1^2-x_2^2}} + \frac{1}{3}\,y_4^2\,x_1^2 - y_1 = 0   (70)

f_2(x_1,x_2,y_1,y_2,y_3,y_4) = \frac{y_2}{4\,(x_1^2-x_2^2)^{3/2}}\,\Big[\sqrt{5}\,(x_1^2-x_2^2) + 15\,x_1^2\,y_3\Big]\,\Big[\Big(\frac{3x_2^2}{x_1^2-x_2^2}+1\Big)\,\mathrm{arccot}\,\frac{x_2}{\sqrt{x_1^2-x_2^2}} - \frac{3x_2}{\sqrt{x_1^2-x_2^2}}\Big] - \frac{1}{3\sqrt{5}}\,y_4^2\,x_1^2 = 0   (71)

A Taylor expansion of f(x, y) is

f(x,y) = f(x_0,y) + \frac{1}{1!}\,f'(x_0)(x-x_0) + \frac{1}{2!}\,f''(x_0)(x-x_0)\otimes(x-x_0) + O^3\big((x-x_0)\otimes(x-x_0)\otimes(x-x_0)\big)
= f_0 + J_0\,(x-x_0) + \frac{1}{2}\,H_0\,(x-x_0)\otimes(x-x_0) + O^3   (72)

with respect to the Jacobi matrix J_0 of first-order partial derivatives at the approximation point x_0 as well as the Hesse matrix H_0 = [vec H_1; vec H_2] of second-order partial derivatives at the approximation point x_0. H_1 denotes the Hesse matrix of f_1, while H_2 denotes the Hesse matrix of f_2. The linearized Taylor expansion reads (for details we refer to Saupe [57])


\Delta f := f(x,y) - f(x_0,y) = J(x_0)\,(x-x_0) = J_0\,\Delta x   (73)

subject to

J := \begin{pmatrix} \partial f_1/\partial x_1 & \partial f_1/\partial x_2 \\ \partial f_2/\partial x_1 & \partial f_2/\partial x_2 \end{pmatrix}   (74)

where the partials of the Jacobi matrix are collected in Table 2, such that

x - x_0 = \Delta x = J_0^{-1}\,(f - f_0) = \big(J(x_0)\big)^{-1}\,(f - f_0)   (75)

holds. Newton iteration updates are generated by the n-sequence

x - x_0 = J_0^{-1}(f - f_0) \;\Rightarrow\; x_1 = x_0 + J_0^{-1}(f - f_0)   (76)

\Rightarrow\; x_2 = x_1 + J_1^{-1}(f - f_1)   (77)

\Rightarrow\;\cdots\;\Rightarrow\; x_n = x_{n-1} + J_{n-1}^{-1}(f - f_{n-1})   (78)
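The Newton sequence (76)-(78) applied to the conditional equations (70)-(71) can be sketched as follows. For brevity we use a finite-difference Jacobian instead of the analytical Table 2, and GRS80-consistent pseudo-observations {W0, GM, J2*, Ω} as an illustrative test case; all numerical values, the functions `f` and `newton`, and the starting point are assumptions of this sketch, not the author's computation.

```python
import math

def f(x, y):
    """Conditional equations (70)-(71): x = (a, b), y = (W0, GM, J2*, Omega)."""
    a, b = x
    W0, GM, J2s, Om = y
    eps = math.sqrt(a * a - b * b)
    arc = math.atan(eps / b)                      # arccot(b/eps) for b, eps > 0
    f1 = GM / eps * arc + Om * Om * a * a / 3.0 - W0
    B = (3.0 * b * b / (eps * eps) + 1.0) * arc - 3.0 * b / eps
    f2 = (GM / (4.0 * eps ** 3)
          * (math.sqrt(5.0) * eps * eps + 15.0 * a * a * J2s) * B
          - Om * Om * a * a / (3.0 * math.sqrt(5.0)))
    return (f1, f2)

def newton(x0, y, steps=8, h=1e-3):
    """Newton iteration (76)-(78) with a finite-difference Jacobian."""
    x = list(x0)
    for _ in range(steps):
        F = f(x, y)
        J = [[(f([x[0] + (j == 0) * h, x[1] + (j == 1) * h], y)[i] - F[i]) / h
              for j in range(2)] for i in range(2)]
        det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
        x[0] -= ( J[1][1] * F[0] - J[0][1] * F[1]) / det
        x[1] -= (-J[1][0] * F[0] + J[0][0] * F[1]) / det
    return x

# GRS80-consistent pseudo-observations: W0 = U0(GRS80), GM, C20bar = -J2/sqrt(5), Omega
y = (62636860.850, 3.986005e14, -1.08263e-3 / math.sqrt(5.0), 7.292115e-5)
a_hat, b_hat = newton([6378000.0, 6356600.0], y)
```

Because GRS80 derives its U0 and b from exactly these four constants, the iteration should recover the GRS80 semi-axes a = 6,378,137.000 m and b = 6,356,752.3141 m, which is a convenient closure test of (70)-(71).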

The sequence stops at the reproducing point ("fixed point") x_n = x_{n-1}; here we needed n = 1 Newton iteration step. Next we implement the error propagation from the pseudo-observations \{y_1, y_2, y_3, y_4\} to the derived parameters \{x_1, x_2\}, namely characterized by the first moments, the expectations E\{x\} = \xi and E\{y\} = \eta, as well as by the second moments, the variance-covariance matrices/dispersion matrices D\{x\} = \Sigma_x and D\{y\} = \Sigma_y. Up to nonlinear terms we derive

D\{x\} = \Sigma_x = J_{\xi_0}^{-1}\,J_{\eta_0}\,D\{y\}\,J_{\eta_0}'\,(J_{\xi_0}^{-1})' = J_x^{-1}\,J_y\,\Sigma_y\,J_y'\,(J_x^{-1})'   (79)

where J_{\xi_0} and J_{\eta_0}, respectively, represent the Jacobi matrices of partial derivatives of the function f(x, y) with respect to x and y, respectively, at the evaluation point (\xi_0, \eta_0). J_x \sim J_{\xi_0} is given by Table 2, J_y \sim J_{\eta_0} is given by Table 3.

J_y = \begin{pmatrix} \partial f_1/\partial y_1 & \partial f_1/\partial y_2 & \partial f_1/\partial y_3 & \partial f_1/\partial y_4 \\ \partial f_2/\partial y_1 & \partial f_2/\partial y_2 & \partial f_2/\partial y_3 & \partial f_2/\partial y_4 \end{pmatrix}(\xi_0, \eta_0)   (80)


Table 2 Jacobian matrix of the two constraints of the Somigliana-Pizzetti gravity field: partial derivatives with respect to the unknowns (x_1, x_2). Abbreviations: \varepsilon := \sqrt{x_1^2-x_2^2}, A := [\sqrt{5}\,\varepsilon^2 + 15x_1^2 y_3]/\varepsilon^3, B := (1 + 3x_2^2/\varepsilon^2)\,\mathrm{arccot}(x_2/\varepsilon) - 3x_2/\varepsilon

\partial f_1/\partial x_1 = \frac{x_2 y_2}{x_1\,\varepsilon^2} - \frac{x_1 y_2}{\varepsilon^3}\,\mathrm{arccot}\,\frac{x_2}{\varepsilon} + \frac{2}{3}\,x_1 y_4^2

\partial f_1/\partial x_2 = -\frac{y_2}{\varepsilon^2} + \frac{x_2 y_2}{\varepsilon^3}\,\mathrm{arccot}\,\frac{x_2}{\varepsilon}

\partial f_2/\partial x_1 = \frac{y_2}{4}\Big[\Big(-\frac{\sqrt{5}\,x_1}{\varepsilon^3} + \frac{30\,x_1 y_3}{\varepsilon^3} - \frac{45\,x_1^3 y_3}{\varepsilon^5}\Big)B + A\Big(-\frac{6\,x_1 x_2^2}{\varepsilon^4}\,\mathrm{arccot}\,\frac{x_2}{\varepsilon} + \Big(1+\frac{3x_2^2}{\varepsilon^2}\Big)\frac{x_2}{x_1\varepsilon} + \frac{3\,x_1 x_2}{\varepsilon^3}\Big)\Big] - \frac{2}{3\sqrt{5}}\,x_1 y_4^2

\partial f_2/\partial x_2 = \frac{y_2}{4}\Big[\Big(\frac{\sqrt{5}\,x_2}{\varepsilon^3} + \frac{45\,x_1^2 x_2 y_3}{\varepsilon^5}\Big)B + A\Big(\frac{6\,x_1^2 x_2}{\varepsilon^4}\,\mathrm{arccot}\,\frac{x_2}{\varepsilon} - \Big(1+\frac{3x_2^2}{\varepsilon^2}\Big)\frac{1}{\varepsilon} - \frac{3\,x_1^2}{\varepsilon^3}\Big)\Big]

Table 3 Jacobi matrix of the two constraints of the Somigliana-Pizzetti gravity field: partial derivatives with respect to the pseudo-observations (y_1, y_2, y_3, y_4). Abbreviations: \varepsilon := \sqrt{x_1^2-x_2^2}, B := (1 + 3x_2^2/\varepsilon^2)\,\mathrm{arccot}(x_2/\varepsilon) - 3x_2/\varepsilon

\partial f_1/\partial y_1 = -1

\partial f_1/\partial y_2 = \frac{1}{\varepsilon}\,\mathrm{arccot}\,\frac{x_2}{\varepsilon}

\partial f_1/\partial y_3 = 0

\partial f_1/\partial y_4 = \frac{2}{3}\,y_4\,x_1^2

\partial f_2/\partial y_1 = 0

\partial f_2/\partial y_2 = \frac{\sqrt{5}\,\varepsilon^2 + 15\,x_1^2 y_3}{4\,\varepsilon^3}\;B

\partial f_2/\partial y_3 = \frac{15\,x_1^2\,y_2}{4\,\varepsilon^3}\;B

\partial f_2/\partial y_4 = -\frac{2}{3\sqrt{5}}\,x_1^2\,y_4

\Sigma_x = D\{x\} = \begin{pmatrix}\sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2\end{pmatrix}, \qquad \Sigma_y = D\{y\} = \mathrm{Diag}\big(\sigma_1^2, \sigma_2^2, \sigma_3^2, \sigma_4^2\big)   (81)

Indeed, for the error propagation in linearized form, namely (79), we have assumed that the variance-covariance matrix/dispersion matrix D\{y\} of the


Table 4 Best estimates of the pseudo-observations {W0, GM, J2*, Ω}

| Reference | Parameter | Value | Source |
|---|---|---|---|
| Burša et al. [10] | W0 | (62,636,855.72 ± 0.5) m²/s² | Satellite altimetric data, gauge station values |
| Burša et al. [11] | W0 | (62,636,855.80 ± 0.5) m²/s² | Satellite altimetric data, gauge station values |
| Grafarend and Ardalan [27] | W0 | (62,636,855.80 ± 3.6) m²/s² | Gauge station data, GPS, Baltic Sea Level Project |
| Ries et al. [56] | GM | (398,600.4418 ± 0.0008) km³/s² | Satellite Laser Ranging, satellite altimetric data; in SI units, including the mass of the Earth's atmosphere |
| Ries et al. [56] | GM | (398,600.4415 ± 0.0008) km³/s² | Satellite Laser Ranging, satellite altimetric data; in TDT units, including the mass of the Earth's atmosphere |
| Tapley et al. [62] | J2* | −4.8416954845647 × 10⁻⁴ ± 4.66 × 10⁻¹¹ | JGM3 geopotential model; in zero-frequency tide system |
| Lemoine et al. [40] | J2* | −4.84165371736 × 10⁻⁴ ± 3.56 × 10⁻¹¹ | EGM96 geopotential model; in tide-free system |
| Rapp et al. [55] | J2* | −4.84165532804 × 10⁻⁴ ± 4.47 × 10⁻¹¹ | OSU91A geopotential model; in tide-free system |
| Groten [28] | Ω | (7.292115 × 10⁻⁵ ± 10⁻¹²) rad/s | |

pseudo-observations \{W_0, GM, J_2^*, \Omega\} = \{y_1, y_2, y_3, y_4\} contains no covariances between the pseudo-observations; since these are unknown, they have been neglected. In the following, based upon current best estimates of the four fundamental parameters \{W_0, GM, J_2^*, \Omega\}, we obtain numerical values of the form parameters \{a, b\} of E^2_{a,b} of the level ellipsoid of Somigliana-Pizzetti type by Newton iteration of the two linearized conditional equations, in particular dependent on the way the indirect permanent tide is implemented. Table 4 refers to the data \{W_0, GM, J_2^*, \Omega\} to be applied within our computation. For instance, the zero-frequency-tide reference system is contained in the spherical harmonic coefficient J_2^*, while in the tide-free reference system J_2^* is reduced for the effect of the indirect tide. Note that in both cases J_2^* is free of direct tidal effects. Our calculation of the two form parameters \{a, b\} of E^2_{a,b} of the level ellipsoid of Somigliana-Pizzetti type in


the zero-frequency-tide reference system is collected in Table 5, while Table 6 is devoted to the results \{a, b\} in the tide-free reference system. In both reference systems, different values of W_0 from Burša et al. [10], Burša et al. [11] as well as Grafarend and Ardalan [27], and of GM from Ries et al. [56] in SI and TDT units, are tested. The effect of the various J_2^* values from JGM3, EGM96 and OSU91A on the form parameters \{a, b\} of E^2_{a,b} of the level ellipsoid of Somigliana-Pizzetti type is evaluated. In addition, based upon error propagation, the variance-covariance matrix of \{a, b\} with respect to the best estimates of \{W_0, GM, J_2^*, \Omega\} is outlined in Table 7. From Tables 5 and 6 the following conclusions can be made: The change of the GM value from TDT to SI units produces a millimeter variation in the form parameters \{a, b\} of E^2_{a,b}, which is below the level of the calculated root-mean-square error values. J_2^* data from the geopotential models JGM3, EGM96 and OSU91A result in almost the same values of the form parameters \{a, b\} of E^2_{a,b}. The root-mean-square errors of the form parameters \{a, b\} of E^2_{a,b} are directly proportional to the root-mean-square error of the gauge value of the potential W_0, the ruling parameter for the variance budget. Indeed, we have documented that a tidal potential which is constant in time in its indirect effect, with a properly chosen secular Love number ("zero-frequency tidal reference system"), flattens the reference ellipsoid as a level ellipsoid of Somigliana-Pizzetti type. The following contains the final data set of the form parameters \{a, b\} of the level ellipsoid of revolution of Somigliana-Pizzetti type (tide-free as well as zero-frequency-tide reference). There are six parameters, namely the four fundamental geodetic parameters \{W_0, GM, J_2^*, \Omega\} and two form parameters \{a, b\} or \{a, \varepsilon\}, which determine the ellipsoidal reference gravity field of type Somigliana-Pizzetti, constrained to two nonlinear conditional equations.
Their iterative solution leads to the best estimates a = (6,378,136.572 ± 0.053) m, b = (6,356,751.920 ± 0.052) m, ε = (521,853.580 ± 0.013) m for the tide-free Geoid of reference, and a = (6,378,136.602 ± 0.053) m, b = (6,356,751.860 ± 0.052) m, ε = (521,853.674 ± 0.015) m for the zero-frequency-tide Geoid of reference. The best estimates of the form parameters {a, b} of a Somigliana-Pizzetti level ellipsoid differ significantly, by −0.398 m and −0.454 m respectively, from the data of the Standard Geodetic Reference System 1980; see Ardalan et al. [2]. The axially symmetric gravity field is also the subject of the solution of Einstein's field equations for rotating and gravitating massive bodies, finally solved by Roy Kerr (1963) for a stationary axially symmetric metric. See more details in Hehl and Obukhov [30], Ansorg et al. [1], Kopeikin [38], Kopeikin et al. [39] as well as Backus [4] and Ballani et al. [5].

Table 5 Best estimates of the form parameters {a, b} of the Somigliana-Pizzetti level ellipsoid in the zero-frequency tide system, compared to Moritz [48]

| a (m) | b (m) | a − a_GRS80 (m) | b − b_GRS80 (m) | W0 (m²/s²) | GM (km³/s²) | J2* |
|---|---|---|---|---|---|---|
| 6,378,136.610 ± 0.053 | 6,356,751.868 ± 0.052 | −0.390 | −0.446 | 62,636,855.72^a ± 0.5 | 398,600.4418^d ± 0.0008 | −4.8416954845647 × 10⁻⁴^f ± 4.66 × 10⁻¹¹ |
| 6,378,136.605 ± 0.053 | 6,356,751.863 ± 0.052 | −0.395 | −0.451 | 62,636,855.72^a ± 0.5 | 398,600.4415^e ± 0.0008 | −4.8416954845647 × 10⁻⁴^f ± 4.66 × 10⁻¹¹ |
| 6,378,136.602 ± 0.053 | 6,356,751.860 ± 0.052 | −0.398 | −0.454 | 62,636,855.80^b ± 0.5 | 398,600.4418^d ± 0.0008 | −4.8416954845647 × 10⁻⁴^f ± 4.66 × 10⁻¹¹ |
| 6,378,136.602 ± 0.369 | 6,356,751.860 ± 0.366 | −0.398 | −0.454 | 62,636,855.80^c ± 3.6 | 398,600.4418^d ± 0.0008 | −4.8416954845647 × 10⁻⁴^f ± 4.66 × 10⁻¹¹ |

^a Burša et al. [10]; ^b Burša et al. [11]; ^c Grafarend and Ardalan [27]; ^d Ries et al. [56] (in SI units); ^e Ries et al. [56] (in TDT units); ^f Tapley et al. [62]

Table 6 Best estimates of the form parameters {a, b} of the Somigliana-Pizzetti level ellipsoid in the tide-free system, compared to H. Moritz [48]

| a (m) | b (m) | a − a_GRS80 (m) | b − b_GRS80 (m) | W0 (m²/s²) | GM (km³/s²) | J2* |
|---|---|---|---|---|---|---|
| 6,378,136.580 ± 0.053 | 6,356,751.928 ± 0.052 | −0.420 | −0.386 | 62,636,855.72^a ± 0.5 | 398,600.4418^d ± 0.0008 | −4.8416537 × 10⁻⁴^f ± 3.561 × 10⁻¹¹ |
| 6,378,136.575 ± 0.053 | 6,356,751.923 ± 0.052 | −0.425 | −0.391 | 62,636,855.72^a ± 0.5 | 398,600.4415^e ± 0.0008 | −4.8416537 × 10⁻⁴^f ± 3.561 × 10⁻¹¹ |
| 6,378,136.572 ± 0.053 | 6,356,751.920 ± 0.052 | −0.428 | −0.394 | 62,636,855.80^b ± 0.5 | 398,600.4418^d ± 0.0008 | −4.8416537 × 10⁻⁴^f ± 3.561 × 10⁻¹¹ |
| 6,378,136.572 ± 0.369 | 6,356,751.920 ± 0.366 | −0.428 | −0.394 | 62,636,855.80^c ± 3.6 | 398,600.4418^d ± 0.0008 | −4.8416537 × 10⁻⁴^f ± 3.561 × 10⁻¹¹ |
| 6,378,136.581 ± 0.053 | 6,356,751.926 ± 0.052 | −0.419 | −0.388 | 62,636,855.72^a ± 0.5 | 398,600.4418^d ± 0.0008 | −4.8416553 × 10⁻⁴^g ± 4.472 × 10⁻¹¹ |
| 6,378,136.573 ± 0.369 | 6,356,751.917 ± 0.366 | −0.427 | −0.397 | 62,636,855.80^c ± 3.6 | 398,600.4418^d ± 0.0008 | −4.8416553 × 10⁻⁴^g ± 4.472 × 10⁻¹¹ |

^a Burša et al. [10]; ^b Burša et al. [11]; ^c Grafarend and Ardalan [27]; ^d Ries et al. [56] (in SI units); ^e Ries et al. [56] (in TDT units); ^f Tapley et al. [62]; ^g Rapp et al. [55]. Ω = 7.292115 × 10⁻⁵ ± 10⁻¹² rad/s [28].



Table 7 Variance-covariance matrix of the optimal {a, b} of E²_{a,b} of Somigliana-Pizzetti type, via error propagation from the variances of the pseudo-observations {W0, GM, J2*, Ω}

| Quantity | Value |
|---|---|
| σ_W0 | 0.5 m²/s² |
| σ_GM | 0.0008 km³/s² |
| σ_J2* | 4.5 × 10⁻¹¹ |
| σ_Ω | 10⁻¹² rad/s |
| σ̂_a | 0.053 m |
| σ̂_b | 0.052 m |
| σ̂_ab | 2.738 × 10⁻³ m² |
| ρ = σ̂_ab/(σ̂_a σ̂_b) | 0.99984 (99.98 %) |
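The propagation (79) behind Table 7 can be reproduced with finite-difference Jacobians of (70)-(71). The sketch below re-evaluates everything at a GRS80-consistent point, so the resulting sigmas only approximate the tabulated ones; all step sizes and numerical values are assumptions of this sketch.

```python
import math

def f(x, y):
    """Conditional equations (70)-(71)."""
    a, b = x
    W0, GM, J2s, Om = y
    eps = math.sqrt(a * a - b * b)
    arc = math.atan(eps / b)
    f1 = GM / eps * arc + Om * Om * a * a / 3.0 - W0
    B = (3.0 * b * b / (eps * eps) + 1.0) * arc - 3.0 * b / eps
    f2 = (GM / (4.0 * eps ** 3)
          * (math.sqrt(5.0) * eps * eps + 15.0 * a * a * J2s) * B
          - Om * Om * a * a / (3.0 * math.sqrt(5.0)))
    return (f1, f2)

x0 = [6378137.0, 6356752.3141]                      # evaluation point (a, b)
y0 = [62636860.850, 3.986005e14, -1.08263e-3 / math.sqrt(5.0), 7.292115e-5]
sy = [0.5, 8.0e5, 4.5e-11, 1.0e-12]                 # sigmas of (W0, GM, J2*, Omega)

F0 = f(x0, y0)
hx = [1e-3, 1e-3]
Jx = [[(f([x0[0] + (j == 0) * hx[j], x0[1] + (j == 1) * hx[j]], y0)[i] - F0[i]) / hx[j]
       for j in range(2)] for i in range(2)]
hy = [1e-3, 1e6, 1e-12, 1e-13]
Jy = [[0.0] * 4 for _ in range(2)]
for j in range(4):
    yp = list(y0); yp[j] += hy[j]
    Fp = f(x0, yp)
    for i in range(2):
        Jy[i][j] = (Fp[i] - F0[i]) / hy[j]

# Sigma_x = Jx^{-1} Jy Sigma_y Jy' (Jx^{-1})'  (Eq. 79); the sign of dx = -Jx^{-1} Jy dy
# does not affect the covariance.
det = Jx[0][0] * Jx[1][1] - Jx[0][1] * Jx[1][0]
Jxi = [[ Jx[1][1] / det, -Jx[0][1] / det],
       [-Jx[1][0] / det,  Jx[0][0] / det]]
A = [[sum(Jxi[i][k] * Jy[k][j] for k in range(2)) for j in range(4)]
     for i in range(2)]
Sx = [[sum(A[i][k] * sy[k] ** 2 * A[j][k] for k in range(4)) for j in range(2)]
      for i in range(2)]
sa, sb = math.sqrt(Sx[0][0]), math.sqrt(Sx[1][1])
rho = Sx[0][1] / (sa * sb)
```

The W0 variance dominates (0.5/γ is already about 5 cm), which is why σ̂_a and σ̂_b are almost fully correlated, as Table 7 states.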

4 Geostationary and Geosynchronous Satellite Orbits: Analysis of the Gravitational Force and of the Centrifugal Force, the Null Space of Gravity Space

Due to the different nature of the gravitational force and the centrifugal force (the gravitational force is attractive, the centrifugal force is repulsive), notably in the external space of the Earth, it may happen that the sum of these two forces is zero. This is the orbit of a satellite in a geostationary or geosynchronous satellite orbit. How are geostationary and geosynchronous satellite orbits defined? A geosynchronous satellite orbits the Earth along a constant longitude. A special case is the geostationary satellite that is located at a constant position above the equator. The ideal position of a geostationary satellite is at the level of zero gravity, i.e. at the geocentric radius where the gravitational force of the Earth equals the centrifugal force. These forces must be compensated for several perturbing forces, in particular for the lunisolar tides. Considering that the gravity field of the Earth varies not only radially but also laterally, this study focuses on the variations of zero gravity not only on the equator (for geostationary satellites) but also for various latitudes. It is found that the radius of a geostationary satellite deviates from its mean value of 42,164.2 km only within ±2 m, mainly due to the spherical harmonic coefficient J22, which is related to the equatorial flattening of the Earth. Away from the equator the zero-gravity surface deviates from the ideal radius of a geosynchronous satellite, and more so for higher latitudes. While the radius of the former surface increases towards infinity towards the poles, the latter decreases by about 520 m from the equator to the pole. Tidal effects vary these radii within ±2.3 km. Here the contribution of L. E. Sjöberg, E. Grafarend and M. S. S. Joud, J. Geod. Sci. 7 (2017) 43-50, has been adopted.

4.1

Example: Geostationary Orbit: Gravitational Force Plus Centrifugal Force Equal to Zero

First we calculate its approximate radius, 42,164 km, which had already been computed by Bode and Grafarend [9] and recently by Iorio [35].


Here we try to give a first-hand analytical derivation of this geostationary radius. In contrast to the derivation of the Somigliana-Pizzetti gravity field, which was based on (i) the degree/order (0, 0) and (2, 0) terms for the gravitational field as well as (ii) the degree/order (0, 0) and (2, 0) terms for the centrifugal field, we only consider, in quasi-spherical coordinates (\lambda, \varphi, r(\lambda, \varphi)) of gravity space (a notion of T. Krarup), the gravitational term of degree/order (0, 0) and the centrifugal term of degree/order (0, 0) and (2, 0).

Ansatz:

\gamma_x := \frac{\partial W}{\partial x} = \big(-gm\,r^{-3} + \omega^2\big)\,x   (82)

\gamma_y := \frac{\partial W}{\partial y} = \big(-gm\,r^{-3} + \omega^2\big)\,y   (83)

\gamma_z := \frac{\partial W}{\partial z} = -gm\,r^{-3}\,z   (84)

Obviously, the rotational term appears only in the terms \gamma_x and \gamma_y, not in the term \gamma_z. The mean rotational speed is restricted to the equatorial plane spanned by the x-axis as well as by the y-axis.

Question: Where does

-gm\,r_0^{-3} + \omega^2 = 0   (85)

or

\omega = \sqrt{gm\,r_0^{-3}} \quad\text{or}\quad r_0 = \sqrt[3]{gm\,\omega^{-2}}   (86)

hold in the equatorial plane (z = 0)?

Answer: The null space defined by -gm\,r_0^{-3} + \omega^2 = 0 appears in the equatorial plane (z = 0), the plane in which rockets normally start. Its radius is the radius of the geostationary orbit.
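Equation (86) can be evaluated directly; a minimal sketch with GRS80 constants (cf. Table 8 below; the constants are quoted values, the variable names are ours):

```python
# Geostationary radius from (86): r0 = (gm / omega^2)^(1/3)
gm = 3.986005e14        # kM, m^3/s^2 (GRS80)
omega = 7.292115e-5     # Earth rotation rate, rad/s (GRS80)

r0 = (gm / omega ** 2) ** (1.0 / 3.0)
r0_km = r0 / 1000.0
```

This reproduces the familiar value of roughly 42,164.2 km, about 5.6 Earth radii from the geocenter.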

4.2

Geostationary and Geosynchronous Satellite Orbits

Geostationary and geosynchronous satellites are important for society, e.g. for telecommunication and navigation. Both types of satellites have a period of revolution around the Earth's rotational axis that equals one day, which makes these satellites stay fixed at a specific longitude. An ideal geostationary satellite, a special case of the geosynchronous satellite, is always in the equatorial plane of


the Earth, while the orbital plane of the latter is inclined with respect to the celestial equatorial plane, which makes the satellite move periodically between its northern and southern latitudes, approximately equal to plus and minus the inclination angle of the orbital plane. The orbital plane of the geostationary satellite is subject to a precession due to tidal forces of the moon and the sun and the non-spherical shape of the Earth. This implies that the inclination of the plane varies within ±15 degrees away from the equator with a period of about 53 years. The flattening of the Earth also causes a slow longitudinal drift of the satellite towards one of the stable equilibrium points at 75°.3 E (over Indonesia) and 104°.7 W (in the Eastern Pacific), "the graveyards of geostationary satellites". Also, solar wind and radiation pressure make the satellite drift slowly away from the undisturbed orbit. All these perturbations require orbital maneuvers to keep the satellite in a requested position. The orbits of the above satellites can be determined analytically and/or directly monitored by observations, such as satellite laser ranging and C-band ranging (e.g. [29]), GNSS or Doppler-shift observations by LEO satellites. Here we will consider the problem that the Earth is not a spherically homogeneous ball, and that the gravity field is not radially symmetric but varies also laterally in space. In the case of the geostationary satellite, orbiting the Earth along the equator, this means that the centrifugal force acting on the satellite matches the gravitational force at various radii, depending on the longitude position of the satellite. As gravity is defined as the sum of the gravitation and the centrifugal force per unit mass, the geostationary satellite is located on the zero-gravity surface, which is the focus of this study.
However, as will be shown, geosynchronous satellites (with the exception of geostationary satellites) typically orbit the Earth below the zero-gravity surface. Most of the orbital theory given here should be well known among satellite orbit engineers; it is probably not the case for geodesists. Also, "the zero-gravity surface" is a new concept.

Basic Question: The energy integral for satellites,

V + \frac{kM_m}{l_m} + \frac{kM_s}{l_s} = \frac{v^2}{2} + c,   (87)

where V is the gravitational potential at the satellite generated by the Earth's mass, k is the gravitational constant, M_m and M_s are the masses of the moon and the sun at distances l_m and l_s from the satellite, respectively, v is the total velocity of the satellite and c is an integration constant, was presented by Bjerhammar [7]. Disregarding the tidal effects, one obtains an equivalent formula for the Earth's gravitation \gamma by the expression

\gamma = -\frac{\partial V}{\partial r} + \frac{1}{2}\,\frac{\partial v^2}{\partial r}   (88)


Here r is the geocentric radius of the satellite. This formula disregards the minor influences of solar radiation, air drag and tidal effects of the planets. Approximate radius of a geostationary satellite A geostationary satellite orbits the Earth in its equatorial plane. In this case the satellite is moving with the same angular velocity (ω) as the Earth rotates, and the velocity of the satellite becomes v = ωr

(89)

and Eq. (88) yields

$$\gamma = r\omega^2 \qquad (90)$$

or

$$\Gamma = \gamma - r\omega^2 = 0, \qquad (91)$$

where Γ is gravity. Hence, the geostationary satellite is located at the radius of zero gravity above the Earth, which implies that the gravitational force of the Earth is completely balanced by the centrifugal force. By approximating γ by the zero-degree harmonic/Keplerian term (kM/r²) of the Earth's gravitation, where kM is the geocentric gravitational constant (i.e. Newton's gravitational constant times the mass of the Earth), it follows from Eqs. (89) and (90) that

$$r = \bar r = \sqrt[3]{kM/\omega^2}, \qquad (92)$$

and by differentiating this equation one can see the change in radius (dr) for any change in gravitation (dγ):

$$dr = \bar r\,\frac{d\gamma}{\gamma} \qquad (93)$$

Bode and Grafarend [9] estimated r̄ to 42,164.26 km using the constants of the Geodetic Reference System 1967 (Table 8). Taking the constants of GRS 1980, the radius of zero gravity becomes 42,164.17 km, i.e. the estimates of the radius differ by 90 m between the reference systems, mainly due to the change in kM.

Table 8 The radii for zero gravity determined for GRS 67 and GRS 80 for a Keplerian orbit by Eq. (92)

System      kM [m³ s⁻²]         ω [rad s⁻¹]                r̄ [km]
GRS 1967    3,986,030 × 10⁸     7,292,115.1467 × 10⁻¹¹     42,164.26
GRS 1980    3,986,005 × 10⁸     7,292,115 × 10⁻¹¹          42,164.17

7 Ellipsoidal-Spheroidal Representation of the Gravity Field

383

Precise radius of a geostationary satellite As the gravity field of the Earth varies laterally from point to point, Eq. (92) cannot hold as a precise estimate, but γ must be determined by a more exact formula. Here we will first improve the gravity equation (91) by approximating γ by the normal gravity field of Pizzetti [54] and Somigliana [58]. [Ardalan and Grafarend [2] presented an approximate formula for the normal gravity on the level ellipsoid accurate to 10⁻⁷ mGal. However, here we need gravity at satellite level along the equator.] Then we will also compute the radius along the equator for vanishing gravity from an Earth Gravitational Model (EGM).

The zero gravity radius by normal gravity The normal gravity field is the field generated by a level ellipsoid. As derived by Somigliana ([58]; see also Heiskanen and Moritz [32], Sect. 2-8, and Caputo [14]) using Jacobi ellipsoidal coordinates (λ, β, u), the normal gravity vector is the gradient of the normal potential, with the two components

$$\gamma_u = -\frac{1}{w}\left[\frac{kM}{B^2} + \frac{\omega^2 a^2 E}{B^2}\,\frac{q'}{q_0}\left(\frac{1}{2}\sin^2\beta - \frac{1}{6}\right) - \omega^2 u\cos^2\beta\right] \qquad (94)$$

and

$$\gamma_\beta = -\frac{1}{w}\left[-\frac{\omega^2 a^2}{B}\,\frac{q}{q_0} + \omega^2 B\right]\sin\beta\cos\beta, \qquad (95)$$

subject to

$$q = q(u) = \frac{1}{2}\left[\left(1 + 3\,\frac{u^2}{E^2}\right)\arctan\frac{E}{u} - 3\,\frac{u}{E}\right] \qquad (96)$$

$$q_0 = q(b) \qquad (97)$$

$$q'(u) = 3\left(1 + \frac{u^2}{E^2}\right)\left[1 - \frac{u}{E}\arctan\frac{E}{u}\right] - 1 \qquad (98)$$

$$w = \frac{\sqrt{u^2 + E^2\sin^2\beta}}{B}. \qquad (99)$$

Here B = √(u² + E²), and a and E = ae are the semi-major axis and the linear eccentricity of the level ellipsoid; e is the eccentricity, and u, β and λ(= 0) are the ellipsoidal harmonic coordinates, with u along the semi-minor axis of the reference ellipsoid and β being the reduced latitude. (The longitude component of normal gravity always vanishes, as the level ellipsoid is rotationally symmetric around the z-axis.)


As the magnitude of the gravity vector is

$$\gamma = \sqrt{\gamma_u^2 + \gamma_\beta^2}, \qquad (100)$$

and on the equator β = 0 and hence γ_β = 0, it follows that

$$\gamma = |\gamma_u| = \frac{1}{w}\left[\frac{kM}{B^2} - \frac{\omega^2 a^2 E\,q'(u)}{6\,q_0\,B^2} - \omega^2 u\right]. \qquad (101)$$

Hence, equating γ to zero, zero gravity is given by the (approximate) depressed cubic equation

$$u^3 + uE^2 - C = 0, \qquad (102)$$

where

$$C = \frac{kM}{\omega^2} - \frac{a^2 E\,q'(u)}{6\,q_0}. \qquad (103)$$

Here the term C is first assumed to be constant. In that case Cardano's method shows that only one real root exists, and it becomes

$$u = \sqrt[3]{\frac{C}{2} + \sqrt{\frac{C^2}{4} + \frac{E^6}{27}}} + \sqrt[3]{\frac{C}{2} - \sqrt{\frac{C^2}{4} + \frac{E^6}{27}}}. \qquad (104)$$

However, the term q′(u) in C is slightly dependent on u, and therefore the primary solution of Eq. (104) needs to be iterated. Once u is determined, the zero gravity radius above the equator follows from

$$r = B = \sqrt{u^2 + E^2}. \qquad (105)$$
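The iteration of Eqs. (96)–(105) can be sketched numerically; a minimal sketch assuming GRS 80 constants (the stable rewriting of the small Cardano root is an implementation choice, not part of the text):

```python
import math

# GRS 80 constants (assumed): semi-axes, geocentric gravitational constant, rotation rate
a, b = 6378137.0, 6356752.3141
kM, omega = 3.986005e14, 7.292115e-5
E = math.sqrt(a*a - b*b)                 # linear eccentricity

def q(u):                                # Eq. (96)
    return 0.5 * ((1.0 + 3.0*u*u/(E*E)) * math.atan(E/u) - 3.0*u/E)

def qp(u):                               # q'(u), Eq. (98)
    return 3.0 * (1.0 + u*u/(E*E)) * (1.0 - (u/E)*math.atan(E/u)) - 1.0

q0 = q(b)                                # Eq. (97)

def gamma_eq(u):                         # gravity along the equator, Eq. (101)
    B2 = u*u + E*E
    w = u / math.sqrt(B2)                # Eq. (99) with beta = 0
    return (kM/B2 - omega**2 * a*a * E * qp(u) / (6.0*q0*B2) - omega**2 * u) / w

def cardano(C):                          # the single real root of u^3 + E^2 u - C = 0, Eq. (104)
    s = math.sqrt(C*C/4.0 + E**6/27.0)
    t1 = (C/2.0 + s) ** (1.0/3.0)
    t2 = -(E**6/27.0) / (C/2.0 + s)      # equals C/2 - s, evaluated without cancellation
    return t1 - (-t2) ** (1.0/3.0)

u = (kM / omega**2) ** (1.0/3.0)         # start from the Keplerian radius
for _ in range(5):                       # iterate, since q'(u) depends weakly on u
    C = kM/omega**2 - a*a*E*qp(u)/(6.0*q0)   # Eq. (103)
    u = cardano(C)
r = math.sqrt(u*u + E*E)                 # Eq. (105)
print(r / 1000.0)                        # about 42 164.70 km for GRS 80
```

At the converged u the equatorial gravity of Eq. (101) vanishes, and at u = b the same expression returns the normal gravity of about 9.78 m/s² on the equator of the ellipsoid, which serves as a plausibility check.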

Using the parameters of GRS 1967/1980 one obtains the radius 42,164.79/42,164.70 km.

Zero gravity radius represented by EGM There will be a small variation of the radius of zero gravity along the equator, mainly due to the tesseral spherical harmonics of degree and order 2 [i.e. A22 and A2,−2 in Eq. (107)]. As these coefficients are of the order of 2 × 10⁻⁶, one can expect deviations of the radius of the order of a few meters [cf. Eqs. (93), (107) and (108)]. Starting from the spherical harmonic expression of the Earth's gravitational potential truncated at some degree n₀:

$$V = \sum_{n=0,\,n\neq 1}^{n_0}\left(\frac{R}{r}\right)^{n+1} V_n(\theta,\lambda), \qquad (106)$$

subject to

$$V_n(\theta,\lambda) = \frac{kM}{R}\sum_{m=-n}^{n} A_{nm}\,Y_{nm}(\theta,\lambda), \qquad (107)$$

the gravitation experienced by the satellite becomes

$$\gamma = \sum_{n=0,\,n\neq 1}^{n_0}\frac{n+1}{R}\left(\frac{R}{r}\right)^{n+2} V_n(\theta,\lambda). \qquad (108)$$
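The factor (R/r)ⁿ in Eq. (108) attenuates the field strongly at geostationary altitude, which is why only the lowest degrees matter there; a one-line illustration (R and r values assumed from the text):

```python
R = 6371.0e3        # sea-level reference radius [m] (assumed)
r = 42164.17e3      # geostationary radius [m]

# degree-dependent attenuation (R/r)^n appearing in Eqs. (108) and (110)
att = {n: (R / r) ** n for n in (2, 3, 4)}
print(att)          # degree 2 ~ 0.023, and rapidly smaller for higher degrees
```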

Here (θ, λ) is the horizontal position in co-latitude and longitude, respectively, and R is a reference radius (e.g. the sea level radius) for the EGM with coefficients A_nm. Inserting Eq. (108) into Eq. (91) one obtains for a geostationary satellite with θ = π/2:

$$r = \sqrt[3]{\frac{kM\,(1+\Delta)}{\omega^2}} = \bar r\left(1 + \frac{\Delta}{3} - \frac{\Delta^2}{9}\cdots\right), \qquad (109)$$

where

$$\Delta = \sum_{n=2}^{n_0}(n+1)\left(\frac{R}{r}\right)^{n}\sum_{m=-n}^{n} A_{nm}\,Y_{nm}\!\left(\frac{\pi}{2},\lambda\right). \qquad (110)$$

Eq. (110) needs to be iterated, starting e.g. from r = r̄; the convergence should be rapid since, for R being the sea level radius, R/r̄ = 6371/42,164 ≈ 0.15 and |A_nm| < 2 × 10⁻³. Numerical results are given below.

Numerical Results Equations (109) and (110) were used with the EGM GOCO05c, complete to degree 80, to determine the variation of zero gravity and its radius along the equator. The mean value (γ̄) of the gravitation along the equator was determined to 0.2242179 m/s² by Eq. (109), corresponding to the mean radius r = 42,164.2 km (i.e. the same as for the Keplerian orbit). The variations along the equator of the gravitation (dγ = γ − γ̄) at the mean radius and of the radius (dr = r − r̄) are shown in Figs. 1 and 2, respectively. The first figure shows that dγ varies between −9.81 μGal and +9.69 μGal, with minima at longitudes 75°.1 and 254°.7 and maxima at longitudes 161°.0 and 349°.7. The maxima (being unstable points) correspond to radius deviations dr of 1.9 m and 1.6 m, while the minima (stable points) are attained at deviations of −2.0 and −1.5 m. (See also Table 9.)

Fig. 1 The difference of zero gravity from its mean value along the equator (vertical axis: γ − γ̄ in μGal; horizontal axis: longitude λ in degrees). (L.E. Sjoeberg, E. Grafarend and M.S.S. Joud (2017))

Fig. 2 The variation of the radius (r) of zero gravity along the equator (vertical axis: dr = r − r̄ in meters, with the mean radius r̄ = 42,164.17 km; horizontal axis: longitude λ in degrees). (L.E. Sjoeberg, E. Grafarend and M.S.S. Joud (2017))


Table 9 The radius and the Earth's gravity acceleration at the extreme points

                 Minima                      Maxima
λ [degree]       75.05         254.75        160.95        349.75
dγ [μGal]        −9.8          −7.1          9.7           7.2
γ [m/s²]         0.22421781    0.22421784    0.22421801    0.22421798
dr [m]           −1.973        −1.544        1.953         1.560
r [km]           42,164.347    42,164.347    42,164.351    42,164.351

Lateral motions of the satellite A perfect geostationary satellite is constantly positioned over the equator at a specific longitude and geocentric radius. In reality, mainly due to perturbing forces, this will not be the case: there will always be small motions away from the perfect position. Small periodic motions may be acceptable, but at some point secular motions must be corrected by on-board thrusters. However, also for the Kepler motion (with no perturbing forces) the practical orbit is always slightly inclined and eccentric, implying that the satellite moves away from the stable geostationary position. In the case of small eccentricity and inclination, as outlined in the Appendix based on Soop [60], the geostationary satellite will experience small sinusoidal librations (periodic motions) of its coordinates that can be expressed as

$$r \approx \bar r + \delta a - \bar r\,e\cos[\omega(t - t_P)]$$
$$\delta = i\,\sin[\bar\omega + \omega(t - t_P)]$$
$$\lambda \approx \Omega + \bar\omega - \alpha_0 + \omega(t - t_P) - 1.5\,(\delta a/\bar r)\,\omega(t - t_P) + 2e\sin[\omega(t - t_P)] \qquad (111)$$

where Ω, i and ω̄ are the right ascension of the ascending node, the inclination of the orbital plane, and the argument of the perigee vs. the ascending node; t_P is the time of perigee passage, t₀ is an arbitrarily selected reference time, and the declination δ approximates the latitude.

Zero gravity surface Equation (90) or (91) yields the radius of the zero gravity surface only along the equator. For any other latitude, zero gravity is given by the equation

$$\Gamma = \gamma - r\omega^2\cos^2\varphi = 0, \qquad (112)$$

where φ is the latitude. Relying on the approximate formula γ = kM/r² for the gravitation, one obtains approximately the radius

$$\bar r(\varphi) = \sqrt[3]{\frac{kM}{\omega^2\cos^2\varphi}}. \qquad (113)$$
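Equation (113) is easily evaluated; a minimal sketch assuming the GRS 80 constants, showing the (cos φ)^(−2/3) growth of the zero gravity surface away from the equator:

```python
import math

kM, omega = 3.986005e14, 7.292115e-5

def rbar_phi(phi_deg):                       # Eq. (113)
    phi = math.radians(phi_deg)
    return (kM / (omega**2 * math.cos(phi)**2)) ** (1.0 / 3.0)

r0, r80 = rbar_phi(0.0), rbar_phi(80.0)
ratio = r80 / r0
expected = math.cos(math.radians(80.0)) ** (-2.0 / 3.0)
print(r0 / 1000.0, r80 / 1000.0)   # ~42 164 km at the equator, ~135 500 km at 80 degrees
```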


Considering the more precise gravitation given by Eq. (108), one also obtains the improved radius

$$r = \sqrt[3]{\frac{kM\,(1+\Delta)}{\omega^2\cos^2\varphi}} = \bar r(\varphi)\left(1 + \frac{\Delta}{3} - \frac{\Delta^2}{9}\cdots\right), \qquad (114)$$

with Δ as in Eq. (110) but for varying co-latitudes. From Eqs. (113) and (114) one can see that the radius increases by the factor (cos φ)^(−2/3) relative to the radius of the geostationary satellite when moving away from the equator. This is the ideal radius that matches Eq. (112), and it is illustrated in Figs. 3 and 4. One can see that the radius increases dramatically from that of the geostationary satellite at the equator to about 135,000 km at 80° latitude (and towards infinity at the poles). It can be seen from Fig. 4 (bottom) that the variation of the residual radius dr = r − r̄(φ) decreases with latitude (due to the increasing distance to the center of the Earth). For instance, at the equator dr is within ±2 m, while at latitude 80° the variation has decreased to within ±0.02 m; the extrema occur at the same longitudes, in accordance with Table 9.

The ideal radius of a geosynchronous satellite A geosynchronous satellite orbits the Earth along a constant longitude, implying that its period and angular velocity are the same as for the Earth's rotation around its axis (with angular velocity ω). Generally, the orbital plane is inclined (with the

Fig. 3 The mean geocentric radius (r̄) along parallel circles of the zero gravity surface between ±80°. (L.E. Sjoeberg, E. Grafarend and M.S.S. Joud (2017))


Fig. 4 (Top) The radius of the zero gravity surface in km and (bottom) the change of the radius dr = r − r̄(φ) in m about its mean (r̄) over the parallels, for φ = 0°, 20°, 40°, 60°, 80°. The horizontal axis shows the longitude in degrees. (L.E. Sjoeberg, E. Grafarend and M.S.S. Joud (2017))


inclination angle i) with respect to the equatorial plane, implying that the satellite moves periodically along a specific meridian between the latitudes −i and +i. However, as also in this case the centrifugal force should match the gravitational force, Eqs. (90) and (91) hold, and the radius can be determined by Eqs. (110) and (114), but now with the fixed co-latitude replaced by the variable co-latitude of the geosynchronous satellite. In Fig. 5 the mean radius as a function of latitude is illustrated. On the equator the geosynchronous satellite clearly orbits at the same radius as a geostationary satellite (Eq. 110) on the zero gravity surface (Eq. 114). When migrating from the equator towards the poles, the satellite orbit is closer to the Earth's center than at the equator. For instance, the mean radius for the satellite at 60° latitude is about 400 m less than that at the equator. In Fig. 6 the radii about their latitudinal means r̄(φ) are shown. In Fig. 6 (top) one can see that the radius of a geosynchronous satellite decreases by about 500 m from the equator to latitude 80°, while the radius of the zero gravity surface, shown in Figs. 3 and 4 (top), increases by about 92,800 km. Comparing Figs. 4 (bottom) and 6 (bottom), one notices that away from the equator the radius variation around its latitudinal mean value is slightly smaller for the zero gravity surface than for a geosynchronous satellite.

Effects of tidal perturbations Capderou ([13], p. 159) emphasized that at the radius of a geosynchronous satellite the Earth's flattening term J2 does not cause the largest perturbation of the satellite orbit; the largest perturbation is due to the lunisolar tide.

Fig. 5 The mean radius (r̄) of a geosynchronous satellite along parallel circles at latitude φ. (L.E. Sjoeberg, E. Grafarend and M.S.S. Joud (2017))


Fig. 6 Bifurcation point in the equatorial plane. (J. Zund [66], p. 230)

Considering various perturbations he found that the radius for a geostationary satellite is 1.7 km higher than that for the Keplerian orbit. The lunisolar perturbations of geostationary and geosynchronous satellite orbits are both periodic and secular. The acceleration of a satellite (being the negative of the perturbing gravitational acceleration) due to each of the celestial bodies (with mass M_i) is given by

$$d\ddot{\mathbf x} = -d\boldsymbol\gamma = kM_i\left(\frac{\mathbf x}{r^3} - \frac{\mathbf x'}{(r')^3}\right),$$

where x and x′ are the vectors from the celestial body to the satellite and to the Earth's center, respectively. Inserting dγ into Eq. (93) one obtains the change of radius due to the tidal effect. Our numerical calculations show that the lunisolar tide varies the satellite radius between −2.26 and 1.80 km.

4.3

Discussion

While the radius of a geostationary satellite (42,164.2 km) varies only within ±2 m along the equator, the ideal radius of a geosynchronous satellite decreases, mainly due to the flattening of the Earth, by about 522 m from the equator to the pole. However, these figures do not include the lunisolar tidal effect, which alters the radius between −2.26 and +1.80 km. Along the equator the radius of the zero gravity surface agrees with the radii of geostationary satellites. Away from the equator, geosynchronous satellites fly below the zero gravity surface, whose radius approaches infinity at the poles. The zero gravity surface is the ideal surface for conducting zero gravity experiments and engineering production. Finally we discuss two items: "the singularity of the poles" as well as the nature of the balance zero called "bifurcation".


We know very well why we have to live with the coordinate singularity at the North Pole when using spherical or ellipsoidal coordinates. The Lusternik-Schnirelmann category teaches us that a minimum atlas of the sphere or of the ellipsoid consists of at least two charts. This is the basis of the Theory of Map Projections: for instance we need at least (i) standard spherical/ellipsoidal coordinates or UM (Universal Mercator Projection for the sphere or the ellipsoid) and (ii) a transversal coordinate system or UTM (Universal Transverse Mercator Projection for the sphere or the ellipsoid). J. Zund [66] has presented in Sect. VII.3, "The fundamental Theorem of Differential Geodesy in detail", the Pizzetti sphere, namely the set of equipotential surfaces lying between the radius R of the toposphere and the compact surfaces whenever 0 < h < 5R, where h is the height above the reference sphere (or the ellipsoid). His analysis is centered around the question of what happens as h → 5R. Here we present by Fig. 6 the situation in the equatorial plane: the surface degenerates into a closed curve which is circular in shape. For a mean radius, such a surface degenerates, namely bifurcates, into a 2-sheeted open cylindrical surface.

For J. Zund the Pizzetti Theorem may be regarded as the fundamental theorem of Differential Geodesy. On such a concept, J. Zund presented Hotine's hierarchy of local coordinate systems

H₀ : (Λ, Φ, N)
H₁ : ("normal system")
H₂ : (triple orthogonal system)
H₃ : (λ, φ, h)

as well as his substantial critique.

5

Molodensky Heights

Let us begin with a short review of the essentials of the topic telluroid, telluroid height and the highlight, “Molodensky height”. The telluroid as introduced by M.S. Molodensky [45–47] may be considered the best analytical representation of the irregular surface of the Earth. Given


the placement vector of a point in geometry space, for instance by GPS (global problem solver), and the reference gravity potential in gravity space, the telluroid can be uniquely defined by a properly chosen projection. For instance, astronomical longitude/astronomical latitude (spherical coordinates in gravity space) at a topographic point could be defined to coincide with reference longitude/reference latitude (spherical coordinates in reference gravity space) at a telluroid point in order to establish an isoparametric mapping of the telluroid. Bode and Grafarend [9] extensively studied such an isoparametric telluroid mapping with respect to a reference gravity potential which is additively decomposed into (1) the zero-order coefficient of a spherical harmonic expansion of the gravitational potential and (2) the centrifugal potential. In particular, they succeeded in identifying the singular points of such a telluroid map. Here we aim at an ellipsoidal telluroid mapping, which is set up as follows. Let there be given the triplet of Jacobi ellipsoidal/spheroidal coordinates {λ_p, φ_p, u_p} of a point p on the topographic surface of the Earth, which parameterizes the geometry space and may be converted from GPS positioning. In addition, the actual gravity potential w_p, which parameterizes the gravity space at p, is given by means of gravimetric levelling with respect to, for example, the Amsterdam Level Datum. Find the point P on the telluroid whose reference gravity potential W_P is gauged to the actual gravity potential w_p, for example by means of W_P = w_p. Alternatively, determine the Jacobi ellipsoidal/spheroidal coordinates {Λ_P, Φ_P, U_P} of a point P on the surface of the telluroid by a suitable mapping procedure. The telluroid derived from such a definition is called a potential telluroid, in contrast to cases where the modulus of gravity acceleration at p is equated to the modulus of reference gravity acceleration at P.
In such a case we would refer to a gravity telluroid, a proposal that has already been made by Grafarend [21, 22]. Here we have chosen the reference gravity field of a level ellipsoid which coincides with the World Geodetic Datum (Grafarend and Ardalan [24]), and which is also known as the Somigliana-Pizzetti gravity potential field [59]. We may therefore call the related telluroid the Somigliana-Pizzetti potential telluroid. Our contribution is organized as follows. In Sect. 1 we introduce the minimum-distance mapping of a topographic point p to the corresponding point P on the telluroid with a proper potential-type gauge. In particular, we present the Somigliana-Pizzetti gravity potential field, both in an explicit form and in a form of ellipsoidal orthonormal functions. Here we encounter Jacobi ellipsoidal/spheroidal coordinates, as well as their transformation to Gauss ellipsoidal/spheroidal coordinates and to Cartesian coordinates in the World Geodetic Datum 2000, realized, for example, by GPS. In particular, we review the system of normal equations of the Euclidean minimum-distance mapping, which may be characterized as nonlinear and four-dimensional. Their Hessian (the matrix of second derivatives) is given in the Appendix. It is tested for positive definiteness at the solution point, which is generated by fast Newton-Raphson iteration. Section 2 presents a case study, namely a computation of the Somigliana-Pizzetti potential telluroid and quasi-geoid from position and potential data for East Germany (196 GPS stations, 196 geopotential data). We review in detail the generation of Jacobi


ellipsoidal/spheroidal coordinates {λ_p, φ_p, u_p} ∈ M²_u, p an element of the topographical surface/two-dimensional Riemann manifold M²_u, and {Λ_P, Φ_P, U_P} ∈ M²_U, P an element of the Somigliana-Pizzetti potential telluroid/two-dimensional Riemann manifold M²_U. As soon as we plot the ellipsoidal/spheroidal height (u_p − U_P)√G₃₃, where G₃₃ is the third diagonal element of the matrix of the metric on the level of the international reference ellipsoid E²_{A,A,B} (A semi-major axis, B semi-minor axis), we obtain the Somigliana-Pizzetti quasi-geoid. Here we have chosen the equidistant-conic projection of E²_{A,A,B} with standard parallels 50°N and 52.5°N, whose mapping equations (l, b) → (x, y) ∈ R² are given in Sect. 5.4. Also included in Sect. 2 is the comparison between the results obtained by minimum-distance telluroid mapping of Somigliana-Pizzetti type and the results which may be derived by using the formula ζ = T/γ. We have especially included this comparison to answer the question "what is the difference between minimum-distance telluroid mapping and the results that may be obtained by ζ = T/γ, where γ is taken as GM/r²?"

5.1

Formulation of the Problem

In the introduction we presented a general definition for the mapping of the Earth's surface onto the telluroid. Here we are going to specify that definition to the Somigliana-Pizzetti normal field and the minimum-distance mapping. We will call such a mapping the Somigliana-Pizzetti minimum-distance telluroid mapping. It can be defined as follows. Given the actual gravity potential value w(x) at the known point p(x) on the surface of the Earth M²_h, find the point P(X) such that: (1) the normal Somigliana-Pizzetti potential field W_P = W(X) at the point P(X) ∈ M²_H is equal to the actual potential at p(x) ∈ M²_h; (2) the point P(X) ∈ M²_H is at minimum (Euclidean) distance from the point p(x) ∈ M²_h on the physical surface of the Earth. Since the most suitable coordinate system for the formulation of the Somigliana-Pizzetti field is ellipsoidal, we formulate our minimization problem in terms of Jacobi ellipsoidal coordinates {λ, φ, u}. The surface which consists of all those points of minimum distance is called the Molodensky telluroid M²_H or, specifically in our case, the Molodensky telluroid of Somigliana-Pizzetti type. For a graphical illustration we refer to Fig. 7. Assume two points {p(x), q(y)} are given by their placement vectors {x, y} on the Earth's topographic surface. In addition, we attribute to these points the gravity potentials {w(x), w(y)}, respectively. In the first step we introduce the gauge {W_P = w_p, W_Q = w_q}, i.e. {W(X) = w(x), W(Y) = w(y)}. Indeed, the reference potential values {W_P, W_Q} or {W(X), W(Y)} at the points {P, Q} or at the placement vectors


Fig. 7 Point p(x) ∈ M2h and q(y) ∈ M2h on the topographic surface and their minimum-distance mapping onto the reference equipotential surface W (X) = w(x) and W (Y) = w(y) generating the point P ∈ M2H and Q ∈ M2H on the Molodensky telluroid. (Grafarend and Ardalan [24])

{X, Y}, respectively, with respect to the Somigliana-Pizzetti potential are set equal to the actual gravity potential values {w_p, w_q} or {w(x), w(y)}, respectively. Using the potential gauge we have constructed equipotential surfaces with respect to the Somigliana-Pizzetti reference potential. The second step aims at determining the coordinates of the points {P, Q}, or of the placement vectors {X, Y}, by means of the minimum-distance mapping: (1) at the point p onto the reference equipotential surface w(p) = W(P) = const. to generate the point P at minimum distance, and (2) at the point q onto the reference equipotential surface w(q) = W(Q) = const. to generate the point Q at minimum distance. What has been shown here for two points has to be done for all points of a geodetic network with {x, y, z, w} data, namely from GPS positioning and gravimetric levelling. Indeed, the Molodensky telluroid reflects very well the Earth's topography. The minimum distances {Pp, Qq} are conventionally called the height anomalies of Molodensky type. Analytically we can formulate the above-stated optimization problem by minimizing the constrained Lagrangean

$$\mathcal L(x_1,x_2,x_3,x_4) := \|\mathbf x - \mathbf X\|^2 + x_4\,(W_P - w_p)$$
$$= [x - X(x_1,x_2,x_3)]^2 + [y - Y(x_1,x_2,x_3)]^2 + [z - Z(x_1,x_2,x_3)]^2 + x_4\,[W(x_1,x_2,x_3) - w_p] = \min_{x_1,x_2,x_3,x_4} \qquad (115)$$

$$(\hat x_1, \hat x_2, \hat x_3, \hat x_4) = \arg\min_{x_1,x_2,x_3,x_4}\ \mathcal L(x_1,x_2,x_3,x_4) \qquad (116)$$

where (x₁, x₂, x₃) = (Λ, Φ, U) are the Jacobi ellipsoidal coordinates of the point P ∼ X ∈ M²_H on the telluroid, and x₄ is the unknown Lagrange multiplier. For the definition of Jacobi ellipsoidal coordinates we refer to Sect. 3. Definition 3 presents the Somigliana-Pizzetti gravity potential field in terms of Jacobi ellipsoidal coordinates {λ, φ, u}. The Somigliana-Pizzetti field was developed


by Pizzetti [53] and Somigliana [59] and recently extensively analysed by Grafarend and Ardalan [24] in functional analytical terms.

Definition 3 (Somigliana-Pizzetti field as the gravity field of a rotational ellipsoid)

(1) Explicit form in terms of the fundamental geodetic parameters {a, b, W₀, Ω} (according to Grafarend and Ardalan [24]):

$$W(\varphi,u) = \left(W_0 - \frac{1}{3}\Omega^2 a^2\right)\frac{\operatorname{arccot}(u/\varepsilon)}{\operatorname{arccot}(b/\varepsilon)} + \frac{1}{3}\Omega^2\left(u^2+\varepsilon^2\right)$$
$$\quad + \left\{\frac{1}{3\sqrt 5}\,\Omega^2 a^2\,\frac{\left(3\frac{u^2}{\varepsilon^2}+1\right)\operatorname{arccot}\frac{u}{\varepsilon} - 3\frac{u}{\varepsilon}}{\left(3\frac{b^2}{\varepsilon^2}+1\right)\operatorname{arccot}\frac{b}{\varepsilon} - 3\frac{b}{\varepsilon}} - \frac{1}{3\sqrt 5}\,\Omega^2\left(u^2+\varepsilon^2\right)\right\}\frac{\sqrt 5}{2}\left(3\sin^2\varphi - 1\right) \qquad (117)$$

(2) Explicit form in terms of the fundamental geodetic parameters {a, b, GM, Ω} (according to Pizzetti [53] and Somigliana [59]):

$$W(\varphi,u) = \frac{GM}{\varepsilon}\operatorname{arccot}\frac{u}{\varepsilon} + \frac{1}{6}\Omega^2 a^2\,\frac{\left(3\frac{u^2}{\varepsilon^2}+1\right)\operatorname{arccot}\frac{u}{\varepsilon} - 3\frac{u}{\varepsilon}}{\left(3\frac{b^2}{\varepsilon^2}+1\right)\operatorname{arccot}\frac{b}{\varepsilon} - 3\frac{b}{\varepsilon}}\left(3\sin^2\varphi-1\right) + \frac{1}{2}\Omega^2\left(u^2+\varepsilon^2\right)\cos^2\varphi \qquad (118)$$

(3) Functional analytical form (according to Grafarend and Ardalan [24]):

$$W(\varphi,u) = \frac{GM}{\varepsilon}\,Q^*_{00}\!\left(\frac{u}{\varepsilon}\right)e_{00} + \frac{1}{3\sqrt 5}\,\Omega^2 a^2\,\frac{2P^*_{2}(u/\varepsilon)\,Q^*_{00}(u/\varepsilon) - \sqrt{15}\,P^*_{1}(u/\varepsilon)}{2P^*_{2}(b/\varepsilon)\,Q^*_{00}(b/\varepsilon) - \sqrt{15}\,P^*_{1}(b/\varepsilon)}\,e_{20}(\varphi)$$
$$\quad + \frac{2}{9}\,\Omega^2\varepsilon^2\left[1 + \frac{1}{\sqrt 5}P^*_{2}\!\left(\frac{u}{\varepsilon}\right)\right]e_{00} - \frac{2}{9\sqrt 5}\,\Omega^2\varepsilon^2\left[1 + \frac{1}{\sqrt 5}P^*_{2}\!\left(\frac{u}{\varepsilon}\right)\right]e_{20}(\varphi) \qquad (119)$$
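A numerical plausibility check of the explicit form (118), a minimal sketch assuming GRS 80 constants: on the level ellipsoid u = b the potential must be constant in φ and reduce to W₀ (the GRS 80 value is U₀ = 62,636,860.85 m²/s²):

```python
import math

# GRS 80 constants (assumed)
a, b = 6378137.0, 6356752.3141
GM, Omega = 3.986005e14, 7.292115e-5
eps = math.sqrt(a*a - b*b)

def W(phi, u):
    """Somigliana-Pizzetti potential, Eq. (118); phi in radians."""
    t1 = (GM / eps) * math.atan(eps / u)     # arccot(u/eps) = arctan(eps/u)
    num = (3.0*u*u/(eps*eps) + 1.0) * math.atan(eps/u) - 3.0*u/eps
    den = (3.0*b*b/(eps*eps) + 1.0) * math.atan(eps/b) - 3.0*b/eps
    t2 = (Omega**2) * a*a * (num/den) * (3.0*math.sin(phi)**2 - 1.0) / 6.0
    t3 = 0.5 * (Omega**2) * (u*u + eps*eps) * math.cos(phi)**2
    return t1 + t2 + t3

# On the level ellipsoid u = b the potential is the same for every latitude:
print(W(0.0, b), W(math.radians(45.0), b))
```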

Table 10 Normalized associated Legendre functions of the first kind, P*ₙₘ(sin φ) and P*ₙₘ(u/ε), and the associated Legendre functions of the second kind, Q*ₙₘ(u/ε)

n  m   P*ₙₘ(sin φ)                  P*ₙₘ(u/ε)                     Q*ₙₘ(u/ε)
0  0   1                            1                             arccot(u/ε)
1  0   √3 sin φ                     √3 (u/ε)                      1 − (u/ε) arccot(u/ε)
2  0   (√5/2)(3 sin²φ − 1)          (√5/2)(3u²/ε² + 1)            (1/2)[(3u²/ε² + 1) arccot(u/ε) − 3u/ε]

Higher degrees follow from the recursion

$$P^*_{nm}(\sin\varphi) = \frac{\sqrt{4n^2-1}}{\sqrt{n^2-m^2}}\,\sin\varphi\,P^*_{n-1,m}(\sin\varphi) - \sqrt{\frac{(2n+1)(n+m-1)(n-m-1)}{(n^2-m^2)(2n-3)}}\,P^*_{n-2,m}(\sin\varphi), \quad \forall\, n \in [3,\infty),\ m \in [0, n-2],$$

and from the integral representations

$$P^*_{nm}(u/\varepsilon) \propto \sqrt{\frac{(n+m)!}{\pi\, n!}}\int_0^{\pi}\left(\frac{u}{\varepsilon} + \sqrt{\frac{u^2}{\varepsilon^2}+1}\,\cos\tau\right)^{n}\cos m\tau\,d\tau,$$

$$Q^*_{nm}(u/\varepsilon) \propto (-1)^m\,\frac{2^n (n+m)!\,m!}{(n-m)!\,(2m)!}\int_0^{\infty}\frac{\sinh^{2m}\tau}{\left(\dfrac{u}{\varepsilon} + \sqrt{\dfrac{u^2}{\varepsilon^2}+1}\,\cosh\tau\right)^{n+m+1}}\,d\tau.$$


where e_{n0}(λ, φ) = P*_{n0}(sin φ) are orthogonal base functions of Legendre type, P*ₙₘ(sin φ) and P*ₙₘ(u/ε) are normalised associated Legendre functions of the first kind, and Q*ₙₘ(u/ε) are associated Legendre functions of the second kind, as briefly reviewed in Table 10.
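The forward transformation from Jacobi ellipsoidal/spheroidal coordinates {λ, φ, u} to Cartesian coordinates used below can be sketched as follows (a minimal sketch; the sample coordinate values are hypothetical):

```python
import math

a, b = 6378137.0, 6356752.3141   # GRS 80 semi-axes (assumed)
eps = math.sqrt(a*a - b*b)       # linear eccentricity

def jacobi_to_cartesian(lam, phi, u):
    """Forward transform {lam, phi, u} -> {x, y, z} for Jacobi ellipsoidal coordinates."""
    s = math.sqrt(u*u + eps*eps)
    return (s * math.cos(phi) * math.cos(lam),
            s * math.cos(phi) * math.sin(lam),
            u * math.sin(phi))

u = 6360000.0                     # a hypothetical u-coordinate [m]
x, y, z = jacobi_to_cartesian(0.3, 0.8, u)
# the point lies on the confocal ellipsoid of parameter u:
check = (x*x + y*y) / (u*u + eps*eps) + z*z / (u*u)
print(check)   # 1.0
```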

Using the forward transformation of the Jacobi ellipsoidal coordinates {λ, φ, u} into Cartesian coordinates {x, y, z}, the functional L(x₁, x₂, x₃, x₄) can be written as

$$\mathcal L(\Lambda_P, \Phi_P, U_P, x_4) := \left(x_p - \sqrt{U_P^2+\varepsilon^2}\,\cos\Phi_P\cos\Lambda_P\right)^2 + \left(y_p - \sqrt{U_P^2+\varepsilon^2}\,\cos\Phi_P\sin\Lambda_P\right)^2$$
$$\quad + \left(z_p - U_P\sin\Phi_P\right)^2 + x_4\left(W(\Phi_P, U_P) - w_p\right) \qquad (120)$$

$$\mathcal L(x_1, x_2, x_3, x_4) := \left(x_p - \sqrt{x_3^2+\varepsilon^2}\,\cos x_2\cos x_1\right)^2 + \left(y_p - \sqrt{x_3^2+\varepsilon^2}\,\cos x_2\sin x_1\right)^2$$
$$\quad + \left(z_p - x_3\sin x_2\right)^2 + x_4\left(W(x_2, x_3) - w_p\right) \qquad (121)$$

where {x₁, x₂, x₃} are the unknown Jacobi ellipsoidal coordinates of the point P(Λ, Φ, U) = P(x₁, x₂, x₃) on the telluroid P(X) ∈ M²_H, W(Φ, U) = W(x₂, x₃) is the Somigliana-Pizzetti potential field at the point P(Λ, Φ, U) ∈ M²_H [cf. Eqs. (117)-(119)], and w_p refers to the actual gravity potential at the point p{x, y, z} on the surface of the Earth. The functional L(x₁, x₂, x₃, x₄) is minimal if and only if the following two types of conditions are fulfilled

(1)

$$f_1 := \frac{\partial \mathcal L}{\partial x_1}(\hat x_1,\hat x_2,\hat x_3,\hat x_4) = 2\sqrt{\hat x_3^2+\varepsilon^2}\,\cos\hat x_2\left(x_p\sin\hat x_1 - y_p\cos\hat x_1\right) = 0$$

$$f_2 := \frac{\partial \mathcal L}{\partial x_2}(\hat x_1,\hat x_2,\hat x_3,\hat x_4) = 2\left(x_p - \sqrt{\hat x_3^2+\varepsilon^2}\,\cos\hat x_2\cos\hat x_1\right)\sqrt{\hat x_3^2+\varepsilon^2}\,\sin\hat x_2\cos\hat x_1$$
$$\quad + 2\left(y_p - \sqrt{\hat x_3^2+\varepsilon^2}\,\cos\hat x_2\sin\hat x_1\right)\sqrt{\hat x_3^2+\varepsilon^2}\,\sin\hat x_2\sin\hat x_1 - 2\left(z_p - \hat x_3\sin\hat x_2\right)\hat x_3\cos\hat x_2 + \hat x_4\,\frac{\partial W(\hat x_2,\hat x_3)}{\partial x_2} = 0$$

$$f_3 := \frac{\partial \mathcal L}{\partial x_3}(\hat x_1,\hat x_2,\hat x_3,\hat x_4) = -2\,\frac{\left(x_p - \sqrt{\hat x_3^2+\varepsilon^2}\,\cos\hat x_2\cos\hat x_1\right)\hat x_3\cos\hat x_2\cos\hat x_1}{\sqrt{\hat x_3^2+\varepsilon^2}}$$
$$\quad - 2\,\frac{\left(y_p - \sqrt{\hat x_3^2+\varepsilon^2}\,\cos\hat x_2\sin\hat x_1\right)\hat x_3\cos\hat x_2\sin\hat x_1}{\sqrt{\hat x_3^2+\varepsilon^2}} - 2\left(z_p - \hat x_3\sin\hat x_2\right)\sin\hat x_2 + \hat x_4\,\frac{\partial W(\hat x_2,\hat x_3)}{\partial x_3} = 0$$

$$f_4 := \frac{\partial \mathcal L}{\partial x_4}(\hat x_1,\hat x_2,\hat x_3,\hat x_4) = W(\hat x_2,\hat x_3) - w_p = 0 \qquad (122)$$

(2)

$$\frac{\partial^2 \mathcal L}{\partial x_i\,\partial x_j}(\hat x_1,\hat x_2,\hat x_3)\ \text{positive semi-definite for}\ i,j \in \{1,2,3\}. \qquad (123)$$

Collecting the second derivatives, with f_i = ∂L/∂x_i,

$$\mathbf F' := \mathbf H = \left[\frac{\partial f_i}{\partial x_j}\right]_{i,j=1,\dots,4} = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \dfrac{\partial f_1}{\partial x_3} & \dfrac{\partial f_1}{\partial x_4} \\[1ex] \dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \dfrac{\partial f_2}{\partial x_3} & \dfrac{\partial f_2}{\partial x_4} \\[1ex] \dfrac{\partial f_3}{\partial x_1} & \dfrac{\partial f_3}{\partial x_2} & \dfrac{\partial f_3}{\partial x_3} & \dfrac{\partial f_3}{\partial x_4} \\[1ex] \dfrac{\partial f_4}{\partial x_1} & \dfrac{\partial f_4}{\partial x_2} & \dfrac{\partial f_4}{\partial x_3} & \dfrac{\partial f_4}{\partial x_4} \end{bmatrix}$$

The partial derivatives ∂W/∂φ and ∂W/∂u can be readily derived from Eq. (118) as follows:

$$\frac{\partial W}{\partial \varphi} = \frac{\partial W}{\partial x_2} = \Omega^2 a^2\,\frac{\left(3\frac{x_3^2}{\varepsilon^2}+1\right)\operatorname{arccot}\frac{x_3}{\varepsilon} - 3\frac{x_3}{\varepsilon}}{\left(3\frac{b^2}{\varepsilon^2}+1\right)\operatorname{arccot}\frac{b}{\varepsilon} - 3\frac{b}{\varepsilon}}\,\sin x_2\cos x_2 - \Omega^2\left(x_3^2+\varepsilon^2\right)\sin x_2\cos x_2 \qquad (124)$$

$$\frac{\partial W}{\partial u} = \frac{\partial W}{\partial x_3} = -\frac{GM}{x_3^2+\varepsilon^2} + \frac{1}{6}\,\Omega^2 a^2\,\frac{\dfrac{6x_3}{\varepsilon^2}\operatorname{arccot}\dfrac{x_3}{\varepsilon} - \dfrac{2\left(3x_3^2+2\varepsilon^2\right)}{\varepsilon\left(x_3^2+\varepsilon^2\right)}}{\left(3\frac{b^2}{\varepsilon^2}+1\right)\operatorname{arccot}\frac{b}{\varepsilon} - 3\frac{b}{\varepsilon}}\left(3\sin^2 x_2 - 1\right) + \Omega^2 x_3\cos^2 x_2 \qquad (125)$$


Equations (122) build up the variational equations of the optimization problem of Eq. (115). The system of Eq. (122) is nonlinear, and its Brook-Taylor expansion reads

$$\mathbf F(\mathbf x) = \mathbf F(\mathbf x_0) + \frac{1}{1!}\,\mathbf F'(\mathbf x_0)(\mathbf x - \mathbf x_0) + \frac{1}{2!}\,\mathbf F''(\mathbf x_0)(\mathbf x - \mathbf x_0)\otimes(\mathbf x - \mathbf x_0) + O^3\big((\mathbf x - \mathbf x_0)\otimes(\mathbf x - \mathbf x_0)\otimes(\mathbf x - \mathbf x_0)\big)$$
$$= \mathbf F_0 + \mathbf J_0(\mathbf x - \mathbf x_0) + \frac{1}{2}\,\mathbf H_0(\mathbf x - \mathbf x_0)\otimes(\mathbf x - \mathbf x_0) + O^3 \qquad (126)$$

where

$$\mathbf F = \begin{bmatrix} f_1(x_1,x_2,x_3,x_4) \\ f_2(x_1,x_2,x_3,x_4) \\ f_3(x_1,x_2,x_3,x_4) \\ f_4(x_1,x_2,x_3,x_4) \end{bmatrix}, \qquad (127a)$$

$$\mathbf x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}, \qquad (127b)$$

$$\mathbf F' := \mathbf J = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \dfrac{\partial f_1}{\partial x_3} & \dfrac{\partial f_1}{\partial x_4} \\[1ex] \dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \dfrac{\partial f_2}{\partial x_3} & \dfrac{\partial f_2}{\partial x_4} \\[1ex] \dfrac{\partial f_3}{\partial x_1} & \dfrac{\partial f_3}{\partial x_2} & \dfrac{\partial f_3}{\partial x_3} & \dfrac{\partial f_3}{\partial x_4} \\[1ex] \dfrac{\partial f_4}{\partial x_1} & \dfrac{\partial f_4}{\partial x_2} & \dfrac{\partial f_4}{\partial x_3} & \dfrac{\partial f_4}{\partial x_4} \end{bmatrix} \qquad (127c)$$

and ⊗ is the symbol for the Kronecker tensor product. The Newton iteration, aiming at the solution of F(x) = 0, can be performed by the sequence

$$\mathbf x - \mathbf x_0 = \mathbf J_0^{-1}(\mathbf F - \mathbf F_0) = (\mathbf J(\mathbf x_0))^{-1}(\mathbf F - \mathbf F_0) \qquad (128)$$
$$\Rightarrow\ \mathbf x_1 = \mathbf x_0 + \mathbf J_0^{-1}(\mathbf F - \mathbf F_0) \qquad (129)$$
$$\Rightarrow\ \mathbf x_2 = \mathbf x_1 + \mathbf J_1^{-1}(\mathbf F - \mathbf F_1) \qquad (130)$$
$$\Rightarrow\ \cdots\ \Rightarrow\ \mathbf x_n = \mathbf x_{n-1} + \mathbf J_{n-1}^{-1}(\mathbf F - \mathbf F_{n-1}), \qquad (131)$$

with F = 0 the target value of the variational equations,


where the Jacobian matrix of the linearized form of the variational equations (122) is given, with the abbreviation B := √(x₃² + ε²), by the entries

$$\frac{\partial f_1}{\partial x_1} = 2B\cos x_2\left(x_p\cos x_1 + y_p\sin x_1\right) \qquad (132)$$

$$\frac{\partial f_1}{\partial x_2} = \frac{\partial f_2}{\partial x_1} = 2B\sin x_2\left(y_p\cos x_1 - x_p\sin x_1\right) \qquad (133)$$

$$\frac{\partial f_1}{\partial x_3} = \frac{\partial f_3}{\partial x_1} = \frac{2x_3\cos x_2\left(x_p\sin x_1 - y_p\cos x_1\right)}{B}, \qquad \frac{\partial f_1}{\partial x_4} = \frac{\partial f_4}{\partial x_1} = 0 \qquad (134)$$

$$\frac{\partial f_2}{\partial x_2} = 2B^2\sin^2 x_2\cos^2 x_1 + 2\left(x_p - B\cos x_2\cos x_1\right)B\cos x_2\cos x_1 + 2B^2\sin^2 x_2\sin^2 x_1$$
$$\quad + 2\left(y_p - B\cos x_2\sin x_1\right)B\cos x_2\sin x_1 + 2x_3^2\cos^2 x_2 + 2\left(z_p - x_3\sin x_2\right)x_3\sin x_2 + x_4\,\frac{\partial^2 W}{\partial x_2^2} \qquad (135)$$

where, from Eq. (124),

$$\frac{\partial^2 W}{\partial x_2^2} = \left[\Omega^2 a^2\,\frac{\left(3\frac{x_3^2}{\varepsilon^2}+1\right)\operatorname{arccot}\frac{x_3}{\varepsilon} - 3\frac{x_3}{\varepsilon}}{\left(3\frac{b^2}{\varepsilon^2}+1\right)\operatorname{arccot}\frac{b}{\varepsilon} - 3\frac{b}{\varepsilon}} - \Omega^2\left(x_3^2+\varepsilon^2\right)\right]\left(\cos^2 x_2 - \sin^2 x_2\right).$$


$$\begin{aligned}
\frac{\partial f_2}{\partial x_3}={}&-2x_3\sin x_2\cos x_2\cos^2 x_1
+2\bigl[x_p-(x_3^2+\varepsilon^2)^{1/2}\cos x_2\cos x_1\bigr]\frac{x_3\sin x_2\cos x_1}{(x_3^2+\varepsilon^2)^{1/2}}\\
&-2x_3\sin x_2\cos x_2\sin^2 x_1
+2\bigl[y_p-(x_3^2+\varepsilon^2)^{1/2}\cos x_2\sin x_1\bigr]\frac{x_3\sin x_2\sin x_1}{(x_3^2+\varepsilon^2)^{1/2}}\\
&+2x_3\cos x_2\sin x_2-2\,(z_p-x_3\sin x_2)\cos x_2\\
&+x_4\Biggl[\Omega^2 a^2\,\frac{\dfrac{6x_3}{\varepsilon^2}\operatorname{arccot}\dfrac{x_3}{\varepsilon}-\dfrac{3x_3^2/\varepsilon^2+1}{\varepsilon\,(1+x_3^2/\varepsilon^2)}-\dfrac{3}{\varepsilon}}{(3b^2/\varepsilon^2+1)\operatorname{arccot}(b/\varepsilon)-3b/\varepsilon}\,\sin x_2\cos x_2
-2\,\Omega^2 x_3\cos x_2\sin x_2\Biggr]
\end{aligned} \qquad (136)$$

$$\frac{\partial f_2}{\partial x_4}=\Omega^2 a^2\,\frac{(3x_3^2/\varepsilon^2+1)\operatorname{arccot}(x_3/\varepsilon)-3x_3/\varepsilon}{(3b^2/\varepsilon^2+1)\operatorname{arccot}(b/\varepsilon)-3b/\varepsilon}\,\sin x_2\cos x_2-\Omega^2\,(x_3^2+\varepsilon^2)\cos x_2\sin x_2 \qquad (137)$$

$$\frac{\partial f_3}{\partial x_1}=2\,\frac{x_3\cos x_2\,(x_p\sin x_1-y_p\cos x_1)}{(x_3^2+\varepsilon^2)^{1/2}} \qquad (138)$$

$$\begin{aligned}
\frac{\partial f_3}{\partial x_2}={}&-2x_3\sin x_2\cos x_2\cos^2 x_1
+2\bigl[x_p-(x_3^2+\varepsilon^2)^{1/2}\cos x_2\cos x_1\bigr]\frac{x_3\sin x_2\cos x_1}{(x_3^2+\varepsilon^2)^{1/2}}\\
&-2x_3\sin x_2\cos x_2\sin^2 x_1
+2\bigl[y_p-(x_3^2+\varepsilon^2)^{1/2}\cos x_2\sin x_1\bigr]\frac{x_3\sin x_2\sin x_1}{(x_3^2+\varepsilon^2)^{1/2}}\\
&+2x_3\cos x_2\sin x_2-2\,(z_p-x_3\sin x_2)\cos x_2\\
&+x_4\Biggl[\Omega^2 a^2\,\frac{\dfrac{6x_3}{\varepsilon^2}\operatorname{arccot}\dfrac{x_3}{\varepsilon}-\dfrac{3x_3^2/\varepsilon^2+1}{\varepsilon\,(1+x_3^2/\varepsilon^2)}-\dfrac{3}{\varepsilon}}{(3b^2/\varepsilon^2+1)\operatorname{arccot}(b/\varepsilon)-3b/\varepsilon}\,\sin x_2\cos x_2
-2\,\Omega^2 x_3\cos x_2\sin x_2\Biggr]
\end{aligned} \qquad (139)$$


$$\begin{aligned}
\frac{\partial f_3}{\partial x_3}={}&\frac{2x_3^2\cos^2 x_2\cos^2 x_1}{x_3^2+\varepsilon^2}
+2\bigl[x_p-(x_3^2+\varepsilon^2)^{1/2}\cos x_2\cos x_1\bigr]\frac{x_3^2\cos x_2\cos x_1}{(x_3^2+\varepsilon^2)^{3/2}}
-2\bigl[x_p-(x_3^2+\varepsilon^2)^{1/2}\cos x_2\cos x_1\bigr]\frac{\cos x_2\cos x_1}{(x_3^2+\varepsilon^2)^{1/2}}\\
&+\frac{2x_3^2\cos^2 x_2\sin^2 x_1}{x_3^2+\varepsilon^2}
+2\bigl[y_p-(x_3^2+\varepsilon^2)^{1/2}\cos x_2\sin x_1\bigr]\frac{x_3^2\cos x_2\sin x_1}{(x_3^2+\varepsilon^2)^{3/2}}
-2\bigl[y_p-(x_3^2+\varepsilon^2)^{1/2}\cos x_2\sin x_1\bigr]\frac{\cos x_2\sin x_1}{(x_3^2+\varepsilon^2)^{1/2}}\\
&+2\sin^2 x_2
+x_4\Biggl[\frac{2GM\,x_3}{\varepsilon^4\,(1+x_3^2/\varepsilon^2)^2}
+\frac{1}{6}\,\Omega^2 a^2\,\frac{\dfrac{6}{\varepsilon^2}\operatorname{arccot}\dfrac{x_3}{\varepsilon}-\dfrac{12x_3}{\varepsilon^3\,(1+x_3^2/\varepsilon^2)}+\dfrac{2\,(3x_3^2/\varepsilon^2+1)\,x_3}{\varepsilon^3\,(1+x_3^2/\varepsilon^2)^2}}{(3b^2/\varepsilon^2+1)\operatorname{arccot}(b/\varepsilon)-3b/\varepsilon}\,\bigl(3\sin^2 x_2-1\bigr)
+\Omega^2\cos^2 x_2\Biggr]
\end{aligned} \qquad (140)$$

$$\frac{\partial f_3}{\partial x_4}=-\frac{GM}{\varepsilon^2\,(1+x_3^2/\varepsilon^2)}+\frac{1}{6}\,\Omega^2 a^2\,\frac{\dfrac{6x_3}{\varepsilon^2}\operatorname{arccot}\dfrac{x_3}{\varepsilon}-\dfrac{3x_3^2/\varepsilon^2+1}{\varepsilon\,(1+x_3^2/\varepsilon^2)}-\dfrac{3}{\varepsilon}}{(3b^2/\varepsilon^2+1)\operatorname{arccot}(b/\varepsilon)-3b/\varepsilon}\,\bigl(3\sin^2 x_2-1\bigr)+\Omega^2 x_3\cos^2 x_2 \qquad (141)$$

$$\frac{\partial f_4}{\partial x_1}=0 \qquad (142)$$

$$\frac{\partial f_4}{\partial x_2}=\Omega^2 a^2\,\frac{(3x_3^2/\varepsilon^2+1)\operatorname{arccot}(x_3/\varepsilon)-3x_3/\varepsilon}{(3b^2/\varepsilon^2+1)\operatorname{arccot}(b/\varepsilon)-3b/\varepsilon}\,\sin x_2\cos x_2-\Omega^2\,(x_3^2+\varepsilon^2)\cos x_2\sin x_2 \qquad (143)$$


$$\frac{\partial f_4}{\partial x_3}=-\frac{GM}{\varepsilon^2\,(1+x_3^2/\varepsilon^2)}+\frac{1}{6}\,\Omega^2 a^2\,\frac{\dfrac{6x_3}{\varepsilon^2}\operatorname{arccot}\dfrac{x_3}{\varepsilon}-\dfrac{3x_3^2/\varepsilon^2+1}{\varepsilon\,(1+x_3^2/\varepsilon^2)}-\dfrac{3}{\varepsilon}}{(3b^2/\varepsilon^2+1)\operatorname{arccot}(b/\varepsilon)-3b/\varepsilon}\,\bigl(3\sin^2 x_2-1\bigr)+\Omega^2 x_3\cos^2 x_2 \qquad (144)$$

$$\frac{\partial f_4}{\partial x_4}=0 \qquad (145)$$

The solution set (x̂1, x̂2, x̂3, x̂4) obtained in the final step of the Newton iteration of Eq. (131) provides the necessary condition of Eq. (122) for a minimal solution. The external solution is minimal if the condition of Eq. (123) is satisfied. Indeed, we have to prove that the Hesse matrix H_L of second derivatives is positive semidefinite, i.e. that the characteristic polynomial det(H_L − λI) = 0 has three non-negative eigenvalues. We have excluded the explicit form and computation of the Hesse matrix H_L here; the test of H_L ≥ 0 can be obtained from the author (Baranov [6]).
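The positive-semidefiniteness test H_L ≥ 0 amounts to checking that all eigenvalues of the symmetrized Hesse matrix are non-negative. A minimal numerical sketch, with a made-up 3 × 3 matrix standing in for H_L:

```python
import numpy as np

def is_positive_semidefinite(H, tol=1e-10):
    """Check H >= 0 via the eigenvalues of the symmetrized matrix."""
    eigvals = np.linalg.eigvalsh(0.5 * (H + H.T))
    return bool(np.all(eigvals >= -tol)), eigvals

# Made-up symmetric 3x3 stand-in for the Hesse matrix H_L
H = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
ok, ev = is_positive_semidefinite(H)
print(ok, ev)  # True, eigenvalues 2-sqrt(2), 2, 2+sqrt(2)
```

`eigvalsh` is the appropriate routine here because the Hesse matrix of a twice continuously differentiable Lagrangian is symmetric.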

5.2

Case Study: Potential Quasi-geoid of East Germany

Next, we present the results of the minimum-distance mapping of the physical surface of the Earth to the Somigliana-Pizzetti telluroid for 196 GPS stations in the eastern part of Germany. Figure 8 shows the geographical distribution of the GPS stations, and Table 11 lists the first 10 records of the GPS file of East Germany. The coordinates are given in terms of Gauss ellipsoidal coordinates {l, b, h} with respect to the GRS80 reference ellipsoid. The Gauss ellipsoidal coordinates {l, b, h} of the 196 GPS stations are converted to Jacobi ellipsoidal coordinates {λ, φ, u} according to the forward transformation equations given in Sect. 3. Table 12 presents the Jacobi ellipsoidal coordinates of the sample stations given in Table 11. The Newton-Raphson iteration solution of the normal equations (122) led to a point-wise telluroid mapping of all GPS stations in East Germany. A portion of the results for the first 10 GPS stations is presented in Table 13. Columns 1-3 refer to the Jacobi ellipsoidal coordinates of the telluroid projection points. Column 4 presents the difference between the u component of a GPS station and its telluroid projection. Finally, column 5 shows the projection of (up − Up) along the unit vector Eu. The geometric height H = (up − Up)√G33, representing the separation between the surface of the Earth and the Molodensky telluroid, is considered as the height above the reference ellipsoid and produces by definition the quasi-geoid map shown in Fig. 9. A part of the European Gravimetric Quasi-Geoid (EGG97) covering East Germany is also shown in Fig. 10. If we compare the Jacobi ellipsoidal coordinates of type longitude, namely λp and Λp, and latitude, namely φp and Φp, of p (surface points) and P (telluroid points) for the sample stations of Tables 12 and 13, we


Fig. 8 The 196 GPS stations in the eastern part of Germany. Equidistant conic projection; standard parallels: 50°N and 52.5°N; reference ellipsoid: WGD2000. (Grafarend and Ardalan [24])

Table 11 Part of the GPS file of the eastern part of Germany

Longitude (lp) [deg] | Latitude (bp) [deg] | Ellipsoidal height (hp) [m] | Geopotential number [m²/s²]
13.4363 | 54.6772 | 82.295 | 455.75
13.6433 | 54.5136 | 68.192 | 318.76
12.5016 | 54.4716 | 39.663 | 21.20
13.0076 | 54.4256 | 48.669 | 115.93
13.4252 | 54.4172 | 92.697 | 553.89
13.2909 | 54.3508 | 42.444 | 58.39
12.7371 | 54.2981 | 50.352 | 127.90
13.6586 | 54.2971 | 101.363 | 641.58
13.0796 | 54.2511 | 55.762 | 185.67
12.4053 | 54.2495 | 40.304 | 23.93


Table 12 Transferred Jacobi ellipsoidal coordinates {λ, φ, u}p of the Gauss ellipsoidal coordinates given in Table 11

λp [deg] | φp [deg] | up [m]
13.4363 | 54.5864 | 6,356,834.2477
13.6433 | 54.4226 | 6,356,820.1294
12.5016 | 54.3805 | 6,356,791.5681
13.0076 | 54.3345 | 6,356,800.5844
13.4252 | 54.3261 | 6,356,844.6627
13.2909 | 54.2596 | 6,356,794.3526
12.7371 | 54.2069 | 6,356,802.2698
13.6586 | 54.2059 | 6,356,853.3393
13.0796 | 54.1598 | 6,356,807.6861
12.4053 | 54.1582 | 6,356,792.2103

Table 13 Telluroid mapping of the sample GPS stations of Table 12

ΛP [deg] | ΦP [deg] | UP [m] | up − Up [m] | (up − Up)√G33 [m]
13.4363 | 54.58637 | 6,356,798.3508 | 35.8969 | 35.8565
13.6433 | 54.42259 | 6,356,784.3782 | 35.7512 | 35.7107
12.5016 | 54.38054 | 6,356,754.0257 | 37.5424 | 37.4998
13.0076 | 54.33449 | 6,356,763.6887 | 36.8958 | 36.8538
13.4252 | 54.32609 | 6,356,808.3635 | 36.2992 | 36.2579
13.2909 | 54.25961 | 6,356,757.8193 | 36.5332 | 36.4914
12.7371 | 54.20686 | 6,356,764.9099 | 37.3598 | 37.3170
13.6586 | 54.20586 | 6,356,817.3096 | 36.0296 | 35.9883
13.0796 | 54.15981 | 6,356,770.8030 | 36.8831 | 36.8407
12.4053 | 54.15821 | 6,356,754.3042 | 37.9061 | 37.8625

immediately realise that these Jacobi ellipsoidal coordinates agree with each other up to first order. We can say that the minimum-distance mapping has reproduced the original Molodensky mapping λp = Λp, φp = Φp up to first order. Finally, the calculated quasi-geoid is compared with the new European Gravimetric Quasi-Geoid (EGG97). A summary of the statistics of this comparison is given in Table 14, and the contour map of the differences is shown in Fig. 11. Now let us compare the results of the minimum-distance mapping with what we obtain by using the formula ζ = T/γ (Vaníček and Krakiwsky 1986, p. 494, formula (21.10)). In Table 15 we have computed such a telluroid height for the points given in Table 11. As can be seen from Table 15, this formula gives results quite different from the telluroid heights obtained by minimum-distance telluroid mapping and, of course, from those of EGG97! This answers the question why the formula ζ = T/γ based on γ = GM/r² cannot be considered as a means for telluroid computation, and why we have taken the trouble to develop the minimum-distance telluroid mapping!



Fig. 9 Quasi-geoid map of East Germany, based on the minimum-distance mapping of the physical surface of the Earth to the Somigliana-Pizzetti telluroid. The quasi-geoid undulations are in the interval 35.609 to 47.501 m. Equidistant conic projection; standard parallels: 50◦ N and 52.5◦ N; reference ellipsoid: WGD2000 [24]

5.3

A Short Introduction to: “Mixed Elliptic-Trigonometric Ellipsoidal Coordinates” and “Jacobi Ellipsoidal Coordinates”

With you, the reader of this advanced text, in mind, we decided to include for this section a short introductory text on Jacobi spheroidal coordinates as well as Gauss ellipsoidal coordinates.

5.4

Mixed Elliptic-Trigonometric Ellipsoidal Coordinates {λ, φ, u}

In terms of Jacobi ellipsoidal coordinates {λ, φ, u}, a point in space is located as the intersection point of three coordinate surfaces. The coordinate surfaces are formulated in Definition 4.


Fig. 10 Quasi-geoid height of EGG97 over East Germany. Equidistant conic projection; standard parallels: 50°N and 52.5°N; reference ellipsoid: WGD2000 [24]

Table 14 Statistics of the quasi-geoid height of EGG97 minus the calculated quasi-geoid based on the minimum-distance mapping of the physical surface of the Earth onto the telluroid at 196 GPS stations in East Germany

Statistics of ζEGG97 − (u − Up)√G33 | [m]
Mean | −0.0550
Std | 0.0389
Max | 0.0299
Min | −0.1446
Number of sample points | 196

Definition 4. Coordinate surfaces of Jacobi ellipsoidal coordinates {λ, φ, u}

1. The family of confocal, oblate spheroids

$$E^2_{\sqrt{u^2+\varepsilon^2},\,u}:=\Bigl\{\,x\in\mathbb{R}^3\ \Big|\ \frac{x^2+y^2}{u^2+\varepsilon^2}+\frac{z^2}{u^2}=1,\ u\in(0,+\infty)\,\Bigr\},\qquad \varepsilon:=\sqrt{a^2-b^2} \qquad (146)$$


Fig. 11 Quasi-geoid height of EGG97 minus the calculated quasi-geoid based on the minimum-distance mapping of the physical surface of the Earth onto the telluroid. The difference is in the range of −0.1446 to 0.0229 m. Equidistant conic projection; standard parallels: 50°N and 52.5°N; reference ellipsoid: WGD2000. (Grafarend and Ardalan [24])

Table 15 Computation of the telluroid heights based on the formula ζ = T/γ = (W(λ, φ, r) − GM/r)/(GM/r²) for the sample points given in Table 11

Longitude (lp) [deg] | Latitude (bp) [deg] | Ellipsoidal height (hp) [m] | Telluroid height (ζp) [m]
13.4363 | 54.6772 | 82.295 | 304.143
13.6433 | 54.5136 | 68.192 | 361.790
12.5016 | 54.4716 | 39.663 | 378.358
13.0076 | 54.4256 | 48.669 | 394.029
13.4252 | 54.4172 | 92.697 | 396.536
13.2909 | 54.3508 | 42.444 | 420.156
12.7371 | 54.2981 | 50.352 | 439.691
13.6586 | 54.2971 | 101.363 | 438.868
13.0796 | 54.2511 | 55.762 | 455.909
12.4053 | 54.2495 | 40.304 | 457.451


2. The family of confocal, oblate hyperboloids

$$H^2_{\varepsilon\cos\phi,\,\varepsilon\sin\phi}:=\Bigl\{\,x\in\mathbb{R}^3\ \Big|\ \frac{x^2+y^2}{\varepsilon^2\cos^2\phi}-\frac{z^2}{\varepsilon^2\sin^2\phi}=1,\ \phi\in\Bigl(-\frac{\pi}{2},+\frac{\pi}{2}\Bigr),\ \phi\neq0\,\Bigr\} \qquad (147)$$

3. The family of half planes

$$P^2_{\cos\lambda,\,\sin\lambda}:=\bigl\{\,x\in\mathbb{R}^3\ \big|\ y=x\tan\lambda,\ \lambda\in[0,2\pi)\,\bigr\} \qquad (148)$$

The longitude λ gives orientation to the half planes. The latitude φ is related to the inclination of the asymptotes of the confocal half hyperboloids; the elliptic coordinate u coincides with the semi-minor axis of the confocal oblate spheroids (confocal, oblate ellipsoids of revolution).

Definition 5. Conversion of Cartesian coordinates {x, y, z} into ellipsoidal coordinates {λ, φ, u}

(1) Forward transformation of ellipsoidal coordinates {λ, φ, u} into Cartesian coordinates {x, y, z}

$$x=\sqrt{u^2+\varepsilon^2}\,\cos\phi\cos\lambda,\qquad y=\sqrt{u^2+\varepsilon^2}\,\cos\phi\sin\lambda,\qquad z=u\sin\phi \qquad (149)$$

(2) Backward transformation of Cartesian coordinates {x, y, z} into ellipsoidal coordinates {λ, φ, u}

$$\lambda=\begin{cases}\arctan\dfrac{y}{x}&\text{for } x>0\text{ and } y\ge0\\[1ex] \arctan\dfrac{y}{x}+\pi&\text{for } x<0\\[1ex] \arctan\dfrac{y}{x}+2\pi&\text{for } x>0\text{ and } y<0\\[1ex] \dfrac{\pi}{2}&\text{for } x=0\text{ and } y>0\\[1ex] \dfrac{3\pi}{2}&\text{for } x=0\text{ and } y<0\end{cases} \qquad (150)$$

$$\phi=(\operatorname{sgn}z)\arcsin\Biggl\{\frac{1}{2\varepsilon^2}\Bigl[\varepsilon^2-(x^2+y^2+z^2)+\bigl((x^2+y^2+z^2-\varepsilon^2)^2+4\varepsilon^2z^2\bigr)^{1/2}\Bigr]\Biggr\}^{1/2} \qquad (151)$$

$$u=\Biggl\{\frac{1}{2}\Bigl[x^2+y^2+z^2-\varepsilon^2+\bigl((x^2+y^2+z^2-\varepsilon^2)^2+4\varepsilon^2z^2\bigr)^{1/2}\Bigr]\Biggr\}^{1/2} \qquad (152)$$
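Equations (149)-(152) can be checked by a numerical round trip. The sketch below assumes the linear eccentricity value quoted later in this section and the first sample station of Table 12:

```python
import math

EPS = 521_854.677  # linear eccentricity ε = sqrt(a² - b²) in metres (value quoted in the text)

def jacobi_to_cartesian(lam, phi, u):
    """Forward transformation, Eq. (149)."""
    w = math.sqrt(u * u + EPS * EPS)
    return (w * math.cos(phi) * math.cos(lam),
            w * math.cos(phi) * math.sin(lam),
            u * math.sin(phi))

def cartesian_to_jacobi(x, y, z):
    """Backward transformation, Eqs. (150)-(152)."""
    lam = math.atan2(y, x) % (2.0 * math.pi)
    s = x * x + y * y + z * z
    root = math.sqrt((s - EPS ** 2) ** 2 + 4.0 * EPS ** 2 * z * z)
    u = math.sqrt(0.5 * (s - EPS ** 2 + root))
    phi = math.copysign(math.asin(math.sqrt(0.5 * (EPS ** 2 - s + root)) / EPS), z)
    return lam, phi, u

# First station of Table 12 (λ, φ in degrees, u in metres)
lam, phi, u = math.radians(13.4363), math.radians(54.5864), 6_356_834.2477
x, y, z = jacobi_to_cartesian(lam, phi, u)
lam2, phi2, u2 = cartesian_to_jacobi(x, y, z)
print(abs(lam - lam2), abs(phi - phi2), abs(u - u2))  # all ≈ 0
```

Note the mild cancellation in ε² − s + √(…); for points far outside the focal disk the recovered φ is still accurate to roughly machine precision.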

Definition 6. Geometry of ellipsoidal coordinates {λ, φ, u}

(1) Jacobi matrix of the transformation of ellipsoidal coordinates {λ, φ, u} into Cartesian coordinates {x, y, z}. The Jacobi matrix J of this transformation is constructed as

$$J:=\begin{bmatrix}X_\lambda & X_\phi & X_u\\ Y_\lambda & Y_\phi & Y_u\\ Z_\lambda & Z_\phi & Z_u\end{bmatrix} \qquad (153)$$

The partial derivatives involved in Eq. (153) are as follows:

$$\begin{aligned}
X_\lambda&=D_\lambda X=-\sqrt{u^2+\varepsilon^2}\,\cos\phi\sin\lambda, &
X_\phi&=D_\phi X=-\sqrt{u^2+\varepsilon^2}\,\sin\phi\cos\lambda, &
X_u&=D_u X=\frac{u}{\sqrt{u^2+\varepsilon^2}}\,\cos\phi\cos\lambda,\\
Y_\lambda&=D_\lambda Y=\sqrt{u^2+\varepsilon^2}\,\cos\phi\cos\lambda, &
Y_\phi&=D_\phi Y=-\sqrt{u^2+\varepsilon^2}\,\sin\phi\sin\lambda, &
Y_u&=D_u Y=\frac{u}{\sqrt{u^2+\varepsilon^2}}\,\cos\phi\sin\lambda,\\
Z_\lambda&=D_\lambda Z=0, &
Z_\phi&=D_\phi Z=u\cos\phi, &
Z_u&=D_u Z=\sin\phi
\end{aligned} \qquad (154)$$

(2) The metric tensor

$$dS^2=[d\lambda,\ d\phi,\ du]\;J^{*}J\begin{bmatrix}d\lambda\\ d\phi\\ du\end{bmatrix}$$

$$G:=J^{*}J=\begin{bmatrix}(u^2+\varepsilon^2)\cos^2\phi & 0 & 0\\ 0 & u^2+\varepsilon^2\sin^2\phi & 0\\ 0 & 0 & \dfrac{u^2+\varepsilon^2\sin^2\phi}{u^2+\varepsilon^2}\end{bmatrix}=:g_{nm}\qquad\forall\,n,m=1,2,3 \qquad (155)$$


(3) The Laplacian

$$\Delta=\frac{1}{\sqrt{g}}\Biggl[\frac{\partial}{\partial\lambda}\Bigl(\frac{\sqrt{g}}{g_{11}}\frac{\partial}{\partial\lambda}\Bigr)+\frac{\partial}{\partial\phi}\Bigl(\frac{\sqrt{g}}{g_{22}}\frac{\partial}{\partial\phi}\Bigr)+\frac{\partial}{\partial u}\Bigl(\frac{\sqrt{g}}{g_{33}}\frac{\partial}{\partial u}\Bigr)\Biggr]
=\frac{1}{u^2+\varepsilon^2\sin^2\phi}\Bigl[-\tan\phi\,\frac{\partial}{\partial\phi}+\frac{\partial^2}{\partial\phi^2}+2u\,\frac{\partial}{\partial u}+(u^2+\varepsilon^2)\frac{\partial^2}{\partial u^2}\Bigr]+\frac{1}{(u^2+\varepsilon^2)\cos^2\phi}\frac{\partial^2}{\partial\lambda^2} \qquad (156)$$

Figure 12 illustrates the coordinate lines of the Jacobi ellipsoidal coordinates u and φ in the plane λ = 0°. Because of the rotational symmetry of the system, this configuration is the same for any plane λ = const. The following conclusions can be drawn: (1) the coordinate line of φ for u = b = 6,356,751.860 m, where the linear eccentricity ε := √(a² − b²) = 521,854.677 m is selected, is very close to a circle; (2) the coordinate line u [(λ, φ) = const.] for u > 2,000,000 m is almost a straight line. This fact can also be seen in terms of the variation of the curvature of the coordinate line φ = const. in the plane λ = 0° versus the Jacobi ellipsoidal coordinate u, as shown in Figs. 13 and 14.

5.4.1 Gauss Ellipsoidal Coordinates {l, b, h}

In terms of Gauss ellipsoidal coordinates, a point in 3-D space is identified by the intersection of two coordinate surfaces of the type plane l = const. and reference ellipsoid E_{a,b}, and by a vertical distance measured from the reference ellipsoid E_{a,b}. Definition 7 reviews the Gauss ellipsoidal coordinates through their relation to the Cartesian coordinates {x, y, z}.

Definition 7. Forward and backward transformation of Gauss ellipsoidal coordinates {l, b, h} into Cartesian coordinates {x, y, z}

Forward transformation

$$x=\Bigl[\frac{a}{\sqrt{1-e^2\sin^2 b}}+h(l,b)\Bigr]\cos b\cos l,\qquad
y=\Bigl[\frac{a}{\sqrt{1-e^2\sin^2 b}}+h(l,b)\Bigr]\cos b\sin l,\qquad
z=\Bigl[\frac{a(1-e^2)}{\sqrt{1-e^2\sin^2 b}}+h(l,b)\Bigr]\sin b \qquad (157)$$



Fig. 12 Jacobi ellipsoidal coordinates {λ, φ, u}: coordinate lines φ = {45°, 60°, 80°} and u = b = 6,356,751.860 m; the linear eccentricity ε = √(a² − b²) = 521,854.667 m is chosen. (Grafarend and Ardalan [24])

Backward transformation

$$l=\arctan\frac{y}{x}\qquad\begin{cases}\operatorname{sgn}x=+,\ \operatorname{sgn}y=+:& 0\le l<\pi/2\\ \operatorname{sgn}x=-,\ \operatorname{sgn}y=+:& \pi/2\le l<\pi\\ \operatorname{sgn}x=-,\ \operatorname{sgn}y=-:& \pi\le l<3\pi/2\\ \operatorname{sgn}x=+,\ \operatorname{sgn}y=-:& 3\pi/2\le l<2\pi\end{cases} \qquad (158)$$

The other two components, i.e. b and h, can be derived by Newton iteration, by solving a system of algebraic equations, or by using closed formulae (see e.g. Heikkinen [31]). Grafarend and Engels [25] have developed a series expansion for the height function H(L, B) in terms of a set of orthogonal functions on E²_{a,b}.
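A minimal sketch of the iterative recovery of b and h mentioned above; the GRS80 constants and the fixed-point scheme are assumptions of this illustration, not the chapter's closed formulae:

```python
import math

A = 6_378_137.0            # semi-major axis [m] (GRS80, assumed here)
E2 = 0.00669438002290      # first eccentricity squared (GRS80, assumed here)

def gauss_to_cartesian(l, b, h):
    """Forward transformation, Eq. (157)."""
    n = A / math.sqrt(1.0 - E2 * math.sin(b) ** 2)
    return ((n + h) * math.cos(b) * math.cos(l),
            (n + h) * math.cos(b) * math.sin(l),
            (n * (1.0 - E2) + h) * math.sin(b))

def cartesian_to_gauss(x, y, z, iterations=10):
    """Backward transformation: l in closed form, (b, h) by fixed-point iteration."""
    l = math.atan2(y, x) % (2.0 * math.pi)
    p = math.hypot(x, y)
    b = math.atan2(z, p * (1.0 - E2))          # initial guess
    for _ in range(iterations):
        n = A / math.sqrt(1.0 - E2 * math.sin(b) ** 2)
        h = p / math.cos(b) - n
        b = math.atan2(z, p * (1.0 - E2 * n / (n + h)))
    return l, b, h

l, b, h = math.radians(13.4363), math.radians(54.6772), 82.295  # first record of Table 11
l2, b2, h2 = cartesian_to_gauss(*gauss_to_cartesian(l, b, h))
print(abs(l - l2), abs(b - b2), abs(h - h2))  # all ≈ 0
```

For heights near the surface of the Earth a handful of iterations already reproduces b and h to well below millimetre level.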


Fig. 13 Variation of the curvature of the coordinate lines φ = {45°, 60°, 80°} in the plane λ = 0° versus the Jacobi ellipsoidal coordinate u [m]. (Grafarend and Ardalan [24])

5.5

Direct Transformation Between Gauss Ellipsoidal Coordinates and Jacobi Ellipsoidal Coordinates

According to Grafarend and Ardalan [24] the following relations exist between Gauss ellipsoidal coordinates {l, b, h} and Jacobi ellipsoidal coordinates {λ, φ, u} (Figs. 15 and 16).

(1) Forward transformation equations {l, b, h} → {λ, φ, u}

$$\lambda=l \qquad (159)$$
$$\phi=\arctan\bigl(\sqrt{1-e^2}\,\tan b\bigr) \qquad (160)$$
$$u=\Bigl[\frac{a(1-e^2)}{(1-e^2\sin^2 b)^{1/2}}+h\Bigr]\frac{\cos b\,\bigl[1+(1-e^2)\tan^2 b\bigr]^{1/2}}{\sqrt{1-e^2}} \qquad (161)$$


Fig. 14 Variation of the curvature of the coordinate lines φ = {45◦ , 60◦ , 80◦ } in the plane λ = 0◦ versus the Jacobi ellipsoidal coordinates zoomed in for u ≥ 2,000,000 m (Grafarend and Ardalan [24])

Fig. 15 Commutative diagram of the transformation among Jacobi ellipsoidal coordinates {λ, φ, u}, Gauss ellipsoidal coordinates {l, b, h} and Cartesian coordinates {x, y, z} (Grafarend and Ardalan [24])


Fig. 16 Transformation of Jacobi ellipsoidal coordinates {λ, φ} into equidistant-conic map projection coordinates {x, y} (Grafarend and Ardalan [24])


(2) Backward transformation equations {λ, φ, u} → {l, b, h}

$$l=\lambda \qquad (162)$$
$$b=\arctan\Bigl(\frac{1}{\sqrt{1-e^2}}\tan\phi\Bigr) \qquad (163)$$
$$h=\sqrt{1-e^2}\,u\cos\phi\,\Bigl[1+\frac{\tan^2\phi}{1-e^2}\Bigr]^{1/2}-a(1-e^2)\Bigl[1-e^2\,\frac{\tan^2\phi}{1-e^2+\tan^2\phi}\Bigr]^{-1/2} \qquad (164)$$
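The forward equations (159)-(161) and the backward equations (162)-(164) are exact algebraic inverses of each other, which can be verified numerically; the ellipsoid constants a and e² below are assumed GRS80 values for this illustration:

```python
import math

A = 6_378_137.0           # semi-major axis [m] (GRS80, assumed)
E2 = 0.00669438002290     # first eccentricity squared (GRS80, assumed)

def gauss_to_jacobi(l, b, h):
    """Forward transformation, Eqs. (159)-(161)."""
    lam = l
    phi = math.atan(math.sqrt(1.0 - E2) * math.tan(b))
    u = ((A * (1.0 - E2) / math.sqrt(1.0 - E2 * math.sin(b) ** 2) + h)
         * math.cos(b) * math.sqrt(1.0 + (1.0 - E2) * math.tan(b) ** 2)
         / math.sqrt(1.0 - E2))
    return lam, phi, u

def jacobi_to_gauss(lam, phi, u):
    """Backward transformation, Eqs. (162)-(164)."""
    l = lam
    b = math.atan(math.tan(phi) / math.sqrt(1.0 - E2))
    t2 = math.tan(phi) ** 2
    h = (math.sqrt(1.0 - E2) * u * math.cos(phi) * math.sqrt(1.0 + t2 / (1.0 - E2))
         - A * (1.0 - E2) / math.sqrt(1.0 - E2 * t2 / (1.0 - E2 + t2)))
    return l, b, h

l, b, h = math.radians(13.4363), math.radians(54.6772), 82.295  # first record of Table 11
lam, phi, u = gauss_to_jacobi(l, b, h)
l2, b2, h2 = jacobi_to_gauss(lam, phi, u)
print(abs(l - l2), abs(b - b2), abs(h - h2))  # all ≈ 0
```

The round trip is closed analytically because sin²b = tan²φ/(1 − e² + tan²φ) under the latitude mapping (160)/(163), so no iteration is needed in this direction.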

We hope that our Sect. 5 on the topic of Jacobi and Gauss ellipsoidal coordinates was helpful to you. You can barely find this subject in geodetic textbooks! [3]

6

The MacCullagh Representation: The Time-Variation of Multipoles of the Gravity Field

There is experimental evidence that the terrestrial gravity field changes over time. For example, Cheng et al. [17] have analysed satellite laser ranging observations to Starlette, Lageos 1 and 2, Ajisai, Etalon 1 and 2, Stella and BE-C and found from satellite orbit perturbations temporal variations of the zonal harmonic coefficients of the terrestrial gravity field, $\dot{j}_2 = -2.7 \pm 0.4\ (10^{-11}/\text{year})$, as well as $\dot{j}_3$, $\dot{j}_4$, $\dot{j}_5$ and $\dot{j}_6$. The MacCullagh formulae represent the incremental Cartesian moments of the mass density δI¹², δI¹³, δI³², as well as the incremental moments of inertia δJ¹², δJ¹³, δJ³², in terms of the incremental gravitational potential coefficients of degree 2 and order −2, −1, +1. Here, we derive the MacCullagh formulae (MacCullagh [41]) valid for a rigid body and generalize them for a deformable body for excitation force fields/excitation potential fields of tidal, loading, centrifugal and surface stress types. Such a generalized MacCullagh representation has been made possible by means of a recently derived integral relation between the Love number k1(R) and the Love-Shida functions h1(r), l1(r) (Grafarend et al. [26], p. 23; here Eq. (50)). Section 6.1 reviews the definition of spherical and Cartesian moments of the mass density field, in particular following Nagel [50] and Spada [61]. The transformation between Cartesian monomes and solid spherical harmonics, namely the derivation of the classical MacCullagh formulae, is the target of Sect. 6.2; the basic results are collected in six propositions. Section 6.3 focuses on the excitation forces of tidal, loading, centrifugal and surface stress types in the context of scalar- and vector-valued spherical harmonics. Section 6.5 introduces four propositions, one theorem and one corollary. They govern the radial, spheroidal and toroidal displacements and the incremental coefficients of the terrestrial gravitational potential generated by (i) the tidal potential, (ii) the loading potential, (iii) the centrifugal potential and (iv) the transverse surface stress within the Love-Shida formalism for a radially symmetric Earth model, in both the time and the Laplace domain. Theorem 1 contains the extended MacCullagh representation of the incremental Cartesian moments, the incremental inertia tensor.

6.1

Definition of Spherical and Cartesian Moments of Mass Density

Outside a bounding sphere (Brillouin sphere) S²_R of radius R the gravitational potential field of the Earth can be represented in terms of scalar spherical harmonics by

$$u(\lambda,\varphi,r)=\lim_{L\to\infty}\sum_{l=0}^{L}\sum_{m=-l}^{+l}\Bigl(\frac{R}{r}\Bigr)^{l+1}e_{l,m}(\lambda,\varphi)\,u_{l,m} \qquad (165)$$

where (λ, ϕ, r) denotes the spherical coordinates (longitude, latitude, distance from the centre of the Earth) and e_{l,m}(λ, ϕ) the fully normalized scalar-valued spherical harmonics. g represents Newton's gravitational constant, and in the following m denotes the total mass of the spherical Earth. If the mass density field ρ(λ, ϕ, r) in the interior domain of the Earth is known, the coefficients of the gravitational potential field can be computed from the Gauss-Newton integral

$$u(\lambda_P,\varphi_P,r_P)=g\int_0^{2\pi}d\lambda_Q\int_{-\pi/2}^{+\pi/2}d\varphi_Q\cos\varphi_Q\int_0^{R}dr_Q\,r_Q^2\;\frac{\rho(\lambda_Q,\varphi_Q,r_Q)}{\bigl\lVert x(\lambda_P,\varphi_P,r_P)-x(\lambda_Q,\varphi_Q,r_Q)\bigr\rVert} \qquad (166)$$

namely

$$u_{l,m}:=\frac{g}{R}\int_0^{2\pi}d\lambda\int_{-\pi/2}^{+\pi/2}d\varphi\cos\varphi\int_0^{R}dr\,\rho(\lambda,\varphi,r)\,\frac{r^{l+2}}{R^{l}}\,\frac{e_{l,m}(\lambda,\varphi)}{2l+1}\qquad\forall\,r_P>R \qquad (167)$$

by means of a series expansion of the reciprocal Euclidean distance ‖x_P − x_Q‖⁻¹. Here x_P denotes the placement vector of the evaluation point P and x_Q the placement vector of the source point Q. Since the mass density field of the Earth changes over time under the influence of volume forces as well as surface forces (stress), the gravitational potential field u(λ, ϕ, r; t), in particular its spherical harmonic coefficients u_{l,m}(t), change over


time. Let us decompose the gravitational potential field u(λ, ϕ, r) as well as its representative coefficients u_{l,m} into a reference (stationary) part and a disturbing (time-dependent) part

$$u(\cdot;t)=u^{(0)}(\cdot)+\delta u(\cdot;t),\qquad u_{l,m}(t)=u_{l,m}^{(0)}+\delta u_{l,m}(t) \qquad (168)$$

The Eulerian increment δu_{l,m} of the spherical harmonic coefficients can be computed from the Eulerian increment δρ(λ, ϕ, r; t) of the mass density field by means of

$$\delta u_{l,m}=\frac{g}{R}\int_0^{2\pi}d\lambda\int_{-\pi/2}^{+\pi/2}d\varphi\cos\varphi\int_0^{R}dr\,\delta\rho(\lambda,\varphi,r)\,\frac{r^{l+2}}{R^{l}}\,\frac{e_{l,m}(\lambda,\varphi)}{2l+1}\qquad\forall\,r_P>R \qquad (169)$$

The incremental coefficients δu_{l,m} are affected, in addition, by a change of the shape of the Earth and of the inner interfaces. These effects are absorbed in the Eulerian incremental mass distribution δρ(λ, ϕ, r; t), namely by 'single layers' of material condensed onto S²_R and the other interior interfaces, e.g. of spherical shape, as reviewed by Grafarend [23]. Following Dixon [19, pp. 511-517] the Cartesian moments I^{i₁…iₙ} of order n as well as their increments δI^{i₁…iₙ} represent the reference (stationary) part ρ₀ as well as the disturbing (time-dependent) part δρ of the mass density distribution by means of

$$I^{i_1 i_2\cdots i_{n-1} i_n}:=\int_V d^3x\,\rho(x)\,x^{i_1}x^{i_2}\cdots x^{i_{n-1}}x^{i_n}\qquad\forall\,i_1,\dots,i_n\in\{1,2,3\} \qquad (170)$$

and

$$\delta I^{i_1 i_2\cdots i_{n-1} i_n}:=\int_{V_t} d^3x_t\,\delta\rho(x;t)\,x^{i_1}x^{i_2}\cdots x^{i_{n-1}}x^{i_n}\qquad\forall\,i_1,\dots,i_n\in\{1,2,3\} \qquad (171)$$

In terms of the Cartesian coordinates X^i and ΔX^i, respectively, of the placement vector and the displacement vector of material particles in their reference position, an equivalent representation of the Cartesian incremental moments can be given in linear approximation by

$$\delta I^{i_1\cdots i_n}=\int_V d^3X\,\rho_0(X)\,\bigl[\Delta X^{i_1}X^{i_2}\cdots X^{i_n}+\cdots+X^{i_1}\cdots X^{i_{n-1}}\Delta X^{i_n}\bigr] \qquad (172)$$


subject to the quantum numbers i₁, …, iₙ ∈ {1, 2, 3}. Note that d³X denotes the volume element at the reference epoch, while d³x_t denotes the volume element at the time instant t. As an alternative to the Cartesian moments I^{i₁…iₙ}, the trace-reduced Cartesian moments of second order J^{kl} as well as their increments δJ^{kl},

$$J^{kl}:=\int_V d^3x\,\rho(x)\,\bigl[\mathbf{x}^2\delta^{kl}-x^k x^l\bigr]=\delta^{kl}I^{jj}-I^{kl} \qquad (173)$$

$$\delta J^{kl}:=\int_{V_t} d^3x_t\,\delta\rho(x;t)\,\bigl[\mathbf{x}^2\delta^{kl}-x^k x^l\bigr]=\delta^{kl}\,\delta I^{jj}-\delta I^{kl}\qquad k,l,j\in\{1,2,3\} \qquad (174)$$

subject to the summation convention over repeated indices, e.g. $\mathbf{x}^2=(x^1)^2+(x^2)^2+(x^3)^2$ and $I^{jj}=I^{11}+I^{22}+I^{33}$, play an essential role in the balance equations of angular momentum and incremental angular momentum, respectively. Note that x = x¹, y = x², z = x³, where z is measured along the rotation axis at the reference epoch.

6.2

Transformation Between Cartesian Monomes and Solid Spherical Harmonics: The MacCullagh Formulae

As is already known, solid spherical harmonics r^l e_{l,m}(λ, ϕ) can be expressed as homogeneous Cartesian polynomials. Substituting such a Cartesian polynomial for the spherical harmonic in the spherical harmonic coefficients of Eq. (167), we can represent the spherical harmonic coefficients u_{l,m} as a sum of Cartesian moments I^{kl} or J^{kl} of the mass density distribution. Let us begin with the presentation of the celebrated Moivre formula:

$$r^l e_{l,m}(\lambda,\varphi)=2^{-l}\sqrt{2(2l+1)\frac{(l-|m|)!}{(l+|m|)!}}\ \sum_{k=0}^{[(l-|m|)/2]}(-1)^k\frac{(2l-2k)!}{k!\,(l-k)!\,(l-|m|-2k)!}\,(x^2+y^2+z^2)^k\,z^{\,l-|m|-2k}\times\begin{cases}\displaystyle\sum_{i=0}^{[m/2]}(-1)^i\binom{m}{2i}x^{m-2i}y^{2i}&\text{for } m>0\\[2ex]\dfrac{1}{\sqrt2}&\text{for } m=0\\[2ex]\displaystyle\sum_{i=0}^{[(|m|-1)/2]}(-1)^i\binom{|m|}{2i+1}x^{|m|-2i-1}y^{2i+1}&\text{for } m<0\end{cases} \qquad (175)$$


Example 1 (Moivre’s formula, m = 2). [m/2] = 1 :   1  2 cos(2λ) = (−1)i (cos λ)2−2i (sin λ)2i 2i i=0 = cos2 λ − sin2 λ

(176a)

[(m − 1)/2] = 0 :   2 0 sin(2λ) = (−1) (cos λ)(sin λ) = 2 sin λ cos λ 1

(176b)

Example 2 (Moivre’s formula, m = 3). [m/2] = 1 : cos(3λ) =

1 i=0

 (−1)

i

3 2i

 (cos λ)3−2i (sin λ)2i

= cos3 λ − 3 cos λ sin2 λ [(m − 1)/2] = 1 : sin(3λ) =



 3 (cos λ)3−2i−1 (sin λ)2i+1 2i + 1 i=0   3 0 = (−1) (cos λ)2 (sin λ) 1   3 + (−1) (cos λ)0 (sin λ)3 3 = 3 cos2 λ sin λ − sin3 λ 1 

(177a)

(−1)i

(177b)
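The finite expansions of cos(mλ) and sin(mλ) used in Examples 1 and 2 can be verified numerically for a few orders m:

```python
import math

def cos_m(m, lam):
    """Finite expansion of cos(m*lam) in powers of cos(lam) and sin(lam)."""
    return sum((-1) ** i * math.comb(m, 2 * i)
               * math.cos(lam) ** (m - 2 * i) * math.sin(lam) ** (2 * i)
               for i in range(m // 2 + 1))

def sin_m(m, lam):
    """Finite expansion of sin(m*lam) in powers of cos(lam) and sin(lam)."""
    return sum((-1) ** i * math.comb(m, 2 * i + 1)
               * math.cos(lam) ** (m - 2 * i - 1) * math.sin(lam) ** (2 * i + 1)
               for i in range((m - 1) // 2 + 1))

lam = 0.7
for m in (2, 3, 4, 5):
    assert abs(cos_m(m, lam) - math.cos(m * lam)) < 1e-12
    assert abs(sin_m(m, lam) - math.sin(m * lam)) < 1e-12
print("multiple-angle expansions check out")
```

The summation bounds [m/2] and [(m − 1)/2] appear here as the Python floor divisions `m // 2` and `(m - 1) // 2`.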

Proposition 1 (Transformation of homogeneous Cartesian polynomials into solid spherical harmonics).

$$e_{l,m}(\lambda,\varphi)=\bar P_{l,|m|}(\sin\varphi)\begin{cases}\cos(m\lambda)&\text{for } m\ge0\\ \sin(|m|\lambda)&\text{for } m<0\end{cases}$$
$$=2^{-l}\sqrt{2(2l+1)\frac{(l-|m|)!}{(l+|m|)!}}\,(\cos\varphi)^{|m|}\sum_{k=0}^{[(l-|m|)/2]}(-1)^k\frac{(2l-2k)!}{k!\,(l-k)!\,(l-|m|-2k)!}\,(\sin\varphi)^{l-|m|-2k}\times\begin{cases}\displaystyle\sum_{i=0}^{[m/2]}(-1)^i\binom{m}{2i}(\cos\lambda)^{m-2i}(\sin\lambda)^{2i}&\text{for } m>0\\[2ex]\dfrac{1}{\sqrt2}&\text{for } m=0\\[2ex]\displaystyle\sum_{i=0}^{[(|m|-1)/2]}(-1)^i\binom{|m|}{2i+1}(\cos\lambda)^{|m|-2i-1}(\sin\lambda)^{2i+1}&\text{for } m<0\end{cases} \qquad (178)$$


In order to prove Proposition 1 we begin with an inversion of Moivre's formula, namely

$$\cos(m\lambda)=\sum_{i=0}^{[m/2]}(-1)^i\binom{m}{2i}(\cos\lambda)^{m-2i}(\sin\lambda)^{2i} \qquad (179a)$$

$$\sin(m\lambda)=\sum_{i=0}^{[(m-1)/2]}(-1)^i\binom{m}{2i+1}(\cos\lambda)^{m-2i-1}(\sin\lambda)^{2i+1} \qquad (179b)$$

[m/2] denotes the largest integer less than or equal to m/2, and [(m − 1)/2] the largest integer less than or equal to (m − 1)/2. Inserting Moivre's formula into the definition of the surface spherical harmonics, we are led to the results of Proposition 1: as soon as we invert x = r cos ϕ cos λ, y = r cos ϕ sin λ, z = r sin ϕ to

$$\cos\lambda=\frac{x}{\sqrt{x^2+y^2}},\quad\sin\lambda=\frac{y}{\sqrt{x^2+y^2}},\quad\cos\varphi=\frac{\sqrt{x^2+y^2}}{\sqrt{x^2+y^2+z^2}},\quad\sin\varphi=\frac{z}{\sqrt{x^2+y^2+z^2}},\quad r=\sqrt{x^2+y^2+z^2} \qquad (180)$$

and substitute Eq. (180) in Eqs. (179a) and (179b), we obtain Eq. (175). Accordingly we present in Table 17 the solid spherical harmonics r^l e^{l,m}(λ, ϕ) in terms of homogeneous Cartesian polynomials up to degree/order 4/4. If we introduce these degree/order 2/2 representations into the definition of the spherical harmonic coefficients in Eq. (167) with respect to the Cartesian moments of Eq. (170), namely I^{kl}, we arrive at Proposition 2.

Table 16 Number of Cartesian monomes versus number of solid spherical harmonics

Degree | Cartesian monomes | Solid spherical harmonics | Rank deficiency
0 | 1 | 1 | 0
1 | 3 | 3 | 0
2 | 6 | 5 | 1
3 | 10 | 7 | 3
4 | 15 | 9 | 6
l | (l + 1)(l + 2)/2 | 2l + 1 | l(l − 1)/2


Proposition 2 (Transformation of Cartesian moments of order 2 into spherical moments of degree/order 2/2).

$$u_{2,-2}=\frac{g\sqrt{15}}{5R^3}\,I^{12},\qquad u_{2,-1}=\frac{g\sqrt{15}}{5R^3}\,I^{23},\qquad u_{2,0}=\frac{g\sqrt{5}}{10R^3}\,\bigl(2I^{33}-I^{11}-I^{22}\bigr),\qquad u_{2,1}=\frac{g\sqrt{15}}{5R^3}\,I^{13},\qquad u_{2,2}=\frac{g\sqrt{15}}{10R^3}\,\bigl(I^{11}-I^{22}\bigr) \qquad (181)$$

For higher degrees/orders of u_{l,m} (l > 2, −l ≤ m ≤ +l), corresponding transformations I^{i₁i₂…iₗ} → u_{l,m} can be derived, but will be presented elsewhere. We only notice that for a degree l ≥ 2 we are left with more Cartesian monomes x^α y^β z^γ subject to α + β + γ = l than solid spherical harmonics. There are

$$\sum_{\alpha=0}^{l}\sum_{\beta=0}^{l-\alpha}1=\frac{(l+1)(l+2)}{2}$$

Cartesian monomes, but only 2l + 1 solid spherical harmonics. The fundamental transformation of Eq. (181) cannot be inverted due to a rank deficiency. Accordingly, the representation of a general Cartesian moment in terms of spherical moments/spherical harmonic potential coefficients does not exist. Table 16 shows details of the rank deficiency by degree.
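The monome count and the resulting rank deficiency can be confirmed by direct enumeration:

```python
# Count Cartesian monomials x^a y^b z^c of total degree l (the exponent c = l - a - b
# is fixed once a and b are chosen) and compare with the 2l + 1 solid spherical
# harmonics of degree l.
for l in range(6):
    monomes = sum(1 for a in range(l + 1) for b in range(l + 1 - a))
    assert monomes == (l + 1) * (l + 2) // 2
    assert monomes - (2 * l + 1) == l * (l - 1) // 2  # rank deficiency
print("monome count (l+1)(l+2)/2 and rank deficiency l(l-1)/2 confirmed")
```

For l = 2 this gives 6 monomials against 5 harmonics, which is exactly the one-dimensional trace ambiguity exploited in the MacCullagh formulae below.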

Table 17 Transformation of homogeneous Cartesian polynomials to solid spherical harmonics up to degree/order 4/4 (arguments (λ, ϕ) suppressed)

$$e^{0,0}=1$$
$$r\,e^{1,-1}=\sqrt3\,y,\qquad r\,e^{1,0}=\sqrt3\,z,\qquad r\,e^{1,1}=\sqrt3\,x$$
$$r^2e^{2,-2}=\sqrt{15}\,xy,\qquad r^2e^{2,-1}=\sqrt{15}\,yz,\qquad r^2e^{2,0}=\frac{\sqrt5}{2}\bigl(2z^2-(x^2+y^2)\bigr),\qquad r^2e^{2,1}=\sqrt{15}\,xz,\qquad r^2e^{2,2}=\frac{\sqrt{15}}{2}\,(x^2-y^2)$$
$$r^3e^{3,-3}=\frac{1}{2}\sqrt{\frac{35}{2}}\,(3x^2y-y^3),\qquad r^3e^{3,-2}=\sqrt{105}\,xyz,\qquad r^3e^{3,-1}=\frac{3}{2}\sqrt{\frac{7}{6}}\,\bigl(4z^2-(x^2+y^2)\bigr)y,\qquad r^3e^{3,0}=\frac{\sqrt7}{2}\,\bigl(2z^2-3(x^2+y^2)\bigr)z$$
$$r^3e^{3,1}=\frac{3}{2}\sqrt{\frac{7}{6}}\,\bigl(4z^2-(x^2+y^2)\bigr)x,\qquad r^3e^{3,2}=\frac{\sqrt{105}}{2}\,z\,(x^2-y^2),\qquad r^3e^{3,3}=\frac{1}{2}\sqrt{\frac{35}{2}}\,(x^3-3xy^2)$$
$$r^4e^{4,-4}=\frac{3\sqrt{35}}{2}\,(x^3y-xy^3),\qquad r^4e^{4,-3}=\frac{3}{2}\sqrt{\frac{35}{2}}\,z\,(3x^2y-y^3),\qquad r^4e^{4,-2}=\frac{3\sqrt5}{2}\,xy\,\bigl(6z^2-(x^2+y^2)\bigr),\qquad r^4e^{4,-1}=\frac{3}{4}\sqrt{10}\,y\,\bigl(4z^3-3(x^2+y^2)z\bigr)$$
$$r^4e^{4,0}=\frac{3}{8}\,\bigl(8z^4-24z^2(x^2+y^2)+3(x^2+y^2)^2\bigr),\qquad r^4e^{4,1}=\frac{3}{4}\sqrt{10}\,x\,\bigl(4z^3-3(x^2+y^2)z\bigr)$$
$$r^4e^{4,2}=\frac{3\sqrt5}{4}\,\bigl(6z^2-(x^2+y^2)\bigr)(x^2-y^2),\qquad r^4e^{4,3}=\frac{3}{2}\sqrt{\frac{35}{2}}\,z\,(x^3-3xy^2),\qquad r^4e^{4,4}=\frac{3\sqrt{35}}{8}\,(x^4-6x^2y^2+y^4)$$


However, for some special monomes or Cartesian moments an inversion from spherical harmonics or spherical moments is possible, namely the one given in Corollary 7 with respect to I¹², I²³, I³¹, respectively. On the contrary, for I¹¹, I²², I³³ corresponding formulae do not exist.

Corollary 7. Transformation of spherical moments/spherical harmonic coefficients to Cartesian moments of order 2

$$I^{12}=\frac{5R^3}{g\sqrt{15}}\,u_{2,-2},\qquad I^{23}=\frac{5R^3}{g\sqrt{15}}\,u_{2,-1},\qquad I^{31}=\frac{5R^3}{g\sqrt{15}}\,u_{2,1} \qquad (182)$$
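Proposition 2 and Corollary 7 invert each other for the off-diagonal moments, which a small round trip illustrates (unit constants g = R = 1 and a made-up symmetric moment matrix):

```python
import math

def u2_coefficients(I, g, R):
    """Degree-2 potential coefficients from second-order Cartesian moments, Eq. (181),
    restricted here to the off-diagonal orders m = -2, -1, +1."""
    c = g * math.sqrt(15.0) / (5.0 * R ** 3)
    return {-2: c * I[0][1], -1: c * I[1][2], +1: c * I[0][2]}

def off_diagonal_moments(u2, g, R):
    """Inversion for the off-diagonal moments, Eq. (182)."""
    c = 5.0 * R ** 3 / (g * math.sqrt(15.0))
    return c * u2[-2], c * u2[-1], c * u2[+1]

# Made-up symmetric second-order moment matrix (illustration only)
I = [[7.0, 1.5, -0.3],
     [1.5, 6.0,  0.8],
     [-0.3, 0.8, 9.0]]
u2 = u2_coefficients(I, g=1.0, R=1.0)
I12, I23, I31 = off_diagonal_moments(u2, g=1.0, R=1.0)
print(I12, I23, I31)  # recovers 1.5, 0.8, -0.3
```

Only the three off-diagonal moments are recoverable; the diagonal moments I¹¹, I²², I³³ are tied to u₂,₀ and u₂,₂ through two equations for three unknowns, which is exactly the rank deficiency stated in the text.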

Alternatively, similar transformations of trace-reduced Cartesian moments/moments of inertia to spherical moments/spherical harmonic coefficients of order 2 and vice versa exist. They are collected in Proposition 3 and Corollary 8, commonly called MacCullagh formulae [41].

Proposition 3. Transformation of trace-reduced Cartesian moments/moments of inertia to spherical moments/spherical harmonic coefficients

$$u_{2,-2}=-\frac{g\sqrt{15}}{5R^3}\,J^{12},\qquad u_{2,-1}=-\frac{g\sqrt{15}}{5R^3}\,J^{23},\qquad u_{2,0}=-\frac{g\sqrt{5}}{10R^3}\,\bigl(2J^{33}-J^{11}-J^{22}\bigr),\qquad u_{2,1}=-\frac{g\sqrt{15}}{5R^3}\,J^{13},\qquad u_{2,2}=-\frac{g\sqrt{15}}{10R^3}\,\bigl(J^{11}-J^{22}\bigr) \qquad (183)$$

Corollary 8. MacCullagh formulae: transformation of spherical moments/spherical harmonic coefficients to trace-reduced Cartesian moments/moments of inertia of order 2

$$J^{12}=-\frac{5R^3}{g\sqrt{15}}\,u_{2,-2},\qquad J^{13}=-\frac{5R^3}{g\sqrt{15}}\,u_{2,1},\qquad J^{23}=-\frac{5R^3}{g\sqrt{15}}\,u_{2,-1} \qquad (184)$$

424

E. W. Grafarend

relation in order to obtain the Cartesian monomes as a sum of spherical functions. With the help of the well-known recurrence relation of associated Legendre functions [Eq. 60] [for more details refer to e.g. Knickmeyer [36]] and the addition theorems of trigonometric functions, we are able to compute ⎡ ⎤ x ⎣ y ⎦ r k el,m (λ, ϕ) z

Namely in Preposition 4, which is used in the subsequent part of this paper. Proposition 4. recurrence relation for xr k el,m (λ, ϕ), yr k el,m (λ, ϕ), zr k el,m (λ, ϕ) ⎡

xr k el,m

⎫ ⎧ l−1,m−1 ⎪ ⎪ e ⎪ √ l−1,0 ⎪ ⎪ ⎪ ⎢ ⎪ ⎪ ⎪ ⎪ ⎢2 2e ⎬ ⎨ k+1 ⎢ (l + |m|)(l + |m| − 1) r ⎢ = √ 0 ⎪ ⎪ (2l − 1) 2 2l + 1 ⎢ ⎢ ⎪ ⎪ ⎪ ⎪ 0 ⎪ ⎪ ⎣ ⎪ ⎭ ⎩ el−1,m+1 ⎪ ⎫ ⎧ ⎪ el+1,m−1 ⎪ ⎪ ⎪ √ ⎪ ⎪ ⎪ 2 ⎪ ⎪ 2el+1,0 ⎪ ⎬ ⎨ (l − |m| + 1)(l + |m| + 2) − 0 ⎪ ⎪ (2l + 3) ⎪ ⎪ ⎪ ⎪ 0 ⎪ ⎪ ⎪ ⎭ ⎩ el+1,m+1 ⎪ ⎫ ⎧ ⎪ el+1,m+1 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ l+1,m+1 ⎪ 2 ⎪ ⎪ e ⎨ (l + |m| + 2)(l + |m| + 1) √ l+1,1 ⎬ + 2e ⎪ ⎪ (2l + 3) ⎪ ⎪ ⎪ el+1,m−1 ⎪ ⎪ ⎪ ⎪ ⎭ ⎩ l+1,m−1 ⎪ e ⎫⎤ ⎡ ⎧ ⎤ ⎪ for 1 < m < l el−1,m+1 ⎪ ⎪ ⎪ ⎪ ⎥ ⎪⎥ ⎢ ⎪ el−1,m+1 ⎪ 2 ⎪ ⎪ for m = 1 ⎢ ⎥ (l − |m|)(l − |m| − 1) ⎨ √ l−1,1 ⎬⎥ ⎥ ⎥⎢ − for m = 0 2e ⎥ ⎥⎢ ⎥ ⎪⎥ ⎢ ⎪ l−1,m−1 ⎪ (2l − l) ⎪ ⎪ ⎪ ⎣ ⎦ ⎦ for m = −1 e ⎪ ⎪ ⎪ l−1,m−1 ⎪ ⎭ ⎩ for − l < m < −1 e (185)

7 Ellipsoidal-Spheroidal Representation of the Gravity Field

425



y\,r^k e^{l,m} = \frac{r^{k+1}}{2\sqrt{2l+1}}\left[\sqrt{\frac{(l+|m|)(l+|m|-1)}{2l-1}}\,c^{l-1}
-\sqrt{\frac{(l-|m|+1)(l-|m|+2)}{2l+3}}\,c^{l+1}
+\sqrt{\frac{(l+|m|+2)(l+|m|+1)}{2l+3}}\,d^{l+1}
-\sqrt{\frac{(l-|m|)(l-|m|-1)}{2l-1}}\,d^{l-1}\right]    (186)

where, for the same five cases 1<m<l, m=1, m=0, m=-1, -l<m<-1,

c^{l\pm1}=\begin{cases} -e^{l\pm1,1-m}\\ 0\\ 0\\ \sqrt{2}\,e^{l\pm1,0}\\ e^{l\pm1,-(m+1)}\end{cases},\qquad
d^{l\pm1}=\begin{cases} e^{l\pm1,-(m+1)}\\ e^{l\pm1,-(m+1)}\\ \sqrt{2}\,e^{l\pm1,-1}\\ -e^{l\pm1,-m+1}\\ -e^{l\pm1,-m+1}\end{cases}

z\,r^k e^{l,m} = \frac{r^{k+1}}{\sqrt{2l+1}}\left[\sqrt{\frac{(l-|m|+1)(l+|m|+1)}{2l+3}}\,e^{l+1,m}
+\sqrt{\frac{(l+|m|)(l-|m|)}{2l-1}}\,e^{l-1,m}\right]    (187)

By applying the recurrence relations of Proposition 4 we succeed in deriving the transformations, up to third degree, of solid spherical harmonics to Cartesian monomials outlined in Table 18. These formulae do not yet serve for the inversion of Eq. (183) for the diagonal elements of the inertia tensor, since in the definition of the spherical harmonic potential coefficients only spherical harmonics appear, not spherical harmonics multiplied by powers of r. Under certain assumptions (conditions on the excitation functions), however, an inversion is possible for the incremental moments of inertia. Such an inversion procedure will be outlined in Sect. 6.5.
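The recurrence relation (187) lends itself to a direct numerical spot check. The sketch below builds the fully normalized surface spherical harmonics from Eqs. (219) and (220) of Sect. 6.5 and evaluates both sides of (187) at an arbitrary point (the point and the indices l, m, k are arbitrary choices):

```python
import math

def P(l, m, t):
    # associated Legendre function of Eq. (220), argument t = sin(phi)
    s = sum((-1)**k * math.factorial(2*l - 2*k)
            / (math.factorial(k) * math.factorial(l - k) * math.factorial(l - m - 2*k))
            * t**(l - m - 2*k)
            for k in range((l - m) // 2 + 1))
    return 2.0**(-l) * (1.0 - t*t)**(m / 2.0) * s

def e(l, m, lam, phi):
    # fully normalized surface spherical harmonic of Eq. (219)
    am = abs(m)
    norm = math.sqrt((2.0 if m != 0 else 1.0) * (2*l + 1)
                     * math.factorial(l - am) / math.factorial(l + am))
    trig = math.cos(m * lam) if m >= 0 else math.sin(am * lam)
    return norm * P(l, am, math.sin(phi)) * trig

# spot check of Eq. (187) for l = 2, m = 1, k = 1 at an arbitrary point
lam, phi, r, k, l, m = 0.7, 0.3, 1.2, 1, 2, 1
lhs = (r * math.sin(phi)) * r**k * e(l, m, lam, phi)
rhs = r**(k + 1) / math.sqrt(2*l + 1) * (
    math.sqrt((l - abs(m) + 1) * (l + abs(m) + 1) / (2*l + 3)) * e(l + 1, m, lam, phi)
    + math.sqrt((l + abs(m)) * (l - abs(m)) / (2*l - 1)) * e(l - 1, m, lam, phi))
print(abs(lhs - rhs) < 1e-12)  # True
```

Analogous checks can be run for any l, m and any evaluation point, since (187) is an identity in λ, φ, r.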


Table 18 Transformation of solid spherical harmonics to Cartesian monomials

1 = e^{0,0}

x = \frac{r}{\sqrt{3}}\,e^{1,1} \qquad y = \frac{r}{\sqrt{3}}\,e^{1,-1} \qquad z = \frac{r}{\sqrt{3}}\,e^{1,0}

x^2 = \frac{r^2}{3}\left[e^{0,0}-\frac{1}{\sqrt{5}}\,e^{2,0}+\sqrt{\tfrac{3}{5}}\,e^{2,2}\right] \qquad
y^2 = \frac{r^2}{3}\left[e^{0,0}-\frac{1}{\sqrt{5}}\,e^{2,0}-\sqrt{\tfrac{3}{5}}\,e^{2,2}\right] \qquad
z^2 = \frac{r^2}{3}\left[e^{0,0}+\frac{2}{\sqrt{5}}\,e^{2,0}\right]

xy = \frac{r^2}{\sqrt{15}}\,e^{2,-2} \qquad xz = \frac{r^2}{\sqrt{15}}\,e^{2,1} \qquad yz = \frac{r^2}{\sqrt{15}}\,e^{2,-1}

x^3 = r^3\left[\frac{\sqrt{3}}{5}\,e^{1,1}-\frac{1}{10}\sqrt{\tfrac{6}{7}}\,e^{3,1}+\frac{1}{\sqrt{70}}\,e^{3,3}\right] \qquad
y^3 = r^3\left[\frac{\sqrt{3}}{5}\,e^{1,-1}-\frac{1}{10}\sqrt{\tfrac{6}{7}}\,e^{3,-1}-\frac{1}{\sqrt{70}}\,e^{3,-3}\right]

z^3 = r^3\left[\frac{\sqrt{3}}{5}\,e^{1,0}+\frac{2}{5\sqrt{7}}\,e^{3,0}\right] \qquad
xyz = \frac{r^3}{\sqrt{105}}\,e^{3,-2}

x^2y = \frac{r^3}{10\sqrt{3}}\left[2e^{1,-1}-\sqrt{\tfrac{2}{7}}\,e^{3,-1}+\sqrt{\tfrac{30}{7}}\,e^{3,-3}\right] \qquad
xy^2 = \frac{r^3}{10\sqrt{3}}\left[2e^{1,1}-\sqrt{\tfrac{2}{7}}\,e^{3,1}-\sqrt{\tfrac{30}{7}}\,e^{3,3}\right]

x^2z = \frac{r^3}{5\sqrt{3}}\left[e^{1,0}-\sqrt{\tfrac{3}{7}}\,e^{3,0}+\sqrt{\tfrac{5}{7}}\,e^{3,2}\right] \qquad
y^2z = \frac{r^3}{5\sqrt{3}}\left[e^{1,0}-\sqrt{\tfrac{3}{7}}\,e^{3,0}-\sqrt{\tfrac{5}{7}}\,e^{3,2}\right]

xz^2 = \frac{r^3}{5\sqrt{3}}\left[e^{1,1}+2\sqrt{\tfrac{2}{7}}\,e^{3,1}\right] \qquad
yz^2 = \frac{r^3}{5\sqrt{3}}\left[e^{1,-1}+2\sqrt{\tfrac{2}{7}}\,e^{3,-1}\right]
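Each entry of Table 18 is an identity in λ, φ, r and can therefore be verified pointwise. The sketch below checks three representative entries (x², xyz, x²z) at an arbitrary point, again building the harmonics from Eqs. (219) and (220):

```python
import math

def P(l, m, t):
    # associated Legendre function of Eq. (220), argument t = sin(phi)
    s = sum((-1)**k * math.factorial(2*l - 2*k)
            / (math.factorial(k) * math.factorial(l - k) * math.factorial(l - m - 2*k))
            * t**(l - m - 2*k)
            for k in range((l - m) // 2 + 1))
    return 2.0**(-l) * (1.0 - t*t)**(m / 2.0) * s

def e(l, m, lam, phi):
    # fully normalized surface spherical harmonic of Eq. (219)
    am = abs(m)
    norm = math.sqrt((2.0 if m != 0 else 1.0) * (2*l + 1)
                     * math.factorial(l - am) / math.factorial(l + am))
    trig = math.cos(m * lam) if m >= 0 else math.sin(am * lam)
    return norm * P(l, am, math.sin(phi)) * trig

lam, phi, r = 1.1, -0.4, 2.0
x = r * math.cos(phi) * math.cos(lam)
y = r * math.cos(phi) * math.sin(lam)
z = r * math.sin(phi)

x2 = r*r/3 * (e(0, 0, lam, phi) - e(2, 0, lam, phi)/math.sqrt(5)
              + math.sqrt(3/5) * e(2, 2, lam, phi))
xyz = r**3 / math.sqrt(105) * e(3, -2, lam, phi)
x2z = r**3 / (5*math.sqrt(3)) * (e(1, 0, lam, phi)
      - math.sqrt(3/7) * e(3, 0, lam, phi) + math.sqrt(5/7) * e(3, 2, lam, phi))

print(abs(x*x - x2) < 1e-9, abs(x*y*z - xyz) < 1e-9, abs(x*x*z - x2z) < 1e-9)
```

All three comparisons hold to machine precision at any evaluation point.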

Let us summarize the important contents of this basic section: Proposition 1 deals with the product decomposition of {cos(mλ), sin(mλ)}. Proposition 2 highlights the transformation of Cartesian moments of degree/order 2/2 into spherical moments of degree/order 2/2, while Proposition 3 sums up the inverse transformation of trace-reduced Cartesian moments/moments of inertia to spherical moments/spherical harmonic coefficients. The celebrated MacCullagh formulae are the subject of Corollary 8, applied to the transformation of spherical moments/spherical harmonic coefficients to trace-reduced Cartesian moments/moments of inertia of order 2. Finally, Proposition 4 brings up the recurrence relations of the type x r^k e^{l,m}(λ, φ).

6.3 Excitation Functions

While in the previous section we derived the MacCullagh representation of trace-reduced Cartesian moments/moments of inertia in terms of spherical harmonic coefficients of the gravitational potential, valid for a rigid body, we now aim at a generalization for a deformable body of radial symmetry in order to apply the Love-Shida hypothesis. Indeed, under the action of volume and surface forces a displacement field ΔX(x, t) is built up which in turn changes the spherical as well as the Cartesian moments of the mass density field. Here, we review the impact of forces of the types (1) tidal, (2) loading, (3) centrifugal and (4) transverse surface stress.

6.3.1 The Tidal Potential
The tidal potential inside the Earth is represented in the form

v^{tid}(λ, φ, r, t) = \lim_{L\to\infty}\sum_{l=2}^{L}\sum_{m=-l}^{l}\left(\frac{r}{R}\right)^{l} e^{l,m}(λ, φ)\, v^{tid}_{l,m}(t)    (188)

with respect to a massive sphere S^2_R of the mean radius R of the Earth. The coefficients of the tidal potential are quasi-periodic and can be represented for very long time intervals by sums of trigonometric functions of time, namely

v^{tid}_{l,m}(t) = \sum_{n} v^{tid}_{l,m,n}(t) = \sum_{n} c_{l,m,n}\,\cos(\nu_{l,m,n}\,t+\gamma_{l,m,n})    (189)

Here, ν_{l,m,n} denote discrete tidal frequencies, γ_{l,m,n} phase angles and c_{l,m,n} the corresponding amplitudes. For m > 0 the coefficients of orders +m and −m are related to each other: the frequencies and amplitudes coincide, while the phase angles are shifted accordingly. These relations are motivated

v^{tid} = \lim_{L\to\infty}\sum_{l=2}^{L}\sum_{m=0}^{l}\sum_{n=1}^{N} c_{l,m,n}\,\bar P_{l,m}(\sin φ)\,\cos\!\left(\omega_{l,m,n}\,t+\beta_{l,m,n}+mλ+(l-m)\frac{\pi}{2}\right)    (190)

by an alternative representation of the tidal potential at the surface of the Earth S^2_R, i.e. for r = R. The tidal potential does not contain terms of degree zero and one. Such constraints can be interpreted as follows: (1) the tidal potential is gauged to zero at the origin of the coordinate system, and (2) the integral of the extra-terrestrial attraction forces over the volume of the Earth is approximated by the gravitation vector, taken at the origin of the coordinate system, multiplied by the total mass of the Earth. Conventionally, the center of mass of the Earth is chosen as the origin of the coordinate system.
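The quasi-periodic coefficient representation of Eq. (189) is simply a finite sum of cosines once the constituents are fixed. A minimal Python sketch, with assumed toy constituents (the frequencies, phases and amplitudes below are illustrative, not taken from a tide catalogue):

```python
import math

# one tidal constituent per entry: (nu [rad/s], gamma [rad], c [coefficient units]);
# assumed toy values, roughly semidiurnal and diurnal frequencies
constituents = [(1.40519e-4, 0.3, 2.0),
                (7.29212e-5, 1.1, 0.7)]

def v_tid(t):
    """Quasi-periodic tidal coefficient v_{l,m}(t) as in Eq. (189)."""
    return sum(c * math.cos(nu * t + gamma) for nu, gamma, c in constituents)

print(v_tid(0.0), v_tid(3600.0))
```

In practice the sum runs over the full set of tabulated tidal constituents for each degree/order pair (l, m).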


6.3.2 The Loading Potential
The condensation of a load on a reference surface, for instance on the sphere S^2_R of the mean radius R of the Earth, produces a 'single-layer' mass density distribution which in turn generates the loading potential

v^{load}(λ, φ, r, t) = \lim_{L\to\infty}\sum_{l=1}^{L}\sum_{m=-l}^{l}\left(\frac{r}{R}\right)^{l} e^{l,m}(λ, φ)\, v^{load}_{l,m}(t)    (191)

The periodicity of the coefficients of the loading potential, however, is not as clear as in the tidal case. For more details of this problem, refer to Wolf [64] and Mitrovica et al. [44]. We assume that both the loading and the tidal potential do not contain terms of zero degree. For the loading case, such a constraint relates to the conservation of mass of the Earth. The term of degree one describes the variation of the mass centre over time. For oceanographic loading, Bursa et al. [12, p. 16] have documented mass-centre variations of the order of 6 mm.

6.3.3 The Centrifugal Potential
Due to the rotation of the Earth a third force, namely the centrifugal force, is acting. Like the tidal and the loading force, the centrifugal force is 'conservative' and can be represented as the gradient of the centrifugal potential v^{cent}:

v^{cent} = \frac{1}{2}\left[\omega^2 r^2-\langle\boldsymbol{\omega},\mathbf{x}\rangle^2\right]
= \frac{1}{3}\,\omega^2 r^2-\frac{1}{3}\Big[\omega_x^2\Big(x^2-\frac{y^2+z^2}{2}\Big)+\omega_y^2\Big(y^2-\frac{x^2+z^2}{2}\Big)+\omega_z^2\Big(z^2-\frac{x^2+y^2}{2}\Big)+3\,(\omega_x\omega_y\,xy+\omega_x\omega_z\,xz+\omega_y\omega_z\,yz)\Big]

= \frac{r^2}{3}\Big\{e^{0,0}\,\omega^2+\frac{1}{2\sqrt{5}}\big[(e^{2,0}-\sqrt{3}\,e^{2,2})\,\omega_x^2+(e^{2,0}+\sqrt{3}\,e^{2,2})\,\omega_y^2-2\,e^{2,0}\,\omega_z^2\big]-\sqrt{\tfrac{3}{5}}\,\big(e^{2,-2}\,\omega_x\omega_y+e^{2,-1}\,\omega_y\omega_z+e^{2,1}\,\omega_x\omega_z\big)\Big\}    (192)

Here ω denotes the rotation vector and ω² = ω_x²+ω_y²+ω_z² its modulus squared. The centrifugal potential can be additively decomposed into a reference part V^{cent}, corresponding to a stationary rotation vector Ω, and a disturbing part δV^{cent}, corresponding to an incremental time-varying rotation vector δω:

v^{cent} = V^{cent} + δV^{cent}    (193)

Conventionally, the Cartesian coordinate system is chosen in such a way that the z-axis coincides with the reference rotation vector Ω, namely Ω = Ω e_z. Under such an assumption the reference part of the centrifugal potential can be represented by

V^{cent} = \frac{(x^2+y^2)\,|\Omega|^2}{2} = \frac{r^2}{3}\left[e^{0,0}-\frac{e^{2,0}}{\sqrt{5}}\right]|\Omega|^2    (194)


as well as the incremental centrifugal potential in linear approximation

δV^{cent} = r^2\,\Omega\left[\frac{2}{3}\,e^{0,0}\,δω_z-\frac{1}{\sqrt{15}}\,\big(e^{2,1}\,δω_x+e^{2,-1}\,δω_y\big)-\frac{2}{3\sqrt{5}}\,e^{2,0}\,δω_z\right]    (195)
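The spherical harmonic form of the centrifugal potential in Eq. (192) can be checked against its Cartesian definition at an arbitrary point. The sketch below uses assumed rotation-vector components and an arbitrary evaluation point; the harmonics are again built from Eqs. (219) and (220):

```python
import math

def P(l, m, t):
    # associated Legendre function of Eq. (220), argument t = sin(phi)
    s = sum((-1)**k * math.factorial(2*l - 2*k)
            / (math.factorial(k) * math.factorial(l - k) * math.factorial(l - m - 2*k))
            * t**(l - m - 2*k)
            for k in range((l - m) // 2 + 1))
    return 2.0**(-l) * (1.0 - t*t)**(m / 2.0) * s

def e(l, m, lam, phi):
    # fully normalized surface spherical harmonic of Eq. (219)
    am = abs(m)
    norm = math.sqrt((2.0 if m != 0 else 1.0) * (2*l + 1)
                     * math.factorial(l - am) / math.factorial(l + am))
    trig = math.cos(m * lam) if m >= 0 else math.sin(am * lam)
    return norm * P(l, am, math.sin(phi)) * trig

# assumed rotation vector (rad/s) and an arbitrary evaluation point
wx, wy, wz = 2e-6, -1e-6, 7.292115e-5
lam, phi, r = 0.9, 0.2, 6.4e6
x = r * math.cos(phi) * math.cos(lam)
y = r * math.cos(phi) * math.sin(lam)
z = r * math.sin(phi)
w2 = wx*wx + wy*wy + wz*wz

exact = 0.5 * (w2 * r*r - (wx*x + wy*y + wz*z)**2)   # Cartesian form of Eq. (192)
E = lambda m: e(2, m, lam, phi)
series = r*r/3 * (w2
    + ((E(0) - math.sqrt(3)*E(2)) * wx*wx + (E(0) + math.sqrt(3)*E(2)) * wy*wy
       - 2*E(0) * wz*wz) / (2*math.sqrt(5))
    - math.sqrt(3/5) * (e(2, -2, lam, phi)*wx*wy + e(2, -1, lam, phi)*wy*wz
                        + E(1)*wx*wz))
print(abs(exact - series) < 1e-9 * abs(exact))
```

Both evaluations agree to machine precision, confirming that the non-harmonic part of the centrifugal potential is entirely carried by the e^{0,0} term.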

Any of these three forms of the centrifugal potential is built of two different terms. The first term, proportional to e^{0,0}, is non-harmonic (anharmonic, i.e. not a harmonic function); the second term, proportional to e^{2,m}, is harmonic. Accordingly, unlike the tidal potential and the loading potential, the centrifugal potential as a whole cannot be represented in the space of harmonic functions. However, the anharmonic term is isotropic, namely independent of spherical longitude and latitude. If an Earth model is chosen incompressible and radially symmetric, the anharmonic part has no effect at all. The harmonic term has the form of a tidal potential, such that in dynamical terms the corresponding centrifugal force can be treated like a tidal force. However, since there are non-periodic parts in the rotation vector, a representation of the coefficients of the 'harmonic part' of the centrifugal potential in the form of Eq. (189) cannot be complete. Conventionally, the representation in the time domain is used for short-periodic terms, e.g. diurnal or semidiurnal periods, while the Laplace transformation is used for long-periodic and secular variations. The coefficients of the incremental centrifugal potential are as follows:

δV^{cent}_{2,1} = -\frac{\Omega R^2}{\sqrt{15}}\,δω_x \qquad
δV^{cent}_{2,-1} = -\frac{\Omega R^2}{\sqrt{15}}\,δω_y \qquad
δV^{cent}_{2,0} = -\frac{2\,\Omega R^2}{3\sqrt{5}}\,δω_z    (196)

For further details, refer to Moritz and Mueller [49].
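Evaluating coefficients of the form (196) is a one-line computation once the rotation increments are given. A minimal sketch with assumed (illustrative) increments of the rotation vector:

```python
import math

OMEGA = 7.292115e-5      # reference rotation rate Omega (rad/s)
R = 6.371e6              # mean Earth radius (m)
# assumed illustrative increments of the rotation vector (rad/s)
dwx, dwy, dwz = 1.0e-11, -2.0e-11, 5.0e-12

dV21  = -OMEGA * R**2 / math.sqrt(15) * dwx          # coefficients of Eq. (196)
dV2m1 = -OMEGA * R**2 / math.sqrt(15) * dwy
dV20  = -2.0 * OMEGA * R**2 / (3.0 * math.sqrt(5)) * dwz

print(dV21, dV2m1, dV20)
```

Note that only the degree-2 orders m = -1, 0, +1 (and, through products of increments, m = ±2 at second order) are excited, mirroring the structure of Eq. (195).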

6.3.4 Transverse Surface Stress
In terms of vector spherical harmonics, the transverse surface stress ('shear coupling') generated by surface forces of wind or water-current type is conventionally represented by

\mathbf{t} = \lim_{L\to\infty}\sum_{l=1}^{L}\sum_{m=-l}^{+l}\left\{S_{l,m}(λ, φ)\,s^{surface}_{l,m}+T_{l,m}(λ, φ)\,t^{surface}_{l,m}\right\}    (197)

namely with respect to the spheroidal vector spherical harmonics S_{l,m}(λ, φ) by means of the coefficients s^{surface}_{l,m} as well as to the toroidal vector spherical harmonics T_{l,m}(λ, φ) by means of the coefficients t^{surface}_{l,m}. Equation (223) gives the definition of the vector spherical harmonics with radial/normal components R_{l,m}(λ, φ) and lateral/tangential components of spheroidal S_{l,m}(λ, φ) and toroidal T_{l,m}(λ, φ) types.


6.4 The Moments of an Incremental Mass Density Field Generated by Excitation Force Fields

The postulate of mass conservation during deformation leads to a special representation of the Eulerian increment δρ(x) of the mass density field. Its source is the divergence of the product of the reference mass density field ρ^0(x) and the displacement vector field ΔX(x), in linear approximation

δρ(x) = -\mathrm{div}\,[ρ^0(x)\,ΔX(x)]    (198)

Let us represent the displacement vector field ΔX(x) in terms of vector spherical harmonics of the type

ΔX = \sum_{l=0}^{\infty}\sum_{m=-l}^{+l}\left\{R_{l,m}(λ, φ)\,r_{l,m}(r)+S_{l,m}(λ, φ)\,s_{l,m}(r)+T_{l,m}(λ, φ)\,t_{l,m}(r)\right\}    (199)

In the special case of a radially symmetric stratified Earth, the classical Love-Shida formalism applies for the various excitation force fields as discussed in, for example, Grafarend et al. [27]. Such force fields produce radial, spheroidal and toroidal displacements as well as Eulerian increments of the terrestrial gravitational potential, which are summarized in Proposition 5.

Proposition 5. Radial, spheroidal and toroidal displacement coefficients and incremental coefficients of the terrestrial gravitational potential generated by the tidal potential, Love-Shida formalism:

r^{tid}_{l,m}(r,t) = \frac{1}{g_m R^2}\sum_{n=1}^{N} h^{tid}_{l,m,n}(r)\,\cos(\nu_{l,m,n}\,t+\gamma_{l,m,n})\,c_{l,m,n}

s^{tid}_{l,m}(r,t) = \frac{1}{g_m R^2}\sum_{n=1}^{N}\sqrt{l(l+1)}\;l^{tid}_{l,m,n}(r)\,\cos(\nu_{l,m,n}\,t+\gamma_{l,m,n})\,c_{l,m,n}    (200)

t^{tid}_{l,m}(r,t) \equiv 0

are the displacement coefficients of radial, spheroidal and toroidal type in geometry space, while

δu^{tid}_{l,m}(r,t) = \sum_{n=1}^{N} k^{tid}_{l,m,n}(r)\,\cos(\nu_{l,m,n}\,t+\gamma_{l,m,n})\,c_{l,m,n}    (201)

are the incremental coefficients of the terrestrial gravitational field in gravity space, namely in terms of trigonometric series generated by the tidal potential V^{tid}(t).


Notice that the radial Love-Shida functions h^{tid}_{l,m,n}(r), k^{tid}_{l,m,n}(r), l^{tid}_{l,m,n}(r) are frequency dependent, described by the index n. On the spherical surface S^2_R of mean Earth radius R these functions are called tidal Love-Shida numbers.
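The structure of the radial displacement series in Eq. (200) can be sketched directly: each tidal constituent contributes one cosine weighted by its (frequency-dependent) Love-Shida number. All numerical values below are assumed for illustration only; the units of the coefficients c follow the normalization of the potential coefficients used in the text:

```python
import math

g_m, R = 9.81, 6.371e6       # assumed mean gravity (m/s^2) and mean Earth radius (m)
# one constituent per entry: (nu [rad/s], gamma [rad], c); assumed toy values
constituents = [(1.40519e-4, 0.3, 2.0e9),
                (7.29212e-5, 1.1, 0.7e9)]
h_tid = [0.61, 0.60]         # assumed frequency-dependent Love-Shida numbers h(R)

def r_tid(t):
    """Radial displacement coefficient of Eq. (200), evaluated at r = R."""
    return sum(h * c * math.cos(nu * t + gamma)
               for h, (nu, gamma, c) in zip(h_tid, constituents)) / (g_m * R**2)

print(r_tid(0.0))
```

The frequency dependence of the Love-Shida numbers is exactly what the index n encodes: each constituent carries its own h-value.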

Proposition 6. Radial, spheroidal and toroidal displacement coefficients and incremental coefficients of the terrestrial gravitational potential generated by the loading potential, Love-Shida formalism:

\tilde r^{load}_{l,m}(r,s) = \frac{1}{g_m R^2}\,\tilde h^{load}_{l,m}(r,s)\,\tilde v^{load}_{l,m}(s)

\tilde s^{load}_{l,m}(r,s) = \frac{1}{g_m R^2}\,\tilde l^{load}_{l,m}(r,s)\,\sqrt{l(l+1)}\,\tilde v^{load}_{l,m}(s)    (202)

\tilde t^{load}_{l,m}(r,s) \equiv 0

δ\tilde u^{load}_{l,m}(r,s) = \left[\left(\frac{r}{R}\right)^{l}+\tilde k^{load}_{l,m}(r,s)\right]\tilde v^{load}_{l,m}(s)    (203)

are the displacement coefficients and the incremental coefficients of the terrestrial gravitational field in the Laplace domain [27], while

r^{load}_{l,m}(r,t) = \frac{1}{g_m R^2}\int_0^{t} h^{load}_{l,m}(r,t-t')\,v^{load}_{l,m}(t')\,\mathrm{d}t'

s^{load}_{l,m}(r,t) = \frac{1}{g_m R^2}\int_0^{t} l^{load}_{l,m}(r,t-t')\,\sqrt{l(l+1)}\,v^{load}_{l,m}(t')\,\mathrm{d}t'    (204)

t^{load}_{l,m}(r,t) \equiv 0

δu^{load}_{l,m}(r,t) = \left(\frac{r}{R}\right)^{l} v^{load}_{l,m}(t)+\int_0^{t} k^{load}_{l,m}(t-t')\,v^{load}_{l,m}(t')\,\mathrm{d}t'    (205)

are the corresponding coefficients in the time domain generated by the loading potential V^{load}(t). Notice that the radial Love-Shida functions \tilde h^{load}_{l,m}(r,s), \tilde l^{load}_{l,m}(r,s), \tilde k^{load}_{l,m}(r,s) are Laplace-transformed, indicated by a tilde over the coefficient; s is the parameter of the Laplace domain. Equation (202) is adequate for a viscoelastic Earth model. The coefficients δ\tilde u^{load}_{l,m} of the Laplace-transformed terrestrial gravitational potential δ\tilde u^{load}, namely its Eulerian increment, include the direct effect of the load, while the coefficients δu^{tid}_{l,m}(r,t) do not contain the direct effect of the


tidal potential. Indeed, we consider the load as a part of the Earth system. In the corresponding formula of Eq. (203) this argument shows up in the term (r/R)^l \tilde v^{load}_{l,m}. According to the Convolution Theorem, a product of two Laplace-transformed functions corresponds to a convolution in the time domain. Accordingly, Eqs. (204) and (205) express the effects of the loading potential in terms of convolution integrals in the time domain. Indeed, these formulae are valid only under the assumption that all incremental field quantities vanish for t ≤ 0.

Proposition 7 (radial, spheroidal and toroidal displacement coefficients and incremental coefficients of the terrestrial gravitational potential generated by the centrifugal potential, Love-Shida formalism).

\tilde r^{cent}_{2,m}(r,s) = \frac{1}{g_m R^2}\,\tilde h^{cent}_{2,m}(r,s)\,δ\tilde V^{cent}_{2,m}(s) \qquad \forall m\in\{-1,0,+1\}

\tilde s^{cent}_{2,m}(r,s) = \frac{1}{g_m R^2}\,\sqrt{6}\,\tilde l^{cent}_{2,m}(r,s)\,δ\tilde V^{cent}_{2,m}(s) \qquad \forall m\in\{-1,0,+1\}    (206)

\tilde t^{cent}_{2,m}(r,s) \equiv 0

are the displacement coefficients of radial, spheroidal and toroidal type in geometry space, while

δ\tilde u^{cent}_{2,m}(r,s) = \tilde k^{cent}_{2,m}(r,s)\,δ\tilde V^{cent}_{2,m}(s) \qquad \forall m\in\{-1,0,+1\}    (207)

are the incremental coefficients of the terrestrial gravitational field in gravity space, both in the Laplace domain generated by the incremental centrifugal potential δ\tilde V^{cent}. As mentioned above, for the purely periodic constituents of the centrifugal potential the formulation in the time domain, in analogy to Eqs. (200) and (201), is preferred.

Proposition 8 (radial, spheroidal and toroidal displacement coefficients and incremental coefficients of the terrestrial gravitational potential generated by the transverse surface stress, Love-Shida formalism, Merriam [43]).

\tilde r'_{l,m}(r,s) = \frac{1}{g_m R^2}\,\tilde h_{l,m}(r,s)\,\frac{R^3}{\rho_m\sqrt{l(l+1)}}\,\tilde s^{surface}_{l,m}(s)

\tilde s'_{l,m}(r,s) = \frac{1}{g_m R^2}\,\tilde l_{l,m}(r,s)\,\frac{R^3}{\rho_m}\,\tilde s^{surface}_{l,m}(s)    (208)

\tilde t'_{l,m}(r,s) \equiv 0

are the displacement coefficients of radial, spheroidal and toroidal type in the Laplace domain, while

δ\tilde u'_{l,m}(r,s) = \tilde k_{l,m}(r,s)\,\frac{R^3}{\rho_m\sqrt{l(l+1)}}\,\tilde s^{surface}_{l,m}(s)    (209)

are the incremental coefficients of the terrestrial gravitational field in the Laplace domain generated by spheroidal stress. In contrast, the coefficients generated by toroidal stress are

\tilde r'_{l,m}(r,s) = 0 \qquad \tilde s'_{l,m}(r,s) = 0

\tilde t'_{l,m}(r,s) = \frac{1}{g_m R^2}\,\tilde l_{l}(r,s)\,\frac{R^3}{\rho_m}\,\tilde t^{surface}_{l,m}(s)    (210)

δ\tilde u'_{l,m}(s) = 0

The normalization of the coefficient functions \tilde h_{l,m}(r,s) etc. is the same as in Merriam [43] (in which, however, the spheroidal vector spherical harmonics are normalized differently). In order to compute the Cartesian moments of the incremental mass density which is generated by the various excitation fields, we have to specify the density disturbance in Eq. (36). According to Grafarend et al. [27, p. 21] the Eulerian increment of the mass density for a radially symmetric stratification can be represented in terms of the Love-Shida formalism by

δρ(x) = -\sum_{l=1}^{\infty}\sum_{m=-l}^{l}\left\{\frac{ρ^0(r)}{g_m R^2}\,\frac{1}{r}\left[2h_l(r)-l(l+1)\,l_l(r)+r\,\frac{\mathrm{d}h_l(r)}{\mathrm{d}r}\right]+\frac{\mathrm{d}ρ^0}{\mathrm{d}r}\,\frac{h_l(r)}{g_m R^2}\right\} e^{l,m}(λ, φ)\,v_{l,m}    (211)

for forces of 'tidal' and 'loading' type. A similar formula exists for forces of 'centrifugal' and 'transverse surface stress' type. The most important relation between the Love number k_l and the Love-Shida functions h_l(r), l_l(r) of any type,

k_l(R) = \frac{4\pi}{\rho_m R^2}\,\frac{1}{2l+1}\int_0^{R} ρ^0(r)\left(\frac{r}{R}\right)^{l+1}\left[(l+1)\,l_l(r)+h_l(r)\right]\mathrm{d}r,    (212)

has been proven in Grafarend et al. [27, p. 23]. Indeed, it can be applied to Love-Shida functions/numbers as they appear in the reaction to tidal, loading, centrifugal and surface stress forces or potentials. As soon as we implement the integral relation between the Love-Shida functions h_l(r), l_l(r) and the Love number k_l into the spherical


moments, we arrive at

\int_V δρ\;r^l e^{l,m}\,\mathrm{d}^3x = -\int_V \sum_{n=1}^{\infty}\sum_{j=-n}^{n}\left\{\frac{ρ^0(r)}{g_m R^2}\,\frac{1}{r}\left[2h_n(r)+r\,\frac{\mathrm{d}h_n(r)}{\mathrm{d}r}-n(n+1)\,l_n(r)\right]+\frac{\mathrm{d}ρ^0}{\mathrm{d}r}\,\frac{h_n(r)}{g_m R^2}\right\} e^{n,j}(λ, φ)\,v_{n,j}\;r^l e^{l,m}(λ, φ)\,\mathrm{d}^3x = \frac{k_l(R)}{g}\,(2l+1)\,R^{l+1}\,v_{l,m}    (213)

Since the various excitation force fields do not contain terms of degree zero in a spherical harmonic expansion, we obtain in particular

\int_V δρ(λ, φ, r)\,r^k e^{0,0}\,\mathrm{d}^3x = 0    (214)

due to the orthogonality of the surface spherical harmonics. With the help of the tools we have collected here, we are able to compute the incremental Cartesian moments of the mass density generated by the various excitation fields. We introduce Eqs. (24) and (211) into the definition of the Cartesian moments given in Eq. (171). Using the orthonormality relations of the spherical harmonics, in particular Eq. (214), we obtain integrals of the form of Eq. (213). The results are presented in the following theorem.

Theorem 1. The incremental Cartesian moments of second order of the incremental mass density distribution are represented by

δI^{12} = \frac{R^3}{g}\sqrt{\tfrac{5}{3}}\,k_2 v_{2,-2} \qquad
δI^{23} = \frac{R^3}{g}\sqrt{\tfrac{5}{3}}\,k_2 v_{2,-1} \qquad
δI^{13} = \frac{R^3}{g}\sqrt{\tfrac{5}{3}}\,k_2 v_{2,1}    (215)

δI^{11} = -5\,\frac{R^3}{g}\,k_2\left(\frac{v_{2,0}}{3\sqrt{5}}-\frac{v_{2,2}}{\sqrt{15}}\right) \qquad
δI^{22} = -5\,\frac{R^3}{g}\,k_2\left(\frac{v_{2,0}}{3\sqrt{5}}+\frac{v_{2,2}}{\sqrt{15}}\right) \qquad
δI^{33} = -5\,\frac{R^3}{g}\,k_2\left(-\frac{2\,v_{2,0}}{3\sqrt{5}}\right)    (216)

as well as the trace-reduced Cartesian moments, namely the moments/products of inertia

δJ^{12} := -δI^{12} = -\frac{R^3}{g}\sqrt{\tfrac{5}{3}}\,k_2 v_{2,-2} \qquad
δJ^{23} := -δI^{23} = -\frac{R^3}{g}\sqrt{\tfrac{5}{3}}\,k_2 v_{2,-1} \qquad
δJ^{13} := -δI^{13} = -\frac{R^3}{g}\sqrt{\tfrac{5}{3}}\,k_2 v_{2,1}    (217)


δJ^{11} := δI^{22}+δI^{33} = 5\,\frac{R^3}{g}\,k_2\left(\frac{v_{2,0}}{3\sqrt{5}}-\frac{v_{2,2}}{\sqrt{15}}\right) \qquad
δJ^{22} := δI^{11}+δI^{33} = 5\,\frac{R^3}{g}\,k_2\left(\frac{v_{2,0}}{3\sqrt{5}}+\frac{v_{2,2}}{\sqrt{15}}\right) \qquad
δJ^{33} := δI^{11}+δI^{22} = -\frac{2\sqrt{5}}{3}\,\frac{R^3}{g}\,k_2\,v_{2,0}    (218)

Table 19 Interpretation of the symbol k_2 v_{2,m} in Eqs. (215)–(218) for different excitation types

Tidal excitation: \sum_{n=1}^{N} k^{tid}_{2,m,n}(R)\,v^{tid}_{2,m,n}
Surface loading: \big(1+\tilde k^{load}_{2,m}(R,s)\big)\,\tilde v^{load}_{2,m}
Centrifugal excitation: \tilde k^{cent}_{2,m}(R,s)\,δ\tilde V^{cent}_{2,m}
Transverse surface stress: \tilde k'_{2,m}(R,s)\,\frac{R^3}{\rho_m\sqrt{6}}\,\tilde s^{surface}_{2,m}

Due to Eq. (214) a degree-zero term disappears in Eqs. (215)–(218). These formulae merely give the framework for the computation of the incremental Cartesian moments which are generated by the various excitation fields we have considered in this section. The symbols 'k_2' and 'v_{2,m}' have to be replaced by the respective Love number and excitation coefficient according to the scheme shown in Table 19. For convenience we have presented the formulae for the loading, centrifugal and transverse surface stress excitations in the Laplace domain again. In these cases the respective Cartesian moments in Eqs. (215)–(218) have to be understood as Laplace transforms as well. If several excitation forces act simultaneously, the corresponding incremental Cartesian moments can be obtained by linear superposition. Whereas the first three identities for the 'mixed' moments of Eqs. (215) and (217) follow directly from Eqs. (182) and (200), the last three identities of Eqs. (216) and (218) for δJ^{11}, δJ^{22}, δJ^{33} can only be obtained with the support of the integral relation of Eq. (212), k_l{R; h_l(r), l_l(r)}. Indeed these three identities for the 'diagonal' moments can be considered as an extension of the MacCullagh representation within the Love-Shida formalism. The incremental polar moment of inertia δJ^{33} is of particular interest: under the postulate of conservation of angular momentum, δJ^{33} enables us to compute models for length-of-day variations. Notice that the extended MacCullagh representation of incremental Cartesian moments holds for Love numbers of (i) tidal, (ii) loading, (iii) centrifugal and (iv) surface stress type.

Corollary 9 (Extended MacCullagh representation of the incremental tensor of inertia for a deformable body subject to the Love-Shida formalism). If the excitation force field/potential field/surface stress field does not contain coefficients of degree zero and if the Love-Shida hypothesis applies, then the incremental tensor of inertia is trace-free. For the proof we compute δJ^{11}+δJ^{22}+δJ^{33} = 0.

We finally give the definitions and relationships of the scalar and vector spherical harmonics which we have used throughout the paper. In particular, the expansion of the reciprocal distance in solid spherical harmonics and recurrence


relations (relationships between Legendre functions of different degree and order) are collected.
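The trace-free property stated in Corollary 9 can be confirmed numerically from the moment formulae of Theorem 1. The sketch below uses an assumed degree-2 Love number and assumed excitation coefficients v_{2,m} (purely illustrative values):

```python
import math

g, R = 9.81, 6.371e6
k2 = 0.30                                            # assumed degree-2 Love number
v = {-2: 0.8, -1: -0.3, 0: 1.5, 1: 0.2, 2: -0.6}     # assumed excitation coefficients

A = R**3 / g
dJ12 = -A * math.sqrt(5/3) * k2 * v[-2]              # Eq. (217)
dJ23 = -A * math.sqrt(5/3) * k2 * v[-1]
dJ13 = -A * math.sqrt(5/3) * k2 * v[1]
dJ11 = 5*A*k2 * (v[0]/(3*math.sqrt(5)) - v[2]/math.sqrt(15))   # Eq. (218)
dJ22 = 5*A*k2 * (v[0]/(3*math.sqrt(5)) + v[2]/math.sqrt(15))
dJ33 = -2*math.sqrt(5)/3 * A * k2 * v[0]

print(abs(dJ11 + dJ22 + dJ33) < 1e-9 * abs(dJ33))    # trace-free (Corollary 9)
```

The cancellation holds for any choice of v_{2,0}, v_{2,2}, since dJ11 + dJ22 = (2√5/3)(R³/g) k₂ v_{2,0} = -dJ33 identically.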

6.5 Basics of Spherical Harmonic Functions of Type Scalar, Vector and Tensor, Equivalence Statement

Definition of the scalar-valued fully normalized surface spherical harmonics:

e^{l,m}(λ, φ) := \begin{cases}\bar P_{l,m}(\sin φ)\,\cos mλ & \text{for } m>0\\ \bar P_{l,0}(\sin φ) & \text{for } m=0\\ \bar P_{l,|m|}(\sin φ)\,\sin|m|λ & \text{for } m<0\end{cases}
= \begin{cases}\sqrt{2(2l+1)\,\frac{(l-m)!}{(l+m)!}}\;P_{l,m}(\sin φ)\,\cos mλ & \text{for } m>0\\ \sqrt{2l+1}\;P_{l,0}(\sin φ) & \text{for } m=0\\ \sqrt{2(2l+1)\,\frac{(l-|m|)!}{(l+|m|)!}}\;P_{l,|m|}(\sin φ)\,\sin|m|λ & \text{for } m<0\end{cases}    (219)

Definition of the associated Legendre functions:

P_{l,m}(\sin φ) := 2^{-l}\cos^m φ\sum_{k=0}^{[(l-m)/2]}\frac{(-1)^k\,(2l-2k)!}{k!\,(l-k)!\,(l-m-2k)!}\,(\sin φ)^{l-m-2k}\quad\text{for } |m|\le l    (220)

with [(l-m)/2] := greatest integer less than or equal to (l-m)/2.

Reciprocal distance:

\frac{1}{\|\mathbf{x}_P-\mathbf{x}_Q\|} = \begin{cases}\dfrac{1}{r_P}\displaystyle\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\left(\frac{r_Q}{r_P}\right)^{l}\frac{e^{l,m}(λ_P, φ_P)\,e^{l,m}(λ_Q, φ_Q)}{2l+1} & \text{for } r_Q<r_P\\[2ex] \dfrac{1}{r_Q}\displaystyle\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\left(\frac{r_P}{r_Q}\right)^{l}\frac{e^{l,m}(λ_P, φ_P)\,e^{l,m}(λ_Q, φ_Q)}{2l+1} & \text{for } r_P<r_Q\end{cases}    (221)

Recurrence relations:

\cos φ\,P_{l,m}(\sin φ) = \frac{(l+m)(l+m-1)}{2l+1}\,P_{l-1,m-1}(\sin φ)-\frac{(l-m+1)(l-m+2)}{2l+1}\,P_{l+1,m-1}(\sin φ)

\cos φ\,P_{l,m}(\sin φ) = \frac{1}{2l+1}\,P_{l+1,m+1}(\sin φ)-\frac{1}{2l+1}\,P_{l-1,m+1}(\sin φ)

\sin φ\,P_{l,m}(\sin φ) = \frac{l-m+1}{2l+1}\,P_{l+1,m}(\sin φ)+\frac{l+m}{2l+1}\,P_{l-1,m}(\sin φ)    (222)
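The third recurrence of Eq. (222) can be verified directly against the explicit sum representation of Eq. (220); the indices and the evaluation point below are arbitrary choices:

```python
import math

def P(l, m, t):
    # associated Legendre function of Eq. (220), argument t = sin(phi)
    s = sum((-1)**k * math.factorial(2*l - 2*k)
            / (math.factorial(k) * math.factorial(l - k) * math.factorial(l - m - 2*k))
            * t**(l - m - 2*k)
            for k in range((l - m) // 2 + 1))
    return 2.0**(-l) * (1.0 - t*t)**(m / 2.0) * s

t, l, m = 0.37, 3, 1
lhs = t * P(l, m, t)                                   # sin(phi) * P_{l,m}(sin phi)
rhs = ((l - m + 1) * P(l + 1, m, t) + (l + m) * P(l - 1, m, t)) / (2*l + 1)
print(abs(lhs - rhs) < 1e-12)  # True
```

The other two recurrences of (222) can be checked the same way, replacing the factor t = sin(φ) by cos(φ) = √(1−t²) on the left-hand side.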


Having presented in (219)–(222) the basic relations for the scalar-valued surface spherical harmonics, we now present the vector-valued spherical harmonics.

Definition of the vector spherical harmonics:

R_{l,m}(λ, φ) := \mathbf{e}_r\,e^{l,m}(λ, φ)

S_{l,m}(λ, φ) := \frac{r\,\mathrm{grad}\,e^{l,m}(λ, φ)}{\sqrt{l(l+1)}} = \frac{1}{\sqrt{l(l+1)}}\left[\mathbf{e}_λ\,\frac{1}{\cos φ}\,\frac{\partial e^{l,m}(λ, φ)}{\partial λ}+\mathbf{e}_φ\,\frac{\partial e^{l,m}(λ, φ)}{\partial φ}\right]    (223)

T_{l,m}(λ, φ) := \frac{1}{\sqrt{l(l+1)}}\,\mathrm{rot}\,\big(\mathbf{x}\,e^{l,m}(λ, φ)\big) = \frac{1}{\sqrt{l(l+1)}}\left[\mathbf{e}_λ\,\frac{\partial e^{l,m}(λ, φ)}{\partial φ}-\mathbf{e}_φ\,\frac{1}{\cos φ}\,\frac{\partial e^{l,m}(λ, φ)}{\partial λ}\right]

⇒ T_{l,m} = -\mathbf{e}_r\times S_{l,m}, \qquad S_{l,m} = \mathbf{e}_r\times T_{l,m}

For the interested reader we prove the equivalence of Eqs. (171) and (172). Let us decompose the density and the (actual) coordinates under the integral in Eq. (170) according to the Lagrangean description: ρ(x) = ρ^0(X)+Δρ(X), x^i = X^i+ΔX^i. Here Δρ(X) denotes the Lagrangean increment of the mass density. Since the volume element d³x in the actual state and the volume element d³X in the reference state are in linear approximation related by d³x = (1+div ΔX) d³X, the integral in Eq. (170) can be decomposed into



I^{i_1,i_2,\dots,i_n} := \int_V ρ^0(X)\,X^{i_1}X^{i_2}\cdots X^{i_n}\,\mathrm{d}^3X
+ \int_V ρ^0(X)\,X^{i_1}X^{i_2}\cdots X^{i_n}\,\mathrm{div}\,ΔX\,\mathrm{d}^3X
+ \int_V Δρ(X)\,X^{i_1}X^{i_2}\cdots X^{i_n}\,\mathrm{d}^3X
+ \int_V ρ^0(X)\left[\frac{ΔX^{i_1}}{X^{i_1}}+\frac{ΔX^{i_2}}{X^{i_2}}+\cdots+\frac{ΔX^{i_n}}{X^{i_n}}\right]X^{i_1}X^{i_2}\cdots X^{i_n}\,\mathrm{d}^3X    (224)

Since Δρ = -ρ^0(X)\,\mathrm{div}(ΔX) (mass conservation law), we find

I^{i_1,i_2,\dots,i_n} = \int_V ρ^0(X)\,X^{i_1}X^{i_2}\cdots X^{i_n}\,\mathrm{d}^3X
+ \int_V ρ^0(X)\left[\frac{ΔX^{i_1}}{X^{i_1}}+\frac{ΔX^{i_2}}{X^{i_2}}+\cdots+\frac{ΔX^{i_n}}{X^{i_n}}\right]X^{i_1}X^{i_2}\cdots X^{i_n}\,\mathrm{d}^3X    (225)

and therefore

δI^{i_1,i_2,\dots,i_n} = \int_V ρ^0(X)\left[\frac{ΔX^{i_1}}{X^{i_1}}+\frac{ΔX^{i_2}}{X^{i_2}}+\cdots+\frac{ΔX^{i_n}}{X^{i_n}}\right]X^{i_1}X^{i_2}\cdots X^{i_n}\,\mathrm{d}^3X    (226)

This is just the definition of Eq. (172). On the other hand, the Eulerian increment δρ of the mass density follows from mass conservation, δρ = -\mathrm{div}(ρ^0(X)\,ΔX), such that in linear approximation

\int_V δρ(x)\,x^{i_1}x^{i_2}\cdots x^{i_n}\,\mathrm{d}^3x = -\int_{\mathbb{R}^3}\mathrm{div}\big(ρ^0(x)\,ΔX\big)\,x^{i_1}x^{i_2}\cdots x^{i_n}\,\mathrm{d}^3x
= -\int_{\mathbb{R}^3}\mathrm{div}\big[ρ^0(x)\,ΔX\,x^{i_1}x^{i_2}\cdots x^{i_n}\big]\,\mathrm{d}^3x
+ \int_{\mathbb{R}^3} ρ^0(x)\,\big\langle ΔX,\,\mathrm{grad}\,(x^{i_1}x^{i_2}\cdots x^{i_n})\big\rangle\,\mathrm{d}^3x    (227)

holds. Here the first integral can be transformed into a surface integral by means of the Gauss Theorem. Since the density ρ^0 vanishes outside the Earth, this integral vanishes. The second integral equals in linear approximation the integral of Eq. (172), which has been derived from the Lagrangean decomposition. All notations and conventions of Sect. 6.5 follow Eringen [20], to whom we refer.

6.6 Summary

The Cartesian moments of the mass density of a gravitating body and the spheroidal/spherical harmonic coefficients of its gravitational field are related in a peculiar way. In particular, the products of inertia can be expressed by the spheroidal/spherical harmonic coefficients of the gravitational potential, as was derived by MacCullagh for a rigid body. Here the MacCullagh formulae are extended to a deformable body which is restricted to radial symmetry in order to apply the Love-Shida hypothesis. The mass conservation law allows

7 Ellipsoidal-Spheroidal Representation of the Gravity Field

439

a representation of the incremental mass density by the respective excitation function. A representation of an arbitrary Cartesian monomial by sums of solid spherical harmonics multiplied by powers of the radius is always possible. Introducing these representations into the definition of the Cartesian moments, an extension of the MacCullagh formulae is obtained. In particular, for excitation functions with a vanishing harmonic coefficient of degree zero, the (diagonal) incremental moments of inertia can also be represented by the excitation coefficients. Four types of excitation functions are considered, namely: (1) tidal excitation; (2) loading potential; (3) centrifugal potential; and (4) transverse surface stress. One application of the results could be the model computation of length-of-day variations and polar motion, which depend on the moments of inertia.

Acknowledgements The author wishes to thank S. Kopeikin for his work on relativistic ellipsoidal figures of equilibrium.

Literature

1. Ansorge, M., Fischer, T., Kleinwächter, A., Meinel, R., Petroff, D., Schöbel, K.: Equilibrium configurations of homogeneous fluids in general relativity. Mon. Not. R. Astron. Soc. 355, 682–688 (2004)
2. Ardalan, A.A., Grafarend, E.W.: Somigliana-Pizzetti gravity: the international gravity formula to the sub-nanoGal level. J. Geodesy 75, 424–437 (2001)
3. Ardalan, A.A., Grafarend, E.W., Ihde, J.: Molodensky potential telluroid based on a minimum-distance map. Case study: the Quasi-Geoid of East Germany in the World Geodetic Datum 2000. J. Geodesy 76, 127–138 (2002)
4. Backus, G.E.: Converting vector and tensor equations to scalar equations in spherical coordinates. Geophys. J. R. Astron. Soc. 13, 61–101 (1967)
5. Ballani, L., Engels, J., Grafarend, E.: Global base functions for the mass density in the interior of a massive body (Earth). Manuscr. Geodaet. 18, 99–114 (1993)
6. Baranov, W.: La formule de Stokes est-elle correcte? Bull. Geod. 49, 27–34 (1975)
7. Bjerhammar, A.: The energy integral for satellites. Report from the Division of Geodesy, Royal Institute of Technology, Stockholm (1967)
8. Bocchio, F.: Geodetic singularities. Rev. Geophys. Space Res. 20, 399–409 (1981)
9. Bode, A., Grafarend, E.W.: The telluroid mapping based on a normal gravity potential including the centrifugal term. Boll. Geod. Sci. Aff. 41, 21–56 (1982)
10. Bursa, M., Kouba, J., Radej, K., True, S.A., Vatrt, V., Vojtiskova, M.: Monitoring geoidal potential on the basis of TOPEX/POSEIDON altimeter data and EGM96. Paper presented at the Scientific Assembly of IAG, Rio de Janeiro (1997)
11. Bursa, M., Radej, K., Sima, Z., True, S.A., Vatrt, V.: Determination of the geopotential scale factor from TOPEX/POSEIDON satellite altimetry. Stud. Geophys. Geod. 14, 203–216 (1997)
12. Bursa, M., Kouba, J., Radej, K., True, S.A., Vatrt, V., Vojtiskova, M.: Temporal variations in sea surface topography and dynamics of the Earth's inertia ellipsoid. Stud. Geophys. Geod. 43, 7–19 (1999)
13. Capderou, M.: Satellites – Orbits and Missions. Springer (2004)
14. Caputo, M.: The Gravity Field of the Earth. Academic, New York/London (1967)

440

E. W. Grafarend

15. Chandrasekhar, S.: Ellipsoidal Figures of Equilibrium, IX+252pp. Yale University Press, New Haven (1969)
16. Chen, X., Nashed, Z., Qi, L.: Convergence of Newton's method for singular smooth and nonsmooth equations using adaptive outer inverses. SIAM J. Opt. 7, 445–462 (1997)
17. Cheng, M.K., Shum, C.K., Tapley, B.D.: Determination of long term changes in the Earth's gravity field from satellite laser ranging observations. J. Geophys. Res. 102(B10), 22377–22390 (1997)
18. Denker, F.I., Torge, W.: The European gravimetric quasi-geoid EGG97 – an IAG supported continental enterprise. In: Forsberg, R., et al. (eds.) IAG Symposium Proceedings, vol. 119, pp. 249–254. Springer, Berlin/Heidelberg/New York (1998)
19. Dixon, W.G.: Dynamics of extended bodies in general relativity II. Moments of the charge-current vector. Proc. R. Soc. Lond. A 319, 509–547 (1970)
20. Eringen, C.: Nonlinear Theory of Continuous Media. McGraw-Hill, New York (1962)
21. Grafarend, E.: The definition of the telluroid. Bull. Geod. 52, 25–37 (1978)
22. Grafarend, E.: Six lectures on geodesy and global geodynamics. In: Moritz, H., Sünkel, H. (eds.) Mitteilungen der geodätischen Institute der Technischen Universität Graz, Folge 41, pp. 531–685 (1978)
23. Grafarend, E.: The time-varying gravitational potential field of a massive deformable body. Stud. Geophys. Geod. 44, 364–373 (2000)
24. Grafarend, E., Ardalan, A.A.: World geodetic datum. J. Geod. 73, 611–623 (1999)
25. Grafarend, E., Lohse, P.: The minimal distance mapping of the topographic surface onto the reference ellipsoid of revolution. Manuscripta Geodaetica 16, 92–110 (1991)
26. Grafarend, E., Heidenreich, E.D., Schaffrin, B.: A representation of the standard gravity field. Manuscr. Geodaet. 2, 135–174 (1977)
27. Grafarend, E., Engels, J., Varga, P.: The spacetime gravitational field of a deformable body. J. Geod. 72, 11–30 (1997)
28. Groten, E.: Current best estimates of the parameters of common relevance to astronomy, geodesy, and geodynamics. Internal Communications of IAG/IUGG Special Commission 3, Darmstadt (1997)
29. Guo, R., Hu Xiao, G., Tang, B., Huang, Y., Liu, L., Cheng, L., Feng, H.E.: Precise orbit determination for geostationary satellites with multiple tracking techniques. Astronomy 55(8), 687–692 (2010). https://doi.org/10.1007/s11434-010-0074-x
30. Hehl, F.W., Obukov, Y.N.: Foundations of Classical Electrodynamics – Charge, Flux and Metric. Birkhäuser, Boston/Basel/Berlin (2002)
31. Heikkinen, M.: Geschlossene Formeln zur Berechnung räumlicher geodätischer Koordinaten aus rechtwinkligen Koordinaten. Z. Vermess. 5, 207–211 (1982)
32. Heiskanen, W., Moritz, H.: Physical Geodesy, 364pp. W. H. Freeman, San Francisco (1967)
33. Hirvonen, R.: New theory of gravimetric geodesy. Ann. Acad. Sci. Fennicae A III(56), Helsinki (1960)
34. Hobson, E.W.: The Theory of Spherical and Ellipsoidal Harmonics. Chelsea Publishing Company, New York (1965)
35. Iorio, L.: Orbital motion as gradiometers for post-Newtonian tidal effects. Astrophys. Space Sci. (2014). https://doi.org/10.3389/fspac.2014.00003
36. Knickmeyer, E.H.: Eine approximative Lösung der allgemeinen linearen geodätischen Randwertaufgabe durch Reihenentwicklung nach Kugelfunktionen. Deutsche Geodätische Kommission, Reihe C, Nr. 304, Bayerische Akademie der Wissenschaften, München (1984)
37. Knogl, J.S., Henkel, P., Guenther, C.H.: Precise positioning of a geostationary data relay using LEO satellites. 53rd International Symposium ELMAR-2011, 14–16 Sept 2011, Zadar, pp. 325–328 (2011)
38. Kopeikin, S.M.: Reference ellipsoid and geoid in chronometric geodesy. Front. Fundam. Astron. 1, 1–15 (2016)
39. Kopeikin, S., Han, W., Mazurova, E.: Post-Newtonian reference ellipsoid for relativistic geodesy. Phys. Rev. D 93, 044069 (2016)

7 Ellipsoidal-Spheroidal Representation of the Gravity Field

441

40. Lemoine, F.G., Smith, D.E., Kunz, L., Smith, R., Pavlis, E.C., Pavlis, N.K., Klosko, S.M., Chinn, D.S., Torrence, M.H., Williamson, R.G., Cox, E.M., Rachlin, K.E., Wang, Y.M., Kenyon, S.C., Salman, R., Trimmer, R., Rapp, R.H., Nerem, R.S.: The development of the NASA GSFC and NIMA joint geopotential model. In: Segawa, J., Fujimoto, H., Okubo, S. (eds.) Gravity, Geoid and Marine Geodesy, International Association of Geodesy Symposia, Vol. 117, pp. 461–469. Springer, Berlin/Heidelberg (1996) 41. MacCullagh, J.: On the rotation of a solid body round a fixed point (being an account of the late Prof. MacCullagh’s lectures on that subject, compiled by S. Haughton). R Irish Acad, Trans 22, 139–154 (1855) 42. McClure, P.: Diurnal polar motion. NASA X-592-73-259, Goddard Space Flight Center, Greenbelt (1973) 43. Merriam, J.B.: Toroidal Love numbers and transverse stress at the Earth’s surface. J. Geophys. Res. 90(B9), 7795–7802 (1985) 44. Mitrovica, J.X., Davis, J.L., Shapiro, I.I.: A spectral formalism for computing threedimensional deformations due to surface loads – 1. Theory. J. Geophys. Res. 99(B4), 7057–7073 (1994) 45. Molodensky, M.S.: Main problem of geodetic gravimetry. Trans. Centr. Res. Inst. G, A & C 42 (1945) 46. Molodensky, M.S.: External gravitational field and the figure of the Earth’s physical surface. Information of the USSR Academy of Sciences, Geographical and Geophysical Series 13, no. 3 (1948) 47. Molodensky, M.S., Eremeev, V.F., Yurkina, M.I.: Methods for study of the external gravitational field and figure of the Earth [translated] from Russian by Israel Program for Scientific Translations for the Office of Technical Services, Department of Commerce, Washington, DC (1960) 48. Moritz, H.: Geodetic Reference System 1980, Geodesist’s Handbook. Bull. Geod. 66, 187–192 (1992) 49. Moritz, H., Mueller, I.: Earth Rotation. Ungar, New York (1987) 50. Nagel, E.: Die Bezugssysteme der Satellitengeodäsie. 
Deutsche Geodätische Kommission Reihe C, Heft Nr 223, Bayerische Akademie der Wissenschaften, München (1976) 51. Neumann, J.: Entwicklung der in elliptischen Koordinaten ausgedrückten rcziproken Entfernung zweier Punkte in Reihen. Journal für die Reine und Angewandte Mathematik 37, 21–50 (1848) 52. Pail, R., Gruber, T., Fecher, T.: The Combined Gravity Model G00005c. GFZ Data Services (2016). http://doi.org/10.5880/icgem.2016.003 53. Pizzetti, P.: Geodesia – Sulla espressione della gravita alla superficie del geoide, supposto ellisoidico. Atti Reale Accad Linc 3, 166–172 (1894) 54. Pizzetti, P.: Sopra il calcolo terico delle deriviazioni del geoide dall’ ellissoide. Atti Accad. Sci. Torino. 46, 331–350 (1911) 55. Rapp, R.H., Wang, Y.M., Pavlis, N.K.: The Ohio State 1991 geopotential and sea surface topography harmonic coefficient models. Report 410, Ohio State University Department of Geodetic Science and Surveying, Columbus (1991) 56. Ries, J.C., Eanes, R.J., Shum, C.K., Watkins, M.M.: Progress in the determination of the gravitational coefficient of the Earth. Geophys. Res. Lett. 19, 529–531 (1992) 57. Saupe, D.: Discrete versus continuo us Newton’s method: a case study. Acta Appl. Math. 13, 59–80 (1988) 58. Somigliana, C.: Teoria generale del campo gravitionale dell‘s ellipsoide. Mem. Soc. Astr. Ital. IV (1929) 59. Somigliana, C.: Geofisica – Sul campo gravitazionale esterno del geoide ellissoidico. Atti. Reale. Acad. Naz. Linc. Rendi. 6, 237–243 (1930) 60. Soop, E.M.: Handbook of Geostationary Orbit. Springer Co., Dordrecht (1994) 61. Spada, G.: Changes in the Earth inertia tensor: the role of boundary conditions at the core— mantle interface. Geophys. Res. Lett. 22(24), 3557–3560 (1995)

442

E. W. Grafarend

62. Tapley, B.D., Watkins, W.M., Ries, J.C., Davis, G.W., Eanes, R.J., Poole, S.R., Rim, H.J., Schutz, B.E., Shum, C.K., Nerem, R.S., Lerch, F.J., Marshall, J.A., Klosko, S.M., Pavlis, N.K., Williamson, R.G.: The JGM3 gravity model. J. Geophys. Res. 101, 28029–28049 (1996) 63. Thong, N.C., Grafarend, E.W.: A spheroidal model of the terrestrial gravitational field. Manuscr. Geodaet. 14: 285–304 (1989); Vanitiek P, Krakiwsky E (1986) Geodesy: The Concepts. Elsevier Science, Amsterdam 64. Wolf, D.: Gravitational viscoelastodynarnics for a hydrostatic planet. Deutsche Geodätische Kommission, Reihe C, Heft Nr 452, Bayerische Akademie der Wissenschaften, München (1997) 65. Yang, Y., Yang, X., Li, Z., Feng, C.: Satellite orbit determination combining C-band ranging and differenced ranges by transfer. Chinese Sci. Bull. 58(19):2323–2328 (2013) 66. Zund, J.: Foundations of differential geodesy, p. 373. Springer, Berlin (1994)

Part III Statistical Methods

8 Monte Carlo Methods

Karl-Rudolf Koch

Contents
1 Introduction
2 Bayesian Statistics
3 Error Propagation by Monte Carlo Methods
4 Generating Random Variates from Distributions
5 Monte Carlo Integration
6 Markov Chain Monte Carlo Methods
7 Minimal Detectable Outliers
8 Summary
Literature

Abstract Monte Carlo methods deal with generating random variates from probability density functions in order to estimate unknown parameters or general functions of unknown parameters and to compute their expected values, variances and covariances. One generally works with the multivariate normal distribution due to the central limit theorem. However, if random variables with the normal distribution are combined with random variables with a different distribution, the normal distribution is not valid anymore. The Monte Carlo method is then needed to obtain the expected values, variances and covariances for the random variables with distributions different from the normal distribution. The error propagation by the Monte Carlo method is discussed, as are methods for generating random variates from the multivariate normal distribution and from the multivariate uniform distribution. The Monte Carlo integration is presented, leading to the sampling-importance-resampling (SIR) algorithm. Markov Chain Monte Carlo methods provide, by the Metropolis algorithm and the Gibbs sampler, additional ways of generating random variates. A special topic is the Gibbs sampler for computing and propagating large covariance matrices, a task that arises when the geopotential is determined from satellite observations. The example of the minimal detectable outlier shows how the Monte Carlo method is used to determine the power of a hypothesis test.

This chapter is part of the series Handbuch der Geodäsie, volume "Mathematische Geodäsie/Mathematical Geodesy", edited by Willi Freeden, Kaiserslautern.

Karl-Rudolf Koch
Institute for Geodesy and Geoinformation, Theoretical Geodesy Group, University of Bonn, Bonn, Germany
E-Mail: [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020
W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_100

Zusammenfassung Monte Carlo methods work with random variates from distribution functions in order to estimate unknown parameters or general functions of unknown parameters, and to compute their expected values, variances and covariances. Because of the central limit theorem, one generally uses the multivariate normal distribution. However, if random variables with the normal distribution are combined with random variables of different distributions, the normal distribution is no longer valid. The Monte Carlo method is then needed to obtain the expected values, variances and covariances of the random variables whose distributions differ from the normal distribution. The error propagation by the Monte Carlo method is discussed, as are methods for generating random variates from the multivariate normal distribution and from the multivariate uniform distribution. The Monte Carlo integration with importance sampling leads to the SIR (sampling-importance-resampling) algorithm. Markov Chain Monte Carlo methods provide, through the Metropolis algorithm and the Gibbs sampler, further methods of generating random variates. As a special task, the Gibbs sampler for computing and propagating large covariance matrices is treated. This problem arises when the gravity field of the earth is determined from satellite observations. The example of the minimal detectable outliers shows how the Monte Carlo method is used to determine the power of a hypothesis test.

Keywords Bayesian statistics · SIR algorithm · Metropolis algorithm · Gibbs sampler · Markov Chain Monte Carlo method

1 Introduction

The contribution "Monte Carlo Methods" to the volume "Handbook of Mathematical Geodesy (HbMG)" for the new edition of "Jordan-Eggert-Kneissl" cannot be considered a review article; the existing applications and references in geodesy are too numerous. It rather reflects the author's experience in estimating unknown parameters or general functions of unknown parameters and in computing their expected values, variances and covariances by Monte Carlo methods. This contribution to HbMG was already published in a slightly different form in [49].
An example of the Monte Carlo methods is the law of error propagation: a first vector is nonlinearly transformed by a matrix of coefficients into a second vector whose covariance matrix has to be computed. The matrix of coefficients then contains the partial derivatives of the nonlinear transformation. If the distribution of the first vector is known (due to the central limit theorem, see Sect. 4.2, it is frequently the normal distribution), random variates are drawn from this distribution. The covariance matrix of the nonlinearly transformed vector then follows from the Monte Carlo estimate. This saves the determination of a considerable number of derivatives, so that errors of the linearization are avoided. Thus, the Monte Carlo estimate becomes more efficient, see Sect. 3.
The following topics are dealt with. Section 2 introduces some notions of Bayesian statistics. The unknown quantities or unknown parameters are defined as random variables for which random variates are generated from their distributions. Bayesian statistics is therefore well suited for Monte Carlo methods. It extends the notion of probability by defining the probability for statements or propositions, whereas traditional statistics, not founded on Bayes' theorem, introduces probability for random events resulting from random experiments. In Bayesian statistics, the knowledge about a statement is expressed by probability.
If the statements refer to the values of random variables, their distributions can be derived. The generalized Bayes' theorem, point estimation, the estimation of confidence regions and hypothesis testing are covered in this section. Section 3 treats the error propagation by Monte Carlo methods, which was already mentioned as an example at the beginning of this introduction. Section 4 deals with generating random variates from distributions. It presents the normal distribution, the multivariate normal distribution and the multivariate uniform distribution. Section 5 explains the Monte Carlo integration. It leads to the importance sampling and to the sampling-importance-resampling (SIR) algorithm. Furthermore, the computation of marginal distributions is presented. Section 6 gives the Markov Chain Monte Carlo methods, which also allow generating random variates from distributions. The Metropolis algorithm is the first of the Markov Chain Monte Carlo methods. The Gibbs sampler was developed for the Bayesian restoration of digital images and later used for a variety of problems of Bayesian inference. It is well suited for computing and propagating large covariance matrices. Section 7 discusses as an example the minimal detectable outliers as measures of reliability, i.e. the ability of a parameter estimation to detect outliers. It is shown how the power of a hypothesis test is computed by the Monte Carlo method.

2 Bayesian Statistics

2.1 Generalized Bayes' Theorem

Bayes’ theorem, which is derived for the probability of statements, is generalized such that it is valid for the probability density function, shortly called the distribution, of a random vector. Let x and y be random vectors, we then obtain by the conditional distribution under the condition that values of the random vector y are given, cf. [44, p. 31], p(x|y, C) =

p(x, y|C) p(y|C)

(1)

where the background information C is known. Furthermore, we have p(y|x, C) =

p(x, y|C) . p(x|C)

(2)

If these two equations are solved for p(x, y|C) and the resulting expressions are equated, the generalized Bayes’ theorem is found p(x|y, C) =

p(x|C)p(y|x, C) . p(y|C)

(3)

Since the vector y contains fixed values, p(y|C) is constant. Bayes’ theorem is therefore often applied in the form p(x|y, C) ∝ p(x|C)p(y|x, C).

(4)

where ∝ denotes proportionality. The components of the vector x are identified with unknown parameters, i.e. quantities which describe unknown phenomena. The set of vectors x, which collects all possible values for the parameters, is called the parameter space X , thus x ∈ X . To estimate the unknown parameters x,
measurements or observations, also called data, are taken which contain information about the parameters. The values y of the random vector y represent the given data. The probability density function p(x|C) contains information about the parameters x before the data y have been taken. One therefore calls p(x|C) the prior density function for the unknown parameters x. By taking into account the data y, the probability density function p(x|y, C) follows. It is named the posterior density function for the parameters x. Via the density function p(y|x, C), the information available in the data y reaches the parameters x. Since the data y are given, this density function is not interpreted as a function of y but as a function of the parameters x. It is called the likelihood function. Thus,

posterior density function ∝ prior density function × likelihood function.    (5)

The measurements modify the prior density function by the likelihood function and lead to the posterior density function for the unknown parameters. The vector x of unknown parameters is defined in Bayes' theorem as a random vector with which a prior density and a posterior density function are associated. This approach is contrary to that of traditional statistics, which is not based on Bayes' theorem and which defines the vector of unknown parameters as a vector of constants. But this does not mean that the vector x of parameters in Bayes' theorem (3) may not represent constants, like the coordinates of a point at the rigid surface of the earth. It is not the constants that are random; it is the knowledge about the values of the constants that is random and for which probability is introduced.
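The proportionality (4) can be made concrete with a small numerical sketch. The following Python fragment (all numbers are illustrative and not taken from the text) evaluates posterior ∝ prior × likelihood on a grid for a single unknown parameter, here the mean of normally distributed data with known standard deviation, and reads off the Bayes estimate of Sect. 2.2 by numerical integration.

```python
import numpy as np

# Grid evaluation of Bayes' theorem (4): posterior ∝ prior × likelihood.
# Single unknown parameter x: the mean of normal data with known sigma.
rng = np.random.default_rng(1)
sigma = 0.5
y = rng.normal(loc=2.0, scale=sigma, size=20)        # observed data y

x_grid = np.linspace(0.0, 4.0, 401)                  # parameter space X
prior = np.ones_like(x_grid)                         # flat prior p(x|C)
# log-likelihood log p(y|x,C): sum of log normal densities (up to a constant)
log_lik = np.array([-0.5 * np.sum(((y - x) / sigma) ** 2) for x in x_grid])
post = prior * np.exp(log_lik - log_lik.max())       # unnormalized posterior
post /= np.trapz(post, x_grid)                       # normalize to integral 1

x_bayes = np.trapz(x_grid * post, x_grid)            # Bayes estimate, cf. (9)
```

With a flat prior and known sigma, the posterior mean coincides with the sample mean of the data, which provides a simple check of the grid computation.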

2.2 Point Estimation

The knowledge of the posterior density function for the unknown parameters from Bayes' theorem allows estimating the unknown parameters. This is called point estimation. A decision is needed for the estimation, and it results in a loss. The true state of the system is characterized by the true values x of the unknown parameters. The loss is a function of the estimates x̂ and of the true values x of the parameters, so that it will be called L(x, x̂). It must express the quality of the estimate x̂. It therefore has to increase for bad estimates.
A simple loss function results from the sum of squares (x − x̂)′(x − x̂) of the errors x − x̂ of the estimates x̂ of the unknown parameters x. This sum of squares is generalized by means of the covariance matrix D(x|y) = Σ for the random vector x. It shall be positive definite. Its inverse Σ⁻¹ is proportional to the weight matrix of x, cf. [40, p. 104]. Thus, the quadratic loss function

L(x, x̂) = (x − x̂)′ Σ⁻¹ (x − x̂)    (6)


is chosen, where the squares of the errors x − x̂ are weighted by Σ⁻¹. This loss function leads to the well known method of least squares for the linear model, see Sect. 7.1. To determine the posterior expected loss of the quadratic loss function (6), its expected value has to be computed with the posterior density function p(x|y, C) for x from (3). We obtain

E[L(x, x̂)] = ∫_X (x − x̂)′ Σ⁻¹ (x − x̂) p(x|y, C) dx    (7)

where X denotes the parameter space over which the values of x have to be integrated. The minimum of the expected loss is obtained by, cf. [44, p. 66],

x̂_B = E(x|y).    (8)

The expected value is computed according to (7) by the posterior density function p(x|y, C) for x. The estimate x̂_B, which is called the Bayes estimate, therefore follows with

x̂_B = ∫_X x p(x|y, C) dx.    (9)

The vector x of unknown parameters is a random vector. Its Bayes estimate x̂_B is a fixed quantity. To express the dispersion of x̂_B, the covariance matrix D(x|y) for the vector x of unknown parameters is introduced by the posterior density function p(x|y, C)

D(x|y) = E((x − E(x|y))(x − E(x|y))′) = ∫_X (x − x̂_B)(x − x̂_B)′ p(x|y, C) dx.    (10)

We will look at the more general case that the n × 1 vector x of unknown parameters is defined by the n × 1 vector f(y) of functions of the n × 1 random vector y of measurements

x = f(y).    (11)

A simple example of the functions f(y) is the sum of the vector of measurements and of the vector of systematic effects in laser scanning [50]. The expected value E(f(y)) and the covariance matrix D(f(y)) of the vector f(y) follow with the probability density function p(x|y, C) of the vector x by

E(f(y)) = ∫_X f(y) p(x|y, C) dx,    (12)

D(f(y)) = ∫_X (f(y) − E(f(y)))(f(y) − E(f(y)))′ p(x|y, C) dx.    (13)

Let m random variates x_j be generated for the vector x from the probability density function p(x|y, C)

x_j = (x_ij)    for i ∈ {1, . . . , n}, j ∈ {1, . . . , m}.    (14)

The Monte Carlo estimate D̂(f(y)) of the n × n covariance matrix D(f(y)) of the vector f(y) is then obtained by the sum, cf. [58, p. 275], [44, p. 225], [48],

D̂(f(y)) = (1/m) ∑_{j=1}^{m} (x_j − Ê(x|y))(x_j − Ê(x|y))′    (15)

with the estimate Ê(f(y)) of the expectation E(f(y)) and with the estimate Ê(x|y) of the expectation E(x|y)

Ê(f(y)) = Ê(x|y) = (1/m) ∑_{j=1}^{m} x_j.    (16)

The example cited for the vector f(y) from (11) is normally distributed for the vector of measurements and multivariate uniformly distributed for the systematic effects. The Monte Carlo estimate is then needed, as the sum is not normally distributed anymore. This has also been pointed out in [2].
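The estimates (15) and (16) are plain sample moments and can be sketched directly. In the fragment below (with illustrative numbers, not taken from [50]), the variates x_j combine a normally distributed measurement part with a uniformly distributed systematic part, the situation for which the Monte Carlo estimate was just said to be needed.

```python
import numpy as np

# Monte Carlo estimates (16) and (15): sample mean and sample covariance
# of m variates x_j that are sums of a normal and a uniform part.
rng = np.random.default_rng(0)
m = 200_000
mean = np.zeros(2)
cov = np.array([[1.0, 0.3],
                [0.3, 0.5]])
normal_part = rng.multivariate_normal(mean, cov, size=m)
uniform_part = rng.uniform(-0.1, 0.1, size=(m, 2))    # systematic effects

x = normal_part + uniform_part                        # variates x_j of f(y)

e_hat = x.mean(axis=0)                                # estimate (16)
d_hat = (x - e_hat).T @ (x - e_hat) / m               # estimate (15)
```

The uniform part adds (0.2)²/12 ≈ 0.0033 to each variance, which the sample covariance reproduces although the sum is no longer normally distributed.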

2.3 Estimation of Confidence Regions

In contrast to the point estimation of Sect. 2.2, a region shall now be determined in which the vector x of unknown parameters is situated with a given probability. With the posterior density function p(x|y, C) for x from Bayes' theorem (3), we obtain the probability

P(x ∈ X_u | y, C) = ∫_{X_u} p(x|y, C) dx    (17)

that the vector x belongs to the subspace X_u of the parameter space X with X_u ⊂ X. One is often interested in finding the subspace where most of the probability, for instance 95%, is concentrated. Given a probability, there are obviously many possibilities to establish such a subspace. A region of values for x within the subspace, however, should be more probable than a region outside the subspace. It will therefore be required that the probability density function of each point within the subspace is equal to or greater than the density function of a point outside the subspace. The region of highest posterior density, also called H.P.D. region, is thus obtained. Let p(x|y, C) be the posterior density function for the vector x of unknown parameters; the subspace X_B with X_B ⊂ X is then called a 1 − α H.P.D. region or confidence region, if

P(x ∈ X_B | y, C) = ∫_{X_B} p(x|y, C) dx = 1 − α
and
p(x_1|y, C) ≥ p(x_2|y, C)    for    x_1 ∈ X_B, x_2 ∉ X_B.    (18)

The value for α is generally chosen to be α = 0.05, but also α = 0.1 or α = 0.01 are selected.
Let the vector x of unknown parameters contain only the random variable x as component. The confidence interval for x is then defined by (18). It is simply computed by Monte Carlo methods. Random variates x_j with j ∈ {1, . . . , m} for x are generated, see (14). The interval between the maximum and the minimum of the m values is divided into small intervals, let us say m − 1 intervals. The probabilities of the random variates x_j lying within the small intervals are determined by the relative frequencies. The probabilities at both ends of the interval are then added such that (18) is fulfilled, until the probability α is reached [46].
To get a confidence ellipse, random variates x_j from (14) for the vector x are generated in the rectangular region where the confidence ellipse is situated. The region is divided into a mesh of small rectangles. They are ordered in terms of decreasing probabilities obtained by relative frequencies. The probabilities are added at both ends of the interval such that (18) is fulfilled, until the probability α is reached [35].
To obtain three-dimensional confidence ellipsoids, again random variates x_j from (14) for the vector x are generated in the three-dimensional region where the confidence ellipsoid is situated. The region is divided into small rectangular volume elements. Relative frequencies are computed for the volume elements. The frequencies are scaled to the interval [0,1] to find the probability density function p_ij for i, j ∈ {1, . . . , m}. The densities are sorted in decreasing order. However, even for a large number of generated random variates, there will be many volume elements having the same densities, especially at the boundary of the confidence ellipsoid, so that the densities cannot be ordered. The following method has been applied to solve this problem. The densities p_ij are arranged together with the coordinates of the volume elements in decreasing order. The highest density value p_ij together with the coordinates of the volume element determines the center of the confidence ellipsoid. Distances are computed from the center of the confidence ellipsoid to the volume elements and sorted in increasing order together with the densities p_ij and the coordinates of the volume elements. The densities are summed such that (18) is fulfilled, until the probability α is reached [50].
To construct a confidence hyperellipsoid, the probability density functions p(x_j|y, C) for the random variates x_j from (14) are sorted in decreasing order. By summing the density functions up to the index b for which 1 − α holds true, the point x_b with p_b = p(x_b|y, C) at the boundary of the confidence hyperellipsoid is obtained. If the vectors x_j are graphically depicted as points, and if neighboring points are selected with smaller density values than p_b and with larger ones, the boundary of the confidence hyperellipsoid is obtained by interpolating between these points. This method, of course, can also be applied to compute confidence ellipses and confidence ellipsoids.
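For a single parameter, the H.P.D. interval of (18) can be sketched as follows. Instead of the histogram procedure described above, this fragment uses an equivalent shortcut for a unimodal density: among all intervals containing a fraction 1 − α of the sorted variates, the shortest one is the interval of highest density. The standard normal test case is illustrative.

```python
import numpy as np

def hpd_interval(samples, alpha=0.05):
    """Shortest interval containing a fraction 1 - alpha of the samples,
    an approximation of the 1 - alpha H.P.D. interval (18) for a
    unimodal posterior density."""
    s = np.sort(samples)
    m = len(s)
    k = int(np.ceil((1.0 - alpha) * m))        # number of samples inside
    widths = s[k - 1:] - s[:m - k + 1]         # widths of candidate intervals
    i = int(np.argmin(widths))                 # shortest = highest density
    return s[i], s[i + k - 1]

rng = np.random.default_rng(2)
lo, hi = hpd_interval(rng.normal(0.0, 1.0, size=100_000), alpha=0.05)
# for N(0, 1) the 95% H.P.D. interval is approximately (-1.96, 1.96)
```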

2.4 Hypothesis Testing

Propositions concerning the unknown parameters may be formulated as hypotheses, and methods for deciding whether to accept or to reject the hypotheses are called hypothesis tests. Let X_0 ⊂ X and X_1 ⊂ X be subspaces of the parameter space X and let X_0 and X_1 be disjoint, i.e. X_0 ∩ X_1 = ∅. The assumption that the vector x of unknown parameters is an element of the subspace X_0 is called the null hypothesis H_0, and the assumption that x is an element of X_1 the alternative hypothesis H_1. We are interested in the point null hypothesis. It is defined by

H_0: x = x_0    versus    H_1: x ≠ x_0    (19)

where only the vector x_0 in X_0 is given and X_1 is the complement of X_0. The point null hypothesis will be tested such that it is accepted if the point x_0 lies within the 1 − α confidence region for the parameter vector. This is the case if the inequality

p(x_0|y, C) > p(x_b|y, C)    (20)

is fulfilled, where x_b denotes a point at the boundary of the confidence region X_B. Its density p_b is introduced by

p_b = p(x_b|y, C)    (21)

and a point x_0 lies within the confidence region X_B, if

p(x_0|y, C) > p_b.    (22)

The hypothesis H0 is then accepted, otherwise it is rejected. See the last paragraph of Sect. 2.3 for an application of (22) and see [50] for the use of the hypothesis test.

3 Error Propagation by Monte Carlo Methods

Let the n × 1 random vector x with the covariance matrix D(x|y) be linearly transformed with the m × n matrix A and the m × 1 vector b of constants into the m × 1 random vector x̄ by x̄ = Ax + b. Its m × m covariance matrix D(x̄|y) follows with

D(x̄|y) = D(Ax + b) = A D(x|y) A′    (23)

since we obtain with the definition (10) of a covariance matrix D(x|y) = E((x − E(x|y))(x − E(x|y))′) and

D(x̄|y) = E((Ax + b − A E(x|y) − b)(Ax − A E(x|y))′) = A E((x − E(x|y))(x − E(x|y))′) A′.

In case of nonlinear transformations, the matrix A contains as elements the derivatives of the transformed values of the random vector x with respect to the values of x, cf. [40, p. 100]. Equation (23) is then called the law of error propagation. If the distribution of the vector x is known (due to the central limit theorem, see Sect. 4.2, it is frequently the normal distribution), random variates x_j are drawn like in (14) from this distribution for x

x_j = (x_ij)    for i ∈ {1, . . . , n}, j ∈ {1, . . . , m}.    (24)

The covariance matrix D(x̄|y) of the nonlinearly transformed vector x̄ and its expectation E(x̄|y) then follow from the Monte Carlo estimates (15) and (16). This saves computing a considerable number of derivatives, so that errors of the linearization are avoided, as was already mentioned in Sect. 1. Thus, the Monte Carlo estimate becomes more efficient, cf. [1, 70]. For geodetic applications, this has been pointed out in [32] and demonstrated in [46].
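As a sketch of the procedure (with illustrative input values, not from the text), the following fragment propagates the covariance of polar observations (r, φ) through the nonlinear transformation (x, y) = (r cos φ, r sin φ) by the Monte Carlo estimate (15), and compares the result with the linearized law of error propagation (23), where A is the Jacobian.

```python
import numpy as np

# Monte Carlo error propagation for a nonlinear transformation,
# compared with the linearized law (23). Input values are illustrative.
rng = np.random.default_rng(3)
mu = np.array([100.0, 0.5])                     # mean distance r and angle phi
sigma = np.diag([0.01**2, 0.001**2])            # covariance of (r, phi)

m = 500_000
r, phi = rng.multivariate_normal(mu, sigma, size=m).T
xy = np.column_stack([r * np.cos(phi), r * np.sin(phi)])
d_mc = np.cov(xy, rowvar=False)                 # Monte Carlo estimate (15)

# linearized law of error propagation (23) with the Jacobian A
A = np.array([[np.cos(mu[1]), -mu[0] * np.sin(mu[1])],
              [np.sin(mu[1]),  mu[0] * np.cos(mu[1])]])
d_lin = A @ sigma @ A.T
```

For these small standard deviations both results agree closely; the Monte Carlo estimate needs no derivatives and remains valid when the linearization errors grow.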

4 Generating Random Variates from Distributions

The posterior density function p(x|y, C) for the random vector x of the unknown parameters has to be integrated for the Bayes estimate from (9) and for computing confidence regions from (18). Often, these integrals cannot be solved analytically, so that numerical techniques have to be applied. The Monte Carlo integration in Sect. 5 helps to overcome this deficiency. Random variates are generated for the Monte Carlo integration from a distribution which approximates the posterior density functions for the unknown parameters. One distinguishes between generating random numbers, that is, values of a random variable which possesses the uniform distribution in the interval [0,1], and generating values of random variables with a known distribution. Both methods give random values, also called random variates. They are generated by computers. Deterministic procedures are applied, so that pseudo random numbers instead of true random numbers are obtained. However, one strives for generating pseudo random numbers with such properties that they can be used as true random numbers. One therefore talks about random variates only.

4.1 Inversion Method

Let X be a random variable with the distribution function F(x). It is a monotonically increasing function, and its inverse function F⁻¹(u) is defined for any value of u between 0 and 1 by the smallest value for x which fulfills F(x) = u, thus

F⁻¹(u) = min{x : F(x) = u, 0 ≤ u ≤ 1}.    (25)

If the random variable U is uniformly distributed in the interval [0, 1],

the random variable X = F⁻¹(U) has the distribution function F(x).    (26)

This may be shown by the definition of the distribution function P(X < x|C) = P(F⁻¹(U) < F⁻¹(u)|C) = P(U < u|C) = u = F(x), because P(U < u|C) = u holds true for a uniform distribution. Furthermore, if a random variable X has the distribution function F(x),

the random variable U = F(X) is uniformly distributed in the interval [0, 1]    (27)

as for every u with 0 ≤ u ≤ 1 one gets P(U < u|C) = P(F(X) < u|C) = P(X < F⁻¹(u)|C) = F(F⁻¹(u)) = u, since P(U < u|C) = u is valid for a uniform distribution. To generate a value x of the random variable X with the distribution function F(x) by the inversion method, the following two steps are necessary:
1. a random number is generated, i.e. a value u of the random variable U uniformly distributed in the interval [0, 1],
2. the value x follows from

x = F⁻¹(u).    (28)

Thus, the inversion method presents itself if the distribution function F(x) and its inverse function can be given analytically. Let values of the random variable X with the uniform distribution be generated in the interval [a, b]. We obtain (x − a)/(b − a) = u from the uniform distribution and

x = a + (b − a)u.    (29)

Random variates can also be generated by the rejection method, which is applied for a random variable whose probability density function is not normalized. This happens if the density function results from Bayes' theorem (4), cf. [13, p. 53], [14, p. 50], [68, p. 45] and [71].
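The two steps of the inversion method can be sketched for the exponential distribution, whose distribution function F(x) = 1 − exp(−λx) has the analytic inverse F⁻¹(u) = −ln(1 − u)/λ (the choice of distribution is illustrative):

```python
import numpy as np

# Inversion method (28) for the exponential distribution with rate lam:
# F(x) = 1 - exp(-lam x), hence F^{-1}(u) = -ln(1 - u) / lam.
rng = np.random.default_rng(4)
lam = 2.0
u = rng.uniform(0.0, 1.0, size=200_000)   # step 1: random numbers in [0, 1]
x = -np.log(1.0 - u) / lam                # step 2: x = F^{-1}(u)

# uniform variates in an interval [a, b] follow from (29) in the same way
a, b = 3.0, 7.0
x_uniform = a + (b - a) * u
```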

4.2 Normal Distribution and Multivariate Normal Distribution

The significance of the normal distribution stems from the central limit theorem [12, p. 214], [77, p. 257]. It states that for n independent random variables with any distributions, the distribution of the sum moves under certain conditions asymptotically towards a normal distribution if n goes to infinity. A corresponding theorem holds for the multivariate normal distribution [12, p. 316]. A number of techniques exist for generating random variates from a normal distribution; for a review see [13, p. 93] and [14, p. 379]. A method often applied goes back to [11] with a modification of [60] for fast computations. In general, random variates z_j for j ∈ {1, . . . , m} with the standard normal distribution N(0, 1) are generated. To obtain random variates x_j with the normal distribution N(μ, σ²), the transformation

x_j = σ z_j + μ    (30)

is applied. Random variates x_j with j ∈ {1, . . . , m} for the n × 1 vector x with the multivariate normal distribution x ∼ N(μ, Σ) shall now be determined, where the covariance matrix Σ is assumed to be positive definite. First, n independent variates from the normal distribution N(0, 1) are generated and collected in the n × 1 vector z_j, which has the distribution N(0, I). If the transformation based on a Cholesky factorization of the positive definite covariance matrix, say Σ = GG′, is applied, where G denotes a regular lower n × n triangular matrix, cf. [40, p. 30], we obtain

x_j = G z_j + μ,    (31)

so that the distribution x_j ∼ N(μ, Σ) follows. Instead of using the transformation (31) based on the Cholesky factorization of the covariance matrix Σ, we may also apply the decomposition of Σ into its eigenvalues [44, p. 197].
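A sketch of the transformation (31) in Python with an illustrative mean vector and covariance matrix; the sample moments of the generated variates reproduce μ and Σ:

```python
import numpy as np

# Variates from N(mu, Sigma) by the Cholesky transformation (31).
rng = np.random.default_rng(5)
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 2.0, 0.3],
                  [0.5, 0.3, 1.0]])          # positive definite
G = np.linalg.cholesky(Sigma)                # lower triangular, Sigma = G G'

m = 300_000
z = rng.standard_normal((m, 3))              # rows z_j ~ N(0, I)
x = z @ G.T + mu                             # rows x_j = G z_j + mu
```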

4.3 Multivariate Uniform Distribution

An analytical expression for the multivariate uniform distribution is not available. However, one can proceed according to [18], see also [25, p. 207]. Let the n × 1 vector x = (x_i) be normally distributed with expectation E(x) = μ and a prescribed covariance matrix Σ = (σ_jk) with j, k ∈ {1, . . . , n}, thus x ∼ N(μ, Σ). The prescribed correlation matrix R = (ρ_jk) follows by

ρ_jk = σ_jk/(σ_j σ_k)  (32)

with σ_j = √(σ_j²) and σ_k = √(σ_k²) being the standard deviations of x_j and x_k. Random variates x_j are drawn as in (14) from the normal distribution

x_j = (x_ij)  for  i ∈ {1, . . . , n}, j ∈ {1, . . . , m}  (33)

by first generating m times n independent random variates with the standard normal distribution N(0, 1) to obtain the n × 1 vectors z_j. They are transformed by

x_j = G z_j + μ  (34)

where G denotes the lower triangular matrix of the Cholesky factorization Σ = GG′, see (31). The distribution functions F(x_ij) of the standard normal distribution are computed for the random variates x_ij to obtain the vector f_j = (F(x_1j), . . . , F(x_nj))′ of random variates whose components u_ij

u_ij = F(x_ij)  for  i ∈ {1, . . . , n}, j ∈ {1, . . . , m}  (35)

are random variates from the multivariate uniform distribution in the interval [0, 1] with a correlation matrix R̃ which is very close to the prescribed correlation matrix R. If identity of the given matrix R and the matrix R̃ is required, one can manipulate the matrix one starts with such that the generated matrix agrees with the prescribed one. The Guide to the Expression of Uncertainty in Measurement (GUM) [34] recommends, in addition to the uniform distribution, the trapezoidal distribution and the triangular distribution for the analysis of measurements, cf. [15, p. 115], [16, 36]. Random variates for the multivariate trapezoidal distribution and the multivariate triangular distribution are generated in correspondence to the multivariate uniform distribution shown above [45]. The methods recommended by GUM are compared with fuzzy techniques in [3].
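A minimal sketch of the steps (34)-(35) for a pair of variates; the prescribed correlation ρ = 0.8 is an illustrative assumption, and the resulting uniform variates have a correlation close to, but not identical with, ρ:

```python
import math
import random

def std_normal_cdf(x):
    """Distribution function F of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def correlated_uniforms(rho, m, rng):
    """Pairs (u1, u2), uniform on [0, 1], whose correlation is close to the
    prescribed normal correlation rho: transformation (34) followed by (35)."""
    g22 = math.sqrt(1.0 - rho * rho)       # Cholesky factor of the 2 x 2 matrix R
    out = []
    for _ in range(m):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1, x2 = z1, rho * z1 + g22 * z2   # correlated normal variates, eq. (34)
        out.append((std_normal_cdf(x1), std_normal_cdf(x2)))  # eq. (35)
    return out

rng = random.Random(3)
us = correlated_uniforms(0.8, 40000, rng)
```

For ρ = 0.8 the uniform pairs come out with a correlation of roughly 0.79, illustrating that R̃ is close to but not identical with R.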

5 Monte Carlo Integration

As shown for the point estimation by (9) and for the estimation of confidence regions by (18), integrals have to be solved. The Monte Carlo integration solves this problem. It is based on generating random variates from a probability density function which approximates the density function in the integral to be solved. The integral is then computed by a summation.


5.1 Importance Sampling

Let x be a random vector of unknown parameters with x ∈ X, where X denotes the parameter space, and let p(x|y, C) be the posterior density function from Bayes' theorem (3). The integral I with

I = ∫_X g(x) p(x|y, C) dx = E_p(g(x))_{x∈X}  (36)

needs to be computed, where g(x) denotes some function. The integral is equal to the expected value E_p(g(x))_{x∈X} of g(x) computed with p(x|y, C) for x ∈ X. If p(x|y, C) is intractable, which means that random values cannot be generated from p(x|y, C), and if u(x) is a tractable density function which approximates p(x|y, C), we rewrite the integral to obtain

I = ∫_X g(x)(p(x|y, C)/u(x)) u(x) dx = E_u(g(x)(p(x|y, C)/u(x)))_{x∈X} = E_p(g(x))_{x∈X}  (37)

where E_u(g(x)(p(x|y, C)/u(x)))_{x∈X} is the expected value of g(x)(p(x|y, C)/u(x)) computed with u(x) for x ∈ X. A sequence of m independent and identically distributed random vectors x_j, j ∈ {1, . . . , m}, is now generated which have the density function u(x). Because u(x) approximates the density function p(x|y, C), the random values are generated at points which are important, that is, at points where large values of p(x|y, C) are concentrated. The estimate Î of I from (37) follows with (16) as the arithmetic mean

Î = (1/m) ∑_{j=1}^{m} g(x_j)(p(x_j|y, C)/u(x_j)).  (38)

This is called the importance sampling of the Monte Carlo integration. Let g(x_j)(p(x_j|y, C)/u(x_j)) be independent and have equal variances σ²,

V(g(x_j)(p(x_j|y, C)/u(x_j))) = σ²  for j ∈ {1, . . . , m}.  (39)

The variance V(Î) of the estimate Î then follows with the error propagation (23) by

V(Î) = (1/m²) ∑_{j=1}^{m} V(g(x_j)(p(x_j|y, C)/u(x_j))) = (1/m²) ∑_{j=1}^{m} σ² = (1/m) σ².  (40)

If m goes to infinity, the variance goes to zero,

V(Î)_{m→∞} = 0  (41)


and we obtain in the limit

Î_{m→∞} = I.  (42)

The number m of random variates depends on the variance V(Î) in (40) one wants to achieve. It also depends on how well p(x|y, C) is approximated by u(x). If u(x) is a poor choice, m needs to be large. Because of Bayes' theorem (3), one usually works with a posterior density function which is not normalized. We will call it p̄(x|y, C), thus

p(x|y, C) = p̄(x|y, C)/c  with  c = ∫_X p̄(x|y, C) dx  (43)

where c denotes the normalization constant. Its estimate ĉ follows with g(x_j) = 1 from (38) by

ĉ = (1/m) ∑_{j=1}^{m} p̄(x_j|y, C)/u(x_j)  (44)

and the estimate Î of the integral I, if p̄(x|y, C) is applied, by

Î = ∑_{j=1}^{m} g(x_j)(p̄(x_j|y, C)/u(x_j)) / ∑_{j=1}^{m} (p̄(x_j|y, C)/u(x_j))  (45)

  = ∑_{j=1}^{m} w_j g(x_j)  (46)

with

w_j = (p̄(x_j|y, C)/u(x_j)) / ∑_{j=1}^{m} (p̄(x_j|y, C)/u(x_j))  (47)

being the importance weights. Because of ∑_{j=1}^{m} w_j = 1, the estimate Î can be interpreted as the weighted arithmetic mean of g(x_j). If the prior density function p(x|C) in Bayes' theorem (3) gives a good approximation of the posterior density function p(x|y, C) and if it is tractable, it can be used as the approximate density function u(x). The importance weights w_j then follow from (47) by the likelihood function p(y|x, C) evaluated at the random variates x_j generated from the prior density function p(x|C). Thus,


w_j = p(y|x_j, C) / ∑_{j=1}^{m} p(y|x_j, C).  (48)
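Importance sampling with an unnormalized density, eqs. (44) and (46), can be sketched as follows. The target kernel p̄(x) = exp(−(x − 1)²/2), whose normalization constant is √(2π) and whose mean is 1, and the importance density u(x) = N(0, 2²) are illustrative assumptions:

```python
import math
import random

def normal_pdf(x, mu, sd):
    """Density of the normal distribution N(mu, sd^2)."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

rng = random.Random(4)
m = 50000
# unnormalized "posterior" kernel p_bar with true mean 1 and c = sqrt(2*pi)
p_bar = lambda x: math.exp(-0.5 * (x - 1.0) ** 2)
# tractable importance density u(x): a wider normal N(0, 2^2)
xs = [rng.gauss(0.0, 2.0) for _ in range(m)]
w = [p_bar(x) / normal_pdf(x, 0.0, 2.0) for x in xs]
c_hat = sum(w) / m                                      # normalization constant, (44)
i_hat = sum(wj * xj for wj, xj in zip(w, xs)) / sum(w)  # weighted mean, (46) with g(x) = x
```

Here ĉ approaches √(2π) ≈ 2.507 and Î approaches the posterior mean 1 as m grows, without the normalization constant of p̄ ever being used.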

5.2 SIR Algorithm

If we sample from the posterior density function p̄(x|y, C) so that u(x) = p̄(x|y, C), we obtain from (47) the constant weights w_j = 1/m, which are interpreted as constant probabilities for the samples x_j. Summing over the probabilities 1/m for g(x_j) = 1 in (45) gives Î = 1, because the integral over a density function is equal to one. The interpretation of the samples x_j as having probabilities w_j leads to the sampling-importance-resampling (SIR) algorithm of [67], see also [71], as an alternative to the rejection method mentioned at the end of Sect. 4.1:

1. Draw, say, M samples x_j with x_j ∈ X from a tractable density function u(x) approximating the target density p(x|y, C).
2. Draw from these samples, which have probabilities equal to the importance weights w_j in (47), k samples x*_j with k < M. They have the distribution p(x|y, C) in the case M → ∞.  (49)

The SIR algorithm has been applied for recursive Bayesian filtering by [28], who called their method the bootstrap filter, see also [17].
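The two steps of the SIR algorithm (49) can be sketched as follows; the uniform proposal density and the standard normal target kernel are illustrative assumptions:

```python
import math
import random

rng = random.Random(5)
M, k = 100000, 20000
p_bar = lambda x: math.exp(-0.5 * x * x)   # unnormalized target kernel
# step 1: M samples from the tractable uniform density u(x) on [-5, 5]
xs = [rng.uniform(-5.0, 5.0) for _ in range(M)]
# importance weights (47); the constant u(x) = 1/10 cancels on normalization
w = [p_bar(x) for x in xs]
# step 2: resample k < M values with probabilities proportional to the weights
resampled = rng.choices(xs, weights=w, k=k)
```

The resampled values are approximately standard normally distributed, the approximation improving as M grows.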

5.3 Computation of Marginal Distributions

Let estimates be computed, confidence regions be established, or hypotheses be tested only for a subset of the unknown parameters x, which is collected in the vector x_1 with x = |x′_1, x′_2|′. The posterior marginal density function p(x_1|y, C) is then determined by a Monte Carlo integration with

p(x_1|y, C) = ∫_{X_2} p(x_1, x_2|y, C) dx_2  (50)

where X_2 denotes the parameter space of x_2. With random variates for x_1, which have the marginal density function p(x_1|y, C), the vector x_1 is then estimated, confidence regions for x_1 are computed, or hypotheses for x_1 are tested by the methods described in Sects. 2.2, 2.3 and 2.4. Random variates for x_1 have to be drawn from a density function which approximates the marginal density function for x_1. In addition, random variates for x_2 with the density function p(x_1, x_2|y, C) need to be generated given the value for x_1.

6 Markov Chain Monte Carlo Methods

The numerical methods of the Monte Carlo integration rely on generating random samples from distributions which approximate the posterior density function p(x|y, C) for the random vector x of unknown parameters from Bayes’ theorem (3). Numerical methods are now covered which are based on generating random variates from the posterior density function p(x|y, C) itself. For special cases p(x|y, C) might be tractable so that random samples can be drawn. Furthermore, it is possible to generate samples from normally distributed observations and transform them into the estimates of the unknown parameters to obtain random variates for the parameters, cf. [4, 41]. General methods for generating random variates for the posterior density function p(x|y, C) are given by the Markov Chain Monte Carlo methods. They simulate a Markov Chain in the parameter space X for the unknown parameters x so that the limiting distribution of the Markov Chain is the target distribution, the distribution with the posterior density function p(x|y, C). Random variates for x are generated from an approximate distribution, which is then moved towards a better approximation of the target distribution. The samples are drawn sequentially, and the distribution of one sample depends on the previous draw so that a Markov Chain is formed. At each step of the simulation the approximate distribution is improved, until it converges to the target distribution. Two Markov Chain Monte Carlo methods will be presented, the Metropolis algorithm and the Gibbs sampler.

6.1 Metropolis Algorithm

The Metropolis algorithm was the first Markov Chain Monte Carlo method to be developed and goes back to [61]. It does not need a special distribution to sample from and can be applied to a posterior density function p̄(x|y, C) from (43) which is not normalized, because ratios of density functions are computed. One samples a proposal x* from a jumping or proposal distribution p_t(x*|x_{t−1}) for t ∈ {1, 2, . . .}, with x_{t−1} being the previously generated vector. The jumping distribution has to be symmetric, which means that the probability of drawing x̄ from x is equal to that of obtaining x from x̄. The ratio r of the density functions [21, p. 289]

r = p(x*|y) / p(x_{t−1}|y)  (51)

is computed. One sets

x_t = x*  with probability min(r, 1),
x_t = x_{t−1}  otherwise.  (52)


This means that if r ≥ 1, the proposal x* is accepted. If r < 1, a random number u for the random variable U is generated from the uniform distribution in the interval [0, 1], and x* is accepted if u < r, since P(U < u) = u holds true for the uniform distribution. The Metropolis algorithm is easy to apply. However, if the proposal distribution is far off the target distribution, the Metropolis algorithm becomes inefficient because of its slow convergence, especially in higher dimensions. For a proof of convergence of the Metropolis algorithm to the target distribution cf. [21, p. 290].
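A minimal sketch of the Metropolis algorithm (51)-(52) with a symmetric random-walk proposal; the target kernel, the proposal standard deviation and the burn-in length are illustrative assumptions:

```python
import math
import random

def metropolis(p_bar, x0, step, n, rng):
    """Random-walk Metropolis: symmetric proposals x* ~ N(x_{t-1}, step^2),
    ratio r = p_bar(x*)/p_bar(x_{t-1}) as in (51), acceptance rule (52)."""
    chain, x = [], x0
    for _ in range(n):
        x_star = x + rng.gauss(0.0, step)
        r = p_bar(x_star) / p_bar(x)
        if r >= 1.0 or rng.random() < r:   # accept with probability min(r, 1)
            x = x_star
        chain.append(x)
    return chain

rng = random.Random(6)
# unnormalized target: normal kernel with mean 2 and variance 1
chain = metropolis(lambda x: math.exp(-0.5 * (x - 2.0) ** 2), 0.0, 1.0, 60000, rng)
kept = chain[5000:]            # discard a burn-in phase
```

After the burn-in phase, the retained states are distributed approximately according to the normalized target, here N(2, 1).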

6.2 Gibbs Sampler

The Gibbs sampler was introduced by [23] for the Bayesian restoration of digital images and was then frequently used for different tasks of digital image analysis, cf. [22, 24], [53, p. 310]. After [20] showed that the Gibbs sampler can be applied to a variety of problems of Bayesian inference, cf. also [72], [63, p. 225], [26] and [21, p. 287], it became an often used Markov Chain Monte Carlo method. The Gibbs sampler decomposes the parameter space by sampling from the conditional distribution of each parameter x_i of the vector x = (x_i) of unknown parameters, thus diminishing the problem of high dimensions. From the conditional density function for the random vector x_1 given the values for x_2 we get the relation p(x_1|x_2, C) = p(x_1, x_2|C)/p(x_2|C), and for the posterior density function p(x|y, C)

p(x_i|x_1, . . . , x_{i−1}, x_{i+1}, . . . , x_n, y, C) = p(x_1, . . . , x_n|y, C) / p(x_1, . . . , x_{i−1}, x_{i+1}, . . . , x_n|y, C)  for  i ∈ {1, . . . , n}.  (53)

The posterior density function p(x|y, C) is uniquely determined by these conditional density functions for x_i [10]. Since only x_i is variable, while fixed values are assigned to the remaining components, the denominator on the right-hand side of (53) corresponds to a normalization constant, as a comparison with (43) shows. The conditional density function for x_i on the left-hand side of (53) is therefore found by considering only the component x_i of the posterior density function p(x|y, C) as variable and introducing the appropriate normalization constant. An example of using the SIR algorithm for the Gibbs sampler is given by [43]. The Gibbs sampler begins with arbitrary starting values

x_1^(0), . . . , x_n^(0).  (54)

Random values are then sequentially drawn for xi with i ∈ {1, . . . , n} from (53) to complete one iteration. For the qth iteration we generate


x_1^(q) from p(x_1 | x_2^(q−1), . . . , x_n^(q−1), y, C)
x_2^(q) from p(x_2 | x_1^(q), x_3^(q−1), . . . , x_n^(q−1), y, C)
x_3^(q) from p(x_3 | x_1^(q), x_2^(q), x_4^(q−1), . . . , x_n^(q−1), y, C)  (55)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
x_n^(q) from p(x_n | x_1^(q), . . . , x_{n−1}^(q), y, C).

The sequence of random variates generated by the iterations forms a Markov Chain. It will be assumed that after o iterations convergence is reached, so that the distribution of the generated random vector with values x_1^(o), . . . , x_n^(o) is close enough to the target distribution. In the limit o → ∞, it is the target distribution with density function p(x|y, C) [23]. Conditions for the convergence are found in [66]. The process with o iterations is repeated with new starting values m times, so that m random variates x_j generated for x are obtained, which will be denoted by

x_j = |x_1j, . . . , x_nj|′  with j ∈ {1, . . . , m}.  (56)

One frequently generates only one Markov Chain and discards all random samples during a burn-in phase of, say, k iterations. Afterwards, the random variates of only each sth iteration are collected to obtain the samples x_j in (56), because subsequent samples are correlated. The value of s, of course, depends on the correlation between the samples, which in turn depends on the correlation of the unknown parameters.
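The scheme (55) can be sketched for the simplest nontrivial case, a bivariate normal distribution whose conditional distributions are known in closed form; the correlation 0.9, the burn-in length and the thinning interval s = 5 are illustrative assumptions:

```python
import math
import random

def gibbs_bivariate_normal(rho, n_iter, rng):
    """Gibbs sampler (55) for a zero-mean bivariate normal with unit variances
    and correlation rho; the full conditionals are
    x1 | x2 ~ N(rho*x2, 1 - rho^2) and x2 | x1 ~ N(rho*x1, 1 - rho^2)."""
    sd = math.sqrt(1.0 - rho * rho)
    x1, x2 = 0.0, 0.0                     # arbitrary starting values, eq. (54)
    chain = []
    for _ in range(n_iter):
        x1 = rng.gauss(rho * x2, sd)      # draw x1 given the current x2
        x2 = rng.gauss(rho * x1, sd)      # draw x2 given the new x1
        chain.append((x1, x2))
    return chain

rng = random.Random(7)
chain = gibbs_bivariate_normal(0.9, 60000, rng)
kept = chain[2000::5]                     # burn-in, then keep every 5th draw
```

The retained pairs reproduce the prescribed correlation and unit variances; the stronger the correlation, the slower the chain mixes and the more thinning helps.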

6.3 Gibbs Sampler for Computing and Propagating Large Covariance Matrices

When estimating unknown parameters, their covariance matrix (10) is needed to judge the variations of the estimates. Further quantities are often derived from the unknown parameters, and their covariance matrix might be of greater interest than the covariance matrix of the unknown parameters itself. The covariance matrix of the derived quantities is obtained according to (23) by propagating the covariance matrix of the unknown parameters. For instance, when determining the gravity field of the earth from satellite observations, the geopotential is generally expanded into spherical harmonics whose coefficients are the unknown parameters. Several tens of thousands of harmonic coefficients are estimated in linear models, so that it takes a considerable computational effort to compute the covariance matrix of the harmonic coefficients by inverting the matrix of normal equations. The harmonic coefficients are transformed into gridded gravity anomalies or geoid undulations. By orbit integration, the positions of satellites result from the harmonic coefficients. The covariance matrices of these derived quantities are obtained in the case of linear transformations by multiplying the covariance matrix of the harmonic coefficients from the left by the matrix of the linear transformation and from the right by its transpose according to (23). For nonlinear transformations, the matrix of transformations contains the derivatives of the transformations, as mentioned in connection with (24). These propagations of covariance matrices lead in the case of many unknown parameters to cumbersome computations. By applying the random variates x_j from (56) of the Gibbs sampler, the determination of the derivatives of the nonlinear transformations can be avoided, so that the Monte Carlo estimate with (15) and (16) becomes simpler. If the unknown parameters are strongly correlated, a grouping or blocking technique becomes helpful [59, p. 130]. Gundlich et al. [29] used the grouping technique and derived it for correlated unknown parameters, where the matrix of normal equations appears as ill-conditioned. The Gibbs sampler thus obtained is well suited for parallel computing, as shown by [52]. To determine by hypothesis tests the maximum degree of harmonic coefficients in a geopotential model, a parallel Gibbs sampler was applied to compute random variates for harmonic coefficients [42]. Alkhatib and Schuh [4] discussed the Monte Carlo estimate of covariance matrices for determining the geopotential from satellite data. Gundlich and Kusche [30] presented the solution of a quasi-linear model by the Monte Carlo method for gravity field recovery from satellite data. Quasi-linear models are a special class of nonlinear models where the coefficient matrix of the model depends on the unknown parameters. Monte Carlo simulations for the decorrelation of the signals of GPS satellites are presented by [78].
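The derivative-free Monte Carlo propagation described above can be sketched for a single parameter and the nonlinear transformation y = x²; the parameter distribution is an illustrative assumption:

```python
import random

def mc_propagate(samples, func):
    """Monte Carlo propagation: transform each parameter sample and form the
    sample mean and variance of the results, as with (15) and (16); no
    derivatives of func are required."""
    ys = [func(x) for x in samples]
    m = len(ys)
    mean = sum(ys) / m
    var = sum((y - mean) ** 2 for y in ys) / (m - 1)
    return mean, var

rng = random.Random(8)
# variates of a single unknown parameter, x ~ N(3, 0.2^2)
xs = [rng.gauss(3.0, 0.2) for _ in range(50000)]
# nonlinear transformation y = x^2; linearized propagation with (24) would give
# Var(y) ~ (2*3)^2 * 0.2^2 = 1.44, while the exact value is 1.44 + 2*0.2^4 = 1.4432
mean_y, var_y = mc_propagate(xs, lambda x: x * x)
```

The sampled variance picks up the small nonlinear contribution that the linearized propagation misses, without any derivative of the transformation being formed.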

7 Minimal Detectable Outliers

In his pioneering work, Baarda [7] introduced outlier tests to geodesy. He assumed the variance factor, i.e. the variance of unit weight, of the linear model to be known. Thus, the test statistic of his data snooping follows as the normalized residual, which has the normal distribution. Pope [64] started from an estimated variance factor and derived as test statistic the studentized residual with the τ-distribution. Koch et al. [39], [40, p. 302] and [55] showed that the tests for multiple outliers can be derived from the mean-shift model, cf. [6, p. 21], [9]. Applied to searching for one outlier, it leads to data snooping or the τ-test. Different choices of alternative hypotheses for outlier tests were discussed by [57]. Baarda [7] also introduced the concept of reliability. He looked at the alternative hypothesis of his test, that the outlier is not equal to zero, in contrast to the null hypothesis, that the outlier equals zero. The non-central χ²-distribution follows for the alternative hypothesis of his test. He determined the non-centrality parameter by setting the power of the test to a fixed value, generally 0.80, which means that the probability of rejecting a false null hypothesis is 0.80. In the case of one outlier, the expression for the non-centrality parameter can be solved for the detectable outlier. Thus, a measure of reliability is obtained.


This measure was called the marginally detectable outlier by [54] and is now named the minimal detectable bias (MDB), cf. [75, p. 102]. The MDB is a scalar for one outlier. It is generalized for multiple outliers to a unit vector with as many components as there are outliers to be searched for. The vector indicates the positions of the outliers within the observations, and it is multiplied by a scalar with the same expression as the MDB for one outlier, cf. [38, 76]. The MDB is defined for all combinations of the observations with respect to the number of outliers. Bounds for the MDB were computed by solving eigenvalue problems, thus obtaining measures of reliability, cf. [75, p. 105], [38]. The non-centrality parameter of the χ²-distribution, on which the MDB depends, can be expressed by a quadratic form and therefore by a hyperellipsoid. Its axes result from the eigenvalues [74]. The MDB is also a function of the size of the test, i.e. the level of significance, which for practical applications is set to 0.001, cf. [73]. Measures of reliability for multiple outliers are derived here. As is known, multiple outlier testing is difficult to apply successfully. Furthermore, if a certain number of outliers is searched for and all combinations of this number within the observations are tested, different outliers may produce identical residuals and therefore identical test statistics [8]. This does not cause a problem here, as the magnitudes of the detectable outliers are important; their distribution among the observations is less of a concern. Estimated variance factors are assumed, so the test statistics are F-distributed. In addition, the size of the test is not fixed to 0.001 but depends on the number of observations, for the following reason. If one outlier is searched for in the observations, all observations have to be tested. To keep the size of the overall test constant, for instance at 0.05, the size of the test of the individual observation depends on the number of measurements. This has been shown by [64]. Instead of computing the size of the test for the individual observation, [56] applied Monte Carlo methods to compute percentage points for maximal normalized and studentized residuals. In the case of multiple outliers, all combinations of the observations with respect to the number of outliers have to be tested. Like the non-centrality parameter of the χ²-distribution, the non-centrality parameter of the F-distribution can be represented by a quadratic form. The eigenvector of the quadratic form which belongs to the smallest eigenvalue leads to the detectable outliers. The minimum of these values among all combinations gives the MDBs, the minimal detectable outliers. Instead of generalizing the MDB for one outlier, it is derived here for multiple outliers.

7.1 Multiple Outlier Testing

Let the linear model, cf. [40, p. 153], [44, p. 85],

Xβ = E(y) = y + e  with E(e) = 0  and  D(y) = D(e) = σ²P⁻¹  (57)


be given, where X denotes the n × u matrix of known coefficients with rank X = u, β the u × 1 vector of unknown parameters, y the n × 1 vector of observations, e the n × 1 vector of errors, σ² the unknown variance factor and P the n × n positive definite weight matrix of the observations. The unknown parameters β are computed by the least squares estimate β̂ with

β̂ = (X′PX)⁻¹X′Py,  (58)

the vector ê of the residuals follows with

ê = −(I − X(X′PX)⁻¹X′P)y,  (59)

its covariance matrix D(ê) with

D(ê) = σ²Q_e = σ²(P⁻¹ − X(X′PX)⁻¹X′)  (60)

and the quadratic form Ω of the residuals with

Ω = y′(P − PX(X′PX)⁻¹X′P)y.  (61)

The mean-shift model results from augmenting (57) by the r × 1 vector ε of unknown outliers and the n × r matrix Z of known coefficients,

|X, Z| |β′, ε′|′ = E(y)  with D(y) = σ_ε²P⁻¹  (62)

where σ_ε² denotes the unknown variance factor of the mean-shift model, cf. [40, p. 208] for a linear model with a matrix X of coefficients not of full rank. Let us assume that a test for r outliers has to be derived. The number r can be arbitrarily chosen but should reflect the number of outliers which are expected in the observations. The outliers are ordered in a sequence starting in y = (y_i) at the observation y_k and are called Δy_k, Δy_{k+1}, . . . , Δy_{k+r−1}. They are subtracted from the observations y_k, y_{k+1}, . . . , y_{k+r−1}. Thus,

ε = |Δy_k, Δy_{k+1}, . . . , Δy_{k+r−1}|′  and  Z = |e_k, e_{k+1}, . . . , e_{k+r−1}|  (63)

with

e_l = |0, . . . , 0, 1, 0, . . . , 0|′  (64)

where the lth component with l ∈ {k, k + 1, . . . , k + r − 1} gets the value 1. The estimate ε̂ of ε follows by

ε̂ = SZ′PQ_ePy = −SZ′Pê  (65)

with

S = (Z′PQ_ePZ)⁻¹.  (66)

The matrix Q_eP is called the redundancy matrix, cf. [40, p. 305], [31], because the ith diagonal element of Q_eP gives the contribution of the observation y_i to the redundancy n − u:

tr(Q_eP) = tr(I − X(X′PX)⁻¹X′P) = n − u.  (67)

The multiple outlier test is defined by the point null hypothesis H0, that all outliers ε are equal to zero, against the alternative hypothesis H1, that outliers ε̄ are present:

H0: ε = 0  versus  H1: ε = ε̄ ≠ 0.  (68)

This hypothesis is introduced into the linear model (62) as a special case of a general linear hypothesis. It gives the test statistic T by, cf. [40, p. 302],

T = (R/r) / ((Ω − R)/(n − u − r))  (69)

with

R = ε̂′S⁻¹ε̂ = ê′PZSZ′Pê.  (70)

In case the null hypothesis is true, T has the F-distribution

T ∼ F(r, n − u − r).  (71)

The null hypothesis is rejected if

T > F_{1−α;r,n−u−r}  (72)

where α denotes the size of the test and F_{1−α;r,n−u−r} the upper α-percentage point of the F-distribution. It is defined by

∫_0^{F_{1−α;r,n−u−r}} F(r, n − u − r) dT = 1 − α  (73)

where F(r, n − u − r) now denotes the probability density function of the F-distribution. Thus,

∫_{F_{1−α;r,n−u−r}}^{∞} F(r, n − u − r) dT = α  (74)

gives the probability α of the Type I error. In case the alternative hypothesis is true, T has the non-central F-distribution

T ∼ F′(r, n − u − r, λ)  (75)

with the non-centrality parameter λ

λ = ε̄′S⁻¹ε̄/σ_ε²  (76)

and

σ_ε² = (Ω − R)/(n − u − r).  (77)

The probability of the Type II error of the test is computed by

P(Type II error) = ∫_0^{F_{1−α;r,n−u−r}} F′(r, n − u − r, λ) dT = 1 − β  (78)

where F′(r, n − u − r, λ) now denotes the density function of the non-central F-distribution. The probability of rejecting a false null hypothesis is given by the power β of the test. The test by (69) is a uniformly most powerful invariant (UMPI) test, cf. [5, p. 108], [37, p. 37]. It maximizes the power within the class of unbiased, invariant tests. If r outliers are searched for in n observations, the number q of test statistics to be computed is given by the binomial coefficient

q = C(n, r).  (79)

Let α be the size of the test of the q combinations. The probability of the Type I error of this test then results from

P(T_1 > F_{1−α;r,n−u−r} ∪ T_2 > F_{1−α;r,n−u−r} ∪ . . . ∪ T_q > F_{1−α;r,n−u−r}) = α.  (80)

By applying the Bonferroni inequality, cf. [5, p. 195], we get

P(T_1 > F_{1−α;r,n−u−r}) + P(T_2 > F_{1−α;r,n−u−r}) + . . . + P(T_q > F_{1−α;r,n−u−r}) ≥ α  (81)


because of the relation which holds for l ≠ m and l, m ∈ {1, 2, . . . , q}, cf. [44, p. 7],

P(T_l > F_{1−α;r,n−u−r} ∪ T_m > F_{1−α;r,n−u−r}) = P(T_l > F_{1−α;r,n−u−r}) + P(T_m > F_{1−α;r,n−u−r}) − P(T_l > F_{1−α;r,n−u−r} ∩ T_m > F_{1−α;r,n−u−r}),

which is applied recursively to (80), and where the probabilities of the intersections of the sets, which are small, are neglected. From (81) it therefore follows that qP(T_i > F_{1−α;r,n−u−r}) ≥ α with i ∈ {1, . . . , q}, and approximately

P(T_i < F_{1−α;r,n−u−r}) = 1 − α/q.  (82)

Thus, the ith of the q combinations has to be tested with the size α/q to obtain α as the size of the test for all q combinations. For such a test, the upper percentage point of the F-distribution in (73) needs to be replaced by F_{1−α/q;r,n−u−r}, so that instead of (74) the Type I error follows from

∫_{F_{1−α/q;r,n−u−r}}^{∞} F(r, n − u − r) dT = α/q.  (83)

7.2 Detection of Outliers

The ability of a parameter estimation to detect outliers has been called reliability by [7]. He started from the alternative hypothesis of his outlier test and determined the non-centrality parameter of the non-central χ²-distribution by assuming a fixed value for the power of the test; 0.80 is generally taken. The alternative hypothesis H1 with the non-central F-distribution is used here. Furthermore, one of the q combinations is tested with the size α/q, so that the upper percentage point F_{1−α/q;r,n−u−r} in (83) is applied, which follows from (73) by

∫_0^{F_{1−α/q;r,n−u−r}} F(r, n − u − r) dT = 1 − α/q.  (84)

The non-centrality parameter λ in (76) is determined by λ₀ with the power β of the test fixed to β = 0.80:

λ₀ = λ(1 − α/q; r, n − u − r, β = 0.80).  (85)

The power β follows from (78) by

∫_0^{F_{1−α/q;r,n−u−r}} F′(r, n − u − r, λ₀) dT = 1 − β  (86)

so that λ₀ is computed from the (1 − β)-percentage point, say F′_{1−β;r,n−u−r,λ₀}, of the non-central F-distribution. The integral in (84) and the integral in (86) with λ₀ being known are taken over the identical domain [0, F_{1−α/q;r,n−u−r}], therefore

F′_{1−β;r,n−u−r,λ₀} = F_{1−α/q;r,n−u−r}.  (87)

The cumulative distribution functions of the F-distribution and of the non-central F-distribution are computed by the series expansion of the incomplete beta function, cf. [40, p. 116]. To obtain the percentage points, approximate values are chosen and then iteratively corrected in nested intervals [19], also called binary search [69]. We now look at the q combinations with respect to the r outliers. Let ε̄_j with j ∈ {1, . . . , q} be one r × 1 vector of outliers. The non-centrality parameter λ₀ can then be expressed with (66), (68) and (76) by the quadratic form

λ₀ = ε̄′_j Z′PQ_ePZ ε̄_j / σ_ε².  (88)

It shall be minimized subject to the condition ε̄′_j ε̄_j = 1, cf. [27, p. 267],

ε̄′_j Z′PQ_ePZ ε̄_j / σ_ε² → min  for  ε̄′_j ε̄_j = 1.  (89)

Let the eigenvector ε̄_jm belonging to the smallest eigenvalue of the symmetric matrix Z′PQ_ePZ give the minimum

ε̄′_jm Z′PQ_ePZ ε̄_jm / σ_ε² = min  for  ε̄′_jm ε̄_jm = 1.  (90)

The smallest value of ε̄′_jm Z′PQ_ePZ ε̄_jm is then determined among all q combinations. To find it, we introduce the factor f, since the quadratic form (90), scaled by f, has to be equal to λ₀ from (88):

λ₀ = f ε̄′_jm Z′PQ_ePZ ε̄_jm f / σ_ε².  (91)

Solving for f, we obtain the expression which has to be minimized:

f = √( λ₀σ_ε² / (ε̄′_jm Z′PQ_ePZ ε̄_jm) ).  (92)


Let the minimum be attained with ε̄_lm; we then find the MDBs, the minimal detectable biases or the minimal detectable outliers, by, cf. [38, 76],

MDB = f ε̄_lm = √( λ₀σ_ε² / (ε̄′_lm Z′PQ_ePZ ε̄_lm) ) ε̄_lm  for  ε̄′_lm ε̄_lm = 1.  (93)

They give a measure for the reliability of the parameter estimation. The matrix PQ_eP is therefore called the reliability matrix R of the observations,

R = PQ_eP = (r_ij),  (94)

although [65] denotes the redundancy matrix Q_eP in (67) as the reliability matrix. The r × r matrix Z′PQ_ePZ in (93) is found with Z from (63) by

Z′PQ_ePZ = | r_k,k          r_k,k+1        . . .  r_k,k+r−1      |
           | . . . . . . . . . . . . . . . . . . . . . . . . .  |
           | r_k+r−1,k      r_k+r−1,k+1    . . .  r_k+r−1,k+r−1 |.  (95)

The MDB for r = 1 follows from (93) with

MDB = √( λ₀σ_ε² / r_mm ),  (96)

where r_mm denotes the maximal diagonal element of the reliability matrix PQ_eP. As shown, outliers are searched for in all combinations of the observations with respect to the number of outliers. To keep the size of the outlier test constant, the test of the individual observations has to depend on the number of combinations. The power of the test is set to 0.80. The non-centrality parameter λ₀ is computed as the (1 − 0.80)-percentage point of the non-central F-distribution. It is expressed by a quadratic form, which is minimized by computing the eigenvalues and eigenvectors. The smallest minimum among all combinations gives the MDBs, the minimal detectable outliers. The MDBs are introduced as given outliers into the alternative hypothesis of the outlier test, so that the null hypothesis is false. A Monte Carlo method checks whether on average the false null hypotheses are rejected with a power of 0.80. If this is not the case, the MDBs have to be enlarged by a factor [47]. Outliers are better determined by a robust parameter estimation than by a multiple hypothesis test. Huber's robust M-estimation [33] is well known and has been frequently applied. Even better suited for the outlier search is the Expectation Maximization (EM) algorithm. It is an adaptive method which takes care of a few or of many outliers. Monte Carlo simulations check the results of these methods [51, 62].
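For r = 1, formula (96) can be sketched in the simplest possible configuration; the model of n repeated observations of one parameter with P = I and the placeholder value for λ₀ are illustrative assumptions (in practice λ₀ follows from (85)):

```python
import math

def mdb_single_outlier(n, sigma2_eps, lam0):
    """MDB for one outlier, eq. (96), in the simplest model: n repeated
    observations of a single parameter with P = I. There Qe*P has the
    constant diagonal (n - 1)/n, which is also the maximal element r_mm."""
    r_mm = (n - 1) / n
    return math.sqrt(lam0 * sigma2_eps / r_mm)

# lam0 is the non-centrality parameter from (85) for the chosen test size
# and power 0.80; the value 17.0 is only a hypothetical placeholder here
mdb = mdb_single_outlier(10, 1.0, 17.0)
```

The formula makes the dependence on the redundancy visible: the smaller the diagonal element r_mm, i.e. the less an observation contributes to the redundancy, the larger the outlier must be to be detectable.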

8 Summary

The generalized Bayes' theorem leads to the posterior density function for the vector of unknown parameters to be determined from measurements. Monte Carlo methods rely on generating random numbers for the vector of unknown parameters from the posterior density function. This can be accomplished by quite a number of methods. Presented are the inversion method, Monte Carlo integration, the sampling-importance-resampling (SIR) algorithm and the Markov Chain Monte Carlo methods. The Gibbs sampler plays an important role among the latter. It can be used for computing and propagating large covariance matrices, which appear when the gravity field of the earth is determined from satellite observations. From the samples for the vector of unknown parameters, the Monte Carlo estimate gives their covariance matrix and expectations for the point estimation. The Monte Carlo estimate also leads to the estimation of confidence regions and to hypothesis tests.

Acknowledgments The author is indebted to Willi Freeden for his invitation to this contribution for HbMG and to Jan Martin Brockmann for his valuable comments.

Literature

1. Acko, B., Godina, A.: Verification of the conventional measuring uncertainty evaluation model with Monte Carlo simulation. Int. J. Simul. Model. 4, 76–84 (2005)
2. Alkhatib, H., Kutterer, H.: Estimation of measurement uncertainty of kinematic TLS observation process by means of Monte-Carlo methods. J. Appl. Geodesy 7, 125–133 (2013)
3. Alkhatib, H., Neumann, I., Kutterer, H.: Uncertainty modeling of random and systematic errors by means of Monte Carlo and fuzzy techniques. J. Appl. Geodesy 3, 67–79 (2009)
4. Alkhatib, H., Schuh, W.D.: Integration of the Monte Carlo covariance estimation strategy into tailored solution procedures for large-scale least squares problems. J. Geodesy 81, 53–66 (2007)
5. Arnold, S.: The Theory of Linear Models and Multivariate Analysis. Wiley, New York (1981)
6. Baarda, W.: Statistical Concepts in Geodesy. Publications on Geodesy, vol. 2, Nr. 4. Netherlands Geodetic Commission, Delft (1967)
7. Baarda, W.: A Testing Procedure for Use in Geodetic Networks. Publications on Geodesy, vol. 2, Nr. 5. Netherlands Geodetic Commission, Delft (1968)
8. Baselga, S.: Nonexistence of rigorous tests for multiple outlier detection in least-squares adjustment. J. Surv. Eng. 137, 109–112 (2011)
9. Beckman, R., Cook, R.: Outlier. . . . . . . . . .s. Technometrics 25, 119–149 (1983)
10. Besag, J.: Spatial interaction and the statistical analysis of lattice systems. J. R. Stat. Soc. B 36, 192–236 (1974)
11. Box, G., Muller, M.: A note on the generation of random normal deviates. Ann. Math. Stat. 29, 610–611 (1958)
12. Cramér, H.: Mathematical Methods of Statistics. Princeton University Press, Princeton (1946)
13. Dagpunar, J.: Principles of Random Variate Generation. Clarendon Press, Oxford (1988)
14. Devroye, L.: Non-Uniform Random Variate Generation. Springer, Berlin (1986)
15. Dietrich, C.: Uncertainty, Calibration and Probability, 2nd edn. Taylor & Francis, Boca Raton (1991)
16. van Dorp, J., Kotz, S.: Generalized trapezoidal distributions. Metrika 58, 85–97 (2003)
17. Doucet, A., Godsill, S., Andrieu, C.: On sequential Monte Carlo sampling methods for Bayesian filtering. Stat. Comput. 10, 197–208 (2000)


18. Falk, M.: A simple approach to the generation of uniformly distributed random variables with prescribed correlations. Commun. Stat. Simul. 28, 785–791 (1999)
19. Gaida, W., Koch, K.R.: Solving the cumulative distribution function of the noncentral F-distribution for the noncentrality parameter. Sci. Bull. Stanislaw Staszic Univ. Min. Metall. Geodesy b.90(1024), 35–44 (1985)
20. Gelfand, A., Smith, A.: Sampling-based approaches to calculating marginal densities. J. Am. Stat. Assoc. 85, 398–409 (1990)
21. Gelman, A., Carlin, J., Stern, H., Rubin, D.: Bayesian Data Analysis, 2nd edn. Chapman and Hall, Boca Raton (2004)
22. Geman, D., Geman, S., Graffigne, C.: Locating texture and object boundaries. In: Devijver, P., Kittler, J. (eds.) Pattern Recognition Theory and Applications, pp. 165–177. Springer, Berlin (1987)
23. Geman, S., Geman, D.: Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-6, 721–741 (1984)
24. Geman, S., McClure, D.: Statistical methods for tomographic image reconstruction. Bull. Int. Stat. Inst. 52, 5–21 (1987)
25. Gentle, J.: Random Number Generation and Monte Carlo Methods, 2nd edn. Springer, Berlin (2003)
26. Gilks, W.: Full conditional distributions. In: Gilks, W., Richardson, S., Spiegelhalter, D. (eds.) Markov Chain Monte Carlo in Practice, pp. 75–88. Chapman and Hall, London (1996)
27. Golub, G., van Loan, C.: Matrix Computations. The Johns Hopkins University Press, Baltimore (1984)
28. Gordon, N., Salmond, D.: Bayesian state estimation for tracking and guidance using the bootstrap filter. J. Guid. Control. Dyn. 18, 1434–1443 (1995)
29. Gundlich, B., Koch, K.R., Kusche, J.: Gibbs sampler for computing and propagating large covariance matrices. J. Geod. 77, 514–528 (2003)
30. Gundlich, B., Kusche, J.: Monte Carlo integration for quasi-linear models. In: Xu, P., Liu, J., Dermanis, A. (eds.) VI Hotine-Marussi Symposium on Theoretical and Computational Geodesy, pp. 337–344. Springer, Berlin/Heidelberg (2008)
31. Guo, J.F., Ou, J.K., Yuan, Y.B.: Reliability analysis for a robust M-estimator. J. Surv. Eng. 137, 9–13 (2011)
32. Hennes, M.: Konkurrierende Genauigkeitsmaße – Potential und Schwächen aus der Sicht des Anwenders. Allgemeine Vermessungs-Nachrichten 114, 136–146 (2007)
33. Huber, P.: Robust estimation of a location parameter. Ann. Math. Stat. 35, 73–101 (1964)
34. ISO: Guide to the Expression of Uncertainty in Measurement. International Organization for Standardization, Geneva (1995)
35. JCGM: Evaluation of measurement data – Supplement 2 to the "Guide to the Expression of Uncertainty in Measurement" – Extension to any number of output quantities. JCGM 102:2011. Joint Committee for Guides in Metrology (2011). www.bipm.org/en/publications/guides
36. Kacker, R., Jones, A.: On use of Bayesian statistics to make the guide to the expression of uncertainty in measurement consistent. Metrologia 40, 235–248 (2003)
37. Kargoll, B.: On the Theory and Application of Model Misspecification Tests in Geodesy. Universität Bonn, Institut für Geodäsie und Geoinformation, Schriftenreihe 8, Bonn (2008)
38. Knight, N., Wang, J., Rizos, C.: Generalised measures of reliability for multiple outliers. J. Geod. 84, 625–635 (2010)
39. Koch, K.R.: Ausreißertests und Zuverlässigkeitsmaße. Vermessungswesen und Raumordnung 45, 400–411 (1983)
40. Koch, K.R.: Parameter Estimation and Hypothesis Testing in Linear Models, 2nd edn. Springer, Berlin (1999)
41. Koch, K.R.: Monte-Carlo-Simulation für Regularisierungsparameter. ZfV – Z. Geodäsie, Geoinformation und Landmanagement 127, 305–309 (2002)
42. Koch, K.R.: Determining the maximum degree of harmonic coefficients in geopotential models by Monte Carlo methods. Studia Geophysica et Geodaetica 49, 259–275 (2005)
43. Koch, K.R.: Gibbs sampler by sampling-importance-resampling. J. Geod. 81, 581–591 (2007)


44. Koch, K.R.: Introduction to Bayesian Statistics, 2nd edn. Springer, Berlin (2007)
45. Koch, K.R.: Determining uncertainties of correlated measurements by Monte Carlo simulations applied to laserscanning. J. Appl. Geod. 2, 139–147 (2008)
46. Koch, K.R.: Evaluation of uncertainties in measurements by Monte Carlo simulations with an application for laserscanning. J. Appl. Geod. 2, 67–77 (2008)
47. Koch, K.R.: Minimal detectable outliers as measures of reliability. J. Geod. 89, 483–490 (2015)
48. Koch, K.R.: Bayesian statistics and Monte Carlo methods. J. Geod. Sci. 8, 18–29 (2018)
49. Koch, K.R.: Monte Carlo methods. GEM Int. J. Geomath. 9(1), 117–143 (2018)
50. Koch, K.R., Brockmann, J.: Systematic effects in laser scanning and visualization by confidence regions. J. Appl. Geod. 10(4), 247–257 (2016)
51. Koch, K.R., Kargoll, B.: Outlier detection by the EM algorithm for laser scanning in rectangular and polar coordinate systems. J. Appl. Geod. 9, 162–173 (2015)
52. Koch, K.R., Kusche, J., Boxhammer, C., Gundlich, B.: Parallel Gibbs sampling for computing and propagating large covariance matrices. ZfV – Z. Geodäsie, Geoinformation und Landmanagement 129, 32–42 (2004)
53. Koch, K.R., Schmidt, M.: Deterministische und stochastische Signale. Dümmler, Bonn (1994). http://skylab.itg.uni-bonn.de/koch/00_textbooks/Determ_u_stock_Signale.pdf
54. Kok, J.: Statistical analysis of deformation problems using Baarda's testing procedures. In: "Forty Years of Thought". Anniversary Volume on the Occasion of Prof. Baarda's 65th Birthday, vol. 2, pp. 470–488. Delft (1982)
55. Kok, J.: On data snooping and multiple outlier testing. NOAA Technical Report NOS NGS 30. US Department of Commerce, National Geodetic Survey, Rockville (1984)
56. Lehmann, R.: Improved critical values for extreme normalized and studentized residuals in Gauss-Markov models. J. Geod. 86, 1137–1146 (2012)
57. Lehmann, R.: On the formulation of the alternative hypothesis for geodetic outlier detection. J. Geod. 87, 373–386 (2013)
58. Leonard, T., Hsu, J.: Bayesian Methods. Cambridge University Press, Cambridge (1999)
59. Liu, J.: Monte Carlo Strategies in Scientific Computing. Springer, Berlin (2001)
60. Marsaglia, G., Bray, T.: A convenient method for generating normal variables. SIAM Rev. 6, 260–264 (1964)
61. Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., Teller, E.: Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 1087–1092 (1953)
62. Nowel, K.: Application of Monte Carlo method to statistical testing in deformation analysis based on robust M-estimation. Surv. Rev. 48(348), 212–223 (2016)
63. O'Hagan, A.: Bayesian Inference. Kendall's Advanced Theory of Statistics, vol. 2B. Wiley, New York (1994)
64. Pope, A.: The statistics of residuals and the detection of outliers. NOAA Technical Report NOS65 NGS1. US Department of Commerce, National Geodetic Survey, Rockville (1976)
65. Proszynski, W.: Another approach to reliability measures for systems with correlated observations. J. Geod. 84, 547–556 (2010)
66. Roberts, G., Smith, A.: Simple conditions for the convergence of the Gibbs sampler and Metropolis-Hastings algorithms. Stoch. Process. Appl. 49, 207–216 (1994)
67. Rubin, D.: Using the SIR algorithm to simulate posterior distributions. In: Bernardo, J., DeGroot, M., Lindley, D., Smith, A. (eds.) Bayesian Statistics, vol. 3, pp. 395–402. Oxford University Press, Oxford (1988)
68. Rubinstein, R.: Simulation and the Monte Carlo Method. Wiley, New York (1981)
69. Schader, M., Schmid, F.: Distribution function and percentage points for the central and noncentral F-distribution. Stat. Pap. 27, 67–74 (1986)
70. Siebert, B., Sommer, K.D.: Weiterentwicklung des GUM und Monte-Carlo-Techniken. tm – Technisches Messen 71, 67–80 (2004)
71. Smith, A., Gelfand, A.: Bayesian statistics without tears: a sampling-resampling perspective. Am. Stat. 46, 84–88 (1992)
72. Smith, A., Roberts, G.: Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods. J. R. Stat. Soc. B 55, 3–23 (1993)


73. Staff of the Geodetic Computing Center: The Delft approach for the design and computation of geodetic networks. In: "Forty Years of Thought". Anniversary Volume on the Occasion of Prof. Baarda's 65th Birthday, vol. 1, pp. 202–274. Delft (1982)
74. Teunissen, P.: Adjusting and testing with the models of the affine and similarity transformation. Manuscr. Geodaet. 11, 214–225 (1986)
75. Teunissen, P.: Testing Theory: An Introduction. MGP, Department of Mathematical Geodesy and Positioning, Delft University of Technology, Delft (2000)
76. Teunissen, P., de Bakker, P.: Single-receiver single-channel multi-frequency GNSS integrity: outliers, slips, and ionospheric disturbances. J. Geod. 87, 161–177 (2013)
77. Wilks, S.: Mathematical Statistics. Wiley, New York (1962)
78. Xu, P.: Random simulation and GPS decorrelation. J. Geod. 75, 408–423 (2001)

9 Parameter Estimation, Variance Components and Statistical Analysis in Errors-in-Variables Models

Peiliang Xu

Contents

1 Introduction
2 The Effects of Errors-in-Variables on the Weighted LS Estimate of Parameters
3 The Effects of Errors-in-Variables on Variance Component Estimation
4 TLS Estimation of Parameters in EIV Models
5 Error Analysis of Errors-in-Variables Models
6 Variance Components in Errors-in-Variables Models
Literature

Abstract
This chapter discusses statistical and numerical aspects of constrained and unconstrained errors-in-variables (EIV) models. The parameters in an EIV model can often be estimated by using three categories of methods: the conventional weighted least squares (LS) method, normed orthogonal regression, and the weighted total least squares (TLS) method. The conventional weighted LS method offers a significant computational advantage but is not statistically rigorous. We systematically investigate the effects of random errors in the design matrix on the weighted LS estimates of parameters and variance components, construct the N-calibrated almost unbiased weighted LS estimator of parameters and derive almost unbiased estimates for the variance of unit weight. Although orthogonal regression can be used to estimate the parameters in an EIV model, it is not statistically optimal either. The weighted TLS method is the statistically most rigorous and optimal way to estimate the parameters in an EIV model, at the cost of substantially increased computation. We reformulate an EIV model as a nonlinear adjustment model without constraints and investigate the statistical effects of nonlinearity on the nonlinear TLS estimate, including the first order approximation of accuracy, the nonlinear confidence region and the bias of the nonlinear TLS estimate. Closed form solutions to coordinate transformation are presented as well. Finally, we prove that variance components in an EIV model with the simplest stochastic structure are not estimable.

This chapter is part of the series Handbuch der Geodäsie, volume "Mathematische Geodäsie/Mathematical Geodesy", edited by Willi Freeden, Kaiserslautern.

P. Xu, Disaster Prevention Research Institute, Kyoto University, Uji, Kyoto, Japan. E-Mail: [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020. W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_99

Zusammenfassung
This contribution discusses statistical and numerical aspects of errors-in-variables (EIV) models with or without constraints. The parameters in an EIV model can often be estimated using three methodological categories: the conventional weighted LS method, normed orthogonal regression and the weighted TLS method. The present work is concerned with the reformulation of an EIV model as a nonlinear adjustment model without constraints.

Keywords Errors-in-variables · Estimability of variance components · Nonlinear adjustment · Orthogonal regression · Total least squares · Variance components · Weighted least squares

1 Introduction

Parameter adjustment starts with the following linear model:

y = Aβ + ε,   E(y) = Aβ,   D(y) = Σ_y = W⁻¹σ²,   (1)

where E(·) stands for the operator of mathematical expectation and D(·) for the variance-covariance matrix, y is a vector of measurements, A is an (n × t) deterministic coefficient matrix with rank r (≤ t), β is a t-dimensional unknown vector of parameters, ε is the observation error vector of y with the positive definite weight matrix W and zero mean, and σ² is the unknown variance of unit weight.

9 Parameter Estimation, Variance Components and Statistical. . .

479

If the linear model (1) is of full rank, namely, r = t, then the weighted least squares (LS) principle leads to a unique solution, which is the best linear unbiased estimator of β; otherwise, the weighted LS estimator of β is not unique. A unique solution of β to the linear model (1) without full rank can be derived by using a number of different methods, which include: (i) generalized inverses of the normal matrix (see e.g., [11, 69, 81]); (ii) the Helmert approach without using the concept of generalized inverses of matrices (see e.g., [94, 95]); and (iii) a set of linear (minimal and/or inner) constraints (see e.g., [27, 46, 48, 62, 63, 66, 71, 75, 98, 110]). After a unique solution is obtained by using a particular set of minimal and/or inner constraints, one may then apply Baarda's S-transformation to transform the solution under one set of minimal constraints to that under another set of minimal constraints [27, 86]. Koch [45] proved that the S-transformation is actually equivalent to the projection to obtain estimable functions or invariants (see also [37, 38]). Xu [97, 98] further extended the results of linear models without full rank to nonlinear models without full rank (see also [28, 47]). In practice, however, some or all of the elements of the design matrix A are not given deterministically but measured or derived from measurements with random errors. Models with a random design matrix A have been well known as errors-in-variables (EIV) models (see e.g., [17, 32]). If all the elements of A are measured with random errors, the EIV model can be written as follows:

y = (A − E_A)β + ε,   (2)

(see, e.g., [32, 82, 91]), where y and ε have been defined as in (1), A is a measurement matrix, and E_A is the random error matrix of A. The stochastic model of E_A is assumed to be E(E_A) = 0 and D(ε_A) = W_a⁻¹σ_a², where ε_A stands for the vector obtained by sequentially stacking the columns of the matrix E_A underneath each other [57]. The EIV model (2) can also be equivalently rewritten mathematically as the following nonlinear model:

y = Āβ + ε,   (3a)
A = Ā + E_A,   (3b)

where Ā stands for the true matrix of A. If some of the elements of A are not random or are functionally dependent, then the corresponding EIV model can be represented as a partial EIV nonlinear model [107]:

y = (βᵀ ⊗ I_n)(h + Bā) + ε,   (4a)
a = ā + ε_a,   (4b)


where h is a deterministic constant vector whose elements correspond to the non-random elements of A, and B is a given (deterministic) matrix, depending on the number (say t_a) of functionally independent random elements of A. The t_a-dimensional vector a collects all the independent random elements of A, the true values of a are denoted by ā, and ε_a is assumed to be of zero mean. In a similar manner, we can assume σ_a² = σ² and write the stochastic model for a, whose weight matrix is denoted by ω, as follows:

D(ε_a) = ω⁻¹σ².   (4c)

In addition, we also assume that ε and ε_a are stochastically independent in the remainder of this paper, namely,

cov(ε, ε_a) = 0.   (4d)

We should note, however, that the zero correlation of (4d) between ε and ε_a can be readily removed (if necessary). In other words, the development of nonlinear TLS adjustment to be discussed later can easily be extended to cover the case of non-zero correlation between ε and ε_a.

Example 1. Given a geodetic network of stations on the surface of the Earth, we assume that these stations are measured with conventional geodetic measuring instruments, resulting in a set of coordinates x_c for these stations, with the subscript c standing for conventional. We then further assume that these stations are also measured with modern space geodetic techniques such as global navigation satellite systems, very long baseline interferometry and/or satellite laser ranging, which produce another set of coordinates x_s for the same stations, with the subscript s standing for space. It is well known that if these stations are not subject to deformation, the configuration of the stations should remain rigid and the coordinates of each station satisfy the following transformation:

x_s^i = t + sR{x_c^i − ε_c^i} + ε_s^i,   (5)

(see e.g., [31, 49, 53, 55, 96]), where the superscript i stands for the ith station, and ε_s^i and ε_c^i are the random errors of x_s^i and x_c^i, respectively. Both ε_s^i and ε_c^i are assumed to be of zero mean with variance-covariance matrices Σ_s and Σ_c, respectively. The vector t is an unknown shift vector, s an unknown scaling and R an unknown rotation matrix.

Example 2. Time series have been successfully used to model a variety of engineering problems and can generally be written symbolically as follows:

y_t + ∑_{i=1}^{p} ϕ_i y_{t−i} = f(a_t, β) + ∑_{i=1}^{q} α_i ε_{t−i} + ε_t,   (6)

where y_i (i = 1, 2, . . . , n) are given, ε_i (i = 1, 2, . . . , n) are a sequence of white noise with mean zero and variance σ², ϕ_i (i = 1, 2, . . . , p), α_i (i = 1, 2, . . . , q) and β are the unknown parameters, f(·) is a linear or nonlinear trend function, and a_t is a known vector. The model (6) is called a non-stationary autoregressive moving average time series model. If the trend term f(a_t, β) does not show up in (6), the simplified model is best known as an ARMA(p, q) model. In particular, the model (6) is called a non-stationary autoregressive time series model in the case of q = 0 and a non-stationary moving average time series model in the case of p = 0. In practical applications of the time series model (6), y_i (i = 1, 2, . . . , n) and a_t are often measurements contaminated with random errors. For more details about time series, the reader is referred to Fuller [33]. As a special case of (6), if p = q = 0, then the model (6) becomes

y_t = f(a_t, β) + ε_t,   (7)

which is a nonlinear EIV model if a_t is measured with random errors. The point correspondence or transformation relationship (5) without the scaling factor s also plays an essential role in estimating three-dimensional rigid motion parameters from a sequence of images (see e.g., [8, 41, 43]). Many more examples of EIV models can be found in other subjects of science and engineering, e.g., systems, control and signal processing (see e.g., [91]), physics and astronomy (see e.g., [15, 34]) and earth sciences (see e.g., [13, 18, 90, 101]). Recently, Shi and Xu [84] proved that even if A of an EIV model is not of full rank, namely, some of the parameters β and/or their combinations are theoretically redundant, if the random elements of A satisfy certain conditions, the rank deficiency of such an EIV model cannot be identified almost surely from the measurements A and y. In other words, the random errors of A and y would turn a theoretically rank-defect EIV model into a practically estimable model, unless the rank deficiency is due to overparametrization. In what follows, we will focus on practically soluble EIV models with a unique solution. For more details on EIV models without full rank, the reader is referred to Shi and Xu [84]. Although EIV models were fully formulated and the LS principle was then applied to solve them more than one century ago (see e.g., [1, 2, 21, 23, 25, 50, 67]), they have been revitalized since the publication by Golub and van Loan [36], who called the solution approach to the EIV model (2) total least squares (TLS). TLS has since attracted widespread attention (see e.g., [17, 32, 60, 91]) and has been applied to many science and engineering problems. Given a number of points (x_i, y_i) in


the two-dimensional space or (x_i, y_i, z_i) in the three-dimensional space, with x_i, y_i and z_i being measured with random errors, Adcock [1, 2] first applied the ordinary LS method to the problems of line and/or surface fitting of the points by taking the random errors of x_i, y_i and z_i into account. Pearson [67] proposed minimizing the sum of squared Euclidean distances from the measured points to an unknown hyperplane to be optimally determined and proved that the solution is defined by the eigenvector of the smallest eigenvalue. Essentially speaking, the fitting criterion by Pearson [67] is also the ordinary LS method, as used by Adcock [1, 2] and Coolidge [21]. Deming [23–25] and Gerhold [34] extended the ordinary LS method of Adcock [1, 2], Pearson [67] and Coolidge [21] to the case where the measurements, namely the elements of A and y, are assumed to be stochastically independent but of different accuracy or weighting factors for the EIV model (2). Further extensions to the case of correlated measurements can be found in Markovsky and van Huffel [59], Schaffrin and Wieser [77], Fang [29] and Xu et al. [107]. Recently, Xu and Liu [104, 108] studied variance component estimation in an EIV model and concluded that, under a certain condition, the variance components of both the measurements y and the random coefficient matrix A are not estimable. Golub and van Loan [36] proposed minimizing the Frobenius norm of the residuals to both A and y, which are re-scaled with two positive diagonal matrices. They also proved that the TLS solution to the EIV model (2) is the eigenvector of the smallest eigenvalue. Although Golub and van Loan [36] called their solution method TLS, the method has nothing to do with the ordinary or weighted LS principle from the statistical point of view, except for some very special cases [107].
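Pearson's eigenvector characterization can be sketched in a few lines. This is a minimal numpy illustration of ordinary (unweighted) orthogonal regression for a 2D line; the function name is chosen for illustration:

```python
import numpy as np

def orthogonal_line_fit(points):
    """Fit a line minimizing the sum of squared orthogonal (Euclidean)
    distances from the points (Pearson's criterion).  The line passes
    through the centroid; its unit normal is the eigenvector belonging
    to the smallest eigenvalue of the centered scatter matrix."""
    P = np.asarray(points, dtype=float)
    centroid = P.mean(axis=0)
    Q = P - centroid
    # np.linalg.eigh returns the eigenvalues of a symmetric matrix in
    # ascending order, so column 0 belongs to the smallest eigenvalue
    _, vecs = np.linalg.eigh(Q.T @ Q)
    return centroid, vecs[:, 0]
```

For noise-free points on the line y = 2x + 1, the returned normal is orthogonal to the direction (1, 2), as Pearson's criterion demands.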
From this point of view, one may more properly say that the method presented by Golub and van Loan [36] solely emphasizes the fitting of data from the point of view of approximation theory. Actually, before the terminology of total least squares was coined by Golub and van Loan [36], the solution approach to EIV models in the weighted LS sense was called general least squares by Deming [23–25] and complete least squares by Gerhold [34], respectively. Almost all works on TLS focus on methods and algorithms for numerically finding the TLS solution and their applications, as can be seen in the literature of, for example, mathematics/statistics (see e.g., [1, 2, 21, 25, 34, 36, 67]), systems, control and signal processing (see e.g., [60, 91]), and industrial surveying and geodetic coordinate transformation (see e.g., [3, 30, 31, 64, 76–78, 83, 85, 107]), to name only a few areas. However, compared with the great number of publications on TLS algorithms and applications, statistical aspects of a (weighted or ordinary) TLS estimate have not received due attention, in particular in the case of finite samples. The first order approximate variance-covariance matrix of the TLS estimate was given by Gerhold [34]. The TLS estimate is proved to be asymptotically unbiased with probability one (or weakly consistent) as the number of data y tends to infinity (see e.g., [35, 60, 91]). Unlike Gerhold [34], Schaffrin [76] started with the rigorous TLS solution and derived an approximate accuracy for the estimated model parameters and the Lagrange multipliers by propagating the errors of y (see also [79]). TLS solutions have also been extended to the case with constraints (see e.g.,


[23, 24, 76, 78]). Recently, Xu et al. [107] reformulated the partial EIV model as a nonlinear parameter adjustment model and derived the biases of the TLS estimate due to the model nonlinearity and, further, a first order accuracy for the weighted TLS estimate of the parameters. The weighted TLS method under the statistical framework is mathematically the most rigorous for parameter estimation in EIV models, but at the cost of having to estimate the additional parameters Ā. For a general EIV model with an (n × t) random matrix, this additional number of unknowns is exactly equal to (n × t), not to mention that the EIV model is essentially nonlinear. In the era of large data, let us say n = 1,000,000 and t = 10,000; then the total number of unknowns to be estimated in such an EIV model is more than 1.0 × 10¹⁰. For problems with this large a number of unknown parameters, the computational burden can be very demanding (if not completely prohibitive). In the case of coordinate transformation, let us assume a regional and/or continental network with, say, 10,000 stations. In a two-dimensional (2D) transformation, if the naive weighted LS method is used, then we actually only need to estimate four 2D transformation parameters. However, if the rigorous weighted TLS method is applied, then we have to estimate 40,004 unknown parameters, consisting of the 40,000 coordinate unknowns in the random design matrix A and the four 2D transformation parameters. The simplest approximate approach to estimating the unknown vector β from both the observed data y and A of the EIV model (2) is to treat A as if it were not random and then directly apply the weighted LS method, as mentioned by Kummell [50] more than one century ago, even though this approximate approach was already recognized by Kummell [50] to be not statistically optimal in the LS sense. Nevertheless, the weighted LS estimate of the parameters β in the EIV model (2) is much more effective computationally.
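The 2D transformation counting above can be made concrete with a sketch of the naive weighted LS approach (unit weights assumed), which treats the source coordinates in the design matrix as error-free and estimates only the four parameters (t_x, t_y, a, b), where sR = [[a, −b], [b, a]]; the function name and parametrization are illustrative:

```python
import numpy as np

def similarity_2d_ls(xc, xs):
    """Naive LS estimate of the 2D similarity transformation x_s = t + sR x_c,
    ignoring the random errors of the source coordinates x_c.  Each matched
    point contributes the two linear equations
        x_s = t_x + a*x_c - b*y_c,
        y_s = t_y + b*x_c + a*y_c.
    xc, xs: (m, 2) arrays of matched points; returns (t_x, t_y, a, b)."""
    xc = np.asarray(xc, dtype=float)
    xs = np.asarray(xs, dtype=float)
    m = len(xc)
    A = np.zeros((2 * m, 4))
    A[0::2, 0] = 1.0                       # t_x in the x-equations
    A[1::2, 1] = 1.0                       # t_y in the y-equations
    A[0::2, 2] = xc[:, 0]
    A[0::2, 3] = -xc[:, 1]
    A[1::2, 2] = xc[:, 1]
    A[1::2, 3] = xc[:, 0]
    y = xs.reshape(-1)                     # interleaved (x_1, y_1, x_2, y_2, ...)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta
```

However many stations enter, only these four unknowns are solved for; the rigorous weighted TLS treatment would add two coordinate unknowns per station on top of them.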
For some EIV problems, the TLS and weighted LS methods have been shown to produce negligible differences in the estimated parameters, as in the case of geodetic coordinate transformation. In order to understand under what conditions we can safely use the cheap and easy weighted LS method, it is interesting and important to investigate the effect of the random design matrix A on the weighted LS-based solutions of quantities of interest, theoretically and practically. Hodges and Moore [39] derived the bias formulae for the weighted LS estimate of β and the estimated variance of unit weight. Davies and Hutton [22] focused on the asymptotic aspects of the same estimators. These two publications are based on three simple assumptions, i.e., (i) the elements of y are statistically independent and of identical accuracy; (ii) the elements of A are statistically independent and E(E_Aᵀ E_A) = Σ_ee = diag(σ_e1², σ_e2², . . . , σ_et²), where diag(σ_e1², σ_e2², . . . , σ_et²) stands for a diagonal matrix with positive elements σ_ei²; and (iii) y and A are mutually statistically independent. Xu [101] investigated the effect of random errors of A on the accuracy of the inverted stress tensor, since measured fault-slip data in the design matrix A had been treated in geophysical inversion as if these measured directional data were free of random errors. Recently, Xu et al. [108] set out to systematically investigate the effects of EIV on all the quantities of interest,


including the model parameters, the variance-covariance matrix of the estimated parameters, the variance of unit weight and the corrections of the measurements y. Xu [102] further systematically investigated the effect of EIV on variance component estimation. An N-calibrated weighted LS estimate of β was also developed and a corrected variance-covariance formula for the weighted LS estimate of β was provided as well by Xu [102]. This chapter is organized as follows. In Sect. 2, we will discuss the effects of EIV on the estimated quantities of geodetic interest, namely, the model parameters, the variance-covariance matrix of the estimated parameters and the variance of unit weight. The theoretical biases of the weighted LS estimate of parameters show that the effect of the random matrix on adjustment depends on the design matrix itself, the variance-covariance matrix of its elements and the model parameters. By removing the effect of the random matrix from the weighted LS estimate and the estimate of the variance of unit weight, we can accordingly obtain the bias-corrected weighted LS estimate of parameters and the bias-corrected estimates for the variance of unit weight for the EIV model, respectively, as in the case of multiplicative noise models [106]. In Sect. 3, we will further investigate the effects of EIV on variance component estimation of MINQUE type by treating the design matrix as if it were error-free and derive the first order bias of this variance component estimate. In Sect. 4, we will introduce three basic TLS methods to estimate the parameters in EIV models: the ordinary TLS method or orthogonal regression by Adcock [1, 2], Pearson [67] and Coolidge [21], the Frobenius-norm-based SVD method by Golub and van Loan [36] and the statistically rigorous weighted TLS method by Deming [23–25], Markovsky and van Huffel [59], Schaffrin and Wieser [77], Fang [29] and Xu et al. [107]. In Sect.
4.5, we will present some closed form solutions of coordinate transformation, which are largely based on the literature of computer vision, for use in geodesy. In Sect. 5, we analyze the statistical consequences of nonlinearity on the nonlinear TLS estimate, including the first order approximation of accuracy, the nonlinear confidence region and the bias of the nonlinear TLS estimate. Although TLS has been substantially investigated theoretically and widely applied in practice, little has been done to address the estimation of the errors-in-variables (EIV) stochastic model. Finally, in Sect. 6, we will prove that the variance components of EIV models under a certain condition are not estimable, indicating that we have no way of gaining any knowledge on such an EIV stochastic model, except by repeating measurements of the functionally independent random elements of A.

2 The Effects of Errors-in-Variables on the Weighted LS Estimate of Parameters

The weighted LS method is the most intuitive way to estimate the parameters in an EIV model. However, the random design matrix A is expected to result in various kinds of consequences. We will systematically discuss the effects of the random errors of the design matrix on the estimated quantities of geodetic interest, in particular,

9 Parameter Estimation, Variance Components and Statistical Analysis in Errors-in-Variables Models

the model parameters, the variance-covariance matrix of the estimated parameters and the variance of unit weight. More specifically, we will focus on the error and bias analysis of the estimated parameters and of the variance of unit weight. We finally construct the bias-corrected estimates for the parameters and the variance of unit weight, respectively. For more details on the effects of errors-in-variables, the reader is referred to Xu et al. [108] and Xu [102].

In this section, the stochastic model of E_A is assumed to be E(E_A) = 0 and D(ε_A) = Σ_A, where ε_A stands for the vector obtained by sequentially stacking the columns of the random error matrix E_A underneath each other [57]. We further assume that ε is stochastically independent of ε_A. Extension to the case of correlation between ε and ε_A is merely technical. If not all the elements of A are random and/or if some elements can be derived (linearly and/or nonlinearly) from other elements, then the corresponding rows and columns of Σ_A can simply be replaced with zero.

2.1 The Effect of EIV on the Weighted LS Estimate

If the random errors of A in the EIV model (3) are neglected, then the weighted LS estimate of β can readily be written as follows:

β̂ = (AᵀWA)⁻¹AᵀWy = N⁻¹AᵀWy,   (8)

where β̂ is the weighted LS estimate of β, which is a nonlinear vector function of E_A and ε, and

N = AᵀWA.   (9)

By expanding the weighted LS estimate β̂ into a Taylor series of E_A and ε and truncating it at the second order approximation, we have

β̂ = β + N̄⁻¹ĀᵀW(ε − E_A β) − N̄⁻¹E_AᵀWR̄E_A β + N̄⁻¹E_AᵀWε
      + N̄⁻¹ĀᵀWE_A N̄⁻¹ĀᵀWE_A β − N̄⁻¹E_AᵀWĀN̄⁻¹ĀᵀWε − N̄⁻¹ĀᵀWE_A N̄⁻¹ĀᵀWε,   (10)

where Ā denotes the true (error-free) value of the design matrix, i.e., A = Ā + E_A, and

N̄ = ĀᵀWĀ,
R̄ = I − ĀN̄⁻¹ĀᵀW.

The technical derivation of (10) can be found in Xu [102].


Because E_A is assumed to be stochastically independent of ε, the bias of β̂, defined as bias(β̂) = E(β̂) − β, can accordingly be computed as follows:

bias(β̂) = −N̄⁻¹Q_1β β + N̄⁻¹ĀᵀW Q_2β β,   (11a)

where

Q_1β = E{E_AᵀWR̄E_A} = [tr{WR̄ Σ^cc_ji}],   (11b)

and

Q_2β = E{E_A N̄⁻¹ĀᵀW E_A} = [tr{N̄⁻¹ĀᵀW Σ^cr_ji}],   (11c)

where Σ^cc_ji stands for the covariance matrix between the jth column vector E_A^cj and the ith column vector E_A^ci of E_A, namely, Σ^cc_ji = cov(E_A^cj, E_A^ci), and Σ^cr_ji for the covariance matrix between the jth column vector E_A^cj and the ith row vector E_A^ri of E_A, namely, Σ^cr_ji = cov(E_A^cj, E_A^ri), respectively.

It is obvious from (11) that the bias of the weighted LS estimate β̂ is solely due to the random errors of the coefficient matrix A. If there were no random errors in A, the matrices Σ^cc_ji and Σ^cr_ji would be equal to zero. As a result, Q_1β = 0, Q_2β = 0, and further bias(β̂) = 0, indicating that the weighted LS estimate β̂ is unbiased in this case. It is also obvious from (11) that bias(β̂) is proportional both to the parameters β and to the variance-covariance matrix Σ_A of A. The larger the magnitudes of the parameters β or Σ_A, the larger the bias bias(β̂) becomes. Conversely, bias(β̂) decreases with decreasing Σ_A; in other words, when the accuracy of the elements of A becomes higher, bias(β̂) becomes smaller.

As a simple example, let us assume W = I_n and

E(E_AᵀE_A) = nσ_a² diag(ρ_11, ρ_22, . . . , ρ_tt) = nσ_a² ρ,

where ρ = diag(ρ_11, ρ_22, . . . , ρ_tt). In addition, we further assume that

Σ^cc_ii = cov(E_A^ci, E_A^ci) = ρ_ii σ_a² I_n.   (12)

Thus it is easy to show that

Q_1β = (n − t)ρσ_a²,   (13a)

and

Q_2β = [tr{N̄⁻¹H_ij}], (i = 1, 2, . . . , n; j = 1, 2, . . . , t)   (13b)

where

H_ij = ρ_jj σ_a² H_A^ij,

with H_A^ij a zero (t × t) matrix, except for its jth column, which is equal to the transpose of the ith row vector of Ā. Substituting Q_1β and Q_2β of (13) into (11a), we obtain the bias of the weighted LS estimate β̂ in this special case as follows:

bias(β̂) = −(n − t)N̄⁻¹ρβσ_a² + N̄⁻¹ĀᵀQ_2β β.   (14)

Comparing (14) with formula (9) of Hodges and Moore [39], we can see that formula (9) of Hodges and Moore [39] corresponds to the first part of our bias formula (14), namely, −(n − t)N̄⁻¹ρβσ_a². The second term in (14) is missing altogether in Hodges and Moore [39].
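The magnitude of this bias is easy to verify numerically. The following sketch (a minimal illustration of our own; the setup, variable names and noise levels are not taken from the chapter) simulates the special case W = I_n, ρ = I of (14) and compares the empirical bias of the naive weighted LS estimate (8) with the first (dominant) term −(n − t)N̄⁻¹ρβσ_a² of (14):

```python
import numpy as np

rng = np.random.default_rng(1)
n, t = 200, 3
sigma, sigma_a = 0.01, 0.05                  # assumed noise levels of y and of A
beta_true = np.array([1.0, -2.0, 0.5])
A_bar = rng.uniform(-1.0, 1.0, (n, t))       # true (error-free) design matrix
Nbar_inv = np.linalg.inv(A_bar.T @ A_bar)

bias_mc = np.zeros(t)
runs = 2000
for _ in range(runs):
    A = A_bar + rng.normal(0.0, sigma_a, (n, t))   # EIV: noisy design matrix
    y = A_bar @ beta_true + rng.normal(0.0, sigma, n)
    beta_hat = np.linalg.solve(A.T @ A, A.T @ y)   # naive weighted LS (8), W = I
    bias_mc += beta_hat - beta_true
bias_mc /= runs

# first (dominant) term of the theoretical bias (14), with rho = I here
bias_theory = -(n - t) * sigma_a**2 * Nbar_inv @ beta_true

print("empirical bias   :", np.round(bias_mc, 4))
print("theory, 1st term :", np.round(bias_theory, 4))
```

The empirical bias agrees with the first term of (14) to within the Monte Carlo noise and the neglected second term; note that the bias flips sign with each component of β, as the formula predicts.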

2.2 The Effect of EIV on the Accuracy of the Weighted LS Estimate

To study the effect of the random errors E_A and ε on the accuracy of the weighted LS estimate of β, we only need to keep the linear term of (10), namely,

β̂ = β + N̄⁻¹ĀᵀW(ε − E_A β).   (15)

Applying the error propagation law to (15) and bearing in mind that ε and E_A are stochastically independent, we have the variance-covariance matrix of β̂:

D(β̂) = N̄⁻¹σ² + N̄⁻¹ĀᵀWKWĀN̄⁻¹,   (16a)

where K is an (n × n) matrix, whose elements are denoted by k_ij (i, j = 1, 2, . . . , n) and given by

k_ij = E{E_A^ri ββᵀ(E_A^rj)ᵀ} = βᵀ Σ^rr_ji β,   (16b)

with

Σ^rr_ji = E{(E_A^rj)ᵀ E_A^ri}.   (16c)

It is clear from (16a) that the variance-covariance matrix D(β̂) consists of two parts, which are caused by the random errors of A and y, respectively. If A is deterministic, then the term of D(β̂) with K is equal to zero and D(β̂) returns to the variance-covariance matrix of the estimated parameters in the conventional linear Gauss-Markov model. In other words, because of the random errors of A, N̄⁻¹σ² cannot correctly reflect but always optimistically overestimates the accuracy of the weighted LS estimator β̂.


In particular, if the row vectors of E_A are assumed to be mutually independent stochastically, and for each of the row vectors

D(E_A^ri) = diag(σ_a1², σ_a2², . . . , σ_at²),   (17a)

then we have

Σ^rr_ji = 0, (i ≠ j)   (17b)

and

K = k I,   (17c)

where

k = Σ_{i=1}^t σ_ai² β_i².   (17d)

In this special case, the variance-covariance matrix D(β̂) of (16a) becomes

D(β̂) = N̄⁻¹σ² + k N̄⁻¹ĀᵀW²ĀN̄⁻¹.   (18)

It is also clear from (17d) that the second part of D(β̂) is roughly proportional to the true power ‖β‖² of the parameters β and to the noise level of A. The larger the true values of β and/or σ_a², the larger the errors of β̂ become.

If the design matrix A is treated as if it were not random, then the corrections of the measurements y can be written as

r = Aβ̂ − y.   (19)

Inserting (3a) and (15) into (19), we can rewrite the corrections r of (19) in terms of the linear approximation of E_A and ε as follows:

r = (Ā + E_A)β̂ − Āβ − ε
  = E_A β + ĀN̄⁻¹ĀᵀW(ε − E_A β) − ε
  = (I − ĀN̄⁻¹ĀᵀW)(E_A β − ε)
  = R̄(E_A β − ε).   (20)


The variance-covariance matrix of r is then equal to

D(r) = (W⁻¹ − ĀN̄⁻¹Āᵀ)σ² + R̄KR̄ᵀ.   (21)

If the stochastic model of E_A satisfies the conditions (17), (21) becomes

D(r) = (W⁻¹ − ĀN̄⁻¹Āᵀ)σ² + k R̄R̄ᵀ.   (22)

2.3 The Effect of EIV on the Estimate of the Variance of Unit Weight

The usual weighted LS estimate for the variance of unit weight σ² is well known and given by

σ̂² = rᵀWr/(n − t).   (23)

To study the effect of the random errors of A on the estimate of σ², we apply the mathematical expectation operator to (23):

E(σ̂²) = E(rᵀWr)/(n − t) = tr{WE(rrᵀ)}/(n − t).   (24)

If we only consider all the terms up to the second order approximation of E_A and ε, the term E(rrᵀ) in (24) is actually equal to the variance-covariance matrix of r. Thus we can readily insert (21) into (24) and obtain:

E(σ̂²) = σ² + bias(σ̂²),   (25a)

where bias(σ̂²) stands for the bias of the estimate σ̂² of the variance of unit weight and is given by

bias(σ̂²) = c_a/(n − t),   (25b)

where the coefficient c_a is given by

c_a = tr{KWR̄}.   (25c)

If the stochastic model of E_A satisfies the conditions (17), the coefficient c_a becomes

c_a = k tr{WR̄}.   (25d)

In particular, if W = I and σ_ai² = σ_a² (i = 1, 2, . . . , t) in (17a), c_a of (25d) simplifies to:

c_a = (n − t)σ_a² βᵀβ,   (26)

and bias(σ̂²) of (25b) turns out to be:

bias(σ̂²) = σ_a² βᵀβ.   (27)

The estimate σ̂² of the variance of unit weight is biased, as can clearly be seen from (25a). The coefficient c_a comes from the variance-covariance matrix of r in association with the random errors of A and is always positive. If the coefficient matrix A is deterministic, then K = 0. As a result, bias(σ̂²) = 0, indicating that the estimate σ̂² is unbiased in this case of conventional (deterministic) linear models. We can also observe two features from the positive coefficient c_a in (26): (i) since the bias (25b) is proportional to the noise level of A, the noisier the random elements of A, the more σ̂² deviates from the variance of unit weight positively; in other words, the estimate σ̂² of the variance of unit weight σ² always tends to produce a larger value for σ², since c_a > 0; and (ii) the coefficient c_a is roughly proportional (implicitly) to the power ‖β‖² of the parameters. The larger the absolute values of the parameters β, the larger the bias (25b) of the estimate σ̂². This explains very well the experimental fact that in geodetic coordinate transformation, the differences in the transformation parameters from the weighted LS and TLS methods are insignificant or almost negligible but their corresponding estimates of the variance of unit weight are significantly different, as reported, for example, in Schaffrin and Felus [77], Cai and Grafarend [16] and Felus and Burtch [31].

2.4 Bias-Corrected Weighted LS Estimates of Parameters and the Variance of Unit Weight

In the preceding subsections, we have learned that the random errors in the coefficient matrix A result in biases in almost all the quantities of interest that are involved with the weighted LS estimate. Since we have worked out the biases of the quantities of interest, we may theoretically construct unbiased versions of the weighted LS estimate β̂ and the estimate σ̂² of the variance of unit weight by subtracting their corresponding biases (11a) and (25b) from β̂ and σ̂², respectively. Accordingly, the unbiased versions of β̂ and σ̂² may be symbolically written as follows:

β̃ = β̂ − bias(β̂),   (28)

and

σ̃² = σ̂² − bias(σ̂²),   (29)


where bias(β̂) and bias(σ̂²) are respectively given by (11a) and (25b). Although bias(β̂) and bias(σ̂²) require the true values of A, β and σ², we can practically compute them with the estimates β̂ and σ̂². In what follows, we will propose alternative methods to construct bias-corrected estimators for β and σ².

Because the design matrix A is random, the normal matrix AᵀWA will deviate from its true value as well. Xu [102] proposed the intuitive idea of reconstructing a bias-corrected normal matrix and then going on to estimate the parameters β; the result is called the N-calibrated weighted LS estimate. Rewriting the normal matrix N of (9) in terms of E_A as follows:

N = AᵀWA = N̄ + ĀᵀWE_A + E_AᵀWĀ + E_AᵀWE_A,   (30)

we have

E(N) = N̄ + Q_N,   (31)

where Q_N is given as follows:

Q_N = [tr(W Σ^cc_ji)].

Given the variance-covariance matrix of E_A, the matrix Q_N is uniquely computable. We can then construct an unbiased estimate of N̄ by removing Q_N from the normal matrix N of (9), which is denoted by N̂ and given as follows:

N̂ = N − Q_N.   (32)

Thus, we can finally use this unbiased estimate of N̄ to replace the normal matrix N and derive a new weighted LS estimate of β, denoted by β̂_bc and given as follows:

β̂_bc = (N − Q_N)⁻¹AᵀWy,   (33)

where β̂_bc is called the N-calibrated weighted LS estimator by Xu [102]. If the signal-to-noise ratio of A is small, Q_N could be significant in comparison with N. Some of the eigenvalues of (N − Q_N) may then be too small and/or even negative. In such a case, the estimate β̂_bc of (33) may become unstable and unreliable.

In the case of the variance of unit weight, we will further assume that Σ_A = Q_A σ_a² and σ² = σ_a². As a result, (25a) can be rewritten as:

E(σ̂²) = σ² + c_aq σ²/(n − t) = (n − t + c_aq)σ²/(n − t),

from which, after taking (23) into account, we can construct an unbiased estimate of σ² as follows:

σ̃² = rᵀWr/(n − t + c_aq),   (34)

where

c_aq = tr{WR̄K_q},

and the elements of K_q are denoted by k_ijq and given by

k_ijq = βᵀ Q^rr_ji β.

It is clear that, due to the randomness of A, we have to add a positive (but fractional) coefficient c_aq to calibrate the number of redundant measurements (n − t) in order to produce an unbiased estimate of σ². For more technical details, the reader may refer to Xu [102].

We will now close this section with two remarks.

Remark 1. The theoretical bias analysis has shown that the effect of the random design matrix on adjustment depends on the design matrix itself, the variance-covariance matrix of its random elements and the model parameters. The theoretical analysis also successfully explains all the anomalously large estimates of the variance of unit weight reported in the geodetic literature.

Remark 2. The simulation results on the effects of EIV on the weighted LS estimate of parameters were reported by Xu et al. [108] and Xu [102], and have shown: (i) that both the bias-corrected weighted LS estimate and the N-calibrated weighted LS estimate obviously outperform the weighted LS estimate. The intuitive N-calibrated weighted LS estimate is computationally less expensive and statistically performs even better than the bias-corrected weighted LS estimate in producing an almost unbiased estimate of the parameters; and (ii) that the conventional notion of accuracy of the weighted LS estimate, namely, (AᵀWA)⁻¹σ², cannot be used to represent the accuracy of the weighted LS estimate. It has to be compensated by an extra term to properly reflect the variance-covariance matrix of the weighted LS estimate; this extra term is solely due to the random errors of A but roughly proportional to the length of the measurements y.

3 The Effects of Errors-in-Variables on Variance Component Estimation

We will now investigate the effect of the errors-in-variables on variance component estimation. Xu and Liu [104, 105] proved that it is generally impossible to correctly estimate the stochastic parameters of variance components for both y and A in the


EIV model (2). Thus, from this point of view, it is important and interesting, both theoretically and practically, to investigate the following question: if the matrix A is treated as if all its elements were non-random, as in the case of the conventional linear model, then we would like to know how the randomness of A affects the estimation of the stochastic parameters of the measurements y.

To answer the question posed above, we assume a general variance-covariance matrix Q_A σ_a² for the random errors E_A, with both Q_A and σ_a² being known or given, and represent the stochastic model of the measurements y as follows:

Σ_y = Σ_{i=1}^{m_y} U_iy σ_iy²,   (35)

where Σ_y is the variance-covariance matrix of y, the U_iy are known and positive semi-definite, σ_iy² (i = 1, 2, . . . , m_y) are the (unknown) variance components of y, and m_y is the number of unknown variance components σ_iy². Throughout this section, we will assume that E_A and ε are stochastically independent. For conciseness of notation, we will collect all the variance components σ_iy² (i = 1, 2, . . . , m_y) in the vector σ_y.

Since we are only interested in the effects of the random errors E_A and ε on variance component estimation, we do not assume any distributional information on E_A and ε. We use the conventional MINQUE method to estimate the variance components of y by treating the coefficient matrix A as if none of its elements were random, and then investigate the effect of the EIV model on the variance component estimate of σ_y. The basic idea is to find the dependence of the MINQUE-estimated variance components of y on the random errors E_A and further on Q_A σ_a². The biases of the estimated variance components will be derived, and the special case of the variance of unit weight will be treated as well. We should note that the basic idea of this section can readily be extended to other variance component estimation methods. Nevertheless, since the principle and methods of investigation are identical, we will limit ourselves to the MINQUE method.

For the functional model (2) and the stochastic model (35), and given a set of initial values σ_0iy² (i = 1, 2, . . . , m_y) (or in vector form σ_0y for σ_y), we apply the MINQUE method and construct the local estimate of the variance components σ_y through the following system of equations:

S σ̂_y = q,   (36)

(see e.g., [68, 69]), where σ̂_y is the local estimate of the variance components σ_y, and the elements s_ij of S and q_i of q are respectively given as follows:

s_ij = tr{PU_iy PU_jy} for i, j = 1, 2, . . . , m_y,   (37a)

q_i = rᵀ Σ_0y⁻¹ U_iy Σ_0y⁻¹ r,   (37b)

for i = 1, 2, . . . , m_y, where

P = Σ_0y⁻¹ R,   (37c)
R = I − AN⁻¹AᵀΣ_0y⁻¹,   (37d)

with

N = AᵀΣ_0y⁻¹A,   (38)

Σ_0y = Σ_{i=1}^{m_y} U_iy σ_0iy²,

r = Aβ̂ − y,   (39)
β̂ = N⁻¹AᵀΣ_0y⁻¹ y,   (40)

if Σ_0y is invertible (or non-singular).

Since the design matrix A is random under the framework of EIV models, we will now investigate how the randomness of A affects the MINQUE estimate of the variance components σ_y. More precisely, we derive the expectation of q_i in (37b) and then find the effect of the variance-covariance matrix of A on σ̂_y in (36). Thus, we will first find the expectation of the quadratic form (37b). For this purpose, and bearing the assumed stochastic model (35) in mind, we can rewrite the linearized expressions of the weighted LS estimate of β and the corrections r in terms of the random errors E_A and ε, together with their variance-covariance matrices, as follows:

β̂ = β + N̄⁻¹ĀᵀΣ_0y⁻¹(ε − E_A β),   (41a)
D(β̂) = N̄⁻¹ĀᵀΣ_0y⁻¹(Σ_y + K)Σ_0y⁻¹ĀN̄⁻¹,   (41b)

and

r = R̄(E_A β − ε),   (41c)
D(r) = R̄(Σ_y + K)R̄ᵀ,   (41d)

where K has been defined as in (16b) and

N̄ = ĀᵀΣ_0y⁻¹Ā.

With formulae (41c) and (41d) in hand, we can use the standard formulae of quadratic forms of random variables (see e.g., [46, 68, 81]) and derive the expectation of q_i in (37b). Since D(r) = E(rrᵀ), and according to Xu [102], the expectation of (37b) leads to:

E(q_i) = E(rᵀ Σ_0y⁻¹ U_iy Σ_0y⁻¹ r)
       = tr{P̄U_iy P̄K} + Σ_{j=1}^{m_y} tr{P̄U_iy P̄U_jy} σ_jy²,   (42)

where P̄ = Σ_0y⁻¹R̄. By collecting all the elements E(q_i) into a column vector, we obtain the following theoretical relationship:

E(q) = S̄σ_y + u_a,   (43)

where the elements of S̄ have been defined as in (37a) but with P replaced by P̄, and the elements of u_a are given by:

u_ia = tr{P̄U_iy P̄K}.   (44)

From the normal equation (36), we readily write the MINQUE estimate of the variance components σ_y as follows:

σ̂_y = S⁻¹q,   (45)

if the normal matrix S is regularly invertible. If we only consider the effects of the second order approximations of the random errors E_A and ε on σ̂_y, after applying the expectation operator to (45), we can obtain

E(σ̂_y) = S̄⁻¹E(q).   (46)

Substituting (43) into (46) yields

E(σ̂_y) = S̄⁻¹(S̄σ_y + u_a) = σ_y + S̄⁻¹u_a,   (47)

from which we can compute the bias of σ̂_y, denoted by bias(σ̂_y), as follows:

bias(σ̂_y) = E(σ̂_y) − σ_y = S̄⁻¹u_a.   (48)

It is clear from (44) that the elements u_ia of u_a in (48) are totally attributed to the random errors in the design matrix A. If the design matrix A were free of random errors, u_ia would be equal to zero, indicating that in this case the MINQUE estimation of the variance components σ_y is unbiased. For an EIV model, since the design matrix A contains random elements, K can never be equal to zero, implying that the MINQUE estimate of σ_y is always biased for any EIV model. On the other hand, since u_ia depends on the unknown parameters β through K, the larger the parameters β, the larger the bias of σ̂_y from (36).

Under the assumed special stochastic model (17a), all the elements u_ia of u_a are proportional to k of (17d), namely, k = Σ_{i=1}^t σ_ai² β_i². In this case, since K = kI in (44), the biases of the estimated variance components of the measurements y are proportional to the value k.

As a special example, if there is only one variance component, namely, m_y = 1 and σ_1y² = σ², then the variance-covariance matrix of y reduces to the simplified model Σ_y = W⁻¹σ². If we further assume that the random errors E_A of A are only known up to an unknown scalar, namely, Σ^rr_ji = Q^rr_ji σ_a², and σ_a² = σ², then (42) becomes

E(rᵀWr) = tr{P̄W⁻¹P̄K} + (n − t)σ².   (49)

On the other hand, by assumption,

K = K_q σ²,   (50)

where the elements k_ijq of K_q are given by k_ijq = βᵀ Q^rr_ji β. We thus obtain the unbiased estimate of σ² as follows:

σ̃² = rᵀWr/(n − t + c_ay),   (51a)

where the constant c_ay is given by

c_ay = tr{K_q W[I − Ā(ĀᵀWĀ)⁻¹ĀᵀW]}.   (51b)

The unbiased estimate of the variance of unit weight (51) is clearly the same as that of (34).

The simulations of Xu [102] have shown: (i) that errors-in-variables can have a significant effect on the estimation of the variance components of y, if the random errors of A are significant but treated as non-random. The variance components of y can be incorrectly estimated by more than one order of magnitude, depending on the nature of the problems and the sizes of the EIV; and (ii) that the bias-corrected VC estimate can effectively remove the bias of the VC estimate. If the signal-to-noise ratio is small, higher order terms may be necessary. Nevertheless, since we construct the bias-corrected VC estimate by directly removing the estimated bias from the estimate itself, the


simulation results of Xu [102] have also clearly indicated that there is a great risk of obtaining negative values for the variance components of y.
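For the single-variance-component case, the calibration (51) is straightforward to apply in practice. The sketch below (our own illustration, with assumed noise levels) adopts the simple stochastic model Q^rr_ji = δ_ji I_t with σ_a² = σ², for which K_q = (βᵀβ)I and, with W = I, c_ay = (n − t)βᵀβ; the true β is replaced by its estimate when computing c_ay:

```python
import numpy as np

rng = np.random.default_rng(4)
n, t = 400, 3
sigma = 0.05                             # here sigma_a^2 = sigma^2, as assumed for (51)
beta = np.array([2.0, -1.0, 1.5])
A_bar = rng.uniform(-1.0, 1.0, (n, t))   # true design matrix

s2_naive, s2_cal = 0.0, 0.0
runs = 500
for _ in range(runs):
    A = A_bar + rng.normal(0.0, sigma, (n, t))
    y = A_bar @ beta + rng.normal(0.0, sigma, n)
    b = np.linalg.solve(A.T @ A, A.T @ y)
    r = A @ b - y
    # K_q = (b'b) I under Q_rr_ji = delta_ji I, hence c_ay = (n - t) b'b for W = I
    c_ay = (n - t) * (b @ b)
    s2_naive += r @ r / (n - t)          # naive estimate (23)
    s2_cal += r @ r / (n - t + c_ay)     # calibrated estimate (51a)
s2_naive /= runs
s2_cal /= runs
print(f"true sigma^2 = {sigma**2:.5f}, naive = {s2_naive:.5f}, calibrated = {s2_cal:.5f}")
```

The calibrated redundancy n − t + c_ay deflates the naive estimate back to roughly the true σ², up to the small residual bias inherited from using β̂ in place of β.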

4 TLS Estimation of Parameters in EIV Models

In this section, we will discuss the major methods for parameter estimation in EIV models: (i) the orthogonal regression method, which was first proposed by Adcock [1, 2] and, likely independently, by Pearson [67] and Coolidge [21], since these two latter publications did not refer to Adcock [1, 2] and Coolidge [21] did not refer to Pearson [67] either; (ii) the Frobenius-norm-based TLS method proposed by Golub and van Loan [36], which has mostly been followed by van Huffel and Vandewalle [91]; and (iii) the weighted TLS method, which is the first rigorous statistical method to estimate the parameters in an EIV model. The weighted TLS method was first proposed by Deming [23–25] and further extended in the geodetic literature by Schaffrin and Wieser [77], Fang [29] and Xu et al. [107]. As an interesting and important application example, we will present the closed form solutions to coordinate transformation, as systematically developed, for example, by Sanso [74], Davenport (see Keat [44]), Horn [40], Arun et al. [8] and Horn et al. [41]. We will not cover the material on EIV condition adjustment and EIV robust estimation. Readers interested in these two aspects of EIV models should refer to Tong et al. [88], Zumar [111], Cheng and van Ness [20], Bab-Hadiashar and Suter [9], Amiri-Simkooei and Jazaeri [5] and Chang et al. [19] for more information.

4.1 The Orthogonal Regression Method

The idea of orthogonal regression for parameter estimation in an EIV model was first proposed by Adcock [1]. As a direct application of Adcock [1, 2], Coolidge [21] estimated the intercept and slope parameters of a 2D line of best fit by minimizing the sum of squared Euclidean distances of 2D measured data points (x_i, y_i) (i = 1, 2, . . . , n) to the fitted line. Pearson [67] extended the idea of orthogonal regression to the general case of hyperplanes of arbitrary dimension (see also [65]).

To begin with, let us rewrite the measured matrix A and the data vector y as n data points, each with (t + 1) elements, namely, x_i = (a_i, y_i) (i = 1, 2, . . . , n). Denoting the Euclidean distance from x_i to the hyperplane to be determined by s_i, we can then write the best fitting criterion of orthogonal regression for the EIV model (2) as follows:

min: S(β, E, r) = Σ_{i=1}^n s_i²,   (52)


(see e.g., [1, 2, 21, 65, 67]), where E is the correction matrix to A and r the correction vector to y. The criterion (52) is generally applicable, no matter whether the data points x_i are random, partly random or deterministic. Under the defined estimation criterion (52), Pearson [67] proved two important results for the parameter estimation in the EIV model (2) (see also [65]): (i) that the solution must pass through the centroid x_c = (a_c, y_c); and (ii) that the solution can be directly obtained from the direction cosines of the least eigenvalue of the data variance-covariance matrix. Since the estimated hyperplane has been proved to pass through the centroid x_c = (a_c, y_c) [65], it can be symbolically written as follows:

(a − a_c)d_a + (y − y_c)d_y = 0,   (53)

where (d_aᵀ, d_y) defines the normal vector to the hyperplane. Denoting by M_ay the extended data matrix whose ith row is (a_i − a_c, y_i − y_c), namely,

M_ay = [ a_1 − a_c, y_1 − y_c
         a_2 − a_c, y_2 − y_c
         . . .
         a_n − a_c, y_n − y_c ],   (54)

Nievergelt [65] has proved that the unknown normal vector (d_aᵀ, d_y) corresponds to the eigenvector of the smallest eigenvalue of the matrix M_ayᵀM_ay, as was already clearly stated in Pearson [67]. As a result, we can obtain the orthogonal regression estimate of β in the EIV model (2) through the estimated hyperplane (53).

From the statistical point of view, if all the elements a_ij of A and y_i of y are stochastically independent and of the same accuracy, then the estimation criterion (52) for the EIV model (2) essentially corresponds to the ordinary TLS method. Following this line of reasoning, if (a_i, y_i) and (a_j, y_j) (i ≠ j) are stochastically independent and if the elements of (a_i, y_i) are all of the same accuracy for each i, then the criterion (52) of orthogonal regression can be extended to

min: S(β, E, r) = Σ_{i=1}^n w_i s_i²,   (55)

where w_i is a positive scalar and plays the role of a weighting factor for each data point (a_i, y_i). Equation (55) may be called Pearson's [67] weighted criterion of best fitting for EIV models. The geometrical interpretation of the best fitting criterion (52) and the corresponding corrections to the EIV model (2) by Pearson [67] is illustrated by the red solid lines in Fig. 1. We should note, however, that the optimal criterion of best fitting proposed by Pearson [67] is different from least squares. Even if A is not random, one can use Pearson's criterion to estimate the model parameters β. An


Fig. 1 Geometrical illustrations of the TLS solution and the corresponding corrections (after Xu et al. [107]): the black solid line – the TLS solution; the red solid lines – the TLS vectors of corrections when all the elements of A and y are observed with random noises; the solid blue lines – the TLS vectors of corrections when A is deterministic and only the elements of y are observed with random noises. They are essentially the LS corrections of observations, since TLS is degenerated into LS; and the green solid lines – the TLS vectors of corrections in the case that y is free of random errors but A is measured with random errors. The red circles mark the data points

operational algorithm to compute the estimate of β in the EIV model (2) under Pearson's criterion can be found in Nievergelt [65].
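Pearson's two results translate directly into a few lines of linear algebra. The sketch below (our own illustration for the 2D straight-line case; the data and noise levels are assumed) centers the data at the centroid, takes the eigenvector of the smallest eigenvalue of M_ayᵀM_ay as the normal vector, and converts it to a slope/intercept parametrization:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
a_true = np.linspace(0.0, 10.0, n)
beta1, beta2 = 0.8, 2.0                      # slope and intercept of the true line
a = a_true + rng.normal(0.0, 0.1, n)         # both coordinates observed with noise
y = beta1 * a_true + beta2 + rng.normal(0.0, 0.1, n)

# center the data: the fitted line passes through the centroid (Pearson's result (i))
M = np.column_stack([a - a.mean(), y - y.mean()])
# normal vector = eigenvector of the smallest eigenvalue of M'M (Pearson's result (ii))
eigval, eigvec = np.linalg.eigh(M.T @ M)     # eigh returns eigenvalues in ascending order
d = eigvec[:, 0]
slope = -d[0] / d[1]                         # from d_a*(a - a_c) + d_y*(y - y_c) = 0
intercept = y.mean() - slope * a.mean()
print(f"estimated slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Unlike ordinary LS, which minimizes only vertical residuals, this orthogonal regression fit treats errors in both coordinates symmetrically, in the spirit of criterion (52).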

4.2 Golub/van Loan's TLS Estimation of Parameters

Although TLS was already used for data fitting earlier by Adcock [1, 2], Pearson [67], Coolidge [21] and Deming [23–25], it is the landmark publication of Golub and van Loan [36] that made TLS widely spread, accepted and applied in many areas of science and engineering. It was then further systematically developed and popularized by van Huffel and Vandewalle [91]. TLS has since become a standard method to solve EIV models. Actually, Golub and van Loan [36] coined the terminology of total least squares and demonstrated that the TLS solution can readily be obtained algorithmically by singular value decomposition. Golub and van Loan [36] solve the EIV model (2) by minimizing the Frobenius norm of the corrections to both A and y. Given two diagonal matrices D and Γ, namely,

D = diag(d_1, d_2, . . . , d_n),   (56a)

Γ = diag(γ_1, γ_2, . . . , γ_{t+1}),   (56b)

where all the values d_i and γ_i are positive, Golub and van Loan [36] propose minimizing the following objective function:

min: S(E, r, β) = ‖D[E, r]Γ‖²_F,   (57a)

subject to the following equality constraints:

(A + E)β = y + r,   (57b)

where ‖B‖_F denotes the Frobenius norm of B, namely,

‖B‖²_F = Σ_i Σ_j b_ij²,

with b_ij being the element of B at row i and column j. The solution to the constrained minimization problem (57) is called the TLS solution to the EIV model (2) [36]. Golub and van Loan [36] prove that the solution to the minimization problem (57a) subject to the nonlinear equality constraints (57b) can be found by computing the singular value decomposition of the following matrix:

C = D[A, y]Γ = UΣVᵀ,   (58)

where both U and V are normalized orthogonal matrices, and Σ is the diagonal matrix with all its elements positive and arranged in decreasing order. The TLS solution to (57) corresponds to the eigendirection of the minimum singular value of C. It is unique, since all the singular values have been proved to be distinct almost surely [84]. Denoting the last column vector of V by (v_tᵀ, v_{t+1})ᵀ, Golub and van Loan [36] obtain the TLS solution to (57) as follows:

β̂_TLS = −Γ_1 v_t/(γ_{t+1} v_{t+1}),   (59)


where Γ_1 = diag(γ_1, γ_2, . . . , γ_t). Further algorithms to compute the ordinary TLS solution with D = I_n and Γ = I_{t+1} can readily be found in the TLS literature [60, 91], where I_n and I_{t+1} are the (n × n) and ((t + 1) × (t + 1)) identity matrices, respectively. In this case, the TLS solution of (59) becomes

β̂_OR = −v_t/v_{t+1},   (60)

(see e.g., [60, 91]), where β̂_OR stands for the ordinary TLS or orthogonal regression solution of β in the EIV model (2). In the case that D = I_n and Γ = I_{t+1}, we can trivially prove that the TLS objective function (57a) can be rewritten as follows:

min: S(E, r, β) = ‖[E, r]‖²_F = Σ_{i=1}^n s_i².   (61)

In other words, the ordinary TLS solution by Golub and van Loan [36] is identical to the EIV solution derived by Pearson [67] and Nievergelt [65]. The geometrical interpretation of the TLS method by Golub and van Loan [36] can also be found in Pearson [67]. Golub and van Loan [36] additionally illustrate the (ordinary and/or weighted) LS solution, as shown by the solid blue lines in Fig. 1. Based on the results of Golub and van Loan [36], we know that Pearson's solution is also an ordinary TLS solution in the case of the EIV model (2), even though Pearson [67] attacked this model from a completely different perspective of best fitting.

Although the solution approach (57) has been well known to result in the TLS solution since Golub and van Loan [36], the question is whether this TLS formulation is optimal for solving an EIV model from the statistical point of view, with or without constraints and/or with different weights for A and y. Actually, Golub and van Loan [36] call both matrices D and Γ in (57) weight matrices. Nevertheless, since the diagonal matrix Γ is multiplied to [E, r] from the right, it essentially plays the role of re-scaling each β_i and the corresponding measurement of y. In other words, if we replace each β_i with β_iγ = γ_i β_i and y with y_γ = γ_{t+1} y, then Γ can readily be removed from (57a). From this point of view, Γ should not be considered a weight matrix. Since Γ only re-scales the model parameters β and the measured data y, we can remove it from (57a). For the diagonal matrix D of (56a), we can again rewrite (57a) as follows:

min: S(E, r, β) = ‖D[E, r]‖²_F = Σ_{i=1}^n d_i² s_i²,   (62)

which clearly indicates that the TLS criterion by Golub and van Loan [36] could be said to be the weighted sum of squared distances from the data points to the hyperplane to be estimated, as formulated by Pearson [67]. In other words, Pearson’s [67] criterion of best fitting can be extended to the weighted case and will become

502

P. Xu

the weighted LS method with the weighting factors di2. It is clear from (62) that di of (57a) plays the role of a weight for each data point (ai, yi) and that the objective function (57a) is actually a weighted LS criterion. If we follow the assumption of different weights for different measurements by Deming [23–25], then it is impossible to rewrite Deming's weighted TLS criterion in the form of (55). In this case, the diagonal matrix D cannot be considered a weight matrix either.
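The ordinary TLS solution (60) is easy to sketch numerically via the singular value decomposition of the augmented matrix [A, y]; the following minimal NumPy sketch is ours, not the chapter's, and all data, noise levels and the seed are made up for illustration:

```python
import numpy as np

# Ordinary TLS for (A + E)beta = y + r: beta_OR = -v_t / v_{t+1}, where
# [v_t; v_{t+1}] is the right singular vector of [A, y] belonging to the
# smallest singular value (the unweighted case D = I_n of (57a)).
rng = np.random.default_rng(1)
n, t = 50, 2
beta_true = np.array([1.5, -0.7])
A_bar = rng.normal(size=(n, t))              # error-free design matrix
y = A_bar @ beta_true + 0.01 * rng.normal(size=n)
A = A_bar + 0.01 * rng.normal(size=(n, t))   # the noisy matrix actually observed

C = np.column_stack([A, y])
_, _, Vt = np.linalg.svd(C)
v = Vt[-1]                                   # right singular vector of smallest sigma
beta_tls = -v[:t] / v[t]
print(beta_tls)
```

With both A and y perturbed, this estimate minimizes the sum of squared orthogonal distances (61) to the fitted hyperplane.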

4.3 The Weighted TLS Estimation of Parameters

The weighted TLS method was first applied to solve nonlinear EIV models by Deming [23–25], who assumed a general problem as follows:

$$f_i(a_i + E_i,\, y_i + r_i,\, \beta) = 0, \tag{63}$$

for i ∈ (1, 2, ..., n), where ai and yi have been defined as in the EIV model (2), and Ei and ri are the corrections to ai and yi, respectively. By assuming a different weight ωij for each measured element aij of A and a different weight wi for each of the measured data yi of y, Deming [23–25] solved the nonlinear EIV model (63) by minimizing the following weighted TLS objective function:

$$\min:\ S(E, r, \beta) = \sum_{i=1}^{n}\sum_{j=1}^{m} \omega_{ij} e_{ij}^2 + \sum_{i=1}^{n} w_i r_i^2, \tag{64a}$$

subject to the following equality conditions:

$$f_i(a_i, y, \beta_0) + (F_i^a)^T E_i + (F_i^y)^T r + (F_i^\beta)^T \delta\beta = 0, \quad (i = 1, 2, \ldots, n), \tag{64b}$$

where F_i^a, F_i^y and F_i^β are the partial derivative (column) vectors of fi(·) with respect to Ei, r and β, respectively, all computed at the values of A, y and β0, with β0 being the approximate values of β. He then used the weighted TLS method (64a) to adjust the measurements A and y under the linearized constraints (64b). A further extension of Deming's criterion (64a) would be to consider the correlations among the elements of A and between A and y; one such example is an autoregressive model with a Toeplitz structure (see, e.g., [33, 77, 107]). Essentially speaking, the weighted TLS method by Deming [23–25] is equivalent to the weighted LS condition adjustment with unknown parameters, though the measurements are not limited to y but also include A. Although Deming [23, 25] used the weighted TLS principle (64a) correctly to solve the nonlinear EIV model (63), his solution is based on linearizing the nonlinear equations of measurements. As a consequence, the TLS solution obtained may only be approximate, unless the initial approximate values of the model parameters are sufficiently good, as pointed out by Gerhold [34]. This procedure has now been

9 Parameter Estimation, Variance Components and Statistical. . .

503

practised routinely in geodesy, in particular in coordinate transformation (see, e.g., [53–55, 64, 83]). Actually, the approach of using the method of Lagrange multipliers to solve EIV models, as developed by Deming [23–25], has been followed by almost all the works on weighted TLS (see, e.g., [31, 34, 77, 78, 83]). Based on the same weighting assumption as in Deming [23–25], Gerhold [34] applied the weighted TLS method (64a) to the EIV model (2) through the formulation of the following minimization problem:

$$\min:\ S(E, r, \beta) = \sum_{i=1}^{n}\sum_{j=1}^{m} \omega_{ij} e_{ij}^2 + \sum_{i=1}^{n} w_i r_i^2, \tag{65a}$$

subject to the following equality constraints:

$$(A + E)\beta = y + r. \tag{65b}$$

In order to use vector notation, we follow Magnus and Neudecker [57] and denote eA = vec(E), which puts the elements of E into a vector one column after the other. Then the left-hand side of (65b) can be rewritten as follows:

$$(\beta^T \otimes I_n)(v_A + e_A) = y + r, \tag{66}$$

(see also [57]), where vA = vec(A) and ⊗ stands for the Kronecker product. As a result, (65a) can be rewritten as follows:

$$\min:\ S(e_A, r, \beta) = e_A^T \omega\, e_A + r^T W r, \tag{67}$$

subject to the equality conditions (66), where ω and W are the diagonal weight matrices of vA and y, respectively. Although the early works on TLS (see, e.g., [23–25, 34]) assumed diagonal matrices for both ω and W, such an assumption is not really necessary to solve the TLS objective function (67) under the constraints (66). Thus, in the remainder of this chapter, we remove this diagonality condition on both ω and W and only assume that both of them are positive definite [77, 107]. To apply the method of Lagrange multipliers to the weighted TLS problem (67) under the constraints (66), we construct the augmented Lagrangian function:

$$\min:\ \Phi = \frac{1}{2} e_A^T \omega\, e_A + \frac{1}{2} r^T W r + [(\beta^T \otimes I_n)(e_A + v_A) - y - r]^T \lambda, \tag{68}$$


(see, e.g., [34]), where λ is the vector of Lagrange multipliers. The solution to (68) is the weighted TLS estimate. For brevity of notation, we denote the estimate of β by β̂ but keep the notations eA, r and λ for their estimates, respectively. Differentiating Φ of (68) with respect to eA, r, β and λ at the point of their weighted TLS estimates, respectively, and setting all these partial derivatives to zero, we have the normal equations:

$$\frac{\partial \Phi}{\partial e_A} = \omega\, e_A + (\hat\beta \otimes I_n)\lambda = 0, \tag{69a}$$

$$\frac{\partial \Phi}{\partial r} = W r - \lambda = 0, \tag{69b}$$

$$\frac{\partial \Phi}{\partial \beta} = (A + E)^T \lambda = 0, \tag{69c}$$

$$\frac{\partial \Phi}{\partial \lambda} = (A + E)\hat\beta - y - r = 0, \tag{69d}$$

where

$$E = (e_{A1}, e_{A2}, \ldots, e_{At}). \tag{70}$$

From (69a), (69b) and (69d), we have

$$e_A = -\omega^{-1}(\hat\beta \otimes I_n)\lambda, \tag{71a}$$

$$\lambda = W r, \tag{71b}$$

$$r = (A + E)\hat\beta - y. \tag{71c}$$

Substituting (71c) into (71b) and then further into (69c) yields (A + E)^T W{(A + E)β̂ − y} = 0, from which we can obtain the representation of the TLS estimate of β as follows:

$$\hat\beta = \{(A + E)^T W (A + E)\}^{-1}(A + E)^T W y, \tag{71d}$$

(see also [85]). Since eA of (71a), or equivalently E of (70), is a function of β̂, (71d) is essentially a system of nonlinear equations and has to be solved iteratively. Actually, based on the four equations of (71), we can numerically solve iteratively for β̂, eA (or equivalently E), r and λ.


In particular, if the elements of each column of A are of the same accuracy and, further, if the elements of A in different columns are stochastically independent, the weight matrix ω can be written as diag(ω1 In, ω2 In, ..., ωt In). In this case, the correction matrix E can be elegantly written as follows:

$$e_{Ai} = -\hat\beta_i\, W r/\omega_i, \quad (i = 1, 2, \ldots, t). \tag{72}$$

4.4 Nonlinear Weighted LS Parameter Adjustment for Partial EIV Models

Although the EIV model (2) has often been formulated as a nonlinear weighted LS adjustment with nonlinear equality constraints, we will reformulate it as an unconstrained nonlinear LS adjustment model. More precisely, an EIV model of type (2) is essentially equivalent to two sub-models of parameter adjustment: one is nonlinear but only bilinear with respect to the unknown parameters of interest, and the other is simply linear. In this subsection, we will assume that some of the elements of A are either deterministic or satisfy certain linear relations, and extend the EIV model (2) in two aspects: (i) we assume that all the elements of A are random but can repeat themselves; in other words, the elements of A show a certain pattern such as a Hankel or Toeplitz structure, and such EIV models are better known as structured EIV models; and (ii) in the most general case, if some of the elements of A are deterministic and/or functionally dependent, we follow Xu et al. [107] and call this type of EIV model a partial EIV model. Equality and inequality constraints may be imposed on the elements of A, and likely also on the parameters β, which will not be discussed in this chapter.
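The partial EIV idea can be made concrete with a toy sketch of the parameterization vec(A) = h + Ba used in model (4): deterministic entries of A go into h, while the ta independent random elements a are placed into vec(A) by a 0/1 matrix B. The numbers below are made up; for a straight-line fit, the column of ones is deterministic:

```python
import numpy as np

n, t = 3, 2
a = np.array([2.0, 4.0, 6.0])                    # the random elements (second column)
h = np.concatenate([np.ones(n), np.zeros(n)])    # fixed first column of ones
B = np.vstack([np.zeros((n, n)), np.eye(n)])     # maps a into the second column

vecA = h + B @ a
A = vecA.reshape(t, n).T                         # undo the column-wise stacking
beta = np.array([0.5, -1.0])
# observation equation of type (78a): y = (beta^T (x) I_n)(h + B a)
y_model = np.kron(beta.reshape(1, -1), np.eye(n)) @ vecA
print(A)
```

Here only the three elements of a are treated as measurements, while the ones-column carries no random error, which is exactly what the partial EIV reformulation exploits.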

4.4.1 Constrained TLS Estimation of Parameters

Structured EIV models have been substantially investigated as a special class of constrained EIV models. A structured EIV model has usually been formulated and solved in two different ways: one by minimizing the Frobenius norm as in the case of Golub and van Loan [36], subject to the condition that the correction matrix and vector must be of the same form as A and y; and the other by applying the weighted LS method to all the measurements, subject to the nonlinear constraints transformed from the original EIV model (see e.g., [51, 58–61]). More precisely, the ordinary structured TLS solution is to solve the following minimization problem:

$$\min:\ S(\beta, E, r) = \|[E, r]\|_F^2, \tag{73a}$$

subject to the nonlinear equality constraints:

$$(A + E)\beta = y + r, \tag{73b}$$

and the second condition that [E, r] is of the same structure as [A, y] (see e.g., [58, 59, 61]).


As an example of structured EIV models, we consider a simplified dynamic system of time series (see e.g., [7]). The regression model of time series can be written as follows:

$$\bar y_i = \sum_{j=1}^{t} \beta_j\, \bar x_{i-j+1}, \tag{74}$$

where ȳi and x̄j are the outputs and inputs of the system, which are observed as the values yi and xj contaminated with additive random errors εi and ηj, respectively, and βj are the unknown parameters. The matrix form of (74) is equivalent to the following EIV model:

$$y + r = (A + E)\beta, \tag{75a}$$

where y and β are the collection vectors of yi and βj, respectively, and the matrix A is of Toeplitz structure and given below:

$$A = \begin{bmatrix} x_n & x_{n-1} & \cdots & x_{n-t+1} \\ x_{n-1} & x_{n-2} & \cdots & x_{n-t} \\ \vdots & \vdots & \ddots & \vdots \\ x_t & x_{t-1} & \cdots & x_1 \end{bmatrix}. \tag{75b}$$

To warrant the condition of keeping the same structure of [A, y], the correction matrix E should also be of the following Toeplitz structure:

$$E = \begin{bmatrix} e_n & e_{n-1} & \cdots & e_{n-t+1} \\ e_{n-1} & e_{n-2} & \cdots & e_{n-t} \\ \vdots & \vdots & \ddots & \vdots \\ e_t & e_{t-1} & \cdots & e_1 \end{bmatrix}. \tag{75c}$$

It is obvious from (75c) that the elements of E depend only on the n independent elements [e1, e2, ..., en]^T, which we denote by α. To emphasize this dependence, we denote E of (75c) by E(α). As a result, the structured EIV model (73) is equivalently rewritten as follows:

$$\min:\ S(\beta, \alpha, r) = \alpha^T\alpha + r^T r, \tag{76a}$$

subject to the nonlinear equality constraints:

$$y + r = [A + E(\alpha)]\beta. \tag{76b}$$

In particular, if A is of Toeplitz structure, fast algorithms have been developed to solve structured EIV models of this type. For more details on these algorithms, the


reader is referred to Mastronardi et al. [61] and Markovsky and van Huffel [58, 59]. Structured TLS algorithms are also published by Rosen et al. [72] for the L1 and L∞ norms.
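The structural constraint can be made concrete with a small helper of our own (not one of the cited algorithms) that generates the Toeplitz matrix (75b) from the inputs; applying the same generator to α yields a correction matrix E(α) of the form (75c):

```python
import numpy as np

def toeplitz_design(v, t):
    """Rows (v_n ... v_{n-t+1}) down to (v_t ... v_1), as in (75b)."""
    n = v.size
    return np.array([[v[n - 1 - i - j] for j in range(t)]
                     for i in range(n - t + 1)])

x = np.arange(1.0, 7.0)          # made-up inputs x_1..x_6
A = toeplitz_design(x, 2)        # design matrix (75b)
alpha = 0.1 * np.ones(6)         # hypothetical corrections e_1..e_6
E = toeplitz_design(alpha, 2)    # E(alpha) automatically keeps the form (75c)
print(A)
```

Because A and E(α) come from the same generator, the structured constraint "[E, r] has the same structure as [A, y]" is satisfied by construction, with only n free correction parameters.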

4.4.2 Nonlinear Weighted LS Parameter Adjustment for Partial EIV Models

For structured EIV models, if the measurements A, or more precisely α, and y are of different accuracy, with the respective variance-covariance matrices W_α^{-1}σ² and W^{-1}σ², one can solve such structured EIV models using the weighted LS principle instead of the Frobenius norm of [E, r]. In this case, the weighted TLS solution to structured EIV models can be formulated as the following equality-constrained optimization problem:

$$\min:\ S(\beta, \alpha, r) = \alpha^T W_\alpha\, \alpha + r^T W r, \tag{77a}$$

subject to the nonlinear equality constraints:

$$y + r = [A + E(\alpha)]\beta, \tag{77b}$$

(see e.g., [7, 12, 59, 60]). Alternatively, we can follow Xu et al. [107] and reformulate a (full or partial) EIV model into a nonlinear parameter adjustment model (see also [85]), as already given in the model (3) or (4). Any structured and unstructured EIV model can be well reformulated as a partial EIV model and further as a nonlinear parameter adjustment problem [107], so far as some of the elements of A are not random and/or are functionally dependent. The advantage of this reformulation is that the equality constraints (76b) or (77b) are automatically satisfied and can be eliminated. For convenience, we rewrite the partial EIV model (4) as follows:

$$y = (\beta^T \otimes I_n)(h + B\bar a) + \epsilon, \tag{78a}$$

$$a = \bar a + \epsilon_a, \tag{78b}$$

$$D(\epsilon) = W^{-1}\sigma^2, \tag{78c}$$

$$D(\epsilon_a) = \omega^{-1}\sigma^2, \tag{78d}$$

and

$$\mathrm{cov}(\epsilon, \epsilon_a) = 0. \tag{78e}$$

This last condition (78e) of stochastic independence between  and  a can be replaced with a non-zero correlation matrix, which will not be discussed in this chapter, since it does not impose extra technical difficulty. We may note that in the


structured EIV formulation (76), the parameters α are equivalent to the corrections to ā in (78b). In order to estimate ā and β from y and a, we apply the weighted LS principle to the unconstrained nonlinear adjustment model (78), which leads to the nonlinear minimization problem:

$$\min:\ S(\bar a, \beta) = (\bar a - a)^T \omega (\bar a - a) + \{(\beta^T \otimes I_n)(h + B\bar a) - y\}^T W \{(\beta^T \otimes I_n)(h + B\bar a) - y\}. \tag{79}$$

Differentiating S(ā, β) of (79) with respect to ā and β at the point of their weighted TLS estimates, which are denoted by â and β̂ respectively, and setting the derivatives to zero, we have

$$\frac{\partial S(\bar a, \beta)}{\partial \bar a} = 2\omega(\hat a - a) + 2B^T(\hat\beta \otimes I_n)W\{(\hat\beta^T \otimes I_n)(h + B\hat a) - y\} = 0, \tag{80a}$$

$$\frac{\partial S(\bar a, \beta)}{\partial \beta} = 2\left\{\begin{bmatrix} h_1^T \\ h_2^T \\ \vdots \\ h_m^T \end{bmatrix} + \begin{bmatrix} \hat a^T B_1^T \\ \hat a^T B_2^T \\ \vdots \\ \hat a^T B_m^T \end{bmatrix}\right\} W\{(\hat\beta^T \otimes I_n)(h + B\hat a) - y\} = 0, \tag{80b}$$

where

$$h = \begin{bmatrix} h_1 \\ h_2 \\ \vdots \\ h_m \end{bmatrix}, \qquad B = \begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_m \end{bmatrix}.$$

Here all the hi are n-dimensional vectors, and all the Bi are (n × ta) matrices. We should note, however, that some of the vectors hi may be null vectors and that hi is not necessarily equal to hj (i ≠ j). The same applies to Bi and Bj. The reader should understand that there is no systematic rule for defining these partial vectors and matrices; they depend on which elements of A are random or deterministic. After some rearrangements, Eq. (80a) becomes

$$(\omega + S_\beta^T W S_\beta)\hat a = \omega a - S_\beta^T W \Big(\sum_{i=1}^{t} h_i \hat\beta_i\Big) + S_\beta^T W y,$$

or equivalently,

$$\hat a = (\omega + S_\beta^T W S_\beta)^{-1}\Big\{\omega a - S_\beta^T W \Big(\sum_{i=1}^{t} h_i \hat\beta_i\Big) + S_\beta^T W y\Big\}, \tag{81}$$


where

$$S_\beta = \sum_{i=1}^{t} B_i \hat\beta_i. \tag{82}$$

By using the inverse of a sum of matrices [57] and after some derivations [85], â of (81) can alternatively be written concisely as follows:

$$\hat a = a + \omega^{-1} S_\beta^T M^{-1} (y - A\hat\beta), \tag{83}$$

where

$$M = W^{-1} + S_\beta\, \omega^{-1} S_\beta^T. \tag{84}$$

The first observational equation (78a) of the nonlinear adjustment model (78) is essentially equivalent to

$$y = A(\bar a)\beta + \epsilon, \tag{85}$$

where A(ā) stands for the true values of A; we use the symbol A(ā) to emphasize its dependence on only a finite number of independent unknowns ā. In what follows, we will use Ā to represent A(ā). If we have the estimate â of ā, we can then obtain the corresponding estimate Â of Ā. With this simple replacement, we can rewrite (80b) as follows:

$$\frac{\partial S(\bar a, \beta)}{\partial \beta} = 2\hat A^T W (\hat A \hat\beta - y) = 0,$$

or equivalently,

$$\hat\beta = (\hat A^T W \hat A)^{-1} \hat A^T W y, \tag{86}$$

(see also [85]). The TLS estimate of the variance of unit weight is then given by

$$\hat\sigma^2 = S(\hat a, \hat\beta)/(n - t). \tag{87}$$

Since the reformulated partial EIV model (78) is actually a separable nonlinear LS problem, one can readily use a variant of the Gauss-Newton method to numerically compute all the unknowns of interest. Such a solution method is linearly convergent, according to Dennis and Schnabel [26]. Actually, (78) is conditionally linear with respect to β given ā; more precisely, it is bilinear in (78a) and simply linear in (78b). A well-coded algorithm can further reduce the computational complexity.
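The alternation between (81) and (86), closed by (87), can be sketched as follows for the straight-line fit y = β1 + β2 x with a deterministic column of ones (h1 = 1, B2 = In); this is a hedged illustration on simulated data with unit weights, not the chapter's code:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
beta_true = np.array([2.0, 0.5])
a_bar = rng.uniform(0.0, 10.0, size=n)
a = a_bar + 0.05 * rng.normal(size=n)                   # measured abscissas
y = beta_true[0] + beta_true[1] * a_bar + 0.05 * rng.normal(size=n)

W = np.eye(n)        # weight matrix of y
omega = np.eye(n)    # weight matrix of the t_a = n random elements a
ones = np.ones(n)

beta = np.linalg.lstsq(np.column_stack([ones, a]), y, rcond=None)[0]  # LS start
for _ in range(50):
    S = beta[1] * np.eye(n)                 # S_beta = sum_i B_i beta_i, eq. (82)
    N = omega + S.T @ W @ S
    rhs = omega @ a - S.T @ W @ (beta[0] * ones) + S.T @ W @ y
    a_hat = np.linalg.solve(N, rhs)         # eq. (81)
    A_hat = np.column_stack([ones, a_hat])  # the estimate of A(a_bar)
    beta = np.linalg.solve(A_hat.T @ W @ A_hat, A_hat.T @ W @ y)   # eq. (86)

sigma2 = ((a_hat - a) @ omega @ (a_hat - a)
          + (A_hat @ beta - y) @ W @ (A_hat @ beta - y)) / (n - 2)  # eq. (87)
print(beta, sigma2)
```

Since the two measurement sets contribute 2n observations against n + t unknowns, σ̂² of (87) recovers the simulated variance level here.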


Equations (81) and (83) each have their own computational advantages. In general, we always start with an assumption on the variance-covariance matrices, or at least the cofactor matrices, of both the measurements y and a; as a result, the inverse matrices W^{-1} and ω^{-1} are given. If the number of independent random elements of A is significantly larger than the number of measurements y, then (83) is much more efficient computationally, since we only need to invert a matrix of size (n × n) per iteration. On the other hand, if n is much larger than ta, i.e., there are only a few random elements in the design matrix A, then (81) requires much less computation than (83). In this case, (81) requires the inversion of a matrix of size (ta × ta), while (83) has to compute the inverse of a matrix of size (n × n). In other words, the computational complexity of inverting the normal matrix is roughly proportional to O(n³) arithmetic operations per iteration for the newly derived alternative formula (83) and O(ta³) arithmetic operations for the original formula (81).

4.5 Closed Form Solutions to Coordinate Transformation

In many problems of engineering importance, we often have to determine the relationship between point clouds in different reference frames. Given two point sets with known coordinates xi and yi (i = 1, 2, ..., n), if these points describe the same rigid object under two different coordinate systems S1 and S2, a common problem in geodesy, photogrammetry and computer vision is to determine the relative relationship between these two reference systems S1 and S2 through the following transformation:

$$y_i = t + sRx_i, \quad (i = 1, 2, \ldots, n), \tag{88}$$

where t is the translation vector between the origins of the systems S1 and S2, R is the rotation matrix, and s is the relative scaling parameter if the standards of length used in the two systems are different. If the coordinates xi and yi are contaminated with random errors, (88) is an EIV model, as described in (5) of Example 1. Since the rotation matrix R has to be proper and satisfy the condition of orthogonality, namely, R^T R = I and det(R) = 1, where det(·) stands for the determinant of a square matrix, the model (88) is nonlinear, even if xi are free of random errors and the scaling factor s is given. In practical solutions to (88), the rotation matrix R has often been represented either by using three independent Euler angles or the four quaternion parameters (see e.g., [4, 8, 40, 41, 44, 74, 99, 103, 109]). In geodetic coordinate transformation, the three Euler angles are generally very small and it is often sufficient to first linearize (88) and then apply the weighted LS method to estimate the seven transformation parameters. However, in photogrammetry, attitude determination and computer vision, the three Euler angles can be large and the linearization approach may take many iterations for the solution to converge, or it may not converge at all.


To solve the transformation problem (88) in closed form without any limit on the Euler angles, one often simply applies the ordinary LS method to estimate the seven transformation parameters. At most, a scalar weighting factor may be applied to different sets of xi and yi. In this case, the objective cost function can be written as follows:

$$\min:\ S(t, s, R) = \sum_{i=1}^{n} w_i \|y_i - t - sRx_i\|^2, \tag{89}$$

where wi are positive scalars. We should immediately point out that although wi play the role of weighting, they generally do not reflect the real variance-covariance matrices of all xi and yi, as correctly mentioned by Lu [56]. Assuming t = 0 and s = 1, Thompson [87] used two points to reconstruct the rotation matrix R by analytically solving a linear system of equations in the three Rodrigues parameters. Sanso [74] represented the rotation matrix R in terms of quaternion parameters and went on to construct the closed form solution for the seven transformation parameters with the ordinary LS criterion (89). Under the same assumption as in Thompson [87], Davenport invented the q-method to derive the closed form solution of the quaternion parameters for R, as documented in Keat [44]. Under the same condition as in Thompson [87], but without imposing the proper condition det(R) = 1, Schönemann [80] successfully used the singular value decomposition technique to analytically solve for the orthogonal matrix under the ordinary LS principle. Arun et al. [8] and Horn et al. [41], likely independently, developed the method proposed by Schönemann [80], with an extension to allow the unknown vector t by Arun et al. [8] and the full model (88) by Horn et al. [41], respectively. Arun et al. [8] and Horn et al. [41] reported that the reconstructed orthogonal matrix R is generally proper, satisfying det(R) = 1 (see also [109]). Nevertheless, the methods may end up with a reflection if the data are severely contaminated, as pointed out by Horn et al. [41] and Umeyama [89]. Umeyama [89] actually also proposed a method to ensure a rotation matrix and, as a result, overcame the disadvantages of Arun et al. [8] and Horn et al. [41]. Based on the full transformation model (88) and the criterion (89), Horn [40] used the quaternion concept to find the closed form solution for all the seven transformation parameters.
In the remainder of this subsection, we will briefly outline the closed form solutions of the seven transformation parameters in the model (88) on the basis of Sanso [74], Keat [44] and Horn [40]. To begin with, we compute the derivatives of S(t, s, R) in (89) with respect to t, set them to zero and obtain

$$\sum_{i=1}^{n} w_i (y_i - \hat t - sRx_i) = 0,$$

or equivalently,

$$\hat t = y_c - sRx_c, \tag{90}$$


where xc and yc are the weighted centroids of the points xi and yi (i = 1, 2, ..., n), given by

$$x_c = \sum_{i=1}^{n} w_i x_i \Big/ \sum_{i=1}^{n} w_i, \qquad y_c = \sum_{i=1}^{n} w_i y_i \Big/ \sum_{i=1}^{n} w_i.$$

Substituting t of (89) with t̂ of (90) and after some slight rearrangement, we have

$$\min:\ S(s, R) = \sum_{i=1}^{n} w_i \|y_{ic} - sRx_{ic}\|^2, \tag{91}$$

where yic = yi − yc and xic = xi − xc. We then compute the derivative of S(s, R) with respect to s, set it to zero and obtain

$$\sum_{i=1}^{n} w_i\, x_{ic}^T R^T (y_{ic} - \hat s R x_{ic}) = 0,$$

or equivalently,

$$\hat s = \sum_{i=1}^{n} w_i\, y_{ic}^T R x_{ic} \Big/ \sum_{i=1}^{n} w_i \|x_{ic}\|^2. \tag{92}$$

Instead of using (92) to estimate the scaling factor s, Horn [40] proposed the symmetrical expression (y_{ic}/√s − √s Rx_{ic}) to replace (y_{ic} − sRx_{ic}) of (91). As a result, Horn [40] obtained the following alternative estimator of s:

$$\hat s = \left(\sum_{i=1}^{n} w_i \|y_{ic}\|^2 \Big/ \sum_{i=1}^{n} w_i \|x_{ic}\|^2\right)^{1/2}. \tag{93}$$

Finally, we substitute s of S(s, R) with ŝ of (93) and, after some technical derivations, obtain the following minimization problem:

$$\min:\ S(R) = \sum_{i=1}^{n} w_i \|y_{ic} - \hat s R x_{ic}\|^2 = \sum_{i=1}^{n} w_i \|y_{ic}\|^2 - 2\hat s \sum_{i=1}^{n} w_i\, y_{ic}^T R x_{ic} + \hat s^2 \sum_{i=1}^{n} w_i \|x_{ic}\|^2. \tag{94}$$

It is obvious from (94) that minimizing S(R) of (94) is mathematically equivalent to the following maximization problem:

$$\max:\ S_m(R) = \sum_{i=1}^{n} w_i\, y_{ic}^T R x_{ic} = \mathrm{tr}\{R M_{xy}\}, \tag{95a}$$

where

$$M_{xy} = \sum_{i=1}^{n} w_i\, x_{ic}\, y_{ic}^T. \tag{95b}$$

The maximization problem (95a) can be found in Sanso [74], Keat [44], Horn [40], Arun et al. [8] and Umeyama [89]. Actually, although Sanso [74] first arrived at the expression (92) for the scale s, he treated s in (91), without using (92), as if s were fixed and had nothing to do with R. As a result, he also ended up with the maximization problem (95a). There exist two methods to construct the closed form solutions of the rotation matrix R. One method is to directly find the R that maximizes (95a) (see e.g., [8, 41]), while the other is to construct the closed form solution of R via the intermediate quaternion parameters (see e.g., [40, 44, 74]). To find the R that maximizes (95a), Arun et al. [8] applied the singular value decomposition to M_xy:

$$M_{xy} = U\Sigma V^T, \tag{96}$$

and constructed the orthonormal matrix, denoted by O, as follows:

$$O = VU^T. \tag{97}$$

They then proved that, among all orthonormal matrices, O maximizes the objective function (95a) (see also [41]). If the proper condition det(R) = 1 is satisfied, R = O; otherwise, O is a reflection, which can happen in the case of severely contaminated data [41, 89]. Some more work is then required to ensure a proper rotation [44, 89]. The second method is based on the notion of quaternions. To begin with, one first defines a unit vector n as the rotation axis and a rotation angle θ, and introduces the quaternion vector as follows:


$$q = \begin{bmatrix} q_1 \\ q_2 \\ q_3 \\ q_4 \end{bmatrix} = \begin{bmatrix} \mathbf{n}\sin(\theta/2) \\ \cos(\theta/2) \end{bmatrix}. \tag{98}$$

As a result, the rotation matrix R can be represented with q. Keat [44] rewrote the objective function Sm(R) = tr{RM_xy} of (95a) as the following quadratic form:

$$\max:\ S_m(R) = q^T S_q\, q, \tag{99a}$$

where

$$S_q = \begin{bmatrix} S_{11q} & V_m \\ V_m^T & S_{44q} \end{bmatrix}, \tag{99b}$$

$$S_{11q} = M_{xy} + M_{xy}^T, \tag{99c}$$

$$S_{44q} = \mathrm{tr}(M_{xy}), \tag{99d}$$


and the vector Vm is derived from the following skew-symmetric matrix:

$$S_m = M_{xy} - M_{xy}^T, \tag{99e}$$

namely,

$$V_m = \begin{bmatrix} m_{32} - m_{23} \\ m_{13} - m_{31} \\ m_{21} - m_{12} \end{bmatrix}, \tag{99f}$$

where mij are the elements of M_xy. For more technical details, the reader is referred to Keat [44]; a slightly lengthier derivation can be found in Horn [40]. Since Sq of (99a) is symmetric, it has been proved that the q maximizing Sm(R) is equal to the eigenvector corresponding to the largest positive eigenvalue of Sq (see e.g., [40, 44, 74]). Finally, we may note that the solutions of Sanso [74] and Horn [40] cannot warrant the minimization of (89).
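The SVD route (95)–(97), together with Horn's scale estimate (93) and the translation (90), can be sketched as follows; the test data are simulated, unit weights are used, and the diag(1, 1, det) factor is the Umeyama-style guard against reflections mentioned above:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10
x = rng.normal(size=(n, 3))

# assumed ground truth: rotation about the z-axis, scale and translation
ang = 0.7
R_true = np.array([[np.cos(ang), -np.sin(ang), 0.0],
                   [np.sin(ang),  np.cos(ang), 0.0],
                   [0.0,          0.0,         1.0]])
s_true, t_true = 1.3, np.array([1.0, -2.0, 0.5])
y = t_true + s_true * (x @ R_true.T)        # noise-free model (88)

w = np.ones(n)
xc, yc = x.mean(axis=0), y.mean(axis=0)     # weighted centroids (unit weights)
xic, yic = x - xc, y - yc
s_hat = np.sqrt((w @ (yic**2).sum(axis=1)) / (w @ (xic**2).sum(axis=1)))  # (93)

Mxy = (xic * w[:, None]).T @ yic            # (95b): sum of w_i x_ic y_ic^T
U, _, Vt = np.linalg.svd(Mxy)               # (96)
O = Vt.T @ U.T                              # (97)
D = np.diag([1.0, 1.0, np.sign(np.linalg.det(O))])   # reflection guard
R_hat = Vt.T @ D @ U.T
t_hat = yc - s_hat * R_hat @ xc             # (90)
print(s_hat, t_hat)
```

With noise-free data the estimates reproduce the simulated parameters exactly; with contaminated data the det-correction keeps R̂ a proper rotation.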

5 Error Analysis of Errors-in-Variables Models

We have developed a number of methods to solve structured and unstructured EIV problems. Since both the measurements A and y are contaminated with random errors, the estimates of parameters have to be assigned with statistical measures to evaluate their statistical quality. TLS estimates have been proved to be weakly consistent under regularity conditions, as the number of measurements y tends


to infinity (see e.g., [35, 60, 91]). However, asymptotic statistical results are not very useful in practice (see e.g., [73]). Even if a TLS estimator is known to be asymptotically unbiased, it can still be significantly biased in the case of finite samples, depending on the severity of model nonlinearity. Deming [25] and Gerhold [34] derived the first order approximation of accuracy for the estimated parameters, which was further developed by Xu et al. [107] and Amiri-Simkooei et al. [6]. In this section, we will discuss finite sample statistical aspects of TLS estimates. For more details, the reader is referred to Xu et al. [107]. Since the TLS estimation of parameters in a (complete or partial) EIV model is essentially a problem of nonlinear weighted LS adjustment, we can naturally use nonlinear adjustment theory or nonlinear regression to statistically analyze the nonlinear TLS estimate. More precisely, we will focus on the error evaluation and bias analysis of the nonlinear TLS estimate. Although the nonlinearity measures of a nonlinear functional model would be important to help evaluate the effectiveness of linear approximation, as well documented in the books by Ratkowsky [70] and Seber and Wild [82], we will not touch them here for the following reasons: (i) part of these nonlinearity measures is parameter-dependent, but our parameters of interest will not be reparameterized; and (ii) more importantly, the statistical analysis of the nonlinear TLS estimate already uses complete knowledge of the first- and second-order partial derivatives of the nonlinear functional model, which are the starting point for computing the nonlinearity curvature measures.

5.1 First Order Approximation of Accuracy

Linear approximation has been one of the most popular techniques to solve nonlinear problems. In the case of the partial EIV model (4) or (78), given approximate values a0 for ā and β0 for β, we can then linearize (78a) as

$$y - A_0\beta_0 = A_0\,\Delta\beta + (\beta_0^T \otimes I_n)B\,\Delta a + \epsilon, \tag{100a}$$

where Δβ = β − β0, Δa = ā − a0, and A0 is equal to A. Accordingly, (4b) becomes

$$a - a_0 = \Delta a + \epsilon_a. \tag{100b}$$

In statistical analysis after nonlinear adjustment, a0 and β0 can actually take on the converged TLS solutions of ā and β in association with the partial EIV model (78). In this case, A0 should accordingly be updated with the converged TLS solutions as well. Applying the weighted LS principle to estimate Δβ and Δa in (100), together with the stochastic model for y and A, namely (78d) and (78e), we obtain the linearized normal equations:

$$\begin{bmatrix} N_{11} & N_{12} \\ N_{21} & N_{22} \end{bmatrix}\begin{bmatrix} \Delta\hat\beta \\ \Delta\hat a \end{bmatrix} = \begin{bmatrix} u_{\beta 0} \\ u_{a0} \end{bmatrix}, \tag{101}$$

where N11 = A0^T W A0, N12 = A0^T W S0, N21 = S0^T W A0 = N12^T, N22 = S0^T W S0 + ω, u_{β0} = A0^T W(y − A0 β0), u_{a0} = ω(a − a0) + S0^T W(y − A0 β0), and S0 is equal to S_β of (82) but computed at the point β0. Thus the first order approximation of accuracy for the TLS estimate is simply given by

$$D\begin{bmatrix} \hat\beta \\ \hat a \end{bmatrix} = \begin{bmatrix} N_{11} & N_{12} \\ N_{21} & N_{22} \end{bmatrix}^{-1}\sigma^2, \tag{102}$$

from which we can also write the first order approximation of accuracy for the weighted TLS estimate of β as follows:

$$D(\hat\beta) = (N_{11} - N_{12} N_{22}^{-1} N_{21})^{-1}\sigma^2. \tag{103}$$

The approximate accuracy (103) of the TLS solution of β clearly indicates that the accuracy of the weighted LS solution is overestimated if the random errors of A are not taken into account. When inverting for stress tensors from fault-slip data, Xu [101] found that the accuracy of the stress parameters was overstated by a factor of four to nine if the random errors in forming the design matrix were not taken into account. We should note that, if B = I_{nt}, then (103) is equal to the linear approximation of accuracy given by Deming [25] and Gerhold [34], which can also be found in Amiri-Simkooei et al. [6]. As a special example, if W = I_n, ω = I_{nt}, and B = I_{nt}, then (103) becomes

$$D(\hat\beta) = (N_{11} - N_{12} N_{22}^{-1} N_{21})^{-1}\sigma^2, \tag{104}$$


where N11 = A0^T A0, N12 = A0^T S0, N21 = N12^T, N22 = (β0 β0^T) ⊗ I_n + I_{nt}, and S0 = [β10 I_n, β20 I_n, ..., βt0 I_n]. This first order accuracy (104) is different from that given by Schaffrin [76] and Schaffrin et al. [79], which may indicate that the practice of directly propagating the random errors of y through the nonlinear TLS solution is not acceptable for constructing the first order approximation of the variance-covariance matrix of the TLS estimate of β. A further numerical comparison would probably be helpful.
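The special case (104) can be checked numerically; the following hedged sketch (simulated A0 and arbitrarily chosen β0, not data from the chapter) compares the naive LS covariance N11^{-1}σ², which ignores the random errors of A, with (104):

```python
import numpy as np

rng = np.random.default_rng(5)
n, t = 20, 2
A0 = rng.normal(size=(n, t))
beta0 = np.array([1.0, -0.5])
sigma2 = 0.01

S0 = np.hstack([beta0[i] * np.eye(n) for i in range(t)])   # [beta_10 I_n, ..., beta_t0 I_n]
N11 = A0.T @ A0
N12 = A0.T @ S0
N22 = np.kron(np.outer(beta0, beta0), np.eye(n)) + np.eye(n * t)

D_ls = np.linalg.inv(N11) * sigma2                                        # errors of A ignored
D_eiv = np.linalg.inv(N11 - N12 @ np.linalg.solve(N22, N12.T)) * sigma2   # eq. (104)
print(np.diag(D_eiv) / np.diag(D_ls))
```

Numerically, the ratio of the two covariance matrices comes out as the constant factor 1 + β0^T β0 in this special case, illustrating how ignoring the random errors of A overstates the accuracy of the estimated parameters.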

5.2 Nonlinear Confidence Region

Let a linear model be described by

$$y = F\beta + \epsilon, \tag{105a}$$

$$D(\epsilon) = W^{-1}\sigma^2, \tag{105b}$$

where, except for F, all the other notations have been defined as in (1). In (105a), F is an (n × t) non-random coefficient matrix of full column rank. It is well known that, if ε is assumed to be normally distributed with zero mean, the confidence region for the unknown parameters β is given by

$$S(\beta) - S(\hat\beta) = (\beta - \hat\beta)^T (F^T W F)(\beta - \hat\beta) \le t\,\hat\sigma^2 F(t, n - t, \alpha), \tag{106}$$

(see e.g., [81, 82]), under the weighted LS criterion:

$$\min:\ S(\beta) = (y - F\beta)^T W (y - F\beta), \tag{107}$$

where β̂ is the weighted LS estimate of β, α is a significance level, σ̂² is the estimate of σ² and generally given by S(β̂)/(n − t), and F(t, n − t, α) is the critical value of the F-distribution. In the case of nonlinear regression, Fβ of (105) has to be replaced with a nonlinear vector function of β, say f(β), which is also called the expectation function. In other words, a nonlinear adjustment model can be written as follows:


$$y = f(\beta) + \epsilon, \tag{108}$$

where f(·) is a given vector of nonlinear functions. Beale [10] suggested an approximate confidence region for β, which is given by

$$S(\beta) - S(\hat\beta) \le S(\hat\beta)\,\frac{t}{n - t}\, F(t, n - t, \alpha), \tag{109}$$

(see also [82]), where S(β) is defined by S(β) = {y − f(β)}^T W {y − f(β)}. The confidence region (109) is exact in the case of linear models. It is asymptotically correct under regularity conditions if the mathematical model (108) is nonlinear and if the number of measurements y tends to infinity. In general, (109) works well if the nonlinearity is only moderate and/or if the variances of y are small [10]. However, if the nonlinearity is sufficiently severe, one will have to slightly modify the right-hand side of (109). We note that the approximate confidence region (109) is no longer of an ellipsoidal shape, and the extent of the deviation depends on the nonlinearity of f(β). For more details, the reader is referred to Beale [10] (see also [82]). For the nonlinear adjustment model (78) reformulated from a partial EIV model, if both ε and ε_a are normally distributed with zero means, then the approximate confidence region (109) can be rewritten as follows:

$$S(\bar a, \beta) - S(\hat a, \hat\beta) \le (t + t_a)\,\hat\sigma^2 F(t + t_a, n - t, \alpha), \tag{110}$$

where S(a, β) has been defined by (79). Very often, we may only be concerned with the confidence region of the unknown model parameters β. In this case, a has to be replaced by its corresponding relationship (81) in terms of β. As a result, we have the subset nonlinear confidence region for β as follows:

Sa(β) − Sa(β̂) ≤ t σ̂² F(t, n − t, α),   (111)

where Sa(β) is the same as in (79), except that the relationship (81) is used to replace a in terms of β. The linearized approximation of the confidence region for the subset β can be more properly given by

(β − β̂)^T (N11 − N12 N22⁻¹ N21)(β − β̂) ≤ t σ̂² F(t, n − t, α).   (112)
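The exactness of (109) for linear models is easy to verify numerically: for a linear model the increase of the LS objective, S(β) − S(β̂), equals the quadratic form of (106) identically. The following sketch checks this identity with made-up data and weights (all numbers are illustrative, not taken from the chapter):

```python
import numpy as np

rng = np.random.default_rng(42)
n, t = 20, 2
x = np.linspace(2.0, 10.0, n)
F = np.column_stack([np.ones(n), x])      # design matrix of full column rank
W = np.diag(rng.uniform(0.5, 2.0, n))     # positive-definite weight matrix
y = F @ np.array([-1.0, -2.0]) + rng.normal(0.0, 0.3, n)

def S(b):
    # weighted LS objective (107)
    r = y - F @ b
    return r @ W @ r

N = F.T @ W @ F
b_hat = np.linalg.solve(N, F.T @ W @ y)   # weighted LS estimate

b = b_hat + np.array([0.3, -0.1])         # an arbitrary trial point
lhs = S(b) - S(b_hat)                     # objective increase, as in (109)
rhs = (b - b_hat) @ N @ (b - b_hat)       # quadratic form of (106)
# lhs and rhs agree up to rounding, so (106) and (109) describe the
# same region in the linear case
```

In the nonlinear case the same comparison exposes the deviation of the region (109) from an ellipsoid.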

9 Parameter Estimation, Variance Components and Statistical. . .

5.3 Bias of the Nonlinear TLS Estimate

For the nonlinear adjustment model (108), it is well known that the weighted LS estimate of β is biased (see e.g., [14, 70, 73, 82]). By further assuming that ε is normally distributed with zero mean and variance W⁻¹σ², Box [14] derived the bias of the weighted LS estimate β̂, denoted by bβ and simply given as follows:

bβ = −(σ²/2) (F^T W F)⁻¹ F^T W mβ,   (113a)

where F is the first partial derivative matrix of f(β) with respect to β, namely,

F = ∂f(β)/∂β^T,   (113b)


     ⎡ tr{(F^T W F)⁻¹ H1} ⎤
mβ = ⎢ tr{(F^T W F)⁻¹ H2} ⎥ ,   (113c)
     ⎢          ⋮          ⎥
     ⎣ tr{(F^T W F)⁻¹ Hn}  ⎦

Hi = ∂²fi(β)/(∂β ∂β^T), (i = 1, 2, …, n),   (113d)

and fi(β) is the ith component of f(β). Since the partial EIV problem (78) is nonlinear with respect to β and a, their weighted TLS estimates are biased according to Box [14] (see also [70, 73, 82]). In order to apply the bias result of Box [14] to the nonlinear partial EIV model (78), all we need to do is to compute the first and second partial derivatives of (113) that correspond to the nonlinear adjustment model (78). Because our partial EIV model (78) is conditionally linear, we will now work out the bias formulae for the weighted TLS estimates of β and a. To start with, let us assume normal distributions for both ε and εa, namely, ε ∼ N(0, W⁻¹σ²) and εa ∼ N(0, ω⁻¹σ²), and assume statistical independence between ε and εa, namely, cov(ε, εa) = 0. By following Box [14], the biases of the weighted TLS estimates of β and a, denoted respectively by bβ and ba, are simply listed as follows:

⎡ bβ ⎤
⎢    ⎥ = −(σ²/2) (F^T Wya F)⁻¹ F^T Wya m,   (114a)
⎣ ba ⎦

where the first partial derivatives F of the nonlinear expectation functions with respect to β and a can be readily obtained from (100) as follows:


F = ⎡ Â   (β̂^T ⊗ In)B ⎤
    ⎣ 0        Ita      ⎦ .   (114b)

Here Â is equal to A, except that the TLS estimate of a is used to replace the values of a. Wya and m are given by

Wya = ⎡ W  0 ⎤
      ⎣ 0  ω ⎦ ,

    ⎡ tr{(F^T Wya F)⁻¹ H1}     ⎤
m = ⎢ tr{(F^T Wya F)⁻¹ H2}     ⎥ ,   (114c)
    ⎢           ⋮               ⎥
    ⎣ tr{(F^T Wya F)⁻¹ Hn+ta}  ⎦

where Hi is the matrix of second partial derivatives of the ith nonlinear expectation function with respect to β and a, and ta is the total number of elements of a. In principle, F and all Hi should be computed at the true values of β and a; in practice, however, the weighted TLS solution is used to compute F and Hi. In a similar manner to (101), we have, with Sβ = (β̂^T ⊗ In)B,

F^T Wya F = ⎡ Â^T W Â    Â^T W Sβ       ⎤
            ⎣ Sβ^T W Â   Sβ^T W Sβ + ω  ⎦ ,

whose inverse is symbolically denoted as follows:

QF = (F^T Wya F)⁻¹ = ⎡ Qβ   Qβa ⎤
                     ⎣ Qaβ  Qa  ⎦ .   (115)

In the case of the nonlinear adjustment problem (78), we have the second partial derivatives Hi for each element of ȳ, which are given by

Hi = ∂²ȳi / (∂(β^T, ā^T)^T ∂(β^T, ā^T)) = ⎡ 0                 (It ⊗ ei)B ⎤
                                          ⎣ B^T(It ⊗ ei^T)   0           ⎦   (116)

for i = 1, 2, …, n. Here ei = (0, …, 0, 1, 0, …, 0) is an n-dimensional natural row vector, i.e., all the elements of ei are equal to zero, except for the ith element, which is equal to unity. On the other hand, since none of the expectation functions of ā in (78b) are functions of β, the corresponding second partial derivatives are all equal to zero, namely,

Hn+i = ∂²āi / (∂(β^T, ā^T)^T ∂(β^T, ā^T)) = 0   (117)

for i = 1, 2, …, ta. By splitting the vector m of (114c) into two parts, denoted by m1 and m2 and corresponding, respectively, to Hi of (116) and Hn+i of (117), we immediately have, as a result of (117),

m2 = 0.   (118)

For each element of m1, we first compute the product of QF and Hi of (116), then apply the trace operation to it and finally obtain:

m1i = tr{Qβa B^T(It ⊗ ei^T) + Qaβ (It ⊗ ei)B},   (119)

where m1i is the ith element of m1. Substituting (114b), (115), (118) and (119) into (114a), we finally obtain the bias vector of the weighted TLS estimates of β and a as follows:

⎡ bβ ⎤            ⎡ Qβ Â^T + Qβa Sβ^T  ⎤
⎢    ⎥ = −(σ²/2)  ⎢                    ⎥ W m1.   (120)
⎣ ba ⎦            ⎣ Qaβ Â^T + Qa Sβ^T  ⎦

It is obvious from (120) that: (i) the biases of the weighted TLS estimates are proportional to the noise level σ 2 ; and (ii) with the increase of QF , the biases may become larger. By removing the biases (120) from the weighted TLS estimates of β and a, we can obtain their almost unbiased weighted TLS solutions up to the second order approximation. In fact, this approach of constructing second order unbiased estimators has been successfully used to design almost unbiased filters [99, 100].
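Box's formulae (113a)–(113d) are straightforward to evaluate numerically. The sketch below does so for a hypothetical one-parameter model fi(b) = exp(b·xi) with W = I (an illustration, not a model from this chapter) and checks the first property noted above, i.e., that the bias is exactly proportional to the noise level σ²:

```python
import numpy as np

x = np.linspace(0.1, 1.0, 10)
W = np.eye(x.size)

def box_bias(b, sigma2):
    # f_i(b) = exp(b * x_i), a hypothetical scalar-parameter model
    Fm = (x * np.exp(b * x))[:, None]      # (113b): first derivatives, n x 1
    Ninv = np.linalg.inv(Fm.T @ W @ Fm)    # (F^T W F)^{-1}, here 1 x 1
    H = x**2 * np.exp(b * x)               # (113d): second derivatives (scalars)
    m = Ninv[0, 0] * H                     # (113c): tr{(F^T W F)^{-1} H_i}
    # (113a): b_beta = -(sigma^2 / 2) (F^T W F)^{-1} F^T W m
    return (-0.5 * sigma2 * (Ninv @ Fm.T @ W @ m)).item()

b1 = box_bias(0.5, 0.01)
b2 = box_bias(0.5, 0.02)   # doubling sigma^2 exactly doubles the bias
```

For this convex model all derivative terms are positive, so the bias comes out negative; removing it yields the second order bias-corrected estimate discussed above.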

5.4 Examples

The example is directly taken from Xu et al. [107]. Its main purpose is to demonstrate the TLS algorithm and its statistical analysis for partial EIV models. The simulated example assumes that we measure a number of points (xi, yi) and want to fit them to a straight line y = β1 + xβ2. We simulate two values for β1 and β2 and 20 points with values of x between 2 and 10, which are used to compute the true values of y. A noise level of σ² = 0.09 is used to generate 20 random xi values and 20 random yi values. The generated xi and yi are statistically independent and normally distributed. Both the weight matrix W for y and the weight matrix ω for x (= a) are diagonal, with different values for each yi and each xi. To apply the developed TLS algorithms of Sect. 4.3 for partial EIV models to the simulated problem, we set a maximum tolerated error of 1.0 × 10⁻⁸ for both


Table 1 Parameter estimates and their standard deviations. TrueV – the true values of β1 and β2; TLS_Est – the TLS estimates of β1 and β2; TLS_STD – the standard deviations of the TLS estimates computed from the first order approximation of the variance-covariance matrix; BCor_TLS – the bias-corrected TLS estimates of β1 and β2; TLS_Bias – the biases of the TLS estimates; WLS_Est – the weighted LS estimates of β1 and β2 by ignoring the randomness of the design matrix; WLS_STD – the standard deviations of the weighted LS estimates (after Xu et al. [107])

Parameters   TrueV     TLS_Est   TLS_STD   BCor_TLS   TLS_Bias   WLS_Est   WLS_STD
β1           −1.3573   −0.7133   0.8803    −0.7571    0.0438     −2.7006   0.3491
β2           −2.0452   −2.1685   0.1481    −2.1608    −0.0077    −1.7788   0.0542

β1 and β2 to terminate the computation. The TLS estimates of β1 and β2 are listed in Table 1, together with their first order approximation of standard deviations and the true values of β1 and β2. For comparison purposes, we compute the (ordinary) weighted LS estimates of β1 and β2 and their standard deviations by simply ignoring the randomness of xi and show the results in the same table. It is clear from Table 1 that the differences of the TLS estimates of β1 and β2 from their true values can be well explained statistically by their first order approximation of standard deviations. However, the weighted LS estimates are obviously much worse than the TLS estimates. In particular, by comparing column TLS_STD with column WLS_STD of Table 1, we can see that the claimed accuracy of the weighted LS estimates is too optimistic by a factor of about two in this example, as theoretically expected in Sect. 5.1. As part of the statistical analysis, we estimate the biases of the TLS estimates of β1 and β2 and show them in column TLS_Bias of Table 1. The biases of the TLS-estimated parameters are about −6.14 and 0.35 per cent of the corresponding estimates, respectively. If the true values of β1 and β2 are used, then the percentages reduce to −3.23 and 0.38, respectively. Nevertheless, the maximum (absolute) relative error for β1 is slightly larger than the empirical threshold of one per cent, which usually serves as an indicator to declare that a problem under discussion is nonlinear [70]. From this point of view, although the two small relative errors of the biases imply that the simulated EIV example is not severely nonlinear, the effect of nonlinearity still has to be taken into account. Thus we compute the bias-corrected TLS estimates and list them in column BCor_TLS. Indeed, they are slightly closer to the true values of β1 and β2 than the TLS estimates in this example, which is consistent with the theoretical result of unbiasedness up to the second order approximation.
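The harm of simply ignoring the randomness of xi can also be seen in a few lines of simulation. The sketch below uses its own illustrative settings (in particular a larger x-noise variance than the chapter's σ² = 0.09, so that the effect is visible at a glance): the ordinary LS slope is attenuated toward zero, while a simple method-of-moments correction with the known x-noise variance essentially recovers the true slope:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20000
beta1, beta2 = -1.3573, -2.0452            # true line parameters
x_true = rng.uniform(2.0, 10.0, n)
sig2_x, sig2_y = 1.0, 0.09                 # illustrative noise variances
x = x_true + rng.normal(0.0, np.sqrt(sig2_x), n)   # x measured with error
y = beta1 + beta2 * x_true + rng.normal(0.0, np.sqrt(sig2_y), n)

s_xy = np.cov(x, y)[0, 1]
s_xx = np.var(x, ddof=1)
slope_ols = s_xy / s_xx                    # LS slope, attenuated toward zero
slope_cor = s_xy / (s_xx - sig2_x)         # method-of-moments EIV correction
# |slope_ols| < |beta2|, while slope_cor is much closer to beta2
```

This simple correction only treats the slope bias; the full weighted TLS solution of Sect. 4.3 additionally handles heteroscedastic weights and delivers the variance-covariance matrix used in Table 1.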
It is interesting to note that although the model is linear with respect to β1 , the bias results clearly indicate that the effect of nonlinearity on this linear parameter cannot be neglected. In the case of estimating the variance of unit weight, both the TLS and bias-corrected TLS methods produce almost the same values of 0.1267 and 0.1268 for the variance of unit weight, respectively. By using the first order approximation of variance-covariance matrices, we plot the confidence ellipses for the weighted LS estimate and the TLS estimate in Fig. 2, given two significance levels of 0.05 and 0.01. With the same significance levels,


Fig. 2 The true values and the estimates of β1 and β2, and the confidence regions of the estimates: the true values – the blue circle; the weighted LS estimate by ignoring the randomness of xi – the black mark +; the TLS estimate – the red diamond mark; the bias-corrected TLS estimate – the red star mark; the black ellipses – the confidence regions of the weighted LS estimate, with the inner dashed ellipse and the outer solid ellipse corresponding to the significance levels of 0.05 and 0.01, respectively; the red ellipses – the confidence regions of the TLS estimate computed with the first order approximation of its variance-covariance matrix, with the inner red dashed ellipse and the outer red solid ellipse corresponding to the significance levels of 0.05 and 0.01, respectively; the green ellipse-like curves – the confidence regions computed by using the nonlinear confidence region formula (111), with the inner and outer green solid curves corresponding to the significance levels of 0.05 and 0.01, respectively (after Xu et al. [107])

we show the nonlinear confidence regions (111) in Fig. 2. Also shown in Fig. 2 are the true values of β1 and β2 (the blue circle), the weighted LS estimate (the black mark +), the bias-corrected TLS estimate (the red star mark) and the TLS estimate (the diamond mark). It is obvious that the true values of β1 and β2 are well covered by the confidence regions, either computed with the first order approximation of the variance-covariance matrix of the TLS estimate or with the nonlinear confidence region (111). Nevertheless, the effect of nonlinearity on the confidence regions can still be clearly seen from the red (dashed or solid) and green lines of Fig. 2. The shapes of the green lines are not significantly different from that of an ellipse, indicating and confirming that the nonlinearity of the example is only marginal. It is also clear from Fig. 2 that the true values of β1 and β2 are not covered by any of the confidence ellipses for the weighted LS estimate, which confirms our conclusion in Sect. 5.1 that the accuracy of the weighted LS estimate by ignoring the randomness


of xi is too optimistic. In particular, the weighted LS estimate (the black mark +) is surely outside both the red dashed line and the inner green line. It is also outside the outer green line, though inside the solid red line. Since the nonlinear confidence region (111) is statistically more appropriate than the first order approximation (see, e.g., [10, 70, 82]), we may conclude that the weighted LS estimates of β1 and β2 in this example are statistically incorrect, at least at the significance level of 0.01.
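Testing whether a point falls inside a linearized confidence ellipse of the type (112) takes only a few lines; the quadratic-form matrix, variance factor and F critical value below are hypothetical stand-ins, not the values behind Fig. 2:

```python
import numpy as np

b_hat = np.array([-0.7133, -2.1685])           # center of the ellipse
N = np.array([[4.0, -1.2], [-1.2, 9.0]])       # hypothetical (F^T W F)-type matrix
sigma2_hat, t, f_crit = 0.12, 2, 3.55          # f_crit ~ F(2, 18, 0.05), tabulated

def inside(b):
    # (112)-style test: (b - b_hat)^T N (b - b_hat) <= t * sigma2_hat * F
    d = b - b_hat
    return d @ N @ d <= t * sigma2_hat * f_crit

# the estimate itself is always inside; a distant point falls outside
center_ok = inside(b_hat)
far_out = inside(b_hat + np.array([5.0, 5.0]))
```

Replacing N σ̂² here by the (too small) WLS variance-covariance matrix shrinks the ellipse, which is exactly how the true parameter values come to fall outside all the black ellipses of Fig. 2.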

6 Variance Components in Errors-in-Variables Models

In the previous sections, we have assumed that the stochastic models for both y and A in the EIV model (3) and/or (4) are known, at least up to an unknown scalar variance of unit weight. In physical, statistical and engineering practice, we often encounter the following problems: (i) y and A in EIV models may contain some unknown parameters in their stochastic models; and (ii) the elements of y and/or A are not necessarily of the same types of measurements either. Even if we assume that y (and/or A) is of the same type of measurements, they may be measured with instruments of different accuracy. For problems of this kind, the corresponding stochastic model for the EIV model (3) should be more appropriately described as follows:

Σy = ∑_{i=1}^{my} Uiy σiy²,   (121a)

Σa = ∑_{i=1}^{ma} Uia σia²,   (121b)

where Σy and Σa are the variance-covariance matrices of y and vec(A), respectively. All the Uiy and Uia are known and positive semi-definite. σiy² (i = 1, 2, …, my) and σia² (i = 1, 2, …, ma) are called the (unknown) variance components of y and A, respectively. Although the general EIV model (2) with the stochastic model (121) should be more appropriate to describe a wide range of real-life EIV problems, little has ever been done to simultaneously address the model parameters and the variance components. Limited related work has been done to estimate both parameters and variance components in spatial and generalized mixed effects models with measurement errors by using the maximum likelihood and quasi-likelihood methods [52, 93] and the restricted maximum likelihood and pseudo-likelihood methods [92]. Although Fuller [32] applied the maximum likelihood method to estimate the variance components of y and A in the EIV model (2), the results gave the impression that the variance components in stochastic models are estimable. In this section, we will focus on the variance components of ε and εA in the stochastic model (121). The stochastic model (121) is different from any


conventional variance component models in the sense that there exist no direct redundant observations of A for an independent assessment of ΣA. More precisely, starting with quadratic forms of the corrections (or, equivalently, the residuals) of the measured data y and A to estimate the variance components, we will follow Xu and Liu [104, 108] to show that the variance components of ε and εA are actually not estimable. In order to investigate the variance component estimation of the EIV model (2), we follow Xu et al. [107] to equivalently reformulate the EIV models (2) and (121) as a nonlinear Gauss-Markoff model with variance components:

y = Āβ + ε,   (122a)

A = Ā + EA,   (122b)

or in a compact form:

⎡ y ⎤   ⎡ (β^T ⊗ In)ā ⎤   ⎡ ε  ⎤
⎢   ⎥ = ⎢             ⎥ + ⎢    ⎥ .   (123)
⎣ a ⎦   ⎣ ā           ⎦   ⎣ εa ⎦

The corresponding stochastic model for ε and εa is the same as formulated in (121) and can be equivalently represented in one matrix for both the measurements y and A as follows:

Σ = ⎡ Σy  0  ⎤
    ⎣ 0   Σa ⎦

  = ⎡ ∑_{i=1}^{my} Uiy σiy²    0                      ⎤
    ⎣ 0                        ∑_{i=1}^{ma} Uia σia²  ⎦

  = ∑_{i=1}^{my} ⎡ Uiy  0 ⎤ σiy² + ∑_{i=1}^{ma} ⎡ 0  0   ⎤ σia²
                 ⎣ 0    0 ⎦                     ⎣ 0  Uia ⎦

  = ∑_{i=1}^{my} Ui σiy² + ∑_{i=1}^{ma} U(my+i) σia²

  = ∑_{i=1}^{my+ma} Ui σi²,   (124)

where the matrices Ui corresponding to σiy² and σia² are given, respectively, as follows:


Ui = ⎡ Uiy  0 ⎤
     ⎣ 0    0 ⎦ ,

for i = 1, 2, …, my, and

Ui = ⎡ 0  0   ⎤
     ⎣ 0  Uia ⎦ ,

for i = (1 + my), (2 + my), …, (ma + my). We assume that both Σy and Σa are invertible. Nevertheless, this assumption is not really necessary. For example, if Σa is singular, one can follow Xu et al. [107] to reformulate the original EIV model into a partial EIV model, where the new variance-covariance matrix is restricted to the independent random elements of A and becomes invertible. With the reformulation of (123) and the stochastic model (124), it has become clear that the EIV model (2) can be treated mathematically as a nonlinear Gauss-Markoff model with unknown variance components; obviously, this is the most natural way to handle the estimation of variance components in EIV models, since conventional nonlinear estimation theory and methods can be adapted naturally. In other words, one can now freely use any numerical methods to solve for the parameter estimate of the nonlinear model and freely adopt any appropriate methods to estimate the variance components. For brevity of notation, we collect the variance components of y in (121a) and those of a (or A) in (121b) into two vectors σy and σa, respectively. A procedure of variance component estimation almost always consists of two iteratively looped steps: one to compute the parameters and the other to compute the variance components. Given a set of initial values σ0iy² (i = 1, 2, …, my) and σ0ia² (i = 1, 2, …, ma) for the unknown variance components, if we apply the MINQUE method to the EIV models (123) and (124), we obtain the following iterative formulae:

β̂ = (Â^T Σ0y⁻¹ Â)⁻¹ Â^T Σ0y⁻¹ y,   (125a)

â = a + Σ0a (β̂ ⊗ In) E⁻¹ (y − Aβ̂),   (125b)

to compute the weighted TLS estimates of the parameters and the corrections to the measurements A, and

⎡ Sy   Sya ⎤ ⎡ σ̂y ⎤   ⎡ qy ⎤
⎢           ⎥ ⎢    ⎥ = ⎢    ⎥ ,   (126)
⎣ Say  Sa  ⎦ ⎣ σ̂a ⎦   ⎣ qa ⎦


to compute the MINQUE estimates of σy and σa, where

qyi = ry^T Σ0y⁻¹ Uiy Σ0y⁻¹ ry,   (127a)

for i = 1, 2, …, my, and

qai = ra^T Σ0a⁻¹ Uia Σ0a⁻¹ ra,   (127b)

for i = 1, 2, …, ma. Here ry and ra are the correction vectors to the measurements y and A, respectively. Note that we also used the notation r to denote the corrections to y and eA for the corrections to A, namely, ry = r and ra = eA. The elements of Sy, Sa and Sya are computed, respectively, by the following equations:

sy^ij = tr{P Ui P Uj} = tr{Py Uiy Py Ujy},   (127c)

for i, j = 1, 2, …, my,

sa^ij = tr{P Ui P Uj} = tr{Pa Uia Pa Uja},   (127d)

for i, j = 1, 2, …, ma, and

sya^ij = tr{P Ui P Uj} = tr{Pay Uiy Pya Uja},   (127e)

for i = 1, 2, …, my and j = 1, 2, …, ma. The matrices Py, Pa and Pya are respectively equal to

Py = Σ0y⁻¹ − Σ0y⁻¹ Hy Σ0y⁻¹,

Pa = Σ0a⁻¹ − Σ0a⁻¹ Ha Σ0a⁻¹,

and

Pya = −Σ0y⁻¹ Hya Σ0a⁻¹.

The matrices Hay, Ha and Hy are given directly by the corresponding block sub-matrices of:


H = ⎡ Â   β̂^T ⊗ In ⎤ ⎡ Qβ   Qβa ⎤ ⎡ Â^T      0  ⎤   ⎡ Hy   Hya ⎤
    ⎣ 0    Ia       ⎦ ⎣ Qaβ  Qa  ⎦ ⎣ β̂ ⊗ In  Ia ⎦ = ⎣ Hay  Ha  ⎦ .   (127f)
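The block decomposition (124) can be verified mechanically: padding each Uiy and Uia with zero blocks and summing must reproduce the block-diagonal matrix built directly from Σy and Σa. A small sketch with hypothetical dimensions and component values:

```python
import numpy as np

n, na = 3, 4                                   # sizes of y and a (illustrative)
Uy = [np.eye(n), np.diag([1.0, 0.0, 2.0])]     # known cofactor matrices of y
Ua = [np.eye(na)]                              # known cofactor matrix of a
s2y, s2a = [0.3, 0.7], [1.5]                   # variance components (illustrative)

# direct block-diagonal construction of Sigma
Sigma_y = sum(U * s for U, s in zip(Uy, s2y))
Sigma_a = sum(U * s for U, s in zip(Ua, s2a))
direct = np.block([[Sigma_y, np.zeros((n, na))],
                   [np.zeros((na, n)), Sigma_a]])

# padded U_i matrices, as in (124)
def pad_y(U):
    return np.block([[U, np.zeros((n, na))],
                     [np.zeros((na, n)), np.zeros((na, na))]])

def pad_a(U):
    return np.block([[np.zeros((n, n)), np.zeros((n, na))],
                     [np.zeros((na, n)), U]])

assembled = sum(pad_y(U) * s for U, s in zip(Uy, s2y)) \
          + sum(pad_a(U) * s for U, s in zip(Ua, s2a))
# assembled coincides with direct, confirming (124)
```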

For more technical details on the derivations of (126) and (127), the reader is referred to Xu and Liu [104, 108]. As is well known, given a (deterministic) linear model with full rank, if the stochastic model of the measurements is of a block-diagonal structure with a sufficient number of measurements, the variance components in the stochastic model are estimable (see e.g., [42, 68]). For simplicity of discussion, let us assume that the stochastic model (121b) for the random matrix A can be rewritten as follows:

Σa = ∑_{i=1}^{m} Uia σia² = diag(σia²) ⊗ In,   (128)

where Uia is a block-diagonal matrix with its ith block equal to the identity matrix In, and diag(σia²) is a diagonal matrix with its ith diagonal element equal to σia². Since the stochastic model (128) is clearly of a block-diagonal structure, one may conjecture that the variance components σa are estimable under the condition that σy are estimable. In what follows, we will strictly prove that σa, even in the simplest stochastic model (128), are not estimable and cannot be separated from σy. Actually, given the (iterative) initial values σ0iy² (i = 1, 2, …, my) and σ0ia² (i = 1, 2, …, ma), based on the stochastic model (121a) for y and (128) for A and following the line of thought of Xu and Liu [104, 108], we can directly rewrite the corrections ria (= eAi) to A in (72) and ry to y in (71c) as follows:

ry = (A + E)β̂ − y,   (129a)

ria = −σ0ia² β̂i Σ0y⁻¹ ry, (i = 1, 2, …, t).   (129b)

If the measurements y are further assumed to be of the simplest stochastic model, namely Σy = I σy², then (129b) can be simplified to:

ria = −(σ0ia²/σ0y²) β̂i ry, (i = 1, 2, …, t).   (129c)

From (129a) and (129c), we can readily obtain the following relationship of proportionality:

ry : r1a : r2a : … : rta = (−1) : (σ01a² β̂1 / σ0y²) : (σ02a² β̂2 / σ0y²) : … : (σ0ta² β̂t / σ0y²)

= (−σ0y²) : (β̂1 σ01a²) : (β̂2 σ02a²) : … : (β̂t σ0ta²).   (130)

Following the statistical reasoning of Xu and Liu [104, 108] and considering only the terms up to the second order approximation of the random errors, we immediately conclude that: (i) the variance components σa are not estimable; and (ii) the variance components σa cannot be separated from the variance of unit weight σy². It is also easy to prove that this conclusion extends to the most general case of variance components σy. For more details on the inestimability analysis of variance components in EIV models, the reader is referred to Xu and Liu [104, 108].
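The proportionality (130) can be illustrated numerically: by (129c), rescaling all initial variance components by one common factor leaves every correction unchanged, so the residuals can only pin down ratios of the components, never the components themselves. A minimal sketch with illustrative numbers:

```python
import numpy as np

rng = np.random.default_rng(0)
ry = rng.normal(size=20)                   # a stand-in residual vector r_y
beta_hat = np.array([-0.7133, -2.1685])    # illustrative parameter estimates

def corrections(sig2_0y, sig2_0a):
    # (129c): r_ia = -(sigma_0ia^2 / sigma_0y^2) * beta_i * r_y
    return [-(s / sig2_0y) * b * ry for s, b in zip(sig2_0a, beta_hat)]

r_ref = corrections(0.09, [0.04, 0.25])
# rescale every component by the same factor c = 7: corrections unchanged,
# so sigma_a cannot be separated from sigma_y^2
r_scaled = corrections(0.09 * 7.0, [0.04 * 7.0, 0.25 * 7.0])
```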

Literature 1. Adcock, R.J.: Note on the method of least squares. Analyst 4, 183–184 (1877) 2. Adcock, R.J.: A problem in least squares. Analyst 5, 53–54 (1878) 3. Akyilmaz, O.: Total least squares solution of coordinate transformation. Surv. Rev. 39, 68–80 (2007) 4. Altmann, S.L.: Rotations, Quaternions, and Double Groups. Clarendon Press, Oxford (1982) 5. Amiri-Simkooei, A., Jazaeri, S.: Data-snooping procedure applied to errors-in-variables models. Stud. Geophys. Geod. 57, 426–441 (2013) 6. Amiri-Simkooei, A.R., Zangeneh-Nejad, F., Asgari, J.: On the covariance matrix of weighted total least-squares estimates. J. Surv. Eng. 142, 04015014 (2016) 7. Aoki, M., Yue, P.C.: On a priori error estimates of some identification methods. IEEE Trans. Auto. Contr. AC-15, 541–548 (1970) 8. Arun, K.S., Huang, T.S., BLOSTEIN, S.D.: Least-squares fitting of two 3-D point sets. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-9, 698–700 (1987) 9. Bab-Hadiashar, A., Suter, D.: Robust total least squares based optic flow computation. Int. J. Comput. Vis. 29, 566–573 (1998) 10. Beale, E.M.L.: Confidence regions in non-linear estimation (with discussions). J. R. Stat. Soc. B22, 41–88 (1960) 11. Ben-Israel, A., Greville, T.: Generalized Inverses: Theory and Applications. Springer, Berlin (2000) 12. Boor, B.D.: Structured total least squares and L2 approximation problems. Lin. Algebra. Appl. 188, 163–205 (1993) 13. Bormann, P., Wylegalla, K.: Untersuchung der Korrelationsbeziehungen zwischen verschiedenen Arten der Magnitudenbestimmung der Station Moxa in Abhängigkeit vom Gerätetyp und vom Herdgebiet. Publications of Institute of Geophysics, Pol. Acad. Sci. 93, 159–175 (1975) 14. Box, M.J.: Bias in nonlinear estimation (with discussions). J. R. Stat. Soc. B33, 171–201 (1971) 15. Branham R.L. Jr.: Astronomical data reduction with total least squares. New Astron. Rev. 45, 649–661 (2001) 16. 
Cai, J.Q., Grafarend, E.: Systematical analysis of the transformation between Gauss-Kruegercoordinate/DHDN and UTM-coordinate/ETRS89 in Baden-Württemberg with different estimation methods. In: Drewes, H. (ed.) Geodetic Reference Frames, International Association of Geodesy Symposia 134, pp. 205–211. Springer, Berlin (2009)


17. Carroll, R.J., Ruppert, D., Stefanski, L.A., Crainiceanu, C.M.: Measurement Error in Nonlinear Models – a Modern Perspective, 2nd edn. Chapman and Hall, London (2006) 18. Castellaro, S., Mulargia, F., Kagan, Y.Y.: Seismology regression problems for magnitudes. Geophys. J. Int. 165, 913–930 (2006) 19. Chang, G.B., Xu, T.H., Wang, Q.X.: M-estimator for the 3D symmetric Helmert coordinate transformation. J. Geod. 92, 47–58 (2018) 20. Cheng, C., van Ness, J.W.: Generalized M-estimators for errors-in-variables regression. Ann. Stat. 20, 385–397 (1992) 21. Coolidge, J.L.: Two geometrical applications of the method of least squares. Am. Math. Mon. 20, 187–190 (1913) 22. Davies, R.B., Hutton, B.: The effect of errors in the independent variables in linear regression. Biometrika 62, 383–391 (1975) 23. Deming, W.E.: The application of least squares. Philos. Mag. 11, 146–158 (1931) 24. Deming, W.E.: On the application of least squares — II. Philos. Mag. 17, 804–829 (1934) 25. Deming, W.E.: Statistical Adjustment of Data. Dover Publications, New York (1964) 26. Dennis, Jr., Schnabel, R.B.: Numerical Methods for Unconstrained Optimization and Nonlinear Equations. SIAM Classics in Applied Mathematics. SIAM, Philadelphia (1996) 27. Dermanis, A.: The photogrammetric inner constraints. ISPRS J. Photogramm. Remote Sens. 49, 25–39 (1994) 28. Dermanis, A.: Generalized inverses of nonlinear mappings and the nonlinear geodetic datum problem. J. Geod. 72, 71–100 (1998) 29. Fang, X.: Weighted total least squares solutions for applications in Geodesy. Wissenschaftliche Arbeiten der Fachrichtung Geodäsie und Geoinformatik, Ph.D.-Thesis, Nr.294, Leibniz University Hannover (2011) 30. Felus, Y.A.: Application of total least squares for spatial point process analysis. J. Surv. Eng. 130, 126–133 (2004) 31. Felus, Y.A., Burtch, R.C.: On symmetrical three-dimensional datum conversion. GPS Solut. 13, 65–74 (2009) 32. Fuller, W.A.: Measurement Error Models. 
Wiley Interscience, New York (1987) 33. Fuller, W.A.: Introduction to Statistical Time Series, 2nd edn. Wiley, New York (1996) 34. Gerhold, G.A.: Least-squares adjustment of weighted data to a general linear equation. Am. J. Phys. 37, 156–161 (1969) 35. Gleser, L.: Estimation in a multivariate “errors in variables” regression model: large sample results. Ann. Stat. 9, 24–44 (1981) 36. Golub, G.H., van Loan, C.F.: An analysis of the total least squares problem. SIAM J. Numer. Anal. 17, 883–893 (1980) 37. Grafarend, E., Schaffrin, B.: Unbiased free net adjustment. Surv. Rev. 22, 200–218 (1974) 38. Grafarend, E., Schaffrin, B.: Equivalence of estimable quantities and invariants in geodetic networks. ZfV 101, 485–491 (1976) 39. Hodges, S.D., Moore, P.G.: Data uncertainties and least squares regression. Appl. Stat. 21, 185–195 (1972) 40. Horn, B.: Closed-form solution of absolute orientation using unit quaternions. J. Opt. Soc. Am. 4, 629–642 (1987) 41. Horn, B., Hilden, H.M., Negahdripour, S.: Closed-form solution of absolute orientation using orthonormal matrices. J. Opt. Soc. Am. 5, 1127–1135 (1988) 42. Horn, S.D., Horn, R.A., Duncan, D.B.: Estimating heteroscedastic variances in linear models. J. Am. Stat. Ass. 70, 380–385 (1975) 43. Kanatani, K., Niitsuma, H.: Optimal computation of 3-D similarity: Gauss-Newton vs. GaussHelmert. Comput. Stat. Data Anal. 56, 4470–4483 (2012) 44. Keat, J.E.: Analysis of least-squares attitude determination routine DOAOP. Technical Report CSC/TM-77/6034, Computer Sciences Corp. (1977) 45. Koch, K.R.: S-transformations and projections for obtaining estimable parameters. In: 40 Years of Thought, Anniversary Volume on the Occasion of Prof. Baarda’s 65th Birthday, Delft, Vol. 1, pp. 136–144 (1982)


46. Koch, K.R.: Parameter estimation and hypothesis testing in linear models, 2nd edn. Springer, Berlin (1999) 47. Kotsakis, C.: Reference frame stability and nonlinear distortion in minimum-constrained network adjustment. J. Geod. 86, 755–774 (2012) 48. Kotsakis, C.: Generalized inner constraints for geodetic network densification problems. J. Geod. 87, 661–673 (2013) 49. Krakiwsky, E.D., Thomson, D.B.: Mathematical models for the combination of terrestrial and satellite networks. Can. Surv. 28, 606–615 (1974) 50. Kummell, C.H.: Reduction of observation equations which contain more than one observed quantity. Analyst 6, 97–105 (1879) 51. Lemmerling, P., van Huffel, S.: Structured total least squares. In: van Huffel, S., Lemmerling, P. (eds.) Total Least Squares and Errors-in-Variables Modelling, pp. 79–91. Kluwer Academic Publishers, Dordrecht (2002) 52. Li, Y., Tang, H., Lin, X.: Spatial linear mixed models with covariate measurement errors. Stat. Sin. 19, 1077–1093 (2009) 53. Liu, J.N.: The equivalence of coordinate transformation models for the combination of satellite and terrestrial networks. J. Wuhan Tech. Univ. Surv. Mapp. 8, 37–50 (1983). (in Chinese with English abstract) 54. Liu, J.N., Liu, D.J.: The influence of the accuracy in geodetic and geocentric coordinates on combined adjustment. Acta Geod. Cartogr. Sinica. 14, 133–144 (1985). (in Chinese with English abstract) 55. Liu, J.N., Liu, D.J., Cui, X.Z.: Theory and applications of combined adjustment of satellite and terrestrial networks. J. Wuhan Tech. Univ. Surv. Mapp. 12(4), 1–9 (1987). (in Chinese with English abstract) 56. Lu, G.: Development of a GPS multi-antenna system for attitude determination. Ph.D-Thesis, UCGE Reports No. 20073. Department of Geomatics Engineering, The University of Calgary (1995) 57. Magnus, J.R., Neudecker, H.: Matrix differential calculus with applications in statistics and econometrics. Wiley, New York (1988) 58. 
Markovsky, I., van Huffel, S.: High-performance numerical algorithms and software for structured total least squares. J. Comput. Appl. Math. 180, 311–331 (2005) 59. Markovsky, I., van Huffel, S.: On weighted structured total least squares. In: Lirkov, I., Margenov, S., Wa´sniewski, J. (eds.) LSSC 2005, LNCS 3743, pp. 695–702 (2006) 60. Markovsky, I., van Huffel, S.: Overview of total least squares methods. Sig. Proc. 87, 2283– 2302 (2007) 61. Mastronardi, N., Lemmerling, M., van Huffel, S.: Fast structured total least squares algorithm for solving the basic deconvolution problem. SIAM J. Matrix Anal. Appl. 22, 533–553 (2000) 62. Meissl, P.: Die innere Genauigkeit eines Punkthaufens. Öster. ZfV 50, 159–165, 186–194 (1962) 63. Meissl, P.: Über die innere Genauigkeit dreidimensionaler Punkthaufen. ZfV 90, 109–118 (1965) 64. Neitzel, F.: Generalization of total least-squares on example of unweighted and weighted 2D similarity transformation. J. Geod. 84, 751–762 (2010) 65. Nievergelt, Y.: Total least squares: state-of-the-art regression in numerical analysis. SIAM Rev. 36, 258–264 (1994) 66. Papo, H., Perelmuter, A.: Datum definition by free net adjustment. Bull. Geod. 55, 218–226 (1981) 67. Pearson, K.: On lines and planes of closest fit to systems of points in space. Philos. Mag. 2, 559–572 (1901) 68. Rao, C.R., Kleffe, J.: Estimation of Variance Components and Applications. North-Holland, Amsterdam (1988) 69. Rao, C.R., Mitra, S.K.: Generalized Inverse of Matrices and Its Applications. Wiley, New York (1971)


70. Ratkowsky, D.A.: Nonlinear Regression Modeling – A Practical Unified Approach. Marcel Dekker, New York (1983) 71. Rinner, K.: Systematic investigations of geodetic networks in space. European Res Office, Contract No. 91–591-EUC 3584 (1966) 72. Rosen, J.B., Park, H., Glick, J.: Total least squares formulation and solution for structured problems. SIAM J. Matrix Anal. Appl. 17, 110–126 (1996) 73. Ross, G.J.S.: Nonlinear Estimation. Springer, New York (1990) 74. Sanso, F.: An exact solution of the roto-translation problem. Photogrammetria 29, 203–216 (1973) 75. Schaffrin, B.: Aspects of network design. In: Grafarend, E., Sanso, F. (eds.) Optimization and Design of Geodetic Networks, pp. 548–597. Springer, Berlin (1985) 76. Schaffrin, B.: A note on constrained total least-squares estimation. Linear Algebra Appl. 417, 245–258 (2006) 77. Schaffrin, B., Wieser, A.: On weighted total least-squares adjustment for linear regression. J. Geod. 82, 415–421 (2008) 78. Schaffrin, B., Felus, Y.A.: An algorithmic approach to the total least-squares problem with linear and quadratic constraints. Stud. Geophys. Geod. 53, 1–16 (2009) 79. Schaffrin, B., Lee, I., Choi, Y., Felus, Y.A.: Total least-squares (TLS) for geodetic straight-line and plane adjustment. Boll. Geod. Sci. Aff. LXV, 141–168 (2006) 80. Schönemann, P.H.: A generalized solution of the orthogonal procrustes problem. Psychometrika 31, 1–10 (1966) 81. Searle, S.R.: Linear Models. Wiley, New York (1971) 82. Seber, G., Wild, C.: Nonlinear Regression. Wiley, New York (1989) 83. Shen, Y.Z., Li, B.F., Chen, Y.: An iterative solution of weighted total least squares adjustment. J. Geod. 85, 229–238 (2011) 84. Shi, Y., Xu, P.L.: Unidentifiability of errors-in-variables models with rank deficiency from measurements (2018, submitted) 85. Shi, Y., Xu, P.L., Liu, J.N., Shi, C.: Alternative formulae for parameter estimation in partial errors-in-variables models. J. Geod. 89, 13–16 (2015) 86. 
Teunissen, P.: Zero order design: generalized inverse, adjustment, the datum problem and S-transformations. In: Grafarend, E., Sanso, F. (eds.) Optimization and Design of Geodetic Networks, pp. 11–55. Springer, Berlin (1985) 87. Thompson, E.H.: An exact linear solution of the problem of absolute orientation. Photogrammetria 15, 163–179 (1959) 88. Tong, X.H., Jin, Y.M., Zhang, S.L., Li, L.Y., Liu, S.J.: Bias-corrected weighted total leastsquares adjustment of condition equations. J. Surv. Eng. 141, 04014013 (2015) 89. Umeyama, S.: Least squares estimation of transformation parameters between two point patterns. IEEE Trans. Pattern Anal. Mach. Intell. 13, 376–380 (1991) 90. Ursin, B.: Methods for estimating the seismic reflection response. Geophysics 62, 1990–1995 (1997) 91. van Huffel, S., Vandewalle, J.: The Total Least Squares Problem: Computational Aspects and Analysis. SIAM, Philadelphia (1991) 92. Wang, N., Davidian, M.: A note on covariate measurement error in nonlinear mixed effects models. Biometrika 83, 801–812 (1996) 93. Wang, N., Lin, X., Gutierrez, R., Carroll, R.J.: Bias analysis and SIMEX approach in generalized linear mixed measurement error models. J. Am. Stat. Ass. 93, 249–261 (1998) 94. Wolf, H.: Helmerts Lösung zum Problem der freien Netze mit singulärer Normalgleichungsmatrix. ZfV 97, 189–192 (1972) 95. Wolf, H.: Die Helmert-Inverse bei freien geodätischen Netzen. ZfV 98, 396–398 (1973) 96. Wolf, H.: Scale and orientation in combined doppler and triangulation nets. Bull Géod 54, 45–53 (1980) 97. Xu, P.L.: Testing the hypotheses of non-estimable functions in free net adjustment models. Manuscr. Geodaet. 20, 73–81 (1995)

9 Parameter Estimation, Variance Components and Statistical. . .

533

98. Xu, P.L.: A general solution in nonlinear rank-defect geodetic models. Boll. Geod. Sci. Aff. 56, 1–25 (1997) 99. Xu, P.L.: Biases and accuracy of, and an alternative to, discrete nonlinear filters. J. Geod. 73, 35–46 (1999) 100. Xu, P.L.: Nonlinear filtering of continuous systems: foundational problems and new results. J. Geod. 77, 247–256 (2003) 101. Xu, P.L.: Determination of regional stress tensors from fault-slip data. Geophys. J. Int. 157, 1316–1330 (2004) 102. Xu, P.L.: The effect of errors-in-variables on variance component estimation. J. Geod. 90, 681–701 (2016) 103. Xu, P.L., Grafarend, E.: Statistics and geometry of the eigenspectra of 3-D second-rank symmetric random tensors. Geophys. J. Int. 127, 744-756 (1996) 104. Xu, P.L., Liu, J.N.: Variance components in errors-in-variables models: estimability, stability and bias analysis. Invited talk, VIII Hotine-Marussi Symposium on Mathematical Geodesy, Rome, 17–21 June (2013) 105. Xu, P.L., Liu, J.N.: Variance components in errors-in-variables models: estimability, stability and bias analysis. J. Geod. 88, 719–734 (2014) 106. Xu, P.L., Shimada, S.: Least squares parameter estimation in multiplicative noise models. Commun. Stat. B29, 83–96 (2000) 107. Xu, P.L., Liu, J.N., Shi, C.: Total least squares adjustment in partial errors-in-variables models: algorithm and statistical analysis. J. Geod. 86, 661–675 (2012) 108. Xu, P.L., Liu, J.N., Zeng, W., Shen, Y.Z.: Effects of errors-in-variables on weighted least squares estimation. J. Geod. 88, 705–716 (2014) 109. Zeng, H.: Analytical algorithm of weighted 3D datum transformation using the constraint of orthonormal matrix. EPS 67, art.105 (2015) 110. Zhou, J.W.: Quasi-stable adjustment of monitoring networks. Wuhan Inst. Geod. Geophys. Spec. Publ. No. 2 (1980). (in Chinese) 111. Zumar, R.H.: Robust estimation in the errors-in-variables model. Biometrika 76, 149–160 (1989)

Part IV Approximation and Numerical Methods

10 Fast Harmonic/Spherical Splines and Parameter Choice Methods

Martin Gutting

Contents
1 Introduction .......................................................... 539
2 Harmonic/Spherical Splines ............................................ 541
3 The Fast Multipole Method for Splines ................................. 548
4 Parameter Choice Methods for Spline Approximation ..................... 560
5 Conclusion ............................................................ 569
Literature .............................................................. 570

Abstract Solutions to boundary value problems in geoscience where the boundary is the Earth's surface can be constructed in terms of harmonic splines. These are localizing trial functions that make use of a reproducing kernel. Splines allow regional modeling or the improvement of a global model in a part of the Earth's surface. For certain choices of the reproducing kernel a fast matrix-vector multiplication using the fast multipole method (FMM) is available. The main idea of the fast multipole algorithm consists of two parts: first, the hierarchical decomposition of the three-dimensional computational domain into nested cubes; second, the use of an approximation of the kernel for the more distant points, which allows many distant points to be treated at once. The numerical effort of the

This chapter is part of the series Handbuch der Geodäsie, volume "Mathematische Geodäsie/Mathematical Geodesy", edited by Willi Freeden, Kaiserslautern.

Martin Gutting
Department of Biotechnology, Mannheim University of Applied Sciences, Mannheim, Germany
E-Mail: [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020
W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_106


matrix-vector multiplication becomes linear in the number of points for a prescribed accuracy of the approximation of the reproducing kernel. This fast multiplication is used in spline approximation to solve the occurring linear systems efficiently. Spline approximation also allows the treatment of noisy data, which requires the choice of a smoothing parameter. Several methods are presented which ideally choose this parameter automatically, with and without prior knowledge of the noise level. Since a fast solution algorithm is used, we no longer have access to the whole matrix or to its singular values, whose computation requires a much larger numerical effort. The parameter choice methods must reflect this situation.

Zusammenfassung Lösungen zu Randwertproblemen aus den Geowissenschaften, bei denen die Randfläche durch die Erdoberfläche gegeben ist, können mittels harmonischer Splines konstruiert werden. Diese Splines sind lokalisierende Testfunktionen, die aus einem reproduzierenden Kern hervorgehen. Sie erlauben regionale Modelle sowie die lokale Verbesserung globaler Modelle in Teilen der Erdoberfläche. In bestimmten Fällen dieser reproduzierenden Kerne ist eine schnelle Matrix-Vektor-Multiplikation durch das schnelle Multipolverfahren (FMM) verfügbar. Die Kernidee des schnellen Multipolalgorithmus besteht aus zwei Teilen: zum einen der hierarchischen Unterteilung des dreidimensionalen Berechnungsgebiets in geschachtelte Würfel, zum anderen der Verwendung einer Approximation anstelle des Kerns für weiter entfernte Punkte, die es erlaubt, viele entfernte Punkte auf einmal zu betrachten. Der numerische Aufwand der Matrix-Vektor-Multiplikation wird so linear bzgl. der Punkteanzahl bei einer vorgeschriebenen Genauigkeit der Approximation des reproduzierenden Kerns. Diese schnelle Multiplikation wird bei der Spline-Approximation benutzt, um die auftretenden linearen Gleichungssysteme effizient zu lösen. Spline-Approximation erlaubt es auch, verrauschte Daten zu nutzen, erfordert allerdings die Wahl eines Glättungsparameters. Verschiedene Verfahren werden präsentiert, die idealerweise automatisch diesen Parameter wählen, manche mit und manche ohne Kenntnis des Rauschlevels. Da ein schneller Lösungsalgorithmus für die linearen Gleichungssysteme benutzt wird, stehen dabei die gesamte Matrix oder gar ihre Singulärwerte, deren Berechnung einen viel höheren numerischen Aufwand erfordert, nicht zur Verfügung. Die Parameterwahlverfahren müssen diese Situation adäquat widerspiegeln.

Keywords Boundary value problem · Generalized interpolation · Harmonic splines · Spline approximation · Fast multipole method · Regularization · Ill-posed problems · Parameter choice methods

1 Introduction

Spherical splines have been developed by [25] and independently by [90] for interpolation and smoothing/approximation problems on the sphere and have been generalized to harmonic splines by [24, 26, 27]. These harmonic splines can be used for interpolation/approximation on regular surfaces, but in particular for the solution of boundary value problems where the boundary is a regular surface. Convergence theorems exist for both spherical splines and harmonic splines (cf. [28–31]), and the different types of spline spaces and their reproducing kernels have been investigated (cf. [42] and the references therein). In geosciences they have found many applications (see, e.g., [32, 33, 39–41, 74] and the references therein).

Modeling with splines leads to a system of linear equations which in the case of harmonic splines is densely populated. This makes the solution expensive in terms of the numerical effort. Iterative solvers require fast summation methods corresponding to the reproducing kernels of these splines to be efficient. For this we present the fast multipole method in Sect. 3. On the sphere there are several additional ways to achieve a fast summation: spherical panel clustering (cf. [32, 36] and the references therein), spherical FFT for gridded data points (see, e.g., [17]) or spherical NFFT for non-equispaced data (cf. [60, 79]). The problem under consideration determines which method should be preferred.

Greengard [47], Greengard and Rokhlin [48, 49], and Rokhlin [84] have introduced the fast multipole method (FMM) in two and three dimensions for the fast evaluation of potentials corresponding to the Laplace operator (generalizations to further operators have been introduced later). Since such potentials are closely related to certain reproducing kernels of our splines, the FMM allows fast summation of harmonic splines as well as spherical splines. Note that in our approach spherical splines are merely a special case.
Such a combination of FMM and splines is used in [46] to solve problems of satellite geodesy with harmonic splines corresponding to the singularity kernel. We have extended this to the Abel-Poisson kernel and use the accelerated version of the FMM that was first introduced in [14, 50]. This approach has also been applied to the oblique boundary value problem of potential theory in [51, 52]. Further applications and examples can be found in [53, 54]. We consider the following (generalized) interpolation problems: Problem 1 (Interpolation on a regular surface). Let Σ be a C (0) -regular surface (see Definition 1 below for details). Let a finite set of points {x1 , . . . , xN } ⊂ Σ on the surface and data Fi , i = 1, . . . , N corresponding to these points be given. The aim is to find a function F in a function space of choice such that F (xi ) = Fi , i = 1, . . . , N . If the data Fi are error-affected and strict interpolation is no longer desirable, the interpolation conditions are reduced to F (xi ) ≈ Fi , i = 1, . . . , N . In this case F


has to minimize a functional that balances closeness to the data and smoothness of F, usually with one (or several) parameters (see Sect. 2.4 for the details).

Closely related to this interpolation/approximation problem is the discrete version of the Dirichlet boundary value problem which requires only the values of the boundary function in a finite set of points on the surface.

Problem 2. Let Σ be a C^(k)-regular surface with k ≥ 2 (see again Definition 1 below for details). Let {x_1, ..., x_N} ⊂ Σ be a discrete set of N points on the surface. For each point x_i let F_i = U(x_i) be given, where i = 1, ..., N. The task is to determine the potential U ∈ C^(0)(Σ̄_ext) ∩ C^(2)(Σ_ext) which is harmonic in Σ_ext, i.e., in the exterior of the regular surface Σ. Moreover, the potential U has to be regular at infinity, i.e., for |x| → ∞,

|U(x)| = O(|x|^{−1}),   (1)
|∇U(x)| = O(|x|^{−2}).   (2)

Alternatively, the task is to determine an approximation U_N to U which fits the data, i.e., for i = 1, ..., N,

U_N(x_i) = F_i = U(x_i).   (3)

As before the interpolation conditions (3) are relaxed in case of error-affected data/measurements. Obviously, Neumann problems can also be considered. Note that ∂/∂n denotes the normal derivative with respect to the normal n pointing to the exterior Σ_ext.

Problem 3. Let Σ be a C^(k)-regular surface with k ≥ 2 (see again Definition 1 below for details). Let {x_1, ..., x_N} ⊂ Σ be a discrete set of N points on the surface. For each point x_i let the discrete normal derivative F_i = (∂U/∂n)(x_i) be given, where i = 1, ..., N.

The task is to determine the potential U ∈ C^(1)(Σ̄_ext) ∩ C^(2)(Σ_ext) which is harmonic in Σ_ext and regular at infinity, i.e., (1) and (2) hold. Alternatively, the task is to find an approximation U_N to U which fits the Neumann data, i.e., for i = 1, ..., N,

(∂U_N/∂n)(x_i) = F_i = (∂U/∂n)(x_i).   (4)

With a few slight modifications Problem 3 can be generalized to the exterior oblique derivative problem. We refer to [39, 51, 52] and the references therein for further details.


The outline of this paper is as follows: Sect. 2 summarizes the theory of harmonic splines and spline approximation. In Sect. 3 we establish the connection between harmonic splines and the sums that can be computed by the fast multipole method, introduce the adaptive construction of the decomposition of the computational domain, and provide our version of the fast multipole algorithm for harmonic splines. Section 4 gives an overview of suitable methods to choose the smoothing parameter of the approximating splines if the data are afflicted with (stochastic) noise.

2 Harmonic/Spherical Splines

Spherical harmonics, which we denote by Y_{n,m} (with degree n ∈ N₀, order m = −n, ..., n), are known to form a complete orthonormal basis of the space L²(S²) of square integrable functions on the unit sphere S² (see, e.g., [18, 37, 38, 88]). Let ϑ ∈ [0, π], ϕ ∈ [0, 2π) be the usual spherical coordinates of ξ = x/|x|, where x = (x₁, x₂, x₃)^T ∈ R³ and its norm is denoted by |x| = (x₁² + x₂² + x₃²)^{1/2}. We obtain the complex-valued fully normalized spherical harmonics Y^C_{n,m}: S² → C of degree n and order m with

Y^C_{n,m}(ξ) = (−1)^m √((2n+1)/(4π) · (n−m)!/(n+m)!) P_{n,m}(cos(ϑ)) e^{imϕ},   ξ ∈ S².   (5)

Thereby, P_{n,m}: [−1, 1] → R are the associated Legendre functions with

P_{n,m}(t) = (1/(2^n n!)) (1 − t²)^{m/2} (d^{n+m}/dt^{n+m}) (t² − 1)^n,   m = 0, ..., n.   (6)

The symmetry relation P_{n,−m}(t) = (−1)^m ((n−m)!/(n+m)!) P_{n,m}(t) extends them to negative orders (cf., e.g., [11, 18]). These are the spherical harmonics as they are usually used in physics, and they are orthonormal with respect to the scalar product of complex-valued functions on the sphere

⟨F, G⟩_{L²(S²)} = ∫_{S²} F(ξ) (G(ξ))* dS(ξ),   (7)

where the upper star * in (7) denotes the complex conjugate and dS stands for the surface element. We require these functions for the construction of the FMM in Sect. 3. In geosciences the real-valued fully normalized spherical harmonics are more commonly used, i.e., Y^R_{n,m}: S² → R of degree n and order m with


Y^R_{n,m}(ξ) = √((2n+1)/(4π) · (n−|m|)!/(n+|m|)!) P_{n,|m|}(cos(ϑ)) ·
    { √2 cos(mϕ),   m < 0,
    { 1,            m = 0,        (8)
    { √2 sin(mϕ),   m > 0.

These spherical harmonics are orthonormal with respect to the scalar product of real-valued functions on the sphere

⟨F, G⟩_{L²(S²)} = ∫_{S²} F(ξ) G(ξ) dS(ξ).   (9)

If not indicated otherwise these are the spherical harmonics used in this paper and the superscript R is omitted. Note that the numbering with respect to the order is chosen such that the following simple relation between complex-valued and real-valued spherical harmonics holds:

Y^R_{n,m}(ξ) = { (√(2−δ_{0,m})/2) (Y^C_{n,m}(ξ) + (Y^C_{n,m}(ξ))*),             m ≤ 0,
             { ((−1)^m √(2−δ_{0,m})/(2i)) (Y^C_{n,m}(ξ) − (Y^C_{n,m}(ξ))*),   m > 0,   (10)

for all ξ ∈ S², n ∈ N₀ and m ∈ Z with −n ≤ m ≤ n. For more details see, e.g., [37, 38] and the references therein. Both kinds of spherical harmonics {Y_{n,m}}_{n∈N₀, m=−n,...,n} also form a closed system in C(S²) (complex- or real-valued, respectively). They are closed and complete in L²(S²) (complex- or real-valued, respectively). This allows the representation of square-integrable functions on any sphere S²_R of radius R > 0 by their Fourier series, where the Fourier coefficients of F ∈ L²(S²_R) are denoted by

F^∧(n, m) = ∫_{S²_R} F(x) (1/R) Y_{n,m}(x/R) dS_R(x).   (11)
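These identities can be exercised numerically. The following sketch (assuming NumPy and SciPy are available; SciPy's `lpmv` carries the Condon-Shortley phase, which the definition (6) omits, hence the explicit (−1)^|m| factor) builds the real basis (8) and verifies the addition theorem Σ_{m=−n}^{n} Y_{n,m}(ξ) Y_{n,m}(η) = ((2n+1)/(4π)) P_n(ξ · η), the identity behind the Legendre expansions of the reproducing kernels below:

```python
import numpy as np
from math import factorial
from scipy.special import lpmv, eval_legendre

def real_sph_harm(n, m, theta, phi):
    """Real fully normalized spherical harmonic Y^R_{n,m} as in Eq. (8);
    theta = polar angle, phi = azimuth. SciPy's lpmv includes the
    Condon-Shortley phase, which Eq. (6) does not, hence the (-1)^|m|."""
    am = abs(m)
    p = (-1.0) ** am * lpmv(am, n, np.cos(theta))
    norm = np.sqrt((2 * n + 1) / (4 * np.pi)
                   * factorial(n - am) / factorial(n + am))
    if m < 0:
        return norm * np.sqrt(2.0) * p * np.cos(m * phi)
    if m == 0:
        return norm * p
    return norm * np.sqrt(2.0) * p * np.sin(m * phi)

# addition theorem: sum_m Y_{n,m}(xi) Y_{n,m}(eta) = (2n+1)/(4pi) P_n(xi . eta)
n = 5
t1, p1, t2, p2 = 0.7, 1.1, 2.0, 4.2
lhs = sum(real_sph_harm(n, m, t1, p1) * real_sph_harm(n, m, t2, p2)
          for m in range(-n, n + 1))
cos_gamma = np.sin(t1) * np.sin(t2) * np.cos(p1 - p2) + np.cos(t1) * np.cos(t2)
rhs = (2 * n + 1) / (4 * np.pi) * eval_legendre(n, cos_gamma)
```

The same identity is what collapses the double sum over the order m in the reproducing kernel expansions into a single Legendre series.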

2.1 Regular Surfaces and Runge Spheres

Due to the Runge-Walsh approximation theorem, we can use functions which possess a larger domain of harmonicity to approximate the solution of a problem which requires harmonicity only outside the Earth's surface (see [32, 39] for an extensive introduction of this technique). Harmonic splines as introduced in [24, 26, 27, 85] are constructed in such a way that they span subspaces of the space of functions which are harmonic outside a sphere situated inside the Earth, the so-called Runge (or Krarup) sphere (see, e.g., [76]). The Earth's surface is considered to be regular in the sense of the following definition.


Definition 1. A C^(k)-regular surface Σ ⊂ R³ is a surface in R³ which has to fulfill the following properties:

(i) Σ divides R³ into the interior Σ_int and the exterior Σ_ext, where Σ_int is a bounded region and Σ_ext is an unbounded region.
(ii) The origin is contained in Σ_int.
(iii) Σ is closed (and therefore compact) and free of double points.
(iv) Σ is a C^(k)-surface, i.e., for each x ∈ Σ there exists a neighborhood U ⊂ R³ of x such that Σ ∩ U possesses a C^(k)-parametrization.

We can also define C^(k,λ)-regular surfaces Σ ⊂ R³ with λ ∈ (0, 1) as C^(k)-regular surfaces where every point x ∈ Σ possesses a neighborhood U such that Σ ∩ U can locally be parameterized by a k-times λ-Hölder continuously differentiable parametrization. Such surfaces are required for oblique derivative boundary value problems as in [51, 52], but not here. Note that any sphere S²_R of radius R > 0 is obviously a C^(∞)-regular surface. For regular surfaces Σ we can define Runge spheres (cf. [32, 33, 39]; see also [76] where it is called the Krarup sphere).

Definition 2. The Runge sphere S²_R is a sphere of radius R around the origin such that the exterior of the Runge sphere, i.e., S²_{R,ext}, contains the exterior of the regular surface Σ, i.e., Σ_ext ⊂ S²_{R,ext}.

2.2 Sobolev Spaces

Using the Runge sphere S²_R, Definition 3 briefly introduces Sobolev spaces of the form

H = H({A_n}; S²_{R,ext}).   (12)

For more details, the reader is referred to [32, 39, 74] and the references in these books.

Definition 3. Let {A_n}_{n∈N₀} ⊂ R be a sequence which satisfies the summability condition

Σ_{n=0}^{∞} (2n+1)/(4π A_n²) < ∞.   (13)
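For instance, for the choice A_n = h^{−n/2} with h ∈ (0, 1) (cf. Example 1 below), we have 1/A_n² = h^n, so condition (13) reduces to a geometric-type series with the closed form Σ_{n≥0} (2n+1) h^n = (1+h)/(1−h)²; a minimal numerical check:

```python
import numpy as np

h = 0.9
n = np.arange(2000)
# 1/A_n^2 = h^n for A_n = h^(-n/2), so the terms of (13) decay geometrically
terms = (2 * n + 1) / (4 * np.pi) * h ** n
series = terms.sum()
closed = (1 + h) / (1 - h) ** 2 / (4 * np.pi)  # sum_{n>=0} (2n+1) h^n = (1+h)/(1-h)^2
```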


The Sobolev space H = H({A_n}; S²_{R,ext}) is defined as the completion of E({A_n}; S²_{R,ext}) with respect to the norm ‖·‖_{H({A_n}; S²_{R,ext})}, i.e.,

H = H({A_n}; S²_{R,ext}) = closure of E({A_n}; S²_{R,ext}) w.r.t. ‖·‖_{H({A_n}; S²_{R,ext})},   (14)

where E({A_n}; S²_{R,ext}) ⊂ C^(∞)(S²_{R,ext}) is the set of all functions that are harmonic in S²_{R,ext}, infinitely often differentiable on the Runge sphere S²_R and regular at infinity (i.e., (1) and (2) hold), and whose Fourier coefficients F^∧(n, m) with respect to L²(S²_R) (as defined in (11)) fulfill

‖F‖²_{H({A_n}; S²_{R,ext})} = Σ_{n=0}^{∞} Σ_{m=−n}^{n} A_n² (F^∧(n, m))² < ∞.   (15)

H is a Hilbert space with the inner product defined by

⟨F, G⟩_{H({A_n}; S²_{R,ext})} = Σ_{n=0}^{∞} Σ_{m=−n}^{n} A_n² F^∧(n, m) G^∧(n, m)   (16)

for F, G ∈ H({A_n}; S²_{R,ext}). It is well-known (cf. [32, 74] and the references therein) that such a space possesses a so-called reproducing kernel (see [3] for an overview on reproducing kernels in general).

Definition 4. Let U be a non-empty set and (X, ⟨·, ·⟩_X) be a separable Hilbert space of real-valued functions on U. Let {B_n}_{n∈N₀} be a complete orthonormal system in (X, ⟨·, ·⟩_X). Any function K: U × U → R of the form

K(x, y) = Σ_{n=0}^{∞} K^∧(n) B_n(x) B_n(y)   (17)

with x, y ∈ U and K^∧(n) ∈ R for n ∈ N₀ is called an X-product kernel (briefly an X-kernel). An X-kernel K(·, ·): U × U → R is called a reproducing kernel (or shortly repro-kernel) for (X, ⟨·, ·⟩_X) if:

(i) K(x, ·) ∈ X for all x ∈ U.
(ii) ⟨K(x, ·), F⟩_X = F(x) for all x ∈ U and all F ∈ X.


If there exists such a repro-kernel in X, then X is called a reproducing kernel Hilbert space and the repro-kernel is unique (cf. [3]). In the space H = H({A_n}; S²_{R,ext}) with a summable sequence {A_n} the repro-kernel (17) can be represented by its expansion in Legendre polynomials due to the well-known addition theorem for spherical harmonics:

K_H(x, y) = Σ_{n=0}^{∞} Σ_{m=−n}^{n} (1/A_n) (1/|x|) (R/|x|)^n Y_{n,m}(x/|x|) (1/A_n) (1/|y|) (R/|y|)^n Y_{n,m}(y/|y|)
          = Σ_{n=0}^{∞} (2n+1)/(4π A_n²) (1/(|x||y|)) (R²/(|x||y|))^n P_n((x/|x|) · (y/|y|)).   (18)

2.3 Harmonic Splines

We use the reproducing kernels of Sect. 2.2 to define harmonic splines.

Definition 5. Let {L_1, ..., L_N} ⊂ H* be a set of N linearly independent bounded linear functionals on the reproducing kernel Hilbert space H. Then any function S of the form

S = Σ_{i=1}^{N} a_i L_i K_H(·, ·)   (19)

with a set of so-called spline coefficients {a_1, ..., a_N} ⊂ R is called an H-spline relative to {L_1, ..., L_N}. The function space of all H-splines relative to {L_1, ..., L_N} is denoted by S_H(L_1, ..., L_N). By construction any H-spline is a harmonic function. The interpolating spline S^F for the function F ∈ H has to fulfill the interpolation conditions

L_i S^F = L_i F   for i = 1, ..., N.   (20)

The interpolation conditions (20) can be rewritten as a system of linear equations for the spline coefficients a_i:

Σ_{i=1}^{N} a_i L_i L_j K_H(·, ·) = L_j F,   j = 1, ..., N,   (21)

whose corresponding matrix possesses the entries L_i L_j K_H(·, ·) and is symmetric and positive definite (for linear functionals L_1, ..., L_N ∈ H* which are linearly independent).


In this paper, we consider only evaluation functionals L_x, i.e., L_x F = F(x) where x ∈ Σ_ext. Furthermore, L_1, ..., L_N are given by L_i F = F(x_i) where x_i ∈ Σ. For other types of functionals see [30, 31] or [51, 52] (for the case of oblique derivatives). In the following theorem we summarize the properties of H-splines.

Theorem 1. Let F ∈ H and let {L_1, ..., L_N} ⊂ H*. Then the H-spline interpolation problem with the interpolation conditions (20) is uniquely solvable and its solution S^F ∈ S_H(L_1, ..., L_N) possesses the following properties:

(i) S^F is the H-orthogonal projection of F onto S_H(L_1, ..., L_N).
(ii) ‖S^F‖_H ≤ ‖F‖_H.
(iii) If G ∈ H also satisfies the interpolation conditions (20), then the first minimum property holds:

‖G‖²_H = ‖S^F‖²_H + ‖G − S^F‖²_H,   (22)

i.e., S^F is the interpolating function of F in H with minimal norm.

(iv) If S ∈ S_H(L_1, ..., L_N) and G ∈ H also satisfies the interpolation conditions (20), then the second minimum property holds:

‖S − G‖²_H = ‖S^F − G‖²_H + ‖S − S^F‖²_H.   (23)

For the proof and for further details on splines, the reader is referred to [24, 31, 32, 35, 74] and the references therein.

Example 1. The choice A_n = h^{−n/2}, h ∈ (0, 1), fulfills (13) and provides us with the reproducing kernel called Abel-Poisson kernel, which is given by

K_H(x, y) = (1/(4π)) (|x|²|y|² − h²R⁴) / (|x|²|y|² + h²R⁴ − 2hR² x · y)^{3/2}.   (24)

Example 2. The sequence A_n = (n + 1/2)^{1/2} h^{−n/2}, h ∈ (0, 1), also satisfies (13) and leads to the singularity kernel given by

K_H(x, y) = (1/(2π)) (|x|²|y|² + h²R⁴ − 2hR² x · y)^{−1/2}.   (25)
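The two representations can be checked against each other numerically; the sketch below (Python with SciPy; the points and parameters are illustrative) compares the closed form (25) with the truncated Legendre expansion (18) for A_n² = (n + 1/2) h^{−n}:

```python
import numpy as np
from scipy.special import eval_legendre

def singularity_kernel(x, y, h=0.8, R=1.0):
    """Closed form (25) of the singularity kernel."""
    q = np.dot(x, x) * np.dot(y, y) + h**2 * R**4 - 2 * h * R**2 * np.dot(x, y)
    return 1.0 / (2 * np.pi * np.sqrt(q))

def kernel_series(x, y, h=0.8, R=1.0, nmax=200):
    """Truncated expansion (18) with A_n^2 = (n + 1/2) h^(-n) (Example 2)."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    t = np.dot(x, y) / (nx * ny)
    n = np.arange(nmax + 1)
    coeff = (2 * n + 1) / (4 * np.pi) * h**n / (n + 0.5) / (nx * ny)
    return np.sum(coeff * (R**2 / (nx * ny)) ** n * eval_legendre(n, t))
```

The series converges geometrically as long as hR²/(|x||y|) < 1, i.e., for points outside the Runge sphere.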

For a detailed derivation of both examples we refer to, e.g., [34, 37, 51]. In [39] the existence of approximations fulfilling interpolation conditions is shown by the Runge-Walsh approximation theorem and an extension of Helly's theorem (cf. [93]). Convergence results for harmonic splines (cf. [30, 31]) can be


derived that show the convergence to the solution of the Dirichlet boundary value problem for an increasing density of data points, i.e., if the largest data gap goes to zero (cf. [31]). We consider two specific types of splines (using the Abel-Poisson and singularity kernels) and propose in Sect. 3 the fast multipole method to quickly compute the sums Σ_{i=1}^{N} a_i K_H(x_i, y_j) for many points. This can be used to solve the systems of linear equations (21) that occur in the solution of the interpolation problems using harmonic splines.
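For evaluation functionals the system (21) is a plain dense symmetric positive definite system. The following sketch (Python/NumPy; the point set, data values and kernel parameters are made-up illustrations, not the chapter's numerical examples) assembles the matrix with entries K_H(x_i, x_j) for the singularity kernel (25) and solves for the spline coefficients:

```python
import numpy as np

def kernel_matrix(X, Y, h=0.7, R=1.0):
    """Matrix with entries K_H(x_i, y_j) for the singularity kernel (25)."""
    q = (np.sum(X**2, axis=1)[:, None] * np.sum(Y**2, axis=1)[None, :]
         + h**2 * R**4 - 2 * h * R**2 * (X @ Y.T))
    return 1.0 / (2 * np.pi * np.sqrt(q))

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
X = 1.3 * X / np.linalg.norm(X, axis=1, keepdims=True)  # points outside S^2_R
F = X[:, 2]                                             # illustrative data F_i

K = kernel_matrix(X, X)        # entries L_i L_j K_H(., .) of (21)
a = np.linalg.solve(K, F)      # spline coefficients a_i
S = kernel_matrix(X, X) @ a    # spline evaluated at the data points
```

Since the matrix is symmetric positive definite, a Cholesky factorization or, for large N, an iterative solver combined with the fast summation of Sect. 3 would replace the dense `solve`.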

2.4 Spline Approximation

For noisy data, i.e., F_i = U(x_i) + δ_i, i = 1, ..., N, where the noise δ_i is modeled by some stochastic process, e.g., white noise (see Sect. 4.1 for details), in Problem 1 or Problem 2, it makes no sense to solve an interpolation problem. We look for an approximation to U which can be interpreted as a smoothing of the data (see [25, 35, 43, 91] for spherical spline approximation and [24, 32] for the case of harmonic spline approximation). Minimizing the following functional

μ(S) = Σ_{i=1}^{N} Σ_{j=1}^{N} (L_i S − F_i) C_{ij} (L_j S − F_j) + β ‖S‖²_H   (26)

in the reproducing kernel Hilbert space H = H({A_n}; S²_{R,ext}) yields the desired smoothed approximation of the data. C = (C_{ij}) ∈ R^{N×N} denotes a positive definite matrix which allows us to include covariance information on the data if available. β > 0 is a constant smoothing parameter which balances closeness to the data and smoothing. The following theorem of [32, 74] (see also the references therein) summarizes the existence and uniqueness of a spline approximation.

Theorem 2. Let F_i, i = 1, ..., N, correspond to a set of linearly independent bounded linear functionals L_1, ..., L_N ∈ H*. Then there exists a unique element S ∈ S_H(L_1, ..., L_N) such that

μ(S) ≤ μ(F) for all F ∈ H   (27)

and μ(S) = μ(F) if and only if S = F. This element is called the smoothing spline or approximating spline. Its spline coefficients a_i, i = 1, ..., N, are uniquely determined by the system of linear equations

Σ_{i=1}^{N} a_i (L_i L_j K_H(·, ·) + β (C⁻¹)_{ij}) = L_j F,   j = 1, ..., N.   (28)


The matrix in (28) corresponds to the sum of the matrix in (21) and βC⁻¹. It is still positive definite. If C is the identity matrix, there is only the one smoothing parameter β. Using a diagonal matrix for C it is possible to introduce weights for the data F_i and to include additional information on the noise of the data. The choice of the smoothing parameter(s) can be interpreted as the application of a parameter choice method in the regularization theory of ill-posed problems (see Sect. 4). Note that this approach is similar to solving a regularized least squares problem.
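A minimal sketch of the smoothing system (28) with C = I (again with made-up points and noise, only for illustration): as β grows, the spline is pulled away from the data toward smoothness, so the data residual increases monotonically with β:

```python
import numpy as np

def kernel_matrix(X, Y, h=0.7, R=1.0):
    """Matrix with entries K_H(x_i, y_j) for the singularity kernel (25)."""
    q = (np.sum(X**2, axis=1)[:, None] * np.sum(Y**2, axis=1)[None, :]
         + h**2 * R**4 - 2 * h * R**2 * (X @ Y.T))
    return 1.0 / (2 * np.pi * np.sqrt(q))

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 3))
X = 1.3 * X / np.linalg.norm(X, axis=1, keepdims=True)
F = X[:, 2] + 0.01 * rng.normal(size=60)  # noisy data F_i = U(x_i) + delta_i

K = kernel_matrix(X, X)
residuals = []
for beta in (1e-6, 1e-3, 1e-1):
    a = np.linalg.solve(K + beta * np.eye(60), F)  # system (28) with C = I
    residuals.append(np.linalg.norm(K @ a - F))
```

Choosing β automatically from the data is exactly the parameter choice problem treated in Sect. 4.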

2.5 Spherical Splines

Spherical splines have been introduced by [25] and independently by [90] and can be embedded naturally in the harmonic setting of Sect. 2.3. On the sphere of radius R, i.e., Σ = S²_R, no additional Runge sphere is required and the surface itself can be used to consider the corresponding Sobolev spaces H = H({A_n}; S²_R) with respect to a sequence {A_n}. The summability condition (13) remains the same and if it is satisfied by {A_n} we again obtain a reproducing kernel. It should be remarked that on the sphere this reproducing kernel is a radial basis function depending only on the distance |x − y| of its two arguments x, y ∈ S²_R. Moreover, it is possible to consider Sobolev spaces that possess a locally supported reproducing kernel, i.e., K_H(x, y) = 0 if |x − y| is larger than a threshold value (see [35] and also the categorization in [42]). For our purposes the problem remains the same as before: we need a fast method to evaluate sums Σ_i a_i K_H(x_i, y_j) and to solve the system (21) for the spline coefficients. However, there exists a wider spectrum of fast summation methods on the sphere such as panel clustering (see [22, 32, 36] and the references therein), spherical FFT for gridded data points (cf., e.g., [17]) or spherical NFFT for non-equispaced data (cf., e.g., [60, 79]). In [35] a more general construction is introduced which is orthogonal to the spherical harmonics of degrees 0 to m and which can be used for approximation by combining it with the corresponding spherical harmonics. A combined interpolation and smoothing approach is included as well as spline-exact numerical integration formulas on the sphere. Convergence results for spherical splines are shown in [28, 29] if the size of the largest data gap goes to zero. We refer the reader to [35, 74] and the references therein for a detailed introduction. It should be noted that the construction has also been carried over to the three-dimensional ball (see [73, 74] and the references therein).

3 The Fast Multipole Method for Splines

The interpolation conditions (20) as well as the minimization of the smoothing functional (26) lead to a system of linear equations, (21) or (28) respectively. Its matrix is densely populated and its size is given by the number of data points. Therefore, it can be a large system of linear equations whose solution becomes difficult.


Reproducing kernels of a reproducing kernel Hilbert space H defined by the summable sequence {A_n} can be expanded in terms of Legendre polynomials as in (18). The singularity kernel (25) and the Abel-Poisson kernel (24) possess a representation as an elementary function and both kernels are closely related to the single pole 1/|x − y| by the Kelvin transform. Therefore, we can use the fast multipole method (FMM), which has been introduced by [47–49, 84], for the fast summation of harmonic splines, i.e., of the sum Σ_i a_i K_H(x_i, ·). This corresponds to the matrix-vector products occurring in an iterative solver for (21) or (28).

The FMM creates a hierarchical subdivision of the computational domain into nested cubes that are organized in an octree data structure. Instead of single point interactions, the cubes summarize the part of the kernel sum corresponding to the points they contain and interact with other cubes via the coefficients of truncated inner/outer harmonics expansions. This kernel approximation is applied as often as possible and on the coarsest possible level of the tree data structure. Direct evaluation is used only for the closest cubes where the approximation is not accurate enough. The algorithm has been improved several times to increase its efficiency (cf., e.g., [14, 50, 92]). We summarize our implementation and show the application of the FMM to harmonic splines (see [51, 53, 54] for a more detailed analysis).
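The baseline that the FMM accelerates is the direct O(N·M) evaluation of the kernel sum; a naive reference implementation (a sketch for the singularity kernel (25)) is useful for validating any fast method:

```python
import numpy as np

def direct_sum(a, X, Y, h=0.7, R=1.0):
    """Naive evaluation of s_j = sum_i a_i K_H(x_i, y_j), cost O(N*M).
    X: (N,3) source points, Y: (M,3) evaluation points, a: (N,) coefficients."""
    q = (np.sum(X**2, axis=1)[:, None] * np.sum(Y**2, axis=1)[None, :]
         + h**2 * R**4 - 2 * h * R**2 * (X @ Y.T))
    return (1.0 / (2 * np.pi * np.sqrt(q))).T @ a
```

The FMM replaces this quadratic cost by an effort that is linear in the number of points for a prescribed kernel approximation accuracy.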

3.1 Kelvin Transform of Reproducing Kernels

The Kelvin transform yields the aforementioned connection between the kernels (25) and (24) and the fundamental solution of the Laplace equation, i.e., the single pole. The transform can be seen as a reflection on a sphere around the origin and we choose the Runge sphere S²_R of Definition 2 for this (see also textbooks on potential theory, e.g., [33, 61]).

Definition 6. Let Γ ⊆ R³ be a domain and W: Γ → R a function. Let the reflection of Γ on the sphere S²_R be given by

Γ^KT = { x^KT ∈ R³ : x = (R²/|x^KT|²) x^KT ∈ Γ }.   (29)

The function

W^KT: Γ^KT → R,   x^KT ↦ W^KT(x^KT) = (R/|x^KT|) W((R²/|x^KT|²) x^KT) = (R/|x^KT|) W(x)   (30)

is called the Kelvin transform of W with respect to the sphere of radius R.

The Kelvin transform is applied to the reproducing kernels with respect to one argument (the other argument is kept fixed). The Kelvin transform K_H^KT(x, y^KT) of the singularity kernel (25) can be computed, e.g., by its expansion

M. Gutting

$$\begin{aligned} K_H(x,y) &= \sum_{n=0}^{\infty} \frac{h^n}{n+\frac12}\, \frac{2n+1}{4\pi\, |x|\,|y|} \left(\frac{R^2}{|x|\,|y|}\right)^{n} P_n\!\left(\frac{x}{|x|} \cdot \frac{y}{|y|}\right) \\ &= \frac{1}{2\pi\,|y|} \sum_{n=0}^{\infty} \frac{\left(h\,|y^{KT}|\right)^n}{|x|^{n+1}}\, P_n\!\left(\frac{x}{|x|} \cdot \frac{y^{KT}}{|y^{KT}|}\right) \\ &= \frac{1}{2\pi\,|y|}\, \frac{1}{|x - h\,y^{KT}|} = \frac{|y^{KT}|}{R}\, K_H^{KT}(x, y^{KT}), \end{aligned} \tag{31}$$

where $y^{KT} = \frac{R^2}{|y|^2}\, y$ and

$$K_H^{KT}(x, y^{KT}) = \frac{1}{2\pi R}\, \frac{1}{|x - h\,y^{KT}|}. \tag{32}$$

The Kelvin transform $K_H^{KT}(x, y^{KT})$ of the Abel-Poisson kernel (24) is given by

$$K_H(x,y) = \frac{1}{4\pi}\, \frac{|x|^2|y|^2 - h^2 R^4}{\left(|x|^2|y|^2 + h^2 R^4 - 2 h R^2\, x \cdot y\right)^{3/2}} = \frac{|y^{KT}|}{R}\, \frac{1}{4\pi R}\, \frac{|x|^2 - h^2 |y^{KT}|^2}{|x - h\,y^{KT}|^3} = \frac{|y^{KT}|}{R}\, K_H^{KT}(x, y^{KT}), \tag{33}$$

which is related to (32) by

$$K_H^{KT}(x, y^{KT}) = \frac{1}{2\pi R} \left( -x \cdot \nabla_x - \tfrac12\, \mathrm{Id} \right) \frac{1}{|x - h\,y^{KT}|}. \tag{34}$$

We summarize both (32) and (34) by use of the operator $D_x$ such that

$$K_H^{KT}(x, y^{KT}) = \frac{1}{2\pi R}\, D_x\, \frac{1}{|x - h\,y^{KT}|}, \tag{35}$$

where $D_x = \mathrm{Id}$ (singularity kernel) or $D_x = -x \cdot \nabla_x - \tfrac12\, \mathrm{Id}$ (Abel-Poisson kernel). For further details we refer to [51–53].
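As a quick numerical cross-check of (31) and (32), the following sketch compares the truncated Legendre series of the singularity kernel with its Kelvin-transformed closed form (an illustrative script with arbitrarily chosen parameters, not part of the chapter's implementation):

```python
import numpy as np

def legendre_vals(t, nmax):
    """P_0(t), ..., P_nmax(t) via the three-term recurrence."""
    p = np.empty(nmax + 1)
    p[0] = 1.0
    if nmax >= 1:
        p[1] = t
    for n in range(1, nmax):
        p[n + 1] = ((2 * n + 1) * t * p[n] - n * p[n - 1]) / (n + 1)
    return p

def singularity_kernel_series(x, y, h, R, nmax=200):
    """Truncated series (31): (1/2pi) sum_n h^n R^(2n) / (|x||y|)^(n+1) P_n(x_hat . y_hat)."""
    rx, ry = np.linalg.norm(x), np.linalg.norm(y)
    t = np.dot(x, y) / (rx * ry)
    p = legendre_vals(t, nmax)
    n = np.arange(nmax + 1)
    return np.sum(h**n * R**(2 * n) / (rx * ry)**(n + 1) * p) / (2 * np.pi)

def singularity_kernel_closed(x, y, h, R):
    """Closed form via the Kelvin transform: 1 / (2 pi |y| |x - h y^KT|)."""
    y_kt = R**2 / np.dot(y, y) * y
    return 1.0 / (2 * np.pi * np.linalg.norm(y) * np.linalg.norm(x - h * y_kt))

R, h = 1.0, 0.9
x = np.array([0.8, 0.5, 0.6])    # |x| > R
y = np.array([-0.3, 1.0, 0.4])   # |y| > R
val_series = singularity_kernel_series(x, y, h, R)
val_closed = singularity_kernel_closed(x, y, h, R)
```

The series converges geometrically with ratio $hR^2/(|x||y|) < 1$ for points outside the Runge sphere, so a moderate truncation degree already matches the closed form to machine precision.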

10 Fast Harmonic/Spherical Splines and Parameter Choice Methods

3.2 Adaptive Decomposition of the Domain

Now we consider the evaluation of the sum

$$\sum_{i=1}^{N} a_i\, K_H(x_i, y) = \sum_{i=1}^{N} a_i\, \frac{|y^{KT}|}{R}\, K_H^{KT}(x_i, y^{KT}) = \frac{|y^{KT}|}{R} \sum_{i=1}^{N} \frac{a_i}{2\pi R}\, \left. D_x\, \frac{1}{|x - h\,y^{KT}|} \right|_{x = x_i} \tag{36}$$

at the points $y_j$, $j = 1, \ldots, M = O(N)$, which can be the points $x_i$ as in the systems of linear equations (21) or (28). To better distinguish the points $x_i$ and $y_j$, we call the Kelvin-transformed points $h\,y_j^{KT} = h\, \frac{R^2}{|y_j|^2}\, y_j$ that occur in (36) targets.

At first a bounding cube is determined which is large enough to contain all points and all targets. This single cube forms level 0 of the octree structure and is subdivided into eight equally sized cubes of half its edge length, which then form level 1. Each of these cubes is adaptively divided into nested cubes, where a cube of level $l$ has half the edge length of a cube of level $l-1$, as proposed, e.g., by [14]. Points and targets are sorted into the currently available cubes. If a cube contains more than the prescribed maximal number of points or targets $m$, it is split into eight smaller cubes of the next level, and all its points/targets are redistributed into these eight cubes. We summarize the necessary vocabulary in the following definitions (see also [14, 50]).

Definition 7.
(a) A cube $C$ is called child of the cube $B$ if $C$ results from a single subdivision of $B$, which in turn is named the parent of $C$.
(b) A cube that is not further subdivided is called childless or a leaf.
(c) Cubes are said to be neighbors if they are of the same size (same level) and share at least one boundary point. Each cube is a neighbor of itself.
(d) If two cubes are at the same level but are not neighbors, they are called well-separated, i.e., between these cubes there is at least one cube of their size.

Each cube carries the relevant information about other cubes, in particular its neighbors, in four lists as suggested by [14]. We borrow Figs. 1 and 2 from [54] to give a two-dimensional illustration.

Definition 8.
(a) List 1 of a childless cube $X$ consists of all childless cubes directly adjacent to $X$. List 1 contains any cubes only if $X$ is a leaf; in this case it always contains at least $X$ itself.
(b) List 2 of a cube X consists of all children of neighbors of the parent cube of X which are well-separated from X. The cube X does not need to be childless.


Fig. 1 Two-dimensional illustration of an adaptive decomposition for levels 2 (left) and 3 (right) from [54]. If the marked square is a leaf, the white cubes form its list 1 and the blue cubes correspond to list 3. If the marked square is not a leaf, all white and all blue cubes are treated at the next level. List 2 cubes are red and list 4 cubes are yellow. Cubes in green have been handled at coarser levels

(c) Children of neighbors of the leaf $X$ (or smaller cubes descending from neighbors of $X$) which do not have any point in common with $X$ form list 3. Their parents have to be adjacent to $X$. If $X$ is not childless, then list 3 is empty.
(d) List 4 consists of childless cubes which are neighbors of the parent cube of $X$, but these childless cubes are not adjacent to $X$.

It should be noted that all neighbors of a cube $X$ can be found in list 1 of $X$. Moreover, all cubes in list 2 of a cube $X$ are of the same size as $X$ and well-separated from $X$. Further simple observations are described, e.g., in [14, 51, 53, 54]. When all points and targets are sorted into cubes that are no longer split, the adaptive decomposition of the computational domain and the construction of the octree data structure are complete. The algorithm removes childless cubes that contain neither points nor targets and are no longer required.
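The adaptive subdivision described above can be sketched in a few lines of Python (an illustrative sketch with our own class and function names; the interaction lists of Definition 8 and the pruning of empty cubes are omitted):

```python
import numpy as np

class Cube:
    """Node of the adaptive octree: an axis-parallel cube with its points/targets."""
    def __init__(self, center, half_edge, level):
        self.center, self.half_edge, self.level = center, half_edge, level
        self.children = []     # empty list => childless, i.e., a leaf
        self.points = []       # indices of points x_i contained in this cube
        self.targets = []      # indices of targets h*y^KT contained in this cube

def build_octree(points, targets, m, max_level=10):
    """Adaptive decomposition: a cube holding more than m points or targets is
    split into its eight children of half the edge length; members are
    redistributed into the children (cf. the construction proposed in [14])."""
    allpts = np.vstack([points, targets])
    lo, hi = allpts.min(axis=0), allpts.max(axis=0)
    root = Cube((lo + hi) / 2, (hi - lo).max() / 2 * 1.001, 0)  # bounding cube, level 0
    root.points = list(range(len(points)))
    root.targets = list(range(len(targets)))
    stack = [root]
    while stack:
        cube = stack.pop()
        if max(len(cube.points), len(cube.targets)) <= m or cube.level >= max_level:
            continue                                   # cube stays a leaf
        for sx in (-1, 1):
            for sy in (-1, 1):
                for sz in (-1, 1):
                    off = np.array([sx, sy, sz]) * cube.half_edge / 2
                    cube.children.append(
                        Cube(cube.center + off, cube.half_edge / 2, cube.level + 1))
        for kind, data in (("points", points), ("targets", targets)):
            for i in getattr(cube, kind):              # redistribute into children
                d = data[i] - cube.center
                child = cube.children[(d[0] > 0) * 4 + (d[1] > 0) * 2 + (d[2] > 0)]
                getattr(child, kind).append(i)
            setattr(cube, kind, [])                    # parent keeps no direct members
        stack.extend(cube.children)
    return root
```

After construction, every point and target lies in exactly one leaf, and every leaf (below the safety level cap) holds at most $m$ points and at most $m$ targets.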

3.3 Single Pole Expansion

In addition to the decomposition of the domain, the other ingredient of the FMM is the kernel approximation of the single pole by a truncated expansion and the use of translation theorems to shift the expansion center and to change the type of expansion. Similar to (31), we use the generating function of the Legendre polynomials $P_n$ to expand the single pole, where the point $x_0 \in \mathbb{R}^3$ is the expansion center. Moreover, we apply the addition theorem of the complex-valued fully normalized spherical harmonics $Y_{n,m}^C$ of (5):


Fig. 2 Two-dimensional illustration of an adaptive decomposition for level 4 from [54]. If the marked square is a leaf, the white cubes form its list 1 and the blue cubes correspond to list 3. If the marked square is not a leaf, all white and all blue cubes are treated at the next level. List 2 cubes are red and list 4 cubes are yellow. Cubes in green have been handled at coarser levels

$$\sum_{m=-n}^{n} \left( Y_{n,m}^C(\xi) \right)^{*} Y_{n,m}^C(\eta) = \frac{2n+1}{4\pi}\, P_n(\xi \cdot \eta), \tag{37}$$

where $\xi, \eta \in \mathbb{S}^2$ (see, e.g., [1, 37, 71]). As before, the upper $*$ denotes the complex conjugate. Together this yields the following expansion:

$$\begin{aligned} \frac{1}{|x-y|} &= \frac{1}{|y - x_0 - (x - x_0)|} \\ &= \sum_{n=0}^{\infty} \frac{|x - x_0|^n}{|y - x_0|^{n+1}}\, P_n\!\left( \frac{x - x_0}{|x - x_0|} \cdot \frac{y - x_0}{|y - x_0|} \right) \\ &= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} I_{n,m}^{*}(x - x_0)\, O_{n,m}(y - x_0), \end{aligned} \tag{38}$$

where $|y - x_0| > |x - x_0|$. Note that we use the (complex-valued) outer and inner harmonics for $n \in \mathbb{N}_0$, $m = -n, \ldots, n$:

$$O_{n,m}(x) = \sqrt{\frac{4\pi}{2n+1}}\, \frac{\sqrt{(n+m)!\,(n-m)!}}{|x|^{n+1}}\, Y_{n,m}^C\!\left( \frac{x}{|x|} \right), \tag{39}$$

$$I_{n,m}(x) = \sqrt{\frac{4\pi}{2n+1}}\, \frac{|x|^n}{\sqrt{(n+m)!\,(n-m)!}}\, Y_{n,m}^C\!\left( \frac{x}{|x|} \right). \tag{40}$$


Thereby, in (39) and (40), $\vartheta \in [0, \pi]$ and $\varphi \in [0, 2\pi)$ are the usual spherical coordinates of $\frac{x}{|x|}$. Well-known translation theorems for these outer and inner harmonics allow us to shift the expansion center (see, e.g., [21] for a detailed derivation).

Theorem 3 (Translation Theorem for Outer Harmonics). Let $x, y \in \mathbb{R}^3$ such that $|x| > |y|$. Then the outer harmonic of degree $n \in \mathbb{N}_0$ and order $m \in \mathbb{Z}$, $-n \le m \le n$, at $x - y$ can be expanded in terms of inner and outer harmonics as follows:

$$O_{n,m}(x - y) = \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} I_{n',m'}^{*}(y)\, O_{n+n',\, m+m'}(x) \tag{41}$$

$$\phantom{O_{n,m}(x - y)} = \sum_{n'=n}^{\infty} \sum_{m'=-n'}^{n'} I_{n'-n,\, m'-m}^{*}(y)\, O_{n',m'}(x). \tag{42}$$

Note that in (42) we make use of the convention that $I_{n,m} = 0$ if the order fulfills $|m| > n$. Obviously, this infinite series as well as the infinite sum in (38) have to be truncated for the algorithm, which is a source of (truncation) errors. Error estimates for the truncation errors can be found in [47, 49, 50] and the references therein. Another approach, which investigates the combined error of both truncations in (38) and (42), is considered in [51].

Theorem 4 (Translation Theorem for Inner Harmonics). Let $x, y \in \mathbb{R}^3$. Then the inner harmonic of degree $n \in \mathbb{N}_0$ and order $m \in \mathbb{Z}$, $-n \le m \le n$, at $x - y$ can be expanded in a finite sum of inner harmonics:

$$I_{n,m}(x - y) = \sum_{n'=0}^{n} \sum_{m'=-n'}^{n'} (-1)^{n'}\, I_{n',m'}(y)\, I_{n-n',\, m-m'}(x). \tag{43}$$

For orders with $|m| > n$ we have again by convention $I_{n,m} = 0$. Note that no truncation is necessary for this translation theorem, i.e., no truncation errors occur.

By applying (42) of Theorem 3 we can translate an outer harmonics expansion with expansion center $x_0$ such as

$$F(x) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} F_{x_0}^{\wedge, O}(n,m)\, O_{n,m}(x - x_0), \tag{44}$$

which converges uniformly for $x \in \mathbb{S}^2_{r_0,\mathrm{ext}}(x_0)$ with a radius $r_0 > 0$; here $\mathbb{S}^2_{r_0,\mathrm{ext}}(x_0)$ denotes the exterior of the sphere of radius $r_0$ around $x_0$. The outer harmonics series resulting from the translation possesses the expansion center $x_1$ and the coefficients


Fig. 3 Two-dimensional illustration of the spheres S2r0 (x0 ), S2r1 (x1 ), S2r2 (x2 ), and S2r3 (x3 ) which exemplify the typical geometry required for the translations as described in (46), (47), and (50)


$$F_{x_1}^{\wedge, O}(n', m') = \sum_{n=0}^{n'} \sum_{m=-n}^{n} F_{x_0}^{\wedge, O}(n,m)\, I_{n'-n,\, m'-m}^{*}(x_0 - x_1). \tag{45}$$

This expansion converges uniformly for $x \in \mathbb{S}^2_{r_1,\mathrm{ext}}(x_1)$, where for the exteriors of the two spheres it holds that

$$\mathbb{S}^2_{r_1,\mathrm{ext}}(x_1) \subset \mathbb{S}^2_{r_0,\mathrm{ext}}(x_0). \tag{46}$$

Figure 3 illustrates the geometric situation in two dimensions. This translation of coefficients is called multipole-to-multipole translation (M2M). Note that one can show that no further errors arise if the series in (44) has already been truncated before the translation.

By using formulation (41) of Theorem 3, we also find that the outer harmonics expansion with expansion center $x_1$ can be translated into an inner harmonics series centered around $x_2$, which converges uniformly for $x \in \mathbb{S}^2_{r_2,\mathrm{int}}(x_2)$ if the new ball of convergence is situated completely in $\mathbb{S}^2_{r_1,\mathrm{ext}}(x_1)$ (see Fig. 3 for a two-dimensional illustration), i.e.,

$$\mathbb{S}^2_{r_1,\mathrm{int}}(x_1) \cap \mathbb{S}^2_{r_2,\mathrm{int}}(x_2) = \emptyset. \tag{47}$$

The resulting coefficients of the inner harmonics expansion are

$$F_{x_2}^{\wedge, I}(n', m') = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} F_{x_1}^{\wedge, O}(n,m)\, (-1)^{n'+m'}\, O_{n+n',\, m'-m}^{*}(x_2 - x_1), \tag{48}$$

and this translation is named multipole-to-local translation (M2L).


Finally, Theorem 4 lets us shift the expansion center of such inner harmonics expansions to the new center $x_3$. The translated expansion possesses the coefficients

$$F_{x_3}^{\wedge, I}(n', m') = \sum_{n=n'}^{\infty} \sum_{m=-n}^{n} F_{x_2}^{\wedge, I}(n,m)\, I_{n-n',\, m-m'}(x_3 - x_2) \tag{49}$$

and converges uniformly for $x \in \mathbb{S}^2_{r_3,\mathrm{int}}(x_3)$, where

$$\mathbb{S}^2_{r_3,\mathrm{int}}(x_3) \subset \mathbb{S}^2_{r_2,\mathrm{int}}(x_2). \tag{50}$$

Figure 3 illustrates this geometric situation in two dimensions. This translation step is called local-to-local translation (L2L). For further details we refer to [51] and the references therein, in particular [21].
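The accuracy of the truncated expansion (38) is governed by the ratio $|x - x_0| / |y - x_0|$, which is why the well-separatedness of cubes matters. The geometric decay of the truncation error can be illustrated numerically (an illustrative sketch with arbitrary test points, using the Legendre form of the expansion):

```python
import numpy as np

def pole_series(x, y, x0, p):
    """Single-pole expansion (38) in its Legendre form, truncated at degree p:
    sum_{n<=p} |x-x0|^n / |y-x0|^(n+1) * P_n(cos gamma)."""
    r, s = np.linalg.norm(x - x0), np.linalg.norm(y - x0)
    t = np.dot(x - x0, y - x0) / (r * s)
    total, p_prev, p_cur = 0.0, 0.0, 1.0   # P_0 = 1; recurrence builds P_{n+1}
    for n in range(p + 1):
        total += r**n / s**(n + 1) * p_cur
        p_prev, p_cur = p_cur, ((2 * n + 1) * t * p_cur - n * p_prev) / (n + 1)
    return total

x0 = np.zeros(3)
x = np.array([0.3, 0.2, 0.1])      # close to the expansion center
y = np.array([1.0, -0.5, 0.8])     # well separated: |y - x0| > |x - x0|
exact = 1.0 / np.linalg.norm(x - y)
r, s = np.linalg.norm(x - x0), np.linalg.norm(y - x0)
errs = [abs(pole_series(x, y, x0, p) - exact) for p in (5, 10, 20)]
bounds = [(r / s) ** (p + 1) / (s - r) for p in (5, 10, 20)]  # geometric tail bound
```

Since $|P_n(t)| \le 1$ for $|t| \le 1$, the truncation error is bounded by the geometric tail $\sum_{n>p} r^n / s^{n+1} = (r/s)^{p+1} / (s - r)$, and the observed errors decay accordingly as $p$ grows.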

3.4 The Fast Multipole Algorithm

To start the algorithm, a first set of multipole expansion coefficients has to be computed for each cube containing any points. We consider only the part of the spline related to a single cube $X$, i.e., the kernel functions $K_H(x_i, \cdot)$, where $x_i \in X$ and $y \in \Sigma_{\mathrm{ext}}$:

$$F(y) = \sum_{\substack{i=1 \\ x_i \in X}}^{N} a_i\, K_H(x_i, y) = \sum_{\substack{i=1 \\ x_i \in X}}^{N} a_i\, \frac{|y^{KT}|}{R}\, \frac{1}{2\pi R}\, \left. D_x\, \frac{1}{|x - h\,y^{KT}|} \right|_{x = x_i}. \tag{51}$$

We find the following expansion for $|h\,y^{KT} - x_0| > |x_i - x_0|$, $x_i \in X$, i.e., if $x_0$ is the center of the cube $X$, the targets $h\,y^{KT}$ and the cube $X$ need to fulfill a distance requirement: the targets must be contained in a well-separated cube. Then

$$\begin{aligned} F(y) &= \frac{|y^{KT}|}{R} \sum_{\substack{i=1 \\ x_i \in X}}^{N} \frac{a_i}{2\pi R}\, D_x \left[ \sum_{n=0}^{\infty} \sum_{m=-n}^{n} I_{n,m}^{*}(x - x_0)\, O_{n,m}(h\,y^{KT} - x_0) \right]_{x = x_i} \\ &= \frac{|y^{KT}|}{R} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} F_{x_0}^{\wedge, O}(n,m)\, O_{n,m}(h\,y^{KT} - x_0), \end{aligned} \tag{52}$$

where the multipole coefficients $F_{x_0}^{\wedge, O}(n,m)$ of the cube $X$ are given by

$$F_{x_0}^{\wedge, O}(n,m) = \sum_{\substack{i=1 \\ x_i \in X}}^{N} \frac{a_i}{2\pi R}\, \left. D_x\, I_{n,m}^{*}(x - x_0) \right|_{x = x_i}. \tag{53}$$


This first step is called the point-to-multipole (P2M) step, where the infinite sum in (52) has to be truncated at degree $p$. The degree $p$ essentially determines the accuracy of the algorithm. The coefficients $F_{x_0}^{\wedge, O}(n,m)$ can be translated to other cubes via the relations (45) and (48) as well as (49), as long as the distance requirements are fulfilled by the construction of the decomposition of the domain into nested cubes.

At the end of the fast multipole cycle, i.e., after several M2M, M2L, and L2L translations, each cube $Y$ possesses an inner harmonics expansion centered around the center of the cube. This expansion has to be evaluated at the targets contained in $Y$. This evaluation is called the local-to-targets (L2T) step:

$$F(y_j) = \left[ \frac{|y^{KT}|}{R} \sum_{n=0}^{p} \sum_{m=-n}^{n} F_{x_0}^{\wedge, I}(n,m)\, I_{n,m}(h\,y^{KT} - x_0) \right]_{y = y_j}, \tag{54}$$

where the variable $y$ is hidden in $y^{KT} = \frac{R^2}{|y|^2}\, y$. Now we briefly sketch the fast multipole algorithm (for more details see, e.g., [13, 14], or [51, 53, 54] for our specific implementation).

Algorithm 5 (Fast Multipole Algorithm)

Input:
• A set of points $x_i \in \Sigma_{\mathrm{ext}}$ (often $x_i \in \Sigma$), $i = 1, \ldots, N$, and a set of coefficients $a_i$, $i = 1, \ldots, N$.
• The choice of the type of the reproducing kernel $K_H$ (singularity or Abel-Poisson, with the parameter $h$ and the radius of the Runge sphere $R$).
• A set of evaluation points $y_j \in \Sigma_{\mathrm{ext}}$, $j = 1, \ldots, M$, where $M = O(N)$.
• The degree of the multipole expansion $p$ and the maximal number of points per cube $m$.

Aim: Compute the sum

$$F(y_j) = \sum_{i=1}^{N} a_i\, K_H(x_i, y_j) \quad \text{for each } j = 1, \ldots, M. \tag{55}$$

Initialization:
• Compute the targets $h\,y_j^{KT} = h\, \frac{R^2}{|y_j|^2}\, y_j$, $j = 1, \ldots, M$.
• Build an adaptive octree containing all points and targets as well as the lists of Definition 8.

Fast multipole cycle:
1. For all leaves $X$: Generate multipole coefficients by (53) up to degree $p$ around the center of $X$ from the points in $X$ (P2M).


2. For level $l = L-1, \ldots, 2$: For all cubes $X$ of level $l$: Translate the multipole coefficients of the children of $X$ to $X$ itself (M2M) via (45).
3. For level $l = 2, \ldots, L$: For all cubes $X$ of level $l$: Compute the expansion coefficients of an inner harmonics expansion around the center of $X$ from the points in $Y$ for all cubes $Y$ of list 4 of $X$.
4. For level $l = 2, \ldots, L$: For all cubes $X$ of level $l$: Translate the multipole coefficients of $X$ to the local coefficients of $Y$ (M2L) by (48) for all cubes $Y$ in list 2 of $X$.
5. For level $l = 2, \ldots, L-1$: Translate the local coefficients of $X$ to the children of $X$ (L2L) via (49) for all cubes $X$ of level $l$.
6. For all leaves $X$: Evaluate the inner harmonics expansion of $X$ at all targets in $X$ (L2T) as in (54). Store the result in $F$.
7. For all cubes $Y$ in list 1 of $X$: Add the direct evaluation of the kernel sum corresponding to the points in $Y$ at the targets in $X$ to $F$.
8. For all cubes $Y$ in list 3 of $X$: Evaluate the multipole expansion around the center of $Y$ at the targets in $X$ and add the results to $F$.
9. Reverse the effects of the Kelvin transform, i.e., $\tilde{F}_j = \frac{|y_j^{KT}|}{R}\, F_j$ for $j = 1, \ldots, M$. Return the result $\tilde{F}$.

For the computation of the spline coefficients of the spline approximation of Sect. 2.4, we consider the system of linear equations (28) instead of (21). This means that we have to add $\beta \sum_{i=1}^{N} a_i\, (C^{-1})_{ij}$ to the matrix-vector product that is computed by the FMM. In order to keep a fast algorithm, the matrix $C^{-1}$ has to allow a fast summation method, or $C$ has to be a sparse matrix. The cases where $C$ is a diagonal matrix can also be included in the direct evaluation step of the fast multipole algorithm.
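For reference, the direct evaluation that the fast multipole cycle replaces for well-separated cubes can be written down directly from the Kelvin-transformed closed form (31) of the singularity kernel (an illustrative $O(NM)$ sketch, not the chapter's implementation):

```python
import numpy as np

def direct_kernel_sum(points, coeffs, evals, h, R):
    """Brute-force evaluation of (55) for the singularity kernel, using its
    Kelvin-transformed closed form (31): K_H(x, y) = 1 / (2 pi |y| |x - h y^KT|).
    Cost O(N*M); this is the baseline the FMM accelerates."""
    F = np.empty(len(evals))
    for j, y in enumerate(evals):
        y_kt = R**2 / np.dot(y, y) * y                   # Kelvin-transformed point
        dist = np.linalg.norm(points - h * y_kt, axis=1) # |x_i - h y^KT| for all i
        F[j] = np.sum(coeffs / dist) / (2 * np.pi * np.linalg.norm(y))
    return F
```

Since $|y|\,|x - h\,y^{KT}|$ is symmetric in $x$ and $y$ (as the series (31) shows), the kernel value is unchanged when the roles of point and evaluation point are swapped; this gives a cheap consistency check of any fast summation scheme.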

3.5 Acceleration of the Translations

Newer versions of the FMM include several ideas to reduce the numerical effort of the translations from the original $O(p^4)$ to $O(p^3)$ or even $O(p^2)$ per translation operation. This includes the ideas of [92] (see also [14, 50]) for the multipole-to-multipole (M2M) and local-to-local (L2L) steps using Wigner rotation matrices (cf., e.g., [11, 15, 18, 88]). The main point is to rotate the coordinate system such that the shift direction becomes the $\varepsilon^3$-axis, shift there, and rotate back. This reduces the numerical costs of the M2M and L2L steps from $O(p^4)$ to $O(p^3)$, since each rotation as well as the shift along the $\varepsilon^3$-axis requires an effort of $O(p^3)$. For a detailed description we refer to [92], or to [51] with all technical details of our implementation. For the M2L translation, [14, 50] have replaced this step with exponential translations, which are based on the numerical integration of the integral representation

$$\begin{aligned} \frac{1}{|x-y|} &= \frac{1}{2\pi} \int_0^{\infty} \int_0^{2\pi} e^{-\lambda (x_3 - y_3)}\, e^{i \lambda \left( (x_1 - y_1) \cos\alpha + (x_2 - y_2) \sin\alpha \right)}\, d\alpha\, d\lambda \\ &= \sum_{k=1}^{s(\varepsilon)} \frac{w_k}{M_k} \sum_{j=1}^{M_k} e^{-\lambda_k (x_3 - y_3)}\, e^{i \lambda_k \left( (x_1 - y_1) \cos\alpha_{j,k} + (x_2 - y_2) \sin\alpha_{j,k} \right)} + O(\varepsilon) \end{aligned} \tag{56}$$

for points $x, y$ whose Cartesian coordinates satisfy $1 \le x_3 - y_3 \le 4$ as well as $0 \le \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2} \le 4\sqrt{2}$. Details as well as the integration points $\lambda_k$, weights $w_k$, and numbers of points $M_k$ for the trapezoidal rule applied to the inner integral can be found in [14, 50, 94]. The accuracy $\varepsilon$ of the numerical integration is determined by the discretization parameter $s(\varepsilon) = O(p)$ of the outer integral in (56). The total number of numerical integration points, i.e., the number of exponential functions and coefficients, is supposed to be $O(p^2)$.

By Hobson's formula (cf. [58]), a multipole expansion of $F$ is transformed by (56) into a series of exponentials (multipole-to-exponential step, briefly M2X); these exponentials can be translated efficiently by the exponential-to-exponential shift (X2X). Afterwards the local coefficients are computed from the exponential coefficients (X2L). The restrictions on the positions of $x$ and $y$ mean that the exponential translations are applicable for cubes in list 2 (see Definition 8) that are situated above the current cube with another cube in between. However, by combining rotations of the multipole expansion, again using the Wigner rotation matrices, the exponential translation can substitute the M2L translation for all cubes in list 2. Therefore, the list of all well-separated cubes (list 2) is split into six directional lists (up, down, North, South, East, and West), and instead of M2L the following sequence of transformations is used: (rotation), M2X, X2X, X2L, (inverse rotation). Each exponential shift requires numerical costs of $O(p^2)$, and the rotations can be applied using $O(p^3)$ operations (as do the M2X and X2L steps). Thus, this improves the performance compared to the $O(p^4)$ effort of the M2L step. Moreover, translations can be saved by recombination (see [14, 50, 51, 53] for more on the technical details).
It should also be noted that there are several symmetries in the coefficients of the exponential expansion since we are dealing with a real-valued function F . These symmetries can be used to further reduce the constant of the numerical costs (cf. [14, 50]).
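The integral representation (56) can be checked numerically with generic quadrature; the FMM itself relies on the tailored integration points and weights of [14, 50, 94], so the following sketch only verifies the identity for one admissible point pair:

```python
import numpy as np

def pole_via_plane_waves(d, lam_max=40.0, n_lam=8001, n_alpha=256):
    """Evaluate the double integral in (56) for d = x - y with d_3 > 0.
    The alpha-integral is computed with the periodic (hence spectrally accurate)
    trapezoidal rule; the lambda-integral is truncated at lam_max, where the
    factor exp(-lam*d_3) has decayed far below machine precision."""
    lam = np.linspace(0.0, lam_max, n_lam)
    alpha = np.linspace(0.0, 2 * np.pi, n_alpha, endpoint=False)
    # inner integral: (1/2pi) int_0^{2pi} exp(i*lam*(d1*cos(a) + d2*sin(a))) da
    phase = np.outer(lam, d[0] * np.cos(alpha) + d[1] * np.sin(alpha))
    inner = np.mean(np.cos(phase), axis=1)      # imaginary parts cancel by symmetry
    f = np.exp(-lam * d[2]) * inner
    dl = lam[1] - lam[0]
    return dl * (np.sum(f) - 0.5 * (f[0] + f[-1]))   # composite trapezoidal rule

d = np.array([1.0, 0.5, 2.0])   # satisfies 1 <= d_3 <= 4
approx = pole_via_plane_waves(d)
exact = 1.0 / np.linalg.norm(d)
```

The decay condition $x_3 - y_3 \ge 1$ is what makes the $\lambda$-integrand rapidly decaying and hence cheap to discretize, which is exactly why list 2 is split into directional sublists before the exponential translation is applied.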

3.6 Parameters and Results of Our FMM Implementation

The main parameters of Algorithm 5 in the accelerated version using exponential translations as described in Sect. 3.5 are
• the exponential truncation degree $s(\varepsilon)$,
• the multipole truncation degree $p$,
• the maximal number of points/targets per cube $m$.


All three parameters depend on the choice of the reproducing kernel, and they have a huge influence on the performance of the algorithm (in particular $m$). Typically, the multipole truncation degree $p$ is chosen in accordance with $s(\varepsilon)$ for an error level $\varepsilon$. We refer to [51–54] for different kernels and also different problems (interpolation/oblique derivatives). Note that the kernels of Abel-Poisson type require a slightly higher degree and therefore a bit more numerical effort.

The maximal number of points or targets per cube $m$ has a strong influence on the adaptive octree construction and the performance of the FMM. If $m$ is too small, there are many cubes, each containing only very few points; thus, the kernel expansion coefficients no longer combine the information of enough points to be efficient. If $m$ is too large, there are only few cubes, each with a large number of points, which means that direct interaction is used far too often instead of the kernel approximation. Therefore, $m$ can be used to balance the effort of the direct interaction and the kernel approximation. It has been suggested to choose $m = O(p^{3/2})$ (cf. [14, 50]), which may serve as a guideline. For our implementation we refer to [51–54], where the results of our empirical tests are summarized.

Using these optimizations of the parameters of the FMM, its performance can be compared with direct computation, and break-even points of our implementation can be found, i.e., the minimal number of points that is necessary for our algorithm to be faster than the direct approach. Under optimal conditions this can be as low as about 2,500 points (cf. [51–54]). Note that such results are always very dependent on the implementation. In general, the Abel-Poisson kernel requires some more computational time, since the computation of its multipole coefficients is more involved. The linear asymptotic behavior which we expect from the FMM, instead of the quadratic behavior of the direct approach, can easily be established.
Finally, we present some results in Fig. 4 and the corresponding absolute error in Fig. 5. These results apply the Abel-Poisson kernel (with $R = 6{,}352$ km and $h = 0.965$) to an interpolation problem for the gravitational potential using 80,000 data points on the Earth's surface. The FMM was used with $p = 25$, $s(\varepsilon) = 26$, and $m = 240$. The octree had maximal level $L = 4$ and consisted of 1,507 cubes (1,242 leaves). In combination with a preconditioner based on an overlapping domain decomposition method (see [51–53] for further details), it took just 14 GMRES iterations to reach convergence; the use of such a preconditioner is highly recommended. Further examples, results, and applications can also be found in [51–53].

4 Parameter Choice Methods for Spline Approximation

In this section we summarize several parameter choice methods for the determination of the smoothing parameter in (26) and (28) that are known from the theory of the regularization of inverse problems. For this context the reader is referred to [6, 9], where many numerical tests for a wide range of inverse problems are also presented.

Fig. 4 Interpolating spline for the gravitational potential on the Earth's surface using Abel-Poisson kernels for 80,000 points, in m²/s² (color scale from −600 to 600)

Fig. 5 Absolute error of the interpolating spline of Fig. 4, in m²/s² (color scale from 0.05 to 0.25)


As in [54], we consider the problem of solving $Ax = y$, where $A: X \to Y$ is a linear operator. In our case $X = Y = \mathbb{R}^N$, and $A$ is the matrix of the linear system (21) (in [6, 9] this is called case C2). This is in fact not an ill-posed problem, but it can be severely ill-conditioned (depending on the distribution of the data points), and, as seen in Sect. 2.4, noisy data lead to the use of a form of Tikhonov regularization, i.e., the linear system (28) with the smoothing parameter $\beta > 0$. It should be noted that we want to use the FMM for a fast matrix-vector multiplication, i.e., $A$ is never given as an actual matrix. This excludes some of the parameter choice methods in order to keep the low computational costs provided by the FMM. Moreover, it should be pointed out that the use of an iterative solver (e.g., the CG method or GMRES) requires a stopping criterion, and the number of iterations can be seen as another regularization parameter which needs to be chosen in some way.

The smoothing parameter $\beta > 0$ is typically discretized exponentially, i.e., $\beta_n = \beta_0\, q_\beta^n$ with $q_\beta \in (0,1)$ and $n \in \mathbb{N}$ (actually only $n = 1, \ldots, n_{\max}$). The use of a discrete set of regularization parameters with a fine enough resolution does not alter the behavior of most parameter choice methods. For the efficient implementation of these methods, it is useful to have a bound on the value of the optimal parameter (i.e., a maximal regularization parameter), especially if the method minimizes some function (see also [6, 9] and Sect. 4.3).

We denote the vector of noisy data by $y^\delta$ (see Sect. 4.1), and $x_n^\delta \in X$ is the vector of spline coefficients resulting from the minimization of the smoothing functional (26) using $\beta_n$ and $y^\delta$, i.e., $x_n^\delta$ solves the system of linear equations (28):

$$x_n^\delta = (A + \beta_n I)^{-1} y^\delta = A_n^{-1} y^\delta, \tag{57}$$

where $A_n = A + \beta_n I$. Furthermore, $x_n^0 = A_n^{-1} y$ with noise-free data $y \in Y$.

4.1 Noise Models

Additive noise models, i.e., $y^\delta = y + \delta \xi$, are considered, where $\xi$ is a normalized noise element and $\delta > 0$ is the noise level. The most common noise model in the classical inverse problems literature (cf. [20]) uses $\xi \in Y$ with $\|\xi\| \le 1$, so $\|y^\delta - y\| \le \delta$. This models discretization errors, but only poorly represents random measurement errors arising in practice.

A stochastic noise model for a discrete data vector $y^\delta \in \mathbb{R}^N$ (see [91]) uses $\xi$ whose components $\xi_i$ are i.i.d. random variables with mean $\mathbb{E}\xi_i = 0$ and variance $\mathbb{E}\xi_i^2 = 1$; $\delta$ is the standard deviation of each error component $\delta \xi_i$, and $\mathbb{E}\|y^\delta - y\|^2 = \delta^2\, \mathbb{E}\|\xi\|^2 = N \delta^2$. Note that for the number of data $N \to \infty$ this is unbounded. For correlated errors, $\delta \xi$ possesses the covariance matrix $C$. If known, this matrix can be used in (28). We restrict ourselves to diagonal $C$ here in order to keep the advantages of a fast algorithm like the FMM. If $y^\delta = y + \delta \xi$ with $\xi \sim \mathcal{N}(0, I)$, the noise model corresponds to Gaussian white noise. For colored noise, if the entries


$C_{kk}$ are increasing, it is called blue noise, and, if they are decreasing, it is called red noise (see [6, 9] and the references therein for more details).

4.2 Parameter Choice Methods

A parameter choice method is a rule that assigns a value to the regularization/smoothing parameter $\beta$. In the case of a discrete set of parameters $\beta_1, \ldots, \beta_{n_{\max}}$, the method selects a value for the index, which will be denoted by $n_*$, and the corresponding parameter is $\beta_{n_*}$. Parameter choice methods can be classified into three basic types by their input (see, e.g., [8, 20]):

• a-priori methods, i.e., $n_*$ is a function of $\delta$ and of information about $x$ which is not known in practice. Thus, we do not discuss such methods here.
• a-posteriori methods, i.e., $n_* = n_*(\delta, y^\delta)$, requiring the noise level $\delta$. If $\delta$ is not known, an estimate is used.
• data-driven methods, i.e., $n_* = n_*(y^\delta)$, requiring only the data $y^\delta$ as input (sometimes called "heuristic methods"). If $y^\delta$ contains stochastic noise, then $n_*$ is a random variable.

$n_*$ is defined as either the point at which a function $F(n)$ falls below a threshold (type 1) or the minimizer of $F$ (type 2). Some methods need tuning parameters (in particular methods of type 1) and can be sensitive to this choice.

It should be pointed out that [4] states that, for an ill-posed problem, a parameter choice rule that does not explicitly use the noise level (e.g., a data-driven method) cannot yield a regularization method such that the worst-case error converges to 0 as $\delta \to 0$. This Bakushinskii veto is important for deterministic noise, but it is not really appropriate for stochastic noise (cf. [8, 10]). There are data-driven rules yielding regularization methods that converge and perform very well in practice (see also [6, 9]).

Since $x = A^{-1} y$ is unknown (noise-free data $y$ is not available), a practical parameter choice method must use some other known or easily computed/estimated quantities such as, e.g., the norm of the residual, defined as $\|y^\delta - A x_n^\delta\|$. In our case this is a Euclidean norm which is easily computed. The error $\|x - x_n^\delta\|$ is split such that

$$\|x - x_n^\delta\| \le \|x - x_n^0\| + \|x_n^0 - x_n^\delta\|. \tag{58}$$

The first term, called the regularization error, is usually bounded by a decreasing function $\varphi(n)$ reflecting smoothness properties and the so-called qualification of the regularization method (see [9, 20] and the references therein). The second term on the right-hand side of (58), called the propagated noise error, can often be bounded for regularization methods as


$$\|x_n^0 - x_n^\delta\| \le \delta\, \varrho(n), \tag{59}$$

where $\varrho$ is a known increasing function of $n$, indicating that, with less smoothing, there is more influence of the data noise (cf. [20]).

In the case of stochastic noise, the risk, i.e., the expected squared error $\mathbb{E}\|x - x_n^\delta\|^2$, is considered. For noise with zero mean, instead of (58), the risk can be decomposed exactly into the sum of a squared bias term $\|x - x_n^0\|^2$ and a variance term $\mathbb{E}\|x_n^0 - x_n^\delta\|^2$, i.e.,

$$\mathbb{E}\|x - x_n^\delta\|^2 = \|x - x_n^0\|^2 + \mathbb{E}\|x_n^0 - x_n^\delta\|^2. \tag{60}$$

The squared bias can be bounded as before and, under suitable assumptions, the variance can be expressed as $\delta^2 \varrho^2(n)$ for some increasing function $\varrho(n)$. For white noise, our Tikhonov regularized solution (57) has variance

$$\delta^2 \varrho^2(n) = \delta^2\, \mathbb{E}\|A_n^{-1} \xi\|^2 = \delta^2 \sum_k \frac{1}{(\sigma_k + \beta_n)^2}, \tag{61}$$

where the $\sigma_k$ are the singular values of $A$ (in our case they are eigenvalues of $A$). A much more detailed discussion of the above errors, including, e.g., minimax results and oracle inequalities, can be found in [5, 6, 9, 54, 66] and the references therein.

4.3 Maximal Regularization Parameter

Some parameter choice methods, e.g., the balancing principle of Sect. 4.4.2, require a maximal index $n_{\max}$ as essential input in the algorithm itself. Bauer and Lukas [9] suggest to define the maximal index as

$$n_{\max} = \max\{ n \mid \varrho(n) < 0.5\, \varrho(\infty) \}, \tag{62}$$

where $\mathbb{E}\|x_n^0 - x_n^\delta\|^2 = \delta^2 \varrho^2(n)$ and $\delta^2 \varrho^2(\infty)$ is the supremum of the variance. This allows us to expect that the optimal parameter index is smaller than $n_{\max}$. Also note that $n_{\max}$ should not be too large for an efficient computation. We obtain $n_{\max}$ either by an analytic expression for $\delta^2 \varrho^2(n)$, as in (61) for white noise, or by a good estimate of it. If there are several independent data sets available, then a good estimate (for any noise color) is

$$\delta^2 \varrho^2(n) \approx \tfrac{1}{2}\, \mathrm{mean}\big\{ \|x_{n,i}^\delta - x_{n,j}^\delta\|^2,\ i \ne j \big\}. \tag{63}$$

Often two sets of data are sufficient (see [5] for further details). If only a single data set is available, then it may not be possible to estimate $\delta^2 \varrho^2(n)$ if the noise is correlated with unknown covariance. Then one can define a maximal index $\tilde{n}_{\max}$ by $\beta_{\tilde{n}_{\max}} \approx \sigma_{\tilde{n}_{\max}}$ for our Tikhonov regularization if there are at least good estimates


of the eigenvalues $\sigma_k$ of $A$ available. However, methods that perform much worse without the use of the maximal index $n_{\max}$ may yield different results for $\tilde{n}_{\max}$ instead of $n_{\max}$ (see [6, 9] for more on this topic).
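For white noise, (61) and (62) can be evaluated directly once (estimates of) the eigenvalues of $A$ are available; a small illustrative sketch (the spectrum below is a synthetic example, not from the chapter):

```python
import numpy as np

def maximal_index(sigma, beta0=1.0, q=0.8, n_steps=60):
    """Compute rho(n) from (61) (white noise) for beta_n = beta0 * q^n and the
    maximal index n_max of (62).  rho(infinity)^2 = sum_k 1/sigma_k^2 is the
    supremum of the variance (the limit beta_n -> 0)."""
    betas = beta0 * q ** np.arange(1, n_steps + 1)
    rho = np.sqrt([np.sum(1.0 / (sigma + b) ** 2) for b in betas])
    rho_inf = np.sqrt(np.sum(1.0 / sigma ** 2))
    ok = np.nonzero(rho < 0.5 * rho_inf)[0]          # indices fulfilling (62)
    n_max = int(ok[-1]) + 1 if len(ok) else None     # 1-based index n_max
    return rho, rho_inf, n_max

# example: a mildly ill-conditioned spectrum sigma_k = 1/k^2
sigma = 1.0 / np.arange(1, 101) ** 2
rho, rho_inf, n_max = maximal_index(sigma)
```

Since $\varrho(n)$ is increasing in $n$ (the $\beta_n$ decrease), the admissible indices form a prefix, and $n_{\max}$ is simply the last index before $\varrho$ crosses half of its supremum.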

4.4 Description and Evaluation of Methods

In this section, for each method we describe its origin and idea, list the input it requires, and give the algorithm that we use. We also briefly discuss known theoretical and practical issues about the method, including the most relevant references. Several of the methods use a tuning parameter or some other parameter that must be chosen (see [6, 9] and the references therein for further details).

4.4.1 The Discrepancy Principle and Its Variants

The discrepancy principle of [77, 78] is one of the oldest and most widely used parameter choice procedures (cf. [20] and references therein). Its idea is that, for a good regularized solution, the norm of the residual should match the noise level $\delta$ of the data. The method needs the following input:

• The norms of the residuals $\{\|A x_n^\delta - y^\delta\|\}_{n \le n_{\max}}$, until a certain bound is satisfied.
• The noise level $\delta$.
• A tuning parameter $\tau \ge 1$.

In a deterministic setting with $\|y^\delta - y\| \le \delta$, the parameter choice $n_*$ is the first $n$ such that $\|A x_n^\delta - y^\delta\| \le \tau \delta$. In a stochastic setting, with the error in each element of $y^\delta \in \mathbb{R}^N$ having standard deviation $\delta$, the choice $n_*$ is the first $n$ such that

$$\|A x_n^\delta - y^\delta\| \le \tau \delta \sqrt{N}. \tag{64}$$

Originating from a deterministic setting, the discrepancy principle has also been studied in stochastic settings (see, e.g., [12, 67, 68]), for many regularization methods, and for many inverse problems. There are many results on the convergence properties of this method in both settings (see, e.g., [6, 9, 54] for an overview of the literature). The discrepancy principle is one of the fastest methods available, since one only needs to compute the residuals until the bound (64) is satisfied, which allows the use of the FMM of Sect. 3. However, its drawback is the requirement of an accurate estimate of the noise level: estimates that are just slightly off can lead to very poor solutions (see, e.g., [57]). There are also many variants of the method, such as the transformed discrepancy principle (cf. [81, 82]), the modified discrepancy principle (sometimes abbreviated as the MD rule) (cf. [19, 44, 80]), or the varying discrepancy principle (cf. [12, 65]). One drawback of these variants is that they are no longer easily compatible with the FMM and require an increased numerical effort. For comparative studies in the context of inverse problems with stochastic noise we refer, e.g., to [6, 9].
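A minimal sketch of the discrepancy principle (64) for the Tikhonov-type system (57), using a dense solver and a synthetic test problem in place of the FMM-accelerated iteration (all names and parameter values are chosen for the example only):

```python
import numpy as np

def discrepancy_principle(A, y_delta, delta, tau=1.0, beta0=1.0, q=0.7, n_max=50):
    """Return the first beta_n = beta0 * q^n (and the corresponding solution of
    (57)) with ||A x_n - y_delta|| <= tau * delta * sqrt(N), cf. (64).
    A dense solve stands in for the FMM-accelerated iterative solver."""
    N = len(y_delta)
    for n in range(1, n_max + 1):
        beta_n = beta0 * q ** n
        x_n = np.linalg.solve(A + beta_n * np.eye(N), y_delta)
        if np.linalg.norm(A @ x_n - y_delta) <= tau * delta * np.sqrt(N):
            return beta_n, x_n
    return beta0 * q ** n_max, x_n   # fall back to the least smoothing tried

# synthetic test problem: symmetric positive definite kernel matrix + white noise
rng = np.random.default_rng(0)
pts = rng.standard_normal((40, 3))
A = np.exp(-np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=-1))
x_true = rng.standard_normal(40)
delta = 1e-3
y_delta = A @ x_true + delta * rng.standard_normal(40)
beta_star, x_star = discrepancy_principle(A, y_delta, delta)
```

Since the residual $\|A x_n^\delta - y^\delta\|$ decreases monotonically as $\beta_n$ decreases, the loop stops at the first crossing of the threshold, so only the residual norms up to $n_*$ are ever needed.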


M. Gutting

4.4.2 The Balancing Principle and Its Variants

The balancing principle of [63] was originally derived for statistical estimation from direct observations in a white noise model. Since then it has been developed further for the regularization of linear and nonlinear inverse problems (see, e.g., [7, 72]) in deterministic and stochastic settings. The idea is to balance the known propagated noise error bound δϱ(n) in (59) with the unknown regularization error (58) by an adaptive procedure that employs a collection of differences of regularized solutions. As input the balancing principle needs:

• Maximal index n_max, e.g., as defined in (62).
• All regularized solutions {x_n^δ}_{n≤n_max} up to the index n_max.
• An upper bound δϱ(n) for the propagated noise error ‖x_n^0 − x_n^δ‖, or a bound or estimate δ²ϱ²(n) of the variance E‖x_n^0 − x_n^δ‖².
• Noise level δ (and the covariance in the stochastic setting, if known). Then one can use known expressions for ϱ(n). Alternatively, if one has two or more independent sets of data y_i^δ, then E‖x_n^0 − x_n^δ‖² can be estimated by (63).
• Tuning constant κ, typically κ ∈ [0.5, 1.5] (cf. [9] and the references therein).

The balancing functional b(n) compares the differences of regularized solutions ‖x_n^δ − x_m^δ‖ for n < m ≤ n_max with the propagated noise bounds κδϱ(m); the chosen index n_* is the smallest n for which all these differences are dominated by the corresponding noise bounds.

A modified form of generalized cross-validation introduces a robustness parameter c > 1; for c = 1 the method reduces to GCV. The noise level does not need to be known. The modified GCV estimate is defined by

n_* = argmin_{n≤n_max} [ ‖Ax_n^δ − y^δ‖² / (N⁻¹ tr(I − cAA_n⁻¹))² ].    (72)

For comparative studies of these variants and further details we refer to [6, 9, 54] and the references therein.
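The quotient in (72) is straightforward to evaluate for Tikhonov-type regularization, where the influence matrix is H(λ) = A(AᵀA + λI)⁻¹Aᵀ. The following sketch (our own toy problem and naming; c = 1 recovers classical GCV) picks the regularization parameter from a grid:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative ill-posed system (our own toy setup, not from the chapter).
N = 60
U, _ = np.linalg.qr(rng.standard_normal((N, N)))
V, _ = np.linalg.qr(rng.standard_normal((N, N)))
s = 2.0 ** -np.arange(N)
A = U @ np.diag(s) @ V.T
x_true = V[:, 0] + 0.5 * V[:, 1]
delta = 1e-4
y = A @ x_true + delta * rng.standard_normal(N)

def tikhonov(A, y, lam):
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)

def modified_gcv(A, y, lambdas, c=1.5):
    """lambda minimizing ||A x_lam - y||^2 / (N^-1 tr(I - c H(lam)))^2."""
    N = len(y)
    scores = []
    for lam in lambdas:
        x_lam = tikhonov(A, y, lam)
        H = A @ np.linalg.solve(A.T @ A + lam * np.eye(N), A.T)  # influence matrix
        num = np.linalg.norm(A @ x_lam - y) ** 2
        den = (np.trace(np.eye(N) - c * H) / N) ** 2
        scores.append(num / den)
    return lambdas[int(np.argmin(scores))]

lambdas = np.logspace(-12, 0, 25)
lam_star = modified_gcv(A, y, lambdas, c=1.5)   # c = 1 recovers classical GCV
err_star = np.linalg.norm(tikhonov(A, y, lam_star) - x_true)
print(lam_star, err_star)
```

No noise level enters the computation: the residual in the numerator is traded off against the effective degrees of freedom in the denominator, steering the choice away from both under- and over-smoothing.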

5 Conclusion

An iterative algorithm like, e.g., conjugate gradients or GMRES, which uses the FMM to accelerate its matrix-vector multiplications, is an efficient solution strategy. It makes it possible to treat interpolation problems and boundary value problems with many data points on regular surfaces (e.g., the actual topography of the Earth)



(see [51–53]). It should be pointed out that this spline approach is not restricted to a global treatment, but also applies to regional domains (cf. [51, 52]). This can lead to a local improvement of models in areas of particular interest. The approach can be extended to spline approximation (in particular for diagonal covariance matrices), as seen in Sect. 2.4 and at the end of Sect. 3.4. The smoothing parameter(s) plays a crucial role in this approach and must be chosen very carefully, or a lot of information is lost to over-smoothing, in particular the high-frequency details of the signal. We have presented several parameter choice methods that can be used without losing the advantages of the FMM. Their performance for the regularization of inverse problems has been investigated in several studies with different solution techniques (see, e.g., [6, 9, 55] and the references therein). Further tests of the combination of the FMM with these parameter choice methods (cf. [54] and the references therein) are an interesting challenge for the future. In particular, the interaction with stopping criteria for iterative solvers needs further investigation (cf. [55]). Note that often the solution procedure requires much more computational effort than the evaluation of the parameter choice. It can be advisable to apply or combine several parameter choice methods to obtain better results. For highly irregular distributions of data points, the spline approach reaches its limits due to large data gaps which result in severe ill-conditioning. Even smoothing splines cannot completely remedy this, though further investigation is required. However, functional matching pursuit methods (RFMP or ROFMP) can result in better approximations (see [23, 55, 75] and the references therein), but so far these algorithms incur high numerical costs.
These methods are also iterative regularizations and the combination of stopping criteria and regularization parameters has been investigated for a class of ill-posed problems in [55].

Literature

1. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions. Dover Publications, Inc., New York (1972)
2. Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Second International Symposium on Information Theory (Tsahkadsor, 1971), pp. 267–281. Akadémiai Kiadó, Budapest (1973)
3. Aronszajn, N.: Theory of reproducing kernels. Trans. Am. Math. Soc. 68, 337–404 (1950)
4. Bakushinskii, A.B.: Remarks on choosing a regularization parameter using the quasioptimality and ratio criterion. U.S.S.R. Comput. Math. Math. Phys. 24, 181–182 (1984)
5. Bauer, F.: Some considerations concerning regularization and parameter choice algorithms. Inverse Prob. 23(2), 837–858 (2007)
6. Bauer, F., Gutting, M., Lukas, M.A.: Evaluation of parameter choice methods for regularization of ill-posed problems in geomathematics. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn., pp. 1713–1774. Springer, Heidelberg (2015)
7. Bauer, F., Hohage, T.: A Lepskij-type stopping rule for regularized Newton methods. Inverse Prob. 21, 1975–1991 (2005)
8. Bauer, F., Kindermann, S.: Recent results on the quasi-optimality principle. J. Inverse Ill-Posed Prob. 17(1), 5–18 (2009)

10 Fast Harmonic/Spherical Splines and Parameter Choice Methods


9. Bauer, F., Lukas, M.A.: Comparing parameter choice methods for regularization of ill-posed problems. Math. Comput. Simul. 81(9), 1795–1841 (2011)
10. Becker, S.M.A.: Regularization of statistical inverse problems and the Bakushinskii veto. Inverse Prob. 27, 115010, 22pp (2011)
11. Biedenharn, L.C., Louck, J.D.: Angular Momentum in Quantum Physics (Theory and Application). Encyclopedia of Mathematics and Its Applications. Addison-Wesley, Reading (1981)
12. Blanchard, G., Mathé, P.: Discrepancy principle for statistical inverse problems with application to conjugate gradient iteration. Inverse Prob. 28, 115011, 23pp (2012)
13. Carrier, J., Greengard, L., Rokhlin, V.: A fast adaptive multipole algorithm for particle simulations. SIAM J. Sci. Stat. Comput. 9(4), 669–686 (1988)
14. Cheng, H., Greengard, L., Rokhlin, V.: A fast adaptive multipole algorithm in three dimensions. J. Comput. Phys. 155, 468–498 (1999)
15. Choi, C.H., Ivanic, J., Gordon, M.S., Ruedenberg, K.: Rapid and stable determination of rotation matrices between spherical harmonics by direct recursion. J. Chem. Phys. 111(19), 8825–8831 (1999)
16. Cummins, D.J., Filloon, T.G., Nychka, D.: Confidence intervals for nonparametric curve estimates: toward more uniform pointwise coverage. J. Am. Statist. Assoc. 96(453), 233–246 (2001)
17. Driscoll, J.R., Healy, D.M.: Computing Fourier transforms and convolutions on the 2-sphere. Adv. Appl. Math. 15, 202–250 (1994)
18. Edmonds, A.R.: Drehimpulse in der Quantenmechanik. Bibliographisches Institut, Mannheim (1964)
19. Engl, H.W., Gfrerer, H.: A posteriori parameter choice for general regularization methods for solving linear ill-posed problems. Appl. Numer. Math. 4(5), 395–417 (1988)
20. Engl, H.W., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Kluwer, Dordrecht (1996)
21. Epton, M.A., Dembart, B.: Multipole translation theory for the three-dimensional Laplace and Helmholtz equations. SIAM J. Sci. Comput. 16(4), 865–897 (1995)
22. Fengler, M.J.: Vector spherical harmonic and vector wavelet based non-linear Galerkin schemes for solving the incompressible Navier–Stokes equation on the sphere. Ph.D. thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern, Shaker, Aachen (2005)
23. Fischer, D., Michel, V.: Sparse regularization of inverse gravimetry – case study: spatial and temporal mass variation in South America. Inverse Prob. 28, 065012, 34pp (2012)
24. Freeden, W.: On approximation by harmonic splines. Manuscripta Geod. 6, 193–244 (1981)
25. Freeden, W.: On spherical spline interpolation and approximation. Math. Method. Appl. Sci. 3, 551–575 (1981)
26. Freeden, W.: Interpolation and best approximation by harmonic spline functions. Boll. Geod. Sci. Aff. 1, 105–120 (1982)
27. Freeden, W.: On spline methods in geodetic approximation problems. Math. Method. Appl. Sci. 4, 382–396 (1982)
28. Freeden, W.: Ein Konvergenzsatz in sphärischer Spline-Interpolation. Z. f. Vermessungswes. (ZfV) 109, 569–576 (1984)
29. Freeden, W.: Spherical spline interpolation: basic theory and computational aspects. J. Comput. Appl. Math. 11, 367–375 (1984)
30. Freeden, W.: Harmonic splines for solving boundary value problems of potential theory. In: Mason, J.C., Cox, M.G. (eds.) Algorithms for Approximation. The Institute of Mathematics and Its Applications, Conference Series, vol. 10, pp. 507–529. Clarendon Press, Oxford (1987)
31. Freeden, W.: A spline interpolation method for solving boundary value problems of potential theory from discretely given data. Numer. Methods Partial Differ. Equ. 3, 375–398 (1987)
32. Freeden, W.: Multiscale Modelling of Spaceborne Geodata. B.G. Teubner, Stuttgart/Leipzig (1999)
33. Freeden, W., Gerhards, C.: Geomathematically Oriented Potential Theory. Chapman & Hall/CRC, Boca Raton (2013)



34. Freeden, W., Gervens, T., Schreiner, M.: Constructive Approximation on the Sphere. Oxford University Press, Oxford (1998)
35. Freeden, W., Gervens, T., Schreiner, M.: Constructive Approximation on the Sphere (With Applications to Geomathematics). Oxford Science Publications, Clarendon (1998)
36. Freeden, W., Glockner, O., Schreiner, M.: Spherical panel clustering and its numerical aspects. J. Geodesy 72, 586–599 (1998)
37. Freeden, W., Gutting, M.: Special Functions of Mathematical (Geo-)Physics. Birkhäuser, Basel (2013)
38. Freeden, W., Gutting, M.: Integration and Cubature Methods: A Geomathematically Oriented Course. Chapman and Hall/CRC, Boca Raton (2017)
39. Freeden, W., Michel, V.: Multiscale Potential Theory (With Applications to Geoscience). Birkhäuser, Boston (2004)
40. Freeden, W., Nashed, M.Z. (eds.): Handbook of Mathematical Geodesy – Functional Analytic and Potential Theoretic Methods. Birkhäuser, Basel (2018)
41. Freeden, W., Nashed, M.Z., Sonar, T. (eds.): Handbook of Geomathematics, 2nd edn. Springer, Heidelberg (2015)
42. Freeden, W., Schreiner, M.: Special functions in mathematical geosciences: an attempt at a categorization. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 1st edn., pp. 925–948. Springer, Heidelberg (2010)
43. Freeden, W., Schreiner, M., Franke, R.: A survey on spherical spline approximation. Surv. Math. Ind. 7, 29–85 (1997)
44. Gfrerer, H.: An a posteriori parameter choice for ordinary and iterated Tikhonov regularization of ill-posed problems leading to optimal convergence rates. Math. Comput. 49(180), 507–522 (1987)
45. Girard, D.: A fast "Monte-Carlo cross-validation" procedure for large least squares problems with noisy data. Numer. Math. 56(1), 1–23 (1989)
46. Glockner, O.: On numerical aspects of gravitational field modelling from SST and SGG by harmonic splines and wavelets (with application to CHAMP data). Ph.D. thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern. Shaker, Aachen (2002)
47. Greengard, L.: The Rapid Evaluation of Potential Fields in Particle Systems. MIT Press, Cambridge (1988)
48. Greengard, L., Rokhlin, V.: A fast algorithm for particle simulations. J. Comput. Phys. 73(1), 325–348 (1987)
49. Greengard, L., Rokhlin, V.: Rapid evaluation of potential fields in three dimensions. In: Anderson, C., Greengard, L. (eds.) Vortex Methods, pp. 121–141. Springer, Berlin/New York (1988)
50. Greengard, L., Rokhlin, V.: A new version of the fast multipole method for the Laplace equation in three dimensions. Acta Numer. 6, 229–269 (1997)
51. Gutting, M.: Fast multipole methods for oblique derivative problems. Ph.D. thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern. Shaker, Aachen (2007)
52. Gutting, M.: Fast multipole accelerated solution of the oblique derivative boundary value problem. GEM Int. J. Geom. 3(2), 223–252 (2012)
53. Gutting, M.: Fast spherical/harmonic spline modeling. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn., pp. 2711–2746. Springer, Heidelberg (2015)
54. Gutting, M.: Parameter choices for fast harmonic spline approximation. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy – Functional Analytic and Potential Theoretic Methods, pp. 605–639. Birkhäuser, Basel (2018)
55. Gutting, M., Kretz, B., Michel, V., Telschow, R.: Study on parameter choice methods for the RFMP with respect to downward continuation. Front. Appl. Math. Stat. 3, 1–17 (2017). https://doi.org/10.3389/fams.2017.00010
56. Hansen, P.C.: Analysis of discrete ill-posed problems by means of the L-curve. SIAM Rev. 34(4), 561–580 (1992)
57. Hansen, P.C.: Rank-Deficient and Discrete Ill-Posed Problems. SIAM, Philadelphia (1998)



58. Hobson, E.W.: The Theory of Spherical and Ellipsoidal Harmonics (Second Reprint). Chelsea Publishing Company, New York (1965)
59. Hutchinson, M.: A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines. Commun. Stat. Simul. Comput. 18(3), 1059–1076 (1989)
60. Keiner, J., Kunis, S., Potts, D.: Fast summation of radial functions on the sphere. Computing 78, 1–15 (2006)
61. Kellogg, O.D.: Foundations of Potential Theory. Springer, Berlin/Heidelberg/New York (1967)
62. Kindermann, S., Neubauer, A.: On the convergence of the quasioptimality criterion for (iterated) Tikhonov regularization. Inverse Probl. Imaging 2(2), 291–299 (2008)
63. Lepskij, O.: On a problem of adaptive estimation in Gaussian white noise. Theory Probab. Appl. 35(3), 454–466 (1990)
64. Lu, S., Mathé, P.: Heuristic parameter selection based on functional minimization: optimality and model function approach. Math. Comput. 82(283), 1609–1630 (2013)
65. Lu, S., Mathé, P.: Discrepancy based model selection in statistical inverse problems. J. Complex. 30(3), 290–308 (2014)
66. Lukas, M.A.: Convergence rates for regularized solutions. Math. Comput. 51(183), 107–131 (1988)
67. Lukas, M.A.: On the discrepancy principle and generalised maximum likelihood for regularisation. Bull. Aust. Math. Soc. 52(3), 399–424 (1995)
68. Lukas, M.A.: Comparisons of parameter choice methods for regularization with discrete noisy data. Inverse Prob. 14(1), 161–184 (1998)
69. Lukas, M.A.: Robust generalized cross-validation for choosing the regularization parameter. Inverse Prob. 22(5), 1883–1902 (2006)
70. Lukas, M.A.: Strong robust generalized cross-validation for choosing the regularization parameter. Inverse Prob. 24, 034006, 16pp (2008)
71. Magnus, W., Oberhettinger, F., Soni, R.P.: Formulas and Theorems for the Special Functions of Mathematical Physics. Die Grundlehren der mathematischen Wissenschaften, vol. 52, 3rd edn. Springer, New York (1966)
72. Mathé, P., Pereverzev, S.V.: Regularization of some linear ill-posed problems with discretized random noisy data. Math. Comput. 75(256), 1913–1929 (2006)
73. Michel, V.: Tomography: problems and multiscale solutions. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 1st edn., pp. 949–972. Springer, Heidelberg (2010)
74. Michel, V.: Lectures on Constructive Approximation – Fourier, Spline, and Wavelet Methods on the Real Line, the Sphere, and the Ball. Birkhäuser, Boston (2013)
75. Michel, V.: RFMP – an iterative best basis algorithm for inverse problems in the geosciences. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn., pp. 2121–2147. Springer, Heidelberg (2015)
76. Moritz, H.: Classical physical geodesy. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 1st edn., pp. 127–158. Springer, Heidelberg (2010)
77. Morozov, V.A.: On the solution of functional equations by the method of regularization. Soviet Math. Dokl. 7, 414–417 (1966)
78. Phillips, D.: A technique for the numerical solution of certain integral equations of the first kind. J. Assoc. Comput. Mach. 9, 84–97 (1962)
79. Potts, D., Steidl, G.: Fast summation at nonequispaced knots by NFFTs. SIAM J. Sci. Comput. 24(6), 2013–2037 (2003)
80. Raus, T.: On the discrepancy principle for the solution of ill-posed problems. Uch. Zap. Tartu. Gos. Univ. 672, 16–26 (1984)
81. Raus, T.: An a posteriori choice of the regularization parameter in case of approximately given error bound of data. In: Pedas, A. (ed.) Collocation and Projection Methods for Integral Equations and Boundary Value Problems, pp. 73–87. Tartu University, Tartu (1990)
82. Raus, T.: About regularization parameter choice in case of approximately given error bounds of data. In: Vainikko, G. (ed.) Methods for Solution of Integral Equations and Ill-Posed Problems, pp. 77–89. Tartu University, Tartu (1992)



83. Robinson, T., Moyeed, R.: Making robust the cross-validatory choice of smoothing parameter in spline smoothing regression. Commun. Stat. Theory Methods 18(2), 523–539 (1989)
84. Rokhlin, V.: Rapid solution of integral equations of classical potential theory. J. Comput. Phys. 60, 187–207 (1985)
85. Shure, L., Parker, R.L., Backus, G.E.: Harmonic splines for geomagnetic modelling. Phys. Earth Planet. Inter. 28, 215–229 (1982)
86. Tikhonov, A., Arsenin, V.: Solutions of Ill-Posed Problems. Wiley, New York (1977)
87. Tikhonov, A., Glasko, V.: Use of the regularization method in non-linear problems. U.S.S.R. Comput. Math. Math. Phys. 5(3), 93–107 (1965)
88. Varshalovich, D.A., Moskalev, A.N., Chersonskij, V.K.: Quantum Theory of Angular Momentum. World Scientific, Singapore (1988)
89. Wahba, G.: Practical approximate solutions to linear operator equations when the data are noisy. SIAM J. Numer. Anal. 14(4), 651–667 (1977)
90. Wahba, G.: Spline interpolation and smoothing on the sphere. SIAM J. Sci. Stat. Comput. 2, 5–16 (1981). Also errata: SIAM J. Sci. Stat. Comput. 3, 385–386
91. Wahba, G.: Spline Models for Observational Data. SIAM, Philadelphia (1990)
92. White, C.A., Head-Gordon, M.: Rotating around the quartic angular momentum barrier in fast multipole method calculations. J. Chem. Phys. 105(12), 5061–5067 (1996)
93. Yamabe, H.: On an extension of the Helly's theorem. Osaka Math. J. 2(1), 15–17 (1950)
94. Yarvin, N., Rokhlin, V.: Generalized Gaussian quadratures and singular value decomposition of integral equations. SIAM J. Sci. Comput. 20(2), 699–718 (1998)

11 Numerical Methods for Solving the Oblique Derivative Boundary Value Problems in Geodesy

Róbert Čunderlík, Marek Macák, Matej Medľa, Karol Mikula, and Zuzana Minarechová

Contents
1 Introduction ........................................ 576
2 Formulation of the Oblique Derivative Boundary Value Problem ........................................ 578
3 Numerical Solution by the Boundary Element Method ........................................ 580
4 Numerical Solution by the Finite Volume Method ........................................ 589
5 The Oblique Derivative Boundary Condition in the Oblique Derivative Boundary Value Problem ........................................ 591
6 Summary ........................................ 619
Literature ........................................ 619

Abstract We present various numerical approaches for solving the oblique derivative boundary value problem. First, we describe a numerical solution by the boundary element method, where the oblique derivative is treated by its decomposition into the normal and tangential components. The derived boundary integral equation is discretized using the collocation technique with linear basis functions. Then we present a solution by the finite volume method on and above the Earth's surface. In this case, the oblique derivative in the boundary condition is treated in three different ways, namely (i) by an approach where the oblique

This chapter is part of the series Handbuch der Geodäsie, volume "Mathematische Geodäsie/Mathematical Geodesy", edited by Willi Freeden, Kaiserslautern.

R. Čunderlík · M. Macák · M. Medľa · K. Mikula · Z. Minarechová
Faculty of Civil Engineering, Department of Mathematics and Descriptive Geometry, Slovak University of Technology, Bratislava, Slovakia
E-Mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020
W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_105




derivative is decomposed into normal and two tangential components which are then approximated by means of numerical solution values; (ii) by an approach based on the first order upwind scheme; and finally (iii) by a method for constructing non-uniform hexahedron 3D grids above the Earth's surface and applying the higher order upwind scheme. Each of the proposed approaches is tested by the so-called experimental order of convergence. Numerical experiments on synthetic data aim to demonstrate their efficiency.

Zusammenfassung This contribution deals with various numerical methods for the oblique derivative boundary value problem of geodesy. First, a numerical solution of the boundary value problem by boundary element methods is described, in which the oblique derivative is decomposed into normal and tangential components. The resulting boundary integral equation is discretized by a collocation technique using linear basis functions. This is followed by a solution approach using a finite volume technique on and above the Earth's surface. In this case the oblique derivative is treated in three different ways, namely (i) by a decomposition approach into a normal and two tangential components, which are then approximated by means of numerical solution values, (ii) by an approach implementing a first order upwind scheme, and (iii) by a methodology for constructing non-uniform hexahedral 3D grids above the Earth's surface together with a higher order upwind scheme. Each of the proposed approaches is investigated numerically with respect to its efficiency.

Keywords Geodetic boundary value problem · Oblique derivative boundary condition · Boundary element method · Finite volume method · Numerical solution · Global gravity field modelling · Local gravity field modelling · Upwind method · Advection equation · Evolving surfaces

1 Introduction

A determination of the Earth’s gravity field is usually formulated in terms of the geodetic boundary value problems (BVPs). A combination of terrestrial gravimetric measurements and precise 3D positioning by GNSS directly yields gravity disturbances. They naturally lead to boundary conditions (BC) of the fixed gravimetric boundary value problem (FGBVP), namely to the oblique derivative BC. Hence, from the mathematical point of view, the FGBVP represents an exterior oblique



derivative geodetic BVP for the Laplace equation, cf. Koch and Pope [33], Freeden and Kersten [19], Bjerhammar and Svensson [8], Holota [28]. Classically, a solution procedure for the oblique derivative problem has been based on integral equations using the single-layer potential, cf. Bitzadse [7], Miranda [49]. Koch and Pope [33] applied such an integral equation procedure to solve the FGBVP. However, the strong nature of the singularities demanding Cauchy's principal integral values turned out to be a serious obstacle, see Freeden and Gerhards [18]. Later, Freeden and Kersten [20] proposed a new concept of approximations using the generalized Fourier expansions to transfer strongly singular integrals into regular ones, and this approach has been further developed, e.g., in Freeden [17], Bauer [5], Gutting [24–26], Freeden and Michel [21], Freeden and Gerhards [18]. Recently, Freeden and Nutz [22] published the conceptual setup of the Runge-Walsh theorem for the oblique derivative problem of physical geodesy. The development of high performance computing facilities has brought new opportunities for numerical solutions of the geodetic BVPs as well. Efficient numerical methods such as the boundary element method (BEM), the finite element method (FEM) or the finite volume method (FVM) can also be applied for global as well as local gravity field modelling. Among various approaches, we distinguish between solutions to geodetic BVPs on infinite domains, see e.g., Holota [28], Klees et al. [32], Nesvadba et al. [50], Čunderlík et al. [11], Holota and Nesvadba [29], and on finite domains, cf. Fašková et al. [16], Minarechová et al. [48], Macák et al. [38]. In the case of BEM, several papers have been published; here we mention only a few of them, i.e., Klees [31], Lehmann and Klees [35], Klees et al. [32], Čunderlík et al. [11] or Čunderlík and Mikula [12]. The oblique derivative problem treated by BEM is discussed in Čunderlík et al. [13].
The FEM applied to gravity field modelling has been studied in Meissl [45], Shaofeng and Dingbo [53] or Fašková et al. [16]. The first application of FVM was introduced by Fašková [15] and its parallel implementation by Minarechová et al. [48]. However, both papers studied the geodetic BVP with the Neumann BC. The first insight into FVM applied to the oblique derivative BVP was given in [39]. Later this effort was further developed in Macák et al. [38, 40], where treatments of the oblique derivative by the central scheme and by the first order upwind scheme [36], respectively, were developed for solving FGBVPs on uniform grids above the ellipsoid. Recently, Medľa et al. [44] presented the FVM on non-uniform grids, where a discretization method based on an evolution of the Earth's surface has been developed. In this way, one obtains a more regular non-uniform 3D hexahedron grid. Numerically, an innovative higher order upwind method for non-uniform grids has been applied. In this chapter, we formulate the oblique derivative BVP and present concise solutions by two numerical methods, namely



• the boundary element method,
• the finite volume method.

In case of the BEM, the oblique derivative in the BC is decomposed into one normal and two tangential components that are expressed through the gradients of the unknown disturbing potential. In case of the FVM, the oblique derivative in the BC is treated in three different ways, i.e.,

• the oblique derivative is decomposed into normal and two tangential components which are then approximated by means of numerical solution values using the central scheme applied on uniform grids,
• using the first order upwind scheme that is applied on uniform grids,
• using the higher order upwind scheme that is applied on non-uniform grids.

Each proposed approach is tested by various representative numerical experiments.

2 Formulation of the Oblique Derivative Boundary Value Problem

Let us consider the FGBVP, cf. [8, 28, 33]:

ΔT(x) = 0,  x ∈ R³ − S,    (1)
∇T(x) · s(x) = −δg(x),  x ∈ ∂S,    (2)
T(x) → 0  as |x| → ∞,    (3)

where S is the Earth, T(x) is the disturbing potential defined as the difference between the real and the normal gravity potential at any point x = (x, y, z), δg(x) is the gravity disturbance, and s(x) = −∇U(x)/|∇U(x)| is the unit vector normal to the equipotential surface of the normal potential U(x) at any point x. Equations (1), (2), and (3) represent an exterior BVP for the Laplace equation, i.e., the computational domain (outside the Earth) is infinite. Among the aforementioned numerical methods, it is natural to apply the BEM, which is suitable for exterior BVPs since it reduces the problem from the 3D infinite domain onto its "2D" boundary. On the contrary, the FVM requires a discretization of the whole computational domain into finite volumes. To that end we construct a domain Ω in the external space above the Earth, see [16]. Such a domain Ω (Fig. 1) is bounded by the bottom surface Γ ⊂ ∂Ω representing the Earth's surface and an upper surface created at an appropriate altitude, e.g., at the mean altitude of the GOCE satellite orbits.
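For the simple spherical model T(x) = GM/|x| (an illustrative choice of ours, with the radial unit vector playing the role of s(x)), the three conditions (1), (2), (3) can be verified numerically:

```python
import numpy as np

GM = 3.986004418e14   # geocentric gravitational constant in m^3/s^2

def T(x):
    """Spherical disturbing-potential-like model T(x) = GM/|x| (illustrative)."""
    return GM / np.linalg.norm(x)

def grad_T(x, h=10.0):
    # gradient by central differences with step h (metres)
    g = np.zeros(3)
    for i in range(3):
        e = np.zeros(3); e[i] = h
        g[i] = (T(x + e) - T(x - e)) / (2 * h)
    return g

def laplacian_T(x, h=10.0):
    out = 0.0
    for i in range(3):
        e = np.zeros(3); e[i] = h
        out += T(x + e) - 2 * T(x) + T(x - e)
    return out / h**2

x = np.array([4.0e6, 3.0e6, 2.0e6])    # a point outside the Earth
r = np.linalg.norm(x)
s_vec = x / r                          # radial direction as s(x)

lap = laplacian_T(x)                   # Eq. (1): should vanish
dg = -(grad_T(x) @ s_vec)              # Eq. (2): delta g = -grad T . s
decay = T(100 * x) / T(x)              # Eq. (3): decay towards infinity
print(lap, dg, GM / r**2, decay)
```

For this model the recovered gravity disturbance equals GM/r², and the potential decays like 1/|x|, consistent with the regularity condition (3).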



Fig. 1 The bounded domain Ω – global gravity field modelling

Fig. 2 The bounded domain Ω – local gravity field modelling

In case of local gravity field modelling, see Fig. 2, we choose a part of the Earth's surface and add side boundaries. Then on the top and side boundaries the Dirichlet-type BC for the disturbing potential can be generated from any GOCE-based satellite-only geopotential model.



In the bounded domain Ω, we consider the following BVP:

ΔT(x) = 0,  x ∈ Ω ⊂ R³,    (4)
∇T(x) · s(x) = −δg(x),  x ∈ Γ ⊂ ∂Ω,    (5)
T(x) = T_SAT(x),  x ∈ ∂Ω − Γ,    (6)

where Γ ⊂ ∂Ω represents the Earth topography or its part, i.e., the bottom boundary; ∂Ω − Γ represents the top boundary together with the side boundaries (in case of local gravity field modelling); and T_SAT is the disturbing potential generated from any GOCE-based satellite-only geopotential model. In the case that the Dirichlet and the oblique derivative BC are obtained from different sources, a problem with the compatibility of the BCs can arise on the edge where the bottom and side boundaries meet. Then the Dirichlet BC (6) is prescribed also in a narrow band of the bottom boundary along this edge, i.e., Γ is given by the bottom part of ∂Ω minus the narrow band.
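To illustrate the structure of such a mixed problem, the following small 2D finite-difference sketch (our own simplification; the chapter's FVM schemes are far more elaborate, and a normal-derivative condition stands in for the oblique one) solves the Laplace equation with a derivative condition on the bottom boundary and Dirichlet data elsewhere, using the harmonic function u = sin(x)·sinh(y) as reference:

```python
import numpy as np

# Harmonic reference solution u = sin(x) * sinh(y) on the unit square.
n = 33
h = 1.0 / (n - 1)
x = np.linspace(0, 1, n)
y = np.linspace(0, 1, n)
X, Y = np.meshgrid(x, y, indexing="ij")
u_exact = np.sin(X) * np.sinh(Y)

u = np.zeros((n, n))
# Dirichlet data on the side and top boundaries (role of T_SAT in (6)).
u[0, :], u[-1, :], u[:, -1] = u_exact[0, :], u_exact[-1, :], u_exact[:, -1]
g = np.sin(x)          # derivative data du/dy on the bottom boundary (role of (5))

for _ in range(4000):  # Jacobi sweeps for the 5-point Laplacian
    u_new = u.copy()
    u_new[1:-1, 1:-1] = 0.25 * (u[2:, 1:-1] + u[:-2, 1:-1]
                                + u[1:-1, 2:] + u[1:-1, :-2])
    # first-order one-sided derivative condition: (u[i,1] - u[i,0]) / h = g[i]
    u_new[1:-1, 0] = u_new[1:-1, 1] - h * g[1:-1]
    u = u_new

err = np.max(np.abs(u - u_exact))
print(err)
```

Even this crude one-sided treatment of the derivative boundary condition converges to the harmonic reference solution; the chapter's later sections refine exactly this ingredient (decomposition, upwinding, non-uniform grids).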

3 Numerical Solution by the Boundary Element Method

An objective of the BEM is to replace a partial differential equation solved in a 3D domain by an equivalent equation that gives the solution on the boundary of the domain only (cf. [9, 27, 37, 52]). There are two fundamental approaches to derive an integral formulation of the Laplace equation on the boundary. The first one is often called the direct method, i.e., the integral equations can be derived through an application of Green's third identity. The second one is called the indirect method and is based on the assumption that harmonic functions can be expressed in terms of a single-layer or double-layer potential generated by continuous source density functions defined on the boundary. However, such source densities usually have a formal character without a direct physical relation to the problem. Therefore, in the following we focus on the direct BEM formulation, where values of the function and its normal derivative over the boundary play the role of the source densities in generating the harmonic function over the whole solution domain [9].

3.1 Boundary Integral Equation for the Fixed Gravimetric BVP

A main advantage of BEM is the fact that only the boundary of the solution domain requires a division into its elements (Fig. 3), so the dimension of the problem is reduced by one. The direct BEM formulation applied to the Laplace equation (1) results in the boundary integral equation (BIE), [9]



Fig. 3 The discretization of the Earth’s surface by the global triangulation

(1/2) T(x) + ∫_Γ T(y) (∂G/∂n)(x, y) dyΓ = ∫_Γ (∂T/∂n)(y) G(x, y) dyΓ,    (7)

where Γ is the boundary, dΓ is the area element and the kernel function G represents the fundamental solution of the Laplace equation,

G(x, y) = 1 / (4π |x − y|),  x, y ∈ R³.    (8)
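A quick numerical sanity check (our own, not part of the chapter) confirms the two defining properties of the kernel (8): harmonicity away from the singularity and unit flux through any enclosing sphere:

```python
import numpy as np

def G(x, y):
    """Fundamental solution of the Laplace equation, Eq. (8)."""
    return 1.0 / (4.0 * np.pi * np.linalg.norm(x - y))

y0 = np.zeros(3)
x0 = np.array([1.0, 0.5, -0.3])
h = 1e-3

# Harmonicity: finite-difference Laplacian of G(., y0) away from x = y0.
lap = sum(G(x0 + h * e, y0) - 2 * G(x0, y0) + G(x0 - h * e, y0)
          for e in np.eye(3)) / h**2

# Gauss' law: dG/dr on a sphere of radius R times its area gives -1.
R = 2.0
er = np.array([1.0, 0.0, 0.0])
dGdr = (G((R + h) * er, y0) - G((R - h) * er, y0)) / (2 * h)
flux = dGdr * 4.0 * np.pi * R**2
print(lap, flux)
```

The second property is what makes G the correct kernel for representing potentials of point sources in the direct BEM formulation.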

The term ∂T/∂n in BIE (7) represents the normal derivative, while the FGBVP includes the oblique derivative BC. In order to derive BIE (7) for the oblique derivative problem we follow the idea described in [3]. First, we decompose the vector ∇T into the normal and tangential components:



$$\nabla T = (\nabla T\cdot n)\,n + (\nabla T\cdot t)\,t + (\nabla T\cdot f)\,f, \qquad (9)$$

where n is the unit normal vector and t and f are the unit tangential vectors; together they form a local orthonormal basis. Then the oblique derivative term can be written in the form

$$\nabla T\cdot s = (\nabla T\cdot n)(n\cdot s) + (\nabla T\cdot t)(t\cdot s) + (\nabla T\cdot f)(f\cdot s). \qquad (10)$$

From this equation we express the normal derivative

$$\frac{\partial T}{\partial n} = \nabla T\cdot n = \frac{1}{n\cdot s}\left[\nabla T\cdot s - (\nabla T\cdot t)(t\cdot s) - (\nabla T\cdot f)(f\cdot s)\right]. \qquad (11)$$

Inserting Eq. (11) into Eq. (7) we get the BIE for our oblique derivative problem,

$$\frac{1}{2}\,T(x) + \int_{\Gamma} T(y)\,\frac{\partial G}{\partial n}(x,y)\,d_y\Gamma + \int_{\Gamma} \frac{\big((\nabla T\cdot t)\,t\cdot s\big)(y)}{n\cdot s}\,G(x,y)\,d_y\Gamma + \int_{\Gamma} \frac{\big((\nabla T\cdot f)\,f\cdot s\big)(y)}{n\cdot s}\,G(x,y)\,d_y\Gamma = \int_{\Gamma} \frac{(\nabla T\cdot s)(y)}{n\cdot s}\,G(x,y)\,d_y\Gamma, \qquad x \in \Gamma. \qquad (12)$$

The term ∇T · s on the right-hand side of BIE (12) represents the oblique derivative BC (2) and thus can be replaced by negative values of the input surface gravity disturbances. Then BIE (12) represents the direct BEM formulation for FGBVP defined by Eqs. (1), (2), and (3).
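The decomposition (9)–(11) is a purely algebraic identity and can be checked numerically. In the following sketch (our addition; the frame n, t, f and the vectors are arbitrary illustrative choices, assuming NumPy) the normal derivative recovered via Eq. (11) matches ∇T·n exactly:

```python
import numpy as np

# a local orthonormal frame (n, t, f) and an oblique unit vector s
n = np.array([0.0, 0.0, 1.0])
t = np.array([1.0, 0.0, 0.0])
f = np.array([0.0, 1.0, 0.0])
s = np.array([1.0, 2.0, 2.0]) / 3.0   # unit vector with n.s = 2/3 != 0

gradT = np.array([3.0, -1.0, 2.0])    # an arbitrary gradient vector

lhs = gradT @ n                        # the normal derivative dT/dn
rhs = (gradT @ s - (gradT @ t)*(t @ s) - (gradT @ f)*(f @ s)) / (n @ s)

print(lhs, rhs)   # both equal 2.0
```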

3.2 Collocation with Linear Basis Function

As a numerical technique to discretize BIE (12) we use the collocation method with linear basis functions (denoted as the C^1 collocation). The Earth's surface as the boundary of the domain is approximated by a triangulation of the topography expressed as a set of panels ΔΓ_j (Fig. 3), i.e., $\Gamma = \bigcup_{j=1}^{N_P} \Delta\Gamma_j$. The vertices x_1, …, x_N of the triangles represent the nodes, i.e., the collocation points. The C^1 collocation involves a piecewise linear representation of the boundary functions T and δg on planar triangles [9],

$$T(x) \approx \sum_{k=1}^{3} T_k\,\psi_k(x), \qquad x \in \Delta\Gamma_j, \qquad (13)$$

$$\delta g(x) \approx \sum_{k=1}^{3} \delta g_k\,\psi_k(x), \qquad x \in \Delta\Gamma_j, \qquad (14)$$

where T_k and δg_k for k = 1, 2, 3 represent the values of the boundary functions at the vertices of the triangle ΔΓ_j. The linear basis functions ψ_1, ψ_2, …, ψ_N are given by

$$\psi_j(x_i) = 1, \qquad x_i = x_j, \qquad (15)$$

$$\psi_j(x_i) = 0, \qquad x_i \neq x_j, \qquad (16)$$

where i = 1, …, N; j = 1, …, N and N is the number of collocation points. These approximations allow us to reduce the original BIE (7) to a discrete form (more details can be found in [11]). In the case of BIE (12) for the oblique derivative, the contribution of the tangential components is expressed through the gradients of the unknown disturbing potential. In the C^1 collocation they can be expressed through the gradients of the linear basis functions,

$$\nabla T(x) \approx \sum_{k=1}^{3} T_k\,\nabla\psi_k(x), \qquad x \in \Delta\Gamma_j, \qquad (17)$$

where T_k are the unknown values of the disturbing potential at the collocation points that represent the vertices of the triangle ΔΓ_j and ψ_k are the linear basis functions at these points. Since the gradient of the linear basis function ∇ψ_j is constant on the whole triangle ΔΓ_j, it can be expressed using the Green theorem,

$$\nabla\psi_j = \frac{1}{m(\Delta\Gamma_j)} \int_{\Delta\Gamma_j} \nabla\psi_j \, d\Delta\Gamma = \frac{1}{m(\Delta\Gamma_j)} \int_{\partial\Gamma_j} \psi_j\,\eta \, d\partial\Gamma, \qquad (18)$$

where m(ΔΓ_j) is the area of the triangle ΔΓ_j and η is the outward normal vector to its sides ∂Γ_j. Considering the fact that the j-th linear basis function equals 1 at the j-th collocation point and 0 at the other vertices of the triangle ΔΓ_j, i.e., at the m-th and k-th collocation points, Eq. (18) can be simplified into the form

$$\nabla\psi_j = \frac{1}{2\,m(\Delta\Gamma_j)}\left(l_{jm}\,\eta_{jm} + l_{jk}\,\eta_{jk}\right), \qquad (19)$$

where l_{jm} and l_{jk} are the lengths of the sides of the triangle intersecting at the j-th collocation point, and η_{jm} and η_{jk} are the outward normal vectors to these sides, see Fig. 4. Considering all these approximations we get a discrete form of BIE (12) at each collocation point i.
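The edge-based formula (19) can be checked against a direct computation of the gradient of a linear basis function. The following sketch (our illustrative addition for a planar triangle in 2D; assumes NumPy) compares Eq. (19) with the gradient obtained by solving for the affine function that equals 1 at vertex j and 0 at the other two vertices:

```python
import numpy as np

p = np.array([[0.0, 0.0], [2.0, 0.0], [0.5, 1.5]])   # triangle vertices
j = 0                                                 # basis function psi_j

# triangle area via the 2D cross product of two edge vectors
A = 0.5 * abs((p[1, 0] - p[0, 0]) * (p[2, 1] - p[0, 1])
              - (p[1, 1] - p[0, 1]) * (p[2, 0] - p[0, 0]))

# --- Eq. (19): sum of (side length x outward unit normal) over the two sides at j
grad19 = np.zeros(2)
for other in [(j + 1) % 3, (j + 2) % 3]:
    e = p[other] - p[j]                # side meeting at vertex j
    nrm = np.array([e[1], -e[0]])      # |e| times a unit normal of that side
    third = p[3 - j - other]           # the remaining vertex
    if nrm @ (third - p[j]) > 0:       # orient the normal outward
        nrm = -nrm
    grad19 += nrm                      # this is l * eta for that side
grad19 /= 2.0 * A

# --- direct computation: psi_j(x, y) = c0 + c1*x + c2*y with nodal values e_j
V = np.column_stack([np.ones(3), p])
c = np.linalg.solve(V, np.eye(3)[j])
grad_direct = c[1:]

print(grad19, grad_direct)   # identical up to round-off
```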

Fig. 4 Triangle with sides intersecting at the j-th collocation point and the normal vectors η_{jm} and η_{jk} to these sides

$$c_i\,T_i + \sum_{j=1}^{N} \int_{\operatorname{supp}\psi_j} T_j \left[\frac{\partial G_{ij}}{\partial n}\,\psi_j + (\nabla\psi_j\cdot t_j)\,\frac{s_j\cdot t_j}{n_j\cdot s_j}\,G_{ij} + (\nabla\psi_j\cdot f_j)\,\frac{s_j\cdot f_j}{n_j\cdot s_j}\,G_{ij}\right] d\Gamma_j = \sum_{j=1}^{N} \int_{\operatorname{supp}\psi_j} \delta g_j\,G_{ij}\,\frac{\psi_j}{n_j\cdot s_j}\, d\Gamma_j, \qquad i = 1, \dots, N, \qquad (20)$$

where supp ψ_j is the support of the j-th basis function. The coefficient c_i represents a "spatial segment" bounded by the triangles joined at the i-th collocation point. In the case of the linear basis functions, it can be evaluated by the expression [42]

$$c_i = \frac{1}{4\pi}\left[2\pi + \sum_{s=1}^{S} \operatorname{sgn}\big(\rho_i\cdot(n_s\times n_{s+1})\big)\,\arccos(n_s\cdot n_{s+2})\right], \qquad (21)$$

where ρ_i is the distance vector at the i-th collocation point, n_s is the unit normal vector to the s-th triangle of supp ψ_i, and S is the number of triangles in supp ψ_i. Equations (20) represent a system of linear equations that can be rewritten in the matrix-vector form

$$\mathbf{M}\,\mathbf{t} = \mathbf{L}\,\boldsymbol{\delta g}, \qquad (22)$$

where t = (T_1, …, T_N)^T and δg = (δg_1, …, δg_N)^T. The coefficients of the matrices M and L represent integrals that need to be computed using an appropriate discretization of the integral operators in (20). The discretization of the integral


operators is affected by the weak singularity of the kernel functions. The integrals with regular integrands, which yield the non-diagonal coefficients, are approximated by Gaussian quadrature rules defined on a triangle [34]. Their discrete form is given by

$$L_{ij} = \frac{1}{4\pi} \sum_{s=1}^{S} \frac{A_{j_s}}{\cos\varphi_{j_s}} \sum_{k=1}^{K} \frac{\psi_k\,w_k}{r_{ik_s}}, \qquad i \neq j, \qquad (23)$$

$$M_{ij} = \frac{1}{4\pi} \sum_{s=1}^{S} A_{j_s} \left[\,k_{ij_s} \sum_{k=1}^{K} \frac{\psi_k\,w_k}{r_{ik_s}^{3}} + \frac{\nabla\psi_{j_s}\cdot t_j + \nabla\psi_{j_s}\cdot f_j}{\cos\varphi_{j_s}} \sum_{k=1}^{K} \frac{w_k}{r_{ik_s}}\right], \qquad i \neq j, \qquad (24)$$

where A_{j_s} is the area of the s-th triangular element of supp ψ_j, k_{ij_s} is the distance from the i-th collocation point to the plane represented by this triangular element, K is the number of points used for the Gaussian quadrature with their corresponding weights w_k and linear basis functions ψ_k, and r_{ik_s} is the distance from the i-th collocation point to the k-th quadrature point of the s-th triangular element. The j-th component of the vector δg in (22) corresponds to the input value of the measured surface gravity disturbance δg at the j-th collocation point. The term cos φ_{j_s} represents the projection of the unit vector s at the j-th collocation point onto the normal vector n of the s-th triangular element of supp ψ_j. The non-regular integrals (singular elements) arise only for the diagonal components of the linear system. They require special evaluation techniques in order to handle the singularity of the kernel function. Thanks to the diagonal component c_i and the orthogonality of the normal to its planar triangular element, the singular element is represented by the spatial segment [3],

$$M_{ii} = c_i. \qquad (25)$$
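The Gaussian quadrature rules on a triangle used in Eqs. (23) and (24) map barycentric quadrature points to the physical element. A minimal sketch (our addition, using the classical 3-point degree-2 rule and assuming NumPy) looks as follows:

```python
import numpy as np

# 3-point Gauss rule on a triangle (exact for polynomials up to degree 2):
# barycentric points and weights (weights sum to 1, scaled by the area below)
bary = np.array([[2/3, 1/6, 1/6],
                 [1/6, 2/3, 1/6],
                 [1/6, 1/6, 2/3]])
w = np.array([1/3, 1/3, 1/3])

def quad_triangle(f, v):
    """Integrate f over the triangle with vertex array v (3 x 2)."""
    area = 0.5 * abs((v[1, 0] - v[0, 0]) * (v[2, 1] - v[0, 1])
                     - (v[1, 1] - v[0, 1]) * (v[2, 0] - v[0, 0]))
    pts = bary @ v                     # map barycentric -> physical points
    return area * np.sum(w * np.array([f(p) for p in pts]))

v = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # unit triangle
I = quad_triangle(lambda p: p[0]**2, v)               # exact value: 1/12
print(I)
```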

The kernel function G (Eq. (8)) in the integrals on the right-hand side of Eq. (20) is weakly singular. Hence, the diagonal coefficients L_{ii} can be evaluated analytically,

$$L_{ii} = \frac{1}{2\pi} \sum_{s=1}^{S} \frac{A_{i_s}}{\rho_s \cos\varphi_{i_s}}\, \ln\frac{\tan[(\beta_s + \alpha_s)/2]}{\tan(\beta_s/2)}, \qquad (26)$$

where A_{i_s} is the area of the s-th triangle of supp ψ_i determined by the line of length ρ_s and the angles α_s, β_s (Fig. 5). The diagonal component in Eq. (25) can be evaluated geometrically using (21) or through a physical consideration. The second approach is based on the fact that a constant potential applied over a closed body produces no flux. Accordingly, in the case of exterior Neumann problems, the sum of all components in each row should be equal to 1 [9]. Then one can easily calculate the coefficient M_{ii} after evaluating all non-diagonal coefficients, i.e.,

Fig. 5 Evaluation of the singular element (the C^1 collocation)

$$M_{ii} = 1 - \sum_{j=1,\,j\neq i}^{N} M^{*}_{ij}, \qquad (27)$$

where the coefficients M^{*}_{ij} do not include the contributions of the tangential components,

$$M^{*}_{ij} = \frac{1}{4\pi} \sum_{s=1}^{S} A_{j_s}\, k_{ij_s} \sum_{k=1}^{K} \frac{\psi_k\,w_k}{r_{ik_s}^{3}}, \qquad i \neq j. \qquad (28)$$

The matrix M in Eq. (22) is a nonsymmetric dense N × N matrix. Consequently, memory requirements and CPU-time consumption are of the order O(N²). This means that for increasing N (of the order 10⁵ and more), BEM applications lead to large-scale linear systems with enormous memory requirements. Therefore computing on parallel computers with distributed memory is practically inevitable. In the last decades, numerous compression techniques have been developed that reduce the numerical complexity of BEM, e.g., the fast multipole method (FMM), hierarchical matrices (H-matrices) or panel clustering. Here we briefly mention that the FMM approximates the kernel function in Eq. (8) by factorizing the x, y dependency via a multipole expansion [23]. Hence, interactions of the far zones can be evaluated straightforwardly and the original dense matrix M is transformed into a sparse one. An implementation of the FMM as well as panel clustering for gravity field modelling by the Galerkin BEM is published in [32].


The main idea of H-matrices is an approximation of the entire system matrix, which is split into a family of submatrices. Large submatrices are stored in a factorized representation, while small submatrices are stored in the standard representation. This allows a significant reduction of memory requirements while improving efficiency. Among others, we briefly mention the Adaptive Cross Approximation (ACA) algorithm, where numerically rank-deficient sub-blocks, which correspond to interactions of well-separated groups of nodes, can be efficiently compressed through an approach very similar to the column-pivoted LU decomposition [55].
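To illustrate the compression idea, the following sketch (our addition, not the implementation referenced in the chapter; it uses full pivoting for simplicity, whereas practical ACA uses partial pivoting, and assumes NumPy) applies cross approximation to a 1/|x−y| kernel block between two well-separated point clusters:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((200, 3))                          # source cluster
Y = rng.random((150, 3)) + np.array([5.0, 0, 0])  # well-separated target cluster
A = 1.0 / np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)

def aca(A, tol=1e-8):
    """Cross approximation of A as U @ V (full pivoting, illustrative only)."""
    R = A.copy()
    cols, rows = [], []
    ref = np.abs(A).max()
    for _ in range(min(A.shape)):
        i, j = np.unravel_index(np.abs(R).argmax(), R.shape)
        if abs(R[i, j]) < tol * ref:
            break
        cols.append(R[:, j] / R[i, j])            # rank-1 cross update
        rows.append(R[i, :].copy())
        R -= np.outer(cols[-1], rows[-1])
    return np.array(cols).T, np.array(rows)

U, V = aca(A)
err = np.linalg.norm(A - U @ V) / np.linalg.norm(A)
print(U.shape[1], err)    # low rank, small relative error
```

Because the block corresponds to well-separated clusters, the kernel is smooth there and a rank far below min(200, 150) suffices.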

3.3 Numerical Experiments

To demonstrate the properties of our BEM approach, we present a reconstruction of a known harmonic function directly on the Earth's surface where the corresponding oblique derivative BCs are prescribed. The Earth's surface has been approximated by a triangulated surface. The vertices of the global triangulation represent the collocation points and are regularly distributed over the whole Earth's surface (Fig. 3). Their horizontal positions have been generated by the algorithm developed in [10]. The chosen level of discretization determines the size of the triangular elements as well as the number of collocation points N (Table 1). Vertical positions of the collocation points have been interpolated from the DNSC mean sea surface [1] over the oceans and from the SRTM30PLUS-V5.0 global topography model [6] over land. In this way the 3D positions of the collocation points have been constructed. At these points the disturbing potential, as the reconstructed harmonic function, and the surface gravity disturbances, as the oblique derivative BCs, have been generated from the EGM2008 geopotential model up to degree and order 2160 [51]. All large-scale parallel computations were performed on a cluster with 1.2 TB of distributed memory. The standard MPI (Message Passing Interface) subroutines [2] have been used for the code parallelization. As a linear solver, the BiConjugate Gradient Stabilized (BiCGSTAB) method [4, 54] has been used, which is suitable for dense and nonsymmetric matrices. To reduce the large memory requirements we have used an iterative procedure introduced in [12]. In the first iterative step it incorporates a priori known global solutions generated from the ITG-GRACE03S satellite-only geopotential model up to degree 180 [43]. These "approximate" values of the disturbing potential have been used to evaluate the contribution of the

Table 1 Statistical characteristics of the residuals between the BEM solutions and EGM2008 [units: m² s⁻²]

CASE  Resolution  Nodes       Mean    Max    Min      STD
A     0.075 deg   5 760 002   −1.315  1.216  −13.145  1.033
B     0.05 deg    12 960 002  −0.939  0.084  −7.320   0.564


far zones to every collocation point. It means that all "far-zone components" of the original system matrix, multiplied by the approximate values of the unknown disturbing potential, have been moved to the known right-hand side of Eq. (22). In this way the values of the disturbing potential at the collocation points have been iteratively improved and the original dense stiffness matrix has been transformed into a sparse one. Such an approach has allowed us to reduce the memory requirements significantly, however, at the cost of a large CPU-time consumption. Our experience has shown that our numerical scheme based on the discrete BIE (20) works well for solutions on coarse grids or triangulations. However, for very refined triangulations the system matrix M becomes worse conditioned and the BiCGSTAB stops converging, even with preconditioning. This is due to the fact that in areas of extremely complicated Earth's surface the tangential components become stronger; since the kernel function for the tangential components is one order stronger than for the normal component (O(r⁻¹) vs. O(r⁻²), see Eq. (24)), the conditioning of the system matrix deteriorates. To overcome this drawback we have decided to use an iterative approach in which the tangential components remain on the right-hand side,

$$\frac{1}{2}\,T^{P}(p) + \int_{\Gamma} T^{P}(q)\,\frac{\partial G}{\partial n}(p,q)\,d\Gamma_q = \int_{\Gamma} \frac{\nabla T\cdot s - (\nabla T^{P-1}\cdot t)\,t\cdot s - (\nabla T^{P-1}\cdot f)\,f\cdot s}{n\cdot s}(q)\;G(p,q)\,d\Gamma_q, \qquad (29)$$

where P is the iterative step. Since we have used the iterative approach also for the elimination of the far zones' interactions, in the first iteration we have used the same disturbing potential generated from the ITG-GRACE03S model. It means that one iterative procedure has treated both the oblique derivative and the elimination of the far zones' interactions. Consequently, the system matrix has changed to M*, which is in general well-conditioned also for refined triangulations, and the BiCGSTAB solver has converged. The coefficients of the system matrix M* have been evaluated using Eqs. (27) and (28). Table 1 summarizes the statistical characteristics of the residuals between the BEM solutions and EGM2008 for different levels of the discretization. Figure 6 depicts these residuals. We recall that the BEM solutions are obtained at the collocation points directly on the Earth's surface considering its complicated topography. It is evident that the largest residuals are in high mountains, especially in the Himalayas and on the Tibetan plateau. However, refining the triangulation has resulted in an obvious improvement. This improvement has been achieved despite the fact that the more refined triangulation involves a more detailed consideration of the Earth's surface topography. It is worth noting that the BEM applications also allow local refinements of the global triangulation. In this way one can achieve a more precise solution, especially over regions with a complicated boundary, while the overall memory requirements can be reduced. This is a challenge for further investigation.
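The idea of moving known contributions to the right-hand side and iterating, as in Eq. (29) and in the far-zone elimination, can be illustrated on a toy linear system (our sketch, not the chapter's implementation; here a strong "near-zone" part D and a weak "far-zone" part K play the roles of the split system matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100
D = np.diag(2.0 + rng.random(N))   # dominant near-zone part (kept on the left)
K = 0.01 * rng.random((N, N))      # weak far-zone part (moved to the right)
M = D + K
b = rng.random(N)

T = np.zeros(N)                    # crude initial guess, like an a priori model
for _ in range(60):
    # solve D T^P = b - K T^{P-1}: the "far zones" enter only through the RHS
    T = np.linalg.solve(D, b - K @ T)

print(np.linalg.norm(M @ T - b))   # residual of the full system, near zero
```

The iteration converges because the moved part is weak relative to the part kept on the left-hand side (a contraction), mirroring the dominance of near-zone interactions.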


Fig. 6 Residuals between the BEM solution and EGM2008 for the triangulation with (a) 5 760 002, and (b) 12 960 002 collocation points [units: m² s⁻²]

4 Numerical Solution by the Finite Volume Method

The general discretization approach is to divide the computational domain Ω into a regular uniform hexahedral grid of finite volumes V_{i,j,k}. The equation is then integrated over each finite volume, using the divergence theorem, which turns some of the volume integrals into surface integrals. The resulting discretized


equations equate fluxes across the finite volume faces to sources and sinks inside the volume, and they can be solved with standard direct or iterative methods. So, multiplying Eq. (4) by −1 and proceeding in the aforementioned manner,

$$-\int_{V_{i,j,k}} \Delta T \, dV = -\int_{\partial V_{i,j,k}} \nabla T\cdot n \, dS, \qquad (30)$$

we obtain the weak formulation of Eq. (4) in the finite volume V_{i,j,k},

$$-\int_{\partial V_{i,j,k}} \frac{\partial T}{\partial n} \, dS = 0, \qquad (31)$$

where n is the unit outward normal vector to the boundary of the finite volume V_{i,j,k}. Let us denote the representative point of the finite volume V_{i,j,k} by x_{i,j,k}. Finite volumes are constructed around the inner representative points (those that do not lie on the boundary ∂Ω). Let N₁ denote the set of all triplets (p, q, r) with |p| + |q| + |r| = 1. Then the finite volumes V_{i+p,j+q,k+r}, (p, q, r) ∈ N₁, share a common 2D boundary e^{p,q,r}_{i,j,k} with the finite volume V_{i,j,k}. Using such a discretization we can write

$$-\sum_{(p,q,r)\in N_1} \int_{e^{p,q,r}_{i,j,k}} \frac{\partial T}{\partial n^{p,q,r}_{i,j,k}} \, dS = 0, \qquad (32)$$

where n^{p,q,r}_{i,j,k} is the unit normal vector oriented from the finite volume V_{i,j,k} to V_{i+p,j+q,k+r}. The derivative in the direction of the normal vector can be approximated by

$$\frac{\partial T}{\partial n^{p,q,r}_{i,j,k}} \approx \frac{T_{i+p,j+q,k+r} - T_{i,j,k}}{d^{p,q,r}_{i,j,k}}. \qquad (33)$$

The unknown values T_{i,j,k} are considered at the points x_{i,j,k}, and d^{p,q,r}_{i,j,k} is the distance between the points x_{i,j,k} and x_{i+p,j+q,k+r}. It is worth noting that the normal vector n^{p,q,r}_{i,j,k} has to be aligned with the line connecting the representative points; otherwise a non-uniform grid has to be handled. By considering the derivative in the normal direction to be constant on the boundary e^{p,q,r}_{i,j,k} and utilizing Eq. (33), we obtain

$$-\sum_{(p,q,r)\in N_1} \frac{T_{i+p,j+q,k+r} - T_{i,j,k}}{d^{p,q,r}_{i,j,k}} \; m\big(e^{p,q,r}_{i,j,k}\big) = 0, \qquad (34)$$

where m(e^{p,q,r}_{i,j,k}) is the area of the boundary e^{p,q,r}_{i,j,k}. This equation can be written in the form

$$\sum_{(p,q,r)\in N_1} \frac{m\big(e^{p,q,r}_{i,j,k}\big)}{d^{p,q,r}_{i,j,k}} \left(T_{i,j,k} - T_{i+p,j+q,k+r}\right) = 0, \qquad (35)$$

representing the linear system of algebraic equations of the FVM. The term $m(e^{p,q,r}_{i,j,k})/d^{p,q,r}_{i,j,k}$ defined on the faces of the finite volume V_{i,j,k} is referred to as the transmissivity coefficient [14]. The system (35) must be accompanied by BCs. In the case of the Dirichlet BC (6), we prescribe the value of T_{i+p,j+q,k+r} on the boundary. The oblique derivative BC (5) needs a special treatment, which is discussed in the following section.
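A minimal sketch of assembling and solving a system of the form (35) on a uniform cubic grid (our illustrative addition, assuming NumPy; Dirichlet values are imposed at ghost cell-center positions, and the exact solution is the harmonic linear function T = x + 2y − 3z, which this scheme reproduces exactly):

```python
import numpy as np

n, h = 6, 1.0 / 6                       # n^3 cells on the unit cube
T_ex = lambda x, y, z: x + 2*y - 3*z    # harmonic test solution
c = (np.arange(n) + 0.5) * h            # cell-center coordinates

idx = lambda i, j, k: (i*n + j)*n + k
A = np.zeros((n**3, n**3))
b = np.zeros(n**3)
trans = h*h / h                          # transmissivity m(e)/d of a face

for i in range(n):
    for j in range(n):
        for k in range(n):
            row = idx(i, j, k)
            for di, dj, dk in [(1,0,0), (-1,0,0), (0,1,0),
                               (0,-1,0), (0,0,1), (0,0,-1)]:
                ii, jj, kk = i + di, j + dj, k + dk
                A[row, row] += trans
                if 0 <= ii < n and 0 <= jj < n and 0 <= kk < n:
                    A[row, idx(ii, jj, kk)] -= trans
                else:  # Dirichlet value at the ghost neighbour position
                    b[row] += trans * T_ex(c[i] + di*h, c[j] + dj*h, c[k] + dk*h)

T = np.linalg.solve(A, b)
err = max(abs(T[idx(i, j, k)] - T_ex(c[i], c[j], c[k]))
          for i in range(n) for j in range(n) for k in range(n))
print(err)   # machine precision: the scheme is exact for linear functions
```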

5 The Oblique Derivative Boundary Condition in the Oblique Derivative Boundary Value Problem

As we have mentioned above, in this section we present three different approaches to the oblique derivative BC in the oblique derivative BVPs, namely (a) the central scheme applied on uniform grids, (b) the first order upwind scheme applied on uniform grids, and (c) the higher order upwind scheme applied on non-uniform grids.

5.1 Approach Based on the Central Scheme Applied on Uniform Grids

The following approximations apply to a finite volume V_{i,j,k} whose boundary e^{-1,0,0}_{i,j,k} lies on the bottom boundary Γ (Fig. 7). Let N₃ denote the set of all triplets (p, q, r) with |p| + |q| + |r| = 3. Then the finite volumes V_{i+p,j+q,k+r}, (p, q, r) ∈ N₃, share a common vertex with the finite volume V_{i,j,k}. Let us denote this common vertex by x^{p,q,r}_{i,j,k}; it holds that

$$x^{p,q,r}_{i,j,k} = \frac{1}{8} \sum_{(l,m,n)\in B(p,q,r)} x_{i+l,j+m,k+n}, \qquad (36)$$

where B(p, q, r) = {(p, q, r), (p, q, 0), (p, 0, r), (p, 0, 0), (0, q, r), (0, q, 0), (0, 0, r), (0, 0, 0)}.


Fig. 7 Illustration of the FVM grid notation

So let us consider the oblique derivative BVP (4), (5), and (6). We start by splitting the gradient in (5) into one normal and two tangential directions,

$$\nabla T = (\nabla T\cdot n)\,n + (\nabla T\cdot t)\,t + (\nabla T\cdot f)\,f = \frac{\partial T}{\partial n}\,n + \frac{\partial T}{\partial t}\,t + \frac{\partial T}{\partial f}\,f, \qquad (37)$$

where n is the normal vector and t, f are tangent vectors to Γ ⊂ ∂Ω ⊂ ℝ³. These three vectors form an orthonormal basis. Then we put (37) into (5) to obtain

$$\nabla T\cdot s = \left(\frac{\partial T}{\partial n}\,n + \frac{\partial T}{\partial t}\,t + \frac{\partial T}{\partial f}\,f\right)\cdot s = \frac{\partial T}{\partial n}\,n\cdot s + \frac{\partial T}{\partial t}\,t\cdot s + \frac{\partial T}{\partial f}\,f\cdot s, \qquad (38)$$

and the BC (5) is transformed into the form

$$\frac{\partial T}{\partial n}\,n\cdot s + \frac{\partial T}{\partial t}\,t\cdot s + \frac{\partial T}{\partial f}\,f\cdot s = \delta g. \qquad (39)$$

We set the approximations of the normal and tangent vectors

$$n = \frac{x_{i-1,j,k} - x_{i,j,k}}{|x_{i-1,j,k} - x_{i,j,k}|}, \qquad t = \frac{x^{-1,1,1}_{i,j,k} - x^{-1,-1,-1}_{i,j,k}}{\big|x^{-1,1,1}_{i,j,k} - x^{-1,-1,-1}_{i,j,k}\big|}, \qquad f = \frac{x^{-1,1,-1}_{i,j,k} - x^{-1,-1,1}_{i,j,k}}{\big|x^{-1,1,-1}_{i,j,k} - x^{-1,-1,1}_{i,j,k}\big|}. \qquad (40)$$


We approximate the normal and tangential derivatives in (39) by

$$\frac{\partial T}{\partial n} = \frac{T_{i-1,j,k} - T_{i,j,k}}{|x_{i-1,j,k} - x_{i,j,k}|}, \qquad \frac{\partial T}{\partial t} = \frac{T^{-1,1,1}_{i,j,k} - T^{-1,-1,-1}_{i,j,k}}{\big|x^{-1,1,1}_{i,j,k} - x^{-1,-1,-1}_{i,j,k}\big|}, \qquad \frac{\partial T}{\partial f} = \frac{T^{-1,1,-1}_{i,j,k} - T^{-1,-1,1}_{i,j,k}}{\big|x^{-1,1,-1}_{i,j,k} - x^{-1,-1,1}_{i,j,k}\big|}, \qquad (41)$$

where the values T^{p,q,r}_{i,j,k} are defined by

$$T^{p,q,r}_{i,j,k} = \frac{1}{8} \sum_{(l,m,n)\in B(p,q,r)} T_{i+l,j+m,k+n}. \qquad (42)$$

If we put these approximations into (39) we get a discrete form of the 3D oblique derivative BC (5),

$$\nabla T\cdot s \approx \frac{T_{i-1,j,k} - T_{i,j,k}}{|x_{i-1,j,k} - x_{i,j,k}|}\,n\cdot s + \frac{T^{-1,1,1}_{i,j,k} - T^{-1,-1,-1}_{i,j,k}}{\big|x^{-1,1,1}_{i,j,k} - x^{-1,-1,-1}_{i,j,k}\big|}\,t\cdot s + \frac{T^{-1,1,-1}_{i,j,k} - T^{-1,-1,1}_{i,j,k}}{\big|x^{-1,1,-1}_{i,j,k} - x^{-1,-1,1}_{i,j,k}\big|}\,f\cdot s = \delta g. \qquad (43)$$

These equations are incorporated into the FVM linear system, which is then solved.

5.1.1 Numerical Experiments

The numerical schemes will be assessed according to the value of the so-called experimental order of convergence (EOC), which can be computed as follows. If we assume that the error of the scheme in some norm is proportional to some power of the grid size, i.e., Error(h) = C h^α with a constant C, then two grids with sizes h₁ and h₂, where h₁ > h₂, yield the two errors Error(h₁) = C h₁^α and Error(h₂) = C h₂^α, from which we can simply extract α = log_{h₁/h₂}(Error(h₁)/Error(h₂)). If h₂ = h₁/2, then α = log₂(Error(h₁)/Error(h₂)).

Then α is the EOC, and it can be determined by comparing numerical and exact solutions on subsequently refined grids. Now let us recall that the gravity disturbance, defined as the difference between the magnitudes of the real and normal gravity, represents a projection of ∇T(x) onto the unit vector s(x). The oblique derivative arises from the fact that the direction of s(x) in general does not coincide with the normal n(x) to the Earth's surface. It


means that we can distinguish two angles here: the first one, between n(x) and s(x), is known, while the second one, between ∇T(x) and s(x), is unknown due to the unknown direction of ∇T(x). To simulate such a situation we perform the following two testing experiments. In the first testing experiment, the computational domain is a tesseroid bounded by two concentric spheres with radii r₁ = 1 and r₂ = 2 and a coaxial cone with dimensions (0, π/2) × (0, π/2). The oblique BC (5) has been applied on the bottom boundary and the Dirichlet BC on the upper and side boundaries. The oblique derivative vector s is chosen to be

$$s = \frac{x_C - x_{i-\frac{1}{2},j,k}}{\big|x_C - x_{i-\frac{1}{2},j,k}\big|},$$

where the center point is x_C = (0.1, −0.2, −0.1). As the Dirichlet BC (6) we have considered the exact solution of (4) in the form T(x, y, z) = 1/r, where r is the distance from the center point x_C. As the Neumann/oblique BC on the bottom boundary, we have prescribed the derivative of this exact solution, which is equal to −1/r². The results can be seen in Table 2. One can see that the proposed approach is second order accurate. For the second testing experiment we have the same computational domain and the same BCs as in the previous one, but the oblique vector s has been rotated by 20°. The coordinates of the center point have been x_C = (−0.2, 0.1, 0.2). The L²(Ω)-norm of the differences between the exact and numerical solutions as well as the EOC of the method are shown in Table 3. One can see that also in this case with the rotated oblique vector, the EOC of the proposed approach reaches the value 2. It is worth noting that in the case when the oblique vector is identical with the gradient vector, we can project onto the normal without introducing an error. However, if this is not the case (as with the rotation), we cannot solve the BVP with a Neumann BC but have to use the proposed approach.

Table 2 The L²(Ω)-norm and the EOC for the experiment with the oblique BC, when the center point is shifted to x_C = (0.1, −0.2, −0.1)

n₁ × n₂ × n₃   ‖T* − T‖_{L²(Ω)}  EOC
2 × 2 × 4      6.74805·10⁻²      –
4 × 4 × 8      9.00317·10⁻³      2.90597
8 × 8 × 16     1.54266·10⁻³      2.54502
16 × 16 × 32   3.01950·10⁻⁴      2.35328
32 × 32 × 64   0.67123·10⁻⁵      2.16928

Table 3 The L²(Ω)-norm and the EOC for the experiment with the oblique BC. The center point is shifted to x_C = (−0.2, 0.1, 0.2) and the oblique vector s is rotated by 20°

n₁ × n₂ × n₃   ‖T* − T‖_{L²(Ω)}  EOC
2 × 2 × 4      6.43828·10⁻²      –
4 × 4 × 8      8.14779·10⁻³      2.98220
8 × 8 × 16     1.34261·10⁻³      2.60137
16 × 16 × 32   2.44307·10⁻⁴      2.45827
32 × 32 × 64   0.52002·10⁻⁵      2.23204


The third numerical experiment has dealt with the FGBVP on a computational domain Ω above the Himalaya region, approximated by the WGS84 ellipsoid. The ranges of the ellipsoidal latitude and longitude have been B ∈ ⟨20.0°, 50.0°⟩ and L ∈ ⟨60.0°, 110.0°⟩, respectively. To calculate the oblique derivative vector, the ellipsoidal heights above the WGS84 reference ellipsoid have been generated from SRTM30 [6]. The upper boundary has been located 240 km above the WGS84 reference ellipsoid. The number of finite volumes has been 1200 in the height, 900 in the meridional and 1500 in the zonal direction, i.e., 5 × 5 × 200 m sized volumes have been created. All BCs, namely the gravity disturbances as well as the disturbing potential, have been generated from EGM2008, see [51]. The results are depicted in Figs. 8 and 9 with

Fig. 8 The disturbing potential solution T in the area of the Himalaya region [units: m² s⁻²]

Fig. 9 Local improvement in T from the oblique BC when using EGM08 data only [units: m² s⁻²]

Table 4 Comparison of statistical characteristics between the disturbing potential computed by FVM applied to solving the BVP with the oblique BC using EGM08 data only, and the disturbing potential generated directly from EGM08 [units: m² s⁻²]

AREA      Min    Mean  Max   Std   RMS
HIMALAYA  −0.87  0.04  0.95  0.17  0.18

Fig. 10 Illustration of the 2D FVM grid. The dashed lines denote the boundaries of the added finite volumes; the vector s is depicted in red

corresponding statistics presented in Table 4. One can observe that the highest residuals occur in the areas of the mountainous ridges. More details about this approach can be found in [41].

5.2 Approach Based on the First Order Upwind Scheme Applied on Uniform Grids

In this approach, the oblique derivative BC is interpreted as a stationary advection equation for the unknown disturbing potential. Its approximation is then done using the first order upwind scheme, which takes into account information from the inflow parts of the finite volume boundary only. Since in the upwind scheme it is natural to work with the outer normal to the computational domain (see Fig. 10), we multiply the unit vector s(x) by −1. One can then rewrite the divergence of T(x)s(x) in the form


∇ · (T (x)s(x)) = T (x)∇ · s(x) + ∇T (x) · s(x).

(44)

By inserting (44) into Eq. (5), we obtain ∇ · (T (x)s(x)) − T (x)∇ · s(x) = δg(x).

(45)

We add one row of finite volumes below the bottom boundary, see Fig. 10, and integrate (45) over one of the added finite volumes V_{i,j,k} (we omit the argument (x) to simplify the notation in the following equations),

$$\int_{V_{i,j,k}} \nabla\cdot(T s) \, dV - \int_{V_{i,j,k}} T\,\nabla\cdot s \, dV = \int_{V_{i,j,k}} \delta g \, dV. \qquad (46)$$

Using a constant approximation of the solution T on the finite volume V_{i,j,k}, denoted by T_{i,j,k}, and applying the divergence theorem to the left-hand side of Eq. (46), we obtain

$$\sum_{(p,q,r)\in N_1} \int_{e^{p,q,r}_{i,j,k}} T\; s\cdot n^{p,q,r}_{i,j,k} \, dS \;-\; T_{i,j,k} \sum_{(p,q,r)\in N_1} \int_{e^{p,q,r}_{i,j,k}} s\cdot n^{p,q,r}_{i,j,k} \, dS = \int_{V_{i,j,k}} \delta g \, dV. \qquad (47)$$

Denoting a constant approximation of the solution on the interface e^{p,q,r}_{i,j,k} by T^{p,q,r}_{i,j,k} and the volume of the finite volume V_{i,j,k} by m(V_{i,j,k}) yields

$$\sum_{(p,q,r)\in N_1} T^{p,q,r}_{i,j,k} \int_{e^{p,q,r}_{i,j,k}} s\cdot n^{p,q,r}_{i,j,k} \, dS \;-\; T_{i,j,k} \sum_{(p,q,r)\in N_1} \int_{e^{p,q,r}_{i,j,k}} s\cdot n^{p,q,r}_{i,j,k} \, dS = \delta g \; m(V_{i,j,k}). \qquad (48)$$

When we denote

$$s^{p,q,r}_{i,j,k} = \int_{e^{p,q,r}_{i,j,k}} s\cdot n^{p,q,r}_{i,j,k} \, dS \approx m\big(e^{p,q,r}_{i,j,k}\big)\; s\cdot n^{p,q,r}_{i,j,k}, \qquad (49)$$

we finally obtain

$$\sum_{(p,q,r)\in N_1} s^{p,q,r}_{i,j,k}\left(T^{p,q,r}_{i,j,k} - T_{i,j,k}\right) = \delta g \; m(V_{i,j,k}). \qquad (50)$$


Due to the analogy between the oblique derivative BC (5) and the stationary advection equation, we have applied the upwind principle, which is commonly used in solving advection equations in fluid dynamics [36]. We then define

$$T^{p,q,r}_{i,j,k} = T_{i,j,k}, \qquad \text{if } s^{p,q,r}_{i,j,k} > 0, \qquad (51)$$

$$T^{p,q,r}_{i,j,k} = T_{i+p,j+q,k+r}, \qquad \text{if } s^{p,q,r}_{i,j,k} < 0, \qquad (52)$$

which correspond to the inflow part of the finite volume V_{i,j,k} (s^{p,q,r}_{i,j,k} < 0) and the outflow part of the finite volume V_{i,j,k} (s^{p,q,r}_{i,j,k} > 0) when s is understood as an advection velocity vector. By using (51) and (52) in (50) we obtain the final form of the approximation of the oblique derivative BC (5) as

$$\sum_{(p,q,r)\in N^{in}_1} s^{p,q,r}_{i,j,k}\left(T_{i+p,j+q,k+r} - T_{i,j,k}\right) = \delta g \; m(V_{i,j,k}), \qquad (53)$$

where N^{in}_1 is the set of neighbours at the inflow boundaries of the finite volume V_{i,j,k}, i.e., where s^{p,q,r}_{i,j,k} < 0.
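A one-dimensional analogue of this upwind treatment can be sketched as follows (our illustrative addition, assuming NumPy): with an advection direction s = 1, the inflow neighbour of cell i is cell i − 1, and the marching formula integrates the prescribed "disturbance" δg = dT/ds with first order accuracy:

```python
import numpy as np

n = 400
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
dg = np.cos(x)                 # prescribed oblique-derivative data dT/ds

T = np.zeros(n + 1)
T[0] = 0.0                     # value at the inflow boundary
for i in range(1, n + 1):
    # 1D analogue of (53): s_i (T_i - T_{i-1}) = dg_i * h, with s_i = 1
    T[i] = T[i - 1] + h * dg[i]

err = np.max(np.abs(T - np.sin(x)))
print(err)                     # first order accurate: err = O(h)
```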

5.2.1 Numerical Experiments

Now we present two numerical experiments, one for testing and one with real data. In the testing numerical experiment, the computational domain has been a tesseroid bounded by two concentric spheres with radii R_d = 1 m and R_u = 2 m, and a coaxial cone with dimensions (0, π/4) × (0, π/4). As the Dirichlet BC (6), the exact solution of (4) in the form T* = 1/r has been prescribed on the upper and side boundaries. The direction of the unit vector s₁(x), i.e., the unit gradient vector of the exact solution, has been modified by an angle ±α to create a new unit vector s(x). For this experiment we have chosen α = 20°. The coordinates of the point mass source have been x_C = (0.3, −0.2, 0.1). Then the oblique derivative BC is given by the projection ∇T(x)·s(x) = −(1/r²) cos(α). The L²(Ω)- and MAX(Γ)-norms of the differences between the exact and numerical solutions and the EOC of the methods are shown in Table 5. We observe a stable behaviour of the EOC for the upwind scheme and an oscillatory EOC for the central scheme.

Table 5 The L²(Ω)-norm, MAX(Γ)-norm and the EOC for the 3D experiment with the 3D oblique derivative BC when the oblique vector s does not have the direction of the solution gradient

                Upwind scheme                                         Central scheme
n₁×n₂×n₃     ‖T*−T‖_{L²(Ω)}  EOC    ‖T*−T‖_{MAX(Γ)}  EOC    ‖T*−T‖_{L²(Ω)}  EOC    ‖T*−T‖_{MAX(Γ)}  EOC
8×8×4        0.177728        –      0.362022         –      0.061529        –      0.3511           –
16×16×8      0.059441        1.58   0.177806         1.03   0.146351        −1.25  0.209212         0.75
32×32×16     0.022542        1.39   0.083563         1.08   0.058753        1.31   0.050549         2.05
64×64×32     0.010819        1.05   0.041756         1.00   0.008090        2.86   0.053722         2.64
128×128×64   0.005143        1.07   0.019506         1.13   0.004520        0.83   0.024245         0.84


In the numerical experiment with real data, we apply the upwind scheme to global gravity field modelling. We try to reconstruct a harmonic function given by the EGM2008 geopotential model up to degree 2160 [51]; all BCs are generated from this model. The Dirichlet BC in the form of the disturbing potential is prescribed on the upper boundary at the constant altitude of 240 km above the reference ellipsoid. The oblique derivative BCs are generated as the first derivative of the disturbing potential in the direction of the normal to the reference ellipsoid. They are generated at points on the real topography, which is approximated using the SRTM30PLUS global topography model [6]. Our goal is to show the convergence of the FVM solution to EGM2008 when refining the computational grid. Although the oblique derivative BCs are considered at points on the real topography, in our FVM approach we so far use a structured grid of finite volumes. This means that the computational domain Ω in our computations is bounded by the reference ellipsoid; however, all input data are adopted from the real topography. Hence n^{p,q,r}_{i,j,k} on the bottom boundary is given by the normal to the topography and not by the normal to the ellipsoid. The unit vector s(x) then represents the normal to the reference ellipsoid, while the direction of n^{p,q,r}_{i,j,k} represents the normal to the Earth's surface and is adopted from our approximation of the topography. In this way we are able to evaluate the coefficients s^{p,q,r}_{i,j,k} in our approximation of the oblique derivative BC. The computational grid is constructed using the number of divisions in the L, B, H directions given by n₁ × n₂ × n₃:

(a) 540 × 270 × 75 (resolution: 40 × 40 × 3200 m),
(b) 1080 × 540 × 150 (resolution: 20 × 20 × 1600 m),
(c) 2160 × 1080 × 300 (resolution: 10 × 10 × 800 m),
(d) 4320 × 2160 × 600 (resolution: 5 × 5 × 400 m).

The obtained FVM solutions are compared with EGM2008. The statistical characteristics of the residuals on the bottom boundary as well as computational aspects are summarized in Table 6. One can see that the FVM solution converges to EGM2008 by refining the finite volume grid, i.e., the mean value, the STD as well as the maximum norm are decreasing. It is worth noting that every refinement of the discretization involves a more detailed consideration of the topography. This does not allow us to compute the EOC directly; however, the STD as well as the maximum norm in Table 6 indicate that the upwind scheme is first-order accurate.

Table 6 Statistics of residuals in T [m² s⁻²] on the bottom boundary Γ for successive refinements

Resolution     40′ × 40′   20′ × 20′   10′ × 10′   5′ × 5′
Min.           −78.910     −46.584     −22.011     −13.926
Max.            80.426      27.558       7.954       7.932
Mean            −0.392      −0.273      −0.265      −0.114
STD (total)      5.238       1.948       0.904       0.558
STD (Sea)        4.771       1.489       0.327       0.183
STD (Land)       6.228       2.750       1.578       0.991


R. Čunderlík et al.

Fig. 11 Residuals in T [m² s⁻²] between the disturbing potential computed by the FVM solution with upwind treatment of the oblique derivative and the EGM2008 solution on the bottom boundary Γ

The residuals between the most refined FVM solution, case (d), and EGM2008 are depicted in Fig. 11. The largest residuals are negative and evidently correlate with the high mountainous areas of the Himalayas and Andes. The minimal values of the residuals in Table 6 indicate that refinements of the discretization improve the FVM solution also in these zones of complicated topography. This confirms that the presented FVM approach based on the upwind treatment of the oblique derivative BC is able to reconstruct a harmonic function and is thus efficient for solving the oblique derivative BVP. More details about this approach can be found in [38].
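Although the changing topography prevents a direct EOC computation, a rough order estimate can be read off the successive STD reductions in Table 6. A minimal sketch (values copied from the STD (total) row; the grid spacing is halved at each step):

```python
import math

# Total STDs of the residuals from Table 6; the grid spacing is halved
# between successive entries.
std_total = [5.238, 1.948, 0.904, 0.558]

# Experimental order of convergence between successive refinements:
# EOC = log2(error_coarse / error_fine) for a halved spacing.
eoc = [math.log2(c / f) for c, f in zip(std_total, std_total[1:])]
print([round(e, 2) for e in eoc])
```

Values near 1 are consistent with the first-order accuracy of the upwind scheme claimed above; since the resolved topography changes with every refinement, the numbers are only indicative.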

5.3 Approach Based on the Higher Order Upwind Scheme Applied on Non-uniform Grids

Now we present the FVM on non-uniform grids above the Earth's topography. The oblique derivative BC is again treated as a stationary advection equation. We use a method for the discretization of the computational domain based on an evolution of the Earth's surface driven by its mean curvature. This approach involves a tangential redistribution of the discretization points of the evolving surface, leading to the construction of a more regular non-uniform 3D hexahedron grid. Then we present a discretization of the Laplace equation and of the oblique derivative BC on such non-uniform grids. It consists of a reconstruction of the normal derivative to the finite volume using derivatives in the tangential directions. Numerically, the oblique derivative BC is treated as an advection equation, and a new higher order upwind method for non-uniform grids is applied.

First, let us see the computational domain Ω and its grid as a parametrized volume. A parametrization determines a distribution of points, which in a discrete form determines our finite volume grid. Let us denote by S = {x(u, v, t), u ∈ (0, 1), v ∈ (0, 1), t ∈ (0, t_end)} the unknown parametrization of Ω. We consider that S(u, v, 0) approximates the Earth's topography, and we would like to evolve it in such a way that S(u, v, t_end) forms approximately a part of an ellipsoid at the height H above the reference ellipsoid. This problem can be treated in such a way that S(u, v, t_end) will be the reference ellipsoid, which is then scaled to be approximately at the height H, while S(u, v, 0) remains unchanged. The 3D volume S can be seen as an evolving surface for which the parameter t is the time. The grid is constructed by an evolution of the surface S(u, v, 0) by its mean curvature and a force f, where f corresponds to the mean curvature of the reference ellipsoid at the point S*. The point S* is given by the projection of S(u, v, t) to the reference ellipsoid. Using this evolution we achieve that the surface continuously forms the shape of a part of the ellipsoid, and the mathematical formulation of this process is given by [46]

∂_t x(u, v, t) = ε (kN + f N),   (54)

where the unknown x is the position vector of the evolving surface S, k is twice its mean curvature and N is the normal vector at the point x. The scalar f is the force applied in the direction of the normal vector N. The vector kN is computed as kN = Δ_s x, where Δ_s is the so-called Laplace-Beltrami operator [46]. The scalar ε is a parameter determining how fast the surface moves. Equation (54) is solved using the FVM. Due to the real topography, the boundary points of the surface do not have to lie on the reference ellipsoid at time 0. We want them to reach the reference ellipsoid at time t_end, so we decided that the boundary points will move linearly to the points on the ellipsoid, but we also allow them a tangential movement. The points of the grid of the computational domain Ω are the discrete points of the scaled S. These points are scaled around the origin (0, 0, 0) with the factor 1 + 240000/6378137.
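The scaling step can be sketched as follows (a hypothetical illustration; 6378137 m is the semi-major axis of the reference ellipsoid, so the factor lifts a point on the ellipsoid by roughly 240 km):

```python
# Scale grid points around the origin (0, 0, 0) so that the evolved surface
# ends up approximately 240 km higher.
SCALE = 1 + 240000 / 6378137

def scale_point(x):
    """Return the point x (a 3-tuple, in metres) scaled around the origin."""
    return tuple(SCALE * c for c in x)

p = (6378137.0, 0.0, 0.0)      # a point on the equator of the ellipsoid
print(scale_point(p))           # first coordinate becomes 6618137.0
```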

5.4 Controlling the Quality of Grid by Using the Tangential Redistribution of Points

A redistribution of points on a surface is important for the uniformity of the computational grid and the numerical stability of the surface evolution. In our approach, we decided to follow the principles published in [30, 46]. In this case, the discrete surface is composed of discrete parallels and meridians which cross at discrete points x_ij. At the point x_ij, the i-th discrete meridian crosses the j-th discrete parallel. Then a uniform redistribution can be achieved by adding a tangential movement of the surface in the directions Tp_i and Tm_j, where Tp_i is a tangent to the i-th parallel and Tm_j is a tangent to the j-th meridian. The tangential movement does not change the shape of the surface.

Let us consider only one general curve Γ on the surface. This curve moves as the surface S moves. The parametrization Γ = {x(u, t), u ∈ (0, 1), t ∈ (0, t_end)} determines a discretization of the curve.

Fig. 12 Illustration of the curve Γ on the surface S and T^Γ, N^Γ_1, N^Γ_2

If we have n points on the curve Γ at a time step m, the points of the curve are x_i = x(i/n, m). If the curve has a uniform distribution at time 0, we want to preserve this distribution. If it does not have a uniform distribution, we want to redistribute the points uniformly. Using this parametrization we can write T^Γ = x_u/|x_u|. There is another important parametrization of Γ, called the arc-length parametrization, which we denote by s. For this parametrization it holds that |x_s| = 1, so we can write T^Γ = x_s and, using the Frenet formula, kN^Γ = T^Γ_s = x_ss. For better clarity, we have decided to denote the surface normal by N^S and the surface mean curvature by k^S. The movement of the curve is split into three perpendicular directions: the direction T^Γ, which is the tangent vector of the curve Γ, and two other directions N^Γ_1, N^Γ_2, which lie in the normal plane of the curve Γ. The direction N^Γ_1 is chosen to be the normal vector of the surface N^S. The third vector is N^Γ_2 = N^Γ_1 × T^Γ (see Fig. 12). In general, the curve evolution is given by the equation

∂_t x = U^Γ N^Γ_1 + V^Γ N^Γ_2 + A^Γ T^Γ,   (55)

where x is the position vector of the curve Γ on the surface S. Since the curve is moved by (54), the values of U^Γ, V^Γ and A^Γ are given by

U^Γ = ε (k^S N^S + f N^S) · N^Γ_1,
V^Γ = ε (k^S N^S + f N^S) · N^Γ_2,   (56)
A^Γ = ε (k^S N^S + f N^S) · T^Γ.

Since N^Γ_1 = N^S, N^Γ_2 ⊥ N^S and T^Γ ⊥ N^S, we have

U^Γ = ε (k^S + f), V^Γ = 0, A^Γ = 0.   (57)

Using these facts and adding a new tangential velocity α^Γ T^Γ, we obtain

∂_t x = U^Γ N^Γ_1 + α^Γ T^Γ.   (58)

The scalar α^Γ is a quantity providing the tangential redistribution of points on the curve Γ. Since we do not want this velocity to move the boundary points, we set α(0) = α(1) = 0.

Let us introduce the function g^Γ = |x_u| = √((dx_1/du)² + (dx_2/du)² + (dx_3/du)²) = ds/du, which can be used to describe the point distribution. From the discrete point of view, g^Γ is proportional to the distance between points on the curve. Let us denote by L^Γ the length of the curve Γ. If (g^Γ/L^Γ)_t = 0, the ratio of the distances between points and the length of the curve remains the same. This condition determines which α^Γ gives us a redistribution conserving the initial one, and it can be rewritten as

(g^Γ/L^Γ)_t = (g^Γ/L^Γ) ( α^Γ_s − U^Γ k^Γ_1 + ⟨U^Γ k^Γ_1⟩^Γ ).   (59)

The detailed process of obtaining Eq. (59) can be found in [44]. If we want to determine g^Γ such that we obtain an asymptotically uniform redistribution, we can choose [47]

(g^Γ/L^Γ)_t = ω (1 − g^Γ/L^Γ).   (60)

The change of g^Γ/L^Γ in time is governed by Eq. (59); after substituting it into Eq. (60), we can see that everything in (60) except the term α^Γ is given by the evolution of the curve and the surface. Equation (60) can be rewritten as

α^Γ_s = −U^Γ k^Γ_1 + ⟨U^Γ k^Γ_1⟩^Γ + ω (L^Γ/g^Γ − 1),   (61)

from which we can determine α^Γ for any curve Γ on the surface S. By adding such a movement in the direction of the tangent vectors of the curves, the final equation for the surface evolution, which also includes the tangential redistribution of points, is given by

∂_t x = ε (kN + f N) + ⟨α^Γ T^Γ⟩,   (62)

where ⟨α^Γ T^Γ⟩ = Σ_{Γ∈M^Γ} α^Γ T^Γ / |M^Γ|, M^Γ is the set of curves crossing at the point x which we want to redistribute, and |M^Γ| is the cardinality of the set M^Γ. Since the redistributions on crossing curves do not have to be compatible, we take the average value. In the continuous case, Eqs. (62) and (54) give the same image of the evolving surface, but in the discrete case we obtain an almost uniform point redistribution by using (62).
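The averaging ⟨α^Γ T^Γ⟩ over the curves crossing a grid point can be sketched as follows (a hypothetical illustration with numpy; at a regular grid point the set M^Γ contains one meridian and one parallel):

```python
import numpy as np

def averaged_tangential_velocity(alphas, tangents):
    """Average the terms alpha^G * T^G over all curves crossing a point.

    alphas   -- tangential speeds alpha^G, one per crossing curve
    tangents -- corresponding unit tangent vectors T^G (3-vectors)
    """
    terms = [a * np.asarray(t, dtype=float) for a, t in zip(alphas, tangents)]
    return sum(terms) / len(terms)

# one meridian and one parallel crossing at the point, so |M^G| = 2
v = averaged_tangential_velocity([0.2, -0.4],
                                 [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
print(v)   # 0.5 * (0.2*e_x - 0.4*e_y)
```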

5.5 Numerical Approximation of the Evolving Surface

Let us assume that the surface is composed of n_i meridians and n_j parallels. The point of intersection of the i-th meridian and the j-th parallel at time index t is denoted by x_{ijt}. Let p, q ∈ {−1, 0, 1} and let N_int denote the set of all (p, q) with |p| + |q| = int, where int denotes an integer number. Thus the points x_{i+p,j+q,t}, (p, q) ∈ N_1, are the north, south, east and west neighbouring points, and the points x_{i+p,j+q,t}, (p, q) ∈ N_2, are the north-east, north-west, south-east and south-west neighbouring points. If we do not specify whether (p, q) belongs to N_1 or N_2, we always consider that it belongs to the set N_1. The surface is divided into finite volumes. The finite volume V_{ijt} is associated with the point x_{ijt}. The vertices of the finite volume are given by the centers of the line segments connecting the points x_{ijt} and x_{i+p,j+q,t}, (p, q) ∈ N_1, and by the centers of the quadrilaterals given by the points x_{ijt}, x_{i+p,j,t}, x_{i,j+q,t}, x_{i+p,j+q,t}, (p, q) ∈ N_2. These vertices are denoted by x^{pq}_{ijt}, see Fig. 13, and they are computed by the formula

x^{pq}_{ijt} = (1/4) Σ_{(l,m)∈B(p,q)} x_{i+l,j+m,t},   (63)

where B(p, q) = {(p, q), (p, 0), (0, q), (0, 0)}. The boundary between V_{ijt} and V_{i+p,j+q,t}, (p, q) ∈ N_1, is kinked, so it is composed of two line segments. Let us denote by e^{pqr}_{ijt}, (p, q) ∈ N_1, r ∈ {−1, 1}, the two line segments forming the boundary between the finite volumes V_{ijt} and V_{i+p,j+q,t}, and let us define a function which generates the corner vertex of e^{pqr}_{ijt}:

1(p, q, r) = (r, q) if p = 0,  (p, r) if q = 0.   (64)

The line segment e^{pqr}_{ijt} is then given by the points x^{pq}_{ijt} and x^{1(p,q,r)}_{ijt}. Let us denote by n^{pqr}_{ijt} the outer normal to e^{pqr}_{ijt}; for a better understanding see Fig. 13. Let ε be equal to one for simplicity. By integrating (62) over the finite volume V_{ijt} we get

Fig. 13 The finite volume with representative points (a) and the finite volume with normals and edges (b)

∫_{V_{ijt}} ∂_t x dS = ∫_{V_{ijt}} Δ_s x dS + ∫_{V_{ijt}} f N dS + ∫_{V_{ijt}} ⟨α^Γ T^Γ⟩ dS,   (65)

and by using Green's theorem we have

∫_{V_{ijt}} ∂_t x dS = ∫_{∂V_{ijt}} ∇_s x · n_{ijt} ds + ∫_{V_{ijt}} f N dS + ∫_{V_{ijt}} ⟨α^Γ T^Γ⟩ dS.   (66)

Using the definition of the finite volume, the first term on the right-hand side of (66) can be rewritten as

∫_{∂V_{ijt}} ∇_s x · n_{ijt} ds = Σ_{(p,q)∈N_1} Σ_{r∈{−1,1}} ∫_{e^{pqr}_{ijt}} ∇_s x · n^{pqr}_{ijt} ds = Σ_{(p,q)∈N_1} Σ_{r∈{−1,1}} ∫_{e^{pqr}_{ijt}} ∂x/∂n^{pqr}_{ijt} ds.   (67)

The derivative of x in the direction of n^{pqr}_{ijt} is considered constant on the boundary e^{pqr}_{ijt}. In general, the vector x_{i+p,j+q,t} − x_{ijt} is not in the direction of the normal vector n^{pqr}_{ijt}, so the derivative in the direction of the normal vector is approximated by a derivative in the direction of x_{i+p,j+q,t} − x_{ijt} and a derivative in the direction of the tangent vector to e^{pqr}_{ijt}. The tangent vector to e^{pqr}_{ijt} is defined as

t^{pqr}_{ijt} = (x^{1(p,q,r)}_{ijt} − x^{pq}_{ijt}) / |x^{1(p,q,r)}_{ijt} − x^{pq}_{ijt}|.   (68)

The unit vector v^{pq}_{ijt}, pointing from the point x_{ijt} to the neighbouring point x_{i+p,j+q,t}, is given by

v^{pq}_{ijt} = (x_{i+p,j+q,t} − x_{ijt}) / |x_{i+p,j+q,t} − x_{ijt}|.   (69)

An approximation of the normal vector to e^{pqr}_{ijt} is defined as

n^{pqr}_{ijt} = ( (v^{pq}_{ijt} × t^{pqr}_{ijt}) / |v^{pq}_{ijt} × t^{pqr}_{ijt}| ) × t^{pqr}_{ijt}.   (70)

Since the vectors v^{pq}_{ijt}, n^{pqr}_{ijt} and t^{pqr}_{ijt} lie in the same plane, the vector v^{pq}_{ijt} can be expressed as a linear combination of n^{pqr}_{ijt} and t^{pqr}_{ijt}, and it holds that

∇_s x · v^{pq}_{ijt} = ∇_s x · (β^{pqr}_{ijt} n^{pqr}_{ijt} + γ^{pqr}_{ijt} t^{pqr}_{ijt}) = β^{pqr}_{ijt} ∇_s x · n^{pqr}_{ijt} + γ^{pqr}_{ijt} ∇_s x · t^{pqr}_{ijt},   (71)

where β^{pqr}_{ijt} = n^{pqr}_{ijt} · v^{pq}_{ijt} and γ^{pqr}_{ijt} = t^{pqr}_{ijt} · v^{pq}_{ijt}. Thus the derivative in the normal direction can be expressed as

∇_s x · n^{pqr}_{ijt} = (1/β^{pqr}_{ijt}) ∇_s x · v^{pq}_{ijt} − (γ^{pqr}_{ijt}/β^{pqr}_{ijt}) ∇_s x · t^{pqr}_{ijt},   (72)

and approximated by

∇_s x · n^{pqr}_{ijt} ≈ (1/β^{pqr}_{ijt}) (x_{i+p,j+q,t+1} − x_{i,j,t+1}) / |x_{i+p,j+q,t} − x_{i,j,t}| − (γ^{pqr}_{ijt}/β^{pqr}_{ijt}) (x^{1(p,q,r)}_{i,j,t+1} − x^{pq}_{i,j,t+1}) / |x^{1(p,q,r)}_{i,j,t} − x^{pq}_{i,j,t}|.   (73)

Using this equation, and because the length of e^{pqr}_{ijt} is equal to

m(e^{pqr}_{ijt}) = |x^{1(p,q,r)}_{ijt} − x^{pq}_{ijt}|,   (74)

Eq. (67) can be approximated by

Σ_{(p,q)∈N_1} Σ_{r∈{−1,1}} ∫_{e^{pqr}_{ijt}} ∂x/∂n^{pqr}_{ijt} ds ≈ Σ_{(p,q)∈N_1} Σ_{r∈{−1,1}} [ (m(e^{pqr}_{ijt})/β^{pqr}_{ijt}) (x_{i+p,j+q,t+1} − x_{i,j,t+1}) / |x_{i+p,j+q,t} − x_{i,j,t}| − (γ^{pqr}_{ijt}/β^{pqr}_{ijt}) (x^{1(p,q,r)}_{i,j,t+1} − x^{pq}_{i,j,t+1}) ].   (75)

Because x^{1(p,q,r)}_{i,j,t+1} and x^{pq}_{i,j,t+1} are vertices of the finite volume computed as in (63), the equation can be rewritten as

Σ_{(p,q)∈N_1} Σ_{r∈{−1,1}} [ (m(e^{pqr}_{ijt})/β^{pqr}_{ijt}) (x_{i+p,j+q,t+1} − x_{i,j,t+1}) / |x_{i+p,j+q,t} − x_{i,j,t}| − (γ^{pqr}_{ijt}/(4β^{pqr}_{ijt})) ( Σ_{(l,m)∈B(1(p,q,r))} x_{i+l,j+m,t+1} − Σ_{(l,m)∈B(p,q)} x_{i+l,j+m,t+1} ) ].   (76)

A constant value of f N_{ijt} is considered on the finite volume V_{ijt}, so the second term on the right-hand side of Eq. (66) can be rewritten as

∫_{V_{ijt}} f N dS = m(V_{ijt}) f N_{ijt},   (77)

where m(V_{ijt}) is the 2D measure of V_{ijt}. In order to compute N_{ijt}, we consider the vector kN_{ijt} computed by Eq. (76), where all values are taken at time index t. Then the normal vector to the surface is given by

N_{ijt} = kN_{ijt} / |kN_{ijt}|.   (78)

The meridians and parallels are the curves along which we are going to redistribute the points on the surface. Only one meridian and one parallel go through the point x_{ijt}. Let us consider the i-th meridian and the j-th parallel. The point x_{ijt} is the i-th point on the j-th parallel and the j-th point on the i-th meridian at time t. So we can write

∫_{V_{ijt}} ⟨α^Γ T^Γ⟩ dS = ∫_{V_{ijt}} (α^i T^i + α^j T^j)/2 dS,   (79)

where T^i (T^j) is the tangent vector to the i-th meridian (j-th parallel). The values of α^i T^i and α^j T^j are considered constant on V_{ijt} and we approximate them using central differences:

∫_{V_{ijt}} (α^i T^i + α^j T^j)/2 dS = (m(V_{ijt})/2) ( α^i_{jt} (x_{i,j+1,t+1} − x_{i,j−1,t+1}) / |x_{i,j+1,t} − x_{i,j−1,t}| + α^j_{it} (x_{i+1,j,t+1} − x_{i−1,j,t+1}) / |x_{i+1,j,t} − x_{i−1,j,t}| ),   (80)

where α^i_{jt} (α^j_{it}) is α^i (α^j) at the j-th (i-th) point on the i-th meridian (j-th parallel) at time t. The time derivative is considered constant on the finite volume and is approximated by the finite difference

∫_{V_{ijt}} ∂_t x dS = m(V_{ijt}) (x_{i,j,t+1} − x_{ijt}) / Δt.   (81)

Using Eqs. (75), (77), (80), and (81), we get

m(V_{ijt}) (x_{i,j,t+1} − x_{ijt}) / Δt = Σ_{(p,q)∈N_1} Σ_{r∈{−1,1}} [ (m(e^{pqr}_{ijt})/β^{pqr}_{ijt}) (x_{i+p,j+q,t+1} − x_{i,j,t+1}) / |x_{i+p,j+q,t} − x_{i,j,t}| − (γ^{pqr}_{ijt}/(4β^{pqr}_{ijt})) ( Σ_{(l,m)∈B(1(p,q,r))} x_{i+l,j+m,t+1} − Σ_{(l,m)∈B(p,q)} x_{i+l,j+m,t+1} ) ] + m(V_{ijt}) f N_{ijt} + (m(V_{ijt})/2) ( α^i_{jt} (x_{i,j+1,t+1} − x_{i,j−1,t+1}) / |x_{i,j+1,t} − x_{i,j−1,t}| + α^j_{it} (x_{i+1,j,t+1} − x_{i−1,j,t+1}) / |x_{i+1,j,t} − x_{i−1,j,t}| ).   (82)

We have a system of n_i × n_j equations with n_i × n_j unknowns x_{ij,t+1}, where i = 1, …, n_i and j = 1, …, n_j. The values of α^i_{jt} (α^j_{it}, respectively) are computed before the system of equations (82) is solved. We obtain these values by solving Eq. (61). Approximating the derivative in (61) by the backward difference and taking the right-hand side at the discrete points, we get

(α^i_{jt} − α^i_{j−1,t}) / |x_{i,j,t} − x_{i,j−1,t}| = −U^i_{j−1/2,t} k^i_{1,j−1/2,t} + ⟨U^i_t k^i_{1t}⟩^i + ω ( L^i_t / (n_j |x_{i,j,t} − x_{i,j−1,t}|) − 1 ),   (83)

where

k^i_{m,j−1/2,t} = (k^i_{m,j,t} + k^i_{m,j−1,t})/2, m = 1, 2,   (84)

k^i_{m,j,t} = kN^i_{jt} · N^i_{m,j,t}, m = 1, 2,   (85)

N^i_{1,j,t} = N_{ijt},   (86)

N^i_{2,j,t} = N^i_{1,j,t} × T^i_{jt},   (87)

T^i_{jt} = (x_{i,j+1,t} − x_{i,j−1,t}) / |x_{i,j+1,t} − x_{i,j−1,t}|,   (88)

kN^i_{jt} = ( (x_{i,j+1,t} − x_{i,j,t})/|x_{i,j+1,t} − x_{i,j,t}| − (x_{i,j,t} − x_{i,j−1,t})/|x_{i,j,t} − x_{i,j−1,t}| ) / ( (|x_{i,j+1,t} − x_{i,j,t}| + |x_{i,j,t} − x_{i,j−1,t}|)/2 ),   (89)

U^i_{j−1/2,t} = (U^i_{j,t} + U^i_{j−1,t})/2,   (90)

U^i_{j,t} = (ε kN^i_{jt} + f N_{ijt}) · N^i_{1,j,t},   (91)

⟨U^i k^i_1⟩^i = (1/L^i_t) Σ_{l=1}^{n_j} h^i_l (U^i_{l−1/2} k^i_{1,l−1/2}), with h^i_l = |x_{i,l,t} − x_{i,l−1,t}|,   (92)

L^i_t = Σ_{j=1}^{n_j} |x_{i,j,t} − x_{i,j−1,t}|.   (93)

From (83), (92) and (93) we get

α^i_{jt} = α^i_{j−1,t} − |x_{i,j,t} − x_{i,j−1,t}| U^i_{j−1/2,t} k^i_{1,j−1/2,t} + |x_{i,j,t} − x_{i,j−1,t}| (1/L^i_t) Σ_{l=1}^{n_j} |x_{i,l,t} − x_{i,l−1,t}| U^i_{l−1/2,t} k^i_{1,l−1/2,t} + ω ( L^i_t/n_j − |x_{i,j,t} − x_{i,j−1,t}| ).   (94)

Because α^i_{0,t} = 0 (α^j_{0,t} = 0), every value of α^i_{jt} (α^j_{it}) can be computed before solving the system of equations (82). The system of equations (82) can be solved using the BiCGSTAB method [4, 54].
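A sparse solve of this kind can be sketched with SciPy's BiCGSTAB routine (a toy diagonally dominant tridiagonal system stands in for the assembled FVM matrix; `scipy` is an assumed dependency and the call follows the `scipy.sparse.linalg.bicgstab` interface):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import bicgstab

# Toy tridiagonal, nonsymmetric, diagonally dominant system standing in
# for the assembled FVM matrix.
n = 100
A = diags([-1.0, 2.5, -1.2], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

x, info = bicgstab(A, b)              # info == 0 signals convergence
print(info, np.linalg.norm(A @ x - b))
```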

5.6 Discretization of the Oblique Derivative BVP for the Laplace Equation

5.6.1 Approximation of the Laplace Equation

In this section, we introduce the FVM for a discretization of Eq. (4) on non-uniform grids. We discretize the domain Ω by a non-uniform regular hexahedron grid using the approach described in the previous section. The vertices constructed in this way are the representative points of the finite volumes. The vertices x^{p,q,r}_{i,j,k} of the finite volume are constructed at the center of eight neighbouring representative points using Eq. (36). Using the same principles as in Eqs. (30) and (31) we obtain Eq. (32). Unlike for uniform grids, where the vector x_{i+p,j+q,k+r} − x_{i,j,k} and the normal vector n^{p,q,r}_{i,j,k} are parallel, we cannot use Eq. (33) to approximate the normal derivative. For this approximation we utilize the same idea as in Sect. 5.1 of splitting the derivative into three linearly independent directions.

One of the vectors is the unit vector v^{p,q,r}_{i,j,k}, which points from the point x_{i,j,k} to the neighbouring point x_{i+p,j+q,k+r}, and is given by

v^{p,q,r}_{i,j,k} = (x_{i+p,j+q,k+r} − x_{i,j,k}) / |x_{i+p,j+q,k+r} − x_{i,j,k}|.   (95)

Let us introduce new operations on the set N_1:

⊕(p, q, r) = (p, 1, 1) if p ≠ 0;  (1, q, 1) if q ≠ 0;  (1, 1, r) if r ≠ 0,
⊖(p, q, r) = (p, −1, −1) if p ≠ 0;  (−1, q, −1) if q ≠ 0;  (−1, −1, r) if r ≠ 0,
⊗(p, q, r) = (p, 1, −1) if p ≠ 0;  (1, q, −1) if q ≠ 0;  (1, −1, r) if r ≠ 0,
⊘(p, q, r) = (p, −1, 1) if p ≠ 0;  (−1, q, 1) if q ≠ 0;  (−1, 1, r) if r ≠ 0.

Thanks to these operations we can write the vertices of the boundary e^{p,q,r}_{i,j,k} as x^{⊕(p,q,r)}_{i,j,k}, x^{⊖(p,q,r)}_{i,j,k}, x^{⊗(p,q,r)}_{i,j,k} and x^{⊘(p,q,r)}_{i,j,k}. These vertices are used to compute the tangent vectors. The first tangent vector t^{p,q,r}_{i,j,k} to the boundary e^{p,q,r}_{i,j,k} is given by

t^{p,q,r}_{i,j,k} = (x^{⊕(p,q,r)}_{i,j,k} − x^{⊖(p,q,r)}_{i,j,k}) / |x^{⊕(p,q,r)}_{i,j,k} − x^{⊖(p,q,r)}_{i,j,k}|.   (96)

The second tangent vector f^{p,q,r}_{i,j,k} is given by the other two vertices of e^{p,q,r}_{i,j,k}:

f^{p,q,r}_{i,j,k} = (x^{⊗(p,q,r)}_{i,j,k} − x^{⊘(p,q,r)}_{i,j,k}) / |x^{⊗(p,q,r)}_{i,j,k} − x^{⊘(p,q,r)}_{i,j,k}|.   (97)

Fig. 14 The finite volume

The normal vector to the boundary of the finite volume is then defined by

n^{p,q,r}_{i,j,k} = t^{p,q,r}_{i,j,k} × f^{p,q,r}_{i,j,k},   (98)

where n^{p,q,r}_{i,j,k} is the outer normal relative to the finite volume V_{i,j,k} (see Fig. 14). Since the vector v^{p,q,r}_{i,j,k} can be expressed as a linear combination of n^{p,q,r}_{i,j,k}, t^{p,q,r}_{i,j,k} and f^{p,q,r}_{i,j,k}, it holds that

∇T · v^{p,q,r}_{i,j,k} = ∇T · (β^{p,q,r}_{i,j,k} n^{p,q,r}_{i,j,k} + α^{p,q,r}_{i,j,k} t^{p,q,r}_{i,j,k} + γ^{p,q,r}_{i,j,k} f^{p,q,r}_{i,j,k}) = β^{p,q,r}_{i,j,k} ∇T · n^{p,q,r}_{i,j,k} + α^{p,q,r}_{i,j,k} ∇T · t^{p,q,r}_{i,j,k} + γ^{p,q,r}_{i,j,k} ∇T · f^{p,q,r}_{i,j,k},   (99)

where the coefficients α^{p,q,r}_{i,j,k}, β^{p,q,r}_{i,j,k} and γ^{p,q,r}_{i,j,k} are obtained by solving the linear system of equations

v^{p,q,r}_{i,j,k} = β^{p,q,r}_{i,j,k} n^{p,q,r}_{i,j,k} + α^{p,q,r}_{i,j,k} t^{p,q,r}_{i,j,k} + γ^{p,q,r}_{i,j,k} f^{p,q,r}_{i,j,k}.   (100)

Therefore, for the derivative in the normal direction we get

∇T · n^{p,q,r}_{i,j,k} = (1/β^{p,q,r}_{i,j,k}) ( ∇T · v^{p,q,r}_{i,j,k} − α^{p,q,r}_{i,j,k} ∇T · t^{p,q,r}_{i,j,k} − γ^{p,q,r}_{i,j,k} ∇T · f^{p,q,r}_{i,j,k} ).   (101)

Equation (101) is approximated by

(1/β^{p,q,r}_{i,j,k}) ( ∇T · v^{p,q,r}_{i,j,k} − α^{p,q,r}_{i,j,k} ∇T · t^{p,q,r}_{i,j,k} − γ^{p,q,r}_{i,j,k} ∇T · f^{p,q,r}_{i,j,k} ) ≈ (1/β^{p,q,r}_{i,j,k}) (T_{i,j,k} − T_{i+p,j+q,k+r}) / d^{p,q,r}_{i,j,k} − (α^{p,q,r}_{i,j,k}/β^{p,q,r}_{i,j,k}) (T^{⊕(p,q,r)}_{i,j,k} − T^{⊖(p,q,r)}_{i,j,k}) / |x^{⊕(p,q,r)}_{i,j,k} − x^{⊖(p,q,r)}_{i,j,k}| − (γ^{p,q,r}_{i,j,k}/β^{p,q,r}_{i,j,k}) (T^{⊗(p,q,r)}_{i,j,k} − T^{⊘(p,q,r)}_{i,j,k}) / |x^{⊗(p,q,r)}_{i,j,k} − x^{⊘(p,q,r)}_{i,j,k}|,   (102)

where T^{⊕(p,q,r)}_{i,j,k} are the values at the points x^{⊕(p,q,r)}_{i,j,k}. Equation (32) can be rewritten using Eq. (102) in the form

Σ_{(p,q,r)∈N_1} m(e^{p,q,r}_{i,j,k}) [ (1/β^{p,q,r}_{i,j,k}) (T_{i,j,k} − T_{i+p,j+q,k+r}) / d^{p,q,r}_{i,j,k} − (α^{p,q,r}_{i,j,k}/β^{p,q,r}_{i,j,k}) (T^{⊕(p,q,r)}_{i,j,k} − T^{⊖(p,q,r)}_{i,j,k}) / |x^{⊕(p,q,r)}_{i,j,k} − x^{⊖(p,q,r)}_{i,j,k}| − (γ^{p,q,r}_{i,j,k}/β^{p,q,r}_{i,j,k}) (T^{⊗(p,q,r)}_{i,j,k} − T^{⊘(p,q,r)}_{i,j,k}) / |x^{⊗(p,q,r)}_{i,j,k} − x^{⊘(p,q,r)}_{i,j,k}| ] = 0.   (103)

For the finite volumes that are adjacent to the boundary finite volumes, the value T_{i+p,j+q,k+r} is given by the Dirichlet BC (6). Similarly as in Eq. (42), the values T^{⊕(p,q,r)}_{i,j,k} are not given at representative points, but at the points x^{⊕(p,q,r)}_{i,j,k}, which are vertices of the finite volume. They are at the center of the corresponding representative points (36), so the values T^{⊕(p,q,r)}_{i,j,k} are approximated by

T^{⊕(p,q,r)}_{i,j,k} = T(x^{⊕(p,q,r)}_{i,j,k}) = (1/8) Σ_{(l,m,n)∈B(⊕(p,q,r))} T_{i+l,j+m,k+n},   (104)

and the values T^{⊖(p,q,r)}_{i,j,k}, T^{⊗(p,q,r)}_{i,j,k}, T^{⊘(p,q,r)}_{i,j,k} in Eq. (103) can be expressed similarly. It is worth noting that for a uniform grid Eq. (103) is the same as Eq. (35).

5.6.2 Approximation of the Oblique Derivative Boundary Condition

In this section we introduce a higher order discretization of the oblique derivative BC (5). The computational domain is divided into finite volumes as in the previous subsection. However, finite volumes are also constructed around the representative points on the boundary Γ. Vertices common to boundary finite volumes and inner finite volumes are located at the centers of the representative points as defined by (36). The other vertices of the boundary finite volumes are obtained by mirroring the former ones through Γ. The set of added finite volumes is denoted by O.


As in Sect. 5.1, we understand Eq. (5) as an advection equation and we can obtain Eq. (50) using the same principles. The difference between this method and the first order upwind method is in the approximation of the value T^{p,q,r}_{i,j,k}. In this case, we do not simply assign the value T_{i,j,k} or T_{i+p,j+q,k+r} to the boundary, but we correct it using a gradient. With this method, there are two ways to approximate the value T^{p,q,r}_{i,j,k}:

T^{p,q,r}_{i,j,k} = T_{i,j,k} + ∇T_{i,j,k} · (x^{p,q,r}_{i,j,k} − x_{i,j,k}),   (105)

T^{p,q,r}_{i,j,k} = T_{i+p,j+q,k+r} + ∇T_{i+p,j+q,k+r} · (x^{p,q,r}_{i,j,k} − x_{i+p,j+q,k+r}).   (106)

We choose an appropriate approximation using an upwind method: if s^{p,q,r}_{i,j,k} > 0 we use Eq. (105), and Eq. (106) otherwise. Then Eq. (50) becomes

Σ_{(p,q,r)∈N_1^in(i,j,k)} (T_{i+p,j+q,k+r} + ∇T_{i+p,j+q,k+r} · (x^{p,q,r}_{i,j,k} − x_{i+p,j+q,k+r})) s^{p,q,r}_{i,j,k} + Σ_{(p,q,r)∈N_1^out(i,j,k)} (T_{i,j,k} + ∇T_{i,j,k} · (x^{p,q,r}_{i,j,k} − x_{i,j,k})) s^{p,q,r}_{i,j,k} − Σ_{(p,q,r)∈N_1^in(i,j,k)} T_{i,j,k} s^{p,q,r}_{i,j,k} − Σ_{(p,q,r)∈N_1^out(i,j,k)} T_{i,j,k} s^{p,q,r}_{i,j,k} = |V_{i,j,k}| δg.   (107)

By using the functions max(0, s^{p,q,r}_{i,j,k}) and min(0, s^{p,q,r}_{i,j,k}) we can write

Σ_{(p,q,r)∈N_1} [ (T_{i+p,j+q,k+r} + ∇T_{i+p,j+q,k+r} · (x^{p,q,r}_{i,j,k} − x_{i+p,j+q,k+r})) min(0, s^{p,q,r}_{i,j,k}) + (T_{i,j,k} + ∇T_{i,j,k} · (x^{p,q,r}_{i,j,k} − x_{i,j,k})) max(0, s^{p,q,r}_{i,j,k}) − T_{i,j,k} s^{p,q,r}_{i,j,k} ] = |V_{i,j,k}| δg.   (108)
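The upwind selection in (108) can be sketched for a single boundary finite volume (a hypothetical one-dimensional stand-in: `T_up` holds the per-face values corrected from the center as in (105), `T_down` those corrected from the neighbour as in (106), and the sign of each advection coefficient in `s` picks the upwind side):

```python
def upwind_balance(T_center, T_up, T_down, s, volume, dg):
    """Residual of an Eq. (108)-style upwind balance for one finite volume.

    T_center -- value T at the volume center
    T_up     -- per-face values corrected from the center   (Eq. (105) analogue)
    T_down   -- per-face values corrected from the neighbour (Eq. (106) analogue)
    s        -- per-face advection coefficients; the sign picks the upwind side
    Returns lhs - |V| * dg, which the solver drives to zero.
    """
    lhs = sum(td * min(0.0, si) + tu * max(0.0, si) - T_center * si
              for tu, td, si in zip(T_up, T_down, s))
    return lhs - volume * dg

# with zero gradient corrections (T_up == T_down == T_center) every face
# contribution cancels, so the balance reduces to -|V| * dg
print(upwind_balance(1.0, [1.0, 1.0], [1.0, 1.0], [0.5, -0.3], 2.0, 0.0))   # 0.0
```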

The gradient on the finite volume V_{i,j,k} can be expressed using derivatives in three linearly independent directions. Let us denote these directions p, q, r. For the derivatives in these directions it holds that

∂T/∂p = ∇T · p = (∂T/∂x) p_x + (∂T/∂y) p_y + (∂T/∂z) p_z,
∂T/∂q = ∇T · q = (∂T/∂x) q_x + (∂T/∂y) q_y + (∂T/∂z) q_z,   (109)
∂T/∂r = ∇T · r = (∂T/∂x) r_x + (∂T/∂y) r_y + (∂T/∂z) r_z.

If we look at (109) as a system of linear equations for the unknowns ∂T/∂x, ∂T/∂y, ∂T/∂z, we obtain the solution

∂T/∂x = −( −p_z q_y (∂T/∂r) + p_y q_z (∂T/∂r) − q_z (∂T/∂p) r_y + p_z (∂T/∂q) r_y + q_y (∂T/∂p) r_z − p_y (∂T/∂q) r_z ) / ( p_z q_y r_x − p_y q_z r_x − p_z q_x r_y + p_x q_z r_y + p_y q_x r_z − p_x q_y r_z ),
∂T/∂y = −( p_z q_x (∂T/∂r) − p_x q_z (∂T/∂r) + q_z (∂T/∂p) r_x − p_z (∂T/∂q) r_x − q_x (∂T/∂p) r_z + p_x (∂T/∂q) r_z ) / ( p_z q_y r_x − p_y q_z r_x − p_z q_x r_y + p_x q_z r_y + p_y q_x r_z − p_x q_y r_z ),   (110)
∂T/∂z = −( −p_y q_x (∂T/∂r) + p_x q_y (∂T/∂r) − q_y (∂T/∂p) r_x + p_y (∂T/∂q) r_x + q_x (∂T/∂p) r_y − p_x (∂T/∂q) r_y ) / ( p_z q_y r_x − p_y q_z r_x − p_z q_x r_y + p_x q_z r_y + p_y q_x r_z − p_x q_y r_z ),

and thus

∇T_{i,j,k} = ( p × q (∂T/∂r) + q × r (∂T/∂p) + r × p (∂T/∂q) ) / det(p, q, r),   (111)

where

det(p, q, r) = det( [p_x p_y p_z; q_x q_y q_z; r_x r_y r_z] ).   (112)

If the finite volume on which we want to reconstruct the gradient is an inner finite volume, then p, q, r are defined by

p = (x_{i+1,j,k} − x_{i−1,j,k}) / |x_{i+1,j,k} − x_{i−1,j,k}|,
q = (x_{i,j+1,k} − x_{i,j−1,k}) / |x_{i,j+1,k} − x_{i,j−1,k}|,   (113)
r = (x_{i,j,k+1} − x_{i,j,k−1}) / |x_{i,j,k+1} − x_{i,j,k−1}|.

The approximations of the derivatives in these directions are

∂T/∂p ≈ (T_{i+1,j,k} − T_{i−1,j,k}) / |x_{i+1,j,k} − x_{i−1,j,k}|,
∂T/∂q ≈ (T_{i,j+1,k} − T_{i,j−1,k}) / |x_{i,j+1,k} − x_{i,j−1,k}|,   (114)
∂T/∂r ≈ (T_{i,j,k+1} − T_{i,j,k−1}) / |x_{i,j,k+1} − x_{i,j,k−1}|.
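The reconstruction (111) can be verified numerically. A sketch with numpy (`p`, `q`, `r` are any three linearly independent directions, `dTp`, `dTq`, `dTr` the directional derivatives ∇T·p, ∇T·q, ∇T·r):

```python
import numpy as np

def reconstruct_gradient(p, q, r, dTp, dTq, dTr):
    """Invert the system (109): recover grad T from three directional
    derivatives via the cross-product form of Eq. (111)."""
    p, q, r = (np.asarray(v, dtype=float) for v in (p, q, r))
    det = np.dot(p, np.cross(q, r))    # det(p, q, r), nonzero by independence
    return (np.cross(p, q) * dTr + np.cross(q, r) * dTp
            + np.cross(r, p) * dTq) / det

# check against a known gradient and deliberately skewed directions
g = np.array([1.0, -2.0, 0.5])
p = np.array([1.0, 0.0, 0.0])
q = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
r = np.array([0.0, 0.5, 1.0])
grad = reconstruct_gradient(p, q, r, g @ p, g @ q, g @ r)
print(grad)   # recovers [ 1.  -2.   0.5]
```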


On the other hand, if the finite volume is a boundary finite volume, then one of the neighbouring finite volumes does not exist. Let us say the finite volume V_{i−1,j,k} does not exist. Then we cannot use T_{i−1,j,k} for the reconstruction, but we can use the oblique derivative g(x_{ijk}) in the direction s(x_{ijk}). Let us denote them by g and s. Then p, q, r are defined by

p = s,
q = (x_{i,j+1,k} − x_{i,j−1,k}) / |x_{i,j+1,k} − x_{i,j−1,k}|,   (115)
r = (x_{i,j,k+1} − x_{i,j,k−1}) / |x_{i,j,k+1} − x_{i,j,k−1}|.

The approximations of the derivatives in these directions are

∂T/∂p = δg,
∂T/∂q ≈ (T_{i,j+1,k} − T_{i,j−1,k}) / |x_{i,j+1,k} − x_{i,j−1,k}|,   (116)
∂T/∂r ≈ (T_{i,j,k+1} − T_{i,j,k−1}) / |x_{i,j,k+1} − x_{i,j,k−1}|,

and so

∇T_{i,j,k} = ( s × q (T_{i,j,k+1} − T_{i,j,k−1}) / |x_{i,j,k+1} − x_{i,j,k−1}| + q × r δg + r × s (T_{i,j+1,k} − T_{i,j−1,k}) / |x_{i,j+1,k} − x_{i,j−1,k}| ) / det(s, q, r).   (117)

Substituting Eq. (117) into Eq. (108), we get the equations for the boundary finite volumes V_{i,j,k} ∈ O. Due to the construction of our scheme, the equations for these finite volumes may require two neighbouring finite volumes in the directions of q and r. For those which do not have such neighbours, we have to prescribe the Dirichlet BC, which is also in accordance with the compatibility of BCs mentioned in the introduction. All these equations, together with the equations from the discretization of the Laplace equation, form a numerical scheme for solving the BVP (4), (5) and (6).

5.7 Numerical Experiments

In the first experiment we solve the BVP (4), (5) and (6) with BCs obtained from an artificial harmonic function defined on a computational domain, see Fig. 15. This computational domain is bounded by four planar side boundaries, a spherical upper boundary and a bottom boundary given by a perturbed sphere.

Fig. 15 Computational domain for the first experiment

In order to test the numerical scheme, we constructed the coarsest grid first. Refined grids were then constructed by adding new representative points in-between the representative points of the previous grid using Eq. (36). The exact solution was chosen as T(x) = 1/|x − (0.1, 0.2, 0.3)| and its values were used to generate the oblique derivative and the Dirichlet BC. The oblique derivative BC was prescribed on the perturbed sphere forming the bottom boundary. The vectors in the direction of ∇T(x) were rotated alternately by the angle π/6 around the x, y, z axes to get the vectors v, see Fig. 16. Table 7 shows the L2-norm and maximum norm of the residuals between the obtained FVM solutions and the exact solution, and the achieved EOC.

The second experiment was computed on the same computational domain with the exact solution taken from EGM2008, using only the SH coefficients up to degree and order 5. The oblique derivative was generated as the first derivative of the disturbing potential (the exact solution) in the radial direction. This radial direction represents the oblique direction since it differs from the direction of the normal vector to the bottom boundary. Table 8 shows the L2-norm and maximum norm of the residuals between the obtained FVM solutions and the exact solution, and the achieved EOC. Both experiments show that the EOC of our FVM approach is about 1.6, which means that if we decrease the maximal size of the finite volumes by a factor of 2, the error of our solution decreases approximately by a factor of 3 (2^1.6 ≈ 3.03).

The following numerical experiments were performed in the domain above the Himalayas bounded by the 60° and 110° meridians and the 20° and 50° parallels. The EGM2008 up to degree 2160 was used to generate all BCs and the harmonic function.


Fig. 16 Oblique derivative directions for the first experiment

Table 7 The L2 norm and max norm of residuals, and EOC of FVM for the problem with the exact solution T(x) = |x − (0.1, 0.2, 0.3)|⁻¹

hmax      ||e||_L2        EOC     ||e||_max       EOC_max
0.125     0.000229        –       0.00269         –
0.0642    7.00358e−05     1.761   0.00102         1.429
0.0324    2.43723e−05     1.545   0.000418        1.31582
0.0162    7.90755e−06     1.635   0.000153        1.45607
0.00817   2.44251e−06     1.702   5.23081e−05     1.56279

Table 8 The L2 norm and max norm of residuals, and EOC of FVM for the problem with the exact solution taken from EGM2008 up to degree and order 5

hmax      ||e||_L2        EOC     ||e||_max       EOC_max
0.125     9.25506e−05     –       0.000911        –
0.0642    2.39154e−05     2.01    0.000348        1.43
0.0324    8.80662e−06     1.462   0.0001389       1.349
0.0162    2.96979e−06     1.579   4.87918e−05     1.51659
0.00817   9.39478e−07     1.667   1.5969e−05      1.61833
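The EOC columns in Tables 7 and 8 follow the standard definition EOC = log(e_coarse/e_fine)/log(h_coarse/h_fine); a sketch for the first refinement step of Table 7 (the small difference from the tabulated 1.761 comes from the rounding of the printed error norms):

```python
import math

def eoc(h_coarse, h_fine, e_coarse, e_fine):
    """Experimental order of convergence between two refinement levels."""
    return math.log(e_coarse / e_fine) / math.log(h_coarse / h_fine)

# first refinement step of Table 7, L2 norms
print(round(eoc(0.125, 0.0642, 0.000229, 7.00358e-05), 3))   # 1.778
```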

The bottom boundary was given by grid points located on the Earth's surface. Their spacing in the horizontal directions was uniform, and their heights were interpolated from the SRTM30 PLUS topography model [6], see Fig. 17a. An upper boundary


Fig. 17 (a) The Earth's surface topography over the Himalayas (the bottom boundary) [m], (b) the disturbing potential from EGM2008 on the Earth's surface [m² s⁻²], (c) the disturbing potential from our FVM solution [m² s⁻²], (d, e, f) residuals between EGM2008 and our FVM solution, where the grid density is: (d) 501 × 301 × 25, (e) 1001 × 601 × 49, (f) 2001 × 1201 × 97 points [m² s⁻²]

was chosen at the height of 240 km above the reference ellipsoid, corresponding to an average altitude of the GOCE satellite orbits. The resulting 3D computational grid constructed by our surface evolution method is non-uniform. On the bottom boundary, the first derivatives in the radial direction were prescribed, representing the oblique derivative BC. On the rest of the boundary, the Dirichlet BC in the form of the disturbing potential were prescribed. All these BCs were generated from the EGM2008 model up to degree 2160. Three experiments with different grid densities were performed, namely grids with densities of 501 × 301 × 25, 1001 × 601 × 49 and 2001 × 1201 × 97 points. They approximately correspond to spacings of 0.1° × 0.1° × 10 km, 0.05° × 0.05° × 5 km and 0.025° × 0.025° × 2.5 km. Figure 17b shows EGM2008 at points on the Earth's topography as the harmonic function that we are reconstructing. The obtained FVM solution for the densest grid is depicted in Fig. 17c. Residuals between EGM2008 and our FVM solutions on the bottom boundary can be seen in Fig. 17d, e, f. The statistical characteristics of


Table 9 Statistics of residuals between our FVM solution and the EGM2008 in the domain above the Himalayas [m² s⁻²]

  Resolution                  Grid density        Min. value   Mean value   Max. value   St. deviation
  0.1° × 0.1° × 10 km         501 × 301 × 25      −5.07        1.79         23.05        2.3
  0.05° × 0.05° × 5 km        1001 × 601 × 49     −1.68        0.87         11.98        1.09
  0.025° × 0.025° × 2.5 km    2001 × 1201 × 97    −0.44        0.33         3.90         0.37

the corresponding residuals are summarized in Table 9. It is evident that refinement of the grid leads to higher accuracy of the FVM solution, giving better agreement with EGM2008. Standard deviations (STDs) decrease from 2.3 to 0.37 m²s⁻² (∼2.3 dm to 3.7 cm) and the maximal values from 23.1 to 3.9 m²s⁻² (∼2.3 m to 3.9 dm).
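Two of the quantitative statements in this experiment can be checked in a few lines: the grid spacings implied by the grid densities over the 50° × 30° × 240 km domain (treating the radial extent as the full 0–240 km for this rough check), and the conversion of potential residuals [m²s⁻²] into metre-like units, which divides by a nominal normal gravity; the value γ ≈ 9.81 m s⁻² below is an assumption, as the chapter does not state the constant used:

```python
GAMMA = 9.81  # assumed nominal normal gravity [m/s^2]

def potential_to_height(dT):
    """Bruns-type scaling of a disturbing-potential residual [m^2/s^2] to metres."""
    return dT / GAMMA

def grid_spacing(extent, n_points):
    """Spacing of a uniform grid of n_points over an interval of the given extent."""
    return extent / (n_points - 1)

# 501 points over 50 deg of longitude -> 0.1 deg; 25 levels over ~240 km -> 10 km.
# An STD of 2.3 m^2/s^2 corresponds to roughly 0.23 m (~2.3 dm), as quoted above.
```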

6 Summary

In this chapter we have presented several numerical approaches for solving the oblique derivative boundary value problem, based on the boundary element and finite volume methods. Some of these approaches have been applied on uniform grids and one on non-uniform grids. The presented numerical experiments have confirmed that the obtained numerical solutions converge to the exact solutions. Finally, we may conclude that the presented numerical methods can provide an important basis for solving various geoscientific problems described by partial differential equations.

Acknowledgements This work was supported by Grants APVV-15-0522, VEGA 1/0608/15 and VEGA 1/0714/15.

Literature

1. Andersen, O.B., Knudsen, P., Berry, P.: The DNSC08 ocean wide altimetry derived gravity field. Presented at EGU-2008, Vienna, Austria (2008)
2. Aoyama, Y., Nakano, J.: RS/6000 SP: Practical MPI Programming. IBM, Poughkeepsie/New York (1999)
3. Baláš, J., Sládek, J., Sládek, V.: Stress Analysis by Boundary Element Methods. Elsevier, Amsterdam (1989)
4. Barrett, R., Berry, M., Chan, T.F., Demmel, J., Donato, J., Dongarra, J., Eijkhout, V., Pozo, R., Romine, C., Van der Vorst, H.: Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM, Philadelphia (1994)
5. Bauer, F.: An alternative approach to the oblique derivative problem in potential theory. PhD thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern. Shaker Verlag, Aachen (2004)
6. Becker, J.J., Sandwell, D.T., Smith, W.H.F., Braud, J., Binder, B., Depner, J., Fabre, D., Factor, J., Ingalls, S., Kim, S.H., Ladner, R., Marks, K., Nelson, S., Pharaoh, A., Trimmer, R., Von Rosenberg, J., Wallace, G., Weatherall, P.: Global bathymetry and elevation data at 30 arc seconds resolution: SRTM30 PLUS. Mar. Geod. 32(4), 355–371 (2009)
7. Bitzadse, A.V.: Boundary-Value Problems for Second-Order Elliptic Equations. North-Holland, Amsterdam (1968)
8. Bjerhammar, A., Svensson, L.: On the geodetic boundary value problem for a fixed boundary surface. A satellite approach. Bull. Geod. 57(1–4), 382–393 (1983)
9. Brebbia, C.A., Telles, J.C.F., Wrobel, L.C.: Boundary Element Techniques, Theory and Applications in Engineering. Springer, New York (1984)
10. Čunderlík, R., Mikula, K., Mojzeš, M.: 3D BEM application to Neumann geodetic BVP using the collocation with linear basis functions. In: Proceedings of ALGORITMY 2002, Conference on Scientific Computing, Podbanské, pp. 268–275 (2002)
11. Čunderlík, R., Mikula, K., Mojžeš, M.: Numerical solution of the linearized fixed gravimetric boundary-value problem. J. Geod. 82, 15–29 (2008)
12. Čunderlík, R., Mikula, K.: Direct BEM for high-resolution gravity field modelling. Stud. Geophys. Geod. 54(2), 219–238 (2010)
13. Čunderlík, R., Mikula, K., Špir, R.: An oblique derivative in the direct BEM formulation of the fixed gravimetric BVP. IAG Symp. 137, 227–231 (2012)
14. Eymard, R., Gallouet, T., Herbin, R.: Finite volume approximation of elliptic problems and convergence of an approximate gradient. Appl. Numer. Math. 37(1–2), 31–53 (2001)
15. Fašková, Z.: Numerical methods for solving geodetic boundary value problems. PhD thesis, SvF STU, Bratislava (2008)
16. Fašková, Z., Čunderlík, R., Mikula, K.: Finite element method for solving geodetic boundary value problems. J. Geod. 84(2), 135–144 (2010)
17. Freeden, W.: Harmonic splines for solving boundary value problems of potential theory. In: Mason, J.C., Cox, M.G. (eds.) Algorithms for Approximation. The Institute of Mathematics and Its Applications, Conference Series, vol. 10, pp. 507–529. Clarendon Press, Oxford (1987)
18. Freeden, W., Gerhards, C.: Geomathematically Oriented Potential Theory. CRC Press, Taylor & Francis Group, Boca Raton, Florida (2013)
19. Freeden, W., Kersten, H.: The Geodetic Boundary Value Problem Using the Known Surface of the Earth. Veröff. Geod. Inst. RWTH Aachen, 29 (1980)
20. Freeden, W., Kersten, H.: A constructive approximation theorem for the oblique derivative problem in potential theory. Math. Methods Appl. Sci. 3, 104–114 (1981)
21. Freeden, W., Michel, V.: Multiscale Potential Theory (With Applications to Geoscience). Birkhäuser, Boston (2004)
22. Freeden, W., Nutz, H.: On the solution of the oblique derivative problem by constructive Runge-Walsh concepts. In: Pesenson, I., Le Gia, Q., Mayeli, A., Mhaskar, H., Zhou, D.X. (eds.) Recent Applications of Harmonic Analysis to Function Spaces, Differential Equations, and Data Science. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham (2017)
23. Greengard, L., Rokhlin, V.: A fast algorithm for particle simulation. J. Comput. Phys. 73, 325–348 (1987)
24. Gutting, M.: Fast multipole methods for oblique derivative problems. PhD thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern. Shaker Verlag, Aachen (2007)
25. Gutting, M.: Fast multipole accelerated solution of the oblique derivative boundary value problem. Int. J. Geomath. 3, 223–252 (2012)
26. Gutting, M.: Fast Spherical/Harmonic Spline Modeling. In: Handbook of Geomathematics, 2nd edn., pp. 2711–2746. Springer, Berlin/Heidelberg (2015)
27. Hartmann, F.: Introduction to Boundary Elements. Theory and Applications. Springer, Berlin (1989)
28. Holota, P.: Coerciveness of the linear gravimetric boundary-value problem and a geometrical interpretation. J. Geod. 71, 640–651 (1997)
29. Holota, P., Nesvadba, O.: Model refinements and numerical solution of weakly formulated boundary-value problems in physical geodesy. In: IAG Symposium, vol. 132, pp. 320–326 (2008)
30. Húska, M., Medl'a, M., Mikula, K., Novysedlák, P., Remešíková, M.: A new form-finding method based on mean curvature flow of surfaces. In: Handlovičová, A., Minarechová, Z., Ševčovič, D. (eds.) ALGORITMY 2012, 19th Conference on Scientific Computing, Podbanske, 9–14 Sept 2012, Proceedings of Contributed Papers and Posters, pp. 120–131. Publishing House of STU (2012). ISBN 978-80-227-3742-5
31. Klees, R.: Boundary value problems and approximation of integral equations by finite elements. Manuscr. Geodaet. 20, 345–361 (1995)
32. Klees, R., van Gelderen, M., Lage, C., Schwab, C.: Fast numerical solution of the linearized Molodensky problem. J. Geod. 75, 349–362 (2001)
33. Koch, K.R., Pope, A.J.: Uniqueness and existence for the geodetic boundary value problem using the known surface of the Earth. Bull. Geod. 46, 467–476 (1972)
34. Laursen, M.E., Gellert, M.: Some criteria for numerically integrated matrices and quadrature formulas for triangles. Int. J. Numer. Meth. Eng. 12, 67–76 (1978)
35. Lehmann, R., Klees, R.: Numerical solution of geodetic boundary value problems using a global reference field. J. Geod. 73, 543–554 (1999)
36. LeVeque, R.J.: Finite Volume Methods for Hyperbolic Problems. Cambridge Texts in Applied Mathematics (2002). ISBN 978-0521009249
37. Lucquin, B., Pironneau, O.: Introduction to Scientific Computing. Wiley, Chichester (1998)
38. Macák, M., Čunderlík, R., Mikula, K., Minarechová, Z.: An upwind-based scheme for solving the oblique derivative boundary-value problem related to the physical geodesy. J. Geod. Sci. 5(1), 180–188 (2015)
39. Macák, M., Mikula, K., Minarechová, Z.: Solving the oblique derivative boundary-value problem by the finite volume method. In: ALGORITMY 2012, 19th Conference on Scientific Computing, Podbanske, 9–14 Sept 2012, Proceedings of Contributed Papers and Posters, pp. 75–84. Publishing House of STU (2012)
40. Macák, M., Mikula, K., Minarechová, Z., Čunderlík, R.: On an iterative approach to solving the nonlinear satellite-fixed geodetic boundary-value problem. In: IAG Symp., vol. 142, pp. 185–192 (2016)
41. Macák, M., Minarechová, Z., Mikula, K.: A novel scheme for solving the oblique derivative boundary-value problem. Stud. Geophys. Geod. 58(4), 556–570 (2014)
42. Mantič, V.: A new formula for the C-matrix in the Somigliana identity. J. Elast. 33(3), 191–201 (1993)
43. Mayer-Gürr, T., et al.: The new combined satellite only model GOCO03s. Presented at the GGHS-2012 in Venice (2012)
44. Medl'a, M., Mikula, K., Čunderlík, R., Macák, M.: Numerical solution to the oblique derivative boundary value problem on non-uniform grids above the Earth topography. J. Geod. 92(1), 1–19 (2017)
45. Meissl, P.: The use of finite elements in physical geodesy. Report 313, Geodetic Science and Surveying, The Ohio State University (1981)
46. Mikula, K., Remešíková, M., Sarkóci, P., Ševčovič, D.: Manifold evolution with tangential redistribution of points. SIAM J. Sci. Comput. 36(4), A1384–A1414 (2014)
47. Mikula, K., Ševčovič, D.: A direct method for solving an anisotropic mean curvature flow of planar curve with an external force. Math. Methods Appl. Sci. 27(13), 1545–1565 (2004)
48. Minarechová, Z., Macák, M., Čunderlík, R., Mikula, K.: High-resolution global gravity field modelling by the finite volume method. Stud. Geophys. Geod. 59, 1–20 (2015)
49. Miranda, C.: Partial Differential Equations of Elliptic Type. Springer, Berlin (1970)
50. Nesvadba, O., Holota, P., Klees, R.: A direct method and its numerical interpretation in the determination of the gravity field of the Earth from terrestrial data. In: IAG Symposium, vol. 130, pp. 370–376 (2007)
51. Pavlis, N.K., Holmes, S.A., Kenyon, S.C., Factor, J.K.: The development and evaluation of the Earth gravitational model 2008 (EGM2008). J. Geophys. Res. 117, B04406 (2012). https://doi.org/10.1029/2011JB008916
52. Schatz, A.H., Thomée, V., Wendland, W.L.: Mathematical Theory of Finite and Boundary Element Methods. Birkhäuser Verlag, Basel/Boston/Berlin (1990)
53. Shaofeng, B., Dingbo, C.: The finite element method for the geodetic boundary value problem. Manuscr. Geod. 16, 353–359 (1991)
54. Sleijpen, G.L.G., Fokkema, D.R.: BiCGstab(l) for linear equations involving unsymmetric matrices with complex spectrum. http://dspace.library.uu.nl/handle/1874/16827 (1993)
55. Zhao, K., Vouvakis, M., Lee, J.-F.: The adaptive cross approximation algorithm for accelerated method of moment computations of EMC problems. IEEE Trans. Electromagn. Compat. 47, 763–773 (2005)

Part V Reference Systems and Monitoring Methods

12 Geodetic Methods for Monitoring Crustal Motion and Deformation

Athanasios Dermanis

Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627
2 The Classical “Infinitesimal” Approximate Approach to Planar Deformation . . . 630
3 The New Rigorous Approach to Planar Deformation . . . . . . . . . . . . . . . . . 633
4 Invariance Characteristics of Deformation Parameters . . . . . . . . . . . . . . . . 640
5 The Various Types of Strain Rates and Their Invariance Characteristics . . . . . . 644
6 Separating Relative Motion from Deformation . . . . . . . . . . . . . . . . . . . . 648
7 Interpolation of Displacements or Velocities . . . . . . . . . . . . . . . . . . . . . 662
8 The Deformation of Surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676
9 Deformation Analysis on the Reference Ellipsoid . . . . . . . . . . . . . . . . . . . 683
10 Three-Dimensional Deformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 689
11 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 691
Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 694

Abstract The use of geodetic data for crustal deformation studies is studied for the two possible cases: (a) The comparison of shape at two epochs for which station coordinates are available in order to compute, at any desired point, invariant planar deformation parameters (strain parameters), such as principal strains, principal elongations, dilatation and shear. (b) The utilization of coordinates and

This chapter is part of the series Handbuch der Geodäsie, volume “Mathematische Geodäsie/ Mathematical Geodesy”, edited by Willi Freeden, Kaiserslautern. A. Dermanis () Department of Geodesy and Surveying (DGS), Aristotle University of Thessaloniki, Thessaloniki, Greece E-Mail: [email protected] © Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020 W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_108


velocities at a particular epoch for the computation of the time derivatives of the deformation parameters (strain rate parameters). The classical approximate “infinitesimal” theory is presented, as well as the widely used finite element method with triangular elements for the interpolation of station displacements and velocities. In addition, a new completely rigorous planar deformation theory, based on the singular value decomposition of the deformation gradient matrix, is presented for both strain and strain rate parameter computation. The invariance characteristics of all the above deformation parameters, under changes of the involved reference systems, are studied from a purely geodetic point of view different from that in classical mechanics. Emphasis is given to the separation of rigid motion of independent tectonic regions from their internal deformation, utilizing the concept of a discrete Tisserand reference system that best fits the geodetic subnetwork covering the relevant region. Interpolation of displacements or velocities using stochastic minimum mean square error prediction (known as collocation or kriging) is also examined, with emphasis on how it can become statistically relevant and rigorous based on sample covariance and cross-covariance functions. It is also shown how the planar deformation can be adapted to the study of surface deformation, with applications to the study of shell-like constructions in geodetic engineering and the deformation of the physical surface of the earth. The most important application presented is the study of horizontal deformation on the surface of the reference ellipsoid, utilizing either differences of geodetic coordinates between two epochs, or the horizontal components of station velocities. Finally, it is also shown how the rigorous theory of planar deformation can be extended to the three-dimensional case.

Zusammenfassung The use of geodetic data for studies of crustal deformation leads to two possible cases: (a) the comparison of the shape at two epochs, for which station coordinates are available, in order to compute invariant planar deformation parameters to any desired accuracy; (b) the utilization of coordinates and velocities in order to determine the time derivatives of the deformation parameters at a particular epoch. The present contribution is devoted to the classical approximate “infinitesimal” theory, as well as to the widely used finite element method with triangular elements for the interpolation of station displacements and velocities. A rigorous theory for the extension of planar deformation to the three-dimensional case is presented.

Keywords Crustal motion · Crustal deformation · Principal strains · Strain · Strain rate · Dilatation · Shear · Strain invariance · Surface deformation · Horizontal deformation

1 Introduction

Geodetic data analysis, restricted by the presence of rather large observational errors, treated the earth as a rigid body for a long time, despite the fact that evidence to the contrary had already been recorded in historic times, always associated with earthquakes. Another type of earth motion that is easy to record are landslides, but these are not associated with a motion of the earth crust as a whole and are of minor geophysical importance.

The proposal of the theory of continental drift by Alfred Wegener in 1915 and the emerging theories of plate motion have put forward the idea of crustal motion of a secular character, much different from the episodic changes of shape caused by earthquakes or the periodic deformations of earth tides. Plates floating on the asthenosphere (the semifluid upper mantle) are necessarily rotating around an axis passing through the earth's center, as already foreseen by Leonhard Euler in the middle of the eighteenth century as a corollary of one of his mathematical theorems. The actual geodetic confirmation of plate motion had to wait for the development of modern powerful space techniques with accuracies of a few millimeters and dedicated continuous monitoring, in order to identify plate motions with very small annual rates.

However, geodesists have made efforts to determine crustal motion even at times when the signals to be detected were of the same order as, if not below, the noise level. As reported by [29], displacements caused by earthquakes had already been detected geodetically in Japan as early as 1891, but the earliest methodological developments appeared in the 1920s [27, 33, 35, 36]. The developed methodology using triangular (finite) elements is still very popular despite its shortcomings. The latter are associated with two characteristics of the methodology. The first one is the interpretation of episodic relative motion across faults as strain, which in fact refers to continuous deformation.
The second problem is the calculation of over-optimistic accuracies, which may take into account the observation errors but grossly ignore the modeling errors associated with the hypothesis of homogeneous strain within each triangle. Later these ideas and similar approaches were developed for applications in seismic areas such as the San Andreas fault in California, and were greatly enhanced with the emergence of high-accuracy electronic distance measurement instruments, thus replacing triangulation networks with trilateration or combined ones. The relevant geodetic efforts have been motivated, almost exclusively, by earthquake research.

We will refrain from giving here a historical account of the various geodetic crustal motion monitoring methods, since the subject has already been covered in a previous review [11], where more than 200 relevant publications have been analyzed in detail. We will concentrate here on more recent developments, which however relate to the monitoring of horizontal crustal motion only. Secular vertical motions associated with post-glacial uplift require different observational strategies, which include gravimetric data, and will not be covered here.

Even with today's highly accurate observational techniques, signals associated with continuous deformation (i.e. change of shape and not relative motion) are in


many cases within, or slightly above, the order of magnitude of the observational errors and are thus difficult to detect. We need to clarify, from the very beginning, the difference between the relative motion of independent plates, sub-plates and independent tectonic units in general, and the internal alteration of their shape, which is deformation in the strict sense. Apart from the obvious geophysical importance of relative motions, pure deformation is of utmost importance since it is associated with tectonic processes that eventually lead to earthquakes. To put it in plain words, "something must bend before it breaks", and geodesy will provide in the future the means to detect preseismic deformations and thus offer a contribution to earthquake prediction, no matter how distant this goal may presently seem. This will, however, require a further increase in observational accuracy, which is rather certain if one looks back to the impressive history of the development of geodetic instrumentation. This effort cannot be achieved with geodetic control networks that have been designed for other purposes; it will require dedicated dense networks in seismic areas across active faults, a development that seems plausible with the recent decline in the cost of GNSS receivers, but also of other relevant instruments, accelerometers in particular.

We will conclude this introduction with a short account of the basic mathematical concepts from continuum mechanics. Deformation analysis is concerned with the local alteration of shape around any point of interest, when the global alteration of shape is described by the deformation mapping $\vec{x}\,' = f(\vec{x})$, where $\vec{x}$ is the position vector of any point $P$ at a reference epoch $t$ and $\vec{x}\,'$ its position vector at an epoch $t'$.
Sometimes, especially in studies of dynamics, the reference shape, described by the position vectors $\vec{x}$, does not correspond to the shape of the body at any particular moment, but rather to its shape in a state of equilibrium (no influence of external forces), which may not be realized at any particular instant within the time interval under investigation. As usually done in mathematical physics, physical phenomena are studied locally in space and time by differential equations describing what is happening within an infinitesimal neighborhood around each point and within an infinitesimal time interval around each instant of time. In the case of deformation, the relevant equations are the so-called constitutive equations, which relate the result (deformation) to the causes (action of forces). They are more widely known by the engineering term stress-strain relations, where the stresses are the applied forces and strain refers to the resulting deformation of the body under study.

Here we will not be concerned with such constitutive equations, which describe how the body reacts under acting forces according to its physical constitution (i.e. liquid, elastic, viscous, etc.), but only with the description of deformation itself. When the deforming body is the earth crust, the study of the dynamical processes causing deformation is the object of geophysics. Geodesy contributes by determining to an unprecedented level of accuracy the deformation itself, either as relative motion of different plates, sub-plates and independent tectonic units in general, or as internal deformation within these units. Although there are actually


no barriers between different scientific fields, which is even truer for the sister sciences of geodesy and geophysics, it is of utmost importance to understand and correctly interpret geodetic results, in terms of precision and of the assumptions involved in their analysis, before moving to the next and more important step of "explaining" the relevant geophysical phenomena. One of these assumptions is the geophysical hypothesis that horizontal motions have different driving mechanisms (e.g. plate tectonics) from those of vertical motions (e.g. post-glacial uplift), which dictates a similar separation in the geodetic description of horizontal and vertical deformation.

For a local description it is sufficient to replace the deformation mapping $\vec{x}\,' = f(\vec{x})$ by its much simpler local linear approximation, namely the deformation gradient tensor $F = \nabla f$. When Cartesian reference systems $(O, \vec{e})$, $(O', \vec{e}\,')$ are used at the two epochs $t$ and $t'$, with respective origins $O$, $O'$ and respective base vectors $\vec{e} = [\vec{e}_1\ \vec{e}_2\ \vec{e}_3]$, $\vec{e}\,' = [\vec{e}\,'_1\ \vec{e}\,'_2\ \vec{e}\,'_3]$, then $\vec{x} = \vec{e}\,x$ and $\vec{x}\,' = \vec{e}\,'x'$, where $x = [x_1\ x_2\ x_3]^T$ and $x' = [x'_1\ x'_2\ x'_3]^T$ are the Cartesian coordinates of the point $P$ at the two epochs. In this case the deformation mapping $f$ is represented by the vector (column-matrix) relation $x' = f(x)$ and the deformation gradient by the gradient matrix $F = \partial f/\partial x = \partial x'/\partial x$.

The very advantageous use of Cartesian coordinates is possible under the Newtonian mechanics hypotheses of flat space and flat (homogeneous) time, which are independent from each other, in contrast to the unified curved space-time of general relativity. Although relativistic effects are by no means negligible in geodetic data analysis, they are usually confined to peripheral "relativistic corrections", or to instrumental arrangements such as transmitting GPS signals at a frequency such that, under relativistic effects, it takes its nominal value when arriving at the receiver.
Thus, it is legitimate to work with convenient Cartesian coordinates. Deformation analysis based on the gradient matrix $F$ has the disadvantage that it depends on a particular choice of reference systems $(O, \vec{e})$, $(O', \vec{e}\,')$ at the epochs under comparison. In mathematical physics the problem is bypassed by using curvilinear coordinates and tensor analysis, where equations have the same form independently of the choice of coordinates. Although this is the only possibility in relativistic mechanics, where no global reference system exists for the curved space-time, we must point out that in the usual framework of Newtonian mechanics, practical curvilinear coordinate systems (e.g. spherical, geodetic, etc.) are defined through their relation to Cartesian coordinates and thus inevitably depend on the choice of reference systems. For this reason it is important to look for scalar functions of $F$ which are coordinate invariants, i.e. independent of the choice of reference systems. We will be concerned here with the determination of such functions of $F$, which are called strain parameters. We must keep in mind, though, that geodetic invariance does not coincide with the classical notion of invariance in mechanics, as will be explained in Sect. 4.

A somewhat different but related type of deformation analysis is concerned not with the comparison of the shape of a body at two epochs, but with the rate of change


of shape at a particular time epoch. In this case invariant strain rate parameters will be extracted, not from the gradient matrix $F$, but from the velocity gradient matrix $L = \partial v/\partial x$, where $v = dx/dt$ is the point velocity.

The next two sections are concerned with planar deformation, which is also a basic element of surface deformation, and with the computation of planar strain parameters. Section 2 describes the widely used approximate classical infinitesimal theory, while Sect. 3 presents a new rigorous theory free of any approximations. Section 4 examines the invariance characteristics of planar strain parameters from a new, purely geodetic point of view. Section 5 departs from the comparison of shape at two epochs and is concerned with the rate at which shape changes, as described by invariant strain rate parameters. It concludes with the unexpected result that under certain restrictive conditions the approximate infinitesimal approach provides results free of any approximation! In Sect. 6 the need to separate relative motion from actual deformation is stressed and the relevant tools for this purpose are developed. The methods for the necessary interpolation of the geodetically derived discrete point displacements or velocities are examined in Sect. 7. Two methods are described: the still widely used finite element method with triangular elements, and the less used but more advantageous method of interpolation via linear minimum mean square error stochastic prediction, known as collocation in geodesy and practically equivalent to the kriging methods used in geostatistics and other fields. Section 8 departs from the planar case to that of the deformation of curved two-dimensional surfaces, thus providing the means for relevant geodetic engineering applications, or even the study of the deformation of the earth surface itself. Section 9 is devoted to the application of the results of the previous section for developing a new methodology that allows us to perform horizontal deformation analysis not on a plane (such as the cartographic plane) but on the surface of the reference ellipsoid. It includes an adaptation of the interpolation via stochastic prediction on the ellipsoid via a spherical approximation. Section 10 gives an account of how planar deformation analysis can be extended to the three-dimensional one. Finally, we close our presentation with some concluding remarks in Sect. 11.
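The two local descriptions introduced above are linked in a simple way: for a (hypothetical) constant velocity gradient L and a short time interval Δt, each point moves by x′ ≈ x + v(x)Δt, so the accumulated deformation gradient is F ≈ I + LΔt to first order. A toy sketch with made-up numbers, purely illustrative:

```python
def deformation_gradient(L, dt):
    """First-order deformation gradient F = I + L*dt from a constant 2x2 velocity gradient L."""
    return [[1.0 + L[0][0] * dt, L[0][1] * dt],
            [L[1][0] * dt, 1.0 + L[1][1] * dt]]

# Illustrative strain-rate magnitudes of order 1e-7 per year over a 10-year interval
L = [[1e-7, 2e-7],
     [0.0, -1e-7]]
F = deformation_gradient(L, 10.0)
```

For strain rates of, say, 10⁻⁷ per year or less, the neglected second-order terms are entirely negligible over a few years.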

2 The Classical “Infinitesimal” Approximate Approach to Planar Deformation

In the classical engineering approach to planar deformation, instead of the deformation gradient $F = \partial x'/\partial x$ one uses the displacement gradient $J = \partial u/\partial x$, where $u = x' - x$ is the coordinate difference or displacement vector. The two gradients are obviously related by $J = F - I$. The fundamental concept of deformation is the strain tensor, which is represented by the strain matrix $E$, defined in relation to the change of the "element of length" from $ds$ to $ds'$ through

$\dfrac{ds'^2 - ds^2}{2} = dx^T E\, dx.$    (1)

From $ds^2 = dx^T dx$, $ds'^2 = dx'^T dx'$ and $dx' = \dfrac{\partial x'}{\partial x}\, dx = F\, dx$ it follows that

$E = \dfrac{1}{2}(F^T F - I) = \dfrac{1}{2}(J + J^T + J^T J).$    (2)

When displacements are sufficiently small (infinitesimal), the second-order term $J^T J$ is negligible and the rigorous strain matrix $E$ may be replaced by its "infinitesimal" approximation

$E_{\mathrm{inf}} = \dfrac{1}{2}(J + J^T).$    (3)
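Although the chapter itself gives no code, the difference between Eqs. (2) and (3) is easy to see numerically; a small illustrative sketch with made-up gradient entries of order 10⁻³ (real crustal displacement gradients are far smaller, making the approximation even better):

```python
def strain_matrices(J):
    """Rigorous strain E = (J + J^T + J^T J)/2, Eq. (2), and infinitesimal E_inf = (J + J^T)/2, Eq. (3)."""
    JT = [[J[0][0], J[1][0]], [J[0][1], J[1][1]]]
    JTJ = [[sum(JT[i][k] * J[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    E = [[(J[i][j] + JT[i][j] + JTJ[i][j]) / 2 for j in range(2)] for i in range(2)]
    E_inf = [[(J[i][j] + JT[i][j]) / 2 for j in range(2)] for i in range(2)]
    return E, E_inf
```

With entries of order 10⁻³ in J, the neglected term J^T J/2 perturbs the strain only at the 10⁻⁶ level, i.e. quadratically small.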

The displacement gradient can be decomposed into a symmetric and an antisymmetric part as

$J = \dfrac{1}{2}(J + J^T) + \dfrac{1}{2}(J - J^T) \equiv E_{\mathrm{inf}} + \Omega.$    (4)

The antisymmetric part

$\Omega = \dfrac{1}{2}(J - J^T) = \dfrac{1}{2}\begin{pmatrix} 0 & J_{12} - J_{21} \\ J_{21} - J_{12} & 0 \end{pmatrix} = \begin{pmatrix} 0 & \omega \\ -\omega & 0 \end{pmatrix},$    (5)

does not contribute to deformation and is related to a rotation $R(\omega)$ by the small rotation angle $\omega = \frac{1}{2}(J_{12} - J_{21})$, through the approximation $R(\omega) \approx I + \Omega$. The diagonalization of the symmetric part $E_{\mathrm{inf}} = \frac{1}{2}(J + J^T)$ has the form

$E_{\mathrm{inf}} = \begin{pmatrix} \varepsilon_{11} & \varepsilon_{12} \\ \varepsilon_{12} & \varepsilon_{22} \end{pmatrix} = \begin{pmatrix} J_{11} & \frac{1}{2}(J_{12} + J_{21}) \\ \frac{1}{2}(J_{12} + J_{21}) & J_{22} \end{pmatrix} = R(-\beta)\begin{pmatrix} \varepsilon_{\max} & 0 \\ 0 & \varepsilon_{\min} \end{pmatrix} R(\beta),$    (6)

where the maximum and minimum principal strains are given by

$$ \varepsilon_{\max} = \frac{\varepsilon_{11}+\varepsilon_{22} + \sqrt{(\varepsilon_{11}-\varepsilon_{22})^2 + 4\varepsilon_{12}^2}}{2} = \frac{J_{11}+J_{22} + \sqrt{(J_{11}-J_{22})^2 + (J_{12}+J_{21})^2}}{2}, \qquad (7) $$


$$ \varepsilon_{\min} = \frac{\varepsilon_{11}+\varepsilon_{22} - \sqrt{(\varepsilon_{11}-\varepsilon_{22})^2 + 4\varepsilon_{12}^2}}{2} = \frac{J_{11}+J_{22} - \sqrt{(J_{11}-J_{22})^2 + (J_{12}+J_{21})^2}}{2}. \qquad (8) $$

The direction angle $\beta$ of maximum strain $\varepsilon_{\max}$ is computed from

$$ \tan 2\beta = \frac{2\varepsilon_{12}}{\varepsilon_{11}-\varepsilon_{22}} = \frac{J_{12}+J_{21}}{J_{11}-J_{22}}. \qquad (9) $$

The sum

$$ \Delta = \operatorname{trace}\mathbf E_{\rm inf} = \varepsilon_{\max}+\varepsilon_{\min} = \varepsilon_{11}+\varepsilon_{22} = J_{11}+J_{22} \qquad (10) $$

is the dilatation, while the difference

$$ \gamma = \varepsilon_{\max}-\varepsilon_{\min} = \sqrt{(\varepsilon_{11}-\varepsilon_{22})^2 + 4\varepsilon_{12}^2} = \sqrt{(J_{11}-J_{22})^2 + (J_{12}+J_{21})^2} \equiv \sqrt{\gamma_1^2+\gamma_2^2} \qquad (11) $$

is the maximum shear, with

$$ \gamma_1 = \varepsilon_{11}-\varepsilon_{22} = J_{11}-J_{22}, \qquad \gamma_2 = 2\varepsilon_{12} = J_{12}+J_{21} \qquad (12) $$

being the shear components. Thus an alternative form of (9) is

$$ \tan 2\beta = \frac{\gamma_2}{\gamma_1}. \qquad (13) $$

The direction $\varphi$ of maximum shear strain is computed from

$$ \tan 2\varphi = -\frac{\gamma_1}{\gamma_2}, \qquad (14) $$

which implies that $\varphi = \beta \pm 45^\circ$. The elements $J_{ik}$ of the displacement gradient, as well as the elements $F_{ik}$ of the deformation gradient, are in one-to-one correspondence with the four strain parameters $\Delta$, $\omega$, $\gamma_1$, $\gamma_2$. Therefore the linearized deformation mapping $\mathbf x' = \mathbf F\mathbf x$ can be expressed in terms of the strain parameters as

$$ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 + \dfrac{\Delta+\gamma_1}{2} & \dfrac{\gamma_2+2\omega}{2} \\ \dfrac{\gamma_2-2\omega}{2} & 1 + \dfrac{\Delta-\gamma_1}{2} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \qquad (15) $$


Fig. 1 Deformation of a square centered at the origin and having side length 2, in the special cases where only one or two of the strain parameters are applied: (a) only dilatation $\Delta$, (b) only rotation $\omega$, (c) only strain component $\gamma_1$, (d) only strain component $\gamma_2$, (e) only strain components $\gamma_1$ and $\gamma_2$ (with $\gamma_2 > \gamma_1 > 0$)

To get a feeling for the particular geometrical significance of each of these parameters, Fig. 1 illustrates the transformations in the special cases where only one or two of them are non-zero.
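The closed-form relations (5)–(13) map the four elements of $\mathbf J$ directly to the classical strain parameters. A minimal numerical sketch (Python/NumPy; the function name and sample values are ours), cross-checked against the eigenvalues of $\mathbf E_{\rm inf}$:

```python
import numpy as np

def infinitesimal_strain_params(J):
    """Classical infinitesimal strain parameters from a 2x2 displacement gradient J."""
    J11, J12, J21, J22 = J[0, 0], J[0, 1], J[1, 0], J[1, 1]
    d = J11 + J22                       # dilatation, Eq. (10)
    w = 0.5 * (J12 - J21)               # rotation angle, Eq. (5)
    g1 = J11 - J22                      # shear component gamma_1, Eq. (12)
    g2 = J12 + J21                      # shear component gamma_2, Eq. (12)
    g = np.hypot(g1, g2)                # maximum shear, Eq. (11)
    emax = 0.5 * (d + g)                # principal strains, Eqs. (7)-(8)
    emin = 0.5 * (d - g)
    beta = 0.5 * np.arctan2(g2, g1)     # direction of eps_max, Eq. (13)
    return d, w, g1, g2, g, emax, emin, beta

# sanity check: the principal strains are the eigenvalues of E_inf
J = np.array([[2e-6, 5e-7], [-3e-7, -1e-6]])
d, w, g1, g2, g, emax, emin, beta = infinitesimal_strain_params(J)
E_inf = 0.5 * (J + J.T)
assert np.allclose([emin, emax], np.sort(np.linalg.eigvalsh(E_inf)))
```

The `arctan2` form resolves the quadrant ambiguity of $\tan 2\beta = \gamma_2/\gamma_1$ automatically.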

3 The New Rigorous Approach to Planar Deformation

As already pointed out, all information about the deformation function $\mathbf x' = \mathbf f(\mathbf x)$, mapping coordinates $\mathbf x$ at an initial epoch $t$ to coordinates $\mathbf x'$ at a later epoch $t'$, is locally represented by the deformation gradient $\mathbf F = \partial\mathbf x'/\partial\mathbf x$. In order to derive physically meaningful parameters describing deformation, which have a more appealing geometric meaning and are invariant with respect to the choice of reference systems at the two epochs under comparison, we will introduce and compare two different decompositions of the deformation gradient $\mathbf F$. The first is none other than the well-known Singular Value Decomposition (SVD) of a square matrix, while the


second involves dilatation and shear directly, in a way analogous to that of the classical approximate approach. The singular value decomposition is based on the diagonalizations [see e.g. 14, 31]

$$ \mathbf C \equiv \mathbf F^T\mathbf F = \mathbf R(-\theta)\,\boldsymbol\Lambda^2\,\mathbf R(\theta), \qquad \mathbf B \equiv \mathbf F\mathbf F^T = \mathbf R(-\theta')\,\boldsymbol\Lambda^2\,\mathbf R(\theta'), \qquad (16) $$

of the symmetric matrices $\mathbf C$ and $\mathbf B$. $\mathbf C$ is the matrix representing the so-called right Cauchy–Green deformation tensor, while $\mathbf B$ is the representation of the left Cauchy–Green tensor, or Finger tensor. Note that in the relevant literature there is a multiplicity of terminology and even different definitions for the same term [see e.g. 20, 23, 24]. Here $\mathbf R(\alpha) = \begin{bmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{bmatrix}$ stands for the rotation matrix in the plane, while $\boldsymbol\Lambda^2 = \begin{bmatrix} \lambda_1^2 & 0 \\ 0 & \lambda_2^2 \end{bmatrix}$ involves the common positive eigenvalues $\lambda_1^2 \ge \lambda_2^2$ of both $\mathbf F^T\mathbf F$ and $\mathbf F\mathbf F^T$. The SVD of $\mathbf F$ is given by

$$ \mathbf F = \mathbf R(-\theta')\,\boldsymbol\Lambda\,\mathbf R(\theta), \qquad (17) $$

where $\boldsymbol\Lambda = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$ contains the positive roots of $\boldsymbol\Lambda^2$. The linear transformation $\mathbf x' = \mathbf F\mathbf x = \mathbf R(-\theta')\boldsymbol\Lambda\mathbf R(\theta)\mathbf x$ can be decomposed into the steps $\mathbf p = \mathbf R(\theta)\mathbf x$, $\mathbf p' = \boldsymbol\Lambda\mathbf p$ and $\mathbf x' = \mathbf R(-\theta')\mathbf p'$, which can be interpreted as follows (see Fig. 2): Consider a reference system parallel to the one of epoch $t$, with origin at the point $P$ to which $\mathbf F$ refers and axes $Px_1$, $Px_2$. The rotation matrix $\mathbf R(\theta)$ brings the axes to new directions $Pp_1$, $Pp_2$, which are called the principal directions of the deformation. Subsequently, the diagonal matrix $\boldsymbol\Lambda$ deforms the plane in such a way that a point at distance 1 from $P$ on the axis $Pp_1$ is displaced along the same axis to a new position at distance $\lambda_1$ from $P$ ($p'_1 = \lambda_1 p_1$). In a similar way, a point at distance 1 from $P$ along the axis $Pp_2$ is displaced along the same axis to a new position at distance $\lambda_2$ from $P$ ($p'_2 = \lambda_2 p_2$). A unit circle centered at $P$ (with equation $\mathbf p^T\mathbf p = 1$) is deformed into an ellipse (with equation $\mathbf p'^T\boldsymbol\Lambda^{-2}\mathbf p' = 1$) with the same center and semi-axes $\lambda_1$ and $\lambda_2$ along the two principal directions. Consider also a reference system parallel to the one of epoch $t'$, with origin at $P$ and axes $Px'_1$, $Px'_2$. The rotation matrix $\mathbf R(\theta)$ aligns the initial-epoch axes with the principal directions, while the rotation matrix $\mathbf R(\theta')$ aligns the final-epoch axes with the principal directions. Thus $\theta$ is the direction angle of the first principal direction with respect to the initial-epoch reference system, while $\theta'$ is the direction angle of the first principal direction with respect to the final-epoch reference system. For an interpretation with respect to the original deformation function $\mathbf x' = \mathbf f(\mathbf x)$, we consider a parameterized curve $\mathbf x(\tau)$ passing through the point $P$, which is mapped into a curve $\mathbf x'(\tau) = \mathbf f(\mathbf x(\tau))$. If $d\mathbf x/d\tau$ is a tangent vector to the curve $\mathbf x(\tau)$ at $P$, it is mapped into the tangent vector


Fig. 2 The deformation gradient matrix mapping according to its singular value decomposition

$$ \frac{d\mathbf x'}{d\tau} = \frac{\partial\mathbf f}{\partial\mathbf x}\frac{d\mathbf x}{d\tau} = \mathbf F\frac{d\mathbf x}{d\tau} = \mathbf R(-\theta')\,\boldsymbol\Lambda\,\mathbf R(\theta)\frac{d\mathbf x}{d\tau}, \qquad (18) $$

at the same point. If we multiply both sides of $\mathbf R(\theta')\dfrac{d\mathbf x'}{d\tau} = \boldsymbol\Lambda\,\mathbf R(\theta)\dfrac{d\mathbf x}{d\tau}$ with their transposes, we obtain

$$ \frac{ds'^2}{d\tau^2} = \left(\frac{d\mathbf x'}{d\tau}\right)^T\frac{d\mathbf x'}{d\tau} = \left(\frac{d\mathbf x}{d\tau}\right)^T\mathbf R(-\theta)\,\boldsymbol\Lambda^2\,\mathbf R(\theta)\,\frac{d\mathbf x}{d\tau}. \qquad (19) $$

If $\dfrac{d\mathbf x}{d\tau}$ has direction angle $\alpha$ with respect to the initial-epoch reference system and magnitude $\left\|\dfrac{d\mathbf x}{d\tau}\right\| = \sqrt{\left(\dfrac{d\mathbf x}{d\tau}\right)^T\dfrac{d\mathbf x}{d\tau}} = \dfrac{ds}{d\tau}$, it holds that $\dfrac{d\mathbf x}{d\tau} = \dfrac{ds}{d\tau}\begin{bmatrix}\cos\alpha \\ \sin\alpha\end{bmatrix}$. Replacing this in (19) we arrive at

$$ \lambda(\alpha) \equiv \frac{ds'(\alpha)}{ds} = \sqrt{\lambda_1^2\cos^2(\alpha-\theta) + \lambda_2^2\sin^2(\alpha-\theta)}. \qquad (20) $$


The extrema of $\lambda(\alpha)$ are obtained from $\partial\lambda^2/\partial\alpha = 2(\lambda_2^2-\lambda_1^2)\sin(\alpha-\theta)\cos(\alpha-\theta) = 0$, which has the two solutions $\alpha = \theta$ and $\alpha = \theta + 90^\circ$. Therefore the extrema of the elongation factor $\lambda = ds'/ds$ are attained along the two mutually perpendicular principal directions, the maximum and minimum being

$$ \lambda_{\max} = \lambda(\theta) = \lambda_1, \qquad \lambda_{\min} = \lambda(\theta+90^\circ) = \lambda_2, \qquad (21) $$

respectively, since $\lambda_1 \ge \lambda_2$. The quantities $\lambda_1$, $\lambda_2$ and $\theta$ or $\theta'$ can easily be derived from the known formulas for the diagonalization of a $2\times2$ symmetric matrix, as functions of the elements of the matrices. In Appendix A we include a variant of the well-known diagonalization equations which allows us to determine the direction of the eigenvectors in an unambiguous way. Applying these results, we obtain the following algorithm for the computation of the desired quantities:

$$ A = C_{11}+C_{22}, \qquad B = \sqrt{(C_{11}-C_{22})^2 + 4C_{12}^2}, \qquad (22) $$

$$ \lambda_1^2 = \frac{A+B}{2}, \qquad \lambda_2^2 = \frac{A-B}{2}, \qquad (23) $$

$$ R_0 \equiv \cos 2\theta = \frac{C_{11}-C_{22}}{\sqrt{(C_{11}-C_{22})^2 + 4C_{12}^2}}, \qquad (24) $$

$$ \sin\theta = \operatorname{sgn}(C_{12})\sqrt{\frac{1-R_0}{2}}, \qquad \cos\theta = \sqrt{\frac{1+R_0}{2}}. \qquad (25) $$
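The algorithm (22)–(25) translates directly into code. A sketch (NumPy; naming ours), applied to $\mathbf C = \mathbf F^T\mathbf F$ for $\theta$ and to $\mathbf B = \mathbf F\mathbf F^T$ for $\theta'$:

```python
import numpy as np

def svd_deformation_params(F):
    """lambda1 >= lambda2, theta and theta' of F = R(-theta') Lambda R(theta),
    computed via the diagonalization algorithm (22)-(25)."""
    def diagonalize(C):
        A = C[0, 0] + C[1, 1]                                   # (22)
        B = np.sqrt((C[0, 0] - C[1, 1])**2 + 4.0 * C[0, 1]**2)
        lam1 = np.sqrt((A + B) / 2.0)                           # (23)
        lam2 = np.sqrt((A - B) / 2.0)
        R0 = (C[0, 0] - C[1, 1]) / B                            # (24): cos(2 theta)
        s = np.sign(C[0, 1]) * np.sqrt((1.0 - R0) / 2.0)        # (25)
        c = np.sqrt((1.0 + R0) / 2.0)
        return lam1, lam2, np.arctan2(s, c)
    lam1, lam2, theta = diagonalize(F.T @ F)      # from C
    _, _, theta_p = diagonalize(F @ F.T)          # from B
    return lam1, lam2, theta, theta_p

def R(a):
    """Rotation matrix convention of the text."""
    return np.array([[np.cos(a), np.sin(a)], [-np.sin(a), np.cos(a)]])

# check: the recovered parameters reproduce F (a near-identity example)
F = np.array([[1.2, 0.3], [-0.1, 0.9]])
l1, l2, th, thp = svd_deformation_params(F)
assert np.allclose(R(-thp) @ np.diag([l1, l2]) @ R(th), F)
assert np.allclose([l1, l2], np.linalg.svd(F, compute_uv=False))
```

For a deformation gradient close to the identity (the geodetically relevant case) the sign conventions of (25) pair $\theta$ and $\theta'$ consistently, as the reconstruction check confirms.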

Note that the above formulas have the advantage of determining $\theta$ in the right quadrant, in contrast to the usual equivalent equation $\tan 2\theta = 2C_{12}/(C_{11}-C_{22})$. The angle $\theta'$ can be derived in the same way as $\theta$, simply by replacing the elements of $\mathbf C$ with those of $\mathbf B$. From the defining equation of the strain matrix $\mathbf E = \frac12(\mathbf F^T\mathbf F - \mathbf I)$ and the SVD $\mathbf F = \mathbf R(-\theta')\boldsymbol\Lambda\mathbf R(\theta)$ it follows that

$$ \mathbf E = \frac12[\mathbf R(-\theta)\boldsymbol\Lambda^2\mathbf R(\theta) - \mathbf I] = \mathbf R(-\theta)\begin{bmatrix} \frac12(\lambda_1^2-1) & 0 \\ 0 & \frac12(\lambda_2^2-1) \end{bmatrix}\mathbf R(\theta) \equiv \mathbf R(-\theta)\begin{bmatrix} e_{\max} & 0 \\ 0 & e_{\min} \end{bmatrix}\mathbf R(\theta), \qquad (26) $$


where the principal strains along the principal directions are the maximum and minimum strains

$$ e_{\max} = \frac12(\lambda_1^2-1), \qquad e_{\min} = \frac12(\lambda_2^2-1). \qquad (27) $$
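Equations (20)–(21) lend themselves to a direct numerical check; the sketch below (all parameter values arbitrary) builds $\mathbf F$ from chosen $\lambda_1$, $\lambda_2$, $\theta$, $\theta'$ and verifies the elongation formula along an arbitrary direction:

```python
import numpy as np

def R(a):
    """Rotation matrix of the text: R(a) = [[cos a, sin a], [-sin a, cos a]]."""
    return np.array([[np.cos(a), np.sin(a)], [-np.sin(a), np.cos(a)]])

# build F = R(-theta') Lambda R(theta) from arbitrarily chosen parameters
theta, theta_p = np.deg2rad(25.0), np.deg2rad(40.0)
lam1, lam2 = 1.3, 0.8
F = R(-theta_p) @ np.diag([lam1, lam2]) @ R(theta)

# elongation factor lambda(alpha) = ds'/ds of a mapped unit tangent, Eq. (20)
alpha = np.deg2rad(70.0)
t = np.array([np.cos(alpha), np.sin(alpha)])
lam_direct = np.linalg.norm(F @ t)
lam_formula = np.sqrt(lam1**2 * np.cos(alpha - theta)**2 +
                      lam2**2 * np.sin(alpha - theta)**2)
assert np.isclose(lam_direct, lam_formula)

# the maximum elongation occurs along the first principal direction, Eq. (21)
u = np.array([np.cos(theta), np.sin(theta)])
assert np.isclose(np.linalg.norm(F @ u), lam1)
```

A unit tangent along the direction $\theta + 90^\circ$ would analogously be stretched by $\lambda_2$.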

The strain matrix $\mathbf E$ is a representation of the Green–St. Venant (Lagrangian) strain tensor, or simply the Green strain tensor. Two related decompositions of the deformation gradient are $\mathbf F = \mathbf R\mathbf U = \mathbf V\mathbf R$, where $\mathbf U$ represents the symmetric right Cauchy stretch tensor (stretch being the ratio of final to original length), $\mathbf V$ represents the symmetric left Cauchy stretch tensor, and $\mathbf R$ is the orthogonal rotation matrix. Obviously, in our case $\mathbf U = \mathbf R(-\theta)\boldsymbol\Lambda\mathbf R(\theta)$, $\mathbf V = \mathbf R(-\theta')\boldsymbol\Lambda\mathbf R(\theta')$ and $\mathbf R = \mathbf R(\theta-\theta')$. The angle

$$ \omega = \theta - \theta' \qquad (28) $$

can be interpreted as a rotation angle, analogous to that of the infinitesimal approach, only if it is possible to maintain the same reference system at both epochs under comparison, through reference to a fixed non-deforming surrounding environment.

We pass now to a different representation, involving dilatation and shear directly. A pure shear along the first coordinate axis is described by the deformation gradient matrix $\boldsymbol\Gamma = \begin{bmatrix} 1 & \gamma \\ 0 & 1 \end{bmatrix}$, where $\gamma = \tan\omega_\gamma$ and $\omega_\gamma$ is the angle by which lines parallel to the second axis incline to the right as a result of the deformation (Fig. 3). A shear in any other direction, defined by the direction angle $\varphi$, is described by the deformation gradient matrix $\boldsymbol\Gamma_\varphi = \mathbf R(-\varphi)\boldsymbol\Gamma\mathbf R(\varphi)$ [23]. Dilatation is accounted for by a scale factor $\sigma$. The three quantities $\sigma$, $\varphi$ and $\gamma$ are short by one with respect to the four independent elements of $\mathbf F$. For this reason an additional rotation by an angle $\psi$ is introduced, so that the deformation gradient is decomposed into

$$ \mathbf F = \sigma\,\mathbf R(\psi)\,\boldsymbol\Gamma_\varphi = \sigma\,\mathbf R(\psi)\,\mathbf R(-\varphi)\,\boldsymbol\Gamma\,\mathbf R(\varphi) = \mathbf F(\gamma,\varphi,\sigma,\psi). \qquad (29) $$

Since orthogonal matrices have determinant one and $\det\boldsymbol\Gamma = 1$, it follows that $\det\mathbf F = \sigma^2$. Recalling that the area of a parallelogram with sides the vectors $\mathbf a$ and $\mathbf b$ is given by the determinant of the matrix having these vectors as columns, $E = \operatorname{area}(\mathbf a,\mathbf b) = \det[\mathbf a\ \mathbf b]$, we note that the linear mapping $\mathbf F$ transforms the vectors $\mathbf a$ and $\mathbf b$ into the vectors $\mathbf F\mathbf a$ and $\mathbf F\mathbf b$, respectively, which form a new parallelogram with area $E' = \operatorname{area}(\mathbf F\mathbf a,\mathbf F\mathbf b) = \det[\mathbf F\mathbf a\ \mathbf F\mathbf b] = \det(\mathbf F[\mathbf a\ \mathbf b]) = \det\mathbf F\,\det[\mathbf a\ \mathbf b] = \det\mathbf F\,\operatorname{area}(\mathbf a,\mathbf b) = \det\mathbf F\cdot E$. Therefore the determinant of the deformation gradient relates to the ratio of new to old area by $\det\mathbf F = \sigma^2 = E'/E$, while in the same linear approximation the dilatation relates to the ratio of area change to old area, $\Delta = (E'-E)/E$. In terms of the original nonlinear deformation mapping $\mathbf x' = \mathbf f(\mathbf x)$, a small area $E$ around the point of interest $P$ is mapped by $\mathbf f$ into a small area $E'$, and the dilatation at $P$ is defined as the limit


Fig. 3 Shear along the first axis and shear along an axis with direction angle ϕ

$$ \Delta = \lim_{E\to0}\frac{E'-E}{E} = \lim_{E\to0}\frac{E'}{E} - 1 = \det\mathbf F - 1. \qquad (30) $$

Therefore, the scale factor is related to the dilatation through

$$ \Delta = \sigma^2 - 1. \qquad (31) $$

It is possible to compute the parameters $\gamma$, $\sigma$, $\varphi$ and $\psi$ directly from the elements of $\mathbf F$, through the diagonalizations of the symmetric matrices $\mathbf F^T\mathbf F = \sigma^2\mathbf R(-\varphi)\boldsymbol\Gamma^T\boldsymbol\Gamma\mathbf R(\varphi)$ and $\mathbf F\mathbf F^T = \sigma^2\mathbf R(\psi-\varphi)\boldsymbol\Gamma\boldsymbol\Gamma^T\mathbf R(-\psi+\varphi)$. Instead, we will relate them to the already derived parameters $\theta$, $\lambda_1$, $\lambda_2$ and $\theta'$. From (17) it follows that $\det\mathbf F = \det\boldsymbol\Lambda = \lambda_1\lambda_2 = \sigma^2 = 1+\Delta$, and the dilatation can be computed from

$$ \Delta = \lambda_1\lambda_2 - 1. \qquad (32) $$

The SVD of $\boldsymbol\Gamma$ is given by

$$ \boldsymbol\Gamma = \begin{bmatrix} 1 & \gamma \\ 0 & 1 \end{bmatrix} = \mathbf R(-\delta')\begin{bmatrix} \mu_1 & 0 \\ 0 & \mu_2 \end{bmatrix}\mathbf R(\delta). \qquad (33) $$


The values of $\mu_1$, $\mu_2$, $\delta$, $\delta'$ are obtained by applying the diagonalization formulas of Appendix A to the symmetric matrices $\boldsymbol\Gamma^T\boldsymbol\Gamma = \begin{bmatrix} 1 & \gamma \\ \gamma & 1+\gamma^2 \end{bmatrix}$ and $\boldsymbol\Gamma\boldsymbol\Gamma^T = \begin{bmatrix} 1+\gamma^2 & \gamma \\ \gamma & 1 \end{bmatrix}$. Thus

$$ \mu_1^2 = \frac{2+\gamma^2+\gamma\sqrt{\gamma^2+4}}{2}, \qquad \mu_2^2 = \frac{2+\gamma^2-\gamma\sqrt{\gamma^2+4}}{2}, \qquad (34) $$

$$ \sin\delta = \cos\delta' = \sqrt{\frac12\left(1+\frac{\gamma}{\sqrt{\gamma^2+4}}\right)}, \qquad \cos\delta = \sin\delta' = \sqrt{\frac12\left(1-\frac{\gamma}{\sqrt{\gamma^2+4}}\right)}, \qquad \delta' = 90^\circ - \delta. \qquad (35) $$

With the SVD of $\boldsymbol\Gamma$ from (33) we have

$$ \mathbf F = \mathbf R(\psi-\varphi-\delta')\begin{bmatrix} \sigma\mu_1 & 0 \\ 0 & \sigma\mu_2 \end{bmatrix}\mathbf R(\delta+\varphi) = \mathbf R(-\theta')\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}\mathbf R(\theta). \qquad (36) $$

Comparison shows that $\lambda_1 = \sigma\mu_1$, $\lambda_2 = \sigma\mu_2$, and since $\mu_1^2+\mu_2^2 = 2+\gamma^2 = \dfrac{\lambda_1^2+\lambda_2^2}{\lambda_1\lambda_2} = \dfrac{\lambda_1^2+\lambda_2^2}{\sigma^2}$, the shear is given by

$$ \gamma = \frac{\lambda_1-\lambda_2}{\sqrt{\lambda_1\lambda_2}}. \qquad (37) $$

From the obvious angle relations $\delta+\varphi = \theta$, $\psi-\varphi-\delta' = -\theta'$ and $\delta' = 90^\circ-\delta$, it follows that the direction of (maximum) shear is given by its direction angle

$$ \varphi = \theta - \delta, \qquad (38) $$

while the additional rotation angle, which does not contribute to deformation, is given by

$$ \psi = \varphi + \delta' - \theta' = \theta - \delta + \delta' - \theta' = \omega - 2\delta + 90^\circ. \qquad (39) $$

The difference $\delta = \theta-\varphi$ is the angle between the maximal principal direction $\theta$ and the direction of shear $\varphi$, while the rotation angle $\psi$ is related to the SVD rotation angle $\omega = \theta-\theta'$. Note that if the alternative decomposition $\mathbf F = \sigma\,\boldsymbol\Gamma_{\tilde\varphi}\,\mathbf R(\tilde\psi) = \sigma\,\mathbf R(-\tilde\varphi)\boldsymbol\Gamma\mathbf R(\tilde\varphi)\mathbf R(\tilde\psi)$ were used instead, it would lead to the same angle $\tilde\psi = \psi$, while the direction of shear would change to $\tilde\varphi = \theta'-\delta' = \varphi-\psi$, which accounts for the same physical direction.
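Equations (32), (35) and (37)–(39) give the dilatation/shear parameters directly from the SVD parameters. A numerical sketch (naming and sample values ours) that also verifies that the decomposition (29) reproduces $\mathbf F$:

```python
import numpy as np

def R(a):
    return np.array([[np.cos(a), np.sin(a)], [-np.sin(a), np.cos(a)]])

def shear_decomposition(lam1, lam2, theta, theta_p):
    """sigma, Delta, gamma, phi, psi of F = sigma R(psi) R(-phi) Gamma R(phi),
    from the SVD parameters, via Eqs. (32), (35), (37)-(39)."""
    sigma = np.sqrt(lam1 * lam2)                     # det F = sigma^2
    Delta = lam1 * lam2 - 1.0                        # dilatation (32)
    gamma = (lam1 - lam2) / np.sqrt(lam1 * lam2)     # shear (37)
    delta = np.arcsin(np.sqrt(0.5 * (1.0 + gamma / np.sqrt(gamma**2 + 4.0))))  # (35)
    phi = theta - delta                              # shear direction (38)
    psi = (theta - theta_p) - 2.0 * delta + np.pi / 2.0   # (39), with omega = theta - theta'
    return sigma, Delta, gamma, phi, psi

lam1, lam2, theta, theta_p = 1.25, 0.9, np.deg2rad(20.0), np.deg2rad(12.0)
sigma, Delta, gamma, phi, psi = shear_decomposition(lam1, lam2, theta, theta_p)
Gam = np.array([[1.0, gamma], [0.0, 1.0]])
F_shear = sigma * R(psi) @ R(-phi) @ Gam @ R(phi)    # decomposition (29)
F_svd = R(-theta_p) @ np.diag([lam1, lam2]) @ R(theta)
assert np.allclose(F_shear, F_svd)
```

The final assertion confirms that both parameterizations describe one and the same deformation gradient.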

4 Invariance Characteristics of Deformation Parameters

The elements of the deformation gradient matrix $\mathbf F = \partial\mathbf x'/\partial\mathbf x$, as well as those of the displacement gradient matrix $\mathbf J = \partial(\mathbf x'-\mathbf x)/\partial\mathbf x$, depend on the reference systems to which the coordinates $\mathbf x$, $\mathbf x'$ refer. On the contrary, the deformation parameters introduced as functions of the elements of $\mathbf F$ or $\mathbf J$ have certain invariance characteristics with respect to the reference system choices. There are two types of invariance involved: absolute numerical invariance, where a parameter maintains its numerical value under any reference system choice, and physical invariance, where a parameter may change its value but always defines the same physical object.

In continuum mechanics, where one usually studies the deformation of a physical object in a non-deforming laboratory environment, a single choice of a laboratory-fixed reference system is sufficient, since it is valid for any epoch. The same is true in some geodetic or surveying applications, e.g. when one studies the deformation of a dam with observations performed from points on a stable surrounding environment. The increase of precision in modern, mostly space, geodetic techniques aiming at geophysical applications raises the problem of reference system definition for a continuously deforming earth, where there is no stable rigid part with respect to which the reference system can be fixed in a time-invariant way. Instead, the reference system has to be defined at every epoch: the choice of a spatial reference system is replaced by the choice of a spatiotemporal one.

Geodesy faces yet another problem, related to the scale of the reference system, i.e. to the choice of the unit of length. In classical mechanics the unit of length is considered to be the same at every epoch, since it is defined through comparison with a reference prototype, the meter. In modern geodetic techniques, however, no direct distance measurements are performed. Instead, one measures time intervals, which are converted into distances by multiplication with the adopted constant velocity of light in vacuum. Thus, the unit of length is defined indirectly through the definition and realization of the unit of time. The realization of the second as a unit of time is based on a specific set of atomic clocks. The implementation of different clock sets in different geodetic observational campaigns may result in different units of time and length, which are related through a scale factor. Even if the same set of clocks is used, the unit of length may change from epoch to epoch, due to instabilities in clock performance. While in classical continuum mechanics coordinates $\mathbf x$, $\tilde{\mathbf x}$ in different reference systems are connected by rigid transformations $\tilde{\mathbf x} = \mathbf R(\chi)\mathbf x + \mathbf d$, with $\mathbf R(\chi)$ the orthogonal rotation matrix and $\mathbf d$ the displacement vector, in geodesy we must consider similarity transformations $\tilde{\mathbf x} = s\mathbf R(\chi)\mathbf x + \mathbf d$ involving an additional scale factor $s$. Let the reference systems at the two epochs $t$ and $t'$ under comparison change according to the similarity transformations

$$ \tilde{\mathbf x} = s\,\mathbf R\,\mathbf x + \mathbf d, \qquad \tilde{\mathbf x}' = s'\,\mathbf R'\,\mathbf x' + \mathbf d', \qquad (40) $$


with $\mathbf R = \mathbf R(\chi)$ and $\mathbf R' = \mathbf R(\chi')$. By the chain rule of differentiation, the original deformation gradient matrix $\mathbf F = \partial\mathbf x'/\partial\mathbf x$ transforms into

$$ \tilde{\mathbf F} = \frac{\partial\tilde{\mathbf x}'}{\partial\tilde{\mathbf x}} = \frac{\partial\tilde{\mathbf x}'}{\partial\mathbf x'}\frac{\partial\mathbf x'}{\partial\mathbf x}\frac{\partial\mathbf x}{\partial\tilde{\mathbf x}} = s'\mathbf R'\,\mathbf F\,[s\mathbf R]^{-1} = \frac{s'}{s}\,\mathbf R'\mathbf F\mathbf R^T = \mu\,\mathbf R'\mathbf F\mathbf R^T, \qquad (41) $$

where $\mu = s'/s$ is the scale factor ratio. Consequently, the displacement gradient matrix $\mathbf J = \mathbf F - \mathbf I$ transforms according to

$$ \tilde{\mathbf J} = \tilde{\mathbf F} - \mathbf I = \mu\mathbf R'\mathbf F\mathbf R^T - \mathbf I = \mu\mathbf R'\mathbf J\mathbf R^T + \mu\mathbf R'\mathbf R^T - \mathbf I. \qquad (42) $$

From the relation $\mathbf E = \frac12(\mathbf F^T\mathbf F - \mathbf I)$ and its counterpart $\tilde{\mathbf E} = \frac12(\tilde{\mathbf F}^T\tilde{\mathbf F} - \mathbf I)$ it follows easily that

$$ \tilde{\mathbf E} = \frac12(\tilde{\mathbf F}^T\tilde{\mathbf F} - \mathbf I) = \mu^2\,\mathbf R\mathbf E\mathbf R^T + \frac{\mu^2-1}{2}\mathbf I. \qquad (43) $$

If the same scale can be maintained, then $s' = s$, $\mu = 1$ and the transformations simplify to $\tilde{\mathbf F} = \mathbf R'\mathbf F\mathbf R^T$, $\tilde{\mathbf J} = \mathbf R'\mathbf J\mathbf R^T + \mathbf R'\mathbf R^T - \mathbf I$, $\tilde{\mathbf E} = \mathbf R\mathbf E\mathbf R^T$. If furthermore the same reference system can be maintained, then $\mathbf R' = \mathbf R$ and we obtain the usual tensor transformation rules $\tilde{\mathbf F} = \mathbf R\mathbf F\mathbf R^T$, $\tilde{\mathbf J} = \mathbf R\mathbf J\mathbf R^T$, $\tilde{\mathbf E} = \mathbf R\mathbf E\mathbf R^T$. We will examine first the invariance characteristics of the approximate infinitesimal approach, which we may relate to the elements of the displacement gradient matrix $\mathbf J$ through

$$ \mathbf j \equiv \operatorname{vec}\mathbf J = \begin{bmatrix} J_{11} \\ J_{21} \\ J_{12} \\ J_{22} \end{bmatrix} = \frac12\begin{bmatrix} \Delta+\gamma_1 \\ \gamma_2-2\omega \\ \gamma_2+2\omega \\ \Delta-\gamma_1 \end{bmatrix} = \frac12\begin{bmatrix} 0 & 1 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & -1 & 0 \end{bmatrix}\begin{bmatrix} 2\omega \\ \Delta \\ \gamma_1 \\ \gamma_2 \end{bmatrix} \equiv \mathbf B\mathbf q, \qquad (44) $$

$$ \mathbf B = \frac12\begin{bmatrix} \mathbf W & \mathbf I \\ \mathbf I & \mathbf W \end{bmatrix}, \qquad \mathbf B^{-1} = \begin{bmatrix} -\mathbf W & \mathbf I \\ \mathbf I & -\mathbf W \end{bmatrix}, \qquad \mathbf W = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \qquad (45) $$

where $\mathbf W = \mathbf R(90^\circ)$ has the property $\mathbf W^2 = -\mathbf I$ and inverse $\mathbf W^{-1} = -\mathbf W$. From (42) it follows that

$$ \tilde{\mathbf j} = \operatorname{vec}\tilde{\mathbf J} = \mu(\mathbf R\otimes\mathbf R')\,\mathbf j + \mu\operatorname{vec}(\mathbf R'\mathbf R^T) - \operatorname{vec}\mathbf I. \qquad (46) $$

Combining the inverse relation $\tilde{\mathbf q} = \mathbf B^{-1}\tilde{\mathbf j}$ with (46) and (44), we obtain the transformation

$$ \tilde{\mathbf q} = \mu\,\mathbf B^{-1}(\mathbf R\otimes\mathbf R')\mathbf B\,\mathbf q + \mu\,\mathbf B^{-1}\operatorname{vec}(\mathbf R'\mathbf R^T) - \mathbf B^{-1}\operatorname{vec}\mathbf I, \qquad (47) $$


which, after performing the necessary calculations, leads to

$$ \begin{bmatrix} 2\tilde\omega \\ \tilde\Delta \end{bmatrix} = \mu\,\mathbf R(\chi'-\chi)\begin{bmatrix} 2\omega \\ \Delta \end{bmatrix} + 2\begin{bmatrix} \mu\sin(\chi'-\chi) \\ \mu\cos(\chi'-\chi)-1 \end{bmatrix}, \qquad (48) $$

$$ \begin{bmatrix} \tilde\gamma_1 \\ \tilde\gamma_2 \end{bmatrix} = \mu\,\mathbf R(\chi'+\chi)\begin{bmatrix} \gamma_1 \\ \gamma_2 \end{bmatrix}. \qquad (49) $$

Taking the norm of both sides of (49) we get

$$ \tilde\gamma = \mu\gamma. \qquad (50) $$

From the transformation laws for the shear components $\gamma_1$, $\gamma_2$ follow the transformation laws for the angles

$$ \tilde\beta = \beta - \frac{\chi'+\chi}{2}, \qquad \tilde\varphi = \varphi - \frac{\chi'+\chi}{2}. \qquad (51) $$

None of the deformation (strain) parameters is invariant in the general geodetic case where $\mu \ne 1$ and $\chi' \ne \chi$. If scale is maintained ($\mu = 1$), then only $\tilde\gamma = \gamma$ is a numerical invariant. In the classical case of continuum mechanics with $\mu = 1$ and $\chi' = \chi$, the parameters $\tilde\omega = \omega$, $\tilde\Delta = \Delta$, $\tilde\gamma = \gamma$ are invariant, while the shear components transform according to $\tilde\gamma_1 = \cos(2\chi)\gamma_1 + \sin(2\chi)\gamma_2$ and $\tilde\gamma_2 = -\sin(2\chi)\gamma_1 + \cos(2\chi)\gamma_2$. In conclusion, the approximations of the infinitesimal approach destroy the invariance properties of the supposedly invariant strain parameters. Even if the approximation $\mu \approx 1$ is acceptable, approximate invariance can be claimed only under the arbitrary assumption that $\chi' \approx \chi$. In view of the fact that the angles $\chi$ and $\chi'$ may differ considerably, the resulting lack of invariance (or approximate invariance) of $\Delta$ also affects the maximal and minimal principal strains $\varepsilon_{\max} = \frac12(\Delta+\gamma)$ and $\varepsilon_{\min} = \frac12(\Delta-\gamma)$. Turning to the newly derived rigorous deformation parameters, we note that the SVDs of $\tilde{\mathbf F} = \mu\mathbf R(\chi')\mathbf F\mathbf R(-\chi)$ and $\mathbf F$ give

(52)

which implies the transformations θ˜ = θ − χ ,

θ˜ = θ  − χ  ,

λ˜ 1 = μλ1 ,

λ˜ 2 = μλ2 .

(53)

From the transformation law for the elongation factors we can directly derive the laws for parameters that are functions of them:

$$ \tilde\gamma = \gamma, \qquad \tilde\Delta = \mu^2(1+\Delta) - 1, \qquad (54) $$


$$ \tilde e_{\max} = \mu^2 e_{\max} + \frac{\mu^2-1}{2}, \qquad \tilde e_{\min} = \mu^2 e_{\min} + \frac{\mu^2-1}{2}. \qquad (55) $$

Application of our second type of decomposition to both $\tilde{\mathbf F} = \mu\mathbf R(\chi')\mathbf F\mathbf R(-\chi)$ and $\mathbf F$ gives

$$ \tilde{\mathbf F} = \tilde\sigma\,\mathbf R(\tilde\psi)\mathbf R(-\tilde\varphi)\tilde{\boldsymbol\Gamma}\mathbf R(\tilde\varphi) = \mu\,\mathbf R(\chi')\,[\sigma\mathbf R(\psi)\mathbf R(-\varphi)\boldsymbol\Gamma\mathbf R(\varphi)]\,\mathbf R(-\chi), \qquad (56) $$

which implies that

$$ \tilde\varphi = \varphi - \chi, \qquad \tilde\psi = \psi + \chi' - \chi, \qquad \tilde\omega = \omega + \chi' - \chi, \qquad \tilde\sigma = \mu\sigma. \qquad (57) $$

The last relation follows from $\tilde\sigma\tilde{\boldsymbol\Gamma} = \mu\sigma\boldsymbol\Gamma$ and the fact that the already established relation $\tilde\gamma = \gamma$ implies $\tilde{\boldsymbol\Gamma} = \boldsymbol\Gamma$. Since $\Delta = \sigma^2-1$, the relation $\tilde\sigma = \mu\sigma$ is equivalent to the relation $1+\tilde\Delta = \mu^2(1+\Delta)$. On the basis of the derived transformation laws we conclude that in the general geodetic case (different scale and different reference system orientation at different epochs) the only numerical invariant is the shear $\gamma$. In addition, the angles $\theta$, $\theta'$ and $\varphi$, transforming according to $\tilde\theta = \theta-\chi$, $\tilde\theta' = \theta'-\chi'$ and $\tilde\varphi = \varphi-\chi$, are physical invariants. When the reference system at epoch $t$ rotates counter-clockwise by an angle $\chi$, the direction angles $\theta$ and $\varphi$ decrease by $\chi$, so that they continue to point to the same physical direction. Similarly, when the reference system at epoch $t'$ rotates counter-clockwise by an angle $\chi'$, the direction angle $\theta'$ decreases by $\chi'$, so that it continues to point to the same physical direction. The angle $\omega$, which represents a physical rotation from epoch $t$ to epoch $t'$, is also a physical invariant: when the reference system at epoch $t$ rotates counter-clockwise by $\chi$, $\omega$ decreases by $\chi$, and when the reference system at epoch $t'$ rotates counter-clockwise by $\chi'$, $\omega$ increases by $\chi'$, so that its different values $\omega$ and $\tilde\omega$ represent the same physical rotation. Analogously, the same holds for the angle $\psi$. The remaining deformation parameters $\lambda_1$, $\lambda_2$, $e_{\max}$, $e_{\min}$, $\Delta$ are affected by the change in scale $\mu = s'/s$. If the scale can be maintained the same at both epochs ($\mu = 1$), then all quantities become numerical invariants. In the classical case of continuum mechanics, where both the scale is maintained and a common reference system is defined for both epochs, by necessity $\chi' = \chi$ and $\tilde\psi = \psi$ is not only a physical but also a numerical invariant.

In conclusion, the newly derived deformation parameters possess the invariance properties which their “infinitesimal” classical counterparts lack as a consequence of the approximations involved. The problem of different scale (different definition of the time unit and hence of the length unit) affects all parameters associated with length, with the exception of shear, which refers solely to the change of shape independently of magnitude. In the bottom line, we cannot distinguish whether a numerical increase or decrease of the overall shape of a region is due to an actual physical modification or to a loss or gain in the rates of the supporting atomic clocks.
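The transformation laws (53)–(54) and the invariance of the shear are easy to verify numerically. The sketch below (all values arbitrary) applies the change of reference systems of Eq. (41):

```python
import numpy as np

def R(a):
    return np.array([[np.cos(a), np.sin(a)], [-np.sin(a), np.cos(a)]])

def svd_params(F):
    """lambda1 >= lambda2 and the shear gamma = (l1 - l2)/sqrt(l1 l2), Eq. (37)."""
    l1, l2 = np.linalg.svd(F, compute_uv=False)
    return l1, l2, (l1 - l2) / np.sqrt(l1 * l2)

# original deformation gradient and a general change of reference systems:
# scale ratio mu and different rotations chi, chi' at the two epochs
F = np.array([[1.15, 0.20], [-0.05, 0.95]])
mu, chi, chi_p = 1.0003, np.deg2rad(35.0), np.deg2rad(-50.0)
F_t = mu * R(chi_p) @ F @ R(chi).T        # Eq. (41): F~ = mu R' F R^T

l1, l2, gamma = svd_params(F)
l1t, l2t, gamma_t = svd_params(F_t)

assert np.allclose([l1t, l2t], [mu * l1, mu * l2])            # Eq. (53)
assert np.isclose(gamma_t, gamma)                             # shear is invariant
assert np.isclose(l1t * l2t - 1.0, mu**2 * (l1 * l2) - 1.0)   # Eq. (54) for Delta
```

With $\mu = 1$ all three checks collapse to strict numerical invariance, as stated in the text.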

5 The Various Types of Strain Rates and Their Invariance Characteristics

All strain parameters $p(\mathbf F)$ describing deformation are functions of the deformation gradient, which in turn is a function $\mathbf F(t,t')$ of the two epochs under comparison. Consequently, strain parameters are themselves two-epoch functions $p(t,t')$, and two different types of corresponding strain rate parameters can be defined. The first refers to the derivative with respect to the second epoch,

$$ \dot p(t,t') = \frac{d}{dt'}p(t,t'), \qquad (58) $$

while a second type, a single-epoch rate, can be defined through the limit

$$ \dot p(t) = \lim_{\tau\to0}\frac{d}{d\tau}p(t,t+\tau) = \lim_{t'\to t}\frac{d}{dt'}p(t,t'). \qquad (59) $$

In the infinitesimal approach all strain parameters are linear functions of the displacement gradient $\mathbf J(t,t')$, with the exception of the maximum shear strain $\gamma$ and the principal strains $\varepsilon_{\max}$, $\varepsilon_{\min}$. Therefore their rates can be derived from the displacement rate matrix $\dot{\mathbf J} = d\mathbf J/dt'$, using the same formulas of Sect. 2 and simply replacing $\mathbf J$ with $\dot{\mathbf J}$, to obtain the strain rate parameters $\dot\Delta = \dot J_{11}+\dot J_{22}$, $\dot\omega = \frac12(\dot J_{12}-\dot J_{21})$, $\dot\gamma_1 = \dot J_{11}-\dot J_{22}$ and $\dot\gamma_2 = \dot J_{12}+\dot J_{21}$. Curiously enough, instead of the time derivative $\dfrac{d\gamma}{dt'} = \dfrac{d}{dt'}\sqrt{\gamma_1^2+\gamma_2^2} = \dfrac{\gamma_1\dot\gamma_1+\gamma_2\dot\gamma_2}{\gamma}$, the quantity $\dot\gamma = \sqrt{\dot\gamma_1^2+\dot\gamma_2^2}$ (analogous to $\gamma = \sqrt{\gamma_1^2+\gamma_2^2}$) is computed as the “rate of maximum shear strain”. The paradox of this choice will be resolved in the following. Furthermore, $\dot\varepsilon_{\max} = \frac12(\dot\Delta+\dot\gamma)$ and $\dot\varepsilon_{\min} = \frac12(\dot\Delta-\dot\gamma)$ are also sometimes computed.

In the rigorous approach, differentiation of (22) and (23) gives

$$ \dot A = \dot C_{11}+\dot C_{22}, \qquad \dot B = \frac{(C_{11}-C_{22})(\dot C_{11}-\dot C_{22}) + 4C_{12}\dot C_{12}}{\sqrt{(C_{11}-C_{22})^2+4C_{12}^2}} = \frac{\dot C_{11}-\dot C_{22}}{\sqrt{1+Q^{-2}}} + \frac{2\dot C_{12}}{\sqrt{1+Q^2}}, \qquad (60) $$

where we have set

$$ Q = \frac{C_{11}-C_{22}}{2C_{12}}, \qquad (61) $$

while $\dot\lambda_1 = \dfrac{\dot A+\dot B}{4\lambda_1}$ and $\dot\lambda_2 = \dfrac{\dot A-\dot B}{4\lambda_2}$. Consequently, the rates of the principal elongations are given by


$$ \dot\lambda_1 = \frac{1}{4\lambda_1}\left[\dot C_{11}+\dot C_{22} + \frac{\dot C_{11}-\dot C_{22}}{\sqrt{1+Q^{-2}}} + \frac{2\dot C_{12}}{\sqrt{1+Q^2}}\right], \qquad (62) $$

$$ \dot\lambda_2 = \frac{1}{4\lambda_2}\left[\dot C_{11}+\dot C_{22} - \frac{\dot C_{11}-\dot C_{22}}{\sqrt{1+Q^{-2}}} - \frac{2\dot C_{12}}{\sqrt{1+Q^2}}\right]. \qquad (63) $$

The remaining two-epoch derivatives have the straightforward forms

$$ \dot\Delta = \dot\lambda_1\lambda_2 + \lambda_1\dot\lambda_2, \qquad \dot\omega = \dot\theta - \dot\theta', \qquad (64) $$

$$ \dot\gamma = \frac{(\lambda_1+\lambda_2)(\lambda_2\dot\lambda_1-\lambda_1\dot\lambda_2)}{2\lambda_1\lambda_2\sqrt{\lambda_1\lambda_2}} = \frac{\gamma(\lambda_1+\lambda_2)}{2(\lambda_1-\lambda_2)}\left(\frac{\dot\lambda_1}{\lambda_1}-\frac{\dot\lambda_2}{\lambda_2}\right), \qquad (65) $$

$$ \dot e_{\max} = \lambda_1\dot\lambda_1, \qquad \dot e_{\min} = \lambda_2\dot\lambda_2. \qquad (66) $$

For the angular parameters we may differentiate $\mathbf F = \mathbf R(-\theta')\boldsymbol\Lambda\mathbf R(\theta)$, recalling that $\dot{\mathbf R}(\alpha) = \mathbf R(\alpha+90^\circ)$, to obtain the four equations

$$ \dot{\mathbf F} = -\dot\theta'\,\mathbf R(-\theta'+90^\circ)\boldsymbol\Lambda\mathbf R(\theta) + \dot\theta\,\mathbf R(-\theta')\boldsymbol\Lambda\mathbf R(\theta+90^\circ) + \mathbf R(-\theta')\dot{\boldsymbol\Lambda}\mathbf R(\theta), \qquad (67) $$

which can be solved for $\dot\lambda_1$, $\dot\lambda_2$, $\dot\theta$ and $\dot\theta'$. We need only the last two:

$$ \dot\theta = \frac{-\lambda_1\cos\theta'\sin\theta - \lambda_2\sin\theta'\cos\theta}{\lambda_1^2-\lambda_2^2}\dot F_{11} + \frac{\lambda_1\cos\theta'\cos\theta - \lambda_2\sin\theta'\sin\theta}{\lambda_1^2-\lambda_2^2}\dot F_{12} + \frac{-\lambda_1\sin\theta'\sin\theta + \lambda_2\cos\theta'\cos\theta}{\lambda_1^2-\lambda_2^2}\dot F_{21} + \frac{\lambda_1\sin\theta'\cos\theta + \lambda_2\cos\theta'\sin\theta}{\lambda_1^2-\lambda_2^2}\dot F_{22}, \qquad (68) $$

$$ \dot\theta' = \frac{-\lambda_2\cos\theta'\sin\theta - \lambda_1\sin\theta'\cos\theta}{\lambda_1^2-\lambda_2^2}\dot F_{11} + \frac{\lambda_2\cos\theta'\cos\theta - \lambda_1\sin\theta'\sin\theta}{\lambda_1^2-\lambda_2^2}\dot F_{12} + \frac{-\lambda_2\sin\theta'\sin\theta + \lambda_1\cos\theta'\cos\theta}{\lambda_1^2-\lambda_2^2}\dot F_{21} + \frac{\lambda_2\sin\theta'\cos\theta + \lambda_1\cos\theta'\sin\theta}{\lambda_1^2-\lambda_2^2}\dot F_{22}. \qquad (69) $$

The rotation angle rate becomes

$$ \dot\omega = \dot\theta - \dot\theta' = -\frac{\sin\omega}{\lambda_1+\lambda_2}(\dot F_{11}+\dot F_{22}) + \frac{\cos\omega}{\lambda_1+\lambda_2}(\dot F_{12}-\dot F_{21}). \qquad (70) $$

In a similar way we may differentiate the representation $\mathbf F = \sigma\mathbf R(\psi)\mathbf R(-\varphi)\boldsymbol\Gamma\mathbf R(\varphi)$ to obtain the four equations

$$ \dot{\mathbf F} = \dot\sigma\,\mathbf R(\psi-\varphi)\boldsymbol\Gamma\mathbf R(\varphi) + (\dot\psi-\dot\varphi)\,\sigma\,\mathbf R(\psi-\varphi+90^\circ)\boldsymbol\Gamma\mathbf R(\varphi) + \dot\varphi\,\sigma\,\mathbf R(\psi-\varphi)\boldsymbol\Gamma\mathbf R(\varphi+90^\circ) + \sigma\,\mathbf R(\psi-\varphi)\dot{\boldsymbol\Gamma}\mathbf R(\varphi), \qquad (71) $$

which can be solved for $\dot\sigma$, $\dot\gamma$, $\dot\varphi$ and $\dot\psi$, recalling that $\Delta = \sigma^2-1$. We need only the last two:

$$ \dot\varphi = \frac{\cos(2\varphi-\psi)}{2\gamma\sqrt{1+\Delta}}(\dot F_{22}-\dot F_{11}) - \frac{\sin(2\varphi-\psi)}{2\gamma\sqrt{1+\Delta}}(\dot F_{12}+\dot F_{21}) + \frac{\cos(\varphi-\psi)}{2\sqrt{1+\Delta}}(\cos\varphi\,\dot F_{11}+\sin\varphi\,\dot F_{12}) - \frac{\sin(\varphi-\psi)}{2\sqrt{1+\Delta}}(\cos\varphi\,\dot F_{21}+\sin\varphi\,\dot F_{22}), \qquad (72) $$

$$ \dot\psi = \frac{\cos(\varphi-\psi)}{\sqrt{1+\Delta}}(\cos\varphi\,\dot F_{11}+\sin\varphi\,\dot F_{12}) - \frac{\sin(\varphi-\psi)}{\sqrt{1+\Delta}}(\cos\varphi\,\dot F_{21}+\sin\varphi\,\dot F_{22}). \qquad (73) $$

All the above are two-epoch derivatives $\dot p(t,t') = \frac{d}{dt'}p(t,t')$. In order to obtain the single-epoch derivatives $\dot p(t) = \lim_{t'\to t}\dot p(t,t')$, we need to know how the deformation gradient $\mathbf F(t,t')$ and its derivative $\dot{\mathbf F}(t,t')$ behave as functions of $t'$. For this reason we cannot derive general formulas for single-epoch strain parameter rates. We will therefore restrict ourselves to the usual case of the linear-in-time (constant velocity) coordinate model $\mathbf x(t) = \mathbf x_0 + (t-t_0)\mathbf v$, where $\mathbf x_0$ are the initial coordinates at a reference epoch $t_0$ and $\mathbf v$ is the constant velocity of the point in question. In this case the epochs $t$, $t'$ under comparison become $t_0$, $t$, respectively, and the deformation gradient becomes

$$ \mathbf F = \frac{\partial\mathbf x}{\partial\mathbf x_0} = \mathbf I + (t-t_0)\frac{\partial\mathbf v}{\partial\mathbf x_0} \equiv \mathbf I + \tau\mathbf L, \qquad (74) $$

where we have set $\tau = t-t_0$ and $\mathbf L = \dfrac{\partial\mathbf v}{\partial\mathbf x_0}$ is the velocity gradient. The corresponding displacement gradient is $\mathbf J = \mathbf F - \mathbf I = \tau\mathbf L$ and its derivative is $\dot{\mathbf J} = \mathbf L$. It follows that

$$ \mathbf C = \mathbf F^T\mathbf F = \mathbf I + \tau(\mathbf L+\mathbf L^T) + \tau^2\mathbf L^T\mathbf L = \begin{bmatrix} 1+2\tau L_{11}+\tau^2(L_{11}^2+L_{21}^2) & \tau(L_{12}+L_{21})+\tau^2(L_{11}L_{12}+L_{21}L_{22}) \\ \tau(L_{12}+L_{21})+\tau^2(L_{11}L_{12}+L_{21}L_{22}) & 1+2\tau L_{22}+\tau^2(L_{12}^2+L_{22}^2) \end{bmatrix}, \qquad (75) $$

$$ Q = \frac{C_{11}-C_{22}}{2C_{12}} = \frac{2(L_{11}-L_{22}) + \tau(L_{11}^2+L_{21}^2-L_{12}^2-L_{22}^2)}{2[(L_{12}+L_{21}) + \tau(L_{11}L_{12}+L_{21}L_{22})]}. \qquad (76) $$

For $t\to t_0$ (or equivalently $\tau = t-t_0 \to 0$), $\mathbf F\to\mathbf I$, $\mathbf C\to\mathbf I$, $\lambda_1\to1$, $\lambda_2\to1$, since $\lambda_1^2$, $\lambda_2^2$ are the eigenvalues of $\mathbf C$, and

$$ \lim_{t\to t_0}\dot{\mathbf C} = \lim_{t\to t_0}\begin{bmatrix} \dot C_{11} & \dot C_{12} \\ \dot C_{21} & \dot C_{22} \end{bmatrix} = \begin{bmatrix} 2L_{11} & L_{12}+L_{21} \\ L_{12}+L_{21} & 2L_{22} \end{bmatrix}, \qquad (77) $$

$$ \lim_{t\to t_0} Q = \frac{L_{11}-L_{22}}{L_{12}+L_{21}}. \qquad (78) $$

On the basis of the above limits it is easy to derive the required single-epoch strain rates

$$ \dot\lambda_1(t_0) = \lim_{t\to t_0}\dot\lambda_1 = \frac{L_{11}+L_{22}}{2} + \frac12\sqrt{(L_{11}-L_{22})^2+(L_{12}+L_{21})^2}, \qquad (79) $$

$$ \dot\lambda_2(t_0) = \lim_{t\to t_0}\dot\lambda_2 = \frac{L_{11}+L_{22}}{2} - \frac12\sqrt{(L_{11}-L_{22})^2+(L_{12}+L_{21})^2}, \qquad (80) $$

$$ \dot\Delta(t_0) = \lim_{t\to t_0}\dot\Delta = \lim_{t\to t_0}\dot\lambda_1 + \lim_{t\to t_0}\dot\lambda_2 = L_{11}+L_{22}, \qquad (81) $$

$$ \dot\gamma(t_0) = \lim_{t\to t_0}\dot\gamma = \lim_{t\to t_0}\dot\lambda_1 - \lim_{t\to t_0}\dot\lambda_2 = \sqrt{(L_{11}-L_{22})^2+(L_{12}+L_{21})^2}, \qquad (82) $$

and furthermore $\dot e_{\max}(t_0) = \lim_{t\to t_0}\dot e_{\max} = \lim_{t\to t_0}\dot\lambda_1 = \dot\lambda_1(t_0)$, $\dot e_{\min}(t_0) = \lim_{t\to t_0}\dot e_{\min} = \lim_{t\to t_0}\dot\lambda_2 = \dot\lambda_2(t_0)$. For the rotation angle we note that $\lim_{t\to t_0}\theta' = \lim_{t\to t_0}\theta$, so that $\lim_{t\to t_0}\omega = 0$, and since $\dot{\mathbf F} = \mathbf L$ it follows from (70) that

$$ \dot\omega(t_0) = \lim_{t\to t_0}\dot\omega = \frac12\lim_{t\to t_0}(\dot F_{12}-\dot F_{21}) = \frac12(L_{12}-L_{21}). \qquad (83) $$

Recognizing that in the case of the linear-in-time model $\mathbf L = \dot{\mathbf J}$ and setting

$$ \dot\gamma_1(t_0) = \dot J_{11}-\dot J_{22} = L_{11}-L_{22}, \qquad \dot\gamma_2(t_0) = \dot J_{12}+\dot J_{21} = L_{12}+L_{21}, \qquad (84) $$

we obtain

$$ \dot\Delta(t_0) = \dot J_{11}+\dot J_{22}, \qquad \dot\omega(t_0) = \tfrac12(\dot J_{12}-\dot J_{21}), \qquad (85) $$

$$ \dot\gamma(t_0) = \sqrt{\dot\gamma_1(t_0)^2+\dot\gamma_2(t_0)^2}, \qquad (86) $$

$$ \dot e_{\max}(t_0) = \dot\lambda_1(t_0) = \frac{\dot\Delta(t_0)+\dot\gamma(t_0)}{2}, \qquad \dot e_{\min}(t_0) = \dot\lambda_2(t_0) = \frac{\dot\Delta(t_0)-\dot\gamma(t_0)}{2}. \qquad (87) $$
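A sketch of the single-epoch rates (79)–(87) computed from a velocity gradient $\mathbf L$, checked against the behaviour of the singular values of $\mathbf I + \tau\mathbf L$ for small $\tau$ (all numerical values are illustrative only):

```python
import numpy as np

def single_epoch_rates(L):
    """Single-epoch strain rates at t0 from the velocity gradient L, Eqs. (79)-(87)."""
    g1d = L[0, 0] - L[1, 1]                 # gamma_1 rate, Eq. (84)
    g2d = L[0, 1] + L[1, 0]                 # gamma_2 rate, Eq. (84)
    gd = np.hypot(g1d, g2d)                 # maximum shear rate, Eq. (82)
    Dd = L[0, 0] + L[1, 1]                  # dilatation rate, Eq. (81)
    wd = 0.5 * (L[0, 1] - L[1, 0])          # rotation rate, Eq. (83)
    l1d = 0.5 * (Dd + gd)                   # principal elongation rates (79)-(80)
    l2d = 0.5 * (Dd - gd)
    return Dd, wd, gd, l1d, l2d

# for F = I + tau*L the singular values behave like 1 + tau*lambda_dot near tau = 0
L = np.array([[2e-3, 1e-3], [-5e-4, -1e-3]])
tau = 1e-5
s = np.linalg.svd(np.eye(2) + tau * L, compute_uv=False)
num_rates = (s - 1.0) / tau
Dd, wd, gd, l1d, l2d = single_epoch_rates(L)
assert np.allclose(num_rates, [l1d, l2d], rtol=1e-4)
```

A one-sided difference is used deliberately: at $\tau = 0$ the singular values are degenerate ($\lambda_1 = \lambda_2 = 1$) and the two branches cross, so a central difference would mix them.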

We have in essence recovered the results of the infinitesimal approach, with two important limitations. The strain rate formulas are valid only for the linear-in-time coordinate model $\mathbf x(t) = \mathbf x_0 + (t-t_0)\mathbf v$, and only at the reference epoch $t_0$ with respect to which the velocity gradient $\dot{\mathbf J} = \mathbf L = \partial\mathbf v/\partial\mathbf x_0$ is computed. Since the choice


of $t_0$ is quite arbitrary, the results hold equally for any epoch $t$, if we simply replace $\mathbf L$ with $\mathbf L_t = \dfrac{\partial\mathbf v}{\partial\mathbf x(t)}$. By the chain rule it holds that

$$ \mathbf L_t = \begin{bmatrix} L_{t,11} & L_{t,12} \\ L_{t,21} & L_{t,22} \end{bmatrix} = \frac{\partial\mathbf v}{\partial\mathbf x(t)} = \frac{\partial\mathbf v}{\partial\mathbf x_0}\left(\frac{\partial\mathbf x(t)}{\partial\mathbf x_0}\right)^{-1} = \mathbf L(\mathbf I+\tau\mathbf L)^{-1} = \frac{1}{D(t)}\left[\mathbf L + (\det\mathbf L)(t-t_0)\mathbf I\right] = \frac{1}{D(t)}\begin{bmatrix} L_{11}+(t-t_0)(L_{11}L_{22}-L_{12}L_{21}) & L_{12} \\ L_{21} & L_{22}+(t-t_0)(L_{11}L_{22}-L_{12}L_{21}) \end{bmatrix}, \qquad (88) $$

where we have set

$$ D(t) = \det(\mathbf I+\tau\mathbf L) = 1 + (t-t_0)\operatorname{trace}\mathbf L + (t-t_0)^2\det\mathbf L. \qquad (89) $$

Substituting $\mathbf L$ with $\mathbf L_t$, we obtain the following strain rates at any epoch $t$:

$$ \dot\Delta(t) = \frac{\dot\Delta(t_0)+2(t-t_0)\det\mathbf L}{D(t)}, \qquad \dot\omega(t) = \frac{\dot\omega(t_0)}{D(t)}, \qquad (90) $$

$$ \dot\gamma_1(t) = \frac{\dot\gamma_1(t_0)}{D(t)}, \qquad \dot\gamma_2(t) = \frac{\dot\gamma_2(t_0)}{D(t)}, \qquad \dot\gamma(t) = \sqrt{\dot\gamma_1(t)^2+\dot\gamma_2(t)^2} = \frac{\dot\gamma(t_0)}{D(t)}, \qquad (91) $$

$$ \dot\lambda_1(t) = \frac{\dot\lambda_1(t_0)+(t-t_0)\det\mathbf L}{D(t)}, \qquad \dot\lambda_2(t) = \frac{\dot\lambda_2(t_0)+(t-t_0)\det\mathbf L}{D(t)}, \qquad (92) $$

$$ \dot e_{\max}(t) = \frac{\dot e_{\max}(t_0)+(t-t_0)\det\mathbf L}{D(t)}, \qquad \dot e_{\min}(t) = \frac{\dot e_{\min}(t_0)+(t-t_0)\det\mathbf L}{D(t)}. \qquad (93) $$
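Equations (88)–(93) can be cross-checked by applying the single-epoch formulas to $\mathbf L_t$ directly. A sketch (naming and values ours):

```python
import numpy as np

def rates_at_t(L, t, t0=0.0):
    """Strain rates at epoch t for the constant-velocity model, Eqs. (88)-(93)."""
    tau = t - t0
    detL = np.linalg.det(L)
    D = 1.0 + tau * np.trace(L) + tau**2 * detL            # Eq. (89)
    g1d0 = L[0, 0] - L[1, 1]
    g2d0 = L[0, 1] + L[1, 0]
    gd0 = np.hypot(g1d0, g2d0)
    Dd0 = np.trace(L)
    return {"dilatation": (Dd0 + 2.0 * tau * detL) / D,    # Eq. (90)
            "rotation": 0.5 * (L[0, 1] - L[1, 0]) / D,
            "shear": gd0 / D,                              # Eq. (91)
            "lam1": (0.5 * (Dd0 + gd0) + tau * detL) / D,  # Eq. (92)
            "lam2": (0.5 * (Dd0 - gd0) + tau * detL) / D}

# consistency with Eq. (88): the same rates follow from the velocity gradient
# at epoch t, L_t = L (I + tau L)^{-1}
L = np.array([[2e-3, 1e-3], [-5e-4, -1e-3]])
t = 3.0
Lt = L @ np.linalg.inv(np.eye(2) + t * L)
r = rates_at_t(L, t)
assert np.isclose(r["dilatation"], np.trace(Lt))
assert np.isclose(r["rotation"], 0.5 * (Lt[0, 1] - Lt[1, 0]))
assert np.isclose(r["shear"], np.hypot(Lt[0, 0] - Lt[1, 1], Lt[0, 1] + Lt[1, 0]))
```

Note that the common denominator $D(t)$ is exactly $\det(\mathbf I+\tau\mathbf L)$, so the closed forms (90)–(93) and the substitution $\mathbf L\to\mathbf L_t$ agree identically.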

6 Separating Relative Motion from Deformation

The coordinates at two epochs, or coordinates and velocities at the same epoch, which form the basis for the derivation of strain or strain rate parameters, refer to reference systems, which are in fact arbitrarily chosen. Geodetic observations are only capable of determining the shape of a geodetic network and its variation with time. Despite common belief, absolute positioning is not only outside the capabilities of geodesy, it is also physically meaningless. As we have already seen in the previous section, when the new rigorous deformation theory is implemented, deformation parameters are either numerical or physical invariants when the same scale is maintained over time. Therefore, we do not need to worry about the already made choice of reference system when our target is the internal deformation of a


region. However, coordinate variations may reveal relative motion of independent tectonic regions, which is of as much interest as their internal deformation. The relation to the choice of reference systems is based on the dual meaning of a rigid transformation $\tilde{\mathbf x} = \mathbf R\mathbf x + \mathbf d$: it represents both a change of the reference system and an actual rigid motion with respect to a fixed reference system. Another important reason for detecting and removing relative motion is that one does not have at hand the deformation mapping $\mathbf x' = \mathbf f(\mathbf x)$ itself, but only its discrete trace $\mathbf x'_i = \mathbf f(\mathbf x_i)$ at the network stations. Thus an interpolation must take place in order to determine the deformation gradient $\mathbf F = \partial\mathbf x'/\partial\mathbf x$ at any desired point. As we will see in the next section, interpolation is performed more efficiently when the main trend is removed, and relative motion is such a type of trend. We will examine more precisely the two different cases of deformation analysis: the computation of strain parameters from the comparison of the shape at two epochs, and the computation of strain rate parameters from coordinates and velocities at the same epoch. Before discussing specific interpolation techniques in the next section, it is necessary to point out that interpolation of discrete point data cannot be performed successfully across discontinuities. Such spatial discontinuities occur at the borders of tectonic plates or sub-plates, which move with respect to each other in addition to deforming internally. It is of the utmost importance in crustal motion studies to separate the relative motion of independent tectonic units from their internal deformation. The first step in our analysis is the identification of independent tectonic units exhibiting relative motion, by examining the available discrete station displacements or velocities. This can be done much more easily if a common rigid motion trend is first removed.
The separation can be realized with the use of standard clustering techniques (K-means, Isodata), employing a "distance" in which both velocity similarity and spatial proximity are taken into account. In most cases, separation by simple inspection of the reduced velocities is sufficient.

We will first describe how the information provided by displacements or velocities can be separated into a rigid motion of a whole area and the remaining internal deformation, in the various possible deformation analysis situations. We start with the two-epoch comparison situation, where the basic information is the station displacements as expressed by coordinate differences. The simplest case, and the easiest to understand, is when the area covered by the network includes an "external region" that is effectively rigid, and a "target region" which may be either rigid or deforming. This is in fact the situation in classical mechanics, where a deforming body is studied within a rigid laboratory, but it occurs also in many geodetic engineering applications, for example the deformation of a water dam. The coordinates at the two epochs refer to the same reference system, a fact that is ensured by identifying the coordinates of the external region at the two epochs during the data analysis procedure. When the target region is rigid, the coordinates of its points at the two epochs demonstrate a variation due solely to a rigid motion, x′ᵢ = Rxᵢ + d, and the parameters of the transformation can be estimated by least squares, taking into account the influence of observational errors on the input coordinate estimates.
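The clustering-based separation mentioned above can be sketched as follows. This is a minimal illustration in Python with synthetic station data; the weight w balancing spatial proximity against velocity similarity, the two-block geometry, and the deterministic initialization are assumptions of this sketch, not prescriptions from the text.

```python
import numpy as np

def kmeans(features, init_idx, n_iter=20):
    """Plain Lloyd's K-means; returns the cluster label of each point."""
    centers = features[list(init_idx)].astype(float)
    for _ in range(n_iter):
        # assign each point to the nearest center (Euclidean "distance")
        dist = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # recompute centers; keep the old one if a cluster empties
        for j in range(len(centers)):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels

# Synthetic example: two blocks with distinct rigid velocities
# (positions in km, velocities in mm/yr -- hypothetical values).
rng = np.random.default_rng(1)
xa = rng.uniform(0, 50, (30, 2));   va = np.tile([10.0, 0.0], (30, 1))
xb = rng.uniform(60, 110, (30, 2)); vb = np.tile([-5.0, 8.0], (30, 1))
x = np.vstack([xa, xb]); v = np.vstack([va, vb])

# Combined feature: spatial proximity plus velocity similarity; the
# weight w is a tuning choice of this illustration.
w = 5.0
labels = kmeans(np.hstack([x, w * v]), init_idx=(0, len(x) - 1))
```

With the two initial centers chosen far apart, the two tectonic blocks are separated after the first assignment and the partition remains stable.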


A. Dermanis

When the target region is deforming, the meaning of its relative motion as a whole with respect to the external region is not quite clear. Indeed, the term "motion of the deformable area" does not have a clear, non-ambiguous meaning. When it is used, it refers in reality to the motion of a reference system that best represents the deformable target area as a whole. An optimal representation is one that minimizes the relative motion of the target area points with respect to the representing reference system, i.e. one for which the remaining residual displacements δxᵢ = x′ᵢ − x̃ᵢ, after the rigid motion component x̃ᵢ = Rxᵢ + d is removed, are collectively as small as possible. A simple least squares fit (Σᵢ δxᵢᵀδxᵢ = min) is a reasonable choice. It is a usual practice to use weighted least squares with the inverse of the coordinate covariance matrix as weight matrix. This practice, which is the statistically legitimate one when the target region is rigid, is not justified in the present case of a deforming target region. One can no longer resort to the Gauss-Markov theorem, because in this case the remaining residuals are not zero-mean random errors but deterministic unknown quantities. The only possible different weighting could be based on geophysical grounds, e.g. assigning larger weights to points that best represent the area, and smaller (even zero) weights to ones that are associated with localized tectonic behavior.

We have loosely referred to an optimal reference system which best represents a deforming region, but this concept needs further clarification. We have chosen a reference system S_E for the external (E) rigid region which is time independent: S_E(t) = S_E(t′) = S_E. In the target (T) region the original choices for the reference system at the two epochs have been S_T(t) = S_T(t′) = S_E, for the original data xᵢ ∈ S_E and x′ᵢ ∈ S_E.
Then a rigid transformation S_T(t) = S_E → S̃_T(t), of the form x̃ᵢ = R(θ)xᵢ + d, has provided a new reference system S̃_T(t) for the first-epoch final data x̃ᵢ ∈ S̃_T(t), while at the second epoch the reference system remained the same for x′ᵢ ∈ S_E. We could as well perform a rigid transformation S_T(t′) = S_E → S̃_T(t′), of the form x̃′ᵢ = R(θ)x′ᵢ + d, to provide a new reference system S̃_T(t′) for the target region at the second epoch, minimizing instead the square sum of the displacements δxᵢ = x̃′ᵢ − xᵢ. Since the choice of the reference system for the external region is arbitrary, we could have chosen another reference system, say S′_E, which would differ from S_E by a rigid transformation. Then we could perform a displacement minimization in order to change the target region reference system at the first epoch from S_T(t) = S′_E to S̃′_T(t), or to change the one at the second epoch from S_T(t′) = S′_E to S̃′_T(t′). All the resulting possible pairs, namely S̃_T(t) and S_T(t′) = S_E, or S_T(t) = S_E and S̃_T(t′), or S̃′_T(t) and S_T(t′) = S′_E, or S_T(t) = S′_E and S̃′_T(t′), are in fact optimal pairs of reference systems. Each pair differs from any other by a rigid transformation, the same at both epochs.

In conclusion, optimality of the reference system for a deforming region refers to a pair of reference systems for the two epochs under comparison, and is achieved by minimizing the sum of the squares of the displacements between the two epochs. In addition, optimal reference system pairs are not unique: they are defined only up to a constant rigid transformation.

The next case refers to a geodetic network covering an area that does not include an external stable region. One should first check whether the area covered exhibits a unified tectonic behavior, or whether it can be divided into sub-regions with different tectonic behavior, which in addition to their internal deformation also move with respect to each other. When a single unified region is studied, the reference systems at the two epochs are arbitrarily chosen, and the part of the coordinate variation that can be attributed to a rigid motion does not reflect an actual such motion but is an artifact of the reference system choice. To see this, consider the special case where a region is studied under the assumption that it is deformable, while in reality it is rigid. Then the rigid transformation connecting the reference systems at the two epochs does not reflect any actual motion, but is solely due to the choice made for the reference systems. This choice is reflected in the computed displacements (coordinate differences), while in reality there are no displacements at all. A different choice could eliminate these displacements (in fact minimize them, if one accounts for the influence of the observational errors) or make them arbitrarily large. When the rigid motion part is removed (by the least squares minimization of the displacements) one establishes in fact an optimal pair of reference systems for the two epochs. This was already realized in early deformation studies, where the minimization of the displacements was achieved as follows: the first-epoch coordinate estimates were used as approximate values in the second-epoch adjustment, and inner constraints were implemented for the definition of the reference system. As is well known, the latter minimize the norm of the vector of unknowns, i.e. of the corrections to the approximate values, that is, the displacements (two-epoch coordinate differences).
When different tectonic units are identified within the area covered by the network, then the first-epoch common reference system is optimally paired with a different second-epoch reference system for each region, obtained by least-squares minimization of the displacements within each region. The different rigid transformations for any pair of regions can be used to identify the relative motion of the two regions. For example, if x̃ᵢᴬ = R_A xᵢ + d_A and x̃ᵢᴮ = R_B xᵢ + d_B are the transformations for two regions A and B, the relative motion of region B with respect to region A is described by the transformation

x̃ᵢᴮ = (R_B R_Aᵀ) x̃ᵢᴬ + (d_B − R_B R_Aᵀ d_A),

which is in fact independent of the choice of reference system at the first epoch.

Let us pass next to the determination of strain rate parameters when the data are station coordinates and velocities at the same epoch. While coordinates depend on the choice of reference system at the particular epoch to which they refer, velocities, being the derivatives of coordinates, depend on the choice of reference system at each epoch within a small interval around the data epoch. If two spatiotemporal reference system choices differ at each epoch by a rigid transformation x̃ᵢ(t) = R(t)xᵢ(t) + d(t), then the coordinates at a particular epoch t_k will be connected by x̃ᵢ(t_k) = R(t_k)xᵢ(t_k) + d(t_k), while the velocities will be connected by (dots represent time derivatives)

ṽᵢ(t_k) = Ṙ(t_k)xᵢ(t_k) + R(t_k)vᵢ(t_k) + ḋ(t_k).    (94)
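The relative-motion transformation above can be verified numerically. The following sketch (Python, with hypothetical rotations and translations) checks that eliminating the first-epoch coordinates indeed yields x̃ᵢᴮ = (R_B R_Aᵀ)x̃ᵢᴬ + (d_B − R_B R_Aᵀ d_A), and that the result is unchanged under a common rigid change of the first-epoch reference system:

```python
import numpy as np

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def relative_motion(RA, dA, RB, dB):
    """Motion of region B with respect to region A:
    x~B_i = (RB RA^T) x~A_i + (dB - RB RA^T dA)."""
    R = RB @ RA.T
    return R, dB - R @ dA

# Hypothetical first-epoch coordinates and two regional transformations.
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 3))
RA, dA = rot_z(0.01), np.array([1.0, 2.0, 3.0])
RB, dB = rot_z(-0.02), np.array([-0.5, 0.3, 0.1])
xA = x @ RA.T + dA                       # x~A_i = RA x_i + dA (rowwise)
xB = x @ RB.T + dB                       # x~B_i = RB x_i + dB
R_rel, d_rel = relative_motion(RA, dA, RB, dB)

# Change the first-epoch reference system by a common rigid
# transformation y = S x + t; RA, dA, RB, dB change accordingly,
# but the relative motion of B with respect to A must not.
S, t = rot_z(0.5), np.array([10.0, -7.0, 2.0])
RA2, dA2 = RA @ S.T, dA - RA @ S.T @ t
RB2, dB2 = RB @ S.T, dB - RB @ S.T @ t
R_rel2, d_rel2 = relative_motion(RA2, dA2, RB2, dB2)
```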


Thus, an optimal reference system must be sought by determining optimal values of the rotation parameters, their derivatives, and the derivatives of the translation parameters. Concerning the optimality criteria, we will resort to Felix Tisserand, who addressed the question of an optimal reference system best representing the whole earth E in the framework of earth rotation. He used as origin the geocenter, i.e. the center of mass of the earth: he set x_C = (1/M)∫_E x dm = 0, where dm is the mass element and M = ∫_E dm is the earth mass. For the orientation of the axes he minimized the relative kinetic energy T_R = (1/2)∫_E ẋᵀẋ dm at any epoch, thus collectively minimizing the relative velocities ẋ with respect to the reference system. An equivalent requirement is the vanishing of the relative angular momentum h_R = ∫_E [x×]ẋ dm = 0 at any epoch. Since a rigid variation at all epochs gives an equally optimal reference system, the Tisserand optimality conditions are h_R(t) = 0 and ẋ_C = 0.

In geodetic applications we are not dealing with continuous bodies directly, but rather with discrete sets of points (network stations). For this reason we will resort to a discrete analog of the Tisserand reference system for a network or subnetwork, based on the visualization of the network stations as mass points of equal mass (unit mass without loss of generality). Such an optimal discrete Tisserand reference system for a particular region is one for which the discrete (barycentric) angular momentum h = Σᵢ [(xᵢ − m)×](ẋᵢ − ṁ) = 0 vanishes at every epoch, where m = (1/n)Σᵢ xᵢ is the network barycenter, while the barycenter remains constant in time, ṁ = 0, i.e. m = const. When coordinates and velocities at some epoch are available, the discrete Tisserand principle is directly applicable.
It remains to demonstrate that the vanishing of the discrete angular momentum is equivalent to the minimization of the discrete kinetic energy T = (1/2)Σᵢ ẋᵢᵀẋᵢ = min, or equivalently to the least squares minimization of the apparent velocities, Σᵢ vᵢᵀvᵢ = min.

A Tisserand-like set of optimality conditions can also be applied to the comparison at two epochs. To this purpose we will replace the angular momentum with a similar quantity, h = Σᵢ [(xᵢ − m)×](δxᵢ − δm) = 0, where the velocities have been replaced by the displacements δxᵢ = x′ᵢ − xᵢ. Here δm = m′ − m, while xᵢ and x′ᵢ are the station coordinates at the two epochs t and t′, and m and m′ are their respective barycenters. For the reference system origin we will require a constant barycenter, i.e. m′ = m.

We will determine the rigid transformation parameters leading to an optimal reference system for a deforming region in five cases, depending on the available data: two-epoch coordinates in three dimensions (three-dimensional displacements), two-epoch coordinates on the plane (planar displacements), three-dimensional velocities, planar velocities, and horizontal velocities on the surface of the earth for global or large-region applications. The solutions will be derived in two ways: by minimizing the sum of the squares of the displacements or velocities, and also by applying discrete Tisserand or Tisserand-like principles.

A final point that needs clarification relates to the case of coordinates directly determined in a global reference system, e.g. by space techniques. One tends to believe that their temporal variation is absolute with respect to an earth-fixed reference system. This is not the case, however: the spatiotemporal global reference system is conventionally established both for an initial reference epoch, as well as with respect to its temporal evolution, and is one of infinitely many choices, all describing the same temporal variation of the shape of the global network.

6.1 Three Dimensional Velocities

We have available coordinates xᵢ and velocities vᵢ = ẋᵢ ≡ dxᵢ/dt in three dimensions, and we seek the optimal parameters of the rigid transformation x̃ᵢ = R(θ)xᵢ + d, leading to velocities ṽᵢ = Ṙxᵢ + Rẋᵢ + ḋ, which are least-squares minimized with target function φ = Σᵢ ṽᵢᵀṽᵢ = min. Noting that the matrix [ω×] = ṘᵀR is antisymmetric (just differentiate RᵀR = I to see this) with axial vector ω, we may use the resulting relation Ṙ = −R[ω×] to express the new velocities as

ṽᵢ = Rvᵢ − R[ω×]xᵢ + ḋ = Rvᵢ + R[xᵢ×]ω + ḋ    (95)

and the target function as

φ = Σᵢ ṽᵢᵀṽᵢ = Σᵢ (Rvᵢ + R[xᵢ×]ω + ḋ)ᵀ(Rvᵢ + R[xᵢ×]ω + ḋ) = min_{θ,ω,ḋ}.    (96)

We will access the required minimum in two steps according to the scheme

min_{θ,ω,ḋ} φ = min_{θ,ω} ( min_ḋ φ ) = min_{θ,ω} φ̄,    φ̄ = min_ḋ φ.    (97)

To determine φ̄ in the first step, we set ∂φ/∂ḋ = 0, which yields Rṁ − R[ω×]m + ḋ = 0, where m = (1/n)Σᵢ xᵢ and ṁ = (1/n)Σᵢ vᵢ. Therefore

ḋ = R[ω×]m − Rṁ.    (98)

Replacing this value in φ, we obtain the partial minimum

φ̄ = Σᵢ {(vᵢ − ṁ) + [(xᵢ − m)×]ω}ᵀ{(vᵢ − ṁ) + [(xᵢ − m)×]ω} = φ̄(ω) = min_ω,    (99)

which depends only on ω. To obtain the final minimum at the second step we need only to set ∂φ̄/∂ω = 0, which gives

Σᵢ [(xᵢ − m)×](vᵢ − ṁ) + ( Σᵢ [(xᵢ − m)×]² ) ω = h − Cω = 0,    (100)

where we have recognized the discrete angular momentum h = Σᵢ [(xᵢ − m)×](vᵢ − ṁ), as well as the discrete inertia matrix of the network C = −Σᵢ [(xᵢ − m)×]², both referring to the original reference system. Solving for ω and integrating (98), we find that φ = Σᵢ ṽᵢᵀṽᵢ is minimized at

ω = C⁻¹h,    d = −Rm + c,    (101)

where c is an arbitrary constant. To complete the solution, one must solve the differential equations Ṙ(θ)ᵀR(θ) = [ω×], but fortunately we do not need to do this. We only need to compute the minimized velocities ṽᵢ = Rvᵢ − R[ω×]xᵢ + ḋ, which for ḋ = R[ω×]m − Rṁ become ṽᵢ = R(vᵢ − ṁ) − R[ω×](xᵢ − m) and, expressed in the original reference system, take the form, free of R(θ),

v′ᵢ = Rᵀṽᵢ = (vᵢ − ṁ) − [ω×](xᵢ − m) = (vᵢ − ṁ) + [(xᵢ − m)×]C⁻¹h.    (102)

The above velocities are the basis for performing the deformational analysis in the area, after a rigid motion of the whole area has been removed.

Let us also examine the solution obtained by directly applying the Tisserand principles in the new reference system,

h̃ = Σᵢ [(x̃ᵢ − m̃)×](ṽᵢ − m̃̇) = 0,    m̃̇ = 0.    (103)

Taking the mean on both sides of x̃ᵢ = Rxᵢ + d we obtain m̃ = Rm + d and thus m̃̇ = −R[ω×]m + Rṁ + ḋ. Therefore, applying m̃̇ = 0 we obtain ḋ = R[ω×]m − Rṁ = −R[m×]ω − Rṁ, as before, so that ṽᵢ = Rvᵢ + R[xᵢ×]ω + ḋ becomes ṽᵢ = R(vᵢ − ṁ) + R[(xᵢ − m)×]ω. Replacing x̃ᵢ − m̃ = R(xᵢ − m) and ṽᵢ − m̃̇ = R(vᵢ − ṁ) + R[(xᵢ − m)×]ω in h̃, we obtain

h̃ = R { Σᵢ [(xᵢ − m)×](vᵢ − ṁ) + Σᵢ [(xᵢ − m)×]²ω } = R(h − Cω).

Therefore applying h̃ = 0 gives ω = C⁻¹h as before.
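The reduction to rigid-motion-free velocities can be sketched numerically. The following Python fragment (synthetic stations and a hypothetical rotation vector) computes ω = C⁻¹h and the reduced velocities of (102); a purely rigid velocity field is removed entirely:

```python
import numpy as np

def cross_mat(a):
    """Matrix [a x] such that cross_mat(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def reduce_velocities(x, v):
    """omega = C^{-1} h; v'_i = (v_i - mdot) + [(x_i - m) x] omega  (eq. 102)."""
    m, mdot = x.mean(axis=0), v.mean(axis=0)
    dx = x - m
    h = sum(np.cross(dxi, dvi) for dxi, dvi in zip(dx, v - mdot))
    C = -sum(cross_mat(dxi) @ cross_mat(dxi) for dxi in dx)
    omega = np.linalg.solve(C, h)
    v_red = (v - mdot) + np.cross(dx, omega)   # [(x_i - m) x] omega, rowwise
    return omega, v_red

# Check: a purely rigid field v_i = omega x (x_i - m) + const vanishes.
rng = np.random.default_rng(0)
x = rng.normal(size=(12, 3))
omega_true = np.array([0.1, -0.2, 0.05])
v = np.cross(omega_true, x - x.mean(axis=0)) + np.array([1.0, 0.0, -2.0])
omega, v_red = reduce_velocities(x, v)
```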


6.2 Three Dimensional Displacements

In the case when we are interested in the deformation from an epoch t to an epoch t′, the data at hand are the corresponding station coordinates xᵢ and x′ᵢ, or the displacements δxᵢ = x′ᵢ − xᵢ. We seek to find the parameters θ, d of a rigid transformation x̃ᵢ = R(θ)xᵢ + d, which yields minimal displacements δx̃ᵢ = x′ᵢ − x̃ᵢ = x′ᵢ − Rxᵢ − d in the sense that

φ = Σᵢ δx̃ᵢᵀδx̃ᵢ = Σᵢ (x′ᵢ − Rxᵢ − d)ᵀ(x′ᵢ − Rxᵢ − d) = min.    (104)

To obtain the minimum we set ∂φ/∂d = 0 and ∂φ/∂θₖ = 0, k = 1, 2, 3, which provide the solution system

d = m′ − Rm    (105)

and Σᵢ (x′ᵢ − Rxᵢ − d)ᵀ(∂R/∂θₖ)xᵢ = 0, or, after replacing d = m′ − Rm,

Σᵢ [x′ᵢ − m′ − R(xᵢ − m)]ᵀ (∂R/∂θₖ) xᵢ = Σᵢ [x′ᵢ − m′ − R(xᵢ − m)]ᵀ (∂R/∂θₖ)(xᵢ − m) = 0.    (106)

Noting that the matrices [ωₖ×] = (∂R/∂θₖ)ᵀR are antisymmetric (just differentiate RᵀR = I to see this) with axial vectors ωₖ, we may use the resulting relations ∂R/∂θₖ = −R[ωₖ×] to express the last relation in the form

Σᵢ [x′ᵢ − m′ − R(xᵢ − m)]ᵀ R[ωₖ×](xᵢ − m) = 0,    k = 1, 2, 3.    (107)

Rearranging terms we arrive at ωₖᵀ Σᵢ [(xᵢ − m)×]Rᵀ(x′ᵢ − m′) = 0 for k = 1, 2, 3, or jointly, after setting Ω = [ω₁ ω₂ ω₃], Ωᵀ Σᵢ [(xᵢ − m)×]Rᵀ(x′ᵢ − m′) = 0. Under mild conditions it holds that det Ω ≠ 0, and thus

Σᵢ [(xᵢ − m)×]Rᵀ(x′ᵢ − m′) = 0.    (108)

This is a nonlinear system of three equations in the three unknowns θ = [θ₁ θ₂ θ₃]ᵀ, which can be solved by standard numerical techniques. Once θ and d = m′ − Rm are at hand, they can be used to compute the new minimized displacements, either in the new reference system as δx̃ᵢ = (x′ᵢ − m′) − R(xᵢ − m), or converted to the original one as

δx′ᵢ = Rᵀδx̃ᵢ = Rᵀ(x′ᵢ − m′) − (xᵢ − m).    (109)

It is possible to derive the same results by applying the Tisserand-like principles h̃ = 0 and m̃ = m′, where h̃ is defined in analogy with the angular momentum by simply replacing velocities with displacements,

h̃ = Σᵢ [(x̃ᵢ − m̃)×](δx̃ᵢ − δm̃) = Σᵢ [(x̃ᵢ − m̃)×]δx̃ᵢ = 0,    (110)

where δm̃ = m′ − m̃. Taking the mean of both sides of x̃ᵢ = Rxᵢ + d, we obtain m̃ = Rm + d, so that x̃ᵢ − m̃ = R(xᵢ − m), while the transformed displacements are δx̃ᵢ = x′ᵢ − x̃ᵢ = x′ᵢ − Rxᵢ − d. Applying m̃ = m′ we obtain m̃ = Rm + d = m′ and thus d = m′ − Rm, the same value as above. With this value the transformed displacements become δx̃ᵢ = x′ᵢ − m′ − R(xᵢ − m) and, replacing in the definition of h̃, we obtain h̃ = R Σᵢ [(xᵢ − m)×]Rᵀ(x′ᵢ − m′). Therefore application of h̃ = 0 leads to the same relation Σᵢ [(xᵢ − m)×]Rᵀ(x′ᵢ − m′) = 0 as above.
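The nonlinear system (108) can be attacked with standard numerical techniques. As one concrete route — not the one named in the text, but an equivalent solution of the same least-squares problem — the sketch below uses the closed-form SVD (orthogonal Procrustes/Kabsch) rotation fit and then verifies numerically that it satisfies the stationarity condition (108). The simulated motion and deformation values are hypothetical.

```python
import numpy as np

def fit_rotation(x, xp):
    """Rotation R minimizing sum_i ||(x'_i - m') - R(x_i - m)||^2
    (orthogonal Procrustes / Kabsch solution via SVD)."""
    a = x - x.mean(axis=0)          # x_i - m
    b = xp - xp.mean(axis=0)        # x'_i - m'
    H = b.T @ a                     # sum_i (x'_i - m')(x_i - m)^T
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])   # enforce det R = +1
    return U @ D @ Vt

rng = np.random.default_rng(2)
x = rng.normal(size=(8, 3))
th = 0.03
R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                   [np.sin(th),  np.cos(th), 0.0],
                   [0.0, 0.0, 1.0]])
F = np.eye(3) + 1e-3 * rng.normal(size=(3, 3))   # weak homogeneous deformation
xp = (x @ F.T) @ R_true.T + np.array([0.5, -0.2, 1.0])

R = fit_rotation(x, xp)
d = xp.mean(axis=0) - R @ x.mean(axis=0)         # eq. (105): d = m' - R m
# stationarity condition (108): sum_i [(x_i - m) x] R^T (x'_i - m') = 0
a, b = x - x.mean(axis=0), xp - xp.mean(axis=0)
g = sum(np.cross(ai, R.T @ bi) for ai, bi in zip(a, b))
```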

6.3 Plane Velocities

We turn to the study of horizontal deformation in planar approximation, where the data are station coordinates xᵢ = [xᵢ yᵢ]ᵀ and their velocities vᵢ = ẋᵢ = [uᵢ vᵢ]ᵀ. We seek to find the parameters θ, d of a rigid transformation x̃ᵢ = R(θ)xᵢ + d, which provides minimal velocities ṽᵢ = Ṙxᵢ + Rẋᵢ + ḋ, in the sense that they minimize the target function φ = Σᵢ ṽᵢᵀṽᵢ. Recall that Ṙ = θ̇RW = θ̇WR, where

W = R(90°) = [ 0  1 ; −1  0 ]    (111)

has the properties Wᵀ = −W, W² = −I and aᵀWa = 0 for any a. The velocities become

ṽᵢ = θ̇R(θ)Wxᵢ + R(θ)vᵢ + ḋ    (112)

and the target function takes the form

φ = Σᵢ ṽᵢᵀṽᵢ = Σᵢ (θ̇RWxᵢ + Rvᵢ + ḋ)ᵀ(θ̇RWxᵢ + Rvᵢ + ḋ) = φ(θ, θ̇, ḋ).    (113)

We will again obtain the desired minimum in two steps according to the scheme

φ̂ = min_{θ,θ̇,ḋ} φ = min_{θ,θ̇} ( min_ḋ φ ) = min_{θ,θ̇} φ̄,    φ̄ = min_ḋ φ.    (114)

For the first step we set ∂φ/∂ḋ = 0, which gives

ḋ = −θ̇R(θ)Wm − R(θ)ṁ,    (115)

where m = (1/n)Σᵢ xᵢ = [mₓ m_y]ᵀ and ṁ = (1/n)Σᵢ vᵢ = [m_u m_v]ᵀ. Replacing the above value of ḋ into φ, we obtain the partial minimum

φ̄ = Σᵢ [θ̇W(xᵢ − m) + (vᵢ − ṁ)]ᵀ[θ̇W(xᵢ − m) + (vᵢ − ṁ)],    (116)

which depends only on θ̇. Therefore, in the second step we need only to set ∂φ̄/∂θ̇ = 0 and solve the resulting equation for θ̇, to obtain the value

θ̇ = Σᵢ (xᵢ − m)ᵀW(vᵢ − ṁ) / Σᵢ (xᵢ − m)ᵀ(xᵢ − m)
  = Σᵢ [(xᵢ − mₓ)(vᵢ − m_v) − (yᵢ − m_y)(uᵢ − m_u)] / Σᵢ [(xᵢ − mₓ)² + (yᵢ − m_y)²].    (117)

With this value of θ̇ we have θ(t) = θ₀ + (t − t₀)θ̇, where θ₀ = θ(t₀) is an arbitrary integration constant. The translations satisfying ḋ = −θ̇R(θ)Wm − R(θ)ṁ are

d = −R(θ)m + c,    (118)

with c being an arbitrary constant. We just need to compute the minimized velocities for the above values, which take the form ṽᵢ = θ̇R(θ)W(xᵢ − m) + R(θ)(vᵢ − ṁ) in the new reference system. They can be expressed in the original reference system as

v′ᵢ = R(−θ)ṽᵢ = θ̇W(xᵢ − m) + (vᵢ − ṁ).    (119)

The internal deformation analysis is then conducted based on the above velocity values.

For an interpretation and connection with the application of Tisserand principles, we denote by a∼ = [a₁ a₂ 0]ᵀ the natural three-dimensional extension of any planar vector a = [a₁ a₂]ᵀ. Then we may define the angular momentum vector as

h∼ = Σᵢ [(x∼ᵢ − m∼)×](v∼ᵢ − ṁ∼) = [0  0  h]ᵀ,    (120)

where the angular momentum modulus is

h = Σᵢ (xᵢ − mₓ)(vᵢ − m_v) − Σᵢ (yᵢ − m_y)(uᵢ − m_u) = Σᵢ (xᵢ − m)ᵀW(vᵢ − ṁ).    (121)

Computing also the inertia matrix C = −Σᵢ [(x∼ᵢ − m∼)×]², we note the value of its element

C₃₃ = Σᵢ [(yᵢ − m_y)² + (xᵢ − mₓ)²].    (122)

Thus the derivative value has the symbolic form θ̇ = C₃₃⁻¹h, in analogy with the three-dimensional case where ω = C⁻¹h.

Let us examine the solution obtained by imposing the Tisserand conditions h̃∼ = 0 (degenerating into the scalar condition h̃ = 0) and m̃̇ = 0. By taking means on both sides of x̃ᵢ = Rxᵢ + d, it follows that m̃ = Rm + d and hence m̃̇ = θ̇RWm + Rṁ + ḋ, so that the condition m̃̇ = 0 gives ḋ = −θ̇RWm − Rṁ, the same as before. With this value the transformed velocity ṽᵢ = Ṙxᵢ + Rvᵢ + ḋ becomes

ṽᵢ = θ̇RW(xᵢ − m) + R(vᵢ − ṁ).

Noting that m̃ = Rm + d and m̃̇ = 0, we get x̃ᵢ − m̃ = R(xᵢ − m) and ṽᵢ − m̃̇ = ṽᵢ = θ̇RW(xᵢ − m) + R(vᵢ − ṁ). With these values, and recalling that WR = RW, WRᵀ = RᵀW, W² = −I, the condition h̃ = 0 becomes

h̃ = Σᵢ (x̃ᵢ − m̃)ᵀW(ṽᵢ − m̃̇) = Σᵢ (xᵢ − m)ᵀRᵀW[θ̇RW(xᵢ − m) + R(vᵢ − ṁ)]
  = −θ̇ Σᵢ (xᵢ − m)ᵀ(xᵢ − m) + Σᵢ (xᵢ − m)ᵀW(vᵢ − ṁ) = 0,    (123)

which, solved for θ̇, gives θ̇ = Σᵢ (xᵢ − m)ᵀW(vᵢ − ṁ) / Σᵢ (xᵢ − m)ᵀ(xᵢ − m), the same as before.
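Formula (117) and the reduced velocities (119) can be checked numerically. The sketch below (Python, hypothetical station positions and rotation rate) verifies that a purely rigid planar velocity field is removed entirely:

```python
import numpy as np

W = np.array([[0.0, 1.0], [-1.0, 0.0]])      # W = R(90 deg), eq. (111)

def rotation_rate(x, v):
    """theta_dot of eq. (117): rigid rotation-rate trend of a planar field."""
    dx, dv = x - x.mean(axis=0), v - v.mean(axis=0)
    # sum (x_i-m)^T W (v_i-mdot) / sum (x_i-m)^T (x_i-m)
    return np.sum(dx * (dv @ W.T)) / np.sum(dx * dx)

# A purely rigid planar field v_i = mdot - theta_dot W (x_i - m).
rng = np.random.default_rng(5)
x = rng.uniform(-100, 100, (20, 2))
td_true, mdot = 0.3, np.array([2.0, -1.0])
dx = x - x.mean(axis=0)
v = mdot - td_true * (dx @ W.T)              # rowwise -theta_dot W (x_i - m)
td = rotation_rate(x, v)
v_red = td * (dx @ W.T) + (v - v.mean(axis=0))   # reduced velocities, eq. (119)
```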


6.4 Plane Displacements

In the case when we are interested in planar deformation from an epoch t to an epoch t′, the data at hand are the corresponding horizontal station coordinates xᵢ and x′ᵢ, or the displacements δxᵢ = x′ᵢ − xᵢ. We seek to find the transformation parameters θ, d of a rigid transformation x̃ᵢ = R(θ)xᵢ + d, which yields minimal displacements δx̃ᵢ = x′ᵢ − x̃ᵢ = x′ᵢ − Rxᵢ − d in the sense that

φ = Σᵢ δx̃ᵢᵀδx̃ᵢ = Σᵢ (x′ᵢ − Rxᵢ − d)ᵀ(x′ᵢ − Rxᵢ − d) = min.    (124)

Setting ∂φ/∂d = 0 and ∂φ/∂θ = 0 we obtain the solution system

d = m′ − R(θ)m,    (125)

Σᵢ (x′ᵢ − Rxᵢ − d)ᵀ (∂R/∂θ) xᵢ = 0.    (126)

The last, with d = m′ − R(θ)m and ∂R/∂θ = RW, takes the form

Σᵢ (x′ᵢ − m′)ᵀRWxᵢ − Σᵢ (xᵢ − m)ᵀWxᵢ = Σᵢ (x′ᵢ − m′)ᵀRW(xᵢ − m) − Σᵢ (xᵢ − m)ᵀW(xᵢ − m)
  = Σᵢ (x′ᵢ − m′)ᵀRW(xᵢ − m) = 0,    (127)

in view of the property aᵀWa = 0. Replacing all vectors and matrices with their explicit values, we arrive at the explicit solution

tan θ = − Σᵢ [(xᵢ − mₓ)(y′ᵢ − m′_y) − (yᵢ − m_y)(x′ᵢ − m′ₓ)] / Σᵢ [(xᵢ − mₓ)(x′ᵢ − m′ₓ) + (yᵢ − m_y)(y′ᵢ − m′_y)]
      = − Σᵢ (xᵢ − m)ᵀW(x′ᵢ − m′) / Σᵢ (xᵢ − m)ᵀ(x′ᵢ − m′).    (128)

In terms of the displacements δxᵢ = x′ᵢ − xᵢ and their means, the above relation takes the form

tan θ = − Σᵢ (xᵢ − m)ᵀWδxᵢ / [ Σᵢ (xᵢ − m)ᵀ(xᵢ − m) + Σᵢ (xᵢ − m)ᵀδxᵢ ]
      = − Σᵢ [(xᵢ − mₓ)δyᵢ − (yᵢ − m_y)δxᵢ] / Σᵢ [(xᵢ − mₓ)² + (yᵢ − m_y)² + (xᵢ − mₓ)δxᵢ + (yᵢ − m_y)δyᵢ].    (129)

A Tisserand-like interpretation is possible if we define an analog to the angular momentum by replacing velocities with displacements, as h = Σᵢ (xᵢ − m)ᵀWδxᵢ. The conditions imposed in terms of the new reference system are h̃ = 0 and the barycenter preservation m̃ = m′. Taking the mean on both sides of x̃ᵢ = Rxᵢ + d, we obtain m̃ = Rm + d and x̃ᵢ − m̃ = R(xᵢ − m). The displacements in the new system are δx̃ᵢ = x′ᵢ − x̃ᵢ = x′ᵢ − Rxᵢ − d. Applying m̃ = m′ we obtain d = m′ − Rm, which turns the displacements into δx̃ᵢ = (x′ᵢ − m′) − R(xᵢ − m). With these values, and taking into account the basic properties of W, the condition h̃ = 0 takes the form

h̃ = Σᵢ (x̃ᵢ − m̃)ᵀWδx̃ᵢ = − Σᵢ (x′ᵢ − m′)ᵀWR(xᵢ − m) = 0,    (130)

which is the same equation to be solved for tan θ as before.
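As a numerical check of (125) and (128), the sketch below (Python, hypothetical angle and translation) recovers a known rigid planar motion from two-epoch coordinates:

```python
import numpy as np

W = np.array([[0.0, 1.0], [-1.0, 0.0]])      # W = R(90 deg)

def fit_angle(x, xp):
    """Optimal rotation angle between two planar epochs, eq. (128):
    tan(theta) = -sum (x_i-m)^T W (x'_i-m') / sum (x_i-m)^T (x'_i-m')."""
    dx, dxp = x - x.mean(axis=0), xp - xp.mean(axis=0)
    return np.arctan2(-np.sum(dx * (dxp @ W.T)), np.sum(dx * dxp))

def rot(t):
    """R(theta) in the convention with W = R(90 deg)."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, s], [-s, c]])

rng = np.random.default_rng(6)
x = rng.uniform(-50, 50, (15, 2))
theta_true, d_true = 0.1, np.array([3.0, -4.0])
xp = x @ rot(theta_true).T + d_true          # x'_i = R x_i + d, rowwise
theta = fit_angle(x, xp)
d = xp.mean(axis=0) - rot(theta) @ x.mean(axis=0)   # eq. (125): d = m' - R m
```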

6.5 Velocities on the Sphere

Our last application relates to horizontal deformation analysis carried out on the surface of the reference ellipsoid, which is sufficiently approximated by a sphere. Given the station coordinates xᵢ on the surface of the sphere and their horizontal velocities vᵢ tangent to the surface of the sphere, we seek the transformation parameters θ of a rigid rotation x̃ᵢ = R(θ)xᵢ, such that the velocities ṽᵢ = Ṙ(θ)xᵢ + R(θ)vᵢ in the new reference system are minimized according to the target function φ = Σᵢ ṽᵢᵀṽᵢ = min. Noting that the matrix [ω×] = ṘᵀR is antisymmetric (just differentiate RᵀR = I), with its axial vector ω being the corresponding rotation vector in the original reference system, it follows that Ṙ = −R[ω×], and the transformed velocities become

ṽᵢ = R(θ)(vᵢ + [xᵢ×]ω).    (131)

Thus the target function becomes

φ = Σᵢ (vᵢ + [xᵢ×]ω)ᵀ(vᵢ + [xᵢ×]ω) = min_ω.    (132)

The minimum is obtained by setting ∂φ/∂ω = 0, which gives

Σᵢ [xᵢ×]vᵢ + ( Σᵢ [xᵢ×]² ) ω = 0.    (133)

Recognizing

h = Σᵢ [xᵢ×]vᵢ,    C = −Σᵢ [xᵢ×]²    (134)

as the angular momentum and the inertia matrix, respectively, the relation (133) becomes h − Cω = 0 and the optimal value of the rotation vector is

ω = C⁻¹h.    (135)

We may arrive at the same solution by applying the Tisserand principle h̃ = Σᵢ [x̃ᵢ×]ṽᵢ = 0. Replacing x̃ᵢ = R(θ)xᵢ and ṽᵢ = R(vᵢ + [xᵢ×]ω), we obtain h̃ = R Σᵢ [xᵢ×](vᵢ + [xᵢ×]ω), so that h̃ = 0 leads to the same relation Σᵢ [xᵢ×](vᵢ + [xᵢ×]ω) = 0 as before.

The transformation parameters can be determined by solving the system of differential equations ṘᵀR = [ω×] = [(C⁻¹h)×]. Fortunately, we do not need to do this. All we are interested in are the transformed velocities ṽᵢ = R(θ)(vᵢ + [xᵢ×]ω), computed with ω = C⁻¹h, and more precisely their components v′ᵢ in the original system, obtained from v′ᵢ = R(θ)ᵀṽᵢ:

v′ᵢ = vᵢ + [xᵢ×]ω = vᵢ + [xᵢ×]C⁻¹h.    (136)

The deformation analysis can now be performed on the reference ellipsoid using the above reduced velocities v′ᵢ, after a rigid rotation of the whole tectonic unit under study has been filtered out. The original velocities vᵢ have thus been decomposed into two parts,

vᵢ = [ω×]xᵢ + v′ᵢ ≡ v_rot,ᵢ + v′ᵢ.    (137)

The first part is the rotational part v_rot,ᵢ = [ω×]xᵢ, due to the rotation of the whole area around the direction of ω (usually referred to as the Euler pole) with an angular velocity equal to the modulus ω = √(ωᵀω) of ω. The second part consists of the reduced velocities v′ᵢ, which account for the internal deformation of the tectonic unit.

A constant Euler pole and angular velocity is usually assumed. This is, however, incompatible with the usual linear-in-time (constant velocity) coordinate variation model xᵢ(t) = xᵢ₀ + (t − t₀)vᵢ, which leads to a constant angular momentum

h(t) = Σᵢ [xᵢ(t)×]vᵢ = Σᵢ [xᵢ₀×]vᵢ + (t − t₀) Σᵢ [vᵢ×]vᵢ = Σᵢ [xᵢ₀×]vᵢ = h,    (138)

but a time-variable inertia matrix

C(t) = −Σᵢ [xᵢ(t)×]² = −Σᵢ [xᵢ₀×]² − (t − t₀) Σᵢ ([xᵢ₀×][vᵢ×] + [vᵢ×][xᵢ₀×]) − (t − t₀)² Σᵢ [vᵢ×]²
     ≈ −Σᵢ [xᵢ₀×]² − (t − t₀) Σᵢ ([xᵢ₀×][vᵢ×] + [vᵢ×][xᵢ₀×]) ≡ C₀ + (t − t₀)Ċ,    (139)

and ω(t) ≈ [C₀ + (t − t₀)Ċ]⁻¹h is time dependent. Thus the rotation of the studied area takes place around a migrating Euler pole and with a variable angular velocity. Since Ċ is a small matrix, we may use the first-order approximation

[C₀ + (t − t₀)Ċ]⁻¹ ≈ C₀⁻¹ − (t − t₀) C₀⁻¹ĊC₀⁻¹

to obtain the time-varying rotation vector

ω(t) ≈ C₀⁻¹h − (t − t₀) C₀⁻¹ĊC₀⁻¹h ≡ ω₀ + (t − t₀)ω̇,    (140)

where ω₀ = C₀⁻¹h and ω̇ = −C₀⁻¹Ċω₀. The time-varying angular velocity is given by

ω(t) ≈ √(ω₀ᵀω₀) + (t − t₀) ω₀ᵀω̇ / ω₀ ≡ ω₀ + (t − t₀)ω̇,    (141)

where the scalar rate ω̇ = ω₀⁻¹ω₀ᵀω̇ is formed with the rotation vector rate ω̇ of (140). Note however that both the pole migration and the angular velocity variation are very small in magnitude.
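The Euler vector estimate ω = C⁻¹h of (134)-(135) and the decomposition (136)-(137) can be sketched as follows (Python, hypothetical stations on the unit sphere; a purely rigid plate rotation is recovered exactly and its reduced velocities vanish):

```python
import numpy as np

def cross_mat(a):
    """Matrix [a x] with cross_mat(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def euler_vector(x, v):
    """omega = C^{-1} h, with h = sum [x_i x] v_i and C = -sum [x_i x]^2."""
    h = sum(np.cross(xi, vi) for xi, vi in zip(x, v))
    C = -sum(cross_mat(xi) @ cross_mat(xi) for xi in x)
    return np.linalg.solve(C, h)

# Stations on the unit sphere with a rigid plate rotation v = omega x x.
rng = np.random.default_rng(3)
x = rng.normal(size=(15, 3))
x /= np.linalg.norm(x, axis=1, keepdims=True)
omega_true = np.array([0.02, -0.01, 0.03])
v = np.cross(omega_true, x)              # tangent to the sphere by construction
omega = euler_vector(x, v)
v_red = v - np.cross(omega, x)           # reduced velocities, eqs. (136)-(137)
```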

7 Interpolation of Displacements or Velocities

Once the study area has been separated into independent tectonic clusters, which move with respect to each other, and the rigid motion trend has been removed, the next step is the interpolation of the reduced discrete displacements or velocities, accordingly. The interpolation provides the continuous field which, when differentiated, allows the computation of the elements of the deformation gradient, or of the velocity gradient matrix, which in turn lead to the computation of invariant strain, or strain rate, parameters, respectively.


Among the various possible interpolation methods, a particularly popular one is a form of piecewise interpolation, where the deformation or velocity gradient matrix is assumed to be constant within disjoint triangular elements. It is the first interpolation method to have been historically applied, in the 1920s in Japan [29]. Thanks to its simplicity it is still widely used, despite its shortcomings and the development of more sophisticated approaches.

The finite element approach using triangular elements starts with the separation of the area into non-overlapping triangles having as vertices the stations of a geodetic network. At that time the formation of triangles was done "by hand", but currently an optimal formation of triangles having minimal areas can be attained through a so-called Delaunay triangulation [see e.g. 19]. We will only present the interpolation of displacements, since the interpolation of velocities is completely analogous.

From the coordinates xᵢ, x′ᵢ of each network station i at two different epochs t, t′, the derived discrete displacements Δxᵢ = x′ᵢ − xᵢ = Δx(xᵢ) must be interpolated to derive a continuous displacement field Δx(x). Under the assumption of linear displacements Δx(x) = Ax + b within each triangle, we need to find the parameters A and b from the vertex conditions Δxᵢ = Δx(xᵢ) = Axᵢ + b, i = 1, 2, 3. Since we are really interested only in the constant displacement gradient matrix J = ∂Δx/∂x = A, the vector b can be eliminated by taking differences Δx₂ − Δx₁ = A(x₂ − x₁), Δx₃ − Δx₁ = A(x₃ − x₁) to obtain the system

[ Δx₂ − Δx₁   Δx₃ − Δx₁ ] = A [ x₂ − x₁   x₃ − x₁ ]    (142)

with solution

J = A = [ Δx₂ − Δx₁   Δx₃ − Δx₁ ] [ x₂ − x₁   x₃ − x₁ ]⁻¹.    (143)

Using the notation xᵢ = [xᵢ yᵢ]ᵀ, x′ᵢ = [x′ᵢ y′ᵢ]ᵀ, Δxᵢ = [uᵢ vᵢ]ᵀ, the elements of J are explicitly given by

J₁₁ = [(u₂ − u₁)(y₃ − y₁) − (u₃ − u₁)(y₂ − y₁)] / D,    (144)

J₁₂ = [−(u₂ − u₁)(x₃ − x₁) + (u₃ − u₁)(x₂ − x₁)] / D,    (145)

J₂₁ = [(v₂ − v₁)(y₃ − y₁) − (v₃ − v₁)(y₂ − y₁)] / D,    (146)

J₂₂ = [−(v₂ − v₁)(x₃ − x₁) + (v₃ − v₁)(x₂ − x₁)] / D,    (147)

where

D = (x₂ − x₁)(y₃ − y₁) − (x₃ − x₁)(y₂ − y₁).    (148)
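A minimal sketch of the triangle computation (Python, with a hypothetical triangle and a homogeneous displacement field, which the finite element interpolation must reproduce exactly):

```python
import numpy as np

def displacement_gradient(x, dx):
    """J = [Dx2-Dx1, Dx3-Dx1] [x2-x1, x3-x1]^{-1} for one triangle (eq. 143).
    x, dx: 3x2 arrays of vertex coordinates and vertex displacements."""
    B = np.column_stack([x[1] - x[0], x[2] - x[0]])
    U = np.column_stack([dx[1] - dx[0], dx[2] - dx[0]])
    return U @ np.linalg.inv(B)

# A homogeneous (linear) field Dx(x) = A x + b is recovered exactly.
A = np.array([[1e-4, 3e-5], [-2e-5, 5e-5]])   # hypothetical strain-level values
b = np.array([0.01, -0.02])
x = np.array([[0.0, 0.0], [1000.0, 100.0], [300.0, 900.0]])   # vertices [m]
dx = x @ A.T + b
J = displacement_gradient(x, dx)
F = np.eye(2) + J        # deformation gradient F = I + J
```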


From the displacement gradient J the deformation gradient F = I + J can be computed, which is the basis for computing the various invariant deformation parameters λ₁, λ₂, θ, θ′, e_max, e_min, ω, Δ, γ, ϕ, ψ, as explained in Sect. 3. These have constant values within each triangle and are discontinuous at the triangle sides. The interpolation of velocities with the finite element method is completely analogous: the elements L_jk of the velocity gradient matrix L can be computed from the above formulas by simply replacing the displacements Δxᵢ = [uᵢ vᵢ]ᵀ with the velocity components vᵢ = [uᵢ vᵢ]ᵀ. From L the various strain rate parameters are computed.

In addition to the piecewise linear character of the deformation parameters, a serious disadvantage of the method is the unrealistic computation of the variances and covariances of the deformation parameters. This is due to the fact that error propagation takes into consideration the uncertainties of the available values of xᵢ, x′ᵢ (or xᵢ, vᵢ) but ignores completely the uncertainty of the interpolation itself. Thus the computed signal-to-noise ratio is too optimistic, and the derived parameters may in fact be meaningless in view of a real, very small signal-to-noise ratio.

Covariance propagation from the coordinates to the elements of the displacement gradient matrix J can be sufficiently performed through the usual linear approximation procedure, utilizing the derivatives of the elements J_jk with respect to the input coordinates xᵢ, yᵢ, x′ᵢ, y′ᵢ, i = 1, …, n, with known covariance matrix. On the contrary, the error propagation from the elements of J (or L) to the desired strain (or strain rate) parameters is problematic, since the latter are strongly nonlinear functions of the former. This is a common problem, independent of the employed interpolation method.
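The nonlinearity problem just described is commonly handled by sampling. As a minimal Monte Carlo sketch (Python; the triangle, displacements, and coordinate standard deviation are hypothetical, and tr(J) is used as a simple dilatation-type invariant):

```python
import numpy as np

def displacement_gradient(x, dx):
    """Constant J of one triangle, as in eq. (143)."""
    B = np.column_stack([x[1] - x[0], x[2] - x[0]])
    U = np.column_stack([dx[1] - dx[0], dx[2] - dx[0]])
    return U @ np.linalg.inv(B)

rng = np.random.default_rng(4)
x = np.array([[0.0, 0.0], [1000.0, 0.0], [400.0, 800.0]])        # vertices [m]
dx = np.array([[0.010, 0.002], [0.013, 0.001], [0.011, 0.006]])  # displacements [m]
sigma = 0.002     # assumed coordinate standard deviation [m] (hypothetical)

samples = []
for _ in range(2000):
    # independent noise for each epoch's coordinates; the displacement
    # x' - x picks up the noise of both epochs
    noise = rng.normal(0.0, sigma, size=(2, 3, 2))
    J = displacement_gradient(x + noise[0], dx + noise[1] - noise[0])
    samples.append(J[0, 0] + J[1, 1])        # invariant tr(J)
samples = np.array(samples)
mean, std = samples.mean(), samples.std()
```

The sample standard deviation gives an error estimate for the invariant that honestly reflects the nonlinear propagation, provided the assumed error distribution of the coordinates is realistic.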
Error propagation can be realized more effectively with Monte Carlo techniques, which however demand complete knowledge of the probability density function of the starting parameters, and not only their mean and covariance matrix as in the linear case.

A particular class of interpolations of a vector field Δx(x) employs interpolating functions that are linear combinations of a known set of base functions $\varphi_k(x)$, k = 1, 2, …, m, so that

$$\Delta x(x)=\begin{bmatrix}u(x)\\ v(x)\end{bmatrix}=\sum_{k=1}^{m}a_k\varphi_k(x)=\begin{bmatrix}\sum_{k=1}^{m}a_{u,k}\varphi_k(x)\\[4pt] \sum_{k=1}^{m}a_{v,k}\varphi_k(x)\end{bmatrix}=\begin{bmatrix}\varphi(x)^{\mathsf T}a_u\\ \varphi(x)^{\mathsf T}a_v\end{bmatrix},\tag{149}$$

where the vectors $\varphi(x)$, $a_u$, $a_v$ have elements $\varphi(x)_k=\varphi_k(x)$, $(a_u)_k=a_{u,k}$, $(a_v)_k=a_{v,k}$. The unknown coefficients $a_k$ can be determined from the available discrete values $\Delta x_i=\Delta x(x_i)=\sum_{k=1}^{m}a_k\varphi_k(x_i)$. A unique solution is possible only when n = m. When n > m (more data than unknowns) there exists no exact solution and

12 Geodetic Methods for Monitoring Crustal Motion and Deformation

665

exact interpolation is replaced by a smoothing interpolation with residuals $e_i=\Delta x_i-\sum_{k=1}^{m}a_k\varphi_k(x_i)$. An optimal least squares interpolation follows by minimizing the weighted sum of squares of the residuals, using as weight matrix the inverse of the covariance matrix of the data $\Delta x_i$, computed from the preceding analysis in which the coordinates $x_i$, $x_i'$ are estimated on the basis of the available geodetic observations.

More interesting is the case m > n (more unknowns than data), which has infinitely many solutions. A unique solution can be obtained by minimizing the weighted norm of the unknowns $a=\begin{bmatrix}a_u\\ a_v\end{bmatrix}$, i.e. by requiring that $\|a\|_R^2=a^{\mathsf T}Ra=\min$ for an appropriate positive-definite symmetric matrix R. The observation equations involved are

$$u=\begin{bmatrix}u(x_1)\\ \vdots\\ u(x_n)\end{bmatrix}=\begin{bmatrix}\varphi(x_1)^{\mathsf T}a_u\\ \vdots\\ \varphi(x_n)^{\mathsf T}a_u\end{bmatrix}=\Phi a_u,\qquad v=\begin{bmatrix}v(x_1)\\ \vdots\\ v(x_n)\end{bmatrix}=\begin{bmatrix}\varphi(x_1)^{\mathsf T}a_v\\ \vdots\\ \varphi(x_n)^{\mathsf T}a_v\end{bmatrix}=\Phi a_v,\tag{150}$$

where $\Phi_{ik}=\varphi_k(x_i)$, and jointly they form the system of observation equations

$$\Delta x=\begin{bmatrix}u\\ v\end{bmatrix}=\begin{bmatrix}\Phi&0\\ 0&\Phi\end{bmatrix}\begin{bmatrix}a_u\\ a_v\end{bmatrix}=\bar\Phi\,a.\tag{151}$$

The well-known unique solution satisfying $a^{\mathsf T}Ra=\min$ is

$$\hat a=R^{-1}\bar\Phi^{\mathsf T}\bigl(\bar\Phi R^{-1}\bar\Phi^{\mathsf T}\bigr)^{-1}\Delta x.\tag{152}$$
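The minimum-norm solution (152) can be sketched for a single component; the Gaussian base functions, node layout and data values are illustrative assumptions, and R is simply taken as the identity.

```python
import numpy as np

# Minimum-norm interpolation, Eq. (152), for one displacement component.
# Gaussian base functions and the data values are illustrative assumptions.
nodes = np.linspace(0.0, 10.0, 25)                 # m = 25 basis centres

def phi(x):
    """Row vector of base function values phi_k(x)."""
    return np.exp(-0.5 * ((x - nodes) / 1.5) ** 2)

xi = np.array([1.0, 3.0, 4.5, 7.0, 9.0])           # n = 5 < m data points
u = np.sin(xi)                                     # "observed" values
Phi = np.vstack([phi(x) for x in xi])              # n x m design matrix
R = np.eye(len(nodes))                             # weight matrix (identity here)

# a_hat = R^{-1} Phi^T (Phi R^{-1} Phi^T)^{-1} u
Rinv = np.linalg.inv(R)
a_hat = Rinv @ Phi.T @ np.linalg.solve(Phi @ Rinv @ Phi.T, u)
```

Because m > n, the solution reproduces the data exactly while minimizing $a^{\mathsf T}Ra$ among all exact solutions.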

Leaving aside for the moment the problem of the optimal choice of R, we look into the particular case where the unknown coefficients $a_k=[a_{u,k}\ a_{v,k}]^{\mathsf T}$ are independently weighted for each base function $\varphi_k$, with a weight matrix $R_k=\begin{bmatrix}R_{u,k}&R_{uv,k}\\ R_{uv,k}&R_{v,k}\end{bmatrix}$. In this case the quantity to be minimized takes the form

$$\phi=\sum_{k=1}^{m}a_k^{\mathsf T}R_ka_k=\begin{bmatrix}a_u^{\mathsf T}&a_v^{\mathsf T}\end{bmatrix}\begin{bmatrix}R_u&R_{uv}\\ R_{uv}^{\mathsf T}&R_v\end{bmatrix}\begin{bmatrix}a_u\\ a_v\end{bmatrix}=a^{\mathsf T}Ra,\tag{153}$$

where $R_u$, $R_v$, $R_{uv}$ are diagonal matrices with diagonal elements $R_{u,k}$, $R_{v,k}$, $R_{uv,k}$, respectively. Setting $S_k=\begin{bmatrix}S_{u,k}&S_{uv,k}\\ S_{uv,k}&S_{v,k}\end{bmatrix}=R_k^{-1}$, it is easy to prove that $R^{-1}=S=\begin{bmatrix}S_u&S_{uv}\\ S_{uv}^{\mathsf T}&S_v\end{bmatrix}$, where $S_u$, $S_v$, $S_{uv}$ are diagonal matrices with diagonal elements $S_{u,k}$, $S_{v,k}$, $S_{uv,k}$, respectively. Therefore the minimum norm solution (152) becomes

$$\hat a=\begin{bmatrix}\hat a_u\\ \hat a_v\end{bmatrix}=S\bar\Phi^{\mathsf T}\bigl(\bar\Phi S\bar\Phi^{\mathsf T}\bigr)^{-1}\Delta x=\begin{bmatrix}S_u\Phi^{\mathsf T}&S_{uv}\Phi^{\mathsf T}\\ S_{uv}^{\mathsf T}\Phi^{\mathsf T}&S_v\Phi^{\mathsf T}\end{bmatrix}\begin{bmatrix}\Phi S_u\Phi^{\mathsf T}&\Phi S_{uv}\Phi^{\mathsf T}\\ \Phi S_{uv}^{\mathsf T}\Phi^{\mathsf T}&\Phi S_v\Phi^{\mathsf T}\end{bmatrix}^{-1}\begin{bmatrix}u\\ v\end{bmatrix}.\tag{154}$$

The corresponding interpolated functions $\hat u(x)=\varphi(x)^{\mathsf T}\hat a_u$ and $\hat v(x)=\varphi(x)^{\mathsf T}\hat a_v$ become

$$\begin{bmatrix}\hat u(x)\\ \hat v(x)\end{bmatrix}=\begin{bmatrix}\varphi(x)^{\mathsf T}S_u\Phi^{\mathsf T}&\varphi(x)^{\mathsf T}S_{uv}\Phi^{\mathsf T}\\ \varphi(x)^{\mathsf T}S_{uv}^{\mathsf T}\Phi^{\mathsf T}&\varphi(x)^{\mathsf T}S_v\Phi^{\mathsf T}\end{bmatrix}\begin{bmatrix}K_u&K_{uv}\\ K_{uv}^{\mathsf T}&K_v\end{bmatrix}^{-1}\begin{bmatrix}u\\ v\end{bmatrix}\equiv\begin{bmatrix}k_u^{\mathsf T}&k_{uv}^{\mathsf T}\\ k_{uv}^{\mathsf T}&k_v^{\mathsf T}\end{bmatrix}\begin{bmatrix}K_u&K_{uv}\\ K_{uv}^{\mathsf T}&K_v\end{bmatrix}^{-1}\begin{bmatrix}u\\ v\end{bmatrix},\tag{155}$$

where we have set $K_u=\Phi S_u\Phi^{\mathsf T}$, $K_{uv}=\Phi S_{uv}\Phi^{\mathsf T}$, $K_v=\Phi S_v\Phi^{\mathsf T}$, $k_u=\Phi S_u\varphi(x)$, $k_{uv}=\Phi S_{uv}\varphi(x)$, $k_v=\Phi S_v\varphi(x)$. Recalling that $\Phi_{ik}=\varphi_k(x_i)$ and $\varphi(x)_k=\varphi_k(x)$, the elements of the above matrices become

$$(K_u)_{ij}=\sum_{k=1}^{m}S_{u,k}\varphi_k(x_i)\varphi_k(x_j),\quad (K_{uv})_{ij}=\sum_{k=1}^{m}S_{uv,k}\varphi_k(x_i)\varphi_k(x_j),\quad (K_v)_{ij}=\sum_{k=1}^{m}S_{v,k}\varphi_k(x_i)\varphi_k(x_j),\tag{156}$$

$$(k_u)_i=\sum_{k=1}^{m}S_{u,k}\varphi_k(x_i)\varphi_k(x),\quad (k_{uv})_i=\sum_{k=1}^{m}S_{uv,k}\varphi_k(x_i)\varphi_k(x),\quad (k_v)_i=\sum_{k=1}^{m}S_{v,k}\varphi_k(x_i)\varphi_k(x).\tag{157}$$

If we introduce the two-point functions

$$k_u(x,x')=\sum_{k=1}^{m}S_{u,k}\varphi_k(x)\varphi_k(x'),\quad k_{uv}(x,x')=\sum_{k=1}^{m}S_{uv,k}\varphi_k(x)\varphi_k(x'),\quad k_v(x,x')=\sum_{k=1}^{m}S_{v,k}\varphi_k(x)\varphi_k(x'),\tag{158}$$

then the elements of the matrices appearing in the interpolation can be expressed as

$$(K_u)_{ij}=k_u(x_i,x_j),\qquad (K_{uv})_{ij}=k_{uv}(x_i,x_j),\qquad (K_v)_{ij}=k_v(x_i,x_j),\tag{159}$$

$$(k_u)_i=k_u(x,x_i),\qquad (k_{uv})_i=k_{uv}(x,x_i),\qquad (k_v)_i=k_v(x,x_i).\tag{160}$$
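The kernel form of Eqs. (156)-(160) can be sketched for one component; the cross terms $S_{uv}$ are omitted purely for brevity, and base functions, weights and data are assumptions.

```python
import numpy as np

# Kernel (two-point function) form of the interpolation, Eqs. (156)-(160),
# for one component; base functions, weights and data are assumptions.
nodes = np.linspace(0.0, 10.0, 30)
S_u = np.exp(-0.3 * np.arange(len(nodes)))         # positive weights S_u,k

def phi(x):
    return np.exp(-0.5 * ((x - nodes) / 1.2) ** 2)

def k_u(x, xp):
    """Two-point function k_u(x, x') = sum_k S_u,k phi_k(x) phi_k(x'), Eq. (158)."""
    return float(np.sum(S_u * phi(x) * phi(xp)))

xi = np.array([0.5, 2.0, 5.0, 6.5, 8.0])
u = np.cos(xi)
K_u = np.array([[k_u(a, b) for b in xi] for a in xi])   # Eq. (159)

def u_hat(x):
    ku = np.array([k_u(x, b) for b in xi])              # Eq. (160)
    return ku @ np.linalg.solve(K_u, u)
```

At the data points the kernel interpolant reproduces the data, since $k_u(x_i,\cdot)$ then coincides with a row of $K_u$.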

We turn next to the choice of the matrix R, and in our particular case the choice of the matrices $R_k$. A well-known choice that leads to best (= minimum mean square error) linear uniformly unbiased estimates (BLUUE, or simply BLUE) of parameters in linear models is to use as weight matrix the inverse of the covariance matrix of the minimized quantities. This is possible only if we consider the interpolation coefficients $a_k=[a_{u,k}\ a_{v,k}]^{\mathsf T}$ as random variables with zero means $E\{a_k\}=0$, which are uncorrelated for different k and thus have covariance matrices

$$E\{a_ka_j^{\mathsf T}\}=\begin{bmatrix}E\{a_{u,k}a_{u,j}\}&E\{a_{u,k}a_{v,j}\}\\ E\{a_{v,k}a_{u,j}\}&E\{a_{v,k}a_{v,j}\}\end{bmatrix}=\delta_{kj}\begin{bmatrix}\sigma_{u,k}^2&\sigma_{uv,k}\\ \sigma_{uv,k}&\sigma_{v,k}^2\end{bmatrix}=\delta_{kj}\Sigma_k.\tag{161}$$

With these values the covariance and cross-covariance matrices of the resulting zero mean stochastic processes (random fields) $u(x)=\sum_{k=1}^{m}a_{u,k}\varphi_k(x)$, $v(x)=\sum_{k=1}^{m}a_{v,k}\varphi_k(x)$ become

$$C_{\Delta x}(x,x')=\begin{bmatrix}C_u(x,x')&C_{uv}(x,x')\\ C_{uv}(x,x')&C_v(x,x')\end{bmatrix}\equiv E\{\Delta x(x)\,\Delta x(x')^{\mathsf T}\}=\sum_{k=1}^{m}\sum_{j=1}^{m}E\{a_ka_j^{\mathsf T}\}\varphi_k(x)\varphi_j(x')=$$

$$=\sum_{k=1}^{m}\sum_{j=1}^{m}\delta_{kj}\Sigma_k\varphi_k(x)\varphi_j(x')=\sum_{k=1}^{m}\Sigma_k\varphi_k(x)\varphi_k(x')=\sum_{k=1}^{m}\begin{bmatrix}\sigma_{u,k}^2&\sigma_{uv,k}\\ \sigma_{uv,k}&\sigma_{v,k}^2\end{bmatrix}\varphi_k(x)\varphi_k(x')\tag{162}$$

and explicitly

$$C_u(x,x')=\sum_{k=1}^{m}\sigma_{u,k}^2\varphi_k(x)\varphi_k(x'),\quad C_v(x,x')=\sum_{k=1}^{m}\sigma_{v,k}^2\varphi_k(x)\varphi_k(x'),\quad C_{uv}(x,x')=\sum_{k=1}^{m}\sigma_{uv,k}\varphi_k(x)\varphi_k(x').\tag{163}$$

This means that when the per-coefficient weight submatrices are the inverses of the per-coefficient covariance submatrices, $R_k=\Sigma_k^{-1}$, i.e. when $S_{u,k}=\sigma_{u,k}^2$, $S_{v,k}=\sigma_{v,k}^2$, $S_{uv,k}=\sigma_{uv,k}$, then the three functions required to construct the matrices appearing in the interpolation formulas become the covariance functions


of the velocity components, namely $k_u(x,x')=C_u(x,x')$, $k_v(x,x')=C_v(x,x')$, $k_{uv}(x,x')=C_{uv}(x,x')$. The interpolation Eqs. (155) take the form

$$\begin{bmatrix}\hat u(x)\\ \hat v(x)\end{bmatrix}=\begin{bmatrix}c_u^{\mathsf T}&c_{uv}^{\mathsf T}\\ c_{uv}^{\mathsf T}&c_v^{\mathsf T}\end{bmatrix}\begin{bmatrix}C_u&C_{uv}\\ C_{uv}^{\mathsf T}&C_v\end{bmatrix}^{-1}\begin{bmatrix}u\\ v\end{bmatrix},\tag{164}$$

where the relevant matrices have elements

$$(C_u)_{ij}=C_u(x_i,x_j),\qquad (C_{uv})_{ij}=C_{uv}(x_i,x_j),\qquad (C_v)_{ij}=C_v(x_i,x_j),\tag{165}$$

$$(c_u)_i=C_u(x,x_i),\qquad (c_{uv})_i=C_{uv}(x,x_i),\qquad (c_v)_i=C_v(x,x_i).\tag{166}$$

The interpolation formulas (164) are in fact the formulas for best (= minimum mean square error) linear prediction of the random variables u(x), v(x) from the given outcomes of the random variables $(u)_i=u(x_i)$, $(v)_i=v(x_i)$, i = 1, 2, …, n. In geodesy the method is widely applied for interpolating scalar gravity data and is known as collocation. It is closely related to the kriging method applied in geostatistics, as explained by [9] and further elaborated in [12].

However, once a stochastic point of view is adopted, one cannot ignore the presence of errors in the available data. The covariance and cross-covariance matrices of the data are the sums of those of the random fields and the corresponding matrices $\Sigma_u$, $\Sigma_v$, $\Sigma_{uv}$ of the errors in the displacement (or velocity) estimates. The interpolation-prediction equations then take the correct (from the stochastic point of view) form

$$\begin{bmatrix}\hat u(x)\\ \hat v(x)\end{bmatrix}=\begin{bmatrix}c_u^{\mathsf T}&c_{uv}^{\mathsf T}\\ c_{uv}^{\mathsf T}&c_v^{\mathsf T}\end{bmatrix}\begin{bmatrix}C_u+\Sigma_u&C_{uv}+\Sigma_{uv}\\ C_{uv}^{\mathsf T}+\Sigma_{uv}^{\mathsf T}&C_v+\Sigma_v\end{bmatrix}^{-1}\begin{bmatrix}u\\ v\end{bmatrix}.\tag{167}$$
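A scalar sketch of the noisy interpolation-prediction (167), using a Gaussian-type covariance model; the constants, data and noise level are illustrative assumptions.

```python
import numpy as np

# Scalar collocation with noise, Eq. (167): u_hat(x) = c_u^T (C_u + S_u)^{-1} u.
# Model constants, data values and noise level are illustrative assumptions.
C0, r0 = 4.0, 2.0
a2 = np.log(2.0) / r0**2

def C(r):
    """Covariance model C(r) = C0 * exp(-a^2 r^2), correlation length r0."""
    return C0 * np.exp(-a2 * np.asarray(r)**2)

xi = np.array([0.0, 1.0, 2.5, 4.0, 6.0, 7.5])      # data points
u = np.array([1.2, 1.0, 0.4, -0.3, -0.8, -0.5])    # noisy displacements
Sigma_u = 0.05 * np.eye(len(xi))                   # data error covariance

Cu = C(np.abs(xi[:, None] - xi[None, :]))          # (C_u)_ij = C(|x_i - x_j|)
w = np.linalg.solve(Cu + Sigma_u, u)               # solved once, reused

def u_hat(x):
    return C(np.abs(x - xi)) @ w
```

Because of the noise term $\Sigma_u$ the predictor smooths rather than interpolates, and it decays to the zero prior mean far from the data.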

To realize the interpolation-prediction, all we need is knowledge of the three two-point covariance functions $C_u$, $C_v$, $C_{uv}$ directly; no explicit knowledge of the base functions $\varphi_k(x)$ and the coefficient variances $\sigma_{u,k}^2$, $\sigma_{v,k}^2$ and covariances $\sigma_{uv,k}$ is required. This is a critical point, because it allows us to estimate sample covariance functions $\hat C_u$, $\hat C_v$, $\hat C_{uv}$ from the data $u(x_i)$, $v(x_i)$ themselves, provided that their number is sufficiently large. This estimation is further facilitated by the assumption that the random fields at hand are weakly homogeneous, i.e. the covariance functions satisfy $C(x+h,x'+h)=C(x,x')$ for every h, and in addition weakly isotropic, i.e. the covariance functions are of the form $C(x,x')=C(r)$, depending only on the distance $r=|x'-x|$ between the two points. For example, $\hat C_u(r)$ and $\hat C_{uv}(r)$ are estimated by dividing the r-axis into equal intervals $I_k=((k-1)\Delta r,\,k\Delta r]$, k = 1, 2, …, and estimating values at the interval midpoints $r_k=(k-1)\Delta r+\Delta r/2$ from

$$\hat C_u(r_k)=\frac{1}{N_k}\sum_{r_{ij}\in I_k}u(x_i)u(x_j),\qquad \hat C_{uv}(r_k)=\frac{1}{N_k}\sum_{r_{ij}\in I_k}u(x_i)v(x_j),\tag{168}$$


where $r_{ij}=|x_i-x_j|$ and $N_k$ is the number of value pairs with distances falling within the interval $I_k$. As we will explain below, the value $\hat C_u(0)=\frac{1}{n}\sum_iu(x_i)^2$ is computed separately, because it is an estimate of the sum $C_u(0)+\sigma^2$, where $\sigma^2$ is a variance associated with the data noise. A parametrized covariance model C(r, p) is fitted to the values $\hat C_u(r_k)$ at $r_k=0.5\Delta r,\,1.5\Delta r,\,2.5\Delta r,\dots$, by selecting the parameter values $p=[p_1\cdots p_s]^{\mathsf T}$ which provide the best fit. For the covariance functions $C_u$ and $C_v$ the selected models must belong to the family of positive-definite functions, i.e. such that e.g. the matrix $C_u$ with elements $(C_u)_{ij}=C(|x_i-x_j|)$ is positive-definite for any arbitrary set of points $\{x_i\}$. Two widely used positive-definite models are the exponential-type one $C(r)=C_0e^{-a^2r^2}=C_0\,2^{-r^2/r_0^2}$ ($a^2=\ln 2/r_0^2$), and the inverse quadratic one $C(r)=C_0/(1+r^2/r_0^2)$. The value $r_0$ is the correlation length, defined by $C(r_0)=\frac12C(0)=\frac12C_0$. When the number of available data is not sufficiently large for the estimation of a sample covariance function, a usual choice for the correlation length is the mean distance between adjacent points.

A serious problem with these two models is that they are always positive (C(r) > 0) and cannot be adapted to the negative correlations (C(r) < 0) which often appear in practice. Models that accept negative values can be borrowed from gravity field applications, based on the fact that the expansion in Legendre functions $C(\psi)=\sum_nc_nP_n(\cos\psi)$, where $\psi$ is the spherical distance on the sphere, is a positive-definite function whenever $c_n\ge 0$. These can be adapted to the planar case by associating $\psi$ either with the arc length $r=R\psi$, R being a mean earth radius, or with the spatial distance between the points $r=2R\sin(\psi/2)$. This leads to two models of the form

$$C(r)=\sum_nc_nP_n\!\left(\cos\frac{r}{R}\right),\qquad C(r)=\sum_nc_nP_n\!\left(1-\frac{r^2}{2R^2}\right),\qquad c_n\ge 0.\tag{169}$$

The parameter R can also be used as an additional "tuning" unknown in the least squares fitting procedure. One way to realize the best fit to the sample values $\hat C_k=\hat C(r_k)$ is to solve the constrained minimization problem [see e.g. 2]

$$\sum_kp_k\bigl(C(r_k)-\hat C_k\bigr)^2=\min_{c_n\ge 0},\tag{170}$$

with weights $p_k$, which are usually taken to be inversely proportional ($p_k=1/N_k$) to the number of pairs $N_k$ summed for the computation of the value $\hat C_k$. Another approach is to force the coefficients $c_n$ to be positive by modeling them as positive functions of a set of new tuning parameters. For example, one may use the model

$$c_n=s^{2(n+2)}\,\frac{A(n-1)}{(n-2)(n+B)},\qquad 2<n\le n_{\max},\tag{171}$$

proposed by [34] for gravity field related applications, where the best fitting parameters A > 0, 0 < s < 1 and integer B are again determined by constrained least squares.

The above positive-definite models can be employed to model the sample covariance functions $\hat C_u(r)$, $\hat C_v(r)$ of the random fields u(x), v(x), respectively, but they are not appropriate for the modelling of their cross-covariance $C_{uv}(r)$, which is not necessarily a positive-definite function. The use of the exponential model for cross-covariance functions that has appeared in the literature is rather arbitrary, as it lacks theoretical justification. Unfortunately we do not have at hand a class of models for cross-covariance functions, and the matter needs further theoretical investigation. The problem relates to that of the modeling of the co-variogram in kriging, where some theoretical progress has been made, leading to restricting conditions. In any case, the models used in kriging for covariances, variograms and co-variograms are too simple to be of any practical use in geodetic applications.

A naive way out of this problem is the usual practice of ignoring the cross-correlation between the two displacement or velocity components, i.e. to assume that $C_{uv}(r)=0$, so that the corresponding cross-covariance matrices vanish ($C_{uv}=0$, $c_{uv}=0$), in which case (under the additional non-valid assumption $\Sigma_{uv}=0$) the vector prediction-interpolation is split into two separate ones for each component,

$$\hat u(x)=c_u^{\mathsf T}(C_u+\Sigma_u)^{-1}u,\qquad \hat v(x)=c_v^{\mathsf T}(C_v+\Sigma_v)^{-1}v.$$

This approach, which is sometimes followed in the literature, as well as its even simpler version $\hat u(x)=c_u^{\mathsf T}C_u^{-1}u$, $\hat v(x)=c_v^{\mathsf T}C_v^{-1}v$, is simply erroneous. Even if sampling evidence points to $C_{uv}(r)=0$, the cross-correlation of the vector component errors $(\Sigma_{uv})_{ij}=\sigma_{u(x_i)v(x_j)}\ne 0$ cannot be ignored.

A compromise, in order to overcome the problem of modelling the cross-covariance function $C_{uv}(r)$, is based on the idea of changing the reference system in such a way that the sample cross-covariance becomes as small as possible. A system rotation by an angle $\vartheta$ transforms the displacements (or velocities) according to $\Delta\tilde x(x)=R(\vartheta)\,\Delta x(x)$. The corresponding transformation of the covariance and cross-covariance functions is

$$\tilde C(r)\equiv\begin{bmatrix}\tilde C_u(r)&\tilde C_{uv}(r)\\ \tilde C_{uv}(r)&\tilde C_v(r)\end{bmatrix}=E\{\Delta\tilde x\,\Delta\tilde x'^{\mathsf T}\}=R(\vartheta)\begin{bmatrix}C_u(r)&C_{uv}(r)\\ C_{uv}(r)&C_v(r)\end{bmatrix}R(-\vartheta)\equiv R(\vartheta)C(r)R(-\vartheta).\tag{172}$$

For each particular value of r it is possible to diagonalize $C(r)=R(-\vartheta)\Lambda R(\vartheta)$, where $\Lambda$ is the diagonal matrix of the eigenvalues $\lambda_1\ge\lambda_2$ of C(r), so that $\tilde C(r)=R(\vartheta)C(r)R(-\vartheta)=\Lambda$, explicitly $\tilde C_u(r)=\lambda_1$, $\tilde C_v(r)=\lambda_2$, $\tilde C_{uv}(r)=0$. It is thus possible to make the cross-correlation vanish for some particular distance, but not over its whole domain of definition. However, experience from repeated applications has demonstrated that when a rotation makes $\tilde C_{uv}(0)=0$, then all the other values $\tilde C_{uv}(r)$ remain small, despite the fact that $C_{uv}(r)$ does not in general have its maximum at zero, as do the (auto)covariance functions, where $C_u(r)\le C_u(0)$, $C_v(r)\le C_v(0)$. The angle $\vartheta$ that leads to $\tilde C_{uv}(0)=0$ can be computed from

$$\tan 2\vartheta=\frac{2C_{uv}(0)}{C_u(0)-C_v(0)}.\tag{173}$$

Once the assumption that $\tilde C_{uv}(r)=0$ is acceptable, the prediction is carried out as follows. First, the data components $u_i$, $v_i$ are transformed into the components $\tilde u_i$, $\tilde v_i$ in the rotated reference system, according to $[\tilde u\ \tilde v]=[u\ v]R(-\vartheta)$. Secondly, the covariance matrices are accordingly transformed using the covariance propagation law

$$\tilde\Sigma_u=\cos^2\vartheta\,\Sigma_u+\sin^2\vartheta\,\Sigma_v+\sin\vartheta\cos\vartheta\,(\Sigma_{uv}+\Sigma_{uv}^{\mathsf T}),\tag{174}$$

$$\tilde\Sigma_v=\sin^2\vartheta\,\Sigma_u+\cos^2\vartheta\,\Sigma_v-\sin\vartheta\cos\vartheta\,(\Sigma_{uv}+\Sigma_{uv}^{\mathsf T}),\tag{175}$$

$$\tilde\Sigma_{uv}=\cos^2\vartheta\,\Sigma_{uv}-\sin^2\vartheta\,\Sigma_{uv}^{\mathsf T}+\sin\vartheta\cos\vartheta\,(\Sigma_v-\Sigma_u).\tag{176}$$

The predicted displacements in the new reference system for any desired point x are

$$\Delta\hat{\tilde x}(x)=\begin{bmatrix}\hat{\tilde u}(x)\\ \hat{\tilde v}(x)\end{bmatrix}=\begin{bmatrix}\tilde c_u^{\mathsf T}&\tilde c_{uv}^{\mathsf T}\\ \tilde c_{uv}^{\mathsf T}&\tilde c_v^{\mathsf T}\end{bmatrix}\begin{bmatrix}\tilde C_u+\tilde\Sigma_u&\tilde\Sigma_{uv}\\ \tilde\Sigma_{uv}^{\mathsf T}&\tilde C_v+\tilde\Sigma_v\end{bmatrix}^{-1}\begin{bmatrix}\tilde u\\ \tilde v\end{bmatrix},\tag{177}$$

and they are finally transformed back to the original reference system using

$$\Delta\hat x(x)=R(-\vartheta)\,\Delta\hat{\tilde x}(x).\tag{178}$$
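The decorrelation rotation of Eqs. (172)-(173) and the back-transformation (178) can be sketched as follows; the sample (co)variances at r = 0 are assumed numbers for illustration.

```python
import numpy as np

# Decorrelation rotation, Eqs. (172)-(173) and (178). The sample variances
# and cross-covariance at r = 0 below are assumed for illustration.
Cu0, Cv0, Cuv0 = 3.0, 1.5, 0.8

def R(t):
    """Plane rotation matrix R(theta)."""
    return np.array([[np.cos(t), np.sin(t)],
                     [-np.sin(t), np.cos(t)]])

theta = 0.5 * np.arctan2(2.0 * Cuv0, Cu0 - Cv0)    # Eq. (173)

C0 = np.array([[Cu0, Cuv0], [Cuv0, Cv0]])
C0_rot = R(theta) @ C0 @ R(-theta)                 # Eq. (172) at r = 0:
                                                   # off-diagonal vanishes

# data are rotated into the new system, predictions rotated back, Eq. (178)
d = np.array([0.010, -0.020])
d_back = R(-theta) @ (R(theta) @ d)
```

With this rotation convention the off-diagonal element of the rotated covariance matrix is exactly zero, and rotating forward and back leaves a vector unchanged.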

In applying the above prediction algorithms we have assumed that the random fields u(x) and v(x) have zero mean. However, this assumption may not be valid. Even if a rigid-motion trend has already been removed, as explained in Sect. 6, an additional trend must be removed from the random fields, so that the remaining residual random fields have zero mean. Usually the removal of a linear trend is sufficient. Using the model $\Delta x(x)=J_0x+c$, which is linear with respect to the elements of $J_0$, least squares estimates $\hat J_0$, $\hat c$ can be obtained from the observed values $\Delta x(x_i)=J_0x_i+c+\delta\Delta x_i$ by requiring that $\sum_i\delta\Delta x_i^{\mathsf T}\delta\Delta x_i=\min$. Note that weighted least squares, with weight matrix the inverse of the data covariance matrix, is not justified in this case: the residuals $\delta\Delta x_i$ are not zero-mean random quantities, because they contain a deterministic part, and thus the Gauss-Markov theorem does not apply. The estimated linear trend $\Delta x_0(x)=\hat J_0x+\hat c$ is subtracted from the data to obtain residual displacements $\delta\Delta x_i=\Delta x(x_i)-\hat J_0x_i-\hat c$. These are the basis for obtaining sample covariance functions and for predicting-interpolating residual displacements $\delta\Delta x(x)$ at any desired point x. Actually, we are not interested in the displacements themselves but in the displacement (or velocity) gradient matrix

$$J(x)=\frac{\partial\Delta x}{\partial x}(x)=\frac{\partial\Delta x_0}{\partial x}(x)+\frac{\partial\delta\Delta x}{\partial x}(x)=\hat J_0+\delta J(x).\tag{179}$$
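The plain least-squares trend fit described above can be sketched as follows; station layout, trend parameters and noise are simulated assumptions.

```python
import numpy as np

# Plain least-squares fit of the linear trend Delta-x(x) = J0 x + c and
# removal of the fitted trend; all numbers are simulated for illustration.
rng = np.random.default_rng(1)
pts = rng.uniform(0.0, 100.0, size=(20, 2))        # station coordinates (x, y)
J0_true = np.array([[2.0e-6, -1.0e-6], [0.5e-6, 1.0e-6]])
c_true = np.array([0.003, -0.002])
disp = pts @ J0_true.T + c_true + 1.0e-5 * rng.standard_normal((20, 2))

A = np.hstack([pts, np.ones((20, 1))])             # design matrix [x, y, 1]
coef, *_ = np.linalg.lstsq(A, disp, rcond=None)    # one solve for both components
J0_hat = coef[:2, :].T                             # estimated trend gradient
c_hat = coef[2, :]                                 # estimated offset
resid = disp - pts @ J0_hat.T - c_hat              # zero-mean residual field
```

Because the design matrix contains a constant column, the residuals have exactly zero mean, as the prediction algorithms require.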

To obtain the elements of $\delta J(x)=\frac{\partial\delta\Delta x}{\partial x}(x)$, we note that the variable x appears only within the vectors $\tilde c_u$, $\tilde c_v$, $\tilde c_{uv}$. Differentiation of the predicted residual displacement components $\delta\tilde u(x)$, $\delta\tilde v(x)$ in the rotated "decorrelation" reference system according to Eqs. (177) gives

$$\hat s\equiv\begin{bmatrix}\delta\tilde J_{11}\\ \delta\tilde J_{21}\\ \delta\tilde J_{12}\\ \delta\tilde J_{22}\end{bmatrix}=\begin{bmatrix}\partial\delta\tilde u/\partial\tilde x\\ \partial\delta\tilde v/\partial\tilde x\\ \partial\delta\tilde u/\partial\tilde y\\ \partial\delta\tilde v/\partial\tilde y\end{bmatrix}=\begin{bmatrix}(\partial\tilde c_u/\partial\tilde x)^{\mathsf T}&(\partial\tilde c_{uv}/\partial\tilde x)^{\mathsf T}\\ (\partial\tilde c_{uv}/\partial\tilde x)^{\mathsf T}&(\partial\tilde c_v/\partial\tilde x)^{\mathsf T}\\ (\partial\tilde c_u/\partial\tilde y)^{\mathsf T}&(\partial\tilde c_{uv}/\partial\tilde y)^{\mathsf T}\\ (\partial\tilde c_{uv}/\partial\tilde y)^{\mathsf T}&(\partial\tilde c_v/\partial\tilde y)^{\mathsf T}\end{bmatrix}\begin{bmatrix}\tilde C_u+\tilde\Sigma_u&\tilde\Sigma_{uv}\\ \tilde\Sigma_{uv}^{\mathsf T}&\tilde C_v+\tilde\Sigma_v\end{bmatrix}^{-1}\begin{bmatrix}\delta\tilde u\\ \delta\tilde v\end{bmatrix}\equiv C_{s's}(C_s+\Sigma)^{-1}b,\tag{180}$$

where

$$\left(\frac{\partial\tilde c_u}{\partial\tilde x}\right)_i=\frac{\tilde x-\tilde x_i}{r_i}\frac{\partial\tilde C_u}{\partial r}(r_i),\quad \left(\frac{\partial\tilde c_{uv}}{\partial\tilde x}\right)_i=\frac{\tilde x-\tilde x_i}{r_i}\frac{\partial\tilde C_{uv}}{\partial r}(r_i),\quad \left(\frac{\partial\tilde c_v}{\partial\tilde x}\right)_i=\frac{\tilde x-\tilde x_i}{r_i}\frac{\partial\tilde C_v}{\partial r}(r_i),\tag{181}$$

$$\left(\frac{\partial\tilde c_u}{\partial\tilde y}\right)_i=\frac{\tilde y-\tilde y_i}{r_i}\frac{\partial\tilde C_u}{\partial r}(r_i),\quad \left(\frac{\partial\tilde c_{uv}}{\partial\tilde y}\right)_i=\frac{\tilde y-\tilde y_i}{r_i}\frac{\partial\tilde C_{uv}}{\partial r}(r_i),\quad \left(\frac{\partial\tilde c_v}{\partial\tilde y}\right)_i=\frac{\tilde y-\tilde y_i}{r_i}\frac{\partial\tilde C_v}{\partial r}(r_i),\tag{182}$$

 ˜ y˜ are the coordinates of the prediction with ri = (x˜ − x˜i )2 + (y˜ − y˜i )2 and x, point. The interpolated displacement gradient matrix is back-transferred to the original reference system by δJ = R(−')δ J˜ R(') and the deformation analysis is based on the predicted deformation gradient matrix F = I + J = I + Jˆ 0 + δJ. The main advantage of interpolation of displacements or velocities via stochastic prediction is that it allows us to obtain reliable statistics for the prediction errors εˆ = sˆ − s (predicted minus true value) which reflect not only the effect of observational errors but also the uncertainty in the interpolation process. This is

achieved through the estimation of the sample covariance function, and it does not hold in the case where some empirical covariance model is used without statistical justification. The covariance matrix $C_{\hat\varepsilon}$ of the prediction errors, as provided by the law of covariance propagation, is given by

$$C_{\hat\varepsilon}=C_{s'}-C_{s's}(C_s+\Sigma)^{-1}C_{s's}^{\mathsf T},\tag{183}$$

where the matrices $\Sigma$, $C_s$, $C_{s's}$ have been implicitly defined in relation (180). The additional matrix $C_{s'}$ is the covariance matrix of the predicted displacement (or velocity) gradient elements, which are derivatives of the displacements. Its elements are thus double derivatives of the covariance functions with respect to the coordinates. In the rotated-for-decorrelation reference system it has the form

$$C_{s'}=\begin{bmatrix}\dfrac{\partial^2\tilde C_u}{\partial\tilde x^2}&\dfrac{\partial^2\tilde C_{uv}}{\partial\tilde x^2}&\dfrac{\partial^2\tilde C_u}{\partial\tilde y\,\partial\tilde x}&\dfrac{\partial^2\tilde C_{uv}}{\partial\tilde y\,\partial\tilde x}\\ \dfrac{\partial^2\tilde C_{uv}}{\partial\tilde x^2}&\dfrac{\partial^2\tilde C_v}{\partial\tilde x^2}&\dfrac{\partial^2\tilde C_{uv}}{\partial\tilde y\,\partial\tilde x}&\dfrac{\partial^2\tilde C_v}{\partial\tilde y\,\partial\tilde x}\\ \dfrac{\partial^2\tilde C_u}{\partial\tilde y\,\partial\tilde x}&\dfrac{\partial^2\tilde C_{uv}}{\partial\tilde y\,\partial\tilde x}&\dfrac{\partial^2\tilde C_u}{\partial\tilde y^2}&\dfrac{\partial^2\tilde C_{uv}}{\partial\tilde y^2}\\ \dfrac{\partial^2\tilde C_{uv}}{\partial\tilde y\,\partial\tilde x}&\dfrac{\partial^2\tilde C_v}{\partial\tilde y\,\partial\tilde x}&\dfrac{\partial^2\tilde C_{uv}}{\partial\tilde y^2}&\dfrac{\partial^2\tilde C_v}{\partial\tilde y^2}\end{bmatrix}_{r=0}=\begin{bmatrix}\dfrac{\partial^2\tilde C_u}{\partial\tilde x^2}&0&\dfrac{\partial^2\tilde C_u}{\partial\tilde y\,\partial\tilde x}&0\\ 0&\dfrac{\partial^2\tilde C_v}{\partial\tilde x^2}&0&\dfrac{\partial^2\tilde C_v}{\partial\tilde y\,\partial\tilde x}\\ \dfrac{\partial^2\tilde C_u}{\partial\tilde y\,\partial\tilde x}&0&\dfrac{\partial^2\tilde C_u}{\partial\tilde y^2}&0\\ 0&\dfrac{\partial^2\tilde C_v}{\partial\tilde y\,\partial\tilde x}&0&\dfrac{\partial^2\tilde C_v}{\partial\tilde y^2}\end{bmatrix}_{r=0},\tag{184}$$

the last term following from the decorrelation assumption $\tilde C_{uv}=0$. The derivatives are evaluated at r = 0 ($\tilde x'=\tilde x$, $\tilde y'=\tilde y$), since all predicted signals refer to the same point. Their explicit form depends on the specific models adopted for the covariance functions $\tilde C_u(r)$, $\tilde C_v(r)$ (and $\tilde C_{uv}(r)$ when $\tilde C_{uv}\ne 0$). Since

$$\frac{\partial^2C}{\partial\tilde x^2}=\frac{(\tilde y-\tilde y')^2}{r^3}\frac{\partial C}{\partial r}+\frac{(\tilde x-\tilde x')^2}{r^2}\frac{\partial^2C}{\partial r^2},\tag{185}$$

$$\frac{\partial^2C}{\partial\tilde y^2}=\frac{(\tilde x-\tilde x')^2}{r^3}\frac{\partial C}{\partial r}+\frac{(\tilde y-\tilde y')^2}{r^2}\frac{\partial^2C}{\partial r^2},\tag{186}$$

$$\frac{\partial^2C}{\partial\tilde y\,\partial\tilde x}=\frac{(\tilde x-\tilde x')(\tilde y-\tilde y')}{r^2}\left(\frac{\partial^2C}{\partial r^2}-\frac{1}{r}\frac{\partial C}{\partial r}\right),\tag{187}$$

the adopted models C(r) must be such that the indeterminacies for r = 0 ($\tilde x'=\tilde x$, $\tilde y'=\tilde y$) are removed and the derivatives are well defined.

A slightly different approach is to use the fitted linear trend only for the purpose of determining the sample covariances. New trend parameter values can then be estimated simultaneously with the prediction, using a mixed linear model

$$b=Ax+s+e,\qquad E\{s\}=0,\quad E\{ss^{\mathsf T}\}=C_s,\quad E\{e\}=0,\quad E\{ee^{\mathsf T}\}=\Sigma,\quad E\{se^{\mathsf T}\}=0,\tag{188}$$

instead of the previously used random effects model b = s + e, with the additional prediction of new signals $s'$ with $E\{s'\}=0$, $E\{s's'^{\mathsf T}\}=C_{s'}$, $E\{s's^{\mathsf T}\}=C_{s's}$. Here s are the signals (displacements or velocities) at the control network points, b are the observed values, e the corresponding observational errors, and $s'$ the elements of the displacement (or velocity) gradient matrix $\delta J$ at any desired point. The deterministic parameters x are the elements $J_0$, c of a linear trend $\Delta x_0(x)=J_0x+c$, and A is a known design matrix. The well-known algorithm for estimation and prediction is

$$\hat x=[A^{\mathsf T}(C_s+\Sigma)^{-1}A]^{-1}A^{\mathsf T}(C_s+\Sigma)^{-1}b,\qquad C_{\hat x}=[A^{\mathsf T}(C_s+\Sigma)^{-1}A]^{-1},\tag{189}$$

$$\hat s'=C_{s's}(C_s+\Sigma)^{-1}(b-A\hat x),\qquad C_{\hat\varepsilon'}=C_{s'}-C_{s's}\bigl[(C_s+\Sigma)^{-1}-(C_s+\Sigma)^{-1}AN^{-1}A^{\mathsf T}(C_s+\Sigma)^{-1}\bigr]C_{s's}^{\mathsf T},\tag{190}$$

where $N=A^{\mathsf T}(C_s+\Sigma)^{-1}A$. The above procedure for the simultaneous estimation of trend parameters and the prediction of random-field-related parameters is completely equivalent to what is known in geostatistics as "universal kriging", in the simple case where one predicts the vector components only. The extension to any linear functional of the vector components through covariance propagation is particular to geodetic applications and provides in a certain sense a "generalized kriging" approach, as explained by [28]. The above equations require complete knowledge of the error covariance matrix $\Sigma$, and not merely up to a scale factor $\sigma^2$ as in the usual linear Gauss-Markov model. The rather unlikely assumption of a common factor $\sigma^2$ in both the error and the signal covariance matrices allows us to estimate $\hat\sigma^2=\hat e^{\mathsf T}(C_s+\Sigma)^{-1}\hat e/f$, where f are the degrees of freedom and $\hat e=b-A\hat x-\hat s$ with $\hat s=C_s(C_s+\Sigma)^{-1}(b-A\hat x)$. The covariance matrices in the above estimation-prediction relations are then replaced by corresponding covariance factor matrices ($\Sigma\to Q$, $C_s\to Q_s$, $C_{s's}\to Q_{s's}$, $C_{s'}\to Q_{s'}$, $C_{\hat x}\to Q_{\hat x}$, $C_{\hat\varepsilon'}\to Q_{\hat\varepsilon'}$), and final covariance estimates are given by $\hat C_{\hat x}=\hat\sigma^2Q_{\hat x}$, $\hat C_{\hat\varepsilon'}=\hat\sigma^2Q_{\hat\varepsilon'}$.

We close by pointing out some (at least theoretical) shortcomings of the interpolation-through-stochastic-prediction approach, or collocation approach in geodetic jargon. These relate to the determination of the sample covariance functions $\hat C_u$, $\hat C_v$, $\hat C_{uv}$. In applications related to the scalar gravity field, the data at hand are point observations affected by independent random errors of the same accuracy. Thus instead of estimating $\hat C_u(r_k)=\frac{1}{N_k}\sum_{ij}u(x_i)u(x_j)$ we obtain

$$\hat G_u(r_k)=\frac{1}{N_k}\sum_{ij}[u(x_i)+e_i][u(x_j)+e_j]=$$

$$=\frac{1}{N_k}\sum_{ij}u(x_i)u(x_j)+\frac{1}{N_k}\sum_{ij}u(x_i)e_j+\frac{1}{N_k}\sum_{ij}u(x_j)e_i+\frac{1}{N_k}\sum_{ij}e_ie_j=\hat C_u(r_k)+\frac{1}{N_k}\sum_{ij}e_ie_j.\tag{191}$$

The second and third terms tend to vanish, because they are estimates of the zero cross-covariance between the stochastically independent signals and errors. For $i\ne j$ ($r_k\ne 0$) the behavior of the last term depends on whether the data at hand are the result of independent positioning or are obtained through a network adjustment. In the first case the errors at different points are independent and the last term also tends to vanish. This is not the case in a network situation, where the errors of the coordinate estimates are correlated. More specifically, considering that each pair appears just once in the last sum, let $e_I$ be the vector of the errors $e_i$ and $e_J$ the vector of the pairing errors $e_j$ at different points; the last term is then an unbiased estimate of its expectation $E\{\frac{1}{N_k}\sum_{ij}e_ie_j\}=\frac{1}{N_k}E\{e_I^{\mathsf T}e_J\}=\frac{1}{N_k}\operatorname{tr}E\{e_Je_I^{\mathsf T}\}=\frac{1}{N_k}\operatorname{tr}\Sigma_{JI}$, where $\Sigma_{JI}$ is an off-diagonal submatrix of the covariance matrix of all the errors. As such, it has both negative and positive elements on its diagonal, so that their sum tends to vanish. Passing to the case r = 0, we obtain $\hat G(0)=\hat C(0)+\frac{1}{n}\sum_ie_i^2=\hat C(0)+\frac{1}{n}e^{\mathsf T}e$, where e contains all the observational errors, and the last term is always positive. The term $\hat S=\frac{1}{n}e^{\mathsf T}e$ is an unbiased estimate of its mean $S=\frac{1}{n}E\{e^{\mathsf T}e\}=\frac{1}{n}\operatorname{tr}E\{ee^{\mathsf T}\}=\frac{1}{n}\operatorname{tr}\Sigma_e$, where $\Sigma_e$ is the covariance matrix of the errors at the sample points. For the three sample covariance functions we obtain

$$\hat G_u(0)=\hat C_u(0)+\hat S_u,\qquad \hat G_v(0)=\hat C_v(0)+\hat S_v,\qquad \hat G_{uv}(0)=\hat C_{uv}(0)+\hat S_{uv},\tag{192}$$

where $\hat S_u=\frac{1}{n}e_u^{\mathsf T}e_u$, $\hat S_v=\frac{1}{n}e_v^{\mathsf T}e_v$, $\hat S_{uv}=\frac{1}{n}e_u^{\mathsf T}e_v$ are unbiased estimates of their respective means

$$S_u=\frac{1}{n}\operatorname{tr}\Sigma_u,\qquad S_v=\frac{1}{n}\operatorname{tr}\Sigma_v,\qquad S_{uv}=\frac{1}{n}\operatorname{tr}\Sigma_{uv}.\tag{193}$$

If we have great confidence in the covariance matrices of the displacement components, we may use the sample values at r = 0 to compute the values $\hat C_u(0)=\hat G_u(0)-S_u$, $\hat C_v(0)=\hat G_v(0)-S_v$, $\hat C_{uv}(0)=\hat G_{uv}(0)-S_{uv}$, which will be included in the values used for fitting the model covariance functions. We prefer to fit the covariance models to the points $\hat C(r_k)$, without including the value $\hat G(0)$, and thus to obtain an estimate $C'(0)$ by extrapolation (evaluation of the fitted model at r = 0). The values $S_u'=\hat G_u(0)-C_u'(0)$, $S_v'=\hat G_v(0)-C_v'(0)$, $S_{uv}'=\hat G_{uv}(0)-C_{uv}'(0)$ should then be compared with $S_u$, $S_v$, $S_{uv}$, respectively. If unjustified discrepancies are found, one can use the ratios $S_u'/S_u$, $S_v'/S_v$, $S_{uv}'/S_{uv}$ as ad hoc estimates of missing variance factors and rescale the data covariance matrices into more realistic ones according to the replacement scheme $\Sigma_u\to\frac{S_u'}{S_u}\Sigma_u$, $\Sigma_v\to\frac{S_v'}{S_v}\Sigma_v$, $\Sigma_{uv}\to\frac{S_{uv}'}{S_{uv}}\Sigma_{uv}$.
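The estimation part of the mixed-model algorithm (189), together with signal prediction at the data points (for which $C_{s's}=C_s$), can be sketched as follows; the covariance model, trend and noise level are simulated assumptions.

```python
import numpy as np

# Mixed linear model b = A x + s + e, estimation per Eq. (189) and signal
# prediction at the data points (C_s's = C_s). All inputs are simulated.
rng = np.random.default_rng(2)
n = 30
t = np.sort(rng.uniform(0.0, 10.0, n))
A = np.column_stack([t, np.ones(n)])                   # linear-trend design matrix
Cs = 2.0 * np.exp(-np.log(2.0) * (t[:, None] - t[None, :])**2 / 1.5**2)
Sigma = 0.1 * np.eye(n)                                # observational error covariance

x_true = np.array([0.4, -1.0])
L = np.linalg.cholesky(Cs + 1e-6 * np.eye(n))          # sample a correlated signal
b = A @ x_true + L @ rng.standard_normal(n) + np.sqrt(0.1) * rng.standard_normal(n)

W = np.linalg.inv(Cs + Sigma)
N = A.T @ W @ A
x_hat = np.linalg.solve(N, A.T @ W @ b)                # Eq. (189)
Cx_hat = np.linalg.inv(N)                              # Eq. (189)
s_hat = Cs @ W @ (b - A @ x_hat)                       # predicted signal
```

The generalized normal equations are satisfied exactly by construction, which provides a simple internal check of the implementation.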

All the above procedures apply directly to the interpolation of known discrete station velocities for the determination of the elements of the velocity gradient matrix L and the computation of strain rate parameters.

The stochastic prediction or collocation method was introduced to the study of crustal motions by [15], who gave the detailed algorithm and applied it to the seismic areas of Volvi in Greece and Friuli in Italy [3]. It is based on the simple idea that displacements are more likely to be similar at points close to each other, while they can become different at distant points. It is this simple fact, coming from experience, that is expressed by the covariance functions of the displacements or velocities. The method has thereafter been adopted in Europe [8, 21, 32] and Japan [16, 17, 22]. For a more detailed account of relevant applications see [11].

8 The Deformation of Surfaces

The planar theories of two-dimensional deformation can be very easily adapted to curved surfaces; after all, every smooth surface looks like a plane in a small neighborhood of a point. In addition to the intrinsic deformation parameters, which are common with the planar case, the extrinsic parameters relating to the bending of the surface are of interest, and these require a local approximation of the surface not by its best fitting plane but rather by its best fitting triaxial ellipsoid [13]. Surface deformation finds a wide range of applications in geodetic engineering for the study of constructions with one of their dimensions very small, i.e. of shells.

For the crustal motion of the earth there are two possible applications. The first is to study the deformation of the physical surface of the earth, ignoring the fact that earth deformation is in fact a three-dimensional process. The other is to follow the classical separation of crustal deformation into a horizontal and a vertical part, in accordance with the different geophysical processes which drive these two deformation components, e.g. plate motion versus post-glacial uplift. In this case the horizontal deformation is the deformation of the projections of surface points onto the reference ellipsoid, pretty much as in the classical horizontal deformation, which relates to projections onto the horizontal plane that best fits the geoid in a small area. The study of the deformation of the actual surface of the earth by localized strain parameters refers to best fitting planes perpendicular to the surface normal, and thus, because of the terrain variability, the deformation at different points is "seen" from different spatial directions, which have no physical consistency. Another possible type of application is for landslide areas, although relative motion rather than deformation is the main factor in this case.
Intrinsic surface deformation is a result of the three-dimensional deformation, as described by the paired position vectors $\vec x=\vec e^{\,0}x$ at epoch t and $\vec x'=\vec e^{\,0}x'$ at epoch t′ of the same material point. The deformation mapping $\vec x'=f(\vec x)$ is analytically described by the vector mapping $x'=f(x)$ with respect to a chosen

global three-dimensional reference system (O, e3 0 ) with origin O and basis e3 0 = [3 e 01 e3 02 e3 03 ]. Intrinsically the surface is described by a system of two curvilinear coordinates u = [u1 u2 ]T , and extrinsically by the mapping x3 = χ(u 3 1 , u2 ) [30]. If (u1 , u2 ) and (u1 , u2 ) are the curvilinear coordinates at two epochs t and t  , respectively, and x3 = χ(u 3 1 , u2 ), x3  = χ3 (u1 , u2 ) are the corresponding position vectors, then the three-dimensional deformation mapping x3  = χ3 (u1 , u2 ) = f (3 x ) = f (χ(u 3 1 , u2 )) induces an intrinsic surface deformation mapping u = φ(u). The tangent planes to the particular material point under consideration at the two epochs t and t  are spanned by the corresponding natural bases 3e = [3e1 e32 ] =

∂ x3 = ∂u





∂ x3 ∂u1

∂ x3 ∂u2

∂ x3  ∂u1

∂ x3  ∂u2

=

∂(3 e 0 x) ∂x = e3 0 ∂u ∂u

(194)

and e3  = [3 e 1 e3 2 ] =

∂ x3  = ∂u



 =

 ∂(3 e 0 x ) 0 ∂x 3 = e . ∂u ∂u

(195)

 = e3  • e3  are represented The corresponding metric tensors gik = e3i • e3k and gik i k by the respective matrices

 G = 3e • 3e = T

∂x ∂u

T

∂x , ∂u



G = e3

T



• e3 =



∂x ∂u

T

∂x . ∂u

(196)

A parametrized curve x3(λ) = x3(u(λ)) on the surface at epoch t, passing through the point P under consideration, is deformed at a later epoch t  into a      curve x3 (λ) = x3 u (u(λ)) = x3 (φ (u(λ))). Consequently, the tangent vector ∂ x3 du du d x3 = = 3e = 3ev to this curve at P is mapped into the vector v3 = dλ ∂u dλ dλ     ∂ x3 ∂u du d x3  ∂u 3 v3 = = = e v = e3  v , tangent to the deformed curve at P . dλ ∂u ∂u dλ ∂u Therefore the mapping F : v3 → v3 , which is the tangent mapping F = ∇φ to the intrinsic surface deformation mapping φ : u → u , is represented in the local bases ∂u 3e, e3  by Fu ≡ so that v = Fu v. Fu is the deformation gradient matrix for the ∂u ∂x for the plane, where the surface similar to the deformation gradient matrix F = ∂x  curvilinear coordinates u, u are replaced by Cartesian coordinates x, x , which are orthonormal. The lack of orthonormality of the bases 3e, e3  is the only obstacle in the direct adaptation of the planar deformation theory to the deformation of non-flat surface. This problem can be easily resolved by replacing these bases by a set of orthonormal ones, and also replacing Fu with the representation F of F = ∇φ, with respect to the new orthonormal bases. After this is done, the surface deformation analysis becomes identical to that of the planar case. The only problem is that there are infinitely choices of orthonormal bases differing only by rotations R(ϑn ) and

678

A. Dermanis

R(ϑ'_{n'}) around the directions normal to the surface at epochs t and t'. This means that we cannot determine uniquely the rotation component in the three-dimensional rotation of the surface around the point P, but only the components associated with the change of orientation of the surface unit normal from

\vec{n} = \frac{\vec{e}_1 \times \vec{e}_2}{|\vec{e}_1 \times \vec{e}_2|} \ \text{at epoch } t
\qquad \text{to} \qquad
\vec{n}\,' = \frac{\vec{e}\,'_1 \times \vec{e}\,'_2}{|\vec{e}\,'_1 \times \vec{e}\,'_2|} \ \text{at epoch } t'.

Let \vec{ε} and \vec{ε}\,' be two orthonormal bases for the surface tangent planes at P, for the two epochs t and t'. Let the transformations from the original bases \vec{e}, \vec{e}\,' to the orthonormal ones \vec{ε}, \vec{ε}\,' be represented by matrices T and T', so that

\vec{ε} = \vec{e}\,T, \qquad \vec{ε}\,' = \vec{e}\,'T', \qquad \vec{e} = \vec{ε}\,T^{-1}, \qquad \vec{e}\,' = \vec{ε}\,'T'^{-1}.   (197)

The tangent mapping F = ∇φ, which maps a vector \vec{v} = \vec{e}\,v = \vec{ε}\,T^{-1}v ≡ \vec{ε}\,\bar{v} into F(\vec{v}) = \vec{v}\,' = \vec{e}\,'v' = \vec{ε}\,'T'^{-1}v' ≡ \vec{ε}\,'\bar{v}', is represented by \bar{v}' = T'^{-1}v' = T'^{-1}F_u v = T'^{-1}F_u T\,\bar{v}, so that the gradient matrix becomes

F = T'^{-1} F_u T.   (198)

The deformation gradient matrix F with respect to orthonormal bases can be used to proceed with the deformation analysis exactly as in the case of planar deformation theory. It remains to determine the nonsingular matrices T, T' which lead to orthonormal bases. From the orthonormality of the new bases (\vec{ε}^{\,T} \cdot \vec{ε} = I and \vec{ε}\,'^{T} \cdot \vec{ε}\,' = I) it follows that \vec{ε}^{\,T} \cdot \vec{ε} = (\vec{e}\,T)^T \cdot \vec{e}\,T = T^T\,\vec{e}^{\,T} \cdot \vec{e}\,T = T^T G T = I, and similarly T'^T G' T' = I. Thus the choice of the orthonormal bases is restricted only by the conditions

T^T G T = I, \qquad T'^T G' T' = I.   (199)

If \bar{T} = TQ, \bar{T}' = T'Q' are two other choices for the orthonormalization matrices, then the conditions \bar{T}^T G \bar{T} = I, \bar{T}'^T G' \bar{T}' = I assert that I = \bar{T}^T G \bar{T} = Q^T T^T G T Q = Q^T Q and similarly Q'^T Q' = I, so that both relating matrices Q and Q' are orthogonal, i.e. plane rotations Q = R(ϑ_n) and Q' = R(ϑ'_{n'}). The simplest choice of appropriate matrices T, T' is based on the diagonalization of the metric matrices

G = U M U^T, \qquad G' = U' M' U'^T,   (200)

where the diagonal matrices M and M' have as diagonal elements the eigenvalues (in descending magnitude) m_1, m_2 and m'_1, m'_2 of the metric matrices G and G', respectively, while the orthogonal matrices U, U' have as columns the corresponding (unit) eigenvectors. It is easy to verify that the matrices

T = U M^{-1/2}, \qquad T' = U' M'^{-1/2}   (201)

12 Geodetic Methods for Monitoring Crustal Motion and Deformation

679

satisfy the conditions (199) and thus lead to orthonormal bases. The deformation gradient matrix F = T'^{-1} F_u T takes for this choice the form

F = M'^{1/2}\, U'^T F_u\, U M^{-1/2}.   (202)
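The orthonormalization of Eqs. (199), (200), (201) and (202) is easy to sketch numerically. The following Python/numpy fragment is not part of the chapter; the metric values are invented for illustration. It builds T = U M^{-1/2} from an eigenvalue decomposition sorted in descending order, checks condition (199), and forms F for convective coordinates (F_u = I):

```python
import numpy as np

def orthonormalizing_matrix(G):
    """T = U M^{-1/2} from G = U M U^T (Eqs. 200-201); eigenvalues are
    sorted in descending magnitude, as assumed in the text."""
    m, U = np.linalg.eigh(G)          # ascending eigenvalues
    idx = np.argsort(m)[::-1]         # reorder to descending
    m, U = m[idx], U[:, idx]
    return U @ np.diag(m ** -0.5)

# illustrative metric matrices at the two epochs
G  = np.array([[4.0, 0.6], [0.6, 2.0]])
Gp = np.array([[4.1, 0.5], [0.5, 2.2]])

T, Tp = orthonormalizing_matrix(G), orthonormalizing_matrix(Gp)
assert np.allclose(T.T @ G @ T, np.eye(2))      # condition (199)
assert np.allclose(Tp.T @ Gp @ Tp, np.eye(2))

Fu = np.eye(2)                                  # convective coordinates
F = np.linalg.inv(Tp) @ Fu @ T                  # Eqs. (198)/(202)
```

The same F is obtained whichever pair of admissible orthonormal bases is used, up to the residual rotations R(ϑ_n), R(ϑ'_{n'}) discussed above.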

Another choice is to leave the direction of one of the two base vectors unaltered and to replace the remaining one with the unit vector perpendicular to the first within the tangent plane. For example

\vec{ε}_1 = \frac{1}{|\vec{e}_1|}\vec{e}_1 = \frac{1}{\sqrt{g_{11}}}\vec{e}_1, \quad \vec{ε}_2 ⊥ \vec{ε}_1,
\qquad
\vec{ε}\,'_1 = \frac{1}{|\vec{e}\,'_1|}\vec{e}\,'_1 = \frac{1}{\sqrt{g'_{11}}}\vec{e}\,'_1, \quad \vec{ε}\,'_2 ⊥ \vec{ε}\,'_1.   (203)

For this choice the projection of \vec{e}_2 on the direction of \vec{e}_1 is given by \vec{e}_p = \frac{\vec{e}_1 \cdot \vec{e}_2}{|\vec{e}_1||\vec{e}_2|}\,|\vec{e}_2|\,\frac{\vec{e}_1}{|\vec{e}_1|} = \frac{g_{12}}{g_{11}}\vec{e}_1, and its projection on the direction perpendicular to \vec{e}_1 is given by \vec{e}_⊥ = \vec{e}_2 - \vec{e}_p = \vec{e}_2 - \frac{g_{12}}{g_{11}}\vec{e}_1. Therefore \vec{ε}_2 = \frac{1}{|\vec{e}_⊥|}\vec{e}_⊥, and finally

\vec{ε}_1 = \frac{1}{\sqrt{g_{11}}}\vec{e}_1,
\qquad
\vec{ε}_2 = -\frac{g_{12}}{\sqrt{g_{11}}\sqrt{g}}\vec{e}_1 + \frac{\sqrt{g_{11}}}{\sqrt{g}}\vec{e}_2,   (204)

with similar relations for the epoch t'. The deformation gradient F = T'^{-1} F_u T can be computed from the corresponding transformation matrices

T = \begin{bmatrix} \dfrac{1}{\sqrt{g_{11}}} & -\dfrac{g_{12}}{\sqrt{g_{11}}\sqrt{g}} \\ 0 & \dfrac{\sqrt{g_{11}}}{\sqrt{g}} \end{bmatrix},
\qquad
T' = \begin{bmatrix} \dfrac{1}{\sqrt{g'_{11}}} & -\dfrac{g'_{12}}{\sqrt{g'_{11}}\sqrt{g'}} \\ 0 & \dfrac{\sqrt{g'_{11}}}{\sqrt{g'}} \end{bmatrix},
\qquad
T'^{-1} = \begin{bmatrix} \sqrt{g'_{11}} & \dfrac{g'_{12}}{\sqrt{g'_{11}}} \\ 0 & \dfrac{\sqrt{g'}}{\sqrt{g'_{11}}} \end{bmatrix},   (205)

where g_{ij}, g'_{ij} are the elements of the metric matrices G, G', respectively, while g = det G and g' = det G'. A special choice of curvilinear coordinates is that of the so-called convective coordinates u' = u [23], where the points at epoch t' are identified by their coordinates at epoch t. In this case F_u = I and F = T'^{-1} T. If the base transformation matrices (201) resulting from the diagonalization (200) of the metric matrices are used, then F = M'^{1/2} U'^T U M^{-1/2}, which, setting U = R(ζ), U' = R(ζ') and recalling that m_i and m'_i are the diagonal elements of M and M', respectively, takes the form

680

A. Dermanis

F = M'^{1/2} R(ζ - ζ') M^{-1/2} =
\begin{bmatrix}
\sqrt{\dfrac{m'_1}{m_1}}\cos(ζ - ζ') & \sqrt{\dfrac{m'_1}{m_2}}\sin(ζ - ζ') \\
-\sqrt{\dfrac{m'_2}{m_1}}\sin(ζ - ζ') & \sqrt{\dfrac{m'_2}{m_2}}\cos(ζ - ζ')
\end{bmatrix}.   (206)

When the matrices T and T' are provided following the scheme described by Eqs. (203), (204), and (205), the deformation gradient matrix becomes

F = T'^{-1} T =
\begin{bmatrix}
\sqrt{\dfrac{g'_{11}}{g_{11}}} & \dfrac{g'_{12}\,g_{11} - g_{12}\,g'_{11}}{\sqrt{g'_{11}}\sqrt{g_{11}}\sqrt{g}} \\
0 & \sqrt{\dfrac{g'\,g_{11}}{g\,g'_{11}}}
\end{bmatrix}.   (207)

We will briefly outline two possible applications: one in small-scale geodetic engineering deformation projects [13], and one for the deformation of the actual surface of the earth [see e.g. 1, 18, 25, 26, 37]. In an engineering application a surface-like construction is surveyed at two epochs t, t' for the determination of the coordinates of a set of points x_i = [X_i Y_i Z_i]^T, x'_i = [X'_i Y'_i Z'_i]^T. The discrete displacements δx_i = x'_i - x_i = [U_i V_i W_i]^T are interpolated as functions of the u = [X Y]^T coordinates, in order to obtain displacements δx = [U V W]^T at any desired point x, and in particular their derivatives U_X = ∂U/∂X, U_Y = ∂U/∂Y, V_X = ∂V/∂X, V_Y = ∂V/∂Y, W_X = ∂W/∂X, W_Y = ∂W/∂Y. In addition, the Z_i coordinates are interpolated as functions of u = [X Y]^T, in order to obtain coordinates Z at any desired point x, and in particular their derivatives Z_X = ∂Z/∂X and Z_Y = ∂Z/∂Y. The points on the surface are identified at both epochs by their coordinates u = [X Y]^T at the first epoch. The use of convective coordinates (u' = u) simplifies the deformation analysis. The partial derivative matrices

\frac{\partial \vec{x}}{\partial u} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ Z_X & Z_Y \end{bmatrix},
\qquad
\frac{\partial \vec{x}\,'}{\partial u'} = \frac{\partial \vec{x}\,'}{\partial u} = \begin{bmatrix} 1 + U_X & U_Y \\ V_X & 1 + V_Y \\ Z_X + W_X & Z_Y + W_Y \end{bmatrix}   (208)

provide the metric matrices at the two epochs using Eq. (196). The matrices T and T' are provided either through diagonalization, applying Eqs. (200) and (201), or following the scheme described by Eqs. (203), (204) and (205). Once F has been derived using either Eq. (206) or (207), deformation analysis is carried out as described in Sect. 3. We do not include here the variation of the surface normal and the computation of parameters referring to the bending of the surface, which are covered in [13].
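The engineering pipeline above, from interpolated derivatives to the two metric matrices, can be sketched as follows. This is a minimal illustration, not the chapter's software; all numerical derivative values are placeholders standing in for interpolation results:

```python
import numpy as np

def surface_metrics(ZX, ZY, UX, UY, VX, VY, WX, WY):
    """Partial-derivative matrices of Eq. (208) and metric matrices of
    Eq. (196), for convective coordinates u = [X Y]^T."""
    dx_du = np.array([[1.0, 0.0],
                      [0.0, 1.0],
                      [ZX,  ZY ]])
    dxp_du = np.array([[1.0 + UX, UY      ],
                       [VX,       1.0 + VY],
                       [ZX + WX,  ZY + WY ]])
    G  = dx_du.T  @ dx_du        # metric at epoch t
    Gp = dxp_du.T @ dxp_du       # metric at epoch t'
    return G, Gp

G, Gp = surface_metrics(ZX=0.1, ZY=0.05, UX=1e-4, UY=0.0,
                        VX=0.0, VY=-2e-4, WX=1e-5, WY=1e-5)
assert np.allclose(G, G.T) and np.allclose(Gp, Gp.T)
```

From G and Gp the analysis continues exactly as in Eqs. (200), (201), (202) or (203), (204), (205).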


The study of the deformation of the actual earth surface is quite similar, with one difference: the density of geodetic networks is very low, so that one cannot interpolate height information. Instead, the auxiliary use of a terrain model must be implemented. Points on the earth surface will be identified by their first two geodetic coordinates u = [λ ϕ]^T at both epochs t and t' (convective coordinates). The discrete displacements δx_i = x'_i - x_i = [U_i V_i W_i]^T are interpolated as functions of the u = [λ ϕ]^T coordinates, in order to obtain displacements δx = [U V W]^T at any desired point x, and in particular their derivatives U_λ = ∂U/∂λ, U_ϕ = ∂U/∂ϕ, V_λ = ∂V/∂λ, V_ϕ = ∂V/∂ϕ, W_λ = ∂W/∂λ, W_ϕ = ∂W/∂ϕ. Let q = [λ ϕ h]^T denote the geodetic coordinates, a the semi-major axis of the reference ellipsoid and e its eccentricity. The required partial derivative matrices are

\frac{\partial q}{\partial u} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ h_λ & h_ϕ \end{bmatrix},
\qquad
\frac{\partial δ\vec{x}}{\partial u} = \begin{bmatrix} δ\vec{x}_λ & δ\vec{x}_ϕ \end{bmatrix} = \begin{bmatrix} U_λ & U_ϕ \\ V_λ & V_ϕ \\ W_λ & W_ϕ \end{bmatrix},
\qquad
\frac{\partial \vec{x}}{\partial q} = R\,G_q^{1/2},   (209)

where R = R_3(-90° - λ)\,R_1(ϕ - 90°) [10],

G_q = \begin{bmatrix} (N+h)^2\cos^2 ϕ & 0 & 0 \\ 0 & (M+h)^2 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\qquad
M = \frac{a(1-e^2)}{(1-e^2\sin^2 ϕ)^{3/2}}, \qquad N = \frac{a}{\sqrt{1-e^2\sin^2 ϕ}},   (210)

is the diagonal metric matrix of the orthogonal curvilinear coordinates [10]. The derivatives h_λ = ∂h/∂λ, h_ϕ = ∂h/∂ϕ are computed with the help of a terrain model h(λ, ϕ). Use of the chain rule gives

\frac{\partial \vec{x}}{\partial u} = \frac{\partial \vec{x}}{\partial q}\frac{\partial q}{\partial u} = R\,G_q^{1/2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ h_λ & h_ϕ \end{bmatrix},   (211)

\frac{\partial \vec{x}\,'}{\partial u} = \frac{\partial \vec{x}}{\partial u} + \frac{\partial δ\vec{x}}{\partial u} = R\,G_q^{1/2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ h_λ & h_ϕ \end{bmatrix} + \begin{bmatrix} δ\vec{x}_λ & δ\vec{x}_ϕ \end{bmatrix},   (212)

G = \left(\frac{\partial \vec{x}}{\partial u}\right)^T \frac{\partial \vec{x}}{\partial u} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ h_λ & h_ϕ \end{bmatrix}^T G_q^{1/2} R^T R\,G_q^{1/2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ h_λ & h_ϕ \end{bmatrix} = \begin{bmatrix} (N+h)^2\cos^2 ϕ + h_λ^2 & h_λ h_ϕ \\ h_λ h_ϕ & (M+h)^2 + h_ϕ^2 \end{bmatrix},   (213)


G' = \left(\frac{\partial \vec{x}\,'}{\partial u}\right)^T \frac{\partial \vec{x}\,'}{\partial u} = G + D + D^T + \begin{bmatrix} U_λ^2 + V_λ^2 + W_λ^2 & U_λ U_ϕ + V_λ V_ϕ + W_λ W_ϕ \\ U_λ U_ϕ + V_λ V_ϕ + W_λ W_ϕ & U_ϕ^2 + V_ϕ^2 + W_ϕ^2 \end{bmatrix},   (214)

with

D = \begin{bmatrix} 1 & 0 & h_λ \\ 0 & 1 & h_ϕ \end{bmatrix} G_q^{1/2} R^T \begin{bmatrix} U_λ & U_ϕ \\ V_λ & V_ϕ \\ W_λ & W_ϕ \end{bmatrix}.   (215)
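The ellipsoidal ingredients of Eqs. (210) and (213) are straightforward to evaluate. The following sketch is illustrative only; GRS80-like constants and invented terrain-model derivatives are assumed:

```python
import numpy as np

def ellipsoid_radii(phi, a=6378137.0, e2=0.00669438002290):
    """Meridian (M) and prime-vertical (N) radii of curvature, Eq. (210)."""
    s2 = np.sin(phi) ** 2
    N = a / np.sqrt(1.0 - e2 * s2)
    M = a * (1.0 - e2) / (1.0 - e2 * s2) ** 1.5
    return M, N

def metric_G(phi, h, h_lam, h_phi):
    """Metric matrix G of Eq. (213) from terrain-model derivatives."""
    M, N = ellipsoid_radii(phi)
    return np.array([[((N + h) * np.cos(phi)) ** 2 + h_lam ** 2, h_lam * h_phi],
                     [h_lam * h_phi, (M + h) ** 2 + h_phi ** 2]])

# evaluation at an illustrative point (40° latitude, 500 m height)
G = metric_G(phi=np.radians(40.0), h=500.0, h_lam=20.0, h_phi=-35.0)
```

The matrix G' then follows from Eqs. (214) and (215) once the displacement derivatives have been interpolated.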

Another possibility is to interpolate the geodetic coordinate differences δq = [δλ δϕ δh]^T and compute their derivatives δq_λ = ∂δq/∂λ = [δλ_λ δϕ_λ δh_λ]^T, δq_ϕ = ∂δq/∂ϕ = [δλ_ϕ δϕ_ϕ δh_ϕ]^T. The additionally required partial derivative matrices are

\frac{\partial \vec{x}\,'}{\partial q'} = R'\,G_q'^{1/2},
\qquad
\frac{\partial δq}{\partial u} = \begin{bmatrix} δλ_λ & δλ_ϕ \\ δϕ_λ & δϕ_ϕ \\ δh_λ & δh_ϕ \end{bmatrix},   (216)

with R', G'_q computed now at the displaced position λ' = λ + δλ, ϕ' = ϕ + δϕ, h' = h + δh, and

\frac{\partial \vec{x}}{\partial u} = \frac{\partial \vec{x}}{\partial q}\frac{\partial q}{\partial u},   (217)

\frac{\partial \vec{x}\,'}{\partial u'} = \frac{\partial \vec{x}\,'}{\partial u} = \frac{\partial \vec{x}\,'}{\partial q'}\frac{\partial q'}{\partial u} = \frac{\partial \vec{x}\,'}{\partial q'}\frac{\partial (q + δq)}{\partial u} = \frac{\partial \vec{x}\,'}{\partial q'}\left(\frac{\partial q}{\partial u} + \frac{\partial δq}{\partial u}\right).   (218)

These allow the computation of the metric matrices

G = \left(\frac{\partial \vec{x}}{\partial u}\right)^T \frac{\partial \vec{x}}{\partial u},
\qquad
G' = \left(\frac{\partial \vec{x}\,'}{\partial u}\right)^T \frac{\partial \vec{x}\,'}{\partial u}.   (219)

From the matrices G, G' computed in either approach, the deformation gradient matrix F is computed as in the general case, and it forms the basis for deformation analysis as in the planar case of Sect. 3. We have restricted the discussion above to the computation of strain parameters referring to the comparison of the surface shape at two epochs. For the computation of surface strain rate parameters, utilizing three-dimensional coordinates and velocities of discrete network stations at the same epoch, the procedure is completely analogous. All we have to do is to replace the displacements U, V, W with the


velocity components v_X, v_Y, v_Z, or the geodetic coordinate differences δλ, δϕ, δh with their time derivatives λ̇, ϕ̇, ḣ.

9 Deformation Analysis on the Reference Ellipsoid

When a large area is studied, it is not possible to perform the horizontal deformation analysis on a plane, whether this is an actual horizontal plane or the plane of a map projection, where actual deformation is mixed with cartographic deformation. In large-area (or even global) applications, the horizontal plane must be replaced with the "horizontal" surface of the reference ellipsoid, following the classical horizontal-vertical separation in geodesy. With this separation, deformation is not studied on the physical earth surface, where observations are necessarily restricted, but on its pointwise projection onto the horizontal surface of the reference ellipsoid, similarly to the case of classical planar deformation analysis, where the projection refers to the horizontal plane. As in the classical case, vertical motions (now referring to the normal to the ellipsoid) are ignored, in the sense that they remain the subject of a separate analysis that aims at identifying and interpreting different geophysical processes. Again, we have two types of deformation studies. The first compares two epochs for which the geodetic coordinates of discrete control points are known, to obtain strain parameters at any desired point. The second is based on continuous monitoring at permanent control stations, which allows the determination of initial-epoch geodetic coordinates and their constant velocities, which are used for obtaining strain rate parameters. We will present in detail only the first situation, since the second one is completely analogous. Let λ_i, ϕ_i, h_i be the geodetic coordinates of each station P_i at epoch t and λ'_i, ϕ'_i, h'_i the ones at epoch t'. An interpolation procedure, applied to the differences δλ_i = λ'_i - λ_i, δϕ_i = ϕ'_i - ϕ_i, allows the determination of the horizontal geodetic coordinates u' = [λ' ϕ']^T = [λ + δλ, ϕ + δϕ]^T at epoch t' of any desired point P on the reference ellipsoid, which is identified through its horizontal geodetic coordinates u = [λ ϕ]^T at epoch t. Accordingly, the interpolation allows the determination of the derivatives within the ellipsoidal deformation gradient matrix

F_u = \begin{bmatrix} \dfrac{\partial λ'}{\partial λ} & \dfrac{\partial λ'}{\partial ϕ} \\ \dfrac{\partial ϕ'}{\partial λ} & \dfrac{\partial ϕ'}{\partial ϕ} \end{bmatrix}.   (220)

Applying the theory of surface deformation developed in the previous section, we need to form the surface metric matrices at the two epochs

G = \left(\frac{\partial \vec{x}}{\partial u}\right)^T \frac{\partial \vec{x}}{\partial u},
\qquad
G' = \left(\frac{\partial \vec{x}\,'}{\partial u'}\right)^T \frac{\partial \vec{x}\,'}{\partial u'},   (221)


where \vec{x} and \vec{x}\,' are the three-dimensional Cartesian coordinates of the station projection on the reference ellipsoid at the epochs t and t', respectively. Utilizing the well-known relations

\vec{x} = \begin{bmatrix} N\cos ϕ \cos λ \\ N\cos ϕ \sin λ \\ N(1-e^2)\sin ϕ \end{bmatrix},   (222)

\frac{\partial}{\partial ϕ}(N\cos ϕ) = -M\sin ϕ, \qquad \frac{\partial}{\partial ϕ}\left[N(1-e^2)\sin ϕ\right] = M\cos ϕ,   (223)

it is easy to derive the desired metric matrices

G = \begin{bmatrix} N^2\cos^2 ϕ & 0 \\ 0 & M^2 \end{bmatrix},
\qquad
G' = \begin{bmatrix} N'^2\cos^2 ϕ' & 0 \\ 0 & M'^2 \end{bmatrix}.   (224)

Since these are already diagonal, as a consequence of the orthogonality of the geodetic coordinates, the matrices of transformation to orthonormal bases (from the already orthogonal ones) are given by

T = G^{-1/2} = \begin{bmatrix} \dfrac{1}{N\cos ϕ} & 0 \\ 0 & \dfrac{1}{M} \end{bmatrix},
\qquad
T' = G'^{-1/2} = \begin{bmatrix} \dfrac{1}{N'\cos ϕ'} & 0 \\ 0 & \dfrac{1}{M'} \end{bmatrix}.   (225)

Finally, the desired surface deformation gradient, which is the basis for the deformation analysis, is given by F = T'^{-1} F_u T, or explicitly

F = \begin{bmatrix} \dfrac{N'\cos ϕ'}{N\cos ϕ}\dfrac{\partial λ'}{\partial λ} & \dfrac{N'\cos ϕ'}{M}\dfrac{\partial λ'}{\partial ϕ} \\ \dfrac{M'}{N\cos ϕ}\dfrac{\partial ϕ'}{\partial λ} & \dfrac{M'}{M}\dfrac{\partial ϕ'}{\partial ϕ} \end{bmatrix}.   (226)

Utilizing the above deformation gradient matrix, deformation analysis can be performed as in the planar case. In the case of permanent geodetic stations, where coordinate time series x_i(t_k), k = 1, 2, ..., m, are computed, these can be converted to geodetic time series λ_i(t_k), ϕ_i(t_k), h_i(t_k). Ignoring the height components, we may apply a linear regression to the horizontal time series according to the model λ_i(t) = λ_{0i} + (t - t_0)λ̇_i, ϕ_i(t) = ϕ_{0i} + (t - t_0)ϕ̇_i to obtain least squares estimates of the initial horizontal coordinates λ_{0i}, ϕ_{0i} and their constant rates λ̇_i, ϕ̇_i. An interpolation scheme will provide the rates λ̇, ϕ̇ at any desired point P, identified by its initial epoch coordinates λ_0, ϕ_0, as well as the derivatives within the ellipsoidal velocity gradient matrix

L_u = \begin{bmatrix} \dfrac{\partial \dot{λ}}{\partial λ} & \dfrac{\partial \dot{λ}}{\partial ϕ} \\ \dfrac{\partial \dot{ϕ}}{\partial λ} & \dfrac{\partial \dot{ϕ}}{\partial ϕ} \end{bmatrix}.   (227)

The required velocity gradient matrix with respect to orthonormal bases is given by L = T'^{-1} L_u T, or explicitly

L = \begin{bmatrix} \dfrac{N'\cos ϕ'}{N\cos ϕ}\dfrac{\partial \dot{λ}}{\partial λ} & \dfrac{N'\cos ϕ'}{M}\dfrac{\partial \dot{λ}}{\partial ϕ} \\ \dfrac{M'}{N\cos ϕ}\dfrac{\partial \dot{ϕ}}{\partial λ} & \dfrac{M'}{M}\dfrac{\partial \dot{ϕ}}{\partial ϕ} \end{bmatrix}.   (228)
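Equations (226) and (228) share the same structure, so a single routine covers both. The following numpy sketch (not from the chapter; GRS80-like constants and the derivative values are illustrative placeholders) evaluates Eq. (226) from interpolated derivatives of Eq. (220):

```python
import numpy as np

def ellipsoidal_F(phi, phi_p, Fu, a=6378137.0, e2=0.00669438002290):
    """Eq. (226): F = T'^{-1} Fu T, using the diagonal metrics of Eq. (224)
    and the transformation matrices of Eq. (225)."""
    def MN(p):
        s2 = np.sin(p) ** 2
        M = a * (1.0 - e2) / (1.0 - e2 * s2) ** 1.5
        N = a / np.sqrt(1.0 - e2 * s2)
        return M, N
    M, N = MN(phi)
    Mp, Np = MN(phi_p)
    T = np.diag([1.0 / (N * np.cos(phi)), 1.0 / M])   # Eq. (225)
    Tp_inv = np.diag([Np * np.cos(phi_p), Mp])        # (T')^{-1}
    return Tp_inv @ Fu @ T

# illustrative derivatives of Eq. (220), close to the identity mapping
Fu = np.array([[1.0 + 3e-7, 1e-7],
               [2e-7,       1.0 - 1e-7]])
F = ellipsoidal_F(np.radians(38.0), np.radians(38.0) + 5e-7, Fu)
```

For strain rates, the same call with L_u of Eq. (227) in place of F_u yields L of Eq. (228).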

Interpolation with stochastic prediction can be carried out in a sort of spherical approximation, which is necessary in order to introduce the isotropy assumption and thus use covariance functions C_λ̇(ψ), C_ϕ̇(ψ), C_λ̇ϕ̇(ψ), which are functions of the spherical distance ψ between two points. Decorrelation is also possible, involving a rotation by an angle Φ around the normal to the sphere at a central point within the area under study. The computation of spherical coordinates λ̃, ϕ̃ in the "rotated" reference system is in this case somewhat more complicated, and the same holds true for the back-transformation of the interpolated derivatives of λ̇̃, ϕ̇̃ to the corresponding ones of λ̇, ϕ̇ in the original system. Nevertheless, simple decorrelation is still possible if the geometric interpretation of the rotation is abandoned and one uses a rotation-like transformation

\begin{bmatrix} \dot{\tilde{λ}}(λ,ϕ) \\ \dot{\tilde{ϕ}}(λ,ϕ) \end{bmatrix} = \begin{bmatrix} \cos Φ & \sin Φ \\ -\sin Φ & \cos Φ \end{bmatrix} \begin{bmatrix} \dot{λ}(λ,ϕ) \\ \dot{ϕ}(λ,ϕ) \end{bmatrix} = R(Φ) \begin{bmatrix} \dot{λ}(λ,ϕ) \\ \dot{ϕ}(λ,ϕ) \end{bmatrix},   (229)

which replaces the two random fields of interest λ̇, ϕ̇ with two equivalent random fields λ̇̃ = cos Φ λ̇ + sin Φ ϕ̇ and ϕ̇̃ = -sin Φ λ̇ + cos Φ ϕ̇. The angle

Φ = \frac{1}{2}\arctan\frac{2\hat{C}_{\dot{λ}\dot{ϕ}}(0)}{\hat{C}_{\dot{λ}}(0) - \hat{C}_{\dot{ϕ}}(0)}

is determined from the diagonalization

Cˆ λ˙ (0) Cˆ ˙ (0) λϕ˙

Cˆ λ˙ ϕ˙ (0) Cˆ ϕ˙ (0)

3

4

Cˆ λ˜˙ (0) = R(−') 0

3 0 R('), Cˆ ϕ˙˜ (0)

(230)

which makes Cˆ λ˜˙ ϕ˜˙ (0) = 0. If Cˆ λ˜˙ ϕ˜˙ (ψ) remains negligibly small within the whole relevant range of ψ, then decorrelation can be realized. The data λ˙ i , ϕ˙ i are 4 3   ˜λ˙ λ˙ i transformed into pseudo-data = R(') i , which form the basis for the ϕ˙i ϕ˙˜i


prediction of the values ∂λ̇̃/∂λ, ∂λ̇̃/∂ϕ, ∂ϕ̇̃/∂λ, ∂ϕ̇̃/∂ϕ at any desired point with coordinates λ, ϕ. From the obvious inverse relations [λ̇ ϕ̇]^T = R(-Φ)[λ̇̃ ϕ̇̃]^T, we may then compute the required contributions to the velocity gradient matrix L_u using

\begin{bmatrix} \dfrac{\partial \dot{λ}}{\partial λ} & \dfrac{\partial \dot{λ}}{\partial ϕ} \\ \dfrac{\partial \dot{ϕ}}{\partial λ} & \dfrac{\partial \dot{ϕ}}{\partial ϕ} \end{bmatrix} = R(-Φ) \begin{bmatrix} \dfrac{\partial \dot{\tilde{λ}}}{\partial λ} & \dfrac{\partial \dot{\tilde{λ}}}{\partial ϕ} \\ \dfrac{\partial \dot{\tilde{ϕ}}}{\partial λ} & \dfrac{\partial \dot{\tilde{ϕ}}}{\partial ϕ} \end{bmatrix}.   (231)

In addition to the removal of a rotation around an Euler pole, in order to further substantiate the zero-mean hypothesis, an additional linear trend removal precedes both the sample covariance determination and the prediction, so that the determination of reasonable sample covariance functions is possible. The predicted contributions to L_u must be added to the ones coming from the linear trend, i.e. the values \hat{F}^0_{11}, \hat{F}^0_{12}, \hat{F}^0_{21}, \hat{F}^0_{22} estimated from the linear regression

\begin{bmatrix} \dot{λ}_i \\ \dot{ϕ}_i \end{bmatrix} = \begin{bmatrix} F^0_{11} & F^0_{12} \\ F^0_{21} & F^0_{22} \end{bmatrix} \begin{bmatrix} λ_i \\ ϕ_i \end{bmatrix} + \begin{bmatrix} c_{\dot{λ}} \\ c_{\dot{ϕ}} \end{bmatrix} + \begin{bmatrix} e_{\dot{λ}} \\ e_{\dot{ϕ}} \end{bmatrix} = \begin{bmatrix} λ_i & ϕ_i & 0 & 0 & 1 & 0 \\ 0 & 0 & λ_i & ϕ_i & 0 & 1 \end{bmatrix} \begin{bmatrix} F^0_{11} \\ F^0_{12} \\ F^0_{21} \\ F^0_{22} \\ c_{\dot{λ}} \\ c_{\dot{ϕ}} \end{bmatrix} + \begin{bmatrix} e_{\dot{λ}} \\ e_{\dot{ϕ}} \end{bmatrix}.   (232)

A slightly different approach is to use a linear trend removal only for the determination of the sample covariance matrices, and to use instead a simultaneous prediction and linear trend determination on the basis of the mixed linear model, as explained for the planar case. From here on we drop, for the sake of simplicity, the tilde, with the understanding that all quantities refer to the decorrelated observed and related random fields. In addition to the two sample covariance functions \hat{C}_{\dot{λ}}, \hat{C}_{\dot{ϕ}} (modeled by appropriate positive-definite functions), we also need to determine the cross-covariances between the observed (λ̇, ϕ̇) and the predicted (λ̇_λ = ∂λ̇/∂λ, λ̇_ϕ = ∂λ̇/∂ϕ, ϕ̇_λ = ∂ϕ̇/∂λ, ϕ̇_ϕ = ∂ϕ̇/∂ϕ) random fields (elements of C_{s's}), namely

\hat{C}_{\dot{λ}_λ,\dot{λ}} = \frac{\partial \hat{C}_{\dot{λ}}}{\partial λ}, \quad
\hat{C}_{\dot{λ}_ϕ,\dot{λ}} = \frac{\partial \hat{C}_{\dot{λ}}}{\partial ϕ}, \quad
\hat{C}_{\dot{ϕ}_λ,\dot{λ}} = \frac{\partial \hat{C}_{\dot{λ}\dot{ϕ}}}{\partial λ} = 0, \quad
\hat{C}_{\dot{ϕ}_ϕ,\dot{λ}} = \frac{\partial \hat{C}_{\dot{λ}\dot{ϕ}}}{\partial ϕ} = 0,   (233)

\hat{C}_{\dot{λ}_λ,\dot{ϕ}} = \frac{\partial \hat{C}_{\dot{λ}\dot{ϕ}}}{\partial λ} = 0, \quad
\hat{C}_{\dot{λ}_ϕ,\dot{ϕ}} = \frac{\partial \hat{C}_{\dot{λ}\dot{ϕ}}}{\partial ϕ} = 0, \quad
\hat{C}_{\dot{ϕ}_λ,\dot{ϕ}} = \frac{\partial \hat{C}_{\dot{ϕ}}}{\partial λ}, \quad
\hat{C}_{\dot{ϕ}_ϕ,\dot{ϕ}} = \frac{\partial \hat{C}_{\dot{ϕ}}}{\partial ϕ},   (234)

derived by applying the laws of covariance propagation. Computation of the covariance matrix of the prediction errors requires also the covariances between all the predicted random fields (elements of C_{s'}), namely

\hat{C}_{\dot{λ}_λ,\dot{λ}_λ} = \frac{\partial^2 \hat{C}_{\dot{λ}}}{\partial λ^2}, \quad
\hat{C}_{\dot{λ}_λ,\dot{λ}_ϕ} = \frac{\partial^2 \hat{C}_{\dot{λ}}}{\partial λ\,\partial ϕ}, \quad
\hat{C}_{\dot{λ}_λ,\dot{ϕ}_λ} = \frac{\partial^2 \hat{C}_{\dot{λ}\dot{ϕ}}}{\partial λ^2} = 0, \quad
\hat{C}_{\dot{λ}_λ,\dot{ϕ}_ϕ} = \frac{\partial^2 \hat{C}_{\dot{λ}\dot{ϕ}}}{\partial λ\,\partial ϕ} = 0;

\hat{C}_{\dot{λ}_ϕ,\dot{λ}_ϕ} = \frac{\partial^2 \hat{C}_{\dot{λ}}}{\partial ϕ^2}, \quad
\hat{C}_{\dot{λ}_ϕ,\dot{ϕ}_λ} = \frac{\partial^2 \hat{C}_{\dot{λ}\dot{ϕ}}}{\partial ϕ\,\partial λ} = 0, \quad
\hat{C}_{\dot{λ}_ϕ,\dot{ϕ}_ϕ} = \frac{\partial^2 \hat{C}_{\dot{λ}\dot{ϕ}}}{\partial ϕ^2} = 0;

\hat{C}_{\dot{ϕ}_λ,\dot{ϕ}_λ} = \frac{\partial^2 \hat{C}_{\dot{ϕ}}}{\partial λ^2}, \quad
\hat{C}_{\dot{ϕ}_λ,\dot{ϕ}_ϕ} = \frac{\partial^2 \hat{C}_{\dot{ϕ}}}{\partial ϕ\,\partial λ}, \quad
\hat{C}_{\dot{ϕ}_ϕ,\dot{ϕ}_ϕ} = \frac{\partial^2 \hat{C}_{\dot{ϕ}}}{\partial ϕ^2}.   (235)

We can distinguish between two types of covariance models: the ones depending on the spatial distance

r = 2R\sin\frac{ψ}{2} = R\sqrt{2 - 2t}   (236)

between two points on the surface of the approximating sphere with radius R, and the ones which depend on

t ≡ \cos ψ = \sin ϕ \sin ϕ' + \cos ϕ \cos ϕ' \cos(λ - λ').   (237)

For the first case the required derivatives of covariance functions of the type C(r) are computed according to the rules

\frac{\partial \hat{C}}{\partial λ} = -\frac{R^2}{r}\frac{\partial t}{\partial λ}\frac{\partial \hat{C}}{\partial r}, \qquad
\frac{\partial \hat{C}}{\partial ϕ} = -\frac{R^2}{r}\frac{\partial t}{\partial ϕ}\frac{\partial \hat{C}}{\partial r},   (238)

\frac{\partial^2 \hat{C}}{\partial λ^2} = \frac{R^4}{r^2}\left(\frac{\partial t}{\partial λ}\right)^2 \frac{\partial^2 \hat{C}}{\partial r^2} - \frac{R^2}{r}\left[\frac{R^2}{r^2}\left(\frac{\partial t}{\partial λ}\right)^2 + \frac{\partial^2 t}{\partial λ^2}\right]\frac{\partial \hat{C}}{\partial r},   (239)

\frac{\partial^2 \hat{C}}{\partial λ\,\partial ϕ} = \frac{R^4}{r^2}\frac{\partial t}{\partial λ}\frac{\partial t}{\partial ϕ}\frac{\partial^2 \hat{C}}{\partial r^2} - \frac{R^2}{r}\left[\frac{R^2}{r^2}\frac{\partial t}{\partial λ}\frac{\partial t}{\partial ϕ} + \frac{\partial^2 t}{\partial λ\,\partial ϕ}\right]\frac{\partial \hat{C}}{\partial r},   (240)

\frac{\partial^2 \hat{C}}{\partial ϕ^2} = \frac{R^4}{r^2}\left(\frac{\partial t}{\partial ϕ}\right)^2 \frac{\partial^2 \hat{C}}{\partial r^2} - \frac{R^2}{r}\left[\frac{R^2}{r^2}\left(\frac{\partial t}{\partial ϕ}\right)^2 + \frac{\partial^2 t}{\partial ϕ^2}\right]\frac{\partial \hat{C}}{\partial r},   (241)


where

\frac{\partial t}{\partial λ} = -\cos ϕ \cos ϕ' \sin(λ - λ'), \qquad
\frac{\partial t}{\partial ϕ} = \cos ϕ \sin ϕ' - \sin ϕ \cos ϕ' \cos(λ - λ'),

\frac{\partial^2 t}{\partial λ^2} = -\cos ϕ \cos ϕ' \cos(λ - λ'), \qquad
\frac{\partial^2 t}{\partial λ\,\partial ϕ} = \sin ϕ \cos ϕ' \sin(λ - λ'),

\frac{\partial^2 t}{\partial ϕ^2} = -\sin ϕ \sin ϕ' - \cos ϕ \cos ϕ' \cos(λ - λ').   (242)

For the second case the required derivatives of covariance functions of the type C(t) are computed according to the rules

\frac{\partial \hat{C}}{\partial λ} = \frac{\partial t}{\partial λ}\frac{\partial \hat{C}}{\partial t}, \qquad
\frac{\partial \hat{C}}{\partial ϕ} = \frac{\partial t}{\partial ϕ}\frac{\partial \hat{C}}{\partial t},

\frac{\partial^2 \hat{C}}{\partial λ^2} = \left(\frac{\partial t}{\partial λ}\right)^2 \frac{\partial^2 \hat{C}}{\partial t^2} + \frac{\partial^2 t}{\partial λ^2}\frac{\partial \hat{C}}{\partial t}, \qquad
\frac{\partial^2 \hat{C}}{\partial λ\,\partial ϕ} = \frac{\partial t}{\partial λ}\frac{\partial t}{\partial ϕ}\frac{\partial^2 \hat{C}}{\partial t^2} + \frac{\partial^2 t}{\partial λ\,\partial ϕ}\frac{\partial \hat{C}}{\partial t},

\frac{\partial^2 \hat{C}}{\partial ϕ^2} = \left(\frac{\partial t}{\partial ϕ}\right)^2 \frac{\partial^2 \hat{C}}{\partial t^2} + \frac{\partial^2 t}{\partial ϕ^2}\frac{\partial \hat{C}}{\partial t}.   (243)
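The chain rules above can be checked against a numerical derivative. The sketch below (not from the chapter) uses an illustrative Gaussian-type model C(t) = exp(-4(1 - t)) together with Eqs. (237), (242) and (243):

```python
import numpy as np

def t_of(lam, phi, lam2, phi2):
    """t = cos(psi) of Eq. (237)."""
    return (np.sin(phi) * np.sin(phi2)
            + np.cos(phi) * np.cos(phi2) * np.cos(lam - lam2))

C  = lambda t: np.exp(-4.0 * (1.0 - t))      # illustrative C(t) model
dC = lambda t: 4.0 * np.exp(-4.0 * (1.0 - t))

lam, phi, lam2, phi2 = 0.40, 0.70, 0.43, 0.68   # radians, illustrative
t = t_of(lam, phi, lam2, phi2)
dt_dlam = -np.cos(phi) * np.cos(phi2) * np.sin(lam - lam2)   # Eq. (242)
analytic = dt_dlam * dC(t)                                   # Eq. (243)

# central finite difference for comparison
eps = 1e-6
numeric = (C(t_of(lam + eps, phi, lam2, phi2))
           - C(t_of(lam - eps, phi, lam2, phi2))) / (2.0 * eps)
assert abs(analytic - numeric) < 1e-8
```

The second derivatives of Eq. (243), and the C(r)-type rules of Eqs. (238) to (241), can be verified in the same way.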

An interesting question is whether we can avoid the linear regressions of the geodetic coordinates λ_i(t_k), ϕ_i(t_k), k = 1, 2, ..., m, used to estimate the initial values λ_{0i} = λ(t_0), ϕ_{0i} = ϕ(t_0) and the constant derivatives λ̇_i, ϕ̇_i, which are the input data for the interpolation-prediction, and interpolate instead the horizontal velocity components already available from the regression of the Cartesian coordinates, known as stacking. Stacking has the great advantage that it takes care of the fact that the Cartesian coordinates x_i(t_k) at each epoch may refer to a different reference system, both by choice and through the effect of systematic errors. The Cartesian components of the constant velocities \vec{v}_i = d\vec{x}_i/dt = (∂\vec{x}_i/∂λ)λ̇ + (∂\vec{x}_i/∂ϕ)ϕ̇ + (∂\vec{x}_i/∂h)ḣ provided by stacking are converted to the local geodetic reference system (East, North, Height), and the resulting components relate to the time derivatives λ̇, ϕ̇, ḣ through

v_i^E = (N_i + h_i)\cos ϕ_i\, \dot{λ}_i, \qquad
v_i^N = (M_i + h_i)\, \dot{ϕ}_i, \qquad
v_i^H = \dot{h}_i.   (244)

The two horizontal components v_i^E, v_i^N themselves are not appropriate for interpolation, because they do not refer to the reference ellipsoid but to the earth surface at a different height h_i for each station. They can be used, though, for the determination of the derivatives


\dot{λ}_i = \frac{v_i^E}{(N_i + h_i)\cos ϕ_i}, \qquad
\dot{ϕ}_i = \frac{v_i^N}{M_i + h_i},   (245)

which are required for the interpolation necessary for deformation analysis. This means that when the objective is the interpolation of horizontal velocities, and not deformation analysis, one has two choices. The first is to interpolate the above λ̇_i, ϕ̇_i to obtain λ̇, ϕ̇ at any desired point (λ, ϕ) on the ellipsoid, and then convert them to horizontal velocity components v^E = N cos ϕ λ̇, v^N = M ϕ̇ on the reference ellipsoid. The second choice is to reduce the original station horizontal velocities to ones on the reference ellipsoid,

(v_i^E)_{h=0} = N_i \cos ϕ_i\, \dot{λ}_i = \frac{N_i}{N_i + h_i}\, v_i^E, \qquad
(v_i^N)_{h=0} = M_i\, \dot{ϕ}_i = \frac{M_i}{M_i + h_i}\, v_i^N,   (246)

and interpolate them directly to any desired point on the ellipsoid. An alternative to the interpolation of λ̇, ϕ̇ is to replace the random field λ̇ with the random field υ̇ = cos ϕ λ̇, in order to avoid indeterminacies in the computation of covariances and cross-covariances.
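The velocity conversions of Eqs. (244), (245) and (246) amount to a few lines of code. The following sketch is illustrative (GRS80-like constants and invented station velocities are assumed):

```python
import numpy as np

def rates_from_enu(vE, vN, phi, h, a=6378137.0, e2=0.00669438002290):
    """Eq. (245): horizontal coordinate rates from East/North velocities."""
    s2 = np.sin(phi) ** 2
    N = a / np.sqrt(1.0 - e2 * s2)
    M = a * (1.0 - e2) / (1.0 - e2 * s2) ** 1.5
    lam_dot = vE / ((N + h) * np.cos(phi))
    phi_dot = vN / (M + h)
    return lam_dot, phi_dot, M, N

# illustrative station: 2 cm/yr East, 1 cm/yr North, at h = 800 m, 38° latitude
phi = np.radians(38.0)
lam_dot, phi_dot, M, N = rates_from_enu(0.02, 0.01, phi, 800.0)

# reduction of the velocities to the ellipsoid, Eq. (246)
vE0 = N * np.cos(phi) * lam_dot      # = N/(N+h) * vE
vN0 = M * phi_dot                    # = M/(M+h) * vN
assert vE0 < 0.02 and vN0 < 0.01     # slightly reduced at h = 0
```

The reduced components (vE0, vN0) refer to the surface of the reference ellipsoid and can therefore be interpolated consistently between stations at different heights.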

10 Three-Dimensional Deformation

We have devoted our study here to two-dimensional deformation (on the plane or on a curved surface), despite the fact that the deforming earth is a three-dimensional body. As already explained, the main reason for separating horizontal from vertical deformation is their attribution to different geophysical causes. In addition, deformation analysis requires continuous knowledge of displacements in the neighborhood of all points of interest, and this requires interpolation of the obtained discrete geodetic data. As geodetic observations are naturally restricted to the surface of the earth, horizontal interpolation of observed displacements can be reasonably performed, and potentially improved by designing denser geodetic networks. On the contrary, to obtain three-dimensional information an extrapolation is needed, which is not justified in view of the lack of vertically repeated displacement data. Thus geodesy can only determine vertical motions and not deformation in the vertical direction. The only case where such data may be available is in mountainous regions, where e.g. stations on the top of a mountain and around it could be used to provide three-dimensional deformation within the mountain. An extension of the infinitesimal approximate approach has been proposed, using quadrilaterals as finite elements. We have already criticized the limitations of such a finite element approach. As an indeed extreme counterexample, consider the physically absurd but mathematically sound idea of using data on the tops of three adjacent mountains and a low point among them in order to compute the deformation of the particular quadrilateral! Nevertheless, we will discuss the three-dimensional case for purely theoretical reasons, to see whether a rigorous alternative to the classical approximate infinitesimal approach is possible. The deformation gradient F and the displacement gradient

690

A. Dermanis

J = F - I are in this case 3×3 matrices. In the approximate "infinitesimal" approach, J = E_{inf} + Ω is decomposed into the symmetric part E_{inf} = ½(J + J^T), which is an approximation to the strain matrix, and the antisymmetric part Ω = ½(J - J^T) = [ω×], having as axial vector the rotation vector ω = [ω_1 ω_2 ω_3]^T. Diagonalization of E_{inf} = R(θ)^T L R(θ) gives the directions of the principal axes, defined through R(θ), and the principal strains e_1 ≥ e_2 ≥ e_3 as the diagonal elements of the diagonal matrix L. The approximate relations of the planar case are then applied to each of the principal planes (1,2), (2,3), (1,3) to obtain the principal plane dilatations Δ_{12} = e_1 + e_2, Δ_{23} = e_2 + e_3, Δ_{13} = e_1 + e_3 and the principal plane shears γ_{12} = e_1 - e_2, γ_{23} = e_2 - e_3, γ_{13} = e_1 - e_3 = γ_{max}. In the rigorous approach the diagonalizations

C = F^T F = R(θ)^T Λ^2 R(θ), \qquad B = F F^T = R(θ')^T Λ^2 R(θ')   (247)

allow the singular value decomposition

F = R(θ')^T Λ R(θ),   (248)

where θ = [θ_1 θ_2 θ_3]^T and θ' = [θ'_1 θ'_2 θ'_3]^T are the respective rotation parameters, e.g. rotation angles around the three axes, and the diagonal elements λ_1 ≥ λ_2 ≥ λ_3 of the diagonal matrix Λ are the principal elongations. The rotation R(θ) brings the axes of the reference system at the initial epoch t into the directions of the three principal axes, while the rotation R(θ') brings the axes of the reference system at the second epoch t' into the directions of the principal axes. The principal axes are three mutually orthogonal directions, such that any point lying on one of them remains on the same axis after the deformation, but its distance from the origin changes by the factor λ_i corresponding to the axis. The strain matrix in this case accepts the diagonalization

E = \frac{1}{2}(F^T F - I) = R(θ)^T \frac{1}{2}(Λ^2 - I) R(θ),   (249)

where the diagonal elements of the diagonal matrix ½(Λ² - I) are the principal strains

e_1 = \frac{λ_1^2 - 1}{2}, \qquad e_2 = \frac{λ_2^2 - 1}{2}, \qquad e_3 = \frac{λ_3^2 - 1}{2}.   (250)

The planar deformation formulas can then be applied to compute, within each principal plane, the dilatations

Δ_{12} = λ_1 λ_2 - 1, \qquad Δ_{23} = λ_2 λ_3 - 1, \qquad Δ_{13} = λ_1 λ_3 - 1   (251)


and the shears

γ_{12} = \frac{λ_1 - λ_2}{\sqrt{λ_1 λ_2}}, \qquad
γ_{23} = \frac{λ_2 - λ_3}{\sqrt{λ_2 λ_3}}, \qquad
γ_{13} = \frac{λ_1 - λ_3}{\sqrt{λ_1 λ_3}}.   (252)

Let V = \vec{a}^T[\vec{b}×]\vec{c} be the volume of the parallelepiped with sides the vectors \vec{a}, \vec{b}, \vec{c}. A mapping F maps these vectors into the vectors \vec{a}' = F\vec{a}, \vec{b}' = F\vec{b}, \vec{c}' = F\vec{c}, which form a parallelepiped with volume V' = \vec{a}'^T[\vec{b}'×]\vec{c}'. Applying this to the scaled unit vectors μ\vec{i}_1, μ\vec{i}_2, μ\vec{i}_3 (I_{3×3} = [\vec{i}_1 \vec{i}_2 \vec{i}_3]), it is easy to verify that V'/V = det F. If we let μ → 0, then V → 0, and thus lim_{V→0} V'/V = det F. The volume dilatation is defined in analogy to the planar areal dilatation as

Δ = \lim_{V→0}\frac{V' - V}{V} = \lim_{V→0}\frac{V'}{V} - 1 = \det F - 1,

and since det F = det{R(θ')^T} det Λ det{R(θ)} = det Λ, it follows that

Δ = λ_1 λ_2 λ_3 - 1.   (253)

If the reference systems at the two epochs can be identified with the help of a rigid environment (as in the case of classical mechanics), then the rotation parameters Δθ defined through

R(Δθ) = R(θ')^T R(θ)   (254)

represent the rotational part of F, which does not contribute to deformation. For rotations close to the identity the approximations R(θ) ≈ I - [θ×], R(θ') ≈ I - [θ'×], R(Δθ) ≈ I - [Δθ×] hold, and it follows that Δθ ≈ θ - θ'.

11 Concluding Remarks

The separation of geodetic observational evidence described above, into the relative motion of independent tectonic units and their internal deformation, does not exhaust the present geodetic capabilities. It mostly fits the analysis of data provided by permanent GNSS stations, established mostly for mapping purposes. Their main limitation is due to their relatively low spatial resolution. Global networks from the various space geodetic techniques (GNSS, VLBI, SLR, DORIS) aiming at geophysical research, and in particular those contributing to the formulation of the International Terrestrial Reference Frame (ITRF), have even lower spatial resolution. Despite their highest possible quality in terms of observational accuracy and data processing techniques, they can provide only the general picture of global plate tectonics. The great challenge for geodesy in this field is the analysis of crustal motion in the bordering areas between independent tectonic units, which are associated with major faults, as well as in the areas around secondary faults


in the interior of the currently identified tectonic units. In these areas, the goal is to identify earthquake precursors, by acquiring knowledge of the type of spatial deformation that precedes major earthquakes. The difficulties in achieving such a goal are indeed very large, despite the fact that it is within the capabilities of present-day technology; they are mostly related to organization and cost issues. Cross-faulting techniques were the first widely used geodetic methodologies for detecting crustal motion. The first ones were based on preexisting networks and the local repetition of geodetic observations after the seismic event. As such, they were detecting relative motion across the fault and not deformation in the strict sense. The emergence of electronic distance measuring instruments allowed repeated campaigns on the same network, usually on an annual basis. The obtained results have suffered from low spatial resolution, which led to the uncritical adoption of a homogeneous deformation within each triangle of the network, or even over larger areas. This led to the determination of strain or strain rate parameters which were accompanied by over-optimistic measures of accuracy. Thus, despite the fact that this type of research was well received in its time, it has been silently abandoned, and present-day deformation analyses are almost exclusively based on data from existing mapping-dedicated permanent GNSS stations. The need for high spatial resolution, i.e. for networks that are dense enough to overcome the interpolation uncertainty, makes the relevant campaign cost rather prohibitive, especially when the accompanying need for high temporal resolution necessitates the use of permanent GNSS networks that are costly to maintain. The greatest difficulty, though, comes when one tries to answer the question of where to look.
As it is not possible to monitor all active fault areas in a highly seismic region, a choice of one or a few particular areas must be made, which are small enough to be covered by dense networks. Although seismologists can provide some guidelines, these are too general in terms of place and time to be of actual help. Nobody can predict where and when a large earthquake will hit next, so the choice to be made must rather be based on luck. To give an example, the Department of Geodesy and Surveying of the Aristotle University of Thessaloniki has been monitoring a seismic area for almost 40 years, always using the most accurate available techniques, but everything seems to be quiet there so far, with variations in the shape of the network within the observational noise. Does this mean that we should be pessimistic about the future contribution of geodesy to earthquake research? The desirable negative answer can be based on two factors. The first is the emergence of low-cost GNSS receivers and the second is the continuous improvement in instrumental accuracy and data analysis methodologies. Thus, we may be confident that in the future we will be able to determine, with a large signal-to-noise ratio, the deforming behavior of seismic areas in the time period preceding a large earthquake. Hopefully, these patterns will provide earthquake precursors that will be of practical use. For the sake of completeness, we refer to another contribution of geodesy to seismology, which came to be known as GNSS seismology [see e.g. 6, 7], and which is of pointwise rather than spatial nature. It is based on the use of GNSS data to determine displacements in very short intervals of time and hence

12 Geodetic Methods for Monitoring Crustal Motion and Deformation


velocities, which can be associated with acceleration data from accelerometers. There are two positive factors: the possibility to use very low-cost accelerometers [see e.g. 4] and the fact that GNSS seismology provides data referring to a more stable global reference system, while accelerometers provide data in their own internal reference system, which follows the rotations (tilts) caused by local crust deformation. The recorded data from such an instrumental combination that seem to be of practical use are restricted to a small time interval around the earthquake occurrence. Therefore, they contribute mostly to seismology, and not to earthquake prediction directly.

Acknowledgements This presentation largely uses material from an unpublished invited presentation [5]. Most of the conclusions here are based on practical experience gained with software developed by Professor Ludovico Biagi of the Politecnico di Milano, which covers all methods of deformation analysis. I am thankful to Professor Biagi for his contribution to this research and his overall support.

Appendix A: Unambiguous Diagonalization of a 2 × 2 Symmetric Matrix

From the well-known fact that symmetric matrices have real eigenvalues and orthonormal eigenvectors we may set

$$\mathbf{M} = \begin{bmatrix} a & b \\ b & d \end{bmatrix} = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix} \begin{bmatrix} l_1^2 & 0 \\ 0 & l_2^2 \end{bmatrix} \begin{bmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{bmatrix} = \mathbf{R}(-\alpha)\,\mathbf{L}^2\,\mathbf{R}(\alpha), \tag{A1}$$

$$l_1^2 = A + B, \qquad l_2^2 = A - B, \qquad l_1^2 \geq l_2^2, \tag{A2}$$

where

$$A = \frac{a+d}{2}, \qquad B = \frac{1}{2}\sqrt{(a-d)^2 + 4b^2}, \tag{A3}$$

while the angle α is determined in the correct quadrant from

$$\sin\alpha = \operatorname{sgn}(b)\sqrt{\frac{1-P}{2}}, \qquad \cos\alpha = \sqrt{\frac{1+P}{2}}, \tag{A4}$$

where

$$P = \frac{a-d}{2B}. \tag{A5}$$

The above formulas for the angle α avoid the ambiguity in the usual relation $\tan 2\alpha = \dfrac{2b}{a-d}$.
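A direct numerical transcription of (A1)–(A5) can serve as a check. The following Python sketch (the function name and the test values are ours, not from the text) recovers α in the correct quadrant and reproduces M from the decomposition:

```python
import math

def diagonalize_sym2x2(a, b, d):
    """M = [[a, b], [b, d]] -> (l1^2, l2^2, alpha) with
    M = R(-alpha) diag(l1^2, l2^2) R(alpha), per (A1)-(A5)."""
    A = (a + d) / 2.0
    B = 0.5 * math.sqrt((a - d) ** 2 + 4.0 * b ** 2)      # (A3)
    l1sq, l2sq = A + B, A - B                             # (A2)
    if B == 0.0:
        return l1sq, l2sq, 0.0        # isotropic case: alpha arbitrary, take 0
    P = (a - d) / (2.0 * B)                               # (A5)
    sin_a = math.copysign(math.sqrt((1.0 - P) / 2.0), b)  # (A4)
    cos_a = math.sqrt((1.0 + P) / 2.0)
    return l1sq, l2sq, math.atan2(sin_a, cos_a)

# check: rebuild M = R(-alpha) L^2 R(alpha) element by element
l1, l2, al = diagonalize_sym2x2(3.0, -1.0, 2.0)
c, s = math.cos(al), math.sin(al)
assert abs(c * c * l1 + s * s * l2 - 3.0) < 1e-12   # element a
assert abs(s * c * (l1 - l2) - (-1.0)) < 1e-12      # element b, with correct sign
assert abs(s * s * l1 + c * c * l2 - 2.0) < 1e-12   # element d
assert l1 >= l2
```

The sign carried by sgn(b) in (A4) is what removes the quadrant ambiguity of the tan 2α formula: the off-diagonal element rebuilt as sin α cos α (l₁² − l₂²) automatically gets the sign of b.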


A. Dermanis

Literature

1. Altiner, Y.: Geometrische Modellierung innerer und äußerer Deformationen der Erdoberfläche mit Anwendungen an der nordanatolischen Verwerfung und in der West-Türkei. Reihe C, Nr. 462, DGK, Frankfurt am Main (1996)
2. Barzaghi, R., Borghi, A., Sona, G.: New covariance models for local applications of collocation. In: Benciolini, B. (ed.) IV Hotine-Marussi Symposium on Mathematical Geodesy, Trento, 14–17 Sept 1998. IAG Symposia, vol. 122. Springer, Berlin/Heidelberg (1998)
3. Bencini, P., Dermanis, A., Livieratos, E., Rossikopoulos, D.: Crustal deformation at the Friuli area from discrete and continuous geodetic prediction techniques. Bollettino di Geodesia e Scienze Affini XLI(2), 137–148 (1982)
4. Benedetti, E., Dermanis, A., Crespi, M.: On the feasibility to integrate low-cost MEMS accelerometers and GNSS receivers. Adv. Space Res. 59, 2764–2778 (2017)
5. Biagi, L., Dermanis, A.: Crustal deformation analysis from permanent GPS networks. Invited presentation at the EGU General Assembly, Vienna, 20–24 Apr 2009
6. Branzanti, M., Benedetti, E., Colosimo, G., Mazzoni, A., Crespi, M.: Real-time monitoring of fast displacements with VADASE: new applications and challenges with Galileo. In: ENC-GNSS 2014 Proceedings (2014)
7. Colosimo, G.: VADASE: A Brand New Approach to Real-Time GNSS Seismology. Lap Lambert Academic Publishing, Saarbrücken (2013)
8. Danuser, G., Geiger, A., Müller, M.V.: Modellierung von Verschiebungs- und Verzerrungsfeldern. Bericht Nr. 218, Institut für Geodäsie und Photogrammetrie, ETH Zürich (1993)
9. Dermanis, A.: Kriging and collocation – a comparison. Manuscr. Geodaet. 9(3), 159–167 (1984)
10. Dermanis, A.: Coordinates and Reference Systems (in Greek). Ziti Publications, Thessaloniki (2005)
11. Dermanis, A.: The evolution of geodetic methods for the determination of strain parameters for Earth crust deformation. In: Arabelos, D., Contadakis, M., Kaltsikis, Ch., Spatalas, S. (eds.) Terrestrial and Stellar Environment, Volume in honor of Prof. G. Asteriadis, pp. 107–144. Publication of the School of Rural & Surveying Engineering, Aristotle University of Thessaloniki (2009)
12. Dermanis, A.: Kriging in the light of the theory of random field prediction. In: Yannopoulos, S. (ed.) HydroGaea, Volume in honor of Prof. Ch. Jimopoulos, pp. 447–456. Publication of the School of Rural & Surveying Engineering, Aristotle University of Thessaloniki (2009)
13. Dermanis, A.: Fundamentals of surface deformation and application to construction monitoring. J. Appl. Geomat. 3(1), 9–22 (2011)
14. Dermanis, A., Rummel, R.: Data analysis methods in geodesy. In: Dermanis, A., Gruen, A., Sanso, F. (eds.) Geomatic Methods for the Analysis of Data in Earth Sciences. Lecture Notes in Earth Sciences, vol. 95, pp. 17–92. Springer, Heidelberg (2000)
15. Dermanis, A., Livieratos, E., Rossikopoulos, D., Vlachos, D.: Geodetic prediction of crustal deformations at the seismic area of Volvi. In: Proceedings of the International Symposium "Geodetic Networks and Computations", Munich, 1981. Veröffentlichungen der Deutschen Geodätischen Kommission, Reihe B, Nr. 258/V, pp. 234–248 (1982)
16. El-Fiky, G.S.: Temporal change of the crustal deformation and interplate coupling in the Tohoku district, Northeast Japan – a new approach. Ph.D. Thesis, Faculty of Science, Tokyo University (1998)
17. El-Fiky, G.S., Kato, T., Fujii, Y.: Distribution of the vertical crustal movement rates in the Tohoku district, Japan, predicted by least-squares collocation. J. Geodesy 71, 432–442 (1997)
18. Grafarend, E.W., Voosoghi, B.: Intrinsic deformation analysis of the Earth's surface based on displacement fields derived from space geodetic measurements. Case studies: present-day deformation patterns of Europe and of the Mediterranean area (ITRF data sets). J. Geodesy 77, 303–326 (2003)
19. Hjelle, Ø., Dæhlen, M.: Triangulations and Applications. Springer, Berlin (2010)


20. Irgens, F.: Continuum Mechanics. Springer, Berlin (2008)
21. Kahle, H.-G., Müller, M.V., Geiger, A., Danuser, G., Mueller, S., Veis, G., Billiris, H., Paradissis, D.: The strain field in Northwestern Greece and the Ionian Islands: results inferred from GPS measurements. Tectonophysics 249, 41–52 (1995)
22. Kato, T., El-Fiky, G.S., Oware, E.N.: Crustal strains in the Japanese islands as deduced from dense GPS array. Geophys. Res. Lett. 25, 3445–3448 (1998)
23. Marsden, J.E., Hughes, T.J.R.: Mathematical Foundations of Elasticity. Dover, New York (1983)
24. Mase, G.E.: Continuum Mechanics. McGraw-Hill, New York (1970)
25. Moghtased-Azar, K., Grafarend, E.W.: Surface deformation analysis of dense GPS networks based on intrinsic geometry: deterministic and stochastic aspects. J. Geod. 83, 431–454 (2009)
26. Moghtased-Azar, K.: Surface Deformation Analysis of Dense GPS Networks Based on Intrinsic Geometry: Deterministic and Stochastic Aspects. Deutsche Geod. Komm., Reihe C, Heft Nr. 612 (2007)
27. Muto, K.: A study of displacements of triangulation points. Bull. Earthquake Res. Inst. Univ. Tokyo 10, 384–391 (1932)
28. Reguzzoni, M., Sansó, F., Venuti, G.: The theory of general kriging, with applications to the determination of a local geoid. Geophys. J. Int. 162, 303–314 (2005)
29. Rikitake, T.: Earthquake Prediction. Elsevier, Amsterdam (1976)
30. Stoker, J.J.: Differential Geometry. Wiley-Interscience, New York (1969)
31. Strang, G.: Introduction to Linear Algebra, 4th edn. Wellesley-Cambridge Press, Wellesley (2009)
32. Straub, C.: Recent crustal deformation and strain accumulation in the Marmara Sea region, N.W. Anatolia, inferred from GPS measurements. Ph.D. Thesis, ETH Zürich, Inst. of Geod. and Photogramm., Mitt. 58 (1996)
33. Terada, T., Miyame, N.: Deformation of the Earth crust in Kwansai districts and its relation to the orographic feature. Bull. Earthquake Res. Inst. Univ. Tokyo 7, 22–239 (1929)
34. Tscherning, C.C., Rapp, R.H.: Closed covariance expressions for gravity anomalies, geoid undulations, and deflections of the vertical implied by anomaly degree-variance models. Reports of the Department of Geodetic Science, No. 28, The Ohio State University, Columbus (1974)
35. Tsuboi, C.: Investigation on the deformation of the Earth's crust in the Tango district connected with the Tango earthquake of 1927. Bull. Earthquake Res. Inst. Univ. Tokyo 10, 411–434 (1932)
36. Tsuboi, C.: Investigation on the deformation of the Earth's crust found by precise geodetic means. Jap. J. Astron. Geoph. 10, 93–248 (1933)
37. Voosoghi, B.: Intrinsic deformation analysis of the Earth surface based on 3-dimensional displacement fields derived from space geodetic measurements. Report Nr. 2003, Institut für Photogrammetrie, Universität Stuttgart (2000)

13 Theory and Realization of Reference Systems

Athanasios Dermanis

Contents

1 Introduction 699
2 Reference Systems in Motion: Generalized Euler Kinematic Equations – The Rotation Vector Concept 701
3 Reference Systems for the Description of Earth Rotation 704
4 The Realization of a Reference System Within Data Analysis, in the Case of Rigid Geodetic Networks 711
5 Least Squares Estimation for Models Without Full Rank Utilizing Minimal Constraints 718
6 Mathematical Modeling of Spatiotemporal Reference Systems for a Deformable Geodetic Network: Deterministic Aspects and Reference System Optimality 731
7 Reference System Definition in the Analysis of Coordinate Time Series 738
8 Various Types of Minimal Constraints for the Definition of a Spatiotemporal Reference System 744
9 A Posteriori Change of the Spatiotemporal Reference System 748
10 Kinematic Minimal Constraints 756
11 Transforming a Network Reference System into an (Approximate) Earth Reference System 762
12 Formulation of the International Terrestrial Reference Frame: Introductory Remarks 768
13 Basics of Data Set Combination 771
14 Stacking of a Coordinate Time Series from a Single Space Technique 782
15 The Implementation of Time Series of Earth Orientation Parameters 787

This chapter is part of the series Handbuch der Geodäsie, volume "Mathematische Geodäsie/Mathematical Geodesy", edited by Willi Freeden, Kaiserslautern.

A. Dermanis
Department of Geodesy and Surveying (DGS), Aristotle University of Thessaloniki, Thessaloniki, Greece
E-Mail: [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020
W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_107


16 ITRF Formulation: Combination of Initial Coordinates, Velocities and EOPs Obtained Separately from Each Space Technique 792
17 ITRF Formulation: The Combination of Separate Estimates from Space Techniques in the Case of Non-singular Covariance Matrices 801
18 ITRF Formulation: Some Remarks on the Origin and Scale of the Reference System 809
19 Post-linear Coordinate Variation Models 816
Literature 820

Abstract After a short introduction to the basics of reference system theory and its application to the description of earth rotation, the problem of establishing a reference system for the discrete stations of a geodetic network is studied, from both a theoretical and a practical (implementation) point of view. First, the case of rigid networks is examined, which also covers the case of deformable networks with data collected within a time span small enough for the network shape to remain practically unaltered. The problem of how to analyze observations that are invariant under particular changes of the reference system is examined within the framework of least squares estimation theory, with a rank deficiency in the design matrix. The complete theory is presented, including all necessary proofs, not only of the usual statistical results for the rank-deficient linear Gauss-Markov model, but also of those of the rich geodetic theory, which exploits the fact that the physical cause of the rank deficiency is known to be the lack of definition of the reference system. The additional geodetic results are based on the fact that one can easily construct a matrix whose columns are a basis of the null space of the design matrix. Insights are presented into the geometric characteristics of the problem and its relation to the theory of generalized inverses. Passing to deformable networks, a deterministic mathematical model is presented, based on the concept of geodesic lines that are shortest between the linear shape manifolds associated with the network shape at each instant. Reference system optimality for a discrete network is related to the relevant ideas of Tisserand, developed for the continuum of the earth masses. The practical problem of choosing a reference system for a coordinate time series is examined, for the case where a linear-in-time model is adopted for the temporal variation of the coordinates.
The choice of reference system is related to the choice of minimal constraints for obtaining one out of the infinitely many least squares solutions, which correspond to descriptions, in different reference systems, of the same sequence of network shapes. The a posteriori change of the reference system is examined, where one moves from one least squares solution to another one satisfying particular minimal constraints. Kinematic minimal constraints are also introduced, leading to coordinates that demonstrate the minimum coordinate


variation and are thus connected to the ideas of Tisserand for reference system optimality. It is also shown how to convert a reference system of a geodetic network into one for the whole continuous earth, or at least the lithosphere, utilizing additional geophysical information. The last item is the combination of data from four space techniques (VLBI, SLR, GPS, DORIS) in order to establish a global reference system realized through a number of parameters that constitute the International Terrestrial Reference Frame. After a theoretical exposition of the basics of data combination, the various methods of spatial data combination are presented, for both coordinate and Earth Orientation Parameter time series, while alternatives are presented for the choice of the origin (geocenter) and of the network scale, derived from the scales of VLBI and SLR. Finally, existing and new methodologies are presented for building post-linear models describing the temporal variation of station coordinates.

Keywords Reference systems · Rank deficient linear model · Minimal constraints · Inner constraints · Kinematic constraints · Tisserand reference system · Coordinate time series · Earth orientation parameters · Combination of geodetic space techniques · International Terrestrial Reference Frame (ITRF)

1 Introduction

Applied mathematics aims at the description of physical processes by means of mathematical equations. In order to achieve this goal, physical entities, such as points, time instants, and scalar, vector or tensor quantities, must be represented by real numbers. In the more general relativistic setup, where space-time is considered to be a curved manifold, points are converted to numbers with the use of a coordinate system, namely a one-to-one correspondence between events (point plus epoch) and a tetrad of numbers, three for the point and one for the time instant. As geodesists well know from the use of geodetic longitude and latitude to describe points on the earth's surface, such a one-to-one correspondence is generally impossible to achieve: the poles have a unique latitude but correspond to any value of the geodetic longitude. In modern differential geometry, this problem is bypassed by means of an atlas, which is a collection of coordinate systems called charts, each chart covering only an open subset of the manifold. When a curved manifold is embedded in a flat space, e.g., a two-dimensional surface embedded in the Euclidean three-dimensional space, vectors can be viewed in the usual way, as tangent to the manifold, living within the surrounding flat space and not within the manifold. This cannot be achieved, though, when a curved manifold is considered by itself, without any embedding into a flat space. To overcome this problem, modern differential geometry replaces vectors with


directional derivatives, i.e., derivatives along all curves passing through the point considered which are tangent to each other and have the same rate of displacement with respect to their parametric representation. Fortunately, these mathematical complications can be avoided in geodesy, thanks to the implementation of the Newtonian-Euclidean model for space-time, which separates time from space and allows the possibility of parallel translation within the flat three-dimensional space. Although general relativity is not irrelevant to modern geodetic observations, it is customary to perform "relativistic corrections" on them, which allow their further analysis within a Newtonian-Euclidean model.

A reference system for time consists of a particular time instant called the time origin, a time interval serving as the unit of time, and a direction, which is necessarily the one from the past towards the future. This allows the representation of any time instant by a number, namely the ratio of the time interval between the instant and the time origin to the time unit, with a positive sign when the instant occurs later than the time origin and a negative one otherwise.

A reference system within the Euclidean three-dimensional space serves two purposes at the same time: it represents points by three numbers, their Cartesian coordinates, and local vectors by three numbers, their components with respect to a local set of three base vectors. A reference system consists of a particular point O, called the origin, three directed non-coplanar straight lines passing through the origin, called the axes, and a line segment serving as the unit of length. Alternatively, we may replace the axes and the unit of length by three non-coplanar vectors at the origin, visualized as directed line segments (arrows!), having length equal to the length unit and directions those of the axes, pointing toward their positive sense. Thus a reference system $(O, \vec{e}_1, \vec{e}_2, \vec{e}_3)$ consists of the origin O and the vector basis $\vec{e}_1, \vec{e}_2, \vec{e}_3$. Following the usual practice, we will assume from here on that the three base vectors are perpendicular to each other, a choice which greatly simplifies, and thus facilitates, the relevant computations. It is also assumed that the base vectors form a right-handed triad, which means that, looking from $\vec{e}_3$, $\vec{e}_1$ appears to be on the right with respect to $\vec{e}_2$. With this choice, we will speak of a Cartesian reference system and Cartesian coordinates. For any other point P, the directed line segment $\vec{x} = \overrightarrow{OP}$ serves as the position vector of P and can be expressed as a unique linear combination of the base vectors, $\vec{x} = x^1\vec{e}_1 + x^2\vec{e}_2 + x^3\vec{e}_3$. The three components $x^1, x^2, x^3$ of the position vector serve as the Cartesian coordinates of the point P. A local vector $\vec{v}$ at any point P can be represented by its components $v^1, v^2, v^3$ with respect to a local basis $\vec{e}_1(P), \vec{e}_2(P), \vec{e}_3(P)$, which results from the parallel transport of the reference system basis $\vec{e}_1(O), \vec{e}_2(O), \vec{e}_3(O)$ from the origin O to the point P. The vector components are the coefficients in the linear combination $\vec{v}(P) = v^1\vec{e}_1(P) + v^2\vec{e}_2(P) + v^3\vec{e}_3(P)$. We will use here matrix notation by setting



$$\vec{\mathbf{e}} = [\,\vec{e}_1 \ \ \vec{e}_2 \ \ \vec{e}_3\,], \qquad \mathbf{x} = \begin{bmatrix} x^1 \\ x^2 \\ x^3 \end{bmatrix}, \qquad \mathbf{v} = \begin{bmatrix} v^1 \\ v^2 \\ v^3 \end{bmatrix}, \tag{1}$$

which allows us to write

$$\vec{x} = x^1\vec{e}_1 + x^2\vec{e}_2 + x^3\vec{e}_3 = [\,\vec{e}_1 \ \ \vec{e}_2 \ \ \vec{e}_3\,]\begin{bmatrix} x^1 \\ x^2 \\ x^3 \end{bmatrix} = \vec{\mathbf{e}}\,\mathbf{x}, \tag{2}$$

$$\vec{v} = v^1\vec{e}_1 + v^2\vec{e}_2 + v^3\vec{e}_3 = [\,\vec{e}_1 \ \ \vec{e}_2 \ \ \vec{e}_3\,]\begin{bmatrix} v^1 \\ v^2 \\ v^3 \end{bmatrix} = \vec{\mathbf{e}}\,\mathbf{v}, \tag{3}$$

omitting the dependence of the basis on the relevant point O or P, since it is clear from the context. We will also make extensive use of the antisymmetric matrix [a×] with axial vector (column matrix) a, defined as

$$[\mathbf{a}\times] = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}, \qquad \mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \tag{4}$$

which allows us to express the exterior vector product $\vec{c} = \vec{a}\times\vec{b}$ through the matrix expression $\mathbf{c} = [\mathbf{a}\times]\mathbf{b}$. We will also make use of the relation

$$[(\mathbf{Q}\mathbf{a})\times] = \mathbf{Q}[\mathbf{a}\times]\mathbf{Q}^T, \tag{5}$$

which is valid for any proper orthogonal matrix Q ($\mathbf{Q}^T\mathbf{Q} = \mathbf{Q}\mathbf{Q}^T = \mathbf{I}$, $\det\mathbf{Q} = +1$). We will also use dots for expressing derivatives with respect to time, e.g., $\dot{p} \equiv \frac{dp}{dt}$.
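The matrix [a×] of (4) and the transformation rule (5) are easy to verify numerically. The following Python sketch (function names and the random test data are ours) does so:

```python
import numpy as np

def skew(a):
    """Antisymmetric matrix [a x] of Eq. (4): skew(a) @ b equals a x b."""
    a1, a2, a3 = a
    return np.array([[0.0, -a3,  a2],
                     [ a3, 0.0, -a1],
                     [-a2,  a1, 0.0]])

rng = np.random.default_rng(0)
a = rng.standard_normal(3)
b = rng.standard_normal(3)

# c = [a x] b reproduces the exterior (cross) product
assert np.allclose(skew(a) @ b, np.cross(a, b))

# Eq. (5): [(Qa) x] = Q [a x] Q^T for a proper orthogonal Q
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
if np.linalg.det(Q) < 0.0:
    Q[:, 0] = -Q[:, 0]               # enforce det Q = +1
assert np.allclose(skew(Q @ a), Q @ skew(a) @ Q.T)
```

Relation (5) fails for an improper orthogonal matrix (det Q = −1), which is why the determinant sign is enforced before the check.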

2 Reference Systems in Motion: Generalized Euler Kinematic Equations – The Rotation Vector Concept

For the analysis of modern geodetic observations carried out by space techniques such as GPS (GNSS), VLBI, SLR and DORIS, it is necessary to implement at least two reference systems: an inertial celestial reference system for the description of satellite orbits and radio source directions, and a terrestrial reference system, which represents the deforming earth in the best possible way. The motion of the earth can thus be separated into three parts: (a) translational motion of its origin, ideally chosen as the geocenter (earth mass center), with respect to the inertial space, i.e., the orbit of the earth around the sun as studied by celestial mechanics; (b) rotation of the earth (more precisely, of the chosen terrestrial reference system) with respect to the celestial reference system, as studied by geodesists and astronomers; (c) deformation of the earth, i.e., the motion of its masses with respect to the chosen terrestrial reference system.


For the analysis of observations within a relatively small time interval, say one day, the curvature of the orbit of the earth can be practically ignored, so that the earth may be assumed to move along a straight line. Thus a reference system having the same origin as the terrestrial reference system but axes parallel to those of the celestial one can be considered as a quasi-inertial reference system. This choice allows us to leave aside the translational motion and study only the rotation of the terrestrial reference system with respect to the quasi-inertial celestial reference system. In a more general context we study the rotation of a rotating reference system $(O, \vec{\mathbf{e}}(t))$ with respect to a non-rotating one $(O, \vec{\mathbf{e}}_0)$. The bases of the two systems will be related by

$$\vec{\mathbf{e}}(t) = \vec{\mathbf{e}}_0\,\mathbf{R}^T(t), \tag{6}$$

where R(t) is a time-dependent orthogonal matrix, called the rotation matrix, while $\vec{\mathbf{e}}_0 = \vec{\mathbf{e}}(t)\mathbf{R}(t)$ is the corresponding inverse relation. Omitting the dependence on time for the sake of simplicity, it follows from $\vec{x} = \vec{\mathbf{e}}_0\mathbf{x}_0 = \vec{\mathbf{e}}\mathbf{x} = \vec{\mathbf{e}}_0\mathbf{R}^T\mathbf{x}$ that $\mathbf{x}_0 = \mathbf{R}^T\mathbf{x}$, and thus the coordinates in the two systems will be related by

$$\mathbf{x} = \mathbf{R}\mathbf{x}_0. \tag{7}$$

Note that we have chosen the use of the transpose $\mathbf{R}^T$ in the transformation of the bases (6) so that the rotation matrix R appears in the more applicable coordinate transformation (7). The time rate of the rotating basis follows by differentiating (6) to obtain $\frac{d\vec{\mathbf{e}}}{dt} = \vec{\mathbf{e}}_0\frac{d\mathbf{R}^T}{dt} = \vec{\mathbf{e}}\,\mathbf{R}\frac{d\mathbf{R}^T}{dt}$. It is easy to verify, by differentiating the relation $\mathbf{R}\mathbf{R}^T = \mathbf{I}$, that the matrix $\mathbf{R}\frac{d\mathbf{R}^T}{dt}$ is antisymmetric, and denoting by ω its axial vector we set

$$\mathbf{R}\frac{d\mathbf{R}^T}{dt} = [\boldsymbol{\omega}\times], \tag{8}$$

and thus

$$\frac{d\vec{\mathbf{e}}}{dt} = \vec{\mathbf{e}}\,[\boldsymbol{\omega}\times]. \tag{9}$$

The relations (8) are the generalized kinematic Euler equations. Their specific form depends on the particular representation of the rotation matrix R. For the representation through Euler angles R = R1 (ϕ)R3 (θ )R1 (ψ), where Rj (θj ) represents a rotation around the axis j by an angle θj , one obtains the usual kinematic Euler equations appearing in texts. We will give here the ones corresponding to the usual geodetic choice of Cardan angles R = R3 (θ3 )R2 (θ2 )R1 (θ1 ):

$$\omega_1 = \sin\theta_3\,\frac{d\theta_2}{dt} + \cos\theta_3\cos\theta_2\,\frac{d\theta_1}{dt},$$
$$\omega_2 = \cos\theta_3\,\frac{d\theta_2}{dt} - \sin\theta_3\cos\theta_2\,\frac{d\theta_1}{dt}, \tag{10}$$
$$\omega_3 = \frac{d\theta_3}{dt} + \sin\theta_2\,\frac{d\theta_1}{dt}.$$
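Equations (8) and (10) can be checked against a finite-difference evaluation of R dRᵀ/dt. The Python sketch below (with arbitrary smooth test trajectories for the Cardan angles, all names ours, and assuming the coordinate-transformation convention for the matrices Rⱼ) compares the axial vector of R dRᵀ/dt with the right-hand sides of (10):

```python
import numpy as np

# coordinate-transformation rotation matrices (assumed convention)
def R1(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])

def R2(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]])

def R3(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

# arbitrary smooth test trajectories for the Cardan angles and their rates
th  = lambda t: np.array([0.3 * np.sin(t), 0.2 * t, 0.1 * t * t])
dth = lambda t: np.array([0.3 * np.cos(t), 0.2, 0.2 * t])

def R(t):
    t1, t2, t3 = th(t)
    return R3(t3) @ R2(t2) @ R1(t1)

t, h = 0.7, 1e-6
dRT = (R(t + h).T - R(t - h).T) / (2.0 * h)     # numerical dR^T/dt
W = R(t) @ dRT                                   # = [omega x], Eq. (8)
om_num = np.array([W[2, 1], W[0, 2], W[1, 0]])   # axial vector, Eq. (4)

t1, t2, t3 = th(t)
d1, d2, d3 = dth(t)
om_ana = np.array([                              # Eq. (10)
    np.sin(t3) * d2 + np.cos(t3) * np.cos(t2) * d1,
    np.cos(t3) * d2 - np.sin(t3) * np.cos(t2) * d1,
    d3 + np.sin(t2) * d1,
])
assert np.allclose(om_num, om_ana, atol=1e-6)
```

The check exercises the antisymmetry of R dRᵀ/dt as well: the axial components can equally be read from the opposite off-diagonal triplet with reversed sign.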

The vector $\vec{\omega} = \vec{\mathbf{e}}\,\boldsymbol{\omega} = \vec{\mathbf{e}}_0\boldsymbol{\omega}_0$, with components ω in the rotating system and $\boldsymbol{\omega}_0 = \mathbf{R}^T\boldsymbol{\omega}$ in the non-rotating one, is no other than the rotation vector of the rotating reference system, which is defined as the vector $\vec{\omega} = \omega\vec{n}$, where $\omega = |\vec{\omega}|$ is the instantaneous angular velocity and $\vec{n}$ the unit vector in the direction of the instantaneous axis of rotation. The latter can be defined as follows: the transition from the position of the axes $\vec{\mathbf{e}}(t)$ at epoch $t$ to their position $\vec{\mathbf{e}}(t+\Delta t)$ at a later epoch $t+\Delta t$ can be achieved by a single rotation around an axis $\vec{n}(t, t+\Delta t)$ by an angle $\Delta\theta$, where $\omega(t, t+\Delta t) = \Delta\theta/\Delta t$ is the mean angular velocity of this rotation. $\vec{n}$ and $\omega$ are the respective limits of $\vec{n}(t, t+\Delta t)$ and $\omega(t, t+\Delta t)$ as $\Delta t \to 0$. An alternative form of the kinematic Euler equations follows by replacing $\boldsymbol{\omega} = \mathbf{R}\boldsymbol{\omega}_0$, which gives $\mathbf{R}\frac{d\mathbf{R}^T}{dt} = [(\mathbf{R}\boldsymbol{\omega}_0)\times] = \mathbf{R}[\boldsymbol{\omega}_0\times]\mathbf{R}^T$ and hence

$$[\boldsymbol{\omega}_0\times] = \frac{d\mathbf{R}^T}{dt}\,\mathbf{R}. \tag{11}$$

Any vector $\vec{z} = \vec{\mathbf{e}}_0\mathbf{z}_0 = \vec{\mathbf{e}}\mathbf{z}$ has time derivative $\frac{d\vec{z}}{dt} = \vec{\mathbf{e}}_0\frac{d\mathbf{z}_0}{dt} = \frac{d\vec{\mathbf{e}}}{dt}\mathbf{z} + \vec{\mathbf{e}}\frac{d\mathbf{z}}{dt} = \vec{\mathbf{e}}\left([\boldsymbol{\omega}\times]\mathbf{z} + \frac{d\mathbf{z}}{dt}\right)$. Applying this to the velocity $\vec{v} = \frac{d\vec{x}}{dt} = \vec{\mathbf{e}}\mathbf{v}$ and the acceleration $\vec{a} = \frac{d\vec{v}}{dt} = \vec{\mathbf{e}}\mathbf{a} = \vec{\mathbf{e}}_0\mathbf{a}_0$, it follows that their components in the rotating system are $\mathbf{v} = [\boldsymbol{\omega}\times]\mathbf{x} + \frac{d\mathbf{x}}{dt}$ and $\mathbf{a} = [\boldsymbol{\omega}\times]\mathbf{v} + \frac{d\mathbf{v}}{dt}$, which combined give

$$\mathbf{a} = \frac{d^2\mathbf{x}}{dt^2} + 2[\boldsymbol{\omega}\times]\frac{d\mathbf{x}}{dt} + [\boldsymbol{\omega}\times]^2\mathbf{x} + \left[\frac{d\boldsymbol{\omega}}{dt}\times\right]\mathbf{x}. \tag{12}$$

Newton's second law of dynamics $\vec{a} \equiv \frac{d^2\vec{x}}{dt^2} = \vec{f}$, where $\vec{f} = \vec{\mathbf{e}}\mathbf{f} = \vec{\mathbf{e}}_0\mathbf{f}_0$ are the applied forces per unit mass, is represented by $\mathbf{a}_0 = \frac{d^2\mathbf{x}_0}{dt^2} = \mathbf{f}_0$ in the non-rotating system and by $\mathbf{a} = \mathbf{f}$ in the rotating one, which in view of (12) becomes

$$\frac{d^2\mathbf{x}}{dt^2} = \mathbf{f} - [\boldsymbol{\omega}\times]^2\mathbf{x} - 2[\boldsymbol{\omega}\times]\frac{d\mathbf{x}}{dt} - \left[\frac{d\boldsymbol{\omega}}{dt}\times\right]\mathbf{x}. \tag{13}$$

This means that the apparent acceleration $\frac{d^2\mathbf{x}}{dt^2}$, as seen within the rotating system, depends, in addition to the actual applied forces, on three pseudo-forces: the centrifugal force $\mathbf{p} = -[\boldsymbol{\omega}\times]^2\mathbf{x} = (\boldsymbol{\omega}^T\boldsymbol{\omega})\mathbf{x} - (\boldsymbol{\omega}^T\mathbf{x})\boldsymbol{\omega}$, the Coriolis force


$\mathbf{q} = -2[\boldsymbol{\omega}\times]\frac{d\mathbf{x}}{dt}$, and the gyroscopic force $\mathbf{g} = -\left[\frac{d\boldsymbol{\omega}}{dt}\times\right]\mathbf{x}$. The Coriolis force is exerted only on bodies which are moving with respect to the rotating system, i.e., when $\frac{d\mathbf{x}}{dt} \neq \mathbf{0}$, while the gyroscopic force appears only when the rotation vector changes, either in direction with respect to the rotating system or in magnitude. For the rotating terrestrial reference system, the Coriolis force is exerted only on bodies moving with respect to the earth, while the gyroscopic force is very small, as it depends on polar motion and on variations in the angular velocity of the earth (variations in the length of the day), which are both physical phenomena of small magnitude.
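As a small numerical illustration of the pseudo-forces in (13) (the numbers below, an equatorial position and an eastward velocity, are our own test values, not from the text):

```python
import numpy as np

def skew(a):
    """Antisymmetric matrix [a x] of Eq. (4)."""
    a1, a2, a3 = a
    return np.array([[0.0, -a3, a2], [a3, 0.0, -a1], [-a2, a1, 0.0]])

OMEGA = 7.292115e-5                     # earth rotation rate [rad/s]
omega = np.array([0.0, 0.0, OMEGA])     # rotation vector along the third axis

x = np.array([6378137.0, 0.0, 0.0])     # equatorial point [m] (test value)
v = np.array([0.0, 30.0, 0.0])          # eastward velocity [m/s] (test value)

p = -skew(omega) @ skew(omega) @ x      # centrifugal force per unit mass
q = -2.0 * skew(omega) @ v              # Coriolis force per unit mass

# identity: -[omega x]^2 x = (omega^T omega) x - (omega^T x) omega
assert np.allclose(p, (omega @ omega) * x - (omega @ x) * omega)

print(np.linalg.norm(p))                # about 0.034 m/s^2 at the equator
print(np.linalg.norm(q))                # about 0.0044 m/s^2 for 30 m/s
```

The centrifugal term is of the order of 0.3% of gravity, while for everyday velocities the Coriolis term is two orders of magnitude smaller still, which is why both are treated as corrections rather than dominant forces.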

3 Reference Systems for the Description of Earth Rotation

In order to describe the rotation of the terrestrial reference system with respect to the quasi-inertial celestial reference system, we must take advantage of the fact that it is dominated by the diurnal rotation. Indeed, the major part of earth rotation takes place around an axis with a slowly varying position with respect to the celestial system (precession and nutation), as well as with respect to the earth (polar motion), and with a slightly varying rotational velocity of about 2π per day. In view of the dominant role of the diurnal rotation, it is wise to separate large from small parts of "earth rotation" by introducing two intermediate reference systems with their third axis aligned to the rotation axis, one that rotates with the earth and one that does not. Denoting the terrestrial reference system by $\vec{\mathbf{e}}_T$ and the celestial one by $\vec{\mathbf{e}}_C$, the two new reference systems are the intermediate terrestrial one $\vec{\mathbf{e}}_{IT}$ and the intermediate celestial one $\vec{\mathbf{e}}_{IC}$, where $\vec{e}_3^{\,IT} = \vec{e}_3^{\,IC} = \vec{n}$, with $\vec{n} = \frac{1}{\omega}\vec{\omega}$ being the unit vector in the direction of the rotation vector. If $\mathbf{x}_T$, $\mathbf{x}_C$, $\mathbf{x}_{IT}$, $\mathbf{x}_{IC}$ are the coordinates of any point in the respective reference systems $\vec{\mathbf{e}}_T$, $\vec{\mathbf{e}}_C$, $\vec{\mathbf{e}}_{IT}$, $\vec{\mathbf{e}}_{IC}$, the overall rotation transformation $\mathbf{x}_T = \mathbf{R}\mathbf{x}_C$ can be analyzed into three parts

$$\mathbf{x}_T = \mathbf{R}\mathbf{x}_C = \mathbf{W}\mathbf{R}_3(\theta)\mathbf{Q}\mathbf{x}_C. \tag{14}$$

The matrix Q of the transformation $\mathbf{x}_{IC} = \mathbf{Q}\mathbf{x}_C$ from the celestial to the intermediate celestial reference system represents the phenomenon of precession and nutation, i.e., the variation of the rotation axis direction $\vec{n} = \vec{e}_3^{\,IC}$ with respect to the celestial reference system. The matrix $\mathbf{W}^T$ of the transformation $\mathbf{x}_{IT} = \mathbf{W}^T\mathbf{x}_T$ from the terrestrial to the intermediate terrestrial reference system represents the phenomenon of polar motion, i.e., the variation of the rotation axis direction $\vec{n} = \vec{e}_3^{\,IT}$ with respect to the terrestrial reference system. Finally, $\mathbf{R}_3(\theta)$ is the matrix of the diurnal rotation $\mathbf{x}_{IT} = \mathbf{R}_3(\theta)\mathbf{x}_{IC}$ around the rotation axis $\vec{n} = \omega^{-1}\vec{\omega}$, where θ is the earth rotation angle. The explicit representations of the precession-nutation and polar motion matrices are

$$\mathbf{Q} = \mathbf{R}_3(-s)\mathbf{R}_3(-E)\mathbf{R}_2(d)\mathbf{R}_3(E), \qquad \mathbf{W}^T = \mathbf{R}_3(-s')\mathbf{R}_3(-F)\mathbf{R}_2(g)\mathbf{R}_3(F). \tag{15}$$


Within the precession-nutation matrix, the rotation $\mathbf{R}_3(E)$ brings the second axis into a position perpendicular to the plane of $\vec{e}_3^{\,C}$ and $\vec{n}$, while the following rotation $\mathbf{R}_2(d)$ aligns the third axis with the $\vec{n} = \vec{e}_3^{\,IC}$ direction. The following rotation $\mathbf{R}_3(-E)$ merely brings the first and second axes closer to their original positions. The remaining rotation $\mathbf{R}_3(-s)$ serves to bring the $\vec{e}_1^{\,IC}$ axis into its desired position, through an appropriate choice of the tuning angle $s$, so that it does not follow the earth in its diurnal rotation. Completely analogous is the situation within the transpose of the polar motion matrix W, where the rotation $\mathbf{R}_3(-s')$ serves to bring the $\vec{e}_1^{\,IT}$ axis into its desired position, through an appropriate choice of the tuning angle $s'$, so that it does follow the earth in its diurnal rotation.

In order to proceed with the choice of the tuning angles $s$ and $s'$, so that the definition of the two intermediate reference systems is completed, we must give a precise mathematical meaning to the expressions "follows the earth in its diurnal rotation" and "does not follow the earth in its diurnal rotation". This is achieved with the introduction of the concept of the relative rotation vector between any two reference systems, in analogy to the rotation vector of a rotating reference system with respect to a non-rotating one. Let $\vec{\mathbf{e}}_A$, $\vec{\mathbf{e}}_B$ be two reference systems related by $\vec{\mathbf{e}}_B = \vec{\mathbf{e}}_A\mathbf{R}^T_{A\to B}$. Then, in analogy to Eq. (8), we may define the relative rotation vector $\boldsymbol{\omega}_{A\to B}$ through the relation

$$[\boldsymbol{\omega}_{A\to B}\times] = \mathbf{R}_{A\to B}\frac{d\mathbf{R}^T_{A\to B}}{dt}. \tag{16}$$

Applying this definition to the precession-nutation and the polar motion matrices we may define the relative rotation vectors $\boldsymbol{\omega}_Q = \boldsymbol{\omega}_{C\to IC}$ and $\boldsymbol{\omega}_W = \boldsymbol{\omega}_{T\to IT}$ as

$$[\boldsymbol{\omega}_Q\times] = \mathbf{Q}\frac{d\mathbf{Q}^T}{dt}, \qquad [\boldsymbol{\omega}_W\times] = \mathbf{W}^T\frac{d\mathbf{W}}{dt}. \tag{17}$$

The intermediate celestial reference system does not follow the earth in its diurnal rotation when the relative rotation vector $\boldsymbol{\omega}_Q = \boldsymbol{\omega}_{C\to IC}$ of precession-nutation has no component along the rotation axis $\vec{n} = \vec{e}_3^{\,IC}$, i.e., when $\omega_Q^3 = 0$. The intermediate terrestrial reference system follows the earth in its diurnal rotation when the relative rotation vector $\boldsymbol{\omega}_W = \boldsymbol{\omega}_{T\to IT}$ of polar motion has no component along the rotation axis $\vec{n} = \vec{e}_3^{\,IT}$, i.e., when $\omega_W^3 = 0$. Thus we have given a precise mathematical context to the requirements for the choice of the tuning angles $s$ and $s'$, which the astronomers have called the Non-Rotating Origin (NRO) principle (see [18] for the original ideas and [29] for the rigorous mathematical elaboration). Note that in astronomy directions are depicted as points on a unit sphere, and the "origin" in this case is the point representing the $\vec{e}_1^{\,IC}$ direction. It must therefore hold that $\omega_Q^3 = [\boldsymbol{\omega}_Q\times]_{21} = \left(\mathbf{Q}\frac{d\mathbf{Q}^T}{dt}\right)_{21} = 0$ and $\omega_W^3 = [\boldsymbol{\omega}_W\times]_{21} = \left(\mathbf{W}^T\frac{d\mathbf{W}}{dt}\right)_{21} = 0$, in which case, performing the necessary differentiations, we arrive at the final NRO conditions


A. Dermanis

$$\frac{ds}{dt} = (\cos d - 1)\,\frac{dE}{dt}, \qquad \frac{ds'}{dt} = (\cos g - 1)\,\frac{dF}{dt}, \qquad (18)$$

of which the first is indeed an NRO condition, while the second should rather be called a Rotating Origin (RO) condition. The obvious solutions to the above equations are

$$s(t) = s_0 + \int_{t_0}^{t} [\cos d(t) - 1]\,\frac{dE}{dt}(t)\,dt, \qquad s'(t) = s'_0 + \int_{t_0}^{t} [\cos g(t) - 1]\,\frac{dF}{dt}(t)\,dt. \qquad (19)$$

The official IERS representation of the earth rotation does not implement the above angles $E$, $d$ for precession-nutation and $F$, $g$ for polar motion. For precession-nutation the first two celestial components $X$, $Y$ of the rotation direction $\vec{n}_3 = \vec{e}_C\,\mathbf{n}_C$ are used instead, which are related to $E$, $d$ according to

$$\mathbf{n}_C = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} \cos E \sin d \\ \sin E \sin d \\ \cos d \end{bmatrix}. \qquad (20)$$
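The NRO condition can also be illustrated numerically: if the tuning angle $s$ is obtained from (19) by quadrature, the relative rotation vector of the composition $R_3(-E-s)\,R_2(d)\,R_3(E)$ described above, computed from definition (17), has a vanishing third component. The following NumPy sketch uses invented smooth angle histories $E(t)$, $d(t)$ purely for the test:

```python
import numpy as np

def R2(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]])

def R3(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical smooth precession-nutation angle histories (radians)
E_ang = lambda t: 0.3 + 0.1 * np.sin(t)
d_ang = lambda t: 0.2 + 0.05 * np.cos(t)
E_dot = lambda t: 0.1 * np.cos(t)

def s_ang(t, n=20001):
    # Tuning angle from the NRO condition (19): integral of (cos d - 1) dE/dt
    tt = np.linspace(0.0, t, n)
    f = (np.cos(d_ang(tt)) - 1.0) * E_dot(tt)
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(tt))

def Q(t):
    # Precession-nutation rotation R3(-E-s) R2(d) R3(E) as described in the text
    return R3(-E_ang(t) - s_ang(t)) @ R2(d_ang(t)) @ R3(E_ang(t))

# Relative rotation vector (17): [omega_Q x] = Q dQ^T/dt, by central differences
t, h = 1.0, 1e-6
Omega = Q(t) @ (Q(t + h) - Q(t - h)).T / (2.0 * h)
print(abs(Omega[1, 0]))  # third component of omega_Q: ~0 due to the NRO choice of s
```

The first two components of $\boldsymbol{\omega}_Q$ remain of the order of $\dot{E}\sin d$ and $\dot{d}$; only the diurnal component is tuned away.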

In terms of $X$ and $Y$ the precession-nutation matrix assumes the representation

$$Q = R_3(-s)\begin{bmatrix} 1-aX^2 & -aXY & -X \\ -aXY & 1-aY^2 & -Y \\ X & Y & 1-a(X^2+Y^2) \end{bmatrix}, \qquad (21)$$

where

$$a = \frac{1}{1+\cos d} = \frac{1}{1+\sqrt{1-X^2-Y^2}} \approx \frac{1}{2} + \frac{1}{8}(X^2+Y^2). \qquad (22)$$
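Because $a = 1/(1+\cos d)$ with $\cos d = \sqrt{1-X^2-Y^2}$, the matrix in (21) is exactly orthogonal, which is easy to check numerically. A minimal NumPy sketch follows; the values of $X$, $Y$, $s$ are invented, merely of the order of magnitude of the actual CIP coordinates:

```python
import numpy as np

def R3(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

def precession_nutation_Q(X, Y, s):
    """Eq. (21), with a = 1/(1 + sqrt(1 - X^2 - Y^2)) from Eq. (22)."""
    a = 1.0 / (1.0 + np.sqrt(1.0 - X**2 - Y**2))
    M = np.array([[1.0 - a * X**2, -a * X * Y, -X],
                  [-a * X * Y, 1.0 - a * Y**2, -Y],
                  [X, Y, 1.0 - a * (X**2 + Y**2)]])
    return R3(-s) @ M

# Invented test values (radians)
Q = precession_nutation_Q(2.0e-3, 3.0e-4, -5.0e-8)
print(np.abs(Q @ Q.T - np.eye(3)).max())  # machine-epsilon level: Q is orthogonal
```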

The IERS does not follow the obvious symmetric alternative for the representation of the polar motion matrix $W$ in terms of $\xi = \cos F \sin g$, $\eta = \sin F \sin g$, i.e., the first two of the components $\mathbf{n}_T = [\xi\ \eta\ \zeta]^T$ of the rotation direction $\vec{n}_3 = \vec{e}_T\,\mathbf{n}_T$ in the terrestrial reference system. Instead it sticks to the traditional representation $W = R_1(-y_P)R_2(-x_P)$, with an additional rotation by the tuning angle $s'$, so that

$$W = R_1(-y_P)\,R_2(-x_P)\,R_3(s'). \qquad (23)$$

The NRO conditions in terms of the new parameters $X$, $Y$, $x_P$, $y_P$ take the form

$$\dot{s} = a(\dot{X}Y - X\dot{Y}), \qquad \dot{s}' = \dot{y}_P \sin x_P \approx \dot{y}_P x_P. \qquad (24)$$

The tuning angle $s$ in terms of the above representation is given by

13 Theory and Realization of Reference Systems

$$s(t) = s_0 + \int_{t_0}^{t} a(t)\left[\dot{X}(t)Y(t) - X(t)\dot{Y}(t)\right]dt \approx s_0 - \frac{1}{2}\left[X(t)Y(t) - X(t_0)Y(t_0)\right] + \int_{t_0}^{t} \dot{X}(t)Y(t)\,dt, \qquad (25)$$

where $\dot{X} = \frac{dX}{dt}$, $\dot{Y} = \frac{dY}{dt}$, and the value $s_0 = -94\ \mu\text{as}$ is chosen in order to secure continuity with the abandoned previous version of the earth rotation representation. Note that since $x_P$, $y_P$ are quite small quantities, it holds in first order approximation that $\xi \approx -x_P$, $\eta \approx y_P$. The tuning angle $s'$ in terms of $x_P$, $y_P$ is given by

$$s'(t) = \frac{1}{2}\int_{t_0}^{t}\left[x_P(t)\dot{y}_P(t) - \dot{x}_P(t)y_P(t)\right]dt \approx -47\ \mu\text{as}\cdot t. \qquad (26)$$

With the above representations the total earth rotation is described by

$$\mathbf{x}_T = R\,\mathbf{x}_C = R_1(-y_P)\,R_2(-x_P)\,R_3(s' + \theta - s)\begin{bmatrix} 1-aX^2 & -aXY & -X \\ -aXY & 1-aY^2 & -Y \\ X & Y & 1-a(X^2+Y^2) \end{bmatrix}\mathbf{x}_C. \qquad (27)$$
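The complete chain (27) can be composed directly in code. The sketch below (NumPy) only verifies that the composed matrix is a proper rotation; all seven parameter values are invented, of realistic orders of magnitude:

```python
import numpy as np

def R1(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])

def R2(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]])

def R3(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

def earth_rotation(X, Y, s, s_p, theta, xP, yP):
    """Total rotation of Eq. (27): x_T = R1(-yP) R2(-xP) R3(s' + theta - s) M x_C."""
    a = 1.0 / (1.0 + np.sqrt(1.0 - X**2 - Y**2))
    M = np.array([[1.0 - a * X**2, -a * X * Y, -X],
                  [-a * X * Y, 1.0 - a * Y**2, -Y],
                  [X, Y, 1.0 - a * (X**2 + Y**2)]])
    return R1(-yP) @ R2(-xP) @ R3(s_p + theta - s) @ M

# Invented parameter values (radians)
R = earth_rotation(X=2e-3, Y=3e-4, s=-5e-8, s_p=-1e-9, theta=1.7, xP=1e-6, yP=2e-6)
print(np.abs(R @ R.T - np.eye(3)).max(), np.linalg.det(R))
```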

The functions $X(t)$, $Y(t)$, $s(t)$ can be evaluated either in terms of given series with coefficients provided by the IERS, or with the use of software subroutines provided by the International Astronomical Union (IAU). These functions are provided by the IAU adopted theory, and one needs to add to the theoretical values $X_{IAU}$, $Y_{IAU}$ corrections provided by VLBI observations, according to $X = X_{IAU} + \delta X$, $Y = Y_{IAU} + \delta Y$. These corrections are provided by the IERS in discrete form on a daily basis. Similar daily discrete values are provided for the polar motion parameters $x_P$, $y_P$. The earth rotation angle is calculated from

$$\theta(T_u) = 2\pi(0.7790572732640 + 1.00273781191135448\,T_u), \qquad (28)$$

where $T_u$ = (Julian date UT1) $-\ 2451545.0$ is the number of UT1 Julian days that have passed since 12h UT1, January 1, 2000 (when the UT1 Julian date was 2451545). UT1 time is computed from UTC time ($=$ TAI $-\ n$ s, with $n$ the accumulated leap seconds), to which one adds the difference UT1$-$UTC provided by the IERS. We have repeatedly used above the term "rotation of the earth". This should be understood as a convenient short name for the correct term "rotation of the terrestrial reference system". It makes sense to talk about the rotation of the earth only in the case of a rigid earth, where the rotation vector remains physically invariant, whatever the choice of the time-independent earth-fixed terrestrial reference system. In the actual case of the deforming earth, the terrestrial reference system needs to be defined for


every time epoch, and different choices lead to physically different rotation vectors of the terrestrial system (and not of the earth!). Another characteristic of the above IERS representation of the rotation of the terrestrial reference system is the replacement of the actual instantaneous rotation vector with smoothed versions, originally the Celestial Ephemeris Pole (CEP) and presently the Celestial Intermediate Pole (CIP). The reasoning behind this choice is that high-frequency sub-daily terms in nutation could not be detected by the (then primarily classical astronomical) observations. Thus, they should be removed from precession-nutation and be matched by corresponding terms in polar motion (which cannot be predicted by theory), in such a way that the total rotation matrix remains unchanged. Leaving aside the absurdity of the idea that one should replace the model of an observed time function with a smoothed version because of the limited temporal resolution capabilities of the observational process, we must remark that modern space techniques are quite capable of detecting sub-diurnal nutation terms. Therefore, the CIP concept needs to be updated. In accordance with the above terminology, the direction of the first axis of the intermediate celestial reference system $\vec{e}_1^{\,IC}$ is called the Celestial Intermediate Origin (CIO), while the direction of the first axis of the intermediate terrestrial reference system $\vec{e}_1^{\,IT}$ is called the Terrestrial Intermediate Origin (TIO). More details on the IERS representation of earth rotation are given in the IERS Conventions [52]. Looking at the original rigorous representation of the rotation matrix

$$R = R_3(-F)\,R_2(-g)\,R_3(F + s' + \theta - s - E)\,R_2(d)\,R_3(E), \qquad (29)$$

we note that it involves 7 parameters (time functions), while an orthogonal matrix can be represented by only three parameters. Therefore the implemented parameters must satisfy a number of conditions, which at first sight appear to be $7 - 3 = 4$. More careful examination reveals the fact that the parameters $s$ and $s'$ are not independent, since only their difference $s' - s$ appears in the representation. Indeed, the representation remains the same if $s$ and $s'$ are replaced by $\tilde{s} = s + f$ and $\tilde{s}' = s' + f$, where $f$ is an arbitrary time function. Therefore the representation depends on 6 parameters $(F, g, s'-s, \theta, d, E)$, which must satisfy $6 - 3 = 3$ conditions. In order to find these conditions we remark that the representation involves the explicit use of a rotation axis $\vec{e}_3^{\,IT} = \vec{e}_3^{\,IC}$, independently of whether this is the instantaneous rotation axis, or the CIP, or any other convenient choice. Furthermore the rotation must have an angular velocity $\dot{\theta} = \frac{d\theta}{dt}$, so that the rotation vector corresponding to $R$ must be of the form $\vec{\omega} = \dot{\theta}\,\vec{e}_3^{\,IT} = \dot{\theta}\,\vec{e}_3^{\,IC}$, and setting $\vec{\omega} = \vec{e}_{IT}\,\boldsymbol{\omega}^{IT} = \vec{e}_{IC}\,\boldsymbol{\omega}^{IC}$ the components in both intermediate systems must be $\boldsymbol{\omega}^{IT} = \boldsymbol{\omega}^{IC} = [0\ 0\ \dot{\theta}]^T$. We may find the components $\boldsymbol{\omega}^T$ of $\vec{\omega}$ in the terrestrial reference system using the generalized Euler kinematic equations $[\boldsymbol{\omega}^T\times] = R\,\frac{dR^T}{dt}$, and then convert them to the intermediate terrestrial system with the polar motion transformation $\boldsymbol{\omega}^{IT} = W^T\boldsymbol{\omega}^T$. Thus, the desired conditions follow by setting

$$\boldsymbol{\omega}^{IT} = W^T\boldsymbol{\omega}^T = \begin{bmatrix} 0 \\ 0 \\ \dot{\theta} \end{bmatrix}, \qquad (30)$$

with $\boldsymbol{\omega}^T$ derived from the Euler kinematic equations. Performing the necessary differentiations (see [27, 30, 33] for details), we arrive at the following three differential equations

$$R(E+s)\begin{bmatrix} \dot{E}\sin d \\ \dot{d} \end{bmatrix} = R(\theta)\,R(F+s')\begin{bmatrix} \dot{F}\sin g \\ \dot{g} \end{bmatrix}, \qquad (31)$$

$$\dot{s}' + \dot{F} - \cos g\,\dot{F} = \dot{s} + \dot{E} - \cos d\,\dot{E}, \qquad (32)$$

which constitute the compatibility conditions that the superfluous Earth Orientation Parameters (EOPs) must satisfy for the rotation matrix representation (29) to be mathematically consistent. The first two compatibility conditions (31) are the direction conditions, which assert that $\omega^{IT}_1 = 0$ and $\omega^{IT}_2 = 0$, i.e., that the rotation vector induced by the rotation matrix $R$ of Eq. (29) has the same direction as the common third axis of the intermediate celestial and the intermediate terrestrial reference system (direction of diurnal rotation): $\vec{\omega} \parallel \vec{e}_3^{\,IT} = \vec{e}_3^{\,IC}$. The third compatibility condition (32) is the magnitude condition, which asserts that $\omega^{IT}_3 = \dot{\theta}$. When all three conditions are satisfied, then obviously $\omega = |\vec{\omega}| = \dot{\theta}$. The satisfaction of the two NRO conditions (18), namely $\dot{s} = (\cos d - 1)\dot{E}$, $\dot{s}' = (\cos g - 1)\dot{F}$, clearly guarantees the satisfaction of (32). Since however the direction conditions, which are ignored in the IERS representation, are not satisfied, it is simply wrong to assume that $\omega = \dot{\theta}$, since $\omega^{IT}_1 \neq 0$, $\omega^{IT}_2 \neq 0$ and thus $\omega \neq \omega^{IT}_3 = \dot{\theta}$. We may obtain $\omega$ from the relation

$$\omega^2 = \dot{\theta}^2 + \Delta\dot{\theta}^2, \qquad (33)$$

where $\Delta\dot{\theta}^2 = (\omega^{IT}_1)^2 + (\omega^{IT}_2)^2$ is given by

$$\Delta\dot{\theta}^2 = \dot{E}^2\sin^2 d + \dot{d}^2 + \dot{F}^2\sin^2 g + \dot{g}^2 - 2\cos(s' + F + \theta - s - E)\,(\dot{E}\dot{F}\sin d \sin g + \dot{d}\dot{g}) - 2\sin(s' + F + \theta - s - E)\,(\dot{E}\dot{g}\sin d - \dot{F}\dot{d}\sin g). \qquad (34)$$

For the IERS representation of Eq. (29) the compatibility conditions take the form

$$R(-s)\begin{bmatrix} -\dfrac{a^2}{1-a}XY\,\dot{X} - \left(1 + \dfrac{a^2}{1-a}Y^2\right)\dot{Y} \\[1ex] \left(1 + \dfrac{a^2}{1-a}X^2\right)\dot{X} + \dfrac{a^2}{1-a}XY\,\dot{Y} \end{bmatrix} = R(-\theta - s')\begin{bmatrix} \dot{y}_P\cos x_P \\ \dot{x}_P \end{bmatrix}, \qquad (35)$$


$$aY\dot{X} - aX\dot{Y} - \dot{s} = \dot{y}_P\sin x_P - \dot{s}', \qquad (36)$$

for direction and modulus respectively. The correction $\Delta\dot{\theta}^2$ for the rotational velocity is given by

$$\begin{aligned}
\Delta\dot{\theta}^2 ={}& \frac{(a^2XY)^2 + (1-a+a^2X^2)^2}{(1-a)^2}\,\dot{X}^2 + \frac{(1-a+a^2Y^2)^2 + (a^2XY)^2}{(1-a)^2}\,\dot{Y}^2 \\
&+ 2\,a^2XY\,\frac{2-2a+a^2(X^2+Y^2)}{(1-a)^2}\,\dot{X}\dot{Y} + \dot{y}_P^2\cos^2 x_P + \dot{x}_P^2 \\
&+ 2\,\frac{\sin(s-s'-\theta)}{1-a}\Big\{\cos x_P\big[(1-a+a^2X^2)\dot{X}\dot{y}_P + (a^2XY)\dot{Y}\dot{y}_P\big] + (a^2XY)\dot{X}\dot{x}_P + (1-a+a^2Y^2)\dot{Y}\dot{x}_P\Big\} \\
&+ 2\,\frac{\cos(s-s'-\theta)}{1-a}\Big\{\cos x_P\big[(a^2XY)\dot{X}\dot{y}_P + (1-a+a^2Y^2)\dot{Y}\dot{y}_P\big] - (1-a+a^2X^2)\dot{X}\dot{x}_P - (a^2XY)\dot{Y}\dot{x}_P\Big\}. \qquad (37)
\end{aligned}$$

The above IERS representation of earth rotation is effective since January 1st, 2003, and is based on the IAU2000 resolutions of the International Astronomical Union. The former representation differs from the new one in two aspects: the first is the separation of nutation from precession, and the second is in the definition of the two origins (directions of the first axes) that define diurnal rotation. The previous representation was of the form

$$R = W\,R_3(\mathrm{GST})\,N\,P = \left[R_1(-y_P)R_2(-x_P)\right]R_3(\mathrm{GST})\left[R_1(-\varepsilon - \Delta\varepsilon)R_3(-\Delta\psi)R_1(\varepsilon)\right]\left[R_3(-z)R_2(\theta)R_3(-\zeta)\right], \qquad (38)$$

where GST is the Greenwich Sidereal Time, $z$, $\theta$, $\zeta$ are the precession angles, while nutation is defined by the obliquity $\varepsilon$, the nutation in obliquity $\Delta\varepsilon$, and the nutation in longitude $\Delta\psi$. The representation of polar motion is essentially the same, except for the tuning rotation $R_3(s')$, which is missing, along with its celestial counterpart $R_3(-s)$. The precession matrix $P$ transforms coordinates $\mathbf{x}_C$ in the celestial system $\vec{e}_C$ into coordinates $\mathbf{x}_{MC} = P\mathbf{x}_C$ in the mean celestial system $\vec{e}_{MC}$. The nutation matrix $N$ transforms coordinates $\mathbf{x}_{MC}$ in the mean celestial system $\vec{e}_{MC}$ into coordinates $\mathbf{x}_{TC} = N\mathbf{x}_{MC}$ in the true celestial system $\vec{e}_{TC}$, having the axis $\vec{e}_3^{\,TC}$ in the direction of the Celestial Ephemeris Pole (CEP), which differs from the direction of the instantaneous rotation vector $\vec{\omega}$ in a way similar to that of the CIP in the new representation. The diurnal rotation matrix $R_3(\mathrm{GST})$ transforms coordinates $\mathbf{x}_{TC}$ in the true celestial system $\vec{e}_{TC}$ into coordinates $\mathbf{x}_{TT} = R_3(\mathrm{GST})\mathbf{x}_{TC}$ in the


true terrestrial system $\vec{e}_{TT}$, having its third axis $\vec{e}_3^{\,TT} = \vec{e}_3^{\,TC}$ also in the direction of the CEP. Finally the polar motion matrix $W$ transforms coordinates $\mathbf{x}_{TT}$ in the true terrestrial system into coordinates $\mathbf{x}_T = W\mathbf{x}_{TT}$ in the terrestrial system. The first axis $\vec{e}_1^{\,TC}$ of the true celestial system is in the direction of the vernal equinox, which is the intersection of the true equator (plane of $\vec{e}_1^{\,TC}$, $\vec{e}_2^{\,TC}$) with the ecliptic (plane of the orbit of the earth). The first axis $\vec{e}_1^{\,MC}$ of the mean celestial system is in the direction of the mean vernal equinox, which is the intersection of the mean equator (plane of $\vec{e}_1^{\,MC}$, $\vec{e}_2^{\,MC}$) with the ecliptic. The direction of the first axis $\vec{e}_1^{\,TT}$ of the true terrestrial system has no specific definition. It is simply the direction which results from the transformation $\vec{e}_{TT} = \vec{e}_T W = \vec{e}_T R_1(-y_P)R_2(-x_P)$. A different choice of rotations than $R_1(-y_P)R_2(-x_P)$, which would also bring $\vec{e}_3^{\,T}$ to the direction $\vec{e}_3^{\,TT}$ of the CEP, would have resulted in a different direction of $\vec{e}_1^{\,TT}$. The celestial system $\vec{e}_C$ was defined to be no other than the mean celestial system of a specific reference epoch, namely 12h UT (Universal Time) of January 1, 2000. An earlier choice had been the same hour and date of 1950.

4 The Realization of a Reference System Within Data Analysis, in the Case of Rigid Geodetic Networks

In performing data analysis one normally uses a mathematical model $\mathbf{y}^a = \mathbf{f}(\mathbf{x}^a)$ relating the $n$ observables $\mathbf{y}^a$ to $m < n$ unknown parameters $\mathbf{x}^a$ in an unambiguous way, i.e., by an injective mapping $\mathbf{f}: \mathbf{x}^a \mapsto \mathbf{y}^a$, such that for every $\mathbf{y}^a \in M \equiv R(\mathbf{f}) \subset R^n$ there exists a unique $\mathbf{x}^a$ such that $\mathbf{y}^a = \mathbf{f}(\mathbf{x}^a)$. In classical geodesy, however, we observe quantities $\mathbf{y}^a$, e.g., angles or distances, which depend only on the geometric form of the geodetic network, while coordinates are used as unknown parameters $\mathbf{x}^a$. Thus to any value of the observables $\mathbf{y}^a$ corresponds an infinite set of unknowns $\mathbf{x}^a$, which are the coordinates expressing the network form specified by the observables, in the various possible reference systems. Thus, in order to obtain a unique solution, the reference system has to be chosen, either a priori, or within the data analysis process. The mapping $\mathbf{f}$ is no longer injective. As $\mathbf{x}^a$ varies over $R^m$ (the space of the unknowns), the corresponding images $\mathbf{f}(\mathbf{x}^a)$ do not cover the whole of $R^n$ (the space of the observations), but only a submanifold

$$M = R(\mathbf{f}) = \{\mathbf{y}^a = \mathbf{f}(\mathbf{x}^a)\ |\ \mathbf{x}^a \in R^m\}, \qquad (39)$$

which we will call the observables manifold. It has dimension $r = m - d$, where $d$ is the number of parameters defining the reference system, e.g., 6 for a 3D network with unknown origin and orientation, or 7 if the network scale is also unknown. This means that there are $d$ superfluous parameters among the $m$ coordinates of the network, as many as the parameters of a transformation that corresponds to a change of the reference system.


To any given $\mathbf{y}^a \in M$ there corresponds a solution manifold or shape manifold

$$S_{\mathbf{y}^a} = \{\mathbf{x}^a \in R^m\ |\ \mathbf{f}(\mathbf{x}^a) = \mathbf{y}^a\}, \qquad (40)$$

consisting of all coordinate sets $\mathbf{x}^a$ giving the same observables $\mathbf{y}^a$ and the same network configuration. As $\mathbf{y}^a$ varies over $M$, the various corresponding manifolds $S_{\mathbf{y}^a}$ have two characteristics: (a) they do not intersect ($S_{\mathbf{y}^a} \cap S_{\tilde{\mathbf{y}}^a} = \emptyset$ for $\tilde{\mathbf{y}}^a \neq \mathbf{y}^a$), and (b) they fill up the parameter space $R^m$ (given any $\mathbf{x}^a \in R^m$, there exists a unique manifold $S_{\mathbf{f}(\mathbf{x}^a)}$ to which $\mathbf{x}^a$ belongs). In mathematical terms we say that the shape manifolds constitute a fibering $F$ of $R^m$, with each $S_{\mathbf{y}^a}$ being a fiber of $F$. One way to define the reference system is by means of an appropriate set of minimal constraints $\mathbf{c}(\mathbf{x}^a) = \mathbf{0}$, which define a manifold $C = \{\mathbf{x}^a \in R^m\ |\ \mathbf{c}(\mathbf{x}^a) = \mathbf{0}\}$, such that for every $\mathbf{y}^a \in M$, $C$ and $S_{\mathbf{y}^a}$ have a single point in common, $C \cap S_{\mathbf{y}^a} = \{\mathbf{x}^a_{C,\mathbf{y}^a}\}$ (see Fig. 1). Such a manifold $C$ is called a section of the fibering $F$. Let $\mathbf{x}^a \in S_{\mathbf{y}^a}$ and $\tilde{\mathbf{x}}^a \in S_{\mathbf{y}^a}$ be two points on the same shape manifold, i.e., $\mathbf{f}(\mathbf{x}^a) = \mathbf{f}(\tilde{\mathbf{x}}^a) = \mathbf{y}^a$. As they represent the same network configuration, there exists


Fig. 1 The geometry of the non-linear (above) and the linearized (below) least squares solutions


a coordinate transformation

$$\tilde{\mathbf{x}}^a = T_{\mathbf{p}}(\mathbf{x}^a) = \mathbf{t}(\mathbf{x}^a, \mathbf{p}), \qquad (41)$$

mapping one into the other, where $\mathbf{p}$ are the $d$ transformation parameters. For example, in a three-dimensional network we have the general similarity transformation

$$\tilde{\mathbf{x}}^a = \mathbf{t}(\mathbf{x}^a, \mathbf{p}) = \mathbf{t}(\mathbf{x}^a; s, \boldsymbol{\theta}, \mathbf{d}) = (1+s)\,R(\boldsymbol{\theta})\,\mathbf{x}^a + \mathbf{d}, \qquad (42)$$

where $\mathbf{p} = [\boldsymbol{\theta}^T\ \mathbf{d}^T\ s]^T$, $s$ is the scale parameter, $\boldsymbol{\theta} = [\theta_1\ \theta_2\ \theta_3]^T$ are the rotation angles, and $\mathbf{d} = [d_1\ d_2\ d_3]^T$ the translation components. Fixing $\mathbf{x}^a$, every other point $\tilde{\mathbf{x}}^a \in S_{\mathbf{y}^a}$ is in one-to-one correspondence with the transformation parameters from $\mathbf{x}^a$ to $\tilde{\mathbf{x}}^a = \mathbf{t}(\mathbf{x}^a, \mathbf{p})$. This means that the transformation parameters from the fixed $\mathbf{x}^a$ may serve as a set of curvilinear coordinates on $S_{\mathbf{y}^a}$. In particular the tangent vectors to the coordinate lines, $\frac{\partial\mathbf{t}}{\partial p_i}(\mathbf{x}^a, \mathbf{p} = \mathbf{0})$, i.e., the columns of the matrix $E^a \equiv \frac{\partial\mathbf{t}}{\partial\mathbf{p}}(\mathbf{x}^a, \mathbf{p} = \mathbf{0})$, form a basis for the tangent space $N^a = T_{\mathbf{x}^a}(S_{\mathbf{y}^a})$ to the shape manifold $S_{\mathbf{y}^a}$ at the point $\mathbf{x}^a$. In actual data analysis, the outcomes of the observations $\mathbf{y}^b = \mathbf{y}^a + \mathbf{e}$ differ from the values of the observables $\mathbf{y}^a$, as a consequence of the unavoidable observation errors $\mathbf{e}$, and $\mathbf{y}^b \notin M$. Thus an estimate $\hat{\mathbf{y}}^a \in M$ of $\mathbf{y}^a$ needs to be chosen in order to obtain a unique estimate $\hat{\mathbf{x}}^a$ of the unknown parameters with the help of the additional minimal constraints $\mathbf{c}(\mathbf{x}^a) = \mathbf{0}$. The usual choice is estimation by the least squares method, where the squared weighted distance $\phi = \|\mathbf{y}^b - \mathbf{f}(\mathbf{x}^a)\|^2_P = [\mathbf{y}^b - \mathbf{f}(\mathbf{x}^a)]^T P\,[\mathbf{y}^b - \mathbf{f}(\mathbf{x}^a)]$ of $\mathbf{y}^b$ from the manifold $M$ is minimized. Application of $\phi = \min$ under the conditions $\mathbf{c}(\mathbf{x}^a) = \mathbf{0}$ leads to the rather complicated nonlinear normal equations

$$\left[\frac{\partial\mathbf{f}}{\partial\mathbf{x}^a}(\hat{\mathbf{x}}^a_{NL})\right]^T P\,[\mathbf{y}^b - \mathbf{f}(\hat{\mathbf{x}}^a_{NL})] + \left[\frac{\partial\mathbf{c}}{\partial\mathbf{x}^a}(\hat{\mathbf{x}}^a_{NL})\right]^T \mathbf{k} = \mathbf{0}, \qquad \mathbf{c}(\hat{\mathbf{x}}^a_{NL}) = \mathbf{0}, \qquad (43)$$

which must be solved for $\hat{\mathbf{x}}^a_{NL}$ and the Lagrange multipliers $\mathbf{k}$. The observables solution $\hat{\mathbf{y}}^a_{NL} = \mathbf{f}(\hat{\mathbf{x}}^a_{NL})$ is in fact the orthogonal projection of the observations $\mathbf{y}^b$ on the observables manifold $M$ (see Fig. 1). All elements of the corresponding shape manifold $S_{\hat{\mathbf{y}}^a_{NL}}$ are least squares solutions for the unknown parameters $\mathbf{x}^a$, corresponding to the same shape of the geodetic network expressed in different reference systems. The nonlinear constraints $\mathbf{c}(\mathbf{x}^a) = \mathbf{0}$ merely serve in picking a unique one, $\hat{\mathbf{x}}^a_{NL}$, out of all elements of $S_{\hat{\mathbf{y}}^a_{NL}}$, thus choosing a particular reference system. The constraints manifold $C = \{\mathbf{x}^a \in R^m\ |\ \mathbf{c}(\mathbf{x}^a) = \mathbf{0}\}$ of all points satisfying the constraints is a section of the fibering $F$ of all shape manifolds $S_{\mathbf{y}^a}$, provided that it intersects each one of them at a single point $\mathbf{x}^a_{C,\mathbf{y}^a}$, i.e., $C \cap S_{\mathbf{y}^a} = \{\mathbf{x}^a_{C,\mathbf{y}^a}\}$, $\forall\mathbf{y}^a \in M$. A necessary and sufficient condition for this (transversality condition) is that the tangent spaces $T_{\mathbf{x}^a}(C)$, $T_{\mathbf{x}^a}(S_{\mathbf{f}(\mathbf{x}^a)})$ to the section $C$ and the solution manifold $S_{\mathbf{f}(\mathbf{x}^a)}$, respectively, at each point $\mathbf{x}^a \in C$, have only the origin $\mathbf{0}$ as common element [34].


To avoid solving the complicated nonlinear normal equations (43) by numerical analysis techniques, an iterative scheme is used, based on some initiating approximate values $\mathbf{x}_0$ for $\mathbf{x}^a$ and linearization of the mathematical model and the minimal constraints, if they are not already linear. The linearized model becomes

$$\mathbf{b} \equiv \mathbf{y}^b - \mathbf{f}(\mathbf{x}_0) = \left[\frac{\partial\mathbf{f}}{\partial\mathbf{x}^a}(\mathbf{x}_0)\right](\mathbf{x}^a - \mathbf{x}_0) + \mathbf{e} \equiv A\mathbf{x} + \mathbf{e}, \qquad (44)$$

and the linearized minimal constraints

$$\mathbf{c}(\mathbf{x}_0) + \left[\frac{\partial\mathbf{c}}{\partial\mathbf{x}^a}(\mathbf{x}_0)\right](\mathbf{x}^a - \mathbf{x}_0) \equiv -\mathbf{d} + C^T\mathbf{x} = \mathbf{0}. \qquad (45)$$

The original unknowns $\mathbf{x}^a$ are replaced by the unknown corrections $\mathbf{x} = \mathbf{x}^a - \mathbf{x}_0$, the observations $\mathbf{y}^b$ by the reduced observations $\mathbf{b} = \mathbf{y}^b - \mathbf{f}(\mathbf{x}_0) \equiv \mathbf{y}^b - \mathbf{y}_0$, and the observables $\mathbf{y}^a$ by the corrections to the observables $\mathbf{y} = \mathbf{y}^a - \mathbf{f}(\mathbf{x}_0) = \mathbf{y}^a - \mathbf{y}_0$. The nonlinear model $\mathbf{y}^a = \mathbf{f}(\mathbf{x}^a)$ is replaced by the linear model $\mathbf{y} = A\mathbf{x}$, and the nonlinear observation equations $\mathbf{y}^b = \mathbf{f}(\mathbf{x}^a) + \mathbf{e}$ by the linearized ones $\mathbf{b} = A\mathbf{x} + \mathbf{e}$. The implementation of a linear model allows us to solve the "choice of weight matrix problem" under the assumption that the errors are random variables with zero mean $E\{\mathbf{e}\} = \mathbf{0}$ and covariance matrix $\mathbf{C} = E\{\mathbf{e}\mathbf{e}^T\} = \sigma^2 Q$, known up to an unknown scalar factor $\sigma^2$. According to the celebrated Gauss-Markov theorem, the choice $P = Q^{-1}$ provides Best Linear Uniformly Unbiased Estimates (BLUUE, or simply BLUE) $\hat{q} = \mathbf{a}^T\hat{\mathbf{x}}$ for all estimable linear functions $q = \mathbf{a}^T\mathbf{x}$ of the parameters, i.e., quantities which are functions $q = \mathbf{d}^T\mathbf{y} = \mathbf{d}^T A\mathbf{x}$ of the observables, or equivalently functions $q = \mathbf{a}^T\mathbf{x}$ with $\mathbf{a} = A^T\mathbf{d} \in R(A^T)$. The essence of the linearization lies in the assumption that the approximate values $\mathbf{x}_0$ produce an approximate geometric configuration of the network, expressed by the corresponding approximate values of the observables $\mathbf{y}_0 = \mathbf{f}(\mathbf{x}_0)$, which is close to the true geometric configuration of the network, expressed by the true values of the observables $\mathbf{y}^a = \mathbf{f}(\mathbf{x}^a)$. As the observational errors $\mathbf{e}$ are small, this will allow the estimation process to produce a linear estimate $\hat{\mathbf{y}}^a = \mathbf{f}(\hat{\mathbf{x}}^a)$ which is close to the true value of the observables $\mathbf{y}^a = \mathbf{f}(\mathbf{x}^a)$. Even in this case, there are infinitely many coordinate sets $\hat{\mathbf{x}}^a$ which produce the same value $\hat{\mathbf{y}}^a$, but which may on the other hand vary significantly and be largely different from the approximate values $\mathbf{x}_0$. For this reason, we must abandon the idea of arbitrary coordinate transformations from one reference system to another, and seek only solutions $\mathbf{x}^a$ which are close to the approximate values $\mathbf{x}_0$. This will restrict the coordinate transformations, and the corresponding choices of reference system, to those provided by coordinate transformations close to the identity, i.e., transformations $\tilde{\mathbf{x}}^a_i = (1+s)R(\boldsymbol{\theta})\mathbf{x}^a_i + \mathbf{d}$ with very small parameter values $s$, $\boldsymbol{\theta}$, $\mathbf{d}$ (recall that zero transformation parameter values correspond to the identity transformation). In this case, the approximate linearized close-to-the-identity transformation takes the form $\tilde{\mathbf{x}}^a_i = \mathbf{x}^a_i + s\mathbf{x}^a_i + [\mathbf{x}^a_i\times]\boldsymbol{\theta} + \mathbf{d}$, which results by utilizing the approximation $R(\boldsymbol{\theta}) = I - [\boldsymbol{\theta}\times]$ and neglecting second and higher order terms. Replacing the entries $\mathbf{x}^a_i$ of $\mathbf{x}^a$ with $\mathbf{x}^a_i = \mathbf{x}^{ap}_i + \mathbf{x}_i$, where $\mathbf{x}^{ap}_i$


are the approximate coordinates of station $i$ (entries of $\mathbf{x}_0$) and $\mathbf{x}_i$ the corresponding corrections (entries of $\mathbf{x}$), the transformation becomes

$$\tilde{\mathbf{x}}_i \approx \mathbf{x}_i + s\,\mathbf{x}^{ap}_i + [\mathbf{x}^{ap}_i\times]\boldsymbol{\theta} + \mathbf{d} = \mathbf{x}_i + \begin{bmatrix} [\mathbf{x}^{ap}_i\times] & I_3 & \mathbf{x}^{ap}_i \end{bmatrix}\begin{bmatrix} \boldsymbol{\theta} \\ \mathbf{d} \\ s \end{bmatrix} \equiv \mathbf{x}_i + E_i\,\mathbf{p}. \qquad (46)$$

For all stations the linearized coordinate transformation takes the form $\tilde{\mathbf{x}} = \mathbf{x} + E\mathbf{p}$, where

$$E = \begin{bmatrix} \vdots \\ E_i \\ \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & \vdots \\ [\mathbf{x}^{ap}_i\times] & I_3 & \mathbf{x}^{ap}_i \\ \vdots & \vdots & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 0 & -z^{ap}_i & y^{ap}_i & 1 & 0 & 0 & x^{ap}_i \\ z^{ap}_i & 0 & -x^{ap}_i & 0 & 1 & 0 & y^{ap}_i \\ -y^{ap}_i & x^{ap}_i & 0 & 0 & 0 & 1 & z^{ap}_i \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \end{bmatrix}. \qquad (47)$$

This is the linearized three-dimensional similarity transformation, involving rotation, translation and scale change. If one of these transformation elements is missing, the corresponding columns of $E_i$ and rows of $\mathbf{p}$ should be omitted. For example, in the rigid transformation without scale change, $E_i = \begin{bmatrix} [\mathbf{x}^{ap}_i\times] & I_3 \end{bmatrix}$ and $\mathbf{p} = \begin{bmatrix} \boldsymbol{\theta} \\ \mathbf{d} \end{bmatrix}$. For the case of two-dimensional planar networks, and introducing the matrix of planar rotation

$$R(\theta) = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}, \qquad (48)$$

the similarity transformation $\tilde{\mathbf{x}}^a_i = (1+s)R(\theta)\mathbf{x}^a_i + \mathbf{d}$ takes the linearized form

$$\tilde{\mathbf{x}}_i \approx \mathbf{x}_i + s\,\mathbf{x}^{ap}_i + [W\mathbf{x}^{ap}_i]\,\theta + \mathbf{d} = \mathbf{x}_i + \begin{bmatrix} W\mathbf{x}^{ap}_i & I_2 & \mathbf{x}^{ap}_i \end{bmatrix}\begin{bmatrix} \theta \\ \mathbf{d} \\ s \end{bmatrix} \equiv \mathbf{x}_i + E_i\,\mathbf{p}, \qquad (49)$$

where $W = R(90^\circ)$. For all network points the transformation becomes $\tilde{\mathbf{x}} = \mathbf{x} + E\mathbf{p}$, where

$$E = \begin{bmatrix} \vdots \\ E_i \\ \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & \vdots \\ W\mathbf{x}^{ap}_i & I_2 & \mathbf{x}^{ap}_i \\ \vdots & \vdots & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ y^{ap}_i & 1 & 0 & x^{ap}_i \\ -x^{ap}_i & 0 & 1 & y^{ap}_i \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}. \qquad (50)$$
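The station blocks of Eqs. (46)-(47) are easy to assemble in code. The following NumPy sketch (station coordinates invented) builds the three-dimensional inner-constraint matrix $E$ and checks one block against the close-to-the-identity similarity transformation for tiny parameter values:

```python
import numpy as np

def E_block_3d(x_ap):
    """One 3x7 station block [ [x^ap x]  I3  x^ap ] of Eq. (47)."""
    x, y, z = x_ap
    cross = np.array([[0.0, -z, y],
                      [z, 0.0, -x],
                      [-y, x, 0.0]])
    return np.hstack([cross, np.eye(3), np.array([[x], [y], [z]])])

def E_matrix_3d(coords_ap):
    """Stack the station blocks into the (3n) x 7 matrix E."""
    return np.vstack([E_block_3d(c) for c in coords_ap])

coords = [(1.0, 2.0, 3.0), (-2.0, 0.5, 1.0), (0.0, -1.0, 4.0)]  # invented
E = E_matrix_3d(coords)
print(E.shape)  # (9, 7): three stations, parameters (theta, d, s)

# Compare the linearized form (46) with (1+s)(I - [theta x]) x + d for tiny p
theta = np.array([1e-6, -2e-6, 3e-6])
d, s = np.array([0.01, 0.02, -0.01]), 1e-6
p = np.concatenate([theta, d, [s]])
x = np.array(coords[0])
Rt = np.eye(3) - np.array([[0.0, -theta[2], theta[1]],
                           [theta[2], 0.0, -theta[0]],
                           [-theta[1], theta[0], 0.0]])
exact = (1.0 + s) * Rt @ x + d
print(np.abs(exact - (x + E_block_3d(x) @ p)).max())  # tiny: only second-order terms differ
```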


In order to understand the relation between the original nonlinear and the linearized estimation problem, we first note that as any coordinate, e.g., $x^a_i$, varies in $R$, with the rest of the coordinates remaining fixed, the image $\mathbf{f}(\mathbf{x}^a)$ traces a curve on the observables manifold $M$, and the partial derivative $\frac{\partial\mathbf{f}}{\partial x^a_i}$ is a tangent vector to this curve and hence tangent to $M$. In particular the columns of the matrix $A$, which are by definition the derivatives $\mathbf{a}_i = \frac{\partial\mathbf{f}}{\partial x^a_i}(\mathbf{x}_0)$, are vectors tangent to $M$ at the point $\mathbf{y}_0 = \mathbf{f}(\mathbf{x}_0) \in M$. Therefore the $m$ columns of $A$ span the linear manifold $\mathcal{A}$ tangent to $M$ at $\mathbf{y}_0 = \mathbf{f}(\mathbf{x}_0)$, and since the range of $A$ is the set of the linear combinations of its columns, $\mathcal{A} = R(A)$. On the other side, consider within $S_{\mathbf{y}_0} = S_{\mathbf{f}(\mathbf{x}_0)}$ a change of reference system transformation $\tilde{\mathbf{x}}_0 = \mathbf{t}(\mathbf{x}_0, \mathbf{p})$. As each $p_i$ varies, holding the rest of the transformation parameters $\mathbf{p}$ fixed, $\tilde{\mathbf{x}}_0 = \tilde{\mathbf{x}}_0(p_i)$ traces a curve on $S_{\mathbf{y}_0}$, and the partial derivative $\frac{\partial\tilde{\mathbf{x}}_0}{\partial p_i} = \frac{\partial\mathbf{t}}{\partial p_i}$ is a vector tangent to this curve and hence tangent to $S_{\mathbf{y}_0}$. Therefore the $d$ columns

$$\mathbf{e}_i = \frac{\partial\mathbf{t}}{\partial p_i}(\mathbf{x}_0, \mathbf{p} = \mathbf{0}), \qquad (51)$$

of the matrix

$$E = \frac{\partial\mathbf{t}}{\partial\mathbf{p}}(\mathbf{x}_0, \mathbf{p} = \mathbf{0}), \qquad (52)$$

(52)

of the matrix

span the linear manifold N = Tx0 (Sy0 ) = R(E) tangent to Sy0 = Sf(x0 ) at x0 . Considering the combined function (f ◦ t)(xa , p) ≡ f(t(xa , p)) and recalling that f (t (xa , p)) = f (xa ) = ya , ∀p, with ya independent of p, the chain rule for derivatives gives ∂(f ◦ t) a (x , 0) = ∂p where we have set Aa = it holds that



 ∂f ∂t ∂ya (xa , 0) = Aa Ea = = 0, ∂t ∂p ∂p

∂f a ∂xa (x )

and Ea =

∂t a ∂p (x , 0)

(53)

as before. In particular at x0

AE = 0,

(54)

∂f ∂t where A = ∂x a (x0 ) and E = ∂p (x0 , p = 0) as before. The relation (54) means that each column ek , k = 1, 2, . . . , d of E, is mapped by A into Aek = 0. Since the columns of E are linearly independent they form a basis for the null space N (A) of A and the tangent manifold defined above can now be identified as

N = Tx0 (Sy0 ) = R(E) = N(A) ≡ {x ∈ R m |Ax = 0}.

(55)

The situation is similar at xa where Na = Txa (Sya ) = R(Ea ) = N (Aa ). The linearization replaces the original nonlinear mapping f : R m →R n : xa →ya , from R m to R n with its derivative mapping

13 Theory and Realization of Reference Systems

A = ∇f : Tx0 (R m ) → Ty0 (R n ),

717

(56)

from the tangent space $T_{\mathbf{x}_0}(R^m)$ (at $\mathbf{x}_0$ to $R^m$) to the tangent space $T_{\mathbf{y}_0}(R^n)$ (at $\mathbf{y}_0 = \mathbf{f}(\mathbf{x}_0)$ to $R^n$). Since $T_{\mathbf{x}_0}(R^m)$ and $T_{\mathbf{y}_0}(R^n)$ are essentially $R^m$ and $R^n$ with their origins shifted to $\mathbf{x}_0$ and $\mathbf{y}_0 = \mathbf{f}(\mathbf{x}_0)$, respectively, we may simply consider the linear mapping $A: R^m \to R^n: \mathbf{x} \mapsto \mathbf{y}$. The estimation by least squares provides a unique reduced observable estimate $\hat{\mathbf{y}} = A\hat{\mathbf{x}}$, which is the projection of the reduced observations $\mathbf{b} = \mathbf{y}^b - \mathbf{y}_0 = \mathbf{y}^b - \mathbf{f}(\mathbf{x}_0)$ on the image manifold $\mathcal{A}$, minimizing the weighted norm

$$\|\mathbf{b} - \mathbf{y}\|^2_P = \|\mathbf{b} - A\mathbf{x}\|^2_P = (\mathbf{b} - A\mathbf{x})^T P(\mathbf{b} - A\mathbf{x}) = \min. \qquad (57)$$

Note that, as a consequence of the linearization errors, $\hat{\mathbf{y}}^a_L \equiv \mathbf{y}_0 + \hat{\mathbf{y}}$ differs from the least squares solution $\hat{\mathbf{y}}^a_{NL}$ of the nonlinear problem ($\hat{\mathbf{y}}^a_L \neq \hat{\mathbf{y}}^a_{NL}$), and also $\hat{\mathbf{y}}^a_L \notin M$. If $\hat{\mathbf{x}}$ is a solution to the linearized least squares problem ($A\hat{\mathbf{x}} = \hat{\mathbf{y}}$), then $\hat{\mathbf{x}}^a_L \equiv \mathbf{x}_0 + \hat{\mathbf{x}} \neq \hat{\mathbf{x}}^a_{NL}$ for all least squares solutions $\hat{\mathbf{x}}^a_{NL}$ of the nonlinear problem. From $\hat{\mathbf{x}}$ we may obtain any other solution $\hat{\mathbf{x}}' = \hat{\mathbf{x}} + E\mathbf{p}$ by adding a linear combination $E\mathbf{p}$ of the columns of $E$, i.e., any vector belonging to the null space of $A$ ($E\mathbf{p} \in N(A) = R(E)$). As a consequence of the linearization errors, $\hat{\mathbf{x}}'^a_L \equiv \mathbf{x}_0 + \hat{\mathbf{x}}' = \mathbf{x}_0 + \hat{\mathbf{x}} + E\mathbf{p} = \hat{\mathbf{x}}^a_L + E\mathbf{p}$ differs from any corresponding nonlinear estimate $\hat{\mathbf{x}}'^a_{NL}$; in fact $\hat{\mathbf{x}}'^a_L \notin S_{\hat{\mathbf{y}}^a_{NL}}$. However, when the approximate values are sufficiently close to the real ones, these differences turn out to be insignificant in most real-life problems, since they remain below the level of the observational errors, which cause the difference between the estimate $\hat{\mathbf{y}}^a_L \equiv \mathbf{y}_0 + \hat{\mathbf{y}}$ and the (unknown) true value of $\mathbf{y}^a$ to be potentially considerably larger than the difference between the linear(ized) estimate $\hat{\mathbf{y}}^a_L$ and the nonlinear one $\hat{\mathbf{y}}^a_{NL}$. Within the linearized approach, given any $\hat{\mathbf{x}}$ such that $\hat{\mathbf{y}} = A\hat{\mathbf{x}}$, we may seek among all other least squares solutions $\hat{\mathbf{x}}' = \hat{\mathbf{x}} + E\mathbf{p}$, $\mathbf{p} \in R^d$, the one $\hat{\mathbf{x}}_E$ with minimum length $\|\hat{\mathbf{x}}_E\| = \min_{\mathbf{p}}\|\hat{\mathbf{x}}'\| = \min_{\mathbf{p}}\sqrt{\hat{\mathbf{x}}'^T\hat{\mathbf{x}}'}$, so that $\hat{\mathbf{x}}^a_E = \mathbf{x}_0 + \hat{\mathbf{x}}_E$ is as close as possible to the approximate values $\mathbf{x}_0$. Equivalently, we may minimize instead $\phi = \|\hat{\mathbf{x}}'\|^2 = \hat{\mathbf{x}}'^T\hat{\mathbf{x}}' = (\hat{\mathbf{x}} + E\mathbf{p})^T(\hat{\mathbf{x}} + E\mathbf{p})$ by setting $\frac{\partial\phi}{\partial\mathbf{p}} = \mathbf{0}$, which leads to $\mathbf{p}_E = -(E^T E)^{-1}E^T\hat{\mathbf{x}}$ and thus

$$\hat{\mathbf{x}}_E = \hat{\mathbf{x}} + E\mathbf{p}_E = \hat{\mathbf{x}} - E(E^T E)^{-1}E^T\hat{\mathbf{x}}. \qquad (58)$$
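The projection (58) onto the inner-constraint solution can be demonstrated on a toy rank-deficient problem. In the hypothetical one-dimensional "levelling" sketch below (NumPy, invented data), the observables are height differences, so the design matrix has a translation rank defect with null-space basis $E = (1, \ldots, 1)^T$; a particular least squares solution, obtained here by fixing the first height, is then shifted by (58) to the minimum norm solution satisfying $E^T\hat{\mathbf{x}}_E = \mathbf{0}$:

```python
import numpy as np

# Toy network of 4 heights; observables are differences h_j - h_i (invented data)
A = np.array([[-1.0, 1.0, 0.0, 0.0],
              [0.0, -1.0, 1.0, 0.0],
              [0.0, 0.0, -1.0, 1.0],
              [-1.0, 0.0, 0.0, 1.0]])
E = np.ones((4, 1))                     # null-space basis: A @ E = 0 (translation)
b = np.array([1.0, 2.0, -1.0, 2.1])    # noisy observed differences

# One particular least squares solution via the minimal constraint x_1 = 0
x_fix = np.zeros(4)
x_fix[1:] = np.linalg.lstsq(A[:, 1:], b, rcond=None)[0]

# Eq. (58): minimum norm solution x_E = x - E (E^T E)^{-1} E^T x
p_E = -np.linalg.solve(E.T @ E, E.T @ x_fix.reshape(-1, 1))
x_E = x_fix + (E @ p_E).ravel()

print(E.T @ x_E)                        # ~0: inner constraints satisfied
print(np.allclose(A @ x_E, A @ x_fix))  # same fitted observables, new reference
```

Both solutions describe the same network shape; only the reference (here, the zero level) differs.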

It is easy to verify that the minimum norm solution $\hat{\mathbf{x}}_E$ satisfies the so-called inner constraints $E^T\hat{\mathbf{x}}_E = \mathbf{0}$. In fact, it is completely identified among all least squares solutions by the inner constraints, since if $\hat{\mathbf{x}}' = \hat{\mathbf{x}} + E\mathbf{p}$ is another least squares solution satisfying $E^T\hat{\mathbf{x}}' = E^T\hat{\mathbf{x}} + E^T E\mathbf{p} = \mathbf{0}$, then $\mathbf{0} = E^T\hat{\mathbf{x}}' - E^T\hat{\mathbf{x}}_E = (E^T\hat{\mathbf{x}} + E^T E\mathbf{p}) - (E^T\hat{\mathbf{x}} + E^T E\mathbf{p}_E) = E^T E(\mathbf{p} - \mathbf{p}_E)$


implies that $\mathbf{p} = \mathbf{p}_E$ and hence $\hat{\mathbf{x}}' = \hat{\mathbf{x}}_E$. Note that if $\hat{\mathbf{x}}$ and $\hat{\mathbf{x}}'$ are any two elements of $S_{\hat{\mathbf{y}}}$, then $A(\hat{\mathbf{x}}' - \hat{\mathbf{x}}) = A\hat{\mathbf{x}}' - A\hat{\mathbf{x}} = \hat{\mathbf{y}} - \hat{\mathbf{y}} = \mathbf{0}$ and $\hat{\mathbf{x}}' - \hat{\mathbf{x}} \in N(A)$. Thus $S_{\hat{\mathbf{y}}}$ is a linear variety (hyperplane) parallel to the null space $N(A)$, and for a known $\hat{\mathbf{x}} \in S_{\hat{\mathbf{y}}}$ it can be represented as $S_{\hat{\mathbf{y}}} = \hat{\mathbf{x}} + N(A)$. The minimum norm solution can be visualized as the orthogonal projection of the origin $\mathbf{0}$ (moved to $\mathbf{x}_0$) on $S_{\hat{\mathbf{y}}}$. As $\mathbf{y}$ varies over the range manifold $\mathcal{A} = R(A)$, the corresponding solution manifolds $S_{\mathbf{y}}$ form a fibering $F$ of $R^m$, although in this case the term quotient space is rather used in linear algebra. In this case the section $C$ of the elements satisfying the linear constraints $C^T\mathbf{x} = \mathbf{d}$ is also a linear manifold that intersects each $S_{\mathbf{y}}$ at a single element. A necessary and sufficient condition for this is that it intersects the null space at a single element, $C \cap N(A) = \{\mathbf{x}_{C,N}\}$. In this respect, the linear constraints $C^T\mathbf{x} = \mathbf{d}$ are characterized as minimal constraints, because they provide a single element $\hat{\mathbf{x}}_C \in C \cap S_{\hat{\mathbf{y}}}$ out of all least squares solutions $\hat{\mathbf{x}} \in S_{\hat{\mathbf{y}}}$, thus choosing a particular reference system. Note that $\hat{\mathbf{x}}_C = \hat{\mathbf{x}}_E + \mathbf{x}_{C,N}$ holds in this case. The inner constraints $E^T\mathbf{x} = \mathbf{0}$ are a particular set of minimal constraints, which provide the minimum norm solution, so that $\|\hat{\mathbf{x}}_E\| = \|\hat{\mathbf{x}}^a_E - \mathbf{x}_0\| = \min_{\hat{\mathbf{x}} \in S_{\hat{\mathbf{y}}}}$.

The reference system to which the final coordinates $\hat{\mathbf{x}}^a_E = \mathbf{x}_0 + \hat{\mathbf{x}}_E$ belong can be interpreted as follows: The approximate coordinates $\mathbf{x}_0$ define an approximate geodetic network with a reference system attached to it. The least squares solution $\hat{\mathbf{y}}^a = \mathbf{y}_0 + \hat{\mathbf{y}} = \mathbf{f}(\mathbf{x}_0) + \hat{\mathbf{y}}$ for the observables fully determines the geometric configuration of the network. The minimum norm principle $\|\hat{\mathbf{x}}^a_E - \mathbf{x}_0\|^2 = (\hat{\mathbf{x}}^a_E - \mathbf{x}_0)^T(\hat{\mathbf{x}}^a_E - \mathbf{x}_0) = \min_{\hat{\mathbf{x}}^a_E \in S_{\hat{\mathbf{y}}^a}}$ provides a least squares fit (with identity weight matrix) of the estimated network configuration to the approximate one. Once the best fit is realized, the estimated network configuration inherits the reference system of the approximate network. The so-called inner constraint matrix $E$ is easily accessible in geodesy, on the basis of the understanding of the physical reasons that cause the rank deficiency of the design matrix $A$. This fact provides a series of results that have escaped the attention of statisticians. In the above discussion we considered only network station coordinates as unknowns. In a real problem there might be other types of parameters, of which some will be invariant under a change of the reference system and some will not. For the latter it suffices to know how they transform under a change of the reference system, in order to determine through linearization the total inner constraints matrix referring to all the unknown parameters of our particular model.

5 Least Squares Estimation for Models Without Full Rank Utilizing Minimal Constraints

We proceed with the solution to the problem of estimation of unknown parameters in the case of a linear model without full rank,

$$\mathbf{b} = A\mathbf{x} + \mathbf{e}, \qquad \mathbf{e} \sim (\mathbf{0}, \sigma^2 P^{-1}), \qquad (59)$$

where $A$ has dimensions $n \times m$ and rank $r(A) = r < m < n$, with a rank defect $d = m - r$. To apply the least squares principle $\phi = \|\mathbf{b} - \mathbf{y}\|^2_P = (\mathbf{b} - A\mathbf{x})^T P(\mathbf{b} - A\mathbf{x}) = \min$ under the additional minimal constraints $C^T\mathbf{x} = \mathbf{d}$, we form the Lagrangean $\Phi = \phi - 2\mathbf{k}^T(C^T\mathbf{x} - \mathbf{d})$ and set $\frac{\partial\Phi}{\partial\mathbf{x}} = \mathbf{0}$ and $\frac{\partial\Phi}{\partial\mathbf{k}} = \mathbf{0}$, which leads to the linear normal equations

$$\begin{bmatrix} N & C \\ C^T & 0 \end{bmatrix}\begin{bmatrix} \hat{\mathbf{x}} \\ \mathbf{k} \end{bmatrix} = \begin{bmatrix} \mathbf{u} \\ \mathbf{d} \end{bmatrix}, \qquad (60)$$

involving the Lagrange multipliers $\mathbf{k}$, where we have set $N = A^T PA$ and $\mathbf{u} = A^T P\mathbf{b}$. The solution of the above equations, and in particular the inversion of the augmented coefficient matrix, relies on the fact that the matrices $N$ and $C$ are not completely independent. Indeed, for the set of $d = m - \mathrm{rank}(A)$ constraints $C^T\mathbf{x} = \mathbf{d}$ to be a set of minimal constraints, one of the following two equivalent conditions must hold:

$$\mathrm{rank}\begin{bmatrix} A \\ C^T \end{bmatrix} = \mathrm{rank}\begin{bmatrix} N \\ C^T \end{bmatrix} = m. \qquad (61)$$

Another form of the inverse sought relies on the matrix $E$ satisfying the relation $AE = 0$, which also implies $NE = 0$. As a consequence of these interdependences the following relations hold:

$$(N + CC^T)^{-1}C = E(C^TE)^{-1}, \tag{62}$$

$$(N + CC^T)^{-1}N = I - E(C^TE)^{-1}C^T, \tag{63}$$

$$C^T(N + CC^T)^{-1}A^T = 0, \tag{64}$$

$$C^T(N + CC^T)^{-1}N = 0, \tag{65}$$

$$N(N + CC^T)^{-1}N = N, \tag{66}$$

$$C^T(N + CC^T)^{-1}C = I, \tag{67}$$

$$(N + CC^T)^{-1} - (N + CC^T)^{-1}CC^T(N + CC^T)^{-1} = (N + CC^T)^{-1} - E(E^TCC^TE)^{-1}E^T, \tag{68}$$

$$(N + CC^T)^{-1}u + E(C^TE)^{-1}d = (N + CC^T)^{-1}u + (N + CC^T)^{-1}Cd. \tag{69}$$

The proof of the above relations is rather straightforward. From $NE = 0$ we have $(N + CC^T)E = NE + CC^TE = CC^TE$, and since both $C^TE$ and $N + CC^T$ are regular, $(N + CC^T)E(C^TE)^{-1} = C$ and $E(C^TE)^{-1} = (N + CC^T)^{-1}C$, which is (62).


A. Dermanis

For the proof of (63) we multiply $E(C^TE)^{-1} = (N + CC^T)^{-1}C$ from the right with $C^T$ and get

$$E(C^TE)^{-1}C^T = (N + CC^T)^{-1}CC^T = (N + CC^T)^{-1}CC^T + (N + CC^T)^{-1}N - (N + CC^T)^{-1}N = (N + CC^T)^{-1}(N + CC^T) - (N + CC^T)^{-1}N = I - (N + CC^T)^{-1}N,$$

which gives (63). For the proof of (64) we multiply the transpose of (62), $C^T(N + CC^T)^{-1} = (E^TC)^{-1}E^T$, with $A^T$ from the right, and since $E^TA^T = (AE)^T = 0$ we obtain $C^T(N + CC^T)^{-1}A^T = (E^TC)^{-1}E^TA^T = 0$, which is (64). For the proof of (65) we multiply the transpose of (62) with $N$ from the right, and since $E^TN = 0$ we obtain $C^T(N + CC^T)^{-1}N = (E^TC)^{-1}E^TN = 0$, which is (65). For the proof of (66) we take into account the transpose of (65), $N(N + CC^T)^{-1}C = 0$, to obtain

$$N(N + CC^T)^{-1}N = N(N + CC^T)^{-1}N + N(N + CC^T)^{-1}CC^T - N(N + CC^T)^{-1}CC^T = N(N + CC^T)^{-1}(N + CC^T) - N(N + CC^T)^{-1}CC^T = N,$$

and (66) has been proved. For the proof of (67) we start from (65),

$$0 = C^T(N + CC^T)^{-1}N = C^T(N + CC^T)^{-1}(N + CC^T - CC^T) = C^T - C^T(N + CC^T)^{-1}CC^T.$$

Multiplying from the right with $C$ gives $C^TC - C^T(N + CC^T)^{-1}CC^TC = 0$, i.e., $C^T(N + CC^T)^{-1}CC^TC = C^TC$, and $C^T(N + CC^T)^{-1}C = C^TC(C^TC)^{-1} = I$. Taking into account (62) and its transpose $C^T(N + CC^T)^{-1} = (E^TC)^{-1}E^T$ we directly obtain

$$(N + CC^T)^{-1}CC^T(N + CC^T)^{-1} = E(C^TE)^{-1}(E^TC)^{-1}E^T = E(E^TCC^TE)^{-1}E^T,$$

and (68) is proved. To prove (69) we simply replace on the left side $(N + CC^T)^{-1}C = E(C^TE)^{-1}$ from (62).
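The relations (62)–(67) can also be checked numerically. The following sketch is an addition of this edit, not part of the original text; it uses a hypothetical leveling network (four heights observed through five height differences, so the null space of the design matrix is spanned by $E = (1,1,1,1)^T$) with $P = I$:

```python
import numpy as np

# Hypothetical leveling network: 4 heights observed via 5 height
# differences; each row of A sums to zero, so A @ E = 0 (rank defect 1).
A = np.array([[-1, 1, 0, 0],
              [0, -1, 1, 0],
              [0, 0, -1, 1],
              [-1, 0, 1, 0],
              [0, -1, 0, 1]], dtype=float)
E = np.ones((4, 1))               # inner constraint matrix, A E = 0
C = np.array([[1.0, 0, 0, 0]]).T  # minimal constraint: fix the first height
N = A.T @ A                       # normal matrix with P = I
Rinv = np.linalg.inv(N + C @ C.T)
I = np.eye(4)

print(np.allclose(Rinv @ C, E @ np.linalg.inv(C.T @ E)))            # (62)
print(np.allclose(Rinv @ N, I - E @ np.linalg.inv(C.T @ E) @ C.T))  # (63)
print(np.allclose(C.T @ Rinv @ A.T, 0))                             # (64)
print(np.allclose(C.T @ Rinv @ N, 0))                               # (65)
print(np.allclose(N @ Rinv @ N, N))                                 # (66)
print(np.allclose(C.T @ Rinv @ C, np.eye(1)))                       # (67)
```

All six checks print `True`; the only inputs are the assumed network topology and the choice of minimal constraint.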


Setting

$$Q_C = (N + CC^T)^{-1} - E(E^TCC^TE)^{-1}E^T, \tag{70}$$

it is easy to establish that

$$Q_CC = 0, \qquad C^TQ_C = 0, \tag{71}$$

$$NQ_C + C(E^TC)^{-1}E^T = I, \tag{72}$$

$$NQ_CN = N, \tag{73}$$

$$Q_CNQ_C = Q_C. \tag{74}$$

Using (62) it follows that $Q_CC = (N + CC^T)^{-1}C - E(E^TCC^TE)^{-1}E^TC = E(C^TE)^{-1} - E(C^TE)^{-1}(E^TC)^{-1}E^TC = E(C^TE)^{-1} - E(C^TE)^{-1} = 0$, and (71) is proved. Utilizing $NE = 0$ and the transpose of (63), namely $N(N + CC^T)^{-1} = I - C(E^TC)^{-1}E^T$, we have $NQ_C = N(N + CC^T)^{-1} - NE(E^TCC^TE)^{-1}E^T = N(N + CC^T)^{-1} = I - C(E^TC)^{-1}E^T$, and (72) follows. Multiplying (72) with $N$ from the right we get $NQ_CN + C(E^TC)^{-1}E^TN = N$, and since $E^TN = 0$, (73) follows. Multiplying (72) with $Q_C$ from the left, it follows that $Q_CNQ_C + Q_CC(E^TC)^{-1}E^T = Q_C$, and since according to (71) $Q_CC = 0$, (74) follows. The solution of the normal equations (60) has two different but equivalent forms. We shall first obtain the form implementing the inner constraints matrix $E$, and next the form which does not. The inverse of the coefficient matrix in the augmented normal equations has the form

$$\begin{bmatrix} N & C \\ C^T & 0 \end{bmatrix}^{-1} = \begin{bmatrix} Q_C & E(C^TE)^{-1} \\ (E^TC)^{-1}E^T & 0 \end{bmatrix}. \tag{75}$$

This relation is easy to verify by multiplying the original matrix with its inverse and taking into account that $NE = 0$, as well as the relations (71) and (72). Indeed

$$\begin{bmatrix} N & C \\ C^T & 0 \end{bmatrix}\begin{bmatrix} Q_C & E(C^TE)^{-1} \\ (E^TC)^{-1}E^T & 0 \end{bmatrix} = \begin{bmatrix} NQ_C + C(E^TC)^{-1}E^T & NE(C^TE)^{-1} \\ C^TQ_C & C^TE(C^TE)^{-1} \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}. \tag{76}$$

Utilizing the above inverse, the solution takes the form


$$\begin{bmatrix} \hat{x} \\ k \end{bmatrix} = \begin{bmatrix} N & C \\ C^T & 0 \end{bmatrix}^{-1}\begin{bmatrix} u \\ d \end{bmatrix} = \begin{bmatrix} Q_C & E(C^TE)^{-1} \\ (E^TC)^{-1}E^T & 0 \end{bmatrix}\begin{bmatrix} u \\ d \end{bmatrix} = \begin{bmatrix} Q_Cu + E(C^TE)^{-1}d \\ (E^TC)^{-1}E^Tu \end{bmatrix}, \tag{77}$$

or explicitly $\hat{x} = Q_Cu + E(C^TE)^{-1}d$ and $k = (E^TC)^{-1}E^Tu$. From $AE = 0$ and its transpose $E^TA^T = 0$ it follows that $E^Tu = E^TA^TPb = 0$ and $k = (E^TC)^{-1}E^Tu = 0$, while $Q_Cu = (N + CC^T)^{-1}u - E(E^TCC^TE)^{-1}E^Tu = (N + CC^T)^{-1}u$, and thus $\hat{x}_C = (N + CC^T)^{-1}u + E(C^TE)^{-1}d$. Therefore the solution to the least squares problem in the rank deficient model with the implementation of minimal constraints $C^Tx = d$ is given by

$$\hat{x}_C = Q_Cu + E(C^TE)^{-1}d = (N + CC^T)^{-1}u + E(C^TE)^{-1}d, \qquad k = 0. \tag{78}$$

The covariance matrix of $\hat{x}_C$ is given by $C_{\hat{x}_C} = \sigma^2Q_{\hat{x}_C}$, where the covariance factor matrix $Q_{\hat{x}_C}$ can be obtained from the known covariance factor matrix of the observations, $Q_b = P^{-1}$, and the linear relation (78) in the form $\hat{x}_C = Q_CA^TPb + E(C^TE)^{-1}d$. Application of the law of covariance propagation gives $Q_{\hat{x}_C} = Q_CA^TPQ_b(Q_CA^TP)^T = Q_CA^TPAQ_C = Q_CNQ_C$, and in view of (74)

$$Q_{\hat{x}_C} = Q_C = (N + CC^T)^{-1} - E(E^TCC^TE)^{-1}E^T. \tag{79}$$

In order to derive a form of the solution to the normal equations that does not implement the matrix $E$, we need to make use of the relation

$$(N + CC^T)^{-1} - (N + CC^T)^{-1}CC^T(N + CC^T)^{-1} = (N + CC^T)^{-1}N(N + CC^T)^{-1}, \tag{80}$$

which is easy to prove by setting $R = N + CC^T$ and noting that

$$R^{-1}NR^{-1} = R^{-1} - R^{-1}RR^{-1} + R^{-1}NR^{-1} = R^{-1} - R^{-1}(R - N)R^{-1} = R^{-1} - R^{-1}CC^TR^{-1} = (N + CC^T)^{-1} - (N + CC^T)^{-1}CC^T(N + CC^T)^{-1}.$$

The inverse of the coefficient matrix of the normal equations is given by




$$\begin{bmatrix} N & C \\ C^T & 0 \end{bmatrix}^{-1} = \begin{bmatrix} \bar{Q}_C & (N + CC^T)^{-1}C \\ C^T(N + CC^T)^{-1} & 0 \end{bmatrix}, \tag{81}$$

where

$$\bar{Q}_C = (N + CC^T)^{-1} - (N + CC^T)^{-1}CC^T(N + CC^T)^{-1} = (N + CC^T)^{-1}N(N + CC^T)^{-1}. \tag{82}$$

In order to verify (81) we take into account that $\bar{Q}_C = R^{-1} - R^{-1}CC^TR^{-1}$, that $C^TR^{-1}N = 0$ and $NR^{-1}C = 0$ from (65), and that $C^TR^{-1}C = I$ from (67). Multiplying the original matrix with its inverse we obtain

$$\begin{bmatrix} N & C \\ C^T & 0 \end{bmatrix}\begin{bmatrix} R^{-1} - R^{-1}CC^TR^{-1} & R^{-1}C \\ C^TR^{-1} & 0 \end{bmatrix} = \begin{bmatrix} NR^{-1} - NR^{-1}CC^TR^{-1} + CC^TR^{-1} & NR^{-1}C \\ C^TR^{-1} - C^TR^{-1}CC^TR^{-1} & C^TR^{-1}C \end{bmatrix} = \begin{bmatrix} (N + CC^T)R^{-1} & 0 \\ C^TR^{-1} - C^TR^{-1} & I \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}. \tag{83}$$

Utilizing the above inverse, the solution takes the form

$$\begin{bmatrix} \hat{x} \\ k \end{bmatrix} = \begin{bmatrix} N & C \\ C^T & 0 \end{bmatrix}^{-1}\begin{bmatrix} u \\ d \end{bmatrix} = \begin{bmatrix} R^{-1} - R^{-1}CC^TR^{-1} & R^{-1}C \\ C^TR^{-1} & 0 \end{bmatrix}\begin{bmatrix} u \\ d \end{bmatrix} = \begin{bmatrix} R^{-1}u - R^{-1}CC^TR^{-1}u + R^{-1}Cd \\ C^TR^{-1}u \end{bmatrix}, \tag{84}$$

or explicitly $\hat{x} = R^{-1}u - R^{-1}CC^TR^{-1}u + R^{-1}Cd$ and $k = C^TR^{-1}u$. Recalling from (64) that $C^TR^{-1}A^T = 0$, it follows that $k = C^TR^{-1}u = C^TR^{-1}A^TPb = 0$ and $\hat{x} = R^{-1}u + R^{-1}Cd$. Therefore, the solution to the normal equations is given by

$$\hat{x}_C = (N + CC^T)^{-1}(u + Cd), \qquad k = 0. \tag{85}$$

Covariance propagation on $\hat{x} = R^{-1}u + R^{-1}Cd = R^{-1}A^TPb + R^{-1}Cd$ with $Q_b = P^{-1}$ gives

$$Q_{\hat{x}_C} = (R^{-1}A^TP)P^{-1}(R^{-1}A^TP)^T = R^{-1}A^TPAR^{-1} = R^{-1}NR^{-1} = \bar{Q}_C. \tag{86}$$

Therefore the covariance matrix of $\hat{x}_C$ is $C_{\hat{x}_C} = \sigma^2Q_{\hat{x}_C}$, with covariance factor matrix

$$Q_{\hat{x}_C} = \bar{Q}_C = (N + CC^T)^{-1} - (N + CC^T)^{-1}CC^T(N + CC^T)^{-1} = (N + CC^T)^{-1}N(N + CC^T)^{-1}. \tag{87}$$

If $\hat{x}$ is a least squares solution obtained by a set of minimal constraints, which may even be unknown to us, we can convert it into a solution $\hat{x}_C$ satisfying a specific set of minimal constraints $C^Tx = d$ by choosing the right set of parameters $p$ in the linearized coordinate transformation $\hat{x}_C = \hat{x} + Ep$. Indeed, from $C^T\hat{x}_C = C^T\hat{x} + C^TEp = d$ it follows that $p = (C^TE)^{-1}(d - C^T\hat{x})$ and hence

$$\hat{x}_C = [I - E(C^TE)^{-1}C^T]\hat{x} + E(C^TE)^{-1}d. \tag{88}$$

Of particular interest is the solution $\hat{x}_E$ of minimum norm, which is the solution among all least squares solutions for which the norm $\varphi = \|x\|^2 = x^Tx$ is minimized. It can be recovered from any least squares solution $\hat{x}$, obtained by minimal constraints, with an appropriate choice of the parameters $p$ in the linearized transformation $x = \hat{x} + Ep$. To find the proper $p$ we minimize $\varphi = x^Tx = (\hat{x} + Ep)^T(\hat{x} + Ep)$ by setting $\partial\varphi/\partial p = 2(\hat{x} + Ep)^TE = 0$, which gives $p = -(E^TE)^{-1}E^T\hat{x}$, and the minimum norm solution is given by

$$\hat{x}_E = \left[I - E(E^TE)^{-1}E^T\right]\hat{x}. \tag{89}$$
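The projection (89) can be checked against NumPy's SVD-based pseudoinverse, which yields the minimum norm least squares solution directly when $P = I$. The sketch below is an addition with invented data for a hypothetical leveling network:

```python
import numpy as np

# Hypothetical leveling network (invented observations), P = I.
A = np.array([[-1, 1, 0, 0],
              [0, -1, 1, 0],
              [0, 0, -1, 1],
              [-1, 0, 1, 0],
              [0, -1, 0, 1]], dtype=float)
b = np.array([1.02, 0.98, 2.01, 2.03, 2.97])
E = np.ones((4, 1))                 # inner constraints, A E = 0
C = np.array([[1.0, 0, 0, 0]]).T    # some arbitrary minimal constraint
d = np.array([10.0])
N, u = A.T @ A, A.T @ b

# Any minimal constraints solution, here from (78):
x_C = np.linalg.solve(N + C @ C.T, u) + (E @ np.linalg.inv(C.T @ E) @ d).ravel()

# (89): project onto N(A)^perp to obtain the minimum norm solution.
Proj = np.eye(4) - E @ np.linalg.inv(E.T @ E) @ E.T
x_E = Proj @ x_C

print(np.allclose(x_E, np.linalg.pinv(A) @ b))  # matches pinv(A) b
print(np.allclose(E.T @ x_E, 0))                # inner constraints E^T x = 0
```

Both checks print `True`: projecting any least squares solution onto the orthogonal complement of the null space removes exactly its null-space component, leaving the unique minimum norm solution.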

The above relation can be easily generalized to $\hat{x}_W = [I - E(E^TWE)^{-1}E^TW]\hat{x}$, which minimizes the weighted norm $\|x\|_W^2 = x^TWx$, for any positive-definite matrix $W$. Note that the idempotent matrix $P_{N(A)^\perp} = I - E(E^TE)^{-1}E^T$ is a projector onto the orthogonal complement $N(A)^\perp$ of the null space $N(A) = \{x \in \mathbb{R}^m \mid Ax = 0\} = R(E)$ of the design matrix, spanned by the columns of $E$. Comparing this with the conversion (88) from one least squares solution to another, it follows that the minimum norm solution can also be obtained by using minimal constraints $C^Tx = d$, simply choosing $C = E$ and $d = 0$. The constraints $E^Tx = 0$, where $E$ is the coefficient matrix in the linearized transformation $\tilde{x} = x + Ep$ under a change of the reference system, are called inner constraints. The "inner constraints" or "minimum norm" or "free network" solution derives directly from the minimal constraints solution by replacing $C = E$ and $d = 0$, to obtain

$$\hat{x}_E = (N + EE^T)^{-1}u, \tag{90}$$

$$Q_{\hat{x}_E} = (N + EE^T)^{-1} - E(E^TE)^{-2}E^T = (N + EE^T)^{-1}N(N + EE^T)^{-1} = (N + EE^T)^{-1} - (N + EE^T)^{-1}EE^T(N + EE^T)^{-1}. \tag{91}$$


If a set of minimal constraints $C^Tx = d$ is multiplied from the left with any non-singular matrix $S$, we obtain a completely equivalent set of constraints $SC^Tx = Sd$, or $\tilde{C}^Tx = \tilde{d}$ with $\tilde{C} = CS^T$ and $\tilde{d} = Sd$, providing exactly the same least squares solution. The same is true for the inner constraints $E^Tx = 0$, which can be replaced with $\tilde{E}^Tx = 0$, where $\tilde{E} = ER^T$ with an arbitrary non-singular matrix $R$. Writing the solutions for minimal and inner constraints with $\tilde{C}$, $\tilde{d}$, $\tilde{E}$ in place of $C$, $d$, $E$, with $\tilde{C} = CS^T$, $\tilde{d} = Sd$, $\tilde{E} = ER^T$, respectively, and setting $S^TS = G$, $R^TR = G_E \to G$, we obtain a slightly generalized form of the minimal and inner constraints solutions

$$\hat{x}_C = (N + CGC^T)^{-1}(u + CGd) = (N + CGC^T)^{-1}u + E(C^TE)^{-1}d, \tag{92}$$

$$Q_{\hat{x}_C} = (N + CGC^T)^{-1} - (N + CGC^T)^{-1}CGC^T(N + CGC^T)^{-1} = (N + CGC^T)^{-1}N(N + CGC^T)^{-1} = (N + CGC^T)^{-1} - E(E^TCGC^TE)^{-1}E^T, \tag{93}$$

$$\hat{x}_E = (N + EGE^T)^{-1}u, \tag{94}$$

$$Q_{\hat{x}_E} = (N + EGE^T)^{-1} - E(E^TEGE^TE)^{-1}E^T = (N + EGE^T)^{-1}N(N + EGE^T)^{-1} = (N + EGE^T)^{-1} - (N + EGE^T)^{-1}EGE^T(N + EGE^T)^{-1}, \tag{95}$$
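As a quick numerical check (an addition of this edit, with invented data for a hypothetical leveling network), the estimate (92) is invariant under the scaling matrix $G$, since different $G$ merely rescale the same set of minimal constraints:

```python
import numpy as np

# Hypothetical leveling network (invented observations), P = I.
A = np.array([[-1, 1, 0, 0],
              [0, -1, 1, 0],
              [0, 0, -1, 1],
              [-1, 0, 1, 0],
              [0, -1, 0, 1]], dtype=float)
b = np.array([1.02, 0.98, 2.01, 2.03, 2.97])
C = np.array([[1.0, 0, 0, 0]]).T    # minimal constraint: fix first height
d = np.array([10.0])
N, u = A.T @ A, A.T @ b

def solve_with_G(G):
    # (92): x_C = (N + C G C^T)^{-1} (u + C G d)
    return np.linalg.solve(N + C @ G @ C.T, u + (C @ G @ d).ravel())

x_1 = solve_with_G(np.eye(1))          # G = 1
x_100 = solve_with_G(100 * np.eye(1))  # G = 100

print(np.allclose(x_1, x_100))  # same solution for any non-singular G
```

Numerically, however, the choice of $G$ affects the conditioning of $N + CGC^T$, which is exactly the round-off consideration mentioned in the text below.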

where $G$ is now an arbitrary non-singular symmetric matrix. The matrix $G$ should be chosen in such a way that the corresponding diagonal elements of the matrices $N$ and $CGC^T$ in the sum $N + CGC^T$ (or $N$ and $EGE^T$ in the sum $N + EGE^T$) have the same order of magnitude, thus restricting the effect of round-off numerical errors in the computations. Some authors arrive at the above results by introducing so-called stochastic constraints $d = C^Tx + e_d$ (or $0 = E^Tx + e_d$), with $e_d \sim (0, \sigma^2G^{-1})$ uncorrelated with $e$. In this case (94) and (95) are simply the solutions to the normal equations based on the two sets of equations $b = Ax + e$ and the stochastic constraints. We strongly dislike this approach because it leads to misinterpretation. The constraints do not represent any actual information based on observational evidence; they are merely a means for selecting a particular least squares solution out of the infinitely many ones, and thus selecting, at the same time, a particular reference system out of the infinitely many ones. The choice of a reference system, which is in any case a mathematical convention and not a real physical object, is a purely deterministic process and there is nothing stochastic about it. This becomes obvious from the fact that the resulting solution $\hat{x}_C$ satisfies the "stochastic" constraints exactly, $C^T\hat{x}_C = d$, in which case the corresponding error estimates become $\hat{e}_d = C^T\hat{x}_C - d = 0$ and the pseudo-data $d$ are not adjusted at all! The solutions with minimal and inner constraints can be associated with generalized inverses of the normal equations matrix $N$ or the design matrix $A$. Recall that


the generalized inverses $M^-$ of a matrix $M$ are characterized by four properties:

$$M = MM^-M, \qquad \text{(G1)} \tag{96}$$

$$M^- = M^-MM^-, \qquad \text{(G2)} \tag{97}$$

$$(PMM^-)^T = PMM^-, \qquad \text{(G3)} \tag{98}$$

$$(M^-MR)^T = M^-MR. \qquad \text{(G4)} \tag{99}$$

For a matrix $M^-$ to be characterized as a generalized inverse of $M$, it must necessarily satisfy the generalized inverse property (G1). (G2) is the reflexivity property ($M$ being also a generalized inverse of $M^-$); if both (G1) and (G2) are satisfied, $M^-$ is a reflexive generalized inverse of $M$. (G3) is the least squares property; if (G1) and (G3) are satisfied, $M^-$ is a least squares generalized inverse of $M$. (G4) is the minimum norm property; if (G1) and (G4) are satisfied, $M^-$ is a minimum norm generalized inverse of $M$. If $M$ is $n \times m$, the least squares property refers to the norm $\|y\|_P^2 = y^TPy$ in $\mathbb{R}^n$ and the minimum norm property to the norm $\|x\|_R^2 = x^TRx$ in $\mathbb{R}^m$. For a system $y = Mx$, $x = M^-y$ is a least squares solution if $M^-$ satisfies (G1) and (G3), i.e., it satisfies $\|y - Mx\|_P^2 = (y - Mx)^TP(y - Mx) = \min$. Similarly, if $y \in R(M)$, so that $y = Mx$ is a consistent system, then $x = M^-y$ is a minimum norm solution if $M^-$ satisfies (G1) and (G4), i.e., it satisfies $\|x\|_R^2 = x^TRx = \min$ among all $x$ for which $Mx = y$. Finally, if a generalized inverse satisfies all four properties (G1), (G2), (G3), (G4), it is the unique reflexive, least squares, minimum norm generalized inverse; it is called the pseudoinverse of $M$ and is denoted by $M^+$. From properties (73) and (74) it follows that $Q_{\hat{x}_C} = Q_C$ satisfies $NQ_{\hat{x}_C}N = N$ and $Q_{\hat{x}_C}NQ_{\hat{x}_C} = Q_{\hat{x}_C}$; it is therefore a reflexive generalized inverse of $N$. The matrix

$$A^- = (N + CC^T)^{-1}A^TP = R^{-1}A^TP, \tag{100}$$

appearing in the solution $\hat{x}_C = (N + CC^T)^{-1}A^TPb + (N + CC^T)^{-1}Cd$, satisfies properties (G1), (G2), (G3) but not (G4), and is thus a least squares reflexive generalized inverse of the design matrix $A$. Indeed, from (64) $C^TR^{-1}A^T = 0$, and thus $AA^-A = AR^{-1}N = AR^{-1}(R - CC^T) = AR^{-1}R - AR^{-1}CC^T = AR^{-1}R = A$, satisfying (G1), while $A^-AA^- = R^{-1}NR^{-1}A^TP = R^{-1}(R - CC^T)R^{-1}A^TP = R^{-1}A^TP - R^{-1}CC^TR^{-1}A^TP = R^{-1}A^TP = A^-$, and (G2) is also satisfied. It is also obvious that $PAA^- = PA(N + CC^T)^{-1}A^TP$ is a symmetric matrix, and (G3) is satisfied. (G4) is not satisfied for the norm


$\|x\|^2 = x^Tx$, because $A^-A = (N + CC^T)^{-1}N$ is not symmetric. Since the inner constraints are just a special case of minimal constraints, the covariance cofactor matrix $Q_{\hat{x}_E}$ shares with $Q_{\hat{x}_C}$ the properties $NQ_{\hat{x}_E}N = N$ (G1) and $Q_{\hat{x}_E}NQ_{\hat{x}_E} = Q_{\hat{x}_E}$ (G2). From $Q_{\hat{x}_E} = (N + EE^T)^{-1} - E(E^TE)^{-2}E^T$ and $NE = 0$ it follows that $Q_{\hat{x}_E}N = (N + EE^T)^{-1}N$ and $NQ_{\hat{x}_E} = N(N + EE^T)^{-1}$. From property (72) for $C = E$ and $Q_C = Q_{\hat{x}_E}$ we obtain $Q_{\hat{x}_E}N = I - E(E^TE)^{-1}E^T$ and $NQ_{\hat{x}_E} = I - E(E^TE)^{-1}E^T$. Therefore both $NQ_{\hat{x}_E}$ and $Q_{\hat{x}_E}N$ are symmetric, and (G3) and (G4) are also satisfied, for $P = I$ and $R = I$. Since all four properties are satisfied, $Q_{\hat{x}_E}$ is the pseudoinverse of $N$,

$$Q_{\hat{x}_E} = N^+, \tag{101}$$

with respect to the norm $\|x\|^2 = x^Tx$ in $\mathbb{R}^m$. For the matrix $A^- = (N + EE^T)^{-1}A^TP$, appearing in the inner constraints solution $\hat{x}_E = A^-b + E(E^TE)^{-1}d$, we note that $A^-A = (N + EE^T)^{-1}N = I - E(E^TE)^{-1}E^T$ is symmetric, with the last term following from property (63) with $C = E$ and its transposition. Thus property (G4) is satisfied with $R = I$. In addition $PAA^- = PA(N + EE^T)^{-1}A^TP$ is also symmetric, and property (G3) is satisfied. (G1) is satisfied, since $AE = 0$ gives $AA^-A = A[I - E(E^TE)^{-1}E^T] = A$. It also holds that $A^-AA^- = (N + EE^T)^{-1}N(N + EE^T)^{-1}A^TP$, while from (91) $(N + EE^T)^{-1}N(N + EE^T)^{-1} = (N + EE^T)^{-1} - E(E^TE)^{-2}E^T$, and since $E^TA^T = 0$ we get $A^-AA^- = (N + EE^T)^{-1}A^TP - E(E^TE)^{-2}E^TA^TP = (N + EE^T)^{-1}A^TP = A^-$, and property (G2) is also satisfied. Since $A^- = (N + EE^T)^{-1}A^TP$ satisfies all four properties, it is the pseudoinverse of $A$,

$$(N + EE^T)^{-1}A^TP = A^+, \qquad \hat{x}_E = A^+b, \tag{102}$$
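For $P = I$ the pseudoinverse identities (101) and (102) can be checked directly against NumPy's SVD-based `pinv`. The sketch below is an addition of this edit, using a hypothetical leveling network whose null space is spanned by $E = (1,1,1,1)^T$:

```python
import numpy as np

# Hypothetical leveling network, rank defect 1, A E = 0, P = I.
A = np.array([[-1, 1, 0, 0],
              [0, -1, 1, 0],
              [0, 0, -1, 1],
              [-1, 0, 1, 0],
              [0, -1, 0, 1]], dtype=float)
E = np.ones((4, 1))
N = A.T @ A
Sinv = np.linalg.inv(N + E @ E.T)

# (102): A^+ = (N + E E^T)^{-1} A^T (with P = I)
A_plus = Sinv @ A.T
# (101): N^+ = (N + E E^T)^{-1} - E (E^T E)^{-2} E^T
EtE_inv = np.linalg.inv(E.T @ E)
N_plus = Sinv - E @ EtE_inv @ EtE_inv @ E.T

print(np.allclose(A_plus, np.linalg.pinv(A)))  # (102)
print(np.allclose(N_plus, np.linalg.pinv(N)))  # (101)
```

Both checks print `True`; the closed forms avoid an explicit SVD and only require one regular matrix inversion.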

with respect to the norm $\|y\|^2 = y^TPy$ in $\mathbb{R}^n$ and the norm $\|x\|^2 = x^Tx$ in $\mathbb{R}^m$. Recalling that in statistics estimable parameters within the linear model are defined as those having unbiased estimates, we may formally take the expectation of the minimal constraints solution; taking into account that $d = C^Tx$ and $E\{b\} = Ax$, we get

$$E\{\hat{x}_C\} = (N + CC^T)^{-1}(A^TPE\{b\} + Cd) = (N + CC^T)^{-1}(Nx + CC^Tx) = x, \tag{103}$$


and so we have managed to get an unbiased estimate of a non-estimable quantity! To resolve this paradox we notice that estimability was established using the minimal constraints relation $C^Tx = d$, so it seems that coordinates are non-estimable quantities in the model $b = Ax + v$, but estimable in the joint model $b = Ax + v$, $C^Tx = d$. From the physical point of view, coordinates are estimable when the constraints $C^Tx = d$ hold true in a physical sense and are not merely a computational device for obtaining one out of the infinitely many least squares solutions $\hat{x}$ of the model $b = Ax + v$, $v \sim (0, \sigma^2P^{-1})$. Even in this case, one may counter-argue that the arbitrarily chosen minimal constraints $C^Tx = d$ define a reference system, and once this system is chosen, the coordinates with respect to it become estimable. The problem though is that this reference system remains physically inaccessible, because we cannot physically access its origin and the direction of its axes on the basis of the available estimates $\hat{x}_C$. The reason is that the latter define an estimated network shape, which differs from the unknown true one. Fitting parts of this estimated shape to corresponding parts of the real network will lead to a different realization of the reference system, depending on which minimal parts we choose to fit. This is a general problem with reference frames, a term which in geodesy refers to the realization of the reference system by means of a set of estimated coordinates (plus velocities for deformable networks). To give an example of minimal constraints that have a physical meaning and lead to estimable coordinates, consider a classical planar network where angles and distances have been observed. The three minimal constraints $X_A = Y_A = 0$, $Y_B = 0$ define a reference system with origin at network point $A$ and first axis in the direction of the line $AB$. For any other point $P$, the coordinate $Y_P$ is the distance of $P$ from the line $AB$, and the coordinate $X_P$ is the distance $AP'$, where $P'$ is the projection of $P$ on the line $AB$. But these quantities refer solely to the geometric form of the network and are independent of any choice of reference system. Therefore they are determinable-estimable quantities, and the estimates $\hat{X}_P$, $\hat{Y}_P$ are indeed unbiased, also because the introduced constraints have defined a reference system with physically accessible origin and axes.

In a more general setup, we may reorder the coordinates $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ in the model $b = [A_1\ A_2]\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + v$, so that the $d$ constraints $x_2 = 0$ are minimal constraints of the form $C^Tx = [0\ I_d]\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0$. We shall call constraints of this type, or of its generalization $x_2 = c_2$, where $c_2$ are fixed known values, trivial constraints. The estimates can also be obtained by replacing $C^T = [0\ I_d]$, $d = c_2$ in the general solution, but it is easier to eliminate the fixed coordinates $x_2 = c_2$ and work with the reduced model $b - A_2c_2 = A_1x_1 + v$, to obtain the estimates $\hat{x}_1 = (A_1^TPA_1)^{-1}A_1^TP(b - A_2c_2)$, while $\hat{x}_2 = c_2$, with covariance factor matrices $Q_{\hat{x}_1} = (A_1^TPA_1)^{-1}$, $Q_{\hat{x}_2} = 0$, $Q_{\hat{x}_1\hat{x}_2} = 0$. This is a special case of a general parameter elimination technique based on the minimal constraints


$C^Tx = [C_1^T\ C_2^T]\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = C_1^Tx_1 + C_2^Tx_2 = d$, where by parameter reordering the $d \times d$ matrix $C_2^T$ is non-singular. Solving for $x_2 = C_2^{-T}d - C_2^{-T}C_1^Tx_1$ and replacing in $b = A_1x_1 + A_2x_2 + v$, we obtain the reduced model $\bar{b} \equiv b - A_2C_2^{-T}d = (A_1 - A_2C_2^{-T}C_1^T)x_1 + v \equiv \bar{A}_1x_1 + v$ without rank defect, having least squares solution

$$\hat{x}_1 = (\bar{A}_1^TP\bar{A}_1)^{-1}\bar{A}_1^TP\bar{b} \equiv \bar{N}_1^{-1}\bar{u}, \qquad \hat{x}_2 = C_2^{-T}d - C_2^{-T}C_1^T\hat{x}_1, \tag{104}$$

$$Q_{\hat{x}_1} = \bar{N}_1^{-1}, \qquad Q_{\hat{x}_2} = C_2^{-T}C_1^TQ_{\hat{x}_1}C_1C_2^{-1}, \qquad Q_{\hat{x}_2\hat{x}_1} = -C_2^{-T}C_1^TQ_{\hat{x}_1}. \tag{105}$$
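The trivial-constraints elimination can be sketched numerically (an addition of this edit, with invented data): fixing one height of a hypothetical leveling network, solving the reduced full-rank model, and comparing with the general minimal constraints solution (85):

```python
import numpy as np

# Hypothetical leveling network, 4 heights, 5 observed differences, P = I.
A = np.array([[-1, 1, 0, 0],
              [0, -1, 1, 0],
              [0, 0, -1, 1],
              [-1, 0, 1, 0],
              [0, -1, 0, 1]], dtype=float)
b = np.array([1.02, 0.98, 2.01, 2.03, 2.97])

# Trivial constraint x_2 = c_2: fix the first height at 10 and eliminate it.
A2, A1 = A[:, :1], A[:, 1:]          # x_2 = height 1, x_1 = heights 2..4
c2 = np.array([10.0])
b_red = b - (A2 @ c2)                # reduced observations b - A_2 c_2
x1 = np.linalg.solve(A1.T @ A1, A1.T @ b_red)

# General minimal constraints solution (85) with C^T picking height 1.
C = np.array([[1.0, 0, 0, 0]]).T
N, u = A.T @ A, A.T @ b
x_C = np.linalg.solve(N + C @ C.T, u + (C @ c2).ravel())

print(np.allclose(x_C, np.concatenate([c2, x1])))  # True
print(np.isclose(x_C[0], 10.0))                    # fixed height unchanged
```

Eliminating the fixed coordinates yields a smaller regular system, which is why the text recommends this route over carrying the constraints explicitly.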

Inner constraints minimize the norm of all parameters $x$, which in addition to the network coordinates may include other parameters related to the observational process. By proper ordering, let $x_1$ denote the network coordinates or, more generally, a subset of the parameters of immediate interest. Then we may minimize the seminorm $\|x_1\|^2 = x_1^Tx_1$ instead of the norm $\|x\|^2 = x^Tx$. This is achieved by replacing the inner constraints $E^Tx = [E_1^T\ E_2^T]\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = E_1^Tx_1 + E_2^Tx_2 = 0$ with the so-called partial inner constraints

$$C^Tx = [E_1^T\ 0]\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = E_1^Tx_1 = 0. \tag{106}$$

To see this, consider the linearized coordinate transformation $\hat{x}' = \hat{x} + Ep$, which splits into $\hat{x}_1' = \hat{x}_1 + E_1p$ and $\hat{x}_2' = \hat{x}_2 + E_2p$. Minimizing $\varphi = (\hat{x}_1')^T\hat{x}_1' = (\hat{x}_1 + E_1p)^T(\hat{x}_1 + E_1p)$ by setting $\partial\varphi/\partial p = 2(\hat{x}_1 + E_1p)^TE_1 = 0$, we obtain $p = -(E_1^TE_1)^{-1}E_1^T\hat{x}_1$, and the conversion to the partial inner constraints solution satisfying $E_1^Tx_1 = 0$ takes the form

$$\hat{x}' = \hat{x} - E(E_1^TE_1)^{-1}E_1^T\hat{x}_1 = \hat{x} - E\left([E_1^T\ 0]\begin{bmatrix} E_1 \\ E_2 \end{bmatrix}\right)^{-1}[E_1^T\ 0]\hat{x} = [I - E(C^TE)^{-1}C^T]\hat{x}, \tag{107}$$

with $C^T = [E_1^T\ 0]$. The partial inner constraints solution $\hat{x}_{EP}$ follows easily from

the general minimal constraints solution by replacing $C^T = [E_1^T\ 0]$ and $d = 0$:

$$\hat{x}_{EP} = \begin{bmatrix} N_{11} + E_1E_1^T & N_{12} \\ N_{12}^T & N_{22} \end{bmatrix}^{-1}\begin{bmatrix} u_1 + E_1d \\ u_2 \end{bmatrix} = \begin{bmatrix} N_{11} + E_1E_1^T & N_{12} \\ N_{12}^T & N_{22} \end{bmatrix}^{-1}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \tag{108}$$

$$Q_{\hat{x}_{EP}} = \begin{bmatrix} N_{11} + E_1E_1^T & N_{12} \\ N_{12}^T & N_{22} \end{bmatrix}^{-1}\begin{bmatrix} N_{11} & N_{12} \\ N_{12}^T & N_{22} \end{bmatrix}\begin{bmatrix} N_{11} + E_1E_1^T & N_{12} \\ N_{12}^T & N_{22} \end{bmatrix}^{-1} = \begin{bmatrix} N_{11} + E_1E_1^T & N_{12} \\ N_{12}^T & N_{22} \end{bmatrix}^{-1} - \begin{bmatrix} E_1(E_1^TE_1)^{-2}E_1^T & E_1(E_1^TE_1)^{-2}E_2^T \\ E_2(E_1^TE_1)^{-2}E_1^T & E_2(E_1^TE_1)^{-2}E_2^T \end{bmatrix}. \tag{109}$$

A generalization of the inner constraints are the minimum weighted norm constraints or generalized inner constraints [42], where the minimized quantity is the squared distance $\|x - x_{ref}\|_W^2 = (x - x_{ref})^TW(x - x_{ref})$ from a fixed known value $x_{ref}$. These can be used to adapt the solution to a preexisting solution $x_{ref}$, assigning different weights to different points according to their importance, or to different coordinates, e.g., by downweighting the less accurate vertical components with respect to the horizontal ones. The solution can be derived from the linearized transformation $\hat{x}_G = \hat{x} + Ep$, where $\hat{x}$ is any least squares solution, by minimizing $\varphi = (\hat{x} + Ep - x_{ref})^TW(\hat{x} + Ep - x_{ref})$. Setting $\partial\varphi/\partial p = 2(\hat{x} + Ep - x_{ref})^TWE = 0$ gives the parameter values $p = -(E^TWE)^{-1}E^TW(\hat{x} - x_{ref})$, for which the desired solution becomes

$$\hat{x}_G = [I - E(E^TWE)^{-1}E^TW]\hat{x} + E(E^TWE)^{-1}E^TWx_{ref}. \tag{110}$$
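A small sketch of the adaptation (110), added in this edit with an invented reference solution and weights, for a hypothetical leveling network:

```python
import numpy as np

# Hypothetical leveling network (invented observations), P = I.
A = np.array([[-1, 1, 0, 0],
              [0, -1, 1, 0],
              [0, 0, -1, 1],
              [-1, 0, 1, 0],
              [0, -1, 0, 1]], dtype=float)
b = np.array([1.02, 0.98, 2.01, 2.03, 2.97])
E = np.ones((4, 1))                         # inner constraints, A E = 0
W = np.diag([4.0, 1.0, 1.0, 1.0])           # invented weights: trust point 1 most
x_ref = np.array([10.0, 11.0, 13.0, 13.0])  # invented preexisting solution

x_hat = np.linalg.pinv(A) @ b               # any least squares solution
T = E @ np.linalg.inv(E.T @ W @ E) @ E.T @ W
x_G = (np.eye(4) - T) @ x_hat + T @ x_ref   # (110)

print(np.allclose(E.T @ W @ (x_G - x_ref), 0))  # (111) is satisfied
print(np.allclose(A @ x_G, A @ x_hat))          # still a least squares solution
```

Since the correction $Ep$ lies entirely in the null space of $A$, the adjusted observations are unchanged: the adaptation only re-selects the reference system.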

Comparing with the general transformation (88) from one least squares solution to another, we conclude that the same solution can be obtained by minimal constraints $C^Tx = d$ if we choose $C^T = E^TW$ and $d = E^TWx_{ref}$, i.e., with the generalized inner constraints

$$E^TW(x - x_{ref}) = 0. \tag{111}$$

In analogy with the usual inner constraints we may also consider partial generalized inner constraints, where $\|x_1 - x_1^{ref}\|_{W_1}^2 = (x_1 - x_1^{ref})^TW_1(x_1 - x_1^{ref})$ is minimized. From the split transformation $\hat{x}_1' = \hat{x}_1 + E_1p$ and $\hat{x}_2' = \hat{x}_2 + E_2p$ we may minimize

$$\varphi = (\hat{x}_1 + E_1p - x_1^{ref})^TW_1(\hat{x}_1 + E_1p - x_1^{ref}), \tag{112}$$

setting $\partial\varphi/\partial p = 2(\hat{x}_1 + E_1p - x_1^{ref})^TW_1E_1 = 0$ to obtain $p = -(E_1^TW_1E_1)^{-1}E_1^TW_1(\hat{x}_1 - x_1^{ref})$, so that the conversion to the partial generalized inner constraints solution takes the form $\hat{x}_1' = [I - E_1(E_1^TW_1E_1)^{-1}E_1^TW_1]\hat{x}_1 + E_1(E_1^TW_1E_1)^{-1}E_1^TW_1x_1^{ref}$ and $\hat{x}_2' = \hat{x}_2 - E_2(E_1^TW_1E_1)^{-1}E_1^TW_1(\hat{x}_1 - x_1^{ref})$, which can be combined into

$$\hat{x}' = \hat{x} - E(E_1^TW_1E_1)^{-1}E_1^TW_1\hat{x}_1 + E(E_1^TW_1E_1)^{-1}E_1^TW_1x_1^{ref} = \hat{x} - E\left([E_1^TW_1\ 0]E\right)^{-1}[E_1^TW_1\ 0]\hat{x} + E\left([E_1^TW_1\ 0]E\right)^{-1}E_1^TW_1x_1^{ref}. \tag{113}$$

Comparison with the general conversion (88) from one least squares solution to another shows that the desired solution can also be obtained from the minimal constraints solution, setting $C^T = [E_1^TW_1\ 0]$ and $d = E_1^TW_1x_1^{ref}$, i.e., by the generalized partial inner constraints

$$E_1^TW_1(x_1 - x_1^{ref}) = 0. \tag{114}$$

The idea of inner constraints and free networks originates in the work of Meissl [45–47]. It has been further elaborated by Blaha [15, 16], Grafarend and Schaffrin [38, 39], Baarda [11, 12] and many others. Precursors of these ideas were already present in the work of Bjerhammar [14], who in 1951 rediscovered the pseudoinverse of Moore [48], now widely known as the Moore–Penrose inverse, before Penrose in 1955 [50]. For more details we suggest [13, 28, 41, 56].

6 Mathematical Modeling of Spatiotemporal Reference Systems for a Deformable Geodetic Network: Deterministic Aspects and Reference System Optimality

In the case of a rigid network with time-invariant geometric configuration the reference system needs to be chosen just once and remains the same for any time instant t. In contrast, in the case of a deformable network, a reference system needs to be chosen for every instant t, in correspondence with a time-varying geometric configuration. Thus the choice of a spatial reference system for a rigid network is replaced by the choice of a spatiotemporal reference system in the case of a deforming network. Before attacking the actual problem of the choice of the reference system within the process of data analysis, it is necessary to examine the reference system choice within a deterministic model, where we assume that the geometric configuration of the network is known for all epochs t within some time interval. In the simplest case, the temporal variation of the geometric configuration is continuous and the choice of a reference system for each epoch t should lead to its representation by a continuous function x(t), where x contains the coordinates of all the n network points. The more realistic case of non-continuous deformation with discontinuities associated with seismic events can be similarly treated in a piecewise manner for intervals between seismic discontinuities. For each epoch, a coordinate transformation x˜ i (t) = t(xi (t), p(t)) leads to an equally valid continuous representation x˜ (t), provided that the transformation parameter functions p(t) are continuous (Fig. 2). For each


Fig. 2 A reference system $x(t)$ is a section $C$ of the fibering of the network coordinate space $\mathbb{R}^{3n}$, with fibers the shape manifolds $S_t$ of the network at various epochs $t$. An optimal reference system $\tilde{x}(t)$ is a section $\tilde{C}$ (a geodesic curve in $\mathbb{R}^{3n}$) intersecting all shape fibers perpendicularly

fixed epoch all such representations $\tilde{x}(t)$ constitute a shape manifold $S_t$ in $\mathbb{R}^{3n}$, corresponding to all coordinates that give the same geometric configuration of the network at the particular epoch $t$. The set of all shape manifolds $S_t$, for all $t$ in the considered interval, constitutes a fibering of the subset $\bigcup_t S_t \subset \mathbb{R}^{3n}$, provided that $S_t \cap S_{t'} = \emptyset$ for $t' \neq t$, i.e., that we exclude the pathological cases where the network returns to a previous geometric configuration or remains rigid for some time interval. Before the introduction of the (now spatiotemporal) reference system, the deterministic model for the temporal evolution of the geometric configuration of the network consists of the time-dependent collection of shape manifolds $S_t$ for $t \in [t_F, t_L]$, where $t_F$ and $t_L$ are the initial and final epochs, respectively, of the time interval to which our analysis is confined. The choice of a spatiotemporal reference system now becomes the choice of a curve $C$ in $\bigcup_t S_t \subset \mathbb{R}^{3n}$ which intersects each shape manifold $S_t$ at a unique point $x(t)$, i.e., $C \cap S_t = \{x(t)\}$. In this context the question of the optimal choice of a reference system arises in a natural way. Mathematically, if $x_{ref}(t)$ is some arbitrarily chosen reference system, the problem reduces to that of finding the optimal parameter functions $p(t)$ which transform the reference system choice $x_{ref}(t)$ into the optimal one $x(t) = t(x_{ref}(t), p(t))$. Here $t : (x_{ref}, p) \to x$ stands for the transformation from the coordinates of the reference system to those of the optimal reference system, defined pointwise by the rigid transformation

$$x_i(t) = R(\theta(t))x_{i,ref}(t) + d(t), \qquad i = 1, 2, \ldots, n, \tag{115}$$

where the transformation parameters $p(t) = [\theta(t)^T\ d(t)^T]^T$ consist of the rotation parameters $\theta(t)$ of the orthogonal matrix $R(\theta(t))$ and the displacement vector $d(t)$. The most obvious optimality characteristic is that the optimal choice $x(t)$ should not exhibit coordinate variations that come from the choice of the reference system itself and do not reflect actual variations in the geometric configuration of the network. In plain words, the station coordinates should vary as little as possible! The remaining problem is how to quantify coordinate variation. In the framework of differential geometry this optimality characteristic is satisfied by requiring that the curve $x(t)$ is


a geodesic joining two points $x(t_F) \in S_{t_F}$ and $x(t_L) \in S_{t_L}$. The geodesic property requires a choice of metric in $\mathbb{R}^{3n}$, where $\bigcup_t S_t$ is embedded, which we choose to be the simple Euclidean metric

$$\rho(x, x') = \|x' - x\|_E = \sqrt{(x' - x)^T(x' - x)} = \sqrt{\sum_i (x_i' - x_i)^T(x_i' - x_i)}. \tag{116}$$

This does not completely solve the problem, because there exists a geodesic for any such arbitrary pair of endpoints. We must further require that $x(t)$ is the shortest possible geodesic between all points $x(t_F) \in S_{t_F}$ and $x(t_L) \in S_{t_L}$. If we fix $x(t_F)$ and let $x(t_L)$ vary within $S_{t_L}$, among all geodesics between $x(t_F)$ and the various $x(t_L)$ there will be one having the shortest possible length $L_{x(t_F)}$, which depends only on $x(t_F)$. The question is whether there is a particular shortest length among all possible $L_{x(t_F)}$, as $x(t_F)$ varies over $S_{t_F}$. It turns out, as we will see below, that all $L_{x(t_F)}$ are equal, and there is therefore an infinite number of shortest geodesics between $S_{t_F}$ and $S_{t_L}$, all with the same length, which all provide optimal spatiotemporal reference systems. For a geodesic between $x(t_F)$ and $S_{t_L}$ to be shortest, it must be perpendicular to $S_{t_L}$, which means that the tangent vector $\dot{x}(t_L)$ must be orthogonal to the linear manifold $T_{x(t_L)}(S_{t_L})$ tangent to the nonlinear manifold $S_{t_L}$. By the reciprocal argument, $\dot{x}(t_F)$ must be orthogonal to the linear manifold $T_{x(t_F)}(S_{t_F})$ tangent to the nonlinear manifold $S_{t_F}$. Since the choice of the initial epoch $t_F$ and the final epoch $t_L$ is rather arbitrary, they may be replaced by any two epochs $t_1$ and $t_2$ in the interval $[t_F, t_L]$. Orthogonality must therefore hold at $x(t_1)$ and $x(t_2)$, and hence at any point $x(t)$ on the curve. To explicitly express the orthogonality relation $\dot{x}(t) \perp T_{x(t)}(S_t)$, we need a basis for the linear manifold tangent to $S_t$ at the point $x(t)$, which is provided by the local tangent vectors

$$e_i(x(t)) = \frac{\partial t}{\partial p_i}(x(t)), \qquad i = 1, \ldots, 6, \tag{117}$$

so that

$$T_{x(t)}(S_t) = \operatorname{span}\left\{\frac{\partial t}{\partial p_i}(x(t))\right\}. \tag{118}$$

Expressing the orthogonality $\dot{x}(t) \perp T_{x(t)}(S_t)$ by the vanishing of the Euclidean inner product, it turns out that the optimal reference system $x(t) = t(x_{ref}(t), p(t))$ is provided by the solution $p(t)$ of the set of differential equations

$$\dot{x}(t)^T\frac{\partial t}{\partial p_i}(x(t)) = 0, \qquad i = 1, \ldots, 6, \qquad t \in [t_F, t_L]. \tag{119}$$
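A discrete, translations-only illustration of the orthogonality condition can be sketched as follows. This is an addition of this edit with invented data, not from the original text: for $p = d$ only (no rotations), the tangent vectors of the shape manifold are the columns of the stacked-identity matrix, and perpendicularity of $\dot{x}$ to them means that the net translation of the network between epochs vanishes; subtracting the per-epoch mean displacement realizes such a frame for a hypothetical 2D network:

```python
import numpy as np

# Hypothetical 2D network of 3 points observed over 5 epochs, with a
# slowly deforming geometry (invented random walk of coordinates).
rng = np.random.default_rng(0)
epochs, n = 5, 3
base = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x_ref = np.cumsum(rng.standard_normal((epochs, n, 2)) * 0.01, axis=0) + base

# Translation-only optimal frame: shift each epoch so the network
# barycentre stays fixed at its initial position.
x_opt = x_ref - x_ref.mean(axis=1, keepdims=True) + x_ref[0].mean(axis=0)

# Discrete analogue of x_dot, summed over stations = net translation.
v = np.diff(x_opt, axis=0)
print(np.allclose(v.sum(axis=1), 0))  # no net translation between epochs
```

The remaining coordinate variation in `x_opt` reflects only the changing shape of the network, not a drifting reference system, which is exactly the optimality sought in (119) restricted to translations.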

Regarding optimality, we must note that if $p_f$ is a fixed (time-independent) set of transformation parameters, then the coordinate systems $x(t)$ and $\tilde{x}(t) = t(x(t), p_f)$


are kinematically equivalent, since they are connected at every epoch by the same fixed set of transformation parameters and thus exhibit the same coordinate variation. We will characterize such reference systems as parallel.

Definition 1. Two spatiotemporal reference systems will be called parallel if the values of the parameters of the transformation from one to the other are constant for all epochs within the time interval under consideration.

For two parallel spatiotemporal reference systems connected through x′(t) = Rx(t) + d, where R and d are constant, the tangent vectors are related by $\frac{dx'}{dt}(t) = R\,\frac{dx}{dt}(t)$. Using the relation between the tangent vector magnitude and the length element ds, it follows that

$$\left(\frac{ds'}{dt}\right)^2 = \left(\frac{dx'}{dt}\right)^T\!\left(\frac{dx'}{dt}\right) = \left(\frac{dx}{dt}\right)^T\! R^T R\,\left(\frac{dx}{dt}\right) = \left(\frac{dx}{dt}\right)^T\!\left(\frac{dx}{dt}\right) = \left(\frac{ds}{dt}\right)^2,$$

and therefore the length elements of the two curves are equal, ds′ = ds. Consequently, the lengths L and L′ of the two shortest geodesics between StF and StL, beginning from x(tF) and x̃(tF), respectively, are equal, since $L' = \int_{t_F}^{t_L} ds' = \int_{t_F}^{t_L} ds = L$.

The differential equations (119) defining the optimal solution have infinitely many solutions, each particular one depending on integration constants, which may be the chosen initial values x(tF). Different choices of x(tF) lead to different optimal solutions, which are parallel in the above defined sense. Therefore, the reference system at some particular reference epoch must be chosen arbitrarily, while the optimality criteria determine only the temporal evolution of the reference system, i.e., the reference system at any subsequent epoch tF < t ≤ tL. This is true when rigid transformations are allowed between reference systems, keeping the scale fixed, but it does not hold for similarity transformations. The reason lies in the choice of the Euclidean metric as a means of quantification of coordinate variation: the choice of an arbitrarily small scale factor λ(t) = 1 + s(t) can make the length of the resulting shortest geodesic arbitrarily small, so that the shortest geodesic becomes indeterminable. For the similarity transformation case x̃(t) = (1 + s(t))R(t)x(t) + d(t), a different metric is needed, which is invariant not only with respect to rotation and translation, but also with respect to change of scale. We restrict ourselves here to the standard case where the network scale is already defined, with a realization based on a set of atomic clocks, so that only rigid transformations of displacement and rotation are allowed between the different possible reference systems. We assume that some reference solution x(t) is already established and we seek a solution x̃(t) which is optimal, i.e., a shortest geodesic.
The rigid transformation for each network point i has the form x̃i = R(θ)xi + d, where the dependence on time is omitted for the sake of simplicity, and we seek the optimal functions θ(t), d(t) satisfying the orthogonality conditions (119) and

13 Theory and Realization of Reference Systems

735

thus providing an optimal solution. The tangent vectors to the shape manifold are the columns of the matrices ∂x̃/∂d, ∂x̃/∂θ, and the orthogonality conditions take the form

$$\dot{\tilde{x}}^T\frac{\partial\tilde{x}}{\partial d} = \sum_{i=1}^n \dot{\tilde{x}}_i^T\frac{\partial\tilde{x}_i}{\partial d} = 0, \qquad \dot{\tilde{x}}^T\frac{\partial\tilde{x}}{\partial\theta} = \sum_{i=1}^n \dot{\tilde{x}}_i^T\frac{\partial\tilde{x}_i}{\partial\theta} = 0. \tag{120}$$

Introducing the notation

$$[\omega_k\times] = R\,\frac{\partial R^T}{\partial\theta_k}, \quad k = 1, 2, 3, \qquad \Omega = [\,\omega_1\ \omega_2\ \omega_3\,], \tag{121}$$

so that $\dot{\tilde{x}}_i = \dot{R}x_i + R\dot{x}_i + \dot{d}$, we can easily show that $\dot{R}x_i = [(Rx_i)\times]\,\Omega\dot{\theta}$, and hence $\dot{\tilde{x}}_i = [(Rx_i)\times]\,\Omega\dot{\theta} + R\dot{x}_i + \dot{d}$, while $\frac{\partial\tilde{x}_i}{\partial d} = I_3$ and $\frac{\partial\tilde{x}_i}{\partial\theta} = \frac{\partial(Rx_i)}{\partial\theta} = [(Rx_i)\times]\,\Omega$. For the proof, note that (121) implies $\frac{\partial R}{\partial\theta_k} = -[\omega_k\times]R$, so that $\dot{R}x_i = -\sum_k \dot{\theta}_k[\omega_k\times]Rx_i = [(Rx_i)\times]\sum_k \dot{\theta}_k\omega_k = [(Rx_i)\times]\,\Omega\dot{\theta}$. With these values, and setting

$$m = \frac{1}{n}\sum_{i=1}^n x_i, \qquad \dot{m} = \frac{1}{n}\sum_{i=1}^n \dot{x}_i,$$

$$C = -\sum_{i=1}^n [(x_i - m)\times]^2 = -\sum_{i=1}^n [x_i\times]^2 + n[m\times]^2,$$

$$h = \sum_{i=1}^n [(x_i - m)\times](\dot{x}_i - \dot{m}) = \sum_{i=1}^n [x_i\times]\dot{x}_i - n[m\times]\dot{m}, \tag{122}$$

the orthogonality conditions take the form

$$\frac{1}{n}\sum_{i=1}^n\left(\frac{\partial\tilde{x}_i}{\partial d}\right)^T\dot{\tilde{x}}_i = R[m\times]R^T\Omega\dot{\theta} + R\dot{m} + \dot{d} = 0, \tag{123}$$

$$R^T\Omega^{-T}\sum_{i=1}^n\left(\frac{\partial\tilde{x}_i}{\partial\theta}\right)^T\dot{\tilde{x}}_i = CR^T\Omega\dot{\theta} - n[m\times]^2R^T\Omega\dot{\theta} - h - n[m\times]\dot{m} - n[m\times]R^T\dot{d} = 0. \tag{124}$$

Solving the first for $\dot{m} = -[m\times]R^T\Omega\dot{\theta} - R^T\dot{d}$ and replacing it in the second, we finally arrive at the desired system of nonlinear differential equations $\dot{d} = -R[m\times]R^T\Omega\dot{\theta} - R\dot{m}$, $\dot{\theta} = \Omega^{-1}RC^{-1}h$, or, after replacing the second into the first,

$$\dot{d} = -R(\theta)\dot{m} - R(\theta)[m\times]C^{-1}h, \tag{125}$$

$$\dot{\theta} = \Omega(\theta)^{-1}R(\theta)C^{-1}h. \tag{126}$$
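As a numerical illustration, the right-hand sides of (125) and (126) can be evaluated directly. The sketch below (Python/numpy; a hypothetical 4-station network with made-up coordinates and velocities, not data from the text) evaluates the rates at θ = 0, where R(θ) = I and Ω(θ) = I, so that (126) reduces to θ̇ = C⁻¹h and (125) to ḋ = −ṁ − [m×]C⁻¹h.

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a x], so that skew(a) @ b = a x b."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

rng = np.random.default_rng(1)
n = 4
x = rng.normal(size=(n, 3))    # station coordinates (made-up values)
v = rng.normal(size=(n, 3))    # station velocities  (made-up values)

# Quantities of Eq. (122)
m, mdot = x.mean(axis=0), v.mean(axis=0)
C = -sum(skew(xi - m) @ skew(xi - m) for xi in x)
h = sum(skew(xi - m) @ (vi - mdot) for xi, vi in zip(x, v))

# Right-hand sides of (125)-(126), evaluated at theta = 0 (R = I, Omega = I)
theta_dot = np.linalg.solve(C, h)        # Eq. (126)
d_dot = -mdot - skew(m) @ theta_dot      # Eq. (125)
```

With these rates, the orthogonality conditions (123) and (124) are satisfied identically at θ = 0, which can be checked by substitution.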

These differential equations have infinitely many solutions, depending on the initial values (integration constants) d0 = d(tF), θ0 = θ(tF). Each choice leads to one of the infinitely many possible shortest geodesics, with a different starting point x̃i(tF) = R(θ0)xi + d0 on the initial shape manifold StF. Different choices of θ0, d0, and hence of x̃(tF), correspond to reference systems that are parallel to each other, in the already defined sense of being connected by time-invariant transformation parameters. In the special case that the reference solution x(t) has been chosen in such a way that $h = 0$ and $\dot{m} = 0$, the differential equations degenerate into $\dot{d} = 0$, $\dot{\theta} = 0$ with constant solution d = d0, θ = θ0. This means that the general solution is parallel to any special solution with $h = 0$ and $\dot{m} = 0$. In other words, the conditions $h = 0$ and $\dot{m} = 0$ are sufficient for providing one of the desired shortest geodesics. To formally prove this statement we start with an arbitrary solution x(t) and seek the functions d(t), θ(t) leading through x′i = R(θ)xi + d to a solution x′(t) which satisfies $h' = 0$ and $\dot{m}' = 0$, and then see whether this is an optimal solution or not. Obviously $\dot{m}' = R\dot{m} + \dot{d} + \dot{R}m$, which with $\dot{R}m = [(Rm)\times]\,\Omega\dot{\theta}$ leads to

$$\dot{m}' = [(Rm)\times]\,\Omega\dot{\theta} + R\dot{m} + \dot{d}. \tag{127}$$

Replacing $\dot{x}'_i - \dot{m}' = R[(x_i - m)\times]R^T\Omega\dot{\theta} + R(\dot{x}_i - \dot{m})$ and $x'_i - m' = R(x_i - m)$, $h'$ becomes

$$h' = \sum_{i=1}^n [(x'_i - m')\times](\dot{x}'_i - \dot{m}') = R\left[\sum_{i=1}^n [(x_i - m)\times]^2\right]R^T\Omega\dot{\theta} + R\left[\sum_{i=1}^n [(x_i - m)\times](\dot{x}_i - \dot{m})\right] = -RCR^T\Omega\dot{\theta} + Rh. \tag{128}$$

Setting $\dot{m}' = 0$ and $h' = 0$ we arrive at $\dot{d} = -R\dot{m} - [(Rm)\times]\,\Omega\dot{\theta}$ and $\dot{\theta} = \Omega^{-1}RC^{-1}h$, respectively. Replacing the second into the first, we arrive at the differential equations $\dot{d} = -R\dot{m} - R[m\times]C^{-1}h$, $\dot{\theta} = \Omega^{-1}RC^{-1}h$, which are identical to (125) and (126) providing an optimal solution. In conclusion, the conditions $\dot{m}' = 0$ and $h' = 0$ lead equivalently to an optimal solution (shortest geodesic), the same as the one defined directly by the orthogonality condition (119).

To understand the importance of the above conclusion, we must turn to the solution of the choice-of-reference-system problem that has been proposed in geophysics, not for a discrete geodetic network, but for the continuous mass distribution of the earth. The origin of the reference system is taken to be the geocenter, which in an arbitrary reference system has coordinates defined at every epoch by

$$x_G(t) = \frac{1}{M(t)}\int_{E(t)} x(t)\,dm(t), \tag{129}$$

where dm(t) is the mass element, the integration is taken over all the earth masses, and $M(t) = \int_{E(t)} dm(t)$ is the total mass of the earth. The condition xG(t) = 0 for a geocentric reference system settles the issue of defining the origin of the system. For the orientation of the axes, the French astronomer Félix Tisserand (1845–1896) introduced the concept of what we now call Tisserand axes [49, 61]. In order to minimize the apparent motion of the earth masses with respect to the optimal reference system, he chose to minimize the relative kinetic energy of the earth, $T_R(t) = \frac{1}{2}\int_{E(t)} \dot{x}(t)^T\dot{x}(t)\,dm(t)$. It turns out that $T_R(t) = \min$ is equivalent to the vanishing of the relative angular momentum of the earth,

$$h_R(t) = \int_{E(t)} [x(t)\times]\,\dot{x}(t)\,dm(t) = 0. \tag{130}$$

The above condition does not uniquely define the orientation of the reference system. Indeed, a time-independent rotation x′(t) = R0x(t) leads to a relative angular momentum $h'_R = \int_E [(R_0x)\times](R_0\dot{x})\,dm = R_0h_R$, which also vanishes. There is therefore an infinite set of Tisserand reference systems, all of them parallel in the already explained sense. A most general Tisserand system can be obtained by relaxing the strict "geocentricity" requirement xG(t) = 0 to xG(t) = constant (equivalently ẋG(t) = 0). In this case the geocentric relative angular momentum is given instead by

$$h_R = \int_E [(x - x_G)\times](\dot{x} - \dot{x}_G)\,dm = 0. \tag{131}$$

The ideas of Tisserand can be applied to the case of a discrete geodetic network if we consider the network points as mass points of equal mass, which we may take to be unity without loss of generality. In this case we must replace integration with summation, so that with the replacements

$$\frac{1}{M}\int_E (\cdot)\,dm \;\to\; \frac{1}{n}\sum_{i=1}^n (\cdot), \qquad x \to x_i, \qquad x_G \to m = \frac{1}{n}\sum_{i=1}^n x_i,$$

$$h_R \;\to\; \frac{1}{n}h = \frac{1}{n}\sum_{i=1}^n [(x_i - m)\times](\dot{x}_i - \dot{m}),$$

the conditions ẋG(t) = 0, hR = 0 are replaced by $\dot{m} = 0$ and $h = 0$. But these are exactly the conditions we have obtained as necessary for producing spatiotemporal reference systems which are shortest geodesics! Therefore, our theory for discrete point networks is just a special case of Tisserand's choice of reference system (slightly generalized with respect to the origin choice). The connection of the two theories is an important result with practical consequences, since they lead to appropriate minimal constraints for application in data analysis, the so-called kinematic constraints [1, 2]. These lead to an optimal choice of the spatiotemporal reference system, with minimized station motions, which best represents the temporal evolution of the network shape. For further details see [23–25].
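The discrete Tisserand conditions can be checked numerically. The following sketch (Python/numpy; a hypothetical 5-station network with made-up rotation and translation rates) gives the stations a rigid-motion velocity field v_i = ω × (x_i − m) + c, and verifies that ṁ picks up only the common translation, while the relative angular momentum equals Cω, so that h = 0 exactly when the apparent rotation ω vanishes.

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a x], so that skew(a) @ b = a x b."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))            # hypothetical 5-station network
m = x.mean(axis=0)
omega = np.array([0.3, -0.1, 0.2])     # made-up apparent rotation rate
c = np.array([0.05, 0.0, -0.02])       # made-up common translation rate
v = np.cross(omega, x - m) + c         # rigid-motion velocity field

mdot = v.mean(axis=0)                  # barycentre rate: equals c
C = -sum(skew(xi - m) @ skew(xi - m) for xi in x)
h = sum(skew(xi - m) @ (vi - mdot) for xi, vi in zip(x, v))
# h equals C @ omega, so the Tisserand condition h = 0 holds iff omega = 0.
```

The identity h = Cω follows from [a×](ω×a) = −[a×]²ω, summed over the stations.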

7 Reference System Definition in the Analysis of Coordinate Time Series

Observational data, related to the geometric configuration (shape) of a geodetic network at the epoch of observation, are collected by four space techniques: GPS/GNSS [35], VLBI [59], SLR [51], and DORIS [62]. From the viewpoint of formal statistical optimality, all these data should be analyzed simultaneously. At first sight, this appears to be a herculean task. It is, however, possible to combine the normal equations derived from different subsets of data into a new set that provides the combined solution. Computational difficulties are therefore not a prohibitive factor. There are two levels of dividing the data into subsets. At the higher level, data from different space techniques are analyzed separately. At the lower level, data from a particular space technique, collected over a long time interval, are separated into subsets corresponding to smaller time intervals of one day, or one week, or the few days of the duration of a VLBI session. Thus, single-epoch solutions become available as a first step of the data analysis. Returning to the question of statistical optimality, we must realize that it holds under the particular assumptions that the observation errors are zero-mean random variables and that the covariance matrix of all observations is known up to a scale factor. These assumptions, however, are not valid in a real-world situation. The presence of systematic errors and of correlations between data of different epochs is ignored in the separate per-epoch analysis. Therefore, the analysis of the data in subsets may have advantages beyond the obvious computational convenience. Such an advantage is easiest to realize in the presence of outliers, which are difficult to detect in the analysis of a large data set, where the effect of even a single outlier spreads over a large number of error estimates. Treating the data in subsets enhances the possibility of detecting "bad data" and is at least a necessary preprocessing step in the overall data analysis.
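The combination of normal equations mentioned above can be illustrated with a toy example (Python/numpy; the matrices are random placeholders, not real geodetic data): the normal equations N = AᵀA, b = Aᵀl of two data subsets are simply added, and solving the summed system reproduces the joint least squares solution.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(12, 4))          # full design matrix (placeholder values)
l = rng.normal(size=12)               # observations (placeholder values)

# Split the observations into two subsets (e.g., two epochs or two techniques).
A1, A2 = A[:6], A[6:]
l1, l2 = l[:6], l[6:]

# Per-subset normal equations ...
N1, b1 = A1.T @ A1, A1.T @ l1
N2, b2 = A2.T @ A2, A2.T @ l2

# ... combined by simple addition, giving the joint least squares solution.
x_combined = np.linalg.solve(N1 + N2, b1 + b2)
x_joint = np.linalg.lstsq(A, l, rcond=None)[0]
```

The two solutions agree because AᵀA = A1ᵀA1 + A2ᵀA2 and Aᵀl = A1ᵀl1 + A2ᵀl2 hold exactly for stacked row blocks.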
Physical considerations can also help in detecting bad data. For example, the fact that a network deforms slowly allows the detection of bad epoch solutions, which stand out in the produced coordinate series in a way that is highly improbable to arise from the effect of random errors. Following standard practice, we will start with the analysis of coordinate time series, which are the result of per-epoch data analysis. The estimated network shape is represented through a set of station coordinate estimates obtained with the use of a particular set of (hopefully minimal) constraints. Thus every epoch has its own independently chosen reference system. The task of the next step, which is usually called stacking, is twofold: the first is to achieve a smoothing interpolation


of the discrete series, through a coordinate evolution model of the general form xi(t) = fi(t, a), where a is a set of unknown model coefficients; the second is to choose an optimal reference system for the description of the interpolated continuous temporal variation of the network shape. The usual choice is a linear-in-time coordinate variation or constant velocity model xi(t) = x0i + (t − t0)vi, where x0i = xi(t0) are the initial coordinates and vi the velocity of station i. The station coordinate estimates x̂i(tk) at epoch tk are treated as pseudo-observations $x_i^{ob}(t_k)$, which are expressed as functions of the coordinates xi(t) in a spatiotemporal reference system common to all epochs,

$$x_i^{ob}(t_k) = (1 + s_k)R(\theta_k)x_i(t_k) + d_k + e_{ik} = (1 + s_k)R(\theta_k)\left[x_{0i} + (t_k - t_0)v_i\right] + d_k + e_{ik}. \tag{132}$$

Here θk, dk, sk are the transformation parameters from the new (not yet defined) reference system to that of epoch tk, and eik are the observational errors. As usual, linearization, based on the approximation R(θk) ≈ I − [θk×], leads to a sufficiently approximate linear model of the form

$$x_i^{ob}(t_k) = x_{0i} + (t_k - t_0)v_i + s_kx_{0i}^{ap} + [x_{0i}^{ap}\times]\theta_k + d_k + e_{ik} = x_{0i} + (t_k - t_0)v_i + \left[\,[x_{0i}^{ap}\times]\ \ I_3\ \ x_{0i}^{ap}\,\right]\begin{bmatrix}\theta_k \\ d_k \\ s_k\end{bmatrix} + e_{ik} \equiv x_{0i} + (t_k - t_0)v_i + E_iz_k + e_{ik}, \qquad i = 1, \dots, n, \quad k = 1, \dots, m, \tag{133}$$

where $x_{0i}^{ap}$ are approximate values of the initial coordinates x0i. Alternatively, the model can be expressed in terms of corrections to approximate values, $\delta x_{0i} = x_{0i} - x_{0i}^{ap}$, $\delta v_i = v_i - v_i^{ap}$, $\delta\theta_k = \theta_k - \theta_k^{ap}$, $\delta d_k = d_k - d_k^{ap}$, $\delta s_k = s_k - s_k^{ap}$, as

$$\delta x_i^{ob}(t_k) = \delta x_{0i} + (t_k - t_0)\delta v_i + \left[\,[x_{0i}^{ap}\times]\ \ I_3\ \ x_{0i}^{ap}\,\right]\begin{bmatrix}\delta\theta_k \\ \delta d_k \\ \delta s_k\end{bmatrix} + e_{ik} = \delta x_{0i} + (t_k - t_0)\delta v_i + E_i\delta z_k + e_{ik}, \tag{134}$$

where

$$\delta x_i^{ob}(t_k) = x_i^{ob}(t_k) - x_{0i}^{ap} - (t_k - t_0)v_i^{ap} - s_k^{ap}x_{0i}^{ap} - [x_{0i}^{ap}\times]\theta_k^{ap} - d_k^{ap} \tag{135}$$

are the reduced observations. The form (134) in terms of corrections is more appropriate when iterations are performed, using the estimates of each step as approximate values in the next one, until convergence. This solves the least squares estimation problem for the original nonlinear model.
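The linearization step from (132) to (133) can be checked numerically. The sketch below (Python/numpy; one hypothetical station with made-up small transformation parameters) compares the exact similarity transformation with the linearized model built from E_i = [[x×] I₃ x]; the discrepancy is second order in the small parameters. The rotation convention R(θ) = exp(−[θ×]) is assumed here so that R(θ) ≈ I − [θ×] for small angles, as in the text.

```python
import numpy as np

def skew(a):
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def rot(theta):
    """Rodrigues formula for R(theta) = exp(-[theta x])."""
    t = np.linalg.norm(theta)
    if t == 0.0:
        return np.eye(3)
    K = skew(theta) / t
    return np.eye(3) - np.sin(t) * K + (1.0 - np.cos(t)) * (K @ K)

rng = np.random.default_rng(3)
x0 = rng.normal(size=3)             # initial coordinates of one station (made up)
v = 0.01 * rng.normal(size=3)       # station velocity (made up)
dt = 2.5                            # t_k - t_0
theta = 1e-4 * rng.normal(size=3)   # small rotation angles
d = 1e-3 * rng.normal(size=3)       # small translation
s = 1e-5                            # small scale parameter

xi = x0 + dt * v                               # x_i(t_k)
x_exact = (1 + s) * rot(theta) @ xi + d        # Eq. (132), errors set to zero

E_i = np.hstack([skew(x0), np.eye(3), x0[:, None]])   # [ [x0 x]  I3  x0 ]
z_k = np.concatenate([theta, d, [s]])
x_lin = xi + E_i @ z_k                          # Eq. (133), e_ik = 0

err = np.linalg.norm(x_lin - x_exact)           # second order in the small terms
```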


For purely geodetic purposes, the parameters of interest are the initial coordinates x0i and the velocities vi, while the transformation parameters zk = [θkT dkT sk]T are nuisance parameters. However, their estimates are needed in order to convert available estimates of Earth Orientation Parameters (EOPs) from the reference system at epoch tk to the new common reference system. EOPs connect the celestial reference system with the adopted terrestrial reference system at epoch tk and are different for each space technique. We will first look into the stacking problem of coordinate time series without taking EOP time series into consideration. The inclusion of a separate sk at every epoch tk does not conform with our basic deterministic and stochastic model, since the observations from all space techniques are not invariant under scale transformations and have their own scale provided by their unit of length. This in turn is just the product of the constant velocity of light with their corresponding unit of time, which is realized by a particular set of atomic clocks. At most one could assume a linear drift in these clocks, which could be accommodated by a linear model sk = s0 + (tk − t0)ṡ, arriving at an observation model where only the initial scale s0 = s(t0) and the scale rate ṡ appear as unknown parameters. The inclusion of different scale factors sk is nevertheless a standard practice that has proven to be effective, for two reasons. The first has to do with the fact that the assumption of zero-mean random errors, on which the least squares estimation is based, is not really valid. In addition to the random errors, which will cause random variations in the per-epoch estimated network shape, various stochastic effects will cause corresponding shape variations. The use of a different scale parameter per epoch "absorbs" a part of these variations.
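The clock-drift alternative mentioned above, sk = s0 + (tk − t0)ṡ, is an ordinary two-parameter linear regression. A minimal sketch (Python/numpy; the epochs, scale values, and noise level are made-up placeholders):

```python
import numpy as np

rng = np.random.default_rng(4)
t0 = 0.0
t = np.arange(10.0)                  # epochs t_k (made-up, e.g. years)
s0_true, sdot_true = 2e-9, 5e-10     # assumed initial scale and drift
s_k = s0_true + (t - t0) * sdot_true + 1e-11 * rng.normal(size=t.size)

# Least squares fit of the drift model s_k = s0 + (t_k - t0) * sdot
A = np.column_stack([np.ones_like(t), t - t0])
(s0_hat, sdot_hat), *_ = np.linalg.lstsq(A, s_k, rcond=None)
```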
The second reason is that, although the observations have no absolute rank defect with respect to scale, they may be in a close-to-rank-defect situation: although the observations are not absolutely invariant under a scale transformation, their variation may be very small and thus negligible. It is known from computational experience that only VLBI and SLR provide observations with strong scale information, while the scale information is weak in GNSS and DORIS. For this reason their scale is not taken into consideration when combining data from the four techniques in order to formulate the ITRF. Such close-to-rank-deficiency situations apply also to translation and rotation transformations. VLBI has an absolute rank deficiency with respect to translation. SLR strongly senses the geocenter through the satellite orbits and thus has no translation defect or weakness when the geocenter is used as the origin of the reference system. GPS and DORIS also sense the geocenter, but in a much weaker way, so that their origin/translational information is of much lower quality. It is therefore necessary to be able to detect deficiencies and weaknesses attributed to the definition of the reference system a priori, in a preprocessing stage. The number of deficiencies and weaknesses is mirrored in the number of eigenvalues of the normal equations matrix N of the solution at each epoch which have (practically) zero or close-to-zero values. It is not clear, though, whether these are indeed associated with defects in the reference system definition, or to which of the particular possible types of defect (origin, orientation, scale) they can be attributed. The reason is that the corresponding eigenvectors ui have no physical meaning. On the contrary, the


columns of the inner constraint matrix E = [eθ1 eθ2 eθ3 ed1 ed2 ed3 es] are associated with orientation components, translation components, and scale. In fact they form a basis of the null space N(N) of N, which is identical to the null space N(A) of the design matrix A. Combining the eigenvectors ui with zero or very small eigenvalues λi with the columns of E, [21] have introduced three appropriate rank deficiency interpretation indices for the detection of defects or weaknesses in the reference system definition. The first index,

$$\omega_{Q,i} = \arctan\sqrt{\frac{u_i^Tu_i}{u_i^TE_Q(E_Q^TE_Q)^{-1}E_Q^Tu_i} - 1} = \arcsin\sqrt{1 - \frac{u_i^TE_Q(E_Q^TE_Q)^{-1}E_Q^Tu_i}{u_i^Tu_i}}, \tag{136}$$

is the angle between any eigenvector ui and a subspace of N(N) = N(A) spanned by a subset EQ of the columns of E. For example, EQ = [eθ1 eθ2 eθ3] seeks to see whether a zero or close-to-zero eigenvalue λi (Nui = λiui) can be attributed to an orientation defect of the reference system. The second index,

$$\psi_q = \arctan\sqrt{\frac{e_q^Te_q}{e_q^TU(U^TU)^{-1}U^Te_q} - 1} = \arcsin\sqrt{1 - \frac{e_q^TU(U^TU)^{-1}U^Te_q}{e_q^Te_q}}, \tag{137}$$

where U = [· · · ui · · · ] is the matrix with columns the eigenvectors with zero or very small eigenvalues, is the angle between one of the columns eq of E and span U, the set of all linear combinations of the columns of U. The third index is the pair

$$\chi_{Q,\min} = \sqrt{1 - \mu_{\min}}, \qquad \chi_{Q,\max} = \sqrt{1 - \mu_{\max}}, \tag{138}$$

where μmin and μmax are the largest and the smallest eigenvalue, respectively, of the matrix $U^TE_Q(E_Q^TE_Q)^{-1}E_Q^TU$. They correspond to the minimum and maximum angle between any vector in span EQ ⊂ N(N) and any vector in span U. The indices also apply to the extended case of

$$E = [\,e_{\theta_{10}}\ e_{\theta_{20}}\ e_{\theta_{30}}\ e_{d_{10}}\ e_{d_{20}}\ e_{d_{30}}\ e_{s_0}\ |\ e_{\dot\theta_{10}}\ e_{\dot\theta_{20}}\ e_{\dot\theta_{30}}\ e_{\dot d_{10}}\ e_{\dot d_{20}}\ e_{\dot d_{30}}\ e_{\dot s}\,], \tag{139}$$

for reference system deficiencies in initial epoch orientation (θ10, θ20, θ30), origin (d10, d20, d30), and scale (s0), as well as in orientation rate (θ̇10, θ̇20, θ̇30), origin rate (ḋ10, ḋ20, ḋ30), and scale rate (ṡ), as we will see below. The above indices help in understanding the quality of the informational content of each space technique related to the reference system definition. They are also useful in detecting isolated bad epochs, which do not conform with the overall deficiency characteristics of the time series and should not be included in the stacking solution. Even if no deficiencies or weaknesses are detected with the above indices with respect to translation and rotation, the relevant parameters θk, dk are included in the


stacking solution, in order to remove to some degree systematic effects, as done in the case of the scale parameters sk. The least squares stacking adjustment using the above linearized observation model leads to normal equations with a rank defect, accepting infinitely many least squares estimates x̂0i, v̂i, i = 1, 2, . . . , n, and ẑk, k = 1, 2, . . . , m. Each solution corresponds to a different choice of the spatiotemporal reference system. Numerical computations reveal that the normal equations have 14 practically zero eigenvalues, corresponding to a rank defect of 14. Therefore 14 minimal constraints must be introduced. It is thus essential to determine the inner constraints matrix E, appearing in the linearized transformation of the total unknowns x′ = x + Ep, with transformation parameters $p = [\psi^T\ c^T\ \lambda]^T$ (rotation angles ψ, translations c, scale parameter λ). The problem here lies in the fact that a change of the spatiotemporal reference system is a change of the reference system at every epoch. The transformation of the instantaneous coordinates xi(t) = x0i + (t − t0)vi into the instantaneous coordinates

$$x'_i(t) = [1 + \lambda(t)]R(\psi(t))[x_{0i} + (t - t_0)v_i] + c(t) \approx x_{0i} + (t - t_0)v_i + [x_{0i}^{ap}\times]\psi(t) + c(t) + \lambda(t)x_{0i}^{ap}, \tag{140}$$

reveals two problems. Firstly, the resulting coordinate model is not linear-in-time, x′i(t) = x′0i + (t − t0)v′i, for some parameters x′0i, v′i. Secondly, the transformation parameters, as functions ψ(t), c(t), λ(t), are infinite in number! Even if one restricts the analysis to only the discrete epochs tk, k = 1, 2, . . . , m, in the per-epoch transformations

$$x'_i(t_k) \approx x_{0i} + (t_k - t_0)v_i + [x_{0i}^{ap}\times]\psi(t_k) + c(t_k) + \lambda(t_k)x_{0i}^{ap}, \tag{141}$$

there are 7m transformation parameters involved, many more than the numerically detected rank deficiency of 14. A simple-minded approach is to arbitrarily restrict the transformation functions to ones that are linear-in-time, namely

$$\psi(t) = \psi_0 + (t - t_0)\dot{\psi}, \qquad c(t) = c_0 + (t - t_0)\dot{c}, \qquad \lambda(t) = \lambda_0 + (t - t_0)\dot{\lambda}, \tag{142}$$

so that the resulting x′i(t) is of the second degree with respect to (t − t0), and then to take advantage of the fact that the coefficient of (t − t0)² turns out to be a negligible quantity of the second order. This non-rigorous approach implements the correct number of transformation parameters, i.e., the 14 parameters ψ0, c0, λ0, ψ̇, ċ, λ̇. A rigorous solution to this problem has been presented by Chatzinikos and Dermanis [20], who have directly investigated the rank deficiency of the design matrix of the stacking problem and have revealed the relation between any two least squares solutions. Their results are summarized in two propositions:

Proposition 1. If x̂0i, v̂i, ẑk, i = 1, . . . , n, k = 1, . . . , m, is one of the least squares solutions of the stacking problem, then x̂′0i = x̂0i + Eip0, v̂′i = v̂i + Eiṗ, ẑ′k = ẑk − p0 − (tk − t0)ṗ, is also a least squares solution for any values of the constants $p_0 = [\psi_0^T\ c_0^T\ \lambda_0]^T$, $\dot{p} = [\dot{\psi}^T\ \dot{c}^T\ \dot{\lambda}]^T$.

Proposition 2. If x̂0i, v̂i, ẑk and x̂′0i, v̂′i, ẑ′k are two least squares solutions of the stacking problem, then there exist constants $p_0 = [\psi_0^T\ c_0^T\ \lambda_0]^T$, $\dot{p} = [\dot{\psi}^T\ \dot{c}^T\ \dot{\lambda}]^T$, such that x̂′0i = x̂0i + Eip0, v̂′i = v̂i + Eiṗ, i = 1, . . . , n, and ẑ′k = ẑk − p0 − (tk − t0)ṗ, k = 1, . . . , m.

On the basis of the above propositions we may construct the desired parameter transformation under a change of the reference system, having the general form x′tot = xtot + Etotptot. Letting x0 contain the station initial coordinates x0i, v the station velocities vi, z the epoch nuisance transformation parameters zk, and E the matrices Ei, the desired parameter transformation becomes, in the case of stacking,

$$x'_{tot} = \begin{bmatrix} x'_0 \\ v' \\ z' \end{bmatrix} = x_{tot} + E_{tot}p_{tot} = \begin{bmatrix} x_0 \\ v \\ z \end{bmatrix} + \begin{bmatrix} E & 0 \\ 0 & E \\ J & J_t \end{bmatrix}\begin{bmatrix} p_0 \\ \dot{p} \end{bmatrix} = \begin{bmatrix} x_0 + Ep_0 \\ v + E\dot{p} \\ z + Jp_0 + J_t\dot{p} \end{bmatrix}, \tag{143}$$

where

$$E = \begin{bmatrix} \vdots \\ E_i \\ \vdots \end{bmatrix}, \qquad J = \begin{bmatrix} \vdots \\ -I_7 \\ \vdots \end{bmatrix}, \qquad J_t = \begin{bmatrix} \vdots \\ -(t_k - t_0)I_7 \\ \vdots \end{bmatrix}, \tag{144}$$

or separately

$$x'_0 = x_0 + Ep_0, \qquad v' = v + E\dot{p}, \qquad z' = z + Jp_0 + J_t\dot{p}. \tag{145}$$

The transformation parameters are $p_0 = [\psi_0^T\ c_0^T\ \lambda_0]^T$ and $\dot{p} = [\dot{\psi}^T\ \dot{c}^T\ \dot{\lambda}]^T$. The most important characteristic of this transformation is the separation between the transformation of the initial coordinates and that of the velocities,

$$x'_0 = x_0 + Ep_0, \qquad v' = v + E\dot{p}. \tag{146}$$

This allows any least squares solution x̂0, v̂ to be converted separately into another solution having desired properties, with respect to the initial coordinates and to the velocities. In addition, the inner constraints matrix Etot can be implemented in order to derive inner or partial inner constraints, in their simple or generalized form.
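The invariance behind Propositions 1 and 2 is easy to verify numerically: transforming a solution by (145)–(146) leaves the modeled observations of (133) unchanged. A minimal sketch (Python/numpy; a hypothetical 4-station, 3-epoch network with random made-up values):

```python
import numpy as np

def skew(a):
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

rng = np.random.default_rng(5)
n, t0 = 4, 0.0
t = np.array([0.0, 1.0, 2.0])                        # epochs t_k (made up)
x0ap = rng.normal(size=(n, 3))                       # approximate initial coordinates
E = [np.hstack([skew(xa), np.eye(3), xa[:, None]]) for xa in x0ap]

x0 = x0ap + 0.01 * rng.normal(size=(n, 3))           # one least squares solution (made up)
v = 0.01 * rng.normal(size=(n, 3))
z = 0.01 * rng.normal(size=(len(t), 7))

p0 = 0.01 * rng.normal(size=7)                       # [psi_0, c_0, lambda_0]
pdot = 0.01 * rng.normal(size=7)                     # their rates

# Transformed solution, Eqs. (145)-(146)
x0_new = np.stack([x0[i] + E[i] @ p0 for i in range(n)])
v_new = np.stack([v[i] + E[i] @ pdot for i in range(n)])
z_new = z - p0 - np.outer(t - t0, pdot)

def predict(x0, v, z):
    """Modeled observations of Eq. (133) for all stations and epochs."""
    return np.array([[x0[i] + (tk - t0) * v[i] + E[i] @ z[k]
                      for k, tk in enumerate(t)] for i in range(n)])
```

Both parameter sets reproduce exactly the same modeled observations, which is why the 14 parameters p0, ṗ constitute the rank defect of the stacking problem.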


8 Various Types of Minimal Constraints for the Definition of a Spatiotemporal Reference System

Utilizing the derived transformation matrix

$$E_{tot} = \begin{bmatrix} E & 0 \\ 0 & E \\ J & J_t \end{bmatrix}, \tag{147}$$

the inner constraints $E_{tot}^Tx_{tot} = 0$ take the form

$$E^Tx_0 + J^Tz = \sum_{i=1}^n E_i^Tx_{0i} - \sum_{k=1}^m z_k = \sum_{i=1}^n \begin{bmatrix} -[x_{0i}^{ap}\times] \\ I_3 \\ (x_{0i}^{ap})^T \end{bmatrix}x_{0i} - \sum_{k=1}^m \begin{bmatrix} \theta_k \\ d_k \\ s_k \end{bmatrix} = \begin{bmatrix} -\sum_{i=1}^n [x_{0i}^{ap}\times]x_{0i} - \sum_{k=1}^m \theta_k \\[2pt] \sum_{i=1}^n x_{0i} - \sum_{k=1}^m d_k \\[2pt] \sum_{i=1}^n (x_{0i}^{ap})^Tx_{0i} - \sum_{k=1}^m s_k \end{bmatrix} = 0, \tag{148}$$

$$E^Tv + J_t^Tz = \sum_{i=1}^n E_i^Tv_i - \sum_{k=1}^m (t_k - t_0)z_k = \sum_{i=1}^n \begin{bmatrix} -[x_{0i}^{ap}\times] \\ I_3 \\ (x_{0i}^{ap})^T \end{bmatrix}v_i - \sum_{k=1}^m (t_k - t_0)\begin{bmatrix} \theta_k \\ d_k \\ s_k \end{bmatrix} = \begin{bmatrix} -\sum_{i=1}^n [x_{0i}^{ap}\times]v_i - \sum_{k=1}^m (t_k - t_0)\theta_k \\[2pt] \sum_{i=1}^n v_i - \sum_{k=1}^m (t_k - t_0)d_k \\[2pt] \sum_{i=1}^n (x_{0i}^{ap})^Tv_i - \sum_{k=1}^m (t_k - t_0)s_k \end{bmatrix} = 0. \tag{149}$$

The first 7 constraints define the reference system at the initial epoch and the second 7 its time rate, i.e., its temporal evolution. Of the 7 initial epoch constraints (148), the first 3 define the initial orientation, the next 3 the initial origin, and the last one the initial scale. Of the 7 rate constraints (149), the first 3 define the orientation rate, the next 3 the origin rate, and the last one the scale rate.
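The equivalence of the matrix forms Eᵀx₀ + Jᵀz, Eᵀv + Jtᵀz and the explicit component forms in (148) and (149) can be verified numerically. A minimal sketch (Python/numpy; all station and transformation values are random placeholders):

```python
import numpy as np

def skew(a):
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

rng = np.random.default_rng(6)
n, t0 = 4, 0.0
t = np.array([0.5, 1.0, 2.0])
x0ap = rng.normal(size=(n, 3))
Ei = [np.hstack([skew(xa), np.eye(3), xa[:, None]]) for xa in x0ap]

x0 = rng.normal(size=(n, 3))
v = rng.normal(size=(n, 3))
theta = rng.normal(size=(len(t), 3))
d = rng.normal(size=(len(t), 3))
s = rng.normal(size=len(t))
z = np.hstack([theta, d, s[:, None]])                # z_k = [theta_k, d_k, s_k]

# Matrix forms (J_k = -I7, Jt_k = -(t_k - t0) I7)
g_init = sum(Ei[i].T @ x0[i] for i in range(n)) - z.sum(axis=0)
g_rate = sum(Ei[i].T @ v[i] for i in range(n)) - ((t - t0)[:, None] * z).sum(axis=0)

# Explicit component forms of Eqs. (148) and (149)
g_init_explicit = np.concatenate([
    -sum(skew(x0ap[i]) @ x0[i] for i in range(n)) - theta.sum(axis=0),
    x0.sum(axis=0) - d.sum(axis=0),
    [sum(x0ap[i] @ x0[i] for i in range(n)) - s.sum()],
])
g_rate_explicit = np.concatenate([
    -sum(skew(x0ap[i]) @ v[i] for i in range(n)) - ((t - t0)[:, None] * theta).sum(axis=0),
    v.sum(axis=0) - ((t - t0)[:, None] * d).sum(axis=0),
    [sum(x0ap[i] @ v[i] for i in range(n)) - ((t - t0) * s).sum()],
])
```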


In addition to the above inner constraints $E_{tot}^Tx_{tot} = 0$, one may use

(a) their generalized version $E_{tot}^TW_{tot}(x_{tot} - x_{tot}^{ref}) = 0$, satisfying $(x_{tot} - x_{tot}^{ref})^TW_{tot}(x_{tot} - x_{tot}^{ref}) = \min$ (weighted minimum distance constraints), and the special cases
(b) $E_{tot}^TW_{tot}x_{tot} = 0$, satisfying $x_{tot}^TW_{tot}x_{tot} = \min$ (weighted minimum norm constraints), and
(c) $E_{tot}^Tx_{tot} = E_{tot}^Tx_{tot}^{ref}$, satisfying $(x_{tot} - x_{tot}^{ref})^T(x_{tot} - x_{tot}^{ref}) = \min$ (minimum distance constraints).

A special case of the last constraints results by choosing $x_{tot}^{ref} = x_{tot}^{ap}$, giving the form $E_{tot}^T(x_{tot} - x_{tot}^{ap}) = E_{tot}^T\delta x_{tot} = 0$ and minimizing the norm of the parameter corrections, $\delta x_{tot}^T\delta x_{tot} = \min$. These have exactly the same form as the inner constraints applied to the parameters, but they apply instead to the parameter corrections. Therefore one may replace the parameters in Eqs. (148) and (149) with their corrections. In general, it is hard to find justification for using a weight matrix other than the identity. In any case, it is of advantage to retain the separation of the constraints into the ones related to the initial coordinates, and thus defining the reference system at the initial epoch, and the ones related to the velocities, and thus defining the temporal evolution (rate) of the reference system. This is achieved by considering block-diagonal weight matrices of the form

$$W_{tot} = \begin{bmatrix} W_{x_0} & 0 & 0 \\ 0 & W_v & 0 \\ 0 & 0 & W_z \end{bmatrix}, \tag{150}$$

which leads to the separable generalized inner constraints

$$E_{tot}^TW_{tot}(x_{tot} - x_{tot}^{ref}) = \begin{bmatrix} E^T & 0 & J^T \\ 0 & E^T & J_t^T \end{bmatrix}\begin{bmatrix} W_{x_0} & 0 & 0 \\ 0 & W_v & 0 \\ 0 & 0 & W_z \end{bmatrix}\begin{bmatrix} x_0 - x_0^{ref} \\ v - v^{ref} \\ z - z^{ref} \end{bmatrix} = 0, \tag{151}$$

or explicitly

$$E^TW_{x_0}(x_0 - x_0^{ref}) + J^TW_z(z - z^{ref}) = 0, \tag{152}$$

$$E^TW_v(v - v^{ref}) + J_t^TW_z(z - z^{ref}) = 0. \tag{153}$$

A reasonable simplification is to use $z^{ref} = 0$, in which case the initial epoch constraints $E^TW_{x_0}(x_0 - x_0^{ref}) + J^TW_zz = 0$ can be used to adapt the initial epoch reference system to that of the target initial coordinates $x_0^{ref}$, e.g., those of a previous solution. At the same time, the rate constraints $E^TW_v(v - v^{ref}) + J_t^TW_zz = 0$ can be used to adapt the temporal evolution of the reference system to that of the target velocities $v^{ref}$, e.g., those from a previous solution, or velocities provided by a geophysical model. The above objectives can best be met by partial constraints, especially those which do not involve the transformation parameters. The possible partial inner constraints are the following:

(1) involving only initial coordinates and velocities,

$$\begin{bmatrix} E^T & 0 \\ 0 & E^T \end{bmatrix}\begin{bmatrix} x_0 \\ v \end{bmatrix} = \begin{bmatrix} E^Tx_0 \\ E^Tv \end{bmatrix} = 0: \qquad E^Tx_0 = 0 \quad\text{and}\quad E^Tv = 0, \tag{154}$$

(2) involving only initial coordinates and transformation parameters,

$$\begin{bmatrix} E^T & J^T \\ 0 & J_t^T \end{bmatrix}\begin{bmatrix} x_0 \\ z \end{bmatrix} = \begin{bmatrix} E^Tx_0 + J^Tz \\ J_t^Tz \end{bmatrix} = 0: \qquad E^Tx_0 + J^Tz = 0 \quad\text{and}\quad J_t^Tz = 0, \tag{155}$$

(3) involving only velocities and transformation parameters 

$$\begin{bmatrix} 0 & J^T \\ E^T & J_t^T \end{bmatrix}\begin{bmatrix} v \\ z \end{bmatrix} = \begin{bmatrix} J^Tz \\ E^Tv + J_t^Tz \end{bmatrix} = 0: \qquad J^Tz = 0 \quad\text{and}\quad E^Tv + J_t^Tz = 0, \tag{156}$$

(4) involving only transformation parameters 

$$\begin{bmatrix} J^T \\ J_t^T \end{bmatrix}z = \begin{bmatrix} J^Tz \\ J_t^Tz \end{bmatrix} = 0: \qquad J^Tz = 0 \quad\text{and}\quad J_t^Tz = 0. \tag{157}$$

The explicit forms of the above partial inner constraints can easily be formed from the explicit form of the total inner constraints, Eqs. (148) and (149), by simply removing the terms corresponding to the non-participating parameters. For example, the partial inner constraints involving only initial coordinates and velocities are

$$\begin{bmatrix} \sum_{i=1}^n [x_{0i}^{ap}\times]x_{0i} \\[2pt] \sum_{i=1}^n x_{0i} \\[2pt] \sum_{i=1}^n (x_{0i}^{ap})^Tx_{0i} \end{bmatrix} = 0, \qquad \begin{bmatrix} \sum_{i=1}^n [x_{0i}^{ap}\times]v_i \\[2pt] \sum_{i=1}^n v_i \\[2pt] \sum_{i=1}^n (x_{0i}^{ap})^Tv_i \end{bmatrix} = 0, \tag{158}$$

for the initial epoch and the rate, respectively. In a similar way, the partial inner constraints involving only transformation parameters are

$$\begin{bmatrix} \sum_{k=1}^m \theta_k \\[2pt] \sum_{k=1}^m d_k \\[2pt] \sum_{k=1}^m s_k \end{bmatrix} = 0, \qquad \begin{bmatrix} \sum_{k=1}^m (t_k - t_0)\theta_k \\[2pt] \sum_{k=1}^m (t_k - t_0)d_k \\[2pt] \sum_{k=1}^m (t_k - t_0)s_k \end{bmatrix} = 0, \tag{159}$$

for the initial epoch and the rate, respectively. Partial inner constraints involving only the initial coordinates, $E^Tx_0 = 0$, or only the velocities, $E^Tv = 0$, do not qualify as minimal constraints. Indeed, $E^Tx_0 = 0$ defines the reference system at the initial epoch but not its temporal evolution, while $E^Tv = 0$ defines the temporal evolution of the reference system but not the reference system at the initial epoch. This can be shown from the fact that the corresponding partial inner constraints matrices do not have full column rank in this case:

$$\operatorname{rank}(E_{par}) = \operatorname{rank}\begin{bmatrix} E & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} = 7 < 14, \qquad \operatorname{rank}(E_{par}) = \operatorname{rank}\begin{bmatrix} 0 & 0 \\ 0 & E \\ 0 & 0 \end{bmatrix} = 7 < 14. \tag{160}$$

The possible generalized partial constraints for a block-diagonal weight matrix are the following:

(1) involving only initial coordinates and velocities,

ET Wx0 (x0 − x0 ) = 0

and

ET Wv (v − vref ) = 0,

(161)

(2) involving only initial coordinates and transformation parameters ref

ET Wx0 (x0 − x0 ) + JT Wz (z − zref ) = 0

and

JTt Wz (z − zref ) = 0, (162)

(3) involving only velocities and transformation parameters JT Wz (z − zref ) = 0

ET Wv (v − vref ) + JTt Wz (z − zref ) = 0, (163) (4) involving only transformation parameters and

JT Wz (z − zref ) = 0

and

JTt Wz (z − zref ) = 0.

(164)

If the weight submatrices Wx0 , Wv and Wz are also block-diagonal we may derive explicit forms of the above generalized constraints. For example, those involving only initial coordinates and velocities become

748

A. Dermanis



$$\sum_{i=1}^n \begin{bmatrix} -[x_{0i}^{ap}\times]W_{x_{0i}}(x_{0i} - x_{0i}^{ref}) \\[2pt] W_{x_{0i}}(x_{0i} - x_{0i}^{ref}) \\[2pt] (x_{0i}^{ap})^TW_{x_{0i}}(x_{0i} - x_{0i}^{ref}) \end{bmatrix} = 0, \qquad \sum_{i=1}^n \begin{bmatrix} -[x_{0i}^{ap}\times]W_{v_i}(v_i - v_i^{ref}) \\[2pt] W_{v_i}(v_i - v_i^{ref}) \\[2pt] (x_{0i}^{ap})^TW_{v_i}(v_i - v_i^{ref}) \end{bmatrix} = 0, \tag{165}$$

while those involving only transformation parameters become

$$\begin{bmatrix} \sum_{k=1}^m W_{\theta_k}(\theta_k - \theta_k^{ref}) \\[2pt] \sum_{k=1}^m W_{d_k}(d_k - d_k^{ref}) \\[2pt] \sum_{k=1}^m w_{s_k}(s_k - s_k^{ref}) \end{bmatrix} = 0, \qquad \begin{bmatrix} \sum_{k=1}^m (t_k - t_0)W_{\theta_k}(\theta_k - \theta_k^{ref}) \\[2pt] \sum_{k=1}^m (t_k - t_0)W_{d_k}(d_k - d_k^{ref}) \\[2pt] \sum_{k=1}^m (t_k - t_0)w_{s_k}(s_k - s_k^{ref}) \end{bmatrix} = 0. \tag{166}$$

9

A Posteriori Change of the Spatiotemporal Reference System

Instead of applying directly a desired set of minimal constraints CT x = d, in order to obtain a corresponding least squares solution xˆ C , one can obtain first any least squares solution xˆ by means of any set of convenient minimal constraints and then transform it into the desired solution xˆ C , using the solution conversion equation xˆ C = xˆ −E(CT E)−1 (CT xˆ −d). We will apply this equation first to the various types of (total) inner constraints, usual and generalized ones, and then to the various types of partial inner constraints implementing only a subset of the unknown parameters. Finally, we look into the possibility to apply quite different types of constraints for initial coordinates and velocities, an option that becomes possible in view of the separation of the linearized reference system transformation into two independent ones for initial coordinates and velocities. An important result in this context is the possibility to apply the minimum norm or the minimum trace of the covariance matrix properties to only a subset of the above parameters.

9.1

Conversion to a Solution Satisfying Some Type of Inner Constraints

We will now examine how a solution obtained using any type of minimal constraints can be converted to a solution satisfying either generalized inner constraints or one of their special cases, such as the usual inner constraints. We refer here to total

13 Theory and Realization of Reference Systems

749

constraints involving all of the three sets of parameters, initial station coordinates, station velocities, and nuisance transformation parameters. In the case of generalized inner constraints (weighted minimum distance conref ref straints) ETtot W(xtot − xtot ) = 0, it holds that CT = ETtot W, d = ETtot Wxtot and T −1 T the general conversion equation xˆ C = xˆ − E(C E) (C xˆ − d) becomes xˆ tot = xˆ tot − Etot (ETtot WEtot )−1 ETtot W(ˆxtot − xtot ). ref

(167)

Application to the explicit set of the stacking parameters and restriction to blockdiagonal weight matrix gives ⎤ ⎡ ⎤ ⎡ ⎤ −1 E 0  T xˆ 0 xˆ 0 T JT Wz Jt ⎣ vˆ  ⎦ = ⎣ vˆ ⎦ − ⎣ 0 E ⎦ E Wx0 E + J Wz J × JTt Wz J ET Wv E + JTt Wz Jt  zˆ J Jt zˆ 4 3 ref ET Wx0 (ˆx0 − x0 ) + JT Wz (ˆz − zref ) × ≡ ET Wv (ˆv − vref ) + JTt Wz (ˆz − zref ) ⎤ ⎡ ⎤ ⎡ E 0   xˆ 0 t (168) ≡ ⎣ vˆ ⎦ − ⎣ 0 E ⎦ x . tv zˆ J Jt ⎡

T T   Setting tx = tTx,θ tTx,d tx,s , tv = tTv,θ tTv,d tv,s the conversion equations take the analytic form

ap ap xˆ 0i = xˆ 0i − x0i × tx,θ − tx,d − tx,s x0i ,

ap ap vˆ i = vˆ i − x0i × tv,θ − tv,d − tv,s x0i ,  θˆ k = θˆ k + tx,θ + (tk − t0 )tv,θ ,

dˆ k = dˆ k + tx,d + (tk − t0 )tv,d , sˆk = sˆk + tx,s + (tk − t0 )tv,s .

(169)

The above equations are general enough to cover all types of constraints. What varies from case to case is the form of the auxiliary vectors tx and tv . We give below the possible special cases. Conversion to the solution for minimum distance constrains (special case W = I): ref The minimal constraints have in this case the form ETtot (xtot − xtot ) = 0 and the auxiliary conversion vectors become

750

A. Dermanis



tx tv





τ1 I7 ET E + nI7 = τ1 I7 ET E + τ2 I7

−1 4

3 ref ET (ˆx0 − x0 ) + JT (ˆz − zref ) , ET (ˆv − vref ) + JTt (ˆz − zref ) (170)

where

τ1 =

s

(tk − t0 ),

τ2 =

k=1

s

(tk − t0 )2 .

(171)

k=1

ref

Conversion to the solution for inner constrains (special case W = I, xtot = 0): The minimal constraints have in this case the form ETtot xtot = 0 and the auxiliary conversion vectors become 

9.2

tx tv



 =

τ1 I7 ET E + nI7 τ1 I7 ET E + τ2 I7

−1 

 ET xˆ 0 + JT zˆ . ET vˆ + JTt zˆ

(172)

Conversion to a Solution Satisfying Some Type of Partial Inner Constraints

We look next into the conversion into a solution satisfying generalized partial inner ref constraints of the form ETpar W(xtot − xtot ) = 0, where only a subset of the unknowns is constrained, and their special cases. In this case the general conversion relation xˆ tot = xˆ tot − E(CT E)−1 (CT xˆ tot − d) takes with CT = ETpar W and d = ETpar Wxref the form xˆ tot = xˆ tot − E(ETpar WEtot )−1 ETpar W(ˆxtot − xtot ) = xˆ tot − Et. ref

(173)

Application to the explicit set of the stacking parameters and restriction to blockdiagonal weight matrix gives again the general conversion Eqs. (169) with 

tx tv



⎡ ⎤ ⎤⎞−1 ref Wx 0 E 0 Wx0 (ˆx0 − x0 ) ⎢ ⎥ = ⎝ETpar ⎣ 0 Wv E ⎦⎠ ETpar ⎣ Wv (ˆv − vref ) ⎦. Wz J Wz Jt Wz (ˆz − zref ) ⎛



(174)

To get the conversion equation for each particular type of partial inner constraints we merely need to replace the Epar matrix with its particular form. We will present only two cases, the one involving only initial coordinates and velocities and the one involving only transformation parameters.

13 Theory and Realization of Reference Systems

751

Generalized partial inner constraints involving only initial coordinates and velocities: In this case ⎡

Epar

⎤ E0 = ⎣ 0 E⎦, 0 0

(175)

and one needs to apply Eqs. (169) with 

tx tv

4

 =

3 ref (ET Wx0 E)−1 ET Wx0 (ˆx0 − x0 ) . (ET Wv E)−1 ET Wv (ˆv − vref )

(176)

For the special case Wx0 = I, Wv = I, it holds that tx = (ET E)−1 ET (ˆx0 − x0 ), tv = (ET E)−1 ET (ˆv − vref ) and ref



ap 2

ap C − n x¯ 0 × −n x¯ 0 × ⎢

ap (ET E) = ⎢ nI3 ⎣ n x¯ 0 × ap

⎥ ⎥, ⎦

ap

n¯x0

ap

(177)

ap

γ 2 + n(¯x0 )T x¯ 0

n(¯x0 )T

0



0

where we have set 1 ap = x0 i , n n

ap x¯ 0

ap

ap

ap

x0 i = x0 i − x¯ 0 ,

(178)

i=1

C=−

n

ap

[ x0 i ×]2 ,

(179)

i=1

γ2 =

n

ap

ap

( x0 i )T x0 i .

(180)

i=1

Implementation of the analytical inverse ⎡

(ET E)−1

C−1 ⎢

ap ⎢ = ⎢ − x¯ 0 × C−1 ⎣ 0

ap C−1 x¯ 0 × 1 n I3

+

ap ap 1 ap ap T x¯ (¯x0 ) − [¯x0 ×]C−1 [¯x0 ×] γ2 0 ap − γ12 (¯x0 )T

0



⎥ ap − γ12 x¯ 0 ⎥ ⎥, ⎦ 1 γ2

(181) leads to the following conversion equations to a solution satisfying the constraints ref EWx0 (ˆx0 − x0 ) = 0 and EWv (ˆv − vref ) = 0, simultaneously

752

A. Dermanis

 ap 

ap 1 ref ref x0j × (ˆx0j − x0j )− (ˆx0j − x0j )+ x0i × C−1 n n

xˆ 0i = xˆ 0i −





n

j =1

j =1



1 ⎣ ap ref ap ( x0j )T (ˆx0j − x0j )⎦ x0i , γ2 n

j =1

vˆ i = vˆ i −

n n  

ap 1 ref ap ref x0j × (ˆvj − vj )− (ˆvj − vj ) + x0i × C−1 n j =1







j =1

1 ⎣ ap ref ap ( x0j )T (ˆvj − vj )⎦ x0i , γ2 n

j =1

 θˆ k = θˆ k + C−1

n

  ap ref ref [ x0j ×] xˆ 0j − x0j + (tk − t0 )(ˆvj − vj ) ,

j =1

sˆk = sˆk −

n  1 " ap #T  ref ref ˆ x x − x + (t − t )(ˆ v − v ) , 0j k 0 j j 0j 0j γ2 j =1

 1  ref ref xˆ 0j − x0j + (tk − t0 )(ˆvj − vj ) − dˆ k = dˆ k − n n

j =1

ap  ap − x¯ 0 × (θˆ k − θˆ k ) − (ˆsk − sˆk )¯x0 .

(182)

Generalized partial inner constraints involving only transformation parameters: In this case ⎡ ⎤ 0 0 (183) Epar = ⎣ 0 0 ⎦ , J Jt and one may apply Eqs. (169) with 

tx tv



 =

JT Wz J JT Wz Jt JTt Wz J JTt Wz Jt

−1 

 JT Wz (ˆz − zref ) . JTt Wz (ˆz − zref )

(184)

In the special case that Wz = I we may analytically invert 

JT J JT Jt JTt J JTt Jt

and obtain

−1



nI7 τ1 I7 = τ1 I7 τ2 I7

−1

1 = nτ2 − τ12



 τ2 I7 τ1 I7 , −τ1 I7 nI7

(185)

13 Theory and Realization of Reference Systems

753

τ2 τ1 ref ref (ˆzk − zk ) − (tk − t0 )(ˆzk − zk ), 2 2 nτ2 − τ1 k nτ2 − τ1 k τ1 n ref ref tv = − (ˆzk − zk ) + (tk − t0 )(ˆzk − zk ). 2 2 nτ1 − τ1 k nτ2 − τ1 k

tx =

(186) (187)

It is also possible to derive similar relations for generalized partial inner constraints involving only initial coordinates and transformation parameters or only velocities and transformation parameters. It is not though possible to apply generalized partial inner constraints involving only initial coordinates or only velocities as already explained.

9.3

Conversion to a Solution Satisfying Different Constraints for Initial Coordinates and Velocities

An important characteristic of the linearized parameter transformation is the splitting (Eq. 147) between the initial coordinates transformation x 0 = x0 + Ep0 ˙ This allows us to seek solutions which and the velocity transformation v = v + Ep. incorporate different principles for selecting the least squares solutions xˆ 0 and vˆ , beyond the above cases based on the choice of a set of “global” minimal constraints CTtot xtot = dtot involving both x0 and v. Before examining particular choices, we consider the general case where the constraints matrix has the form ⎡

Ctot

⎤ Cx 0 = ⎣ 0 Cv ⎦ , 0 0

(188)

and leads to separate constraints CTx x0 = dx and CTv v = dv , which can be thus of a different nature. The total conversion xˆ tot = xˆ tot −Etot (CTtot Etot )−1 (CTtot xˆ tot −dtot ) from an original least squares solution to the one satisfying CTtot xtot = dtot , takes in this case the form xˆ 0 =ˆx0 − E(CTx E)−1 (CTx xˆ 0 − dx )=ˆx0 − Etx ,

tx =(CTx E)−1 (CTx xˆ 0 − dx ), (189)

vˆ  = vˆ − E(CTv E)−1 (CTv vˆ − dv ) = vˆ − Etv ,

tv = (CTv E)−1 (CTv vˆ − dv ), (190)

zˆ  = zˆ − Jtx − Jt tv .

(191)

Note that we arrive at the same solution regardless of whether the constraints are applied jointly or separately. Seeking xˆ 0 = xˆ 0 + Ep0 satisfying CTx x0 = dx alone, leads to the solution (189) and seeking vˆ = vˆ + Ep˙ satisfying CTv x0 = dv alone, leads to the solution (190).

754

A. Dermanis

We look first into the case of separate generalized inner constraints. In the case ref of minimal weighted distance, if we want to minimize φx = ||ˆx0 − x0 ||2Wx alone, 0 then replacing xˆ 0 = xˆ 0 + Ep0 we find that the minimum of φx = (ˆx0 −x0 )T Wx0 (ˆx0 −x0 ) = (ˆx0 +Ep0 −x0 )T Wx0 (ˆx0 +Ep0 −x0 ) = min, ref

ref

ref

ref

p0

∂φx = ∂p0 −1 T E) E W

is provided by p0 =

−(ET W

x0

ref

(ˆx0 + Ep0 − x0 )T Wx0 E ref x0 − x0 ) x0 (ˆ

=

(192) 0, which gives

and hence

xˆ 0 = xˆ 0 − E(ET Wx0 E)−1 ET Wx0 (ˆx0 − x0 ). ref

(193)

If we want to minimize φx = ||ˆv − vref ||2Wv alone, then replacing vˆ  = vˆ + Ep˙ we find that the minimum of φv = (ˆv − vref )T Wv (v − vref ) = (ˆv + Ep˙ − vref )T Wv (vˆ + Ep˙ − vref ) = min, p˙

(194)  + Ep ref )T W E = 0, which gives p v ˙ ˙ = 2(ˆ v − v is provided by ∂φ v 0 = ∂ p˙ −(ET Wv E)−1 ET Wv (ˆv − vref ) and hence vˆ  = vˆ − E(ET Wv E)−1 ET Wv (ˆv − vref ).

(195)

Comparing with the joint generalized inner constraints solution where φx + φv = ref ||x 0 − x0 ||2Wx + ||ˆv − vref ||2Wv is minimized we see that the solutions are 0

the same. Therefore when ||x 0 − x0 ||2Wx + ||ˆv − vref ||2Wv is minimized, also ref

0

||x 0 − x0 ||2Wx and ||ˆv − vref ||2Wv are simultaneously minimized! In the special ref

0

ref

case of the classical inner constraints (x0 = 0, vref = 0, Wx0 = I), the solution ref ref where ||x 0 − x0 ||2 + ||ˆv − vref ||2 is minimized, also minimizes ||x 0 − x0 ||2 and ||v − vref ||2 , separately. The importance of the above conclusions lies in the fact that we do not need to use a minimum norm or a minimum weighted distances principle for both initial coordinates and velocities. We can combine such a principle for one of the two, ref ref e.g., ||xˆ 0 − x0 ||2Wx = min for x0 via the constraints ET Wx0 (x0 − x0 ) = 0, 0

with any other set of constraints whatsoever for the other, e.g., CTv v = dv . This will ref prove to be an important property, because ||xˆ 0 − x0 ||2Wx = min is quite useful 0

in bringing the solution xˆ 0 close to a preexisting one x0 , while for velocities a different optimal choice is preferable, as we will see in the next section. By the way, it is important to keep in mind that initial velocity constraints CTx x0 = dx should always be combined with a set of velocity constraints CTv x0 = dv , and vice versa, because none of the two can stand alone as a set of partial ref

13 Theory and Realization of Reference Systems

755

minimal constraints. We know that the simultaneous use of partial inner constraints ET x0 = dx and ET v = dv where in addition to the minimum property the trace of the joint covariance cofactor matrix of xˆ 0 and vˆ is minimized, i.e., trQxˆ 0 + trQvˆ = min. An interesting open question is whether this choice minimizes also trQxˆ 0 and trQvˆ separately, as it does in the case of the norms. The answer is positive. To demonstrate this, consider the constraints CTx x0 = dx which provide the solution xˆ 0 = [I − E(CTx E)−1 CTx ] xˆ 0 + E(CTx E)−1 dx , with covariance cofactor matrix Qxˆ 0 = [I − E(CTx E)−1 CTx ]Qxˆ 0 [I − E(CTx E)−1 CTx ]T = HQ0 HT .

(196)

Setting Qxˆ 0 = Q0 and Cx = C for simplicity we have H = I − E(CT E)−1 CT and we seek to find the matrix C, which minimizes φ = trQxˆ 0 = tr(HQ0 HT ). Recalling that

∂C ∂Cik

= ei eTk ,

∂CT ∂Cik

= ek eTi , where ei stands for the ith column of the

3n × 3n identity matrix, and the property compute

∂H ∂Cik

(197)

∂(M−1 ) ∂q

= −E(CT E)−1 ek eTi H, so that the

−1 we can easily = −M−1 ∂M ∂q M

      ∂H ∂φ ∂HT ∂H = 2 tr = tr Q0 HT + tr HQ0 Q0 HT = ∂Cik ∂Cik ∂Cik ∂Cik ' & = 2 tr −E(CT E)−1 ek eTi HQ0 HT = −2eTi HQ0 HT E(CT E)−1 ek = = −2[HQ0 HT E(CT E)−1 ]ik . The minimum is obtained for equation

∂φ ∂Cik

(198)

= 0 and therefore from the solution of the matrix

F(C) = [H(C)]Q0 [H(C)]T E = 0.

(199)

Since obviously [H(E)]T E = 0, it follows that the choice Cx = C = E satisfies the above equation and thus the constraints ET x0 = dx minimize trQxˆ 0 , independently of the constraints CTv v = dv used for the velocities. In a completely analogous way the constraints ET v = dv minimize trQvˆ  , independently of the constraints CTx x0 = dx used for the initial coordinates. In particular the joint use of ET x0 = dx and ET v = dv does not only minimize trQxˆ 0 + trQvˆ  , as already known but also trQxˆ 0 and trQvˆ  separately! In some applications one may seek to transform velocities vˆ  = vˆ + Ep˙ in such a way that vˆ T Wv vˆ  is minimized. The solution is obviously the speref cial case of generalized inner constraints solution with x0 = 0, given by vˆ  = vˆ − E(ET Wv E)−1 ET Wv vˆ . This has covariance factor matrix

756

A. Dermanis

Qvˆ  = [I − E(ET Wv E)−1 ET Wv ]Qvˆ [I − E(ET Wv E)−1 ET Wv ]T = GQvˆ GT . (200) If one seeks to find the weight matrix Wv which minimizes trQvˆ  , then following and a procedure similar to that above he will come up with the answer Wv = Q−1 vˆ  the solution vˆ . vˆ  = vˆ − E(ET Q−1 E)−1 ET Q−1 vˆ  vˆ 

(201)

This solution however, does not minimize trQvˆ  among all possible least squares solutions, but only within the smaller class of least squares solutions which minimize vˆ T Wv vˆ  as Wv runs along all possible symmetric positive-definite weight matrices. Thus if trQvˆ  = min is desired, one should use instead directly constraints ET v = dv , with arbitrary dv , or simply ET v = 0. For a solution satisfying vˆ T Wv vˆ  = min one should seek different criteria for the choice of weight matrix. Apart of the obvious choice Wv = I, a choice of Wv = I may be based on the need to put larger weights to the coordinate of stations with high quality results and to down-weight stations of lower quality. Another option is to assign larger weights to more stable stations that reflect overall plate or sub-plate behavior and smaller ones to stations affected by localized tectonic behavior.

10

Kinematic Minimal Constraints

All the above forms of minimal constraints are useful for adapting the solution of stacking to either a previous solution, or to some geophysical plate motion model in order to check its consistency with the observed data. However, they do not produce an optimal reference system, in the sense that the temporal variation of coordinates reflects only temporal variations in the network shape and not variations due solely to the choice of reference system at each epoch.PAn optimal reference system corresponds to a shortest geodesic in the subspace t St of the space R 3n of all the coordinates of the N station points. We have also seen that such a choice is in fact a discrete version of Tisserand’s choice of reference system based on the minimization of the relative kinetic energy of earth masses. In the most general case of arbitrary station variation xi (t) the defining conditions are the barycenter preservation condition xB (t) =

1 xi (t) = cB , n

∀t,

(202)

i

(cB = constant) for the definition of the reference system origin and the vanishing relative angular momentum condition h(t) =

i

[{xi (t) − xB (t)}×]

d{xi (t) − xB (t)} = 0, dt

∀t,

(203)

13 Theory and Realization of Reference Systems

757

for the definition of the reference system orientation. Furthermore, we need a condition for the definition of the reference system scale. We choose to preserve the mean quadratic size of S(t) of the network, quantified by the mean quadratic value of the distances of all stations from their barycenter S 2 (t) ≡

1 1 [xi (t) − xB (t)]T [xi (t) − xB (t)] = CS2 , n n

∀t,

(204)

i

(CS2 = constant). Different values of the constants cB and CS2 lead to different minimum length geodesics, which are though parallel in the sense that they are connected by time invariant transformation parameters. If a choice xi (t) satisfies the condition h(t) = 0, ∀t, then it can be easily established that a parallel reference system choice x˜ i (t) = (1 + s)Rxi (t) + d (s, R, d constant) has relative angular momentum h˜ = (1 + s)2 Rh = 0, which also vanishes. Thus, the orientation condition does hold for all parallel reference systems and a specific reference system can be obtained by a specific choice of the orientation at some initial epoch t0 along with the choice of the constants cB and CS2 . When network station motion follows a linear-in-time model xi (t) = x0i + (t − t0 )vi , the above three conditions become xB (t) = x¯ 0 + (t − t0 )¯v = cB , h= [(xi0 − x¯ 0 )×](vi − v¯ ),

∀t,

(205) (206)

i

S 2 (t) =

1 1 (x0i − x¯ 0 )T (x0i − x¯ 0 ) + 2(t − t0 ) (x0i − x¯ 0 )T (vi − v¯ ) ≡ n n i

i

≡ S0 + 2(t − t0 )S1 =

1 2 C , n S

∀t,

(207)

1 vi , n

(208)

where x¯ 0 =

1 x0i , n i

S0 =

v¯ =

i

1 1 T (x0i − x¯ 0 )T (x0i − x¯ 0 ) = x0i x0i − x¯ T0 x¯ 0 , n n

(209)

1 1 T (x0i − x¯ 0 )T (vi − v¯ ) = x0i vi − x¯ T0 v¯ . n n

(210)

i

S1 =

i

i

i

The condition hR (t) = 0 already holds for any epoch. For xB (t) = cB and S 2 (t) = Cs2 /n to hold for any epoch it must hold that x¯ 0 = cB , v¯ = 0, S0 = Cs2 /n and S1 = 0. Thus we arrive at the nonlinear kinematic constraints

758

A. Dermanis

1 x0i = cB , n i vi = 0, i

i

(211a) (211b)

6

56 5 1 [xi 0 ×]vi − [x0 i ×] vi = 0, n i



xT0 i x0i

i

(211c)

i

5T 6 5 6 1 = [x0 i ] x0i = CS2 , n i

(211d)

i

6 5T 6 5 1 1 1 T x0 i vi − [x0 i ] vi = 0, n n n i

i

(211e)

i

which define, respectively, the initial origin, the origin rate, the orientation rate, the initial scale and the scale rate. While (211a) and (211b) are already linear, the rest are nonlinear with respect to the unknowns x0i , vi and they need to be linearized in order to obtain the desired linear kinematic constraints, which are the minimal ap ap constraints to be used. Using approximate values x0i , vi and ignoring second order ap ap ap terms in the small quantities δx0i = x0i − x0i , δvi = v0i − vi , vi we obtain the kinematic constraints defining the orientation, origin and scale at the initial epoch ⎤ ⎡ ⎤ ap 0 [x ×] 0i ap ⎥ ⎣ ⎦δx0i = ⎢ CTxi δx0i = CTx δx0 = I3 ⎣ n(cB − x¯ 0 ) ⎦ = dx , ap ap CS2 −γ 2 i i (x0 i − x¯ 0i )T 2 (212) and the kinematic constraints defining the rates (temporal evolution) of orientation, origin and scale ⎡

CTv δv =



CTvi δvi =



i

i



⎤ ⎡ ⎤ ap ap [(x0 i − x¯ 0 )×] −hap ⎣ ⎦δvi = ⎣ −n¯vap ⎦ = dv , I3 ap ap T (x0 i − x¯ 0 ) −κ

ap

ap

(213)

ap

where x0 , vap are the means of the x0i , vi values, respectively, ap

γ 2 = nS0 =



ap

ap T

ap

ap

(x0 i − x¯ 0 ) (x0 i − x¯ 0 )



i

κ=

ap nS1

hap =

i

=



ap T ap

(214)

i ap (x0 i

ap T ap − x¯ 0 ) (v i

− v¯ ) = ap

i ap

ap

(x0 i − x¯ 0 ) x0 i ,

ap

ap

[(x0 i − x¯ 0 )×](v i − v¯ ap ) =

i



ap T ap

ap

(x0 i − x¯ 0 ) v i ,

(215)

i ap

ap

ap

[(x0 i − x¯ 0 )×]v i .

(216)

13 Theory and Realization of Reference Systems ap

759 ap

Replacing δx0i = x0i − x0i , δvi = v0i − vi , the linearized kinematic constraints can be converted into ones with respect to the original unknowns, those defining the reference system at the initial epoch ⎤ ⎡ ⎤ ap 0 [x ×] 0i ⎥ ⎣ ⎦x0i = ⎢ CTx x = CTxi x0i = I3 ⎣ ncB ⎦ = dx , 2 +γ 2 ap ap T C S i i (x0 i − x¯ 0i ) ⎡

(217)

2n

and those defining the rate (temporal evolution) of the reference system

CTv v =



CTvi vi =



i

i



⎤ ⎡ ⎤ ap ap [(x0 i − x¯ 0 )×] 0 ⎣ ⎦vi = ⎣ 0 ⎦ = 0 = dv . I3 ap ap 0 (x0 i − x¯ 0 )T

(218)

The above constraints have been completed with constraints defining the initial epoch orientation, which cannot be obtained from kinematic principles. We have chosen to borrow  theapmissing constraintsfromapthe inner constraints, using the equivalent sets i [x0 i ×]δx0i = 0 and i [x0 i ×]x0i = 0. The constants cB , CS2 can be chosen arbitrarily, thus providing different solutions among all possible parallel solutions, which are all optimal from the kinematic point of view. Any particular optimal solution depends on the choice of the reference system at the initial epoch, realized by the choice of cB for the initial origin and the choice of CS2 for the initial scale. Initial orientation depends on the approximate values used in the initial orientation constraint borrowed from the inner constraints. As a possible choice of constants one can choose the mean values over the whole coordinate time series xobs i (tk ), k = 1, 2, . . . , m, i = 1, 2, . . . , n , i.e., 3 4 n m m 1 obs 1 1 obs cB = xB (tk ) = xi (tk ) , m m n k=1

(219)

i=1

1 obs T obs obs [xi (tk ) − xobs B (tk )] [xi (tk ) − xB (tk )]. m m

CS2 =

k=1

n

(220)

k=1 i=1

The total constraints have the form 

CTtot xtot

CTx x0 = CTv v



⎡ ⎤  x0   CTx 0 0 ⎣ ⎦ dx = dtot . = v = dv 0 CTv 0 z 

(221)

Instead of applying the kinematic constraints directly, we can apply any other set of minimal constraints and then convert a posteriori the obtained least squares solution into the one with kinematic constraints. Applying the general conversion equation xˆ  = xˆ −E(CT E)−1 (CT xˆ −d), or in two steps t = (CT E)−1 (CT xˆ −d), xˆ  = xˆ −Et,

760

A. Dermanis

we have in our case ⎡

xˆ tot

⎡ ⎤ ⎤ xˆ 0 xˆ 0 − Etx ⎦, = ⎣ vˆ  ⎦ = xˆ tot − Etot ttot = ⎣ vˆ − Etv  zˆ zˆ − Jtx − Jt tv





(222)

where 

 (CTx E)−1 (CTx xˆ 0 − dx ) . (CTv E)−1 (CTv vˆ − dv ) (223) With tx = [(tθx )T (tdx )T txs ]T and tv = [(tθv )T (tdv )T tvs ]T the above transformation takes the explicit form ttot =

tx tv

= (CTtot Etot )−1 (CTtot xˆ tot − dtot ) =

ap ap xˆ 0i = xˆ 0i − x0i × tθx − tdx − txs x0i ,

ap ap vˆ i = vˆ i − x0i × tθv − tdv − tvs x0i ,  θˆ k = θˆ k − tθx − (tk − t0 )tθv ,

dˆ k = dˆ k − tdx − (tk − t0 )tdv , sˆk = sˆk + txs − (tk − t0 )tvs .

(224)

The matrices ⎤ ap [x0 i ×]

⎣ ⎦ [xap ×] I3 xap = CTx E = I3 0i 0i ap ap i (x0 i − x¯ 0 i )T ⎡ ⎤ ap ap −C + n[¯x0 ×]2 n[¯x0 ×] 0 ap ap =⎣ (225) nI3 n¯x0 ⎦ , n[¯x0 ×] 2 0 0 γ ⎡ ⎡ ap ⎤ ⎤ ap −C 0 0 [(x0 i − x¯ 0 )×] ap ap ⎣ n[¯xap ×] nI3 n¯xap ⎦ , ⎣ ⎦ [x ×] I3 x CTv E = I3 0i 0i = 0 0 ap ap T i (x0 i − x¯ 0 ) 0 0 γ2 (226)



have respective analytical inverses ⎡

(CTx E)−1

−C−1 ⎢ x¯ ap × C−1 =⎣ 0 0

ap ⎤ C−1 x¯ 0 × 0 ap ap op 1 x0 ×]C−1 [¯x0 ×] − γ12 x¯ 0 ⎥ ⎦, n I3 − [¯ 0

1 γ2

(227)

13 Theory and Realization of Reference Systems

761

⎤ −C−1 0 0 1 op ⎥ ⎢ ap −1 1 = ⎣ x¯ 0 × C n I3 − γ 2 x¯ 0 ⎦ , ⎡

(CTv E)−1

0

(228)

1 γ2

0

which can be used to compute tx = [(tθx )T (tdx )T txs ]T and tv = [(tθv )T (tdv )T tvs ]T according to (223). Replacing the obtained values tθx , tdx , txs and tθv , tdv , tvs in the transformation equations (224) we obtain the explicit formulas xˆ 0i = xˆ 0i + [ x0i ×]C−1 ap

4 +



ap

[ x0i ×] xˆ 0j −

i

C2 1 1 ap − 2 ( x0i )T xˆ 0j + S2 2 γ γ

1 xˆ 0j + cB + n j

3 ap

x0i ,

i

vˆ i = vˆ i + [ x0i ×]C−1 ap



ap

[ x0j ×] vˆ j −

j

1 vˆ j − n j

⎡ ⎤ 1 ap ap − 2⎣ ( x0j )T vˆ j ⎦ x0i , γ j

 θˆ k

= θˆ k + C−1



[ x0j ×][ˆx0j + (tk − t0 )ˆvj ] + nC−1 [¯x0 ×]cB , ap

ap

j

sˆk = sˆk +

C2 1 1 ap − 2 ( x0j )T [ˆx0j + (tk − t0 )ˆvj ] + S2 , 2 γ 2γ j

1  ap ap [¯x0j + (tk − t0 )ˆvj ] + cB + (ˆsk − sˆk )¯x0 , dˆ k = dˆ k + [¯x0 ×](θˆ k − θˆ k ) − n j

(229) ap

ap

ap

where x0i = x0i − x¯ 0 . Similar transformation equations can be derived for the case where the unknowns are the corrections to the original unknown parameters. ap ap ap ap Replacing xˆ 0i = x0i + δ xˆ 0i , vˆ i = vi + δ vˆ i , xˆ 0i = x0i + δ xˆ 0i , vˆ i = vi + δ vˆ i , we obtain the conversion formulas δ xˆ 0i = δ xˆ 0i + [ x0i ×]C −1 ap

4 −



ap

[ x0j ×]δ xˆ 0j −

j

C2 1 1 ap + 2 ( x0j )T δ xˆ 0k + S2 2 γ 2γ k

1 ap δ xˆ 0j + (cB − x¯ 0 )− n j

3 ap

x0i ,

762

A. Dermanis

δ vˆ i = δ vˆ i + [ x0i ×]C−1 ap



[ x0j ×]δ vˆ j + [ x0j ×]C−1 hap − ap

ap

j

3 4 1 1 ap T ap ap δ vˆ j − v¯ − 2 ( x0j ) δ vˆ j + κ x0i , − n γ j

 θˆ k

= θˆ k + C−1

k



[

ap x0j ×][δ xˆ 0j

+ (tk − t0 )δ vˆ j ] + nC−1 [¯x0 ×]cB + ap

j

+ (tk − t0 )C−1 hap sˆk = sˆk +

C2 1 1 κ ap − 2 ( x0j )T [δ xˆ 0j + (tk − t0 )δ vˆ j ] + S2 − 2 (tk − t0 ), 2 γ 2γ γ j

1  ap ap [δ xˆ 0j + (tk − t0 )δ vˆ j ] + (cB − x¯ 0 )− dˆ k = dˆ k + [¯x0 ×](θˆ k − θˆ k ) − n j

ap −(tk − t0 )¯vap + (ˆsk − sˆk )¯x0 .

11

(230)

Transforming a Network Reference System into an (Approximate) Earth Reference System

The solution to the stacking problem defined by kinematic constraints is by far the optimal choice, at least from a purely geodetic point of view, because it does not exhibit temporal coordinate variations, which do not reflect variations in the network shape, but are caused by variations in the chosen spatiotemporal reference system. The other desired property, that of closeness to a pre-existing solution, e.g., that in an officially adopted reference frame, can also be satisfied by choosing the initial epoch reference system as desired, through the choice of values for the arbitrary constants and the approximate values for the initial epoch orientation constraint borrowed from the inner constraints. Thus optimality refers only to the temporal evolution (rate) of the reference system and in particular to the velocity estimates, which are the smallest possible in the sense of their mean quadratic modulus. The only other choice for the velocity estimates that should be used, is the ones closest to a given geophysical model for the sole purpose of investigating its compatibility with the available data. From the geodetic point of view, the kinematic constraints solution is the best solution as a basis for global, regional and even local mapping. Nevertheless, even such an optimal solution has its drawbacks, which are related to the discrete nature of the geodetic networks, the dependence on the particular design, and of course, most of all, the lack of coverage over the oceans. A truly optimal reference system in the sense of Tisserand, should refer to the temporal variation of the continuous mass distribution within the whole earth. Such knowledge is beyond the reach of geodetic observational techniques, but a

13 Theory and Realization of Reference Systems

763

reasonable compromise can be limited to the lithosphere of the earth. The basic idea is to use the knowledge of the variation in the geometric configuration of networks on the earth surface in order to infer corresponding variations within the lithosphere and in particular the relative motion of plates and sub-plates. In the following, we will use the term plate in a more general sense that covers not also sub-plates but even any region that exhibits different kinematic behavior from its neighbor ones. Starting with an existing spatiotemporal coordinate system, described by the coordinate functions xi (t) of a global geodetic network, we will convert into an optimal one x˜ i (t) as follows: First an optimal discrete Tisserand reference system will be separately established for each subnetwork DK covering a corresponding plate PK . Next the derived motion of each of these local reference systems will be assumed to correspond to the motion of all points within the corresponding plate thus allowing us to compute the angular momentum of the lithosphere, as the sum of those for each plate. Finally, the rigid transformation will be sought, which leads to a Tisserand reference system for the lithosphere where the angular momentum of the lithosphere will vanish as required. In doing so, we must first determine how the angular momentum varies under a time-dependent rigid transformation. The angular momentum in the original reference system is given by  h=

˙ [(x − m)×](˙x − m)dm,

(231)

  1 where M = dm is the mass and m = M x dm the barycenter of the body ˜ = Rm + d, in question. A rigid transformation x˜ = Rx + d will result into m ˙˜ = Rm ˙˜ = ˙ + R˙x + d˙ and m ˙ + Rm ˙ x˜ − m ˙ + d, ˜ = R(x − m), x˙˜ − m x˙˜ = Rx ˙ ˙T ˙ The corresponding rotation vector is given by [ω×] ˜ R(x − m) + R(˙x − m). = RR T ˜ in the original system so that in the new system and by ω = R ω ˙ T R = −RT R, ˙ [ω×] = R

(232)

with the last term following from the differentiation of RT R = I. Therefore ˙ = −R[ω×], R

˙˜ = R[(x − m)×]ω + R(˙x − m). ˙ x˙˜ − m

(233)

Replacing into the expression for the angular momentum in the new system we get h˜ =



˙˜ ˜ [(˜x − m)×]( x˙˜ − m)dm =

 =

˙ R[(x − m)×]RT {R[(x − m)×]ω+R(˙x − m)}dm = 

=R

 [(x − m)×]2 dmω + R

˙ [(x − m)×](˙x − m)dm.

(234)

764

A. Dermanis

Recognizing the original angular momentum in the second term and  C = − [(x − m)×]2 dm,

(235)

as the inertia matrix of the body in the original reference system, we obtain the law of transformation of angular momentum h˜ = R(h − Cω).

(236)

The same law holds true for the discrete angular momentum of geodetic networks as it can be easily seen if integration is replaced by summation in the above derivation. We will next seek the rigid transformation x i = R xi + d from the original reference system to the optimal one for a particular sub-network DK . Denoting by hDK =



[(xi − x¯ B )×](˙xi − x˙¯ B ),

i∈DK

CDK = −



[(xi − x¯ B )×]2 ,

(237) (238)

i∈DK

˙ T R = −RT R ˙ , [ωK ×] = R

(239)

the angular momentum, the inertia matrix, and the rotation vector, respectively, of the subnetwork in the original reference system, application of the above transformation law gives h DK = R (hDK − CDK ωK ). For optimality with respect to orientation it must hold that h DK = 0, in which case the rotation vector can be computed from ωK = C−1 DK hDK .

(240)

Optimality with respect to the origin is established from ˙  m + R m ˙ =R ˙ + d˙  = R [m ×]ωK + R m ˙ + d˙  = 0. m

(241)

The optimal rigid transformation elements R and d can be determined from solving ˙  = −R [ωK ×] for the R (θ ) parameters θ and then the differential equation R ˙ for d . Fortunately, we will solve the differential equation d˙  = −R ([m×]ωK + m) not need to do this. At this point we assume that the plate PK moves rigidly in the way described by the best fitted reference system to its subnetwork DK , which means that the plate points do not move with respect to this system and thus x˙  = 0 for all points in PK . This means that the residual apparent velocities v i are indeed small and can be either attributed to the effect of observational errors, or to secondary within plate deformations of secondary importance. From x˙  = 0 follows that the angular momentum also vanishes in this reference system, i.e., that h PK = 0. Since by the law of angular momentum transformation h PK = R (hPK −CPK ωK ) = 0 it follows

13 Theory and Realization of Reference Systems

765

that h_{P_K} = C_{P_K}\omega_K. In order to determine the rigid transformation \tilde{x} = Rx + d which transforms the original system to an optimal system, we note that by the law of transformation \tilde{h}_{P_K} = R(h_{P_K} - C_{P_K}\omega), with \omega now being the rotation vector of the rigid transformation expressed in the original reference system. Optimality with respect to orientation is achieved by setting the total angular momentum of the lithosphere equal to zero

\tilde{h}_{\cup_K P_K} = \sum_K \tilde{h}_{P_K} = \sum_K R(h_{P_K} - C_{P_K}\omega) = R \sum_K (C_{P_K}\omega_K - C_{P_K}\omega) = 0. \qquad (242)

Solving the above equation for \omega it follows that

\omega = \Big(\sum_K C_{P_K}\Big)^{-1} \sum_K C_{P_K}\omega_K = \Big(\sum_K C_{P_K}\Big)^{-1} \sum_K C_{P_K} C_{D_K}^{-1} h_{D_K}, \qquad (243)

and the rotation vector \omega is in fact a weighted mean of the rotation vectors \omega_K of the reference systems that are best fitted to the subnetworks D_K. For optimality with respect to the origin we require that \dot{\tilde{m}} = 0. Obviously \tilde{m} = Rm + d, where m = \sum_K m_K and m_K = \frac{1}{M_K}\int_{P_K} x\,dm, with M_K = \int_{P_K} dm being the mass of plate

P_K. Consequently \dot{\tilde{m}} = \dot{R}m + R\dot{m} + \dot{d} = R[m\times]\omega + R\dot{m} + \dot{d} = 0, and the elements R, d of the rigid transformation leading to the optimal reference system can be determined from one of the solutions of the differential equations

\dot{R} = -R[\omega\times], \qquad \dot{d} = -R([m\times]\omega + \dot{m}). \qquad (244)
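The matrix-weighted mean structure of Eq. (243) is easy to verify numerically. The sketch below is hypothetical (invented function name and data); it reduces to the arithmetic mean for equal inertia matrices:

```python
import numpy as np

def combined_rotation(C_list, omega_list):
    """Combined rotation vector of Eq. (243):
    omega = (sum_K C_PK)^{-1} sum_K C_PK omega_K."""
    Csum = sum(C_list)
    rhs = sum(C @ w for C, w in zip(C_list, omega_list))
    return np.linalg.solve(Csum, rhs)
```

With equal plate inertia matrices the result is the plain average of the \omega_K; unequal inertia matrices pull \omega toward the rotation vector of the "heavier" plate, and identical \omega_K are reproduced unchanged.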

If we consider a representation in terms of rotation angles around the axes, R(\theta) = R_3(\theta_3)R_2(\theta_2)R_1(\theta_1), and set [\omega_j\times] = -R^T \partial R/\partial\theta_j, it follows that \omega = \dot\theta_1\omega_1 + \dot\theta_2\omega_2 + \dot\theta_3\omega_3 = \Omega\dot\theta, where \Omega = [\omega_1\ \omega_2\ \omega_3], and we must first solve the differential equation

\dot\theta = \Omega^{-1}\omega, \qquad (245)

e.g., by numerical integration, and then use the solution \theta in next solving \dot{d} = -R(\theta)([m\times]\omega + \dot{m}). For the usual representation R(\theta) = R_3(\theta_3)R_2(\theta_2)R_1(\theta_1) with rotations around the axes, the desired matrix becomes

\Omega^{-1} = \begin{bmatrix} 1 & \tan\theta_1 & -\tan\theta_2/\cos\theta_1 \\ 0 & 1/\cos\theta_1 & \tan\theta_1/\cos\theta_1 \\ 0 & 0 & 1/(\cos\theta_1\cos\theta_2) \end{bmatrix} \approx \begin{bmatrix} 1 & \theta_1 & -\theta_2 \\ 0 & 1 & \theta_1 \\ 0 & 0 & 1 \end{bmatrix}, \qquad (246)



the last term being an acceptable approximation, since we are concerned here with transformations close to the identity and we can neglect second order terms in the small rotation angles \theta_1, \theta_2, \theta_3. With this value the differential equation splits into two parts

\dot\theta_3 = \omega_3, \qquad \begin{bmatrix} \dot\theta_1 \\ \dot\theta_2 \end{bmatrix} = \begin{bmatrix} \omega_2 & -\omega_3 \\ \omega_3 & 0 \end{bmatrix}\begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix} + \begin{bmatrix} \omega_1 \\ \omega_2 \end{bmatrix}. \qquad (247)

The first has solution \theta_3 = \theta_{3,0} + (t - t_0)\omega_3, while the second is a system of first order differential equations \dot\theta = A\theta + b with constant coefficients that can be solved by standard mathematical techniques. The above approach is general enough, as it considers any possible motion of each plate. However, plates are primarily known to "float" over the mantle, without any "bending" in the sense of rotating around a horizontal axis passing through their barycenter. The only other possible motion is vertical motion associated with post-glacial uplift. It seems a good idea to consider only the horizontal components of subnetwork point velocities in our analysis and to implement only horizontal angular momenta. However, this is of no concern, since vertical velocity components do not contribute to the angular momentum. Indeed, if we analyze each barycentric velocity v_i - \dot{m} = \dot{x}_i - \dot{m} into a horizontal and a radial-vertical part as v_i - \dot{m} = v_i^H + v_i^V, where v_i^V \parallel x_i - m and v_i^H \perp x_i - m, then since [(x_i - m)\times]v_i^V = 0 it follows that the angular momentum

h = \int [(x - m)\times](\dot{x} - \dot{m})\,dm = \int [(x - m)\times]v^H\,dm, \qquad (248)

involves only horizontal velocities, and we do not need to worry about vertical motions, which are additionally of lower accuracy than the horizontal ones, at least for GPS/GNSS networks. A first simplification is to consider only "floating" plate motions, which are described by a time varying rotation R(t) (x'_i = R'x_i, \tilde{x} = Rx) around a time varying rotation vector \omega(t), i.e., rotation about a migrating axis with varying angular velocity. Repeating the above procedure without the translation part, we arrive at exactly the same Eq. (243), and we need only solve the differential equation \dot\theta = \Omega(\theta)^{-1}\omega to determine \theta(t) and apply the transformation \tilde{x}_i(t) = R(\theta(t))x_i(t) to pass to the optimal lithosphere-related reference system. The next step towards simplification is to assume that plates rotate around a fixed axis with constant angular velocity, i.e., with time independent rotation vectors \omega_K. Then the rotation induced "horizontal" velocities v_{\omega,i} = [\omega_K\times]x_i = -[x_i\times]\omega_K should be fitted in the least squares sense to the horizontal components v_i^H of the observed constant velocities v_i, within each subnetwork D_K. We seek to minimize

\varphi = \sum_{i\in D_K}(v_{\omega,i} - v_i^H)^T(v_{\omega,i} - v_i^H) = \sum_{i\in D_K}(-[x_i\times]\omega_K - v_i^H)^T(-[x_i\times]\omega_K - v_i^H) = \min_{\omega_K}. \qquad (249)

It can be easily shown that

\varphi = -\omega_K^T \sum_{i\in D_K}[x_i\times]^2\,\omega_K - 2\,\omega_K^T \sum_{i\in D_K}[x_i\times]v_i^H + \sum_{i\in D_K}(v_i^H)^T v_i^H = \omega_K^T C_{D_K}\omega_K - 2\,\omega_K^T h^H_{D_K} + \sum_{i\in D_K}(v_i^H)^T v_i^H, \qquad (250)

where h^H_{D_K} = \sum_{i\in D_K}[x_i\times]v_i^H = \sum_{i\in D_K}[x_i\times]v_i is the non-barycentric horizontal angular momentum, and the minimum is obtained from \partial\varphi/\partial\omega_K = 0, which gives

\omega_K = C_{D_K}^{-1} h^H_{D_K}. \qquad (251)

The only difference with the general approach is that the barycentric angular momentum h_{D_K} has been replaced by the non-barycentric angular momentum h^H_{D_K}. For the standard linear-in-time model x_i(t) = x_{0i} + (t - t_0)v_i the non-barycentric angular momentum becomes h^H_{D_K} = \sum_{i\in D_K}[x_{0i}\times]v_i, while h_{D_K} = \sum_{i\in D_K}[x_{0i}\times]v_i - n[\bar{x}_0\times]\bar{v}.
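Under the linear-in-time model, the fixed-axis estimate of Eq. (251) is a closed-form least-squares solution. A minimal sketch (names and synthetic data are hypothetical), which recovers the rotation vector exactly when the observed velocities are purely rotational:

```python
import numpy as np

def skew(a):
    # [a x]: skew(a) @ b equals np.cross(a, b)
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def plate_rotation(x0, v):
    """Eq. (251): omega_K = C_DK^{-1} h^H_DK with the non-barycentric
    quantities C_DK = -sum [x0_i x]^2 and h^H_DK = sum [x0_i x] v_i."""
    C = -sum(skew(p) @ skew(p) for p in x0)
    h = sum(np.cross(p, vi) for p, vi in zip(x0, v))
    return np.linalg.solve(C, h)
```

For v_i = \omega_K\times x_{0i} the identity x\times(\omega\times x) = -[x\times]^2\omega makes the fit exact; with noisy velocities the same expression is the least-squares minimizer of (249).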

In this particular case, where plate rotation around fixed axes with constant angular velocities has been assumed, the constant plate rotation vectors \omega_K result in a constant rotation vector \omega for the transformation from the original to the optimal reference system, if the small variations in the inertia matrices of the plates are disregarded. Therefore the transformation will cause each point to move on a circle in a plane perpendicular to \omega. For small time intervals the transformation \tilde{x}_i = Rx_i can be simply approximated by \tilde{x}_i = x_i - (t - t_0)[\omega\times]x_i. For the linear-in-time model x_i = x_{0i} + (t - t_0)v_i this results in the transformations

\tilde{x}_{0i} = x_{0i}, \qquad \tilde{v}_i = v_i - [\omega\times]x_{0i} - (t - t_0)[\omega\times]v_i \approx v_i + [x_{0i}\times]\omega, \qquad (252)

neglecting second order terms as usual. We see that with this simplified but nevertheless reasonable plate rotation model, only the velocities of a global reference frame need to be modified. The transformation to an optimal reference system that best fits the lithosphere, rather than a particular global geodetic network, faces two difficulties in its realization. The first is related to the lack of a dense high-quality geodetic network



over the lands, but the emergence of continuously improving GPS/GNSS networks will eventually resolve this problem. The second and more serious one is the lack of coverage over the oceans, where information is limited to isolated stations on islands. Inevitably, one has to rely in this case on geophysical models for the identification and rotation of plates and sub-plates. In any case, conversion to a reference system best fitted to the lithosphere continuum is important, not only for liberating the choice from a particular discrete geodetic network configuration, but also for comparison with earth rotation theory. Earth orientation parameters determined by geodetic space techniques refer, or are converted, to the adopted reference system, while theories provide earth orientation parameters referring primarily to the lithosphere, or to the mantle to be more precise, because they incorporate different rotations for the earth's mantle and core. The basic ideas of the approach presented here were developed by Dermanis [26].
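Numerically, the chain from the combined rotation vector of Eq. (243) to the velocity update of Eq. (252) can be checked on synthetic data: after the update \tilde{v}_i = v_i + [x_{0i}\times]\omega the total (non-barycentric) angular momentum of the network vanishes. The sketch below is purely illustrative, with two invented rigid "plates":

```python
import numpy as np

def skew(a):
    # [a x]: skew(a) @ b equals np.cross(a, b)
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

rng = np.random.default_rng(2)
# two hypothetical plates rotating about fixed axes with constant rates
w1, w2 = np.array([0.3, -0.1, 0.2]), np.array([-0.2, 0.4, 0.1])
x1, x2 = rng.normal(size=(8, 3)), rng.normal(size=(8, 3))
x0 = np.vstack([x1, x2])
v = np.vstack([np.cross(w1, x1), np.cross(w2, x2)])

C = -sum(skew(p) @ skew(p) for p in x0)           # total inertia matrix
h = sum(np.cross(p, q) for p, q in zip(x0, v))    # total angular momentum
omega = np.linalg.solve(C, h)                     # combined rotation vector
v_tilde = v + np.cross(x0, omega)                 # velocity update, Eq. (252)
h_after = sum(np.cross(p, q) for p, q in zip(x0, v_tilde))
```

After the transformation h_after is zero to machine precision, i.e., the corrected velocity field carries no net rotation.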

12 Formulation of the International Terrestrial Reference Frame: Introductory Remarks

Data from four geodetic space techniques are jointly analyzed in order to obtain estimates of parameters a, describing the temporal variation of the coordinates x_i = x_i(t, a) of selected stations i of a global geodetic network. The estimates \hat{a} of the coordinate model parameters a, which depend on the choice of a spatiotemporal reference system in the final combined solution for all space techniques, comprise the International Terrestrial Reference Frame (ITRF), which is a practical realization of the International Terrestrial Reference System (ITRS), see [1, 3-9, 58, 60]. The official ITRF is the responsibility of the IERS (International Earth Rotation and Reference Systems Service), which is a joint service of the IAG (International Association of Geodesy) and the IAU (International Astronomical Union). The dominant ITRF parameters (and until recently the only ones) are the initial epoch coordinates x_{0i} and the constant velocities v_i of the linear-in-time model x_i(t) = x_{0i} + (t - t_0)v_i. Only the latest version, ITRF-2014 [7], includes additional parameters for non-linear annual and semi-annual periodic variations, as well as models for post-seismic variation. Four techniques are involved in the ITRF formulation: Very Long Baseline Interferometry (VLBI), Satellite Laser Ranging (SLR), Doppler Orbitography and Radiopositioning Integrated by Satellite (DORIS) and GPS (Global Positioning System). We will use the term GPS here in a wider sense that also covers satellite positioning systems other than the U.S. "GPS" brand name, namely the Russian GLONASS, the European GALILEO, the Chinese BeiDou and other existing or emerging regional systems. We prefer the term GPS merely for its simplicity, and because the officially adopted term Global Navigation Satellite System (GNSS) does not pay proper tribute to the efforts of the geodetic community, which are directed towards very high accuracy positioning and have little to do with low accuracy "navigation". All four techniques provide positioning information in terms of single epoch solutions, with different characteristics and qualities in their



reference system definition. They also provide different information additional to that of positioning. SLR is the only technique that provides information relating to the gravitational field of the earth. The only part of this information that relates to the ITRF are the first order spherical harmonic coefficients C_{10} = \mu x_G^3, C_{11} = \mu x_G^1, S_{11} = \mu x_G^2, which are related to the coordinates x_G^1, x_G^2, x_G^3 of the geocenter (center of mass of the earth). The parameter \mu = GM is the product of the universal gravitational constant G and the mass M of the earth. VLBI is the only technique that provides EOPs related to precession-nutation, since it observes extragalactic radio sources with fixed directions with respect to the International Celestial Reference System (ICRS), which is realized by the International Celestial Reference Frame (ICRF), a catalogue of the direction angles of selected radio sources. VLBI mainly senses the geometric relation between the ITRS and the ICRS, as realized by three parameters (rotation angles) of the instantaneous rotation matrix. The separation into precession-nutation on one hand, and polar motion and length-of-the-day (equivalent to angular velocity of rotation) on the other, is realized by modelling, which forms the basis for EOP estimates that are claimed to relate to the CIP and not to the instantaneous rotation axis. The other techniques track satellites with orbits evolving according to the laws of dynamics as described in an inertial or quasi-inertial reference system such as the ICRS. Description of orbits with respect to the ITRS introduces the pseudo-forces related to the instantaneous earth rotation vector. Thus polar motion and length-of-the-day EOPs are produced, which are also claimed to relate to the CIP rather than the instantaneous earth rotation vector that appears in the pseudo-force relations (Eq. 13).
Strictly speaking, the earth rotation vector is a type of velocity associated with the time-continuous rotation matrix (recall the generalized Euler kinematic equations [\omega\times] = \dot{R}^T R), while the available data are discrete in time. Therefore, some type of interpolation with respect to time is required to obtain EOP estimates. Usually a linear-in-time model is assumed for the EOPs, either within the small time interval of the observations which provide a "single epoch" solution, or for daily subintervals. Thus the obtained EOPs are some type of averages over a small time span, and higher frequency components of earth rotation remain undetectable. In any case, the question of what type of EOPs are obtained from the space techniques requires further theoretical investigation, especially since higher temporal resolution data make possible the sub-daily determination of earth rotation, eventually leading us to abandon the CIP for a rotation pole closer to the instantaneous earth rotation axis (see e.g., [9, 10, 17]). All space techniques have their own scale as defined by their own set of atomic clocks. Nevertheless, scale information from VLBI and SLR is considered to be of much higher quality, and the ITRF initial scale and its rate are weighted combinations of those from VLBI and SLR. Before proceeding with the data combination one must have a clear view of the actual deficiencies and weaknesses with respect to the adopted reference system, for each technique and each single epoch solution. The three indices \omega_Q, \psi_q and \chi_{Q,\min}/\chi_{Q,\max} developed by Chatzinikos and Dermanis [21] can be used to this purpose, in order not to rely on theoretical considerations only. Their application



requires knowledge of the normal equation coefficient matrices, which are also necessary for the final combined all-techniques solution. Only VLBI provides the normal equations for its solutions. GPS and DORIS provide covariance matrices obtained with the use of minimal constraints, which allow the recovery of the normal equation matrices. Taking advantage of the conversion of a minimally constrained solution \hat{x}_C to one with inner constraints

\hat{x}_E = [I - E(E^T E)^{-1}E^T]\hat{x}_C \equiv H\hat{x}_C, \qquad (253)

we can convert the covariance factor matrices to

Q_{\hat{x}_E} = H Q_{\hat{x}_C} H^T, \qquad (254)

and then take advantage of Eq. (91) to recover

N = [Q_{\hat{x}_E} + E(E^T E)^{-2}E^T]^{-1} - EE^T. \qquad (255)
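The recovery (253)-(255) can be verified on a toy rank-deficient network. In the sketch below (entirely hypothetical data), a one-dimensional "levelling" network of four heights has the common-shift defect, E = (1,1,1,1)^T, and the normal matrix recovered from a minimally constrained covariance factor matrix matches the original:

```python
import numpy as np

# Height-difference observations between 4 points: rank defect 1, N @ E = 0
A = np.array([[-1, 1, 0, 0],
              [0, -1, 1, 0],
              [0, 0, -1, 1],
              [-1, 0, 0, 1]], float)
N = A.T @ A                            # singular normal equations matrix
E = np.ones((4, 1))                    # inner-constraint (null-space) matrix
C = np.array([[1.0], [0], [0], [0]])   # minimal constraint: fix the first height

# covariance factor matrix of the minimally constrained solution (cf. Eq. (272))
Qc = np.linalg.inv(N + C @ C.T) - E @ np.linalg.inv(E.T @ C @ C.T @ E) @ E.T

H = np.eye(4) - E @ np.linalg.inv(E.T @ E) @ E.T   # projector of Eq. (253)
Qe = H @ Qc @ H.T                                  # Eq. (254)
EtE2 = (E.T @ E) @ (E.T @ E)
N_rec = np.linalg.inv(Qe + E @ np.linalg.inv(EtE2) @ E.T) - E @ E.T   # Eq. (255)
```

N_rec reproduces N to machine precision, and the same pipeline applies whatever minimal constraint matrix C is chosen, since H annihilates the constraint-dependent part of Q_C.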

To obtain Q_{\hat{x}_C} from the given covariance matrix estimate \hat{C}_{\hat{x}_C} = \hat\sigma^2 Q_{\hat{x}_C} we need only know the estimate \hat\sigma^2. SLR provides covariance matrices obtained by loose constraints, which give a non-minimally constrained solution, with a covariance factor matrix of the general form Q_{\hat{x}} = (N + P_x)^{-1}. The usual choice is P_x = k^2 I with k^2 very small but sufficient to guarantee inversion; hence the term loose, although no constraint has actually been applied. The approach can be theorized as a Bayesian approach with prior information on the parameters, equivalent to the addition of pseudo-observations 0 = x + e_x, with e_x \sim (0, \sigma^2 P_x^{-1}). Even in this case the normal equations matrix can be recovered as N = Q_{\hat{x}_C}^{-1} - P_x from \hat{C}_{\hat{x}_C} = \hat\sigma^2 Q_{\hat{x}_C}, if one knows \hat\sigma^2 and P_x (just k^2 for the usual choice P_x = k^2 I). The important difference is that normal equations and covariance matrices obtained by minimal constraints contain no information about the characteristics of the reference system that is not present in the available observations. On the contrary, covariance matrices of full rank, obtained in other ways, contain complete reference system information, and transformation parameters must be added as additional unknowns to accommodate this fact when they are combined with reference-system-free data for a solution where the reference system will finally be chosen by minimal constraints. Since the availability of existing information is not a scientific question, we will assume that normal equation matrices are available, commenting when necessary on problems caused by using the inverses of full-rank covariance matrices as weight matrices. As is well known, uncorrelated data sets can be jointly processed only if they partly depend on common parameters. Data from different space techniques share no common station-related parameters, in this case station initial coordinates and velocities, because they simply do not share common stations. Indeed, it is impossible for a VLBI antenna, a laser system, a DORIS beacon and a GPS antenna to be placed on exactly the same point, even if this were to happen at different observation instants. Instead, instruments of the different space techniques are placed



near to each other at the so-called collocation sites, and local high precision surveys provide the "ties", i.e., the data that make possible the joint treatment of the data from the four space techniques. They have the form of the vector from the center of one instrument to that of another, or their initial values and constant derivatives when a linear shift with respect to time is detected. These vectors can be expressed as data b_C = A_V a_V + A_L a_L + A_G a_G + A_D a_D + e_C, depending on initial coordinates and velocities a_V, a_L, a_G, a_D of stations in the VLBI, SLR, GPS and DORIS networks, respectively. The per epoch estimates \hat{a}_V, \hat{a}_L, \hat{a}_G, \hat{a}_D of coordinates and EOP time series are treated as pseudo-observations, depending on initial coordinates and velocities, as well as on per epoch transformation parameters from the final (not yet defined) ITRF reference system to the reference system of each epoch within each technique. In this respect the ITRF formulation can be considered as a simultaneous stacking, realized through the data from ties at the collocation sites. The question is about the proper matrices to be used as weight matrices of those pseudo-observations, especially when correctly computed covariance matrices using minimal constraints are singular and cannot be inverted to provide weight matrices as usual. For this reason, we will next examine the problem of jointly adjusting uncorrelated data sets in more detail. For the sake of convenience we will introduce throughout this chapter a special notation for block-diagonal (BD), block-column (BC) and block-row (BR) matrices involving a repetition of similar submatrices, each related to a particular space technique

BD(M_T) \equiv \begin{bmatrix} M_V & 0 & 0 & 0 \\ 0 & M_S & 0 & 0 \\ 0 & 0 & M_G & 0 \\ 0 & 0 & 0 & M_D \end{bmatrix}, \qquad BC(M_T) = \begin{bmatrix} M_V \\ M_S \\ M_G \\ M_D \end{bmatrix}, \qquad BR(M_T) = [M_V\ M_S\ M_G\ M_D], \qquad (256)

where the indices stand for V = VLBI, S = SLR, G = GPS and D = DORIS. The notation will also naturally extend to the case of submatrices depending on some index k = 1, 2, . . . , K, i.e., BD(Mk ), BC(Mk ), BR(Mk ).
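The BD/BC/BR notation of Eq. (256) corresponds to simple block-assembly helpers; a sketch (the function names are of course not from the chapter):

```python
import numpy as np

def BD(blocks):
    """Block-diagonal matrix BD(M_T) of Eq. (256)."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    M = np.zeros((rows, cols))
    i = j = 0
    for b in blocks:
        M[i:i + b.shape[0], j:j + b.shape[1]] = b
        i += b.shape[0]
        j += b.shape[1]
    return M

def BC(blocks):
    """Block column [M_V; M_S; M_G; M_D]."""
    return np.vstack(blocks)

def BR(blocks):
    """Block row [M_V  M_S  M_G  M_D]."""
    return np.hstack(blocks)
```

Called with the per-technique submatrices in the order V, S, G, D, these produce exactly the three arrangements used in the combination formulas below.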

13 Basics of Data Set Combination

13.1 Combining Uncorrelated Data Sets

The ITRF formulation is based on uncorrelated data sets from different space techniques that all depend on the temporal variation of a global geodetic network and on additional parameters particular to each technique. These data can be combined at a higher level, where primary data from all techniques are jointly analyzed at a prohibitive computational cost, or at a lower level, where already estimated coordinate and EOP time series are combined to produce initial coordinates, velocities and final EOPs in relation to a linear-in-time coordinate model. We will



examine here, in a more general setup, the possible approaches to the combination of uncorrelated data sharing common parameters. Consider the observation equations of two uncorrelated data sets

b_1 = A_{1x}x + A_{1y}y + e_1, \qquad b_2 = A_{2x}x + A_{2z}z + e_2, \qquad (257)

where E\{e_1\} = 0, E\{e_1 e_1^T\} = \sigma^2 P_1^{-1}, E\{e_2\} = 0, E\{e_2 e_2^T\} = \sigma^2 P_2^{-1} and E\{e_1 e_2^T\} = 0, which are connected only through their common parameters x. The normal equations for the joint adjustment are

\begin{bmatrix} N_{1,x}+N_{2,x} & N_{1,xy} & N_{2,xz} \\ N_{1,xy}^T & N_{1,y} & 0 \\ N_{2,xz}^T & 0 & N_{2,z} \end{bmatrix} \begin{bmatrix} \hat{x} \\ \hat{y} \\ \hat{z} \end{bmatrix} = \begin{bmatrix} u_{1x}+u_{2x} \\ u_{1y} \\ u_{2z} \end{bmatrix}, \qquad (258)

where N_{1,x} = A_{1x}^T P_1 A_{1x}, N_{1,y} = A_{1y}^T P_1 A_{1y}, N_{1,xy} = A_{1x}^T P_1 A_{1y}, u_{1x} = A_{1x}^T P_1 b_1, u_{1y} = A_{1y}^T P_1 b_1, and N_{2,x} = A_{2x}^T P_2 A_{2x}, N_{2,z} = A_{2z}^T P_2 A_{2z}, N_{2,xz} = A_{2x}^T P_2 A_{2z}, u_{2x} = A_{2x}^T P_2 b_2, u_{2z} = A_{2z}^T P_2 b_2. We are interested, in particular, in the case where the coefficient matrix in the above normal equations is singular due to the lack of reference system definition, which must be introduced with the help of minimal constraints. The normal equations for the joint adjustment can be viewed as the result of an addition

\left( \begin{bmatrix} N_{1,x} & N_{1,xy} & 0 \\ N_{1,xy}^T & N_{1,y} & 0 \\ 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} N_{2,x} & 0 & N_{2,xz} \\ 0 & 0 & 0 \\ N_{2,xz}^T & 0 & N_{2,z} \end{bmatrix} \right) \begin{bmatrix} \hat{x} \\ \hat{y} \\ \hat{z} \end{bmatrix} = \begin{bmatrix} u_{1x} \\ u_{1y} \\ 0 \end{bmatrix} + \begin{bmatrix} u_{2x} \\ 0 \\ u_{2z} \end{bmatrix}, \qquad (259)

after inflation (i.e., insertion of zero rows and columns for missing unknowns) of the normal equations

\begin{bmatrix} N_{1,x} & N_{1,xy} \\ N_{1,xy}^T & N_{1,y} \end{bmatrix} \begin{bmatrix} \hat{x}_1 \\ \hat{y}_1 \end{bmatrix} = \begin{bmatrix} N_{1,x}\hat{x}_1 + N_{1,xy}\hat{y}_1 \\ N_{1,xy}^T\hat{x}_1 + N_{1,y}\hat{y}_1 \end{bmatrix} = \begin{bmatrix} u_{1x} \\ u_{1y} \end{bmatrix}, \qquad (260)

\begin{bmatrix} N_{2,x} & N_{2,xz} \\ N_{2,xz}^T & N_{2,z} \end{bmatrix} \begin{bmatrix} \hat{x}_2 \\ \hat{z}_2 \end{bmatrix} = \begin{bmatrix} N_{2,x}\hat{x}_2 + N_{2,xz}\hat{z}_2 \\ N_{2,xz}^T\hat{x}_2 + N_{2,z}\hat{z}_2 \end{bmatrix} = \begin{bmatrix} u_{2x} \\ u_{2z} \end{bmatrix}, \qquad (261)



obtained when adjusting the two data sets separately. The separate solutions \hat{x}_1, \hat{y}_1 and \hat{x}_2, \hat{z}_2 can be obtained with the inclusion of minimal constraints, which are in general different for the two data sets. The joint solution is given by applying Eqs. (78) and (79) or (85) and (87) and requires the introduction of minimal constraints. Since we are primarily interested in the common parameters x, with the non-common parameters y, z being of secondary interest or even nuisance parameters, we will consider only the case where the



minimal constraints C_x^T x = d involve only the common parameters. In this case the solution for the parameter estimates and their covariance factor matrices can be expressed in terms of the following sequential algorithm:

\bar{N}_{1,x} = N_{1,x} - N_{1,xy} N_{1,y}^{-1} N_{1,xy}^T, \qquad \bar{N}_{2,x} = N_{2,x} - N_{2,xz} N_{2,z}^{-1} N_{2,xz}^T, \qquad (262)

\bar{u}_{1x} = u_{1x} - N_{1,xy} N_{1,y}^{-1} u_{1y}, \qquad \bar{u}_{2x} = u_{2x} - N_{2,xz} N_{2,z}^{-1} u_{2z}, \qquad (263)

\bar{N}_x = \bar{N}_{1,x} + \bar{N}_{2,x}, \qquad \bar{u}_x = \bar{u}_{1x} + \bar{u}_{2x}, \qquad (264)

\hat{x} = (\bar{N}_x + C_x C_x^T)^{-1}(\bar{u}_x + C_x d) = (\bar{N}_x + C_x C_x^T)^{-1}\bar{u}_x + E_x(C_x^T E_x)^{-1} d, \qquad (265)

Q_{\hat{x}} = (\bar{N}_{1,x} + \bar{N}_{2,x} + C_x C_x^T)^{-1} - E_x(E_x^T C_x C_x^T E_x)^{-1} E_x^T, \qquad (266)

\hat{y} = N_{1,y}^{-1}(u_{1y} - N_{1,xy}^T \hat{x}), \qquad \hat{z} = N_{2,z}^{-1}(u_{2z} - N_{2,xz}^T \hat{x}), \qquad (267)

Q_{\hat{x}\hat{y}} = -Q_{\hat{x}} N_{1,xy} N_{1,y}^{-1}, \qquad Q_{\hat{x}\hat{z}} = -Q_{\hat{x}} N_{2,xz} N_{2,z}^{-1}, \qquad (268)

Q_{\hat{y}} = N_{1,y}^{-1} - N_{1,y}^{-1} N_{1,xy}^T Q_{\hat{x}\hat{y}}, \qquad Q_{\hat{z}} = N_{2,z}^{-1} - N_{2,z}^{-1} N_{2,xz}^T Q_{\hat{x}\hat{z}}, \qquad (269)

Q_{\hat{y}\hat{z}} = -N_{1,y}^{-1} N_{1,xy}^T Q_{\hat{x}\hat{z}}. \qquad (270)
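The sequential algorithm (262)-(267) can be exercised on a small rank-deficient example. The sketch below (hypothetical levelling-type data with one bias parameter per data set standing in for y and z) reproduces the directly computed joint solution under the same minimal constraint:

```python
import numpy as np

rng = np.random.default_rng(3)
# two levelling-type data sets on 4 common heights x, each with its own bias
A1x = np.array([[-1, 1, 0, 0], [0, -1, 1, 0], [0, 0, -1, 1], [-1, 0, 0, 1]], float)
A2x = np.array([[1, -1, 0, 0], [0, 1, 0, -1], [1, 0, -1, 0]], float)
A1y, A2z = np.ones((4, 1)), np.ones((3, 1))
x_true = np.array([0.0, 1.0, 2.5, 4.0])
b1 = A1x @ x_true + 0.3 * A1y[:, 0] + rng.normal(scale=1e-3, size=4)
b2 = A2x @ x_true - 0.2 * A2z[:, 0] + rng.normal(scale=1e-3, size=3)

Cx = np.array([[1.0], [0], [0], [0]])   # minimal constraint x_1 = 0 (d = 0)

def reduce_set(Ax, Aq, b):
    """Eqs. (262)-(263): reduced normal matrix and right-hand side."""
    Nq = Aq.T @ Aq
    Nxq = Ax.T @ Aq
    Nbar = Ax.T @ Ax - Nxq @ np.linalg.solve(Nq, Nxq.T)
    ubar = Ax.T @ b - Nxq @ np.linalg.solve(Nq, Aq.T @ b)
    return Nbar, ubar

N1b, u1b = reduce_set(A1x, A1y, b1)
N2b, u2b = reduce_set(A2x, A2z, b2)
x_hat = np.linalg.solve(N1b + N2b + Cx @ Cx.T, u1b + u2b)   # Eqs. (264)-(265)

# direct joint adjustment with the same minimal constraint, for comparison
A = np.block([[A1x, A1y, np.zeros((4, 1))], [A2x, np.zeros((3, 1)), A2z]])
Ct = np.vstack([Cx, np.zeros((2, 1))])
v_hat = np.linalg.solve(A.T @ A + Ct @ Ct.T, A.T @ np.concatenate([b1, b2]))
```

x_hat agrees with the x-part of v_hat, and the constrained height \hat{x}_1 is zero, as imposed.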

We have assumed that the rank deficiencies in both sets are related not to the non-common parameters but only to the common ones. Thus the rank deficiencies are associated with the design submatrices A_{1x}, A_{2x}, and the columns of both A_{1y}, A_{2z} are linearly independent, in which case the submatrices N_{1,y} = A_{1y}^T P_1 A_{1y} and N_{2,z} = A_{2z}^T P_2 A_{2z} are invertible. This restriction has no consequence in the geodetic applications, where the common parameters x are associated with station coordinates and the rank deficiencies are due to the lack of reference system definition. The solutions of each of the separate normal equations based on only one of the two data sets, with generally different minimal constraints C_{1x}^T x = d_1 and C_{2x}^T x = d_2, have the sequential form

\hat{x}_1 = (\bar{N}_{1,x} + C_{1x}C_{1x}^T)^{-1}(\bar{u}_{1x} + C_{1x}d_1) = (\bar{N}_{1,x} + C_{1x}C_{1x}^T)^{-1}\bar{u}_{1x} + E_x(C_{1x}^T E_x)^{-1} d_1, \qquad (271)

Q_{1,x} = (\bar{N}_{1,x} + C_{1x}C_{1x}^T)^{-1} - E_x(E_x^T C_{1x}C_{1x}^T E_x)^{-1} E_x^T, \qquad (272)

\hat{y}_1 = N_{1,y}^{-1}(u_{1y} - N_{1,xy}^T \hat{x}_1), \qquad (273)

Q_{1,xy} = -Q_{1,x} N_{1,xy} N_{1,y}^{-1}, \qquad Q_{1,y} = N_{1,y}^{-1} - N_{1,y}^{-1} N_{1,xy}^T Q_{1,xy}, \qquad (274)

\hat{x}_2 = (\bar{N}_{2,x} + C_{2x}C_{2x}^T)^{-1}(\bar{u}_{2x} + C_{2x}d_2) = (\bar{N}_{2,x} + C_{2x}C_{2x}^T)^{-1}\bar{u}_{2x} + E_x(C_{2x}^T E_x)^{-1} d_2, \qquad (275)

Q_{2,x} = (\bar{N}_{2,x} + C_{2x}C_{2x}^T)^{-1} - E_x(E_x^T C_{2x}C_{2x}^T E_x)^{-1} E_x^T, \qquad (276)

\hat{z}_2 = N_{2,z}^{-1}(u_{2z} - N_{2,xz}^T \hat{x}_2), \qquad (277)

Q_{2,xz} = -Q_{2,x} N_{2,xz} N_{2,z}^{-1}, \qquad Q_{2,z} = N_{2,z}^{-1} - N_{2,z}^{-1} N_{2,xz}^T Q_{2,xz}. \qquad (278)

If we eliminate the non-common parameters from the joint normal equations (258) we arrive at the reduced joint normal equations

\bar{N}_x \hat{x} = (\bar{N}_{1,x} + \bar{N}_{2,x})\hat{x} = \bar{u}_{1x} + \bar{u}_{2x} = \bar{u}_x. \qquad (279)

The joint solution (265) for \hat{x} satisfying minimal constraints C_x^T x = d, the separate solution \hat{x}_1 from the first data set (271) satisfying minimal constraints C_{1x}^T x = d_1, and the separate solution \hat{x}_2 from the second data set (275) satisfying minimal constraints C_{2x}^T x = d_2, all have the same form, as if the reduced normal equations of each case were combined with the corresponding minimal constraints, just as holds for the original unreduced normal equations, provided that the incorporated minimal constraints involve only the common parameters that remain after the reduction. If we eliminate the non-common parameters from each of the separate normal equations (260) and (261) we arrive at the reduced separate normal equations

\bar{N}_{1,x}\hat{x}_1 = \bar{u}_{1x}, \qquad \bar{N}_{2,x}\hat{x}_2 = \bar{u}_{2x}. \qquad (280)

Addition of the above reduced normal equations reproduces Eq. (279), derived by eliminating the non-common parameters from the joint normal equations. Thus we come to the following conclusion:

Proposition 3. Elimination of the non-common parameters from the joint normal equations of two uncorrelated data sets is equivalent to the formulation of the separate normal equations for each set, separate elimination of the non-common parameters, and addition of the reduced separate normal equations.

Thus addition of the normal equations from the two sets followed by elimination of the non-common parameters is equivalent to elimination of the non-common parameters from the two sets followed by addition of the reduced normal equations. We turn next to the usual procedure of using separately obtained estimates from uncorrelated data sets as pseudo-observations for the estimation of parameters in a combination step. Ideally, we would like to recover the same solution as if two or more uncorrelated data sets with common parameters were adjusted simultaneously. The critical question in this case is which weight matrices to use in the least squares solution. In the case that there is no rank defect, the answer is trivial: we just use the inverses of the covariance factor matrices. In rank deficient models, the covariance matrices are singular and cannot be inverted. One can resort to Rao's



unified estimation theory (see e.g., [40, 54, 55]), which takes into account singular covariance matrices for the observations. Fortunately, it will turn out that we can avoid this complication. First, we will utilize the left sides of the separate normal equations (260) and (261) in the joint normal equations (258), in order to express them in terms of the separate estimates

\begin{bmatrix} N_{1,x}+N_{2,x} & N_{1,xy} & N_{2,xz} \\ N_{1,xy}^T & N_{1,y} & 0 \\ N_{2,xz}^T & 0 & N_{2,z} \end{bmatrix} \begin{bmatrix} \hat{x} \\ \hat{y} \\ \hat{z} \end{bmatrix} = \begin{bmatrix} (N_{1,x}\hat{x}_1 + N_{1,xy}\hat{y}_1) + (N_{2,x}\hat{x}_2 + N_{2,xz}\hat{z}_2) \\ N_{1,xy}^T\hat{x}_1 + N_{1,y}\hat{y}_1 \\ N_{2,xz}^T\hat{x}_2 + N_{2,z}\hat{z}_2 \end{bmatrix}. \qquad (281)

If the estimates of the separate solutions are used as pseudo-observations of the respective unknowns

\begin{bmatrix} \hat{x}_1 \\ \hat{y}_1 \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e_{\hat{x}_1} \\ e_{\hat{y}_1} \end{bmatrix}, \qquad \begin{bmatrix} \hat{x}_2 \\ \hat{z}_2 \end{bmatrix} = \begin{bmatrix} x \\ z \end{bmatrix} + \begin{bmatrix} e_{\hat{x}_2} \\ e_{\hat{z}_2} \end{bmatrix}, \qquad (282)

with yet unspecified weight matrices \begin{bmatrix} P_{1,x} & P_{1,xy} \\ P_{1,xy}^T & P_{1,y} \end{bmatrix} and \begin{bmatrix} P_{2,x} & P_{2,xz} \\ P_{2,xz}^T & P_{2,z} \end{bmatrix}, respectively,

we will arrive at the normal equations

\begin{bmatrix} P_{1,x}+P_{2,x} & P_{1,xy} & P_{2,xz} \\ P_{1,xy}^T & P_{1,y} & 0 \\ P_{2,xz}^T & 0 & P_{2,z} \end{bmatrix} \begin{bmatrix} \hat{x} \\ \hat{y} \\ \hat{z} \end{bmatrix} = \begin{bmatrix} (P_{1,x}\hat{x}_1 + P_{1,xy}\hat{y}_1) + (P_{2,x}\hat{x}_2 + P_{2,xz}\hat{z}_2) \\ P_{1,xy}^T\hat{x}_1 + P_{1,y}\hat{y}_1 \\ P_{2,xz}^T\hat{x}_2 + P_{2,z}\hat{z}_2 \end{bmatrix}. \qquad (283)

Comparison with the directly obtained joint solution in the form (281) shows that the latter can be obtained using the estimates from the separate solutions, provided that the normal equations are used as weight matrices

\begin{bmatrix} P_{1,x} & P_{1,xy} \\ P_{1,xy}^T & P_{1,y} \end{bmatrix} = \begin{bmatrix} N_{1,x} & N_{1,xy} \\ N_{1,xy}^T & N_{1,y} \end{bmatrix}, \qquad \begin{bmatrix} P_{2,x} & P_{2,xz} \\ P_{2,xz}^T & P_{2,z} \end{bmatrix} = \begin{bmatrix} N_{2,x} & N_{2,xz} \\ N_{2,xz}^T & N_{2,z} \end{bmatrix}. \qquad (284)

Thus we have arrived at the following conclusion:

Proposition 4. If the parameter estimates obtained from the separate adjustments of uncorrelated data sets are jointly used as pseudo-observations of the corresponding unknown parameters, with weight matrices identical to the coefficient matrices of the separate adjustment normal equations, then the obtained normal equations are the same as when all data sets are jointly adjusted.

There is a point that needs to be stressed with respect to the above proposition. It has been implicitly assumed that the same unknown variance factor \sigma^2 applies to all data sets b_i = A_{i,x}x + A_{i,y_i}y_i + e_i, e_i \sim (0, \sigma^2 P_i^{-1}), i = 1, \ldots, q. Thus we cannot cover a model with variance component estimation, where each data set



b_i = A_{i,x}x + A_{i,y_i}y_i + e_i, i = 1, \ldots, q, has a different unknown variance factor \sigma_i^2, i.e., e_i \sim (0, \sigma_i^2 P_i^{-1}). In geodesy we know the covariance matrix C_i of each data set and we may directly use weight matrices P_i = C_i^{-1}, as e.g., suggested by W. Baarda and the Delft school. Nevertheless, separate estimates \hat\sigma_i^2 = \hat{e}_i^T P_i \hat{e}_i / f_i are used in most cases in order to check the statistical hypotheses \sigma_i^2 = 1, as a means of checking the validity of the model rather than the existence of a missed non-unity factor in the data covariances. A large value of \hat\sigma_i^2 reflects either insufficient modeling (ignored effects in the deterministic part of the model), or the existence of ignored correlations in the observations, or the presence of systematic errors, i.e., violation of the zero mean hypothesis E\{e_i\} = 0. It is customary, though, to scale the obtained covariance factor matrix Q_{\hat{x}_{(i)}}, which theoretically is the covariance matrix of the separate parameter estimates \hat{x}_{(i)} under the choice P_i = C_i^{-1}, by the additional inflation factor \hat\sigma_i^2, to obtain a covariance matrix estimate \hat{C}_{\hat{x}_{(i)}} = \hat\sigma_i^2 Q_{\hat{x}_{(i)}}, in an attempt to partially account for the use of an incorrect model, without any real theoretical foundation for this choice. As we have already mentioned and will also repeat in the following, statistical optimality, in the sense of following the correct theoretical procedures, is not a panacea when the basic Gauss-Markov model (known deterministic model, zero mean errors, error covariance matrix known except for a multiplicative factor) is not consistent with the data. It may happen that sub-optimal procedures prove to be more robust with respect to deterministic and stochastic model errors and thus produce better (closer to reality) estimates. In the special case where there are no defects associated with the definition of the reference system, the resulting normal equations and covariance factor matrices are non-singular. Then using the estimates \hat{C}_{\hat{x}_{(i)}} = \hat\sigma_i^2 Q_{\hat{x}_{(i)}} and the corresponding weight matrices P_{\hat{x}_{(i)}} = \hat{C}_{\hat{x}_{(i)}}^{-1} = \hat\sigma_i^{-2} Q_{\hat{x}_{(i)}}^{-1} in the combination step amounts to an ad hoc method of variance component estimation. Let us now turn to the case where the non-common parameters are nuisance parameters and we are only interested in the estimates of the common parameters. In this case we can use the separate estimates \hat{x}_1, \hat{x}_2 as pseudo-observations

\begin{bmatrix} \hat{x}_1 \\ \hat{x}_2 \end{bmatrix} = \begin{bmatrix} I \\ I \end{bmatrix} x + \begin{bmatrix} e_{\hat{x}_1} \\ e_{\hat{x}_2} \end{bmatrix}, \qquad (285)

with general weight matrices P_{\hat{x}_1} and P_{\hat{x}_2}. The least squares solution leads to the combined normal equations

(P_{\hat{x}_1} + P_{\hat{x}_2})\hat{x} = P_{\hat{x}_1}\hat{x}_1 + P_{\hat{x}_2}\hat{x}_2. \qquad (286)

Comparison with the reduced normal equations (279), combined with (280), shows that we can recover the optimal joint estimates if we use the reduced normal equation matrices of the separate solutions as weight matrices, P_{\hat{x}_1} = \bar{N}_{1,x}, P_{\hat{x}_2} = \bar{N}_{2,x}. Thus we have arrived at the conclusion:



Proposition 5. The optimal estimates of the common parameters obtained from the joint adjustment of uncorrelated data sets can also be obtained in the following way: Form the separate normal equations of each data set and obtain the reduced normal equation coefficient matrices by eliminating the non-common parameters; determine the separate solutions with the help of appropriate minimal constraints; use the separate estimates as pseudo-observations of the common unknowns, with the reduced coefficient matrices of the normal equations as weight matrices, and determine the resulting solution.

In plain words, the choice of reference system present in the separate coordinate estimates is "killed" by the multiplication with their weight matrices coming from the separate normal equations. Thus the joint normal equations have no reference system information, and this must be introduced through a choice of minimal constraints. Note that if one uses instead the unreduced submatrices N_{1,x}, N_{2,x} of the separate normal equations as weight matrices, one arrives at normal equations (N_{1,x} + N_{2,x})\hat{x}_R = N_{1,x}\hat{x}_1 + N_{2,x}\hat{x}_2, which yield a suboptimal solution \hat{x}_R, different from the optimal one \hat{x}. This means, e.g., that in theory one should not rely on the normal submatrices of the per epoch coordinates and EOPs from SLR observations, but on the complete normal equation matrix involving also spherical harmonics, which must be reduced in order to obtain the proper weight matrix in the ITRF formulation. Furthermore, the gravity field parameter estimates should be updated on the basis of the per epoch coordinates and EOP estimates provided by the final ITRF solution. The question is whether this theoretically "correct" approach should be followed in place of the actually used restricted approach. The answer is yes, provided that the deterministic and stochastic model used in SLR data analysis is consistent with reality.
Again, we face the question of the relevance of formal statistical optimality in the case of doubts about the validity of the model upon which the optimal procedure has been built. The sub-optimal restricted solution may prove more robust, by not allowing incorrect information from the gravity field estimation part to flow into the procedure of ITRF parameter estimation.
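Proposition 5 can be verified numerically in the full-rank case, where no minimal constraints are needed. The sketch below uses made-up matrices and dimensions: combining the separate estimates with the reduced normal matrices as weights reproduces the joint least-squares solution, while the unreduced submatrices do not.

```python
# Numerical sketch of Proposition 5 (full-rank case): combining separate
# solutions with the REDUCED normal matrices as weights reproduces the joint
# least-squares estimate of the common parameters x, while the unreduced
# submatrices N1x, N2x give a different, suboptimal combination.
import numpy as np

rng = np.random.default_rng(1)
n1, n2, px, py, pz = 12, 10, 3, 2, 2
A1x, A1y = rng.normal(size=(n1, px)), rng.normal(size=(n1, py))
A2x, A2z = rng.normal(size=(n2, px)), rng.normal(size=(n2, pz))
b1, b2 = rng.normal(size=n1), rng.normal(size=n2)

# Joint adjustment: unknowns ordered (x, y, z)
A = np.block([[A1x, A1y, np.zeros((n1, pz))],
              [A2x, np.zeros((n2, py)), A2z]])
x_joint = np.linalg.lstsq(A, np.concatenate([b1, b2]), rcond=None)[0][:px]

def reduced_normals(Ax, Ao, b):
    """Normals of one set reduced to the common parameters x
    (Schur complement eliminating the set's own parameters)."""
    Nxx, Nxo, Noo = Ax.T @ Ax, Ax.T @ Ao, Ao.T @ Ao
    ux, uo = Ax.T @ b, Ao.T @ b
    S = Nxo @ np.linalg.inv(Noo)
    return Nxx - S @ Nxo.T, ux - S @ uo

N1r, u1r = reduced_normals(A1x, A1y, b1)
N2r, u2r = reduced_normals(A2x, A2z, b2)
x1 = np.linalg.solve(N1r, u1r)          # separate estimate from set 1
x2 = np.linalg.solve(N2r, u2r)          # separate estimate from set 2
# Combination with the reduced matrices as weights -> optimal
x_comb = np.linalg.solve(N1r + N2r, N1r @ x1 + N2r @ x2)
# Combination with the unreduced submatrices -> suboptimal
N1x, N2x = A1x.T @ A1x, A2x.T @ A2x
x_sub = np.linalg.solve(N1x + N2x, N1x @ x1 + N2x @ x2)

print(np.allclose(x_comb, x_joint))     # True: reduced weights are optimal
print(np.allclose(x_sub, x_joint))      # False in general
```

The second comparison illustrates the warning above: the unreduced submatrices ignore the uncertainty contributed by the eliminated non-common parameters.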

13.2 Non-adjustable Observations in Uncorrelated Data Sets

We turn next to particular sets of observational data which we will call non-adjustable observations. Let us suppose that in the second set of data $b_2 = A_{2x}x + A_{2z}z + e_2$ the submatrix $A_{2z}$ is square and nonsingular. Then, whatever the value of the estimate $\hat{x}$, there is a corresponding value $\hat{z} = A_{2z}^{-1}(b_2 - A_{2x}\hat{x})$ which gives the error estimates

$\hat{e}_2 = b_2 - A_{2x}\hat{x} - A_{2z}\hat{z} = b_2 - A_{2x}\hat{x} - A_{2z}A_{2z}^{-1}(b_2 - A_{2x}\hat{x}) = 0,$ (287)


A. Dermanis

and thus contributes nothing to the least squares sum under minimization. In such a situation we say that the observations $b_2$ are non-adjustable with respect to the parameters $z$. To see what exactly happens in this case we form the joint normal equations for the two sets, which explicitly are

$(N_{1,x} + N_{2,x})\hat{x} + N_{1,xy}\hat{y} + A_{2x}^T P_2 A_{2z}\hat{z} = u_{1x} + u_{2x},$ (288)

$N_{1,xy}^T\hat{x} + N_{1,y}\hat{y} = u_{1y},$ (289)

$A_{2z}^T P_2 A_{2x}\hat{x} + A_{2z}^T P_2 A_{2z}\hat{z} = A_{2z}^T P_2 b_2.$ (290)

Since both $A_{2z}^T$ and $P_2$ are invertible, (290) simplifies to $A_{2x}\hat{x} + A_{2z}\hat{z} = b_2$, which solved for $\hat{z}$ gives

$\hat{z} = A_{2z}^{-1}(b_2 - A_{2x}\hat{x}).$ (291)

Replacing this value in the rest of the normal equations (288) and (289), these become

$N_{1,x}\hat{x} + N_{1,xy}\hat{y} = u_{1x},$ (292)

$N_{1,xy}^T\hat{x} + N_{1,y}\hat{y} = u_{1y},$ (293)

which are exactly the normal equations of the separate adjustment of the first data set $b_1$ alone. Solving (292) and (293) we obtain directly the jointly optimal estimates $\hat{x}$ and $\hat{y}$, which can be used in (291) to compute the jointly optimal estimate $\hat{z}$. Thus we come to the following conclusion:

Proposition 6. When one of two uncorrelated data sets is non-adjustable with respect to its non-common parameters, the adjustment of the other set provides directly the jointly optimal estimates of its parameters. The obtained estimates can be used to derive the jointly optimal estimates of the non-common parameters of the non-adjustable data set, utilizing the observation model of the non-adjustable data as if it were free of errors.

The covariance factor matrices $Q_{\hat{x}}$, $Q_{\hat{y}}$, $Q_{\hat{x}\hat{y}}$ are obtained from the separate adjustment of the first data set $b_1$ alone. The remaining covariance factor matrices can be computed sequentially from

$\begin{bmatrix} Q_{\hat{x}\hat{z}} \\ Q_{\hat{y}\hat{z}} \end{bmatrix} = \begin{bmatrix} -Q_{\hat{x}}\,A_{2x}^T A_{2z}^{-T} \\ -Q_{\hat{x}\hat{y}}^T\,A_{2x}^T A_{2z}^{-T} \end{bmatrix}, \qquad Q_{\hat{z}} = (A_{2z}^T P_2 A_{2z})^{-1} - A_{2z}^{-1} A_{2x}\,Q_{\hat{x}\hat{z}}.$ (294)
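Proposition 6 lends itself to a direct numerical check. In the sketch below (synthetic data, illustrative dimensions), the joint least-squares solution is compared with the two-stage procedure: adjust $b_1$ alone, then use the second model as if it were error-free.

```python
# Sketch of Proposition 6: when A2z is square and invertible, the second
# data set is non-adjustable with respect to z, so the joint solution for
# (x, y) comes from adjusting b1 alone, and z follows from the error-free
# use of the second model, Eq. (291). All data are synthetic.
import numpy as np

rng = np.random.default_rng(2)
n1, px, py, pz = 15, 3, 2, 4
A1x, A1y = rng.normal(size=(n1, px)), rng.normal(size=(n1, py))
A2x = rng.normal(size=(pz, px))
A2z = rng.normal(size=(pz, pz)) + 4 * np.eye(pz)   # square, invertible
b1, b2 = rng.normal(size=n1), rng.normal(size=pz)

# Joint least squares over (x, y, z)
A = np.block([[A1x, A1y, np.zeros((n1, pz))],
              [A2x, np.zeros((pz, py)), A2z]])
sol = np.linalg.lstsq(A, np.concatenate([b1, b2]), rcond=None)[0]
x_j, y_j, z_j = sol[:px], sol[px:px + py], sol[px + py:]

# Adjustment of the first set only, then back-substitution
sol1 = np.linalg.lstsq(np.hstack([A1x, A1y]), b1, rcond=None)[0]
x1, y1 = sol1[:px], sol1[px:]
z1 = np.linalg.solve(A2z, b2 - A2x @ x1)           # Eq. (291)

print(np.allclose(x_j, x1), np.allclose(y_j, y1), np.allclose(z_j, z1))
```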

13.3 Non-adjustable Observations in Correlated Data Sets

So far we have considered only uncorrelated data sets. We now generalize the previous case of a non-adjustable data set to the case where the two data sets are in fact correlated. The observation equations in this case are

$\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} A_{1x}x + A_{1y}y + e_1 \\ A_{2x}x + A_{2z}z + e_2 \end{bmatrix} = \begin{bmatrix} A_{1x} & A_{1y} & 0 \\ A_{2x} & 0 & A_{2z} \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \end{bmatrix},$ (295)

with weight matrix the inverse of the joint covariance factor matrix

$\begin{bmatrix} P_1 & P_{12} \\ P_{12}^T & P_2 \end{bmatrix} = \begin{bmatrix} Q_1 & Q_{12} \\ Q_{12}^T & Q_2 \end{bmatrix}^{-1},$ (296)

where the non-zero submatrix $P_{12}$ is a result of the existing correlation ($Q_{12} \neq 0$). We will need some relations following from (296), namely

$P_2^{-1} = Q_2 - Q_{12}^T Q_1^{-1} Q_{12}, \qquad P_2^{-1}P_{12}^T = -Q_{12}^T Q_1^{-1}, \qquad P_1 - P_{12}P_2^{-1}P_{12}^T = Q_1^{-1}.$ (297)
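The identities (297) are standard relations between the blocks of a symmetric positive definite matrix and the blocks of its inverse; a quick numerical sanity check with random test matrices:

```python
# Numerical check of the block-inverse identities (297) between the blocks
# of a joint covariance factor matrix Q and its inverse P = Q^{-1}; the
# matrix Q below is random but built symmetric positive definite.
import numpy as np

rng = np.random.default_rng(3)
m1, m2 = 4, 3
M = rng.normal(size=(m1 + m2, m1 + m2))
Q = M @ M.T + (m1 + m2) * np.eye(m1 + m2)          # SPD joint Q
Q1, Q12, Q2 = Q[:m1, :m1], Q[:m1, m1:], Q[m1:, m1:]
P = np.linalg.inv(Q)
P1, P12, P2 = P[:m1, :m1], P[:m1, m1:], P[m1:, m1:]

ok1 = np.allclose(np.linalg.inv(P2), Q2 - Q12.T @ np.linalg.inv(Q1) @ Q12)
ok2 = np.allclose(np.linalg.inv(P2) @ P12.T, -Q12.T @ np.linalg.inv(Q1))
ok3 = np.allclose(P1 - P12 @ np.linalg.inv(P2) @ P12.T, np.linalg.inv(Q1))
print(ok1, ok2, ok3)   # True True True
```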

The joint normal equations take in this case the form

$\begin{bmatrix} N_x & N_{xy} & N_{xz} \\ N_{xy}^T & N_y & N_{yz} \\ N_{xz}^T & N_{yz}^T & N_z \end{bmatrix}\begin{bmatrix} \hat{x} \\ \hat{y} \\ \hat{z} \end{bmatrix} = \begin{bmatrix} u_x \\ u_y \\ u_z \end{bmatrix},$ (298)

where

$N_x = A_{1x}^T P_1 A_{1x} + A_{2x}^T P_{12}^T A_{1x} + A_{1x}^T P_{12} A_{2x} + A_{2x}^T P_2 A_{2x}, \qquad N_{xy} = A_{1x}^T P_1 A_{1y} + A_{2x}^T P_{12}^T A_{1y}, \qquad N_y = A_{1y}^T P_1 A_{1y},$

$N_{xz} = A_{1x}^T P_{12} A_{2z} + A_{2x}^T P_2 A_{2z}, \qquad N_{yz} = A_{1y}^T P_{12} A_{2z}, \qquad N_z = A_{2z}^T P_2 A_{2z},$ (299)

$u_x = A_{1x}^T P_1 b_1 + A_{2x}^T P_{12}^T b_1 + A_{1x}^T P_{12} b_2 + A_{2x}^T P_2 b_2, \qquad u_y = A_{1y}^T P_1 b_1 + A_{1y}^T P_{12} b_2, \qquad u_z = A_{2z}^T P_{12}^T b_1 + A_{2z}^T P_2 b_2.$ (300)

In particular, the third of equations (298) has the explicit form

$(A_{2z}^T P_{12}^T A_{1x} + A_{2z}^T P_2 A_{2x})\hat{x} + A_{2z}^T P_{12}^T A_{1y}\hat{y} + A_{2z}^T P_2 A_{2z}\hat{z} = A_{2z}^T P_{12}^T b_1 + A_{2z}^T P_2 b_2.$ (301)


In view of the fact that both $A_{2z}^T$ and $P_2$ are invertible, we may solve the last equation for $\hat{z}$ and take the relation $P_2^{-1}P_{12}^T = -Q_{12}^T Q_1^{-1}$ into account, to obtain

$\hat{z} = A_{2z}^{-1}\bigl(b_2 - A_{2x}\hat{x} - Q_{12}^T Q_1^{-1}(b_1 - A_{1x}\hat{x} - A_{1y}\hat{y})\bigr).$ (302)

An interesting interpretation of the above relation is obtained if it is analyzed in three steps,

$\hat{e}_1 = b_1 - A_{1x}\hat{x} - A_{1y}\hat{y}, \qquad \tilde{e}_2 = Q_{12}^T Q_1^{-1}\,\hat{e}_1, \qquad \hat{z} = A_{2z}^{-1}(b_2 - A_{2x}\hat{x} - \tilde{e}_2).$ (303)

In the first step an estimate $\hat{e}_1$ of the errors in the first data set is computed. In the second step this estimate is used to obtain a prediction $\tilde{e}_2$ of the errors in the second data set. Finally, in the third step the estimates $\hat{x}$, $\hat{z}$ and the prediction $\tilde{e}_2$ are inserted in the observation equations $b_2 = A_{2x}x + A_{2z}z + e_2$ and the resulting relation is solved for $\hat{z}$, in order to express it in terms of $\hat{x}$ and $\tilde{e}_2$. If the above value of $\hat{z}$ is replaced in the first two of equations (298), and we take into account that $P_1 - P_{12}P_2^{-1}P_{12}^T = Q_1^{-1}$, we obtain the reduced normal equations

$(A_{1x}^T Q_1^{-1} A_{1x})\hat{x} + (A_{1x}^T Q_1^{-1} A_{1y})\hat{y} = A_{1x}^T Q_1^{-1} b_1,$ (304)

$(A_{1y}^T Q_1^{-1} A_{1x})\hat{x} + (A_{1y}^T Q_1^{-1} A_{1y})\hat{y} = A_{1y}^T Q_1^{-1} b_1.$ (305)

These are none other than the normal equations obtained from the adjustment of the first set $b_1$ alone, completely ignoring the existence of the second set and, in particular, its correlation with the first. Thus we come to the following conclusion:

Proposition 7. When one of two correlated data sets is non-adjustable with respect to its non-common parameters, the adjustment of the other set alone provides directly the jointly optimal estimates of its parameters, provided that the correlation is ignored and the weight matrix is simply the inverse of the corresponding covariance factor submatrix.

Once the estimates $\hat{x}$, $\hat{y}$ and their covariance factor matrices $Q_{\hat{x}}$, $Q_{\hat{y}}$, $Q_{\hat{x}\hat{y}}$ have been determined in this way, the remaining estimates $\hat{z}$ can be computed using Eq. (302). The covariance factor matrices could be obtained from the inversion of the coefficient matrix of the joint normal equations (298); it turns out that the submatrices $Q_{\hat{x}}$, $Q_{\hat{y}}$, $Q_{\hat{x}\hat{y}}$ are exactly the same as those computed from the adjustment of the first data set alone. In order to obtain the remaining covariance factor matrices $Q_{\hat{x}\hat{z}}$, $Q_{\hat{y}\hat{z}}$ and $Q_{\hat{z}}$, we will first derive a more general scheme for their sequential computation, valid for any set of normal equations of the form (298). We will consider two cases: the case where the coefficient matrix of the normal equations is nonsingular, and the singular case where the submatrix $N_z$ is invertible and the rank deficiency can be removed by minimal constraints $C_x^T x + C_y^T y = d$ which do not involve the eliminated


parameters $z$. The first case is rather trivial, because the desired covariance factor matrices follow from the analytical inversion of the normal equations coefficient matrix. To treat the second case we set $a = [x^T\ y^T]^T$ and write (298) in the compact form

$\begin{bmatrix} N_a & N_{az} \\ N_{az}^T & N_z \end{bmatrix}\begin{bmatrix} \hat{a} \\ \hat{z} \end{bmatrix} = \begin{bmatrix} u_a \\ u_z \end{bmatrix},$ (306)

with minimal constraints $C_a^T a = d$. The inner constraints matrix has the form $E = \begin{bmatrix} E_a \\ E_z \end{bmatrix}$, satisfying

$\begin{bmatrix} N_a & N_{az} \\ N_{az}^T & N_z \end{bmatrix}\begin{bmatrix} E_a \\ E_z \end{bmatrix} = \begin{bmatrix} N_a E_a + N_{az} E_z \\ N_{az}^T E_a + N_z E_z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$ (307)

Application of Eq. (79) gives the covariance factor matrix

$\begin{bmatrix} Q_{\hat{a}} & Q_{\hat{a}\hat{z}} \\ Q_{\hat{a}\hat{z}}^T & Q_{\hat{z}} \end{bmatrix} = \begin{bmatrix} N_a + C_a C_a^T & N_{az} \\ N_{az}^T & N_z \end{bmatrix}^{-1} - \begin{bmatrix} E_a R^{-1} E_a^T & E_a R^{-1} E_z^T \\ E_z R^{-1} E_a^T & E_z R^{-1} E_z^T \end{bmatrix},$ (308)

with $R = E^T C C^T E = E_a^T C_a C_a^T E_a$. Analytical inversion gives

$\begin{bmatrix} N_a + C_a C_a^T & N_{az} \\ N_{az}^T & N_z \end{bmatrix}^{-1} = \begin{bmatrix} X & Y \\ Y^T & W \end{bmatrix} = \begin{bmatrix} Q_{\hat{a}} + E_a R^{-1} E_a^T & Q_{\hat{a}\hat{z}} + E_a R^{-1} E_z^T \\ Q_{\hat{a}\hat{z}}^T + E_z R^{-1} E_a^T & Q_{\hat{z}} + E_z R^{-1} E_z^T \end{bmatrix},$ (309)

where

$X = (N_a + C_a C_a^T - N_{az} N_z^{-1} N_{az}^T)^{-1} = Q_{\hat{a}} + E_a R^{-1} E_a^T,$

$Y = -X N_{az} N_z^{-1} = -(Q_{\hat{a}} + E_a R^{-1} E_a^T)\,N_{az} N_z^{-1} = Q_{\hat{a}\hat{z}} + E_a R^{-1} E_z^T,$

$W = N_z^{-1} - N_z^{-1} N_{az}^T Y = N_z^{-1} - N_z^{-1} N_{az}^T (Q_{\hat{a}\hat{z}} + E_a R^{-1} E_z^T) = Q_{\hat{z}} + E_z R^{-1} E_z^T.$ (310)

Solving the last two for $Q_{\hat{a}\hat{z}}$ and $Q_{\hat{z}}$ we obtain

$Q_{\hat{a}\hat{z}} = -Q_{\hat{a}}\,N_{az} N_z^{-1} - E_a R^{-1}(E_a^T N_{az} + E_z^T N_z)N_z^{-1} = -Q_{\hat{a}}\,N_{az} N_z^{-1},$

$Q_{\hat{z}} = N_z^{-1} - N_z^{-1} N_{az}^T Q_{\hat{a}\hat{z}} - N_z^{-1}(N_{az}^T E_a + N_z E_z)R^{-1} E_z^T = N_z^{-1} - N_z^{-1} N_{az}^T Q_{\hat{a}\hat{z}},$ (311)

since from (307) $N_{az}^T E_a + N_z E_z = 0$. Replacing $a$ and the corresponding submatrices in terms of $x$, $y$, we obtain the sequential algorithm


$Q_{\hat{x}\hat{z}} = -(Q_{\hat{x}} N_{xz} + Q_{\hat{x}\hat{y}} N_{yz})\,N_z^{-1},$

$Q_{\hat{y}\hat{z}} = -(Q_{\hat{x}\hat{y}}^T N_{xz} + Q_{\hat{y}} N_{yz})\,N_z^{-1},$

$Q_{\hat{z}} = N_z^{-1} - N_z^{-1}(N_{xz}^T Q_{\hat{x}\hat{z}} + N_{yz}^T Q_{\hat{y}\hat{z}}).$ (312)
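The scheme (312) is easy to verify in the full-rank case, where the blocks of $Q$ can also be obtained by direct inversion of the normal matrix. A small self-contained check with random symmetric positive definite normals:

```python
# Check of the sequential scheme (312) in the full-rank case: the covariance
# factor blocks of the eliminated parameters z follow from the blocks
# Qx, Qy, Qxy of the reduced system and the cross-normals, without inverting
# the full normal matrix. Random SPD normals are used as input.
import numpy as np

rng = np.random.default_rng(4)
px, py, pz = 3, 2, 4
p = px + py + pz
M = rng.normal(size=(p, p))
N = M @ M.T + p * np.eye(p)                        # SPD normal matrix
Nxz = N[:px, px + py:]
Nyz = N[px:px + py, px + py:]
Nz = N[px + py:, px + py:]

Qfull = np.linalg.inv(N)                           # direct inversion
Qx = Qfull[:px, :px]
Qxy = Qfull[:px, px:px + py]
Qy = Qfull[px:px + py, px:px + py]
Nzi = np.linalg.inv(Nz)

Qxz = -(Qx @ Nxz + Qxy @ Nyz) @ Nzi                # Eq. (312)
Qyz = -(Qxy.T @ Nxz + Qy @ Nyz) @ Nzi
Qz = Nzi - Nzi @ (Nxz.T @ Qxz + Nyz.T @ Qyz)

print(np.allclose(Qxz, Qfull[:px, px + py:]),
      np.allclose(Qyz, Qfull[px:px + py, px + py:]),
      np.allclose(Qz, Qfull[px + py:, px + py:]))
```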

The above is a general result for the computation of the covariance factor matrices of the eliminated parameters $z$, which will be used later. Returning to the present case of observations that are non-adjustable with respect to the parameters $z$, we may replace the submatrices $N_z$, $N_{xz}$, $N_{yz}$ by their particular values from Eq. (299) to obtain

$Q_{\hat{x}\hat{z}} = -Q_{\hat{x}}(A_{1x}^T P_{12} P_2^{-1} + A_{2x}^T)A_{2z}^{-T} - Q_{\hat{x}\hat{y}}\,A_{1y}^T P_{12} P_2^{-1} A_{2z}^{-T} = Q_{\hat{x}}(A_{1x}^T Q_1^{-1} Q_{12} - A_{2x}^T)A_{2z}^{-T} + Q_{\hat{x}\hat{y}}\,A_{1y}^T Q_1^{-1} Q_{12}\,A_{2z}^{-T},$

$Q_{\hat{y}\hat{z}} = -Q_{\hat{x}\hat{y}}^T(A_{1x}^T P_{12} P_2^{-1} + A_{2x}^T)A_{2z}^{-T} - Q_{\hat{y}}\,A_{1y}^T P_{12} P_2^{-1} A_{2z}^{-T} = Q_{\hat{x}\hat{y}}^T(A_{1x}^T Q_1^{-1} Q_{12} - A_{2x}^T)A_{2z}^{-T} + Q_{\hat{y}}\,A_{1y}^T Q_1^{-1} Q_{12}\,A_{2z}^{-T},$

$Q_{\hat{z}} = (A_{2z}^T P_2 A_{2z})^{-1} - A_{2z}^{-1}(P_2^{-1} P_{12}^T A_{1x} + A_{2x})\,Q_{\hat{x}\hat{z}} - A_{2z}^{-1} P_2^{-1} P_{12}^T A_{1y}\,Q_{\hat{y}\hat{z}},$ (313)

where we have used the relations (297).
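Proposition 7, together with Eq. (302), can also be verified numerically. In the sketch below (synthetic data), the joint generalized least-squares solution with full correlation is compared against the adjustment of $b_1$ alone with weight $Q_1^{-1}$:

```python
# Sketch of Proposition 7 with CORRELATED sets: since A2z is square and
# invertible, the joint generalized least-squares estimates of (x, y)
# coincide with the weighted adjustment of b1 alone (weight Q1^{-1}),
# and z follows from Eq. (302). All numbers are synthetic.
import numpy as np

rng = np.random.default_rng(5)
n1, px, py, pz = 12, 3, 2, 3
A1x, A1y = rng.normal(size=(n1, px)), rng.normal(size=(n1, py))
A2x = rng.normal(size=(pz, px))
A2z = rng.normal(size=(pz, pz)) + 4 * np.eye(pz)
b1, b2 = rng.normal(size=n1), rng.normal(size=pz)

M = rng.normal(size=(n1 + pz, n1 + pz))
Q = M @ M.T + (n1 + pz) * np.eye(n1 + pz)      # joint covariance, Q12 != 0
Q1, Q12 = Q[:n1, :n1], Q[:n1, n1:]
P = np.linalg.inv(Q)

A = np.block([[A1x, A1y, np.zeros((n1, pz))],
              [A2x, np.zeros((pz, py)), A2z]])
b = np.concatenate([b1, b2])
sol = np.linalg.solve(A.T @ P @ A, A.T @ P @ b)  # joint GLS
x_j, y_j, z_j = sol[:px], sol[px:px + py], sol[px + py:]

# First set alone, correlation ignored, weight Q1^{-1}
A1 = np.hstack([A1x, A1y])
P1 = np.linalg.inv(Q1)
sol1 = np.linalg.solve(A1.T @ P1 @ A1, A1.T @ P1 @ b1)
x1, y1 = sol1[:px], sol1[px:]
e1 = b1 - A1x @ x1 - A1y @ y1                  # residuals of set 1
e2 = Q12.T @ np.linalg.inv(Q1) @ e1            # predicted errors of set 2
z1 = np.linalg.solve(A2z, b2 - A2x @ x1 - e2)  # Eq. (302)

print(np.allclose(x_j, x1), np.allclose(y_j, y1), np.allclose(z_j, z1))
```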

14 Stacking of a Coordinate Time Series from a Single Space Technique

In the joint treatment of data from the four space techniques and of data related to ties at the collocation sites, we first need to form the normal equations of the stacking procedure for each technique separately. These can be used either directly for the joint ITRF solution, or separately, to obtain separate ITRF parameter estimates for each technique, which are then combined into the final joint estimates. In the second case the use of minimal constraints is necessary to obtain the separate estimates. In general, it is advisable to compute separate solutions (and not only their further needed normal equation matrices) as an intermediate diagnostic step, since problems pertaining to one data set will affect the separate estimates, while they might be masked in the joint solution under the influence of the correct information from the other data sets. This is particularly well known to surveyors: outliers can be detected by data snooping much more easily when suspect geodetic subnetworks are adjusted independently, rather than within the adjustment of a whole large network. We will start by formulating the normal equations for a particular (unspecified) space technique. In a first step we will ignore the EOP data, and then we will examine their joint treatment. Since the model for the stacking has the same form whether we work with the original parameters or with corrections to their approximate values, we will consider only the latter, since it forms the basis of an


iterative solution of the original non-linear model. From the basic model

$x_i^{ob}(t_k) = \delta x_{0i} + (t_k - t_0)\,\delta v_i + E_i\,\delta z_k + e_{ik}, \qquad i = 1, 2, \ldots, n, \quad k = 1, 2, \ldots, m,$ (314)

we form the observation equations for all data in epoch $t_k$,

$x(t_k) = \begin{bmatrix} \vdots \\ x_i^{ob}(t_k) \\ \vdots \end{bmatrix} = \begin{bmatrix} \vdots \\ \delta x_{0i} \\ \vdots \end{bmatrix} + (t_k - t_0)\begin{bmatrix} \vdots \\ \delta v_i \\ \vdots \end{bmatrix} + \begin{bmatrix} \vdots \\ E_i \\ \vdots \end{bmatrix}\delta z_k + \begin{bmatrix} \vdots \\ e_{ik} \\ \vdots \end{bmatrix} = \delta x_0 + (t_k - t_0)\,\delta v + E\,\delta z_k + e_k,$ (315)

and the total observation equations for all epochs,

$b = \begin{bmatrix} \vdots \\ x(t_k) \\ \vdots \end{bmatrix} = \begin{bmatrix} \vdots \\ I_{3n} \\ \vdots \end{bmatrix}\delta x_0 + \begin{bmatrix} \vdots \\ (t_k - t_0)\,I_{3n} \\ \vdots \end{bmatrix}\delta v + \begin{bmatrix} \ddots & & \\ & E & \\ & & \ddots \end{bmatrix}\begin{bmatrix} \vdots \\ \delta z_k \\ \vdots \end{bmatrix} + \begin{bmatrix} \vdots \\ e_k \\ \vdots \end{bmatrix} = (1_m \otimes I_{3n})\,\delta x_0 + (\tau \otimes I_{3n})\,\delta v + (I_m \otimes E)\,\delta z + e \equiv J\,\delta x_0 + J_t\,\delta v + G\,\delta z + e,$ (316)

where $\tau$ has elements $\tau_k = t_k - t_0$. In order to accommodate the fact that not all stations participate in the observations of every epoch $t_k$, we add zero entries for the missing observations and replace the weight matrix $P_k$ of the observations with an inflated version $\bar{P}_k$, obtained by adding zero rows and columns corresponding to the non-observing stations. The final normal equations have the form

$\begin{bmatrix} N_a & N_{az} \\ N_{az}^T & N_z \end{bmatrix}\begin{bmatrix} \delta\hat{a} \\ \delta\hat{z} \end{bmatrix} = \begin{bmatrix} N_x & N_{xv} & N_{xz} \\ N_{xv}^T & N_v & N_{vz} \\ N_{xz}^T & N_{vz}^T & N_z \end{bmatrix}\begin{bmatrix} \delta\hat{x}_0 \\ \delta\hat{v} \\ \delta\hat{z} \end{bmatrix} = \begin{bmatrix} u_x \\ u_v \\ u_z \end{bmatrix} = \begin{bmatrix} u_a \\ u_z \end{bmatrix},$

where $\delta\hat{a} = \begin{bmatrix} \delta\hat{x}_0 \\ \delta\hat{v} \end{bmatrix}$ and

$N_x = J^T P J = \sum_{k=1}^m \bar{P}_k, \qquad N_{xv} = J^T P J_t = \sum_{k=1}^m (t_k - t_0)\,\bar{P}_k, \qquad N_v = J_t^T P J_t = \sum_{k=1}^m (t_k - t_0)^2\,\bar{P}_k,$

$N_{xz} = J^T P G = \mathrm{BR}(N_{xz_k}), \qquad N_{xz_k} = \bar{P}_k E,$ (317)

$N_{vz} = J_t^T P G = \mathrm{BR}(N_{vz_k}), \qquad N_{vz_k} = (t_k - t_0)\,\bar{P}_k E, \qquad N_z = G^T P G = \mathrm{BD}(N_{z_k}), \qquad N_{z_k} = E^T\bar{P}_k E,$ (318)

$u_x = J^T P b = \sum_{k=1}^m \bar{P}_k\,x(t_k), \qquad u_v = J_t^T P b = \sum_{k=1}^m (t_k - t_0)\,\bar{P}_k\,x(t_k), \qquad u_z = G^T P b = \mathrm{BC}(u_{z_k}), \qquad u_{z_k} = E^T\bar{P}_k\,x(t_k),$ (319)

where $\mathrm{BR}(\cdot)$, $\mathrm{BD}(\cdot)$ and $\mathrm{BC}(\cdot)$ denote the block row, block diagonal and block column matrices formed from the indicated per-epoch blocks.
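The Kronecker structure of (316) through (319) can be illustrated with a toy example. In the sketch below the block formulas are checked against brute-force matrix products; the matrix $E$ stands in for a generic per-epoch transformation design (e.g. seven Helmert parameters), and the sizes are purely illustrative.

```python
# Sketch of the Kronecker structure (316)-(319) for a toy stacking problem:
# the per-block formulas for Nx, Nxv, Nv, Nxz and Nz are checked against the
# brute-force products J^T P J etc. Random SPD matrices stand in for the
# inflated per-epoch weight matrices P_k-bar.
import numpy as np

def bdiag(blocks):
    """Assemble a block-diagonal matrix from a list of 2-d arrays."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

rng = np.random.default_rng(6)
n, m, q = 2, 3, 7                 # stations, epochs, per-epoch parameters
d = 3 * n                         # coordinates per epoch
tau = np.array([1.0, 2.0, 4.0])   # tau_k = t_k - t_0
E = rng.normal(size=(d, q))

Pk = [None] * m
for k in range(m):                # random SPD weight matrices
    Mk = rng.normal(size=(d, d))
    Pk[k] = Mk @ Mk.T + d * np.eye(d)
P = bdiag(Pk)

J = np.kron(np.ones((m, 1)), np.eye(d))        # 1_m (x) I_3n
Jt = np.kron(tau.reshape(-1, 1), np.eye(d))    # tau (x) I_3n
G = np.kron(np.eye(m), E)                      # I_m (x) E

Nx = sum(Pk)                                   # Eq. (317)
Nxv = sum(t * p for t, p in zip(tau, Pk))
Nv = sum(t * t * p for t, p in zip(tau, Pk))
Nxz = np.hstack([p @ E for p in Pk])           # block row BR(.)
Nz = bdiag([E.T @ p @ E for p in Pk])          # block diagonal BD(.)

print(np.allclose(J.T @ P @ J, Nx), np.allclose(J.T @ P @ Jt, Nxv),
      np.allclose(Jt.T @ P @ Jt, Nv), np.allclose(J.T @ P @ G, Nxz),
      np.allclose(G.T @ P @ G, Nz))
```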

In order to express the various elements in terms of the actual per-epoch weight matrices $P_k$, we denote by $(P_k)_{ij}$ the $3\times 3$ submatrix of $P_k$ corresponding to the stations $i$ and $j$, and we set $(\bar{P}_k)_{ij} = \varepsilon_{ik}\,\varepsilon_{jk}\,(P_k)_{ij}$, where $\varepsilon_{ik} = 1$ if station $i$ participates in the observations of epoch $t_k$, and $\varepsilon_{ik} = 0$ if it does not. We will also denote by $T_i$ the subset of the epoch indices $k = 1, 2, \ldots, m$ in which station $i$ participates and by $S_k$ the subset of the station indices $i = 1, 2, \ldots, n$ corresponding to the stations participating at epoch $t_k$. When the indicators $\varepsilon_{ik}$ appear in a summation over all stations or epochs it holds that

$\sum_{i=1}^n \varepsilon_{ik}\,Q_i = \sum_{i\in S_k} Q_i, \qquad \sum_{k=1}^m \varepsilon_{ik}\,Q_k = \sum_{k\in T_i} Q_k.$ (320)

The submatrices $N_x$, $N_{xv}$ and $N_v$ share the common form $N_{\bar{\tau}} = \sum_{k=1}^m \bar{\tau}_k\,\bar{P}_k$, with $\bar{\tau}_k$ taking the respective values $\bar{\tau}_k = 1$, $\bar{\tau}_k = t_k - t_0$ and $\bar{\tau}_k = (t_k - t_0)^2$. Setting

$(N_{\bar{\tau}})_{ij} = \sum_{k=1}^m \bar{\tau}_k\,\varepsilon_{ik}\,\varepsilon_{jk}\,(P_k)_{ij} = \sum_{k\in T_i\cap T_j} \bar{\tau}_k\,(P_k)_{ij},$ (321)

we have $N_x = N_{\bar{\tau}_k=1}$, $N_{xv} = N_{\bar{\tau}_k=t_k-t_0}$, $N_v = N_{\bar{\tau}_k=(t_k-t_0)^2}$, with station submatrices explicitly given by

$N_{x_i x_j} = \sum_{k\in T_i\cap T_j}(P_k)_{ij}, \qquad N_{x_i v_j} = \sum_{k\in T_i\cap T_j}(t_k - t_0)\,(P_k)_{ij}, \qquad N_{v_i v_j} = \sum_{k\in T_i\cap T_j}(t_k - t_0)^2\,(P_k)_{ij}.$ (322)
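The bookkeeping with the participation sets $S_k$, $T_i$ can be sketched as follows; random blocks stand in for the per-epoch weight submatrices, and the set route of (322) is checked against the indicator route with the inflated $\bar{P}_k$:

```python
# Illustration of (322): the station-pair block of Nx accumulates the 3x3
# weight blocks only over the epochs in T_i ∩ T_j, in which BOTH stations
# observe. Participation is encoded by the per-epoch station sets S_k.
import numpy as np

rng = np.random.default_rng(8)
n, m = 3, 5
S = [{0, 1}, {0, 1, 2}, {1, 2}, {0, 2}, {0, 1, 2}]   # stations per epoch
Pk = {k: rng.normal(size=(3 * n, 3 * n)) for k in range(m)}

def block(Mat, i, j):
    """The 3x3 submatrix of Mat corresponding to stations i and j."""
    return Mat[3 * i:3 * i + 3, 3 * j:3 * j + 3]

i, j = 0, 1
Ti = {k for k in range(m) if i in S[k]}
Tj = {k for k in range(m) if j in S[k]}
N_ij = sum(block(Pk[k], i, j) for k in Ti & Tj)      # Eq. (322), tau_k = 1

# Equivalent indicator route: eps_ik * eps_jk zeroes the absent stations
N_ij_bar = np.zeros((3, 3))
for k in range(m):
    eps = lambda s: 1.0 if s in S[k] else 0.0
    N_ij_bar += eps(i) * eps(j) * block(Pk[k], i, j)

print(np.allclose(N_ij, N_ij_bar))
```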

The remaining submatrices of $N$ and $u$, per epoch and/or station, are computed from

$N_{x_i z_k} = \varepsilon_{ik}\sum_{j\in S_k}(P_k)_{ij}\,E_j, \qquad N_{v_i z_k} = (t_k - t_0)\,N_{x_i z_k}, \qquad N_{z_k} = \sum_{i\in S_k}\sum_{j\in S_k} E_i^T\,(P_k)_{ij}\,E_j,$ (323)

$u_{x_i} = \sum_{k\in T_i}\sum_{j\in S_k}(P_k)_{ij}\,x_j^{ob}(t_k), \qquad u_{v_i} = \sum_{k\in T_i}(t_k - t_0)\sum_{j\in S_k}(P_k)_{ij}\,x_j^{ob}(t_k), \qquad u_{z_k} = \sum_{i\in S_k} E_i^T\sum_{j\in S_k}(P_k)_{ij}\,x_j^{ob}(t_k).$ (324)

To form the contribution to the total ITRF normal equations we need to consider the pseudo-observations for the per-epoch EOP estimates. For applications other than the ITRF formulation, the stacking solution can be computed by introducing a set of minimal constraints. To take advantage of the block-diagonal form of the $N_z$ submatrix, which can thus be inverted very easily, it is advisable to apply partial minimal constraints involving only the initial coordinates and velocities, collected in the vector $a = [\delta x_0^T\ \delta v^T]^T$, e.g., kinematic constraints or generalized partial inner constraints or a combination of the two. Simple partial inner constraints are computationally simpler to apply, keeping in mind that we can a posteriori transform the solution to that of any desired minimal constraints. The general form of the constraints will be in that case $C_a^T\delta a = d_a$ and the solution will be provided by applying Eqs. (78) and (79):

$\begin{bmatrix} \delta\hat{a} \\ \delta\hat{z} \end{bmatrix} = \begin{bmatrix} N_a + C_a C_a^T & N_{az} \\ N_{az}^T & N_z \end{bmatrix}^{-1}\begin{bmatrix} u_a \\ u_z \end{bmatrix} + \begin{bmatrix} E_a (C_a^T E_a)^{-1} d_a \\ E_z (C_a^T E_a)^{-1} d_a \end{bmatrix},$ (325)

$\begin{bmatrix} Q_{\delta\hat{a}} & Q_{\delta\hat{a}\delta\hat{z}} \\ Q_{\delta\hat{a}\delta\hat{z}}^T & Q_{\delta\hat{z}} \end{bmatrix} = \begin{bmatrix} N_a + C_a C_a^T & N_{az} \\ N_{az}^T & N_z \end{bmatrix}^{-1} - \begin{bmatrix} E_a R^{-1} E_a^T & E_a R^{-1} E_z^T \\ E_z R^{-1} E_a^T & E_z R^{-1} E_z^T \end{bmatrix}, \qquad R = E_a^T C_a C_a^T E_a.$ (326)

Since the matrix $N_z$ is block diagonal it can be easily inverted and utilized in a procedure similar to the elimination of $\delta\hat{z}$ in the full-rank case, where no constraints are needed. Analytical inversion gives

$\begin{bmatrix} N_a + C_a C_a^T & N_{az} \\ N_{az}^T & N_z \end{bmatrix}^{-1} = \begin{bmatrix} G_a & G_{az} \\ G_{az}^T & G_z \end{bmatrix},$ (327)

where

$G_a = (N_a - N_{az} N_z^{-1} N_{az}^T + C_a C_a^T)^{-1}, \qquad G_{az} = -G_a N_{az} N_z^{-1},$

$G_z = N_z^{-1} - N_z^{-1} N_{az}^T G_{az} = N_z^{-1} + N_z^{-1} N_{az}^T G_a N_{az} N_z^{-1}.$ (328)


With these values, and setting

$\bar{N}_a = N_a - N_{az} N_z^{-1} N_{az}^T, \qquad \bar{u}_a = u_a - N_{az} N_z^{-1} u_z,$ (329)

we arrive at the following algorithm:

$\delta\hat{a} = (\bar{N}_a + C_a C_a^T)^{-1}\bar{u}_a + E_a (C_a^T E_a)^{-1} d_a,$ (330)

$\delta\hat{z} = N_z^{-1} u_z - N_z^{-1} N_{az}^T\,\delta\hat{a},$ (331)

$Q_{\delta\hat{a}} = (\bar{N}_a + C_a C_a^T)^{-1} - E_a R^{-1} E_a^T,$ (332)

$Q_{\delta\hat{a}\delta\hat{z}} = -Q_{\delta\hat{a}}\,N_{az} N_z^{-1},$ (333)

$Q_{\delta\hat{z}} = N_z^{-1} - N_z^{-1} N_{az}^T\,Q_{\delta\hat{a}\delta\hat{z}}.$ (334)
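In the full-rank case, where $\bar{N}_a$ is invertible and no constraints are needed, the algorithm reduces to a plain Schur-complement elimination with back-substitution; a minimal numerical sketch with random data:

```python
# The elimination pattern behind (329)-(331), in the full-rank case where no
# minimal constraints are needed: solve the reduced system for a, then
# back-substitute for the per-epoch parameters z, exploiting that only Nz
# (block diagonal in the stacking problem) has to be inverted.
import numpy as np

rng = np.random.default_rng(7)
pa, pz = 4, 6
p = pa + pz
M = rng.normal(size=(p, p))
N = M @ M.T + p * np.eye(p)               # SPD joint normal matrix
u = rng.normal(size=p)
Na, Naz, Nz = N[:pa, :pa], N[:pa, pa:], N[pa:, pa:]
ua, uz = u[:pa], u[pa:]

Nzi = np.linalg.inv(Nz)
Na_bar = Na - Naz @ Nzi @ Naz.T           # Eq. (329)
ua_bar = ua - Naz @ Nzi @ uz
a_hat = np.linalg.solve(Na_bar, ua_bar)   # reduced solve, cf. Eq. (330)
z_hat = Nzi @ (uz - Naz.T @ a_hat)        # back-substitution, Eq. (331)

full = np.linalg.solve(N, u)              # brute-force reference
print(np.allclose(a_hat, full[:pa]), np.allclose(z_hat, full[pa:]))
```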

The last two equations are a direct application of the general equations (311) derived in Sect. 13.3. The above algorithm is completely analogous to the algorithm based on parameter elimination in the case of a full-rank design matrix, in which case $\bar{N}_a$ is invertible and no constraints are needed. The only difference is that Eqs. (330) and (332) replace the equations $\delta\hat{a} = \bar{N}_a^{-1}\bar{u}_a$ and $Q_{\delta\hat{a}} = \bar{N}_a^{-1}$, respectively, of the full-rank case. The matrices $\bar{N}_a$, $\bar{u}_a$ appearing in the above algorithm have the explicit values

$\bar{N}_a = \begin{bmatrix} \bar{N}_x & \bar{N}_{xv} \\ \bar{N}_{xv}^T & \bar{N}_v \end{bmatrix}, \qquad \bar{u}_a = \begin{bmatrix} \bar{u}_x \\ \bar{u}_v \end{bmatrix},$ (335)

$\bar{N}_x = N_x - \sum_{k=1}^m N_{xz_k} N_{z_k}^{-1} N_{xz_k}^T, \qquad \bar{N}_{xv} = N_{xv} - \sum_{k=1}^m N_{xz_k} N_{z_k}^{-1} N_{vz_k}^T, \qquad \bar{N}_v = N_v - \sum_{k=1}^m N_{vz_k} N_{z_k}^{-1} N_{vz_k}^T,$ (336)

$\bar{u}_x = u_x - \sum_{k=1}^m N_{xz_k} N_{z_k}^{-1} u_{z_k}, \qquad \bar{u}_v = u_v - \sum_{k=1}^m N_{vz_k} N_{z_k}^{-1} u_{z_k}.$ (337)

In the particular case that partial inner constraints $E_a^T\delta a = 0$ are used, we need only replace (330) and (332) with

$\delta\hat{a} = (\bar{N}_a + E_a E_a^T)^{-1}\bar{u}_a,$ (338)

$Q_{\delta\hat{a}} = (\bar{N}_a + E_a E_a^T)^{-1} - E_a (E_a^T E_a)^{-2} E_a^T.$ (339)

Since we can a posteriori convert the stacking solution to one corresponding to any desired constraints, it is advantageous to use the above much simpler algorithm in order to derive first the solution with partial inner constraints $E_a^T\delta a = 0$, involving only the initial coordinates and velocities. For this reason we give the explicit algorithm for the partial inner constraints solution:

$\begin{bmatrix} \delta\hat{x}_0 \\ \delta\hat{v} \end{bmatrix} = \begin{bmatrix} \bar{N}_x + EE^T & \bar{N}_{xv} \\ \bar{N}_{xv}^T & \bar{N}_v + EE^T \end{bmatrix}^{-1}\begin{bmatrix} \bar{u}_x \\ \bar{u}_v \end{bmatrix},$ (340)

$\delta\hat{z}_k = N_{z_k}^{-1}\bigl(u_{z_k} - N_{xz_k}^T\,\delta\hat{x}_0 - N_{vz_k}^T\,\delta\hat{v}\bigr),$ (341)

$\begin{bmatrix} Q_{\delta\hat{x}_0} & Q_{\delta\hat{x}_0\delta\hat{v}} \\ Q_{\delta\hat{x}_0\delta\hat{v}}^T & Q_{\delta\hat{v}} \end{bmatrix} = \begin{bmatrix} \bar{N}_x + EE^T & \bar{N}_{xv} \\ \bar{N}_{xv}^T & \bar{N}_v + EE^T \end{bmatrix}^{-1} - \begin{bmatrix} E(E^TE)^{-2}E^T & 0 \\ 0 & E(E^TE)^{-2}E^T \end{bmatrix},$ (342)

$Q_{\delta\hat{x}_0\delta\hat{z}} = -(Q_{\delta\hat{x}_0} N_{xz} + Q_{\delta\hat{x}_0\delta\hat{v}} N_{vz})\,N_z^{-1}, \qquad Q_{\delta\hat{v}\delta\hat{z}} = -(Q_{\delta\hat{x}_0\delta\hat{v}}^T N_{xz} + Q_{\delta\hat{v}} N_{vz})\,N_z^{-1}, \qquad Q_{\delta\hat{z}} = N_z^{-1} - N_z^{-1}(N_{xz}^T Q_{\delta\hat{x}_0\delta\hat{z}} + N_{vz}^T Q_{\delta\hat{v}\delta\hat{z}}).$ (343)

15 The Implementation of Time Series of Earth Orientation Parameters

When stacking is performed for the purpose of the ITRF formulation, it is necessary to take into account the available EOP time series, so that the relevant information from all space techniques can be used to obtain combined EOP estimates in addition to the station initial coordinates and velocities, which are the main goal. As with the epoch coordinates, the EOP estimates of each epoch refer to the particular reference system defined through the applied minimal constraints. In order to use them as pseudo-observations we must know the equations which convert EOPs from the (yet undefined) stacking reference system to the reference system of the particular epoch. More generally, we need to know how EOPs transform under a change of the terrestrial reference system. Since EOPs refer to directions, they are affected only by the rotational part of the reference system transformation and not by its translational or scale part. Table 1 presents the general transformation equations for EOPs under changes of either the terrestrial reference system (TRS), or the celestial reference system (CRS), or both (TRS + CRS). The change in the TRS is expressed by $\tilde{x}_T = R(\theta)x_T \approx (I - [\theta\times])x_T$ in terms of rotation angles $\theta = [\theta_1\ \theta_2\ \theta_3]^T$ around the three axes, while the change in the CRS is analogously expressed by $\tilde{x}_C = R(\psi)x_C \approx (I - [\psi\times])x_C$ with angles $\psi = [\psi_1\ \psi_2\ \psi_3]^T$ around the three axes. EOP transformation laws are given not only for the CIP but also for the instantaneous rotation axis, as well as for the angular momentum axis, which is a more stable axis, not following the high-frequency precession-nutation variations of the instantaneous rotation axis. The CIP here does not directly relate to its rather vague theoretical definition (a smoothed version of the instantaneous rotation axis after removal of sub-daily precession-nutation terms) but rather to its operational realization. According to the IERS conventions, the CIP axis is defined directly in relation to the CRS through the smoothed version of the theoretically predicted precession-nutation, after the latter is corrected according to observational evidence provided by VLBI.

Table 1 Variation of EOPs under changes $\tilde{x}_T = R(\theta)x_T \approx (I - [\theta\times])x_T$ (time dependent) in the terrestrial reference system (TRS), $\tilde{x}_C = R(\psi)x_C \approx (I - [\psi\times])x_C$ (time fixed) in the celestial reference system (CRS), or both:

Change of TRS + CRS:
- EOPs related to the CIP: $\tilde{x}_P = x_P - \theta_2$, $\tilde{y}_P = y_P - \theta_1$, $\tilde{X} = X - \psi_2$, $\tilde{Y} = Y + \psi_1$, $\tilde{\theta} = \theta + \theta_3 - \psi_3$
- EOPs related to the rotation vector: $\tilde{x}_P = x_P - \theta_2 + \dot{\theta}_1/\omega$, $\tilde{y}_P = y_P - \theta_1 - \dot{\theta}_2/\omega$, $\tilde{X} = X - \psi_2 + (\cos\theta\,\dot{\theta}_1 - \sin\theta\,\dot{\theta}_2)/\omega$, $\tilde{Y} = Y + \psi_1 + (\sin\theta\,\dot{\theta}_1 - \cos\theta\,\dot{\theta}_2)/\omega$, $\tilde{\Lambda} = \Lambda(1 - \dot{\theta}_3/\omega)$
- EOPs related to the angular momentum: $\tilde{\xi}_h = \xi_h - \theta_2$, $(-\tilde{\eta}_h) = (-\eta_h) - \theta_1$, $\tilde{X}_h = X_h - \psi_2$, $\tilde{Y}_h = Y_h + \psi_1$, $\widetilde{\mathrm{UT1}} = \mathrm{UT1} + (\theta_3 - \psi_3)/(2\pi B)$

Change of TRS only:
- EOPs related to the CIP: $\tilde{x}_P = x_P - \theta_2$, $\tilde{y}_P = y_P - \theta_1$, $\tilde{X} = X$, $\tilde{Y} = Y$, $\tilde{\theta} = \theta + \theta_3$
- EOPs related to the rotation vector: $\tilde{x}_P = x_P - \theta_2 + \dot{\theta}_1/\omega$, $\tilde{y}_P = y_P - \theta_1 - \dot{\theta}_2/\omega$, $\tilde{X} = X + (\cos\theta\,\dot{\theta}_1 - \sin\theta\,\dot{\theta}_2)/\omega$, $\tilde{Y} = Y + (\sin\theta\,\dot{\theta}_1 - \cos\theta\,\dot{\theta}_2)/\omega$, $\tilde{\Lambda} = \Lambda(1 - \dot{\theta}_3/\omega)$
- EOPs related to the angular momentum: $\tilde{\xi}_h = \xi_h - \theta_2$, $(-\tilde{\eta}_h) = (-\eta_h) - \theta_1$, $\tilde{X}_h = X_h$, $\tilde{Y}_h = Y_h$, $\widetilde{\mathrm{UT1}} = \mathrm{UT1} + \theta_3/(2\pi B)$

Change of CRS only:
- EOPs related to the CIP: $\tilde{x}_P = x_P$, $\tilde{y}_P = y_P$, $\tilde{X} = X - \psi_2$, $\tilde{Y} = Y + \psi_1$, $\tilde{\theta} = \theta - \psi_3$
- EOPs related to the rotation vector: $\tilde{x}_P = x_P$, $\tilde{y}_P = y_P$, $\tilde{X} = X - \psi_2$, $\tilde{Y} = Y + \psi_1$, $\tilde{\Lambda} = \Lambda$
- EOPs related to the angular momentum: $\tilde{\xi}_h = \xi_h$, $\tilde{\eta}_h = \eta_h$, $\tilde{X}_h = X_h - \psi_2$, $\tilde{Y}_h = Y_h + \psi_1$, $\widetilde{\mathrm{UT1}} = \mathrm{UT1} - \psi_3/(2\pi B)$

(Here $B = 1.00273781191135448$.)

The EOPs under consideration are the first two components $X$, $Y$ of the unit vector of the rotation axis with respect to the CRS, the polar motion components $x_P$, $y_P$, which relate to the components of the rotation axis with respect to the TRS, the earth rotation angle $\theta$, and the length of day (LOD) $\Lambda$. $\Lambda$ is related to the angular velocity $\omega$ through $\Lambda = 2\pi/\omega$. An alternative to LOD is universal time UT1, expressed through $T_u = (\text{Julian UT1 date}) - 2451545.0$. It relates to the earth rotation angle $\theta$ through $\theta = 2\pi(A + BT_u)$, where $A = 0.7790572732640$ and $B = 1.00273781191135448$, as well as to the IERS-provided differences UT1-UTC through the trivial relation UT1 = UTC + (UT1-UTC). With respect to the EOP time series provided by the four space techniques, VLBI provides daily polar motion, polar motion rates, LOD and


UT1-UTC; SLR provides polar motion and LOD on both a weekly and a fortnightly basis; GPS provides daily polar motion, polar motion rates and LOD; while DORIS provides only polar motion on a weekly basis. The relations for EOPs related to the CIP under changes of only the terrestrial reference system are the ones actually used in the ITRF formulation. They have been adapted from relations developed, for the now abandoned classical earth rotation representation, by Zhu and Mueller [63]. From the TRS/CIP entries in Table 1, it follows that the EOP pseudo-observations at each epoch $t_k$ have the form $q^{obs}(t_k) = q_k + T_0 z_k + e_{q,k}$, where $q_k$ are the unknown EOP parameters in the final ITRF reference system, $z_k$ are the transformation parameters from the ITRF reference system to that of the epoch $t_k$, $T_0$ is the known design matrix and $e_{q,k}$ are the relevant errors. In reduced form they become

$b_{q,k} = q^{obs}(t_k) - T_0 z_k^{ap} = q_k + T_0\,\delta z_k + e_{q,k}.$ (344)
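The factor $1/(2\pi B)$ in the UT1 entries of Table 1 and of the design matrix $T_0$ follows from the relation $\theta = 2\pi(A + BT_u)$ quoted above; a small sketch (the function name is our own):

```python
# The Earth Rotation Angle relation theta = 2*pi*(A + B*Tu) quoted above,
# with Tu in UT1 days since J2000.0; the function reduces the angle to
# [0, 2*pi). A and B are the IERS conventional constants.
import math

A = 0.7790572732640
B = 1.00273781191135448

def earth_rotation_angle(Tu):
    """Earth rotation angle (radians) for Tu = Julian UT1 date - 2451545.0."""
    return 2.0 * math.pi * ((A + B * Tu) % 1.0)

# At J2000.0 (Tu = 0) the angle is 2*pi*A
assert math.isclose(earth_rotation_angle(0.0), 2.0 * math.pi * A)
# One UT1 day later the Earth has turned a bit more than one full turn
extra = earth_rotation_angle(1.0) - earth_rotation_angle(0.0)
print(extra / (2.0 * math.pi))   # ~ B - 1 = 0.0027378...
```

A perturbation $\theta_3$ of the earth rotation angle thus corresponds to a UT1 perturbation of $\theta_3/(2\pi B)$, which is exactly the entry appearing in the UT1 row of $T_0$ below.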

As an example, when polar motion and UT1 are observed, Eq. (344) takes the specific form

$\begin{bmatrix} x_P^{obs}(t_k) \\ y_P^{obs}(t_k) \\ \mathrm{UT1}^{obs}(t_k) \end{bmatrix} = \begin{bmatrix} x_P(t_k) \\ y_P(t_k) \\ \mathrm{UT1}(t_k) \end{bmatrix} + \begin{bmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & \frac{1}{2\pi B} \end{bmatrix}\begin{bmatrix} \theta_1(t_k) \\ \theta_2(t_k) \\ \theta_3(t_k) \end{bmatrix} + \begin{bmatrix} e_{x_P}(t_k) \\ e_{y_P}(t_k) \\ e_{\mathrm{UT1}}(t_k) \end{bmatrix}.$ (345)

The particular form of $b_{q,k} = q_k + T_0\,\delta z_k + e_{q,k}$ depends on which EOPs are provided by each technique. This model does not include time series of the form $\dot{q}^{obs}(t_k) = \dot{q}_k + T_0\,\delta\dot{z}_k + \dot{e}_{q,k}$, since these depend on the derivatives $\delta\dot{z}_k$ of the transformation parameters $\delta z_k$, which are not present in the stacking model for coordinate time series and would thus constitute additional unknown parameters and an overparameterization. In principle one could use a simple local interpolation depending on the $2j+1$ neighboring values $\delta z_{k-j}, \ldots, \delta z_k, \ldots, \delta z_{k+j}$, to obtain a local function $\delta z(t)$ in the form of a polynomial of degree $2j$. Differentiation and evaluation at $t_k$ would produce $\delta\dot{z}_k = \delta\dot{z}(t_k)$, depending on the neighboring values taken into account in the interpolation. For example, a simple 3-point Lagrangean interpolation of a second degree polynomial $\delta z(t) = a + bt + ct^2$ based on the values $\delta z_{k-1}$, $\delta z_k$, $\delta z_{k+1}$ produces the relation

$\delta\dot{z}_k = -\frac{t_{k+1} - t_k}{(t_k - t_{k-1})(t_{k+1} - t_{k-1})}\,\delta z_{k-1} + \frac{t_{k+1} - 2t_k + t_{k-1}}{(t_k - t_{k-1})(t_{k+1} - t_k)}\,\delta z_k + \frac{t_k - t_{k-1}}{(t_{k+1} - t_k)(t_{k+1} - t_{k-1})}\,\delta z_{k+1}.$ (346)
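The 3-point formula (346) is simply the derivative at $t_k$ of the interpolating parabola, so it must reproduce the derivative exactly for quadratic data; a quick check with arbitrary, unequally spaced epochs:

```python
# Check of the 3-point differentiation formula (346): for values sampled
# from a quadratic polynomial it must reproduce the derivative at t_k
# exactly (up to rounding), since the interpolating parabola then coincides
# with the polynomial itself.
def dz_three_point(tkm1, tk, tkp1, zkm1, zk, zkp1):
    """Derivative at tk of the parabola through the three samples, Eq. (346)."""
    c_m = -(tkp1 - tk) / ((tk - tkm1) * (tkp1 - tkm1))
    c_0 = (tkp1 - 2.0 * tk + tkm1) / ((tk - tkm1) * (tkp1 - tk))
    c_p = (tk - tkm1) / ((tkp1 - tk) * (tkp1 - tkm1))
    return c_m * zkm1 + c_0 * zk + c_p * zkp1

a, b, c = 0.7, -1.3, 2.1                  # z(t) = a + b t + c t^2
z = lambda t: a + b * t + c * t * t
t1, t2, t3 = 0.0, 0.4, 1.0                # unequal spacing on purpose
d = dz_three_point(t1, t2, t3, z(t1), z(t2), z(t3))
print(abs(d - (b + 2.0 * c * t2)) < 1e-12)   # exact derivative b + 2 c t_k
```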

Even simpler piecewise linear interpolation schemes are possible, based on only two successive values, such as $\delta\dot{z}_k = \frac{1}{t_{k+1}-t_k}(\delta z_{k+1} - \delta z_k)$ and $\delta\dot{z}_k = \frac{1}{t_k - t_{k-1}}(\delta z_k - \delta z_{k-1})$, or their mean value

$\delta\dot{z}_k = -\frac{1}{2(t_k - t_{k-1})}\,\delta z_{k-1} + \frac{t_{k+1} - 2t_k + t_{k-1}}{2(t_k - t_{k-1})(t_{k+1} - t_k)}\,\delta z_k + \frac{1}{2(t_{k+1} - t_k)}\,\delta z_{k+1}.$ (347)

We strongly advise against such a procedure, because the transformation parameters do not account only for a change of the spatiotemporal reference system but also serve to absorb variations due to observational errors. Differentiation is a procedure that strongly amplifies the effect of such errors, and the resulting time series of EOP derivatives will in general be too erratic to be of practical use. In order to combine EOP derivatives from the four space techniques in the final ITRF formulation, it is at least necessary to convert them from the reference system of each epoch to the spatiotemporal reference system of the stacking. This can be done by an a posteriori conversion based on the estimates $\delta\hat{z}_k$ obtained in the stacking of each separate technique, without including the EOP derivatives as pseudo-observations. This would, however, also require an interpolation scheme of doubtful effectiveness. In our opinion EOP derivatives should be considered neither in the stacking per technique nor in the ITRF formulation, because they pose more problems than their prospective usefulness justifies. For all epochs we have the EOP observation equations

$b_q = q + T\,\delta z + e_q.$ (348)

The uncorrelated EOP observations must be combined with the observation equations for the previously analyzed coordinate time series, $b = A_x\,\delta x_0 + A_v\,\delta v + A_z\,\delta z + e_b = A_a\,\delta a + A_z\,\delta z + e_b$, into the joint observation equations

$\begin{bmatrix} b \\ b_q \end{bmatrix} = \begin{bmatrix} A_a & A_z & 0 \\ 0 & T & I \end{bmatrix}\begin{bmatrix} \delta a \\ \delta z \\ q \end{bmatrix} + \begin{bmatrix} e_b \\ e_q \end{bmatrix}.$ (349)

The only common parameters between the two sets $b$ and $b_q$ are the transformation parameters $\delta z$. In the EOP time series observation model $b_q = q + T\,\delta z + e_q$, the EOPs $q$ are non-common parameters and their implied design submatrix is the nonsingular identity matrix $I$. As already seen in Sect. 13.2, this means that the EOP pseudo-observations $b_q$ are non-adjustable with respect to the EOP parameters $q$. Thus the results of Proposition 6 or 7 apply, depending on whether $b$ and $b_q$ are uncorrelated or correlated. We will examine first the uncorrelated situation, where the joint weight matrix has the form $\begin{bmatrix} P_b & 0 \\ 0 & P_q \end{bmatrix}$. Then, according to Proposition 6, we can obtain the optimal estimates of $\delta\hat{x}_0$, $\delta\hat{v}$, $\delta\hat{z}$ by adjusting only the coordinate time series observations $b$, performing the stacking without EOPs as described in Sect. 14. The optimal estimates $\hat{q}$ of the EOPs can then be directly computed from the error-free model $b_q = q + T\,\delta\hat{z}$, as

$\hat{q} = b_q - T\,\delta\hat{z}.$ (350)

The situation is quite different, though, when the correlation between the coordinate time series and the EOP time series is taken into account and the weight matrix for the joint observation equations (349) has the form $\begin{bmatrix} P_b & P_{bq} \\ P_{bq}^T & P_q \end{bmatrix}$. Then, according to Proposition 7, we can obtain the optimal estimates of $\delta\hat{x}_0$, $\delta\hat{v}$, $\delta\hat{z}$ by adjusting only the coordinate time series observations $b$, performing the stacking without EOPs as described in Sect. 14. Once $\delta\hat{x}_0$, $\delta\hat{v}$, $\delta\hat{z}$ have been estimated from the coordinate time series, the EOP estimates may be obtained utilizing Eq. (302), which in this case ($b_1 \to b$, $b_2 \to b_q$, $x \to \delta z$, $y \to \delta a$, $z \to q$, $A_{1x} \to A_z$, $A_{1y} \to A_a$, $A_{2x} \to T$, $A_{2z} \to I$, $Q_1 \to Q_b$, $Q_{12} \to Q_{bq}$) takes the form

$\hat{q} = b_q - T\,\delta\hat{z} - Q_{bq}^T Q_b^{-1}\bigl(b - A_a\,\delta\hat{a} - A_z\,\delta\hat{z}\bigr).$ (351)

Note that the last equation can be interpreted as a two-step procedure. In the first step the coordinate error estimates $\hat{e}_b = b - A_a\,\delta\hat{a} - A_z\,\delta\hat{z}$ are used to predict the EOP errors $\tilde{e}_q = Q_{bq}^T Q_b^{-1}\hat{e}_b$; then, from the model consistency $b_q = \hat{q} + T\,\delta\hat{z} + \tilde{e}_q$, the EOP estimates follow as $\hat{q} = b_q - T\,\delta\hat{z} - \tilde{e}_q$. In addition to the separately derived covariance factor matrices $Q_{\delta\hat{a}}$, $Q_{\delta\hat{z}}$, $Q_{\delta\hat{z}\delta\hat{a}}$, the additional ones related to the EOP estimates can be computed sequentially utilizing Eq. (313), which in this case takes the form

$Q_{\delta\hat{z},\hat{q}} = Q_{\delta\hat{z}}(A_z^T Q_b^{-1} Q_{bq} - T^T) + Q_{\delta\hat{z},\delta\hat{a}}\,A_a^T Q_b^{-1} Q_{bq},$

$Q_{\delta\hat{a},\hat{q}} = Q_{\delta\hat{z},\delta\hat{a}}^T(A_z^T Q_b^{-1} Q_{bq} - T^T) + Q_{\delta\hat{a}}\,A_a^T Q_b^{-1} Q_{bq},$

$Q_{\hat{q}} = Q_q - Q_{bq}^T Q_b^{-1} Q_{bq} + (Q_{bq}^T Q_b^{-1} A_z - T)\,Q_{\delta\hat{z},\hat{q}} + Q_{bq}^T Q_b^{-1} A_a\,Q_{\delta\hat{a},\hat{q}}.$ (352)

The question whether the correlation between EOPs and coordinate time series within the data of the same epoch should be taken into account must be answered in the positive if the assumptions on which the data analysis is based (correct deterministic and stochastic model, zero mean errors, correct error covariances, no systematic errors) are consistent with reality. If not, one might prefer not to let problems in the determination of EOPs within each epoch have an undesired influence on the quality of the estimates of initial coordinates and velocities. Computational experience has shown very small differences between the results obtained from the two possible approaches.

16 ITRF Formulation: Combination of Initial Coordinates, Velocities and EOPs Obtained Separately from Each Space Technique

In order to combine different sets of data, these must be connected, either in the deterministic part of their model through the existence of common unknown parameters, or in the stochastic part through the existence of correlations, or in both ways. Since data from different space techniques are naturally uncorrelated, the existence of common parameters is the only possible connection. There are two types of common parameters. The first are the coordinate-related parameters of stations at the collocation sites, which appear as common parameters between the data from one technique and the data from local surveys performed for ties between nearby stations observing with different space techniques. The other are the EOP parameters, which must be common to all techniques that provide them, once they are expressed in the final ITRF reference system. Of the two sets, the EOPs are by far the more problematic. Common EOP parameters in the strict sense are (1) daily polar motion parameters, their rates and LOD from VLBI and GPS, and (2) weekly polar motion from SLR and DORIS. One of the problems is how to connect daily with weekly EOP estimates of the same type. They are certainly interrelated, but not exactly the same, so they cannot be directly included as common parameters in the ITRF formulation adjustment. They are both a (not clearly defined) type of average of the relevant instantaneous values at the observation instants of the data analyzed to produce the daily or weekly estimates. Such averages depend on both the type of the observations and their temporal distribution within the data analysis interval. A more serious problem is the compatibility of EOPs from different techniques with respect to the pole to which they refer. Nominally, they all refer to the CIP, but it is highly questionable how this reference is realized, given that the CIP is a conventionally constructed concept and not a physical object.
Only the three parameters of the rotation matrix connecting the TRS to the CRS can be considered as physical objects, but these appear only in VLBI observations. SLR, GPS and DORIS sense earth rotation through the satellite orbits, which, when viewed in the TRS, formally depend on the instantaneous rotation vector appearing in the pseudo-forces caused by the TRS rotation. Even the rotation vector, as any type of velocity, is not a physical object but a mathematical concept, which becomes physically accessible only in the case of time-continuous observations, and not discrete ones as in the case of the geodetic space techniques. All these EOP-related problems make the role of the ties at collocation sites in the joint analysis of all data in the ITRF formulation even more important. In addition, they pose the serious problem of how to treat EOP parameters in the ITRF formulation in such a way that their ambiguities have no negative influence on the quality of the initial station coordinates and velocities, which are the main ITRF products. In any case, we will also present here the "rigorous" combination of coordinate time series from all techniques, including EOP time series, for the sake of completeness, although we by no means endorse this approach.

13 Theory and Realization of Reference Systems
A. Dermanis

The local ties at collocation sites are performed either by the use of classical terrestrial measurements (direction angles, distances, and spirit leveling) or by the GPS technique. When the GPS technique is used, the resulting station-to-station vectors are already aligned to a GPS-related reference system. In this case we do not need to express the observed displacement vector as a function of its ITRF components and the rotation angles from the ITRF to the GPS reference system. Indeed, a rotational ambiguity that causes a 10 m displacement on the earth surface (an exaggerated value!) causes a displacement of only about one tenth of a millimeter over a 50 m baseline. Thus observed baselines may be modeled as already referring to the final ITRF reference system. The situation is quite different when only terrestrial observations are performed in local ties and the resulting vector refers to the local astronomic frame. To maintain an accuracy of one tenth of a millimeter over a 50 m baseline in the conversion to a global reference system, one needs knowledge of the direction of the vertical down to 0.1 mas, which is quite impossible to achieve. Modeling the observed vectors as functions of both the coordinate differences in the ITRF reference system and the rotation angles from the local horizontal to the global ITRF system has the disadvantage that each vector adds as many unknowns (rotation angles) as observations (vector components), except for sites where more than two space techniques are collocated. For this reason it is necessary that local terrestrial techniques include connections with a surrounding geodetic network, existing or created for this purpose, which is already aligned to a global network through GPS observations. We may look upon the ITRF formulation as a simultaneous stacking problem, to which the input data are the coordinate time series of each technique, plus the local tie data, plus the EOP time series of each technique.
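The first figure above can be checked with a line of arithmetic: a rotational ambiguity displacing points by 10 m on the earth's surface corresponds to an angle of roughly 1.6 microradians, which over a 50 m baseline produces well under a millimeter. A minimal check (the mean earth radius used below is an assumed round value):

```python
# Effect of a rotational ambiguity on a short local-tie baseline.
R_earth = 6.371e6          # assumed mean earth radius [m]
surface_shift = 10.0       # exaggerated rotational ambiguity effect [m]
baseline = 50.0            # local tie baseline length [m]

angle = surface_shift / R_earth            # rotation angle [rad]
shift_on_baseline = angle * baseline       # induced baseline error [m]
print(f"{shift_on_baseline * 1e3:.3f} mm") # about 0.08 mm, i.e. ~0.1 mm
```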
The basic difference from the single-technique stacking and EOP estimation is that now the EOP data refer to unknowns common to all techniques and not to each technique separately. We will see how the estimates from the single-technique adjustments can be used in the ITRF formulation without actually performing the solution with all available data in one step. The latter will be formulated only theoretically, in order to connect its results with the estimates from the separate solutions. Since we are dealing with uncorrelated data between different techniques, we can profit from the property of the addition of normal equations. Recall that we have introduced the notation $BD(M_T)$, $BR(M_T)$ and $BC(M_T)$ for a block-diagonal, a block-row, and a block-column matrix, respectively, having as elements the submatrices $M_T$, for $T = V, S, G, D$, corresponding to the four space techniques (VLBI, SLR, GPS, DORIS). The EOP pseudo-observation equations (344) at epoch $t_k$, from a particular space technique $T$, take the form $b_{q,T,k} = q_{T,k} + T_{T,k}\,\delta z_{T,k} + e_{q,T,k}$. If $q_k$ are the EOPs at epoch $t_k$ from all techniques, we may connect $q_{T,k}$ from technique $T$ to the total EOPs $q_k$ through $q_{T,k} = L_{T,k} q_k$, where $L_{T,k}$ is a participation matrix. It may be obtained by removing from the identity matrix the rows corresponding to EOPs that are not observed by technique $T$. Usually $L_{T,k}$ will be the same for all epochs, but different ones may account for missing EOP observations in some epochs. For all epochs, the EOPs $q_T$ observed by technique $T$ are connected to the overall EOPs $q$ from all techniques through
$$
q_T = \begin{bmatrix} \vdots \\ q_{T,k} \\ \vdots \end{bmatrix}
    = \begin{bmatrix} \vdots \\ L_{T,k}\,q_k \\ \vdots \end{bmatrix}
    = \begin{bmatrix} \ddots & & 0 \\ \cdots & L_{T,k} & \cdots \\ 0 & & \ddots \end{bmatrix}
      \begin{bmatrix} \vdots \\ q_k \\ \vdots \end{bmatrix}
    = L_T\,q. \tag{353}
$$
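The participation matrices of Eq. (353) can be illustrated with a small numpy sketch; the number and naming of EOPs per epoch here are purely illustrative assumptions:

```python
import numpy as np

def participation_matrix(n_eop, observed):
    """L_{T,k} of Eq. (353): identity with rows of unobserved EOPs removed."""
    return np.eye(n_eop)[sorted(observed), :]

def block_diag(*blocks):
    """Assemble the sparse quasi-diagonal L_T from per-epoch blocks L_{T,k}."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

# Hypothetical epoch with 5 EOPs; a technique observing only the first three.
L = participation_matrix(5, observed=[0, 1, 2])
q = np.array([0.1, -0.2, 0.05, 0.01, -0.03])   # all EOPs q_k at one epoch
q_T = L @ q                                     # EOPs seen by technique T
L_T = block_diag(L, L)                          # two epochs, same pattern
```

Picking components by multiplication with $L_{T,k}$ rather than by indexing keeps the algebra of the normal equations (sums of $L_T^T(\cdot)L_T$ terms) directly implementable.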

$L_T$ is a sparse matrix with non-zero quasi-diagonal sub-blocks $L_{T,k}$. The coordinate time series data from each technique $T$ are
$$
b_T = A_{a_T}\,\delta a_T + A_{z_T}\,\delta z_T + e_T, \qquad e_T \sim (0, \sigma^2 P_T^{-1}), \qquad T = V, S, G, D. \tag{354}
$$

The EOP time series data from each technique $T$ are
$$
b_{q,T} = q_T + T_T\,\delta z_T + e_{q,T}, \qquad e_{q,T} \sim (0, \sigma^2 P_{q,T}^{-1}), \qquad T = V, S, G, D, \tag{355}
$$
where $q_T$ is the subset of all EOPs $q$ provided by the technique $T$. The data from local ties at the collocation sites have the form
$$
b_c = C_{a_V}\,\delta a_V + C_{a_S}\,\delta a_S + C_{a_G}\,\delta a_G + C_{a_D}\,\delta a_D + v_c
    = [\,C_{a_V}\; C_{a_S}\; C_{a_G}\; C_{a_D}\,]
      \begin{bmatrix} \delta a_V \\ \delta a_S \\ \delta a_G \\ \delta a_D \end{bmatrix} + v_c
    \equiv C\,\delta a + v_c, \qquad v_c \sim (0, \sigma^2 P_c^{-1}), \tag{356}
$$
with $\delta a \equiv BC(\delta a_T)$, since there is no significant dependence on the transformation parameters $z_T$. The indispensable contribution from the tie observations to the normal equations, which must always be included, is given by $N_c\,\delta a = u_c$, where
$$
N_c = \begin{bmatrix}
C_{a_V}^T P_c C_{a_V} & C_{a_V}^T P_c C_{a_S} & C_{a_V}^T P_c C_{a_G} & C_{a_V}^T P_c C_{a_D} \\
C_{a_S}^T P_c C_{a_V} & C_{a_S}^T P_c C_{a_S} & C_{a_S}^T P_c C_{a_G} & C_{a_S}^T P_c C_{a_D} \\
C_{a_G}^T P_c C_{a_V} & C_{a_G}^T P_c C_{a_S} & C_{a_G}^T P_c C_{a_G} & C_{a_G}^T P_c C_{a_D} \\
C_{a_D}^T P_c C_{a_V} & C_{a_D}^T P_c C_{a_S} & C_{a_D}^T P_c C_{a_G} & C_{a_D}^T P_c C_{a_D}
\end{bmatrix}, \tag{357}
$$
$$
u_c = BC(u_{c,T}) = BC(C_{a_T}^T P_c\,b_c). \tag{358}
$$

16.1 Combination Without Taking EOP Data into Consideration

Let us look first into the approach where EOP time series are not taken into account and the only data are coordinate time series and the tie observations at collocation sites. We are mainly interested in station initial coordinates and velocities, while transformation parameters are nuisance parameters. We can proceed in two ways, applying either Proposition 3 or Proposition 5. In the first case we can add the reduced normal equations from each technique, after the nuisance transformation parameters have been eliminated, as well as the normal equations from the tie observations, in order to produce the joint normal equations as they would result after the elimination of the nuisance parameters. The original normal equations from each technique are
$$
\begin{bmatrix} N_{a_T} & N_{a_T z_T} \\ N_{a_T z_T}^T & N_{z_T} \end{bmatrix}
\begin{bmatrix} \delta\hat a_T \\ \delta\hat z_T \end{bmatrix}
= \begin{bmatrix} u_{a_T} \\ u_{z_T} \end{bmatrix}. \tag{359}
$$

The transformation parameters $\delta\hat z_T$ can be eliminated by solving the second of the normal equations for
$$
\delta\hat z_T = N_{z_T}^{-1} u_{z_T} - N_{z_T}^{-1} N_{a_T z_T}^T\,\delta\hat a_T, \tag{360}
$$
and replacing in the first one, thus arriving at the reduced normal equations
$$
\bar N_{a_T}\,\delta\hat a_T = \bar u_{a_T}, \qquad T = V, S, G, D, \tag{361}
$$

where $\bar N_{a_T} = N_{a_T} - N_{a_T z_T} N_{z_T}^{-1} N_{a_T z_T}^T$ and $\bar u_{a_T} = u_{a_T} - N_{a_T z_T} N_{z_T}^{-1} u_{z_T}$. If these are added to the normal equations for the tie observations, we obtain the reduced joint normal equations
$$
(N_c + \bar N_a)\,\delta\hat a = u_c + \bar u_a, \tag{362}
$$

where $\bar N_a = BD(\bar N_{a_T})$, $\delta\hat a = BC(\delta\hat a_T)$, $\bar u_a = BC(\bar u_{a_T})$. These can be solved with the addition of minimal constraints. The second possible approach, which produces exactly the same results, is based on the application of Proposition 5. The separate reduced normal equations of each technique (361) are solved with the use of minimal constraints, which may be different in each technique, producing estimates $\delta\hat a_{V|V}$, $\delta\hat a_{S|S}$, $\delta\hat a_{G|G}$, $\delta\hat a_{D|D}$. The notation $\delta\hat a_{T|T}$ emphasizes the fact that these are estimates of $\delta a_T$ based on data from only the space technique $T$, thus reserving the notation $\delta\hat a_T$ for the estimates obtained when data from all techniques are combined. These are used as pseudo-observations with weight matrices $\bar N_{a_V}$, $\bar N_{a_S}$, $\bar N_{a_G}$, $\bar N_{a_D}$, respectively, together with the tie observations, to produce the reduced joint normal equations
$$
(\bar N_a + N_c)\,\delta\hat a = u_c + \bar N_a\,BC(\delta\hat a_{T|T}) = u_c + BC(\bar N_{a_T}\,\delta\hat a_{T|T}), \tag{363}
$$
which, in view of the relations $\bar u_{a_T} = \bar N_{a_T}\,\delta\hat a_{T|T}$, are the same as in the previous approach. Once the optimal values of the parameters $\delta\hat a_T$ have been determined by solving the above normal equations using minimal constraints $C_a^T\,\delta a = d_a$, the estimates of the transformation parameters follow from

$$
\delta\hat z_T = N_{z_T}^{-1}\left(u_{z_T} - N_{a_T z_T}^T\,\delta\hat a_T\right), \tag{364}
$$
with related covariance factor matrices (see Eqs. 311)
$$
Q_{\hat a\hat z} = -Q_{\hat a}\,N_{az}\,N_z^{-1}, \qquad
Q_{\hat z} = N_z^{-1} - N_z^{-1} N_{az}^T\,Q_{\hat a\hat z}. \tag{365}
$$
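The elimination-and-addition recipe of Eqs. (359) to (363) is a Schur complement per technique followed by a sum of the reduced contributions; a minimal numpy sketch with random synthetic blocks (sizes and data are purely illustrative, and the synthetic joint matrix happens to be regular, whereas the real ITRF case is rank-deficient and needs minimal constraints):

```python
import numpy as np

rng = np.random.default_rng(0)

def reduced_normals(N_aa, N_az, N_zz, u_a, u_z):
    """Schur complement of Eq. (361): eliminate nuisance parameters z."""
    K = N_az @ np.linalg.inv(N_zz)
    return N_aa - K @ N_az.T, u_a - K @ u_z

# Two "techniques" with 4 station parameters and 2 nuisance parameters each.
N_joint = np.zeros((4, 4))
u_joint = np.zeros(4)
for _ in range(2):
    A = rng.standard_normal((10, 4))   # design of station parameters
    Z = rng.standard_normal((10, 2))   # design of transformation parameters
    b = rng.standard_normal(10)
    Nbar, ubar = reduced_normals(A.T @ A, A.T @ Z, Z.T @ Z, A.T @ b, Z.T @ b)
    N_joint += Nbar                    # addition of reduced normal equations
    u_joint += ubar

da = np.linalg.solve(N_joint, u_joint)
```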

The EOP data from the four techniques can be combined a posteriori, thus preventing their questionable nature from influencing the estimates of the initial coordinates and velocities included in $\delta\hat a$. The relevant model is
$$
\tilde b_{q,T} \equiv b_{q,T} - T_T\,\delta\hat z_T = q_T + e_{q,T}, \qquad e_{q,T} \sim (0, \sigma^2 P_{q,T}^{-1}), \qquad T = V, S, G, D, \tag{366}
$$
where $\tilde b_{q,T}$ are the EOP data transformed into the new reference system common to all techniques, utilizing the values $\delta\hat z_T$ estimated in the first step. The correlation between $b_{q,T}$ and $\delta\hat z_T$, stemming from the one between $b_{q,T}$ and the coordinate time series data $b_T$, is ignored. The same is true for the covariance matrix of $\delta\hat z_T$, which are treated as constants in the EOP combination. It should, however, be taken into account in the covariance propagation for the computation of the covariance factor matrix $Q_{\hat q}$ of the EOP estimates $\hat q$ resulting from the combination. The situation is quite similar to that of the so-called "stations with weights" in classical densification networks, where the officially adopted coordinates of stations from a higher-order network were necessarily fixed in the adjustment of the new network, but their uncertainty was taken into account when computing the covariance matrix of the new station coordinates. Even today this approach may be useful for relating a local network to the reference system of the ITRF network (i.e., the ITRS), without altering the coordinates and velocities of the included ITRF stations. Since not all EOPs $q$ are included in each technique, using the above-introduced participation matrix $L_T$ we obtain the observation equations
$$
\tilde b_{q,T} = L_T\,q + e_{q,T}, \qquad e_{q,T} \sim (0, \sigma^2 P_{q,T}^{-1}), \qquad T = V, S, G, D. \tag{367}
$$

The joint observation equations from all techniques are
$$
\tilde b_q = L\,q + e_q, \qquad e_q \sim (0, \sigma^2 P_q^{-1}), \tag{368}
$$
with $\tilde b_q = BC(\tilde b_{q,T})$, $L = BC(L_T)$, $e_q = BC(e_{q,T})$ and weight matrix $P_q = BD(P_{q,T})$. The corresponding normal equations are
$$
N_q\,\hat q = u_q, \tag{369}
$$
with

$$
N_q = L^T P_q L = \sum_{T=V,S,G,D} L_T^T P_{q,T} L_T, \qquad
u_q = L^T P_q \tilde b_q = \sum_{T=V,S,G,D} L_T^T P_{q,T} \tilde b_{q,T}. \tag{370}
$$
The solution $\hat q = N_q^{-1} u_q$ provides the desired final EOP estimates. The matrix $N_q$ is non-singular, as can easily be seen by considering the case where all techniques observe all EOPs, in which case $L_T = I$. The corresponding covariance factor matrix, taking into account the uncertainty $Q_{\delta\hat z}$ of the fixed transformation parameters $\delta\hat z$, is given by
$$
Q_{\hat q} = N_q^{-1} L^T P_q\,(I + T\,Q_{\delta\hat z}\,T^T P_q)\,L\,N_q^{-1}, \tag{371}
$$

where $T = BD(T_T)$. If desired, a more rigorous approach can be followed, taking into account the correlation between $b_{q,T}$ and $\delta\hat z_T$, which follows from the correlation between $b_{q,T}$ and $b_T$. We will not pursue this approach, since we have already departed from strict rigor for the reasons already explained. One of the advantages of this two-step approach is that in the second step of the EOP combination one can model weekly estimates of EOP values and their velocities as functions of the daily EOP parameters within the week. This calls for an interpolation scheme for the daily values and the expression of the weekly value as an average of the interpolated function. For example, a direct linear regression $q_i = q(t_i) = q_k^W + (t_i - t_k)\,\dot q_k^W$ of the daily values $q_{k-3}, \ldots, q_k, \ldots, q_{k+3}$ within the week allows one to express the weekly values $q_k^W$ and their velocities $\dot q_k^W$, visualized as assigned to the week mid-day epoch $t_k$, as the functions
$$
q_k^W = \tfrac{1}{7}\,q_{k-3} + \tfrac{1}{7}\,q_{k-2} + \tfrac{1}{7}\,q_{k-1} + \tfrac{1}{7}\,q_k + \tfrac{1}{7}\,q_{k+1} + \tfrac{1}{7}\,q_{k+2} + \tfrac{1}{7}\,q_{k+3}, \tag{372}
$$
$$
\dot q_k^W = -\tfrac{3}{28h}\,q_{k-3} - \tfrac{2}{28h}\,q_{k-2} - \tfrac{1}{28h}\,q_{k-1} + 0\,q_k + \tfrac{1}{28h}\,q_{k+1} + \tfrac{2}{28h}\,q_{k+2} + \tfrac{3}{28h}\,q_{k+3}, \tag{373}
$$
where $h$ is the length of one day in the adopted units. Note that in this case the matrices $L_T$ are no longer "inflation" matrices with elements one or zero, but proper design matrices derived from interpolation models such as the one above.
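Equations (372) and (373) are simply the least-squares line through seven equally spaced daily values; the closed-form coefficients can be verified numerically (the daily values below are synthetic, with h = 1 day):

```python
import numpy as np

h = 1.0                                  # length of one day in adopted units
t = h * np.arange(-3, 4)                 # t_i - t_k for the 7 days of a week
q = np.array([1.0, 1.2, 1.1, 1.4, 1.3, 1.6, 1.5])  # synthetic daily EOPs

# Least-squares line q(t) = qW + t * qW_dot through the daily values
A = np.column_stack([np.ones(7), t])
qW, qW_dot = np.linalg.lstsq(A, q, rcond=None)[0]

# Same result from the closed-form coefficients of Eqs. (372)-(373):
# equal weights 1/7 for the value, weights (-3,...,3)/(28h) for the rate
qW_direct = q.mean()
qW_dot_direct = (t @ q) / (28.0 * h)
assert np.isclose(qW, qW_direct) and np.isclose(qW_dot, qW_dot_direct)
```

The denominator $28h$ is just $\sum_i (t_i - t_k)^2 / h = (9+4+1+0+1+4+9)h$, which is why the rate weights come out as integers over $28h$.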

16.2 Combination Including EOP Data

When EOP time series are also included, we can still use reduced normal equations where the transformation parameters have been eliminated. The EOP parameters are no longer non-adjustable parameters, as in the case of the stacking of data from a single technique, because they appear as unknowns in the data of all the techniques. From the correlated observation equations (354) and (355), or in matrix form
$$
\begin{bmatrix} b_T \\ b_{q,T} \end{bmatrix}
= \begin{bmatrix} A_{a_T} & 0 & A_{z_T} \\ 0 & I & T_T \end{bmatrix}
\begin{bmatrix} \delta a_T \\ q_T \\ \delta z_T \end{bmatrix}
+ \begin{bmatrix} e_T \\ e_{q,T} \end{bmatrix}, \tag{374}
$$
and their corresponding weight matrix $\begin{bmatrix} P_{b_T} & P_{b_T q_T} \\ P_{b_T q_T}^T & P_{q_T} \end{bmatrix}$, it follows that the normal equations for the correlated coordinate and EOP time series data of each technique $T = V, S, G, D$ have the form
$$
\begin{bmatrix}
N_{\delta a_T} & N_{\delta a_T q_T} & N_{\delta a_T \delta z_T} \\
N_{\delta a_T q_T}^T & N_{q_T} & N_{q_T \delta z_T} \\
N_{\delta a_T \delta z_T}^T & N_{q_T \delta z_T}^T & N_{\delta z_T}
\end{bmatrix}
\begin{bmatrix} \delta\hat a_{T|T} \\ \hat q_{T|T} \\ \delta\hat z_{T|T} \end{bmatrix}
= \begin{bmatrix} u_{\delta a_T} \\ u_{q_T} \\ u_{\delta z_T} \end{bmatrix}, \tag{375}
$$

where
$$
\begin{aligned}
N_{\delta a_T} &= A_{a_T}^T P_{b_T} A_{a_T}, \qquad
N_{\delta a_T q_T} = A_{a_T}^T P_{b_T q_T}, \qquad
N_{\delta a_T \delta z_T} = A_{a_T}^T (P_{b_T} A_{z_T} + P_{b_T q_T} T_T), \\
N_{q_T} &= P_{q_T}, \qquad
N_{q_T \delta z_T} = P_{b_T q_T}^T A_{z_T} + P_{q_T} T_T, \\
N_{\delta z_T} &= (A_{z_T}^T P_{b_T} + T_T^T P_{b_T q_T}^T)\,A_{z_T} + (A_{z_T}^T P_{b_T q_T} + T_T^T P_{q_T})\,T_T,
\end{aligned} \tag{376}
$$
$$
\begin{aligned}
u_{\delta a_T} &= A_{a_T}^T P_{b_T} b_T + A_{a_T}^T P_{b_T q_T} b_{q,T}, \qquad
u_{q_T} = P_{b_T q_T}^T b_T + P_{q_T} b_{q,T}, \\
u_{\delta z_T} &= (A_{z_T}^T P_{b_T} + T_T^T P_{b_T q_T}^T)\,b_T + (A_{z_T}^T P_{b_T q_T} + T_T^T P_{q_T})\,b_{q,T}.
\end{aligned} \tag{377}
$$

Here $\delta\hat a_{T|T}$, $\delta\hat z_{T|T}$, $\hat q_{T|T}$ denote the estimates of $\delta a_T$, $\delta z_T$, $q_T$, respectively, obtained using only data from technique $T$, in order to distinguish them from the respective estimates $\delta\hat a_T$, $\delta\hat z_T$, $\hat q$ obtained using data from all techniques. Notice, however, that while $\delta a_T$, $\delta z_T$ are completely different parameters in each technique $T$, the EOP parameters $q_T$ are a subset of the EOP parameters $q$ covered by all techniques. Thus $\hat q_{T|T}$ are separate estimates of $q_T$, while $\hat q$ is the joint estimate of $q$. In order to apply the addition of the normal equations, we must take into account that $q_T = L_T q$, in which case the normal equations (375) are "inflated" into
$$
\begin{bmatrix}
N_{\delta a_T} & N_{\delta a_T q_T} L_T & N_{\delta a_T \delta z_T} \\
L_T^T N_{\delta a_T q_T}^T & L_T^T N_{q_T} L_T & L_T^T N_{q_T \delta z_T} \\
N_{\delta a_T \delta z_T}^T & N_{q_T \delta z_T}^T L_T & N_{\delta z_T}
\end{bmatrix}
\begin{bmatrix} \delta\hat a_{T|T} \\ \hat q_{|T} \\ \delta\hat z_{T|T} \end{bmatrix}
= \begin{bmatrix} u_{\delta a_T} \\ L_T^T u_{q_T} \\ u_{\delta z_T} \end{bmatrix}. \tag{378}
$$


This simply means that $N_{\delta a_T q_T} L_T$ is formed from $N_{\delta a_T q_T}$ by inserting zero columns at the slots of the missing EOPs, $L_T^T N_{q_T \delta z_T}$ is formed from $N_{q_T \delta z_T}$ by inserting zero rows at the slots of the missing EOPs, while $L_T^T N_{q_T} L_T$ is formed from $N_{q_T}$ by inserting both zero columns and zero rows at the same slots. The addition of the normal equations results in
$$
\begin{bmatrix}
N_{\delta a} + N_c & N_{\delta a,q} & N_{\delta a,\delta z} \\
N_{\delta a,q}^T & N_q & N_{q,\delta z} \\
N_{\delta a,\delta z}^T & N_{q,\delta z}^T & N_{\delta z}
\end{bmatrix}
\begin{bmatrix} \delta\hat a \\ \hat q \\ \delta\hat z \end{bmatrix}
= \begin{bmatrix} u_{\delta a} + u_c \\ u_q \\ u_{\delta z} \end{bmatrix}, \tag{379}
$$
where
$$
\begin{aligned}
&\delta\hat a = BC(\delta\hat a_T), \qquad \delta\hat z = BC(\delta\hat z_T), \qquad
u_{\delta a} = BC(u_{\delta a_T}), \qquad u_{\delta z} = BC(u_{\delta z_T}), \\
&N_{\delta a} = BD(N_{\delta a_T}), \qquad N_{\delta a,\delta z} = BD(N_{\delta a_T \delta z_T}), \qquad
N_{\delta z} = BD(N_{\delta z_T}), \\
&N_{\delta a,q} = BC(N_{\delta a_T q_T} L_T), \qquad
N_{q,\delta z} = BR(L_T^T N_{q_T \delta z_T}), \\
&N_q = \sum_{T=V,S,G,D} L_T^T N_{q_T} L_T, \qquad
u_q = \sum_{T=V,S,G,D} L_T^T u_{q_T}.
\end{aligned} \tag{380}
$$
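The "inflation" used in Eqs. (378) to (380) amounts to scattering a small matrix into the slots of the full EOP vector, which is exactly multiplication by the participation matrix; a numpy sketch (matrix entries are illustrative):

```python
import numpy as np

def inflate(N_small, observed, n_total):
    """L^T N L: insert zero rows/columns at the slots of missing EOPs."""
    L = np.eye(n_total)[observed, :]      # participation matrix
    return L.T @ N_small @ L

# Hypothetical technique observing EOPs 0 and 2 out of 4 in total.
N_qT = np.array([[4.0, 1.0],
                 [1.0, 3.0]])
N_full = inflate(N_qT, [0, 2], 4)
# Rows/columns 1 and 3 of N_full are zero; the entries of N_qT land at
# positions (0,0), (0,2), (2,0) and (2,2), so inflated contributions from
# different techniques can simply be added.
```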

Since the transformation parameters are nuisance parameters, it is convenient to eliminate them from the normal equations, following either Proposition 3 or 5. If Proposition 3 is followed, we must eliminate $\delta z_T$ from the normal equations (375) before inflation and then add the resulting reduced normal equations inflated, plus the ones from the tie observations. The reduced equations become
$$
\begin{bmatrix}
\bar N_{\delta a_T} & \bar N_{\delta a_T q_T} \\
\bar N_{\delta a_T q_T}^T & \bar N_{q_T}
\end{bmatrix}
\begin{bmatrix} \delta\hat a_{T|T} \\ \hat q_{T|T} \end{bmatrix}
= \begin{bmatrix} \bar u_{\delta a_T} \\ \bar u_{q_T} \end{bmatrix}, \tag{381}
$$
where
$$
\begin{aligned}
\bar N_{\delta a_T} &= N_{\delta a_T} - N_{\delta a_T \delta z_T} N_{\delta z_T}^{-1} N_{\delta a_T \delta z_T}^T, \\
\bar N_{\delta a_T q_T} &= N_{\delta a_T q_T} - N_{\delta a_T \delta z_T} N_{\delta z_T}^{-1} N_{q_T \delta z_T}^T, \\
\bar N_{q_T} &= N_{q_T} - N_{q_T \delta z_T} N_{\delta z_T}^{-1} N_{q_T \delta z_T}^T, \\
\bar u_{\delta a_T} &= u_{\delta a_T} - N_{\delta a_T \delta z_T} N_{\delta z_T}^{-1} u_{\delta z_T}, \qquad
\bar u_{q_T} = u_{q_T} - N_{q_T \delta z_T} N_{\delta z_T}^{-1} u_{\delta z_T}.
\end{aligned} \tag{382}
$$

The reduced equations (381) must first be inflated into
$$
\begin{bmatrix}
\bar N_{\delta a_T} & \bar N_{\delta a_T q_T} L_T \\
L_T^T \bar N_{\delta a_T q_T}^T & L_T^T \bar N_{q_T} L_T
\end{bmatrix}
\begin{bmatrix} \delta\hat a_{T|T} \\ \hat q_{|T} \end{bmatrix}
= \begin{bmatrix} \bar u_{\delta a_T} \\ L_T^T \bar u_{q_T} \end{bmatrix}, \tag{383}
$$


and then added together with the ones from the tie observations, to produce the joint reduced normal equations
$$
\begin{bmatrix}
\bar N_{\delta a} + N_c & \bar N_{\delta a,q} \\
\bar N_{\delta a,q}^T & \bar N_q
\end{bmatrix}
\begin{bmatrix} \delta\hat a \\ \hat q \end{bmatrix}
= \begin{bmatrix} \bar u_{\delta a} + u_c \\ \bar u_q \end{bmatrix}, \tag{384}
$$
where
$$
\bar N_{\delta a} = BD(\bar N_{\delta a_T}), \qquad
\bar N_{\delta a,q} = BC(\bar N_{\delta a_T q_T} L_T), \qquad
\bar N_q = \sum_{T=V,S,G,D} L_T^T \bar N_{q_T} L_T, \qquad
\bar u_q = \sum_{T=V,S,G,D} L_T^T \bar u_{q_T}. \tag{385}
$$

If desired, the estimates of the transformation parameters can be derived according to Eq. (277), which in this case becomes
$$
\delta\hat z_T = N_{\delta z_T}^{-1}\left(u_{\delta z_T} - N_{\delta a_T \delta z_T}^T\,\delta\hat a_T - N_{q_T \delta z_T}^T L_T\,\hat q\right). \tag{386}
$$
The related covariance matrices can be sequentially computed (see Eqs. 312) as
$$
\begin{aligned}
Q_{\delta\hat a,\delta\hat z} &= -(Q_{\delta\hat a}\,N_{\delta a,\delta z} + Q_{\delta\hat a,\hat q}\,N_{q,\delta z})\,N_{\delta z}^{-1}, \\
Q_{\hat q,\delta\hat z} &= -(Q_{\delta\hat a,\hat q}^T\,N_{\delta a,\delta z} + Q_{\hat q}\,N_{q,\delta z})\,N_{\delta z}^{-1}, \\
Q_{\delta\hat z} &= N_{\delta z}^{-1} - N_{\delta z}^{-1}\,(N_{\delta a,\delta z}^T\,Q_{\delta\hat a,\delta\hat z} + N_{q,\delta z}^T\,Q_{\hat q,\delta\hat z}).
\end{aligned} \tag{387}
$$

Much simpler is the approach suggested by Proposition 5. The separate estimates $\delta\hat a_{T|T}$, $\hat q_{T|T}$ from the reduced normal equations of each technique (after eliminating the transformation parameters $\delta z_T$) can be used as pseudo-observations, with the reduced normal coefficient matrices as weights. The normal equations of each technique have already been given by (381) and (382). The pseudo-observation equations from each technique $T$ are therefore
$$
\delta\hat a_{T|T} = \delta a_T + e_{\delta\hat a_{T|T}}, \tag{388}
$$
$$
\hat q_{T|T} = q_T + e_{\hat q_{T|T}} = L_T\,q + e_{\hat q_{T|T}}, \tag{389}
$$
with weight matrix $\begin{bmatrix} \bar N_{\delta a_T} & \bar N_{\delta a_T q_T} \\ \bar N_{\delta a_T q_T}^T & \bar N_{q_T} \end{bmatrix}$. Accordingly, their contribution to the joint normal equations is
$$
\begin{bmatrix}
\bar N_{\delta a_T} & \bar N_{\delta a_T q_T} L_T \\
L_T^T \bar N_{\delta a_T q_T}^T & L_T^T \bar N_{q_T} L_T
\end{bmatrix}
\begin{bmatrix} \delta\hat a_T \\ \hat q \end{bmatrix}
= \begin{bmatrix}
\bar N_{\delta a_T}\,\delta\hat a_{T|T} + \bar N_{\delta a_T q_T}\,\hat q_{T|T} \\
L_T^T \bar N_{\delta a_T q_T}^T\,\delta\hat a_{T|T} + L_T^T \bar N_{q_T}\,\hat q_{T|T}
\end{bmatrix}. \tag{390}
$$


Adding the above contributions from all techniques and the contribution from the tie observations, the reduced joint normal equations become
$$
\begin{bmatrix}
\bar N_{\delta a} & \bar N_{\delta a,q} \\
\bar N_{\delta a,q}^T & \bar N_q
\end{bmatrix}
\begin{bmatrix} \delta\hat a \\ \hat q \end{bmatrix}
= \begin{bmatrix} \tilde u_{\delta a} \\ \tilde u_q \end{bmatrix}, \tag{391}
$$
where
$$
\begin{aligned}
\delta\hat a &= BC(\delta\hat a_T), \qquad
\bar N_{\delta a} = BD(\bar N_{\delta a_T}), \qquad
\bar N_{\delta a,q} = BC(\bar N_{\delta a_T q_T} L_T), \qquad
\bar N_q = \sum_T L_T^T \bar N_{q_T} L_T, \\
\tilde u_{\delta a} &= BC(\bar N_{\delta a_T}\,\delta\hat a_{T|T} + \bar N_{\delta a_T q_T}\,\hat q_{T|T}), \qquad
\tilde u_q = \sum_T \left(L_T^T \bar N_{\delta a_T q_T}^T\,\delta\hat a_{T|T} + L_T^T \bar N_{q_T}\,\hat q_{T|T}\right).
\end{aligned} \tag{392}
$$
The advantage of this approach is that we may obtain the per-technique estimates $\delta\hat a_{T|T}$, $\hat q_{T|T}$ in any convenient way, and not necessarily by solving (through minimal constraints) the reduced normal equations (381). Thus we may take advantage of the fact that, for a single technique, the EOP time series are non-adjustable parameters, and proceed as described in Sects. 13.3 and 15, utilizing a reduced weight matrix.

17 ITRF Formulation: The Combination of Separate Estimates from Space Techniques in the Case of Non-singular Covariance Matrices

We have seen in the previous chapter that estimates from the separate stackings of each space technique can be used as direct pseudo-observations of the corresponding unknowns, ignoring completely the fact that the estimates refer to a different reference system for each technique, which is also different from the reference system to which the ITRF parameters will finally refer. This seemingly arbitrary assumption is legitimate, provided that one uses as weight matrices the coefficient matrices of the normal equations of each technique and (most important) that these matrices have been rigorously computed on the basis of the data at hand, and thus share the rank deficiencies due to the lack of reference system information in the performed observations. When this is not the case, and the normal equation matrices have no rank defects at all or have rank defects different from those assumed on a theoretical basis, one must include in the data analysis model transformation parameters connecting the spatiotemporal reference system of each space technique to that of the ITRF. The same holds true if, for some reason, positive-definite weight matrices are used other than the coefficient matrices of the per-technique normal equations.


In the second combination step, the input data are the estimates obtained from the stacking of each technique $T = V, S, G, D$, namely the initial coordinates and velocities $\hat a_{T|T} = \begin{bmatrix} \delta\hat x_{T|T} \\ \delta\hat v_{T|T} \end{bmatrix}$ and the stacked EOPs $\hat q_{T|T}$, containing the time series $\hat q_{k|T}$ for all epochs $t_k$, $k = 1, 2, \ldots, m$. Each of these refers to a proper reference system established in the first step through the use of minimal constraints. In order to relate them to the corresponding combined estimates $\hat a_T$, $\hat q_k$, referring to a final common ITRF reference system, we need the transformation laws from one reference system to the other. In principle, the transformation to the technique-$T$ coordinates $x_{i|T}(t)$ from those of the ITRF $x_i(t)$ has the general form
$$
x_{i|T}(t) = [1 + s_T(t)]\,R(\theta_T(t))\,x_i(t) + d_T(t)
\approx x_i(t) + s_T(t)\,x_i^{ap}(t) + [x_i^{ap}(t)\times]\,\theta_T(t) + d_T(t), \tag{393}
$$
with $x_i^{ap}(t) = x_{0i}^{ap} + (t - t_0)\,v_i^{ap}$, assuming common approximate values, where one may in principle incorporate arbitrary continuous transformation functions $\theta_T(t)$, $d_T(t)$, $s_T(t)$. The need, however, to maintain the linear-in-time model, at least approximately, necessitates the restriction to linear functions $\theta_T(t) = \theta_{0T} + (t - t_0)\,\dot\theta_T$, $d_T(t) = d_{0T} + (t - t_0)\,\dot d_T$, $s_T(t) = s_{0T} + (t - t_0)\,\dot s_T$. In this case the $(t - t_0)^2$ terms are negligibly small and the transformation law for initial coordinates and velocities takes the familiar form
$$
\begin{aligned}
\hat x_{0i,T|T} &= x_{0i} + s_{0T}\,x_{0i}^{ap} + [x_{0i}^{ap}\times]\,\theta_{0T} + d_{0T} + e_{x_{0i},T}
= x_{0i} + E_i\,p_{0T} + e_{x_{0i},T}, \\
\hat v_{i,T|T} &= v_i + \dot s_T\,x_{0i}^{ap} + [x_{0i}^{ap}\times]\,\dot\theta_T + \dot d_T + e_{v_i,T}
= v_i + E_i\,\dot p_T + e_{v_i,T},
\end{aligned} \tag{394}
$$
or collectively $\hat x_{0T|T} = x_{0T} + E\,p_{0T} + e_{x_{0T}}$, $\hat v_{T|T} = v_T + E\,\dot p_T + e_{v_T}$, and jointly
$$
\hat a_{T|T} = \begin{bmatrix} \hat x_{0T|T} \\ \hat v_{T|T} \end{bmatrix}
= \begin{bmatrix} x_{0T} + E\,p_{0T} + e_{x_{0T}} \\ v_T + E\,\dot p_T + e_{v_T} \end{bmatrix}
= \begin{bmatrix} x_{0T} \\ v_T \end{bmatrix}
+ \begin{bmatrix} E & 0 \\ 0 & E \end{bmatrix}
  \begin{bmatrix} p_{0T} \\ \dot p_T \end{bmatrix}
+ \begin{bmatrix} e_{x_{0T}} \\ e_{v_T} \end{bmatrix}
\equiv a_T + E_a\,p_T + e_{a_T}. \tag{395}
$$
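The linearization in Eqs. (393) to (395) can be checked numerically: for small rotation angles, scale and translation, the exact similarity transformation and its linearized form agree far below measurement accuracy (all numbers below are synthetic stand-ins):

```python
import numpy as np

def cross_matrix(v):
    """[v x]: skew-symmetric matrix of the cross product."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

x = np.array([4.0e6, 3.0e6, 3.5e6])     # approximate station position [m]
theta = np.array([2e-8, -1e-8, 3e-8])   # small rotation angles [rad]
s = 1.5e-9                              # small scale factor
d = np.array([0.004, -0.002, 0.001])    # small translation [m]

# "Exact": x' = (1+s) R(theta) x + d, with R ~ I - [theta x] for small angles
R = np.eye(3) - cross_matrix(theta)
x_exact = (1 + s) * (R @ x) + d

# Linearized, Eq. (393): x' ~ x + s*x + [x x] theta + d
x_lin = x + s * x + cross_matrix(x) @ theta + d

assert np.allclose(x_exact, x_lin, atol=1e-6)   # sub-micrometer agreement
```

The only neglected term is $s\,[x\times]\theta$, which for realistic frame parameters is far below a nanometer.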

The transformation parameters $z_{k,T|T}$, from the reference system of the technique $T$ to the one of epoch $t_k$, must be related to the corresponding values $z_k$, from the final system of the ITRF to that of epoch $t_k$. As we have already seen (see Table 1), the transformation of EOP time series under a change of the reference system has the form $q_{k,T} = q_k + T_0\,p_T(t_k)$, which in the present case, where $p_T(t_k) = p_{0T} + (t_k - t_0)\,\dot p_T$, takes the form $q_{k,T} = q_k + T_0\,p_{0T} + (t_k - t_0)\,T_0\,\dot p_T$. Here $q_{k,T}$ are the EOPs of the epoch $t_k$ expressed in the reference system of technique $T$, and $q_k$ the same quantities expressed in the final ITRF reference system. If derivatives of EOPs have in some way already been transformed into the spatiotemporal system of the separate stacking of each technique, they will obviously transform according to $\hat{\dot q}_k = \dot q_k + T_0\,\dot p(t_k) = \dot q_k + T_0\,\dot p_T$. Collectively, the estimates of the stacking of each technique $\hat q_{T|T}$ for all epochs can be expressed as functions of the same subset of the EOPs $q_T$ through
$$
\hat q_{T|T} =
\begin{bmatrix} \vdots \\ \hat q_{k|T} \\ \vdots \\ \hat{\dot q}_{k|T} \\ \vdots \end{bmatrix}
=
\begin{bmatrix} \vdots \\ q_k \\ \vdots \\ \dot q_k \\ \vdots \end{bmatrix}
+
\begin{bmatrix} \vdots & \vdots \\ T_0 & (t_k - t_0)\,T_0 \\ \vdots & \vdots \\ 0 & T_0 \\ \vdots & \vdots \end{bmatrix}
\begin{bmatrix} p_{0T} \\ \dot p_T \end{bmatrix}
+
\begin{bmatrix} \vdots \\ e_{q_k,T} \\ \vdots \\ e_{\dot q_k,T} \\ \vdots \end{bmatrix}
= q_T + E_q\,p_T + e_{q_T}, \qquad T = V, S, G, D. \tag{396}
$$

Here $q_T$ is the subset of all the EOP data that are contained in technique $T$, expressed in the ITRF reference system, while $\hat q_{T|T}$ are the estimates of $q_T$ provided by the stacking of technique $T$ only, expressed in the corresponding reference system. As we have seen, the relation to all EOP time series $q$, obtained from one or more techniques, is expressed by $q_T = L_T q$, where $L_T$ is the already introduced participation matrix. The transformation parameters $z_{k,T}$ transform the reference system of a specific technique $T$ to the one at epoch $t_k$. The transformation parameters $p_T(t) = p_{0T} + (t - t_0)\,\dot p_T$ transform the reference system of the ITRF to that of the specific technique $T$. Therefore the transformation parameters $z_k$, which transform the reference system of the ITRF to that of epoch $t_k$, will be given by
$$
\begin{aligned}
z_{k,T|T} &= z_k + p_T(t_k) = z_k + p_{0T} + (t_k - t_0)\,\dot p_T, \\
\hat z_{k,T|T} &= z_k + p_T(t_k) + e_{\hat z_{k,T|T}} = z_k + p_{0T} + (t_k - t_0)\,\dot p_T + e_{\hat z_{k,T|T}},
\end{aligned} \tag{397}
$$
or jointly
$$
\hat z_{T|T} =
\begin{bmatrix} \vdots \\ \hat z_{k,T|T} \\ \vdots \end{bmatrix}
=
\begin{bmatrix} \vdots \\ z_k \\ \vdots \end{bmatrix}
+
\begin{bmatrix} \vdots & \vdots \\ I & (t_k - t_0)\,I \\ \vdots & \vdots \end{bmatrix}
\begin{bmatrix} p_{0T} \\ \dot p_T \end{bmatrix}
+ e_{z_T}
\equiv z_T + E_z\,p_T + e_{z_T}. \tag{398}
$$


The total pseudo-observation equations are
$$
b_T = \hat x_{T|T} = \begin{bmatrix} \hat a_{T|T} \\ \hat q_{T|T} \\ \hat z_{T|T} \end{bmatrix}
= \begin{bmatrix} a_T + E_a\,p_T + e_{a_T} \\ L_T\,q + E_q\,p_T + e_{q_T} \\ z_T + E_z\,p_T + e_{z_T} \end{bmatrix}
= \begin{bmatrix} I & 0 & 0 \\ 0 & L_T & 0 \\ 0 & 0 & I \end{bmatrix}
  \begin{bmatrix} a_T \\ q \\ z_T \end{bmatrix}
+ \begin{bmatrix} E_a \\ E_q \\ E_z \end{bmatrix} p_T
+ \begin{bmatrix} e_{a_T} \\ e_{q_T} \\ e_{z_T} \end{bmatrix}
= A_T\,x_T + E_T\,p_T + e_T, \qquad T = V, S, G, D, \tag{399}
$$
where $x_T = \begin{bmatrix} a_T^T & q^T & z_T^T \end{bmatrix}^T$ stands for the unknowns which are also present in the separate solutions. These estimates are accompanied by their normal equation matrix $N_T$, which, in view of its rank deficiency, satisfies
$$
N_T E_T =
\begin{bmatrix}
N_{a_T} & N_{a_T q_T} & N_{a_T z_T} \\
N_{a_T q_T}^T & N_{q_T} & N_{q_T z_T} \\
N_{a_T z_T}^T & N_{q_T z_T}^T & N_{z_T}
\end{bmatrix}
\begin{bmatrix} E_a \\ E_q \\ E_z \end{bmatrix} = 0, \qquad E_T^T N_T = 0. \tag{400}
$$
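The degeneracy that Eq. (400) causes in the subsequent normal equations can be reproduced with a toy example: build a symmetric weight matrix N with a prescribed null-space matrix E, and the blocks involving the transformation parameters vanish identically (all matrices below are random stand-ins, not real technique data):

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 8, 2
E = rng.standard_normal((n, r))            # prescribed rank-defect directions

# Symmetric PSD weight matrix with N E = 0: project a random SPD W off span(E)
P = np.eye(n) - E @ np.linalg.solve(E.T @ E, E.T)   # projector onto E-perp
W = rng.standard_normal((n, n))
N = P.T @ (W @ W.T) @ P                             # now N @ E == 0

A = rng.standard_normal((n, 5))            # design matrix of the unknowns x_T
N_xp = A.T @ N @ E                          # cross block of Eq. (402)
N_p = E.T @ N @ E                           # p-block of Eq. (402)
# Both vanish, so the equations degenerate to (A^T N A) x = A^T N b, 0*p = 0.
```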

The normal equations formed on the basis of the observation equations (399) are
$$
\begin{bmatrix} N_{x_T} & N_{x_T p_T} \\ N_{x_T p_T}^T & N_{p_T} \end{bmatrix}
\begin{bmatrix} \hat x_T \\ \hat p_T \end{bmatrix}
= \begin{bmatrix} u_{x_T} \\ u_{p_T} \end{bmatrix}, \tag{401}
$$
where
$$
N_{x_T} = A_T^T N_T A_T, \qquad
N_{x_T p_T} = A_T^T N_T E_T, \qquad
N_{p_T} = E_T^T N_T E_T, \qquad
u_{x_T} = A_T^T N_T b_T, \qquad
u_{p_T} = E_T^T N_T b_T. \tag{402}
$$

In view of $E_T^T N_T = 0$, Eq. (400), it holds that $N_{x_T p_T} = 0$, $N_{p_T} = 0$, and the above normal equations per technique degenerate into the two equations $(A_T^T N_T A_T)\,\hat x_T = A_T^T N_T b_T$ and $0\,\hat p_T = 0$ [32]. This simply means that no additional transformation parameters $p_T$ (from the joint ITRF to the already determined ITRF of each technique) can be recovered when the rigorous coefficient matrices of the normal equations per technique $N_T$ are used as weight matrices. In the rigorous approach, each of these matrices has a rank deficiency of 14, due to the lack of definition of the initial epoch reference system and its temporal evolution (rate). The resulting normal equations will be satisfied by any value of the transformation parameters $p_T$ whatsoever! If, however, the normal equations happen to have full rank, due to a departure from strict rigor, the relation $N_T E_T = 0$ no longer holds and the present approach can indeed be realized. The reasons for the departure from a strictly rigorous approach can be attributed either to the use of non-minimal constraints, such as the infamous loose constraints, or to the introduction of prior information on initial coordinates and velocities (e.g., estimates from a previous ITRF version) with an (incorrectly) non-singular weight matrix. In such a case, the joint normal equations follow from the addition of the ones for each technique and those from the tie observations at collocation sites. We will give first the solution for the case where EOP data are ignored, which is in our opinion the best approach to use, due to the problems related to the EOPs, as has already been explained. It is obvious from (397) that the set of pseudo-observations $\hat z_{k,T|T}$ is non-adjustable with respect to the parameters $z_k$, so that we can implement Proposition 7 and the relevant results. We can adjust the remaining pseudo-observations by simply using the reduced weight matrix. The remaining pseudo-observations are in this case the per-technique estimates of initial coordinates and velocities $\hat a_{T|T} = \begin{bmatrix} \hat x_{0,T|T} \\ \hat v_{T|T} \end{bmatrix}$, and the remaining pseudo-observation equations are simply $\hat a_{T|T} = a_T + E_a\,p_T + e_{a_T}$, as described by Eq. (395). They are related to those for $\hat z_{k,T|T}$ through the normal equations weight matrix, and the relevant reduced weight matrix becomes $\bar N_{a_T} = N_{a_T} - N_{a_T z_T} N_{z_T}^{-1} N_{a_T z_T}^T$, where
$$
N_{a_T} = \begin{bmatrix} N_{x_{0T}} & N_{x_{0T} v_T} \\ N_{x_{0T} v_T}^T & N_{v_T} \end{bmatrix}, \qquad
N_{a_T z_T} = \begin{bmatrix} N_{x_{0T} z_T} \\ N_{v_T z_T} \end{bmatrix}. \tag{403}
$$

The contribution to the normal equations from each technique becomes
$$
\begin{bmatrix}
\bar N_{a_T} & \bar N_{a_T} E_a \\
E_a^T \bar N_{a_T} & E_a^T \bar N_{a_T} E_a
\end{bmatrix}
\begin{bmatrix} \hat a_T \\ \hat p_T \end{bmatrix}
= \begin{bmatrix} \bar N_{a_T}\,\hat a_{T|T} \\ E_a^T \bar N_{a_T}\,\hat a_{T|T} \end{bmatrix}. \tag{404}
$$

Adding these contributions and the ones $N_{c,a}\,\hat a = u_{c,a}$ from the tie observations at collocation sites, we arrive at the total normal equations
$$
\begin{bmatrix} N_a + N_{c,a} & N_{ap} \\ N_{ap}^T & N_p \end{bmatrix}
\begin{bmatrix} \hat a \\ \hat p \end{bmatrix}
= \begin{bmatrix} u_a \\ u_p \end{bmatrix}, \tag{405}
$$
where
$$
\begin{aligned}
\hat a &= BC(\hat a_T), \qquad \hat p = BC(\hat p_T), \\
N_a &= BD(\bar N_{a_T}), \qquad N_{ap} = BD(\bar N_{a_T} E_a), \qquad N_p = BD(E_a^T \bar N_{a_T} E_a), \\
u_a &= BC(\bar N_{a_T}\,\hat a_{T|T}), \qquad u_p = BC(E_a^T \bar N_{a_T}\,\hat a_{T|T}).
\end{aligned} \tag{406}
$$


To solve the above normal equations, a set of minimal constraints of the general form $C_a^T a + C_p^T p = d$ must be introduced. In this case the unique solution becomes
$$
\begin{bmatrix} \hat a \\ \hat p \end{bmatrix}
= \begin{bmatrix}
N_a + N_{c,a} + C_a C_a^T & N_{ap} + C_a C_p^T \\
N_{ap}^T + C_p C_a^T & N_p + C_p C_p^T
\end{bmatrix}^{-1}
\begin{bmatrix} u_a + C_a d \\ u_p + C_p d \end{bmatrix}, \tag{407}
$$
and has covariance factor matrix
$$
\begin{aligned}
\begin{bmatrix} Q_{\hat a} & Q_{\hat a\hat p} \\ Q_{\hat a\hat p}^T & Q_{\hat p} \end{bmatrix}
&= \begin{bmatrix}
N_a + N_{c,a} + C_a C_a^T & N_{ap} + C_a C_p^T \\
N_{ap}^T + C_p C_a^T & N_p + C_p C_p^T
\end{bmatrix}^{-1}
\begin{bmatrix} N_a + N_{c,a} & N_{ap} \\ N_{ap}^T & N_p \end{bmatrix}
\begin{bmatrix}
N_a + N_{c,a} + C_a C_a^T & N_{ap} + C_a C_p^T \\
N_{ap}^T + C_p C_a^T & N_p + C_p C_p^T
\end{bmatrix}^{-1} \\
&= \begin{bmatrix}
N_a + N_{c,a} + C_a C_a^T & N_{ap} + C_a C_p^T \\
N_{ap}^T + C_p C_a^T & N_p + C_p C_p^T
\end{bmatrix}^{-1}
- \begin{bmatrix}
E_a R^{-1} E_a^T & E_a R^{-1} E_p^T \\
E_p R^{-1} E_a^T & E_p R^{-1} E_p^T
\end{bmatrix},
\end{aligned} \tag{408}
$$

where $R = (E_a^T C_a + E_p^T C_p)(C_a^T E_a + C_p^T E_p)$. The optimal estimates of the transformation parameters can be recovered, if desired, from
$$
\hat z_T = \hat z_{T|T} + N_{z_T}^{-1} N_{a_T z_T}^T\,(\hat a_{T|T} - \hat a_T) - (N_{z_T}^{-1} N_{a_T z_T}^T E_a + E_z)\,\hat p_T. \tag{409}
$$
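The minimal-constraint device of Eq. (407), adding $C C^T$ to a rank-deficient normal matrix and $C d$ to the right-hand side, can be sketched on a toy one-dimensional network whose normal matrix has the typical datum defect of a common shift; all numbers are synthetic:

```python
import numpy as np

# Toy "network": observed differences x2-x1 = 2 and x3-x2 = 3 -> defect of 1
A = np.array([[-1.0, 1.0, 0.0],
              [0.0, -1.0, 1.0]])
b = np.array([2.0, 3.0])
N, u = A.T @ A, A.T @ b                    # singular normal equations

# Minimal (inner) constraint C^T x = d, with C spanning the null space of N
C = np.ones((3, 1)) / np.sqrt(3.0)
d = np.array([0.0])                        # fix the network barycenter

x = np.linalg.solve(N + C @ C.T, u + C @ d)
# x = [-7/3, -1/3, 8/3]: satisfies both N x = u and the constraint exactly
assert np.allclose(N @ x, u)
assert np.allclose(C.T @ x, d)
```

Because the range of $N$ is orthogonal to its null space spanned by $C$, the augmented system splits exactly into $Nx = u$ and $C^T x = d$, which is why the device yields a particular least-squares solution rather than a biased one.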

A non-rigorous version, leading to a suboptimal solution, results from ignoring altogether the transformation parameter estimates $\hat z_{k,T|T}$ from the four techniques. The normal equations in this case have exactly the same form as above, with the only difference being that the unreduced weight matrix $N_{a_T}$ appears in the place of the reduced weight matrix $\bar N_{a_T}$. The EOPs can be independently combined a posteriori. The original EOP time series must first be converted to the reference system of each technique, using the updated estimates $\hat z_T$ rather than the ones $\hat z_{T|T}$ from each technique. Next they must be converted to the final ITRF reference system, utilizing the relevant transformation parameter estimates $\hat p_T$. Once they all refer to the same reference system, they can be combined as previously explained in Sect. 16.1. Despite our reservations, we include for the sake of completeness the approach where the per-technique EOP estimates $\hat q_{T|T} = L_T\,q + E_q\,p_T + e_{q_T}$ are included in the combination step. The difference in this case is that the weight matrix associated with the per-technique pseudo-observation equations and the resulting reduced weight matrix are, respectively,
$$
N_{x_T} = \begin{bmatrix}
N_{a_T} & N_{a_T q_T} & N_{a_T z_T} \\
N_{a_T q_T}^T & N_{q_T} & N_{q_T z_T} \\
N_{a_T z_T}^T & N_{q_T z_T}^T & N_{z_T}
\end{bmatrix}, \tag{410}
$$
$$
\bar N_{x_T} = \begin{bmatrix} \bar N_{a_T} & \bar N_{a_T q_T} \\ \bar N_{a_T q_T}^T & \bar N_{q_T} \end{bmatrix}
= \begin{bmatrix} N_{a_T} & N_{a_T q_T} \\ N_{a_T q_T}^T & N_{q_T} \end{bmatrix}
- \begin{bmatrix} N_{a_T z_T} \\ N_{q_T z_T} \end{bmatrix} N_{z_T}^{-1}
  \begin{bmatrix} N_{a_T z_T}^T & N_{q_T z_T}^T \end{bmatrix}
= \begin{bmatrix}
N_{a_T} - N_{a_T z_T} N_{z_T}^{-1} N_{a_T z_T}^T & N_{a_T q_T} - N_{a_T z_T} N_{z_T}^{-1} N_{q_T z_T}^T \\
N_{a_T q_T}^T - N_{q_T z_T} N_{z_T}^{-1} N_{a_T z_T}^T & N_{q_T} - N_{q_T z_T} N_{z_T}^{-1} N_{q_T z_T}^T
\end{bmatrix}. \tag{411}
$$

The normal equations from each technique are

$$
\begin{bmatrix} \bar N_{a_T} & \bar N_{a_Tq_T}L_T & \tilde N_{a_Tp_T} \\ L_T^T\bar N_{a_Tq_T}^T & L_T^T\bar N_{q_T}L_T & \tilde N_{q_Tp_T} \\ \tilde N_{a_Tp_T}^T & \tilde N_{q_Tp_T}^T & \tilde N_{p_T} \end{bmatrix}
\begin{bmatrix} \hat a_T \\ \hat q \\ \hat p_T \end{bmatrix}
= \begin{bmatrix} \tilde u_{a_T} \\ \tilde u_{q_T} \\ \tilde u_{p_T} \end{bmatrix},
\tag{412}
$$

where

$$
\tilde N_{a_Tp_T} = \bar N_{a_T}E_a + \bar N_{a_Tq_T}E_q,\qquad
\tilde N_{q_Tp_T} = L_T^T\bar N_{a_Tq_T}^TE_a + L_T^T\bar N_{q_T}E_q,
$$
$$
\tilde N_{p_T} = E_a^T(\bar N_{a_T}E_a + \bar N_{a_Tq_T}E_q) + E_q^T(\bar N_{a_Tq_T}^TE_a + \bar N_{q_T}E_q),
$$
$$
\tilde u_{a_T} = \bar N_{a_T}\hat a_{T|T} + \bar N_{a_Tq_T}\hat q_{T|T},\qquad
\tilde u_{q_T} = L_T^T\bar N_{a_Tq_T}^T\hat a_{T|T} + L_T^T\bar N_{q_T}\hat q_{T|T},
$$
$$
\tilde u_{p_T} = (E_a^T\bar N_{a_T} + E_q^T\bar N_{a_Tq_T}^T)\hat a_{T|T} + (E_a^T\bar N_{a_Tq_T} + E_q^T\bar N_{q_T})\hat q_{T|T}.
\tag{413}
$$

The joint normal equations follow from the addition of the above contributions from each technique and of the contributions $N_{c,a}\hat a_T = u_{c,a}$ from the tie observations at collocation sites. They have the form

$$
\begin{bmatrix} N_a + N_{c,a} & N_{aq} & N_{ap} \\ N_{aq}^T & N_q & N_{qp} \\ N_{ap}^T & N_{qp}^T & N_p \end{bmatrix}
\begin{bmatrix} \hat a \\ \hat q \\ \hat p \end{bmatrix}
= \begin{bmatrix} u_a + u_{c,a} \\ u_q \\ u_p \end{bmatrix},
\tag{414}
$$

where

$$
\hat a = BC(\hat a_T),\qquad \hat p = BC(\hat p_T),
$$
$$
N_a = BD(\bar N_{a_T}),\qquad
N_{aq} = BC(\bar N_{a_Tq_T}L_T),\qquad
N_{ap} = BD(\tilde N_{a_Tp_T}) = BD(\bar N_{a_T}E_a + \bar N_{a_Tq_T}E_q),
\tag{415}
$$

A. Dermanis

$$
N_q = \sum_T L_T^T\bar N_{q_T}L_T,\qquad
N_{qp} = BR(\tilde N_{q_Tp_T}) = BR(L_T^T\bar N_{a_Tq_T}^TE_a + L_T^T\bar N_{q_T}E_q),
$$
$$
N_p = BD(\tilde N_{p_T}) = BD\big(E_a^T(\bar N_{a_T}E_a + \bar N_{a_Tq_T}E_q) + E_q^T(\bar N_{a_Tq_T}^TE_a + \bar N_{q_T}E_q)\big),
\tag{416}
$$

$$
u_a = BC(\tilde u_{a_T}) = BC(\bar N_{a_T}\hat a_{T|T} + \bar N_{a_Tq_T}\hat q_{T|T}),\qquad
u_q = \sum_T\tilde u_{q_T} = \sum_T\big(L_T^T\bar N_{a_Tq_T}^T\hat a_{T|T} + L_T^T\bar N_{q_T}\hat q_{T|T}\big),
$$
$$
u_p = BC(\tilde u_{p_T}) = BC\big((E_a^T\bar N_{a_T} + E_q^T\bar N_{a_Tq_T}^T)\hat a_{T|T} + (E_a^T\bar N_{a_Tq_T} + E_q^T\bar N_{q_T})\hat q_{T|T}\big).
\tag{417}
$$

A unique solution to the normal equations can be obtained after the introduction of a set of minimal constraints of the general form $C_a^Ta + C_q^Tq + C_p^Tp = d$, following the general relations (78) and (79) or (85) and (87). Typically, constraints of the form $C_a^Ta + C_p^Tp = d$, which do not involve the EOP parameters, are implemented. It suffices to use minimal constraints $C_a^Ta = d$ involving only initial coordinates and velocities. Since any minimally constrained solution can be easily converted into a solution satisfying other desired minimal constraints, our advice is to use the partial inner constraints $E_a^Ta = 0$, with the simpler solution

$$
\begin{bmatrix} \hat a \\ \hat q \\ \hat p \end{bmatrix}
= \begin{bmatrix} N_a + N_{c,a} + E_aE_a^T & N_{aq} & N_{ap} \\ N_{aq}^T & N_q & N_{qp} \\ N_{ap}^T & N_{qp}^T & N_p \end{bmatrix}^{-1}
\begin{bmatrix} u_a + u_{c,a} \\ u_q \\ u_p \end{bmatrix},
\tag{418}
$$

and covariance factor matrices

$$
\begin{bmatrix} Q_{\hat a} & Q_{\hat a\hat q} & Q_{\hat a\hat p} \\ Q_{\hat a\hat q}^T & Q_{\hat q} & Q_{\hat q\hat p} \\ Q_{\hat a\hat p}^T & Q_{\hat q\hat p}^T & Q_{\hat p} \end{bmatrix}
= \begin{bmatrix} N_a + N_{c,a} + E_aE_a^T & N_{aq} & N_{ap} \\ N_{aq}^T & N_q & N_{qp} \\ N_{ap}^T & N_{qp}^T & N_p \end{bmatrix}^{-1}
- \begin{bmatrix} E_a(E_a^TE_a)^{-2}E_a^T & E_a(E_a^TE_a)^{-2}E_q^T & E_a(E_a^TE_a)^{-2}E_p^T \\ E_q(E_a^TE_a)^{-2}E_a^T & E_q(E_a^TE_a)^{-2}E_q^T & E_q(E_a^TE_a)^{-2}E_p^T \\ E_p(E_a^TE_a)^{-2}E_a^T & E_p(E_a^TE_a)^{-2}E_q^T & E_p(E_a^TE_a)^{-2}E_p^T \end{bmatrix}.
\tag{419}
$$

Updated estimates of the transformation parameters can be obtained if desired, using the relation

$$
\hat z_T = \hat z_{T|T} + N_{z_T}^{-1}\big[N_{a_Tz_T}^T(\hat a_{T|T} - \hat a_T) + N_{q_Tz_T}^T(\hat q_{T|T} - L_T\hat q)\big]
- \big[N_{z_T}^{-1}(N_{a_Tz_T}^TE_a + N_{q_Tz_T}^TE_q) + E_z\big]\hat p_T.
\tag{420}
$$

The related covariance matrices can be sequentially computed according to Eqs. (311), which in this case, where $a \to [a_T\ q_T\ p_T]^T$, take the form

$$
\begin{bmatrix} Q_{\hat a\hat z} \\ Q_{\hat q\hat z} \\ Q_{\hat p\hat z} \end{bmatrix}
= -\begin{bmatrix} Q_{\hat a} & Q_{\hat a\hat q} & Q_{\hat a\hat p} \\ Q_{\hat a\hat q}^T & Q_{\hat q} & Q_{\hat q\hat p} \\ Q_{\hat a\hat p}^T & Q_{\hat q\hat p}^T & Q_{\hat p} \end{bmatrix}
\begin{bmatrix} N_{az} \\ N_{qz} \\ N_{pz} \end{bmatrix} N_z^{-1},\qquad
Q_{\hat z} = N_z^{-1} - N_z^{-1}\begin{bmatrix} N_{az}^T & N_{qz}^T & N_{pz}^T \end{bmatrix}
\begin{bmatrix} Q_{\hat a\hat z} \\ Q_{\hat q\hat z} \\ Q_{\hat p\hat z} \end{bmatrix}.
\tag{421}
$$

18 ITRF Formulation: Some Remarks on the Origin and Scale of the Reference System

In our previous discussions, we examined the use of transformation parameters zk connecting the final reference system to be introduced by minimal constraints with the already established one for the station coordinates in each solution at epoch tk . We have not though specified which of the seven transformation parameters (rotation angles θ1 , θ2 , θ3 , translation components d1 , d2 , d3 , and scale parameter s) should be included in each case. They should be all included if the original observations were invariant with respect to rotations, translations and change of scale. This is not the case however. All space techniques have their own scale, i.e., their own unit of length which follows from the time unit as realized through the use of a particular set of atomic clocks. SLR involves tracking of low orbiting satellites which sense the gravity field of the earth and hence the position of the geocenter. The same is true for DORIS and GPS to a certain extent, while only VLBI observations are completely invariant to translations due to the practically infinite distance of the observed extragalactic radio sources. In computational work, however, one needs to take into consideration not only the theoretically absolute rank defects but also close to rank defect situations. In this respect, the rank deficiency indices introduced by Chatzinikos and Dermanis [21], already discussed in  Chap. 7, “Ellipsoidal-Spheroidal Representation of the Gravity Field”, provide the means for not only detecting, but also for interpreting and quantifying rank defect or close to rank defect situations. It turns out that only SLR gives reliable geocenter information, while only scale provided by VLBI and SLR is worth considering in combined solutions. We will discuss first how geocenter information should be incorporated in the simpler case of stacking an SLR coordinate time series alone. Geocentric coordinates appear to be indispensable for geophysical applications but not so


for geodetic positioning and mapping applications. To understand this, consider the extreme hypothetical case of an earth with a completely rigid lithosphere, with respect to which the position of the geocenter varies as a consequence of mass redistribution within the earth. In this case, the obvious geodetic choice is a reference system fixed with respect to the lithosphere. But even in the real situation of a deforming lithosphere it is reasonable to seek a reference system with respect to which coordinates do not demonstrate unnecessary variations, for example the one established by kinematic constraints, where the origin is either the barycenter of the geodetic network or, more generally, one with constant barycenter coordinates. On the other hand, even in geophysical applications, the call for a geocentric reference system does not mean that coordinates will follow the geocenter with its variations of relatively higher frequencies, but rather a smoothed version of the geocenter, where e.g. non-linear temporal variations are left out [43, 44, 57]. In any case, the incorporation of a linear-in-time coordinate variation model filters out such non-linear variations and does not allow them to influence the choice of the reference system, either barycentric or geocentric. The bottom line is that the choice between a geocentric and a barycentric reference system is a pseudo-problem, since coordinates can be easily converted from one to the other. Returning to the problem of establishing a geocentric reference system in the stacking of SLR coordinate time series, there are two choices. The first is to introduce only rotation and scale transformation parameters $z_k$ for each epoch $t_k$, thus allowing the geocentric reference system of each epoch to enforce a geocentric reference system in the final solution, where only orientation, scale and their rates are defined via minimal constraints. (Scale information in SLR has been ignored here for the sake of the argument.)
The second is to introduce all possible 7 transformation parameters per epoch and to establish a geocentric reference system either through appropriate minimal constraints or by a posteriori conversion. Turning to the stacking model (133), (314)-(316), we may split the term $E_iz_k$ into two terms

$$
E_iz_k = \begin{bmatrix} [x_{0i}^{ap}\times] & I_3 & x_{0i}^{ap} \end{bmatrix}
\begin{bmatrix} \theta_k \\ d_k \\ s_k \end{bmatrix}
= \begin{bmatrix} [x_{0i}^{ap}\times] & x_{0i}^{ap} \end{bmatrix}
\begin{bmatrix} \theta_k \\ s_k \end{bmatrix} + I_3d_k
\equiv E_{a,i}z_{a,k} + E_{b,i}z_{b,k},
\tag{422}
$$

so that, for all stations, the choice is between the use of the restricted observation equations

$$
b = (1_m\otimes I_{3n})\delta x_0 + (\tau\otimes I_{3n})\delta v + (I_m\otimes E_a)z_a + e
\equiv J\delta x_0 + J_t\delta v + G_az_a + e,
\tag{423}
$$

and the original extended one

$$
b = (1_m\otimes I_{3n})\delta x_0 + (\tau\otimes I_{3n})\delta v + (I_m\otimes E_a)z_a + (I_m\otimes E_b)z_b + e
\equiv J\delta x_0 + J_t\delta v + G_az_a + G_bz_b + e.
\tag{424}
$$
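The split in Eq. (422) can be made concrete with a small numerical sketch (station coordinates and parameter values below are invented for illustration): the full per-epoch design block and its restricted/extended split produce the same coordinate corrections.

```python
import numpy as np

def cross_matrix(x):
    """Skew-symmetric matrix [x×] such that cross_matrix(x) @ v = np.cross(x, v)."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

# Approximate station coordinates x0i^ap (invented values, e.g. in km)
x0 = np.array([4075.5, 931.8, 4801.6])

# Full per-epoch design block E_i = [[x0×]  I3  x0] acting on z_k = (θ_k, d_k, s_k)
E_i = np.hstack([cross_matrix(x0), np.eye(3), x0.reshape(3, 1)])

# Split of Eq. (422): E_a,i = [[x0×]  x0] for z_a,k = (θ_k, s_k),
# and E_b,i = I3 for z_b,k = d_k (the translations)
E_a = np.hstack([cross_matrix(x0), x0.reshape(3, 1)])
E_b = np.eye(3)

theta = np.array([1e-8, -2e-8, 3e-8])  # rotation angles (rad)
d = np.array([0.005, -0.002, 0.001])   # translations
s = 2e-9                               # scale

z = np.concatenate([theta, d, [s]])
z_a = np.concatenate([theta, [s]])

# The two representations agree: E_i z_k = E_a,i z_a,k + E_b,i z_b,k
assert np.allclose(E_i @ z, E_a @ z_a + E_b @ d)
```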


In general, adjustments with a restricted model $b = A_1x_1 + e$ (R) and an extended one $b = A_1x_1 + A_2x_2 + e$ (E) lead to different estimates of the estimable quantities. In the rank-deficient case the models are equivalent when the observable estimates $\hat y_R = A_1\hat x_{1(R)}$ and $\hat y_E = A_1\hat x_{1(E)} + A_2\hat x_{2(E)}$ are identical. This happens only when the columns of $A_2$ already belong to the span of the columns of $A_1$, in mathematical terms $R(A_2)\subseteq R(A_1)$. In this case $R(A_1) = R([A_1\ A_2])$, the subspace $M$ of the observables is the same, and the orthogonal projection $\hat y$ of the observations $b$ on $M$ is the same. In our case the question is whether all the columns of the matrix

$$
I_m\otimes E_b = \begin{bmatrix} E_b & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & & \vdots \\ 0 & \cdots & E_b & \cdots & 0 \\ \vdots & & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & \cdots & E_b \end{bmatrix}
\quad \big((3nm)\times(3m)\big),
\tag{425}
$$

can be expressed as linear combinations of the columns of the matrix

$$
\begin{bmatrix} 1_m\otimes I_{3n} & \tau\otimes I_{3n} & I_m\otimes E_a \end{bmatrix}
= \begin{bmatrix}
I_{3n} & (t_1-t_0)I_{3n} & E_a & \cdots & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & & \vdots \\
I_{3n} & (t_k-t_0)I_{3n} & 0 & \cdots & E_a & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots & \ddots & \vdots \\
I_{3n} & (t_m-t_0)I_{3n} & 0 & \cdots & 0 & \cdots & E_a
\end{bmatrix},
\tag{426}
$$

where

$$
E_b = \begin{bmatrix} I_3 \\ \vdots \\ I_3 \\ \vdots \\ I_3 \end{bmatrix}
\ \big((3n)\times 3\big),\qquad
E_a = \begin{bmatrix}
[x_{01}^{ap}\times] & x_{01}^{ap} \\
\vdots & \vdots \\
[x_{0i}^{ap}\times] & x_{0i}^{ap} \\
\vdots & \vdots \\
[x_{0n}^{ap}\times] & x_{0n}^{ap}
\end{bmatrix}
\ \big((3n)\times 4\big).
\tag{427}
$$
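The inclusion criterion $R(A_2)\subseteq R(A_1)$ itself is easy to test numerically: the extension adds nothing exactly when appending the columns of $A_2$ does not increase the rank. A minimal sketch with invented matrices:

```python
import numpy as np

def extends_model(A1, A2, tol=1e-10):
    """True if the extended model adds nothing to the fit, i.e. R(A2) ⊆ R(A1),
    checked via rank([A1 A2]) == rank(A1)."""
    r1 = np.linalg.matrix_rank(A1, tol=tol)
    r12 = np.linalg.matrix_rank(np.hstack([A1, A2]), tol=tol)
    return r12 == r1

A1 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [1.0, 1.0]])

A2_in = A1 @ np.array([[2.0], [3.0]])     # a column inside R(A1)
A2_out = np.array([[1.0], [2.0], [4.0]])  # not a combination of A1's columns

print(extends_model(A1, A2_in))   # restricted and extended fits agree
print(extends_model(A1, A2_out))  # the extension changes the fit
```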

Since this does not appear to be possible, we come to the conclusion that the estimated shape of the network, as expressed by the station coordinates $\hat x_i(t) = \hat x_{0i} + (t-t_0)\hat v_i$, $i = 1, 2, \dots, n$, is different in the two approaches, independently of the choice of the reference system. Let us first clarify what the rigorous approach is under the Gauss-Markov model, which is the basis of our computations. The data consist of per epoch station coordinate estimates $x(t_k)$, accompanied by weight matrices, which are none other than the coefficient matrices $N_x(t_k)$ of the normal equations formed from the actual


observations performed within the time interval corresponding to the epoch $t_k$, as already explained in Sect. 13.1, Proposition 4. The rank deficiencies of the $N_x(t_k)$ matrices correspond to the transformation characteristics (rotation, displacement, scaling) under which the primary observables are invariant (absolute deficiency) or vary only very little (close-to-deficiency situation). In any case they can be numerically detected, identified and interpreted by the above-mentioned deficiency indices. Only transformation parameters with respect to which $N_x(t_k)$ has a rank defect should be introduced in the stacking model. Thus for SLR only rotation parameters should be included, for VLBI only rotation and displacement ones, while all transformation parameters (rotation, translation and scale) should be included for GPS and DORIS. An additional requirement is that the per epoch estimates $x(t_k)$ have been correctly obtained by using minimal constraints. Let us push the argument in favor of rigor one step further. The unknown parameters of interest, i.e., the initial station coordinates and their velocities, are correctly estimated if all observations from all epochs are jointly adjusted. In this case the joint normal equations will be the sum of the per epoch normal equations, with one difference: these will refer not to single-epoch coordinates $x_i(t_k)$ but to initial values $x_{0i}$ and velocities $v_i$; a linear-in-time model should then be used within each epoch as well, instead of the constant-coordinate model actually employed. An alternative to the addition of the normal equations is to use the per epoch estimates (of $x_{0i}$ and velocities $v_i$) as pseudo-observations, with their normal equation coefficient matrices as weight matrices. In both equivalent approaches (Sect. 13.1, Proposition 4) there is no room for transformation parameters, and the whole stacking idea in its usual form should be abandoned if one wants to be formally rigorous.
This is not the case, however, for a simple reason: the rigorous approach is based upon the basic statistical assumptions of the Gauss-Markov model and is correct only when these assumptions (zero-mean errors, error covariance matrix known up to a scalar) are consistent with physical reality. This is certainly not the case here, because systematic errors do affect the performed observations. In this respect, the role of the transformation parameters in the stacking solution is twofold: on one hand, to free the coordinates $x_i(t_k)$ of the independent solutions at epochs $t_k$ from their different reference systems, and on the other hand, to absorb a part of the systematic errors, i.e., to remove an error trend that can be expressed by a combination of rotation, translation and scale. Thus there is an advantage in using the extended approach, allowing all 7 transformation parameters, and then seeking a way to convert the reference system established through minimal constraints into a geocentric one, or one that has a desired scale. As explained by Chatzinikos and Dermanis [20], different least squares solutions to the stacking problem are related by $\hat x'_{0i} = \hat x_{0i} + E_ip_0$, $\hat v'_i = \hat v_i + E_i\dot p$ and $\hat z'_k = \hat z_k - p_0 - (t_k-t_0)\dot p$ (see Propositions 1 and 2 of Chap. 7, "Ellipsoidal-Spheroidal Representation of the Gravity Field"). For comparing the extended with the restricted approach we will separate the transformation parameters $\hat z_k$ in these relations into the $\hat z_{a,k}$ that are present in both approaches and the $\hat z_{b,k}$ that are present only in the extended approach. In this case the relations between two different least squares


solutions, corresponding to a different choice of the spatiotemporal reference system, are related through

$$
\hat x'_{0i} = \hat x_{0i} + E_{a,i}p_{a,0} + E_{b,i}p_{b,0},\qquad
\hat v'_i = \hat v_i + E_{a,i}\dot p_a + E_{b,i}\dot p_b,
$$
$$
\hat z'_{a,k} = \hat z_{a,k} - p_{a,0} - (t_k-t_0)\dot p_a,\qquad
\hat z'_{b,k} = \hat z_{b,k} - p_{b,0} - (t_k-t_0)\dot p_b.
\tag{428}
$$

In the restricted approach $\hat z_{b,k} = 0$ has been enforced, thus leading to $p_{b,0} = 0$ and $\dot p_b = 0$, with allowable transformations $\hat x'_{0i} = \hat x_{0i} + E_{a,i}p_{a,0}$, $\hat v'_i = \hat v_i + E_{a,i}\dot p_a$ between different least squares solutions, corresponding to the reference system characteristics that have not been inherited from the epoch solutions $x_i(t_k)$. In the extended approach $\hat z_{b,k}\neq 0$, but the new solution $\hat x'_{0i}$, $\hat v'_i$ is selected to be the one closest to that with $\hat z'_{b,k} = 0$, i.e., the one where the resulting new transformation parameters $\hat z'_{b,k} = \hat z_{b,k} - p_{b,0} - (t_k-t_0)\dot p_b$ are collectively as small as possible. Measuring their total magnitude by $\sum_{k=1}^m\|\hat z'_{b,k}\|^2$, we can minimize them collectively by choosing as transformation parameters $p_{b,0}$, $\dot p_b$ the ones that make

$$
\sum_{k=1}^m(\hat z'_{b,k})^T(\hat z'_{b,k}) = \min_{p_{b,0},\,\dot p_b}.
$$

This is in fact a linear regression with model $\hat z_{b,k} = p_{b,0} + (t_k-t_0)\dot p_b + \hat z'_{b,k}$, which can be easily solved to provide the optimal values $\hat p_{b,0}$, $\hat{\dot p}_b$. Then the desired solution, referring to a reference system with characteristics (origin, scale) close to the desired ones, is given by $\hat x'_{0i} = \hat x_{0i} + E_{b,i}\hat p_{b,0}$, $\hat v'_i = \hat v_i + E_{b,i}\hat{\dot p}_b$, leaving the rest of the transformation parameters unchanged ($\hat z'_{a,k} = \hat z_{a,k}$). Thus the extended approach can be realized as a compromise that seeks a balance between the need for systematic error removal and the need for closeness to a solution with desired characteristics (geocentric, or having the scale of VLBI or SLR). Coming to the specific problem of converting a solution obtained by the extended approach to one as close to geocentric as possible, we have the case where $\hat z_{b,k} = \hat d_k$, $p_{b,0} = d_0$ and $\dot p_b = \dot d$. The relevant regression takes the form

$$
\hat d_k = d_0 + (t_k-t_0)\dot d + \hat d'_k,\qquad
\sum_{k=1}^m(\hat d'_k)^T\hat d'_k = \min_{d_0,\,\dot d},
\tag{429}
$$

and has the solution

$$
\hat d_0 = \frac{1}{\tau_2-\tau_1^2}\Big(\tau_2\bar d - \tau_1\frac{1}{m}\sum_k(t_k-t_0)\hat d_k\Big),\qquad
\hat{\dot d} = \frac{1}{\tau_2-\tau_1^2}\Big(-\tau_1\bar d + \frac{1}{m}\sum_k(t_k-t_0)\hat d_k\Big),
\tag{430}
$$

where $\bar d$ is the mean of the values $\hat d_k$ and $\tau_p = \frac{1}{m}\sum_k(t_k-t_0)^p$, $p = 1, 2$. The change of the reference system of the stacking solution to a "geocentric" one is realized via


$$
\hat x'_{0i} = \hat x_{0i} + \hat d_0,\qquad
\hat v'_i = \hat v_i + \hat{\dot d}.
\tag{431}
$$
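The closed-form regression (429)-(431) is an ordinary least-squares line fit; a short sketch with synthetic translation parameters (all values invented) verifies Eq. (430) against a generic polynomial fit:

```python
import numpy as np

rng = np.random.default_rng(0)

t0 = 0.0
t = np.linspace(0.0, 10.0, 41)           # epochs t_k (years)
d_true0, d_rate = 3.0, -0.4              # synthetic trend d_0 + (t_k - t_0) d_dot
d_k = d_true0 + (t - t0) * d_rate + 0.01 * rng.standard_normal(t.size)

m = t.size
tau1 = np.mean(t - t0)                   # τ1 = (1/m) Σ (t_k - t_0)
tau2 = np.mean((t - t0) ** 2)            # τ2 = (1/m) Σ (t_k - t_0)^2
d_bar = np.mean(d_k)

# Closed-form least-squares solution of Eq. (430)
d0_hat = (tau2 * d_bar - tau1 * np.mean((t - t0) * d_k)) / (tau2 - tau1 ** 2)
drate_hat = (-tau1 * d_bar + np.mean((t - t0) * d_k)) / (tau2 - tau1 ** 2)

# Cross-check against a generic degree-1 polynomial fit
slope, intercept = np.polyfit(t - t0, d_k, 1)
assert np.allclose([d0_hat, drate_hat], [intercept, slope])
```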

In the case of converting a solution of the extended approach to one that is close in scale to that of a specific technique (SLR or VLBI), we have $\hat z_{b,k} = \hat s_k$, $p_{b,0} = s_0$ and $\dot p_b = \dot s$. The relevant regression takes the form

$$
\hat s_k = s_0 + (t_k-t_0)\dot s + \hat s'_k,\qquad
\sum_{k=1}^m(\hat s'_k)^2 = \min_{s_0,\,\dot s},
\tag{432}
$$

which has the solution

$$
\hat s_0 = \frac{1}{\tau_2-\tau_1^2}\Big(\tau_2\bar s - \tau_1\frac{1}{m}\sum_k(t_k-t_0)\hat s_k\Big),\qquad
\hat{\dot s} = \frac{1}{\tau_2-\tau_1^2}\Big(-\tau_1\bar s + \frac{1}{m}\sum_k(t_k-t_0)\hat s_k\Big),
\tag{433}
$$

where $\bar s$ is the mean of the values $\hat s_k$, while $\tau_1$ and $\tau_2$ are the same as above. The change of the reference system of the stacking solution to one with scale close to that of a particular technique is realized via

$$
\hat x'_{0i} = \hat x_{0i} + \hat s_0x_{0i}^{ap},\qquad
\hat v'_i = \hat v_i + \hat{\dot s}\,x_{0i}^{ap}.
\tag{434}
$$

When all the techniques are combined to get joint estimates $\hat x_{0i}$, $\hat v_i$ and $\hat z_{k,T}$, $T = V, S, G, D$, an equally valid solution in a different reference system is given by

$$
\hat x'_{0i} = \hat x_{0i} + E_ip_0,\qquad
\hat v'_i = \hat v_i + E_i\dot p,\qquad
\hat z'_{k,T} = \hat z_{k,T} - p_0 - (t_k-t_0)\dot p.
\tag{435}
$$

A regression of the SLR displacement transformation parameters $\hat d_{k,S}$ within $\hat z_{k,S}$ will give the parameters of the linear trend $\hat d_{0,S}$, $\hat{\dot d}_S$, which lead to a geocentric solution, while a regression of the SLR scale transformation parameters $\hat s_{k,S}$ within $\hat z_{k,S}$ will give the parameters of the linear trend $\hat s_{0S}$, $\hat{\dot s}_S$, which lead to a solution adapted to the SLR scale. Applying both, we obtain the geocentric solution with SLR scale

$$
\hat x_{0i(G,S)} = \hat x_{0i} + \hat d_{0,S} + \hat s_{0S}x_i^{ap},\qquad
\hat v_{i(G,S)} = \hat v_i + \hat{\dot d}_S + \hat{\dot s}_Sx_i^{ap}.
\tag{436}
$$

Alternatively, a regression of the VLBI scale transformation parameters $\hat s_{k,V}$ within $\hat z_{k,V}$ will give the parameters of the linear trend $\hat s_{0V}$, $\hat{\dot s}_V$, which combined with $\hat d_{0,S}$, $\hat{\dot d}_S$ lead to a geocentric solution adapted to the VLBI scale

$$
\hat x_{0i(G,V)} = \hat x_{0i} + \hat d_{0,S} + \hat s_{0V}x_i^{ap},\qquad
\hat v_{i(G,V)} = \hat v_i + \hat{\dot d}_S + \hat{\dot s}_Vx_i^{ap}.
\tag{437}
$$


For an in-between solution with scale inherited from both SLR and VLBI one may use, instead of $\hat s_{0S}$, $\hat{\dot s}_S$ or $\hat s_{0V}$, $\hat{\dot s}_V$, a weighted combination

$$
s_{0,S+V} = \lambda s_{0V} + (1-\lambda)s_{0S},\qquad
\dot s_{S+V} = \lambda\dot s_V + (1-\lambda)\dot s_S,\qquad
0\le\lambda\le 1,
\tag{438}
$$

which leads to a solution with combined scale

$$
\hat x_{0i(G,S+V)} = \hat x_{0i} + \hat d_{0,S} + \hat s_{0,S+V}x_i^{ap},\qquad
\hat v_{i(G,S+V)} = \hat v_i + \hat{\dot d}_S + \hat{\dot s}_{S+V}x_i^{ap}.
\tag{439}
$$

For $\lambda = 0$ one obtains a solution adapted to the SLR scale only, while for $\lambda = 1$ the solution is adapted to the VLBI scale only. The question of whether to use the simple restricted approach or the above, slightly more complicated, extended approach cannot be answered by theoretical means only. Additional information is needed on the nature of the effects of systematic errors on the estimated per epoch solutions, the main question being whether they can be effectively reduced by removing a trend of the form of a similarity transformation. The two-step approach described in Chap. 17, "From Newton's Law of Gravitation to Multiscale Geoidal Determination and Ocean Circulation Modeling", which is applicable only in the case of non-singular per technique normal equation coefficient matrices, has the advantage that the problems of scale and origin can be attacked in a much simpler way in the second combination step. Recall that the station initial coordinate and velocity estimates $\hat x_{0i,T|T}$, $\hat v_{i,T|T}$, obtained from the stacking of each technique $T$ separately, are the data in a combination with observation equations model $\hat x_{0i,T|T} = x_{0i} + E_ip_{0T} + e_{x_{0i,T}}$ and $\hat v_{i,T|T} = v_i + E_i\dot p_T + e_{v_{i,T}}$ (Eq. 394), where the unknowns are the final ITRF parameters $x_{0i}$, $v_i$, plus the nuisance transformation parameters $p_{0T}$, $\dot p_T$ (initial epoch values and rates, respectively) from the ITRF reference system to that of each particular technique $T$. The separate estimates $\hat x_{0i,T|T}$, $\hat v_{i,T|T}$ can be obtained using the restricted or the extended model approach for SLR and/or VLBI. While all seven transformation parameters $z_k$ are included for GPS and DORIS, in the restricted model approach $z_k$ contains only orientation parameters for SLR, where origin (geocenter) and scale are defined, and only orientation and translation parameters for VLBI, where scale is defined. In the combination step, the restricted model approach must be used, with the same sets of parameters included within $p_{0T}$ and $\dot p_T$ for each technique as in the restricted model approach within each technique. It is fully justified when the extended model approach has been used in the separate stackings per technique, because in this case the larger part of the systematic errors has been absorbed by the per epoch transformation parameters. Since no scale transformation parameters $s_{0T}$ (within $p_{0T}$) and $\dot s_T$ (within $\dot p_T$) are included for both VLBI and SLR, the resulting ITRF scale is a combination of the scales of VLBI and SLR. If $s_{0T}$ and $\dot s_T$ are excluded for only one of these two techniques, then its scale passes to the ITRF. It is thus possible to obtain two ITRFs, one with VLBI scale and one with SLR scale, which are related by a scale transformation only, and can be combined into an in-between solution with an intermediate weighted average scale. For example, if $\hat x_{0i|V} = \hat x_{0i|S} + s_0x_{0i}^{ap}$, $\hat v_{i|V} = \hat v_{i|S} + \dot s\,x_{0i}^{ap}$ is the relation between the solution based


on SLR scale only ($\hat x_{0i|S}$, $\hat v_{i|S}$) and the solution based on VLBI scale only ($\hat x_{0i|V}$, $\hat v_{i|V}$), an in-between solution will be

$$
\hat x_{0i} = \hat x_{0i|S} + \lambda s_0x_{0i}^{ap} = \hat x_{0i|V} + (\lambda-1)s_0x_{0i}^{ap},\qquad
\hat v_i = \hat v_{i|S} + \lambda\dot s\,x_{0i}^{ap} = \hat v_{i|V} + (\lambda-1)\dot s\,x_{0i}^{ap},
\tag{440}
$$

with $0\le\lambda\le 1$; $\lambda = 0$ provides the solution with only the SLR scale, while $\lambda = 1$ the one with only the VLBI scale. A particular approach, which enables control of the relative "flow" of scale from SLR and VLBI to the ITRF solution, is to model the scale parameters $s_{0T}$ and $\dot s_T$ for SLR and VLBI as stochastic parameters with zero mean and known variances. Splitting

$$
E_i = [E_{ai}\ \ E_{bi}],\qquad
p_{0T} = \begin{bmatrix} p_{a,0T} \\ p_{b,0T} \end{bmatrix} = \begin{bmatrix} p_{a,0T} \\ s_{0T} \end{bmatrix},\qquad
\dot p_T = \begin{bmatrix} \dot p_{a,T} \\ \dot p_{b,T} \end{bmatrix} = \begin{bmatrix} \dot p_{a,T} \\ \dot s_T \end{bmatrix},
\tag{441}
$$

where $p_{a,0T}$ and $\dot p_{a,T}$ contain the orientation ($\theta_{0T}$, $\dot\theta_{0T}$) and the translation ($d_{0T}$, $\dot d_T$) parameters, the observation equations (394) take the form

$$
\begin{bmatrix} \hat x_{0i,T|T} \\ \hat v_{i,T|T} \end{bmatrix}
= \begin{bmatrix} I_3 & 0 & E_{a,i} & 0 \\ 0 & I_3 & 0 & E_{a,i} \end{bmatrix}
\begin{bmatrix} x_{0i,T} \\ v_{i,T} \\ p_{a,0T} \\ \dot p_{a,T} \end{bmatrix}
+ \begin{bmatrix} E_{b,i} & 0 \\ 0 & E_{b,i} \end{bmatrix}
\begin{bmatrix} s_{0T} \\ \dot s_T \end{bmatrix}
+ \begin{bmatrix} e_{x_{0i,T}} \\ e_{v_{i,T}} \end{bmatrix}.
\tag{442}
$$

The final observation equations for all stations (395) take the form

$$
\begin{bmatrix} \hat x_{0,T|T} \\ \hat v_{T|T} \end{bmatrix}
= \begin{bmatrix} I_{3n} & 0 & E_a & 0 \\ 0 & I_{3n} & 0 & E_a \end{bmatrix}
\begin{bmatrix} x_{0T} \\ v_T \\ p_{a,0T} \\ \dot p_{a,T} \end{bmatrix}
+ \begin{bmatrix} E_b & 0 \\ 0 & E_b \end{bmatrix}
\begin{bmatrix} s_{0T} \\ \dot s_T \end{bmatrix}
+ \begin{bmatrix} e_{x_{0T}} \\ e_{v_T} \end{bmatrix},
\qquad T = V, S, G, D,
\tag{443}
$$

which is a mixed linear model with both deterministic and stochastic parameters and can be adjusted by standard relevant techniques. The variances σs20T , σs˙2T for VLBI and SLR can be used to control the relative influence of the VLBI and SLR scale on the final ITRF scale. The smaller the variance, the greater the influence of the corresponding technique on the ITRF scale.
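The effect of the prior variances can be illustrated with a deliberately oversimplified one-dimensional analogue of the mixed model (not the actual ITRF computation): each "technique" observes a common quantity plus its own zero-mean stochastic offset, and shrinking an offset's prior variance funnels that technique's value into the combined estimate.

```python
import numpy as np

def combine_scale(x_obs, sigma_s):
    """Tiny 1-D analogue of the mixed model: technique T observes x + s_T,
    where s_T is a zero-mean stochastic offset with prior std sigma_s[T].
    Unknowns: x and the offsets s_T. Illustrative only."""
    m = len(x_obs)
    # observation equations x + s_T = x_obs[T], unit weight
    A = np.hstack([np.ones((m, 1)), np.eye(m)])
    N = A.T @ A
    u = A.T @ np.asarray(x_obs, float)
    # prior pseudo-observations s_T = 0 with weight 1/sigma_s[T]^2
    for T in range(m):
        N[1 + T, 1 + T] += 1.0 / sigma_s[T] ** 2
    return np.linalg.solve(N, u)[0]

# Technique "V" says 10.0, technique "S" says 12.0.  A tight prior on a
# technique's offset forces the combined value toward that technique.
x_tight_V = combine_scale([10.0, 12.0], sigma_s=[1e-4, 1e4])
x_tight_S = combine_scale([10.0, 12.0], sigma_s=[1e4, 1e-4])
print(x_tight_V, x_tight_S)  # close to 10.0 and 12.0, respectively
```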

19 Post-linear Coordinate Variation Models

Examination of the residuals $\delta x_i(t_k) = x_i(t_k) - x_{0i} - (t_k-t_0)v_i$ of the coordinate time series, after the linear trend $x_{0i} + (t_k-t_0)v_i$ has been removed, reveals a systematic


behavior and calls for additional modeling of the non-linear temporal variation. There are three requirements that such a model should fulfill [31]:

(a) analytical description in terms of a finite number of parameters, so that coordinates at any particular time instant can be computed;
(b) adaptability to the main characteristics of the coordinate time series variation, so that only a high-frequency part that can be attributed to observational noise is left out;
(c) description in terms of parameters that are appropriate for a physical interpretation.

Examination of the residuals $\delta x_i(t_k)$ demonstrates a quasi-periodic annual variation. The corresponding spectra obtained by a discrete Fourier transform reveal peaks close to the annual and semi-annual frequencies, although these peaks do not stand out clearly and contributions from other frequencies are present. In any case, the most obvious choice is that of a Fourier series model

$$
x(t) = \sum_j\left(a_j\cos\frac{2\pi t}{T_j} + b_j\sin\frac{2\pi t}{T_j}\right),
\tag{444}
$$

with frequencies $f_j = 1/T_j$ that include at least an annual ($T_j$ = 1 year) and a semi-annual ($T_j$ = 6 months) term. In fact, the last ITRF version provides the coefficients $a_j$, $b_j$ of annual and semi-annual terms, although they are not yet included in the official ITRF2014 version [7]. In addition, it includes models for station coordinate variations associated with post-seismic deformation. Removal of such periodic terms leaves out a systematic part that varies from station to station and is more intense in the height component. Thus the above model largely fulfills the requirements of analytical description and physical interpretation, but falls somewhat short as far as adaptability is concerned. It is obvious that some additional non-periodic signal is present in the coordinate time series. Another popular approach is Singular Spectrum Analysis (SSA) for a discrete time series (see e.g., [36, 37] for a general exposition and [22, 53] for geodetic applications). The original time series $\{x_k\}$ is replaced with a smoothed time series $\{x'_k\}$, so that the method is essentially a filtering method. It demonstrates excellent adaptability but falls short with respect to the other two requirements, analytical description and physical interpretation. Our own rather extensive computational experience has shown that, despite its profound theoretical foundation, results indistinguishable from those of SSA can also be obtained by a simple moving average ($x'_k = \frac{1}{2M+1}\sum_{j=k-M}^{k+M}x_j$) with appropriate window length $2M+1$. Two more methods which provide both adaptability and analytical description have been proposed by Chatzinikos and Dermanis [19]. The first is the use of cubic splines with equidistant nodes, which are least-squares fitted to the coordinate time series. The interpolating function is defined in a piecewise manner by cubic polynomials $S_i(t)$ which share common values, and common first and second derivatives, at the


bordering points (nodes). There are two equivalent models

$$
x(t) = S_i(t) = g_{i-1} + N_{i-1}(t-\tau_{i-1})
+ \left(3\frac{g_i-g_{i-1}}{h^2} - \frac{2N_{i-1}+N_i}{h}\right)(t-\tau_{i-1})^2
+ \left(\frac{N_{i-1}+N_i}{h^2} - 2\frac{g_i-g_{i-1}}{h^3}\right)(t-\tau_{i-1})^3,
\quad i = 1, 2, \dots, n,
\tag{445}
$$

and

$$
x(t) = S_i(t) = g_{i-1}
+ \left(\frac{g_i-g_{i-1}}{h} - \frac{2M_{i-1}+M_i}{6}h\right)(t-\tau_{i-1})
+ \frac{M_{i-1}}{2}(t-\tau_{i-1})^2
+ \frac{M_i-M_{i-1}}{6h}(t-\tau_{i-1})^3,
\quad i = 1, 2, \dots, n,
\tag{446}
$$

where $\tau_i$, $i = 0, 1, \dots, n$, are the spline nodes, each spline $S_i(t)$ applies only to the interval $\tau_{i-1}\le t\le\tau_i$, $h = \tau_{i+1}-\tau_i$ is the distance between the nodes, and

$$
g_i = S_i(\tau_i) = S_{i+1}(\tau_i),\qquad
N_i = \frac{dS_i}{dt}(\tau_i) = \frac{dS_{i+1}}{dt}(\tau_i),\qquad
M_i = \frac{d^2S_i}{dt^2}(\tau_i) = \frac{d^2S_{i+1}}{dt^2}(\tau_i).
\tag{447}
$$

Model (445) is accompanied by the conditions of equality of the second derivatives

$$
hN_{i-1} + 4hN_i + hN_{i+1} = 3(g_{i+1}-g_{i-1}),\qquad i = 1, 2, \dots, n-1,
\tag{448}
$$

while model (446) is accompanied by the conditions of equality of the first derivatives

$$
M_{i-1} + 4M_i + M_{i+1} = \frac{6}{h^2}(g_{i-1} - 2g_i + g_{i+1}),\qquad i = 1, 2, \dots, n-1.
\tag{449}
$$
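As a numerical check of the spline machinery, one can solve conditions (449) for the $M_i$ (here with natural end conditions $M_0 = M_n = 0$, an extra assumption of this sketch, and invented node values $g_i$) and verify that the pieces of model (446) interpolate the $g_i$ and join with equal first derivatives:

```python
import numpy as np

# Equidistant nodes and node values g_i (synthetic)
h = 1.0
tau = np.arange(0.0, 6.0, h)          # τ_0 .. τ_5
g = np.array([0.0, 1.0, 0.5, -0.3, 0.2, 0.8])
n = len(tau) - 1

# Solve the continuity conditions (449) for the second derivatives M_1..M_{n-1},
# with natural end conditions M_0 = M_n = 0 (tridiagonal system)
T = np.zeros((n - 1, n - 1))
rhs = np.zeros(n - 1)
for i in range(1, n):
    T[i - 1, i - 1] = 4.0
    if i - 2 >= 0:
        T[i - 1, i - 2] = 1.0
    if i < n - 1:
        T[i - 1, i] = 1.0
    rhs[i - 1] = 6.0 / h**2 * (g[i - 1] - 2 * g[i] + g[i + 1])
M = np.concatenate([[0.0], np.linalg.solve(T, rhs), [0.0]])

def S(i, t):
    """Piece i of model (446), valid on [τ_{i-1}, τ_i]."""
    u = t - tau[i - 1]
    b = (g[i] - g[i - 1]) / h - (2 * M[i - 1] + M[i]) * h / 6.0
    return g[i - 1] + b * u + M[i - 1] / 2.0 * u**2 + (M[i] - M[i - 1]) / (6.0 * h) * u**3

def dS(i, t):
    """First derivative of piece i."""
    u = t - tau[i - 1]
    b = (g[i] - g[i - 1]) / h - (2 * M[i - 1] + M[i]) * h / 6.0
    return b + M[i - 1] * u + (M[i] - M[i - 1]) / (2.0 * h) * u**2

# The pieces interpolate the g_i and join with equal first derivatives
for i in range(1, n):
    assert np.isclose(S(i, tau[i]), g[i])
    assert np.isclose(dS(i, tau[i]), dS(i + 1, tau[i]))
```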

The unknown parameters gi , Ni , i = 0, 1, . . . , n or gi , Mi , depending on the model used, are estimated by least squares fit of the observed values xk = x(tk ), using the standard method of least squares adjustment with linear constraints. The method demonstrates excellent adaptability, with results similar to those of the SSA. It also has the advantage of analytical description, but the derived coefficients are not fit for physical interpretation. For this reason, another method has been developed that fulfills all three requirements. It is based on the observation that the departure of the annual coordinate variation from perfect periodicity is due to its association with hydrological factors and ultimately with weather patterns. Although the four seasons are repeated annually, winters and summers do not have the same intensity every year (some


summers are hotter, some winters colder) and the associated weather phenomena do not arrive exactly on specific dates (some years winter comes earlier, some others later, and the same holds for summer). This leads us to take the basic annual periodic signal $a_0\cos\omega_0t + b_0\sin\omega_0t$, with $\omega_0 = 2\pi/T_0$, $T_0$ = 1 year, rewrite it into its polar form $A_0\cos(\omega_0t-\varphi_0)$, with amplitude $A_0 = \sqrt{a_0^2+b_0^2}$ and phase $\varphi_0 = \arctan(b_0/a_0)$, and try to modify it in a way that conforms with physical reality. Indeed, this monochromatic signal will serve as the carrier on which both a time-dependent amplitude and a time-dependent phase will be modulated, to obtain the representation

$$
x(t) = A(t)\cos\big(\omega_0t - \varphi(t)\big).
$$

(450)

The unknown amplitude and phase functions can be modeled in a simple way. We choose a piecewise linear representation

$$
A(t) = A_{i-1} + \frac{A_i-A_{i-1}}{\tau_i-\tau_{i-1}}(t-\tau_{i-1}),\qquad
\varphi(t) = \varphi_{i-1} + \frac{\varphi_i-\varphi_{i-1}}{\tau_i-\tau_{i-1}}(t-\tau_{i-1}),\qquad
\tau_{i-1}\le t\le\tau_i,
\tag{451}
$$

where $\tau_i$, $i = 0, 1, \dots, n$, are the piece nodes and $A_i = A(\tau_i)$, $\varphi_i = \varphi(\tau_i)$. A simple least-squares fit to the observed values $x_k = x(t_k)$ for the determination of the parameter values $A_i$, $\varphi_i$ has proven to suffer from over-adaptability, even for large distances between successive nodes: the interpolating function tends to absorb also high-frequency variations due to observational errors. For this reason the least-squares approach ($b = Ax + e$, $e^TPe = \min$) has been replaced by a Tikhonov regularization ($e^TPe + x^TWx = \min$) with

$$
x^TWx = \rho_A\sum_{i=0}^n(A_i - A_i^0)^2 + \rho_\varphi\sum_{i=0}^n(\varphi_i - \varphi_i^0)^2,
\tag{452}
$$

where $\rho_A$, $\rho_\varphi$ are regularization parameters. The reference values $A_i^0$, $\varphi_i^0$ are obtained as follows: for every observation epoch $t_k$, a periodic signal $A_k\cos(\omega_0t-\varphi_k)$ with annual period is best fitted to all data within a year-long moving window having $t_k$ as its center. The resulting dense data $A_k$, $\varphi_k$ are averaged, or best fitted by piecewise linear functions, to obtain the required reference values $A_i^0$, $\varphi_i^0$ at the nodes. The regularization parameters $\rho_A$, $\rho_\varphi$ and the node distance $h = \tau_i-\tau_{i-1}$ can be used as free parameters, in order to tune the adaptability of the method. Under particular choices one obtains results practically identical to those of the SSA (or the simple moving average filter) and of the approach with equidistant cubic splines. There is a great difference, though: the parameters $A_i$, $\varphi_i$ (or the linearly interpolated values $A(t)$, $\varphi(t)$) are most appropriate for comparison with hydrological-meteorological data. Therefore, unlike the other approaches, the representation with amplitude and phase modulation is the only one that fulfills all three requirements: analytical description, adaptability and physical interpretation.
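A sketch of evaluating the amplitude/phase-modulated model (450)-(451), with invented node values and piecewise-linear interpolation of $A(t)$ and $\varphi(t)$:

```python
import numpy as np

# Node epochs and node values of the piecewise-linear amplitude and phase
# of Eq. (451) (synthetic values, for illustration only)
tau = np.array([0.0, 1.0, 2.0, 3.0])     # node epochs τ_i (years)
A_i = np.array([5.0, 6.0, 4.5, 5.5])     # amplitudes A(τ_i), e.g. mm
phi_i = np.array([0.1, 0.3, 0.2, 0.4])   # phases φ(τ_i), rad

omega0 = 2.0 * np.pi                     # annual carrier, T0 = 1 yr

def x_model(t):
    """Amplitude/phase-modulated annual signal of Eq. (450)."""
    A = np.interp(t, tau, A_i)           # piecewise-linear A(t), Eq. (451)
    phi = np.interp(t, tau, phi_i)       # piecewise-linear φ(t), Eq. (451)
    return A * np.cos(omega0 * t - phi)

t = np.linspace(0.0, 3.0, 157)
x = x_model(t)

# At the nodes the model reduces to A_i cos(ω0 τ_i - φ_i)
assert np.isclose(x_model(tau[1]), A_i[1] * np.cos(omega0 * tau[1] - phi_i[1]))
```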


Literature

1. Altamimi, Z., Dermanis, A.: The choice of reference system in ITRF formulation. In: Sneeuw, N., et al. (eds.) VII Hotine-Marussi Symposium on Mathematical Geodesy. International Association of Geodesy Symposia, vol. 137, pp. 329–334. Springer, Berlin (2009)
2. Altamimi, Z., Dermanis, A.: Theoretical foundations of ITRF determination. The algebraic and the kinematic approach. In: Katsampalos, K.V., Rossikopoulos, D., Spatalas, S., Tokmakidis, K. (eds.) On Measurements of Lands and Constructions. Volume in honor of Prof. Dimitrios G. Vlachos. Publication of the School of Rural & Surveying Engineering, Aristotle University of Thessaloniki, pp. 331–359 (2013)
3. Altamimi, Z., Sillard, P., Boucher, C.: ITRF2000: a new release of the international terrestrial reference frame for earth science applications. J. Geophys. Res. 107(B10), 2214 (2002)
4. Altamimi, Z., Sillard, P., Boucher, C.: ITRF2000: from theory to implementation. In: Sansò, F. (ed.) V Hotine-Marussi Symposium on Mathematical Geodesy. IAG Symposia, vol. 127, pp. 157–163. Springer, Berlin (2004)
5. Altamimi, Z., Collilieux, X., Legrand, J., Garayt, B., Boucher, C.: ITRF2005: a new release of the international terrestrial reference frame based on time series of station positions and earth orientation parameters. J. Geophys. Res. 112, B09401 (2007)
6. Altamimi, Z., Collilieux, X., Métivier, L.: ITRF2008: an improved solution of the international terrestrial reference frame. J. Geod. 85, 457–473 (2011)
7. Altamimi, Z., Rebischung, P., Métivier, L., Collilieux, X.: ITRF2014: a new release of the international terrestrial reference frame modeling nonlinear station motions. J. Geophys. Res. Solid Earth 121, 6109–6131 (2016)
8. Angermann, D., Drewes, H., Krügel, M., Meisel, B., Gerstl, M., Kelm, R., Müller, H., Seemüller, W., Tesmer, V.: ITRS Combination Center at DGFI: A Terrestrial Reference Frame Realization 2003. Deutsche Geodätische Kommission, Reihe B, Nr. 313, München (2004)
9. Angermann, D., Drewes, H., Gerstl, M., Krügel, M., Meisel, B.: DGFI combination methodology for ITRF2005 computation. In: Drewes, H. (ed.) Geodetic Reference Frames. IAG Symposia, vol. 134, pp. 11–16. Springer, Berlin (2009)
10. Artz, T., Bernhard, L., Nothnagel, A., Steigenberger, P., Tesmer, S.: Methodology for the combination of sub-daily Earth rotation from GPS and VLBI observations. J. Geod. 86, 221–239 (2012)
11. Baarda, W.: S-Transformations and Criterion Matrices. Netherlands Geodetic Commission, Publications on Geodesy, New Series, vol. 5, no. 1, Delft (1973). https://www.ncgeo.nl/downloads/18Baarda.pdf
12. Baarda, W.: Linking up spatial models in geodesy. Extended S-Transformations. Netherlands Geodetic Commission, Publications on Geodesy, New Series, no. 41, Delft (1995). https://www.ncgeo.nl/downloads/41Baarda.pdf
13. Biagi, L., Sansò, F.: Sistemi di riferimento in geodesia: algebra e geometria dei minimi quadrati per un modello con deficienza di rango. Bollettino di Geodesia e Scienze Affini. Parte prima: Anno LXII, N. 4, 261–284. Parte seconda: Anno LXIII, N. 1, 29–52. Parte terza: Anno LXIII, N. 2, 129–149 (2003)
14. Bjerhammar, A.: Rectangular reciprocal matrices with special emphasis to geodetic calculations. Bulletin Géodésique 52, 188–220 (1951)
15. Blaha, G.: Inner adjustment constraints with emphasis on range observations. Department of Geodetic Science, Report 148, The Ohio State University, Columbus (1971)
16. Blaha, G.: Free networks: minimum norm solution as obtained by the inner adjustment constraint method. Bull. Géodésique 56, 209–219 (1982)
17. Bolotin, S., Bizouard, C., Loyer, S., Capitaine, N.: High frequency variations of the earth's instantaneous angular velocity vector. Determination by VLBI data analysis. Astron. Astrophys. 317, 601–609 (1997)
18. Capitaine, N., Guinot, B., Souchay, J.: A non-rotating origin on the instantaneous equator: definition, properties and use. Cel. Mech. 39, 283–307 (1986)

13 Theory and Realization of Reference Systems








Part VI Inverse Problems and Least Squares Methods

14 From Gaussian Least Squares Approximation to Today’s Operator-Theoretic Regularization of Ill-Posed Problems

Willi Freeden and Bertold Witte

Contents
1 Introduction ..... 828
2 Pseudoinverse for Finite Dimensional Matrix Equations ..... 837
3 Functional Analytical Nomenclature ..... 846
4 Specific Regularization Methods ..... 858
5 Regularization Involving Strong and Weak Noise ..... 864
6 Operator-Theoretic Reconstruction Methods ..... 870
7 Pseudoinverses in Reproducing Kernel Hilbert Spaces ..... 888
8 Multiscale Regularization Scheme ..... 899
9 Iterative Methods as Regularization Schemes ..... 909
10 Mollifier Methods ..... 921
Literature ..... 924

Abstract The aim of this contribution is to document the pioneering dimension of Gauss’s method of least squares approximation and to demonstrate its mediating role toward today’s regularization processes of pseudoinverses in ill-posed inverse problems.

This chapter is part of the series Handbuch der Geodäsie, volume “Mathematische Geodäsie/Mathematical Geodesy”, edited by Willi Freeden, Kaiserslautern.
W. Freeden (), Department of Mathematics, University of Kaiserslautern, Kaiserslautern, Germany. E-Mail: [email protected]; [email protected]
B. Witte, Institute for Geodesy and Geoinformation, University of Bonn, Bonn, Germany. E-Mail: [email protected]
© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020. W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_94


W. Freeden and B. Witte

Zusammenfassung: The aim of this contribution is to document the pioneering dimension of Gauss’s method of least squares approximation and to demonstrate its mediating role toward today’s regularization processes of pseudoinverses in ill-posed inverse problems.

Keywords Least squares methods · Pseudoinverses · Ill-posed problems · Inverse problems · Regularization techniques

1 Introduction

This paper may be regarded as a synopsis of ideas and concepts on least squares methods presented in the notes [33, 38]. Some parts follow almost literally the material worked out in the “Handbook of Mathematical Geodesy (HbMG)”, Geosystems Mathematics, Birkhäuser, Springer International Publishing, 2018.

1.1 Gauss’s Historic Role and Influence

Since the middle of the eighteenth century, mathematicians interested in geodesy have tried to use probability theory for the evaluation of redundant measurements. As in the theory of games, values of measurements can be regarded as random samples of a random process. Already in the year 1755, T. Simpson (1710–1761) pointed out that the mean value of a number of measurements can be used advantageously (see [6] for more details). Important theoretical foundations were treated especially in the works of the French mathematicians J.-L. Lagrange (1736–1813) and P.-S. Laplace (1749–1827). The actual reasoning for the adjustment theory with its core, the method of least squares, was given by C.F. Gauss (1777–1855) and A.-M. Legendre (1752–1833) around 1800. Gauss presumably hit on the fundamental idea in the autumn of 1794 while reading a publication of Lambert concerning applications of mathematics (see [42]), but he did not publish his method before 1809 (cf. “Theoria motus corporum coelestium in sectionibus conicis solem ambientium”). He closed this publication (see Fig. 1) by giving an explanation of his method (“Determinatio orbitae observationibus quotcumque quam proximae satisfacientis”). Further works followed in the year 1810 (“Disquisitio de elementis ellipticis Palladis”), in 1816 (“Bestimmung der Genauigkeit der Beobachtungen”), and finally, in 1821 and 1823, a systematic presentation (“Theoria combinationis observationum erroribus minimis obnoxiae, pars prior et pars posterior”). Three years later these works were completed by the note “Supplementum theoriae combinationis observationum erroribus minimis obnoxiae”.

14 From Gaussian Least Squares Approximation to Today’s . . .


Fig. 1 Title-page of “Theoria motus corporum coelestium in sectionibus conicis solem ambientium”, original Latin edition by Friedrich Perthes and I.H. Besser, Hamburg, 1809; also: “Carl Friedrich Gauss: Werke, herausgegeben von der (Königlichen) Gesellschaft der Wissenschaften zu Göttingen” (Göttinger Digitalisierungszentrum)

Actually, in the year 1806, Legendre first published this adjustment method in a supplement of his treatise “Nouvelles méthodes pour la détermination des orbites des comètes” and termed the principle “Sur la méthode des moindres carrés” (cf. [79]). This term was also adopted by Gauss. Legendre explains in his paper that, among all principles for the evaluation of observations, there is no simpler one for estimating an unknown quantity than minimizing the sum of the squared errors. Since 1801 Gauss applied the method of least squares almost daily in his astronomic computations (Gauss Werke, Vol. IV, p. 98). A well-known application during this time was the computation of the orbit of the planetoid Ceres (cf. [15]), which could be observed by G. Piazzi (1746–1826) over only 9° of its orbit in 1801. In spite of the small number of observations, Gauss was able to compute the ephemeris of Ceres so precisely that Ceres could be redetected at the predicted place. The discovery of the planetoid Ceres introduced Gauss to the world as a theoretical astronomer of the highest reputation (cf. [15]). Why Gauss published his “Theoria motus corporum coelestium . . . ” so late can probably only be explained by his endeavour to give his works “an inner perfection”. Gauss’s motto was: “pauca sed matura”. This is probably the reason why Legendre could publish the method before him in the year 1806 without giving a thorough reasoning (see also the comments by [56]). Note that R. Adrain (1775–1843) introduced this principle (cf. [1]), too. Legendre characterized the method of least squares as the most general, most accurate, and very easy to apply adjustment principle, and proved that this method leads for one unknown to the arithmetic mean and for two or three unknowns to the center point of the system. According to Legendre, a kind of equilibrium between the deviations of the measurements is established, comparable to the determination of a space point by measurements which lies in the center point of the system. Independently of Legendre, Adrain derived in his publication “Research concerning the probabilities of the errors which happen in making observations” (cf. [1]) the function of the error probability and, resulting from it, the method of least squares. Also D. Huber (1768–1839) was working on the method of least squares, but it is Gauss’s name that is connected with this method because of his significant contributions. The correspondence of great mathematicians of that time, e.g., between Gauss and Laplace, supports the impression that there was no dispute concerning priority. Rather, the correspondence proves that this simple principle of the method of least squares was highly regarded by the contemporaries (see also [48]). In accordance with [6], the method of least squares can be regarded as a characteristic example of the increasing importance of mathematics in geodesy in the first half of the nineteenth century. At the same time this method is also an expression of how the unity of observational practice, the evaluation of measuring results, and the corresponding theory grew together.

1.2 Reasons for the Least Squares Method

The introduction of the “mean square error” and its definition as an independent measure of precision was an important precondition for the foundation of the least squares method. Besides the definition of the measure of precision, Gauss also demonstrated in his “Theoria combinationis observationum erroribus minimis obnoxiae” how to obtain the mean square errors of the unknowns as well as the weight of a function of the adjusted unknowns in the course of an adjustment of observation equations (see [44] and the comments by [123]). Following a publication by J.W.R. Dedekind (1831–1916) (for more details see [12]), Gauss argued in his lecture on the least squares method against the adjustment principle of minimizing the absolute sum of the errors (deviations) under the condition that their algebraic sum is zero. This principle was first applied in Ch. Maire [83] and later in P.S. Laplace [77]. In geodesy, this optimization problem is known as the Boscović-Laplace method. It seems that F.Y. Edgeworth [16] in 1887 was the first to use this method, though without constraints, for the estimation of many unknowns. A solution suitable for practical geodetic computations, which may be used to determine the most favourable weights in base extension networks, was achieved by K. Friedrich [40] in 1937. About 50 years earlier, this method had been employed in [119] to solve similar problems. Mathematically, the Boscović-Laplace method may be regarded as an ℓ1-method under constraints. No doubt, ℓ1-type approaches are especially appropriate to detect gross errors (outliers) in observations. One of Gauss’s arguments against the Boscović-Laplace method should be mentioned: With a greater number of observations, a single bigger error would not exert a stronger impact on the results than many smaller errors whose absolute values have the same sum. An error which occurs n times, however, must have a stronger impact than n single errors. Instead of the errors themselves, their squares are therefore taken, and the usefulness of a hypothesis is estimated by the smallness of the corresponding sum of squared errors. Gauss rejected errors raised to a higher power, which always has to be even, because the computations would then be extremely complicated, so that the treatment would not be worth the effort (cf. [12]).
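To make the contrast concrete: for a single unknown observed n times, the least squares (ℓ2) principle yields the arithmetic mean, while the Boscović-Laplace (ℓ1) principle yields the median, which a single gross error barely moves. A minimal sketch with hypothetical numbers:

```python
# Hypothetical repeated measurements of one unknown quantity;
# the last value is contaminated by a gross error (outlier).
obs = [10.02, 9.98, 10.01, 9.99, 10.00, 12.00]

# Least squares (l2) principle: minimize the sum of squared residuals
# -> arithmetic mean; the outlier pulls the estimate noticeably.
l2_estimate = sum(obs) / len(obs)

# Boscovic-Laplace (l1) principle: minimize the sum of absolute residuals
# -> median; robust against the single gross error.
srt = sorted(obs)
n = len(srt)
l1_estimate = (srt[n // 2 - 1] + srt[n // 2]) / 2 if n % 2 == 0 else srt[n // 2]

print(f"l2 (mean):   {l2_estimate:.3f}")   # shifted toward the outlier
print(f"l1 (median): {l1_estimate:.3f}")   # close to the uncontaminated values
```

The mean is pulled toward the contaminated value, while the median stays near the uncontaminated cluster — precisely the outlier sensitivity on which Gauss’s argument and the modern use of ℓ1 methods for blunder detection turn.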
Today, for practical purposes, e.g., in engineering surveying, the estimation procedures of the least squares method and the ℓ1-method are usually applied in combination, because the corresponding programs are available. Specifically, with the help of the ℓ1-method outliers are detected, localized, and then eliminated. In some cases the ℓ1-method turns out to be insufficient to detect outliers reliably, so that other robust estimation methods come into play (see, e.g., [57, 65, 71, 73, 74, 81, 84, 118] and the references therein for more details). Concerning the probabilistic reasoning in Gauss’s work, the theory of probability starts with a justification of the least squares method. In his “Theoria motus corporum coelestium” the problem was set in such a way that, with measurements of equal precision, the accidental errors exhibit a differentiable density of the probability distribution. The distribution is determined under the supposition that the most probable value for the measured quantity with any number of observations is equal to the arithmetic mean of the measured values (cf. [48]). Under this assumption Gauss was led to the well-known bell curve of the (Gaussian) probability distribution. The well-known function x → (σ√(2π))⁻¹ exp(−(2σ²)⁻¹(x − μ)²), x ∈ ℝ, with σ the standard deviation, σ² the variance, and μ the expectation, indeed provides a measure for the precision of the observations. Gauss was aware of a certain arbitrariness of this quantity. In fact, he wrote in Articles 178 and 179 of his publication “Theoria motus corporum coelestium . . . ”:


“Functio modo eruta omni quidem rigore errorum probabilitates exprimere certo non potest. . . . Hoc principium, quod in omnibus applicationibus mathesis ad philosophiam naturalem usum frequentissimum”. “The function just found cannot, it is true, express rigorously the probabilities of the errors: For since the possible errors are in all cases confined within certain limits, the probability of errors exceeding those limits ought always to be zero, while our formula always gives some value. However, this defect, which every analytical function must, from its nature, labor under, is of no importance in practice, because the value of our function decreases so rapidly, when it has acquired a considerable magnitude, that it can safely be considered as vanishing. This principle, which promises to be of most frequent use in all applications of the mathematics to natural philosophy, must, everywhere, be considered an axiom with the same propriety as the arithmetical mean of several observed values of the same quantity is adopted as the most probable value (translated by [11]).”

After Gauss had determined the probability distribution of the measuring errors (“errors of observations”) in such a way, he concluded in one of the next paragraphs: The probability density of a given totality of observations attains its maximum under the condition that the sum of the squares of the deviations of the observed quantities from the true values of the quantities to be measured becomes a minimum. This principle can also be applied to observations of different precision. If the mean square error is, however, unknown and the deviations of the approximate values from the true values have to be estimated, then the Student distribution should be chosen according to the number of degrees of freedom. Especially with a small number of observations the results will differ from the ones determined with the procedure given by Gauss (see, e.g., [48]). If the Gaussian error law and the axiom of the arithmetic mean underlying this law were not valid, then the unknowns determined by the method of least squares would no longer be the most probable values. Gauss came to a second justification presumably on the basis of an article by Laplace, who proved in his “Théorie analytique des Probabilités” ([78]) that the unknowns determined by the method of least squares have the smallest average errors. This proof is independent of the form of the error law, if for all observations the same error law is valid and if positive as well as negative errors of equal absolute value are equally probable. Laplace furnished this proof for two unknowns under the condition that the number of observations is infinite. Gauss did not take this assumption into account, and he proved this in his “Theoria combinationis observationum erroribus minimis obnoxiae”. In the “Anzeigen” ([44]), the following sentence can be found: “. . . , die Funktion für die Wahrscheinlichkeit der Fehler sei, welche sie wolle, und die Anzahl der Beobachtungen möge gross oder klein sein.” (. . . , the function for the probability of the errors may be whatever it likes to be, and the number of observations may be large or small.)
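Gauss’s reservation — the formula assigns a small but nonzero probability to arbitrarily large errors, yet its value decays so fast that this is harmless in practice — can be checked numerically. The following sketch evaluates the density quoted above and verifies that it integrates to one:

```python
import math

def gauss_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian error law: (sigma*sqrt(2*pi))**-1 * exp(-(x-mu)**2/(2*sigma**2))."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Trapezoidal integration over [-8, 8] standard deviations:
# the tails beyond that carry negligible (but nonzero) mass.
h = 0.001
xs = [i * h for i in range(-8000, 8001)]
area = h * (sum(gauss_pdf(x) for x in xs) - 0.5 * (gauss_pdf(xs[0]) + gauss_pdf(xs[-1])))
print(round(area, 6))  # ~1.0
```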

In accordance with the remarks to be found in G.W. Stewart [123], we present a brief list of what was new in Gauss’s treatment of random errors:

• The careful distinction between systematic and random errors.
• The use of the first moment of a distribution to measure its center.
• The use of the second moment to measure the precision.
• The development of a Chebyshev-like inequality.
• The correct formula for the expectation of a random-valued function.
• The rate of convergence of the sample mean and variance.
• The correct formula for estimating the precision of observations from the residual sum of squares.
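The last item of this list is worth spelling out: with n observations and u unknowns, Gauss estimated the variance of unit weight from the residual sum of squares divided by the redundancy n − u rather than by n. A minimal sketch for the one-unknown case (the numbers are hypothetical):

```python
obs = [10.02, 9.98, 10.01, 9.99, 10.00]  # hypothetical measurements, u = 1 unknown
n, u = len(obs), 1

x_hat = sum(obs) / n                 # least squares estimate (arithmetic mean)
residuals = [o - x_hat for o in obs]  # v = l - x_hat

# Gauss's estimator of the precision from the residual sum of squares:
# dividing by the redundancy n - u (not n) makes the variance estimator unbiased.
sigma2_hat = sum(v * v for v in residuals) / (n - u)
sigma_hat = sigma2_hat ** 0.5
print(f"sigma_hat = {sigma_hat:.4f}")
```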

1.3 Gaussian Adjustment Theory

For geodetic purposes, even in our days, the adjustment of observation equations is, in principle, applied in the way Gauss proposed it. H. Wolf (1910–1994) came to the following explanation in the note [137]: After linearization of the observation equations, usually by Taylor expansion, the system of normal equations is set up and solved by use of the original Gaussian algorithm. Weight reciprocals and weighting coefficients as well as the partial derivatives of the observations with respect to the unknowns are computed. The error calculation consists of the computation of the mean square error of unit weight, of the observations, of the unknowns, and of their functions. The well-known control formula for the computation of the sum of the squared residuals can also be dated back to Gauss. Further, the problem is solved of specifying the alterations in the unknowns which are caused by the addition of a further observation equation or by a change of a single weight. Also the way of achieving homogeneity by multiplying with the square root of the weight can be found in Gauss’s work. From the successful computation of the orbit of the planetoid (dwarf planet) Ceres up to the computation of the geodetic arc measurements in the Kingdom of Hanover, Gauss validated the practical usefulness of his method. Besides the justification and development of his least squares method, Gauss proposed numerous approximate adjustments, which naturally have to be free of arbitrariness and contradiction as well as qualified for adjustments in successive steps and in groups. In order to adjust the angle observations taken on the single stations of the net for his arc measurements (station adjustments), Gauss used partly angle unknowns and partly bearing unknowns. To solve the adjustments he developed the successive approximation method (cf. [80]), which he applied to the adjustment of observation equations as well as of condition equations (cf. [47]).
After setting up the normal equation system, in each step only the residual for one unknown is determined; for this step the residuals of the other unknowns are set to zero. Gauss adjusted trigonometric nets partly with angle residuals and partly with bearing residuals using the condition equation method. The correlations coming from the station adjustments were not taken into account (cf. [137]). However, Gauss [45] pointed out that, for this purpose, the observations to be adjusted must be independent of each other. For the adjustment of condition equations it is not necessary to set up the normal equation system if the condition equations are directly fulfilled step by step. This possibility for the application of the Gaussian successive approximation procedure to height nets was first pursued by C.A. Vogler [131]. For the adjustment of the geodetic arc measurement net in the Kingdom of Hanover with condition equations, Gauss used a successive approximation procedure which he did not publish in detail, but which was later reconstructed by L. Krüger [76]. Subsequently, Krüger developed this procedure into the so-called Krüger method of adjustment in two groups: Gauss [45] had recommended, in Article 20 of his publication “Supplementum theoriae combinationis . . . ”, to divide the condition equations into two groups for the adjustment of large triangulation nets. A further important geodetic application example was the computation of the dimensions of the Earth’s ellipsoid by the Gauss disciple Schmidt, who at Gauss’s suggestion adjusted the geodetic arc measurements of good quality known at that time. For the polar flattening he actually came to the result of 1 : 298.39, an amazing result if compared to the corresponding value of the World Geodetic System (WGS 84) of 1 : 298.257.
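The workflow sketched above — observation equations, normal equations N = AᵀPA, and a successive approximation that determines the residual of one unknown per step while the others are held fixed (in modern terms, a Gauss–Seidel iteration) — can be illustrated on a tiny network; the design matrix and observed values below are invented for illustration only:

```python
# Hypothetical observation equations A x = l with diagonal weights p:
# e.g. two directly observed heights and one observed height difference.
A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, -1.0]]
l = [5.01, 3.02, 2.00]   # observed values (hypothetical)
p = [1.0, 1.0, 2.0]      # weights

m = 2  # number of unknowns

# Normal equations N x = b with N = A^T P A, b = A^T P l
N = [[sum(p[k] * A[k][i] * A[k][j] for k in range(len(A))) for j in range(m)]
     for i in range(m)]
b = [sum(p[k] * A[k][i] * l[k] for k in range(len(A))) for i in range(m)]

# Successive approximation (Gauss-Seidel): in each step only the value of
# one unknown is updated, the others are held fixed at their current values.
x = [0.0] * m
for _ in range(100):
    for i in range(m):
        s = sum(N[i][j] * x[j] for j in range(m) if j != i)
        x[i] = (b[i] - s) / N[i][i]

print([round(v, 4) for v in x])  # ~ [5.014, 3.016]
```

The iteration converges here because the normal matrix is diagonally dominant, which is typical for well-configured geodetic networks.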

1.4 Helmert’s Contribution to the Adjustment Theory

Studying the publications of F.R. Helmert (1843–1917), we notice that he always set new emphases in adjustment theory resulting from practical applications and permanently developed its theoretical foundations further (see [136] for more details). The central importance of this evaluation method can be demonstrated by Helmert’s inaugural address, which he delivered to the Royal Prussian Academy of Science. The following two sentences are taken from this speech [62]: “Everyone who knows the contents of geodesy a little bit will have noticed that the various applications of the method of adjustment and the theory of the errors of observations are especially at home in this field of science. . . Here it becomes evident that even in a purely mathematical way quite simple tasks can, due to the matter of accuracy in geodesy, gain theoretical interest, which has to be added by the application of measurement results.”

In the following, three subjects of central significance are chosen from Helmert’s book on the method of adjustment. In the second edition of this book [60] the different procedures of the adjustment method are classified into: direct observations, observation equations, condition equations, observation equations with condition equations, and condition equations with unknowns. For the last-named model several designations are in common use today: “standard procedure of the adjustment method”, “mixed model”, or “Gauss–Helmert model”. The newer extensions of the Gauss–Helmert model are in principle already included in the original formulation in his book (cf. [120]). The transitions between the different models of the adjustment method are of special interest for efficient computations. By introducing “equivalent observations” Helmert was able to implement these transitions efficiently in a computational sense. Of special interest for him was an “equivalent system”: the system of reduced normal equations, i.e., the normal equations brought into a step-like (echelon) form by the Gaussian elimination method. This kind of proceeding enables immediate access to the “compensation” parameters which result from the Gaussian elimination process. In such a way Helmert could decorrelate correlated observations. Today, especially in satellite geodesy, we quite often have long series of observations with hundreds of thousands of measurements. Between the decorrelation of observations and the homogenizing of data there exists a close relationship, if independent observations with equal weights

14 From Gaussian Least Squares Approximation to Today’s . . .

835

appear. Observation equations for uncorrelated measurements with unequal weights can be modified in such a way that equal weights can be used. Helmert showed that this can be achieved by multiplying the complete observation equation by the square root of its weight. This insight, known since 1924 as Cholesky’s method, became a standard procedure in numerical mathematics. “Questions concerning the accuracy which occur by applying measurement results” always found the special interest of Helmert. The following quantities are therefore exactly defined in his book: the mean error (arithmetic mean), the mean square error (standard deviation), and the probable error. These three error quantities have not lost their importance up to today, although in statistics the variance (square of the mean square error) is usually chosen. For data sets with conspicuous data, for instance with gross errors, the probable error is often used, being robust against blunders. Already Gauss came to the conclusion that, in the presence of $n$ true observation errors $\varepsilon_i$, $i = 1, \ldots, n$, estimators for the quantities specified before can be defined, in which the power sums have a central meaning. These are defined as follows: $\hat{M}_m := \frac{1}{n}\sum_{i=1}^{n} |\varepsilon_i|^m$. Here $m$ characterizes the order of the moment. From these power sums the estimators for the mean error, $\hat{D}_\varepsilon := \hat{M}_1 = \frac{1}{n}\sum_{i=1}^{n} |\varepsilon_i|$, and the mean square error, $\hat{S}_\varepsilon := \sqrt{\hat{M}_2}$, are obtained. It can be seen that the estimators $\hat{D}_\varepsilon$ and $\hat{S}_\varepsilon$ are again random variables. This estimation includes uncertainties or errors, for which Helmert worked out the mean square errors of the estimators. In contrast to the estimation of the mean error and the mean square error, the estimation of the probable error $\rho$ is done by counting. Of these three estimators, the estimator for the mean square error is the most accurate one. Helmert pointed out that the differences were not significant.
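As a small numerical sketch of these definitions (Python/NumPy; the simulated sample, seed, and function names are ours, not Helmert's), the power-sum estimators can be computed from normally distributed true errors:

```python
import numpy as np

# Power-sum estimators from n simulated true observation errors
# (our notation, following the definitions above):
#   M_m = (1/n) sum_i |eps_i|^m     (m-th power sum)
#   D   = M_1                       (mean error)
#   S   = sqrt(M_2)                 (mean square error)
def power_sum(eps, m):
    eps = np.asarray(eps, dtype=float)
    return np.mean(np.abs(eps) ** m)

def mean_error(eps):
    return power_sum(eps, 1)

def mean_square_error(eps):
    return np.sqrt(power_sum(eps, 2))

rng = np.random.default_rng(0)
eps = rng.normal(0.0, 1.0, size=100_000)   # simulated errors, sigma = 1
# For normal errors: E|eps| = sigma * sqrt(2/pi) ~ 0.798, sqrt(E eps^2) = sigma
```

Both estimators are themselves random variables, as stated above; rerunning with a different seed gives slightly different values.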
In a footnote on page 26 of his book he indicated: “We cannot ignore that a more rigorous investigation should first find out whether it is permitted to judge the accuracy of the computation of $\mu (= \hat{S}_\varepsilon)$ and $\vartheta (= \hat{D}_\varepsilon)$ after the mean square error [...]. Because the errors in $\sigma_m (= \hat{M}_m)$ obviously obey a law which deviates from the [...] anticipated form of the law of errors [...].” Because of this footnote Helmert was criticized by R. Mees [85], and with that the statement on the estimator with the highest accuracy was called into question. In a comprehensive comment on the “probabilities of the power sums” Helmert [61] derived the distribution functions of the estimators of the moments of a finite number $n$ of true observation errors $\varepsilon$ and stated that the probability density function follows a Gauss distribution. The mean square errors (standard deviations) for any distribution functions of the estimators can be deduced. With that, the statement is proved that the most accurate estimator is the one with the smallest mean square error (dispersion). Beyond that, Helmert occupied himself in this publication with the question of how the estimation of the second moment of the probability density function can be computed. For $m = 2$, i.e., for the mean square error $\mu$, he derived by the induction method the probability that the sum of the squares of the true observation errors, $\sum_{i=1}^{n} \varepsilon_i^2$, which must be normally distributed, lies between given limits. By homogenizing the observation errors, the probability density function, respectively the probability distribution function, of the $\chi^2$-distribution is obtained. The probability density function is a special


Gamma-distribution, which is very important for many hypothesis tests; for this reason it is also called the $\chi^2$-distribution. This name can be traced to a paper of Pearson [108], although the definition of this distribution follows immediately from Helmert’s derivation. The following proposition (theorem) is valid: The sum of the squares of $r$ stochastically independent and standard normally distributed random variables $x_i$, $i = 1, \ldots, r$, follows the $\chi^2$-distribution with $r$ degrees of freedom. With it Helmert was able to give the proof demanded by Mees.
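The proposition can be illustrated by a quick Monte Carlo experiment (a sketch in Python/NumPy; sample size and seed are arbitrary choices of ours): the sum of squares of $r$ independent standard normal variables should have mean $r$ and variance $2r$, as the $\chi^2$-distribution with $r$ degrees of freedom demands.

```python
import numpy as np

# Monte Carlo check: the sum of squares of r independent standard normal
# variables follows chi^2_r, hence has mean r and variance 2r.
rng = np.random.default_rng(42)
r = 5
samples = rng.normal(size=(200_000, r))
chi2 = np.sum(samples**2, axis=1)

mean_emp = chi2.mean()   # should be close to r = 5
var_emp = chi2.var()     # should be close to 2r = 10
```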

1.5 Historic Thinking About the Origin of the Pseudoinverse

The idea of a pseudoinverse (or generalized inverse) may be briefly motivated as follows: Suppose that $A \in \mathbb{R}^{m \times n}$,

$$A = (A_{j,l})_{\substack{j=1,\ldots,m \\ l=1,\ldots,n}}, \qquad (1)$$

is understood as a linear mapping from $\mathbb{R}^n$ into $\mathbb{R}^m$. If the element $y = (y_1, \ldots, y_m)^T$ in $\mathbb{R}^m$ is known, then the problem is to find $x = (x_1, \ldots, x_n)^T$ in $\mathbb{R}^n$ such that

$$y = Ax \qquad (2)$$

holds true. If $n = m$ and $A \in \mathbb{R}^{m \times n}$ is invertible, with inverse $A^{-1}$, then the solution clearly is $x = A^{-1}y$, and $Ax = 0$ uniquely yields $x = 0$. However, in case $n \neq m$, the problem amounts to a proper understanding of the “invertibility” of $Ax = y$, $A \in \mathbb{R}^{m \times n}$, $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$. The question is whether there is a mapping $A^\dagger$, called a generalized inverse (or, more often, pseudoinverse) of $A$, such that $x = A^\dagger y$ (so that $A^\dagger = A^{-1}$ in case $n = m$). During the last decades of the last century, the concept of a pseudoinverse (or generalized inverse) has attracted considerable attention in the mathematical as well as the geodetic literature (a bibliography, for example, listing over 1700 references on the subject is due to M.Z. Nashed [95]). Early interest in the subject of pseudoinverses was initiated by a paper on matrices by R. Penrose [107] (see also A. Bjerhammar [8]). However, this concept had been considered somewhat earlier. For example, E.H. Moore (1862–1932) presented a development [88] of the notion (see also [4, 7, 8, 40, 41, 89, 106, 111], and [122]). Moreover, in the setting of integral and differential operators the concept was considered even earlier by E.I. Fredholm (1866–1927) in [23], A. Hurwitz (1859–1919) in [66], and D. Hilbert (1862–1943) in [64] (see [112] for a discussion of generalized inverses in classical analysis, and see also A. Ben-Israel [5], T.L. Boullion, P.L. Odell [9], and M.Z. Nashed [95] for brief historical sketches of the subject). The relation between Gauss’s ideas and the concept of the pseudoinverse was discussed by D.W. Robinson [115] in more detail. His paper attempts to show that although Gauss did not formalize the notion of a pseudoinverse, he provided the essential ingredients to produce one. All in all, the conclusion of Robinson [115] is as follows:


“While generalized inverses were not a part of Gauss’s vocabulary, equivalent expressions may be found in his writings. Specifically, in his solution of the problem of least squares, Gauss established explicit formulas which may readily be identified with generalized inverses. Indeed, by translating his analytical formulation of the problem into the more geometrical framework of vector spaces and linear mappings, the features of generalized inverses are easily recognized. In particular, Gauss’s development of best linear estimates was in the spirit of generalized inverses. This observation suggests the possibility that Gauss’s view of least squares in 1826 may have been more geometrical than the analytical form of his presentation could express. He may well have conceived best linear estimates in some mode of geometrical thought, but communicated the results in the accepted and more rigorous analytical mode of the day, since the geometrical tools of the early 19th century were limited, and the conceptual framework needed to develop a theory of generalized inverses was not available to him”.

In what follows, we briefly list some modern areas pioneered by Gauss. Scientific fields of today’s mathematics which can essentially be traced back to Gauss’s work are the following:

(a) The probability distribution function (“bell curve”) introduced by Gauss has, as the normal distribution, obtained a special significance in statistics, because the theory and practice of confidence intervals and the hypothesis tests based on them would not be possible without the Gaussian distribution (see [137]).

(b) According to W. Jordan, the least squares method can be used to solve optimization problems by an iterative process if the observations are redundant, e.g., for the detection of optimal weight distributions, a problem which can primarily be settled by use of the Bošković–Laplace method, respectively the simplex algorithm of linear programming (cf. [137]).

(c) The procedures of spherical and spherical-harmonic prediction and collocation, which made their headway through H. Moritz [90], are based on the method of least squares (note that “least squares collocation”, as it is called in geodesy, is just named “minimum norm interpolation” in mathematical language, while “least squares adjustment” as understood in geodesy is nothing more than “smoothing”, e.g., in the sense of [113]). Furthermore, the prediction method can be considered as a kind of inter- or extrapolation to determine a trend function. These approximation methods were transferred and widely extended to a spline context by W. Freeden and many others [24–27, 30, 39, 54, 55, 114, 121, 133].

Clearly, the list of examples of Gaussian ideas presented here is rather incomplete. Furthermore, our approach can be deepened by far, which will partly be done later in today’s functional analytic jargon.

2 Pseudoinverse for Finite Dimensional Matrix Equations

Until now, our survey on least squares and pseudoinverse facets has been formulated under historical aspects. In what follows (see [38]) we would like to answer the


question of what exactly enabled mathematicians to build the bridge from Gaussian least squares theory to essential settings of today’s theory of ill-posed problems, such as regularization. To this end, the following two questions (Q1) and (Q2) have to be answered:

(Q1) What can be specified as mathematical structures and settings from least squares by a process of abstraction, i.e., a reduction to essential features?

(Q2) What can be achieved from the reduced context obtained by abstraction from least squares theory as a new field of scientific interest, e.g., in geodesy or other sciences?

The answers can be given in a three-step procedure (S1), (S2), (S3):

(S1) We specify the transfer from least squares solutions to the theory of the pseudoinverse in a finite dimensional matrix calculus. The pseudoinverse is seen, comparably to, but more generally than, the Gaussian approach as the best-approximate solution within the set of least squares solutions in finite dimensional settings.

(S2) In turn, the reduction of the pseudoinverse to its specific properties, involving functional analytic means based on an operator theoretical background, leads to a concretization of the pseudoinverse in finite dimensional spaces that can be extended to infinite dimensional spaces in a straightforward way.

(S3) As a consequence, we shall be able, by use of an infinite dimensional operator calculus, to solve ill-posed problems, e.g., (compact) operator equations, using regularization for the inversion.

Methodologies for the resolution of ill-posed problems become obvious with particular reference to the problem of finding minimum norm least squares solutions of integral equations of the first kind (and, more generally, of linear operator equations with non-closed range). After having treated the finite dimensional situation (see also [95, 96]), we turn to the analogous theory in an operator framework later on.

2.1 Spectral Matrix Representation

We start with the linear matrix equation of the form

$$Ax = y, \qquad (3)$$

with $A \in \mathbb{K}^{n \times n}$ (as usual, $\mathbb{K}$ stands for $\mathbb{R}$ or $\mathbb{C}$) being a Hermitian positive definite matrix with $n$ rows and $n$ columns. From spectral theory (see, e.g., [135]) it is well known that there exist eigenvalues $0 < \lambda_1 \leq \ldots \leq \lambda_n$ and a corresponding unitary matrix $U = (u_1, \ldots, u_n)$ of eigenvectors $u_i \in \mathbb{C}^n \setminus \{0\}$ (i.e., $u_i^H u_j = \delta_{ij}$) such that $A$ has a representation of the form


$$U^H A U = \underbrace{\begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}}_{n \times n} = \operatorname{diag}(\lambda_1, \ldots, \lambda_n). \qquad (4)$$

The condition number of $A$ is given by the quotient of the largest and smallest eigenvalue, i.e., $\kappa = \lambda_n / \lambda_1$ (note that $\lambda_1 > 0$). For the sake of simplicity and coherence with the analysis for the infinite dimensional case below, we shall assume here that the scaling is such that $\lambda_n = 1$, so that $\kappa = \lambda_1^{-1}$. The condition number is a measure for the stable solvability of the problem (3).
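A minimal numerical sketch of eq. (4) and the condition number (Python/NumPy; the example matrix is ours):

```python
import numpy as np

# Condition number kappa = lambda_n / lambda_1 of a Hermitian positive
# definite matrix via its spectral decomposition.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lams, U = np.linalg.eigh(A)      # eigenvalues in ascending order
kappa = lams[-1] / lams[0]       # here: 3/1 = 3

D = U.T @ A @ U                  # eq. (4): U^H A U = diag(lambda_1, lambda_2)
```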

2.2 Ill-Conditioned Matrix Equations

Assume that we have noisy data $y^\varepsilon$ instead of $y$, which satisfy the condition

$$\|y^\varepsilon - y\| \leq \varepsilon \qquad (5)$$

in the Euclidean norm on $\mathbb{K}^n$. Let $x^\varepsilon$ denote the solution with right-hand side $y^\varepsilon$. Then it follows from the spectral representation that

$$x^\varepsilon - x = U \operatorname{diag}\left(\frac{1}{\lambda_1}, \ldots, \frac{1}{\lambda_n}\right) U^H (y^\varepsilon - y). \qquad (6)$$

Hence, observing the orthogonality of the eigenvectors, we are led to the estimate

$$\|x^\varepsilon - x\|^2 = \sum_{i=1}^{n} \lambda_i^{-2} \, |u_i^H (y^\varepsilon - y)|^2 \leq \lambda_1^{-2} \, \|y^\varepsilon - y\|^2. \qquad (7)$$

In other words, we have

$$\|x^\varepsilon - x\| \leq \kappa \, \|y^\varepsilon - y\| \leq \kappa \, \varepsilon. \qquad (8)$$

The sharpness of this estimate can immediately be seen for $y^\varepsilon - y = \varepsilon u_1$. It is clear that with increasing condition number of the matrix $A$, the noise amplification increases in the worst case. For large $\kappa$ one therefore speaks of an “ill-conditioned problem”. It should be remarked, however, that a finite dimensional linear problem is never ill-posed (in the sense that the third condition in Hadamard’s classification is violated), but for large $\kappa$ one certainly comes close to this case. We also observe that errors in low frequencies (i.e., corresponding to eigenvectors with large eigenvalues) are amplified less. Following our nomenclature we see that an error in the lowest frequency, i.e., $y^\varepsilon - y = \varepsilon u_n$, is not amplified at all. In


fact, we just obtain $\|x^\varepsilon - x\| = \varepsilon$ from the spectral representation. This is a typical effect for inverse problems. It means that not all possible versions of noise of the same size are equally bad: high-frequency noise corresponding to low eigenvalues is always worse than low-frequency noise. However, in practice we are able to make assumptions on the noise only in rare exceptions, so that a regularization method has to deal with arbitrary noise. Until now, we have assumed that the matrix $A$ is Hermitian positive definite, i.e., the minimal eigenvalue is positive. If this is not the case, the matrix has a non-trivial nullspace. If $\lambda_r$ denotes the minimal non-zero eigenvalue, then the solution formula becomes

$$x = \sum_{i=r}^{n} \lambda_i^{-1} u_i u_i^H y, \qquad (9)$$

and the problem is solvable if and only if $u_i^H y = 0$ for $i < r$. If the data set is noisy, i.e., instead of $y$ we have to consider $y^\varepsilon$, we are led to use the projection $P y^\varepsilon$ onto the range of $A$. In doing so we obtain for the corresponding solution $x^\varepsilon$ with data $P y^\varepsilon$ that

$$x^\varepsilon - x = \sum_{i=r}^{n} \lambda_i^{-1} u_i u_i^H (P y^\varepsilon - y). \qquad (10)$$

Since $u_i^H P y^\varepsilon = u_i^H y^\varepsilon$ for $i \geq r$, we can thus estimate similarly as described above

$$\|x^\varepsilon - x\| \leq \lambda_r^{-1} \varepsilon. \qquad (11)$$

Consequently, there is no error propagation in the nullspace components and the noise amplification is actually determined by the minimal non-zero eigenvalue.
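The effect can be sketched numerically (Python/NumPy; the diagonal example matrix and the noise values are ours): projecting the noisy data onto the range of $A$ removes the nullspace component of the noise, and the remaining error stays within the bound $\varepsilon / \lambda_r$ of eq. (11).

```python
import numpy as np

# A positive semidefinite with non-trivial nullspace; lambda_r is the smallest
# non-zero eigenvalue. Project the noisy data onto range(A), then invert only
# on the non-zero spectrum (cf. eqs. (10)-(11)).
lam_r = 0.5
A = np.diag([0.0, lam_r, 2.0])        # nullspace spanned by e_1
x_true = np.array([0.0, 1.0, 1.0])    # lies in N(A)^perp
y = A @ x_true

noise = np.array([0.3, 0.01, -0.02])  # contains a large nullspace component
y_eps = y + noise

P_y_eps = y_eps.copy()
P_y_eps[0] = 0.0                      # orthogonal projection onto range(A)

x_eps = np.zeros(3)
x_eps[1:] = P_y_eps[1:] / np.array([lam_r, 2.0])   # eq. (10) solution

err = np.linalg.norm(x_eps - x_true)
bound = np.linalg.norm(noise) / lam_r  # eq. (11): eps / lambda_r
```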

2.3 Matricial Pseudoinverse

Let $A \in \mathbb{K}^{n \times m}$ be a matrix with $n$ rows and $m$ columns, and $y \in \mathbb{K}^m$. Note that $A$ is not required to be square; no rank assumptions are made at this stage. Then we know that the linear system

$$Ax = y, \quad x \in \mathbb{K}^n, \qquad (12)$$

need not have a (unique) solution. If (12) is unsolvable, a reasonable generalized notion of a solution is a “least squares solution”, which minimizes the residual $\|Ax - y\|$ in the Euclidean norm (note that $\|\cdot\|$ stands for the Euclidean norm in this section): A vector $x \in \mathbb{K}^n$ is called a

(1) least squares solution of (12) if and only if

$$\|Ax - y\| = \inf\{\|Az - y\| : z \in \mathbb{K}^n\}, \qquad (13)$$


(2) best-approximate solution (or minimal norm solution) of (12) if and only if $x$ is a least squares solution and

$$\|x\| = \inf\{\|z\| : z \text{ is a least squares solution}\}. \qquad (14)$$

The following results are well known from classical linear algebra (see, e.g., [135]):

(i) A vector $x^*$ is a solution of (13) if and only if the “normal equations”

$$A^H A x^* = A^H y \qquad (15)$$

are satisfied.

(ii) The problem (13) possesses a unique solution if and only if $A$ has full rank.

Our purpose is to show that a best-approximate solution in the sense of (2) always exists and is unique, such that the following definition makes sense: If $A^\dagger \in \mathbb{K}^{m \times n}$ is understood as the matrix which assigns to each $y \in \mathbb{K}^n$ the best-approximate solution of (12), then it is called the pseudoinverse (Moore-Penrose or generalized inverse) of $A$. Our aim is to construct $A^\dagger$ and, hence, best-approximate solutions via the so-called singular value decomposition (SVD) of $A$. To this end we first recall the definition of the singular values of a matrix $A$. Let $\sigma_1, \ldots, \sigma_r > 0$ be such that $\sigma_1^2 \geq \sigma_2^2 \geq \ldots \geq \sigma_r^2 > 0$ are the positive eigenvalues of the matrix $A^H A$ (each one written down as often as its multiplicity requires). Then $\sigma_1, \ldots, \sigma_r$ are called the “singular values of $A$”. This definition makes sense, since $A^H A$ is positive semidefinite. Obviously, $r \leq \min\{n, m\}$, where $r$ is the rank of $A$. We know that a Hermitian matrix can be diagonalized, where the diagonal elements are its eigenvalues. The following theorem generalizes this result to the non-Hermitian case: Let $\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_r > 0$ be the singular values of $A$. Then there exist unitary matrices $U \in \mathbb{K}^{m \times m}$ and $V \in \mathbb{K}^{n \times n}$ such that

$$V^H A U = \underbrace{\begin{pmatrix} \sigma_1 & & & & \\ & \ddots & & & \\ & & \sigma_r & & \\ & & & 0 & \\ & & & & \ddots \\ & & & & & 0 \end{pmatrix}}_{m \times n} \qquad (16)$$


holds true. The columns of $U$ and $V$ are eigenvectors of $A^H A$ and $A A^H$, respectively. The expression (16) is called the “singular value decomposition (SVD) of $A$”. The singular value decomposition is not unique, since the unitary matrices $U$ and $V$ are not. Obviously, from (16), we obtain

$$A = V \begin{pmatrix} \sigma_1 & & & & \\ & \ddots & & & \\ & & \sigma_r & & \\ & & & 0 & \\ & & & & \ddots \\ & & & & & 0 \end{pmatrix} U^H. \qquad (17)$$

since V and U are invertible, the rank of A is r, the number of non-zero singular values (counted with multiplicity). Remark. In accordance one has to compute the singular values of A, e.g., as the positive square-roots of the eigenvalues of AH A, which can be done, e.g., by the QR-algorithm. However, since AH A usually has a worse condition than A, one should (in critical cases) use a variant of the QR-method that does not use AH A explicitly. Note that with U = (u1 , . . . , um ), V = (v1 , . . . vn ), we have for i ∈ {1, . . . , r} Aui = σi vi

(18)

and

$$A^H v_i = \sigma_i u_i, \qquad (19)$$

which follows from the singular value decomposition (16) via multiplication by $V$ and $U^H$, respectively. The system $\{(\sigma_i; u_i, v_i) : i \in \{1, \ldots, r\}\}$ is called a “singular system” for $A$. Since $\{v_1, \ldots, v_r\}$ is an orthonormal basis for the range $R(A) = \{y : Ax = y\}$, for any $x \in \mathbb{K}^n$,

$$Ax = \sum_{i=1}^{r} \langle Ax, v_i \rangle v_i = \sum_{i=1}^{r} \langle x, A^H v_i \rangle v_i, \qquad (20)$$

which implies in connection with (19) that

$$Ax = \sum_{i=1}^{r} \sigma_i \langle x, u_i \rangle v_i \qquad (21)$$


holds true. Analogously, for $y \in \mathbb{K}^m$, we have

$$A^H y = \sum_{i=1}^{r} \sigma_i \langle y, v_i \rangle u_i. \qquad (22)$$

Remark. The notion of a singular system and the expansions (21) and (22) generalize to compact operators on infinite dimensional spaces, e.g., integral operators, as we will see later on.

Remark. Note that if $A$ has real entries, so do $U$ and $V$.

Now we relate the SVD to the pseudoinverse (sometimes also called the Moore-Penrose inverse). Moreover, we show that the SVD can be used to compute the best-approximate solution. For $A$ having the SVD (16),

$$A^\dagger = U \underbrace{\begin{pmatrix} \frac{1}{\sigma_1} & & & & \\ & \ddots & & & \\ & & \frac{1}{\sigma_r} & & \\ & & & 0 & \\ & & & & \ddots \\ & & & & & 0 \end{pmatrix}}_{n \times m} V^H. \qquad (23)$$

Our approach also implies the existence and uniqueness of a best-approximate solution. Since $A^\dagger y$ is the least squares solution of minimal norm, we obtain that $A^\dagger y$ is the solution of the normal equations $A^H A x = A^H y$ with minimal norm, i.e.,

$$A^\dagger y = (A^H A)^\dagger A^H y. \qquad (24)$$
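A sketch of eqs. (23) and (24) in Python/NumPy (the test matrix and data are ours; note that NumPy's `svd` returns factors `U, s, Vh` with `A = U @ diag(s) @ Vh`, so NumPy's factors do not coincide letter-for-letter with the text's $U$ and $V$):

```python
import numpy as np

# Pseudoinverse via SVD (eq. (23)) for a rank-deficient 4x2 example.
A = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [0.0, 2.0],
              [0.0, 0.0]])

U, s, Vh = np.linalg.svd(A)              # A = U @ Sigma @ Vh
r = int(np.sum(s > 1e-12))               # numerical rank

Sigma_pinv = np.zeros((A.shape[1], A.shape[0]))
Sigma_pinv[:r, :r] = np.diag(1.0 / s[:r])
A_pinv = Vh.T @ Sigma_pinv @ U.T         # invert only non-zero singular values

y = np.array([1.0, 5.0, 4.0, -5.0])
x_best = A_pinv @ y                      # best-approximate solution

# eq. (24): the same result via the pseudoinverse of the normal equations
x_via_normal = np.linalg.pinv(A.T @ A) @ (A.T @ y)
```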

This means that in order to approximate $A^\dagger y$ we may as well compute an approximation to the minimal norm solution of the normal equations, a fact we will heavily use in the construction of regularization methods later on. For the case of overdetermined linear systems, i.e., (12) with $m \geq n$ and a null space $N(A) = \{x : Ax = 0\}$ consisting only of $\{0\}$, the best-approximate solution of (12) can effectively be computed via Householder transformations (see, e.g., [134]) in the following way: One transforms $A$ by $n - 1$ Householder (or other orthogonal, e.g., Givens) transformations into an $n \times n$ triangular matrix $R$ (with zeros from the $(n+1)$-st row on) and applies the same transformations to $y$. If $Q$ is the unitary matrix representing these transformations, we are led to

$$QA = \begin{pmatrix} R \\ 0 \end{pmatrix}, \qquad Qy = \begin{pmatrix} h_1 \\ h_2 \end{pmatrix}. \qquad (25)$$


For any $x \in \mathbb{K}^n$,

$$\|Ax - y\| = \|Q(Ax - y)\| = \left\| \begin{pmatrix} Rx - h_1 \\ -h_2 \end{pmatrix} \right\|. \qquad (26)$$

Consequently, the best-approximate solution can be computed in the form $A^\dagger y = R^{-1} h_1$ (since $R$ is triangular, $R^{-1} h_1$ can be computed easily by back substitution).
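The Householder/QR procedure of eqs. (25)-(26) can be sketched as follows (Python/NumPy; `numpy.linalg.qr` performs the orthogonal triangularization internally; the data, a small line fit, are ours):

```python
import numpy as np

# Overdetermined full-rank system solved via orthogonal (QR) factorization.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 2.0, 2.0])

Q, R = np.linalg.qr(A)          # reduced QR: A = Q R, R upper triangular 2x2
h1 = Q.T @ y                    # the h_1 part of Qy in eq. (25)
x_ls = np.linalg.solve(R, h1)   # A^dagger y = R^{-1} h_1 (back substitution)

x_ref, *_ = np.linalg.lstsq(A, y, rcond=None)   # library cross-check
```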

2.4 Continuous Dependence

While in the case of an invertible matrix $A$, $A^\dagger y = A^{-1} y$ depends continuously not only on $y$ but also on $A$, this is not the case in general. Consider, for example, the matrix (cf. [135])

$$A(\epsilon) = \begin{pmatrix} 1 & 0 \\ 0 & \epsilon \end{pmatrix}, \quad \epsilon \in \mathbb{R}; \qquad (27)$$

then

$$A(\epsilon)^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & \epsilon^{-1} \end{pmatrix} \qquad (28)$$

for $\epsilon \neq 0$, and we have

$$A(0)^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad (29)$$

so that, e.g.,

$$A(\epsilon)^\dagger \begin{pmatrix} 0 \\ 1 \end{pmatrix} \not\to A(0)^\dagger \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad (30)$$

as $\epsilon \to 0$.
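Numerically, the discontinuity of eqs. (27)-(30) looks as follows (a sketch; the value $10^{-8}$ is an arbitrary small choice of $\epsilon$):

```python
import numpy as np

# Discontinuity of the pseudoinverse: A(eps)^dagger applied to (0,1)^T
# blows up like 1/eps instead of converging to A(0)^dagger (0,1)^T.
def A(eps):
    return np.array([[1.0, 0.0],
                     [0.0, eps]])

e2 = np.array([0.0, 1.0])

z_small = np.linalg.pinv(A(1e-8)) @ e2   # approx (0, 1e8)
z_zero = np.linalg.pinv(A(0.0)) @ e2     # exactly (0, 0)

gap = np.linalg.norm(z_small - z_zero)   # grows like 1/eps as eps -> 0
```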

2.5 Truncated Singular Value Regularization

The representation (23) also shows how errors in $y$ affect the result $A^\dagger y$: errors in components of $y$ corresponding to small singular values are amplified by the large factor $1/\sigma_i$, so that such data errors are dangerous. This explains the numerical instability of (23) if $A$ has small singular values. Although the problem of computing the best-approximate solution is well-posed, it is then numerically unstable.


The first idea to reduce this instability is to replace (23) by

$$x_\alpha = \sum_{\substack{i=1 \\ \sigma_i^2 \geq \alpha}}^{r} \frac{\langle y, v_i \rangle}{\sigma_i} \, u_i \qquad (31)$$

with an appropriately chosen value $\alpha > 0$; this truncation is the first example of a “regularization”, where the original problem is replaced by a neighboring one, which is more stable. However, the choice of the “regularization parameter” $\alpha$ is quite crucial. If we use (31) with perturbed data $y^\varepsilon$ (with $\|y - y^\varepsilon\| \leq \varepsilon$), we obtain as the “regularized solution”

$$x_\alpha^\varepsilon = \sum_{\substack{i=1 \\ \sigma_i^2 \geq \alpha}}^{r} \frac{\langle y^\varepsilon, v_i \rangle}{\sigma_i} \, u_i. \qquad (32)$$

We estimate the total error between $x_\alpha^\varepsilon$ and the sought-for quantity $A^\dagger y$:

$$\|x_\alpha^\varepsilon - A^\dagger y\| \leq \Bigg\| \sum_{\substack{i=1 \\ \sigma_i^2 \geq \alpha}}^{r} \frac{\langle y, v_i \rangle}{\sigma_i} \, u_i - \sum_{i=1}^{r} \frac{\langle y, v_i \rangle}{\sigma_i} \, u_i \Bigg\| + \Bigg\| \sum_{\substack{i=1 \\ \sigma_i^2 \geq \alpha}}^{r} \frac{\langle y, v_i \rangle - \langle y^\varepsilon, v_i \rangle}{\sigma_i} \, u_i \Bigg\|. \qquad (33)$$

For sufficiently small $\alpha$, the first summand on the right-hand side is empty. For the second summand we have

$$\Bigg\| \sum_{\substack{i=1 \\ \sigma_i^2 \geq \alpha}}^{r} \frac{\langle y, v_i \rangle - \langle y^\varepsilon, v_i \rangle}{\sigma_i} \, u_i \Bigg\|^2 = \sum_{\substack{i=1 \\ \sigma_i^2 \geq \alpha}}^{r} \frac{|\langle y - y^\varepsilon, v_i \rangle|^2}{\sigma_i^2} \leq \frac{1}{\alpha} \sum_{i=1}^{r} |\langle y - y^\varepsilon, v_i \rangle|^2 \leq \frac{\varepsilon^2}{\alpha}. \qquad (34)$$

The sum (32) is called the truncated singular value expansion. It can be interpreted as applying a low-pass filter to the data.
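A minimal sketch of the truncated singular value expansion (31)-(32) (Python/NumPy; matrix, noise, and α are ours; NumPy's `svd` factor names do not match the text's letter-for-letter):

```python
import numpy as np

# Truncated SVD: keep only components with sigma_i^2 >= alpha.
def tsvd_solution(A, y, alpha):
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    keep = s**2 >= alpha
    coeffs = (U.T @ y)[keep] / s[keep]   # <y, v_i> / sigma_i
    return Vh.T[:, keep] @ coeffs

A = np.diag([1.0, 1e-4])                 # ill-conditioned toy matrix
x_true = np.array([1.0, 1.0])
y_eps = A @ x_true + np.array([0.0, 1e-3])   # noise hits the small sigma

x_naive = np.linalg.solve(A, y_eps)      # noise amplified by 1/sigma_2 = 1e4
x_reg = tsvd_solution(A, y_eps, alpha=1e-4)  # drops the sigma_2 component
```

The regularized solution trades a (bounded) truncation error for stability: it misses the second component of `x_true`, but avoids the factor-10,000 noise blow-up of the naive inversion.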

2.6 Tikhonov Regularization

Another way of making (23) more stable is to replace it by the sum

$$x_\alpha^\varepsilon = \sum_{i=1}^{r} \frac{\sigma_i}{\sigma_i^2 + \alpha} \, \langle y^\varepsilon, v_i \rangle \, u_i. \qquad (35)$$

This is a famous Tikhonov regularization method (cf. [126–130]), which we shall consider in more detail in infinite dimensions, too. It is helpful to characterize it in


a different way: If $x_\alpha^\varepsilon$ is given by (35), then it is not hard to see that

$$\left( A^H A + \alpha I \right) x_\alpha^\varepsilon = A^H y^\varepsilon, \qquad (36)$$

which is an alternative characterization of the Tikhonov regularization. From this “regularized normal equation” we can obtain still another characterization of $x_\alpha^\varepsilon$, namely as the unique minimizer of the so-called Tikhonov functional

$$x \mapsto \|Ax - y^\varepsilon\|^2 + \alpha \|x\|^2, \qquad (37)$$

which can be seen by setting the first derivative of the functional in (37) to 0, resulting exactly in the linear equation (36). The minimization of (37) can be seen as a combination of the two minimizations that appear in the definition of a best-approximate solution. It also has interpretations as a penalty method, e.g., via Lagrange multipliers. The computation explained above can also be performed for $\alpha = 0$ (with (17) replaced by (23)). In this case it shows that $x$ is the solution of minimal norm of the normal equations

$$A^H A x = A^H y, \qquad (38)$$

which was already attacked by [43] (see also the contributions [110, 115]). If $A^H A$ is invertible (and, hence, positive definite), the normal equations (38) can be solved by standard Cholesky decomposition, which leads to an alternative method for computing best-approximate solutions, for which no SVD is needed. However, as mentioned above, forming $A^H A$ may seriously decrease the numerical stability. Hence, it should be avoided in ill-conditioned cases.
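Tikhonov regularization via the regularized normal equations (36) can be sketched as follows (Python/NumPy; the ill-conditioned diagonal example and the choice of α are ours; for real matrices $A^H$ reduces to $A^T$):

```python
import numpy as np

# Tikhonov regularization: solve (A^H A + alpha I) x = A^H y_eps, eq. (36),
# which is equivalent to minimizing ||A x - y_eps||^2 + alpha ||x||^2.
def tikhonov(A, y, alpha):
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)

A = np.diag([1.0, 1e-4])                 # ill-conditioned toy matrix
x_true = np.array([1.0, 1.0])
y_eps = A @ x_true + np.array([0.0, 1e-3])

x_alpha = tikhonov(A, y_eps, alpha=1e-4)
x_naive = np.linalg.solve(A, y_eps)      # unregularized: error blows up
```

Unlike the truncated SVD, all singular values are kept but damped by the filter factor σ²/(σ²+α) of eq. (35).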

3 Functional Analytical Nomenclature

The following nomenclature is part of any functional analytic textbook (see, e.g., [72, 125, 138]). A mapping $A : X \to Y$ between two normed spaces $X$ and $Y$ is called an operator. In analogy to the matricial approach we let

$$N(A) = \{x \in X : Ax = 0\} \subset X \qquad (39)$$

and

$$R(A) = \{y = Ax : x \in X\} \subset Y \qquad (40)$$

for the kernel (or null space) of $A$ and the image (range) of $A$, respectively. Let $X$ and $Y$ be normed spaces. The operator $A : X \to Y$ is called linear if

(1) $A(x + y) = Ax + Ay$ for all $x, y \in X$,
(2) $A(\alpha x) = \alpha Ax$ for all $x \in X$ and $\alpha \in \mathbb{K}$.


The operator $A$ is called bounded if there exists a constant $C \geq 0$ such that

$$\|Ax\|_Y \leq C \|x\|_X \quad \text{for all } x \in X.$$

Let $X$ and $Y$ be normed spaces and $A : X \to Y$ be a linear operator. Then the following statements are equivalent:

(1) $A$ is continuous on $X$.
(2) $A$ is continuous in $0 \in X$.
(3) $A$ is bounded on $X$.

The space of all continuous linear operators between $X$ and $Y$ is denoted by $L(X, Y)$. If $Y = X$ we set $L(X) = L(X, X)$. A norm on $L(X, Y)$ is given by

$$\|A\| = \sup_{x \neq 0} \frac{\|Ax\|_Y}{\|x\|_X} = \sup_{\|x\|_X \leq 1} \|Ax\|_Y = \sup_{\|x\|_X = 1} \|Ax\|_Y. \qquad (41)$$

We often use the notation $\|A\|_{X \to Y}$ for the norm of an operator $A : X \to Y$. Together with the norm (41), the space $L(X, Y)$ is a normed space; if $X$ is a normed space and $Y$ is a Banach space, then $L(X, Y)$ is a Banach space. For the composition of two linear operators we have the following result: Let $A \in L(X, Y)$ and $B \in L(Y, Z)$; then $BA \in L(X, Z)$ and we have

$$\|BA\|_{X \to Z} \leq \|B\|_{Y \to Z} \, \|A\|_{X \to Y}. \qquad (42)$$

For $A \in L(X)$ we obtain iteratively

$$\|A^n\| \leq \|A\|^n \qquad (43)$$

for all $n \in \mathbb{N}$. For sequences of operators, two different notions of convergence are used. Let $\{A_n\}_{n \in \mathbb{N}} \subset L(X, Y)$ be a sequence of operators between $X$ and $Y$ and let $A \in L(X, Y)$.

(1) The sequence $\{A_n\}_{n \in \mathbb{N}}$ is called pointwise convergent to the operator $A$ if

$$\lim_{n \to \infty} \|A_n x - Ax\|_Y = 0 \qquad (44)$$

for all $x \in X$.

(2) The sequence $\{A_n\}_{n \in \mathbb{N}}$ is called uniformly convergent to the operator $A$ if


$$\lim_{n \to \infty} \|A_n - A\|_{X \to Y} = 0. \qquad (45)$$

Uniform convergence implies pointwise convergence, but the converse is in general not true. Since we mainly study inverse problems in geodesy, the inversion of operators is of particular interest. For the existence of a continuous inverse of an operator we have the following equivalent criterion: The linear operator $A : X \to Y$ between the normed spaces $X$, $Y$ has a continuous inverse, i.e., $A^{-1} \in L(R(A), X)$, if and only if there exists a constant $c > 0$ such that

$$c \, \|x\|_X \leq \|Ax\|_Y \quad \text{for all } x \in X. \qquad (46)$$

In this case,

$$\|A^{-1}\| \leq c^{-1}. \qquad (47)$$

As for the existence of a bounded inverse of an operator $A$, we also have an equivalent criterion for the non-existence of a bounded inverse: An operator $A \in L(X, Y)$ does not have a continuous inverse if and only if there exists a sequence $\{x_n\} \subset X$ with $\|x_n\|_X = 1$ for all $n \in \mathbb{N}$ and $\lim_{n \to \infty} A x_n = 0$. The space $L(X, \mathbb{K})$ is clearly a Banach space. It is symbolized by $X^*$ and called the dual space of $X$. The elements of $X^*$ are called linear functionals.

3.1 Well-Posedness in the Sense of Hadamard

A mathematical model is a mapping $A : X \to Y$ from the set (of “causes”) $X$ to the set (of “effects”) $Y$. Here,

$$D(A) = X, \quad R(A) = \{y = Ax : x \in X\}, \qquad (48)$$

and

$$N(A) = \{x \in X : Ax = 0\} \qquad (49)$$

are the domain (of definition) of $A$, the range (or image) of $A$, and the null space (or kernel) of $A$, respectively. We are interested in investigating the operator equation

$$Ax = y, \quad x \in X, \ y \in Y. \qquad (50)$$

A direct problem amounts to the description and evaluation of y, i.e., Ax. An inverse problem is the task to find, for a given element y ∈ Y , an element x ∈ X such that (50) holds true.


Hadamard’s Classification: Let $X$, $Y$ be metric spaces. Suppose that $A : X \to Y$ is a mapping from $X$ to $Y$. The problem $(A; X, Y)$ is called well-posed (in the sense of Hadamard) if

(H1) for every $y \in Y$ there exists an $x \in X$ with $Ax = y$ (existence of the inverse),
(H2) for every $y \in Y$ there exists one and only one $x \in X$ with $Ax = y$ (uniqueness of the inverse),
(H3) the inverse mapping $A^{-1} : Y \to X$ is continuous, i.e., the solution $x \in X$ of $Ax = y$ depends continuously on $y \in Y$ (continuous dependence).

If one of the three properties is violated, then the problem is called ill-posed (in the sense of Hadamard). Using a singular system in a finite dimensional context, $x$ can also be written as the sum $\sum_{i=1}^{r} \sigma_i^{-1} \langle y, v_i \rangle u_i$, such that

$$A^\dagger y = \sum_{i=1}^{r} \frac{\langle y, v_i \rangle}{\sigma_i} \, u_i, \qquad (51)$$

which will also generalize to the infinite dimensional setting. $A^H A$ is invertible if $N(A^H A) = N(A) = \{0\}$. Then (cf. [88, 89]) we have

$$A^\dagger = (A^H A)^{-1} A^H. \qquad (52)$$

The SVD of $A$ immediately leads us to the famous “Moore-Penrose equations”, which also characterize $A^\dagger$:

$$A^\dagger A A^\dagger = A^\dagger, \qquad (53)$$
$$A A^\dagger A = A, \qquad (54)$$
$$(A^\dagger A)^H = A^\dagger A, \qquad (55)$$
$$(A A^\dagger)^H = A A^\dagger. \qquad (56)$$
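The four Moore-Penrose equations (53)-(56) can be checked numerically for a rank-deficient example (a sketch; the matrix is ours, and for real matrices the Hermitian transpose reduces to the ordinary transpose):

```python
import numpy as np

# Numerical check of the Moore-Penrose equations for a rank-1 real matrix.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 0.0]])       # rank 1
Ad = np.linalg.pinv(A)

mp1 = np.allclose(Ad @ A @ Ad, Ad)       # (53)
mp2 = np.allclose(A @ Ad @ A, A)         # (54)
mp3 = np.allclose((Ad @ A).T, Ad @ A)    # (55)
mp4 = np.allclose((A @ Ad).T, A @ Ad)    # (56)
```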

It can also be seen that $A^\dagger A$ and $A A^\dagger$ are orthogonal projectors onto $N(A)^\perp$ and $R(A)$, respectively. These facts, in turn, can be used to characterize $A^\dagger$ in yet another way, namely as the unique linear operator satisfying

$$A^\dagger \big|_{R(A)} = \left( A \big|_{N(A)^\perp} \right)^{-1} \qquad (57)$$

and

$$N(A^\dagger) = R(A)^\perp. \qquad (58)$$


#−1 " In other words, the pseudoinverse is the unique linear extension of A|N (A)⊥ to A† = R(A) ⊕ R(A)⊥

(59)

satisfying (58). Summarizing our results we obtain the following statement: For each $y \in D(A^\dagger)$, the equation $Ax = y$ has a unique best-approximate solution (pseudosolution) given by

$x^\dagger = A^\dagger y, \qquad (60)$

where $A^\dagger$ is the pseudoinverse, and the set of all least squares solutions is given by $\{x^\dagger\} + N(A)$.
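A small numerical illustration of (60) with hypothetical data: every least squares solution has the form $x^\dagger + n$ with $n \in N(A)$, all of them share the same residual, and $x^\dagger$ is the one of minimal norm:

```python
import numpy as np

# Hypothetical inconsistent, rank-deficient system.
A = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0]])
y = np.array([1.0, 3.0, 5.0])        # y is not in R(A)

x_dagger = np.linalg.pinv(A) @ y     # pseudosolution, eq. (60)
n = np.array([1.0, -1.0])            # spans N(A) here, since A n = 0

r0 = np.linalg.norm(A @ x_dagger - y)
r1 = np.linalg.norm(A @ (x_dagger + 2.5 * n) - y)
print(np.isclose(r0, r1))                 # identical residuals
print(np.linalg.norm(x_dagger) < np.linalg.norm(x_dagger + 2.5 * n))
```

Note also that $x^\dagger$ is orthogonal to $N(A)$, consistent with $x^\dagger \in N(A)^\perp$.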

3.2 Pseudoinverse for Infinite Dimensional Operator Equations

Next, as already announced, our goal is to extend some material known from matrix analysis in (finite dimensional) Euclidean spaces to a linear operator framework between Hilbert spaces. Central in our considerations is the introduction of the pseudoinverse. We begin with some preparatory remarks clarifying the functional analytic background in order to make the Hilbert space (HS) context of the theory of ill-posed problems (IPP) more transparent: As usual, the domain, range, and nullspace (kernel) of any operator A : D(A) → R(A) are denoted by D(A), R(A), and N(A), respectively. We start with solvability conditions of an operator equation in Hilbert spaces, where the concepts of "distance" and "angle" are at the disposal of the applicant. More concretely, let X and Y be Hilbert spaces and let A : X → Y be a bounded linear operator whose range R(A) is not necessarily closed. Then we have the orthogonal decompositions

$X = N(A) \oplus N(A)^\perp, \qquad (61)$

$Y = \overline{R(A)} \oplus R(A)^\perp, \qquad (62)$

and

$N(A^*) = R(A)^\perp, \qquad (63)$

where $N(A)^\perp$ is the orthogonal complement of $N(A)$, $\overline{R(A)}$ is the closure of the range of $A$, and $A^*$ is the adjoint operator of $A$, i.e.,

$\langle Ax, y\rangle = \langle x, A^* y\rangle \qquad (64)$

for all $x \in X$ and $y \in Y$. We consider the operator equation

$Ax = y. \qquad (65)$

The following cases can be distinguished:

(1) $R(A)$ is dense in $Y$ (hence, $N(A^*) = \{0\}$), and $y \in R(A)$;
(2) $R(A)$ is dense in $Y$, and $y \notin R(A)$;
(3) $\overline{R(A)}$ is a proper subspace of $Y$, and $y \in R(A) + R(A)^\perp$;
(4) $\overline{R(A)} \neq Y$, and $y \notin R(A) + R(A)^\perp$.

In case (1) one has, of course, a solution in the classical sense; in cases (2) and (4) a classical solution does not exist, while in case (3) a classical solution need not exist. We say $x$ is a "least squares solution" of (65) if

$\inf\{\|Au - y\| : u \in X\} = \|Ax - y\|. \qquad (66)$

Since

$\|Au - y\|^2 = \|Au - Qy\|^2 + \|y - Qy\|^2, \qquad (67)$

where $Q$ is the orthogonal projector of $Y$ onto $\overline{R(A)}$, it is clear that a least squares solution exists if and only if

$y \in R(A) + R(A)^\perp, \qquad (68)$

where $R(A) + R(A)^\perp$ is a dense set in $Y$. For such $y$ the set of all least squares solutions of (65), denoted by $L(y)$, is a non-empty closed convex set (indeed, $L(y)$ is the translate of $N(A)$ by a fixed element of $L(y)$); hence, it has a unique element of minimal norm, denoted by $A^\dagger y$. The pseudoinverse (or generalized inverse) $A^\dagger$ is the linear operator which assigns to each $y \in D(A^\dagger) := R(A) + R(A)^\perp$ the unique element in $L(y) \cap N(A)^\perp$, so that $L(y) = A^\dagger y + N(A)$. It is easy to show that $A^\dagger y$ is the minimal norm solution (equivalently, the unique solution in $N(A)^\perp$) of the normal equation

$A^* A x = A^* y \qquad (69)$

(the equation obtained by setting the first variation of $\|Ax - y\|^2$ equal to zero). It also follows that $A^\dagger = (A|_{N(A)^\perp})^{-1} Q$, so that $A^\dagger$ can be characterized as the linear operator with the function theoretic properties

$D(A^\dagger) = R(A) + R(A)^\perp, \qquad N(A^\dagger) = R(A)^\perp = N(A^*), \qquad (70)$

and

$R(A^\dagger) = N(A)^\perp. \qquad (71)$

The equivalence of these characterizations of $A^\dagger$ is established in [93] (see also [51, 99] for a lucid exposition and [99, 100] for generalizations to unbounded operators). In case (1) above, $A^\dagger$ gives the minimal norm solution of (65). In case (3), Eq. (65) has a least squares solution (which is unique if and only if $N(A) = \{0\}$). In both


cases the infimum in (66) is attained and is equal to zero and $\|y - Qy\|$, respectively. Cases (2) and (4) are pathological and usually are not under discussion in generalized inverse theory, since in both cases $y \notin D(A^\dagger)$, and the infimum in (66) is not attained. As a canonical evolution of Hadamard's classification, M.Z. Nashed [95, 98] called the operator equation (65) well-posed in the least squares sense (relative to X and Y) if for each y ∈ Y the equation has a unique least squares solution (of minimal norm), which depends continuously on y; otherwise the problem is ill-posed. The advantage of adopting this notion of well-posedness is that it focuses on infinite dimensional problems (e.g., an inconsistent finite system of linear algebraic equations will not be ill-posed in the above sense, while it is ill-posed in the sense of Hadamard). It follows immediately from the open mapping theorem in functional analysis that the following statements are equivalent: (a) the problem (65) is well-posed; (b) R(A) is closed; (c) A† is bounded. Summarizing we are led to the following conclusion (see [3, 96, 99]): The problem (A; X, Y) is called well-posed in the sense of Nashed, if R(A) is closed in Y. If R(A) is not closed in Y, the problem (A; X, Y) is called ill-posed in the sense of Nashed.

3.3 Weighted Least Squares Problems

Very often we are interested in weighted minimal norm least squares solutions. Let $L_A(y)$ be the set of all least squares solutions of $Ax = y$, where $A$ is a member of the class $L(X, Y)$ of all bounded linear operators from $X$ into $Y$. Let $Z$ be a Hilbert space and $L : D_L \subset X \to Z$ be a closed linear operator with dense domain and closed range. For $y \in D(A^\dagger)$, we consider the following problem: find $w \in L_A(y)$ such that

$\|Lw\| \le \|Lu\| \quad \text{for all } u \in L_A(y). \qquad (72)$

If $L(N(A))$ is closed and $N(A) \cap N(L) = \{0\}$, then (72) has a unique solution $w(y)$. We denote by $A_L^\dagger$ the linear map induced by $y \mapsto w(y)$ and call it the weighted generalized inverse of $A$. We define a new inner product and norm on $D(L)$ by

$[u, v]_L := \langle Au, Av\rangle + \langle Lu, Lv\rangle \qquad (73)$

with $\|u\|_L := \sqrt{[u, u]_L}$. We denote the space $D(L)$ with this new inner product by $X_L$. It is easy to show that $A_L^\dagger y$ is the least squares solution of minimal $X_L$-norm of $Ax = y$. Let

$M := \{x \in X : L^* L x \in N(A)^\perp\}. \qquad (74)$

Then $M$ is the orthogonal complement of $N(A)$ with respect to $[\cdot, \cdot]_L$, and $R(A_L^\dagger) = M$.

3.4 Singular Value Decomposition for Compact Operators

Next we discuss a certain set of operators, viz. compact operators, that turn out to be specific prototypes for generating a large class of ill-posed problems. Let X, Y be normed spaces. An operator A : X → Y is called compact if one of the following equivalent conditions is fulfilled:

(1) Every bounded subset $U \subset X$ possesses an image in $Y$ which is relatively compact, i.e., $\overline{A(U)}$ is a compact set.
(2) For every bounded sequence $\{x_n\}_n \subset X$ the sequence $\{Ax_n\}_n$ possesses a subsequence that converges in $Y$.

Let X, Y, Z be normed spaces. By convention, we use the following setting:

$K(X, Y) = \{A : X \to Y : A \text{ is linear and compact}\} \qquad (75)$

and

$K(X) = K(X, X). \qquad (76)$

The following statements hold true:

(1) $K(X, Y) \subset L(X, Y)$.
(2) If $A \in L(X, Y)$ with $\dim R(A) < \infty$, then $A$ is compact.
(3) If $A : X \to Y$ is compact and $B : Y \to Z$ is continuous, or $A : X \to Y$ is continuous and $B : Y \to Z$ is compact, then the composition $BA : X \to Z$ is compact.
(4) The identity operator $I : X \to X$ is compact if and only if $X$ is finite dimensional.
(5) Let $Y$ be a Banach space. Then $K(X, Y)$ is closed, i.e., if a sequence of compact operators $\{A_n\}_n \subset K(X, Y)$ satisfies $\lim_{n\to\infty} \|A_n - A\|_{X\to Y} = 0$, then the limit operator $A$ is compact.
(6) If $A$ is compact and invertible and $X$ is not finite dimensional, then $A^{-1}$ is not continuous.

Example. Let $G$ be a regular region in $\mathbb{R}^q$, i.e., a bounded region $G$ dividing $\mathbb{R}^q$ uniquely into the inner space $G$ and the outer space $G^c = \mathbb{R}^q \setminus \overline{G}$, $\overline{G} = G \cup \partial G$, such that the boundary $\partial G$ is an orientable smooth Lipschitzian manifold of dimension $q - 1$.


Suppose that $K$ is of class $C^{(0)}(\overline{G} \times \overline{G})$. We introduce the integral operator $A : C^{(0)}(\overline{G}) \to C^{(0)}(\overline{G})$ by letting

$(AF)(x) = \int_G K(x, y)\, F(y)\, dy, \qquad F \in C^{(0)}(\overline{G}). \qquad (77)$

The proof of the compactness of $A$ can be based on a well-known theorem of classical analysis, namely the Theorem of Arzelà-Ascoli. This theorem provides two equivalent properties to the relative compactness of a subset $U \subset C^{(0)}(\overline{G})$:

Let $\emptyset \neq G$ be regular. A subset $U \subset C^{(0)}(\overline{G})$ is relatively compact if and only if the following two statements are valid:

(i) $U$ is equicontinuous, i.e., for every $\varepsilon > 0$ there exists $\delta(\varepsilon) > 0$ such that for all $F \in U$

$|F(x) - F(y)| < \varepsilon \qquad (78)$

for all $x, y \in \overline{G}$ with $\|x - y\| < \delta(\varepsilon)$.
(ii) $U$ is bounded, i.e., there exists an $M > 0$ with $\|F\|_{C^{(0)}(\overline{G})} < M$ for all $F \in U$.

In accordance with the Theorem of Arzelà-Ascoli we are now interested in applying the above conditions (i) and (ii) to the context of the integral operator introduced in (77):

(1) $K$ is uniformly continuous in $\overline{G} \times \overline{G}$. Therefore, $AU$ is equicontinuous.
(2) Suppose that $U \subset C^{(0)}(\overline{G})$ is bounded and that $F \in U$. Then we have

$|AF(x)| = \left| \int_G K(x, y)\, F(y)\, dy \right| \le M \sup_{x, y \in \overline{G}} |K(x, y)|\; \|G\| < \infty, \qquad (79)$

where $\|G\|$ denotes the volume of $G$. In other words, $AU$ is bounded. As a consequence, by virtue of the Theorem of Arzelà-Ascoli, we are able to conclude that $AU$ is relatively compact, so that the operator $A$ defined by (77) is compact.

Example. Let $G$ be a regular region in $\mathbb{R}^q$, and let $K$ be of class $L^2(G \times G)$. Then $A : L^2(G) \to L^2(G)$ given by

$AF = \int_G K(\cdot, y)\, F(y)\, dy, \qquad F \in L^2(G), \qquad (80)$

is compact (the proof can be found in, e.g., [63, 75]).


Example. Let $G$ be a regular region in $\mathbb{R}^q$. We introduce the operator $A : L^2(G) \to L^2(G)$ given by

$AF(x) = \int_G K(x, y)\, F(y)\, dy, \qquad F \in L^2(G). \qquad (81)$

If $K$ is continuous for $x \neq y$ and weakly singular, i.e., there exist a value $\alpha \in (0, q)$ and a constant $C > 0$ such that

$|K(x, y)| \le C\, \frac{1}{|x - y|^{q - \alpha}}, \qquad (82)$

then $A$ is compact (for the proof see, e.g., [72]). As a consequence, the Newton volume integral

$V(x) = AF(x) = \frac{1}{4\pi} \int_G \frac{1}{|x - y|}\, F(y)\, dy, \qquad F \in L^2(G), \qquad (83)$

occurring in the so-called inverse gravimetry problem of determining geologic density distributions $F$ inside the Earth from the Earth's gravitational potential $V$ in $G^c \subset \mathbb{R}^3$, forms a compact operator $A$.
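The compactness of such integral operators is exactly what makes the associated inverse problems ill-posed. A minimal sketch, discretizing (77) with a smooth, hypothetical kernel $K(x, y) = \exp(-(x - y)^2)$ on $G = [0, 1]$ by the midpoint rule, makes this visible in the rapidly decaying singular values of the resulting matrix:

```python
import numpy as np

# Midpoint-rule discretization of (AF)(x) = ∫ K(x,y) F(y) dy on [0,1]
# with the hypothetical smooth kernel K(x,y) = exp(-(x-y)^2).
n = 100
t = (np.arange(n) + 0.5) / n
h = 1.0 / n
K = np.exp(-(t[:, None] - t[None, :]) ** 2) * h

s = np.linalg.svd(K, compute_uv=False)
print(s[0], s[40])   # the 41st singular value is already negligibly small
```

The singular values fall off so fast that inverting the discretized operator amplifies any data noise enormously, which is the finite dimensional fingerprint of ill-posedness.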

3.5 Singular Value Decomposition

Next we are concerned with the introduction of eigenvalues and eigenfunctions corresponding to an operator $A \in L(X)$. Let $X$ be a normed space. Suppose that $A$ is of class $L(X)$.

(1) $\lambda \in \mathbb{C}$ is called a regular value of $A$ if and only if $\lambda I - A$ is continuously invertible. $\rho(A) = \{\lambda \in \mathbb{C} : \lambda I - A \text{ is continuously invertible}\}$ is called the resolvent set.
(2) $\sigma(A) = \mathbb{C} \setminus \rho(A)$ is called the spectrum of $A$.
(3) $\lambda \in \sigma(A)$ is called an eigenvalue of $A$ if $N(\lambda I - A) \neq \{0\}$. The elements of $N(\lambda I - A) \setminus \{0\}$ are called eigenvectors of $A$ corresponding to the eigenvalue $\lambda$.

The following results are standard for a Banach space $X$ (see, e.g., [63]):

(1) If $\lambda \in \sigma(A)$, then $|\lambda| \le \|A\|$, i.e., the spectrum is bounded.
(2) $\sigma(A) \subset \mathbb{C}$ is compact.

Let $X$ be a normed space. Suppose that $A$ is a compact operator on $X$ (i.e., $A \in K(X)$).

(1) If $\lambda \in \sigma(A) \setminus \{0\}$, then $\lambda$ is an eigenvalue of $A$.
(2) If $\lambda$ is an eigenvalue of $A$, then $\dim N(\lambda I - A) < \infty$.
(3) $\sigma(A)$ is at most countable. Furthermore, $0 \in \sigma(A)$.
(4) $0$ is the only accumulation point of $\sigma(A)$.

Central in our considerations about compact operators is the following spectral theorem for compact self-adjoint operators, which can be seen in parallel to the finite dimensional case of matrix operators: Let $X$ be a Hilbert space. Assume that $A$ is of class $K(X)$ and that $A$ is self-adjoint, i.e., $A^* = A$. Then there exist an orthonormal system $\{x_i\}_{i \in \mathbb{N}} \subset X$ and a sequence $\{\mu_i\}_{i \in \mathbb{N}} \subset \mathbb{R}$ (finite or countable) with $|\mu_1| \ge |\mu_2| \ge \ldots > 0$, such that

$Ax = \sum_{i=1}^{\infty} \mu_i \langle x, x_i\rangle\, x_i \qquad (84)$

holds true for all $x \in X$ (note that we usually write $\langle x, x_i\rangle$ instead of $\langle x, x_i\rangle_X$ if no confusion is likely to arise).

Suppose that $X$ and $Y$ are Hilbert spaces. Furthermore, let $A$ be of class $K(X, Y)$. Then $A^* A$ is also compact and obviously self-adjoint. Due to the spectral theorem there exist a sequence $\{\lambda_i\}_{i \in \mathbb{N}} \subset \mathbb{R}$ and an orthonormal system $\{x_i\}_{i \in \mathbb{N}} \subset X$ such that

$A^* A x = \sum_{i=1}^{\infty} \lambda_i \langle x, x_i\rangle\, x_i, \qquad x \in X. \qquad (85)$

Suppose that $\lambda_i \in \sigma(A^* A) \setminus \{0\}$ and denote, as usual, by $x_i$ its corresponding eigenvector. It follows that

$\lambda_i \|x_i\|^2 = \lambda_i \langle x_i, x_i\rangle = \langle \lambda_i x_i, x_i\rangle = \langle A^* A x_i, x_i\rangle = \langle Ax_i, Ax_i\rangle = \|Ax_i\|^2. \qquad (86)$

Therefore we are able to conclude that $\lambda_i > 0$. In the sequel, we assume that the eigenvalues are listed in decreasing order as follows: $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_i \ge \lambda_{i+1} \ge \cdots \ge 0$. Set $\sigma_i = \sqrt{\lambda_i}$. Moreover, let

$y_i = \frac{1}{\sigma_i}\, A x_i, \qquad (87)$

i.e., $Ax_i = \sigma_i y_i$, $i \in \mathbb{N}$, and

$A^* y_i = A^* \left( \frac{1}{\sigma_i}\, A x_i \right) = \frac{1}{\sigma_i}\, A^* A x_i = \frac{1}{\sigma_i}\, \lambda_i x_i = \sigma_i x_i. \qquad (88)$


It is not hard to see that

$\langle y_i, y_k\rangle = \frac{1}{\sigma_i \sigma_k} \langle Ax_i, Ax_k\rangle = \frac{1}{\sigma_i \sigma_k} \langle A^* A x_i, x_k\rangle = \frac{1}{\sigma_i \sigma_k} \langle \lambda_i x_i, x_k\rangle = \frac{\lambda_i}{\sigma_i \sigma_k}\, \delta_{i,k} = \frac{\sigma_i}{\sigma_k}\, \delta_{i,k} = \delta_{i,k}. \qquad (89)$

Thus, $\{y_i\}_{i \in \mathbb{N}} \subset Y$ forms a complete orthonormal system (ONS) in $\overline{R(A)}$, so that the system $\{x_i\}_{i \in \mathbb{N}}$ is a complete ONS in $N(A)^\perp$. Now, assume that $x$ is a member of $N(A)^\perp$. Then it follows that

$x = \sum_{i=1}^{\infty} \langle x, x_i\rangle\, x_i \qquad (90)$

and

$Ax = \sum_{i=1}^{\infty} \langle x, x_i\rangle\, Ax_i = \sum_{i=1}^{\infty} \sigma_i \langle x, x_i\rangle\, y_i \qquad (91)$

for all $x \in N(A)^\perp$. Let $X, Y$ be Hilbert spaces. The set $\{\sigma_i; x_i, y_i\}_{i \in \mathbb{N}} \subset (0, \infty) \times X \times Y$ is called the singular system of an operator $A \in K(X, Y)$. The values $\sigma_i$ are called the singular values of $A$. The elements $x_i, y_i$ are called the singular vectors. Furthermore, the series

$Ax = \sum_{i=1}^{\infty} \sigma_i \langle x, x_i\rangle\, y_i, \qquad x \in X, \qquad (92)$

is called the singular value decomposition (SVD) of $A$.
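The construction (87)-(88) of the singular system from the eigenpairs of $A^* A$ can be replayed verbatim for a small hypothetical matrix:

```python
import numpy as np

# Build the singular system {σ_i; x_i, y_i} exactly as in the text:
# eigenpairs (λ_i, x_i) of A*A, then σ_i = sqrt(λ_i), y_i = A x_i / σ_i.
A = np.array([[3.0, 1.0],
              [1.0, 2.0],
              [0.0, 1.0]])

lam, X = np.linalg.eigh(A.T @ A)      # ascending eigenvalues
lam, X = lam[::-1], X[:, ::-1]        # reorder so that λ1 ≥ λ2 > 0
sigma = np.sqrt(lam)
Y = (A @ X) / sigma                   # y_i = σ_i^{-1} A x_i, eq. (87)

print(np.allclose(A.T @ Y, X * sigma))     # A* y_i = σ_i x_i, eq. (88)
print(np.allclose(Y.T @ Y, np.eye(2)))     # {y_i} orthonormal, eq. (89)
```

The resulting triples agree (up to signs) with the conventions of a library SVD routine.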

3.6 Picard Condition and Pseudoinverse in Compact Operator Framework

The Picard condition plays an essential role in the solvability of inverse problems: If $A : X \to Y$ is compact with the singular value decomposition (SVD) $\{\sigma_i; x_i, y_i\}_{i \in \mathbb{N}}$, then $y \in \overline{R(A)}$ is an element of $R(A)$ if and only if the series

$\sum_{i=1}^{\infty} \frac{|\langle y, y_i\rangle|^2}{\sigma_i^2} \qquad (93)$

is convergent.


Clearly, from the Picard condition (93) it follows that $\sigma_i^{-2} |\langle y, y_i\rangle|^2 \to 0$ as $i \to \infty$, so that information about the decay of the Fourier coefficients of an element $y$ becomes available. Let $A$ be a compact operator (i.e., $A \in K(X, Y)$) with SVD $\{\sigma_i; x_i, y_i\}_{i \in \mathbb{N}}$. Then the pseudoinverse (or generalized inverse) of a compact operator is representable in the form

$A^\dagger y = \sum_{i=1}^{\infty} \frac{1}{\sigma_i} \langle y, y_i\rangle\, x_i \qquad (94)$

for all $y \in D(A^\dagger)$. If $R(A)$ is finite dimensional (i.e., there exist only finitely many elements $y_i$), then $\dim R(A^\dagger) < \infty$. Therefore, $A^\dagger$ is compact; in particular, $A^\dagger$ is continuous. The representation of the pseudoinverse in terms of the singular value decomposition (94) opens the perspective to classify ill-posed problems. Indeed, the summands $\sigma_i^{-1} \langle y, y_i\rangle\, x_i$ occurring in the series (94) depend closely on the singular values. If the values $\sigma_i$ are small, then the contribution by the series (94) becomes large. The existence of the SVD can be guaranteed for all compact operators. Nevertheless, the concrete knowledge of the SVD is the critical point: only in rare exceptional cases is the SVD explicitly known in practice.
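The Picard condition (93) explains numerically why noise is so harmful. For a hypothetical diagonal operator with $\sigma_i = 2^{-i}$, exact data keep the ratios $|\langle y, y_i\rangle| / \sigma_i$ bounded, while even tiny additive noise makes them blow up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical operator with geometrically decaying singular values.
n = 40
sigma = 2.0 ** (-np.arange(n))
x_true = np.ones(n) / np.sqrt(n)
y = sigma * x_true                       # exact data coefficients <y, y_i>
y_noisy = y + 1e-6 * rng.standard_normal(n)

exact_terms = np.abs(y) / sigma          # bounded: Picard condition holds
noisy_terms = np.abs(y_noisy) / sigma    # amplified by σ_i^{-1}: condition fails

print(exact_terms.max(), noisy_terms.max())
```

Noise of size $10^{-6}$ divided by singular values of size $10^{-12}$ produces coefficients of order $10^{6}$, so the series (94) applied to noisy data diverges in practice.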

4 Specific Regularization Methods

Obviously, the first two criteria (H1) and (H2) determining a well-posed problem in the sense of Hadamard can always be enforced by considering the generalized inverse $A^\dagger$. A violation of the third criterion, i.e., instability, arises if the spectrum of the operator $A$ is not bounded away from zero. Thus, it seems to be natural to construct regularizing approximations via modifying the smallest singular values. In accordance with the singular value decomposition of the generalized inverse, it follows that such a modification of small values and, hence, a construction of regularization operators can be obtained in the form

$x_\alpha = R_\alpha y = \sum_{i=1}^{\infty} \sigma_i\, F_\alpha(\sigma_i^2)\, \langle y, y_i\rangle\, x_i, \qquad y \in Y, \qquad (95)$

with some function $F_\alpha : \mathbb{R}^+ \to \mathbb{R}^+$ such that

$F_\alpha(\lambda) \to \frac{1}{\lambda}, \qquad \lambda > 0, \quad \alpha \to 0. \qquad (96)$

Such an operator $R_\alpha$ as defined by (95) and (96) may be understood as a regularization operator if

$\lambda\, |F_\alpha(\lambda)| \le C_{F_\alpha} < \infty, \qquad \lambda > 0. \qquad (97)$

If (97) is satisfied, then we are able to see that

$\|R_\alpha y\|^2 = \sum_{i=1}^{\infty} \sigma_i^2 \left(F_\alpha(\sigma_i^2)\right)^2 |\langle y, y_i\rangle|^2 \le C_{F_\alpha}^2 \sum_{i=1}^{\infty} |\langle y, y_i\rangle|^2 \le C_{F_\alpha}^2\, \|y\|^2, \qquad (98)$

where $C_{F_\alpha}$ is a bound for the norm of $R_\alpha$. Note that the pointwise convergence of $F_\alpha$ immediately implies the pointwise convergence of $R_\alpha$ to $A^\dagger$.
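The filtered reconstruction (95) can be sketched generically. The two filter functions below, the truncated SVD filter and the Tikhonov factor $1/(\lambda + \alpha)$, are standard choices used here as assumed examples:

```python
import numpy as np

def filtered_solution(A, y, Falpha):
    """Evaluate x_α = Σ σ_i F_α(σ_i²) <y,y_i> x_i, cf. eq. (95)."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    coeff = s * Falpha(s**2) * (U.T @ y)
    return Vh.T @ coeff

def tsvd(alpha):
    # Truncated SVD filter: keep 1/λ above the threshold, drop the rest.
    return lambda lam: np.where(lam >= alpha, 1.0 / np.maximum(lam, alpha), 0.0)

def tikhonov(alpha):
    # Tikhonov filter factor (a standard alternative choice).
    return lambda lam: 1.0 / (lam + alpha)

# Hypothetical ill-conditioned 2x2 example with exact data for x_true = (1, 1).
A = np.array([[1.0, 0.0], [0.0, 1e-4]])
y = np.array([1.0, 1e-4])

x_t = filtered_solution(A, y, tsvd(1e-9))
x_k = filtered_solution(A, y, tikhonov(1e-12))
print(x_t, x_k)
```

Both filters satisfy (96) and (97); with noise-free data and small α they recover the exact solution, while larger α damps the small-σ components.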

4.1 Truncated Singular Value Regularization

Let $A$ be a compact operator (i.e., $A \in K(X, Y)$) with SVD $\{\sigma_i; x_i, y_i\}_{i \in \mathbb{N}}$. The main idea of truncated singular value decomposition is to ignore all singular values below a certain threshold value, which we can identify with the regularization parameter $\alpha$; hence, the regularized solution is generated by the filter

$F_\alpha(\lambda) = \begin{cases} \dfrac{1}{\lambda}, & \lambda \ge \alpha, \\[4pt] 0, & \lambda < \alpha. \end{cases} \qquad (99)$

In asymptotic regularization, the regularized solution is obtained from the initial value problem

$x'(t) = A^*\big(y - A x(t)\big), \quad t > 0, \qquad (107)$

$x(0) = 0, \qquad (108)$

as $x_\alpha = x(1/\alpha)$. By representing $x$ in terms of the singular vectors $x_i$ in the form

$x(t) = \sum_{i=1}^{\infty} \alpha_i(t)\, x_i \qquad (109)$

with $\alpha_i(0) = 0$, we obtain from the singular value decomposition

$\alpha_i'(t) = -\sigma_i^2\, \alpha_i(t) + \sigma_i \langle y_i, y\rangle. \qquad (110)$

This ordinary differential equation can be solved analytically by

$\alpha_i(t) = \frac{1}{\sigma_i} \left(1 - \exp(-\sigma_i^2 t)\right) \langle y_i, y\rangle. \qquad (111)$

Hence, the regularized solution is given by

$F_\alpha(\lambda) = \left(1 - \exp\left(-\frac{\lambda}{\alpha}\right)\right) \frac{1}{\lambda}, \qquad (112)$
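The closed-form coefficients (111) can be cross-checked against a direct numerical integration of the underlying gradient-flow ODE $x'(t) = A^*(y - Ax(t))$, $x(0) = 0$. A minimal sketch, assuming a small diagonal matrix as a stand-in for the operator:

```python
import numpy as np

# Hypothetical diagonal operator and data.
A = np.array([[1.0, 0.0], [0.0, 0.3]])
y = np.array([1.0, 0.6])

# Explicit Euler integration of x'(t) = A^T (y - A x), x(0) = 0.
T, dt = 50.0, 1e-3
x = np.zeros(2)
for _ in range(int(T / dt)):
    x = x + dt * (A.T @ (y - A @ x))

# Closed form (111): for diagonal A, x_i(T) = (1 - exp(-σ_i² T)) y_i / σ_i.
sigma = np.diag(A)
x_exact = (1.0 - np.exp(-sigma**2 * T)) * y / sigma
print(x, x_exact)
```

The integration time $T$ plays the role of the reciprocal regularization parameter $1/\alpha$: stopping early corresponds to stronger regularization.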

i.e.,

$x_\alpha = \sum_{i=1}^{\infty} \left(1 - \exp\left(-\frac{\sigma_i^2}{\alpha}\right)\right) \frac{1}{\sigma_i}\, \langle y, y_i\rangle\, x_i, \qquad y \in Y. \qquad (113)$

4.5 Error Estimates

For the error between $x^\dagger$ and $x_\alpha^\varepsilon$ in the case of noisy data $y^\varepsilon$, we are able to write (with $x_\alpha^\varepsilon = R_\alpha y^\varepsilon$)

$x^\dagger - x_\alpha^\varepsilon = (x^\dagger - x_\alpha) + (x_\alpha - x_\alpha^\varepsilon). \qquad (114)$

The first term $x^\dagger - x_\alpha$ is the approximation error of the regularization method, which is independent of the noise. The second term $x_\alpha - x_\alpha^\varepsilon$ corresponds to the propagation of the data noise in the regularized case. By the aid of the triangle inequality it follows that

$\|x^\dagger - x_\alpha^\varepsilon\| \le \|x^\dagger - x_\alpha\| + \|x_\alpha - x_\alpha^\varepsilon\|. \qquad (115)$

Thus, the estimate of the error between the regularized solution and the exact solution can be handled by two error terms separately. It is clear that such an estimate provides a guideline for the parameter choice, namely choosing $\alpha$ such that the terms on the right hand side are balanced.
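The splitting (114)/(115) can be put in numbers. In the following small synthetic experiment (hypothetical polynomially decaying singular values, Tikhonov-type filter factors), the approximation error grows with α while the propagated noise shrinks, so the total error is minimized at an intermediate α:

```python
import numpy as np

rng = np.random.default_rng(1)
i = np.arange(1, 51)
sigma = 1.0 / i**2                    # hypothetical singular values
x_true = 1.0 / i**2
y_eps = sigma * x_true + 1e-6 * rng.standard_normal(50)

def tik_coeff(alpha):
    # Tikhonov-filtered coefficients: σ/(σ² + α) applied to the noisy data.
    return sigma * y_eps / (sigma**2 + alpha)

alphas = 10.0 ** np.arange(-12.0, 0.0)
total = [np.linalg.norm(tik_coeff(a) - x_true) for a in alphas]
best = int(np.argmin(total))
print(alphas[best])
```

Very small α lets the noise through almost unfiltered, very large α oversmooths; the printed α balances the two terms of (115).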


Next we deal with an estimate of the approximation error, which is independent of the noise level $\varepsilon$: Let $F_\alpha : \mathbb{R}^+ \to \mathbb{R}^+$ be a piecewise continuous function satisfying the assumptions

$F_\alpha(\lambda) \to \frac{1}{\lambda}, \qquad \lambda > 0, \quad \alpha \to 0, \qquad (116)$

$|F_\alpha(\lambda)| \le M_\alpha < \infty, \qquad \lambda > 0, \qquad (117)$

and

$\sup_{\alpha, \lambda}\big(\lambda F_\alpha(\lambda)\big) \le C_F < \infty \qquad (118)$

for some constant $C_F > 0$. Moreover, let the regularization operator be defined by (95). Then, for all $y \in D(A^\dagger)$, we have

$R_\alpha y \to A^\dagger y, \qquad \alpha \to 0. \qquad (119)$

The function $t \mapsto 1 - t F_\alpha(t)$, $t \in \mathbb{R}^+$, converges pointwise to the function

$F(t) = \begin{cases} 0, & t > 0, \\ 1, & t = 0. \end{cases} \qquad (120)$

Due to the discontinuity at zero, the convergence of $t F_\alpha(t) - 1$ to zero becomes slower and slower as $t$ decreases to zero. Since it is allowed to specify an arbitrarily small singular value $\sigma_i$ and the minimal norm solution $x^\dagger = x_i$, the convergence of regularized solutions is arbitrarily slow. On the other hand, we observe that there is a possibly faster convergence if the components $\langle x^\dagger, x_i\rangle$ decay sufficiently fast compared to the eigenvalues. For example, if we have $|\langle x^\dagger, x_i\rangle| \le c\, \sigma_i^\mu$ for some constants $c > 0$ and $\mu > 0$, then it follows that

$\limsup_{\alpha \to 0} \|R_\alpha y - A^\dagger y\|^2 \le \limsup_{\alpha \to 0}\, c^2 \sum_{i=1}^{\infty} \big(\sigma_i F_\alpha(\sigma_i) - 1\big)^2 \sigma_i^{2\mu} \le c^2 \sum_{i=1}^{\infty} \lim_{\alpha \to 0} \big(\sigma_i^{1+\mu} F_\alpha(\sigma_i) - \sigma_i^\mu\big)^2. \qquad (121)$

In other words, one has to consider the limit of the function

$t \mapsto |t^{1+\mu} F_\alpha(t) - t^\mu| \qquad (122)$

as $\alpha \to 0$ instead, which is usually much faster. For example, in the case of the truncated singular value decomposition, we obtain

$|t^{1+\mu} F_\alpha(t) - t^\mu| = \begin{cases} 0, & t \ge \alpha, \\ t^\mu, & t < \alpha. \end{cases} \qquad (123)$

If the singular values of the operator decay sufficiently fast (which is the typical case for ill-posed problems), e.g., $\sum_{i=1}^{\infty} \sigma_i^\mu < \infty$, we are confronted with the situation

$\|R_\alpha y - A^\dagger y\|^2 \le c^2 \sum_{\sigma_i < \alpha} \sigma_i^{2\mu} \le c^2\, \alpha^\mu \sum_{\sigma_i < \alpha} \sigma_i^\mu$

Let the noise $\eta \in Y$ and let $A : Y \to Y$ be self-adjoint and positive definite (i.e., $\langle y, Ay\rangle > 0$ for all $y \in Y$, $y \neq 0$), and let

$\varepsilon^2 := \langle \eta, A\eta\rangle. \qquad (136)$

We assume that $\varepsilon$ is "small" and investigate what happens when $\varepsilon \to 0$. The operator $A$ introduced above is not meant to be arbitrary. In fact, it must be connected with $K$ in the sense that, for some $m \ge 1$ (not necessarily integer), the range of $K$ is continuously embedded into the range of $A^m$, so that

(136)

We assume that ε is “small” and investigate what happens when ε → 0. The operator A introduced above is not meant to be arbitrary. In fact, it must be connected with K in the sense that, for some m ≥ 1 (not necessarily integer), the range of K is continuously embedded into the range of Am , so that A−m K : X → Y

is continuous.

(137)

If $\eta$ satisfies (136) and (137), it is referred to as weakly bounded noise. Some comments should be made: In a deterministic setting, a reasonable model for the noise is that it is "high-frequency", and we would like to investigate what happens when the frequency tends to $\infty$, but without the noise tending to $0$ strongly, that is, without assuming that $\|\eta\|_Y \to 0$. Thus, $\eta \to 0$ weakly begins capturing the essence of "noise". Then, for any linear compact operator $S : Y \to Y$, we would have $\|S\eta\|_Y \to 0$. So, in this sense, there is nothing unusual about (136) and (137). Moreover, we would like (136) to capture the whole truth, i.e., the relations $\langle \eta, A^p \eta\rangle = o(\varepsilon^2)$ and

"η, Aq η# = O(ε2 )

(138)

fail for p > 1 and q < 1 as ε → 0. This may be a tall order, although examples of operators A and noises η satisfying (136)-(138) are easily constructed (for more details see [18]). At the same time A is supposed to capture the smoothing effect of K in the sense of (137). Ideally, one would like A−m K to be continuous with a continuous inverse. The natural choice A = (KK ∗ )1/2m would achieve this, but would have to be reconciled with (136) and possibly (138). The condition (137) is not unreasonable. Eggermont et al. [19] show how the weak noise model leads to simple bounds on expressions like "η, y#Y for y ∈ Am (Y ), the range of Am . For β > 0, they introduce the inner product on Am (Y ), by letting

14 From Gaussian Least Squares Approximation to Today’s . . .

"y, z#m,β = "y, z#Y + β 2m "A−m y, A−m z#, z ∈ Am (Y ),

867

(139)

and denote the associated norm by $\|\cdot\|_{m,\beta}$. The following result is of interest in itself, but it also plays a crucial role later on in the context of Tikhonov regularization with weakly bounded noise: Suppose that $m \ge 1$. Under the assumptions (136), (137) on the weakly bounded noise, for all $y \in A^m(Y)$ and all $\beta > 0$,

$|\langle \eta, y\rangle| \le \beta^{-1/2}\, \varepsilon\, \|y\|_{m,\beta}$

(140)

(note that the factor β −1/2 stays the same, regardless of m).

5.2.1 Tikhonov Regularization and Noise Context

Let $L : D(L) \subset X \to Z$, where $Z$ is a Hilbert space, be a closed linear operator with dense domain and closed range. We first assume (see also [94, 127, 129, 130]) that $N(L)$ is finite dimensional and that $N(L) \cap N(A) = \{0\}$. We endow $D(L)$ with the topology induced by the graph norm

$|u| := \left(\|u\|^2 + \|Lu\|^2\right)^{1/2}.$

(141)

Then $L$ becomes a bounded operator on $D(L)$. We define a new inner product on $D(L)$ by

$[u, v]_L = \langle Au, Av\rangle_Y + \langle Lu, Lv\rangle_Z$

(142)

Then the induced norm $\|u\|_L := \sqrt{[u, u]_L}$ is equivalent to the graph norm of $u$. Thus, both $A$ and $L$ are bounded operators on $D(L)$ equipped with the inner product (142); we denote this Hilbert space by $X_L$. The discussion shows that, under the hypotheses listed above, without loss of generality we may restrict ourselves to the case in which $A : X \to Y$ and $L : X \to Z$ are both bounded. For each $y \in D(A^\dagger)$, there is a unique element $x \in L(y)$ which minimizes $\|Lu\|$ over $u \in L(y)$. Let $A_L^\dagger y := x$, and define

$M := \{x \in X : L^* L x \in N(A)^\perp\}.$

(143)

Then it is not difficult to show that $M$ is the orthogonal complement of $N(A)$ with respect to the inner product (142), so that $A_L^\dagger$ is the generalized inverse relative to the decompositions

$X = N(A) \oplus M, \qquad Y = \overline{R(A)} \oplus R(A)^\perp.$

(144)

Under the above assumptions, for each $\alpha > 0$ there exists a unique $x_\alpha$ which minimizes

$J_\alpha(x) := \|Ax - y\|_Y^2 + \alpha\, \|Lx\|_Z^2. \qquad (145)$


Furthermore,

$x_\alpha = (A^* A + \alpha L^* L)^{-1} A^* y$

(146)

and $x_\alpha$ converges to $A_L^\dagger y$ for $y \in D(A^\dagger)$ as $\alpha \to 0$, and diverges otherwise. Thus it follows that in the presence of contamination, say $y$ replaced by $y_\varepsilon$ with $\|y_\varepsilon - y\| \le \varepsilon$ for some $\varepsilon > 0$, the norm of the corresponding $x_\alpha^\varepsilon$ may well diverge. It should be noted that $(A^* A + \alpha L^* L)^{-1} A^*$ does not converge in the uniform operator topology as $\alpha \to 0$; in fact, $\|(A^* A + \alpha L^* L)^{-1} A^*\| \to \infty$. Furthermore, the equation system $(A^* A + \alpha L^* L)x = A^* y$ is poorly conditioned for small $\alpha$, and hence numerically unstable. Thus, both contamination and numerical approximation dictate that a choice of $\alpha$ has to be made which is a suitable compromise between accuracy and stability. Several procedures for the choice of an "optimal" $\alpha$ are available, some of which take various a priori information into consideration. A simple choice that works would be $\alpha = \varepsilon^2$.
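A minimal numerical sketch of the normal-equations form $x_\alpha = (A^*A + \alpha L^*L)^{-1}A^*y$ with the assumed choice $L = I$ (classical Tikhonov regularization) on a hypothetical ill-conditioned Hilbert-type matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 12
i = np.arange(n)
A = 1.0 / (i[:, None] + i[None, :] + 1.0)   # Hilbert matrix: severely ill-conditioned
x_true = np.sin(np.linspace(0.0, np.pi, n))
eps = 1e-4
y = A @ x_true + eps * rng.standard_normal(n)

L = np.eye(n)
# Regularized solution with the simple choice α = ε² mentioned in the text.
x_reg = np.linalg.solve(A.T @ A + eps**2 * (L.T @ L), A.T @ y)
x_naive = np.linalg.solve(A, y)             # unregularized: noise explodes

print(np.linalg.norm(x_reg - x_true), np.linalg.norm(x_naive - x_true))
```

The unregularized solve amplifies the noise by the reciprocal of the smallest singular value and is useless, while the regularized solution stays close to the true one.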

5.2.2 Strongly Bounded Noise in Tikhonov Regularization

We come back to the situation where $K : X \to Y$ is a linear compact operator between the Hilbert spaces $X$ and $Y$. Consider the data $y \in Y$ according to the equation $y = Kx_0 + \eta$, where $\eta \in Y$ is the unknown noise and $x_0 \in X$ is an unknown element one wishes to recover from the data $y$. We study Tikhonov's regularization as a scheme to recover $x_0$ from the data $y$ in the strong noise model

$y = K x_0 + \eta \quad \text{with} \quad \|\eta\|_Y \le \varepsilon.$

(147)

The interest is in what happens when $\varepsilon \to 0$. It should be noted that, in the Tikhonov regularization scheme, the unknown $x_0$ is estimated by specifying the solution $x = x^{\alpha,\varepsilon}$ of the problem

$\text{minimize} \quad \|Kx - y\|_Y^2 + \alpha\, \|x\|_X^2 \quad \text{over} \quad x \in X$

(148)

for some regularization parameter $\alpha > 0$ yet to be specified. This procedure dates back to [109, 129]. Its minimizer exists and is unique. Moreover, it is well-known (see, e.g., [52]) that convergence rates on the error $\|x^{\alpha,\delta} - x_0\|_X$ can be obtained from a source condition. For simplicity, it is assumed there that there exists a $z_0 \in X$ such that the "source condition"

$x_0 = (K^* K)^{\nu/2}\, z_0 \quad \text{for some} \quad 0 < \nu \le 2 \qquad (149)$

holds. This leads to the notion of smoothness subspaces $X_\nu$, $\nu > 0$, of $X$. Indeed, in a large number of ill-posed problems $(A; X, Y)$ the operator $A$ shows the property that the image $Ax$ is smoother than $x$. In concrete situations this leads us to functions $x_k$ in the singular system with strongly growing oscillations for increasing $k$. Looking at the higher frequency parts of an element $x \in X$, i.e., the inner products $\langle x, x_k\rangle$ for large $k$, we notice that there is a damping effect on $Ax$ by the factor $\sigma_k$. Hence, the norm $\|x\|_\nu$ can be interpreted in the sense that $\langle x, x_k\rangle / \sigma_k^\nu \to 0$ for $k \to \infty$ is demanded. In addition, the larger the value of $\nu$ is chosen, the faster $\langle x, x_k\rangle$ has to converge to $0$. Therefore it can be concluded that the element $x$ does not contain relevant high frequency components. In other words, $x$ can be regarded as "smooth".

6.3 Optimality of Reconstruction Methods

In the following, a stable reconstruction method for the solution of the operator equation involving $A \in L(X, Y)$ is understood to be a continuous (not necessarily linear) mapping $T : Y \to X$ with $T0 = 0$. The question is which reconstruction error occurs in the best worst case if the data are noisy. The worst case error of a reconstruction method $T$ for $A$ corresponding to the noise level $\varepsilon$ and the additional information $\|A^\dagger y\|_\nu \le \rho$ is defined as

$E_\nu(\varepsilon, \rho, T) = \sup\{\|T y^\varepsilon - A^\dagger y\| : y \in R(A),\ y^\varepsilon \in Y,\ \|y - y^\varepsilon\| \le \varepsilon,\ \|A^\dagger y\|_\nu \le \rho\}. \qquad (179)$

Another expression (avoiding the occurrence of the operator $A^\dagger$) can be formulated by

$E_\nu(\varepsilon, \rho, T) = \sup\{\|T y^\varepsilon - x\| : x \in X_\nu,\ y^\varepsilon \in Y,\ \|Ax - y^\varepsilon\| \le \varepsilon,\ \|x\|_\nu \le \rho\}. \qquad (180)$

Clearly, the smaller the worst case error, the better the reconstruction method. The best worst case error for $A$ corresponding to the noise level $\varepsilon$ and the additional restriction $\|A^\dagger y\|_\nu \le \rho$ is understood to be

$E_\nu(\varepsilon, \rho) = \inf\{E_\nu(\varepsilon, \rho, T) : T : Y \to X \text{ continuous},\ T0 = 0\}. \qquad (181)$

Note that the best worst case error, i.e., $E_\nu(\varepsilon, \rho)$, depends on the problem (i.e., on the operator $A$), but not on the reconstruction method. The next result tells us about the quantity of the best worst case error (cf. [116]). Let $A$ be of class $L(X, Y)$. Then we have

$E_\nu(\varepsilon, \rho) = e_\nu(\varepsilon, \rho), \qquad (182)$

where

$e_\nu(\varepsilon, \rho) = \sup\{\|x\| : x \in X_\nu,\ \|Ax\| \le \varepsilon,\ \|x\|_\nu \le \rho\}. \qquad (183)$

Clearly, our results characterize the best worst case error independently of the knowledge of a specific reconstruction method. Let $A$ be of class $L(X, Y)$. Then, for $\nu > 0$,

$e_\nu(\varepsilon, \rho) \le \rho^{\frac{1}{\nu+1}}\, \varepsilon^{\frac{\nu}{\nu+1}}. \qquad (184)$

Furthermore, there exists a sequence $\{\varepsilon_k\}_{k \in \mathbb{N}}$ with $\varepsilon_k \to 0$ for $k \to \infty$ such that

$e_\nu(\varepsilon_k, \rho) = \rho^{\frac{1}{\nu+1}}\, \varepsilon_k^{\frac{\nu}{\nu+1}}. \qquad (185)$

876

6.4

W. Freeden and B. Witte

Regularization Filters

If A is an injective operator of class K(X, Y ), then A† can be expressed in the form (A∗ A)−1 A∗ . The non-continuity is caused by the term (A∗ A)−1 , that has to be stabilized. In connection with the functional calculus for compact operators we are therefore led to filters as appropriate tools for regularization.

Let {Ft }t>0 , Ft : 0, A2 → R be a family of piecewise continuous functions satisfying the conditions  (F1) limt→0 Ft (λ) = λ1 for all λ ∈ 0, A2 ,

(F2) λ |Ft (λ)| ≤ CF for all λ ∈ 0, A2 and t > 0. Then the family {Ft }t>0 is called a filter relative to A. By virtue of Condition (F1), Ft (A∗ A) becomes a continuous operator, which converges in pointwise sense to (A∗ A)−1 as t → 0. This is the reason why we let   Rt y = Ft A∗ A A∗ y,

y ∈ Y.

(186)

As a consequence, {Ft }t>0 filters the influence of small singular values of A on the operator Rt in (186). In terms of the singular system {σk ; xk , yk }k∈N we are able to write ∞ " #   Ft A∗ A A∗ y = Ft σk2 σk "y, yk # xk + Ft (0) PN (A) A∗ y

(187)

k=1

=



" # Ft σk2 σk "y, yk # xk

k=1

due to fact that PN (A) A∗ y = 0 (note that A∗ y ∈ R (A∗ ) = N (A)⊥ ). Considering the approximation error we obtain, for y ∈ R(A), that   A† y − Rt y = A† y − Ft A∗ A A∗ y   = A† y − Ft A∗ A A∗ AA† y     = I − Ft A∗ A A∗ A A† y   = pt A∗ A A† y,

(188)



where the function pt : λ → pt (λ) , λ ∈ 0, A2 is given by pt (λ) = 1 − λFt (λ) ,

  λ ∈ 0, A2 .

(189)

The identity (188) leads us to the formulation of the following result (cf. [116]): Assume that $A \in K(X, Y)$. Let $\{F_t\}_{t>0}$ be a filter. Then

$\lim_{t \to 0} R_t y = \begin{cases} A^\dagger y, & y \in D(A^\dagger), \\ \infty, & y \notin D(A^\dagger), \end{cases} \qquad (190)$

where

for $y \in Y$. The next result concretizes the stability of $R_t y^\varepsilon$ under the noise level $\varepsilon$ in more detail: Let $\{F_t\}_{t>0}$ be a filter so that (F1), (F2) hold true. For $y, y^\varepsilon \in Y$ with $\|y - y^\varepsilon\| \le \varepsilon$, set $x_t = R_t y$ and $x_t^\varepsilon = R_t y^\varepsilon$. Then, for the residual term, we have

$\|Ax_t - Ax_t^\varepsilon\| \le C_F\, \varepsilon, \qquad (191)$

while, for the error term, it follows that

$\|x_t - x_t^\varepsilon\| \le \varepsilon \sqrt{C_F\, M(t)}, \qquad (192)$

where we have used the abbreviation

$M(t) = \sup_{0 \le \lambda \le \|A\|^2} |F_t(\lambda)|.$

Let us continue with the estimate of the total error, thereby using $x^\dagger = A^\dagger y$, $y \in D(A^\dagger)$. We base our considerations on the usual splitting into the approximation error and the data error in the form

$\|A^\dagger y - R_t y^\varepsilon\| \le \|A^\dagger y - R_t y\| + \|R_t y - R_t y^\varepsilon\| = \|x^\dagger - x_t\| + \|x_t - x_t^\varepsilon\| \le \|x^\dagger - x_t\| + \varepsilon \sqrt{C_F\, M(t)}. \qquad (193)$

Obviously, the approximation error $\|x^\dagger - R_t y\|$ tends to $0$ as $t \to 0$. The discussion of the data error is much more problematic than that of the approximation error: From the limit relation $\lim_{t \to 0} F_t(\lambda) = \frac{1}{\lambda}$ imposed on a filter within the interval $(0, \|A\|^2]$ it follows that $M(t) \to \infty$ as $t \to 0$. Therefore, for the total error, we are confronted with a divergent behavior as $t \to 0$. Nevertheless, convergence properties can be forced by a suitable coupling of $t$ and $\varepsilon$. In fact, it can be deduced that, under a suitable a priori parameter choice, the approximation error as well as the data error converge to $0$ if the noise level $\varepsilon$ tends to zero:


Let $\{F_t\}_{t>0}$ be a filter. If we choose $\alpha : (0, \infty) \to (0, \infty)$, $\varepsilon \mapsto \alpha(\varepsilon)$, such that $\alpha(\varepsilon) \to 0$ as well as $\varepsilon \sqrt{M(\alpha(\varepsilon))} \to 0$ as $\varepsilon \to 0$, then $(\{R_t\}_{t>0}, \alpha)$ with $R_t = F_t(A^* A)\, A^*$ is a regularization of $A^\dagger$ (by convention, $\{F_t\}_{t>0}$ is called a regularizing filter). Next we mention which additional requirement imposed on filters $R_t = F_t(A^* A)\, A^*$ is necessary to guarantee order optimality. Let $\{F_t\}_{t>0}$ be a regularizing filter for $A \in L(X, Y)$. Assume there are $t_0 > 0$, $\mu > 0$, and a function $\omega_\mu : (0, t_0] \to \mathbb{R}$ such that

$\sup_{0 \le \lambda \le \|A\|^2} \lambda^{\mu/2}\, |p_t(\lambda)| \le \omega_\mu(t) \qquad (194)$

for all $t \in (0, t_0]$ (with $p_t(\lambda) = 1 - \lambda F_t(\lambda)$). Let $y \in R(A)$ and let $x^\dagger = A^\dagger y$ be in $X_\mu$ with $\|x^\dagger\|_\mu \le \rho$. Then the estimates

(a) $\|x^\dagger - x_t\|_X \le \rho\, \omega_\mu(t)$,
(b) $\|Ax^\dagger - Ax_t\|_Y \le \rho\, \omega_{\mu+1}(t)$

hold true for $x_t = R_t y = F_t(A^* A)\, A^* y$ and $0 < t < t_0$. Moreover, let

(1) $\omega_\mu(t) \le C_p\, t^{\mu/2}$ for $t \to 0$,
(2) $M(t) = \sup_{0 \le \lambda \le \|A\|^2} |F_t(\lambda)| \le C_M\, t^{-1}$ for $t \to 0$,

where $\mu > 0$ and $C_p, C_M > 0$ are constants. Let the a priori parameter choice $\alpha : (0, \infty) \to (0, \infty)$ fulfill

$C_1 \left(\frac{\varepsilon}{\rho}\right)^{\frac{2}{\mu+1}} \le \alpha(\varepsilon) \le C_2 \left(\frac{\varepsilon}{\rho}\right)^{\frac{2}{\mu+1}}, \qquad \varepsilon \to 0, \qquad (195)$

where C1 , C2 are positive constants. Then, ({Rt }t>0 , α), Rt = Ft (A∗ A)A∗ , is an order optimal regularization for A† with respect to Xμ . Of course, we need to know the values ρ and μ to guarantee the order optimality. Without the availability of ρ but based on the knowledge of μ, we are led to α(ε) = 2

Cε μ+1 with C being a positive constant to find an order optimal regularization. Without any information of both parameters ρ as well as μ we have to deal with a posteriori parameter choices. An asymptotic behavior of ωμ determines the speed of convergence for the reconstruction error. An important feature is the so-called qualification. Let {Ft }t≥0 be a regularizing filter for A† , where A ∈ L (X, Y ) satisfying the asymptotic relation

14 From Gaussian Least Squares Approximation to Today’s . . .

$$M(t) \le \frac{C_M}{t}, \quad t \to 0, \tag{196}$$

where $C_M$ is a positive constant. The maximal value $\mu_0$ such that there exists, for all values $\mu \in (0, \mu_0]$, a constant $C_p > 0$ satisfying

$$\sup_{0 \le \lambda \le \|A\|^2} \lambda^{\mu/2}\,|p_t(\lambda)| \le C_p\, t^{\mu/2}, \quad t \to 0, \tag{197}$$

is called the qualification of the filter (remember $p_t(\lambda) = 1 - \lambda F_t(\lambda)$). In other words, the qualification can be regarded as the maximal rate of decay. If the qualification $\mu_0$ is finite, there exists a parameter choice $\alpha(\varepsilon)$ such that

$$\|A^\dagger y - R_{\alpha(\varepsilon)}\, y^\varepsilon\| = O\!\left(\varepsilon^{\frac{\mu_0}{\mu_0+1}}\right), \quad \varepsilon \to 0, \tag{198}$$

holds true for $A^\dagger y \in X_{\mu_0}$. If the qualification is infinite, there exists a parameter choice $\alpha(\varepsilon)$ for which the error decay comes arbitrarily close to $O(\varepsilon)$. As a consequence, filters with infinite qualification are more advantageous than others.

6.4.1 Order Optimality of Special Regularization Filters

In the following we recover important examples of regularization methods constituted by filters, i.e., the truncated singular value decomposition (SVD) and the Tikhonov-Phillips regularization.

6.4.2 Truncated Singular Value Decomposition Revisited

The SVD of $A^\dagger$ is $A^\dagger y = \sum_{k=1}^{\infty} \frac{1}{\sigma_k}\,\langle y, y_k\rangle\, x_k$. We choose the filter

$$F_t(\lambda) = \begin{cases} \dfrac{1}{\lambda}, & \lambda \ge t, \\[1ex] 0, & \lambda < t. \end{cases} \tag{199}$$

Then

$$R_t y = F_t(A^*A)\,A^*y = \sum_{k=1}^{\infty} F_t(\sigma_k^2)\,\sigma_k\,\langle y, y_k\rangle\, x_k = \sum_{\sigma_k \ge \sqrt{t}} \frac{1}{\sigma_k}\,\langle y, y_k\rangle\, x_k \tag{200}$$

is the truncated SVD of $A^\dagger$ (with finitely many summands). Clearly we have

(1) $\lim_{t\to 0} F_t(\lambda) = \frac{1}{\lambda}$ for all $\lambda > 0$,
(2) $C_F = \sup_{0 \le \lambda \le \|A\|^2} \lambda\,|F_t(\lambda)| = 1$ for all $t \le \|A\|^2$,
(3) $M(t) = \sup_{0 \le \lambda \le \|A\|^2} |F_t(\lambda)| = \frac{1}{t}$ for all $t > 0$.
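The truncated SVD filter (199)/(200) can be sketched as a small matrix computation; the construction below (random orthogonal factors, the function name `tsvd`, the chosen singular values) is our own illustration and not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(1)
# build a matrix with prescribed singular values via random orthogonal factors
U, _ = np.linalg.qr(rng.standard_normal((8, 8)))
V, _ = np.linalg.qr(rng.standard_normal((6, 6)))
s = np.array([1.0, 0.5, 0.1, 0.05, 1e-3, 1e-6])
A = U[:, :6] @ np.diag(s) @ V.T

def tsvd(A, y, t):
    """R_t y = sum_{sigma_k >= sqrt(t)} sigma_k^{-1} <y, y_k> x_k  (eqs. (199)/(200))."""
    Uf, sv, Vt = np.linalg.svd(A, full_matrices=False)
    keep = sv >= np.sqrt(t)
    return Vt[keep].T @ ((Uf[:, keep].T @ y) / sv[keep])

y = A @ rng.standard_normal(6)
# for t below sigma_min^2 every mode is kept: R_t coincides with the pseudoinverse
x_all = tsvd(A, y, 1e-16)
assert np.allclose(x_all, np.linalg.pinv(A) @ y, atol=1e-6)
# for larger t the small singular values are cut off, which shrinks the solution
x_cut = tsvd(A, y, 4e-3)      # keeps sigma_k >= sqrt(4e-3) ~ 0.063, i.e. three modes
assert np.linalg.norm(x_cut) <= np.linalg.norm(x_all) + 1e-12
```

Truncation removes exactly the modes with $\sigma_k < \sqrt{t}$, which is the source of the approximation-error term in (201) below.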


The total error can be described as follows:

$$\|A^\dagger y - R_t y^\varepsilon\| \le \|A^\dagger y - R_t y\| + \|R_t y - R_t y^\varepsilon\| \le \Big\|\sum_{\sigma_k < \sqrt{t}} \frac{1}{\sigma_k}\,\langle y, y_k\rangle\, x_k\Big\| + \varepsilon\,\sqrt{C_F\,M(t)} \overset{\text{Parseval}}{=} \Big(\sum_{\sigma_k < \sqrt{t}} \frac{1}{\sigma_k^2}\,|\langle y, y_k\rangle|^2\Big)^{1/2} + \frac{\varepsilon}{\sqrt{t}}. \tag{201}$$

Moreover, for all $\mu > 0$ and $0 \le t \le \|A\|^2$, we have

$$\sup_{0\le\lambda\le\|A\|^2} \lambda^{\mu/2}\,|p_t(\lambda)| = \sup_{0\le\lambda\le\|A\|^2} \lambda^{\mu/2}\,|1 - \lambda F_t(\lambda)| = \sup_{0\le\lambda< t} \lambda^{\mu/2} = t^{\mu/2}.$$

Together with $M(t) = \frac{1}{t}$ we find that this filter possesses an infinite qualification, and it is order optimal for all $\mu > 0$. However, one can show that the TSVD is not optimal for any $\mu > 0$ (see [116]).

6.4.3 Asymptotic Regularization Revisited

We choose the following filter:

$$F_t(\lambda) = \begin{cases} \dfrac{1 - \exp\!\big(-\frac{\lambda}{t}\big)}{\lambda}, & \lambda > 0, \\[1.5ex] \dfrac{1}{t}, & \lambda = 0. \end{cases} \tag{202}$$

Then we obtain

$$R_t y = \sum_{k=1}^{\infty} F_t(\sigma_k^2)\,\sigma_k\,\langle y, y_k\rangle\, x_k = \sum_{k=1}^{\infty} \frac{1 - \exp\!\big(-\frac{\sigma_k^2}{t}\big)}{\sigma_k}\,\langle y, y_k\rangle\, x_k.$$

It is not difficult to prove the following properties:

(1) $\lim_{t\to 0} F_t(\lambda) = \frac{1}{\lambda}$ for $\lambda > 0$,
(2) $\lambda\,|F_t(\lambda)| = 1 - \exp\!\big(-\frac{\lambda}{t}\big) \le 1$ for $\lambda > 0$, $t > 0$,
(3) $\sup_{0 \le \lambda \le \|A\|^2} \lambda\,|F_t(\lambda)| = 1 = C_F$,
(4) $M(t) = \sup_{0 \le \lambda \le \|A\|^2} |F_t(\lambda)| = \frac{1}{t}$, $t > 0$ (note that $F_t(\lambda)$ is monotonically decreasing in $\lambda$ and $\lim_{\lambda\to 0} F_t(\lambda) = \frac{1}{t}$).

The qualification of this filter again is infinite.
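The listed properties of the filter (202) can be verified numerically on a grid; the following sketch is our own construction (the grid stands in for $[0, \|A\|^2]$, and `expm1` is used for a stable evaluation of $1 - e^{-\lambda/t}$).

```python
import numpy as np

def F(lmbda, t):
    """Asymptotic-regularization filter (202): (1 - exp(-lambda/t))/lambda, value 1/t at 0."""
    lmbda = np.asarray(lmbda, dtype=float)
    safe = np.where(lmbda > 0, lmbda, 1.0)                    # avoid division by zero
    return np.where(lmbda > 0, -np.expm1(-lmbda / t) / safe, 1.0 / t)

lam = np.linspace(0.0, 4.0, 100001)    # stands in for the interval [0, ||A||^2]
for t in [1.0, 0.1, 0.01]:
    vals = F(lam, t)
    assert np.all(lam * vals <= 1.0 + 1e-12)        # property (2)/(3): C_F = 1
    assert abs(vals.max() - 1.0 / t) < 1e-9         # property (4): M(t) = 1/t at lambda = 0
    assert np.all(np.diff(vals) <= 1e-12)           # F_t is decreasing in lambda
```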

6.4.4 Tikhonov Regularization Revisited

Using the filter

$$F_t(\lambda) = \frac{1}{\lambda + t}, \quad t > 0, \tag{203}$$

we find that

$$R_t y = F_t(A^*A)\,A^*y = \sum_{k=1}^{\infty} \frac{\sigma_k}{\sigma_k^2 + t}\,\langle y, y_k\rangle\, x_k \tag{204}$$

and

$$(A^*A + tI)\,R_t y = \sum_{k=1}^{\infty} \frac{\sigma_k}{\sigma_k^2 + t}\,\langle y, y_k\rangle\,(\sigma_k^2 + t)\, x_k = \sum_{k=1}^{\infty} \sigma_k\,\langle y, y_k\rangle\, x_k = \sum_{k=1}^{\infty} \langle y, A x_k\rangle\, x_k = \sum_{k=1}^{\infty} \langle A^*y, x_k\rangle\, x_k = A^*y, \tag{205}$$

i.e., $x_t = R_t y$ is the unique solution of the equation $(A^*A + tI)\,x_t = A^*y$. Such equations are called regularized normal equations. It can be easily seen that

(1) $\lim_{t\to 0} F_t(\lambda) = \frac{1}{\lambda}$, $\lambda > 0$,
(2) $\lambda\,|F_t(\lambda)| = \frac{\lambda}{\lambda + t} \le 1 = C_F$ for all $t > 0$,
(3) $M(t) = \sup_{0 \le \lambda \le \|A\|^2} |F_t(\lambda)| = \sup_{0 \le \lambda \le \|A\|^2} \frac{1}{\lambda + t} = \frac{1}{t}$, $t > 0$.
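That $x_t = R_t y$ solves the regularized normal equations can be checked directly; the sketch below (our own toy matrix and variable names) compares the solve of $(A^*A + tI)\,x_t = A^*y$ with the filter/SVD form (204).

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 12)) * 0.3
y = rng.standard_normal(30)
t = 1e-3

# solve the regularized normal equations (A*A + tI) x_t = A* y ...
x_normal = np.linalg.solve(A.T @ A + t * np.eye(12), A.T @ y)

# ... and compare with the filter/SVD form (204): x_t = sum sigma_k/(sigma_k^2+t) <y,y_k> x_k
U, s, Vt = np.linalg.svd(A, full_matrices=False)
x_filter = Vt.T @ (s / (s**2 + t) * (U.T @ y))

assert np.allclose(x_normal, x_filter, atol=1e-8)
```

The normal-equation form is what one implements in practice; the spectral form (204) is what the convergence analysis works with.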

In order to compute the qualification of Tikhonov's regularization we take a look at the term

$$\sup_{0\le\lambda\le\|A\|^2} \lambda^{\mu/2}\,|p_t(\lambda)| = t^{\mu/2} \sup_{0\le\lambda\le\|A\|^2} \underbrace{\frac{(\lambda/t)^{\mu/2}}{1 + \lambda/t}}_{=\,h_\mu(\lambda,t)}. \tag{206}$$

In fact, we are led to distinguish two cases:

Case 1: For $\mu > 2$, $h_\mu(\lambda,t)$ is strictly monotonically increasing in $\lambda$, so that the supremum in (206) is attained at $\lambda = \|A\|^2$ and behaves like $O(t)$ only.

Case 2: For $\mu \le 2$ we find the estimate

$$\sup_{0\le\lambda\le\|A\|^2} h_\mu(\lambda,t) \le \sup_{0\le z<\infty} \frac{z^{\mu/2}}{1+z} < \infty,$$

so that $\omega_\mu(t) \le C_p\,t^{\mu/2}$ for a suitable constant $C_p$. Altogether, the qualification of the Tikhonov filter is $\mu_0 = 2$.

6.5 The Discrepancy Principle

A widely used a posteriori strategy is the discrepancy principle: the regularization parameter is chosen such that the discrepancy

$$d(t) = \|A x_t^\varepsilon - y^\varepsilon\|_Y, \quad x_t^\varepsilon = R_t y^\varepsilon,$$

matches the noise level $\varepsilon$. Observe that $\lim_{t\to 0,\,t>0} d(t) = \|P_{\overline{\mathcal{R}(A)}^\perp}\, y^\varepsilon\|_Y$, where $P_{\overline{\mathcal{R}(A)}^\perp}$ denotes the orthogonal projector onto $\overline{\mathcal{R}(A)}^\perp$.

If $y \notin \overline{\mathcal{R}(A)}$, then $\|P_{\overline{\mathcal{R}(A)}^\perp}\, y^\varepsilon\|_Y$ can be arbitrarily large. However, if $y \in \overline{\mathcal{R}(A)}$, then we have

$$\|P_{\overline{\mathcal{R}(A)}^\perp}\, y^\varepsilon\| = \|P_{\overline{\mathcal{R}(A)}^\perp}\,(y - y^\varepsilon)\| \le \|y - y^\varepsilon\| \le \varepsilon \tag{213}$$

and, therefore,

$$\lim_{t\to 0,\, t>0} d(t) \le \varepsilon. \tag{214}$$

In other words, for every $\tau > 1$ there exists a $t_0$ with $d(t) < \tau\varepsilon$ for all $t \le t_0$. Let $\tau > 1$ be chosen (fixed) and let $\{t_k\}$ be a strictly monotonically decreasing sequence with limit zero. Determine $k^*$ such that

$$d(t_{k^*}) \le \tau\varepsilon \le d(t_i), \quad i = 1, \ldots, k^* - 1. \tag{215}$$

Set $\alpha(\varepsilon, y^\varepsilon) = t_{k^*}$. Let $A$ be of class $\mathcal{L}(X,Y)$. Suppose that $\{F_t\}_{t>0}$ is a regularizing filter with qualification $\mu_0 > 1$. Moreover, assume that $M(t) \le \frac{C_M}{t}$ for $t \to 0$. Let the parameter choice $\alpha : (0,\infty) \times Y \to (0,\infty)$ be taken in accordance with the discrepancy principle, such that the sequence $\{t_k\}_k$ satisfies $t_k = \theta_k t_{k-1}$, where $0 < \vartheta \le \theta_k < 1$ for all $k$. Furthermore, suppose that $\tau > \sup\{|p_t(\lambda)| : t > 0,\ 0 \le \lambda \le \|A\|^2\} \ge p_t(0) = 1$. Then $(\{R_t\}_{t>0}, \alpha)$ with $R_t = F_t(A^*A)A^*$ is an order optimal regularization of $A^\dagger$ with respect to $X_\mu$ for all $\mu \in (0, \mu_0 - 1]$.

Note that, for the values $\mu \in (\mu_0 - 1, \mu_0]$, we obtain nothing; in particular, the Tikhonov case leads to the order of convergence $O(\varepsilon^{1/2})$ instead of $O(\varepsilon^{2/3})$. This is the reason why the discrepancy principle may produce suboptimal rates of convergence in case of a finite qualification. However, in connection with an infinite qualification, no such problems arise.
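The discrepancy principle (215) can be sketched with a few lines of code; the construction below (diagonal toy operator, Tikhonov filter, $\tau = 2$, geometric sequence $t_k = 2^{-k}$) is our own illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
sigma = 1.0 / np.arange(1, n + 1)
x_true = 1.0 / np.arange(1, n + 1) ** 1.5
y = sigma * x_true
eps = 1e-3
noise = rng.standard_normal(n)
y_eps = y + eps * noise / np.linalg.norm(noise)     # ||y - y_eps|| = eps exactly

def d(t):
    """Discrepancy d(t) = ||A x_t^eps - y^eps|| for the Tikhonov filter."""
    x_t = sigma / (sigma**2 + t) * y_eps
    return np.linalg.norm(sigma * x_t - y_eps)

tau = 2.0
tks = 1.0 * 0.5 ** np.arange(60)                    # t_k = theta^k t_0 with theta = 1/2
k_star = next(k for k, tk in enumerate(tks) if d(tk) <= tau * eps)
alpha = tks[k_star]

assert d(alpha) <= tau * eps                        # discrepancy matched ...
assert all(d(tk) > tau * eps for tk in tks[:k_star])  # ... for the first time
```

One stops at the first $t_k$ whose residual drops below $\tau\varepsilon$: decreasing $t$ further would only fit the noise.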

6.6 Generalized Discrepancy Principle

Next we are interested in a generalization of the discrepancy principle. To this end we make a reformulation: Let $\{t_k\}$ be a strictly monotonically decreasing sequence with limit zero. Then $t_{k^*}$ is chosen as follows:

$$t_{k^*} = \sup\{t_k : \|A x_{t_k}^\varepsilon - y^\varepsilon\|^2 \le \tau\varepsilon^2\} = \sup\{t_k : \|p_{t_k}(AA^*)\,y^\varepsilon\|^2 \le \tau\varepsilon^2\} = \sup\{t_k : \langle y^\varepsilon,\ p_{t_k}^2(AA^*)\,y^\varepsilon\rangle \le \tau\varepsilon^2\}. \tag{216}$$

In doing so we have used the function $s_t = p_t^2$. For a generalization we allow arbitrary functions $s_t$ in the following sense: Let $\tau > 1$ be chosen (fixed) and $\{t_k\}$ as before. Determine $k^*$ such that

$$t_{k^*} = \sup\{t_k : \langle y^\varepsilon,\ s_{t_k}(AA^*)\,y^\varepsilon\rangle \le \tau\varepsilon^2\}. \tag{217}$$


Set

$$\alpha(\varepsilon, y^\varepsilon) = t_{k^*}. \tag{218}$$

We have to look for functions $s_t$ which yield order optimal methods for the whole parameter domain $(0, \mu_0]$. An answer is given by the example (here for the Tikhonov filter with qualification $\mu_0 = 2$)

$$s_t(\lambda) = p_t^{2+\frac{2}{\mu_0}}(\lambda) = p_t^3(\lambda) = \left(\frac{t}{t+\lambda}\right)^3. \tag{219}$$

The generalized discrepancy principle with this family of functions $s_t$ together with the Tikhonov regularization is an order optimal method with respect to $X_\mu$ for $\mu \in (0,2]$.

In the previous considerations we have studied a number of a posteriori parameter choice rules which all depend, in one way or another, on the computed approximation and on the given data error level $\varepsilon$. A perfect example to illustrate this general reasoning is the discrepancy principle, where reconstructions are discarded unless their data fit has the order of the noise level $\varepsilon$. In practical examples such noise level information is not always available (or reliable). For instance, a given discrete data vector may consist of a finite number of measurements, for each of which we may or may not know the standard deviation and/or a worst-case error bound. Typically, the worst-case bound will be a severe overestimation, while the standard deviation might underestimate the true error. Both estimates may therefore lead to a significant loss of accuracy when used in these parameter choice rules. Another uncertainty problem arises if we are going to embed the discrete data into a continuous model by some interpolation or approximation process. Then we have to estimate the $L^2$-norm of the difference between the constructed function and the true data function from the discrete noise information and from a priori assumed smoothness properties of the data. Often it is necessary to consider alternative (a posteriori) parameter choice rules that avoid knowledge of the noise level and to determine some realistic regularization parameter on the basis of the actual performance of the regularization method under consideration. Such heuristic parameter choice rules are called error free. A good reference to these strategies is [21]. It must be emphasized, however, that error-free parameter choice rules cannot provide a convergent regularization method in the strict sense.
Still, there are examples where an error-free rule leads to better reconstructions than some sophisticated order optimal rule; cf., e.g., [58] for some numerical comparisons. Another heuristic parameter choice rule, which can be interpreted via some kind of error estimation, is the method of generalized cross-validation introduced by Wahba (cf. [132] for the history of this method and a more detailed exposition). It applies to problems where $A$ is an operator into a finite-dimensional data space, e.g., a generalized moment problem.


Another very popular error-free parameter choice rule has been advocated by Hansen [59]. This method is based on an inspection of the residual norms of the computed approximations, this time relating them to the norms of the approximations themselves (cf. [58]). In spite of its use in several applications, for example, in satellite-to-satellite tracking and satellite gravity gradiometry (see, e.g., [28, 37, 49, 117]), a sound mathematical foundation of the L-curve method is still lacking.

6.7 Tikhonov-Phillips Regularization

The Tikhonov-Phillips filter is given by

$$F_t(\lambda) = \frac{1}{\lambda + t}, \quad t > 0,\ \lambda \in [0, \|A\|^2]. \tag{220}$$

Let $A$ be of class $\mathcal{L}(X,Y)$. Without loss of generality, suppose that $A$ is injective (otherwise we replace $X$ by $\mathcal{N}(A)^\perp$). Let $Z$ be a Hilbert space and let $B \in \mathcal{L}(X,Z)$ be continuously invertible, i.e., there exists a value $\beta > 0$ such that

$$\beta\,\|x\|_X \le \|Bx\|_Z \quad \text{for all } x \in X. \tag{221}$$

Before we deal with the generalization of the Tikhonov-Phillips regularization we mention some preparatory results:

(1) Lax-Milgram Lemma: Suppose that $L \in \mathcal{L}(X)$. Assume there exists a value $\lambda > 0$ such that

$$\langle Lx, x\rangle \ge \lambda\,\|x\|^2 \tag{222}$$

holds for all $x \in X$. Then $L$ is continuously invertible and

$$\|L^{-1}\| \le \frac{1}{\lambda}. \tag{223}$$

(2) The stabilized normal equation

$$(A^*A + tB^*B)\,x = A^*y \tag{224}$$

with $y \in Y$ possesses a unique solution for every $t > 0$, which depends continuously on $y$.

Our aim is to show that the solution of (224) can be equivalently obtained as the minimizer of the Tikhonov-Phillips functional given by


$$J_{t,y}(x) = \|Ax - y\|^2 + t\,\|Bx\|^2. \tag{225}$$

Note that the second term on the right-hand side of (225) is called the penalty term of the Tikhonov-Phillips functional: Let $A$ and $B$ be given as indicated above. Moreover, assume that $y \in Y$ and $t > 0$. Then the following statements are equivalent:

(1) $(A^*A + tB^*B)\,x_t = A^*y$,
(2) $x_t$ minimizes the functional

$$J_{t,y}(x) = \|Ax - y\|^2 + t\,\|Bx\|^2, \tag{226}$$

i.e., $x_t = \arg\min\{J_{t,y}(x) : x \in X\}$.

Our purpose is to comment on this result in more detail: For $y \in Y$ and the family of generalized Tikhonov-Phillips regularizations $R_t$ given by

$$x_t = R_t y = (A^*A + tB^*B)^{-1}A^*y = \arg\min\{J_{t,y}(x) : x \in X\}, \tag{227}$$

the penalty term satisfies the estimate

$$\|Bx_t\| \le \frac{1}{\sqrt{t}}\,\|y\|. \tag{228}$$

In fact, property (228) explains the role of the penalty term. If $t$ is large, then $\|Bx_t\|$ is small compared to $\|Ax_t - y\|_Y$. If $t \ll 1$, then $\|Bx_t\|$ becomes large in comparison with the residual term $\|Ax_t - y\|$. All in all, the choice of the operator $B$ influences the character of $x_t$: some features can be strengthened, while others can be weakened.

Let $A$ and $B$ be given as before. Furthermore, suppose that $y \in \mathcal{D}(A^\dagger)$ and $r > 0$. Set

$$\delta_r = \inf\Big\{\Big\|\frac{1}{\beta^2}\,B^*BA^\dagger y - A^*y'\Big\| : y' \in Y,\ \|y'\| \le r\Big\}. \tag{229}$$

Then the following statements hold true for $x_t = (A^*A + tB^*B)^{-1}A^*y$ and $r, t > 0$:

(1) $\|x_t - A^\dagger y\|^2 \le \delta_r^2 + t\,\beta^2 r^2$,
(2) $\lim_{t\to 0} x_t = A^\dagger y$.
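The equivalence of the stabilized normal equation (224) and the minimization of (225) can be probed numerically; the sketch below (our own random toy operators, with a diagonal, continuously invertible $B$) checks that random perturbations never decrease the functional.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((20, 8))
B = np.diag(1.0 + np.arange(8.0))       # a continuously invertible "smoothing" operator
y = rng.standard_normal(20)
t = 0.5

def J(x):
    # Tikhonov-Phillips functional (225): residual term plus penalty term
    return np.linalg.norm(A @ x - y) ** 2 + t * np.linalg.norm(B @ x) ** 2

# (1) solve the stabilized normal equations (A*A + t B*B) x_t = A* y
x_t = np.linalg.solve(A.T @ A + t * B.T @ B, A.T @ y)

# (2) x_t minimizes J_{t,y}: random perturbations never decrease the functional
for _ in range(200):
    assert J(x_t) <= J(x_t + 0.1 * rng.standard_normal(8)) + 1e-12
```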

Some additional effort is needed to formulate regularizations under a priori parameter choice.

14 From Gaussian Least Squares Approximation to Today’s . . .

887

Let $A, B$ be given as before. If we choose $\gamma : (0,\infty) \to (0,\infty)$ such that

$$\lim_{\varepsilon\to 0}\gamma(\varepsilon) = 0 \quad \text{and} \quad \lim_{\varepsilon\to 0}\frac{\varepsilon}{\sqrt{\gamma(\varepsilon)}} = 0, \tag{230}$$

then $(\{R_t\}_{t>0}, \gamma)$ with $R_t = (A^*A + tB^*B)^{-1}A^*$ is a regularization of $A^\dagger$. In order to derive further convergence results we impose further assumptions on $B$: If $\{(\sigma_n; x_n, y_n)\}$ is the singular system of $A$ and if, for $B \in \mathcal{L}(X,Z)$, we set

$$B^*Bx = \sum_{k=1}^{\infty}\beta_k^2\,\langle x, x_k\rangle\,x_k, \quad \beta_k^2 \in [\beta^2, \|B\|^2], \tag{231}$$

(this is a particular specialization), then we get

$$R_t y = (A^*A + tB^*B)^{-1}A^*y = \sum_{k=1}^{\infty}\frac{\sigma_k}{\sigma_k^2 + t\beta_k^2}\,\langle y, y_k\rangle\,x_k. \tag{232}$$

Note that the classical Tikhonov-Phillips regularization uses $B = I$, i.e., $\beta_k = 1$ for all $k$. Via the $\beta_k$ we control which singular values are damped and how strongly they are regularized. In general, $R_t$ cannot be written as a filter, i.e., in the form $F_t(A^*A)A^*$. However, the following helpful estimates can be easily verified (see [116]):

$$\|F_{\|B\|^2 t}(A^*A)\,A^*y\| \le \|R_t y\| \le \|F_{\beta^2 t}(A^*A)\,A^*y\| \quad \text{for all } y \in Y, \tag{233}$$

and

$$\|p_{\beta^2 t}(A^*A)\,x\| \le \|(I - R_t A)\,x\| \le \|p_{\|B\|^2 t}(A^*A)\,x\| \quad \text{for all } x \in X, \tag{234}$$

where $p_t(\lambda) = 1 - \lambda F_t(\lambda) = \frac{t}{\lambda+t}$ and $F_t(\lambda) = \frac{1}{\lambda+t}$ (as for the classical Tikhonov filter). The family $\{R_t\}_{t>0}$ possesses the same asymptotic behavior for $t \to 0$ as the classical Tikhonov filter; the behavior for $\varepsilon \to 0$ is independent of $B$.

Suppose that $A \in \mathcal{K}(X,Y)$ and $B \in \mathcal{L}(X,Z)$. Assume that the representation (231) holds true. Let $R_t$ be given in the form $R_t = (A^*A + tB^*B)^{-1}A^*$, $t > 0$.

(a) If the a priori parameter $\gamma$ is chosen such that

$$C_\gamma\left(\frac{\varepsilon}{\rho}\right)^{\frac{2}{\mu+1}} \le \gamma(\varepsilon) \le C_\Gamma\left(\frac{\varepsilon}{\rho}\right)^{\frac{2}{\mu+1}} \quad \text{for } \varepsilon \to 0 \tag{235}$$

(as in (195)), where $C_\gamma, C_\Gamma$ are positive constants, then the method $(\{R_t\}_{t>0}, \gamma)$ is an order optimal regularization of $A^\dagger$ with respect to $X_\mu$, $\mu \in (0,2]$.

(b) If we choose $\gamma$ according to the discrepancy principle, then $(\{R_t\}_{t>0}, \gamma)$ is an order optimal regularization of $A^\dagger$ with respect to $X_\mu$, $\mu \in (0,1]$. The order of decay of the error, $O(\varepsilon^{1/2})$, is maximal.

(c) If we choose $\gamma$ in accordance with the generalized discrepancy principle with $t_{k^*} = \sup\{t_k \mid \eta(t_k) \le \tau\varepsilon^2\}$, where

$$\eta(t) = \|Ax_t^\varepsilon - y^\varepsilon\|^2 - \langle A^*(Ax_t^\varepsilon - y^\varepsilon),\ (A^*A + tB^*B)^{-1}A^*(Ax_t^\varepsilon - y^\varepsilon)\rangle, \tag{236}$$

with $y^\varepsilon \in Y$ and $x_t^\varepsilon = R_t y^\varepsilon$, then $(\{R_t\}_{t>0}, \gamma)$ is an order optimal regularization of $A^\dagger$ with respect to $X_\mu$, $\mu \in (0,2]$.

Once again, it is possible to prove that $O(\varepsilon^{2/3})$ is the maximal order of decay for the error. In other words, the qualification does not increase under this generalization. However, it should be remarked that reconstructions using $B \neq I$ may lead to strongly different results. In particular, it is possible to adapt the $\beta_k$ to the spectrum of the noise, if this information is known or can be estimated by other methods.
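The representation (232) and the role of the weights $\beta_k$ can be checked numerically; in the sketch below (our own construction) $B^*B$ is built to be diagonal in the singular basis of $A$, exactly as in (231).

```python
import numpy as np

rng = np.random.default_rng(5)
U, _ = np.linalg.qr(rng.standard_normal((10, 10)))
V, _ = np.linalg.qr(rng.standard_normal((6, 6)))
s = np.array([1.0, 0.6, 0.3, 0.1, 0.03, 0.01])       # singular values sigma_k
beta = np.array([1.0, 1.2, 1.5, 2.0, 3.0, 4.0])      # weights beta_k of B*B in the basis x_k
A = U[:, :6] @ np.diag(s) @ V.T
BtB = V @ np.diag(beta**2) @ V.T                     # B*B x = sum beta_k^2 <x, x_k> x_k

y = rng.standard_normal(10)
t = 1e-2
x_op = np.linalg.solve(A.T @ A + t * BtB, A.T @ y)           # (A*A + t B*B)^{-1} A* y
x_sum = V @ (s / (s**2 + t * beta**2) * (U[:, :6].T @ y))    # spectral form (232)
assert np.allclose(x_op, x_sum, atol=1e-10)
```

Larger $\beta_k$ damp the corresponding singular components more strongly, which is precisely the extra freedom over the classical choice $B = I$.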

7 Pseudoinverses in Reproducing Kernel Hilbert Spaces

Within the $L^2$-context the range of a compact linear operator $K$ with infinite dimensional range is always non-closed. In M.Z. Nashed, G. Wahba [101–103] we are confronted with the question: Can one endow $\mathcal{R}(K)$ with a new inner product that would make $\mathcal{R}(K)$ a Hilbert space and that would have additional useful properties?

7.1 Reproducing Kernel Hilbert Space (RKHS) Framework

A Hilbert space $H$ of complex-valued functions on a (bounded) set $G$ (e.g., a regular region) is called a reproducing kernel Hilbert space (RKHS) if all the evaluation functionals $H \ni x \mapsto x(t) \in \mathbb{C}$ are continuous (bounded) for each fixed $t \in G$, i.e., there exists a positive constant $C_t$ for each $t \in G$ such that $|x(t)| \le C_t\,\|x\|_H$ for all $x \in H$. By the Riesz Representation Theorem, for each $t \in G$, there exists a unique element $Q_t$ such that $x(t) = \langle x, Q_t\rangle_H$ for all $x \in H$. The reproducing kernel $Q(\cdot,\cdot) : G\times G \to \mathbb{C}$ of a RKHS $H$ is defined by $Q(s,t) = \langle Q_s, Q_t\rangle_H$, $s,t \in G$. We list some basic properties of RKHS's that are particularly relevant in approximation and estimation theory:

• $Q(s,t) = \overline{Q(t,s)}$ for all $t,s \in G$.
• $Q(s,s) \ge 0$ for all $s \in G$.
• $|Q(s,t)| \le \sqrt{Q(s,s)}\,\sqrt{Q(t,t)}$ for all $s,t \in G$.
• The reproducing kernel $Q(s,t)$ on $G\times G$ is a non-negative definite Hermitean kernel.


Conversely, by the Aronszajn-Moore Theorem, every non-negative definite Hermitean function $Q(\cdot,\cdot)$ on $G\times G$ determines a unique Hilbert space $H_Q$ for which $Q(\cdot,\cdot)$ is a reproducing kernel (see, e.g., [2]). (Note that a complex-valued kernel $F$ on $G\times G$ is said to be positive definite if, for any $n$ points $t_1,\ldots,t_n \in G$, the matrix $\mathbf{A} = (F(t_i,t_j))_{1\le i,j\le n}$ is non-negative definite, i.e.,

$$u^{\mathrm H}\mathbf{A}\,u = \sum_{i,j=1}^{n} \overline{u_i}\,F(t_i,t_j)\,u_j \ge 0 \tag{237}$$

for all $u = (u_1,\ldots,u_n)^{\mathrm T} \in \mathbb{C}^n$.)

• A closed subspace $\tilde H$ of a RKHS $H$ is also a RKHS. Moreover, the orthogonal projector $P$ of $H$ onto $\tilde H$ and the reproducing kernel $\tilde Q(s,t)$ of the RKHS $\tilde H$ are related by $Pf(s) = \langle f, \tilde Q_s\rangle$, $s \in G$, for all $f \in H$, where $\tilde Q_s = P Q_s$.
• In a RKHS, the element representing a given bounded linear functional $L$ can be expressed by means of the reproducing kernel: $L(f) = \langle f, h\rangle_H$, where $h = L(Q)$. Similarly, for a bounded linear operator $L$ from $H$ to $H$, we have $Lf(t) = \langle Lf, Q_t\rangle = \langle f, L^*Q_t\rangle$.
• If $G$ is a bounded domain, or if $G$ is an unbounded domain but

$$\int_{G\times G} |Q(s,t)|^2\,dt\,ds < \infty, \tag{238}$$

and $Q(s,t)$ is continuous on $G\times G$, then $H_Q$ is a space of continuous functions.

• Every finite dimensional function space is a RKHS $H$ with reproducing kernel

$$Q(s,t) = \sum_{i=1}^{n} u_i(s)\,\overline{u_i(t)}, \tag{239}$$

where $\{u_i\}_{i=1}^n$ is an orthonormal basis for $H$ (notice that the sum in the above definition of the kernel $Q$ is invariant under the choice of the orthonormal basis).

• If the integral relation

$$\int_{G\times G} |Q(s,t)|^2\,ds\,dt < \infty \tag{240}$$

holds true, then $Q(\cdot,\cdot)$ has a countable sequence of eigenvalues and eigenfunctions (Theorem of Mercer).

• Let $\{\varphi_n\}_{n\in\mathbb{N}}$ be a sequence of complex-valued functions defined on $G$ such that, for every $t \in G$,

$$\sum_{n=1}^{\infty} |\varphi_n(t)|^2 < \infty. \tag{241}$$


For every sequence $\{c_n\}_{n\in\mathbb{N}}$ with $\sum_{n=1}^\infty |c_n|^2 < \infty$, the series $\sum_{n=1}^\infty c_n\varphi_n(t)$ is then convergent in $\mathbb{C}$ for every $t \in G$. The functions which are the sums of such series form a linear subspace $H$, on which we are able to define the structure of a separable Hilbert space by taking as scalar product, for

$$f = \sum_{n=1}^{\infty} c_n\varphi_n, \quad g = \sum_{n=1}^{\infty} d_n\varphi_n, \tag{242}$$

the number

$$\langle f, g\rangle_H = \sum_{n=1}^{\infty} c_n\,\overline{d_n}. \tag{243}$$

This space has a reproducing kernel, namely

$$Q(s,t) = \sum_{n=1}^{\infty} \varphi_n(s)\,\overline{\varphi_n(t)}, \quad s,t \in G. \tag{244}$$

• Let $H$ be a separable RKHS; then its reproducing kernel $Q(\cdot,\cdot)$ has the expansion

$$Q(s,t) = \sum_{n=1}^{\infty} \varphi_n(s)\,\overline{\varphi_n(t)}, \tag{245}$$

where $\{\varphi_n\}_{n=1}^\infty$ is an orthonormal basis for $H$ (we remark that, for a general separable Hilbert space $H$, $\sum_{n=1}^\infty \varphi_n(s)\overline{\varphi_n(t)}$ is not a reproducing kernel (note that $L^2(G)$ is not a RKHS), and also that the $\varphi_n$'s do not generally correspond to sampling expansions; if they do, i.e., if $\varphi_n(t) = Q(t_n,t)$ for some sequence $\{t_n\}$, then we have $f(t) = \sum_{n=1}^\infty f(t_n)\,\varphi_n(t)$, which constitutes a sampling theorem).

• If the reproducing kernel $Q(s,t)$ of a RKHS $H$ is continuous on $G\times G$, then $H$ is a space of continuous functions that are uniformly continuous on a bounded $G$. This follows from

$$|x(t) - x(s)| = |\langle x, Q_t - Q_s\rangle_H| \le \|x\|_H\,\|Q_t - Q_s\|_H \tag{246}$$

and

$$\|Q_t - Q_s\|^2 = Q(t,t) - 2\,\mathrm{Re}\,Q(t,s) + Q(s,s) \quad \text{for all } s,t \in G. \tag{247}$$


• Strong convergence in a RKHS $H$ implies pointwise convergence, and uniform convergence on compact sets, because of the fact that

$$|x(t) - x_n(t)| = |\langle x - x_n, Q_t\rangle_H| \le \sqrt{Q(t,t)}\,\|x - x_n\|_H. \tag{248}$$

• Let $H_Q$ denote the RKHS with reproducing kernel $Q$, and denote the inner product and norm in $H_Q$ by $\langle\cdot,\cdot\rangle_{H_Q}$ and $\|\cdot\|_{H_Q}$, respectively. Note that $Q(s,s')\,(= Q_s(s'))$ is a non-negative definite Hermitean kernel on $G\times G$, and that $\{Q_s : s \in G\}$ spans $H_Q$, since $\langle Q_s, x\rangle_{H_Q} = 0$ for all $s \in G$ implies $x(s) = 0$. For more properties of reproducing kernel spaces the reader is referred to, e.g., [2, 11], and [68] and the references therein.

• For every positive definite kernel $Q(\cdot,\cdot)$ on $G\times G$ there exists a zero-mean Gaussian process with $Q(\cdot,\cdot)$ as its covariance, giving rise to the relation between Bayes estimates, Gaussian processes, and optimization processes in RKHS (for more details the reader is referred to the geodetic literature, see, e.g., [50, 86, 87, 91], and the monographs [82, 132]).

Interest in reproducing kernel Hilbert spaces has increased in recent years, as growing computer capacity has made solutions of ever larger and more complex problems practicable. Indeed, new reproducing kernel representations and new applications (in particular in physical geodesy and geophysics) are being contributed at a rapid rate. For example, a certain RKHS in terms of outer harmonics allows the adequate determination of the Earth's gravitational potential (see, e.g., [24, 25, 121] for early spline approaches) in consistency with gravitational observables of heterogeneous type (that are interpretable as (bounded) linear functionals on the RKHS under consideration).
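The reproducing property and the kernel expansion (239) are easy to visualize in a finite-dimensional discrete model; in the sketch below (our own construction) functions are vectors on a grid, the Euclidean inner product stands in for $\langle\cdot,\cdot\rangle_H$, and the kernel matrix plays the role of $Q(s,t)$.

```python
import numpy as np

rng = np.random.default_rng(6)
m, n = 50, 4                                       # grid of 50 points, 4-dimensional space H
U, _ = np.linalg.qr(rng.standard_normal((m, n)))   # orthonormal basis u_1,...,u_n (grid vectors)
Q = U @ U.T                                        # kernel matrix Q(s,t) = sum_i u_i(s) u_i(t)

x = U @ rng.standard_normal(n)                     # an arbitrary element of H
# reproducing property: x(t) = <x, Q_t>_H at every grid point t (Q_t = column t of Q)
assert np.allclose(Q @ x, x, atol=1e-12)
# and Q is a symmetric non-negative definite kernel
assert np.allclose(Q, Q.T)
assert np.all(np.linalg.eigvalsh(Q) >= -1e-12)
```

In this discrete model $Q$ is simply the orthogonal projector onto $H$, which is exactly why evaluation against $Q_t$ reproduces $x(t)$ for members of $H$.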

7.2 Hilbert-Schmidt Operator Theory

A RKHS $H_Q$ with reproducing kernel $Q$ determines a self-adjoint Hilbert-Schmidt operator (also denoted by $Q$) from $L^2(G)$ to $L^2(G)$ by letting

$$(Qx)(s) = \int_G Q(s,s')\,x(s')\,ds', \quad x \in L^2(G). \tag{249}$$

Since $Q$ is assumed to be continuous, the operator $Q$ has, by the Theorem of Mercer (see, e.g., [75]), an $L^2(G)$-complete orthonormal system of eigenfunctions $\{\phi_i\}_{i=1}^\infty$ and corresponding eigenvalues $\{\lambda_i\}_{i=1}^\infty$ with $\lambda_i \ge 0$ and $\sum_{i=1}^\infty \lambda_i < \infty$. Thus $Q$ is a trace-class operator (see, e.g., [14]), so that $Q(\cdot,\cdot)$ has the uniformly convergent Fourier expansions


$$Q(s,s') = \sum_{i=1}^{\infty} \lambda_i\,\phi_i(s)\,\phi_i(s') \tag{250}$$

and

$$Qx = \sum_{i=1}^{\infty} \lambda_i\,\langle x, \phi_i\rangle_{L^2(G)}\,\phi_i, \tag{251}$$

where $\langle\cdot,\cdot\rangle_{L^2(G)}$ is the inner product in $L^2(G)$. It is well known (see, for example, [129]) that the space $H_Q$ consists of all $x \in L^2(G)$ satisfying the condition

$$\sum_{i=1}^{\infty} \lambda_i^{-1}\,|\langle x, \phi_i\rangle_{L^2(G)}|^2 < \infty \tag{252}$$

(note that the notational convention $0/0 = 0$ is being adopted), with inner product $\langle\cdot,\cdot\rangle_{H_Q}$ given by

$$\langle x_1, x_2\rangle_{H_Q} = \sum_{i=1}^{\infty} \lambda_i^{-1}\,\langle x_1, \phi_i\rangle_{L^2(G)}\,\langle x_2, \phi_i\rangle_{L^2(G)}. \tag{253}$$
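The Mercer expansion (250)-(253) can be sketched by discretizing a concrete kernel; the example below (Gaussian kernel on $[0,1]$, midpoint quadrature, all names ours) checks non-negativity and summability of the eigenvalues and the $H_Q$-inner-product identity $\|Qp\|^2_{H_Q} = \langle Qp, p\rangle_{L^2}$, which follows from (253).

```python
import numpy as np

# discretize a smooth kernel Q(s,s') = exp(-(s-s')^2) on [0,1]; quadrature weight h = 1/n
n = 80
sgrid = (np.arange(n) + 0.5) / n
h = 1.0 / n
K = np.exp(-np.subtract.outer(sgrid, sgrid) ** 2)

# eigen-decomposition of the integral operator (Qx)(s) = int Q(s,s') x(s') ds'  (eq. (249))
lam, phi = np.linalg.eigh(K * h)
lam, phi = lam[::-1], phi[:, ::-1]              # descending eigenvalues
assert np.all(lam > -1e-12) and lam.sum() > 0   # non-negative; trace class: sum lam_i < inf

# H_Q inner product (253) for an element x = Qp of H_Q
p = np.sin(2 * np.pi * sgrid)
x = (K * h) @ p                                 # x = Qp lies in H_Q
c = phi.T @ x * np.sqrt(h)                      # L2 coefficients <x, phi_i>
keep = lam > 1e-10                              # numerical analogue of the convention 0/0 = 0
hq_norm_sq = np.sum(c[keep] ** 2 / lam[keep])
assert abs(hq_norm_sq - h * np.sum(x * p)) < 1e-6   # ||Qp||_{H_Q}^2 = <Qp, p>_{L2}
```

The rapidly decaying $\lambda_i$ are exactly what makes the associated operator equations ill-posed, and what the $\lambda_i^{-1}$ weights in (253) compensate.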

The operator $Q$ has a well-defined symmetric square root $Q^{1/2}$, which is a Hilbert-Schmidt operator (see, e.g., [101–103]):

$$Q^{1/2}x = \sum_{i=1}^{\infty} \sqrt{\lambda_i}\,\langle x, \phi_i\rangle_{L^2(G)}\,\phi_i. \tag{254}$$

Thus, since $\mathcal{N}(Q) = \mathcal{N}(Q^{1/2})$, we have

$$H_Q = Q^{1/2}(L^2(G)) = Q^{1/2}\big(L^2(G) \ominus \mathcal{N}(Q)\big). \tag{255}$$

$(Q^{1/2})^\dagger$ has the representation

$$(Q^{1/2})^\dagger x = \sum_{i=1}^{\infty} \big(\sqrt{\lambda_i}\big)^\dagger\,\langle x, \phi_i\rangle_{L^2(G)}\,\phi_i \tag{256}$$

on $H_Q \oplus H_Q^\perp$ ($\perp$ in $L^2(G)$), where, for a real number $\Theta$, $\Theta^\dagger = \Theta^{-1}$ for $\Theta \neq 0$ and $\Theta^\dagger = 0$ for $\Theta = 0$. Similarly, $Q^\dagger$ has the representation

$$Q^\dagger x = \sum_{i=1}^{\infty} \lambda_i^\dagger\,\langle x, \phi_i\rangle_{L^2(G)}\,\phi_i. \tag{257}$$

For any operator $Q$ on $L^2(G)$ induced by a reproducing kernel $Q(s,s')$ as defined in (249), we shall adopt the notational conventions (cf. [101–103])

$$Q^{-1} := Q^\dagger, \tag{258}$$

$$Q^{-1/2} := (Q^{1/2})^\dagger. \tag{259}$$

This leads to the relations

$$\|x\|_{H_Q} = \inf\{\|p\|_{L^2(G)} : p \in L^2(G),\ x = Q^{1/2}p\}, \quad x \in H_Q, \tag{260}$$

and

$$\langle x_1, x_2\rangle_{H_Q} = \langle Q^{-1/2}x_1,\, Q^{-1/2}x_2\rangle_{L^2(G)}, \quad x_1, x_2 \in H_Q, \tag{261}$$

and, if $x_1 \in H_Q$ and $x_2 \in H_Q$ with $x_2 = Q\rho$ for some $\rho \in L^2(G)$, then

$$\langle x_1, x_2\rangle_{H_Q} = \langle x_1, \rho\rangle_{L^2(G)}. \tag{262}$$

7.3 Relationship Between Pseudoinverse and $L^2$-Topology

We are now ready to explore properties of the generalized inverse of a linear operator between two reproducing kernel spaces (cf. [101–103]). To this end we let $X = L^2(G)$ and $Y = L^2(\mathcal{H})$ denote the Hilbert spaces of square-integrable real-valued functions on the closed, bounded domains $G$ and $\mathcal{H}$, respectively. Let $A$ be a linear operator from $X$ into $Y$. Let $\subset$ denote point set inclusion only, and suppose that $A$ has the following properties:

$$H_Q \subset \mathcal{D}(A) \subset X, \tag{263}$$

where $H_Q$ is a RKHS with continuous reproducing kernel on $G\times G$;

$$A(H_Q) = H_R \subset H_{\bar R} \subset Y, \tag{264}$$

where $H_R$ and $H_{\bar R}$ are RKHS's with continuous reproducing kernels on $\mathcal{H}\times\mathcal{H}$; and

$$\mathcal{N}(A) \text{ in } H_Q \text{ is closed in } H_Q. \tag{265}$$

We emphasize, in particular, that the space $H_R$ is not necessarily closed in the topology of $H_{\bar R}$.


Let $A^\dagger_{(X,Y)}$ denote the generalized inverse of $A$ when $A$ is considered as a mapping from $X$ into $Y$, and let $A^\dagger_{(Q,\bar R)}$ denote the generalized inverse of $A$ when $A$ is considered as a mapping from $H_Q$ into $H_{\bar R}$. Now, the topologies in $(X,Y)$ are not the same as the topologies in $(H_Q, H_{\bar R})$. Thus the generalized inverses $A^\dagger_{(X,Y)}$ and $A^\dagger_{(Q,\bar R)}$ show distinct continuity properties, in general. We shall develop the relation between $A^\dagger_{(Q,\bar R)}$ and certain $(X,Y)$ and $(Y,Y)$ generalized inverses. In the sequel, the operators $\bar R : Y \to Y$ and $\bar R^{1/2} : Y \to Y$ are defined from the reproducing kernel of $H_{\bar R}$ in analogy to $Q$ and $Q^{1/2}$ (see (251) and (254)). We continue the notational convention of (258), i.e., $\bar R^{-1} = \bar R^\dagger = \bar R^\dagger_{(Y,Y)}$ and $\bar R^{-1/2} = (\bar R^{1/2})^\dagger_{(Y,Y)}$.

From [101–103] we are able to deduce the following result: Under assumptions (263), (264), and (265), let $y \in \mathcal{D}(A^\dagger_{(Q,\bar R)})$, i.e., $y \in H_R \oplus H_R^\perp$ ($\perp$ in $H_{\bar R}$). Then

$$y \in \mathcal{D}\big(Q^{1/2}(\bar R^{-1/2}AQ^{1/2})^\dagger_{(X,Y)}\,\bar R^{-1/2}\big) \tag{266}$$

and

$$A^\dagger_{(Q,\bar R)}\,y = Q^{1/2}\big(\bar R^{-1/2}AQ^{1/2}\big)^\dagger_{(X,Y)}\,\bar R^{-1/2}\,y. \tag{267}$$

Moreover, it follows that $A^\dagger_{(Q,\bar R)}$ is bounded, provided that $A(H_Q) = H_{\bar R}$.

It should be noted that an operator $A$ may satisfy the assumption $A(H_Q) = H_{\bar R}$ while failing to have a closed range in the space $Y$. This is, for example, the case if $A$ is a Hilbert-Schmidt linear integral operator (with non-degenerate kernel) on $X$. It is this observation which makes RKHS useful in the context of regularization and approximation of ill-posed linear operator equations.

7.4 Explicit Representation of Minimal Norm Solutions

We assume that $H_Q$ is chosen so that the linear functionals $\{E_t : t \in \mathcal{H}\}$ defined by

$$E_t x = (Ax)(t) \tag{268}$$

are continuous in $H_Q$. Then, by the Riesz representation theorem, there exists a family $\{\eta_t : t \in \mathcal{H}\} \subset H_Q$ such that

$$(Ax)(t) = \langle \eta_t, x\rangle_{H_Q}, \quad t \in \mathcal{H},\ x \in H_Q, \tag{269}$$

where $\eta_t$ is explicitly given by

$$\eta_t(s) = \langle \eta_t, Q_s\rangle_{H_Q} = (AQ_s)(t) \tag{270}$$

($\eta_t(s)$ is readily obtained in a more explicit form from (270) if $A$ is a differential or integral operator). Let $R(t,t')$ be the non-negative definite kernel on $\mathcal{H}\times\mathcal{H}$ given by

$$R(t,t') = \langle \eta_t, \eta_{t'}\rangle_{H_Q}, \quad t, t' \in \mathcal{H}. \tag{271}$$

Assume that $H_R$ is the RKHS with reproducing kernel $R$ given by (271). Let $R_t$ be the element of $H_R$ defined by $R_t(t') = R(t,t')$, and let $\langle\cdot,\cdot\rangle_{H_R}$ be the inner product in $H_R$. Suppose that $V$ is the closure of the span of $\{\eta_t : t \in \mathcal{H}\}$ in $H_Q$. Now, $\{R_t : t \in \mathcal{H}\}$ spans $H_R$, and by the properties of RKHS we have

$$\langle \eta_t, \eta_{t'}\rangle_{H_Q} = R(t,t') = \langle R_t, R_{t'}\rangle_{H_R}. \tag{272}$$

Thus there is an isometric isomorphism between the subspace $V$ and $H_R$, generated by the correspondence

$$\eta_t \in V \sim R_t \in H_R. \tag{273}$$

Then $x \in V \sim y \in H_R$ if and only if $\langle \eta_t, x\rangle_{H_Q} = y(t) = \langle R_t, y\rangle_{H_R}$, $t \in \mathcal{H}$, i.e., if and only if $y(t) = (Ax)(t)$, $t \in \mathcal{H}$. Thus $A(H_Q) = A(V) = H_R$. The nullspace of $A$ in $H_Q$ is $\{x \in H_Q : \|Ax\|_{H_R} = 0\}$. Since

$$\langle \eta_t, x\rangle_{H_Q} = 0 \ \text{ for all } t \in \mathcal{H},\ x \in H_Q \ \Longrightarrow\ x \in V^\perp, \tag{274}$$

and $x \in V$ implies $\|x\|_{H_Q} = \|Ax\|_{H_R}$, it follows that the nullspace of $A$ in $H_Q$ is $V^\perp$ ($\perp$ in $H_Q$). Hence, (268) entails that the nullspace of $A : H_Q \to H_R$ in $H_Q$ is always closed, irrespective of the topological properties of $A : X \to Y$.

We list the following table of corresponding sets and elements, under the correspondence "$\sim$" of (273), where the entries on the left are in $H_Q$:

$$V \sim H_R, \tag{275}$$

$$x \sim y, \tag{276}$$

$$\eta_t \sim R_t, \tag{277}$$

$$P_V Q_s \sim \eta_s^*. \tag{278}$$

Here, $P_V$ is the projector from $H_Q$ onto the (closed) subspace $V$, $y(t) = \langle \eta_t, x\rangle_{H_Q}$, $t \in \mathcal{H}$, and $\eta_s^* = AQ_s = A(P_V Q_s)$, i.e.,

$$\eta_s^*(t) = \langle \eta_t, P_V Q_s\rangle_{H_Q} = \eta_t(s). \tag{279}$$

This leads to the following result (the proof is given in [102, 103]): Let $A$ and $H_Q$ satisfy (268), and let $R$ be given by (272), where $\eta_t$ is defined by (269). Let $\eta_s^* = AQ_s$. Then, for $y \in H_R$,


(A†(Q,R) y)(x) = "ηs∗ , y#HR ,

s ∈ G.

(280)

We also obtain another operator representation of A†(Q,R) : Under the assumptions (i) D(A∗ ) is dense in Y , where A∗ is the adjoint of A considered as an operator from X to Y , (ii) A and HQ satisfy (269), (iii) HQ and HR = A(HQ ) possess continuous reproducing kernels. For y ∈ HR , (A†(Q,R) y)(s) = (QA∗ (AQA∗ )†(Y,Y ) y)(s),

7.5

s ∈ G.

(281)

7.5 Poorly Conditioned Operator Equations

For $A : X \to Y$, the pseudocondition number of $A$ (relative to the norms of $X$ and $Y$) is given by

$$\gamma(A; X, Y) = \sup_{\substack{x\neq 0 \\ x \in \mathcal{D}(A)}} \frac{\|Ax\|_Y}{\|x\|_X}\ \sup_{\substack{y\neq 0 \\ y \in \mathcal{D}(A^\dagger)}} \frac{\|A^\dagger y\|_X}{\|y\|_Y}. \tag{282}$$

The equation $Ax = y$ is said to be poorly conditioned in the spaces $X, Y$ if the number $\gamma(A; X, Y)$ is much greater than $1$ (note that $1 \le \gamma(A; X, Y)$; for ill-posed problems, $\gamma$ is not finite). Suppose $H_Q$ is a RKHS with $H_Q \subset \mathcal{D}(A)$, and $A$ and $H_Q$ satisfy (268) with $A(H_Q) = H_R$, $R$ given by (272). Then $\gamma(A; H_Q, H_R) = 1$. To see this, write $x \in H_Q$ in the form $x = x_1 + x_2$, where $x_1 \in V$ and $x_2 \in V^\perp$. Then $Ax = Ax_1 = y_1$ and $\|y_1\|_{H_R} = \|x_1\|_{H_Q}$. Thus

$$\gamma(A; H_Q, H_R) = \sup_{x\neq 0} \frac{\|x_1\|_{H_Q}}{\|x\|_{H_Q}}\ \sup_{y_1\neq 0} \frac{\|y_1\|_{H_R}}{\|y_1\|_{H_R}} = 1. \tag{283}$$

On the other hand, the number $\gamma(A; X, Y)$ may be large. Thus the casting of the operator equation $Ax = y$ in the reproducing kernel spaces $H_Q$, $H_R$ always leads to a well-conditioned (indeed, optimally conditioned) problem.

7.5.1 Regularization of Pseudosolutions in Reproducing Kernel Spaces

We study properties of regularized pseudosolutions (in RKHS) $x_\alpha$ of the operator equation $Ax = y$, where $y$ is not necessarily in the range of the operator $A$. By a


regularized pseudosolution we mean a solution to the variational problem: Find $x_\alpha \in H_Q$ minimizing

$$J_y(x) = \|y - Ax\|^2_{H_P} + \alpha\,\|x\|^2_{H_Q}, \quad \alpha > 0, \tag{284}$$

where $H_Q$ is a RKHS in the domain of $A$, $\|\cdot\|_{H_P}$ denotes the norm in a RKHS $H_P$ with reproducing kernel $P$, $H_P \subset Y$, and $J_y(x)$ is assigned the value $+\infty$ if $y - Ax \notin H_P$. We suppose $A$ and $H_Q$ satisfy (268); hence $A(H_Q) = H_R$ possesses a reproducing kernel. As before, $A$ may be unbounded, invertible, or compact considered as an operator from $X\,(= L^2(G))$ to $Y\,(= L^2(\mathcal{H}))$. It is assumed that $y$ possesses a (not necessarily unique) representation $y = y_0 + \xi$ for some $y_0 \in A(H_Q)$ and $\xi \in H_P$, where $\xi$ may be thought of as a "disturbance". For $\alpha > 0$, let $H_{\alpha P}$ be the RKHS with reproducing kernel $\alpha P(t,t')$, where $P(t,t')$ is the reproducing kernel on $\mathcal{H}\times\mathcal{H}$ associated with $H_P$. We have

$$H_P = H_{\alpha P} \quad \text{and} \quad \|\cdot\|^2_{H_P} = \alpha\,\|\cdot\|^2_{H_{\alpha P}}. \tag{285}$$

Let $R^{(\alpha)} = R + \alpha P$, and let $H_{R^{(\alpha)}}$ be the RKHS with reproducing kernel $R^{(\alpha)}(t,t') = R(t,t') + \alpha P(t,t')$. According to [2], $H_{R^{(\alpha)}}$ is the Hilbert space of functions of the form

$$y = y_0 + \xi, \tag{286}$$

where $y_0 \in H_R$ and $\xi \in H_P$. Following [2], we note that this decomposition is not unique unless $H_R$ and $H_P$ have no element in common except the zero element. The norm in $H_{R^{(\alpha)}}$ is given by

$$\|y\|^2_{H_{R^{(\alpha)}}} = \min\{\|y_0\|^2_{H_R} + \|\xi\|^2_{H_{\alpha P}} : y_0 \in H_R,\ \xi \in H_P,\ y_0 + \xi = y\}, \tag{287}$$

where, however, the $y_0$ and $\xi$ attaining the minimum in (287) are easily shown to be unique by the strict convexity of the norm.

Consider the problem of finding $x_\alpha \in H_Q$ minimizing $J_y(x)$ in (284) for $y \in H_{R^{(\alpha)}}$. Then $y - Ax_\alpha$ must be in $H_P$, and it is obvious that $x_\alpha \in V$, the orthogonal complement of the nullspace of $A$ in $H_Q$. For any $x \in V$, $\|x\|_{H_Q} = \|Ax\|_{H_R}$ by the isometric isomorphism between $V$ and $H_R$, and (284) may be written in the equivalent form: Find $x_\alpha \in V$ minimizing

$$\alpha\,\|Ax\|^2_{H_R} + \|y - Ax\|^2_{H_P}. \tag{288}$$

Comparing (287) and (288) with the aid of (285), we see that the $y_0$ and $\xi$ attaining the minimum on the right-hand side of (287) are related to the solution $x_\alpha$ of the minimization problem (288) by

$$y_0 = Ax_\alpha \quad \text{and} \quad \xi = y - Ax_\alpha. \tag{289}$$


W. Freeden and B. Witte

A representation of the solution $x_\alpha$ is given (see [101, 102]) as follows: Suppose $D(A^*)$ is dense in $Y$, $H_Q \subset D(A)$, and $A$ and $H_Q$ satisfy (268). Suppose $H_Q$, $H_R \,(= A(H_Q))$, and $H_P \subset Y$ all have continuous reproducing kernels. Then, for $y \in H_{R(\alpha)}$, the unique minimizing element $x_\alpha \in H_Q$ of the functional $J_y(x)$ is given by
$$x_\alpha(s) = \langle \eta_s^*, y\rangle_{H_{R(\alpha)}} = \big(QA^*(AQA^* + \alpha P)^{\dagger(Y,Y)}\, y\big)(s), \qquad s \in G, \tag{290}$$
where $\eta_s^* = AQ_s$. We call the (linear) mapping which assigns to each $y \in H_{R(\alpha)}$ the unique minimizing element $x_\alpha$ the regularization operator of the equation $Ax = y$. The most useful situations occur, of course, when $H_R$ is strictly contained in $H_{R(\alpha)}$. For example, $H_R$ may be a dense subset of $Y$ in the $Y$-topology and $H_{R(\alpha)}$ a bigger dense subset. We deal with this situation later. If $H_R^\perp$ (in $Y$) is not empty, then $P$ may be chosen so that the closure of $H_P$ in the $Y$-topology equals $H_R^\perp$ in $Y$. Then $H_P \cap H_R = \{0\}$, $H_{\alpha P}$ and $H_R$ are orthogonal subspaces of $H_{R(\alpha)}$ (see, e.g., [2]), and the decomposition (286) is unique. In this case we have the following theorem, which shows that the regularization operator is indeed a generalized inverse in an appropriate RKHS: If $H_P \cap H_R = \{0\}$, then the minimizing element $x_\alpha$ of (284) is the solution to the problem: Find $x \in L(y)$ to minimize
$$\|x\|_{H_Q}, \tag{291}$$
where
$$L(y) = \Big\{x \in H_Q : \|y - Ax\|_{H_{R(\alpha)}} = \inf_{z \in H_Q} \|y - Az\|_{H_{R(\alpha)}}\Big\}. \tag{292}$$

It should be remarked that, in our approach, we have
$$A(H_Q) = H_R \subset H_{R(\alpha)} \subset Y. \tag{293}$$
Replacing $H_R$ in (264) by $H_{R(\alpha)}$, we get from (267)
$$A^{\dagger(Q,R(\alpha))} y = Q^{1/2}\big[(R + \alpha P)^{-1/2} A Q^{1/2}\big]^{\dagger(X,Y)} (R + \alpha P)^{-1/2}\, y \tag{294}$$
for $y \in D(A^{\dagger(Q,R(\alpha))})$. It is helpful to remember that the topology on $H_R$ is not, in general, the restriction of the topology of $H_{R(\alpha)}$, with the notable exception of the case $H_R \cap H_P = \{0\}$. In [103] a concrete example is provided, arising in the approximate solution of boundary value problems, where $H_R$ is not a closed subspace of $H_{R(\alpha)}$. If $H_R \cap H_P = \{0\}$, then $H_R$ is a closed subspace of $H_{R(\alpha)}$, and we have
$$A^{\dagger(Q,R(\alpha))} = QA^*(R + \alpha P)^{-1}. \tag{295}$$


Note that in this case the generalized inverse and the regularization operator coincide. If $H_R = A(H_Q)$ is not closed in $H_{R(\alpha)}$, then the regularization operator and the generalized inverse are different. Also, the right-hand sides of (294) and (295) are not the same: (295) has maximal domain $H_{R(\alpha)}$, while (294) has maximal domain $H_R \oplus H_R^\perp$ ($\perp$ taken in $H_{R(\alpha)}$).

7.5.2 Rates of Convergence to the Pseudoinverse

We note some properties of $x_\alpha$ as $\alpha \to 0$ when $H_R \subset H_P$. If $y \in H_R = A(H_Q)$, then we have $x_\alpha \to A^{\dagger(Q,R)} y$ as $\alpha \to 0$; here we may say something about the rate of convergence if certain additional conditions are satisfied (compare also with [67]). However, $y$ may not be in the domain of $A^{\dagger(Q,R)}$. This situation can occur if, for example, $H_R$ is dense in $H_{R(1)}$. In this case, $\lim_{\alpha\to 0} \|x_\alpha\|_{H_Q} = \infty$: Assume that $y = Ax_0 + \xi_0$, where $x_0 \in V$, $\xi_0 \in H_P$, and suppose that $H_R \subset H_P$. Then the following properties hold true:

(i) $B = P^{-1/2} R^{1/2}$ is a bounded operator on $Y = L^2(H)$.
(ii) If $\xi_0 = 0$ and $\|(B^*B)^{-1} R^{-1/2}(Ax_0)\|_{L^2(H)} < \infty$, then
$$\|A^{\dagger(Q,R)} y - x_\alpha\|_{H_Q}^2 = O(\alpha^2). \tag{296}$$
(iii) If $\xi_0 = 0$ and $\|(B^*B)^{-1/2} R^{-1/2}(Ax_0)\|_{L^2(H)} < \infty$, then
$$\|A^{\dagger(Q,R)} y - x_\alpha\|_{H_Q}^2 = O(\alpha). \tag{297}$$
(iv) If $\xi_0 \notin H_R$, then $\lim_{\alpha\to 0} \|x_\alpha\|_{H_Q} = \infty$.

Here, the inverses indicated by "$-1$" are the generalized inverses in the topology of the $L^2$-spaces.
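The variational problem (284) and the limit behavior of $x_\alpha$ as $\alpha \to 0$ can be illustrated in a finite-dimensional toy setting. The following sketch replaces the RKHS norms by plain Euclidean norms and uses a random injective matrix as a stand-in for $A$; the matrix sizes, seed, and tolerances are illustrative assumptions, not part of the text.

```python
# Toy finite-dimensional sketch of the regularized pseudosolution (284):
# x_alpha minimizes ||y - A x||^2 + alpha ||x||^2, i.e.
# x_alpha = (A^T A + alpha I)^{-1} A^T y  (Euclidean norms as stand-ins
# for the RKHS norms of the text).
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))          # injective with probability 1
x_true = rng.standard_normal(5)
y = A @ x_true                           # noise-free right-hand side

def x_alpha(alpha):
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)

x_dagger = np.linalg.pinv(A) @ y         # pseudoinverse solution

# As alpha -> 0, the regularized solution converges to the pseudoinverse
# solution (the finite-dimensional analogue of x_alpha -> A^+(Q,R) y).
errs = [np.linalg.norm(x_alpha(a) - x_dagger) for a in (1e-1, 1e-3, 1e-5)]
assert errs[0] > errs[1] > errs[2]
assert errs[2] < 1e-3
```

For a disturbed right-hand side with a component outside the range of $A$ the toy model also reproduces property (iv) qualitatively: the least-squares part is fitted, while the regularized norm grows as $\alpha \to 0$ in the genuinely ill-posed (infinite-dimensional) situation.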

8 Multiscale Regularization Scheme

Next, a compact operator equation is dealt with in terms of regularization methods based on filtering techniques by means of wavelets. In a general setup, a singular-integral approach to regularization is established; decomposition and reconstruction regularization wavelets are introduced in frequency space, which allow the regularization in the form of a multiresolution analysis. Two different types of regularization wavelets are discussed in more detail, namely (non-locally supported) Tikhonov–Phillips regularization wavelets and (bandlimited) truncated singular value decomposition wavelets. Our considerations closely follow W. Freeden, F. Schneider [36] on regularization and multiresolution. More details and other variants can be found in W. Freeden et al. [35]. Let $(X, \langle\cdot,\cdot\rangle_X)$ and $(Y, \langle\cdot,\cdot\rangle_Y)$ be separable real functional Hilbert spaces over domains $G_X$ and $G_Y$, respectively; i.e., $X$ and $Y$ consist of functions


$x : G_X \to \mathbb{R}$ and $y : G_Y \to \mathbb{R}$, respectively. We consider a linear, compact operator $A : X \to Y$ satisfying $N(A) = \{0\}$, $R(A) \subsetneq Y$, $\overline{R(A)} = Y$, so that $A^*A : X \to X$ is self-adjoint. In the sequel, we denote by $\{x_n\}_{n=0,1,\dots}$ a complete orthonormal system in $(X, \langle\cdot,\cdot\rangle_X)$ and by $\{y_n\}_{n=0,1,\dots}$ a complete orthonormal system in $(Y, \langle\cdot,\cdot\rangle_Y)$ such that the singular values $\{\sigma_n\}_{n=0,1,\dots}$ of $A$ satisfy $Ax_n = \sigma_n y_n$, $A^*y_n = \sigma_n x_n$, $n \in \mathbb{N}_0$. Since $A$ is supposed to be injective, it follows that $\sigma_n > 0$ for all $n \in \mathbb{N}_0$. Any $p \in P$, $P \in \{X, Y\}$, can be represented in terms of an orthonormal (Fourier) expansion with respect to $\{p_n\}_{n=0,1,\dots}$:
$$p = \sum_{n=0}^{\infty} p_P^\wedge(n)\, p_n \tag{298}$$
with
$$p_P^\wedge(n) = \langle p, p_n\rangle_P, \tag{299}$$
where the equality in (298) is understood in the $\|\cdot\|_P$-sense. In conclusion, any element of $R(A)$ admits an expansion of the form
$$Ax = \sum_{n=0}^{\infty} \sigma_n\, x_X^\wedge(n)\, y_n. \tag{300}$$

As is well known, the Picard condition tells us that the problem
$$Ax = y, \qquad x \in X,\ y \in Y, \tag{301}$$
has a solution if and only if $y \in Y$ satisfies
$$\sum_{n=0}^{\infty} \big(\sigma_n^{-1}\, y_Y^\wedge(n)\big)^2 < \infty. \tag{302}$$
In this case it is known that the solution of (301) is representable in the form
$$x = \sum_{n=0}^{\infty} \sigma_n^{-1}\, y_Y^\wedge(n)\, x_n = A^\dagger y. \tag{303}$$
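A small numerical illustration of the singular-value expansion (300) and the minimum-norm solution (303): in a finite-dimensional stand-in, dividing the Fourier coefficients of $y$ by the singular values recovers $A^\dagger y$. The matrix sizes and random data below are illustrative assumptions.

```python
# Finite-dimensional sketch of (300)-(303): the minimum-norm solution is
# obtained by dividing the Fourier coefficients of y by the singular values.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))                    # injective a.s.
U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U diag(s) V^T

x_true = rng.standard_normal(4)
y = A @ x_true                                     # y lies in R(A)

# Picard-type reconstruction: x = sum_n s_n^{-1} <y, u_n> v_n, cf. (303)
coeffs = U.T @ y
x_rec = Vt.T @ (coeffs / s)

assert np.allclose(x_rec, np.linalg.pinv(A) @ y)
assert np.allclose(x_rec, x_true)   # A injective here and y in the range
```

When the data carry noise, the division by small $s_n$ amplifies the error — precisely the instability that the regularization schemes of the following subsections address.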

Since the right-hand side $y$ in (301) is affected by errors (due to the inaccuracy of the measuring instruments) in any practical application, the series (303) will not converge in general. For that reason we are interested in regularized solutions. The idea (cf. [36]) we follow is to represent the $J$-level regularization of the problem
$$Ax = y, \qquad x \in X,\ y \in Y, \tag{304}$$

by means of a wavelet analysis. The overall advantage of such a method is that we obtain a $(J{+}1)$-level regularization by starting with the $J$-level regularization and adding so-called detail information. It becomes clear that any classical regularization method based on a filtered singular value decomposition can be reformulated in terms of our wavelet method. Thus, any known parameter choice strategy depending on the special method is also applicable and, moreover, any of the corresponding error estimates holds true, too. For that reason we omit these discussions here. As is well known, a family $\{R_J\}_{J\in\mathbb{Z}}$ of linear operators $R_J : Y \to X$, $J \in \mathbb{Z}$, is a regularization of $A^\dagger$ if it satisfies the following properties: (i) $R_J$ is bounded on $Y$ for all $J \in \mathbb{Z}$; (ii) for any member $y \in R(A)$, the limit relation $\lim_{J\to\infty} R_J y = A^\dagger y$ holds in the $\|\cdot\|_X$-sense. $x_J = R_J y$ is called the $J$-level regularization of problem (301). A function $\Gamma^{P,Q} : G_P \times G_Q \to \mathbb{R}$, $P, Q \in \{X, Y\}$, of the form
$$\Gamma^{P,Q}(x, y) = \sum_{n=0}^{\infty} \Gamma^\wedge(n)\, p_n(x)\, q_n(y), \qquad x \in G_P,\ y \in G_Q,\ \Gamma^\wedge(n) \in \mathbb{R},\ n \in \mathbb{N}_0, \tag{305}$$
is called a $(P, Q)$-(product) kernel. Note that the indices $P$ and $Q$ in $\Gamma^{P,Q}$ are associated to the variables $x$ and $y$, respectively, such that $\Gamma^{P,Q}(x, y) = \Gamma^{Q,P}(y, x)$. The sequence $\{\Gamma^\wedge(n)\}_{n=0,1,\dots}$ is called the symbol of the $(P, Q)$-kernel. For brevity, a $(P, P)$-kernel is simply said to be a $P$-kernel. An important question for our investigations is as follows: fixing one variable of a product kernel, what are the conditions for the product kernel (as a function of the remaining variable) to be of class $P \in \{X, Y\}$? The answer is provided by the concept of symbol admissibility. A symbol $\{\Gamma^\wedge(n)\}_{n=0,1,\dots}$ is called $P$-admissible, $P \in \{X, Y\}$, if it satisfies the following conditions:
(Ai) $\displaystyle\sum_{n=0}^{\infty} \big(\Gamma^\wedge(n)\big)^2 < \infty$, (306)
(Aii) $\displaystyle\sup_{x\in G_P} \sum_{n=0}^{\infty} \big(\Gamma^\wedge(n)\, p_n(x)\big)^2 < \infty$. (307)


A symbol $\{\Gamma^\wedge(n)\}_{n=0,1,\dots}$ is called $(P, Q)$-admissible, $P, Q \in \{X, Y\}$ (or simply admissible), if it is $P$-admissible as well as $Q$-admissible. From the definition of admissibility we immediately obtain the following result. Let $\{\Gamma^\wedge(n)\}_{n=0,1,\dots}$ be the symbol of a $(P, Q)$-kernel, $P, Q \in \{X, Y\}$:
($\alpha$) If $\{\Gamma^\wedge(n)\}_{n=0,1,\dots}$ is $P$-admissible, then $\Gamma^{P,Q}(x, \cdot) \in Q$ for every (fixed) $x \in G_P$.
($\beta$) If $\{\Gamma^\wedge(n)\}_{n=0,1,\dots}$ is $Q$-admissible, then $\Gamma^{P,Q}(\cdot, y) \in P$ for every (fixed) $y \in G_Q$.

A fundamental tool for our wavelet theory is the concept of a convolution, which we introduce below: Let $\Gamma^{X,X}$ be an $X$-kernel with $X$-admissible symbol, and suppose that $x$ is of class $X$. Then we understand the convolution of $\Gamma^{X,X}$ and $x$ to be the function given by
$$(\Gamma^{X,X} * x)(t) = \big\langle \Gamma^{X,X}(t, \cdot),\, x\big\rangle_X = \sum_{n=0}^{\infty} \Gamma^\wedge(n)\, x_X^\wedge(n)\, x_n(t), \qquad t \in G_X. \tag{308}$$

We immediately see that $(\Gamma^{X,X} * x)_X^\wedge(n) = \Gamma^\wedge(n)\, x_X^\wedge(n)$ and $\Gamma^{X,X} * x \in X$. In an analogous way we define the convolution of an $(X, Y)$-kernel $\Gamma^{X,Y}$ having an $X$-admissible symbol with a function $y \in Y$ to be the expression
$$(\Gamma^{X,Y} * y)(t) = \big\langle \Gamma^{X,Y}(t, \cdot),\, y\big\rangle_Y = \sum_{n=0}^{\infty} \Gamma^\wedge(n)\, y_Y^\wedge(n)\, x_n(t), \qquad t \in G_X, \tag{309}$$

and it follows that $\Gamma^{X,Y} * y \in X$. Lastly, the convolution of an $(X, Y)$-kernel $\Gamma^{X,Y}$ having a $Y$-admissible symbol with a function $x \in X$ is given by
$$(\Gamma^{X,Y} * x)(s) = \big\langle \Gamma^{X,Y}(\cdot, s),\, x\big\rangle_X = \sum_{n=0}^{\infty} \Gamma^\wedge(n)\, x_X^\wedge(n)\, y_n(s), \qquad s \in G_Y, \tag{310}$$

and we have $\Gamma^{X,Y} * x \in Y$. Next we proceed with the convolution of two product kernels, leading to the following result: Let $\Gamma^{X,X}$ be an $X$-kernel with $X$-admissible symbol and let $\Upsilon^{X,Y}$ be an $(X, Y)$-kernel with $Y$-admissible symbol. Then
$$(\Gamma * \Upsilon)^{X,Y}(t, s) = \big(\Gamma^{X,X} * \Upsilon^{X,Y}(\cdot, s)\big)(t) = \big\langle \Gamma^{X,X}(t, \cdot),\, \Upsilon^{X,Y}(\cdot, s)\big\rangle_X = \sum_{n=0}^{\infty} \Gamma^\wedge(n)\, \Upsilon^\wedge(n)\, x_n(t)\, y_n(s), \qquad t \in G_X,\ s \in G_Y, \tag{311}$$
represents an $(X, Y)$-kernel with admissible symbol
$$\big((\Gamma * \Upsilon)^{X,Y}\big)^\wedge(n) = \Gamma^\wedge(n)\, \Upsilon^\wedge(n). \tag{312}$$
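The symbol calculus (308)–(312) — convolution of product kernels amounts to multiplication of their symbols — can be checked numerically in a finite truncation. The orthonormal cosine system, truncation degree, and symbols below are illustrative stand-ins, not from the text.

```python
# Finite truncation of (311)-(312): convolving two product kernels
# multiplies their symbols.  Cosine system orthonormal in L^2(0,1).
import numpy as np

M, N = 400, 8                    # quadrature points, truncation degree
t = np.linspace(0.0, 1.0, M)
w = np.full(M, 1.0 / (M - 1))    # trapezoid weights
w[0] *= 0.5
w[-1] *= 0.5

def phi(n):                       # L^2(0,1)-orthonormal cosine system
    return np.ones(M) if n == 0 else np.sqrt(2.0) * np.cos(np.pi * n * t)

B = np.stack([phi(n) for n in range(N)])          # (N, M)
gamma = 0.5 ** np.arange(N)                        # symbol of Gamma^{X,X}
ups = 0.8 ** np.arange(N)                          # symbol of Upsilon^{X,Y}

K1 = (B * gamma[:, None]).T @ B                    # sum_n gamma_n phi_n(t) phi_n(r)
K2 = (B * ups[:, None]).T @ B

# kernel convolution (K1 * K2)(t,s) = int K1(t,r) K2(r,s) dr, cf. (311)
K12 = K1 @ (w[:, None] * K2)
K_expected = (B * (gamma * ups)[:, None]).T @ B    # symbol product, cf. (312)

assert float(np.max(np.abs(K12 - K_expected))) < 1e-6
```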

In order to prepare the fundamentals of the forthcoming wavelet theory, we are now interested in countable families $\{\Gamma_J^{P,Q}\}$, $J \in \mathbb{Z}$, of product kernels $\Gamma_J^{P,Q}$, $P, Q \in \{X, Y\}$. With our notation we are able to define a dilation operator acting on these families in the following way: let $\Gamma_J^{P,Q}$ be a member of the family of product kernels. Then the dilation operator $D_K$, $K \in \mathbb{Z}$, is defined by $D_K \Gamma_J^{P,Q} = \Gamma_{J+K}^{P,Q}$. In particular, we obtain $\Gamma_J^{P,Q} = D_J \Gamma_0^{P,Q}$, $J \in \mathbb{Z}$; thus we refer to $\Gamma_0^{P,Q}$ as a "mother kernel". Moreover, we define a shifting operator $S_t^P$, $t \in G_P$, $P \in \{X, Y\}$, by $S_t^P \Gamma_J^{P,Q} = \Gamma_J^{P,Q}(t, \cdot)$, $t \in G_P$, $J \in \mathbb{Z}$, resp. $S_s^Q \Gamma_J^{P,Q} = \Gamma_J^{P,Q}(\cdot, s)$, $s \in G_Q$, $J \in \mathbb{Z}$. In doing so we consequently get $\Gamma_J^{P,Q}(t, \cdot) = S_t^P D_J \Gamma_0^{P,Q}$, $t \in G_P$, $J \in \mathbb{Z}$, resp. $\Gamma_J^{P,Q}(\cdot, s) = S_s^Q D_J \Gamma_0^{P,Q}$, $s \in G_Q$, $J \in \mathbb{Z}$.

8.1 Regularization Scaling Functions and Wavelets

Next we are concerned with a wavelet-based regularization technique for problem (304): Let $\{(\Phi_J)^\wedge(n)\}_{n=0,1,\dots}$, $J \in \mathbb{Z}$, be an $(X, Y)$-admissible symbol of a family of product kernels which additionally satisfies the following conditions:
(i) $\lim_{J\to\infty} \sigma_n \big((\Phi_J)^\wedge(n)\big)^2 = 1$, $n \in \mathbb{N}$,
(ii) $\big((\Phi_{J+1})^\wedge(n)\big)^2 \ge \big((\Phi_J)^\wedge(n)\big)^2$, $J \in \mathbb{Z}$, $n \in \mathbb{N}$,
(iii) $\lim_{J\to-\infty} \big((\Phi_J)^\wedge(n)\big)^2 = 0$, $n \in \mathbb{N}$,
(iv) $\sigma_0 \big((\Phi_J)^\wedge(0)\big)^2 = 1$, $J \in \mathbb{Z}$.

Then $\{(\Phi_J)^\wedge(n)\}_{n=0,1,\dots}$ is said to be the generating symbol of a regularization scaling function (with respect to (301)). The $(X, Y)$-kernel
$${}^d\Phi_J^{X,Y} = \sum_{n=0}^{\infty} (\Phi_J)^\wedge(n)\, x_n\, y_n \tag{313}$$
is called a decomposition regularization scaling function; the $(X, X)$-kernel
$${}^r\Phi_J^{X,X} = \sum_{n=0}^{\infty} (\Phi_J)^\wedge(n)\, x_n\, x_n \tag{314}$$
is called a reconstruction regularization scaling function. From the results of the previous section it is clear that ${}^d\Phi_J^{X,Y}(t, \cdot) \in Y$, $t \in G_X$, $J \in \mathbb{Z}$, that ${}^r\Phi_J^{X,X}(t, \cdot) \in X$, $t \in G_X$, $J \in \mathbb{Z}$, and that $({}^r\Phi_J * {}^d\Phi_J)^{X,Y}$ is an $(X, Y)$-kernel


with $(X, Y)$-admissible symbol $\{((\Phi_J)^\wedge(n))^2\}_{n=0,1,\dots}$. Observing these properties, we are able to verify the following result, which is central for our considerations: Let $\{(\Phi_J)^\wedge(n)\}_{n=0,1,\dots}$, $J \in \mathbb{Z}$, be the generating symbol of a regularization scaling function. Then, for any $y \in Y$,
$$x_J = {}^r\Phi_J^{X,X} * \big({}^d\Phi_J^{X,Y} * y\big) \tag{315}$$
represents the $J$-level regularization of problem (304). If, in addition, $y \in R(A)$, then
$$\lim_{J\to\infty} \|x_J - A^\dagger y\| = 0. \tag{316}$$
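Working in the singular basis, the $J$-level regularization (315) acts on Fourier coefficients as multiplication by $((\Phi_J)^\wedge(n))^2$. The following spectral sketch uses the truncated-SVD symbol of Sect. 8.2 (Eqs. (338)–(339)) with toy singular values; all numbers are illustrative assumptions.

```python
# Spectral sketch of (315)-(316): coefficients of R_J y are
# ((Phi_J)^(n))^2 y^(n); for the TSVD symbol this converges to A^+ y.
import numpy as np

N = 64
sigma = 1.0 / (1.0 + np.arange(N)) ** 2        # toy singular values
x_coeff = 1.0 / (1.0 + np.arange(N)) ** 2      # coefficients of x = A^+ y
y_coeff = sigma * x_coeff                      # y^(n) = sigma_n x^(n), cf. (300)

def x_J(J):
    NJ = 0 if J < 0 else 2 ** J - 1            # bandwidth, cf. (339)
    phi2 = np.where(np.arange(N) <= NJ, 1.0 / sigma, 0.0)   # ((Phi_J)^)^2
    return phi2 * y_coeff                      # coefficients of R_J y

errs = [np.linalg.norm(x_J(J) - x_coeff) for J in (1, 3, 5)]
assert errs[0] > errs[1] > errs[2]             # monotone improvement in J
assert np.allclose(x_J(7), x_coeff)            # 2^7 - 1 >= N - 1: full recovery
```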

For $R_J : Y \to X$ defined by
$$R_J y = \big({}^r\Phi_J * {}^d\Phi_J\big)^{X,Y} * y, \qquad y \in Y, \tag{317}$$
we have (see [36])
$$R_J y = \sum_{n=0}^{\infty} \big((\Phi_J)^\wedge(n)\big)^2\, y_Y^\wedge(n)\, x_n \quad\text{and}\quad \|R_J\|^2 \le \sum_{n=0}^{\infty} \big((\Phi_J)^\wedge(n)\big)^4, \qquad J \in \mathbb{Z}. \tag{318}$$

As an immediate consequence we obtain the following result: Let $y$ be a member of $R(A)$. Suppose that $y^\varepsilon \in Y$ denotes the right-hand side of problem (304) with noise level $\|y - y^\varepsilon\| \le \varepsilon$. Moreover, assume that $\{(\Phi_J)^\wedge(n)\}_{n=0,1,\dots}$, $J \in \mathbb{Z}$, is the generating symbol of a regularization scaling function, where the parameter $J = J(\varepsilon)$ is assumed to satisfy (1) $\lim_{\varepsilon\to 0} J(\varepsilon) = \infty$ and (2) $\lim_{\varepsilon\to 0} \varepsilon\,\big((\Phi_{J(\varepsilon)})^\wedge(n)\big)^2 = 0$. Then we have
$$\lim_{\varepsilon\to 0} \big\|\big({}^r\Phi_{J(\varepsilon)} * {}^d\Phi_{J(\varepsilon)}\big)^{X,Y} * y^\varepsilon - A^\dagger y\big\| = 0. \tag{319}$$

Condition (iii) seems to be unnecessary for the proof of (319) and, in fact, it is. Nevertheless, in what follows we need this assumption for our multiresolution analysis and the (spectral) introduction of wavelets.


For any $y \in R(A)$, each $({}^r\Phi_J * {}^d\Phi_J)^{X,Y} * y$ provides a regularization of the solution $A^\dagger y$ at scale $J$ by "smoothing" the Fourier coefficients of $A^\dagger y$ with the symbol $\{((\Phi_J)^\wedge(n))^2\}_{n=0,1,\dots}$. In terms of filtering, $({}^r\Phi_J * {}^d\Phi_J)^{X,Y}$ may be interpreted as a low-pass filter. Accordingly, we understand the scale spaces $V_J$ to be the image of $R(A)$ under the operator $R_J$:
$$V_J = R_J(R(A)) = \big\{({}^r\Phi_J * {}^d\Phi_J)^{X,Y} * y : y \in R(A)\big\}. \tag{320}$$

This leads to the following statement: The scale spaces satisfy the properties
(i) $\bigcap_{J=-\infty}^{\infty} V_J = \operatorname{span}\{x_0\}$,
(ii) $\operatorname{span}\{x_0\} \subset V_J \subset V_{J'} \subset X$, $J \le J'$; i.e., for any right-hand side $y \in R(A)$ of problem (304), all $J$-level regularizations with fixed parameter $J$ are sampled in a scale space $V_J$ with the above property,
(iii) $\overline{\bigcup_{J=-\infty}^{\infty} V_J}^{\,\|\cdot\|_X} = X$,
(iv) if $x_J \in V_J$, then $D_{-1} x_J \in V_{J-1}$, $J \in \mathbb{Z}$.
If a collection of subspaces of $X$ satisfies the above conditions, we call it a regularization multiresolution analysis (RMRA). The definition of the regularization scaling function allows us to introduce regularization wavelets. An essential point is the definition of a decomposition and a reconstruction regularization wavelet associated to regularization mother wavelets. This definition, of course, has to be formulated in close relation to a prescribed regularization scaling function. Let $\{(\Phi_j)^\wedge(n)\}_{n=0,1,\dots}$, $j \in \mathbb{Z}$, be the generating symbol of a regularization scaling function. Then the $(X, Y)$-admissible generating symbol $\{(\Psi_j)^\wedge(n)\}_{n=0,1,\dots}$, $j \in \mathbb{Z}$, and the $(X, X)$-admissible generating symbol $\{(\tilde\Psi_j)^\wedge(n)\}_{n=0,1,\dots}$, $j \in \mathbb{Z}$, respectively, are defined by the "scaling equation"
$$(\tilde\Psi_j)^\wedge(n)\,(\Psi_j)^\wedge(n) = \big((\Phi_{j+1})^\wedge(n)\big)^2 - \big((\Phi_j)^\wedge(n)\big)^2. \tag{321}$$

Correspondingly, the $(X, Y)$-kernel
$${}^d\Psi_J^{X,Y} = \sum_{n=0}^{\infty} (\Psi_J)^\wedge(n)\, x_n\, y_n \tag{322}$$
is called the decomposition regularization wavelet, while the $(X, X)$-kernel
$${}^r\tilde\Psi_J^{X,X} = \sum_{n=0}^{\infty} (\tilde\Psi_J)^\wedge(n)\, x_n\, x_n \tag{323}$$
is called the reconstruction regularization wavelet. The corresponding regularization mother wavelets are denoted by ${}^d\Psi_0^{X,Y}$ and ${}^r\tilde\Psi_0^{X,X}$, respectively.


Using this notation, any decomposition regularization wavelet, respectively any reconstruction regularization wavelet, can be interpreted as a dilated and shifted copy of the corresponding mother wavelet:
$${}^d\Psi_J^{X,Y}(t, \cdot) = S_t^X D_J\, {}^d\Psi_0^{X,Y}, \tag{324}$$
$${}^r\tilde\Psi_J^{X,X}(t, \cdot) = S_t^X D_J\, {}^r\tilde\Psi_0^{X,X}. \tag{325}$$
At this stage it becomes obvious why we required Condition (iv). A consequence of (321) is a vanishing $0$th moment of the regularization wavelets. Moreover, from (321) it can readily be deduced that
$$\big((\Phi_{J+1})^\wedge(n)\big)^2 = \sum_{j=-\infty}^{J} (\tilde\Psi_j)^\wedge(n)\,(\Psi_j)^\wedge(n) = \big((\Phi_0)^\wedge(n)\big)^2 + \sum_{j=0}^{J} (\tilde\Psi_j)^\wedge(n)\,(\Psi_j)^\wedge(n). \tag{326}$$

Thus, in connection with (321), we easily see that
$$\big({}^r\Phi_{J+1} * {}^d\Phi_{J+1}\big)^{X,Y} = \sum_{j=-\infty}^{J} \big({}^r\tilde\Psi_j * {}^d\Psi_j\big)^{X,Y} = \big({}^r\Phi_0 * {}^d\Phi_0\big)^{X,Y} + \sum_{j=0}^{J} \big({}^r\tilde\Psi_j * {}^d\Psi_j\big)^{X,Y}. \tag{327}$$
In analogy to the definition of the operator $R_J$, we now consider convolution operators $S_J : Y \to X$, $J \in \mathbb{Z}$, defined by
$$S_J y = \big({}^r\tilde\Psi_J * {}^d\Psi_J\big)^{X,Y} * y. \tag{328}$$

$S_J y$ describes the "detail information" of the right-hand side $y$ at scale $J$. From Eq. (327) it follows that the operator $R_{J+1}$ can be decomposed in the following way:
$$R_{J+1} = R_0 + \sum_{j=0}^{J} S_j. \tag{329}$$
This gives rise to introducing the detail spaces as follows:
$$W_J = S_J(R(A)) = \big\{({}^r\tilde\Psi_J * {}^d\Psi_J)^{X,Y} * y : y \in R(A)\big\}. \tag{330}$$

The space $W_J$ contains the "detail information" needed to go from a regularization at level $J$ to a regularization at level $J+1$. Note that
$$V_J = V_{J-1} + W_{J-1}, \tag{331}$$
$$\sum_{j=-\infty}^{J} W_j = V_0 + \sum_{j=0}^{J} W_j = V_{J+1}. \tag{332}$$

It is worth mentioning that, in general, the sum decomposition is neither direct nor orthogonal. Two examples leading to orthogonal and non-orthogonal multiresolutions are introduced in the next section. Any right-hand side $y \in Y$ can now be decomposed as follows: starting with $R_0 y$, we find
$$R_{J+1}\, y = R_0\, y + \sum_{j=0}^{J} S_j\, y \tag{333}$$
for any $J \in \mathbb{Z}$. In other words, the detail information $S_J y$ is nothing else than the "difference of two smoothings" at two consecutive scales:
$$S_J\, y = R_{J+1}\, y - R_J\, y. \tag{334}$$
For what follows, we define the regularization wavelet transform at scale $J \in \mathbb{Z}$ and position $t \in G_X$ by letting
$$RWT(y)(J; t) = \big\langle {}^d\Psi_J^{X,Y}(t, \cdot),\, y\big\rangle_Y, \qquad y \in Y. \tag{335}$$

From (327) it is not hard to verify the main result in this context, which is now formulated in more detail: Let $\{(\Phi_j)^\wedge(n)\}_{n=0,1,\dots}$, $j \in \mathbb{Z}$, be the generating symbol of a regularization scaling function. Suppose that $\{(\Psi_j)^\wedge(n)\}_{n=0,1,\dots}$, $j \in \mathbb{Z}$, and $\{(\tilde\Psi_j)^\wedge(n)\}_{n=0,1,\dots}$, $j \in \mathbb{Z}$, are the generating symbols of the corresponding regularization wavelets. Furthermore, let $y$ be of class $Y$. Then
$$x_J = \big({}^r\Phi_0 * {}^d\Phi_0\big)^{X,Y} * y + \sum_{j=0}^{J} {}^r\tilde\Psi_j * RWT(y)(j; \cdot) \tag{336}$$
denotes the $J$-level regularization of problem (304), satisfying
$$\lim_{J\to\infty} \|x_J - A^\dagger y\| = 0 \tag{337}$$
provided that $y \in R(A)$. Equation (336) shows the essential characteristic of regularization wavelets: by adding the so-called detail information of level $J$, i.e., the difference of the two smoothings at the consecutive scales $J+1$ and $J$, we pass from the regularized solution $x_J$ to $x_{J+1}$, thereby satisfying $\lim_{J\to\infty} x_J = A^\dagger y$ in the case that $y \in R(A)$. Of course, this can be understood as a kind of permanence principle.
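The telescoping structure behind (336) — coarse approximation plus accumulated detail equals the next smoothing — can be verified spectrally. The Tikhonov-type symbol and the choice $\gamma_j = 2^{-j}$ below are illustrative assumptions used only to instantiate the scaling equation (321).

```python
# Spectral sketch of the wavelet reconstruction (336)/(329): starting from
# the coarse level and adding detail symbols (Phi_{j+1}^2 - Phi_j^2)
# reproduces the (J+1)-level regularization exactly.
import numpy as np

N, J = 32, 5
n = np.arange(N)
sigma = 1.0 / (1.0 + n)                    # toy singular values
y_hat = sigma * (1.0 / (1.0 + n) ** 2)     # y = A x with x^(n) = (1+n)^{-2}

def phi2(j):                               # Tikhonov-type symbol squared,
    return sigma / (sigma ** 2 + 4.0 ** (-j))   # gamma_j = 2^{-j} (assumed)

xJ_direct = phi2(J + 1) * y_hat            # coefficients of R_{J+1} y, cf. (318)
xJ_wavelet = phi2(0) * y_hat + sum(
    (phi2(j + 1) - phi2(j)) * y_hat for j in range(J + 1)
)                                          # coarse part + detail sums, cf. (336)

assert np.allclose(xJ_direct, xJ_wavelet)
```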


Fig. 2 Multiresolution regularization scheme:
$$R_0(y),\ R_1(y),\ R_2(y),\ \dots \ \longrightarrow\ A^\dagger y \qquad (J \to \infty),$$
$$V_0 \subset V_1 \subset V_2 \subset \cdots, \qquad V_{J+1} = V_0 + W_0 + W_1 + \cdots + W_J,$$
$$R_{J+1}(y) = R_0(y) + S_0(y) + S_1(y) + \cdots + S_J(y)\ \longrightarrow\ A^\dagger y.$$

8.2 Multiscale Scheme and Regularizers

The multiscale method as presented here can be illustrated by the scheme of Fig. 2. The singular values $\{\sigma_n\}_{n=0,1,\dots}$ of $A$ satisfy $Ax_n = \sigma_n y_n$, $A^*y_n = \sigma_n x_n$, $n \in \mathbb{N}_0$. Keeping these facts in mind, we are led to introduce the following examples of generating symbols of a regularization scaling function:
(i) Truncated singular value decomposition (bandlimited regularization).
(a) orthogonal:
$$(\Phi_J)^\wedge(n) = \begin{cases} \sigma_n^{-1/2} & \text{for } n = 0, \dots, N_J, \\ 0 & \text{for } n \ge N_J + 1, \end{cases} \tag{338}$$
$$N_J = \begin{cases} 0 & \text{for } J \in \mathbb{Z},\ J < 0, \\ 2^J - 1 & \text{for } J \in \mathbb{Z},\ J \ge 0, \end{cases} \tag{339}$$
(b) non-orthogonal:
$$(\Phi_J)^\wedge(n) = \begin{cases} \sigma_n^{-1/2} & \text{for } n = 0, \dots, M_J, \\ \sigma_n^{-1/2}\,(\tau_J(n))^{1/2} & \text{for } n = M_J + 1, \dots, N_J, \\ 0 & \text{for } n \ge N_J + 1, \end{cases} \tag{340}$$
with
$$N_J = \begin{cases} 0 & \text{for } J \in \mathbb{Z},\ J < 0, \\ 2^{J+1} - 1 & \text{for } J \in \mathbb{Z},\ J \ge 0, \end{cases} \tag{341}$$
$$M_J = \begin{cases} 0 & \text{for } J \in \mathbb{Z},\ J < 0, \\ 2^J - 1 & \text{for } J \in \mathbb{Z},\ J \ge 0, \end{cases} \tag{342}$$
and
$$\tau_J(n) = 2 - 2^{-J}(n + 1), \qquad n \in [2^J - 1,\ 2^{J+1} - 1],\ J \in \mathbb{N}_0. \tag{343}$$

It is easy to see that case (a) leads to an orthogonal RMRA, i.e., the detail and the scale spaces satisfy the orthogonality conditions
$$V_{J+1} = V_J \oplus W_J, \qquad W_J \perp W_K, \quad K \ne J,\ K, J \ge 0. \tag{344}$$
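The bandlimited symbols (338)–(343) can be implemented directly and checked against the monotonicity condition (ii) of a regularization scaling function. The toy singular values below are an illustrative assumption.

```python
# Implementation of the TSVD symbols (338)-(339) and the smoothed
# non-orthogonal variant (340)-(343); both satisfy the monotonicity
# condition (Phi_{J+1})^2 >= (Phi_J)^2 required in Sect. 8.1.
import numpy as np

N = 64
n = np.arange(N)
sigma = 1.0 / (1.0 + n) ** 2              # toy singular values (assumed)

def phi_orth(J):                          # orthogonal symbol (338)-(339)
    NJ = 0 if J < 0 else 2 ** J - 1
    return np.where(n <= NJ, sigma ** (-0.5), 0.0)

def phi_cp(J):                            # non-orthogonal symbol (340)-(343)
    if J < 0:
        return np.where(n <= 0, sigma ** (-0.5), 0.0)
    NJ, MJ = 2 ** (J + 1) - 1, 2 ** J - 1
    tau = 2.0 - 2.0 ** (-J) * (n + 1)     # linear taper on the transition band
    out = np.zeros(N)
    out[n <= MJ] = sigma[n <= MJ] ** (-0.5)
    mid = (n > MJ) & (n <= NJ)
    out[mid] = sigma[mid] ** (-0.5) * np.sqrt(np.clip(tau[mid], 0.0, 1.0))
    return out

for J in range(1, 5):
    assert np.all(phi_orth(J + 1) ** 2 >= phi_orth(J) ** 2)
    assert np.all(phi_cp(J + 1) ** 2 >= phi_cp(J) ** 2)
```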

In case (b) the scale and detail spaces are still finite-dimensional, but the detail spaces are no longer orthogonal.
(ii) Tikhonov regularization (non-bandlimited regularization).
(a) classical:
$$(\Phi_J)^\wedge(n) = \left(\frac{\sigma_n}{\sigma_n^2 + \gamma_J^2}\right)^{1/2}, \qquad n \in \mathbb{N},\ J \in \mathbb{Z}, \tag{345}$$
(b) Tikhonov–Phillips:
$$(\Phi_J)^\wedge(n) = \left(\frac{\sigma_n}{\sigma_n^2 + \gamma_J^2\,\big(n + \tfrac{1}{4}\big)^4}\right)^{1/2}, \qquad n \in \mathbb{N},\ J \in \mathbb{Z}, \tag{346}$$
with $\{\gamma_J\}$, $J \in \mathbb{Z}$, being a sequence of real numbers satisfying $\lim_{J\to\infty} \gamma_J = 0$ and $\lim_{J\to-\infty} \gamma_J = \infty$. Case (a) leads to the minimization of the Tikhonov functional, where the penalty term is given by the norm of the regularized solution of problem (304). Case (b) leads to the minimization of the Tikhonov–Phillips functional, where the penalty term is given by the linearized spherical bending energy of the regularized solution of problem (304). Both cases lead to infinite-dimensional scale and detail spaces, and, furthermore, the RMRA is neither direct nor orthogonal.

9 Iterative Methods as Regularization Schemes

From linear algebra we borrow the idea of using iterative solvers for the equation $Ax = y$, $y \in Y$, in the following way: Suppose that the matrix $A \in \mathbb{K}^{n\times n}$ is written in the form $A = B - C$ with an invertible $B$. Then we are able to write $Ax = y$ in the equivalent form
$$Ax = y \iff (B - C)\,x = y \iff Bx = Cx + y \iff x = B^{-1}Cx + B^{-1}y. \tag{347}$$
Let us base the determination of a (uniquely determined) fixed point on the following iteration:
$$x^{(0)} \in \mathbb{K}^n \ \text{arbitrary}, \qquad Bx^{(n+1)} = Cx^{(n)} + y, \quad n \in \mathbb{N}_0.$$

Clearly, if $(x^{(n)})$ converges to $x^* \in \mathbb{K}^n$, then it follows that $Ax^* = y$. In numerical linear algebra the following procedures are convenient, which we briefly recapitulate here: Let $A \in \mathbb{R}^{n\times n}$ be given, and decompose $A$ in the form $A = L + D + U$, where
$$L = \begin{pmatrix} 0 & & 0 \\ & \ddots & \\ * & & 0 \end{pmatrix}, \qquad D = \begin{pmatrix} a_{11} & & 0 \\ & \ddots & \\ 0 & & a_{nn} \end{pmatrix}, \qquad U = \begin{pmatrix} 0 & & * \\ & \ddots & \\ 0 & & 0 \end{pmatrix}, \tag{348}$$
i.e., $L$ is the strictly lower triangular part of $A$, $D$ its diagonal part, and $U$ its strictly upper triangular part.

We distinguish the following concepts:
(1) (Jacobi method) We let
$$B = D, \qquad C = -L - U. \tag{349}$$
Then we are led to
$$x^{(n+1)} = -D^{-1}(L + U)\,x^{(n)} + D^{-1}y, \qquad n \in \mathbb{N}. \tag{350}$$
(2) (Gauß–Seidel method) We choose
$$B = D + L, \qquad C = -U. \tag{351}$$
Then we are led to
$$x^{(n+1)} = -(D + L)^{-1}U\,x^{(n)} + (D + L)^{-1}y, \qquad n \in \mathbb{N}. \tag{352}$$

(3) (Richardson method) For $\omega > 0$, let
$$B = \frac{1}{\omega}I, \qquad C = \frac{1}{\omega}I - A. \tag{353}$$
Then we are led to the recursion
$$x^{(n+1)} = \Big(\frac{1}{\omega}I\Big)^{-1}\Big[\Big(\frac{1}{\omega}I - A\Big)x^{(n)} + y\Big] = \omega\Big(\frac{1}{\omega}I - A\Big)x^{(n)} + \omega y = (I - \omega A)\,x^{(n)} + \omega y, \qquad n \in \mathbb{N}. \tag{354}$$
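The three splittings (349)–(354) can be run side by side on a small diagonally dominant system, where all of them converge to the exact solution. The concrete matrix and iteration counts below are illustrative assumptions.

```python
# Jacobi, Gauss-Seidel, and Richardson iterations for A x = y with the
# splitting A = B - C, iterated as B x^{(n+1)} = C x^{(n)} + y, cf. (347).
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])     # symmetric, strictly diagonally dominant
y = np.array([1.0, 2.0, 3.0])
x_exact = np.linalg.solve(A, y)

D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)

def iterate(B, C, steps=200):
    x = np.zeros(3)
    for _ in range(steps):
        x = np.linalg.solve(B, C @ x + y)   # B x^{(n+1)} = C x^{(n)} + y
    return x

x_jacobi = iterate(D, -(L + U))             # (349)-(350)
x_gs = iterate(D + L, -U)                   # (351)-(352)
omega = 1.0 / np.linalg.norm(A, 2)          # 0 < omega small enough
x_rich = iterate(np.eye(3) / omega, np.eye(3) / omega - A)  # (353)-(354)

for x in (x_jacobi, x_gs, x_rich):
    assert np.allclose(x, x_exact)
```

Only the Richardson splitting uses no triangular part of $A$, which is exactly why it is the one that survives the transition to operator equations described next.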


Since iteration methods known from linear algebra that use the upper or lower triangular part of a matrix cannot be adequately transferred to operator equations, we are not able to use the first two of the aforementioned iteration procedures. However, the third method (i.e., Richardson method or successive relaxation method) can be applied leading to a variant called Landweber iteration.

9.1 Landweber Iteration

In order to solve the normal equation
$$A^*Ax = A^*y, \qquad y \in Y,\ x \in X, \tag{355}$$
the Landweber iteration starting from the initial value $x^{(0)} \in X$ is defined by
$$x^{(m+1)} = \big(I - \omega A^*A\big)\,x^{(m)} + \omega A^* y = x^{(m)} + \omega A^*\big(y - Ax^{(m)}\big), \qquad m = 0, 1, \dots \tag{356}$$
For simplicity, we introduce the family of operators $\{R_m\}_{m\in\mathbb{N}} \subset \mathcal{L}(Y, X)$ by $R_m y = x^{(m)}$: Let $0 < \omega < 2/\|A\|^2$ and $A \in \mathcal{L}(X, Y)$. Then, for $x^{(0)} \in X$, we have
$$\lim_{m\to\infty} R_m y = \begin{cases} A^\dagger y + P_{N(A)}\,x^{(0)}, & y \in D(A^\dagger), \\ \infty, & y \notin D(A^\dagger). \end{cases} \tag{357}$$
The Landweber iteration is characterized by
$$x^{(m)} = R_m y = F_m(A^*A)\,A^* y + \big(I - \omega A^*A\big)^m x^{(0)}, \tag{358}$$
where
$$F_m(\lambda) = \omega \sum_{j=0}^{m-1} (1 - \omega\lambda)^j = \frac{1 - (1 - \omega\lambda)^m}{\lambda}. \tag{359}$$
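The equivalence between the iterative form (356) and the filter representation (358)–(359) (for $x^{(0)} = 0$) is easy to verify numerically. The toy matrix and iteration count below are illustrative assumptions.

```python
# Landweber iteration (356) versus its filter representation (358)-(359)
# with x^{(0)} = 0: both give x^{(m)} = F_m(A^T A) A^T y.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((7, 4))
y = A @ rng.standard_normal(4)
omega = 1.0 / np.linalg.norm(A, 2) ** 2      # 0 < omega < 2 / ||A||^2

m = 50
x = np.zeros(4)
for _ in range(m):
    x = x + omega * A.T @ (y - A @ x)        # iterative form (356)

lam, V = np.linalg.eigh(A.T @ A)             # spectral decomposition
def x_filter(m):
    Fm = (1.0 - (1.0 - omega * lam) ** m) / lam   # filter (359)
    return V @ (Fm * (V.T @ (A.T @ y)))

assert np.allclose(x, x_filter(m))

# The approximation error decreases monotonically in m:
x_dagger = np.linalg.pinv(A) @ y
assert np.linalg.norm(x_filter(200) - x_dagger) < np.linalg.norm(x_filter(20) - x_dagger)
```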

Thus, the Landweber iteration represents a regularization with a filter that cannot be used as such if we do not know the singular values of our problem. The polynomial $\lambda \mapsto p_m(\lambda) = 1 - \lambda F_m(\lambda) = (1 - \omega\lambda)^m$ is called the residual polynomial.
• For $x^{(0)} = 0$ it follows that $A^\dagger y$ is the solution of the normal equation
$$A^*Ax = A^*y \tag{360}$$
with minimal norm (as usual). This result is also obtained for $x^{(0)} \in N(A)^\perp$, since then $P_{N(A)}\,x^{(0)} = 0$.
• For $x^{(0)} \ne 0$ we see that $A^\dagger y + P_{N(A)}\,x^{(0)}$ is the solution of the normal equation
$$A^*Ax = A^*y \tag{361}$$
with minimal distance to $x^{(0)}$, i.e., the $x^{(0)}$-minimum norm solution:
$$\big\|A^\dagger y + P_{N(A)}\,x^{(0)} - x^{(0)}\big\| = \min\big\{\|x - x^{(0)}\| : A^*Ax = A^*y\big\}. \tag{362}$$

Obviously, the choice of the initial value of the Landweber iteration for inverse problems is of great importance for the solution (see, e.g., A. Rieder [116]). We investigate the same choice for a noiseless and a noisy right-hand side. Let $A \in \mathcal{L}(X, Y)$ and $y, y^\varepsilon \in Y$ with $\|y - y^\varepsilon\| < \varepsilon$. Let $(x^{(m)})$ and $(x^{(m),\varepsilon})$ be the Landweber iteration sequences with respect to $y$ and $y^\varepsilon$ corresponding to the same initial value $x^{(0),\varepsilon} = x^{(0)} \in X$ and the same parameter $\omega \in \big(0, \tfrac{2}{\|A\|^2}\big)$. Then
$$\|x^{(m)} - x^{(m),\varepsilon}\| \le \varepsilon \begin{cases} \sqrt{2\omega}, & m = 1, \\ \sqrt{m\omega}, & m \ge 2. \end{cases} \tag{363}$$

The estimate (363) suggests that the parameter $\omega$ should be chosen as small as possible. However, we have seen that the convergence of the approximation error depends on the estimate $|1 - \omega\lambda| < 1$; as a consequence, for small $\omega$ the approximation error converges very slowly. In more detail, we separate the total error in the usual way:
$$\|A^\dagger y - R_m y^\varepsilon\| \le \underbrace{\|A^\dagger y - R_m y\|}_{\text{approximation error}} + \underbrace{\|R_m y - R_m y^\varepsilon\|}_{\text{data error}} \tag{364}$$
with $\|A^\dagger y - R_m y\| \to 0$ as $m \to \infty$ if $x^{(0)} = 0$, and $\|R_m y - R_m y^\varepsilon\| \le \sqrt{m\omega}\,\varepsilon$; i.e., the Landweber iteration is a regularization with a suitably chosen $m$. Our aim now is to find a stopping rule for $m$. In fact, we obtain for the residual term
$$Ax^{(m),\varepsilon} - y^\varepsilon = A\big(I - \omega A^*A\big)x^{(m-1),\varepsilon} + \omega AA^* y^\varepsilon - y^\varepsilon = \big(I - \omega AA^*\big)Ax^{(m-1),\varepsilon} - \big(I - \omega AA^*\big)y^\varepsilon = \big(I - \omega AA^*\big)\big(Ax^{(m-1),\varepsilon} - y^\varepsilon\big). \tag{365}$$
Under the choice $0 < \omega < \tfrac{1}{\|A\|^2}$, the residual norms are non-increasing. Let $\tau >$
1 be fixed. We are interested in determining $m^* \in \mathbb{N}_0$ such that
$$\|Ax^{(m^*),\varepsilon} - y^\varepsilon\| \le \tau\varepsilon < \|Ax^{(m),\varepsilon} - y^\varepsilon\|, \qquad m = 0, 1, \dots, m^* - 1. \tag{368}$$
The discrepancy principle (368) seems to be particularly suitable for the Landweber iteration, since a residual term larger than $2\varepsilon$ implies the monotonicity of the error: Suppose that $A \in \mathcal{L}(X, Y)$, $y \in R(A)$, and $y^\varepsilon \in Y$ with $\|y - y^\varepsilon\| < \varepsilon$. If $\|Ax^{(m),\varepsilon} - y^\varepsilon\| > 2\varepsilon$ and $0 < \omega < \tfrac{1}{\|A\|^2}$, then
$$\|A^\dagger y - x^{(m+1),\varepsilon}\| < \|A^\dagger y - x^{(m),\varepsilon}\|, \qquad m \in \mathbb{N}_0, \tag{369}$$

i.e., the error is also strictly monotonically decreasing. In other words, as already announced, the error decreases monotonically, just like the residual, as long as the residual stays larger than $2\varepsilon$. An upper bound for the number of iterations when using the discrepancy principle is as follows (cf. [116]): Assume that $A \in \mathcal{L}(X, Y)$, $y \in R(A)$, and $y^\varepsilon \in Y$ with $\|y - y^\varepsilon\| < \varepsilon$, and suppose that $0 < \omega < \tfrac{1}{\|A\|^2}$. The discrepancy principle (368) with $\tau > 1$ yields the stopping index $m^* = m^*(\varepsilon, y^\varepsilon) \le C_L\, \varepsilon^{-2}$ with a constant $C_L > 0$ for the Landweber iteration. Note that this result does not require any assumptions on the smoothness of the solution. With such information the discrepancy principle enables us to stop the iteration much earlier: Suppose that $A \in \mathcal{L}(X, Y)$, $0 < \omega < 2/\|A\|^2$, $y \in R(A)$, $y^\varepsilon \in Y$ with $\|y - y^\varepsilon\| < \varepsilon$, and $x^{(0)} = 0$. Then the Landweber iteration together with the discrepancy principle (368) is an order optimal regularization of $A^\dagger$; i.e., the Landweber iteration possesses infinite qualification. The stopping index can be estimated as follows:
$$m^* = m^*(\varepsilon, y^\varepsilon) \le C_\mu\, \varepsilon^{-\frac{2}{\mu+1}} \tag{370}$$
with $C_\mu > 0$. If we choose as initial value $0 \ne x^{(0)}$ with $x^{(0)} \in N(A)^\perp$, it suffices to study the convergence of the sequence $\{\tilde{x}^{(m),\varepsilon}\}$ with $\tilde{x}^{(0),\varepsilon} = 0$, which results from the Landweber method applied to the equation $Ax = y^\varepsilon - Ax^{(0)}$. The minimum norm solution of this equation is $A^\dagger y - x^{(0)}$, and if $A^\dagger y \in X_\mu$, we also need $x^{(0)} \in X_\mu$ to obtain the optimal order of decay for the error, i.e., $O(\varepsilon^{\frac{\mu}{\mu+1}})$. Unfortunately, for unknown $\mu$ we have to choose $0$ as the starting value. The disadvantage of the Landweber iteration is that its convergence is rather slow; i.e., the stopping index $m^*$ is often large. This is the reason why semi-iterative methods (see, e.g., [116] and the references therein) come into play to accelerate the convergence.
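A minimal sketch of the Landweber iteration stopped by the discrepancy principle (368): iterate while the residual exceeds $\tau\varepsilon$. The matrix sizes, noise level, and $\tau = 2$ below are illustrative assumptions, not values prescribed by the text.

```python
# Landweber iteration with the discrepancy principle (368) as stopping rule.
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((40, 10)) / 10.0
x_true = rng.standard_normal(10)
y = A @ x_true
eps = 1e-2
noise = rng.standard_normal(40)
y_eps = y + eps * noise / np.linalg.norm(noise)   # ||y - y_eps|| = eps

omega = 1.0 / np.linalg.norm(A, 2) ** 2
tau = 2.0                                         # tau > 1 (assumed choice)
x = np.zeros(10)
m_star = 0
while np.linalg.norm(A @ x - y_eps) > tau * eps and m_star < 100000:
    x = x + omega * A.T @ (y_eps - A @ x)         # Landweber step (356)
    m_star += 1

assert np.linalg.norm(A @ x - y_eps) <= tau * eps  # stopping criterion met
assert 0 < m_star < 100000                         # finite stopping index
```

Early stopping acts here as the regularization parameter: running far beyond $m^*$ would start fitting the noise component of $y^\varepsilon$.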

9.2 Semi-Iterative Methods

The characteristics of the Landweber iteration are as follows (cf. [116]):
$$x^{(0)} = 0 \tag{371}$$
and
$$x^{(m)} = F_m(A^*A)\,A^* y \tag{372}$$
with $F_m(\lambda)$ given by (359). For $\lambda \in [0, \|A\|^2]$ we have
$$F_m(\lambda) \to \frac{1}{\lambda}, \qquad m \to \infty. \tag{373}$$

Moreover, we are able to show that
$$\sup_{\lambda\in[0,\|A\|^2]} \lambda\,|F_m(\lambda)| = \sup_{\lambda\in[0,\|A\|^2]} \big|1 - (1 - \omega\lambda)^m\big| \le 2, \tag{374}$$
so that $\{F_m\}_{m\in\mathbb{N}}$ is a regularizing filter. The residual polynomials corresponding to the filter polynomials are of degree $m$:
$$p_m(\lambda) = 1 - \lambda F_m(\lambda) = (1 - \omega\lambda)^m. \tag{375}$$
All in all, the Landweber iteration satisfies
$$y - Ax^{(m)} = p_m(AA^*)\, y. \tag{376}$$


In order to accelerate the Landweber iteration, we are led to the idea (see, e.g., the monograph [116]) of replacing the polynomial filter by another filter family that converges faster to $1/\lambda$. To this end we consider an alternative polynomial $F_m$ of degree $m - 1$, so that its residual polynomial $\lambda \mapsto p_m(\lambda) = 1 - \lambda F_m(\lambda)$ is of degree $m$. Letting $x^{(m)} := F_m(A^*A)\,A^*y$, we obtain as residual $y - Ax^{(m)} = p_m(AA^*)\,y$. Now, if $\{F_m\}_{m\in\mathbb{N}}$ is a regularizing filter, the corresponding family $\{p_m\}_{m\in\mathbb{N}}$ satisfies the following properties:
(i) $\lim_{m\to\infty} p_m(\lambda) = 0$ for $\lambda \in (0, \|A\|^2]$,
(ii) $\{p_m\}$ is uniformly bounded on the interval $[0, \|A\|^2]$ (by $1 + C_F$),
(iii) $p_m(0) = 1$ for all $m$.
Conversely, if $\{p_m\}$ is a family of polynomials satisfying the properties (i), (ii), (iii), then
$$F_m(\lambda) = \frac{1 - p_m(\lambda)}{\lambda}, \qquad m \in \mathbb{N}_0, \tag{377}$$

constitutes a regularizing filter. From the theory of special functions of mathematical physics (see, e.g., [31, 124]) we borrow the following result: Let the family $\{p_m\}$ fulfill the following conditions:
(i) $p_m$ is a polynomial of degree $m$ on the interval $[0, \|A\|^2]$,
(ii) $\displaystyle\int_0^{\|A\|^2} p_m(\lambda)\, p_n(\lambda)\, w(\lambda)\, d\lambda = 0$ if $n \ne m$, where $w(\lambda) > 0$ for $\lambda \in [0, \|A\|^2]$ and $w$ is piecewise continuous,
(iii) $p_m(0) = 1$, $m \in \mathbb{N}_0$.
Then there exist $A_m, B_m \in \mathbb{R}$ satisfying
$$p_m(\lambda) = p_{m-1}(\lambda) + A_m\big(p_{m-1}(\lambda) - p_{m-2}(\lambda)\big) - B_m\, \lambda\, p_{m-1}(\lambda). \tag{378}$$

with respect to the weight function w (λ) note that (w (λ) induces a measure on 0, A2 ). In transition to iteration we are led to x (0) = 0,   1 − p1 (λ) x (1) = F1 A∗ A A∗ y with F1 (λ) = , λ " # x (m) = x (m−1) + Am x (m−1) − x (m−2)

(379) (380)

916

W. Freeden and B. Witte

" # +Bm A∗ y − Ax (m−1)

for m ≥ 2.

(381)

This is the reason why methods of type (379), (380), (381) are called semi-iterative (note that, for each iteration step, two previous iteration values are required). For simplicity, assume now that A ≤ 1 which can be achieved by scaling the operator. As a consequence, we only need to consider polynomials on the interval [0, 1]. Then the following results are known for semi-iterative methods (see, e.g., [116]): (1) Let {pm } be a sequence of residual polynomials, so that it is uniformly bounded on [0, 1], normalized by pm (0) = 1, and it converges pointwise to 0 on (0, 1]. If {Fm } is the corresponding sequence of filters, then for A ∈ L(X, Y ) and x (0) ∈ X it follows that lim x (m) = lim

m→∞

m→∞



=

"

x (0) + Fm (A∗ A)A∗ (y − Ax (0) )

A† y + PN (A) x (0) ∞

: :

#

y ∈ D(A† ), y∈ / D(A† ).

(2) Suppose that A is of class L(X, Y ). Let {x (m) }, {x (m),ε } be the iterates of a semi-iterative method with respect to y ∈ R(A) and y ε ∈ Y using the same initial value. Let the residual polynomials of the semi-iterative method be uniformly bounded by Cp > 0. Then, for the data error, we have x (m) − x (m),ε  ≤ 2Cp m ε.

(382)

Together with a stopping rule that fulfills m∗ (ε) → ∞ ,

εm∗ (ε) → 0 for ε → 0

(383)

the semi-iterative method is a regularization of $A^\dagger$. In order to determine the speed of convergence we investigate
$$\omega_\mu(m) = \sup_{0\le\lambda\le 1} \lambda^{\mu/2}\, |p_m(\lambda)|. \tag{384}$$
(3) Each sequence of polynomials $\{p_m\}_{m\in\mathbb{N}_0}$, $p_m(0) = 1$, which satisfies the best asymptotic behavior
$$\omega_\mu(m) = O(m^{-\mu}) \quad\text{for } m \to \infty \tag{385}$$
for some $\mu > 0$, is uniformly bounded on $[0, 1]$ and converges pointwise to $0$ on $(0, 1]$. In other words, a family $\{p_m\}_{m\in\mathbb{N}_0}$ given in such a way is a sequence of residual polynomials for which, in addition, $\omega_\alpha(m) = O(m^{-\alpha})$ for $0 < \alpha \le \mu$. The discrepancy principle (cf. (368)) can also be used as stopping rule.

14 From Gaussian Least Squares Approximation to Today’s . . .

917

(4) Let $A \in L(X,Y)$ and $y \in R(A)$. Let the normalized polynomials $\{p_m\}_{m\in\mathbb{N}_0}$, $p_m(0) = 1$, satisfy (385) for some $\mu > 1$. Then the corresponding semi-iterative method with starting value $x^{(0)} = 0$ is an order optimal regularization of $A^\dagger$ with respect to $X_\alpha$ for $0 < \alpha \le \mu - 1$ if it is combined with the discrepancy principle (368) as stopping rule with $\tau > \sup\{\|p_m\|_{C[0,1]} : m \in \mathbb{N}_0\} \ge 1$. The stopping index satisfies

$$m_* = m_*(\varepsilon, y^\varepsilon) = O\!\left(\varepsilon^{-1/(\alpha+1)}\right) \tag{386}$$

for $\varepsilon \to 0$. Normalized polynomials that fulfill (385) automatically lead to order optimal semi-iterative regularization methods. The reason for this is that (385) implies the uniform boundedness of the polynomials on $[0,1]$ as well as $\sup\{|F_m(\lambda)| : \lambda \in [0,1]\} \le 2\tau m^2$ for the corresponding filters. In the general case that $p_t(\lambda) = 1 - \lambda F_t(\lambda)$ we are confronted with the situation (see, e.g., [116]) that

$$\omega_\mu(t) \le C_p\, t^{\mu/2} \quad \text{for } t \to 0, \tag{387}$$

which is the analogue of (385). However, this neither implies the uniform boundedness of $\{p_t\}_{t>0}$ nor an estimate of the form $|F_t(\lambda)| \le C_F\, t^{-\alpha}$.
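For orientation, a hypothetical numerical illustration of (384): Landweber iteration has the residual polynomials $p_m(\lambda) = (1-\lambda)^m$, for which $\omega_\mu(m)$ decays only like $m^{-\mu/2}$, i.e., it does not attain the best rate (385). This is one motivation for genuinely semi-iterative methods. The grid evaluation below is a sketch, not taken from the text.

```python
# Hypothetical grid evaluation of (384) for Landweber iteration, whose
# residual polynomials are p_m(lambda) = (1 - lambda)^m.  The scaled values
# m^(mu/2) * omega_mu(m) stay roughly constant, indicating the decay rate
# omega_mu(m) = O(m^(-mu/2)) -- slower than the best rate (385).

def omega(mu, m, grid=20000):
    return max((k / grid) ** (mu / 2) * (1.0 - k / grid) ** m
               for k in range(grid + 1))

mu = 1.0
scaled = [m ** (mu / 2) * omega(mu, m) for m in (100, 400, 1600)]
print(scaled)    # roughly constant, around 0.43
```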

9.3 Gradient Method

The method of successive approximation, the gradient method, and related iterative methods can be used for finding approximate solutions of ill-posed problems (see, e.g., [53, 69, 70, 92, 97] and the references therein for more details). We let $X$ and $Y$ be two Hilbert spaces, both over $\mathbb{K}$, and let $A$ be a bounded linear operator from $X$ into $Y$. As already known, the linear equation

$$Ax = y, \qquad y \in Y, \tag{388}$$

may or may not have a solution depending on whether or not $y$ is in $R(A)$, the range of $A$, and even if $y \in R(A)$ the solution of (388) need not be unique. For any bounded linear operator $A : X \to Y$, $\overline{R(A^*)}$ and $\overline{R(A)}$ are closed subspaces of $X$ and $Y$, respectively; hence, $X = \overline{R(A^*)} \oplus R(A^*)^\perp$ and $Y = \overline{R(A)} \oplus R(A)^\perp$. The relations $N(A) = R(A^*)^\perp$, $N(A^*) = R(A)^\perp$, $N(A)^\perp = \overline{R(A^*)}$, $\overline{R(A)} = \overline{R(AA^*)}$ are also valid (see, for instance, [138]). Let $P$ denote the orthogonal projection of $X$ onto $\overline{R(A^*)}$ and let $Q$ denote the orthogonal projection of $Y$ onto $\overline{R(A)}$. Then $Ax = APx$ for all $x \in X$, $A^*y = A^*Qy$ for each $y \in Y$, and the restriction of $A$ to $\overline{R(A^*)}$ has an inverse, which is not necessarily continuous. In either case, i.e., whether the solution of (388) for $y \in R(A)$ is unique or non-unique, one can seek a best approximate solution, i.e., a solution which minimizes the quadratic


function $J(x) = \|Ax - y\|^2$. Such a solution always exists for all $y \in Y$ if $R(A)$ is closed. If $R(A)$ is arbitrary, a best approximate solution does not exist for all $y \in Y$; however, it does exist for all $y \in R(A) + R(A)^\perp$. We consider the conjugate gradient (CG) method (cf. [69, 70]) that minimizes $J(x) = \|Ax - y\|^2$ at each step. That is, choose an initial vector $x_0 \in X$, then compute $r_0 = p_0 = A^*(Ax_0 - y)$, where as always $A^*$ is the adjoint of $A$. If $p_0 \ne 0$, compute $x_1 = x_0 - \alpha_0 p_0$, where $\alpha_0 = \|r_0\|^2 / \|Ap_0\|^2$. For $i = 1, 2, \ldots$, compute

$$r_i = A^*(Ax_i - y) = r_{i-1} - \alpha_{i-1}\, A^*A\, p_{i-1}, \tag{389}$$

where

$$\alpha_{i-1} = \frac{\langle r_{i-1}, p_{i-1}\rangle}{\|Ap_{i-1}\|^2}, \tag{390}$$

and if $r_i \ne 0$, then compute

$$p_i = r_i + \beta_{i-1}\, p_{i-1}, \tag{391}$$

where

$$\beta_{i-1} = -\frac{\langle r_i, A^*A\, p_{i-1}\rangle}{\|Ap_{i-1}\|^2}. \tag{392}$$

Set

$$x_{i+1} := x_i - \alpha_i\, p_i. \tag{393}$$
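The recursion (389), (390), (391), (392), (393) can be transcribed directly. The following plain-Python sketch (hypothetical example; the matrix and right hand side are invented) applies it to a small full-rank system, for which CG terminates in at most two steps.

```python
# Hypothetical plain-Python transcription of the CG recursion (389)-(393):
# r_i = A^T (A x_i - y), alpha from (390), beta from (392), p from (391),
# and x_{i+1} = x_i - alpha_i p_i from (393).

def mv(Mat, v):
    return [sum(Mat[i][j] * v[j] for j in range(len(v))) for i in range(len(Mat))]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cg_least_squares(A, y, x0, steps=10, tol=1e-14):
    At = [list(c) for c in zip(*A)]
    x = list(x0)
    r = mv(At, [s - t for s, t in zip(mv(A, x), y)])   # r_0 = A^T(A x_0 - y)
    p = list(r)
    for _ in range(steps):
        if dot(r, r) ** 0.5 <= tol:
            break
        Ap = mv(A, p)
        alpha = dot(r, p) / dot(Ap, Ap)                # (390)
        x = [xi - alpha * pi for xi, pi in zip(x, p)]  # (393)
        r = mv(At, [s - t for s, t in zip(mv(A, x), y)])   # (389)
        beta = -dot(r, mv(At, Ap)) / dot(Ap, Ap)       # (392)
        p = [ri + beta * pi for ri, pi in zip(r, p)]   # (391)
    return x

A = [[2.0, 1.0], [1.0, 3.0]]
x_true = [1.0, -1.0]
y = mv(A, x_true)
x = cg_least_squares(A, y, x0=[0.0, 0.0])
print(x)    # [1.0, -1.0] after at most two CG steps
```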

We examine some properties of the CG algorithm for the case of an arbitrary bounded linear operator. In this case, the domain of the generalized inverse of $A$ is $D(A^\dagger) = R(A) + R(A)^\perp$. If $y \in D(A^\dagger)$, then $Qy = \bar{y}$ is in the range of $A$, $v = A^\dagger y = A^\dagger \bar{y}$, and $\bar{y} = Qy = Av = AA^\dagger y$. Since $Q$ is an orthogonal projection, the functional $J$ can be written as $J(x) = \|Ax - y\|^2 = \|Ax - \bar{y}\|^2 + \|\bar{y} - y\|^2$. Thus, minimizing $J$ is equivalent to minimizing the functional $\|Ax - \bar{y}\|^2$, which we denote by $g(x)$. Setting

$$u = v + (I - P)x_0 = A^\dagger y + (I - P)x_0, \tag{394}$$

we are able to define the error vector $e := x - u$ and the vector $r = A^*(Ax - y) = A^*(Ax - \bar{y})$. Then

$$(A^*A)\, e = r \tag{395}$$

and

$$\langle r, e\rangle = \|Ax - \bar{y}\|^2 = g(x). \tag{396}$$

The sequence of iterates $\{x_i\}$ generated by the CG method (389), (390), (391), (392), and (393) is contained in the flat $x_0 + \overline{R(A^*)}$, with both $r_i$ and $p_i$, $i = 0, 1, 2, \ldots$, in $\overline{R(A^*)}$. Moreover, $p_0, p_1, \ldots, p_{i-1}$ form an $A^*A$-orthogonal set of vectors and their span is an $i$-dimensional subspace of $\overline{R(A^*)}$. If at the $i$th step $r_i = 0$, then both $Ax_i - y$ and $Ax_i - \bar{y}$ are vectors in $N(A^*) = R(A)^\perp$. However, $Ax_i$ and $\bar{y}$ are also in $\overline{R(A)}$, and therefore $Ax_i = \bar{y}$, implying that $g(x_i) = 0$. In this case, the iteration terminates at the $i$th step and we have

$$x_i = x_0 - \sum_{k=0}^{i-1} \alpha_k\, p_k = u, \tag{397}$$

as well as

$$A^\dagger y = P x_0 - \sum_{k=0}^{i-1} \alpha_k\, p_k. \tag{398}$$

Therefore, unless explicitly mentioned otherwise, we shall assume that the conjugate gradient method does not terminate in a finite number of steps, that is, $r_i \ne 0$ for $i = 0, 1, \ldots$. We list some known identities for the conjugate gradient method: For indices satisfying $k = 0, 1, 2, \ldots$ and $i = 0, 1, 2, \ldots$, we have

$$\langle r_i, r_i\rangle = \langle p_i, r_k\rangle, \tag{399}$$

$$\|Ap_i\| \le \|Ar_i\|, \tag{400}$$

$$\beta_i = \frac{\|r_{i+1}\|^2}{\|r_i\|^2}, \tag{401}$$

$$\langle p_i, p_k\rangle = \frac{\|r_i\|^2\, \|p_k\|^2}{\|r_k\|^2}, \tag{402}$$

$$p_i = \|r_i\|^2 \sum_{j=0}^{i} \frac{r_j}{\|r_j\|^2}, \tag{403}$$

and

$$\|r_i\| \le \|p_i\|. \tag{404}$$

Setting $g(x_i) = \langle r_i, e_i\rangle = \|Ax_i - \bar{y}\|^2$, where $e_i = x_i - u$, we find that


$$g(x_i) - g(x_{i+1}) = \alpha_i\, \|r_i\|^2. \tag{405}$$

Finally, $x_i$ minimizes the functionals $J(x)$ and $g(x)$ on the $i$-dimensional flat

$$x_0 + \operatorname{span}\{p_0, p_1, \ldots, p_{i-1}\}. \tag{406}$$

It is worth mentioning the following three properties (see [70]):

(i) For $k = 0, 1, 2, \ldots, i$,
$$g(x_i) = \langle e_i, r_k\rangle = \langle r_i, e_k\rangle. \tag{407}$$

(ii) For $i = 0, 1, 2, \ldots$,
$$\langle p_i, e_i\rangle\, \|r_i\|^2 = g(x_i)\, \|p_i\|^2. \tag{408}$$

(iii) The inequality
$$\|e_{i+1}\|^2 \le \|e_i\|^2 - \alpha_i\, g(x_i) \tag{409}$$
holds for $i = 0, 1, 2, \ldots$. For any non-negative integers $i$ and $j$, both $\langle p_i, e_i\rangle$ and $\langle e_i, e_j\rangle$ are non-negative numbers.
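A hypothetical numerical check of some of these identities (the 3x3 system below is invented): for two CG steps, the $\beta$ of (392) coincides with $\|r_{i+1}\|^2/\|r_i\|^2$ from (401), $\|r_i\| \le \|p_i\|$ as in (404), and $g(x_i)$ decreases monotonically.

```python
# Hypothetical numerical check of the CG identities on a small 3x3 system:
# the beta of (392) coincides with ||r_{i+1}||^2 / ||r_i||^2 from (401),
# ||r_i|| <= ||p_i|| as in (404), and g(x_i) = ||A x_i - y||^2 decreases.

def mv(Mat, v):
    return [sum(Mat[i][j] * v[j] for j in range(len(v))) for i in range(len(Mat))]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.5]]
y = [1.0, -2.0, 0.5]
At = [list(c) for c in zip(*A)]

def g(xv):                          # g(x) = ||A x - y||^2
    res = [s - t for s, t in zip(mv(A, xv), y)]
    return dot(res, res)

x = [0.0, 0.0, 0.0]
r = mv(At, [s - t for s, t in zip(mv(A, x), y)])
p = list(r)
checks = []
for _ in range(2):
    Ap = mv(A, p)
    alpha = dot(r, p) / dot(Ap, Ap)
    g_old = g(x)
    x = [xi - alpha * pi for xi, pi in zip(x, p)]
    r_new = mv(At, [s - t for s, t in zip(mv(A, x), y)])
    beta_392 = -dot(r_new, mv(At, Ap)) / dot(Ap, Ap)
    beta_401 = dot(r_new, r_new) / dot(r, r)
    checks.append((abs(beta_392 - beta_401) < 1e-10,
                   g(x) <= g_old + 1e-12,
                   dot(r, r) <= dot(p, p) + 1e-12))
    p = [ri + beta_392 * pi for ri, pi in zip(r_new, p)]
    r = r_new
print(checks)    # all entries True
```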

The main result on the CG method for bounded linear operators with closed range depends heavily upon the following observation: Let $X$ and $Y$ be two Hilbert spaces over the same field and let $A$ be a bounded linear transformation mapping $X$ into $Y$. If $R(A)$ is closed, then $S = A|_{R(A^*)}$, the restriction of $A$ to $R(A^*)$, has a bounded inverse, and $U = A^*A|_{R(A^*)}$ is a positive definite operator onto $R(A^*)$. Let $\mu(x) := \langle Ux, x\rangle/\langle x, x\rangle$, $x \ne 0$, be the Rayleigh quotient of $U$. Since $U$ is a bounded symmetric positive definite linear operator on the Hilbert space $R(A^*)$, the spectral bounds

$$m = \inf\{\mu(x) : x \in R(A^*)\} \tag{410}$$

and

$$M = \sup\{\mu(x) : x \in R(A^*)\} \tag{411}$$

are positive and finite. Let $X$ and $Y$ be two Hilbert spaces over the real field and let $A$ be a bounded linear transformation mapping $X$ into $Y$. If the range of $A$ is closed, then the conjugate gradient method (389), (390), (391), (392), and (393) converges


monotonically to the least squares solution $u = A^\dagger y + (I - P)x_0$ of $Ax = y$. Moreover, if $m$ and $M$ are the spectral bounds of $U = A^*A|_{R(A^*)}$, then

$$\|x_i - u\|^2 \le \frac{g(x_0)}{m} \left( \frac{M - m}{M + m} \right)^{2i}, \qquad i = 0, 1, 2, \ldots. \tag{412}$$

Altogether, let $X$ and $Y$ be two Hilbert spaces over the same field. If $A$ is a linear transformation mapping $X$ into $Y$ of rank $r$, then the conjugate gradient method associated with the system $Ax = y$ converges in at most $r$ steps to the least squares solution $u = A^\dagger y + (I - P)x_0$. When $R(A^*)$ is not closed, then the operator $U = A^*A|_{R(A^*)}$ need not be positive definite, and therefore the above inequality can no longer be utilized. However, it is still possible to establish convergence of the CG method under mild restrictions (see [69, 70]): Let $X$ and $Y$ be two Hilbert spaces over the real field, and let $A$ be a bounded linear operator mapping $X$ to $Y$. If $Qy \in R(AA^*A)$, then the conjugate gradient method (389), (390), (391), (392), and (393), with initial value $x_0 \in R(A^*A)$, converges monotonically to the least squares solution of minimal norm $u = A^\dagger y$. In fact,

$$\|x_i - u\|^2 \le \frac{\|A\|^2\, \|x_0 - A^\dagger y\|^2\, \|A^{*\dagger} x_0 - (AA^*)^\dagger y\|^2}{\|A\|^2\, \|A^{*\dagger} x_0 - (AA^*)^\dagger y\|^2 + i\, \|x_0 - A^\dagger y\|^2}, \qquad i = 1, 2, \ldots. \tag{413}$$

10 Mollifier Methods

The original idea of the mollifier method can be characterized as follows (see, e.g., [21]): We are interested in the solution $x^\dagger$ of $Ax = y$, but we realize that the problem is "too ill-posed" for being able to determine $x^\dagger$ accurately. Thus, we compromise by changing the problem into a more well-posed one, namely that of trying to determine a mollified version $E_\rho x^\dagger$ of the solution, where $E_\rho$ is a suitable "mollification operator" depending on a parameter $\rho$. The heuristic motivation is that the trouble usually comes from high frequency components of the data and of the solution, which are damped out by mollification (which, in fact, defines mollification). In abstract nomenclature, early mollifier methods for ill-posed problems were studied in [82]. Our approach refers to [21]. First we recapitulate the essential ingredients. Again we start from the operator equation

$$Ax = y, \qquad x \in X,\; y \in Y, \tag{414}$$

with $X$, $Y$ Hilbert spaces and $R(A)$ non-closed. Our aim is to introduce operators $E_\rho : X \to X$ such that


$$E_\rho x \to x, \qquad x \in X, \quad \rho \to 0. \tag{415}$$

If $X$ is a suitable function space, we are able to represent $E_\rho$ by a mollifier $e_\rho$ via the equation

$$(E_\rho x)(s) = \langle e_\rho(s, \cdot), x\rangle_X. \tag{416}$$

Instead of $x^\dagger$ we now look for $E_\rho x^\dagger$ for some $\rho > 0$, thereby assuming that $e_\rho$ has a representation

$$A^* v_s^\rho = e_\rho(s, \cdot) \tag{417}$$

with $v_s^\rho \in Y$. Then, if $Ax^\dagger = y$, we can compute $E_\rho x^\dagger$ as follows:

$$(E_\rho x^\dagger)(s) = \langle e_\rho(s, \cdot), x^\dagger\rangle_X = \langle A^* v_s^\rho, x^\dagger\rangle_X = \langle v_s^\rho, A x^\dagger\rangle_Y = \langle v_s^\rho, y\rangle_Y, \tag{418}$$

i.e.,

$$(E_\rho x^\dagger)(s) = \langle v_s^\rho, y\rangle_Y. \tag{419}$$

Hence, the problem of solving (414) reduces to that of solving (417), which is ill-posed as soon as (414) is. However, the right hand side of (417) (which is actually a family of equations depending on the parameter $s$) is usually given analytically, since the mollifier $e_\rho$ is chosen. Hence, there is no (or much less) error in the data of (417), and these equations can be solved (by regularization) much better than (414). As soon as an approximation for $v_s^\rho$ has been computed, it can be used to solve (414) for any right hand side $y$ via (419). If we define the operator $S_\rho : Y \to X$ via

$$(S_\rho y)(s) = \langle v_s^\rho, y\rangle, \tag{420}$$

then, by (419), this operator maps the right hand side of (414) to mollified solutions. This motivates the term approximate inverse of $A$ used for $S_\rho$, also for the more general case that (417) is not solvable. In this case, (417) is replaced by

$$\|A^* v_s^\rho - e_\rho(s, \cdot)\|_X \to \min, \tag{421}$$

which is equivalent to

$$AA^* v_s^\rho = A\, e_\rho(s, \cdot). \tag{422}$$

Note that one needs the requirement that (422) is solvable. The function $v_s^\rho$ is called reconstruction kernel; uniqueness can be enforced by solving (422) in the


best-approximate sense, i.e., by selecting the solution of (422) with minimal norm: $v_s^\rho = (A^*)^\dagger e_\rho(s, \cdot)$. Assume, for simplicity, that $R(A)$ is dense in $Y$, so that $(AA^*)^{-1}$ exists. Then we have, with $v_s^\rho$ defined by (421),

$$(S_\rho y)(s) = \langle (AA^*)^{-1} A\, e_\rho(s, \cdot), y\rangle_Y = \langle e_\rho(s, \cdot), A^*(AA^*)^{-1} y\rangle_X = \langle e_\rho(s, \cdot), (A^*A)^\dagger A^* y\rangle_X = (E_\rho x^\dagger)(s), \tag{423}$$

i.e., $S_\rho y$ is the mollified version of the best-approximate solution of (414). This justifies (421). Let $A$ be compact with singular system $(\sigma_n; x_n, y_n)$, and let a regularization method be realized in the standard way, i.e.,

$$x_\alpha = \sum_{n=1}^{\infty} \sigma_n F_\alpha(\sigma_n^2)\, \langle y, y_n\rangle_Y\, x_n. \tag{424}$$

If we assume that $X$ and $Y$ are suitable function spaces, then (424) can be written as

$$x_\alpha(s) = \langle v_s^\rho, y\rangle_Y \tag{425}$$

with

$$v_s^\rho(t) = \sum_{n=1}^{\infty} \sigma_n F_\alpha(\sigma_n^2)\, x_n(s)\, y_n(t). \tag{426}$$

Now, $v_s^\rho$ can be written in the form (417) with

$$e_\rho(s, t) = \sum_{n=1}^{\infty} \sigma_n^2 F_\alpha(\sigma_n^2)\, x_n(s)\, x_n(t). \tag{427}$$

Hence, $x_\alpha$ can be considered as a mollified solution $E_\rho x^\dagger$ with $E_\rho$ given (in the sense of (416)) by the mollifier (427), so that linear regularization methods can also be viewed as mollifier methods. The underlying "suitable function spaces" have to be such that point evaluation is continuous wherever used. The choice of the mollifier, of course, depends on what one wants to achieve; frequently used choices are singular integral kernels (such as the Haar kernel, the sinc kernel, etc.). Note that each of these mollifiers can be applied only to equations for which (422) admits a solution.


Literature 1. Adrain, R.: Research Concerning the Probabilities of the Errors which Happen in Making the Observations. The Analyst of Mathematical Museum, vol. I. W. P. Farrand and Co., Philadelphia (1808) 2. Aronszajn, N.: Theory of reproducing kernels. Trans. Am. Math. Soc. 68, 337–404 (1950) 3. Baart, M.L.: Methods for Solution of Fredholm Integral Equations of the First Kind. CSIR, Pretoria, Technical Report, vol. 104 (1979) 4. Baer, R.: Linear Algebra and Projective Geometry. Academic Press, New York (1952) 5. BenIsrael, A., Greville, T.N.E.: Generalized Inverses: Theory and Applications. New York: Wiley-Interscience (1974) 6. Bialas, V.: Erdgestalt, Kosmologie und Weltanschauung. Konrad Wittwer, Stuttgart (1982) 7. Bjerhammar, A.: Rectangular reciprocal matrices with special reference to geodetic calculations. Bull. Géod. 20, 188–220 (1951) 8. Bjerhammar, A.: Theory of Errors and Generalized Matrix Inverses. Elsevier Scientific Publ. Co., Amsterdam (1973) 9. Boullion, T.L., Odell, P.L.: Generalized Inverse Matrices. Wiley-Interscience, New York (1971) 10. Davis, C.H.: Theory of the Motion of Heavenly Bodies Moving about the Sun in Conic Sections (translation of “Theoria motus corporum coelestium in sectionibus conicis solem ambientium” by C.F. Gauss, reprinted by Dover, New York (1963)) 11. Davis, P.J.: Interpolation and Approximation. Blaisdell, New York (1963) 12. Dedekind, R.: Gauß in seiner Vorlesung über die Methode der kleinsten Quadrate. Festschrift zur Feier des hundertfünfzigjährigen Bestehens der Königlichen Gesellschaft der Wissenschaften zu Göttingen, Weidmannsche Buchhandlung (1901) 13. Defrise, M., de Mol, C.: A note on stopping rules for iterative regularization methods and filtered SVD. In: Inverse Problems: an Interdisciplinary Study, pp. 261–268. Academic Press, San Diego (1987) 14. Dunford, N., Schwarz, J.: Linear Operators, vol. II. Wiley-Interscience, New York (1963) 15. Dunnington, G.W.: Carl Friedrich Gauss: Titan of Science. 
A Study of His Life and Work. Exposition Press, New York (1955) 16. Edgeworth, F.Y.: On observations relating to several quantities. Hermathena 6, 279–285 (1887) 17. Eggermont, P.P.B., LaRicca, V.N.: Maximum Penalized Likelihood Estimation, vol. II, Regression. Springer, New York (2009) 18. Eggermont, P.P.B., LaRicca, V.N. Nashed, M.Z.: On weakly bounded noise in ill-posed problems. Inverse Prob. 25, 115018–115032 (2009) 19. Eggermont, P.N., LaRiccia, V., Nashed, M.Z.: Noise models for ill-posed problems. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 2, 2nd edn., pp. 1633–1658. Springer, New York (2015) 20. Eggermont, P.P.B., LaRicca, V.N. Nashed, M.Z.: Moment discretization of ill-posed problems with discrete weakly bounded noise. Int. J. Geomath. 3, 155–178 (2012) 21. Engl, H.W., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Kluwer, Dordrecht (1996) 22. Fellmann, E., Huber, D. Z.: Neue Deutsche Biographic 9, 689–1972 (https://www.deutschebiographic.de/pnd117030260,html/#ndbcontent) 23. Fredholm, I.: Sur une classe d’equations functionelles. Acta Math. 27, 365–390 (1903) 24. Freeden, W.: On spherical spline interpolation and approximation. Math. Meth. Appl. Sci. 3, 551–575 (1981a) 25. Freeden, W.: On approximation by harmonic splines. Manuscr. Geod. 6, 193–244 (1981b) 26. Freeden, W.: Spherical spline interpolation: basic theory and computational aspects. J. Comput. Appl. Math. 11, 367–375 (1984) 27. Freeden, W.: A spline interpolation method for solving boundary value problems of potential theory from discretely given data. Math. Part. Diff. Equ. 3, 375–398 (1987)


28. Freeden, W.: Multiscale modelling of spaceborne geodata. B.G. Teubner, Stuttgart/Leipzig (1999) 29. Freeden, W., Gerhards, C.: Geomathematically Oriented Potential Theory. Chapman and Hall/CRC Press, Boca Raton/London/New York (2013) 30. Freeden, W., Gervens, T., Schreiner, M.: Constructive Approximation on the Sphere (With Applications to Geomathematics). Oxford Science Publications, Clarendon/Oxford (1998) 31. Freeden, W., Gutting, M.: Special Functions of Mathematical (Geo)Physics. Birkhäuser, Basel (2013) 32. Freeden, W., Gutting, M.: Integration and Cubature Methods – A Geomathematically Oriented Course. Chapman and Hall/CRC Press, Boca Raton/London/New York (2018) 33. Freeden, W., Nashed, M.Z.: Ill-posed problems: operator methologies of resolution and regularization. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 201–314. Springer, Birkhäuser (2018) 34. Freeden, W., Nashed, Z., Sonar, T. (eds.): Handbook of Geomathematics, vol. 1, 2, and 3, 2nd edn. Springer, New-York (2015) 35. Freeden, W., Nashed, Z., Schreiner, M: Spherical Sampling. Geosystems Mathematics. Springer, Birkhäuser (2018) 36. Freeden, W., Schneider, F.: Regularization wavelets and multiresolution. Inverse Prob. 14, 493–515 (1998) 37. Freeden, W., Schreiner, M.: Satellite gravity gradiometry (SGG): from scalar to tensorial solution. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn. Springer, New-York/Heidelberg (2015) 38. Freeden, W., Sonar, T., Witte, B.: Gauss as mediator between mathematics and geodesy from the past to the present. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 1–163. Springer, Birkhäuser (2018) 39. Freeden, W., Witte, B.: A combined (spline-)interpolation and smoothing method for the determination of the gravitational potential from heterogeneous data. Bull. Géod. 56, 53–62 (1982) 40. 
Friedrich, K.: Allgemeine für die Rechenpraxis geeignete Lösung für die Aufgaben der kleinsten Absolutsumme und der günstigsten Gewichtsverteilung. Zeitschrift für Vermessungswesen (ZfV) 337–358 (1937) 41. Friedrichs, K.O.: Spectral Theory of Operators in Hilbert Space. Springer, New York (1981) 42. Galle, A.: Über die geodätischen Arbeiten von Gauß, I. Abschnitt: Die Methode der kleinsten Quadrate. Berlin (1924) (Gauß Werke Bd. XI 2. Abteilung) 43. Gauss, C.F.: Theoria motus corporum coelestium in sectionibus conicis solem ambientium, Hamburg, 1809, Werke 7. Translated into English by C.H. Davis (1963) 44. Gauss, C.F.: Theoria combinationis observationum erroribus minimis obnoxiae, 1. Teil, Göttingen (1821) 45. Gauss, C.F.: Supplementum theoriae combinationis observationum erroribus minimis obnoxiae. Göttingen (1826) 46. Gauss, C.F.: Untersuchungen über Gegenstände der Höheren Geodäsie. J. Frischauf, Hrsg., Ostwald’s Klassiker der exakten Wissenschaften, Nr. 177, Leipzig (1910) 47. Gerling, C.L.: Ausgleichungsrechnungen der praktischen Geometrie oder die Methode der kleinsten Quadrate. Hamburg und Gotha (1843) 48. Gnedenko, B.W.: Über die Arbeiten von C.F. Gauß zur Wahrscheinlichkeitsrechnung. In: C.F. Gauß Leben und Werk, Hrsg. H. Reichardt, Haude & Spenersche Verlagsbuchhandlung, pp. 193–204. Berlin (1960) 49. Glockner, O.: On Numerical Aspects of Gravitational Field Modelling from SST and SGG by Harmonic Splines and Wavelets (with Application to CHAMP Data), Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern (2001) 50. Grafarend, E.W.: Six lectures on geodesy and global geodynamics. In: Moritz, H., Sünkel, H. (eds.) Proceedings of the Third International Summer School in the Mountains, pp. 531–685 (1982)


51. Groetsch, C.W.: Generalized Inverses of Linear Operators. Marcel Dekker. Inc., New York (1977) 52. Groetsch, C.W.: Inverse Problems in the Mathematical Science. Vieweg, Braunschweig (1993) 53. Groetsch, C.W., Mashed, M.Z., Scherzer, O. (eds): Iterative stabilization and edge detection. In: Contemporary Mathematics, vol. 313, pp. 129–141. American Mathematical Society, Providence (2002) 54. Gutting, M.: Fast multipole methods for oblique derivative problems. Ph.D-Thesis, Geomathematics Group, University of Kaiserslautern (2007) 55. Gutting, M.: Fast multipole accelerated solution of the oblique derivative boundary value problem. GEM Int. J. Geomath. 3, 223–252 (2012) 56. Hammer, E.: Beitrag zur Geschichte der Ausgleichungsrechnung. Zeitschrift für Vermessungswesen, pp. 613–628 (1900) 57. Hampel, F.R., Ronschetti, E.M., Rousseeuw, P.R., Stahel, W.A.: Robust Statistics. Wiley, New York (1986) 58. Hanke, M., Hansen, P.C.: Regularization methods for large scale problems. Surv. Math. Indus. 3, 253–315 (1993) 59. Hansen, P.C.: Analysis of discrete ill-posed problems by means of the L-curve. SIAM Rev. 34, 561–580 (1992) 60. Helmert, F.R.: Die Ausgleichungsrechnung nach der Methode der kleinsten Quadrate mit Anwendungen auf die Geodäsie und die Theorie der Messinstrumente. B.G. Teubner, Leipzig (1872, 1907) 61. Helmert, F.R.: Über die Berechnung des wahrscheinlichen Fehlers aus einer endlichen Zahl wahrer Beobachtungsfehler. Zeitschrift für Mathematik und Physik, XX (4), 300–303 (1875) 62. Helmert, F.R.: Zur Bestimmung kleiner Flächenstücke des Geoids aus Lothabweichungen mit Rücksicht auf Lothkrümmung. 1. Mitteilung. Sitzungsber. der Kgl. Preuß. Akad. d. Wiss. zu Berlin (1900) 63. Heuser, H.: Funktionalanalysis. 4. Auflage, Teubner (1975) 64. Hilbert, D.: Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen. Teubner, Leipzig (1912) 65. Huber, P.J.: Robust Statistics. Wiley, New York (1981) 66. 
Hurwitz, W.A.: On the pseudo-resolvent to the kernel of an integral equation. Trans. Am. Math. Soc. 13, 405–418 (1912) 67. Ivanov, V.K., Kudrinskii, V.Yu.: Approximate solution of linear operator equations in Hilbert space by the method of least squares. I. Z. Vycisl. Mat. i Mat. Fiz. 6, 831–944 (1966) 68. Jorgensen, P., Tian, F.: Graph Laplacians and discrete reproducing kernel Hilbert spaces from restrictions. Stoch. Anal. Appl. 34, 722–747 (2016) 69. Kammerer, W.J., Nashed, M.Z.: The convergence of the conjugate gradient method for singular linear operator equations. SIAM J. Numer. Anal. 9, 165–181 (1972) 70. Kammerer, W.J., Nashed, M.Z.: Iterative methods for best approximate solutions of linear integral equation of the first and second kind. J. Math. Anal. Appl. 40, 547–573 (1972) 71. Kanani, E.: Robust estimators for geodetic transformations and GIS. Ph.D.-Thesis, ETH No. 13521, Institut für Geodäsie und Photogrammetrie, Eidg. Technische Hochschule Zürich, Mitteilung Nr. 70, Zürich (2000) 72. Kantorowitsch, L.W., Akilow, G.P.: Funktionalanalysis in normierten Räumen. AkademieVerlag, Berlin (1964) 73. Koch, K.R.: Parameterschätzung und Hypothesentests. Dümmler Verlag, Bonn (1997) 74. Koch, K.R.: Parameter Estimation and Hypothesis Testing in Linear Models, 2nd edn. Springer, Berlin (1999) 75. Kress, R.: Linear Integral Equations, 2nd edn. Springer, Berlin (1989) 76. Krüger, L.: Über die Ausgleichung von bedingten Beobachtungen in zwei Gruppen. Veröff. d. Kgl. Preuß. Inst. N. F., Nr. 18, Potsdam (1905) 77. Laplace, P.S.: Traité de mécanique céleste. tome 2, Paris (1799) 78. Laplace, P.S.: Théorie analytique des probabiltés. Euvres, tome VII, 353 (1812)


79. Legendre, A.M.: Nouvelles méthodes pour la détermination des orbites cométes. Paris (1806) 80. Lehmann, G.: Gauss theoretische Arbeiten. In: C. F. Gauss und die Landesvermessung in Niedersachsen, herausgegeben von der Niedersächsischen Vermessungs- und Katasterverwaltung (1955) 81. Lehmann, R., Lösler, M.: Multiple outlier detection: hypothesis tests versus model selection by information criteria. J. Surv. Eng. 142. https://doi.org/10.1061/(ASCE)SU.1943-5428. 0000189 (2016) 82. Louis, A.K.: Inverse und schlecht gestellte Probleme. Teubner, Stuttgart (1989) 83. Maire, Ch., Boscovi´c, R.J.: De Litteraria Expeditione per Pontificiam Ditionem ad Dimentiendos duos Meridiani Gradus, Rom, pp. 409–503 (1755) 84. Marx, C.: Outlier detection by means of Monte Carlo estimation including resistant scale estimation. J. Appl. Geod. 9, 123–141 (2015) 85. Mees, R.: Über die Berechnung des wahrscheinlichen Fehlers einer endlichen Zahl von Beobachtungen. Zeitschrift für Mathematik und Physik, XX (2), 145–152 (1875) 86. Meissl, P.A.: A study of covariance functions related to the Earth’s disturbing potential. Department of Geodetic Science, No. 151, The Ohio State University, Columbus (1971) 87. Meissl, P.A.: Hilbert spaces and their applications to geodetic least squares problems. Boll. Geod. Sci. Aff. N. 1, 181–210 (1976) 88. Moore, E.H.: On the reciprocal of the general algebraic matrix. Bull. Am. Math. Soc. 26, 394–395 (1920) 89. Moore, E.H.: General analysis. Mem. Am. Math. Soc. 1, 197–209 (1935) 90. Moritz, H.: Statistische Methoden in der gravimetrischen Geodäsie. Zeitschrift für Vermessungswesen, pp. 409–416 (1963) 91. Moritz, H.: Advanced Physical Geodesy. Herbert Wichmann Verlag/Abacus Press, Karlsruhe/Tunbridge (1980) 92. Nashed, M.Z.: Steepest descent for singular linear operator equations. SIAM J. Numer. Anal. 7, 358–362 (1970) 93. 
Nashed, M.Z.: Differentiability and related properties of nonlinear operators: some aspects of the role of differentials in nonlinear functional analysis. In: Rall, L.B. (ed.) Nonlinear Functional Analysis and Applications, pp. 103–309. Academic Press, New York (1971) 94. Nashed, M.Z.: Approximate regularized solutions to improperly posed linear integral and operator equations. In: Colton, D.L., Gilbert, R.P. (eds.) Constructive and Computational Methods for Differential and Integral Equations, LNM 430, pp. 289–322. Springer, New York (1974) 95. Nashed, M.Z.: Aspects of generalized inverses in analysis and regularization. In: Generalized Inverses and Applications, Nashed, M.Z. (ed.), pp. 193–244. Academic Press, New York (1976) 96. Nashed, M.Z.: Generalized Inverses and Applications. Academic Press, New York (1976) 97. Nashed, M.Z.: Perturbations and approximation for generalized inverses and linear operators. In: Generalized Inverses and Applications, Nashed, M.Z. (ed.), pp. 325–396. Academic Press, New York (1976) 98. Nashed, M.Z.: Continuous and semicontinuous analogous of iterative method of Cimmino and Kaczmarz with applications to the inverse radon transform. In: Herman G.T., Natterer, F., (eds.) Mathematical Aspects of Computerized Tomography, pp. 160–178. Springer, New York (1981) 99. Nashed, M.Z.: A new approach to classification and regularization of ill-posed operator equations. In: Engl, H., Groetsch, C.W. (eds.) Inverse and Ill-Posed Problems, Band 4, Notes and Reports in Mathematics and Science and Engineering. Academic Press, Boston (1987) 100. Nashed, M.Z., Votruba, F.G.: A unified operator theory of generalized inverses. In: Nashed, M.Z. (ed.) Generalized Inverses und Applications, pp. 1–109. Academic Press, New York (1976) 101. Nashed, M.Z., Wahba, G.: Generalized inverses in reproducing kernel spaces: an approach to regularization of linear operator equations. SIAM J. Math. Anal. 5, 974–987 (1974)


102. Nashed, M.Z., Wahba, G.: Approximate regularized pseudosolution of liner operator equations when the data-vector is not in the range of the operator. Bull. Am. Math. Soc. 80, 1213–1218 (1974a) 103. Nashed, M.Z., Wahba, G.: Convergence rates of approximate least squares solutions of linear integral and operator equations of the first kind. Math. Comput. 28, 69–80 (1974b) 104. Natterer, F.: Error bounds for Tikhonov regularization in Hilbert scales. Appl. Anal. 18, 29–37 (1984) 105. Neubauer, A.: On converse and saturation results for Tikhonov regularization of linear illposed problems. SIAM J. Numer. Anal. 34, 517–527 (1997) 106. Neumann, J.V.: On regular rings. Proc. Nat. Acad. Sci. USA 22, 707–713 (1936) 107. Penrose, R.: A generalized inverse for matrices. Proc. Camb. Philos. Soc. 51, 406–413 (1955) 108. Pearson, K.F.: On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have Arisen from random sampling. Philos. Mag. Ser. 5 302, 157–175 (1900). https://doi.org/10. 1080/14786440009463997.12 109. Philips, B.L.: A technique for the numerical solution of certain integral equations of the first kind. J. Assoc. Comput. Math. 9, 84–97 (1962) 110. Plackett, R.L.: An historical note on the method of least squares. Biometrika 36, 458–460 (1949) 111. Rado, R.: Note on generalized inverses of matrices. Proc. Camb. Philos. Soc. 52, 600–601 (1956) 112. Reid, W.T.: Generalized inverses of differential and integral operators. In: Boullion, T.L., Odell, P.L. (eds.) Theory and Applications of Generalized Inverses of Matrices. Symposium Proceedings, Texas Tech University Mathematics Series, vol. 4, pp. 1–25. Texas Tech Press, Lubbock (1965) 113. Reinsch, C.: Smoothing by spline functions. Numer. Math. 10, 177–183 (1967) 114. Reuter, R.: Über Integralformeln der Einheitssphäre und harmonische Splinefunktionen. Ph.D.-Thesis, Veröff. Geod. Inst. RWTH Aachen, Report No. 
33 (1982) 115. Robinson, D.W.: Gauss and generalized inverses. Hist. Math. 7, 118–125 (1980) 116. Rieder, A.: Keine Probleme mit Inversen Problemen. Vieweg, Wiesbaden (2003) 117. Rummel, R., Balmino, G., Johannessen, J., Visser, P., Woodworth, P.: Dedicated gravity field missions – principles and aims. J. Geodyn. 33, 3–20 (2002) 118. Schlossmacher, E.J.: An iterative technique for absolute deviations curve fitting. J. Am. Stat. Assoc. 68, 857–859 (1973) 119. Schreiber, O.: Anordnung der Winkelbeobachtungen im Göttinger Basisnetz. Zeitschrift für Vermessungswesen, pp. 129–161 (1882) 120. Schuh, W.-D.: Über die Ausgleichung bei “Überschüssigen Messungen und zufälligen Beobachtungen” – auf den Spuren von Friedrich Robert Helmert. Schriftenreihe des Förderkreises Vermessungstechnisches Museum 42, 32–47, Dortmund (2017) 121. Shure, L., Parker, R.L., Backus, G.E.: Harmonic splines for geomagnetic modelling. Phys. Earth Planet. Inter. 28, 215–229 (1982) 122. Siegel, C.L.: Über die analytische Theorie der quadratischen Formen, III. Ann. Math. 38, 212–291 (1937) 123. Stewart, G.W.: Theoria Combinationis Observationum Erroribus Minimis Obnoxiae. Pars Prior et Pars Posterior and Supplementum, Theory of the Combination of Observations Least subject to Errors, Part One, Part Two, Supplement by C.F. Gauß, Society for Industrial and Applied Mathematics (1995) 124. Szegö, G.: Orthogonal Polynomials. American Mathematical Society Colloquium Publications, vol. 23. American Mathematical Society, Providence (1959) 125. Taylor, A.E., Lay, D.: Functional Analysis. Wiley, New York (1979) 126. Tikhonov, A.N.: On the stability of inverse problems. Dokl. Akad. Nauk. SSSR 39, 195–198 (1943) 127. Tikhonov, A.N.: On the solution of incorrectly formulated problems and the regularization method. Dokl. Akad Nauk SSSR 151, 501–504 (1963)


128. Tikhonov, A.N.: On methods of solving incorrect problems. Am. Math. Soc. Transl. 2(70), 222–224 (1968) 129. Tikhonov, A.N., Arsenin V.Ya.: Solutions of Ill-Posed Problems. Wiley, New York (1977) 130. Tikhonov, A.N., Goncharsky, A.V., Stepanov, V.V., Yagola, A.G.: Numerical Methods for the Solution of Ill-Posed Problems. Kluwer, Dordrecht (1995) 131. Vogler, C.A.: Lehrbuch der praktischen Geometrie. Bd. II 1. Halbband, Braunschweig (1894) 132. Wahba, G.: Convergence rates of certain approximate solutions to Fredholm integral equations of the first kind. J. Approx. Theory 7, 167–185 (1973) 133. Wahba, G.: Spline Models for Observational Data. SIAM, Philadelphia (1990) 134. Werner, D.: Funktionalanalysis. 4. Auflage. Springer, Berlin (2002) 135. Werner, J.: Numerische Mathematik 1. Vieweg Studium, Braunschweig (1991) 136. Witte, B.: Friedrich Robert Helmert in memory of his 100th year of death. GEM Intern. J. Geomath. 8, 153–168 (2017) 137. Wolf, H.: C.F. Gauß und die Methode der kleinsten Quadrate. Allgemeine VermessungsNachrichten, pp. 125–133 (1977) 138. Yosida, K.: Functional Analysis, 5th edn. Springer, Berlin (1980)

The Numerical Treatment of Covariance Stationary Processes in Least Squares Collocation

15

Wolf-Dieter Schuh and Jan Martin Brockmann

Contents
1 Introduction ................................................................ 933
2 Least Squares Collocation ................................................... 936
3 Covariance Approach ......................................................... 937
4 Least Squares Collocation: Filter Approach .................................. 941
5 Magic Square: Transition from an AR(p)-Process to a Covariance Function ..... 943
6 Combined Approach ........................................................... 947
7 Resumé ...................................................................... 950
A Cholesky Approach: Revisited ................................................ 951
Literature .................................................................... 964

Abstract
Digital sensors provide long series of equispaced and often strongly correlated measurements. A rigorous treatment of this huge set of correlated measurements in a collocation approach is a big challenge. Standard procedures – applied in a thoughtless brute-force manner – fail, because these techniques are not suitable to handle such huge systems. In this article two different strategies to handle such huge systems, denoted as covariance approach and filter approach, are contrasted. In the covariance approach, various decorrelation strategies based on different Cholesky approaches to factorize the variance/covariance matrices are reviewed. The focus is on arbitrarily distributed data sets with a finite number of data, but also extensions to sparse systems resulting from finite covariance functions and to exploiting the Toeplitz structure, which arises in the case of equispaced systems, are elaborated. Apart from that, filter approaches are discussed to perform a prewhitening strategy for the data and to rearrange the whole model to work with the filtered data in a rigorous way. Here, the special focus is on autoregressive processes to model the correlations. Finite, causal, non-recursive filters are constructed as prewhitening filters for the data as well as for the model. This approach is extremely efficient, but can only deal with infinite equispaced data sets. In real data scenarios, finite sequences and data gaps must be handled as well. For the covariance approach this is straightforward, but it is a serious problem for the filter approach. Therefore a combination of these approaches is constructed to select the best properties from each. Covariance matrices of equispaced data sets designed by recursively defined covariance sequences are represented by AR processes as well as by Cholesky-factorized matrices. It is shown that it is possible to switch between both strategies to get data gaps and the warm-up phase of the filter approach under control.

This chapter is part of the series Handbuch der Geodäsie, volume "Mathematische Geodäsie/Mathematical Geodesy", edited by Willi Freeden, Kaiserslautern.

W.-D. Schuh · J. M. Brockmann
Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany
E-Mail: [email protected]; [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020
W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_95

Keywords Stochastic processes · Collocation · Decorrelation · Filtering · Equispaced data · Finite data series · Data gaps · Yule-Walker equations · Magic square · Recursively defined covariance sequences

Zusammenfassung (translated)
Digital sensors provide time series of equispaced and often strongly correlated measurements. A rigorous evaluation of these mostly extensive data sets in a collocation approach poses a great challenge. Standard procedures are not able to cope with such large systems. In this article two different methods – the covariance approach and the filter approach – are contrasted in order to discuss their potential for use with very large data volumes. In the covariance approach the correlations are modeled by covariance functions. The decorrelation of the measurements is carried out via the factorization of the covariance matrices, for which different variants of the Cholesky factorization are examined. Sparse systems caused by the use of finite covariance functions, regular Toeplitz systems resulting from regular sampling, but also the effects of local data losses (data gaps) are examined, and tailored decomposition methods are developed.


The filter approach, on the other hand, offers hardly any possibilities to respond flexibly to peculiarities of the data. For regularly sampled, very long time series, however, decorrelation via filters is extremely efficient. By modeling the correlations with finite autoregressive processes (AR processes), causal, non-recursive, finite filters are built which can efficiently decorrelate infinitely extended measurement series. Data gaps, but also the initialization of the filter process, cause a loss of data which can be quite dramatic, especially for large correlation lengths. Measurement series from practice, however, consist of finitely many measurements and contain irregularities and data gaps. While the covariance approach can be applied to such data without problems, the efficient filter approach requires special precautions (approximations). By using recursively defined covariance sequences, however, a transition between the two approaches can be established. The fully populated covariance matrix can be decomposed, with the help of a special variant of the Cholesky inversion – recursive backward edging – into a sparse matrix which largely exhibits a band Toeplitz structure and which corresponds to the causal, non-recursive filter of the filter approach. By combining the two approaches it is thus possible to develop a time-variable, causal, non-recursive filter which can decorrelate finite and slightly irregular data sets rigorously and efficiently.

Schlüsselwörter (translated) Stochastic processes, collocation, decorrelation, filtering, equispaced data, finite time series, data gaps, Yule-Walker equations, Magic Square, recursively defined covariance sequences

1 Introduction

Digital sensors provide long series of equispaced measurements, which are often strongly correlated because of the high sampling rate and the characteristics of the measurement system. A rigorous treatment of this huge set of correlated measurements in an adjustment model or collocation approach is a big challenge. Simplified models or approximative solutions are often introduced to reduce the numerical effort: e.g., short arc approaches [26, 36] or stochastic impulses [4, 18] are used, or a periodic behavior of the correlations is assumed to get access to Fourier techniques in the spectral domain [6, 35]. But for a rigorous treatment, tailored solution strategies in connection with high performance computing are essential to cope with the huge numerical effort [40]. Therefore it is necessary to analyze the situation in detail. The measurement model in the collocation approach [22, 30, 31] consists of a deterministic part


and a stochastic part which can further be divided into a signal part and a noise part. Whereas the signal part is characterized by a covariance stationary stochastic process, the noise part is typically a simple white noise process with uncorrelated components. Because of the finite length of the data streams, the treatment of the correlations by covariance matrices seems to be preferable. Also data gaps can be easily accounted for in this representation. In addition, a sparse representation resulting from the use of finite covariance functions [13, 21, 29, 34, 38, 42, 43] opens the possibility to treat huge matrices. But a large drawback is the deduction of the shape and the variability of the covariance function from the discrete data samples. Within the computation of the empirical covariance function and its approximation by an analytical, positive definite function it is not easy to consider and resolve secondary effects [33, Sec. 5.3]. However, each covariance stationary process can be represented in the time/space domain as well as in the spectral domain. The four quantities, (a) the stochastic process itself, (b) the autocovariance sequence, (c) the spectral representation of the process, and (d) the power spectrum as spectral representation of the covariance function, form the corners of a square (Magic Square), and the relations between them are indicated by arrows. With respect to sensor measurements, equispaced, time-discrete, non-periodic stochastic processes are of special interest. The quantities and relations in the Magic Square can be found in Fig. 1 [24]. Using the spectral representation of the covariance function, a large variety of different strategies can be applied for the estimation process. Depending on the data characteristics (bias, finite sequences), different strategies are discussed in [19] (see also [7, Chap. 7], [33, Sec. 5.3 and 5.4]). Furthermore, robust estimation techniques are extensively treated in [20].
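The lower row of the Magic Square, i.e. the discrete Wiener-Chintschin relation between the covariance sequence and the power spectrum, can be sketched numerically. The exponentially decaying covariance sequence below is a hypothetical stand-in, not data from the text:

```python
import numpy as np

def power_spectrum(gamma, dt, freqs):
    # discrete cosine form of the Wiener-Chintschin theorem:
    # gamma^S(nu) = dt * sum_{t=-m}^{m} gamma^S_|t| * cos(2 pi nu t dt)
    t = np.arange(1, len(gamma))
    cos_terms = np.cos(2.0 * np.pi * np.outer(t, freqs) * dt)
    return dt * (gamma[0] + 2.0 * (gamma[1:] @ cos_terms))

dt = 1.0
nu_N = 1.0 / (2.0 * dt)                  # Nyquist frequency
gamma = 0.8 ** np.arange(200)            # hypothetical covariance sequence
freqs = np.linspace(-nu_N, nu_N, 401)
spec = power_spectrum(gamma, dt, freqs)

# a valid covariance sequence yields a real, even, positive spectrum whose
# integral over [-nu_N, nu_N] (trapezoidal rule here) returns gamma_0
gamma_0 = np.sum((spec[:-1] + spec[1:]) / 2.0) * (freqs[1] - freqs[0])
assert abs(gamma_0 - gamma[0]) < 1e-6
```

The inverse direction (continuous cosine transformation back to the covariance sequence) works analogously by integrating spec against cos(2πνtΔt).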
For the robust estimation of the covariance function, the description of a stochastic process by an autoregressive model is of special interest. In that case, it suggests itself to use the coefficients of the modeled autoregressive process (AR process) directly as a prewhitening filter to decorrelate the data, instead of the covariance approach. At first glance this approach has a lot of benefits, as it performs efficiently on huge equispaced data sets. But it also has two important drawbacks with respect to real finite measurement sequences: a filter warm-up phase is necessary to handle the first data of the sequence, and no data gaps are allowed [41]. In this paper we elaborate an approach where the best properties of both strategies are combined. The paper is organized as follows. In Sect. 2 the basics of the Least Squares Collocation (LSC) approach are summarized. The special focus in Sect. 3 is on the covariance approach and on strategies for the decorrelation of arbitrarily structured covariance matrices by Cholesky factorization. Different variations of the Cholesky factorization are used to design causal and anti-causal, recursive and non-recursive filters. In Sect. 4 the signal in the collocation approach is considered as an autoregressive process of a finite order p, shortly denoted as AR(p) process. The pros and cons of this ansatz are discussed in detail. In Sect. 5 the transition from


[Fig. 1 diagram: the four corners of the Magic Square are the stochastic process S_t (upper left), its spectral representation dŜ(ν) (upper right), the covariance sequence γ^S_{|t|} (lower left, time domain) and the power spectrum γ^S(ν) (lower right, spectral domain), connected by the stochastic Fourier transform, the stochastic Fourier integral and the discrete/continuous cosine transformations.]

Fig. 1 Magic Square of a covariance stationary, equispaced, infinite, real-valued stochastic process. Upper left: sequence of stochastic values with a sampling interval of Δt. Upper right: the spectral representation of the equispaced stochastic process forms a continuous, piecewise defined, complex-valued stochastic process dŜ(ν), which is periodic with respect to the interval [−ν_N, ν_N] [24]. ν_N represents the Nyquist frequency, which is fixed by ν_N = 1/(2Δt). The transformation between time and spectral domain is performed by a stochastic Fourier transform; the back transformation is defined by a stochastic Fourier integral. Lower left: real-valued and even covariance sequence γ^S_{|t|}. Lower right: spectral density function γ^S(ν), real-valued and periodic in the interval [−ν_N, ν_N]. The discrete form of the Wiener-Chintschin theorem [9, Sect. 8.2.2] defines the transformation between the covariance sequence in the time domain and the power spectrum. Because the covariance sequence is real-valued and even, the transformation from the time domain to the spectral domain can be done by a discrete cosine transformation, while the inverse transformation is performed by a continuous cosine transformation

the AR(p) process to the covariance approach is discussed. Recursively defined covariance functions for equispaced data are introduced. Their representations in the time and the spectral domain are elaborated and presented in compact form in the Magic Square. In Sect. 6 it is shown that it is possible to switch between the covariance and the filter approach. In this way, data gaps and the warm-up phases can be treated in the filter approach by time-variable AR processes. These findings are discussed in the resumé (Sect. 7). The Appendix outlines different strategies for the Cholesky factorization. Besides the common factorization into a lower and an upper triangular matrix, also the backward Cholesky factorization is discussed. Computational strategies to compute the inverse of a system directly from the Cholesky-factorized triangular matrices are elaborated. A flexible combination of all the possible factorization steps for a symmetric matrix as well as for its inverse is discussed and serves as a basis for the integration of the covariance approach into the filter approach.

2 Least Squares Collocation

Following [31, Sec. 14] (see also [42, Sec. 3]) the measurement model for the observables L ∈ R^n in least squares collocation is given in the general form by

L = f(ξ) + S + N   and   Σ{L} = Σ{S} + Σ{N} ,   (1)

or in the linear form by

L = A ξ + S + N   and   Σ{L} = Σ{S} + Σ{N} ,   (2)

where f(ξ) respectively Aξ represents the systematic (deterministic) model part, whereas the signal S and the noise N represent the stochastic part.¹ In contrast to the general covariance stationary stochastic process S, the noise process N is characterized by independently and identically distributed random variables with expectation E{N} = 0_n and variance Σ{N} = 1_n σ², and is often shortly denoted as white noise process. The random variables S and N are independent, Σ{S, N} = O_n, and therefore

Σ{L} = Σ{S + N} = Σ{S} + Σ{N} := Σ   (3)

holds for the stochastic model. The variance/covariance matrix Σ{L} will often simply be denoted as covariance matrix and marked by Σ in the following. The best linear unbiased estimator (BLUE) X̂ for the true unknown functional parameters ξ is given by

X̂ = (A^T Σ^{-1} A)^{-1} A^T Σ^{-1} L   (4)

with the variance/covariance matrix

Σ{X̂} = (A^T Σ^{-1} A)^{-1} .   (5)

The best linear unbiased predictor (BLUP) Ŝ for the signal component is given by

Ŝ = Σ{S, S+N} Σ^{-1} Π^{Σ^{-1}}_{S⊥(A)} Σ^{-1} L ,   (6)

¹ In this contribution random variables are denoted by calligraphic letters (ℒ) and random vectors by L. Greek letters denote true values ξ and vectors of true values ξ, whereas Latin letters represent fixed numbers or realizations a, vectors a and matrices A of numbers and realizations. 1_n stands for the identity matrix of dimension n×n and O_n for a null matrix of dimension n×n, O_{n1×n2} for a null matrix with n1 rows and n2 columns, whereas 0_n denotes a null vector with n elements.


where Π^{Σ^{-1}}_{S⊥(A)} denotes the projector onto the orthogonal column space S⊥(A) of the matrix A with metric Σ^{-1}, defined as

Π^{Σ^{-1}}_{S⊥(A)} := Σ − A (A^T Σ^{-1} A)^{-1} A^T .   (7)

Introducing the reduced observation vector

ΔL̃ := Π^{Σ^{-1}}_{S⊥(A)} Σ^{-1} L   (8)

we can rewrite (6) as

Ŝ = Σ{S, S+N} Σ^{-1} ΔL̃ .   (9)

The variance of this estimator can be derived by variance propagation. Considering that

Σ{ΔL̃} = Π^{Σ^{-1}}_{S⊥(A)} ,   (10)

and taking into account the idempotence of the projector with respect to the metric Σ^{-1} [42, Sec. 3.1], it follows that

Σ{Ŝ} = Σ{S, S+N} Σ^{-1} Σ{ΔL̃} Σ^{-1} Σ{S+N, S} .   (11)

The variance of the estimation error E_Ŝ := Ŝ − S can be computed by (cf. [31, Sec. 14], [42, Sec. 3])

Σ{E_Ŝ} = Σ{S} − Σ{Ŝ} .   (12)
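A minimal numerical sketch of the estimators (4), (8) and (9) can clarify the computation; the trend model, the exponential signal covariance and all numbers below are hypothetical, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
t = np.arange(n, dtype=float)

A = np.column_stack([np.ones(n), t])                  # deterministic model part
Sigma_S = 0.5 ** np.abs(np.subtract.outer(t, t))      # hypothetical Sigma{S}
Sigma_N = 0.1 * np.eye(n)                             # white noise Sigma{N}
Sigma = Sigma_S + Sigma_N                             # eq. (3)

L = A @ np.array([2.0, 0.3]) + rng.multivariate_normal(np.zeros(n), Sigma)

Si = np.linalg.inv(Sigma)
X_hat = np.linalg.solve(A.T @ Si @ A, A.T @ Si @ L)   # BLUE, eq. (4)
dL = L - A @ X_hat        # reduced observations, equivalent to eq. (8)
S_hat = Sigma_S @ Si @ dL # BLUP, eq. (9), with Sigma{S, S+N} = Sigma{S}

# the reduced observations are orthogonal to the columns of A in the
# metric Sigma^{-1}, which reflects the projector property of (7)
assert np.allclose(A.T @ Si @ dL, 0.0, atol=1e-6)
```

For large n the explicit inverse would of course be replaced by the factorization techniques discussed in the next section.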

3 Covariance Approach

In this paper we have a focus on the computation of the unknown parameters (4) and their covariance matrix (5). A strategy which is often used when covariances have to be considered is to decorrelate the observations in a pre-processing step. Therefore, the observation equations (2) are multiplied with a regular matrix H ∈ R^{n×n},

H L = H A ξ + H S + H N ,  i.e.,  L̄ = Ā ξ + S̄ + N̄  with  L̄ := H L, Ā := H A, S̄ := H S, N̄ := H N ,   (13)

with the objective that the transformed observations L̄ = H L are uncorrelated and homogenized. That means that for the propagated variance of L̄

Σ{L̄} = Σ{H L} = H Σ H^T = 1_n   (14)

must hold. This yields an alternate formulation for (2):

L̄ = Ā ξ + S̄ + N̄   with   Σ{L̄} = Σ{S̄ + N̄} = 1_n .   (15)

To find an adequate representation for H which satisfies (14), it is a straightforward step to factorize the variance/covariance matrix into two parts, where one part is the transposed of the other. This can be done e.g. by an eigenvalue decomposition Σ = U Λ^{1/2} Λ^{1/2} U^T, where the columns of the matrix U represent the eigenvectors and the square roots of the corresponding eigenvalues are arranged in the diagonal of the diagonal matrix Λ^{1/2}. A computationally more efficient way is to factorize the covariance matrix Σ by the Cholesky (forward factorization) approach into a lower and an upper triangular matrix, where the lower triangular matrix is the transposed of the upper triangular matrix R,

Σ = R^T R .   (16)

To satisfy condition (14),

H R^T R H^T = 1_n   (17)

must hold. This is clearly fulfilled for H^T = R^{-1}. Therefore the transformed quantities L̄ and Ā can be computed by

L̄ = H L = (R^T)^{-1} L   resp.   Ā = H A = (R^T)^{-1} A ,   (18)

which corresponds to the solutions of the systems of equations

R^T L̄ = L   resp.   R^T Ā = A .   (19)

In an element-wise formulation, the solution can be determined by a forward substitution algorithm,

⎡ r_{11}                    ⎤ ⎡ L̄_1 ⎤   ⎡ L_1 ⎤
⎢ r_{12}  r_{22}             ⎥ ⎢ L̄_2 ⎥ = ⎢ L_2 ⎥    resp.    L̄_i = (1/r_{ii}) ( L_i − Σ_{k=1}^{i−1} r_{ki} L̄_k )   (20)
⎢   ⋮      ⋮    ⋱          ⎥ ⎢  ⋮  ⎥   ⎢  ⋮  ⎥
⎣ r_{1n}  r_{2n}  ⋯  r_{nn} ⎦ ⎣ L̄_n ⎦   ⎣ L_n ⎦

(cf. Appendix A.1). If we look at (20) as a filter equation, we identify this filter as a linear, time-variant, recursive, causal filter. This filter has some benefits because it can handle finite time series, is flexible with respect to the data distribution (data gaps) and can handle sparse structures resulting for instance from finite covariance functions. But this filter approach has


also some cons, as these recursive procedures are time-consuming and their parallel usage is not straightforward. Time-variant filters have a memory requirement of O{n²} and a computational complexity of O{n³}.

3.1 Variations of Cholesky Decorrelation Filters

The versatility of the Cholesky approach as documented in the Appendix can be used to construct different types of filters; also non-recursive (moving average) and anti-causal filters can be constructed. In a first approach an anti-causal, recursive filter can be constructed by the partition of the matrix Σ into an upper triangular matrix R̄ and its transposed,

Σ = R̄ R̄^T .   (21)

Numerically this partitioning can be done by the backward Cholesky factorization (83) as elaborated in Appendix A.4. The corresponding transformation matrix H for the decorrelation process L̄ = H L follows from the condition

H Σ H^T = H R̄ R̄^T H^T = 1_n   (22)

and yields H = R̄^{-1}. To avoid the inversion of the matrix R̄, the decorrelation process in form of a matrix-vector multiplication

L̄ = H L = R̄^{-1} L   (23)

is rearranged, such that the solution of

R̄ L̄ = L   (24)

has to be determined. For the solution, the upper triangular form of R̄ can be exploited, starting from the last equation, by a backward substitution. The decorrelation process for L resp. A with respect to the factorization (21) can thus be seen as an anti-causal, recursive filter. Both filters (20) and (24) discussed so far are recursive filters. However, non-recursive filters are preferable because of their better performance with respect to runtime and their straightforward implementation on parallel machines. In a first attempt it is assumed that the inverse Σ^{-1} of the covariance matrix Σ is known. The inverse matrix Σ^{-1} is now factorized by the common Cholesky approach into a lower and an upper triangular matrix. The name of this matrix can be chosen arbitrarily, but according to the discussion in the Appendix, especially with respect to Fig. 10c, we denote this matrix by R̄^{-1}. The Cholesky factorization of the inverse Σ^{-1} is given by


Σ^{-1} = (R̄^{-1})^T R̄^{-1} .   (25)

Considering the decorrelation condition (14) with respect to the inverse covariance matrix,

H Σ H^T = H (Σ^{-1})^{-1} H^T = H [ (R̄^{-1})^T R̄^{-1} ]^{-1} H^T = H R̄ R̄^T H^T = 1_n ,   (26)

it follows that the transformation matrix is H = R̄^{-1}. It is neither very efficient nor numerically stable to first compute the inverse Σ^{-1} and then perform the Cholesky factorization into R̄^{-1}. In Fig. 10c it is shown that the computation of R̄^{-1} can be done starting from Σ by (79), organized by a recursive backward edging strategy, cf. (80), beginning with the last element. With the computed matrix R̄^{-1} the decorrelation process can be performed by

L̄ = R̄^{-1} L ,   (27)

which then corresponds to an anti-causal, non-recursive filter. To construct a causal, non-recursive filter, the inverse of the covariance matrix must be factorized by a backward Cholesky factorization (cf. Fig. 10d) into

Σ^{-1} = R^{-1} (R^{-1})^T .   (28)

The corresponding decorrelation matrix H = (R^T)^{-1} results from the decorrelation condition (14) and can be computed by (87) by a recursive forward edging strategy, cf. (80), starting with the first element. The decorrelation process is then given by

L̄ = (R^T)^{-1} L .   (29)

Now, it corresponds to a causal, non-recursive filter, which is simple to apply from the numerical point of view and benefits from sequential data availability.
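The backward factorization (21) is not provided by NumPy directly; as an illustrative shortcut (not the edging strategies of the Appendix), it can be obtained by flipping a standard forward Cholesky factor, and the resulting filter is indeed upper triangular, i.e. anti-causal:

```python
import numpy as np

n = 40
t = np.arange(n, dtype=float)
Sigma = 0.8 ** np.abs(np.subtract.outer(t, t))   # hypothetical covariance matrix
J = np.eye(n)[::-1]                              # exchange (flip) matrix

# backward Cholesky factorization Sigma = Rbar Rbar^T with Rbar upper triangular,
# obtained here by flipping a standard (forward) Cholesky factor:
Rbar = J @ np.linalg.cholesky(J @ Sigma @ J) @ J
assert np.allclose(Rbar, np.triu(Rbar))          # upper triangular
assert np.allclose(Rbar @ Rbar.T, Sigma, atol=1e-8)   # eq. (21)

H = np.linalg.inv(Rbar)   # upper triangular -> anti-causal filter, cf. (22)-(24)
assert np.allclose(H @ Sigma @ H.T, np.eye(n), atol=1e-8)
```

The flip construction works because J Σ J is again symmetric positive definite and J G J turns a lower triangular factor G into an upper triangular one.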

3.2 Resumé about Cholesky Decorrelation Filters

In the last section various types of Cholesky factorization strategies were recapped. The focus of this consideration is not on the efficient computation of these factorizations, but on the achievable decorrelation results in terms of different recursive and non-recursive, causal and anti-causal filters. Figure 2 illustrates the different possibilities to factorize the covariance matrix by the Cholesky approach.

[Fig. 2 table: the four decompositions are Σ = R^T R (forward Cholesky reduction, H = (R^T)^{-1}, causal, recursive filter), Σ = R̄ R̄^T (backward Cholesky reduction, H = R̄^{-1}, anti-causal, recursive filter), Σ^{-1} = (R̄^{-1})^T R̄^{-1} (recursive backward edging, H = R̄^{-1}, anti-causal, non-recursive filter) and Σ^{-1} = R^{-1} (R^{-1})^T (recursive forward edging, H = (R^T)^{-1}, causal, non-recursive filter).]

Fig. 2 Decorrelation strategies by various types of Cholesky factorizations

Summarizing the pros and cons, it can be stated that the covariance approach in connection with the different Cholesky factorization techniques is extremely flexible and can handle arbitrarily distributed data sets, but is computationally very demanding. Huge data sets can only be handled when finite covariance functions are applied and the sparsity of the matrix can be utilized and preserved in the different factorizations. Especially for equispaced data the filter approach, which is introduced in the upcoming section, seems preferable.

4 Least Squares Collocation: Filter Approach

In contrast to the general collocation approach in the last section, we will now have a look at equispaced data sets. For that purpose, we define the observation equations in a row-wise sequential formulation as

L_t = A_t ξ + S_t + N_t ,   t ∈ Z ,   (30)

where the random variable L_t describes the measurement process at time T = T_0 + tΔt. A_t denotes the t-th row of the design matrix A and ξ denotes the unknown true parameters. The random variables S_t and N_t are associated with stochastic processes (S_t)_{Δt} and (N_t)_{Δt}, where the signal can be represented by an autoregressive model of order p (AR(p) process),

S_t = α_1 (S_{t−1} + N_{t−1}) + α_2 (S_{t−2} + N_{t−2}) + … + α_p (S_{t−p} + N_{t−p}) + E_t .   (31)
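The process (31) is straightforward to simulate; the AR(2) coefficients and noise levels below are hypothetical, chosen only to give a stationary process:

```python
import numpy as np

rng = np.random.default_rng(7)
alpha = np.array([1.3, -0.4])      # hypothetical AR(2) coefficients (stationary)
sigma_E, sigma_N = 1.0, 0.2
p, n = len(alpha), 2000

S = np.zeros(n + p)                # zero-initialized warm-up values
N = sigma_N * rng.standard_normal(n + p)
for t in range(p, n + p):
    # eq. (31): S_t = sum_j alpha_j (S_{t-j} + N_{t-j}) + E_t
    S[t] = alpha @ (S[t - p:t] + N[t - p:t])[::-1] + sigma_E * rng.standard_normal()

L_stoch = S[p:] + N[p:]            # stochastic part S + N of the observations
assert np.isfinite(L_stoch).all() and L_stoch.std() < 20.0
```

The zero initialization of the first p values is exactly the warm-up issue discussed later in this section.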


E_t and N_t are members of two infinite white noise processes (E_t)_{Δt} and (N_t)_{Δt}, characterized by independent and identically distributed random variables with expectation E{E_t} = 0, E{N_t} = 0, and variances σ_E² resp. σ_N²,

(E_t)_{Δt} ∼ WN(0, 1_∞ σ_E²) ;   (N_t)_{Δt} ∼ WN(0, 1_∞ σ_N²) ;   Σ{E_t, N_k} = 0 ;   t, k ∈ Z .   (32)

The goal is to find the best linear unbiased estimator X̂, cf. (4), for the true unknown parameters ξ with minimal variance Σ{X̂}. We use a whitening strategy and reorder the AR(p)-process (31) with respect to the white noise component E_t,

S_t − α_1 (S_{t−1} + N_{t−1}) − α_2 (S_{t−2} + N_{t−2}) − … − α_p (S_{t−p} + N_{t−p}) = E_t ,   (33)

expand both sides with N_t and re-write the equation for t, t−1, t−2, … in matrix-vector form

⎡ ⋱    ⋱      ⋱              ⎤ ⎡       ⋮         ⎤   ⎡       ⋮         ⎤
⎢ −α_p  ⋯ −α_2 −α_1  1            ⎥ ⎢ S_{t−2} + N_{t−2} ⎥   ⎢ E_{t−2} + N_{t−2} ⎥
⎢      −α_p  ⋯ −α_2 −α_1  1       ⎥ ⎢ S_{t−1} + N_{t−1} ⎥ = ⎢ E_{t−1} + N_{t−1} ⎥   (34)
⎢           −α_p  ⋯ −α_2 −α_1  1  ⎥ ⎢ S_t + N_t        ⎥   ⎢ E_t + N_t        ⎥
⎣                ⋱    ⋱      ⋱ ⎦ ⎣       ⋮         ⎦   ⎣       ⋮         ⎦
              H                         S + N                  E + N

or

H (S + N) = E + N ,   (35)

which can be interpreted as a linear filter (cf. Fig. 3). This pre-whitening filter is characterized by the following attributes: it is linear, time-invariant, non-recursive and causal. It can be applied to the observation

Fig. 3 Pre-whitening filter: the correlated input S + N passes through the filter H and yields the white output E + N

equations (2) by multiplication from the left side with the filter matrix H. Afterwards, they read

L̄ = Ā ξ + E + N   with   E + N ∼ WN(0, 1_∞ (σ_E² + σ_N²)) ,   (36)

where L = H L and A = H A denote the filtered measurements and the columnwise filtered design matrix. The colored noise S + N is transformed to white noise E + N by applying the filter matrix as shown in (35). The parameters α1 , . . . αp of the AR(p)-process can be derived in an iterative procedure from the residuals (cf. [7, 19, 23, 33]). The pre-whitening filter approach has a lot of pros like the flexibility in filter design and the possibility to cascade filters, the causality of the filter allows a sequential adjustment approach with a computational complexity of O{n}, which allows for an efficient computation in cache-conform algorithms and can be easily tailored for parallel computations with a scalability O{cores}. Despite the pros, also some cons can be identified as the filter can only deal with infinite, equispaced, uninterrupted (gapless) and outlier-free data sets. These needs are not realistic, when considering real measurement time series. These results in the open questions: • How to initialize the filter process? Often, filter warmup phases are proposed. This warm up automatically results in a loss of data [41], which should be avoided especially for short time series. • How to deal with data gaps and outliers? Should data gaps be filled with interpolated or approximated data? For instance Siemes [44, Sec. 4.3] proposed a sophisticated approach, where besides the deterministic part also the stochastic part is estimated for the gap filling values. All these points can be well treated by the covariance approach (cf. Sect. 3) and need no special attention. But, as mentioned above the covariance approach has the disadvantage of the loss of computational efficiency. Therefore, it is reasonable to propose a strategy for the combination of these two approaches.

5 Magic Square: Transition from an AR(p)-Process to a Covariance Function

At the end of the last section the pros and cons of the covariance approach as well as of the filter approach were stated. In this section we seek a combination of both approaches which benefits from the pros of both strategies. Therefore it is necessary to find the connection between the covariance sequence and the AR(p) coefficients.

The autocovariance sequence (γ^Z_{|t|})_{Δt} of a general time-discrete covariance stationary process (Z_t)_{Δt} is defined via

γ^Z_{|k|} := E{ (Z_t − E{Z_t}) (Z_{t−k} − E{Z_{t−k}}) } ;   k ∈ Z .   (37)


Introducing the AR(p)-process (S_t)_{Δt} given by (31), connected with the assumption of the white noise processes E_t and N_t, it can be stated that E{S_t} = 0. Therefore the autocovariance sequence (37) coincides with the autocorrelation sequence, i.e.,

γ^S_{|k|} = E{S_t S_{t−k}} ;   k ∈ Z .   (38)

For simplicity, but without restriction of the universality, only the signal part S_t and not the combined signal plus noise part S_t + N_t is considered in the following. The extension for the noise part is straightforward. The autocovariance γ^S_{|k|} for the lag k with respect to the AR(p) process (31) is obtained from

γ^S_{|k|} = α_1 E{S_{t−1} S_{t−k}} + α_2 E{S_{t−2} S_{t−k}} + … + α_p E{S_{t−p} S_{t−k}} + E{E_t S_{t−k}} ;   k ∈ Z .   (39)

For the evaluation of this equation for different k the following relations have to be considered. It can be stated that preceding signals S_{t−k}, k > 0, and the noise E_t at stage t are independent and therefore E{S_{t−k} E_t} = 0. Furthermore the signal S_t at stage t includes the noise E_t, which yields E{E_t S_t} = σ_E². Considering covariance stationarity, the covariances depend only on the time lag k and not on the stage t; consequently,

E{S_{t−1} S_{t−k}} = E{S_t S_{t−(k−1)}} = γ^S_{|k−1|} .   (40)

Evaluating (39) for different values of k yields

k = 0:    γ_0^S     = α_1 γ_1^S     + α_2 γ_2^S     + … + α_p γ_p^S + σ_E²
k = 1:    γ_1^S     = α_1 γ_0^S     + α_2 γ_1^S     + … + α_p γ_{p−1}^S
k = 2:    γ_2^S     = α_1 γ_1^S     + α_2 γ_0^S     + … + α_p γ_{p−2}^S
  ⋮
k = p:    γ_p^S     = α_1 γ_{p−1}^S + α_2 γ_{p−2}^S + … + α_p γ_0^S
k = p+1:  γ_{p+1}^S = α_1 γ_p^S     + α_2 γ_{p−1}^S + … + α_p γ_1^S
k = p+2:  γ_{p+2}^S = α_1 γ_{p+1}^S + α_2 γ_p^S     + … + α_p γ_2^S
  ⋮                                                                         (41)

These equations are known as the Yule-Walker equations [7, Sec. 3.2] and provide the connection between the AR(p) coefficients α_j, j = 1, …, p, and the covariance sequence (γ^S_{|t|})_{Δt}, t ∈ Z. The Yule-Walker equations can be divided into two parts. In the upper part only the covariances γ_0^S to γ_p^S are involved, whereas in the lower part each additional equation adds a new autocovariance γ_{p+i}^S to the system. It should be noted that the covariance sequence is recursively defined and is therefore


denoted as a recursively defined covariance sequence. The sequence of covariances is uniquely defined by the first p + 1 covariances γ_0^S, …, γ_p^S.

Rewriting the Yule-Walker equations (41) in matrix-vector notation helps for a better understanding,

⎡ γ_0^S     ⎤   ⎡ γ_1^S     γ_2^S     ⋯ γ_p^S     ⎤           ⎡ 1 ⎤
⎢ γ_1^S     ⎥   ⎢ γ_0^S     γ_1^S     ⋯ γ_{p−1}^S ⎥ ⎡ α_1 ⎤   ⎢ 0 ⎥
⎢ γ_2^S     ⎥   ⎢ γ_1^S     γ_0^S     ⋯ γ_{p−2}^S ⎥ ⎢ α_2 ⎥   ⎢ 0 ⎥
⎢   ⋮      ⎥ = ⎢   ⋮         ⋮           ⋮      ⎥ ⎢  ⋮  ⎥ + ⎢ ⋮ ⎥ σ_E² .   (42)
⎢ γ_p^S     ⎥   ⎢ γ_{p−1}^S γ_{p−2}^S ⋯ γ_0^S     ⎥ ⎣ α_p ⎦   ⎢ 0 ⎥
⎢ γ_{p+1}^S ⎥   ⎢ γ_p^S     γ_{p−1}^S ⋯ γ_1^S     ⎥           ⎢ 0 ⎥
⎢ γ_{p+2}^S ⎥   ⎢ γ_{p+1}^S γ_p^S     ⋯ γ_2^S     ⎥           ⎢ 0 ⎥
⎣   ⋮      ⎦   ⎣   ⋮         ⋮           ⋮      ⎦           ⎣ ⋮ ⎦

Reordering the first $p+1$ equations (upper part) with respect to the autocovariance sequence $\gamma^S_0, \gamma^S_1, \ldots, \gamma^S_p$, it follows that

$$\begin{aligned}
(-1)\gamma^S_0 + \alpha_1\gamma^S_1 + \alpha_2\gamma^S_2 + \ldots + \alpha_{p-1}\gamma^S_{p-1} + \alpha_p\gamma^S_p &= -\sigma_E^2\\
\alpha_1\gamma^S_0 + (-1+\alpha_2)\gamma^S_1 + \alpha_3\gamma^S_2 + \ldots + \alpha_p\gamma^S_{p-1} + 0\,\gamma^S_p &= 0\\
\alpha_2\gamma^S_0 + (\alpha_1+\alpha_3)\gamma^S_1 + (-1+\alpha_4)\gamma^S_2 + \ldots + 0\,\gamma^S_{p-1} + 0\,\gamma^S_p &= 0\\
&\ \ \vdots\\
\alpha_p\gamma^S_0 + \alpha_{p-1}\gamma^S_1 + \alpha_{p-2}\gamma^S_2 + \ldots + \alpha_1\gamma^S_{p-1} + (-1)\gamma^S_p &= 0
\end{aligned} \tag{43}$$

This can be expressed in matrix-vector notation by

$$\left(
\begin{bmatrix}
-1 & & & & & \\
\alpha_1 & -1 & & & & \\
\alpha_2 & \alpha_1 & -1 & & & \\
\vdots & \vdots & & \ddots & & \\
\alpha_{p-1} & \alpha_{p-2} & \alpha_{p-3} & \ldots & -1 & \\
\alpha_p & \alpha_{p-1} & \alpha_{p-2} & \ldots & \alpha_1 & -1
\end{bmatrix}
+
\begin{bmatrix}
0 & \alpha_1 & \alpha_2 & \ldots & \alpha_{p-1} & \alpha_p\\
0 & \alpha_2 & \alpha_3 & \ldots & \alpha_p & 0\\
0 & \alpha_3 & \alpha_4 & \ldots & 0 & 0\\
\vdots & \vdots & \vdots & & & \vdots\\
0 & \alpha_p & 0 & \ldots & 0 & 0\\
0 & 0 & 0 & \ldots & 0 & 0
\end{bmatrix}
\right)
\begin{bmatrix}
\gamma^S_0\\ \gamma^S_1\\ \gamma^S_2\\ \vdots\\ \gamma^S_{p-1}\\ \gamma^S_p
\end{bmatrix}
=
\begin{bmatrix}
-\sigma_E^2\\ 0\\ 0\\ \vdots\\ 0\\ 0
\end{bmatrix}. \tag{44}$$
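Assembled numerically, the small $(p+1)\times(p+1)$ system (44) delivers $\gamma^S_0, \ldots, \gamma^S_p$ from given AR($p$) coefficients, and the lower part of (41) then extends the sequence recursively. A minimal sketch, assuming numpy; the function name and the test values are ours, not from the text:

```python
import numpy as np

def ar_to_covariance(alpha, sigma_e2, n_lags):
    """Covariance sequence of an AR(p) process from its coefficients.

    Builds the reorganized Yule-Walker system (44): a lower-triangular
    part (-1 on the diagonal, alpha_1..alpha_p below) plus a shifted
    part with entries alpha_{k+j}, applied to (gamma_0, ..., gamma_p)^T
    with right-hand side (-sigma_e2, 0, ..., 0)^T.
    """
    a = np.asarray(alpha, dtype=float)
    p = len(a)
    A = np.zeros((p + 1, p + 1))
    for k in range(p + 1):
        A[k, k] = -1.0                      # diagonal of the first matrix
        for j in range(k):
            A[k, j] += a[k - j - 1]         # lower-triangular alphas
        for j in range(1, p + 1):
            if k + j <= p:
                A[k, j] += a[k + j - 1]     # shifted second matrix
    rhs = np.zeros(p + 1)
    rhs[0] = -sigma_e2
    gamma = list(np.linalg.solve(A, rhs))
    # lower part of (41): recursion for the lags beyond p
    for k in range(p + 1, n_lags + 1):
        gamma.append(sum(a[j] * gamma[k - j - 1] for j in range(p)))
    return np.array(gamma)
```

For an AR(1) process with $\alpha_1 = 0.5$ and $\sigma_E^2 = 1$ this reproduces the well-known $\gamma_0 = \sigma_E^2/(1-\alpha_1^2) = 4/3$ and $\gamma_k = \alpha_1 \gamma_{k-1}$.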

Together with the equations of the lower part of (41),

$$\begin{aligned}
\gamma^S_{p+1} &= \alpha_1\gamma^S_p + \alpha_2\gamma^S_{p-1} + \ldots + \alpha_p\gamma^S_1\\
\gamma^S_{p+2} &= \alpha_1\gamma^S_{p+1} + \alpha_2\gamma^S_p + \ldots + \alpha_p\gamma^S_2\\
&\ \ \vdots
\end{aligned} \tag{45}$$

these equations form the reorganized Yule-Walker equations [42, 43].

946 W.-D. Schuh and J. M. Brockmann

The covariance sequence $\{\gamma^S_{|t|}\}_{\Delta t}$, $t \in \mathbb{Z}$, can be deduced directly from the coefficients $\alpha_j$, $j = 1, \ldots, p$, of the AR($p$)-process via the solution of (44) and the recursion formulas (45). And vice versa, the coefficients $\alpha_j$, $j = 1, \ldots, p$, of the AR($p$)-process can be derived from the covariance sequence $\gamma^S_0, \ldots, \gamma^S_p$ via the solution of the upper part of the equation system (42); the error variance $\sigma_E^2$ then results from the first row of (42). The Yule-Walker equations and their reorganized representation thus allow for a direct transformation between the autocovariance sequence and the coefficients of an AR($p$)-process. It should be noted that the stochastic process defined by an AR($p$)-process (31) yields a special type of covariance sequence, which is denoted as a recursively defined covariance sequence.

To find the representation of the AR($p$)-process in the spectral domain, we reformulate the AR($p$)-process as a moving average process. With the reordered infinite sequence of deterministic coefficients

$$\{\alpha_t\}_{\Delta t} := (\ldots,\ 0,\ 1,\ -\alpha_1,\ -\alpha_2,\ \ldots,\ -\alpha_p,\ 0,\ \ldots), \tag{46}$$

the convolution of this sequence with the sequence of the stochastic signal $\{S_t\}_{\Delta t}$ results in a white noise sequence $\{E_t\}_{\Delta t}$,

$$\{\alpha_t\}_{\Delta t} * \{S_t\}_{\Delta t} = \{E_t\}_{\Delta t}. \tag{47}$$
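Relation (47) can be checked numerically: filtering a simulated AR($p$) signal with the coefficient sequence (46) returns the driving noise exactly after $p$ warm-up samples. A short sketch; the AR(2) values below are illustrative, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = [0.6, -0.3]                 # assumed AR(2) coefficients
p, n = len(alpha), 1000

e = rng.standard_normal(n)          # white noise E_t
s = np.zeros(n)                     # AR(p) signal S_t, cf. (31)
for t in range(n):
    s[t] = e[t] + sum(alpha[j] * s[t - 1 - j]
                      for j in range(p) if t - 1 - j >= 0)

h = np.concatenate(([1.0], -np.asarray(alpha)))   # sequence (46)
w = np.convolve(s, h)[:n]                         # causal convolution (47)

# whitened signal equals E_t exactly for t >= p
assert np.allclose(w[p:], e[p:])
```

The equality is exact (up to floating point), not merely statistical, since the filter simply inverts the recursion that generated $S_t$.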

The representation in the spectral domain is given by

$$d\widehat{E}(\nu) = H(\nu)\, d\widehat{S}(\nu), \tag{48}$$

where $H(\nu)$ denotes the transfer function of $\{\alpha_t\}_{\Delta t}$, which is defined by the Fourier transform of the non-periodic infinite sequence $\{\alpha_t\}_{\Delta t}$,

$$H(\nu) = \mathcal{F}\big\{\alpha_t\big\}_{\Delta t} = 1 - \sum_{j=1}^{p} \alpha_j\, e^{-i 2\pi \nu j \Delta t}, \tag{49}$$

which is defined in agreement with the definition in Fig. 1. The spectral representation of the signal immediately follows from (48),

$$d\widehat{S}(\nu) = \frac{d\widehat{E}(\nu)}{1 - \sum_{j=1}^{p} \alpha_j\, e^{-i 2\pi \nu j \Delta t}}. \tag{50}$$

As the autocovariance sequence in the time domain is represented by the expectation of the correlation of the stochastic process with itself, the correlation in the time domain can be represented by an element-wise multiplication with the conjugate quantities in the spectral domain. Whereas the denominator is deterministic, the stochastic part in the numerator is transformed by the expectation operator to $\sigma_E^2$ [42, Fig. 5].

Fig. 4 Magic Square of an autoregressive process of order $p$: in the time domain, the AR($p$) process $S_t = \sum_{j=1}^{p} \alpha_j S_{t-j} + E_t$ and the covariance sequence $\{\gamma^S_{|t|}\}_{\Delta t}$, connected by the reorganized Yule-Walker equations (44); in the spectral domain, $d\widehat{S}(\nu) = d\widehat{E}(\nu)\big/\big(1 - \sum_{j=1}^{p} \alpha_j e^{-i 2\pi\nu j \Delta t}\big)$ and the power spectrum $\gamma^S(\nu) = \sigma_E^2 \big/ \big(1 + \alpha_p^2 - 2\alpha_p \cos(2\pi\nu p \Delta t) + \sum_{j=1}^{p-1}\big[\alpha_j^2 - 2\cos(2\pi\nu j \Delta t)\,\big(\alpha_j - \sum_{k=1}^{p-j} \alpha_k \alpha_{k+j}\big)\big]\big)$

With the help of the Magic Square representation, these relations can be documented in the time domain as well as in the spectral domain [42, 43]. We can find closed analytical relations between all the quantities if we assume that the AR($p$) coefficients $\alpha_j$, $j = 1, \ldots, p$, are given (cf. Fig. 4). In particular, the power spectrum can be expressed in a closed form, which is not possible if we use the standard Wiener-Chintschin theorem for the transformation of the covariance sequence into the power spectrum (cf. Fig. 1).
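The closed-form power spectrum shown in Fig. 4 agrees with the direct evaluation $\sigma_E^2/|H(\nu)|^2$ via the transfer function (49); a numerical cross-check with illustrative coefficients (all values below are assumptions for the sketch):

```python
import numpy as np

alpha = np.array([0.5, -0.25, 0.1])   # assumed AR(3) coefficients
sigma_e2, dt = 2.0, 1.0
p = len(alpha)
nu = np.linspace(-0.5, 0.5, 101)

# direct evaluation: H(nu) = 1 - sum_j alpha_j exp(-i 2 pi nu j dt), cf. (49)
H = 1.0 - sum(alpha[j - 1] * np.exp(-1j * 2 * np.pi * nu * j * dt)
              for j in range(1, p + 1))
spec_direct = sigma_e2 / np.abs(H) ** 2

# closed-form denominator as given in Fig. 4
denom = 1.0 + alpha[p - 1] ** 2 \
        - 2 * alpha[p - 1] * np.cos(2 * np.pi * nu * p * dt)
for j in range(1, p):
    c_j = sum(alpha[k - 1] * alpha[k + j - 1] for k in range(1, p - j + 1))
    denom += alpha[j - 1] ** 2 \
             - 2 * np.cos(2 * np.pi * nu * j * dt) * (alpha[j - 1] - c_j)
spec_closed = sigma_e2 / denom

assert np.allclose(spec_direct, spec_closed)
```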

6 Combined Approach

In the last sections, two different approaches to deal with correlated time series in the least squares collocation approach were compared. In the covariance approach, the correlations are expressed by the covariance function and collected in the variance/covariance matrix; the Cholesky approach is used for the decorrelation of the measurements. In contrast to the covariance approach, the filter approach uses an AR($p$)-process to describe the correlations in the time series. Applying a causal, non-recursive filter to the measurements (35) as well as to the columns of the design matrix, an alternative problem (36) with uncorrelated measurements is obtained.

In the last section it was shown that the Yule-Walker equations provide a direct transition between the covariance sequence and the coefficients of the AR($p$)-process. Because of the equispaced data distribution required for the AR($p$)-process, the continuous covariance function becomes a recursively defined covariance sequence. Considering covariance stationarity, the covariance matrix of the filter approach for equispaced data has Toeplitz structure; to be more precise, it is an infinite Toeplitz-structured matrix.
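For the direction covariance sequence $\to$ AR coefficients, the upper (Toeplitz) part of the Yule-Walker system (42) can be solved with Durbin's $O(p^2)$ recursion instead of a general solver. A minimal sketch (function name ours, no stability checks):

```python
import numpy as np

def covariance_to_ar(gamma):
    """Durbin recursion: AR(p) coefficients alpha_1..alpha_p and the
    noise variance sigma_E^2 from gamma = (gamma_0, ..., gamma_p),
    solving the upper Toeplitz part of the Yule-Walker equations.
    """
    g = np.asarray(gamma, dtype=float)
    p = len(g) - 1
    a = np.zeros(0)
    v = g[0]                                    # innovation variance
    for k in range(1, p + 1):
        kappa = (g[k] - a @ g[k - 1:0:-1]) / v  # reflection coefficient
        a = np.concatenate((a - kappa * a[::-1], [kappa]))
        v *= 1.0 - kappa ** 2
    return a, v
```

For the AR(1) covariances $\gamma_0 = 4/3$, $\gamma_1 = 2/3$ this recovers $\alpha_1 = 0.5$ and $\sigma_E^2 = 1$.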


In the case of equispaced data, the covariance approach yields a Toeplitz-structured covariance matrix as well. But in contrast to the infinite matrix of the filter approach, the covariance matrix in the covariance approach is restricted to a finite dimension. If we denote the number of measurements by $n$, then the variance/covariance matrix is of dimension $n \times n$ and can be seen as a cut-out diagonal block of the infinite Toeplitz matrix. In contrast to an infinite system, the inverse of a finite Toeplitz matrix is not Toeplitz-structured again. Fast $O(n \log n)$ solution strategies based on Fourier transformations are only possible for infinite or periodic systems [6], but also for band Toeplitz systems [40, App. A.7]. For finite Toeplitz systems, the Levinson-Durbin algorithm [11, 25, 46] provides a very efficient solution with respect to the number of operations, $O(n^2)$, and the required memory, $O(n)$.

In the last section, the duality between the filter and the covariance approach was already elaborated. Now we would like to answer the question whether it is possible to use this duality and the given relations for an efficient and flexible decorrelation. The filter approach results in a causal, finite, non-recursive filter. At the end of Sect. 3.1, a causal, non-recursive filter was constructed by the Cholesky approach as well (by computing $(\overline{R}^{T})^{-1}$ by the recursive forward edging strategy (84)). We use this analogy to identify a direct connection between these two strategies. To find the similarities between the two approaches, the Yule-Walker equation (42) is rewritten as

$$\begin{bmatrix}
\gamma^S_0 & \gamma^S_1 & \ldots & \gamma^S_p & \gamma^S_{p+1} & \ldots\\
\gamma^S_1 & \gamma^S_0 & \ldots & \gamma^S_{p-1} & \gamma^S_p & \ldots\\
\vdots & \vdots & \ddots & \vdots & \vdots & \\
\gamma^S_p & \gamma^S_{p-1} & \ldots & \gamma^S_0 & \gamma^S_1 & \ldots\\
\gamma^S_{p+1} & \gamma^S_p & \ldots & \gamma^S_1 & \gamma^S_0 & \ldots\\
\vdots & \vdots & & \vdots & \vdots & \ddots
\end{bmatrix}
\begin{bmatrix}
1\\ -\alpha_1\\ \vdots\\ -\alpha_p\\ 0\\ \vdots
\end{bmatrix}
=
\begin{bmatrix}
\sigma_E^2\\ 0\\ \vdots\\ 0\\ 0\\ \vdots
\end{bmatrix}. \tag{51}$$

With respect to the covariance approach, the covariance matrix $\Sigma$ is factorized into $\overline{R}^{-1}\big(\overline{R}^{-1}\big)^T$ (cf. Fig. 2), where $\overline{R}^{-1}$ can be computed by recursive forward edging (84). Using the matrix $\Sigma$ instead of $N^{-1}$ and additionally $\overline{R}^{-1}$ instead of $\overline{R}$ compared to Fig. 10d for the backward Cholesky approach, the following equation results:

$$\begin{bmatrix}
\gamma^S_0 & \gamma^S_1 & \ldots & \gamma^S_j\\
\gamma^S_1 & \gamma^S_0 & \ldots & \gamma^S_{j-1}\\
\vdots & \vdots & \ddots & \vdots\\
\gamma^S_j & \gamma^S_{j-1} & \ldots & \gamma^S_0
\end{bmatrix}
\begin{bmatrix}
\bar r^{(-1)}_{jj}\, \bar r^{(-1)}_{jj}\\
\bar r^{(-1)}_{j-1,j}\, \bar r^{(-1)}_{jj}\\
\vdots\\
\bar r^{(-1)}_{1j}\, \bar r^{(-1)}_{jj}
\end{bmatrix}
=
\begin{bmatrix}
1\\ 0\\ \vdots\\ 0
\end{bmatrix}
\qquad \text{for } j = 0, \ldots, n. \tag{52}$$

This equation corresponds to the problem of finding the first row of the inverse of the matrix $\Sigma(1{:}j, 1{:}j)$. This can be done very efficiently for large Toeplitz systems by the Levinson-Durbin algorithm [11, 25], but for smaller systems a forward edging process corresponding to (80) is also suitable. Comparing (52) and (51), we see that for $j \geq p$ the two systems correspond to each other, and the solution of (52) can be found by an element-wise comparison with the solution vector in (51):

$$\begin{aligned}
\bar r^{(-1)}_{jj} &= \frac{1}{\sigma_E};\\
\bar r^{(-1)}_{j-k,j} &= -\frac{\alpha_k}{\sigma_E}; \qquad k = 1, \ldots, p\\
\bar r^{(-1)}_{j-k,j} &= 0; \qquad\quad\ \ k = p+1, \ldots
\end{aligned} \tag{53}$$

It can be seen that the Cholesky factors for the rows $j \geq p$ can be immediately deduced from the coefficients of the AR($p$)-process, only by a linear scaling with respect to $\sigma_E$. The same values, shifted by one column, are obtained row-by-row. Thus, the Cholesky factorization of the Toeplitz-structured covariance matrix corresponding to the AR($p$)-process is a lower band matrix with $p+1$ elements per row, which again has Toeplitz structure for the rows $j \geq p$. For the rows $j < p$ the coefficients can be determined by (52) in a rigorous way, because of the finite shape of the covariance matrix. On the left side of Fig. 5 the results for an AR(6) process are depicted. The blue elements mark the Toeplitz structure with the scaled AR(6) coefficients deduced from (53), or, strictly speaking, from the recursion (second) part of the Yule-Walker equations (42), whereas the yellow and red elements represent the warm-up phase determined by (52).

Fig. 5 Causal, non-recursive filter for an AR(6) process deduced by the backward Cholesky approach with recursive forward edging. After the scaling (53), the blue elements correspond exactly to the AR($p$) coefficients. The yellow and red part at the beginning represents the warm-up part. On the left side a finite sequence with 21 equispaced data points is shown, whereas on the right side the data point at position 11 is missing. The green elements mark coefficients which are affected by this data gap
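The scaling (53) can be verified numerically. In the sketch below (all values assumed) we build the Toeplitz covariance matrix of an AR(1) process and invert its standard lower-triangular Cholesky factor, which plays the role of the flipped factor of the backward approach; the rows $j \geq p$ indeed carry $(-\alpha_1, 1)/\sigma_E$:

```python
import numpy as np

alpha, sigma_e = 0.5, 1.0          # assumed AR(1) coefficient and noise std
n = 8
# covariance sequence gamma_k = sigma_E^2 * alpha^k / (1 - alpha^2)
gamma = sigma_e**2 * alpha**np.arange(n) / (1.0 - alpha**2)
Sigma = gamma[np.abs(np.subtract.outer(np.arange(n), np.arange(n)))]

L = np.linalg.cholesky(Sigma)      # Sigma = L L^T
W = np.linalg.inv(L)               # decorrelator: W Sigma W^T = I
assert np.allclose(W @ Sigma @ W.T, np.eye(n), atol=1e-10)

# steady-state rows (here p = 1): (-alpha, 1)/sigma_E, cf. (53)
for j in range(1, n):
    assert np.isclose(W[j, j], 1.0 / sigma_e)
    assert np.isclose(W[j, j - 1], -alpha / sigma_e)
```

Only row 0 (the warm-up row) deviates from this pattern; it carries the scaling $1/\sqrt{\gamma_0}$ instead.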


This matrix can be seen as a causal, non-recursive filter with variable filter coefficients for the warm-up phase. These variable coefficients are due to the truncation of the infinite sequence to a finite sequence at the beginning. Because of the causality, no additional disturbances arise at the end of the sequence. It has to be noted that the corresponding variance/covariance matrix is a dense matrix, but the factorized form derived by the backward Cholesky approach is a band matrix, due to the recursively defined covariance sequence. In the common case the relation is the other way around: if the system is sparse (e.g., because of a finite covariance function), the Cholesky factorized system is also sparse, but its inverse is dense. Here, we start with the dense system and compute the Cholesky factorization of the sparse inverse system, whose covariance function is deduced from a finite AR($p$) process.

Real measurement series are not only characterized by a finite number of measurements but are also affected by data gaps. In many practical filter applications an interpolated value is used to preserve the equispaced structure [44, Sec. 4.3]. In contrast to this, the covariance approach has no problems with these data gaps: only the corresponding columns and rows of the variance/covariance matrix have to be eliminated to account for the missing data. Using equispaced data, the elimination of the rows and columns destroys, or rather disturbs, the Toeplitz structure. This is of special interest for the rows $k \geq p$, where the Toeplitz structure is also reflected by the causal filter. A detailed look at the recursion part of the Yule-Walker equations (42) shows that, starting with the data gap, the next $p$ recursion equations are disturbed, but afterwards the Toeplitz structure is restored and the AR($p$) coefficients fulfill the recursion equations. This means that only $p$ rows are influenced by a data gap.
The disturbed coefficients can be computed by the Cholesky approach, starting with a stack of $p$ standard rows, the data gap, and $p$ additional rows. This computation can then be inserted into the equispaced filter (matrix). On the right side of Fig. 5, the impact of a data gap at point 11 illustrates this behavior. Figure 6 shows the situation for several data gaps. Different situations can occur: isolated gaps as discussed before (here at point 11), combined data gaps, where the space between two gaps is smaller than the process order $p$ (here at points 20 and 23), or four gaps (at points 35, 37, 38, and 42) interacting with each other (Fig. 6, left). On the right side a continuous data gap from point 20 to point 40 is shown. All these situations can be fixed by a local Cholesky approach, where the corresponding divergent rows can be computed locally in a strict manner by the Cholesky approach.
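The locality of a gap's influence can be illustrated numerically. In the sketch below (AR(1), so $p = 1$; all values assumed) one epoch is removed from an equispaced series: the filter row immediately after the gap bridges a lag-2 step and therefore deviates, while one row later the steady coefficients $(-\alpha_1, 1)/\sigma_E$ reappear:

```python
import numpy as np

alpha, sigma_e, n, gap = 0.5, 1.0, 12, 6   # assumed example values
gamma = sigma_e**2 * alpha**np.arange(n) / (1.0 - alpha**2)
Sigma = gamma[np.abs(np.subtract.outer(np.arange(n), np.arange(n)))]

keep = [i for i in range(n) if i != gap]   # eliminate row/column of the gap
Sigma_gap = Sigma[np.ix_(keep, keep)]
W = np.linalg.inv(np.linalg.cholesky(Sigma_gap))

# the row right after the gap is disturbed (its predecessor is two
# epochs away), so its coefficient differs from -alpha/sigma_E ...
assert not np.isclose(W[gap, gap - 1], -alpha / sigma_e)
# ... but p = 1 rows later the steady filter row is back
assert np.isclose(W[gap + 1, gap + 1], 1.0 / sigma_e)
assert np.isclose(W[gap + 1, gap], -alpha / sigma_e)
```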

7 Resumé

Huge variance/covariance matrices are the bottleneck of least squares collocation. In this article, different approaches are discussed to overcome this problem. The covariance approach is on the one hand extremely flexible with respect to the data distribution, can rigorously handle data sets with a finite number of data, and can treat data gaps without any problems. But the numerical effort, especially for huge data sets, is enormous. Efficient numerical methods, high performance computing, but also finite covariance functions in connection with sparse solvers

Fig. 6 Causal, non-recursive filter for an AR(6) process deduced by the backward Cholesky approach with recursive forward edging. On the left side a finite sequence with 50 equispaced data points with missing data at the points 11, 20, 23, 35, 37, 38, and 42 is shown, whereas on the right side a continuous data gap from point 20 to point 40 is simulated

are necessary to handle such huge systems. For the solution of the symmetric covariance systems, the Cholesky algorithm is a very common and efficient tool. Different decorrelation strategies based on the Cholesky factorization are discussed, and a large variety of ways to design causal and anti-causal, recursive and non-recursive filters is shown. Further strategies are discussed to handle Toeplitz systems for equispaced data sets. The filter approach, where the correlations in an equispaced time series are represented by an autoregressive process, seems to be unbeatable for this equispaced data distribution: the resulting discrete, causal, non-recursive filters can be applied very efficiently. But for real data applications and real measurement series, the large drawback of the filter approach is that it works only on infinite equispaced data series; data gaps are not acceptable either. This paper shows a strategy to combine both approaches, selecting the best properties from each. Based on the Yule-Walker equations, a recursively defined covariance sequence can be defined which allows one to switch between the covariance approach and the filter approach. Thus an efficient algorithm is deduced which can handle, in a rigorous and efficient way, finite sequences of equispaced data which can include data gaps.

Appendix A Cholesky Approach: Revisited

The Cholesky approach was first published posthumously by [3] in the Bulletin Géodésique, based on an unpublished manuscript by [10]. The solution of symmetric linear equation systems is a frequently occurring problem in numerical mathematics and data analysis. The efficient and stable solution of the normal equations as well as the factorization and inversion of covariance matrices is an inherent task in least squares adjustment and least squares collocation. Thus, many geodesists use the Cholesky approach and have developed a lot of strategies to adapt this method to their special applications [28, 32]. Sparse solvers [37, 45] as well as partial inversion strategies [2, 17] have been developed, studies on the numerical stability and the influence of rounding errors in huge systems have been performed [12, 27], and strategies to generate correlated signals for Monte Carlo approaches [1] have been developed as well. Usually the Cholesky factorization is connected to symmetric positive definite systems, but Fig. 7 also mentions possible applications to infinite, rectangular, asymmetric and ill-posed systems. These systems are out of the scope of this article, but in many situations the Cholesky approach can be applied to them in an extremely efficient and numerically stable way.

Fig. 7 Application of the Cholesky approach to different types of systems (rectangular, infinite, asymmetric, ill-posed, and symmetric; the symmetric case comprising singular, indefinite, and positive definite systems, dense with block cyclic or Toeplitz 1D/2D structure, or sparse with supernodal structure). This figure is focused on the application to symmetric systems and shows the variety of different strategies only for this special type

A.1 Cholesky Solution

The Cholesky approach was originally developed for the solution of a symmetric linear equation system by the decomposition into a lower and an upper triangular matrix. In contrast to the LU decomposition, the diagonal elements of both matrices are chosen equal to eliminate the remaining degree of freedom. For symmetric systems the hermitian matrix $N \in \mathbb{C}^{n \times n}$ is decomposed into triangular matrices, where one is the conjugate transpose of the other. In this work we choose the notation

$$N = R^H R, \tag{54}$$

where $R$ denotes an upper (right) triangular matrix and $R^H$ its conjugate transposed (hermitian) form. If $N \in \mathbb{R}^{n \times n}$ is a positive definite, real-valued, symmetric matrix, the decomposition is real-valued, and thus

$$N = R^T R, \tag{55}$$

with $R \in \mathbb{R}^{n \times n}$. This approach, denoted as Cholesky forward factorization, is often used for least squares problems, especially for Gauss-Markov models and adjustment models with condition equations. The resulting positive definite normal equations can be decomposed very efficiently (roughly twice as efficiently as with the LU decomposition) and in a numerically stable way [12, 27] by the Cholesky approach. The solution $x \in \mathbb{R}^n$ of the linear symmetric system

$$N x = n \tag{56}$$

with the symmetric matrix $N \in \mathbb{R}^{n \times n}$ and the right-hand side $n \in \mathbb{R}^n$ can be efficiently determined after the factorization

$$R^T \underbrace{R x}_{:=z} = n \tag{57}$$

in a two-step approach by solving the lower triangular system

$$R^T z = n \tag{58}$$

for the auxiliary unknowns $z \in \mathbb{R}^n$ in a forward substitution step and then solving the upper triangular system

$$R x = z \tag{59}$$

for the unknowns $x$ in a backward substitution step. The formulas for each single coefficient of the forward and the backward substitution can be derived immediately from (58),

$$z_i = \Big( n_i - \sum_{k=1}^{i-1} r_{ki}\, z_k \Big) \Big/ r_{ii}, \qquad i = 1, \ldots, n, \tag{60}$$

and (59),

$$x_i = \Big( z_i - \sum_{k=i+1}^{n} r_{ik}\, x_k \Big) \Big/ r_{ii}, \qquad i = n, \ldots, 1. \tag{61}$$

The coefficients of the triangular matrix $R$ can be derived from the identity (55). Written in expanded form,

$$\begin{bmatrix}
n_{11} & n_{12} & \ldots & n_{1n}\\
n_{12} & n_{22} & \ldots & n_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
n_{1n} & n_{2n} & \ldots & n_{nn}
\end{bmatrix}
=
\begin{bmatrix}
r_{11} & & & \\
r_{12} & r_{22} & & \\
\vdots & \vdots & \ddots & \\
r_{1n} & r_{2n} & \ldots & r_{nn}
\end{bmatrix}
\begin{bmatrix}
r_{11} & r_{12} & \ldots & r_{1n}\\
 & r_{22} & \ldots & r_{2n}\\
 & & \ddots & \vdots\\
 & & & r_{nn}
\end{bmatrix}, \tag{62}$$

it can be seen immediately that $n^2$ identities, $\frac{1}{2}n(n+1)$ of them linearly independent, result from this equation. They can be used to derive the Cholesky coefficients by

$$\left.\begin{aligned}
r_{ii} &= \sqrt{\, n_{ii} - \sum_{k=1}^{i-1} r_{ki}^2 \,}\\
r_{ij} &= \Big( n_{ij} - \sum_{k=1}^{i-1} r_{ki}\, r_{kj} \Big) \Big/ r_{ii}, \qquad j = i+1, \ldots, n
\end{aligned}\right\} \quad i = 1, \ldots, n. \tag{63}$$

It should be mentioned that the three required loops with respect to $i$, $j$ and $k$ can be organized in accordance with the storage scheme used for the matrices. These numerical considerations are not in the focus of this article (see for instance [40, p. 138ff]). Figure 8 summarizes the four steps which are required to determine the solution of a linear symmetric system with the Cholesky approach. It should be mentioned that the factorization step for the off-diagonal elements exactly corresponds to the forward substitution step. The first three steps use a column access only, whereas for the backward substitution a row and a column access is proposed. These figures provide a feeling for what happens during the factorization of sparse input matrices. If an element $n_{ij}$ is initially zero, it remains unchanged during the factorization if the scalar product of the column vectors $r_{ki}$ and $r_{kj}$ vanishes. This is the case especially for envelope systems, where the column above a zero element is zero as well. Therefore the envelope structure of a sparse symmetric system is preserved during the Cholesky factorization [15, 28]. Special numbering schemes have been developed [8, 37, 40, 45] to determine the order of the parameters in the matrix, so as to optimize the structure and preserve sparsity also during the factorization step. The design of covariance matrices by finite covariance functions [13, 21, 34, 42] yields sparse systems, which can be efficiently solved by the Cholesky approach.

Fig. 8 Graphical interpretation of the factorization steps (63), the forward substitution step (60) and the backward substitution step (61). (a) Factorization: diagonal element. (b) Factorization: off-diagonal element. (c) Substitution: forward. (d) Substitution: backward

Figure 8 can also be used to get an impression of the case of an indefinite system. In the factorization step of a diagonal element, the value under the square root can become negative; therefore it is necessary to introduce the imaginary unit $i$. In case this happens, the divisor in all off-diagonal factorization steps of that row is imaginary, and therefore the entire row consists of imaginary values. The main operations during the whole factorization and substitution steps are scalar products between two columns. If a row contributes imaginary values to both column elements, the result is not complex anymore but real-valued; the product of these two elements must then not be added during the computation of the scalar product but subtracted. Therefore, indefinite systems can be solved easily by the Cholesky approach by introducing an index vector to indicate which rows involve imaginary numbers. As the normal equations of the commonly used Gauss-Markov model, as well as the covariance matrices, are positive definite systems, this extension of the Cholesky approach is rarely used in geodesy. But the normal equations of the Gauss-Helmert model and of the Gauss-Markov model with restrictions are indefinite systems, and the above-mentioned extension is crucial to benefit from the stable and efficient Cholesky approach for indefinite systems as well.
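The factorization (63) and the two substitution steps (60), (61) combine into a compact solver. A plain-Python sketch for a dense positive definite system (function name ours; no pivoting or error checks):

```python
import math

def cholesky_solve(N, n_rhs):
    """Solve N x = n_rhs via N = R^T R (63), forward substitution
    R^T z = n (60) and backward substitution R x = z (61)."""
    n = len(N)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):                          # factorization (63)
        R[i][i] = math.sqrt(N[i][i] - sum(R[k][i] ** 2 for k in range(i)))
        for j in range(i + 1, n):
            R[i][j] = (N[i][j]
                       - sum(R[k][i] * R[k][j] for k in range(i))) / R[i][i]
    z = [0.0] * n
    for i in range(n):                          # forward substitution (60)
        z[i] = (n_rhs[i] - sum(R[k][i] * z[k] for k in range(i))) / R[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):                # backward substitution (61)
        x[i] = (z[i]
                - sum(R[i][k] * x[k] for k in range(i + 1, n))) / R[i][i]
    return x
```

For example, `cholesky_solve([[4.0, 2.0], [2.0, 3.0]], [8.0, 7.0])` returns the solution of that small symmetric system.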

A.2 Cholesky Inversion

In this section we are looking for the inverse of the matrix $N$, where we assume that the Cholesky factor $R$ has already been computed. A commonly used strategy is to invert the triangular matrix $R$ and apply

$$N^{-1} = \big( R^T R \big)^{-1} = R^{-1} \big( R^{T} \big)^{-1}. \tag{64}$$

Here we want to elaborate a direct strategy to switch from $R$ to the inverse matrix $N^{-1}$. The positive definite matrix $N$ and its inverse $N^{-1}$ can be written in block notation as

$$N = \begin{bmatrix} N_{11} & N_{12}\\ N_{12}^T & N_{22} \end{bmatrix}, \qquad N^{-1} = \begin{bmatrix} N_{11}^{(-1)} & N_{12}^{(-1)}\\ \big(N_{12}^{(-1)}\big)^T & N_{22}^{(-1)} \end{bmatrix}, \tag{65}$$

where $N_{11}^{(-1)}$ denotes the upper left block of the inverse $N^{-1}$. It has to be carefully distinguished from the inverse of the upper left block of the matrix $N$, which is denoted by $N_{11}^{-1}$. The partitioning of the matrix into blocks is performed in such a way that $N_{11} \in \mathbb{R}^{n_1 \times n_1}$ and $N_{22} \in \mathbb{R}^{n_2 \times n_2}$ are quadratic. The identity

$$N N^{-1} = \begin{bmatrix} N_{11} & N_{12}\\ N_{12}^T & N_{22} \end{bmatrix} \begin{bmatrix} N_{11}^{(-1)} & N_{12}^{(-1)}\\ \big(N_{12}^{(-1)}\big)^T & N_{22}^{(-1)} \end{bmatrix} = \begin{bmatrix} 1_{n_1} & O_{n_1 \times n_2}\\ O_{n_2 \times n_1} & 1_{n_2} \end{bmatrix} \tag{66}$$

can be used to find a representation of the blocks of the inverse matrix. They read

$$N_{22}^{(-1)} = \big( N_{22} - N_{12}^T N_{11}^{-1} N_{12} \big)^{-1}, \tag{67}$$

$$N_{12}^{(-1)} = -N_{11}^{-1} N_{12}\, N_{22}^{(-1)}, \tag{68}$$

$$N_{11}^{(-1)} = N_{11}^{-1} + N_{11}^{-1} N_{12}\, N_{22}^{(-1)} N_{12}^T N_{11}^{-1}. \tag{69}$$

Performing a Cholesky factorization of the block-partitioned matrix $N$,

$$\begin{bmatrix} N_{11} & N_{12}\\ N_{12}^T & N_{22} \end{bmatrix} = \begin{bmatrix} R_{11}^T & \\ R_{12}^T & R_{22}^T \end{bmatrix} \begin{bmatrix} R_{11} & R_{12}\\ & R_{22} \end{bmatrix}, \tag{70}$$

the relations between the matrices $N_{ij}$ and the matrices $R_{ij}$ are given by

$$N_{11} = R_{11}^T R_{11}, \qquad N_{12} = R_{11}^T R_{12}, \qquad N_{22} = R_{12}^T R_{12} + R_{22}^T R_{22}. \tag{71}$$

Substituting these quantities into (67)-(69), the well-known relations between the blocks of the inverse matrix and the Cholesky factorized block matrices can be derived as

$$N_{22}^{(-1)} = \big( R_{22}^T R_{22} \big)^{-1}, \tag{72}$$

$$N_{12}^{(-1)} = -R_{11}^{-1} R_{12}\, N_{22}^{(-1)}, \tag{73}$$

$$N_{11}^{(-1)} = \big( R_{11}^T R_{11} \big)^{-1} - N_{12}^{(-1)} R_{12}^T \big( R_{11}^{-1} \big)^T. \tag{74}$$

To avoid the inversion of large systems, the computations are performed by a backward substitution row-by-row, starting with the element $n_{nn}^{(-1)}$:

$$\left.\begin{aligned}
n_{nn}^{(-1)} &= \frac{1}{r_{nn}^2}\\
n_{ij}^{(-1)} &= -\frac{1}{r_{ii}} \sum_{k=i+1}^{n} r_{ik}\, n_{kj}^{(-1)}, \qquad j = i+1, \ldots, n\\
n_{ii}^{(-1)} &= \frac{1}{r_{ii}^2} - \frac{1}{r_{ii}} \sum_{k=i+1}^{n} r_{ik}\, n_{ik}^{(-1)}
\end{aligned}\right\} \quad i = n, \ldots, 1. \tag{75}$$

Figure 9 gives a graphical interpretation of (75). It shows the operations necessary to invert row $i$. At this step it is assumed that for the rows $i+1, \ldots, n$ the inverse elements are already computed and that for the rows $1, \ldots, i$ the Cholesky factorized form is available. By this backward substitution process the whole inverse can be computed. The computation of the inverse matrix, with the Cholesky factorization available, can be performed in the order of $O(\frac{1}{3}n^3)$ operations, which is twice as expensive as the Cholesky factorization step itself, where only $O(\frac{1}{6}n^3)$ operations are necessary. All in all, the inversion process (Cholesky factorization and recursive backward edging) is in the order of $O(\frac{1}{2}n^3)$.

It is well known that in general the inversion destroys sparsity, such that the inverse is typically a dense matrix. In contrast to many other engineering and research areas, where only the solution of the system is requested, in statistics, adjustment theory and geodesy the inverse of the system is of special interest, because the uncertainties and the covariances are deduced from the inverse. The knowledge of this information is crucial in most applications. Sometimes we are only interested in special elements of the inverse, e.g., the main diagonal elements, elements within a defined bandwidth, or off-diagonal elements between special nodes which are characterized by their neighborhood or by points directly connected by measurements. These are typically the entries which are already non-zero in the sparse normal equations. Error propagation within geodetic networks [37], Helmert blocking [47] or nested dissection [14] partitioning strategies to solve large finite element mesh systems, and kite structures [39, 40] to perform a spherical harmonic analysis with irregularly fully populated low degree and block-diagonal structured high degree fields are typical applications. In connection with the new adjustment of the North American horizontal datum [5], a special algorithm to compute the inverse elements of an envelope-structured sparse matrix was introduced and denoted as partial inverse. It is shown by Hanson [17] that all the inverse elements within the envelope can be computed in a strict manner without any information about the inverse elements outside of the envelope. This is due to the fact that the non-computed elements within each column of the inverse correspond exactly to the zero elements within the Cholesky-reduced row. Therefore, the scalar product in (75) (cf. also the computation of $N^{(-1)}_{i,i+1:n}$ in Fig. 9) for all elements within the envelope is not influenced by inverse elements that are positioned outside the envelope. This argument also holds for special structures coming from a nested dissection partitioning or for kite structures [2].

Fig. 9 Graphical interpretation of the inversion step (75) starting from a Cholesky factorized matrix. By a backward process the inverse matrix is computed row-by-row: the rows $i+1, \ldots, n$ hold the already inverted part, row $i$ is the row to be inverted, and the rows above hold the Cholesky-reduced part. In contrast to the upper triangular Cholesky matrix $R$, the inverse matrix $N^{-1}$ is a symmetric matrix. The skeleton-like squares mark this symmetric part, which is typically not stored

A.3 Cholesky Factorization of a Matrix Starting from Its Inverse

In many statistical applications the inverse of the matrix (the covariance matrix) plays a central role. For some special applications it can be useful to compute the Cholesky factors of the initial matrix $N$ directly from the inverse matrix $N^{-1}$ (the covariance matrix). To obtain an algorithm for that purpose, it is necessary to reorder the relations between the Cholesky coefficients and the elements of the inverse given in (75) with respect to the Cholesky coefficients. For the last diagonal element of the inverse, $n_{nn}^{(-1)}$, we immediately get the last Cholesky coefficient by

$$r_{nn} = 1 \Big/ \sqrt{n_{nn}^{(-1)}}. \tag{76}$$

Now, in a step-wise approach, the Cholesky coefficients can be computed row-by-row by rearranging (75) with respect to the Cholesky coefficients. This yields

$$n_{ij}^{(-1)} r_{ii} + n_{i+1,j}^{(-1)} r_{i,i+1} + \ldots + n_{n,j}^{(-1)} r_{in} = 0, \qquad j = i+1, \ldots, n, \tag{77}$$

and, respectively,

$$n_{ii}^{(-1)} r_{ii} + n_{i+1,i}^{(-1)} r_{i,i+1} + \ldots + n_{n,i}^{(-1)} r_{in} = \frac{1}{r_{ii}}. \tag{78}$$

After multiplying all equations by $r_{ii}$ and rearranging, we end up with a system of linear equations

$$\begin{bmatrix} r_{ii}\, r_{ii}\\ r_{ii}\, r_{i,i+1}\\ \vdots\\ r_{ii}\, r_{in} \end{bmatrix} = \begin{bmatrix} n_{ii}^{(-1)} & n_{i,i+1}^{(-1)} & \ldots & n_{in}^{(-1)}\\ n_{i,i+1}^{(-1)} & n_{i+1,i+1}^{(-1)} & \ldots & n_{i+1,n}^{(-1)}\\ \vdots & \vdots & \ddots & \vdots\\ n_{in}^{(-1)} & n_{i+1,n}^{(-1)} & \ldots & n_{nn}^{(-1)} \end{bmatrix}^{-1} \begin{bmatrix} 1\\ 0\\ \vdots\\ 0 \end{bmatrix} \qquad \text{for } i = n-1, \ldots, 1. \tag{79}$$

The Cholesky coefficients of row $i$, multiplied by $r_{ii}$, are therefore equivalent to the first row of the inverse of the corresponding sub-matrix of $N^{-1}$. The square root of the first element provides $r_{ii}$, and all the other elements can be computed by dividing them by $r_{ii}$. At first glance this factorization process seems to be very inefficient, because in each step the computation of the inverse of a sub-block is necessary. But this successive inversion can be done by recursive backward edging,




n11

n12

⎢ ⎢ T ⎣ n12 N 22 ⎡

⎤  ⎤−1 ⎡  (−1) T −1 n11 − n12 N −1 −n11 n12 N −1 22 n12 22 ⎥ ⎢ ⎥ ⎥ ⎥ =⎢ ⎥ ⎢ (−1) (−1) ⎦ −1 −1 T n T N −1 ⎦ ⎣ −N −1 n N +N n n n 22 12 11 22 22 12 11 12 22

(−1) n11

(−1) n12

⎢ ⎢ =⎢ ⎢ T (−1) (−1) ⎣ (n12 ) N 22

⎤−1 ⎥ ⎥ ⎥ ⎥ ⎦

(80)

−1 assuming that the inverse N −1 22 respectively N i+1:n,i+1:n are already computed and the inverse of the by one row and column extended matrix should be determined N −1 i:n,i:n [41, p.13f]. The required operations of the overall procedure is in the order of O( 12 n3 ). In applications from signal processing, often deal with covariance matrices resulting from equispaced data. These matrices are Toeplitz structured and the Levinson-Durbin recursion allows for a very efficient solution with O(2n2 ) operations ( [25]; [11]; [16, p.128–129]; [41, p.291ff]).
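The backward edging step (80) can be sketched as a function that extends an already computed inverse by one row and column without re-inverting the whole system (function name and example values ours):

```python
import numpy as np

def edge_backward(n11, n12, N22inv):
    """Inverse of [[n11, n12], [n12^T, N22]] from the scalar n11,
    the row n12, and the already known inverse of the trailing
    block N22, following the block identity (80)."""
    n12 = np.atleast_1d(np.asarray(n12, dtype=float))
    v = N22inv @ n12
    s = 1.0 / (n11 - n12 @ v)          # upper-left block of the inverse
    k = len(n12) + 1
    out = np.empty((k, k))
    out[0, 0] = s
    out[0, 1:] = -s * v                # upper-right block
    out[1:, 0] = -s * v                # lower-left block (symmetry)
    out[1:, 1:] = N22inv + s * np.outer(v, v)   # lower-right block
    return out
```

Applied repeatedly from the bottom-right corner upwards, this produces exactly the successive sub-matrix inverses needed in (79).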

A.4 Backward Cholesky Factorization

In the standard Cholesky factorization procedure a symmetric matrix N is split into a lower triangular matrix and an upper triangular matrix, one being the transpose of the other, N = R^T R, where R denotes the upper (right) triangular matrix. In the backward Cholesky factorization the symmetric matrix is factorized into an upper triangular matrix times a lower triangular matrix. This yields

N = \bar{R}\,\bar{R}^T,   (81)

or in extended notation

\begin{bmatrix} n_{11} & n_{12} & \cdots & n_{1n} \\ n_{12} & n_{22} & \cdots & n_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ n_{1n} & n_{2n} & \cdots & n_{nn} \end{bmatrix} = \begin{bmatrix} \bar{r}_{11} & \bar{r}_{12} & \cdots & \bar{r}_{1n} \\ & \bar{r}_{22} & \cdots & \bar{r}_{2n} \\ & & \ddots & \vdots \\ & & & \bar{r}_{nn} \end{bmatrix} \begin{bmatrix} \bar{r}_{11} & & & \\ \bar{r}_{12} & \bar{r}_{22} & & \\ \vdots & \vdots & \ddots & \\ \bar{r}_{1n} & \bar{r}_{2n} & \cdots & \bar{r}_{nn} \end{bmatrix}.   (82)

This factorization can be seen as the standard Cholesky approach applied to a top-down and left-right flipped symmetric system. The factorization starts at the last diagonal element n_{nn} and proceeds row by row to the top. The bar notation indicates that, in general, the coefficients r_{ij} differ from \bar{r}_{ij}. From the identity (82), the recursion formulas for the computation of the coefficients \bar{R} can be immediately deduced:

15 The Numerical Treatment of Covariance Stationary Processes in . . .

\bar{r}_{ii} = \sqrt{\,n_{ii} - \sum_{k=i+1}^{n} \bar{r}_{ik}^{\,2}\,}
\bar{r}_{ij} = \Bigl(n_{ij} - \sum_{k=j+1}^{n} \bar{r}_{jk}\,\bar{r}_{ik}\Bigr)\Big/\bar{r}_{jj}, \qquad j = n, \dots, i+1
\qquad\text{for } i = n, \dots, 1.   (83)

In correspondence to Sect. A.2, the recursion formulas for the coefficients of the inverse of the symmetric matrix can be computed by forward substitution,

n^{(-1)}_{11} = \frac{1}{\bar{r}_{11}^{\,2}}
n^{(-1)}_{ij} = -\frac{1}{\bar{r}_{jj}} \sum_{k=1}^{j-1} \bar{r}_{kj}\, n^{(-1)}_{ik}, \qquad i = 1, \dots, j-1
n^{(-1)}_{jj} = \frac{1}{\bar{r}_{jj}^{\,2}} - \frac{1}{\bar{r}_{jj}} \sum_{k=1}^{j-1} \bar{r}_{kj}\, n^{(-1)}_{kj}
\qquad\text{for } j = 1, \dots, n,   (84)

and can be reorganized with respect to the computation of the backward Cholesky factors \bar{r}_{ij} from the inverse N^{-1}:

n^{(-1)}_{i1}\, \bar{r}_{1j} + n^{(-1)}_{i2}\, \bar{r}_{2j} + \dots + n^{(-1)}_{ij}\, \bar{r}_{jj} = 0, \qquad i = 1, \dots, j-1,   (85)

n^{(-1)}_{jj}\, \bar{r}_{jj} + n^{(-1)}_{1j}\, \bar{r}_{1j} + \dots + n^{(-1)}_{j-1,j}\, \bar{r}_{j-1,j} = \frac{1}{\bar{r}_{jj}}.   (86)

The computation of the backward Cholesky factors of column j can be organized as the solution of the linear system of equations

\begin{bmatrix} \bar{r}_{1j}\,\bar{r}_{jj} \\ \bar{r}_{2j}\,\bar{r}_{jj} \\ \vdots \\ \bar{r}_{jj}\,\bar{r}_{jj} \end{bmatrix} = \begin{bmatrix} n^{(-1)}_{11} & n^{(-1)}_{12} & \cdots & n^{(-1)}_{1j} \\ n^{(-1)}_{12} & n^{(-1)}_{22} & \cdots & n^{(-1)}_{2j} \\ \vdots & \vdots & \ddots & \vdots \\ n^{(-1)}_{1j} & n^{(-1)}_{2j} & \cdots & n^{(-1)}_{jj} \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \qquad \text{for } j = 2, \dots, n,   (87)

as a recursive forward edging process. If the inverse matrix is Toeplitz structured, the Levinson-Durbin algorithm [11, 25, 46] can be applied.

A.5 Resumé on Cholesky Factorization

In the last sections, two different approaches to the Cholesky factorization were discussed: the standard or forward Cholesky factorization (63), where the symmetric matrix N is split into a lower triangular matrix and an upper one,

N = R^T R,   (88)

and the backward Cholesky factorization (83), where the symmetric matrix N is factorized into

N = \bar{R}\,\bar{R}^T.   (89)

It is important to note that in general R is different from \bar{R}. In a further step, the inverse matrix N^{-1} can be computed directly from the Cholesky factorized triangular matrices R or \bar{R} by backward substitution (75) or forward substitution (84). It was also shown how the Cholesky factors R and \bar{R} of the initial matrix N can be derived from the given inverse matrix N^{-1}: the matrix R can be computed by recursive backward edging (79) and the matrix \bar{R} by recursive forward edging (87). The circle can be closed by a simple matrix multiplication (88) or (89) to come back from the triangular matrices R and \bar{R} to the initial matrix N. Fig. 10a, b summarizes these possible paths.

A further interesting perspective is to start with the inverse matrix N^{-1} instead of the matrix N and to set up the same sequence of Cholesky factorization steps as before. Formally, the inversion of the matrix N with respect to the forward Cholesky factorization (88) can be written as

N^{-1} = \left(R^T R\right)^{-1} = R^{-1} \left(R^{-1}\right)^T   (90)

and

N^{-1} = \left(\bar{R}\,\bar{R}^T\right)^{-1} = \left(\bar{R}^{-1}\right)^T \bar{R}^{-1}   (91)

with respect to the backward Cholesky factorization (89). It is now crucial that, in contrast to the common factorized matrices, their inverse forms R^{-1} and \bar{R}^{-1} occur. Changing the view, the inverse matrix N^{-1} can also be factorized into a lower and an upper triangular matrix: by the forward Cholesky factorization (55) we immediately get \bar{R}^{-1}, or by the backward Cholesky factorization (83) R^{-1}, respectively. This means that, starting from N^{-1}, the inverses of the triangular matrices can be computed directly. In the next steps the Cholesky inversion by backward substitution (75) or forward substitution (84) can be applied to obtain the inverse of the inverse, which is of course the matrix N again. But the other way around is now also possible, i.e., the computation of the matrix \bar{R}^{-1} directly from the matrix N by recursive backward edging (79), and of the matrix R^{-1} by recursive forward edging (87). In Fig. 10 all these factorization steps are summarized, for the direct factorization of the matrix N as well as for N^{-1}.

Fig. 10 Review of the Cholesky factorization steps applied to N and N^{-1}. (a) (Forward) Cholesky factorization of N. (b) Backward Cholesky factorization of N. (c) Cholesky approach of N^{-1}. (d) Backward Cholesky approach of N^{-1}. Each panel connects n_{ij}, r_{ij} (respectively \bar{r}_{ij}) and n^{(-1)}_{ij} via Cholesky factorization, Cholesky inversion (backward/forward substitution), recursive backward/forward edging, and matrix multiplication.

The Cholesky approach allows many variations to factorize a symmetric matrix and opens a way to directly compute R and \bar{R} or R^{-1} and \bar{R}^{-1}. It should be mentioned that some of the factorization steps can be computed extremely efficiently for Toeplitz structures resulting from equispaced data distributions.
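These inverse relations can be verified numerically. The following self-contained sketch (illustrative only; the 2x2 test matrix and the naive Gauss-Jordan helper are ad hoc choices, not from the text) checks that the forward Cholesky factor of N^{-1} equals \bar{R}^{-1} and that the backward Cholesky factor of N^{-1} equals R^{-1}:

```python
import math

def forward_cholesky(N):
    # standard factorization (88): N = R^T R, R upper triangular
    n = len(N)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        R[i][i] = math.sqrt(N[i][i] - sum(R[k][i] ** 2 for k in range(i)))
        for j in range(i + 1, n):
            R[i][j] = (N[i][j] - sum(R[k][i] * R[k][j] for k in range(i))) / R[i][i]
    return R

def backward_cholesky(N):
    # backward factorization (89): N = Rbar * Rbar^T, Rbar upper triangular
    n = len(N)
    Rb = [[0.0] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(n - 1, i, -1):
            s = sum(Rb[j][k] * Rb[i][k] for k in range(j + 1, n))
            Rb[i][j] = (N[i][j] - s) / Rb[j][j]
        Rb[i][i] = math.sqrt(N[i][i] - sum(Rb[i][k] ** 2 for k in range(i + 1, n)))
    return Rb

def inv(A):
    # naive Gauss-Jordan inversion (demo helper only)
    n = len(A)
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

N = [[4.0, 2.0], [2.0, 2.0]]
# forward Cholesky of N^{-1} reproduces the inverse of the backward factor, cf. (91)
S = forward_cholesky(inv(N))
Rb_inv = inv(backward_cholesky(N))
# backward Cholesky of N^{-1} reproduces the inverse of the forward factor, cf. (90)
Sb = backward_cholesky(inv(N))
R_inv = inv(forward_cholesky(N))
```

For this test matrix, S and Rb_inv coincide, as do Sb and R_inv, numerically closing the loops of Fig. 10c, d.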

Literature

1. Alkhatib, H., Schuh, W.-D.: Integration of the Monte Carlo covariance estimation strategy into tailored solution procedures for large-scaled least squares problems. J. Geodesy 70, 53–66 (2007). https://doi.org/10.1007/s00190-006-0034-z
2. Auzinger, T., Schuh, W.-D.: High-degree spherical harmonic analysis combining gridded and random distributed data sets. Phys. Chem. Earth 23(1), 19–23 (1998). https://doi.org/10.1016/S0079-1946(97)00236-X
3. Benoit: Note sur une méthode de résolution des équations normales provenant de l'application de la méthode des moindres carrés à un système d'équations linéaires en nombre inférieur à celui des inconnues. – Application de la méthode à la résolution d'un système défini d'équations linéaires. Bull. Géodésique 2(1), 67–77 (1924). ISSN:0007-4632. https://doi.org/10.1007/BF03031308
4. Beutler, G., Jäggi, A., Hugentobler, U., Mervart, L.: Efficient satellite orbit modelling using pseudo-stochastic parameters. J. Geodesy 80, 353–372 (2006). https://doi.org/10.1007/s00190-006-0072-6
5. Bossler, J.D.: The new adjustment of the North American horizontal datum. Eos Trans. Am. Geophys. Union 57(8), 557–562 (1976). ISSN:2324-9250. https://doi.org/10.1029/EO057i008p00557
6. Bottoni, G., Barzaghi, R.: Fast collocation. Bull. Géodésique 67, 119–126 (1993)
7. Box, G., Jenkins, G.: Time Series Analysis: Forecasting and Control. Holden-Day (1970)
8. Boxhammer, C., Schuh, W.-D.: GOCE gravity field modeling: computational aspects – free kite numbering scheme. In: Rummel, R., Reigber, C., Rothacher, M., Boedecker, G., Schreiber, U., Flury, J. (Hrsg.) Observation of the Earth System from Space, S. 209–224. Springer, Berlin/Heidelberg (2006). https://doi.org/10.1007/3-540-29522-4_15
9. Buttkus, B.: Spectral Analysis and Filter Theory in Applied Geophysics. Springer, Berlin/Heidelberg (2000)
10. Cholesky, A.-L.: Sur la résolution numérique des systèmes d'équations linéaires. (Société des amis de la Bibliothèque et de l'Histoire de l'École polytechnique, reprint: Bulletin de la Sabix [En ligne], 39, pp. 81–95, 2005) (1910). http://sabix.revues.org/529
11. Durbin, J.: The fitting of time series models. Rev. Inst. Int. Stat. 28, 233–243 (1960)
12. Ernst, A., Schuh, W.-D.: The effect of reordering strategies on rounding errors in large, sparse equation systems. In: Sneeuw, N., Novák, P., Crespi, M., Sansò, F. (Hrsg.) VII. Hotine-Marussi Symposium, IAG Symposia. Lecture Notes in Earth Sciences, Band 137, S. 99–104. Springer, Berlin/Heidelberg (2012). https://doi.org/10.1007/978-3-642-22078-4_15
13. Gaspari, G., Cohn, S.: Construction of correlation functions in two and three dimensions. Q. J. R. Meteorol. Soc. 125(554), 723–757 (1999)
14. George, A.: Nested dissection of a regular finite element mesh. SIAM J. Numer. Anal. 10, 345–363 (1973). https://doi.org/10.1137/0710032
15. George, A., Liu, J.W.-H.: Computer Solution of Large Sparse Positive Definite Systems. Prentice-Hall, Englewood Cliffs (1981)
16. Golub, G., van Loan, C.: Matrix Computations. North Oxford Academic, Oxford (1983)
17. Hanson, R.: A posteriori error propagation. In: Proceedings of the "2nd International Symposium on Problems Related to the Redefinition of North American Geodetic Networks", Arlington, S. 427–445, 24–28 Apr 1978
18. Jäggi, A.: Pseudo-stochastic orbit modelling of Low Earth Satellites using the global positioning system. Geodätisch-geophysikalische Arbeiten in der Schweiz, Band 73. Schweizerische Geodätische Kommission (2007). http://boris.unibe.ch/id/eprint/25278
19. Kay, S., Marple, S.: Spectrum analysis – a modern perspective. Proc. IEEE 69(11), 1380–1419 (1981). ISSN:0018-9219
20. Kleiner, B., Martin, R., Thomson, D.: Robust estimation of power spectra. J. R. Stat. Soc. Ser. B Methodol. 41(3), 313–351 (1979). ISSN:0035-9246
21. Koch, K., Kuhlmann, H., Schuh, W.-D.: Approximating covariance matrices estimated in multivariate models by estimated auto- and cross-covariances. J. Geodesy 84(6), 383–397 (2010). https://doi.org/10.1007/s00190-010-0375-5
22. Krarup, T.: A contribution to the mathematical foundation of physical geodesy. Geodætisk Institut, Meddelelse n. 44, København (1969)
23. Krasbutter, I., Brockmann, J.M., Kargoll, B., Schuh, W.-D.: Adjustment of digital filters for decorrelation of GOCE SGG data. In: Flechtner, F., Sneeuw, N., Schuh, W.-D. (Hrsg.) Observation of the System Earth from Space – CHAMP, GRACE, GOCE and Future Missions. Advanced Technologies in Earth Sciences, GEOTECHNOLOGIEN Science Report, Band 20, S. 109–114. Springer (2014). https://doi.org/10.1007/978-3-642-32135-1_14
24. Krasbutter, I., Kargoll, B., Schuh, W.-D.: Magic square of real spectral and time series analysis with an application to moving average processes. In: Kutterer, H., Seitz, F., Alkhatib, H., Schmidt, M. (Hrsg.) The 1st International Workshop on the Quality of Geodetic Observation and Monitoring Systems (QuGOMS'11), IAG Symposia, Band 140, S. 9–14. Springer (2015). ISBN:978-3-319-10827-8. https://doi.org/10.1007/978-3-319-10828-5_2
25. Levinson, N.: The Wiener RMS (root mean square) error criterion in filter design and prediction. J. Math. Phys. 25, 261–278 (1947)
26. Mayer-Gürr, T.: Gravitationsfeldbestimmung aus der Analyse kurzer Bahnbögen am Beispiel der Satellitenmissionen CHAMP und GRACE. Dissertation, Landwirtschaftliche Fakultät der Universität Bonn, Schriftenreihe des Instituts für Geodäsie und Geoinformation der Rheinischen Friedrich-Wilhelms-Universität, Folge 9 (2006). http://nbn-resolving.de/urn:nbn:de:hbz:5N-09047
27. Meissl, P.: A priori prediction of roundoff error accumulation in the solution of a super-large geodetic normal equation system. NOAA/National Ocean Survey's National Geodetic Survey (NGS), Rockville (1980). http://trove.nla.gov.au/work/19473491?selectedversion=NBD2223841
28. Meissl, P.: Least squares adjustment: a modern approach. Mitteilungen der Geodätischen Institute der TU Graz, Band 43. Geodätische Institute der TU Graz, Graz (1982). ftp://skylab.itg.uni-bonn.de/schuh/Separata_Meissl/meissl_82b.pdf
29. Moreaux, G.: Compactly supported radial covariance functions. J. Geodesy 82(7), 431–443 (2008). ISSN:0949-7714. https://doi.org/10.1007/s00190-007-0195-4
30. Moritz, H.: Least-squares collocation. Deutsche Geodätische Kommission, München, Reihe A 75 (1973)
31. Moritz, H.: Advanced Physical Geodesy. Wichmann, Karlsruhe (1980)
32. Poder, K., Tscherning, C.: Cholesky's method on a computer. Internal Report No. 8, The Danish Geodetic Institute (1973)
33. Priestley, M.: Spectral Analysis and Time Series. Elsevier Academic Press, Amsterdam (2004)
34. Sansò, F., Schuh, W.-D.: Finite covariance functions. Bull. Géodésique 61(4), 331–347 (1987). https://doi.org/10.1007/BF02520559
35. Sansò, F., Tscherning, C.: Fast spherical collocation: theory and examples. J. Geodesy 77, 101–112 (2003)
36. Schall, J., Eicker, A., Kusche, J.: The ITG-Goce02 gravity field model from GOCE orbit and gradiometer data based on the short arc approach. J. Geodesy 88(4), 403–409 (2014). ISSN:0949-7714. https://doi.org/10.1007/s00190-014-0691-2
37. Schuh, W.-D.: Programmierung rationeller Algorithmen zur Umordnung, Auflösung und Inversion der Normalgleichungen geodätischer Netze. Diplomarbeit, Technische Universität Graz (1981)
38. Schuh, W.-D.: Kollokation – zu rechenaufwendig? ZAMM Z. Angew. Math. Mech. 69(4), T73–T75 (1989). http://onlinelibrary.wiley.com/doi/10.1002/zamm.19890690403/pdf
39. Schuh, W.-D.: Least squares adjustment of high degree spherical harmonics. In: Jacobsen, B.E. (Hrsg.) Inverse Methods – Interdisciplinary Elements of Methodology, Computation and Application. Lecture Notes in Earth Sciences, Band 63, S. 276–283. Springer, Heidelberg (1996). https://doi.org/10.1007/BFb0011786
40. Schuh, W.-D.: Tailored numerical solution strategies for the global determination of the earth's gravity field. Mitteilungen der Geodätischen Institute, Band 81. Technische Universität Graz (TUG), Graz (1996). ftp://skylab.itg.uni-bonn.de/schuh/Separata/schuh_96.pdf
41. Schuh, W.-D.: Numerische Verfahren zur geodätischen Optimierung. Skriptum, Theoretische Geodäsie, Universität Bonn (2003)
42. Schuh, W.-D.: Signalverarbeitung in der Physikalischen Geodäsie. In: Freeden, W., Rummel, R. (Hrsg.) Handbuch der Geodäsie, Band Erdmessung und Satellitengeodäsie. Springer Reference Naturwissenschaften, S. 73–121. Springer, Berlin/Heidelberg (2016). ISBN:978-3-662-47099-2. https://doi.org/10.1007/978-3-662-47100-5_15
43. Schuh, W.-D., Krasbutter, I., Kargoll, B.: Korrelierte Messung – was nun? In: Neuner, H. (Hrsg.) Zeitabhängige Messgrößen – Ihre Daten haben (Mehr-)Wert. DVW-Schriftenreihe, Band 74, S. 85–101. Wißner, Augsburg (2014)
44. Siemes, C.: Digital Filtering Algorithms for Decorrelation within Large Least Squares Problems. Dissertation, Landwirtschaftliche Fakultät der Universität Bonn, Bonn (2008). http://nbn-resolving.de/urn:nbn:de:hbz:5N-13749
45. Snay, R.: Reducing the profile of sparse symmetric matrices. Bull. Géodésique 50, 341–352 (1976)
46. Trench, W.: An algorithm for the inversion of finite Toeplitz matrices. SIAM J. Soc. Ind. Appl. Math. 12, 515–522 (1964)
47. Wolf, H.: The Helmert block method – its origin and development. In: Second International Symposium on Problems Related to the Redefinition of the North American Geodetic Networks, S. 319–326. US Department of Commerce, Washington (1978)

Part VII Inverse Problems and Multiscale Methods

16 Inverse Gravimetry: Density Signatures from Gravitational Potential Data

Willi Freeden and M. Zuhair Nashed

Contents
1 Introduction  971
2 Newton Volume Integral  977
3 Ill-Posedness of the Inverse Gravimetry Problem  984
4 Mollifier Methods  991
5 Reproducing Kernel Hilbert Space (RKHS) Methods  1011
6 Concluding Remarks  1045
Literature  1046

Abstract This paper represents an extended version of the publications Freeden W. (2015) Geomathematics: its role, its aim, and its potential. In: Freeden W., Nashed M.Z., Sonar T. (Hrsg.) Handbook of Geomathematics, Bd. 1, 2. Aufl., S. 3–78. Springer, New York/Heidelberg, 2015; Freeden W., Nashed M.Z. (2018) Inverse gravimetry as an ill-posed problem in mathematical geodesy. In: Freeden W., Nashed M.Z. (Hrsg.) Handbook of Mathematical Geodesy. Geosystems Mathematics, S. 641–685. Birkhäuser/Springer, Basel/New York/Heidelberg, 2018; and, in particular, Freeden W., Nashed M.Z., GEM Int. J. Geomath. 9, 199–264, 2018, from which the entire theoretical framework is taken over into this work. The aim of the paper is to deal with the ill-posed problem of transferring input gravitational potential information in the form of Newtonian volume integral values to geological output characteristics of the density contrast function. Some essential properties of the Newton volume integral are recapitulated. Different methodologies for the resolution of the inverse gravimetry problem and their numerical implementations are examined, including their dependence on the data source. Three types of input information may be distinguished, namely internal (borehole), terrestrial (surface), and/or external (spaceborne) gravitational data sets. Singular integral theory based inversion of the Newtonian integral equation, such as a Haar-type solution, is handled in a multiscale framework to decorrelate specific geological signal signatures with respect to inherently given features. Reproducing kernel Hilbert space regularization techniques are studied (together with their transition to certain mollified variants) to provide geological contrast density distributions by “downward continuation” from terrestrial and/or spaceborne data. Numerically, reproducing kernel Hilbert space spline solutions are formulated in terms of Gaussian approximating sums for use with gravimeter data systems.

This chapter is part of the series Handbuch der Geodäsie, volume “Mathematische Geodäsie/Mathematical Geodesy”, edited by Willi Freeden, Kaiserslautern.

W. Freeden, Department of Mathematics, University of Kaiserslautern, Kaiserslautern, Germany. E-Mail: [email protected]; [email protected]

M. Z. Nashed, Department of Mathematics, University of Central Florida, Orlando, FL, USA. E-Mail: [email protected]; [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020. W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_96

Zusammenfassung This work represents an extended version of the publications Freeden W. (2015) Geomathematics: its role, its aim, and its potential. In: Freeden W., Nashed M.Z., Sonar T. (Hrsg.) Handbook of Geomathematics, Bd. 1, 2. Aufl., S. 3–78. Springer, New York/Heidelberg, 2015; Freeden W., Nashed M.Z. (2018) Inverse gravimetry as an ill-posed problem in mathematical geodesy. In: Freeden W., Nashed M.Z. (Hrsg.) Handbook of Mathematical Geodesy. Geosystems Mathematics, S. 641–685. Birkhäuser/Springer, Basel/New York/Heidelberg, 2018; and, in particular, Freeden W., Nashed M.Z., GEM Int. J. Geomath. 9, 199–264, 2018, whose theoretical framework has been adopted in full. The aim of the work is to address the ill-posed problem of converting gravitational potential input information in the form of Newtonian volume integral values into geological output characteristics of the density contrast function. Some essential properties of the Newton volume integral are recapitulated. Various methods for solving the inverse gravimetry problem, as well as their numerical implementations, are examined in dependence on the data source. Three types of input data information can be distinguished, namely internal (borehole), terrestrial (surface), and/or external (spaceborne) gravitational data sets. Inversion of the Newtonian integral equation based on singular integral theory, such as Haar-type inversion, is treated in a multiscale framework for the decorrelation of specific geological signal signatures with inherently given features. Reproducing kernel Hilbert space regularization techniques are studied (together with their transition to certain mollifier variants) in order to obtain geological contrast density distributions by “downward continuation” from terrestrial and/or spaceborne data. For application to gravimeter data systems, reproducing kernel Hilbert space spline solutions are formulated numerically in the form of Gaussian approximating sums.

Keywords Newton potential · Newton kernel regularization · Mollifier reduction to singular integral structure · Mollifier transfer to reproducing kernel Hilbert space structure · Multiscale mollified spline interpolation and smoothing inversion · Exponential sum approximations

1 Introduction

Inverse problems (IP’s) arise in many branches of science and mathematics, including computer vision, natural language processing, machine learning, statistics, medical imaging (such as tomography), remote sensing, non-destructive testing, astronomy, geosystems mathematics (especially involving, e.g., gravimetric, geomagnetic and seismic exploration, and satellite observational technology), and many other fields. The rationale in most methods for the resolution (i.e., the approximate solvability) of an IP is to construct a “solution” that is physically acceptable as a meaningful approximation and is sufficiently stable from the computational standpoint. The main dilemma of modeling inverse problems is that most of them are ill-posed. The characteristic of such problems is that the closer the mathematical model describes the ill-posed problem (IPP), the worse the “condition” of the associated computational problem (i.e., the more sensitive it is to errors). Therefore, the indispensable task is to bring in additional information about the desired solution, compromises, or new outlooks as aids to the resolution of ill-posed problems (IPP’s). It is conventional to use the phrase “regularization of an ill-posed problem”


to refer to various approaches to circumvent the lack of continuous dependence (as well as to bring about existence and uniqueness if necessary). Roughly speaking, this entails a treatment of an IPP via an analysis of an associated family (usually a sequence or a net) of well-posed problems (WPP’s), yielding meaningful answers to the IPP. It should be remarked that IPP’s are a very active field of research in applied mathematics, with a fast-growing bibliography. Throughout this contribution about inverse gravimetry, we shall refer to various papers and monographs that include further details on several aspects. One of the major purposes of this work is to dramatize this delineation with reference to specific methods and results in georelevant sciences.

By a regularization-approximation scheme we refer to a variety of methods such as Tikhonov’s regularization, projection methods, iterative approximation, mollifier methods, multi-scale methods, etc., that can be applied to ill-posed problems. These schemes turn into algorithms once a resolution strategy can be effectively implemented. Unfortunately, the regularization approach requires the determination of a suitable value of a certain parameter associated with the scheme (e.g., mesh size, dimension of the subspace in the projection scheme, mollification parameter, specification of the adequate level of a scale space, etc.). Even when restricted to concrete tasks in practice, this is not a trivial problem, since it involves a trade-off between accuracy and numerical stability, a situation that does not usually arise in well-posed problems (guaranteeing, in accordance with the famous Hadamard approach published in 1902, existence, uniqueness, and continuous dependence on the data).
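The accuracy/stability trade-off can be made concrete with Tikhonov's regularization, x_alpha = (A^T A + alpha I)^{-1} A^T b, applied to a nearly singular 2x2 system; the matrix, noise level, and parameter value below are invented purely for illustration and do not stem from the text:

```python
def tikhonov_2x2(A, b, alpha):
    """Tikhonov-regularized solution x_alpha = (A^T A + alpha*I)^{-1} A^T b
    for a 2x2 system, written out explicitly."""
    m00 = A[0][0] ** 2 + A[1][0] ** 2 + alpha
    m01 = A[0][0] * A[0][1] + A[1][0] * A[1][1]
    m11 = A[0][1] ** 2 + A[1][1] ** 2 + alpha
    r0 = A[0][0] * b[0] + A[1][0] * b[1]
    r1 = A[0][1] * b[0] + A[1][1] * b[1]
    det = m00 * m11 - m01 * m01
    return ((m11 * r0 - m01 * r1) / det, (m00 * r1 - m01 * r0) / det)

# nearly singular system; noise-free data would correspond to x = (1, 1)
A = [[1.0, 1.0], [1.0, 1.0001]]
b_noisy = [2.0, 2.0011]                   # right-hand side with a tiny perturbation
x_naive = tikhonov_2x2(A, b_noisy, 0.0)   # unregularized: error is amplified enormously
x_reg = tikhonov_2x2(A, b_noisy, 1e-4)    # regularized: stays near (1, 1)
```

With alpha = 0 the tiny data perturbation is blown up to a solution far from (1, 1), while a modest alpha keeps the solution stable at the price of a small bias, which is exactly the trade-off governed by the parameter choice.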
All in all, inverse problems (IP’s) deal with determining, for a given input-output system, an input that produces an observed output, or determining an input that produces a desired output (or comes as close to it as possible), often in the presence of noise. The vivid research activity has led to a vast literature on inverse and ill-posed problems. Standard textbooks are, e.g., [3, 21, 22, 55, 71, 83, 102, 108, 130]. In addition to the books we mention the journals Inverse Problems, Inverse Problems in Science and Engineering, Journal on Inverse and Ill-Posed Problems, Inverse Problems and Imaging, Mathematical Inverse Problems, and GEM International Journal on Geomathematics. Many further references will be given in due course.

Inverse gravimetry, as a particularly important example in the theory of inverse problems, is a central research area of geodesy, geophysics, and geoexploration. It is a potential field technique which reflects variations in the Earth’s gravitational field. These variations are caused by density contrasts inside the Earth. Gravimetric surveys are carried out by use of extremely sensitive instruments capable of measuring tiny variations in the gravitational field. A gravimeter is a type of accelerometer. There are essentially two types of gravimeters, namely relative and absolute gravimeters. Absolute gravimeters measure the local gravity in absolute units. They work by directly measuring the acceleration of a mass during free fall in a vacuum. Relative gravimeters compare the value of gravity at one point


with another. The most common relative gravimeters are spring-based. A spring-based relative gravimeter is basically a weight on a spring, and by measuring the amount by which the weight stretches the spring, gravitation becomes available via Hooke’s law in linearized form. On a global scale gravimetric datasets are used in gravity surveys for establishing the figure of the geoid. Locally used micro-gravimeters provide gravity anomalies and/or disturbances, e.g., for local geodetic and geophysical research, geothermal exploration, and petroleum and mineral recovery. Turning to the inverse gravimetry problem in a mathematically appropriate framework, we are led to start from the following basic constituents of the theory of inverse problems: A direct (forward) problem is given in the following form:

object −→ data information of the object.

The inverse problem is considered the “inverse” of the direct problem, which relates the object (sub)information back to the object. An object may be understood to be the systematic relationship of all data subinformation, object parameters, and other auxiliary information. The aim of the inverse gravimetry problem is to determine the density inside a certain subarea G of the Earth. If the density is supposed to be a function F of bounded signal energy in G, i.e.,

\|F\|_{L^2(\mathcal{G})} = \left( \int_{\mathcal{G}} |F(x)|^2 \, dx \right)^{1/2} < \infty,   (1)

the gravitational potential

V(x) = \frac{1}{4\pi} \int_{\mathcal{G}} \frac{1}{|x-y|}\, F(y)\, dy, \qquad x \in \mathbb{R}^3,   (2)

can be calculated everywhere in R³ according to Newton’s famous law (1687), so that the direct gravimetry problem

F (= density signature) −→ V (= gravitational potential)   (3)

is a matter of (approximate) integration. Already at this stage, it should be mentioned that there is a striking dependence in the calculation of a gravitational value V (x) on its position x ∈ R3 . As far as the point x is situated in the “outer space” G c = R3 \G (G = G ∪ ∂G , ∂G boundary surface of G ), the value V (x) is obtained by a proper volume integral. However, for a position x in the “inner space” G or on the boundary ∂G , we are confronted with


an improper volume integral in R³. As a consequence, it may be expected that the inverse gravimetry problem

V (= gravitational potential) −→ F (= density signature)   (4)

also shows a striking difference in its solution process and its dependence on the position of the gravitational data. In its simplest operator formulation, the integral equation (2) may be described as an equation

A[F] = V,   (5)

where the operator A is defined as an integral over the volume G ⊂ R³,

A[F](x) = \int_{\mathcal{G}} G(\Delta; |x-y|)\, F(y)\, dy = V(x), \qquad x \in \mathbb{R}^3,   (6)

constituted by the convolution of the density contrast function F with a kernel function G(Δ; ·) showing the same singularity as a point mass, namely the Newtonian kernel

G(\Delta; |x-y|) = \frac{1}{4\pi}\, \frac{1}{|x-y|}, \qquad x \in \mathbb{R}^3 \setminus \{y\}.   (7)
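The direct problem (2)/(6) is a matter of numerical integration. As a hedged illustration (not from the text; the unit-ball geometry, the constant density F = 1, the grid size, and the test point are ad hoc assumptions), a midpoint-rule quadrature of the Newton integral can be compared at an exterior point with the closed-form value R³/(3|x|):

```python
import math

def newton_potential(x, radius=1.0, m=30):
    """Midpoint quadrature of V(x) = (1/4pi) * Int_G F(y)/|x-y| dy, eq. (2),
    for the constant density F = 1 on a ball of the given radius."""
    h = 2.0 * radius / m
    total = 0.0
    for i in range(m):
        for j in range(m):
            for k in range(m):
                y = (-radius + (i + 0.5) * h,
                     -radius + (j + 0.5) * h,
                     -radius + (k + 0.5) * h)
                # integrate only over cells whose midpoints lie inside the ball
                if y[0] ** 2 + y[1] ** 2 + y[2] ** 2 <= radius ** 2:
                    total += h ** 3 / math.dist(x, y)
    return total / (4.0 * math.pi)

# exterior point: the integral is proper and V(x) = R^3 / (3 |x|)
approx = newton_potential((2.0, 0.0, 0.0))
exact = 1.0 / 6.0
```

At x = (2, 0, 0) the quadrature result lies close to the exact value 1/6, and refining m improves the agreement; for points inside the ball the integrand is singular and the quadrature converges only slowly, foreshadowing the distinction between proper and improper volume integrals discussed above.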

Apart from the sign, G(Δ; |x − y|) is known as the fundamental solution of the Laplace equation, so that −Δ_x G(Δ; |x − y|) = 0 for x ≠ y (or, in distributional jargon, −Δ_x G(Δ; |x − y|) = δ(|x − y|), where δ is the Dirac distribution). Roughly speaking, the operator A acts in the inner space G as the inverse of the negative Laplace operator,

-\Delta V(x) = -\Delta\, A[F](x) = -\Delta_x \int_{\mathcal{G}} G(\Delta; |x-y|)\, F(y)\, dy = F(x), \qquad x \in \mathcal{G},   (8)

which is to say (at least if the function F is Hölder continuous in a neighborhood of x ∈ G) that taking the Newtonian potential of a function is in some sense an “inverse operation” to the application of the negative Laplace operator. The integral (6) is named for I. Newton (1642–1727), who first discovered it and whose work was the point of departure for the contributions of P.-S. Laplace (1749–1829) and C.F. Gauss (1777–1855) on harmonic and potential functions. Indeed, the setting (6) serves as the fundamental gravitational potential in Newton’s law of gravitation (1687). Following the Newtonian context, the interest of our paper lies in studying the gravitational field in the macroscopic sense, where the quantum behavior of gravitation need not be taken into account. Note that, in geodetically based Earth’s gravity work,


the velocities encountered are considerably smaller than the speed of light. Consequently, Newtonian physics can be safely used here. In the sense of Newtonian potential theory (see, e.g., [30]), Eq. (6) is in close relation to the third Green theorem of vector analysis,

\alpha(x)\, P(x) = -\int_{\mathcal{G}} G(\Delta; |x-y|)\, \Delta P(y)\, dV(y) + \int_{\partial\mathcal{G}} \left[ G(\Delta; |x-y|)\, \frac{\partial}{\partial\nu(y)} P(y) - P(y)\, \frac{\partial}{\partial\nu(y)} G(\Delta; |x-y|) \right] d\omega(y),   (9)

that holds true for all twice continuously differentiable functions P on G, where α(x) is the solid angle subtended by the surface ∂G at the point x ∈ R³, characterized by

\alpha(x) = -\int_{\partial\mathcal{G}} \frac{\partial}{\partial\nu(y)}\, G(\Delta; |x-y|)\, d\omega(y).   (10)

In fact, from classical potential theory, it is known that

\alpha(x) = \begin{cases} 1, & x \in G, \\ \frac{1}{2}, & x \in \partial G, \\ 0, & x \in G^c, \end{cases} \tag{11}

provided that the boundary surface ∂G (cf. Fig. 1) is (locally) continuously differentiable (for a more detailed analysis the reader is referred to, e.g., [30]). Once more, as an immediate consequence of (9), we may expect that the discussion of A[F](x), x ∈ R³, actually has to be split into three cases, depending on the location of x ∈ R³ as a point of the inner space G, the outer space G^c, or the boundary ∂G, i.e., on internal, surface (terrestrial), and external (spaceborne) input gravitational data.

Fig. 1 Solid angle subtended by a continuously differentiable surface ∂G at a point x ∈ R³ (α(x) = 1 inside G, α(x) = 1/2 on ∂G, α(x) = 0 outside)

Compared with the integral equation (6)
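As a numerical plausibility check of (10) and (11) — a sketch added here, not part of the original text, with G taken to be the unit ball as an illustrative assumption — the solid-angle integral can be evaluated by Monte Carlo sampling of the boundary sphere:

```python
import numpy as np

rng = np.random.default_rng(0)
R = 1.0
N = 200_000

# uniform sample points on the boundary sphere ∂G of radius R
u = rng.normal(size=(N, 3))
y = R * u / np.linalg.norm(u, axis=1, keepdims=True)
nu = y / R  # outward unit normal at y

def alpha(x):
    """Monte Carlo value of α(x) = -∫_∂G ∂/∂ν(y) G(Δ;|x-y|) dω(y), cf. (10)."""
    d = np.asarray(x, float) - y
    r = np.linalg.norm(d, axis=1)
    # ∂/∂ν(y) G(Δ;|x-y|) = (1/4π) ν(y)·(x-y)/|x-y|³ ; surface area is 4πR²
    return (4 * np.pi * R**2) * np.mean(
        -(1 / (4 * np.pi)) * np.einsum('ij,ij->i', nu, d) / r**3)

print(alpha([0.3, 0.2, 0.0]))  # ≈ 1 (x in the inner space)
print(alpha([2.0, 0.0, 0.0]))  # ≈ 0 (x in the outer space)
```

The two printed values reproduce the first and third case of (11) up to Monte Carlo noise.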


W. Freeden and M. Z. Nashed

we are led to the conclusion that the boundary integral in (9) vanishes for the gravitational potential V provided that x is a point of the inner space G. Moreover, a solution of (6) in G is not unique, since the addition of any harmonic function to V does not affect (8). In potential theory this observation can be used, for example, to prove existence and uniqueness of solutions to the Dirichlet problem for the Poisson equation inside or outside regular domains and for suitably well-behaved functions: One first applies a Newtonian potential to obtain a solution, and then adjusts it by adding a harmonic function to match the boundary data. As already mentioned, the aim of this contribution is different from solving a boundary-value problem: In the language of functional analysis, we have to solve a Fredholm integral equation of the first kind (6), which canonically leads to the framework of the theory of ill-posed problems. The main difficulty, however, is that the input data of the inverse gravimetry problem are not canonically given in the inner space G, but usually in G^c. As a matter of fact, until now in physical geodesy, measurements are only taken on the surface ∂G (terrestrial measurements) and/or in the outer space G^c (spaceborne measurements), i.e., outside the inner space G. Only in exceptional cases, e.g., in the neighborhood of "boreholes" of geothermal projects, are the gravitational potential V and the target function F given inside G, so that the Poisson differential equation (8) becomes applicable in the inversion process. Typically, for inverse problems, there will also be certain physical constraints which have to be imposed on the potential pattern so that the desired geological features of the density distribution can be approximated in some acceptable manner. Such constraints are usually referred to as realizability conditions.
They will be represented in our mathematical framework by requiring the density functions to lie in some appropriate subset of the output space. Under certain conditions these realizability constraints serve to regularize the originally stated ill-posed problem, while in other cases they dictate compromises that have to be made between the accuracy of the approximating functions and the demands of meeting such a priori constraints. In this contribution we are essentially interested in regularization procedures based on mollifier techniques. Different types of mollifiers are studied in more detail, namely singular integral-type mollifiers and reproducing kernel Hilbert space mollifiers. In physical geodesy, the study of inverse gravimetry essentially started with the work of P.S. Laplace, A.M. Legendre, S.D. Poisson [75–77, 80, 81, 117], G. Green, G.G. Stokes [52, 126, 127], C.F. Gauss, F. Helmert [50, 66, 67], P. Pizzetti, G. Lauricella [79, 116], and many others. Not only in physical geodesy, but also in inverse problem theory, there is a huge literature about the character and role of inverse gravimetry in the framework of ill-posed and inverse problems, from which we mention only a limited selection: [2, 3, 9–13, 30, 37, 70, 74, 89–93, 98, 100, 114, 119, 121–123, 125, 132, 133, 139, 140] (for further details the reader is referred to the references therein). Our paper, however, follows a different approach, namely physically motivated mollifier concepts, whose points of departure were already indicated in the introductory chapter [28] of the "Handbook of Geomathematics", Springer,


2015, in the contribution [40] in the "Handbook of Mathematical Geodesy", Birkhäuser, International Springer Publishing, 2018, and particularly in the article [41] of the Springer "GEM International Journal on Geomathematics", 9, 199–264, DOI https://doi.org/10.1007/s13137-018-0103-5, 2018. Intuitively, given a function which is rather irregular, convolving it with a mollifier "mollifies" the function, that is, its sharp features are smoothed. Consequently, mollifiers are known as smooth functions with special properties used to create sequences of smooth functions approximating a non-smooth function (in the case of inverse gravimetry, the Newton kernel). Our approach actually contains a twofold mollification, for which the convolutions, i.e., the mollified potential and density approximations, are related in the sense of Poisson's differential equation by the negative Laplace derivative. More specifically, within the potential framework, the Newton kernel is mollified by certain Taylor expansions, so that the negative Laplace derivative of the Taylor expressions within the contrast signature framework leads to mollifiers of the Dirac kernel, i.e., singular integrals. In doing so, a substitute of the potential simultaneously provides an associated approximation of the density. In fact, this construction principle turns out to be valid for both singular integral-type mollifiers and reproducing kernel Hilbert space mollifiers.

2 Newton Volume Integral

In order to handle the inverse gravimetry problem some potential-theoretic preliminaries are needed (see, e.g., [30]): Let G be a regular region in R³, i.e., a bounded region G dividing R³ uniquely into the inner space G and the outer space G^c = R³\ \overline{G}, \overline{G} = G ∪ ∂G, such that the boundary ∂G is an orientable Lipschitzian manifold of dimension 2 (for example, ball, ellipsoid, cube and other polyhedra, spheroid, telluroid, geoid, (actual) Earth or appropriate parts of it).

2.1 Basics of Potential Theory

A real-valued function P is called harmonic in G ⊂ R³ if P is of the class C^{(2)}(G) of functions with continuous second order partial derivatives and satisfies the Laplace equation

\Delta P(x) = \left( \left(\frac{\partial}{\partial x_1}\right)^2 + \left(\frac{\partial}{\partial x_2}\right)^2 + \left(\frac{\partial}{\partial x_3}\right)^2 \right) P(x_1, x_2, x_3) = 0 \tag{12}

for all x = (x_1, x_2, x_3)^T, x ∈ G. Some important examples of harmonic functions are given below in the classical nomenclature of potential theory (see, e.g., [30, 47]).


(a) Potential of a mass point: According to Newton's law of gravitation, two points x, y with masses M_x, M_y attract each other with a force given by

-\frac{\gamma M_x M_y}{4\pi} \frac{x-y}{|x-y|^3}, \qquad x, y \in \mathbb{R}^3, \ x \neq y. \tag{13}

The force is directed along the line connecting the two points x, y. The constant γ denotes Newton's gravitational constant (note that γ can be assumed to be equal to one in the theoretical part, but not in numerical applications). Although the masses M_x, M_y attract each other in a symmetric way, it is convenient to call one of them the attracting mass and the other one the attracted mass. Conventionally the attracted mass is set equal to unity and the attracting mass is denoted by M:

v(x) = -\frac{\gamma M}{4\pi} \frac{x-y}{|x-y|^3}, \qquad x \in \mathbb{R}^3 \setminus \{y\}. \tag{14}

The formula (14) expresses the force exerted by the mass M on a unit mass located at the distance |x − y| from M. Obviously, the intensity |v(x)| of the gravitational force v(x) is given by

|v(x)| = \frac{\gamma M}{4\pi} \frac{1}{|x-y|^2}, \qquad x \in \mathbb{R}^3 \setminus \{y\}. \tag{15}

The scalar function V defined by

V(x) = \gamma M\, G(\Delta; |x-y|) = \frac{\gamma M}{4\pi} \frac{1}{|x-y|}, \qquad x \in \mathbb{R}^3 \setminus \{y\} \tag{16}

is called the potential of gravitation at the point y. The force vector v(x) is the gradient vector of the scalar field V:

v(x) = \nabla V(x), \qquad x \in \mathbb{R}^3 \setminus \{y\}. \tag{17}

Calculating the divergence ∇· of the gradient field v, it readily follows that

\nabla \cdot v(x) = \nabla \cdot \nabla V(x) = \Delta V(x) = 0, \qquad x \in \mathbb{R}^3 \setminus \{y\}. \tag{18}
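The relations (15), (17), and (18) are easily checked numerically; the following sketch (not part of the original text) compares v with a central-difference gradient of V and evaluates a finite-difference Laplacian, taking γ = M = 1:

```python
import numpy as np

gamma, M = 1.0, 1.0
y = np.zeros(3)  # location of the attracting mass

def V(x):  # potential (16)
    return gamma * M / (4 * np.pi * np.linalg.norm(x - y))

def v(x):  # force field (14)
    d = x - y
    return -gamma * M / (4 * np.pi) * d / np.linalg.norm(d)**3

def grad_fd(f, x, h=1e-6):
    # central-difference gradient
    g = np.zeros(3)
    for i in range(3):
        e = np.zeros(3); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def laplacian_fd(f, x, h=1e-4):
    # second-difference Laplacian
    return sum((f(x + h * e) - 2 * f(x) + f(x - h * e)) / h**2
               for e in np.eye(3))

x = np.array([0.7, -0.4, 0.5])
assert np.allclose(v(x), grad_fd(V, x), atol=1e-6)   # v = ∇V, cf. (17)
assert np.isclose(np.linalg.norm(v(x)),
                  gamma * M / (4 * np.pi * np.linalg.norm(x - y)**2))  # (15)
assert abs(laplacian_fd(V, x)) < 1e-5                # ΔV = 0, cf. (18)
```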

Moreover, V tends to zero at infinity (note that |y| ≤ |x|/2 implies |x − y| ≥ ||x| − |y|| ≥ |x|/2), i.e., V is regular at infinity: V(x) = O(|x|^{−1}), |x| → ∞.

(b) Potential of a finite mass point system: The potential for N points y_i with masses M_i, i = 1, . . . , N, is the sum of the individual contributions

V(x) = \gamma \sum_{i=1}^{N} M_i\, G(\Delta; |x - y_i|), \qquad x \in \mathbb{R}^3 \setminus \{y_1, \ldots, y_N\}. \tag{19}

Clearly we have

\Delta V(x) = 0, \qquad x \in \mathbb{R}^3 \setminus \{y_1, \ldots, y_N\}. \tag{20}

(c) Potential of a volume: Let G ⊂ R³ be a regular region. The point masses are distributed continuously over G ⊂ R³ with density F. Then the discrete sum (19) becomes a continuous sum, i.e., an integral over the body G:

V(x) = \gamma \int_{G} G(\Delta; |x-y|)\, F(y)\, dy. \tag{21}

Obviously,

\Delta V(x) = 0, \qquad x \in \mathbb{R}^3 \setminus \overline{G}. \tag{22}

Note that V is defined on the whole space R³; however, ΔV(x) may not be obtained simply by interchanging the Laplace operator and the integral over G for points x inside G. At infinity the potential behaves like V(x) = O(|x|^{−1}), |x| → ∞, uniformly with respect to all directions, i.e., V is regular at infinity.
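For the special case F ≡ 1 on the unit ball, the Newton integral (21) (with γ = 1) has the well-known closed form V(x) = (R² − |x|²/3)/2 inside and V(x) = R³/(3|x|) outside; the following Monte Carlo sketch (not part of the original text) reproduces it:

```python
import numpy as np

rng = np.random.default_rng(1)
R, N = 1.0, 400_000

# uniform sample points in the ball G = B_R(0) (rejection from the cube)
pts = rng.uniform(-R, R, size=(3 * N, 3))
pts = pts[np.linalg.norm(pts, axis=1) <= R][:N]
vol = 4 / 3 * np.pi * R**3

def V_mc(x):
    """Monte Carlo value of ∫_G G(Δ;|x-y|) dy, cf. (21) with F ≡ 1, γ = 1."""
    r = np.linalg.norm(np.asarray(x, float) - pts, axis=1)
    return vol * np.mean(1 / (4 * np.pi * r))

def V_exact(x):
    r = np.linalg.norm(x)
    return (R**2 - r**2 / 3) / 2 if r <= R else R**3 / (3 * r)

for x in ([2.0, 0.0, 0.0], [0.5, 0.0, 0.0]):
    assert abs(V_mc(x) - V_exact(np.asarray(x))) < 1e-2
```

Note that the estimator converges even for evaluation points inside G, since the 1/|x − y| singularity is integrable.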

2.2 Properties of the Newton Integral

The Newton (volume) integral (21) over a regular region G corresponding to a mass density distribution F satisfies the Laplace equation in the outer space G^c = R³\ \overline{G}. Clearly, this property is an immediate consequence of the harmonicity of the fundamental solution of the Laplace operator (in what follows, for simplicity, we restrict ourselves to a Newton integral (21) with γ chosen to be equal to 1).

Harmonicity in G^c. Let F : \overline{G} → R be an integrable bounded function. Then

V(x) = \int_{G} G(\Delta; |x-y|)\, F(y)\, dy, \qquad x \in G^c, \tag{23}

satisfies

\Delta_x \int_{G} G(\Delta; |x-y|)\, F(y)\, dy = 0 \tag{24}

for all x ∈ G^c, i.e., V is harmonic in G^c.


Properties in \overline{G}. By one-dimensional Taylor linearization (cf. [47]) we obtain

\frac{1}{\sqrt{u}} = \frac{1}{\sqrt{u_0}} - \frac{1}{2} \frac{1}{u_0^{3/2}} (u - u_0) + \frac{3}{8} \frac{(u - u_0)^2}{(u_0 + \theta(u - u_0))^{5/2}} \tag{25}

for some θ ∈ (0, 1). Setting u = r² and u_0 = ρ² we therefore find

\frac{1}{r} = \frac{1}{2\rho} \left( 3 - \frac{r^2}{\rho^2} \right) + \frac{3}{8} \frac{(r^2 - \rho^2)^2}{(\rho^2 + \theta(r^2 - \rho^2))^{5/2}}. \tag{26}

In other words, by letting r = |x − y| we are able to give a simple example of a "mollification" of the fundamental solution of the Laplace equation

G(\Delta; r) = \frac{1}{4\pi r}, \qquad r > 0, \tag{27}

by

G^H_\rho(\Delta; r) = \begin{cases} \dfrac{1}{8\pi\rho} \left( 3 - \dfrac{r^2}{\rho^2} \right), & r \le \rho, \\[2mm] \dfrac{1}{4\pi r}, & r > \rho, \end{cases} \tag{28}

such that G^H_ρ(Δ; ·) is continuously differentiable for all r ≥ 0. Obviously, G(Δ; r) = G^H_ρ(Δ; r) for all r > ρ. As a consequence,

G(\Delta; |x-y|) = \frac{1}{4\pi} \frac{1}{|x-y|}, \qquad |x-y| \neq 0, \tag{29}

admits a "mollification" (regularization) of the form

G^H_\rho(\Delta; |x-y|) = \begin{cases} \dfrac{1}{8\pi\rho} \left( 3 - \dfrac{1}{\rho^2} |x-y|^2 \right), & |x-y| \le \rho, \\[2mm] \dfrac{1}{4\pi} \dfrac{1}{|x-y|}, & \rho < |x-y|. \end{cases} \tag{30}
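A small sketch (not in the original text) confirms that the Haar-type mollifier (28) matches the Newton kernel in value and first derivative at r = ρ, i.e., that G^H_ρ(Δ; ·) is indeed continuously differentiable:

```python
import numpy as np

def G(r):                # fundamental solution profile (27)
    return 1 / (4 * np.pi * r)

def GH(r, rho):          # mollified profile (28)
    r = np.asarray(r, float)
    return np.where(r <= rho, (3 - r**2 / rho**2) / (8 * np.pi * rho),
                    1 / (4 * np.pi * r))

rho, eps = 0.3, 1e-7
# continuity at the matching radius r = ρ
assert abs(GH(rho - eps, rho) - GH(rho + eps, rho)) < 1e-6
# one-sided difference quotients agree: C¹ matching
left = (GH(rho, rho) - GH(rho - eps, rho)) / eps
right = (GH(rho + eps, rho) - GH(rho, rho)) / eps
assert abs(left - right) < 1e-4
# outside the mollification ball the kernels coincide, cf. (30)
assert float(GH(2 * rho, rho)) == G(2 * rho)
```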

Let F : \overline{G} → R be of class C^{(0)}(\overline{G}). We set

V(x) = \int_{G} G(\Delta; |x-y|)\, F(y)\, dy \tag{31}

and

V^H_\rho(x) = \int_{G} G^H_\rho(\Delta; |x-y|)\, F(y)\, dy. \tag{32}

The integrands of V and V^H_ρ differ only in the ball B_ρ(x) around the point x with radius ρ, i.e., B_ρ(x) = {y ∈ R³ : |x − y| < ρ}. Because of its continuity, the function F : \overline{G} → R is uniformly bounded on \overline{G}. This fact shows that

\left| V(x) - V^H_\rho(x) \right| = O\!\left( \int_{B_\rho(x)} \left| G(\Delta; |x-y|) - G^H_\rho(\Delta; |x-y|) \right| dy \right) = O(\rho^2), \qquad \rho \to 0. \tag{33}

Therefore, V is of class C^{(0)}(\overline{G}) as the limit of a uniformly convergent sequence of continuous functions on \overline{G}. Furthermore, we let

v(x) = \int_{G} \nabla_x G(\Delta; |x-y|)\, F(y)\, dy \tag{34}

and

v^H_\rho(x) = \int_{G} \nabla_x G^H_\rho(\Delta; |x-y|)\, F(y)\, dy. \tag{35}

Because of the fact

|\nabla_x G(\Delta; |x-y|)| = O(|x-y|^{-2}), \tag{36}

the integrals v and v^H_ρ exist for all x ∈ \overline{G}. It is not hard to see that

\sup_{x \in \overline{G}} \left| v(x) - v^H_\rho(x) \right| = \sup_{x \in \overline{G}} \left| \nabla V(x) - \nabla V^H_\rho(x) \right| = O(\rho), \qquad \rho \to 0. \tag{37}

Consequently, v is a continuous vector field on \overline{G}. Moreover, as the relation (37) holds uniformly on \overline{G}, we obtain

v(x) = \nabla V(x) = \int_{G} \nabla_x G(\Delta; |x-y|)\, F(y)\, dy. \tag{38}

Altogether, we are allowed to formulate the following properties: Let G be a regular region and let F : \overline{G} → R be of class C^{(0)}(\overline{G}). Then V, V^H_ρ as defined by (31) and (32), respectively, are of class C^{(1)}(\overline{G}), such that

\lim_{\rho \to 0} \sup_{x \in \overline{G}} \left| V(x) - V^H_\rho(x) \right| = 0. \tag{39}
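The O(ρ²) bound (33) can be made concrete: for F ≡ 1 and B_ρ(x) ⊂ G the defect reduces to ∫_{B_ρ(x)}(G − G^H_ρ) dy, which equals ρ²/10 exactly; the midpoint-rule quadrature below (a sketch, not part of the original text) confirms this:

```python
import numpy as np

def haar_defect(rho, n=20_000):
    """Midpoint rule for ∫_0^ρ (G - G^H_ρ)(r) 4π r² dr, cf. (33) with F ≡ 1."""
    r = (np.arange(n) + 0.5) * rho / n
    integrand = (1 / (4 * np.pi * r)
                 - (3 - r**2 / rho**2) / (8 * np.pi * rho)) * 4 * np.pi * r**2
    return np.sum(integrand) * rho / n

# defect equals ρ²/10 for all ρ, hence the O(ρ²) rate in (33)
for rho in (0.1, 0.2, 0.4):
    assert abs(haar_defect(rho) - rho**2 / 10) < 1e-7
```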

Furthermore, ∇V is of class C^{(0)}(\overline{G}), such that

\nabla V(x) = \int_{G} F(y)\, \nabla_x G(\Delta; |x-y|)\, dy, \qquad x \in \overline{G}, \tag{40}

and

\lim_{\rho \to 0} \sup_{x \in \overline{G}} \left| \nabla V(x) - \nabla V^H_\rho(x) \right| = 0. \tag{41}

Poisson’s Differential Equation. We next come to the Poisson differential equation

under the assumption of μ-Hölder continuity, μ ∈ (0, 1], imposed on the function 3 F on G . For this purpose we note that the Taylor linearization of s − 2 around s0 is 3

−5

given by s02 − 32 s0 2 (s − s0 ). Hence, by letting s = r 2 and s0 = ρ 2 , we are able to replace the term r −3 by 2ρ1 3 (5 − ρ32 r 2 ). In consequence, as “mollification” of Z (Δ; |x − y|) =

1 , 4π |x − y|3

|x − y| = 0,

(42)

we introduce

Zρ (Δ; |x − y|) =

⎧   1 3 ⎪ 2 ⎪ |x − y| 5 − , ⎪ ⎪ ⎨ 8πρ 3 ρ2 ⎪ ⎪ ⎪ ⎪ ⎩

|x − y| ≤ ρ (43)

1 , 4π |x − y|3

ρ < |x − y|.

The function r ↦ Z_ρ(Δ; r), r ≥ 0, is continuously differentiable. Moreover, by the same arguments as above, it can be shown that the vector field

z_\rho(x) = -\int_{G} Z_\rho(\Delta; |x-y|)\, (x-y)\, F(y)\, dy \tag{44}

converges uniformly on \overline{G} as ρ → 0 to the limit field

v(x) = \nabla V(x) = \int_{G} \nabla_x G(\Delta; |x-y|)\, F(y)\, dy. \tag{45}

For all y ∈ B_ρ(x) we obtain by a simple calculation

\nabla_x \cdot \left( Z_\rho(\Delta; |x-y|)\, (x-y) \right) = \frac{15}{8\pi} \left( \frac{1}{\rho^3} - \frac{|x-y|^2}{\rho^5} \right). \tag{46}

Furthermore, we find

\int_{B_\rho(x)} \nabla_x \cdot \left( Z_\rho(\Delta; |x-y|)\, (x-y) \right) dy = 1. \tag{47}
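Identity (47) can be verified directly by radial quadrature of (46) over B_ρ(x) (a sketch, not part of the original text):

```python
import numpy as np

rho, n = 0.25, 100_000
r = (np.arange(n) + 0.5) * rho / n                       # midpoint radii in [0, ρ]
div = 15 / (8 * np.pi) * (1 / rho**3 - r**2 / rho**5)    # integrand of (47), cf. (46)
total = np.sum(div * 4 * np.pi * r**2) * rho / n         # integration over shells
assert abs(total - 1.0) < 1e-6                           # normalization (47)
```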

Hence, under the additional assumption of μ-Hölder continuity, μ ∈ (0, 1], for the function F on \overline{G}, i.e., |F(x) − F(y)| ≤ C |x − y|^μ for all x, y ∈ \overline{G}, we obtain

\nabla \cdot z_\rho(x) = -\int_{G} \nabla_x \cdot \left( Z_\rho(\Delta; |x-y|)(x-y) \right) F(y)\, dy
= -\int_{B_\rho(x)} \nabla_x \cdot \left( Z_\rho(\Delta; |x-y|)(x-y) \right) F(y)\, dy
= -\alpha(x)\, F(x) + \int_{B_\rho(x)} \left( F(x) - F(y) \right) \nabla_x \cdot \left( Z_\rho(\Delta; |x-y|)(x-y) \right) dy. \tag{48}

Thus, the μ-Hölder continuity of F guarantees the estimate

\sup_{x \in \overline{G}} \left| \nabla \cdot z_\rho(x) + \alpha(x)\, F(x) \right| = O(\rho^\mu), \qquad \rho \to 0, \tag{49}

uniformly with respect to x ∈ \overline{G}, where α(x) is the solid angle at x subtended by the surface ∂G. In an analogous way, we are able to show that the first partial derivatives of (44) converge uniformly to continuous limit fields. Again, the uniform convergence shows that ∇V is differentiable in G, and we have

\nabla \cdot v(x) = \Delta V(x) = \Delta_x \int_{G} G(\Delta; |x-y|)\, F(y)\, dy = -\alpha(x)\, F(x), \qquad x \in G. \tag{50}

It should be noted that the assumption of μ-Hölder continuity of F, μ ∈ (0, 1], is needed for the proof of (50). Indeed, H. Petrini [115] showed that μ-Hölder continuity of F, μ ∈ (0, 1], is necessary to imply the second continuous differentiability of the Newton volume potential. Let G be a regular region. If F is of class C^{(0,μ)}(\overline{G}), μ ∈ (0, 1], in a neighborhood of x ∈ G, then the Poisson differential equation

-\Delta_x \int_{G} F(y)\, G(\Delta; |x-y|)\, dV(y) = \alpha(x)\, F(x) \tag{51}

holds true in the neighborhood of x ∈ G, where α(x) is the solid angle subtended by the surface ∂G at x.

3 Ill-Posedness of the Inverse Gravimetry Problem

Contrary to the case of L²(∂G) (see, e.g., [24] for more details), the class L²(G) of square-integrable functions on a regular region G is not obtainable by the L²-completion of a countable harmonic function system alone. In addition, we have to take into account a so-called "anharmonic function system" (see, e.g., [10, 37, 89, 136]). This observation explains the ill-posedness of the inverse gravimetry problem. In the classical nomenclature of physical geodesy, the inversion of Newton's law of gravitation (6) from terrestrial gravitational data, i.e., the determination of the internal density contrast function from potential data on the boundary ∂G, is known as the terrestrial inverse gravimetry problem (TIGP). In other words, for a regular region G ⊂ R³, (TIGP) is the problem of determining the density function F ∈ L²(G) from (information of) the gravitational potential V on ∂G in accordance with the integral equation

V(x) = A[F](x) = \int_{G} F(y)\, G(\Delta; |x-y|)\, dy, \qquad x \in \partial G. \tag{52}

In the sequel we denote the image of X = L²(G) under the operator A by Y := A[L²(G)], i.e., Y can be characterized by

Y = \left\{ V : V = A[F] = \int_{G} G(\Delta; |\cdot - y|)\, F(y)\, dy, \ F \in L^2(G) \right\}. \tag{53}

Furthermore, for any subset H ⊂ R³, we introduce the operator

A_H : X = L^2(G) \to Y|_H = A_H[L^2(G)] \tag{54}

(more accurately, A_{G,H}), so that Y|_H consists of all functions A_H[F] given by

H \ni x \mapsto A_H[F](x) = \int_{G} G(\Delta; |x-y|)\, F(y)\, dy, \qquad F \in L^2(G) \tag{55}

(note that A may be formally understood as A_{R³} in our nomenclature). Clearly, as we already know, Y|_H forms a set of functions harmonic in H, provided that H is a subset of G^c. In our notational framework, the terrestrial/spaceborne inverse gravimetry problem (TSIGP) of classical physical geodesy can be formulated as follows:

(TSIGP): Given V ∈ C^{(0)}(\overline{G}^c), find F ∈ L²(G) with A_{G^c}[F] = V.

Hadamard's Classification of the Gravimetry Problem. In accordance with Hadamard's famous classification (cf. [62, 63]), (TSIGP) violates all criteria of well-posedness, viz. uniqueness, existence, and stability:


(i) A solution of (TSIGP) exists only if V belongs to the space Y|_{G^c}. However, it should be pointed out that this restriction does not cause any numerical difficulty since, in practice, the information on V is only finite-dimensional.

(ii) The most serious problem is the non-uniqueness of the solution: The associated Fredholm integral operator A_{G^c} has a kernel (null space) which is known to coincide with the L²(G)-orthogonal complement of the closure of all harmonic functions on G. More precisely, if F is a member of class L²(G), then A_{G^c} : L²(G) → Y|_{G^c} given by

V = A_{G^c}[F] = \left( \int_{G} G(\Delta; |\cdot - y|)\, F(y)\, dy \right)\!\Big|_{G^c}, \qquad F \in L^2(G), \tag{56}

defines a linear operator such that A_{G^c}[F] is continuous in \overline{G}^c, harmonic in G^c, and regular at infinity. The operator A_{G^c} as defined by (56) is surjective, but it is not injective. Indeed, the null space (kernel) of A_{G^c},

N(A_{G^c}) = AnHarm(G), \tag{57}

consists of all functions in L²(G) that are orthogonal to harmonic functions in G; N(A_{G^c}) is the space of anharmonic functions in G. In fact, we have (see, e.g., [89, 136])

L^2(G) = \overline{Harm(G)}^{\,\|\cdot\|_{L^2(G)}} \oplus \left( \overline{Harm(G)}^{\,\|\cdot\|_{L^2(G)}} \right)^{\perp}, \tag{58}

hence,

L^2(G) = Harm(G) \oplus AnHarm(G) = Harm(G) \oplus N(A_{G^c}). \tag{59}

Unfortunately, the orthogonal complement, i.e., the class of anharmonic functions, is infinite-dimensional (see, e.g., [89, 136]).

(iii) Restricting the operator A_{G^c} to Harm(G) leads to an injective mapping which, however, has a discontinuous inverse.

Concerning the historical background, the problem of non-uniqueness has been discussed extensively in the literature (for a more detailed analysis see, e.g., [93]). This problem can be resolved by imposing some reasonable additional condition on the density. As we already saw, a suitable condition, suggested by the mathematical structure of the Newton potential operator A, is to require that the density be harmonic. In fact, the approximate calculation of a harmonic density has already been implemented in several papers (see, e.g., [97] and the references therein), whereas the problem of determining the anharmonic part seems to remain a great challenge.

Altogether, it should be remarked that up to now the ill-posedness of (TSIGP) seriously limits its application in geoscience and exploration; but occasionally in geodesy (see, e.g., [131]) and particularly in geothermal research (see, e.g., [44]), we are able to take advantage of gravitational data systems inside the area G under consideration. With the knowledge of additional internal gravimeter data, the methodological situation changes drastically, and significant improvement may be expected for practical applicability.

3.1 Spectral Inversion for Balls

The set Harm(B_β(0)) of harmonic functions in the ball B_β(0) with radius β around the origin 0 is a closed subspace of L²(B_β(0)) (for more details see, e.g., [5, 30]). Note that β can be chosen, for example, to be the radius of a Runge (Bjerhammar) sphere with β < a, a being the Earth's (mean) radius.

(b) A closed system in AnHarm(Σ_int) is given by

x \mapsto \left( |x|^{n+2k} - \frac{(2n+3)\,\beta^{2k}}{2n+2k+3}\, |x|^{n} \right) Y_{n,j}\!\left( \frac{x}{|x|} \right), \qquad k \in \mathbb{N},\ n \in \mathbb{N}_0,\ j \in \{1, \ldots, 2n+1\}. \tag{82}

Moreover, the basis functions are polynomials of degree ≤ N ∈ N\{1} if and only if the index triple (k, n, j) is within the range n ∈ {0, . . . , N − 2}, j ∈ {1, . . . , 2n + 1}, k ∈ {1, . . . , [(N − n)/2]}, where [·] is the Gauss bracket, defined by [x] = max{ν ∈ Z : ν ≤ x}, x ∈ R. The space of anharmonic polynomials of degree ≤ N has dimension \frac{1}{6}N^3 - \frac{1}{6}N.

The obvious advantage of the system in (a) is its orthogonality. On the other hand, the system described in (b) has a radial part (see also [37]) which is given explicitly, whereas the radial part of the orthogonal system has to be calculated iteratively by means of recurrence formulas.
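The dimension count stated above can be cross-checked by enumerating the admissible index triples (k, n, j) of (82) (a sketch, not part of the original text):

```python
def anharm_poly_dim(N):
    """Number of triples (k, n, j) with n <= N-2, j <= 2n+1, 1 <= k <= [(N-n)/2],
    i.e., the number of anharmonic basis polynomials of degree <= N from (82)."""
    return sum((2 * n + 1) * ((N - n) // 2) for n in range(0, N - 1))

# matches the closed form N³/6 - N/6 for all tested degrees
for N in range(2, 40):
    assert anharm_poly_dim(N) == (N**3 - N) // 6
```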

The important role of the anharmonic functions in the theory of (TSIGP) is also stressed if we investigate a radially symmetric density distribution, as given for the mantle and the outer and inner core of the Earth. Such a structure of spherical layers does not leave any signature in the gravitational potential and, therefore, cannot be recovered by means of harmonic functions. V. Michel [89], indeed, shows that a reconstruction of the (deep) Earth's interior with a harmonic function system does not make any sense. Therefore, a reliable method for the (global) approximation of the density distribution of the Earth requires a treatment of both orthogonal projections: the harmonic part and the anharmonic part. Moreover, we recall that the contribution of H_{−1,1} to an (outer) gravitational (disturbance) potential can be neglected when applying an appropriate coordinate transformation (see, e.g., [65, 69, 99] for more details). Therefore, this operation can be interpreted physically as filtering out the contribution of the radially symmetric density structures in the Earth's interior (note that the total mass of an anharmonic density function is zero).

3.2 Spectral Inversion for Regular Regions

The spherically reflected results will now be extended to the investigation of the inverse problem A_{G^c}[F] = V, where A_{G^c}[F] is the gravitational potential of a regular region G ⊂ R³ and F ∈ L²(G) is the desired mass density distribution. As already known from (57), the null space of the operator A_{G^c} is given by

N(A_{G^c}) = AnHarm(G). \tag{83}

A general complete orthonormal basis system for the harmonic functions inside or outside an arbitrary regular region is not available. This is the reason why the following setting is useful (cf. [89]): Let the families of functions {H_{n,j}(G; ·)}_{n∈N_0; j=1,...,2n+1} and {H_{−n−1,j}(G^c; ·)}_{n∈N; j=1,...,2n+1} be complete orthonormal systems of the Hilbert spaces (Harm(G), ⟨·, ·⟩_{L²(G)}) and (Harm(G^c), ⟨·, ·⟩_{L²(G^c)}), respectively, and let {k_G^∧(n)}_{n∈N_0} be the symbol of

A_{G^c} : L^2(G) \to Y|_{G^c} = R(A_{G^c}) = A_{G^c}(L^2(G)), \tag{84}

given by

A_{G^c}[F](x) = \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} k_G^{\wedge}(n)\, \left\langle F, H_{n,j}(G; \cdot) \right\rangle_{L^2(G)} H_{-n-1,j}(G^c; x), \qquad x \in G^c \tag{85}

(note that H_{−1,1}(G^c; ·) is not an element of L²(G^c)). We assume that k_G^∧(n) ≠ 0 for all n ∈ N_0.

If ∂G is a sphere with radius β around the origin, we let

H_{n,j}(G; \cdot) := H_{n,j}(\beta; \cdot), \qquad n \in \mathbb{N}_0,\ j \in \{1, \ldots, 2n+1\}, \tag{86}

H_{-n-1,j}(G^c; \cdot) := H_{-n-1,j}(\beta; \cdot), \qquad n \in \mathbb{N},\ j \in \{1, \ldots, 2n+1\}. \tag{87}

Moreover, we set

k_G^{\wedge}(n) = k_\beta^{\wedge}(n) = \frac{4\pi}{2n+1}\, \frac{\beta^2}{\sqrt{(2n-1)(2n+3)}}. \tag{88}
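To make the instability of the inversion concrete for the ball case, the following sketch (not part of the original text) evaluates the symbol (88): k_β^∧(n) decays like n^{-2}, so inverting A_{G^c} amplifies the n-th data coefficient by roughly n²:

```python
import numpy as np

beta = 1.0

def k_sym(n):
    """Symbol k_β^∧(n) of the Newton operator for a ball of radius β, eq. (88)."""
    return 4 * np.pi / (2 * n + 1) * beta**2 / np.sqrt((2 * n - 1) * (2 * n + 3))

n = np.arange(1, 2001)          # n = 0 would give (2n-1) < 0; start at n = 1
k = k_sym(n)
assert np.all(np.diff(k) < 0)   # the symbol decays monotonically
amplification = k[9] / k[999]   # compare degrees n = 10 and n = 1000
assert 8e3 < amplification < 1.3e4   # ≈ (1000/10)², i.e. O(n⁻²) decay
```

The inverse symbol 1/k_β^∧(n) thus grows quadratically, which is the spectral face of the discontinuous inverse in Hadamard's criterion (iii).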

The inverse problem A_{G^c}[F] = V, with F ∈ Harm(G) unknown, is solvable if and only if V ∈ Harm(G^c) with

\sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} \left( \frac{\left\langle V, H_{-n-1,j}(G^c; \cdot) \right\rangle_{L^2(G^c)}}{k_G^{\wedge}(n)} \right)^2 < \infty.

Within the de la Vallée Poussin framework discussed next, the Newton kernel is replaced by a mollification G^{VP}_ρ(Δ; ·) that coincides with it outside the mollification ball,

G^{VP}_\rho(\Delta; |x-y|) = \frac{1}{4\pi}\, \frac{1}{|x-y|}, \qquad |x-y| > \rho, \tag{124}

which


satisfies

-\Delta_x G^{VP}_\rho(\Delta; |x-y|) = VP_\rho(|x-y|), \qquad x, y \in \mathbb{R}^3. \tag{125}

An elementary calculation yields

-\frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d}{dr} \left( 1 - \frac{r^2}{\rho^2} \right) \right) = \frac{6}{\rho^2}, \tag{126}

so that

-\Delta_x VP_\rho(|x-y|) = D^{VP}_\rho(|x-y|), \qquad x, y \in \mathbb{R}^3, \tag{127}

where

D^{VP}_\rho(|x-y|) = \begin{cases} \dfrac{6}{\rho^2}\, C^{VP}_\rho, & |x-y| \le \rho, \\[1mm] 0, & |x-y| > \rho. \end{cases} \tag{128}

Clearly, all methodological concepts developed for the Haar case together with its multiscale settings remain valid. Their formulations are straightforward. The following result, however, serves as a strategic basis for our approach to density feature extraction in specific representation within the de la Vallée Poussin framework.

Theorem 1. The "ρ-de la Vallée Poussin potential functions"

V^{VP}_\rho(x) = \int_{G} G^{VP}_\rho(\Delta; |x-y|)\, F(y)\, dy \tag{129}

and the "ρ-de la Vallée Poussin density function"

F^{VP}_\rho(x) = \int_{G} VP_\rho(|x-y|)\, F(y)\, dy \tag{130}

satisfy the relations

\sup_{x \in \overline{G}} \left| V(x) - V^{VP}_\rho(x) \right| = O(\rho^2), \qquad \rho \to 0, \tag{131}

and

\lim_{\rho \to 0} \sup_{x \in \overline{G}} \left| \alpha(x)\, F(x) - F^{VP}_\rho(x) \right| = 0, \tag{132}

where α(x) is the solid angle subtended by the boundary ∂G at x ∈ \overline{G}.
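Relation (132) can be illustrated numerically at an interior point, where α(x) = 1. The sketch below (not part of the original text) assumes the normalization C^VP_ρ = 15/(8πρ³), which makes VP_ρ integrate to one over B_ρ, and convolves VP_ρ with a smooth test density by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(2)

def F(p):                     # smooth test density (an illustrative assumption)
    return 1.0 + p[..., 0] + 0.5 * p[..., 1]**2

def F_vp(x, rho, n=400_000):
    """Monte Carlo value of F^VP_ρ(x) = ∫_{B_ρ(x)} VP_ρ(|x-y|) F(y) dy, cf. (130).
    Assumes C^VP_ρ = 15/(8πρ³), so that VP_ρ integrates to one."""
    u = rng.normal(size=(n, 3))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    r = rho * rng.uniform(size=n)**(1 / 3)        # uniform points in B_ρ(x)
    y = x + u * r[:, None]
    vp = 15 / (8 * np.pi * rho**3) * (1 - (r / rho)**2)
    vol = 4 / 3 * np.pi * rho**3
    return vol * np.mean(vp * F(y))

x = np.array([0.2, -0.1, 0.3])
err_coarse = abs(F_vp(x, 0.5) - F(x))
err_fine = abs(F_vp(x, 0.1) - F(x))
assert err_fine < err_coarse and err_fine < 1e-2   # F^VP_ρ(x) → F(x) as ρ → 0
```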

Unfortunately, de la Vallée Poussin potentials V^{VP}_ρ do not generally show a faster convergence to V than V^H_ρ.

Approximate Mollifier Solution. In analogy to our previous Haar considerations we use the operators

A^{VP}_\rho[F] = V^{VP}_\rho = \int_{G} G^{VP}_\rho(\Delta; |\cdot - z|)\, F(z)\, dz, \qquad F \in L^2(G), \tag{133}

and

I^{VP}_\rho[F] = F^{VP}_\rho = \int_{G} VP_\rho(|\cdot - z|)\, F(z)\, dz, \qquad F \in L^2(G). \tag{134}

We denote the image of X = L²(G) under the operator A^{VP}_ρ by Y^{VP}_ρ. So, instead of discussing the integral A[F](x), we choose A^{VP}_ρ[F], F ∈ L²(G), for some sufficiently small ρ > 0. We take advantage of the fact that

\int_{G} G(\Delta; |x-z|)\, D^{VP}_\rho(|y-z|)\, dz = VP_\rho(|x-y|), \qquad x, y \in G. \tag{135}

Note that

\Delta_x \int_{G} G(\Delta; |x-z|)\, D^{VP}_\rho(|y-z|)\, dz = \Delta_x VP_\rho(|x-y|) = -D^{VP}_\rho(|x-y|), \qquad x, y \in G. \tag{136}

After these preliminaries we are able to conclude that

I^{VP}_\rho[F](x) = F^{VP}_\rho(x) = \int_{G} VP_\rho(|x-w|)\, F(w)\, dw
= \int_{G} \left( \int_{G} G(\Delta; |w-z|)\, D^{VP}_\rho(|x-z|)\, dz \right) F(w)\, dw
= \int_{G} D^{VP}_\rho(|x-z|) \left( \int_{G} G(\Delta; |w-z|)\, F(w)\, dw \right) dz
= \int_{G} D^{VP}_\rho(|x-z|)\, A[F](z)\, dz
= \int_{G} D^{VP}_\rho(|x-z|)\, V(z)\, dz \tag{137}

holds true for x ∈ G, so that


F^{VP}_\rho(x) = \int_{G} D^{VP}_\rho(|x-z|)\, V(z)\, dz, \qquad x \in G. \tag{138}

The right hand side of (138) is given analytically when the parameter ρ is chosen appropriately. So, if we define the operator S_ρ : Y^{VP}_ρ → X in the form

F^{VP}_\rho = S_\rho[V], \qquad S_\rho[V] = \int_{G} D^{VP}_\rho(|\cdot - z|)\, V(z)\, dz, \tag{139}

then, by (138), this operator maps the gravitational potential to mollified solutions of (ITIGP). This property motivates the term mollified inverse of A used for S_ρ. The discretization of the identity (139) given by

F^{VP}_\rho(x) \approx \sum_{i=1}^{N} w_i\, D^{VP}_\rho(|x - z_i^N|)\, V(z_i^N), \qquad z_i^N \in G,\ x \in G, \tag{140}

may serve as an alternative to improve local density knowledge from given internal (e.g., borehole) data V(z_i^N), i = 1, . . . , N, where the values w_i, i = 1, . . . , N, are the known integration weights.

Finally, it should be noted that, more generally, any singular integral (cf. [94, 95]) can be chosen in analogy to the de la Vallée Poussin kernel, i.e., the smoothed Haar kernel, as long as its Laplace derivative takes a reasonable role in the mollification context.

4.3 Singular Integral-Type Mollifier Method

First we recapitulate the concept of a singular integral: Let {K_ρ}_{ρ>0} be a family of functions r ↦ K_ρ(r), r ≥ 0, satisfying the following conditions:

(i) K_ρ(r) = 0, r > ρ,
(ii) K_ρ(r) ≥ 0, r ≥ 0,
(iii) K_ρ|[0, ρ] is of class C^{(∞)},
(iv) -\frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d}{dr} K_\rho(r) \right) \neq 0, \quad r \in [0, \rho],
(v) 4\pi \int_0^\rho K_\rho(r)\, r^2\, dr = 1.

Then, the family {I_ρ}_{ρ>0} of operators I_ρ : F ↦ I_ρ[F], F ∈ Z (Z = C^{(0)}(R³) or Z = L²(R³)), given by

I_\rho[F](x) = F_\rho(x) = \int_{\mathbb{R}^3} K_\rho(|x-y|)\, F(y)\, dy = \int_{B_\rho(x)} K_\rho(|x-y|)\, F(y)\, dy, \tag{141}

is called a singular integral in Z, if the following approximate identity relation holds true

\lim_{\rho \to 0} \left\| I_\rho[F] - F \right\|_Z = 0 \tag{142}

for all F ∈ Z. Obviously, an example of a singular integral of the aforementioned type is given by the de la Vallée Poussin kernel. Let G be a regular region. Suppose that {K_ρ}_{ρ>0} is a kernel constituting a singular integral in the L²-metric; then it is not difficult to show (see, e.g., [94, 95]) that the limit relation

\lim_{\rho \to 0} \left( \int_{G} \left| I_\rho[F](x) - F(x) \right|^2 dx \right)^{1/2} = 0 \tag{143}

holds true for all F ∈ L²(G), while, for all F ∈ C^{(0)}(\overline{G}), we have

\lim_{\rho \to 0} \sup_{x \in \overline{G}} \left| I_\rho[F](x) - F(x) \right| = 0. \tag{144}

Corresponding to the family {K_ρ}_{ρ>0} we are led to families {G_ρ}_{ρ>0} and {D_ρ}_{ρ>0} such that

-\Delta_x G_\rho(\Delta; |x-y|) = K_\rho(|x-y|), \qquad x, y \in \mathbb{R}^3, \tag{145}

and

-\Delta_x K_\rho(|x-y|) = D_\rho(|x-y|), \qquad x, y \in \mathbb{R}^3. \tag{146}

Our interest now is in the terrestrial gravimetry problem (TGP), which may be regarded as a particularly relevant problem in geoscientific practice (our considerations, however, remain valid for (ITGP)). We start from known values V(x_i), x_i ∈ ∂G, i = 1, . . . , N, given by

A[F](x_i) = \int_{G} G(\Delta; |x_i - z|)\, F(z)\, dz = V(x_i), \qquad x_i \in \partial G,\ i = 1, \ldots, N, \tag{147}

which can be thought of as resulting from moment discretization of the gravimetry integral equation (cf. (6))

A[F](x) = \int_{G} G(\Delta; |x-z|)\, F(z)\, dz = V(x), \qquad x \in \partial G,\ F \in L^2(G). \tag{148}

(TGP) aims at determining an approximation of the function F ∈ L²(G) from the N equations (147). Introducing the settings

{}^{(N)}A[F] := \left( \int_{G} G(\Delta; |x_1 - z|)\, F(z)\, dz, \ \ldots, \ \int_{G} G(\Delta; |x_N - z|)\, F(z)\, dz \right)^T, \tag{149}

{}^{(N)}v := \left( V(x_1), \ldots, V(x_N) \right)^T, \tag{150}

we are able to rewrite the equations (147) in operator form as follows:

{}^{(N)}A : L^2(G) \to \mathbb{R}^N, \qquad F \mapsto {}^{(N)}v = {}^{(N)}A[F]. \tag{151}

We look for an approximate inverse {}^{(N)}S : R^N → L²(G) for {}^{(N)}A in the form

{}^{(N)}S\, t := \sum_{i=1}^{N} t_i\, D(|x_i - \cdot|), \qquad t = (t_1, \ldots, t_N)^T, \ t_i = V(x_i), \tag{152}

in terms of functions D(|x_i − ·|) ∈ L²(G), i = 1, . . . , N, satisfying

{}^{(N)}S\, {}^{(N)}A[F] = \sum_{i=1}^{N} \left( \int_{G} G(\Delta; |x_i - z|)\, F(z)\, dz \right) D(|x_i - \cdot|)
= \int_{G} F(z) \sum_{i=1}^{N} G(\Delta; |x_i - z|)\, D(|x_i - \cdot|)\, dz. \tag{153}

Now, the stage is set for explaining the mollifier philosophy: the sum

\sum_{i=1}^{N} G(\Delta; |x - x_i|)\, D(|x_i - y|) \tag{154}

is understood as a discrete version of the "continuous expression"

\int_{G} G(\Delta; |x-z|)\, D(|z-y|)\, dz \approx \delta(|x-y|), \tag{155}

whose "mollifier version" for some family {K_ρ}_{ρ>0} constituting a singular integral is given by

\int_{G} G(\Delta; |x-z|)\, D_\rho(|z-y|)\, dz = K_\rho(|x-y|), \tag{156}

with sufficiently small ρ > 0. This observation leads to the sum

{}^{(N)}S_\rho\, t = \sum_{i=1}^{N} V(x_i)\, D_\rho(|x_i - \cdot|) \tag{157}

and

{}^{(N)}S_\rho\, {}^{(N)}A[F] = \sum_{i=1}^{N} \left( \int_{G} G(\Delta; |x_i - z|)\, F(z)\, dz \right) D_\rho(|x_i - \cdot|)
= \int_{G} F(z) \sum_{i=1}^{N} G(\Delta; |x_i - z|)\, D_\rho(|x_i - \cdot|)\, dz \tag{158}

as approximations to {}^{(N)}S\, t and {}^{(N)}S\, {}^{(N)}A[F], respectively.

4.4 Moment Method

Next we mention the finite moment problem for (TIGP). For that purpose we assume that the N potential (volume integral) values

\int_{G} G(\Delta; |x_i - y|)\, F(y)\, dy = V(x_i), \qquad x_i \in \partial G,\ i = 1, \ldots, N, \tag{159}

are known. The standard solution process (see, e.g., [22, 71]) consists of finding a linear combination in terms of the functions x ↦ G(Δ; |x_i − x|), x ∈ G, x_i ∈ ∂G, i = 1, . . . , N. In other words, the moment method looks for a function F ∈ X_N satisfying the conditions (159), where X_N is given by

X_N := \operatorname{span}_{i=1,\ldots,N}\, G(\Delta; |x_i - \cdot|). \tag{160}

As a consequence, the moment solution is a harmonic function inside G. More formally, consider again a semi-discrete observation operator {}^{(N)}A : L²(G) → R^N, F ↦ {}^{(N)}v = {}^{(N)}A[F], of the type (149) and (150). Remembering F ∈ X_N and choosing F as the linear combination

F = \sum_{i=1}^{N} \beta_i\, G(\Delta; |x_i - \cdot|), \tag{161}

we are led to a (uniquely solvable) linear system in the unknowns β_1, . . . , β_N, viz.

\sum_{i=1}^{N} \beta_i \int_{G} G(\Delta; |x_i - y|)\, G(\Delta; |x_j - y|)\, dy = V(x_j), \qquad j = 1, \ldots, N, \tag{162}

16 Inverse Gravimetry: Density Signatures from Gravitational Potential Data

1007

that turns out to play a central role in the context of minimum norm (spline) interpolation in reproducing kernel Hilbert spaces as discussed later on.
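The linear system (162) is plain linear algebra once the Gram integrals are available. The following sketch is an illustration under assumptions not made in the chapter: $G$ is taken as the unit ball, the Gram integrals of (162) are approximated by plain Monte Carlo, and the data are generated synthetically from a density of the form (161), so that solving the system must recover the coefficients $\beta_i$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup (not from the chapter): G = unit ball, N = 4 data
# points x_i on the boundary sphere, G(Delta; r) = 1/(4*pi*r).
N = 4
pts = rng.normal(size=(N, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)   # x_i on the unit sphere

def green(r):
    return 1.0 / (4.0 * np.pi * r)

# Monte Carlo sample of the unit ball (rejection from the enclosing cube).
z = rng.uniform(-1.0, 1.0, size=(200_000, 3))
z = z[np.linalg.norm(z, axis=1) < 1.0]
vol = 4.0 * np.pi / 3.0

# Basis functions y -> G(Delta; |x_i - y|) evaluated on the sample.
B = green(np.linalg.norm(pts[:, None, :] - z[None, :, :], axis=2))  # (N, n_mc)

# Gram matrix of Eq. (162): M_ij ~ int_G G(|x_i - y|) G(|x_j - y|) dy.
M = (B @ B.T) * vol / z.shape[0]

# Synthetic truth: density F = sum_i beta_i G(|x_i - .|), data V(x_j) = (M beta)_j.
beta_true = np.array([1.0, -2.0, 0.5, 3.0])
V = M @ beta_true

# Moment method: recover beta from the data by solving Eq. (162).
beta = np.linalg.solve(M, V)
```

Since the data were produced with the same Gram matrix, the solve returns the generating coefficients exactly (up to conditioning), which makes the scheme easy to sanity-check.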

4.5 Backus–Gilbert Method

The concept originally proposed by G.E. Backus and F. Gilbert (cf. [6–8]) is that one does not primarily wish to solve the finite moment problem as explained above, but rather one is interested in how well all possible candidates for a solution can be recovered pointwise. More specifically, the Backus–Gilbert method is based on a pointwise minimization criterion: Keep $y \in G$ fixed and determine the numbers $\mu_i$ $(= \mu_i(y))$, $i = 1, \ldots, N$, as the solution of the following minimization problem:

$$\int_G |z - y|^2 \left| \sum_{i=1}^N \mu_i\, G(\Delta; |x_i - z|) \right|^2 dz \to \min \tag{163}$$

subject to $\mu \in \mathbb{R}^N$, $\mu = (\mu_1, \ldots, \mu_N)^T$, with

$$\int_G \sum_{i=1}^N \mu_i\, G(\Delta; |x_i - z|)\, dz = 1. \tag{164}$$

It should be remarked that the factor $z \mapsto |z - y|^2$, $z \in G$, in the integrand of (163) is a measure for the concentration of the sum $\sum_{i=1}^N \mu_i\, G(\Delta; |x_i - y|)$ around the point $y \in G$ under consideration. In the literature (see, e.g., [83, 118]), more generally, the term $z \mapsto |z - y|^{2\nu}$, $z \in G$, $\nu \ge 1$, is sometimes chosen. In this case, the larger $\nu$, the more concentrated is the sum $\sum_{i=1}^N \mu_i\, G(\Delta; |x_i - y|)$ around $y \in G$. In matrix-vector nomenclature (thereby notationally omitting the dependence on the fixed, but arbitrary point $y \in G$) we are able to rewrite the quadratic optimization problem (163) and (164) in the form

$$\mu \cdot Q\,\mu \to \min \tag{165}$$

subject to

$$\kappa \cdot \mu = 1, \tag{166}$$

where

$$(Q)_{i,j} := \int_G |z - y|^2\, G(\Delta; |x_i - z|)\, G(\Delta; |x_j - z|)\, dz, \quad i, j = 1, \ldots, N, \tag{167}$$

and

$$\kappa_j := \int_G G(\Delta; |x_j - z|)\, dz, \quad j = 1, \ldots, N. \tag{168}$$

In fact, the formulas (165) and (166) constitute a quadratic minimization problem with only one linear equation constraint. We may assume that $\kappa = (\kappa_1, \ldots, \kappa_N)^T$ is different from $0$, since otherwise the constraint (166) cannot be valid. The introduction of a Lagrange multiplier, well known from optimization theory (see, e.g., [137]), can be used to characterize the solvability, i.e., existence and uniqueness, of the resulting linear system $Q\mu - \lambda\kappa = 0$ under the constraint $\kappa \cdot \mu = 1$. In more detail, from the integral in (163), we see that $\mu \cdot Q\,\mu \ge 0$, and $\mu \cdot Q\,\mu = 0$ implies $\sum_{i=1}^N \mu_i\, G(\Delta; |x_i - \cdot|) = 0$, so that the linear independence of the system $\{G(\Delta; |x_i - \cdot|)\}_{i=1,\ldots,N}$ shows that $Q$ is positive definite. Summarizing our results, we therefore obtain the following statement: The symmetric matrix $Q \in \mathbb{R}^{N \times N}$ as defined by (167) is positive definite for every $y \in G$. The quadratic minimization problem (165) and (166) is uniquely solvable. The vector $\mu$ is the unique solution of (165) and (166) if and only if there exists a real number $\lambda$ (the Lagrange multiplier) so that $(\mu, \lambda) \in \mathbb{R}^{N+1}$ solves the linear system $Q\mu - \lambda\kappa = 0$ under the constraint $\kappa \cdot \mu = 1$. The Lagrange multiplier $\lambda = \mu \cdot Q\,\mu$ represents the minimal value of the quadratic minimization problem.

Consider the unique solution $\mu \in \mathbb{R}^N$, $\mu = (\mu_1, \ldots, \mu_N)^T$, $\mu_i = M_i(y)$, $i = 1, \ldots, N$, of the quadratic minimization problem (165) and (166). The Backus–Gilbert solution $F_N$ of the discrete version of TIGP

$$\int_G G(\Delta; |x_i - z|)\, F_N(z)\, dz = V(x_i), \quad x_i \in \partial G, \; i = 1, \ldots, N, \tag{169}$$

is defined by

$$F_N = \sum_{i=1}^N V(x_i)\, \mu_i \tag{170}$$

in $G$. The minimal value $\lambda$ (more accurately, $\lambda(y)$) is called the spread. As already mentioned, the Backus–Gilbert solution (170) generally is not a solution of the finite moment problem (159). This observation is certainly a disadvantage. Therefore, the question arises if the error may be estimated in an appropriate way (see [71] for related work in the one-dimensional context): Let $F \in L^2(G)$ be any solution of the finite moment problem (159). Suppose that $F_N$ given by (170) is the Backus–Gilbert solution. Then, in connection with (164), it follows that

$$F_N(y) - F(y) = \sum_{i=1}^N V(x_i)\, \mu_i - F(y) \int_G \sum_{i=1}^N \mu_i\, G(\Delta; |x_i - z|)\, dz = \sum_{i=1}^N \int_G G(\Delta; |x_i - z|)\, \big( F(z) - F(y) \big)\, \mu_i\, dz \tag{171}$$

holds true. Consequently, we obtain

$$|F_N(y) - F(y)| \le \int_G \left| \sum_{i=1}^N G(\Delta; |x_i - z|)\, \mu_i \right| |F(z) - F(y)|\, dz. \tag{172}$$

Under the assumption of Lipschitz-continuity of $F$ in $G$, i.e., the existence of a constant $C_F$ so that

$$|F(z) - F(y)| \le C_F\, |z - y|, \quad y, z \in G, \tag{173}$$

we are able to deduce that

$$|F_N(y) - F(y)| \le C_F \int_G \left| \sum_{i=1}^N G(\Delta; |x_i - z|)\, \mu_i \right| |z - y|\, dz. \tag{174}$$

By virtue of the Cauchy–Schwarz inequality we therefore obtain from (174)

$$|F_N(y) - F(y)| \le C_F \int_G 1 \cdot \left| \sum_{i=1}^N G(\Delta; |x_i - z|)\, \mu_i \right| |z - y|\, dz \le C_F \sqrt{\|G\|} \left( \int_G \left| \sum_{i=1}^N G(\Delta; |x_i - z|)\, \mu_i \right|^2 |z - y|^2\, dz \right)^{1/2} \tag{175}$$

(with $\|G\|$ denoting the volume of $G$). For $N \in \mathbb{N}$, $y \in G$, we set

$$e_N^2(y) := \min \left\{ \int_G |Z_N(z)|^2\, |z - y|^2\, dz \; : \; Z_N \in X_N, \; \int_G Z_N(z)\, dz = 1 \right\}. \tag{176}$$

Thus, we finally arrive at

$$|F_N(y) - F(y)| \le C_F \sqrt{\|G\|}\; e_N(y) \tag{177}$$

as pointwise error estimate of the difference of the solution of the finite moment problem (159) and the Backus–Gilbert solution (170).

We conclude our considerations with the question if the Backus–Gilbert method admits a relation to the mollifier method: Once again, consider the semi-discrete observation operator

$$^{(N)}A : L^2(G) \to \mathbb{R}^N, \quad F \mapsto {}^{(N)}v = {}^{(N)}A[F], \tag{178}$$

where

$$^{(N)}A[F] := \left( \int_G G(\Delta; |x_1 - z|)\, F(z)\, dz,\; \ldots,\; \int_G G(\Delta; |x_N - z|)\, F(z)\, dz \right)^T, \tag{179}$$

$$^{(N)}v := (V(x_1), \ldots, V(x_N))^T. \tag{180}$$

By virtue of the operator $^{(N)}S$ given by

$$\left[ {}^{(N)}S\, v \right](y) = \sum_{i=1}^N V(x_i)\, \mu_i(y), \quad y \in G, \tag{181}$$

we have constructed a left inverse $^{(N)}S : \mathbb{R}^N \to L^2(G)$ such that

$$^{(N)}S\, {}^{(N)}A[F](y) = \sum_{i=1}^N \left( \int_G G(\Delta; |x_i - z|)\, F(z)\, dz \right) \mu_i(y) = \int_G F(z)\, \underbrace{\sum_{i=1}^N G(\Delta; |x_i - z|)\, \mu_i(y)}_{\simeq\, \delta(|z - y|)}\, dz \simeq F(y). \tag{182}$$

Note that we are formally allowed (in distributional context) to formulate

$$F(y) = \int_G F(z)\, \delta(|z - y|)\, dz \simeq \int_G F(z) \int_G G(\Delta; |x - z|)\, M(|x - y|)\, dx\, dz, \tag{183}$$

where, in analogy to (146), we have

$$-\Delta_z\, \delta(|z - y|) = M(|z - y|) \simeq -\Delta_z \sum_{i=1}^N G(\Delta; |x_i - z|)\, \mu_i(y). \tag{184}$$
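The characterization of the Backus–Gilbert minimizer can be verified numerically: for a positive definite $Q$ and $\kappa \ne 0$, solving $Q\mu = \lambda\kappa$ with $\kappa \cdot \mu = 1$ gives $\mu = Q^{-1}\kappa / (\kappa \cdot Q^{-1}\kappa)$ and the spread $\lambda = 1/(\kappa \cdot Q^{-1}\kappa)$. The sketch below uses a synthetic $Q$ and $\kappa$ (hypothetical stand-ins, since the integrals (167) and (168) are not evaluated here) and checks the optimality conditions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5

# Synthetic stand-ins (illustration only): a symmetric positive definite Q
# as in Eq. (167) and a constraint vector kappa != 0 as in Eq. (168).
R = rng.normal(size=(N, N))
Q = R @ R.T + N * np.eye(N)        # positive definite by construction
kappa = rng.normal(size=N)

# Lagrange approach: Q mu = lambda kappa with kappa . mu = 1 yields
# mu = Q^{-1} kappa / (kappa . Q^{-1} kappa), lambda = 1 / (kappa . Q^{-1} kappa).
w = np.linalg.solve(Q, kappa)
lam = 1.0 / (kappa @ w)            # spread = minimal value mu . Q mu
mu = lam * w

# Conditions of (165)/(166): feasibility and Q mu - lambda kappa = 0.
assert np.isclose(kappa @ mu, 1.0)
assert np.allclose(Q @ mu - lam * kappa, 0.0)

# mu is the constrained minimizer: any feasible perturbation increases mu.Qmu,
# since the cross term d.Qmu = lam * (kappa . d) vanishes for kappa . d = 0.
for _ in range(100):
    d = rng.normal(size=N)
    d -= (kappa @ d) / (kappa @ kappa) * kappa   # keep kappa . (mu + d) = 1
    assert (mu + d) @ Q @ (mu + d) >= mu @ Q @ mu - 1e-12
```

The perturbation loop is a direct numerical counterpart of the positive-definiteness argument in the text.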

5 Reproducing Kernel Hilbert Space (RKHS) Methods

Next we consider reproducing kernel Hilbert space “mollifier solutions”. First we discuss the classical geodetic External/Terrestrial Inverse Gravimetry Problem (ETIGP). Then we go over to the Internal/Terrestrial/External Inverse Gravimetry Problem (ITEIGP), i.e., the inverse gravimetry problem in the whole Euclidean space $\mathbb{R}^3$.

5.1 Preparatory Material on Reproducing Kernel Hilbert Spaces

Since reproducing kernel Hilbert space structure plays a central role in our forthcoming mollifier approach, some preparatory aspects should be presented for the convenience of the reader (for more details see, e.g., [4, 20, 68, 112, 120, 138]): A Hilbert space $H$ of real-valued functions on a set $G$ (e.g., a regular region) is called a reproducing kernel Hilbert space (RKHS) if all the evaluation functionals $H \ni F \mapsto F(t) \in \mathbb{R}$ are bounded (continuous) for each fixed $t \in G$, i.e., there exists a positive constant $C_t$ for each $t \in G$ such that $|F(t)| \le C_t\, \|F\|_H$ for all $F \in H$. By the Riesz Representation Theorem, for each $t \in G$, there exists a unique element $K_t$ such that $F(t) = \langle F, K_t \rangle_H$ for all $F \in H$. The reproducing kernel $K(\cdot,\cdot) : G \times G \to \mathbb{R}$ of a RKHS $H$ is defined by $K(s,t) = \langle K_s, K_t \rangle_H$, $s, t \in G$. We list some basic properties of RKHS’s that are particularly relevant for our approach:

• $K(s,t) = K(t,s)$ for all $t, s \in G$.
• $K(s,s) \ge 0$ for all $s \in G$.
• $|K(s,t)| \le \sqrt{K(s,s)}\, \sqrt{K(t,t)}$ for all $s, t \in G$.
• The reproducing kernel $K(s,t)$ on $G \times G$ is a non-negative definite Hermitean kernel. Conversely, by the Aronszajn–Moore Theorem, every non-negative definite Hermitean function $K(\cdot,\cdot)$ on $G \times G$ determines a unique Hilbert space $H_K$ for which $K(\cdot,\cdot)$ is a reproducing kernel (cf. [4]) (note that a kernel $M$ on $G \times G$ is said to be positive definite if, for any $n$ points $t_1, \ldots, t_n \in G$, the matrix $A = (M(t_i, t_j))_{1 \le i,j \le n}$ is non-negative definite, i.e.,

$$u^T A\, u = \sum_{i,j=1}^n u_i\, M(t_i, t_j)\, u_j \ge 0 \tag{185}$$

for all $u = (u_1, \ldots, u_n) \in \mathbb{R}^n$).
• A closed subspace $\tilde H$ of a RKHS $H$ is also a RKHS. Moreover, the orthogonal projector $P$ of $H$ onto $\tilde H$ and the reproducing kernel $\tilde K(s,t)$ of the RKHS $\tilde H$ are related by $P F(s) = \langle F, \tilde K_s \rangle$, $s \in G$, for all $F \in H$, where $\tilde K_s = P K_s$.
• In a RKHS, the element representing a given bounded linear functional $L$ can be expressed by means of the reproducing kernel: $L(F) = \langle F, W \rangle_H$, where $W = L(K)$.
• If $G$ is a regular region and $K(\cdot,\cdot)$ is continuous on $G \times G$, then $H_K$ is a space of continuous functions.
• If the integral relation

$$\int_{G \times G} |Q(s,t)|^2\, ds\, dt < \infty \tag{186}$$

holds true, then $Q(\cdot,\cdot)$ has a countable sequence of eigenvalues and eigenfunctions (Theorem of Mercer).
• $L^2(G)$, the space of all square-integrable functions on the regular region $G$, is not a RKHS. Indeed, the point evaluation is not well defined. Each function $F \in L^2(G)$ is actually an equivalence class of functions equal to each other almost everywhere. Thus the “value” at a point has no meaning, since any point has measure zero.
• Let $\{\Phi_n\}_{n \in \mathbb{N}}$ be a sequence of functions defined on $G$ such that, for every $t \in G$,

$$\sum_{n=1}^\infty |\Phi_n(t)|^2 < \infty. \tag{187}$$

For every sequence $\{c_n\}_{n \in \mathbb{N}}$ satisfying

$$\sum_{n=1}^\infty |c_n|^2 < \infty, \tag{188}$$

the series $\sum_{n=1}^\infty c_n \Phi_n(t)$ is then convergent for every $t \in G$. The functions which are the sums of such series form a linear space $H$, on which we are able to define the structure of a separable Hilbert space by taking as scalar product, for

$$F = \sum_{n=1}^\infty c_n \Phi_n, \quad G = \sum_{n=1}^\infty d_n \Phi_n, \tag{189}$$

the number

$$\langle F, G \rangle_H = \sum_{n=1}^\infty c_n d_n. \tag{190}$$

This space has a reproducing kernel, namely

$$K(s,t) = \sum_{n=1}^\infty \Phi_n(t)\, \Phi_n(s), \quad s, t \in G. \tag{191}$$

• Let $H$ be a separable RKHS; then its reproducing kernel $K(\cdot,\cdot)$ has the expansion

$$K(s,t) = \sum_{n=1}^\infty \Phi_n(t)\, \Phi_n(s), \tag{192}$$

where $\{\Phi_n\}_{n=1}^\infty$ is an orthonormal basis for $H$ (we remark that, for a general separable Hilbert space $H$, $\sum_{n=1}^\infty \Phi_n(t)\, \Phi_n(s)$ is not a reproducing kernel (note that $L^2(G)$ is not a RKHS), and also that the $\Phi_n$’s do not generally correspond to sampling expansions. If they do, i.e., if $\Phi_n(t) = K(t_n, t)$ for some sequence $\{t_n\}$, then we have that $F(t) = \sum_{n=1}^\infty F(t_n)\, \Phi_n(t)$; this constitutes a sampling theorem).
• If the reproducing kernel $K(s,t)$ of a RKHS $H$ is continuous on $G \times G$, then $H$ is a space of continuous functions (uniformly continuous on a bounded $G$). This follows from

$$|F(t) - F(s)| = |\langle F, K_t - K_s \rangle_H| \le \|F\|_H\, \|K_t - K_s\|_H \tag{193}$$

and

$$\|K_t - K_s\|^2 = K(t,t) - 2K(t,s) + K(s,s) \tag{194}$$

for all $s, t \in G$.
• Strong convergence in a RKHS $H$ implies pointwise convergence and uniform convergence on compact sets, because of the fact

$$|F(t) - F_n(t)| = |\langle F - F_n, K_t \rangle_H| \le \sqrt{K(t,t)}\, \|F - F_n\|_H. \tag{195}$$

• Let $H_K$ denote the RKHS with reproducing kernel $K$, and denote the inner product and norm in $H_K$ by $\langle \cdot, \cdot \rangle_{H_K}$ and $\|\cdot\|_{H_K}$, respectively. Note that $K(s, s')\, (= K_s(s'))$ is a non-negative definite Hermitean kernel on $G \times G$, and that $\{K_s, s \in G\}$ spans $H_K$, since $\langle K_s, F \rangle_{H_K} = 0$, $s \in G$, implies $F(s) = 0$. For more properties of reproducing kernel spaces the reader is referred to, e.g., [107, 109–113], and the references therein.
• For every positive definite kernel $K(\cdot,\cdot)$ on $G \times G$, there exists a zero mean Gaussian process with $K(\cdot,\cdot)$ as its covariance, giving rise to the relation between Bayes estimates, Gaussian processes, and optimization processes in RKHS (for more details the reader is referred to the geodetic literature, see, e.g., [51, 87, 88, 97], and the monographs [83, 135]).

Interest in reproducing kernel Hilbert spaces has increased in recent years, as the computer capacity has made solutions of ever larger and more complex problems practicable. Indeed, new reproducing kernel representations and new applications (in particular in physical geodesy and geophysics) are being contributed at a rapid rate. For example, a certain RKHS in terms of outer harmonics allows the adequate determination of the Earth’s gravitational potential (see, e.g., [25, 124] for early spline approaches) in consistency with gravitational observables of heterogeneous type (that are interpretable as (bounded) linear functionals on the RKHS under consideration). In our approach we are particularly interested in the reproducing kernel Hilbert space framework involved in inverse gravimetry, i.e., the reproducing kernel Hilbert spaces are restrictions of $Y$ (cf. (53)) to certain subspaces of $\mathbb{R}^3$.
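The series construction (187)–(192) can be illustrated in a fully discrete setting. In the sketch below (an assumption for illustration, not from the chapter: the "domain" is a grid of $m$ points, functions are vectors, and the inner product is the Euclidean one), the kernel built from an orthonormal system via (192) reproduces every function in its span.

```python
import numpy as np

rng = np.random.default_rng(2)

# Discrete sketch: m grid points, n orthonormal "basis functions" stored as
# the columns of Phi (QR factorization guarantees Phi^T Phi = I_n).
m, n = 50, 6
Phi, _ = np.linalg.qr(rng.normal(size=(m, n)))

# Kernel of Eq. (192): K(s, t) = sum_k Phi_k(s) Phi_k(t), here a matrix.
K = Phi @ Phi.T

# Any F in H = span{Phi_k} is reproduced: <F, K(., t)> = F(t) for every t,
# which in matrix form reads K @ F = F.
c = rng.normal(size=n)
F = Phi @ c
assert np.allclose(K @ F, F)

# Basic kernel properties from the bullet list: symmetry and
# non-negative definiteness (cf. Eq. (185)).
assert np.allclose(K, K.T)
assert np.linalg.eigvalsh(K).min() > -1e-12
```

Outside the span of the $\Phi_k$ the identity $K F = F$ fails, mirroring the remark that $\sum_n \Phi_n(t)\Phi_n(s)$ is a reproducing kernel only for the space the $\Phi_n$ generate.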

5.2 External/Terrestrial RKHS for Regular Regions

Let $P_{Harm(G)}$ and $P_{AnHarm(G)}$ be the orthogonal projectors of the space $L^2(G)$ onto $Harm(G)$ and $N(A_{G^c}) = AnHarm(G)$, respectively. Then, every function $F$ of the Hilbert space $L^2(G)$ can be uniquely decomposed in the form

$$F = P_{Harm(G)}[F] + P_{AnHarm(G)}[F] \tag{196}$$

such that

$$A_{G^c}[F] = A_{G^c}\!\left[ P_{Harm(G)}[F] \right] + \underbrace{A_{G^c}\!\left[ P_{AnHarm(G)}[F] \right]}_{=0} = A_{G^c}\!\left[ P_{Harm(G)}[F] \right]. \tag{197}$$

Furthermore, it is clear that

$$\|F\|^2_{L^2(G)} = \left\| P_{Harm(G)}[F] \right\|^2_{L^2(G)} + \left\| P_{AnHarm(G)}[F] \right\|^2_{L^2(G)}. \tag{198}$$

In conclusion, $P_{Harm(G)}[F]$ is that function of class $L^2(G)$ which has the smallest $L^2(G)$-norm among all (density) functions in $L^2(G)$ generating the same potential in the space $Y|_{G^c} = A_{G^c}(L^2(G))$. Consequently, to every $P \in Y|_{G^c}$, there corresponds a unique $F \in Harm(G)$ such that

$$A_{G^c}[F] = A_{G^c}\!\left[ P_{Harm(G)}[F] \right] = P. \tag{199}$$

The restriction $A_{G^c}|_{Harm(G)}$ is a linear bijective operator, i.e., to every $P \in Y|_{G^c}$ there exists a unique $F \in Harm(G)$ such that $A_{G^c}|_{Harm(G)}[F] = P$. On the space $Y|_{G^c}$ we are able to induce an inner product $\langle \cdot, \cdot \rangle_{Y|_{G^c}}$ by defining

$$\left\langle A_{G^c}|_{Harm(G)}[F],\; A_{G^c}|_{Harm(G)}[G] \right\rangle_{Y|_{G^c}} = \langle F, G \rangle_{L^2(G)}, \tag{200}$$

where $F, G \in L^2(G)$. $Y|_{G^c}$ equipped with the inner product $\langle \cdot, \cdot \rangle_{Y|_{G^c}}$ is a Hilbert space. $A_{G^c}|_{Harm(G)}$ is an isometric operator relating $L^2(G)$ to $Y|_{G^c}$. Our goal is to show that $(Y|_{G^c}, \langle \cdot, \cdot \rangle_{Y|_{G^c}})$ is a reproducing kernel Hilbert space, i.e., a Hilbert space equipped with the reproducing kernel $K_{Y|_{G^c}}(\cdot,\cdot)$. It is clear that, for every $x \in G^c$, $G(\Delta; |x - \cdot|)$ is an element of $Harm(G)$. From well-known reproducing Hilbert space theory (see, e.g., [4]), it follows that any given potential $P \in Y|_{G^c}$ can be represented in the form

$$P(x) = A_{G^c}|_{Harm(G)}[F](x) = \langle G(\Delta; |x - \cdot|), F \rangle_{L^2(G)}, \quad x \in G^c, \; F \in Harm(G). \tag{201}$$

For $x \in G^c$, the evaluation functional $E_x[P] = P(x)$ is a bounded functional on $Y|_{G^c}$. Indeed, from the Cauchy–Schwarz inequality applied to (201) we obtain

$$|E_x[P]| = |P(x)| \le \|F\|_{L^2(G)}\, \|G(\Delta; |x - \cdot|)\|_{L^2(G)}. \tag{202}$$

Consequently, we have

$$|E_x[P]| = |P(x)| \le C_x\, \|P\|_{Y|_{G^c}}, \quad P \in Y|_{G^c}, \; x \in G^c. \tag{203}$$

Thus, a necessary and sufficient condition for the Hilbert space $Y|_{G^c}$ to possess a reproducing kernel (see, e.g., [4]) is fulfilled. Even more, we are able to find the explicit expression of the reproducing kernel $K_{Y|_{G^c}}(\cdot,\cdot) : G^c \times G^c \to \mathbb{R}$ for the Hilbert space $Y|_{G^c}$ such that, for every $P \in Y|_{G^c}$, the reproducing property

$$P(x) = \left\langle P,\; K_{Y|_{G^c}}(x, \cdot) \right\rangle_{Y|_{G^c}}, \quad x \in G^c, \tag{204}$$

is valid. For $x \in G^c$ and $F \in Harm(G)$ such that $A_{G^c}[F] = P$, we obtain

$$P(x) = \langle F, G(\Delta; |x - \cdot|) \rangle_{L^2(G)} = \langle A_{G^c}[F], A_{G^c}[G(\Delta; |x - \cdot|)] \rangle_{Y|_{G^c}} = \langle P, A_{G^c}[G(\Delta; |x - \cdot|)] \rangle_{Y|_{G^c}}. \tag{205}$$

Hence, $K_{Y|_{G^c}}(x, \cdot) = A_{G^c}[G(\Delta; |x - \cdot|)]$, i.e., we have for $x, y \in G^c$: The volume integral

$$K_{Y|_{G^c}}(x,y) = \langle G(\Delta; |x - \cdot|), G(\Delta; |y - \cdot|) \rangle_{L^2(G)} = \frac{1}{(4\pi)^2} \int_G \frac{1}{|x - z|\,|y - z|}\, dz \tag{206}$$

represents the (unique) reproducing kernel of $Y|_{G^c}$.

Clearly, for “geoscientifically relevant geometries” $G$ such as geoid, real Earth, etc., the integral (206) has to be determined by approximate integration rules such as presented in [32] (see also the literature therein). For purposes of exploration, cuboids play a particular role, for which standard product integration techniques are available. Summarizing our considerations we end up with the following result: The class $\left( Y|_{G^c}, \langle \cdot, \cdot \rangle_{Y|_{G^c}} \right)$ constitutes a Hilbert space possessing the reproducing kernel (206).

5.3 External/Terrestrial RKHS for Balls

For the special case of a ball $B_\beta(0)$ of radius $\beta$ around the origin the kernel $K_{Y|B_\beta^c(0)}(\cdot,\cdot)$ given by

$$K_{Y|B_\beta^c(0)}(x,y) = \frac{1}{(4\pi)^2} \int_{B_\beta(0)} \frac{1}{|x - z|\,|y - z|}\, dz \tag{207}$$

can be expressed as a series representation by use of the expansion (see, e.g., [47])

$$G(\Delta; |x - y|) = \frac{1}{4\pi} \sum_{n=0}^\infty \frac{|y|^n}{|x|^{n+1}}\, P_n\!\left( \frac{x}{|x|} \cdot \frac{y}{|y|} \right), \quad |y| < |x|, \tag{208}$$

where $P_n$ is the Legendre polynomial of degree $n$. In connection with (207) we obtain

$$K_{Y|B_\beta^c(0)}(x,y) = \frac{\beta^3}{4\pi\, |x||y|} \sum_{n=0}^\infty \frac{1}{(2n+1)(2n+3)} \left( \frac{\beta^2}{|x||y|} \right)^{n} P_n\!\left( \frac{x}{|x|} \cdot \frac{y}{|y|} \right). \tag{209}$$

We are interested in an explicit expression of the infinite Legendre sum (209). To this end, we have a closer look at the term

$$\frac{1}{(2n+1)(2n+3)} \tag{210}$$

that can be decomposed via partial fraction decomposition in the form

$$\frac{1}{(2n+1)(2n+3)} = \frac{1}{2(2n+1)} - \frac{1}{2(2n+3)}. \tag{211}$$

As a consequence, with the abbreviations $h := \beta/\sqrt{|x||y|}$ and $t := \frac{x}{|x|} \cdot \frac{y}{|y|}$, the reproducing kernel can be rewritten in the form

$$K_{Y|B_\beta^c(0)}(x,y) = \underbrace{\frac{\beta}{8\pi}\, h \sum_{n=0}^\infty \frac{h^{2n+1}}{2n+1}\, P_n(t)}_{=: \Phi_1(h,t)} \; - \; \underbrace{\frac{\beta}{8\pi}\, \frac{1}{h} \sum_{n=0}^\infty \frac{h^{2n+3}}{2n+3}\, P_n(t)}_{=: \Phi_2(h,t)}. \tag{212}$$

Case 1. We let $h = \beta/\sqrt{|x||y|} < 1$ and $t = \frac{x}{|x|} \cdot \frac{y}{|y|} \in (-1, 1)$. Under this assumption we consider the Legendre expansions



$$\Phi_1(h,t) = \frac{\beta}{8\pi}\, h \sum_{n=0}^\infty \frac{h^{2n+1}}{2n+1}\, P_n(t), \tag{213}$$

$$\Phi_2(h,t) = \frac{\beta}{8\pi}\, \frac{1}{h} \sum_{n=0}^\infty \frac{h^{2n+3}}{2n+3}\, P_n(t). \tag{214}$$

Recalling the generating series by means of the Legendre polynomials (see, e.g., [1, 85])

$$\sum_{n=0}^\infty h^{2n}\, P_n(t) = \frac{1}{\sqrt{1 + h^4 - 2h^2 t}}, \quad h \in [0,1), \; t \in [-1,1], \tag{215}$$

we obtain by integration of both sides of (215) with respect to $h$

$$\sum_{n=0}^\infty \frac{h^{2n+1}}{2n+1}\, P_n(t) = \int \frac{1}{\sqrt{1 + h^4 - 2h^2 t}}\, dh. \tag{216}$$

The Ph.D. thesis [73] provides a closed-form representation (217) of $\Phi_1(h,t)$: the antiderivative (216) is expressed through the incomplete elliptic integral $F$ of the first kind (see, e.g., [85]), with complex modulus and argument built from $\sqrt{t \pm \sqrt{t^2 - 1}}$, together with the square root $\sqrt{1 + h^4 - 2h^2 t}$; we do not reproduce the lengthy expression here. Analogously, for the determination of $\Phi_2(h,t)$, we have

$$\sum_{n=0}^\infty h^{2n+2}\, P_n(t) = h^2 \sum_{n=0}^\infty h^{2n}\, P_n(t) = \frac{h^2}{\sqrt{1 + h^4 - 2h^2 t}}, \tag{218}$$

so that integration of the last equation with respect to $h$ yields

$$\sum_{n=0}^\infty \frac{h^{2n+3}}{2n+3}\, P_n(t) = \int \frac{h^2}{\sqrt{1 + h^4 - 2h^2 t}}\, dh. \tag{219}$$

By use of the elliptic integral $E$ of the second kind (see, e.g., [85]), [73] deduces a corresponding closed-form representation (220) of $\Phi_2(h,t)$ as a combination of the elliptic integrals $E$ and $F$.

Next we are concerned with the special cases that $t = \frac{x}{|x|} \cdot \frac{y}{|y|} = \pm 1$ and $h = \beta/\sqrt{|x||y|} < 1$. These cases do not need the reduction to elliptic integrals:

Case 2. We let $h = \beta/\sqrt{|x||y|} < 1$ and $t = \frac{x}{|x|} \cdot \frac{y}{|y|} = 1$. Now we obtain

$$\sum_{n=0}^\infty \frac{h^{2n+1}}{2n+1}\, \underbrace{P_n(1)}_{=1} = \sum_{n=0}^\infty \frac{h^{2n+1}}{2n+1} = \int \frac{1}{\sqrt{1 + h^4 - 2h^2}}\, dh = \int \frac{1}{1 - h^2}\, dh. \tag{221}$$

Note that $h = \beta/\sqrt{|x||y|}$ and $1 - h^2 > 0$, since $h^2 \in [0,1)$. Using the partial fraction decomposition $\frac{1}{1-h^2} = \frac{1}{2}\left( \frac{1}{1+h} + \frac{1}{1-h} \right)$, we are led to

$$\int \frac{1}{1 - h^2}\, dh = \frac{1}{2} \int \frac{1}{1-h}\, dh + \frac{1}{2} \int \frac{1}{1+h}\, dh = \frac{1}{2} \left( -\ln(1-h) + \ln(1+h) \right) = \frac{1}{2} \ln\!\left( \frac{1+h}{1-h} \right). \tag{222}$$

It follows that (note that $t = 1$)

$$\sum_{n=0}^\infty \frac{h^{2n+3}}{2n+3} = \int \frac{h^2}{\sqrt{1 + h^4 - 2h^2}}\, dh = \int \frac{h^2}{1 - h^2}\, dh. \tag{223}$$

Using the partial fraction decomposition $\frac{h^2}{1-h^2} = \frac{1}{2}\left( -2 + \frac{1}{1-h} + \frac{1}{1+h} \right)$ we see that

$$\int \frac{h^2}{1 - h^2}\, dh = \frac{1}{2} \left( -2h - \ln(1-h) + \ln(1+h) \right) = \frac{1}{2} \left( \ln\!\left( \frac{1+h}{1-h} \right) - 2h \right). \tag{224}$$

Hence, we arrive at the following representation of the kernel $K_{Y|B_\beta^c(0)}(x,y)$:

$$K_{Y|B_\beta^c(0)}(x,y) = \frac{\beta}{8\pi} \left( \frac{h}{2} \ln\!\left( \frac{1+h}{1-h} \right) - \frac{1}{2h} \left( \ln\!\left( \frac{1+h}{1-h} \right) - 2h \right) \right) = \frac{1}{16\pi} \left( \left( \frac{\beta^2}{\sqrt{|x||y|}} - \sqrt{|x||y|} \right) \ln\!\left( \frac{\sqrt{|x||y|} + \beta}{\sqrt{|x||y|} - \beta} \right) + 2\beta \right). \tag{225}$$

Case 3. We let $h = \beta/\sqrt{|x||y|} < 1$ and $t = \frac{x}{|x|} \cdot \frac{y}{|y|} = -1$. Now we start from

$$\sum_{n=0}^\infty \frac{h^{2n+1}}{2n+1}\, \underbrace{P_n(-1)}_{=(-1)^n} = \sum_{n=0}^\infty (-1)^n \frac{h^{2n+1}}{2n+1} = \int \frac{1}{\sqrt{1 + h^4 + 2h^2}}\, dh = \int \frac{1}{1 + h^2}\, dh = \tan^{-1}(h). \tag{226}$$

We use the partial fraction decomposition $\frac{h^2}{1+h^2} = 1 - \frac{1}{1+h^2}$ and obtain

$$\sum_{n=0}^\infty (-1)^n \frac{h^{2n+3}}{2n+3} = \int \frac{h^2}{\sqrt{1 + h^4 + 2h^2}}\, dh = \int \frac{h^2}{1 + h^2}\, dh = \int \left( 1 - \frac{1}{1+h^2} \right) dh = h - \tan^{-1}(h). \tag{227}$$

As a consequence we obtain the following representation of the kernel $K_{Y|B_\beta^c(0)}(x,y)$:

$$K_{Y|B_\beta^c(0)}(x,y) = \frac{\beta}{8\pi} \left( h \tan^{-1}(h) - \frac{1}{h} \left( h - \tan^{-1}(h) \right) \right) = \frac{1}{8\pi} \left( \left( \frac{\beta^2}{\sqrt{|x||y|}} + \sqrt{|x||y|} \right) \tan^{-1}\!\left( \frac{\beta}{\sqrt{|x||y|}} \right) - \beta \right). \tag{228}$$

All in all, our approach shows that the reproducing kernel does not provide a closed representation even in the spherical framework, since the occurring elliptic integrals do not admit a closed form. As a consequence, suitable techniques of constructive approximation and/or numerical integration are required.
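The closed representations (225) and (228) can be checked against the Legendre series (209), which for $t = \pm 1$ reduces to a scalar series because $P_n(1) = 1$ and $P_n(-1) = (-1)^n$. A small sketch (the values of $\beta$ and $\sqrt{|x||y|}$ are illustrative choices, not taken from the chapter):

```python
import math

beta = 0.8
s = 1.5            # s = sqrt(|x||y|), so h = beta/s < 1

def K_series(t, n_terms=300):
    # Truncated Legendre sum (209) for t = +1 or t = -1 only.
    h2 = (beta / s) ** 2
    acc, hp = 0.0, 1.0
    for n in range(n_terms):
        pn = 1.0 if t == 1 else (-1.0) ** n
        acc += pn * hp / ((2 * n + 1) * (2 * n + 3))
        hp *= h2
    return beta ** 3 / (4.0 * math.pi * s * s) * acc

def K_case2():
    # Closed form (225) for t = +1.
    return ((beta ** 2 / s - s) * math.log((s + beta) / (s - beta))
            + 2.0 * beta) / (16.0 * math.pi)

def K_case3():
    # Closed form (228) for t = -1.
    return ((beta ** 2 / s + s) * math.atan(beta / s)
            - beta) / (8.0 * math.pi)

assert abs(K_series(1) - K_case2()) < 1e-12
assert abs(K_series(-1) - K_case3()) < 1e-12
```

Since $h^2 \approx 0.28$ here, the series converges geometrically and agrees with the closed forms to machine precision after a few dozen terms.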

5.4 External/Terrestrial/Internal RKHS for Regular Regions

A careful look at the volume integral

$$(x,y) \mapsto \int_G G(\Delta; |x - z|)\, G(\Delta; |z - y|)\, dz \tag{229}$$

already showed that (229) exists for all $x, y \in \mathbb{R}^3$ (with $x, y \in G^c$ constituting even a regular integral expression). Furthermore, to every $F \in L^2(G)$, there exists a unique $V \in Y$ of the form

$$V(x) = A[F](x) = \int_G G(\Delta; |x - y|)\, F(y)\, dy, \quad x \in \mathbb{R}^3. \tag{230}$$

On the space $Y$ we are able to impose an inner product $\langle \cdot, \cdot \rangle_Y$ by setting

$$\langle A[F], A[G] \rangle_Y = \langle F, G \rangle_{L^2(G)}, \tag{231}$$

where $F, G \in L^2(G)$. The space $Y$ equipped with the inner product $\langle \cdot, \cdot \rangle_Y$ is a Hilbert space. Note that, for all $x \in \mathbb{R}^3$, the Cauchy–Schwarz inequality yields the estimate

$$|V(x)| \le \sqrt{\int_G |G(\Delta; |x - y|)|^2\, dy}\; \sqrt{\int_G |F(y)|^2\, dy}, \tag{232}$$

where it is already known that there exists a constant $C_x$ such that

$$|V(x)| \le C_x \sqrt{\int_G |F(y)|^2\, dy} \tag{233}$$

holds true for all $x \in G^c$. Moreover, for $x \in G$ and some value $R \ge d$,

$$d = \operatorname{diam}(G) = \max_{x,y \in G} |x - y|, \tag{234}$$

we are able to see that

$$\int_G |G(\Delta; |x - y|)|^2\, dy = \frac{1}{(4\pi)^2} \int_G \frac{1}{|x - y|^2}\, dy \le \frac{1}{(4\pi)^2} \int_{B_R(x)} \frac{1}{|x - y|^2}\, dy = \frac{1}{(4\pi)^2} \int_0^R \int_{|x-y|=r} \frac{1}{|x - y|^2}\, dS(y)\, dr = \frac{R}{4\pi}. \tag{235}$$

Altogether, we are able to conclude that, for each fixed $x \in \mathbb{R}^3$, the evaluation functional $E_x$ is bounded. Hence, a necessary and sufficient condition that $(Y, \langle \cdot, \cdot \rangle_Y)$ may be specified as a reproducing kernel Hilbert space (see Sect. 5.1) is satisfied. In fact, for $x \in G$ and $F \in L^2(G)$, we obtain

$$V(x) = \langle G(\Delta; |x - \cdot|), F \rangle_{L^2(G)} = \langle A[G(\Delta; |x - \cdot|)], A[F] \rangle_Y = \langle A[G(\Delta; |x - \cdot|)], V \rangle_Y, \tag{236}$$

so that

$$K_Y(x,y) = \int_G G(\Delta; |x - z|)\, G(\Delta; |z - y|)\, dz = \frac{1}{(4\pi)^2} \int_G \frac{1}{|x - z|}\, \frac{1}{|z - y|}\, dz, \quad x, y \in \mathbb{R}^3, \tag{237}$$

is the unique reproducing kernel of $Y$. Summarizing our considerations we are finally allowed to formulate the following result: The image space $Y = A[L^2(G)]$ is a reproducing kernel Hilbert space having the reproducing kernel

$$K_Y(x,y) = \frac{1}{(4\pi)^2} \int_G \frac{1}{|x - z|}\, \frac{1}{|z - y|}\, dz, \quad x, y \in \mathbb{R}^3. \tag{238}$$

It should be remarked that reproducing kernel Hilbert space structure is of particular importance in the inversion of Newton’s Law of Gravitation, since the reproducing kernel framework makes a numerical computation efficient and economical, as we shall see from the following gravimetric spline context (note that [25–27, 30, 37, 48, 124] include details about harmonic splines).

5.5 External/Terrestrial/Internal Spline Theory for Regular Regions

Let $G$ be a regular region. Suppose that $\{x_1, \ldots, x_N\}$, $x_i \ne x_j$, $i \ne j$, is a discrete set of $N$ given points in $\mathbb{R}^3$. Assume that the values $\gamma_i = V(x_i)$, $x_i \in \mathbb{R}^3$, $i = 1, \ldots, N$, constitute a given data set from the Newton potential (6). We want to find an approximation $S_N^V$ to the potential $V$ such that

$$S_N^V(x_i) = V(x_i) = \gamma_i, \quad i = 1, \ldots, N. \tag{239}$$

(If the data are noisy, interpolation should be replaced by smoothing (see, e.g., [48] and the references therein).) A functional value $V(x)$ at a point $x \in \mathbb{R}^3$ can be identified with an evaluation functional

$$E_x : V \mapsto E_x[V] = V(x), \quad V \in Y. \tag{240}$$

For each $x \in \mathbb{R}^3$, the linear functional $E_x$ defined by (240) is bounded on $Y$, i.e., $|E_x[V]| = |V(x)| \le C_x\, \|V\|_Y$. Moreover, for $x \in \partial G$ and for all $V \in Y$ we have $E_x[V] = V(x) = \langle V, K_Y(x, \cdot) \rangle_Y$.

Spline Method. The Newton potential $V$, from which the discrete data are known, is considered to be an element of the Hilbert space $Y$ possessing the reproducing kernel $K_Y(\cdot,\cdot)$, while the observed values at the points $x_1, \ldots, x_N \in \mathbb{R}^3$ are assumed to be associated with linearly independent bounded functionals $E_{x_1}, \ldots, E_{x_N}$. In doing so, we are able to find a minimum norm solution $S_N^V \in Y$ as a linear combination of the representers $E_{x_i}[K_Y(\cdot,\cdot)]$ of the functionals $E_{x_1}, \ldots, E_{x_N}$, i.e., $S_N^V$ is meant as the projection of $V$ to the $N$-dimensional linear subspace spanned by the linearly independent representers $E_{x_i}[K_Y(\cdot,\cdot)]$, $i = 1, \ldots, N$ (see, e.g., [20]).

Let $\{x_1, \ldots, x_N\} \subset \mathbb{R}^3$ be a point system such that the evaluation functionals $E_{x_1}, \ldots, E_{x_N}$ are linearly independent. Then, within the set

$$I^V_{E_{x_1}, \ldots, E_{x_N}} = \{ U \in Y : E_{x_i}[U] = E_{x_i}[V] = \gamma_i, \; i = 1, \ldots, N \}, \tag{241}$$

the minimum norm interpolation problem of finding $S_N^V$ that satisfies

$$\|S_N^V\|_Y = \inf_{U \in I^V_{E_{x_1}, \ldots, E_{x_N}}} \|U\|_Y \tag{242}$$

is well posed, i.e., its solution exists, is unique, and depends continuously on the data $\gamma_1, \ldots, \gamma_N$. The uniquely determined solution $S_N^V$ is given in the explicit form

$$S_N^V(x) = \sum_{i=1}^N a_i^N\, E_{x_i}[K_Y(x, \cdot)], \quad x \in \mathbb{R}^3, \tag{243}$$

where the coefficients $a_1^N, \ldots, a_N^N$ are determined by solving the linear system of equations

$$\sum_{i=1}^N a_i^N\, E_{x_i} E_{x_j}[K_Y(\cdot,\cdot)] = \gamma_j, \quad j = 1, \ldots, N. \tag{244}$$

As a consequence of the interpolation procedure, the density inside $G$ is obtained as a linear combination in terms of fundamental solutions with singularities in the points $\{x_1, \ldots, x_N\}$:

$$S_N^F(x) = -\Delta_x S_N^V(x) = -\sum_{i=1}^N a_i^N\, E_{x_i}[G(\Delta; |x - \cdot|)] = -\sum_{i=1}^N a_i^N\, G(\Delta; |x - x_i|), \quad x \in G \setminus \{x_1, \ldots, x_N\}. \tag{245}$$

As a consequence, $S_N^F$ is a harmonic function provided that $\{x_1, \ldots, x_N\} \subset G^c$.
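The interpolation scheme (241)–(244) is plain linear algebra once the kernel can be evaluated. The sketch below shows the procedure end-to-end; as an assumption for illustration it replaces the Newtonian kernel $K_Y$ of (238), whose evaluation requires a volume integral, by a simple positive definite stand-in kernel, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in kernel (assumption for illustration; NOT the kernel (238)):
# a Gaussian kernel, which is strictly positive definite on R^3.
def kernel(x, y):
    return np.exp(-np.sum((x - y) ** 2))

N = 8
X = rng.normal(size=(N, 3))          # "observation points" x_1, ..., x_N
gamma = rng.normal(size=N)           # "potential data" gamma_i = V(x_i)

# Spline matrix of system (244): entries E_{x_i} E_{x_j} K = K(x_i, x_j).
A = np.array([[kernel(xi, xj) for xj in X] for xi in X])
a = np.linalg.solve(A, gamma)

# Spline evaluation, Eq. (243): S(x) = sum_i a_i K(x, x_i).
def spline(x):
    return sum(ai * kernel(x, xi) for ai, xi in zip(a, X))

# Interpolation conditions (239) hold at the data points.
assert np.allclose([spline(xi) for xi in X], gamma)
```

With the true kernel (238), only the entries of the matrix $A$ change; the solve-and-evaluate structure, and hence the cost profile discussed next, stays the same.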

Example (Multiscale Spline Mollifier Interpolation Model). The BP density model (see Fig. 5) was created by Frederic Billette and Sverre Brandsberg-Dahl as a blind test for the velocity workshop at the 66th EAGE international meeting in Paris, France, in June 2004. The left side of the model is based on a geological cross section through the Western Gulf of Mexico. The central part of the model is a simplified representation of geologic features in the Eastern/Central Gulf of Mexico and off-shore Angola. The right side of the model is a composite representation of velocity issues encountered in the Caspian Sea, North Sea, or Trinidad.

Fig. 5 Section of the BP density model chosen for the implementation of the spline interpolation in [g/cm3]

The results of our numerical experiences may be summarized as follows: The low-pass filtered density functions provide essential structural geological information, particularly for larger scale numbers $j$ (see Fig. 6). The band-pass (detail) density functions also show strong separation surfaces between geological formations, comparable to the earlier singular integral based approach. The local area of the BP-model was taken to avoid occurring storage calamities.

Spline Solutions in Terms of Gaussian Sums. Until now, the proposed methods for numerically calculating the reproducing kernel are either to apply some cubature rule or, in the spherical context, to truncate the associated spherical harmonic expansion. However, a truncated spherical harmonic expansion of acceptable numerical effort usually is not accurate enough for terrestrial data points. In what follows we are interested in spline approximation as an inversion procedure based on evaluation functionals. As, in spline approximation, one needs to evaluate reproducing kernel sum expressions when calculating the full-sized spline matrix, we have to find a proper replacement of

$$K_{A[L^2(G)]}(x,y) = \frac{1}{(4\pi)^2} \int_G \frac{1}{|x - z|\,|y - z|}\, dz, \quad x, y \in \mathbb{R}^3, \tag{246}$$

that can be evaluated in an economic and efficient way. Our goal is to substitute the kernel of a monopole $|x - z|^{-1}$ by a linear combination of Gaussians, i.e.,

$$\frac{1}{|z - x|} \approx \sum_{m=1}^M \omega_m\, e^{-\alpha_m |x - z|^2}, \tag{247}$$

16 Inverse Gravimetry: Density Signatures from Gravitational Potential Data

a

b 2000

2000 2200

2.2

2200

2400

2

2400

2600

1.8

2600

2800

1.6

2800

3000

1.4

3000

3200

1.2

3200

3400

1

3400

3600

0.8

3600

3800

0.6

3800

4000

0.4

4000

2.3

2.4

2.5

2.6

j=6

c

2.7 x 104

2.5

2

1.5

1

0.5 2.3

2.4

2.5

2.6

j=7

d

2000

2.7 x 104

2000

2200

3

2200 2

2400 2600

2400

2.5

2600 1.5

2800 3000

2

2800 3000

1

3200 3400

1.5

3200 1

3400 0.5

3600 3800 4000

1025

3600

0.5

3800 0 2.3

2.4

2.5

2.6

j=8

2.7 x 104

4000

2.3

2.4

2.5

j=9

2.6

2.7 x 104

Fig. 6 Spline interpolation of the density [g/cm3 ] in a local region for different scales j

so that

$$K_{A[L^2(G)]}(x,y) = \frac{1}{(4\pi)^2} \int_G \frac{1}{|x - z|\,|y - z|}\, dz \;\approx\; K^M_{A[L^2(G)]}(x,y) := \frac{1}{(4\pi)^2} \sum_{m=1}^M \sum_{n=1}^M \omega_m \omega_n \int_G e^{-\alpha_m |x - z|^2}\, e^{-\alpha_n |y - z|^2}\, dz \tag{248}$$

(note that the right hand side of (247) allows the separation into Cartesian coordinates; thus, multi-dimensional integrals over cuboids can be handled iteratively, just by one-dimensional integration).
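Each summand of (248) indeed separates: completing the square combines the two Gaussians into a constant factor times a single Gaussian centered at $p = (\alpha_m x + \alpha_n y)/(\alpha_m + \alpha_n)$, and over a cuboid the remaining integral factors into three one-dimensional integrals expressible through $\operatorname{erf}$. A sketch with illustrative values (all points, exponents, and the cuboid are hypothetical choices), cross-checked against a brute-force midpoint rule:

```python
import math

am, an = 2.0, 3.0
x = (0.3, -0.2, 0.5)
y = (-0.1, 0.4, 0.0)
lo, hi = (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0)   # cuboid [lo, hi]^3

# Completing the square: am|x-z|^2 + an|y-z|^2
#   = g|z-p|^2 + (am*an/g)|x-y|^2 with g = am + an.
g = am + an
pref = math.exp(-am * an / g * sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def seg(a, b, p):
    # int_a^b exp(-g (u - p)^2) du via the error function.
    return 0.5 * math.sqrt(math.pi / g) * (
        math.erf(math.sqrt(g) * (b - p)) - math.erf(math.sqrt(g) * (a - p)))

I_erf = pref
for k in range(3):
    p = (am * x[k] + an * y[k]) / g
    I_erf *= seg(lo[k], hi[k], p)

# Brute-force midpoint rule as a cross-check of the separation.
n = 40
h = 2.0 / n
I_num = 0.0
for i in range(n):
    zi = -1.0 + (i + 0.5) * h
    for j in range(n):
        zj = -1.0 + (j + 0.5) * h
        for k in range(n):
            zk = -1.0 + (k + 0.5) * h
            d1 = (zi - x[0])**2 + (zj - x[1])**2 + (zk - x[2])**2
            d2 = (zi - y[0])**2 + (zj - y[1])**2 + (zk - y[2])**2
            I_num += math.exp(-am * d1 - an * d2) * h**3

assert abs(I_erf - I_num) / I_erf < 1e-3
```

The erf-based evaluation costs three one-dimensional calls per summand, which is what makes the Gaussian-sum kernel economical for large spline matrices.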

1026

W. Freeden and M. Z. Nashed

The critical point concerning (247) is to specify the coefficients αm , ωm , m = 1, . . . , M, and the integer M. Different approaches have been proposed in the literature: In [60], the approximation is attacked by a Newton-type optimization procedure. In [61], a Remez algorithm exploits specific properties of a certain error functional (not discussed here in more detail). Fast multipole methods (see, e.g., [19, 53]) also provide tools of great numerical significance. Our approach here closely parallels the concepts presented in [14,15]. This concept starts with an initial approximation obtained by the appropriate discretization of an integral expression of |z − x|−1 . Afterwards, in order to reduce the number M of terms of the Gaussian sum on the right side of (247), an algorithm is applied based on Prony’s method. An advantage is that we are able to develop our results for the one-dimensional fuction r → r −1 , r ∈ [δ, 1], with some sufficiently small δ > 0: M 1 2 ωm e−αm r , r ∈ [δ, 1]  r

(249)

m=1

(note that we can transform any closed interval to this type of interval so that we can restrict ourselves, without loss of generality to the interval [δ, 1]). Our interest is in guaranteeing that   M 1  ε 2 rel   , ωm e−αm r  ≤  − r  r

r ∈ [δ, 1],

(250)

m=1

where εrel > 0 is (a bound of) the relative error of the approximation (note that the relative error states up to which significant digit our approximations are correct; in fact, if we compare the absolute error with the relative error via the relation εrel = maxr∈[δ,1] (εabs(r) r), it becomes clear that the absolute error may become larger than the relative error under our assumptions).

Remark 1. By choosing a scale parameter ρ instead of δ, the Newtonian kernel G(Δ; |x − y|) admits, e.g., an "exponential Haar mollification" (regularization) of the form

$$G^H_\rho(\Delta; |x-y|) \approx \begin{cases} \dfrac{1}{8\pi\rho}\left(3 - \dfrac{|x-y|^2}{\rho^2}\right), & |x-y| \le \rho, \\[2mm] \dfrac{1}{4\pi} \displaystyle\sum_{m=1}^{M} \omega_m\, e^{-\alpha_m |x-y|^2}, & \rho < |x-y|. \end{cases} \qquad (251)$$

Remark 2. Suppose that an interval [A, B] is given with the property A/B ≥ δ. Then, in the sense of (250), we may conclude that

16 Inverse Gravimetry: Density Signatures from Gravitational Potential Data

$$\left|\frac{1}{r} - \sum_{m=1}^{M} \frac{\omega_m}{B}\, e^{-\frac{\alpha_m}{B^2} r^2}\right| \le \frac{\varepsilon_{\mathrm{rel}}}{r}, \quad r \in [A, B]. \qquad (252)$$

In other words, by setting $\omega_m^{A,B} = \omega_m/B$ and $\alpha_m^{A,B} = \alpha_m/B^2$ we obtain the coefficients for an approximation of 1/r on the interval [A, B] (note that the (bound of the) relative error is not affected by this parametrisation).
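As a quick numerical illustration of Remark 2 (the coefficients and values below are our own arbitrary choices, not taken from the text), the rescaling is the algebraic identity $\sum_m (\omega_m/B)\, e^{-(\alpha_m/B^2) r^2} = \frac{1}{B}\sum_m \omega_m\, e^{-\alpha_m (r/B)^2}$, so any approximation of 1/r on [δ, 1] is inherited on [δB, B]:

```python
import math

def gaussian_sum(omegas, alphas, r):
    # Evaluate sum_m omega_m * exp(-alpha_m * r^2).
    return sum(w * math.exp(-a * r * r) for w, a in zip(omegas, alphas))

def rescale(omegas, alphas, B):
    # Remark 2: coefficients for 1/r on [A, B] from coefficients on [delta, 1]:
    # omega -> omega / B, alpha -> alpha / B^2.
    return [w / B for w in omegas], [a / (B * B) for a in alphas]

# Arbitrary illustrative coefficients (not an actual 1/r approximation).
omegas, alphas = [0.3, 1.1, 2.7], [0.5, 4.0, 25.0]
B, r = 10.0, 7.3
w2, a2 = rescale(omegas, alphas, B)
# Identity: the rescaled sum at r equals (1/B) times the original sum at r/B.
print(gaussian_sum(w2, a2, r), gaussian_sum(omegas, alphas, r / B) / B)
```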

Initial Approximation. We start from the integral representation

$$\frac{1}{r} = \frac{2}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-r^2 e^{2s} + s}\, ds, \qquad (253)$$

that easily follows from

$$\frac{2}{\sqrt{\pi}} \int_a^b e^{-r^2 e^{2s}+s}\, ds = \frac{2}{\sqrt{\pi}} \int_{e^a}^{e^b} e^{-(rt)^2}\, dt = \frac{2}{\sqrt{\pi}}\, \frac{1}{r} \int_{re^a}^{re^b} e^{-x^2}\, dx = \frac{1}{r}\left[\operatorname{erf}(re^b) - \operatorname{erf}(re^a)\right] \qquad (254)$$

by applying the limit relations $\lim_{b\to\infty} \operatorname{erf}(re^b) = 1$, $\lim_{a\to-\infty} \operatorname{erf}(re^a) = 0$, where erf is the error function given by

$$\operatorname{erf}(s) = \frac{2}{\sqrt{\pi}} \int_0^s \exp(-t^2)\, dt, \quad s \ge 0. \qquad (255)$$
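The identity (254) is easy to check numerically. The following sketch (our own illustration; the values of r, a, b and the grid size are arbitrary choices) compares a fine trapezoidal discretization of the truncated integral with the closed erf expression:

```python
import math

def integrand(s, r):
    # Integrand of (253): exp(-r^2 * e^{2s} + s).
    return math.exp(-r * r * math.exp(2.0 * s) + s)

def trapezoid(f, a, b, n):
    # Composite trapezoidal rule with n subintervals.
    h = (b - a) / n
    return h * (0.5 * f(a) + 0.5 * f(b) + sum(f(a + k * h) for k in range(1, n)))

r, a, b = 0.5, -8.0, 4.0
lhs = 2.0 / math.sqrt(math.pi) * trapezoid(lambda s: integrand(s, r), a, b, 20000)
rhs = (math.erf(r * math.exp(b)) - math.erf(r * math.exp(a))) / r  # right side of (254)
print(lhs, rhs)  # both approach 1/r = 2 as a -> -infinity, b -> +infinity
```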

Next we apply the composite trapezoidal rule (see, e.g., [32]) to discretize the integral (253) in order to deduce an initial approximation of the form (249). In more detail, if a, b are given such that −∞ < a < 0 < b < ∞ and M + 1 is the total number of grid points in the trapezoidal rule, then using the abbreviations

$$s_m := a + mh, \quad m = 0, \ldots, M, \qquad (256)$$

and

$$h := \frac{b-a}{M}, \qquad (257)$$

we are led to coefficients ωm and αm of the Gaussian sum (247) in the form

$$\omega_m := \frac{2h}{\sqrt{\pi}}\, e^{s_m}, \quad m = 0, \ldots, M, \qquad (258)$$

and

$$\alpha_m := e^{2s_m}, \quad m = 0, \ldots, M. \qquad (259)$$

So, our initial approximation is of the form

$$\frac{1}{r} \simeq \frac{2h}{\sqrt{\pi}} \sum_{m=0}^{M} e^{s_m} \exp(-e^{2s_m} r^2), \quad r \in [\delta, 1]. \qquad (260)$$
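The construction (256)–(260) can be implemented directly. In the sketch below, the parameter values δ, a, b, M are illustrative choices of ours (not taken from the text); the relative error is sampled in the sense of (250):

```python
import math

def gaussian_sum_coefficients(a, b, M):
    # Coefficients (258)-(259): omega_m = (2h/sqrt(pi)) e^{s_m}, alpha_m = e^{2 s_m},
    # with s_m = a + m h and h = (b - a)/M, m = 0, ..., M.
    h = (b - a) / M
    s = [a + m * h for m in range(M + 1)]
    omegas = [2.0 * h / math.sqrt(math.pi) * math.exp(sm) for sm in s]
    alphas = [math.exp(2.0 * sm) for sm in s]
    return omegas, alphas

delta, a, b, M = 0.01, -16.0, 7.0, 300
omegas, alphas = gaussian_sum_coefficients(a, b, M)

def approx_inv(r):
    # Right-hand side of (260).
    return sum(w * math.exp(-al * r * r) for w, al in zip(omegas, alphas))

# Relative error in the sense of (250), sampled on [delta, 1].
samples = [delta + k * (1.0 - delta) / 200 for k in range(201)]
rel_err = max(abs(1.0 / r - approx_inv(r)) * r for r in samples)
print(rel_err)  # small; dominated by the truncation tails erf(r e^a) + erfc(r e^b)
```

With these (generous) values of a, b, M the sampled relative error is far below typical accuracy requirements; the Prony-type reduction mentioned above would then shrink M while keeping the error bound.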

Now, we are prepared to deal with the question of how to choose a, b, and M. First it is readily seen that

$$\left|1 - r\,\frac{2}{\sqrt{\pi}}\int_a^b e^{-r^2 e^{2s}+s}\, ds\right| = r\,\frac{2}{\sqrt{\pi}}\int_{-\infty}^{a} e^{-r^2 e^{2s}+s}\, ds + r\,\frac{2}{\sqrt{\pi}}\int_b^{\infty} e^{-r^2 e^{2s}+s}\, ds. \qquad (261)$$

Our desire is to estimate the occurring tail integrals

$$r \int_{-\infty}^{a} e^{-r^2 e^{2s}+s}\, ds, \qquad r \int_b^{\infty} e^{-r^2 e^{2s}+s}\, ds,$$

so that the bound (250) is met.

For ρ > 0, the space of all mollified singular integral-type Newton integrals Aρ[F] given by


$$A_\rho[F] = \int_{\mathcal{G}} G_\rho(\Delta; |x-y|)\, F(y)\, dy, \quad F \in L^2(\mathcal{G}), \qquad (292)$$

with Gρ(Δ; | · − · |) given either by (30) or by (145), so that Yρ = Aρ[L2(G)], we are led to a result analogous to that for the singular integral-type Newton integral: the image space Yρ = Aρ[L2(G)] is a reproducing kernel Hilbert space possessing the reproducing kernel

$$K_{Y_\rho}(x,y) = \int_{\mathcal{G}} G_\rho(\Delta; |x-z|)\, G_\rho(\Delta; |z-y|)\, dz, \quad x, y \in \mathbb{R}^3. \qquad (293)$$
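For intuition, a kernel of the type (293) can be tabulated by straightforward quadrature once a concrete mollifier and region are fixed. The sketch below is our own toy setup (a Haar-type mollified Newton kernel and a cube standing in for G); it uses the midpoint rule and illustrates the symmetry K(x, y) = K(y, x):

```python
import math

def G_rho(dist, rho):
    # Toy Haar-type mollification of the Newton kernel 1/(4 pi dist):
    # quadratic regularization inside the ball of radius rho.
    if dist >= rho:
        return 1.0 / (4.0 * math.pi * dist)
    return (3.0 - dist * dist / (rho * rho)) / (8.0 * math.pi * rho)

def K_Yrho(x, y, rho, n=16, half=1.0):
    # Midpoint-rule approximation of (293) with G = [-half, half]^3.
    h = 2.0 * half / n
    pts = [-half + (i + 0.5) * h for i in range(n)]
    total = 0.0
    for zx in pts:
        for zy in pts:
            for zz in pts:
                z = (zx, zy, zz)
                total += G_rho(math.dist(x, z), rho) * G_rho(math.dist(y, z), rho)
    return total * h ** 3

x, y = (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)
print(K_Yrho(x, y, rho=0.25), K_Yrho(y, x, rho=0.25))  # equal by symmetry
```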

Finally, it should be mentioned that

$$-\Delta_x K_{Y_\rho}(x,y) = \int_{\mathcal{G}} K_\rho(|x-z|)\, G_\rho(\Delta; |z-y|)\, dz, \quad x, y \in \mathbb{R}^3. \qquad (294)$$

Spline Mollifier Method. For sufficiently small ρ, an approximate version of the kernel

$$K_Y(x,y) = \int_{\mathcal{G}} G(\Delta; |x-z|)\, G(\Delta; |y-z|)\, dz \qquad (295)$$

is now given by

$$K_{Y_\rho}(x,y) = \int_{\mathcal{G}} G_\rho(\Delta; |x-z|)\, G_\rho(\Delta; |y-z|)\, dz. \qquad (296)$$

Note that, from the integral in (296), we see that

$$\sum_{k=1}^{N}\sum_{i=1}^{N} a_k a_i \int_{\mathcal{G}} G_\rho(\Delta; |x_k-z|)\, G_\rho(\Delta; |x_i-z|)\, dz = \int_{\mathcal{G}} \Big(\sum_{k=1}^{N} a_k\, G_\rho(\Delta; |x_k-z|)\Big)^2 dz \ \ge\ 0. \qquad (297)$$

Moreover, the linear independence of the system {Gρ(Δ; |xi − ·|)}i=1,...,N implies that the Gram matrix

$$\left(\int_{\mathcal{G}} G_\rho(\Delta; |x_i-z|)\, G_\rho(\Delta; |x_k-z|)\, dz\right)_{k,i=1,\ldots,N} \qquad (298)$$

is positive definite, so that KYρ(·, ·) is a positive definite kernel.


In other words, the integral (296) defines a Hilbert space $(Y_\rho, \langle\cdot,\cdot\rangle_{Y_\rho})$ possessing (296) as the reproducing kernel. In Yρ, minimum norm (spline) interpolation as described above can be performed in an analogous way: Let {x1, . . . , xN} ⊂ R3 be a point system such that the evaluation functionals Ex1, . . . , ExN are linearly independent. Then, within the set

$$I^V_{E_{x_1},\ldots,E_{x_N}} = \{U \in Y_\rho : E_{x_i}[U] = E_{x_i}[V] = \gamma_i,\ i = 1,\ldots,N\}, \qquad (299)$$

the minimum norm interpolation problem of finding $S^V_N$ that satisfies

$$\|S^V_N\|_{Y_\rho} = \inf_{U \in I^V_{E_{x_1},\ldots,E_{x_N}}} \|U\|_{Y_\rho} \qquad (300)$$

is well posed, i.e., its solution exists, is unique, and depends continuously on the data γ1, . . . , γN. The uniquely determined solution $S^V_N$ is given in the explicit form

$$S^V_N(x) = \sum_{i=1}^{N} a_i^N\, E_{x_i}[K_{Y_\rho}(x, \cdot)], \quad x \in \mathbb{R}^3, \qquad (301)$$

where the coefficients $a_1^N, \ldots, a_N^N$ are determined by solving the linear system of equations

$$\sum_{i=1}^{N} a_i^N\, E_{x_i} E_{x_j}[K_{Y_\rho}(\cdot,\cdot)] = \gamma_j, \quad j = 1,\ldots,N. \qquad (302)$$
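The computational core of (301)–(302) is a symmetric positive definite Gram system. The sketch below is our own illustration of the pattern: it uses a generic Gaussian stand-in kernel on the real line rather than the Newtonian kernel $K_{Y_\rho}$, and made-up nodes and data:

```python
import math

def solve(A, b):
    # Gaussian elimination with partial pivoting for small dense systems.
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def K(x, y):
    # Stand-in positive definite reproducing kernel (NOT the Newtonian K_{Y_rho}).
    return math.exp(-(x - y) ** 2)

nodes = [0.0, 0.7, 1.5, 2.2]          # evaluation points x_i (made up)
gamma = [1.0, -0.5, 0.25, 2.0]        # data gamma_i (made up)
gram = [[K(xi, xj) for xj in nodes] for xi in nodes]   # matrix of (302)
a = solve(gram, gamma)                # coefficients a_i^N

def spline(x):
    # Expansion (301) for evaluation functionals.
    return sum(ai * K(x, xi) for ai, xi in zip(a, nodes))

print([round(spline(xj), 8) for xj in nodes])  # reproduces gamma at the nodes
```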

In this case, we obtain an approximation of the density distribution as a linear combination of singular integral-type kernels (294), which are not harmonic. Moreover, for purposes of decorrelation, we are able to take advantage of the sparse character of the resulting wavelets to reduce the computational effort considerably.

Application to Geodetic Observables. In geodetic practice, a spline approach by means of evaluation functionals as presented until now is unrealistic. For practical purposes, other geodetic observables have to come into play, such as gravity anomalies, gravity disturbances, deflections of the vertical, satellite-to-satellite tracking (SST) data, and satellite gravity gradiometry (SGG) data. These observables, however, involve derivatives applied to the gravitational potential, i.e., the Newton volume integral. As long as such a derivative is taken at a point of Gc, i.e., applied to a regular Newton volume integral, the linear functionals representing these observables (cf. [43]) are bounded on the reproducing kernel Hilbert space Y = A[L2(G)]. However, in the case of terrestrial observables, a derivative at a point on the boundary ∂G does not allow one to guarantee boundedness on Y = A[L2(G)] using a Cauchy–Schwarz estimate in the standard way. The same calamity happens for derivatives at points inside G. As a consequence, spline methods may be applied only in mollified versions, i.e., Gaussian sum approaches or (certain) mollified Hilbert spaces Yρ = Aρ[L2(G)]. Another spline variant for terrestrial observables involving derivatives is to introduce a Runge (Bjerhammar) ball BR(0) inside G, i.e., R is chosen such that R < infx∈∂G |x|. In this case, terrestrial observables relative to G become inner observables relative to (BR(0))c. As prototypes for the different spline realizations we discuss terrestrial oblique derivatives (cf. [26, 30, 33, 34, 37, 72]), i.e., we assume that, for a point system {x1, . . . , xN} ⊂ ∂G, the linear functionals

$$D_{x_i}[V] := \frac{\partial V}{\partial \lambda}(x_i) = \lambda(x_i) \cdot \nabla V(x_i) = \gamma_i, \quad i = 1,\ldots,N, \qquad (303)$$

are known,

where λ is a continuous unit vector field satisfying infx∈∂G λ(x) · ν(x) > 0 and ν is the unit normal field on ∂G directed inward into G. The occurrence of derivatives in (303) leads to the following spline variants:

RKHS Mollification. Let {x1, . . . , xN} ⊂ ∂G be a point system such that the functionals Dx1, . . . , DxN are linearly independent. Then, within the set

$$I^V_{D_{x_1},\ldots,D_{x_N}} = \{U \in A_\rho[L^2(\mathcal{G})] : D_{x_i}[U] = D_{x_i}[V] = \gamma_i,\ i = 1,\ldots,N\}, \qquad (304)$$

the minimum norm interpolation problem of finding $S^V_N$ that satisfies

$$\|S^V_N\|_{A_\rho[L^2(\mathcal{G})]} = \inf_{U \in I^V_{D_{x_1},\ldots,D_{x_N}}} \|U\|_{A_\rho[L^2(\mathcal{G})]} \qquad (305)$$

is well posed, i.e., its solution exists, is unique, and depends continuously on the data γ1, . . . , γN. The uniquely determined solution $S^V_N$ is given in the explicit form

$$S^V_N(x) = \sum_{i=1}^{N} a_i^N\, D_{x_i}[K_{A_\rho[L^2(\mathcal{G})]}(x, \cdot)], \quad x \in \mathbb{R}^3, \qquad (306)$$

where the coefficients $a_1^N, \ldots, a_N^N$ are determined by solving the linear system of equations

$$\sum_{i=1}^{N} a_i^N\, D_{x_i} D_{x_j}[K_{A_\rho[L^2(\mathcal{G})]}(\cdot,\cdot)] = \gamma_j, \quad j = 1,\ldots,N. \qquad (307)$$
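Numerically, each functional $D_{x_i}$ in (303) is just a directional derivative. The sketch below is our own illustration (a point-mass potential and a radial direction field, all values made up); it evaluates the functional by central differences and compares with the analytic value:

```python
import math

def grad(V, x, h=1e-5):
    # Central-difference gradient of a scalar field V: R^3 -> R.
    g = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((V(xp) - V(xm)) / (2.0 * h))
    return g

def D(V, x, lam):
    # Oblique derivative functional D_x[V] = lambda(x) . grad V(x), cf. (303).
    return sum(li * gi for li, gi in zip(lam, grad(V, x)))

V = lambda x: 1.0 / math.dist(x, (0.0, 0.0, 0.0))  # point-mass potential
x = [1.0, 2.0, 2.0]                                # |x| = 3
lam = [xi / 3.0 for xi in x]                       # radial unit direction
print(D(V, x, lam))  # analytic value: -1/|x|^2 = -1/9
```

In a spline code, the entries $D_{x_i} D_{x_j}[K(\cdot,\cdot)]$ of the system (307) would be assembled analogously, by applying the directional derivative to the kernel in each of its two arguments.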

RKHS Runge Mollification. Let {x1, . . . , xN} ⊂ ∂G be a point system such that the functionals Dx1, . . . , DxN are linearly independent. Moreover, let BR(0) be a Runge ball inside G, i.e., let R be chosen such that R < infx∈∂G |x|. Then, within the set

$$I^V_{D_{x_1},\ldots,D_{x_N}} = \{U \in A[L^2(B_R(0))] : D_{x_i}[U] = D_{x_i}[V] = \gamma_i,\ i = 1,\ldots,N\}, \qquad (308)$$

the minimum norm interpolation problem of finding $S^V_N$ that satisfies

$$\|S^V_N\|_{A[L^2(B_R(0))]} = \inf_{U \in I^V_{D_{x_1},\ldots,D_{x_N}}} \|U\|_{A[L^2(B_R(0))]} \qquad (309)$$

is well posed, i.e., its solution exists, is unique, and depends continuously on the data γ1, . . . , γN. The uniquely determined solution $S^V_N$ is given in the explicit form

$$S^V_N(x) = \sum_{i=1}^{N} a_i^N\, D_{x_i}[K_{A[L^2(B_R(0))]}(x, \cdot)], \quad x \in \mathbb{R}^3, \qquad (310)$$

where the coefficients $a_1^N, \ldots, a_N^N$ are determined by solving the linear system of equations

$$\sum_{i=1}^{N} a_i^N\, D_{x_i} D_{x_j}[K_{A[L^2(B_R(0))]}(\cdot,\cdot)] = \gamma_j, \quad j = 1,\ldots,N. \qquad (311)$$

By going over to Gaussian sums we obtain the following variant:

RKHS Gaussian Runge Mollification. $S^V_N$ admits a Gaussian sum approximation in the form

$$S^V_N(x) = \sum_{i=1}^{N} \tilde{a}_i^N\, D_{x_i}[K^M_{A[L^2(B_R(0))]}(x, \cdot)], \quad x \in \mathbb{R}^3, \qquad (312)$$

where the coefficients $\tilde{a}_1^N, \ldots, \tilde{a}_N^N$ are determined by solving the linear system of equations

$$\sum_{i=1}^{N} \tilde{a}_i^N\, D_{x_i} D_{x_j}[K_{A[L^2(B_R(0))]}(\cdot,\cdot)] = \gamma_j, \quad j = 1,\ldots,N. \qquad (313)$$

These expressions are an immediate consequence of the aforementioned Runge reflected minimum norm formulation of spline interpolation to oblique derivative data. A mollification procedure that formally simulates a minimum norm approach is as follows:

RKHS Gaussian Runge Mollification. $S^V_N$ admits a Gaussian sum approximation in the form

$$S^V_N(x) = \sum_{i=1}^{N} \tilde{a}_i^N\, D_{x_i}[K^M_{A[L^2(\mathcal{G})]}(x, \cdot)], \quad x \in \mathbb{R}^3, \qquad (314)$$

where the coefficients $\tilde{a}_1^N, \ldots, \tilde{a}_N^N$ are determined by solving the linear system of equations

$$\sum_{i=1}^{N} \tilde{a}_i^N\, D_{x_i} D_{x_j}[K_{A[L^2(\mathcal{G})]}(\cdot,\cdot)] = \gamma_j, \quad j = 1,\ldots,N. \qquad (315)$$

Other efficient spline solution techniques corresponding to oblique derivative data and involving fast multipole algorithms have been proposed by M. Gutting [56–58].

6 Concluding Remarks

Finally, it should be remarked that the mathematical structures and results developed for the gravimetry problem enable us to apply not only the new mollification techniques as presented here, but also a large variety of ideas and concepts known from the theory of ill-posed problems (see, e.g., [38] and the references therein for a geodetically relevant approach). These aspects, however, are straightforward, so that the details will not be considered here. Moreover, the handling of noise may be done, e.g., by variance-covariance multiscale comparison in an analogous way as indicated in [35, 36] (see also the concepts in the monograph [42] and the references therein). Finally, it should be remarked that mollifier techniques similar to those presented here can be developed for more general problems in mathematical physics such as

$$V(x) = \int_{\mathcal{G}} G(L; x, y)\, F(y)\, dy \qquad (316)$$

involving the Green's function G(L; x, y) corresponding to the differential operator L, so that

$$F(x) = -L_x V(x), \quad x \in \mathcal{G}, \qquad (317)$$

where, in the distributional sense,

$$-L_x G(L; x, y) = \delta(x, y), \quad x \in \mathcal{G}, \qquad (318)$$

and

$$F(x) = \int_{\mathcal{G}} \delta(x, y)\, F(y)\, dy, \quad x \in \mathcal{G}. \qquad (319)$$


Depending on the specific choice of the operator L we are then led to the following exploration areas:

(i) gravitational modeling (L the Laplace operator),
(ii) geomagnetic modeling (L the div, curl (pre-)Maxwell operators),
(iii) acoustic seismic tomography (L the Helmholtz operator),
(iv) elastic seismic tomography (L the (Helmholtz) Cauchy–Navier operator),
(v) acoustic scattering (L the time-dependent wave operator),
(vi) elastic scattering (L the time-dependent elastic wave operator).

The mathematical treatment of tomography and scattering problems in a consistent setup is a challenge for future investigations.

Acknowledgements The first author thanks the "Federal Ministry for Economic Affairs and Energy, Berlin" and the "Project Management Jülich" (corporate manager Dr. S. Schreiber) for funding the projects "GEOFÜND" (funding reference number: 0325512A, PI Prof. Dr. W. Freeden, University of Kaiserslautern, Germany) and "SPE" (funding reference number: 0324061, PI Prof. Dr. W. Freeden, CBM – Gesellschaft für Consulting, Business und Management mbH, Bexbach, Germany, corporate manager Prof. Dr. mult. M. Bauer).

Literature

1. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions. Dover Publications, Inc., New York (1964)
2. Anger, G.: A characterization of inverse gravimetric source problem through extremal measures. Rev. Geophys. Space Phys. 19, 299–306 (1981)
3. Anger, G.: Inverse Problems in Differential Equations. Akademie-Verlag, Berlin (1990)
4. Aronszajn, N.: Theory of reproducing kernels. Trans. Am. Math. Soc. 68, 337–404 (1950)
5. Augustin, M., Freeden, W., Nutz, H.: About the importance of the Runge–Walsh concept for gravitational field determination. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 517–560. Birkhäuser/Springer, Basel/New York/Heidelberg (2018)
6. Backus, G.E., Gilbert, F.: Numerical applications of a formalism for geophysical inverse problems. Geophys. J. R. Astron. Soc. 13, 247–276 (1967)
7. Backus, G.E., Gilbert, F.: The resolving power of gross earth data. Geophys. J. R. Astron. Soc. 16, 169–205 (1968)
8. Backus, G.E., Gilbert, F.: Uniqueness of the inversion of inaccurate gross earth data. Philos. Trans. R. Soc. London 226, 123–197 (1970)
9. Ballani, L.: Solving the inverse gravimetric problem: on the benefit of wavelets. In: Sansò, F. (ed.) Geodetic Theory Today, Proceedings of the 3rd Hotine–Marussi Symposium on Mathematical Geodesy 1994, pp. 151–161. Springer, Berlin (1995)


10. Ballani, L., Engels, J., Grafarend, E.W.: Global base functions for the mass density in the interior of a massive body (Earth). Manuscr. Geod. 18, 99–114 (1993)
11. Ballani, L., Stromeyer, D.: The inverse gravimetric problem: a Hilbert space approach. In: Holota, P. (ed.) Proceedings of the International Symposium Figure of the Earth, the Moon, and Other Planets 1982, pp. 359–373. Prague (1983)
12. Ballani, L., Stromeyer, D., Barthelmes, F.: Decomposition principles for linear source problems. In: Anger, G., Gorenflo, R., Jochmann, H., Moritz, H., Webers, W. (eds.) Inverse Problems: Principles and Applications in Geophysics, Technology, and Medicine, Mathematical Research, vol. 47. Akademie-Verlag, Berlin (1993)
13. Barzaghi, R., Sansò, F.: Remarks on the inverse gravimetric problem. Boll. Geod. Sci. Aff. 45, 203–216 (1986)
14. Beylkin, G., Monzón, L.: On approximation of functions by exponential sums. Appl. Comput. Harmon. Anal. 19, 17–48 (2005)
15. Beylkin, G., Monzón, L.: Approximation of functions by exponential sums revisited. Appl. Comput. Harmon. Anal. 28, 131–149 (2010)
16. Blick, C.: Multiscale potential methods in geothermal research: decorrelation reflected post-processing and locally based inversion. Ph.D. Thesis, University of Kaiserslautern, Mathematics Department, Geomathematics Group (2015)
17. Blick, C., Freeden, W., Nutz, H.: Gravimetry and exploration. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 687–752. Birkhäuser/Springer, Basel/New York/Heidelberg (2018)
18. Burschäpers, H.C.: Local modeling of gravitational data. Master Thesis, University of Kaiserslautern, Mathematics Department, Geomathematics Group (2013)
19. Cheng, H., Greengard, L., Rokhlin, V.: A fast adaptive multipole algorithm in three dimensions. J. Comput. Phys. 155, 468–498 (1999)
20. Davis, P.J.: Interpolation and Approximation. Blaisdell, New York (1963)
21. Engl, H.: Integralgleichungen. Springer Lehrbuch Mathematik, Wien (1997)
22. Engl, H.W., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Kluwer, Dordrecht (1996)
23. Engl, H., Louis, A.K., Rundell, W. (eds.): Inverse Problems in Geophysical Applications. SIAM, Philadelphia (1997)
24. Freeden, W.: On the approximation of external gravitational potential with closed systems of (trial) functions. Bull. Géod. 54, 1–20 (1980)
25. Freeden, W.: On approximation by harmonic splines. Manuscr. Geod. 6, 193–244 (1981)
26. Freeden, W.: A spline interpolation method for solving boundary value problems of potential theory from discretely given data. Math. Part. Diff. Equ. 3, 375–398 (1987)
27. Freeden, W.: Multiscale Modelling of Spaceborne Geodata. B.G. Teubner, Stuttgart/Leipzig (1999)
28. Freeden, W.: Geomathematics: its role, its aim, and its potential. In: Freeden, W., Nashed, Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn., pp. 3–78. Springer, New York/Heidelberg (2015)
29. Freeden, W., Blick, C.: Signal decorrelation by means of multiscale methods. World Min. 65, 1–15 (2013)
30. Freeden, W., Gerhards, C.: Geomathematically Oriented Potential Theory. Chapman and Hall/CRC Press, Boca Raton/London/New York (2013)
31. Freeden, W., Gutting, M.: Special Functions of Mathematical (Geo)Physics. Birkhäuser, Basel (2013)
32. Freeden, W., Gutting, M.: Integration and Cubature Methods. Chapman and Hall/CRC Press, Boca Raton/London/New York (2018)
33. Freeden, W., Kersten, H.: The geodetic boundary-value problem using the known surface of the Earth. Veröff. Geod. Inst. RWTH Aachen 29, 1–70 (1980)
34. Freeden, W., Kersten, H.: A constructive approximation theorem for the oblique derivative problem in potential theory. Math. Meth. Appl. Sci. 4, 104–114 (1981)


35. Freeden, W., Maier, T.: On multiscale denoising of spherical functions: basic theory and numerical aspects. Electron. Trans. Numer. Anal. (ETNA) 14, 40–62 (2002)
36. Freeden, W., Maier, T.: Spectral and multiscale signal-to-noise thresholding of spherical vector fields. Comput. Geosci. 7(3), 215–250 (2003)
37. Freeden, W., Michel, V.: Multiscale Potential Theory (With Applications to Geoscience). Birkhäuser, Boston (2004)
38. Freeden, W., Nashed, M.Z.: Operator-theoretic and regularization approaches to ill-posed problems. GEM Int. J. Geomath. 9, 1–115 (2018)
39. Freeden, W., Nashed, M.Z.: Ill-posed problems: operator methodologies of resolution and regularization. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 210–314. Birkhäuser/Springer, Basel/New York/Heidelberg (2018)
40. Freeden, W., Nashed, M.Z.: Inverse gravimetry as an ill-posed problem in mathematical geodesy. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 641–685. Birkhäuser/Springer, Basel/New York/Heidelberg (2018)
41. Freeden, W., Nashed, M.Z.: Inverse gravimetry: background material and multiscale mollifier approaches. GEM Int. J. Geomath. 9, 199–264 (2018). https://doi.org/10.1007/s13137-018-0103-5
42. Freeden, W., Nashed, M.Z., Schreiner, M.: Spherical Sampling. Geosystems Mathematics. Birkhäuser/Springer, Basel/New York/Heidelberg (2018)
43. Freeden, W., Nutz, H.: Geodetic observables and their mathematical treatment in multiscale framework. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 315–458. Birkhäuser/Springer, Basel/New York/Heidelberg (2018)
44. Freeden, W., Nutz, H.: Mathematik als Schlüsseltechnologie zum Verständnis des Systems "Tiefe Geothermie". Jahresber. Deutsch. Math. Vereinigung (DMV) 117, 45–84 (2015)
45. Freeden, W., Schneider, F.: Regularization wavelets and multiresolution. Inverse Prob. 14, 493–515 (1998)
46. Freeden, W., Schreiner, M.: Local multiscale modelling of geoid undulations from deflections of the vertical. J. Geod. 79, 641–651 (2006)
47. Freeden, W., Schreiner, M.: Spherical Functions of Mathematical Geosciences – A Scalar, Vectorial, and Tensorial Setup. Springer, Heidelberg (2009)
48. Freeden, W., Witte, B.: A combined (spline-)interpolation and smoothing method for the determination of the gravitational potential from heterogeneous data. Bull. Géod. 56, 53–62 (1982)
49. Freeden, W., Schneider, F., Schreiner, M.: Gradiometry – an inverse problem in modern satellite geodesy. In: Engl, H.W., Louis, A., Rundell, W. (eds.) GAMM-SIAM Symposium on Inverse Problems: Geophysical Applications, pp. 179–239. SIAM Proceedings, Philadelphia (1997)
50. Gauss, C.F.: Allgemeine Theorie des Erdmagnetismus. Resultate aus den Beobachtungen des magnetischen Vereins, Göttingen (1838)
51. Grafarend, E.W.: Six lectures on geodesy and global geodynamics. In: Moritz, H., Sünkel, H. (eds.) Proceedings of the Third International Summer School in the Mountains, Graz Geodetic Institute, pp. 531–685 (1982)
52. Green, G.: An Essay on the Application of Mathematical Analysis to the Theories of Electricity and Magnetism. T. Wheelhouse, Nottingham (1838)
53. Greengard, L., Rokhlin, V.: A new version of the fast multipole method for the Laplace equation in three dimensions. Acta Numer. 6, 229–269 (1997)
54. Groetsch, C.W.: The Theory of Tikhonov Regularization for Fredholm Equations of the First Kind. Pitman, London/Boston (1984)
55. Groetsch, C.W.: Inverse Problems in the Mathematical Sciences. Vieweg, Braunschweig (1993)


56. Gutting, M.: Fast multipole methods for oblique derivative problems. Ph.D. Thesis, University of Kaiserslautern, Mathematics Department, Geomathematics Group (2007)
57. Gutting, M.: Fast multipole accelerated solution of the oblique derivative boundary value problem. GEM Int. J. Geomath. 3, 223–252 (2012)
58. Gutting, M.: Fast spherical/harmonic spline modeling. In: Freeden, W., Nashed, Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 3, 2nd edn., pp. 2711–2746. Springer, New York/Heidelberg (2015)
59. Haar, A.: Zur Theorie der orthogonalen Funktionensysteme. Math. Ann. 69, 331–371 (1910)
60. Hackbusch, W.: Entwicklungen nach Exponentialsummen. Technical Report, Max-Planck-Institut für Mathematik in den Naturwissenschaften, Leipzig (2010)
61. Hackbusch, W., Khoromskij, B.N., Klaus, A.: Approximation of functions by exponential sums based on the Newton-type optimisation. Technical Report, Max-Planck-Institut für Mathematik in den Naturwissenschaften, Leipzig (2005)
62. Hadamard, J.: Sur les problèmes aux dérivées partielles et leur signification physique. Princeton Univ. Bull. 13, 49–52 (1902)
63. Hadamard, J.: Lectures on the Cauchy Problem in Linear Partial Differential Equations. Yale University Press, New Haven (1923)
64. Hanson, R.J.: A numerical method for solving Fredholm integral equations of the first kind. SIAM J. Numer. Anal. 8, 616–662 (1971)
65. Heiskanen, W.A., Moritz, H.: Physical Geodesy. Freeman, San Francisco (1967)
66. Helmert, F.: Die Mathematischen und Physikalischen Theorien der Höheren Geodäsie 1. B.G. Teubner, Leipzig (1880)
67. Helmert, F.: Die Mathematischen und Physikalischen Theorien der Höheren Geodäsie 2. B.G. Teubner, Leipzig (1884)
68. Hille, E.: Introduction to the general theory of reproducing kernels. Rocky Mountain J. Math. 2, 321–368 (1972)
69. Hofmann-Wellenhof, B., Moritz, H.: Physical Geodesy. Springer, Wien/New York (2005)
70. Kellogg, O.D.: Foundations of Potential Theory. Frederick Ungar Publishing Company, New York (1929)
71. Kirsch, A.: An Introduction to the Mathematical Theory of Inverse Problems, 2nd edn. Springer, Heidelberg (1996)
72. Koch, K.R., Pope, A.J.: Uniqueness and existence for the geodetic boundary value problem using the known surface of the earth. Bull. Géod. 106, 467–476 (1972)
73. Kotevska, E.: Real earth oriented gravitational potential determination. Ph.D. Thesis, University of Kaiserslautern, Mathematics Department, Geomathematics Group (2011)
74. Krarup, T.: A contribution to the mathematical foundation of physical geodesy. Danish Geodetic Institute, Report No. 44, Copenhagen (1969)
75. Laplace, P.S.: Traité de mécanique céleste. Tome 2, Paris (1799)
76. Laplace, P.S.: Théorie analytique des probabilités. Livre II, Chap. IV, Paris (1812)
77. Laplace, P.S.: Théorie analytique des probabilités. Œuvres, Tome VII, p. 353 (1812)
78. Lavrentiev, M.M.: Some Improperly Posed Problems of Mathematical Physics. Izdat. Sibirsk. Otdel., Akad. Nauk SSSR, Novosibirsk (1962). English transl.: Springer Tracts in Natural Philosophy, vol. 11. Springer, Berlin (1967)
79. Lauricella, G.: Sulla funzione potenziale di spazio corrispondente ad una assegnata azione esterna. Rend. Lincei XX (1911)
80. Legendre, A.M.: Nouvelles méthodes pour la détermination des orbites des comètes. Paris (1806)
81. Legendre, A.M.: Analyse des triangles tracés sur la surface d'un sphéroïde. Tome VII de la I Série des mémoires de l'Académie des Sciences, Paris, 131 (1806)
82. Locker, J., Prenter, P.M.: Regularization with differential operators. J. Math. Anal. Appl. 74, 504–529 (1980)
83. Louis, A.K.: Inverse und schlecht gestellte Probleme. Teubner, Stuttgart (1989)
84. Louis, A.K., Maass, P.: A mollifier method for linear equations of the first kind. Inverse Prob. 6, 427–440 (1989)


85. Magnus, W., Oberhettinger, F., Soni, R.P.: Formulas and Theorems for the Special Functions of Mathematical Physics. Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen, Band 52, 3. Auflage. Springer, Berlin (1966)
86. Martin, G.S., Wiley, R., Marfurt, K.J.: Marmousi2: an elastic upgrade for Marmousi. Lead. Edge 25, 156–166 (2006)
87. Meissl, P.A.: A Study of Covariance Functions Related to the Earth's Disturbing Potential. Department of Geodetic Science, No. 151, The Ohio State University, Columbus (1971)
88. Meissl, P.A.: Hilbert spaces and their applications to geodetic least squares problems. Boll. Geod. Sci. Aff. 1, 181–210 (1976)
89. Michel, V.: A multiscale method for the gravimetry problem: theoretical and numerical aspects of harmonic and anharmonic modelling. Ph.D. Thesis, University of Kaiserslautern, Mathematics Department, Geomathematics Group, Shaker, Aachen (1999)
90. Michel, V.: Scale continuous, scale discretized and scale discrete harmonic wavelets for the outer and the inner space of a sphere and their application to an inverse problem in geomathematics. Appl. Comput. Harm. Anal. (ACHA) 12, 77–99 (2002)
91. Michel, V.: A multiscale approximation for operator equations in separable Hilbert spaces – case study: reconstruction and description of the Earth's interior. Habilitation Thesis, University of Kaiserslautern, Mathematics Department, Geomathematics Group, Shaker, Aachen (2002)
92. Michel, V.: Regularized wavelet-based multiresolution recovery of the harmonic mass density distribution from data of the Earth's gravitational field at satellite height. Inverse Prob. 21, 997–1025 (2005)
93. Michel, V., Fokas, A.S.: A unified approach to various techniques for the non-uniqueness of the inverse gravimetric problem and wavelet-based methods. Inverse Prob. 24 (2008). https://doi.org/10.1088/0266-5611/24/4/045019
94. Michlin, S.G.: Multidimensional Singular Integrals and Integral Equations. Pergamon Press, New York (1965)
95. Michlin, S.G.: Lehrgang der Mathematischen Physik, 2nd edn. Akademie Verlag, Berlin (1975)
96. Möhringer, S.: Decorrelation of gravimetric data. Ph.D. Thesis, University of Kaiserslautern, Mathematics Department, Geomathematics Group (2014)
97. Moritz, H.: Advanced Physical Geodesy. Herbert Wichmann Verlag/Abacus Press, Tunbridge/Karlsruhe (1980)
98. Moritz, H.: The Figure of the Earth. Theoretical Geodesy of the Earth's Interior. Wichmann Verlag, Karlsruhe (1990)
99. Moritz, H.: Classical physical geodesy. In: Freeden, W., Nashed, Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn., pp. 253–290. Springer, New York/Heidelberg (2015)
100. Morozov, V.A.: Methods for Solving Incorrectly Posed Problems. Springer, New York (1984)
101. Nashed, M.Z.: Generalized inverses, normal solvability and iteration for singular operator equations. In: Rall, L.B. (ed.) Nonlinear Functional Analysis and Applications, pp. 311–359. Academic Press, New York (1971)
102. Nashed, M.Z.: Aspects of generalized inverses in analysis and regularization. In: Nashed, M.Z. (ed.) Generalized Inverses and Applications, pp. 193–244. Academic Press, New York (1976)
103. Nashed, M.Z.: New applications of generalized inverses in system and control theory. In: Thomas, J.B. (ed.) Proceedings of 1980 Conference on Information Sciences and Systems, pp. 353–358. Princeton University Press, Princeton (1980)
104. Nashed, M.Z.: Operator-theoretic and computational approaches to ill-posed problems with applications to antenna theory. IEEE Trans. Antennas Propag. 29, 220–231 (1981)
105. Nashed, M.Z.: A new approach to classification and regularization of ill-posed operator equations. In: Engl, H., Groetsch, C.W. (eds.) Inverse and Ill-Posed Problems, Band 4, Notes and Reports in Mathematics and Science and Engineering. Academic Press, Boston (1987)


106. Pavlis, N.K., Holmes, S.A., Kenyon, S.C., Factor, J.K.: An Earth Gravitational Model to Degree 2160: EGM2008. General Assembly of the European Geosciences Union, Vienna (2008)
107. Nashed, M.Z.: Inverse problems, moment problems and signal processing: un menage a trois. In: Siddiqi, A.H., Singh, R.C., Manchanda, P. (eds.) Mathematics in Science and Technology, pp. 1–19. World Scientific, New Jersey (2010)
108. Nashed, M.Z., Scherzer, O.: Inverse Problems, Image Analysis and Medical Imaging. Contemporary Mathematics, vol. 313. American Mathematical Society, Providence, RI (2002)
109. Nashed, M.Z., Sun, Q.: Function spaces for sampling expansions. In: Shen, X., Zayed, A.I. (eds.) Multiscale Signal Analysis and Modeling, pp. 81–104. Springer, New York (2013)
110. Nashed, M.Z., Wahba, G.: Generalized inverses in reproducing kernel spaces: an approach to regularization of linear operator equations. SIAM J. Math. Anal. 5, 974–987 (1974)
111. Nashed, M.Z., Wahba, G.: Regularization and approximation of linear operator equations in reproducing kernel spaces. Bull. Am. Math. Soc. 80, 1213–1218 (1974)
112. Nashed, M.Z., Walter, G.G.: General sampling theorems for functions in reproducing kernel Hilbert spaces. Math. Contr. Signals Syst. 4, 363–390 (1991)
113. Nashed, M.Z., Walter, G.G.: Reproducing kernel Hilbert space from sampling expansions. Contemp. Math. 190, 221–226 (1995)
114. Parker, R.L.: The theory of ideal bodies for gravity interpretation. Geophys. J. R. Astr. Soc. 42, 315–334 (1975)
115. Petrini, H.: Sur l'existence des dérivées secondes du potentiel. C. R. Acad. Sci. Paris 130, 233–235 (1900)
116. Pizzetti, P.: Corpi equivalenti rispetto alla attrazione newtoniana esterna. Rend. Lincei XVIII (1909)
117. Poisson, S.D.: Traité de mécanique 1+2. Bachelier, Paris (1833)
118. Rieder, A.: Keine Probleme mit Inversen Problemen. Vieweg (2003)
119. Rummel, R.: Geodesy. In: Encyclopedia of Earth System Science, vol. 2, pp. 253–262. Academic Press, San Diego (1992)
120. Saitoh, S.: Theory of Reproducing Kernels and its Applications. Longman, New York (1988)
121. Sansò, F.: Internal collocation. Atti Della Academia Nazionale Dei Lincei 16, 4–52 (1980)
122. Sansò, F., Rummel, R. (eds.): Geodetic Boundary Value Problems in View of the One Centimeter Geoid. Lecture Notes in Earth Sciences, vol. 65. Springer, Berlin/Heidelberg (1997)
123. Sansò, F., Tscherning, C.C.: The inverse gravimetric problem in gravity modelling. In: Kejlsø, E., Poder, K., Tscherning, C.C. (eds.) Festschrift to Torben Krarup, pp. 299–334. Geodaetisk Institute, Copenhagen (1989)
124. Shure, L., Parker, R.L., Backus, G.E.: Harmonic splines for geomagnetic modelling. Phys. Earth Planet. Inter. 28, 215–229 (1982)
125. Skorvanek, M.: The inverse gravimetric problem for the Earth. In: Proceedings of the 4th International Symposium on Geodesy and Physics of the Earth 1980, Veröff. Zentralinst. Physik der Erde, vol. 63, pp. 464–475 (1981)
126. Stokes, G.G.: On the variation of gravity at the surface of the Earth. Trans. Cambr. Philos. Soc. 148, 672–712 (1849)
127. Stokes, G.G.: On the internal distribution of matter which shall produce a given potential at the surface of a gravitating mass. Proc. R. Soc. London 15, 482–486 (1867)
128. Symes, W.W.: The Rice Inversion Project. Department of Computational and Applied Mathematics, Rice University, Houston, Texas. http://www.trip.caam.rice.edu/downloads/downloads.html (2016). Accessed 12 Sept 2016
129. Tikhonov, A.N.: On the stability of inverse problems. Dokl. Akad. Nauk SSSR 39, 195–198 (1943)
130. Tikhonov, A.N.: On the solution of incorrectly formulated problems and the regularization method. Dokl. Akad. Nauk SSSR 151, 501–504 (1963)
131. Torge, W.: Gravimetry. de Gruyter, Berlin (1989)

1052

W. Freeden and M. Z. Nashed

132. Tscherning, C.C.: Analytical and discrete inversion applied to gravity data. In: Holm Jacobsen, B. (ed.) Proceedings of the Interdisciplinary Inversion Workshop 1, Methodology and Application Perspectives in Geophysics, Astronomy and Geodesy, pp. 5–8. Aarhus (1992) 133. Tscherning, C.C., Strykowski, G.: Quasi-harmonic inversion of gravity field data, model optimization in exploration geophysics 2. In: Vogel, A. (ed.) Proceedings of the 5th International Mathematical Geophysics Seminar, pp. 137–154. Vieweg, Braunschweig/Wiesbaden (1987) 134. Vogel, C.R.: Computational Methods for Inverse Problems. SIAM, Philadelphia (2002) 135. Wahba, G.: Spline Models for Observational Data. CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 59. SIAM, Philadelphia (1990) 136. Weck, N.: Zwei inverse Probleme in der Potentialtheorie. Mitt. Inst. Theor. Geodäsie, Universität Bonn. 4, 27–36 (1972) 137. Werner, J.: Optimization Theory and Apllications. Vieweg-Verlag, Braunschweig/Wiesbaden (1984) 138. Xia, X.G., Nashed, M.Z.: The Backus-Gilbert method for signals in reproducing Hilbert spaces and wavelet subspaces. Inverse Prob. 10, 785–804 (1994) 139. Zidarov, D.P.: Conditions for uniqueness of self–limiting solutions of the inverse problems. Comptes rendus de l’Académie bulgare des Sciences 39, 57–60 (1986) 140. Zidarov, D.P.: Inverse Gravimetric Problem in Geoprospecting and Geodesy. Developments in Solid Earth Geophysics, vol. 19. Elsevier, Amsterdam (1990)

From Newton’s Law of Gravitation to Multiscale Geoidal Determination and Ocean Circulation Modeling

17

Willi Freeden and Helga Nutz

Contents
1 Introduction: Newton's Law of Gravitation . . . . 1054
2 Vector Analytic Tools . . . . 1057
3 Potential Theoretic Background . . . . 1061
4 Geodetic Boundary Value Problems . . . . 1064
5 Disturbing Potential from Gravity Anomalies . . . . 1078
6 Disturbing Potential from Deflections of the Vertical . . . . 1089
7 Ocean Circulation . . . . 1100
8 Concluding Remarks about the Earth's Figure . . . . 1110
Literature . . . . 1113

Abstract The objective of this contribution is to document the pioneering dimension of Newton's work and to demonstrate its mediating role between classical gravitational theory and today's multiscale concepts of geoidal determination and ocean circulation modeling.

This chapter is part of the series Handbuch der Geodäsie, volume Mathematische Geodäsie/ Mathematical Geodesy, edited by Willi Freeden, Kaiserslautern. W. Freeden () Geomathematics Group, Mathematics Department, University of Kaiserslautern, Kaiserslautern, Germany E-Mail: [email protected]; [email protected] H. Nutz CBM – Gesellschaft für Consulting, Business und Management mbH, Bexbach, Germany E-Mail: [email protected] © Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020 W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_97


Zusammenfassung The aim of this contribution is to document the groundbreaking dimension of Newton's work in order to demonstrate his mediating role between classical gravitational theory and today's multiscale concepts of geoid determination and ocean circulation modeling.

Keywords Newton’s law of gravitation · Potential theory · Geodetic boundary value problem · Geoidal determination · Ocean circulation

1 Introduction: Newton's Law of Gravitation

Isaac Newton (January 4, 1643 – March 31, 1727) was a physicist and mathematician. Newton made discoveries in optics, motion, and mathematics. His book on physics, Philosophiae Naturalis Principia Mathematica (Mathematical Principles of Natural Philosophy), 1687, contains information on nearly all of the essential concepts of physics except energy, ultimately helping him to explain the laws of motion and, in particular, the theory of gravitation. Newton is credited as one of the great minds of the seventeenth-century scientific revolution. In fact, toward the end of his life, Newton had become one of the most famous scientists in Europe.

In classical physical geodesy, following the law proposed by Newton, the only forces considered were the forces of mutual attraction acting upon two material particles of small size, or two material points. These forces are directly proportional to the product of the masses of these particles and inversely proportional to the square of the distance between them. Thus, the first and most important problem from the point of view of physical geodesy was to study the force of attraction exerted on a material point by a finite (regular) material body – a spheroid and, in particular, an ellipsoid (since many celestial bodies have this shape). Studies carried out by Lagrange (1736–1813) in 1773 and by Legendre (1752–1833) between 1784 and 1794, and continued by Laplace (1749–1827) between 1782 and 1799, became of major importance. Lagrange established that the field of gravitational forces, as it is called now, is a (gradient) potential field. He introduced a function which was called a potential function in 1828 by G. Green (1793–1841) and, later in 1840, just a potential by Gauss. Already Gauss, Green, and their contemporaries discovered that the method of potentials can be applied not only to solve problems in the theory of gravitation but, in general, to solve a wide range of problems in mathematical physics, in particular in electrostatics and magnetism.
In this connection, potentials came to be considered not only for the physically realistic problems concerning the mutual attraction of positive masses, but also for problems with "masses of arbitrary sign", or charges. Representative boundary value problems were formulated in potential theory, such as the Dirichlet problem and the Neumann problem, the electrostatic problem of the static distribution of charges on conductors (the Robin problem), and the problem of sweeping out mass (the balayage method).

To solve the aforementioned problems in the case of domains with sufficiently smooth boundaries, certain types of potentials turned out to be efficient, i.e., special classes of parameter-dependent integrals such as volume potentials of distributed mass, single- and double-layer potentials, Green potentials, etc. Results obtained by A. M. Lyapunov (1857–1918) and V. A. Steklov (1864–1926) at the end of the nineteenth century were fundamental for the creation of rigorous methods for the solution of the main boundary value problems.

Newton's law of universal gravitation, first published in his aforementioned Principia in 1687, asserts that the force \(v\) exerted on a point mass \(Q\) at \(x \in \mathbb{R}^3\) by a system of finitely many point masses \(q_i\) at \(y_i \in \mathbb{R}^3\), \(i = 1, \ldots, N\), is equal to

\[
v(x) = C \sum_{i=1}^{N} \frac{q_i Q}{|x - y_i|^2}\, \frac{x - y_i}{|x - y_i|}, \quad x \neq y_i,\ i = 1, \ldots, N, \tag{1}
\]

with a constant \(C < 0\) (like masses attract). The same law of interaction between point charges was discovered experimentally by C. A. de Coulomb (1736–1806) and announced in 1785, now with \(C > 0\) (like charges repel). Note that the numerical value of the constant \(C\) depends on the unit system one is using to measure force, mass (or charge), and distance. After the introduction of the function (or potential) \(V\) given by

\[
V(x) = C \sum_{i=1}^{N} \frac{q_i}{|x - y_i|}, \quad x \neq y_i,\ i = 1, \ldots, N, \tag{2}
\]

into the theory of gravitation by D. Bernoulli in 1748, Lagrange noticed in 1773 that

\[
v(x) = Q\, \nabla_x V(x), \quad x \neq y_i,\ i = 1, \ldots, N. \tag{3}
\]

Hence, the function \(V\) completely describes the gravitational (or electrostatic) field. For a continuous distribution of charges on a (regular) region \(\mathcal{G}\) with density \(\rho\), vanishing outside \(\mathcal{G}\), the potential becomes

\[
V(x) = \frac{C}{4\pi} \int_{\mathcal{G}} \rho(y)\, \frac{1}{|x - y|}\, dy, \quad x \in \mathcal{G}^c, \tag{4}
\]

where \(dy\) is the volume element and \(\mathcal{G}^c = \mathbb{R}^3 \setminus \overline{\mathcal{G}}\) is the outer space of \(\mathcal{G}\). As observed by Laplace in 1782, the function \(G(\Delta; |\cdot - y|)\), \(y \in \mathbb{R}^3\), given by

\[
G(\Delta; |x - y|) = \frac{1}{4\pi}\, \frac{1}{|x - y|}, \quad x \neq y, \tag{5}
\]

fulfils the Laplace equation

\[
\Delta_x G(\Delta; |x - y|) = 0, \quad x \in \mathbb{R}^3 \setminus \{y\}. \tag{6}
\]
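A quick numerical sanity check of (6): away from the pole y, a second-order central-difference approximation of the Laplacian applied to G(Δ; |· − y|) should vanish up to discretization error. A minimal sketch with hypothetical points x and y:

```python
import numpy as np

def G(x, y):
    """Fundamental kernel (5): G(Δ; |x-y|) = 1 / (4π |x-y|)."""
    return 1.0 / (4.0 * np.pi * np.linalg.norm(x - y))

def fd_laplacian(f, x, h=1e-3):
    """Second-order central-difference approximation of the Laplacian at x."""
    return sum((f(x + h*e) - 2.0*f(x) + f(x - h*e)) / h**2 for e in np.eye(3))

y = np.array([0.5, -0.2, 1.0])
x = np.array([2.0, 1.0, -1.0])   # any point with x != y
assert abs(fd_laplacian(lambda z: G(z, y), x)) < 1e-6   # Eq. (6): harmonic away from y
```

The residual shrinks like h² as long as x stays away from the singularity at y, in accordance with (6).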

In today’s jargon the function −G(Δ; | · −y|) is called the fundamental solution of the Laplace equation (6) in R3 \{y}. The solutions of the Laplace equation came to be known as harmonic functions. It should, however, be remarked that the Laplace equation had been also considered by Lagrange in 1760 in connection with his study of fluid flow problems. Laplace’s work was completed by his student Poisson (1781–1840) in 1813, when he showed that ΔV = −Cρ for “smooth” densities ρ. We summarize these classical results in today’s mathematical formalism: Theorem 1. Suppose that G is a regular region in R3 , i.e. a bounded region G ⊂ R3 dividing R3 uniquely into the inner space G and the outer space G C = R3 \G , G = G ∪ ∂G , such that G contains the origin and the boundary ∂G is an orientable smooth Lipschitzian manifold of dimension 2. (1) Let ρ : G → R be an integrable bounded function. Then  V (x) = C

ρ(y) G(Δ; |x − y|) dy,

G

x ∈ G c,

(7)

satisfies the Laplace equation Δx V (x) = 0

(8)

for all x ∈ G c , i.e., V is harmonic in G c . (2) Let ρ : G → R be of class C (0) (G ). Then V as defined by (7) is of class C (0) (G ). Furthermore, we have  (x) = ∇x V (x) = C

G

ρ(y) ∇x G(Δ; |x − y|) dy,

x ∈ G.

(9)
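For a simple body, the volume potential in the form (4)/(7) can be evaluated numerically and compared with a closed form: for the homogeneous unit ball (ρ ≡ 1) one has ∫_B |x − y|⁻¹ dy = (4π/3)/|x| for exterior points, hence V(x) = C/(3|x|). A sketch using the midpoint rule in spherical coordinates (the exterior point is placed on the x₃-axis, which is no restriction by symmetry; C = −1 is a hypothetical value):

```python
import numpy as np

def newton_potential_ball(d, C=-1.0, n=400):
    """Midpoint-rule evaluation of (4) for the homogeneous unit ball (rho = 1)
    at an exterior point with |x| = d > 1:
      V(x) = (C / 4π) ∫_B |x-y|^{-1} dy
           = (C / 4π) * 2π ∫_0^1 ∫_{-1}^1 r² / sqrt(r² + d² - 2 r d t) dt dr,
    with t = cos(theta)."""
    r = (np.arange(n) + 0.5) / n                     # radii in (0, 1)
    t = -1.0 + (np.arange(n) + 0.5) * (2.0 / n)      # t = cos(theta) in (-1, 1)
    R, T = np.meshgrid(r, t, indexing="ij")
    integrand = R**2 / np.sqrt(R**2 + d**2 - 2.0*R*d*T)
    integral = 2.0 * np.pi * integrand.sum() * (1.0 / n) * (2.0 / n)
    return C / (4.0 * np.pi) * integral

# Closed form outside the ball: V(x) = (C/4π) * (4π/3)/|x| = C / (3 |x|).
d = 2.5
assert abs(newton_potential_ball(d) - (-1.0) / (3.0 * d)) < 1e-4
```

The exterior potential of the homogeneous ball behaves like that of a point mass placed at the center, which is the prototype of the harmonicity statement (8).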

Moreover, the so-called Poisson equation can be formulated as follows under the assumption of \(\mu\)-Hölder continuity, \(\mu \in (0, 1]\):

Theorem 2. If \(\rho\) is of class \(C^{(0,\mu)}(\overline{\mathcal{G}})\), \(\mu \in (0, 1]\), \(\mathcal{G}\) a regular region in \(\mathbb{R}^3\), then the Poisson differential equation

\[
-\Delta_x V(x) = C\rho(x) \tag{10}
\]

holds true for all \(x \in \mathcal{G}\).

It should be remarked that in Theorem 2 the assumption of \(\mu\)-Hölder continuity of \(\rho\), \(\mu \in (0, 1]\), is needed for the proof. As a matter of fact, it is shown in [58]

that the \(\mu\)-Hölder continuity of \(\rho\), \(\mu \in (0, 1]\), is necessary to imply the second continuous differentiability of the Newton volume potential \(V\).

Strongly influenced by Newton's ideas, in the nineteenth century the forces in physics were believed to be derived from potential functions which satisfy Laplace's equation. This explains the term "potential theory". Consequently, in the mathematical sense, potential theory was the study of functions that could serve as (approximants of) potentials. Nowadays, we know that the equations that describe forces are systems of non-linear partial differential equations, such as the Einstein equations (see, e.g., [1, 47, 71]), and that the Laplace equation is only valid as a limiting case. Nevertheless, the term potential theory has remained as a convenient term for describing the study of functions satisfying the Laplace equation and its generalizations. Studies in potential theory concerning properties of different potentials have acquired an independent significance.

In the first half of the twentieth century, a great stimulus for the generalization of the principal problems and the completion of the existing formulations in potential theory came from the general notions of a Radon measure, capacity, and generalized functions. Nowadays, potential theory is closely related in its development to the theory of analytic functions and, to some extent, to probability theory (see, e.g., [1, 71]). Together with further studies of classical boundary value problems and inverse problems, the modern period of the development of potential theory is usually characterized by the application of methods and notions of topology and functional analysis, and by the use of abstract axiomatic methods.

2 Vector Analytic Tools

The Gauss integral theorem (from 1813) and the related Green formulas (cf. [32]) are among the basic tools of potential theory. They are also indispensable for a variety of problems in physical geodesy (cf. [34, 35, 54]).

Theorem 3 (Gauss's Integral Theorem). Let \(\mathcal{G}\) be a regular region. Let \(F : \overline{\mathcal{G}} \to \mathbb{R}\) be a scalar field and \(f : \overline{\mathcal{G}} \to \mathbb{R}^3\) a vector field that are continuous on \(\overline{\mathcal{G}}\) and differentiable in \(\mathcal{G}\). Then

\[
\int_{\partial \mathcal{G}} F(y)\, \nu(y)\, dS(y) = \int_{\mathcal{G}} \nabla_y F(y)\, dy, \tag{11}
\]

\[
\int_{\partial \mathcal{G}} f(y) \cdot \nu(y)\, dS(y) = \int_{\mathcal{G}} \nabla_y \cdot f(y)\, dy, \tag{12}
\]

\[
\int_{\partial \mathcal{G}} \nu(y) \wedge f(y)\, dS(y) = \int_{\mathcal{G}} \nabla_y \wedge f(y)\, dy, \tag{13}
\]

provided that the integrand on the right-hand side is Lebesgue-integrable on \(\mathcal{G}\). The vector field \(\nu : \partial \mathcal{G} \to \mathbb{R}^3\) is the (unit) normal field pointing into the exterior of \(\mathcal{G}\), and \(dS\) is the surface element.

The identities (12) and (13) are valid for all vector fields, whatever their physical meaning is. Of special interest for us is Eq. (12). For the physical interpretation of Eq. (12) it is helpful to assume that \(f\) is the velocity vector of an incompressible fluid. Inside the surface \(\partial \mathcal{G}\) there may be sources in which the fluid is generated, or sinks in which the fluid is annihilated. The divergence \(\nabla \cdot f\) measures the strength of the sources and sinks. The volume integral \(\int_{\mathcal{G}} \nabla_y \cdot f(y)\, dy\) is the total amount of fluid generated in unit time. The surface integral \(\int_{\partial \mathcal{G}} f(y) \cdot \nu(y)\, dS(y)\) is the total amount of fluid flowing in unit time across the surface \(\partial \mathcal{G}\). Therefore, the Gauss formula expresses a balance equation, namely the evident fact that both integrals in (12) are equal.

Gravitational Interpretation. In the case where the vector \(f\) is the gravitational force, i.e., if \(f = v = \nabla V\), the intuitive interpretation of the Gauss integral theorem is not so obvious, but the analogy to the balance equation of fluid flow is helpful. In gravitation we can take advantage of the Poisson equation (10):

\[
\nabla \cdot v = \Delta V = -C\rho. \tag{14}
\]

∂F (y) dS(y) = ∂ν(y)

 G

Δy F (y) dy,

(15)

∂ where ∂ν = ν · ∇ denotes the derivative in the direction of the outer (unit) normal field ν. Under the special choice f = F ∇G the Gauss Theorem yields the following theorem.

Theorem 4 (First Interior Green Theorem). Suppose that \(\mathcal{G} \subset \mathbb{R}^3\) is a regular region. For \(F \in C^{(1)}(\overline{\mathcal{G}})\) and \(G \in C^{(1)}(\overline{\mathcal{G}}) \cap C^{(2)}(\mathcal{G})\) with \(\Delta G\) Lebesgue-integrable on \(\mathcal{G}\), we have

\[
\int_{\mathcal{G}} \left( F(y)\, \Delta_y G(y) + \nabla_y F(y) \cdot \nabla_y G(y) \right) dy = \int_{\partial \mathcal{G}} F(y)\, \frac{\partial G}{\partial \nu}(y)\, dS(y). \tag{16}
\]

Taking \(f = F \nabla G - G \nabla F\), we finally obtain the following theorem.

Theorem 5 (Second Interior Green Theorem). Suppose that \(\mathcal{G} \subset \mathbb{R}^3\) is a regular region. For \(F, G \in C^{(1)}(\overline{\mathcal{G}}) \cap C^{(2)}(\mathcal{G})\) with \(\Delta F\) and \(\Delta G\) Lebesgue-integrable on \(\mathcal{G}\), we have

\[
\int_{\mathcal{G}} \left( G(y)\, \Delta_y F(y) - F(y)\, \Delta_y G(y) \right) dy
= \int_{\partial \mathcal{G}} \left( G(y)\, \frac{\partial F}{\partial \nu}(y) - F(y)\, \frac{\partial G}{\partial \nu}(y) \right) dS(y). \tag{17}
\]

Definition 1 (Solid Angle). Let \(\mathcal{G} \subset \mathbb{R}^3\) be a regular region. Then the solid angle \(\alpha(x)\) subtended by the boundary \(\partial \mathcal{G}\) at the point \(x \in \mathbb{R}^3\) is given by

\[
\alpha(x) = - \int_{\partial \mathcal{G}} \frac{\partial}{\partial \nu(y)}\, G(\Delta; |x - y|)\, dS(y). \tag{18}
\]

In fact, from the Second Interior Green Theorem we easily obtain the following result.

Theorem 6 (Third Interior Green Theorem). Let \(\mathcal{G} \subset \mathbb{R}^3\) be a regular region. Suppose that \(F : \overline{\mathcal{G}} \to \mathbb{R}\) is of class \(C^{(1)}(\overline{\mathcal{G}}) \cap C^{(2)}(\mathcal{G})\) with \(\Delta F\) Lebesgue-integrable on \(\mathcal{G}\). Then

\[
\int_{\partial \mathcal{G}} \left( G(\Delta; |x - y|)\, \frac{\partial F}{\partial \nu}(y) - F(y)\, \frac{\partial}{\partial \nu(y)} G(\Delta; |x - y|) \right) dS(y)
- \int_{\mathcal{G}} G(\Delta; |x - y|)\, \Delta_y F(y)\, dy = \alpha(x)\, F(x). \tag{19}
\]

Especially, for a continuously differentiable boundary \(\partial \mathcal{G}\), we obtain the following corollary.

Corollary 1. Let \(\mathcal{G} \subset \mathbb{R}^3\) be a regular region with continuously differentiable boundary \(\partial \mathcal{G}\). Suppose that \(F : \overline{\mathcal{G}} \to \mathbb{R}\) is of class \(C^{(1)}(\overline{\mathcal{G}}) \cap C^{(2)}(\mathcal{G})\) with \(\Delta F\) Lebesgue-integrable on \(\mathcal{G}\). Then

\[
\int_{\partial \mathcal{G}} \left( G(\Delta; |x - y|)\, \frac{\partial F}{\partial \nu}(y) - F(y)\, \frac{\partial}{\partial \nu(y)} G(\Delta; |x - y|) \right) dS(y)
- \int_{\mathcal{G}} G(\Delta; |x - y|)\, \Delta_y F(y)\, dy
= \begin{cases} F(x), & x \in \mathcal{G}, \\[2pt] \frac{1}{2} F(x), & x \in \partial \mathcal{G}, \\[2pt] 0, & x \in \mathcal{G}^c. \end{cases} \tag{20}
\]

As a special case we obtain, for continuously differentiable functions \(F\) on \(\overline{\mathcal{G}}\) which are harmonic in \(\mathcal{G}\), the so-called Interior Fundamental Theorem.

Corollary 2. Suppose that \(F : \overline{\mathcal{G}} \to \mathbb{R}\) is of class \(C^{(1)}(\overline{\mathcal{G}}) \cap C^{(2)}(\mathcal{G})\) with \(\Delta F = 0\) in \(\mathcal{G}\). Then

\[
\int_{\partial \mathcal{G}} \left( G(\Delta; |x - y|)\, \frac{\partial F}{\partial \nu}(y) - F(y)\, \frac{\partial}{\partial \nu(y)} G(\Delta; |x - y|) \right) dS(y)
= \begin{cases} F(x), & x \in \mathcal{G}, \\[2pt] \frac{1}{2} F(x), & x \in \partial \mathcal{G}, \\[2pt] 0, & x \in \mathcal{G}^c. \end{cases} \tag{21}
\]

Letting \(F = 1\) in \(\overline{\mathcal{G}}\), we are obviously led to the following result for the solid angle in connection with (5) and Corollary 2:

\[
\alpha(x) = \begin{cases} 1, & x \in \mathcal{G}, \\[2pt] \frac{1}{2}, & x \in \partial \mathcal{G}, \\[2pt] 0, & x \in \mathcal{G}^c, \end{cases} \tag{22}
\]

provided that \(\mathcal{G} \subset \mathbb{R}^3\) is a regular region with continuously differentiable boundary \(\partial \mathcal{G}\) (cf. Fig. 1). In the case of the cube \(\mathcal{G} = (-1, 1)^3 \subset \mathbb{R}^3\) we especially have

(i) \(\alpha(x) = 1\) if \(x\) is located in the open cube \(\mathcal{G}\),
(ii) \(\alpha(x) = \frac{1}{2}\) if \(x\) is located on one of the six faces of the boundary \(\partial \mathcal{G}\) of the cube \(\mathcal{G}\), but not on an edge or in a vertex,
(iii) \(\alpha(x) = \frac{1}{4}\) if \(x\) is located on one of the twelve edges of \(\partial \mathcal{G}\), but not in a vertex,
(iv) \(\alpha(x) = \frac{1}{8}\) if \(x\) is located in one of the eight vertices of \(\partial \mathcal{G}\) (cf. Fig. 2).

It should be remarked that the divergence theorem (Eq. (12)) first appeared in Lagrange's posthumous work of 1860, and it was proved in a special case already by Gauss in 1813. The general three-dimensional case was treated by M. V. Ostrogradsky in 1826. In a preliminary section of his groundbreaking 1828 essay, George Green proved several reductions of three-dimensional volume integrals to surface integrals, similar in spirit to the divergence theorem, and independently of M. V. Ostrogradsky. Nowadays, those are called Green's identities and are best viewed as consequences of Gauss's integral theorem.

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

1061

Fig. 1 Solid angle subtended at x ∈ R3 by the surface ∂G of a regular region G with “smooth boundary”

α(x) =

1 2

α(x) = 1

=

1 2

=

1 4

α(x) = 0

=1

=

=0

1 8

Fig. 2 Solid angle subtended at x ∈ R3 by the surface ∂G of the “non-smooth” cube G = (−1, 1)3
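The cube values (i)–(iv) can be reproduced numerically from Definition 1. A convenient feature of the cube: on every face that contains the evaluation point, the integrand of (18) vanishes identically (ν(y) · (x − y) = 0 there), so no singular quadrature is needed. A sketch using the midpoint rule on each face:

```python
import numpy as np

def solid_angle(x, n=200):
    """Evaluate alpha(x) = -∫_{∂G} ∂/∂ν(y) G(Δ;|x-y|) dS(y), Eq. (18),
    for the cube G = (-1,1)^3 by the midpoint rule on each of the six faces."""
    x = np.asarray(x, dtype=float)
    t = -1.0 + (np.arange(n) + 0.5) * (2.0 / n)   # cell-centre nodes on (-1, 1)
    u, v = np.meshgrid(t, t, indexing="ij")
    w = (2.0 / n) ** 2                            # area of one quadrature cell
    total = 0.0
    for axis in range(3):
        for s in (-1.0, 1.0):                     # two opposite faces per axis
            y = np.empty((3, n, n))
            y[axis] = s
            y[(axis + 1) % 3] = u
            y[(axis + 2) % 3] = v
            nu = np.zeros(3); nu[axis] = s        # outward unit normal of the face
            d = x[:, None, None] - y              # x - y
            r = np.sqrt((d**2).sum(axis=0))
            # ∂/∂ν(y) G(Δ;|x-y|) = ν(y)·(x-y) / (4π |x-y|³)
            total += ((nu[:, None, None] * d).sum(axis=0)
                      / (4.0 * np.pi * r**3)).sum() * w
    return -total

assert abs(solid_angle([0.2, -0.1, 0.3]) - 1.0) < 1e-2    # (i)  interior point
assert abs(solid_angle([1.0, 0.0, 0.0]) - 0.5) < 1e-2     # (ii) face centre
assert abs(solid_angle([1.0, 1.0, 0.0]) - 0.25) < 1e-2    # (iii) edge midpoint
assert abs(solid_angle([1.0, 1.0, 1.0]) - 0.125) < 1e-2   # (iv) vertex
assert abs(solid_angle([2.0, 0.0, 0.0])) < 1e-2           # exterior point
```

The quadrature reproduces the full case distinction of (22) together with the non-smooth values 1/4 and 1/8 that the cube's edges and vertices contribute.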

3 Potential Theoretic Background

Next we are concerned with the mean value theorem (in today's mathematical language) that dates back to Gauss (1840). Note that we use the notation \(S_r^2(x) = \{z \in \mathbb{R}^3 : |z - x| = r\}\) for the sphere around \(x\) with radius \(r\) and \(B_r^3(x) = \{z \in \mathbb{R}^3 : |z - x| < r\}\) for the ball around \(x\) with radius \(r\).

Theorem 7. Let \(\mathcal{G} \subset \mathbb{R}^3\) be a regular region. Then the following statements are equivalent:

(a) \(U : \mathcal{G} \to \mathbb{R}\) is harmonic in \(\mathcal{G}\), i.e., \(U \in C^{(2)}(\mathcal{G})\) and \(\Delta U = 0\) in \(\mathcal{G}\),
(b) \(U : \mathcal{G} \to \mathbb{R}\) possesses the mean value property on \(\mathcal{G}\), i.e., \(U\) is of class \(C^{(0)}(\mathcal{G})\) and, for all \(x \in \mathcal{G}\) and all \(r > 0\) with \(\overline{B_r^3(x)} \subset \mathcal{G}\),

\[
U(x) = \frac{1}{4\pi r^2} \int_{S_r^2(x)} U(y)\, dS(y), \tag{23}
\]

(c) \(U\) is of class \(C^{(0)}(\mathcal{G})\) and, for all \(x \in \mathcal{G}\) and all \(r > 0\) with \(\overline{B_r^3(x)} \subset \mathcal{G}\),

\[
\int_{B_r^3(x)} \left( U(x) - U(y) \right) dy = 0. \tag{24}
\]
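The mean value property (23) can be checked numerically. The sketch below averages the harmonic function U(y) = 1/|y − a| (with a hypothetical pole a placed outside the ball) over a sphere S²_r(x), using Gauss–Legendre quadrature in cos θ and the trapezoidal rule in φ:

```python
import numpy as np

# Mean value property (23) for the harmonic function U(y) = 1/|y - a|,
# with a hypothetical pole a outside the closed ball B_r(x).
a = np.array([3.0, 1.0, -2.0])
U = lambda y: 1.0 / np.linalg.norm(y - a)

x, r = np.array([0.2, -0.4, 0.1]), 1.0        # closure of B_r(x) avoids a

m = 40
t, wt = np.polynomial.legendre.leggauss(m)    # nodes/weights for cos(theta)
phi = 2.0 * np.pi * np.arange(2*m) / (2*m)    # trapezoid nodes in phi

mean = 0.0
for ti, wi in zip(t, wt):
    s = np.sqrt(1.0 - ti**2)
    for p in phi:
        y = x + r * np.array([s*np.cos(p), s*np.sin(p), ti])
        mean += wi * U(y)
mean /= 2.0 * len(phi)    # (1/4π) ∫_{S_r(x)} U dS with dS = r² dt dφ, r cancels

assert abs(mean - U(x)) < 1e-8                # spherical mean equals centre value
```

Since both quadrature rules converge geometrically for this analytic integrand, the spherical mean agrees with U(x) essentially to machine precision, in accordance with (23).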

A central result in the theory of harmonic functions is the maximum/minimum principle. The essential tool is the mean value property.

Theorem 8 (Maximum/Minimum Principle). Let \(\mathcal{G} \subset \mathbb{R}^3\) be a regular region. Suppose that \(U\) is harmonic in \(\mathcal{G}\) and non-constant. Then \(U\) does not attain its minimum or maximum in \(\mathcal{G}\). If, in addition, \(U\) is of class \(C^{(0)}(\overline{\mathcal{G}})\), then \(U\) attains its minimum and maximum on \(\overline{\mathcal{G}}\), and the extremal points lie on \(\partial \mathcal{G}\). More precisely,

\[
\sup_{x \in \overline{\mathcal{G}}} |U(x)| \le \sup_{x \in \partial \mathcal{G}} |U(x)|. \tag{25}
\]

A direct consequence of the maximum/minimum principle is the following stability theorem.

Theorem 9. Let \(\mathcal{G} \subset \mathbb{R}^3\) be a regular region. Suppose that \(U\) and \(V\) are of class \(C^{(0)}(\overline{\mathcal{G}}) \cap C^{(2)}(\mathcal{G})\) and harmonic in \(\mathcal{G}\). Let \(\varepsilon\) be an arbitrary positive number. If

\[
\sup_{x \in \partial \mathcal{G}} |U(x) - V(x)| \le \varepsilon, \tag{26}
\]

then

\[
\sup_{x \in \overline{\mathcal{G}}} |U(x) - V(x)| \le \varepsilon. \tag{27}
\]

Now we are prepared to establish the (real) analyticity of harmonic functions.

Theorem 10 (Analyticity). Let \(\mathcal{G} \subset \mathbb{R}^3\) be a regular region. Suppose that \(U\) is harmonic in \(\mathcal{G}\). Then \(U\) is (real) analytic, i.e., for every \(x_0 \in \mathcal{G}\) there exists \(\rho > 0\) such that

\[
U(x_0 + h) = \sum_{j=0}^{\infty} \frac{1}{j!} \left( (h \cdot \nabla)^j U \right)(x_0) \tag{28}
\]

for all \(h \in \mathbb{R}^3\) with \(|h| < \rho\).

The Newton (volume) potential extended over \(\mathcal{G}\) is harmonic in the exterior \(\mathcal{G}^c = \mathbb{R}^3 \setminus \overline{\mathcal{G}}\). This is the reason why potential theory under geoscientifically relevant aspects essentially aims at concepts in the outer space of a regular region.

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

1063

The treatment of the outer space in the Euclidean space \(\mathbb{R}^3\), however, includes the discussion at infinity. As a consequence, Green's integral theorems must be formulated under geophysically relevant conditions imposed on harmonic functions at infinity. Mathematically (see, e.g., [15]), the "regularity at infinity" can be deduced via the Kelvin transform by a transition from functions harmonic in the inner space to their counterparts in the outer space, and vice versa.

Theorem 11. If \(U\) is harmonic in \(\mathcal{G}^c\) and \(U\) converges to zero for \(|x| \to \infty\) uniformly with respect to all directions, then \(|x|\, |U(x)|\) and \(|x|^2\, |\nabla_x U(x)|\) are bounded for \(|x| \to \infty\).

Theorem 11 leads us to the definition of the "regularity at infinity".

Definition 2. A function \(U : \mathcal{G}^c \to \mathbb{R}\) is called regular at infinity if \(U\) satisfies the asymptotic relations \(|U(x)| = O(|x|^{-1})\) and \(|\nabla_x U(x)| = O(|x|^{-2})\), \(|x| \to \infty\), uniformly with respect to all directions \(x/|x|\).

Now we are prepared to discuss exterior versions of Green's identities involving harmonic functions that are regular at infinity. All these identities can be obtained by first considering the auxiliary set \(\mathcal{G}_R^c(0) = \mathcal{G}^c \cap B_R^3(0)\) (with \(R\) sufficiently large such that \(\overline{\mathcal{G}} \subset B_R^3(0)\) and \(\mathrm{dist}(\partial \mathcal{G}, S_R^2(0)) > 0\)) and afterwards letting \(R\) tend to infinity (note that \(\mathcal{G}_R^c(0)\), as the difference of the two regular regions \(B_R^3(0)\) and \(\mathcal{G}\), allows the application of the interior Green formulas).

Theorem 12 (First Exterior Green Theorem). Let \(F\) be a function of class \(C^{(2)}(\mathcal{G}^c) \cap C^{(1)}(\overline{\mathcal{G}^c})\) such that \(F\) is harmonic in \(\mathcal{G}^c\) and regular at infinity. Suppose that the function \(H \in C^{(1)}(\overline{\mathcal{G}^c})\) satisfies the asymptotic relations

\[
|y|^2\, F(y)\, \nabla_y H(y) = O(1) \tag{29}
\]

and

\[
\nabla_y F(y) \cdot \nabla_y H(y) = O\!\left( \frac{1}{|y|^{3+\varepsilon}} \right), \quad \varepsilon > 0. \tag{30}
\]

Then

\[
\int_{\mathcal{G}^c} \nabla_y F(y) \cdot \nabla_y H(y)\, dy = \int_{\partial \mathcal{G}} F(y)\, \frac{\partial H}{\partial \nu}(y)\, dS(y), \tag{31}
\]

where \(\nu\) is the outer unit normal field to \(\mathcal{G}^c\), i.e., the inner unit normal field to \(\mathcal{G}\).

Theorem 13 (Second Exterior Green Theorem). Let \(F, H \in C^{(1)}(\overline{\mathcal{G}^c}) \cap C^{(2)}(\mathcal{G}^c)\) be harmonic in \(\mathcal{G}^c\) and regular at infinity. Then

\[
\int_{\partial \mathcal{G}} \left( F(y)\, \frac{\partial H}{\partial \nu}(y) - H(y)\, \frac{\partial F}{\partial \nu}(y) \right) dS(y) = 0. \tag{32}
\]

Theorem 14 (Third Exterior Green Theorem). Suppose that \(\mathcal{G}\) is a regular region with continuously differentiable boundary \(\partial \mathcal{G}\). Let \(U\) be of class \(C^{(1)}(\overline{\mathcal{G}^c}) \cap C^{(2)}(\mathcal{G}^c)\) such that \(U\) is harmonic in \(\mathcal{G}^c\) and regular at infinity. Then

\[
\int_{\partial \mathcal{G}} \left( G(\Delta; |x - y|)\, \frac{\partial U}{\partial \nu}(y) - U(y)\, \frac{\partial}{\partial \nu(y)} G(\Delta; |x - y|) \right) dS(y)
= \begin{cases} U(x), & x \in \mathcal{G}^c, \\[2pt] \frac{1}{2} U(x), & x \in \partial \mathcal{G}, \\[2pt] 0, & x \in \mathcal{G}, \end{cases} \tag{33}
\]

where \(\nu\) is the outer unit normal field to \(\mathcal{G}^c\), i.e., the inner unit normal field to \(\mathcal{G}\).
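The exterior representation formula (33) can be illustrated numerically for the unit ball. In the sketch below, the harmonic function U(y) = 1/|y − a|, with a hypothetical pole a inside the ball, is regular at infinity, and the boundary integral is evaluated by Gauss–Legendre quadrature in the polar direction and the trapezoidal rule in the azimuth:

```python
import numpy as np

# Check of (33) for the unit ball G = B_1(0): U(y) = 1/|y - a| with a pole a
# inside the ball is harmonic in the exterior and regular at infinity; the
# normal nu(y) = -y points into G (outer normal of the exterior space G^c).
a = np.array([0.3, -0.1, 0.2])
U = lambda y: 1.0 / np.linalg.norm(y - a)
gradU = lambda y: -(y - a) / np.linalg.norm(y - a)**3

def represent(x, m=40):
    t, wt = np.polynomial.legendre.leggauss(m)     # nodes/weights in cos(theta)
    phi = 2.0 * np.pi * np.arange(2*m) / (2*m)
    dphi = 2.0 * np.pi / (2*m)
    total = 0.0
    for ti, wi in zip(t, wt):
        s = np.sqrt(1.0 - ti**2)
        for p in phi:
            y = np.array([s*np.cos(p), s*np.sin(p), ti])   # point on unit sphere
            nu = -y                                        # inner normal to G
            d = x - y
            r = np.linalg.norm(d)
            G_val = 1.0 / (4.0 * np.pi * r)                # kernel (5)
            dG_dnu = np.dot(nu, d) / (4.0 * np.pi * r**3)  # nu . grad_y G
            total += wi * dphi * (G_val * np.dot(nu, gradU(y)) - U(y) * dG_dnu)
    return total

x = np.array([1.5, 0.5, -0.8])                 # exterior point, |x| > 1
assert abs(represent(x) - U(x)) < 1e-6         # (33), exterior case
```

The boundary data alone reproduce the exterior harmonic function, which is exactly the mechanism exploited by the geodetic boundary value problems of the next section.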

4 Geodetic Boundary Value Problems

Since the time of Newton, the theories for the precise determination of the attraction of the Earth, exerted on a point on the Earth's surface, have had a special significance for geodesy. These investigations are especially connected to the names of Huygens, MacLaurin, d'Alembert, Lagrange, Legendre, and Laplace, as well as to their "scholars" Poisson and Plana (cf. [65]). In his treatise on his investigations in potential theory concerning homogeneous ellipsoids (Theoria attractionis corporum sphaeroidicorum ellipticorum homogeneorum methodo nova tractata, Vol. II, Göttingen, 1813), presented to the "Königliche Gesellschaft der Wissenschaften zu Göttingen" (Royal Society of Sciences at Göttingen), in which "Gauss's Integral Theorem" (the divergence theorem) about the transformation of a volume integral into a surface integral is derived, Gauss writes:

The solution of Laplace deserves common admiration because of its elegance and its sagacity; but just the fact that it was necessary to use special finesses and artifices in order to solve the problem may awake among the geometricians the desire for a simpler, less complicated, and more direct solution. This desire did not completely come true by a new proof of the main theorem given by Legendre (Hist. de l'acad. roy. des sc. 1788, Sur les intégrales doubles), though the extraordinary analytical skilfulness shown there was recognized by all geometricians. Later, Biot and Plana tried to simplify the solution (Mém. de l'Institut T. VI; Memorie di matematica e di fisica della società italiana T. XV). But these two solutions also belong to the most complicated applications of analysis. We hope that a new solution, which will use a quite different way, will not be unwelcome to mathematicians and astronomers.

As a consequence of this historical groundwork, the determination of the Earth's shape and of its gravity potential are considered to be main tasks of physical geodesy. Both problems relate to the work of Stokes [64]. Nowadays, they can be tackled simultaneously via the so-called Molodensky problem (cf. [49]). In order to

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

1065

understand Molodensky’s setup, we have to start once more with the gravitational potential V of the Earth, related to its mass by the Newton volume integral (4). As a consequence of the Gauss Theorem we already know that V is harmonic in the outer space, i.e., ΔV = 0 in G c . Moreover, V is regular at infinity. As the Earth is rotating, we can also assign a centrifugal potential Φ to the rotating body. This results in the so-called gravity potential W given by W = V + Φ.

(34)

The gradient of the gravity potential is the gravity vector  = ∇W.

(35)

The magnitude of \(v\), in geodesy usually denoted by \(g = |v|\) and simply called the gravity (intensity), can be obtained on the surface \(\partial \mathcal{G}\) of the Earth from gravimetric measurements. The direction of \(v\) gives the direction of the plumb line and can be obtained from astronomic observations and today also from satellite measurements. Plumb lines are not straight, but intersect each equipotential surface of \(W\) normally, such that the gravity vector \(v\) at any given point is tangential to the plumb line at this point. Combining leveling with the gravimetric and astronomic measurements which determine \(v\) allows us to obtain \(W\) on the surface up to an additive constant, which can be determined from additionally knowing at least one distance. All data sets are assumed to be corrected for influences like the gravitational potentials of other celestial bodies or the Earth's precession and atmosphere.

Hörmander [36] in his seminal work stated the situation related to Molodensky's problem as the following idealized setting (where we follow almost literally the approach described in [4]):

(i) The Earth \(\mathcal{G}\) is a rigid body rotating with a known constant angular speed \(\omega\) around a fixed axis, which we choose as the \(x_3\)-axis. The centrifugal potential is given by

\[
\Phi(x) = \frac{1}{2} \omega^2 (x_1^2 + x_2^2), \quad x = (x_1, x_2, x_3)^T. \tag{36}
\]

(ii) The center of gravity is the origin \(0\) of the (Earth-fixed) system of coordinates.
(iii) The gravity vector \(v\) is known at every point \(P\) of \(\partial \mathcal{G}\).
(iv) The gravity potential \(W\) is known at every point \(P\) of \(\partial \mathcal{G}\).
(v) \(\partial \mathcal{G}\) can be mapped to the unit sphere \(S^2 = \{x \in \mathbb{R}^3 : |x| = 1\}\), i.e., there is a differentiable embedding \(\varsigma : S^2 \to \mathbb{R}^3\) such that \(\varsigma(S^2) = \partial \mathcal{G}\).

In the Molodensky problem, we are looking for the unknown gravity potential W in the whole space R3 and the unknown embedding ς . As the gravitational potential V is a harmonic function, i.e., ΔV = 0 outside the Earth, we obtain

\[
\Delta W = \Delta(V + \Phi) = \Delta V + \Delta \Phi = 2\omega^2 \tag{37}
\]

outside \(\partial \mathcal{G}\). The assumption of having the barycenter at the origin furthermore imposes on \(V\) the asymptotic relation

\[
V(x) = \frac{C}{|x|} + O(|x|^{-3}), \quad |x| \to \infty, \tag{38}
\]

holds true, which can be seen by taking a multipole expansion into account. All in all, Molodensky's problem is a non-linear free-boundary problem and, therefore, hard to solve mathematically. As a consequence, a variety of approximate methods exist. Usually, Molodensky's approach is modified via linearization. In fact, linearization and a sophisticated iterative process which avoids the loss of regularity are essential in treating Molodensky's problem (note that we do not cover here the iteration procedures as proposed by [36]). Linearizing Molodensky's problem amounts to the introduction of two ingredients:

(α) an approximate surface \(\partial \mathcal{T}\) for the Earth, called the telluroid,
(β) an approximate potential \(U\), called the normal potential.
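Before the classical linearization steps, it is worth noting that the identity (37) rests only on the fact that the centrifugal potential (36) is a quadratic polynomial, so that ΔΦ = 2ω² holds everywhere. A minimal finite-difference sketch (ω is the approximate angular speed of the Earth; the evaluation point is an arbitrary hypothetical choice):

```python
import numpy as np

# Finite-difference check that the centrifugal potential (36) satisfies
# Delta Phi = 2 * omega^2, the constant appearing in (37).
omega = 7.292e-5                              # Earth's angular speed, rad/s (approx.)
Phi = lambda x: 0.5 * omega**2 * (x[0]**2 + x[1]**2)

x, h = np.array([1234.0, -567.0, 89.0]), 1.0  # arbitrary point, unit step
lap = sum((Phi(x + h*e) - 2.0*Phi(x) + Phi(x - h*e)) / h**2 for e in np.eye(3))

# Phi is exactly quadratic, so the central difference is exact up to roundoff.
assert abs(lap - 2.0 * omega**2) < 1e-13
```

Since ΔV = 0 outside the masses, this constant 2ω² is the entire right-hand side of (37): the gravity potential W is not harmonic in the outer space, only its gravitational part V is.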

Classically, in the geodetic context, the following steps are carried out:

(1) Choose a reference body \(\mathcal{E}\), usually an ellipsoid, which lies completely inside the Earth and rotates with the same angular speed \(\omega\) as the Earth around the \(x_3\)-axis.
(2) For every point \(P\) on \(\partial \mathcal{G}\), find a point \(Q_0\) on the surface \(\partial \mathcal{E}\) such that \(x_P - x_{Q_0}\) is normal to \(\partial \mathcal{E}\). Here, \(x_P\) and \(x_{Q_0}\) are the position vectors of \(P\) and \(Q_0\).
(3) Determine the normal potential \(U\) such that
– its gravitational potential part is caused by a mass identical to the Earth's mass,
– its centrifugal potential part is identical to (36),
– the reference surface \(\partial \mathcal{E}\) is an equipotential surface of \(U\).
Note that \(U\) can be calculated explicitly, as its centrifugal part is known and its gravitational part is the solution to an exterior Dirichlet problem with boundary values given on \(\partial \mathcal{E}\) (see, e.g., [33, 34]).
(4) Compute the gradient \(\nabla U\) of \(U\), called the normal gravity vector field and denoted by \(u = \nabla U\), with magnitude \(\gamma = |u|\) called the normal gravity. As \(\partial \mathcal{E}\) is an equipotential surface of \(U\), \(u\) is normal to \(\partial \mathcal{E}\) at every point of \(\partial \mathcal{E}\), i.e., \(u(x_{Q_0})\) is parallel to \(x_P - x_{Q_0}\).
(5) For every point \(P\) on the real Earth's surface \(\partial \mathcal{G}\) choose a point \(Q\) according to one of the following conditions:

(A) \(Q\) lies on the line between \(P\) and its corresponding point \(Q_0\) on the surface \(\partial \mathcal{E}\) so that

\[
W(x_P) = U(x_Q), \tag{39}
\]

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

1067

(B) u(xQ ) (xP )

=

u(xQ ) |(xP )|

(40)

W (xP ) = U (xQ ),

(41)

(xP ) = u(xQ ),

(42)

and

(C)

respectively. All points \(Q\) chosen in this way make up the telluroid \(\partial \mathcal{T}\). Each of the above conditions leads to a slightly different telluroid. None of them is an equipotential surface, neither for \(W\) nor for \(U\). Condition (A) is the most popular one and the one originally used by Molodensky, whereas (B) is theoretically more correct and (C) seems to be better adapted to the so-called Marussi condition (cf. [36, 42]). For clarification, the Marussi condition says that the Jacobi matrix of \(u\), i.e., the Hessian of \(U\), should be non-singular at every point \(Q\), i.e.,

\[
\det \left( \left. \left( \frac{\partial^2 U(x)}{\partial x_i\, \partial x_j} \right)_{i,j=1,\ldots,3} \right|_{x = x_Q} \right) \neq 0 \tag{43}
\]

for all points \(Q\) on the telluroid \(\partial \mathcal{T}\). We shall see later why this condition is needed and sensible. In principle, the telluroid \(\partial \mathcal{T}\) can be chosen as any surface, as long as there is a one-to-one mapping between the Earth's surface \(\partial \mathcal{G}\) and \(\partial \mathcal{T}\) (in this respect it should be remarked that the geoid may nowadays be assumed to be known with an accuracy of about one centimeter or even less). Clearly, as a point of departure in the context of linearization, \(\partial \mathcal{T}\) should be close to \(\partial \mathcal{G}\) (in some sense) and chosen in a way that brings advantages during the subsequent process of linearization (note that a more correct notation would be \(Q_0(P)\) and \(Q(P)\), as the points \(Q_0\) and \(Q\) depend on \(P\); we will use it whenever it helps to avoid confusion).

The introduction of the normal gravity field \(u\) also suggests the definition of so-called normal plumb lines, i.e., lines which intersect each equipotential surface of \(U\) normally, such that the normal gravity vector \(u\) at any given point is tangential to the normal plumb line at this point. The normal potential has been well established in geodesy long before Molodensky, whereas other surfaces have been in use for a long time before the telluroid, e.g., the geoid \(\Sigma_{\mathrm{geoid}}\). As mentioned above, \(\partial \mathcal{E}\) is an equipotential surface of the normal

1068

W. Freeden and H. Nutz

potential U . Denoting the value of U on ∂E by U0 , the geoid Σgeoid is defined as the equipotential surface of W for which we have W = U0 on Σgeoid . Points on the geoid Σgeoid can be related to points on ∂G . We denote by P0 ∈ Σgeoid the point related to P ∈ ∂G . We can determine P0 from P by moving along the plumb line from P towards the center of the Earth until we reach the geoid. Another possibility would be to use the normal plumb line to reach the geoid or to choose P0 as the point on the geoid that lies also on the line between P ∈ ∂G and the corresponding Q0 ∈ ∂E . With an appropriate choice of ∂E and the normal potential U , all of these methods lead to almost the same point P0 . We define the distance vector between P0 (P ) and Q0 (P ) as d(P ) = xP0 (P ) − xQ0 (P ).

(44)

Its magnitude is the so-called geoidal undulation, in the geodetic context denoted by

N(P) = |d(P)| = |xP0(P) − xQ0(P)|.

(45)

Linearization of Molodensky's Problem. In order to linearize Molodensky's problem, we start from the assumption (due to [36]) that instead of one embedding ς : S² → R³ we have a family of smooth embeddings depending on a parameter χ ∈ [0, 1], i.e., S : S² × [0, 1] → R³, such that

S(S², 0) = ∂T,   S(S², 1) = ∂G,    (46)

where S is smooth with respect to χ, too. Moreover, we let

W : R³ × [0, 1] → R    (47)

(47)

be a family of potentials such that W(·, 0) = U and W(·, 1) = W. We assume that

W(·, χ) = V(·, χ) + Φ,    (48)

i.e., the gravitational part V of W depends on χ and the centrifugal potential part Φ is independent of χ (this is in accordance with the assumptions on U and W). Corresponding to W, we arrive at a family of gravity vectors g = ∇W such that

g(·, 0) = u  and  g(·, 1) = w.    (49)

We are now able to discuss the composition W = W ◦ S : S2 × [0, 1] → R,

(ξ, χ ) → W(ξ, χ ) = W(S(ξ, χ ), χ ).

(50)

For χ = 0, we find W(S(ξ, 0), 0) = U (xQ ) for Q ∈ ∂T , i.e., boundary values of U on the telluroid. For χ = 1, we obtain W(S(ξ, 1), 1) = W (xP ) for P ∈ ∂G ,

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

1069

i.e., boundary values of W on the real Earth’s surface. Analogously, for g = ∇W, we introduce g = g ◦ S : S2 × [0, 1] → R,

(ξ, χ ) → g(ξ, χ ) = g(S(ξ, χ ), χ ).

(51)

Note that W and w are supposed to be known on ∂G, whereas U and u can be calculated analytically outside ∂E. Now, the linearization can be performed by differentiation with respect to χ, denoted by a dot. We first notice that

Ẇ = ∂W/∂χ = ∂V/∂χ = V̇.    (52)

˙ is harmonic and satisfies (38). For boundary values we obtain via the chain Thus, W rule ˙ ˙ ˙ W(ξ, χ ) = W(S(ξ, χ ), χ ) + ∇W(S(ξ, χ ), χ )T S(ξ, χ) ˙ ˙ = W(S(ξ, χ ), χ ) + g(S(ξ, χ ), χ )T S(ξ, χ ).

(53)

In the same way we arrive at ˙ χ ) = g˙ (S(ξ, χ ), χ ) + J (g)(S(ξ, χ ), χ )T S(ξ, ˙ g(ξ, χ) ˙ = g˙ (S(ξ, χ ), χ ) + Hess(W)(S(ξ, χ ), χ )S(ξ, χ ),

(54)

with J(g) being the Jacobian of g and Hess(W) the Hessian of W. As an immediate result we are led to the aforementioned Marussi condition

det( ( ∂²W(S(ξ, χ), χ)/(∂xi ∂xj) )_{i,j=1,2,3} ) ≠ 0    (55)

for all (ξ, χ) ∈ S² × [0, 1]. Rewriting Eqs. (53) and (54) in composition form, we see that

Ẇ = Ẇ ∘ S + (g ∘ S)ᵀ Ṡ,    (56)
ġ = ġ ∘ S + (Hess(W) ∘ S) Ṡ.    (57)

From (57), we are able to deduce that

Ṡ = (Hess(W) ∘ S)⁻¹ ( ġ − ġ ∘ S ).    (58)


In connection with (56) we therefore obtain from (58)

Ẇ = Ẇ ∘ S + (g ∘ S)ᵀ (Hess(W) ∘ S)⁻¹ ( ġ − ġ ∘ S )
  = Ẇ ∘ S + ( ġ − ġ ∘ S )ᵀ (Hess(W) ∘ S)⁻¹ (g ∘ S)
  = Ẇ ∘ S + ( ġ − ġ ∘ S )ᵀ ( (Hess(W))⁻¹ g ) ∘ S
  = Ẇ ∘ S + ( ġ − ġ ∘ S )ᵀ ( (J(g))⁻¹ g ) ∘ S,    (59)

where we have used the fact that the Hessian is symmetric. (J(g))⁻¹ g is the tangent of the curve along which the gravity field has a fixed direction. Such lines are called isozenithals in geodesy. We introduce

M = (J(g))⁻¹ g    (60)

to gain a more compact notation. As a matter of fact, we are able to detect the following equivalences

Ẇ = Ẇ ∘ S + (M ∘ S)ᵀ ( ġ − ġ ∘ S )    (61)
⇔ Ẇ − (M ∘ S)ᵀ ġ = Ẇ ∘ S − (ġ ∘ S)ᵀ (M ∘ S)    (62)
⇔ Ẇ − (M ∘ S)ᵀ ġ = ( Ẇ − ġᵀ M ) ∘ S,    (63)

such that

Ẇ = Ẇ ∘ S + (M ∘ S)ᵀ ( ġ − ġ ∘ S )    (64)
⇔ Ẇ − (M ∘ S)ᵀ ġ = ( Ẇ − ∇Ẇᵀ M ) ∘ S.    (65)

Looking at our equivalences in more detail, we notice that we have specified a boundary condition for Ẇ. Since the values on the left-hand side of (65) are supposed to be known, the boundary condition is of Robin type with an oblique derivative, due to the occurrence of the vector M, which is in general not normal to the boundary surface S(S², ·). Geodetic Quantities. Although our derivation as presented above is mathematically sound, it may look rather unfamiliar to a reader with a geodetic background. In fact, the quantities above can easily be related to well-known and frequently used quantities from geodesy. We formally approximate derivatives with respect to χ by differences. As a result, we are able to specify the following quantities:


(1) potential anomaly:

Ẇ(x, χ) ≈ ( W(S(ξ, 1), 1) − W(S(ξ, 0), 0) ) / 1 = W(xP) − U(xQ) = ΔW(xP),    (66)

(2) gravity anomaly vector:

ġ(x, χ) ≈ ( g(S(ξ, 1), 1) − g(S(ξ, 0), 0) ) / 1 = w(xP) − u(xQ) = Δw(xP),    (67)

(3) disturbing (or anomaly) potential:

Ẇ(x, χ) ≈ ( W(x, 1) − W(x, 0) ) / 1 = W(x) − U(x) = T(x),    (68)

(4) gravity disturbance vector:

ġ(x, χ) ≈ ( g(x, 1) − g(x, 0) ) / 1 = w(x) − u(x) = δw(x),    (69)

where the last expression of each formula relates to the nomenclature usually used in geodesy (cf. [34]), i.e., δ and Δ are used here for differences. By comparison we are immediately led to ∇T = δw and the harmonicity of T outside ∂G. The settings (66), (67), (68), and (69) also show that T and δw can be understood for all x ∈ R³, whereas the anomalies ΔW and Δw can only be defined for pairs of corresponding points on certain surfaces. It is common practice in geodesy to work on the geoid instead of the real surface of the Earth and on the ellipsoid instead of the telluroid, although mathematically there is no need to do so. In this case, the anomalies are defined with respect to P0 ∈ Σgeoid and Q0 ∈ ∂E. Moreover, a remove-restore or gravity reduction technique is required that removes all masses outside the geoid by condensing the topography, such that the disturbing potential can be treated as being harmonic outside the geoid (see, e.g., [35, 54] for a more detailed discussion of remove-restore procedures).

In order to approximate the vector M, we go back to Eq. (60). Introducing the (local) unit vector in the direction of the isozenithal as ετ and writing derivatives in the direction of the isozenithal as ∂/∂τ, it can be shown (see, e.g., [42, 52] and the references therein) that

M ≈ −( (1/γ) ∂γ/∂τ )⁻¹ ετ.    (70)

Introducing Δg′(x) = −Δw(x)ᵀ ετ as the component of the gravity anomaly vector in the downward direction of the isozenithal, the identity (65) shows that

(1/γ(x)) (∂γ(x)/∂τ) ΔW(x) − Δg′(x) = ( (1/γ(x)) (∂γ(x)/∂τ) T(x) − ∂T(x)/∂τ ) |_{x=xP}.    (71)


In the case of a non-rotating sphere (see, e.g., [52]), we get with |x| = r

u(x) = −GM x/|x|³ = −(GM/r²) εr,    (72)

(1/γ(x)) ∂γ(x)/∂τ = (1/γ(x)) ∂γ(x)/∂r = −2/r,    (73)

where G is the gravitational constant, M is the (mean) mass of the Earth, and εr is the spherical basis vector in the radial direction.

Example. In spherical approximation, P is mapped onto a point with the spherical coordinates (r, θ, λ), i.e.,

xP = ( r cos θ cos λ,  r cos θ sin λ,  r sin θ )ᵀ.    (74)

Hence, (71) reduces to

( ∂T(x)/∂r + (2/r) T(x) ) = ( −Δw(x)ᵀ εr + (2/r) ΔW(x) ) |_{x=xP}.    (75)
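The spherical relation (73), which turns (71) into (75), can be spot-checked numerically. The sketch below is not part of the original exposition; the values of GM and r are merely illustrative, since the identity holds for any r > 0.

```python
# Numerical spot-check of Eq. (73): for the non-rotating spherical model
# gamma(r) = GM/r**2 from (72), the logarithmic radial derivative
# (1/gamma) dgamma/dr equals -2/r.
GM = 3.986004418e14      # m^3/s^2, conventional Earth value (illustrative)
r = 6.371e6              # m, roughly the mean Earth radius (illustrative)

def gamma(s):
    return GM / s**2

h = 1.0                  # central-difference step in metres
log_deriv = (gamma(r + h) - gamma(r - h)) / (2.0 * h) / gamma(r)
print(log_deriv * r)     # close to -2
```

The check is independent of the chosen constant GM; only the power law r**(-2) matters.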

The boundary value problem consisting of the Laplace equation

Δx T(x) = 0,   x ∈ G^c,    (76)

and Eq. (75) is called the simple Molodensky problem. If the telluroid is chosen according to condition (A), ΔW vanishes on the Earth's surface. Let us further assume that r can be chosen to take a value R identical for all points. The direction of the isozenithal is then identical to the direction of the radius vector εr. We also approximate

Δw(xP)ᵀ εr ≈ Δg(xP) = g(xP) − γ(xQ).    (77)

Δg is called the gravity anomaly. It is approximately, but not exactly, the magnitude of the gravity anomaly vector Δw. The gravity anomaly Δg can be computed directly from measurements of g, as γ can be calculated due to the definition of U. With these assumptions, we arrive at the classical (exterior) Stokes boundary value problem

Δx T(x) = 0,   |x| > R,    (78)

∂T(x)/∂r + (2/R) T(x) = −Δg(x),   |x| = R,    (79)


Fig. 3 Different realizations of the geodetic boundary value problem. (Modified illustration following [61], see also [4])

T(x) = C/|x| + O(|x|⁻³),   |x| → ∞.    (80)
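The action of the boundary operator in (79) on a single spherical-harmonic degree can be checked with a few lines of code. This is a small numerical sketch (not taken from the cited references; R is an illustrative radius): for a single-degree potential T_n(r) = (R/r)^(n+1), the operator d/dr + 2/R at r = R yields ((1 − n)/R) T_n(R), i.e., Δg_n = ((n − 1)/R) T_n.

```python
# Degree-wise check of the Stokes boundary condition (79): the operator
# d/dr + 2/R applied to T_n(r) = (R/r)**(n + 1) at r = R gives (1 - n)/R.
# Degree n = 1 is annihilated, so gravity anomalies carry no information
# about the degree-1 part of T (one reason the low degrees are treated
# separately in what follows).
R = 6.371e6   # metres, illustrative

def boundary_operator(n, h=1.0):
    T = lambda r: (R / r) ** (n + 1)
    dT_dr = (T(R + h) - T(R - h)) / (2.0 * h)   # central difference
    return dT_dr + (2.0 / R) * T(R)

for n in (0, 1, 2, 5):
    print(n, boundary_operator(n) * R)   # close to 1 - n
```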

As the derivative with respect to r coincides on the sphere with the normal derivative, this is no longer an oblique-derivative problem (Fig. 3). The Stokes problem links the disturbing potential T to gravity anomalies. On the other hand, we have seen that ∇T = δw. On the sphere S²R(0) around the origin with radius R, the normal component of this relation is given by

∂T/∂r = δwᵀ εr.    (81)

Using an approximation of the form

δw(x)ᵀ εr ≈ δg(x) = g(x) − γ(x),    (82)

thus introducing the gravity disturbance δg, we obtain the exterior Neumann boundary value problem (see, e.g., [35])

Δx T(x) = 0,   |x| > R,    (83)

∂T(x)/∂r = δg(x),   |x| = R,    (84)

T(x) = C/|x| + O(|x|⁻³),   |x| → ∞.    (85)
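The corresponding degree-wise check for the Neumann condition (84) is equally short; again this is an illustrative sketch, not part of the original text.

```python
# For T_n(r) = (R/r)**(n + 1) the radial derivative at r = R is -(n + 1)/R,
# so (in the sign convention of (84)) no degree n >= 0 is annihilated --
# in contrast to the Stokes case, where degree n = 1 is lost.
R = 6.371e6   # metres, illustrative

def radial_derivative(n, h=1.0):
    T = lambda r: (R / r) ** (n + 1)
    return (T(R + h) - T(R - h)) / (2.0 * h)

for n in (0, 1, 2, 5):
    print(n, radial_derivative(n) * R)   # close to -(n + 1), never 0
```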

In the same way as before, δg is approximately the magnitude of the gravity disturbance vector δw. If the approximation (82) is not admissible, which is also the case if we do not use the spherical approximation, the evaluation of the relation ∇T = δw on a boundary surface yields an oblique derivative problem.

A historical remark involving Neumann's problem is due to M.I. Yurkina (cf. [35]): The Neumann problem is a classical problem of potential theory with a long history. It is named after Carl Neumann, who edited the lectures of his father (Franz Neumann) from the 1850s. The exterior spherical Neumann problem also occurs in [39]. It is again found in [37, 53]. In the future, because of GNSS (Global Navigation Satellite Systems such as GPS, GLONASS, BEIDOU, GALILEO), gravity disturbances may be expected to be much more readily available than gravity anomalies. This observation implies a renewed importance of the Neumann problem.

Another quantity that can be used to determine the disturbing potential is the deflection of the vertical, i.e., the difference between the direction of the reference normal vector ν′(x) = −u(x)/γ(x) and the direction of the actual normal vector ν(x) = −w(x)/g(x). Here, as already defined, γ(x) = |u(x)| is the normal gravity associated with the normal potential U and the normal gravity vector u = ∇U, whereas g(x) = |w(x)| is associated with the (actual) gravity potential W and the (actual) gravity vector w = ∇W (see Fig. 4). There are different definitions, based on whether ν is evaluated on the real surface of the Earth or on the geoid, and whether ν′ is evaluated at the real surface of the Earth, the telluroid, or the reference ellipsoid (for a distinction, see, e.g., [38, 66]). Considering a point P0 on the geoid Σgeoid, we can decompose the gradient ∇T of the disturbing potential T = W − U into a normal part and a surface part, defining the latter as

∇S T(xP0) = ∇T(xP0) − ( ∇T(xP0) · ν(xP0) ) ν(xP0),    (86)

as Σgeoid is an equipotential surface of W and, thus, ν(xP0) is normal to Σgeoid. The normal gravity vector u can be decomposed similarly by defining

uS(xP0) = u(xP0) − ( u(xP0) · ν(xP0) ) ν(xP0)
        = −γ(xP0) ν′(xP0) + γ(xP0) ( ν′(xP0) · ν(xP0) ) ν(xP0)
        = −γ(xP0) ( ν′(xP0) − ( ν′(xP0) · ν(xP0) ) ν(xP0) ).    (87)

Fig. 4 Illustration of the definition of the gravity vector w and the normal gravity vector u

Note that w(x) = −g(x) ν(x) by definition, i.e., there are no surface components of w(xP0) at the geoid. Therefore, we get for the surface components

∇S T(xP0) = −uS(xP0)
          = −γ(xP0) ( ν′(xP0) − ( ν′(xP0) · ν(xP0) ) ν(xP0) )
          = −γ(xP0) Θ(xP0)    (88)

with the (vectorial) deflection of the vertical defined as

Θ(xP0) = ν′(xP0) − ( ν′(xP0) · ν(xP0) ) ν(xP0).    (89)

We can also use the Bruns formula [7] in the form

T(xP0) = γ(xQ0) N(xP0)    (90)

with Q0 being the point on the reference ellipsoid associated with P0 and N the geoidal undulation (leading to the approach given by [67]). In spherical approximation (see, e.g., [54]), we can estimate γ(xQ0) by the constant value γ0 to arrive at

∇S N(xP0) = −( γ(xP0)/γ0 ) Θ(xP0).    (91)

Usually, it is also assumed that γ(xP0)/γ0 ≈ 1 and that ν′(xP0) can be approximated by the ellipsoidal normal, i.e., by ν′(xQ0). We refer the reader to [23, 38, 59, 66]. The solution of (91) leads to an integral representation of N. Usually, in physical geodesy, anisotropic scalar representations, called Vening-Meinesz formulas, are in use. An ellipsoidal formulation as proposed in our theoretical approach can be found in [31] (see also the references therein). In our approach we follow a different strategy: Our interest is first to solve the vectorial differential equation. An essential tool is an integral formula involving the Green function of the Beltrami operator (see [23]). To be concrete, a global method of determining the geoidal undulation from deflections of the vertical is established by inversion of the surface gradient operator using the Green function of the Beltrami operator. In turn, an alternative direct way of representing geoidal undulations from deflections of the vertical is deduced by certain tools of surface vector theory, thereby avoiding any occurrence of the Stokes kernel function within the solution process (cf. [34]). Second, a method is proposed for establishing the transition from global to local modeling of geoidal undulations from deflections of the vertical within a scale-dependent approximation procedure.


Indeed, our approach enables us to obtain a locally reflected improvement of the fine structure of the geoid by (surface curl-free) vector wavelets. More explicitly, for detailed local information about the geoid, only a local amount of the vectorial signal, i.e., of the deflections of the vertical, is needed within the multiscale process. Since the flattening in a local approach is negligibly small, a calculation in spherical approximation is canonical. For simplicity, we restrict ourselves to error-free data. A vectorial multiscale signal-to-noise ratio handling noisy data is proposed, e.g., in [20]. Our considerations are based on the pre-work [9, 14, 23, 24, 30].

In spherical nomenclature, any unit vector ξ ∈ S² can be represented in the form

ξ = sin ϕ ε³ + cos ϕ ( cos λ ε¹ + sin λ ε² ),   0 ≤ λ < 2π, −π/2 ≤ ϕ ≤ π/2,    (92)

(λ: spherical longitude, ϕ: spherical latitude), where ε¹, ε², ε³ form the (canonical) orthonormal basis of R³. A moving orthonormal triad on the unit sphere S² is given by

εr = ( cos λ cos ϕ, sin λ cos ϕ, sin ϕ )ᵀ,   ελ = ( −sin λ, cos λ, 0 )ᵀ,   εϕ = ( −cos λ sin ϕ, −sin λ sin ϕ, cos ϕ )ᵀ.    (93)
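As a quick plausibility check (not part of the original exposition), the triad (93) can be verified numerically to be orthonormal, and the representation (92) indeed reproduces εr; the angle values below are arbitrary.

```python
# Numerical check of (92) and (93): the triad eps_r, eps_lam, eps_phi is
# orthonormal, and the unit vector built according to (92) coincides with
# eps_r. lam and phi are arbitrary test angles.
import math

lam, phi = 0.7, -0.4

eps_r   = (math.cos(lam) * math.cos(phi), math.sin(lam) * math.cos(phi), math.sin(phi))
eps_lam = (-math.sin(lam), math.cos(lam), 0.0)
eps_phi = (-math.cos(lam) * math.sin(phi), -math.sin(lam) * math.sin(phi), math.cos(phi))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# xi assembled from the canonical basis eps^1, eps^2, eps^3 as in (92)
xi = (math.cos(phi) * math.cos(lam), math.cos(phi) * math.sin(lam), math.sin(phi))

print([dot(v, v) for v in (eps_r, eps_lam, eps_phi)])   # each equals 1 up to rounding
print(dot(eps_r, eps_lam), dot(eps_r, eps_phi), dot(eps_lam, eps_phi))  # all vanish up to rounding
print(max(abs(a - b) for a, b in zip(xi, eps_r)))       # zero: same expressions
```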

In this nomenclature, (88) and (91) read

(1/R) ∇*ξ T(Rξ) = −(GM/R²) Θ(Rξ),    (94)

(1/R) ∇*ξ N(Rξ) = −Θ(Rξ).    (95)

The surface gradient ∇*ξ in local spherical coordinates is given by

∇*ξ = ελ (1/cos ϕ) ∂/∂λ + εϕ ∂/∂ϕ,    (96)

where −π/2 < ϕ < π/2, G is the gravitational constant, and R is the radius used in spherical approximation. This particular representation of ∇*ξ yields

−(1/R) ∂N(λ, ϕ)/∂ϕ = NSC(λ, ϕ),    (97)

−(1/(R cos ϕ)) ∂N(λ, ϕ)/∂λ = EWC(λ, ϕ)    (98)

with the north-south component NSC(λ, ϕ) and the east-west component EWC(λ, ϕ) of Θ. Once more, the difference between Eq. (95) on the one hand and Eqs. (97) and (98) on the other hand is that the former is an isotropic vectorial


differential equation, whereas the latter is an anisotropic system of two scalar differential equations. The solution of the isotropic vectorial differential equation (and a multiscale approximation) can be found in [23], while the solution of the scalar anisotropic differential equations is due to [67].

For Dirichlet, Neumann, and Stokes problems, there exist solution procedures based on an integral representation of T (see, e.g., [54]). However, evaluating these integrals can be cumbersome. Therefore, it is necessary to consider suitable integration and approximation formulas to derive a numerical procedure that allows a fast and precise determination of the disturbing potential T from given boundary data (numerical realizations by locally supported wavelets have recently been proposed in [15, 24, 25, 28]). This leads us to the consideration of finite pointset methods on spheres and low-pass filtering involving truncated series of spherical harmonics. Even harmonic spline Runge methods can be performed, e.g., after a suitable Kelvin transform (see [4, 8, 15, 21, 22]).

Simplified Geodetic Models. If the normal potential U is determined as above, it has the same monopole component as W, as the mass of the reference ellipsoid is assumed to be equal to the mass of the Earth. As a consequence, the monopole component of T has to vanish, such that we actually have T(x) = O(|x|⁻³) as |x| → ∞. Even more, the traditional approach as realized in standard textbooks of geodesy (see, e.g., [34, 35]) is based on the Pizzetti-oriented concept (see, e.g., [60]), such that the following assumptions may be supposed to be valid:

1. The mass within the reference ellipsoid for establishing the disturbing potential T is equal to the mass of the Earth.
2. The center of the reference ellipsoid coincides with the center of the Earth.
3. The value of the potential on the geoidal surface and the value of the normal potential on the reference ellipsoidal surface are the same.
4. There are no masses outside the geoid (remove-restore principle for masses outside the geoid).
5. The constructive approximation is simplified from an ellipsoidal to a spherical framework for reasons of computational economy.

In the Pizzetti-reflected concept we finally arrive at the following conditions

∫_{S²R(0)} T(x) dS(x) = 0,    ∫_{S²R(0)} T(x) (εᵏ · x) dS(x) = 0,   k = 1, 2, 3,    (99)

resulting in the identities




∫_{S²R(0)} Δg(x) dS(x) = 0,    ∫_{S²R(0)} Δg(x) (εᵏ · x) dS(x) = 0,   k = 1, 2, 3,    (100)

∫_{S²R(0)} δg(x) dS(x) = 0,    ∫_{S²R(0)} δg(x) (εᵏ · x) dS(x) = 0,   k = 1, 2, 3.    (101)

To some extent, the conceptual vagueness of disturbing potential and geoid via boundary problems based on heterogeneously distributed boundary data can be overcome by combining terrestrial and satellite data. With precise positioning by satellites, for example, intercontinental height links can be established between local geopotential realizations. Moreover, nowadays, GNSS can be used to determine the real Earth's surface. This observation led to a reformulation of the geodetic boundary value problem as an exterior oblique derivative problem without any need to introduce an auxiliary surface such as the telluroid or the ellipsoid (cf. [40]). Conventionally, a solution of an oblique derivative problem corresponding to boundary values F on the real Earth's surface ∂G,

∂V/∂λ = F,   λ = ∇V/|∇V|,    (102)

and

inf_{x∈∂G} ( λ(x) · ν(x) ) > 0,    (103)

is represented by a strongly singular (Fredholm) integral equation of the second kind based on a potential of a single layer as an ansatz function. However, this solution technique causes difficulties for numerical realizations. Augustin et al. [5], Freeden and Kersten [17, 18], and Freeden and Gerhards [15] show that the geodetic boundary value problem using the gravitational vector (102) on the known Earth's surface can be deduced from discrete data sets by a constructive Runge theorem in terms of, e.g., mono- and/or multipoles situated on an arbitrary closed surface lying completely inside the Earth.

5 Disturbing Potential from Gravity Anomalies

The traditional approach of physical geodesy (cf., e.g., [34, 54]) starts from the assumption that scalar gravity intensity is available over the whole Earth’s surface. The gravitational part of the gravity potential can then be regarded as a harmonic function outside the Earth’s surface. A classical approach to gravity field modeling was conceived by Stokes [64]. He proposed to reduce the given gravity accelerations from the Earth’s surface to the geoid (see, e.g., [45]), where the geoid is a level


surface, i.e., its potential value is constant. From the reduced gravity, the disturbing potential, i.e., the difference between the actual and the reference potential, can be obtained as the solution of a (third) boundary value problem of potential theory. Molodensky et al. [49] proposed to improve Stokes' solution by "reducing" the gravity anomalies given on the Earth's surface to a "normal level surface" (telluroid). In both cases, the calculation via the associated integral formulas is usually performed in spherical approximation, although concepts of ellipsoidal realization are available (see [31] and the references therein). In fact, Hofmann-Wellenhof and Moritz [35] mention that in standard geodetic approaches, the reference surface is never a sphere in any geometrical sense, but an ellipsoid. As the flattening of the Earth is very small, the ellipsoidal formulas can be expanded into power series in terms of the flattening so that terms containing higher orders can be neglected. "In this way one obtains formulas that are rigorously valid for the sphere, but approximately valid for the actual reference ellipsoid as well" [35]. For practical evaluation, the Stokes convolution integral connecting the Stokes kernel and the gravity anomaly must be replaced by approximate cubature formulas using certain integration weights and knots. These approximate integration formulas are the essential problem in the framework of globally determining the disturbing potential and, subsequently, the geoidal height following Bruns's concept (see [7]). In fact, we are confronted with the following dilemma: On the one hand, Weyl's law of equidistribution (cf. [72]) tells us that numerical integration and equidistribution of the nodal points are mathematically equivalent (see also [16]). This law holds true for any reference surface, i.e., for the telluroid and the ellipsoid as well as for the sphere.
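The equidistribution argument can be illustrated with a few lines of code: a point set that is asymptotically equidistributed on the sphere integrates a smooth function with an equal-weight rule, and the error shrinks as the points densify. The (Fibonacci-type) construction and the test function below are illustrative choices, not taken from the cited references.

```python
# Equal-weight integration on the sphere with an (approximately)
# equidistributed Fibonacci-type point set.
import math

def fibonacci_sphere(n):
    golden = math.pi * (3.0 - math.sqrt(5.0))   # golden angle
    pts = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n           # midpoints, uniform in z
        r = math.sqrt(max(0.0, 1.0 - z * z))
        lam = golden * i
        pts.append((r * math.cos(lam), r * math.sin(lam), z))
    return pts

def f(p):   # smooth test function with exact mean value 0 over the sphere
    return 0.5 * (3.0 * p[2] ** 2 - 1.0) + p[0] * p[1]

for n in (100, 1000, 10000):
    mean = sum(f(p) for p in fibonacci_sphere(n)) / n
    print(n, mean)   # tends to the true mean 0 as n grows
```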
In order to achieve better and better accuracy in approximate integration procedures, we thus need globally dense, equidistributed data sets over the whole reference surface. On the other hand, even nowadays, observations of sufficient density and quality are only available for certain parts of the Earth's surface, and there are large areas, particularly at sea, where no suitable data are given at all. In fact, the terrestrial gravity data coverage now and in the foreseeable future is far from satisfactory and totally inadequate for the purpose of high-precision geoid determination. As a consequence, Stokes' integral formula and its improvements based on Molodensky's concept cannot be applied on a global basis, neither in an ellipsoidal nor in a spherical framework. We have to take the specific heterogeneous data situation into account. A mathematical way out is an adequate multiscale method providing a "zooming-in" approximation in adaptation to the data distribution and density. In this contribution our particular goal is a local high-resolution gravitational model reflecting the available data as far as possible. In addition, since the flattening in a local approach is negligibly small, a calculation in spherical approximation is canonical. For simplicity, we restrict ourselves to error-free data. A multiscale signal-to-noise ratio handling noisy data is proposed, e.g., in [19]. Our considerations are based on the work [14, 15, 23, 25, 26]. The illustrations are essentially taken from [25] and the Ph.D. thesis [73]. We begin with a recapitulation of the global Stokes approach in spherical approximation. Let S²R be the sphere with radius R around the origin, and let the gravity anomaly Δg ∈ C⁽⁰⁾(S²R) with


∫_{S²R} Δg(x) dS(x) = 0    (104)

and

∫_{S²R} Δg(x) (εᵏ · x) dS(x) = 0,   k = 1, 2, 3,    (105)

be given. Let B³R be the ball with radius R around the origin and (B³R)^ext the outer space of this ball. Then the disturbing potential T : (B³R)^ext → R is the unique solution of the exterior Stokes boundary-value problem (see also [11, 25, 73]):

(i) T is continuously differentiable in the closure of (B³R)^ext and twice continuously differentiable in (B³R)^ext, i.e., T ∈ C⁽¹⁾((B³R)^ext ∪ S²R) ∩ C⁽²⁾((B³R)^ext),
(ii) T is harmonic in (B³R)^ext, i.e., ΔT = 0 in (B³R)^ext,
(iii) T is regular at infinity,
(iv) ∫_{S²R} T(y) H^R_{−n−1,k}(y) dS(y) = 0 for n = 0, 1 and k = 1, . . . , 2n + 1,
(v) −(x/|x|) · ∇x T(x) − (2/|x|) T(x) = Δg(x),   x ∈ S²R.

The functions H^R_{−n−1,k} denote outer harmonics (see, e.g., [15] for a precise definition). T is determined by the Stokes integral formula

T(Rξ) = (1/(4πR)) ∫_{S²R} St(Rξ, Rη) Δg(Rη) dS(Rη) = (R/(4π)) ∫_{S²} St(ξ, η) Δg(Rη) dS(η),    (106)

with the Stokes kernel St given by

St(ξ, η) = 1 − 5 ξ·η − 6 (S(ξ·η))⁻¹ + S(ξ·η) − 3 ξ·η ln( 1/S(ξ·η) + 1/(S(ξ·η))² ),   ξ, η ∈ S²,    (107)

where we have used the abbreviation

S(ξ·η) = √2 / √(1 − ξ·η),   1 − ξ·η ≠ 0.    (108)
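The form (107)/(108) can be cross-checked against the classical closed form of the Stokes function found in standard textbooks: substituting s = sin(ψ/2) = 1/S(cos ψ) must reproduce it identically. The sketch below is illustrative; the test angles are arbitrary.

```python
# Cross-check of the Stokes kernel (107)/(108) against the classical
# textbook form St(psi) = 1/s + 1 - 6s - 5cos(psi) - 3cos(psi) ln(s + s^2),
# with s = sin(psi/2).
import math

def S(t):
    return math.sqrt(2.0) / math.sqrt(1.0 - t)

def stokes_via_107(t):
    s = S(t)
    return 1.0 - 5.0 * t - 6.0 / s + s - 3.0 * t * math.log(1.0 / s + 1.0 / s**2)

def stokes_classical(psi):
    s = math.sin(0.5 * psi)
    return (1.0 / s + 1.0 - 6.0 * s - 5.0 * math.cos(psi)
            - 3.0 * math.cos(psi) * math.log(s + s * s))

for psi in (0.1, 0.5, 1.0, 2.0, 3.0):
    print(psi, stokes_via_107(math.cos(psi)), stokes_classical(psi))  # pairs agree
```

The agreement follows from 1 − cos ψ = 2 sin²(ψ/2), so that S(cos ψ) = 1/sin(ψ/2).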

To regularize the improper integral (106), we replace the zonal kernel S by the space-regularized zonal kernel (see, e.g., [23, 25, 73])

S^ρ(t) = √2/√(1 − t)                       for −1 ≤ t < 1 − ρ²/(2R²),
       = (R/ρ) ( 3 − (2R²/ρ²)(1 − t) )     for 1 − ρ²/(2R²) ≤ t ≤ 1.    (109)

The kernels S and S^ρ are shown in Fig. 5.


Fig. 5 The functions S and S ρ

Clearly, the function S^ρ is continuously differentiable on the interval [−1, 1], and we have (see [25, 73])

(S^ρ)′(t) = 1/( √2 (1 − t)^{3/2} )   for −1 ≤ t < 1 − ρ²/(2R²),
          = 2R³/ρ³                   for 1 − ρ²/(2R²) ≤ t ≤ 1.    (110)

Furthermore, the functions S and S ρ are monotonically increasing on the interval [−1, 1), such that S(t) ≥ S ρ (t) ≥ S(−1) = S ρ (−1) = 1 holds true on the interval [−1, 1). Considering the difference between the kernel S and its linearly regularized version S ρ , we find

S(t) − S^ρ(t) = 0                                                 for −1 ≤ t < 1 − ρ²/(2R²),
             = √2/√(1 − t) − (R/ρ) ( 3 − (2R²/ρ²)(1 − t) )        for 1 − ρ²/(2R²) ≤ t < 1.    (111)
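The stated properties of the regularization (109) are easy to confirm numerically. The sketch below (with illustrative values of R and ρ) checks that S^ρ matches S in value at the joint t* = 1 − ρ²/(2R²), where both branches equal 2R/ρ (by the Taylor construction the slopes match there as well, cf. (110)), and that S(t) ≥ S^ρ(t) ≥ 1 on [−1, 1).

```python
# The regularized kernel (109) and the inequalities quoted in the text.
import math

R = 1.0
rho = 0.5                                   # any value in (0, 2R]

def S(t):
    return math.sqrt(2.0) / math.sqrt(1.0 - t)

def S_rho(t):
    t_star = 1.0 - rho**2 / (2.0 * R**2)
    if t < t_star:
        return S(t)
    return (R / rho) * (3.0 - (2.0 * R**2 / rho**2) * (1.0 - t))

t_star = 1.0 - rho**2 / (2.0 * R**2)
print(S(t_star), S_rho(t_star), 2.0 * R / rho)     # all three coincide (up to rounding)

samples = [-1.0 + 1.9 * k / 100.0 for k in range(101)]
print(all(S(t) >= S_rho(t) >= 1.0 for t in samples))   # True
```

The ordering S ≥ S^ρ holds because S is convex on [−1, 1) and the linear branch of (109) is its tangent at t*.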

It can be shown (cf. [23]) that for F ∈ C⁽⁰⁾(S²) and S^ρ defined by (109), we have

lim_{ρ→0+} sup_{ξ∈S²} | ∫_{S²} S(ξ·η) F(η) dS(η) − ∫_{S²} S^ρ(ξ·η) F(η) dS(η) | = 0.    (112)

To obtain another useful convergence result, we observe that for all t ∈ [−1, 1) with 1 − ρ²/(2R²) ≤ t


ln( 1/S(t) + 1/(S(t))² ) − ln( 1/S^ρ(t) + 1/(S^ρ(t))² )
    = ( ln(1 + S(t)) − ln(1 + S^ρ(t)) ) − 2 ( ln(S(t)) − ln(S^ρ(t)) )    (113)

and, thus,

| ln( 1/S(t) + 1/(S(t))² ) − ln( 1/S^ρ(t) + 1/(S^ρ(t))² ) | = O( |S(t) − S^ρ(t)| ).    (114)

This leads to the following result: Let S be the singular kernel given by S(t) = √2/√(1 − t), and let S^ρ, ρ ∈ (0, 2R], be the corresponding (Taylor-)linearized regularized kernel defined by (109). Then

lim_{ρ→0+} ∫_{−1}^{1} | ln(1 + S(t)) − ln(1 + S^ρ(t)) | dt = 0,    (115)

lim_{ρ→0+} ∫_{−1}^{1} | ln( 1/S(t) + 1/(S(t))² ) − ln( 1/S^ρ(t) + 1/(S^ρ(t))² ) | dt = 0,    (116)

lim_{ρ→0+} ∫_{−1}^{1} | (S(t))² − (S^ρ(t))² | (1 − t²) dt = 0.    (117)

The regularization (109) leads to the following regularized global representation of the disturbing potential corresponding to gravity anomalies as boundary data (see [25]):

T^ρ(Rξ) = (R/(4π)) ∫_{S²} St^ρ(ξ, η) Δg(Rη) dS(η)    (118)

with (see also Fig. 6)

St^ρ(ξ, η) = 1 − 5 ξ·η − 6 (S(ξ·η))⁻¹ + S^ρ(ξ·η) − 3 ξ·η ln( 1/S^ρ(ξ·η) + 1/(S^ρ(ξ·η))² )

           = 1 − 5 ξ·η − 6 (S(ξ·η))⁻¹ + (R/ρ) ( 3 − (2R²/ρ²)(1 − ξ·η) )
                 − 3 ξ·η ln( 1 + (R/ρ) ( 3 − (2R²/ρ²)(1 − ξ·η) ) )
                 + 6 ξ·η ln( (R/ρ) ( 3 − (2R²/ρ²)(1 − ξ·η) ) )              for 0 ≤ 1 − ξ·η ≤ ρ²/(2R²),

           = 1 − 5 ξ·η − 6 √(1 − ξ·η)/√2 + √2/√(1 − ξ·η)
                 − 3 ξ·η ln( √(1 − ξ·η)/√2 + (1 − ξ·η)/2 )                  for ρ²/(2R²) < 1 − ξ·η ≤ 2,    (119)

for ξ, η ∈ S² and ρ ∈ (0, 2R]. Here, we have made use of Eq. (113).

Fig. 6 The regularized Stokes kernel ϑ ↦ St^ρ(cos ϑ) for several ρ

With the results above, we finally obtain the following convergence result: Let T^ρ, ρ ∈ (0, 2R], represent the regularization of the disturbing potential T as given in Eq. (118). Then

lim_{ρ→0+} sup_{ξ∈S²} | T(Rξ) − T^ρ(Rξ) | = 0.    (120)
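The piecewise expansion of the regularized Stokes kernel (119) can be checked for internal consistency: it must agree with the compact first line, and outside the cap 1 − ξ·η ≤ ρ²/(2R²) it must reduce to the original Stokes kernel (107). The sketch below uses illustrative values of R, ρ, and the sample points.

```python
# Consistency check of (119): compact form vs. explicit piecewise expansion.
import math

R = 1.0
rho = 0.5

def S(t):
    return math.sqrt(2.0) / math.sqrt(1.0 - t)

def S_rho(t):   # regularized zonal kernel (109)
    if 1.0 - t > rho**2 / (2.0 * R**2):
        return S(t)
    return (R / rho) * (3.0 - (2.0 * R**2 / rho**2) * (1.0 - t))

def St_rho_compact(t):
    s = S_rho(t)
    return 1.0 - 5.0 * t - 6.0 / S(t) + s - 3.0 * t * math.log(1.0 / s + 1.0 / s**2)

def St_rho_piecewise(t):
    if 1.0 - t <= rho**2 / (2.0 * R**2):         # inside the cap: linearized part
        L = (R / rho) * (3.0 - (2.0 * R**2 / rho**2) * (1.0 - t))
        return (1.0 - 5.0 * t - 6.0 / S(t) + L
                - 3.0 * t * math.log(1.0 + L) + 6.0 * t * math.log(L))
    s = S(t)                                     # outside: original Stokes kernel (107)
    return 1.0 - 5.0 * t - 6.0 / s + s - 3.0 * t * math.log(1.0 / s + 1.0 / s**2)

for t in (-0.8, 0.0, 0.8, 0.9, 0.95, 0.999):
    print(t, St_rho_compact(t), St_rho_piecewise(t))  # columns agree
```

The inner branch uses ln(1/L + 1/L²) = ln(1 + L) − 2 ln L, which is exactly the splitting (113).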

The linear space regularization technique enables us to formulate multiscale solutions for the disturbing potential from gravity anomalies. For numerical applications, we have to go over to scale-discretized approximations of the solution to the boundary-value problem. For that purpose, we choose a monotonically decreasing sequence {ρj}_{j∈N0} such that

lim_{j→∞} ρj = 0,   ρ0 = 2R.    (121)

A particularly important example, which we use in our numerical implementations below, is the dyadic sequence with

ρj = 2^{1−j} R,   j ∈ N,   ρ0 = 2R.    (122)

Obviously, 2ρj +1 = ρj , j ∈ N0 , holds true for two consecutive elements of the sequence. In correspondence to the sequence {ρj }j ∈N0 , a sequence {St ρj }j ∈N0 of discrete versions of the regularized Stokes kernels (119), so-called Stokes scaling functions, is available. Figure 6 shows a graphical illustration of the regularized Stokes kernels for different scales j . The regularized Stokes wavelets, forming the sequence {W St ρj }j ∈N0 , are understood to be the difference of two consecutive regularized Stokes scaling functions,


respectively,

W St^ρj(ξ, η) = St^ρ_{j+1}(ξ, η) − St^ρj(ξ, η),   j ∈ N0.    (123)

These wavelets possess the numerically nice property of a local support. More specifically, the function η ↦ W St^ρj(ξ, η), η ∈ S², vanishes everywhere outside the spherical cap Γ_{ρj²/(2R²)}(ξ). Explicitly written out, we have

W St^ρj(ξ, η)
  = S^ρ_{j+1}(ξ·η) − 3 ξ·η ln( 1/S^ρ_{j+1}(ξ·η) + 1/(S^ρ_{j+1}(ξ·η))² )
      − S^ρj(ξ·η) + 3 ξ·η ln( 1/S^ρj(ξ·η) + 1/(S^ρj(ξ·η))² )        for 0 ≤ 1 − ξ·η ≤ ρ_{j+1}²/(2R²),
  = S(ξ·η) − 3 ξ·η ln( 1/S(ξ·η) + 1/(S(ξ·η))² )
      − S^ρj(ξ·η) + 3 ξ·η ln( 1/S^ρj(ξ·η) + 1/(S^ρj(ξ·η))² )        for ρ_{j+1}²/(2R²) < 1 − ξ·η ≤ ρj²/(2R²),
  = 0                                                                for ρj²/(2R²) < 1 − ξ·η ≤ 2.    (124)

Let J0, J ∈ N0 with J0 ≤ J be arbitrary scales. Suppose that St^ρJ is the regularized Stokes scaling function at scale J. Furthermore, let W St^ρj, j = J0, . . . , J − 1, be the regularized Stokes wavelets given by (124). Then an easy calculation shows that

St^ρJ(ξ, η) = St^ρ_{J0}(ξ, η) + Σ_{j=J0}^{J−1} W St^ρj(ξ, η).    (125)
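Both the telescoping identity (125) and the local support of the wavelets (123)/(124) can be verified numerically. The sketch below uses the dyadic sequence (122) with an illustrative radius R = 1 and the compact form of (119).

```python
# The scale sequence (122), the wavelets (123), and the telescoping sum (125).
import math

R = 1.0
rho = lambda j: 2.0 ** (1 - j) * R   # Eq. (122); note rho(0) = 2R

def S(t):
    return math.sqrt(2.0) / math.sqrt(1.0 - t)

def S_reg(t, r):
    if 1.0 - t > r**2 / (2.0 * R**2):
        return S(t)
    return (R / r) * (3.0 - (2.0 * R**2 / r**2) * (1.0 - t))

def St_reg(t, r):   # regularized Stokes kernel (119), compact form
    s = S_reg(t, r)
    return 1.0 - 5.0 * t - 6.0 / S(t) + s - 3.0 * t * math.log(1.0 / s + 1.0 / s**2)

def W_St(t, j):     # Stokes wavelet (123)
    return St_reg(t, rho(j + 1)) - St_reg(t, rho(j))

J = 6
t = 0.97
telescoped = St_reg(t, rho(0)) + sum(W_St(t, j) for j in range(J))
print(telescoped, St_reg(t, rho(J)))   # agree up to rounding, cf. (125)

j = 3                                   # support check: cap radius rho_j^2/(2R^2)
outside = 1.0 - rho(j) ** 2 / (2.0 * R ** 2) - 1e-6
print(W_St(outside, j))                 # 0.0: the wavelet vanishes outside its cap
```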

17 From Newton's Law of Gravitation to Multiscale Geoidal . . .

The local support of the Stokes wavelets within the framework of Eq. (125) should be studied in more detail: Following the sequence (122), we start with a globally supported scaling kernel St^{ρ_0} = St^{2R}. Then we add more and more wavelet kernels W St^{ρ_j}, j = 0, …, J, to achieve the required scaling kernel St^{ρ_J}. It is of particular importance that η → W St^{ρ_j}(ξ, η), ξ ∈ S² fixed, are ξ-zonal functions and possess spherical caps as local supports, as already explained above. Clearly, the support of the wavelets W St^{ρ_j} becomes more and more localized for increasing scales j. In conclusion, a calculation of an integral representation for the disturbing potential T starts with a global trend approximation using the scaling kernel at scale j = 0 (of course, this requires data on the whole sphere, but the data can be rather sparsely distributed since they only serve as a trend approximation). Step by step, we are able to refine this approximation by use of wavelets. The increasing spatial localization of the wavelets successively allows a better spatial resolution of the disturbing potential T. Additionally, the local supports of the wavelets have a computational advantage since the integration is reduced from the entire sphere to smaller and smaller spherical caps. Consequently, the presented numerical technique becomes capable of handling heterogeneously distributed data sets in adaptation to their mutual spacing. All in all, keeping the space-localizing properties of the regularized Stokes scaling and wavelet functions in mind, we are able to establish an approximation of the solution of the disturbing potential T from gravity anomalies Δg in the form of a zooming-in multiscale method. A low-pass filtered version of the disturbing potential T at the scale j in an integral representation over the unit sphere S² is given by
$$T^{\rho_j}(R\xi)=\frac{R}{4\pi}\int_{S^2}\Delta g(R\eta)\,St^{\rho_j}(\xi,\eta)\,dS(\eta),\qquad \xi\in S^2, \tag{126}$$

while the j-scale band-pass filtered version of T leads to the integral representation
$$WT^{\rho_j}(R\xi)=\frac{R}{4\pi}\int_{\Gamma_{\rho_j^{2}/(2R^{2})}(\xi)}\Delta g(R\eta)\,W St^{\rho_j}(\xi,\eta)\,dS(\eta),\qquad \xi\in S^2, \tag{127}$$
where Γ_{ρ_j²/(2R²)}(ξ) is a spherical cap.

In consequence, we obtain the following limit relation: Let T^{ρ_{J_0}} be the regularized version of the disturbing potential T at some arbitrary initial scale J₀ as given in (126), and let W T^{ρ_j}, j = J₀, J₀ + 1, …, be given by (127). Then, the following reconstruction formula holds true:
$$\lim_{J\to\infty}\,\sup_{\xi\in S^2}\Big|T(R\xi)-\Big(T^{\rho_{J_0}}(R\xi)+\sum_{j=J_0}^{J-1}WT^{\rho_j}(R\xi)\Big)\Big|=0. \tag{128}$$

The multiscale procedure (wavelet reconstruction) as developed here can be illustrated by the following scheme:
$$T^{\rho_{J_0}}\;\xrightarrow{\;+\;WT^{\rho_{J_0}}\;}\;T^{\rho_{J_0+1}}\;\xrightarrow{\;+\;WT^{\rho_{J_0+1}}\;}\;T^{\rho_{J_0+2}}\;\longrightarrow\;\cdots \tag{129}$$

Consequently, a tree algorithm based on regularization in the space domain has been realized for determining the disturbing potential T from locally available data sets of gravity anomalies Δg. An example is shown in Fig. 7. The fully discretized multiscale approximations have the following representations

$$T^{\rho_J}(R\xi)\approx\frac{R}{4\pi}\sum_{k=1}^{N_J}w_k^{N_J}\,\Delta g\big(R\eta_k^{N_J}\big)\,St^{\rho_J}\big(\xi,\eta_k^{N_J}\big), \tag{130}$$


W. Freeden and H. Nutz

Fig. 7 Global multiscale approximation of the gravity disturbance potential from gravity anomalies with Stokes scaling and wavelet functions for 90,000 data points of the EGM96-model. (a) Low-pass filtered gravity disturbance potential for scale 4 with St^{ρ_4}. (b) Band-pass filtered gravity disturbance potential for scale 4 with W St^{ρ_4}. (c) Band-pass filtered gravity disturbance potential for scale 5 with W St^{ρ_5}. (d) Band-pass filtered gravity disturbance potential for scale 6 with W St^{ρ_6}. (e) Band-pass filtered gravity disturbance potential for scale 7 with W St^{ρ_7}. (f) Low-pass filtered gravity disturbance potential for scale 8 with St^{ρ_8}

$$WT^{\rho_j}(R\xi)\approx\frac{R}{4\pi}\sum_{k=1}^{N_j}w_k^{N_j}\,\Delta g\big(R\eta_k^{N_j}\big)\,W St^{\rho_j}\big(\xi,\eta_k^{N_j}\big), \tag{131}$$
ξ ∈ S², where η_k^{N_j} are the integration knots and w_k^{N_j} the integration weights. Whereas the sum in (130) has to be extended over the whole sphere S², the summation in (131) only has to be computed for the local supports of the wavelets (note that the symbol ≈ means that the error between the right- and the left-hand side can be neglected). Figures 8, 9, and 10 show that the method presented here solves a dilemma in geodesy: Common global solution methods always need dense and globally

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

a

180° W

90° W



90° E

1087

180° E

75° N 60° N 45° N 30° N 15° N 0° 15° S 30° S 45° S 60° S 75° S

–800

–600

–400

–200

0

200

400

600

[m2/s2]

b

2

120° W 90° W

60° W

30° W

2

[m /s ]

c

2

120° W 90° W

80 30° N

0° 15° S 30° S 45° S 60° S

30° W

2

[m /s ] 80

30° N 60

15° N

60° W

40 20 0 –20 –40 –60

60 15° N 0° 15° S 30° S 45° S 60° S

40 20 0 –20 –40 –60

Fig. 8 Band-pass filtered versions W T^{ρ_j} of the disturbing potential for the subregion in (a) for scales j = 4, 5 calculated from different numbers of data points (note that the number of data points for both data sets extrapolated to the whole sphere S²_R(0) would correspond to approximately 1,000,000 points). (a) Low-pass filtered version T^{ρ_4} of the disturbing potential T calculated from 490,000 data points distributed over the whole sphere S²_R(0). (b) Band-pass filtered version of T at scale 4 from 281,428 data points distributed within the outer-bordered subregion in (a). (c) Band-pass filtered version of T at scale 5 from 226,800 data points distributed within the inner-bordered subregion in (a)

equidistributed data sets over the whole sphere S²_R(0) to obtain a better approximation quality (according to Weyl's Law of Equidistribution; see, for example, [16]). However, the reality is quite different. On the one hand, we have large gaps in data sets, particularly at sea. On the other hand, there are some regions where the accuracy and density of available data sets are quite remarkable. The solution offered by our wavelet method is to start with a coarse, global approximation, e.g., of the



Fig. 9 Band-pass filtered versions W T^{ρ_j} of the disturbing potential for subregions in (a) for scales j = 6, 7 calculated from different numbers of data points (note that the number of data points for both data sets extrapolated to the whole sphere S²_R(0) would correspond to approximately 4,000,000 points). (a) Low-pass filtered version T^{ρ_6} of the disturbing potential T of the inner-bordered subregion in Fig. 8a computed by the sum of T^{ρ_4} (Fig. 8a), W T^{ρ_4} (Fig. 8b), and W T^{ρ_5} (Fig. 8c) in this subregion. (b) Band-pass filtered version of T at scale 6 from 71,253 data points distributed within the outer-bordered subregion in (a). (c) Band-pass filtered version of T at scale 7 from 63,190 data points distributed within the inner-bordered subregion in (a)

disturbing potential using a scaling function of scale J₀, and to add local refinements in the form of band-pass filtered versions using Stokes wavelets. This is possible because these wavelets have compact (local) supports. This procedure allows the incorporation of heterogeneous data sets in a way that locally improves the approximation of the disturbing potential in spite of non-equidistributed data sets.
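The zooming-in strategy behind the discretized sums (130) and (131) can be sketched numerically. The code below is a minimal illustration of the structure, not the method of the text: it replaces the regularized Stokes kernels by a hypothetical smoothed-Haar-type scaling kernel with analogous support behavior (support 1 − ξ·η ≤ ρ_j with ρ_j = 2^{1−j}), uses a crude equal-weight random quadrature, and checks that the band-pass sums, each restricted to its spherical cap, telescope exactly to the scale-J low-pass sum.

```python
import math
import random

random.seed(1)

# Toy stand-in kernels (hypothetical, not the Stokes kernels of the text):
# a smoothed-Haar-type zonal scaling kernel supported on 1 - t <= rho_j.
def rho(j):
    return 2.0 ** (1 - j)

def scaling_kernel(t, j, k=3):
    tau = rho(j)
    if 1.0 - t > tau:
        return 0.0
    return (k + 1) / (2 * math.pi * tau ** (k + 1)) * (t - (1 - tau)) ** k

def wavelet_kernel(t, j):
    # discretized scaling equation: W_j = K_{j+1} - K_j
    return scaling_kernel(t, j + 1) - scaling_kernel(t, j)

# crude equal-weight quadrature nodes on the unit sphere
nodes = []
for _ in range(2000):
    z = random.uniform(-1, 1)
    lam = random.uniform(0, 2 * math.pi)
    s = math.sqrt(1 - z * z)
    nodes.append((s * math.cos(lam), s * math.sin(lam), z))
w_quad = 4 * math.pi / len(nodes)

def anomaly(eta):          # synthetic "gravity anomaly" data
    return 1.0 + eta[2] + 0.5 * eta[0] * eta[1]

xi = (0.0, 0.0, 1.0)       # evaluation point (north pole)
dot = lambda a, b: sum(x * y for x, y in zip(a, b))

def lowpass(J):            # discretized analogue of (130)
    return w_quad * sum(anomaly(e) * scaling_kernel(dot(xi, e), J)
                        for e in nodes)

def bandpass(j):           # discretized analogue of (131): only nodes in the
    return w_quad * sum(   # cap 1 - xi.eta <= rho_j contribute
        anomaly(e) * wavelet_kernel(dot(xi, e), j)
        for e in nodes if 1 - dot(xi, e) <= rho(j))

# tree algorithm: coarse global trend plus band-pass refinements
J = 5
recon = lowpass(0) + sum(bandpass(j) for j in range(J))
print(abs(recon - lowpass(J)))   # telescopes to the scale-J low-pass
```

The point of the sketch is structural: only the scale-0 sum touches the whole sphere, while every refinement sums over a shrinking cap, which is exactly what makes heterogeneously distributed data usable.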

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

a 15° N

90° W

70° W

1089 [m2/s2] 400 300 200 100



0 –100 –200 –300

15° S

b

95° W 5° N

90° W

2

2

85° W [m /s ] 5

c

95° W 5° N

90° W

2 2 85° W [m /s ] 5

4

4 2.5° N

3

2.5° N

2

2

1

1 0°

0



–2

2.5° S

–4

–2 –3

–3 5° S

0 –1

–1 2.5° S

3

5° S

–4

Fig. 10 Band-pass filtered versions W T^{ρ_j} of the disturbing potential for subregions in (a) for scales j = 8, 9 calculated from different numbers of data points (note that the number of data points for both data sets extrapolated to the whole sphere S²_R(0) would correspond to approximately 49,000,000 points). (a) Low-pass filtered version T^{ρ_8} of the disturbing potential T of the inner-bordered subregion in Fig. 9a computed by the sum of T^{ρ_6} (Fig. 9a), W T^{ρ_6} (Fig. 9b), and W T^{ρ_7} (Fig. 9c) in this subregion. (b) Band-pass filtered version of T at scale 8 from 90,951 data points distributed within the outer-bordered subregion in (a). (c) Band-pass filtered version of T at scale 9 from 85,491 data points distributed within the inner-bordered subregion in (a)

6 Disturbing Potential from Deflections of the Vertical

As we already know, the force of gravity, i.e., the resultant of gravitational and centrifugal force, provides a directional structure to the space above the Earth’s


surface. It is tangential to the vertical plumb lines and perpendicular to all (level) equipotential surfaces. Any water surface at rest is part of a level surface. Level (equipotential) surfaces are ideal reference surfaces, for example, for heights. The geoid is defined as that level surface of the gravity field which best fits the mean sea level. Gravity vectors can be measured by absolute or relative gravimeters. The highest-accuracy relative gravity measurements are conducted at the Earth's surface. Gravity data are converted into gravity anomalies by subtracting a corresponding reference potential derived from a simple gravity field model associated with an ellipsoidal surface (remove–restore principle; see, e.g., [35]). It should be pointed out that the distribution of Earth's gravity data on a global scale is far from homogeneous, with large gaps, in particular over oceans but also over land. Thus terrestrial gravity data coverage is far from satisfactory and totally inadequate for global purposes of high-precision geoidal determination. Observations of the gravity vector are converted into so-called deflections of the vertical by subtracting a corresponding reference direction derived from a simple gravity field model associated with an ellipsoidal surface. Deflections of the vertical are tangential fields of the disturbing potential. An important tool of our considerations is the Green function on the unit sphere S² ⊂ R³ with respect to the Beltrami operator Δ*,
$$\Delta^*_\xi=\nabla^*_\xi\cdot\nabla^*_\xi, \tag{132}$$

given by (see [12])
$$G(\Delta^*;\xi,\eta)=\sum_{n=1}^{\infty}\frac{2n+1}{4\pi}\,\frac{1}{n(n+1)}\,P_n(\xi\cdot\eta), \tag{133}$$
where P_n is the Legendre polynomial of degree n. From the bilinear expansion we are able (cf. [12]) to deduce that
$$G(\Delta^*;\xi,\eta)=\frac{1}{4\pi}\ln(1-\xi\cdot\eta)+\frac{1}{4\pi}-\frac{1}{4\pi}\ln(2) \tag{134}$$
is an explicit representation of Green's function with respect to the Beltrami operator Δ*. Note that, in the distributional sense, we have
$$\Delta^*_\xi\,G(\Delta^*;\xi,\eta)=\delta(\xi\cdot\eta)-\frac{1}{4\pi}, \tag{135}$$
where the Dirac distribution formally reads
$$\delta(\xi\cdot\eta)=\sum_{n=0}^{\infty}\frac{2n+1}{4\pi}\,P_n(\xi\cdot\eta). \tag{136}$$

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

1091

An easy calculation using
$$\nabla^*_\xi F(\xi\cdot\eta)=F'(\xi\cdot\eta)\,\big(\eta-(\xi\cdot\eta)\xi\big) \tag{137}$$
shows that applying the differential operator ∇* to the Green function with respect to the Beltrami operator yields
$$\nabla^*_\xi\,G(\Delta^*;\xi,\eta)=-\frac{1}{4\pi}\,\frac{\eta-(\xi\cdot\eta)\xi}{1-\xi\cdot\eta},\qquad -1\le\xi\cdot\eta<1. \tag{138}$$

From [24, 26] we know the fundamental theorem for ∇* on the unit sphere S²:

Theorem 15. Suppose that F is of class C^{(1)}(S²). Then, for all ξ ∈ S², we have
$$F(\xi)=\frac{1}{4\pi}\int_{S^2}F(\eta)\,dS(\eta)-\int_{S^2}\nabla^*_\eta F(\eta)\cdot\nabla^*_\eta G(\Delta^*;\xi,\eta)\,dS(\eta). \tag{139}$$

Plugging (133) into (139) yields
$$F(\xi)=\frac{1}{4\pi}\int_{S^2}F(\eta)\,dS(\eta)-\sum_{n=1}^{\infty}\frac{2n+1}{4\pi\,n(n+1)}\int_{S^2}P_n(\xi\cdot\eta)\,\big(\eta-(\xi\cdot\eta)\xi\big)\cdot\nabla^*_\eta F(\eta)\,dS(\eta). \tag{140}$$
The aforementioned result can be used to establish the remainder term for an approximate integration formula on the sphere S² by comparing F(ξ) at each point ξ ∈ S² with the mean integral value of F,
$$F^\wedge(0,1)=\frac{1}{4\pi}\int_{S^2}F(\eta)\,dS(\eta). \tag{141}$$

Combining Eqs. (139) and (138) we obtain the differential equation for ∇* on S²: Let f : S² → R³ be a continuously differentiable vector field on S² with ξ·f(ξ) = 0, L*_ξ·f(ξ) = 0, ξ ∈ S². Then
$$F(\xi)=\frac{1}{4\pi}\int_{S^2}\frac{1}{1-\xi\cdot\eta}\,\big(\xi-(\xi\cdot\eta)\eta\big)\cdot f(\eta)\,dS(\eta) \tag{142}$$
is the uniquely determined solution of the differential equation
$$\nabla^*_\xi F(\xi)=f(\xi),\qquad \xi\in S^2, \tag{143}$$
satisfying
$$F^\wedge(0,1)=\frac{1}{4\pi}\int_{S^2}F(\xi)\,dS(\xi)=0. \tag{144}$$

In what follows, for real values ρ > 0, we consider – as an auxiliary function – the so-called ρ-regularized Green kernel function with respect to Δ*:
$$G^\rho(\Delta^*;\xi,\eta)=\begin{cases}\dfrac{1}{4\pi}\ln(1-\xi\cdot\eta)+\dfrac{1}{4\pi}\big(1-\ln(2)\big), & 1-\xi\cdot\eta>\rho,\\[1.5ex]\dfrac{1-\xi\cdot\eta}{4\pi\rho}+\dfrac{1}{4\pi}\big(\ln(\rho)-\ln(2)\big), & 1-\xi\cdot\eta\le\rho.\end{cases} \tag{145}$$
In analogy to the regularization procedure described in the preceding Sect. 5, for F ∈ C^{(0)}(S²) we introduce the "regularized potential"
$$P^\rho(F)(\xi)=\int_{S^2}G^\rho(\Delta^*;\xi,\eta)\,F(\eta)\,dS(\eta) \tag{146}$$
as counterpart to the potential
$$P(F)(\xi)=\int_{S^2}G(\Delta^*;\xi,\eta)\,F(\eta)\,dS(\eta). \tag{147}$$

We are now able to formulate the following result: For (sufficiently small) values ρ > 0 and F ∈ C^{(0)}(S²), the potential P^ρ(F) is of class C^{(1)}(S²), and we have
$$\lim_{\rho\to 0}\,\sup_{\xi\in S^2}\big|P(F)(\xi)-P^\rho(F)(\xi)\big|=0 \tag{148}$$
and
$$\lim_{\rho\to 0}\,\sup_{\xi\in S^2}\Big|\int_{S^2}\nabla^*_\xi G(\Delta^*;\xi,\eta)\,F(\eta)\,dS(\eta)-\int_{S^2}\nabla^*_\xi G^\rho(\Delta^*;\xi,\eta)\,F(\eta)\,dS(\eta)\Big|=0. \tag{149}$$
Furthermore,
$$\sup_{\xi\in S^2}\Big|\nabla^*_\xi\int_{S^2}G(\Delta^*;\xi,\eta)\,F(\eta)\,dS(\eta)-\int_{S^2}\nabla^*_\xi G(\Delta^*;\xi,\eta)\,F(\eta)\,dS(\eta)\Big|=0. \tag{150}$$
The regularized Green kernel functions now enable us to reformulate the regularized fundamental theorem for ∇*:

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

1093

Theorem 16. Suppose that F is a continuously differentiable function on S². Then
$$\lim_{\rho\to 0}\,\sup_{\xi\in S^2}\big|F(\xi)-ST^\rho(\xi)-F^\wedge(0,1)\big|=0, \tag{151}$$
where we have used the abbreviations
$$ST^\rho(\xi)=-\int_{S^2}\nabla^*_\eta F(\eta)\cdot s^\rho(\xi,\eta)\,dS(\eta) \tag{152}$$
and
$$s^\rho(\xi,\eta)=\nabla^*_\eta G^\rho(\Delta^*;\xi,\eta)=\begin{cases}-\dfrac{1}{4\pi}\,\dfrac{1}{1-\xi\cdot\eta}\,\big(\xi-(\xi\cdot\eta)\eta\big), & 1-\xi\cdot\eta>\rho,\\[1.5ex]-\dfrac{1}{4\pi\rho}\,\big(\xi-(\xi\cdot\eta)\eta\big), & 1-\xi\cdot\eta\le\rho.\end{cases} \tag{153}$$

In what follows, scale-discrete scaling vector fields and wavelets will be introduced to establish a numerical solution process. We start with the choice of a sequence that divides the continuous scale interval (0, 2] into discrete pieces. More explicitly, (ρ_j)_{j∈N₀} again denotes a monotonically decreasing sequence of real numbers satisfying
$$\rho_0=2,\qquad \lim_{j\to\infty}\rho_j=0. \tag{154}$$

As in Eq. (122), one may choose ρ_j = 2^{1−j}, j ∈ N₀. An illustration of the absolute value of the scaling vector fields s^{ρ_j}, i.e., |s^{ρ_j}|, can be found in Fig. 11. We immediately get the following result: For F ∈ C^{(1)}(S²),
$$\lim_{j\to\infty}\,\sup_{\xi\in S^2}\big|F(\xi)-F^\wedge(0,1)-ST^{\rho_j}(\xi)\big|=0, \tag{155}$$
where
$$ST^{\rho_j}(\xi)=-\int_{S^2}\nabla^*_\eta F(\eta)\cdot s^{\rho_j}(\xi,\eta)\,dS(\eta). \tag{156}$$
Our procedure canonically leads to the following type of scale-discretized wavelets:


Fig. 11 Illustration of the norm of the scaling vector fields |s ρj (cos ϑ)| with cos ϑ = ξ · η.

$$w^{\rho_j}=s^{\rho_{j+1}}-s^{\rho_j}, \tag{157}$$
or, more explicitly,
$$w^{\rho_j}(\xi,\eta)=\begin{cases}0, & \rho_j<1-\xi\cdot\eta,\\[1ex]-\dfrac{1}{4\pi}\Big(\dfrac{1}{1-\xi\cdot\eta}-\dfrac{1}{\rho_j}\Big)\big(\xi-(\xi\cdot\eta)\eta\big), & \rho_{j+1}<1-\xi\cdot\eta\le\rho_j,\\[1.5ex]-\dfrac{1}{4\pi}\Big(\dfrac{1}{\rho_{j+1}}-\dfrac{1}{\rho_j}\Big)\big(\xi-(\xi\cdot\eta)\eta\big), & 0\le 1-\xi\cdot\eta\le\rho_{j+1}.\end{cases} \tag{158}$$
Equation (157) is called the (scale-)discretized scaling equation. A graphical impression of the scale-discretized wavelets can be found in Fig. 12. Assume now that F is a function of class C^{(1)}(S²). Observing the discretized scaling equation (157) we get, for J ∈ N₀,
$$ST^{\rho_J}=ST^{\rho_0}+\sum_{j=0}^{J-1}WT^{\rho_j}, \tag{159}$$
with
$$WT^{\rho_j}(\xi)=\int_{S^2}\nabla^*_\eta F(\eta)\cdot w^{\rho_j}(\xi,\eta)\,dS(\eta),\qquad j=0,\dots,J-1. \tag{160}$$
Therefore, we are able to formulate the following result using
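The scaling equation (157) and the locality in (158) are easy to verify pointwise. The sketch below (with ρ_j = 2^{1−j} as in the text; the helper names are ours) implements s^ρ from (153), forms w^{ρ_j} as a difference of scaling fields, and checks both the telescoping sum behind (159) and the vanishing of w^{ρ_j} outside its support:

```python
import math

def rho(j):
    return 2.0 ** (1 - j)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def s_rho(xi, eta, r):
    # scaling vector field (153), evaluated as a function of eta
    t = dot(xi, eta)
    vec = [x - t * e for x, e in zip(xi, eta)]
    c = -1.0 / (4 * math.pi * (1 - t)) if 1 - t > r else -1.0 / (4 * math.pi * r)
    return [c * v for v in vec]

def w_rho(xi, eta, j):
    # discretized scaling equation (157): w_j = s_{j+1} - s_j
    return [a - b for a, b in zip(s_rho(xi, eta, rho(j + 1)),
                                  s_rho(xi, eta, rho(j)))]

xi = (0.0, 0.0, 1.0)
eta = (math.sin(0.3), 0.0, math.cos(0.3))    # 1 - xi.eta ~ 0.045
J = 6

# telescoping: s^{rho_0} plus the wavelets reproduces s^{rho_J}
tele = s_rho(xi, eta, rho(0))
for j in range(J):
    tele = [a + b for a, b in zip(tele, w_rho(xi, eta, j))]
target = s_rho(xi, eta, rho(J))
print(max(abs(a - b) for a, b in zip(tele, target)))
```

Outside the cap 1 − ξ·η ≤ ρ_j both scaling fields fall back to the same unregularized branch, so the wavelet vanishes there identically, exactly as (158) states.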

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

1095

|wρj (co sϑ)| 0.7 j=0 j=1 j=2 j=3

0.6 0.5 0.4 0.3 0.2 0.1 0

−π

−π/2

0

π/2

π

Fig. 12 Illustration of the norm of the wavelets |w ρj (cos ϑ)| with cos ϑ = ξ · η


$$F^\wedge(n,m)=\int_{S^2}F(\xi)\,Y_{n,m}(\xi)\,dS(\xi),\qquad n=0,1;\ m=1,\dots,2n+1, \tag{161}$$

where Y_{n,m} are spherical harmonics. Let {s^{ρ_j}}_{j∈N₀} be a (scale-)discretized scaling vector field. Then the multiscale representation of a function F ∈ C^{(1)}(S²) is given by
$$\sup_{\xi\in S^2}\Big|F(\xi)-\Big(\sum_{j=0}^{\infty}WT^{\rho_j}(\xi)+F^\wedge(0,1)+\frac{1}{3}\sum_{m=1}^{3}F^\wedge(1,m)\,Y_{1,m}(\xi)\Big)\Big|=0. \tag{162}$$
The aforementioned result admits the following reformulation (cf. [23]):

$$ST^{\rho_J}+\sum_{j=J}^{\infty}WT^{\rho_j}=F-F^\wedge(0,1)-\frac{1}{3}\sum_{m=1}^{3}F^\wedge(1,m)\,Y_{1,m} \tag{163}$$
for every J ∈ N₀ (in the sense of ‖·‖_{C(S²)}). Again, the integrals ST^{ρ_j}, WT^{ρ_j} may be understood as low-pass and band-pass filters, respectively. Obviously,
$$ST^{\rho_{j+1}}=ST^{\rho_j}+WT^{\rho_j}. \tag{164}$$


Equation (164) may be interpreted in the following way: The (j + 1)-scale low-pass filtered version of F is obtained from the j-scale low-pass filtered version of F by adding the j-scale band-pass filtered version of F. The above considerations lead us to the following results in the determination of geoidal undulations N : S² → R from deflections of the vertical Θ : S² → R³ via (see (95))
$$\nabla^*_\xi N(R\xi)=-R\,\Theta(R\xi),\qquad \xi\in S^2. \tag{165}$$

As in the classical process of gravimetric determination of the geoid (see, e.g., [34]) we assume that the zero-degree term of the disturbing potential vanishes at the (spherical) surface of the Earth (i.e., the difference of the mass of the Earth and the mass of the reference ellipsoid is supposed to be zero). Moreover, we assume that the center of the reference ellipsoid coincides with the center of gravity of the Earth so that the first-degree term is zero. In other words, we have
$$N^\wedge(n,m)=\int_{S^2}N(R\xi)\,Y_{n,m}(\xi)\,dS(\xi)=0,\qquad n=0,1;\ m=1,\dots,2n+1. \tag{166}$$
To be more specific, we find the following result: Let Θ(R·) : ξ → Θ(Rξ), ξ ∈ S², be a continuous field on S². Then the geoidal undulations can be determined by the formula
$$N(R\xi)=\frac{R}{4\pi}\int_{S^2}\frac{1}{1-\xi\cdot\eta}\,\big(\xi-(\xi\cdot\eta)\eta\big)\cdot\Theta(R\eta)\,dS(\eta). \tag{167}$$

According to our approach the integral on the right-hand side of Eq. (167) can be written approximately by replacing the improper integral by proper parameter integrals involving the "regularized Green function kernel" with respect to the Beltrami operator. In our notation, for ξ ∈ S² and J ∈ N₀, we obtain
$$N(R\xi)=-R\int_{S^2}\Theta(R\eta)\cdot s^{\rho_J}(\xi,\eta)\,dS(\eta)-R\sum_{j=J}^{\infty}\int_{S^2}\Theta(R\eta)\cdot w^{\rho_j}(\xi,\eta)\,dS(\eta). \tag{168}$$

Consequently we find
$$N(R\xi)=-R\int_{S^2}\Theta(R\eta)\cdot s^{\rho_J}(\xi,\eta)\,dS(\eta)-R\sum_{j=J}^{\infty}\;\int_{\substack{\eta\in S^2\\ 1-\xi\cdot\eta\le\rho_j}}\Theta(R\eta)\cdot w^{\rho_j}(\xi,\eta)\,dS(\eta). \tag{169}$$

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

1097

More explicitly,
$$
\begin{aligned}
N(R\xi)=\;&-R\int_{S^2\setminus\Gamma_{\rho_J}(\xi)}\Theta(R\eta)\cdot\Big(-\frac{1}{4\pi}\,\frac{1}{1-\xi\cdot\eta}\,\big(\xi-(\xi\cdot\eta)\eta\big)\Big)\,dS(\eta)\\
&-R\int_{\Gamma_{\rho_J}(\xi)}\Theta(R\eta)\cdot\Big(-\frac{1}{4\pi\rho_J}\,\big(\xi-(\xi\cdot\eta)\eta\big)\Big)\,dS(\eta)\\
&-R\sum_{j=J}^{\infty}\Bigg\{\int_{\Gamma_{\rho_j}(\xi)\setminus\Gamma_{\rho_{j+1}}(\xi)}\Theta(R\eta)\cdot\Big(-\frac{1}{4\pi}\Big(\frac{1}{1-\xi\cdot\eta}-\frac{1}{\rho_j}\Big)\big(\xi-(\xi\cdot\eta)\eta\big)\Big)\,dS(\eta)\\
&\qquad\qquad+\int_{\Gamma_{\rho_{j+1}}(\xi)}\Theta(R\eta)\cdot\Big(-\frac{1}{4\pi}\Big(\frac{1}{\rho_{j+1}}-\frac{1}{\rho_j}\Big)\big(\xi-(\xi\cdot\eta)\eta\big)\Big)\,dS(\eta)\Bigg\},
\end{aligned}
\tag{170}
$$
where Γ_τ(ξ) = {η ∈ S² : 1 − ξ·η ≤ τ} denotes the spherical cap around ξ.

In other words, finer and finer detail information about the geoid is obtained by wavelets with smaller and smaller local support. Two local areas for the multiscale techniques as presented in this paper will be illustrated, namely South America and Hawaii. First, we investigate the disturbing gravitational field for areas in which mantle plumes and hotspots occur by use of regularized Green functions. A mantle plume occurs in the case of upwelling of abnormally hot rocks within the Earth's mantle. Plumes are assumed to be vertical conduits in which the hot mantle material, due to buoyancy, rises from the lower mantle to the lithosphere at velocities as large as 1 m yr⁻¹; these quasi-cylindrical regions have a typical diameter of about 100–200 km. In mantle convection theory, mantle plumes are related to hotspots, which describe centers of surface volcanism that are not directly caused by plate tectonic processes. A hotspot is a long-term source of volcanism which is fixed relative to the plate that moves over it. We study the Hawaii hotspot, which is one of the most well-known. Due to the local nature of plumes and hotspots we have to use high-resolution gravity models. Because of the lack of terrestrial-only data, the combined gravity model EIGEN-GL04C by GFZ is used, which consists of satellite data, gravimetry, and altimetry surface data. The multiscale approximation of the disturbing potential in the region of Hawaii is shown in Fig. 13. The disturbing potential calculated by use of the regularized Neumann Green function is shown in Fig. 14. The advantage of using the Neumann Green function



Fig. 13 Multiscale reconstruction of the disturbing potential T in [m2 s−2 ] from deflections of the vertical for the Hawaiian (plume) area using regularized Green functions (a rough low–pass filtering at scale j = 6 is improved by several band–pass filters of scale j = 6, . . . , 11, the last illustration shows the approximation of the disturbing potential T at scale j = 12, see [14])

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . . Disturbing Potential, Scale J=4

90° W

90° W

Disturbing Potential, Difference Scale J=6 and Scale J=4

30° W

120° W

30° N

150

30° N

°

100

15 N

15 N °

0

30 S

°

−50

45° S

−100

30° W 60 40

°

0 m²/s²

°

15 S

90° W

°

50

0

90° W

°

20

°

0

15 S

+

30 S

−20

45° S

−40

−150

60° S

60° S

−200

low–pass filtering (scale j = 4) 90° W

90° W

Disturbing Potential, Difference Scale J=10 and Scale J=8

30° W

120° W

30° N

50

°

40

15 N

30° W 40

°

30

°

10

°

30 S

0

°

45 S

−10

+

−20

60° S

20

°

15 S 10

°

30 S

0

°

45 S

−10 −20

60° S

−30

−30

band–pass filtering (scale j = 7, 8)

band–pass filtering (scale j = 9, 10)

Disturbing Potential, Difference Scale J=15 and Scale J=10

120° W

90° W

90° W

Disturbing Potential, Scale J=15

30° W

°

30 N °

15 N

m²/s²

20

°

m²/s²

0

15 S

120° W

90° W

90° W

30° W

°

60

30 N

40

15 N

300

°

°

200

°

0

0

°

30 S −20

45° S

m²/s²

20

°

15 S

100

°

15 S

=

0

°

30 S

m²/s²

0

+

90° W

15 N

0

+

90° W

30° N

30

°

−60

band–pass filtering (scale j = 5, 6)

Disturbing Potential, Difference Scale J=8 and Scale J=6

120° W

m²/s²

120° W

1099

−100

45° S

−40 −200 ° 60 S

−60

° 60 S

−300

band–pass filtering (scale j = 11, ..., 15)

low–pass filtering (scale j = 15)

Fig. 14 Multiscale reconstruction of the disturbing potential T in [m2 s−2 ] from deflections of the vertical for South America using regularized Neumann Green functions (a rough low–pass filtering at scale j = 4 is improved by several band–pass filters of scale j = 6, . . . , 15, the last illustration shows the approximation of the disturbing potential T at scale j = 15, see [14])


instead of the “normal” Green function (134) is that no integral at the boundary of the region has to be considered. More information about Neumann Green functions can be found in [15]. The input data used for the multiscale decomposition has been calculated from EGM2008 (from degree 3).

7 Ocean Circulation

Ocean flow has a great influence on mass transport and heat exchange. By modeling oceanic currents we therefore gain, for instance, a better understanding of weather and climate. In what follows we devote our attention to the geostrophic oceanic circulation on bounded regions. In a first approximation the oceanic surfaces under consideration may be assumed to be parts of the boundary of a spherical Earth model. We calculate the oceanic flow under the simplifying assumptions of stationarity, spherically reflected horizontal velocity, and strict neglect of inner frictions. This leads us to long-scale currents, which still give meaningful results for inner oceanic areas as, for example, when applied to the phenomenon of El Niño (cf. [14]). The simulation of ocean currents is based on the Navier–Stokes equations. Their formulation (see, e.g., [3]) is well known: Let us consider a fluid occupying an arbitrary (open and bounded) subdomain G₀ ⊂ R³ at time t = 0. The vector function u : [0, t_end] × G₀ → G_t ⊂ R³ describes the motion of the particle positions X ∈ G₀ with time, so that, at times t ≥ 0, the fluid occupies the domain G_t = {u(t; X) | X ∈ G₀}, respectively. Hence, G_t is a closed system in the sense that no fluid particle flows across its boundaries. The path of a particle X ∈ G₀ is given by the graph of the function t → u(t; X), and the velocity of the fluid at a fixed location x = u(t; X) ∈ G_t is given by the derivative v(t; X) = (∂/∂t) u(t; X). The derivation of the governing equations relies on the conservation of mass and momentum. The essential tool is the transport theorem, which shows how the time derivative of an integral over a domain changing with time may be computed. The mass of a fluid occupying a domain is determined by the integral over the density ρ of the fluid. Since the same amount of fluid occupying the domain at time 0 later occupies the domain at time t > 0, we have to require that ∫_{G₀} ρ(0; x) dx coincides with ∫_{G_t} ρ(t; x) dx for all t ∈ (0, t_end]. Therefore, the derivative of the mass with respect to time must vanish, and the transport theorem yields, for all t and G_t,
$$\int_{G_t}\Big(\frac{\partial}{\partial t}\rho(t;x)+\operatorname{div}(\rho v)(t;x)\Big)\,dx=0. \tag{171}$$

Since this identity is valid for arbitrary regions (in particular, for arbitrarily small ones), the integrand itself must vanish, which yields the continuity equation
$$\frac{\partial\rho}{\partial t}+\operatorname{div}(\rho v)=0. \tag{172}$$
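The argument above, mass conservation forcing the integrand to vanish, is also the principle behind conservative discretizations of (172). The toy code below (our illustration, not from the text) advances a 1-D periodic analogue of the continuity equation with an upwind finite-volume scheme and confirms that the total mass stays constant, mirroring the vanishing time derivative of the mass integral in (171):

```python
# 1-D periodic continuity equation d(rho)/dt + d(rho*v)/dx = 0,
# constant velocity v > 0, upwind finite-volume update.
n, dx, dt, v = 100, 1.0 / 100, 0.004, 1.0    # CFL number v*dt/dx = 0.4

# initial density: constant background with a small bump
rho = [1.0 + 0.3 * ((i * dx - 0.5) ** 2 < 0.01) for i in range(n)]
mass0 = sum(r * dx for r in rho)

for _ in range(200):
    flux = [v * rho[i] for i in range(n)]            # upwind flux (v > 0)
    # flux differences telescope over the periodic grid -> mass is conserved
    rho = [rho[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]

mass = sum(r * dx for r in rho)
print(mass0, mass)
```

Because the update only moves flux between neighboring cells, the sum of all cell masses changes only by boundary fluxes, which cancel on a periodic domain.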

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

1101

The momentum of a solid body is the product of its mass with its velocity; for the fluid occupying G_t it is given by
$$\int_{G_t}\rho(t;x)\,v(t;x)\,dx. \tag{173}$$

According to Newton’s second law, the change of (linear) momentum is equal to the sum of the forces acting on the fluid. We distinguish two typesof forces, viz. body forces (e.g., gravity, Coriolis force), which can be expressed as Gt ρ(t; x)k(t; x)dx with a given force density k per unit volume and surface forces (e.g., pressure, internal friction) representable as ∂Gt σ (t; x)ν(x) dS(x), where σ (t; x) is the stress tensor and ν(x) denotes the outer normal on ∂Gt in x. Thus Newton’s law reads d dt



 Gt

ρ(t; x)(t; x) dx =

 Gt

ρ(t; x)k(t; x) dx +

σ (t; x)ν(x) dS(x). ∂Gt

(174) If we now apply the product rule and the transport theorem componentwise to the term on the left and apply the divergence theorem to the second term on the right, we obtain the momentum equation ∂ (ρ) + ( · ∇)(ρ) + (ρ)∇ ·  = ρk + ∇ · σ. ∂t

(175)

The nature of the oceanic flow equation depends heavily on the model used for the stress tensor. In the special case of incompressible fluids (e.g., ocean water), characterized by a density ρ(t; x) = ρ₀ = const dependent neither on space nor on time, we find ∇·v = 0, i.e., v is divergence-free. When modeling an inviscid fluid, internal friction is neglected and the stress tensor is determined solely by the pressure, σ(t; x) = −P(t; x) i, where i is the unit matrix. In the absence of inner friction (in consequence of, e.g., effects of wind and surface influences) we are able to ignore the derivative ∂v/∂t and, hence, the dependence on time. As relevant volume force densities k we have the gravity field g and the Coriolis acceleration
$$c=-2\,\omega\wedge v, \tag{176}$$
where ω is the angular velocity. Finally, for large-scale currents of the ocean, the nonlinear part does not play any role, i.e., the term (v·∇)v is negligible. Under all these very restrictive assumptions the equation of motion (175) reduces to the following identity:
$$2\,\omega\wedge v=-\frac{\nabla P}{\rho_0}+g. \tag{177}$$

Even more, we suppose a velocity field of a spherical layer model. For each layer, i.e., for each sphere around the origin 0 with radius r (≤ R), the velocity field v can be decomposed into a normal field v_nor and a tangential field v_tan. The normal part is negligibly small in comparison to the tangential part (see the considerations in [57]). We assume that ω = Ω ε³, where Ω is the rotation rate of the Earth and ε³ is the unit vector along the rotation axis, and define the Coriolis parameter C(ξ) by
$$C(\xi)=2\,\omega\cdot\xi. \tag{178}$$

For simplicity, we regard the gravity acceleration as a normal field: g(rξ) = −g_r ξ, ξ ∈ S² (with g_r as mean gravity intensity) on the sphere of radius r. From Eq. (177) we then get
$$2\,\omega\wedge v_{\mathrm{tan}}=-\frac{1}{\rho_0 r}\,\nabla^*_\xi P-\frac{1}{\rho_0}\,\frac{\partial P}{\partial r}\,\xi-\mathbf{g}_r, \tag{179}$$
with 𝐠_r = g_r ξ. Separating into normal and tangential parts we finally arrive at
$$C(\xi)\,\xi\wedge v_{\mathrm{tan}}(r\xi)=-\frac{1}{\rho_0 r}\,\nabla^*_\xi P(r\xi) \tag{180}$$
and
$$\big(2\,\omega\wedge v_{\mathrm{tan}}(r\xi)\big)\cdot\xi=-\frac{1}{\rho_0}\,\frac{\partial}{\partial r}P(r\xi)-g_r. \tag{181}$$

Equation (180) essentially tells us that the tangential surface pressure gradient is balanced by the Coriolis force. Moreover, the vertical Coriolis acceleration is very small in comparison to the tangential motion, such that we are allowed to assume that (2ω ∧ v_tan(Rξ))·ξ = 0. On the surface of the Earth (here, r = R), we then obtain from (181) a direct relation between the product of the mean density and the mean gravity acceleration and the normal pressure gradient (hydrostatic approximation):
$$\frac{\partial}{\partial r}P(R\xi)=-\rho_0\,g_R. \tag{182}$$
This is the reason why we obtain the pressure by integration as follows:
$$P(R\xi)=-\rho_0\,g_R\,\Xi(R\xi)+P_{\mathrm{Atm}}, \tag{183}$$

where PAtm denotes the mean atmospheric pressure. The quantity Ξ (Rξ ) (cf. Fig. 15) is the difference between the heights of the ocean surface and the geoid at the point ξ ∈ S2 . The scalar function ξ → Ξ (Rξ ), ξ ∈ S2 , is called ocean topography. By use of altimeter satellites, we are able to measure the difference H between the known satellite height HSat and the (unknown) height of the ocean surface HOcean : H = HSat − HOcean . After calculation of HOcean we get the ocean topography Ξ = HOcean − HGeoid from the known geoidal height HGeoid . In connection with (180) this finally leads us to the equation

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

1103

SATELLITE ORBIT MICROWAVE MEASUREMENT OF COLUMNAR WATER VAPOR

RADAR ALTIMETER RANGING

OCEAN TOPOGRAPHY

LASER RANGING STATION

SEA SURFACE SEA LEVEL

GEOID SEA-FLOOR TOPOGRAPHY

REFERENCE ELLIPSOID

Fig. 15 Ocean topography and geoid

$$C(\xi)\,\xi\wedge v_{\mathrm{tan}}(R\xi)=\frac{g_R}{R}\,\nabla^*_\xi\,\Xi(R\xi). \tag{184}$$

Remembering the definition of the surface curl gradient, L*_ξ = ξ ∧ ∇*_ξ, we are able to reformulate (184) as
$$C(\xi)\,v_{\mathrm{tan}}(R\xi)=-\frac{g_R}{R}\,L^*_\xi\,\Xi(R\xi). \tag{185}$$

This is the so-called equation of the geostrophic oceanic flow. As a consequence, we have to deal with an equation of vectorial tangential type given on the unit sphere S². Structurally, we have to be concerned with an equation involving the surface curl gradient L* in the form L*S = s, where S = Ξ and s(ξ) = −(R C(ξ)/g_R) v_tan(Rξ). Now, as C(ξ) and g_R are known, the computation of the geostrophic oceanic flow simply is a problem of differentiation, namely the computation of the derivative L*_ξ S(ξ) = ξ ∧ ∇*_ξ S(ξ), ξ ∈ S². The componentwise formulas for the longitude and latitude directions of v_tan are given as


$$C(\varphi,\lambda)\,v_{\mathrm{tan}}^{\varphi}(R;\varphi,\lambda)=-\frac{g_R}{R}\,\frac{\partial\Xi(R;\varphi,\lambda)}{\partial\varphi}, \tag{186}$$
$$C(\varphi,\lambda)\,v_{\mathrm{tan}}^{\lambda}(R;\varphi,\lambda)=-\frac{g_R}{R}\,\frac{1}{\sin\varphi}\,\frac{\partial\Xi(R;\varphi,\lambda)}{\partial\lambda}, \tag{187}$$
where λ is the longitude and φ, φ ≠ 0, is the latitude in spherical coordinates. The point of departure for spherical harmonic modeling is to determine the geostrophic oceanic flow following Eq. (185), which leads to
$$L^*\,\Xi(R\xi)=-\frac{R}{g_R}\,C(\xi)\,v_{\mathrm{tan}}(R\xi),\qquad \xi\in S^2. \tag{188}$$
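Once Ξ is available, the geostrophic balance reduces the flow computation to numerical differentiation. The sketch below is an illustration under stated assumptions (a synthetic topography Ξ, mean values g_R, R, Ω, and φ taken as polar distance so that C = 2Ω cos φ); it is not the data or code of the text:

```python
import math

g_R, R, Omega = 9.81, 6.371e6, 7.292e-5   # mean gravity, Earth radius, rotation rate

def Xi(phi, lam):
    # synthetic ocean topography [m] (hypothetical test field)
    return 0.5 * math.sin(phi) ** 2 * math.cos(lam)

def coriolis(phi):
    # C = 2 omega . xi with phi interpreted as polar distance
    return 2.0 * Omega * math.cos(phi)

def v_tan(phi, lam, h=1e-6):
    # central differences for the derivatives in (186) and (187)
    C = coriolis(phi)
    d_phi = (Xi(phi + h, lam) - Xi(phi - h, lam)) / (2 * h)
    d_lam = (Xi(phi, lam + h) - Xi(phi, lam - h)) / (2 * h)
    return (-g_R / (R * C) * d_phi,
            -g_R / (R * C) * d_lam / math.sin(phi))

vp, vl = v_tan(1.0, 0.5)
print(vp, vl)   # on the order of mm/s for this toy field
```

The small prefactor g_R/(R C) is the reason why even meter-scale topography variations drive only slow geostrophic currents.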

The calculation of the derivative L*Ξ is not realizable, at least not directly, since we are confronted with discrete data material that, in addition, is only available for oceanic areas. In the geodetic literature a spherical harmonic approach is usually used in the form of a Fourier series in terms of spherical harmonics. More concretely, due to [2], a global approach for the ocean topography can be modeled by extension of the (altimetric) sea surface into the land surface using geoidal heights from a gravity model and applying an iterative process. The result of the iteration is a global representation of the sea surface in terms of spherical harmonics up to a certain degree L:
$$\Xi(R\xi)=\sum_{n=0}^{L}\sum_{k=1}^{2n+1}C_{nk}\,Y_{n,k}(\xi). \tag{189}$$

Using fully normalized associated Legendre polynomials P̄_{nk} of degree n and order k and spherical coordinates (φ, λ) (see, e.g., [54, 55]), the spherical harmonic expansion is given in the form
$$\Xi(R;\varphi,\lambda)=\sum_{n=0}^{L}\sum_{k=0}^{n}\big(C^{1}_{nk}\cos k\lambda+C^{2}_{nk}\sin k\lambda\big)\,\bar P_{nk}(\cos\varphi). \tag{190}$$

The vectorial isotropic operator L* (cf. [24] for further details) is then applied – unfortunately under the loss of its vectorial isotropy when decomposed in terms of scalar components – to the resulting Fourier series. The results are scalar components of the geostrophic flow. The longitude and latitude parts of L*Ξ (see Eqs. (186) and (187)) are given by
$$
\begin{aligned}
v_{\mathrm{tan}}^{\varphi}(R;\varphi,\lambda)&=-\frac{g_R}{R}\,\frac{1}{C(\varphi,\lambda)}\sum_{n=0}^{L}\sum_{k=1}^{2n+1}C_{nk}\,\frac{\partial Y_{n,k}}{\partial\varphi}(\varphi,\lambda)\\
&=-\frac{g_R}{R}\,\frac{1}{C(\varphi,\lambda)}\sum_{n=0}^{L}\sum_{k=0}^{n}\big(C^{1}_{nk}\cos k\lambda+C^{2}_{nk}\sin k\lambda\big)\,\frac{\partial\bar P_{nk}(\cos\varphi)}{\partial\varphi},
\end{aligned}
\tag{191}
$$

17 From Newton’s Law of Gravitation to Multiscale Geoidal . . .

$$v^{\lambda}_{\tan}(R;\varphi,\lambda) = -\frac{g_R}{R}\,\frac{1}{C(\varphi,\lambda)}\,\frac{1}{\sin\varphi}\sum_{n=0}^{L}\sum_{k=1}^{2n+1} C_{nk}\,\frac{\partial Y_{n,k}}{\partial\lambda}(\varphi,\lambda)$$
$$= -\frac{g_R}{R}\,\frac{1}{C(\varphi,\lambda)}\,\frac{1}{\sin\varphi}\sum_{n=1}^{L}\sum_{k=1}^{n} k\left(-C^{1}_{nk}\sin k\lambda + C^{2}_{nk}\cos k\lambda\right)\bar{P}_{nk}(\cos\varphi). \tag{192}$$

An alternative representation in terms of divergence-free vector spherical harmonics that reflects the vectorial isotropy is given in [24]. The Fourier expansion approach leads to a series of the velocity field in terms of divergence-free vector spherical harmonics. However, these vector types of functions are far from being suitable for purposes of approximation. In the case of ocean circulation, boundary effects in the neighborhood of coast lines cannot be avoided by non-space-localizing vector spherical harmonics, when the ocean topography is set to zero over the continents. By use of space-localizing wavelets as proposed below, no assumptions on the continents must be made. The serious difficulty with global polynomial structures such as spherical harmonics (with Ξ = 0 on continents!) is the occurrence of the Gibbs phenomenon close to the coast lines (see, e.g., [2, 55]). An alternative approach suppressing numerically generated oscillations in inner oceanic areas is the application of kernels with local support as, e.g., smoothed Haar kernels (see [14, 26] and the references therein)

$$\Phi^{(k)}_{\tau}(t) = \begin{cases} 0, & \tau < 1-t \le 2,\\[4pt] \dfrac{k+1}{2\pi\,\tau^{k+1}}\,\bigl(t-(1-\tau)\bigr)^{k}, & 0 \le 1-t \le \tau. \end{cases} \tag{193}$$
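The locally supported kernel (193) is straightforward to implement. The following Python sketch (our own illustration; variable names are ours) also checks numerically that the kernel is normalized, i.e., that its surface integral over the unit sphere, $2\pi\int_{-1}^{1}\Phi^{(k)}_{\tau}(t)\,dt$, equals 1:

```python
import math

def smoothed_haar(t, tau, k):
    """Smoothed Haar kernel (193): locally supported on 1 - t <= tau."""
    if 1.0 - t > tau:            # outside the spherical cap of support
        return 0.0
    return (k + 1) / (2.0 * math.pi * tau ** (k + 1)) * (t - (1.0 - tau)) ** k

def sphere_integral(f, n=20000):
    """2*pi * int_{-1}^{1} f(t) dt via the midpoint rule; this is the
    surface integral of a zonal kernel f(xi . eps) over the unit sphere."""
    h = 2.0 / n
    s = sum(f(-1.0 + (i + 0.5) * h) for i in range(n))
    return 2.0 * math.pi * s * h

# the kernel is normalized: its integral over the sphere equals 1
val = sphere_integral(lambda t: smoothed_haar(t, tau=0.5, k=3))
```

The normalization is what makes the family a candidate for a Dirac sequence as τ tends to 0.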

For τ ∈ (0, 2], k ∈ ℕ, the function $\Phi^{(k)}_{\tau}$ as introduced by (193) is (k − 1)-times continuously differentiable on the interval [0, 2]. $(\Phi^{(k)}_{\tau_j})_{j\in\mathbb{N}_0}$ is a sequence tending to the Dirac distribution, i.e., a Dirac sequence (cf. Figs. 16 and 17). For a strictly monotonically decreasing sequence $(\tau_j)_{j\in\mathbb{N}_0}$ satisfying τ_j → 0 for j → ∞, e.g., τ_j = 2^{−j}, we obtain for the convolution integrals (low-pass filters)

$$\Xi^{(k)}_{\tau_j}(\xi) = \int_{\mathbb{S}^2} \Phi^{(k)}_{\tau_j}(\xi\cdot\varepsilon)\,\Xi(\varepsilon)\,dS(\varepsilon), \qquad \xi \in \mathbb{S}^2, \tag{194}$$

the limit relation

$$\lim_{j\to\infty}\,\sup_{\xi\in\mathbb{S}^2}\left|\Xi(\xi) - \Xi^{(k)}_{\tau_j}(\xi)\right| = 0. \tag{195}$$
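The limit relation (195) can be made plausible numerically in a simple special case: for the zonal test function Ξ(ε) = ε · e₃, the convolution (194) evaluated at the north pole ξ = e₃ reduces to the one-dimensional integral $2\pi\int_{-1}^{1}\Phi^{(k)}_{\tau_j}(t)\,t\,dt$, which for the kernel (193) equals 1 − τ_j/(k + 2) and hence tends to Ξ(e₃) = 1 as j → ∞. A minimal sketch (our own; names are ours):

```python
import math

def lowpass_at_pole(tau, k, n=20000):
    """2*pi * int_{-1}^{1} Phi_tau^(k)(t) * t dt: the low-pass filter (194)
    applied to Xi(eps) = eps . e3, evaluated at the north pole (midpoint rule)."""
    c = (k + 1) / (2.0 * math.pi * tau ** (k + 1))
    h = 2.0 / n
    s = 0.0
    for i in range(n):
        t = -1.0 + (i + 0.5) * h
        if 1.0 - t <= tau:                    # support of the kernel (193)
            s += c * (t - (1.0 - tau)) ** k * t
    return 2.0 * math.pi * s * h

# tau_j = 2^{-j}: the filtered values approach Xi(e3) = 1 (Dirac sequence)
vals = [lowpass_at_pole(2.0 ** (-j), k=5) for j in (1, 2, 3, 4)]
```

The computed values match the closed form 1 − 2^{−j}/7 and approach 1 with increasing scale j.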

An easy calculation yields the representation

$$L^{*}_{\xi}\Phi^{(k)}_{\tau_j}(\xi\cdot\varepsilon) = \begin{cases} 0, & \tau_j < 1-\xi\cdot\varepsilon \le 2,\\[4pt] \dfrac{k(k+1)}{2\pi\,\tau_j^{k+1}}\,\bigl((\xi\cdot\varepsilon)-(1-\tau_j)\bigr)^{k-1}\,\xi\wedge\varepsilon, & 0 \le 1-\xi\cdot\varepsilon \le \tau_j. \end{cases} \tag{196}$$
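The scalar factor in (196) is simply the derivative of the kernel (193) with respect to t = ξ · ε, while the vector part ξ ∧ ε stems from the operator acting on the inner product. A quick finite-difference check of the scalar factor (our own sketch, variable names ours):

```python
import math

def phi(t, tau, k):
    """Smoothed Haar kernel (193)."""
    if 1.0 - t > tau:
        return 0.0
    return (k + 1) / (2.0 * math.pi * tau ** (k + 1)) * (t - (1.0 - tau)) ** k

def phi_prime(t, tau, k):
    """Scalar factor of (196): d/dt of the kernel (193)."""
    if 1.0 - t > tau:
        return 0.0
    return k * (k + 1) / (2.0 * math.pi * tau ** (k + 1)) * (t - (1.0 - tau)) ** (k - 1)

# central difference inside the support reproduces the closed form
tau, k, t, h = 0.5, 4, 0.8, 1e-6
fd = (phi(t + h, tau, k) - phi(t - h, tau, k)) / (2 * h)
```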


W. Freeden and H. Nutz

[Fig. 16, two panels: scale j = 3 for k = 0, 2, 3, 5 (left); scales j = 0, 1, 2, 3 for k = 5 (right)]

Fig. 16 Sectional illustration of the smoothed Haar wavelets $\Phi^{(k)}_{\tau_j}(\cos\theta)$ with θ ∈ [−π, π], τ_j = 2^{−j}

Fig. 17 Illustration of the first members of the wavelet sequence for the smoothed Haar scaling function on the sphere (τ_j = 2^{−j}, j = 2, 3, 4, k = 5)

It is not hard to see (cf. [24]) that

$$\lim_{j\to\infty}\,\sup_{\xi\in\mathbb{S}^2}\left|L^{*}_{\xi}\Xi(\xi) - L^{*}_{\xi}\Xi^{(k)}_{\tau_j}(\xi)\right| = 0. \tag{197}$$

The multiscale approach by smoothed Haar kernels can be formulated in a standard multiscale nomenclature (see [14, 26]). For example, the Haar wavelets can be understood as differences of two successive scaling functions (smoothed Haar kernels). In doing so, an economical and efficient algorithm in a tree structure (Fast Wavelet Transform (FWT)) can be implemented appropriately.
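In this nomenclature the (scale-discrete) wavelets are the differences $\Psi^{(k)}_j = \Phi^{(k)}_{\tau_{j+1}} - \Phi^{(k)}_{\tau_j}$ of two successive scaling functions, so that a coarse low-pass filter plus a telescoping sum of band-pass filters reproduces the low-pass filter at the finest scale; this is exactly what Figs. 18 and 20 display on the level of the filtered topography. A minimal Python sketch of the telescoping identity (our own illustration; names are ours):

```python
import math

def scaling(t, j, k=5):
    """Smoothed Haar scaling function (193) at scale j, tau_j = 2^-j."""
    tau = 2.0 ** (-j)
    if 1.0 - t > tau:
        return 0.0
    return (k + 1) / (2.0 * math.pi * tau ** (k + 1)) * (t - (1.0 - tau)) ** k

def wavelet(t, j, k=5):
    """Scale-discrete wavelet: difference of two successive scaling functions."""
    return scaling(t, j + 1, k) - scaling(t, j, k)

def multiscale(t, j0, J, k=5):
    """Low-pass at scale j0 plus band-passes for j = j0, ..., J-1."""
    return scaling(t, j0, k) + sum(wavelet(t, j, k) for j in range(j0, J))

# telescoping: the sum equals the scaling function at the finest scale J
t = 0.995
lhs = multiscale(t, j0=1, J=7)
rhs = scaling(t, 7)
```

By linearity, the same telescoping holds for the filtered fields $\Xi^{(k)}_{\tau_j}$, which is what the tree-structured fast wavelet transform exploits.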


Fig. 18 Multiscale approximation of the ocean topography [cm] (a rough low–pass filtering at scale j = 1 is improved with several bandpass filters of scale j = 1, . . . , 6, where the last picture shows the multiscale approximation at scale j = 7 (cf. [27]))


Fig. 19 Ocean topography [cm] (left) and geostrophic oceanic flow [cm/s] (right) of the Gulf Stream computed by the use of smoothed Haar wavelets (j = 8, k = 5) (cf. [27])

The data points used for our demonstration are extracted from the French CLS01 model (in combination with the EGM96 model). Clearly, ocean currents are subject to different influence factors, such as the wind field, warming of the atmosphere, salinity of the water, etc., which are not accounted for in our modeling. Our approximation (see Figs. 18, 19) must be understood in the sense of a geostrophic balance as proposed earlier. An analysis shows that its validity may be assumed on spatial scales of a little more than approximately 30 km and on time scales longer than approximately one week. Indeed, the geostrophic velocity field is perpendicular to the tangential gradient of the ocean topography (i.e., perpendicular to the tangential pressure gradient). This is a remarkable property. The water flows along curves of constant ocean topography (see Fig. 18 for a multiscale modeling). Despite the essentially restricting assumptions necessary for geostrophic modeling, we obtain instructive circulation models of the internal ocean surface current for the northern or southern hemisphere, respectively (however, difficulties for the computation of the flow arise from the fact that the Coriolis parameter vanishes on the equator). In Fig. 20 some results based on an alternative approach following the ideas presented in [15] are shown. This method reconstructs the mean ocean topography from the geostrophic ocean flow that is only known on a subdomain of the sphere (here, a spherical cap). The input data (see Fig. 21, right) has been calculated on the basis of the mean ocean topography (or mean dynamic topography (MDT)) model from [48]. More information about regularization based on Green functions for modeling the sea surface topography can be found in [10] (however not restricted to subdomains of the sphere).
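The perpendicularity of the geostrophic flow to the tangential gradient of the topography is an algebraic consequence of v_tan being proportional to ξ ∧ ∇*Ξ: a cross product is orthogonal both to ξ (the flow is tangential) and to ∇*Ξ (the flow follows level lines of Ξ). This can be checked with an arbitrary surface gradient at an arbitrary point (our own sketch; all names are ours):

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# unit position vector xi on the sphere and the local tangential frame there
theta, lam = 0.7, 1.9
xi = (math.sin(theta)*math.cos(lam), math.sin(theta)*math.sin(lam), math.cos(theta))
e_theta = (math.cos(theta)*math.cos(lam), math.cos(theta)*math.sin(lam), -math.sin(theta))
e_lam = (-math.sin(lam), math.cos(lam), 0.0)

# some surface gradient grad* Xi (arbitrary tangential combination)
g = tuple(2.1 * a + 0.7 * b for a, b in zip(e_theta, e_lam))

v = cross(xi, g)   # direction of the geostrophic flow, v ~ xi ^ grad* Xi
```

Numerically, v · g and v · ξ vanish up to roundoff: the flow is tangential and follows the curves of constant topography.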

[Fig. 20 panels: MDT at scale J = 4 (low-pass); differences between scales J = 6 and J = 4, J = 8 and J = 6, J = 10 and J = 8, J = 15 and J = 10 (band-pass); MDT at scale J = 15 (low-pass)]

Fig. 20 Multiscale approximation of the ocean topography [cm] (a rough low–pass filtering at scale j = 4 is improved with several bandpass filters of scale j = 4, . . . , 15, where the last picture shows the multiscale approximation at scale j = 15)


Fig. 21 Interpolated mean ocean topography (or mean dynamic topography (MDT)) calculated from [48] on the left and the geostrophic flow from the mean topography model on the right

8 Concluding Remarks about the Earth’s Figure

In the eighteenth century it was well established that deviations between measurement and ellipsoidal theory existed which could not be explained by measurement errors (i.e., inaccurate measurements). The geodetic arc measurements as well as the pendulum measurements pointed to the problem that local deflections of the vertical occur because of the different composition of the masses of the Earth. The actual composition of the Earth’s masses did not correspond to the presumed assumptions. Already [46] discussed the gravity changes as local distortions, which were traced back to the influence of mountains, a simple conclusion from Newton’s potential theory. Also the strongly different values for the flattening of the ellipsoid coming from different geodetic arc measurements called the regular curvature of the meridians into question. Laplace [43] concluded in his “Traité de Mécanique Céleste” that the Earth is noticeably different from an elliptic shape. On the basis of the scientific findings well known at that time and his own knowledge about the significance of the deflections of the vertical, which he traced back to the visible irregularity of the masses as well as to the different densities below the Earth’s surface, Gauss was the first scientist to arrive at a comprehensive definition of the mathematical figure of the Earth, which became fundamental in physical geodesy. This definition was published in 1828: “Bestimmung des Breitenunterschiedes zwischen den Sternwarten von Göttingen und Altona” (Determination of the latitude difference between the observatories of Göttingen and Altona) (Gauss Werke, vol. IX, p. 49). In this treatise one can find the following sentences: In our opinion the topic is viewed from a wrong angle, if such phenomena are always considered as local deviations of the plumb line, and these deviations are regarded as it were only an exception. What we call in a geometric sense the surface of the Earth is nothing else as the same surface, which everywhere intersects the direction of gravity at right angles and


part of it coincides with the surface of the world’s oceans. The direction of the gravity is determined at every point by the shape of the part of the solid Earth and its unequal density. At the exterior rind of the Earth, from which we alone know anything, its shape and density appear as extreme irregular; the irregularity of the density may extend fairly deep under the outer rind and cannot be computed, because there is nearly no data available. The geometrical surface is the product of the total effect of these unequal distributed elements. . . In consideration of this situation nothing can prevent us to view the Earth as a whole as a spheroid of revolution, from which its real (geometrical) surface will overall deflect almost by stronger, weaker, shorter or longer undulations.

For this definition of the geometrical surface of the Earth, Gauss’s scholar Listing [44] chose the term “geoid”, which is in use to the present day and is the reference surface for all heights above sea level. Gauss explains the term “height” in a letter to Olbers (Gauss Werke, vol. IX, p. 375): This surface (the surface of the oceans) is called a horizontal surface (couche de niveau); equal heights are given to the points of this surface without caring about whether or how much these points deviate from an ellipsoidal spheroid. The heights above this surface can be measured by a barometer as well as trigonometrically, so that both must correspond to each other.

According to [50] this surface of the Earth, in principle defined by Gauss, can in its importance for geodesy be considered a change like that brought about by Copernicus, because the simple model of the surface of an ellipsoid as an ideal Earth’s figure was replaced by the physically interpreted Earth. Since that time the ellipsoid serves in geodesy mainly as a surface for computations in geodetic surveying and as a reference surface for the geoid and the Earth’s gravity field. The Gaussian definition solves the problems with complicated correction models, which occur when reducing measurements for the determination of deflections of the vertical. The surface of an ellipsoid of revolution representing the Earth cannot be exactly horizontal, i.e., everywhere perpendicular to the plumb line, because of the gravitational pull of the mountains. The Gaussian definition does not imply a simple geometrical surface; rather, the geoid is defined by specifying all locations in Euclidean space ℝ³ satisfying the simple equation W = W₀ = const. In other words, W is equated with the constant W₀, whereby the geoid is explained as a level (equipotential) surface of the Earth’s gravity field. With his definition Gauss introduced a principally different model, which is defined through the observations and represents a reversal of perspective (Gauss says that this definition is a change of the point of view). A generalization of the Gaussian definition was achieved by Bruns [7]: The problem of scientific geodesy is the investigation of the potential function of the Earth.

Instead of looking at a special equipotential surface, namely the geoid, all possible equipotential surfaces are considered in this definition as well as the gravity potential. The definition by Bruns can especially be used in satellite geodesy, because a satellite is moving in the gravity field of the Earth so that its orbit is determined by the potential W . In today’s geodesy concerned with the Earth’s figure, a result first motivated by Runge [62] in one-dimensional complex analysis and later generalized, e.g., by


Walsh [69], Vekua [68], and Hörmander [36] to potential theory in three-dimensional Euclidean space ℝ³ is of basic interest. For geodetically relevant applications (see, e.g., [41, 50, 54, 63] and the references therein), the Runge–Walsh theorem may be formulated as follows (cf. [51]): Let the Earth’s interior G ⊂ ℝ³ be a regular region. The Earth’s gravitational potential in G^c, i.e., the potential that is harmonic in G^c and regular at infinity, can be approximated by a function that is harmonic outside an arbitrarily given Runge (in geodesy called Bjerhammar) ball A ⋐ G, i.e., A ⊂ G with dist(A, ∂G) > 0, in the sense that, for any given ε > 0, the absolute error between the two functions is smaller than ε for all points outside and on any closed surface completely surrounding ∂G in its outer space. The value ε may be arbitrarily small, and the surrounding surface may be arbitrarily close to the surface ∂G. Obviously, the Runge–Walsh theorem in the preceding formulation represents a pure existence theorem. It guarantees only the existence of an approximating function and does not provide a constructive method to find it. The situation, however, is completely different if spherical geometries are exclusively involved in the Runge concept. Assuming that both A, G are concentric balls around the origin with A ⋐ G, a constructive approximation of a potential in the outer space G^c is available, e.g., by outer harmonic (orthogonal) expansions (see, e.g., [29, 39, 56, 70]). More concretely, within the classical context of a twofold spherical configuration, a constructive version of the Runge–Walsh theorem can be guaranteed by finite truncations of Fourier expansions in terms of outer harmonics, where the L²(∂G)-convergence of the Fourier series implies uniform convergence on any point set K ⋐ G^c. The Fourier coefficients are obtained by integration over the sphere ∂G. The gravitational potential is available (in a spectral sense) by tables of the Fourier coefficients.
Nowadays, in fact, outer harmonic expansions constitute the conventional geodetic tools for globally reflected approximation of the Earth’s gravitational potential and its observables. From a superficial point of view, one could suggest that approximation by truncated series expansions in terms of outer harmonics is closely related to spherical geometries ∂A, ∂G. Freeden [13], however, showed that the essential steps to a constructive Fourier approach can be extended to any regular (i.e., not necessarily spherical) region G and to any regular (i.e., not necessarily spherical) Runge domain A ⋐ G. The Runge–Walsh theorem in this formulation avoids any difficulty with the convergence of the generalized Fourier series to the gravitational potential for arbitrary sets K ⋐ G^c. In analogy to the spherical case, however, it likewise does not help to specify convergence inside A^c \ G^c, so that any attempts (see [6]) to reduce gravitational information via infinite Fourier series downward from ∂G to the Runge surface ∂A are not justifiable within the Runge–Walsh framework. In summary, the Runge–Walsh concept reflects constructive approximation capabilities of the Earth’s gravitational (and not gravity) potential even if geoscientifically realistic (i.e., not necessarily spherical) geometries come into play. For numerical computations, the Runge concept may be regarded as the justification why boundary value problems in geodesy can be solved without any need to


use specific telluroidal reflected trial function systems such as ellipsoidal harmonics, Lamé functions, etc., which are numerically more difficult to handle. Instead, outer spherical harmonics (i.e., multi-poles) and/or mono-poles showing a larger harmonicity domain than the exterior of a telluroid (in the linear Molodensky problem) or the actual Earth (in the oblique derivative problem) can be taken into account.
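The constructive spherical situation can be illustrated by the simplest case: the potential of a point mass at y inside a Runge ball is represented outside any sphere containing y by the multipole (outer harmonic) series $1/|x-y| = \sum_{n\ge 0} |y|^n/|x|^{n+1}\, P_n(\hat{x}\cdot\hat{y})$, which converges for |x| > |y|, so that a truncated sum already yields a small absolute error. A minimal Python sketch of this standard expansion (our own illustration, not from the text; names are ours):

```python
import math

def legendre(n, t):
    """Legendre polynomial P_n(t) via the three-term recurrence."""
    p0, p1 = 1.0, t
    if n == 0:
        return p0
    for m in range(2, n + 1):
        p0, p1 = p1, ((2 * m - 1) * t * p1 - (m - 1) * p0) / m
    return p1

def multipole(x, y, L):
    """Truncated multipole expansion of 1/|x - y| about the origin,
    valid (convergent) for |x| > |y|."""
    rx = math.sqrt(sum(c * c for c in x))
    ry = math.sqrt(sum(c * c for c in y))
    t = sum(a * b for a, b in zip(x, y)) / (rx * ry)   # cosine of the angle
    return sum(ry ** n / rx ** (n + 1) * legendre(n, t) for n in range(L + 1))

# point mass at y (inside the "Runge ball"), evaluation point x outside
y = (0.2, 0.1, -0.15)
x = (0.8, -0.5, 0.6)
exact = 1.0 / math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
approx = multipole(x, y, L=20)
```

The truncation error decays geometrically with the ratio |y|/|x|, which is the elementary prototype of the uniform convergence outside the Runge sphere.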

Literature

1. Adams, D.R., Hedberg, L.I.: Function Spaces and Potential Theory. Grundlehren der Mathematischen Wissenschaften, vol. 314. Springer, Berlin (1996)
2. Albertella, A., Savcenko, R., Bosch, W., Rummel, R.: Dynamic Ocean Topography – The Geodetic Approach, vol. 27. Schr-r. Inst. für Astron. Phys. Geod. Forsch-einricht. Satell-geod, TU München, München (2008)
3. Ansorge, R., Sonar, T.: Mathematical Models of Fluid Dynamics. Wiley, Weinheim (2009)
4. Augustin, M., Freeden, W.: A survey on classical boundary value problems in physical geodesy. In: Grafarend, E.W. (ed.) Encyclopedia of Geodesy, pp. 1–7. Springer International Publishing, Switzerland (2015)
5. Augustin, M., Freeden, W., Nutz, H.: About the importance of the Runge–Walsh concept for gravitational field determination. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy, Geosystems Mathematics, pp. 517–560. Birkhäuser, Springer International Publishing, Basel/New York/Heidelberg (2018)
6. Bjerhammar, A.: Gravity Reduction to an Internal Sphere. Division of Geodesy, Stockholm (1962)
7. Bruns, H.: Die Figur der Erde. Ein Beitrag zur europäischen Gradmessung. P. Stankiewicz, Berlin (1878)
8. Cui, J., Freeden, W.: Equidistribution on the sphere. SIAM J. Sci. Stat. Comput. 18, 595–609 (1997)
9. Fehlinger, T.: Multiscale formulations for the disturbing potential and the deflections of the vertical in locally reflected physical geodesy. Ph.D. Thesis, Geomathematics Group, University of Kaiserslautern (2009)
10. Fehlinger, T., Freeden, W., Gramsch, S., Mayer, C., Michel, D., Schreiner, M.: Local modelling of sea surface topography from (geostrophic) ocean flow. ZAMM 87, 775–791 (2007)
11. Freeden, W.: Über eine Verallgemeinerung der Hardy–Landauschen Identität. Manuscr. Math. 24, 205–216 (1978)
12. Freeden, W.: Über eine Klasse von Integralformeln der Mathematischen Geodäsie. Veröff. Geod. Inst. RWTH Aachen, 27 (1979)
13. Freeden, W.: On the approximation of external gravitational potential with closed systems of (trial) functions. Bull. Géod. 54, 1–20 (1980)
14. Freeden, W.: Geomathematics: its role, its aim, and its potential. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn., pp. 3–79. Springer, New York/Berlin/Heidelberg (2015)
15. Freeden, W., Gerhards, C.: Geomathematically Oriented Potential Theory. Chapman and Hall/CRC Press, Boca Raton/London/New York (2013)
16. Freeden, W., Gutting, M.: Integration and Cubature Methods. Chapman and Hall/CRC Press, Boca Raton/London/New York (2018)
17. Freeden, W., Kersten, H.: The Geodetic Boundary-Value Problem Using the Known Surface of the Earth, vol. 29. Veröff. Geod. Inst. RWTH Aachen, Aachen (1980)
18. Freeden, W., Kersten, H.: A constructive approximation theorem for the oblique derivative problem in potential theory. Math. Methods Appl. Sci. 4, 104–114 (1981)
19. Freeden, W., Maier, T.: On multiscale denoising of spherical functions: basic theory and numerical aspects. Electron. Trans. Numer. Anal. 14, 56–78 (2002)
20. Freeden, W., Maier, T.: Spectral and multiscale signal-to-noise thresholding of spherical vector fields. Comput. Geosci. 7, 215–250 (2003)
21. Freeden, W., Mayer, C.: Multiscale solution for the Molodensky problem on regular telluroidal surfaces. Acta Geod. Geophys. Hung. 41, 55–86 (2006)
22. Freeden, W., Michel, V.: Multiscale Potential Theory (With Applications to Geoscience). Birkhäuser, Boston (2004)
23. Freeden, W., Schreiner, M.: Local multiscale modelling of geoid undulations from deflections of the vertical. J. Geod. 79, 641–651 (2006)
24. Freeden, W., Schreiner, M.: Spherical Functions of Mathematical Geosciences – A Scalar, Vectorial, and Tensorial Setup. Springer, Heidelberg (2009)
25. Freeden, W., Wolf, K.: Klassische Erdschwerefeldbestimmung aus der Sicht moderner Geomathematik. Math. Semesterb. 56, 53–77 (2009)
26. Freeden, W., Gervens, T., Schreiner, M.: Constructive Approximation on the Sphere (With Applications to Geomathematics). Oxford Science Publications/Clarendon, Oxford (1998)
27. Freeden, W., Michel, D., Michel, V.: Local multiscale approximation of geostrophic oceanic flow: theoretical background and aspects of scientific computing. Mar. Geod. 28, 313–329 (2005)
28. Fehlinger, T., Freeden, W., Klug, M., Mathar, D., Wolf, K.: Classical globally reflected gravity field determination in modern locally oriented multiscale framework. J. Geod. 83, 1171–1191 (2009)
29. Gauss, C.F.: Allgemeine Theorie des Erdmagnetismus. Resultate aus den Beobachtungen des magnetischen Vereins, Göttingen (1838)
30. Gerhards, C.: Spherical multiscale methods in terms of locally supported wavelets: theory and application to geomagnetic modeling. Ph.D. Thesis, Geomathematics Group, University of Kaiserslautern (2011)
31. Grafarend, E.W., Klapp, M., Martinec, Z.: Spacetime modeling of the Earth's gravity field by ellipsoidal harmonics. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn., pp. 381–496. Springer, New York/Berlin/Heidelberg (2015)
32. Green, G.: An Essay on the Application of Mathematical Analysis to the Theories of Electricity and Magnetism. T. Wheelhouse, Nottingham (1838)
33. Groten, E.: Geodesy and the Earth's Gravity Field I + II. Dümmler, Bonn (1979)
34. Heiskanen, W.A., Moritz, H.: Physical Geodesy. Freeman, San Francisco (1967)
35. Hofmann-Wellenhof, B., Moritz, H.: Physical Geodesy, 2nd edn. Springer, Wien/New York (2006)
36. Hörmander, L.: The Boundary Problems of Physical Geodesy. The Royal Institute of Technology, Division of Geodesy, Report 9, Stockholm (1975)
37. Hotine, M.: Mathematical Geodesy. ESSA Monograph, vol. 2. U.S. Department of Commerce, Washington (1969); Reprint 1992 by Springer
38. Jekeli, C.: An analysis of deflections of the vertical derived from high-degree spherical harmonic models. J. Geod. 73, 10–22 (1999)
39. Kellogg, O.D.: Foundations of Potential Theory. Frederick Ungar Publishing Company, New York (1929)
40. Koch, K.R., Pope, A.J.: Uniqueness and existence for the geodetic boundary value problem using the known surface of the Earth. Bull. Géod. 106, 467–476 (1972)
41. Krarup, T.: A Contribution to the Mathematical Foundation of Physical Geodesy. Danish Geodetic Institute, Report No. 44, Copenhagen (1969)
42. Krarup, T.: Letters on Molodensky's Problem I–IV. Communication to the members of the IAG Special Study Group 4.31 (1973)
43. Laplace, P.S.: Traité de mécanique céleste. Tome 2, Paris (1799)
44. Listing, J.B.: Über unsere jetzige Kenntnis der Gestalt und Größe der Erde. Dietrichsche Verlagsbuchhandlung, Göttingen (1873)
45. Listing, J.B.: Neue geometrische und dynamische Constanten des Erdkörpers. Nachr. Königl. Ges. Wiss. & Georg-August-Univ. Göttingen, Dietrichsche Verlagsbuchhandlung, pp. 749–815 (1878)
46. Maire, Ch., Bošković, R.J.: De Litteraria Expeditione per Pontificiam Ditionem ad Dimetiendos duos Meridiani Gradus. In typographio Palladis, excudebant Nicolaus, et Marcus Palearini, Romae, pp. 409–503 (1755)
47. Markina, I., Vodopyanov, S.K.: Fundamentals of the nonlinear potential theory for subelliptic equations. I, II. Siberian Adv. Math. 7, 32–62 (1997)
48. Maximenko, N., Niiler, P., Rio, M.-H., Melnichenko, O., Centurioni, L., Chambers, D., Zlotnicki, V., Galperin, B.: Mean dynamic topography of the ocean derived from satellite and drifting buoy data using three different techniques. J. Atmos. Ocean. Tech. 26, 1910–1919 (2009)
49. Molodensky, M.S., Eremeev, V.F., Yurkina, M.I.: Methods for Study of the External Gravitational Field and Figure of the Earth, p. 131. Trudy TSNIIGAiK, Geodezizdat, Moscow (1960) (English transl.: Israel Program for Scientific Translations, Jerusalem (1962))
50. Moritz, H.: Der Begriff der mathematischen Erdgestalt seit Gauss. Allgemeine Vermessungs-Nachrichten, 133–138 (1977)
51. Moritz, H.: Advanced Physical Geodesy. Wichmann Verlag, Karlsruhe (1980)
52. Moritz, H.: Advanced Physical Geodesy, 2nd edn. Wichmann, Karlsruhe (1989)
53. Moritz, H.: Classical physical geodesy. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 1st edn., pp. 127–158. Springer, New York/Berlin/Heidelberg (2010)
54. Moritz, H.: Classical physical geodesy. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn., pp. 253–290. Springer, New York/Berlin/Heidelberg (2015)
55. Nerem, R.S., Koblinsky, C.J.: The geoid and ocean circulation. In: Vaníček, P., Christou, N.T. (eds.) Geoid and Its Geophysical Interpretations. CRC Press, Boca Raton (1994)
56. Neumann, F.: Vorlesungen über die Theorie des Potentials und der Kugelfunktionen, pp. 135–154. Teubner, Leipzig (1887)
57. Pedlosky, J.: Geophysical Fluid Dynamics. Springer, New York (1979)
58. Petrini, H.: Sur l'existence des dérivées secondes du potentiel. C. R. Acad. Sci. Paris 130, 233–235 (1900)
59. Pick, M., Pícha, J., Vyskočil, V.: Theory of the Earth's Gravity Field. Elsevier Scientific Publishing Company, Amsterdam (1973)
60. Pizzetti, P.: Sopra il calcolo teorico delle deviazioni del geoide dall'ellissoide. Atti R. Accad. Sci. Torino 46, 331–350 (1910)
61. Rummel, R.: Geodesy. In: Nierenberg, W.A. (ed.) Encyclopedia of Earth System Science, vol. 2, pp. 253–262. Academic Press, San Diego (1992)
62. Runge, C.: Zur Theorie der eindeutigen analytischen Funktionen. Acta Math. 6, 229–234 (1885)
63. Sansó, F.: A note on density problems and the Runge–Krarup theorem. Bollettino di Geodesia e Scienze Affini 41, 422–477 (1982)
64. Stokes, G.G.: On the variation of gravity on the surface of the Earth. Trans. Camb. Phil. Soc. 8, 672–695 (1849)
65. Todhunter, I.: A History of the Mathematical Theories of Attraction and of the Figure of the Earth from the Time of Newton to that of Laplace. Macmillan, London (1878); Reprint 1962 by Dover Publications, New York
66. Torge, W.: Geodesy, 2nd edn. de Gruyter, Berlin (1991)
67. Vening Meinesz, F.A.: A formula expressing the deflection of the plumb-line in the gravity anomalies and some formulae for the gravity field and the gravity potential outside the geoid. Proc. Koninklijke Nederlandse Akademie van Wetenschappen 31(3), 315–331 (1928)
68. Vekua, I.N.: Über die Vollständigkeit des Systems harmonischer Polynome im Raum. Dokl. Akad. Nauk 90, 495–498 (1953)
69. Walsh, J.L.: The approximation of harmonic functions by harmonic polynomials and by harmonic rational functions. Bull. Am. Math. Soc. 35, 499–544 (1929)
70. Wangerin, A.: Theorie des Potentials und der Kugelfunktionen. Walter de Gruyter & Co., Berlin/Leipzig (1921)
71. Wermer, J.: Potential Theory. Lecture Notes in Mathematics, vol. 408. Springer, Berlin (1974)
72. Weyl, H.: Über die Gleichverteilung von Zahlen mod. Eins. Math. Ann. 77, 313–352 (1916)
73. Wolf, K.: Multiscale modeling of classical boundary value problems in physical geodesy by locally supported wavelets. Ph.D. Thesis, Geomathematics Group, University of Kaiserslautern (2009)

18 Gravimetric Measurements, Gravity Anomalies, Geoid, Quasigeoid: Theoretical Background and Multiscale Modeling

Gerhard Berg, Christian Blick, Matthias Cieslack, Willi Freeden, Zita Hauler, and Helga Nutz

Contents
1 Introduction . . . . 1118
2 Gravity and Gravitation . . . . 1119
3 Gravimeter and Gravimetry . . . . 1122
4 Geoid: Physical Nature, Geomathematical Context, and Multiscale Decorrelation . . . . 1127
5 Geoid Versus Quasigeoid . . . . 1155
6 Quasigeoid: Geometric Nature, Geomathematical Context, and Multiscale Decorrelation . . . . 1159
7 The German Quasigeoid GCG2016 . . . . 1162
8 Acquisition and Provision of Gravity Data for the Quasigeoid Modeling . . . . 1164
9 Multiscale Decorrelation of Gravimetric Data . . . . 1169
10 Conclusion . . . . 1176
Literature . . . . 1177

This chapter is part of the series Handbuch der Geodäsie, volume Mathematische Geodäsie/Mathematical Geodesy, edited by Willi Freeden, Kaiserslautern.

G. Berg · M. Cieslack
Landesamt für Vermessung und Geobasisinformation Rheinland-Pfalz, Koblenz, Deutschland
E-Mail: [email protected]; [email protected]

C. Blick · Z. Hauler · H. Nutz
CBM – Gesellschaft für Consulting, Business und Management mbH, Bexbach, Deutschland
E-Mail: [email protected]; [email protected]; [email protected]

W. Freeden (✉)
Geomathematics Group, Mathematics Department, University of Kaiserslautern, Kaiserslautern, Germany
E-Mail: [email protected]; [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020
W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_98


G. Berg et al.

Abstract The methodical aspects of gravimetry are investigated from observational as well as mathematical/physical point of view. Local gravimetric data sets are exploited to visualize multiscale features in geophysically relevant signature bands of gravity anomalies and quasigeoidal heights. Wavelet decorrelation is illustrated for a certain area of Rhineland-Palatinate.

Zusammenfassung Die methodischen Aspekte der Gravimetrie werden sowohl in messtechnischer als auch mathematisch/physikalischer Hinsicht untersucht. Lokale gravimetrische Datensätze werden genutzt, um Multiskalenmerkmale in geophysikalisch relevanten Signaturbändern von Gravitationsanomalien und Quasigeoidhöhen zu visualisieren. Wavelet-Dekorrelationen werden für ein bestimmtes Gebiet in Rheinland-Pfalz illustriert.

Keywords Gravimetry · Gravity anomalies · Geoid versus quasigeoid · Multiscale decorrelation

Schlüsselwörter Gravimetrie · Schwereanomalien · Geoid versus Quasigeoid · Multiskalendekorrelation

1 Introduction

Concerning Earth's gravity observation, it should be pointed out that the terrestrial distribution of Earth's gravity data is far from being homogeneous, with large gaps, in particular over oceans but also over land. In addition, the quality of the data varies considerably. As a matter of fact, a global terrestrial gravity data coverage now and in the foreseeable future is far from being satisfactory. This is one of the reasons why spaceborne measurements have to come into play for global gravity determination. Until now, however, the relatively poor precision of satellite-only spaceborne gravity measurements has hindered a wider use of this type of measurements, in particular for local purposes of geoidal modeling and exploration. Nonetheless, satellite models can be used as trend solutions to avoid oscillation phenomena (Gibbs phenomena) of terrestrial data modeling in data gaps and in the vicinity of the boundary of the local area under consideration. In fact, it must be emphasized that, in future, only a coordinated research of a horizontal as well as a vertical “zooming-in” approach will provide a breakthrough in local modeling to understand significant structures and processes of the Earth's system.

18 Gravimetric Measurements, Gravity Anomalies, Geoid, Quasigeoid . . .


As already pointed out in [17], the way forward in local modeling has to be based on three requirements:

(i) Vertical multiscale modeling, i.e., "zooming-in downward continuation" of different data sources, starting from globally available spaceborne SST and/or SGG data as means for an appropriate trend solution, via more accurate (regional) airborne data, down to (local) high-precision terrestrial gravimetric data sets (this aspect has been investigated in several publications, e.g., [18, 20, 24, 26, 27], and the references therein).

(ii) Horizontal multiscale modeling, i.e., "zooming-in procedures" from rough to dense terrestrial data material and from global to local areas, where certain geodetic features should be specified and investigated in more detail (see, e.g., [6, 29, 31–34], and the references therein).

(iii) Combining data from different sensors and sources, thereby observing that positioning systems are ideally located as far as possible from the Earth, while gravity field sensors are ideally located as close as possible to the Earth.

In this contribution, we briefly explain the status quo of gravimetric observation. We recapitulate the standard definitions of gravity anomaly, Bouguer anomaly, anomalous potential, geoid, and quasigeoid. On the basis of these results we present horizontal multiscale methods by means of geoscientifically relevant wavelets for the purpose of an appropriate decorrelation of interpretable signatures inherent in the gravitational quantities under consideration (for an excursion to vertical multiscale modeling the reader is referred to, e.g., [35] (this handbook) and the references therein).

2 Gravity and Gravitation

The following terminology is standard in the geodetic context: The gravity acceleration (gravity) w is the resultant of the gravitation v and the centrifugal acceleration c, such that

w = v + c.    (1)

The centrifugal acceleration c arises as a result of the rotation of the Earth about its axis. For purposes of local gravity exploration we are allowed to assume a rotation of constant angular velocity ω. The centrifugal acceleration acting on a unit mass is directed outward, perpendicularly to the spin axis (see Fig. 1). Introducing the so-called centrifugal potential C, such that c = ∇C, the function C turns out to be non-harmonic. The direction of the gravity w is known as the direction of the plumb line, and the quantity |w| is called the gravity intensity (often also just called gravity and usually denoted in the geodetic jargon by g). Altogether, the gravity potential of the Earth can be expressed in the form


G. Berg et al.

Fig. 1 Gravitation v, centrifugal acceleration c, and gravity acceleration w

Fig. 2 Level surfaces and plumb lines for a homogeneous ball (left) and an Earth-like body (right) (taken from [20])

W = V + C,    (2)

and the gravity acceleration w is given by

w = ∇W = ∇V + ∇C = v + c.    (3)
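The decomposition (1)–(3) can be sketched numerically. The following snippet is an illustrative sketch (not taken from the chapter): it evaluates w = v + c for a point on a spherical Earth model, using the standard values of the geocentric gravitational constant and the Earth's angular velocity; the spherical-Earth assumption is ours.

```python
import numpy as np

GM = 3.986004418e14      # m^3 s^-2, geocentric gravitational constant
OMEGA = 7.292115e-5      # rad s^-1, Earth's angular velocity
R = 6378137.0            # m, equatorial radius (Earth assumed spherical here)

def gravitation(x):
    """v = -GM x / |x|^3 (homogeneous spherical Earth)."""
    r = np.linalg.norm(x)
    return -GM * x / r**3

def centrifugal(x):
    """c = grad C with centrifugal potential C(x) = 0.5 * omega^2 * (x1^2 + x2^2)."""
    return OMEGA**2 * np.array([x[0], x[1], 0.0])

def gravity(x):
    """w = v + c, Eq. (1)."""
    return gravitation(x) + centrifugal(x)

x_eq = np.array([R, 0.0, 0.0])        # point on the equator
print(np.linalg.norm(gravity(x_eq)))  # about 9.76 m/s^2
```

At the equator the outward centrifugal acceleration (about 0.034 m/s²) opposes the gravitation, which is why the gravity intensity is slightly smaller than GM/R².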

As already pointed out, the surfaces of constant gravity potential, i.e., W = const., are designated as equipotential (level, or geopotential) surfaces of gravity (for more details, the reader is referred to the standard monographs in physical geodesy, e.g., [39, 40, 43, 64, 85]). The force of gravity provides a directional structure to the space above the Earth's surface (see Fig. 2). It is


tangential to the vertical plumb lines and perpendicular to all (level) equipotential surfaces. Any water surface at rest is part of a level surface. Level, i.e., equipotential, surfaces are ideal reference surfaces, for example, for changes in the Earth's system. The geoid is defined as that level surface of the gravity field which best fits the mean sea level. A gravity anomaly is understood as the difference between the observed acceleration on the Earth's surface and the corresponding value originating from a model of the Earth's gravity field. Historically, the model is constructed under simplifying assumptions, usually in such a way that the figure of an ellipsoid of revolution or a spheroidal surface is assumed (cf. Fig. 3). Gravity on this reference surface is given by a (simple) known formula. Its subtraction from observed gravity at the same location provides the gravity anomaly. Of course, anomalies are much smaller than the values of gravity themselves. A location with a positive anomaly typically exhibits more gravity than predicted by the model, suggesting the presence of a subsurface positive mass anomaly; a negative anomaly exhibits a lower value than predicted, suggesting a subsurface mass deficit. Thus, gravity anomalies are of substantial geophysical as well as geological interest. Once an artificial geopotential field U has been constructed matching the reference surface (in geodesy, usually, an ellipsoid/spheroid but, in principle, any surface close to the geoidal surface may be taken) with an equipotential surface, it is called a normal potential (see, e.g., [40]). The difference of the potential W and the normal potential U is known as the disturbing potential (or anomalous potential) T:

T = W − U.    (4)

By construction, the disturbing potential T is much smaller in magnitude than U and W and captures the detailed variations of the true gravity field of the actual Earth

Fig. 3 Geodetically relevant surfaces: topographic geometry, mean sea level, geoid, ellipsoid/spheroid (sectional illustration from W. Freeden, M. Schreiner [28], Mathematical Geodesy – Its Role, Its Aim, and Its Potential, this handbook)


from point to point, as distinguished from the global trend captured by the reference surface (e.g., ellipsoid/spheroid). The direction of the gravity vector can be obtained, e.g., by astronomical positioning. Measurements are possible on the Earth's surface. Observations of the gravity vector are converted into so-called deflections of the vertical by subtracting a corresponding reference direction derived from a simple gravity field model associated to, e.g., a reference surface. Deflections of the vertical constitute tangential fields of the anomalous potential. Due to the high measurement effort required to acquire these types of data compared to a gravity measurement, the data density of vertical deflections is much lower than that of gravity anomalies. Gravitational field determination based on the observation of deflections of the vertical, combined with gravity, is feasible in smaller areas with good data coverage. The actual Earth's surface (globally available from modern spaceborne techniques such as GNSS, LASER, VLBI, etc.) does not coincide with an equipotential surface (i.e., a level surface). The force of gravity is generally not perpendicular to the actual Earth's surface (see Fig. 3). We are thus confronted with the gravity intensity as an oblique derivative on the Earth's surface: the gravity vector is an oblique vector at any point on the Earth's surface and generally not the normal vector. The determination of equipotential surfaces of the potential W is strongly related to the knowledge of the potential V. The gravity vector w given by w = ∇W is normal to the equipotential surface passing through the same point. Once more, equipotential surfaces such as the geoid intuitively express the notion of horizontal surfaces, as they are normal to the plumb lines given by the direction of the gravity vector.

3 Gravimeter and Gravimetry

Next, in some parts, we almost literally follow the explanations to be found in [7]. Gravimeters are typically designed to measure very tiny fractional changes of the Earth's gravity, caused by nearby geologic structures or the shape of the Earth (Fig. 5). There are two types of gravimeters, viz. absolute gravimeters (cf. Fig. 4) and relative gravimeters (cf. Figs. 6 and 7). Absolute gravimeters measure the local gravity and are directly based on measuring the acceleration of free fall (for example, of a test mass in a vacuum tube). A new type of absolute gravimeter is the atomic gravimeter, which measures the free fall of laser-cooled atoms. Atomic gravimeters promise a greatly increased precision of gravity measurements. Relative gravimeters compare the value of gravity at one point with that at another. They must be calibrated at a location where the gravity is known accurately, and they measure the ratio of the gravity at the two points. The most common relative gravimeters are spring-based (cf. Figs. 6 and 7). By determining the amount by which a weight stretches the spring, gravity becomes available via Hooke's law. Relative gravity measurements of the highest possible accuracy are conducted at the Earth's

Fig. 4 The principle of an absolute gravimeter: the gravity intensity |w| can be obtained via the ordinary differential equation (d/dt)²x = |w| by measuring the transition of the test mass through three time levels

surface. Measurements on ships and in aircraft deliver reasonably good data only after the removal of inertial noise. Gravimetry relates the Earth's mass density to the gravitational potential via Newton's Law of Gravitation. By inverse gravimetry, we denote the determination of the Earth's mass density distribution from data of the gravitational potential or related quantities. Clearly, the determination of gravity intensities as well as gravity anomalies of a scale very much larger than the gravity anomalies caused by regional structures is of less significance for purposes of local gravitational determination. More particularly, local gravimetric methods are based on the measurement of small variations. It should be mentioned that gravity prospecting was first used in the case of strong density contrasts in a geological structure, and the usual approach is to measure differences in gravity from place to place. Today, the interpretation of gravimetric data is done by comparing the shape and size of gravity disturbances and anomalies to those caused by bodies of various geometrical shapes at different depths and differing densities. The observed gravity intensity |w| on the Earth's surface (see Fig. 7) depends on the following effects to be removed (for more detailed studies, see, e.g., [65, 72, 81]):

• attraction of the reference surface (e.g., an ellipsoid/spheroid),
• elevation above sea level,
• topography,
• time dependent (tidal) variations,
• (Eötvös) effect of a moving platform,
• isostatic balance on the lower lithosphere,
• density variations inside the upper crust.
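The free-fall principle of the absolute gravimeter (Fig. 4) can be sketched numerically: the trajectory x(t) = x₀ + v₀t + ½|w|t² is observed at a few time levels, and a least-squares quadratic fit recovers the gravity intensity. The time/position values below are synthetic illustration data, not instrument readings.

```python
import numpy as np

g_true = 9.81                                 # m/s^2, assumed value for the simulation
t = np.array([0.00, 0.05, 0.10, 0.15, 0.20])  # s, measurement epochs
x = 0.2 * t + 0.5 * g_true * t**2             # m, simulated fall distances (v0 = 0.2 m/s)

# Fit x(t) by a degree-2 polynomial; the leading coefficient equals g/2.
coeffs = np.polyfit(t, x, 2)
g_est = 2.0 * coeffs[0]
print(round(g_est, 3))  # 9.81
```

In a real instrument many more time levels are observed and systematic effects (tides, drift) are removed before the fit; the quadratic model itself is exactly the equation of motion in the caption of Fig. 4.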


Fig. 5 Relative gravimeter Scintrex-CG6 in action (State Office for Surveying and Geobase Information Rhineland-Palatinate (LVermGeo))

In more detail, certain corrections have to be applied to the data in order to account for effects not related to the subsurface: Drift corrections are necessary, since each gravimeter suffers mechanical changes over time, and so does its output measurement. This change is generally assumed to be linear. In case of acquisition on a moving platform, the motion relative to the surface of the Earth implies a change in centrifugal acceleration. The Eötvös correction depends on the latitude


Fig. 6 Hooke's law considers a simple helical spring that has one end attached to some fixed object, while the free end is being pulled by a force whose magnitude is m|w(x)| at the location x. Hooke's law is a first-order linear approximation to the real response of a spring. It tells us that the magnitude of the force is proportional to the extension of the spring, i.e., |w(x)| − |w(x′)| = (κ/m)(L(x) − L(x′)), where m represents the mass and κ is a constant characterizing the material of the spring
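The relative measurement principle of Fig. 6 can be sketched as follows; the spring constant, test mass, and length readings are invented illustration values, not instrument data.

```python
# From Hooke's law m*|w| = kappa*(L - L0), a gravity difference follows from
# the difference of spring extensions: |w(x)| - |w(x')| = (kappa/m)*(L(x) - L(x')).
kappa = 5.0         # N/m, assumed spring constant
m = 0.5             # kg, test mass
L_site1 = 0.981000  # m, spring length at the calibration site
L_site2 = 0.980985  # m, spring length at the field site

delta_g = (kappa / m) * (L_site1 - L_site2)  # m/s^2
print(round(delta_g * 1e5, 3))  # 15.0 (mGal, with 1 mGal = 1e-5 m/s^2)
```

Note that the unknown rest length L₀ cancels in the difference, which is exactly why a relative gravimeter needs calibration at one site of known gravity but no absolute length reference.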

Fig. 7 Mass-implied gravitational effect (mass attraction over sediment, salt, and basalt structures) obtained by a relative gravimeter (illustration with kind permission of Teubner-publishing taken from [46] in modified form)

and the velocity vector of the moving platform. It should be observed that the free-air anomaly does not correct for the first two effects, which could mask the gravity anomalies related to the Bouguer density contrasts in the crust. The complete Bouguer correction effectively removes the gravity anomalies due to bathymetry, but the corrected values still contain the gravity effect of the Moho; isostatic corrections account for this effect. Special methods such as Poincaré–Prey corrections are in use within boreholes or for special geoid computations. For more information the reader is referred to geodetic textbooks such as [40, 43, 84, 85] and to the literature concerned with prospecting and exploration (see, e.g., [65, 66, 81] and the references therein).
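Two of the elevation-related reductions can be illustrated with a short calculation. The free-air gradient (about 0.3086 mGal per meter) and the infinite Bouguer slab formula 2πGρh are standard textbook approximations assumed here; the station values are synthetic.

```python
import math

G = 6.6742e-11  # m^3 kg^-1 s^-2, gravitational constant (value as used in the text)
rho = 2670.0    # kg/m^3, conventional crustal density
h = 500.0       # m, assumed station elevation

free_air = 0.3086 * h                           # mGal, free-air correction
bouguer_slab = 2 * math.pi * G * rho * h * 1e5  # mGal (1 mGal = 1e-5 m/s^2)

print(round(free_air, 1))      # 154.3
print(round(bouguer_slab, 1))  # 56.0
```

The two corrections act in opposite directions: the free-air term restores the gravity lost with elevation, while the Bouguer term removes the attraction of the rock slab between station and reference level.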


As a consequence, to isolate the effects of local density variations from all other contributions, it is necessary to apply a series of reductions (Fig. 8):

• The attraction of, e.g., the reference ellipsoid/spheroid has to be subtracted from the measured values.
• An elevation correction must be done, i.e., the vertical gradient of gravity is multiplied by the elevation of the station and the result is added. With increasing elevation of the Earth, there is usually an additional mass between the reference level and the actual level. This additional mass itself exerts a positive gravitational attraction.
• Bouguer correction and terrain correction are applied to correct for the attraction of the slab of material between the observation point and the geoid.
• A terrain correction accounts for the effect of nearby masses above or mass deficiencies below the station. Isostatic correction accounts for the isostatic roots (Moho).

Gravity observation (cf. Fig. 8) can be done over land or sea areas using different techniques and equipment. Terrestrial gravimetry in exploration was first applied to prospect for salt domes, e.g., in the Gulf of Mexico (an example in the eastern part of Germany is shown in Fig. 9), and to look for anticlines in continental areas (see, e.g., [65, 66], and the references therein). Nowadays, gravimetry is in use all over the world in diverse applications, of which we list only a few examples:

(1) Gravimetry is decisive for geodetic purposes of modeling gravity anomalies, geoidal undulations, and quasigeoidal heights.
(2) Gravimetry is helpful in different phases of the oil exploration and production processes as well as in geothermal research.
(3) Archaeological and geotechnical studies aim at the mapping of subsurface voids and overburden variations.
(4) Gravimetric campaigns may be applied for groundwater and environmental studies. They help to map aquifers and to provide formation and/or structural control.
(5) Gravimetric studies give information about tectonically derived changes and volcanological phenomena.

Fig. 8 Illustration of the components of the gravity acceleration (ESA medialab, ESA communication production SP–1314)


Fig. 9 Top: Gravity effect in [μm s⁻²] of the salt dome Werle (Mecklenburg, Germany); bottom: geological vertical profile (with kind permission of Teubner-publishing taken from [46] in modified form)

All in all, the main applications of gravimetry nowadays can be listed as follows: (i) determination of geodetic key observables for modeling gravity anomalies and definition of geological structural settings, such as hotspots and plumes, (ii) fault delineation, (iii) recovery of salt bodies, (iv) detection of metal deposits, (v) forward modeling and inversion (i.e., inverse gravimetry), (vi) postprocessing to assist seismic modeling, geomagnetic interpretation, etc. for explorational purposes (e.g., in geothermal research). The knowledge of horizontal/oblique/vertical derivatives of the gravity potential is a useful addendum to prospecting and exploration. This is the reason why we are interested in discussing derivatives later on in more detail (based on ideas and concepts developed in [20]).

4 Geoid: Physical Nature, Geomathematical Context, and Multiscale Decorrelation

Knowing the gravity potential, all equipotential surfaces (including the geoid at mean sea level) are given by an equation of the form W (x) = const (Fig. 2). By introducing U as the normal gravity potential corresponding to a reference domain, usually an ellipsoidal domain, the disturbing potential T is already known to be the difference of the gravity potential W and the normal gravity potential U , i.e., we are


led to a decomposition of the gravity potential in the form W = U + T. According to the concept developed by [41, 54, 82], and [68, 69] we may assume that (a) the center of the reference surface (ellipsoid) coincides with the center of gravity of the Earth, and (b) the difference of the mass of the Earth and the mass of the reference body (in today's geodesy usually an ellipsoidal body, but in future more complicated bodies may well be sensible) is zero. A point x of the geoid can be projected onto its associated point y of the reference surface (e.g., ellipsoid) by means of the surface normal. The distance N(x) between x and y is called the geoidal height or geoidal undulation in x (cf. Fig. 10). The gravity anomaly vector a(x) at the point x of the geoid is defined as the difference between the gravity vector w(x) and the normal gravity vector u(y), i.e.,

a(x) = w(x) − u(y).    (5)

Another possibility is to form the difference between the vectors w and u at the same point x, leading to the gravity disturbance vector d(x) defined by

d(x) = w(x) − u(x).    (6)

In geodesy, several basic mathematical relations between the scalar fields |w| and |u| as well as between the vector fields a and d are known. In the following, we only describe the fundamental relations heuristically (see, for example, [39, 40] for more details): The gravity disturbance vector d(x) at the point x on the geoid can be written as follows:

d(x) = w(x) − u(x) = ∇(W(x) − U(x)) = ∇T(x).    (7)

Fig. 10 Illustration of the gravity vector w(x), the normal gravity vector u(x), and the geoidal height N(x). Here, ν and ν′ denote the normal to the geoid and the reference surface (ellipsoid), respectively (following [40])


According to Taylor's formula of multivariate analysis, U(y) + (∂U/∂ν′)(y) N(x) is the linearization of U(x), i.e., by expanding the potential U at the point x and truncating the Taylor series at the linear term, we obtain (cf. Figs. 10 and 11)

U(x) ≃ U(y) + (∂U/∂ν′)(y) N(x),    (8)

where

ν′(y) = − u(y)/|u(y)|    (9)

is the ellipsoidal normal at y and the geoidal undulation N(x) is the aforementioned distance between x and y (note that the symbol '≃' means that the error between the left and the right hand side may be assumed to be insignificantly small). Using the fact that T(x) = W(x) − U(x) and observing the relations

|u(y)| = −ν′(y) · u(y) = −ν′(y) · ∇U(y) = −(∂U/∂ν′)(y),    (10)

we find under the assumption of (8) that

N(x) = (U(y) − U(x))/|u(y)| = (T(x) − (W(x) − U(y)))/|u(y)|.    (11)

Finally, considering U(y) = W(x) = const. = W₀, we end up with the so-called Bruns formula (cf. [8])

N(x) = T(x)/|u(y)|.    (12)
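A quick numerical illustration of Bruns' formula (12): it converts the physical quantity T (in m²/s²) into the geometric quantity N (in m) by division with the normal gravity. The sample value of T below is an assumed illustration, not a measured one.

```python
T = 500.0      # m^2/s^2, assumed disturbing potential at a geoid point
gamma = 9.81   # m/s^2, normal gravity |u(y)| at the associated ellipsoid point

N = T / gamma  # geoidal undulation, Eq. (12)
print(round(N, 1))  # 51.0 (meters)
```

This order of magnitude is consistent with the remark later in the text that geoidal heights are only of the order 10⁻⁵ of the Earth's radius, i.e., some tens of meters.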

Fig. 11 Geodetically relevant heights: ellipsoidal height h, orthometric height H, and geoid height N (from W. Freeden, M. Schreiner, [28] Mathematical Geodesy – Its Role, Its Aim, and Its Potential, this handbook)


This formula relates the physical quantity T(x) to the geometric quantity N(x) for points x on the geoid. It is helpful to study the vector field ν(x) in more detail:

ν(x) = − w(x)/|w(x)|.    (13)

Due to the definition of the normal vector field (13), we obtain the identity

w(x) = ∇W(x) = −|w(x)| ν(x).    (14)

In an analogous way we obtain

u(x) = ∇U(x) = −|u(x)| ν′(x).    (15)

The deflection of the vertical Θ(x) at the point x on the geoid is understood to be the angular (i.e., tangential) difference between the directions ν(x) and ν′(x) (see, e.g., [29] for more details). It is determined by the angle between the plumb line and the normal of the reference surface through the same point (see Fig. 10):

Θ(x) = ν(x) − ν′(x) − ((ν(x) − ν′(x)) · ν(x)) ν(x).    (16)

By construction, the deflection of the vertical Θ(x) at x is orthogonal to the normal vector field ν(x), i.e., Θ(x) · ν(x) = 0. Since the plumb lines are orthogonal to the equipotential surfaces of the geoid and the reference surface, respectively, the deflection of the vertical gives, briefly spoken, a measure of the gradient of the equipotential surfaces (cf. [40]). From (14), in connection with (16), it follows that

w(x) = −|w(x)| ( Θ(x) + ν′(x) + ((ν(x) − ν′(x)) · ν(x)) ν(x) ).    (17)

Using Eqs. (15) and (17) we finally obtain for the gravity disturbance vector d(x) at the point x

d(x) = ∇T(x) = w(x) − u(x)
     = −|w(x)| ( Θ(x) + ν′(x) + ((ν(x) − ν′(x)) · ν(x)) ν(x) ) − ( −|u(x)| ν′(x) )
     = −|w(x)| ( Θ(x) + ((ν(x) − ν′(x)) · ν(x)) ν(x) ) − ( |w(x)| − |u(x)| ) ν′(x).    (18)
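The tangential character of the deflection of the vertical asserted in (16) is easy to verify numerically; the two unit vectors below are arbitrary synthetic samples (a plumb-line direction and a slightly tilted reference normal), not geodetic data.

```python
import numpy as np

rng = np.random.default_rng(0)

nu = rng.normal(size=3); nu /= np.linalg.norm(nu)  # direction nu(x), plumb line
nu_p = nu + 1e-4 * rng.normal(size=3)              # nu'(x), slightly tilted normal
nu_p /= np.linalg.norm(nu_p)

# Eq. (16): subtract from nu - nu' its component along nu.
theta = (nu - nu_p) - np.dot(nu - nu_p, nu) * nu
print(abs(np.dot(theta, nu)) < 1e-12)  # True: Theta is orthogonal to nu
```

Since real deflections of the vertical are tiny angles (arcseconds), the 10⁻⁴ tilt used here is already generous; the orthogonality Θ · ν = 0 holds exactly by construction.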

The quantity

D(x) = |w(x)| − |u(x)|    (19)

is called the gravity disturbance, whereas

A(x) = |w(x)| − |u(y)|    (20)

is called the gravity anomaly. Splitting the gradient ∇T(x) of the disturbing potential T at x into a normal part (pointing into the direction of ν(x)) and an angular (tangential) part (using the representation of the surface gradient ∇*), we have

∇T(x) = ν(x) (∂T/∂ν)(x) + (1/|x|) ∇*T(x).    (21)

Since the gravity disturbances represent at most a factor of 10⁻⁴ of the Earth's gravitational force (for more details see [40]), the error between ν(x)(∂T/∂ν)(x) and ν′(x)(∂T/∂ν′)(x) has no (computational) significance. Consequently, we may assume

d(x) ≃ ν′(x) (∂T/∂ν′)(x) + (1/|x|) ∇*T(x).    (22)

Moreover, the scalar product (ν(x) − ν′(x)) · ν(x) can also be neglected. Thus, in connection with (18), we obtain

d(x) ≃ −|w(x)| Θ(x) − D(x) ν′(x).    (23)

By comparison of (22) and (23), we therefore obtain

D(x) = −(∂T/∂ν′)(x) = −ν′(x) · d(x),    (24)

|w(x)| Θ(x) = −(1/|x|) ∇*T(x).    (25)

In other words, the gravity disturbance D(x), beside being the difference in magnitude of the actual and the normal gravity vector, is also the normal component of the gravity disturbance vector d(x). In addition, we are led to the angular differential equation (25). Applying Bruns formula (12) to Eqs. (24) and (25) we obtain

D(x) = |w(x)| − |u(x)| = −|u(y)| (∂N/∂ν′)(x)    (26)

for the gravity disturbance and

|w(x)| Θ(x) = −(1/|x|) ∇*T(x) = −(1/|x|) |u(y)| ∇*N(x)    (27)

for the deflection of the vertical (note that Θ(x) may be multiplied (without loss of (computational) precision) either by |w(x)| or by |u(x)|, since it is a small quantity).


Turning over to the gravity anomalies A(x), it follows from the identity (24) by linearization that

−(∂T/∂ν′)(x) = D(x) ≃ A(x) − (∂|u(y)|/∂ν′) N(x).    (28)

Using Bruns formula (12), we obtain for the gravity anomalies that

A(x) = −(∂T/∂ν′)(x) + (1/|u(y)|) (∂|u(y)|/∂ν′) T(x).    (29)

Summing up our results (24) for the gravity disturbance D(x) and (29) for the gravity anomaly A(x), we are led to the so-called fundamental equations of physical geodesy:

D(x) = |w(x)| − |u(x)| = −(∂T/∂ν′)(x),    (30)

A(x) = |w(x)| − |u(y)| = −(∂T/∂ν′)(x) + (1/|u(y)|) (∂|u(y)|/∂ν′) T(x).    (31)

Equations (30) and (31) show the relation between the disturbing potential T and the gravity disturbance D and the gravity anomaly A, respectively, on the geoid (see, for example, [39, 40, 57]). They are used as boundary conditions in boundary-value problems. Following [40], the geoidal heights N, i.e., the deviations of the equipotential surface at the mean ocean level from the reference ellipsoid, are extremely small (Fig. 11). Their order is only a factor 10⁻⁵ of the Earth's radius (see [40] for more details). Moreover, the reference ellipsoid differs from a sphere S²_R with (mean Earth's) radius R only in the order of the flattening of about 3 · 10⁻³. Therefore, since the time of [82], it has been common practice that, in theory, a reference surface, e.g., an ellipsoidal or a more appropriate surface, should be taken into account. However, in numerical practice, for reasons of numerical economy and practical efficiency, the reference surface is adequately treated as a sphere, and Eqs. (26) and (27) are solved in spherical approximation. In doing so, a relative error of the order of the flattening of the Earth's body at the poles, i.e., a relative global error of 10⁻³, is accepted in all equations containing the disturbing potential. Considering appropriately performed reductions in numerical calculations, this error seems to be quite permissible (cf. [40, 43], and the remarks in [37, 38] for comparison with ellipsoidal approaches), and this is certainly the case if local features are under consideration. In addition, Runge's approach (see, e.g., [3, 27] for more details on the Runge context) allows the calculation of gravitational quantities on arbitrary surfaces, in particular ellipsoids, spheroids, and telluroids, just by suitably operating with spherically based equipment such as multipoles (i.e., (outer) spherical harmonics). In other words, for computational purposes in gravitational theory we are not required to use, for example, an ellipsoidal/spheroidal framework. Instead we are


allowed to perform calculations involving gravitation on an ellipsoid/spheroid just in an appropriate spherical framework. Moreover, in geoscience, it is common numerical practice for local approximations to replace the reference surface by a sphere or even by a plane. In what follows we first use the classical (global) spherical approach of physical geodesy (see, e.g., [40, 43]) for subsequent application in regional/local approximation: According to the classical Pizzetti assumptions (see [68, 69]), it follows that the first moment integrals of the disturbing potential vanish, i.e.,

∫_{S²_R} T(y) H^R_{−n−1,k}(y) dS(y) = 0    (32)

for n = 0, 1, k = 1, . . . , 2n + 1, where {H^R_{−n−1,k}} denotes the system of outer spherical harmonics, dS is the surface element in R³, and S²_R is the sphere in R³ around the origin with radius R. More concretely, the Pizzetti assumptions tell us that

• if the Earth's center of gravity is the origin, there are no first-degree terms in the spherical harmonic expansion of T,
• if the mass of the spherical Earth and the mass of the reference surface (ellipsoid) are equal, there is no zero-degree term.

In this way, together with the indicated processes in gravitational modeling, formulas and structures are obtained that are valid for the sphere. In the well-known spherical nomenclature, involving a sphere S²_R as reference surface for purposes of computation (R being the mean Earth's radius) with a mass M distributed homogeneously in its interior, we are simply led to (cf. [40])

U(y) = γM/|y|,   u(y) = ∇U(y) = −(γM/|y|²) (y/|y|),    (33)

where γ is the gravitational constant (γ = 6.6742 · 10⁻¹¹ m³ kg⁻¹ s⁻²). Hence, we obtain

|u(y)| = γM/|y|²,    (34)

∂|u(y)|/∂ν′ = −(u(y)/|u(y)|) · ∇|u(y)| = −2γM/|y|³,    (35)

(1/|u(y)|) (∂|u(y)|/∂ν′) = −2/|y|.    (36)

Furthermore, in spherical nomenclature, i.e., x ∈ S²_R, we obviously have

−(∂T/∂ν′)(x) = −(x/|x|) · ∇T(x).    (37)


Therefore, we end up with the formulation of the fundamental equations of physical geodesy in a spherical context:

D(x) = −(x/|x|) · ∇T(x),    (38)

A(x) = −(x/|x|) · ∇T(x) − (2/|x|) T(x).    (39)

In addition, in a vector spherical context (see also [30]), we obtain for the differential equation (25)

−∇*T(x) = (γM/R) Θ(x),    (40)

and, by virtue of Bruns formula (12), we finally find that

−∇*N(x) = R Θ(x).    (41)
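Equations (38) and (39) can be checked numerically for a simple test potential (an assumed point-mass disturbance, not from the chapter): for T(x) = c/|x| one has D(x) = c/|x|² and A(x) = D(x) − (2/|x|)T(x) = −c/|x|².

```python
import numpy as np

c = 1000.0       # m^3/s^2, assumed strength of the disturbing point mass
R = 6371000.0    # m, mean Earth radius

def grad_T(x, h=1.0):
    """Central finite-difference gradient of T(x) = c/|x|."""
    g = np.zeros(3)
    for i in range(3):
        e = np.zeros(3); e[i] = h
        g[i] = (c / np.linalg.norm(x + e) - c / np.linalg.norm(x - e)) / (2 * h)
    return g

x = np.array([R, 0.0, 0.0])
xhat = x / np.linalg.norm(x)
D = -xhat @ grad_T(x)                                          # Eq. (38)
A = D - (2.0 / np.linalg.norm(x)) * (c / np.linalg.norm(x))    # Eq. (39)

print(np.isclose(D,  c / R**2, rtol=1e-6))  # True
print(np.isclose(A, -c / R**2, rtol=1e-6))  # True
```

The sign difference between D and A for this radially symmetric test case mirrors the extra term −(2/|x|)T in (39), which stems from the vertical gradient of normal gravity in (36).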

In physical geodesy, a componentwise scalar determination of the vertical deflection is usually used (see, e.g., [40], as well as the paper by [43]). Our work prefers the vectorial framework, i.e., the vector equation (40). In doing so, we are concerned with an isotropic vector approach by means of the fundamental solution with respect to the Beltrami operator (see also [29, 30]) instead of the conventional anisotropic scalar decomposition into vector components due to [89]. The disturbing potential enables us to make the following geophysical interpretations (for more details, see, e.g., the work by [38, 50, 73, 80], and the references therein):

• Gravity disturbances D and gravity anomalies A (see Fig. 12) represent a relation between the real Earth and a reference (e.g., ellipsoidal) Earth model. In accordance with Newton's Law of Gravitation they therefore show the imbalance of forces in the interior of the Earth due to the irregular density distribution inside the Earth. Clearly, gravity anomalies and/or gravity disturbances do not uniquely determine the interior density distribution of the Earth. They may be interpreted as certain filtered signatures, which give major weight to the density contrasts close to the surface and simultaneously suppress the influence of deeper structures inside the Earth.
• Geoid undulations provide a measure for the perturbations of the Earth from hydrostatic equilibrium (see Fig. 13). They form the deviations of the equipotential surfaces at mean sea level from the reference surface (e.g., an ellipsoid). Geoid undulations show no essential correlation to the distribution of the continents. They seem to be generated by density contrasts much deeper inside the Earth.

As already explained, the task of determining the disturbing potential T from gravity disturbances or gravity anomalies, respectively, leads to boundary-value

Fig. 12 Gravity anomalies (above) and gravity disturbances (below), in 10⁻³ m s⁻² (from [30])

problems usually corresponding to a spherical boundary. Numerical realizations of such boundary-value problems have a long tradition, starting from [82] and [67]. Nonetheless, our work presents some recent aspects in their potential theoretic treatment by proposing appropriate space-regularization techniques applied to the resulting integral representations of their solutions. For both boundary-value problems, viz. the Neumann and the Stokes problem, we are able to present two solution methods: The disturbing potential may be either solved by a Fourier (orthogonal)

Fig. 13 Geoidal undulations in [m] (from [30])

expansion method in terms of spherical harmonics, or it can be described by a singular integral representation over the boundary S²_R. So far, much more data on gravity anomalies A(x) = |w(x)| − |u(y)| are available than on gravity disturbances D(x) = |w(x)| − |u(x)|. However, by modern GNSS technology (see, e.g., [87]), it is rather the point x on the geoid than y on the reference ellipsoid that is determined. Therefore, in future, it can be expected that D will become more important than A (as [43] pointed out in their monograph on physical geodesy). This is the reason why we continue to work with D. Nevertheless, the results of our (multiscale) approach applied to A are of significance. Therefore, the key ideas and concepts concerning A can be treated in parallel (see [13, 20, 92] for explicit details). In order to formulate some results in the language of potential theory, we first introduce the potential space Pot(1)(R³\B³_R), where B³_R is the (open) ball of radius R around the origin 0. More concretely, we let Pot(R³\B³_R) be the space of all functions F : R³\B³_R → R satisfying

(i) F restricted to R³\B³_R is a member of C(2)(R³\B³_R),
(ii) F satisfies Laplace's equation ΔF(x) = 0, x ∈ R³\B³_R,
(iii) F is regular at infinity, i.e., F(x) = O(|x|⁻¹), |x| → ∞.

Pot(1)(R³\B³_R) is formally understood to be the space

Pot(1)(R³\B³_R) = C(1)(R³\B³_R) ∩ Pot(R³\B³_R).    (42)

18 Gravimetric Measurements, Gravity Anomalies, Geoid, Quasigeoid . . .


In the language of potential theory, the exterior Neumann boundary-value problem corresponding to known gravity disturbances D (compare (38)) reads as follows:

(ENPPG) Let D be a continuous function on S^2_R = \partial B^3_R, i.e., D \in C^{(0)}(S^2_R), with

\int_{S^2_R} D(y)\, H^R_{-n-1,k}(y)\, dS(y) = 0   (43)

for n = 0, 1, k = 1, \dots, 2n+1. Find T \in Pot^{(1)}(\mathbb{R}^3 \backslash B^3_R) such that the boundary condition D = \partial T / \partial\nu |_{S^2_R} holds true and the potential T fulfills the conditions

\int_{S^2_R} T(y)\, H^R_{-n-1,k}(y)\, dS(y) = 0   (44)

for n = 0, 1, k = 1, \dots, 2n+1.

It is known (see, e.g., [20]) that the solution of the boundary-value problem (ENPPG) can be represented in the form

T(x) = \frac{1}{4\pi R} \int_{S^2_R} D(y)\, N(x,y)\, dS(y), \quad x \in \mathbb{R}^3 \backslash B^3_R,   (45)

where the Neumann kernel N(\cdot,\cdot) in (45) possesses the spherical harmonic expansion

N(x,y) = \sum_{n=2}^{\infty} \left( \frac{R^2}{|x||y|} \right)^{n+1} \frac{2n+1}{n+1}\, P_n\!\left( \frac{x}{|x|} \cdot \frac{y}{|y|} \right).   (46)

By well-known manipulations, the series in terms of Legendre polynomials can be expressed as an elementary function, leading to the integral representation

T(x) = \frac{1}{4\pi R} \int_{S^2_R} D(y) \left( \frac{2R}{|x-y|} + \ln \frac{|y| + \left| y - \frac{R^2}{|x|^2} x \right| - \frac{R^2}{|x|}}{|y| + \left| y - \frac{R^2}{|x|^2} x \right| + \frac{R^2}{|x|}} \right) dS(y).   (47)

It is not difficult to see that, for x \in S^2_R, the integral (47) is equivalent to

T(x) = \frac{1}{4\pi R} \int_{S^2_R} D(y) \left( \frac{2R}{|x-y|} + \ln \frac{|y| + |x-y| - R}{|y| + |x-y| + R} \right) dS(y).   (48)


Written out in spherical nomenclature, x = R\frac{x}{|x|}, y = R\frac{y}{|y|}, x \neq y on S^2_R, we find

N\!\left( R\frac{x}{|x|}, R\frac{y}{|y|} \right) = \frac{2}{\left| \frac{x}{|x|} - \frac{y}{|y|} \right|} + \ln \frac{\left| \frac{x}{|x|} - \frac{y}{|y|} \right|}{2 + \left| \frac{x}{|x|} - \frac{y}{|y|} \right|}.   (49)

If we use

\left| \frac{x}{|x|} - \frac{y}{|y|} \right| = \left( 2 - 2\, \frac{x}{|x|} \cdot \frac{y}{|y|} \right)^{1/2},   (50)

then, for x \neq y, we are led to the identity

N\!\left( R\frac{x}{|x|}, R\frac{y}{|y|} \right) = N\!\left( \frac{x}{|x|}, \frac{y}{|y|} \right) = \frac{\sqrt{2}}{\sqrt{1 - \frac{x}{|x|} \cdot \frac{y}{|y|}}} - \ln\!\left( 1 + \frac{\sqrt{2}}{\sqrt{1 - \frac{x}{|x|} \cdot \frac{y}{|y|}}} \right).   (51)

Consequently, for points x \in S^2_R, we (formally) get the so-called Neumann formula, which constitutes an improper integral over S^2_R:

T\!\left( R\frac{x}{|x|} \right) = \frac{1}{4\pi R} \int_{S^2_R} D\!\left( R\frac{y}{|y|} \right) N\!\left( \frac{x}{|x|}, \frac{y}{|y|} \right) dS(y),   (52)

where the Neumann kernel constitutes a radial basis function due to (50). Once more, in accordance with the conventional approach of physical geodesy, the Neumann formula (52) is valid under the following constraints (see also [31, 40, 63]):

(i) the mass within the reference ellipsoid is equal to the mass of the Earth,
(ii) the center of the reference ellipsoid coincides with the center of the Earth,
(iii) the formulation is given in the spherical context to guarantee economical and efficient numerics.

Note that we are able to set N(R\xi, R\eta) = N(\xi, \eta) = N(\xi \cdot \eta) in terms of the unit vectors \xi = \frac{x}{|x|} and \eta = \frac{y}{|y|}, which simplifies our notation: If we define the single-layer kernel S : [-1, 1) \to \mathbb{R} by

S(t) = \frac{\sqrt{2}}{\sqrt{1-t}}, \quad t \in [-1, 1),   (53)

the Neumann kernel is the zonal function of the form

N(\xi \cdot \eta) = S(\xi \cdot \eta) - \ln\left( 1 + S(\xi \cdot \eta) \right), \quad 1 - \xi \cdot \eta > 0.   (54)

An equivalent formulation of the improper integral (52) over the unit sphere S^2 (= S^2_1 = \partial B^3_1) is then given by

T(R\xi) = \frac{R}{4\pi} \int_{S^2} D(R\eta)\, N(\xi \cdot \eta)\, dS(\eta).   (55)
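As a small numerical cross-check (a sketch, not part of the original exposition), the closed form (54) can be compared with partial sums of the Legendre series (46) restricted to |x| = |y| = R. Note that the series (46) starts at degree n = 2, while the elementary function S(t) - ln(1 + S(t)) corresponds to the full series from n = 0; one can check that the low-degree terms contribute exactly 1 + (3/2)t, and these are annihilated anyway when integrated against a D satisfying the moment conditions (43).

```python
import math

def legendre(t, nmax):
    # Legendre polynomials P_0,...,P_nmax at t via the three-term recurrence
    # (n+1) P_{n+1}(t) = (2n+1) t P_n(t) - n P_{n-1}(t)
    p = [1.0, t]
    for n in range(1, nmax):
        p.append(((2 * n + 1) * t * p[n] - n * p[n - 1]) / (n + 1))
    return p

def neumann_series(t, nmax):
    # Partial sum of the expansion (46) on |x| = |y| = R, starting at n = 2
    p = legendre(t, nmax)
    return sum((2 * n + 1) / (n + 1) * p[n] for n in range(2, nmax + 1))

def neumann_closed(t):
    # Zonal closed form (54), with the degree-0 and degree-1 terms removed
    s = math.sqrt(2.0) / math.sqrt(1.0 - t)
    return s - math.log(1.0 + s) - 1.0 - 1.5 * t
```

The partial sums converge only slowly (the Legendre series converges conditionally for -1 < t < 1), so a large truncation degree and a loose tolerance are appropriate in such a comparison.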

It should be remarked that the exterior Stokes boundary-value problem of determining the disturbing potential from known gravity anomalies can be handled in a quite analogous way (see [13, 20, 92]), providing the so-called Stokes integral associated with the radially symmetric Stokes kernel as an improper integral on S^2_R.

Next we deal with the vertical deflections \Theta (cf. [20, 29]). Suppose that T fulfills the conditions (44). We consider the differential equation (compare Eq. (40))

\nabla^*_\xi T(R\xi) = -\frac{\gamma M}{R}\, \Theta(R\xi),   (56)

where T(R\cdot) represents the disturbing potential and \Theta(R\cdot) denotes the vertical deflection (cf. (40)). The differential equation (56) can be solved in a unique way by means of the fundamental solution with respect to the Beltrami operator:

T(R\xi) = \frac{\gamma M}{R} \int_{S^2} \Theta(R\eta) \cdot \nabla^*_\eta G(\Delta^*; \xi \cdot \eta)\, dS(\eta),   (57)

where (\xi, \eta) \mapsto G(\Delta^*; \xi \cdot \eta), 1 - \xi \cdot \eta \neq 0, is the fundamental solution of the Beltrami equation on the unit sphere S^2 (see, e.g., [16]) given by

G(\Delta^*; \xi \cdot \eta) = \frac{1}{4\pi} \ln(1 - \xi \cdot \eta) + \frac{1}{4\pi} (1 - \ln(2)).   (58)

The identity (57) immediately follows from the Third Green Theorem (cf. [20, 30]) for \nabla^* on S^2 in connection with (44). By virtue of the identity

\nabla^*_\eta G(\Delta^*; \xi \cdot \eta) = -\frac{\xi - (\xi \cdot \eta)\eta}{4\pi (1 - \xi \cdot \eta)}, \quad \xi \neq \eta,   (59)

the integral (57) can be written in the form

T(R\xi) = \frac{R}{4\pi} \int_{S^2} \Theta(R\eta) \cdot g(\Delta^*; \xi, \eta)\, dS(\eta),   (60)

where the vector kernel g(\Delta^*; \xi, \eta), \xi \neq \eta, is given by

g(\Delta^*; \xi, \eta) = -\frac{\gamma M}{R^2}\, \frac{\xi - (\xi \cdot \eta)\eta}{1 - \xi \cdot \eta}   (61)

(note that we write \nabla^*_\eta to indicate that the operator \nabla^* is applied to the \eta-variable). Again we are confronted with a representation of the disturbing potential T as an improper integral over the sphere S^2.

All our settings leading to the disturbing potential on the sphere S^2_R turn out to be improper integrals. As we have shown, they have either the singularity behavior of the single-layer kernel S (cf. Eq. (53)) or the characteristic logarithmic singularity of the fundamental solution with respect to the Beltrami operator G(\Delta^*; \cdot) (cf. (58)). Indeed, the fundamental solution and the single-layer kernel are interrelated (see [20]) by the identities

S(\xi \cdot \eta) = e^{-2\pi G(\Delta^*; \xi \cdot \eta) + \frac{1}{2}}   (62)

and

G(\Delta^*; \xi \cdot \eta) = -\frac{1}{2\pi} \ln(S(\xi \cdot \eta)) + \frac{1}{4\pi}.   (63)

Therefore, we are confronted with the remarkable situation that a (Taylor) regularization of the single-layer kernel implies a regularization of the fundamental solution, and vice versa.

Next, we present multiscale representations for the Neumann kernel N (cf. Eq. (54)). Note that all modern multiscale approaches have in common a conception of wavelets as constituting multiscale building blocks, which provide a fast and efficient way to decorrelate a given signal data set. The properties (basis property, decorrelation, and efficient algorithms) are common features of all wavelets, so that these attributes form the key for a variety of applications (see, e.g., [17, 24], as well as [20]), particularly for signal reconstruction and decomposition, thresholding, data compression, and denoising by, e.g., multiscale signal-to-noise ratio. We follow the space (taylorized) regularization methods presented in [31] for linear regularization of the single-layer kernel S and in [29] for linear regularization of the fundamental solution G(\Delta^*; \cdot) of the Beltrami equation. For higher-order approximations, the reader is referred to the Ph.D. theses [13] and [92]. The essential idea is to regularize the single-layer kernel function

S(t) = \frac{\sqrt{2}}{\sqrt{1-t}}   (64)

by replacing it by a Taylor linearization. To this end, we notice that the first derivative of the kernel S is given as follows:

S'(t) = \frac{1}{\sqrt{2}\, (1-t)^{3/2}}, \quad t \in [-1, 1).   (65)

Consequently, we obtain as (Taylor) linearized approximation corresponding to the expansion point 1 - \frac{\tau^2}{2R^2}, \tau \in (0, 2R],

S(t) = S\!\left( 1 - \frac{\tau^2}{2R^2} \right) + S'\!\left( 1 - \frac{\tau^2}{2R^2} \right) \left( t - \left( 1 - \frac{\tau^2}{2R^2} \right) \right) + \dots .   (66)

In more detail, the kernel S is replaced by its (Taylor) linearized approximation S^\tau at the point 1 - \frac{\tau^2}{2R^2}, \tau \in (0, 2R], given by

S^\tau(t) = \begin{cases} \frac{R}{\tau} \left( 3 - \frac{2R^2}{\tau^2} (1-t) \right), & 0 \le 1-t \le \frac{\tau^2}{2R^2}, \\ \frac{\sqrt{2}}{\sqrt{1-t}}, & \frac{\tau^2}{2R^2} < 1-t \le 2. \end{cases}   (67)
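A direct transcription of (67) (a sketch, with R = 1 as default) makes the continuity of S^\tau at the matching point 1 - t = \tau^2/(2R^2) and the bound S \ge S^\tau easy to check:

```python
import math

def S(t):
    # Original single-layer kernel (64), singular at t = 1
    return math.sqrt(2.0) / math.sqrt(1.0 - t)

def S_tau(t, tau, R=1.0):
    # Taylor-linearized regularization (67): linear inside the cap, S outside
    if 1.0 - t <= tau * tau / (2.0 * R * R):
        return (R / tau) * (3.0 - 2.0 * R * R / (tau * tau) * (1.0 - t))
    return S(t)
```

Both branches take the common value 2R/\tau at the matching point, and S^\tau(1) = 3R/\tau stays finite, which is precisely the point of the regularization.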

Note that the expansion point 1 - \tau^2/(2R^2), \tau \in (0, 2R], is chosen in consistency with the notation in the initial paper [29] and the subsequent papers [33] and [31]. A graphical illustration of the original kernel S(t) and a \tau-scale dependent version of its linear space-regularized kernel S^\tau(t) is shown in Fig. 14. Clearly, the function S^\tau is continuously differentiable on the interval [-1, 1], and we have

(S^\tau)'(t) = \begin{cases} \frac{2R^3}{\tau^3}, & 0 \le 1-t \le \frac{\tau^2}{2R^2}, \\ \frac{1}{\sqrt{2}\, (1-t)^{3/2}}, & \frac{\tau^2}{2R^2} < 1-t \le 2. \end{cases}   (68)

Fig. 14 Single-layer kernel S(t) (continuous black line) and its Taylor linearized regularization S^\tau(t), for R = 1 and \tau = \frac{1}{2}, 1, 2 (dotted lines)

Furthermore, the functions S and S^\tau are monotonically increasing on the interval [-1, 1), such that S(t) \ge S^\tau(t) \ge S(-1) = S^\tau(-1) = 1 holds true on the interval


[-1, 1). Considering the difference between the kernel S and its linearly regularized version S^\tau, we find

S(t) - S^\tau(t) = \begin{cases} \frac{\sqrt{2}}{\sqrt{1-t}} - \frac{R}{\tau} \left( 3 - \frac{2R^2}{\tau^2} (1-t) \right), & 0 < 1-t \le \frac{\tau^2}{2R^2}, \\ 0, & \frac{\tau^2}{2R^2} < 1-t \le 2. \end{cases}   (69)

By elementary manipulations of one-dimensional analysis we readily obtain

\int_{-1}^{1} \left( S(t) - S^\tau(t) \right) dt = O(\tau).   (70)
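The O(\tau) bound (70) can be checked numerically (a sketch, not part of the original text): under the substitution t = 1 - u^2 the integrand becomes polynomial, and an elementary computation suggests that for the kernel (67) the integral even evaluates exactly to 3\tau/(4R), which the code below uses as the reference value.

```python
import math

def diff_integral(tau, R=1.0, m=4000):
    # Midpoint rule for  ∫_{-1}^{1} (S(t) - S^τ(t)) dt.  The integrand vanishes
    # outside the cap 1 - t <= τ²/(2R²); the substitution t = 1 - u² removes
    # the (integrable) singularity of S at t = 1.
    u0 = tau / (math.sqrt(2.0) * R)      # cap boundary in the u-variable
    h = u0 / m
    total = 0.0
    for k in range(m):
        u = (k + 0.5) * h
        s = math.sqrt(2.0) / u                                # S(1 - u²)
        s_tau = (R / tau) * (3.0 - 2.0 * (R * u / tau) ** 2)  # linear branch of (67)
        total += (s - s_tau) * 2.0 * u * h                    # dt = 2u du
    return total
```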

As a consequence, we have

Lemma 1. For F \in C^{(0)}(S^2) and S^\tau defined by (67),

\lim_{\tau \to 0+} \sup_{\xi \in S^2} \left| \int_{S^2} S(\xi \cdot \eta) F(\eta)\, dS(\eta) - \int_{S^2} S^\tau(\xi \cdot \eta) F(\eta)\, dS(\eta) \right| = 0.   (71)

To study the surface gradient \nabla^*_\xi, we choose F to be of class C^{(1)}(S^2). Letting t_\xi \in \mathbb{R}^{3 \times 3} be the orthogonal matrix (with \det(t_\xi) = 1) leaving \varepsilon^3 fixed such that t_\xi \xi = \varepsilon^3, we obtain

\nabla^*_\xi \int_{S^2} S(\xi \cdot \eta) F(\eta)\, dS(\eta) = \int_{S^2} S(\eta_3)\, \nabla^*_\xi F(t^T_\xi \eta)\, dS(\eta)   (72)

for \xi \in S^2 and \eta = (\eta_1, \eta_2, \eta_3)^T. By regularizing the single-layer kernel, we obtain

\nabla^*_\xi \int_{S^2} S^\tau(\xi \cdot \eta) F(\eta)\, dS(\eta) = \int_{S^2} S^\tau(\eta_3)\, \nabla^*_\xi F(t^T_\xi \eta)\, dS(\eta)   (73)

for \xi \in S^2. Finally, Lemma 1 leads us to the following limit relation (see [30]).

Lemma 2. Let F be of class C^{(1)}(S^2), and let S^\tau be given by (67). Then

\lim_{\tau \to 0+} \sup_{\xi \in S^2} \left| \nabla^*_\xi \int_{S^2} S^\tau(\xi \cdot \eta) F(\eta)\, dS(\eta) - \nabla^*_\xi \int_{S^2} S(\xi \cdot \eta) F(\eta)\, dS(\eta) \right| = 0.   (74)

Following Eq. (63), we introduce modified kernels G^\tau(\Delta^*; \cdot) by

G^\tau(\Delta^*; t) = -\frac{1}{2\pi} \ln(S^\tau(t)) + \frac{1}{4\pi}, \quad -1 \le t \le 1.   (75)


These kernels G^\tau(\Delta^*; \cdot) are "single-layer kernel regularizations" of the fundamental solution G(\Delta^*; \cdot), which fulfill the following integral relations (cf. [30]).

Lemma 3. For F \in C^{(0)}(S^2) and G^\tau(\Delta^*; \cdot) defined by (75), we have

\lim_{\tau \to 0+} \sup_{\xi \in S^2} \left| \int_{S^2} G(\Delta^*; \xi \cdot \eta) F(\eta)\, dS(\eta) - \int_{S^2} G^\tau(\Delta^*; \xi \cdot \eta) F(\eta)\, dS(\eta) \right| = 0   (76)

and

\lim_{\tau \to 0+} \sup_{\xi \in S^2} \left| \nabla^*_\xi \int_{S^2} G^\tau(\Delta^*; \xi \cdot \eta) F(\eta)\, dS(\eta) - \nabla^*_\xi \int_{S^2} G(\Delta^*; \xi \cdot \eta) F(\eta)\, dS(\eta) \right| = 0.   (77)

Numerical implementations and computational aspects of the Taylor regularization techniques presented here have been applied (even for subsets of S^2_R) to different fields of physical geodesy (see, e.g., [19, 20, 29–31, 33] and the references therein). The space-regularization techniques enable us to formulate multiscale solutions for the disturbing potential from gravity disturbances or vertical deflections. Note that we need higher-order regularizations whenever gravitational observables containing second- or higher-order derivatives come into play. An example is gravity gradiometry, which will not be discussed here; the interested reader is referred to the contribution on Satellite Gravity Gradiometry within this handbook [35].

The special case study of the linear regularization of the single-layer kernel in the integral representation of the solution of the Neumann boundary-value problem (ENPPG) serves as the point of departure for our considerations. As we already know, the solution of the (Earth's) disturbing potential T \in Pot^{(1)}(\mathbb{R}^3 \backslash B^3_R) from known vertical derivatives, i.e., gravity disturbances D = \partial T / \partial\nu |_{S^2_R} satisfying the conditions (44) on the sphere S^2_R, can be formulated as an improper integral (see Eq. (55)):

T(R\xi) = \frac{R}{4\pi} \int_{S^2} D(R\eta)\, N(\xi \cdot \eta)\, dS(\eta), \quad \xi \in S^2,   (78)

with the Neumann kernel given by (54). Our aim is to formulate regularizations of the disturbing potential T by use of the (Taylor) linearized approximation of the single-layer kernel S^\tau : [-1, 1] \to \mathbb{R}, \tau \in (0, 2R], introduced in (67). As a result, we obtain the regularized Neumann kernels

N^\tau(\xi \cdot \eta) = \begin{cases} S^\tau(\xi \cdot \eta) - \ln\left( 1 + S^\tau(\xi \cdot \eta) \right), & 0 \le 1 - \xi \cdot \eta \le \frac{\tau^2}{2R^2}, \\ S(\xi \cdot \eta) - \ln\left( 1 + S(\xi \cdot \eta) \right), & \frac{\tau^2}{2R^2} < 1 - \xi \cdot \eta \le 2, \end{cases}   (79)

that is,

N^\tau(\xi \cdot \eta) = \begin{cases} \frac{R}{\tau} \left( 3 - \frac{2R^2}{\tau^2} (1 - \xi \cdot \eta) \right) - \ln\left( 1 + \frac{R}{\tau} \left( 3 - \frac{2R^2}{\tau^2} (1 - \xi \cdot \eta) \right) \right), & 0 \le 1 - \xi \cdot \eta \le \frac{\tau^2}{2R^2}, \\ \frac{\sqrt{2}}{\sqrt{1 - \xi \cdot \eta}} - \ln\left( 1 + \frac{\sqrt{2}}{\sqrt{1 - \xi \cdot \eta}} \right), & \frac{\tau^2}{2R^2} < 1 - \xi \cdot \eta \le 2. \end{cases}
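In code, the regularized Neumann kernel (79) is obtained from S^\tau by the same composition as in (54) — the following sketch (R = 1 by default) illustrates this:

```python
import math

def S_tau(t, tau, R=1.0):
    # Regularized single-layer kernel (67)
    if 1.0 - t <= tau * tau / (2.0 * R * R):
        return (R / tau) * (3.0 - 2.0 * R * R / (tau * tau) * (1.0 - t))
    return math.sqrt(2.0) / math.sqrt(1.0 - t)

def N_tau(t, tau, R=1.0):
    # Regularized Neumann kernel (79): N^τ = S^τ - ln(1 + S^τ)
    s = S_tau(t, tau, R)
    return s - math.log(1.0 + s)
```

Outside the cap 1 - t \le \tau^2/(2R^2), N^\tau coincides with N from (54); at t = 1 it stays finite with value 3R/\tau - \ln(1 + 3R/\tau).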

In doing so, we are immediately led to the regularized representation of the disturbing potential T corresponding to the known gravity disturbances:

T^\tau(R\xi) = \frac{R}{4\pi} \int_{S^2} D(R\eta)\, N^\tau(\xi \cdot \eta)\, dS(\eta)
= \frac{R}{4\pi} \int_{\{\eta \in S^2 :\ 1 - \xi \cdot \eta > \tau^2/(2R^2)\}} D(R\eta)\, N(\xi \cdot \eta)\, dS(\eta)
+ \frac{R}{4\pi} \int_{\{\eta \in S^2 :\ 1 - \xi \cdot \eta \le \tau^2/(2R^2)\}} D(R\eta)\, N^\tau(\xi \cdot \eta)\, dS(\eta).   (80)

The representation (80) is remarkable, since the integrands of T and T^\tau only differ on the spherical cap \Gamma_{\tau^2/(2R^2)}(\xi) = \{\eta \in S^2 : 1 - \xi \cdot \eta \le \frac{\tau^2}{2R^2}\}. By aid of Lemma 1, we obtain

Theorem 1. Suppose that T is the solution of the Neumann boundary-value problem (ENPPG) of the form (78). Let T^\tau, \tau \in (0, 2R], represent its regularization (80). Then

\lim_{\tau \to 0+} \sup_{\xi \in S^2} \left| T(R\xi) - T^\tau(R\xi) \right| = 0.

For numerical applications, we have to go over to scale-discretized approximations of the solution to the boundary-value problem (ENPPG). For that purpose, we choose a monotonically decreasing sequence \{\tau_j\}_{j \in \mathbb{N}_0} such that

\lim_{j \to \infty} \tau_j = 0, \quad \tau_0 = 2R.   (81)

A particularly important example, which we use in our numerical implementations below, is the dyadic sequence with

\tau_j = 2^{1-j} R, \quad j \in \mathbb{N}_0.   (82)

It is easy to see that 2\tau_{j+1} = \tau_j, j \in \mathbb{N}_0, is the relation between two consecutive elements of the sequence. In correspondence to the sequence \{\tau_j\}_{j \in \mathbb{N}_0}, a sequence \{N^{\tau_j}\}_{j \in \mathbb{N}_0} of discrete versions of the regularized Neumann kernels (79), so-called


Neumann scaling functions, is available. Fig. 15 (left) shows a graphical illustration of the regularized Neumann kernels for different scales j. The regularized Neumann wavelets, forming the sequence \{WN^{\tau_j}\}_{j \in \mathbb{N}_0}, are understood to be the differences of two consecutive regularized Neumann scaling functions,

WN^{\tau_j} = N^{\tau_{j+1}} - N^{\tau_j}, \quad j \in \mathbb{N}_0.   (83)

The Neumann wavelets are illustrated in Fig. 15 (right). These wavelets possess the numerically important property of a local support. More concretely, \eta \mapsto WN^{\tau_j}(\xi \cdot \eta), \eta \in S^2, vanishes everywhere outside the spherical cap \Gamma_{\tau_j^2/(2R^2)}(\xi).

Let J \in \mathbb{N}_0 be an arbitrary scale. Suppose that N^{\tau_J} is the regularized Neumann scaling function at scale J. Furthermore, let WN^{\tau_j}, j = 0, \dots, J, be the regularized Neumann wavelets as given by (83). Then we obviously have

N^{\tau_J} = N^{\tau_0} + \sum_{j=0}^{J-1} WN^{\tau_j}.   (84)

The local support of the Neumann wavelets within the framework of (84) should be studied in more detail: We start with the globally supported scaling kernel N^{\tau_0} = N^{2R}. Then we add more and more wavelet kernels WN^{\tau_j}, j = 0, \dots, J-1, to achieve the scaling kernel N^{\tau_J}. It is of particular importance that the kernel functions \eta \mapsto WN^{\tau_j}(\xi \cdot \eta), \xi \in S^2 fixed, are \xi-zonal functions with local support (spherical caps).


Fig. 15 Illustration of the Neumann kernel N (t) (left, continuous black line) and its Taylor linearized regularization N τJ (t), J = 0, 1, 2, τJ = 21−J R and R = 1 (left, dotted lines). The corresponding Taylor linearized Neumann wavelets W N τJ (t) for scales J = 0, 1, 2, are shown on the right


In conclusion, the calculation of an integral representation for the disturbing potential T starts with a global trend approximation using the scaling kernel at scale j = 0 (of course, this requires data on the whole sphere, but the data can be rather sparsely distributed since they only serve as a trend approximation). Step by step, we are able to refine this approximation by use of wavelets of increasing scale. The spatial localization of the wavelets successively allows a better spatial resolution of the disturbing potential T. Additionally, the local supports of the wavelets provide a computational advantage, since the integration has to be performed on smaller and smaller spherical caps. In consequence, the presented numerical technique becomes capable of handling heterogeneously distributed data. All in all, keeping the space-localizing property of the regularized Neumann scaling and wavelet functions in mind, we are able to establish an approximation of the solution of the disturbing potential T from gravity disturbances D in form of a "zooming-in" multiscale method. A lowpass filtered version of the disturbing potential T at scale j in an integral representation over the unit sphere S^2 is given by (compare Eq. (80))

T^{\tau_j}(R\xi) = \frac{R}{4\pi} \int_{S^2} D(R\eta)\, N^{\tau_j}(\xi \cdot \eta)\, dS(\eta), \quad \xi \in S^2,   (85)

while the j-scale bandpass filtered version of T leads to the integral representation by use of wavelets

WT^{\tau_j}(R\xi) = \frac{R}{4\pi} \int_{\Gamma_{\tau_j^2/(2R^2)}(\xi)} D(R\eta)\, WN^{\tau_j}(\xi \cdot \eta)\, dS(\eta), \quad \xi \in S^2.   (86)

Theorem 2. Let T^{\tau_{J_0}} be the regularized version of the disturbing potential at some arbitrary initial scale J_0 as given in (85), and let WT^{\tau_j}, j = 0, 1, \dots, be given by (86). Then the following reconstruction formula holds true:

\lim_{N \to \infty} \sup_{\xi \in S^2} \left| T(R\xi) - \left( T^{\tau_{J_0}}(R\xi) + \sum_{j=0}^{N} WT^{\tau_{J_0+j}}(R\xi) \right) \right| = 0.

The multiscale procedure (wavelet reconstruction) as developed here can be illustrated by the following scheme:

T^{\tau_{J_0}} \xrightarrow{+\, WT^{\tau_{J_0}}} T^{\tau_{J_0+1}} \xrightarrow{+\, WT^{\tau_{J_0+1}}} T^{\tau_{J_0+2}} \longrightarrow \cdots

As a consequence, a tree algorithm based on the regularization in the space domain has been realized for determining the disturbing potential T from locally available data sets of gravity disturbances D.


In order to get a fully discretized solution of the Neumann boundary-value problem (ENPPG), approximate integration by use of appropriate cubature formulas is necessary (see, e.g., [32, 42] for more details about approximate integration on the (unit) sphere). The fully discretized multiscale approximations have the following representations:

T^{\tau_j}(R\xi) \simeq \frac{R}{4\pi} \sum_{k=1}^{N_j} w_k^{N_j}\, D\!\left( R\eta_k^{N_j} \right) N^{\tau_j}\!\left( \xi \cdot \eta_k^{N_j} \right), \quad \xi \in S^2,   (87)

WT^{\tau_j}(R\xi) \simeq \frac{R}{4\pi} \sum_{k=1}^{N_j} w_k^{N_j}\, D\!\left( R\eta_k^{N_j} \right) WN^{\tau_j}\!\left( \xi \cdot \eta_k^{N_j} \right), \quad \xi \in S^2,   (88)
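The structure of the fully discretized sum (87) can be sketched with a hypothetical equal-weight Monte Carlo cubature (w_k = 4\pi/N_j with uniformly random knots, a stand-in for the proper equidistributed point sets cited in the text, chosen only to keep the sketch short). For a test input D(R\eta) = P_2(\eta \cdot \varepsilon^3), which satisfies the moment conditions (43), one can check via the Funk–Hecke theorem that the exact integral is T(R\xi) = (R/3) P_2(\xi \cdot \varepsilon^3), so the discretized lowpass version can be sanity-checked against it:

```python
import math, random

R = 1.0

def N_tau(t, tau):
    # Regularized Neumann kernel (79)
    if 1.0 - t <= tau * tau / (2.0 * R * R):
        s = (R / tau) * (3.0 - 2.0 * R * R / (tau * tau) * (1.0 - t))
    else:
        s = math.sqrt(2.0) / math.sqrt(1.0 - t)
    return s - math.log(1.0 + s)

def random_knots(n, seed=0):
    # Uniformly distributed points on S² (hypothetical cubature knots)
    rng = random.Random(seed)
    knots = []
    for _ in range(n):
        z = rng.uniform(-1.0, 1.0)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        r = math.sqrt(1.0 - z * z)
        knots.append((r * math.cos(phi), r * math.sin(phi), z))
    return knots

def lowpass_T(xi, D, tau, knots):
    # Fully discretized lowpass version (87) with equal weights w_k = 4π/N,
    # so the prefactor (R/4π)(4π/N) collapses to R/N
    acc = 0.0
    for eta in knots:
        t = xi[0] * eta[0] + xi[1] * eta[1] + xi[2] * eta[2]
        acc += D(eta) * N_tau(t, tau)
    return (R / len(knots)) * acc
```

The bandpass sums (88) have exactly the same structure with WN^{\tau_j} in place of N^{\tau_j}; their computational advantage (summation only over the cap) is not exploited in this naive sketch.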

where \eta_k^{N_j} are the N_j integration knots and w_k^{N_j} the integration weights. Whereas the sum in (87) has to be calculated on the whole sphere S^2, the summation in (88) has to be computed only on the local supports of the wavelets (note that the symbol \simeq means that the error between the right- and the left-hand side can be neglected). Figs. 16, 17, and 18 present a decomposition of the Earth's disturbing potential T into lowpass and bandpass filtered parts via Neumann wavelets for data sets of increasing data width, zooming in toward the hotspot of the Galapagos Islands. Seen from the geodetic reality, the figures are remarkable in the following sense: For getting a better accuracy in numerical integration procedures providing the (global) solution of the boundary-value problem (ENPPG) as illustrated in Fig. 16a, we need denser data sets, equidistributed globally over the whole sphere S^2_R (most notably, in the sense of Weyl's law of equidistribution). However, in today's reality of gravitational field observation, we are confronted with the problem that terrestrial gravitational data (such as gravity disturbances and gravity anomalies) of sufficient width and quality are available only for certain parts of the Earth's surface (for more details concerning the observational aspects see, e.g., [10–12, 74, 75]). As a matter of fact, there are large gaps, particularly at sea, where no data sets of sufficient quality are available at all. This is the reason why the observational situation implies the need for specific geodetically oriented modeling techniques taking the heterogeneous data situation and the local availability of the data (usually related to latitude-longitude data grids) into consideration. In this respect, the "zooming-in" realization based on single-layer space-regularization is a suitable, efficient, and economic mathematical answer.

As already known from (60), the solution of the surface differential equation (see Eq. (40))

\nabla^* T(R\xi) = -\frac{\gamma M}{R}\, \Theta(R\xi), \quad \xi \in S^2,   (89)

determining the disturbing potential T from prescribed vertical deflections \Theta under the conditions (44) is given by



Fig. 16 Lowpass filtered version T τ4 of the disturbing potential T and the corresponding bandpass filtered versions W T τj for scales j = 4, 5 of the magenta bordered region in 16(a) calculated from different numbers of data points (from the Ph.D.-Thesis [92], Geomathematics Group, University of Kaiserslautern). (a) Low pass part T τ4 calculated from 490,000 data points distributed over the whole sphere S2R . (b) Details W T τ4 at scale 4 from 281,428 data points distributed within the black bordered region in (a). (c) Details W T τ5 at scale 5 from 226,800 data points distributed within the gray bordered region in (a)

T(R\xi) = \frac{R}{4\pi} \int_{S^2} \Theta(R\eta) \cdot g(\Delta^*; \xi, \eta)\, dS(\eta),   (90)

where the vector kernel g(\Delta^*; \xi, \eta), 1 - \xi \cdot \eta > 0, reads as follows (see Eq. (61)):

g(\Delta^*; \xi, \eta) = -\frac{\gamma M}{R^2}\, \frac{\xi - (\xi \cdot \eta)\eta}{1 - \xi \cdot \eta} = -\frac{1}{2}\, \frac{\gamma M}{R^2}\, (S(\xi \cdot \eta))^2\, (\xi - (\xi \cdot \eta)\eta).   (91)

18 Gravimetric Measurements, Gravity Anomalies, Geoid, Quasigeoid . . .


Fig. 17 Lowpass filtered version T τ6 of the disturbing potential T of the magenta bordered region in 16(a) and the corresponding bandpass filtered versions W T τj for scales j = 6, 7 (from the Ph.D.-Thesis [92], Geomathematics Group, University of Kaiserslautern). (a) Low pass part T τ6 of the magenta bordered region in (a) computed by the sum of T τ4 (a), W T τ4 (b), and W T τ5 (c) in this region. (b) Details W T τ6 at scale 6 from 71 253 data points distributed within the black bordered region in (a). (c) Details W T τ7 at scale 7 from 63 190 data points distributed within the gray bordered region in (a)

Analogously to the calculation of the disturbing potential T from known gravity disturbances D (i.e., the Neumann problem (ENPPG)), the numerical calamities of the improper integral in (90) can be circumvented by replacing the zonal kernel S(ξ · η) by the regularized kernel S τ (ξ · η). This process leads to space-regularized



Fig. 18 Lowpass filtered version T τ8 of the disturbing potential T of the magenta bordered region in 17(a) and the corresponding bandpass filtered versions W T τj for scales j = 8, 9 (from the Ph.D.-Thesis [92], Geomathematics Group, University of Kaiserslautern). (a) Low pass part T τ8 of the magenta bordered region in (a) computed by the sum of T τ6 (a), W T τ6 (b), and W T τ7 (c) in this region. (b) Details W T τ8 at scale 8 from 71 253 data points distributed within the black bordered region in (a). (c) Details W T τ9 at scale 9 from 63 190 data points distributed within the gray bordered region in (a)


representations T^\tau of the disturbing potential T calculated from vertical deflections \Theta within a multiscale "zooming-in" procedure analogous to the approach for gravity disturbances as input data. To be more concrete, the kernel function g(\Delta^*; \cdot, \cdot) is replaced by the space-regularized function (using Eq. (67))

g^\tau(\Delta^*; \xi, \eta) = -\frac{\gamma M}{2R^2}\, \left( S^\tau(\xi \cdot \eta) \right)^2 (\xi - (\xi \cdot \eta)\eta)
= \begin{cases} -\frac{\gamma M}{2R^2} \left( \frac{9R^2}{\tau^2} - \frac{12R^4}{\tau^4} (1 - \xi \cdot \eta) + \frac{4R^6}{\tau^6} (1 - \xi \cdot \eta)^2 \right) (\xi - (\xi \cdot \eta)\eta), & 0 \le 1 - \xi \cdot \eta \le \frac{\tau^2}{2R^2}, \\ -\frac{\gamma M}{R^2}\, \frac{\xi - (\xi \cdot \eta)\eta}{1 - \xi \cdot \eta}, & \frac{\tau^2}{2R^2} < 1 - \xi \cdot \eta \le 2,   (92) \end{cases}

for \tau \in (0, 2R]. This leads to the following approximative representation of the disturbing potential T:

T^\tau(R\xi) = \frac{R}{4\pi} \int_{S^2} \Theta(R\eta) \cdot g^\tau(\Delta^*; \xi, \eta)\, dS(\eta),   (93)
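A transcription of (92) (a sketch, with \gamma M = R = 1 normalized): the expanded polynomial branch is just -(\gamma M/(2R^2))(S^\tau)^2 with the linear branch of (67) squared, which the code checks directly.

```python
import math

gamma_M = 1.0   # γM, normalized for this sketch
R = 1.0

def S_tau(t, tau):
    # Regularized single-layer kernel (67)
    if 1.0 - t <= tau * tau / (2.0 * R * R):
        return (R / tau) * (3.0 - 2.0 * R * R / (tau * tau) * (1.0 - t))
    return math.sqrt(2.0) / math.sqrt(1.0 - t)

def g_tau(xi, eta, tau):
    # Regularized vector kernel (92): -(γM/(2R²)) (S^τ(ξ·η))² (ξ - (ξ·η)η)
    t = sum(a * b for a, b in zip(xi, eta))
    c = -gamma_M / (2.0 * R * R) * S_tau(t, tau) ** 2
    return tuple(c * (a - t * b) for a, b in zip(xi, eta))
```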

with g^\tau(\Delta^*; \cdot, \cdot) given by (92).

Theorem 3. Suppose that T is the solution (90) of the differential equation (89), with \Theta being a member of the class of continuous vector-valued functions c^{(0)}(S^2_R). Let T^\tau, \tau \in (0, 2R], represent its regularized solution of the form (93). Then

\lim_{\tau \to 0+} \sup_{\xi \in S^2} \left| T(R\xi) - T^\tau(R\xi) \right| = 0.   (94)

By restricting \{g^\tau(\Delta^*; \cdot, \cdot)\}_{\tau \in (0, 2R]} to a sequence \{g^{\tau_j}(\Delta^*; \cdot, \cdot)\}_{j \in \mathbb{N}_0}, corresponding to a set of scaling parameters \{\tau_j\}_{j \in \mathbb{N}_0} satisfying \tau_j \in (0, 2R] and \lim_{j \to \infty} \tau_j = 0, we are canonically led to regularized vector scaling functions, such that a scale-discrete solution method for the differential equation (89) can be formulated. The vector scaling function g^{\tau_{j+1}}(\Delta^*; \cdot, \cdot) at scale j+1 is constituted by the sum of the vector scaling function g^{\tau_j}(\Delta^*; \cdot, \cdot) and the corresponding discretized vector wavelet wg^{\tau_j}(\Delta^*; \cdot, \cdot), given by

wg^{\tau_j}(\Delta^*; \xi, \eta) = g^{\tau_{j+1}}(\Delta^*; \xi, \eta) - g^{\tau_j}(\Delta^*; \xi, \eta).   (95)

Hawaii: Ritter and Christensen [71] believe that a stationary mantle plume located beneath the Hawaiian Islands created the Hawaii-Emperor seamount chain while the oceanic lithosphere continuously passed over it. The Hawaii-Emperor


chain consists of about 100 volcanic islands, atolls, and seamounts that spread nearly 6000 km from the active volcanic island of Hawaii to the 75–80 million year old Emperor seamounts near the Aleutian trench. Moving further southeast along the island chain, the geological age decreases. The interesting area is the relatively young southeastern part of the chain, situated on the Hawaiian swell, a 1200 km broad anomalously shallow region of the ocean floor extending from the island of Hawaii to the Midway atoll. Here, a distinct gravity disturbance and geoid anomaly occurs that has its maximum around the youngest island, coinciding with the maximum topography, and both decrease in a northwestern direction. The progressive decrease in terms of the geological age is believed to result from the continuous motion of the underlying plate (cf. [60, 91]). With seismic tomography, several features of the Hawaiian mantle plume have been obtained (cf. [71] and the references therein). They reveal a Low Velocity Zone (LVZ) beneath the lithosphere, starting at a depth of about 130–140 km beneath the central part of the island of Hawaii. So far, plumes have just been identified as low seismic velocity anomalies in the upper mantle and the transition zone, which is a fairly new achievement. As plumes are relatively thin with respect to their diameter, they are hard to detect in global tomography models. Hence, despite novel advances, there is still no general agreement on the fundamental questions concerning mantle plumes, such as their depth of origin, their morphology, and their longevity; even their existence is still discussed controversially. This is due to the fact that many geophysical as well as geochemical observations can be explained by different plume models and even by models that do not include plumes at all (e.g., [15]).
With our space-localized multiscale method of deriving gravitational signatures (more concretely, the disturbing potential) from the vertical deflections, we add a new component in specifying essential features of plumes. The vertical deflections of the plume in the region of Hawaii are visualized in Fig. 19. From the bandpass filtered detail approximation of the vertical deflections (Fig. 20) and the corresponding disturbing potential (Fig. 21), we are able to

Fig. 19 Illustration of the vertical deflections in the region of Hawaii (from the Ph.D.-thesis [13], Geomathematics Group, University of Kaiserslautern)



Fig. 20 Approximation of the vector valued vertical deflections Θ in [ms−2 ] of the region of Hawaii (compare Fig. 19). A rough lowpass filtering at scale 6 is improved by several bandpass filters of scale j = 6, . . . , 11, the last picture shows the multiscale approximation at scale j = 12 (from the Ph.D.-thesis [13], Geomathematics Group, University of Kaiserslautern)


Fig. 21 Multiscale reconstruction of the disturbing potential T in [m2 s−2 ] from vertical deflections Θ for the Hawaiian (plume) area using the scaling function g τ . A rough lowpass filtering at scale j = 6 is improved by several bandpass filters at scales j = 6, . . . , 11, the last illustration shows the approximation of the disturbing potential T at scale j = 12 (from the Ph.D.-thesis [13], Geomathematics Group, University of Kaiserslautern)


conclude that the Hawaii plume has an oblique layer structure. As can be seen at the lower scales (for which numerical evidence suggests that they reflect the greater depths), the strongest signal is located in the ocean to the west of Hawaii. With increasing scale, i.e., decreasing depth, it moves more and more toward the Big Island of Hawaii, i.e., in an eastward direction. All in all, by the space-based multiscale techniques initiated by [29, 30] in gravitation, we are able to arrive at interpretable results concerning hotspots/mantle plumes based on "surface interpretations", i.e., just by looking at the anomalous behavior in terms of surface integrals, without using the framework of Newton's volume integrals.

5 Geoid Versus Quasigeoid

In what follows, we are strongly influenced by the work of [88]. As a matter of fact, we use parts of his highly instructive contribution almost literally to compare the conceptional background of geoid and quasigeoid. Nevertheless, seen from a modern mathematical point of view, we are not allowed to share all the conclusions of this contribution. Following [88] we wish to argue that the classical, physically meaningful surface, the geoid, as introduced by [54] and mathematically discussed in the preceding chapter, is still not only the most natural surface to refer to, but also an artificial surface, called the quasigeoid should be used and computed from data on the surface of the Earth. In fact, it is well known in surveying practice that heights have to be referred to mean sea level. However, to obtain some heights of practical value, the mean sea level underneath the continents has to be known. The mean sea level anywhere more or less follows the gravity equipotential surface to a certain constant W0 . As pointed out, the determination of such a horizontal surface, that best approximates the (mean) sea level leads to the geoid. Two conceptually different kinds of height systems are commonly used in geodesy (cf. Fig. 22): (i) Orthometric height: H is the “practical height” above the sea level used in mapping and engineering practice. The orthometric height of a point of interest is measured along the plumbline, a line always tangent to the gravity vector, form the geoid to the point of interest. (ii) Geodetic height (or ellipsoidal height): h is the height above the bi-axial “geocentric reference surface (ellipsoid)”, measured along the normal to the reference surface (following the concept of [88]). It can be readily determined from observations from satellites, but it is of very little practical use on their own. 
However, if the departure of the geoid from the geocentric reference surface (ellipsoid), the geoidal height N, is subtracted from a geodetic height h, as we can see in Fig. 22, we get the orthometric height H, which subsequently can be used in practice. Geoidal heights are useful as an intermediary between satellite-observed heights (geodetic) and practical heights (orthometric). These three


G. Berg et al.

Fig. 22 Relation between geoidal, geodetic, and orthometric heights

Fig. 23 Relation between Cartesian and curvilinear coordinates

heights are consequently related by the equation:

H = h − N.   (96)
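The geometric computation of the geodetic height h from satellite-determined Cartesian coordinates, followed by the conversion to an orthometric height via (96), can be sketched as follows. The GRS80 ellipsoid constants are standard; the point coordinates and the geoidal height N are invented illustration values, and the iteration shown is only one of several well-known schemes:

```python
import math

# GRS80 reference ellipsoid parameters (standard values)
A  = 6378137.0               # semi-major axis [m]
F  = 1.0 / 298.257222101     # flattening
E2 = F * (2.0 - F)           # first eccentricity squared

def geodetic_height(x, y, z, iterations=10):
    """Geodetic (ellipsoidal) height h from Cartesian coordinates
    by the classical fixed-point iteration on the geodetic latitude."""
    p = math.hypot(x, y)                 # distance from the rotation axis
    phi = math.atan2(z, p * (1.0 - E2))  # initial latitude guess
    for _ in range(iterations):
        n = A / math.sqrt(1.0 - E2 * math.sin(phi) ** 2)  # prime vertical radius
        h = p / math.cos(phi) - n
        phi = math.atan2(z, p * (1.0 - E2 * n / (n + h)))
    return h

# Forward computation: a point 1000 m above the ellipsoid at latitude 50 deg
phi0, h0 = math.radians(50.0), 1000.0
n0 = A / math.sqrt(1.0 - E2 * math.sin(phi0) ** 2)
x = (n0 + h0) * math.cos(phi0)
y = 0.0
z = (n0 * (1.0 - E2) + h0) * math.sin(phi0)

h = geodetic_height(x, y, z)   # recovers h0 to sub-millimeter level
N_geoid = 48.7                 # hypothetical geoidal height [m]
H = h - N_geoid                # orthometric height, Eq. (96)
```

The iteration converges rapidly away from the poles; near the poles the division by cos(phi) would require a different formulation.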

Fig. 23 shows that the geodetic height h can be computed exactly from the satellite-determined position (given in Cartesian coordinates), if specific values for the size and shape (flattening) of the geocentric reference ellipsoid are adopted. The calculation is simply a matter of applying general geometrical principles. In Fig. 23, the center of the ellipsoid is coincident with the center-of-mass of the Earth by definition. Orthometric heights can be determined by a simple differential procedure which is quite accurate but also slow, expensive, and prone to systematic errors. This classical process, terrestrial levelling, has been used all around the world for more than a century. For economic reasons, the tendency today is to replace this

18 Gravimetric Measurements, Gravity Anomalies, Geoid, Quasigeoid . . .


process by satellite methods, which provide geodetic height differences. Satellite methods are almost as accurate as terrestrial levelling, particularly over larger distances, and much cheaper to use. If this approach is used, accurate knowledge of the geoidal heights on land becomes a prerequisite for converting geodetic heights to orthometric heights. Orthometric heights and geoidal heights are widely used around the world, particularly in America and in portions of Africa and Asia. More recently, Canada and the US have decided to adopt orthometric heights and a geoidal model as their national systems of heights. The determination of the geoid is a purely physical problem: if we knew the mass density distribution within the Earth, we could compute the gravity field, including the gravity potential and thus the geoid, to any accuracy anywhere by calculating Newton’s volume integral. We would then get the geoid by simply connecting all the points of the same required (constant) value W0 of the potential. Unfortunately, we do not know the density distribution within the Earth to a sufficient accuracy to do this, so this approach cannot be used in practice. The only viable alternative is to use gravity values, which are cheap, plentiful and sufficiently accurate. If we have these, we can take advantage of the relation between gravity and gravity potential, as shown in Fig. 24. Beyond the theoretical problem posed by the unknown topographic density, downward continuation is an ill-posed problem, known as the inverse gravimetric problem (for more details and mollifier regularizations, see, e.g., [25]). The fact that the topographic density was not known with adequate accuracy back in the 1960s (and this problem lingers on still today) led Molodensky to declare the geoid impossible to determine to a sufficient accuracy and to introduce an alternative quantity known as the quasigeoid (see [59]).
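To illustrate what the Newton volume integral approach would require, the following sketch evaluates the exterior potential of a homogeneous model Earth by a crude Riemann sum and compares it with the exact value G M/|x|. The density, radius, and grid size are illustrative assumptions, not a practical geoid computation:

```python
import numpy as np

G   = 6.674e-11          # gravitational constant [m^3 kg^-1 s^-2]
RHO = 5500.0             # assumed constant density [kg m^-3]
R   = 6371e3             # radius of the model Earth [m]

# Crude Riemann-sum evaluation of Newton's volume integral
#   W(x) = G * integral over the ball of rho(y) / |x - y| dV(y)
n = 60
s = np.linspace(-R + R / n, R - R / n, n)       # cell centers, spacing 2R/n
X, Y, Z = np.meshgrid(s, s, s, indexing="ij")
inside = X**2 + Y**2 + Z**2 <= R**2             # keep cells inside the ball
dv = (2.0 * R / n) ** 3                         # cell volume

x = np.array([2.0 * R, 0.0, 0.0])               # exterior evaluation point
r = np.sqrt((x[0] - X[inside])**2 + (x[1] - Y[inside])**2
            + (x[2] - Z[inside])**2)
W = G * RHO * np.sum(dv / r)

# For a homogeneous ball the exact exterior potential is G*M/|x|:
M = RHO * 4.0 / 3.0 * np.pi * R**3
W_exact = G * M / np.linalg.norm(x)
rel_err = abs(W - W_exact) / W_exact            # small, but far from geodetic accuracy
```

Even this idealized example shows why the direct volume-integral route is hopeless in practice: with a known, constant density the crude discretization already leaves a residual error, and the real density is unknown.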
Methods of determining the quasigeoid have since been somewhat refined, especially by the formulation in terms of analytical continuation as described by [4], but also by numerous other mathematical and theoretical developments (e.g., [45, 49, 62] and [44]). The interplay of the quasigeoid with the geoid and the reference ellipsoid is shown in Fig. 25. The vertical distance between the quasigeoid and the reference surface (ellipsoid) is called the quasigeoidal height (also known as height anomaly) ζ. For the determination of the quasigeoid, it would not be necessary to know the topographic density, as all the computations are done not on the geoidal surface but on the

Fig. 24 The relation between gravity w, its potential W , and local surfaces


Fig. 25 The relation between quasigeoid, geoid and reference ellipsoid (following [88])

surface of the Earth (or at an almost identical surface to it, called the telluroid – see the definition below). Molodensky’s theory deals with the gravity potential outside the Earth’s surface. Molodensky’s approach does not require any knowledge of topographic density, as it deals only with the external field and needs only the geometry of the external field. On the other hand, as the approach is based on geometry, it requires integration over the surface of the Earth, or more precisely over the telluroid. The telluroid is a surface that looks like the Earth’s surface except that it is displaced from the Earth’s surface by the quasigeoidal height. For the quasigeoid to have use in practice, a meaningful height system must be associated with it. This system is called normal heights, and it is used in the countries of the former Soviet Union and nine other European countries (Germany, France, Sweden, Poland, Czech Republic, Slovak Republic, Hungary, Romania and Bulgaria). The normal height H^N of a point on the topographical surface is defined as the height of the corresponding point on the telluroid above the reference surface (ellipsoid), measured along the normal plumbline. However, normal heights may equivalently be seen as heights of the topographical surface above the quasigeoid, also measured along the normal plumbline. The relation among the normal height H^N, the height anomaly ζ, and the geodetic height h is exactly the same as that among orthometric height, geoidal height and geodetic height (cf. Figs. 22 and 25):

H^N ≈ h − ζ.   (97)

Following this concept (cf. [88]), normal heights and orthometric heights at open sea are exactly the same, while they may differ by up to one and a half meters on land. The difference between the two surfaces – the geoidal surface and the telluroid – over which the integration for the geoid or quasigeoid determination respectively is carried out is as follows: • The geoid is a fairly smooth surface without any kinks, edges or other irregularities (as seen in Fig. 22). • The telluroid, or the Earth surface for that matter, is much rougher. It is the common belief in geodesy that the Earth’s surface is not sufficiently smooth to allow approximate integration (as [88] pointed out). As a matter of fact, although


the difficulties should not be underestimated, the progress in numerical integration is considerable (see, e.g., [22]), so that the mathematical apparatus required by Molodensky’s approach should be realizable in its original meaning in the future. Altogether, we are allowed to come to the following conclusion: The knowledge of topographic density is still a problem, but it can be resolved to an accuracy of a few centimeters if the geological formation of the crust is reasonably well known. To obtain the total uncertainty of the geoid, the uncertainty in the effect of irregular topographic density must be added to the uncertainty in geoid determination that comes from the approximations in the theory and the numerical computations. Molodensky was right 50 years ago, but today the substantial increase in the knowledge of the topographical density distribution has changed the situation substantially. Once again, Molodensky’s approach does not require any knowledge of topographic density, but it needs the surface integration to be carried out over the surface of the Earth, or over the telluroid, to be accurate. Up till now, the prevailing opinion among geodesists has been that this cannot be done with sufficient accuracy. However, this opinion may be questioned in view of the tremendous progress in numerical integration.

6 Quasigeoid: Geometric Nature, Geomathematical Context, and Multiscale Decorrelation

As pointed out earlier, heights in geodesy come in the following variants (see also Table 1):

(i) orthometric heights,
(ii) normal heights,
(iii) geodetic heights,
(iv) geopotential heights.

Each has its advantages and disadvantages. Geopotential heights are physical measures of potential energy (in m^2 s^{-2}). Both orthometric and normal heights are geometrically defined. They are heights in meters above sea level. Orthometric and normal heights differ in the way in which mean sea level is conceptually continued under the continental masses. The reference surface for orthometric heights is the geoid. Since measurements cannot be taken from the geoid itself, levelling is used. It directly produces the practically most useful heights above sea level, whereas the more economical use of GNSS instruments (such as GPS) gives heights above the reference surface (ellipsoid). Once more, it should be mentioned that the relation between the normal height H^N and the geodetic height h is exactly the same as that among orthometric height, geoidal height and geodetic height:

H^N = h − ζ.   (98)

Table 1 Geodetic height type classification

Orthometric   H   Measured from the geoid
Geodetic      h   Measured from the reference surface (ellipsoid)
Geoidal       N   Calculated as the difference N = h − H

Fig. 26 Earth’s surface ∂ G, telluroid ∂ T, and their one-to-one correspondence

The original problem leading to the Molodensky approach (in its rigorous formulation discussed, e.g., by [45, 48]) can be briefly described as follows: Given, at all points on the Earth’s surface ∂G, the gravity potential W and the gravity vector w = ∇W, the aim is to determine the quasigeoidal surface. The quasigeoidal height determination is based on the fact that the Earth’s surface ∂G is approximated by the boundary ∂T of a (regular) region T, i.e., the telluroid (close to the Earth’s surface), with known potential U in T^c, i.e., on the surface ∂T and in the exterior of ∂T (cf. Fig. 26). We assume that there exists a one-to-one correspondence between ∂G and ∂T. W is the actual potential and U is an approximation of W called the normal potential. As usual, we let u = ∇U, called the normal gravity, and w = ∇W, called the actual gravity, which is given on ∂G. Assume that, for given x ∈ ∂T, the point y ∈ ∂G is the one associated to x by the one-to-one correspondence between ∂G and ∂T (cf. Fig. 26). The two points are connected by the normal height vector n = y − x. Hence, a substitute formulation of the classical Molodensky problem is to determine the length |n| = H^N of n, i.e., the distance between ∂G and the approximating telluroid along the one-to-one correspondence between ∂G and ∂T. To this end we introduce

δW = W|∂G − U|∂T,   (99)
δw = w|∂G − u|∂T,   (100)

where δW is called the potential anomaly and δw is called the gravity anomaly (see [45, 48, 49, 61]). Furthermore, we use the disturbing potential T given by

T = W − U  in G,   (101)

so that we have


δW = T|∂G + U|∂G − U|∂T,   (102)
δw = w|∂G − u|∂T.   (103)

Using the Taylor expansion of u and U in terms of n and neglecting terms of higher order in n (which represents no substantial loss of accuracy if a sufficiently close telluroid is chosen) we finally arrive at the approximations

δW(x) = T(x) + u(x) · n,   (104)
δw(x) = w(y) − u(y) + m(x) n,   (105)

x ∈ ∂T, y ∈ ∂G, where we set

m = ∇u = ( ∂²U / ∂x_i ∂x_j )_{i,j = 1,...,3}.   (106)

Observing the relations

w(y) − u(y) = (∇W)(y) − (∇U)(y) = (∇T)(y) = (∇T)(x)   (107)

we arrive at

δW(x) = T(x) + u(x) · n,   (108)
δw(x) = (∇T)(x) + m(x) n.   (109)

Equation (108) is a counterpart to the (already known) Bruns formula. Actually it connects the disturbing potential T on the telluroid ∂T with the anomalies between ∂G and the telluroid ∂T. If we assume that m(x) is invertible for all x ∈ ∂T, we obtain by virtue of (109)

n = m(x)^{-1} (δw(x) − (∇T)(x)),   (110)

so that

|n| = |m(x)^{-1} (δw(x) − (∇T)(x))|.   (111)

Inserting the identity (110) into Eq. (108), we end up with

T(x) − u(x) · m(x)^{-1} (∇T)(x) = δW(x) − u(x) · m(x)^{-1} δw(x).   (112)

This is the so-called fundamental boundary condition for the Molodensky problem. It is formulated exclusively for points on the telluroid, and it does not need information about the topographic density.
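The algebra behind (112) can be checked numerically: if δW and δw are generated from the linearized relations (108) and (109) with arbitrary (invertible) data, the fundamental boundary condition holds identically. A minimal sketch with random stand-in values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for the quantities at a fixed telluroid point x:
u  = rng.normal(size=3)              # normal gravity vector u(x)
m  = rng.normal(size=(3, 3))
m  = m + m.T + 6.0 * np.eye(3)       # symmetric, invertible tensor m(x)
T  = rng.normal()                    # disturbing potential T(x)
gT = rng.normal(size=3)              # gradient (grad T)(x)
n  = rng.normal(size=3)              # normal height vector

# Define the anomalies through the linearized relations (108) and (109):
dW = T + u @ n                       # deltaW(x) = T(x) + u(x) . n
dw = gT + m @ n                      # deltaw(x) = (grad T)(x) + m(x) n

# Then n = m(x)^{-1}(deltaw(x) - (grad T)(x)) holds by construction (Eq. (110)),
# and substituting it into (108) yields the fundamental condition (112):
lhs = T - u @ np.linalg.solve(m, gT)
rhs = dW - u @ np.linalg.solve(m, dw)
assert np.allclose(lhs, rhs)
```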


Following [49] (see also [61]), the vector u(x) m(x)^{-1} can be seen, to first order, to be oriented in the direction of the exterior unit normal field ν on the telluroid ∂T. More specifically,

u(x) m(x)^{-1} = − (|x|/2) ν(x).   (113)
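For the simplest conceivable normal field, that of a point mass U = GM/|x| with a spherical telluroid, relation (113) holds exactly, which the following numerical check confirms (the evaluation point is arbitrary):

```python
import numpy as np

GM = 3.986004418e14                      # Earth's gravitational constant [m^3 s^-2]
x  = np.array([3.0e6, 4.0e6, 5.0e6])     # point on a spherical "telluroid"
r  = np.linalg.norm(x)
nu = x / r                               # exterior unit normal of the sphere

# Normal potential of a point mass: U = GM/|x|
u = -GM * x / r**3                                           # u = grad U
m = GM * (3.0 * np.outer(x, x) / r**5 - np.eye(3) / r**3)    # m = grad u (Hessian)

# Eq. (113): u(x) m(x)^{-1} = -(|x|/2) nu(x)
lhs = u @ np.linalg.inv(m)
assert np.allclose(lhs, -(r / 2.0) * nu)
```

The radial eigenvalue of m is 2GM/r³ while u = −(GM/r²) ν, so the product is exactly −(r/2) ν; for a realistic normal field (113) is a first-order statement only.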

Inserting expression (113) into Eq. (112) therefore results in the identity

ν(x) · (∇T)(x) + (2/|x|) T(x) = F(x),   (114)

where we have used the abbreviation

F(x) = ν(x) · δw(x) + (2/|x|) δW(x)   (115)

(note that the boundary condition (114) can be seen to be equivalent to (112) transformed into an appropriate coordinate system). Summarizing all the steps of the linearization procedure, we are led to discuss the following type of boundary-value problem in potential theory (note that a constructive Runge solution is given, e.g., in [3, 23]).

Exterior Molodensky Problem (EMP): Find T ∈ Pot^(1)(T^c), i.e., T ∈ C^(2)(T^c) ∩ C^(1)(T^c ∪ ∂T), with ΔT = 0 in T^c and |T(x)| = O(|x|^{-1}), |x| → ∞, such that

(∂T/∂ν)(x) + μ(x) T(x) = F(x),   x ∈ ∂T,   (116)

where μ, F ∈ C^(0)(∂T) are known functions on the boundary surface ∂T of the regular region T. Obviously, in mathematical nomenclature, the exterior Molodensky problem (EMP) forms a special Robin problem (cf. [47]). It should be remarked that, in the case that ∂T is a sphere, the problem becomes the well-known Stokes problem (see [40] or [61]), and in the case of an ellipsoid it is called the ellipsoidal Stokes problem (see, e.g., [36, 56, 61]). Locally reflected multiscale solutions of the Stokes problem are due to [31, 34] (see also the references in [20]).
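In the spherical case the boundary condition (116) with μ = 2/|x| decouples degree by degree in an expansion into outer harmonics T = Σ_n T_n (R/r)^{n+1} Y_n: the radial derivative contributes −(n+1)/R at r = R, so each coefficient satisfies ((1 − n)/R) T_n = F_n, with degree 1 excluded as in the Stokes problem. A minimal sketch of the resulting degree-wise solution (orders suppressed, data invented):

```python
import numpy as np

R = 6371e3                      # radius of the spherical boundary [m]
N = 10                          # maximum spherical-harmonic degree

rng = np.random.default_rng(1)
F = rng.normal(size=N + 1)      # degree coefficients F_n of the boundary data
F[1] = 0.0                      # degree 1 must be absent (Stokes condition)

# For T = sum_n T_n (R/r)^(n+1) Y_n, the Robin condition (116) with
# mu = 2/|x| gives per degree: (-(n+1)/R + 2/R) T_n = F_n.
T = np.zeros(N + 1)
for n in range(N + 1):
    if n != 1:
        T[n] = R * F[n] / (1.0 - n)

# Verify the boundary condition degree-wise at r = R:
bc = np.array([(-(n + 1) / R + 2.0 / R) * T[n] for n in range(N + 1)])
assert np.allclose(bc, F)
```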

7 The German Quasigeoid GCG2016

A geodetic realization of a quasigeoid is the German Combined Quasigeoid 2016 (GCG2016). The GCG2016 is the official height reference surface of the Surveying Authorities of the Federal Republic of Germany. It corresponds to the reference ellipsoid of the Geodetic Reference System 1980 (GRS80). Among others the


GCG2016 is well suited to transform ellipsoidal heights determined by GNSS observations into normal heights, and vice versa. The GCG2016 is a core component of the so-called Integrated Geodetic Spatial Reference 2016 (IGSR2016), which was introduced in 2016 as the official geodetic spatial reference in Germany. The IGSR2016 pursues a holistic approach to the geometrically and the physically defined components of the geodetic spatial reference. This approach has been implemented by the national surveying authorities in Germany as the result of the extensive project “Renewal of the DHHN”. Over the course of the project, nearly the complete German first-order leveling network was remeasured between 2006 and 2012 to an accuracy (standard deviation) of only 3–8 mm. In addition, in 2008 an elaborate GNSS campaign over six weeks, with GNSS observation times of at least two times 24 h at each station, was performed. In this process, 250 so-called geodetic basic network points (“Geodätische Grundnetzpunkte” – GGP) were determined to an accuracy of a few millimeters throughout Germany. Each of the GGP has been integrated into the remeasured leveling network. Furthermore, from 2009 to 2015, high precision gravimetric measurements were performed for all GGP, partially by absolute and partially by relative gravimetry. Finally, in 2016, the project led to a complete revision and upgrade of the authorized geodetic spatial reference systems and their realizations in Germany, with the following significant components:

Geometrical Component:
• European Terrestrial Reference System 1989 (ETRS89) in its official German realization: ETRS89/DREF91, realization 2016.

Physical Components:
• German Main Height Network 2016 (DHHN2016) and
• German Main Gravity Network 2016 (DHSN2016).
Components Linking the Geometrical and the Physical Components:
• GGP, each with high precision coordinates in ETRS89/DREF91, realization 2016, as well as precise normal heights in DHHN2016 and gravity values in DHSN2016,
• Quasigeoid GCG2016 as physically defined height reference surface in relation to the mathematically defined ellipsoid GRS80.

These components together constitute the IGSR2016 in Germany. They are of a piece and therefore completely consistent with each other. Thus, the result of the project is unique:


• A significantly improved authorized geodetic spatial reference for Germany, where the approach of integrated geodesy (and thus, strictly speaking, the Bruns formula, cf. Sect. 4) is accomplished in practice for the first time for a complete country.

The modeling of the GCG2016 has been executed by order of the Working Committee of the Surveying Authorities of the states (Laender) of the Federal Republic of Germany (AdV) by two independent institutions:
• The Federal Agency for Cartography and Geodesy (BKG),
• the Institute of Geodesy of the Leibniz University Hannover (IfE).

Both institutions processed the extensive data provided by the federal states and by other sources with fundamentally different approaches. The results of both institutions were in close agreement, and the GCG2016 (Fig. 27) was finally computed by averaging the two independent solutions of BKG and IfE. The GCG2016 is provided as a grid over the whole area of Germany. The grid resolution is 30″ by 45″ (latitude by longitude), which corresponds to about 0.9 km by 0.9 km. The standard deviations of the height anomalies are specified as follows: in lowlands and uplands 1 cm, in the Alpine mountains 2 cm, and in marine areas 2–6 cm. For more details concerning the respective approaches, the computation, the properties, and the provision of the GCG2016 see the publications of BKG:
• Brochure: https://www.bkg.bund.de/SharedDocs/Downloads/BKG/DE/Downloads-DEFlyer/BKG-Quasigeoid-DE.html
• Documentation Quasigeoid GCG2016: https://www.bkg.bund.de/DE/Produkte-und-Services/Shop-und-Downloads/Digitale-Geodaten/Quasigeoid/quasigeoid.html
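For a user, applying a quasigeoid model such as the GCG2016 amounts to interpolating the height anomaly ζ from the grid and subtracting it from the GNSS-determined ellipsoidal height, cf. Eq. (98). A sketch with a hypothetical grid (origin, extent, and all values are invented; the real GCG2016 grid and its access formats are documented by BKG):

```python
import numpy as np

# Hypothetical regular height-anomaly grid in the style of the GCG2016:
# 30'' spacing in latitude, 45'' in longitude (about 0.9 km x 0.9 km).
DLAT, DLON = 30.0 / 3600.0, 45.0 / 3600.0
LAT0, LON0 = 49.0, 7.0                         # hypothetical grid origin [deg]
zeta_grid = np.full((200, 200), 48.5)          # placeholder values [m]
zeta_grid += 0.001 * np.arange(200)[:, None]   # mild north-south trend

def zeta(lat, lon):
    """Bilinear interpolation of the height anomaly at (lat, lon)."""
    i, j = (lat - LAT0) / DLAT, (lon - LON0) / DLON
    i0, j0 = int(i), int(j)
    di, dj = i - i0, j - j0
    z = zeta_grid
    return ((1 - di) * (1 - dj) * z[i0, j0]     + di * (1 - dj) * z[i0 + 1, j0]
            + (1 - di) * dj     * z[i0, j0 + 1] + di * dj       * z[i0 + 1, j0 + 1])

h = 310.42                         # ellipsoidal height from GNSS [m]
HN = h - zeta(49.01, 7.01)         # normal height via Eq. (98)
```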

8 Acquisition and Provision of Gravity Data for the Quasigeoid Modeling

An essential goal of the survey of gravity data at the State Office for Surveying and Geobase Information Rhineland-Palatinate (LVermGeo) is the derivation of a consistent height reference surface (quasigeoid) with an accuracy in the centimeter range and better. This is a mandatory prerequisite for satellite-supported height determination using the official satellite positioning service SAPOS [76, 77]. For the computation of such an accurate quasigeoid, among other things, precise, up-to-date, and sufficiently dense gravity values are necessary. This task cannot be achieved with the data of current satellite gravity field missions alone, since their spatial resolution is not sufficient. Therefore, precise


Fig. 27 German Combined Quasigeoid GCG 2016

terrestrial measurements of the gravitational field at the Earth’s surface with measuring point distances of at most 1–2 km cannot be dispensed with in the near future. Furthermore, a uniform geodetic spatial reference of all measurement data sets (coordinates of the gravimetric measurement points, ellipsoidal and physical heights, digital terrain model (DTM)) is a basic requirement for quasigeoid modeling. This requirement has been fulfilled in Germany since 2016 with the uniform Integrated Geodetic Spatial Reference 2016 (IGSR2016) with its components ETRS89/DREF91 (Realization 2016), German Main Height Network 2016


(DHHN2016), German Main Gravity Network 2016 (DHSN2016) and German Combined Quasigeoid 2016 (GCG2016) [1, 14]. In order to be able to estimate the quality of the terrestrial gravity data used for the modeling of the quasigeoid, it is important to know the origin of these data [90]. To ensure the German national gravity standard, the Federal Agency for Cartography and Geodesy (BKG) maintains the German Basic Gravity Network (DSGN). The DSGN forms the overall reference framework for the German Main Gravity Network (DHSN), for which the German federal states are responsible. In the course of time, various realizations of the DSGN and the DHSN emerged:

• German Basic Gravity Network 1976 (DSGN76) and German Main Gravity Network 1982 (DHSN82): On the basis of a few DSGN76 points (datum points) determined with absolute gravimeters, the DHSN82 was realized by the official gravity fixed points of 1st order (SFP 1.O.). These were determined by the states of the Federal Republic of Germany in the years 1978 to 1984 by gravity measurements with relative gravimeters [83].

• German Basic Gravity Network 1994 (DSGN94) and German Main Gravity Network 1996 (DHSN96): After the reunification of Germany, the DSGN94 was created by re-measuring the benchmarks of the DSGN76 with absolute gravimeters and extending the network to the eastern part of Germany (see [70, 86]). Subsequently, the federal states introduced the DHSN96. In the eastern part of Germany, the DHSN96 was established by direct connection to the new absolute gravity stations of the DSGN94. In the western federal states, the DHSN82 was adapted to the DSGN94 by a constant level shift of −19 μGal, the mean level difference between DSGN94 and DSGN76.

• German Basic Gravity Network 2016 (DSGN2016) and German Main Gravity Network 2016 (DHSN2016): The DSGN2016 has been established in the context of the generation of the Integrated Geodetic Spatial Reference 2016.
The DSGN2016 consists of the fixed points of the previous DSGN94 as well as selected benchmarks of the Integrated Geodetic Reference Network (GREF) of the BKG. New gravimetric values have been determined by high precision gravimetry for all benchmarks of the DSGN. The accordance of the applied field-suitable absolute gravimeters with the International Gravity Standardization Network 1971 (IGSN71) is ensured by periodic measurements on gravimetric reference stations. The current DHSN2016 has the same level and scale as the DHSN96, but it differs in terms of accuracy, reliability and currency. The densification of the German Main Gravity Network (DHSN) below the SFP of first order is the responsibility of the federal states. Thus, in Rhineland-Palatinate, the second-order SFP network was created, which in turn served as a reference network for the subsequent measurements (SFP of 3rd order). The primary goal of 3rd order gravity measurements was to determine the gravity values of height fixed points in order to calculate the physically defined normal heights of the official height fixed points.


In order to close the remaining gaps and to realize a considerably better coverage of the territory of Rhineland-Palatinate, the gravity values of a special kind of existing fixed points, the so-called superordinate fixed points (ÜFP), were determined at a later time. As a result, surface gravity values measured for the entire territory of Rhineland-Palatinate with a density of approx. 1 point per 8 km² were available. The density of the data was much higher along the leveling lines due to the SFP of 3rd order. However, some larger areas of significantly lower density remain, such as, e.g., military training areas, an airport, large forest areas and peripheral areas along the state border. A certain disadvantage for the quasigeoid modeling is undoubtedly that the actually available gravity data meanwhile have an average age of 20–25 years, and that interim gravity changes are thus not sufficiently reflected in the data. The accuracy of the SFP depends on the order and lies in the range between 10 and 30 · 10^{-8} m/s² (i.e., 10–30 μGal), with respect to the sensitive point of the gravimeter and to the time of the measurement. The standard use of the theoretical vertical gravity gradient to reduce the measured gravity values from the sensitive point of the gravimeter to the Earth’s surface additionally reduces the accuracy of the gravity values. It should also be kept in mind that, for reasons of economy, parts of the 3rd order SFP network could not be determined by double measurements but only by uncontrolled single measurements.
The requirements for the gravity data for the computation of the GCG2016 [53] are described as follows:
• Current gravity values (SW),
• Density of the SW as a function of the roughness of the gravitational field, i.e., in regions with larger horizontal gradients of Bouguer anomalies the density of the measurements should be higher, since in these areas the quasigeoid is structured more irregularly,
• Uniform density of at least one SW per 4 × 4 km up to at most two SW per 2 × 2 km,
• Accuracy of the gravity measurement 50–100 μGal,
• Accuracy of georeferencing the height better than 0.3 m,
• Accuracy of georeferencing the location better than 3 m,
• Points must be representative of the surrounding topography,
• Suitable conditions for GNSS and gravity measurement.

To meet these requirements, the federal states were asked by resolution AK RB 09/20 of the Working Group “Spatial Reference” of the Working Committee of the Surveying Authorities of the states of the Federal Republic of Germany (AdV) to carry out supplementary area-covering gravity measurements and to make these additional gravity data available to the BKG for quasigeoid modeling. As a result, in the State Office for Surveying and Geobase Information Rhineland-Palatinate (LVermGeo), between August 2014 and November 2015, around 1200 further gravity points were determined, affiliated to the second-order SFP. These points are locally unmarked. They have been georeferenced by GNSS measurements


with the assistance of SAPOS. The gravimetric measurements have been carried out with modern relative gravimeters [79] with an accuracy better than 50 μGal. Depending on the necessary travelling time, the location of the affiliating points, the existing road infrastructure, the applied measuring method and the respective local conditions, it is possible to measure approximately 6 to 8 points per working day in this way. Measurements in extensive forest areas are particularly complex, since GNSS measurements under such conditions are, as is well known, limited in space (clearances, clearings) and time (seasons with little foliage). In large cities, due to the strong anthropogenic microseismics, extended measuring times must be expected. With the official gravity fixed points (SFP) and additional area-covering gravity measurements on unmarked points, approximately 9000 SW were available for the computation of the GCG2016 for the subarea of Rhineland-Palatinate. Subsequently, in cooperation with the BKG, a comprehensive evaluation of the existing gravity data with regard to their suitability for quasigeoid modeling was carried out according to the following criteria and limits:
• Plausibility check of the gravity values by means of gravity prediction on the basis of Bouguer anomalies (maximum permissible difference: 3 mGal).
• Comparison of the heights of the measured points with a current high-resolution terrain model. For this purpose the official Digital Terrain Model (DGM25) was used. Larger deviations (over 5 m) indicate unrepresentative point locations or locations not resolved in the DGM25, such as bridges, towers or steep slopes. Also, erroneous or inaccurate georeferencing of the measured gravity points (e.g., caused by poor digitization) may be the cause of larger deviations.
• Inspection of the survey sketches, point descriptions and photos of SFP located on or in the immediate vicinity of rock or massive buildings, whose masses influence the measured gravity value through their gravitational effect. By calculations with different block models it could be estimated that the gravity value, e.g., at a height bench mark, which typically is mounted in the exterior wall of a church building, is measured too low by a few hundred μGal, depending on the wall thickness, material and construction.

Detected points with differences greater than the predetermined limits were systematically examined to identify the cause of the differences. Erroneous gravity data or SFP unsuitable for quasigeoid modeling were identified and thus excluded from the further calculations of the GCG2016. On 1 December 2016, the GCG2016 was published and officially introduced as part of the official German Integrated Geodetic Spatial Reference 2016 (IGSR2016). Thus, for the first time, there was a height reference surface valid throughout Germany which was completely consistent with the official reference frames, with an accuracy (standard deviation) of 1 cm in the lowlands and in the low mountain ranges, 2 cm in the Alps and 2–6 cm in the sea area. Thus, the geometric and the physical components of the integrated geodetic spatial reference can now be linked with high accuracy.
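The screening logic described above can be sketched as a simple filter; the point identifiers and difference values below are invented, only the two thresholds (3 mGal, 5 m) come from the text:

```python
# Screening of candidate gravity points against the stated tolerance limits.
MAX_GRAVITY_DIFF_MGAL = 3.0    # measured gravity vs. Bouguer-based prediction
MAX_HEIGHT_DIFF_M     = 5.0    # point height vs. digital terrain model

points = [
    # (id, gravity difference [mGal], height difference vs. DTM [m])
    ("SFP-001",  0.4, 0.8),
    ("SFP-002", -4.2, 1.1),   # gravity value implausible
    ("SFP-003",  0.9, 7.5),   # on a bridge/tower or badly georeferenced
]

def screen(points):
    """Split candidate points into accepted and rejected ones."""
    accepted, rejected = [], []
    for pid, dg, dh in points:
        if abs(dg) > MAX_GRAVITY_DIFF_MGAL or abs(dh) > MAX_HEIGHT_DIFF_M:
            rejected.append(pid)
        else:
            accepted.append(pid)
    return accepted, rejected

accepted, rejected = screen(points)
```

In practice, a flagged point triggers a manual investigation of the cause rather than an automatic rejection, as described above.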


It is planned to calculate and publish an updated and improved version of the GCG at the beginning of the 2020s. Further area-covering gravity measurements are already taking place at the State Office for Surveying and Geobase Information Rhineland-Palatinate (LVermGeo) with the aim of filling the gaps left by the evaluation mentioned above and of adapting the density of the measured gravity data to the ever increasing demands. This is intended to further increase the accuracy, but in particular the reliability, of the future GCG. It is also being considered to systematically review the existing gravity data and, if necessary, to replace them with current gravity data. This is particularly useful where major gravity changes have occurred due to mass changes (e.g., mining of mineral resources, reservoirs, pumped storage facilities, etc.).

9 Multiscale Decorrelation of Gravimetric Data

Next we present a multiscale decomposition technique based on the Bouguer anomalies and the quasigeoidal undulations for subareas of Rhineland-Palatinate (see Fig. 28 for the landscapes and Fig. 29 for the geological subareas). As multiscale tools we use Haar wavelets and scaling functions for the decomposition (see [5, 19, 20] for more details). The geologic units of Rhineland-Palatinate can be divided as follows (see [51]): In the northern part we have the Rhenish Slate Mountains (or Rhenish Massif) with anticlines and synclines striking SW-NE and overthrusts (see [93]). During the Variscan orogeny, the Rhenish Slate Mountains represented a part of the Rhenohercynian zone as a fold-and-thrust belt (see [58]). It is mainly composed of slates, siltstones, sandstones and quartzite from the Lower Devonian, into which igneous rocks such as keratophyres and diabases have intruded. Subordinately, lime and dolomite rocks occur. Middle Devonian limestones can be found in the Eifel depression. The regional structures are generally trending SW-NE with NW-facing folds and overthrusts (NW-vergence), see [51, 58]. The main tectono-stratigraphic units are, from north to south, separated by the Siegen Main Thrust, the Boppard-Görgeshausen Thrust and the Southern Hunsrück Boundary Fault. The southernmost thrust sheet is the metamorphic zone of the southern Hunsrück (Northern Phyllite Zone) (cf. [58]). It forms a multiply folded imbricate belt of partly mylonitic metasediments and metavolcanics. This thrust sheet is cut by the Southern Hunsrück Boundary Fault. The rather fold-dominated Eifel and the thrust-dominated Hunsrück, situated in the south of the Rhenish Slate Mountains, are separated by a Permian graben, the so-called Wittlicher Senke, bounded by normal faults (see [78]). From the west, Mesozoic layers (Buntsandstein, Muschelkalk, Keuper, and Lower Jura) overlap the Rhenish Slate Mountains in the Trier Basin.
The Buntsandstein consists typically of red-colored, fluviatile sandstone. The Muschelkalk transgression results in marine conditions, comprising various sandy-marly-dolomitic rock types. The carbonate sedimentation of the Muschelkalk is terminated by a regional regression (see [78]). The deposits of the following Keuper age are sandy-clayey-marly and variegated. The

1170

G. Berg et al.

Fig. 28 Landscapes in Rhineland-Palatinate (taken from [52] in modified form)

transgression of the Lias (Lower Jurassic) sea ended the continental and marginal marine sedimentation of the Triassic. Fossil-rich marine marls and limestones with interbedded sandstones occur (see [78]). The Eifel and Westerwald are marked by Tertiary and Quaternary volcanics of basaltic origin. In between lies the subsidence area of the Neuwied Basin with sediments such as gravel, sand, and clay as well as Pleistocene volcanic deposits, e.g., pumice (see [93]). The southern edge of the Rhenish Slate Mountains is adjacent to the Saar-Nahe Basin with a sequence of Permian-Carboniferous sedimentary rocks (clay-, silt-, and sandstones) and volcanics (rhyolite, dacite, and andesite) of the Rotliegend, several thousand meters thick and shaped as a trough in the north and a saddle in the south (see [93]). At the northern margin of the Saar-Nahe Basin, a pronounced fault zone, the Southern Hunsrück Boundary Fault, represents a large southward-dipping detachment fault and is responsible for the Saar-Nahe Basin being a half-graben (cf. [55]). The southern part of Rhineland-Palatinate – the Palatine Trough – is covered with Triassic rocks, mostly sandy-conglomeratic deposits of the Buntsandstein. They can be found in a shallow syncline in whose center, in the southwest, Muschelkalk is widespread (see [93]). The Cenozoic Upper Rhine Graben crosses the eastern part of the Permocarboniferous Saar-Nahe Basin. Due to the subsidence in this graben structure, the top of the up to 2 km thick Permocarboniferous is located at a depth of 600–2,900 m and is overlain by Tertiary and Quaternary sediments (cf. [2]). The graben, in places 4,000 m deep, is filled with thick sediment sequences of marl- and limestone as well

18 Gravimetric Measurements, Gravity Anomalies, Geoid, Quasigeoid . . .

Fig. 29 Geological subareas in Rhineland-Palatinate (taken from [9] in modified form)


as gravels and sands (see [93]). The Mainz Basin was formed as an extension at its northern end. It is essentially built from the sedimentary deposits of the Tertiary, which can be subdivided into the underlying silty-marly layers of the Oligocene and the overlying calcareous Miocene sediments. The pre-Tertiary rocks of the Mainz Basin consist mainly of Permian (Rotliegend) rhyolites and siliciclastics of the Saar-Nahe Basin (cf. [51]). Fig. 30 shows the decomposition of the Bouguer data with the lowpass filtered approximations on the left and the associated bandpass filtered detail information on the right. The lowpass filtered version at scale j = 2 only provides the coarse structure of the Bouguer anomaly. The multiscale method guarantees that, with increasing scale, the lowpass filtered Bouguer anomaly converges to the input Bouguer anomaly data. The most important structures are in general well outlined by anomalies. Positive anomalies usually indicate thrusts or anticlines and negative anomalies synclines or depressions, respectively. The anomalies or the boundaries between anomalies are often correlated with major faults observable at the surface. Starting with the bandpass filtered decorrelation at scale j = 3, the geological structures of the area under investigation can be roughly assigned. Especially the Mainz Basin sticks out with an almost concentric positive anomaly, whereas the Upper Rhine Graben shows a minimum. Lineaments, the rough shape of the tectonic units, and a structural direction can be observed at scale j = 4. The map reveals the main SW-NE striking Variscan structures below the Rhenish Slate Mountains, the Saar-Nahe Basin, and the Wittlicher Senke. In contrast, the Mainz Basin and the Upper Rhine Graben are characterized by relatively large anomalies. The shape of the basin is more clearly defined, as is, in the northwest, the Trier Basin with a slightly negative anomaly.
The positive and negative anomalies occur along major fault zones. This is most obvious in the middle of the map in the Hunsrück slate zone between quartzite/phyllite and claystone. At scale j = 5 a refinement of the tectono-stratigraphic units becomes visible. Thus, more structures can be seen, e.g., the Mosel Basin. Furthermore, in addition to the maximum north of the Southern Hunsrück Boundary Fault, a minimum appears in the northern part, which corresponds to the quartzite. The positive anomaly in the east correlates with the Northern Phyllite Zone. Between the Saar-Nahe Basin and the Mainz Basin the fault zone at the border fault is indicated by an axis of minima. Altogether, the illustration of the tectonic structures is refined. The finest formation of the tectonic structure becomes visible at scale j = 6, where different geological relations have an effect on the map. The structures which have been elaborated at scale j = 5 become even more obvious. The SW-NE oriented line of maxima in the middle of the map is slightly shifted southward compared to j = 5 and marks the Southern Hunsrück Boundary Fault. In the Mainz Basin, the contour of the positive anomaly is smaller and of more local importance. This can be correlated to the thick Permian and Tertiary magmatites below a thin layer of sediments. The anomaly at the edge of the Graben


Fig. 30 Decorrelation of the Bouguer anomaly for a subarea of Rhineland-Palatinate, Germany, via a multiscale approach for different scales. Bandpass filtered data on the right, lowpass filtered data on the left. (a) Scale j = 2. (b) Scale j = 2. (c) Scale j = 3. (d) Scale j = 3. (e) Scale j = 4. (f) Scale j = 4. (g) Scale j = 5. (h) Scale j = 5. (i) Scale j = 6. (j) Scale j = 6. (k) Scale j = 7


between the Saar-Nahe Basin and the Mainz Basin shows more exactly the course of the border fault. The Wittlich Basin is clearly defined by negative anomalies. The zone of weakness caused by many SSW-NNE oriented faults between the Trier Basin and the Rhenish Slate Mountains is indicated by a line of minima. In accordance with the quasigeoidal definition, the multiscale decomposition of the quasigeoid (Fig. 31) does not show as remarkable a match with the geologically based structures as the Bouguer anomaly does. Nonetheless, at scales j = 4, 5, we can still detect the Rhenish Slate Mountains, the Saar-Nahe Basin, and the Mainz Basin, but the border lines are less obvious. It seems that density contrasts much deeper inside the Earth play a particular role. Summarizing our multiscale results, we are led to the following conclusions: The multiscale approach as proposed in this contribution breaks up a complicated signal such as the Bouguer anomaly field into "wave band signatures" at different scales, i.e., at certain resolutions. To each scale parameter, a scaling function is assigned, leading to an approximation of the data at this particular resolution. The differences between two successive scaling functions, i.e., in the jargon of constructive approximation, the wavelets, represent the corresponding wave bands. They yield the desired geologically based detail information. With increasing scale, the approximation becomes finer and finer, starting from a lowpass approximation and adding more and more wave bands. The multiscale approach guarantees that the lowpass information contained on a certain (coarse) level is also contained in the approximations of higher scales. Thus, it is advantageous that we are able to analyze the wave bands separately (decorrelation). In doing so, the multiscale concept helps to find methods adapted to the particular structure of the input data.
Additionally, the resolution of the model can be adapted to the spatial structures, i.e., for areas with coarse spatial structures, the resolution of the model can be chosen rather low, and for areas with complicated structures the resolution can be increased accordingly. Consequently, since most data show correlation both in space and in frequency, the multiscale technique is an appropriate method for simultaneous space and frequency localization. As far as the numerical realization is concerned, fast wavelet transform (FWT) methods are applicable. Considering especially quantities involving the disturbing potential field in outer space via boundary values, we observe, from a computational point of view, two major requirements: First, the field characteristics of geodetic features are usually of local character, so that the use of local wavelets is evident. Second, in view of the physical relevance of the multiscale approach, we need wavelets which have a certain relation to the corresponding partial differential equation (here, the Laplace equation). Moreover, we have to be concerned with wavelet types which are manageable from a mathematical point of view and, additionally, show a close relation to the physical model (for examples, the reader is referred to the following contributions of the Geomathematics Group, Kaiserslautern: [5–7, 17, 20, 21, 24, 27, 29, 30], and the list of references therein). All in all, the main results and characteristics of our multiscale method involving the (quasi)geoidal model can be summarized as follows:


Fig. 31 Decorrelation of the quasigeoid (GCG2016) for a subarea of Rhineland-Palatinate, Germany, via a multiscale approach. Bandpass filtered data on the right, lowpass filtered data on the left. (a) Scale j = 3. (b) Scale j = 3. (c) Scale j = 4. (d) Scale j = 4. (e) Scale j = 5. (f) Scale j = 5. (g) Scale j = 6. (h) Scale j = 6. (i) Scale j = 7


• Physically based behavior and appropriate interpretability of the detail information via the developed wavelet (band) structures.
• Numerical efficiency and economy by virtue of wavelets enabling an adaptive choice of the local support, resulting in fast algorithms.
• Scale-dependent decorrelation into wave bands and scale-dependent detection of specific geodetic/geologic structures within a systematic "zooming-in/zooming-out" multiresolution process.
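The scale-by-scale behavior summarized above can be illustrated by a minimal numerical sketch (our own illustration, not code from this chapter): a Haar-type scaling function acts as a block average over cells whose size halves from scale to scale, the wavelet (bandpass) details are differences of successive lowpass approximations, and the coarse approximation plus all wave bands reconstructs the finer approximations. The grid size and scale range below are arbitrary choices.

```python
import numpy as np

def lowpass(field, j, jmax):
    """Haar-type lowpass at scale j: block-average over cells of edge 2**(jmax - j).

    At j = jmax the block size is 1, so the data are reproduced exactly.
    """
    n = field.shape[0]
    b = 2 ** (jmax - j)                       # block edge length
    a = field.reshape(n // b, b, n // b, b).mean(axis=(1, 3))
    return np.kron(a, np.ones((b, b)))        # expand back to the full grid

# synthetic stand-in for a gridded Bouguer anomaly (64 x 64 grid)
rng = np.random.default_rng(0)
f = rng.standard_normal((64, 64))
jmax = 6                                      # 2**6 = 64

# lowpass approximations and bandpass details (wave bands)
approx = {j: lowpass(f, j, jmax) for j in range(2, jmax + 1)}
detail = {j: approx[j + 1] - approx[j] for j in range(2, jmax)}

# multiresolution property: coarse approximation plus all wave bands
# telescopes to the finest approximation, which equals the input data
recon = approx[2] + sum(detail[j] for j in range(2, jmax))
assert np.allclose(recon, approx[jmax])
assert np.allclose(approx[jmax], f)
```

The telescoping sum in `recon` is exactly the decorrelation property described in the text: the lowpass information at a coarse scale is contained in every finer approximation, and the wave bands can be analyzed separately.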

10 Conclusion

Local knowledge of the gravity potential and its equipotential (level) surfaces has become an important issue not only in geodetic modeling but also for geological interpretation, e.g., for purposes of exploration and prospecting (see, e.g., [7] and the references therein). Indeed, the gravity field is a key component of future investigation. Seen from a numerical point of view, the way forward has to focus on the following challenges: (i) It is commonly known that highly accurate sensors, when operating in an isolated manner, have their shortcomings. Combining globally available satellite data with regional airborne and/or local terrestrial observations within a physically founded and mathematically consistent multiscale process is therefore an essential step forward. (ii) Based on the excellent geodetic pre-work thus far (such as GCG2016 for Germany), a vertical as well as horizontal "zooming-in" detection of specific geological/geophysical attributes is an outstanding field of interest for validating the multiresolution method based on heterogeneous datasets and geophysically oriented multiscale "downward continuation" modeling of the different data sources, starting from spaceborne data as a trend solution via more accurate airborne data down to high-precision local data sets. (iii) For data supplementation and numerical stabilization, spaceborne data are indispensable even for local purposes because of their a priori trend predetermination. The unfortunate terrestrial situation, with larger gaps in most countries, demands particular mathematical attention to homogenization and unification in order to suppress undesired oscillation phenomena within the numerical modeling process of the data. (iv) The whole spectrum of spaceborne/airborne/ground data systems covers all verifiable "signature wave packages".
Actually, the advantage of satellite lower frequency band data at the ground is their availability everywhere, while (airborne) medium and (terrestrial) higher frequency bands are only available for regional and local coverage, respectively. Geologically relevant signatures (as discussed, e.g., in [25] for purposes of inverse gravimetry), however, presuppose extremely dense and highly accurate gravitational input data (better than in the usual geodetic situations today), in particular if more detailed internal structures are of interest.


Altogether, the connecting link for all requirements is a vertical/horizontal multiscale philosophy including all data information, where the localization in space enables us to handle the data depending on their spatial availability and density for a particular area, and the localization in frequency bands provides appropriate decorrelation of specifically demanded features contained in the signatures.

Acknowledgements The authors C. Blick, W. Freeden, Z. Hauler, and H. Nutz thank the "Federal Ministry for Economic Affairs and Energy, Berlin" and the "Project Management Jülich" (PtJ; corporate managers Dr. V. Monser, Dr. S. Schreiber) for funding the projects "GEOFÜND" (funding reference number: 0325512A, PI Prof. Dr. W. Freeden, University of Kaiserslautern, Germany) and "SPE" (funding reference number: 0324061, PI Prof. Dr. W. Freeden, CBM – Gesellschaft für Consulting, Business und Management mbH, Bexbach, Germany, corporate manager Prof. Dr. M. Bauer).

Literature

1. AG der Vermessungsverwaltungen der Länder der Bundesrepublik Deutschland: Richtlinie für den einheitlichen integrierten geodätischen Raumbezug des amtlichen Vermessungswesens in der Bundesrepublik Deutschland (RiLi-RB-AdV) (2017)
2. Aretz, A., Bär, K., Götz, A., Sass, I.: Outcrop analogue study of Permocarboniferous geothermal sandstone reservoir formations (northern Upper Rhine Graben, Germany): impact of mineral content, depositional environment and diagenesis on petrophysical properties. Int. J. Earth Sci. 105(5), 1431–1452 (2016)
3. Augustin, M., Freeden, W., Nutz, H.: About the importance of the Runge-Walsh concept for gravitational field determination. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 517–560. Birkhäuser, Basel (2018)
4. Bjerhammer, A.: A New Theory of Gravimetric Geodesy. Geodesy Division Report of the Royal Institute of Technology, Stockholm (1963)
5. Blick, C.: Multiscale Potential Methods in Geothermal Research: Decorrelation Reflected Post-Processing and Locally Based Inversion. Ph.D. Thesis, Geomathematics Group, University of Kaiserslautern (2015)
6. Blick, C., Freeden, W., Nutz, H.: Feature extraction of geological signatures by multiscale gravimetry. GEM Int. J. Geomath. 8(1), 57–83 (2016)
7. Blick, C., Freeden, W., Nutz, H.: Gravimetry and exploration. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 687–751. Birkhäuser, Basel (2018)
8. Bruns, E.H.: Die Figur der Erde. Königl. Preussisch. Geodätisches Institut, P. Stankiewicz Buchdruckerei, Berlin (1878)
9. Bundesanstalt für Geowissenschaften und Rohstoffe: Geological Map of Germany 1:1,000,000 (GK1000). BGR, Hannover (2014)
10. ESA: The Nine Candidate Earth Explorer Missions. Publications Division ESTEC, Noordwijk, SP-1196(1) (1996)
11. ESA: European Views on Dedicated Gravity Field Missions: GRACE and GOCE. An Earth Sciences Division Consultation Document. ESD-MAG-REP-CON-001 (1998)
12. ESA: The Nine Candidate Earth Explorer Missions. Publications Division ESTEC, Noordwijk, SP-1233(1). Report for mission selection of the four candidate earth explorer missions (1999)
13. Fehlinger, T.: Multiscale Formulations for the Disturbing Potential and the Deflections of the Vertical in Locally Reflected Physical Geodesy. Ph.D. Thesis, Geomathematics Group, University of Kaiserslautern (2009)


14. Feldmann-Westendorff, U., Liebsch, G., Sacher, M., Müller, J., Jahn, C., Klein, W., Liebig, A., Westphal, K.: Das Projekt zur Erneuerung des DHHN: Ein Meilenstein zur Realisierung des integrierten Raumbezugs in Deutschland. ZfV 5, 354–367 (2016)
15. Foulger, G., Natland, J., Presnall, D., Anderson, D. (eds.): Plates, Plumes, and Paradigms. Geological Society of America, Boulder (2005)
16. Freeden, W.: On integral formulas of the (unit) sphere and their application to numerical computation of integrals. Computing 25, 131–146 (1980)
17. Freeden, W.: Multiscale Modelling of Spaceborne Geodata. B.G. Teubner, Leipzig (1999)
18. Freeden, W.: Geomathematics: its role, its aim, and its potential. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn, pp. 3–79. Springer, New York (2015)
19. Freeden, W., Blick, C.: Signal decorrelation by means of multiscale methods. World of Mining 65(5), 304–317 (2013)
20. Freeden, W., Gerhards, C.: Geomathematically Oriented Potential Theory. CRC Press/Taylor & Francis, Boca Raton (2013)
21. Freeden, W., Gutting, M.: Special Functions of Mathematical (Geo-)Physics. Applied and Numerical Harmonic Analysis (ANHA). Birkhäuser, Basel (2013)
22. Freeden, W., Gutting, M.: Integration and Cubature Methods. Monographs and Research Notes in Mathematics, A Chapman & Hall Book. CRC Press/Taylor & Francis Group, Boca Raton (2018)
23. Freeden, W., Mayer, C.: Multiscale solution for the Molodensky problem on regular telluroidal surfaces. Acta Geodaetica et Geophysica Hungarica 41, 55–86 (2008)
24. Freeden, W., Michel, V.: Multiscale Potential Theory (With Applications to Geoscience). Birkhäuser, Boston (2004)
25. Freeden, W., Nashed, M.Z.: Inverse gravimetry: background material and multiscale mollifier approaches. GEM Int. J. Geomath. 9(2), 199–264 (2018)
26. Freeden, W., Nutz, H.: Satellite gravity gradiometry as tensorial inverse problem. GEM Int. J. Geomath. 2, 177–218 (2011)
27. Freeden, W., Nutz, H.: Geodetic observables and their mathematical treatment in multiscale framework. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 315–458. Birkhäuser, Basel (2018)
28. Freeden, W., Schreiner, M.: Mathematical Geodesy: Its Role, Its Aim, and Its Potential. In: Freeden, W. (ed.) Handbuch der Mathematischen Geodäsie. Springer Spektrum, Heidelberg (2020)
29. Freeden, W., Schreiner, M.: Local multiscale modelling of geoid undulations from deflections of the vertical. J. Geod. 79, 641–651 (2006)
30. Freeden, W., Schreiner, M.: Spherical Functions of Mathematical Geosciences (A Scalar, Vectorial, and Tensorial Setup). Springer, Heidelberg (2009)
31. Freeden, W., Wolf, K.: Klassische Erdschwerefeldbestimmung aus der Sicht moderner Geomathematik. Math. Semesterb. 56, 53–77 (2008)
32. Freeden, W., Gervens, T., Schreiner, M.: Constructive Approximation on the Sphere (With Applications to Geomathematics). Oxford Science Publications, Clarendon/Oxford (1998)
33. Freeden, W., Fehlinger, T., Mayer, C., Schreiner, M.: On the local multiscale determination of the Earth's disturbing potential from discrete deflections of the vertical. Comput. Geosci. 4, 473–490 (2008)
34. Freeden, W., Fehlinger, T., Klug, M., Mathar, D., Wolf, K.: Classical globally reflected gravity field determination in modern locally oriented multiscale framework. J. Geod. 83, 1171–1191 (2009)
35. Freeden, W., Nutz, H., Rummel, R., Schreiner, M.: Satellite gravity gradiometry (SGG): methodological foundation and geomathematical advances. In: Freeden, W. (ed.) Handbuch der Mathematischen Geodäsie. Springer Spektrum, Heidelberg (2019)
36. Grafarend, E.: The reference figure of the rotating earth in geometry and gravity space and an attempt to generalize the celebrated Runge-Walsh approximation theorem for irregular surfaces. GEM Int. J. Geomath. 6, 101–140 (2015)


37. Grafarend, E.W., Aardalan, A.A., Finn, G.: Ellipsoidal vertical deflections and ellipsoidal gravity disturbances: case studies. Stud. Geophys. Geod. 50, 1–57 (2006)
38. Grafarend, E.W., Klapp, M., Martinec, Z.: Spacetime modelling of the Earth's gravity field by ellipsoidal harmonics. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 1st edn, pp. 159–253. Springer, Heidelberg (2010)
39. Groten, E.: Geodesy and the Earth's Gravity Field I + II. Dümmler, Bonn (1979)
40. Heiskanen, W.A., Moritz, H.: Physical Geodesy. Freeman, San Francisco (1967)
41. Helmert, F.: Die Mathematischen und Physikalischen Theorien der Höheren Geodäsie, I, II. B.G. Teubner, Leipzig (1884)
42. Hesse, K., Sloan, I.H., Womersley, R.S.: Numerical integration on the sphere. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 1st edn, vol. 2, pp. 1187–1220. Springer, Heidelberg (2010)
43. Hofmann-Wellenhof, B., Moritz, H.: Physical Geodesy. Springer, Wien/New York (2005)
44. Holota, P.: Variational methods for geodetic boundary-value problems. In: Sansò, F., Rummel, R. (eds.) Lecture Notes in Earth Sciences, vol. 65. Springer, Berlin/Heidelberg/New York (1997)
45. Hörmander, L.: The boundary problems of physical geodesy. Arch. Ration. Mech. Anal. 62(1), 1–52 (1976)
46. Jacobs, F., Meyer, H.: Geophysik – Signale aus der Erde. B.G. Teubner, Leipzig, and VDF Verlag, Zürich (1992)
47. Kellogg, O.D.: Foundations of Potential Theory. Springer, Berlin/Heidelberg/New York (1967)
48. Krarup, T.: A Contribution to the Mathematical Foundation of Physical Geodesy. Danish Geodetic Institute, Report No. 44, Copenhagen (1969)
49. Krarup, T.: On Potential Theory. Danish Geodetic Institute, Report No. 6, Copenhagen (1973)
50. Kusche, J.: Time-variable gravity field and global deformation of the Earth. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 1st edn, vol. 1, pp. 253–268. Springer, Heidelberg (2010)
51. Landesamt für Geologie und Bergbau Rheinland-Pfalz, Mainz: Geologie von Rheinland-Pfalz. VII, 400 Seiten, 162 Abbildungen, 36 Tabellen, 3 Anlagen. E. Schweizerbart Science Publishers (Nägele u. Obermiller), Stuttgart (2005)
52. Landeszentrale Politische Bildung: Rheinland-Pfalz. http://rlp100seiten.de/land-und-landschaften/. Accessed 22 (2018)
53. Liebsch, G., Schirmer, U., Reinhold, A., Falk, R., Wilmes, H., Schäfer, U., Rülke, A., Ihde, J.: Anforderungen an die Schweredatenbasis zur Quasigeoidbestimmung. Bundesamt für Kartographie und Geodäsie, Frankfurt am Main (2012)
54. Listing, J.B.: Über unsere jetzige Kenntniss der Gestalt und Größe der Erde. Nachrichten von der Königl. Gesellschaft der Wissenschaften und der G. A. Universität zu Göttingen 3, 33–98 (1873)
55. Lorenz, V., Haneke, J.: Relationship between diatremes, dykes, sills, laccoliths, intrusive-extrusive domes, lava flows, and tephra deposits with unconsolidated water-saturated sediments in the late Variscan intermontane Saar-Nahe Basin, SW Germany. Geol. Soc. Lond. Spec. Publ. 234, 75–124 (2004)
56. Martinec, Z.: Boundary-Value Problems for Gravimetric Determination of a Precise Geoid. Springer, Berlin/Heidelberg/New York (1999)
57. Meissl, P.A.: A Study of Covariance Functions Related to the Earth's Disturbing Potential. Scientific Report No. 151, The Ohio State University, Department of Geodetic Science, Columbus (1971)
58. Moe, A.: Structural development of a volcanic sequence of the Lahn area during the Variscan orogeny in the Rhenohercynian Belt (Germany). Ph.D. Thesis, Universität Heidelberg (2000)
59. Molodensky, M.S.: Methods for Study of the External Gravitational Field and Figure of the Earth. Translated from Russian by the Israel Program for Scientific Translations for the Office of Technical Services, U.S. Department of Commerce, Washington, DC, 1962 (1960)
60. Morgan, W.J.: Convection plumes in the lower mantle. Nature 230, 42–43 (1971)


61. Moritz, H.: Recent Developments in the Geodetic Boundary Value Problem. Scientific Report No. 266, The Ohio State University, Department of Geodetic Science, Columbus (1977)
62. Moritz, H.: Geodetic reference system 1980. Bull. Géod. 54(3), 395–407 (1980)
63. Moritz, H.: Classical physical geodesy. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 1st edn, vol. 1, pp. 127–158. Springer, Heidelberg (2010)
64. Moritz, H.: Classical physical geodesy. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn, vol. 1, pp. 253–289. Springer, New York (2015)
65. Nettleton, L.L.: Elementary Gravity and Magnetics for Geologists and Seismologists. Society of Exploration Geophysicists, Tulsa (1971)
66. Nettleton, L.L.: Gravity and Magnetics in Oil Prospecting. McGraw-Hill, New York (1976)
67. Neumann, F.: Vorlesungen über die Theorie des Potentials und der Kugelfunktionen. B.G. Teubner, Leipzig (1887)
68. Pizzetti, P.: Geodesia – sulla espressione della gravita alla superficie del geoide, supposto ellissoidico. Att. R. Acad. Lincei 3, 331–350 (1894)
69. Pizzetti, P.: Corpi equivalenti rispetto alla attrazione newtoniana esterna. Rom. Acc. L. Rend. 18, 211–215 (1909)
70. Richter, B., Wilmes, H., Franke, A., Falk, R., Reinhart, E., Torge, W.: Das Deutsche Schweregrundnetz 1994 (DSGN94). ZfV 123, 363–370 (1989)
71. Ritter, J., Christensen, U.: Mantle Plumes, A Multidisciplinary Approach. Springer, Berlin/Heidelberg (2007)
72. Rivas, J.: Gravity and magnetic methods. In: Short Course on Surface Exploration for Geothermal Resources, UNU-GTP, Lake Naivasha, Kenya (2009)
73. Rummel, R.: Spherical spectral properties of the Earth's gravitational potential and its first and second derivatives. In: Sansò, F., Rummel, R. (eds.) Geodetic Boundary Value Problems in View of the One Centimeter Geoid. Lecture Notes in Earth Science, vol. 65, pp. 359–404. Springer, Berlin/Heidelberg (1997)
74. Rummel, R.: GOCE: gravitational gradiometry in a satellite. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn, pp. 211–226. Springer, New York (2015)
75. Rummel, R., Balmino, G., Johannessen, J., Visser, P., Woodworth, P.: Dedicated gravity field missions – principles and aims. J. Geodyn. 33, 3–20 (2002)
76. SAPOS: SAPOS in Rheinland-Pfalz. https://lvermgeo.rlp.de/de/aufgaben/vermessungstechnischer-raumbezug/saposr/saposr-in-rheinland-pfalz/ (2018). Accessed 27 Sept 2018
77. SAPOS: Satellitenpositionierungsdienst der Deutschen Landesvermessung. https://www.sapos.de/ (2018). Accessed 27 Sept 2018
78. Schintgen, T.V., Förster, A.: Geology and basin structure of the Trier-Luxembourg Basin – implications for the existence of a buried Rotliegend graben. Zeitschrift der Deutschen Gesellschaft für Geowissenschaften 164(4), 615–637 (2013)
79. Schlesinger, R., Cieslack, M.: Simultane Messungen mit zehn Scintrex-CG-5-Gravimetern im stationären Parallelbetrieb. AVN 8–9, 274–283 (2018)
80. Schubert, G., Turcotte, D., Olson, P.: Mantle Convection in the Earth and Planets. Cambridge University Press, Cambridge (2001)
81. Seigel, H.O.: A Guide to High Precision Land Gravimeter Surveys. Scintrex Limited, Concord (1995)
82. Stokes, G.G.: On the variation of gravity at the surface of the earth. Trans. Camb. Philos. Soc. 148, 672–712 (1849)
83. Thönnissen, K.: Aufbau des Schwerenetzes 1. Ordnung in Rheinland-Pfalz. Nachrichtenblatt der Vermessungs- und Katasterverwaltung Rheinland-Pfalz, Koblenz (1982)
84. Torge, W.: Gravimetry. de Gruyter, Berlin (1989)
85. Torge, W.: Geodesy. de Gruyter, Berlin (1991)
86. Torge, W., Falk, R., Franke, A., Reinhart, E., Richter, B., Sommer, M., Wilmes, H.: Das Deutsche Schweregrundnetz 1994 (DSGN94), Band I. C.H. Beck Verlag (2000)
87. United Nations: Current and Planned Global and Regional Navigation Satellite Systems and Satellite-Based Augmentation Systems. United Nations, New York (2010)


88. Vaníček, P., Kingdon, R., Santos, M.: Geoid versus quasigeoid: a case of physics versus geometry. Contrib. Geophys. Geod. 42(1), 101–117 (2012)
89. Vening-Meinesz, F.A.: A formula expressing the deflection of the plumb line in the gravity anomalies and some formulas for the gravity field and the gravity potential outside the geoid. Proc. Koninklijke Akad. Wet. Amsterdam 31, 315–322 (1928)
90. Weber, D.: Die Schweremessungen der Landesvermessung in Deutschland. ZfV 11, 370–378 (1998)
91. Wilson, J.: A possible origin of the Hawaiian Islands. Can. J. Phys. 41, 863–868 (1963)
92. Wolf, K.: Multiscale Modeling of Classical Boundary Value Problems in Physical Geodesy by Locally Supported Wavelets. Ph.D. Thesis, Geomathematics Group, University of Kaiserslautern (2009)
93. https://www.mineralienatlas.de/lexikon/index.php/Deutschland/Rheinland-Pfalz. Accessed 31 Oct 2018

Part VIII Methods for Satellite and Space Techniques

19 Satellite Gravitational Gradiometry: Methodological Foundation and Geomathematical Advances

Willi Freeden, Helga Nutz, Reiner Rummel, and Michael Schreiner

Contents
1 Introductory Aspects . . . . . . . . . . 1187
2 SGG-Aspects of Inverse Ill-Posed Problems . . . . . . . . . . 1195
3 SGG-Aspects of Potential Theory . . . . . . . . . . 1205
4 SGG in Frequency-Based Framework . . . . . . . . . . 1222
5 SGG in Space-Based Framework . . . . . . . . . . 1238
6 Conclusion . . . . . . . . . . 1248
Literature . . . . . . . . . . 1250

Abstract Satellite Gravitational Gradiometry (SGG) is an observational technique of globally establishing the fine structure and the characteristics of the external

This chapter is part of the series Handbuch der Geodäsie, volume "Mathematische Geodäsie/Mathematical Geodesy", edited by Willi Freeden, Kaiserslautern.

W. Freeden
Geomathematics Group, Mathematics Department, University of Kaiserslautern, Kaiserslautern, Germany
E-Mail: [email protected]; [email protected]

H. Nutz
CBM – Gesellschaft für Consulting, Business und Management mbH, Bexbach, Germany
E-Mail: [email protected]

R. Rummel
Astronomical and Physical Geodesy, TU Munich, Munich, Germany
E-Mail: [email protected]

M. Schreiner
Institute for Computational Engineering, University of Applied Sciences of Technology NTB, Buchs, Switzerland
E-Mail: [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020
W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_111


W. Freeden et al.

Earth's gravitational field. The "Gravity field and steady-state Ocean Circulation Explorer" GOCE (2009–2013) was the first satellite of ESA's satellite program intended to realize the principle of SGG and to deliver useful SGG-data sets. In fact, GOCE was capable of providing suitable data material of homogeneous quality and high data density. Mathematically, SGG demands the determination of the gravitational potential in the exterior of the Earth, including its surface, from given data of the gravitational Hesse tensor along the satellite orbit. For purposes of modeling we are led to invert the "upward continuation" operator resulting from the Abel–Poisson integral formula of potential theory. This approach requires the solution of a tensorial Fredholm integral equation of the first kind relating the desired Earth's gravitational potential to the measured orbital gravitational gradient acceleration. The integral equation constitutes an exponentially ill-posed problem of the theory of inverse problems, which inevitably needs two regularization processes, namely "downward continuation" and (weak or strong) "error regularization" in the case of noisy data. This contribution deals with two different SGG-multiscale regularization methods, one in the space domain and the other in the frequency domain. Both procedures provide the gravitational potential on the real Earth's surface as derived from tensorial SGG-data along the satellite orbit, as required from the viewpoint of geodesy.
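The exponential ill-posedness can be made explicit in a simplified spherical framework (a standard illustration, not taken from the chapter itself): expanding the potential in spherical harmonics $Y_{n,m}$, upward continuation from a sphere of radius $R$ to the orbit radius $r > R$ damps the degree-$n$ coefficients geometrically,

```latex
\[
V(r\xi) \;=\; \sum_{n=0}^{\infty} \sum_{m=-n}^{n}
\left(\frac{R}{r}\right)^{n+1} V^{\wedge}(n,m)\, Y_{n,m}(\xi),
\qquad |\xi| = 1,\; r \ge R,
\]
```

so that downward continuation multiplies the observed coefficients by $(r/R)^{n+1}$, a factor growing exponentially with the degree $n$. Noise in the high-degree part of the orbit data is amplified by the same factor, which is why the regularization processes mentioned in the abstract are indispensable.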

Zusammenfassung Satellite gravitational gradiometry (SGG) is a measurement technique for the global determination of the fine structure and the characteristics of the gravitational field in the exterior of the Earth, including the Earth's surface. The ESA satellite GOCE (2009–2013) was the first whose task was to realize the SGG principle and to deliver usable SGG-data. GOCE indeed succeeded in providing data material of uniform quality and high data density. Mathematically, SGG requires the determination of the gravitational potential in the exterior of the Earth, including the Earth's surface, from given data of the Hesse tensor of the gravitational potential on the satellite orbit. The modeling leads to the inversion of the "upward continuation" operator resulting from the Abel–Poisson integral formula of potential theory. This approach requires the solution of a tensorial Fredholm integral equation of the first kind, which relates the gravitational potential in the exterior of the Earth to the gravitational gradient accelerations measured along the orbit. Solving the integral equation amounts to an exponentially ill-posed problem of the theory of inverse problems, which inevitably requires two regularization processes, namely "downward continuation" and (weak or strong) "error regularization" in the case of noisy data.

19 Satellite Gravitational Gradiometry: Methodological Foundation and . . .


This contribution deals with two different SGG multiscale regularization methods, one in space domain and the other in frequency domain. Both approaches deliver the gravitational potential on the actual Earth's surface, as required from the geodetic point of view, from tensorial SGG-data measured along satellite orbits.

Keywords Satellite gravitational gradiometry (SGG) · Tensorial pseudodifferential equation · Up- and downward continuation · Invertibility · Exponential ill-posedness · Multiscale regularization · Space/frequency decorrelation

1 Introductory Aspects

The Earth's gravitational field provides the natural orientation in our living environment, level (or equipotential) surfaces defining the horizontal and plumb lines the vertical. In a first approximation, seen from outer space, all level surfaces seem to be spherical, and plumb lines close to the Earth's surface look like straight lines directed radially towards the Earth's center of mass, with the gravitational acceleration being close to the well-known 9.8 m/s². A closer look reveals, however, the slight oblateness of the level surfaces as well as a small increase of gravitational attraction towards the poles by 0.05 m/s², corresponding to 1/200 of the gravity intensity, both caused by the flattening of the Earth's figure and by its rotation. The oblateness of the level surfaces causes the plumb lines to be somewhat curved, as they are orthogonal trajectories of them. Topography and density variations in the Earth's crust and mantle produce gravity anomalies primarily in the range of one thousandth to one millionth of the gravity intensity; temporal variations are much smaller, typically less than one millionth of the gravity intensity. The equipotential surface at mean sea level has a special relevance, as will be discussed below, and has been denoted "geoid" since Listing [83]. When compared to a best-fitting ellipsoid with a flattening of f = 1/298.3, deviations become visible, so-called geoid undulations, with an elevation range between −100 m and +80 m. Undulations relative to a hydrostatic equilibrium figure (flattening f = 1/299.63) are larger, with maximum values of about 200 m (cf. [74]).

1.1 Geodetic Aspects

The geoid plays a prominent role in physical geodesy and oceanography. Ocean topography is the deviation of the actual mean ocean surface from the geoid. The geoid is the hypothetical surface of the oceans at rest [65]. Nowadays, the shape of the actual ocean surface is continuously monitored by satellite radar or laser altimetry. Ocean topography, the difference between geoid and mean ocean


surface, is typically only about ±30 cm, with maximum values of 1–2 m in the centers of the major circulation systems. Ocean topography can be translated into geostrophic ocean surface velocities. They serve as input to numerical ocean modeling with focus on ocean mass and heat transport.

In geodesy the geoid is the classical global reference surface of heights. The most direct measure of height differences is gravity potential differences. In the past, potential differences were measured by geodetic leveling in conjunction with gravimetry. Now, in the space age, the geometric position differences of terrain points are given by Global Navigation Satellite Systems (GNSS). A global gravity field model is needed, in addition, to determine their potential difference. Referencing all potential differences to the geoid results in heights above mean sea level, so-called geopotential numbers [75].

In geophysics, gravity is of fundamental importance for the study of the composition and dynamics of the Earth (or of Moon and planets). Short-scale gravity anomalies indicate density jumps in the subsurface and are employed for exploring ores, salt domes, fractures or sediment layers. Crustal anomalies are used for studying isostatic compensation of topographic masses or the elasticity of lithospheric plates. Subduction processes, faulting or mantle plumes are reflected in medium-wavelength gravity or geoid anomalies (see [29, 135]). The dominating temporal variation of the Earth's field is the tidal signal due to the attraction of Sun, Moon and planets and its effect on the solid Earth and oceans. Generally, tides are well known and understood, but there remain some uncertainties in the models of the ocean tides.
More interesting are gravity changes due to mass variations and mass transport in the global water cycle: the melting of the ice sheets of Greenland and Antarctica and of the many glaciers and ice caps, the mass component of sea level change, and changes of continental water storage. Other important contributions come from glacial isostatic adjustment processes and from ocean bottom pressure variations (cf. [17, 133]).

Terrestrial absolute and relative gravimetry is the classical method of gravity field determination [130]. However, even after more than 100 years of terrestrial gravimetry the global coverage is incomplete, inhomogeneous and in large parts inaccurate. The modern answer to these deficiencies is satellite gravimetry. Only with satellites is it possible to cover the entire Earth with measurements within reasonable time, and all measurements are done with one and the same set of instruments. Furthermore, as global sampling takes only between ten days and two months, depending on the requested sampling density, important time series of temporal changes can be established.

Nevertheless, it remains almost paradoxical to try to measure the Earth's gravity field with satellites. Why? Because of the great height of the satellites above the Earth's surface. From Newton's law of gravitation it is well known that the force of gravitation caused by a mass and sensed by a test mass decreases with the square of their distance. Any satellite orbiting the Earth is such a test mass, circling the Earth in free fall in its gravitational field. The dominating part of the motion is the Keplerian (elliptic) orbit about the almost spherical Earth. Superimposed is a slight precession of this ellipse and of the orbit


plane caused by the Earth's oblateness. In the present context more important is the sequence of accelerations and decelerations due to mountains, valleys and any other mass anomaly on and inside the Earth. One could refer to it as a gravitational code that has to be deciphered. The higher the orbit altitude, the "quadratically" smaller these perturbations and the more difficult it becomes to read this signal code. One obvious counter-measure is to choose an orbit altitude as low as possible. Thus, dedicated satellite gravimetry missions fly at extremely near-Earth orbits (NEOs). A second strategy is, instead of trying to decipher the orbit's gravitational code, to apply the principle of satellite gravitational gradiometry for the measurements. Satellite Gravitational Gradiometry (SGG) is the measurement of the gradients of the three components of the gravitational vector: all nine of them, or a few, or a linear combination of some components. Gradiometry is therefore equivalent to the measurement of second-order derivatives of the gravitational potential. The principle is illustrated in Fig. 1, which shows a satellite orbiting the Earth and four mass probes in its interior. All five masses, the satellite and the four probes, are in free fall around the Earth in its gravitational field. We assume the satellite's mass to be concentrated in its center of mass and the probes to be located around the center of mass of the spacecraft. Thus, the lower of the four mass probes will be slightly more attracted by, e.g., a mountain on Earth than the satellite itself and even more than the mass probe on top. Also, the front probe is attracted by the mountain in a direction slightly different from that of the mass probe in the middle and from that in the rear.
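The order of magnitude of these differential attractions can be estimated with a point-mass Earth; a back-of-the-envelope sketch (the value of GM and the 50 cm gradiometer baseline mentioned later are standard numbers, while the point-mass model is a simplification for illustration only):

```python
# Differential (tidal) acceleration sensed by a gradiometer: point-mass Earth,
# GOCE-like orbit, 50 cm baseline between the mass probes.
GM = 3.986004418e14            # geocentric gravitational constant [m^3/s^2]
r = 6_378_000.0 + 250_000.0    # orbit radius [m]
baseline = 0.5                 # separation of the mass probes [m]

g = GM / r**2                          # free-fall acceleration at orbit height
radial_gradient = 2 * GM / r**3        # magnitude of d|g|/dr [1/s^2]
delta_a = radial_gradient * baseline   # acceleration difference over 50 cm

print(f"g at orbit height: {g:.3f} m/s^2")
print(f"radial gradient:   {radial_gradient * 1e9:.0f} Eotvos")
print(f"delta_a over 0.5 m: {delta_a:.2e} m/s^2")
```

The main (spherical-Earth) signal is of the order of a micrometer per second squared over the baseline, and the anomaly signal of interest is many orders of magnitude below that, which is why gradiometer accelerometers must be extremely sensitive.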
Measuring these tiny differences in gravitational attraction (1) greatly attenuates the contribution of the main signal, which is that of the spherical Earth, and thereby (2) emphasizes the shorter scale signal due to a mountain or any other small-scale signal. One could

Fig. 1 The principle of a gradiometer, i.e., the measurement of variations in the acceleration due to gravity


compare it to a gravitational magnifying glass. With a set-up like in Fig. 1 the mass probes are not only subject to the gravitational field of the Earth. If each of them remains Earth-pointing, they also experience the effect of centrifugal and angular accelerations caused by the once-per-revolution rotation of the satellite around the Earth. With the $x_1$-axis pointing in flight direction, the $x_2$-axis orthogonal to the orbit plane, and the $x_3$-axis (almost) radially towards the Earth's center, the complete set of nine measurable components becomes (with $V_{x_i x_j} = \partial^2 V / \partial x_i \partial x_j$, $i, j = 1, 2, 3$):

$$
\underbrace{\begin{pmatrix}
V_{x_1 x_1} & V_{x_1 x_2} & V_{x_1 x_3}\\
V_{x_2 x_1} & V_{x_2 x_2} & V_{x_2 x_3}\\
V_{x_3 x_1} & V_{x_3 x_2} & V_{x_3 x_3}
\end{pmatrix}}_{\text{gravitational tensor}}
+
\underbrace{\begin{pmatrix}
-\omega_{x_2}^2 - \omega_{x_3}^2 & \omega_{x_1}\omega_{x_2} & \omega_{x_1}\omega_{x_3}\\
\omega_{x_2}\omega_{x_1} & -\omega_{x_3}^2 - \omega_{x_1}^2 & \omega_{x_2}\omega_{x_3}\\
\omega_{x_3}\omega_{x_1} & \omega_{x_3}\omega_{x_2} & -\omega_{x_1}^2 - \omega_{x_2}^2
\end{pmatrix}}_{\text{angular velocities}}
+
\underbrace{\begin{pmatrix}
0 & \frac{\partial}{\partial t}\omega_{x_3} & -\frac{\partial}{\partial t}\omega_{x_2}\\
-\frac{\partial}{\partial t}\omega_{x_3} & 0 & \frac{\partial}{\partial t}\omega_{x_1}\\
\frac{\partial}{\partial t}\omega_{x_2} & -\frac{\partial}{\partial t}\omega_{x_1} & 0
\end{pmatrix}}_{\text{angular accelerations}}
\tag{1}
$$

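In (1) the gravitational and centrifugal contributions are symmetric, while the angular-acceleration part is skew-symmetric, so the symmetric/antisymmetric splitting of the measured tensor isolates the rotational signal. A minimal NumPy sketch with synthetic (hypothetical) values:

```python
import numpy as np

# Build a synthetic measured tensor: symmetric gravitational part, symmetric
# centrifugal part Omega*Omega = w w^T - |w|^2 I, skew angular-acceleration part.
rng = np.random.default_rng(0)
grav = rng.normal(size=(3, 3)); grav = 0.5 * (grav + grav.T)    # symmetric
w = rng.normal(size=3)                                          # angular rates
centrifugal = np.outer(w, w) - np.dot(w, w) * np.eye(3)
wd = rng.normal(size=3)                                         # angular accelerations
skew = np.array([[0.0,  wd[2], -wd[1]],
                 [-wd[2], 0.0,  wd[0]],
                 [wd[1], -wd[0], 0.0]])

measured = grav + centrifugal + skew

# Symmetric/antisymmetric splitting recovers the skew (rotational) part exactly;
# the gravitational tensor still has to be separated from the centrifugal term.
sym_part = 0.5 * (measured + measured.T)
antisym_part = 0.5 * (measured - measured.T)

assert np.allclose(antisym_part, skew)
assert np.allclose(sym_part, grav + centrifugal)
```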
While the gravitational tensor and the angular velocity part are symmetric, the angular acceleration part is skew-symmetric. This allows the separation of the rotational terms from the gravitational part (see [110] for more details).

Let us say a few words about the history of satellite gravitational gradiometry. What were the essential steps? Already the radio signals of the first two artificial satellites, Sputnik 1 and 2, both launched in 1957, were used to determine the oblateness of the Earth's gravitational field [13, 89]. This great success was the beginning of a continuous effort of gravitational model improvement and refinement based on the analysis of various kinds of orbit tracking data from a large number of satellites [76]. First ideas about gravitational gradiometry were published shortly after the Sputnik satellites (cf. [15, 18, 27, 28, 121]). From these ideas several alternative lines of hardware development emerged, as discussed in [136]. Some of these concepts entered into the NASA program discussions of the workshop at Williamstown [79] and shortly afterwards into the NASA Earth and Ocean Physics Applications Program [99]. They were further elaborated in a workshop on satellite gravimetry at the National Research Council [104]. In Europe it started with a discussion of


the theoretical and practical concepts of satellite gravimetry at a summer school in Lannion, sponsored by the French space agency CNES [1]. The European Space Agency began with a programmatic discussion at the Space, Oceanography, Navigation and Geodynamics workshop at Ellmau in 1978 [22]. As a result of these activities on both sides of the Atlantic, NASA finally concentrated on the realization of satellite-to-satellite tracking in the low-low mode (SST l-l), while in Europe the focus was on gravitational gradiometry. SST l-l is a differential technique alternative to gradiometry, or, one should say, it is not a differential but a difference technique, because the test masses are typically separated by a distance as large as 200 km (the test masses of a gradiometer instrument have a distance of only 50 cm). This makes it less suitable than gradiometry for a very detailed determination of the static gravity field of the Earth, but much more suitable for measuring the temporal variations, albeit less detailed. The mass probes in the case of SST are the two satellites themselves, which follow each other in the same orbit at a distance of 200 km. The gravimetric signal consists of the distance changes, which are measured with a precision of a few micrometers. The first mission of this type was GRACE, a NASA mission with German participation, in orbit from 2002 to 2017 and highly successful [128]. It has recently been succeeded by the follow-on mission GRACE FO, again with contributions from the German DLR and GFZ. In Europe, encouragement for gradiometry came from the positive experience with the high-precision micro-accelerometer CACTUS onboard the French mission CASTOR (D5B) [8]. A proposal was addressed to ESA in the context of the science program Horizon-2000 for a gradiometer experiment denoted GRADIO [7]. It led to a further proposal of a joint gravity and magnetic field mission denoted ARISTOTELES [111]. Neither GRADIO nor ARISTOTELES was approved.
Technology was not yet mature for a complex mission such as this one and, probably more importantly, it did not fit in any of the existing ESA programs. In a new attempt in the nineties, the dedicated gravitational gradiometry mission proposal GOCE (Gravity field and steady-state Ocean Circulation Explorer) was approved in 1998. It was the first mission of the newly established ESA program "Living Planet" [23]. GOCE was launched on March 17, 2009; it was in orbit till November 11, 2013. Its orbit was as low as 255 km and was further lowered to 224 km in the mission's final phase. This was only possible with an active drag compensation system on board. GOCE carried the first satellite gradiometer instrument, a three-axis device with arm lengths of 50 cm, centered at the satellite's center of mass. Each gradiometer arm held two high-precision three-axis accelerometers at its ends. Thus, in theory, it was a full tensor instrument, as described by (1). In practice, it was impossible to build accelerometers with three ultra-sensitive axes in a laboratory on Earth (under the influence of the gravity intensity), and one axis had to be constructed much more robustly. As a consequence, high-precision measurements were only possible for the gradiometer components $V_{x_1 x_1}$, $V_{x_2 x_2}$, $V_{x_3 x_3}$, and $V_{x_1 x_3}$ as well as for the angular rate $\omega_{x_2}$. The determination of the angular rates was supported by star sensor measurements. Based on the results of GOCE the most detailed and


accurate global static gravity models were derived. Fundamental theoretical work on satellite gradiometry goes back to [92, 95] and to [88]; see also [109, 110]. In several fascinating publications Marussi discussed the geometric structure of the local gravitational field and its interpretation as a local tidal field. The tidal effect is in this case not that of Sun, Moon and planets, but that of the Earth's gravitation in a local triad on a test mass not exactly placed at the triad's center [85, 86] and [87]. Some years later the mathematical foundations of gravitational gradiometry were elaborated and refined in the dissertations [105, 122, 123] and in the research notes [46, 47, 51, 53] of the Geomathematics Group, University of Kaiserslautern.

The great drawback of acquiring gravitational data at LEO altitude is that the upward continuation of the gravitational field amounts to an exponential smoothing of the potential coefficients in terms of outer harmonics with increasing height. In other words, satellite measurements do not contain the same signal information at LEO height (i.e., 200–250 km) as on the Earth's surface. This is the reason why the gravitational potential is obtainable from satellite data only in an attenuated form when continued to the Earth's surface. Even worse, in downward continuation the noise in the measurements may be amplified. Nevertheless, for satellite gravitational gradiometry (SGG), as provided by GOCE, advantage can be taken of the fact that second derivatives instead of the potential itself are used as observations at LEO altitude. Mathematically, this means that the exponential decay of the outer harmonic coefficients is reduced polynomially by two degrees. In other words, SGG takes advantage of the fact that second derivatives produce a rougher data set than the potential itself, such that the resolution of the gravitational structure is much finer.
In addition, in the frequency context of outer harmonics, the Meissl scheme (see, e.g., [88, 105, 115], and [47]) enables us in spectral nomenclature to relate the orthogonal coefficients at LEO’s height to the orthogonal coefficients at the surface of the Earth, at least in the context of a spherical model and under the restrictive assumption of bandlimited outer harmonic modeling without observational errors.
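In a spherical model these statements can be quantified with the standard degree-wise factors (standard potential theory, not quoted verbatim from this chapter): upward continuation damps a potential coefficient of degree n by (R/r)^(n+1), while the second radial derivative re-amplifies it by the polynomial factor (n+1)(n+2)/r². A small numerical sketch:

```python
# Spherical-model sketch of the attenuation between the Earth's surface and
# satellite altitude (standard factors, illustration only): potential
# coefficients of degree n are damped by (R/r)^(n+1); the second radial
# derivative used in SGG regains the polynomial factor (n+1)(n+2)/r^2.
R = 6_371_000.0              # mean Earth radius [m]
r = R + 250_000.0            # GOCE-like orbit radius [m]

for n in (2, 36, 100, 200):
    damping = (R / r) ** (n + 1)             # exponential attenuation in n
    sgg_factor = (n + 1) * (n + 2) / r**2    # polynomial gain of the SGG data
    print(f"n = {n:3d}: damping = {damping:.2e}, "
          f"damping * (n+1)(n+2) = {damping * (n + 1) * (n + 2):.2e}")
```

The exponential factor still dominates for high degrees, which is why the problem remains exponentially ill-posed; the polynomial gain only mitigates it by two degrees.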

1.2 Mathematical Aspects

The literature dealing with solution procedures for satellite gravitational gradiometry can be divided essentially into two classes: the timewise approach and the spacewise approach. The former considers the measured data as a time series, while the latter supposes that the data are given in advance as a dataset on the satellite orbit. This contribution is part of the spacewise approach (see, e.g., [57, 109, 110, 112–114, 116, 119, 122, 123] for some earlier work). The goal of this work is a potential-theoretically based and numerically reflected approach to satellite gravitational gradiometry corresponding to tensorial SGG-data by use of multiscale regularization methods, both in the frequency as well as the space domain.


Fig. 2 The spherically reflected geometric situation of satellite gravitational gradiometry (SGG)

Spherically Reflected Formulation For simplicity, we start with the mathematical SGG-description for the outer space of a sphere $S^2_R$ in Euclidean space $\mathbb{R}^3$ of radius $R$ around the origin (thereby using [46, 47, 53] as basic material for the discussion in the frequency domain; in space domain a novel methodology will be presented). In the spherically reflected case (see Fig. 2) the relation between the known tensorial measurements $v = (\nabla \otimes \nabla)V$, i.e., the gradiometer orbital data on $\Gamma$, and the corresponding potential $V$ outside the sphere $S^2_R$ is equivalently expressible in the following ways:

• (in frequency domain) by a pseudodifferential equation of the form

$$
\sum_{n=0}^{\infty} \sum_{m=1}^{2n+1} V^{\wedge_{L^2(S^2_R)}}(n,m)\, \tilde{\mu}_n^{(1,1)}\, h_{n,m}^{R;(1,1)}(x) = (\nabla \otimes \nabla)V(x) = v(x), \tag{2}
$$

where the coefficients $\tilde{\mu}_n^{(1,1)}$ are given by

$$
\tilde{\mu}_n^{(1,1)} := (n+2)(n+2)(2n-3)(2n-1), \tag{3}
$$

$h_{n,m}^{R;(1,1)}$ denotes a tensor outer harmonic of type $(1,1)$ of degree $n$ and order $m$, and $V^{\wedge_{L^2(S^2_R)}}(n,m)$ are the orthogonal (Fourier) coefficients given by

$$
V^{\wedge_{L^2(S^2_R)}}(n,m) = \langle V, H_{n,m}^{R} \rangle_{L^2(S^2_R)} = \int_{S^2_R} V(y)\, H_{n,m}^{R}(y)\, dS(y) \tag{4}
$$

with $H_{n,m}^{R}$ as scalar outer harmonic of degree $n$ and order $m$;

• (in space domain) by a linear integral equation of the first kind

$$
\int_{S^2_R} V(y)\, (\nabla_x \otimes \nabla_x)\, \underbrace{\frac{1}{4\pi R}\, \frac{|x|^2 - R^2}{|x - y|^3}}_{=:\, K_R(x,y)}\, dS(y) = (\nabla \otimes \nabla)V(x) = v(x), \tag{5}
$$

where, as already stated, $v(x) = (\nabla_x \otimes \nabla_x)V(x)$ with locations $x$ on the satellite orbit $\Gamma$ designates the Hesse tensor of $V$ at $x$, and $K_R(\cdot,\cdot)$, given by

$$
K_R(x,y) := \frac{1}{4\pi R}\, \frac{|x|^2 - R^2}{|x - y|^3}, \quad x, y \in S^{2;\mathrm{ext}}_R,\ x \neq y, \tag{6}
$$

is the Abel–Poisson kernel in the outer space $S^{2;\mathrm{ext}}_R$ of the sphere $S^2_R \subset \mathbb{R}^3$ (providing "upward continuation" from the sphere $S^2_R$ to its outer space by forming the convolution against $V$).

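The Abel–Poisson kernel (6) has the classical Legendre expansion $K_R(x,y) = \sum_{n \ge 0} \frac{2n+1}{4\pi R^2} (R/|x|)^{n+1} P_n(\xi \cdot \eta)$ with $\xi = x/|x|$, $\eta = y/R$ (a standard potential-theoretic identity, not stated explicitly above); a quick numerical cross-check of the closed form against the truncated series:

```python
import numpy as np
from numpy.polynomial import legendre

R, r, t = 1.0, 1.3, 0.4      # sphere radius, field-point radius, cos of angle

# Closed form of the Abel-Poisson kernel; |x - y|^2 = r^2 + R^2 - 2 r R t.
dist3 = (r**2 + R**2 - 2 * r * R * t) ** 1.5
closed = (r**2 - R**2) / (4 * np.pi * R * dist3)

# Series sum over degrees n, truncated at n = 199 (geometric decay in R/r).
series = sum((2 * n + 1) / (4 * np.pi * R**2) * (R / r) ** (n + 1)
             * legendre.legval(t, [0] * n + [1])       # P_n(t)
             for n in range(200))

assert np.isclose(series, closed)
```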
SGG-Operator Formulation The SGG-problem in its spherically reflected form is a typical "downward continuation problem" in the mathematical discipline of "Inverse Problems". It essentially requires the determination of the gravitational potential $V$ on and outside the sphere $S^2_R$ from the Hesse tensor $v = (\nabla \otimes \nabla)V$ on the satellite orbit $\Gamma$. Because of the smoothing effect of upward continuation on gravitational datasets at LEO altitude (particularly in the presence of noise), downward continuation by SGG inevitably leads to an inverse problem of ill-posed character in the sense of Hadamard's classification [73]. As a consequence, our first interest is in regularization methods, which are relevant as approximate SGG-solution strategies. In fact, it turns out that Eqs. (2) and (5) define the same operator equation

$$
\Lambda V = v, \tag{7}
$$

so that, in accordance with well-known results of the mathematical theory of inverse problems, the operator $\Lambda : X \longrightarrow Y$ in (7) constitutes an invertible mapping between scalar and tensorial Hilbert spaces $X$, $Y$ of Sobolev type (under the assumption that the data are not erroneous). Unfortunately, by virtue of functional-analytic arguments (see, e.g., [21, 81, 101, 108, 129] for more details), the inverse of $\Lambda$ turns out to be unbounded; hence, $\Lambda^{-1}$ needs regularization even if the SGG-data are free of errors. As a matter of fact, a large variety of regularization techniques may be applied to the SGG-problem.


In this work, however, our purpose is to work out multiscale approaches for regularization of the inverse operator $\Lambda^{-1}$ of $\Lambda$, namely

• SGG using frequency regularizing Tikhonov and truncated singular integral techniques,
• SGG using space regularizing "downward continuation" techniques involving Abel–Poisson kernels.

Alternative wavelet methods and multiscale realizations can be found in several publications. It started with [35, 49, 57, 122]. Further related notes are [24–26, 42, 46, 47, 53, 54, 56, 58–64, 66, 69, 77, 90, 91, 105, 106, 125, 137]. Tree algorithms are available if SGG-data within the Hesse tensor (for example, second radial derivatives) are used for multiscale approximation. The trace of the Hesse tensor (which is equal to zero because of the harmonicity of the gravitational potential) offers the possibility to validate tensorial multiscale methods. Moreover, observational errors can be handled by filtering techniques within the tree algorithm (see [35, 40, 41]). Since all these techniques and procedures are well documented in the literature, they will not be discussed in this approach.
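The trace-validation idea just mentioned rests on the harmonicity of the gravitational potential: the diagonal entries of the Hesse tensor must sum to zero. A minimal finite-difference check for the prototype harmonic potential V(x) = 1/|x| (illustrative code, not the chapter's algorithms):

```python
import numpy as np

# The trace of the Hesse tensor of a harmonic potential vanishes (Laplace
# equation); finite-difference check for the prototype potential V = 1/|x|.
def V(x):
    return 1.0 / np.linalg.norm(x)

def hessian(f, x, h=1e-5):
    """Central second differences for the full Hesse matrix of f at x."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.eye(n)[i] * h, np.eye(n)[j] * h
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * h**2)
    return H

x = np.array([1.0, 2.0, 2.0])        # |x| = 3, a point in the outer space
H = hessian(V, x)
assert abs(np.trace(H)) < 1e-5       # harmonicity: V_x1x1 + V_x2x2 + V_x3x3 = 0
assert np.allclose(H, H.T, atol=1e-5)
```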

1.3 Layout of the Paper

The layout of the paper is as follows: As a preparation we first present a brief overview on relevant aspects of the theory of ill-posed inverse problems (Sect. 2) and tensorial potential theory (Sect. 3). Then, based on (2), we propose multiscale frequency regularization methods in Sect. 4. Finally, based on (5), we come to multiscale space regularization methods in Sect. 5. Both techniques are based on a “Runge argument” to allow for a multiscale regularization of the Earth’s gravitational potential V on and outside the actual Earth’s surface.

2 SGG-Aspects of Inverse Ill-Posed Problems

For the convenience of the reader, we present a brief account of basic facts on regularization in Hilbert space settings, which are useful for understanding solution strategies in the framework of pseudodifferential equations. The explanations are based on functional-analytic tools (see, e.g., [45], where much more additional material and references can be found).

2.1 Ill-Posed Problems in Hilbert Spaces

We recapitulate some known results: Let $X$ and $Y$ be two Hilbert spaces with inner products $\langle \cdot, \cdot \rangle_X$ and $\langle \cdot, \cdot \rangle_Y$, respectively. Let

$$
\Lambda : X \longrightarrow Y \tag{8}
$$

be a linear bounded operator. Given $y \in Y$, we are looking for a solution of the operator equation

$$
\Lambda x = y. \tag{9}
$$

Four (mutually exclusive) situations arise (cf. [45]) in the discussion of the range $R(\Lambda)$ and the null space $N(\Lambda)$ of the operator $\Lambda$ (and its adjoint operator $\Lambda^*$):

(i) $R(\Lambda)$ is dense in $Y$ (hence $N(\Lambda^*) = \{0\}$), and $y \in R(\Lambda)$;
(ii) $R(\Lambda)$ is dense in $Y$, and $y \notin R(\Lambda)$;
(iii) $\overline{R(\Lambda)}$ is a proper subspace of $Y$, and $y \in R(\Lambda) + R(\Lambda)^\perp$;
(iv) $\overline{R(\Lambda)} \neq Y$, and $y \notin R(\Lambda) + R(\Lambda)^\perp$.

In case (i) one has, of course, a solution in the classical sense; in cases (ii) and (iv) a classical solution does not exist, while in case (iii) a solution need not exist. We say $x$ is a "least-squares solution" of (9) if

$$
\inf\{\|\Lambda u - y\| : u \in X\} = \|\Lambda x - y\|. \tag{10}
$$

Since

$$
\|\Lambda u - y\|^2 = \|\Lambda u - Qy\|^2 + \|y - Qy\|^2, \tag{11}
$$

where $Q$ is the orthogonal projector of $Y$ onto $\overline{R(\Lambda)}$, it is clear that a least-squares solution exists if and only if

$$
y \in R(\Lambda) + R(\Lambda)^\perp, \tag{12}
$$

where $R(\Lambda) + R(\Lambda)^\perp$ is a dense set in $Y$. For such $y$ the set of all least-squares solutions of (9), denoted by $L(y)$, is a nonempty closed convex set (indeed, $L(y)$ is the translate of $N(\Lambda)$ by a fixed element of $L(y)$); hence, it has a unique element of minimal norm, denoted by $\Lambda^\dagger y$. The generalized inverse (or pseudoinverse) $\Lambda^\dagger$ is the linear operator which assigns to each $y \in D(\Lambda^\dagger) := R(\Lambda) + R(\Lambda)^\perp$ the unique element in the set $L(y) \cap N(\Lambda)^\perp$, so that $L(y) = \Lambda^\dagger y + N(\Lambda)$. It is easy to show that $\Lambda^\dagger y$ is the minimal-norm solution (equivalently, the unique solution in $N(\Lambda)^\perp$) of the normal equation

$$
\Lambda^* \Lambda x = \Lambda^* y \tag{13}
$$


(the equation obtained by setting the first variation of $\|\Lambda x - y\|^2$ equal to zero). It also follows that

$$
\Lambda^\dagger = (\Lambda|_{N(\Lambda)^\perp})^{-1} Q, \tag{14}
$$

so that $\Lambda^\dagger$ can be characterized as the linear operator with the function-theoretic properties

$$
D(\Lambda^\dagger) = R(\Lambda) + R(\Lambda)^\perp, \qquad N(\Lambda^\dagger) = R(\Lambda)^\perp = N(\Lambda^*), \tag{15}
$$

and

$$
R(\Lambda^\dagger) = N(\Lambda)^\perp. \tag{16}
$$

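In finite dimensions the minimal-norm least-squares solution $\Lambda^\dagger y$ and the characterizations (13)-(16) can be probed directly; a small NumPy sketch with a generic rank-deficient matrix standing in for $\Lambda$ (illustration only, not the SGG operator):

```python
import numpy as np

# Generic rank-deficient matrix standing in for the operator (rank 1),
# so both the null space and the orthogonal complement of the range are nontrivial.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 0.0]])
y = np.array([1.0, 0.0, 3.0])       # y is not in R(A): only least-squares solutions

x_dag = np.linalg.pinv(A) @ y       # minimal-norm least-squares solution

# x_dag solves the normal equation A^T A x = A^T y ...
assert np.allclose(A.T @ A @ x_dag, A.T @ y)

# ... and lies in N(A)^perp: adding a null-space vector keeps the residual
# unchanged but increases the norm of the solution.
null_vec = np.array([2.0, -1.0])    # A @ null_vec = 0
assert np.allclose(A @ null_vec, 0)
x_alt = x_dag + null_vec
assert np.isclose(np.linalg.norm(A @ x_alt - y), np.linalg.norm(A @ x_dag - y))
assert np.linalg.norm(x_alt) > np.linalg.norm(x_dag)
```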
The equivalence of these characterizations of $\Lambda^\dagger$ is established, e.g., in [100] and a large number of subsequent contributions (see also [45, 102] for a lucid exposition and [103] for the generalization to unbounded operators).

In case (i) above, $\Lambda^\dagger$ gives the minimal-norm solution of (9). In case (iii), (9) has a least-squares solution (which is unique if and only if $N(\Lambda) = \{0\}$). In both cases the infimum in (10) is attained and is equal to zero and $\|y - Qy\|$, respectively. Cases (ii) and (iv) are pathological and usually not under discussion in generalized inverse theory, since in both cases $y \notin D(\Lambda^\dagger)$, and the infimum in (10) is not attained.

For gravimetric synthesis and identification [44] as well as in satellite problems in geodesy [35], cases (i), (ii), (iii) are of practical interest. Thus, for the identification problem, case (i) corresponds to identification with complete and exact information. Case (ii) may correspond to identification in the presence of contamination in the measurement. In either of these cases, it is theoretically possible to approximate the infimum (whether actually attainable or not) to within any desired degree of accuracy. For practical reasons, it may be necessary to limit the accuracy of the approximation in order to ensure that certain a priori conditions are met. Such constraints on the input (or control) functions are characteristic of control systems and often lead to sets of states which are unreachable; the corresponding control problem is then not even approximately controllable. Cases (iii) and (iv) arise for more general synthesis problems if $N(\Lambda^*)$ is nontrivial.

In accordance with Hadamard [73] we call an operator problem (9) well-posed, if the following properties are valid:

• For all "admissible" data, a solution exists.
• For all "admissible" data, the solution is unique.
• The solution depends continuously on the data.

In our setting, these requirements can be translated more accurately into

• $\Lambda$ is surjective, i.e., $R(\Lambda) = Y$.
• $\Lambda$ is injective, i.e., $N(\Lambda) = \{0\}$.
• $\Lambda^{-1}$ is bounded and continuous.

1198

W. Freeden et al.

If one of the three conditions is not fulfilled, the problem (9) is called ill-posed in the sense of Hadamard (note that, in the case of the SGG-problem, which is ill-posed, the most crucial issue is the unboundedness of the inverse operator $\Lambda^{-1}$). As a canonical evolution of Hadamard's classification, M. Z. Nashed [100] called the operator equation (9) well-posed in the least-squares sense (relative to $X$ and $Y$) if for each $y \in Y$ the equation has a unique least-squares solution (of minimal norm), which depends continuously on $y$; otherwise the problem is ill-posed. The advantage of adopting this notion of well-posedness is that it focuses on infinite-dimensional problems (e.g., an inconsistent finite system of linear algebraic equations will not be ill-posed in the above sense, while it is ill-posed in the sense of Hadamard).

It follows immediately from the open mapping theorem in functional analysis (see, e.g., [103]) that the following statements are equivalent:

• the problem (9) is well-posed;
• $R(\Lambda)$ is closed;
• $\Lambda^\dagger$ is bounded.

Summarizing, we are led to the following conclusion (see [103]): The problem $(\Lambda; X, Y)$ is called well-posed in the sense of Nashed, if $R(\Lambda)$ is closed in $Y$. If $R(\Lambda)$ is not closed in $Y$, the problem $(\Lambda; X, Y)$ is called ill-posed in the sense of Nashed.

Let us discuss the consequences of the violations of the above requirements for the well-posedness of (9): The lack of injectivity of $\Lambda$ is perhaps the easiest problem. From a theoretical point of view, the space $X$ can be replaced by the orthogonal complement $N(\Lambda)^\perp$, and the restriction of the operator $\Lambda$ to $N(\Lambda)^\perp$ yields an injective problem. However, in practice, one is normally confronted with the problem that $R(\Lambda) \neq Y$, since the right-hand side is given by measurements and is, therefore, disturbed by errors (note that this calamity is also inherent in the SGG-problem). Finally, it is helpful to discuss the following situation: We assume that $y \in R(\Lambda)$, but only a perturbed right-hand side $y^\varepsilon$ is known.
Furthermore, we suppose that

$$
\|y - y^\varepsilon\|_Y < \varepsilon. \tag{17}
$$

Our aim is to solve

$$
\Lambda x^\varepsilon = y^\varepsilon. \tag{18}
$$

Since $y^\varepsilon$ might not be in $R(\Lambda)$, the solution of this equation might not exist, and we have to generalize what is meant by a solution: $x^\varepsilon$ is a least-squares solution of (18) if

$$
\|\Lambda x^\varepsilon - y^\varepsilon\|_Y = \inf\{\|\Lambda z - y^\varepsilon\|_Y : z \in X\}. \tag{19}
$$


$x^\varepsilon$ is the best approximate solution of $\Lambda x^\varepsilon = y^\varepsilon$ if $x^\varepsilon$ is a least-squares solution and

$$
\|x^\varepsilon\|_X = \inf\{\|z\|_X : z \text{ is a least-squares solution of } \Lambda z = y^\varepsilon\} \tag{20}
$$

holds.

2.2 Regularization Strategies and Error Behavior

A serious problem for ill-posed problems occurs when Λ−1 or Λ† are not continuous (as in the SGG-context). That means that small errors in the data or even small numerical noise can cause large errors in the solution. In fact, in most cases the application of an unbounded Λ−1 or Λ† does not make any sense. The usual strategy to overcome this difficulty is to substitute the unbounded inverse operator Λ−1 : R(Λ) −→ Y

(21)

by a suitable bounded approximation R : Y −→ X.

(22)

The operator R is not chosen to be fixed, but depends on a regularization parameter α. Following the conventional approach in the theory of ill-posed problems, we are led to the following definition: A regularization strategy is a family of linear bounded operators

R_α : Y −→ X,  α > 0,    (23)

so that

lim_{α→0} R_α Λx = x    (24)

for all

x ∈ X,    (25)

i.e., the operators R_α Λ converge pointwise to the identity. From the theory of inverse problems (see, e.g., [102, 103]) it is also clear that if Λ : X → Y is compact and X has infinite dimension (as is the case for the SGG-application we have in mind), then the operators R_α are not uniformly bounded, i.e., there exists a sequence {α_j} with lim_{j→∞} α_j = 0 and

‖R_{α_j}‖_{L(Y,X)} → ∞ for j → ∞.    (26)
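A minimal numerical sketch of (26), assuming a diagonal model operator with singular values σ_n = 1/n and the Tikhonov filter introduced later in (44) (both choices illustrative, not the concrete SGG operator):

```python
# For a diagonal compact operator with singular values s_n = 1/n, the
# Tikhonov-regularized inverse R_a acts by the factors s_n/(a + s_n^2).
# Its operator norm sup_n s_n/(a + s_n^2) behaves like 1/(2*sqrt(a)),
# hence blows up as a -> 0 (illustrative model).
sigmas = [1.0 / n for n in range(1, 2001)]

def norm_R(alpha):
    return max(s / (alpha + s * s) for s in sigmas)

norms = [norm_R(a) for a in (1e-2, 1e-4, 1e-6)]
assert norms[0] < norms[1] < norms[2]     # no uniform bound as a -> 0
assert abs(norms[2] - 500.0) < 1e-6       # 1/(2*sqrt(1e-6)) = 500
```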


Note that the convergence of R_αΛx in (24) is based on y = Λx, i.e., on unperturbed data. In practice, the right-hand side is affected by errors, and then no convergence is achieved. Instead, one is (or has to be) satisfied with an approximate solution based on a certain choice of the regularization parameter.

Let us discuss the error of the solution. For that purpose, we let y ∈ R(Λ) be the (unknown) exact right-hand side and y^ε ∈ Y be the measured data with

‖y − y^ε‖_Y < ε.    (27)

For a fixed α > 0, we let

x^{α,ε} = R_α y^ε,    (28)

and look at x^{α,ε} as an approximation of the solution x of Λx = y. Then the error can be split as follows:

‖x^{α,ε} − x‖_X = ‖R_α y^ε − x‖_X ≤ ‖R_α y^ε − R_α y‖_X + ‖R_α y − x‖_X ≤ ‖R_α‖_{L(Y,X)} ‖y^ε − y‖_Y + ‖R_α y − x‖_X,    (29)

such that

‖x^{α,ε} − x‖_X ≤ ε ‖R_α‖_{L(Y,X)} + ‖R_α Λx − x‖_X.    (30)

We see that the error between the exact and the approximate solution consists of two parts: The first term is the product of the bound for the error in the data and the norm of the regularization operator R_α; this term will usually tend to infinity for α → 0 if the inverse Λ^(−1) is unbounded and Λ is compact (cf. (26)). The second term is the approximation error ‖(R_α − Λ^(−1))y‖_X for the exact right-hand side y = Λx; it tends to zero as α → 0 by the definition of a regularization strategy. Thus, the two parts of the error show a diametrically opposed behavior. A typical picture of the errors in dependence on the regularization parameter α is sketched in Fig. 3, and a strategy is needed to choose α depending on ε in order to keep the error as small as possible.

Fig. 3 Typical behavior of the total error in a regularization process
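The two competing terms in (30) can be tabulated on the same kind of diagonal toy model (all numbers illustrative):

```python
# Diagonal model: Lambda multiplies the n-th coefficient by s_n = 1/n^2;
# R_a is Tikhonov regularization.  The bound (30) splits into
#   eps * ||R_a||          (grows as a -> 0)
# + ||R_a Lambda x - x||   (shrinks as a -> 0),
# so the total is smallest at an intermediate regularization parameter.
N = 200
sig = [1.0 / (n * n) for n in range(1, N + 1)]
xc = [1.0 / n for n in range(1, N + 1)]   # coefficients of the exact solution
eps = 1e-3

def total_bound(alpha):
    data_term = eps * max(s / (alpha + s * s) for s in sig)
    approx_term = sum((alpha / (alpha + s * s) * c) ** 2
                      for s, c in zip(sig, xc)) ** 0.5
    return data_term + approx_term

e_small, e_mid, e_large = (total_bound(a) for a in (1e-8, 1e-4, 1.0))
assert e_mid < e_small and e_mid < e_large    # U-shaped total error as in Fig. 3
```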




That is, we would like to minimize

ε ‖R_α‖_{L(Y,X)} + ‖R_α Λx − x‖_X.    (31)

In principle, we distinguish two classes of parameter choice rules: If α = α(ε) depends only on ε, we call α = α(ε) an a-priori parameter choice rule. Otherwise α depends also on y^ε, and we call α = α(ε, y^ε) an a-posteriori parameter choice rule. It is usual to say that a parameter choice rule is convergent if, for ε → 0, it fulfills the limit relations

lim_{ε→0} sup{‖R_{α(ε,y^ε)} y^ε − Λ†y‖_X : y^ε ∈ Y, ‖y^ε − y‖_Y ≤ ε} = 0    (32)

and

lim_{ε→0} sup{α(ε, y^ε) : y^ε ∈ Y, ‖y − y^ε‖_Y ≤ ε} = 0.    (33)

We stop the discussion of parameter choice rules here; for more material, the interested reader is referred to any textbook on inverse problems, e.g., [20, 81, 84, 108].
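As one concrete a-posteriori rule, Morozov's discrepancy principle (a standard rule from the inverse-problems literature, not spelled out in the text above) picks the largest α whose residual stays below τε; a sketch on a synthetic diagonal problem:

```python
# Discrepancy principle on a diagonal toy problem: accept the largest
# alpha whose Tikhonov residual ||Lambda x^{a,eps} - y^eps|| <= tau*eps
# (all data synthetic and illustrative).
N = 100
sig = [1.0 / n for n in range(1, N + 1)]
x_true = [1.0 / n for n in range(1, N + 1)]
eps = 1e-2
noise = eps / N ** 0.5                      # per-coefficient perturbation
y_eps = [s * x + noise for s, x in zip(sig, x_true)]

def tikhonov(alpha):
    return [s * y / (alpha + s * s) for s, y in zip(sig, y_eps)]

def res_norm(alpha):
    x = tikhonov(alpha)
    return sum((s * xi - y) ** 2 for s, xi, y in zip(sig, x, y_eps)) ** 0.5

tau = 2.0
alpha = 1.0
while res_norm(alpha) > tau * eps:          # shrink until residual <= tau*eps
    alpha *= 0.5

def err(x):
    return sum((a - b) ** 2 for a, b in zip(x, x_true)) ** 0.5

err_disc = err(tikhonov(alpha))
err_naive = err([y / s for s, y in zip(sig, y_eps)])  # unregularized inverse
assert res_norm(alpha) <= tau * eps
assert err_disc < err_naive                 # regularized beats naive inversion
```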

2.3 Singular Systems and Resolution Methods

If Λ : X → Y is compact (see, e.g., [45] for more details), a singular system (σ_n; v_n, u_n) is defined as follows: {σ_n²}_{n∈N} are the nonzero eigenvalues of the self-adjoint operator Λ*Λ (Λ* is the adjoint operator of Λ), written down in decreasing order and counted with multiplicity. The family {v_n}_{n∈N} constitutes a corresponding complete orthonormal system of eigenvectors of Λ*Λ. We let σ_n > 0 and define the family {u_n}_{n∈N} via u_n = Λv_n/‖Λv_n‖_Y. The sequence {u_n}_{n∈N} forms a complete orthonormal system of eigenvectors of ΛΛ*, and the following formulas are valid:

Λ v_n = σ_n u_n,    (34)

Λ* u_n = σ_n v_n,    (35)

so that

Λx = Σ_{n=1}^∞ σ_n ⟨x, v_n⟩_X u_n,  x ∈ X,    (36)

Λ* y = Σ_{n=1}^∞ σ_n ⟨y, u_n⟩_Y v_n,  y ∈ Y.    (37)
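The defining relations (34)-(37) can be checked by hand on a 2×2 toy operator (a hypothetical example, not from the chapter):

```python
# Hand-checkable singular system: for Lambda = [[0, 1], [2, 0]] we have
# Lambda^T Lambda = diag(4, 1), hence sigma_1 = 2, sigma_2 = 1,
# v_1 = (1,0), v_2 = (0,1), and u_n = Lambda v_n / sigma_n.
L = [[0.0, 1.0],
     [2.0, 0.0]]
LT = [[L[0][0], L[1][0]], [L[0][1], L[1][1]]]   # adjoint = transpose

def mat(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

sigma = [2.0, 1.0]
v = [[1.0, 0.0], [0.0, 1.0]]
u = [[c / s for c in mat(L, vn)] for s, vn in zip(sigma, v)]

for n in range(2):
    Lv, LTu = mat(L, v[n]), mat(LT, u[n])
    assert all(abs(a - sigma[n] * b) < 1e-12 for a, b in zip(Lv, u[n]))   # (34)
    assert all(abs(a - sigma[n] * b) < 1e-12 for a, b in zip(LTu, v[n]))  # (35)

# singular value expansion (36) for an arbitrary x
x = [3.0, 5.0]
Lx = [sum(sigma[n] * dot(x, v[n]) * u[n][i] for n in range(2)) for i in range(2)]
assert all(abs(a - b) < 1e-12 for a, b in zip(Lx, mat(L, x)))
```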


The convergence of the infinite series is understood with respect to the Hilbert space norms under consideration. The identities (36) and (37) are called the singular value expansions of the corresponding operators. If there are infinitely many singular values, they tend to 0, i.e., lim_{n→∞} σ_n = 0.

An important result in the theory of singular value expansions, which can be found in all standard textbooks on inverse problems, should be presented: Let (σ_n; v_n, u_n) be a singular system for the compact linear operator Λ, and let y ∈ Y. Then we have

y ∈ D(Λ†)    (38)

if and only if

Σ_{n=1}^∞ |⟨y, u_n⟩_Y|² / σ_n² < ∞,    (39)

and for y ∈ D(Λ†) we have

Λ† y = Σ_{n=1}^∞ (⟨y, u_n⟩_Y / σ_n) v_n.    (40)

The condition (39) is the so-called Picard criterion. It says that a best-approximate solution of Λx = y exists only if the Fourier coefficients of y decrease fast enough relative to the singular values. The representation (40) of the best-approximate solution motivates a method for the construction of regularization operators, namely by damping the factors 1/σ_n in such a way that the series converges for all y ∈ Y. We are looking for filters

q : (0, ∞) × (0, ‖Λ‖_{L(X,Y)}) −→ R    (41)

such that

R_α y := Σ_{n=1}^∞ (q(α, σ_n)/σ_n) ⟨y, u_n⟩_Y v_n,  y ∈ Y,    (42)

is a regularization strategy. The following statement is, e.g., known from [81]: Let Λ : X → Y be compact with singular system (σ_n; v_n, u_n). Assume that q from (41) has the following properties:

(i) |q(α, σ)| ≤ 1 for all α > 0 and 0 < σ ≤ ‖Λ‖_{L(X,Y)}.
(ii) For every α > 0 there exists a c(α) so that |q(α, σ)| ≤ c(α)σ for all 0 < σ ≤ ‖Λ‖_{L(X,Y)}.
(iii) lim_{α→0} q(α, σ) = 1 for every 0 < σ ≤ ‖Λ‖_{L(X,Y)}.


Then the operator R_α : Y → X, α > 0, defined by

R_α y := Σ_{n=1}^∞ (q(α, σ_n)/σ_n) ⟨y, u_n⟩_Y v_n,  y ∈ Y,    (43)

is a regularization strategy with ‖R_α‖_{L(Y,X)} ≤ c(α). The function q is called a regularizing filter for Λ. Two important examples should be mentioned:

q(α, σ) = σ²/(α + σ²)    (44)

defines the Tikhonov regularization, whereas

q(α, σ) = 1 for σ² ≥ α,  q(α, σ) = 0 for σ² < α,    (45)

leads to the regularization by truncated singular value decomposition.
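Both filters, together with the properties (i)-(iii) they satisfy (with the common choice c(α) = α^(−1/2), an assumption of this sketch), can be verified numerically:

```python
# The regularizing filters (44) (Tikhonov) and (45) (truncated SVD),
# checked against the filter properties (i)-(iii) on a grid of
# illustrative parameter values.
def q_tikhonov(alpha, s):
    return s * s / (alpha + s * s)

def q_tsvd(alpha, s):
    return 1.0 if s * s >= alpha else 0.0

alphas = (1e-1, 1e-3, 1e-6)
svals = [1.0 / n for n in range(1, 101)]

for q in (q_tikhonov, q_tsvd):
    for a in alphas:
        for s in svals:
            assert abs(q(a, s)) <= 1.0                        # property (i)
            assert abs(q(a, s)) <= s / a ** 0.5 + 1e-12       # (ii) with c(a) = a**-0.5
assert q_tikhonov(1e-12, 0.5) > 0.999                         # property (iii)
assert q_tsvd(1e-12, 0.5) == 1.0
```

Tikhonov damps all coefficients smoothly, while truncated SVD cuts the spectrum off sharply at σ² = α; both make the factors q(α, σ_n)/σ_n in (43) bounded.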

2.4 Summary Excursion to Regularization Techniques

The strategy of resolution and reconstruction of ill-posed problems involves one or more of the following intuitive ideas (cf. [45, 101, 102], and the references therein):

• change the notion of what is meant by a solution, e.g., an ε-approximate solution: ‖Λx̃ − y‖ ≤ ε, where ε > 0 is the prescribed error level; a quasi-solution: ‖Λx̃ − y‖ ≤ ‖Λx − y‖ for all x ∈ M, a prescribed subset of the domain of Λ; a least-squares solution of minimal norm, etc.,
• modify the operator equation or the problem itself,
• change the spaces and/or topologies,
• specify the type of involved noise ("strong" or "weak" noise as discussed, e.g., in [19]).

From the standpoint of mathematical and numerical analysis one can roughly group "regularization methods" into six categories (cf. [45, 48]):

1. Regularization methods in function spaces form one category. This includes Tikhonov-type regularization, the method of quasi-reversibility, the use of certain function spaces such as scale spaces in multiresolution analysis, the method of generalized inverses (pseudoinverses) in reproducing kernel Hilbert spaces, and multiscale wavelet regularization.


2. Resolution of ill-posed problems by "control of dimensionality" is another category. This includes projection methods and moment-discretization schemes. The success of these methods hinges on the possibility of obtaining an approximate solution while keeping the dimensionality of the finite-dimensional problem within the "range of numerical stability"; it also hinges on deriving error estimates for the approximate solutions, which is crucial for the control of the dimensionality.
3. A third category consists of iterative methods, which can be applied either to the problem in function spaces or to a discrete version of it. The crucial ingredient in iterative methods is to stop the iteration before instability creeps into the process. Thus iterative methods have to be modified or accelerated so as to provide a desirable accuracy by the time a stopping rule is applied.
4. A fourth category is filter methods. Filter methods refer to procedures in which, for example, values producing highly oscillatory solutions are eliminated. Various "low-pass" filters can, of course, be used; they are also crucial for the determination of a stopping rule.
5. The original idea of a mollifier method (see, e.g., [45] and the references therein) is of interest for the solution of an operator equation whenever the problem is "too ill-posed" for the (pseudo)inverse to be determined accurately. Mollifiers are smooth functions with special properties used to create sequences of smooth functions approximating a non-smooth or singular function. One therefore compromises by changing the problem into a more well-posed one, namely that of determining a mollified version of the solution. The heuristic motivation is that the trouble usually comes from high-frequency components of the data and of the solution, which are damped out by mollification.
6. The root of the Backus-Gilbert method (BG method) was geophysical (cf. [4-6]). The characterization involved in the model is known as a moment problem in the mathematical literature. The BG method can be thought of as resulting from discretizing an integral equation of the first kind. Where other regularization methods, such as the frequently used Tikhonov regularization (see, e.g., [45] and the references therein), seek to impose smoothness constraints on the solution, the BG method instead realizes stability constraints: the solution varies as little as possible if the input data were resampled multiple times. The common feature between mollification and the BG method is that an approximate inverse is determined independently of the right-hand side of the equation.

The philosophy of resolution leads to the use of algebraic methods versus function space methods, statistical versus deterministic approaches, strong versus weak noise, etc. A regularization-approximation scheme refers to a variety of methods, such as Tikhonov regularization, projection methods, multiscale methods, and iterative approximation, that can be applied to ill-posed problems. These schemes turn into


algorithms once a resolution strategy can be effectively implemented. Unfortunately, this requires the determination of a suitable value of a certain parameter associated with the scheme (e.g., regularization parameter, mesh size, dimension of subspace in the projection scheme, specification of the level of a scale space, classification of noise, etc.). This is not a trivial problem since it involves a trade-off between accuracy and numerical stability, a situation that does not usually arise in well-posed problems.
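The early-stopping issue for iterative methods (category 3 above) is the phenomenon of semi-convergence; a sketch with plain Landweber iteration on a synthetic 2×2 system (all numbers illustrative):

```python
# Semi-convergence of a plain iterative method: Landweber iteration
# x_{k+1} = x_k + A^T (y^eps - A x_k) on a mildly ill-conditioned
# diagonal 2x2 toy system with perturbed data.  The error to the exact
# solution first decreases, then grows again, so the iteration must be
# stopped early.
a11, a22 = 1.0, 0.1
x_true = (1.0, 1.0)
y_eps = (1.0, 0.15)           # exact data (1, 0.1) plus a perturbation 0.05

def err(x):
    return ((x[0] - x_true[0]) ** 2 + (x[1] - x_true[1]) ** 2) ** 0.5

x = [0.0, 0.0]
errors = []
for _ in range(20000):
    errors.append(err(x))
    x[0] += a11 * (y_eps[0] - a11 * x[0])   # Landweber step (A diagonal)
    x[1] += a22 * (y_eps[1] - a22 * x[1])

assert min(errors) < 0.05     # early stopping reaches a good approximation
assert errors[-1] > 0.4       # iterating "to convergence" ends near the noisy inverse
assert errors[-1] < errors[0]
```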

3 SGG-Aspects of Potential Theory

In what follows we list some (known) potential-theoretic tools needed for the study of SGG. For more details and proofs the reader is referred, e.g., to [30, 36] and the references therein.

3.1 Gravitation

According to Newton's classical Law of Gravitation (1687), the gravitational potential V can be computed everywhere in R³ once the density distribution inside the Earth is known. In the Earth's exterior Σ^ext, the Newtonian volume potential, obtained by convolving the fundamental solution x ↦ (4π|x − y|)^(−1) of the Laplacian Δ with the mass density distribution ρ inside the Earth Σ^int, i.e.,

V(x) = (1/(4π)) ∫_{Σ^int} ρ(y)/|x − y| dV(y),    (46)

is harmonic in Σ^ext:

ΔV(x) = 0,  x ∈ Σ^ext    (47)

(we omit the gravitational constant for our theoretical purposes; it must be observed, however, in numerical work). In the Earth's interior Σ^int, V is related to the Earth's density ρ (provided that ρ is Hölder continuous) via the Poisson differential equation

ΔV(x) = −ρ(x),  x ∈ Σ^int.    (48)

Moreover, the gravitational potential V is regular at infinity. In fact, it can be readily seen that

|y| ≤ |x|/2,  y ∈ Σ^int,    (49)


implies

|x − y| ≥ ||x| − |y|| ≥ |x|/2,    (50)

i.e.,

|V(x)| = O(|x|^(−1)),  |x| → ∞.    (51)
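The O(|x|^(−1)) behavior of (46) can be reproduced numerically with a crude Riemann sum for a homogeneous unit ball (ρ = 1, gravitational constant omitted as in the text; the discretization parameters are illustrative):

```python
import math

# Riemann sum for the Newton volume potential (46) of a homogeneous unit
# ball: outside the body the sum should behave like M/(4*pi*|x|).
h = 0.1
grid = [(-1 + h / 2) + i * h for i in range(20)]
pts = [(x, y, z) for x in grid for y in grid for z in grid
       if x * x + y * y + z * z <= 1.0]
M = len(pts) * h ** 3                      # approximate mass (= volume, rho = 1)

def V(px, py, pz):
    s = sum(h ** 3 / math.dist((px, py, pz), p) for p in pts)
    return s / (4.0 * math.pi)

V3, V6 = V(3.0, 0.0, 0.0), V(6.0, 0.0, 0.0)
assert abs(V3 / V6 - 2.0) < 0.01           # decay like 1/|x|
assert abs(V3 - M / (4.0 * math.pi * 3.0)) < 0.01 * V3
```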

Collecting the results known from Newtonian potential theory for the Earth's gravitational field v in the outer space Σ^ext, we are confronted with the following characterization: the vector field v is real-analytic in Σ^ext such that

(v1) div v = ∇ · v = 0, curl v = ∇ ∧ v = 0 in Σ^ext,
(v2) v is regular at infinity: |v(x)| = O(|x|^(−2)), |x| → ∞.

The properties (v1) and (v2) imply that the Earth's gravitational field v in the exterior of the Earth Σ^ext is a gradient field,

v(x) = ∇V(x),  x ∈ Σ^ext.    (52)

Moreover, the gradient of the Earth's gravitational field (i.e., the Jacobi matrix field) v = ∇v obeys the following properties: v is a real-analytic tensor field in the Earth's exterior Σ^ext such that

(v1) div v = ∇ · v = 0, curl v = ∇ ∧ v = 0 in the Earth's exterior Σ^ext,
(v2) v is regular at infinity: |v(x)| = O(|x|^(−3)), |x| → ∞.

Combining our identities we finally see that v can be represented as the Hesse tensor of the scalar field V, i.e.,

v(x) = (∇ ⊗ ∇)V(x),  x ∈ Σ^ext.    (53)
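For the monopole potential V(x) = 1/|x| the Hesse tensor in (53) is available in closed form, (3x_i x_j − |x|²δ_ij)/|x|⁵; it is symmetric and trace-free, reflecting the harmonicity of V. A finite-difference spot check at one sample point (illustrative):

```python
import math

# Hesse tensor of V(x) = 1/|x|: finite differences versus the closed form
# (3 x_i x_j - |x|^2 delta_ij) / |x|^5 at p = (1, 2, 2), |p| = 3.
p = (1.0, 2.0, 2.0)

def V(x, y, z):
    return 1.0 / math.sqrt(x * x + y * y + z * z)

def hesse_fd(p, h=1e-3):
    H = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            e = [list(p) for _ in range(4)]
            e[0][i] += h; e[0][j] += h
            e[1][i] += h; e[1][j] -= h
            e[2][i] -= h; e[2][j] += h
            e[3][i] -= h; e[3][j] -= h
            H[i][j] = (V(*e[0]) - V(*e[1]) - V(*e[2]) + V(*e[3])) / (4 * h * h)
    return H

r2 = sum(c * c for c in p)
H_exact = [[(3 * p[i] * p[j] - (r2 if i == j else 0.0)) / r2 ** 2.5
            for j in range(3)] for i in range(3)]

H = hesse_fd(p)
assert max(abs(H[i][j] - H_exact[i][j]) for i in range(3) for j in range(3)) < 1e-6
assert abs(sum(H_exact[i][i] for i in range(3))) < 1e-15   # trace-free = harmonic
```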

Hence, the potential-theoretic situation for the SGG-problem can be formulated briefly as follows: Suppose that satellite data v = (∇ ⊗ ∇)V are known on the orbit Γ; the satellite gravitational gradiometry problem (SGG-problem) then amounts to determining V on and outside the Earth's surface Σ, i.e., on the set Σ^ext ∪ Σ, from (discrete data of) v = (∇ ⊗ ∇)V on the satellite orbit Γ. In conclusion, seen from the potential-theoretic context, SGG is a non-standard problem. The reasons are obvious:

• Tensorial SGG-data (or a scalar manifestation of them such as the second radial derivative on the orbit) do not form the standard equipment of potential theory (such as, e.g., Dirichlet's or Neumann's boundary data). Thus, uniqueness cannot be deduced in the way known (cf. [80]) for boundary data in classical boundary value problems (such as Dirichlet's or Neumann's boundary value problem).



Fig. 4 The (real) SGG-geometric situation as interplay between Earth Σ and orbit Γ

Nevertheless, "upward continuation" via Dirichlet's problem, as the inverse of "downward continuation", plays an important role in the mathematical treatment of SGG.
• SGG-data have their natural limit because of the exponential damping of the frequency components (i.e., the orthogonal coefficients) of the gravitational potential with increasing satellite height.
• "Downward continuation" by SGG as the inverse of "upward continuation" (cf. Fig. 4) leads to an ill-posed problem, since the data are not given on the boundary of the domain of interest, i.e., on the Earth's surface Σ, but at locations along the orbit Γ.
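The exponential damping mentioned above is governed by the upward-continuation factor (R/r)^(n+1); with the illustrative values R = 6371 km and a 250 km orbit height one obtains:

```python
# Damping of the degree-n spectral coefficients of the potential at
# satellite altitude r = R + 250 km (illustrative values): the inverse
# factor quantifies the amplification on downward continuation.
R = 6371.0
r = R + 250.0

def damping(n):
    return (R / r) ** (n + 1)

assert damping(2) > 0.8                      # low degrees barely damped
assert damping(200) < 1e-3                   # high degrees almost wiped out
assert 1.0 / damping(200) > 1000.0           # amplified on downward continuation
assert all(damping(n + 1) < damping(n) for n in range(300))
```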

3.2 Potential Spaces

Next we are interested in characterizing the essential players involved in SGG as members of infinite-dimensional potential spaces. Let us begin with the introduction of some auxiliary material: We understand the Earth's surface Σ ⊂ R³ to be a regular surface, i.e., Σ is a surface with the following properties:

1. Σ divides the Euclidean space R³ into the (open) bounded region Σ^int (inner space) and the (open) unbounded region Σ^ext (outer space) so that Σ^ext = R³ \ (Σ^int ∪ Σ), Σ being the common boundary of Σ^int and Σ^ext, with Σ^int ∩ Σ^ext = ∅,


2. Σ^int contains the origin 0,
3. Σ is a closed and compact surface free of double points,
4. Σ is locally of class C^(2), i.e., Σ is locally C^(2)-smooth (see [36] for more details concerning regular surfaces).

The function spaces C^(2)(Σ^ext), c^(2)(Σ^ext), etc. are defined in the canonical way. We define Pot(Σ^ext) as the scalar space of potentials harmonic in Σ^ext and regular at infinity:

Pot(Σ^ext) = {F ∈ C^(2)(Σ^ext) : ΔF = 0 in Σ^ext, F(x) = O(|x|^(−1)), |x| → ∞}.    (54)

Pot^(0)(Σ^ext ∪ Σ) is the space of continuous functions F : Σ^ext ∪ Σ → R whose restrictions F|_{Σ^ext} are members of Pot(Σ^ext). In brief (but not quite mathematically accurate),

Pot^(0)(Σ^ext ∪ Σ) = Pot(Σ^ext) ∩ C^(0)(Σ^ext ∪ Σ),    (55)

and, in a similar way,

Pot^(k)(Σ^ext ∪ Σ) = Pot(Σ^ext) ∩ C^(k)(Σ^ext ∪ Σ),  0 ≤ k ≤ ∞.    (56)

We introduce the tensorial counterpart pot(Σ^ext) of Pot(Σ^ext) as follows:

pot(Σ^ext) = {f ∈ c^(1)(Σ^ext) : ∇ · f = 0, ∇ ∧ f = 0 in Σ^ext, |f(x)| = O(|x|^(−3)), |x| → ∞}.    (57)

Similarly, we let

pot^(0)(Σ^ext ∪ Σ) = pot(Σ^ext) ∩ c^(0)(Σ^ext ∪ Σ)    (58)

and

pot^(k)(Σ^ext ∪ Σ) = pot(Σ^ext) ∩ c^(k)(Σ^ext ∪ Σ),  0 ≤ k ≤ ∞.    (59)

3.3 Spherical Harmonics

Since tensor spherical harmonics belong to the standard equipment of neither geodesists nor mathematicians, a short introduction should be given. A more detailed study of our approach can be found in [52]; similar SGG-relevant approaches are presented in [112, 114, 115].


Spherical Nomenclature For all x ∈ R³, x = (x₁, x₂, x₃)ᵀ, different from the origin 0, we have x = rξ, r = |x| = (x₁² + x₂² + x₃²)^(1/2), where ξ = (ξ₁, ξ₂, ξ₃)ᵀ is the uniquely determined directional unit vector of x ∈ R³. As usual, the unit sphere in R³ will be denoted by S² = S²₁, while S²_α designates the 2-sphere around the origin with radius α. The unit 3-ball around the origin is denoted by B³ = B³₁, and B³_α is the ball around the origin with radius α. If the vectors ε¹, ε², ε³ form the canonical orthonormal basis in R³, we may represent ξ ∈ S² in polar coordinates by

ξ = tε³ + (1 − t²)^(1/2) (cos φ ε¹ + sin φ ε²),  −1 ≤ t ≤ 1, 0 ≤ φ < 2π, t = cos θ.    (60)

The inner, vector, and dyadic (tensor) products of two vectors x, y ∈ R³ are denoted by x · y, x ∧ y, and x ⊗ y, respectively. As usual, a second order tensor f ∈ R³ ⊗ R³ is understood to be a linear mapping that assigns to each x ∈ R³ a vector y ∈ R³. The (Cartesian) components F_ij of f are defined by F_ij = εⁱ · (fεʲ) = (εⁱ)ᵀ(fεʲ), so that y = fx is equivalent to y · εⁱ = Σ_{j=1}^3 F_ij (x · εʲ). We write fᵀ for the transpose of f. The dyadic (tensor) product x ⊗ y of two elements x, y ∈ R³ is the tensor that assigns to each u ∈ R³ the vector (y · u)x; more explicitly, (x ⊗ y)u = (y · u)x for every u ∈ R³. The inner product f · g of two second order tensors f, g ∈ R³ ⊗ R³ is defined by f · g = Σ_{i,j=1}^3 F_ij G_ij, and |f| = (f · f)^(1/2) is called the norm of f. Note that, throughout this paper, scalar-valued (resp. vector-valued, tensor-valued) functions are denoted by capital (resp. small, small bold) letters. A function F : S² → R (resp. f : S² → R³, f : S² → R³ ⊗ R³) possessing k continuous derivatives on the unit sphere S² is said to be of class C^(k)(S²) (resp. c^(k)(S²), c^(k)(S²)). C^(0)(S²) (resp. c^(0)(S²), c^(0)(S²)) is the class of real continuous scalar-valued (resp. vector-valued, tensor-valued) functions on the unit sphere S².

Spherical Differential Operators In terms of the polar coordinates (60) the gradient ∇ in R³ allows the representation ∇_x = ξ ∂/∂r + (1/r)∇*_ξ, where ∇* is the surface gradient of the unit sphere S² ⊂ R³. The operator Δ* = ∇* · ∇* is called the Beltrami operator of the unit sphere S²; obviously, it can be understood as the angular part of the Laplace operator. For F ∈ C^(1)(S²) we introduce the surface curl gradient L*_ξ by L*_ξ F(ξ) = ξ ∧ ∇*_ξ F(ξ), ξ ∈ S², while ∇*_ξ · f(ξ) and L*_ξ · f(ξ), ξ ∈ S², denote the surface divergence and the surface curl of the vector field f at ξ ∈ S², respectively.
Scalar Spherical Harmonics Scalar spherical harmonics are defined as restrictions of homogeneous harmonic polynomials to the unit sphere S². In all geosciences interested in global modeling, spherical harmonics are the functions usually taken to represent scalar fields spectrally on a spherical surface such as the Earth's (mean) sphere.


Let H_n be a homogeneous harmonic polynomial of degree n in R³, n ∈ N₀, i.e., H_n ∈ Harm_n(R³). The restriction Y_n = H_n|_{S²} is called a (scalar) spherical harmonic of degree n. The space

{Y_n = H_n|_{S²} : H_n ∈ Harm_n(R³)}    (61)

of all (scalar) spherical harmonics of degree n is denoted by Harm_n(S²); it is known to be of dimension 2n + 1. Spherical harmonics of different degrees are orthogonal in the L²(S²) sense, that is,

⟨Y_n, Y_ñ⟩_{L²(S²)} = ∫_{S²} Y_n(ξ) Y_ñ(ξ) dS(ξ) = 0,  n ≠ ñ.    (62)

Throughout this text a capital letter Y followed by one or two indices always denotes a spherical harmonic of the degree given by the first index and order given by the second index. Two indices mean that the function, for example Y_{n,m}, is a member of an L²(S²)-orthonormal system of functions {Y_{n,1}, ..., Y_{n,2n+1}}_{n∈N₀}. By use of the scalar spherical harmonics, every function F ∈ L²(S²) can be written as an orthogonal (Fourier) series

F = Σ_{n=0}^∞ Σ_{m=1}^{2n+1} F^∧(n, m) Y_{n,m}    (63)

(in the L²(S²) sense) with Fourier coefficients

F^∧(n, m) = ⟨F, Y_{n,m}⟩_{L²(S²)} = ∫_{S²} F(ξ) Y_{n,m}(ξ) dS(ξ).    (64)

The system {Y_{n,m}}_{n=0,1,...; m=1,...,2n+1} is closed in C^(0)(S²) with respect to the norm ‖·‖_{C^(0)(S²)}, i.e., for any number ε > 0 and any function F ∈ C^(0)(S²), there exists a linear combination

F_N = Σ_{n=0}^N Σ_{m=1}^{2n+1} d_{n,m} Y_{n,m}    (65)

such that ‖F − F_N‖_{C^(0)(S²)} ≤ ε. The system {Y_{n,m}}_{n=0,1,...; m=1,...,2n+1} is, furthermore, complete in L²(S²) with respect to ⟨·,·⟩_{L²(S²)}, i.e., F ∈ L²(S²) with F^∧(n, m) = 0 for all n = 0, 1, ...; m = 1, ..., 2n+1 implies F = 0 (see, e.g., [58, 98]). We conclude this short introduction to the theory of scalar spherical harmonics with the so-called addition theorem, which formulates the correlation between the spherical harmonics and the corresponding Legendre polynomials.
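For the lowest nontrivial degree the addition theorem stated next can be checked directly: with the real orthonormal choice Y_{1,m}(ξ) = (3/(4π))^(1/2) ξ_m (one admissible basis, assumed here for illustration) and P₁(t) = t, both sides agree for arbitrary unit vectors:

```python
import math

# Degree-1 addition theorem: sum_m Y_{1,m}(xi) Y_{1,m}(eta)
#                          = (3/(4 pi)) * P_1(xi . eta), with P_1(t) = t.
c = math.sqrt(3.0 / (4.0 * math.pi))

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

xi = normalize((1.0, 2.0, 2.0))
eta = normalize((-1.0, 0.5, 3.0))
lhs = sum((c * xi[m]) * (c * eta[m]) for m in range(3))
rhs = 3.0 / (4.0 * math.pi) * sum(a * b for a, b in zip(xi, eta))
assert abs(lhs - rhs) < 1e-14
```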


Addition Theorem for Scalar Spherical Harmonics. Let {Y_{n,m}}_{m=1,...,2n+1} be an L²(S²)-orthonormal system in Harm_n(S²). Then, for any pair (ξ, η) ∈ S² × S², the addition theorem reads

Σ_{m=1}^{2n+1} Y_{n,m}(ξ) Y_{n,m}(η) = ((2n + 1)/(4π)) P_n(ξ · η),    (66)

where P_n : [−1, 1] → [−1, 1] is the Legendre polynomial of degree n.

Tensor Spherical Harmonics We recapitulate the introduction of tensor spherical harmonics as proposed by [52]: By l²(S²) we denote the Hilbert space of square-integrable tensor fields f : S² → R³ ⊗ R³ with the inner product

⟨f, g⟩_{l²(S²)} = ∫_{S²} f(ξ) · g(ξ) dS(ξ),  f, g ∈ l²(S²),    (67)

and associated norm ‖·‖_{l²(S²)} (note that the space l²(S²) is the completion of c^(0)(S²) with respect to the norm ‖·‖_{l²(S²)}). We now introduce the operators o^(i,k) : C^(∞)(S²) → c^(∞)(S²), i, k = 1, 2, 3, which transform scalar functions into tensor fields:

o^(1,1) F(ξ) := ξ ⊗ ξ F(ξ),    (68)
o^(1,2) F(ξ) := ξ ⊗ ∇*_ξ F(ξ),    (69)
o^(1,3) F(ξ) := ξ ⊗ L*_ξ F(ξ),    (70)
o^(2,1) F(ξ) := ∇*_ξ F(ξ) ⊗ ξ,    (71)
o^(2,2) F(ξ) := i_tan(ξ) F(ξ),    (72)
o^(2,3) F(ξ) := (∇*_ξ ⊗ ∇*_ξ − L*_ξ ⊗ L*_ξ) F(ξ) + 2∇*_ξ F(ξ) ⊗ ξ,    (73)
o^(3,1) F(ξ) := L*_ξ F(ξ) ⊗ ξ,    (74)
o^(3,2) F(ξ) := (∇*_ξ ⊗ L*_ξ + L*_ξ ⊗ ∇*_ξ) F(ξ) + 2L*_ξ F(ξ) ⊗ ξ,    (75)
o^(3,3) F(ξ) := j_tan(ξ) F(ξ),    (76)

F ∈ C^(2)(S²), ξ ∈ S². The tensors i_tan = i − ξ ⊗ ξ and j_tan = ξ ∧ i are the surface identity tensor and the surface rotation tensor, respectively. The adjoint operators O^(i,k) to o^(i,k) satisfying

⟨o^(i,k) F, f⟩_{l²(S²)} = ⟨F, O^(i,k) f⟩_{L²(S²)}    (77)


for F ∈ C^(2)(S²) and f ∈ c^(2)(S²) are given by

O^(1,1) f(ξ) := ξᵀ f(ξ) ξ,    (78)
O^(1,2) f(ξ) := −∇*_ξ · p_tan(ξᵀ f(ξ)),    (79)
O^(1,3) f(ξ) := −L*_ξ · p_tan(ξᵀ f(ξ)),    (80)
O^(2,1) f(ξ) := −∇*_ξ · p_tan(f(ξ)ξ),    (81)
O^(2,2) f(ξ) := i_tan(ξ) · f(ξ),    (82)
O^(2,3) f(ξ) := ∇*_ξ · p_tan(∇*_ξ · p_tan,* f(ξ)) − L*_ξ · p_tan(L*_ξ · p_tan,* f(ξ)) − 2∇*_ξ · p_tan(f(ξ)ξ),    (83)
O^(3,1) f(ξ) := −L*_ξ · p_tan(f(ξ)ξ),    (84)
O^(3,2) f(ξ) := L*_ξ · p_tan(∇*_ξ · p_tan f(ξ)) + ∇*_ξ · p_tan(L*_ξ · p_tan f(ξ)) − 2L*_ξ · p_tan(f(ξ)ξ),    (85)
O^(3,3) f(ξ) := j_tan(ξ) · f(ξ),    (86)

where

p_tan f(ξ) := f(ξ) − (ξ · f(ξ))ξ,    (87)
p_tan,* f(ξ) := f(ξ) − ξ ⊗ ((f(ξ))ᵀ ξ),    (88)

ξ ∈ S². With the help of the operators o^(i,k) we are able to define a set of tensor spherical harmonics {y^(i,k)_{n,m}}_{i,k=1,2,3; n=0_ik,...; m=1,...,2n+1} by

y^(i,k)_{n,m} := (μ_n^(i,k))^(−1/2) o^(i,k) Y_{n,m},    (89)

where the normalization constants μ_n^(i,k) are given by

μ_n^(i,k) := 1 for (i, k) = (1, 1);
             2 for (i, k) ∈ {(2, 2), (3, 3)};
             n(n + 1) for (i, k) ∈ {(1, 2), (1, 3), (2, 1), (3, 1)};
             2n(n + 1)(n(n + 1) − 2) for (i, k) ∈ {(2, 3), (3, 2)}.    (90)

1213

For simplicity, we use the abbreviation ⎧ ⎪ ⎨ 0, (i, k) ∈ {(1, 1), (2, 2), (3, 3)}, 0ik := 1, (i, k) ∈ {(1, 2), (1, 3), (2, 1), (3, 1)}, ⎪ ⎩ 2, (i, k) ∈ {(2, 3), (3, 2)}.

(91)

2 By harm(i,k) n (S ) we denote the space of all tensor spherical harmonics of degree n and kind (i, k). (i,k)

Addition Theorem for Tensor Spherical Harmonics. Let {yn,m }m=1,...,2n+1 be an 2 harm(i,k) n (S ).

l2 (S2 )-orthonormal

basis of tensorial addition theorem reads 2n+1

Then, for any pair (ξ, η) ∈ S2 × S2 , the

(p,q)

y(i,k) n,m (ξ ) ⊗ yn,m (η) =

m=1

2n + 1 (i,k,p,q) Pn (ξ, η), 4π

(92)

(i,k,p,q)

i, k, p, q ∈ {1, 2, 3}, where Pn : S2 × S2 → R3 ⊗ R3 ⊗ R3 ⊗ R3 denote the Legendre tensors of degree n defined by (i,k,p,q)

Pn

#−1/2 " # " (p,q) −1/2 (i,k) (p,q) μn := μ(i,k) oξ oη Pn (ξ · η), n

ξ, η ∈ S2 .

(93)

Note that, for sufficiently smooth tensor fields f : S2 → R3 ⊗ R3 of the form f(ξ ) =

3

Fi,k (ξ )εi ⊗ εk ,

x ∈ S2 ,

(94)

i,k=1

we set (p,q)



f(ξ ) =

3 "

(p,q)



# Fi,k (ξ ) ⊗ εi ⊗ εk .

(95)

i,k=1

Explicit representations of the Legendre tensors can be found in [52]. By harmn (S2 ) we denote the space of all tensor spherical harmonics of degree n. The system {y(i,k) n,m } of tensor spherical harmonics was introduced by concentrating on the fact that the decomposition into normal and tangential tensor fields is fulfilled (cf. [105]). However, one disadvantage of this set of tensor spherical harmonics is that these functions are not eigenfunctions of the (scalar) Beltrami operator. Nonetheless, this property enables us to define so-called outer harmonics in such a way that they fulfill the Laplace equation in the outer space. This is the reason why we introduce another set of operators o˜ (i,k) : C(∞) (S2 ) → c(∞) (S2 ), i, k = 1, 2, 3, based on the operators o(i,k) by

1214

W. Freeden et al.



⎞ ⎛ ⎞ o˜ (1,1) Yn Yn ⎜ o˜ (1,2) Y ⎟ ⎜Y ⎟ ⎜ ⎜ n⎟ n⎟ ⎜ (2,1) ⎟ ⎜ ⎟ Y ⎟ = aD ⎜ Yn ⎟ , ⎜ o˜ ⎜ (2,2) n ⎟ ⎜ ⎟ ⎝ o˜ ⎝ Yn ⎠ Yn ⎠ ˜o(3,3) Yn Yn

⎛ and

⎞ ⎛ ⎞ o˜ (1,3) Yn Yn ⎜ o˜ (2,3) Yn ⎟ ⎜ Yn ⎟ ⎜ ⎟ ⎜ ⎟ ⎝ o˜ (3,1) Yn ⎠ = bD ⎝ Yn ⎠ , o˜ (3,2) Yn Yn

(96)

where the matrix operators aD and bD are defined by (see [52]) ⎛ o(1,1) (D + 1)(D + 2) −o(1,2) (D + 2) −o(2,1) (D + 2) − 1 o(2,2) (D + 2)(D + 1) ⎜ ⎜ aD := ⎜ ⎝

1 o(2,3) 2 o(1,2) D −o(2,1) (D − 1) − 12 o(2,2) D(D − 1) − 12 o(2,3) −o(1,2) (D + 1) o(2,1) (D + 2) 12 o(2,2) (D + 2)(D + 1) − 12 o(2,3) 1 o(2,3) o(1,2) (D − 1) o(2,1) (D − 1) − 12 o(2,2) D(D − 1) 2 0 o(2,1) − 12 o(2,2) D(D + 1) − 12 o(2,3) 2

o(1,1) D 2 o(1,1) (D + 1)2 o(1,1) D(D − 1) 0

⎞ ⎟ ⎟ ⎟ ⎠ (97)

and ⎛ ⎜ ⎜ bD := ⎜ ⎜ ⎝

o(1,3) (D + 1)

o(3,1)

o(1,3) D

−o(3,1)

0

o(3,1) (D + 2)

0

o(3,1) (D − 1)

− 12 o(3,2)

− 12 o(3,3) D(D + 1)



⎟ 1 (3,2) 1 (3,3) D(D + 1) ⎟ 2o 2o ⎟, ⎟ − 12 o(3,2) 12 o(3,3) (D + 2)(D + 1) ⎠ 1 (3,2) − 12 o(3,3) D(D − 1) 2o

and D is the (pseudodifferential) operator D = (−Δ + 14 )1/2 − characterized by

1 2

DYn := D ∧ (n)Yn = nYn

(98)

of order 1

(99)

for all Yn ∈ Harmn (S2 ). The adjoint operators O˜ (i,k) : c(∞) (S2 ) → C(∞) (S2 ), i, k = 1, 2, 3, to the operators o˜ (i,k) satisfying the equation "˜o(i,k) G, f#l2 (S2 ) = "G, O˜ (i,k) f#L2 (S2 ) ,

(100)

f ∈ c(∞) (S2 ), G ∈ C(∞) (S2 ), are obviously given by ⎞ ⎛ ⎞ Yn O˜ (1,1) Yn ⎜Y ⎟ ⎜ O˜ (1,2) Y ⎟ ⎜ n⎟ ⎜ n⎟ ⎜ ⎟ ⎜ ˜ (2,1) ⎟ Yn ⎟ := aD ⎜ Yn ⎟ , ⎜O ⎜ ⎟ ⎜ ˜ (2,2) ⎟ ⎝ Yn ⎠ ⎝O Yn ⎠ Yn O˜ (3,3) Yn ⎛

and

⎛ ˜ (1,3) ⎞ ⎛ ⎞ O Yn Yn ⎜ O˜ (2,3) Yn ⎟ ⎜ Yn ⎟ ⎜ ⎟ ⎜ ⎟ ⎝ O˜ (3,1) Yn ⎠ := bD ⎝ Yn ⎠ . Yn O˜ (3,2) Yn

(101)

After these preliminaries we are now in a position to introduce the tensor spherical harmonics

19 Satellite Gravitational Gradiometry: Methodological Foundation and . . .

#−1/2 " y˜ (i,k) o˜ (i,k) Yn,m , ˜ (i,k) n,m := μ n

1215

(102)

n = 0˜ ik , . . . ; m = 1, . . . , 2n + 1, with ⎧ ⎪ ⎨ 0, (i, k) ∈ {(1, 1), (2, 1), (3, 1)}, 0˜ ik := 1, (i, k) ∈ {(1, 2), (1, 3), (2, 3), (3, 3)}, ⎪ ⎩ 2, (i, k) ∈ {(2, 2), (3, 2)},

(103)

and normalization constants μ˜ n(1,1) := (n + 2)(n + 1)(2n + 3)(2n + 1),

(104)

μ˜ n(1,2) := 3n4 ,

(105)

μ˜ n(1,3) := n(n + 1)2 (2n + 1),

(106)

μ˜ n(2,1) := (n + 1)2 (2n + 3)(2n + 1),

(107)

:= n(n − 1)(2n + 1)(2n − 1),

(108)

μ˜ n(2,2)

μ˜ n(2,3) := n2 (n + 1)2 ,

(109)

μ˜ n(3,1) := n2 (n + 1)(2n + 1),

(110)

μ˜ n(3,2) := n(n + 1)2 (2n + 1),

(111)

:= n (n − 1)(2n + 1).

(112)

μ˜ n(3,3)

2

In contrary to the system (89), each member of the system {˜y(i,k) n,m } is an eigenfunction of the Beltrami operator. More explicitly, we have the following result. Assume that {Yn,m }n=0,1,...; m=1,...,2n+1 is an L2 (S2 )-orthonormal set of scalar spherical harmonics. Then, the set & ' y˜ (i,k) n,m

i,k=1,2,3; n=0˜ ik ,...; m=1,...,2n+1

,

(113)

as defined by (102), forms an l2 (S2 )-orthonormal set of tensor spherical harmonics which is closed in c(0) (S2 ) and l2 (S2 ) with respect to  · c(S2 ) and  · l2 (S2 ) , respectively, and complete in l2 (S2 ) with respect to (·, ·)l2 (S2 ) . Furthermore, we are able to verify that (1,1) (1,1) = −(n + 2)(n + 3)˜yn,m , Δ∗ξ y˜ n,m

(114)

(1,2) (1,2) = −n(n + 1)˜yn,m , Δ∗ξ y˜ n,m

(115)

(1,3) (1,3) = −(n + 1)(n + 2)˜yn,m , Δ∗ξ y˜ n,m

(116)

1216

W. Freeden et al. (2,1) (2,1) Δ∗ξ y˜ n,m = −n(n + 1)˜yn,m ,

(117)

(2,2) (2,2) = −(n − 1)(n − 2)˜yn,m , Δ∗ξ y˜ n,m

(118)

(2,3) (2,3) = −n(n − 1)˜yn,m , Δ∗ξ y˜ n,m

(119)

(3,1) (3,1) = −(n + 1)(n + 2)˜yn,m , Δ∗ξ y˜ n,m

(120)

(3,2) (3,2) = −n(n − 1)˜yn,m , Δ∗ξ y˜ n,m

(121)

(3,3) (3,3) = −n(n + 1)˜yn,m , Δ∗ξ y˜ n,m

(122)

where the application of the Beltrami operator is understood component-bycomponent. Because of the completeness of the tensor spherical harmonics (see [37, 38]) every tensor field f ∈ l2 (S2 ) can be written as an orthogonal (Fourier) series f=

∞ 2n+1 3

f

(i,k)∧l2 (S2 )

(n, m) y˜ (i,k) n,m

(123)

i,k=1 n=0˜ ik m=1

(in  · l2 (S2 ) -sense), where the Fourier coefficients are given by f

3.4

(i,k)∧l2 (S2 )

 (n, m) = "f, y˜ (i,k) n,m #l2 (S2 ) =

S2

f(ξ ) · y˜ (i,k) n,m (ξ ) dS(ξ ).

(124)

Outer Harmonics

We begin with the well-known scalar theory (see, e.g., [35]). Scalar Outer Harmonics The scalar outer harmonics are defined by R Hn,m (x) :=

1 R



R |x|



n+1 Yn,m

 x , |x|

x ∈ S2;ext R ,

(125)

denotes the outer space of n = 0, 1, . . . , m = 1, . . . , 2n + 1, where, as usual, S2;ext R the sphere S2R in R3 with radius R around the origin. The following properties are valid: R is of class C(∞) (S2;ext ), • Hn,m R R is harmonic in S2;ext : Δ H R (x) = 0, x ∈ S2;ext , • Hn,m x n,m R R 1 R | • Hn,m S2R = R Yn,m ,  R R ,HR # R • "Hn,m l,s L2 (S2R ) = S2R Hn,m (x)Hl,s (x) dS(x) = δn,l δm,s ,   R (x)| = O |x|−1 , |x| → ∞. • |Hn,m

19 Satellite Gravitational Gradiometry: Methodological Foundation and . . .

1217

Tensor Outer Harmonics: We introduce an associated class of tensor outer harmonics (cf. [52, 58, 105]) based on the definition of the tensor spherical harmonics in (102):

h^{R;(1,1)}_{n,m}(x) := (1/R) (R/|x|)^{n+3} ỹ^{(1,1)}_{n,m}(x/|x|),    (126)
h^{R;(1,2)}_{n,m}(x) := (1/R) (R/|x|)^{n+1} ỹ^{(1,2)}_{n,m}(x/|x|),    (127)
h^{R;(1,3)}_{n,m}(x) := (1/R) (R/|x|)^{n+2} ỹ^{(1,3)}_{n,m}(x/|x|),    (128)
h^{R;(2,1)}_{n,m}(x) := (1/R) (R/|x|)^{n+1} ỹ^{(2,1)}_{n,m}(x/|x|),    (129)
h^{R;(2,2)}_{n,m}(x) := (1/R) (R/|x|)^{n−1} ỹ^{(2,2)}_{n,m}(x/|x|),    (130)
h^{R;(2,3)}_{n,m}(x) := (1/R) (R/|x|)^{n} ỹ^{(2,3)}_{n,m}(x/|x|),    (131)
h^{R;(3,1)}_{n,m}(x) := (1/R) (R/|x|)^{n+2} ỹ^{(3,1)}_{n,m}(x/|x|),    (132)
h^{R;(3,2)}_{n,m}(x) := (1/R) (R/|x|)^{n} ỹ^{(3,2)}_{n,m}(x/|x|),    (133)
h^{R;(3,3)}_{n,m}(x) := (1/R) (R/|x|)^{n+1} ỹ^{(3,3)}_{n,m}(x/|x|),    (134)

x ∈ S^{2;ext}_R, n = 0_{ik}, . . .; m = 1, . . ., 2n + 1. It is not difficult to show that the following properties are satisfied:

• h^{R;(i,k)}_{n,m} is of class c^{(∞)}(S^{2;ext}_R),
• Δ_x h^{R;(i,k)}_{n,m}(x) = 0, x ∈ S^{2;ext}_R, i.e., the component functions of h^{R;(i,k)}_{n,m} fulfill the Laplace equation,
• h^{R;(i,k)}_{n,m}|_{S^2_R} = (1/R) ỹ^{(i,k)}_{n,m},
• ⟨h^{R;(i,k)}_{n,m}, h^{R;(p,q)}_{l,s}⟩_{l^2(S^2_R)} = ∫_{S^2_R} h^{R;(i,k)}_{n,m}(x) · h^{R;(p,q)}_{l,s}(x) dS(x) = δ_{i,p} δ_{k,q} δ_{n,l} δ_{m,s}, where l^2(S^2_R) is the space of square-integrable tensor fields on S^2_R,
• |h^{R;(i,k)}_{n,m}(x)| = O(|x|^{−1}), |x| → ∞.

It must be emphasized that the spherically reflected formulation of the tensorial SGG-problem exclusively uses the tensor outer harmonics of kind (1,1) specified by (126).

W. Freeden et al.

3.5 Runge-Walsh Concept

In the theory of harmonic functions related to regular surfaces, a result first motivated by [117] in one-dimensional complex analysis and later generalized by [134] and [132] to potential theory in three-dimensional Euclidean analysis is of basic interest. For geodetically relevant applications, the reader is referred to, e.g., [3, 12, 30, 67, 78, 82, 93, 94, 96, 97, 118]. In our approach (cf. Fig. 5) we use the Runge concept in the formulation of [30]. It tells us that

Pot^{(0)}(Σ^{ext}) = span{H^R_{n,m}|_{Σ^{ext}} : n = 0, 1, . . .; m = 1, . . ., 2n + 1}^{‖·‖_{C^{(0)}(Σ^{ext})}},    (135)

where Σ is a regular surface (e.g., sphere, ellipsoid, telluroid, geoid, or real Earth's surface) and S^2_R is a sphere inside Σ^{int} such that the "Runge condition" R < σ = inf_{x∈Σ} |x| is valid (note that S^2_R in (135) may be replaced by any regular surface Ξ (cf. [3]) located totally in Σ^{int} with dist(Σ, Ξ) > 0, but we restrict ourselves to the spherical (Bjerhammar) case). The Runge–Walsh approximation property (135) justifies the approximation of the Earth's gravitational potential on and outside the regular surface by a linear combination of scalar outer harmonics, i.e., by harmonic "trial functions of polynomial nature" showing a "harmonicity domain" S^{2;ext}_R ⊃ Σ^{ext}. It should be remarked that the same property holds true, for example, for outer ellipsoidal harmonics outside a regular surface Ξ located totally in Σ^{int} with dist(Σ, Ξ) > 0. However, once again, for reasons of numerical economy and efficiency we restrict ourselves to outer spherical harmonics and Runge (Bjerhammar) spheres. Moreover, the Runge property can be verified for so-called fundamental systems of monopoles inside Σ^{int} (cf. [30, 42]), certain kernel representations, as well as harmonic splines and wavelets (cf. [35, 36, 42]).

Fig. 5 The spherically reflected Runge (Bjerhammar) situation: the Runge sphere S^2_R inside the regular surface Σ, with satellite orbit Γ. (For more details see [30] and [3])

From [105] we know that

pot^{(0)}(Σ^{ext}) = span{h^{R;(1,1)}_{n,m}|_{Σ^{ext}} : n = 0, 1, . . .; m = 1, . . ., 2n + 1}^{‖·‖_{c^{(0)}(Σ^{ext})}}.    (136)

Again, it should be remarked that the same property holds true for other trial systems, such as tensor outer ellipsoidal harmonics outside an internal ellipsoid, fundamental systems of monopole Hesse tensors inside Σ^{int}, certain tensor kernel representations, as well as tensor harmonic splines and wavelets.

From the scalar Runge concept (135) it can be deduced additionally (see, e.g., [30, 36, 42]) that, for an arbitrarily small number ε > 0 and any given potential Ṽ ∈ Pot^{(0)}(Σ^{ext}), there exists a potential V ∈ Pot(S^{2;ext}_R) with R < σ = inf_{x∈Σ} |x| such that

sup_{x∈Σ^{ext}} |Ṽ(x) − V(x)| ≤ ε    (137)

and

sup_{x∈Γ} |(∇ ⊗ ∇)Ṽ(x) − (∇ ⊗ ∇)V(x)| ≤ C ε    (138)

for some positive constant C (depending on the geometries of Σ and Γ), where we abbreviate ṽ(x) = (∇ ⊗ ∇)Ṽ(x) and v(x) = (∇ ⊗ ∇)V(x).

The drawback of evaluating the gravitational potential Ṽ (for example, the Earth's gravitational potential) by a potential V in terms of outer harmonics is that these basis functions are globally supported. This is the reason why they do not show any space localization but ideal frequency (momentum) localization (for a more detailed description see [34, 52, 63]). This property makes outer harmonics laborious to use for high-resolution modeling at local scale. As a matter of fact, the uncertainty principle (see [63] and the references therein) leads us to the conclusion that outer harmonics are well suited to resolve low-frequency problems at global scale, i.e., to represent trend approximations. As a promising compromise, "sum conglomerates" of outer harmonics, i.e., so-called (outer) harmonic kernel functions, can be handled (see [63]), offering a limited but appropriately balanced range of frequency as well as space localization. Even better, we can construct families of kernels which control the increase of space localization at the cost of the decrease of frequency localization by specifying a scale

parameter. This leads to the multiscale philosophy by wavelets in the space as well as the frequency context, as discussed later on.

Summarizing our considerations, we are led to the following conclusion: A viable way to model SGG may be based on the Runge property that there exists a potential V outside a Runge (Bjerhammar) sphere S^2_R inside the Earth within ε-accuracy (ε > 0, arbitrarily small) of the Earth's gravitational potential Ṽ such that

sup_{x∈Γ} |(∇ ⊗ ∇)Ṽ(x) − (∇ ⊗ ∇)V(x)| = sup_{x∈Γ} |ṽ(x) − v(x)| ≤ C ε.    (139)

As a consequence of our excursion into Runge theory, the relation between the tensorial measurements ṽ, i.e., the gradiometer orbital data of Ṽ on Γ, and the auxiliary potential V outside the sphere S^2_R may be explained for points x on the satellite orbit Γ as follows (see (5), (2)):

• (in the space domain) by a linear Fredholm integral equation of the first kind

(1/(4πR)) ∫_{S^2_R} V(y) ∇_x ⊗ ∇_x (|x|^2 − R^2)/|x − y|^3 dS(y) = (∇ ⊗ ∇)V(x) = v(x) ≈ ṽ(x),    (140)

where k_R(x, y) = (1/(4πR)) ∇_x ⊗ ∇_x (|x|^2 − R^2)/|x − y|^3,

• (in the frequency domain) by a pseudodifferential equation of the form

∑_{n=0}^{∞} ∑_{m=1}^{2n+1} V^{∧_{L^2(S^2_R)}}(n, m) √(μ̃^{(1,1)}_n) h^{R;(1,1)}_{n,m}(x) = (∇ ⊗ ∇)V(x) = v(x) ≈ ṽ(x)    (141)

(note that "≈" means "approximately equal", so that the input data ṽ(x) may be replaced by v(x)). In other words, SGG in the frequency as well as the space domain is appropriately modeled in the spherically reflected nomenclature of V, thereby taking as input data the tensor measurements ṽ(x) of the (actual) Earth's potential Ṽ on the orbit Γ.

SGG Uniqueness: Some words should be said about the uniqueness of spherically reflected SGG: Keeping in mind that any solution of the SGG-problem in the tensor spherical framework can be expressed as a series of outer harmonics due to the Runge–Walsh approximation property, and observing the closure and completeness of the spherical harmonics in the space of square-integrable functions on spheres, we are able to conclude (cf. [123, 124], and [125]) that the SGG-problem is uniquely solvable (up to some low order spherical harmonics) involving the O^{(1,1)}, O^{(1,2)}, O^{(2,1)}, O^{(2,2)}, and O^{(2,3)} components.

More concretely, we are able to formulate the following results: Let V be of class Pot^{(0)}(S^{2;ext}_γ), with γ < inf_{x∈Γ} |x| (cf. Fig. 6). Then the following statements may be listed:

1. O^{(i,k)} ∇ ⊗ ∇V|_{S^2_γ} = 0 if (i, k) ∈ {(1,3), (3,1), (3,2), (3,3)},
2. O^{(i,k)} ∇ ⊗ ∇V|_{S^2_γ} = 0 for (i, k) ∈ {(1,1), (2,2)} if and only if V|_{S^2_γ} = 0,
3. O^{(i,k)} ∇ ⊗ ∇V|_{S^2_γ} = 0 for (i, k) ∈ {(1,2), (2,1)} if and only if V|_{S^2_γ} is constant,
4. O^{(2,3)} ∇ ⊗ ∇V|_{S^2_γ} = 0 if and only if V|_{S^2_γ} is a linear combination of spherical harmonics of degree 0 and 1.

The SGG-uniqueness list gives detailed information on which tensor components of the Hesse tensor are suitable to guarantee the uniqueness of the SGG-problem. In fact, for a potential of class Pot^{(0)}(S^{2;ext}_γ) with vanishing spherical harmonic moments of degree 0 and 1, such as the Earth's disturbing potential (see, e.g., [75, 78, 97]), uniqueness is assured for all cases listed above. Fig. 7 gives graphical illustrations of the disturbing potential and its second order radial derivative at the height of the Earth's surface (0 km) and at the (orbital) height of 250 km. In this respect, it should be noted that the ESA satellite GOCE had been injected into its orbit at some additional height so as to perform the complex commissioning activities. In fact, the orbit was then left to decay naturally due to atmospheric drag from the initial height of 278 km to the desired height of 254 km.

Fig. 6 The geometric situation of satellite gravitational gradiometry as discussed in our frequency framework (satellite orbit Γ, spheres S^2_γ and S^2_R, Earth's surface Σ)

Fig. 7 Disturbing potential data generated from the EGM2008 model [107], evaluated up to degree and order 720. From top to bottom: heights 250 km and 0 km. From left to right: disturbing potential in m^2/s^2, negative first radial derivative in 10^{−6} m/s^2, and second order radial derivative in 10^{−12} s^{−2} (taken from [63])

4 SGG in Frequency-Based Framework

The considerations of this chapter use some pre-work to be found in [31, 35, 46, 53, 58]. Our goal is to formulate the SGG-problem in terms of a pseudodifferential operator equation involving suitable Sobolev reference spaces.

4.1 Sobolev Spaces

The scalar case merely serves as a preparation for the tensorial context, which is studied subsequently.

Scalar Case: We let A be the linear space consisting of all sequences {A_n}_{n∈N_0} of real numbers A_n ≠ 0, n ∈ N_0:

A := {{A_n} : A_n ∈ R, A_n ≠ 0, n ∈ N_0}.    (142)

We consider the set E = E({A_n}; S^{2;ext}_R) given by

E := {F ∈ Pot^{(∞)}(S^{2;ext}_R) : ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} A_n^2 [F^{∧_{L^2(S^2_R)}}(n, m)]^2 < ∞},    (143)


where

F^{∧_{L^2(S^2_R)}}(n, m) = ∫_{S^2_R} F(y) H^R_{n,m}(y) dS(y).    (144)

Note that we assume that R is chosen in accordance with the Runge concept (cf. Fig. 6). On E we introduce an inner product ⟨·,·⟩_{H({A_n};S^{2;ext}_R)} by

⟨F, G⟩_{H({A_n};S^{2;ext}_R)} = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} A_n^2 F^{∧_{L^2(S^2_R)}}(n, m) G^{∧_{L^2(S^2_R)}}(n, m),    F, G ∈ E.    (145)

The associated norm is given by

‖F‖_{H({A_n};S^{2;ext}_R)} = ⟨F, F⟩^{1/2}_{H({A_n};S^{2;ext}_R)}.    (146)

The (scalar) Sobolev space H(S^{2;ext}_R) (= H({A_n}; S^{2;ext}_R)) is the completion of E under the norm ‖·‖_{H({A_n};S^{2;ext}_R)}:

H({A_n}; S^{2;ext}_R) := closure of E with respect to ‖·‖_{H({A_n};S^{2;ext}_R)}.    (147)

H({A_n}; S^{2;ext}_R) equipped with the inner product ⟨·,·⟩_{H({A_n};S^{2;ext}_R)} is a separable Hilbert space. The system {H^{*{A_n}}_{n,m}(R; ·)} given by

H^{*{A_n}}_{n,m}(R; x) := A_n^{−1} H^R_{n,m}(x),    x ∈ S^{2;ext}_R,    (148)

is a Hilbert basis (we simply write H*_{n,m}(R; ·) instead of H^{*{A_n}}_{n,m}(R; ·) if no confusion is likely to arise). Any function F ∈ H({A_n}; S^{2;ext}_R) can be expanded as a Fourier series in terms of the basis H*_{n,m}(R; ·):

F = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} F^{∧_{H({A_n};S^{2;ext}_R)}}(n, m) H^{*{A_n}}_{n,m}(R; ·),    (149)

where

F^{∧_{H({A_n};S^{2;ext}_R)}}(n, m) = ⟨F, H*_{n,m}(R; ·)⟩_{H({A_n};S^{2;ext}_R)}.    (150)


If no confusion is likely to arise we also use the notation

F^{∧_{H(S^{2;ext}_R)}}(n, m) = F^{∧_{H({A_n};S^{2;ext}_R)}}(n, m).    (151)

Furthermore, according to its construction, the space H(S^{2;ext}_R) has the reproducing kernel function K_{H(S^{2;ext}_R)}(·, ·) : S^{2;ext}_R × S^{2;ext}_R → R given by

K_{H(S^{2;ext}_R)}(x, y) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} H^{*{A_n}}_{n,m}(R; x) H^{*{A_n}}_{n,m}(R; y),    x, y ∈ S^{2;ext}_R.    (152)

Our next goal is the introduction of a class of scalar Sobolev spaces based on the spherical symbol {(Δ^{*;R})^∧(n)}_{n∈N_0} of the Beltrami operator Δ^{*;R} related to the sphere S^2_R. We observe that

Δ^{*;R} Y_{n,m} = (1/R^2) Δ* Y_{n,m} = −(1/R^2) n(n + 1) Y_{n,m},    n ∈ N_0; m = 1, . . ., 2n + 1.    (153)

In particular, we have Δ^{*;R} Y_{0,1} = 0, which requires a shift by a non-zero constant, for example 1/(4R^2), to obtain invertibility. As a consequence we formally obtain

(−Δ^{*;R} + 1/(4R^2))^{s/2} Y_{n,m} = ((n + 1/2)/R)^s Y_{n,m}    (154)

and

[(−Δ^{*;R} + 1/(4R^2))^{s/2} F]^∧(n, m) = ((n + 1/2)/R)^s F^∧(n, m),    (155)

n ∈ N_0, m = 1, . . ., 2n + 1. For s ∈ R we let

H_s(S^{2;ext}_R) := H({((n + 1/2)/R)^s}; S^{2;ext}_R),    (156)

such that the norm in H_s(S^{2;ext}_R) fulfills

‖F‖_{H_s(S^{2;ext}_R)} = ‖(−Δ^{*;R}_x + 1/(4R^2))^{s/2} F‖_{L^2(S^2_R)}.    (157)

For H_0(S^{2;ext}_R) we identify the norm ‖·‖_{H_0(S^{2;ext}_R)} with the ‖·‖_{L^2(S^2_R)}-norm.


The space H_0(S^{2;ext}_R) may be understood as the space of all solutions of the Dirichlet boundary value problem in S^{2;ext}_R corresponding to L^2(S^2_R)-boundary values on S^2_R (note that the potential in H_0(S^{2;ext}_R) corresponding to the L^2(S^2_R)-(Dirichlet) boundary conditions on S^2_R is uniquely determined). Furthermore, if t < s, then we have

‖F‖_{H_t(S^{2;ext}_R)} ≤ ‖F‖_{H_s(S^{2;ext}_R)}    (158)

and

H_s(S^{2;ext}_R) ⊂ H_t(S^{2;ext}_R).    (159)

In order to formulate some results (cf. [31, 35, 58]) about the convergence of the expansion in terms of outer harmonics to a function in the ordinary pointwise sense (Sobolev Lemma), we need the concept of summable sequences {A_n}_{n∈N_0} ∈ A satisfying

∑_{n=0}^{∞} (2n + 1)/A_n^2 < ∞.    (160)
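For A_n = (n + 1/2)^s with R = 1, the summability condition (160) holds exactly when s > 1; the borderline behaviour can be inspected numerically. The following pure-Python sketch uses illustrative exponents that are assumptions of this example:

```python
def partial_sum(s, N):
    """Partial sum of (160) with A_n = (n + 1/2)^s and R = 1."""
    return sum((2 * n + 1) / (n + 0.5) ** (2 * s) for n in range(N + 1))

# s = 2: convergent -- the tail beyond n = 1000 is already negligible
tail = partial_sum(2.0, 2000) - partial_sum(2.0, 1000)
print(tail)  # tiny

# s = 1: the terms behave like 2/(n + 1/2), a divergent harmonic-type series
growth = partial_sum(1.0, 20000) - partial_sum(1.0, 10000)
print(growth)  # about 2*ln(2): the sum keeps growing without bound
```

This is exactly the dividing line s > 1 quoted for the Sobolev Lemma below.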

Sobolev Lemma: Assume that the sequences {A_n}_{n∈N_0}, {B_n}_{n∈N_0} ∈ A are given in such a way that {B_n^{−1} A_n}_{n∈N_0} is summable. Then each F ∈ H({B_n^{−1} A_n}; S^{2;ext}_R) corresponds to a potential of class Pot^{(0)}(S^{2;ext}_R).

The Sobolev Lemma states that, in the case of summability of the sequence {B_n^{−1} A_n}_{n∈N_0}, the Fourier series of F ∈ H({B_n^{−1} A_n}; S^{2;ext}_R) in terms of the basis functions H*_{n,m} is continuous on the boundary S^2_R. In particular, we have the following statement: If F ∈ H_s(S^{2;ext}_R) with s > 1, then F corresponds to a function of class Pot^{(0)}(S^{2;ext}_R).

For any F in L^2(S^2_R), there exists one and only one "harmonic continuation" U ∈ H(S^{2;ext}_R), which is given by

U(x) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} A_n^2 F^{∧_{L^2(S^2_R)}}(n, m) H*_{n,m}(R; x),    x ∈ S^{2;ext}_R,    (161)

where

F^{∧_{L^2(S^2_R)}}(n, m) = ∫_{S^2_R} F(y) H^R_{n,m}(y) dS(y) = (1/A_n^2) U^{∧_{H(S^{2;ext}_R)}}(n, m),    (162)

n = 1, 2, . . .; m = 1, . . ., 2n + 1.


Tensorial Case: In order to introduce Sobolev spaces for tensor fields we start from

e := {f ∈ pot^{(∞)}(S^{2;ext}_R) : ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} A_n^2 (f^{∧_{l^2(S^2_R)}}(n, m))^2 < ∞},    (163)

where

f^{∧_{l^2(S^2_R)}}(n, m) = ∫_{S^2_R} f(y) · h^{R;(1,1)}_{n,m}(y) dS(y).    (164)

Equipped with the inner product

⟨f, g⟩_{h({A_n};S^{2;ext}_R)} = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} A_n^2 f^{∧_{l^2(S^2_R)}}(n, m) g^{∧_{l^2(S^2_R)}}(n, m),    f, g ∈ e,    (165)

the space e becomes a pre-Hilbert space. We define the Sobolev space h(S^{2;ext}_R) = h({A_n}; S^{2;ext}_R) as the completion of e under the norm ‖·‖_{h({A_n};S^{2;ext}_R)}, which denotes the norm associated to ⟨·,·⟩_{h({A_n};S^{2;ext}_R)}:

h({A_n}; S^{2;ext}_R) := closure of e with respect to ‖·‖_{h({A_n};S^{2;ext}_R)}.    (166)

If no confusion is likely to arise we also use h(S^{2;ext}_R) instead of h({A_n}; S^{2;ext}_R). h(S^{2;ext}_R) equipped with the inner product ⟨·,·⟩_{h({A_n};S^{2;ext}_R)} is a separable Hilbert space. The system {h^{*{A_n}}_{n,m}(R; ·)}_{n∈N_0; m=1,...,2n+1}, given by

h^{*{A_n}}_{n,m}(R; x) := A_n^{−1} h^{R;(1,1)}_{n,m}(x),    x ∈ S^{2;ext}_R,    (167)

represents an h(S^{2;ext}_R)-orthonormal Hilbert basis in h(S^{2;ext}_R) (note that we sometimes write h*_{n,m}(R; ·) instead of h^{*{A_n}}_{n,m}(R; ·)). As a consequence we can expand a function f ∈ h(S^{2;ext}_R) as a Fourier series in terms of the basis h^{*{A_n}}_{n,m}(R; ·):

f = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} f^{∧_{h({A_n};S^{2;ext}_R)}}(n, m) h^{*{A_n}}_{n,m}(R; ·),    (168)


where

f^{∧_{h(S^{2;ext}_R)}}(n, m) = f^{∧_{h({A_n};S^{2;ext}_R)}}(n, m) = ⟨f, h^{*{A_n}}_{n,m}(R; ·)⟩_{h(S^{2;ext}_R)}.    (169)

The space h(S^{2;ext}_R) possesses the reproducing kernel K_{h(S^{2;ext}_R)}(·, ·) : S^{2;ext}_R × S^{2;ext}_R → R^3 ⊗ R^3 ⊗ R^3 ⊗ R^3 given by

K_{h(S^{2;ext}_R)}(x, y) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} h^{*{A_n}}_{n,m}(x) ⊗ h^{*{A_n}}_{n,m}(y),    x, y ∈ S^{2;ext}_R,    (170)

i.e.,

• for all x ∈ S^{2;ext}_R, Õ^{(1,1)}_R K_{h(S^{2;ext}_R)}(·, x) ∈ h(S^{2;ext}_R), where the operator Õ^{(1,1)}_R is the extension of the adjoint operator of õ^{(1,1)}_R to tensor fields of rank four,
• (Õ^{(1,1)}_R f)(x) = ⟨Õ^{(1,1)}_R K_{h(S^{2;ext}_R)}(·, x), f⟩_{h(S^{2;ext}_R)} for every f ∈ h(S^{2;ext}_R) and all x ∈ S^{2;ext}_R.

Finally, we set

h_s(S^{2;ext}_R) := h({((n + 1/2)/R)^s}; S^{2;ext}_R).    (171)

Tensorial Sobolev Lemma: Let the sequences {A_n}_{n∈N_0}, {B_n}_{n∈N_0} ∈ A be given such that {B_n^{−1} A_n}_{n∈N_0} ∈ A is summable. Then each f ∈ h({B_n^{−1} A_n}; S^{2;ext}_R) corresponds to a function of class pot^{(0)}(S^{2;ext}_R).

For any f ∈ l^2(S^2_R), there exists one and only one tensorial "harmonic upward continuation" u ∈ h(S^{2;ext}_R) of the form

u(x) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} A_n^2 f^{∧_{l^2(S^2_R)}}(n, m) h*_{n,m}(R; x),    x ∈ S^{2;ext}_R,    (172)

where

f^{∧_{l^2(S^2_R)}}(n, m) = ∫_{S^2_R} f(y) · h^{R;(1,1)}_{n,m}(y) dS(y) = (1/A_n^2) u^{∧_{h(S^{2;ext}_R)}}(n, m).    (173)

Obviously, f^{∧_{l^2(S^2_R)}}(n, m) = u^{∧_{l^2(S^2_R)}}(n, m) holds true for all n, m. More detailed information about tensorial Sobolev spaces can be found in [105].

4.2 Pseudodifferential Equation

Since we know, at least in the spherically reflected context (as introduced by [127]), which conditions guarantee the uniqueness of an SGG-solution, we can turn to the question of how to find a solution from tensorial data and what we mean by a spectral solution when the ill-posedness has to be taken into account. This leads us to analyze the SGG-problem step by step by use of the concept of pseudodifferential operators. We briefly introduce the framework of scalar and tensorial pseudodifferential operators (PDOs).

Scalar Pseudodifferential Operator: Let {Λ^∧(n)}_{n∈N_0} be a sequence of real numbers. The operator Λ : H(S^{2;ext}_R) → H(S^{2;ext}_τ), τ ≥ R > 0, defined by

ΛF = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} Λ^∧(n) F^{∧_{H(S^{2;ext}_R)}}(n, m) H*_{n,m}(τ; ·),    (174)

is called a scalar pseudodifferential operator of order t if

lim_{n→∞} |Λ^∧(n)| / (n + 1/2)^t = const ≠ 0    (175)

holds true for some t ∈ R. If the limit

lim_{n→∞} |Λ^∧(n)| / (n + 1/2)^t = 0    (176)

holds true for all t ∈ R, the operator Λ is called a scalar pseudodifferential operator of exponential order. The sequence {Λ^∧(n)}_{n∈N_0} is called the symbol of the scalar PDO Λ.

Tensorial Pseudodifferential Operator: Let {Λ^∧(n)}_{n∈N_0} be a sequence of real numbers. The operator Λ : H(S^{2;ext}_R) → h(S^{2;ext}_τ), τ ≥ R > 0, defined by

ΛF = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} Λ^∧(n) F^{∧_{H(S^{2;ext}_R)}}(n, m) h*_{n,m}(τ; ·),    (177)

is called a tensorial pseudodifferential operator of order t if

lim_{n→∞} |Λ^∧(n)| / (n + 1/2)^t = const ≠ 0    (178)

holds true for some t ∈ R. If the limit

lim_{n→∞} |Λ^∧(n)| / (n + 1/2)^t = 0    (179)

holds true for all t ∈ R, the operator Λ is called a tensorial pseudodifferential operator of exponential order. The sequence {Λ^∧(n)}_{n∈N_0} is called the symbol of the tensorial PDO Λ.

In the following we define scalar and tensorial kernel functions which are of basic importance for the consideration of the SGG-problem in terms of pseudodifferential operators.

Kernel Functions: Let τ, R satisfy τ ≥ R ≥ 0. Then any kernel Q_{R,R}(·, ·) : S^{2;ext}_R × S^{2;ext}_R → R of the form

Q_{R,R}(x, y) = ∑_{n=0}^{∞} Q^∧(n) ∑_{m=1}^{2n+1} H*_{n,m}(R; x) H*_{n,m}(R; y),    (180)

x, y ∈ S^{2;ext}_R, is called an H_{R,R}-kernel. Any kernel q_{R,τ}(·, ·) : S^{2;ext}_R × S^{2;ext}_τ → R^3 ⊗ R^3 of the form

q_{R,τ}(x, y) = ∑_{n=0}^{∞} q^∧(n) ∑_{m=1}^{2n+1} H*_{n,m}(R; x) h*_{n,m}(τ; y),    (181)

(x, y) ∈ S^{2;ext}_R × S^{2;ext}_τ, is called an h_{R,τ}-kernel. The sequence {Q^∧(n)}_{n∈N_0} is called the symbol of the H_{R,R}-kernel, and {q^∧(n)}_{n∈N_0} is called the symbol of the h_{R,τ}-kernel.

Convolutions: An H_{R,R}-kernel Q_{R,R}(·, ·) with the symbol {Q^∧(n)}_{n∈N_0} is called admissible if the following conditions are satisfied:

1. ∑_{n=0}^{∞} (Q^∧(n))^2 < ∞,
2. ∑_{n=0}^{∞} (2n + 1) (Q^∧(n)/A_n)^2 < ∞.
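For instance, a geometrically decaying symbol Q^∧(n) = h^n with 0 < h < 1 (an Abel–Poisson-type choice made up for this illustration) is admissible with respect to A_n = (n + 1/2)^s, since both defining series converge; numerically the partial sums stabilize long before the truncation degree is reached:

```python
h, s = 0.9, 2.0  # assumed kernel parameter and Sobolev exponent

def cond1(N):
    """Partial sum of condition 1: sum of (Q^(n))^2 with Q^(n) = h^n."""
    return sum(h ** (2 * n) for n in range(N + 1))

def cond2(N):
    """Partial sum of condition 2: sum of (2n+1)(Q^(n)/A_n)^2, A_n = (n+1/2)^s."""
    return sum((2 * n + 1) * (h ** n / (n + 0.5) ** s) ** 2 for n in range(N + 1))

# geometric decay: both partial sums have converged long before N = 400
print(cond1(400) - cond1(200))  # far below 1e-12
print(cond2(400) - cond2(200))  # far below 1e-12
```

Any symbol decaying faster than every polynomial passes both conditions in the same way.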


In analogy, an h_{R,τ}-kernel q_{R,τ}(·, ·) with the symbol {q^∧(n)}_{n∈N_0} is called admissible if conditions 1 and 2 are satisfied. We define the convolution of an admissible H_{R,R}-kernel against a function F ∈ H(S^{2;ext}_R) via the Parseval identity as follows:

(Q_{R,R} *_{H(S^{2;ext}_R)} F)(x) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} Q^∧(n) F^{∧_{H(S^{2;ext}_R)}}(n, m) H*_{n,m}(R; x),    (182)

x ∈ S^{2;ext}_R. In analogy, we introduce the convolution of an admissible h_{R,τ}-kernel against F ∈ H(S^{2;ext}_R) and f ∈ h(S^{2;ext}_τ), respectively, by

(q_{R,τ} *_{H(S^{2;ext}_R)} F)(x) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} q^∧(n) F^{∧_{H(S^{2;ext}_R)}}(n, m) h*_{n,m}(τ; x),    (183)

x ∈ S^{2;ext}_τ, and

(q_{R,τ} *_{h(S^{2;ext}_τ)} f)(x) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} q^∧(n) f^{∧_{h(S^{2;ext}_τ)}}(n, m) H*_{n,m}(R; x),    (184)

x ∈ S^{2;ext}_R.

Within the context of pseudodifferential operators we are able to formulate the (scalar or tensorial) SGG-problem (for more details on pseudodifferential operators the reader should consult [127], [35, 58] and the references therein).

Upward/Downward Continuation: Let S^2_R be a Runge sphere inside the real Earth Σ^{int}, i.e., R < inf_{x∈Σ} |x|. Furthermore, let γ be smaller than the lowest possible altitude of the satellite, i.e., γ < inf_{x∈Γ} |x| (cf. Fig. 9). Consider a potential of class H({A_n}; S^{2;ext}_R) of the form

F = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} F^{∧_{H({A_n};S^{2;ext}_R)}}(n, m) H^{*{A_n}}_{n,m}(R; ·).    (185)

The upward continuation operator Λ^{R,γ}_{up} associates to F ∈ H({A_n}; S^{2;ext}_R) the solution Λ^{R,γ}_{up} F of the Dirichlet problem Λ^{R,γ}_{up} F ∈ Pot^{(0)}(S^{2;ext}_γ) corresponding to the boundary values (Λ^{R,γ}_{up} F)|_{S^2_γ} = F|_{S^2_γ}. The upward continuation operator Λ^{R,γ}_{up} has the associated symbol

(Λ^{R,γ}_{up})^∧(n) = (R/γ)^n.    (186)

The inverse of Λ^{R,γ}_{up} is called the downward continuation operator:

Λ^{R,γ}_{down} = (Λ^{R,γ}_{up})^{−1}.    (187)

It brings the gravitational potential down from the height γ to the height R and has the associated symbol

(Λ^{R,γ}_{down})^∧(n) = (γ/R)^n.    (188)

It is obvious that the upward continuation is well-posed, whereas the downward continuation leads to an ill-posed problem.

Pseudodifferential Operator of the First Order Radial Derivative: This operator associates to F ∈ H({A_n}; S^{2;ext}_R) the solution Λ_{FRD} F of the Dirichlet problem Λ_{FRD} F ∈ Pot^{(0)}(S^{2;ext}_γ) corresponding to the boundary values (Λ_{FRD} F)|_{S^2_γ} = −(∂/∂r) F|_{S^2_R}. This is an operator of order 1 with the associated symbol

(Λ_{FRD})^∧(n) = (n + 1)/R.    (189)
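The well-posed/ill-posed contrast between the symbols (186) and (188) is easy to quantify. With an assumed Bjerhammar radius of 6371 km and an orbit radius of 6621 km (roughly a GOCE-type altitude of 250 km; both values are assumptions of this sketch), upward continuation damps degree-200 information by more than three orders of magnitude, while downward continuation amplifies any data error by the reciprocal factor:

```python
R, gamma = 6371.0, 6621.0  # assumed Bjerhammar and orbit radii in km

up = lambda n: (R / gamma) ** n      # symbol (186): exponential damping
down = lambda n: (gamma / R) ** n    # symbol (188): exponential amplification

print(up(200))    # about 5e-4
print(down(200))  # about 2e+3

# a degree-200 data error of 1e-9 grows by the same factor on the way down
print(1e-9 * down(200))  # about 2e-6
```

The product up(n) * down(n) is exactly 1 per degree, but in the presence of noise the amplification makes the inversion unstable, which is the ill-posedness referred to above.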

Pseudodifferential Operator of the Second Order Radial Derivative: Analogous considerations applied to the operator ∂^2/∂r^2 yield the operator of the second order radial derivative, which associates to F ∈ H({A_n}; S^{2;ext}_R) the solution Λ_{SRD} F of the Dirichlet problem Λ_{SRD} F ∈ Pot^{(0)}(S^{2;ext}_γ) corresponding to the boundary values (Λ_{SRD} F)|_{S^2_γ} = (∂^2/∂r^2) F|_{S^2_R}. This is an operator of order 2 with the associated symbol

(Λ_{SRD})^∧(n) = (n + 1)(n + 2)/R^2.    (190)

The Pseudodifferential Operator of the Hesse Tensor: Let us consider the operator of the second derivative (i.e., the Hesse tensor)

∇ ⊗ ∇ : H(S^{2;ext}_R) → h(S^{2;ext}_R),    (191)

which associates to F ∈ H({A_n}; S^{2;ext}_R) the solution Λ_{Hesse} F of the Dirichlet problem of finding Λ_{Hesse} F ∈ h(S^{2;ext}_γ) corresponding to the boundary values

(Λ_{Hesse} F)|_{S^2_γ} = (∇ ⊗ ∇) F|_{S^2_R}.    (192)

This is an operator of order 2 given by the symbol

(Λ_{Hesse})^∧(n, m) = √((n + 2)(n + 1)(2n + 3)(2n + 1)) / R^2,    (193)

n = 0, 1, . . .; m = 1, . . ., 2n + 1.

Up to now, we assumed spherical geometry in connection with spherical harmonics, i.e., we presented spherical harmonics which are adequate for the common spherical approach, where the reference surface is supposed to be a sphere and the data are given on a spherical satellite orbit. Next, we make the first steps towards a Runge concept which enables the application of arbitrary geometries. The basic idea underlying the Runge concept is to specify two spheres as illustrated in Fig. 6, thereby leading back to spherical basis functions and using the specific properties of outer harmonics. Starting from the data given on the real satellite orbit Γ (which is not necessarily required here to be a closed surface), we pull down the tensorial information to a sphere S^2_γ of radius γ with dist(S^2_γ, Γ) > 0. By virtue of "downward continuation" from S^2_γ to the sphere S^2_R inside the Earth Σ^{int} such that dist(S^2_R, Σ) > 0, we are able to express the solution in terms of scalar outer harmonics, i.e., we obtain a representation of the gravitational potential on the real Earth's surface Σ using data on the real orbit Γ.

Scalar Pseudodifferential Operator for Satellite Gravitational Gradiometry: Let a function G of class H({A_n}; S^{2;ext}_γ) be known. Suppose that the symbol of the pseudodifferential operator is given by

(Λ_{SGG})^∧(n) = (R/γ)^n (n + 1)(n + 2)/γ^2.    (194)

Find a potential F ∈ H({A_n}; S^{2;ext}_R)|_{Σ^{ext}} such that

Λ_{SGG} F = G.    (195)
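That Λ_SGG is of exponential order (hence order −∞ in Table 1) can be seen numerically: the symbol (194) divided by any polynomial weight (n + 1/2)^t still tends to zero. A sketch with assumed radii R = 6371 km, γ = 6621 km (illustrative values, not taken from the text):

```python
R, gamma = 6371.0, 6621.0  # assumed radii in km

def sgg_symbol(n):
    """Scalar SGG symbol (194): (R/gamma)^n (n+1)(n+2)/gamma^2."""
    return (R / gamma) ** n * (n + 1) * (n + 2) / gamma ** 2

def weighted(n, t):
    """Quotient appearing in the exponential-order criterion (176)."""
    return sgg_symbol(n) / (n + 0.5) ** t

# even against a strong polynomial weight (t = 10) the quotient decays
print(weighted(25, 10), weighted(100, 10), weighted(400, 10))
```

The exponential factor (R/γ)^n eventually dominates every power of n, which is exactly why the inverse problem (195) is severely ill-posed.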

The solution process of the SGG-problem of determining the gravitational potential of the Earth on the Earth's surface from orbital values v of the Hesse tensor on Γ may be based on the Runge assumption that there exists a potential outside a Runge (Bjerhammar) sphere S^2_R inside the Earth within ε-accuracy (ε > 0, but arbitrarily small) of the real gravitational potential outside and on the Earth's surface Σ. Thus we are allowed to model the SGG-problem (see also [48]) in the following way:

The Tensorial SGG Pseudodifferential Operator: The SGG-operator

Λ^{R,γ}_{SGG} : H(S^{2;ext}_R) → h(S^{2;ext}_γ)    (196)

expressed in terms of outer harmonics is given by

Λ^{R,γ}_{SGG} H*_{n,m}(R; x) = (√((n + 2)(n + 1)(2n + 3)(2n + 1))/γ^2) (R/γ)^n h*_{n,m}(γ; x),    x ∈ S^{2;ext}_γ.    (197)

The symbol of this operator is given by

(Λ^{R,γ}_{SGG})^∧(n, m) = (√((n + 2)(n + 1)(2n + 3)(2n + 1))/γ^2) (R/γ)^n,    (198)

n = 0, 1, . . .; m = 1, . . ., 2n + 1. The SGG-operator applied to F ∈ H(S^{2;ext}_R) is representable as an outer harmonic series expansion as follows:

Λ^{R,γ}_{SGG} F(x) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} (Λ^{R,γ}_{SGG})^∧(n, m) F^{∧_{H(S^{2;ext}_R)}}(n, m) h*_{n,m}(γ; x).    (199)

The interrelations between the potential F and the full Hesse tensor of F on the Earth's surface and the satellite orbit can be represented in a so-called Meissl scheme (see Fig. 8). Meissl schemes both in the framework of outer harmonics and in multiscale nomenclature can be found in [47]. Table 1 presents a list of all the previously discussed pseudodifferential operators.

4.3 Multiscale Frequency Regularization

Tensor spherical harmonics of type (1,1) allow us to express the Hesse tensor applied to solid (outer) harmonics in the form

(∇ ⊗ ∇) H^R_{n,m} = √(μ̃^{(1,1)}_n) h^{R;(1,1)}_{n,m},    (200)

where we recall that μ̃^{(1,1)}_n = (n + 1)(n + 2)(2n + 1)(2n + 3).

Fig. 8 The Meissl scheme for the Hesse tensor on the Runge reference sphere of the Earth's surface and on the reference sphere of the satellite orbit (see also [47], [112], and [115])

Table 1 Pseudodifferential operators which play a certain role within the SGG-context

| Operator | Description | Symbol | Order |
| Λ^{R,γ}_{up} | Upward continuation operator | (R/γ)^n | −∞ |
| Λ^{R,γ}_{down} | Downward continuation operator | (γ/R)^n | ∞ |
| Λ^R_{FRD} | First order radial derivative at the Earth's surface | (n + 1)/R | 1 |
| Λ^R_{SRD} | Second order radial derivative at the Earth's surface | (n + 1)(n + 2)/R^2 | 2 |
| Λ^{R,γ}_{SGG} | Scalar pseudodifferential operator for satellite gravitational gradiometry | (R/γ)^n (n + 1)(n + 2)/γ^2 | −∞ |
| Λ_{Hesse} | Hesse tensor | √((n + 2)(n + 1)(2n + 3)(2n + 1))/R^2 | 2 |
| Λ^{R,γ}_{SGG} | Tensorial pseudodifferential operator for satellite gravitational gradiometry | (R/γ)^n √((n + 2)(n + 1)(2n + 3)(2n + 1))/γ^2 | −∞ |


This leads to the tensor-isotropic SGG-pseudodifferential equation

Λ(V) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} V^{∧_{L^2(S^2_R)}}(n, m) √(μ̃^{(1,1)}_n) h^{R;(1,1)}_{n,m} = (∇ ⊗ ∇)V = v,    (201)

as spectral (frequency) representation for the inversion of the SGG-integral equation ΛV = v. Equivalently, we have a representation as a convolution

Λ(V) = ∫_{S^2_R} K^Λ_R(·, y) V(y) dS(y) = (∇ ⊗ ∇)V = v,    (202)

where the tensorial kernel K^Λ_R(·, ·) is given in spectral form by

K^Λ_R(x, y) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} √(μ̃^{(1,1)}_n) h^{R;(1,1)}_{n,m}(x) H^R_{n,m}(y).    (203)

By the completeness of the system {H^R_{n,m}}, this enables us to conclude in the framework l^2(S^2_R) of square-integrable tensor fields on S^2_R that

⟨(∇ ⊗ ∇)V, h^{R;(1,1)}_{p,q}⟩_{l^2(S^2_R)} = ∫_{S^2_R} (∇ ⊗ ∇)V(y) · h^{R;(1,1)}_{p,q}(y) dS(y) = V^{∧_{L^2(S^2_R)}}(p, q) √(μ̃^{(1,1)}_p).    (204)

Consequently, we obtain the following expansion for the potential V in S^{2;ext}_R:

V(x) = Λ^{−1}(∇ ⊗ ∇V)(x) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} ⟨(∇ ⊗ ∇)V, h^{R;(1,1)}_{n,m}⟩_{l^2(S^2_R)} (μ̃^{(1,1)}_n)^{−1/2} H^R_{n,m}(x),    (205)

where (∇ ⊗ ∇)V = v, i.e.,

V(x) = Λ^{−1} v(x) = ∫_{S^2_R} K^{Λ^{−1}}_R(x, y) · v(y) dS(y)    (206)

with

K^{Λ^{−1}}_R(x, y) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} (μ̃^{(1,1)}_n)^{−1/2} H^R_{n,m}(x) h^{R;(1,1)}_{n,m}(y).    (207)
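Inverting (201) degree by degree is exact, but once the data are taken at satellite altitude the effective diagonal factors combine √(μ̃_n^{(1,1)}) with the exponential upward-continuation damping, cf. (198); the resulting condition number of any truncated system explodes with the cutoff degree. A sketch with assumed radii (R = 6371 km, γ = 6621 km, illustrative values only):

```python
import math

R, gamma = 6371.0, 6621.0  # assumed Bjerhammar and orbit radii in km

def lam(n):
    """Effective factor sqrt(mu~_n^(1,1)) (R/gamma)^n / gamma^2, cf. (198)."""
    mu = (n + 1) * (n + 2) * (2 * n + 1) * (2 * n + 3)
    return math.sqrt(mu) / gamma ** 2 * (R / gamma) ** n

def cond(N):
    """Ratio of largest to smallest diagonal factor up to degree N."""
    vals = [lam(n) for n in range(N + 1)]
    return max(vals) / min(vals)

# the conditioning deteriorates rapidly with the spherical harmonic cutoff
print(cond(300), cond(600), cond(900))
```

This quantifies the statement below that Λ^{−1} is unbounded: whatever cutoff is chosen, raising it inflates the noise amplification, so regularization is indispensable.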

This formula expresses the gravitational potential V in terms of the tensor v at locations x of the satellite orbit Γ. Λ is a linear bounded injective compact operator (see, e.g., [35] for more details), so it follows (see, e.g., [45]) that the SGG-problem is ill-posed because of the unboundedness of Λ^{−1}. Hence, the SGG-problem needs regularization. As described earlier, a regularization strategy for the SGG-problem is a sequence {R_j}_{j∈N_0} of linear bounded pseudodifferential operators R_j so that

lim_{j→∞} R_j Λ(V) = V    (208)

in the outer space of S^2_R, i.e., the operators R_j Λ converge in the pointwise sense to the identity operator. In principle, all regularization methods (mentioned in Sect. 2) are applicable to SGG. In what follows, however, we are only interested in two SGG-multiscale regularization strategies.

Tikhonov Regularization Strategy: This method makes use of the (non-bandlimited) isotropic Tikhonov kernels (scaling functions) Φ_j, j = 0, 1, . . ., given by

Φ_j(x, y) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} (Φ_j)^∧(n) H^R_{n,m}(x) H^R_{n,m}(y)    (209)

with

(Φ_j)^∧(n) = (Λ^∧(n))^2 / ((Λ^∧(n))^2 + μ_j^2),    n = 0, 1, . . ., j = 0, 1, . . .,    (210)

where {μ_j}, j = 0, 1, . . ., is a sequence of real numbers satisfying

lim_{j→∞} μ_j = 0.    (211)

Then the operators R_j = Φ_j * Λ^{−1} constitute a multiscale regularization strategy. More concretely,

R_j Λ(V)(x) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} (Φ_j)^∧(n) ⟨(∇ ⊗ ∇)V, h^{R;(1,1)}_{n,m}⟩_{l^2(S^2_R)} (μ̃^{(1,1)}_n)^{−1/2} H^R_{n,m}(x),    (212)

where (∇ ⊗ ∇)V = v, i.e.,

R_j Λ(V)(x) = ∫_{S^2_R} (∫_{S^2_R} Φ_j(x, z) K^{Λ^{−1}}_R(z, y) dS(z)) · (∇ ⊗ ∇)V(y) dS(y).    (213)
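The behaviour of the Tikhonov filter factors (210) can be sketched numerically. Here the symbol is the scalar SGG symbol (194) with assumed radii, and μ_j = μ_0 q^j is one concrete null sequence in the sense of (211), chosen purely for the illustration:

```python
R, gamma = 6371.0, 6621.0   # assumed radii in km
mu0, q = 1e-4, 0.5          # assumed regularization sequence mu_j = mu0 * q**j

def symbol(n):
    """Scalar SGG symbol (194)."""
    return (R / gamma) ** n * (n + 1) * (n + 2) / gamma ** 2

def tikhonov_filter(j, n):
    """Filter factor (210): lambda^2 / (lambda^2 + mu_j^2)."""
    lam, mu = symbol(n), mu0 * q ** j
    return lam ** 2 / (lam ** 2 + mu ** 2)

# for a fixed degree the factor increases towards 1 as the scale j grows,
# while at a fixed scale high degrees remain strongly damped
print(tikhonov_filter(0, 60), tikhonov_filter(10, 60))
print(tikhonov_filter(10, 300))
```

The factors always lie in [0, 1]: they approach 1 scale by scale (consistency with (208)) while suppressing the degrees at which the symbol has decayed below μ_j (stability).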


Truncated Singular Value Regularization Strategy: The point of departure is a one-dimensional function ϕ_0 : [0, ∞) → R with the following properties (cf. [63]):

(i) ϕ_0(0) = 1,
(ii) ϕ_0 is monotonically decreasing,
(iii) ϕ_0 is continuous at 0,
(iv) ϕ_0 has local support, i.e., supp ϕ_0 ⊂ [0, 1].

Accordingly we are led to the isotropic scaling functions Φ_j, j = 0, 1, . . ., given by

Φ_j(x, y) = ∑_{n=0}^{∞} ∑_{m=1}^{2n+1} ϕ_0(2^{−j} n) H^R_{n,m}(x) H^R_{n,m}(y),    (214)

where

0 ≤ (Φ_j)^∧(n) = ϕ_0(2^{−j} n) ≤ 1,    n = 0, 1, . . .,    (215)

and

lim_{j→∞} (Φ_j)^∧(n) = lim_{j→∞} ϕ_0(2^{−j} n) = 1,    n = 0, 1, . . ..    (216)

In accordance with our construction, the compact support of ϕ_0 implies that, for each scale j, only finitely many (Φ_j)^∧(n) are different from 0. Obviously, it follows that supp ϕ_0(2^{−j} ·) ⊂ [0, 2^j]. So the bandlimited kernels Φ_j, j = 0, 1, . . ., defined via a generator ϕ_0 satisfying the properties (i)–(iv) as stated above define operators

R_j = Φ_j * Λ^{−1},    (217)

which constitute a regularization strategy in the following sense:

1. R_j is bounded,
2. the limit relation

lim_{j→∞} R_j Λ(V) = V    (218)

holds true in the outer space of S^2_R.

For more bandlimited as well as non-bandlimited regularization strategies and methodologies the reader is referred to, e.g., [45, 63].
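A concrete generator satisfying (i)–(iv), e.g. the hat function ϕ_0(t) = max(0, 1 − t) (an assumption of this sketch, not prescribed by the text), makes the bandlimiting and the limit relation (216) directly checkable:

```python
def phi0(t):
    """A hat generator satisfying (i)-(iv): phi0(0)=1, decreasing, supp in [0,1]."""
    return max(0.0, 1.0 - t)

def Phi(j, n):
    """Symbol of the scaling function (214): (Phi_j)^(n) = phi0(n / 2^j)."""
    return phi0(n / 2 ** j)

print(Phi(3, 0))             # 1.0, property (i)
print(Phi(3, 8), Phi(3, 9))  # 0.0 0.0, degrees beyond 2^j = 8 are cut off
print(Phi(2, 6), Phi(4, 6), Phi(8, 6))  # increases towards 1, cf. (216)
```

At scale j the kernel Φ_j therefore acts as a truncation of the singular value expansion to degrees n < 2^j, with the truncation degree growing as j → ∞.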


5

SGG in Space-Based Framework

As already pointed out, the SGG-problem can be modeled by a tensorial Fredholm integral equation of the first kind in space domain. In what follows we are interested in a novel SGG-modeling method using exclusively arguments in space domain thereby involving Abel–Poisson kernels, where the geometric situation is illustrated in Fig. 9.

5.1 Fredholm Integral Equation

In the spherically reflected Runge (Bjerhammar) framework of a potential V approximating arbitrarily closely the Earth's external gravitational potential Ṽ (cf. (140)), the SGG-integral equation

$$\int_{\mathbb{S}_R^2} \underbrace{\nabla_x\otimes\nabla_x\,\frac{1}{4\pi R}\,\frac{|x|^2 - R^2}{|x-y|^3}}_{=\,\mathbf{k}_R(x,y)}\, V(y)\, dS(y) \;=\; (\nabla\otimes\nabla)V(x) \;=\; v(x) \;\approx\; \tilde{v}(x) \tag{219}$$

holds true for all locations x along the satellite orbit Γ, where

$$v(x) \;=\; (\nabla\otimes\nabla)V(x) \;\approx\; (\nabla\otimes\nabla)\tilde{V}(x), \qquad x \in \Gamma, \tag{220}$$

are the known tensorial input data, kR(·,·) is as usual the Hesse tensor of the Abel–Poisson kernel given by (6), and, in accordance with our construction, the scalar function V|Σ, i.e., the potential on the (actual) Earth's surface Σ obtained by "upward continuation" via the integral formula

$$V|_{\Sigma} \;=\; \int_{\mathbb{S}_R^2} V(y)\, K_R(\cdot, y)\, dS(y)\Big|_{\Sigma} \tag{221}$$

represents the desired SGG-solution to the known Hesse tensor field v(x), x ∈ Γ, given by

$$v(x) \;=\; \int_{\mathbb{S}_R^2} V(y)\, \mathbf{k}_R(x,y)\, dS(y), \qquad x \in \Gamma \tag{222}$$

(note that we do not require that Γ is a closed surface in Euclidean space ℝ³).

5.2 Space Solution in Preparatory Framework

By approximate integration over the sphere S2R (see, e.g., [39] for appropriate rules) we are led to a cubature formula of the form

$$v(x) \;=\; \int_{\mathbb{S}_R^2} V(y)\,\mathbf{k}_R(x,y)\, dS(y) \;\approx\; \sum_{k=1}^{N_R} \underbrace{w_k^R\, V(y_k^R)}_{=:\,a_k^R}\,\mathbf{k}_R(x, y_k^R), \qquad x \in \Gamma, \tag{223}$$

with appropriately given weights w_k^R ∈ ℝ, k = 1, ..., N_R, and nodes y_k^R ∈ S²_R, k = 1, ..., N_R ("≈" means that the continuous sum, i.e., the integral, on the left side is approximated by an associated discrete cubature formula on the right side). In order to determine the coefficients a_k^R ∈ ℝ, k = 1, ..., N_R, we assume that tensor values v(x_l), l = 1, ..., L_R, are known along the orbit Γ. Thus, SGG is reduced to a discrete problem, and a (discrete) solution can be obtained, e.g., by interpolatory requirements (in case of error-free data) or by smoothing and/or adjustment procedures (such as proposed, e.g., in [32, 55, 96] in case of error-affected data):

$$v(x_l) \;\approx\; \sum_{k=1}^{N_R} a_k^R\, \mathbf{k}_R(x_l, y_k^R), \qquad l = 1, \ldots, L_R. \tag{224}$$

Once the coefficients of the linear system (224),

$$a_k^R \;=\; w_k^R\, V(y_k^R), \qquad k = 1, \ldots, N_R, \tag{225}$$

are available (note that the weights w_k^R, k = 1, ..., N_R, are known from the integration rule), the Runge (Bjerhammar) potential V can be obtained from its discretization (223) as follows:

$$V(x) \;=\; \int_{\mathbb{S}_R^2} V(y)\, K_R(x,y)\, dS(y) \;\approx\; V^R(x) := \sum_{k=1}^{N_R} \underbrace{w_k^R\, V(y_k^R)}_{=\,a_k^R}\, K_R(x, y_k^R), \qquad x \in \mathbb{S}_R^{2,\mathrm{ext}},\; y_k^R \in \mathbb{S}_R^2. \tag{226}$$

As a consequence, the restriction V^R|_Σ of V^R represents the desired approximate SGG-solution on the Earth's surface Σ in space domain. It should be noted, however, that the linear system (224), with all tensorial input data situated in the "far field" Γ of the Runge (Bjerhammar) sphere S²_R, represents a serious obstacle to establishing a discrete SGG-solution V^R. In fact, the coefficient matrix of the linear system (224) may be expected to be ill-conditioned, at least for larger integers N_R and L_R. This is the reason why we propose a twofold regularization procedure, namely a "downward continuation" multiscale regularization strategy that takes advantage of the decorrelation property of multiscale structures and suppresses numerical instabilities as far as possible.
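The upward-continuation integral underlying (226) can be checked numerically for a known exterior-harmonic test potential. The sketch below is illustrative only: the product rule (Gauss–Legendre in the polar direction, trapezoidal in azimuth; cf. [39] for systematic cubature on the sphere) and the test potential V(x) = 1/|x − c| with c inside S²_R are our own choices, not taken from the text.

```python
import numpy as np

R = 1.0                           # radius of the Runge (Bjerhammar) sphere
c = np.array([0.0, 0.0, 0.3])     # singularity strictly inside S^2_R

def V(x):
    """Test potential 1/|x - c|, harmonic in the exterior of S^2_R, regular at infinity."""
    return 1.0 / np.linalg.norm(x - c)

def K(x, y):
    """Abel-Poisson kernel K_R(x, y) for |x| > R and y on S^2_R."""
    return (np.dot(x, x) - R**2) / (4.0 * np.pi * R * np.linalg.norm(x - y)**3)

# product cubature on S^2_R: Gauss-Legendre in mu = cos(theta), trapezoidal in phi
nt, nph = 40, 80
mu, gw = np.polynomial.legendre.leggauss(nt)
phis = 2.0 * np.pi * np.arange(nph) / nph
nodes, weights = [], []
for m, w in zip(mu, gw):
    s = np.sqrt(1.0 - m * m)
    for p in phis:
        nodes.append(R * np.array([s * np.cos(p), s * np.sin(p), m]))
        weights.append(R**2 * w * 2.0 * np.pi / nph)   # dS = R^2 sin(theta) dtheta dphi

x = np.array([0.5, 0.2, 1.2])     # evaluation point outside the sphere, |x| > R
approx = sum(wk * V(yk) * K(x, yk) for yk, wk in zip(nodes, weights))
rel_err = abs(approx - V(x)) / abs(V(x))
assert rel_err < 1e-6             # the discrete sum reproduces the Poisson integral
print(f"Poisson-integral cubature, relative error {rel_err:.2e}")
```

The same matrix entries w_k V(y_k) K(x, y_k) are exactly the summands a_k K_R(x, y_k) of the discretization (226); the ill-conditioning discussed above only enters once the coefficients must be recovered from far-field data instead of being computed from known boundary values.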

5.3 Space Solution in Multiscale Framework

Let us consider a monotonically decreasing sequence {γn}_{n=0,1,...} of real numbers γn, i.e., γn > γ_{n+1}, n = 0, 1, ..., satisfying the following conditions (cf. Fig. 9):

• the initial value γ0 is equal to γ, i.e., we let

$$\gamma_0 \;=\; \gamma, \tag{227}$$

• the limit of the sequence {γn}_{n=0,1,...} is equal to the radius R of the Runge (Bjerhammar) sphere, i.e.,

$$\lim_{n\to\infty} \gamma_n \;=\; R. \tag{228}$$

In what follows, we identify the members of the sequence {γn}_{n=0,1,...} with the radii of spheres around the origin acting as the "scales" in an Abel–Poisson kernel filter-reflected low- and bandpass approach, where

$$K_{\gamma_n}(x,y) \;=\; \frac{1}{4\pi\gamma_n}\,\frac{|x|^2 - \gamma_n^2}{|x-y|^3}, \qquad x, y \in \mathbb{S}_{\gamma_n}^2,\; x \neq y. \tag{229}$$

Fig. 9 The geometric situation of satellite gravitational gradiometry as discussed in our "downward continuation" multiscale space regularization procedure (sketching the orbit Γ, the spheres S²_{γ0} and S²_{γn}, the Runge (Bjerhammar) sphere S²_R, and the Earth's surface Σ)

The kernel (229) shows increasing space localization with increasing "scale" n, i.e., with decreasing radii γn (for more details about the space localization of Abel–Poisson kernels see, e.g., [63] and the references therein).

Initial Step: Discrete Abel–Poisson Lowpass Filtering Classical potential theory (see, e.g., [80]) tells us that the restriction $V|_{\mathbb{S}_{\gamma_0}^{2,\mathrm{ext}}}$ of the potential V can be represented in the form

$$V(x) \;=\; V\big|_{\mathbb{S}_{\gamma_0}^{2,\mathrm{ext}}}(x) \;=\; \int_{\mathbb{S}_{\gamma_0}^2} V(y)\, K_{\gamma_0}(x,y)\, dS(y), \qquad x \in \mathbb{S}_{\gamma_0}^{2,\mathrm{ext}}. \tag{230}$$

By approximate integration over the sphere S²_{γ0} we are led to a cubature formula V_{γ0} approximating the integral representation (230) of V on S^{2,ext}_{γ0} in the form

$$V\big|_{\mathbb{S}_{\gamma_0}^{2,\mathrm{ext}}}(x) \;\approx\; V_{\gamma_0}(x) := \sum_{k=1}^{N_{\gamma_0}} \underbrace{w_k^{\gamma_0}\, V(y_k^{\gamma_0})}_{=:\,a_k^{\gamma_0}}\, K_{\gamma_0}(x, y_k^{\gamma_0}), \qquad x \in \mathbb{S}_{\gamma_0}^{2,\mathrm{ext}}, \tag{231}$$

with appropriately given weights w_k^{γ0} ∈ ℝ, k = 1, ..., N_{γ0}, and prescribed nodes y_k^{γ0} ∈ S²_{γ0}, k = 1, ..., N_{γ0}. Hence, we obtain by forming the Hesse operator

$$v(x) \;\approx\; v_{\gamma_0}(x) \;=\; (\nabla_x\otimes\nabla_x)V_{\gamma_0}(x) \;=\; \sum_{k=1}^{N_{\gamma_0}} \underbrace{w_k^{\gamma_0}\, V(y_k^{\gamma_0})}_{=\,a_k^{\gamma_0}}\, \mathbf{k}_{\gamma_0}(x, y_k^{\gamma_0}), \qquad x \in \Gamma. \tag{232}$$

In order to determine the coefficients a_k^{γ0} ∈ ℝ, k = 1, ..., N_{γ0}, we assume in accordance with the observational SGG-situation that the tensor values v(x_l^{γ0}), l = 1, ..., L_{γ0}, are known along the orbit Γ. In other words, the problem of determining V|_{S^{2,ext}_{γ0}} by a cubature based approximation V_{γ0} is reduced to the discrete problem

$$v(x_l^{\gamma_0}) \;\approx\; v_{\gamma_0}(x_l^{\gamma_0}) \;=\; \sum_{k=1}^{N_{\gamma_0}} a_k^{\gamma_0}\, \mathbf{k}_{\gamma_0}(x_l^{\gamma_0}, y_k^{\gamma_0}), \qquad l = 1, \ldots, L_{\gamma_0}. \tag{233}$$

Once the linear system (233) is solved for the input data v(x_l^{γ0}), l = 1, ..., L_{γ0}, so that the coefficients a_k^{γ0}, k = 1, ..., N_{γ0}, are available (once more, note that w_k^{γ0} is known from the integration rule), the potential V_{γ0} as a discrete version of V|_{S^{2,ext}_{γ0}} can be obtained from

$$V_{\gamma_0}(x) \;=\; \sum_{k=1}^{N_{\gamma_0}} \underbrace{w_k^{\gamma_0}\, V(y_k^{\gamma_0})}_{=\,a_k^{\gamma_0}}\, K_{\gamma_0}(x, y_k^{\gamma_0}), \qquad x \in \mathbb{S}_{\gamma_0}^{2,\mathrm{ext}}. \tag{234}$$

Subsequently, (regularization by) "downward continuation" yields the "0-scale" potential V^R_{γ0} in S^{2,ext}_R of the form

$$V_{\gamma_0}^{R}(x) \;=\; \sum_{k=1}^{N_R} w_k^{R}\, V_{\gamma_0}\!\Big(\frac{\gamma_0}{R}\, y_k^{R}\Big)\, K_R(x, y_k^{R}), \qquad x \in \mathbb{S}_{R}^{2,\mathrm{ext}},\; y_k^{R} \in \mathbb{S}_{R}^{2}, \tag{235}$$

hence, the restriction V^R_{γ0}|_Σ of the potential V^R_{γ0} represents the "0-scale" SGG-lowpass solution on the (real) Earth's surface Σ, where the weights w_k^R ∈ ℝ, k = 1, ..., N_R, and nodes y_k^R ∈ S²_R, k = 1, ..., N_R, in (235) can be taken from (226) or another appropriate cubature formula. It should be noted that, for reasons of comparability, we use the same cubature formula for all forthcoming "downward continued" n-scale SGG-lowpass solutions on the Runge (Bjerhammar) sphere S^{2,ext}_R.

Gaussian Sum Mollification An elementary calculation yields

$$\mathbf{k}_{\gamma_0}(x,y) \;=\; (\nabla_x\otimes\nabla_x)\, K_{\gamma_0}(x,y) \;=\; \left(\frac{\partial^2}{\partial x_i\,\partial x_j}\, K_{\gamma_0}(x,y)\right)_{i,j=1,2,3}, \tag{236}$$

where, for x ∈ Γ and y = (y₁, y₂, y₃)ᵀ ∈ S²_{γ0},

$$\frac{\partial^2}{\partial x_i^2}\, K_{\gamma_0}(x,y) \;=\; \frac{1}{4\pi\gamma_0}\left(\frac{2}{|x-y|^3} - \frac{12\, x_i\, (x_i-y_i)}{|x-y|^5} - \frac{3(|x|^2-\gamma_0^2)}{|x-y|^5} + \frac{15\, (x_i-y_i)^2\, (|x|^2-\gamma_0^2)}{|x-y|^7}\right) \tag{237}$$

and

$$\frac{\partial^2}{\partial x_i\,\partial x_j}\, K_{\gamma_0}(x,y) \;=\; -\frac{1}{4\pi\gamma_0}\left(\frac{6\, x_i\, (x_j-y_j)}{|x-y|^5} + \frac{6\, x_j\, (x_i-y_i)}{|x-y|^5} - \frac{15(|x|^2-\gamma_0^2)(x_j-y_j)(x_i-y_i)}{|x-y|^7}\right), \qquad i \neq j. \tag{238}$$
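The closed-form Hesse tensor (237), (238) can be validated numerically: since K_{γ0}(·, y) is harmonic in the exterior of S²_{γ0}, the trace of its Hesse tensor must vanish there, and a finite-difference Hessian provides an independent cross-check. The code below is a direct transcription of the formulas; the names and sample points are our own illustrative choices.

```python
import numpy as np

gamma0 = 1.0

def K(x, y):
    """Abel-Poisson kernel K_{gamma0}(x, y)."""
    return (np.dot(x, x) - gamma0**2) / (4.0 * np.pi * gamma0 * np.linalg.norm(x - y)**3)

def hesse_K(x, y):
    """Hesse tensor k_{gamma0}(x, y): diagonal entries from (237), off-diagonal from (238)."""
    d = x - y
    r = np.linalg.norm(d)
    f = np.dot(x, x) - gamma0**2
    H = np.empty((3, 3))
    for i in range(3):
        for j in range(3):
            if i == j:
                H[i, j] = (2.0 / r**3 - 12.0 * x[i] * d[i] / r**5
                           - 3.0 * f / r**5 + 15.0 * d[i]**2 * f / r**7) / (4.0 * np.pi * gamma0)
            else:
                H[i, j] = -(6.0 * x[i] * d[j] + 6.0 * x[j] * d[i]
                            - 15.0 * f * d[i] * d[j] / r**2) / (4.0 * np.pi * gamma0 * r**5)
    return H

x = np.array([0.3, -0.2, 1.6])    # sample "orbit" point, |x| > gamma0
y = np.array([0.6, 0.0, 0.8])     # node on the sphere, |y| = gamma0
H = hesse_K(x, y)

assert np.allclose(H, H.T)        # the Hesse tensor is symmetric
assert abs(np.trace(H)) < 1e-10   # harmonicity: Delta_x K_{gamma0}(x, y) = 0 for |x| > gamma0

# independent cross-check by second-order central finite differences of K
h, e = 1e-5, np.eye(3)
Hfd = np.array([[(K(x + h*e[i] + h*e[j], y) - K(x + h*e[i] - h*e[j], y)
                  - K(x - h*e[i] + h*e[j], y) + K(x - h*e[i] - h*e[j], y)) / (4*h*h)
                 for j in range(3)] for i in range(3)])
assert np.allclose(H, Hfd, atol=1e-4)
print("Hesse tensor of the Abel-Poisson kernel verified")
```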


In order to solve the linear system (233) numerically, we can take advantage of the decorrelation capability caused by the substitution of the kernel of a "monopole" (cf. (237), (238)) by a linear combination of Gaussians, i.e.,

$$\frac{1}{|x-y|} \;\approx\; \sum_{m=1}^{M} \omega_m\, e^{-\alpha_m |x-y|^2}, \qquad \delta \le |x-y|, \tag{239}$$

being valid for some sufficiently small δ > 0. The critical point concerning (239), however, is to find an a priori specification in our numerics for the coefficients α_m, ω_m, m = 1, ..., M, depending on the integer M (note that the use of Gaussians also allows choosing γ0 equal to inf_{x∈Γ}|x| in the initial step, since |x−y|^{−1} is bounded away from the singularity at x = y by an arbitrarily small constant δ > 0). Indeed, different approaches to attain suitable α_m, ω_m, m = 1, ..., M, for appropriate choices δ > 0 can be found in the literature (not discussed here in more detail):

• In [71], the approximation is attacked by a Newton-type optimization procedure.
• In [72], a Remez algorithm exploits specific properties of a certain error functional.
• Fast multipole methods (see, e.g., [16, 68, 70]) also provide tools of great numerical significance. The application of the fast multipole method also allows the treatment of noisy data by specifying parameter choices with and without prior knowledge of the noise level (cf. [70]).

The approach of the Geomathematics Group, Kaiserslautern, described in [14, 44], closely parallels the concepts presented in [10, 11]. This concept starts with an approximation obtained by the appropriate discretization of an integral expression of a monopole |x−y|^{−1}. Afterwards, in order to reduce the number M of terms of the Gaussian sum on the right side of (239), an algorithm based on Prony's method is applied. An advantage is that one is able to work with the one-dimensional function r ↦ r^{−1}, r ≥ δ > 0, δ sufficiently small:

$$\frac{1}{r} \;\approx\; \sum_{m=1}^{M} \omega_m\, e^{-\alpha_m r^2}. \tag{240}$$
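A minimal sketch of the first stage of this construction (without the subsequent Prony reduction step) discretizes the integral representation 1/r = (2/√π)∫₀^∞ e^{−r²t²} dt by the trapezoidal rule after the substitution t = e^s. All truncation and step-size parameters below are illustrative assumptions, not values from the cited works:

```python
import numpy as np

# 1/r = (2/sqrt(pi)) * int_0^infty exp(-r^2 t^2) dt; substituting t = exp(s) and
# truncating/discretizing with the trapezoidal rule yields a Gaussian sum as in (240)
h, s_min, s_max = 0.2, -14.0, 7.0                # illustrative truncation and step size
s = np.arange(s_min, s_max + h, h)
alpha = np.exp(2.0 * s)                           # exponents alpha_m
omega = (2.0 / np.sqrt(np.pi)) * h * np.exp(s)    # weights omega_m

def gaussian_sum(r):
    """Approximation of 1/r by sum_m omega_m exp(-alpha_m r^2), valid for r >= delta."""
    r = np.asarray(r, dtype=float)
    return np.exp(-np.outer(r * r, alpha)) @ omega

delta = 0.01
r = np.linspace(delta, 1.0, 200)
rel_err = np.max(np.abs(gaussian_sum(r) - 1.0 / r) * r)
assert rel_err < 1e-4
print(f"M = {alpha.size} Gaussians, max relative error on [{delta}, 1.0]: {rel_err:.1e}")
```

The resulting sum is far longer than necessary; this is precisely where the Prony-type reduction of [14, 44] (following [10, 11]) would compress the M terms while keeping the accuracy on [δ, ∞).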

Subsequent Steps: Discrete Abel–Poisson Lowpass and Bandpass Filtering The potential V|_{S^{2,ext}_{γ1}} admits the representation

$$V(x) \;=\; V\big|_{\mathbb{S}_{\gamma_1}^{2,\mathrm{ext}}}(x) \;=\; \int_{\mathbb{S}_{\gamma_1}^2} V(y)\, K_{\gamma_1}(x,y)\, dS(y), \qquad x \in \mathbb{S}_{\gamma_1}^{2,\mathrm{ext}}. \tag{241}$$

By approximate integration over the sphere S²_{γ1} we are able to deduce a cubature formula V_{γ1} of the integral representation (241) of V|_{S^{2,ext}_{γ1}} in the form

$$V(x) \;=\; V\big|_{\mathbb{S}_{\gamma_1}^{2,\mathrm{ext}}}(x) \;\approx\; V_{\gamma_1}(x) \;=\; \sum_{k=1}^{N_{\gamma_1}} b_k^{\gamma_1}\, K_{\gamma_1}(x, y_k^{\gamma_1}), \qquad x \in \mathbb{S}_{\gamma_1}^{2,\mathrm{ext}}, \tag{242}$$

with N_{γ1} > N_{γ0} and prescribed nodes y_k^{γ1} ∈ S²_{γ1}, k = 1, ..., N_{γ1}, such that the coefficients satisfy

$$b_k^{\gamma_1} := a_k^{\gamma_0}, \qquad k = 1, \ldots, N_{\gamma_0}. \tag{243}$$

Hence, we obtain by forming the Hesse operator

$$v(x) \;\approx\; v_{\gamma_1}(x) \;=\; (\nabla_x\otimes\nabla_x)V_{\gamma_1}(x) \;=\; \sum_{k=1}^{N_{\gamma_1}} b_k^{\gamma_1}\, \mathbf{k}_{\gamma_1}(x, y_k^{\gamma_1}), \qquad x \in \Gamma. \tag{244}$$

We consider the "wavelet potential" W_{γ0} given by

$$W_{\gamma_0} := V_{\gamma_1}\big|_{\mathbb{S}_{\gamma_0}^{2,\mathrm{ext}}} - V_{\gamma_0}. \tag{245}$$

Because of (243), W_{γ0} allows the discretization in the form

$$\begin{aligned} W_{\gamma_0}(x) \;&=\; \sum_{k=1}^{N_{\gamma_1}} b_k^{\gamma_1}\, K_{\gamma_1}(x, y_k^{\gamma_1}) \;-\; \sum_{k=1}^{N_{\gamma_0}} a_k^{\gamma_0}\, K_{\gamma_0}(x, y_k^{\gamma_0}) \\ &=\; \sum_{k=N_{\gamma_0}+1}^{N_{\gamma_1}} b_k^{\gamma_1}\, K_{\gamma_1}(x, y_k^{\gamma_1}) \;+\; \sum_{k=1}^{N_{\gamma_0}} a_k^{\gamma_0}\Big( K_{\gamma_1}(x, y_k^{\gamma_1}) - K_{\gamma_0}(x, y_k^{\gamma_0}) \Big), \qquad x \in \mathbb{S}_{\gamma_0}^{2,\mathrm{ext}}. \end{aligned} \tag{246}$$
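The rearrangement in (246) is a purely algebraic consequence of the coupling (243) and can be verified numerically with the scalar Abel–Poisson kernels; the radii, node counts, and (seeded) coefficients below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
gamma0, gamma1 = 1.10, 1.05        # decreasing radii towards the Runge sphere
N0, N1 = 8, 14                     # N_{gamma1} > N_{gamma0}

def nodes(gamma, N):
    """N pseudo-random nodes on the sphere of radius gamma."""
    Y = rng.normal(size=(N, 3))
    return gamma * Y / np.linalg.norm(Y, axis=1, keepdims=True)

def K(gamma, x, Y):
    """Scalar Abel-Poisson kernel values K_gamma(x, y_k) for all rows y_k of Y."""
    return (np.dot(x, x) - gamma**2) / (4.0 * np.pi * gamma
                                        * np.linalg.norm(x - Y, axis=1)**3)

Y0, Y1 = nodes(gamma0, N0), nodes(gamma1, N1)
a = rng.normal(size=N0)            # known scale-0 coefficients a_k^{gamma0}
b = rng.normal(size=N1)
b[:N0] = a                         # the coupling (243): b_k^{gamma1} = a_k^{gamma0}, k <= N0

x = np.array([0.2, -0.5, 1.3])     # any point in the exterior of S^2_{gamma0}

# first line of (246): difference of the two lowpass discretizations
W_diff = b @ K(gamma1, x, Y1) - a @ K(gamma0, x, Y0)
# second line of (246): new detail part plus kernel-difference part
W_split = (b[N0:] @ K(gamma1, x, Y1[N0:])
           + a @ (K(gamma1, x, Y1[:N0]) - K(gamma0, x, Y0)))
assert np.isclose(W_diff, W_split, atol=1e-12)
print("telescoping rearrangement (246) verified")
```

The practical point of the rearrangement is that only the detail coefficients b_k^{γ1}, k > N_{γ0}, remain unknown, which is what makes the linear system (248)/(249) below solvable scale by scale.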

We use the canonical notation w_{γ0} := (∇⊗∇)W_{γ0}. Then, in connection with (246), the SGG-equations

$$w_{\gamma_0}(x_l^{\gamma_1}) \;=\; \underbrace{v_{\gamma_1}(x_l^{\gamma_1})}_{\approx\, v(x_l^{\gamma_1})} \;-\; v_{\gamma_0}(x_l^{\gamma_1}), \qquad x_l^{\gamma_1} \in \Gamma,\; l = L_{\gamma_0}+1, \ldots, L_{\gamma_1}, \tag{247}$$

lead to the linear system

$$w_{\gamma_0}(x_l^{\gamma_1}) \;=\; \sum_{k=N_{\gamma_0}+1}^{N_{\gamma_1}} b_k^{\gamma_1}\, \mathbf{k}_{\gamma_1}(x_l^{\gamma_1}, y_k^{\gamma_1}) \;+\; \sum_{k=1}^{N_{\gamma_0}} a_k^{\gamma_0}\Big( \mathbf{k}_{\gamma_1}(x_l^{\gamma_1}, y_k^{\gamma_1}) - \mathbf{k}_{\gamma_0}(x_l^{\gamma_1}, y_k^{\gamma_0}) \Big) \tag{248}$$

in the unknowns b_k^{γ1}, k = N_{γ0}+1, ..., N_{γ1}. Equivalently, we obtain

$$\sum_{k=N_{\gamma_0}+1}^{N_{\gamma_1}} b_k^{\gamma_1}\, \mathbf{k}_{\gamma_1}(x_l^{\gamma_1}, y_k^{\gamma_1}) \;\approx\; v(x_l^{\gamma_1}) \;-\; \sum_{k=1}^{N_{\gamma_0}} a_k^{\gamma_0}\, \mathbf{k}_{\gamma_1}(x_l^{\gamma_1}, y_k^{\gamma_1}), \tag{249}$$

l = L_{γ0}+1, ..., L_{γ1}, where a_k^{γ0}, k = 1, ..., N_{γ0}, are known (note that (i) the points x_l^{γ1} ∈ Γ, l = L_{γ0}+1, ..., L_{γ1}, should be chosen in an appropriate way to guarantee an improvement provided by the bandpass potential W_{γ0} of scale 0 to the whole multiscale approach and (ii) the choice y_k^{γ1} = y_k^{γ0}, k = 1, ..., N_{γ0}, is not excluded).

Once the coefficients b_k^{γ1}, k = N_{γ0}+1, ..., N_{γ1}, are calculated from (249), the potential V_{γ1} is known from (242). The potential V_{γ1} may be approximately regarded as the sum of the lowpass potential V_{γ0} and the bandpass potential W_{γ0} of scale 0 in S^{2,ext}_{γ0}, leading to the "1-scale" SGG-lowpass solution in S^{2,ext}_{γ1}. "Downward continuation" yields the "1-scale" potential V^R_{γ1} in S^{2,ext}_R of the form

$$V_{\gamma_1}^{R}(x) := \sum_{k=1}^{N_R} w_k^{R}\, V_{\gamma_1}\!\Big(\frac{\gamma_1}{R}\, y_k^{R}\Big)\, K_R(x, y_k^{R}), \qquad x \in \mathbb{S}_{R}^{2,\mathrm{ext}},\; y_k^{R} \in \mathbb{S}_{R}^{2}, \tag{250}$$

so that the restriction V^R_{γ1}|_Σ of the potential V^R_{γ1} represents the "1-scale" SGG-lowpass solution on the (real) Earth's surface Σ (as pointed out earlier, for reasons of comparability, we use the same cubature formula for the "downward continued" 1-scale SGG-lowpass solution V^R_{γ1} as for the "downward continued" 0-scale SGG-lowpass solution V^R_{γ0} on S^{2,ext}_R).

Now, generally, assume that "downward continued" k-scale SGG-lowpass solutions V^R_{γk}, k = 1, ..., n, are calculated on S^{2,ext}_R in the indicated way. Our purpose is to determine V^R_{γ_{n+1}}: The potential V|_{S^{2,ext}_{γ_{n+1}}} admits the representation

$$V(x) \;=\; V\big|_{\mathbb{S}_{\gamma_{n+1}}^{2,\mathrm{ext}}}(x) \;=\; \int_{\mathbb{S}_{\gamma_{n+1}}^2} V(y)\, K_{\gamma_{n+1}}(x,y)\, dS(y), \qquad x \in \mathbb{S}_{\gamma_{n+1}}^{2,\mathrm{ext}}. \tag{251}$$

Therefore, we obtain by applying the Hesse operator

$$v(x) \;\approx\; v_{\gamma_{n+1}}(x) \;=\; (\nabla_x\otimes\nabla_x)V_{\gamma_{n+1}}(x) \;=\; \sum_{k=1}^{N_{\gamma_{n+1}}} b_k^{\gamma_{n+1}}\, \mathbf{k}_{\gamma_{n+1}}(x, y_k^{\gamma_{n+1}}), \qquad x \in \Gamma. \tag{252}$$

By approximate integration over the sphere S²_{γ_{n+1}} we are able to deduce a cubature formula V_{γ_{n+1}} of the integral representation (251) of V|_{S^{2,ext}_{γ_{n+1}}} in the form

$$V_{\gamma_{n+1}}(x) \;=\; \sum_{k=1}^{N_{\gamma_{n+1}}} b_k^{\gamma_{n+1}}\, K_{\gamma_{n+1}}(x, y_k^{\gamma_{n+1}}), \qquad x \in \mathbb{S}_{\gamma_{n+1}}^{2,\mathrm{ext}}, \tag{253}$$

with N_{γ_{n+1}} > N_{γn} and given nodes y_k^{γ_{n+1}} ∈ S²_{γ_{n+1}}, k = 1, ..., N_{γ_{n+1}}, such that

$$b_k^{\gamma_{n+1}} \;=\; b_k^{\gamma_n}, \qquad k = 1, \ldots, N_{\gamma_n}. \tag{254}$$

We consider the "wavelet potential" W_{γn} given by

$$W_{\gamma_n} := V_{\gamma_{n+1}}\big|_{\mathbb{S}_{\gamma_n}^{2,\mathrm{ext}}} - V_{\gamma_n}. \tag{255}$$

Because of (254), W_{γn} allows the discretization in the form

$$\begin{aligned} W_{\gamma_n}(x) \;&=\; \sum_{k=1}^{N_{\gamma_{n+1}}} b_k^{\gamma_{n+1}}\, K_{\gamma_{n+1}}(x, y_k^{\gamma_{n+1}}) \;-\; \sum_{k=1}^{N_{\gamma_n}} b_k^{\gamma_n}\, K_{\gamma_n}(x, y_k^{\gamma_n}) \\ &=\; \sum_{k=N_{\gamma_n}+1}^{N_{\gamma_{n+1}}} b_k^{\gamma_{n+1}}\, K_{\gamma_{n+1}}(x, y_k^{\gamma_{n+1}}) \;+\; \sum_{k=1}^{N_{\gamma_n}} b_k^{\gamma_n}\Big( K_{\gamma_{n+1}}(x, y_k^{\gamma_{n+1}}) - K_{\gamma_n}(x, y_k^{\gamma_n}) \Big), \qquad x \in \mathbb{S}_{\gamma_n}^{2,\mathrm{ext}}, \end{aligned} \tag{256}$$

where the coefficients b_k^{γn}, k = 1, ..., N_{γn}, are known. By definition we set

$$w_{\gamma_n} := (\nabla\otimes\nabla)W_{\gamma_n}. \tag{257}$$

Then, in connection with (256), the SGG-equations

$$w_{\gamma_n}(x_l^{\gamma_{n+1}}) \;\approx\; v(x_l^{\gamma_{n+1}}) - v_{\gamma_n}(x_l^{\gamma_{n+1}}), \qquad x_l^{\gamma_{n+1}} \in \Gamma,\; l = L_{\gamma_n}+1, \ldots, L_{\gamma_{n+1}}, \tag{258}$$

lead to the linear system

$$\begin{aligned} w_{\gamma_n}(x_l^{\gamma_{n+1}}) \;&=\; v_{\gamma_{n+1}}(x_l^{\gamma_{n+1}}) - v_{\gamma_n}(x_l^{\gamma_{n+1}}) \\ &=\; \sum_{k=N_{\gamma_n}+1}^{N_{\gamma_{n+1}}} b_k^{\gamma_{n+1}}\, \mathbf{k}_{\gamma_{n+1}}(x_l^{\gamma_{n+1}}, y_k^{\gamma_{n+1}}) \;+\; \sum_{k=1}^{N_{\gamma_n}} b_k^{\gamma_n}\Big( \mathbf{k}_{\gamma_{n+1}}(x_l^{\gamma_{n+1}}, y_k^{\gamma_{n+1}}) - \mathbf{k}_{\gamma_n}(x_l^{\gamma_{n+1}}, y_k^{\gamma_n}) \Big) \end{aligned} \tag{259}$$

in the unknowns b_k^{γ_{n+1}}, k = N_{γn}+1, ..., N_{γ_{n+1}}, i.e.,

$$\sum_{k=N_{\gamma_n}+1}^{N_{\gamma_{n+1}}} b_k^{\gamma_{n+1}}\, \mathbf{k}_{\gamma_{n+1}}(x_l^{\gamma_{n+1}}, y_k^{\gamma_{n+1}}) \;\approx\; v(x_l^{\gamma_{n+1}}) \;-\; \sum_{k=1}^{N_{\gamma_n}} b_k^{\gamma_n}\, \mathbf{k}_{\gamma_{n+1}}(x_l^{\gamma_{n+1}}, y_k^{\gamma_{n+1}}), \tag{260}$$

l = L_{γn}+1, ..., L_{γ_{n+1}} (again, the points x_l^{γ_{n+1}} ∈ Γ, l = L_{γn}+1, ..., L_{γ_{n+1}}, should be chosen in an appropriate way to guarantee an improvement by the bandpass potential W_{γn}). Once the coefficients b_k^{γ_{n+1}}, k = N_{γn}+1, ..., N_{γ_{n+1}}, are calculated from (260), the potential V_{γ_{n+1}} is known from (253). The potential V_{γ_{n+1}} may be regarded as the sum of the lowpass potential V_{γn} and the bandpass potential W_{γn} of scale n in S^{2,ext}_{γn}, and V_{γ_{n+1}} represents the "(n+1)-scale" SGG-lowpass solution in S^{2,ext}_{γ_{n+1}}. "Downward continuation" leads to the "(n+1)-scale" potential V^R_{γ_{n+1}} in S^{2,ext}_R of the form

$$V_{\gamma_{n+1}}^{R}(x) := \sum_{k=1}^{N_R} w_k^{R}\, V_{\gamma_{n+1}}\!\Big(\frac{\gamma_{n+1}}{R}\, y_k^{R}\Big)\, K_R(x, y_k^{R}), \qquad x \in \mathbb{S}_{R}^{2,\mathrm{ext}},\; y_k^{R} \in \mathbb{S}_{R}^{2}, \tag{261}$$

so that the restriction V^R_{γ_{n+1}}|_Σ of V^R_{γ_{n+1}} represents the "(n+1)-scale" SGG-lowpass solution on the (real) Earth's surface Σ.

Continuing our wavelet process we obtain "downward continued" k-scale SGG-lowpass solutions V^R_{γk}, k = 0, 1, ..., calculated on S^{2,ext}_R, hence available on the real Earth's surface Σ and in its exterior space just by restriction. All in all, we successively obtain lowpass and bandpass SGG-discretizations of the Earth's gravitational potential from a tensorial (error-free) SGG-dataset, thereby basing our multiscale procedure exclusively on operations in space domain (note that the error-affected case can be handled scale-by-scale by obvious smoothing and/or adjustment manipulations, as already mentioned above).
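The scale-by-scale bookkeeping of this procedure can be summarized in a schematic toy computation: successive lowpass approximations and their bandpass differences telescope, so the finest lowpass solution is recovered as the initial lowpass plus the accumulated wavelet details (cf. (245), (255)). The 1D moving-average smoother below merely stands in for the Abel–Poisson lowpass filtering; it is not the method of the text:

```python
import numpy as np

rng = np.random.default_rng(1)
signal = rng.normal(size=256)               # stands in for the sampled potential

def lowpass(f, scale):
    """Moving-average smoother standing in for the Abel-Poisson lowpass V_{gamma_n};
    wider windows (small scale index) smooth more, mimicking larger radii gamma_n."""
    width = max(1, 2 ** (6 - scale))
    return np.convolve(f, np.ones(width) / width, mode="same")

V = [lowpass(signal, n) for n in range(7)]  # lowpass approximations, scales 0..6
W = [V[n + 1] - V[n] for n in range(6)]     # bandpass details, cf. (245) and (255)

# telescoping: finest lowpass = initial lowpass + sum of all bandpass details
assert np.allclose(V[0] + sum(W), V[6])
print("multiscale telescoping V_0 + sum_n W_n = V_6 verified")
```

In the SGG setting each detail W_{γn} additionally requires solving the ill-posed system (260), which is why a stopping scale must be chosen, as discussed in the Conclusion below.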

6 Conclusion

The great advantage of SGG is that gradiometer data are globally available in homogeneous quality and high density. These facts justify an intense study of SGG from a geodetic as well as a mathematical point of view. Although an impressive part of the Earth's gravitational potential can be detected globally at the orbit of a satellite (like GOCE), the intrinsic drawback of satellite techniques in geoscientific research is that the measurements are performed at a certain altitude. Consequently, in satellite gravitational gradiometry, only the Hesse tensor of an Abel–Poisson "filtered version of the gravitational potential on the Earth's surface" is available by measurements, and a loss compared to the gravitational field information detectable on the Earth's surface is unavoidable. So it naturally follows that a "downward continuation" process involving adapted regularization procedures must be applied in order to handle the filtered "portion of the signal" that is available from the Earth's potential by taking gradiometer measurements on the orbit. In this respect, multiscale techniques using regularizing wavelets as constructed in this contribution indeed represent an outstanding methodology, due to their particular ability to exhaust all specific features of the "portion of the signal" by a "zooming-in" process. In fact, different powerful techniques for regularization are at the disposal of the geodetic analyst in frequency as well as space domain, from globally via regionally up to locally reflected scales. All these facts lead to the conclusion that multiresolution methods are superior to any other SGG-solution strategy. As an immediate consequence, this work on multiscale regularization may be rated without any doubt as a well-promising and far-reaching methodology for future SGG-research. However, there is also no doubt that each method in approximation theory and numerics has its own aim and character.

In fact, it is the essence of any numerical realization that it becomes optimal only with respect to specially specified criteria. For example, Fourier expansion methods with polynomial trial functions (such as spherical harmonics) offer the canonical "trend-approximation" of low-frequency phenomena (for global modeling), and they provide an excellent control and comparison of the spectral properties of the signal, since any spherical harmonic relates to one "frequency" (i.e., the degree of the polynomial). This is of tremendous advantage for relating data types under spectral aspects. Nonetheless, it comes at the expense that the polynomials are globally supported, so that local modeling usually results in serious problems of economy, efficiency, and stability. Bandlimited kernels can be used for the transition from long-wavelength to short-wavelength phenomena (global to local modeling) in the signal. Because of their excellent localization properties in the space domain, non-bandlimited kernels can be used for the modeling of short-wavelength phenomena. Local modeling is effective and economic because of the decorrelation ability, but it must be emphasized that the information obtainable by kernel approximations such as wavelets is clustered in frequency bands, so that spectral investigations are much more laborious and time-consuming. All in all, for the numerical work to be done in constructive approximation, we have


to make an a priori choice as to which feature should be pointed out and preferably handled. In particular, we have to reflect the different stages of space/frequency localization so that the modeling process can be performed under the localization requirements necessary and sufficient for relevant geodetic interpretation. A positive decision in one direction often amounts to a negative implication in another direction. Ultimately, because of the uncertainty principle, it is impossible to provide a "cure-all methodology". This "sine qua non" ingredient in any mathematical approximation method is also the reason why we are indispensably led to two essential calamities in the "step by step" SGG-approximation by scaling and wavelet potentials as proposed here:

• Similarly to the collocational spline and smoothing theory (see, e.g., [32, 33, 120, 126, 131]), the choice of the regularization scaling function is an important problem in SGG. Mathematically, all wavelet regularization strategies are equivalent; however, the right computational compromise between mathematical rigor and geodetic relevance with respect to the geometry of the orbit, data width, accuracy of the data, and occurring noise level and characteristics is a task for future research, which should not be underestimated.
• From a mathematical point of view, we are not confronted with a multiscale solution of a well-posed boundary value problem of elliptic partial differential equations, for which each subsequent detail information guarantees an improvement, coming closer and closer to the solution. Boundary value problems (see, e.g., [2, 50]) do not require any stopping strategy of the multiscale process, since the boundary data are (generally) not given only as a "portion of the original signal" in filtered form.
An algorithm establishing an approximate solution for the inverse SGG-problem, however, has to take into account the requirement to stop at the right level of approximation in order to model appropriately the Abel–Poisson filter-nature of the measured data. Unfortunately, today, the amount of amplification for the gravitational potential is not yet suitably known on the orbit as an a priori state. As a consequence, a missing stopping strategy in the multiresolution regularization caused by manual input usually produces huge errors in the potential at the Earth's surface even from extremely small errors in the measurements. Thus there is a strong need for a geodetically relevant and mathematically motivated SGG-stopping strategy in the near future.

In conclusion, a loss of information in SGG-modeling is unavoidable. An algorithm establishing an approximate solution for the inverse SGG-problem has to reflect as well as possible the intention of the geoscientific applicant. Geodetic a priori information today for the characterization of the right scaling and wavelet potentials for the SGG-solution process, as well as a mathematically validated stopping strategy in multiscale regularization, is an important challenge for future work.

Acknowledgements W. Freeden and H. Nutz thank the "Federal Ministry for Economic Affairs and Energy, Berlin" and the "Project Management Jülich" for funding the project


“SPE” (funding reference number 0324016, CBM – Gesellschaft für Consulting, Business und Management mbH, Bexbach, Germany) on gravimetric potential methods in geothermal exploration.

Literature

1. Abalakine, V., Balmino, G., Lambeck, K., Moritz, H., Mulholland, J.D., Tozer, F.: La Géodynamique Spatiale. Summer School Lecture Notes, Centre Nationale D'Etudes Spatiales, 20.8–13.9, Lannion (1974)
2. Abeyratne, M.K., Freeden, W., Mayer, C.: Multiscale deformation analysis by Cauchy-Navier wavelets. J. Appl. Math. 12, 605–645 (2003)
3. Augustin, M., Freeden, W., Nutz, H.: About the importance of the Runge-Walsh concept for gravitational field determination. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Geosystems Mathematics, pp. 517–560. Birkhäuser, Basel (2018)
4. Backus, G.E.: Converting vector and tensor equations to scalar equations in spherical coordinates. Geophys. J. R. Astron. Soc. 13, 61–101 (1967)
5. Backus, G.E., Gilbert, F.: The resolving power of gross Earth data. Geophys. J. R. Astron. Soc. 16, 169–205 (1968)
6. Backus, G.E., Gilbert, F.: Uniqueness in the inversion of inaccurate gross Earth data. Philos. Trans. R. Soc. Lond. A 266, 123–192 (1970)
7. Balmino, G., Barlier, F., Bernard, A., Bouzat, C., Rummel, R., Touboul, P.: Proposal for a Satellite Gravity Gradiometer Experiment for the Geosciences, 168pp., Toulouse Cedex, preprint (1985)
8. Bernard, A., Canny, J.P., Juillerat, R., Touboul, P.: Electrostatic suspension of samples in microgravity. Acta Astron. 12(7–8), 469–646 (1985)
9. Beutler, G.B., Drinkwater, M.R., Rummel, R., von Steiger, R.: Earth gravity field from space – from sensors to Earth sciences. In: The Space Sciences Series of ISSI, vol. 18, pp. 419–432. Kluwer, Dordrecht (2003)
10. Beylkin, G., Monzón, L.: On approximation of functions by exponential sums. Appl. Comput. Harmon. Anal. 19, 17–48 (2005)
11. Beylkin, G., Monzón, L.: Approximation of functions by exponential sums revisited. Appl. Comput. Harmon. Anal. 28, 131–149 (2010)
12. Bjerhammar, A.: Gravity Reduction to an Internal Sphere. Division of Geodesy, Stockholm (1962)
13. Buchar, E.: Motion of the nodal line of the second Russian Earth satellite (1957) and flattening of the Earth. Nature 182, 198–199 (1958)
14. Burschäpers, H.C.: Local modeling of gravitational data. Master Thesis, Geomathematics Group, Mathematics Department, University of Kaiserslautern (2013)
15. Carroll, J.J., Savet, P.H.: Gravity difference detection. Aerosp. Eng. 18, 44–47 (1959)
16. Cheng, H., Greengard, L., Rokhlin, V.: A fast adaptive multipole algorithm in three dimensions. J. Comput. Phys. 155, 468–498 (1999)
17. Dickey, J.O. (ed.): Satellite gravity and the geosphere. Committee of Earth Gravity from Space. National Research Council Report, 112pp., Nat. Acad., Washington, D.C. (1977)
18. Diesel, J.W.: A new approach to gravitational gradient determination of the vertical. AIAA J. 2(7), 1189–1196 (1964)
19. Eggermont, P.N., LaRiccia, V., Nashed, M.Z.: Noise models for ill-posed problems. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn., pp. 1633–1658. Springer, New York (2014)
20. Engl, H., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Kluwer, Dordrecht/Boston/London (1996)
21. Engl, H., Louis, A.K., Rundell, W.: Inverse Problems in Geophysical Applications. SIAM, Philadelphia (1997)
22. ESA: Space oceanography, navigation and geodynamics. In: Proceedings of a European Workshop, ESA SP-137, Schloss Elmau (1978)
23. ESA: Gravity Field and Steady-State Ocean Circulation Mission (GOCE), ESTEC, Noordwijk, ESA SP-1233(1) (1999)
24. Fehlinger, T.: Multiscale formulations for the disturbing potential and the deflections of the vertical in locally reflected physical geodesy. Ph.D. Thesis, Geomathematics Group, University of Kaiserslautern (2009)
25. Fengler, M., Freeden, W., Gutting, M.: The spherical Bernstein wavelets. Int. J. Pure Appl. Math. 31, 209–230 (2006)
26. Fehlinger, T., Freeden, W., Gramsch, S., Mayer, C., Michel, D., Schreiner, M.: Local modelling of sea surface topography from (geostrophic) ocean flow. ZAMM 87, 775–791 (2007)
27. Forward, R.L.: Geodesy with orbiting gravity gradiometers. Use of Artificial Satellites for Geodesy, AGU Monograph No. 15, U.S. Government Printing Office, Washington, D.C. (1972)
28. Forward, R.L., Miller, R.: Generation and detection of dynamic gravitational-gradient fields. J. Appl. Phys. 38, 512 (1967). https://doi.org/10.1063/1.1709366
29. Fowler, C.M.R.: The Solid Earth, 2nd edn. Cambridge University Press, Cambridge (2005)
30. Freeden, W.: On the approximation of the external gravitational potential with closed systems of (trial) functions. Bull. Géod. 54, 1–20 (1980)
31. Freeden, W.: On spherical spline interpolation and approximation. Math. Methods Appl. Sci. 3, 551–575 (1981)
32. Freeden, W.: On approximation by harmonic splines. Manuscr. Geod. 6, 193–244 (1981)
33. Freeden, W.: A spline interpolation method for solving boundary value problems of potential theory from discretely given data. Math. Part. Diff. Equ. 3, 375–398 (1987)
34. Freeden, W.: The uncertainty principle and its role in physical geodesy. In: Freeden, W. (ed.) Progress in Geodetic Science (Geodetic Week, Kaiserslautern, 1998), pp. 225–236. Shaker, Aachen (1998)
35. Freeden, W.: Multiscale Modelling of Spaceborne Geodata. B.G. Teubner, Leipzig (1999)
36. Freeden, W., Gerhards, C.: Geomathematically Oriented Potential Theory. Chapman & Hall/CRC Press, Boca Raton (2013)
37. Freeden, W., Gutting, M.: On the completeness and closure of vector and tensor spherical harmonics. Integral Transforms Spec. Funct. 19, 713–734 (2008)
38. Freeden, W., Gutting, M.: Special Functions of Mathematical (Geo)Physics. Birkhäuser, Basel (2013)
39. Freeden, W., Gutting, M.: Integration and Cubature Methods. CRC Press/Taylor & Francis Group, Boca Raton (2017)
40. Freeden, W., Maier, T.: On multiscale denoising of spherical functions: basic theory and numerical aspects. Electron. Trans. Numer. Anal. (ETNA) 14, 40–62 (2002)
41. Freeden, W., Maier, T.: Spectral and multiscale signal-to-noise thresholding of spherical vector fields. Comput. Geosci. 7(3), 215–250 (2003)
42. Freeden, W., Michel, V.: Multiscale Potential Theory (With Applications to Geoscience). Birkhäuser, Boston/Basel/Berlin (2004)
43. Freeden, W., Michel, V.: Orthogonal zonal, tesseral, and sectorial wavelets on the sphere for the analysis of satellite data. Adv. Comput. Math. 21, 187–217 (2004)
44. Freeden, W., Nashed, M.Z.: Inverse gravimetry: background material and multiscale mollifier approaches. GEM Int. J. Geomath. 9, 199–264 (2018)
45. Freeden, W., Nashed, M.Z.: Ill-posed problems: operator methodologies of resolution and regularization. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy: Functional Analytic and Potential Methods, pp. 201–314. Birkhäuser/Springer International Publishing, Basel/New York/Heidelberg (2018)
46. Freeden, W., Nutz, H.: Satellite gravity gradiometry as tensorial inverse problem. GEM Int. J. Geomath. 2, 177–218 (2011)
47. Freeden, W., Nutz, H.: Geodetic observables and their mathematical treatment in multiscale framework. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy: Functional Analytic and Potential Methods, pp. 315–458. Birkhäuser/Springer International Publishing, Basel/New York/Heidelberg (2018)
48. Freeden, W., Sansó, F.: Geodesy and mathematics: interactions, acquisitions, and open problems. International Association of Geodesy Symposia (IAGS), Springer Nature Switzerland (2019). https://doi.org/10.1007/1345_2019_87
49. Freeden, W., Schneider, F.: Regularization wavelets and multiresolution. Inverse Probl. 14, 493–515 (1998)
50. Freeden, W., Schneider, F.: Wavelet approximation on closed surfaces and their application to boundary value problems of potential theory. Math. Methods Appl. Sci. 21, 129–163 (1998)
51. Freeden, W., Schreiner, M.: Satellite gradiometry – from mathematical and numerical point of view. In: Rummel, R., Schwintzer, P. (eds.) A Major STEP for Geodesy, Report 1994 of the STEP Geodesy Working Group, Munich/Potsdam, pp. 35–44 (1994)
52. Freeden, W., Schreiner, M.: Spherical Functions of Mathematical Geosciences. A Scalar, Vectorial, and Tensorial Setup. Springer, Berlin/Heidelberg (2009)
53. Freeden, W., Schreiner, M.: Satellite gravity gradiometry (SGG): from scalar to tensorial solution. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn., pp. 339–380. Springer, New York/Heidelberg (2015)
54. Freeden, W., Windheuser, U.: Combined spherical harmonic and wavelet expansion. Appl. Comput. Harmon. Anal. (ACHA) 4, 1–37 (1997)
55. Freeden, W., Witte, B.: A combined (spline-)interpolation and smoothing method for the determination of the gravitational potential from heterogeneous data. Bull. Géod. 56, 53–62 (1982)
56. Freeden, W., Wolf, K.: Klassische Erdschwerefeldbestimmung aus der Sicht moderner Geomathematik. Math. Semesterberichte 56, 53–77 (2008)
57. Freeden, W., Schneider, F., Schreiner, M.: Gradiometry – an inverse problem in modern satellite geodesy. In: Engl, H.W., Louis, A., Rundell, W. (eds.) GAMM-SIAM Symposium on Inverse Problems: Geophysical Applications, pp. 179–239 (1997)
58. Freeden, W., Gervens, T., Schreiner, M.: Constructive Approximation on the Sphere (With Applications to Geomathematics). Oxford Science Publications/Clarendon Press, Oxford (1998)
59. Freeden, W., Michel, V., Nutz, H.: Satellite-to-satellite tracking and satellite gravity gradiometry (advanced techniques for high-resolution geopotential field determination). J. Eng. Math. 43, 19–56 (2002)
60. Freeden, W., Michel, D., Michel, V.: Local multiscale approximation of geostrophic oceanic flow: theoretical background and aspects of scientific computing. Mar. Geod. 28, 313–329 (2005)
61. Freeden, W., Fehlinger, T., Klug, M., Mathar, D., Wolf, K.: Classical globally reflected gravity field determination in modern locally oriented multiscale framework. J. Geod. 83, 1171–1191 (2009)
62. Freeden, W., Michel, V., Simons, F.J.: Spherical harmonics based special function systems and constructive approximation methods. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy: Functional Analytic and Potential Methods, pp. 201–314. Birkhäuser/Springer International Publishing, Basel/New York/Heidelberg (2018)
63. Freeden, W., Nashed, M.Z., Schreiner, M.: Spherical Sampling. Geosystems Mathematics. Springer International Publishing, Basel/New York/Heidelberg (2018)
64. Gerhards, C.: Spherical multiscale methods in terms of locally supported wavelets: theory and application to geomagnetic modeling. Ph.D. Thesis, Geomathematics Group, University of Kaiserslautern (2011)
65. Gill, A.E.: Atmosphere – Ocean Dynamics. Academic Press, New York (1982)
66. Glockner, O.: On numerical aspects of gravitational field modelling from SST and SGG by harmonic splines and wavelets (with application to CHAMP data). Ph.D. Thesis, Geomathematics Group, University of Kaiserslautern. Shaker, Aachen (2002)
67. Grafarend, E.W.: The reference figure of the rotating Earth in geometry and gravity space and an attempt to generalize the celebrated Runge-Walsh approximation theorem for irregular surfaces. GEM Int. J. Geomath. 6, 101–140 (2015)

19 Satellite Gravitational Gradiometry: Methodological Foundation and . . .


68. Greengard, L., Rokhlin, V.: A new version of the fast multipole method for the Laplace equation in three dimensions. Acta Numer. 6, 229–269 (1997) 69. Gutting, M.: Fast multipole methods for oblique derivative problems. Ph.D.-Thesis, University of Kaiserslautern, Geomathematics Group. Shaker, Aachen (2008) 70. Gutting, M.: Parameter choices for fast harmonic spline approximation. In: Freeden, W., Nashed, M.Z. (eds.) Handbook of Mathematical Geodesy. Functional Analytic and Potential Methods, pp. 605–639. Birkhäuser/Springer International Publishing, Basel/New York/Heidelberg (2018) 71. Hackbusch, W.: Entwicklungen nach Exponentialsummen. Technical Report. Max-PlanckInstitut für Mahematik in den Naturwissenschaften, Leipzig (2010) 72. Hackbusch, W., Khoromoskij, B.N., Klaus, A.: Approximation of functions by exponential sums based on the Newton-type optimisation. Technical Report, Max-Planck-Institut für Mathematik in den Naturwissenschaften, Leipzig (2005) 73. Hadamard, J.: Sur les problèmes aux dérivés partielles et leur signification physique. Princeton Univ. Bull. 13, 49–52 (1902) 74. Hager, B.H., Richards, M.A.: Long-wavelength variations in Earth’s geoid: physical models and dynamical implications. Phil. Trans. R. Soc. Lond. A 328, 309–327 (1989) 75. Heiskanen, W.A., Moritz, H.: Physical Geodesy. Freeman, San Francisco (1967) 76. Hendriksen, S.W. (ed.): National Geodetic Satellite Program. Part 1 and 2, NASA Washington DC (1977) 77. Hesse, K.: Domain decomposition methods in multiscale geopotential determination from SST and SGG. Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern. Shaker, Aachen (2003) 78. Hotine, M.: Mathematical Geodesy. ESSA Monographs, 2, U.S. Department of Commerce, Washington, D.C., Rockville (1985) 79. Kaula, W.M. (ed.): The Terrestrial Environment – Solid Earth and Ocean Physics Application of Space and Astronomic Techniques. Report of a study at Williamstown/Mass. to NASA, Cambridge, MA (1969) 80. 
Kellogg, O.D.: Foundations of Potential Theory. Frederick Ungar Publishing Company, New York (1929) 81. Kirsch, A.: Introduction to the Mathematical Theory of Inverse Problems. Springer, New York (1996) 82. Krarup, T.: A Contribution to the mathematical foundation of physical geodesy. Meddelelse No. 44, Geodätisk Inst Köbenhavn (1969) 83. Listing, J.B.: Über unsere jetzige Kenntniss der Gestalt und Grösse der Erde. Nachrichten von der Königl. Gesellschaft der Wissenschaften und der G. A. Universität zu Göttingen 3, 33–98 (1873) 84. Louis, A.K.: Inverse und schlecht gestellte Probleme, Teubner, Leipzig (1989) 85. Marussi, A.: The tidal field of a planet and the related intrinsic reference systems. Geophys. J.R. Astr. Soc. 56, 409–417 (1979) 86. Marussi, A.: Microgravitation in space. Geophys. J. R. Astron. Soc. 76, 691–695 (1984) 87. Marussi, A., Chiaruttini, C.: The motion of a free particle and of a spherical satellite in a circular orbit in a central field. In: Marussi, A. (ed.) Intrinsic Geodesy, pp. 179–189. Springer, Berlin (1985) 88. Meissl, P.: A study of covariance functions related to the Earth’s disturbing potential. Department of Geodetic Science, No. 151, The Ohio State University, Columbus (1971) 89. Merson, R.H., King-Hele, D.G.: Use of artificial satellites to explore the Earth’s gravitational field: results from Sputnik 2 (1957). Nature 182, 640–641 (1958) 90. Michel, V.: A multiscale method for the gravimetry problem: theoretical and numerical aspects of harmonic and anharmonic modelling. Ph.D.-Thesis, University of Kaiserslautern, Geomathematics Group. Shaker, Aachen (1999) 91. Michel, V.: Lectures on Constructive Approximation – Fourier, Spline, and Wavelet Methods on the Real Line, the Sphere, and the Ball. Birkhäuser, Boston (2013) 92. Moritz, H.: Kinematic Geodesy. Deutsche Geodätische Kommission A-59, München (1968)


93. Moritz, H.: Advanced least squares estimation. Department of Geodetic Science, No. 175, The Ohio State University, Columbus (1972) 94. Moritz, H.: Recent developments in the geodetic boundary value problem. Department of Geodetic Science, No. 266, The Ohio State University, Columbus (1977) 95. Moritz, H.: Inertia and gravitation in geodesy. In: Proceedings of 3rd International Symposium Inertial Technology for Surveying and Geodesy, Banff, vol. I (1986) 96. Moritz, H.: Advanced Physical Geodesy, 2nd edn. Wichmann Verlag, Karlsruhe (1989) 97. Moritz, H.: Classical physical geodesy. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn., pp. 253–289. Springer, New York (2015) 98. Müller, C.: Spherical Harmonics. Lecture Notes in Mathematics, vol. 17. Springer, Berlin (1966) 99. NASA: Earth and Ocean Physics Applications Program. Volume II, Rationale and Program Plans (1972) 100. Nashed, M.Z.: Generalized inverses and applications. Proceedings of an Advanced Seminar Conducted by the Mathematics Research Center, University of Wisconsin, Madison (1976) 101. Nashed, M.Z.: Operator–theoretic and computational approaches to ill–posed problems with applications to antenna theory. IEEE Trans. Antennas Propag. 29(2), 220–231 (1981) 102. Nashed, M.Z.: A new approach to classification and regularization of ill-posed operator equations. In: Engl, H.W., Groetsch, C.W. (eds.) Inverse and Ill–Posed Problems. Notes and Reports in Mathematics in Science and Engineering, vol. 4, pp. 53–75. Academic Press, New York (1987) 103. Nashed, M.Z.: Inner, outer, and generalized inverses in Banach and Hilbert spaces. Numer. Funct. Anal. Optim. 9, 261–325 (1987) 104. NRC: Committee on Geodesy: Applications of a Dedicated Gravitational Satellite Mission. National Academy Press, Washington, D.C. (1979) 105. Nutz, H.: A unified setup of gravitational field observables. Ph.D.-Thesis, University of Kaiserslautern, Geomathematics Group. 
Shaker, Aachen (2002) 106. Nutz, H., Wolf, K.: Multiresolution analysis of hydrology and satellite gravitational data. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, vol. 1, 2nd edn., pp. 497–518. Springer, Berlin/Heidelberg (2015) 107. Pavlis, N.K., Holmes, S.A., Kenyon, S.C., John, K., Factor, J.K.: The development and evaluation of the Earth gravitational model 2008 (EGM2008). J. Geophys. Res. Solid Earth (1978–2012) 117(B4), 04406 (2012) 108. Rieder, A.: Keine Probleme mit Inversen Problemen. Vieweg, Wiesbaden (2003) 109. Rummel, R.: Determination of short-wavelength components of the gravity field by satelliteto-satellite tracking or satellite gradiometry: an attempt to an identification of problem areas. Manuscr. Geod. 4, 107–148 (1979) 110. Rummel, R.: Satellite gradiometry. In: Sünkel H. (ed.) Mathematical and Numerical Techniques in Physical Geodesy. Lecture Notes in Earth Sciences, vol. 7, pp. 317–363. Springer, Berlin (1986). https://doi.org/10.1007/BFb0010135 111. Rummel, R.: ARISTOTELES: surface gravity from space gradiometry. In: Proceedings of the Italian Workshop on the European Solid–Earth mission ARISTOTELES. Aeritalia, Trevi (1989) 112. Rummel, R.: Spherical spectral properties of the Earth’s gravitational potential and its first and second derivatives. In: Rummel, R., Sansò, F. (eds.) Lecture Notes in Earth Science, vol. 65, pp. 359–404. Springer, Berlin (1997) 113. Rummel, R.: GOCE: gravitational gradiometry in a satellite. In: Freeden, W., Nashed, M.Z., Sonar, T. (eds.) Handbook of Geomathematics, 2nd edn., pp. 211–226. Springer, New York/Heidelberg (2015) 114. Rummel, R., van Gelderen, M.: Spectral analysis of the full gravity tensor. Geophys. J. Int. 111, 159–169 (1992) 115. Rummel, R., van Gelderen, M.: Meissl scheme – spectral characteristics of physical geodesy. Manuscr. Geod. 20, 379–385 (1995)


116. Rummel, R., van Gelderen, M., Koop, R., Schrama, E., Sansò, F., Brovelli, M., Miggliaccio, F., Sacerdote, F.: Spherical Harmonic Analysis of Satellite Gradiometry. Publications on Geodesy, 39, Delft (1993) 117. Runge, C.: Zur Theorie der eindeutigen analytischen Funktionen. Acta Math. 6, 229–234 (1885) 118. Sansò, F.: A note on density problems and the Runge-Krarup theorem. Boll. Geod. Sci. Affini. 41, 422–477 (1982) 119. Sansò, F., Rummel, R. (eds.): Geodetic Boundary Value Problems in View of the One Centimeter Geoid. Lecture Notes in Earth Sciences, vol. 65. Springer, Berlin/Heidelberg (1997) 120. Sansò, F., Barzaghi, R., Tscherning, C.C.: Choice of norm for the density distribution of the Earth. Geophys. J. R. Astron. Soc. 87, 123–141 (1968) 121. Savet, P.H.: Gravity field exploration by a new gradient technique. J. Spacecr. 6(6), 710–716 (1969) 122. Schneider, F.: Inverse problems in satellite geodesy and their approximate solution by splines and wavelets. Ph.D.-Thesis, University of Kaiserslautern, Geomathematics Group. Shaker, Aachen (1997) 123. Schreiner, M.: Tensor spherical harmonics and their application in satellite gradiometry. Ph.D.-Thesis, University of Kaiserslautern, Geomathematics Group (1994) 124. Schreiner, M.: Uniqueness problems in satellite gradiometry. In: Neunzert, H. (ed.) Proceedings of the 8th Conference of the European Consortium for Mathematics in Industry. University of Kaiserslautern (1994) 125. Schreiner, M.: Wavelet approximation by spherical up functions. Habilitation Thesis, Geomathematics Group, University of Kaiserslautern. Shaker, Aachen (2004) 126. Shure, L., Parker, R.L., Backus, G.E.: Harmonic splines for geomagnetic modelling. Phys. Earth Planet. Inter. 28, 215–229 (1982) 127. Svensson, S.L.: Pseudodifferential operators – a new approach to the boundary value problems of physical geodesy. Manuscr. Geod. 8, 1–40 (1983) 128. 
Tapley, B.D., Bettadpur, S., Ries, J.C., Thompson, P.F., Watkins, M.W.: GRACE measurements of mass variability in the Earth system. Science 305(5683), 503–505 (2004) 129. Tikhonov, A.N., Arsenin, V.Y.: Solutions of Ill-Posed Problems. Wiley, New York (1977) 130. Torge, W.: Gravimetry. De Gruyter, Berlin/New York (1989) 131. Tscherning, C.C.: A note of the choice of norm when using collocation for the computation of approximations to the anomalous potential. Bull. Géod. 51, 137–147 (1977) 132. Vekua, I.N.: Über die Vollständigkeit des Systems harmonischer Polynome im Raum (Russian). Dokl. Akad. Nauk 90, 495–498 (1953) 133. Wahr, J., Molenaar, M.: Time variability of the earth’s gravity field: hydrological and oceanic effects and their possible detection using GRACE. J. Geophys. Res. 103(B12), 30:205–30:229 (1998) 134. Walsh, J.L.: The approximation of harmonic functions by harmonic polynomials and by harmonic rational functions. Bull. Amer. Math. Soc. 35, 499–544 (1929) 135. Watts, A.B.: Isostasy and the Flexure of the Lithosphere. Cambridge University Press, Cambridge (2001) 136. Wells, W.C. (ed.): Spaceborne gravity gradiometers. Proceedings of Workshop held at NASA Goddard Space Flight Center, NASA Conference Publication 2305, Greenbelt (1984) 137. Wolf, K.: Multiscale modeling of classical boundary value problems in physical geodesy by locally supported wavelets. Ph.D.-Thesis, Geomathematics Group, University of Kaiserslautern (2009)

20 Very Long Baseline Interferometry

Axel Nothnagel

Contents
1 Introduction . . . 1258
2 Basic Theory . . . 1262
3 Technical Considerations of VLBI Observations . . . 1263
4 Scheduling Process . . . 1265
5 Observations and Calibrations . . . 1268
6 Correlation and Fringe Fitting . . . 1280
7 Data Analysis . . . 1291
8 Results . . . 1307
Literature . . . 1310

Abstract This chapter describes the theory and the individual operational steps and components needed to carry out geodetic and astrometric Very Long Baseline Interferometry (VLBI) measurements. Pairs of radio telescopes are employed to observe far distant compact radio galaxies for the determination of the differences of the signal arrival times at the telescopes. From multiple observations of time delays of different radio sources, geodetic parameters of interest such as telescope coordinates, Earth orientation parameters, and radio source positions are inferred. The VLBI operational scheme generally consists of the steps scheduling, observing session, correlation, and data analysis.

This chapter is part of the series Handbuch der Geodäsie, volume "Mathematische Geodäsie/Mathematical Geodesy", edited by Willi Freeden, Kaiserslautern.

A. Nothnagel, VLBI Research Group, Institute of Geodesy and Geoinformation of the University of Bonn, Bonn, Germany. E-Mail: [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020. W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_110

Zusammenfassung In diesem Kapitel werden die Theorie und die einzelnen operationellen Schritte und Komponenten beschrieben, die erforderlich sind, um mit dem Verfahren der Radiointerferometrie auf langen Basislinien (im Englischen Very Long Baseline Interferometry [VLBI]) geodätische und astrometrische Messungen durchzuführen. Bei diesem Verfahren werden jeweils Paare von Radioteleskopen genutzt, um die Differenzen der Ankunftszeiten der Signale von Quasaren als primäre Observable zu gewinnen. Aus einer Vielzahl dieser Laufzeitdifferenzen von verschiedenen Radioquellen können dann die relevanten Zielparameter wie Teleskopkoordinaten, Erdorientierungsparameter und Radioquellenpositionen ermittelt werden. Der gesamte VLBI-Prozess besteht in der Regel aus den Schritten Erstellung des Beobachtungsplans, Durchführung der Beobachtungen, Korrelation und Datenanalyse.

Keywords Interferometry · Astrometry · Correlation · Fringe fitting · Radio telescopes · Telescope deformations · Thermal expansion of radio telescopes · VLBI data analysis

1 Introduction

1.1 Overview

The technique of Very Long Baseline Interferometry (VLBI) was invented by astronomers, with its first development steps in the mid-1960s, for synthesizing a telescope aperture of several thousand kilometers, improving the angular resolution by orders of magnitude for imaging extragalactic objects (e.g., [10, 17, 48, 49]). The first applications to geodesy and astrometry followed a few years later (e.g., [21, 34, 70]). Astrometry is an indispensable sub-discipline of VLBI which deals with the positions of astronomical objects. Pairs of radio telescopes as far apart as an Earth diameter simultaneously receive the electromagnetic radiation of extra-galactic radio sources such as quasars and other compact radio galaxies at quasi-infinite distances. Ideal candidates among these sources are point-like and show no proper motions on decadal time scales. Due to the large distance, the emission of the radio sources arrives on Earth as plane wavefronts and causes different arrival times at the two radio telescopes forming a baseline (Fig. 1). The difference in arrival times is the delay τ, which is one of the observables of geodetic and astrometric VLBI. To determine the delays, in a first step the incoming wavefronts at each telescope are digitized, time-tagged, and recorded on some appropriate medium such as magnetic disks. After transportation to a correlator facility, the second step consists of a cross-correlation and fringe fitting process for each individual observation to produce the observables. From a multitude of delay observables in different baseline–source geometries, the parameters of interest such as telescope coordinates, Earth orientation parameters, or radio source positions can be inferred. With the extremely high accuracy and reliability of these parameters, VLBI contributes to two of the three pillars of geodesy, namely geometry and Earth rotation. Since the recorded bandwidth is a key driver of the precision of VLBI observables, as described below, it has always been the primary development challenge to increase the recorded bandwidth to a technically feasible and economically acceptable level. For this reason, high-precision geodetic and astrometric VLBI started in 1979 with the first observing sessions of the then newly developed Mark-3 system [20].

Fig. 1 VLBI constellation with electronics components: the plane wavefront from direction k reaches the two telescopes of baseline b with delay τ; at each station the signal passes through RF amplifier, mixer with local oscillator, frequency standard with station clock, sampler/formatter, and recording unit, and the recordings of both stations are combined at the correlator


With a much wider bandwidth than before, precisions of centimeters or even a few millimeters were reached on baselines of several thousand kilometers. The technique has been engineered further ever since, making use of the growing capabilities of modern electronics, other hardware components, and analysis software including geophysical models. Efforts to exploit VLBI in a structured manner led to the foundation of the International VLBI Service for Geodesy and Astrometry (IVS) [55] in 1999. Today, the IVS coordinates the majority of geodetic and astrometric developments and observations, resulting in a multitude of synergies. The active network of the IVS and associated partners, such as the Long Baseline Observatory (LBO, https://www.lbo.us), earlier known as the Very Long Baseline Array (VLBA), consists of about 40 radio telescopes with about ten more under construction (Fig. 2). More details of the IVS, in particular of the observing programme, can be found in [68, 69] and [55].

1.2 Reference Frames

Fig. 2 Global network of radio telescopes for geodetic and astrometric observations (IVS sites, cooperating VLBI sites, and future IVS sites)

As with other space geodetic techniques, the coordinates of the radio telescopes are referred to a conventional Cartesian terrestrial reference frame with its origin at the geocenter and the x, y, and z axes defined as a right-handed system. From the origin, the x axis points at the intersection of the zero meridian and the equator, the z axis in the direction of the Earth's rotation axis, and the y axis is orthogonal to both of them. Due to the fact that continental drift and tectonic deformations produce movements of the lithosphere of several millimeters per year, the telescope positions have to be attributed with a reference date and respective velocity components vx, vy, and vz in mm/year. For practical purposes, most VLBI analysis groups maintain their own realizations from their latest global analyses. All of them are very close to the International Terrestrial Reference Frame (ITRF) in its latest version (e.g., [3]), to which the IVS contributes routinely as described by [8, 12, 81]. On average, the differences are only on the order of a few millimeters for the coordinates and some sub-millimeter per year for the velocities, with very few exceptions for telescopes with known peculiarities.

With VLBI, compact extra-galactic nuclei such as quasi-stellar objects (quasars) and other galaxies emitting electromagnetic radiation in the radio frequency domain are the primary objects to be observed. The majority of these objects are quasars and, for this reason, we will sometimes generalize the naming of the radio sources simply as quasars. Due to their vast distance, most of the observed objects appear quasi-point-like and do not exhibit any proper motions. However, with increased resolution of the interferometers, time-dependent morphology of the radio sources becomes visible and the definition of the exact location of the center of emission becomes time-dependent at the sub-milliarcsecond level. In addition, the asymmetric field of radiated energy leads to the so-called source structure effect, which corrupts the phase of the signal and consequently the inferred delays.
Currently, such objects are excluded whenever possible, but endeavors also exist to correct for these effects with adequate models.

The second reference frame which is directly linked to VLBI observations is the celestial frame, consisting of the positions of the compact extra-galactic radio sources. The positions are defined as angular right ascension and declination components of an equatorial polar coordinate system with the origin in the solar system barycenter. Right ascension (R.A.) is counted clockwise in the equatorial plane from 0 to 24 h with minutes and seconds of time, while declination runs in degrees, minutes, and seconds of arc from −90° at the south celestial pole to +90° at the north celestial pole. For quite some time the most accurate frame has been the International Celestial Reference Frame in its second realization (ICRF2) [27]. However, work is in progress to make use of the additional wealth of observations accumulated between March 2009 and March 2018 and to generate a new ICRF, i.e., the ICRF3 (Fig. 3) [38]. It is expected that the axis definition is accurate to 10–20 μas and that the position uncertainties have a median error of 0.15 mas, as compared to 0.5 mas in ICRF2 (Jacobs, pers. comm.).
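Since right ascension is given in hours, minutes, and seconds of time and declination in degrees, minutes, and seconds of arc, any analysis first converts both to a common angular unit (15 degrees per hour of R.A.). A minimal sketch of this bookkeeping; the source coordinates below are made up for illustration:

```python
# Convert catalogue-style source coordinates to degrees.
# 1 hour of right ascension corresponds to 15 degrees of arc.
def ra_hms_to_deg(h, m, s):
    return (h + m / 60.0 + s / 3600.0) * 15.0

def dec_dms_to_deg(sign, d, m, s):
    """sign is +1 for northern, -1 for southern declinations."""
    return sign * (d + m / 60.0 + s / 3600.0)

ra = ra_hms_to_deg(12, 30, 0.0)            # 187.5 degrees
dec = dec_dms_to_deg(-1, 5, 31, 12.0)      # -5.52 degrees
```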

Fig. 3 Source distribution of ICRF2 sources (blue) plus new sources in ICRF3 (magenta). The black line is the ecliptic, the dashed line is the galactic equator

2 Basic Theory

Quasars and other compact extra-galactic radio sources are the primary objects observed in VLBI. Originating from physical processes within these objects, they emit pure random noise with a mostly rather flat amplitude spectrum in the frequency bands observable on Earth. Although the radiation is random noise, it is also called the signal of the radio source. Compact extra-galactic objects are found as far away as the frontiers of the universe, at quasi-infinite distances. For this reason, the emission of these radio sources arrives on Earth as plane wavefronts and causes different arrival times at the two radio telescopes forming a baseline (Fig. 1). The difference of the arrival times t1 and t2 is the delay, which is one of the observables of geodetic and astrometric VLBI. The principal connection between the time delay τ, the baseline b, and the unit vector in source direction k is

τ = t2 − t1 = −(1/c) b · k        (1)

with c being the speed of light. On the side of the observable τ, this concept more or less assumes a stationary geometry, and the observations appear like snapshots of this situation. However, as we will see later, VLBI needs some finite integration time, and the Earth of course rotates during that time.
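Eq. (1) can be evaluated directly; a minimal numerical sketch, where the baseline vector and source direction are invented round numbers, not values from the chapter:

```python
# Sketch of Eq. (1): tau = t2 - t1 = -(1/c) b . k
# Baseline and source direction below are invented for illustration.
import math

C = 299_792_458.0  # speed of light in m/s

def geometric_delay(baseline_m, k_unit):
    """Arrival-time difference in seconds for baseline b and unit source vector k."""
    return -sum(b * k for b, k in zip(baseline_m, k_unit)) / C

# A 6000 km baseline and a source 45 degrees away from the baseline direction:
b = (6.0e6, 0.0, 0.0)
k = (math.cos(math.radians(45.0)), math.sin(math.radians(45.0)), 0.0)
tau = geometric_delay(b, k)   # about -14 ms; |tau| can never exceed |b|/c = 20 ms
```

The magnitude of τ is bounded by the light travel time along the baseline, which is why an Earth-sized baseline yields delays of at most about 21 ms.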


In a slightly different conceptual approach, the radiation received by the two telescopes is considered as two identical monochromatic waves, with the phase difference of the two signals rotating due to Earth rotation. This interferometer phase is another observable of the VLBI technique. While the Earth rotates, this phase difference rotates through many cycles depending on baseline length and orientation as well as on the wavelength of the observing frequency. The determination of this phase also needs some integration time, applying concepts of interferometry. The interferometer phase is determined in the correlation and the subsequent fringe fitting process, which searches for the interferometer phase at a certain epoch (see Sect. 6). In a rough concept of the correlation process, the digitized signal streams of the two telescopes are cross-multiplied at various trial delays to find the correlation maximum and the respective interferometer phase.
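The idea of cross-multiplying at trial delays can be illustrated with a toy example. This is not the chapter's correlator: two synthetic noise streams, one a delayed copy of the other, are correlated at integer trial lags and the lag of maximum correlation recovers the delay in samples; all numbers are invented.

```python
# Toy correlation sketch: find the sample lag that maximizes the
# cross-correlation of two noise streams (one delayed copy).
import random

random.seed(1)
n, true_lag = 4096, 37                      # stream length and delay in samples
s = [random.gauss(0.0, 1.0) for _ in range(n + true_lag)]
x1 = [s[i + true_lag] for i in range(n)]    # station where the wavefront arrives first
x2 = [s[i] for i in range(n)]               # station that sees it true_lag samples later

def xcorr(lag):
    """Normalized cross-correlation of x1 against x2 shifted by lag samples."""
    m = n - lag
    return sum(x1[i] * x2[i + lag] for i in range(m)) / m

best = max(range(0, 101), key=xcorr)        # recovers true_lag
```

A real correlator works channel-wise with FFTs and, in the fringe fitting step, also estimates the fractional (sub-sample) delay and the fringe phase; this toy only finds the integer sample lag.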

3 Technical Considerations of VLBI Observations

Turning to the observations, it is rather clear that the quasars and other compact radio sources do not emit just monochromatic radiation but a broad spectrum of frequencies. This is recorded with a bandpass of limited bandwidth, and the time delay will turn into a group delay, which obeys the rule of

τ = dφ(ν)/dν        (2)

with φ being the phase and ν being discrete observing frequencies. Its first derivative w.r.t. time is the delay rate

τ̇ = dτ/dt.        (3)

Together with some other parameters, the total spanned bandwidth determines the theoretical accuracy with which the group delay can be determined [82]:

στ = 1 / (2π · SNR · ΔνRMS)        (4)

with ΔνRMS = effective bandwidth of the receiving system, depending on the number of channels n, the individual channel frequencies νi, and the mean frequency νM [76]:

ΔνRMS = sqrt( Σ_{i=1}^{n} (νi − νM)² / n ).        (5)
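A quick sketch of Eq. (4) with invented but typical orders of magnitude (an SNR of 30 and an effective bandwidth of 300 MHz):

```python
# Sketch of Eq. (4): group delay standard deviation from SNR and
# effective (RMS) bandwidth; the input values are illustrative only.
import math

def delay_sigma(snr, dnu_rms_hz):
    return 1.0 / (2.0 * math.pi * snr * dnu_rms_hz)

sigma = delay_sigma(30.0, 300e6)   # about 18 picoseconds, i.e. ~5 mm of light travel
```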


The signal-to-noise ratio (SNR) of an interferometer is the inverse of the phase noise (standard deviation) and can be calculated theoretically as

SNR = η · (S / 2k) · sqrt( (A1 · A2) / (Tsys1 · Tsys2) ) · sqrt(2 Δν T)        (6)

with η = digitizing loss factor (0.5–0.7), S = correlated flux density of the radio source, k = Boltzmann's constant (1.38 × 10^−23 Ws/K), Ai = effective antenna areas of telescopes 1 and 2, Tsysi = noise temperatures of the receiving systems, Δν = total bandwidth of the receiving system, and T = coherent integration time. The actual SNR is computed during correlation (Sect. 6.3). The efficiency of a radio telescope for VLBI observations is regularly determined and expressed as the System Equivalent Flux Density (SEFD), Ssys. The SEFD represents the flux density of a fictitious radio source which would double the output power of the complete receiving system. The spectral flux density of a radio source as well as of a receiving system is given in Jansky [Jy], where 1 Jy = 1 × 10^−26 W/(Hz·m²). Ssys is computed according to [76] with

Ssys = 2k Tsys / Aeff        (7)

where Aeff = effective antenna aperture and Tsys = system temperature, which is composed of the sky brightness temperature Tsky and the noise temperature of the instrument Tinst:

Tsys = Tsky + Tinst        (8)

with Tsky = Tso · λ^2.55 and Tso being the galactic background radiation of 60 K ± 20 K; λ is the wavelength of the radiation. The SEFD has the advantage that it can be measured easily at each telescope independently by determining the fractional increase in power obtained when pointing on and off a source of known flux density. SEFDs range from 60 to 400 Jy for large antennas and 800 to 1000 Jy for 20 m class systems to more than 15,000 Jy for telescopes of only a few meters in diameter. These values also strongly depend on the receiver noise mitigation by cryogenic systems. With most of the parameters mentioned above being predefined, the integration time and the bandwidths are the parameters which still allow some scope for increase and, thus, for a reduction of the standard deviation of the delay observables. However, the integration time should be kept short. An initial reason was that the frequency standards allowed only for a limited coherence time, but this has been overcome by the use of hydrogen maser clocks (H-maser). Today, short integration times permit collecting observations in as many different spatial directions as possible for a good sampling of the atmospheres above the radio telescopes [59].
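Substituting Eq. (7) into Eq. (6) gives the practical baseline-sensitivity form SNR = η · S · sqrt(2 Δν T) / sqrt(SEFD1 · SEFD2), which can be inverted for the coherent integration time T. A sketch of this bookkeeping; all station values below are invented round numbers, not parameters of any actual telescope:

```python
# Sketch combining Eqs. (6) and (7) via SEFD_i = 2k*Tsys_i/Aeff_i.
import math

K_B = 1.380649e-23  # Boltzmann constant in Ws/K

def sefd_jy(t_sys_k, a_eff_m2):
    """System Equivalent Flux Density (Eq. 7), converted to Jansky."""
    return 2.0 * K_B * t_sys_k / a_eff_m2 / 1e-26

def snr(eta, s_jy, sefd1_jy, sefd2_jy, bandwidth_hz, t_s):
    return eta * s_jy * math.sqrt(2.0 * bandwidth_hz * t_s) / math.sqrt(sefd1_jy * sefd2_jy)

def integration_time(target_snr, eta, s_jy, sefd1_jy, sefd2_jy, bandwidth_hz):
    """Invert the SNR formula for the coherent integration time T in seconds."""
    return (target_snr * math.sqrt(sefd1_jy * sefd2_jy) / (eta * s_jy)) ** 2 / (2.0 * bandwidth_hz)

sefd1 = sefd_jy(50.0, 500.0)   # a sensitive antenna: ~276 Jy
sefd2 = sefd_jy(80.0, 150.0)   # a 20 m class antenna: ~1473 Jy
t = integration_time(30.0, 0.6, 1.0, sefd1, sefd2, 128e6)   # a few seconds
```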

Fig. 4 Generic channel allocation for bandwidth synthesis (amplitude vs. frequency over a spectrum of 720 MHz)

Another parameter which can be optimized is the observed bandwidth, which appears as the effective bandwidth in Eq. 4 and as the total bandwidth in Eq. 6. Since recording capacity has always been a limiting factor, bandwidth synthesis was invented [63]. This technique works on the basis of extracting and processing comparatively small frequency bands (today up to 16 MHz) out of a spectrum of, e.g., 720 MHz, spanning the total bandwidth of the latter but reducing the need for storage and transportation to only n times 16 MHz (Fig. 4). While the two extreme frequency bands help to increase the effective bandwidth ΔνRMS according to Eq. 5, the interposed bands support this as well but also increase the total bandwidth Δν in Eq. 6 and mitigate sidelobes in the delay resolution function (Sect. 6). The sequence of frequencies is initially selected such that any pair of channels produces as many different frequency differences as possible. As shown later, the individual channels produce so-called single-band (group) delays while the combination of all channels in either S or X band yields multi-band (group) delays.
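The gain from spreading the channels can be checked with Eq. (5). A sketch with a hypothetical channel sequence; the frequencies are invented and do not reproduce any actual Mark-3 or VGOS setup:

```python
# Sketch of Eq. (5): RMS spanned bandwidth of a channel sequence.
# Channel frequencies (MHz) below are invented for illustration.
import math

def effective_bandwidth(freqs_mhz):
    mean = sum(freqs_mhz) / len(freqs_mhz)
    return math.sqrt(sum((f - mean) ** 2 for f in freqs_mhz) / len(freqs_mhz))

spread = [8210.0, 8220.0, 8250.0, 8310.0, 8420.0, 8500.0, 8730.0, 8930.0]
packed = [8210.0 + 16.0 * i for i in range(8)]   # same channel count, contiguous
# Spreading the same number of channels across the band raises the
# effective bandwidth and hence, by Eq. (4), lowers the delay sigma.
gain = effective_bandwidth(spread) / effective_bandwidth(packed)
```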

4 Scheduling Process

Observing VLBI sessions requires an active control process because it has to be guaranteed that the telescopes forming one or several baselines point at the same quasar for the same period of time. In addition, the recording of the data has to be synchronized, since registration is not continuous, in order to make optimal use of the recording media. Under these premises, observing schedules are prepared a few weeks before the observing date. They contain the start and stop times for every telescope in the network for every individual observation. Assuming a network of up to 20 telescopes, multiple configurations of subsets of these telescopes form so-called scans of one quasar at a time. The composition of these subsets always depends on the visibility of the quasar from a certain area on the Earth's surface. In the end, a schedule for an observing session of 24 h may consist of thousands of scans. The word scan is actually used if two or more telescopes simultaneously observe the same radio source. Each scan produces n · (n − 1)/2 individual delay observations for n telescopes in the same scan.
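The observation count per scan is simple combinatorics; a short sketch of the bookkeeping, with invented scan sizes:

```python
# Each scan with n participating telescopes yields n*(n-1)/2 baseline
# delay observations; the example scan sizes are invented.
def obs_per_scan(n):
    return n * (n - 1) // 2

scan_sizes = [8, 12, 5, 20]                            # telescopes per scan
total_obs = sum(obs_per_scan(n) for n in scan_sizes)   # 28 + 66 + 10 + 190 = 294
```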

1266

A. Nothnagel

The preparation of the observing schedules is of great importance to the overall results because it defines the geometric configuration of the parameter estimation. In this respect it is comparable to geodetic network optimization (e.g., [4, 29]). The planning of VLBI observations, in fact, is rather complicated because there are many parameters which have to be taken into account. The first question to answer always is whether a quasar is actually above the horizon at the telescope. This can be deduced from a purely geometric consideration resulting from the position of the quasar in a sky-fixed celestial reference frame (see Sect. 1.2) and the coordinates of the telescope in the terrestrial frame transformed into its instantaneous location in a geocentric celestial reference frame. To first order this is just a simple rotation about the Earth rotation axis with the local hour angle hloc. With some generalization it is computed from the right ascension α, the longitude of the telescope λ, Universal Time UT, and Greenwich Mean Sidereal Time (GMST) at 0h UT according to

hloc = GMST0h + UT · 1.00274 + λ − α.

(9)
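Eq. 9 can be sketched directly in code; the unit conventions (hours for GMST, UT, and α; degrees for λ) and the wrap into [0, 24) h are our assumptions:

```python
def local_hour_angle(gmst0_h, ut_h, lon_deg, ra_h):
    """Local hour angle in hours (Eq. 9): GMST_0h + UT * 1.00274 + lambda - alpha.
    The longitude is converted from degrees to hours (15 deg = 1 h)."""
    return (gmst0_h + ut_h * 1.00274 + lon_deg / 15.0 - ra_h) % 24.0
```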

For any radio source with its right ascension α and declination δ and any telescope with its geographic latitude Φ and its instantaneous hour angle, the respective azimuth A and elevation angle ε can be computed according to [50]

tan A = (− cos δ sin hloc) / (sin δ cos Φ − cos δ cos hloc sin Φ)    (10)

sin ε = sin Φ sin δ + cos Φ cos δ cos hloc.    (11)
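Eqs. 10 and 11 translate directly into code; using atan2 for Eq. 10 resolves the azimuth quadrant automatically (the degree-based interface and function name are our choices):

```python
import math

def azimuth_elevation(dec_deg, lat_deg, hloc_h):
    """Azimuth (clockwise from north) and elevation, both in degrees,
    from Eqs. 10 and 11."""
    h = math.radians(hloc_h * 15.0)               # hour angle: hours -> radians
    d, phi = math.radians(dec_deg), math.radians(lat_deg)
    sin_el = (math.sin(phi) * math.sin(d)
              + math.cos(phi) * math.cos(d) * math.cos(h))
    el = math.asin(max(-1.0, min(1.0, sin_el)))   # clamp guards against rounding
    az = math.atan2(-math.cos(d) * math.sin(h),
                    math.sin(d) * math.cos(phi)
                    - math.cos(d) * math.cos(h) * math.sin(phi))
    return math.degrees(az) % 360.0, math.degrees(el)
```

For an equatorial source (δ = 0°) observed from the equator at hloc = 6 h, this returns the western horizon (A = 270°, ε = 0°).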

Querying the local horizon mask of each telescope, which may limit the observable section of the sky, answers the question whether a radio source is observable at a certain instant of time. By default, VLBI observations are carried out down to a threshold of ε ≥ 3°. Due to the fact that cable links between the fixed ground part and the turning elements need to be secured against over-twisting, all azimuth-elevation telescopes are limited in their azimuth rotations. Normally, a full circle is augmented by some reasonable arc length in clockwise and counter-clockwise directions to allow for continuous tracking in all directions (Fig. 5). Special care has to be taken in the scheduling process that the duration of the slew from one azimuth to another is computed correctly, taking into account that sometimes the shortest path is blocked by the antenna limits. For each source the duration of the observation has to be calculated. This mainly depends on a predefined minimal signal-to-noise ratio (SNR) which is normally set at a level of 25 to 35. For each source, the necessary integration time is computed according to Eq. 6 depending on the correlated flux density of the source and the sensitivity of the telescopes represented in the SEFDs according to Eq. 7. For the damping caused by the atmosphere, the SEFD at zenith (Ssys^zenith) has to be scaled according to [28, p. 170] by

20 Very Long Baseline Interferometry

1267

Fig. 5 Example of cable wrap limitations (−270° to +270°). N.B.: Position Az may be reached by two different azimuth values

Ssys(ε) = Ssys^zenith · f(ε).    (12)

(12)

In its simplest form, the scaling factor f(ε) is a mapping function just as 1/sin ε. A more complex model

f(ε) = Σ_{i=1}^{n} ci / (sin^y ε)^i    (13)

is being used in the SKED scheduling program [28], where n is the number of terms (usually 1–3), ci is the coefficient for the i-th term, and y is the power of the sin ε term (0 ≤ y ≤ 1). Since it takes quite some effort to determine the coefficients and the power law, these are available only for a few radio telescopes such as those of the Very Long Baseline Array (VLBA) [61] with c1 close to 1. These mapping functions already lead to an increase of the SEFDs by a factor of 2 at ε = 30° and increase sharply below. For this reason, observations close to the horizon need much longer integration times than those near zenith.

The final observing schedule is constructed in a sequential forward mode where new observations are selected to fulfill an optimization criterion. Most operational observing schedules are generated to optimize the sky coverage at each telescope in a way that the hemisphere above each telescope is filled as homogeneously as possible with observations for a good estimation of the atmospheric refraction parameters (Figs. 6 and 7). It should be noted here that, due to the fact that the atmospheric refraction parameters are often estimated as linear splines with interval lengths of, e.g., 1 h, the sky coverage is actually evaluated only for the hour preceding the current instant. The generation of observing schedules is a contemporary field of research and many innovative optimization schemes have been devised, e.g., [11, 73, 74, 79].

In the scheduling process itself, we assume a start scenario where all telescopes point at the same quasar. The question always is which is the next observation fulfilling the optimization criterion. To answer this, for all possible candidates the next possible start time is computed from the slew rates of the telescopes and the minimum scan lengths for each baseline according to the SNR threshold. The final selection is then


Fig. 6 Local sky coverage of observations at telescope HART15M (South Africa), 430 scans

Fig. 7 Local sky coverage of observations at telescope NYALES20 (Svalbard, Norway), 459 scans

taken in agreement with some predefined minor options such as maximum scan length or minimum separation between observations of the same source. For more details of the geodetic scheduling procedure see [28].
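The SEFD elevation scaling of Eqs. 12 and 13 can be sketched as follows; the summation form of f(ε) and the defaults (c = (1.0,), y = 1, i.e., the plain 1/sin ε mapping) are our reading of the model:

```python
import math

def sefd(sefd_zenith, el_deg, coeffs=(1.0,), y=1.0):
    """Eq. 12 with the mapping function of Eq. 13:
    SEFD(el) = SEFD_zenith * sum_i c_i / (sin(el)**y)**i."""
    s = math.sin(math.radians(el_deg)) ** y
    return sefd_zenith * sum(c / s**i for i, c in enumerate(coeffs, start=1))
```

With the default 1/sin ε mapping, the SEFD doubles at ε = 30°, consistent with the factor of 2 quoted above.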

5

Observations and Calibrations

5.1

Radio Telescopes

Radio telescopes for geodetic and astrometric applications are basically the same as those for pure astronomical observations. The only differences are found in the receiver and backend technology. The dominant part of a radio telescope is the main reflector, often combined with a secondary reflector or sub-reflector, which serves to combine the energy of the incoming wave front in a single focal point. The construction of a radio telescope aims at locating the phase center of a so-called feed horn exactly at the focal point to guide the energy into the receiver which

Fig. 8 Cassegrain radio telescope optics with parabolic main reflector and hyperbolic sub-reflector (dashed lines are discrete ray paths)

Fig. 9 Gregorian radio telescope optics with parabolic main reflector and ellipsoidal sub-reflector (dashed lines are discrete ray paths)

predominantly consists of the first stage of amplification in a cooled (cryogenic) environment. A radio telescope is said to have a prime focus system if the radio frequency feed horn is placed at the focal point of the main reflector. Most radio telescopes, however, use a secondary focus near the vertex of the paraboloid for easier access to the receiver equipment. For this purpose, either a hyperbolic (Cassegrain system, Fig. 8) or an ellipsoidal sub-reflector (Gregorian system, Fig. 9) is mounted in front of or behind the primary focal point, respectively, to concentrate the energy in a secondary focal point. The feed horn is then placed there. The reflector optics is mounted on a pair of axes which are perpendicular to each other to reach all positions on the sky. The most common mount is the azimuth-elevation, also called alt-azimuth, axis system (Fig. 10) [7]. Other systems are polar mounts, where the primary axis is in a position parallel to the Earth's rotation axis (Fig. 11), and the X/Y mount, where the primary axis lies in a horizontal plane (Fig. 12). It should be mentioned here that azimuth-elevation telescopes are constructed in two different ways. The first group is the so-called wheel and track type where the whole structure moves on a circular track with at least four groups of wheels distributing the weight evenly (Fig. 13). Today, this construction is mainly used for


Fig. 10 Azimuth-elevation mount with primary axis in local vertical

Fig. 11 Polar mount with primary axis parallel to Earth rotation axis

bigger telescopes with diameters larger than 20 m. Like telescopes with polar and X/Y mounts, these are generally made entirely of steel. The second group consists of so-called turning-head telescopes where only the top part rotates while a tower made of concrete supports the moving parts (Fig. 14). In geodetic and astrometric VLBI, the Earth-fixed coordinates of a radio telescope are always related to the VLBI reference point. This is a point within the structure of a telescope which is invariant to any rotations of the telescope [72]. To first order this is the intersection of the primary and the secondary rotation axis of the telescope. However, since these intersect only in very rare cases, the reference point is the projection point of the secondary axis onto the primary axis (Figs. 10, 11, 12). The separation of the two axes is called the axis offset (AO). It


Fig. 12 X/Y mount with primary axis in horizontal plane

Fig. 13 Wheel and track telescope (Medicina, Italy)

can range from a few millimeters in cases where intersecting axes were planned but not exactly realized for constructional reasons, to several meters where required for technical reasons, in particular for polar and X/Y mounts. In rare cases, the axis offset may even be negative, resulting from the fact that the elevation axis lies a small distance behind the azimuth axis [54]. Considering the travel time of the incoming wave fronts, the distance between the feed horn and the secondary axis is considered invariant to first order. It should be


Fig. 14 Turning head telescope (Wettzell, Germany)

mentioned here already that any constant part of the signal path produces a constant time delay contribution which is treated as a relative clock offset in the analysis (see Sect. 7). The axis offset, however, produces a time delay ΔτAO which varies with elevation. In general, it depends on the unit vector in source direction s and the unit vector in the direction of the fixed axis f [54]:

ΔτAO = (1/c) · AO · √(1 − (s · f)²).    (14)

For azimuth-elevation telescopes the axis offset produces an extra time delay contribution of

ΔτAO = (1/c) · AO · cos ε    (15)

with ε being the elevation angle of the pointing, while for polar mounts the delay contribution depends on the declination of the radio source δ:

ΔτAO = (1/c) · AO · cos δ.    (16)
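Eq. 14 covers all mount types through the two unit vectors; a generic sketch (names ours):

```python
import math

C_LIGHT = 299_792_458.0  # speed of light [m/s]

def axis_offset_delay(ao_m, s, f):
    """Eq. 14: extra delay from an axis offset AO [m], with unit vectors
    s (source direction) and f (fixed-axis direction)."""
    dot = sum(si * fi for si, fi in zip(s, f))
    return ao_m / C_LIGHT * math.sqrt(1.0 - dot * dot)
```

For an azimuth-elevation mount f points to the zenith, so s · f = sin ε and the expression reduces to Eq. 15; for a polar mount s · f = sin δ, giving Eq. 16.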


For X/Y mounts one has to distinguish in which direction the primary axis is oriented. For those in north-south direction, the delay contribution is

ΔτAO = (1/c) · AO · √(1 − (cos ε · cos α)²)    (17)

with α being the azimuth of the radio source, while for those in the east-west direction it is

ΔτAO = (1/c) · AO · √(1 − (cos ε · sin α)²).    (18)

5.2

Telescope Deformations

It goes without saying that the radio telescopes used for geodetic and astrometric VLBI need to be stable enough in all their components for results of highest accuracy. This applies not only to the reference point itself but also to the reflecting optics. The stability of the reference point of VLBI telescopes has been studied only occasionally [45, 46]. The results indicate that the telescope studied is stable close to the detection limit of sub-millimeter accuracy. It is worth noting that all these studies were made for turning-head telescopes where fixed parts of the construction can be used to mount geodetic targets. For wheel and track telescopes any effect of instability, manifested as tumbling of the reference point, cannot be separated from the determination of the axis offset [37]. However, if a displacement of the reference point is actually a consequence of a tilting of the telescope, this can be detected by identifying trends in its pointing model. Furthermore, a tilting process may be interpreted as a spurious telescope motion in the global frame which may be identified by comparison with velocity vectors from observations of nearby permanent GNSS installations, as demonstrated for the PIETOWN radio telescope [61]. In the construction process of a radio telescope, special care is taken that the weight of the primary reflector, the quadrupod holding the sub-reflector, and the sub-reflector itself are balanced by some counter-weight. The reason is that the motors and gears should be relieved of any torques caused by imbalances, for their protection. For this reason, there is almost no shift in load enforced on the elevation axis, and thus on the VLBI reference point, when the telescope is tilted to different elevation angles. The only instability effect of the VLBI reference point is then thermal expansion.
This is mostly modeled by applying the ambient temperature to the telescope dimensions [54], but examples exist where thermal expansion is actually measured directly on the telescopes with the help of invar rods [39, 84]. In most cases, however, the effects of thermal expansion are modeled by applying the expansion coefficients to the dimensions of the telescopes. If we first consider only the support structure up to the elevation axis of an azimuth-elevation telescope, this normally consists of some concrete foundation and a tower


produced of steel. Depending on the material, these have two slightly different expansion coefficients; examples are γf = 1.0 × 10⁻⁵ [1/°C] for a concrete foundation and γa = 1.2 × 10⁻⁵ [1/°C] for a steel tower. Applying a conventional reference temperature T0 and a time lag Δt of the temperature T [54], the elevation dependent effect on the delay of telescope i can be computed according to

Δτtherm,i = (1/c) · [ γf · (T(t − Δtf) − T0) · (hf · sin ε) + γa · (T(t − Δta) − T0) · (hp · sin ε) ].    (19)
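Eq. 19 with the time lags set to zero can be sketched as follows; the foundation height hf = 2 m is an illustrative assumption, while hp = 12 m matches the 20 m telescope example quoted in the text:

```python
import math

C_LIGHT = 299_792_458.0  # speed of light [m/s]

def thermal_delay(el_deg, temp_c, t0_c=20.0, gamma_f=1.0e-5, h_f=2.0,
                  gamma_a=1.2e-5, h_p=12.0):
    """Eq. 19 with zero time lags: delay from thermal expansion of the
    concrete foundation (gamma_f, h_f) and the steel tower (gamma_a, h_p)."""
    dT = temp_c - t0_c
    path = (gamma_f * dT * h_f + gamma_a * dT * h_p) * math.sin(math.radians(el_deg))
    return path / C_LIGHT
```

At 20 °C above the reference temperature the 12 m steel tower alone expands by 1.2 × 10⁻⁵ · 20 · 12 m ≈ 2.9 mm, reproducing the almost 3 mm seasonal effect mentioned in the text.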

This is by far the biggest effect of thermal expansion. For a telescope of 20 m diameter, the height of the elevation axis is roughly 12 m, causing a height variation effect of almost 3 mm for a 20 °C temperature difference between winter and summer. Another part of the telescope where stability counts is the superstructure, which is responsible for the path length of the signal. In an ideal situation the total path length through the reflecting optics is considered to be stable for any direction on the sky. However, thermal expansion and gravitational deformations cause the path length to change. Thermal expansion primarily leads to a change of the focal length of the telescope [6]. In addition, the expansion of the legs of the quadrupod holding the sub-reflector or the feed horn in primary focus produces an extra path length at higher temperatures. Besides this pure geometric effect, the focal characteristics may be disturbed slightly, leading to phase noise from different sections of the reflecting surfaces; these contributions are estimated to appear only in the SNR budget and can be neglected. Variations of the path length have adverse effects only within the time frame of a single session of 24 h duration because the mean effect is compensated for by the clock offset parameter in the estimation process (Sect. 7). With higher resolutions of the clock parameters down to 1 h and below, the effect is mitigated even more and the level of remaining path length variations is generally less than 1 mm. They can be modeled if all dimensions and material properties are known [54]. The gravitational deformations of the paraboloid and the sub-reflector create different path lengths in general but also depending on the radial distance from the optical axis. They are more critical than those of thermal origin because they are purely elevation dependent and thus change from observation to observation.
As has already been investigated by [19], for a prime focus telescope the deformations can be separated into three different components which are all dependent on the elevation angle ε: (a) the change of focal length ΔF, (b) the movement of the vertex of the paraboloid ΔV, and (c) the shift of the feed horn or the sub-reflector in radial direction ΔR (Fig. 15). The extra path length ΔLgrav can then be composed as

ΔL(ε) = αF ΔF(ε) + αV ΔV(ε) + γ αR ΔR(ε).    (20)


Fig. 15 Gravitational deformations of telescope optics: Solid lines represent the geometry at 90° elevation angle, dashed lines that at 0°. ΔV is the shift of the vertex of the telescope, ΔF is the change in focal length, ΔR is a shift of the sub-reflector (in case of a secondary focus telescope) or of the feed horn (in case of a primary focus telescope)

The coefficient γ is equal to 1 for prime focus telescopes and 2 for secondary focus telescopes. The net effect of the elevation-dependent displacements ΔF, ΔV, ΔR on the extra path length depends on the scaling coefficients αF/V/R. Here, the scaling coefficients αF′ and αV′ according to [19] are linearly dependent on αR′, i.e.,

αV′ = −1 − αR′    (21)

αF′ = 1 − αR′    (22)

where the primes on V′, R′, F′ indicate that these are parameters for prime focus telescopes. For Cassegrain systems as a type of secondary focus telescopes, [1] developed the relationships

αV = −1 − 2αR    (23)

αF = 2 − 2αR.    (24)
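The linear relations of Eqs. 21–24 are easily encoded (function names ours):

```python
def prime_focus_coeffs(alpha_r):
    """Eqs. 21/22: (alpha_V', alpha_F') for prime focus telescopes."""
    return -1.0 - alpha_r, 1.0 - alpha_r

def cassegrain_coeffs(alpha_r):
    """Eqs. 23/24: (alpha_V, alpha_F) for secondary focus (Cassegrain) telescopes."""
    return -1.0 - 2.0 * alpha_r, 2.0 - 2.0 * alpha_r
```

cassegrain_coeffs(0.9) returns approximately (−2.8, 0.2), the values quoted further down for a typical secondary focus telescope.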

The initial concept for the computation of αR by [19] assumed that all of the reflecting area of the main reflector contributes to the total change of path length in the same way. However, feed horns are always constructed in a way that they have only a limited aperture angle matching the dimensions of the reflector system in order to avoid unwanted spill-over from beyond the edge. Since this cannot be realized in a purely binary fashion, i.e., homogeneous over the area of the reflector system and zero beyond, the feed horns are constructed with an edge taper. This realizes a roughly exponential drop of sensitivity towards the aperture angle needed to illuminate the reflector system (Fig. 16).


Fig. 16 Edge taper characteristics, (a) as an exponential (black) and (b) as a sin² function

The function which describes the drop in sensitivity depending on the aperture angle θ (Fig. 15) is called the illumination function and in essence serves as a weighting function for the aperture of the telescope depending on the radial distance from the optical axis, i.e., signal paths near the optical axis have a much higher weight than those near the rim of the main reflector. The determination of the exact illumination function heavily depends on measurements of the gain at multiple aperture angles of the feed horn, which can easily be transformed into metric separations of the point of incidence on the main reflector from the optical axis. Unfortunately, for most telescopes only the relation between the gain on-axis and at the edge is available. With only two points of the function, this leaves room for interpretation in the sense that the shape of the function is at the discretion of the analyst, with Gaussian, binomial and cosine-squared models at hand [1, 6]. Since this is a fairly new area of research, any further studies will need to request more detailed measurements of the illumination functions on site.

The coefficient αR′ for a prime focus telescope, which scales the shift of the feed horn in primary focus [19], is

αR′ = 8πF² ∫_{t1}^{t2} In(t) · ((1 − t²)/(1 + t²)) · t dt    (25)

where t1 = r1/2F and t2 = r2/2F with the radial distances from the vertex r1 and r2, measured in the aperture plane. The distance r1 is generally non-zero because most radio telescopes have some sort of feed horn housing at the vertex which needs to be excluded. In is the normalized illumination function.


For a secondary focus Cassegrain system, [1] derived

αR = 2π(F² − a²) ∫_{t1}^{t2} In(t) · (t/(2F² − a²t²)) · t dt    (26)

with t = r/2F for the scaling coefficient, again applying a weighting through the normalized illumination function In. The parameter a is the reference semi-major axis of the ellipsoid at 90° elevation. For a Gregorian system, [6] adopted a similar derivation but with an implicit illumination function on the basis of a cosine-squared model. Here the sub-reflector is a triaxial ellipsoid with its longest extension in the z direction and with the first two semi-major axes (a and b) being identical. If the sub-reflector is displaced, the travel time increases by a factor of

αR = ∫_{r1}^{r2} k · 10^((c0 + c1 · cos²θ)/10) · (1/2) · (D2 + D3 − 2a) dr    (27)

with the normalization coefficient k [1]. D2 is the distance between the focus of the sub-reflector and the point of incidence on the sub-reflector, and D3 is the distance between the point of incidence and the focal plane of the secondary focus (Fig. 17). The aperture angle θ can easily be deduced from r. c0 and c1 are the coefficients of the cos² function. Every telescope has its own set of coefficients αF/V/R. A rough number for αR for a secondary focus telescope is 0.9 (1.06 for Effelsberg [6] and 0.8 for Medicina or Noto [1]) which, according to Eqs. 23 and 24, leads to αV = −2.8 and αF = 0.2. This indicates that changes in the position of the vertex of the paraboloid relative to the VLBI reference point have about a ten times larger effect on the signal path length than changes in the focal length, and a three times larger effect than changes of the sub-reflector position.

Fig. 17 Components of path length in telescope optics


Similar relations also apply for primary focus telescopes, where [1] quote an αR′ of about 0.7, resulting in αV′ = −1.7 and αF′ = 0.3 according to Eqs. 21 and 22. Here, the multiplication factor of a shift of the vertex of the paraboloid is about a factor of 2.5 larger than that of the feed horn (αR′). Although variations of ΔR may originate from several phenomena, we only have to consider line-of-sight shifts of the sub-reflector or feed horn in the telescope-fixed reference frame (Fig. 15). Position variations predominantly happen due to gravitation when the telescope is tilted, e.g., by bending of the struts holding the receiver [67]. These need to be measured locally. Presently, the most suitable technique for measuring the deformations of the primary reflector resulting in a ΔF is terrestrial laser scanning (TLS). Performing such scans at discrete elevation angles between zenith and horizon provides reliable estimates of the changes in focal length [6, 36, 67]. In some cases, the other parameters (ΔR, ΔV) can be deduced from the TLS results [6], but often they need other sensors mounted on the structure of the telescope (Bergstrand, pers. comm.).

5.3

Receiving System

Today, geodetic VLBI systems mostly use concentric feed horns for two bands in the radio frequency (RF) domain, X band as primary observing frequency and S band for calibration of ionospheric refraction. From the injection into the feed horn, the quasar signal is led into the first amplifier stage consisting of low-noise amplifiers, e.g., field-effect transistors (FET), down-converted, and filtered to useful bandwidths of 500 to 1000 MHz (so-called intermediate frequencies, IF). This is done in separate chains for each frequency band or sub-band, down-converted (heterodyned) with some local oscillator frequency, e.g., 2020 MHz for the S band signal and 8080 MHz for X band (Fig. 1). After reaching the control building, channels of separate frequencies with widths of 2, 4, 8, or 16 MHz are extracted and further mixed down to baseband for bandwidth synthesis. For the description of a frequency band, in most cases one of the band limiting frequencies is given together with the indicator upper sideband (USB) or lower sideband (LSB). Throughout the VLBI community, the channel reference frequencies 2225.99, 2245.99, 2265.99, 2295.99, 2345.99, 2365.99 MHz of S band and 8212.99, 8252.99, 8352.99, 8512.99, 8732.99, 8852.99, 8912.99, 8932.99 MHz of X band, all with USB, are currently allocated. At X band, two more channels are often used at 8212.99 MHz (LSB) and 8932.99 MHz (LSB) to fill the historically provided channels no. 15 and no. 16. The reasons for selecting and processing individual channels are given in Sect. 2. Digitization of the individual channels is done according to the Nyquist theorem with two samples of one bit each per Hz. "0" represents a negative voltage and "1" a voltage of zero or larger. This is called 1-bit sampling. In some setups, a second bit is added to each sample indicating whether the voltage is below or above a (half-power) voltage threshold (2-bit sampling).
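The 1-bit sampling rule described above can be sketched in one line (function name ours):

```python
def one_bit_sample(voltages):
    """1-bit sampling: '0' for a negative voltage, '1' for zero or positive."""
    return [1 if v >= 0.0 else 0 for v in voltages]
```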
The frequency allocation is subject to extension in the intermediate future because the VGOS concept (VLBI Global Observing System, [59]) foresees that


the full band between 2 and 14 GHz be observable simultaneously employing a single feed horn. The exact allocation of four individual frequency channels of 0.5 or 1 GHz bandwidth will be decided depending on recording capabilities and radio frequency interference (RFI) through strong artificial emitters affecting the receivers. The general signal processing chain beyond the receiver is very similar to the legacy system, and backwards compatibility with the current S/X configuration is planned for mixed-mode operations with legacy installations. The local oscillators for the down-conversions are driven by frequency standards, normally hydrogen masers, which guarantee sufficient frequency stability [56]. The bit streams are then time-tagged in a so-called formatter unit included in the disk unit used for recording. In many cases, the recording units themselves consist of a chassis, which is normally mounted in some electronics rack, and exchangeable, transportable disk units, each containing eight commodity disks for storage of the data. The latter are used for intermediate storage and later transmission of the data via Internet lines or for shipment to the correlator by courier service. These are known as Mark5 recorders, with Mark6 developments underway. Alternatively, RAID storage systems are also in use (so-called Flexbuff systems). Naturally, these are mainly used for transportation of the data to the correlators via the Internet. Depending on the number of individual channels and the channel bandwidths, the total data volume for legacy systems may be up to 1 Gigabit per second (Gbps) or 4.3 Terabyte (TB) per telescope at a 40% recording cycle (the rest is used for slewing of the telescopes). For VGOS systems, with four channels of 1 GHz bandwidth each, two polarizations, and a 50% recording cycle, the total volume per telescope is approximately 86 TB.
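The quoted data volumes can be reproduced with a small sketch; treating VGOS as an effective 16 Gbps recording rate is our inference from the stated channel configuration and should be read as an assumption:

```python
def session_volume_tb(rate_gbps, duty_cycle, hours=24.0):
    """Recorded data volume per telescope for one observing session, in Terabyte."""
    bits = rate_gbps * 1e9 * hours * 3600.0 * duty_cycle
    return bits / 8.0 / 1e12
```

session_volume_tb(1, 0.4) reproduces the 4.3 TB legacy figure, and session_volume_tb(16, 0.5) the approximately 86 TB VGOS figure.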

5.4

System Calibrations

From the feed horn of the telescope, the signals pass through various components of the receiving system before they are digitized and time tagged (see Sect. 5.3). It is quite natural that the signals suffer from system delays which are neither constant nor predictable [20]. To calibrate the system delay, a comb of phase calibration tones of 1 MHz separation is injected near the feed horn (Fig. 18). This calibration signal is mixed with the noise from the quasars, travels through the same electronic components, and is finally recorded as part of the combined signal. The phase calibration tones are extracted again as part of the correlation process (Sect. 6). The voltage level of the phase calibration tones should be kept at about 1 to 10% of the quasar noise to avoid disturbances of the latter. Phase calibration serves two different purposes. The first one is that inherently the phase calibration marks the time epoch when a certain signal component has actually arrived at the feed horn. This can be considered as a timing reference. The second purpose is that for bandwidth synthesis, multiple channels are processed in the system, i.e., digitized and down-converted, and that dispersive effects occur to


Fig. 18 Phase and cable calibration


the phase of each channel. These lead to spurious delays according to Eq. 2 which are variable and can be corrected for in the fringe fitting process (Sect. 6.5). The unit producing the phase calibration signal is located very close to the feed horn (Fig. 18). It receives its reference frequency from the H-maser through a 5/10 MHz distributor which also supplies the local oscillators with phase-coherent frequencies. Since this link is subject to changes in the electrical path caused by temperature variations and twisting of cables, it is monitored by a two-way system, the so-called cable calibration (a.k.a. delay calibration) system. In fact, the 5 MHz signal bounced back by the cable calibration antenna unit is modulated by a 5 kHz signal which, at the control room, is compared with a 5 kHz reference signal. The time equivalent of the phase shift is constantly monitored with a start-stop counter and recorded in the station log for calibration in the data analysis process (Sect. 7).

6

Correlation and Fringe Fitting

6.1

General Overview

The VLBI time delay τ is defined as the travel time of a plane radio wave which passes station 1 and then reaches station 2. At this point we do not go into detail on how the reference point at each station is defined, but rather focus on the question how we can compare the recorded bit-streams from each station and extract quantities like group delay, phase delay, delay rate, and signal-to-noise ratio (SNR) or correlation amplitude, respectively. Before we look into theoretical aspects and study the actual correlation process, it might be good to reflect on the geometrical situation and the technical aspects of VLBI.


If we consider two VLBI stations distributed arbitrarily on the globe, we can easily identify the range of possible delays to be within

−Re/c ≤ τ ≤ Re/c,    (28)
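A quick numerical check of Eq. 28, with Re = 6371 km assumed as mean Earth radius:

```python
C_LIGHT = 299_792_458.0  # speed of light [m/s]
R_EARTH = 6_371_000.0    # mean Earth radius [m], assumed value

tau_max = R_EARTH / C_LIGHT  # maximum geometric delay magnitude [s]
```

tau_max evaluates to roughly 0.0213 s, i.e., the approximately 21 ms quoted in the text.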

where Re is the radius of the Earth and c denotes the speed of light. Thus, any VLBI delay has to lie between these boundaries, which evaluate to approximately ±21 ms. Without stating it explicitly, we have assumed for this simple estimate that the station clocks are synchronized to UTC sufficiently well. While this is not a straightforward task and requires the use of GNSS receivers, VLBI stations can normally synchronize their time-tagging units to UTC within ±1 μs. Revisiting the VLBI overview depicted in Fig. 1, one has to consider that the situation shown there is a static snapshot. During an actual VLBI observation, two main effects due to temporal variations need to be considered. First, the Earth is rotating while a scan is recorded. Thus, similar to GNSS, one needs to take into account a Doppler effect which impacts the data stream at each station in a way that the recordings suffer from delay rate effects. A similar but smaller contribution might also come from the frequency standard at each station, which very likely drifts from its nominal frequency and thus adds another delay rate-like effect. Ignoring higher order temporal effects, we can thus state that the correlator's task to find the delay τ between two stations can only be achieved if, at the same time, the delay rate τ̇ between the recorded data streams is taken into account properly. This means that the correlator's task can be summarized as finding the most likely values for delay and delay rate that relate the sampling data from station 1 to those of station 2. As will be addressed in more detail later, the correlation process is actually limited to a small search time interval by applying a priori delays (to be re-added at the end) from a priori geometry and clock offset information. Furthermore, a single observation is first decomposed into data chunks of individual periods of 0.1 to 1 s, so-called accumulation periods (APs), which later on are combined again.
First, each sub-band channel is looked at separately, and at the end these are combined as well, leading to bandwidth synthesis (cf. Sect. 3).
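As a back-of-the-envelope check of the delay bounds in Eq. 28, the following minimal sketch evaluates Re/c with commonly used values for the Earth radius and the speed of light:

```python
# Numerical check of the geometric delay bounds of Eq. 28: for any baseline
# on the globe, the delay is limited by |tau| <= R_e / c.
R_E = 6_371_000.0       # mean Earth radius in m
C = 299_792_458.0       # speed of light in m/s

tau_max = R_E / C       # upper bound on the geometric delay in seconds
print(tau_max * 1e3)    # about 21.25 ms
```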

6.2

Making Use of the Cross-Correlation Theorem

Ignoring for a moment that the recorded signals also suffer from non-zero delay rate effects, and assuming that the signals are captured with infinite temporal and spectral resolution, we can write the cross-correlation function¹ as

C12(τ) = Corr(s1, s2) = (1/T) ∫_{−∞}^{∞} s1(t) s2(t − τ) dt.

(29)

¹ In radio astronomy, τ is often defined as the delay w.r.t. the second station, which would change the sign in the following equations.

1282

A. Nothnagel

where s1 and s2 denote the continuous recordings at station 1 and station 2 [76]. Since we can only sample in discrete time steps, and since we should consider the geometrical constraints discussed before, we have to collect a sufficiently large number of samples in order to be able to identify the correlation peak that corresponds to the time delay between the two recordings. We also have to restrict ourselves to a certain bandwidth B when we collect our samples, which implies that we have to obey the Nyquist theorem, which suggests that we should record one independent sample every Δt = 1/(2B). Thus, the finite integral from Eq. 29 turns into

C12[τ] = (1/N) Σ_{k=1}^{N} s1[k] s2[k − τ],

(30)

where τ is considered to be a discrete (integer) delay with a granularity of Δt. Before we discuss how we can achieve sub-sample delay resolution, we should look for efficient ways to compute the cross-correlation function. Considering that the integral in Eq. 29 looks almost like the convolution of the two functions s1 and s2, which is defined as ∫_{−∞}^{∞} s1(t) s2(τ − t) dt, the solution can be expressed in the form

s1 ∗ s2 = F⁻¹(F(s1) · F(s2)),

(31)

where F and F⁻¹ denote the Fourier and inverse Fourier transforms.
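The convolution theorem of Eq. 31 can be verified numerically. The following sketch (using NumPy; the signal lengths are arbitrary choices) zero-pads both signals to the full output length so that the circular FFT product matches the direct linear convolution:

```python
import numpy as np

rng = np.random.default_rng(0)
s1 = rng.standard_normal(64)
s2 = rng.standard_normal(64)

# Convolution theorem (Eq. 31): zero-pad both signals to the full output
# length, multiply the spectra, transform back, and compare with the direct
# (linear) convolution.
n = len(s1) + len(s2) - 1
conv_fft = np.fft.ifft(np.fft.fft(s1, n) * np.fft.fft(s2, n)).real
conv_direct = np.convolve(s1, s2)
assert np.allclose(conv_fft, conv_direct)
```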

6.3

Efficient Ways to Obtain the Cross Correlation and Cross Spectrum Functions

In order to obtain the correlation function rather than the convolution of the two signals, we only need to flip the time order of the second signal, i.e., change t − τ into τ − t. This corresponds to a simple multiplication of the frequency axis by −1 in the Fourier domain: if S(f) is the Fourier transform of a signal s(t), we are thus looking for S(−f). Fortunately, we can make use of another very useful relation for real-valued signals, S(−f) = S(f)*, where the asterisk (*) denotes the complex conjugate. Thus, we have found a very simple and elegant way to express the correlation function C12(τ) by two Fourier transforms, a multiplication of Fourier coefficients, and a single inverse Fourier transform, i.e.,

C12(τ) = (1/T) · F⁻¹(F(s1) · F(s2)*).

(32)

Since we deal with discrete and finite sampling data, we can replace the continuous Fourier transformation by Fast Fourier Transform (FFT) operations and obtain the cross-correlation function by

20 Very Long Baseline Interferometry

C12[τ] = (1/N) · IFFT(FFT(s1) · FFT(s2)*).

(33)

Please observe that the cross-correlation function is now implicitly defined as a complex function and its absolute value would correspond to Eq. 29 in case s1 and s2 were real-valued data streams. As we will see later, the original raw sampling data are of course real-valued, but down-conversion or signal preprocessing steps can be performed more easily when dealing with complex-valued data streams. Often we are not after the cross-correlation function itself, but try to access the so-called cross spectrum between the two recordings,

C12[f] = FFT(s1) · FFT(s2)*,

(34)

which can be turned into the cross-correlation function at any point by applying an inverse Fourier transform. We could as well have obtained the cross spectrum from Eq. 30 by applying the Fourier transform to the cross-correlation function, i.e.,

C12[f] = FFT( (1/N) Σ_{k=1}^{N} s1[k] s2[k − τ] ).

(35)

Thus, we have found two ways, summarized also in Fig. 19, to compute the cross correlation and the cross spectrum, respectively, between the two signals s1 and s2. If we follow the concept of Eq. 35, we denote this choice as XF correlation, which indicates that the cross correlation is performed first in the time domain and the Fourier transform is applied afterwards. If, on the other hand, we follow what is implicitly expressed in Eq. 34, we perform two FFTs first and then multiply in the spectral domain, which is why this choice is called FX correlation. In the end the cross spectra will be the same but, as it turns out, FX correlation is preferable for today's VLBI operations because FFT operations scale as O(N · log2(N)) and thus outperform the classical XF approach already for a very small number of lags.
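The equivalence of the XF and FX routes can be illustrated with a short numerical sketch (circular correlation of synthetic real-valued data; the data length and lag value are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, true_lag = 256, 17
s1 = rng.standard_normal(n)
s2 = np.roll(s1, -true_lag)   # station 2 "sees" the signal true_lag samples earlier

# FX route (Eq. 33): two FFTs, a conjugate multiplication, one inverse FFT.
c_fx = np.fft.ifft(np.fft.fft(s1) * np.conj(np.fft.fft(s2))).real / n

# XF route (Eq. 30): explicit lag-by-lag correlation in the time domain
# (circular here, matching the FFT convention).
c_xf = np.array([s1 @ np.roll(s2, tau) for tau in range(n)]) / n

assert np.allclose(c_fx, c_xf)               # both architectures agree
assert int(np.argmax(c_fx)) == true_lag      # correlation peak at the known delay
```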

Fig. 19 Schematic overview of XF and FX correlator architectures


In order to understand why the cross spectrum appears to be the main target of correlation rather than the actual correlation function, we need to reflect on the fact that we are dealing with sampling data rates of several mega-samples per second (MSps) and thus cannot correlate the whole length of the data stream; instead, we process the data in shorter pieces, so-called accumulation periods (APs), which in the ideal case even match power-of-two FFT sizes and thus lead to a higher performance of FX-type correlators. As long as we preserve coherency among those batches of spectra, we can simply add them, i.e., integrate them, and then perform the inverse Fourier transform on the stacked spectrum. Moreover, having access to the complex spectra in each of these batches allows us to compensate for delay and delay rate effects in a computationally efficient way, so that coherency can be preserved rather straightforwardly. Since the relation between the cross-spectral characteristics and the corresponding correlation function is crucial for understanding the overall correlation process, Fig. 20 illustrates three simple cases that allow the reader to reflect on the concept of correlation. For the sake of simplicity, we have created a short 64-sample data set of purely complex random data. From these data we created a second data set which was circularly rotated by 5 lags. Thereafter, both data streams were Fourier transformed and the cross spectrum was obtained following Eq. 34. The first row in Fig. 20 contains three plots

Fig. 20 Correlation examples based on a synthetic data set. See text for a description of each row’s content


depicting the corresponding cross-spectrum characteristics (amplitude and phase) and the normalized cross-correlation function. As for the latter, we can see that the correlation peak appears exactly at a lag distance of 5, which is what we would have expected based on how we generated the second data set. We can also observe that the phase of the FFT points varies linearly with frequency. Thus, we have graphically confirmed the crucial relation between the correlator phase φ and the delay τ, which is usually expressed as

φ = 2π ντ.

(36)
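A minimal numerical illustration of this phase-slope/peak duality, in the spirit of the synthetic example of Fig. 20 (64 complex random samples, shifted by 5 lags; the sign of the slope depends on the adopted delay convention, cf. the footnote on the definition of τ):

```python
import numpy as np

rng = np.random.default_rng(2)
n, lag = 64, 5
s1 = rng.standard_normal(n) + 1j * rng.standard_normal(n)   # complex random data
s2 = np.roll(s1, -lag)                                      # shifted by 5 lags

# Cross spectrum (Eq. 34): a pure delay shows up as a linear phase slope
# across frequency, phi = 2*pi*nu*tau (Eq. 36), here with a negative sign
# due to the chosen shift direction.
cross = np.fft.fft(s1) * np.conj(np.fft.fft(s2))
slope = -2.0 * np.pi * np.fft.fftfreq(n) * lag
assert np.allclose(np.angle(cross * np.exp(-1j * slope)), 0.0, atol=1e-8)

# Back in the lag domain, the same delay appears as a correlation peak at lag 5.
c12 = np.fft.ifft(cross)
assert int(np.argmax(np.abs(c12))) == lag
```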

This means that any delay between the two data streams manifests itself as a phase slope in the cross spectrum. In general, we can state that a phase slope in the cross spectrum will lead to a peak in the cross-correlation function. Equipped with this knowledge, we can now study two other situations: the impact of noise and the effect of finite bandwidth. If we add a certain amount of noise to the second data set and perform the same analysis as before, we obtain slightly different results, as depicted in the middle row of Fig. 20. As expected, the phase slope in the cross spectrum becomes noisier, which is not a surprise since we are no longer comparing only time-shifted data but need to consider that the data sets are now corrupted by the addition of a certain noise contribution. The other effect we can observe is that the correlation amplitude is no longer one, as in the case of identical samples, but has decreased significantly. If we define the ratio between the correlation peak and the mean of the correlation function without that peak as the signal-to-noise ratio (SNR), we have gained a very intuitive understanding of how noise actually impacts the SNR. Since a large SNR relates to a clear peak detection, it is obvious that the SNR will later (see Sect. 7) be used as a proxy to describe the uncertainty or precision of a delay measurement. If we not only consider the noise contribution but also take into account that we are bound to sample only over a very limited bandwidth, we can carry out another simulation (last row in Fig. 20) that illustrates this effect very well. Reducing the bandwidth can easily be simulated by setting the complex cross spectrum outside a certain pass band to zero and then performing the inverse Fourier transform to access the cross-correlation function.
When doing so, we can observe that the correlation peak is still around the 5-lag position where we would have expected it, but the peak is no longer sharp: it now sits on the highest point of what can be approximated well by a quadratic function. In addition, the peak height has further decreased, and together with our observation that the peak is now surrounded by a broader range of large values, it is again intuitive that we can deduce a relationship to the SNR. We can now directly relate this to what we have defined in Eq. 6, where Tsys is an equivalent measure of the noise contributions and Δν represents the effective bandwidth of the receiving system. Lower noise (or a lower system temperature) leads to a higher correlation peak and thus to a higher SNR. A wider bandwidth, together with a longer observation time, leads to a clearer phase slope in the cross spectrum


and thus a higher and narrower peak in the cross correlation function which in the end translates to a higher SNR or a better formal error of the delay observable.

6.4

Fractional Sample Delay, Fringe Rotation and Quantization Noise

The correlator uses a priori delay and delay rate models which allow it to account for changes in the delay by dynamically adapting the read addresses of the raw data streams. The delay is tracked by the correlator in steps of sample units, which causes a difference between the a priori delay and the discrete delay realized in the correlator. This difference must not exceed one sample unit within an integration/accumulation period, which normally ranges from one to a few seconds. The difference between the applied discrete delay and the actually modeled delay is referred to as the fractional sample delay ε and is stored together with the cross spectrum for each integration period. Since we are dealing with cross spectra, we can easily correct for such delays by

C′12(ν) = C12(ν) · e^(−i2πν(ε1 − ε2)),

(37)

where ε1 and ε2 are the fractional sample delays at stations 1 and 2. Again, we see the advantage of using the cross spectrum, since the operation expressed in Eq. 37 is computationally not very costly and can be implemented very easily. As shown in Fig. 21, one can implement the fractional delay correction directly after the FFT of each station's data stream. When the delay rate has a positive value, the telescope is steadily "moving away" and the Doppler shift thus causes the received signal to be at a lower frequency. Following the same logic, we can state that negative delay rates correspond to Doppler shifts towards higher frequencies. This implies that the signals from two

Fig. 21 Scheme of station-based FX complex correlator according to [83] (modified)


stations cannot be correlated unless the frequencies of the signals refer to the same reference frequency. As depicted in Fig. 21, this can be achieved by frequency conversion of the raw sampling data and is easily implemented by multiplying the data stream with a periodic signal whose frequency corresponds to the Doppler shift.² This process is called fringe stopping and is accomplished by multiplying cosine and sine functions with the time series data of the stations, i.e., before the data are either correlated or Fourier transformed. Since geodetic VLBI deals with observation frequencies of 8 GHz and above, the delay acceleration τ̈g, or equivalently the Doppler rate, needs to be taken into account as well when applying this correction. Fringe stopping is usually performed at frequencies calculated at the baseband frequency ν0 or at the center of the band ν0 + νB/2, where the baseband is defined as the part of the radio frequency signal which is finally converted down to zero frequency. If the a priori model is sufficiently accurate, the resulting residual fringe rate will be at the level of a few mHz. Besides the compensation for the finite delay representation and fringe rotation, one needs to consider that the recording systems are restricted by their data rate, measured in bits per second (bps). During analog-to-digital conversion, the signal is represented by samples having M quantization levels. Thus, it is obvious that the example discussed before and depicted in Fig. 20 would give different correlation amplitudes if we dealt with quantized data. However, if the sampling rate and the quantization level are known, which is usually the case for VLBI, we can correct the obtained correlation amplitudes for these discretization effects and obtain an unbiased correlation amplitude and thus an SNR value that refers to the original analog signal.
According to [80], one can for example correct the correlation amplitude ρc obtained from data sets which were 1-bit quantized by

ρ = sin( (π/2) · ρc ).

(38)

Corrections for other sampling rates or quantization levels can be found e.g. in [77].
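The effect of Eq. 38 can be reproduced with synthetic Gaussian data; the sample size and the true correlation value below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n, rho_true = 500_000, 0.4

# Two zero-mean Gaussian streams with a known analog correlation rho_true.
x = rng.standard_normal(n)
y = rho_true * x + np.sqrt(1.0 - rho_true**2) * rng.standard_normal(n)

# 1-bit quantization keeps only the sign of every sample, which biases the
# measured correlation coefficient towards smaller values.
rho_c = np.mean(np.sign(x) * np.sign(y))

# Eq. 38 recovers the analog correlation amplitude from the quantized one.
rho = np.sin(np.pi / 2.0 * rho_c)
print(rho)          # close to 0.4
```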

6.5

Fringe Fitting and Bandwidth Synthesis

Correlation is performed such that cross spectra are stored after a certain integration period, which leaves us with a time series of spectra. Knowing that the inverse Fourier transform would give us the correlation peak for each of these integrations, we can also try to evaluate how the position of this peak varies over time, which in the end gives us the delay rate value that corresponds to the duration of that particular scan. Thus, the process called fringe fitting uses a set of consecutive cross spectra and tries to find the most likely values of τ and τ˙ that maximize the

² This is a very simplified explanation. For more information the reader is referred to literature dealing with the actual implementation in more detail.


correlation function, both in time and in frequency. Plotting all possible combinations of delay and delay rate gives us the so-called 2D delay resolution function (see Fig. 22), which allows us to identify the most likely values for these two parameters. For the delay direction we already know that we do not need to manually try all delay values but can simply make use of the Fourier transform to obtain the one-dimensional delay resolution function. As it turns out, we can also make use of the Fourier transform in the direction of consecutive cross spectra and obtain a peak at the most likely value of the delay rate. Thus, the 2D delay resolution function can easily be obtained by 2D FFTs performed on a stack of coherently integrated cross spectra. Once we have identified the peak, we can do a fine search around that location by quadratic interpolation and obtain values of τ and τ˙ with a resolution that is better than the corresponding FFT resolution. However, as we are not only collecting data from a single narrow channel but record sampling data from several channels, we are interested in using all this information together to obtain the most likely values of delay and delay rate which maximize a correlation function derived from all these channels. This procedure of combining data from different frequency channels is referred to as bandwidth synthesis [63]. Before we start, however, we have to correct for inter-channel phase differences stemming from the different electrical path lengths of the channels. For this, we make use of the phase calibration signals which were injected at the receiving system (Sect. 5.4). In other words, we apply a phase shift to each of the channels' cross spectra and then perform a fringe fit over all bands. Finding the multi-band delay can be done in two different ways. The first one is that, in a first step, the cross-spectral function (amplitudes and phases w.r.t.
frequency channel) is derived from the individual channels as described for the single-band case. Then, a Fourier transform is carried out to convert the multi-band cross-spectral function into a multi-band delay function. The search for the delay basically works in the same way as for the individual channels, i.e., first a rough search for the location of the peak and then a suitable interpolation with selected spectral points in the vicinity of the peak [83]. The second way of finding the final delay is a joint fit of the single- and multi-band delay as well as the delay rate in a common process as realized, e.g., in the PIMA software [62]. As we have seen before, a wide effective bandwidth increases the SNR and improves the delay precision, which is the ultimate goal to achieve mm- to cm-precise VLBI delays. However, one needs to reflect on the fact that channels are usually not adjacent to each other in the VLBI frequency bands but are separated by at least several MHz. Thus, instead of one large sinc-like correlation peak, we have to expect the actual correlation maximum to be surrounded by a certain number of slightly lower peaks, which are referred to as side lobes (Fig. 23). In general, the channel setup is carefully chosen so that one can expect a minimum side lobe level in the actual data (cf. Sect. 3). A further complication arises from the fact that the full bandwidth is not covered with one continuous channel and the set of channels do not align to each other

Fig. 22 Single channel coarse delay resolution function, [35]




Fig. 23 Multi-band delay function produced with PIMA [62]. S band data with 100 ns ambiguity spacing

without gaps in the frequency domain. For these reasons, we now face the challenge that our obtained delay can be biased by an integer multiple of the base ambiguity spacing. We can thus imagine that the graph in Fig. 23 repeats every 100 ns, which is the base ambiguity spacing for the frequency setup of this example. To address the ambiguity issue in a bit more detail, we can review the tasks of the single- and multi-band delay search also from the aspect of finding a phase slope that connects all the correlation phases of the individual channels. In the case of single-band delays, such a fit is straightforward (see the discussion earlier), as we have phase values at each of the FFT points and thus should be able to detect any phase slope rather easily if the SNR is high enough. In the case of the multi-band delay search, however, we have to connect the phases over a much wider bandwidth, and the fact alone that we have to deal with gaps between the individual bands, where we have no phase information at all, already makes it harder to find the actual phase slope that corresponds to the residual group delay. On top of that, there is a manifold of possible delay slope values, due to the nature of phase measurements being limited to lie between −π and π. Thus, we can only determine the residual multi-band delay by satisfying phase slope criteria, but need to consider that our obtained delay value might be off by several integer multiples of the base ambiguity spacing. As described in more detail in [75], one can find a clear relation between the minimum channel spacing and the size of that base ambiguity. In general, one can


state that the ambiguity spacing τamb corresponds to the greatest common divisor Δνmax of all channel spacings via

τamb = 1/Δνmax.

(39)

Considering, for example, the eight USB channels of the standard geodetic X band setup (Sect. 5.3), we find Δνmax = 20 MHz. Thus, we face an ambiguity spacing of 50 ns, which is fortunately rather easily dealt with, given that the a priori information in the data analysis is good enough to provide theoretical delays with an uncertainty not exceeding half of the ambiguity spacing, i.e., 25 ns or about 7.5 m.
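As a sketch of Eq. 39, with an illustrative (not historically exact) set of channel offsets whose spacings are all multiples of 20 MHz:

```python
from functools import reduce
from math import gcd

# Hypothetical X band channel offsets in MHz relative to the first channel;
# chosen for illustration, not an exact historical frequency sequence.
offsets_mhz = [0, 20, 60, 140, 220, 300, 340, 360]

# Eq. 39: the ambiguity spacing is the inverse of the greatest common
# divisor of all channel spacings.
spacings = [b - a for a, b in zip(offsets_mhz, offsets_mhz[1:])]
dnu_max_mhz = reduce(gcd, spacings)      # greatest common divisor -> 20 MHz
tau_amb_ns = 1000.0 / dnu_max_mhz        # 1/(20 MHz) = 50 ns
print(dnu_max_mhz, tau_amb_ns)
```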

7

Data Analysis

7.1

Functional Model

Data analysis of geodetic and astrometric VLBI follows the same rules as any other geodetic problem with a surplus of observations for the determination of the parameters of interest. Let us first look at the purely geometric relationships following the principal VLBI equation (Eq. 1), which has to be expanded to take into account the fact that the baseline vector b and the unit vector in source direction k have to be expressed in the same reference frame. The main cause for a more elaborate formulation is the Earth's variable rotation [43]. For this reason, the rotation matrices for precession and nutation, daily spin, and polar motion have to be applied. Precession and nutation describe the motion of a conventionally defined reference axis in space with the angular arguments X and Y, which are the coordinates of the Celestial Intermediate Pole (CIP) in the celestial frame [24]. This part of the variable rotation is modeled in the rotation matrix Q(X(t), Y(t)). The Earth's phase of rotation is measured as the so-called stellar angle θ between the Celestial Intermediate Origin (CIO) and the Terrestrial Intermediate Origin (TIO) [18]. It should be noted here that originally the intermediate origins, which strictly speaking are rather zero meridians, were called ephemeris origins. This was changed by a resolution of the International Astronomical Union (IAU) in 2006 (B2). To make the stellar angle a time quantity, a conventional relationship was defined as

θ(Tu) = 2π · (0.7790572732640 + 1.00273781191135448 · (Tu − 2451545.0))

(40)

with Tu = Julian UT1 date (IAU Resolution 2000 B1.8). Solving for Tu provides UT1 from stellar angles θ measured by VLBI observations (in fact the only means to measure UT1). θ is the argument of the rotation matrix of the daily spin S(θ(t)). Finally, the variability of the chosen reference axis (CIP) w.r.t. the Earth's crust is


parameterized by the two components xp and yp of polar motion contained in the rotation matrix of this wobble effect, W(xp(t), yp(t)). These three rotation matrices have to be applied either to transform the baseline vector from the terrestrial reference frame into the celestial reference frame or to transform the unit source vector from the celestial into the terrestrial frame. Both transformations are equivalent and can also be applied in part to the positions of the radio sources and in part to the baseline vector, as is done in practice (see below). In a closed formula, they are embedded in

τ(t) = t2 − t1 = −(1/c) b · W(t) · S(t) · Q(t) · k

(41)

with

b = (x2 − x1, y2 − y1, z2 − z1)ᵀ

(42)

for the Cartesian coordinates of telescopes 1 and 2, and

k = (cos δc · cos αc, cos δc · sin αc, sin δc)ᵀ

(43)

for the positions of the quasars in the celestial system (Sect. 1.2). For the full expansion of the observed delay, we have to look at the relativistic contributions to the functional model and at several further effects. Depending on their nature, these effects are either applied as delay corrections Δτ or as corrections to the coordinate vector xi of the radio telescope. It should be noted that at first we only consider those contributions which can be applied a priori, while in Sect. 7.5.1 we address the remaining model components, which can only be estimated.
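A toy evaluation of Eqs. 41 to 43 can be sketched as follows, with W and Q replaced by identity matrices (polar motion and precession-nutation neglected) and with arbitrary, hypothetical values for the baseline, source coordinates, and stellar angle:

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def source_vector(alpha, delta):
    """Eq. 43: unit vector to a source at right ascension alpha, declination delta (rad)."""
    return np.array([np.cos(delta) * np.cos(alpha),
                     np.cos(delta) * np.sin(alpha),
                     np.sin(delta)])

def spin_matrix(theta):
    """Daily-spin rotation S(theta) about the z-axis by the stellar angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[ c,   s,   0.0],
                     [-s,   c,   0.0],
                     [0.0, 0.0, 1.0]])

# Hypothetical ~6000 km equatorial baseline (Eq. 42) and arbitrary angles.
b = np.array([6_000_000.0, 0.0, 0.0])
k = source_vector(alpha=1.2, delta=0.5)
tau = -(b @ spin_matrix(0.8) @ k) / C       # Eq. 41 with W = Q = I

# Any such delay must respect the global bound of Eq. 28, |tau| <= R_e/c.
assert 0.0 < abs(tau) <= 6_371_000.0 / C
```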

7.2

Relativistic VLBI Model Delay

At this point, relativistic effects need to be taken into account for proper modeling of the delays. The first item to consider is the transformation of time scales. We have to distinguish proper time, measured at an arbitrary location, from coordinate time, which refers to the barycenter of a coordinate system. Any clock on Earth, first of all, measures proper time, while we will need the coordinate time of the solar system barycenter. The proper time at a telescope is normally given in the UTC time scale (Universal Time Coordinated), which relates to the International Atomic Time TAI (Temps Atomique International) through an integer number of leap seconds TLS to keep UTC within 0.9 s of the astronomical time UT1. Terrestrial Time (TT)


is an ideal time which a clock would indicate on the geoid. It has the same rate as TAI but an offset of 32.184 s originating from the changeover from Ephemeris Time (ET) to TAI, thus

TT = TAI + 32.184 s = UTC + TLS + 32.184 s.

(44)

The coordinate time of the Earth is Temps Coordonnée Géocentrique (TCG) or Geocentric Coordinate Time. TCG [in fractions of Julian date] relates to TT [in fractions of Julian date] [9] through

TCG = TT + LG · (TJD − TT0)

(45)

with TJD being TT expressed as Julian date, TT0 being 2443144.5003725 (i.e., TT at 1977 January 1.0 TAI), and LG = 6.969290134 × 10⁻¹⁰ [58]. The coordinate time of the solar system barycenter, TCB (Temps Coordonnée Barycentrique), is related to TCG through

TCB − TCG = (1/c²) · [ ∫_{t0}^{t} ( v⊕²/2 + Uext(X⊕) ) dt + V⊕ · (X − X⊕) ]

(46)

where the vectors X⊕ and V⊕ denote the barycentric position and velocity of the geocenter. The vector X is the barycentric position of the observer and Uext is the Newtonian potential of all of the solar system bodies apart from the Earth, evaluated at the geocenter [9]. t is TCB and t0 is 1977 January 1.0 TAI. Now we turn to the differences in arrival times at the two radio telescopes. The theoretical time difference dTT can be considered as being the same as that of dUTC:

dTT = dTAI = dUTC = (t2)TT − (t1)TT

(47)
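The time-scale chain of Eqs. 44 and 45 can be sketched as plain arithmetic; the leap-second count TLS = 37 s used below is the value valid since 2017 (an assumption of this sketch, since TLS changes over time):

```python
L_G = 6.969290134e-10            # rate difference between TCG and TT
TT0_JD = 2443144.5003725         # TT at 1977 January 1.0 TAI, as Julian date

def tt_from_utc(utc_sec, t_ls=37.0):
    """Eq. 44: TT = UTC + T_LS + 32.184 s (seconds within the same day)."""
    return utc_sec + t_ls + 32.184

def tcg_from_tt(tt_jd):
    """Eq. 45: TCG = TT + L_G * (T_JD - TT_0), all in Julian dates."""
    return tt_jd + L_G * (tt_jd - TT0_JD)

# TCG runs ahead of TT by roughly 22 ms per year:
one_year_later = TT0_JD + 365.25
drift_sec = (tcg_from_tt(one_year_later) - one_year_later) * 86400.0
print(drift_sec)                 # ~0.022 s
```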

From the proper time frame to the Geocentric Coordinate Time (TCG) frame, the time difference needs to be scaled through

dTCG = dTT / (1 − LG)

(48)

with LG = 6.969290134 × 10⁻¹⁰ as above. It should be noted here that the VLBI coordinate frame is in fact (only) related to the TT time scale ("TT-compatible") for consistency with the other observing techniques in the ITRF, as decided at an ITRF workshop in 2000, although IAU and IUGG resolutions call for TCG compatibility [58]. The next step is the transformation of the Earth-fixed geocentric coordinates xi of the radio telescopes, e.g., given in the ITRS (Sect. 1.2), into the barycentric frame Xi. For this, xi has to be transformed from the ITRS into the Geocentric Celestial Reference System (GCRS), which has the same origin but fixed axis directions with


respect to the extra-galactic radio sources of the ICRS. The transformation then just consists of adding the barycentric radius vector of the geocenter, X⊕(t1), to the radius vector xi of telescope i (in the GCRS):

Xi(t1) = X⊕(t1) + xi(t1)

(49)

The reference epoch is t1, i.e., when the signal arrived at telescope 1. All quantities to be addressed further down will also be referred to this epoch because it is the reference time of observation. Here it is in the TCG time scale, which needs a scaling as in Eq. 45. A very important contribution to the VLBI time delay originates from the retardation of the signal paths through gravitational bending by the celestial bodies. It is part of the relativistic formulation of the delay (Eq. 57). For modeling the bending effect, the time of closest approach of a signal originating from a quasar and passing a celestial body is needed. For the Jth gravitating body, it can be expressed as the minimum

t1J = min( t1, t1 − K · (XJ(t1) − X1(t1)) / c ).

(50)

K is the unit vector from the solar system barycenter to the source in the absence of gravitational or aberrational bending [58]. This minimum actually means that the epoch stays at t1 if the signal does not pass the body before it arrives at telescope 1. For the final expression, we need the vectors between the positions of the two telescopes and the gravitating body at the time of nearest approach, R1J(t1) and R2J(t1):

R1J(t1) = X1(t1) − XJ(t1J)

(51)

R2J(t1) = X2(t1) − (V⊕/c) · (K · b) − XJ(t1J)

(52)

V⊕ is the barycentric velocity of the geocenter and b is the geocentric baseline vector. The general relativistic delay, ΔTgrav, a.k.a. the Shapiro term, then is

ΔTgrav^J = (1 + γ) · (GMJ/c³) · ln[ (|R1J| + K · R1J) / (|R2J| + K · R2J) ]

(53)

with MJ being the rest mass of the J th gravitating body and γ the so-called light deflection parameter equal to 1 according to the Theory of General Relativity (GRT).
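A toy evaluation of Eq. 53 for the Sun as the gravitating body can be sketched as follows; the geometry (both telescopes about 1 AU from the Sun, source direction perpendicular to the Sun direction, a 6000 km baseline component along K) is chosen purely for illustration:

```python
import numpy as np

C = 299_792_458.0
GM_SUN = 1.32712440018e20        # heliocentric gravitational constant, m^3/s^2

def shapiro_delay(K, R1, R2, GM=GM_SUN, gamma=1.0):
    """Eq. 53: gravitational (Shapiro) delay caused by one gravitating body."""
    num = np.linalg.norm(R1) + K @ R1
    den = np.linalg.norm(R2) + K @ R2
    return (1.0 + gamma) * GM / C**3 * np.log(num / den)

# Hypothetical geometry: telescope positions relative to the Sun.
AU = 1.495978707e11
K = np.array([1.0, 0.0, 0.0])                    # source direction
R1 = np.array([0.0, AU, 0.0])                    # telescope 1 w.r.t. the Sun
R2 = R1 + np.array([6_000_000.0, 0.0, 0.0])      # telescope 2 w.r.t. the Sun

dt_grav = shapiro_delay(K, R1, R2)
print(dt_grav)       # a few hundred picoseconds in magnitude
```

For a ray grazing the solar limb, the same expression yields much larger values, which is why the solar contribution must always be modeled.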


The gravitational delay due to the Earth, at the picosecond level, is

ΔTgrav^⊕ = (1 + γ) · (GM⊕/c³) · ln[ (|x1| + K · x1) / (|x2| + K · x2) ]

(54)

with M⊕ being the rest mass of the Earth. The total gravitational delay can be expressed as the sum over all gravitating bodies, i.e., the Sun, the Earth, the Earth's moon, and the other planets,

ΔTgrav = Σ_J ΔTgrav^J.

(55)

In the barycentric frame, the vacuum delay is

T2 − T1 = −(1/c) K · (X2(T2) − X1(T1)) + ΔTgrav

(56)

Equation 56 is now converted (back) by a Lorentz transformation from the barycentric into the geocentric system. This is done w.r.t. two groups of quantities: the barycentric vectors Xi are transformed into the corresponding geocentric vectors xi, and the barycentric time difference (T2 − T1) into the geocentric time difference (t2 − t1). The geometric delay then follows as

(t2 − t1)vac = [ ΔTgrav − (K·b/c) · ( 1 − (1+γ)U/c² − |V⊕|²/(2c²) − (V⊕·w2)/c² ) − (V⊕·b/c²) · ( 1 + (K·V⊕)/(2c) ) ] / [ 1 + K·(V⊕ + w2)/c ].

(57)

This is the formulation for the geometry in vacuum [58]. wi is the geocentric velocity vector of telescope i, i.e., w.r.t. the geocenter. V⊕ contains the x-, y-, and z-velocity components of the geocenter itself. U is the total gravitational potential of the solar system at the geocenter, neglecting the effects of the Earth's mass. This formula thus contains the effects of daily and annual aberration. In the context of VLBI, this means that while the signal continues traveling for a time τ after arriving at telescope 1, telescope 2 has changed its position w.r.t. that at t1, both in the geocentric and in the solar system barycentric frame.

7.3

Atmospheric Refraction

Another very important contribution to the delay is that of atmospheric refraction. In terms of refraction, the atmosphere is separated into the charged compartment, generally called the ionosphere, and the neutral part which some authors call the troposphere. This, however, falls short of the fact that only about 75% of the


refraction effects in the neutral atmosphere take place within the troposphere, while 25% happen above [25]. For observations in the radio frequency domain, the ionosphere is dispersive. For this reason, VLBI observations have always been carried out with two separate frequencies (at X band and S band, Sect. 3) for calibration of the primary observing frequency, i.e., X band. The correction Δτion^X for the X band observable is easily computed by

Δτion^X = (τS − τX) · νS² / (νS² − νX²)

(58)

where τS and τX are the measured multiband group delays and νS and νX the observed frequencies at S band and X band, respectively. Since the observed frequency of each band actually results from a combination of several frequencies through bandwidth synthesis, these frequencies need to be computed as a weighted mean over all channels

ν_gr = √[ (Σρi · Σρi(νi − ν0)² − (Σρi(νi − ν0))²) / (Σ(ρi/νi) · Σρi(νi − ν0) − Σρi · Σρi(νi − ν0)/νi) ]    (59)

with all sums running over the channels i = 1, …, N.
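The two-band calibration (Eq. 58) and the effective frequency of a bandwidth synthesis setup (Eq. 59) can be sketched as follows; function and variable names are ours, and the formula for ν_gr follows the weighted least-squares reading of Eq. 59.

```python
from math import sqrt

def iono_correction_x(tau_s, tau_x, nu_s, nu_x):
    """X band ionosphere correction (Eq. 58) from the S/X group
    delays [s] and the effective band frequencies [Hz]."""
    return (tau_s - tau_x) * nu_s**2 / (nu_s**2 - nu_x**2)

def effective_group_frequency(rho, nu, nu0):
    """Effective frequency of a bandwidth synthesis setup (Eq. 59).

    rho  channel weights (e.g., from the number of correlated bits or SNR)
    nu   channel sky frequencies [Hz]
    nu0  reference frequency, e.g., that of channel #1
    """
    s_r   = sum(rho)
    s_rd  = sum(r * (f - nu0) for r, f in zip(rho, nu))
    s_rd2 = sum(r * (f - nu0)**2 for r, f in zip(rho, nu))
    s_ri  = sum(r / f for r, f in zip(rho, nu))
    s_rdi = sum(r * (f - nu0) / f for r, f in zip(rho, nu))
    return sqrt((s_r * s_rd2 - s_rd**2) / (s_ri * s_rd - s_r * s_rdi))
```

For two equally weighted channels the effective frequency falls between the channel frequencies, close to their geometric mean.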

ρi are the weights of individual channels which can be computed from the number of correlated bits or the SNR. ν0 is the reference frequency of the bandwidth synthesis setup, e.g., channel #1 [32, 60].

A common way of modeling refraction in the neutral atmosphere is the separation of Δτatm into a hydrostatic and a water vapor part. The latter is often called the wet part. Both are separated further into a component reflecting the refraction effect in zenith direction and a mapping function for the conversion into the direction of the observation. The delay contribution Δτatm(ε) is commonly represented as an extra path length ΔLatm(ε) which is easily transformable to a time unit quantity through the speed of light. If the model only considers a dependency on the elevation angle ε, it can be composed as

ΔLatm(ε) = ΔLzh · mh(ε) + ΔLzw · mw(ε).

(60)

mh(ε) and mw(ε) are the mapping functions which are applied to map the extra path lengths, or their time equivalents, from zenith direction onto any non-zenith elevation angle ε. A simple 1/sin ε scaling is not sufficient, in particular at lower elevations, because the curvature of the atmosphere as a consequence of the curvature of the Earth's surface would not be taken into account. Most modern mapping functions thus are functions of ε in a finite continued fraction form with coefficients (a, b, c) [31]. They are specific for the hydrostatic (Eq. 61) and the wet component

20 Very Long Baseline Interferometry

1297

(Eq. 62) because the thickness of the effective atmosphere is more than 10 km in the first case and only about 2 km in the latter [52].

mh(ε) = [1 + ah/(1 + bh/(1 + ch))] / [sin ε + ah/(sin ε + bh/(sin ε + ch))]    (61)

mw(ε) = [1 + aw/(1 + bw/(1 + cw))] / [sin ε + aw/(sin ε + bw/(sin ε + cw))]    (62)
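The continued fraction of Eqs. 61 and 62 translates directly into code. In the sketch below the coefficient values are only placeholders of a plausible order of magnitude for a hydrostatic mapping function, not those of any published model.

```python
import math

def mapping_function(eps, a, b, c):
    """Continued fraction mapping function (Eqs. 61/62) for an
    elevation angle eps [rad] with coefficients (a, b, c)."""
    num = 1.0 + a / (1.0 + b / (1.0 + c))
    s = math.sin(eps)
    den = s + a / (s + b / (s + c))
    return num / den

# placeholder hydrostatic-like coefficients (order of magnitude only)
ah, bh, ch = 1.2e-3, 2.9e-3, 62.6e-3
m_zenith = mapping_function(math.pi / 2, ah, bh, ch)   # exactly 1 by construction
m_low = mapping_function(math.radians(5), ah, bh, ch)  # roughly 10 at 5 deg elevation
```

Note that at low elevations the mapping factor stays below the naive 1/sin ε, reflecting the curvature of the atmospheric layers.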

While in earlier mapping functions the coefficients mostly depended only on the geographic latitude of the telescope [31, 51], modern mapping functions are inferred from numerical weather models [13, 14, 44]. The hydrostatic component of the extra path length in zenith direction ΔLzh follows a development of [64–66], finally formulated by [22] as

ΔLzh = 0.0022768 [m hPa⁻¹] · P0 / (1 − 0.00266 cos 2φ − 0.00028 H)    (63)

It predominantly depends on the barometric pressure P0 [in hPa] at the height of the VLBI reference point and to a very small degree on the latitude φ and the height above the geoid H. At sea level this corresponds to roughly 2.28 m extra path length or 7.6 ns. The hydrostatic path delay for each observation can be applied a priori according to

Δτatmh(ε) = (1/c) · ΔLzh · mh(ε).    (64)
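Eqs. 63 and 64 combine into a short a priori correction routine. A sketch under two assumptions not stated above: H is taken in km, and φ is the geodetic latitude; function names are ours.

```python
import math

C = 299792458.0  # speed of light [m/s]

def zenith_hydrostatic_delay(p0_hpa, lat_rad, h_km):
    """Extra path length in zenith direction, ΔLzh [m] (Eq. 63);
    p0_hpa is the pressure at the VLBI reference point [hPa]."""
    return 0.0022768 * p0_hpa / (1.0 - 0.00266 * math.cos(2.0 * lat_rad)
                                 - 0.00028 * h_km)

def hydrostatic_delay(p0_hpa, lat_rad, h_km, mh):
    """A priori hydrostatic delay [s] (Eq. 64); mh is the hydrostatic
    mapping function value at the elevation of the observation."""
    return zenith_hydrostatic_delay(p0_hpa, lat_rad, h_km) * mh / C

# sea level, mid-latitude: roughly 2.3 m or 7.7 ns in zenith direction
dl = zenith_hydrostatic_delay(1013.25, math.radians(45.0), 0.0)
```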

A determination of the wet component along the line of sight ΔLw with sufficient accuracy regularly fails because it would need a solution for an integral along the path S over the partial pressure of the water vapor e, the temperature T and a compressibility factor Zw ≈ 1 [57], scaled by the mean temperature Tm according to the model developed by [26]:

ΔLw = [1 + 6 × 10⁻⁵ Tm] · 3.754 ∫S (e/T²) Zw⁻¹ ds.    (65)


The reason is that the water vapor in the atmosphere is highly variable in space and time. Our inability to do a forward modeling of the wet component with sufficient accuracy leads to the generally applied strategy of simply estimating an extra delay in zenith direction attributed to water vapor, Δτwz = ΔLzw/c, after the observations have been calibrated with a model hydrostatic contribution according to Eq. 64. More details on the background of refraction in the neutral atmosphere in VLBI can be found in [15], while [2] give detailed insights into ionospheric refraction.

7.4

Further Model Contributions

For the final geometric time delay in the terrestrial system, all known contributions have to be added to Eq. 57. The first ones are those of refraction through the ionosphere Δτion according to Eq. 58 and through the neutral atmosphere Δτatmh according to Eq. 64. At this point we also have to take into account that, besides the aberration effects in vacuum embedded in Eq. 57, the extra travel time through the charged and neutral atmosphere at telescope 1 also allows telescope 2 to change its position due to the rotation of the Earth and its movement in space before the signal arrives there, causing an additional aberration contribution Δτabbatm

Δτabbatm = Δτatm1 · K · (w2 − w1)/c.    (66)

Further known corrections to be applied to the theoretical delay are tidal displacements and loading effects caused by the variability in the load forces of the oceans and the atmosphere. A complete list of them can be found in the Conventions of the International Earth Rotation and Reference Systems Service (IERS) [58] with many references to the original scientific publications. These effects all lead to displacements of the radio telescopes ΔXmod on the inter-annual to sub-daily time scale. Since they are all in the linear domain for the analysis at hand, their impact can be easily transferred to the time delay domain through computing

Δτmod = (∂τ/∂Xmod) · ΔXmod.    (67)
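Eq. 67 is a first-order propagation. For the simple geometric model τ = −k · b/c (Eq. 41) with b = X2 − X1, the partials w.r.t. the station positions are ±k/c, so a displacement maps to a delay as sketched below (naming and sign convention are ours).

```python
C = 299792458.0  # speed of light [m/s]

def delay_from_displacements(k, dX1, dX2):
    """First-order delay change (Eq. 67) caused by station
    displacements dX1, dX2 [m], using dtau/dX2 = -k/c and
    dtau/dX1 = +k/c from tau = -k.(X2 - X1)/c."""
    return -sum(ki * (d2 - d1) for ki, d1, d2 in zip(k, dX1, dX2)) / C

# a 3 cm displacement of telescope 2 along the source direction
# corresponds to about 100 ps of delay
dtau = delay_from_displacements((1.0, 0.0, 0.0),
                                (0.0, 0.0, 0.0), (0.03, 0.0, 0.0))
```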

It should be noted that the latter contributions, which are coordinate displacements in their original form, do not necessarily need to be applied at this stage. As adopted in some software packages they may also be applied to the a priori coordinates before going through the full relativistic formulation in Eq. 57 (Fig. 24). The formulation of the final theoretical or a priori delay depends on whether the corrections mentioned above are actually applied to the observed delay, to the geometry or to the a priori delay. In

Fig. 24 Standard Gauss-Markov model VLBI analysis data flow (flow chart: the observed group delay is corrected for ionosphere, hydrostatic atmosphere, instrumental calibration, thermal expansion and gravitational deformation; a priori telescope coordinates, radio source positions and Earth orientation are propagated to the epoch of observation including continental drift, Earth tides, pole tides, oceanic and atmospheric loading and local deformations, and transformed into the SS barycentre; the model group delay from the relativistic SSB geometry and the time transformation from coordinate to proper time yield O−C, the design/Jacobi matrix and the normal equation system with the target parameters: telescope coordinates, radio source positions, pole coordinates/UT1−UTC, nutation parameters, clock parameters, atmosphere parameters)

τapriori = τvac + Δτabbatm + ΔτE.T . + ΔτP .T . + ΔτO.L. + ΔτA.L.

(68)

the contributions ΔτE.T. of the Earth tides, ΔτP.T. of the pole tide, ΔτO.L. of ocean loading, and ΔτA.L. of atmospheric loading are applied to the theoretical delay. In contrast to this, the contribution ΔτT.E. of the telescopes' thermal expansion, ΔτG.D. of the telescopes' gravitational deformations, and ΔτCable of the telescopes' cable delays (Sect. 5.4) are applied as corrections to the observed delay

τobscorr = τobs − Δτion − Δτatmh − ΔτT.E. − ΔτG.D. − ΔτCable

(69)

All the Δτ∗ of course have to be computed as differences related to telescopes 1 and 2. Applying the respective effects to the theoretical delays provides the vector of computed values c (computed ≡ τapriori) and the vector of observations o (observed ≡ τobscorr). Forming the differences produces the linearized observation vector


y=o−c

(70)

for the parameter estimation.

7.5

Parameter Estimation

Various estimation schemes exist to determine the parameters of interest from VLBI multiband group delay observations. The primary difference is how they handle the stochastic nature of the clocks and in particular the atmosphere. Kalman filters [33, 40, 53, 71], least-squares collocation [78], and square-root information filters [16] treat the wet part of the atmosphere and also the clock behavior as a purely stochastic process. Most common, however, are standard least-squares adjustments in Gauß-Markov models. It should be mentioned here that in comparison campaigns it could not be substantiated that any of these estimation processes produces superior results. A drawback of the filter solutions is that a combination of the results on the basis of normal equation systems is not directly possible, and for least-squares collocation, the extraction of a partly resolved normal equation system is quite complicated.

7.5.1 Gauß-Markov Model
The estimation process in a Gauß-Markov model can be followed easily in Fig. 24, where the raw observations produced in the correlation and fringe fitting process are first corrected for all known effects mentioned before, including the hydrostatic part of the atmosphere. On the right hand side, the radio source positions are transformed into positions of date by the precession/nutation rotations composed in the rotation matrix Q(t) in Eq. 83. The a priori telescope coordinates on the right hand side are mostly those of the ITRF (Sect. 1.2) which are transformed to the epoch of the observations by applying the corrections for the station velocity components in the ITRF and the rotations from the catalog frame to the frame of date through the Earth rotation parameters, polar motion and the Earth's phase of rotation. Further corrections to the coordinates are the effects of the Earth tides, the pole tide, and oceanic and atmospheric loading as mentioned above.

Starting point for the explanation of the parameter estimation in the Gauß-Markov model is the so-called observation equation based on the initial formulation in Eq. 41. Since the two clocks are never synchronized to the accuracy needed, the geometric model is expanded for the respective, time dependent clock offset. The clock behavior is time dependent and we can formulate the clock contributions Δτclo as second order polynomials

Δτclo(t) = −(T10 + T11 · (t − t0) + T12 · (t − t0)²) + (T20 + T21 · (t − t0) + T22 · (t − t0)²)    (71)

to be added to Eq. 41. T∗j are the parameters of the clock polynomial at telescope 1 or 2.

k   Unit vector in source direction (Eq. 43) with its three components ki
b   Baseline vector (Eq. 42)
c   Velocity of light
W   Rotation matrix for polar motion with the angular arguments xp and yp
S   Rotation matrix for the daily spin of the Earth with the angular argument θ/UT1
Q   Transformation matrix for precession and nutation with the angular arguments X and Y
T∗j   Parameters of the clock polynomial at telescope 1 or 2
MF∗w   Mapping function of the wet part of the atmosphere at station 1 or 2
AT∗   Zenith wet delays at station 1 or 2
τcorr   Sum of all corrections
ε   Measurement noise

The contribution of the refraction caused by water vapor, which we would like to estimate, is expressed as the product of a zenith effect and a mapping function (Sect. 7.3). In a simple form, we can then write the observation equation as

τobs(t) = −(1/c) b · W(xp, yp) · S(θ) · Q(X, Y) · k
        − (T10 + T11 · (t − t0) + T12 · (t − t0)²)
        + (T20 + T21 · (t − t0) + T22 · (t − t0)²)
        + MF1w · AT1 + MF2w · AT2
        + τcorr + ε    (72)

with the symbols as listed above. In a least squares adjustment of VLBI observations, the clock parameters always have to be estimated because of the lack of synchronized clocks at far distant locations. All other parameters of interest such as telescope coordinates, Earth orientation parameters, radio source positions, and zenith wet delays may be included in the vector of unknown parameters x more or less freely depending on the observation constellation. The classical least squares adjustment in a Gauß-Markov model reads

y=A·x+v

(73)

with the covariance matrix of the observations Σyy. Here, y describes the n × 1 vector of the reduced observations (Eq. 70), x is the m × 1 vector of the unknown parameters and v is the n × 1 vector of the residuals [42]. A is the n × m Jacobian or


linearized design matrix containing the partial derivatives of the observations w.r.t. the parameters to be estimated A(i, j ) =

∂τobs,i . ∂xj

(74)

Σyy is the covariance matrix of the observations according to Eq. 78 and the inverse of the covariance matrix is the weight matrix P = Σyy⁻¹. The vector of the estimated parameters x̃ is computed with

x̃ = (Aᵀ Σyy⁻¹ A)⁻¹ Aᵀ Σyy⁻¹ y    (75)

minimizing the sum of squared residuals

vᵀPv → min    (76)

where the residuals v can be computed from v = A · x̃ − y,

(77)

[42].
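Eqs. 73–77 are the standard weighted least-squares machinery. A minimal NumPy sketch on synthetic data (all names ours) fitting a clock-like linear trend:

```python
import numpy as np

def gauss_markov(A, y, Sigma_yy):
    """Weighted least-squares estimate (Eq. 75) and residuals (Eq. 77)."""
    P = np.linalg.inv(Sigma_yy)             # weight matrix P = Sigma_yy^-1
    N = A.T @ P @ A                         # normal equation matrix
    x_hat = np.linalg.solve(N, A.T @ P @ y) # Eq. 75
    v = A @ x_hat - y                       # residuals, Eq. 77
    return x_hat, v

# synthetic example: observations following y = c0 + c1 * t exactly
t = np.linspace(0.0, 1.0, 5)
A = np.column_stack([np.ones_like(t), t])   # Jacobian, Eq. 74
y = 2.0 + 3.0 * t
x_hat, v = gauss_markov(A, y, np.eye(len(t)))
```

In practice one solves the normal equations rather than inverting them explicitly; the explicit inverse of Σyy is shown here only to mirror Eq. 75.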

7.5.2 Stochastic Model
The covariance matrix of the observations Σyy is deduced from the a priori variance factor σ0² and the cofactor matrix Qyy

Σyy = σ0² Qyy.    (78)

The cofactor matrix of the observations Qyy regularly only contains diagonal elements, i.e., the variances of the observations qii = σi², which are deduced from the signal-to-noise ratio of the correlation process (Sect. 6.3). It has been found very early that these are too optimistic and that the global test of the χ² factor, resulting from the quotient of the variance factor a posteriori σ̃0² and the variance factor a priori σ0²,

χ² = σ̃0²/σ0²    with    σ̃0² = vᵀPv/(n − m)    (79)

for n observations and m unknowns, is nowhere near the required unity. For this reason, the most common way of handling this problem in operational analyses, i.e., for standard solutions of the IVS, is by inflating the variances from the correlator σcorr² with an additive constant σadd²

σadj² = σcorr² + σadd²    (80)

20 Very Long Baseline Interferometry

1303

to achieve that the input variances to the adjustment σadj² match those a posteriori, driving the χ² in Eq. 79 close to 1. The deficit here is twofold. The first is that the error contributions are computed solely to make the χ² in Eq. 79 unity; thus they are purely empirical and lack any physical explanation. The second deficit is that correlations between observations are ignored entirely. Various attempts have been made to cope with this problem, also for other space geodetic techniques, but solutions tended to be computationally expensive and thus not suitable for operational application. Lately, a very promising approach using turbulence theory as the driver for the correlations was published. Due to its extremely good results, and even more so because the algorithm has favorable runtime costs, it is well suited for operational applications [30].
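The additive inflation of Eq. 80 can be found by a simple bisection on σadd until the χ² of Eq. 79 reaches unity. The sketch below (names ours) assumes a diagonal weight matrix and that χ² > 1 without inflation, i.e., that the correlator variances are indeed too optimistic.

```python
def chi_squared(v, sigma_corr, sigma_add, m):
    """Global test value (Eq. 79) with inflated variances (Eq. 80);
    v are the residuals, m the number of unknowns."""
    n = len(v)
    s = sum(vi**2 / (sc**2 + sigma_add**2) for vi, sc in zip(v, sigma_corr))
    return s / (n - m)

def find_sigma_add(v, sigma_corr, m, hi=1.0):
    """Bisect sigma_add so that chi^2 becomes ~1; chi^2 decreases
    monotonically with growing sigma_add."""
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi_squared(v, sigma_corr, mid, m) > 1.0:
            lo = mid   # variances still too optimistic: inflate more
        else:
            hi = mid
    return 0.5 * (lo + hi)
```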

7.5.3 Extension of the Functional Model
Over the years, it had been realized that the functional model as formulated in Eq. 72 could be improved in several areas. The first one to mention is the pseudo-stochastic treatment of the clock behavior beyond a pure second order polynomial. Here, the parameter model is expanded by a number of additional parameters, primarily for increased time resolution, using continuous piece-wise linear functions (CPWLF), i.e., linear splines [23], with time segments of one hour or less. For this purpose the final observation equation (Eq. 83) also contains elements of these splines in the form of

x(t) = [x(ti) − x(ti−1)]/(ti − ti−1) · (t − ti−1)    (81)

for time segments from ti−1 to ti. The same also applies to the temporal resolution of the zenith wet delay, for which the observation equation (Eq. 83) is expanded in lines 6 and 7. So far, the functional model assumes isotropy and just contains the wet delay in zenith direction as one or more parameters to be estimated. However, this does not account for the anisotropy of refraction which may be caused, e.g., by the increased thickness of the atmospheric layers towards the equator. For this reason, [47] introduced gradients in the model

τgrad(t) = MF1w(ε1) · cot ε1 · [G1n cos α1 + G1e sin α1]
         + MF2w(ε2) · cot ε2 · [G2n cos α2 + G2e sin α2]    (82)

It should be emphasized here that any remaining unmodeled hydrostatic contribution will be compensated for in the estimated wet component and in the gradients. The same applies to inaccurate pressure monitoring at the sites. For this reason the absolute values of the wet zenith delays may not be representative or may even be utterly wrong, e.g., if they are negative.
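The CPWLF representation (Eq. 81) and the gradient term of one telescope (Eq. 82) can be sketched as follows; names are illustrative, mf stands for a wet mapping function value, and az for the azimuth of the observation.

```python
import math
from bisect import bisect_right

def cpwlf(t, nodes, values):
    """Evaluate a continuous piece-wise linear function (Eq. 81)
    with estimated values at the given epoch nodes."""
    i = max(1, min(len(nodes) - 1, bisect_right(nodes, t)))
    slope = (values[i] - values[i - 1]) / (nodes[i] - nodes[i - 1])
    return values[i - 1] + slope * (t - nodes[i - 1])

def gradient_delay(mf, eps, az, g_north, g_east):
    """Azimuth-dependent gradient contribution of one telescope (Eq. 82)."""
    return mf * (1.0 / math.tan(eps)) * (g_north * math.cos(az)
                                         + g_east * math.sin(az))
```

The partial derivatives of the CPWLF w.r.t. the node values are hat functions, which is what enters the Jacobian matrix for these parameters.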


T∗j   Parameters of the clock polynomial at station 1 or 2
T∗(ti)   Coefficients of linear splines at station 1 or 2 for a specific epoch ti, parametrized as continuous piece-wise linear functions (CPWLF), i.e., linear splines
MF∗w   Mapping function of the wet part of the atmosphere at station 1 or 2
AT∗(ti)   Zenith wet delays parametrized as CPWLF at station 1 or 2 for a specific epoch ti
G∗n/e   Atmospheric gradients in north or east direction at telescope 1 or 2

Taking all the augmentations of the functional model into account, the full observation equation in a Gauß-Markov model then reads

τobs(t) = −(1/c) b(t) · W(t) · S(t) · Q(t) · k
        − [T10 + T11 · (t − t0) + T12 · (t − t0)²]
        + [T20 + T21 · (t − t0) + T22 · (t − t0)²]
        − [T1(t0) + (T1(t1) − T1(t0))/(t1 − t0) · (t − t0) + … + (T1(ti) − T1(ti−1))/(ti − ti−1) · (t − ti−1)]
        + [T2(t0) + (T2(t1) − T2(t0))/(t1 − t0) · (t − t0) + … + (T2(ti) − T2(ti−1))/(ti − ti−1) · (t − ti−1)]
        + MF1w · [AT1(t0) + (AT1(t1) − AT1(t0))/(t1 − t0) · (t − t0) + … + (AT1(ti) − AT1(ti−1))/(ti − ti−1) · (t − ti−1)]
        + MF2w · [AT2(t0) + (AT2(t1) − AT2(t0))/(t1 − t0) · (t − t0) + … + (AT2(ti) − AT2(ti−1))/(ti − ti−1) · (t − ti−1)]
        + MF1w(ε1) · cot ε1 · [G1n cos α1 + G1e sin α1]
        + MF2w(ε2) · cot ε2 · [G2n cos α2 + G2e sin α2]
        + τcorr + ε    (83)

with the symbols as listed above.


7.5.4 Solution Types
Before turning to the final estimation, we have to distinguish between single session analysis and multi-session, so-called global, analysis. The standard parameter set for single session solutions consists of four groups of parameters which are considered as local parameters, i.e., only valid at the epoch of the session:
• the parameters of the clock model
• radio telescope coordinates xi, yi, zi
• Earth orientation parameters, i.e., the two polar motion components xp, yp, the Earth's phase of rotation UT1–UTC, and two adjustment parameters for the precession/nutation model dX, dY
• the parameters of the refraction model due to the wet component of the atmosphere.
The single session analysis is always the first step, mainly devoted to preparing the session data and establishing the final setup of the session. First of all the ambiguities have to be resolved, which result from the possibly ambiguous phase slope (Sect. 6.5). Special care has to be taken that in all possible triangles the sum of the delays closes to zero within a few ns. After this has been done, the X band group delays are corrected for ionospheric refraction effects according to Eq. 58. It can be expected that systematic biases between the S and X band delays prevail which contaminate the ionosphere correction. However, since these are of a constant nature, they will finally disappear in the clock parameter estimates. Next comes the parameterization of the clock and atmosphere parameters, mainly depending on the number of individual delay observations and the possible and necessary time resolution of the parameters. Equation 83 contains all the parameters which are estimated in routine processing. However, we have to consider two peculiarities. The first is that we can only estimate clock parameters which are relative to a reference clock. Regularly, one of the most stable clocks is selected based on experience.
For this purpose the design matrix does not contain coefficients for the reference clock, resolving the singularity which would appear otherwise. The second caveat is the estimation of the atmospheric refraction parameters. Normally, VLBI telescopes are far enough apart that the estimates of the wet zenith path delays of any two telescopes are de-correlated. However, with the advent of more than one radio telescope at a single observatory, e.g., at Wettzell (Germany), Onsala (Sweden), or Ny Alesund (Norway), the partial derivatives of the first order wet zenith delay of two nearby telescopes

∂τ/∂AT∗(t0) = MF∗w(ε)    (84)

and of course all other continuous piece-wise linear polygons are highly correlated through the almost identical elevation angles. For this reason, only relative zenith wet delays can be estimated in cases where the two telescopes are close together. As


soon as a third telescope provides observations on longer baselines with different elevation angles, the estimation of absolute zenith wet delays for all telescopes becomes possible again.
The estimation of parameters of piecewise linear functions only works if a sufficient number of observations is available in each segment. Unfortunately, sometimes this is not the case because of failures or deliberate exclusions. In such a case the system of normal equations would become degenerate. However, this is regularly cured by introducing constraints in the form of pseudo observations to the effect that the rates between every two functional values should be zero:

d(F(ti) − F(ti−1))/dt = 0.    (85)
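The rate constraints of Eq. 85 enter the adjustment as pseudo observations: extra rows in the design matrix demanding zero rate between neighboring CPWLF values, with the chosen standard deviation controlling their tightness. A NumPy sketch (names ours):

```python
import numpy as np

def add_rate_constraints(A, y, sigmas, node_idx, dt, sigma_rate):
    """Append pseudo observations (Eq. 85) forcing the rate between
    consecutive CPWLF parameters (columns node_idx) towards zero."""
    rows = []
    for j0, j1 in zip(node_idx[:-1], node_idx[1:]):
        row = np.zeros(A.shape[1])
        row[j1], row[j0] = 1.0 / dt, -1.0 / dt  # (x_i - x_{i-1}) / dt = 0
        rows.append(row)
    A_c = np.vstack([A] + rows)
    y_c = np.concatenate([y, np.zeros(len(rows))])
    s_c = np.concatenate([sigmas, np.full(len(rows), sigma_rate)])
    return A_c, y_c, s_c

# two CPWLF parameters, but only the first one is observed:
# singular without the constraint, solvable with it
A = np.array([[1.0, 0.0]])
y = np.array([3.0])
A_c, y_c, s_c = add_rate_constraints(A, y, np.array([1.0]), [0, 1], 1.0, 1e-3)
P = np.diag(1.0 / s_c**2)
x = np.linalg.solve(A_c.T @ P @ A_c, A_c.T @ P @ y_c)
```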

The selection of the variances of these pseudo observations determines how loosely or how tightly the constraint affects the estimation. For the clock parameters, normally a standard deviation on the order of about 5 × 10⁻¹⁴ s/s is chosen, and for the zenith wet delay one of about 50 ps/h. Another issue concerns the fact that the VLBI technique is only capable of producing relative telescope coordinates. This means that any solution setup per se is free of datum and the normal equation system would be singular if all telescope coordinates and the Earth rotation parameters (xp, yp, UT1–UTC) were estimated. It has a rank defect of six: three translations and three rotations. In a single session, the simplest option would be to fix the three coordinate components of one telescope and the three Earth rotation parameters, i.e., eliminate them from the normal equation system. Another, more flexible way, being less dependent on the coordinates of a single telescope, is the introduction of no-net-rotation (NNR) and no-net-translation (NNT) conditions for at least three non-collinear telescopes or even all of them in the network [5]. The selection of the datum telescopes depends on the quality of the a priori coordinates being suitable for the datum definition at hand. As a final step of the session-wise analysis, an iterative identification process for significant outliers in the observations can be added, applying different kinds of outlier identification schemes [42]. Outliers can then be down-weighted for exclusion from the parameter estimation.
The second type of solution is a multi-session or so-called global solution. These solutions are carried out to determine parameters which are valid/constant for the whole observing period.
They serve to estimate • radio source positions • telescope coordinates at a reference epoch • linear or higher order velocities of the radio telescopes due to geodynamic effects Other experimental solutions can be conceived, e.g., for the estimation of the relativistic factor γ or Love numbers.


A simple way of running these solutions is by stacking pre-reduced normal equation systems from the session-wise analyses. Pre-reduction in this context means that parameters are excluded from the solution vector by reducing the system of normal equations through Gaußian elimination steps [5]. When all local parameters are pre-reduced from a session's normal equation system, it contains only the coefficients of the global parameters set up initially. New lines and columns have to be entered for the linear or otherwise parameterized, e.g., harmonic, telescope motions [5]. The radio source positions are mostly treated either as constant global parameters or as session-wise, so-called arc, parameters. The latter approach is chosen when a radio source evidently manifests position variations. However, new approaches also allow for intermediate modes with linear spline approximations at predetermined or automatically chosen time intervals [41]. As the final step, the global parameters, as determined by the stacked normal equation system, are reinserted into the original session normal equation systems to also estimate the local parameters. With all parameters estimated, the observation residuals can be computed and an assessment of the errors can be carried out. The general concept of treating the datum in global solutions is similar to that of single session solutions with NNR/NNT conditions [5]. In most cases, the latest realization of the ITRF [3] is taken as the datum frame. For the radio source positions, mostly the defining sources of the latest ICRF are chosen as the celestial datum for new and/or improved radio source positions.
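Pre-reduction by Gaußian elimination is the Schur complement of the local-parameter block: partitioning the normal system into global (g) and local (l) parts gives N_red = Ngg − Ngl Nll⁻¹ Nlg and b_red = bg − Ngl Nll⁻¹ bl, and the reduced systems of several sessions can then simply be added. A NumPy sketch (names ours):

```python
import numpy as np

def reduce_normals(N, b, g, l):
    """Pre-reduce the local parameters (indices l) from a normal
    equation system N x = b, keeping the global parameters (indices g)."""
    Ngg, Ngl = N[np.ix_(g, g)], N[np.ix_(g, l)]
    Nll, Nlg = N[np.ix_(l, l)], N[np.ix_(l, g)]
    Nll_inv = np.linalg.inv(Nll)
    N_red = Ngg - Ngl @ Nll_inv @ Nlg   # Schur complement
    b_red = b[g] - Ngl @ Nll_inv @ b[l]
    return N_red, b_red

# one session with a global (column 0) and a local (column 1) parameter:
# the reduced system yields the same global estimate as the full system
A = np.array([[1.0, 0.2], [0.5, 1.0], [1.0, 0.0], [0.3, 0.7]])
y = np.array([1.0, 2.0, 0.5, 1.5])
N, b = A.T @ A, A.T @ y
N_red, b_red = reduce_normals(N, b, [0], [1])
x_full = np.linalg.solve(N, b)
x_glob = np.linalg.solve(N_red, b_red)
```

Stacking then amounts to summing the N_red and b_red of all sessions before solving for the global parameters.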

8

Results

Session-wise results are of particular interest in the case of Earth orientation parameters. Unique to VLBI are the results for the Earth's phase of rotation UT1, which cannot be determined with such quality and time resolution by any other technique, in particular not by any satellite technique. The reason is that UT1 is directly correlated with the ascending node of any satellite's orbit. The phase of rotation is reported as the time difference UT1 − UTC. Formal errors are at the level of 2–3 μs for contemporary network sessions, while daily single baseline sessions of only one hour duration (so-called Intensives) provide UT1−UTC with a standard deviation of about 8–15 μs (Fig. 25). Unique are also the estimated adjustments to the precession/nutation model. They are accurate to about 80 μas and represent not so much inaccuracies in the precession/nutation model but variations caused by free core nutation [24], which are at the level of about 1 mas. Daily polar motion estimates provide a reliable time series of the long term evolution of polar motion (Fig. 26). Although the formal errors are slightly worse than those of the IGS polar motion time series, they are needed for cross-validation. From the time series of telescope coordinates, baseline lengths can be inferred. These are independent of any datum and can conveniently be used for stability investigations. One of the most prominent baselines is that between the 18 m


Fig. 25 Results of UT1-UTC between two epochs where leap seconds were introduced

Fig. 26 Long term polar motion, y component in black


[Plot: Baseline WETTZELL vs. WESTFORD (BKG); baseline length [m] vs. time line [year], 1984–2018; session values with error bars and regression line; IVS Combination Centre at BKG, 2018-03-29]

Fig. 27 Time series of baseline lengths between the Wettzell and Westford radio telescopes with linear regression line

telescope of the MIT Haystack Observatory near Westford (MA) at the East coast of the U.S. and the 20 m telescope of the Bundesamt für Kartographie und Geodäsie and the Technische Universität München near Wettzell, Germany (Fig. 27). It should be noted that the input to the time series is not constrained in any way, and still the approximation by a linear regression is extraordinarily precise. Figure 27 has been produced by a Web tool of the IVS under http://www.ccivs.bkg.bund.de/EN/Quarterly/VLBI-Baseline/vlbi-baseline_node.html where plots of all VLBI time series of baseline lengths are available interactively.
Results originating from VLBI global solutions guarantee consistency between parameters estimated in the same solution setup. Standard parameters here are radio source positions, which are used in the construction of the ICRF [27]. Telescope coordinates and velocities regularly enter the computations of the specific realizations of the International Terrestrial Reference System (ITRS), the ITRF [3, 8]. The standard deviations of the coordinates of the most reliable and often used telescopes are at the level of about 1–3 mm, with velocity components having standard deviations of 0.1 mm/y.
Acknowledgements Almost all of the section on correlation was written by Thomas Hobiger of Chalmers University of Technology, Gothenburg, Sweden. I am grateful for the help of a real specialist in this important part of VLBI. I also thank Klaus Börger and Armin Corbin for proofreading the manuscript.



21 Elementary Mathematical Models for GNSS Positioning

Bernhard Hofmann-Wellenhof

Contents
1 Introduction to the Principle of Satellite-Based Positioning . . . . 1318
2 GNSS Observables . . . . 1319
3 Mathematical Models for Positioning . . . . 1331
4 Data Processing . . . . 1378
5 Perspectives . . . . 1434
Literature . . . . 1441

Abstract
In 1984, the author got the opportunity to visit the US National Geodetic Survey near Washington, D.C., and, guided by Dr. Benjamin W. Remondi, could contribute to the development of civilian software for processing data of the primarily military Global Positioning System (GPS), the US version of a Global Navigation Satellite System (GNSS). In parallel, the former Soviet Union developed its own military Global Navigation Satellite System (GLONASS). In these early days of development, the two systems were far from completion, which occurred for GPS in late 1995 and for GLONASS in early 1996. After his return, the author was not only impressed by the incredibly innovative potential of satellite-based navigation, but also decided to "simply write a book on GPS". The reader might ask why GLONASS was not an issue; the answer is that in those days it was difficult to get official information on the Russian system. Thus, the intention was realized, and "GPS – Theory and Practice", coauthored by Herbert Lichtenegger and Jim Collins, was published by Springer, Wien New York, in 1992. Nine years later, by 2001, the fifth edition was released. In the early years of the new millennium, additional systems were either conceived or realized, e.g., the Chinese BeiDou system, a two-step approach with a regional system as the first step and a global system as the second, and the European Galileo, a remarkable development of a system under civilian control. The Galileo definition phase was completed in 2003. These new developments made a complete revision of the "GPS book" necessary; consequently, the author, together with the coauthors Herbert Lichtenegger and Elmar Wasle, published "GNSS – GPS, GLONASS, Galileo & more", again by Springer Wien New York, in 2008. Now the story approaches its end: when the author got the invitation to contribute the mathematical models for GNSS positioning to the Springer Handbook of Mathematical Geodesy, he argued, on the one hand, that he had published this material several years ago and, on the other hand, that essentially the same topics and much more are contained in the just recently released Springer Handbook of Global Navigation Satellite Systems. Nevertheless, the editors of this handbook claimed that "this is just what we want and need". Therefore, the author felt honored, contacted the publishing company Springer, and asked whether it would be possible to republish parts of his book on GNSS in this Springer Handbook of Mathematical Geodesy volume, mainly unchanged but updated where required and supplemented by measurement examples – and he got the permission. Thus, essentially, this contribution is extracted from "GNSS – GPS, GLONASS, Galileo & more".

Since these chapters were originally written by the author alone and the coauthors did not contribute to them, the names of the coauthors have been omitted for the current publication. Since almost a decade has passed since the release of the GNSS book, much more experience has been gained in applying these mathematical models. This enables the author to spoil the readers with numerical examples for some of the models. These examples should justify the re-publication of the well-established models. The measurements, the processing of the data, and the production of the respective figures were carried out by Mathias Duregger, a young and very talented student assistant at the author's institute at the Graz University of Technology. His very industrious support is gratefully acknowledged.

This chapter is part of the series Handbuch der Geodäsie, volume "Mathematische Geodäsie/Mathematical Geodesy", edited by Willi Freeden, Kaiserslautern.

B. Hofmann-Wellenhof
Working Group Navigation, Institute of Geodesy, Graz University of Technology, Graz, Austria
E-Mail: [email protected]

© Springer-Verlag GmbH Deutschland, ein Teil von Springer Nature 2020
W. Freeden (Hrsg.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6_109

Zusammenfassung
In 1984, the author had the opportunity to spend a research stay at the US National Geodetic Survey near Washington, D.C. Under the supervision of Dr. Benjamin Remondi, he was able to contribute to the development of software for civilian applications for processing data of the primarily military Global Positioning System (GPS), the American version of a global navigation satellite system (GNSS). In parallel to the United States of America, the former Soviet Union developed its own military global navigation satellite system (GLONASS). At this early stage of development, both systems were still being built up. The so-called Full Operational Capability (FOC) was reached for GPS in the second half of 1995 and for GLONASS a little later, in the first half of 1996. In the early years of the new millennium, new systems were either conceived or their realization had already begun. Examples are the Chinese BeiDou system (a regional system in its first stage and only in its second stage a global system) and the European Galileo, which is particularly remarkable since it is not a military system but a system under civilian control. The definition phase of Galileo was completed in 2003. Together with the coauthors Herbert Lichtenegger and Elmar Wasle, the author had published the fifth edition of "GPS – Theory and Practice" with Springer, Wien New York, in 2001. The new developments made a complete revision necessary, which appeared in 2008 under the title "GNSS – GPS, GLONASS, Galileo & more", again with Springer, Wien New York. Now the circle begins to close. When the author received the invitation from the editors of the Springer Handbook of Mathematical Geodesy to write a contribution "Mathematical Models for GNSS Positioning", he pointed out that, on the one hand, he himself had already published this topic several years ago and that, on the other hand, essentially the same topics, with many additional subjects, had appeared in the just recently published Springer Handbook of Global Navigation Satellite Systems. The editors of the present handbook, however, insisted that "this topic is exactly what we want and need". The author felt very honored and contacted Springer with the question of whether it would be possible and permitted to republish parts of the GNSS book in the present Springer Handbook of Mathematical Geodesy in updated form, supplemented by measurement examples. Springer promptly gave its consent and permission. Consequently, the present contribution is an extract of the book "GNSS – GPS, GLONASS, Galileo & more". Since these chapters in the original version were written by the author alone and the coauthors did not contribute to them, the names of the coauthors of the GNSS book do not appear in this publication. Since almost a decade has passed since the publication of the GNSS book, much more experience in the application of the mathematical models is now available. Thus, the author can spoil the readers with numerical examples based on data measured in the field. These examples should justify the republication of the already well-established mathematical models. The measurements, the processing of the data, and the generation of the corresponding figures were carried out by Mathias Duregger, a young and very talented student assistant at our Institute of Geodesy at Graz University of Technology. I thank him very warmly for his diligence, his thoroughness, and his continuous support.

Keywords
GNSS · Satellite-based positioning · GNSS positioning · Observables · Data combination · Point positioning · Differential positioning · Relative positioning · Ambiguity resolution · Linearization

1 Introduction to the Principle of Satellite-Based Positioning

Operational satellite geodesy primarily provides the user with the capability of determining his position, expressed, for example, by latitude, longitude, and height. This task is accomplished by the simple resection process using ranges or range differences measured to satellites. Imagine the satellites frozen in space at a given instant. The space vector s relative to the center of the earth (geocenter) of each satellite (Fig. 1) can be computed from the ephemerides broadcast by the satellite by an algorithm presented, e.g., in Hofmann-Wellenhof et al. [54, Chap. 3]. If the receiver on ground, defined by its geocentric position vector r, employed a clock that was set precisely to system time, the geometric distance or range ρ to each satellite could be accurately measured by recording the run time required for the (coded) satellite signal to reach the receiver. Each range defines a sphere (more precisely: the surface of a sphere) with its center at the satellite position. Hence, using this technique, ranges to only three satellites would be needed, since the intersection of three spheres yields the three unknowns (e.g., latitude, longitude, and height), which could be determined from the three range equations

ρ = ‖s − r‖ .   (1)

Fig. 1 Principle of satellite-based positioning

Modern receivers apply a slightly different technique. They typically use an inexpensive crystal clock which is set approximately to system time. Thus, the clock of the receiver on ground is offset from true system time and, because of this offset, the distance measured to the satellite differs from the geometric range. Therefore, the measured quantities are called pseudoranges R since they represent the geometric range plus a range correction Δρ resulting from the receiver clock error or clock bias δ. A simple model for the pseudorange is

R = ρ + Δρ = ρ + c δ   (2)

with c being the speed of light in vacuum. Four simultaneously measured pseudoranges are needed to solve for the four unknowns, namely the three components of position plus the clock bias. This is essentially the mass-market application principle as used by Global Navigation Satellite Systems (GNSS).
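The four-pseudorange resection of Eqs. (1) and (2) can be sketched as a small Gauss-Newton adjustment. The sketch below is illustrative only: the satellite coordinates, the receiver position, and the 1 ms clock bias are invented values, and the pseudoranges are generated from Eq. (2) without any atmospheric or orbit effects.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)

def solve_position(sat_pos, pseudoranges, n_iter=10):
    """Solve R_i = ||s_i - r|| + c*delta (Eq. 2) for the receiver position r
    and the clock bias delta by iterated linearization (Gauss-Newton)."""
    x = np.zeros(4)  # state [x, y, z, c*delta], iteration started at the geocenter
    for _ in range(n_iter):
        rho = np.linalg.norm(sat_pos - x[:3], axis=1)   # geometric ranges
        predicted = rho + x[3]                          # modeled pseudoranges
        # Design matrix: unit vectors from the satellites to the receiver,
        # plus a column of ones for the clock-bias term c*delta
        A = np.hstack([(x[:3] - sat_pos) / rho[:, None], np.ones((len(rho), 1))])
        x += np.linalg.lstsq(A, pseudoranges - predicted, rcond=None)[0]
    return x[:3], x[3] / C

# Invented example: four satellite positions (m), a receiver position, and
# a 1 ms receiver clock bias
true_r = np.array([4.0e6, 1.0e6, 4.8e6])
true_delta = 1.0e-3
sats = np.array([[15e6, 10e6, 20e6], [-10e6, 18e6, 15e6],
                 [20e6, -5e6, 12e6], [5e6, -15e6, 18e6]])
R = np.linalg.norm(sats - true_r, axis=1) + C * true_delta
r_est, delta_est = solve_position(sats, R)
print(np.max(np.abs(r_est - true_r)), abs(delta_est - true_delta))
```

With error-free synthetic pseudoranges, the iteration recovers position and clock bias essentially to machine precision; real observations would of course add the biases discussed later.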

2 GNSS Observables

In times of "classical geodesy", the three coordinates of a point in three-dimensional space were usually obtained by separating location (two-dimensional) and height. Starting with electronic distance measurements, the three-dimensional concept evolved. Satellite-based navigation in the sense of operational satellite geodesy, which means that the positions of the satellites may be assumed to be known (i.e., calculable), is a three-dimensional concept. Thus, the three components of, e.g., a three-dimensional position vector x = [x, y, z]ᵀ are obtained simultaneously and may easily be transformed into the two-dimensional ellipsoidal coordinates ϕ, λ and the ellipsoidal height h. The GNSS observables are essentially obtained from a fundamental frequency which is transformed into an electromagnetic carrier wave. Modulated onto this carrier wave are one or more codes; thus, code measurements and phase measurements are generated and used for GNSS positioning.
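The transformation from Cartesian coordinates x, y, z to ellipsoidal ϕ, λ, h mentioned above can be sketched with the classical fixed-point iteration on the latitude. This is one of several possible algorithms; the GRS80 parameter values and the Graz test coordinates below are assumptions for illustration, not taken from the text.

```python
import math

# GRS80 ellipsoid parameters (assumed reference ellipsoid)
A_SEMI = 6_378_137.0            # semimajor axis (m)
F = 1.0 / 298.257222101         # flattening
E2 = F * (2.0 - F)              # first eccentricity squared

def ecef_to_ellipsoidal(x, y, z, n_iter=10):
    """Convert Cartesian (x, y, z) to ellipsoidal latitude and longitude (rad)
    and height (m) by the classical fixed-point iteration on the latitude."""
    lam = math.atan2(y, x)
    p = math.hypot(x, y)                     # distance from the rotation axis
    phi = math.atan2(z, p * (1.0 - E2))      # initial latitude (assuming h = 0)
    for _ in range(n_iter):
        n = A_SEMI / math.sqrt(1.0 - E2 * math.sin(phi)**2)  # prime vertical radius
        h = p / math.cos(phi) - n
        phi = math.atan2(z, p * (1.0 - E2 * n / (n + h)))
    return phi, lam, h

def ellipsoidal_to_ecef(phi, lam, h):
    """Inverse transformation (closed form), used here only as a check."""
    n = A_SEMI / math.sqrt(1.0 - E2 * math.sin(phi)**2)
    return ((n + h) * math.cos(phi) * math.cos(lam),
            (n + h) * math.cos(phi) * math.sin(lam),
            (n * (1.0 - E2) + h) * math.sin(phi))

# Round trip for an approximate Graz location (47.07 deg N, 15.44 deg E, 400 m)
x, y, z = ellipsoidal_to_ecef(math.radians(47.07), math.radians(15.44), 400.0)
phi, lam, h = ecef_to_ellipsoidal(x, y, z)
print(math.degrees(phi), math.degrees(lam), h)
```

The iteration converges very quickly for points near the earth's surface, so the round trip reproduces the input coordinates to well below a millimeter.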


Referring to the Springer Handbook of Global Navigation Satellite Systems, the topics of this chapter are treated there by Langley et al. [84] in Part A, Chap. 1; by Won and Pany [151] in Part C, Chap. 14; by Hauschild [50] in Part D, Chap. 19; by Hauschild [51] in Part D, Chap. 20; and by Rizos [106] in Part F, Chap. 35.

2.1 Data Acquisition

Conceptually, the satellite navigation observables are ranges which are deduced from measured run times or phases based on a comparison between received signals and receiver-generated signals. Unlike the terrestrial electronic distance measurements, satellite navigation uses the “one-way concept” where two clocks are involved, namely one in the satellite and the other in the receiver. Thus, the ranges are biased by satellite and receiver clock errors and, consequently, they are denoted as pseudoranges.

2.1.1 Code Pseudoranges
Let us denote by t^s(sat) the signal emission time referred to the reading of the satellite clock and by t_r(rec) the signal reception time referred to the reading of the receiver clock. Note that the satellite clock reading t^s(sat) is transmitted in the navigation message. The errors (or biases) of the clocks with respect to a common time system (i.e., the respective system time) are termed δ^s for the satellite clock and δ_r for the receiver clock. The difference between the clock readings is equivalent to the time shift Δt which aligns the satellite and reference signal during the code correlation procedure in the receiver. Thus,

t_r(rec) − t^s(sat) = [t_r + δ_r] − [t^s + δ^s] = Δt + Δδ ,   (3)

indicating that in t_r(rec) and t^s(sat) two different time systems are involved, but that on the right-hand side t_r and t^s refer to the common system time, and where Δt = t_r − t^s and Δδ = δ_r − δ^s. The bias δ^s of the satellite clock can be modeled if the respective information is transmitted accordingly, e.g., by a polynomial with the coefficients being transmitted in the navigation message. Assuming the correction δ^s has been applied, Δδ equals the receiver clock bias. When multiplying the time interval t_r(rec) − t^s(sat) of Eq. (3), which is affected by the clock errors, by the speed of light c, the code pseudorange

R = c [t_r(rec) − t^s(sat)] = c Δt + c Δδ = ρ + c Δδ   (4)

is obtained, where ρ = c Δt has been introduced. The range ρ is calculated from the true signal travel time. In other words, ρ corresponds to the distance between the position of the satellite at epoch t^s and the position of the antenna of the receiver at epoch t_r. Remember that both epochs refer to the common system time. Since ρ is a function of two different epochs, it is often expanded into a Taylor series with respect to, e.g., the emission time:

ρ = ρ(t^s, t_r) = ρ(t^s, t^s + Δt) = ρ(t^s) + ρ̇(t^s) Δt ,   (5)

where ρ̇ denotes the time derivative of ρ, i.e., the radial velocity of the satellite relative to the receiving antenna. All epochs in Eq. (5) are expressed in a common system time. The maximum radial velocity for GNSS satellites in the case of a stationary receiver is ρ̇ ≈ 1.0 km s⁻¹, and the travel time of the satellite signal is about 0.06 to 0.10 s. The correction term in Eq. (5) may thus amount to more than 60 m.

The precision of a pseudorange derived from code measurements has traditionally been about 1% of the chip length. Therefore, a chip length of 300 m for a coarse code would yield a precision of roughly 3 m, and an assumed chip length of 30 m for a precise code would yield a precision of 0.3 m. However, more recent developments demonstrate that a precision of about 0.1% of the chip length is possible.
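Because ρ connects the emission epoch and the reception epoch, it is in practice often computed by a simple fixed-point (light-time) iteration instead of the Taylor expansion of Eq. (5). The sketch below uses an invented circular toy orbit, roughly GNSS-like in size, purely for illustration.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def emission_range(sat_pos_at, receiver_pos, t_reception, n_iter=5):
    """Light-time iteration: find the range rho between the satellite position
    at the (initially unknown) emission epoch and the receiver position at the
    reception epoch; sat_pos_at(t) returns the satellite position at epoch t."""
    tau = 0.075  # initial guess for the signal travel time (s)
    for _ in range(n_iter):
        s = sat_pos_at(t_reception - tau)   # satellite at the trial emission epoch
        rho = math.dist(s, receiver_pos)    # range for this trial epoch
        tau = rho / C                       # improved travel time
    return rho, tau

# Invented circular orbit in the equatorial plane (radius and period chosen to
# be roughly GNSS-like)
R_ORBIT = 26_560_000.0                 # m
OMEGA = 2.0 * math.pi / 43_082.0       # rad/s (one revolution per ~11.97 h)

def sat_pos_at(t):
    return (R_ORBIT * math.cos(OMEGA * t), R_ORBIT * math.sin(OMEGA * t), 0.0)

receiver = (6_378_137.0, 0.0, 0.0)     # receiver on the equator (m)
rho, tau = emission_range(sat_pos_at, receiver, t_reception=0.0)
print(rho, tau)  # the travel time falls in the 0.06-0.10 s range quoted above
```

A few iterations suffice because the satellite moves only a few hundred meters during one travel-time update.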

2.1.2 Phase Pseudoranges
Let us denote by ϕ^s(t) the phase of the received and reconstructed carrier with frequency f^s and by ϕ_r(t) the phase of a reference carrier generated in the receiver with frequency f_r. Here, the parameter t is an epoch in a common time system reckoned from an initial epoch t_0 = 0. According to Hofmann-Wellenhof et al. [54, Eq. (4.11)], the following phase equations may be set up:

ϕ^s(t) = f^s t − f^s ρ/c − ϕ_0^s ,
ϕ_r(t) = f_r t − ϕ_0r ,   (6)

where the phases are expressed in cycles. The initial phases ϕ_0^s, ϕ_0r are caused by the clock errors and are equal to

ϕ_0^s = −f^s δ^s ,   ϕ_0r = −f_r δ_r .   (7)

Hence, the beat phase ϕ_r^s(t) is given by

ϕ_r^s(t) = ϕ^s(t) − ϕ_r(t)
         = −f^s ρ/c + f^s δ^s − f_r δ_r + (f^s − f_r) t .   (8)


The deviation of the frequencies f^s, f_r from the nominal frequency f is in the order of only some fractional parts of hertz. This may be verified by considering, e.g., a short-time frequency stability of df/f = 10⁻¹². With the nominal carrier frequency f ≈ 1.5 GHz, the frequency error thus becomes df = 1.5 · 10⁻³ Hz. Such a frequency error may be neglected because during signal propagation (i.e., Δt = 0.07 s) a maximum error of 10⁻⁴ cycles in the beat phase is generated, which is below the noise level. The clock errors are in the range of milliseconds and are, thus, less effective. Summarizing, Eq. (8) may be written in the simplified form

ϕ_r^s(t) = −f ρ/c − f Δδ ,   (9)

where again Δδ = δ_r − δ^s has been used. If the assumption of frequency stability is incorrect and the oscillators are unstable, then their behavior has to be modeled by, for example, polynomials where clock and frequency offsets and a frequency drift are determined. Historically, a complete carrier phase model which includes the solution of large (e.g., 1 s) receiver clock errors was developed by Remondi [102]. In practice, eventual residual errors can be eliminated by differencing the measurements.

Switching on a receiver at epoch t_0, the instantaneous fractional beat phase is measured. The initial integer number N of cycles between satellite and receiver is unknown. However, when tracking is continued without loss of lock, the number N, also called integer ambiguity, remains the same, and the beat phase at epoch t is given by

ϕ_r^s(t) = Δϕ_r^s |_{t_0}^{t} + N ,   (10)

where Δϕrs denotes the (measurable) fractional phase at epoch t augmented by the number of integer cycles since the initial epoch t0. A geometrical interpretation of Eq. (10) is provided in Fig. 2, where Δϕi is a shortened notation for Δϕrs |_{t0}^{ti} and, for simplicity, the initial fractional beat phase ϕ0 is assumed to be zero. Substituting Eq. (10) into Eq. (9) and denoting the negative observation quantity by Φ = −Δϕrs yields the equation for the phase pseudoranges

Φ = (1/λ) ρ + (c/λ) δ + N ,   (11)

where the wavelength λ has been introduced according to the relation c = λf. Multiplying the above equation by λ scales the phase expressed in cycles to a range (given in units of length, e.g., meter):

λΦ = ρ + c δ + λN .   (12)

21 Elementary Mathematical Models for GNSS Positioning


Fig. 2 Geometrical interpretation of phase range

This equation differs from the code pseudorange only by the integer multiples of λ. Again, the range ρ represents the distance between the satellite at satellite transmission time epoch t and the receiver at reception time epoch t + t (considering both epochs in a common system time). The phase of an electromagnetic wave can be measured to better than 0.01 cycles which corresponds to millimeter precision for a frequency in the gigahertz range. It should be noted that a plus sign convention has been chosen for Eq. (11). This choice is somewhat arbitrary since quite often the phase Φ and the distance ρ have different signs. Actually, the sign is receiver-dependent because the beat phase is generated in the receiver and the combination of the satellite and the receiver signal may differ for various receiver types.
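The numbers quoted above are easily checked. The following minimal sketch evaluates the right side of Eq. (12) and the metric equivalent of a 0.01-cycle phase resolution; the carrier frequency matches the 1.5 GHz used in the text, while the range, clock bias, and ambiguity values are purely illustrative:

```python
# Sketch of Eq. (12): lambda*Phi = rho + c*delta + lambda*N, and the metric
# equivalent of a 0.01-cycle phase resolution. All input values illustrative.
c = 299_792_458.0          # speed of light [m/s]
f = 1.5e9                  # nominal carrier frequency [Hz], as in the text
lam = c / f                # wavelength, about 0.20 m

rho = 21_000_000.0         # geometric range [m] (illustrative)
delta = 1.0e-4             # combined clock bias [s] (illustrative)
N = 110_000_000            # integer ambiguity (illustrative)

range_scaled = rho + c * delta + lam * N   # right side of Eq. (12) [m]
phase_resolution_m = 0.01 * lam            # 0.01 cycles expressed in meters
```

With these numbers, phase_resolution_m comes out to about 2 mm, illustrating the millimeter-level precision stated above.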

2.1.3 Doppler Data

Historically, the Transit system used the integrated Doppler shifts (i.e., phase differences), which were scaled to delta ranges. Today, the raw Doppler shift, being linearly dependent on the radial velocity and thus allowing for velocity determination in real time, is important for navigation. Considering Eq. (12), the equation for the observed Doppler shift scaled to range rate is given by

D = λ Φ̇ = ρ̇ + c δ̇ ,   (13)

where the derivatives with respect to time are indicated by a dot. The raw Doppler shift measurement is less accurate than the integrated Doppler shift. An estimate of the achievable accuracy is 0.001 Hz, which corresponds to 3 · 10⁻⁴ m s⁻¹ if the Doppler shift is based on an emitted frequency of 1 GHz. Remember that when the satellite is moving toward the GNSS receiver, the Doppler shift is positive, so one gets more Doppler counts when the range is diminishing.
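The accuracy estimate just quoted follows directly from Eq. (13); a quick check, using only the numbers given in the text:

```python
# Doppler resolution of 0.001 Hz at an emitted frequency of 1 GHz,
# scaled to range rate via the wavelength, cf. Eq. (13).
c = 299_792_458.0                 # speed of light [m/s]
f_emitted = 1.0e9                 # emitted frequency [Hz]
lam = c / f_emitted               # wavelength, about 0.3 m

doppler_resolution_hz = 0.001                        # [cycles/s]
range_rate_resolution = lam * doppler_resolution_hz  # about 3e-4 m/s
```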


Apart from its meaning in navigation, it is worth noting here that the raw Doppler shift is also applied to determine integer ambiguities in kinematic surveying or may be used as an additional independent observable for point positioning.

2.1.4 Biases and Noise

The code pseudoranges, cf. Eq. (4), and phase pseudoranges, cf. Eq. (11), are affected by systematic errors (biases) as well as by random noise. Note that Doppler measurements are affected by the bias rates only. The error sources can be classified into three groups, namely satellite-related errors, propagation-medium-related errors, and receiver-related errors. Some range biases are listed in Table 1.

Table 1  Range biases

  Source               Effect
  Satellite            Clock bias
                       Orbital errors
  Signal propagation   Ionospheric refraction
                       Tropospheric refraction
  Receiver             Antenna phase center variation
                       Clock bias
                       Multipath

Some of the systematic errors can be modeled and give rise to additional terms in the observation equations, which will be explained in detail in later sections. As mentioned earlier, systematic effects can also be eliminated (or at least strongly reduced) by appropriate combinations of the observables. Introducing a baseline, i.e., the vector between two points, and differencing measurements of the two receivers at these points to the same satellite eliminates satellite-specific biases; differencing between two satellites and one receiver eliminates receiver-specific biases. As a consequence, double-difference pseudoranges are, to a high degree, free of systematic errors originating from the satellites and from the receivers. With respect to refraction, this is only true for short baselines where the measured ranges at both points are affected equally. In addition, ionospheric refraction can virtually be eliminated by an adequate combination of dual-frequency data. Details on antenna phase center variations may be found in Hofmann-Wellenhof et al. [54, Sect. 5.5], Maqsood et al. [93, Sect. 17.6], Hauschild [50, Sect. 19.5], and Kouba et al. [74, Sect. 25.2.2].

Multipath is caused by multiple reflections of the signal (which can also occur at the satellite during signal emission). The interference between the direct and the reflected signal is largely not random; however, it may also appear as noise. Wells et al. [143] report a similar effect called imaging, where a reflecting obstacle generates an image of the real antenna which distorts the antenna pattern. Both effects, multipath and imaging, can be considerably reduced by selecting sites protected from reflections (buildings, vehicles, trees, etc.) and by an appropriate antenna design. It should be noted that multipath is frequency (i.e., wavelength) dependent; therefore, carrier phases are less affected than code ranges [76]. More details on the multipath problems are given in Hofmann-Wellenhof et al. [54, Sect. 5.6], Braasch [10], Maqsood et al. [93, Sect. 17.4], and Hauschild [51, Sect. 20.2.4].

The random noise mainly contains the actual observation noise plus multipath. Assuming a typical chip length of 300 m for a coarse code and of 30 m for a precise code, the pseudorange noise is summarized in Table 2.

Table 2  Typical magnitude of range noise

  Range                       Noise
  Code range (coarse code)    300 cm
  Code range (precise code)   30 cm
  Phase range                 5 mm

The signal-in-space (SIS) user range error (URE) is an estimate comprising errors of ephemerides data, satellite clock, and the ionospheric and tropospheric delay. It does not consider errors caused by the environment (e.g., multipath) or by the user equipment (e.g., receiver noise including antenna offset and variation). Extending the URE by the user equipment and environmental errors, the user equivalent range error (UERE) is obtained. Even if there are some correlations, the individual error contributions are considered to be independent. Therefore, the UERE is computed as the square root of the summed squares of the six error constituents: ephemerides data, satellite clock, ionosphere, troposphere, multipath, and receiver measurement. The receiver measurement component can be further split into receiver clock error and white noise. In Table 3, adapted from Parkinson [101, p. 481], the UERE is calculated based on typical values (all given in meters) for the individual quantities.

Table 3  UERE computation

  Error source           Bias [m]   Random [m]   Total [m]
  Ephemerides data          2.1        0.0          2.1
  Satellite clock           2.0        0.7          2.1
  Ionosphere                4.0        0.5          4.0
  Troposphere               0.5        0.5          0.7
  Multipath                 1.0        1.0          1.4
  Receiver measurement      0.5        0.2          0.5
  UERE                      5.1        1.4          5.3

The column headed by "Total" results from the square root of the sum of the squared bias and the squared random quantity; e.g., for the satellite clock the total is obtained by √(2.0² + 0.7²) = 2.1. Note that linked to this UERE computation is the 1σ probability level, which amounts to 68.3%. In combination with a dilution of precision (DOP) factor, which is explained, e.g., in Hofmann-Wellenhof et al. [54, Sect. 7.3.4], the UERE allows for an estimation of the achievable point positioning accuracy. As Kuusniemi [75] indicates, the value of the UERE as computed in Table 3 is of limited validity because in real situations too many variables must be taken into account, e.g., the elevation angle of the satellite, which influences the signal path length, the strength of the received signal, and the changing multipath environment. The signal strength is described by the carrier-to-noise power density ratio and the signal-to-noise (S/N) ratio. The carrier-to-noise power density ratio is, according to Lachapelle [77], the fundamental navigation signal quality parameter.
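The root-sum-square rule behind Table 3 can be reproduced in a few lines; the values below are those of the table:

```python
# UERE computation: root sum of squares of the six error constituents
# (bias and random parts from Table 3, all values in meters).
import math

bias = {"ephemerides data": 2.1, "satellite clock": 2.0, "ionosphere": 4.0,
        "troposphere": 0.5, "multipath": 1.0, "receiver measurement": 0.5}
random_part = {"ephemerides data": 0.0, "satellite clock": 0.7, "ionosphere": 0.5,
               "troposphere": 0.5, "multipath": 1.0, "receiver measurement": 0.2}

uere_bias = math.sqrt(sum(v ** 2 for v in bias.values()))           # ≈ 5.1
uere_random = math.sqrt(sum(v ** 2 for v in random_part.values()))  # ≈ 1.4
uere_total = math.sqrt(uere_bias ** 2 + uere_random ** 2)           # ≈ 5.3
```

Rounding each root sum of squares to one decimal reproduces the last row of Table 3.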

2.2 Data Combinations

GNSS observables are obtained from the ranging code information or the carrier wave broadcast by the satellite. Assuming two carriers based on the respective frequencies f1 and f2 and one code modulated on each of the two carriers, one could measure, for a single epoch, the code ranges R1, R2, the carrier phases Φ1, Φ2, and the corresponding Doppler shifts D1, D2, where the subscript indicates the respective frequency. In the following, the Doppler observables are not considered. In general, the number of observables of a GNSS receiver can differ; for example, a single-frequency receiver delivers only data from one frequency. Considering a receiver that may manage three carriers and two codes, the number of observables increases accordingly. The objective of this section is to show how linear combinations are developed for dual-frequency data and how code range smoothing by means of carrier phases is performed.

2.2.1 Linear Phase Pseudorange Combinations

General Remarks
Suppose two frequencies f1 and f2 and denote the respective phase pseudoranges by Φ1 and Φ2. The linear combination of two phase pseudoranges is defined by

Φ = n1 Φ1 + n2 Φ2 ,   (14)

where n1 and n2 are arbitrary numbers. The substitution of the relations Φi = fi t for the corresponding frequencies f1 and f2 gives

Φ = n1 f1 t + n2 f2 t = f t .   (15)

Therefore,

f = n1 f1 + n2 f2   (16)

is the frequency and

λ = c/f   (17)

is the wavelength of the linear combination. Compared to the noise of a single phase, the noise level of the linear combination differs by the factor √(n1² + n2²), which follows from the application of the error propagation law, assuming the same noise level for both phases.

Linear Combinations with Integer Numbers
The simplest nontrivial linear combinations of the two phase pseudoranges Φ1 and Φ2 in Eq. (14) are n1 = n2 = 1, yielding the sum

Φ1 + Φ2 ,   (18)

and n1 = 1, n2 = −1, leading to the difference

Φ1 − Φ2 .   (19)

According to (17), increasing the frequency reduces (or narrows) the wavelength, and decreasing the frequency increases (or widens) the wavelength. Accordingly, the combination Φ1 + Φ2 is denoted as narrow lane and Φ1 − Φ2 as wide lane. The lane signals are used for ambiguity resolution (Sect. 4.2). The advantage of a linear combination with integer numbers is that the integer nature of the ambiguities is preserved.

Linear Combinations with Real Numbers
A slightly more complicated linear combination results from the choice

n1 = 1 ,   n2 = − f2/f1   (20)

leading to the combination

Φ1 − (f2/f1) Φ2 ,   (21)

which is often denoted as geometric residual. This quantity is the kernel of a combination used to reduce ionospheric effects. Another linear combination follows from the reciprocal values of (20),

n1 = 1 ,   n2 = − f1/f2   (22)

leading to the combination

Φ1 − (f1/f2) Φ2 ,   (23)

which is often denoted as ionospheric residual. This quantity is used, e.g., in the context of cycle slip detection (Sect. 4.1.2). The drawback of a linear combination with real numbers is that the integer nature of the ambiguity is generally lost.
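To give the lane combinations a concrete scale, the following sketch evaluates Eqs. (16) and (17) for the GPS L1/L2 carrier frequencies; the choice of GPS values here is illustrative only:

```python
# Frequencies and wavelengths of the narrow-lane (n1 = n2 = 1) and
# wide-lane (n1 = 1, n2 = -1) combinations, Eqs. (16)-(19),
# evaluated for the GPS L1/L2 carriers as an example.
c = 299_792_458.0        # speed of light [m/s]
f1 = 1_575.42e6          # GPS L1 [Hz]
f2 = 1_227.60e6          # GPS L2 [Hz]

f_narrow = f1 + f2       # Eq. (16) with n1 = n2 = 1
f_wide = f1 - f2         # Eq. (16) with n1 = 1, n2 = -1

lam_narrow = c / f_narrow   # about 0.11 m (narrow lane)
lam_wide = c / f_wide       # about 0.86 m (wide lane)
```

The wide lane wavelength of roughly 86 cm is what makes this signal attractive for ambiguity resolution, since fewer integer candidates fit into a given search interval.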

2.2.2 Code Pseudorange Smoothing The principle of code pseudorange smoothing by means of phase pseudoranges is an important issue in accurate real-time positioning.


Assuming dual-frequency measurements for epoch t1, the code pseudoranges R1(t1), R2(t1) and the carrier phase pseudoranges Φ1(t1), Φ2(t1) are obtained. Further assume that the code pseudoranges are scaled to cycles (but still denoted as R) by dividing them by the corresponding carrier wavelength. Note that pseudoranges scaled to cycles are sometimes denoted as code phases. Using the two frequencies f1, f2, the combination

R(t1) = (f1 R1(t1) − f2 R2(t1)) / (f1 + f2)   (24)

is formed for the code pseudoranges and the wide-lane signal

Φ(t1) = Φ1(t1) − Φ2(t1)   (25)

for the carrier phase pseudoranges. From Eq. (24) it can be verified by applying the error propagation law that the noise of the combined code pseudorange R(t1) is reduced by the factor √(f1² + f2²)/(f1 + f2), which amounts to 0.7 for present GNSS, compared to the noise of a single code measurement. The increase of the noise in the wide-lane signal by a factor of √2 has no effect because the noise of the carrier phase pseudoranges is much lower than the noise of the code pseudoranges. Note that both signals R(t1) and Φ(t1) have the same frequency and, thus, the same wavelength, as may be verified by applying Eq. (16).

Combinations (24) and (25) are formed for each epoch. Additionally, for all epochs ti after t1, extrapolated values R(ti)ex of the code pseudoranges can be calculated from

R(ti)ex = R(t1) + (Φ(ti) − Φ(t1)) .   (26)

The smoothed value R(ti)sm is finally obtained as the arithmetic mean

R(ti)sm = (1/2) (R(ti) + R(ti)ex) .   (27)

Generalizing the above formulas for an arbitrary epoch ti (with the preceding epoch ti−1), a recursive algorithm is given by

R(ti)   = (f1 R1(ti) − f2 R2(ti)) / (f1 + f2) ,
Φ(ti)   = Φ1(ti) − Φ2(ti) ,
R(ti)ex = R(ti−1)sm + (Φ(ti) − Φ(ti−1)) ,          (28)
R(ti)sm = (1/2) (R(ti) + R(ti)ex) ,

which works for all i > 1 under the initial condition R(t1) = R(t1)ex = R(t1)sm.
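A minimal sketch of the recursive algorithm (28); the input lists hold, per epoch, the two code ranges and the two carrier phases, all already scaled to cycles, and the function and variable names are illustrative:

```python
# Recursive code pseudorange smoothing by carrier phases, Eq. (28).
def smooth_code(R1, R2, P1, P2, f1, f2):
    """R1, R2: code ranges; P1, P2: carrier phases (all in cycles,
    one entry per epoch); f1, f2: carrier frequencies."""
    smoothed = []
    prev_sm = prev_wide = None
    for r1, r2, p1, p2 in zip(R1, R2, P1, P2):
        R = (f1 * r1 - f2 * r2) / (f1 + f2)       # code combination, Eq. (24)
        wide = p1 - p2                            # wide-lane phase, Eq. (25)
        if prev_sm is None:
            sm = R                                # initial condition at t1
        else:
            R_ex = prev_sm + (wide - prev_wide)   # extrapolation, Eq. (26)
            sm = 0.5 * (R + R_ex)                 # arithmetic mean, Eq. (27)
        smoothed.append(sm)
        prev_sm, prev_wide = sm, wide
    return smoothed
```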


The above algorithm assumes data free of gross errors. However, carrier phase data are sensitive to changes in the integer ambiguity (i.e., cycle slips). To circumvent this problem, a variation of the algorithm is given subsequently. Using the same notations as before for an epoch ti, the smoothed code pseudorange is obtained by

R(ti)sm = w R(ti) + (1 − w) R(ti)ex ,   (29)

where w is a time-dependent weight factor. Note that, from the previous algorithm, R(ti)ex = R(ti−1)sm + Φ(ti) − Φ(ti−1) could be substituted into (29). For the first epoch i = 1, the weight is set to w = 1, thus putting the full weight on the measured code pseudorange. For consecutive epochs, the weight of the code pseudoranges is continuously reduced, thus emphasizing the influence of the carrier phases. A reduction of the weight by 0.01 from epoch to epoch was tested in a kinematic experiment with a data sampling rate of 1 Hz. After 100 seconds, only the extrapolated value is taken into account. Again, in the case of cycle slips, the algorithm would fail. A simple check of the carrier phase difference of two consecutive epochs against the Doppler shift multiplied by the time interval may detect data irregularities such as cycle slips. After the occurrence of a cycle slip, the weight is reset to w = 1, which fully eliminates the influence of the erroneous carrier phase data. The crucial point of this approach is that cycle slips must be detected but do not have to be repaired. Note, however, that repair is possible if there is enough redundancy in the measurements. To demonstrate the effect of the smoothing algorithm, real data are presented in Fig. 3. The code pseudoranges for a data sample of 170 epochs measured with a 1 Hz data rate are shown in the top graph (after eliminating the trend due to the satellite motion). In the middle graph, R(ti)sm of the smoothing algorithm (28)

Fig. 3 Code pseudoranges in [m]: unsmoothed (top), smoothed (middle) by algorithm (28), and weighted smoothing (bottom) using Eq. (29)


is given. Finally, the bottom graph of Fig. 3 shows the weighting effect of (29). As described above, the weight reduction by 0.01 from epoch to epoch shows the decreasing influence of the code and the emphasized impact of the carrier phases. Another smoothing algorithm for code pseudoranges uses phase differences ΔΦ(ti, t1) obtained from the integrated Doppler shift between the current epoch ti and the starting epoch t1. Note that the integrated Doppler shifts are insensitive to cycle slips. From each code pseudorange R(ti) at epoch ti, an estimate of the code pseudorange at epoch t1 can be given by

R(t1)i = R(ti) − ΔΦ(ti, t1) ,   (30)

where the subscript i on the left side of the equation indicates the epoch from which the code pseudorange R(t1) is computed. Obtaining such an estimate consecutively for each epoch, the arithmetic mean R(t1)m of the code pseudorange for n epochs is calculated by

R(t1)m = (1/n) Σ_{i=1}^{n} R(t1)i ,   (31)

and the smoothed code pseudorange for an arbitrary epoch results from

R(ti)sm = R(t1)m + ΔΦ(ti, t1) .   (32)

The advantage of this procedure lies in the reduction of the noise in the initial code pseudorange by averaging an arbitrary number n of measured code pseudoranges. Note from the three formulas (30) through (32) that the algorithm may also be applied successively epoch by epoch, where the arithmetic mean must be updated from epoch to epoch. Using the above notations, formula (32) also works for epoch t1, where, of course, ΔΦ(t1, t1) is zero and there is no smoothing effect. All the smoothing algorithms are also applicable if only single-frequency data are available. In this case, R(ti), Φ(ti), and ΔΦ(ti, t1) denote the single-frequency code pseudorange, carrier phase pseudorange, and phase difference, respectively.

Measurement Example
Figure 4 shows the result of point positioning with smoothed pseudoranges based on the Hatch smoothing algorithm. The data were measured statically with an update rate of 0.1 Hz at a reference station with known coordinates and processed with RTKLIB, the open source program package for GNSS positioning (see www.rtklib.com). This software takes the impact of the atmosphere into account by standard models for the ionosphere (Klobuchar) and the troposphere (Saastamoinen). For each of the 36,000 epochs, the north, east, and up differences are displayed in a local coordinate system whose origin is the reference station. The noticeable jump in the period from about minute 12 to about minute 32 results from significant changes of the satellite constellation, i.e., rising and setting satellites.
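The integrated-Doppler smoothing of Eqs. (30)–(32) can be sketched as follows; the function name and sample values are illustrative, and dPhi[i] stands for ΔΦ(ti, t1) with dPhi[0] = 0:

```python
# Code pseudorange smoothing with integrated Doppler, Eqs. (30)-(32).
def doppler_smooth(R, dPhi):
    """R: code pseudoranges per epoch; dPhi: integrated phase differences
    relative to the first epoch. Returns the smoothed pseudoranges."""
    n = len(R)
    R_t1 = [R[i] - dPhi[i] for i in range(n)]        # Eq. (30): epoch-1 estimates
    R_t1_mean = sum(R_t1) / n                        # Eq. (31): arithmetic mean
    return [R_t1_mean + dPhi[i] for i in range(n)]   # Eq. (32)
```

Averaging the n epoch-1 estimates reduces the noise of the initial code pseudorange roughly by a factor 1/√n, which is the point of the procedure.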


Fig. 4 Point positioning with smoothed pseudoranges

The smoothing effect is seen best if the results are compared with the non-smoothed point positioning results for the same data set, see Fig. 5. Statistically, if the individual epochs are regarded as a set of measurements, standard deviations of 0.4 m for the north component, 0.2 m for the east component, and 0.8 m for the up component result.

3 Mathematical Models for Positioning

As indicated in the introduction, the user coordinates are essentially the objective when applying operational satellite geodesy. Depending mainly on the desired accuracy, different methods will be described: first, elementary point positioning with code ranges, the mass market application if code ranges are used, where only measurement data of the receiver at the site to be determined are required; second, differential positioning, where, apart from the receiver at the observation site, a (nearby) reference station is required which transmits correction data to the receiver at the observation site in order to improve the results; third, relative positioning, where the baseline between a known reference station and the desired point is determined by combining the data sets of the two involved stations. Referring to the Springer Handbook of Global Navigation Satellite Systems, the topics of this chapter are treated by Langley et al. [84] in Part A, Chap. 1, Won and


Fig. 5 Point positioning with code ranges with activated standard correction models for the ionosphere and the troposphere

Pany [151] in Part C, Chap. 14, Hauschild [50] in Part D, Chap. 19, Hauschild [51] in Part D, Kouba et al. [74] in Part E, Chap. 25, Odijk and Wanninger [100] in Part E, Chap. 26.

3.1 Point Positioning

Instead of point positioning, frequently the term single point positioning is used. An even more complete description is determination of absolute coordinates of a single point. This indicates that only one receiver is involved, which measures data at the site to be determined. Depending on the type of measured data, different models may be set up; however, the model overwhelmingly used is based on code ranges.

3.1.1 Point Positioning with Code Ranges

Code Range Model
The code pseudorange at an epoch t can be modeled, cf. Eq. (3), by

Rrs(t) = ρrs(t) + c δrs(t) .   (33)


Here, Rrs(t) is the measured code pseudorange between the observing receiver site r and the satellite s, the term ρrs(t) is the geometric distance between the observing point and the satellite, and c is the speed of light. The last item to be explained is δrs(t). This clock bias represents the combined clock offsets of the receiver and the satellite clock with respect to system time, cf. Eq. (4). Examining Eq. (33), the desired coordinates of the receiver site to be determined are implicit in the distance ρrs(t), which can explicitly be written as

ρrs(t) = √((Xs(t) − Xr)² + (Ys(t) − Yr)² + (Zs(t) − Zr)²) ,   (34)

where Xs(t), Ys(t), Zs(t) are the components of the geocentric position vector of the satellite at epoch t, and Xr, Yr, Zr are the three earth-centered, earth-fixed (ECEF) coordinates of the (stationary) observing receiver site. Now, the clock bias δrs(t) must be investigated in more detail. For the moment consider a single epoch; a single position r is automatically implied. Each satellite contributes one unknown clock bias, which can be recognized from the superscript s at the clock term. Neglecting, for the present, the clock bias at site r, the pseudorange equation for the first satellite would have four unknowns. These are the three site coordinates and one clock bias of this satellite. Each additional satellite adds one equation with the same site coordinates but with a new satellite clock bias. Thus, there would always be more unknowns than measurements. Even when an additional epoch is considered, new satellite clock biases must be modeled due to clock drift. Fortunately, the satellite clock information is known with sufficient accuracy and transmitted via the broadcast navigation message, e.g., in the form of three polynomial coefficients a0, a1, a2 with a reference time tc. Therefore, the equation

δs(t) = a0 + a1 (t − tc) + a2 (t − tc)²   (35)

enables the calculation of the satellite clock bias at epoch t. It should be noted that the polynomial (35) removes a great deal of the satellite clock uncertainty, but a small amount of (random) error remains. It should also be noted that the relativistic effects are not included in the polynomial. Therefore, for a more complete user algorithm for satellite clock correction, the term δrel must also be taken into account by

δs(t) = a0 + a1 (t − tc) + a2 (t − tc)² + δrel .   (36)

The combined bias term δrs(t) is split into two parts by

δrs(t) = δr(t) − δs(t) ,   (37)

where the satellite-related part δ s (t) is known and the receiver-related term δr (t) remains unknown. Substituting (37) into (33) and shifting the satellite clock bias to the left side of the equation yields


Rrs(t) + c δs(t) = ρrs(t) + c δr(t) .   (38)

Note that the left side of the equation contains observed or known quantities, while the terms on the right side are unknown.

Basic Configurations
Basic configurations are defined by the condition that the number of observations must be equal to or greater than the number of unknowns. This condition is sufficient but does not necessarily give a solution. The reason is that inherent rank deficiencies may prevent a numerical solution because of a singularity. More explanations are given later when the rank deficiency becomes an issue. The number of observations is ns nt, where ns denotes the number of satellites and nt the number of epochs. For static point positioning, the three coordinates of the observing site and the receiver clock bias for each observation epoch are unknown. Thus, the number of unknowns is 3 + nt. The basic configuration is defined by

ns nt ≥ 3 + nt ,   (39)

which yields the explicit relation

nt ≥ 3 / (ns − 1) .   (40)

The minimum number of satellites to get a solution is ns = 2, leading to nt ≥ 3 observation epochs. For ns = 4, the solution nt ≥ 1 is obtained. This solution reflects the instantaneous positioning capability of GNSS, where the four unknowns at any epoch are solved if at least four satellites are tracked. For kinematic point positioning, the basic configuration can be derived directly from the following consideration. Due to the motion of the receiver, the number of unknown station coordinates is 3nt. Adding the nt unknown receiver clock biases, the total number of unknowns is 4nt. Hence, the basic configuration is defined by adapting Eq. (39),

ns nt ≥ 4nt ,   (41)

yielding ns ≥ 4. In other words, the position (and system time) of a moving receiver can be determined at any instant as long as at least four satellites are tracked. Geometrically, the solution is represented by the intersection of four pseudoranges. For the rigorous analytical solution see Kleusberg [70] or Lichtenegger [87]. The basic configurations must be considered from a theoretical point of view. The solution ns = 2, nt ≥ 3 for static point positioning, for example, means that simultaneous observations of two satellites over three epochs would theoretically suffice. In practice, however, this situation would yield unacceptable results or the computation would fail because of an ill-conditioned system of observation


equations unless the epochs were widely spaced (e.g., hours). A solution is also possible if observations of three epochs for two satellites are made, followed by three additional epochs (e.g., seconds apart) for two other satellites. Such an application will be rare but is imaginable under special circumstances (e.g., in urban areas).

Measurement Example
The measurement update rate of the single-frequency results in Fig. 5 is 0.1 Hz, and each epoch has been processed individually with an elevation mask of 15 degrees. The result has been compared with the known true position. Combining all epochs statistically, measurement standard deviations of 0.5 m for the north component, 0.2 m for the east component, and 0.8 m for the up component are achieved. Note that the software package RTKLIB has been used and that the standard models Klobuchar and Saastamoinen are activated for the ionosphere and the troposphere. The amazingly good results of Fig. 5 can be further analyzed. Thus, it is interesting to calculate the impact of the atmospheric corrections for the ionosphere and the troposphere. In Fig. 6 the same data set has been processed, but no atmospheric corrections have been applied. Compared to Fig. 5, a shift of the north component of some 2 m can be seen and, as is to be expected, the up component is influenced more strongly, by about 10 to 11 m! The east component does not deviate strongly. The

Fig. 6 Point positioning with code ranges with deactivated standard correction models for the ionosphere and the troposphere


Fig. 7 Point positioning with code ranges with activated ionospheric correction but deactivated tropospheric correction

next question is how the ionosphere and the troposphere contribute individually to these differences. Therefore, Fig. 7 shows the results if the ionospheric correction is taken into account but the tropospheric correction is not. Referring to the up component only, there remains a shift of some 6 m due to the uncorrected troposphere. In contrast to this, Fig. 8 shows the inverse situation: here the tropospheric correction is taken into account but the ionospheric correction is not. Now the shift of the up component amounts to some 4 m due to the uncorrected ionosphere.
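The code range model of Eqs. (33)–(38) translates into a small least-squares solver: linearize the distance (34) about an approximate position and iterate on the three coordinates and the receiver clock bias. The sketch below uses synthetic satellite positions and pseudoranges (round illustrative numbers, not real ephemerides) and assumes the pseudoranges are already corrected for the satellite clock, i.e., represent the left side of Eq. (38):

```python
# Single-epoch point positioning with code ranges, Eqs. (33)-(38).
import numpy as np

C = 299_792_458.0  # speed of light [m/s]

def point_position(sat_pos, pr, iters=10):
    """sat_pos: (n, 3) satellite ECEF positions [m]; pr: (n,) pseudoranges
    corrected for the satellite clock [m]. Returns the receiver ECEF
    position [m] and the receiver clock bias [s]; requires n >= 4."""
    x, cdt = np.zeros(3), 0.0
    for _ in range(iters):
        rho = np.linalg.norm(sat_pos - x, axis=1)    # distances, Eq. (34)
        e = (x - sat_pos) / rho[:, None]             # partials of rho w.r.t. x
        A = np.hstack([e, np.ones((len(pr), 1))])    # design matrix
        dl = pr - (rho + cdt)                        # observed minus computed
        dx, *_ = np.linalg.lstsq(A, dl, rcond=None)  # least-squares update
        x = x + dx[:3]
        cdt = cdt + dx[3]
    return x, cdt / C
```

With four satellites the system is uniquely determined; with more, the least-squares adjustment exploits the redundancy.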

3.1.2 Point Positioning with Carrier Phases

Phase Range Models
Pseudoranges can also be obtained from carrier phase measurements. The mathematical model for these measurements, cf. Eq. (11), is given by

Φrs(t) = (1/λs) ρrs(t) + Nrs + (c/λs) δrs(t) ,   (42)

where Φrs (t) is the measured carrier phase expressed in cycles, λs is the wavelength, and ρrs (t) is the same as for the code range model. The time-independent phase ambiguity Nrs is an integer number and, therefore, often called integer ambiguity


Fig. 8 Point positioning with code ranges with deactivated ionospheric correction but activated tropospheric correction

or integer unknown or simply ambiguity. The term c denotes the speed of light and δrs(t) is the combined receiver and satellite clock bias. Inserting Eq. (37) into Eq. (42) and shifting the (known) satellite clock bias to the left side of the equation yields

Φrs(t) + fs δs(t) = (1/λs) ρrs(t) + Nrs + fs δr(t) ,   (43)

where the frequency of the satellite carrier f s = c/λs has been substituted. Basic Configurations Using the same notations as before, the number of observations is again ns nt . The number of unknowns, however, is increased by the number ns because of the ambiguities. For static point positioning, the number of unknowns is composed of 3 coordinates of the observing station, ns unknown ambiguities, and nt unknown receiver clock biases. Referring to (43), the problem of rank deficiency is encountered. Mathematically less interested readers may skip the next paragraph. A few basics on rank and rank deficiency are given here. Deeper insight may be obtained from Koch [72, Sects. 132, 333]. Assume a large number of equations of type (43) being prepared to be solved for the unknowns. This implies a matrix-vector


representation where the right side is composed of the product of a design matrix A and a vector comprising the unknowns in linear form. The rank of the design matrix is equal to the order of the largest nonsingular matrix that can be formed inside A. Formulated differently: the maximum number of linearly independent rows of the matrix A is called the rank of the matrix and is denoted by rank A. Linear dependence of two rows means that their linear combination yields zero. The word "rows" in this definition may also be replaced by the word "columns". For a simpler discussion, assume a square matrix with m × m rows and columns. Thus, if the largest nonsingular matrix is the matrix A itself, the rank equals rank A = m and the matrix is regular, i.e., it may be inverted without trouble. On the other hand, if the largest nonsingular matrix inside A has, e.g., (m−2) × (m−2) rows and columns, the rank is m − 2, implying a rank deficiency of m − rank A = m − (m − 2) = 2. As a consequence, the singular system becomes regularly solvable if two unknowns (also denoted as parameters) are arbitrarily chosen. This equals the "fixing" of two parameters. Figuratively speaking, two of the parameters may be transferred to the left side of the matrix-vector system comprising the measurements. This transfer, on the other hand, reduces the number of columns of the matrix on the right side by the amount of the rank deficiency, i.e., by two in the example discussed. This concludes the short discussion on rank and rank deficiency. The model in the form (43) comprises a rank deficiency of 1; this means that one of the unknown parameters may (and must) be arbitrarily chosen. Suppose that the receiver clock bias at one epoch is chosen; then, instead of nt unknown receiver clock biases, only nt − 1 clock biases remain.
Therefore, the basic configuration for static point positioning without rank deficiency is defined by the relation

ns nt ≥ 3 + ns + (nt − 1) ,    (44)

which yields explicitly the required number of epochs as

nt ≥ (ns + 2)/(ns − 1) .    (45)

The minimum number of satellites to get a solution is ns = 2, leading to nt ≥ 4 observation epochs. Another integer solution pair is ns = 4, nt ≥ 2.
For kinematic point positioning with phases, 3nt unknown station coordinates must be considered because of the roving receiver, compared to the three unknowns in (44). The other considerations, including the discussion of the rank deficiency, remain unchanged. Therefore, the basic configuration is defined by

ns nt ≥ 3nt + ns + (nt − 1) ,    (46)

yielding the explicit relation

21 Elementary Mathematical Models for GNSS Positioning

nt ≥ (ns − 1)/(ns − 4) .    (47)

The minimum number of satellites to get a solution is ns = 5 which have to be tracked for nt ≥ 4 epochs. Another integer solution pair is ns = 7, nt ≥ 2. Note that solutions for a single epoch (i.e., nt = 1) do not exist for point positioning with carrier phases. As a consequence, kinematic point positioning with phases is only possible if the ns phase ambiguities are known from some initialization. In this case, the phase range model converts to the code range model.
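Relations (45) and (47) can be checked with a few lines of code; the helper below (a straightforward sketch using the ceiling of the quotients) reproduces the integer solution pairs quoted in the text.

```python
import math

def min_epochs_static(ns):
    # Static point positioning, Eq. (45): nt >= (ns + 2) / (ns - 1)
    return math.ceil((ns + 2) / (ns - 1))

def min_epochs_kinematic(ns):
    # Kinematic point positioning, Eq. (47): nt >= (ns - 1) / (ns - 4)
    return math.ceil((ns - 1) / (ns - 4))

print(min_epochs_static(2))     # 4 epochs for the minimum of 2 satellites
print(min_epochs_static(4))     # 2 epochs suffice with 4 satellites
print(min_epochs_kinematic(5))  # 4 epochs for the minimum of 5 satellites
print(min_epochs_kinematic(7))  # 2 epochs suffice with 7 satellites
```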

3.1.3 Point Positioning with Doppler Data
The mathematical model for Doppler data, cf. Eq. (13), is

D_r^s(t) = ρ̇_r^s(t) + c δ̇_r^s(t)    (48)

and may be considered as the time derivative of a code or phase pseudorange. In this equation, D_r^s(t) denotes the observed Doppler shift scaled to range rate, ρ̇_r^s(t) is the instantaneous radial velocity between the satellite and the receiver, and δ̇_r^s(t) is the time derivative of the combined clock bias term. The radial velocity for a stationary receiver, cf. Hofmann-Wellenhof et al. [54, Eq. (3.34)],

ρ̇_r^s(t) = ((ϱ^s(t) − ϱ_r) / ‖ϱ^s(t) − ϱ_r‖) · ϱ̇^s(t)    (49)

relates the unknown position vector ϱ_r of the receiver to the instantaneous position vector ϱ^s(t) and velocity vector ϱ̇^s(t) of the satellite. These vectors can be calculated from the satellite ephemerides. Introducing on the one hand ρ = ‖ϱ^s(t) − ϱ_r‖ according to (1) and on the other hand the components X^s(t), Y^s(t), Z^s(t) of the vector ϱ^s(t), likewise X_r, Y_r, Z_r as the components of the vector ϱ_r, and Ẋ^s(t), Ẏ^s(t), Ż^s(t) for the vector ϱ̇^s(t) accordingly, the radial velocity may also be written as

ρ̇_r^s(t) = ((X^s(t) − X_r)/ρ) Ẋ^s(t) + ((Y^s(t) − Y_r)/ρ) Ẏ^s(t) + ((Z^s(t) − Z_r)/ρ) Ż^s(t)    (50)

after evaluating the inner product. The contribution of the satellite clock to δ̇_r^s(t) is given by, cf. Eq. (35),

δ̇^s(t) = a1 + 2 a2 (t − tc)    (51)

and is known. Summarizing, the observation equation (48) contains four unknowns. These unknowns are the three coordinates of r and the receiver clock drift δ˙r (t). Hence, compared to the code range model, the Doppler equation contains the receiver clock drift instead of the receiver clock offset.


The concept of combined code pseudorange and Doppler data processing leads to a total of five unknowns. These unknowns are the three point coordinates, the receiver clock offset, and the receiver clock drift. Each satellite contributes two equations, one code pseudorange and one Doppler equation. Therefore, three satellites are sufficient to solve for the five unknowns. The similarity of the pseudorange and the Doppler equation gives rise to the question of a linear dependence of the equations. However, it can be shown that the surfaces of constant pseudoranges and the surfaces of constant Doppler are orthogonal and hence independent [86].
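As a quick numerical check of Eq. (49), the sketch below (with made-up positions and velocities, not a real ephemeris) projects the satellite velocity onto the line-of-sight unit vector.

```python
import numpy as np

def radial_velocity(sat_pos, sat_vel, rcv_pos):
    """Instantaneous radial velocity, Eq. (49): the satellite velocity
    projected onto the unit vector from receiver to satellite."""
    los = sat_pos - rcv_pos
    return np.dot(los, sat_vel) / np.linalg.norm(los)

# Illustrative geometry: satellite directly "above" the receiver on the x axis.
sat_pos = np.array([26.0e6, 0.0, 0.0])     # m
rcv_pos = np.array([6.378e6, 0.0, 0.0])    # m

# Purely tangential satellite motion gives zero radial velocity ...
print(radial_velocity(sat_pos, np.array([0.0, 3.9e3, 0.0]), rcv_pos))   # 0.0
# ... while a radial velocity component maps one-to-one into the range rate.
print(radial_velocity(sat_pos, np.array([1.0e3, 3.9e3, 0.0]), rcv_pos)) # 1000.0
```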

3.1.4 Precise Point Positioning

Basic Model
Considering the methods of point positioning described in the previous sections, the main limiting factors with respect to the achievable accuracy are the orbit errors, the clock errors, and the atmospheric influences (ionospheric and tropospheric refraction). Therefore, following Witchayangkoon [150, p. 2], precise point positioning (PPP) by definition uses accurate orbital data and accurate satellite clock data as provided, e.g., by the International GNSS Service (IGS), together with dual-frequency code pseudoranges and/or carrier phase observations. The preferred model is based on an ionosphere-free combination of both code pseudoranges and carrier phases and will be derived subsequently. Starting with the phase pseudorange model (42) and taking into account the frequency-dependent ionospheric refraction gives

Φ1 = (1/λ1) ρ + (c/λ1) Δδ + N1 − (1/λ1) ΔIono_1 ,
Φ2 = (1/λ2) ρ + (c/λ2) Δδ + N2 − (1/λ2) ΔIono_2 ,    (52)

where the subscripts 1 and 2 indicate the dependence on the respective frequency of the two carriers. Using the relation c = f λ yields

Φ1 = (f1/c) ρ + f1 Δδ + N1 − (f1/c) ΔIono_1 ,
Φ2 = (f2/c) ρ + f2 Δδ + N2 − (f2/c) ΔIono_2 ,    (53)

which can be written in the form

Φ1 = a f1 + N1 − b/f1 ,
Φ2 = a f2 + N2 − b/f2    (54)


by introducing

a = ρ/c + Δδ        geometry term ,
b = (f_i²/c) ΔIono_i        ionosphere term .    (55)

Note that the auxiliary quantities a and b are frequency independent; therefore, no subscript is needed to specify the frequency. The ionosphere term can be eliminated by the following linear combination. Multiplying the first equation of (54) by f1 and the second by f2 and forming the difference yields

Φ1 f1 − Φ2 f2 = a (f1² − f2²) + N1 f1 − N2 f2    (56)

and, after multiplying the equation by f1/(f1² − f2²) and a slight rearrangement, the ionosphere-free combination

f1²/(f1² − f2²) · (Φ1 − (f2/f1) Φ2) = a f1 + f1²/(f1² − f2²) · (N1 − (f2/f1) N2)    (57)

is obtained. Resubstituting for the geometry term a according to (55),

f1²/(f1² − f2²) · (Φ1 − (f2/f1) Φ2) = (f1/c) ρ + f1 Δδ + f1²/(f1² − f2²) · (N1 − (f2/f1) N2)    (58)

results for the ionosphere-free combination. The significant drawback of this combination is that the integer nature of the ambiguities is lost, since f2/f1 is not an integer for current GNSS. Note that on the left side of the equation the geometric residual reappears, cf. Eq. (23); thus, this quantity could also be denoted as the reduced ionosphere-free signal.
The derivation of the ionosphere-free combination for code pseudoranges starts with the model equations

R1 = ρ + c Δδ + ΔIono_1 ,
R2 = ρ + c Δδ + ΔIono_2 ,    (59)

where ΔIono_i is inversely proportional to the squared respective carrier frequency, see Hofmann-Wellenhof et al. [54, Eq. (5.64)]. Note also that the amount of the ionospheric refraction is the same as for the phase equation, but the sign is opposite! Thus, multiplying the first equation of (59) by f1² and the second by f2² and then forming the difference yields


R1 f1² − R2 f2² = (f1² − f2²)(ρ + c Δδ) ,    (60)

where the ionosphere term is eliminated. After dividing the equation by (f1² − f2²) and a slight rearrangement, the ionosphere-free combination

f1²/(f1² − f2²) · (R1 − (f2²/f1²) R2) = ρ + c Δδ    (61)

is obtained. The advantage of the ionosphere-free combination is the elimination (or, more precisely, the reduction) of ionospheric effects. When studying the derivation in detail (which is not demonstrated here), it should become clear that the term "ionosphere-free" is not fully correct because some approximations are involved, e.g., a series expansion. Brunner and Gu [11] propose an improved model to account for the higher-order terms arising from the series expansion of the refractive index, the geomagnetic field effect, and the bending effects of the ray paths.
Taking the troposphere into account as well, the respective equation for the code pseudoranges is obtained from (61) and reads

f1²/(f1² − f2²) · (R1 − (f2²/f1²) R2) = ρ + c Δδ + ΔTrop .    (62)

The ionosphere-free carrier phase relation as given in (58) reads

f1²/(f1² − f2²) · (Φ1 − (f2/f1) Φ2) = (f1/c) ρ + f1 Δδ + f1²/(f1² − f2²) · (N1 − (f2/f1) N2) .    (63)

This equation is now multiplied by the factor c/f1, yielding

(c f1)/(f1² − f2²) · (Φ1 − (f2/f1) Φ2) = ρ + c Δδ + (c f1)/(f1² − f2²) · (N1 − (f2/f1) N2)    (64)

or, by substituting c = λ1 f1 and adding the tropospheric delay,

(λ1 f1²)/(f1² − f2²) · (Φ1 − (f2/f1) Φ2) = ρ + c Δδ + ΔTrop + (λ1 f1²)/(f1² − f2²) · (N1 − (f2/f1) N2)    (65)

is obtained. After a slight rearrangement and by using c = λ2 f2, this formula yields

(λ1 Φ1 f1²)/(f1² − f2²) − (λ2 Φ2 f2²)/(f1² − f2²) = ρ + c Δδ + ΔTrop + (λ1 N1 f1²)/(f1² − f2²) − (λ2 N2 f2²)/(f1² − f2²) .    (66)


In summary, (62) and (66) are the desired ionosphere-free combinations of code pseudoranges and carrier phases for PPP:

(R1 f1²)/(f1² − f2²) − (R2 f2²)/(f1² − f2²) = ρ + c Δδ + ΔTrop ,
(λ1 Φ1 f1²)/(f1² − f2²) − (λ2 Φ2 f2²)/(f1² − f2²) = ρ + c Δδ + ΔTrop + (λ1 N1 f1²)/(f1² − f2²) − (λ2 N2 f2²)/(f1² − f2²) ,    (67)
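The elimination of the ionospheric term can be verified numerically. The sketch below builds synthetic dual-frequency code pseudoranges with an ionospheric delay proportional to 1/f², forms the code combination of (67), and checks that only geometry, clock, and troposphere remain (all range and bias values are illustrative round numbers).

```python
# GPS L1/L2 carrier frequencies in Hz
f1, f2 = 1575.42e6, 1227.60e6

rho   = 22.0e6      # geometric range (m), illustrative
c_dd  = 30.0        # clock bias scaled to range (m), illustrative
trop  = 2.5         # tropospheric delay (m), illustrative
A_ion = 5.0 * f1**2 # ionospheric delay of 5 m on L1: Iono_i = A_ion / f_i**2

R1 = rho + c_dd + trop + A_ion / f1**2
R2 = rho + c_dd + trop + A_ion / f2**2

# Ionosphere-free code combination, first line of (67)
R_if = (R1 * f1**2 - R2 * f2**2) / (f1**2 - f2**2)

print(R2 - R1)                          # L2 sees a larger ionospheric delay
print(abs(R_if - (rho + c_dd + trop)))  # ~0: the ionospheric term is eliminated
```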

where (62) has been slightly rearranged. The unknown parameters to be determined are the point position contained in ρ, the receiver clock error contained in Δδ (see Eq. (3)), the tropospheric delay ΔTrop, and the ambiguities. Based on this model, PPP may be applied either in static or in kinematic mode. To solve for the mentioned unknowns, several methods are possible: Deo et al. [19] apply a sequential least-squares adjustment; (extended) Kalman filtering is another frequently used method. Apart from the PPP model given in (67), different strategies may be found in the literature, e.g., with respect to the tropospheric term. Witchayangkoon [150] and Kouba and Héroux [73] estimate the total tropospheric zenith path delay as above, whereas Gao and Shen [35] model the dry tropospheric zenith path delay and estimate the wet component as a parameter.

Model Refinements
To exploit the full potential of PPP, the model must be refined. Additional terms are necessary to account for the Sagnac effect, the solid earth tides, the ocean loading, the atmospheric loading (caused by atmospheric pressure variation), polar motion, earth orientation effects, crustal motion, and other earth deformation effects [73]. Also the antenna phase center offsets (at the satellite and at the receiver) and the antenna phase wind-up error [150, pp. 24–26] should be taken into account.
The proper weighting of the observations is also a key to improving the accuracy, and numerous investigations on different weighting schemes exist. Among them, Witchayangkoon [150, Sect. 7.3.3.4] mentions exponential weighting schemes taking into account that observations from satellites near the horizon get a lower weight [23], and weights reflecting the signal-to-noise (S/N) ratio values [15, 45]. Langley [83] derives carrier-to-noise power density (C/N0) ratios varying with the elevation of the arriving signal. Weighting as a cosecant or squared cosecant function of the satellite elevation angle E [15, 45, 134] is justified by the cosecant shape of the various models of the tropospheric mapping function. Wieser [146, 147] compares identical variances σ² for all observations, elevation-dependent variances σ0²/sin²E, and SIGMA-ε variances defined by k · 10^(−(C/N0)/10), where the model parameters σ, σ0, and the factor k depend on the receiver and antenna types and can be determined in advance. As outlined in Wieser et al. [148],


the measured C/N0 is a quality indicator because there is a functional relation between this quantity and the tracking loop noise.
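Two of the weighting schemes compared by Wieser can be sketched as follows; σ0 and k are illustrative values, since in practice they depend on the receiver and antenna types.

```python
import math

def var_elevation(sigma0, elev_deg):
    """Elevation-dependent variance: sigma0^2 / sin^2(E)."""
    return sigma0**2 / math.sin(math.radians(elev_deg))**2

def var_sigma_eps(k, cn0_dbhz):
    """SIGMA-eps variance: k * 10^(-(C/N0)/10), with C/N0 in dB-Hz."""
    return k * 10.0 ** (-cn0_dbhz / 10.0)

sigma0 = 0.003   # m, illustrative phase noise at zenith
k      = 1.0e4   # scale factor, illustrative

print(var_elevation(sigma0, 90.0))   # zenith: sigma0^2
print(var_elevation(sigma0, 10.0))   # low satellite: strongly down-weighted
print(var_sigma_eps(k, 50.0) < var_sigma_eps(k, 30.0))  # stronger signal, smaller variance
```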

Numerical Results
Gao and Chen [34] use real-time precise orbit and clock corrections (accurate to 20 cm and 0.5 ns, respectively) provided by the Jet Propulsion Laboratory (JPL) and present results for different positioning modes. For a static observation, they demonstrate that all position components (latitude, longitude, height) "converge to centimeter level" after 20 minutes. The convergence problem arises from the ambiguities. After this convergence has been achieved, the results even remain at the subcentimeter level. Therefore, they conclude that PPP is capable of providing real-time centimeter-level accuracy for static surveys.
An additional remark on the ambiguities is appropriate here: because of unknown receiver-specific and also transmitter-specific phase delays, which in addition vary with time, the ambiguities are not integers [154]; only the double-difference ambiguities are integers. Note, however, that for "applications that do not require accuracies better than a few millimeters in the horizontal dimension and approximately 1 cm in the vertical dimension, ambiguity resolution is not necessary, provided that the observation time is of the order of 1 day" [154].
For kinematic applications, real-time centimeter-level accuracy is also demonstrated for a car and an airplane by Gao and Chen [34], again based on the JPL orbit and clock corrections. Note, however, that these excellent results profit mainly from the JPL input data for the satellite orbit and, even more importantly, the satellite clock correction. These JPL data are available for commercial applications [34]. When relying on freely available products like the predicted ultrarapid orbit product provided by the IGS, with an even better orbit accuracy of a few centimeters (compared to the JPL product) but a significantly worse clock accuracy of 5 ns [34], this accuracy is insufficient for real-time decimeter-level PPP [19].
Kinematic results based on the use of IGS final orbit and clock corrections are given in [1]; the position accuracy is better than 3 decimeters for many cases, including land vehicle, marine, and airborne applications.
Witchayangkoon [150] gives a detailed model, incorporates corrections for solid earth tides, relativity, and satellite antenna phase center offsets, and reports some results from numerical examples. In cases of low impact by multipath, "single-frequency ionosphere-free PPP solutions are equivalent to the dual-frequency solutions". Thus, a future trend is PPP using single-frequency data only. Even with a simple model as proposed by Satirapod and Kriengkraiwasin [108], which uses ionospherically corrected single-frequency code and phase observations and introduces the Saastamoinen troposphere model to calculate the total tropospheric zenith delay (mapped to the line-of-sight delay), a horizontal accuracy of 1–4 m can be achieved with data sessions ranging from 5 to 30 minutes (based, however, on the precise orbit files of the IGS).


Fig. 9 PPP at station Graz Lustbühel using undifferenced phases, float ambiguities, final IGS orbits and clocks. (Figure used with kind permission from Katrin Huber)

Measurement Example
The PPP topic is reflected in the literature by numerous publications. Two PhD theses are selected to present a measurement example: Huber [55] and Wen [144]. The investigations carried out by Huber [55] focus on two main aspects: using float ambiguities or fixing the ambiguities to integer values. Using float ambiguities means that the ambiguities are used as they result from the adjustment procedure, in contrast to fixing the ambiguities to integer values. The latter case is much more challenging because the fixing only makes sense if the correct integer values are finally found; otherwise the results may be worse compared to the float ambiguity solution! More on ambiguity resolution may be found in Sect. 4.2.
In Fig. 9, a typical example based on undifferenced phases ("zero differences") and using float ambiguities is shown, where the resulting PPP solution is compared with "true" coordinates. The differences are displayed in the local level system North, East, and Up as deviations dN, dE, dU. The results are similar if single differences are used instead of undifferenced phases, see Fig. 10. This is to be expected if theory only is considered, since mathematically there is no difference at all. Note, however, that this is only true theoretically and is not necessarily reflected in reality! After the experience of many other examples, Huber [55] concludes on PPP with float ambiguities as follows: "Although PPP in real time can yield similar accuracies (cm-level) as relative positioning techniques like RTK, the convergence of PPP solutions estimating float ambiguities is rather long. At least 10 to 15 minutes


Fig. 10 PPP at station Graz Lustbühel using single differences, float ambiguities, final IGS orbits and clocks. (Figure used with kind permission from Katrin Huber)

are required for the solutions to approach at least the decimeter level, while an accuracy level of a few centimeters is reached only after observation times of 30 to 90 minutes." (Remark: within the citation, tiny changes have been carried out.)
The convergence time is the remaining key problem of PPP with float ambiguities; this characteristic does not really change if the ambiguities are fixed, because the correct fixing of the ambiguities is precisely the critical issue. According to Huber [55], her "thesis shows that ambiguity fixing using the software produced in this research is possible, if external products containing UPD corrections (in this research from CNES and TU Vienna) are available. Fixing times can be reduced by means of integer ambiguity resolution, but 8 to 10 minutes of initialisation time are still needed in the best case." (Remark: UPD is the acronym for uncalibrated phase delays, CNES is the acronym for Centre National d'Etudes Spatiales, and TU Vienna is a short form of Vienna University of Technology.)

3.2 Differential Positioning

In contrast to point positioning, two (or more) receivers are used. The detailed description of the concept follows subsequently. The achievable accuracy depends on the data used, essentially code ranges or phase ranges. In addition, the number of the involved reference stations impacts the results.


Fig. 11 Basic concept of differential positioning

The minimum requirement is a single reference station. In case of more reference stations, the concept of local-area differential positioning applies.

3.2.1 Basic Concept
Differential positioning with GNSS, abbreviated DGNSS, is a real-time positioning technique where two or more receivers are used. One receiver, usually at rest, is located at the reference or base station with (assumed) known coordinates, and the remote receiver is fixed or roving and its coordinates are to be determined (see Fig. 11, where A represents the reference station and B is the remote station, frequently called rover if it is moving). At the reference station, commonly pseudorange corrections (PRC) and range rate corrections (RRC) are calculated and transmitted to the remote station quasi in real time. At the remote station, the corrections are applied to the measured pseudoranges, and point positioning with the corrected pseudoranges is performed so that the position accuracy is improved.

3.2.2 DGNSS with Code Ranges
Generalizing (38) and following Lichtenegger [88], the code range at the reference station A to satellite s measured at epoch t0 may be modeled by

R_A^s(t0) = ρ_A^s(t0) + Δρ_A^s(t0) + Δρ^s(t0) + Δρ_A(t0) ,    (68)

where ρ_A^s(t0) is the geometric range, the term Δρ_A^s(t0) denotes range biases depending on the terrestrial base position and satellite position as well (e.g., radial orbital error, refraction effects), the range bias Δρ^s(t0) is purely satellite-dependent (e.g., effect of satellite clock error), and the range bias Δρ_A(t0) is purely receiver-dependent (e.g., effect of receiver clock error, multipath). Note that noise has been neglected in (68). The pseudorange correction for satellite s at reference epoch t0 is defined by the relation

PRC^s(t0) = ρ_A^s(t0) − R_A^s(t0)
          = −Δρ_A^s(t0) − Δρ^s(t0) − Δρ_A(t0)    (69)


and can be calculated, since the geometric range ρ_A^s(t0) is obtained from the known position of the reference station and the broadcast ephemerides, and R_A^s(t0) is the measured quantity. In addition to the pseudorange correction PRC^s(t0), the time derivative or range rate correction RRC^s(t0) is determined at the reference station. Range and range rate corrections referring to the reference epoch t0 are transmitted to the rover site B in quasi real time. At B the pseudorange corrections are predicted for the observation epoch t using the relation

PRC^s(t) = PRC^s(t0) + RRC^s(t0) (t − t0) ,    (70)

where t − t0 is defined as the latency. The achievable accuracy increases for smaller variations of the pseudorange corrections and for smaller latencies. Adapting (68) to the rover site B and epoch t, the code pseudorange measured at the rover can be modeled by

R_B^s(t) = ρ_B^s(t) + Δρ_B^s(t) + Δρ^s(t) + Δρ_B(t) .    (71)

Applying the predicted pseudorange correction PRC^s(t), cf. Eq. (70), to the measured pseudorange R_B^s(t) yields

R_B^s(t)_corr = R_B^s(t) + PRC^s(t)    (72)

or, after substitution of (71) and of the pseudorange correction according to (69) and (70), respectively,

R_B^s(t)_corr = ρ_B^s(t) + [Δρ_B^s(t) − Δρ_A^s(t)] + [Δρ_B(t) − Δρ_A(t)] ,    (73)

where the satellite-dependent bias has canceled out. For moderate distances between the reference and the rover site, the satellite–receiver-specific biases are highly correlated. Therefore, the influence of radial orbital errors and of refraction is significantly reduced. Neglecting these biases, Eq. (73) simplifies to RBs (t)corr = ρBs (t) + ρAB (t) ,

(74)

where ρAB (t) = ρB (t) − ρA (t). If multipath is neglected, this term converts to the combined receiver clock bias scaled to range, i.e., ρAB (t) = c δAB (t) = c δB (t) − c δA (t). If no latency exists, the equation is identical with the betweenreceiver single-difference of code ranges measured at A and B, and differential positioning converts to relative positioning (Sect. 3.3). Positioning at the rover site B is performed with the corrected code pseudoranges RBs (t)corr leading to improved position accuracies. The basic configuration for DGNSS with code ranges is identical with that for kinematic point positioning with code ranges, cf. Eq. (41).
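A minimal numeric sketch of Eqs. (69)–(73), with invented round numbers for all biases: the satellite-dependent bias cancels in the corrected rover pseudorange, while the across-receiver bias difference remains.

```python
def predict_prc(prc0, rrc0, t, t0):
    """Predicted pseudorange correction at epoch t, Eq. (70)."""
    return prc0 + rrc0 * (t - t0)

# Illustrative geometric ranges and biases in meters
rho_A, rho_B = 20_000_000.0, 20_000_500.0
d_sat        = 15.0        # satellite-dependent bias (common to both receivers)
d_A, d_B     = 3.0, -2.0   # receiver-dependent biases

R_A  = rho_A + d_sat + d_A          # code range measured at the reference station
prc0 = rho_A - R_A                  # Eq. (69): equals -(d_sat + d_A)

R_B      = rho_B + d_sat + d_B      # code range measured at the rover
R_B_corr = R_B + predict_prc(prc0, 0.0, 0.0, 0.0)   # Eq. (72), zero latency

print(R_B_corr - rho_B)   # -5.0 = d_B - d_A: the satellite bias is gone
```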


Measurement Example
For the measurement example, a static case is considered. The reference station is Graz Lustbühel with the ETRS89 Austria 2002 coordinates [x, y, z]^T = [4 194 424.124 m, 1 162 702.460 m, 4 647 245.193 m]^T. The coordinates of the point with the notation TP 154-163 at Voitsberg (Austria) are also known: [x, y, z]^T = [4 204 242.628 m, 1 139 492.405 m, 4 644 249.437 m]^T. These coordinates are considered as truth and are compared with the results of the DGNSS measurements. The distance between the reference station Graz Lustbühel and TP 154-163 amounts to some 25 km.
When processing the DGNSS data, only GPS satellites and the L1 frequency with an update rate of 0.1 Hz have been used for the 60-minute data set; thus, DGNSS could be denoted more specifically as DGPS. First, the difference between the coordinates as obtained from the DGNSS measurements and the truth was calculated for each epoch. The resulting difference vectors were transformed into the local level system of TP 154-163 and split into the north, east, and up components, see Fig. 12. Note that for the processing of the data the RTKLIB software has been used, and standardized models for the ionosphere and the troposphere have been taken into account.
Combining all epochs statistically, measurement standard deviations of 0.36 m for the north component, 0.22 m for the east component, and 0.55 m for

Fig. 12 Comparing the “true” coordinates of point TP 154-163 at Voitsberg (Austria) with results as obtained by DGPS where the reference station Graz Lustbühel is some 25 km away


Fig. 13 Comparison of DGNSS with point positioning of point TP 154-163 at Voitsberg (Austria)

the up component are achieved. The 2D mean offset amounts to 0.10 m and the 3D mean offset amounts to 0.19 m.
Since the results of point positioning in Sect. 3.1.1 are of similar accuracy, it is even tempting to compare DGNSS with point positioning. Therefore, for the same data set the point positioning solutions at TP 154-163 were calculated for each epoch and compared with the DGNSS solutions. The resulting differences in the local level coordinate system of TP 154-163 are given in Fig. 13, with standard deviations of 0.66 m for the north component, 0.23 m for the east component, and 0.35 m for the up component. The 2D mean offset amounts to 1.62 m and the 3D mean offset amounts to 1.79 m.

3.2.3 DGNSS with Phase Ranges
Generalizing (42) and following Lichtenegger [88], the phase pseudorange measured at the reference station A at epoch t0 can be modeled by

λ^s Φ_A^s(t0) = ρ_A^s(t0) + Δρ_A^s(t0) + Δρ^s(t0) + Δρ_A(t0) + λ^s N_A^s ,    (75)

where, in analogy to the code range model, ρ_A^s(t0) is the geometric range, Δρ_A^s(t0) is the satellite–receiver-dependent bias, Δρ^s(t0) is purely satellite-dependent, and Δρ_A(t0) is purely receiver-dependent. Finally, N_A^s is the phase ambiguity. Consequently, the phase range correction at reference epoch t0 is given by

PRC^s(t0) = ρ_A^s(t0) − λ^s Φ_A^s(t0)
          = −Δρ_A^s(t0) − Δρ^s(t0) − Δρ_A(t0) − λ^s N_A^s .    (76)

The formulation of range rate corrections at the reference station A as well as the application of predicted range corrections to the observed phase ranges at the rover site B is carried out in full analogy to the previously described code range procedure. Therefore,

λ^s Φ_B^s(t)_corr = ρ_B^s(t) + Δρ_AB(t) + λ^s N_AB^s    (77)

results for the corrected phase ranges, where Δρ_AB(t) = Δρ_B(t) − Δρ_A(t) and N_AB^s = N_B^s − N_A^s is the (single-)difference of the phase ambiguities. As in the code range model, if multipath is neglected, the term Δρ_AB(t) converts to the combined receiver clock bias scaled to range, i.e., Δρ_AB(t) = c Δδ_AB(t) = c δ_B(t) − c δ_A(t). Point positioning at the rover site B is performed with the corrected phase pseudoranges λ^s Φ_B^s(t)_corr. The basic configuration for DGNSS with phase ranges is identical with that for kinematic point positioning with phase ranges, cf. Eq. (47). DGNSS with phase ranges, sometimes denoted as the carrier phase differential technique, is used for the most precise kinematic applications. For this mode of operation, on-the-fly (OTF) techniques are required to resolve the ambiguities; more details on OTF are given in Sect. 4.2.3. Note that DGNSS with phases converts to relative positioning with phases if the latency becomes zero. This method is usually denoted the real-time kinematic (RTK) technique.

3.2.4 Local-Area DGNSS An extension of DGNSS is the local-area DGNSS (LADGNSS) which uses a network of GNSS reference stations. As the name implies, LADGNSS covers a larger territory than can be reasonably accommodated by a single reference station. One of the main advantages of LADGNSS is that a more consistent accuracy can be achieved throughout the region supported by the network. In the case of DGNSS with a single reference station, the accuracy decreases as a function of the distance of the remote site from the reference station at a rate of approximately 1 cm per 1 km. Other advantages of LADGNSS are that inaccessible regions can be covered, e.g., large bodies of water, and that in case of a failure in one of the reference stations, the network will still maintain a relatively high level of integrity and reliability compared to a collection of individual DGNSS reference stations. Apart from the monitor stations, the LADGNSS network includes (at least) one master station. This station collects the range corrections from the monitor stations, processes these data to form LADGNSS corrections which are transmitted to the user community as well as to the monitor stations [98]. The networks may cause slight additional delay beyond regular DGNSS due to the additional communication required between the monitor stations and the master station.


Since the reference stations of the LADGNSS network may be very distant from the user location, the virtual reference station (VRS) concept (Sect. 3.3.7) has been developed [140]. Here, the user gets range corrections or even the observables of a nonexistent (i.e., virtual) reference station at a user-specified position. This concept is a prerequisite mainly for RTK applications which require short distances to reference stations to facilitate ambiguity resolution.

3.3 Relative Positioning

In contrast to point positioning and differential positioning, the coordinates of the desired site are determined relative to a known reference station. The detailed description of the concept follows subsequently. Relative positioning is the preferred method if the highest accuracies are required.

3.3.1 Basic Concept
The objective of relative positioning is to determine the coordinates of an unknown point with respect to a known point which, for most applications, is stationary. In other words, relative positioning aims at the determination of the vector between the two points, which is often called the baseline vector or simply the baseline (Fig. 14). Let A denote the (known) reference point, B the unknown point, and b_AB the baseline vector. Introducing the corresponding position vectors X_A, X_B, the relation

X_B = X_A + b_AB    (78)

may be formulated, and the components of the baseline vector b_AB are

b_AB = [X_B − X_A, Y_B − Y_A, Z_B − Z_A]^T = [ΔX_AB, ΔY_AB, ΔZ_AB]^T .    (79)

Fig. 14 Basic concept of relative positioning
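Using the station coordinates from the measurement example in Sect. 3.2.2 (Graz Lustbühel and TP 154-163), the baseline components (79) and the baseline length follow directly:

```python
import numpy as np

# ETRS89 coordinates (m) of Graz Lustbühel (A) and TP 154-163 (B), Sect. 3.2.2
X_A = np.array([4_194_424.124, 1_162_702.460, 4_647_245.193])
X_B = np.array([4_204_242.628, 1_139_492.405, 4_644_249.437])

b_AB = X_B - X_A                    # baseline vector, Eq. (79)
print(b_AB)                         # components [dX_AB, dY_AB, dZ_AB]
print(np.linalg.norm(b_AB) / 1e3)   # baseline length in km, "some 25 km"
```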


The coordinates of the reference point must be given and can be approximated by a code range solution. More often, the coordinates are precisely known based upon GNSS or other methods. Relative positioning can be performed with code ranges, cf. Eq. (38), or with phase ranges, cf. Eq. (43). Subsequently, only phase ranges are explicitly considered, because solutions based on phase ranges are far more accurate. Relative positioning requires simultaneous observations at both the reference and the unknown point; this means that the observation time tags for the two points must be the same.
Assuming such simultaneous observations at the two points A and B to satellites j and k, linear combinations can be formed, leading to single-differences, double-differences, and triple-differences. Differencing can basically be accomplished in three different ways: across receivers, across satellites, across time [89, p. 96]. Instead of "across", frequently "between" is used. In order to avoid overburdened expressions, shorthand notations will be used with the following meanings: single-difference corresponds to the across-receiver difference (or between-receiver difference), double-difference corresponds to the across-receiver and across-satellite difference, and triple-difference corresponds to the across-receiver, across-satellite, and across-time difference. Most postprocessing software uses these three difference techniques, so their basic mathematical modeling is shown in the following sections.

3.3.2 Phase Differences

Single-Differences
Two receivers and one satellite are involved, see Fig. 15. Denoting the receiver sites by A and B and the satellite by j and using Eq. (43), the phase equations for the two points are

Φ_A^j(t) + f^j δ^j(t) = (1/λ^j) ρ_A^j(t) + N_A^j + f^j δ_A(t) ,
Φ_B^j(t) + f^j δ^j(t) = (1/λ^j) ρ_B^j(t) + N_B^j + f^j δ_B(t)    (80)

Fig. 15 Single-difference: two stations and one satellite are involved


and the difference of the two equations is

Φ_B^j(t) − Φ_A^j(t) = (1/λ^j) [ρ_B^j(t) − ρ_A^j(t)] + N_B^j − N_A^j + f^j [δ_B(t) − δ_A(t)] .    (81)

Equation (81) is referred to as the single-difference equation. This equation stresses one aspect of the solution for the unknowns on the right side: a system of such equations would lead to a rank deficiency even in the case of an arbitrarily large redundancy. This means that the design matrix of the adjustment has linearly dependent columns and a rank deficiency exists. Therefore, the relative quantities

N_AB^j = N_B^j − N_A^j ,
δ_AB(t) = δ_B(t) − δ_A(t)    (82)

are introduced. Using additionally the shorthand notations

\Phi_{AB}^j(t) = \Phi_B^j(t) - \Phi_A^j(t) ,\qquad \rho_{AB}^j(t) = \rho_B^j(t) - \rho_A^j(t)            (83)

and substituting (82) and (83) into (81) gives

\Phi_{AB}^j(t) = \frac{1}{\lambda^j} \rho_{AB}^j(t) + N_{AB}^j + f^j \delta_{AB}(t) ,            (84)

which is the final form of the single-difference equation. Note that the satellite clock bias has canceled, compared to the phase equation (43).

Double-Differences
Assuming the two points A, B, and the two satellites j, k (see Fig. 16), two single-differences according to Eq. (84) may be formed:

\Phi_{AB}^j(t) = \frac{1}{\lambda^j} \rho_{AB}^j(t) + N_{AB}^j + f^j \delta_{AB}(t) ,
\Phi_{AB}^k(t) = \frac{1}{\lambda^k} \rho_{AB}^k(t) + N_{AB}^k + f^k \delta_{AB}(t) .            (85)

To obtain a double-difference, these single-differences are subtracted. Two cases must be considered.

Case 1
Assuming equal frequencies f = f^j = f^k for the satellite signals, the result is

\Phi_{AB}^k(t) - \Phi_{AB}^j(t) = \frac{1}{\lambda} \left[ \rho_{AB}^k(t) - \rho_{AB}^j(t) \right] + N_{AB}^k - N_{AB}^j .            (86)

21 Elementary Mathematical Models for GNSS Positioning

1355

Fig. 16 Double-difference: two stations and two satellites are involved

Using shorthand notations for the satellites j and k analogously to (83), the final form of the double-difference equation is

\Phi_{AB}^{jk}(t) = \frac{1}{\lambda} \rho_{AB}^{jk}(t) + N_{AB}^{jk} ,            (87)

where \lambda = \lambda^j = \lambda^k. The elimination of the receiver clock biases is the main reason why double-differences are preferably used. This cancellation resulted from the assumptions of simultaneous observations and equal frequencies of the satellite signals. Symbolically, the convention

*_{AB}^{jk} = *_{AB}^k - *_{AB}^j            (88)

has been introduced, where the asterisk may be replaced by \Phi, \rho, or N. Note that these terms comprising two subscripts and two superscripts are actually composed of four terms. The symbolic notation

*_{AB}^{jk} = *_B^k - *_B^j - *_A^k + *_A^j            (89)

characterizes, in detail, the terms in the double-difference equation:

\Phi_{AB}^{jk}(t) = \Phi_B^k(t) - \Phi_B^j(t) - \Phi_A^k(t) + \Phi_A^j(t) ,
\rho_{AB}^{jk}(t) = \rho_B^k(t) - \rho_B^j(t) - \rho_A^k(t) + \rho_A^j(t) ,            (90)
N_{AB}^{jk} = N_B^k - N_B^j - N_A^k + N_A^j .
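The four-term composition (89), (90) is easy to check numerically. The following sketch (phase values purely illustrative, not real observations) forms a double-difference first across receivers and then across satellites:

```python
# Double-difference built from four raw phases, cf. Eqs. (88)-(90):
# first across receivers (A, B), then across satellites (j, k).

def single_difference(phi_B, phi_A):
    """Across-receiver difference for one satellite, cf. Eq. (83)."""
    return phi_B - phi_A

def double_difference(phi_A_j, phi_B_j, phi_A_k, phi_B_k):
    """Across-receiver and across-satellite difference, cf. Eq. (89)."""
    return (single_difference(phi_B_k, phi_A_k)
            - single_difference(phi_B_j, phi_A_j))

# Illustrative (hypothetical) carrier phases in cycles:
phi_A_j, phi_B_j = 123456.25, 123467.75
phi_A_k, phi_B_k = 223456.50, 223460.00

dd = double_difference(phi_A_j, phi_B_j, phi_A_k, phi_B_k)
# Identical to the explicit four-term form of Eq. (89):
assert dd == phi_B_k - phi_B_j - phi_A_k + phi_A_j
```

The same convention (88) applies with \Phi replaced by \rho or N.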

Case 2
Now different frequencies f^j \neq f^k are considered. Referring to Eq. (80), the model equations for carrier phases measured at the two sites A and B to the satellite j are now given by


\Phi_A^j(t) + f^j \delta^j(t) = \frac{1}{\lambda^j} \rho_A^j(t) + N_A^j + f^j \delta_A(t) ,
\Phi_B^j(t) + f^j \delta^j(t) = \frac{1}{\lambda^j} \rho_B^j(t) + N_B^j + f^j \delta_B(t) .            (91)

The measured phases are scaled to ranges by

\tilde{\Phi}^j(t) = \lambda^j \Phi^j(t) ,            (92)

where \tilde{\Phi}^j(t) is output by the receiver. The single-difference of the two equations (91) leads to

\tilde{\Phi}_B^j(t) - \tilde{\Phi}_A^j(t) = \rho_B^j(t) - \rho_A^j(t) + \lambda^j \left[ N_B^j - N_A^j \right] + c \left[ \delta_B(t) - \delta_A(t) \right]            (93)

with c = \lambda^j f^j being the speed of light. Introducing the shorthand notations of (82) and (83), i.e., symbolically *_{AB}^j = *_B^j - *_A^j, a more compact form is achieved by

\tilde{\Phi}_{AB}^j(t) = \rho_{AB}^j(t) + \lambda^j N_{AB}^j + c\, \delta_{AB}(t) .            (94)

Assuming two satellites j, k gives rise to two single-differences (94). From these the double-difference

\tilde{\Phi}_{AB}^k(t) - \tilde{\Phi}_{AB}^j(t) = \rho_{AB}^k(t) - \rho_{AB}^j(t) + \lambda^k N_{AB}^k - \lambda^j N_{AB}^j            (95)

jk

j

is obtained. Introducing again shorthand notations, symbolically ∗AB = ∗kAB −∗AB , there results jk jk j k − λj NAB , Φ˜ AB (t) = ρAB (t) + λk NAB

(96) j

j

which may be rearranged by “adding zero” in the form of −λk NAB + λk NAB so that jk jk jk j Φ˜ AB (t) = ρAB (t) + λk NAB + NAB (λk − λj )

(97)

is finally obtained. This equation differs from the double-difference equation (87) by the "single-difference bias" b_{SD} = N_{AB}^j (\lambda^k - \lambda^j). The unknown single-difference ambiguity N_{AB}^j can be estimated from single-point positioning with an accuracy of about 10 m, corresponding to 50 cycles (for a typical wavelength of some 20 cm). If the wavelength difference for two carriers corresponds to 0.000,351 cycles (which is typical for GLONASS carriers separated by one carrier number), then the result b_{SD} = 0.02 cycles is obtained. This shows that for small frequency differences b_{SD} acts as a nuisance parameter. For larger differences, iterative processing has been proposed. In the first step, only satellites with small wavelength differences are considered. Hence, the double-differenced ambiguities of these satellites can be resolved and an improved position is obtained, leading to a more accurate estimation for N_{AB}^j. The procedure is then continued and stepwise extended to all satellites until all ambiguities have been resolved. More details on this subject can be found in Habrich et al. [41] and Han et al. [44].

Fig. 17 Triple-difference: two stations, two satellites, and two epochs are involved

Triple-Differences
So far only one epoch t has been considered. To eliminate the time-independent ambiguities, Remondi [102] has suggested differencing double-differences between two epochs. Subsequently, only the case f^j = f^k is considered explicitly. Denoting the two epochs in (87) by t_1 and t_2, then

\Phi_{AB}^{jk}(t_1) = \frac{1}{\lambda} \rho_{AB}^{jk}(t_1) + N_{AB}^{jk} ,
\Phi_{AB}^{jk}(t_2) = \frac{1}{\lambda} \rho_{AB}^{jk}(t_2) + N_{AB}^{jk}            (98)

are the two double-differences, and

\Phi_{AB}^{jk}(t_2) - \Phi_{AB}^{jk}(t_1) = \frac{1}{\lambda} \left[ \rho_{AB}^{jk}(t_2) - \rho_{AB}^{jk}(t_1) \right]            (99)

is the triple-difference (see Fig. 17), which may be written in the simplified form

\Phi_{AB}^{jk}(t_{12}) = \frac{1}{\lambda} \rho_{AB}^{jk}(t_{12})            (100)

if the symbolic formula

*(t_{12}) = *(t_2) - *(t_1)            (101)

is applied to the terms \Phi and \rho. It should be noted that both \Phi_{AB}^{jk}(t_{12}) and \rho_{AB}^{jk}(t_{12}) are actually composed of eight terms each. Resubstituting (99) and either (89) or (90) yields

j

j

j

j

ΦAB (t12 ) = + ΦBk (t2 ) − ΦB (t2 ) − ΦAk (t2 ) + ΦA (t2 ) − ΦBk (t1 ) + ΦB (t1 ) + ΦAk (t1 ) − ΦA (t1 )

(102)

and jk

j

j

j

j

ρAB (t12 ) = + ρBk (t2 ) − ρB (t2 ) − ρAk (t2 ) + ρA (t2 ) − ρBk (t1 ) + ρB (t1 ) + ρAk (t1 ) − ρA (t1 ) .

(103)

It may be proved by the reader that for the case f^j \neq f^k the equation

\tilde{\Phi}_{AB}^{jk}(t_{12}) = \rho_{AB}^{jk}(t_{12})            (104)

is obtained instead of (100). The advantage of triple-differences is the canceling effect for the ambiguities, which eliminates the need to determine them.
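The cancellation can be illustrated with two double-differences sharing the same ambiguity; a minimal sketch with hypothetical numbers (wavelength and ranges are illustrative only):

```python
# Triple-difference of Eq. (99): the time-independent ambiguity cancels.

lam = 0.19   # carrier wavelength in meters (roughly L1; illustrative)
N_dd = 42.0  # double-difference ambiguity in cycles (arbitrary)

def dd_phase(rho_dd, N_dd, lam):
    """Double-difference model of Eq. (87), in cycles."""
    return rho_dd / lam + N_dd

rho_t1, rho_t2 = 1234.567, 1235.789  # hypothetical DD ranges in meters

td = dd_phase(rho_t2, N_dd, lam) - dd_phase(rho_t1, N_dd, lam)
# Eq. (100): only the range difference remains, N_dd has dropped out.
assert abs(td - (rho_t2 - rho_t1) / lam) < 1e-9
```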

3.3.3 Correlations of the Phase Combinations

In general, there are two groups of correlations: (1) the physical and (2) the mathematical correlations. The phases from one satellite received at two points, for example, \Phi_A^j(t) and \Phi_B^j(t), are physically correlated since they refer to the same satellite. Usually, the physical correlation is not taken into account. The main interest is directed to the mathematical correlations introduced by differencing.

The assumption may be made that the phase errors show a random behavior resulting in a normal distribution with expectation value zero and variance \sigma^2, where the variance is estimated by the UERE. Measured (or raw) phases are, therefore, linearly independent or uncorrelated. Introducing a vector \Phi containing the phases and assuming equal accuracy, then

\Sigma_\Phi = \sigma^2 I            (105)

is the covariance matrix for the phases, where I is the unit matrix.

Correlation of Single-Differences
Considering the two points A, B and the satellite j at epoch t gives

\Phi_{AB}^j(t) = \Phi_B^j(t) - \Phi_A^j(t)            (106)

as the corresponding single-difference. Forming a second single-difference for the same two points but with another satellite k at the same epoch yields


\Phi_{AB}^k(t) = \Phi_B^k(t) - \Phi_A^k(t) .            (107)

The two single-differences may be computed from the matrix-vector relation

S = C\, \Phi ,            (108)

where

S = \begin{bmatrix} \Phi_{AB}^j(t) \\ \Phi_{AB}^k(t) \end{bmatrix} ,\qquad
C = \begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & 0 & -1 & 1 \end{bmatrix} ,\qquad
\Phi = \begin{bmatrix} \Phi_A^j(t) \\ \Phi_B^j(t) \\ \Phi_A^k(t) \\ \Phi_B^k(t) \end{bmatrix} .            (109)

The covariance law applied to Eq. (108) gives

\Sigma_S = C\, \Sigma_\Phi\, C^T            (110)

and, by substituting Eq. (105),

\Sigma_S = C\, \sigma^2 I\, C^T = \sigma^2\, C C^T            (111)

is obtained. Taking C from (109), the matrix product

C C^T = 2 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = 2 I            (112)

substituted into (111) leads to the covariance of the single-differences

\Sigma_S = 2 \sigma^2 I .            (113)

This shows that single-differences are uncorrelated. Note that the dimension of the unit matrix in (113) corresponds to the number of single-differences at epoch t, whereas the factor 2 does not depend on the number of single-differences. Considering more than one epoch, the covariance matrix is again a unit matrix with the dimension equivalent to the total number of single-differences.

Correlation of Double-Differences
Now, three satellites j, k, \ell with j as reference satellite are considered. For the two points A, B and epoch t, the double-differences

\Phi_{AB}^{jk}(t) = \Phi_{AB}^k(t) - \Phi_{AB}^j(t) ,
\Phi_{AB}^{j\ell}(t) = \Phi_{AB}^\ell(t) - \Phi_{AB}^j(t)            (114)

can be derived from the single-differences. These two equations can be written in the matrix-vector form

D = C S ,            (115)

where

D = \begin{bmatrix} \Phi_{AB}^{jk}(t) \\ \Phi_{AB}^{j\ell}(t) \end{bmatrix} ,\qquad
C = \begin{bmatrix} -1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix} ,\qquad
S = \begin{bmatrix} \Phi_{AB}^j(t) \\ \Phi_{AB}^k(t) \\ \Phi_{AB}^\ell(t) \end{bmatrix}            (116)

have been introduced. The covariance matrix for the double-differences is given by

\Sigma_D = C\, \Sigma_S\, C^T            (117)

and substituting (113) leads to

\Sigma_D = 2 \sigma^2\, C C^T            (118)

or, explicitly, using C from (116),

\Sigma_D = 2 \sigma^2 \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} .            (119)

This shows that double-differences are correlated. The weight or correlation matrix P(t) is obtained from the inverse of the covariance matrix

P(t) = \Sigma_D^{-1} = \frac{1}{2 \sigma^2}\, \frac{1}{3} \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} ,            (120)

where two double-differences at one epoch were used. Generally, with n_D being the number of double-differences at epoch t, the correlation matrix is given by

21 Elementary Mathematical Models for GNSS Positioning

1361

P(t) = \frac{1}{2 \sigma^2}\, \frac{1}{n_D + 1} \begin{bmatrix} n_D & -1 & \cdots & -1 \\ -1 & n_D & \cdots & -1 \\ \vdots & & \ddots & \vdots \\ -1 & -1 & \cdots & n_D \end{bmatrix} ,            (121)

where the dimension of the matrix is n_D \times n_D. For a better illustration, assume four double-differences. In this case, the 4 \times 4 matrix

P(t) = \frac{1}{2 \sigma^2}\, \frac{1}{5} \begin{bmatrix} 4 & -1 & -1 & -1 \\ -1 & 4 & -1 & -1 \\ -1 & -1 & 4 & -1 \\ -1 & -1 & -1 & 4 \end{bmatrix}            (122)

is the correlation matrix. So far only one epoch has been considered. For epochs t_1, t_2, t_3, \ldots, the correlation matrix becomes a block-diagonal matrix

P(t) = \begin{bmatrix} P(t_1) & & & \\ & P(t_2) & & \\ & & P(t_3) & \\ & & & \ddots \end{bmatrix} ,            (123)

where each "element" of the matrix is itself a matrix. The matrices P(t_1), P(t_2), P(t_3), \ldots do not necessarily have to be of the same dimension because there may be different numbers of double-differences at different epochs.

Correlation of Triple-Differences
The triple-difference equations are slightly more complicated because several different cases must be considered. The covariance of a single triple-difference is computed by applying the covariance propagation law to the relation, cf. Eqs. (102) and (106),

\Phi_{AB}^{jk}(t_{12}) = \Phi_{AB}^k(t_2) - \Phi_{AB}^j(t_2) - \Phi_{AB}^k(t_1) + \Phi_{AB}^j(t_1) .            (124)

Now, two triple-differences with the same epochs and sharing one satellite are considered. The first triple-difference using the satellites j, k is given by Eq. (124). The second triple-difference corresponds to the satellites j, \ell:

\Phi_{AB}^{jk}(t_{12}) = \Phi_{AB}^k(t_2) - \Phi_{AB}^j(t_2) - \Phi_{AB}^k(t_1) + \Phi_{AB}^j(t_1) ,
\Phi_{AB}^{j\ell}(t_{12}) = \Phi_{AB}^\ell(t_2) - \Phi_{AB}^j(t_2) - \Phi_{AB}^\ell(t_1) + \Phi_{AB}^j(t_1) .            (125)


By introducing

T = \begin{bmatrix} \Phi_{AB}^{jk}(t_{12}) \\ \Phi_{AB}^{j\ell}(t_{12}) \end{bmatrix} ,\qquad
C = \begin{bmatrix} 1 & -1 & 0 & -1 & 1 & 0 \\ 1 & 0 & -1 & -1 & 0 & 1 \end{bmatrix} ,\qquad
S = \begin{bmatrix} \Phi_{AB}^j(t_1) \\ \Phi_{AB}^k(t_1) \\ \Phi_{AB}^\ell(t_1) \\ \Phi_{AB}^j(t_2) \\ \Phi_{AB}^k(t_2) \\ \Phi_{AB}^\ell(t_2) \end{bmatrix} ,            (126)

the matrix-vector relation

T = C S            (127)

can be formed, and the covariance for the triple-difference follows from

\Sigma_T = C\, \Sigma_S\, C^T            (128)

or, by substituting (113),

\Sigma_T = 2 \sigma^2\, C C^T ,            (129)

which, using C from (126), yields

\Sigma_T = 2 \sigma^2 \begin{bmatrix} 4 & 2 \\ 2 & 4 \end{bmatrix}            (130)

for the two triple-differences (125). The tedious derivation may be abbreviated by setting up Table 4. It can be seen that the triple-difference \Phi_{AB}^{jk}(t_{12}), for example, is composed of the two single-differences (with the signs according to Table 4) for the satellites j and k at epoch t_1 and of the two single-differences for the same satellites but epoch t_2. Accordingly, the same applies for the other triple-difference \Phi_{AB}^{j\ell}(t_{12}). Thus, the coefficients of Table 4 are the same as those of matrix C in Eq. (126). Finally, the computation of the product C C^T, appearing in Eq. (129), is also aided by referring to Table 4. All

Table 4 Symbolic composition of triple-differences

                                Epoch t_1            Epoch t_2
Satellite                       j    k    \ell       j    k    \ell
\Phi_{AB}^{jk}(t_{12})          1   -1    0         -1    1    0
\Phi_{AB}^{j\ell}(t_{12})       1    0   -1         -1    0    1


Table 5 Triple-difference correlations

                                Epoch t_1            Epoch t_2            Epoch t_3
Satellite                       j    k    \ell       j    k    \ell       j    k    \ell     C C^T
\Phi_{AB}^{jk}(t_{12})          1   -1    0         -1    1    0          0    0    0         4   -2
\Phi_{AB}^{jk}(t_{23})          0    0    0          1   -1    0         -1    1    0        -2    4
\Phi_{AB}^{jk}(t_{12})          1   -1    0         -1    1    0          0    0    0         4   -1
\Phi_{AB}^{j\ell}(t_{23})       0    0    0          1    0   -1         -1    0    1        -1    4

combinations of inner products of the two rows (one row represents one triple-difference) must be taken. The inner product (row 1 · row 1) yields the first-row, first-column element of C C^T, the inner product (row 1 · row 2) yields the first-row, second-column element of C C^T, etc. Based on the general formula (124) and Table 4, arbitrary cases may be derived systematically. Table 5 shows the second group of triple-difference correlations if adjacent epochs t_1, t_2, t_3 are taken. Two cases are considered. It can be seen from Table 5 that an exchange of the satellites for one triple-difference causes a change of the sign in the off-diagonal elements of the matrix C C^T. Therefore, the correlation of \Phi_{AB}^{kj}(t_{12}) and \Phi_{AB}^{j\ell}(t_{23}) produces +1 as off-diagonal element. Based on a table such as Table 5, each case may be handled with ease. According to Remondi [102, p. 142], computer program adaptations require only a few simple rules. These are the basic mathematical correlations for single-, double-, and triple-differences. More sophisticated models are investigated in Euler and Goad [23], Gerdan [36], and Jin and Jong [57] by taking into account the elevation dependence of the observation variances. Gianniou [37] introduces variable weights by forming differences, applying polynomial fitting, and by using the signal-to-noise ratio for code ranges as well as for phases. Jonkman [59] and Tiberius [129] consider time correlation and cross-correlation of the code ranges and the phases.
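The matrix products behind Eqs. (112), (119), (120), (121) and Table 5 are easy to reproduce; a numpy sketch under the same equal-variance assumption (the value of \sigma^2 is illustrative):

```python
import numpy as np

sigma2 = 1.0  # common phase variance, cf. Eq. (105) (value illustrative)

# Single-differences: C from Eq. (109) maps 4 raw phases to 2 SDs.
C_sd = np.array([[-1., 1., 0., 0.],
                 [ 0., 0., -1., 1.]])
assert np.allclose(sigma2 * C_sd @ C_sd.T, 2 * sigma2 * np.eye(2))  # Eq. (113)

# Double-differences: C from Eq. (116) maps 3 SDs to 2 DDs.
C_dd = np.array([[-1., 1., 0.],
                 [-1., 0., 1.]])
cov_dd = 2 * sigma2 * C_dd @ C_dd.T                                  # Eq. (118)
assert np.allclose(cov_dd, 2 * sigma2 * np.array([[2., 1.], [1., 2.]]))  # (119)

# Weight matrix of Eq. (121) for n_D double-differences at one epoch:
def weight_matrix(n_D, sigma2=1.0):
    M = np.full((n_D, n_D), -1.0) + (n_D + 1) * np.eye(n_D)
    return M / (2 * sigma2 * (n_D + 1))

assert np.allclose(weight_matrix(2), np.linalg.inv(cov_dd))          # Eq. (120)

# Triple-differences: coefficient rows copied from Table 5,
# ordered [j, k, l at t1 | j, k, l at t2 | j, k, l at t3].
row_jk_12 = np.array([1., -1., 0., -1., 1., 0., 0., 0., 0.])
row_jk_23 = np.array([0., 0., 0., 1., -1., 0., -1., 1., 0.])
row_jl_23 = np.array([0., 0., 0., 1., 0., -1., -1., 0., 1.])

C1 = np.vstack([row_jk_12, row_jk_23])  # same pair j,k; epochs (t1,t2),(t2,t3)
assert np.array_equal(C1 @ C1.T, np.array([[4., -2.], [-2., 4.]]))
C2 = np.vstack([row_jk_12, row_jl_23])  # pairs j,k and j,l
assert np.array_equal(C2 @ C2.T, np.array([[4., -1.], [-1., 4.]]))
```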

3.3.4 Static Relative Positioning

In a static survey of a single baseline vector between points A and B, the two receivers must stay stationary during the entire observation session. In the following, single-, double-, and triple-differencing are investigated with respect to the number of observation equations and unknowns. It is assumed that the two sites A and B are able to observe the same satellites at the same epochs. The practical problem of satellite blockage is not considered here. The number of epochs is again denoted by n_t, and n_s denotes the number of satellites. The undifferenced phase as shown in Eq. (43) (where the satellite clock is assumed to be known) is not included here, because there would be no connection (no common unknown) between point A and point B. The two data sets could be solved separately, which would be equivalent to point positioning.
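The epoch requirements worked out below for all three difference types reduce to the single bound n_t \geq (n_s + 2)/(n_s - 1), cf. Eqs. (132), (134), (136); a small helper (a sketch, not part of the original text) evaluates it:

```python
import math

def min_epochs(n_s):
    """Minimum number of epochs n_t >= (n_s + 2)/(n_s - 1),
    cf. Eqs. (132), (134), (136)."""
    if n_s < 2:
        raise ValueError("a single satellite does not provide a solution")
    return math.ceil((n_s + 2) / (n_s - 1))

assert min_epochs(2) == 4  # two satellites: at least four epochs
assert min_epochs(4) == 2  # normal case of four satellites: two epochs
```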


A single-difference may be expressed for each satellite and for each epoch. The number of measurements is, therefore, n_s n_t. The unknowns are the three coordinates of the unknown point, the n_s ambiguities N_{AB}^j, and the n_t - 1 clock biases \delta_{AB}(t); thus, referring to the single-difference equation (84),

\Phi_{AB}^j(t) = \frac{1}{\lambda^j} \rho_{AB}^j(t) + N_{AB}^j + f^j \delta_{AB}(t) ,

the requirement on the number of observations is

n_s n_t \geq 3 + n_s + (n_t - 1) .            (131)

The n_t - 1 unknown clock biases indicate a rank deficiency of 1. The explanation is the same as for static point positioning, cf. Eq. (44). From above, the relation

n_t \geq \frac{n_s + 2}{n_s - 1}            (132)

may be derived. Although this equation is equivalent to Eq. (45), it is useful to repeat the (theoretically) minimum requirements for a solution. A single satellite does not provide a solution because the denominator of (132) becomes zero. With two satellites, there results n_t \geq 4, and for the normal case of four satellites, n_t \geq 2 is obtained.

For double-differences, the relationship of measurements and unknowns is obtained using the same logic. Note that for one double-difference two satellites are necessary. For n_s satellites, therefore, n_s - 1 double-differences are obtained at each epoch, so that the total number of double-differences is (n_s - 1) n_t. The unknowns are the three coordinates and the n_s - 1 double-difference ambiguities; thus, referring to the double-difference equation (87),

\Phi_{AB}^{jk}(t) = \frac{1}{\lambda} \rho_{AB}^{jk}(t) + N_{AB}^{jk} ,

the requirement is

(n_s - 1)\, n_t \geq 3 + (n_s - 1) .            (133)

From above, the relation

n_t \geq \frac{n_s + 2}{n_s - 1}            (134)

is obtained, which is identical with Eq. (132); therefore, the basic configurations are again given by the pairs n_s = 2, n_t \geq 4 and n_s = 4, n_t \geq 2. To avoid linearly dependent equations when forming double-differences, a reference satellite is used, against which the measurements of the other satellites are differenced. For example, take the case where measurements are made to the satellites 6, 9, 11, and 12, and 6 is used as reference satellite. Then, at each epoch the following double-differences can be formed: (9-6), (11-6), and (12-6). Other double-differences are linear combinations and, thus, linearly dependent. For instance, the double-difference (11-9) can be formed by subtracting (9-6) from (11-6).
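The reference-satellite rule of the example can be sketched as follows (satellite IDs taken from the text; the helper function and the single-difference values are hypothetical):

```python
def independent_double_differences(satellites, reference):
    """Pair each satellite with the reference satellite, cf. the example
    with satellites 6, 9, 11, 12 and reference satellite 6."""
    return [(s, reference) for s in satellites if s != reference]

pairs = independent_double_differences([6, 9, 11, 12], reference=6)
assert pairs == [(9, 6), (11, 6), (12, 6)]

# Any further double-difference is linearly dependent, e.g.
# (11-9) = (11-6) - (9-6); numerically, with hypothetical single-differences:
sd = {6: 1.25, 9: -0.50, 11: 2.75, 12: 0.10}
dd = {pair: sd[pair[0]] - sd[pair[1]] for pair in pairs}
assert sd[11] - sd[9] == dd[(11, 6)] - dd[(9, 6)]
```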


Note that relation (134) also applies if the frequencies of the satellite signals are not equal. Referring to (96), the number of single-difference ambiguities corresponds to that of (131) and amounts to n_s, which may be combined to n_s - 1 double-difference ambiguities if the single-difference ambiguities of one satellite are taken as reference.

The triple-difference model includes only the three unknown point coordinates. For a single triple-difference, two epochs are necessary. Consequently, in the case of n_t epochs, n_t - 1 linearly independent epoch combinations are possible. Thus, referring to the triple-difference equation (100),

\Phi_{AB}^{jk}(t_{12}) = \frac{1}{\lambda} \rho_{AB}^{jk}(t_{12}) ,

the requirement is

(n_s - 1)(n_t - 1) \geq 3 .            (135)

From above, the relation

n_t \geq \frac{n_s + 2}{n_s - 1}            (136)

is obtained. This equation is identical with Eq. (132) and, hence, the basic configurations are again given by the pairs n_s = 2, n_t \geq 4 and n_s = 4, n_t \geq 2. This completes the discussion on static relative positioning. As shown, each of the mathematical models – single-difference, double-difference, triple-difference – may be used. The relationships between the number of observation equations and the number of unknowns will be referred to again in the discussion of the kinematic case.

Measurement Examples
Baselines of several lengths were investigated by comparing the achieved results with the "truth". Considering the baseline vector b_{AB} between points A and B, reference coordinates for both stations were available, which are considered the "true" coordinates. Thus, the baseline between A and B was once calculated simply by forming the difference of the respective three-dimensional coordinate vectors; in the case of the measurements, the same coordinates for the reference station A were used, and the coordinate differences \Delta X_{AB}, \Delta Y_{AB}, \Delta Z_{AB} (see Eq. 79) were calculated using RTKLIB and the measurements. Taking into account the respective antenna height, this calculated baseline vector was compared with the "true" baseline vector. The length of the difference vector true minus calculated is shown in the subsequent figures.

The first results shown in Fig. 18 refer to a baseline with the "true" length of 3,097.463 m. The update rate was 0.1 Hz and the elevation mask 15 degrees. The results displayed represent the length of the difference vector true minus calculated in dependence on the measurement period. Note that the red curve shows the results for L1 only, whereas the blue curve uses L1 and L2, thus the double amount of data. The use of L1 and L2 must not be confused with the ionosphere-free combination, cf. Eq. (57)! Nevertheless, RTKLIB takes into account standardized models for the ionosphere and the troposphere.
The interpretation of this figure is pretty simple:


Fig. 18 Comparison of the results for a 3 km baseline in dependence of the measurement period (red curve: L1, blue curve: L1 and L2)

it is sufficient to take some 30 minutes of data for this 3 km baseline. It does not make a significant difference if longer observation periods are used. There is a slight improvement if L1 and L2 are used compared to L1 only. Two aspects should be considered in addition: first, if the difference vectors true minus calculated are transformed into the local level system of station B and split into north, east, and up, it becomes evident that by far the largest contribution to the lengths of these difference vectors comes from the up component and amounts to about 15–20 cm; second, it should be mentioned that the "true" baseline is not an error-free solution either.

In an analogous way, the analysis has been carried out for an 11 km baseline (Fig. 19), a 25 km baseline (Fig. 20), a 48 km baseline (Fig. 21), and a 101 km baseline (Fig. 22). Referring to the characteristics of the 3 km baseline of Fig. 18, the behaviour is essentially the same for the other baselines. As to be expected, a slightly longer observation period is needed with increasing baseline length (yet, the 48 km baseline is a counterexample); however, aiming at an accuracy of the aforementioned 10–15 cm, this can be achieved for all investigated baselines in less than 1 h observation time. In other terms – and this is maybe the more valuable argument – it does not make sense to have a longer observation period because for two or three hours of data the results do not converge to a better solution.

Despite this result, another set of observations with 3 h periods has been measured on different days but with the same baselines. The results displayed in Table 6 represent the length of the difference vector true minus calculated in


Fig. 19 Comparison of the results for an 11 km baseline in dependence of the measurement period (red curve: L1, blue curve: L1 and L2)

Fig. 20 Comparison of the results for a 25 km baseline in dependence of the measurement period (red curve: L1, blue curve: L1 and L2)


Fig. 21 Comparison of the results for a 48 km baseline in dependence of the measurement period (red curve: L1, blue curve: L1 and L2)

Fig. 22 Comparison of the results for a 101 km baseline in dependence of the measurement period (red curve: L1, blue curve: L1 and L2)


Table 6 Baseline differences "true" minus calculated in centimeters, represented by the local level system components n, e, u (north, east, and up) for various observation periods

Obs. period   Comp.    3 km    11 km    25 km    48 km    101 km
0.5 h         n         3.9     0.6      0.0      4.0      3.1
              e         1.1    15.3      1.2      6.0      8.0
              u         9.0    10.8     10.5      9.4      4.3
1.0 h         n         0.2     0.8      1.3      2.3      3.5
              e         1.7     8.8      4.7      0.5      8.7
              u        10.3    12.2      8.3     10.9      2.6
1.5 h         n         1.2     0.6      1.4      1.2      1.9
              e         2.9     5.4      1.5      1.8      6.9
              u         8.2    11.0      6.9     12.4      3.0
2.0 h         n         1.0     1.4      1.1      0.7      1.9
              e         1.8     1.1      1.1      1.4      3.9
              u         9.4    10.9      5.9     14.9      3.7
2.5 h         n         0.4     0.9      1.8      0.2      0.8
              e         0.9     0.9      0.5      1.2      2.2
              u         9.8    10.9      6.9     14.1      5.2
3.0 h         n         0.1     0.6      2.1      0.4      0.1
              e         0.2     0.1      1.0      1.3      0.3
              u        10.7    10.8      7.3     13.6      7.4

dependence on the measurement period, but transformed to a local level system and split into the north, east, and up components. The numerical results shown in this section so far indicate the accuracy with respect to known results, which were regarded as true and correct even though they contain some errors too. Apart from these comparisons it is also of interest how precise the GNSS results are if they are compared with each other only; in other terms: which precision GNSS can reach. The answer to this question is taken from Kadletz [64] and graphically given in Fig. 23. With an observation period of about 30 minutes, GPS reaches a precision of 10 mm, whereas GLONASS achieves the same precision after about one hour. The combined GPS and GLONASS solution does not significantly improve the precision. The precision results are slightly improved for shorter baselines: for a 3 km baseline, the 10 mm precision is achieved with GLONASS in about 29 minutes and with GPS in about 24 minutes; when combining GPS and GLONASS, 20 minutes are sufficient to achieve the 10 mm precision.
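The splitting of a difference vector into north, east, and up used throughout these comparisons amounts to rotating the ECEF vector into the local level system at geodetic latitude and longitude; a sketch of the standard rotation (not the RTKLIB implementation, coordinates illustrative):

```python
import math

def ecef_to_local_level(dx, dy, dz, lat, lon):
    """Rotate an ECEF difference vector into north, east, up components
    at geodetic latitude `lat` and longitude `lon` (radians)."""
    sf, cf = math.sin(lat), math.cos(lat)
    sl, cl = math.sin(lon), math.cos(lon)
    n = -sf * cl * dx - sf * sl * dy + cf * dz
    e = -sl * dx + cl * dy
    u = cf * cl * dx + cf * sl * dy + sf * dz
    return n, e, u

# Sanity check: at lat = lon = 0 the ECEF x axis points up,
# y points east, and z points north.
assert ecef_to_local_level(1.0, 2.0, 3.0, 0.0, 0.0) == (3.0, 2.0, 1.0)
```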

3.3.5 Kinematic Relative Positioning

In kinematic relative positioning, the receiver on the known point A of the baseline vector remains fixed. The second receiver moves, and its position is to be determined for arbitrary epochs. The models for single-, double-, and triple-difference implicitly


Fig. 23 GNSS precision for a 25 km baseline in dependence of the measurement period (blue curve: GLONASS, green curve: GPS, red curve: GLONASS and GPS)

contain the motion in the geometric distance. Considering point B and satellite j, the geometric distance in the static case is given by, cf. Eq. (34),

\rho_B^j(t) = \sqrt{(X^j(t) - X_B)^2 + (Y^j(t) - Y_B)^2 + (Z^j(t) - Z_B)^2}            (137)

and in the kinematic case by

\rho_B^j(t) = \sqrt{(X^j(t) - X_B(t))^2 + (Y^j(t) - Y_B(t))^2 + (Z^j(t) - Z_B(t))^2} ,            (138)

where the time dependence for point B appears. In this mathematical model, three coordinates are unknown at each epoch. Thus, the total number of unknown site coordinates is 3 n_t for n_t epochs. The relations between the number of observations and the number of unknowns for the kinematic case follow from the static single- and double-difference models, cf. Eqs. (131), (133):

single-difference:  n_s n_t \geq 3 n_t + n_s + (n_t - 1) ,
double-difference:  (n_s - 1)\, n_t \geq 3 n_t + (n_s - 1) .            (139)

For example, the relation

n_t \geq \frac{n_s - 1}{n_s - 4}            (140)

is the basic configuration for single-differences, which is equivalent to Eq. (47). The continuous motion of the roving receiver restricts the available data for the determination of its position to one epoch. But none of the above two models


provides a useful solution for n_t = 1. Thus, these models are modified: the number of unknowns is reduced by omitting the ambiguity unknowns, i.e., the ambiguities are assumed to be known. For the single-difference case, this has a twofold effect: first, the n_s ambiguities may be omitted and, second, the rank deficiency vanishes because of the known ambiguities, so that n_t unknown clock biases have to be determined. The modified observation requirement for the single-difference is therefore n_s n_t \geq 4 n_t, which reduces to n_s \geq 4 for a single epoch. Similarly, for the double-difference, n_s - 1 ambiguities are omitted in (139), so that (n_s - 1) n_t \geq 3 n_t results, which reduces to n_s \geq 4 for a single epoch. Hence, the single-difference and the double-difference models end up again with the fundamental requirement of four simultaneously observable satellites.

The use of triple-differences for kinematic cases is strongly restricted. In principle, the definition of triple-differences with two satellites at two epochs and two stations at – with respect to the two epochs – fixed positions excludes any application, since the rover position changes epoch by epoch. However, triple-differences could be used if, e.g., the coordinates of the roving receiver were known at the reference epoch. In this case, adapting (135) to the kinematic case with 3 n_t unknowns and reducing the number of unknown rover positions by 3 because of the known rover position at the reference epoch, the relationship obtained would be (n_s - 1)(n_t - 1) \geq 3 (n_t - 1). This leads to n_s \geq 4, which is the same requirement as for the ambiguity-reduced single- and double-differences.

Omitting the ambiguities for single- and double-differences means that they must be known. The corresponding equations are simply obtained by rewriting (131) and (133) with the ambiguities shifted to the left side of the equations. The single-differences become

j

ΦAB (t) − NAB =

1 j ρ (t) + f j δAB (t) λj AB

(141)

1 jk ρ (t) , λ AB

(142)

and the double-differences jk

jk

ΦAB (t) − NAB =

where the unknowns now appear only on the right sides. If the frequencies of the satellite signals are not equal, an analogous relation for the double-difference is obtained. Referring to (97), the two terms containing ambiguities may be shifted to the left side to indicate that they are known. Then the only remaining term on the right side is \rho_{AB}^{jk}(t). Thus, all of the equations can be solved if one position of the moving receiver is known. Preferably (but not necessarily), this will be the starting point of the moving receiver. The baseline related to this starting point is denoted as the starting vector. With a known starting vector, the ambiguities are determined and are known for all subsequent positions of the roving receiver as long as no loss of signal lock occurs and a minimum of four satellites is in view.
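With ambiguities known (or fixed), a single epoch with four satellites suffices for the rover position according to Eq. (142). The following Gauss-Newton sketch uses entirely synthetic geometry; all coordinates, the ambiguity values, and the numerical Jacobian are illustrative, not an operational algorithm:

```python
import numpy as np

# Single-epoch rover position from double-differences with known
# ambiguities, cf. Eq. (142): lambda * (Phi_dd - N_dd) = rho_dd.

lam = 0.19                      # wavelength in meters (roughly L1)
A = np.zeros(3)                 # known reference station (synthetic origin)
sats = np.array([[20e6, 5e6, 12e6],      # reference satellite j
                 [-4e6, 18e6, 9e6],      # satellite k
                 [-10e6, -8e6, 16e6],    # satellite l
                 [7e6, -16e6, 11e6]])    # satellite m (synthetic positions)

def dd_ranges(B):
    """rho_AB^{js} for every satellite s paired with the reference j."""
    sd = np.linalg.norm(sats - B, axis=1) - np.linalg.norm(sats - A, axis=1)
    return sd[1:] - sd[0]

B_true = np.array([1000.0, 2000.0, 100.0])
N_dd = np.array([17.0, -5.0, 30.0])        # known (fixed) ambiguities
phi_dd = dd_ranges(B_true) / lam + N_dd    # simulated DD phases, Eq. (87)

B = np.zeros(3)                            # approximate rover position
for _ in range(10):
    misclosure = lam * (phi_dd - N_dd) - dd_ranges(B)
    J = np.empty((3, 3))                   # numerical Jacobian, for brevity
    for i in range(3):
        h = np.zeros(3)
        h[i] = 1.0
        J[:, i] = (dd_ranges(B + h) - dd_ranges(B - h)) / 2.0
    B = B + np.linalg.solve(J, misclosure)

assert np.allclose(B, B_true, atol=1e-3)
```

With three double-differences and three unknown coordinates the system is uniquely determined; with more satellites, a least-squares adjustment weighted by the correlation matrix (121) would be used instead of the direct solve.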


Static Initialization
Three methods are available for the static determination of the starting vector. In the first method, the moving receiver is initially placed at a known point, creating a known starting vector. The ambiguities can then be calculated from the double-difference model (133) as real values and are then fixed to integers. A second method is to perform a static determination of the starting vector. The third initialization technique is the antenna swap method according to B.W. Remondi. The antenna swap is performed as follows: denoting the reference mark as A and the starting position of the moving receiver as B, a few measurements are taken in this configuration, and with continuous tracking, the receiver at A is moved to B, while the receiver at B is moved to A, where again a few measurements are taken. This is sufficient to precisely determine the starting vector in a very short time (e.g., 30 seconds). Often, a second antenna swap is performed by moving the receivers back to their starting positions.

Kinematic Initialization
Special applications require kinematic GNSS without static initialization since the moving object whose position is to be calculated is in permanent motion (e.g., a buoy or an airplane while flying). Translated to model equations, this means that the most challenging case is the determination of the ambiguities on-the-fly (OTF). The solution requires an instantaneous ambiguity resolution or an instantaneous positioning (i.e., for a single epoch). This strategy sounds very simple, but it can require advanced methods. A vast literature has been written on this important topic. The main problem is to find the position as fast and as accurately as possible. This is achieved by starting with approximations for the position and improving them by least-squares adjustments or search techniques.
Measurement Examples for RTK
Since RTK is a relative point positioning method where motion ("kinematic") is included, it is difficult to compare it properly with another adequate GNSS method. In the following examples, a comparison between RTK and DGNSS (or, in the current case, DGPS is the more specific denotation since only GPS satellites have been used) is investigated, where once the motion was realized by a walking person carrying the antenna and once the path of a car was measured. For the two examples, Graz-Lustbühel was chosen as reference station, which is some 25 km away from the measurement area.

The first results shown in Fig. 24 refer to the pedestrian measurement example. The update rate is 10 Hz and the elevation mask 15 degrees and, as mentioned, only GPS satellites have been used. The processing software was RTKLIB, where the standardized models for correcting for the ionosphere and the troposphere (Klobuchar and Saastamoinen) were activated. The results were generated in the following way. First, the positions for the individual epochs were calculated as x, y, z coordinates in the reference frame ETRS 2002 for RTK and DGPS as well. Then the vector between these two positions was transformed into the local level system, where the origin was placed at the RTK position for each measurement

21 Elementary Mathematical Models for GNSS Positioning


Fig. 24 Comparing RTK with DGPS for a pedestrian

epoch. Finally, these local level vectors were split into north, east, and up components and are displayed in Fig. 24. Combining all epochs statistically, measurement standard deviations of 0.37 m for the north component, 0.27 m for the east component, and 0.54 m for the up component are obtained. The 2D mean offset amounts to 0.05 m and the 3D mean offset to 0.28 m. In Fig. 25, the motion of the car is displayed. The measurement area, the equipment, and the presuppositions are the same as in the previous example. As expected, the results are much noisier, which is also reflected in the statistics: now measurement standard deviations of 0.94 m for the north component, 1.18 m for the east component, and 2.05 m for the up component are obtained. Another very interesting illustration, from a different area, is given in Fig. 26. Again DGNSS or, more precisely, DGPS is compared with RTK; however, now the resulting paths themselves are shown. The red trajectory refers to DGPS and the blue one to RTK. It is remarkable how smooth the RTK result is.
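The transformation into the local level system described above can be sketched as follows (a generic ECEF-to-ENU rotation; the function and the statistics helper are illustrative, not the actual processing used for the figures):

```python
import math

def ecef_to_enu(dx, dy, dz, lat_deg, lon_deg):
    """Rotate an ECEF difference vector (e.g., DGPS minus RTK position)
    into east, north, up at the given geodetic latitude/longitude."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    sl, cl = math.sin(lat), math.cos(lat)
    so, co = math.sin(lon), math.cos(lon)
    e = -so * dx + co * dy
    n = -sl * co * dx - sl * so * dy + cl * dz
    u = cl * co * dx + cl * so * dy + sl * dz
    return e, n, u

def mean_and_std(series):
    """Mean and standard deviation of one component over all epochs."""
    m = sum(series) / len(series)
    return m, math.sqrt(sum((v - m) ** 2 for v in series) / len(series))
```

Applying `ecef_to_enu` epoch by epoch and `mean_and_std` to the resulting east, north, and up series yields component-wise offsets and standard deviations of the kind quoted above.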

Fig. 25 Comparing RTK with DGPS for a car

Fig. 26 Comparison of the trajectories of the results for RTK and DGPS

B. Hofmann-Wellenhof

3.3.6 Pseudokinematic Relative Positioning
The pseudokinematic method can be identified as static surveying with large data gaps [69]. The mathematical model, e.g., for double-differences, corresponds to Eq. (133), where generally two sets of phase ambiguities must be resolved since the point is occupied at different times. B.W. Remondi also has applied the triple-difference method followed by the ambiguity function method (Sect. 4.2.3) to avoid ambiguities altogether. Processing of the data could start with a triple-difference solution for the few minutes of data collected during the two occupations of a site. Based on this solution, the connection between the two ambiguity sets is computed [104]. After the successful ambiguity connection, the normal double-difference solutions are performed. The time span between the two occupations is an important factor affecting accuracy. Willis and Boucher [149] investigate the accuracy improvement achieved by increasing the time span between the two occupations. As a rule of thumb, the minimum time span should be one hour. Note that the pseudokinematic relative positioning method is rarely used today.

3.3.7 Virtual Reference Stations
When processing a baseline, the effects of orbit errors and of ionospheric and tropospheric refraction are reduced by forming differences of the observables, e.g., double-differences. These effects grow with increasing baseline length. Therefore, it is good practice to use short baselines, requiring a reference station close to the rover. These basic considerations have led to reference station networks like the Austrian positioning service (APOS), the German satellite positioning service (SAPOS), and several others. After the establishment of such networks, some new ideas have evolved to exploit the available data accordingly. Among many others, real-time differential error modeling in reference station networks [139], multibase real-time kinematic (RTK) positioning using virtual reference stations [136], and network-based techniques for RTK applications [152] are respective examples. Even in case of an existing reference station network it is desirable to further reduce the baseline length. The idea is to generate “observation data” for a nonexisting station, i.e., a virtual station, from real observations of a multiple reference station network and to transmit these data to the rover station. This is the basic principle of the virtual reference station (VRS) concept. Usually, the data of three or more reference stations surrounding the VRS are taken to calculate the observation data for the VRS. This concept yields a horizontal accuracy at the level of 5 cm for baselines up to 35 km [105].

Understanding the VRS Principle by an Elementary Approach
The objective is to transform measurements made at real reference stations to the location of the VRS, i.e., to a different location. This implies that all terms of the observation equation model depending on the reference receiver location have to be corrected to account for the new location. To keep it as simple as possible, the phase pseudorange

$\Phi_r^s(t) = \frac{1}{\lambda^s}\,\rho_r^s(t) + N_r^s + f^s\,\delta_r^s(t) ,$   (143)

as given in (42) after substitution of $f^s = c/\lambda^s$, is considered. On the left side, $\Phi_r^s(t)$ is the measured carrier phase, which is modeled by $\rho_r^s(t)$, the geometric distance between receiver and satellite, the time-independent integer ambiguity $N_r^s$, and $\delta_r^s(t)$, the combined receiver and satellite clock bias. Now the key question is: which of the terms is location-dependent? In other words, which of the terms changes if the same receiver is assumed at another location but considered at the same epoch $t$? The answer is $\Phi_r^s(t)$ and $\rho_r^s(t)$, because the other two terms do not change with varying location. Now it is assumed that receiver $r$ is once located at the real reference station A, represented by the coordinate vector $X_A$, and once at the virtual reference station (VRS), represented by $X_V$. Then there result from these two locations from (143) the two equations

$\Phi_r^s(X_A, t) = \frac{1}{\lambda^s}\,\rho_r^s(X_A, t) + N_r^s + f^s\,\delta_r^s(t) ,$
$\Phi_r^s(X_V, t) = \frac{1}{\lambda^s}\,\rho_r^s(X_V, t) + N_r^s + f^s\,\delta_r^s(t) ,$   (144)

where the location dependence is indicated accordingly. Forming the difference of the two equations yields

$\Phi_r^s(X_V, t) - \Phi_r^s(X_A, t) = \frac{1}{\lambda^s}\,\rho_r^s(X_V, t) - \frac{1}{\lambda^s}\,\rho_r^s(X_A, t) ,$   (145)

where the ambiguity and the clock error have vanished. After a slight rearrangement,

$\Phi_r^s(X_V, t) = \Phi_r^s(X_A, t) + \frac{1}{\lambda^s}\left[\rho_r^s(X_V, t) - \rho_r^s(X_A, t)\right]$   (146)

results. The left-hand side is the desired “measurement quantity” at the virtual reference station. Therefore, if all terms on the right-hand side can be obtained, there is no need to actually measure it. The term $\Phi_r^s(X_A, t)$ refers to the real reference station A and is measured. Accordingly, $\rho_r^s(X_A, t)$ is known, since the station coordinates of A and the satellite coordinates of $s$ are known and calculable, respectively. The only remaining term to be discussed is $\rho_r^s(X_V, t)$, which comprises the coordinates of the virtual reference station. Figure 27 shows the network of reference stations A, B, C, . . ., the virtual reference station, and the user receiver position indicated by $r$.

Fig. 27 Network of reference stations with the VRS denoted as V and the roving user receiver indicated by r

In principle, the location of the virtual reference station is arbitrary; however, the baseline between the receiver $r$ and the virtual reference station V should be smaller than the baselines between $r$ and any of the real reference stations. A very useful and convenient location is therefore an approximate position of the user receiver, which may readily be obtained by point positioning with code measurements, yielding $X_V$, the coordinates of the VRS. Once the coordinates of the virtual station are known, they may remain fixed for the subsequent epochs (unless the roving user receiver moves too far from the VRS). In summary, the right-hand side of (146) is now fully determined. Thus, the observable $\Phi_r^s(X_V, t)$ of the VRS may be obtained without actually measuring it. Since $X_A$ and $X_V$ are known as well, the computation of $\rho_r^s(X_A, t)$ and $\rho_r^s(X_V, t)$ is straightforward for the subsequent epochs; and $\Phi_r^s(X_A, t)$ is measured at the real reference station A.

The More Complex Reality
Is it really that simple? In theory “yes”, but in practice “no”, since (143) and (144) are idealized models that do not take into account errors like satellite orbit errors and ionospheric and tropospheric refraction. These errors may be considered for the real reference station A by the collective term

$\varepsilon_r^s(X_A, t) = \varepsilon_{\mathrm{Orbit}}(X_A, t) + \varepsilon_{\mathrm{Iono}}(X_A, t) + \varepsilon_{\mathrm{Trop}}(X_A, t) .$   (147)

Consequently, model (144) for the measurement equation of the reference station A is expanded to

$\Phi_r^s(X_A, t) = \frac{1}{\lambda^s}\,\rho_r^s(X_A, t) + N_r^s + f^s\,\delta_r^s(t) + \varepsilon_r^s(X_A, t)$   (148)

and in analogous form for all other real reference stations. To estimate the error term properly, all baselines of the reference network are solved. Note that this also requires the correct ambiguity determination (which should in general not be a major problem because the station coordinates are known). The desired results of the network solution are the error residuals $\varepsilon_r^s(X_A, t), \varepsilon_r^s(X_B, t), \ldots$ for all reference stations and at every epoch individually. Similarly, the relation (146) for the VRS is improved by

$\Phi_r^s(X_V, t) = \Phi_r^s(X_A, t) + \frac{1}{\lambda^s}\left[\rho_r^s(X_V, t) - \rho_r^s(X_A, t)\right] + \varepsilon_r^s(X_V, t) ,$   (149)

but now the problem arises how to determine $\varepsilon_r^s(X_V, t)$ for the VRS. An intuitively simple approach is to take the error residuals $\varepsilon_r^s(X_A, t)$, $\varepsilon_r^s(X_B, t)$, $\varepsilon_r^s(X_C, t)$ of three reference stations A, B, C surrounding the VRS and to compute $\varepsilon_r^s(X_V, t)$ by a weighted mean, for which the weights depend inversely on the distance between the virtual reference station and the respective real reference station.
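This inverse-distance weighting, combined with the VRS observable of (146)/(149), can be sketched in a few lines (hypothetical numbers; it is assumed that the residuals and distances are already available from the network solution):

```python
def idw_residual(residuals, distances):
    """Weighted mean of network error residuals; the weights are the
    inverse distances from the VRS to the real reference stations."""
    weights = [1.0 / d for d in distances]
    return sum(w * r for w, r in zip(weights, residuals)) / sum(weights)

def vrs_phase(phi_a, rho_a, rho_v, wavelength, eps_v=0.0):
    """Carrier phase 'observed' at the VRS, cf. (146) and (149) [cycles]."""
    return phi_a + (rho_v - rho_a) / wavelength + eps_v

# residuals [cycles] at stations A, B, C and their distances to the VRS [m]
eps_v = idw_residual([0.02, 0.04, 0.03], [10e3, 20e3, 40e3])
phi_v = vrs_phase(phi_a=100.0, rho_a=19000.0, rho_v=19001.9,
                  wavelength=0.19, eps_v=eps_v)
```

The closest station dominates the weighted mean, which matches the intuition that its error budget resembles the VRS location best.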


Another approach is to model the error residuals at the reference stations $i$ by

$\varepsilon_r^s(X_i, t) = a\,X_i + b\,Y_i + c\,Z_i ,$   (150)

where $X_i, Y_i, Z_i$ are the coordinates of the reference stations (e.g., ECEF coordinates or plane coordinates supplemented by a height component). Assuming three real reference stations A, B, C, the coefficients $a, b, c$ can be calculated. To determine the error residual $\varepsilon_r^s(X_V, t)$ for the virtual reference station, use (150) with the computed coefficients $a, b, c$ and the coordinates of the VRS. If more than three real reference stations are available, the model (150) can either be extended or a least-squares adjustment applied. A few more words should be spent on the modeling of the error term (147). First, it may be further expanded by taking into account additional error influences like antenna phase center offset and variation or multipath. Note, however, that these purely station-dependent terms are uncorrelated between stations, and it does not make sense to transfer their influence to the VRS by the approach shown above. Therefore, these error influences must be reduced or corrected by proper modeling or simply neglected. Second, the influence of the ionosphere and the troposphere in (147) may also be reduced by data combinations (Sect. 2.2) or modeling [54, Sects. 5.3.2, 5.3.3]. Nevertheless, remaining residuals will contribute to the error term. Several other approaches for modeling the error term have been proposed. Wübbena et al. [152] and Wanninger [141] propose area correction parameters; Landau et al. [81] mention a weighted linear approximation approach and least-squares collocation. Dai et al. [18] compare several interpolation algorithms, such as the linear combination model, distance-based linear interpolation, linear interpolation, low-order surface models, and least-squares collocation, and find comparable performances.
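With three stations, (150) is a 3×3 linear system for $a, b, c$. A sketch with hypothetical coordinates and residuals (Cramer's rule is used only for brevity; a least-squares adjustment would be used with more stations):

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def plane_coeffs(stations, residuals):
    """Solve eps_i = a*X_i + b*Y_i + c*Z_i for (a, b, c), cf. Eq. (150)."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = stations
    e1, e2, e3 = residuals
    d = det3([[x1, y1, z1], [x2, y2, z2], [x3, y3, z3]])
    a = det3([[e1, y1, z1], [e2, y2, z2], [e3, y3, z3]]) / d
    b = det3([[x1, e1, z1], [x2, e2, z2], [x3, e3, z3]]) / d
    c = det3([[x1, y1, e1], [x2, y2, e2], [x3, y3, e3]]) / d
    return a, b, c

# solve from three stations, then evaluate (150) at the VRS coordinates
a, b, c = plane_coeffs([(1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (0.0, 0.0, 1.0)],
                       [6.0, 7.0, 4.0])
eps_vrs = a * 0.5 + b * 0.5 + c * 1.0
```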

4 Data Processing

Now it is assumed that according to the previous chapter one or more GNSS positioning methods have been applied and the measurement data are stored in one or more receivers. The next step is data preprocessing, with data handling (how to get the data onto a computer) and a rough data analysis. In case of phase measurements, the integer ambiguities must be determined, a very challenging task because the accuracy of the results is directly related to the correct ambiguity solution. Depending on the application, usually redundant data are available; redundant means that more observations than unknowns are available. Therefore, an adjustment procedure and filtering methods may be applied adequately. Referring to the Springer Handbook of Global Navigation Satellite Systems, the topics of this chapter are treated by Humphreys [56] in Part C, Chap. 16, Odijk [99] in Part D, Chap. 21, Verhagen and Teunissen [133] in Part D, Chap. 22, Teunissen [123] in Part D, Chap. 23, Teunissen [124] in Part D, Chap. 24, and Odijk and Wanninger [100] in Part E, Chap. 26.

4.1 Data Preprocessing

When initializing the step of data preprocessing, it is assumed that the data are still stored in the receiver. Depending on the application, the data can also be processed in the receiver and the results displayed and stored there; a typical example would be the mass-market application of point positioning with code ranges epoch by epoch. In case of more sophisticated applications, however, the data must be handled appropriately and stored on a computer. This should be carried out in a receiver-independent format to allow combining data of different receivers. In addition, a quick glance at the data is performed to check its plausibility; in other words, the data are checked to detect possible cycle slips and, in case of detection, to repair them.

4.1.1 Data Handling

Downloading
Both the observables and the navigation message together with additional information are generally stored in a binary (and receiver-dependent) format. The data must be downloaded from the receiver before postprocessing can begin. Most GNSS manufacturers have designed a data management system which they recommend for data processing. Individual software is fully documented in the manuals of the manufacturers and will not be covered here.

Data Exchange
Although the binary receiver data may have been converted into a computer-independent ASCII format during downloading, the data are still receiver dependent. Also, each GNSS processing software has its own format, which necessitates the conversion of specific data into a software-independent format when they are processed with a different type of program. From the preceding, one may conclude that a receiver-independent format of GNSS data promotes data exchange. This has been realized by the receiver-independent exchange (RINEX) format. This format was first defined for GPS data in 1989 and has been published in a second version by Gurtner and Mader [40]. Later, several minor changes were adopted [97], and in 1997 the format was extended to also account for GLONASS [39]. As of January 2018, RINEX 3.03 is the latest version, published by the International GNSS Service (IGS) RINEX Working Group and the Radio Technical Commission for Maritime Services Special Committee 104 (RTCM-SC104) on July 14, 2015. Now Galileo, BeiDou, and also augmentation systems are included. For more details see the publication just cited.


At present, RINEX is the most favored format. As a consequence, all receiver manufacturers implement software for the conversion of their receiver-dependent format into RINEX.

4.1.2 Cycle Slip Detection and Repair

Definition of Cycle Slips
When a receiver is turned on, the fractional part of the beat phase (i.e., the difference between the satellite-transmitted carrier and a receiver-generated replica) is observed and an integer counter is initialized. During tracking, the counter is incremented by one cycle whenever the fractional phase changes from $2\pi$ to 0. Thus, at a given epoch, the observed accumulated phase is the sum of the fractional phase $\varphi$ and the integer count $n$. The initial integer number $N$ of cycles between the satellite and the receiver is unknown. This phase ambiguity $N$ remains constant as long as no loss of signal lock occurs. In that event, the integer counter is reinitialized, which causes a jump in the instantaneous accumulated phase by an integer number of cycles. This jump is called a cycle slip, which, of course, is restricted to phase measurements. A schematic representation of a cycle slip is given in Fig. 28. When the measured phases are plotted versus time, a fairly smooth curve should be obtained; in the case of a cycle slip, a sudden jump appears in the plotted curve. Three sources of cycle slips can be distinguished. First, cycle slips are caused by obstructions of the satellite signal due to trees, buildings, bridges, mountains, etc. This source is the most frequent one (particularly for kinematic activities based upon the carrier phase). The second source is a low signal-to-noise (S/N) ratio due to bad ionospheric conditions, multipath, high receiver dynamics, or low satellite elevation. A third source is a failure in the receiver software [52], which leads to incorrect signal processing. Cycle slips could also be caused by malfunctioning satellite oscillators, but these cases are rare.

Fig. 28 Schematic representation of a cycle slip


As seen from Fig. 28, cycle slip detection and repair requires the location of the jump (i.e., the cycle slip) and the determination of its size. Detection is accomplished by a testing quantity. Repair is made by correcting all subsequent phase observations for this satellite and this carrier by a fixed integer number of cycles. The determination of the cycle slip size and the correction of the phase data is often denoted as cycle slip repair or fixing.

Testing Quantities
For a single site, the testing quantities are raw phases, phase combinations, combinations of phases and code ranges, or combinations of phases and integrated Doppler frequencies. Single receiver tests are important because they enable in situ cycle slip detection and repair by the internal software of the receiver. When the data of two sites are combined, single-, double-, or triple-differences can be used for cycle slip detection. This means that, in a first step, unrepaired phase combinations are used to process an approximate baseline vector. The corresponding residuals are then tested. Quite often several iterations are necessary to improve the baseline solution. Note that triple-differences can achieve convergence and rather high accuracy without fixing cycle slips. Note also that triple-differences make the amount of double-difference cycle slips very clear – in the static case. Subsequently, the testing quantities for a single site are treated in more detail.

Raw phases
The measured raw phase $\Phi_r^s(t)$ can be modeled by

$\lambda\,\Phi_r^s(t) = \rho_r^s(t) + \lambda N_r^s + c\,\delta_r^s(t) - \Delta^{\mathrm{Iono}}(t) + \ldots ,$   (151)

where $r$ and $s$ denote the receiver site and the satellite, respectively. Note that the phase model contains a number of time-dependent terms on the right side of (151) which may prevent cycle slip detection.

Phase combinations
The model for the dual-frequency phase combination is developed considering a single site, a single satellite, and a single epoch. Thus, the sub- and superscripts and even the time dependency in Eq. (151) may be omitted. According to Eq. (54), the phases are modeled by

$\Phi_1 = a f_1 + N_1 - \frac{b}{f_1} ,$
$\Phi_2 = a f_2 + N_2 - \frac{b}{f_2} ,$   (152)

where the frequency dependency is shown explicitly by the subscripts 1 and 2. In order to eliminate the geometry term $a$, the first equation of (152) is multiplied by $f_2$ and the second by $f_1$. Subtracting the resulting equations yields

$f_2 \Phi_1 - f_1 \Phi_2 = f_2 N_1 - f_1 N_2 - b \left( \frac{f_2}{f_1} - \frac{f_1}{f_2} \right)$   (153)

and dividing the difference by $f_2$ gives

$\Phi_1 - \frac{f_1}{f_2}\,\Phi_2 = N_1 - \frac{f_1}{f_2}\,N_2 - \frac{b}{f_2} \left( \frac{f_2}{f_1} - \frac{f_1}{f_2} \right)$   (154)

or, by extracting $f_2/f_1$ from the term in parentheses on the right side of the equation, the final form of the geometry-free phase combination

$\Phi_1 - \frac{f_1}{f_2}\,\Phi_2 = N_1 - \frac{f_1}{f_2}\,N_2 - \frac{b}{f_1} \left( 1 - \frac{f_1^2}{f_2^2} \right)$   (155)

is obtained. The left side of Eq. (155) is identical with the ionospheric residual, cf. Eq. (23). The right side shows that the only time-varying quantity is the ionosphere term $b$. In comparison to the influence on the raw phases in Eq. (151), the influence of the ionosphere on the dual-frequency combination is affected by the factor $(1 - f_1^2/f_2^2)$. Substituting typical GNSS values, $f_1 = 1.6\,$GHz and $f_2 = 1.2\,$GHz, this factor is $-0.78$. If there are no cycle slips, the temporal variations of the ionospheric residual would be small for normal ionospheric conditions and for short baselines. Indicators of cycle slips are sudden jumps in successive values of the ionospheric residual. The remaining problem is to determine whether the cycle slip was on phase data referring to $f_1$ or $f_2$ or both. This will be investigated in the next paragraphs.

Phase and code range combinations
Another testing quantity follows from a phase and code range combination. Modeling the carrier phase and the code pseudoranges by

$\lambda\,\Phi_r^s(t) = \rho_r^s(t) + \lambda N_r^s + c\,\delta_r^s(t) - \Delta^{\mathrm{Iono}}(t) + \Delta^{\mathrm{Trop}}(t) ,$
$R_r^s(t) = \rho_r^s(t) + c\,\delta_r^s(t) + \Delta^{\mathrm{Iono}}(t) + \Delta^{\mathrm{Trop}}(t)$   (156)

and forming the difference

$\lambda\,\Phi_r^s(t) - R_r^s(t) = \lambda N_r^s - 2\,\Delta^{\mathrm{Iono}}(t)$   (157)

provides a formula in which the time-dependent terms (except the ionospheric refraction) have vanished from the right side of the equation. Thus, the phase and code range combination can also be used as testing quantity. The ionospheric influence may either be modeled or neglected: the change of $\Delta^{\mathrm{Iono}}(t)$ will be fairly small between closely spaced epochs, which might justify neglecting the ionospheric term. It may also be neglected when using double-differences.
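Two of these single-site testing quantities can be sketched in a few lines (illustrative frequencies, wavelength, jump threshold, and series; not the processing of any particular receiver): the geometry-free combination (155) and the phase/code combination (157), with the ionospheric term neglected in the latter so that a slip appears as a jump by an integer number of wavelengths.

```python
F1, F2 = 1.6e9, 1.2e9   # illustrative carrier frequencies [Hz]
WAVELENGTH = 0.19       # illustrative carrier wavelength [m]

def iono_residual(phi1, phi2):
    """Geometry-free combination Phi1 - (f1/f2)*Phi2, Eq. (155) [cycles]."""
    return phi1 - F1 / F2 * phi2

def slip_epochs(phi1_series, phi2_series, threshold=0.5):
    """Epoch indices where the ionospheric residual jumps suspiciously."""
    res = [iono_residual(p1, p2) for p1, p2 in zip(phi1_series, phi2_series)]
    return [i for i in range(1, len(res)) if abs(res[i] - res[i - 1]) > threshold]

def phase_code_test(phi_cycles, code_range_m):
    """Testing quantity lambda*Phi - R, Eq. (157), in meters; apart from
    the ionospheric term only lambda*N remains, so a slip shows up as a
    jump by an integer multiple of the wavelength."""
    return WAVELENGTH * phi_cycles - code_range_m

# a 5-cycle slip injected on Phi1 at epoch 3 is flagged by the residual
epochs_with_slip = slip_epochs([100.0, 100.1, 100.2, 105.3, 105.4],
                               [80.0, 80.1, 80.2, 80.3, 80.4])
```

Rounding the epoch-to-epoch change of `phase_code_test` to the nearest wavelength yields the (integer) slip size directly, which is exactly why this quantity is attractive once the code noise is low enough.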


The simple testing quantity (157) has a shortcoming which is related to the noise level. The phase and code range combination has a noise level in the range of 5 cycles. This noise is mainly caused by the noise level of the code measurements and to a minor extent by the ionosphere. The noise of code measurements is larger than the noise of phase measurements because resolution and multipath are proportional to the wavelength. Traditionally, the measurement resolution was $\lambda/100$; today, receiver hardware achieves improved measurement resolutions approaching $\lambda/1000$, which leads to code range noise levels of a few centimeters. Hence, the phase and code range combination can be an ideal testing quantity for cycle slip detection.

Combination of phases and integrated Doppler
A further testing quantity is obtained by comparing differences of measured phases with phase differences derived from integrated Doppler, which has the advantage of being immune to cycle slips.

Detection and Repair
Each of the described testing quantities allows the location of cycle slips by checking the difference of two consecutive epoch values. This also yields an approximate size of the cycle slip. To find the correct size, the time series of the testing quantity must be investigated in more detail. Note that for all previously mentioned testing quantities except the ionospheric residual, the detected cycle slip must be an integer. In certain processing scenarios, cycle slips are easy to detect and repair without knowing which satellite or receiver had the problem. One of the methods for cycle slip detection is the scheme of differences. The principle can be seen from the example in Table 7. Assume a time series $y(t_i)$, $i = 1, 2, \ldots, 7$, for a signal which contains a jump of $\varepsilon$ at epoch $t_4$. Note that any of the described testing quantities may be used as signal for the scheme of differences.
The terms $y^1, y^2, y^3, y^4$ denote the first-order, second-order, third-order, and fourth-order differences. The important property in the context of data irregularities is the amplification of a jump in higher-order differences and, thus, the improved possibility of detecting the jump. The theoretical reason is the fact that differences are generated by subtractive filters: these are high-pass filters damping low frequencies and eliminating constant parts, while high-frequency constituents such as a jump are amplified. Replacing the signal $y(t_i)$, for example, by the phase and assuming $\varepsilon$ to be a cycle slip, the effect of the scheme of differences becomes evident.

Table 7 Scheme of differences

  t_i    y(t_i)    y^1     y^2     y^3     y^4
  t_1      0        0       0       ε      −3ε
  t_2      0        0       ε      −2ε      3ε
  t_3      0        ε      −ε       ε      −ε
  t_4      ε        0       0       0
  t_5      ε        0       0
  t_6      ε        0
  t_7      ε

A method to determine the size of a cycle slip is to fit a curve through the testing quantities before and after the cycle slip. The size of the cycle slip is found from the shift between the two curves. The fits may be obtained from a simple linear regression or from more realistic least-squares models. These methods are generally called interpolation techniques. Other possibilities are prediction methods such as Kalman filtering. At a certain epoch, the function value (i.e., one of the testing quantities) for the next epoch is predicted based on the information obtained from preceding function values. The predicted value is then compared with the observed value to detect a cycle slip. The application of Kalman filtering for cycle slip detection is demonstrated by, e.g., Landau [79]. In static processing, the best method of detecting the amount of the double-difference integer jump is the triple-difference method. While this method, by itself, may not indicate which satellite or which epoch or which receiver caused the missing integer, it does indicate how to fix the double-difference integer exactly. Once fixed, double-difference processing can proceed. When a cycle slip has been detected, the testing quantities can be corrected by adding the size of the cycle slip to each of the subsequent quantities.
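The scheme of differences of Table 7 can be reproduced with a few lines (sketch; any of the described testing quantities can serve as input series):

```python
def difference_scheme(y, order=4):
    """Return [y, y1, y2, y3, y4]: the series and its successive
    difference rows, as in the scheme of Table 7."""
    rows = [list(y)]
    for _ in range(order):
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows

# a jump of eps = 1 at t4 is amplified to +/-3*eps in the fourth differences
rows = difference_scheme([0, 0, 0, 1, 1, 1, 1])
```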
The assignment of the detected cycle slip to a single-phase observation is ambiguous if the testing quantities were phase combinations. An exception is the ionospheric residual. Under special circumstances, this testing quantity permits a unique separation. Consider Eq. (155) and assume ambiguity changes $\Delta N_1$ and $\Delta N_2$ caused by cycle slips. Consequently, a jump $\Delta N$ in the ionospheric residual would be detected. This jump is equivalent to

$\Delta N = \Delta N_1 - \frac{f_1}{f_2}\,\Delta N_2 ,$   (158)

where $\Delta N$ is no longer an integer. Equation (158) represents a diophantine equation for the two integer unknowns $\Delta N_1$ and $\Delta N_2$. This means there is one equation with two unknowns; hence, there is no unique solution. This can be seen by solving for integer values $\Delta N_1$ and $\Delta N_2$ such that $\Delta N$ becomes zero. To get $\Delta N = 0$ and considering as an example $f_1/f_2 = 77/60$, the condition

$\Delta N_1 = \frac{f_1}{f_2}\,\Delta N_2 = \frac{77}{60}\,\Delta N_2$   (159)

must be fulfilled. This means that $\Delta N_1 = 77$ and $\Delta N_2 = 60$ cannot be distinguished from $\Delta N_1 = 154$ and $\Delta N_2 = 120$, since both solutions satisfy Eq. (159). However, the solution would be unambiguous if $\Delta N_1$ is less than 77 cycles. So far the consideration assumed error-free measurements. To be more realistic, the effect of measurement noise must be taken into account. A simple model for the phase measurement noise is

$\sigma_\Phi = 0.01\ \mathrm{cycles} ,$   (160)

which corresponds to a resolution of $\lambda/100$. The same model is applied to both carriers and, thus, frequency-dependent noise such as multipath is neglected. The assumption is not correct for codeless or quasi-codeless receivers, since additional noise is introduced during signal processing. The value $\Delta N$, in principle, is derived from two consecutive ionospheric residuals. Hence,

$\Delta N = \left[\Phi_1(t + \Delta t) - \frac{f_1}{f_2}\,\Phi_2(t + \Delta t)\right] - \left[\Phi_1(t) - \frac{f_1}{f_2}\,\Phi_2(t)\right]$   (161)

and applying to this equation the error propagation law gives

$\sigma_{\Delta N} = 2.3\,\sigma_\Phi = 0.023\ \mathrm{cycles} .$   (162)
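The factor 2.3 in (162) follows from applying variance propagation to the four phase terms of (161), two of which are scaled by $f_1/f_2$; a quick check (sketch):

```python
import math

SIGMA_PHI = 0.01   # phase noise [cycles], Eq. (160)
RATIO = 77 / 60    # f1/f2 as used in the text

# var(dN) = 2*sigma^2 + 2*(f1/f2)^2 * sigma^2 for four independent terms
sigma_dN = math.sqrt(2 + 2 * RATIO ** 2) * SIGMA_PHI
three_sigma = 3 * sigma_dN
```

The result is 0.023 cycles, and three times this value gives the 0.07-cycle resolution quoted next.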

The $3\sigma$ error yields approximately 0.07 cycles. This may be interpreted as the resolution of $\Delta N$. The conclusion is that two values of $\Delta N$ calculated by (158) using arbitrary integers $\Delta N_1$ and $\Delta N_2$ must differ by at least 0.07 cycles in order to be uniquely separable. A systematic investigation of the lowest values of $\Delta N_1$, $\Delta N_2$ is given in Table 8. For $\Delta N_1$ and $\Delta N_2$ the values $0, \pm 1, \pm 2, \ldots, \pm 5$ have been permutated and $\Delta N$ calculated by (158). Table 8 is sorted with increasing $\Delta N$ in the first column; the second column gives the first-order differences of the function values $\Delta N$. To shorten the length of the table, only the negative function values $\Delta N$ and zero are displayed; for supplementing with positive function values, the signs in the first, third, and fourth columns must be reversed. The lines in Table 8 marked with an asterisk do not fulfill the criterion of an at least 0.07-cycle difference. For these values, an unambiguous separation is not possible because the measurement noise is larger than the separation value. Consider the next-to-last line in Table 8: a jump in the ionospheric residual of about 0.14 cycles could result from the pair of cycle slips $\Delta N_1 = -4$, $\Delta N_2 = -3$ or $\Delta N_1 = 5$, $\Delta N_2 = 4$; however, notice that for the marked lines either $\Delta N_1$ or $\Delta N_2$ equals 5 (plus or minus). Therefore, omitting the values for $\Delta N_1 = \pm 5$ and $\Delta N_2 = \pm 5$ creates uniqueness in the sense of separability. Up to $\pm 4$ cycles, the function values $\Delta N$ are discernible by 0.12 cycles.

Table 8 Resulting $\Delta N$ by permutating ambiguity changes $\Delta N_1, \Delta N_2 \in \{0, \pm 1, \ldots, \pm 5\}$ in (158), sorted with increasing $\Delta N$ (ranging from $-11.42$ to $0.00$), together with the first-order differences of consecutive values; pairs separated by less than 0.07 cycles are marked with an asterisk

The conclusions for cycle slip repair using the ionospheric residual are as follows. Based on the measurement noise assumption in (160), the separation of the cycle slips is unambiguously possible for up to $\pm 4$ cycles. A smaller measurement noise increases the separability. For larger cycle slips, another method should be used in order to avoid wrong choices in ambiguous situations. Most often, there will be more than one cycle slip. In these cases, each cycle slip must be detected and corrected individually. The corrected phases, single-, double-, or triple-differences are then used to process the baseline. Fixing cycle slips by combining GNSS data with data of other sensors, mainly inertial navigation systems (INS), has succeeded to some extent. As Colombo et al. [16] demonstrate, even a moderately accurate (and low-cost) INS, which is also small, lightweight, and portable, can substantially enhance the ability to detect and correct cycle slips. If the INS data must bridge GNSS data gaps in a stand-alone mode, this bridging time is the critical factor for keeping the desired high accuracy. It depends on several factors, e.g., the type of application, the baseline length, and the accuracy of the INS. Accordingly, the bridging time may be limited from a few seconds to a few minutes. Details on the GNSS/INS data modeling and on tests are given in Schwarz et al. [110], Colombo et al. [16], Altmayer [5], El-Sheimy [20], Alban [4], Kim and Sukkarieh [67], and Farrell and Wendel [27].
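The separability conclusion can be checked by brute-force enumeration of (158) (sketch; the 0.07-cycle resolution is the $3\sigma$ value derived above):

```python
RATIO = 77 / 60  # f1/f2

def separable(max_cycles, resolution=0.07):
    """True if all distinct residual jumps dN = dN1 - (f1/f2)*dN2 with
    |dN1|, |dN2| <= max_cycles differ by at least `resolution` cycles,
    i.e., if the slip pair can be identified uniquely (cf. Table 8)."""
    rng = range(-max_cycles, max_cycles + 1)
    values = sorted(dn1 - RATIO * dn2 for dn1 in rng for dn2 in rng)
    gaps = [b - a for a, b in zip(values, values[1:]) if b - a > 1e-9]
    return min(gaps) >= resolution

# up to +/-4 cycles the jumps are uniquely separable; with +/-5 they are not
```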

4.2

Ambiguity Resolution

As shown in Sect. 2.1.2, ambiguities only occur if phase measurements are used. Since the determination of the ambiguities is crucial for the accuracy or the

1388

B. Hofmann-Wellenhof

precision of the achievable results and since this topic is far from trivial, many scientists have developed very interesting theories. Among the large number of developments, some are described subsequently. Despite this diversity, today one method has, at least in the opinion of the author, outperformed all other methods: Teunissen's LAMBDA method. Due to its importance, this method is described below, essentially in the form in which it was developed in the 1990s. Since then, several small improvements and changes have been contributed by other authors; yet the main credit belongs to Peter Teunissen and his team.

4.2.1 General Aspects
The ambiguity inherent in phase measurements depends upon both the receiver and the satellite. There is no time dependency as long as tracking is maintained without interruption. In the model for the phase,

$$\Phi = \frac{1}{\lambda}\,\rho + f\,\delta + N - \frac{1}{\lambda}\,\mathrm{Iono}\,, \tag{163}$$

the ambiguity is denoted by N. As soon as the ambiguity is determined as an integer value, the ambiguity is said to be resolved or fixed. In general, ambiguity fixing strengthens the baseline solution. Joosten and Tiberius [60] give an illustrative example. First, a short baseline is computed conventionally and the ambiguities are resolved. Then, introducing the float ambiguities (i.e., real values) and the fixed ambiguities (i.e., integer values), respectively, as given quantities, single-point positions on an epoch-by-epoch basis show a strong difference in precision: with real-valued ambiguities, the solutions scatter in the meter range in the coordinate components north, east, and up. For the integer-fixed solution, the precision of the coordinates is below the 1 cm level. Occasionally, however, solutions with fixed ambiguities and float ambiguities may agree within a few millimeters. The use of double-differences instead of single-differences for carrier phase processing is important. The reason is that in the case of single-differences an additional unknown parameter for the receiver clock offset must be considered, which prevents an effective separation of the integer ambiguities from the clock offset. In the case of double-differences, the clock terms have been eliminated and the isolation of the ambiguities is possible. In order to fully exploit the high accuracy of the carrier phase observable, the ambiguities must be resolved to their correct integer values, since an error of one cycle translates to a range error of some decimeters for GNSS carriers. It should be stressed here that integer ambiguity resolution may not always be possible. One of the reasons is the baseline length. When considering short baselines (e.g., sometimes < 20 km), the model for double-difference phases may be simplified to

$$\lambda\,\Phi_{AB}^{jk}(t) = \rho_{AB}^{jk}(t) + \lambda\,N_{AB}^{jk} + \text{noise} \tag{164}$$

since the ionosphere, the troposphere, and other minor effects may in general be neglected. Any significant residual error from these neglected terms will


spill over into the unknown parameters, namely station coordinates and ambiguities, and has the effect of degrading both the position accuracy and the integer nature of the ambiguities. Thus, if applications require a long range from the reference station, there may be a need to install several reference stations for integer ambiguity resolution or to apply the concept of virtual reference stations (Sect. 3.3.7). Another important aspect of ambiguity resolution is the satellite geometry, which can be viewed from two points. First, an increasing number of satellites tracked at any instant translates in general into a better dilution of precision (DOP) value. Thus, all-in-view receivers with the ability to track all visible GNSS satellites are preferable since redundant satellites aid in the efficiency and reliability of ambiguity resolution. The second point with respect to geometry is the length of time required to resolve ambiguities. The information content of the carrier phase is a function of time which is directly correlated to the movement of the satellite. This last point can be illustrated through an example. Suppose two datasets. The first one consists of observations collected every 15 seconds for one hour, for a total of 240 measurements per satellite. Measurements for the second dataset are collected every second for four minutes, for a total of 240 measurements per satellite. Although the number of measurements is the same, the information content clearly is not. The first dataset has a higher probability of correct ambiguity resolution since the elapsed time is longer. The time is a critical component of ambiguity resolution even under good geometric conditions. Multipath is also a critical factor for ambiguity resolution. Since multipath is station dependent, it may be significant even for short baselines. 
As in the case of atmospheric and orbital errors for long baselines, multipath has the effect of contaminating both the station coordinates and the ambiguities. Ambiguity resolution involves three major steps. The first step is the generation of potential integer ambiguity combinations that should be considered by the algorithm. A combination is composed of an integer ambiguity for, e.g., each of the double-difference satellite pairs. In order to determine these combinations, a search space must be constructed. The search space is the volume of uncertainty which surrounds the approximate coordinates of the unknown antenna location. Since the search space dictates which integer ambiguities will be considered, it should be selected conservatively, since it must contain the true antenna location. In the case of static positioning, this search space can be realized from the so-called float ambiguity solution, while for kinematic positioning it is realized from a code range solution. An important aspect of this first step in ambiguity resolution is that the size of the search space affects the efficiency, i.e., the computational speed, of the process. A larger search space gives a higher number of potential integer ambiguity combinations to assess, which in turn increases the computational burden. This is important for kinematic applications where a real-time implementation may be sought. It is, therefore, necessary to balance the computational load against a conservative search space size. The second major step in the ambiguity resolution process is the identification of the correct integer ambiguity combination. The criterion used by many ambiguity resolution techniques is the selection of the integer combination which minimizes


the sum of squared residuals in the sense of least-squares adjustment. The reasoning here comes from the argument that the combination which best fits the data should be the correct result. However, this can be problematic if there are not enough redundant satellites. The third step in the ambiguity resolution process should be a validation (or verification) of the ambiguities. The assessment of the correctness of the integer numbers obtained should gain more attention [130]. The ambiguity success rate as defined in Joosten and Tiberius [60] may be used as a tool for determining the probability of correct integer estimation. The ambiguity success rate depends on three factors: the observation equations (i.e., the functional model), the precision of the observables (i.e., the stochastic model), and the method of integer ambiguity estimation. Although the last two steps based on residual analysis are rather straightforward, a few remarks should be made with respect to some of the potential difficulties of this approach. The first issue is the basic assumption in least-squares theory that the residuals should be normally distributed. In many cases, this assumption is not fulfilled due to systematic effects from multipath, orbital errors, and atmospheric errors. This is the reason why ambiguity resolution generally fails for long baselines; however, if strong multipath exists, it may even fail for short baselines. A second related issue is the need for statistical significance when the integer ambiguity decision is made. This means that the integer ambiguity combination which best fits the measurements should do so significantly better than all the other combinations. Statistical criteria can be used for this decision as will be discussed in some of the following subsections. Remaining systematic effects mentioned above play a role here as well as the aspect of time, i.e., ambiguity resolution is more difficult for shorter time intervals. 
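The three steps can be condensed into a brute-force sketch. The helper below is a hypothetical illustration: candidates are generated within ±3σ of the float ambiguities, each candidate is scored by the SSR of a coordinate-only least-squares fit, and a ratio test against the runner-up validates the choice. The design matrix, the threshold, and the data layout are assumptions for the example, not a production algorithm such as LAMBDA.

```python
import itertools
import numpy as np

def resolve_ambiguities(A, y, float_amb, sigma, ratio_threshold=3.0):
    """Brute-force integer ambiguity search (hypothetical illustration).

    A          : design matrix for [coordinates..., ambiguities...]
    y          : double-difference observations (cycles)
    float_amb  : float ambiguity estimates (cycles)
    sigma      : their standard deviations (cycles)
    Returns (fixed integers or None, smallest SSR, ratio to the runner-up).
    """
    n_amb = len(float_amb)
    n_coord = A.shape[1] - n_amb
    # step 1: candidate generation within +/- 3 sigma of each float value
    ranges = [range(int(np.floor(v - 3 * s)), int(np.ceil(v + 3 * s)) + 1)
              for v, s in zip(float_amb, sigma)]
    scored = []
    for cand in itertools.product(*ranges):
        # step 2: fix the ambiguities, re-estimate only the coordinates
        y_fix = y - A[:, n_coord:] @ np.asarray(cand, dtype=float)
        x, *_ = np.linalg.lstsq(A[:, :n_coord], y_fix, rcond=None)
        res = y_fix - A[:, :n_coord] @ x
        scored.append((float(res @ res), cand))
    scored.sort()
    # step 3: validation by the ratio test against the second-best candidate
    best_ssr, best = scored[0]
    ratio = scored[1][0] / max(best_ssr, 1e-12)
    fixed = np.array(best) if ratio > ratio_threshold else None
    return fixed, best_ssr, ratio

# tiny synthetic example: one coordinate unknown, two ambiguities
A = np.array([[1.0, 1, 0], [1, 0, 1], [2, 1, 0], [2, 0, 1], [3, 1, 1]])
y = A @ np.array([0.5, 3.0, -2.0])          # noise-free observations
fixed, ssr, ratio = resolve_ambiguities(A, y, [3.1, -2.2], [0.3, 0.3])
```

With noise-free data the correct combination (3, −2) fits essentially perfectly, so the ratio test accepts it; with real data, the SSR contrast shrinks as the observation span shortens, exactly as discussed below.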
This three-step approach, (1) generation of potential integer ambiguity combinations, (2) identification of the optimum integer ambiguity combination, and (3) validation of the ambiguities, may also be refined and expanded. Han and Rizos [43] propose six general classes and include the ambiguity recovery techniques (to reestimate ambiguities when cycle slips occur) as well as integrated models using GNSS measurements and data from other sensors. Hatch and Euler [48] propose a respective partitioning into three classes, which is similarly adopted by Kim and Langley [66]:

1. Ambiguity resolution in the measurement domain.
2. Search technique in the coordinate domain.
3. Search technique in the ambiguity domain.

Following this classification, a few key principles from the numerous kinds of ambiguity resolution techniques will be demonstrated subsequently. Many variations may be derived (e.g., Mervart [96] and Kim and Langley [66]). The basic approaches as given in Sect. 4.2.2 belong to the ambiguity resolution methods in the measurement domain (but are usually combined with a search technique in the ambiguity domain).


Table 9 Some representative ambiguity determination methods

Acronym   Method                                             Principal reference(s)
LSAST     Least-squares ambiguity search technique           Hatch [46]
FARA      Fast ambiguity resolution approach                 Frei and Beutler [32], Frei [31]
—         Modified Cholesky decomposition method             Euler and Landau [24]
LAMBDA    Least-squares ambiguity decorrelation adjustment   Teunissen [114, 116]
—         Null space method                                  Martín-Neira et al. [94]
FASF      Fast ambiguity search filter                       Chen and Lachapelle [13]
OMEGA     Optimal method for estimating GPS ambiguities      Kim and Langley [65]

Apart from its relatively poor computational efficiency, the ambiguity function method in Sect. 4.2.3 is one representative example of a search technique in the coordinate domain. The overwhelming part of current research is dedicated to the third class, the search technique in the ambiguity domain. Some examples are given in Sect. 4.2.3. This class of ambiguity resolution mainly refers to the integer least-squares method which is theoretically established by the fact that it will yield the optimal solution in the sense that the probability of correct integer estimation is maximized [119, 120]. Techniques using the integer least-squares method are usually based on three steps: (1) the float solution, (2) the integer ambiguity estimation, and (3) the fixed solution. The variance-covariance matrix resulting from the float solution in the first step is employed for different ambiguity search processes [66]. Representative methods, some of them described in Sect. 4.2.3, are given in Table 9. Sometimes these methods are very similar to each other, e.g., OMEGA may be regarded as a refined version of LSAST. Note that Table 9 does not contain methods based on simulations for multiple (more than two) frequency methods like the three-carrier ambiguity resolution (TCAR) [30, 135].
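As a stand-in for the integer estimation step, the sketch below implements simple integer bootstrapping: the ambiguity with the smallest variance is rounded, the remaining float estimates are conditioned on that integer (a conditional-normal update using the float variance-covariance matrix), and the process repeats. This illustrates how the covariance of the float solution enters the search; it is not the LAMBDA method itself, which decorrelates the ambiguities first and then searches the integer least-squares problem.

```python
import numpy as np

def bootstrap_fix(a_float, Q):
    """Sequential integer fixing (bootstrapping) of float ambiguities.

    In each pass the ambiguity with the smallest variance is rounded to an
    integer, and the remaining float estimates and their covariance are
    conditioned on that fixed value (conditional-normal update), mimicking
    'fix the most reliable ambiguity, re-adjust, repeat'.
    """
    a = np.asarray(a_float, dtype=float)
    Q = np.asarray(Q, dtype=float)
    idx = list(range(len(a)))          # original positions still unfixed
    fixed = {}
    while idx:
        j = min(range(len(idx)), key=lambda k: Q[k, k])
        z = int(round(a[j]))
        fixed[idx[j]] = z
        keep = [k for k in range(len(idx)) if k != j]
        if keep:
            gain = Q[keep, j] / Q[j, j]
            a = a[keep] + gain * (z - a[j])       # conditioned estimates
            Q = Q[np.ix_(keep, keep)] - np.outer(gain, Q[j, keep])
        idx = [idx[k] for k in keep]
    return np.array([fixed[i] for i in sorted(fixed)])

# invented float solution and its variance-covariance matrix
z_fixed = bootstrap_fix([3.1, -2.2, 5.05],
                        [[0.04, 0.01, 0.0],
                         [0.01, 0.09, 0.0],
                         [0.0, 0.0, 0.01]])
```

Bootstrapping is suboptimal compared to the full integer least-squares search, but its success rate is computable in closed form, which is one reason it serves as a reference in the success-rate literature cited above.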

4.2.2

Basic Approaches

Single-Frequency Phase Data
When phase measurements for only one frequency are available, the most direct approach is as follows. The measurements are modeled by Eq. (163), and the linearized equations are processed. Depending on the model chosen, a number of unknowns (e.g., point coordinates, clock parameters) is estimated along with N in a common adjustment. In this approach, the unmodeled errors affect all estimated parameters. Therefore, the integer nature of the ambiguities is not exploited, and they are estimated as real values. To fix ambiguities as integer values, a sequential adjustment could be performed. After an initial adjustment, the ambiguity with a


computed value closest to an integer and with minimum standard error is considered to be determined most reliably. This bias is then fixed, and the adjustment is repeated (with one less unknown) to fix another ambiguity, and so on. When using double-differences over short baselines, this approach is usually successful. The critical factor is the ionospheric refraction, which must be modeled and which may prevent a correct resolution of all ambiguities. For kinematic applications, the initialization, i.e., the ambiguity determination, is a necessary initial step. Three static methods have been described in the subsection “Static initialization” of Sect. 3.3.5: (1) using a known (and usually short) baseline (the coordinates of both sites are known), which allows ambiguity resolution after a few observation epochs; (2) static determination of the first baseline; (3) the antenna swap method. The kinematic initialization (Sect. 3.3.5) is the on-the-fly (OTF) method. This is the most advanced technique to resolve phase ambiguities and is described in more detail in Sect. 4.2.3.

Dual-Frequency Phase Data
The situation for the ambiguity resolution improves significantly when using dual-frequency phase data. There are many advantages implied in dual-frequency data because of the various possible linear combinations that can be formed, like the wide-lane and narrow-lane techniques. Denoting the phase data referring to the frequencies f1 and f2 by Φ1 and Φ2, then, according to Eq. (19),

$$\Phi_{21} = \Phi_1 - \Phi_2 \tag{165}$$

is the wide-lane signal. The frequency of this signal is f21 = f1 − f2, and the corresponding wavelength is increased compared to the original wavelengths. The increased wide-lane wavelength λ21 provides an increased ambiguity spacing. This is the key to an easier resolution of the integer ambiguities. To show the principle, consider the phase models in the modified form, cf. Eq. (152):

$$\Phi_1 = a\,f_1 + N_1 - \frac{b}{f_1}\,, \qquad \Phi_2 = a\,f_2 + N_2 - \frac{b}{f_2}\,, \tag{166}$$

with the geometry term a and the ionosphere term b as known from (55). The difference of the two equations gives

$$\Phi_{21} = a\,f_{21} + N_{21} - b\left(\frac{1}{f_1} - \frac{1}{f_2}\right) \tag{167}$$

with the wide-lane quantities


$$\Phi_{21} = \Phi_1 - \Phi_2\,, \qquad f_{21} = f_1 - f_2\,, \qquad N_{21} = N_1 - N_2\,. \tag{168}$$

The adjustment based on the wide-lane model gives wide-lane ambiguities N21, which are more easily resolved than the base carrier ambiguities. To compute the ambiguities for the measured phases (i.e., N1 for Φ1 and N2 for Φ2), divide the first equation of (166) by f1 and (167) by f21:

$$\frac{\Phi_1}{f_1} = a + \frac{N_1}{f_1} - \frac{b}{f_1^2}\,, \qquad \frac{\Phi_{21}}{f_{21}} = a + \frac{N_{21}}{f_{21}} - \frac{b}{f_{21}}\left(\frac{1}{f_1} - \frac{1}{f_2}\right), \tag{169}$$

and the difference of the two equations gives

$$\frac{\Phi_1}{f_1} - \frac{\Phi_{21}}{f_{21}} = \frac{N_1}{f_1} - \frac{N_{21}}{f_{21}} - \frac{b}{f_1^2} + \frac{b}{f_{21}}\left(\frac{1}{f_1} - \frac{1}{f_2}\right). \tag{170}$$

The desired ambiguity N1 follows explicitly after rearranging and multiplying the equation above by f1:

$$N_1 = \Phi_1 - \frac{f_1}{f_{21}}\left(\Phi_{21} - N_{21}\right) + \frac{b}{f_1} - \frac{b}{f_{21}}\left(1 - \frac{f_1}{f_2}\right). \tag{171}$$

The terms reflecting the ionospheric influence may be treated as follows:

$$\frac{b}{f_1} - \frac{b}{f_{21}}\left(1 - \frac{f_1}{f_2}\right) = b\,\frac{f_{21} f_2 - f_1 f_2 + f_1^2}{f_1 f_{21} f_2} = b\,\frac{f_{21} f_2 + f_1 (f_1 - f_2)}{f_1 f_{21} f_2} = b\,\frac{f_2 + f_1}{f_1 f_2}\,, \tag{172}$$

where on the right side the term in parentheses was replaced by the wide-lane frequency f21, which then canceled. Therefore, the phase ambiguity N1 in (171) can be calculated from the wide-lane ambiguity by

$$N_1 = \Phi_1 - \frac{f_1}{f_{21}}\left(\Phi_{21} - N_{21}\right) + b\,\frac{f_1 + f_2}{f_1 f_2} \tag{173}$$
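The identity (172) and the reconstruction (173) are easy to verify numerically. The snippet below uses the GPS L1/L2 carrier frequencies (1575.42 and 1227.60 MHz) together with invented geometry and ionosphere terms purely as example values:

```python
# numerical check of (172) and (173); GPS L1/L2 frequencies in MHz as examples
f1, f2 = 1575.42, 1227.60
f21 = f1 - f2                    # wide-lane frequency, Eq. (168)
a, b = 70.0, 5000.0              # invented geometry and ionosphere terms
N1, N2 = 123456, 123400          # invented true integer ambiguities
N21 = N1 - N2

# identity (172): the two forms of the ionospheric correction agree
lhs = b / f1 - (b / f21) * (1.0 - f1 / f2)
rhs = b * (f1 + f2) / (f1 * f2)
assert abs(lhs - rhs) < 1e-9

# phases from model (166) and the wide-lane signal (167)
Phi1 = a * f1 + N1 - b / f1
Phi2 = a * f2 + N2 - b / f2
Phi21 = Phi1 - Phi2

# Eq. (173) recovers N1 exactly from the wide-lane ambiguity N21
N1_rec = Phi1 - (f1 / f21) * (Phi21 - N21) + b * (f1 + f2) / (f1 * f2)
assert round(N1_rec) == N1
```

The recovery is algebraically exact for noise-free data; in practice the accuracy of N1 is limited by how well the ionospheric term b is known, as discussed next.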


and, in an analogous way, for N2 by exchanging the roles of f1 and f2 in the equation above accordingly. Equation (173) represents the so-called geometry-free linear phase combination, since the geometric distance ρ and the clock bias term δ do not appear explicitly. Note, however, that these terms are implicitly contained in N21, cf. Eq. (167). The ionospheric term is most annoying. The influence of this term will be negligible for short baselines with similar ionospheric refraction at both sites (using differenced phases). For long baselines or irregular ionospheric conditions, however, the ionospheric term may cause problems. To eliminate the ionosphere-dependent term b in the computation of the ambiguities for the measured phases (e.g., referring to f1), one could proceed as follows. Start again with the phase equations (166), multiply the first equation by f1 and the second by f2, and form the difference of the resulting equations; thus,

$$f_2 \Phi_2 - f_1 \Phi_1 = a\,(f_2^2 - f_1^2) + f_2 N_2 - f_1 N_1 \tag{174}$$

is obtained. Eliminating N2 via the relation N2 = N1 − N21 leads to

$$f_2 \Phi_2 - f_1 \Phi_1 = a\,(f_2^2 - f_1^2) - f_2 N_{21} + N_1 (f_2 - f_1) \tag{175}$$

or, introducing f21 = f1 − f2 and dividing the equation by this relation,

$$N_1 = \frac{f_1}{f_{21}}\,\Phi_1 - \frac{f_2}{f_{21}}\left(\Phi_2 + N_{21}\right) - a\,(f_1 + f_2) \tag{176}$$

results. By simple linear algebra it may be verified that Eq. (176) is another representation of the ionosphere-free phase combination, cf. Eq. (57). A final remark concerning the ambiguities is appropriate. Combining the terms containing N1 and N2 into a single term in the geometry-free or ionosphere-free combination destroys the integer nature of the term. This is a kind of vicious circle: either the ambiguities can be resolved but the ionosphere remains a problem, or the ionospheric influence is eliminated, which destroys the integer nature of the ambiguities. The integer nature can be preserved by separately calculating the ambiguities, first N21 and then N1 by (173) or (176).

Combining Dual-Frequency Carrier Phase and Code Data
The most unreliable factor of the wide-lane technique described in the previous paragraph is the influence of the ionosphere, which increases with baseline length. This drawback can be partially overcome by a combination of phase and code data. The models for dual-frequency carrier phases and code ranges, both expressed in cycles of the corresponding carrier, can be written in the form

$$\begin{aligned}
\Phi_1 &= a\,f_1 - \frac{b}{f_1} + N_1\,,\\
\Phi_2 &= a\,f_2 - \frac{b}{f_2} + N_2\,,\\
R_1 &= a\,f_1 + \frac{b}{f_1}\,,\\
R_2 &= a\,f_2 + \frac{b}{f_2}\,,
\end{aligned} \tag{177}$$

with the geometry term a and the ionosphere term b as known from (55). Note that four equations are available with four unknowns for each epoch. The unknowns are a, b, and the ambiguities N1, N2, and they may be expressed explicitly as functions of the measured quantities by inverting the system represented by (177). Multiplying the third equation of (177) by f1 and the fourth by f2 and differencing the resulting equations yields the geometry term

$$a = \frac{1}{f_2^2 - f_1^2}\left(R_2 f_2 - R_1 f_1\right). \tag{178}$$

Multiplying now the third equation of (177) by f2 and the fourth by f1 and differencing the resulting equations yields the ionosphere term

$$b = \frac{f_1 f_2}{f_2^2 - f_1^2}\left(R_1 f_2 - R_2 f_1\right). \tag{179}$$

Substituting (178) and (179) into the first two equations of (177) leads to explicit expressions for the phase ambiguities:

$$N_1 = \Phi_1 + \frac{f_2^2 + f_1^2}{f_2^2 - f_1^2}\,R_1 - \frac{2 f_1 f_2}{f_2^2 - f_1^2}\,R_2\,, \qquad N_2 = \Phi_2 + \frac{2 f_1 f_2}{f_2^2 - f_1^2}\,R_1 - \frac{f_2^2 + f_1^2}{f_2^2 - f_1^2}\,R_2\,. \tag{180}$$

By forming the difference N21 = N1 − N2, finally

$$N_{21} = \Phi_{21} - \frac{f_1 - f_2}{f_1 + f_2}\left(R_1 + R_2\right) \tag{181}$$

is obtained. This rather elegant equation allows for the determination of the wide-lane ambiguity N21 for each epoch and each site. It is independent of the baseline length and of the ionospheric effects. Even if all modeled systematic effects cancel out in (181), the multipath effect remains and affects phase and code differently.


Multipath is almost exclusively responsible for a variation of N21 by several cycles from epoch to epoch. These variations may be overcome by averaging over a longer period. According to Euler and Goad [23] and Euler and Landau [24], the ambiguity resolution for the combination of dual-frequency code data with a reasonably low noise level and phase data will be possible “under all circumstances” with a few epochs of data. The approach described is even appropriate for instantaneous ambiguity resolution in kinematic applications. Hatch [46] mentions that a single-epoch solution is usually possible for short baselines if seven or more satellites can be tracked. Note that several variations of the technique are known.
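Equation (181) can be exercised on synthetic data: generating phases and code ranges (all in cycles) from model (177) with invented values for a, b, N1, and N2, the combination returns the wide-lane ambiguity exactly, epoch by epoch. The frequencies (GPS L1/L2) and the remaining numbers below are assumptions for illustration:

```python
# Eq. (181): epoch-wise wide-lane ambiguity from dual-frequency phase + code
f1, f2 = 1575.42e6, 1227.60e6    # example carrier frequencies [Hz]
a, b = 0.07, 1.5e10              # invented geometry and ionosphere terms
N1, N2 = 12345678, 12300042      # invented true integer ambiguities [cycles]

# observables in cycles, following model (177)
Phi1 = a * f1 - b / f1 + N1
Phi2 = a * f2 - b / f2 + N2
R1 = a * f1 + b / f1
R2 = a * f2 + b / f2

# single-epoch wide-lane ambiguity, Eq. (181)
N21 = (Phi1 - Phi2) - (f1 - f2) / (f1 + f2) * (R1 + R2)
assert round(N21) == N1 - N2
```

In practice, multipath perturbs the epoch values of N21 by several cycles, so they are averaged over a longer period, as described above.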

Combining Triple-Frequency Carrier Phase and Code Data
The technique based on three carriers is denoted as three-carrier ambiguity resolution (TCAR). Before presenting the model equations, a few remarks are appropriate when comparing TCAR with the previously described dual-frequency carrier phase and code data ambiguity resolution. Theoretically, the four unknowns a, b, N1, N2 of (177) can be determined instantaneously by solving the four equations. Thus, in principle the unknowns can be determined epoch by epoch. In reality, fixing the ambiguities N1, N2 to their correct values will be very unlikely even for short baselines because of the magnification of noise associated with the algebraic solution. Therefore, a detour via the wide-lane ambiguities is taken. Similarly, it may be expected that an instantaneous TCAR solution is also possible. This expectation is borne out, as may be seen immediately from the triple-frequency carrier phase and code data model

$$\begin{aligned}
\Phi_1 &= a\,f_1 - \frac{b}{f_1} + N_1\,,\\
\Phi_2 &= a\,f_2 - \frac{b}{f_2} + N_2\,,\\
\Phi_3 &= a\,f_3 - \frac{b}{f_3} + N_3\,,\\
R_1 &= a\,f_1 + \frac{b}{f_1}\,,\\
R_2 &= a\,f_2 + \frac{b}{f_2}\,,\\
R_3 &= a\,f_3 + \frac{b}{f_3}\,,
\end{aligned} \tag{182}$$

where, apart from the two carrier phase data on f1, f2, the third carrier phase on f3 is introduced. This system of six equations contains five unknowns: the geometry term a, the ionosphere term b, and the ambiguities N1, N2, N3. Therefore, the system has redundancy 1 and could be solved by least-squares adjustment. Note, however, referring to the estimated ambiguities, that Sjöberg [111, 112] indicates “that these estimates are too poor to be useful”. By contrast, it is possible to determine a wide-lane ambiguity accurately. In Table 10, specific values (typical for GNSS) for the three frequencies and wide-lane combinations are given.

Table 10 GNSS frequencies and wide-lane combinations

Frequency   MHz    Wavelength [m]
f1          1580   0.19
f2          1230   0.24
f3          1180   0.25
f1 − f3      400   0.75
f1 − f2      350   0.86
f2 − f3       50   6.00

From the dual-frequency approach, cf. Eq. (181), the result

$$N_{21} = \Phi_{21} - \frac{f_1 - f_2}{f_1 + f_2}\left(R_1 + R_2\right) \tag{183}$$

is obtained for the f1 − f2 combination and

$$N_{31} = \Phi_{31} - \frac{f_1 - f_3}{f_1 + f_3}\left(R_1 + R_3\right) \tag{184}$$

for the f1 − f3 combination. Following from the wide-lane definitions N21 = N1 − N2 and N31 = N1 − N3, the individual ambiguities are

$$N_2 = N_1 - N_{21}\,, \qquad N_3 = N_1 - N_{31}\,, \tag{185}$$

where N1 is still unknown and to be determined. These equations are resubstituted into (182), the initial set of model equations:

$$\begin{aligned}
\Phi_1 &= a\,f_1 - \frac{b}{f_1} + N_1\,,\\
\Phi_2 + N_{21} &= a\,f_2 - \frac{b}{f_2} + N_1\,,\\
\Phi_3 + N_{31} &= a\,f_3 - \frac{b}{f_3} + N_1\,,\\
R_1 &= a\,f_1 + \frac{b}{f_1}\,,\\
R_2 &= a\,f_2 + \frac{b}{f_2}\,,\\
R_3 &= a\,f_3 + \frac{b}{f_3}\,,
\end{aligned} \tag{186}$$

where the known wide-lane ambiguities N21 and N31 have been shifted to the left side of the equations. This system of six equations comprises only three unknowns, a, b, and N1; thus the redundancy amounts to 3. Inherently, this combined data set of code and phase measurements reflects two accuracy classes, because the last three code range equations are much less accurate than the first three phase-derived equations. Sjöberg [113] neglects the three code range equations by arguing that they contribute little to the least-squares solution. With the remaining phase equations, the calculation of the three unknowns is still possible for a single epoch. After the successful computation of the N1 ambiguity, the same procedure may be applied accordingly to obtain the other two carrier ambiguities N2 and N3. Vollath et al. [135] use the same set of equations as given in (186) but with an extended modeling of the ionospheric influence and apply a recursive least-squares adjustment. Hatch et al. [49] conclude that over short baselines the ambiguities may be resolved much more quickly (often in a single epoch), whereas for longer baselines there is limited gain from the third frequency. Vollath et al. [135] conclude similarly that the TCAR procedure will generally not suffice to resolve the ambiguities instantaneously, i.e., using data of a single epoch, unless very short baselines are considered. Accumulating several epochs will, on the one hand, reduce the noise; on the other hand, the main error components caused by the ionosphere and multipath remain because of their long correlation times. Therefore, a search for the optimal solution along with a validation is still required. However, the number of possible candidates for this optimal result is substantially reduced.
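Sjöberg's phase-only reduction of (186) is a 3×3 linear system in a, b, and N1 that can be solved for a single epoch. The sketch below does this for noise-free synthetic data (frequencies from Table 10, expressed in GHz for better numerical conditioning; all other values invented) and then repeats the solution with 0.01-cycle phase noise to illustrate the noise magnification discussed above:

```python
import numpy as np

# Table 10 frequencies in GHz (scaling keeps the 3x3 system well conditioned)
f = np.array([1.580, 1.230, 1.180])
a_true, b_true = 7.0e4, 5.0       # invented geometry and ionosphere terms
N1_true = 123456                  # invented true integer ambiguity on f1
N21, N31 = 37, -12                # wide-lane ambiguities, assumed already fixed
N = np.array([N1_true, N1_true - N21, N1_true - N31])

Phi = a_true * f - b_true / f + N           # carrier phases, model (182)
lhs = Phi + np.array([0.0, N21, N31])       # left-hand sides of (186)

# phase-only subsystem of (186): unknowns x = (a, b, N1)
A = np.column_stack([f, -1.0 / f, np.ones(3)])
a_est, b_est, N1_est = np.linalg.solve(A, lhs)   # noise-free: exact recovery

# with 0.01-cycle phase noise the N1 estimate is no longer reliably near an
# integer, illustrating why a search plus validation remains necessary
rng = np.random.default_rng(42)
_, _, N1_noisy = np.linalg.solve(A, lhs + rng.normal(0.0, 0.01, 3))
```

The noise-free solve recovers N1 essentially exactly; the amplification of even centimeter-level phase noise into whole cycles of N1 is the quantitative content of Sjöberg's remark that the direct algebraic estimates are too poor to be useful.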
Several other procedures exist, such as the integrated three-carrier ambiguity resolution (ITCAR), the cascade integer resolution (CIR) [63], which is essentially the same as ITCAR, or the extension of the null space method from the dual-frequency method to the triple-frequency approach [29]. Verhagen and Joosten [132] analyse the concepts and performances of TCAR, ITCAR, CIR, LAMBDA, and the null space method. Martín-Neira et al. [95] mention the investigation of the multiple carrier ambiguity resolution (MCAR), which is elaborated by Werner and Winkel [145] and briefly described in the next paragraph.

Multiple Carrier Ambiguity Resolution
In the near future of GNSS, the data combination will no longer be restricted to dual and triple frequencies because more than a single global positioning system will be available, i.e., the modernized GPS, GLONASS, Galileo, and BeiDou. MCAR will be the future. The somewhat misleading term (dual and triple frequency being also

21 Elementary Mathematical Models for GNSS Positioning

1399

“multiple” are not included) must be understood in the context of the continuously growing number of available frequencies. Numerous simulations therefore focus on possible benefits arising from the combined use of modernized GPS and Galileo, e.g., Werner and Winkel [145], Zhang et al. [153], Julien et al. [62], Sauer et al. [109]. The investigated benefits address the ambiguity resolution itself, the initialization performance, reliability, accuracy, and other aspects. Not only single baselines are considered, but also the influence on a network of baselines is discussed [82]. Feng and Rizos [28] summarize these benefits in a generic way as follows:

• allowing ambiguity resolution over long distances,
• allowing the fixing of correct integer solutions within much shorter periods,
• achieving highly reliable integer solutions,
• enabling RTK positioning in urban areas (where signal obstruction is an issue).

Many more investigations based on simulations are to be expected for years to come before the full potential of real modernized GPS and Galileo data may be exploited.

4.2.3

Search Techniques

A Standard Approach
When processing the data based on double-differences by least-squares adjustment, the ambiguities are estimated as real or floating-point numbers; hence, the first double-difference solution is called the float ambiguity solution. The output is the best estimate of the station coordinates as well as of the double-difference ambiguities. If the baseline is relatively short, say five kilometers, and the observation span relatively long, say one hour, these float ambiguities will typically be very close to integers. Ambiguity resolution in this case will improve the position accuracy. The change in the station coordinates from the float solution to the fixed ambiguity solution should not be large, and in the case when ambiguity resolution fails, the float solution is generally a very good alternative. As the observation span becomes smaller, the float solution will weaken due to loss of information. Ambiguity resolution will then play a more important role, since its effect on the station coordinates will now be significant. If the observation span is further reduced, the success of ambiguity resolution may determine whether or not the user's positioning specifications are met. As this discussion implies, there is a risk associated with a reduction in the observation span: a wrong integer can degrade the position solution significantly. The search space concept can be generated for the static case by considering the position accuracy of the float ambiguity solution. A conceptually simpler approach, however, is to directly use the estimated accuracies of the float ambiguities to set their search range. For example, if an ambiguity is estimated to be 87,457,341.88 cycles with a standard deviation of 0.30 cycles, all the integer ambiguities that fall within ±3 standard deviations of that value (for a high statistical probability) might be searched. This would give potential integer ambiguities of


87,457,340 to 87,457,343 by being conservative. This procedure can be repeated for each of the double-difference ambiguities, and the result is a set of potential integer ambiguity combinations. The number of ambiguity sets to be considered depends on the number of satellites tracked and the search range of the double-difference ambiguities. For example, there are five double-difference ambiguities if six satellites are tracked; if the range for each ambiguity is three cycles, the number of combinations to test is 3^5 = 243. If the search range is increased to five cycles, the total number of combinations is 5^5 = 3,125. Once all the potential ambiguity combinations are identified, each one is tested by constraining (fixing) the ambiguities to the selected integer combination and then computing the measurement residuals. The total redundancy is increased in the fixed ambiguity adjustment since only the station coordinates are estimated. However, the residuals are larger than for the float ambiguity solution. King et al. [68] present a technique by which the influence of various integer ambiguity combinations can be computed from the float ambiguity solution, rather than initiating a new least-squares adjustment for each of the potential ambiguity combinations. The sum of squared residuals is used as the final measure of the fit of the ambiguity combination. The integer ambiguity solution corresponding to the smallest sum of squared residuals should be the candidate which is selected. Due to reasons stated earlier, however, no candidate may be significantly better than the others to warrant selection. A ratio test is often used to make this decision. For example, if the ratio of the second smallest sum of squared residuals to the smallest sum of squared residuals exceeds a threshold of 2 or 3 (depending on the algorithm), then a decision to select the candidate with the smallest sum of squared residuals as the true solution can be made.
Otherwise, no integer ambiguity solution can be determined, and the best estimate for the station coordinates is then the float ambiguity solution. An example given in Cannon and Lachapelle [12] illustrates this concept. On a 720 m baseline, six satellites were tracked for 10 minutes. Using double-differences with satellite 19 as reference, the least-squares approach yielded the following values (in cycles) for the ambiguities:

DD SV      Float ambiguity
 2 − 19     17 329 426.278
 6 − 19     14 178 677.032
11 − 19     11 027 757.713
16 − 19     −1 575 518.876
18 − 19    −15 754 175.795

The abbreviation DD SV indicates double-differences (DD) for the specified space vehicle (SV) numbers. To get integer values, the float solution is simply

21 Elementary Mathematical Models for GNSS Positioning


rounded to the nearest integer values. To check this solution, other possible ambiguity sets are established by varying each ambiguity in a certain range, say by ±2 cycles, so that, apart from the integer solution obtained from the table above, each ambiguity is varied by −2, −1 and +1, +2 cycles. This means that for each ambiguity five cases are checked. Considering the five double-differences, in total 5^5 = 3,125 possible integer sets arise, which are to be compared with respect to the sum of the squared residuals. Subsequently, the results for the three smallest sums of squared residuals (abbreviated as SSR) are given:

DD SV      1st smallest    2nd smallest    3rd smallest
           SSR = 0.044     SSR = 0.386     SSR = 0.453
 2 − 19    17 329 426      17 329 426      17 329 426
 6 − 19    14 178 677      14 178 676      14 178 678
11 − 19    11 027 758      11 027 757      11 027 759
16 − 19    −1 575 519      −1 575 518      −1 575 520
18 − 19   −15 754 176     −15 754 176     −15 754 176

The ambiguity set with the smallest sum of squared residuals is likely to represent the correct integers only if its SSR is significantly smaller than the 2nd smallest SSR. The ratio, which amounts to 0.386/0.044 = 8.8 in the example above, should be greater than 3, a threshold which has been determined empirically. To demonstrate a failure of the ratio test, the same example is taken, but the data set is reduced to 5 minutes instead of the original 10 minutes. The results for the double-difference float solution are:

DD SV      Float ambiguity
 2 − 19     17 329 426.455
 6 − 19     14 178 677.192
11 − 19     11 027 757.762
16 − 19     −1 575 518.471
18 − 19    −15 754 175.411

When checking again the same 3,125 possible integer ambiguity sets as before, the following ambiguity sets represent the best solutions in the sense of minimal sum of squared residuals:


B. Hofmann-Wellenhof

DD SV      1st smallest    2nd smallest    3rd smallest
           SSR = 0.137     SSR = 0.155     SSR = 0.230
 2 − 19    17 329 425      17 329 426      17 329 426
 6 − 19    14 178 675      14 178 677      14 178 675
11 − 19    11 027 757      11 027 758      11 027 756
16 − 19    −1 575 516      −1 575 519      −1 575 518
18 − 19   −15 754 175     −15 754 176     −15 754 175

The ratio test for the smallest and the second smallest yields 0.155/0.137 = 1.1 and, thus, fails. This means that, from the statistical point of view with regard to the sum of squared residuals, the correct solution cannot be extracted safely. Note, however, that the solution with the second smallest sum of squared residuals gives the correct integer ambiguities (as compared to the solution of the full 10-minute data set), but from the chosen ratio criterion this is not recognizable. This shows that the technique of comparing the sums of squared residuals is certainly not the most advanced technique.

Ambiguity Resolution On-the-Fly

The notation "on-the-fly" reflects any type of rover motion. The terms AROF (ambiguity resolution on-the-fly), OTF (on-the-fly), and sometimes OTR (on-the-run) are different abbreviations with the same meaning, namely the development of ambiguity resolution techniques for the kinematic case. Numerous techniques have been developed to deal with the kinematic case. Code ranges are generally used to define the search space for the kinematic case. A relative code range position is used as the best estimate of the antenna location, and the associated standard deviations are used to define the size of the search space. This space can be determined in several ways; for example, it can be a cube, a cylinder, or an ellipsoid. In order to reduce the number of integer ambiguity combinations to be tested, the code solution should be as accurate as possible, which means that receiver selection becomes important. The availability of low-noise, narrow correlator-type code ranges is advantageous since they have a resolution on the order of 10 cm as well as improved multipath reduction compared with standard code receivers.
An example is used to show the direct correlation between the code accuracy and the size of the set of potential ambiguities to be searched. Suppose a standard code receiver is used to define the search cube. The accuracy of the resulting position is approximately 2 m, giving a cube size of 4 m on a side. If six satellites are tracked, there are five double-difference ambiguities to consider. The search range for each ambiguity is approximately 4 m/0.2 m = 20 cycles (where a typical phase wavelength of 0.2 m is considered), giving 20^5 = 3.2 million total combinations. If, in contrast, a narrow correlator-type receiver is used, the accuracy of the resulting


position is approximately 1 m, giving a cube of 2 m on a side and a search range of 2 m/0.2 m = 10 cycles. Under the same six-satellite geometry, the total number of combinations is reduced to 10^5 = 100,000, which is a significant difference. The importance of the carrier phase wide lane should be mentioned here in the context of the number of potential ambiguity combinations. If the above example is repeated using a wide lane with a wavelength of 86 cm (cf. Table 10), the search range shrinks to roughly 2 m/0.86 m ≈ 3 cycles, and the number of potential ambiguity combinations for a narrow correlator-type receiver would be about 3^5 = 243. The advantage of using this observable instead of the original carrier phase is clear, as it tremendously reduces the search time. The only disadvantage of using the wide lane is that the measurement is significantly noisier than the single phase. Many OTF implementations use the wide lane to resolve integer ambiguities and then use the resulting position to directly compute the ambiguities on the original carrier phase data, or at least to significantly limit the number of single-phase ambiguities to be considered. The wide lane is also used extensively for fast static applications where the station occupation time is limited. The OTF techniques have common features like, e.g., the determination of an initial solution; they differ only in how these features are carried out. A summary of the main features is given in Table 11, which is closely related to Erickson [22]. As far as the search technique (domain, space, reduction of trials) is concerned, there are also combinations of several listed characteristics (e.g., Abidin et al. [3]). Illustrative graphic representations of search spaces lead to an easier understanding of the reduction of trials; see Hatch [47], Erickson [21], Frei and Schubernigg [33], and Abidin [2]. The double-difference plane intersection method to reduce the number of trials, as mentioned in Table 11, requires a brief explanation.
Positions are derived from three double-differences with sets of possible ambiguities. Geometrically, each (linearized) double-difference with its trial ambiguity defines a plane in three-dimensional space [46]. Thus, the intersection of three planes yields a possible solution position. The grid spacing is the wavelength of the carrier and is equivalent to the grid spacing in the ambiguity search domain. Minimizing the variance σ₀² as selection criterion is in principle the same as minimizing the sum of the squared residuals. If the position of the receiver is eliminated by a mapping function, as proposed by Walsh [137], the residuals reflect the ambiguities only. The subsequent paragraphs explain some of the many OTF techniques that can be used. Examples are the ambiguity function method, the least-squares ambiguity search, the fast ambiguity resolution approach, the fast ambiguity search filter, the least-squares ambiguity decorrelation adjustment method, and ambiguity determination with special constraints. Numerous approaches may be found in publications. Here are some examples: the fast ambiguity resolution using an integer nonlinear programming method [142]; a maximum likelihood method based on undifferenced phases [71]; the fitting of individual epoch residuals for potential ambiguity candidates to low-order polynomials [9]. Additional methods may be found in the review papers by Chen and Lachapelle [14], Hatch and Euler [48], and Hein [53].
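The search-space sizing used in the receiver comparison above (per-ambiguity range = cube side divided by the wavelength; candidate count = range raised to the number of double-difference ambiguities) can be sketched as follows; the function name is an illustrative assumption:

```python
# Sketch of the candidate-count estimate for an OTF ambiguity search.
# Assumption (as in the worked example in the text): the search range
# per ambiguity is the cube side divided by the carrier wavelength.

def candidate_count(cube_side_m, wavelength_m, n_ambiguities):
    """Estimate the number of integer ambiguity combinations to test."""
    cycles_per_ambiguity = round(cube_side_m / wavelength_m)
    return cycles_per_ambiguity ** n_ambiguities

# Standard code receiver: 4 m cube, 0.2 m wavelength, 5 ambiguities
print(candidate_count(4.0, 0.2, 5))   # 20^5 = 3 200 000
# Narrow correlator-type receiver: 2 m cube
print(candidate_count(2.0, 0.2, 5))   # 10^5 = 100 000
```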


Table 11 Characteristics and options for OTF ambiguity resolution techniques

Initial solution:
• Code solution for position X, Y, Z and its accuracy σX, σY, σZ
• Carrier solution for X, Y, Z and Nj and accuracies σX, σY, σZ, σNj

Search domain:
• Test points (three-dimensional space)
• Ambiguity sets (n-dimensional integer space, where n is the number of ambiguities)

Search space:
• k σX, k σY, k σZ
• k σNj

Determination of k:
• Empirically
• Statistically

Reduction of trials:
• Grid search (fine, coarse)
• Double-difference plane intersection
• Statistically (e.g., correlation of ambiguities)

Selection criterion:
• Maximum ambiguity function
• Minimum variance σ₀²

Acceptance criterion:
• Ratio of largest and second largest ambiguity function
• Ratio of smallest and second smallest variance σ₀²

Observation period:
• Instantaneous
• Some minutes

Data required:
• Single- or dual-frequency
• Phase only or phase and code

Ambiguity Function Method

Counselman and Gourevitch [17] proposed the principle of the ambiguity function; Remondi [102, 103] and Mader [92] investigated this method further. The concept will become clear from the following description. Assume the model (85) for the single-difference phase, represented by

Φ_AB^j(t) = (1/λ) ρ_AB^j(t) + N_AB^j + f δ_AB(t)    (187)

for the receiver sites A and B and the satellite j. If point A is assumed known and B is a selected candidate from the gridded cube, then the term ρ_AB^j(t) is known and may be shifted to the left side of the equation:

Φ_AB^j(t) − (1/λ) ρ_AB^j(t) = N_AB^j + f δ_AB(t).    (188)

The key is to circumvent the ambiguities N_AB^j. A special effect occurs if the term 2π N_AB^j is used as the argument of a cosine or sine function, because N_AB^j is an integer. Therefore, the whole expression (188) is multiplied by 2π and placed into the complex plane by using both sides as the argument of exp{i · }, where i = √−1 is the imaginary unit. In detail,

exp{i [2π Φ_AB^j(t) − (2π/λ) ρ_AB^j(t)]} = exp{i [2π N_AB^j + 2π f δ_AB(t)]},    (189)

where the right side may also be written as

exp{i 2π N_AB^j} exp{i 2π f δ_AB(t)}.    (190)

Fig. 29 Vector representation in the complex plane

It is illustrative to consider this situation in the complex plane (Fig. 29). Note the equivalence

exp{iα} = cos α + i sin α,    (191)

which may be represented as a unit vector with the components cos α and sin α if a real axis and an imaginary axis are used. Therefore,

exp{i 2π N_AB^j} = cos(2π N_AB^j) + i sin(2π N_AB^j) = 1 + i · 0    (192)

results because of the integer nature of N_AB^j. Hence, for one epoch and one satellite, (189) reduces to

exp{i [2π Φ_AB^j(t) − (2π/λ) ρ_AB^j(t)]} = exp{i 2π f δ_AB(t)}    (193)

by applying (190) and (192). Considering n_s satellites and forming the sum over these satellites for the epoch t leads to

Σ_{j=1}^{n_s} exp{i [2π Φ_AB^j(t) − (2π/λ) ρ_AB^j(t)]} = n_s exp{i 2π f δ_AB(t)}.    (194)

Considering more than one epoch, the fact that the clock error δ_AB(t) varies with time must be taken into account. Recall that exp{i 2π f δ_AB(t)} is a unit vector, as indicated in Fig. 29. Thus, taking the absolute value of both sides of (194) and using |exp{i 2π f δ_AB(t)}| = 1, the relation

| Σ_{j=1}^{n_s} exp{i [2π Φ_AB^j(t) − (2π/λ) ρ_AB^j(t)]} | = n_s    (195)

is obtained, where the clock error has now vanished. Take for example four satellites and an error-free situation (i.e., neither measurement errors nor model errors, and correct coordinates for the points A and B). In this case, the evaluation of the left side of (195) should yield 4, where the Φ_AB^j(t) are the single-differences of measured phases and the ρ_AB^j(t) can be calculated from the known points and satellite positions. However, if point B was chosen incorrectly, then the result must be less than 4. In reality, this maximum can probably never be achieved precisely because of measurement errors and incomplete modeling. Thus, the task is restricted to obtaining the maximum of (195) by varying B. With highly stable receiver clocks and close epoch spacing, it is theoretically possible to include more than one epoch within the absolute value. Using n_t epochs, the contributions of all epochs may be summed up by

Σ_{t=1}^{n_t} | Σ_{j=1}^{n_s} exp{i [2π Φ_AB^j(t) − (2π/λ) ρ_AB^j(t)]} | = n_t n_s,    (196)

where for simplicity the same number of satellites at all epochs is assumed. Following Remondi [102, 103], the left side of (196) is denoted as an ambiguity function. Analogous to the case with one epoch, the maximum of the ambiguity function must be found. In general it will, as before, be less than the theoretical value n_t n_s. The ambiguity function procedure is simple. Assume an approximate solution for point B, e.g., achieved by triple-differences. Then, place this solution into the center of a cube (Fig. 30) and partition the cube into grid points. Each grid point is a candidate for the final solution, and the ambiguity function (196) is calculated for all single-differences. The grid point yielding the maximum ambiguity function value, which should theoretically be equal to the total number of single-differences (i.e., n_t n_s), is the desired solution. Having found this solution, the ambiguities could be computed using double-differences. Also, an adjustment using double-differences might be performed to verify the position of B and the ambiguities. The computation of point B with fixed ambiguities is the final step.
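A direct numerical evaluation of the ambiguity function (196) can be sketched as follows; the wavelength, ranges, integers, and clock term below are synthetic illustration data, not from the text:

```python
# Sketch: evaluate the ambiguity function (196) for a trial position.
# For each epoch, exp{i(2π Φ − 2π ρ/λ)} is summed over the satellites,
# the absolute value is taken (the clock term drops out), and the
# epoch contributions are accumulated. The maximum over trial
# positions approaches n_t · n_s for the correct position.
import cmath

def ambiguity_function(phases, ranges, wavelength):
    """phases[t][j]: single-difference phase (cycles), epoch t, satellite j;
    ranges[t][j]: single-difference range (m) for the trial position."""
    total = 0.0
    for phi_epoch, rho_epoch in zip(phases, ranges):
        s = sum(cmath.exp(1j * 2 * cmath.pi * (phi - rho / wavelength))
                for phi, rho in zip(phi_epoch, rho_epoch))
        total += abs(s)  # clock error vanishes in the absolute value
    return total

# Error-free synthetic data: Φ = ρ/λ + integer + common clock term,
# so the function attains its maximum n_t · n_s = 1 · 4 = 4.
lam = 0.19
rho = [[20000.0, 21000.5, 19500.25, 22000.75]]  # 1 epoch, 4 satellites
clk = 0.3                                       # f·δ_AB in cycles
phi = [[r / lam + n + clk for r, n in zip(rho[0], [5, -3, 7, 2])]]
print(ambiguity_function(phi, rho, lam))  # ≈ 4.0 for the correct position
```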


Fig. 30 Search space

It is worth noting that the ambiguity function method is completely insensitive to cycle slips. The reason can easily be seen from Eq. (192). Even if the ambiguity changes by an arbitrary integer amount ΔN_AB^j, then exp{i 2π (N_AB^j + ΔN_AB^j)} is still a unit vector, and the subsequent equations, therefore, remain unchanged. Other methods require that cycle slips be repaired before computing the ambiguity. Remondi [102] shows detailed examples of how to speed up the procedure, how to choose the density of the grid points within the cube, and how to find the correct maximum if there are many relative maxima of the ambiguity function. These considerations are significant, since the computational burden could otherwise become overwhelming. For illustrative purposes, assume a 6 m × 6 m × 6 m cube with a one-centimeter grid. Then 601^3 ≈ 2.17 · 10^8 possible solutions must be checked with the ambiguity function (196).

Least-Squares Ambiguity Search Technique

The method described here is investigated in further detail in Hatch [46, 47]. The least-squares ambiguity search technique (LSAST) requires an approximate solution for the position, which may be obtained from a code range solution. The search area may be established by surrounding the approximate position by a 3σ region. One of the basic principles of the approach is the separation of the satellites into a primary and a secondary group. The primary group consists of four satellites. Based on these four satellites, which should have a good geometric distribution, i.e., a good position dilution of precision (PDOP), the possible ambiguity sets are determined. The remaining secondary satellites are used to eliminate candidates from the possible ambiguity sets. The set of potential solutions may be found in the following way. Assume the simplified double-difference model (164). If the ambiguities are moved to the left side as if they were known, the model reads λ(Φ − N) = ρ, where all indices have been omitted.
For four satellites, three equations of this type may be set up. The three unknown station coordinates contained in the right side of the equation may be solved by linearizing ρ and inverting the 3 × 3 design matrix. Specifying and varying the three ambiguities on the left side gives new position solutions, whereas the inverted design matrix remains unchanged. Depending on the variation of the three ambiguities, the set of potential solutions is obtained. Note that Hatch [46] does not use double-differences but undifferenced phases to avoid any biasing.
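The reuse of the inverted (or factored) 3×3 system while the primary ambiguities are varied can be sketched as follows; the geometry matrix, phases, wavelength, and trial ambiguities are illustrative numbers, not from the text:

```python
# Sketch of the LSAST idea: with four primary satellites, the linearized
# double-difference model λ(Φ − N) = ρ gives three equations A·x = b(N)
# for the station coordinates x. The matrix A depends only on geometry,
# so the same factorization is reused while the three primary
# ambiguities N are varied; only the right side b changes.

def solve3(A, b):
    """Solve a 3x3 system by Gauss-Jordan elimination with partial pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(3):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[r][3] / M[r][r] for r in range(3)]

lam = 0.19                                                     # wavelength (m)
A = [[0.6, 0.3, 0.74], [-0.2, 0.8, 0.56], [0.5, -0.4, 0.77]]   # geometry rows
phase = [105263.4, 98541.7, 110234.9]                          # DD phases (cycles)

def position(N):
    # right side b = λ(Φ − N); varying N changes b but not A
    b = [lam * (p - n) for p, n in zip(phase, N)]
    return solve3(A, b)

for trial in [(105260, 98540, 110230), (105261, 98540, 110230)]:
    print(position(trial))
```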


From the set of potential solutions, incorrect solutions are removed by taking into account the information of the secondary group of satellites. Sequential least-squares adjustment would appropriately be used for this task. Finally, the sum of the squared residuals may be taken as the criterion for the quality of the solution. Ideally, only the true set of ambiguities should remain. If this is not the case, then, as described previously, the solution with the smallest sum of squared residuals should be chosen (after comparing it with the second smallest sum).

Fast Ambiguity Resolution Approach

The development of the fast ambiguity resolution approach (FARA) is given in Frei [31] and summarized in Frei and Schubernigg [33]. Following the latter publication, the main characteristics are (1) to use statistical information from the initial adjustment to select the search range, (2) to use information of the variance-covariance matrix to reject ambiguity sets that are not acceptable from the statistical point of view, and (3) to apply statistical hypothesis testing to select the correct set of integer ambiguities. Following Erickson [21], the FARA algorithm may be partitioned into four steps: (1) computing the float carrier phase solution, (2) choosing the ambiguity sets to be tested, (3) computing a fixed solution for each ambiguity set, and (4) statistically testing the fixed solution with the smallest variance. In the first step, real values for the double-difference ambiguities are estimated based on carrier phase measurements and calculated by an adjustment procedure which also computes the cofactor matrix of the unknown parameters and the a posteriori variance of unit weight (a posteriori variance factor). Based on these results, the variance-covariance matrix of the unknown parameters and the standard deviations of the ambiguities may also be computed.
In the second step, the criteria for the ambiguity ranges to be investigated are based on confidence intervals of the real values of the ambiguities. Therefore, the quality of the initial solution of the first step affects the possible ambiguity ranges. In more detail, if σN represents the standard deviation of the ambiguity N , then ±k σN is the search range for this ambiguity, where k is derived statistically from Student’s t-distribution. This is the first criterion for selecting possible ambiguity sets. A second criterion is the use of the correlation of the ambiguities. Assuming the double-difference ambiguities Ni and Nj and the difference Nij = Nj − Ni ,

(197)

the standard deviation follows from the error propagation law as

σ_Nij = √( σ²_Ni − 2 σ_NiNj + σ²_Nj ),    (198)

where σ²_Ni, σ_NiNj, and σ²_Nj are contained in the variance-covariance matrix of the parameters. The search range for the ambiguity difference Nij is k_ij σ_Nij, where


k_ij is analogous to the k used for the search ranges of the individual double-difference ambiguities. This criterion significantly reduces the number of possible integer sets. An even more impressive reduction is achieved if dual-frequency phase measurements are available. Illustrative figures demonstrating this reduction are given in Frei and Schubernigg [33]. In the third step, a least-squares adjustment with fixed ambiguities is performed for each statistically accepted ambiguity set, yielding adjusted baseline components and a posteriori variance factors. In the fourth and final step, the solution with the smallest a posteriori variance is further investigated. The baseline components of this solution are compared with the float solution. If the solution is compatible, it is accepted. As shown in Erickson [21], the compatibility may be checked by a χ²-test of the a posteriori variance against the a priori variance. Furthermore, another test may be applied to ensure that the second smallest variance is sufficiently less likely than the smallest variance. Note, however, that these two variances are not independent [118, Sect. 8.2.3]. As seen from the algorithm, FARA only requires data for double-difference phases; thus, in principle, neither code data nor dual-frequency data are required; however, if these data are added, the number of possible ambiguity sets decreases dramatically (see the second step of the algorithm). Euler et al. [25] present an efficient and rapid search technique, similar to FARA, based on the a posteriori variance (resulting from the sum of the squared residual errors). First, an integer set of ambiguities is introduced into the adjustment computation as constraints, leading to an initial solution and the corresponding a posteriori variance. The influence of other ambiguity sets on the initial solution and the a posteriori variance is then determined without recomputing the whole adjustment.
This influence may be calculated by some simple matrix and vector operations, where only a reduced matrix with the dimension of the constrained ambiguities must be inverted. Following Landau and Euler [80], the computation time for the matrix inversion may be optimized when the Cholesky factorization method is applied, which decomposes a symmetric matrix into the product of a lower and an upper triangular matrix. The impact of a changed ambiguity set on the sum of the squared residuals may be reduced by the Cholesky factorization to the computation of an inner product of two vectors. Furthermore, not even the full inner product must be computed in all cases. Based on a threshold, the computation of the inner product for some integer ambiguity sets may be interrupted and the corresponding ambiguity set rejected. The search techniques described so far performed the search in the ambiguity domain. An alternate technique substitutes the position as known and solves for the ambiguities as unknowns. This could be performed in the following way. Eliminate the ambiguities by forming triple-differences and obtain a first estimate for the position and its standard deviation σ by an adjustment. Now center the approximate position within a cube of dimension ±3σ in each coordinate direction and partition the cube into a regular spatial grid. The cube, thus, contains a matrix of points where the center point is the triple-difference solution (Fig. 30). Each of these grid points


is considered a candidate for the correct solution. Consequently, one by one, each candidate position is substituted into the observation equation. Then the adjustment (holding the trial position fixed) is performed and the ambiguities are computed. When all points within the cube have been considered, select the solution where the estimated real values of the ambiguities appear as close as possible to integer values. Now, fix the ambiguities to these integer values and compute (holding the ambiguities fixed) the final position which will, in general, be slightly different from the corresponding grid point of the cube.
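FARA's candidate generation and pruning, as described above (individual ranges ±k σ_N, pairwise ranges from Eq. (198)), can be sketched as follows; the float values, covariance matrix, and function name are illustrative assumptions:

```python
# Sketch of FARA's second step: enumerate integer sets within ±k·σ of
# each float ambiguity and discard sets whose pairwise differences fall
# outside ±k·σ_Nij, with σ_Nij from Eq. (198). All numbers illustrative.
import itertools
import math

def fara_candidates(float_amb, cov, k=3.0):
    n = len(float_amb)
    # individual search ranges: all integers within ±k·σ_Ni
    ranges = []
    for i in range(n):
        s = math.sqrt(cov[i][i])
        lo = math.ceil(float_amb[i] - k * s)
        hi = math.floor(float_amb[i] + k * s)
        ranges.append(range(lo, hi + 1))
    kept = []
    for cand in itertools.product(*ranges):
        ok = True
        for i in range(n):
            for j in range(i + 1, n):
                sij = math.sqrt(cov[i][i] - 2 * cov[i][j] + cov[j][j])  # Eq. (198)
                dfloat = float_amb[j] - float_amb[i]
                if abs((cand[j] - cand[i]) - dfloat) > k * sij:
                    ok = False
                    break
            if not ok:
                break
        if ok:
            kept.append(cand)
    return kept

floats = [4.3, -2.1, 7.8]
cov = [[0.09, 0.05, 0.04], [0.05, 0.09, 0.05], [0.04, 0.05, 0.09]]
print(len(fara_candidates(floats, cov)))  # far fewer than the raw product of ranges
```

The correlation criterion is what makes the pruning effective: of the eight sets allowed by the individual ranges in this toy example, only a few survive the pairwise tests.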

Fast Ambiguity Search Filter

Following Chen [13] and Chen and Lachapelle [14], the fast ambiguity search filter (FASF) algorithm comprises basically three components: (1) a Kalman filter is applied to predict a state vector which is treated as observable, (2) the search of the ambiguities is performed at every epoch until they are fixed, and (3) the search ranges for the ambiguities are computed recursively and are related to each other. By applying the Kalman filter, information from the initial epoch up to the current epoch is taken into account. The state vector of the Kalman filter also contains the ambiguities, which are estimated as real numbers if they cannot be fixed. After fixing the ambiguities, the state vector is modified accordingly. The state vector of the Kalman filter is considered an observable and establishes, along with the regular observables (i.e., double-difference phase equations), the design matrix. The recursively determined search ranges are based on the a priori geometric information and on the effect of other (preliminarily) fixed ambiguities. As an example, take the case of four double-difference ambiguities. The first ambiguity is computed without fixing any other ambiguity. The search range for the second ambiguity is computed with the first ambiguity introduced as a known integer quantity (although it may even be the wrong integer number), the search range for the third ambiguity is computed with the first and the second ambiguity introduced as known integer quantities, and the procedure is continued for the fourth ambiguity. According to Chen and Lachapelle [14], this concept is denoted as recursive computation of the search range. To avoid large search ranges, a computational threshold is used. Ambiguities which cross this threshold are not fixed but computed as real numbers. Thus, an attempt to fix the ambiguities is only made if the number of potential ambiguity sets is below this threshold.
Under normal circumstances, the number of potential ambiguity sets should decrease with accumulating observations. Ideally, there should finally remain a single potential ambiguity set. In practice, however, this will usually not be the case so that, conventionally, a ratio test of the sum of the squared residuals between the minimum and the second-best minimum is calculated. If this ratio fulfills a specified criterion number, the minimum solution is considered to yield the true set of ambiguities. Once the ambiguities are fixed properly, they are removed from the state vector of the Kalman filter, i.e., from the estimation. Accordingly, the corresponding observation equation is rearranged.
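The recursive computation of the search range can be illustrated with a simplified two-ambiguity conditioning step. This sketch uses the standard conditional least-squares update (conditional mean and variance) in the spirit of the FASF description; it is not the full filter, and all numerical values are illustrative:

```python
# Sketch: once ambiguity 1 is constrained to an integer, the float
# estimate and variance of ambiguity 2 are conditioned on that value
# before its own search range k·σ is formed. Standard conditional
# least-squares formulas; two-ambiguity case, illustrative numbers.
import math

def condition_on_first(float_amb, cov, fixed_value):
    """Condition ambiguity 2 on ambiguity 1 being fixed to an integer."""
    n1_hat, n2_hat = float_amb
    q11, q12, q22 = cov[0][0], cov[0][1], cov[1][1]
    n2_cond = n2_hat + q12 / q11 * (fixed_value - n1_hat)   # conditional mean
    q2_cond = q22 - q12 ** 2 / q11                          # conditional variance
    return n2_cond, q2_cond

floats = (4.4, 7.6)
cov = [[0.25, 0.20], [0.20, 0.25]]
n2c, q2c = condition_on_first(floats, cov, fixed_value=4)
print(n2c, math.sqrt(q2c))  # conditioned estimate with a reduced sigma
```

Because the conditional variance (0.09 here) is smaller than the unconditional one (0.25), the search range for the second ambiguity shrinks once the first is held fixed, which is the effect the recursive scheme exploits.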


The ranges of the loops for the ambiguities, i.e., the uncertainties, are calculated by using a least-squares approach with parameter elimination. First, the parameters representing the station coordinates are eliminated from the normal equations so that the ambiguities are the only remaining parameters of the model based on double-differences. Furthermore, according to the previous discussion on the loops associated with the ambiguities, the ambiguities of the outer loops are constrained as integers. Returning to the example of the four ambiguities: if the range of the third ambiguity is to be determined, the first and the second ambiguity are assumed to be known and introduced as constraints (which is equivalent to removing them from the estimation vector). In fact, this may be done very efficiently, as shown in Chen and Lachapelle [14], where only single rows and columns of the adjustment matrices must be taken into account. As a result of this parameter elimination, a float estimate of the corresponding ambiguity and its variance are finally obtained. Multiplying the variance by a scale factor and subtracting and adding this result with respect to the float solution yields the search range for this specific ambiguity. Note that if the uncertainty ranges are not calculated correctly, the true ambiguity set may not be found.

Least-Squares Ambiguity Decorrelation Adjustment Method

Teunissen [114] proposed the idea of and further developed the least-squares ambiguity decorrelation adjustment (LAMBDA) method. A fairly detailed description of Teunissen's method is (slightly modified) given here. At present, this method is both theoretically and practically at the top level among the ambiguity determination methods. The conventional formulation of the adjustment by parameters is

v^T P v = minimum!,    (199)

where v is the vector of residuals and P is the weight matrix. This formulation implies that the weighted sum of squared residuals is minimized. As commonly used, the weight matrix equals the inverse of the cofactor matrix Q of the observations. Consequently,

v^T Q⁻¹ v = minimum!    (200)

is an equivalent relation. Applying the least-squares adjustment method to, e.g., relative positioning based on double-difference phase observations, the unknowns to be determined are the coordinate increments for the unknown station and the double-difference ambiguities. The values obtained from the adjustment procedure are, in the sense of this minimum principle, the most likely ones. However, the double-difference ambiguities are obtained as real values but should be integer values. The main objective is, thus, to obtain the integer ambiguities which are the most likely ones. Denoting the vector of adjusted float ambiguities by N̂ and the vector of the corresponding integer ambiguities by N, the difference between the two vectors may be regarded as


residuals of the ambiguities. Consequently, it makes sense to minimize these residuals again by the same principle, i.e., the weighted sum of squared residuals. Following Teunissen et al. [127],

χ²(N) = (N̂ − N)^T Q_N̂⁻¹ (N̂ − N) = minimum!    (201)

is obtained, where Q_N̂ is the cofactor matrix of the adjusted float ambiguities. Note that sometimes the covariance matrix is used instead of the cofactor matrix. As shown in Hofmann-Wellenhof et al. [54, Eq. (7.69)], these two matrices differ only by a factor. The solution of this problem is denoted as the integer least-squares estimate of the ambiguities. Certainly, an approach different from the usual least-squares adjustment calculation must be chosen to account for the integer nature of the still unknown ambiguities N. The following simple example demonstrates the solution principle. Considering two ambiguities and assuming Q_N̂ to be the diagonal matrix

Q_N̂ = [ q_N̂1N̂1  0 ;  0  q_N̂2N̂2 ],    (202)

Eq. (201) yields the result

χ²(N) = (N̂1 − N1)² / q_N̂1N̂1 + (N̂2 − N2)² / q_N̂2N̂2.    (203)

The minimum is achieved if the Ni are chosen as those integer values nearest to the real values. In other words, rounding the real-valued ambiguities to their nearest integer values yields the desired minimum for χ²(N). Since Q_N̂ was assumed to be diagonal, the resulting N1 and N2 are still fully decorrelated, which is also evident from Eq. (203). Geometrically, if two coordinate axes are associated with the ambiguities N1 and N2, this equation represents an ellipse centered around the ambiguities N̂ and with the semiaxes

a = χ(N) √q_N̂1N̂1 ,
b = χ(N) √q_N̂2N̂2 ,    (204)

where χ(N) acts as a scale factor. The axes of the ellipse are parallel to the directions of the coordinate axes. This ellipse is regarded as an ambiguity search space. Mathematically, the two integer ambiguities are contained in the two-dimensional integer space. In reality, Q_N̂ will be a fully occupied symmetric matrix. The result is still an ellipse, but its axes are rotated with respect to the coordinate system associated with


N1 and N2, which implies a correlation of the two ambiguities, so that it is more complicated to find the minimum for χ²(N). In other words, the rounding-to-the-nearest-integer principle no longer works. To return to this convenient feature, the idea is to apply a transformation that decorrelates the ambiguities, which means that the transformed covariance matrix of the ambiguities becomes a diagonal matrix. Finding a transformation that produces a diagonal matrix for Q_N̂ seems to be trivial, since an eigenvalue decomposition yields a diagonal matrix as output. Explicitly, each symmetric matrix

Q = [ q11  q12 ;  q12  q22 ]    (205)

can be transformed into the diagonal matrix

Q′ = [ λ1  0 ;  0  λ2 ].    (206)

The eigenvalues are defined by

λ1 = (1/2) (q11 + q22 + w) ,
λ2 = (1/2) (q11 + q22 − w) ,    (207)

with the auxiliary quantity

w = √( (q11 − q22)² + 4 q12² ).    (208)

The two eigenvectors are orthogonal to each other and are defined by the rotation angle ϕ, which can be calculated by tan 2ϕ =

2q12 . q11 − q22

(209)
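Equations (205) through (209) can be verified numerically. The following sketch is an illustration added here, not part of the original text; the helper name `ellipse_parameters` is arbitrary, and the quadrant-safe `arctan2` is used as a convenient way of evaluating (209). The cofactor matrix anticipates the numerical example treated later in this section:

```python
import numpy as np

def ellipse_parameters(Q):
    """Eigenvalues and semimajor-axis direction of a symmetric 2x2 matrix,
    following Eqs. (207)-(209)."""
    q11, q12, q22 = Q[0, 0], Q[0, 1], Q[1, 1]
    w = np.sqrt((q11 - q22) ** 2 + 4.0 * q12 ** 2)            # Eq. (208)
    lam1 = 0.5 * (q11 + q22 + w)                              # Eq. (207)
    lam2 = 0.5 * (q11 + q22 - w)
    phi = 0.5 * np.degrees(np.arctan2(2.0 * q12, q11 - q22))  # Eq. (209)
    return lam1, lam2, phi

# Cofactor matrix of the numerical example discussed later in this section:
Q = np.array([[53.4, 38.4], [38.4, 28.0]])
lam1, lam2, phi = ellipse_parameters(Q)

# The closed-form eigenvalues agree with a numerical eigendecomposition;
# the square roots of the eigenvalues are the semiaxes a and b.
assert np.allclose(sorted([lam1, lam2]), np.linalg.eigvalsh(Q))
print(np.sqrt(lam1), np.sqrt(lam2), phi)   # approx. 9.0, 0.5, 35.9 deg
```

The values reproduce (up to rounding) the semiaxes and rotation angle quoted for this matrix further below.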

The only problem is that the integer ambiguities $\mathbf N$ must also be transformed and must preserve their integer nature. Thus, an ordinary eigenvalue decomposition will not work. Generally, the task may be formulated in the following way. The ambiguities $\mathbf N$ and $\hat{\mathbf N}$ are reparameterized by a matrix $\mathbf Z$. Note that Teunissen uses the transposed matrix $\mathbf Z^T$, but the principle remains the same. Hence,


B. Hofmann-Wellenhof

$$
\begin{aligned}
\mathbf N' &= \mathbf Z\,\mathbf N\,,\\
\hat{\mathbf N}' &= \mathbf Z\,\hat{\mathbf N}\,,\\
\mathbf Q_{\hat N'} &= \mathbf Z\,\mathbf Q_{\hat N}\,\mathbf Z^T\,,
\end{aligned}
\tag{210}
$$

where the transformation of the cofactor matrix is obtained by applying the error propagation law. The ambiguities $\mathbf N'$ obtained after the transformation must remain integer values. This restricts the matrix $\mathbf Z$ to a specific class of transformations for which three conditions must be fulfilled [115, 117]: (1) the elements of the transformation matrix $\mathbf Z$ must be integer values, (2) the transformation must be volume preserving, and (3) the transformation must reduce the product of all ambiguity variances. Note that the inverse of the transformation matrix $\mathbf Z$ must also consist of integer values only, because upon a retransformation of the (determined) integer ambiguities $\mathbf N'$, the integer nature of the ambiguities must be kept. For the two-dimensional example shown, volume preserving reduces to area preserving of the ellipse represented by the two-dimensional cofactor (covariance) matrix. If the three conditions are fulfilled, the transformed integer ambiguities are again integer values, and the cofactor (covariance) matrix of the transformed ambiguities is more diagonal than the cofactor (covariance) matrix of the original ambiguities [115]. The Gauss transformation is one of the possible candidates and may either be expressed by

$$
\mathbf Z_1 = \begin{bmatrix} 1 & 0 \\ \alpha_1 & 1 \end{bmatrix}, \qquad
\alpha_1 = -\mathrm{INT}\!\left[q_{\hat N_1 \hat N_2}/q_{\hat N_1 \hat N_1}\right],
\tag{211}
$$

or by the other form

$$
\mathbf Z_2 = \begin{bmatrix} 1 & \alpha_2 \\ 0 & 1 \end{bmatrix}, \qquad
\alpha_2 = -\mathrm{INT}\!\left[q_{\hat N_1 \hat N_2}/q_{\hat N_2 \hat N_2}\right],
\tag{212}
$$

since the roles of the two ambiguities may be interchanged. In the transformation (211), the ambiguity $\hat N_1$ remains unchanged and $\hat N_2$ is transformed. Analogously, by (212), $\hat N_2$ may be kept unchanged and $\hat N_1$ is transformed. For a better distinction of the two transformations, the subscripts 1 and 2 were introduced. Here, the operator INT performs the rounding to the nearest integer. The theoretical background of the transformation procedure comprises the conditional least-squares estimate [115].
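The three conditions on $\mathbf Z$ can be checked numerically for a single Gauss transformation. The sketch below is an added illustration; the helper name `gauss_z` is arbitrary, `np.rint` plays the role of the INT operator, and the cofactor matrix anticipates the numerical example that follows:

```python
import numpy as np

def gauss_z(Q, change_first=True):
    """Gauss transformation of Eqs. (211)/(212): an integer matrix that
    reduces the correlation of a 2x2 ambiguity cofactor matrix Q."""
    if change_first:                              # Z2 of (212): change N1, keep N2
        alpha = -int(np.rint(Q[0, 1] / Q[1, 1]))
        return np.array([[1, alpha], [0, 1]])
    alpha = -int(np.rint(Q[0, 1] / Q[0, 0]))      # Z1 of (211): change N2, keep N1
    return np.array([[1, 0], [alpha, 1]])

Q = np.array([[53.4, 38.4], [38.4, 28.0]])
Z2 = gauss_z(Q, change_first=True)
Qp = Z2 @ Q @ Z2.T                                # error propagation, Eq. (210)

# Condition (1): integer entries; condition (2): volume preserving, i.e.
# |det Z| = 1 (hence the inverse is integer as well); condition (3): the
# product of the ambiguity variances is reduced.
assert Z2.dtype.kind == 'i' and abs(round(np.linalg.det(Z2))) == 1
assert Qp[0, 0] * Qp[1, 1] < Q[0, 0] * Q[1, 1]
print(Z2)   # [[1 -1], [0 1]]
print(Qp)   # approx. [[4.6 10.4], [10.4 28.0]]
```

The printed results match the transformed cofactor matrix derived in the worked example below.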


The transformed ambiguities are obtained from

$$
\begin{bmatrix} \hat N_1' \\ \hat N_2' \end{bmatrix}
=
\begin{bmatrix} 1 & -\mathrm{INT}\!\left[q_{\hat N_1 \hat N_2}/q_{\hat N_2 \hat N_2}\right] \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \hat N_1 \\ \hat N_2 \end{bmatrix}.
\tag{213}
$$

For a numerical example, Teunissen [118, Sect. 8.5.2] assumed that after a least-squares adjustment the ambiguities

$$
\hat{\mathbf N} = \begin{bmatrix} \hat N_1 \\ \hat N_2 \end{bmatrix}
= \begin{bmatrix} 1.05 \\ 1.30 \end{bmatrix}
$$

and

$$
\mathbf Q_{\hat N} = \begin{bmatrix} q_{\hat N_1 \hat N_1} & q_{\hat N_1 \hat N_2} \\ q_{\hat N_1 \hat N_2} & q_{\hat N_2 \hat N_2} \end{bmatrix}
= \begin{bmatrix} 53.4 & 38.4 \\ 38.4 & 28.0 \end{bmatrix}
$$

were calculated. Now the transformation is applied to $\mathbf Q_{\hat N}$. Translating the matrix elements to variances, the ambiguity $\hat N_1$ has a larger variance than $\hat N_2$. Hence it is preferable first to change $\hat N_1$ and keep $\hat N_2$ unchanged, i.e., to apply a transformation based on $\mathbf Z_2$. From (212),

$$
\alpha_2 = -\mathrm{INT}\!\left[q_{\hat N_1 \hat N_2}/q_{\hat N_2 \hat N_2}\right]
= -\mathrm{INT}[38.4/28.0] = -1
$$

and

$$
\mathbf Z_2 = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}
$$

are obtained. The transformation according to (210) reads

$$
\mathbf Q_{\hat N'} = \mathbf Z_2\,\mathbf Q_{\hat N}\,\mathbf Z_2^T
= \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 53.4 & 38.4 \\ 38.4 & 28.0 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}
$$

and gives

$$
\mathbf Q_{\hat N'} = \begin{bmatrix} 4.6 & 10.4 \\ 10.4 & 28.0 \end{bmatrix}.
$$

The effect of this transformation can be seen best if the ambiguity search space, represented by the standard ellipse (which is centered around the corresponding ambiguities), is considered. The parameters of the standard ellipse follow from (205)


Fig. 31 Ambiguity search space for QNˆ (left) and transformed ambiguity search spaces for QNˆ  (center) and QNˆ  (right)

through (209) if $\mathbf Q$ is replaced by $\mathbf Q_{\hat N}$ and $\mathbf Q_{\hat N'}$ respectively. The eigenvalues of the matrices equal the squared semiaxes of the ellipse, and $\varphi$ defines the direction of the semimajor axis. Explicitly, the data

$$
\begin{aligned}
\mathbf Q_{\hat N}:&\quad a = 9.0,\; b = 0.5,\; \varphi = 35^\circ,\\
\mathbf Q_{\hat N'}:&\quad a = 5.7,\; b = 0.8,\; \varphi = 69^\circ
\end{aligned}
$$

are obtained. Graphically, the standard ellipses are shown in Fig. 31. The standard ellipse for $\mathbf Q_{\hat N}$ is centered around the ambiguities $\hat{\mathbf N}$, i.e., the origin is at $\hat N_1 = 1.05$ and $\hat N_2 = 1.30$. The standard ellipse for $\mathbf Q_{\hat N'}$ is centered around the ambiguities $\hat{\mathbf N}' = \mathbf Z_2\,\hat{\mathbf N}$, i.e., the origin follows from $\hat{\mathbf N}$ and amounts to $\hat N_1' = -0.25$ and $\hat N_2' = 1.30$. In Fig. 31, search windows are also indicated with sides parallel to the two axes of the two-dimensional integer search space, i.e., two horizontal and two vertical tangents of the ellipse. The "volumes" of the two ellipses are the same because the transformation is volume preserving, but the shape and the orientation of the ellipse have changed. The distance between the two horizontal tangents has not changed because these two tangents bound the search range for the $N_2$ ambiguity, which remained unaltered by the $\mathbf Z_2$ transformation, whereas the distance of the two vertical tangents has changed. Each grid point represents one pair of ambiguities. Under the assumption that each grid point of the search window must be regarded as a possible candidate to be investigated for a reasonable solution, the advantage of the transformed search space becomes obvious. Comparing the off-diagonal elements of $\mathbf Q_{\hat N}$ and of the transformed $\mathbf Q_{\hat N'}$, the decrease of correlation is evident. Another transformation may now be applied to $\mathbf Q_{\hat N'}$. Since the ambiguity $\hat N_2$ now has a larger variance than $\hat N_1$, it is preferable to change $\hat N_2$ and keep $\hat N_1$ unchanged, i.e., to apply a transformation based on $\mathbf Z_1$. First, from (211),

$$
\alpha_1 = -\mathrm{INT}\!\left[q_{\hat N_1 \hat N_2}/q_{\hat N_1 \hat N_1}\right] = -\mathrm{INT}[10.4/4.6] = -2
$$


is determined, giving

$$
\mathbf Z_1 = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}
$$

and

$$
\mathbf Q_{\hat N''} = \mathbf Z_1\,\mathbf Q_{\hat N'}\,\mathbf Z_1^T
= \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}
\begin{bmatrix} 4.6 & 10.4 \\ 10.4 & 28.0 \end{bmatrix}
\begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix},
$$

where the double prime expresses that the transformation is applied to the once-transformed matrix. The result is

$$
\mathbf Q_{\hat N''} = \begin{bmatrix} 4.6 & 1.2 \\ 1.2 & 4.8 \end{bmatrix}.
$$

The standard ellipse for $\mathbf Q_{\hat N''}$ is given by $a = 2.4$, $b = 1.9$, $\varphi = 47^\circ$ and is shown in Fig. 31. It is centered around the ambiguities $\hat{\mathbf N}'' = \mathbf Z_1\,\hat{\mathbf N}'$, i.e., the origin follows from $\hat{\mathbf N}'$ and amounts to $\hat N_1'' = -0.25$ and $\hat N_2'' = 1.80$. As far as the search window is concerned, the effect may easily be seen from the much smaller search area (represented by the window) of $\mathbf Q_{\hat N''}$. Accordingly, the distance between the two vertical tangents has not changed because these two tangents bound the search range for the $N_1$ ambiguity, which remained unaltered by the $\mathbf Z_1$ transformation, whereas the distance of the two horizontal tangents has changed. Comparing the off-diagonal elements of $\mathbf Q_{\hat N'}$ and of the transformed $\mathbf Q_{\hat N''}$, the decrease of correlation is evident. However, the ambiguities are still not fully decorrelated. The two transformations may also be combined to a single transformation. Using $\mathbf Q_{\hat N''} = \mathbf Z_1\,\mathbf Q_{\hat N'}\,\mathbf Z_1^T$ and substituting $\mathbf Q_{\hat N'} = \mathbf Z_2\,\mathbf Q_{\hat N}\,\mathbf Z_2^T$ leads to

$$
\mathbf Q_{\hat N''} = \underbrace{\mathbf Z_1 \mathbf Z_2}_{\textstyle \mathbf Z}\,\mathbf Q_{\hat N}\,\underbrace{\mathbf Z_2^T \mathbf Z_1^T}_{\textstyle \mathbf Z^T}\,,
$$

where

$$
\mathbf Z = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & -1 \\ -2 & 3 \end{bmatrix},
$$

so that now the single transformation matrix $\mathbf Z$ represents the composition of the $\mathbf Z_2$ and the $\mathbf Z_1$ transformation.
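The two-step decorrelation of this example generalizes to an iterative procedure: always transform the ambiguity with the larger variance and stop once the rounded coefficient becomes zero. The following sketch is an added illustration (the helper name `decorrelate_2d` is not from the text); it reproduces the numbers of the worked example:

```python
import numpy as np

def decorrelate_2d(Q, max_iter=20):
    """Iterate the Gauss transformations of Eqs. (211)/(212), transforming
    the ambiguity with the larger variance, until the rounded coefficient
    becomes zero (illustrative two-dimensional sketch)."""
    Q = Q.astype(float)
    Z = np.eye(2, dtype=int)
    for _ in range(max_iter):
        if Q[0, 0] >= Q[1, 1]:                 # N1 larger variance -> Z2 step
            a = -int(np.rint(Q[0, 1] / Q[1, 1]))
            Zs = np.array([[1, a], [0, 1]])
        else:                                  # N2 larger variance -> Z1 step
            a = -int(np.rint(Q[0, 1] / Q[0, 0]))
            Zs = np.array([[1, 0], [a, 1]])
        if a == 0:
            break
        Q = Zs @ Q @ Zs.T
        Z = Zs @ Z
    return Z, Q

Q = np.array([[53.4, 38.4], [38.4, 28.0]])
Z, Qzz = decorrelate_2d(Q)
print(Z)    # [[1 -1], [-2 3]] -- the combined Z derived in the text
print(Qzz)  # approx. [[4.6 1.2], [1.2 4.8]]

# Decorrelated float ambiguities and simple rounding:
Nfloat = np.array([1.05, 1.30])
Nz = Z @ Nfloat                           # [-0.25, 1.80], cf. the text
Nfixed = np.linalg.inv(Z) @ np.rint(Nz)   # back-transformed integer candidate
```

Since $|\det \mathbf Z| = 1$, the inverse used in the last line is again an integer matrix, so the back-transformed candidate keeps the integer nature of the ambiguities.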


The extension of the reparameterization of the ambiguity search space to higher dimensions is possible. Teunissen [118, Sect. 8.5.3] gives the decorrelating ambiguity transformation $\mathbf Z$ for the three-dimensional case, which would apply if double-differences of four satellites are used, and a twelve-dimensional transformation for seven satellites and dual-frequency data. Rizos and Han [107] propose an iterative procedure to generate the decorrelating ambiguity transformation $\mathbf Z$. Note that the ambiguity search space becomes an ellipsoid for the three-dimensional example and an n-dimensional hyperellipsoid for n > 3. After the decorrelation of the ambiguities by the $\mathbf Z$ transformation, the task of actually estimating the integer ambiguities remains. The search can be carried out efficiently by using the sequential conditional adjustment. This adjustment determines the ambiguities step by step (i.e., sequentially), one after the other. For the i-th ambiguity to be estimated, the previously determined i − 1 ambiguities are fixed (i.e., conditional). The ambiguities resulting from the sequential conditional least-squares adjustment are uncorrelated, which means that the effect of the $\mathbf Z$ transformation is not destroyed. An overview of the procedure is given in de Jonge and Tiberius [58], and some details are covered in Teunissen [118, Sect. 8.3.2]. Details on the actual discrete search strategy are given in Teunissen [115], Teunissen et al. [126], and Teunissen [118, Sects. 8.3.2, 8.5.3].

In summary, Teunissen's LAMBDA method may be separated into the following steps:

1. A conventional least-squares adjustment is carried out to yield the baseline components and float ambiguities.
2. Using the $\mathbf Z$ transformation, the ambiguity search space is reparameterized to decorrelate the float ambiguities.
3. Using the sequential conditional least-squares adjustment together with a discrete search strategy, the integer ambiguities are estimated. By the inverse transformation $\mathbf Z^{-1}$, the ambiguities are retransformed to the original ambiguity space, where the baseline components are given. Since $\mathbf Z^{-1}$ consists only of integer elements, the integer nature of the ambiguities is kept.
4. The integer ambiguities are fixed as known quantities, and another conventional least-squares adjustment is performed to determine the final baseline components.

Ambiguity Determination with Special Constraints
Several multiple-receiver methods for kinematic applications exist. One common procedure of this technique is to place two or more receivers at fixed locations (usually short distances apart) on the moving object. Since the locations of the antennas are fixed, constraints (e.g., the distance between two antennas) may be formulated, which can be used to increase the efficiency of the ambiguity resolution. In principle, the gain from using constraints is a reduction of the number of potential ambiguity sets. This is illustrated briefly by two examples.

The first example, taken from Lu and Cannon [91], concerns attitude determination in a marine environment and employs the distances between the antennas on


Fig. 32 Aircraft-to-aircraft GNSS positioning with four receivers

a ship as constraints for the ambiguity resolution. Here, only the principle of the ambiguity resolution with the constraint of the known distance for a single baseline is described. Referring to the double-difference model (164), four satellites yielding three double-differences are considered. Analogously to the procedure for the least-squares ambiguity search technique described earlier, the equations are reformulated as $\lambda\,\Phi - \lambda\,N = \rho$, where all indices have been omitted. Lu and Cannon [91] and Lu [90] reduce the search space by introducing the known distance of the baseline. Referring to the system $\lambda\,\Phi - \lambda\,N = \rho$, three double-differences are considered, and the linearization of $\rho$ is performed with respect to the reference station of the baseline. Thus, the linearized system may be written as $\mathbf w = \mathbf A\,\mathbf x$, where $\mathbf A$ is a 3 × 3 design matrix resulting from the linearization, $\mathbf x$ contains the unknown baseline components (since the linearization was carried out with respect to the known station), and the left side of the equation contains the residual vector $\mathbf w$, which also comprises the ambiguities. Since $\mathbf x$ represents the baseline components, the constraint of the length of the baseline, denoted by b, may be introduced by first forming $\mathbf A^{-1}\mathbf w = \mathbf x$ and then $b^2 = \mathbf x^T\mathbf x = \mathbf w^T (\mathbf A\,\mathbf A^T)^{-1}\mathbf w$. This system may be further simplified by applying to $\mathbf A\,\mathbf A^T$ a Cholesky decomposition, which reduces $\mathbf A$ to a lower triangular matrix. The advantage obtained from this decomposition is that the third ambiguity may be expressed by a quadratic equation containing the other two ambiguities. Thus, introducing search trials for these two other ambiguities yields two solutions for the third ambiguity. Therefore, the constraint significantly reduces the search space. Redundant satellites may be used to further reduce the size of the search space. The performance of this method can best be seen by means of a simple example.
Assuming a 15-cycle uncertainty for the three unknowns would yield (together with the one ambiguity set obtained by rounding the calculated unknowns to their nearest integer values) 31 × 31 × 31 = 29 791 possible ambiguity sets, whereas taking into account the constraint as described above reduces the set of possible ambiguities to 31 × 31 × 2 = 1 922. The second example presented here refers to the introduction of constraints for an aircraft-to-aircraft positioning as proposed in Lachapelle et al. [78]. The situation is shown in Fig. 32. Each of the two aircraft is equipped with two receivers. The corresponding distances between the antennas i and j on one aircraft and k and ℓ on the other aircraft are known and may be introduced as constraints to determine the double-difference ambiguities for each airplane separately, i.e., the double-differences $N_{ij}$ and $N_{k\ell}$ for the available satellites (which are not indicated


here by appropriate superscripts). These resolved ambiguities $N_{ij}$ and $N_{k\ell}$ may now be used to interrelate the two aircraft by constraints. As proposed in Lachapelle et al. [78], three sets of double-difference ambiguity relations are constrained by using, e.g., $N_{ij} = N_{jk} - N_{ik}$, $N_{ij} = N_{j\ell} - N_{i\ell}$, and $N_{k\ell} = N_{i\ell} - N_{ki}$. Thus, for five satellites there are 4 × 3 double-difference equations of this type, which are used to reduce the number of potential ambiguity solutions. Note that these relations are not independent from each other but may still contribute to average out several error sources like carrier phase noise and multipath. Examples given in Lachapelle et al. [78] demonstrate that for two aircraft within 1 km, typically 4 to 6 minutes of measurements (with a data rate of 1 Hz) are sufficient to obtain a unique solution. The correctness of the ambiguities may roughly be checked by the double-difference phase residuals, which must not show a significant drift over time. A drift would be an indication of wrong ambiguities. The root mean square error of the double-difference phase residuals was about 0.8 cm. Based on the given data set, several trials were performed by shifting the initial epoch from one trial to the next by 90 seconds. Of these trials, some 50% yielded the same ambiguities.

4.2.4 Ambiguity Validation
After the determination of the integer ambiguities, it is of interest to validate the quality of the obtained quantities [138]. Therefore, the uncertainty of the estimated integer ambiguities is to be determined. As pointed out in Joosten and Tiberius [60], the distribution of the estimated integer ambiguities will be a probability mass function. As a probabilistic measure, the ambiguity success rate is defined, which quantifies the probability that the integer ambiguities are correctly estimated. The ambiguity success rate equals the integral of the probability density function of the float ambiguities. The integral extends over the so-called pull-in region [119], [131, Chap. 3], in which all float solutions are pulled by the integer least-squares criterion to the correct integer ambiguity solution. Due to its definition as a probability measure, the success rate is a single number between 0 and 1 (which may also be expressed as a percentage between 0% and 100%). As mentioned earlier, the ambiguity success rate depends on the functional model, the stochastic model, and the chosen method of integer estimation. Similar to the dilution of precision (DOP) computations, the success rate may be calculated without actual measurements if the functional and the stochastic model are known. In the context of ambiguity validation, a theoretical problem, frequently neglected, should be mentioned. Consider a double-difference model; if integer least-squares adjustment is applied, implicitly a stochastic behavior of the observation vector is assumed. As a result, the fixed ambiguities resulting from this adjustment procedure will also be stochastic. This should be taken into account when validating the ambiguities; this is frequently neglected – however, sometimes justifiably [130].
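For the simple integer estimator "rounding to the nearest integer", the success rate can be approximated by simulation in the spirit of the random-number procedure mentioned later in this section. In the added sketch below, the covariance matrix and the true integer ambiguities are assumed illustrative values, not taken from the text:

```python
import numpy as np

# Monte Carlo approximation of the ambiguity success rate for the simple
# integer estimator "round to the nearest integer" (assumed example values).
rng = np.random.default_rng(42)
Q = np.array([[0.04, 0.01],      # assumed float-ambiguity covariance [cycles^2]
              [0.01, 0.09]])
n_true = np.array([7.0, -3.0])   # assumed true integer ambiguities

samples = rng.multivariate_normal(n_true, Q, size=200_000)
correct = np.all(np.rint(samples) == n_true, axis=1)
success_rate = correct.mean()    # fraction of samples rounded to the truth
print(round(success_rate, 3))    # roughly 0.90 for these values
```

For rounding, the pull-in region is simply the unit square centered on the true integers, which is what the comparison `np.rint(samples) == n_true` implements.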
With respect to the integer estimation method, Teunissen [119, 120] has proven that the LAMBDA method delivers the optimum success rate of all admissible integer estimators. A proper choice of the weight matrix is also important for the


ambiguity resolution. Either a too optimistic or a too pessimistic precision description will result in a less than optimal ambiguity success rate. Jonkman [59] and Teunissen et al. [128] demonstrate examples of an increased ambiguity success rate by improving the stochastic modeling. Several methods exist for the computation of the success rate. Joosten and Tiberius [60] describe a simulation procedure based on a random number generator and somewhere between 100,000 and 1 million samples and achieve a success rate of 99.9%. Another method mentioned in Joosten and Tiberius [60] is the computation of a "sharp lower bound" of the probability of correct integer least-squares estimation using conditional standard deviations of the ambiguities, which follow directly from the triangular decomposition of the float ambiguity variance-covariance matrix. When applying the LAMBDA method, this decomposition is available without additional computational effort. Joosten et al. [61] stress that the success rate should be considered as the measure for judging the success of ambiguity resolution. Using only the standard deviations of the ambiguities may yield misleading results for two reasons: (1) the correlations are neglected when using only the standard deviations, and (2) ambiguity transformations change the standard deviations. In contrast, the success rate as defined previously is invariant under any ambiguity transformation. Verhagen [130] systematically compares several of the integer validation methods proposed in the literature, under the assumption that integer least-squares adjustment is used for ambiguity resolution. As known from Teunissen [119], this results in the optimal solution in the sense that the probability of correct integer estimation is maximized. Verhagen [130] shows that only the best and the second-best integer candidate solutions must be validated.
This leads to the ratio test, one of the earliest and most popular ways to validate the integer ambiguity solution [125]. The ratio is formed by the squared norm of the second-best ambiguity residual vector and the squared norm of the best ambiguity residual vector. This ratio is compared against a certain threshold, the critical value. The critical value plays a key role, since it indicates whether the two compared solutions can be discriminated with sufficient confidence. The choice of the critical value, however, remains somewhat arbitrary. Euler and Schaffrin [26] propose a critical value between 5 and 10, depending on the degrees of freedom. Wei and Schwarz [142] choose 2; Han and Rizos [42] propose 1.5 if elevation-dependent weights are used. Leick [85, Eq. 7.207] states that many software packages simply use a fixed critical value, for example, 3. Since a rigorous probabilistic theory for the validation of the integer ambiguities was missing, Teunissen [121, 122] developed the theory of integer aperture inference and, as a consequence, the optimal integer aperture estimation.
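As a minimal added sketch of the acceptance decision (the helper name `ratio_test` and the numbers are illustrative assumptions; the critical value 3.0 is one of the fixed values quoted above):

```python
def ratio_test(res_best_sq, res_second_sq, critical=3.0):
    """Accept the best integer candidate if the ratio of the squared norm of
    the second-best ambiguity residual vector to that of the best one
    reaches the critical value."""
    return (res_second_sq / res_best_sq) >= critical

# A clearly discriminated solution is accepted ...
assert ratio_test(0.8, 3.2)        # ratio 4.0 >= 3.0
# ... two nearly equally good candidates are not:
assert not ratio_test(1.0, 1.4)    # ratio 1.4 < 3.0
```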

4.3 Linearization

Starting with a typical example, it is assumed that point positioning with code ranges has been chosen as the GNSS positioning method and that at one time epoch t the


pseudoranges with codes have been measured to four satellites. The appropriate model is Eq. (4), containing, apart from the clock error, the distance ρ between the satellite and the receiver. This distance is a nonlinear function of the respective position vectors. Consequently, to process the data properly, the model must be linearized to achieve equations in which all unknowns appear linearly only. After linearization, adjustment and filtering methods may be applied accordingly. Detailed descriptions of these methods are beyond the scope of this contribution, although least-squares adjustment and filtering methods are of tremendous importance in geodesy.

4.3.1 Theoretical Considerations
The computation of the position is based on the evaluation of Eq. (34), which relates the range observations to the receiver position with respect to the satellite position:

$$
\rho_r^s(t) \equiv f(X_r, Y_r, Z_r)\,.
\tag{214}
$$

The nonlinear equation can be solved by applying closed-form algorithms [38, 70, 87]. Linearizing the equations, in contrast, simplifies the algorithm and allows one to implement adjustment algorithms.

4.3.2 Linearization of Mathematical Models
Considering the mathematical models of Sect. 3, the only term comprising unknowns in nonlinear form is ρ, the geometric range between the observing receiver site r and the satellite s. After linearizing ρ, adjustment algorithms may be applied. The basic formula from Eq. (34),

$$
\rho_r^s(t) = \sqrt{(X^s(t) - X_r)^2 + (Y^s(t) - Y_r)^2 + (Z^s(t) - Z_r)^2}
\equiv f(X_r, Y_r, Z_r)\,,
\tag{215}
$$

shows the range ρ as a function of the unknown point $\mathbf X_r = [X_r, Y_r, Z_r]$. Assuming approximate values $\mathbf X_{r0} = [X_{r0}, Y_{r0}, Z_{r0}]$ for the unknowns, an approximate range $\rho_{r0}^s(t)$ is calculated by

$$
\rho_{r0}^s(t) = \sqrt{(X^s(t) - X_{r0})^2 + (Y^s(t) - Y_{r0})^2 + (Z^s(t) - Z_{r0})^2}
\equiv f(X_{r0}, Y_{r0}, Z_{r0})\,.
\tag{216}
$$

Using the approximate values, the unknowns $X_r, Y_r, Z_r$ are decomposed to

$$
X_r = X_{r0} + \Delta X_r\,, \qquad
Y_r = Y_{r0} + \Delta Y_r\,, \qquad
Z_r = Z_{r0} + \Delta Z_r\,,
\tag{217}
$$


where now $\Delta X_r, \Delta Y_r, \Delta Z_r$ are the new unknowns, whereas $X_{r0}, Y_{r0}, Z_{r0}$ are known. The advantage of this split-up is that the function $f(X_r, Y_r, Z_r)$ is replaced by an equivalent function $f(X_{r0} + \Delta X_r,\, Y_{r0} + \Delta Y_r,\, Z_{r0} + \Delta Z_r)$, which can now be expanded into a Taylor series with respect to the approximate position. This leads to

$$
\begin{aligned}
f(X_r, Y_r, Z_r) \equiv{}& f(X_{r0} + \Delta X_r,\, Y_{r0} + \Delta Y_r,\, Z_{r0} + \Delta Z_r)\\
={}& f(X_{r0}, Y_{r0}, Z_{r0})
+ \left.\frac{\partial f(X_r, Y_r, Z_r)}{\partial X_r}\right|_{\mathbf X_r = \mathbf X_{r0}} \Delta X_r\\
&+ \left.\frac{\partial f(X_r, Y_r, Z_r)}{\partial Y_r}\right|_{\mathbf X_r = \mathbf X_{r0}} \Delta Y_r
+ \left.\frac{\partial f(X_r, Y_r, Z_r)}{\partial Z_r}\right|_{\mathbf X_r = \mathbf X_{r0}} \Delta Z_r
+ \dots,
\end{aligned}
\tag{218}
$$

where the expansion is truncated after the linear term. The higher-order terms are assumed to be negligibly small; otherwise the adjustment process has to be repeated iteratively, which means the result of the adjustment is used as the new approximate position. The partial derivatives evaluated at the approximate position $\mathbf X_{r0}$ are obtained using (216) as

$$
\left.\frac{\partial f}{\partial X_r}\right|_{\mathbf X_r = \mathbf X_{r0}} = -\frac{X^s(t) - X_{r0}}{\rho_{r0}^s(t)}\,, \qquad
\left.\frac{\partial f}{\partial Y_r}\right|_{\mathbf X_r = \mathbf X_{r0}} = -\frac{Y^s(t) - Y_{r0}}{\rho_{r0}^s(t)}\,, \qquad
\left.\frac{\partial f}{\partial Z_r}\right|_{\mathbf X_r = \mathbf X_{r0}} = -\frac{Z^s(t) - Z_{r0}}{\rho_{r0}^s(t)}\,,
\tag{219}
$$

which correspond to the components of the unit vector pointing from the satellite towards the approximate receiver site r. The substitution of Eqs. (216) and (219) into Eq. (218) gives

$$
\rho_r^s(t) = \rho_{r0}^s(t)
- \frac{X^s(t) - X_{r0}}{\rho_{r0}^s(t)}\,\Delta X_r
- \frac{Y^s(t) - Y_{r0}}{\rho_{r0}^s(t)}\,\Delta Y_r
- \frac{Z^s(t) - Z_{r0}}{\rho_{r0}^s(t)}\,\Delta Z_r\,,
\tag{220}
$$

where the equivalence of $f(X_r, Y_r, Z_r)$ with $\rho_r^s(t)$ has been used. This equation is now linear with respect to the unknowns $\Delta X_r, \Delta Y_r, \Delta Z_r$.

Linear Model for Point Positioning with Code Ranges
The model is given only in its elementary form and, thus, apart from the geometry, only the clocks are modeled. The ionosphere, troposphere, and other effects are neglected for the moment. According to Eq. (38), the model for point positioning with code ranges is given by

$$
R_r^s(t) = \rho_r^s(t) + c\,\delta_r(t) - c\,\delta^s(t)\,,
\tag{221}
$$


which is linearized by substituting (220):

$$
R_r^s(t) = \rho_{r0}^s(t)
- \frac{X^s(t) - X_{r0}}{\rho_{r0}^s(t)}\,\Delta X_r
- \frac{Y^s(t) - Y_{r0}}{\rho_{r0}^s(t)}\,\Delta Y_r
- \frac{Z^s(t) - Z_{r0}}{\rho_{r0}^s(t)}\,\Delta Z_r
+ c\,\delta_r(t) - c\,\delta^s(t)\,.
\tag{222}
$$

Leaving the terms containing unknowns on the right side, the equation is rewritten as

$$
R_r^s(t) - \rho_{r0}^s(t) + c\,\delta^s(t) =
- \frac{X^s(t) - X_{r0}}{\rho_{r0}^s(t)}\,\Delta X_r
- \frac{Y^s(t) - Y_{r0}}{\rho_{r0}^s(t)}\,\Delta Y_r
- \frac{Z^s(t) - Z_{r0}}{\rho_{r0}^s(t)}\,\Delta Z_r
+ c\,\delta_r(t)\,,
\tag{223}
$$

where the satellite clock bias $\delta^s(t)$ is assumed to be known, because satellite clock corrections are received within the navigation message. Model (223) comprises four unknowns, namely $\Delta X_r$, $\Delta Y_r$, $\Delta Z_r$, and $\delta_r(t)$. Consequently, four satellites are needed to solve the problem. The receiver clock unknown is sometimes introduced as $c\,\delta_r(t)$ for better numerical stability in the matrix operations. The shorthand notations

$$
\ell^s = R_r^s(t) - \rho_{r0}^s(t) + c\,\delta^s(t)\,, \qquad
a_{X_r}^s = -\frac{X^s(t) - X_{r0}}{\rho_{r0}^s(t)}\,, \quad
a_{Y_r}^s = -\frac{Y^s(t) - Y_{r0}}{\rho_{r0}^s(t)}\,, \quad
a_{Z_r}^s = -\frac{Z^s(t) - Z_{r0}}{\rho_{r0}^s(t)}
\tag{224}
$$

help to simplify the representation of the system of equations. Although $\ell^s$ and the $a^s$ are time-dependent, this dependency is not indicated explicitly here for reasons of simplification. Assuming now four satellites numbered from 1 to 4, then

$$
\begin{aligned}
\ell^1 &= a_{X_r}^1\,\Delta X_r + a_{Y_r}^1\,\Delta Y_r + a_{Z_r}^1\,\Delta Z_r + c\,\delta_r(t)\,,\\
\ell^2 &= a_{X_r}^2\,\Delta X_r + a_{Y_r}^2\,\Delta Y_r + a_{Z_r}^2\,\Delta Z_r + c\,\delta_r(t)\,,\\
\ell^3 &= a_{X_r}^3\,\Delta X_r + a_{Y_r}^3\,\Delta Y_r + a_{Z_r}^3\,\Delta Z_r + c\,\delta_r(t)\,,\\
\ell^4 &= a_{X_r}^4\,\Delta X_r + a_{Y_r}^4\,\Delta Y_r + a_{Z_r}^4\,\Delta Z_r + c\,\delta_r(t)
\end{aligned}
\tag{225}
$$


is the appropriate system of equations. Note that the superscripts are the satellite numbers and not exponents. Introducing

$$
\mathbf l = \begin{bmatrix} \ell^1 \\ \ell^2 \\ \ell^3 \\ \ell^4 \end{bmatrix}, \qquad
\mathbf A = \begin{bmatrix}
a_{X_r}^1 & a_{Y_r}^1 & a_{Z_r}^1 & c \\
a_{X_r}^2 & a_{Y_r}^2 & a_{Z_r}^2 & c \\
a_{X_r}^3 & a_{Y_r}^3 & a_{Z_r}^3 & c \\
a_{X_r}^4 & a_{Y_r}^4 & a_{Z_r}^4 & c
\end{bmatrix}, \qquad
\mathbf x = \begin{bmatrix} \Delta X_r \\ \Delta Y_r \\ \Delta Z_r \\ \delta_r(t) \end{bmatrix},
\tag{226}
$$

the set of linear equations can be written in the matrix-vector form

$$
\mathbf l = \mathbf A\,\mathbf x\,.
\tag{227}
$$

The resubstitution of the matrix $\mathbf A$ using (224) is given explicitly for one epoch t:

$$
\mathbf A = \begin{bmatrix}
-\dfrac{X^1(t) - X_{r0}}{\rho_{r0}^1(t)} & -\dfrac{Y^1(t) - Y_{r0}}{\rho_{r0}^1(t)} & -\dfrac{Z^1(t) - Z_{r0}}{\rho_{r0}^1(t)} & c \\[2mm]
-\dfrac{X^2(t) - X_{r0}}{\rho_{r0}^2(t)} & -\dfrac{Y^2(t) - Y_{r0}}{\rho_{r0}^2(t)} & -\dfrac{Z^2(t) - Z_{r0}}{\rho_{r0}^2(t)} & c \\[2mm]
-\dfrac{X^3(t) - X_{r0}}{\rho_{r0}^3(t)} & -\dfrac{Y^3(t) - Y_{r0}}{\rho_{r0}^3(t)} & -\dfrac{Z^3(t) - Z_{r0}}{\rho_{r0}^3(t)} & c \\[2mm]
-\dfrac{X^4(t) - X_{r0}}{\rho_{r0}^4(t)} & -\dfrac{Y^4(t) - Y_{r0}}{\rho_{r0}^4(t)} & -\dfrac{Z^4(t) - Z_{r0}}{\rho_{r0}^4(t)} & c
\end{bmatrix}.
\tag{228}
$$
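The system (226) through (228) can be assembled and solved numerically. The following self-contained sketch is an added illustration; the satellite geometry, receiver position, and clock bias are assumed example values, not taken from the text, and the clock unknown is carried as $c\,\delta_r(t)$ (so its design-matrix column consists of ones rather than c):

```python
import numpy as np

# Assumed example: four satellite positions [m] at one epoch, a true receiver
# position, and a true receiver clock bias expressed as c*delta_r(t) [m].
dirs = np.array([[0.9, 0.3, 0.3], [0.3, 0.9, 0.3],
                 [0.3, 0.3, 0.9], [0.6, -0.6, 0.5]])
sats = 26_000_000.0 * dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
x_true = np.array([6_378_137.0, 0.0, 0.0])
cdelta_true = 30_000.0

# Simulated code pseudoranges, Eq. (221) with delta^s(t) = 0:
R = np.linalg.norm(sats - x_true, axis=1) + cdelta_true

# Iterative solution of the linearized model (223)-(228):
x0 = x_true + np.array([100.0, -80.0, 60.0])   # arbitrary approximate point
cdelta0 = 0.0
for _ in range(5):
    rho0 = np.linalg.norm(sats - x0, axis=1)   # approximate ranges, Eq. (216)
    a = -(sats - x0) / rho0[:, None]           # coefficients of Eq. (224)
    A = np.hstack([a, np.ones((4, 1))])        # design matrix, cf. Eq. (228)
    l = R - rho0 - cdelta0                     # reduced observations, Eq. (224)
    dx = np.linalg.solve(A, l)                 # Eq. (227) with four satellites
    x0, cdelta0 = x0 + dx[:3], cdelta0 + dx[3]

assert np.allclose(x0, x_true, atol=1e-3)      # position recovered
assert abs(cdelta0 - cdelta_true) < 1e-3       # clock bias recovered [m]
```

The loop implements the iterative application mentioned in the text: each solution of the linearized system provides a new approximate position via Eq. (217).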

From the linear system of equations, the coordinate differences $\Delta X_r, \Delta Y_r, \Delta Z_r$ and the receiver clock error $\delta_r(t)$ for epoch t are obtained by multiplying (227) from the left with the inverse $\mathbf A^{-1}$. The desired point coordinates are finally obtained by (217). Recall that the selection of the approximate values for the coordinates was completely arbitrary; they could even be set equal to zero. However, depending on the quality of the approximate values, the equations have to be applied iteratively.

Point positioning with code ranges is applicable for each epoch separately. Therefore, this model may also be used in kinematic applications. The equations only change by introducing a time dependency of the unknown point coordinates $\mathbf X_r(t)$.

Linear Model for Point Positioning with Carrier Phases
The procedure is the same as in the previous section. Using Eq. (43), the linearization is performed for $\rho_r^s(t)$, and known terms are shifted to the left side. Multiplying the equation by λ and using $c = \lambda f$ yields


$$
\lambda\,\Phi_r^s(t) - \rho_{r0}^s(t) + c\,\delta^s(t) =
- \frac{X^s(t) - X_{r0}}{\rho_{r0}^s(t)}\,\Delta X_r
- \frac{Y^s(t) - Y_{r0}}{\rho_{r0}^s(t)}\,\Delta Y_r
- \frac{Z^s(t) - Z_{r0}}{\rho_{r0}^s(t)}\,\Delta Z_r
+ \lambda\,N_r^s + c\,\delta_r(t)\,,
\tag{229}
$$

where, compared to point positioning with code ranges, the number of unknowns is now increased by the ambiguities. Considering again four satellites, the system is given in matrix-vector form $\mathbf l = \mathbf A\,\mathbf x$, where

$$
\mathbf l = \begin{bmatrix}
\lambda\,\Phi_r^1(t) - \rho_{r0}^1(t) + c\,\delta^1(t) \\
\lambda\,\Phi_r^2(t) - \rho_{r0}^2(t) + c\,\delta^2(t) \\
\lambda\,\Phi_r^3(t) - \rho_{r0}^3(t) + c\,\delta^3(t) \\
\lambda\,\Phi_r^4(t) - \rho_{r0}^4(t) + c\,\delta^4(t)
\end{bmatrix}, \qquad
\mathbf A = \begin{bmatrix}
a_{X_r}^1(t) & a_{Y_r}^1(t) & a_{Z_r}^1(t) & \lambda & 0 & 0 & 0 & c \\
a_{X_r}^2(t) & a_{Y_r}^2(t) & a_{Z_r}^2(t) & 0 & \lambda & 0 & 0 & c \\
a_{X_r}^3(t) & a_{Y_r}^3(t) & a_{Z_r}^3(t) & 0 & 0 & \lambda & 0 & c \\
a_{X_r}^4(t) & a_{Y_r}^4(t) & a_{Z_r}^4(t) & 0 & 0 & 0 & \lambda & c
\end{bmatrix},
$$

$$
\mathbf x = \begin{bmatrix}
\Delta X_r & \Delta Y_r & \Delta Z_r & N_r^1 & N_r^2 & N_r^3 & N_r^4 & \delta_r(t)
\end{bmatrix}^T
\tag{230}
$$

and where the coefficients of the coordinate increments, cf. (224), are supplemented with the time parameter t. Obviously, the four equations are inadequate to determine the eight unknowns. This reflects the fact that point positioning with phases in this form cannot be solved epoch by epoch. Each additional epoch increases the number of unknowns by a new clock term. Thus, for two epochs there are eight equations and nine unknowns (still an underdetermined problem). For three epochs (t1, t2, t3) there are 12 equations and 10 unknowns, thus a slightly overdetermined problem. The 10 unknowns in the latter example are the coordinate increments $\Delta X_r, \Delta Y_r, \Delta Z_r$ for the unknown point, the integer ambiguities $N_r^1, N_r^2, N_r^3, N_r^4$ for the four satellites, and the receiver clock biases $\delta_r(t_1), \delta_r(t_2), \delta_r(t_3)$ for the three epochs. The design matrix $\mathbf A$ then has the dimension [12 × 10]. The solution of this redundant system is performed by least-squares adjustment.

Linear Model for Relative Positioning
The previous sections have shown linear models for both code ranges and carrier phases. For the case of relative positioning, the investigation is restricted to


carrier phases, since, on the one hand, relative positioning aims at higher accuracies, achievable only with carrier phases, and, on the other hand, it should be obvious how to change from the more expanded model of phases to a code model. Furthermore, the linearization and the setup of the linear equation system remain, in principle, the same for phases and phase combinations and could be performed analogously for each model. Therefore, the double-difference is selected for treatment in detail. The model for the double-difference of Eq. (87), multiplied by λ, is

$$
\lambda\,\Phi_{AB}^{jk}(t) = \rho_{AB}^{jk}(t) + \lambda\,N_{AB}^{jk}\,,
\tag{231}
$$

where the term $\rho_{AB}^{jk}$ containing the geometry is composed of

$$
\rho_{AB}^{jk}(t) = \rho_B^k(t) - \rho_B^j(t) - \rho_A^k(t) + \rho_A^j(t)\,,
\tag{232}
$$

which reflects the fact of four measurement quantities for a double-difference. Each of the four terms must be linearized according to (220), yielding

$$
\begin{aligned}
\rho_{AB}^{jk}(t) ={}& \rho_{B0}^k(t)
- \frac{X^k(t) - X_{B0}}{\rho_{B0}^k(t)}\,\Delta X_B
- \frac{Y^k(t) - Y_{B0}}{\rho_{B0}^k(t)}\,\Delta Y_B
- \frac{Z^k(t) - Z_{B0}}{\rho_{B0}^k(t)}\,\Delta Z_B \\
&- \rho_{B0}^j(t)
+ \frac{X^j(t) - X_{B0}}{\rho_{B0}^j(t)}\,\Delta X_B
+ \frac{Y^j(t) - Y_{B0}}{\rho_{B0}^j(t)}\,\Delta Y_B
+ \frac{Z^j(t) - Z_{B0}}{\rho_{B0}^j(t)}\,\Delta Z_B \\
&- \rho_{A0}^k(t)
+ \frac{X^k(t) - X_{A0}}{\rho_{A0}^k(t)}\,\Delta X_A
+ \frac{Y^k(t) - Y_{A0}}{\rho_{A0}^k(t)}\,\Delta Y_A
+ \frac{Z^k(t) - Z_{A0}}{\rho_{A0}^k(t)}\,\Delta Z_A \\
&+ \rho_{A0}^j(t)
- \frac{X^j(t) - X_{A0}}{\rho_{A0}^j(t)}\,\Delta X_A
- \frac{Y^j(t) - Y_{A0}}{\rho_{A0}^j(t)}\,\Delta Y_A
- \frac{Z^j(t) - Z_{A0}}{\rho_{A0}^j(t)}\,\Delta Z_A\,.
\end{aligned}
\tag{233}
$$

Substituting (233) into (231) and rearranging leads to the linear observation equation

$$
\ell_{AB}^{jk}(t) =
a_{X_A}^{jk}(t)\,\Delta X_A + a_{Y_A}^{jk}(t)\,\Delta Y_A + a_{Z_A}^{jk}(t)\,\Delta Z_A
+ a_{X_B}^{jk}(t)\,\Delta X_B + a_{Y_B}^{jk}(t)\,\Delta Y_B + a_{Z_B}^{jk}(t)\,\Delta Z_B
+ \lambda\,N_{AB}^{jk}\,,
\tag{234}
$$

where the left side

$$
\ell_{AB}^{jk}(t) = \lambda\,\Phi_{AB}^{jk}(t) - \rho_{B0}^k(t) + \rho_{B0}^j(t) + \rho_{A0}^k(t) - \rho_{A0}^j(t)
\tag{235}
$$

comprises both the measurement quantities and all terms computed from the approximate values. On the right side of (234), the abbreviations

$$
\begin{aligned}
a_{X_A}^{jk}(t) &= +\frac{X^k(t) - X_{A0}}{\rho_{A0}^k(t)} - \frac{X^j(t) - X_{A0}}{\rho_{A0}^j(t)}\,, &
a_{X_B}^{jk}(t) &= -\frac{X^k(t) - X_{B0}}{\rho_{B0}^k(t)} + \frac{X^j(t) - X_{B0}}{\rho_{B0}^j(t)}\,,\\
a_{Y_A}^{jk}(t) &= +\frac{Y^k(t) - Y_{A0}}{\rho_{A0}^k(t)} - \frac{Y^j(t) - Y_{A0}}{\rho_{A0}^j(t)}\,, &
a_{Y_B}^{jk}(t) &= -\frac{Y^k(t) - Y_{B0}}{\rho_{B0}^k(t)} + \frac{Y^j(t) - Y_{B0}}{\rho_{B0}^j(t)}\,,\\
a_{Z_A}^{jk}(t) &= +\frac{Z^k(t) - Z_{A0}}{\rho_{A0}^k(t)} - \frac{Z^j(t) - Z_{A0}}{\rho_{A0}^j(t)}\,, &
a_{Z_B}^{jk}(t) &= -\frac{Z^k(t) - Z_{B0}}{\rho_{B0}^k(t)} + \frac{Z^j(t) - Z_{B0}}{\rho_{B0}^j(t)}
\end{aligned}
\tag{236}
$$

have been used. The coordinates of one point (e.g., A) must be known for relative positioning. More specifically, the known point A reduces the number of unknowns by three because of

$$
\Delta X_A = \Delta Y_A = \Delta Z_A = 0
\tag{237}
$$

and leads to a slight change in the left-side term

$\ell_{AB}^{jk}(t) = \lambda\,\Phi_{AB}^{jk}(t) - \rho_{B0}^{k}(t) + \rho_{B0}^{j}(t) + \rho_{A}^{k}(t) - \rho_{A}^{j}(t)\,.$   (238)
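The linearization (233) and the coefficient definitions (236) translate directly into code. The following sketch (plain Python; the function names and all coordinates are invented for illustration) evaluates the coefficients of the unknowns from approximate site coordinates and given satellite positions:

```python
import math

def unit_components(sat, site0):
    """Components (X^s - X_0)/rho_0, ... of the unit vector pointing from the
    approximate site position to the satellite, cf. the fractions in (236)."""
    d = [s - x0 for s, x0 in zip(sat, site0)]
    rho0 = math.sqrt(sum(c * c for c in d))
    return [c / rho0 for c in d]

def dd_coefficients(sat_j, sat_k, site_a0, site_b0):
    """Coefficients (a_XA, a_YA, a_ZA, a_XB, a_YB, a_ZB) of the linearized
    double-difference (234), built according to (236)."""
    e_k_a = unit_components(sat_k, site_a0)
    e_j_a = unit_components(sat_j, site_a0)
    e_k_b = unit_components(sat_k, site_b0)
    e_j_b = unit_components(sat_j, site_b0)
    a_site_a = [ek - ej for ek, ej in zip(e_k_a, e_j_a)]  # "+" signs for A
    a_site_b = [ej - ek for ek, ej in zip(e_k_b, e_j_b)]  # "-" signs for B
    return a_site_a + a_site_b

# invented geometry (metres): two nearby sites, two satellites
site_a0 = (4_000_000.0, 1_000_000.0, 4_800_000.0)
site_b0 = (4_000_500.0, 1_000_200.0, 4_800_100.0)
sat_j = (15_000_000.0, 5_000_000.0, 21_000_000.0)
sat_k = (-8_000_000.0, 18_000_000.0, 17_000_000.0)
coeffs = dd_coefficients(sat_j, sat_k, site_a0, site_b0)
```

For a short baseline the two sites see the satellites under almost the same directions, so the coefficients for A are nearly the negatives of those for B; for identical approximate coordinates they cancel exactly.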

21 Elementary Mathematical Models for GNSS Positioning

Assuming now four satellites j, k, l, m and two epochs $t_1$, $t_2$, the matrix-vector system

$\mathbf{l} = \begin{bmatrix} \ell_{AB}^{jk}(t_1) \\ \ell_{AB}^{jl}(t_1) \\ \ell_{AB}^{jm}(t_1) \\ \ell_{AB}^{jk}(t_2) \\ \ell_{AB}^{jl}(t_2) \\ \ell_{AB}^{jm}(t_2) \end{bmatrix}, \qquad \mathbf{x} = \begin{bmatrix} \Delta X_B \\ \Delta Y_B \\ \Delta Z_B \\ N_{AB}^{jk} \\ N_{AB}^{jl} \\ N_{AB}^{jm} \end{bmatrix},$

$\mathbf{A} = \begin{bmatrix} a_{X_B}^{jk}(t_1) & a_{Y_B}^{jk}(t_1) & a_{Z_B}^{jk}(t_1) & \lambda & 0 & 0 \\ a_{X_B}^{jl}(t_1) & a_{Y_B}^{jl}(t_1) & a_{Z_B}^{jl}(t_1) & 0 & \lambda & 0 \\ a_{X_B}^{jm}(t_1) & a_{Y_B}^{jm}(t_1) & a_{Z_B}^{jm}(t_1) & 0 & 0 & \lambda \\ a_{X_B}^{jk}(t_2) & a_{Y_B}^{jk}(t_2) & a_{Z_B}^{jk}(t_2) & \lambda & 0 & 0 \\ a_{X_B}^{jl}(t_2) & a_{Y_B}^{jl}(t_2) & a_{Z_B}^{jl}(t_2) & 0 & \lambda & 0 \\ a_{X_B}^{jm}(t_2) & a_{Y_B}^{jm}(t_2) & a_{Z_B}^{jm}(t_2) & 0 & 0 & \lambda \end{bmatrix}$   (239)

is obtained, which represents a determined and, thus, solvable system. For only one epoch, the system would have more unknowns than observation equations.
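A sketch of assembling and solving the system (239): the coefficients below are invented numbers, and the wavelength is set to roughly the GPS L1 value for illustration (all values are assumptions, not taken from the text).

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for the square system A x = b."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

lam = 0.19  # metres, roughly the GPS L1 wavelength (illustrative value)
# invented a-coefficients for satellite pairs jk, jl, jm at epochs t1 and t2
A = [
    [0.30, -0.50, 0.20, lam, 0.0, 0.0],
    [-0.40, 0.10, 0.60, 0.0, lam, 0.0],
    [0.20, 0.70, -0.30, 0.0, 0.0, lam],
    [0.31, -0.48, 0.21, lam, 0.0, 0.0],
    [-0.38, 0.12, 0.58, 0.0, lam, 0.0],
    [0.22, 0.68, -0.31, 0.0, 0.0, lam],
]
x_true = [1.2, -0.8, 0.5, 3.0, -2.0, 7.0]  # dXB, dYB, dZB, three ambiguities
l = [sum(a * x for a, x in zip(row, x_true)) for row in A]
x_hat = solve(A, l)
```

Note that the three rows of the second epoch differ only slightly from those of the first, reflecting the slow change of the satellite geometry; this small difference is exactly what makes the two-epoch system solvable.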

4.3.3 Network Adjustment

The previous sections described the linearization of the observation equations. The adjustment itself, i.e., the solution of the system of linear equations, is a purely mathematical task.

Single-Baseline Solution

The adjustment principle $\mathbf{v}^T \mathbf{P}\, \mathbf{v}$ = minimum! requires the implementation of the weight matrix $\mathbf{P}$. The off-diagonal elements of $\mathbf{P}$ express the correlation between the measurements. As shown in Sect. 3.3.3, phases and single-differences are uncorrelated, whereas double- and triple-differences are mathematically correlated. The implementation of double-difference correlations can easily be accomplished. Alternatively, the double-differences can be decorrelated by using a Gram–Schmidt orthogonalization [102]. The implementation of the correlation of the triple-differences is more difficult but worth implementing [102, Table 7.1]. In the case of an observed network, the use of the single-baseline method usually implies a baseline-by-baseline computation for all possible combinations. If nr denotes the number of observing sites, then nr (nr − 1)/2 baselines can be calculated. Only nr − 1 of them are theoretically independent. The redundant


baselines can be used for misclosure checks. Alternatively, all possible baselines are computed for the different measurement sessions. The resulting vectors of all sessions are subject to a common adjustment. The theoretical disadvantage of the simple single-baseline solution is that it disregards the correlation of simultaneously observed baselines. By solving baseline by baseline, this correlation is ignored.

Multipoint Solution

In the multipoint approach, the correlations between the baselines are taken into account by considering all points in the network at once. The principal correlations have been shown in Sect. 3.3.3. The same theoretical aspects also apply to the extended case of a network.

Single-difference example for a network. When three points A, B, C and a single satellite j at a single epoch t are considered, two independent baselines can be defined. Taking A as reference site, for the two baselines A–B and A–C the two single-differences

$\Phi_{AB}^{j}(t) = \Phi_{B}^{j}(t) - \Phi_{A}^{j}(t)\,,$
$\Phi_{AC}^{j}(t) = \Phi_{C}^{j}(t) - \Phi_{A}^{j}(t)$   (240)

are set up for the one satellite j at epoch t. The two single-differences and the phases are related by

$\mathbf{S} = \mathbf{C}\,\boldsymbol{\Phi}\,,$   (241)

where

$\mathbf{S} = \begin{bmatrix} \Phi_{AB}^{j}(t) \\ \Phi_{AC}^{j}(t) \end{bmatrix}, \qquad \mathbf{C} = \begin{bmatrix} -1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}, \qquad \boldsymbol{\Phi} = \begin{bmatrix} \Phi_{A}^{j}(t) \\ \Phi_{B}^{j}(t) \\ \Phi_{C}^{j}(t) \end{bmatrix}.$   (242)

To find the correlation, the covariance propagation law is applied by $\boldsymbol{\Sigma}_S = \mathbf{C}\,\boldsymbol{\Sigma}_{\Phi}\,\mathbf{C}^{T}$, leading to

$\boldsymbol{\Sigma}_S = \sigma^2\,\mathbf{C}\,\mathbf{C}^{T}$   (243)

because of $\boldsymbol{\Sigma}_{\Phi} = \sigma^2 \mathbf{I}$, cf. Eq. (105). Substituting matrix $\mathbf{C}$ from (242) and evaluating the matrix operation yields

$\boldsymbol{\Sigma}_S = \sigma^2 \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix},$   (244)

which shows, as is to be expected, a correlation of the single-differences of the two baselines with a common point. Recall that single-differences of a single baseline are uncorrelated, as pointed out in Sect. 3.3.3.

Double-difference example for a network. Since double-differences are already correlated for a single baseline, a correlation must also be expected for the network. The subsequent, slightly larger example will demonstrate the increasing complexity. Assume again three points A, B, C with A as reference site for the two baselines A–B and A–C. Consider a single epoch t for four satellites j, k, l, m, where j is taken as the reference satellite for the double-differences. There are (nr − 1)(ns − 1) independent double-differences for nr points and ns satellites. For the given example, nr = 3 and ns = 4 and, thus, 6 double-differences can be identified. Using the symbolic notation of (89), these are

j

j

j

j

j

jm

j

j

jk

j

j

j

j

j

jm

j

j

ΦAB (t) = ΦBk (t) − ΦB (t) − ΦAk (t) + ΦA (t) , ΦAB (t) = ΦB (t) − ΦB (t) − ΦA (t) + ΦA (t) , ΦAB (t) = ΦBm (t) − ΦB (t) − ΦAm (t) + ΦA (t) , ΦAC (t) = ΦCk (t) − ΦC (t) − ΦAk (t) + ΦA (t) ,

(245)

ΦAC (t) = ΦC (t) − ΦC (t) − ΦA (t) + ΦA (t) , ΦAC (t) = ΦCm (t) − ΦC (t) − ΦAm (t) + ΦA (t) for the assumptions made. As in the previous example, a matrix-vector relation is desired. By introducing for the matrix C: ⎡

1 −1

0

0 −1 1 0 0

0 0 0 0

⎢ ⎢ 1 0 −1 0 −1 0 1 0 0 0 ⎢ ⎢ ⎢ ⎢ 1 0 0 −1 −1 0 0 1 0 0 C=⎢ ⎢ ⎢ 1 −1 0 0 0 0 0 0 −1 1 ⎢ ⎢ ⎢ 1 0 −1 0 0 0 0 0 −1 0 ⎣ 1

0

0 −1

and for the vectors D and :



⎥ 0 0⎥ ⎥ ⎥ ⎥ 0 0⎥ ⎥, ⎥ 0 0⎥ ⎥ ⎥ 1 0⎥ ⎦

0 0 0 0 −1 0 0 1

(246)




$\mathbf{D} = \begin{bmatrix} \Phi_{AB}^{jk}(t) \\ \Phi_{AB}^{jl}(t) \\ \Phi_{AB}^{jm}(t) \\ \Phi_{AC}^{jk}(t) \\ \Phi_{AC}^{jl}(t) \\ \Phi_{AC}^{jm}(t) \end{bmatrix}, \qquad \boldsymbol{\Phi} = \begin{bmatrix} \Phi_{A}^{j}(t) \\ \Phi_{A}^{k}(t) \\ \Phi_{A}^{l}(t) \\ \Phi_{A}^{m}(t) \\ \Phi_{B}^{j}(t) \\ \vdots \end{bmatrix},$   (247)

the relation

$\mathbf{D} = \mathbf{C}\,\boldsymbol{\Phi}$   (248)

is valid. The covariance follows by

$\boldsymbol{\Sigma}_D = \mathbf{C}\,\boldsymbol{\Sigma}_{\Phi}\,\mathbf{C}^{T}\,,$   (249)

which reduces to

$\boldsymbol{\Sigma}_D = \sigma^2\,\mathbf{C}\,\mathbf{C}^{T}$   (250)

because of the uncorrelated phases. Explicitly, the matrix product

$\mathbf{C}\,\mathbf{C}^{T} = \begin{bmatrix} 4 & 2 & 2 & 2 & 1 & 1 \\ 2 & 4 & 2 & 1 & 2 & 1 \\ 2 & 2 & 4 & 1 & 1 & 2 \\ 2 & 1 & 1 & 4 & 2 & 2 \\ 1 & 2 & 1 & 2 & 4 & 2 \\ 1 & 1 & 2 & 2 & 2 & 4 \end{bmatrix}$   (251)

is a full matrix, as expected. Finally, the weight matrix results from the inverse of (251): $\mathbf{P} = (\mathbf{C}\,\mathbf{C}^{T})^{-1}$. Beutler et al. [6] show some results of network campaigns where the correlations have either been totally neglected, introduced in a single-baseline mode, or calculated correctly [7]. For small networks with baselines not exceeding 10 km, the differences of the three methods are in the range of a few millimeters. Clearly, the solution without any correlation deviates from the theoretically correct values by a greater amount. It is estimated that the single-baseline method deviates from the multibaseline (correlated) solution by a maximum of 2σ.
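The correlation structure derived in this example is easy to verify numerically. The sketch below (plain Python; the phase ordering of (247) is assumed) rebuilds C from the double-difference definitions (245), checks the single-difference result (244), and evaluates C C^T; the weight matrix P would then follow as the inverse of this product.

```python
def matmul_t(A):
    """Compute A * A^T for a matrix given as a list of rows."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in A] for r1 in A]

# single-difference example (242)-(244)
C_sd = [[-1, 1, 0],
        [-1, 0, 1]]
assert matmul_t(C_sd) == [[2, 1], [1, 2]]  # reproduces (244) for sigma^2 = 1

# double-difference network example (245)-(246):
# phases ordered Phi_A^j..Phi_A^m, Phi_B^j..Phi_B^m, Phi_C^j..Phi_C^m
stations, sats = ["A", "B", "C"], ["j", "k", "l", "m"]
col = {(st, sv): stations.index(st) * 4 + sats.index(sv)
       for st in stations for sv in sats}

def dd_row(station, sat):
    """Row of C for the double-difference between reference site A /
    reference satellite j and the given station / satellite, cf. (245)."""
    row = [0] * 12
    row[col[(station, sat)]] += 1  # +Phi_station^sat
    row[col[(station, "j")]] -= 1  # -Phi_station^j (reference satellite)
    row[col[("A", sat)]] -= 1      # -Phi_A^sat (reference station)
    row[col[("A", "j")]] += 1      # +Phi_A^j
    return row

C = [dd_row(st, sv) for st in ("B", "C") for sv in ("k", "l", "m")]
CCt = matmul_t(C)  # should reproduce (251)
```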


Single-Baseline Versus Multipoint Solution

The implementation of the single-baseline method is much simpler, and it takes less effort to detect and eliminate faulty measurements. The multipoint solution, in contrast, accounts for the correlation and more easily detects and repairs cycle slips. The economic implementation of the full correlation for a multipoint solution only works properly for networks with the same observation pattern at each receiver site. In the event of numerous data outages it is better to recalculate the covariance matrix. Even in the case of the multipoint approach, it becomes questionable whether the correlations can be modeled properly. An illustrative example is given in Beutler et al. [8], where single- and dual-frequency receivers are combined in a network. For the dual-frequency receivers, the ionosphere-free combination is formed from dual-frequency measurements and processed together with the data of the single-frequency receivers. Thus, a correlation is introduced because of the single-frequency data.

Least-Squares Adjustment of Baselines

Considering networks, the number of measured baselines will usually exceed the minimum necessary, e.g., due to several measurement campaigns. In this case, redundant information is available and the determination of the coordinates of the network points may be carried out by a least-squares adjustment. The baseline vectors $\mathbf{X}_{ij}$ between the unknown network points $\mathbf{X}_j$ and $\mathbf{X}_i$ are introduced as observables into the least-squares adjustment. Consequently, the linear expression

$\mathbf{X}_{ij} = \mathbf{X}_j - \mathbf{X}_i$   (252)

referring the observables to the unknowns is formulated. Linearizing the equation (i.e., $\mathbf{X}_{ij} = \mathbf{X}_{ij0} + \Delta\mathbf{X}_{ij}$) is not necessary but may be applied for reasons of numerical stability. In case of redundancy, residuals $\mathbf{v}_{ij}$ are added to the observables to assure consistency; thus

$\mathbf{X}_{ij} = \mathbf{X}_j - \mathbf{X}_i + \mathbf{v}_{ij}\,.$   (253)

The components $\mathbf{X}_{ij} = [X_{ij}, Y_{ij}, Z_{ij}]$ of the baseline vector are considered as observables. The elements of the design matrix are the coefficients of the unknowns $\mathbf{X}_i$ and $\mathbf{X}_j$ and amount to 0, +1, or −1. The relation (253) is solved by applying the least-squares principle $\mathbf{v}^T\mathbf{v}$ = minimum! if equal weights for the baselines are assumed. Otherwise, a weight matrix $\mathbf{P}$ must be taken into account. If the coordinate differences $X_{ij}$, $Y_{ij}$, $Z_{ij}$ are introduced as the only observables, then absolute coordinates cannot be derived because the matrix of normal equations becomes singular. Considering the most general case, the rank deficiency of a three-dimensional network amounts to seven, corresponding to the seven degrees of freedom of a three-dimensional network or the seven parameters of a similarity transformation in space (three translations, three rotations, and one scale factor).
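As a toy illustration of (252), (253) and the rank-defect discussion, the following sketch (one coordinate component only; the observation values are invented) adjusts a three-point network with point A held fixed as the minimal constraint:

```python
# one coordinate component of a three-point network; baseline observations
# x_ij = x_j - x_i carry small inconsistencies (invented numbers, metres)
observations = [("A", "B", 10.02), ("A", "C", 25.01), ("B", "C", 14.95)]
unknowns = ["B", "C"]          # point A is held fixed (minimal constraint)
fixed = {"A": 0.0}
col = {p: i for i, p in enumerate(unknowns)}

rows, l = [], []
for i, j, obs in observations:
    row = [0.0] * len(unknowns)  # design-matrix entries are 0, +1, or -1
    rhs = obs
    if j in col:
        row[col[j]] += 1.0
    else:
        rhs -= fixed[j]
    if i in col:
        row[col[i]] -= 1.0
    else:
        rhs += fixed[i]
    rows.append(row)
    l.append(rhs)

# normal equations N x = n (equal weights, v^T v = minimum)
N = [[sum(r[a] * r[b] for r in rows) for b in range(2)] for a in range(2)]
n = [sum(r[a] * li for r, li in zip(rows, l)) for a in range(2)]
det = N[0][0] * N[1][1] - N[0][1] * N[1][0]
x_B = (n[0] * N[1][1] - N[0][1] * n[1]) / det
x_C = (N[0][0] * n[1] - N[1][0] * n[0]) / det
```

Had no point been fixed, the normal-equation matrix N would be singular — the one-dimensional analogue of the three-translation rank defect described above.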


For relative positioning, orientation and scale of the network of baseline vectors (and also its shape) are determined by the definition of the satellite orbits. This means that four of the seven parameters, i.e., three rotations and the scale, are determined. The three translations of the whole network, i.e., a shift vector, remain undetermined, so that the rank deficiency of the normal equations equals three. Selecting and fixing a single point of the network (i.e., considering its coordinates as known) solves the problem of the shift vector and leads to the minimally constrained solution. Note, however, that the fixing of coordinates should be restricted to one point. Otherwise, constraints would be induced into the network which could impair the strong geometry and result in network distortions. Previously, the problem of correlations was discussed. For single-baseline solutions, only correlations between vector components are computed. For multipoint baseline solutions, correlations between the various baselines are computed as well. In the first case, the correlations may be disregarded, whereas in the latter case, the correlations should be taken into account. The adjustment described for three-dimensional Cartesian coordinates may also be performed in ellipsoidal coordinates. The principle of formulating the observation equations is precisely the same. However, the expression which relates the observables to the unknowns is more sophisticated.

5 Perspectives

At the very beginning of this contribution, I explained how I got access to the Global Positioning System (GPS) and also to the respective mathematical models in 1984 for the first time and, later, with much more depth and detail in the years after, which led to my first GPS book in 1992: "GPS – theory and practice", coauthored by Jim Collins and Herbert Lichtenegger. Why do I repeat this? In my opinion, this is remarkable in several respects, especially when following the development of the US GPS and, in these years of the cold war, the Russian GLONASS (Global Navigation Satellite System). In 1992, the year of publication of the GPS book, the full operational capability (FOC) of these two systems was by far not achieved; it occurred in fall 1995 for GPS and in spring 1996 for GLONASS. Therefore, the mathematical models for satellite-based positioning could not be fully exploited. Nevertheless, and this might be regarded as the real surprise, the mathematical models as described in the book of 1992 have not substantially changed since then and up to today! The reason for this is pretty simple: of course, the theory did not change; the models also did not change because the focus was put on the most elementary description to make it easier for the reader to gain a proper understanding. In other terms, in a first step "pure theory" was presented, and the expansion of the models by taking into account biases and error influences followed in the next steps. Certainly, the understanding of these biases and errors became better and better over the years; however, the basic models remained the same. As of today, the author sees only one new development which was not treated with the same level of depth in these early days of


satellite-based navigation: PPP (precise point positioning), see Sect. 3.1.4, and, in the author's opinion, PPP can be regarded as a perspective for the future. Up to now, the developments are encouraging; yet, the real breakthrough establishing PPP as "the method" has by far not been achieved, since the current convergence times are still too long. There are some other very interesting perspectives. The illustrative examples shown in this contribution mainly use GPS data only. The reader might ask why the expansion to GNSS by including GLONASS was not an issue. Several reasons could be mentioned; one is certainly the different access to GNSS: GPS works on the basis of code division multiple access (CDMA), whereas GLONASS uses frequency division multiple access (FDMA). Even if this is – from the theoretical perspective – not a big obstacle, in practice it is, because the software used must be far more sophisticated if, e.g., double-differences (see Eq. 95) are formed using one satellite each from the two systems. Another reason is that GLONASS was not easily publicly accessible for civilian users and, even worse, vanished for many years from the horizon due to an insufficient number of satellites. Consequently, GLONASS was for many years not available as a global navigation satellite system but could be used only locally and not 24 hours a day. In contrast to GLONASS, GPS has been available throughout the years since its FOC! Apart from this fact, software and hardware have been permanently developed further, so that – as demonstrated by the measurement examples in this contribution – GPS "is so good" that in many, many cases there is no need to complement it by an additional system. This argument will be considered subsequently, because the reader might raise the question whether the ongoing completion of the European Galileo and the Chinese BeiDou is unnecessary and redundant.
Two results should demonstrate how helpful and even necessary additional systems are. The first experiment was carried out in an alpine region of Austria. A car moved from Graz towards northwest on the highway A9 for some 116 km and, while moving, measured GPS L1 with a frequency of 1 Hz. Then each of the calculated positions along the road was investigated individually by a planning software: over 24 hours, the satellite constellation was calculated for this position, and the minimal number of available satellites over the entire day is displayed in Fig. 33 by a colored dot according to the color scale given in the figure. This calculation of the minimal number of available satellites over one day was repeated for all measured positions along the road, resulting in the colored curve in Fig. 33. Note that the colors cover the range from deep blue to deep red, where deep blue – in association with a blue sky – is very good (more than 10 satellites visible), whereas deep red – in association with hell – or orange means that the minimum number of 4 satellites was not available (see the color scale in the figure); in other words, the number of satellites is insufficient to get a GNSS-based position. In order to understand Fig. 33, it is necessary to know the terrain of the investigated piece of road, which was taken into account by a very detailed digital terrain model. The altitude of Graz, the initial point of the car trip, which is situated near the lower right corner of the picture, is some 365 m. Then the Alps, in this area with a height of some 2,000 m, are crossed by several tunnels, where the longest one


Fig. 33 Minimal number of GPS satellites (indicated by colored dots according to the color scale) visible over 24 hours along the road A2 from Graz to Lassing in Austria

is some 8.5 km long (in Fig. 33 at about ϕ = 47.25° and λ = 15.1°). The destination point Lassing has an altitude of about 780 m. Checking the colored curve, the reader should be able to identify several tunnels; they are easily verified since there are gaps in the curve, i.e., no GPS measurements were available. It is interesting to see how the color turns towards red when the car approaches a tunnel or immediately after leaving it. In the next step, the GPS result was compared with the Galileo result under the same presuppositions. When we carried out this measurement and calculation experiment, the number of Galileo satellites was still insufficient. In addition, a comparison can only be made if the status of the two systems is about the same. Therefore, we used a simulation software developed by TeleConsult Austria GmbH (see www.teleconsult-austria.at) to simulate a full Galileo constellation. Based on this simulation and using the same position data of the car as for GPS, again the 24-hour calculation was initiated to find the minimal number of satellites for each individual position. The result is shown in Fig. 34, and it is easy to see that the colored curve has many more blue or deep blue contributions compared to GPS in Fig. 33. The interpretation is simple: in this alpine area along the specific route, Galileo – once it operates with its full 24-satellite constellation – is advantageous compared to GPS. Since GPS and Galileo do not have the CDMA/FDMA system difference introduced by GLONASS, a combination of the two systems offers itself. The result of the GPS and Galileo combination is given in Fig. 35, which shows the tremendous improvement compared to an individual solution. Yet, already from this result one might raise the question whether additional systems like BeiDou still contribute substantially. Can the heaven become even deeper blue? Does this still improve the quality or accuracy of the results?
We will come to an answer at the end of this perspective section. The second example is a PDOP investigation of a static measurement demonstrating the benefit or even the necessity of using GNSS as compared to one individual


Fig. 34 Minimal number of Galileo satellites (indicated by colored dots according to the color scale) visible over 24 hours along the road A2 from Graz to Lassing in Austria

Fig. 35 Minimal number of GPS plus Galileo satellites (indicated by colored dots according to the color scale) visible over 24 hours along the road A2 from Graz to Lassing in Austria

global navigation satellite system. The PDOP value is a quality measure of the satellite geometry. It is calculated as the inverse of the volume of the body which is formed by the unit vectors from the site where the measurements are taken to the satellites. The larger this volume becomes, the smaller is the resulting PDOP value (because of the inverse), i.e., the satellites should be widely spread over the sky with respect to the location of the observer. Typically, what are good numbers for the PDOP value? Assume a completely flat area. Then the satellites could be spread over half of the unit sphere, i.e., from the horizon of the observer up to the zenith in any arbitrary direction. Now imagine, at least theoretically, that this entire half of the unit sphere is not only densely but fully covered with satellites. Going towards the theoretical limit, the number of satellites becomes infinite and the volume to calculate the PDOP value is the volume of one half of the unit sphere, which is 2π/3; thus, the theoretical best value is the inverse of this quantity,


Fig. 36 PDOP for GPS only (red curve) in contrast to PDOP for GNSS (blue curve) at a Californian site. (Published with kind permission of James Spilker)

i.e., PDOP = 3/(2π) = 0.48. In practice, good PDOP values are somewhere in the range between 1 and 3. The data of Fig. 36 represent a topographically challenging area in California. I received this PDOP example from James Spilker. His permission to publish this figure is gratefully acknowledged. The interpretation is easy. The red curve represents the GPS PDOP values for the Californian site with a latitude of about ϕ = 30° over 24 hours. A mask angle of 30° was used because of the many very high mountains around the observation site. Thus, referring to the observer's site, all satellites with an elevation angle between 0° and 30° above the horizon were omitted because they were either invisible or too close to the visible horizon. As Fig. 36 indicates, the PDOP values are only acceptable for very short periods of the day; otherwise they are really bad (even out of the range of the figure). The situation changes dramatically if GNSS is used. Now the blue curve of Fig. 36 represents the PDOP values and shows the huge improvement achieved by using not a single global navigation satellite system but GNSS. In the example shown, three frequencies were used. Now the PDOP values are excellent over the entire day. This is the proof that GNSS is necessary in topographically challenging areas. Note that this challenge does not only stem from nature but is frequently caused by mankind. In cities with narrow streets it is always a challenge to have at least four satellites available to get a satellite-based position; however, in cities this is not always critical because additional data, e.g., digital maps, may be used. From the perspectives given so far together with the measurement examples presented in this contribution, the conclusion may be drawn that a single global


satellite navigation system will be sufficient in many situations; however, in areas with strong satellite obstructions, e.g., due to the topography or because of buildings, the need will arise to use GNSS in the sense of multiple global satellite navigation systems. As described in this contribution and as seen today (beginning of 2018), in the very near future four global satellite navigation systems will contribute to GNSS. Now the question may be asked whether this very substantial increase still improves the results because of the many more available satellites and, thus, the much larger amount of usable data, or whether there is a kind of limit or perhaps even an effect which limits the usefulness of using an arbitrary number of global satellite navigation systems. The answer to this question is given by Fig. 37. The horizontal axis indicates the number of satellites for multiple global satellite navigation systems, where 30 satellites are assumed for each individual system GPS, GLONASS, Galileo, BeiDou. The two blue curves represent the vertical and horizontal dilution of precision (VDOP and HDOP) as a function of the number of satellites used (under the assumption that no obstruction occurs). The scale describing the respective DOP values is given at the left vertical axis. Roughly speaking, the DOP values (for VDOP and HDOP as well) get better if two instead of one global satellite navigation system are used: VDOP improves from 1.5 (with 30 satellites) to about 0.91 with 60 satellites, and HDOP from 0.9 to about 0.52. This improvement continues up to about "2.5 systems" (with some 75 satellites). If the number of satellites is further increased, however, no really significant improvements are achieved because the DOP curves converge only very slowly to better values, which will have no significant impact on the results of the coordinates. Consequently, one might argue that 2.5 systems are sufficient; it does not make sense to use more than this.
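The DOP quantities discussed here can be computed from the satellite geometry alone. The sketch below uses the common cofactor-matrix formulation (rows [e_x, e_y, e_z, 1] of the geometry matrix, the 1 absorbing the receiver clock) rather than the volume picture given above; the two constellations are invented for illustration:

```python
import math

def pdop(unit_vectors):
    """PDOP from the cofactor matrix Q = (G^T G)^{-1}, where each row of the
    geometry matrix G is [e_x, e_y, e_z, 1]."""
    G = [list(e) + [1.0] for e in unit_vectors]
    n = 4
    N = [[sum(row[a] * row[b] for row in G) for b in range(n)] for a in range(n)]
    # invert N by Gauss-Jordan elimination with partial pivoting
    M = [N[i][:] + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    Q = [row[n:] for row in M]
    return math.sqrt(Q[0][0] + Q[1][1] + Q[2][2])

def sat_vector(elev_deg, azim_deg):
    """Unit vector to a satellite from elevation and azimuth (degrees)."""
    el, az = math.radians(elev_deg), math.radians(azim_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))

# widely spread constellation: one near zenith, three low at 120 deg spacing
wide = [sat_vector(90, 0)] + [sat_vector(20, az) for az in (0, 120, 240)]
# clustered constellation: all four satellites high above the observer
narrow = [sat_vector(el, az) for el, az in ((80, 0), (75, 90), (70, 180), (85, 270))]
```

A widely spread constellation yields a PDOP in the "good" range between 1 and 3 quoted above, whereas a clustered one degrades it considerably.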
Exceptions are certainly areas with strong obstructions. This limitation to 2.5 systems is manifested even more by the two remaining curves in Fig. 37, which are connected to noise; the respective noise values are indicated at the right vertical axis of the figure. The horizontal red line at about N0 = −204.5 [dBm] is the acquisition threshold for standard receivers, i.e., the minimal signal-to-noise ratio necessary to acquire satellites at all. The I0 curve is the intra-noise which is produced by the system signal. Here only one frequency is considered for all four systems. The intersection of the two noise curves is the critical issue: when I0 becomes larger than N0, the receivers are unable to acquire satellites because the intra-noise of the satellite signal is too high! The crossing of the two curves occurs at about 68 satellites or, roughly, again at about 2.5 systems. This leads to the important conclusion that, currently, 2.5 global satellite navigation systems are the useful limit. Going beyond that not only brings no further improvement but may cause trouble in acquiring satellites because of the dominant noise. Certainly, this conclusion is true only "for the moment" because the signal and the receivers will be improved further. Even today, military receivers acquire much better than the N0 = −204.5 [dBm] level of the civilian standard receiver. In conclusion, the measurement examples of this contribution together with the perspectives show the enormous potential and performance already achieved by a


Fig. 37 PDOP versus noise. (Published with kind permission of Günter Hein)

single global navigation satellite system, yielding amazingly good results. Nevertheless, in topographically challenging and strongly obstructed areas the expansion from an individual to multiple global navigation satellite systems, i.e., GNSS, is not only highly appreciated but may also be necessary to get measurements from a sufficient number of satellites. Referring to the achievable precision and accuracy, the combination of multiple global navigation satellite systems does improve the results, but only to some extent. Due to the noise behavior, a general statement "the more satellites the better" would not be correct. Yet, GNSS with four complete global navigation satellite systems (GPS, GLONASS, Galileo, BeiDou) will influence everybody's life even more than today! GNSS will offer applications which we cannot even dream of. Using GNSS in a proper way, we will profit and benefit tremendously. The GNSS future will be bright!

Acknowledgements

With kind permission of the Springer Publishing Company, this contribution to the Handbook of Mathematical Geodesy is basically a part taken over from the book "GNSS – GPS, GLONASS, Galileo & more", published in 2008 and written by Herbert Lichtenegger, Elmar Wasle, and myself. The republication in this Handbook offered the chance to include the experience obtained in the years that have passed since then by using the mathematical models. Therefore, I asked Mathias Duregger, student assistant at the Graz University of Technology, Institute of Geodesy, to carry out measurement campaigns, to process the data, and to prepare representative figures. These measurement examples are illustrative supplements to the pure theory. The ambitious support by Mathias and his independent and extremely reliable work are gratefully acknowledged. I would also like to thank Katrin Huber, James J. Spilker and Guenter Hein, who gave permission to publish the contents of figures they had used in one of their publications or presentations.
These figures themselves have been prepared again by Mathias Duregger.


Literature

1. Abdel-salam, M.A.: Precise point positioning using un-differenced code and carrier phase observations. Department of Geomatics Engineering, University of Calgary, Canada, UCGE Reports No. 20229 (2005)
2. Abidin, H.Z.: On the construction of the ambiguity searching space for on-the-fly ambiguity resolution. Navigation 40(3), 321–338 (1993)
3. Abidin, H.Z., Wells, D.E., Kleusberg, A.: Some aspects of "on the fly" ambiguity resolution. In: Proceedings of the Sixth International Geodetic Symposium on Satellite Positioning, Columbus, 17–20 Mar, vol. 2, pp. 660–669 (1992)
4. Alban, S.: Design and performance of a robust GPS/INS attitude system for automobile applications. PhD dissertation, Stanford University, California (2004)
5. Altmayer, C.: Cycle slip detection and correction by means of integrated systems. In: Proceedings of the 2000 National Technical Meeting of the Institute of Navigation, Anaheim, 26–28 Jan, pp. 134–144 (2000)
6. Beutler, G., Gurtner, W., Bauersima, I., Rothacher, M.: Efficient computation of the inverse of the covariance matrix of simultaneous GPS carrier phase difference observations. Manuscripta Geodaetica 11, 249–255 (1986)
7. Beutler, G., Bauersima, I., Gurtner, W., Rothacher, M.: Correlations between simultaneous GPS double difference carrier phase observations in the multistation mode: implementation considerations and first experiences. Manuscripta Geodaetica 12, 40–44 (1987)
8. Beutler, G., Gurtner, W., Rothacher, M., Wild, U., Frei, E.: Relative static positioning with the global positioning system: basic technical considerations. In: Bock, Y., Leppard, N. (eds.) Global Positioning System: an overview, pp. 1–23. Springer, New York/Berlin/Heidelberg/Tokyo [Mueller, I.I. (ed.) IAG Symposia Proceedings, vol. 102] (1990)
9. Borge, T.K., Forssell, B.: A new real-time ambiguity resolution strategy based on polynomial identification. In: Proceedings of the International Symposium on Kinematic Systems in Geodesy, Geomatics and Navigation, Banff, 30 Aug–2 Sept, pp. 233–240 (1994)
10. Braasch, M.S.: Multipath, Part C, Chapter 15. In: Teunissen, P.J.G., Montenbruck, O. (eds.) Springer Handbook of Global Navigation Satellite Systems. Springer, Cham (2017)
11. Brunner, F.K., Gu, M.: An improved model for the dual frequency ionospheric correction of GPS observations. Manuscripta Geodaetica 16, 205–214 (1991)
12. Cannon, M.E., Lachapelle, G.: GPS – theory and applications. Lecture Notes for a Seminar on GPS given at Graz in spring (1993)
13. Chen, D.: Development of a fast ambiguity search filtering (FASF) method for GPS carrier phase ambiguity resolution. Reports of the Department of Geomatics Engineering of the University of Calgary, vol. 20071 (1994)
14. Chen, D., Lachapelle, G.: A comparison of the FASF and least-squares search algorithms for ambiguity resolution on the fly. In: Proceedings of the International Symposium on Kinematic Systems in Geodesy, Geomatics and Navigation, Banff, 30 Aug–2 Sept, pp. 241–253 (1994)
15. Collins, J.P., Langley, R.B.: Possible weighting schemes for GPS carrier phase observations in the presence of multipath. Contract Report No. DAAH04-96-C-0086/TCN 98151 for the United States Army Corps of Engineers Topographic Engineering Center. Available at http://gge.unb.ca/Personnel/Langley/Langley.html
16. Colombo, O.L., Bhapkar, U.V., Evans, A.G.: Inertial-aided cycle-slip detection/correction for precise, long-baseline kinematic GPS. In: Proceedings of ION GPS-99, 12th International Technical Meeting of the Satellite Division of the Institute of Navigation, Nashville, 14–17 Sept, pp. 1915–1921 (1999)
17. Counselman, C.C., Gourevitch, S.A.: Miniature interferometer terminals for earth surveying: ambiguity and multipath with the global positioning system. IEEE Trans. Geosci. Remote Sens. GE-19(4), 244–252 (1981)

1442

B. Hofmann-Wellenhof

21 Elementary Mathematical Models for GNSS Positioning


Stichwortverzeichnis (Subject Index)

A

Abel-Poisson kernel, 163, 1194, 1195, 1238
  scalar, 1194, 1238
  tensorial, 1194, 1238
absolute gravimeter, 16
absolute value, 872
abstraction, 57, 150
acceleration in a rotating reference system, 703
accumulation period, 1281, 1284
addition theorem, 1211
  admissible, 1229
  for spherical harmonics, 203, 257
  for spin-weighted spherical harmonics, 255
  tensorial, 1213
advection equation, 600
Akaike information criterion (AIC), 568
ambiguity, 1290
ambiguity resolution, 1327, 1344–1346, 1352, 1387–1392, 1396, 1399, 1403, 1404, 1419, 1421
  on-the-fly, 1402
  instantaneous, 1372
angle between the maximal principal direction and direction of shear, 639
angular momentum
  barycentric, 652
angular velocity
  correction for IERS representation, 710
anharmonic, 984, 985, 989
  basis, 989
anharmonic parts, 359, 365
approximate radius, 379
approximation error, 1200
a-priori-information, 84
ARISTOTELES, 1191
AROF (ambiguity resolution on the fly), 1402
AR-process, 934, 941, 943, 946, 949
  Magic Square, 943, 947
Arzelà-Ascoli theorem, 854

associated spin-weighted Legendre function, 253
astrometry, 1258
asymptotic regularization, 860, 880
axes of a reference system, 700
axial vector of an antisymmetric matrix, 701
axisymmetric gravity field, 357, 359, 376
azimuth, 1266

B

Backus-Gilbert method, 1007
Balancing principle, 566
  hardened, 566
bandlimited, 156
bandpass, 1265
bandwidth synthesis, 1265, 1278, 1288
base vectors
  derivatives, 702
  of a reference system, 700
basic equations, 68
basis systems (gravimetry problem), 986
Bayesian statistic, 447, 448
bell curve, 837
Beltrami operator, 199, 269, 1090, 1209
  eigenfunctions, 203, 204
  eigenvalues, 203, 204
  spin-weighted, 223, 295, 297
Bernstein kernel, 161
  oscillation, 161
  properties, 161
best approximate solution, 1199
Best Linear Unbiased Estimator, 936
Best Linear Unbiased Predictor, 936, 937
Best Uniformly Unbiased Estimates, 714
best worst case error, 875
bias, 483–487, 489–493, 495, 496, 515, 519–522, 564
Bifurcation point, 391

© Springer-Verlag GmbH Deutschland, part of Springer Nature 2020. W. Freeden (ed.), Mathematische Geodäsie/Mathematical Geodesy, Springer Reference Naturwissenschaften, https://doi.org/10.1007/978-3-662-55854-6


BLUE (Best Linear Unbiased Estimator), 936
BLUP (Best Linear Unbiased Predictor), 936, 937
Boscovic-Laplace method, 831
Bouguer anomaly, 1125
Bouguer correction, 1125
boundary element method, 577
boundary value problem, 1073, 1079
  Dirichlet, 540
  geodetic, 1073
  Molodensky, 1073
  Neumann, 540, 1073, 1077
  oblique, 1073
  Stokes, 1073
bridge
  least squares solution, 837
  pseudoinverse, 837
Bruns’ formula, 1075, 1129, 1161
Bruns’ polyhedron, 9, 10

C

cable calibration, 1280
Cardan angles, 702
Cartesian coordinates, 700
Cassegrain system, 1269
CASTOR, 1191
Cauchy stretch tensor
  left, 637
  right, 637
cause, 75
covariance model
  exponential, 669
Celestial Ephemeris Pole (CEP), 708
Celestial Intermediate Origin (CIO), 708
Celestial Intermediate Pole (CIP), 708
celestial reference system, 704
celestial system, 711
center of mass of the earth, 652
central scheme
  uniform grid, 591
centrifugal force, 353, 703
centrifugal potential, 14, 357, 358, 1065, 1119
Chandler wobble, 26
Chebyshev-principle, 833
Cholesky
  approach, 938, 950, 951, 955, 961
  backward factorization, 939, 948, 960, 963
  backward substitution, 953, 955, 957, 958, 963
  block matrices, 957
  decorrelation, 941
  filter, 940, 949
  forward factorization, 938, 953, 955, 963

  forward substitution, 953, 955, 961, 963
  indefinite systems, 954
  inversion, 950, 956
  partial inverse, 958
  recursive backward edging, 959, 963
  recursive forward edging, 948, 949, 961, 963
  solution strategy, 952
  sparse systems, 954, 958
CIO (Conventional International Origin), 26
circuit geomathematics, 55
close to the identity transformation, 714
closed form solutions, 484, 497, 510–514
code ranges, 1347
collocation, 668, 933, 936
  covariance approach, 934, 937–941, 943, 948
  covariance estimator, 936, 937
  deterministic model, 936
  filter approach, 941–943
  parameter estimator, 936
  signal estimator, 937
  stochastic model, 936
collocation method, 582
compact operator, 853, 988
complete orthogonal anharmonic systems, 989
concretization, 57, 150
condition equation, 833
constitutive equations, 628
constraints manifold, 713
contamination error, 865
continuity equation, 1100
convective coordinates, 679
convergence rate, 79
convolution, 902
coordinate transformation, 482–484, 490, 497, 503, 510–514, 713
coordinates
  gravity acceleration, 14
core, 990
Coriolis acceleration, 1101
Coriolis force, 354, 358, 703
Coriolis parameter, 1102
correction
  Bouguer correction, 1125
  drift correction, 1124
  Eötvös correction, 1124
  isostatic correction, 1126
  terrain correction, 1126
correlation, 1280
covariance, 833
  approach, see Collocation, covariance approach

function, finite, 934, 938, 941, 954 matrix, 936–940 recursive defined sequence, 945–947, 950 sequence, 934, 935, 943 covariance model inverse quadratic, 669 Tscherning and Rapp, 670 using Legendre functions, 669 critical radius, 366 cross correlation theorem, 1281 cross spectrum, 1283 cross-covariance function modeling problem, 670 treatment by decorrelation, 670 cubature, 1239

D data combination, 1398 data gaps, 943, 950 data preprocessing, 1379 data set combination basics, 771 uncorrelated data sets, 771 de la Vallée Poussin-type singular integral, 1000 decomposition Hardy-Hodge, 332 Helmholtz, 329 decorrelation, 53, 937, 939 Cholesky, 941, 950 deflection of the vertical, 20, 341, 1074, 1075, 1090, 1130 deformation analysis on the reference ellipsoid, 683 in three dimensions, 689 deformation of earth surface, 681 deformation gradient determinant relation to dilatation, 637 deformation gradient matrix, 633 on the reference ellipsoid, 684 transformation under change of reference systems, 641 deformation gradient tensor, 629 deformation mapping, 628 deformation parameters invariance characteristics, 640 deformation of surface, 676 deformation of surface-like construction, 680 Delaunay triangulation, 663 delay calibration, 1280 delay search, 1288 design matrix, 718 detail space, 53

DGNSS, code ranges, 1347, 1348 DHHN2016, 1166 DHSN2016, 1166 differential positioning, 1340 diffusion methods, 55, 74 dilatation infinitesimal, 632 rigorous, 637, 638 dilemma, 67, 74, 82 dimensionality, 84 Dirac kernel, 164 Dirac distribution, 1090 direct gravimetry, 974 direct problem, 75, 848, 973 Discrepancy principle, 565, 882 variants, 565 discretization, 81 displacement gradient, 630 stochastic prediction, 672 displacement interpolation, 662 displacement vector, 630 disturbing gravity field, 19 disturbing potential, 17, 19, 22, 1071, 1079, 1080, 1090, 1160, 1221 diurnal rotation, 704 divergence theorem, 1057 Doppler data, 1339 downward computation, 68 downward continuation, 112, 114, 115, 135, 1192, 1231, 1232 SGG (Satellite Gravity Gradiometry), 32, 1237 drift correction, 1124

E Earth Orientation Parameters (EOPs), 709 compatibility conditions, 709 direction compatibility conditions, 709 for IERS representation, 709 linearized observation equations, 789 magnitude condition, 709 related to the angular momentum direction, 788 related to the Celestial Intermediate Pole, 788 related to the rotation vector, 788 Earth rotation classical IAU representation, 710 Earth rotation angle, 704 calculation of, 707 Earth surface deformation, 681 Earth system research, 6 Earth’s rotation, 26

edge taper, 1275 effect, 75 EGM2008, 588, 1037 elevation, 1266 elliptic parts, 359, 360 elongation factor, 636 along a particular direction, 635 maximum, 636 minimum, 636 EOP see Earth Orientation Parameters EOP time series implementation in stacking, 787 Eötvös correction, 1124 errors-in-variables, 479 equatorial radius, 355, 356 equipotential surface, 14, 1120 error contamination, 865 function, 1027 regularization, 865 specification, 78 estimability, 492–497 estimable linear functions of the parameters, 714 Euclidean geometry, 40 Euler angles, 702 Euler pole, 661 migrating, 662 Euler-Darwin force, 354, 358 evolving surface numerical approximation, 604 expectation, 831 exploration methods, 74 Exponential to exponential (X2X), 559 to local (X2L), 559 exterior oblique derivative problem, 1122 exterior vector product, 701

F falling apple, 13, 16 FARA (fast ambiguity resolution approach), 1408 fast ambiguity resolution approach, 1408 Fast Fourier Transform, 1282 Fast multipole algorithm, 557 acceleration, 558 tuning parameters, 559 feed horn, 1268, 1275 FFT (Fast Fourier Transform), 1282 fiber solution manifold, 712 fibering by shape manifolds, 712

filter Cholesky, 938 equispaced, 950 moving average, see filter, non-recursive non-recursive, 939, 940, 942, 949, 950 prewhitening, 934, 942, 947 recursive, 938, 939 time invariant, 942 time variant, 938 warmup phase, 943, 949, 950 filter methods, 84 filter qualification, 879 filter regularization, 876 Finger tensor, 634 finite element interpolation, 663 finite volume method, 577 first Green theorem exterior, 1063 interior, 1058 first order upwind scheme, 577 uniform grid, 591 Fourier coefficients, 1210 sphere, 542 Fourier expansion Legendre, 155 spherical harmonics, 153 Fourier series, 1104, 1210, 1216, 1223, 1226 Fourier transform, 1282 fractional sample delay, 1286 frame, 171 free air anomaly, 1125 frequency localization, 53 fringe rotation, 1286 fully normalized spherical harmonic, 204 function spaces, 84 fundamental equations of physical geodesy, 1132 for the sphere, 1134 fundamental solution, 1056 linearly regularized, 1142 of Laplace-Beltrami operator, 320 FX, 1283 G Gauss kernel, 166 Gauss theorem, 1057 Gauss-Green potential, 354, 358 Gaussian adjustment, 833 Gaussian error law, 832 Gaussian function, 831 Gaussian probability distribution, 831 Gaussian sum, 1027 initial approximation, 1027 spherical context, 1033

reductions, 1030 representation, 1243 test calculations, 1029 Gauss-Markov theorem, 714 GBVP (geodetic boundary-value problem), 17, 93, 101, 576, 1065 GCG2016, 1162 GCV (generalized cross-validation), 568 modified GCV method, 569 robust GCV method (RGCV), 568 strong robust GCV (R1 GCV), 569 generalized Bayes’ theorem, 448 generalized inner constraints, 730 generalized inverse, 726, 836, 838, 841, 850, 857, 892, 1196 least squares, 726 minimum norm, 726 reflexive, 726 generalized kinematic Euler equations, 702 generalized kriging, 674 geocenter, 736 geodesy geometrical, 10 physical, 6 satellite, 28 geodetic application civil engineering, 6 Earth’s climate system, 8 exploration, 8 physical oceanography, 7 solid Earth system, 7 geodetic boundary value problem, 17, 93, 101, 576, 1065 geodetic circuit, 58 geodetic height, 1155 geodetic observable, 1043 geodetic quantities, 1070 geoid, 6, 11, 16, 19, 1067, 1078, 1122, 1128, 1187 geoid height, 17 geoidal height, 20, 1128, 1155 geoidal undulation, 20, 22, 1068, 1075, 1128, 1187 geomathematics challenge, 47 cultural asset, 36 generic discipline, 38 interdisciplinary discipline, 46 interdisciplinary science, 47 potential theoretic roots, 36 as solution method, 49 solution potential, 54 task and objective, 36 geometrical geodesy, 8, 10

geometry, 39 physical, 5, 11 topographic, 5, 6 geopotential numbers, 1188 Geostationary Satellite Orbits, 379 geostrophic flow, 1103 geostrophic oceanic flow, 26 Geosynchronous Satellite Orbits, 379 German Basic Gravity Network (DSGN), 1166 German Combined Quasigeoid, 2016, 1162 German Main Gravity Network, 2016, 1166 German Main Height Network, 2016, 1166 GGP (Geodätische Grundnetzpunkte), 1163 GGRF (Global Geodetic Reference Frame), 5 Gibbs sampler, 448, 461–464 global navigation satellite system see GNSS GNSS, 7 BEIDOU, 7 GALILEO, 7 GLONASS, 7 NAVSTAR GPS, 7 GNSS (global navigation satellite system), 15, 29, 1136, 1315 elementary mathematical models, 1315 GNSS observables, 1319, 1320, 1324, 1326, 1375, 1379, 1390, 1410, 1433 GNSS positioning, 1315, 1320 GOCE (Gravity and steady-state Ocean Circulation Explorer), 1191, 1192, 1221 GRACE, 1192 gradient matrix, 629 decomposition in terms of strain parameters, 632 infinitesimal gradient method, 917 GRADIO, 1191 gradiometry, 70, 1189 scalar, 72 tensorial, 72 gravimeter absolute, 16, 974 relative, 16, 974 gravimetry, 69, 1123, 1188 absolute, 1189 inverse problem, 69 mollifier solution, 992 observable based, 70 relative, 1189 reproducing kernel structure, 1011 spline solution, 1022, 1042, 1043 gravimetry problem direct, 973 inverse, 973

gravitation, 6 gravitational constant, 978 gravitational force, 353, 379 gravitational potential, 1065 gravity, 6, 13, 14, 1078, 1089, 1101, 1119 acceleration, 13, 14, 1119 anomaly, 17, 20, 98, 1071, 1090, 1125, 1131 disturbance, 20, 1130 disturbance vector, 1071 intensity, 14, 1065, 1119 normal potential, 19 potential, 14, 19, 1064, 1065, 1119 vector, 1065 gravity changes, 1188 gravity data acquisition, 1164 providing, 1164 gravity field, 91, 93 ellipsoidal-spheroidal representation, 351 gravity field observables scientific use, 9 gravity satellite missions, 93 Green formulas interior, 1058 Green theorem, 975 vector analysis, 975 Green’s function, 1075 of Laplace-Beltrami operator, 327 Green’s second surface identity, 200, 258, 259 Greenwich Sidereal Time, 710 Gregorian system, 1269 gyroscopic force, 704

H Haar function, 996 Haar kernel, 166, 1105 smoothed, 166 Haar scaling function, 167 Haar-type mollification, 980 Haar-type singular integral, 992 Hadamard’s classification, 848, 1198 harmonic, 295 harmonic continuation, 1225 harmonic parts, 357, 359, 429 harmonic solution, 991

inverse gravimetry, 987 harmonic upward continuation, 101 height geoidal, 1155 geodetic, 1155 orthometric, 1155 quasigeoidal, 1157 height anomaly, 17 Helmert Blocking, 957 Helmert’s definition, 57 Hesse matrix, 1206 Hesse tensor, 1206, 1231–1233 higher order upwind method, 577 higher order upwind scheme non-uniform grid, 591 H-Maser, 1264 homogeneous harmonic polynomial, 202 homogeneous polynomial, 201, 291 hotspot, 1097

I ICRF (International Celestial Reference Frame), 1261 IERS representation of the earth rotation, 706 IGSR2016, 1162 ill-posed, 1198, 1207 in the sense of Hadamard, 1197 in the sense of Nashed, 1198 ill-posed problem, 44, 848 ill-posedness, 75, 848 image analysis, 180 incompressible fluid, 1058 inner constraints, 717 inner constraint matrix, 718 inner constraints solution by conversion of any least squares solution, 724 generalized form, 725 inner constraints for stacking, 744 inner harmonics, 553, 986 translation theorem, 554 instantaneous angular velocity, 703 instantaneous rotation vector definition, 703 Integrated Geodetic Spatial Reference, 2016, 1162 intermediate celestial reference system, 704 intermediate terrestrial reference system, 704 International Celestial Reference Frame, 1261 International Terrestrial Reference Frame, 1261 International VLBI Service for Geodesy and Astrometry, 1260

interpolation of displacements or velocities, 662 interpolation problem on a regular surface, 539 inverse Fourier transform, 153 generalized, 836 Moore-Penrose, 836 problem, 44, 75, 180, 973 inverse gravimetric problem, 97, 98, 110, 115, 116, 121, 139 inverse gravimetry, 974 solution by series expansion, 988 spectral solution, 986 IPP (ill-posed problems), 44 isostatic correction, 1126 isostatic compensation topographic masses, 1188 iterative regularization, 909 ITRF (International Terrestrial Reference Frame), 768, 1261 combination in the case of non-singular covariance matrices, 801 combination including EOP time series, 797 combination of initial coordinates, velocities and EOPs from all space techniques, 792 combination without taking EOPs into consideration, 794 formulation, 768 origin and scale treatment choices, 809 IVS, 1260

J Jacobi matrix, 372–374 Jacobi matrix field, 1206 Jacobi polynomial, 252, 989

K Kelvin transform, 549 kernel Abel-Poisson, 163, 316, 546 bandlimited, 156 Dirac, 164 Gauss, 166 Haar, 166 Legendre, 159 locally supported, 156, 548 reproducing, 544 scaling, 321 singularity, 546

spacelimited, 156 symbol, 156 wavelet, 321 zonal, 159 Kerr-Newman Solution, 357 kinematic constraints, 756 linearized, 759 nonlinear, 757 Kite structures, 958 Krarup sphere, 543 kriging, 668

L Landweber-iteration, 84, 911 Laplace equation, 1056 Laplace operator, 1209 Laplace-Beltrami operator, 312 lattice point sampling, 179 Lavrentiev regularization, 859 law of transformation of angular momentum, 764 layers, 990 L-curve method, 567 least squares, 828 Least Squares Collocation, see Collocation least squares method, 77 least squares problem, 850 least squares solution with minimal constraints, 722, 723 covariance factor matrix, 722, 724 least-squares solution, 1196, 1198 left Cauchy-Green deformation tensor, 634 Legendre kernel, 159 polynomial, 156, 313 symbol, 156 Legendre function associated spin-weighted, 253 Legendre tensors, 1213 LEO (Low Earth Orbiter), 29 level ellipsoid, 365, 366, 372, 383 level radius, 385 level surface, 9, 14, 15, 1120, 1187 levelling, 1159 Levinson-Durbin algorithm, 948, 960 linearization, 1419, 1421, 1422, 1425, 1427, 1429 LLR, 15 local basis, 700 Local to targets (L2T), 557 local-area DGNSS, 1351 localization Abel-Poisson kernel, 163

Bernstein kernel, 161 Dirac kernel, 164 frequency, 53 Gaussian kernel, 164 Haar kernel, 166 Shannon kernel, 167 space, 53 locally supported, 156 Love Shida Hypothesis, 427 Low-pass filter, 1095, 1105

M Magic Square, 934, 943, 946 AR-process, 943, 946, 947 signal, 934 mantle, 990 mantle plume, 1097, 1188 map, 10 mapping function, 1267 Markov Chain Monte Carlo method, 447, 461, 462 Marmousi model, 1000 Marussi condition, 1067, 1069 mass point, 978 mass point system, 978 mass transport, 1188 mathematical model, 848 mathematization, 46 of sciences, 38, 47 matrix equation ill-conditioned, 839 maximal index, 564 maximum shear infinitesimal, 632 maximum shear strain direction of, 632 infinitesimal, 631 maximum strain direction angle, 632 Maxwell’s equations time harmonic, 342 MCAR (multiple carrier ambiguity resolution), 1398 mean locally supported, 167 mean celestial system, 710 mean quadratic size of a network, 757 mean square error, 831 mean vernal equinox, 711 Meissl scheme, 155, 1192, 1233 method Haar mollifier, 992 iterative, 909

of least squares, 828 mollifier, 921 multiscale, 899 singular integral mollifier, 1004 de la Vallée Poussin mollifier, 1000 Metropolis algorithm, 447, 461, 462 minimal constraints, 718 condition for, 719 minimal constraints solution by conversion of any least squares solution, 724 generalized form, 725 minimum distance constraints, 745 minimum mean square error interpolation, 667 minimum norm of initial coordinates and velocities separately, 754 minimum norm interpolation, 665 minimum norm solution by conversion of any least squares solution, 717 minimum trace of covariance matrix for initial coordinates and velocities separately, 755 minimum weighted norm constraints, 730 model mathematical, 848 models without full rank solution by minimal constraints, 718 modified GCV method, 569 mollifier reproducing Hilbert space kernel, 977 singular integral-type, 977 mollifier method, 921 Backus-Gilbert-type, 1007 Haar-type, 991 Haar-type numerics, 996 multiscale Haar-type, 993 RKHS-type, 1041 singular integral-type, 1005 spline-type, 1042 de la Vallée Poussin-type, 1002 Molodensky boundary value problem, 17 Molodensky problem, 1064, 1066, 1072, 1079, 1160, 1162 linearization, 1068 moment problems, 180 Monte Carlo Integration, 457 Monte Carlo method, 447, 448, 452 Moore-Penrose inverse, 836, 838, 841 Morozov’s discrepancy principle, 884 multiple carrier ambiguity resolution, 1398 Multipole to exponential (M2X), 559 multiresolution scheme, 908

multiscale approximation, 53 multiscale expansion, 321 multiscale method, 899, 1085, 1146 multiscale mollifier method, 995 multiscale regularization, 1195

N Nashed’s bibliography, 836 nature of gravitation, 6 Navier-Stokes equations, 1100 Nested Dissection, 958 Neumann boundary value problem, 20 of physical geodesy, 1137 Neumann kernel, 1137 linearly regularized, 1143 Newton, 1054 law, 1054, 1055, 1205 second law, 1101 Newton integral, 974 properties, 979 Newton kernel, 68 Newton operator, 98 Newton spline method, 1022 Newton spline mollifier method, 1041 Newton volume integral, 1205 Newton’s Law of Motion, 12 Newton’s second law of dynamics in a rotating reference system, 703 Newtonian volume potential, 979 Poisson equation, 982 properties, 979 noise deterministic, 562 stochastic, 562 strong, 864 weak, 866 noise error, 79 noise process, 936, 942, 944 Non Rotating Origin (NRO) conditions, 706 Non Rotating Origin (NRO) principle, 705 non-adjustable observations, 777 in correlated data sets, 779 in uncorrelated data sets, 777 nonlinear adjustment, 508, 509, 515, 517, 519, 520 nonlinear confidence region, 484, 517–518, 523, 524 normal equations, 841, 881 using minimal constraints, 719 normal equations coefficient matrix recovery from parameter covariance matrix, 770 normal field, 17

normal gravity, 1160 normal height, 1157 normal height vector, 1160 normal potential, 1066, 1160 null space of design matrix, basis, 716 inverse gravimetry, 987 nutation in longitude, 710 nutation in obliquity, 710 Nyquist, 1278 Nyquist frequency, 935 O object, 973 oblique derivative, 1043 oblique derivative problem, 17, 577, 1073, 1074 obliquity, 710 observables manifold, 711 observation vector reduced, 937 ocean tides magnetic field, 342 ocean topography, 1102, 1108, 1187 oceanic flow, 25, 1100 Octtree child (cube), 551 construction, 551 directional lists, 559 lists of each cube, 551 neighbor (cube), 551 parent (cube), 551 well-separated (cube), 551 ðN, see also spin lowering and raising operators, 209, 264 kernel, 281, 284 operator S , 96 operator equation, 74, 1196 gravimetry, 974 solvability conditions, 1197 optimal spatiotemporal reference system differential equations, 733, 736 as a geodesic of minimum length, 733 orthogonality conditions, 735 by vanishing of relative angular momentum and constant barycenter, 736 origin of a reference system, 700 orthogonal anharmonic systems, 989 orthogonal harmonic systems, 988 orthogonal regression, 484, 497–499, 501 orthometric height, 1155 oscillations Bernstein kernel, 162 Shannon kernel, 162

outer harmonics, 553, 986, 1192, 1232 translation theorem, 554

P parallel spatiotemporal reference systems, 734 parameter choice, 83, 1201 parameter choice method a-posteriori, 563 a-priori, 563 data-driven, 563 heuristic, 563 partial generalized inner constraints, 730 partial inner constraints, 729 involving only initial coordinates and transformation parameters, 746 involving only initial coordinates and velocities, 746 involving only transformation parameters, 746 involving only velocities and transformation parameters, 746 phase calibration, 1279, 1288 physical geodesy, 6, 8, 1054 physical geometry, 5, 11 physical surface, 10 Picard condition, 857, 900 Picard criterion, 1202 Pizetti concept, 1077 planar rotation matrix, 715 plumb line, 14, 15, 1065, 1119 Point to multipole (P2M), 557 point positioning, 1318, 1324, 1326, 1330–1338, 1350, 1377 carrier phases, 1336 code ranges, 1423 Doppler data, 1339 precise, 1340 Poisson equation, 69, 982, 1056 Poisson kernel, 69 polar motion, 26, 704 polar motion matrix, 704, 706 post-linear coordinate variation models, 816 amplitude and phase modulation of annual carrier, 819 cubic splines with equidistant nodes, 817 Fourier series model, 817 Singular Spectral Analysis, 817 potato Earth, 48 potential, 1054, 1055 anomaly, 1071 mass point, 979 mass point system, 979 volume, 979

potential methods, 55, 74 power spectrum, 934, 935, 947 precession angles, 710 precession and nutation, 704 precession-nutation matrix, 704, 706 prime focus, 1269 principal directions of deformation, 634 principal plane 3D dilatations, 690 principal plane 3D shears, 691 principal strains infinitesimal, 631 principal strains, rigorous, 637 probability distribution function, 837 problem direct, 848 ill-posed, 839, 848, 1197 inverse, 839 least squares, 850 weighted least squares, 852 well-posed, 848, 1197 product kernel, 544 projector orthogonal, 937 propagated noise error, 563 pseudodifferential operator, 154, 1194, 1228, 1230 Hesse tensor, 1232 pseudo-forces in a rotating reference system, 703 pseudoinverse, 726, 836, 838, 841, 857, 1196 publication list, 181 Q qualification, 878 quantization, 1286 quasar, 1258, 1261, 1262, 1265 quasi-optimality criterion, 567 quasigeoid, 1157 quasigeoid modeling, 1164 quasigeoidal height, 1157 quaternion, 510, 511, 513 R radial basis function, 315 radially symmetric density distribution, 990 radio source, 1258, 1261 range rate measurements, 28 rank deficiency interpretation indices, 741 reality space, 58 reconstruction error, 878 reconstruction method, 875 recovery problem, 180 redistribution of points

tangential, 601 reference ellipsoid, 1072 sphere, 1072 reference surface, 1121 reference system definition of, 700 reference system for the description of earth rotation, 704 reference system in motion, 701 reference system for a rigid network realization within data analysis, 711 regular region, 853, 977, 1056 regular surface, 543, 1207 regularization, 77, 1194 asymptotic, 880 Backus-Gilbert method, 1204 classification, 45 filter scheme, 1204 generalized inverse, 1203 iteration scheme, 1204 iterative, 909 mollification scheme, 1204 moment scheme, 1204 strategy, 1199 Tikhonov, 867, 881 regularization error, 79, 563, 865 regularization method, 44, 870, 1203 regularization parameter, 1199 chosen, 563 maximal, 564 regularization problem, 79 regularization-approximation scheme, 1205 regularized Green kernel function, 1092 regularized kernel, 1082 regularizer, 78, 864 regularizing filter, 1203 relative angular momentum, 652, 737 relative gravimeter, 16 relative kinetic energy, 652, 737 relative motion separation from deformation, 648 of two regions, 651 relative motion detection from 3D displacements, 655 from 3D velocities, 653 from plane displacements, 659 from plane velocities, 656 from velocities on the sphere, 660 relative positioning, 1352 relative rotation vector, 705 reproducing kernel, 544 reproducing kernel function, 1224 reproducing kernel Hilbert Space, 1011

reproducing kernel space, 888 right Cauchy-Green deformation tensor, 634 robust GCV method (RGCV), 568 rotating reference system, 702 rotation angle, 713 rigorous, 637 rotation matrix, 510, 511, 514, 637, 702 Runge concept, 1219, 1232 Runge (Bjerhammar) sphere, 543, 986, 1219 Runge-Walsh approximation property, 1218 scalar, 1219 tensorial, 1219 Runge-Walsh concept, 18 Runge-Walsh theorem, 1112

S sample covariance functions, 668 sampling, 149 Euclidean, 181 spherical harmonics, 176 sampling-importance-resampling (SIR algorithm), 446, 447, 460, 462 sampling theorem, 890, 1013 SampTA, 181 SAPOS, 1164 satellite altimetry, 26 satellite-based positioning, 1318 satellite concepts, 32 satellite geodesy, 8, 28 satellite gravimetry, 1188 satellite gravitational gradiometry see SGG satellite laser ranging, 24 satellite observation, 27 satellite-to-satellite tracking, 28 scalar outer harmonics, 1216 scalar spherical harmonics, 1209 scale definition in ITRF formulation extended approach, 811 restricted approach, 810 scale factor ratio, 641 scale parameter, 713 scale space, 53 scaling equation, 1094 scaling function, 171, 1083, 1094 decomposition regularization, 903 Neumann scaling function, 1145 reconstruction regularization, 903 Stokes scaling function, 1140 vector scaling function, 1151 scientific bridge geodesy, 59 mathematics, 59

sea surface topography, 26 second Green theorem exterior, 1063 interior, 1059 section of a fibering, 712 SEFD (System Equivalent Flux Density), 1264, 1266 semi-iterative method, 914 separable generalized inner constraints, 745 SGG (Satellite Gravity Gradiometry), 29, 32, 1188, 1189, 1237 as non-standard problem, 1206 frequency domain, 1194, 1220 space domain, 1194, 1220 spacewise approach, 1192 timewise approach, 1192 SGG bandpass potential, 1247 SGG lowpass potential, 1247 SGG pseudodifferential operator, 1233 SGG uniqueness, 1221 Shannon sampling, 175 Shannon sampling theorem, 176 shape manifold, 712 of a deformable geodetic network, 732 Shapiro term, 1294 shear along the first coordinate axis, 637 at any direction, 637 direction of, 639 rigorous, 639 shear components, infinitesimal, 632 signal analysis, 180 similarity transformation, 713 linearized three-dimensional, 715 in the plane, 715 single-layer kernel, 1140 linearly regularized, 1141 singular integral, 992, 1003, 1004 Singular Spectral Analysis, 817 singular system, 1202 singular value decomposition, 633, 855 of deformation gradient matrix, 634 singular value expansions, 1202 sink, 1058 SIR algorithm, 446, 447, 460, 462 Slepian function, 157, 324 SLR, 15, 24 smoothed Haar kernel, 166 smoothing parameter, 547 SNR (signal-to-noise ratio), 1280 Sobolev Lemma, 1225, 1227 scalar, 1225 tensorial, 1227 Sobolev space, 1194, 1223, 1226

harmonic functions, 544 solid angle, 975, 1059 solution manifold, 712 Somigliana-Pizzetti level ellipsoid, 365, 372, 375, 376 source, 1058 space localization, 53 spaceborne methods GNSS, 15 LLR, 15 SLR, 15 VLBI, 15 spacelimited, 156 spatiotemporal reference system definition by various types of minimal constraints, 744 for a deformable geodetic network, 731 deterministic aspects and reference system optimality, 731 mathematical modeling, 731 a posteriori change in a stacking solution, 748 as a section of the shape manifolds, 732 spectral domain, 935, 947 spectral inversion inverse gravimetry, 990 spherical coordinates, 94 spherical geometry, 95 spherical harmonic, 95, 96, 152, 202–204, 313, 1209, 1213, 1215, see also spin-weighted spherical harmonic addition theorem, 203, 257 complex-valued fully normalized, 541 fully normalized, 204 real-valued fully normalized, 541 scalar, 153 tensorial, 205, 303 vectorial, 205, 302, 335 spherical harmonic expansion, 1104 spherical sampling need, 149 spheroidal parts, 359, 360 spin lowering and raising operators, see also ðN , 264 spin weight, 207, 266 spin-weighted Beltrami operator, 223, 295, 297 spin-weighted differential operators, see also spin lowering and raising operators, 209 spin-weighted spherical harmonic, 202 addition theorem, 255 asymptotic boundedness, 250 boundedness, 248

Christoffel-Darboux formula, 220 completeness, 290 complex conjugation, 258 definition, 209 differential relations, 299 as eigenfunction of a differential operator, 223, 297 equivalent definition, 211, 237 norm estimate, 296 orthonormality, 254 Parseval identity, 291 recursion relations, 214, 300, 301 Sturm-Liouville differential equation, 222 in terms of (associated) spin-weighted Legendre functions, 254 in terms of Jacobi polynomials, 252, 253 spline, 170, 545 coefficients, 545 harmonic, 545 interpolating, 545 minimum properties, 546 smoothing, 547, 558 spherical, 317, 548 vector spherical, 338 SST (Satellite-to-Satellite Tracking), 28 stacking conversion to the solution for inner constraints, 750 conversion to the solution for kinematic constraints, 761 conversion to the solution for minimum distance constraints, 749 conversion to a solution satisfying different constraints for initial coordinates and velocities, 753 conversion to a solution satisfying some type of inner constraints, 748 conversion to a solution satisfying some type of partial inner constraints, 750 of a coordinate time series from a single space technique, 782 generalized partial inner constraints solution involving only initial coordinates and velocities, 751 generalized partial inner constraints solution involving only transformation parameters, 752 linearized observation equations, 739 relation between different least squares solutions, 742 strain parameters, rigorous

invariance characteristics, 642 standard deviation, 831 static relative positioning, 1363 stochastic process autoregressive, see AR-process covariance stationary, 934–936, 943 moving average, 946 noise, 936, 942, 944 Stokes exterior problem, 1072 integral formula, 1080 kernel, 1080 Stokes boundary value problem, 17 Stokes problem, 1162 stopping rule, 84 strain matrix, 630 diagonalization of infinitesimal, 631 infinitesimal, 631 rigorous, 636 transformation under change of reference systems, 641 strain parameters, 629 computational algorithm, 636 infinitesimal invariance characteristics, 641 strain rate infinitesimal, rigorous character for linear in time coordinates, 647 invariance characteristics, 644 single epoch, 644 single epoch, rigorous, at any epoch, 648 two epoch, 644 strain rate parameters, 630 stress-strain relations, 628 strong noise, 864 strong robust GCV (R1 GCV), 569 subduction process, 1188 summability condition, 543, 548 surface curl, 1209 curl gradient, 1103, 1209 divergence, 1209 physical, 10 regular, 543 topographic, 9 surface deformation, 676 surface density, 97 surface gradient, 1075, 1209 surveying, 10 symbol, 990, 1228, 1229 kernel, 156 Legendre, 156 system Earth, 9 system temperature, 1264

T tangent space to a shape manifold, 713 telluroid, 1066, 1157 tensor outer harmonics, 1217 tensor spherical harmonic, 205, 303, 1212, 1214 tensorial SGG-problem, 1217 terrain correction, 1126 terrestrial geodesy, 16 Terrestrial Intermediate Origin (TIO), 708 terrestrial levelling, 1157 terrestrial reference system, 704 third Green theorem exterior, 1064 interior, 1059 three-dimensional deformation, 689 tidal signal, 1188 Tikhonov regularization, 860, 867, 881, 1203 time domain, 935, 947 Tisserand axes, 737 Tisserand reference system, discrete, 652 Toeplitz structure, 947, 949, 960 Toeplitz system, 948 topographic geometry, 5, 6 topographic surface, 9 toroidal parts, 426, 429, 430 total error, 1200 total least squares, 482, 499 totally symmetric, 207 transformation of a network reference system to an earth reference system, 762 of stacking parameters under change of reference system, 743 transformation parameters, 713 translation exponential, 558 L2L with rotations, 558 local to local (L2L), 556 M2M with rotations, 558 multipole to local (M2L), 555 multipole to multipole (M2M), 555 translation components, 713 translation theorem inner harmonics, 554 outer harmonics, 554 transport theorem, 1100 transversality condition for section of fibering of shape manifolds, 713 tree algorithm, 53 triangulation, 9, 12, 582 trivial minimal constraints, 728 true celestial system, 710 truncated singular value decomposition, 1203

U unbiased estimate, 491, 492, 496 unbiased prediction risk method, 568 uncertainty principle, 51, 157–1219 upper sideband, 1278 upward continuation, 70, 95, 99, 1194, 1227 harmonic, 101 USB (upper sideband), 1278 UT1 time, 707

V variance, 564, 831 of unit weight, 483–485, 489–491, 493, 496, 509, 522, 524, 529 variance components, 482 variation of EOPs under changes in the terrestrial, or the celestial reference system, or both, 788 vector components, 700 vector spherical harmonic, 205, 302 velocity interpolation, 662 velocity gradient matrix, 630 on the reference ellipsoid, 685 vernal equinox, 711 vertical deflections, 1139 very long baseline interferometry, 24 virtuality space, 58 VLBI, 15, 24 volume dilatation, 691

W wave methods, 55, 74 wavelet, 171, 1083, 1093, 1106, 1140 Neumann wavelets, 1145 vector wavelets, 1151 wavelet approach, 1195 wavelet function, 171 decomposition regularization, 906 reconstruction regularization, 906 wavelet reconstruction, 1146 weak noise, 866 weighted least squares, 479, 482–492, 502, 522, 524 weighted minimum distance constraints, 745 weighted minimum norm constraints, 745 weighted total least squares, 482–484, 497, 502–505, 522, 526

well-posed in the least-squares sense, 1198 well-posed problem, 848 well-posedness, 76, 848 Wiener-Chintschin theorem, 935, 947 Wigner D-function, 224 worst case bound, 884 worst case error, 875

X XF, 1283

Y Yn,j , see spin-weighted spherical harmonic Ynm , 96 Yule-Walker equations, 944, 945, 948, 950 reorganized, 945

Z zero frequency tide system, 359, 375–377 zero gravity radius, 383, 384–386 Zund Theorem, 392