Table of contents:
Preface
Acknowledgments
Contents
1 The Mass of the Earth
1.1 The Radius of the Earth
1.2 Earth-Sun and Earth-Moon Distances
1.3 Earth-Moon Distance (Hipparchus' Way)
1.4 From Aristarchus to the Enlightenment
1.5 The Figure of the Earth
1.6 Newton's Laws of Motion and Gravity
1.7 Bouguer's Expedition to the Andes
1.8 Maskelyne and the Schiehallion
1.9 Cavendish' Experiment
2 The Structure of the Earth: Rotation, Precession, Cosmogony
2.1 Euler's Laws of Motion
2.2 The Inertia Tensor
2.3 Precession (Free and Forced)
2.4 The Formation of the Earth (and Other Planets)
3 The Forces that Shape the Earth: Neptunism Versus Plutonism
3.1 Charles Lyell
3.2 Aristotle's Meteorology
3.3 Theophrastus
3.4 Steno
3.5 Neptunism: De Maillet, Buffon, Werner
3.6 Basalt
3.7 Plutonism: James Hutton
3.8 James Hall (Scottish Physicist): Experimental Mineral Physics
3.9 Granite
4 The Age of the Earth
4.1 Layers, Fossils, Correlation
4.2 Relative Versus Absolute Age
4.3 Newton's/Buffon's Estimate of the Age of the Earth
4.4 Hutton's Idea of Time
4.5 Darwin and the Denudation of the Weald
4.6 John Phillips and the Ganges
4.7 Conservation of Energy
4.8 Conservation of Mechanical Energy
4.9 Heat
4.10 Heat Capacity
4.11 Conservation of Energy, ``Generalized''
4.12 The Sun's Energy Output and the Sun's Age
4.13 Fourier's Analytical Theory of Heat
4.14 Kelvin's Estimate of the Age of the Earth
5 The Forces That Shape the Earth: Shrinking, Isostasy, Drift
5.1 Élie De Beaumont: The Shrinking Earth
5.2 Henry Rogers
5.3 James Hall (American Geologist)
5.4 James Dana
5.5 Isostasy
5.6 Asthenosphere
5.7 Continental Drift
6 The Vibrations of the Earth
6.1 Robert Mallet, 1848
6.2 John Michell, 1760
6.3 Back to Mallet
6.4 Hydrodynamics
6.5 Elastic Waves (``Body Waves'') in an Unbounded Fluid
6.6 ``Surface Waves'' in a Fluid Half Space
6.7 Back to Mallet, Again
6.8 The Stress Tensor
6.9 Hooke's Law
6.10 Shear and Compressional Waves
6.11 Rayleigh Waves
6.12 Richard Oldham, 1900
6.13 Love Waves
6.14 Dispersion, Group, Phase
7 The Structure of the Earth: Seismology
7.1 The Earth's ``Core'', Seen by Seismic Waves
7.2 The ``Moho''
7.3 Geometrical Optics
7.4 Snell's Law
7.5 Huygens' Principle
7.6 Diffraction
7.7 Fresnel: Interference Principle and Huygens-Fresnel Principle
7.8 Reflection and Refraction Coefficients
7.9 Stoneley Waves
7.10 Head Waves
7.11 The Herglotz-Wiechert-Bateman Method
7.12 The Rigidity of the Earth's Central Core
7.13 Inge Lehmann and the Inner Core
7.14 From Seismic Velocities to Density: The Williamson-Adams Method
7.15 Francis Birch
7.16 Bulk Composition of the Earth
7.17 Olivine, Peridotite, Perovskite, Post-Perovskite
8 The Forces that Shape the Earth: Convection and Plates
8.1 The Discovery of Radioactivity
8.2 Radiometric Dating
8.3 Earth's Radioactivity and Surface Heat Flow
8.4 Viscosity
8.5 Postglacial Rebound
8.6 Convection
8.7 Rayleigh Number
8.8 The Earth's Magnetic Field
8.9 Magnetization of Rocks
8.10 Maps of the Ocean Floor
8.11 Robert Dietz and Harry Hess
8.12 Geomagnetic Reversals
8.13 The Zebra Pattern
8.14 Transform Faults
8.15 Rigid Blocks
8.16 A Model of the Oceanic Lithosphere
8.17 The Half-Space Cooling Model Versus the Data
8.18 Subduction
8.19 What Moves the Plates?
8.20 Mountain Building
9 Our Concept of the Earth
9.1 Extrapolating Temperature via the Adiabatic Gradient
9.2 Geochemical Models of Mantle Convection
9.3 Geodynamical Models of Mantle Convection
9.4 Global Seismic Tomography
9.5 Putting Everything Together
Appendix Exercises: Further Food for Thought
Appendix Notes
Index
Lapo Boschi
Our Concept of the Earth: What We Know About Our Planet and How It Was Discovered
Department of Geosciences, University of Padua, Padua, Italy
ISBN 978-3-031-71578-5
ISBN 978-3-031-71579-2 (eBook)
https://doi.org/10.1007/978-3-031-71579-2

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2024

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

If disposing of this product, please recycle the paper.
To all the people I shared my time with in Cambridge, Mass., 1996–2001: my teachers Adam, Göran, Jeroen; my fellow graduate students; and my good friends at the Bologna Café.
Preface
There are a few things you need to know before you decide whether to read this book. I don’t think you’d be able to figure them out from just looking at the title and/or the table of contents, or quickly scanning the chapters, so here is this preface. I’ll keep it as short as I can.

First of all, Our Concept of the Earth is a geophysics book. “Geophysics” means a lot of things, but basically this book is about the so-called “solid earth”: no climate, atmosphere, oceans, really—important topics, but not what this book is about. When I say solid earth I mean the planet—our planet—as a whole: its “global” structure, from the surface of the continents down to the so-called “inner core”. No “applied” geophysics, either: I am not going to teach you how to find oil or other valuable underground stuff, although some of the material you’ll learn here might help you to do that, too. And very little seismology, actually: or, I should say, a lot of seismic waves—which some geophysicists “use” to make x-ray-type scans of the interior of the planet—but very little about earthquakes per se.

This is not a random choice of topics, of course. They all make the most important contributions to addressing one fundamental question, or set of closely connected questions about the earth: why does it have mountains and oceans? why is it shaken by earthquakes, volcanoes? how come it has a magnetic field? and so on and so forth: in summary, how does it work? Fundamental questions that humans have been trying to answer throughout our entire, multi-millennial history. This brings me to the most important point that I need to make: that the main thing I am trying to do with Our Concept of the Earth, which you won’t find in other geophysics books, is to place ideas in their historical context. That doesn’t mean that this is a history-of-science book: it is not.
But the thing is, in every single geophysics book I’ve ever read, you’d always run into some very important piece of knowledge that somehow “falls from the sky”. You start reading about the origin of the earth’s magnetic field, and you find out that the earth’s core is made of iron—but how do we know it’s made of iron? how can we be so sure? this, you are not told. Or, you read about earthquakes: most earthquakes happen where a tectonic plate slides past another tectonic plate, which makes sense, but, wait a second, what exactly is a tectonic plate? how are we so confident about the whole theory of plates? OK, today we have GPS, but what about before GPS? how did people find out? And how about the age of the earth? we know it’s four and a half billion years, through radioactivity. But, what exactly is radioactivity? and how can one use it to “date” rocks?

Bottom line, maybe it’s just me, but, reading those books, I couldn’t help feeling that for each answer I got, there was at least one more new question coming up—which was frustrating. At some point, though, I realized that there is one way out of this: look at things chronologically. Follow how the geophysics discourse developed over the years, over the centuries. From simple, early models based on few data, all the way to the complex idea of the earth that people agree upon today. If we start from the beginning, and learn about things sequentially, in the order in which they were actually discovered, then there’s no way anything will ever need to fall from the sky. That’s exactly what Our Concept of the Earth does.

A consequence of this, for instance (and I am talking to those of you who are “scared of”, or “don’t like”, physics and/or math), is that you do not need to read three or four heavy chapters on calculus and/or thermodynamics, Maxwell’s equations, and so forth, before you can even think about the main subject of the book. Instead, we’ll start right away with the main questions we are interested in (see above), and see how people addressed them before calculus even existed. And then, as new questions arise from the very simple (too simple) theories that had been proposed early on, the book will teach you just enough math and physics so that you can understand what’s wrong with those theories, and what makes the new ones better. And you’ll keep learning the hard theory along the way, no less and no more than is necessary.

This might be annoying if you are like I was a few years back: a physics graduate who starts a Ph.D. program in geophysics, knowing everything about integrals and derivatives, but nothing of the geo-problems that this book is about. It might also be annoying the other way around: because the book has a lot of basic geology that might be interesting to the physics major that I was, but too much for geologists who’ve already studied it for a semester or more. This is why Our Concept of the Earth has a large apparatus of endnotes. The main body of text, which skips some of the “details”, is designed to be read seamlessly: meaning, you don’t have to read the notes. You have enough elements, even without the notes, to follow the story that I am trying to tell. If you are curious, though; if, like me, you are bothered when explanations are not 100% transparent; or if you simply enjoy going off on a tangent every now and then: then yes, go ahead and read the notes. Is the back-and-forth annoying? Still, IMHO, better than having to grab your phone or tablet to look things up on Wikipedia or elsewhere.

Now, even after all this, I am sure that people will still ask me—and I am thinking of those friends of mine who are not scientists—they’d still ask me, “yes, but would I be able to read your book?” My answer to this is a resolute “yes”... followed by a cautionary note: as is always the case with science books, you do need a certain amount of patience. Because Our Concept of the Earth is not a “pop-sci” book. And maybe, at first sight, it’s less “exciting” than those books are; but, unlike those books, it doesn’t stop at the surface of things, and tries, instead, to get to the bottom of them. A consequence of this is that it’s got to have math: equations: that notorious turn-off.
There’s no way around it, because that’s what we do: those are the tools and the language of geophysics. But here’s another thing: you do not need to be proficient in that language before you start reading this book. All you have to do is play the game. If something is not clear to you, there’ll be an endnote, providing the background you are missing. Occasionally, you might have to re-read a paragraph a couple of times, or more. But I am persuaded that, with a high-school-level background in math and some motivation, the book will work for you just fine.

Before wrapping this up, I realize there are a few slightly more mundane questions that I probably need to answer. You might want to know what courses this book might match. It certainly matches the courses I have been teaching, over the last few years, at the University of Padua, called “physics of the earth”, “solid-earth geophysics”, and “mathematical physics for the earth system”. “Physics of the earth” is offered to third-year geology undergrads, while the other two courses are given in the first semester of a master’s program in “geophysics for natural risks and resources”. Our Concept of the Earth will teach you the basics of geodynamics, seismology, seismic imaging, tectonophysics, and a bit of geochemistry and mineral/rock physics—if you know what I am talking about; if not, the book explains it all—and show you how all those fields of study are linked to basic geology. Personally, I wish I could have read this book when I started graduate school—it would have filled a lot of gaps in my background, and provided context for, and links between, all the disparate things that I had to very quickly learn about.

You might notice an “exercises” chapter at the end of the book (before the notes). Questions are ordered according to when the corresponding material first appears. Feel free to take a look at them every now and then, and certainly once you have finished the book, and give them some thought. If you’ve read the whole thing (incl. the endnotes), then you have all the info you need to answer all the questions.

Now that you’ve been so patient as to get to the end of the Preface, I hope you’ll be intrigued enough to want to read the whole thing. If that is the case, here’s my final piece of advice. You can use this book as a cookbook if you really want to—there’s even a subject index at the end—but I suggest you don’t. Our Concept of the Earth tells a story, so it will work best for you if you read it like you’d read a novel—I don’t claim that it’ll be as entertaining as a good novel, although it should be more entertaining than many bad ones—but you see what I mean: read it from beginning to end. I promise it will be useful for your coursework, too—but try not to think about that as you read. Forget about school for a little while. Be patient; let the story unfold; enjoy yourself.

Padua, Italy
Lapo Boschi
Acknowledgments
I am grateful, first of all, to Hijiri Shimamoto for helping with figures 3.5, 5.1, 5.4–5.6, 5.8, 8.4, 8.5, 8.10, 8.31, 8.42, 8.43, N.22, N.54 (and with many other things); Piero Poli helped with figures 4.2, 5.2, 6.2, 6.10, 6.22, 8.13, 8.14, 8.15, 8.18, 8.21, N.24; Alessandro Mastrolia and Irene Molinari helped with figure 6.13; Henrique Berger Roisenberg with figure N.50. The photograph on the front cover was taken by Claudio Rosenberg; Quentin Grimal did some very careful proofreading.

I also wish to thank, for many reasons: Thorsten Becker, Maria Elina Belardinelli, Maurizio Bonafede, Fabio Cammarano, Giorgio Cassiani, Nicolas Coltice, Chris Finlay, Galen Pippa Halverson, John Hernlund, Ben Holtzman, Christine Houser, Tomoo Katsura, Fabrizio Magrini, Christine Meyzen, Jerry Mitrovica, Jean-Paul Montagner, Aldo Rampioni, Claudio Rosenberg, Giorgio Spada, Andreas Stracke, Jeroen Tromp, Martin Vallée. Special thanks to Angela Lahee, who believed in my project and turned it into something real.

Finally, and importantly, I don’t think I would have been able to write Our Concept of the Earth, the way it is, if I had had to go and find physical copies of all the books and journals I’ve read: particularly for the older material, archive.org and gallica.bnf.fr were very, very useful.
Chapter 1
The Mass of the Earth
I gave it some thought. I gave it quite a bit of thought, actually. And I figured the right place to start is Buffon. Meaning, Georges-Louis Leclerc de Buffon, who in 1749 started publishing his Histoire Naturelle. Because what we talk about when we talk about science, today, is really what people decided to call science around the time of the so-called enlightenment, or l’âge classique as the French like to call it; and the ultimate enlightenment earth science book is the first volume of Buffon’s magnum opus. Which opens with a relatively short section on “the history and theory of the earth”, followed by a long chapter called “proofs of the theory of the earth.” OK, Buffon’s is not the only natural history of the earth published around that time, but I think it’s safe to say that it was the most influential one of its generation, maybe of the whole century. Buffon was an important man: in 1738 he was appointed director of the King’s Botanical Garden, a position that he would hold until his death, fifty years later.

Most of what came before Buffon’s book, you wouldn’t even call science, today. Buffon himself starts his book with a review, “a cursory view”, as he says, “of the notions of some authors who have written upon this subject.” And he doesn’t seem to think of those people as scientists, the way we have learned to think a scientist should be. Regarding some prominent authors of the century before his, he writes: “In subjects [...] where some facts are but partially known, and others obscure, it is easier to form a fanciful system, than to establish a rational theory”; “the theory of the earth has so far never been treated but in a vague and hypothetical manner. [...] These unstable hypotheses are all constructed on tottering foundations. The ideas they contain are indistinct, the facts are confounded, and the whole is a motely jumble of physics and fable.
They, accordingly, have never been adopted but by people who embrace opinions without examination, and who, incapable of distinguishing the degrees of probability, are more deeply impressed with marvellous chimeras than with the force of truth. [...] With regard to the history of the earth, therefore, we shall begin with such facts as have been universally acknowledged in all ages, not omitting those additional truths that have fallen within our own observation.” Buffon considers that “it is unnecessary here to enumerate the proofs and observations by which these facts have been established.” The relevant “facts” are then simply enumerated at the beginning of the chapter on the “Proofs of the theory of the earth.” Essentially, “the earth is a globe of about 3000 leagues in diameter; it is situated 30 millions of leagues from the sun, round which it revolves in 365 days [...]. These are the principal phenomena of the earth, the results of discoveries made by means of geometry, astronomy, and navigation.” The figures provided here by Buffon are reasonably close to the values that today we consider correct.
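To see how close Buffon's figures actually are, one can convert his leagues to kilometers. A caveat: Buffon does not say which league he meant; the conversion factor used below (1 league ≈ 4.45 km, roughly the old French "lieue commune") is an assumption, so this is only a rough sanity check:

```python
# Rough sanity check of Buffon's figures against modern values.
# ASSUMPTION: 1 league ~ 4.45 km (the old French "lieue commune");
# Buffon does not specify which league he had in mind.
LEAGUE_KM = 4.45

earth_diameter_km = 3_000 * LEAGUE_KM        # Buffon: "3000 leagues in diameter"
earth_sun_km = 30_000_000 * LEAGUE_KM        # Buffon: "30 millions of leagues"

print(f"diameter:  {earth_diameter_km:,.0f} km   (modern: 12,742 km)")
print(f"earth-sun: {earth_sun_km:,.0f} km   (modern: ~149,600,000 km)")
```

With this conversion, Buffon's diameter comes out a few percent too large and his earth-sun distance roughly ten percent too small: "reasonably close", indeed, for the mid-eighteenth century.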
1.1 The Radius of the Earth

Most of the observations that Buffon is talking about were first made by the so-called ancient Greeks. Greek philosophers had made the hypothesis that the earth was round and that it rotated around an axis. They even found ways to estimate its circumference. Some of you probably already heard the story of how Eratosthenes did it. Eratosthenes was Greek, but lived in Egypt—he was head librarian, actually, at the great library in Alexandria. At the time, Egyptians were already very good at measuring horizontal distances with surveying techniques not so different from those in use a couple of millennia later (more on that below); if distances were too large to be measured by chains or rods, then there were people, “step measurers”, whose job was to march at a very regular pace, and count how many steps between places. Obviously there was no way one could walk the entire circumference of the earth, but Eratosthenes had access to reasonably accurate estimates of distances between cities, in Egypt, that were relatively far from one another. Like, for instance, Alexandria and Syene (today’s Aswan): Alexandria, you know, is on the Mediterranean coast, while Aswan is way down south. Also, Eratosthenes, who worked in the best library in the world, had compiled a large, as we would say today, database of all sorts of geographic observations (similarities in fauna and flora, astronomical observations, measured horizontal distances, etc.: he published all that in a treatise that he called Geography), and that way he realized that Alexandria and Aswan lay roughly on a direct North-South line (i.e., the same “meridian”). Now, here’s an interesting astronomical phenomenon: (i) if you are in Aswan, on the summer solstice, at noon, the shadow you cast is precisely under you: if you look down a well (this is how Eratosthenes would put it), no matter how deep, you’ll see the shadow of your head projected all the way down to the well’s bottom.
From this one infers that the sun is, at that moment and place, directly overhead (today we would say that Aswan is on a “tropic”); (ii) but in Alexandria, to the North of Aswan, at noon and on the same day (or at any other time of the year, for that matter) the sun is not directly overhead. Now, Eratosthenes or his contemporaries could measure the sun’s elevation, which was the same as measuring, e.g., the angle between a vertical staff, and an imaginary line connecting the end of the staff with the end of its shadow on the ground (you might want to look at Fig. 1.1). If one makes the assumption that incoming rays of light from the sun are all parallel to one another, regardless of the observation point on the earth’s surface (i.e., the sun is much bigger than the earth), it is inferred that the angle just measured coincides with the angle, at the center of the earth, between
Fig. 1.1 This is how Eratosthenes is said to have estimated the radius of the earth. The sun is very far, and but way bigger than the earth, so we can assume its light to strike us along the same direction, wherever we are on the earth’s surface. That means that the angle α obtained from the lengths of a pole and its shadow, at Alexandria, is the same as the angle made, at the center of the earth, by the Alexandria-center-of-the-earth straight line, and a ray going straight from the sun to the center of the earth itself. Eratosthenes happened to know that, at the summer solstice, at midday, that ray would pass right through Aswan—because if you are in Aswan at that moment, and look into a deep, vertical well, you’d see light illuminating the water at the bottom of the well. But then, the Alexandria-Aswan distance along the surface of the earth, call it D, was known: and trigonometry says that D = Rα, with R the earth’s radius: so, measure α (at summer solstice, in Alexandria), divide D by it, and you are done
Aswan and Alexandria. The angle in question was found to be equal to one fiftieth of a circle: one, then, had just to multiply by fifty the distance between the two cities, to obtain the circumference of the earth.
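Eratosthenes’ arithmetic is short enough to redo in a few lines. Two numbers below are assumptions, not given in the text above: the 5000-stadia Alexandria-Syene distance (the traditionally reported figure) and the metric length of a stadion, which is uncertain; 157.5 m is one common modern guess.

```python
import math

# Assumed inputs (see above): 5000 stadia between the two cities,
# one stadion taken to be 157.5 m.
D_stadia = 5000
alpha = 2 * math.pi / 50      # the measured angle: 1/50 of a full circle

circumference = 50 * D_stadia          # 250,000 stadia
R_stadia = D_stadia / alpha            # from D = R * alpha

stadion_m = 157.5
R_km = R_stadia * stadion_m / 1000.0
print(round(R_km))                     # ~6267 km; today's value is ~6371 km
```

Under these (debatable) unit conversions, Eratosthenes comes within a couple of percent of the modern radius.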
1.2 Earth-Sun and Earth-Moon Distances

The Greeks also speculated that the simplest way to describe the motion of celestial bodies is by placing the sun at the center of a solar system, with the planets revolving around it, and the moon around the earth. Aristarchus of Samos (who was one generation older than Eratosthenes) came up with this idea some three centuries B.C., in a treatise that has been lost; but, e.g., Archimedes in his essay on the size of the universe1 summarizes Aristarchus’ ideas2 and takes them as the starting point for his estimate. Aristarchus is also credited with the first attempt at determining the earth-sun distance via mathematics, reported in his only surviving text, a Treatise on the
Fig. 1.2 Aristarchus measured the angle φ, seen from his observation point at the earth’s surface, between the centers of moon and sun. He found it to be about 87◦. (Today we estimate it to be about 89◦50′, i.e., much closer to a right angle.) The diagram is obviously not to scale
Sizes and Distances of the Sun and Moon. This was done via three simple exercises in geometry, and one extremely difficult (for the time) astronomy observation. The observation is that of the angle formed by the moon and the sun, with the earth as its vertex, during a half moon. The first geometry exercise starts from the consideration that the triangle formed by earth, sun and moon at that moment is a right triangle, with the right angle at the moon (Fig. 1.2). Based on trigonometry, the cosine of the angle formed at the earth by sun and moon then equals the ratio of earth-moon distance (call it L) to earth-sun distance (call it S). To Aristarchus, the angle measured 87◦, and so

L/S ≈ cos(87◦)     (1.1)

(it is understood that Aristarchus phrased this in a very different, and, for us, much more convoluted way than we do today), and after calculating the cosine of 87◦, he obtained that the sun is about 20 times further away from us than the moon is. This is a gross underestimate; Aristarchus’ geometrical reasoning is correct, but his measurement of the sun-moon angle very inaccurate. Because the earth is much closer to the moon than to the sun, this angle is also very close to a right angle; today we know that the angle seen by Aristarchus was much closer to 90◦ than to 87◦. The second geometry exercise follows the observation that the radii of sun and moon as we perceive them visually are roughly the same; seen from earth, the sun and moon are roughly the same size. (If you’re not convinced, watch a solar eclipse, or watch the video of a solar eclipse, and you’ll see that when the moon moves in front of the sun, the two discs almost perfectly fit one another.) Having estimated the earth-sun distance to be twenty times larger than the earth-moon distance, Aristarchus inferred that the sun’s radius should be twenty times larger than that of the moon. 
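Just how unforgiving Eq. (1.1) is near 90◦ can be checked numerically; the 89◦50′ figure is the modern value of Aristarchus’ angle quoted in the caption of Fig. 1.2 (the true distance ratio is in fact about 389).

```python
import math

# Aristarchus' half-moon geometry: S/L = 1/cos(phi).
ratio_aristarchus = 1 / math.cos(math.radians(87))        # his phi = 87 deg
ratio_modern = 1 / math.cos(math.radians(89 + 50 / 60))   # phi ~ 89 deg 50'

print(round(ratio_aristarchus), round(ratio_modern))      # 19 344
```

A three-degree error in the measured angle turns a ratio of about 19 into one of about 344: the method is sound, but hopeless without very precise angle measurements.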
Another underestimation, as the initial measurement error continues to propagate through Aristarchus’ calculations, but the reasoning is still correct in modern terms:
Fig. 1.3 The sun eclipsed by the moon. Seen from earth, the sun and moon are roughly the same size. That means that you can trace a straight line, from the point on earth where the eclipse is being observed, that’s tangent to the surfaces of both sun and moon. If we call α the angle formed, at the observation point, by that tangent, and a line that goes through the center of both sun and moon, then trigonometry says that l/L ≈ tan(α), and but also s/S ≈ tan(α). It follows that L/S ≈ l/s
Fig. 1.4 Look at the dark-shaded triangle first: by trigonometry, (s − t)/S = tan γ. Then the light-shaded one: (t − d)/L = tan γ. It follows that L/S = (t − d)/(s − t)
in general, the ratio of earth-moon distance to earth-sun distance must approximately coincide with the ratio of the moon’s radius to that of the sun. Looking at Fig. 1.3, we write this as L/S ≈ l/s. Aristarchus next thought about the geometry of a lunar eclipse. Draw the sun and the earth, and then draw the moon behind the earth, like in Fig. 1.4. (Because we are in a lunar eclipse, the centers of sun, earth and moon are, at least approximately, aligned, and we can put them along a horizontal line, to keep things simple.) Now, trace a straight, oblique line, tangent to both sun and earth, and two horizontal lines, one tangent to the earth and the other going through the point where the oblique line is closest to the moon (look at Fig. 1.4 to see what I mean). After some trigonometry, we get that L/S = (t − d)/(s − t), which, based on what we had found out from the previous exercise, we can also write

l/s ≈ (t − d)/(s − t).     (1.2)
Fig. 1.5 By trigonometry, s = S tan θ. It follows that s/t = (S/t) tan θ. But then, if you divide both sides by tan θ: S/t = (s/t)(1/tan θ)
This relationship can be manipulated in a number of ways, via tedious but simple arithmetic. For instance, the following equations are derived from it3:

s/t = (1 + s/l)/(1 + d/l),     (1.3)

l/t = (1 + l/s)/(1 + d/l).     (1.4)
They express the ratios of the moon’s and sun’s radii to that of the earth in terms of quantities (d/l and s/l) that one could observe directly: we have seen how Aristarchus estimated s/l to be about 20, while the ratio d/l could be measured during a lunar eclipse4. Aristarchus’ third geometry exercise allows one to determine the ratios of both earth-sun and earth-moon distances to the radius of the earth, based on the results obtained so far. Look at the angle θ in Fig. 1.5: it is, essentially, the size of the sun as perceived by an observer placed at the earth’s surface; it can be measured directly. We know from trigonometry that s = S tan θ, which, if we divide both sides by t tan θ,

S/t = s/(t tan θ) = (1/tan θ) · (1 + s/l)/(1 + d/l),     (1.5)
by Eq. (1.3). And but the thing about Eq. (1.5) is that everything on the right-hand side is known by direct observation, as we’ve just seen, and so one can use this to calculate the earth-sun distance in “terrestrial units”—one terrestrial unit is defined to coincide with the length of earth’s radius—or in whatever units we want, actually, since we know t from Eratosthenes. The exact same reasoning can be applied to the moon, measuring the corresponding value of θ and replacing s and S with l and L, respectively.
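The whole chain, Eqs. (1.3) through (1.5), can be rerun with Aristarchus’ own (inaccurate) inputs. Two of those inputs are not stated above and are assumed here from the usual historical reconstruction: d/l ≈ 2 (the earth’s shadow about twice the moon’s width) and a solar angular diameter of 2◦, i.e., θ = 1◦.

```python
import math

# Assumed historical inputs: s/l ~ 20, d/l ~ 2, theta = 1 degree.
s_over_l = 20.0
d_over_l = 2.0
theta = math.radians(1.0)

s_over_t = (1 + s_over_l) / (1 + d_over_l)      # Eq. (1.3): sun radius, ~7 t
l_over_t = (1 + 1 / s_over_l) / (1 + d_over_l)  # Eq. (1.4): moon radius, ~0.35 t
S_over_t = s_over_t / math.tan(theta)           # Eq. (1.5): earth-sun distance
L_over_t = S_over_t / s_over_l                  # since L/S = l/s = 1/20

print(round(S_over_t), round(L_over_t, 1))      # 401 20.1
```

The earth-moon distance comes out at about 20 terrestrial units, which is exactly the figure Aristarchus ended up with.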
Because, like I was saying, his astronomical measurements were not so precise, Aristarchus’ final estimate of, for example, the earth-moon distance amounted to 20 terrestrial units, which is off by a factor of 3 with respect to today’s value of 60 terrestrial units. But his method was totally OK.
1.3 Earth-Moon Distance (Hipparchus’ Way)

A few decades later, Hipparchus of Nicaea estimated the earth-moon distance with a new approach, based on two observations of the same solar eclipse, made at two different points that roughly lie on the same meridian: the Hellespont (today’s Dardanelles, i.e., the strait connecting the Aegean Sea to the Sea of Marmara) and, again, Alexandria. The eclipse in question was total at the Hellespont, but not in Alexandria where the sun was only partly obscured by the moon: about one fifth of the sun’s diameter was still visible (segment CD in Fig. 1.6). Hipparchus figured that the angle subtended by CD, as observed from A (that is, the angle CAD) was approximately equal to the angle CMD: that should be OK, given that earth and moon are (relatively) very close to one another, and the sun is very far from both (the diagram is totally not to scale). Keep looking at Fig. 1.6; this angle, which Hipparchus measured to be about 0.1◦, coincides with the angle AMB, that is, the angular distance between A and B as seen from the moon. To a very crude approximation, again by simple geometry, the distance between A and B should roughly coincide with the product of the earth-moon distance L, times the angle AMB measured in radians. This also means that by measuring the distance between A and B, and dividing it by the angle AMB, the earth-moon distance is obtained. Hipparchus knew Alexandria and the Hellespont to be roughly on the same meridian and about 9◦ from one another; knowing that, you can actually redo his calculations yourself, and you should get that the distance between earth and moon is 90 times the radius of the earth—which
Fig. 1.6 Hipparchus’ first estimate of earth-moon distance L. The distance, along the surface of the earth, between his two observation points, A (Alexandria) and B (Hellespont) is equal to tθ (where, same as above, t denotes the earth’s radius), and but it should also approximately coincide with Lγ. It follows that L ≈ tθ/γ. The angles θ and γ can be measured, t is (approximately) known: Hipparchus then could calculate L
we know today to be a gross overestimate—although already in the right order of magnitude. The problem with the method I just showed you is that it requires that the straight line defined by the centers of the earth and sun be exactly tangent to the surface of the moon. In general, this won’t be the case, and if you take that into account the trigonometry we just did doesn’t work anymore. Hipparchus was aware of this problem, and solved it via some more trigonometry. This time I am not going to go through it; I’ll just tell you that, eventually, Hipparchus estimated the earth-moon distance to be between 62 and 73 terrestrial units5 : not so far from today’s estimate, which is around 60.
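Hipparchus’ first, crude estimate reduces to a single division; the 9◦ and 0.1◦ below are the figures given above.

```python
# Arc between the two observation points: t * theta; the same arc, seen
# from the moon, subtends gamma, so it also equals L * gamma. Hence
# L / t = theta / gamma (both angles in the same units).
theta_deg = 9.0     # Hellespont-Alexandria angular separation
gamma_deg = 0.1     # the same arc as seen from the moon

print(round(theta_deg / gamma_deg))   # 90 terrestrial units
```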
1.4 From Aristarchus to the Enlightenment

The bottom line of all this is that the Greeks already had some quite sophisticated theories, which today we would consider pretty much correct, about the nature of the solar system. And, like I was saying, about 2000 years later Buffon and company would take many of their results as established “facts”. But before we get back to Buffon, the funny thing is, for all its ingenuity, Aristarchus’ heliocentric idea never caught on with his contemporaries. One century after Christ, and two hundred years after the death of Hipparchus, Ptolemy put together in his Almagest a tremendous database of planetary paths, and proposed the following model to summarize them: each planet revolves in a circular orbit (the “epicycle”) around a virtual point which in its turn describes another circular orbit (the “deferent”) around another virtual point (the “equant”). The equant is a fixed point, at a fixed distance from the earth, and each planet has its own equant. The earth doesn’t move. While, in a way, it is just a matter of what reference frame one chooses, Aristarchus’ statement that all planets including the earth revolve around the sun, while the moon is a satellite of the earth, obviously (to us) describes things in a much more compact and transparent way. Yet, it was forgotten, while Ptolemy’s geocentric model remained undisputed for over a millennium, until the times of Copernicus and Kepler. Why pre-Kepler people would prefer a complex over a simple description of the same set of data is not clear. Sure, Aristarchus’ model didn’t quite “fit the data,” either: nobody had thought of elliptical orbits yet, and trying to describe the trajectories of planets as circles centered on earth would have caused some problems, too. But maybe a better answer is that Aristarchus’ theory did not quite agree with the spirit of the time; because the heliocentric idea was in disagreement with the teachings of Plato and Aristotle. 
Plato and Aristotle, who lived before Aristarchus and company, had founded “schools” which, for centuries after their deaths, would continue to pretty much embody mainstream western thought. Their systems of philosophy, explains, e.g., Arthur Koestler in his book The Sleepwalkers (Macmillan, 1959), “seemed to provide a complete answer to the predicament of their time”: which “was the political, economic and moral bankruptcy of classical Greece prior to the Macedonian conquest.”
1.4 From Aristarchus to the Enlightenment
9
Accordingly, “the general mood underlying these philosophies [is one of] unconscious yearning for stability and permanence in a crumbling world where ‘change’ can only be a change for the worse”. Plato concocts a world of ideal Forms, and sees our perceived reality as a cheap copy of it; Aristotle splits the universe into the lowly, “sublunar” region, where we live and that is subject to change, and, around it and above it, a system of rotating spheres, one nested within the other, made of a “fifth element”—anything in this world, according to the ancient Greeks, could only be made of earth, water, wind, and/or fire (more about this in Chap. 4): Aristotle postulated that celestial spheres should be made of an additional element, pure and immutable: something that you can’t find on earth. And the fifth element should move along circular orbits, because the circle is a perfect form, etc. The big limitation of Aristarchus’ model was that it took the earth away from the center—the bottom—of the universe: which clashed with the idea that our planet should be the most lowly chunk of creation. So much for its simplicity, then. “The task of the mathematicians”, says Koestler, “was now to design a system which would reduce the apparent irregularities in the motions of the planets to regular motions in perfectly regular circles. This task kept them busy for the next two thousand years”. And that, roughly two thousand years after Plato and Aristotle, is when people like Buffon and Isaac Newton enter the scene. Newton obviously read Copernicus and Kepler, or in any case was familiar with their idea of placing the sun at the center of the universe. Then he started thinking of the sun as the entity that governs the motion of the planets, then eventually hypothesized that, through its much larger mass, the sun would accelerate each planet via a force-at-distance called gravity. It is gravity that determines the trajectory of a planet’s orbit. 
And gravity is not limited to astronomical objects. In Newton’s idea, a mutual gravitational attraction exists between any two pieces of matter, however small. The idea that some mysterious force should attract bodies to one another without need for any physical contact was a difficult one to digest, and it probably did take some amount of something we might call madness, and/or courage, to have it published in a book. But Newton was a peculiar character. “Newton came to be thought of as the first and greatest of the modern age of scientists, a rationalist, one who taught us to think on the lines of cold and untinctured reason”, says John Maynard Keynes6 , but no-one “who has pored over the contents of that box which he packed up when he finally left Cambridge in 1696 and which, though partly dispersed, have come down to us, can see him like that. Newton was not the first of the age of reason. He was the last of the magicians, the last of the Babylonians and Sumerians”, etc. In Newton’s box, packed when he left Cambridge for London, were, among other things, his writings on alchemy: the elixir of life, the philosopher’s stone and all that. People before Keynes had not paid much attention to those contributions of Newton, but, apparently, Newton himself invested at least as much time and energy in those as he did in the Principia—which he also composed in his Cambridge years. Anyway, if Newton was “the last of the magicians”7 , Buffon, on the other hand, is very much an age-of-reason kind of guy. Like I was saying, he shows little sympathy for his pre-enlightenment colleagues. The ideas dismissed by Buffon in his Theory of the Earth include William Whiston’s (A New Theory of the Earth, 1696) hypothesis
“that every possible event depends upon the motions and direction of the stars”, so that, accordingly, all “changes this earth has undergone have been produced by the tail of a comet.” Another thinker (Thomas Burnet, Sacred Theory of the Earth, 1681) “was so fully impregnated with poetical illusions, that he imagined he had seen the universe created. After telling us the state of the earth when it first sprung from nothing, what changes have been introduced by the deluge, and what it now is, he assumes the prophetic style, and predicts what will be its condition after the destruction of the human kind.” Finally, John Woodward (An Essay toward a Natural History of the Earth and Terrestrial Bodies, 1695) “explains the principal appearances of the globe by the aid of an immense abyss in the bowels of the earth; which, in his estimation, is nothing but a thin crust inclosing this vast ocean of fluid matter.” This passage shows quite clearly the discontinuity between the thought of Buffon and his peers, and that of the authors that had come before, who according to Buffon had never treated “the theory of the earth [...] but in a vague and hypothetical manner.” Reading his description of the works of Whiston, Burnet, Woodward, we see that what Buffon refutes are not just those authors’ conclusions, but their very way of thinking. Whiston’s “hypothesis appears, at first view, to be extravagant and fantastical.” “Burnet’s system [...] is an elegant romance, a book that might be read for amusement, but which cannot convey any instruction. The author was ignorant of the chief phenomena of the earth, and a man of no observation.” And as far as Woodward is concerned, “it is [...] 
unnecessary to give a formal refutation of absurd notions, especially when they proceed upon conjectures that are contrary both to the laws of probability and of mechanics.” Ultimately, it is sufficient to show that those ideas belong to another era, which the Enlightenment man can only look at with irony and scorn. It is the era of “fanciful systems” and “poetical illusions”, while now “it is the business of an historian to describe, not to invent; [...] no gratuitous suppositions are to be admitted in subjects that depend upon fact and observation; and [...], in historical compositions, the imagination cannot be employed”. The thing is, as much as Burnet and company are indeed men “of no observation”, from today’s and from Buffon’s perspective, it is also true that both we and Buffon are no longer able to understand them. We stand on this side of a major, like, discontinuity in the history of how people think about the world. Newton is right on the fault line, and can relate both to French enlightenment and to the semi-theological approach of Burnet, Whiston and Woodward8 . But, as much as we understand English, French and even Latin, we don’t speak those guys’ language anymore. Our mind functions in a different way: and I don’t think most of us are even aware of this discrepancy. As far as I am concerned, I think this became clear to me when I read Michel Foucault’s The Order of Things9 (Random House, 1971). I realize it’s kind of weird that someone who’s supposed to be an earth scientist, almost a geologist, has read Foucault; and even weirder to mention him in this, an earth science book; and I wonder if I am, like, qualified enough to bring up Foucault? 
And but anyway, assuming that I am: Foucault sees the pre-Enlightenment era (which he roughly identifies with the sixteenth century; the seventeenth century is a period of transition, and by the eighteenth century we have entered the age classique) as a world of “resemblances.” Basically, says Foucault, “the sixteenth
century superimposed hermeneutics and semiology in the form of similitude”, i.e., in the sixteenth century, “to search for a meaning is to bring to light a resemblance”. Signs are treated as things, and perceived to “resemble” the things they stand for. “The universe was folded in upon itself: the earth echoing the sky, faces seeing themselves reflected in the stars, and plants holding within their stems the secrets that were of use to man.” Resemblance requires imagination; because, “without imagination, there would be no resemblance between things. [...] There must be, in the things represented, the insistent murmur of resemblance; there must be, in the representation, the perpetual possibility of imaginative recall.” I don’t know how clear that is to you; I don’t even know how clear that is to me, though I find it kind of beautiful. But here is what I think is a good geological example: in the world of resemblance, stones are described in lapidaria as living beings, male or female, capable of reproducing, believing, having feelings. If a stone resembles some other entity—for example, parts of human or animal bodies—this resemblance must explain their origin and properties, whether they be directly observable or “magic”. Gems, for instance, are the most precious works of Nature and it is inferred that they must be born and receive their properties from the heavens and the stars, which they resemble. As for metals, gold originates from the sun, silver from the moon, quicksilver from Mercury, copper from Venus, iron from Mars, etc. This implied, e.g., that gold was most likely to be found in the tropics, where sunrays are at their brightest. Stuff like that. The enlightenment rids natural history of imagination. With the enlightenment, writes Foucault, “resemblance, which had for long been the fundamental category of knowledge—both the form and the content of what we know—became dissociated in an analysis based on terms of identity and difference”. 
Buffon doesn’t understand how Whiston, Burnet, Woodward could possibly defend their “fanciful” systems. He just thinks they are crazy: and so do we. Enlightenment and post-enlightenment scientists are those that do not sound crazy at all to us, because, basically, like I said, they stick to “facts”, as opposed to speculation, and but also because we’ve grown up in their world, in the culture that, starting with Newton, they have massively contributed to shaping: and their science is the science we are taught in school, etc. Nothing wrong with that, of course, but maybe we should think of all this whenever, for whatever reason, we stumble upon the writings of pre-enlightenment authors, and are put off by their way of reasoning. Maybe one day some future reader of twentieth and twenty-first century science will be equally put off by our way of thinking about the world? Anyway, that was quite a detour. But, this book is (also) about how our idea of the earth evolved over time: and so, then, it was necessary to establish a starting point, in time, for the story told by this book; and explain why that particular point was chosen, rather than another. The starting point being, like I said, Buffon, this book covers almost only post-enlightenment science. (And the occasional bit about the Greeks. But the Greeks don’t count. The Greeks are, like, some kind of singularity, a discontinuity in pre-enlightenment history of knowledge.) All this having been said, this book is not unaware of what happened before, and will in fact get into it from time to time; and certainly, finally, this book does recognize that enlightenment
science is not the only possible way for a human being to look at things. (And maybe, deep down, I would rather be an alchemist, like Newton wanted to be. In 2021 that wouldn’t pay the rent, though, and so here I am, teaching “normal” physics and math at a university.) So, then, Newton gave us the laws of dynamics and of gravitation, which are so good at explaining the way the earth and the other planets and their satellites all move; but then people were having a hard time accepting some aspects of Newton’s physics which, like I was saying, are kind of weird. For one, it wasn’t just about planets: but any pair of chunks of matter, including, e.g., you, and the book you are holding in your hands, were supposed to physically attract one another. Also, and this has immediate implications for the earth scientist, Newton figured that a rotating body might be deformed as a consequence of its very rotation. Because, as Newton’s argument goes, if you’ve got rotation you’ve also got centrifugal force10. Centrifugal force is not the same everywhere within a rotating body: it’s zero along the rotation axis (because a material point that’s right on the rotation axis doesn’t move at all) and grows with distance from it. So, unless a body is perfectly rigid, centrifugal force will deform it a bit—until it’s counterbalanced by whatever forces keep the not-quite-rigid body together. In the case of the earth, centrifugal force would be largest at the equator—the locus of all points at the surface of the earth that are farthest from the rotation axis. Mass near the equator would be gently but inexorably pulled further away from the axis and ultimately the earth shouldn’t quite be spherical; but rather ellipsoidal, flattened at the poles and bulging symmetrically all along the equator11. 
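Newton’s argument can be given rough numbers with modern values (the sidereal day and the equatorial radius below are assumptions for this sketch, not data from the text):

```python
import math

# Centrifugal acceleration at the equator: omega^2 * R.
omega = 2 * math.pi / 86164.0    # earth's angular velocity, rad/s
R_eq = 6.378e6                   # equatorial radius, m
a_cf = omega**2 * R_eq           # ~0.034 m/s^2

print(round(a_cf / 9.81, 4))     # ~0.0035, i.e. roughly 1/289 of gravity
```

A small effect, but acting steadily on a not-quite-rigid body, which is all Newton’s argument needs.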
In the eighteenth century this was still theory, not necessarily fact, and plus, for instance, Descartes’ physics predicted quite the opposite, i.e., that the earth was, like, egg-shaped: elongated at the poles and flattened at the equator12 . So, Newton’s ideas still needed some more experimental validation.
1.5 The Figure of the Earth

People understood that to determine whether the earth was spherical, or flattened at the poles or elongated at the poles, all one had to do was measure the “curvature” of the earth at a couple of different latitudes; ideally, near the poles and near the equator, so as to maximize the (presumably small) discrepancy between the two values of curvature that would be observed13. Curvature is essentially one over the radius of a circle tangent to the surface of the earth at the location in question (minus, of course, local relief, which could be estimated from air pressure, measured with a barometer). Constant curvature would mean a perfectly spherical earth. Smaller curvature at the poles means polar flattening and equatorial swelling, and so on. Curvature could be measured by comparing the horizontal distance14 between two points lying on the same meridian, and the difference between their latitudes: by trigonometry, their ratio (with the latitude difference taken in radians) is the local radius of curvature, i.e., one over the local curvature.
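To see the measurement principle in numbers, one can use modern meridian-arc lengths per degree of latitude as stand-ins for the French surveys (an assumption, purely for illustration):

```python
import math

# Arc length / latitude difference (in radians) = local radius of
# curvature. Modern values, in metres per degree of latitude:
arc_equator = 110_574.0    # near the equator
arc_pole = 111_694.0       # near the poles

R_equator = arc_equator / math.radians(1)   # ~6.34e6 m
R_pole = arc_pole / math.radians(1)         # ~6.40e6 m: larger radius,
                                            # hence smaller curvature
print(round(R_equator / 1000), round(R_pole / 1000))   # 6335 6400
```

Smaller curvature near the pole is what the Lapland expedition would find, settling the question in Newton’s favor.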
Besides intellectual curiosity, the French understood that knowing the exact shape of the earth was relevant to strategically important endeavours like, e.g., navigation; and they financed two expeditions. One team of researchers was sent to Lapland, not too far from the north pole, and another to what at the time was generically called Peru, in any case somewhere along the Andes and near the equator. The “polar” team, operating on very flat terrain where measuring horizontal distances was as easy as it could possibly be, carried out their task fairly quickly. It was determined that curvature in Lapland was significantly smaller than that measured near Paris, i.e., the earth is flattened at the poles, as guessed by Newton, and that was that as far as the Descartes vs. Newton question was concerned. The “equatorial” expedition turned into a grotesque mess, something that would make a good plot for a Werner Herzog movie. This includes the expedition leader, one Louis Godin, spending a significant chunk of the expedition’s budget on a diamond for one of his mistresses. Things sort of came back to normal after Pierre Bouguer took over.
1.6 Newton’s Laws of Motion and Gravity

The reason I want to tell you more about Bouguer, though, is that, while he was concerned more than anything else with measuring curvature (which he eventually did, before heading back to France a few years behind schedule), he ended up making another important observation, which turned out to be relevant to the problem of finding the mass of the earth. This I haven’t even mentioned, yet, so let me take one short step back and explain to you why, now that Newton’s laws had been established, measuring the earth’s mass was still even a problem at all. Newton’s second law stipulated that the acceleration of a body caused by a force applied to that body is proportional to the force itself, through a constant, called “mass,” that depends on the body itself15. Newton’s law of universal gravitation added that a force-at-distance called “gravity” existed between any two bodies, directly proportional to the product of their masses, and inversely proportional to their squared distance, i.e., if we call m1 and m2 the two masses, and r their distance, then

F = G m1 m2 / r²,     (1.6)
where G is a constant, whose value, as we are about to see, Newton didn’t know. These are empirical laws; they were not derived mathematically, but found by Newton based on observation: to explain, for instance, the relative positions of the planets in time (for which a great wealth of data were already available). You might think that playing around with these equations and some relatively simple measurements, one could determine the mass of the earth and those of other planets. It turns out that is not the case. To see what’s missing, start by imagining you know the distance r between the sun and a planet X , and the acceleration a of planet X with respect to the sun, at some moment in time (and from the previous pages you know
that these are things that people before Maskelyne did actually know), and you want to determine the mass of the sun, M_S. The law of gravitation says that the force the sun exerts on the planet equals G M_X M_S / r², where M_X is the unknown mass of the planet. You might rightly guess that it is OK to assume that the mass of the sun is so much larger than that of other planets, that the gravitational forces other planets apply to planet X are negligible with respect to that applied to it by the sun. So then a, the acceleration of planet X, is entirely due to the sun’s attraction, and the second law says that the force the sun exerts on the planet must also equal M_X a. If you equate these two expressions to one another, M_X cancels out, which is nice, and

a = G M_S / r².     (1.7)
Now, as we know, a and r can be measured, which is also nice, but both M_S and G are unknown. One equation, two unknowns: no way to determine the value of either one. You can find the numerical value of the product G M_S, and that will be enough to verify that Newton’s laws do “predict” the displacement of planets as we actually observe it (which involves lots of cumbersome math, and I am not going to do it here, but you probably get the point). But you can’t disentangle G and M_S from one another: you can’t know how many times the mass of the earth is needed to make up the mass of the sun, etc. So, Newton couldn’t determine the value of G. Maybe you are convinced of this, maybe not; let’s think of another experiment. Think of a small object, something around you, a pebble, a pencil. Obviously, it has weight; it is attracted towards the earth, so if you take it in your hand and then let it go, it falls to the ground with the acceleration g that Galileo first measured (you probably remember its value from school: about 10 m/s²). What happens if we apply to that object the same reasoning as above? Can we use measurements of its mass, and of g, to “constrain” the mass of the earth? To answer the question, one needs to calculate, and perhaps to find an algebraic formula for, the force of attraction that the entire earth exerts on the pencil. But we can’t do it the way we did it above, because now the size of the earth is not negligible compared to the distance that separates it from the pencil. So, this time, we can’t think of the two masses involved in the problem as “material points”. The trick to solve this problem, and many similar problems that involve bodies of finite (i.e., not small enough to be negligible) spatial extent, is to think of the sphere (the earth) as the combination of an infinity of infinitely small points.
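To make the entanglement of G and M_S concrete, here is a short numerical sketch of my own (not from the original text), using rough modern values for the earth’s orbit and the circular-orbit formula a = 4π²r/T²: the data pin down the product G M_S, and nothing more.

```python
import math

# Rough modern values for the earth's orbit, treated as circular
# (illustrative assumptions; 18th-century astronomers knew r much less well).
r = 1.496e11           # mean earth-sun distance, in meters
T = 365.25 * 86400     # orbital period, in seconds

# Centripetal acceleration of the earth towards the sun: a = 4*pi^2*r/T^2
a = 4 * math.pi**2 * r / T**2

# Eq. (1.7) then says a = G*M_S/r^2: only the PRODUCT G*M_S is determined.
GM_sun = a * r**2
print(f"a = {a:.2e} m/s^2, G*M_S = {GM_sun:.2e} m^3/s^2")
```

Running this gives G·M_S of order 10²⁰ m³/s²; with no independent handle on G, the mass M_S itself stays out of reach, exactly as the text says.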
The idea that something finite could be written as the sum of an infinite number of infinitely small contributions, and that this could actually simplify things rather than making them unbearably complicated, is the main building block of mathematical analysis; it is due to Newton himself, and to his contemporary Gottfried Wilhelm Leibniz. Let us see how it works in practice. We are going to call d V the volume of each infinitely small element of the sphere. The letter d stands for “differential” (no need, at this point, to understand what that precisely means) and serves to remind us that d V is as small as we want it to be. We will call ρ the density of the earth,
Fig. 1.7 A sphere and a material point attracting one another through gravity
or how much mass is contained in one unit of its volume—so the numerical value of ρ depends on which set of units of measurement we shall decide to use in our calculations—and also, ρ can in principle change from place to place within the earth... anyway, at a given point, the mass of dV is dm = ρ dV, where ρ is the value of density at that point and, since dV is extremely small, dm is also extremely small. If Newton is right, the mass dm = ρ dV would accelerate my pencil according to

da = \frac{G \rho \, dV}{\xi^2} ,   (1.8)

where ξ is the distance between the observation point and the position in space occupied by dV (Fig. 1.7), and da the acceleration that would be imparted to my pencil if, besides my pencil, ρ dV were the only other mass existing in the universe. Again, the letter d serves to remind me that that acceleration would be very small. Now, consider the combined effect of several small elements dV, standing at different locations within the sphere. This could in principle make things much more complicated, as in most cases the dVs will not only be at different distances from the observation point, but also in different directions. This information is not directly encoded in Eq. (1.6). At the time of Maskelyne and company, before “vectors” were invented (maybe you’ll be surprised to know that that did not happen until the last decades of the nineteenth century, that is, about two centuries after Newton), this required a lot of awkward book-keeping16. If some simplifying assumptions are made, however, it turns out that by a clever trick one can get rid of direction issues and make things reasonably feasible, even without vectors. What we need to do is make the hypothesis that ρ only changes as a function of depth, or distance from the center of the earth. If we accept that that could be a reasonable approximation to the real world, we can go ahead: trace a straight line connecting the observation point P with
the center of the sphere. For each element d V to one side of this axis, there exists another d V on the opposite side, occupying a perfectly symmetrical location, and also at the same distance from the center of the earth—and so, then, having the same ρ and the same mass dm = ρd V . Now, think of the attraction applied at P by either dm as the sum of two forces: one in the direction of the center of the sphere, the other perpendicular to it, pointing on the side of the sphere where that dm is. When you sum this contribution to that of the dm which is placed symmetrically with respect to the first, the forces acting in the direction of the center of the sphere are combined, to increase that attraction; the other two cancel out. Extend this reasoning to the entire volume of the sphere: you see that everything cancels out except for a large force of attraction pointing towards the center of the sphere. So, provided that the sphere’s density only changes with distance from the sphere’s center: gravitational attraction measured anywhere outside the sphere points directly to its center. Now that direction is sorted out, let us worry about the strength of the attraction. Here we have to do some actual math, and in the next few pages or so there will probably be a higher density of formulae than throughout most of this book. If you want, you can probably skip to the next place where there’s more text than equations, and take the results of the following calculations for granted. But if you’d rather check for yourself, here you go: you can start out by finding the acceleration of your pencil, pebble, whatever that was, caused only by the small volume which we have called d V : as if there was nothing else in the universe but that small parcel of matter: we’ll worry later about summing up the effects of all the parcels of matter that make up the earth. 
So, call D the distance between the origin and your observation point, ϑ the “colatitude” (which is just 90° minus latitude) of dV, ϕ its longitude, and r its distance from the center of the sphere (look at Fig. 1.8). Because dV is as small as we want it to be, it’s OK to think that it is entirely identified by just one value of ϑ, one value of ϕ and one value of r. Think of it as a tiny brick, of sides dr (the distance
Fig. 1.8 A sphere and a material point attracting one another through gravity: calculating the acceleration at P
between its “top” and its “bottom”), r dϑ (by trigonometry, the distance between its, let’s say, north and south boundaries), and r sin ϑ dϕ (again by trigonometry, the tiny distance between its east and west boundaries). It follows that

dV = r^2 \sin\vartheta \, dr \, d\vartheta \, d\varphi ,   (1.9)
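As a quick sanity check of Eq. (1.9)—my own sketch, not part of the original text—one can sum the little bricks dV = r² sin ϑ dr dϑ dϕ over a fine grid and verify that they add up to the familiar volume 4πR³/3:

```python
import math

R = 1.0                      # sphere radius (arbitrary units)
n = 100                      # subdivisions per coordinate
dr, dth = R / n, math.pi / n

# Riemann sum of dV = r^2 sin(theta) dr dtheta dphi; the integrand does not
# depend on phi, so the phi sum just contributes a factor 2*pi.
V = 0.0
for i in range(n):
    r = (i + 0.5) * dr       # midpoint of each radial slice
    for j in range(n):
        th = (j + 0.5) * dth
        V += r**2 * math.sin(th) * dr * dth
V *= 2 * math.pi

exact = 4 / 3 * math.pi * R**3
print(V, exact)              # agreement to a small fraction of a percent
```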
and, by Eq. (1.8), its effect on the acceleration that one would observe at P is

da = \frac{G \rho r^2 \sin\vartheta \, dr \, d\vartheta \, d\varphi}{\xi^2} \cos\alpha ,   (1.10)
where, if you look again at Fig. 1.8, you’ll see that ξ coincides with the distance between P and the element of mass at r, ϑ, ϕ, and multiplying by cos α is equivalent to taking, for each infinitesimal contribution to the total acceleration, only the component that points towards the center of the sphere17; which is the only one we care about, because we’ve just proven that those that are perpendicular to it cancel out. The total a caused by the whole sphere is what one obtains by (i) letting r, ϑ, and ϕ change so that they subsequently occupy all possible locations throughout the sphere, (ii) calculating the value of ρ(r) r² sin ϑ cos α / ξ² at each location (G is constant, and dr, dϑ, dϕ can be chosen to be constant) and then (iii) summing. This operation is usually called an “integral”, and represented by
a = G \int_0^R dr \int_0^\pi d\vartheta \int_0^{2\pi} d\varphi \, \frac{r^2 \sin\vartheta \, \rho(r) \cos\alpha(r,\vartheta)}{\xi^2(r,\vartheta)} ,   (1.11)
where R denotes the radius of the sphere, and writing α and ξ as α(r, ϑ), ξ(r, ϑ) serves to remind us that α and ξ are not constant, but change when r and ϑ change (if you look carefully at Fig. 1.8, though, you’ll see that α and ξ are constant with respect to ϕ). Now, because by definition d V is infinitely small, the sphere is made of an infinity of such bricks, and the sum in Eq. (1.11) cannot really be computed as described. But Newton’s and Leibniz’s method of mathematical analysis outflanks this difficulty: dealing with the infinite and the infinitely small is precisely what it was designed for18 . And their operation of integration19 does precisely that: it sums up an infinity of infinitely small contributions. What we are going to do next is, we are going to study the terms that contribute to the sum in (1.11), and “manipulate” them mathematically until we are able to reduce (1.11) to a simpler mathematical formula, that can be interpreted more intuitively and/or “implemented” (the numerical value of the sum can be computed) more quickly. (This might be called, “solving” the integral.) In many ways, integrals follow the same rules as sums, so, I don’t know how good you are at math, but even if you are not that good you might be able to follow the next few steps; and if you manage, that will make you stronger and help you deal with the math that will come later, in this and other chapters of this book. Notice that in the ugly expression—the “integrand”—to the right of the integral symbols, nothing is related to ϕ except for the small constant increment dϕ; this
Fig. 1.9 The law of cosines relates the cosine of δ to the lengths of b, c, d
means that each term in our “sum” contains a constant factor which coincides with the sum of all the dϕs that make up the interval of values of ϕ that are relevant to us—all longitudes between 0 and 2π. We can go ahead, take that sum—which must necessarily give 2π—and “pull it out” of the other integrals, and
a = 2\pi G \int_0^R dr \int_0^\pi d\vartheta \, \frac{r^2 \sin\vartheta \, \rho(r) \cos\alpha(r,\vartheta)}{\xi^2(r,\vartheta)} = 2\pi G \int_0^R dr \, \rho(r) r^2 \int_0^\pi d\vartheta \, \frac{\sin\vartheta \cos\alpha(r,\vartheta)}{\xi^2(r,\vartheta)} ,   (1.12)
where, just for clarity, as they say, I’ve also pulled out from the integral over ϑ all the stuff that doesn’t depend on ϑ. The two integrals that are left in (1.12) are trickier because we have ρ, which is a function of r—and we don’t even know anything about that function... unless we make other assumptions, but that won’t be needed: there’s a trick that most of you probably wouldn’t be able to see—unless you spend really a lot of time thinking about this integral, and/or perhaps trying various ways to solve it, failing and then trying again—and it consists of using the so-called “law of cosines”, which is a law of trigonometry that, like all other laws of trigonometry, I’ve decided I won’t cover directly in this book (I won’t prove it), but which anyway states that for any triangle whose sides are called b, c, d, with δ the angle opposite d (see Fig. 1.9),

d^2 = b^2 + c^2 - 2bc \cos\delta ,   (1.13)

or, which is the same,

\cos\delta = \frac{b^2 + c^2 - d^2}{2bc} .   (1.14)
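A minimal numerical check of the law of cosines (my own sketch, with arbitrary example coordinates): build a triangle, measure its sides and the angle δ, and confirm Eq. (1.13).

```python
import math

# Vertices of an arbitrary triangle; delta is the angle at A.
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)

b = math.dist(A, C)          # sides adjacent to delta...
c = math.dist(A, B)
d = math.dist(B, C)          # ...and the side opposite delta

# Angle at A, from the dot product of the vectors AB and AC
dot = (B[0] - A[0]) * (C[0] - A[0]) + (B[1] - A[1]) * (C[1] - A[1])
delta = math.acos(dot / (b * c))

# Eq. (1.13): d^2 = b^2 + c^2 - 2*b*c*cos(delta)
print(d**2, b**2 + c**2 - 2 * b * c * math.cos(delta))  # identical up to rounding
```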
So now if you apply the law of cosines to the triangle formed by the observation point P, the center of the earth and the element of mass at r, ϑ, ϕ, you get, for example, that

\cos\vartheta = \frac{D^2 + r^2 - \xi^2}{2Dr} .   (1.15)
This is useful for the integral in (1.12), because the integral in (1.12) contains sin ϑ dϑ, and if you differentiate both sides of (1.15) with respect to ξ you actually end up with an expression for sin ϑ dϑ, which doesn’t look that simple per se, but will actually help you to simplify the integral... which probably sounds somewhat abstract, so let’s see it in action: differentiate (1.15), and20

-\sin\vartheta \, \frac{d\vartheta}{d\xi} = -\frac{\xi}{Dr} .   (1.16)
This last step might have some of you worried, because r and ξ are not independent: changing ξ means changing the position that the brick dm occupies within the sphere, and so its coordinates r, ϑ and ϕ have got to change too; so how come I don’t have some derivative of r with respect to ξ on the right-hand side of (1.16)? But the thing is, we are really interested in manipulating the expression that gets integrated over ϑ in (1.12); and the way (1.12) is written, we are supposed to integrate over ϑ first, leaving r constant, and only later over r (as you might expect, yes, nested integrals work just like nested sums). Equation (1.16) is true for any fixed value of r, and it follows from it that, for any fixed value of r,

\sin\vartheta \, d\vartheta = \frac{\xi \, d\xi}{Dr} .   (1.17)
Like I was saying, now we might replace sin ϑ dϑ in (1.12) with its expression (1.17): careful, though, because the limits of integration, 0 and π, refer to ϑ, and if we “change the integration variable,” replacing ϑ with ξ, like we are doing, we also have to replace 0 and π with the values taken by ξ when ϑ = 0 and ϑ = π. Figure 1.8 shows that, for a given value of r, those are21 D − r (for ϑ = 0) and D + r. So then Eq. (1.12) becomes

a = \frac{2\pi G}{D} \int_0^R dr \, \rho(r) r \int_{D-r}^{D+r} d\xi \, \frac{\cos\alpha(r,\xi)}{\xi} .   (1.18)

The reason the change of variables was useful is that now if we use again the law of cosines, we can immediately find an expression for cos α(r, ξ) that doesn’t explicitly involve ϑ, i.e.,

\cos\alpha = \frac{D^2 + \xi^2 - r^2}{2D\xi} ;   (1.19)

plug that into (1.18) and

a = \frac{\pi G}{D^2} \int_0^R dr \, \rho(r) r \int_{D-r}^{D+r} d\xi \, \frac{D^2 + \xi^2 - r^2}{\xi^2} = \frac{\pi G}{D^2} \int_0^R dr \, \rho(r) r \left[ (D^2 - r^2) \int_{D-r}^{D+r} \frac{d\xi}{\xi^2} + \int_{D-r}^{D+r} d\xi \right] ,   (1.20)
which I think you’ll agree is not so hard to solve: 1/ξ² is the derivative22 of −1/ξ; the integral \int_{D-r}^{D+r} d\xi coincides with the length of the interval from D − r to D + r, or D + r − (D − r) = 2r, so

a = \frac{\pi G}{D^2} \int_0^R dr \, \rho(r) r \left[ (D^2 - r^2) \left( \frac{1}{D-r} - \frac{1}{D+r} \right) + 2r \right] = \frac{4\pi G}{D^2} \int_0^R dr \, \rho(r) r^2 ,   (1.21)
because D² − r² = (D + r)(D − r). Now, keep a cool head, look at this final formula and consider that 4πr² is the surface area of a sphere of radius r. But then, if you multiply this by a small length dr, what you get is the volume of a spherical shell of radius r and thickness dr. If you multiply this by the density ρ(r) of earth material at that radius, you get the mass of that same spherical shell. And finally, if you integrate over r from 0 to R, that is to say, if you sum together the masses of all concentric shells of radii between 0 and R, what you get is the total mass of the earth, let’s call it M_E. And it all boils down to

a = \frac{G M_E}{D^2} ,   (1.22)
which is the exact same acceleration you’d get if all the mass of the earth were concentrated at its center23,24. It doesn’t matter how that mass might actually be distributed as a function of r: whatever the function ρ(r) looks like, we’d be getting this same result. Very nice, but also disappointing, because again we end up with a simple expression where G and M_E are multiplied together, and we don’t find any new clue as to how we might separate them from one another.
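Equation (1.22) is easy to check numerically. The following is my own sketch (not from the original text): it sums the contributions (1.10) over a grid of “bricks”, with an arbitrary depth-dependent density ρ(r), and compares the result with G M_E/D².

```python
import math

G = 6.674e-11        # gravitational constant, SI units (unknown to Newton!)
R = 6.371e6          # radius of the sphere ("earth"), meters
D = 1.5 * R          # observation point, outside the sphere

def rho(r):
    """Density profile depending on r only; the linear decrease from
    center to surface is an arbitrary choice, just for the test."""
    return 13000.0 - 10000.0 * r / R

n = 200
dr, dth = R / n, math.pi / n
a = 0.0              # acceleration towards the center, Eq. (1.11)
M = 0.0              # total mass, accumulated brick by brick
for i in range(n):
    r = (i + 0.5) * dr
    for j in range(n):
        th = (j + 0.5) * dth
        xi2 = D * D + r * r - 2 * D * r * math.cos(th)                # Eq. (1.15)
        cos_alpha = (D * D + xi2 - r * r) / (2 * D * math.sqrt(xi2))  # Eq. (1.19)
        dV = r * r * math.sin(th) * dr * dth * 2 * math.pi  # phi integral = 2*pi
        a += G * rho(r) * cos_alpha * dV / xi2
        M += rho(r) * dV

print(a, G * M / D**2)   # the two numbers agree closely, whatever rho(r) is
```

Swapping in any other function of r alone for rho leaves the agreement intact, which is precisely the point of Eq. (1.22).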
1.7 Bouguer’s Expedition to the Andes

After all this, we can go back to Bouguer and co., who, unlike their colleagues in Lapland, were in the middle of a major mountain range (the Andes). Besides the difficulty of estimating horizontal distances in a landscape that’s very much non-horizontal, they realized that they were faced with another problem, whose magnitude was not clear and could be important. (And the stuff we’ve just learned will come in handy to understand the issue.) To measure latitude back then people would use a plumb line, that is, a line with a plumb attached to it, that, as you now understand, should point directly towards the center of the earth (so if you measure the angle between the plumb line and the direction of a star that lies on the plane of the earth’s equator, when that star crosses your meridian, that angle is your latitude). But the theory that we’ve derived above, and on which this reasoning is based, holds only under the assumption that the earth is a spherical or approximately spherical body. Bouguer and his partner La Condamine
Fig. 1.10 Bouguer’s measurements of plumb-line deflection near Chimborazo: make two measurements of deflection on the two sides of a mountain (of approximately symmetric shape and mass density), and take their average to reduce the error
realized that, being in the Andes, that approximation was not that great anymore: near their observation point, the figure of the earth would very much depart from spherical symmetry, being “perturbed” by enormous mountains. Their plumb would be attracted not only by the spherical earth under their feet, but also by the mountain nearby: and so the line would not point exactly towards the center of the earth, but form an angle with respect to it, whose value should depend (we shall see the details of this in a minute) on the ratio of the mass of the mountain to that of the whole planet. The angle would be quite small, as the earth is still way bigger than any mountain, but possibly large enough to mess up the length-of-a-degree-of-latitude measurement, which obviously had to be as high-res as possible. Bouguer and La Condamine came to the conclusion it was worthwhile to quantify the error, and concocted the following trick25 . Pick a meridian that goes through the mountain. Pick two points, lying on that meridian, located at opposite sides of the mountain. Using a plumb line, measure the elevation of a known star at both points, at the same time of the day, e.g. when it reaches its zenith. The sum of the two elevations is the angle β of Fig. 1.10. If there was no deflection, β would coincide with the latitude difference26 , α; the discrepancy between β and the latitude difference is twice the deflection. The point is that, while the deflection is tiny and hard to measure directly, α and β are reasonably large, plus doing two measurements instead of one you should reduce the error, too. The French implemented this experiment at a particularly massive mountain called Chimborazo (a volcano, currently inactive, over 6000 m high). Now, here is what you do with the deflection: think of the force acting on the plumb line as the sum of two forces: the gravitational attraction of the earth minus
Chimborazo—let us call it W, and make it boldface to remind ourselves that it’s more than just one number: forces have size, but also direction27—and the gravitational attraction of Chimborazo alone, which we shall call F. It is OK to assume that the earth is close enough to a sphere and that mass density within the earth only changes with depth—different chunks of earth material should always have the same density if they are at the same depth (this is a simplification that is still used by geophysicists today, at least occasionally, to simplify their calculations)—and we have seen that in such simple cases W points towards the geometrical center of the earth. F points, instead, towards Chimborazo. Or, I should say, to a point within Chimborazo—Chimborazo’s so-called “center of mass”28. To know where exactly that point is, we should do an integral kind of like the one we’d done earlier over the volume of the earth, except instead of over a sphere this time we’d be integrating over the volume of Chimborazo (not easy, obviously), and also we’d need some estimate of the density of the rocks that form Chimborazo. So, to a first approximation, let us assume that F is simply parallel to the earth’s spherical surface (I think that that is what Bouguer did): and, as a consequence, perpendicular to the earth’s attraction. Now, the plumb line necessarily points in the direction of the total gravitational force acting on the plumb, or the combination of F and W. When the bob stops oscillating, it means that this total force is compensated by tension in the plumb line: another force, that we shall call T. F, W and T are all drawn in Fig. 1.11. Let us do the force balance in the directions of F and of W. Essentially, the vector sum29 of F and W must coincide with T in magnitude, but point in the opposite direction, so that the plumb doesn’t move.
The magnitude of F must coincide with that of the projection of T in the horizontal direction, i.e., F = T sin γ, where F is the magnitude of F, i.e., just its size—direction doesn’t matter—and γ is the deflection, see Fig. 1.11. As for the vertical direction, same reasoning: W = T cos γ. And their ratio,

\frac{F}{W} = \tan\gamma .   (1.23)

Fig. 1.11 Measurement of plumb-line deflection near a mountain: the force balance
We can use Newton’s law of gravitation to write the magnitudes F and W in terms of the earth’s and Chimborazo’s masses. Earth’s attraction:

W = \frac{G m M_E}{R^2} ,   (1.24)

where m is the mass of the plumb, M_E is the mass of the earth, and R is its radius; Chimborazo’s attraction:

F = \frac{G m M_C}{d^2} ,   (1.25)

where M_C is the total mass of Chimborazo, and d the distance between the observation point and the surface projection of the mountain’s top. Take the ratio of F to W,

\frac{F}{W} = \frac{M_C}{M_E} \frac{R^2}{d^2} ,   (1.26)

and you might have noticed that G has cancelled out, which is a very good thing. We have used two independent constraints from Newton’s theory (that (i) tension in the plumb line is equal and opposite to the total gravitational attraction acting on the plumb, and that (ii) gravitational forces associated with Chimborazo and with the rest of the planet are both accurately described by the law of gravitation) to write the ratio of magnitudes F/W in two different ways. We can equate expressions (1.23) and (1.26), to find

\tan\gamma = \frac{M_C}{M_E} \frac{R^2}{d^2} ,   (1.27)

or, if you solve for M_E/M_C,

\frac{M_E}{M_C} = \frac{R^2}{d^2 \tan\gamma} ,   (1.28)
that is, an expression for the ratio between the masses of the earth and of the mountain, in terms of three other, known and/or measurable quantities; and, finally, no sign of G. After taking measurements and doing all this math, Bouguer estimated γ to be about 7.5″ (where the double prime stands for seconds of arc; a second of arc is 1/60 of a minute of arc, which is 1/60 of a degree), which is much smaller than what he would have expected—which was something like 1 min of arc30. His calculations were quite rough, though, as we’ve seen: for instance, assuming density to vary only with depth was dangerous, because a density heterogeneity within the planet, similar in size to Chimborazo, and close to the observation point, would be enough to mess everything up31: and “we know very little about the density of the earth”, writes Bouguer in his commentary towards the end of his book. Bottom line, he doesn’t really trust his observation of γ that much, and doesn’t attempt to give a figure for the earth’s average density.
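To get a feel for the numbers, here is a rough sketch of my own based on Eq. (1.28); the distance d is a guessed, purely illustrative value, since the text doesn’t give one.

```python
import math

R = 6.371e6                            # earth's radius, meters
d = 1.0e4                              # plumb-to-mountain distance: an ASSUMED
                                       # ~10 km, only to fix orders of magnitude
gamma = (7.5 / 3600) * math.pi / 180   # Bouguer's 7.5 arcsec, in radians

# Eq. (1.28): M_E / M_C = R^2 / (d^2 * tan(gamma))
ratio = R**2 / (d**2 * math.tan(gamma))
print(f"M_E / M_C ~ {ratio:.1e}")
```

With these (guessed) numbers the earth outweighs Chimborazo by a factor of order 10¹⁰, which is why the deflection is so small and so hard to measure.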
1.8 Maskelyne and the Schiehallion

This was attempted some years later by Nevil Maskelyne, Astronomer Royal, who in 1772 read to the Royal Society a paper formulating his “Proposal for Measuring the Attraction of Some Hill in This Kingdom by Astronomical Observations”32. Maskelyne wanted to verify, by experiment, Newton’s hypothesis that “the attraction of gravity be exerted [...] not only between the large bodies of the universe, but within the minute particles of which these bodies are composed, or into which the mind can imagine them to be divided”. If Newton is right, says Maskelyne, a plumb line near a mountain, or hill, would not point precisely towards the center of the earth, but would be slightly deflected toward the mountain or hill itself. And so his plan was to redo what Bouguer had tried to do near Chimborazo, but under much more favourable conditions—i.e., a hill of nice symmetric shape, in Scotland. Maskelyne had become involved with the curvature problem, and the problem of somehow correcting curvature data from the deflections caused by mountains, when Charles Mason and Jeremiah Dixon were hired by the Royal Society to survey the borders between Pennsylvania, Maryland and Delaware33. Finding themselves with plenty of geodesy gear on very flat terrain, Mason and Dixon figured this could be a good chance to try and measure the length of a degree of latitude, redoing what their French colleagues had done in Ecuador and Lapland a few decades before.
They published their results in a Philosophical Transactions article34 to which Maskelyne wrote a postscript; by now, the experiment had been repeated at various latitudes by several researchers, and, “as it may be agreeable to the reader to see the result of the principal measures of degrees of latitude, that have been taken with later instruments and proper accuracy brought together into one view,” Maskelyne included in his postscript a table (page 577 of the Transactions volume) showing the value in Paris toises of the length of one degree of latitude, as a function of the latitude where it had been measured. If one assumes that the earth is an “oblate spheroid,” i.e., if meridians are elliptical, then it should be possible to find a unique ellipse fitting all these observations. Maskelyne shows that this is not the case. He infers that “either the figures of the meridians are not accurately elliptical, or that the inequalities of the earth’s surface have a considerable effect in deflecting the plumb line from its true situation, or both. Mr. M.35 had indeed supposed that any deflections of the plumbline were not to be feared with respect to this particular measure of a degree, at the end of his Introduction to Messrs. Mason and Dixon’s account of the same, by arguing, perhaps too far, from the level disposition of the country through which the degree passes. But Mr. Henry Cavendish has since considered this matter more minutely; and having mathematically investigated several rules for finding the attraction of the inequalities of the earth, has, upon probable suppositions of the distance and height of the Allegany mountains from the degree measured, and the depth and declivity of the Atlantic ocean, computed what alteration might be produced in the length of the degree, from the attraction of the said hills, and the defect of attraction of the Atlantic; and finds the degree may have been diminished by 60 or 100 toises from these causes. 
He has also found, by similar calculations, that the degrees measured in
Italy, and at the Cape of Good Hope, may be very sensibly affected by the attraction of hills, and defect of the attraction of the Mediterranean Sea and Indian Ocean.” Cavendish would later continue to consider the matter more and more minutely, as we shall see. Maskelyne figured that the best way to sort out the mountain deflection effect was to think big, and, like I was saying, redo Bouguer’s experiment under better conditions. It was, again, Charles Mason who went to the field, looking for the right place to give it a try. He found Schiehallion, of which Maskelyne wrote, “a remarkable hill, nearly at the centre of Scotland, of sufficient height, tolerably detached from other hills, and considerably larger from East to West than from North to South”. The fact that Schiehallion stood in isolation from any nearby hills would reduce the gravitational influence of the latter; because it was steep (on its northern and western slopes), instruments could be deployed close to its “center of mass”36 and the tiny deflection that was to be measured would accordingly be maximized. To borrow a ubiquitous, ugly expression from today’s typical research grant proposals: the Schiehallion experiment included two “work packages”. The first consisted of measuring the positions of stars with respect to a plumb line, to the North and to the South of the mountain, as Bouguer had done in Peru. Maskelyne first spent six weeks in a temporary observatory built to the South of Schiehallion, making in all 169 observations of the zenith distances of 39 stars. Observations took time also because bad weather would inevitably delay them; many observations were necessary so that one could then take an average value of plumb deflection, and so reduce the observational error. Seven more weeks were spent on the North side of Schiehallion, resulting in 168 observations made on 37 stars. From those data, Maskelyne estimated the β of Fig.
1.10 to be 54.6″, while the latitudes of the two observation points differed by 42.94″, and so that gives a discrepancy of 11.6″, twice the γ of Fig. 1.11, which must be interpreted according to Eq. (1.27): except that now M_C does not stand for the mass of Chimborazo, but for the much smaller mass of Schiehallion. The goal of the second work package was to measure the volume and mass, and “center of mass” of the mountain as precisely as possible. The mass of Schiehallion could be estimated from its volume, if one simply multiplied volume by the weight-per-unit-volume of rock samples—which you can measure. So, to get volume, the mountain was first of all surveyed, with the usual methods: chains, theodolites for triangulation37, barometers to determine altitude. Then, the task of “integrating” all these measurements to compute Schiehallion’s volume and “center of mass” was assigned to a mathematician named Charles Hutton (not to be confused with his contemporary James Hutton, about whom more in the next chapter). Maskelyne discusses and interprets the results of work packages one and two in his 1775 Philosophical Transactions paper38. No important differences between his method and that of Bouguer: it all boils down to (see Fig. 1.10) identifying the directions of T (the plumb line, compared to fixed stars), W (towards the center of the earth), and F (towards Schiehallion’s center of mass), except that measurements now are more precise; also, C. Hutton had come up with an estimate of where exactly Schiehallion’s center of mass actually is, and if we want the vector F to point towards it, as it should, we have to accept that F won’t be exactly parallel to the surface of
the earth; so the algebra gets slightly more convoluted, but really not much different. Eventually, it all boils down to the discrepancy γ between where the plumb line actually points, and where it would point if there were no Schiehallion. In his paper Maskelyne compares the γ that he has observed with a value of γ that Hutton has computed, using the math that we’ve seen above, and the hypothesis that the density of rock be 2.5 times the density of water (which is the same as saying, as we would today, 2500 kg/m³), i.e., a reasonable rough estimate for the average weight of rock samples picked up at the earth’s surface: but Maskelyne and co., for lack of better knowledge, assumed that this estimate was okay not only within the volume of Mount Schiehallion, but all the way down to the center of the planet. The two γs turn out to be very different, with Hutton’s γ about twice as large as Maskelyne’s. That meant, if Hutton hadn’t messed something up39, that the earth’s overall gravitational pull was stronger than expected; and the only way to explain this was to conclude that the average density of the whole planet was higher than “2.5 times the density of water,” i.e., somewhere within our planet there existed rocks much heavier than those that we see up here. After some more algebra, Maskelyne estimated the average density of the earth to be 4.5 times that of water40.
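The logic of that last inference can be compressed into a couple of lines (my own sketch, not Maskelyne’s actual algebra): with M_C and the geometry held fixed, Eq. (1.27) says γ scales as M_C/M_E, i.e., inversely with the earth’s mean density; so the ratio of Hutton’s computed γ (which assumed mean density 2.5) to the observed γ simply rescales that assumed density.

```python
# Rough version of Maskelyne's inference; the "about twice" factor is the
# one quoted in the text, and the scaling gamma ~ 1/rho_earth follows from
# Eq. (1.27) with the mountain's mass and the geometry held fixed.
rho_assumed = 2.5    # mean earth density Hutton assumed, in units of water's
gamma_ratio = 2.0    # computed gamma / observed gamma: "about twice as large"
rho_earth = rho_assumed * gamma_ratio
print(rho_earth)     # ~5: same ballpark as Maskelyne's more careful 4.5
```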
1.9 Cavendish’ Experiment

A problem with Maskelyne’s result is that it rests on the assumption that the density of observed rock samples be a good estimate for the average density of Schiehallion. That’s not necessarily very accurate. One way to do away with this problem is to run a similar experiment, but replace the mountain (Chimborazo, Schiehallion or whatever) with an object that can be weighed independently, so that we know its mass precisely. The Reverend John Michell41, a contemporary of Maskelyne, designed an “apparatus” to “determine the density of the earth, by rendering sensible the attraction of small quantities of matter; but, as he was engaged in other pursuits, he did not complete the apparatus till a short time before his death, and did not live to make any experiments with it. After his death, the apparatus came to the Rev. Francis John Hyde Wollaston, Jacksonian Professor at Cambridge, who, not having conveniences for making experiments with it, in the manner he could wish, was so good as to give it to me.” Me here is Cavendish, whom we’ve met briefly not long ago, and the text is from the first paragraph of his 1798 Philosophical Transactions paper, “Experiments to Determine the Density of the Earth.” Cavendish goes on to explain that “the apparatus is very simple; it consists of a wooden arm, 6 feet long, made so as to unite great strength with little weight. This arm is suspended in horizontal position, by a slender wire 40 inches long, and to each extremity is hung a leaden ball, about 2 inches in diameter; and the whole is enclosed in a narrow wooden case, to defend it from the wind. [...] If the wire is sufficiently slender, the most minute force, such as the attraction of a leaden weight a few inches in diameter, will be sufficient to draw the arm sensibly aside. The weights which Mr. Michell intended to use were 8 inches diameter. One of these was to be placed on one side
Fig. 1.12 Cavendish’ experiment. The large (black) spheres are fixed. The small (gray) spheres attached to the rod are free to rotate about the wire
the case, opposite to one of the balls, and as near it as could conveniently be done, and the other on the other side, opposite to the other ball, so that the attraction of both these weights would conspire in drawing the arm aside; and, when its position, as affected by these weights, was ascertained, the weights were to be removed to the other side of the case, so as to draw the arm the contrary way, and the position of the arm was to be again determined; and, consequently, half the difference of these positions would shew how much the arm was drawn aside by the attraction of the weights.” In other words (look at the diagram in Fig. 1.12), if Newton is right and there is a gravitational attraction between the small moving bobs and the large fixed weights, the moment you add the large weights to the system, the moving arm will rotate as a result of the bobs being attracted each to the closest weight. This will tend to twist the wire, and the wire will naturally resist this deformation, more or less strongly depending on how “slender” it is (and on the material of which it is made, etc.). At some angle φ (see Fig. 1.12), the torsion of the wire will balance the gravitational attraction: that would be the new equilibrium point of the system: so if you have a way (which you do, as we will see in a second) to measure the force associated with the torsion of the wire at that point, you have measured gravitational attraction. You might be wondering how we get from there to the density of the earth. Well, remember Newton’s law of universal gravitation, stating that any pair of bodies attract one another with a force proportional to their masses and inversely proportional to the square of their distance: which is what Eq. (1.6) says.
If we call F the force of gravitational attraction, Mb the mass of one of the bobs, Mw that of one of the weights, and d the distance (at equilibrium) between a bob and the closest weight, that reads
$$F = \frac{G M_w M_b}{d^2},\qquad(1.29)$$
and if you have a way to measure the force associated with the torsion of the wire, then you know F, and then the only thing in this equation that you don’t know is the proportionality constant G. Now consider another two-body system, consisting of the earth, whose mass Me at this point is not known, and one bob of mass Mb. The weight Wb of the bob can be measured: it is nothing but the gravitational force that attracts the bob toward the center of the earth. So, according to Newton’s law,
$$W_b = \frac{G M_e M_b}{R_e^2},\qquad(1.30)$$
where Re denotes the earth’s radius. Divide Eq. (1.29) by (1.30), and
$$\frac{F}{W_b} = \frac{M_w R_e^2}{M_e d^2},\qquad(1.31)$$
and again, if you have a way to measure F, then the only unknown quantity that is left here is Me: so you just have to solve for Me,
$$M_e = \frac{M_w W_b R_e^2}{F d^2},\qquad(1.32)$$
substitute the symbols with numbers and do the arithmetic42 . Now, you understand that the crux of Michell’s and Cavendish’ work is how to measure the force, F, applied on the bob by the wire, through its torsion. “This Mr. Michell intended to do,” writes Cavendish, “by putting the arm in motion, and observing the time of its vibrations.” The “time” of vibrations being what today we would call their period, or (the inverse of) their frequency. A similar approach was used, at the time, in early studies of electromagnetic attractions, which are also often very small: in a footnote, Cavendish mentions that “Mr. Coulomb43 has, in a variety of cases, used a contrivance of this kind for trying small attractions; but Mr. Michell informed me of his intention of making this experiment, and of the method he intended to use, before the publication of any of Mr. Coulomb’s experiments.” Anyway, why does one need to measure a frequency of oscillation to figure out a force? Cavendish takes all of this for granted in his paper, but let’s do some physics. First of all, what oscillations are we talking about? Well, think of Cavendish’ apparatus, even without the large weights, to make it simpler: just the two bobs, attached to a horizontal rod suspended, through its midpoint, by a string. Start with the wire not twisted, the system at rest. If now the bobs are pulled aside along the horizontal plane, you’ll probably guess that the rod will start oscillating within that plane, and will do so for a very long time. In fact, Newton’s laws tell us that it would oscillate forever if it wasn’t for some resisting force—friction with air, for example—that would eventually stop it. It is actually not so hard to work this out mathematically, i.e., to find a formula that describes the motion of the bobs over time; and it gets easier if we neglect the mass of the rod (which Cavendish didn’t, but then again we can’t just cover everything). 
Also, the bobs are forced to move along a circle whose radius coincides with the half-length of the rod—from its center to one of the bobs—so the only thing that can change over time is the angle, φ in Fig. 1.12, that the rod forms with respect to some reference position. Let’s agree that φ = 0 when there is no torsion in the wire. If,
then, we take L to denote the rod’s length; by trigonometry the displacement of the bob is $\frac{L}{2}\varphi$. Now, as usual with this kind of physics exercise, you have to equate force to mass times acceleration—where acceleration now is not g but the unknown (for the time being) actual acceleration of the bob, and the problem then sort of solves itself. Let’s look at this in detail. Acceleration is the ratio of a change in the bob’s velocity to the length of the time interval over which this change occurs. The bob’s velocity is in turn the ratio of a change in its position, versus time. Since displacement is $\frac{L}{2}\varphi$ and $\frac{L}{2}$ is constant, velocity is $\frac{L}{2}\frac{\delta\varphi}{\delta t}$, where t denotes time and δ means variation: δφ is a change in φ and δt the time it takes for that change to occur. The letter δ also usually indicates that variations are small; the smaller they are, the closer we are to instantaneous velocity, the precise velocity of the bob at a certain moment in time—which is what we need to determine instantaneous acceleration. Bottom line, we replace $\frac{\delta\varphi}{\delta t}$ with the derivative44 of φ with respect to t, $\frac{d\varphi}{dt}$. Then we can get acceleration by differentiating, again, with respect to t, i.e., taking the derivative of velocity with respect to t, which is the same as the second derivative of displacement with respect to t. This could be written $\frac{L}{2}\frac{1}{\delta t}\delta\left(\frac{\delta\varphi}{\delta t}\right)$—which, be careful, is not the same thing as $\delta\varphi^2/\delta t^2$—but the commonly accepted notation is the more compact $\frac{d^2\varphi}{dt^2}$, or $\frac{\partial^2\varphi}{\partial t^2}$. It follows that, in Cavendish’ setup, Newton’s old “force equals mass times acceleration” translates, for each bob, to
$$F = m\,\frac{L}{2}\,\frac{d^2\varphi}{dt^2}.\qquad(1.33)$$
Now, Hooke’s empirical law (which we’ll meet again in Chap. 6) stipulates that
$$\frac{L}{2}\,F = -k\varphi,\qquad(1.34)$$
where F is a force applied at one end of the rod, and k a positive constant that depends on the material the string is made of, etc. (Incidentally, $\frac{L}{2}F$ is also called the torque that results from the string’s torsion.) If we start off with the rod at some nonzero angle φ0, i.e., with some nonzero initial torsion in the string, the string pulls the rod toward the 0-torsion setup, which we’ve chosen to correspond with φ = 0. The motion that ensues is described, implicitly, by Eq. (1.33), which, after replacing F with Hooke’s formula (1.34) and doing a little algebra, becomes
$$-k\varphi = m\left(\frac{L}{2}\right)^2\frac{d^2\varphi}{dt^2}.\qquad(1.35)$$
Incidentally, Eq. (1.35) can be described as stating that the torque from the string’s torsion equals the moment of inertia (which is what people like to call $m(L/2)^2$, though a better name is probably rotational inertia) of the rod/point mass system, times its angular acceleration. You can see this as an angular form of Newton’s law, with the moment of inertia replacing mass, sort of, and the angle φ replacing distance45:
the person who first reformulated Newton’s physics in this “angular” perspective is Leonhard Euler. If you start with the bobs at φ = 0 and with no initial velocity or acceleration, nothing happens, because there’s no movement in the bobs and no force from the string (or anything else); but if you start off with the bobs displaced at some angle from the 0-torsion position, the system will begin to move according to Eq. (1.35). Knowing what the bob will do over time amounts to finding the unknown relation, between φ and t, that is implicit in (1.35); or, in mathematical terms, to finding the unknown function φ = φ(t): which are different ways of saying that we need to solve the differential equation (1.35). There are no easy recipes to solve differential equations46 , but in this case, if you already have some knowledge of derivatives, sines, cosines, you might be able to convince yourself that
$$\varphi(t) = \varphi_0\,\cos\left(\frac{2}{L}\sqrt{\frac{k}{m}}\,t\right)\qquad(1.36)$$
works as a solution of (1.35): just compute its second derivative with respect to t, and check47,48 that it does coincide with the right-hand side of (1.35). This is true independent of the value assigned to the arbitrary constant φ0. But then if you consider that at t = 0, the cosine is just 1 and our expression for φ(t) collapses to φ(0) = φ0, you understand that the value of φ0 should coincide with the initial value of φ. (And, just as we anticipated based on physics alone, if you start at φ = 0 and with no initial acceleration, it follows that φ0 = 0, and but then φ(t) = 0 at all times t, i.e., nothing moves.) To understand what Eq. (1.36) means in practice, consider that, as its argument changes, a cosine oscillates back and forth, always in the same way, between one and minus one. That means that the behavior of the system never changes, whatever the value of t, positive or even negative: the bob oscillates forever. Also remember that cos x goes through one full oscillation—from one all the way to minus one and then back to one—each time x is incremented or decremented by 2π; here we are dealing with the cosine of $\frac{2}{L}\sqrt{\frac{k}{m}}\,t$, which means that one full oscillation of the bob will take a time T such that
$$2\pi = \frac{2}{L}\sqrt{\frac{k}{m}}\,T,\quad\text{or}\quad T = \pi L\sqrt{\frac{m}{k}}\qquad(1.37)$$
seconds. Everything in Eq. (1.37) is known except for k. Cavendish put “the arm in motion, and observ[ed] the time of its vibrations”, i.e., T. All he had to do, then, was solve Eq. (1.37) for k, replace symbols with numbers, do the arithmetic. Now, finally: put the larger weights back at their place in the apparatus. Their gravitational force immediately attracts the bobs, which, as we now understand, will
respond by oscillating—but they will oscillate around a new equilibrium position, corresponding to the value of φ for which the string’s torque is equal and opposite to the torque from gravitational attraction by the Mw’s; call it φe. We know k, now. We can measure49 φe. All we need to do, then, is multiply them together, according to (1.34), to get F, from which we can get the mass of the earth via Eq. (1.32). After dividing by the earth’s volume—we know its radius, and it’s easy to get the volume, 4/3 π times the radius cubed, if we know the radius—Cavendish wrote that the earth’s density was about 5.48 times the density of water, or 5480 kg/m3, which you might want to compare with the modern “accepted” figure of 5515 kg/m3: not that big a difference, as you see. Maskelyne’s and Cavendish’ estimates of the earth’s mass differ from one another, but both imply that the average density of our planet must be much higher than that of any rock sample we might collect here at the surface. So the earth must have some sort of internal structure, or in any case the materials that make it up must become much denser as depth increases towards the planet’s center. Which looks like an important thing to know about the earth, and makes this a good place to wrap the chapter up and move on, looking at things from some other angle.
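The chain of inferences above can be sketched numerically: invert Eq. (1.37) for k, get F from the equilibrium deflection via Hooke’s law (1.34), then apply Eq. (1.32). A minimal Python sketch follows; all apparatus numbers (m, L, Mw, d, T) are made-up stand-ins of roughly the right size, not Cavendish’s actual data, and the modern value of G appears only to manufacture a plausible “observed” deflection φe, which Cavendish of course measured instead:

```python
import math

# All apparatus numbers below are assumed, illustrative values --
# roughly the right order of magnitude, not Cavendish's actual data.
m = 0.73         # kg, mass of one bob
L = 1.83         # m, length of the rod (about 6 feet)
Mw = 158.0       # kg, mass of one large weight
d = 0.22         # m, bob-to-weight distance at equilibrium
Re = 6.371e6     # m, earth's radius (Chap. 1)
g = 9.81         # m/s^2, so the bob weighs Wb = m*g

# Step 1: torsion constant k from the measured free-oscillation period,
# inverting Eq. (1.37): T = pi*L*sqrt(m/k)  =>  k = m*(pi*L/T)^2.
T = 420.0                       # s, assumed measured period
k = m * (math.pi * L / T) ** 2

# Step 2: manufacture a consistent "observed" deflection phi_e from the
# modern G (Cavendish measured this; here we only simulate the reading).
G = 6.674e-11
phi_e = (L / 2) * (G * Mw * m / d**2) / k

# Step 3: invert the measurement: Hooke's law (1.34) gives F from phi_e,
# and Eq. (1.32) gives the mass of the earth, hence its mean density.
F = 2 * k * phi_e / L
Wb = m * g
Me = Mw * Wb * Re**2 / (F * d**2)          # Eq. (1.32)
rho = Me / ((4.0 / 3.0) * math.pi * Re**3)
print(f"mass of the earth: {Me:.2e} kg, mean density: {rho:.0f} kg/m^3")
```

By construction this recovers the modern values (about 5.97 × 10^24 kg and 5500 kg/m3); the point is only the structure of the inference, which is exactly Cavendish’s.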
Chapter 2
The Structure of the Earth: Rotation, Precession, Cosmogony
Our planet rotates once a day on its axis. The rate of this rotation—the exact duration of a day, i.e., the time between two successive sunrises—is not perfectly constant; it varies a little, and very slowly, over time; and the same is true of the direction of the rotation axis—which changes, also very slightly, kind of like the axis of a spinning top. Using fixed stars as a reference, astronomers have been measuring these things for ages. And physicists and mathematicians have discovered mathematical relationships between the parameters that describe the earth’s rotation and its variations over time, and the shape of the earth, and how mass is distributed within it: the structure of our planet. Those relationships, and how they are derived, are the first big topic of this chapter. They are, basically, a version of Newton’s laws, adapted so as to be effective when we study things that rotate. This was developed by Leonhard Euler in the decades after Newton’s death, and it is one of the reasons Euler is so famous—chances are you’ve heard his name before, even if you’re not a scientist.
2.1 Euler’s Laws of Motion

Newton’s second law says that force is proportional to “linear” acceleration, that is, to the second time-derivative of displacement. In problems that involve rotation, like the earth’s spinning, or the rotation of the moon around the earth, etc., it is easier to measure the “angular” acceleration, which we’ve met in Chap. 1 when we went through Cavendish’ experiment. Now, Cavendish’ is a very simple example of the rotational version of Newton’s second; Euler worked out the general case, where you don’t have just a couple of material points, but a rotating object of finite size. The way he did it was, he applied Newton’s second to an arbitrarily large set of material points, or, which is the same, infinitely small mass elements—which, taken together, would form the finite object we are interested in. In general, mass elements might attract (or repel) one another50, i.e., for each mass element, call it i, which
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 L. Boschi, Our Concept of the Earth, https://doi.org/10.1007/978-3-031-71579-2_2
is an integer number—its “index”—there’ll be a force fi j exerted on it by the mass element number j. At the very least, gravity. Then there might be some “external” forces, too, acting on some or all mass elements, but not caused by any of the mass elements in question: we are going to call Fi the net sum of all external forces acting on i, and so then for each i Newton’s second gives
$$m_i\,\frac{d^2\mathbf{r}_i}{dt^2} = \sum_{\substack{j=1\\ j\neq i}}^{N}\mathbf{f}_{ij} + \mathbf{F}_i,\qquad(2.1)$$
where N is the total number of mass elements that there are, mi is the mass of the i-th mass element, ri its position, and t, as usual, is time. (Force, displacement, acceleration: all those entities have both a magnitude and a direction: i.e., they are vectors51: which is why, remember Chap. 1, I am typing them in boldface fonts.) Now take the cross product52 of both sides of (2.1) with ri,
$$m_i\,\mathbf{r}_i\times\frac{d^2\mathbf{r}_i}{dt^2} = \sum_{\substack{j=1\\ j\neq i}}^{N}\mathbf{r}_i\times\mathbf{f}_{ij} + \mathbf{r}_i\times\mathbf{F}_i,\qquad(2.2)$$
and sum over all values of i,
$$\sum_{i=1}^{N} m_i\,\mathbf{r}_i\times\frac{d^2\mathbf{r}_i}{dt^2} = \sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N}\mathbf{r}_i\times\mathbf{f}_{ij} + \sum_{i=1}^{N}\left(\mathbf{r}_i\times\mathbf{F}_i\right).\qquad(2.3)$$
The reason this will become useful (in ways that you will see in a minute) is that the first term at the right-hand side is zero, which let’s see why that is the case: for a given pair of values of i, j, you’ve got two terms, ri × fi j and r j × f ji, which are both included in the double sum at the right-hand side of (2.3). In practice, that means that you can write
$$\sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N}\mathbf{r}_i\times\mathbf{f}_{ij} = \sum_{i=1}^{N}\sum_{j=1}^{i-1}\left(\mathbf{r}_i\times\mathbf{f}_{ij} + \mathbf{r}_j\times\mathbf{f}_{ji}\right) = \sum_{i=1}^{N}\sum_{j=1}^{i-1}\left(\mathbf{r}_i-\mathbf{r}_j\right)\times\mathbf{f}_{ij},\qquad(2.4)$$
where we’ve also remembered that by Newton’s third (action and reaction), fi j = −f ji. Now, consider that it is safe to assume the forces fi j we are dealing with are attractive (gravity), or in any case directed along the line that connects points i and j (the forces that keep crystals together): then ri − r j is always parallel to fi j, which
in turn implies (the way cross product works) that the right-hand side of (2.4) is zero, QED. And so Eq. (2.3) boils down to
$$\sum_{i=1}^{N} m_i\,\mathbf{r}_i\times\frac{d^2\mathbf{r}_i}{dt^2} = \sum_{i=1}^{N}\left(\mathbf{r}_i\times\mathbf{F}_i\right).\qquad(2.5)$$
The cross product ri × Fi is what people call the torque, exerted by force Fi on the i-th mass element which lies at ri. In most books, you’ll find (2.5) in a slightly different form; because usually at this point one notices that
$$\frac{d}{dt}\left(\mathbf{r}_i\times\frac{d\mathbf{r}_i}{dt}\right) = \frac{d\mathbf{r}_i}{dt}\times\frac{d\mathbf{r}_i}{dt} + \mathbf{r}_i\times\frac{d^2\mathbf{r}_i}{dt^2} = \mathbf{r}_i\times\frac{d^2\mathbf{r}_i}{dt^2},\qquad(2.6)$$
again by the properties of cross product. Substituting (2.6) into (2.5),
$$\frac{d}{dt}\sum_{i=1}^{N} m_i\,\mathbf{r}_i\times\frac{d\mathbf{r}_i}{dt} = \sum_{i=1}^{N}\left(\mathbf{r}_i\times\mathbf{F}_i\right),\qquad(2.7)$$
which is just another way of stating (2.5). Incidentally, people like to call the i-th term in the sum at the left-hand side of (2.7) the “angular momentum” of mass element i, while the sum as a whole would be called the total angular momentum of the system, etc. The right-hand side, like I was saying, coincides with the total (“net”) torque that external forces exert on the system. It makes sense at this point to switch from sum to integral53, so we can drop the index i, and Eq. (2.7) becomes
$$\frac{d}{dt}\int_V dV\,\rho(\mathbf{r})\,\mathbf{r}\times\frac{d\mathbf{r}}{dt} = \int_V dV\,\left[\mathbf{r}\times\mathbf{F}(\mathbf{r})\right],\qquad(2.8)$$
where F is a force-per-unit-volume vector. Now I come to the reason we’ve gotten into all this math. Like I said, Euler’s motivation in doing all this was that he was looking for a version of Newton’s laws that’d be easier to apply to problems involving rotation. For instance, the earth revolves about its rotation axis, completing one full revolution in one day, i.e., with constant angular velocity: 360◦/day. So let’s see what the equation we’ve just found means for the earth. Take a Cartesian reference frame whose origin is at the center of the earth. Let’s say the rotation axis goes through the center of the earth (which, to a very good approximation, it should), and let’s pick the vertical axis to coincide with where the rotation axis is at a time t. (Because, yes, the rotation axis is not necessarily exactly fixed, as you are about to see.) Call Ω(t) the magnitude of earth’s
angular velocity; then, at least for a short time δt after t, a point r within the earth moves according to the rotation formula54,55
$$\begin{pmatrix} r_1(t+\delta t)\\ r_2(t+\delta t)\\ r_3(t+\delta t)\end{pmatrix} = \begin{pmatrix} \cos(\Omega(t)\,\delta t) & -\sin(\Omega(t)\,\delta t) & 0\\ \sin(\Omega(t)\,\delta t) & \cos(\Omega(t)\,\delta t) & 0\\ 0 & 0 & 1\end{pmatrix}\cdot\begin{pmatrix} r_1(t)\\ r_2(t)\\ r_3(t)\end{pmatrix}.\qquad(2.9)$$
It follows from (2.9), and from the fact that the cosine of a really small number is approximately one, and that the sine of a really small number is approximately the same as its argument56, that
$$\begin{pmatrix} r_1(t+\delta t)\\ r_2(t+\delta t)\\ r_3(t+\delta t)\end{pmatrix} - \begin{pmatrix} r_1(t)\\ r_2(t)\\ r_3(t)\end{pmatrix} = \begin{pmatrix} \cos(\Omega(t)\,\delta t)-1 & -\sin(\Omega(t)\,\delta t) & 0\\ \sin(\Omega(t)\,\delta t) & \cos(\Omega(t)\,\delta t)-1 & 0\\ 0 & 0 & 0\end{pmatrix}\cdot\begin{pmatrix} r_1(t)\\ r_2(t)\\ r_3(t)\end{pmatrix} \approx \begin{pmatrix} 0 & -\Omega(t)\,\delta t & 0\\ \Omega(t)\,\delta t & 0 & 0\\ 0 & 0 & 0\end{pmatrix}\cdot\begin{pmatrix} r_1(t)\\ r_2(t)\\ r_3(t)\end{pmatrix} \approx \begin{pmatrix} -\Omega(t)\,\delta t\,r_2(t)\\ \Omega(t)\,\delta t\,r_1(t)\\ 0\end{pmatrix}.\qquad(2.10)$$
Now define an angular velocity vector Ω(t), of magnitude Ω(t) and parallel to the rotation (and, initially, vertical) axis, i.e. Ω(t) = (0, 0, Ω(t)). You can check that the cross product of that with r(t), times δt, coincides with the right-hand side of (2.10); so then (2.10) can be rewritten
$$\mathbf{r}(t+\delta t) - \mathbf{r}(t) = \delta t\;\boldsymbol{\Omega}(t)\times\mathbf{r}(t),\qquad(2.11)$$
which, if we divide both sides by δt, and think of δt as arbitrarily small, we get57,58
$$\frac{d\mathbf{r}}{dt}(t) = \boldsymbol{\Omega}(t)\times\mathbf{r}(t).\qquad(2.12)$$
While we’ve chosen a convenient reference frame to derive it, (2.12) will still hold even if we rotate the reference frame around—we could always re-derive it the way we just did, after a transformation of coordinates to a system where the vertical axis coincides with the direction of Ω at time t. Having learned all this, let’s go back to Eq. (2.8). What we want to do is we want to replace dr/dt in (2.8) with the expression we’ve just found, Eq. (2.12). Let’s first see what this does to the cross product that’s inside the integral, at the left-hand side of (2.8);
$$\begin{aligned}\mathbf{r}\times\frac{d\mathbf{r}}{dt} &= \mathbf{r}\times(\boldsymbol{\Omega}\times\mathbf{r})\\ &= (\mathbf{r}\cdot\mathbf{r})\,\boldsymbol{\Omega} - (\mathbf{r}\cdot\boldsymbol{\Omega})\,\mathbf{r}\\ &= (r_1^2+r_2^2+r_3^2)\begin{pmatrix}\Omega_1\\\Omega_2\\\Omega_3\end{pmatrix} - (r_1\Omega_1+r_2\Omega_2+r_3\Omega_3)\begin{pmatrix}r_1\\r_2\\r_3\end{pmatrix}\\ &= \begin{pmatrix} (r_2^2+r_3^2)\,\Omega_1 - r_1 r_2\,\Omega_2 - r_1 r_3\,\Omega_3\\ (r_1^2+r_3^2)\,\Omega_2 - r_1 r_2\,\Omega_1 - r_2 r_3\,\Omega_3\\ (r_1^2+r_2^2)\,\Omega_3 - r_1 r_3\,\Omega_1 - r_2 r_3\,\Omega_2\end{pmatrix}\\ &= \begin{pmatrix} r_2^2+r_3^2 & -r_1 r_2 & -r_1 r_3\\ -r_1 r_2 & r_1^2+r_3^2 & -r_2 r_3\\ -r_1 r_3 & -r_2 r_3 & r_1^2+r_2^2\end{pmatrix}\cdot\begin{pmatrix}\Omega_1\\\Omega_2\\\Omega_3\end{pmatrix},\end{aligned}\qquad(2.13)$$
where the second step was because, given any three vectors a, b, c, the cross product has the property59 that a × (b × c) = (a · c)b − (a · b)c. Equation (2.13) is valid in whatever fixed reference frame, which means also that the components Ω1, Ω2 and Ω3 of Ω can be and in general will be nonzero.
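Equation (2.13) is easy to check numerically: for arbitrary vectors r and Ω, the triple product r × (Ω × r) should equal the 3 × 3 matrix of (2.13) applied to Ω. A minimal sketch in plain Python:

```python
import random

# Check Eq. (2.13): r x (Omega x r) equals the symmetric matrix built
# from r, applied to Omega, for randomly chosen r and Omega.

def cross(a, b):
    # cross product of two 3-vectors
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def matrix_of_r(r):
    # the 3x3 matrix appearing in the last line of Eq. (2.13)
    r1, r2, r3 = r
    return [[r2*r2 + r3*r3, -r1*r2,         -r1*r3],
            [-r1*r2,         r1*r1 + r3*r3, -r2*r3],
            [-r1*r3,         -r2*r3,         r1*r1 + r2*r2]]

random.seed(0)
for _ in range(100):
    r = [random.uniform(-1, 1) for _ in range(3)]
    Om = [random.uniform(-1, 1) for _ in range(3)]
    lhs = cross(r, cross(Om, r))
    M = matrix_of_r(r)
    rhs = [sum(M[i][j] * Om[j] for j in range(3)) for i in range(3)]
    assert all(abs(x - y) < 1e-12 for x, y in zip(lhs, rhs))
print("Eq. (2.13) checks out for 100 random r, Omega")
```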
2.2 The Inertia Tensor

With this, we are ready to introduce the so-called “inertia tensor”, i.e., a 3 × 3 matrix60 I whose components are defined
$$I_{ij} = \int_V dV\,\rho(\mathbf{r})\,(r_k r_k\,\delta_{ij} - r_i r_j),\qquad(2.14)$$
so that, using also (2.13), Eq. (2.8) reduces to
$$\frac{d}{dt}\left(\mathbf{I}\cdot\boldsymbol{\Omega}\right) = \int_V dV\,\left[\mathbf{r}\times\mathbf{F}(\mathbf{r})\right].\qquad(2.15)$$
Incidentally, if you take I to be constant over time (which is the same as saying that ρ(r) is constant over time), (2.15) boils down to
$$\mathbf{I}\cdot\frac{d\boldsymbol{\Omega}}{dt} = \int_V dV\,\left[\mathbf{r}\times\mathbf{F}(\mathbf{r})\right],\qquad(2.16)$$
which looks a lot61 like Newton’s second in its form (2.1): angular acceleration dΩ/dt has replaced linear acceleration, and torque has replaced force. The math that we’ve seen in the last few pages was mostly derived by Euler (though in a different form, for instance because Euler, like Newton, didn’t have
vectors, or matrices). Already in Euler’s time, Eq. (2.15) helped people to find out new things about the earth, as we are about to see. First of all, we’ve just learned that, back then, the earth was already known to be an ellipsoid, rather than a sphere. So then, let’s calculate the values of I11, I12, etc. for the case of an ellipsoid. If you remember the equation that describes a spherical surface in Cartesian coordinates,
$$x_1^2 + x_2^2 + x_3^2 = a^2,\qquad(2.17)$$
with a² a positive constant, then maybe you’ll believe me if I tell you that the surface of an ellipsoid is given by
$$\frac{x_1^2}{a^2} + \frac{x_2^2}{b^2} + \frac{x_3^2}{c^2} = 1,\qquad(2.18)$$
where a², b² and c² are all positive constants, and they are also the squared lengths of the three “axes”, as they are called, of the ellipsoid; if the three axes have the same length, then you should see that you get (2.17) back. To calculate the moment of inertia of an ellipsoid means to work out the integrals in (2.14) for the case when V is the volume bounded by the surface defined by (2.18). Take for example I33; the way you do the integral is you do a transformation of the integration variables,
$$I_{33} = \int_V dr_1\,dr_2\,dr_3\,\rho(r_1,r_2,r_3)\,(r_1^2+r_2^2) = a\,b\,c\int_{V'} dr'_1\,dr'_2\,dr'_3\,\rho(a r'_1, b r'_2, c r'_3)\left[(a r'_1)^2 + (b r'_2)^2\right],\qquad(2.19)$$
where r′1 = r1/a, r′2 = r2/b, r′3 = r3/c, and V′ is the volume of a sphere that has radius equal to one: which is good, because you’ve learned in the previous chapter how to integrate over a sphere, right? then you do another transformation of coordinates, i.e., you switch from Cartesian to spherical (coordinates), so that the volume element dr′1 dr′2 dr′3 = r² sinϑ dr dϑ dϕ and r, ϑ, ϕ have the same meaning as in chapter 1. If you also make the assumption that ρ changes only with depth62, ρ = ρ(r), (2.19) becomes
$$\begin{aligned} I_{33} &= a\,b\,c\int_0^1 dr\int_0^{2\pi} d\varphi\int_0^{\pi} d\vartheta\;\rho(r)\,r^2\sin\vartheta\left(a^2 r^2\sin^2\vartheta\sin^2\varphi + b^2 r^2\sin^2\vartheta\cos^2\varphi\right)\\ &= a\,b\,c\int_0^1 dr\int_0^{2\pi} d\varphi\int_0^{\pi} d\vartheta\;\rho(r)\,r^4\sin^3\vartheta\left(a^2\sin^2\varphi + b^2\cos^2\varphi\right)\\ &= a\,b\,c\int_0^1 dr\,\rho(r)\,r^4\int_0^{2\pi} d\varphi\left(a^2\sin^2\varphi + b^2\cos^2\varphi\right)\int_0^{\pi} d\vartheta\,\sin^3\vartheta,\end{aligned}\qquad(2.20)$$
where notice that the three integrals can be treated independently of one another. The integral over ϑ is solved with, again, a transformation of the integration variable,
i.e., introducing u = cosϑ,
$$\int_0^{\pi} d\vartheta\,\sin^3\vartheta = \int_0^{\pi} d\vartheta\,\sin\vartheta\,\sin^2\vartheta = -\int_0^{\pi} d\vartheta\,\frac{d\cos\vartheta}{d\vartheta}\left(1-\cos^2\vartheta\right) = \int_{-1}^{1} du\,(1-u^2) = \left[u - \frac{u^3}{3}\right]_{-1}^{1} = \frac{4}{3}.\qquad(2.21)$$
The integral of sin²ϕ over ϕ is done by parts63, sort of:
$$\int_0^{2\pi} d\varphi\,\sin^2\varphi = \int_0^{2\pi} d\varphi\,\sin\varphi\,\sin\varphi = \left[-\cos\varphi\,\sin\varphi\right]_0^{2\pi} - \int_0^{2\pi} d\varphi\,(-\cos\varphi)\cos\varphi = \int_0^{2\pi} d\varphi\left(1-\sin^2\varphi\right),\qquad(2.22)$$
which if you bring $\int_0^{2\pi} d\varphi\,\sin^2\varphi$ from the right- to the left-hand side,
$$\int_0^{2\pi} d\varphi\,\sin^2\varphi = \frac{1}{2}\int_0^{2\pi} d\varphi = \pi.\qquad(2.23)$$
From which we can also quickly derive the integral of cos²ϕ, because
$$\int_0^{2\pi} d\varphi\,\cos^2\varphi = \int_0^{2\pi} d\varphi - \int_0^{2\pi} d\varphi\,\sin^2\varphi = \pi.\qquad(2.24)$$
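If you’d rather have a computer confirm the integrations by parts, the angular integrals of Eqs. (2.21), (2.23) and (2.24) are quick to check with a naive midpoint-rule quadrature (plain Python, no libraries assumed):

```python
import math

def integrate(f, a, b, n=100_000):
    # naive midpoint rule; plenty accurate for these smooth integrands
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

i_sin3 = integrate(lambda x: math.sin(x)**3, 0, math.pi)      # expect 4/3
i_sin2 = integrate(lambda x: math.sin(x)**2, 0, 2 * math.pi)  # expect pi
i_cos2 = integrate(lambda x: math.cos(x)**2, 0, 2 * math.pi)  # expect pi
print(i_sin3, i_sin2, i_cos2)
```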
Finally, the integral over r in (2.20) can’t be solved unless one has a formula for ρ(r), which at this point we don’t, so we’re left with
$$I_{33} = a\,b\,c\,\frac{4\pi}{3}\left(a^2+b^2\right)\int_0^1 \rho(r)\,r^4\,dr.\qquad(2.25)$$
Now, it’s pretty clear, I guess, that the math we’ve just done could be redone in exactly the same way if the lengths a, b, c of the axes of the ellipsoid were swapped with one another in whatever way: and so by finding I33 we’ve also found
$$I_{11} = a\,b\,c\,\frac{4\pi}{3}\left(b^2+c^2\right)\int_0^1 \rho(r)\,r^4\,dr\qquad(2.26)$$
and
$$I_{22} = a\,b\,c\,\frac{4\pi}{3}\left(a^2+c^2\right)\int_0^1 \rho(r)\,r^4\,dr.\qquad(2.27)$$
As for the off-diagonal terms of I, let’s look for example at I12. Through Eq. (2.14),
$$I_{12} = -\int_V dr_1\,dr_2\,dr_3\,\rho(r_1,r_2,r_3)\,r_1 r_2 = -a^2 b^2 c\int_{V'} dr'_1\,dr'_2\,dr'_3\,\rho(a r'_1, b r'_2, c r'_3)\,r'_1 r'_2 = -a^2 b^2 c\int_0^1 dr\,\rho(r)\,r^4\int_0^{2\pi} d\varphi\,\sin\varphi\cos\varphi\int_0^{\pi} d\vartheta\,\sin^3\vartheta,\qquad(2.28)$$
where we’ve played all the same tricks as above. The main thing that’s changed with respect to (2.20) is the integral over ϕ. So let’s check that out; it is solved through the same transformation of variables as for the ϑ-integral we’ve just done, i.e., if u = cosϕ, then
$$\int_0^{2\pi} d\varphi\,\sin\varphi\cos\varphi = -\int_0^{2\pi} d\varphi\,\frac{d\cos\varphi}{d\varphi}\,\cos\varphi = -\int_0^{\pi} d\varphi\,\frac{d\cos\varphi}{d\varphi}\,\cos\varphi - \int_{\pi}^{2\pi} d\varphi\,\frac{d\cos\varphi}{d\varphi}\,\cos\varphi = -\int_1^{-1} du\,u - \int_{-1}^{1} du\,u = \int_{-1}^{1} du\,u - \int_{-1}^{1} du\,u = 0.\qquad(2.29)$$
Which, again, is independent of how we pick a, b, c: and so we’ve just proved that all the off-diagonal terms of I are zero. So here’s the full inertia tensor of an ellipsoid where ρ = ρ(r),
$$\mathbf{I} = a\,b\,c\,\frac{4\pi}{3}\begin{pmatrix} b^2+c^2 & 0 & 0\\ 0 & a^2+c^2 & 0\\ 0 & 0 & a^2+b^2\end{pmatrix}\int_0^1 \rho(r)\,r^4\,dr.\qquad(2.30)$$
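Equation (2.30) can also be spot-checked by brute force. For constant density (so that the r-integral is just ρ/5) the sketch below estimates I33 by Monte Carlo, throwing random points into the bounding box of the ellipsoid and keeping those that fall inside; the semi-axis lengths are arbitrary test values:

```python
import math
import random

random.seed(1)
a, b, c = 3.0, 2.0, 1.0   # arbitrary semi-axes for the test
rho = 1.0                  # uniform density, so int_0^1 rho*r^4 dr = rho/5

N = 500_000
box_vol = 8 * a * b * c    # volume of the box [-a,a] x [-b,b] x [-c,c]
acc = 0.0
for _ in range(N):
    x = random.uniform(-a, a)
    y = random.uniform(-b, b)
    z = random.uniform(-c, c)
    if (x / a)**2 + (y / b)**2 + (z / c)**2 <= 1.0:  # inside the ellipsoid
        acc += rho * (x * x + y * y)                 # integrand of I33
I33_mc = acc / N * box_vol

# closed form from Eq. (2.30): abc*(4pi/3)*(a^2+b^2)*(rho/5)
I33_exact = a * b * c * (4 * math.pi / 3) * (a * a + b * b) * rho / 5
print(I33_mc, I33_exact)
```

The two numbers agree to a fraction of a percent with this many samples; the same trick works, of course, for any component of (2.30).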
We know from Bouguer and co. that the earth is, to a good approximation, an axially symmetric ellipsoid: flattened at the poles and bulging at the equator. So let’s say the x3 axis goes through the poles, while the other two lie on the equatorial plane, then a = b, and we can write the approximate inertia tensor of the earth,
$$\mathbf{I} = \begin{pmatrix} A & 0 & 0\\ 0 & A & 0\\ 0 & 0 & C\end{pmatrix},\qquad(2.31)$$
where $A = a\,b\,c\,\frac{4\pi}{3}(b^2+c^2)\int_0^1\rho(r)\,r^4\,dr = a\,b\,c\,\frac{4\pi}{3}(a^2+c^2)\int_0^1\rho(r)\,r^4\,dr$, $C = a\,b\,c\,\frac{4\pi}{3}(a^2+b^2)\int_0^1\rho(r)\,r^4\,dr$. To see what all this means for the earth and its rotation, we should go back to Eq. (2.15); let’s also make the assumption that, to some reasonable approximation, external torques64 can be neglected, i.e. (2.15) can be reduced to
$$\frac{d}{dt}\left(\mathbf{I}\cdot\boldsymbol{\Omega}\right) = \mathbf{0}.\qquad(2.32)$$
This equation means that in the absence of external forces, and if the earth is rigid—i.e. ρ(r) is constant, so the inertia tensor I is constant—, the angular velocity won’t change over time. Now, I didn’t particularly emphasize it, and sorry if I wasn’t clear, but in what we’ve done so far it was assumed that all displacements and velocities, etc., are measured in a fixed reference frame, i.e., one that doesn’t move with respect to the so-called fixed stars. Newton’s laws hold in such a system. What if we decided to write things with respect to a reference frame that moves along with the earth? for example, take a frame whose axes coincide with the symmetry axes of the ellipsoid, so that I has exactly the expression that we’ve just derived. (The earth being axially symmetric, its two equatorial axes can be picked arbitrarily—just use some geographic landmark, like the Greenwich observatory or whatever.) In a way, that’s our most natural reference frame, because we stand on the surface of the earth and make all our observations from there, while rotating together with the earth. This time, no requirement for the Cartesian axes to be in any way aligned with the rotation axis. Working in a rotating reference frame means that, for any vector v that gets differentiated with respect to time in equations that were derived in the non-rotating frame, we’ve got to replace65 dv/dt with dv/dt + Ω × v: and so then in the rotating frame (2.32) becomes
$$\frac{d}{dt}\left(\mathbf{I}\cdot\boldsymbol{\Omega}\right) + \boldsymbol{\Omega}\times\left(\mathbf{I}\cdot\boldsymbol{\Omega}\right) = \mathbf{0}.\qquad(2.33)$$
2.3 Precession (Free and Forced)

It’s not so hard to work out (2.33) for the case of an axially symmetric ellipsoid, whose I we now know so well. Substitute (2.31) into (2.33), do the algebra (it’s a good exercise, and easier than a lot of what we’ve done so far), and
$$\begin{cases}\dfrac{d\Omega_1}{dt} + \dfrac{C-A}{A}\,\Omega_2\,\Omega_3 = 0,\\ \dfrac{d\Omega_2}{dt} - \dfrac{C-A}{A}\,\Omega_1\,\Omega_3 = 0,\\ \dfrac{d\Omega_3}{dt} = 0.\end{cases}\qquad(2.34)$$
Notice that the system of differential equations (2.34) has the same form (although the physics is different) as the system (2.10) that we’ve met not so long ago. And so
if you remember how we ended up with (2.12), you should be able to see that the solution to (2.34) has the same form as Eq. (2.9), i.e.,
$$\begin{cases}\Omega_1(t+\delta t) = \Omega_1(t)\cos\left(\frac{C-A}{A}\,\Omega_3(t)\,\delta t\right) - \Omega_2(t)\sin\left(\frac{C-A}{A}\,\Omega_3(t)\,\delta t\right),\\ \Omega_2(t+\delta t) = \Omega_1(t)\sin\left(\frac{C-A}{A}\,\Omega_3(t)\,\delta t\right) + \Omega_2(t)\cos\left(\frac{C-A}{A}\,\Omega_3(t)\,\delta t\right),\\ \Omega_3(t+\delta t) = \Omega_3(t),\end{cases}\qquad(2.35)$$
which tells us how the angular velocity of the earth, the vector Ω, evolves with time t. The last of the (2.35) says that its vertical component Ω3, i.e., its projection on the vertical axis, is constant, so we might as well simply write
$$\begin{cases}\Omega_1(t+\delta t) = \Omega_1(t)\cos\left(\frac{C-A}{A}\,\Omega_3\,\delta t\right) - \Omega_2(t)\sin\left(\frac{C-A}{A}\,\Omega_3\,\delta t\right),\\ \Omega_2(t+\delta t) = \Omega_1(t)\sin\left(\frac{C-A}{A}\,\Omega_3\,\delta t\right) + \Omega_2(t)\cos\left(\frac{C-A}{A}\,\Omega_3\,\delta t\right).\end{cases}\qquad(2.36)$$
These two equations that we are left with say66 that the point identified by Ω1(t), Ω2(t) moves along a circle, starting at some initial position Ω1(0), Ω2(0) and then completing one full revolution every A/[(C − A)Ω3] days, or months or years or whatever unit of measurement you want to use for time. Euler predicted this phenomenon via the math we’ve just seen, this circular displacement of the earth’s rotation axis with respect to the earth’s symmetry axis, and called it “precession”, or actually “free precession” to remind us that that’s actually the precession we would get if there were no external torques at all. (People sometimes use the word “nutation” instead of “precession”, and/or treat “nutation” and “precession” as synonyms. And but “nutation” can also mean something else altogether. I’ll stick to “precession”.) Now, besides the free precession, there is also, like I was saying, a forced one. If we were to factor the attraction from sun and moon back in67, and do the math again, we’d get another pair of functions Ω1(t) and Ω2(t), different from their “free” counterparts, and but also related68 to A and C. The trajectory of the rotation pole that we’d expect to observe would be given by the sums69 of the Ω1’s, for the first component, and of the Ω2’s for the second, that we get with and without external torque. From the point of view of the so-called geophysicist, who’s interested in finding out stuff about the interior of the earth, the reason all this is interesting is that the motion of the earth’s rotation pole is something that can be measured by astronomers, using fixed stars as reference. And the data show, indeed, that the pole’s trajectory is the sum of two “wobbles”; a relatively fast (but very subtle) one, with a period of about 430 days, and a much slower one—one cycle every 26,000 years or so. The 430-day precession was first measured by Seth Carlo Chandler70 in 1891; to this day, people like to call it the Chandler wobble.
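For a hypothetical earth of constant density, the common factor abc(4π/3)∫ρr⁴dr cancels from A/(C − A), and the free-precession period is easy to evaluate: with Ω3 equal to one cycle per day, it is just (a² + c²)/(a² − c²) days. The radii below are rough modern values of the equatorial and polar radii:

```python
# Free-precession ("Euler") period for a constant-density earth, from
# Eq. (2.30) with a = b: A/(C - A) = (a^2 + c^2)/(a^2 - c^2), the common
# factor abc*(4pi/3)*int(rho r^4 dr) cancelling out.
a = 6378.1  # km, equatorial radius (rough modern value)
c = 6356.8  # km, polar radius (rough modern value)
period_days = (a * a + c * c) / (a * a - c * c)
print(f"free-precession period, uniform rigid earth: {period_days:.0f} days")
```

This gives about 299 days, in the neighborhood of the roughly 304 days expected for a rigid earth with a realistic density profile; the lengthening to the observed 430-day Chandler period is, as it turned out, mostly an effect of the earth not being perfectly rigid.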
The 26,000-year precession is also often called “precession of the equinoxes”; because over the centuries, as the orientation of the earth’s rotation axis slowly changes, the points, along the earth’s orbit, where equinoxes happen must also change. It is also called luni-solar precession—because it is the effect of the moon’s and sun’s gravity pull. The precession of the equinoxes
has been known and studied for ages: apparently, it was first noticed by Hipparchus (which see Chap. 1), when he compared his observations with data that had been collected a couple centuries before—long enough to appreciate the precession. So, now, anyway: the period of free precession is A/[(C − A)ω₃]; ω₃ is the frequency at which the earth revolves on itself, i.e., one day to the minus one; if I have a “model” of how density changes with respect to depth between the surface and the center of the earth, I can calculate A and C through Eq. (2.14); I can sub the numbers I get into A/[(C − A)ω₃] and get a numerical estimate. There’s a formula for the period of forced precession, too, which depends on A and C and ω₃, and I can implement it in the same way. This way, people71 could make sure that the Chandler wobble is free precession, and that the precession of the equinoxes is forced precession. Once that was sorted out, then, one could actually use this idea to improve the density model: see if changing how mass is distributed within the earth, and redoing the math, improves the fit between observed and calculated wobbles72. That is essentially what a German researcher, Emil Wiechert, did in the last years of the nineteenth century. In his 1897 paper73 he combined a central core of about 5000 km radius with an outer shell about 1400 km thick; took the density of the shell to be 3.2 times the density of water, and the density of the core to be 8.2 times that of water; he did the math and found that this planet would have a total mass equal to that of the earth, and that the components of its inertia tensor, A and C, would be in agreement with precession data74. The density of iron is 7.8 under “normal” conditions, so Wiechert surmised the core should be made of iron—which would be confirmed later on by people looking at things from very different angles, as we shall see.
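The back-of-the-envelope estimate just described is easy to sketch numerically. The values of A and C below are modern reference numbers assumed here for illustration (they are not given in the text); the rigid-earth period that comes out is Euler's roughly ten months, not the observed 430 days, the discrepancy being, as was eventually understood, an effect of the earth not being perfectly rigid.

```python
# Euler's rigid-earth ("free") precession period, A / [(C - A) * omega_3].
# A and C below are assumed modern reference values (kg m^2); they are
# not numbers taken from the text.

A = 8.0101e37   # equatorial moment of inertia
C = 8.0365e37   # polar moment of inertia
omega_3 = 1.0   # spin rate in cycles per day, so the period comes out in days

period_days = A / ((C - A) * omega_3)
print(round(period_days))  # roughly 303 days, vs. the ~430 days Chandler observed
```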
We shall also see that Wiechert’s estimate of the radius of the iron core is, probably, wrong—today we have a number of good reasons to think that the outer shell is twice as thick, and the core much smaller (its radius only slightly more than half the earth’s radius). The first person to “use” precession data for geophysics, though, had probably been William Hopkins, circa 1840⁷⁵. Hopkins’ idea was that the earth had a liquid core, surrounded by a solid outer shell, or “crust”. (In a minute we’ll get to why he thought the interior of the earth had to be liquid: that’s the second big topic of this chapter.) To keep things simple, Hopkins also assumed that the fluid that the core is made of is what people call an “inviscid” fluid, that is, a fluid that doesn’t produce any friction—so it won’t resist motion unless motion involves compression. So then if the outer shell rotates around an axis that goes through the center of the earth, the core—which is not going to be compressed at all by that kind of motion—won’t resist that. Now, if Hopkins’ hypothesis is right, the precession that we observe (living as we do on the outer surface of the outer shell) is the precession of the outer shell alone: because it can’t resist motion, the fluid core doesn’t affect the motion of the outer shell at all. That means that the relevant inertia tensor is not that of a sphere or ellipsoid but, rather, that of a shell: and the formulae that give us the components of the inertia tensor, A and C, must be modified accordingly—i.e., in Eq. (2.31) the integration bounds should not be 0 and 1, but, rather, the ratio of the inner radius of the shell to the earth’s radius, and 1. I’ll let you do the math; you should find that, if the inertia tensor is that of a shell, and we want A and C to fit precession data,
then the shell has got to be at the very least 800–1000 km thick, says Hopkins, and extremely dense. It’s easier to fit the data (i.e., you don’t need the shell’s density to be super high) if you assume the whole planet to be solid. Back in his day (i.e., half a century before Chandler’s observations), Hopkins must have come to this conclusion on the basis of the precession of the equinoxes alone: no clear measurements of free precession, yet. There is at least one big problem with Hopkins’ model, though: there is no reason to expect that the core of the earth should be a frictionless fluid. It might very well be a viscous fluid; we’ll learn about viscosity much later in this book, but basically just think of a fluid that does resist shear to some extent: think honey, oil, glue (all pretty viscous) in comparison to water (approximately no viscosity), which doesn’t resist shear at all. Think of lava (very viscous): look at how slowly even the most liquid of lavas flow out of a volcano, and how friction (shear stress) within a lava resists gravity; and consider that lava must definitely be better than water as a proxy for rocks inside the earth. If we accept that the deep interior of the earth behaves sort of like lava, we have to expect a significant amount of friction between the “crust” and whatever is underneath it, so that, on the time-scale of the precession of the equinoxes, the viscous portion of the earth is likely to stick with the crust, and Hopkins’ reasoning doesn’t hold. There were good reasons, though, for Hopkins to look at a two-shell fluid-solid model. Today, if you ask a geologist what’s the crust, the answer you get will probably revolve around the idea of a so-called “seismic discontinuity” defining its base; meaning that there’s a surface of approximately (very approximately) constant depth, across which the properties of rocks change suddenly. Whatever is above the discontinuity is what we call the crust, and the rest—below—is the “mantle”.
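Hopkins' change to the integration bounds is easy to play with numerically, at least in the simplest possible setting: assume (unlike the real earth) constant density, so that both A and C are proportional to the integral of r⁴, i.e., to R⁵ − r_in⁵ for a shell of inner radius r_in. A minimal sketch, with illustrative numbers that are not Hopkins' own:

```python
# Fraction of a uniform sphere's moment of inertia carried by an outer
# shell: with constant density, I is proportional to R^5 - r_in^5, so
# the fraction is 1 - (r_in / R)^5. All numbers illustrative.

R = 6371e3  # earth's radius, m

def shell_fraction(thickness_m):
    """Fraction of the full (uniform) sphere's moment of inertia
    carried by an outer shell of the given thickness."""
    r_in = R - thickness_m
    return 1.0 - (r_in / R) ** 5

for t_km in (100, 500, 1000):
    f = shell_fraction(t_km * 1e3)
    print(f"{t_km:4d} km shell -> {100 * f:.0f}% of the sphere's inertia")
# prints roughly 8%, 34% and 57%: even a fairly thick uniform shell
# carries only part of the inertia, which is why a thin crust would need
# an implausibly high density to fit the precession data on its own
```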
The crust is “brittle” and the mantle is “ductile”. The depth of the discontinuity changes a bit across the globe, but as an order of magnitude think 10–50 km. But now stop, because of course at this point you are not supposed to know what a seismic discontinuity is: certainly not from this book unless you cheated and already read the later chapters, which you shouldn’t. In reality, the idea that the earth has a “crust” goes back to the late eighteenth/early nineteenth century, i.e., the time of Cavendish and co., and but back then the word “crust” was supposed to mean a solid (and, possibly, thin) shell, as opposed to whatever lies within, which at the time was thought to be fluid. The data which, back then, supported this idea are summarized nicely in a paper by Vincent Deparis, “La Controverse sur la Fluidité de la Terre au XIXe Siècle”, or “The XIX-Century Controversy on the Earth’s Fluidity”76 . Which starts with the most conspicuous piece of evidence in favour of the theory that most of the earth be molten, i.e., the temperature gradient that one can measure at and just below the surface. Deparis quotes Louis Cordier’s77 1827 Essai sur la Température de l’Intérieur de la Terre, or Essay on the Temperature of the Earth’s Interior. Cordier had put together data on how temperature rises with depth in mines (something people who are involved with mines had already known for ages78 ); “temperature would actually grow, on average,” says Cordier (quoted by Deparis, translated into English by yours truly), “by one degree for every 25 m of descent. At such a rate, and assuming that this figure would stay the same independent of depth, the temperature of boiling water
would be reached at 2500 m depth [. . .], while at 50 km depth we’d find a temperature of 1600 °C, that is, a temperature such that any known rock would be molten. The interior of the earth could then only be an ocean of magma.” And so yeah, a thin crust, about 50 km thick, and below that: fusion.
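Cordier's linear extrapolation is simple enough to reproduce; the 0 °C surface temperature below is my assumption, not his.

```python
# Cordier's extrapolation, as quoted by Deparis: temperature rising by
# one degree C for every 25 m of descent, taken constant with depth.
# Surface temperature of 0 C is an assumption made here for simplicity.

def temperature(depth_m, t_surface=0.0):
    """Temperature in degrees C at a given depth, linear 1 C / 25 m gradient."""
    return t_surface + depth_m / 25.0

print(temperature(2500))    # 100.0 -> boiling water at 2.5 km, as Cordier says
print(temperature(50_000))  # 2000.0 at 50 km with this naive constant
                            # gradient (Cordier's own figure there is 1600 C)
```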
2.4 The Formation of the Earth (and Other Planets)

And, Deparis explains, the thing is, Cordier’s inference was confirmed by people looking at things from totally different angles. The famous mathematical physicist Pierre-Simon Laplace, in his 1796 “cosmogony”, Exposition du Système du Monde, pointed to the remarkable fact that all planets of the solar system revolve around the sun on the same plane and in the same sense: which could hardly be a coincidence: Laplace figured that all planetary motion must originate from some common cause; he surmised, in Deparis’ words, “that the sun had initially been surrounded by a rotating atmosphere which, being extremely hot, had expanded further away than the current orbits of all planets. The outer boundary of this atmosphere was determined by the balance between gravitational attraction and centrifugal force. But then the atmosphere cooled down, and, in so doing, it became smaller—it ‘contracted’—and denser; then, by the law of conservation of the angular momentum79, its speed of rotation went up. At the outer surface, centrifugal force became bigger than gravitational attraction, which resulted in a ring of gaseous [still super hot] stuff being released along the ‘equatorial’ plane”, i.e., I guess, the plane perpendicular to the rotation axis, and which goes through the center of mass of the whole thing. So then Laplace thinks there’s like a sequence of punctual contractions/emissions, and, each time, the gas that had been released “would condense into a single spherical gaseous mass. Which is how the different planets were formed.” How this condensation occurs, and why the temperature is initially very high, is explained in a paper by François-Désiré Roulin, published in La Revue des Deux Mondes80 in 1833. Roulin summarizes in very clear fashion some ideas from William Herschel and André-Marie Ampère81: “Mr.
Ampère,” he writes, “formulated, in his lectures82 on the natural classification of human knowledge, some very ingenious opinions on the theory of earth, which he was kind enough to develop in more detail during some personal conversations that we had. I shall try here to give an idea of those, but before that, I believe it necessary to briefly recall Herschel’s hypotheses on the formation of the globe”: (i) the solar system is, initially, in the gaseous state. Herschel, says Roulin, “on the basis of his own observations of celestial bodies, and in particular of nebulae, thought it reasonable to infer that the matter of which worlds are made had initially been in the gaseous state.” Because Herschel had observed, Roulin explains, that when you look at nebulae you see “either a homogeneous, diffuse light, similar to that emitted by comet tails, or some occasional, brighter
points, as if gaseous particles were gathering together to form solid or liquid cores.”83 (ii) In his chats with Roulin, Ampère pointed out that if Herschel were right, that is, if the entire solar system had once been gas, then that means that its temperature, back then, must have been higher than the temperature at which the least volatile of all substances that exist in it (in the solar system) could still be in the liquid state. “Whatever this substance”, says Roulin, “let’s call T_A the temperature at which it ceases to exist in the state of an elastic fluid”, i.e., if temperature rises above T_A, the least volatile substance goes from liquid to gas (while everything else is already gaseous). (iii) So, say you have a mass of gas that has separated from the rest of the gaseous proto-solar system, according to the mechanism described by Laplace. “For solid bodies to form out of this immense gaseous mass”, says Roulin, “we have to assume that the mass cools”; as soon as it cools below T_A, the least volatile substance becomes liquid. “By virtue of the mutual gravitational attraction of all its parts, one towards the other, it will take the shape of a sphere—or, if it happens to have a spin, an ellipsoid.” Then nothing happens until a lower temperature, T_B, is reached, where a second substance goes from gas to liquid. “At this point, the second substance is deposited on the initial core, around which it forms a concentric shell”; and so on and so forth. So that’s how planets are formed according to Laplace and Ampère. Very hot stuff that’s initially gaseous, then starts cooling and becomes liquid. And it continues to cool, and has certainly begun to become solid—because we walk on a solid crust—but might well still be liquid in its inner shells—like Cordier had said. And there’s two more pieces of evidence in favour of this idea. For one, it was known from the work of Bouguer and co. (Chap. 1) that the earth was an ellipsoid, flattened at the rotation poles.
In addition, it seemed very likely, from the contributions of Maskelyne and Cavendish (also Chap. 1), that mass density inside the earth must grow with increasing depth. In his Traité de Mécanique Céleste Laplace surmises that that’s because, he says, “if the different substances that make the earth had initially—under the effect of very high temperatures—been fluid, then the denser ones must have collapsed toward the center”, etc. So, to sum up: a bunch of independent lines of reasoning, based on independent data, might lead one to conclude that (1) the earth has been entirely fluid at some point in its history, and that (2) it is probably still fluid to a large extent—although we do walk on a solid “crust”—but chances are the crust is relatively thin. Point (1) is very reasonable, and, as far as I know, there’s still consensus on it, today. On the other hand, a number of issues were soon found with point (2): (i) the issue with the melting point: the melting point is the temperature at which a given solid melts; let’s say it’s agreed that temperature rises with growing depth, OK: but, independently of that, pressure must also grow with depth (pressure is essentially the weight of all the rock that’s above you—so the larger the depth at which you are, the more rock is squashing you down, the higher the pressure): and at higher pressure it’s harder for a solid to melt: i.e., its melting
point will also go up. So we can’t tell a priori whether everything will become molten below a certain depth. This objection was first raised by Hopkins (in the same papers where he looked at the inertia tensor of the “crust” and concluded that the crust had to be very thick). It could be that, below a certain depth, pressure is so high that we’ll have solidification despite the high temperature. Depending on where that depth is, it could be that a shell within the earth has been liquid for a long time, and maybe is still liquid—but the central core is solid. And who knows how big that central core could now be. (ii) Gravitational stability, or lack thereof: earth materials usually become denser (i.e., heavier) as they freeze from lava to rock. This had been seen in experiment already in the mid eighteen hundreds: in a paper published in 1864⁸⁴, William Thomson, AKA Lord Kelvin85, a former student of Hopkins86 at Cambridge, says that “Bischof’s87 experiments, upon the validity of which, so far as I am aware, no doubt has ever been thrown, show that melted granite, slate, and trachyte, all contracted by something about 20 per cent. in freezing”. The materials mentioned by Kelvin, that Bischof looked at, are all found in the earth’s crust, and I guess the implicit assumption is that they can be representative of the earth as a whole: at least, as close as one could be to representative in 1864—when people knew next to nothing about the planet’s chemical composition (which we’ll get to in a few chapters). Kelvin then made the almost obvious inference that a solid can’t float on a fluid that’s less dense: and so, when lava becomes rock, it sinks—meaning, put simply, there’s no way you can have a solid crust because it’s gravitationally unstable—unless the whole earth has already become solid. And but since the earth does have a solid crust, then Kelvin concluded that the whole earth is now solid.
(iii) The issue with tides: if the earth were mostly fluid, then the moon and sun would raise tides not only in the oceans, but also in our planet’s interior. “Those who admit the liquidity of the inner core of the Earth,” said, e.g., Ampère—according to Roulin and Deparis, which see above—“seem not to have considered the action that the Moon would exert on this enormous liquid mass: action from which would result tides analogous to those of our seas, but much more terrible, both by their extent and by the density of the liquid.” Basically, those tides would end up breaking the crust apart. Or, should the crust withstand such a strain, the solid earth as a whole would be moved up and down by the moon’s and sun’s attraction just about as much as the oceans: and the tides we observe—which are nothing but the difference between the ocean’s and the solid earth’s motions—would be much smaller than they are.
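Objection (i), Hopkins' pressure argument, is easy to make crudely quantitative: to first order, pressure at depth is the weight of the overlying rock, P ≈ ρgh. The uniform density and constant gravity in this sketch are simplifying assumptions of mine, not Hopkins' figures.

```python
# Lithostatic pressure, crudest version: P ~ rho * g * h, i.e. the
# weight of the overlying column of rock. Uniform density and constant
# g are illustrative assumptions.

RHO = 3300.0  # rock density, kg/m^3 (illustrative)
G = 9.8       # gravitational acceleration, m/s^2, taken constant with depth

def pressure_GPa(depth_m):
    """Approximate lithostatic pressure in GPa at the given depth."""
    return RHO * G * depth_m / 1e9

print(round(pressure_GPa(50_000), 1))  # about 1.6 GPa already at the depth
                                       # where Cordier expected everything molten
```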
Chapter 3
The Forces that Shape the Earth: Neptunism Versus Plutonism
When I was a graduate student, the “geoclub” at my American university had a T-shirt that read “no vestige of a beginning, no prospect of an end.” As an Italian, and somewhat bohemian, expat, I had a tendency to quickly leave the institute after my (many) hours of research work, and I didn’t really take part in the geoclub activities. It felt like one of those American things. Why would I want to spend time with my work colleagues, if friendships didn’t just develop naturally? (which they did, in some cases, and then I would hang out with those friends away from the lab, in alternative clubs and parties involving “independent” music and everything that comes with it. It’s not like I am particularly proud of that now, but it is just the way it was, and it might say something about why this book is being written, and therefore it might make sense to mention this here.) So, anyway, I didn’t think that much about the motto on those T-shirts, as I never bought one and I would almost never join those guys in their outdoors expeditions (this might also be the right place, then, to mention that I was never a geologist; I had a physics bachelor’s; I couldn’t tell basalt from granite; I preferred our big city’s concrete jungle to the geologically interesting exotic locations those guys would choose for their social field-trip events; my thesis work consisted mostly of writing software in a “computer room”)... I didn’t have much reason to think about it, but I do remember that motto because I always thought it was funny: I understood it as a joke on the typical graduate student condition, lost in a seemingly endless research project, that for most of us had started already a long time ago—and yet there was no end in sight. I also perhaps had the intuition that it was a quotation from something, presumably something geological, though, which I wasn’t curious enough to ask about. 
So, I had a funny feeling when, something like ten years after my graduation, I found that quotation in John McPhee’s book, Basin and Range (I had entered a phase in my life when I figured it wasn’t enough for me to think of being a researcher in geophysics as “just a job,” and that if I wanted to continue doing it without developing a major depression, I should try to address
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 L. Boschi, Our Concept of the Earth, https://doi.org/10.1007/978-3-031-71579-2_3
the questions, why are we doing this? what am I writing software for? etc.). At that point, I was an apprentice writer (because yes, that’s what I was really hoping to do, one day, for a living) in the process of failing, and so deciding to pursue a “career”, so to say, in science; McPhee, instead, was a successful writer who couldn’t help being fascinated by science, specifically by the geosciences (I don’t think he ever wrote much about any other science, while he did write some memorable pages about tennis and the Swiss army), and while he presumably didn’t care at all about software writing (which is not something that I found particularly satisfying in terms of finding meaning for my existence, either; but it was “fun”, sometimes, and I felt happy when things worked), he was pretty good at identifying those big questions that I didn’t even know about—I was mostly a physics major, not a particularly good one, who pragmatically recycled himself into a field where there was a relatively high demand for physicists. So anyway, what a funny feeling, of having come full circle, when, after my decision to finally make an effort, and understand what I had been doing, I ran into that sentence, which McPhee himself quotes from an eighteenth century medical doctor, farmer and amateur geologist living in Edinburgh: James Hutton. McPhee credits Hutton with the discovery of the “angular unconformity”. I’d rather explain this central concept properly, and so I’ll come to it slowly. But, as an introduction, here is the passage, quoted by McPhee, from Hutton’s Abstract of a Dissertation Read in the Royal Society of Edinburgh, upon the Seventh of March and Fourth of April 1785, Concerning the System of the Earth, Its Duration, and Stability: “The purpose of this dissertation is to form some estimate with regard to the time the globe of this earth has existed. [...] 
The world which we inhabit is composed of the materials not of the earth which was the immediate predecessor of the present but of the earth which [...] had preceded the land that was above the surface of the sea while our present land was yet beneath the water of the ocean. Here are three distinct successive periods of existence, and each of these is, in our measurement of time, a thing of indefinite duration [...] The result, therefore, of this physical inquiry is, that we find no vestige of a beginning, no prospect of an end.” When Hutton says “here” he means: at the unconformity—we’ll get to that.
3.1 Charles Lyell

In the mid nineteenth century, the most influential advocate of Hutton’s theories was Charles Lyell, a British geology professor (originally from Scotland, like Hutton, but he spent most of his career as faculty at King’s College, London), author of possibly the best-selling earth science textbook of his century—Principles of Geology. Lyell defines geology as “the science which investigates the successive changes that have taken place in the organic and inorganic kingdoms of nature: it inquires into the causes of these changes, and the influence which they have exerted in modifying the surface and external structure of our planet.
“By these researches into the state of the earth and its inhabitants at former periods, we acquire a more perfect knowledge of its present condition, and more comprehensive views concerning the laws now governing its animate and inanimate productions. When we study history, we obtain a more profound insight into human nature, by instituting a comparison between the present and former states of society. We trace the long series of events which have gradually led to the actual posture of affairs; and by connecting effects with their causes, we are enabled to classify and retain in the memory a multitude of complicated relations—the various peculiarities of national character—the different degrees of moral and intellectual refinement, and numerous other circumstances, which, without historical associations, would be uninteresting or imperfectly understood. As the present condition of nations is the result of many antecedent changes, some extremely remote and others recent, some gradual, others sudden and violent, the state of the natural world is the result of a long succession of events; and if we would enlarge our experience of the present economy of nature, we must investigate the effects of her operations in former epochs.”
3.2 Aristotle’s Meteorology

Lyell does not doubt that, even though “catastrophes” like major earthquakes or landslides or volcanic eruptions do occur, landscape as we see it today—“the state of the natural world,” the shape of the earth—is to a large extent the result of gradual change, going sometimes very far back in time. As we shall see, this is one of the main tenets of Hutton’s doctrine. It was not a new idea. For instance Aristotle wrote, in his Meteorology, that “the changes of the earth are so slow in comparison to the duration of our lives, that they are overlooked; and the migrations of people after great catastrophes and their removal from other regions cause the event to be forgotten.” Aristotle inferred this from observations of the landscape. For example, he thinks that “it is obvious that [in Egypt] the land is continually getting drier and that the whole country is a deposit of the river Nile. But because the neighbouring peoples settled in the land gradually as the marshes dried, the lapse of time has hidden the beginning of the process. [...]” And some more examples follow: apparently, a Pharaoh Sesostris, mentioned by Herodotus, and then later Darius I of Persia, had tried to cut a canal linking the Nile with the Red Sea (“for it would have been of no little advantage to them for the whole region to have become navigable”); but it turned out “that the sea was higher than the land,” so the project was abandoned, “lest the sea should mix with the river water and spoil it”. And Aristotle’s inference from this is that “it is clear that all this territory was once unbroken sea”. “For the same reason,” continues Aristotle, “Libya—the country of Ammon—is, strangely enough, lower and hollower than the land to the seaward of it. For it is clear that a barrier of silt was formed and after it lakes and dry land, but in course of time the water that was left behind in the lakes dried up and is now all gone.
Again the silting up of the lake Maeotis [i.e., what today is called the Sea of Azov, between Russia, Ukraine, and Crimea; and it hasn’t dried up, yet!] by the rivers has advanced
so much that the limit to the size of the ships which can now sail into it to trade is much lower than it was sixty years ago. Hence it is easy to infer that it, too, like most lakes, was originally produced by the rivers and that it must end by drying up entirely. Again, this process of silting up causes a continuous current through the Bosporus; and in this case we can directly observe the nature of the process. Whenever the current from the Asiatic shore threw up a sandbank, there first formed a small lake behind it. Later it dried up and a second sandbank formed in front of the first and a second lake. This process went on uniformly and without interruption. Now when this has been repeated often enough, in the course of time the strait must become like a river, and in the end the river itself must dry up.” Maybe because of his “unconscious yearning for stability and permanence in a crumbling world where ‘change’ can only be a change for the worse” (Arthur Koestler’s words, see Chap. 1), it is important for Aristotle to emphasize that, despite these observations, overall the world is not really changing: “those whose vision is limited think that the cause of these effects is a universal process of change, the whole universe being in process of growth. So they say that the sea is becoming less because it is drying up, their reason being that we find more places so affected now than in former times. There is some truth in this, but some falsehood also. For it is true that there is an increase in the number of places that have become dry land and were formerly submerged; but the opposite is also true, for if they will look they will find many places where the sea has invaded the land.” Or in other words (as cited by Lyell in chapter II, book I of his Principles of Geology88 ), “as time never fails, and the universe is eternal, neither the Tanais nor the Nile can have flowed forever. 
The places where they rise were once dry, and there is a limit to their operations: but there is none to time. So of all other rivers; they spring up, and they perish; and the sea also continually deserts some lands and invades others. The same tracts, therefore, of the earth are not, some always sea, and others always continents, but everything changes in the course of time...” It is strange that in Meteorology Aristotle doesn’t even mention another fundamental datum that was available to him and that had been discussed by some of his predecessors: fossils. Or, more specifically, fossil marine creatures found in solid rock at the tops of mountains. In “History and Methods of Paleontological Discovery” (The Popular Science Monthly, 1879) Othniel Charles Marsh reports that “the philosopher Zenophanes, of Colophon, who lived about 500 B.C.” (that is, more than one century before Aristotle), “mentions the remains of fishes and other animals in the stone quarries near Syracuse, the impression of an anchovy in the rock of Paros, and various marine fossils at other places. His conclusion from these facts was, that the surface of the earth had once been in a soft condition at the bottom of the sea;” what is now rock was once sand, or mud, and was covered by water; “and thus the objects mentioned were entombed. Herodotus, half a century later, speaks of marine shells on the hills of Egypt and over the Libyan Desert, and he inferred therefrom that the sea had once covered that whole region. Empedocles, of Agrigentum (450 B.C.), believed that the many hippopotamus-bones found in Sicily were remains of human giants, in comparison with which the present race were as children. Here, he thought, was a battle-field between the gods and the Titans, and the bones belonged
to the slain. Pythagoras (582 B.C.) had already anticipated one conclusion of modern geology, if the following statement, attributed to him by Ovid (Metamorphoses, liber XV, 262), was his own: vidi ego quod fuerat solidissima tellus,/esse fretum: vidi factas ex aequore terras;/et procul a pelago conchae jacuere marinae,” or, “I have seen myself what once was firm land, become the sea: I have seen earth made from the waters: and seashells lie far away from the ocean.” And, again if Ovid is not improvising, this is centuries before Aristotle. But then if you read Theophrastus, who was a student of Aristotle and who wrote a whole treatise On Stones which gets cited all the time in the subsequent literature, again you will see that he makes no mention of the remains of marine animals found on mountain tops.
3.3 Theophrastus

On Stones is a 15-page affair which, to be honest, doesn’t read very well by current standards. Today’s students turning in a homework like that might be failed. That doesn’t mean, of course, that we are smarter than Aristotle and Theophrastus; it just means, again, that we think in a different way. Compared to how we write dissertations today, On Stones seems to lack structure; properties of many different stones are thrown in one after the other, with no apparent connection other than the author’s unconscious associations; there is no attempt at classification; but then you must not forget, for instance, that the whole idea of classification as a useful way of looking at the world is a product of the eighteenth century. (I am not going to bring up Foucault again, but remember Chap. 1.) Bottom line, most of On Stones is, so far as I can tell, an unsystematic account of various rocks’ physical and chemical (as we would say today) properties: whether they are heavy or light, how easy or difficult it is to cut or carve them, what happens if you heat them up, where each species can be found, etc. It’s essentially a database of observations, made by the author or, most likely, reported to the author by other “researchers.” But then, scattered in the text, are also a few theories of the origin of certain rocks. At paragraph 5, it is hinted that, maybe, stones can actually have children: “the greatest and most wonderful power, if this is true, is that of stones which give birth to young.” Theophrastus sounds skeptical about it (“if this is true...”).
But according to Earle Caley and John Richards’ impressively detailed commentary to their 1956 translation of the treatise, “such skepticism is much less evident in the statements of the other ancient writers who touch on this subject.” Caley and Richards were a chemist (from Princeton and later Ohio State University) and a classical philologist (from Columbia), respectively, who spent something like two decades working on the translation and accompanying notes, first independently and then, when they found out about one another, in collaboration. They suggest that the idea of stones being “born” of other stones came from “certain kinds of geode-like concretions that consist of an outer shell within which is contained a clayey, sandy, or stony nucleus. Sometimes the internal material is held so loosely that the concretion rattles when shaken. The ancients apparently believed that such stones were pregnant, and that the mineral matter on the inside was in the process of
being generated.” Pliny, for instance, explains that those, uhm, “procreating” stones are called “eaglestone” because they are “found in eagles’ nests, and the eagles were unable to hatch out their young without the aid of these stones. [...] He names in addition, and sometimes describes briefly, other stones which contained embryo stones within them, such as cyitis and gassinade. He even goes so far as to declare that the period of gestation for the second of these stones was three months. [...] As one might expect, eaglestones were worn in ancient times as amulets to prevent miscarriage.” Now, re rocks that give birth, I am no chemist, no biologist, and, like I said, not a real geologist either, so maybe I shouldn’t speculate on such topics, but I can’t help thinking of coral: which Theophrastus himself discusses in paragraph 38: “coral, which is like a stone, is red in color and rounded like a root, and it grows in the sea.” Caley and Richards add: “Theophrastus was not sure whether coral should be classified as a stone or as a plant. Pliny, on the other hand, was doubtful whether to classify it as a plant or as an animal; for though his descriptions lead one to suppose that he considered it to be a plant, his chapter on coral is included in his book on sea animals and the remedies derived from them. [...] In his notes on this passage, Hill [John Hill, who published an annotated translation of Theophrastus in 1746] reflects the confusion that existed in his day when he says: ‘The Nature and Origin of Coral has been as much contested as any one Point in natural Knowledge; the Moderns can neither agree with the Antients about it, nor with one another; And there are at this Time, among the Men of Eminence in these Studies, some who will have it to be of the vegetable, others of the mineral, and others of the animal Kingdom.’ Hill’s own conclusion, which he defends at length, was that coral is a plant, and he roundly criticizes those who think otherwise. 
But he changed his mind in his second edition, where he says it belongs to the animal kingdom.” I guess what I am trying to convey with all this is that if you spend some time thinking about it, and if you consider that the meanings of words such as ‘stone’ (coral is a stone?) and ‘birth,’ or ‘pregnancy’ have changed through the millennia, then all of a sudden the statement that some stones “give birth to young” is perhaps not that weird anymore. And re what is sometimes called sympathetic magic, i.e., the idea that eagles would be “unable to hatch out their young without the aid of” eaglestones, or the Greeks’ and Romans’ belief “in the special curative value of coral,” which “has long been supposed to possess magical properties; it was therefore worn as an amulet in ancient times, and this practice has by no means disappeared today...”: re all this, remember, again, Michel Foucault’s paragraph on resemblance, which I quoted in Chap. 1. Here are some other theories about the origin of rocks, as they are presented in On Stones. (i) The pearl. Not much to say about it: Theophrastus: “Among choice stones there is also the one called the pearl; this is translucent by nature, and valuable necklaces are made from it. It is produced in an oyster”. (ii) Pumice: Theophrastus: “some think that pumice is formed entirely as a result of burning, with the exception of the kind that is produced from the foam of the sea.” Caley: “The ancients apparently made little or no distinction between true combustion and other high-temperature phenomena [...]. Evidently fire was a term that included all phenomena involving light
and a high temperature. Therefore, when Theophrastus speaks of the origin of pumice from burning, combustion is not to be understood, but rather the formation of this material in the usual way by the expulsion of gases from molten lava.” And also: “The pumice that was thought to be produced from foam is clearly the same as the floating pumice still found around the shores of islands in the Aegean Sea. Such pumice emanates from the active volcanic island of Thera (Santorin), where considerable quantities are to be seen floating on the surface of the water. Theophrastus evidently believed that it was formed in some way from the foam of sea water.” (iii) And finally, a really weird one: “Lyngourion [...] is cold and very transparent, and it is better when it comes from wild animals rather than tame ones and from males rather than females; for there is a difference in their food, in the exercise they take or fail to take, and in general in the nature of their bodies, so that one is drier and the other more moist. Those who are experienced find the stone by digging it up; for when the animal makes water, it conceals this by heaping earth on top. This stone needs working even more than the other kind.” According to Caley, Lyngourion is amber; clearly, Theophrastus takes for granted that amber is petrified urine; Caley thinks that this idea arose from the attempt to come up with a reasonable etymology for the word lyngourion, its real origin having been forgotten: so lyngourion would come from lynx-urine; “Though Theophrastus fails to state explicitly in this treatise what animal was supposed to produce lyngourion, the lynx is specifically named in all later accounts.” And that’s pretty much it re the origin of rocks, as far as Theophrastus is concerned: no mention of fossils, in the sense of the remains of what once were living creatures. 
And we have seen that Aristotle doesn’t mention those either, although Xenophanes’ and maybe Pythagoras’ idea of “petrified” marine organisms found in the middle of continents certainly was in agreement with his theory of changes in landscape: that what is now dry land used to be the bottom of the sea, etc., as we have seen above. Now, like I think I said in Chap. 1, there doesn’t seem to be much work contributing to what today we call science, between the Greeks and, basically, Kepler, and then the Enlightenment. But just like Copernicus and Galileo paved the way for the revolution that was to come a few generations later, also in the so called earth sciences there are some precursors. The idea that fossils are the remains of animals that were killed and buried during the 40- (Genesis 7:17) or 150-day (Genesis 7:24) long flood described in the Book of Genesis is found, e.g., in Tertullian89. Later on, people started to question the Flood theory based on “empirical” considerations: Leonardo da Vinci90 points out that “if the Deluge had carried the shells over distances of three and four hundred miles from the sea it would have carried them mixed with various other natural objects all heaped up together; but even at such distances from the sea we see the oysters all together and also the shellfish and the cuttlefish and all the other shells which congregate together, found all together dead; and the solitary shells are found apart from one another as we see them every day on the sea-shores.” In other words, to Leonardo it didn’t look like those creatures were violently killed in a big catastrophe, but, rather, it appeared that each of them died its own, more or less natural death, in its natural environment.
3.4 Steno

About a century after Leonardo, enter Nicolas Steno: “a Dane,” writes Lyell in the introductory sections of Principles, “once professor of anatomy at Padua, and who afterwards resided many years at the court of the Grand Duke of Tuscany. His treatise bears the quaint title of De Solido intra Solidum naturaliter contento (1669), by which the author intended to express, ‘On Gems, Crystals, and organic petrifactions inclosed within solid Rocks’ [...]. It was still a favourite dogma, that the fossil remains of shells and marine creatures were not of animal origin; an opinion adhered to by many from their extreme reluctance to believe, that the earth could have been inhabited by living beings before a great part of the existing mountains were formed. In reference to this controversy, Steno had dissected a shark recently taken from the Mediterranean, and had demonstrated that its teeth and bones were identical with many fossils found in Tuscany91. He had also compared the shells discovered in the Italian strata with living species, pointed out their resemblance, and traced the various gradations from shells merely calcined, or which had only lost their animal gluten, to those petrifactions in which there was a perfect substitution of stony matter. [...] He distinguished between marine formations and those of a fluviatile character, the last containing reeds, grasses, or the trunks and branches of trees. He argued in favour of the original horizontality of sedimentary deposits, attributing their present inclined and vertical position sometimes to the escape of subterranean vapours, heaving the crust of the earth from below upwards, and sometimes to the falling in of masses over-lying subterranean cavities. 
“He declared that he had obtained proof that Tuscany must successively have acquired six distinct configurations, having been twice covered by water, twice laid dry with a level, and twice with an irregular and uneven surface.” Today, Steno is credited with establishing the so called four principles of stratigraphy, i.e., in his own words92:

1. “At the time when a given stratum was being formed, there was beneath it another substance which prevented the further descent of the comminuted matter and so at the time when the lowest stratum was being formed either another solid substance was beneath it, or if some fluid existed there, then it was not only of a different character from the upper fluid, but also heavier than the solid sediment of the upper fluid.” (Today people call this “the law of superposition”.) 

2. “At the time when one of the upper strata was being formed, the lower stratum had already gained the consistency of a solid.” (“The principle of original horizontality”.) 

3. “At the time when any given stratum was being formed it was either encompassed on its sides by another solid substance, or it covered the entire spherical surface of the earth. Hence it follows that in whatever place the bared sides of the strata are seen, either a continuation of the same strata must be sought, or another solid substance must be found which kept the matter of the strata from dispersion.” (“The principle of lateral continuity”.)
4. “If a body or discontinuity cuts across a stratum, it must have formed after that stratum.” (“The principle of cross-cutting relationships”.)
3.5 Neptunism: De Maillet, Buffon, Werner

From this point on, and until, basically, Hutton, the consensus theory on the formation of rocks and the shape of landscape revolves around the contributions of a number of similarly-minded authors, spanning several generations—a “school of thought” that would later be labeled Neptunism, the point being that those guys were all convinced that water, the oceans, plays the most important part in explaining how the earth works. Eventually they would be proved wrong, but we’ll see about that later. The Neptunist narrative starts from the hypothesis that most rocks are formed by sedimentation: that’s the only way to explain Steno’s observation that they are piled up as parallel, and most often horizontal, strata. Sedimentation can be either “mechanical” (water and wind erode pre-existing rocks in some places and carry sediments somewhere else, where they pile up in layers, and are again compacted into hard rock, over very long times, by their own weight) or chemical (for some reason, some elements that were originally present in water would separate themselves from it, “precipitate” and, over time, form layers of rock). In both cases, you need to postulate some sort of “primitive” ocean, and consensus was that initially the ocean would cover the whole surface of the globe—otherwise you can’t account for sea shells found at very high altitude on mountains. Now, for this theory to function, pretty much the entire outer shell of our planet needs to be covered by sedimentary rocks. Today we have many reasons to believe that if you take a random sample (“random” meaning possibly taken from some non-negligible depth below the earth’s surface) of rock from the shallower part of the earth, chances are that it will not be a sedimentary rock. 
But the people we are talking about lived three hundred years ago, had no direct access to most of the globe, or even to independent observations made by colleagues outside of Europe—“science” in the European sense didn’t really exist at that point outside of Europe—and actually lived in places like Paris where everything is pretty much flat and sedimentary; mountain ranges had been only sporadically explored, etc. To get a feeling for how they saw things, we might as well look at some eighteenth-century texts; chronologically the first example of a Neptunist theory of earth history is probably de Maillet’s Telliamed93; but then again de Maillet never achieved much more than a cult following, while Buffon’s Natural History (see Chap. 1), published shortly after Telliamed, made a big splash, plus Buffon is, if I may say so, a great writer, so here is Neptunism in Buffon’s words, from his discourse on “history and theory of the earth” (in the first volume of Natural History): “we find that the upper stratum that surrounds the globe is universally the same. That this substance which serves for the growth and nourishment of animals and vegetables, is nothing but a composition of decayed animal and vegetable bodies reduced into such small particles, that their former organization is not distinguishable; on penetrating a little
further,” that is, by looking at “outcrops”—places where hard rock emerges out of the top “organic” layer (if, like Buffon, and myself at least for the time being, you live in Paris, you can go to the Buttes Chaumont park, which was once a quarry, to see a bunch of flat, parallel layers of rock—but be careful: when Buttes Chaumont was turned into a park, some “artificial” rock formations, i.e., concrete, were added for increased dramatic effect) “we find the real earth, beds of sand, limestone, argol, shells, marble, gravel, chalk, etc. These beds are always parallel to each other and of the same thickness throughout their whole extent. In neighbouring hills beds of the same materials are invariably found upon the same levels, though the hills are separated by deep and extensive intervals. All beds of earth, even the most solid strata, as rocks, quarries of marble, etc. are uniformly divided by perpendicular fissures; it is the same in the largest as well as smallest depths, and appears a rule which nature invariably pursues. “In the very bowels of the earth, on the tops of mountains, and even the most remote parts from the sea, shells, skeletons of fish, marine plants, etc. are frequently found, and these shells, fish, and plants, are exactly similar to those which exist in the Ocean. There are a prodigious quantity of petrified shells to be met with in an infinity of places: They are not only inclosed in rocks of marble, limestone, as well as in earth and clays, but are actually incorporated and filled with the very substance which surrounds them. 
In short, I find myself convinced, by repeated observations, that marbles, stones, chalks, marls, clay, sand, and almost all terrestrial substances, wherever they may be placed, are filled with shells and other substances, the productions of the sea.” To explain these observations, Buffon then proposes “that the earth which we now inhabit, and even the tops of the highest mountains, were formerly covered with the sea [...]; it appears also that the water remained a considerable time on the surface of the earth, since in many places there have been discovered such prodigious banks of shells, that it is impossible so great a multitude of animals could exist at the same time [...]. If we were to suppose that at the Deluge all the shell-fish were raised from the bottom of the sea, and transported over all the earth; besides the difficulty of establishing this supposition, it is evident, that as we find shells incorporated in marble and in the rocks of the highest mountains, we must likewise suppose that all these marbles and rocks were formed at the same time, and that too at the very instant of the Deluge; and besides, that previous to this great revolution there were neither mountains, marble, nor rocks, nor clays, nor matters of any kind similar to those we are at present acquainted with, as they almost all contain shells and other productions of the sea. 
Besides, at the time of the Deluge, the earth must have acquired a considerable degree of solidity, from the action of gravity for more than sixteen centuries, and consequently it does not appear possible that the waters, during the short time the Deluge lasted, should have overturned and dissolved its surface to the greatest depths we have since been enabled to penetrate.” So, not a short Deluge, but “the waters of the sea at some period covered and remained for ages upon that part of the globe which is now known to be dry land; and consequently the whole continents of Asia, Europe, Africa, and America, were then the bottom of an ocean abounding with similar productions to those which the sea at present contains: it is equally certain that the different strata which compose the earth are parallel and
horizontal, and it is evident their being in this situation is the operation of the waters which have collected and accumulated by degrees the different materials, and given them the same position that the water itself always assumes. We observe that the position of strata is almost universally horizontal: in plains it is exactly so, and it is only in the mountains that they are inclined to the horizon, from their having been originally formed by a sediment deposited upon an inclined base. Now I insist that these strata must have been formed by degrees, and not all at once, by any revolution whatever, because strata composed of heavy materials are very frequently found placed above light ones, which could not be, if, as some authors assert, the whole had been mixed with the waters at the time of the Deluge, and afterwards precipitated: in that case [...] the heaviest bodies would have descended first, and each particular stratum would have been arranged according to its weight and specific gravity, and we should not see solid rocks or metals placed above light sand any more than clay under coal. “We should also pay attention to another circumstance; it confirms what we have said on the formation of the strata; no other cause than the motions and sediments of water could possibly produce so regular a position of it, for the highest mountains are composed of parallel strata as well as the lowest plains, and therefore we cannot attribute the origin and formation of mountains to the shocks of earthquakes, or eruptions of volcanos. The small eminences which are sometimes raised by volcanos, or convulsive motions of the earth, are not by any means composed of parallel strata, they are a mere disordered heap of matters thrown confusedly together; but the horizontal and parallel position of the strata must necessarily proceed from the operations of a constant cause and motion, always regulated and directed in the same uniform manner.” OK. 
But so then, anyway, what made the mountains? According to Buffon, from the fact “that the dry part of the globe [...] has remained for a long time under the waters of the sea,” it follows that this formerly subaqueous terrain must have for all that time “underwent the same fluctuations and changes which the bottom of the ocean is at present actually undergoing. To discover therefore what formerly passed on the earth, let us examine what now passes at the bottom of the sea.” Which makes sense. A discussion of “the ebbing and flowing of the tides, and the motion of the earth” follows. In summary, “we cannot possibly have the least doubt that the tides, the winds, and every other cause which agitates the sea, must produce eminences and inequalities at the bottom, and those heights must ever be composed of horizontal or equally inclined strata. These eminences will gradually increase until they become hills, which will rise in situations similar to the waves that produce them; and if there is a long extent of soil, they will continue to augment by degrees; so that in course of time they will form a vast chain of mountains”, and so on and so forth. What Buffon is saying, here, is that mountains are somehow put together by marine currents; but this is just after demonstrating that most of the rock strata found in nature are flat because that is the way they were formed by marine currents. Maybe I am oversimplifying, but it is a fact that Buffon’s theory on the origin of mountains didn’t enjoy a long life. It was probably OK for a while, and Buffon does sound very confident that it’s a good theory: because it explains, in a simple but convincing way,
most of his data, i.e., (i) in most places (that he had access to), strata are flat; (ii) there’s plenty of fossil sea shells in most strata; faced with some difficulties (how about mountains? strata are parallel there, but why not flat?), the easy way out is to think of those as exceptions to the rule, and add some complications to the theory, without really rethinking it from scratch. This is probably something that happens all the time to us, as well, and we don’t even realize it because no one has yet shown up with new, simpler, and more convincing explanations for the phenomena we’re looking at. (It will turn out, instead, that the forces that are responsible for “the shocks of earthquakes, or eruptions of volcanos” do play a major part—but more on that later). There’s another complication that Buffon tries to deal with, and it’s worth reading what he has to say about that, because it is going to become a central issue in the debates that follow. If rock strata are formed by sedimentation, deposited on top of one another by the slow action of waters, what, then, are the thin vertical sheets of rock (see the drawing in Fig. 3.1) that sometimes cut vertically through some or all visible horizontal or almost horizontal layers in an outcrop? “We have already explained how the horizontal strata of the earth were formed, but the perpendicular divisions that are commonly found in rocks, clays, and all matters of which the globe is composed, still remain to be considered. These perpendicular strata are, in fact, placed much farther from each other than the horizontal, and the softer the matter the greater the distance; in marble and hard earths they are frequently found only a few feet; but if the mass of rock be very extensive, then these fissures are at some fathoms distant; sometimes they descend from the top of the rock to the bottom, and sometimes terminate at an horizontal fissure. 
They are always perpendicular in the strata of calcinable matters, as chalk, marle, marble, etc. but are more oblique and irregularly placed in vitrifiable substances, brown freestone, and rocks of flint, where they are frequently adorned with chrystals, and other minerals. In quarries of marble
Fig. 3.1 Buffon’s “perpendicular strata” (later to be called dykes, or dikes). Buffon explained them as sedimentary deposits, too: but today that’s not what we think they are. The drawing is from Lyell’s Principles: it shows some “dikes at the base of the Serre del Solfizio, Etna”
or calcinable stone, the divisions are filled with spar, gypsum, gravel, and an earthy sand, which contains a great quantity of chalk. In clay, marls, and every other kind of earth, excepting turf, these perpendicular divisions are either empty or filled with such matters as the water has transported thither.” Buffon doesn’t see this as much of a problem: “We need seek very little farther for the cause and origin of those perpendicular cracks. The materials by which the different strata are composed being carried by the water, and deposited as a kind of sediment, must necessarily, at first, contain a considerable share of water, the which, as they began to harden, they would part with by degrees, and, as they must necessarily lessen in the course of drying, that decrease would occasion them to split at irregular distances. They naturally split in a perpendicular direction, because in that direction the action of gravity of one particle upon another has no actual effect, while, on the contrary, it is directly opposite in a horizontal situation; the diminution of bulk therefore could have no sensible effect but in a vertical line.” Those who already know their geology might frown at how utterly wrong some of Buffon’s ideas are by today’s standards (those who don’t, no worries, because, if you are patient enough, assuming I am doing a decent job, which hopefully I am, you are going to learn all the new ideas over the course of this book); but if one puts his work, and/or that of de Maillet, and other thinkers of the time, in historical context, it is easier to see how those guys did contribute to shape the way we look at geological things now; because they introduced into geological thought, let’s say, a new angle to look at things, a new approach, without which our idea of the earth would not be what it is today. 
Buffon and his peers are the first to look at a rock not as something that always exists in some sort of eternal present (recall Aristotle), but rather as something that has been formed through some more or less mysterious processes that took place at some point in time, i.e., something that has a history; understanding its origin becomes a central issue. I guess it is okay to forget most of Buffon’s excerpt above, if you remember this point. The same idea is more apparent in Abraham Gottlob Werner, a famous professor at the School of Mines in Freiberg, who proposed a relatively detailed history of how, within the Neptunist paradigm, different layers were formed at different epochs in Earth’s history and acquired their specific nature. According to Anthony Hallam94, Werner’s “originality lay principally in his making the time of formation of the rocks, rather than their mineralogy, their most significant character. Although his use of rock formations as unique historical entities, rather than natural kinds, was anticipated by earlier researchers such as [Johann Gottlob] Lehmann and [Georg Christian] Fuechsel, Werner was the first to propose a system according them a universal significance.” Another way of putting it is that Werner “went far beyond mere classification of minerals and rocks”. Classification was the dominant seventeenth- and eighteenth-century fashion (think Linnaeus), and Werner (1749–1817) starts to leave it behind by worrying about not just how things are, but also what phenomena made them the way they are. Hallam (italics mine): “the more general, synthetic, part of his teaching concerned what he called geognosy, [...] ‘the science which treats of the solid body of the earth as a whole and of the different occurrences of minerals and rocks of which it is composed and of the origins of these and their relations to one another.’ In his book Kurze Klassifikation95..., Werner outlined what was in effect a
Fig. 3.2 This is, with some simplification, how Werner and co. saw a mountain range. The point is that the central axis is where you find rocks that show no layering and carry no fossils. They cannot be sedimentary, so they must be “primitive”: they were there already before erosion and sedimentation started. Everything around them is sedimentary. Depending on where you are with respect to the axis, you will find different layers formed at different times
stratigraphic scheme that was claimed to be applicable to the whole earth.” The idea was, roughly, that it is along the axis of a mountain range that the deepest (oldest) rocks become visible through denudation by erosion (see the drawing in Fig. 3.2). And in fact these “primitive” rocks are not organized in layers and contain no fossils: so Werner figured they had not been created by sedimentation: they were there since before the whole erosion/sedimentation process even started. As you travel away from the axis you are bound to find more recent, sedimentary layers, deposited on top of the oldest rocks before erosion cut through them all. Which, by the late eighteenth century, had been or were being observed in various mountain ranges in Germany and elsewhere in Europe96. Werner generalized all those observations, coming up with a global ‘stratigraphic’ scheme. According to Hallam’s summary, “in order of decreasing age the units are as follows: 

“1. Urgebirge (‘primitive strata’). Granite, gneiss, schist, serpentine, quartz, porphyry, etc.; 

“2. Uebergangsgebirge (‘transitional strata’) [also called ‘secondary’ rocks]. A succession [...] of limestone, diabase, and greywackes; 

“3. Floetzgebirge (Floetz strata). Twelve subdivisions ranging in succession from what is now known as [string of names of geological periods, that you don’t need to learn, right now.] 

“4. Aufgeschwemmte Gebirge97 (‘swept together strata’). Relatively unconsolidated deposits; 

“5. Vulkanische Gesteine. Both true volcanics (lavas, tuff) and ‘pseudovolcanics’ (hornstone, porcelain jasper).” 

Werner came up with a theory to explain this structure; look at the drawing in Fig. 3.3: he attributed a sedimentary origin to most of the rocks he could observe, extrapolating to the entire world the idea that geology is controlled by sedimentation. Sedimentation in turn was well known to be controlled by water—it can be observed to take place along rivers, around lakes, and of course on the seaside. 
So, all must start from water. Initially, says Werner, the earth is entirely covered with water: turbid
Fig. 3.3 Werner’s stratigraphy, or: how we got from the “primitive” shape of the earth’s surface (or ocean floor), shown top left, to the current shape, shown at bottom right (and in Fig. 3.2). Based on a similar sketch by Albert V. Carozzi in his edition of Telliamed, University of Illinois Press, 1968
water, holding in suspension or solution a whole lot of solid materials that, today, form the shallowest layer(s) of the solid earth. Those materials separated from the water over a long time, starting with the stuff that Werner called “primitive”, including, e.g., granite, which he thought was the first material that separated out of the water. At this point there’s no life, which explains why there’s no fossils in granite or in other so-called primitive rocks. According to Werner, primitive rocks are made from only chemical precipitates98. But after the first stage, you start to get mechanically deposited sediments99; and over time, you get more and more of those. Then life emerges, and you get fossils as well. The relative amount of chemical precipitates (which eventually disappeared) vs. mechanical deposits, and the quantity and nature of fossils found in the rocks, are what distinguishes the multiple “stages” of Fig. 3.3 from one another. Like in Buffon, marine currents move materials around, piling them up to form mountains. Werner thought, says Hallam, that “the primitive ocean [...] did not subside slowly and quietly but was very turbulent, with powerful currents cutting deep channels to produce valleys and mountains. As the ocean water became progressively quieter through time so the strata tended towards horizontality and had a progressively more restricted distribution as water level dropped.” But there were problems with this model. First of all, where had all the water gone? Scipione Breislak argued in his well known (at the time) textbook Introduzione alla Geologia (1811, and then translated into French in 1812) “that the volume of water now present on the globe was utterly insufficient to contain in solution or suspension all the solid material of the crust. The Wernerians were never able to give a satisfactory answer” (Hallam). Italian geologists, like Breislak, were used to seeing, in their
fieldwork, lots of rocks whose origin was clearly volcanic—unlike Germany, France or Britain, Italy has a number of active volcanoes—so besides the water issue, it is not surprising that they were unimpressed by Werner et al.’s idea that everything was sedimentary. Consider that in 1538 a whole hill (called “Monte Nuovo”, i.e., “New Mountain”), more than 100 m high, was formed by a single, one-week-long eruption just outside Naples. So, these guys proposed that mountains could be mostly the result of that kind of event. Werner didn’t buy this, generally thought that volcanic rocks were nothing more than an exception to the rule, and even claimed that basalt was actually not a volcanic rock at all: and this at a time when growing evidence was being found, and advertised among geologists around Europe, for the presence of a group of volcanoes in the Massif Central, i.e., in the middle of France, with lots of basalt all over the place. The volcanoes had been asleep since before history—there was no written document of any activity on their part, ever—but for the first geologists who went to do their fieldwork in the area there was little doubt: “in 1751, while he was going through the town of Moulins, to the north of the Central Massif, [Jean-Etienne Guettard’s] attention was attracted by some curious black stones. Although he had never seen a volcano, he recognized that the stones were lava. When he asked from which quarry it had come, he was told that it was from Volvic. According to Condorcet, Guettard then exclaimed ‘Volvic, Vulcani vicus [the village of Vulcan]!’ Whether the tale is apocryphal or not, the fact remains that Guettard quickly resumed his route heading toward Auvergne. ‘I recognize a volcano, this is what Vesuvius looks like!
or Etna, or the peak of Tenerife—I’ve seen engravings of them!’ he exclaimed at the heights of the peak dominating Volvic; and ‘the craters, the lavas, the inclined, parallel layers—they must have been formed by molten material!’ ” (Pascal Richet, A Natural History of Time, University of Chicago Press, 2009).
3.6 Basalt

Now, at the time of Werner basalt just meant a dark, hard rock, sometimes found in the form of large “columnar” structures: think the Giant’s Causeway in Ireland, but also the Palisades across the river from New York City, etc. (The word basalt had been used by Pliny the Elder and later—in the mid fifteen hundreds—by Agricola, in his De Re Metallica.) It was not automatically associated with volcanoes as it is today—today, if you ask someone who’s ever taken a geology course what basalt is, they’ll start by saying it’s a volcanic rock—but of course it did resemble the rocks that one could see being formed at some volcanic eruptions—and it did not show the layered structure typical of sedimentation. Geologists doing fieldwork in Auvergne (including many students of Werner’s) all came to the conclusion that at least that basalt had frozen, long ago, after flowing out of those volcanoes. It just looked too much like solidified lava! To this, Werner would reply that, okay, that particular basalt had indeed been turned into lava, at some point, but it was already basalt before being lava: formed, like all basalt, by chemical precipitation, then melted by underground fires at coal deposits, erupted, solidified again. Except that, in this case,
that explanation didn’t work, either: because granite was supposed to be the primitive rock: the base on top of which everything else rests. Eruptions were supposed to take place where rocks were melted by the combustion of underlying layers of coal. Now, in Auvergne, basalt was found directly above granite. There are no layers of other rocks between the granite and the basalt. Which, if one accepted that basalt had been erupted, meant that whatever seam of coal had burned to melt it and trigger the eruption had to be underneath the granite. Which obviously was a paradox. “In 1789 Guy de Dolomieu (1750–1801), professor at the Ecole des Mines of Paris, [put] forward the revolutionary proposal that granite was not primordial but was underlain by rock of very different composition, which had penetrated the granite [from below] to give rise to basaltic lava. The volcanic hearth could not therefore be located in sedimentary strata containing combustible materials, and the heat source must lie at some considerable depth below the consolidated crust” (Hallam). The bottom line, if you will, of all this, is that volcanic rocks were not just the effect of some occasional phenomena of little relevance, accounting for a small portion of the materials forming our planet: on the contrary, much of the planet was made up of volcanic material, and nobody knew how that material was created—nobody had understood, yet, what forces from deep within the earth gave rise to volcanoes. The generation of geologists that looked at rocks between the French Revolution and Waterloo ended up pretty much agreeing on this. Coming up with a good story for how lava is formed and pushed out of the earth would take more than another century; but a first relevant contribution in this direction came as early as the eighteenth century, through the work of James Hutton.
3.7 Plutonism: James Hutton

To understand how important Hutton is to geology, consider this: when I went through my phase of wanting to learn some geology—because what kind of geophysicist was I going to be if I didn’t have some rough knowledge of at least the basics of geology?—while I was going through this phase I would often have lunch with a colleague who, despite being, like me, in a geophysics institute, had a geology background. Which for him was a problem, because researchers at our university—“institute of technology”, actually—were supposed to be highly specialized (which, today, you are supposed to be anyway, if you want to be able to claim that the research you are doing is “cutting-edge”), so that being in the geophysics institute meant you were really surrounded by physicists doing a lot of abstract stuff—in any case, abstract with respect to how geologists think: fieldwork, and all that. So, anyway, maybe because we were both Italians living abroad, we would often eat our lunch together, in the restaurant across the street, and I started asking him geology questions, and I always had a very hard time understanding the answers, so we soon both realized that I had to start learning my geology from scratch (“what?! you have never heard of that?!”) and I asked him what’s a good starting point, and he said, well, maybe a good starting point is that there are only three types of rock: igneous, sedimentary
and metamorphic. Whatever stone you pick up in the field, it’s got to be one of those three. And indeed, now that some ten years have passed, I do think that that was a very good starting point—because, in principle, one way to classify might be as good as another, but the thing is that this particular classification points directly at how the earth works. Accepting this classification, you begin to understand how the earth works. As I am going to try and demonstrate to you in a minute. Now, to appreciate how important Hutton is, consider that okay, at his time everybody had already figured out that lots of rocks were formed by sedimentation, that they formed parallel layers, etc. But Werner would still claim that all rocks were sedimentary, and rocks erupted by volcanoes were really just molten, reprocessed sedimentary rocks; it was Hutton who first emphasized the importance of igneous rocks, i.e., rocks that were once molten, whether volcanic—erupted from a volcano—or plutonic (we’ll see later exactly what that means); he emphasized igneous rocks as something radically different from sedimentary. And as for metamorphic rock, the concept didn’t even exist: that comes entirely from Hutton and his followers. (And, by the way, if you don’t know what a metamorphic rock is, don’t worry. Because, for one, we’ll get to it soon, and two, I, for instance, received a Ph.D. in geophysics from one of those famous universities without being able to explain what a metamorphic rock is. Which, certainly, is not a good thing, but that’s how science works today: I was just telling you about the need to be “highly specialized”, wasn’t I? So, it’s OK to not understand the sense of what you are doing provided that you are proficient in the techniques that you specialize in. It’s not great, sure, but it’s definitely OK.
I was supposed to “map the interior of the earth from seismic observations,” which meant one part physics, two parts maths, and three parts informatics; admittedly, the people in my thesis committee did complain about my lack of “vision”; but that wasn’t enough to fail me.) Like I was saying at the beginning of this chapter, Hutton’s biggest discovery might be the angular unconformity. According to McPhee’s Basin and Range, “it was an angular unconformity in Scotland—exposed in a riverbank at Jedburgh [...]— that helped to bring the history of the earth, as people had understood it, out of theological metaphor and into the perspectives of actual time. This happened toward the end of the eighteenth century, signalling a revolution that would be quieter, slower, and of another order than the ones that were contemporary in America and France.” And it was Hutton who first saw it—the unconformity—for what it actually was. McPhee is in the field, in Nevada, with famous Princeton geology professor Kenneth Deffeyes, and they are looking at an eroded hillside, which “consisted of two distinct rock formations, awry to each other, awry to the gyroscope of the earth— just stuck together there like two artistic impulses in a pointedly haphazard collage. Both formations were of stratified rock, sedimentary rock, put down originally in and beside the sea, where they had lain, initially, flat. But now the strata of the upper part of the hill were dipping more than sixty degrees, and the strata of the lower part of the hill were standing almost straight up on end. [...] In order to account for that hillside, Deffeyes was saying, you had to build a mountain range, destroy it, and then build a second set of mountains in the same place, and then for the most part destroy them. You would first have had the rock of the lower strata lying flat [...].
Then the forces that had compressed the region and produced mountains would have tilted [it], not to the vertical, where it stood now, but to something like forty-five degrees. That mountain range wore away—from peaks to hills to nubbins and on down to nothing much but a horizontal line, the bevelled surface of slanting strata, eventually covered by a sea. In the water, the new sediment of the upper formation would have accumulated gradually upon that surface, and, later, the forces building a fresh mountain range would have shoved, lifted, and rotated the whole package”, etc.; you see what all this means: history just repeats itself—the same process over and over again, forever: with no vestige of a beginning, no prospect of an end. McPhee, who’s a great writer, figured that “angular unconformity” is a powerful concept and used it as a starting point and central point of his long text on Hutton and the early history of geology. Of course there’s more than that to Hutton. Perhaps equally important to his observing angular unconformities and interpreting them in his original (and, from today’s angle, correct) way, is the fact that he placed much emphasis on erosion, denudation of relief by erosion, transport of eroded material from the mountains down to the sea. Maybe he was aware of Jean-Etienne Guettard’s essay Sur la Dégradation des Montagnes Fait de Nos Jours par les Fortes Pluies ou Averses d’Eau, par les Fleuves, les Rivières et la Mer, or, On the Degradation of Mountains Effected in Our Time by Heavy Rains, Rivers and the Sea.100 (Already some paragraphs ago I told you about Guettard and his discovery of volcanism in central France; his work on erosion was even more significant, and Guettard’s contributions don’t even end there. The guy was a giant.) Essentially, Guettard looked in more detail than anybody else at how rains, rivers and the sea erode the earth, and the consequences of that.
He figured that water breaks rocks, triggers landslides, exposes deep layers, and eventually carries all the debris, via rivers, to the sea. He figured that sediments from all over the place would eventually pile up along the shore of the sea. There, they would compact under their own weight and eventually turn into new layers of sedimentary rock. Guettard realized that the sea bottom as it is today (or as it was in his time) had a lot in common with the rock strata, carrying fossils of marine life forms, that he would find while doing fieldwork on the continent: the process by which new sedimentary rocks are being formed is the same now, along today’s shores, as it was some enormous amount of time ago in places that were once sea. Finally, Guettard figured that sediments only pile up relatively close to land: deposits do not extend far out to sea. “Consequently,” he wrote, “the erection of new mountains by the sea by the deposition of sediment is a process very difficult to conceive”, i.e., Buffon is probably wrong about the origin of mountains, and “we are still very little advanced towards the theory of the earth as it now exists.” So, I don’t know whether Hutton read or was aware of Guettard’s contributions, but there’s one fundamental aspect of them that certainly comes back in Hutton’s work: the earth, the landscape as we the human species get to see it, is essentially a single frame in a very slow and cyclic process of change. Mountains are washed away by erosion, sediments are carried to the sea by rivers, they pile up, are turned into stone, and then something somehow pushes them up to form new mountains. And then the new mountains are eroded, too, and so on. Hutton tried to come up with some ideas of how “loose or incoherent” sediments would be turned into actual rock, and then what
force would be capable of lifting them up to form mountains. In the synopsis of his magnum opus Theory of the Earth, or an Investigation of the Laws Observable in the Composition, Dissolution, and Restoration of Land Upon the Globe, read to the Royal Society in 1785, he writes: “We find reason to conclude, 1st, That the land on which we rest is not simple and original, but that it is a composition, and had been formed by the operation of second causes. 2dly, That before the present land was made there had subsisted a world composed of sea and land, in which were tides and currents, with such operations at the bottom of the sea as now take place. And, Lastly, That while the present land was forming at the bottom of the ocean, the former land maintained plants and animals [...] in a similar manner as it is at present. Hence we are led to conclude that the greater part of our land, if not the whole, had been produced by operations natural to this globe; but that in order to make this land a permanent body resisting the operations of the waters two things had been required; 1st, The consolidation of masses formed by collections of loose or incoherent materials; 2dly, The elevation of those consolidated masses from the bottom of the sea, the place where they were collected, to the stations in which they now remain above the level of the ocean [...]. “Having found strata consolidated with every species of substance, it is concluded that strata in general have not been consolidated by means of aqueous solution. “With regard to the other probable means, heat and fusion, these are found to be perfectly competent for producing the end in view [...]” [or, as Hutton says a few paragraphs earlier, “the consolidation of strata [...] 
may be conceived to have been performed [through] the fusion of bodies by means of heat, and the subsequent congelation of those consolidating substances.” “It is supposed that the same power of extreme heat by which every different mineral substance had been brought into a melted state might be capable of producing an expansive force sufficient for elevating the land from the bottom of the ocean to the place it now occupies above the surface of the sea”. This last idea, that the “power of extreme heat”, emanating for some unknown reason from the interior of the earth, “might be capable of producing an expansive force sufficient for elevating the land” is one of the things that are usually remembered about Hutton, because it is in some sense correct in today’s terms. It can be said that it anticipated a lot of stuff that proved very useful in forming our current idea of the earth. That, and the importance he attributed to igneous rocks: it has been proved, much more recently, that they account for most of the earth’s outer shell. While sedimentary rocks, though very conspicuous at the surface of continents— i.e., where we live—are only a small chunk. The position of Hutton was even more radical, actually, because he basically claimed that, in a sense, all rocks were igneous, just as Werner had claimed that all rocks that were not “primitive”, including basalt, were sedimentary. More precisely, what Hutton supposed was101 : “I. That heat has acted, at some remote period, on all rocks. “II. That during the action of heat, all these rocks (even such as now appear at the surface) lay covered by a superincumbent mass, of great weight and strength.
“III. That in consequence of the combined action of heat and pressure, effects were produced different from those of heat on common occasions; in particular, that the carbonate of lime was reduced to a state of fusion, more or less complete, without any calcination.” The latter result was due to James Hall, who was a chemist and a friend of Hutton’s, and regularly discussed geology with him. He reports on it in his paper “Account of a Series of Experiments, Shewing the Effects of Compression in Modifying the Action of Heat”, Transactions of the Royal Society of Edinburgh, vol. 6, 1812. In the same paper, Hall tells of the big geological fight that followed the publication of Hutton’s studies. “Fire and water,” he wrote, “the only agents in nature by which stony substances are produced, under our observation, were employed by contending sects of geologists, to explain all the phenomena of the mineral kingdom. But the known properties of water, are quite repugnant to the belief of its universal influence, since a very great proportion of the substances under consideration are insoluble, or nearly so, in that fluid; and since, if they were all extremely soluble, the quantity of water which is known to exist, or that could possibly exist in our planet, would be far too small to accomplish the office assigned to it in the Neptunian theory [...]. On the other hand, the known properties of fire are no less inadequate to the purpose; for, various substances which frequently occur in the mineral kingdom, seem, by their presence, to preclude its supposed agency; since experiment shews, that, in our fires, they are totally changed or destroyed.
“Under such circumstances, the advocates of either element were enabled, very successfully, to refute the opinions of their adversaries, though they could but feebly defend their own: and, owing perhaps to this mutual power of attack, and for want of any alternative to which the opinions of men could lean, both systems maintained a certain degree of credit; and writers on geology indulged themselves, with a sort of impunity, in a style of unphilosophical reasoning, which would not have been tolerated in other sciences.” So, anyway: according to Hutton heat is needed, together with the pressure of the overlying sediments, that always continue to pile up, to turn sand, mud, or lime into sedimentary rock. This is actually wrong by today’s standards (people have come to the conclusion that pressure does it, with no need for high temperatures) so it doesn’t get remembered. But it’s interesting, because it motivated Hall to concoct some experiments that, again, are like the blueprint of a lot of science to come. Hutton didn’t believe in laboratory experiments, “on account”, writes Hall, “of the immensity of the natural agents, whose operations he supposed to lie far beyond the reach of our imitation; and he seemed to imagine, that any such attempt must undoubtedly fail, and thus throw discredit on opinions already sufficiently established, as he conceived, on other principles.” Hall, of course, didn’t see things this way, but he put his experiments on hold until after his friend’s death in 1797. “I considered myself as bound, in practice, to pay deference to his opinion, in a field which he had already so nobly occupied,” he says, “and abstained, during the remainder of his life, from the prosecution of some experiments with compression, which I had begun in 1790.”
3.8 James Hall (Scottish Physicist): Experimental Mineral Physics

Hall experimented most of all with calcium carbonate: no need, he figured, to try with a lot of different rocks. “Of all mineral substances, the Carbonate of Lime is unquestionably the most important in a general view. [Lime is not an element, but calcium oxide; in modern chemical terms, what Hall is talking about here is calcium carbonate.] As limestone or marble, it constitutes a very considerable part of the solid mass of many countries; and, in the form of veins and nodules of spar, pervades every species of stone. Its history is thus interwoven in such a manner with that of the mineral kingdom at large, that the fate of any geological theory must very much depend upon its successful application to the various conditions of this substance.” The thing about calcium carbonate, and Hall and Hutton knew about it, is that it “contains” carbonic acid, which is a gas at the “normal” conditions (pressure, temperature) that are met at the surface of the earth. When combined with lime, i.e., calcium oxide, it becomes part of the rock—in Hall’s words, “its volatility is repressed [...] by the chemical force of the earthy substance, which retains it in a solid form”. This is true under “normal” conditions. But then if you heat up the carbonate, “when the temperature is raised to a full red-heat, the acid acquires a volatility by which that force is overcome, it escapes from the lime, and assumes its gaseous form.” In other words: at atmospheric pressure, heat destroys the limestone. This is “hostile”, says Hall, “to the supposed [by Hutton] action of fire”, i.e., of heat: because if experiment shows that heat, like I said, destroys the limestone, turning it into ashes plus some gas, “it seemed absurd to ascribe to that same agent [heat] the formation of limestone”.
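In modern notation (which, of course, Hall did not have), the “escape” of the carbonic acid described here is the thermal decomposition of calcium carbonate into quicklime (calcium oxide) and carbon dioxide, which at atmospheric pressure sets in around a “full red-heat”, i.e., very roughly 900 °C:

```latex
\[
\mathrm{CaCO_3}\,(\mathrm{s})
\;\xrightarrow{\ \text{heat}\ }\;
\mathrm{CaO}\,(\mathrm{s}) + \mathrm{CO_2}\,(\mathrm{g})
\]
```

The “ashes” left behind by the calcination are the quicklime; the gas is carbon dioxide, which is what the old name “carbonic acid” refers to.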
Hutton speculated that if you bring a rock, like carbonate, to very high temperature and put it under a lot of pressure, chances are that the rock will behave differently than at atmospheric (earth’s surface) temperature and pressure. The idea, says Hall, was that “when a mechanical force opposes the expansion of the acid, its volatility must, to a certain degree, be diminished. Under pressure, then, the carbonate may be expected to remain unchanged in a heat, by which, in the open air, it would have been calcined.” This needed to be tested in the lab, because “experiment alone can teach us, what compressing force is requisite to enable [limestone] to resist any given elevation of temperature; and what is to be the result of such an operation. Some of the compounds of lime with acids are fusible, others refractory...” etc. Hall mentions a field observation, originally made by Hutton, which seemed to imply that calcium carbonate might be fusible—i.e., that heat might turn it, through fusion, into something similar, rather than burn it and totally obliterate its nature. “Nothing is more common”, says Hall, “than to meet with nodules of calcareous spar inclosed in whinstone [whinstone is essentially Hutton’s name for basalt; calcareous spar is, again, calcium carbonate, in crystalline form102 ], and we suppose, according to the Huttonian Theory, that the whin and the spar had been liquid together; the two fluids keeping separate, like oil and water. It is natural, at the junction of these
two, to look for indications of their relative fusibilities”: the way this is done is, you look at the “termination” of the calcareous spar: which, in this case, is “globular and smooth”. But the thing is, when calcareous spar freezes, it tends to “shoot out into prominent crystals”: it follows that, had it solidified while immersed in liquid whin, it would have “darted” into it—we’d be seeing sharp crystals rather than a smooth termination. But it wasn’t so, and “from this I concluded,” says Hall, “that when the whin congealed, which must have happened about 28° or 30° of Wedgwood,103,104 the spar was still liquid. I therefore expected, if I could compel the carbonate to bear a heat of 28° [also Wedgwood, of course] without decomposition, that it would enter into fusion. The sequel will shew, that this conjecture was not without foundation.” So, Hall packed the carbonate of lime into very rigid containers (like, tubes bored through iron) that he then sealed hermetically; he brought them to the highest temperature he could; and got something that looked very much like marble. Bottom line: just like Hutton had guessed, heat doesn’t necessarily “destroy” the carbonate, but, if under sufficient pressure, rather transforms it from one form into another (e.g., limestone into marble). As it turned out, like I was saying a couple pages ago, Hutton was wrong re limestone: in the sense that to make limestone from sediments you actually don’t need to heat the sediments up: as we know today from experiment, etc., the effect of increased pressure (from their own accumulated weight as they pile up) is enough. Still, Hutton was right, overall, in realizing that it was more useful to think about the role of heat, rather than that of water, in order to understand how the earth functions and/or has taken its present form. Of course, just as it wasn’t clear where all the water required by the Wernerian system had gone, the origin of the heat was not known.
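Hall’s sealed-tube result has a simple modern reading (again, in terms that neither Hall nor Hutton had): the calcination is a reversible equilibrium, and the pressure of the trapped gas pushes it back toward the carbonate, so that the carbonate survives heating well past red-heat and can fuse or recrystallize instead of decomposing:

```latex
\[
\mathrm{CaCO_3}\,(\mathrm{s})
\;\rightleftharpoons\;
\mathrm{CaO}\,(\mathrm{s}) + \mathrm{CO_2}\,(\mathrm{g}),
\qquad K \approx p_{\mathrm{CO_2}}.
\]
```

Raising \(p_{\mathrm{CO_2}}\) inside the sealed iron tube shifts the equilibrium to the left (what would much later be codified as Le Chatelier’s principle): this is precisely the “mechanical force oppos[ing] the expansion of the acid” that Hutton had postulated.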
But it was known, ever since people had started to dig mines, that the earth’s temperature grows with depth (you might remember this from the previous chapter: Cordier’s paper about this, etc.). So the idea of an “internal fire” was not crazy. And then there were the observations that Hutton made in the field. If you are in the center of Edinburgh and look up, you are likely to see Salisbury Crags, an impressive hill which to the geologist looks like a sandwich, with a layer (or vein) of volcanic-looking rock (dark, uniform, no layers: something that must have been formed by freezing of once molten material) between layers of sandstone—which on the contrary show a clear stratigraphy and must be sedimentary. Your typical neptunist would have perhaps tried to claim that, everything being of aqueous origin, the presumed lava had instead been formed by chemical precipitation from the primitive ocean. But Salisbury Crags is, like, a ninety-foot-high block of basalt, which should have been deposited at its particular location while at the same time other kinds of materials were forming other kinds of stones all around it... a difficult claim to make.
3.9 Granite

You might know what granite looks like (see Fig. 3.4, in case you don’t); its very name refers to its appearance: “granite (n.): 1640s, from French granit(e) (17c.) or directly from Italian granito ‘granite,’ originally ‘grained,’ past-participle adjective from granire ‘granulate, make grainy,’ from grano ‘grain,’ from Latin granum ‘grain’.”105 For Wernerians that was the most primitive of primitive rocks, because, in the Alps or elsewhere, it was always found at the central, most eroded axis of the range, and you’d never find layers of any other materials underlying the granite. But then Dolomieu had already cast doubt: if that’s primitive, how come in Auvergne we have basalt—which, as even Werner admitted, has been at some point molten into lava—directly on top of the granite? If basalt was erupted through the granite, it must have been below the granite before being erupted: so, Werner’s model didn’t seem to work, in this case. Hutton saw things differently. He thought, because of the way it often looks when you see it in the field, that granite itself had once been molten. In fact, he had found some samples of granite where all of those little flecks, those crystals, see Fig. 3.4, were strangely aligned: and to him that meant that the rock must once have been melted, and must have solidified while under some kind of elongating strain. He also figured that if granite was once molten, and if it was “supposed that the same power of extreme heat by which every different mineral substance had been brought into a melted state might be capable of producing an expansive force sufficient for elevating the land” (Hutton, see above), then “surely it would have forced its way into cracks and fissures in any rocks with which it came in contact.” Initially, that was just a theory—but then Hutton and some of his friends106
Fig. 3.4 A granite sample. The size of one grain is typically about a millimeter or so
Fig. 3.5 Veins of granite traversing the gneiss of Cape Wrath in Scotland, after Louis Figuier’s The World Before the Deluge, 1867
went to the field to see if they could actually find any such “veins”... and they did. They didn’t even have to go very far from Edinburgh: in the Grampians, the granite chain west of Aberdeen, they spotted a bunch of large veins of red granite cutting through the black schists107 that make up those mountains’ lower ridges (Fig. 3.5). This meant that granite was, at some point, molten, and had solidified after “intruding” (as geologists say) pre-existing rock from below, i.e., granite is “younger” than the sedimentary rock found around it. So forget granite the “primitive” rock, and Werner’s model. Another important point: the granite Hutton and company were looking at had been trying to find its way through the surrounding rock, somehow pushed by heat (which “might be capable of producing an expansive force sufficient for elevating the land from the bottom of the ocean to the place it now occupies”—see above). But that granite did not look like it had been spit out into our world by a volcano, to then solidify quickly, forming a layer of “proper lava” (Hutton’s expression) as in the Auvergne; instead, it became solid right there in those veins and fractures before it could ever make it to what at the time was the outer surface of the earth (because don’t forget that erosion has laid bare, over millions of years, outcrops such as the one Hutton was looking at, which were once buried deep within more rock). Hutton called this a “subterraneous lava”: which had taken a relatively long time to become solid—never having experienced such a rapid, drastic change in temperature as lava that flows out of a volcano. And you can tell (Hutton could tell) one of those two kinds of lava from the other, sort of in the same way as, e.g., Hall would look for “indications of their relative fusibilities” at the “junction” of basalt and spar.
Because in a lava that solidifies slowly, crystals from different minerals are formed at different times—first you reach the temperature at which one of them crystallizes; time passes; temperature goes further down and another mineral starts to crystallize; more time passes, and so on—and so you are likely to see some crystals of a less fusible mineral protruding into material that has (partly) crystallized later, etc., or in Hall’s words, again, we would see their “tendency [...], on all occasions of freedom, to shoot out into prominent crystals, [which] would have made it dart into
the liquid” rock around it. You might be interested in knowing that the terms geologists use today for Hutton’s proper and subterraneous lava are extrusive and intrusive rock, respectively. Or, which I think is the same, volcanic and plutonic: hence the label plutonism, attached to the geological school of thought formed around Hutton’s ideas: not quite the same thing as volcanism à la Moro, etc. Anyway what matters now is that Hutton figured that all those rocks shared the fundamental property of having once been lava: of being, as we say today, igneous. Which, of the three types of rock that exist in nature, is the second we meet in this book (the first being, of course, sedimentary rock). As for the one we still haven’t met, i.e., metamorphic rocks: Hutton noticed that the rock near a granitic intrusion would look different from the way the same rock looked, say, some centimeters further away from the intrusion. He figured the molten granite—very hot—had “cooked” the rock it had “intruded”, before cooling down. Today, this is called contact metamorphism: which is metamorphism—a rock is metamorphosed: its properties are changed: it looks different, etc.—through contact with an intrusive igneous rock still in the molten state or in any case very hot. More simply, a rock can be metamorphosed just by being buried deeply beneath the earth’s surface, where temperature, as people have known ever since we’ve been mining the earth, is higher than near the surface (and pressure is higher, too, from the weight of the rock layers above): imagine sedimentary strata piling up on top of one another: the deeper layers might sink even further down just because of the weight that continues to accumulate on top of them; or they might be pushed down by some other forces... we’ll see about that later.
(People would also later identify other kinds of metamorphism, like for example by impact, i.e., if a meteorite hits a rock: relatively low temperature but ultrahigh pressure, which is also going to modify the rock’s properties in some radical ways; etc.) So while Hutton didn’t come up with the word, and it is debated whether it was Lyell who first used it—so he himself claimed—or Ami Boué or Elie de Beaumont, who are French geologists we shall meet later, at least the second one... while Hutton didn’t actually give a specific name to metamorphic rocks, he should be credited, for what it’s worth, with the discovery of the process. It is often the case that new theories are slow to catch on, and Hutton’s is no exception. Outside of the Edinburgh group, people didn’t pay much attention to Hutton’s work—while he was still alive, that is. And those who paid attention didn’t necessarily like it. Part of the problem was, according to Lyell108 , that, “by a singular coincidence, Neptunianism and [Christian] orthodoxy were now associated in the same creed; and the tide of prejudice ran so strong, that the majority were carried far away into the [...] cosmological inventions of Werner. These fictions the Saxon professor had borrowed with little modification, and without any improvement, from his predecessors. They had not the smallest foundation either in Scripture or in common sense, and were probably approved of by many as being so ideal and unsubstantial, that they could never come into violent collision with any preconceived opinions.” The first full-scale attack on The Theory of the Earth came from Richard Kirwan, “president of the Royal Academy of Dublin, a chemist and mineralogist of some merit,” says Lyell, “but who possessed much greater authority in the scientific world
than he was entitled by his talents to enjoy”. In a 1793 paper in the Transactions of the Royal Irish Academy (and then again in the introduction to his Geological Essays, 1799), Kirwan said, says Lyell, “that sound geology graduated into religion, and was required to dispel certain systems of atheism or infidelity, of which they had had recent experience”. Kirwan “was an uncompromising defender of the aqueous theory of all rocks, and was scarcely surpassed by Burnet and Whiston, in his desire to adduce the Mosaic writings in confirmation of his opinions.” Wow. According to Kirwan, Hutton, who denied that our world has a beginning, was, essentially, an atheist. This was very disturbing for Hutton—who, apparently, was quite religious. So disturbing that he actually sat down and rewrote his whole book, producing a huge three-volume edition, only two volumes of which saw the light of day before his death (which was in 1797)—the third volume unfinished and left in manuscript form. Hutton was no great writer anyway, though, and his ideas would be more effectively diffused later, through John Playfair’s109 and Charles Lyell’s books. Re Kirwan’s and the other Wernerians’ attacks, Paul N. Pearson110 wrote: “Hutton’s adversary, Richard Kirwan [...] objected that the crystals in granite appear to impress one another equally, and thus could not have crystallized sequentially.” What I think this means is that, according to Kirwan’s own observations, you might find crystals of one mineral protruding into non-crystallized masses of another mineral, and vice versa: which contradicts Hutton’s observations and interpretations as I’ve summarized them above. “Many early nineteenth century authorities such as Charles Daubeny”, says Pearson, “reiterated Kirwan’s objections. To Daubeny, granite was the ‘skeleton of the planet’ which had formed in a primeval era under conditions very different from those of the present.
Granite was usually depicted as a ‘primary’ rock in cross sections of the crust, often implying that it is the oldest or most fundamental rock type and the ultimate geological basement.” And but ultimately experiment (which, like I said, Hutton himself didn’t trust much) would establish that Hutton was right. Because, after Hutton’s death, Hall got back to work. He would take some volcanic rock, melt it, then cool it rapidly, so that it would solidify while staying as uniform as possible, and become glass. Then he would melt the glass, again, and cool it slowly: and depending on the rate at which he cooled it, he would be able to get something like the original volcanic stone back, or something different—he managed, e.g., to “make” various kinds of granite: depending on the cooling rate, the materials that are in the granite would coalesce in different ways, forming different crystals. In 1866, after a few more decades of fieldwork and experiment by many authors, George L. Vose111 writes: “Granite is a compound of quartz, felspar, and mica, confusedly mixed [...]. It has generally been considered as the product of a molten mass, which has cooled very slowly, and under great pressure; as precisely the same ingredients cooled rapidly and not under pressure, do not form granite, but a compact form of uniform texture.”
Chapter 4
The Age of the Earth
“We must look for a moment at the common notions regarding the fundamental rocks. In the ordinary teaching of geology, we are given to understand that the foundation rock of the globe is granite, and that the fluid matter supposed to underlie the solid crust is melted granite; that the difference between the matter ejected from volcanoes and the fundamental rock lies in the mode of cooling: the former, solidifying in the open air, becomes lava, basalt, or the like; while the latter, slowly cooled under enormous pressure, becomes the mineral compound called granite. When this appears at the surface, it does so not because it was formed there, but because some superficial covering has been stripped from it. When the overlying rocks have not been removed, we pass in descending through various alternations of limestones, sandstones, slates, shales, and conglomerates, containing more or less the remains of the animals which lived in the seas where these now solid rocks were once laid down as incoherent masses of sediment. As we pass downward in the series, the buried animals become further and further removed in their physiological characteristics from existing species, until in the lowest Silurian beds we seem to have arrived at the beginning of animal life”. What you just read is taken from Orographic geology, a book on the origin of mountains published in 1866 by George L. Vose112 , and a good summary of what the geology of the time thought about that topic. There is this idea of “foundation” rock, which sounds more Wernerian than Huttonian: remember that for Hutton the interesting thing about granite was that it could intrude pre-existing formations, etc. So why call it foundation rock, if it can be younger than the layers around it? Maybe because people still couldn’t quite digest the idea of the earth being in a possibly endless cycle—“no vestige of a beginning” and all that? and had to place, somewhere, a stable base on which everything else would rest?
Or at least keep calling it that way—foundation—even though they were beginning to realize that it wasn’t the foundation of anything at all?
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 L. Boschi, Our Concept of the Earth, https://doi.org/10.1007/978-3-031-71579-2_4
4.1 Layers, Fossils, Correlation

Anyway: everything else was an alternation “of limestones, sandstones, slates, shales, and conglomerates”. Vose here is not really spelling it out, but what this means is that if you go to the field and look at an outcrop, it is possible that you find all those types of sedimentary layers piled in any order113 . In the geological past, any given place might have been flatland, mountain, shallow ocean, in any succession. So, now here is a problem: let’s say that, again, you go to the field, but to a different place, far from where you were before. You are confronted with a sequence that resembles the one you’ve seen before. (But of course its top has been eroded away, its bottom is buried, etc.) Can you correlate the two sequences? Are they contemporary? Do they belong to the same world, the same era of the earth? Or are they just evidence of two similar but distinct successions of events, many millions of years apart from one another? And here we come to the last bit of Vose’s passage, where he mentions something that we haven’t encountered yet in this book: the idea that as one goes from younger to older strata, “the buried animals”, the fossils found in the strata, “become further and further removed in their physiological characteristics from existing species, until in the lowest Silurian beds we seem to have arrived at the beginning of animal life.” Vose is writing only a few years after the publication of On the Origin of Species, but what he’s writing is already “common notions”, common knowledge, because, even before Darwin, geologists had understood that more complex organisms also belong to more recent epochs in the history of the earth.
They had understood that rocks get recycled, and so you can’t easily tell, e.g., the sandstones of one era from the sandstones of another; but they also figured that evolution does not occur in cycles: and so one could distinguish strata that were otherwise very similar, based on the fossils you’d find in them. People started figuring out this sort of thing (“stratigraphy”, “biostratigraphy”) when the first “geological maps”, as they are called today, were drawn up. This appears to have been largely a collective effort by many geologists of the time, whether working in teams or individually. So I can’t be sure who actually came first—but as far as I can tell from what I’ve read, a big one is certainly Jean-Etienne Guettard. A medical doctor by background, but a naturalist by passion, Guettard made his living as the curator of the Duke of Orleans’ natural history collections. He also traveled around France, initially to study the flora, and noticed how the distribution of minerals and rocks, similar to the distribution of plants, would change from place to place. Guettard tried to see if he could find some sort of pattern in this; and he started to make maps showing where specific minerals, rocks and fossils were found. He ended up charting three great bands, each with its own set of rocks, which crossed all of France, suddenly coming to an end at the English Channel. Guettard then figured, chances are that, if I were to cross the Channel, I’d find the same bands on the other side. Now, traveling from France to Britain wasn’t as easy then as it is now, and Guettard had to look for books on English natural history: but that was enough to prove that, indeed, his surmise had been right. The overall geological pattern in England was a continuation of what he had seen in France (see Fig. 4.1).
Fig. 4.1 Correlation of geological features across the Channel, as mapped in the mid eighteenth century by Guettard. Rocks found in the shaded area all belonged, according to Guettard, to the same interval of relative geological ages—different from that of rocks found elsewhere in France and Britain
So, what Guettard had done, in the eighteenth century, was, and I am using this word for the second time, Guettard had correlated strata found in France with similar strata found far away in England. He understood that there was a connection: and making this connection is the beginning of the principle of geological correlation, i.e., the idea that the order of geological layers is coherent across the globe. The next step would be, then, to establish a global stratigraphy: very important to understand the history of the earth. Werner, in fact, used correlation systematically and did propose a global stratigraphy, which is precisely the scheme we have seen in Chap. 3 (if you don’t see what I am talking about, look again at figures 3.2 and 3.3, there). But then, Werner did not realize the importance of fossils. Most of my colleagues will agree that it was the work of Cuvier114 and Brongniart that made the bigger splash, with their 1811 book Essai sur la Géographie Minéralogique des Environs de Paris, where they show, basically, that the study of fossils is the key to global stratigraphy115 . Georges Cuvier’s and Alexandre Brongniart’s was an example of what today would be called by scientists an “interdisciplinary” collaboration. Cuvier was a paleontologist, i.e., someone who studies the remains of living beings that had existed in the (possibly very remote) past. He had attracted some attention with his early work, where he demonstrated that some ancient bones he found near Paris belonged to what we now call mammoths—an extinct species of big quadrupeds kind of like elephants—but not the same. And besides mammoths, he was able to find the skeletons of huge, uhm, lizards—dinosaurs—and of other creatures that didn’t look like
anything that is alive now116 ... Cuvier figured that not just individual species, but whole genera had disappeared from the face of the planet. This, the concept of extinction, was a new one at that time—and so was the big scientific question that it brought: how could those extinctions have happened? To understand how fossil animals had been entombed, Cuvier needed to work with someone whose competences are such that today we would call her or him a “geologist”. And that’s how he found Brongniart, director of the national porcelain factory, Manufacture Nationale, at Sèvres117 , who went with him to the field around Paris—almost every week for two years—to look at both the geology and the fossils, and to try and understand the order in which the strata and the fossils appeared. Post-Cuvier, then, people start to do this—to correlate fossils and figure out a global stratigraphy. In Basin and Range, McPhee gives a good example: “It was in the rock [marine limestone] of Devonshire”, he writes, “that geologists in the eighteen-thirties found cup corals—fossilized skeletons, cornucopian in shape—that were not of an age with corals they had found before. They had found related corals that were obviously less developed than these, and they had found corals that were more so. The less developed corals had been in rock that lay under the Old Red Sandstone. The more developed corals had been in rock above the Old Red Sandstone.” Now, the Old Red Sandstone that McPhee is talking about is in North Britain, while Devonshire is in the South of England, near Cornwall.
So the implications were: (1) that any rock that shows the same kind of fossil corals is very likely to be of the same age as the marine limestone of Devonshire (unless those corals hid in some unknown place for some amazing amount of time and then reappeared, but that would be weird); but also (2) that any rock that does not show those corals but is overlain by rock that shows younger corals and underlain by rock that shows older corals, should be of the same age. Like, for instance, the Old Red Sandstone of North Britain, where no cup corals had been found. “Therefore, it was inferred [...] that the Old Red Sandstone of North Britain and the marine limestone of Devon were of the same age, and that henceforth any rock of that age anywhere in the world—in downtown Iowa City; on Pequop Summit, in Nevada; in Stroudsburg, Pennsylvania; in Sandusky, Ohio—would be called Devonian. It was a name given, although they did not know it then, to forty-six million years. They still had no means of measuring the time involved. They also had no way of knowing that those forty-six million years had ended a third of a billion years ago. All they had was their new and expanding insight that they were dealing with time in quantities beyond comprehension.” (I always found it confusing, by the way, that names that are obviously attached to a specific place, like Devonian, are used by geologists to name rocks that you can find all around the world; but that’s the way it is.) “Geologists”, continues McPhee, “did not have to look long at the coal seams of Europe—the coals of the Ruhr, the coals of the Tyne—to decide that the coals were of an age, which they labelled Carboniferous. The coal and related strata lay on top of the Old Red Sandstone.
So, in the succession of time, the Carboniferous period (eventually subdivided into Mississippian and Pennsylvanian in the United States) would follow the Devonian, coupling on, as the science would eventually determine, another seventy-two million years—362 to 290 million years before the present.”
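The two inference rules above are mechanical enough to caricature in a few lines of Python. This is a toy sketch, not any real biostratigraphic method: the fossil-to-period assignments are invented for illustration, and the list of periods is drastically simplified.

```python
# Toy biostratigraphic correlation. A "column" is a list of strata from oldest
# (bottom) to youngest (top); each stratum is a set of fossils found in it.
# The period names are real, but the fossil assignments below are hypothetical.

PERIODS = ["Silurian", "Devonian", "Carboniferous"]  # known relative order

def date_column(column, index_fossils):
    """Rule (1): a stratum sharing an index fossil with a period gets that period.
    Rule (2): an undated stratum bracketed by a dated stratum below and above,
    exactly one period apart on each side, gets the period in between."""
    ages = []
    for fossils in column:
        # rule (1): first period whose index fossils intersect this stratum's
        age = next((p for p, f in index_fossils.items() if fossils & f), None)
        ages.append(age)
    for i in range(1, len(ages) - 1):
        below, above = ages[i - 1], ages[i + 1]
        if ages[i] is None and below in PERIODS and above in PERIODS:
            j, k = PERIODS.index(below), PERIODS.index(above)
            if k - j == 2:                 # exactly one period in between
                ages[i] = PERIODS[j + 1]   # rule (2)
    return ages

# Hypothetical index-fossil assignments, one per period:
index_fossils = {
    "Silurian": {"early coral"},
    "Devonian": {"cup coral"},
    "Carboniferous": {"coal flora"},
}

# Devonshire limestone: cup corals present, so rule (1) dates it Devonian.
print(date_column([{"cup coral"}], index_fossils))

# Old Red Sandstone: barren of corals, but bracketed, so rule (2) applies.
north_britain = [{"early coral"}, set(), {"coal flora"}]
print(date_column(north_britain, index_fossils))
```

Rule (2) is exactly what dated the Old Red Sandstone: no cup corals in it, but older fossils in the rock below and younger ones in the rock above.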
4.2 Relative Versus Absolute Age

You understand, now, that we have ways to tell which of two formations is older, etc. But how does one figure out absolute age? not just whether that layer of limestone is older or younger than this granitic intrusion: but how many million years ago sediments were turned to rock, and/or how many million years ago the granite solidified taking the form that it has now. To start answering this, I am going to take a step back and tell you about some of the very first estimates of the age of the earth, which were obtained in the seventeen hundreds under the assumption, as Hallam puts it, “that the existence of man was effectively coeval with that of the earth”, that is to say, that man and earth came into being at the same time, which if you think about it is not such a crazy idea, because if we knew nothing of what we know now, why should we hypothesize that the human race is younger—or older—than our planet? consider that we have no historical record at all of major physiological changes in the way our species is, and no document of man’s arrival on this planet. So, in the absence of other data (that we now have, but that wasn’t the case two or three hundred years ago), it makes sense to start with the simplest possible model, i.e., that nothing has changed and nothing will change118 ; so then what people did was to “undertake a critical study of ancient documents, especially calendar systems. This”, says Hallam119 , “was highly skilled work involving a knowledge of languages, history, and astronomy. [...] Newton was among the great minds of the period who took part in this kind of chronology, and so there is nothing ridiculous” about this approach, etc., which this last sentence is funny, if you think about it: Newton did it, so it’s not “ridiculous”. But Newton was also an alchemist (remember Chap. 1? Keynes’ Newton speech?). Is alchemy “serious”? or “ridiculous”? if you ask me, no attempt to understand nature is ridiculous.
Or maybe: no attempt to understand nature is serious. Anyway: today, the scholar who’s most frequently remembered, of those who computed the age of the earth from ancient documents, is James Ussher, Archbishop of Armagh (1581–1656), “known to us today almost entirely in ridicule—as the man who fixed the time of creation at 4004 B.C., and even had the audacity to name the date and hour. October 23 at midday.” This is from an article120 by Stephen Jay Gould, who of course doesn’t care for theories of a young earth, but is critical of modern textbooks that quickly dispose of studies like that of Ussher, making no effort to show them in their historical context. I, obviously, think that Gould is right—which is one of the reasons I am writing this book. “To this day,” he says, “one can scarcely find a textbook in introductory geology that does not take a swipe at Ussher’s date as the opening comment in an obligatory page or two on older concepts of the earth’s age [...]. Other worthies are praised for good tries in a scientific spirit (even if their ages are way off), but Ussher is excoriated for biblical idolatry and just plain foolishness. How could anyone look at a hill, a lake, or a rock pile and not know that the earth must be ancient? “One text”, continues Gould, “discusses Ussher under the heading ‘Rule of Authority’ and later proposals under ‘Advent of the Scientific Method.’ We learn— although the statement is absolute nonsense—that Ussher’s ‘date of 4004 B.C. came
to be venerated as much as the sacred text itself.’ Another text places Ussher under ‘Early Speculation’ and later writers under ‘Scientific Approach.’ These authors tell us that Ussher’s date of 4004 B.C. ‘thus was incorporated into the dogma of the Christian Church’ [...] They continue: ‘For more than a century thereafter it was considered heretical to assume more than 6,000 years for the formation of the earth.’ “Even the verbs used to describe Ussher’s efforts reek with disdain. In one text, Ussher ‘pronounced’ his date; in a second, he ‘decreed’ it; in a third, he ‘announced with great certainty that... the world had been created in the year 4004 B.C. on the 26th of October at nine o’clock in the morning!’ (Ussher actually said October 23 at noon—but I found three texts with the same error of October 26 at nine, so they must be copying from each other.) This third text then continues: ‘Ussher’s judgment of the age of the earth was gospel for fully 200 years.”’, etc.121 The way Ussher comes up with twelve o’clock, October 23, 4004 B.C. is quite complicated... I would be lying if I told you that I looked at even one single page of Ussher’s book; but Gould is clear that Ussher’s work involved much more than (as his detractors think) “adding up ages and dates given directly in the Old Testament”, because “even a cursory look at the Bible clearly shows that no such easy solution is available”, and in fact “Ussher’s chronology extends out to several volumes and 2,000 pages of text”. The problems that Ussher faced, and the sources, other than the Bible, that he drew upon, are summarized by Gould in his article122 .
While we all agree now that it is not fair, and not productive, to ridicule Ussher and his similarly minded contemporaries, we know (and of course Gould knew) that, nowadays, their work is, uhm, not very useful as a constraint for the age of the earth: because, for instance, you don’t need much speculation or elaborate observations to figure out, today, that it takes much more than a few thousand years to erode a mountain: in the section of Basin and Range where he explains “Hutton’s essentially novel and all but incomprehensible sense of time”, McPhee points out that “Hutton had seen Hadrian’s Wall running across moor and fen after sixteen hundred winters in Northumberland. Not a great deal had happened to it”: ergo, it must take much longer than sixteen hundred winters to erode away from mountains enough material to form all the sedimentary strata that anyone can see in the field. And so, if we accept the ideas brought forward in Chap. 3, we need the earth to be way older than that123 .
4.3 Newton’s/Buffon’s Estimate of the Age of the Earth

Even before people actually tried to estimate erosion rate, and to translate it into duration, there was at least another estimate of the age of the earth that was not based on the Scriptures. The way I understand it, it was Buffon who came up with it, though in some textbooks you might find it attributed to Newton. What Newton actually did was, towards the end of his Principia, he came up with a thought experiment to estimate the time it would take for a large spherical body to cool to ambient
temperature. Here’s the quotation from Newton: “A globe of iron of an inch in diameter, exposed red hot to the open air, will scarcely lose all its heat in an hour’s time; but a greater globe would retain its heat longer in proportion of its diameter, because the surface (in proportion of which it is cooled by the contact of the ambient air) is in that proportion less in respect of the quantity of the included hot matter; and therefore a globe of red hot iron equal to our earth, that is, about 40,000,000 ft in diameter, would scarcely cool in an equal number of days, or in above 50,000 years.” The physics, which needs some understanding, or at least in my case needed understanding, is all in the following: “a greater globe would retain its heat longer in proportion of its diameter, because the surface (in proportion of which it is cooled by the contact of the ambient air) is in that proportion less in respect of the quantity of the included hot matter.” What Newton is saying here is that the time it takes to cool a red hot globe of iron to ambient temperature is proportional to its diameter (“a greater globe would retain its heat longer in proportion of its diameter”), and this is because: (i) that time is inversely proportional (the larger the area, the shorter the time) to the area of the globe’s outer surface (the globe is cooled in proportion of its surface, is how Newton phrases it)—which, if I call R the globe’s radius, is given by 4πR²; (ii) the globe’s volume is (4/3)πR³, and if I call ρ the globe’s mass density, its mass is (4/3)ρπR³; the ratio of mass to area then is ((4/3)ρπR³)/(4πR²) = ρR/3, or ρD/6, where D is diameter; (iii) Newton isn’t quite spelling this out, but what he does is he assumes that the time taken to cool the globe is directly proportional to its mass.
So then if, according to (i), the time t coincides with some constant (which might be related to mass) divided by the area of the globe’s surface, and but also, based on (iii), t equals some other constant (which might be related to surface area) times the globe’s mass... well then it follows that t equals yet another constant (which is related neither to mass nor to area) times mass and divided by area. According to (ii), then, t is proportional to ρD/6, or in Newton’s words “a greater globe would retain its heat longer in proportion of its diameter” (here we are neglecting changes in density; but it’s always iron anyway, so the error should be small). I don’t know why Newton thought that both these relationships—t to mass and t to area—should be linear. Initially I thought this should be somehow related to Newton’s law of cooling124 , and/or his 1701 paper where he presents that “law” and his own thermometry scale; but after spending some time on this I am almost convinced that this is not the case. So, if I am not mistaken, Newton here is just making an assumption that seems reasonable. It follows from that assumption that the ratio of the cooling times of two globes equals the ratio of their diameters. Newton also knew—he must have measured this—that “a globe of iron of an inch in diameter, exposed red hot to the open air” will take about an hour to cool to the temperature of the “open air”. So then from that, Newton could calculate the time needed to cool any globe of red hot iron to the same final temperature, by just knowing its diameter. Newton, for whatever reason, did that calculation for the case of a globe the size of our planet. The diameter of the earth is about 13 × 10⁶ m; one inch is about 25 × 10⁻³ m: it follows that the ratio of the diameter of the earth to that of Newton’s reference globe is roughly 0.5 × 10⁹. Ergo, it would take 0.5 × 10⁹ hours to bring a globe of iron
the size of earth from red hot to its current temperature: which if you divide by 365 times 24, comes to about 57,000 years, which is roughly the value given by Newton, see above. I am not sure that this result should be read as Newton trying to estimate the age of the earth. The theories on the origin of the earth of Chap. 2, though, the Kant-Laplace nebular hypothesis and all that, convinced everybody, some time after Newton, that the earth is very likely to have been initially, some very long time ago, in an entirely molten state, and then to have cooled down and solidified, at least to some extent. Then, indeed, evaluating the time it takes for a mass of liquid rock similar to the mass of the earth to become solid would be a way to have, at least, an order-of-magnitude estimate for the age of the earth, or in any case for the time passed since the earth was just pure lava. Now, Buffon, whom we’ve met before, ran an experiment125 that’s very reminiscent of Newton’s speculation, but with the manifest goal of estimating, in some admittedly extremely rough and approximate way, but better than nothing, the age of the earth. Buffon, who actually owned a foundry, had ten iron spheres made, with diameters growing in half-inch increments, from 0.5 to 5 inches. He heated each of them to white heat126 , and then measured how long they took to cool to room temperature. He found that the relation between that time and the diameter of the sphere was approximately linear—which meant that Newton’s hypothesis was correct, as is often the case with Newton’s hypotheses. Buffon extrapolated this to a sphere the size of the earth, and it turned out that, assuming the linear relationship would stay linear when such big masses are involved, a molten-iron sphere the size of the earth would take about 100,000 years to cool.
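The extrapolation is simple enough to sketch in a few lines of Python. The one-inch, one-hour reference globe is Newton’s own; the value I use for the earth’s diameter is my rounded-off assumption.

```python
# Newton's scaling assumption: cooling time grows linearly with diameter.
INCH = 0.0254               # meters
HOURS_PER_YEAR = 365 * 24

def cooling_time_hours(diameter_m, ref_diameter_m=INCH, ref_time_hours=1.0):
    # Reference case (Newton's): a one-inch red-hot iron globe cools
    # to ambient temperature in about one hour.
    return ref_time_hours * diameter_m / ref_diameter_m

earth_diameter_m = 12.7e6   # roughly the modern value, rounded

t_years = cooling_time_hours(earth_diameter_m) / HOURS_PER_YEAR
print(round(t_years))       # about 57,000: the order of Newton's "above 50,000 years"
```

Buffon’s version of the argument replaces the assumed one-hour reference with the measured cooling times of his ten spheres, and extrapolates the same straight line, which is how he lands in the 100,000-year range rather than Newton’s 50,000.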
“Buffon”, says Brent Dalrymple127 , “was suspicious of his results because he felt that the events of Earth history required a time much longer than the 75,000 years his calculations indicated, perhaps as much as double. He was impressed, for example, by the tremendous thickness of sedimentary rocks exposed in the Alps and by the exceedingly slow rate at which similar modern sediments are formed in the ocean. In unpublished manuscripts not publicized until the century after his death, Buffon detailed several longer chronologies, including one that estimated the age of the Earth at nearly 3 Ga.” (I think Dalrymple meant to write 3 Ma—three million years—rather than 3 Ga—three billion years: although the latter is closer to the age of the earth as we estimate it today (more on that later). Anyway, you see the point.)
4.4 Hutton’s Idea of Time

Living in pre-revolutionary France, Buffon is, just like de Maillet (whom we’ve met in the previous chapter, and who remained “underground” until his death), very careful to avoid conflict with orthodox thought, and the scandal that would ensue. Hutton is less careful. That’s ironic, because unlike Buffon who’s totally a French enlightenment person, and chances are that, privately, he is (Buffon is) more than skeptical about literal interpretations of the bible and all that, Hutton is pious and genuinely
concerned that his doctrine be coherent with the Christian faith. But Hutton's time, as we've seen, is even longer than de Maillet's and Buffon's: because, without even trying to come up with a numerical estimate, Hutton figured all one could say was that there was "no vestige of a beginning": so forget creation altogether. And then obviously Hutton had to face all that "Neptunian" hostility, etc. The problem was not just philosophical, but also political. In his book128, Hallam has a short piece about how, "traditionally, the process of history was seen as one of decline from a paradisial state of perfection to an imminent dissolution of the world to make way for a new heaven and a new earth. The Enlightenment belief in a globe of considerable antiquity became mixed up in the minds of the pious with political events in France, and a reaction grew up in the British Isles in the closing years of the century against the ungodly influences emanating from the continent. This is the proper context in which to view Kirwan's vehement attack on Hutton", etc. But the evidence in favour of the earth's "considerable antiquity" was getting to be just too much. We've seen earlier in this chapter that the work of paleontologists like, e.g., Cuvier also pointed to (Hallam again) "a process of history consisting of a sequence of 'previous' worlds, each populated by different, unfamiliar, and now-extinct organisms." And: "there [...] began to emerge a picture of progression through time of various life forms, posing a further challenge to Christian orthodoxy". What Hallam is saying here is that you need an enormous amount of time not only to account for the accumulation and deformation of sedimentary rocks, but also to explain the sequence of transformations that must have turned the very simple life forms of the deeper and older strata into the complex organisms that we see around us today (incl. ourselves). And this naturally brings me to Charles Darwin, who in Chap.
IX (“On the imperfection of the geological record”) of The Origin of Species (1859) will write: “it may be objected, that time will not have sufficed for so great an amount of organic change, all changes having been effected very slowly through natural selection. It is hardly possible for me even to recall to the reader, who may not be a practical geologist, the facts leading the mind feebly to comprehend the lapse of time. “He who can read Sir Charles Lyell’s grand work on the Principles of Geology, which the future historian will recognise as having produced a revolution in natural science, yet does not admit how incomprehensibly vast have been the past periods of time, may at once close this volume. Not that it suffices to study the Principles of Geology, or to read special treatises by different observers on separate formations, and to mark how each author attempts to give an inadequate idea of the duration of each formation or even each stratum. A man must for years examine for himself great piles of superimposed strata, and watch the sea at work grinding down old rocks and making fresh sediment, before he can hope to comprehend anything of the lapse of time, the monuments of which we see around us.”
4.5 Darwin and the Denudation of the Weald

In the same chapter, Darwin attempts a rough but quantitative estimate of geological time. Not the age of the earth, but the time needed to complete one of those processes "the monuments of which we see around us": he picked the "denudation" (erosion) of the Weald, a valley south of London and of the estuary of the Thames129, probably because it had been studied in detail already—Chaps. XXI and XXII of book IV of Lyell's Principles of Geology are all about the Weald. Here's what Darwin has to say—or at least an excerpt: "it is an admirable lesson to stand on the North Downs and to look at the distant South Downs; for, remembering that at no great distance to the west the northern and southern escarpments meet and close, one can safely picture to oneself the great dome of rocks which must have covered up the Weald within so limited a period as since the latter part of the Chalk formation." Before I go on with the quote, take a look at Fig. 4.2. "The distance from the northern to the southern Downs is about 22 mi, and the thickness of the several formations is on average about 1100 ft, as I am informed by Prof. Ramsay. [...] If, then, we knew the rate at which the sea commonly wears away a line of cliff, we could measure the time requisite to have denuded the Weald. This, of course, cannot be done; but we may, in order to form some crude notion on the subject, assume that the sea would eat into cliffs 500 ft in height at the rate of one inch in a century. [...] At this rate, on the above data, the denudation of the Weald must have required 306,662,400 years; or say three hundred million years. [...] I have made these few remarks because it is highly important for us to gain some notion, however imperfect, of the lapse of years." Darwin doesn't say much more than what I've copied here, and I wasn't at all sure what to do, actually, with those numbers—how to get his 300-million-years estimate.
So I went and read Lyell's chapters about the Weald. I immediately found something in Lyell that I could not figure out from Darwin but that is quite fundamental: "the following hypothesis has been very generally adopted", he writes in Chap. XXI of book IV of the Principles: "Suppose the five formations to lie in horizontal stratification at the bottom of the sea; then let a movement from below press them
Fig. 4.2 Left: simplified geological map of the south-east of England showing the denudation of the Weald. Right: sketch of a vertical section, taken through the straight black line on the left. Different shades of gray correspond to different geological strata—darker strata are younger
upwards into the form of a flattened dome, and let the crown of this dome be afterwards cut off, so that the incision should penetrate to the lowest of the five groups. The different beds would then be exposed on the surface in the manner exhibited in the map, plate 5.” Again, look at Fig. 4.2 here, because the map there is a copy of Lyell’s. “The quantity of denudation or removal by water of vast masses which are assumed to have once reached continuously from the North to the South Downs is so enormous, that the reader may at first be startled by the boldness of the hypothesis. But he will find the difficulty to vanish when once sufficient time is allowed for the gradual and successive rise of the strata, during which the waves and currents of the ocean might slowly accomplish an operation, which no sudden diluvial rush of waters could possibly have effected.” The italic is mine. In the next chapter, after deducing from many different features of the landscape that denudation of the Weald was most likely caused by the action of water (and I am skipping a lot of stuff, here, to be honest, but you can easily look it up—Lyell’s book is available online, for free), Lyell summarizes and reiterates: “We infer from the existence of large valleys along the outcrop of the softer beds, and of parallel chains of hills where harder rocks come up to the surface, that water was the removing cause; and from the shape of the escarpments presented by the harder rocks, and the distribution of alluvium over different parts of the surface of the Weald district, we conclude that the denudation was successive and gradual during the rise of the strata.” What Lyell is saying, and maybe for Darwin it was so obvious that he figured it wasn’t worth mentioning, is that denudation by water occurs when the top of the “dome” is near the sea surface; while something pushes it up from below (and Lyell doesn’t know what it is—doesn’t claim to know what it is—it’s a mystery, something related to 
heat probably—see the previous chapter of this book, and the next), the water eats it away at the top; so, just to make it clear, the dome that one could draw by imagining what the landscape would look like, had the chalk layers never been eroded—look at the dashed lines in Fig. 4.2—that dome has never existed, because its top was eroded away before it could rise. So, while the strata are slowly pushed up to form an anticlinal (which, by the way, is the first time this word appears in Lyell’s book; and because at the time it wasn’t such a common word as it is (to geologists) now, Lyell also makes a drawing (similar to that in Fig. 4.3) to explain it, together with its sister word, synclinal, and mentions that they were invented by Sedgwick; these words will get used much more in the next chapter), whose axis Lyell supposes “to have coincided with what is now the
Fig. 4.3 a stands for anticlinal, b for synclinal. (After Lyell’s Principles.)
Fig. 4.4 Schematic vertical cut across the Weald. The dotted line represents the sea level, and the strata are numbered from youngest (1) to oldest (5). Shortly after being formed, the Wealden anticlinal is, in Lyell’s words, “so rent and shattered on its summit as to give more easy access to the waves, until at length the masses represented by the fainter lines [are] removed”. (After Lyell’s Principles.)
Fig. 4.5 Same cut as in Fig. 4.4, some (geological) time later. “When at length the gault was entirely swept away from the central parts of the channel,” says Lyell, “the lower green-sand would be laid bare, and portions of it would become land”. (After Lyell’s Principles.)
central ridge of the Weald Valley [...]. Here a number of reefs may have existed, and islands of chalk, which may have been gradually devoured by the ocean in the same manner as Heligoland and other European isles have disappeared in modern times [...]. “Suppose the ridge or dome first elevated to have been so rent and shattered on its summit as to give more easy access to the waves, until at length the masses represented by the fainter lines in the annexed diagram [see Fig. 4.4 here] were removed. Two strips of land might then remain on each side of a channel, in the same manner as the opposite coasts of France and England, composed of chalk, present ranges of white cliffs facing each other. A powerful current might then rush, like that which now ebbs and flows through the straits of Dover, and might scoop out a channel in the gault. We must bear in mind that the intermittent action of earthquakes130 would accompany this denuding process, [...] bringing up, from time to time, new stratified masses [...]. When at length the gault was entirely swept away from the central parts of the channel, the lower green-sand [see Fig. 4.5] would be laid bare, and portions of it would become land [...]. Meanwhile the chalk cliffs would recede farther from one another”, etc. After reading this and with some reverse engineering (i.e., I certainly didn’t get it right the first time around, for a number of reasons that you don’t want to hear about), I was able to reproduce Darwin’s estimate. Darwin thinks of erosion as a 500-foot cliff being eaten away by the ocean at a speed of one inch per century. In other words: every century, a vertical sheet of rock (chalk, mostly) of 1 in. in thickness, 500 ft in height, and as wide as the Weald valley is long, was reduced to dust and dispersed in the ocean. Today we would call this a very simple numerical model of the processes outlined by Lyell. Because the Weald is 22 mi wide, to find how long
that took, convert miles to inches and divide by the speed, which is 1 in. per century, so that's easy—you should get 1,393,920 centuries. Now, the formations we are talking about were... or, rather, would have been 1100 ft high. So to remove them completely, you have to repeat the above process 1100/500 = 2.2 times. Multiply 1,393,920 by 2.2; then by 100, because those are centuries and we want years, and we get 306,662,400, which is the exact same number of years given by Darwin. Chances are that this is precisely the calculation that Darwin did, because it would be difficult to get all those digits right just by coincidence. But then, it doesn't quite make sense: for instance, Darwin says that one inch in a century is "the rate at which the sea commonly wears away a line of cliff"; but according to Lyell's theory two cliffs had been formed ("in the same manner as the opposite coasts of France and England"), and so the sea was wearing away both cliffs at the same time, and presumably at the same speed, so then one actually would need to divide those 300 million years by two. If you ask me, whether it's 100 or 500 million years is no big deal—it's quite obvious that Darwin only needs an order-of-magnitude estimate: he knows that there's a lot of uncertainty, but all he wants to say is that geological time is extremely long. His somewhat casual attitude in this respect cost him some bad reviews, though. And in the American edition of the Origin, the 300-million-years estimate comes with a footnote, where Darwin says: "I confess that an able and justly severe article, since published in the Saturday Review (December 24th, 1859), shows that I have been rash. I have not sufficiently allowed for the softness of the strata underlying the chalk; the remarks made are more truly applicable to denuded areas composed of hard rocks.
Nor have I allowed for the denudation going on on both sides of the ancient Weald-Bay; but the circumstances of denudation having taken place within a protected bay would prolong the process. It has long been my habit to observe the shape and state of surface of the fragments at the base of lofty retreating cliffs, and I can find no words too strong to express my conviction of the extreme slowness with which they are worn away and removed. I beg the reader to observe that I have expressly stated that we cannot know at what rate the sea wears away a line of cliff: I assumed the one inch per century in order to gain some crude idea of the lapse of years; but I always supposed that the reader would double or quadruple or increase in any proportion which seemed to him fair the probable rate of denudation per century. But I own that I have been rash and unguarded in the calculation.” “This footnote”, says Joe Burchfield in “Darwin and the dilemma of geological time” (ISIS, vol. 65), “is unique. It is the only one ever to appear in any edition of the Origin and as such bears eloquent witness to Darwin’s agitation and concern.”
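Darwin's Weald arithmetic, as reconstructed a few paragraphs back, can be checked in a few lines. This is a sketch of the reverse-engineered calculation, using the figures quoted in the text, not anything Darwin himself wrote down:

```python
# Reverse-engineered Weald calculation, figures as quoted from Darwin.
INCHES_PER_MILE = 12 * 5280

width_inches = 22 * INCHES_PER_MILE     # Weald is about 22 miles across
centuries = width_inches // 1           # sea eats 1 inch of cliff per century
centuries = centuries * 1100 // 500     # strata 1100 ft thick, cliff 500 ft high
years = centuries * 100                 # centuries to years

print(years)  # 306662400 -- Darwin's exact figure
```

Integer arithmetic reproduces all nine digits, which supports the guess that this is the computation Darwin actually performed.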
4.6 John Phillips and the Ganges

"The Saturday Review", goes on Burchfield, "was just the beginning, however. In February 1860 [Darwin] was attacked on the same issue from the presidential chair of the Geological Society. The attacker was John Phillips131, the Professor of Geology at Oxford and an old opponent of Darwin and Lyell. Phillips argued that a river similar
to the Ganges [...] could account for the denudation of the Weald in only 1.3 million years [...]. Phillips carefully discounted any claims to accuracy in these calculations but considered them quite sufficient to discredit Darwin’s ‘inconceivable number of 306,662,400 years,’ which he denounced as an ‘abuse of arithmetic.’ During the next several months he expanded his critique, first in the Rede Lecture at Cambridge and then in a monograph132 . In this final form Phillips included the first important attempt to calculate geological time from the rate of accumulation of strata”. While it’s difficult to observe/quantify the rate of marine denudation, said Phillips, it is less difficult to “observe and measure what is carried away by one or more rivers from [the land] to the sea in one year”133 . Measuring the amount of sediment carried by one or more rivers in a given amount of time is roughly the same as measuring how much rock is eroded away from a certain area of the globe in the same time (if, as Phillips does, you assume that amount of river sediment “to be a fair average for the whole surface of the land”). “The fine sediments of the Marañon were still discoverable by Sabine at the discoloured surface of the sea 300 mi from the mouth”134 . So then “to save trouble, suppose the whole of the sediment to be spread out on the sea-bed, over an area equal to that from which it was derived. The thickness wasted from the land, in one year, is thus the same as that added to the sea-bed in one year. Divide by this thickness the measured thickness of the sedimentary strata, the result is the number of Uniformitarian135 years employed in depositing the strata, if the materials were all derived in this manner from atmospheric waste of the land, as some suppose. 
"For example," Phillips continues, "the Ganges, which drains 300,000 square miles of plains, hills, and mountains, containing a great variety of rocky and earthy masses, delivers annually to the Bay of Bengal 6,368,077,440 cubic feet of sediment, which is equal to 1/111th of an inch in a year. The maximum thickness of the strata is supposed to be about 72,000 ft = 864,000 in., and dividing this by 1/111th we have the calculated antiquity of the base of the stratified rocks = 95,904,000 years. But here two things are to be allowed for. The thickness of the old strata is taken at a maximum, and the new deposit is supposed to spread over a much larger space of sea-bed than it really does, so that the period found is something too large. On the other hand, the Ganges carries very much more sediment than some other great rivers,—nearly twenty times as much as the Rhine;—it has the character of tropical or excessive effect, and on this account the period may be much too short." You might be wondering where that figure, 6,368,077,440, comes from. At least, I did. If you look up Phillips you'll see that he published in 1837 and 1838 a two-volume Treatise on Geology, whose Chap. VI is all about "fluviatile deposits"; at some point he mentions that "Mr. Everest found in the water of the Ganges, during rains, 1/856th of its volume of mud; and the total annual discharge of sediment into the Bay of Bengal 6,368,077,440 cubic feet (= 255,854,720 cubic yards). (Biblioth. Universelle, 1834)", but then doesn't explain how this and similar measurements were made. Looking up "Biblioth. Universelle" didn't help. But if you google "6,368,077,440", you'll find a paper in the Journal of the Asiatic Society of Bengal, volume 1, signed by the Rev. R. Everest136 ("read before the Physical Class, 13th June, 1832"): "In the course of last summer, I made some attempts to ascertain the weight of solid matter
contained in a given quantity of Ganges water, both in the dry and rainy season, but I found the weight so variable on different days (when little difference might have been expected) that I can hardly consider the observations numerous enough to give a correct average. Such as they are, however, they may not be without interest in the absence of other information on the subject. I therefore take the liberty of laying them before the Society, and shall, if opportunity offers, endeavour to add some further data, both for the weight of solid matter, and also for the rise and velocity of the river. "1. A quantity of Ganges' water taken 27th May, 1831, gave when evaporated, a solid residuum of 1.084 grains per wine quart. [A grain is a unit of measurement of mass that I don't think I've ever heard about. According to Wikipedia 1 g is about 15 grains; nowadays "the grain is commonly used to measure the mass of bullets and propellants."] "2. July 21st. There had been little rain for some days, and the river was low for the season; a wine quart contained of soluble matter 2.0 grains; of insoluble, 16.2;—total 18.2", etc. [The measurements become more sophisticated as Everest realizes, on August 20th, that "the quantity of matter held in suspension in the middle of the current was much greater than towards the bank," so he starts taking two samples, one from the middle and one from the side of the river.] Everest next adds "such data as I possess respecting the breadth, depth, and velocity of the stream"... which is a lot of data because those parameters also are measured when it rains, when it doesn't rain, etc. Eventually, the measurements go on for about a year.
Then it's a bit tricky because Everest has to do the "rains", "winter" and "hot weather" seasons separately, and then for some reason he prefers to convert mass to volume (which later will mean one less calculation for Phillips), but eventually he comes up with 6,368,077,440 cubic feet for the total annual discharge of sediment. Anyway, so, the controversy re Darwin's estimate of the rate of denudation of the Weald was a big deal. "During the spring of 1860", writes Burchfield, "Francis Bowen, Professor of Natural Religion and Moral Philosophy at Harvard, took up several meetings of the American Academy of Arts and Sciences to denounce the speculative nature of Darwin's reliance on time, condemning it as cosmology rather than natural history. And in his subsequent review he directed his most severe criticism at Darwin's reliance upon the imperfection of the geological record137 and at the calculation of the denudation of the Weald", etc. "From the beginning", continues Burchfield, Darwin "had believed the imperfection of the geological record to be his weakest point, and now he was convinced that he had been guilty of an incredible blunder. His distress was acute, as this note to Lyell reveals: 'Having burnt my own fingers so consumedly with the Wealden, I am fearful for you138... for heaven's sake take care of your fingers; to burn them as severely as I have done, is very unpleasant.'139" And so, "when the third edition of the Origin appeared in April 1861 all mention of the Weald was conspicuously missing, including the proposed footnotes, although the claims for the imperfection of the geological record remained as strong as ever."140 Another important review of Darwin's Origin was published by Henry Charles Fleeming Jenkin in the North British Review (June 1867). "Darwin", writes Burchfield, "considered this to be one of the most valuable reviews ever written concerning
his theory”. The review comes after the fourth edition of the Origin, published in 1866, where “all reference to the Weald had been removed”. Nevertheless, Burchfield continues, “on turning to the question of time, [Fleeming Jenkin], like many other critics, resurrected the Wealden calculation and Darwin’s ‘confounded millions of years.’ He dismissed it almost immediately, as an example of the kind of fuzzy reasoning that had led geologists to the ‘wholly erroneous’ conclusion that past time was virtually limitless. On the basis of the evidence available, he asserted, Darwin’s results could be expanded or contracted by a factor of a hundred or even a million; the data were simply too meagre for judgment.” Jenkin didn’t care much for other purely, uhm, geological estimates à la Phillips, either: he saw things from a totally different angle. “He pointed out”, says Burchfield, “that in a finite world heated by a finite sun the available store of energy must be limited, and he explained why, according to the second law of thermodynamics, every energy transformation—that is, every process of change—must dissipate a part of that energy and render it useless for further transformations. In geological terms he argued that these facts meant that uniformity could not be a law of nature,” i.e., that the theory of uniformitarianism doesn’t work: “that the earth must be running down, and that present geological forces must be less powerful and less violent than those in the past. The present rate of geological change cannot therefore be used justifiably as a guide to the age of the world [...].
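Before moving on: Phillips' Ganges arithmetic, quoted earlier in this section, can be re-run in a few lines. A sketch using his own figures; note that the raw division gives roughly 1/109 of an inch per year, which Phillips evidently rounded to his quoted 1/111:

```python
# Phillips' sediment budget for the Ganges, figures as quoted in the text.
SQ_FEET_PER_SQ_MILE = 5280 ** 2

sediment_cubic_feet = 6_368_077_440     # Everest's annual sediment discharge
drainage_sq_miles = 300_000             # area drained by the Ganges

# spread one year's sediment uniformly over an equal area of sea-bed
thickness_feet = sediment_cubic_feet / (drainage_sq_miles * SQ_FEET_PER_SQ_MILE)
thickness_inches = thickness_feet * 12
print(1 / thickness_inches)             # about 109: close to Phillips' 1/111

strata_inches = 72_000 * 12             # maximum strata thickness: 864,000 in.
print(strata_inches * 111)              # 95904000 -- Phillips' age estimate
```

So the 95,904,000-year figure is just the 864,000-inch column of strata divided by the (rounded) annual denudation rate.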
4.7 Conservation of Energy

"Neither Jenkin's attack on uniformitarianism nor the physical calculations of time he used to support it were original. They had in fact been fully discussed some five years earlier by his close friend and associate William Thomson, the future Lord Kelvin..." In fact, from this point on, the whole idea of estimating geological time from observations of denudation and sedimentation, etc., becomes much less relevant, in the face of a number of things that in the meantime (between Hutton and the Origin) had been discovered by physicists, and that now begin to have some impact also on people studying the earth and trying to guess its age. We need to take a closer look at this stuff, so that phrases like "the second law of thermodynamics" or "energy transformation", etc., which you've just read and that might have sounded weird, become sufficiently clear to you before we go on. To begin with, even before the first edition of the Origin came out, the so-called law of energy conservation was discovered. What this law essentially does is, it makes a connection between mechanical energy, i.e. the sum of kinetic and potential energy, which you might or might not remember from school, and energy associated with heat. Basically, when machines use heat to move things around (think a steam engine), this law establishes a precise quantitative relationship between the heat administered and the power expressed by the engine, or how fast the ship (or whatever) moves. The same holds when motion generates heat (think friction). One of the main advocates of the concepts of energy and of energy conservation was a German professor named
Hermann von Helmholtz141, whom I am going to cite quite a bit in the next pages, not least because he also was a very good scientific writer, and the author of a very popular popular-science book which was, in fact, a collection of science lectures that he gave to "general" audiences all over Germany, on all sorts of topics, over the course of his career; the book is called Popular Scientific Lectures142; an excerpt (the first of a few) follows; it is from a lecture titled "On the origin of the planetary system", which was "delivered in Heidelberg and Cologne, in 1871": "All life and all motion on our earth is, with few exceptions, kept up by a single force, that of the sun's rays, which bring to us light and heat143. [...] The sun, in fact, drives on earth a kind of steam-engine whose performances are far greater than those of artificially constructed machines. The circulation of water in the atmosphere raises [...] the water evaporated from the warm tropical seas to the mountain heights; it is, as it were, a water-raising engine of the most magnificent kind, with whose power no artificial machine can be even distantly compared. [...] For a long time experience had impressed on our mechanicians that a working force cannot be produced from nothing; that it can only be taken from the stores which nature possesses; which are strictly limited, and which cannot be increased at pleasure whether it be taken from the rushing water or from the wind; whether from the layers of coal, or from men and from animals, which cannot work without the consumption of food. [...] [Hence] the all-ruling natural law of the Conservation of Force144.
No natural process, and no series of natural processes, can be found, however manifold may be the changes which take place among them, by which a motive force can be continuously produced without a corresponding consumption." In other words, everything is governed by a "force" called energy, which can be neither created nor destroyed, but constantly changes form: and heat is but one form of energy. What all this meant for people like Darwin and Lyell and Phillips was that there was now another "constraint" for the age of the earth. Because, remember Chap. 2, the earth can't be older than the sun. And if energy is conserved, that means that the energy that's consumed as the sun gives away its heat must have been converted from some other form. And if we knew what that "other form" is, we could probably try to figure out for how long it might reasonably have existed in the past, and/or last in the future; which means we'd get, at least, an order of magnitude for the total life expectancy of the sun, the earth, etc. Or in other words: the sun is a machine that produces and radiates heat: if we knew how that machine worked ("whence does the sun derive this enormous store of force which it sends out?", asks Helmholtz in the Planetary System lecture), we could maybe find some ways to guess for how long it has been heating us and for how long it will continue to do so. So, let's take a step back and see how the idea of energy (and its conservation) developed, because it's very important and not trivial. It all happens roughly around the time of the industrial revolution (when people were giving a lot of thought to engines and all that). The very word energy is invented around that time, imported from the Greek by Thomas Young145 in his Lectures on Natural Philosophy (1807).
At that point, the link between mechanical energy and heat had not been made yet; Young was just stating the principle of vis viva, or living force (what we now prefer to call kinetic energy) being “proportional to the labour expended in producing
94
4 The Age of the Earth
[the] motion". The vis viva/kinetic energy concept was at least as old as Newton: e.g. Leibniz had called vis viva the product of mass times squared velocity of an object, and stated (based on observation, I guess) that the total vis viva of a bunch of objects is conserved, i.e. stays constant, after elastic impacts. Now Young in 1807 was saying something more than Leibniz: he is saying that in general—not just limited to elastic impacts—something that he calls labour (later: work) can be turned into motion, or more precisely into vis viva.
4.8 Conservation of Mechanical Energy

This idea leads to the so-called law of conservation of mechanical energy. Unlike Newton's laws, conservation of mechanical energy was not discovered empirically, but derived mathematically146 from Newton's laws themselves (and then validated empirically in a billion experiments). You can find the derivation, e.g., in Laplace's Celestial Mechanics (Eq. (143) in book 1, Chap. V, paragraph 19 of the Bowditch translation147; if you look at the French version, the equation is not numbered, but identified with the letter Q), or in Lagrange's Mécanique Analytique, etc. As far as I am concerned, though, my favourite vintage derivation of the law is the one in Coriolis'148 1829 treatise, Du Calcul de l'Effet des Machines, ou Considerations sur l'Emploi des Moteurs et sur leur Évaluation (Calculation of the Effect of Machines, or Considerations on the Use of Engines and their Evaluation). Coriolis understands that mechanical energy conservation is of help in solving a lot of practical problems in physics and engineering, so he derives it carefully in the first pages of his book, making it the basis for all he's going to do next. He starts off with Newton's

\[ \mathbf{F} = m \mathbf{a}, \tag{4.1} \]

where of course \(\mathbf{F}\) is force (a vector) and \(m\) is mass (a scalar) and \(\mathbf{a}\) is acceleration (another vector). Then, Coriolis dot-multiplies both sides by the infinitely small displacement \(d\mathbf{u}\) that \(m\) covers in an infinitely short time \(dt\),

\[ \mathbf{F} \cdot d\mathbf{u} = m\, \mathbf{a} \cdot d\mathbf{u}. \tag{4.2} \]

Now imagine you've got a set of material points, each identified by an index \(i\), and of course Newton's law holds for each of these material points individually, so

\[ \mathbf{F}_i \cdot d\mathbf{u}_i = m_i\, \mathbf{a}_i \cdot d\mathbf{u}_i, \tag{4.3} \]
where \(\mathbf{F}_i\) is whatever total force is applied to the material point \(i\), \(d\mathbf{u}_i\) is its displacement, etc. One can sum over all material points (this is similar to stuff we did in Chap. 2 when we were studying the earth's inertia tensor), say there's \(N\) of them,

\[ \sum_{i=1}^{N} \left( \mathbf{F}_i \cdot d\mathbf{u}_i \right) = \sum_{i=1}^{N} \left( m_i\, \mathbf{a}_i \cdot d\mathbf{u}_i \right). \tag{4.4} \]
Now consider the dot product within the sum at the right-hand side, a_i · du_i ; let v_i denote the velocity of the ith material point, so that v_i = du_i/dt and a_i = dv_i/dt. But then a_i · du_i = (dv_i/dt) · (du_i/dt) dt = (dv_i/dt) · v_i dt = v_i · dv_i , and

Σ_{i=1}^{N} (F_i · du_i) = Σ_{i=1}^{N} (m_i v_i · dv_i) .    (4.5)
This equation holds true at any moment in time. In Coriolis’ words (my translation): “since the previous equation is true during all movement, and since all the differential elements it contains correspond to the same time increment, dt, then we can take the integral of both its sides, between any two instants in time.” So, what Coriolis does is, he integrates left and right between two arbitrary times, call them 0 and t. Before we do that, to make sure everything is clear, let’s rewrite (4.5) in a more explicit form, i.e. spell out the dot products,

Σ_{i=1}^{N} Σ_{k=1}^{3} F_{i,k} du_{i,k} = Σ_{i=1}^{N} Σ_{k=1}^{3} m_i v_{i,k} dv_{i,k} ,    (4.6)
where F_{i,k} is the kth component of vector F_i , etc. Then, the integral reads

Σ_{i=1}^{N} Σ_{k=1}^{3} ∫_{u_{i,k}(0)}^{u_{i,k}(t)} F_{i,k} du_{i,k} = Σ_{i=1}^{N} Σ_{k=1}^{3} ∫_{v_{i,k}(0)}^{v_{i,k}(t)} m_i v_{i,k} dv_{i,k}
= Σ_{i=1}^{N} Σ_{k=1}^{3} (1/2) m_i [ v_{i,k}²(t) − v_{i,k}²(0) ]
= Σ_{i=1}^{N} [ (1/2) m_i |v_i(t)|² − (1/2) m_i |v_i(0)|² ] ,    (4.7)
and, writes Coriolis, “that is the formula known under the name of equation of the living forces”, the vis viva equation. You might write it in a slightly different form, leaving at the right-hand side only the terms that are constant with respect to time,

Σ_{i=1}^{N} (1/2) m_i |v_i(t)|² − Σ_{i=1}^{N} Σ_{k=1}^{3} ∫_{u_{i,k}(0)}^{u_{i,k}(t)} F_{i,k} du_{i,k} = Σ_{i=1}^{N} (1/2) m_i |v_i(0)|² :    (4.8)
if the right-hand side is constant, and the left-hand side is equal to the right-hand side, then the left-hand side has to be constant as well: i.e., the left-hand side of (4.8), which is precisely what we call “mechanical energy”, is conserved.
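None of this is mysterious, and it is easy to check numerically: push a single particle around with a known force, accumulate F · du step by step, and compare the total with the change in (1/2) m |v|². A minimal sketch in Python (the mass, initial velocity and step size are arbitrary choices of mine, not from the text):

```python
# One particle (N = 1) in a uniform gravity field: accumulate the work
# integral, i.e. the sum of F · du over many small steps, and compare it
# with the change in kinetic energy (1/2) m |v|^2, as in Eq. (4.7).
m = 2.0                       # mass, kg
gx, gy = 0.0, -9.81           # acceleration of gravity, m/s^2
vx, vy = 3.0, 10.0            # initial velocity, m/s
dt = 1.0e-5                   # a small time increment, s

ke_start = 0.5 * m * (vx * vx + vy * vy)
work = 0.0
for _ in range(100_000):                 # one second of motion
    dux, duy = vx * dt, vy * dt          # infinitesimal displacement du
    work += m * gx * dux + m * gy * duy  # F · du, with F = m g
    vx, vy = vx + gx * dt, vy + gy * dt
ke_end = 0.5 * m * (vx * vx + vy * vy)

# The two numbers agree to within the discretization error:
print(work, ke_end - ke_start)
```

The particle is thrown upward, so both numbers come out negative here: gravity does negative work while the particle rises, and the kinetic energy drops by exactly that amount.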
Coriolis was not the first to derive this equation, but he has the merit of calling, for the first time, “work” the term at the left-hand side of (4.7) and also of deciding that one should call living force (AKA “kinetic energy”) not the product of mass times squared velocity, but one half of that149 . Which makes sense, if you look at eqs. (4.7) or (4.8). The name, work, eventually stuck, and in fact today they teach you that work is the integral of force times displacement150 . Now: there’s more to the concept of energy than just work and motion. The main reason Helmholtz and others are remembered today is that they saw a connection between the idea of energy as formulated e.g. by Coriolis, above, and the idea of heat151 ; and this involved more than just playing around (mathematically speaking) with Newton’s equations.
4.9 Heat

To understand how this came about, and the relevance of this step, we need to be aware of what “heat” meant before the advent of generalized energy conservation, i.e. of a new law of energy conservation that would involve not only work and kinetic energy, but also heat. Back in the eighteenth century, most people, or at least people involved in research on heat, would tend to think that heat was an element, in the sense defined by Robert Boyle152 , and so they worked to isolate that element. Before Boyle, you can maybe find the same idea in the Greek tradition, where “fire” was called an “element” just like water and earth and air. Then the alchemists figured that you need more than four categories to describe the substances found in nature, and introduced the “principle” of “sulfur”, which is hard, now, to distinguish from the substance that is also called sulfur; I imagine that eventually someone figured out that you do not necessarily need sulfur to start a fire, and so it became apparent that heat/fire, whether or not you called it an “element”, did not correspond to any easily identified substance. Seventeenth-century (al)chemists thus postulated the existence of an invisible substance, sort of like Newtonian aether, that was first called terra pinguis and then, by Georg Ernst Stahl153 , phlogiston154 , which is Greek for “burnt up”. Stahl’s phlogiston theory stipulated that materials could be more or less rich in phlogiston; specifically, combustible stuff is rich in phlogiston and non-combustible stuff is poor in phlogiston; phlogiston is lost through combustion; so, e.g., wood is rich in phlogiston but there’s no phlogiston left in ashes, which can’t burn anymore. So then in the eighteenth century we have researchers doing chemical reactions where they try to isolate phlogiston.
Henry Cavendish155 thought that he had achieved that in 1766, when he catalogued the properties of an inflammable gas that one gets when you combine acids with metals. (Lavoisier gave it its current name, “hydrogen”, some twenty years later, but Cavendish is credited with its discovery.) Cavendish, as we know, was very good at weighing things: and he was also the first, apparently, to weigh volumes of different gases to compare their
density. Because hydrogen was really light—fourteen times lighter than air—, plus flammable, he figured what he had found was actually phlogiston. In 1774, Joseph Priestley156 was playing around with mercury and air when, in one of the reactions he did, he managed to produce a gas with the peculiar property that combustibles burned more brilliantly and rapidly in it than in air. He made the inference that gas was poor in phlogiston: because, being poor in phlogiston, it would absorb it more easily than air did—and in phlogiston theory combustion goes hand in hand with phlogiston loss, and so faster phlogiston loss means faster combustion. Priestley called his new gas “dephlogisticated air”. The phlogiston theory was eventually killed by the work of Lavoisier, who probably partly because of that often gets called “the father of modern chemistry”157 . Lavoisier had the idea to do his chemical experiments in closed containers, checking the weight of the containers (and their contents) before and after an experiment. So, for instance, Lavoisier heated metals like tin and lead. When you do that, above a certain temperature, a layer of ashy stuff called “calx” forms on the surface of the metal. Lavoisier verified that the total weight of the closed vessel (incl. the metal, calx, and air it contained) was exactly the same before and after the experiment. Calx turns out to be heavier than the metal it comes from, so then Lavoisier figured that the combustion of metal must have taken some mass from the air within the vessel: and in fact, as soon as he opened the vessel, air would rush in, and the weight of the vessel plus its contents would grow. Lavoisier’s result shows that there is no loss of mass—actually, there is a gain in the mass of the metal-plus-calx combination after versus before combustion. 
This meant that there was a problem with phlogiston theory: because, in phlogiston theory, calx is phlogiston-poor, so then (i) if combustion means a loss in phlogiston, then why doesn’t the mass of metal-plus-calx go down after combustion? and (ii) given that the air has lost some of its mass during combustion, and that the vessel was closed, where did the phlogiston go? All this is pretty weird... unless you postulate that phlogiston has zero mass—which is weird, too. And that’s, basically, how phlogiston theory was abandoned. Incidentally, the fact that mass doesn’t get created or destroyed in chemical reactions, which Lavoisier also established with these experiments, is also very important: today we call it the law of conservation of mass. Lavoisier redid Cavendish’s and Priestley’s experiments in his own way, and understood that what Cavendish had thought was phlogiston was one gas, and Priestley’s “dephlogisticated air” was another. He figured that when heated metals turn to calx, the oxygen is sucked out of the air and into the calx. Another gas which is also massively present in air is nitrogen, which Lavoisier called “azote”: what Priestley had done is, he had extracted all the oxygen away from the air—and oxygen is best at supporting combustion and, as a result, life; azote, which accounts for most of what was left, doesn’t support combustion at all: and as such doesn’t support life: and azote is Greek for “no life”. In Martin Barnett’s158 words: “the [...] researches of Lavoisier (1775) on the weight relations involved in combustion were quite sufficient to cast out from the minds of most chemists the 2,000-year old notion that burning is a process of decomposition and to establish in its place the postulate that combustion consists in the union
of the burning substance with the oxygen of the atmosphere. [...] The mysterious phlogiston had no place in a description of the weight changes incurred by burning.” Still, Lavoisier would say that something is liberated in burning, and in fact “our modern phraseology”, says Barnett, “still bears witness to that reincarnation, for we say that heat is liberated in burning.” The bottom line for Lavoisier is that heat can be thought of as a substance, but that substance does not correspond to any known element—not even hydrogen, which seemed to be such a good candidate—and, in addition, it must be imponderable. Another chunk of what would later become the principle of generalized energy conservation comes from Joseph Black, whom we’ve met already (very briefly) in relation to the work of James Hutton. Before I go on, though, let me clarify what is meant by heat and temperature. “If a body of definite nature and weight”, writes Joseph Fourier in his famous 1822 book, Théorie Analytique de la Chaleur, or Analytical Theory of Heat, which we’ll look at in some detail later in this chapter; if a body “occupies a volume V at temperature 0, it will occupy a greater volume V + D, when it has acquired the temperature 1.” Let’s call C the amount of heat you need to administer to the body so that its temperature is raised from 0 to 1. “If,” says Fourier, “instead of adding this quantity C, a quantity zC is added (z being a number positive or negative) the new volume will be V + d instead of V + D. Now experiments shew that if z is equal to 1/2, [...] d is [also] half the [...] increment D, and that in general the value of d is z D, when the quantity of heat added is zC. [...] The ratio z of the two quantities zC and C of heat added, which is the same as the ratio of the two increments of volume d and D, is that which is called the temperature”.
In other words: it is a fact that, at fixed pressure, say atmospheric pressure, if you administer heat to a given mass of material, its volume grows (and if you take heat away, the volume shrinks); if you administer twice as much heat, the volume grows twice as much; if you administer half as much heat, the volume grows half as much, and so on and so forth: heat and change in volume are proportional to one another. So take a reference substance, and call “0” the temperature of a chosen mass of this substance when it occupies a volume V (at, say, sea-level pressure). Administer it a certain amount of heat, C, measure the volume change (call it D), call “1” the temperature of the same mass of the same substance after it’s received such amount of heat. Now, the thing is, if you administer z times the same amount of heat as before, the volume also grows by an amount z D, i.e. z times bigger than before. So, (i) z can be measured by looking only at the volume change of the substance to which heat is administered, but (ii) z also tells us about how much heat has been received, and (iii) it is decided that z is called temperature: and that’s what I will mean (or have meant) every time I’ll refer (or have referred) to temperature in this book. Now that that is out of the way, to understand Black’s contribution, consider some experiments done by Daniel Gabriel Fahrenheit159 and interpreted by Herman Boerhaave160 in his widely circulated textbook. “Fahrenheit found”, says Barnett, “that two equal volumes of water at different temperatures attain, on rapid mixing, a temperature which is the arithmetical mean of the two initial temperatures, but that when equal volumes of water and mercury at different temperatures are mixed, the
temperature of the mixture is higher or lower than the arithmetical mean, depending on whether water is the warmer or colder constituent. Only when two volumes of water were mixed with three of mercury was the final temperature found to be the arithmetical mean of the initial temperatures.” You can think of this as saying that two volumes of water have the same heating effect as three volumes of mercury. Fahrenheit also measured the mass of the materials he combined, and found “one weight of water to have the same heating effect, per degree change of temperature, as twenty weights of mercury”. Boerhaave doesn’t think that the amount of heat carried by a substance could depend on the very nature of that substance. So he takes the easy way out and, interpreting Fahrenheit’s observation, “sees fit to state that heat is probably distributed between bodies at the same temperature in accordance with their volumes.” This means that one volume of water should have the same heating effect as one volume of mercury: which is not Fahrenheit’s result: but Boerhaave thinks that the deviation of three-halves from unity can be attributed to experimental error. (That of twenty from unity couldn’t, really, so Boerhaave excluded that heat should be distributed between bodies according to their mass.) Now, Joseph Black saw Fahrenheit’s results from a different angle. Black thinks, says Barnett, that “Fahrenheit’s experiments clearly show that ‘the same quantity of the matter of heat has more effect in heating quicksilver than in heating an equal measure (volume) of water... Quicksilver, therefore, has less capacity for the matter of heat than water... has’; in fact, mercury has only two-thirds the capacity for heat that water has.
Heat, then, is distributed in a body, neither according to the body’s mass nor its volume but ‘according to its (the body’s) particular capacity, or its particular force of attraction for this matter.”’ (The last sentence is a quote from Black’s Lectures on the Elements of Chemistry.)
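Fahrenheit’s mixing results follow from assuming that the final temperature of a mixture is the capacity-weighted mean of the initial temperatures. Here is a small sketch, taking Black’s figure of two-thirds for mercury’s capacity per volume relative to water (the initial temperatures, 0 and 100, are arbitrary choices of mine):

```python
def mix(vol1, cap1, temp1, vol2, cap2, temp2):
    """Final temperature of a mixture: the capacity-weighted
    average of the two initial temperatures."""
    w1, w2 = vol1 * cap1, vol2 * cap2
    return (w1 * temp1 + w2 * temp2) / (w1 + w2)

water, mercury = 1.0, 2.0 / 3.0   # heat capacity per unit volume (water = 1)

# Equal volumes of water: the arithmetical mean.
print(mix(1, water, 100, 1, water, 0))      # 50.0

# Equal volumes of water and mercury: above the mean (about 60)
# when water is the warmer constituent...
print(mix(1, water, 100, 1, mercury, 0))
# ...and below it (about 40) when water is the colder one.
print(mix(1, water, 0, 1, mercury, 100))

# Two volumes of water against three of mercury: the weights balance
# (2 × 1 = 3 × 2/3), and the arithmetical mean (50) is recovered.
print(mix(2, water, 100, 3, mercury, 0))
```

The last case is exactly Fahrenheit’s observation: two volumes of water have the same heating effect as three volumes of mercury.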
4.10 Heat Capacity

What Black is saying here is that if you administer the same amount of heat to the same mass of different substances, the temperature of each substance will vary by a different amount161 . We might now write an equation that some of you probably have seen in school,

Q = c_m m dT,    (4.9)

where c_m is what Black calls “capacity for heat”, Q is heat, m is mass, and dT is the difference in thermometer reading before and after administering heat. Now, to understand what Barnett/Black means with “neither according to the body’s mass nor its volume”, you might want to write m = ρV, where ρ is mass density and V is volume. Then you can also rewrite the equation above,

Q = c_m ρ V dT.    (4.10)
Consider that, just like c_m , the density ρ is a property of the substance that remains stable unless one changes significantly the pressure and temperature in the laboratory. So, let us call c_V = c_m ρ: and here you have an alternative definition of heat capacity, in terms of volume rather than mass. In fact you can write,

Q = c_V V dT,    (4.11)
where c_V depends on the substance that you experiment with: just like c_m . This last equation works exactly as the previous equation for Q, except that we now have V instead of m. After Black, people start measuring systematically the heat capacity of substances162 . They still think of heat itself as some kind of special, imponderable substance, which after Lavoisier is called “caloric”.
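To make “capacity for heat” concrete, here is a quick sketch with rough modern handbook values (about 4186 J/(kg K) for water and 140 J/(kg K) for mercury; these numbers are mine, not Black’s): the same heat Q, administered to equal masses, produces temperature changes in inverse proportion to c_m, and the volume-based form (4.11) gives back the same Q.

```python
c_water, c_mercury = 4186.0, 140.0   # c_m, J/(kg K), rough handbook values
m = 1.0                              # one kilogramme of each substance
Q = 4186.0                           # administer the same heat to both, J

# Invert Eq. (4.9), Q = c_m m dT, for the change in temperature:
dT_water = Q / (c_water * m)         # 1 degree
dT_mercury = Q / (c_mercury * m)     # about 30 degrees, for the same heat

# The volume-based form (4.11), with c_V = c_m * rho, is equivalent:
rho_water = 1000.0                   # kg/m^3
V = m / rho_water                    # volume occupied by the water, m^3
c_V = c_water * rho_water
print(dT_water, dT_mercury, c_V * V * dT_water)   # last number equals Q
```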
4.11 Conservation of Energy, “Generalized”

The idea of caloric was quite convincing, because of its enormous, as they say, explanatory power, i.e., it provided clear and simple solutions to a number of problems that had been weird for quite some time; and it was widely accepted. But then a few scattered researchers around Europe started looking at things in a different way and came up with an alternative theory that turned out to be even stronger and more convincing and with even more explanatory power, etc.: generalized energy conservation. Now, I realize that I’ve been going on for quite a few pages about chemistry and physics. I haven’t forgotten that my project is to write a book about the earth sciences; but I have to ask you to bear with me for a little longer, because, the way my book works, I cannot not spend a few paragraphs to tell you about how the law of energy conservation came about. The story has three characters: Helmholtz, whom we’ve met already; Julius von Mayer; James Joule. As for Helmholtz, he’s the great, academic-type physicist who had a clear perception of what was happening in science in his time, understood what Mayer and Joule were doing, and brought it all together in his treatise Über die Erhaltung der Kraft, or On the Conservation of Force163 , published in 1847 (the word “force”, here, should be intended as a synonym of energy: physicists hadn’t yet agreed on what words to use for what). That treatise everybody would then cite; it became a milestone, and serves as an answer to the question of when, and by whom, the principle of conservation of energy was established. Although Joule is a good answer as well: see his first publication, which came out in 1843. And, still today, the standard unit of measurement for energy is called the joule (J). Fewer people would remember the name of Mayer; but it seems that, technically, he’s the first to actually have published the idea.
Here’s from an article on “The Century’s Progress in Physics”, which came out in Harper’s New Monthly Magazine in 1897, i.e. some fifty years after Joule and company did their thing, signed by one Henry Smith Williams164 : “In 1842, Dr. Julius Robert Mayer, practising physician165 in the little German town
of Heilbronn, published a paper in Liebig’s Annalen on ‘The forces of inorganic nature’, in which not merely the mechanical theory of heat but the entire doctrine of the conservation of energy was explicitly if briefly stated.” Mayer looked at a mechanism, powered by a horse, that stirred paper pulp in some sort of large kettle. He somehow measured both the work done by the horse, and the temperature rise in the pulp, and figured that it made sense to think of an entity named “energy” that had transferred from the horse’s muscles, to the machine’s motion, to the paper pulp’s temperature; and that a quantitative relationship—a mathematical formula—could be established, that connected the amount of work done by the horse to the change in degrees of temperature of the pulp. Which is the principle of energy conservation. So Mayer wrote a paper about this and submitted it to the Annalen der Physik, which was the most important physics journal in Germany: they rejected the manuscript, which is weird if you think how important Mayer’s idea would eventually turn out to be. The paper was accepted by Justus von Liebig, a famous German chemist, for the journal he himself edited. “The great principle he had discovered became the dominating thought of [Mayer’s] life,” writes Williams, “and filled all his leisure hours. [...] Yet for a long time his work attracted no attention whatever166 . In 1847, when another German physician, Hermann von Helmholtz, one of the most massive and towering intellects of any age, had been independently led to comprehension of the doctrine of conservation of energy, and published his treatise on the subject, he had hardly heard of his countryman Mayer. When he did hear of him, however, he hastened to renounce all claim to the doctrine of conservation, though the world at large gives him credit of independent even though subsequent discovery. 
“Meantime in England, Joule was going on from one experimental demonstration to another, oblivious of his German competitor, and almost as little noticed by his own countrymen. He read his first paper before the chemical section of the British Association for the Advancement of Science in 1843, and no one heeded it in the least. Two years later he wished to read another paper, but the chairman hinted that time was limited, and asked him to confine himself to a brief verbal synopsis of the results of his experiments. Had the chairman but known it, he was curtailing a paper vastly more important than all the other papers of the meeting put together. However, the synopsis was given, and one man was there to hear it who had the genius to appreciate its importance. This was William Thomson, the present Lord Kelvin, now known to all the world as among the greatest of natural philosophers, but then only a novitiate in science. He came to Joule’s aid, started rolling the ball of controversy, and subsequently associated himself with the Manchester experimenter in pursuing his investigations.” The experiment, or suite of experiments, concocted by Joule is based on the same principle as Mayer’s experiment with the horse: perform mechanical work to create motion; motion causes friction within some substance; friction generates heat, which in turn causes a change in temperature of the substance; measure both the work performed and the resulting change in temperature. Convert the change in temperature to a measurement of heat, using the known heat capacity of the substance. Compare the measurement of heat with that of work.
Fig. 4.6 Joule’s apparatus, as sketched in his original paper, plus my annotations to show what is what
To implement this, Joule built a machine where a paddle wheel (think a water mill) is immersed in a can filled with water; the wheel is moved by falling weights, similar to some old clocks. “For each degree of heat evolved by the friction of water”, writes Joule167 , “a mechanical power equal to that which can raise a weight of 890 lb to the height of one foot had been expended. The paddle moved with great resistance in the can of water, so that the weights (each of four pounds) descended at the slow rate of about one foot per second. The height of the pulleys from the ground was twelve yards, and consequently, when the weights had descended through that distance, they had to be wound up again in order to renew the motion of the paddle. After this operation had been repeated sixteen times, the increase of the temperature of the water was ascertained by means of a very sensible and accurate thermometer” (Fig. 4.6). The friction between rotating paddles and resisting water raised the temperature of water. So the work done by gravitational force as it pulls down the weight is transformed into heat. Here’s the account given by Helmholtz in his 1862–63 Carlsruhe lectures: “[Joule’s] experiments show that when heat is produced by the consumption of work, a definite quantity of work is required to produce that amount of heat which is known to physicists as the unit of heat; the heat, that is to say, which is necessary to raise [the temperature of] one gramme of water through one degree centigrade. The quantity of work necessary for this is, according to Joule’s best experiments, equal to the work which a gramme would perform in falling through a height of 425 m.” (i.e., equal to the work that gravitational force from the earth performs in pulling a gramme of mass downward, through a height of 425 m.) 
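Helmholtz’s figure is easy to check: the work that gravity does on one gramme falling through 425 m should amount to one unit of heat, i.e. one calorie. A two-line sketch (the value of g is the modern standard one, my addition):

```python
g = 9.81          # acceleration of gravity, m/s^2
m = 1.0e-3        # one gramme, in kg
h = 425.0         # Joule's equivalent height, per Helmholtz, in m

work = m * g * h  # work done by gravity over the fall, in joules
print(work)       # about 4.17 J; compare the modern 4.184 J per calorie
```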
The point here, which Helmholtz could perhaps have been a bit more explicit about, is that if you perform the same work on a gramme of mercury, rather than water, the temperature of mercury rises by a different amount, but if you convert
the change in temperature to heat through mercury’s heat capacity, you find that the heat received by mercury is indeed the same. Which means, to Joule and Helmholtz, and, after a while, to everybody else, that the phenomenon by virtue of which a body is heated does not resemble a chemical reaction, but rather a Newtonian-type mechanical process. “In order to show how closely concordant are his numbers,” continues Helmholtz, “I will adduce the results of a few series of experiments which he obtained after introducing the latest improvements in his methods.” (That is, Joule repeated his experiment in various ways, checking how much work was needed to administer one unit of heat to different substances—or to the same substance, but through different mechanisms.) “1. A series of experiments in which water was heated by friction in a brass vessel. In the interior of this vessel a vertical axle provided with sixteen paddles was rotated, the eddies thus produced being broken by a series of projecting barriers, in which parts were cut out large enough for the paddles to pass through. The value of the equivalent was 424.9 m.” “2. Two similar experiments, in which mercury in an iron vessel was substituted for water in a brass one, gave 425 and 426.3 m. “3. Two series of experiments, in which a conical ring rubbed against another, both surrounded by mercury, gave 426.7 and 425.6 m.” The point here is that, whatever the experimental setup, you always need (within reasonable experimental error) the same amount of mechanical work to produce the same amount (in this case, one unit) of heat. “Exactly the same relations between heat and work”, explains Helmholtz, and this is very important, too, “were also found in the reverse process—that is, when work was produced by heat. [...] A gas which is allowed to expand with moderate velocity becomes cooled. Joule was the first to show the reason of this cooling. 
For the gas has, in expanding, to overcome the resistance which the pressure of the atmosphere and the slowly yielding side of the vessel oppose to it [...]. Gas thus performs work, and this work is produced at the cost of its heat. Hence the cooling. “[...] Thus then: a certain quantity of heat may be changed into a definite quantity of work; this quantity of work can also be retransformed into heat, and, indeed, into exactly the same quantity of heat as that from which it originated; in a mechanical point of view, they are exactly equivalent. Heat is a new form in which a quantity of work may appear. “These facts no longer permit us to regard heat as a substance, for its quantity is not unchangeable. It can be produced anew from the vis viva of motion destroyed; it can be destroyed, and then produces motion. We must rather conclude from this that heat itself is a motion, an internal invisible motion of the smallest elementary particles of bodies.” So now we know what energy is, sort of, and we know that heat is but a form of energy; but that was quite a detour, and maybe you’re wondering how we got here in the first place. I am sorry for not being able to find a more schematic way of telling this story. (I promise you, I tried to do it in a few different ways, and this one still feels like the best.) Anyway, remember: the energy thing started a bunch of pages back, when after telling you about Darwin and Phillips and co. and their
vague and controversial ways of estimating the absolute duration of certain geological processes and establishing a lower limit for the age of the earth... after showing you that that approach just wouldn’t go very far, I said, OK, but: at about the same time, physicists like Joule and Mayer and Helmholtz came up with energy conservation, and that totally changed the picture. Because one implication of energy conservation is, all machines work by converting one form of energy into another, but they must get their energy from somewhere—some kind of fuel (i.e., a chemical that releases energy through chemical reactions), external forces applied to it, etc. The thing now is, you can think of the sun as a machine for producing heat. And so based on energy conservation, the energy that the sun turns into heat must come from somewhere. Like I said earlier, if one can figure out from where it comes, and by virtue of what process it becomes heat, one could place some new constraints on the age of the solar system, etc. So, what we are going to do next is, we are going to get back to Helmholtz’ Popular Scientific Lecture on the origin of the planetary system, because after reviewing energy conservation, in the same lecture Helmholtz goes on to present three different hypotheses about the functioning of the sun.
4.12 The Sun’s Energy Output and the Sun’s Age

“On earth”, writes Helmholtz, “the processes of combustion are the most abundant source of heat. Does the sun’s heat originate in a process of this kind? To this question we can reply with a complete and decided negative”. Why? Because “we now know that the sun contains the terrestrial elements with which we are acquainted”: it is clear at this point that the sun is not made of some strange elements that don’t exist on earth168 . We also know—Helmholtz and his contemporaries knew—how much energy the sun spits out per unit time. This had been measured in the 1830s by Claude Pouillet, a French physicist teaching at the Sorbonne. What Pouillet did is, he invented an instrument that he called pyrheliometer169 , to measure how much heat per unit time per unit area is received by the earth from the sun (which, incidentally, for whatever reason, people like to call the “solar constant”). He also figured a way to correct for the time spent in the atmosphere by the rays of light caught by his instrument170 , and then after averaging lots of measurements made at different times, he came up with the value 1228 J/(s × m²) for the solar constant171 . From the solar constant (energy per unit time per unit area) one can figure out the total energy emitted by the sun per unit time: just multiply it by the total area over which it is spread. Remember (Chap. 1) that the distance between earth and sun is about 1.5 × 10¹¹ m; remember that the area of a sphere is 4π times its squared radius; so then the area we are talking about is 4π × (1.5 × 10¹¹ m)², which is roughly 2 × 10²³ m². Multiply this by Pouillet’s number, and the energy emitted by the sun per unit time is 2 × 10²⁶ J/s, or so. So then let’s take some substance that’s very good at producing heat when you burn it, and see if that could account for the heat produced by the sun. For instance: coal. The heat released when burning a given amount of coal can be measured; it’s
3 × 10⁷ J/kg. So let’s say, to fix ideas, that the sun is all coal. The mass of the sun (which, again, had been figured out from all the science we’ve seen in Chap. 1, i.e. astronomy data plus Newton’s laws) is about 2 × 10³⁰ kg. Multiply mass with energy per unit mass and you find that 6 × 10³⁷ J is the max “potential energy of combustion” that could be stored in the sun (if none of it had burned yet...). Now, because we know the rate at which energy is consumed per unit time, we can figure out how long the sun can last: divide 6 × 10³⁷ J by 2 × 10²⁶ J/s and you get 3 × 10¹¹ s, which is the life expectancy of the sun and in years is about 10,000. Which is not a lot. In his lecture, Helmholtz actually uses the combustion of hydrogen rather than coal, and finds an even smaller figure: about 3000 years. “That, it is true, is a long time,” says Helmholtz, “but even profane history teaches that the sun has lighted and warmed us for 3000 years, and geology puts it beyond doubt that this period must be extended to millions of years.” Hence, the “complete and decided negative”: “Known chemical forces are [...] so completely inadequate [...] to explain the production of heat which takes place in the sun, that we must quite drop this hypothesis.” Helmholtz figures “we must seek for forces of far greater magnitude, and these we can only find in cosmical attraction” (which means gravitational attraction between cosmic bodies).
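The whole back-of-the-envelope argument fits in a few lines of Python; the inputs are the rounded figures from the text:

```python
solar_output = 2e26    # J/s, energy radiated by the sun per unit time
coal_energy = 3e7      # J/kg, heat released by burning a kg of coal
sun_mass = 2e30        # kg, mass of the sun

total_energy = coal_energy * sun_mass      # 6e37 J, if the sun were all coal
lifetime_s = total_energy / solar_output   # 3e11 s
lifetime_years = lifetime_s / 3.15e7       # about 10,000 years: not a lot
print(lifetime_years)
```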
The idea is that when, e.g., a meteor hits a planet, its kinetic energy is turned into heat; “the comparatively small masses of shooting-stars and meteorites”, says Helmholtz, “can produce extraordinarily large amounts of heat when their cosmical velocities are arrested by our atmosphere.” Because we know that an important number of meteors hit the earth on a regular basis, “we may assume with great probability that very many more meteors fall upon the sun than upon the earth, and with greater velocity, too.” And so, Helmholtz explains, Julius von Mayer had thought172 that perhaps “the entire amount of the sun’s heat which is continually lost by radiation, is made up by the fall of meteors, a hypothesis which [...] has been favorably met”, says Helmholtz, “by several other physicists”. Mayer’s reasoning and calculations are found for example in his book Die Mechanik der Wärme (1867), for which a number of German editions seem to exist, but I haven’t been able to find an English-language one, and what follows is my own translation, so help me God: “A weight which falls onto a celestial body by which it is attracted, arrives at the surface of the attracting celestial body with a final velocity that is the greater, the greater the height from which the weight has begun its fall. But if the weight receives its movement only by falling, this final speed cannot exceed a certain size; it has a maximum, the value of which depends on the nature of the attracting celestial body, namely, on the volume and mass thereof.

“Let us denote R the radius of the spherical celestial body [...], measured in meters; the [acceleration] which the free-falling weight undergoes when near the surface of this celestial body [...], is also measured in meters [over squared seconds], and denoted g. And so the highest speed that is achieved when hitting the surface after a fall from infinite height is the square root of twice the product of g times R (meters per second)”, √(2gR).
We can understand this formula, which Mayer doesn’t explain maybe because it’s too simple for him, if we go back to the vis viva Eq. (4.7), and apply it to a simple case where you’ve got only two masses, call them m and M, of which M (think a
large celestial body: a planet or a star) is so much bigger than m (think a rocket, or a meteor) that we can neglect its displacement. So the two-body system becomes a one-body system, and (4.7) boils down to

\[ \int_{u(0)}^{u(t)} F \, du = \frac{1}{2} m v^2(t) - \frac{1}{2} m v^2(0), \qquad (4.12) \]
where F is the magnitude of gravitational attraction between M and m, and I’ve assumed that m’s initial velocity is either zero, or points along the same direction as the attraction (possibly with opposite sign, e.g. m is a rocket that takes off from planet M), so the problem is reduced to only one dimension, which is along the line connecting the center of the large celestial body and the initial position of the smaller mass. That is also the direction of gravity’s pull from M itself: and since we’re neglecting other forces, m is eventually going to fall towards M along that direction. So displacement u and velocity v might as well be scalar—each equal to its component in the only direction that matters, all other components being zero. Things are simplified even further when you replace F at the left-hand side with Newton’s formula,

\[ \int_{u(0)}^{u(t)} F \, du = -\int_{u(0)}^{u(t)} \frac{GMm}{u^2} \, du. \qquad (4.13) \]

The integral at the right-hand side is easily carried out,

\[ \int_{u(0)}^{u(t)} \frac{GMm}{u^2} \, du = \frac{GMm}{u(0)} - \frac{GMm}{u(t)}. \qquad (4.14) \]
That, in turn, can be plugged into (4.12), and

\[ \frac{GMm}{u(t)} - \frac{GMm}{u(0)} = \frac{1}{2} m v^2(t) - \frac{1}{2} m v^2(0), \qquad (4.15) \]

which you just have to shift things around a bit to rewrite as

\[ \frac{1}{2} m v^2(0) - \frac{GMm}{u(0)} = \frac{1}{2} m v^2(t) - \frac{GMm}{u(t)}. \qquad (4.16) \]
Call K(t) = ½mv²(t) the time-dependent kinetic energy, and call U(t) = −GMm/u(t) the potential energy—which, modulo a sign, is the work done by gravity pulling m in from infinitely far away. The left-hand side of (4.16) is constant over time: it depends only on the initial velocity v(0) of m, and on its initial position u(0). And but (4.16) also stipulates that that constant coincides, at any moment t, with the sum K(t) + U(t); so that constant is nothing but the total mechanical energy of the system, call it E, and you can write

\[ E = K(t) + U(t). \qquad (4.17) \]
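Energy conservation, Eq. (4.17), can be checked numerically. The sketch below (my own illustration, not from the text; the mass and radius are the round solar values used elsewhere in this chapter) integrates a radial fall with a leapfrog scheme and compares the impact speed with the one predicted by (4.16):

```python
import math

G = 6.674e-11    # gravitational constant, SI
M = 2e30         # big mass M (think: the sun), kg
m = 1.0          # small falling mass, kg
R = 7e8          # radius of the big body: the fall stops here, m
u0 = 1e10        # initial distance, m (starting at rest)

def energy(u, v):
    """Total mechanical energy E = K + U, Eq. (4.17)."""
    return 0.5 * m * v * v - G * M * m / u

u, v, dt = u0, 0.0, 1.0
E0 = energy(u, v)
while u > R:                      # leapfrog ("kick-drift-kick") integration
    v += 0.5 * dt * (-G * M / (u * u))
    u += dt * v
    v += 0.5 * dt * (-G * M / (u * u))

# impact speed predicted by (4.16) with v(0) = 0
v_predicted = math.sqrt(2 * G * M * (1 / R - 1 / u0))
drift = abs(energy(u, v) - E0) / (G * M * m / R)
print(f"impact speed {abs(v):.3e} vs predicted {v_predicted:.3e}; energy drift {drift:.1e}")
```

The integrated fall hits the surface at the speed energy conservation demands, and E stays constant to within the scheme’s tiny numerical drift.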
Now consider the case of a meteor (mass m) that’s just hanging out somewhere in the universe, minding its own business, so far from, say, the sun (mass M) that its
potential energy, associated with the sun, is about zero (this is what Mayer meant, in practice, with “infinite height”); in mathematical terms, u(0) is so large that GMm/u(0) ≈ 0. Let the initial velocity of the meteor also be close to 0; it follows that its mechanical energy, E = K(0) + U(0) ≈ 0, too. If no force other than that of the sun is pulling it around (and we can neglect the attraction of other solar-system planets, compared to that of the sun), the meteor is slowly but inexorably bound to fall into the sun. During the fall, all the above math holds, and E is conserved while K(t) grows in the positive, u(t) becomes smaller (but stays positive) and U(t) accordingly grows in the negative. At the moment the meteor hits the surface of the sun, U = −GMm/R, where R is the radius of the sun; E is still zero, because E is conserved, and so it follows from (4.16) and/or (4.17) that

\[ \frac{1}{2} m v^2(t) = \frac{GMm}{R}, \qquad (4.18) \]
where the velocity v(t) at the left-hand side is precisely Mayer’s “highest speed that is achieved when hitting the surface after a fall from infinite height”. You solve (4.18) for v(t) and get

\[ v(t) = \sqrt{\frac{2GM}{R}}, \qquad (4.19) \]

which is not the same as Mayer’s formula. But by Newton’s law, at the surface of the sun gravitational attraction is GMm/R²; and remember that this coincides with mg, with g the acceleration of gravity at the surface of the sun. It follows that GM = gR², and if you plug that into (4.19) you get

\[ v(t) = \sqrt{2gR}, \qquad (4.20) \]
which is the same as Mayer’s formula. You can see from (4.19) or (4.20) that the speed of free fall, v(t), is actually independent of time t: it is a constant, fully defined by the mass M and radius R of the celestial body that does the attracting. It is usually called escape velocity173, actually, and denoted ve. After deriving the formula, Mayer also plugs some numbers into it. In the case of the earth, “g = 9.8164 and R = 6,369,800. Consequently [...] the fall velocity is √(2gR) = √(2 × 9.8164 × 6,369,800) = 11,183 m/s.

“The radius of the sun is 112.05 times larger than that of the earth, but the gravitational acceleration on the solar surface is 28.36 times that on the surface of the earth [again, see above, you can calculate it if you know the mass of the sun, which at this point in this book you have all the ingredients for]. Consequently, for the sun, the maximum fall velocity is √(28.36 × 112.05) × 11,183 = 630,400 m/s”. I am not sure why this should be the maximum fall velocity: what if the meteor happened to have nonzero initial velocity, in the direction of the sun? But anyway, it sounds like a decent order-of-magnitude estimate for the fall velocity of your average bit of outer-space material.
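Mayer’s two numbers fall straight out of Eq. (4.20)—his own values of g and R for the earth, and his ratios for the sun:

```python
import math

g_earth = 9.8164          # m/s^2, Mayer's value
R_earth = 6_369_800       # m, Mayer's value
v_earth = math.sqrt(2 * g_earth * R_earth)     # Eq. (4.20)

# the sun: g is 28.36 times, R is 112.05 times the terrestrial values,
# and v = sqrt(2gR) scales as the square root of the product of the two factors
v_sun = math.sqrt(28.36 * 112.05) * v_earth

print(f"earth: {v_earth:,.0f} m/s; sun: {v_sun:,.0f} m/s")
```

which gives back 11,183 m/s for the earth and roughly 630,400 m/s for the sun.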
Mayer then derives a minimum fall velocity, too. He starts off by giving, without proof174, a formula for “the velocity [v] of a body in a central motion at any point in its course [...]. Let a denote [...] half the major axis of the orbit, [...] measured in solar radii; let r denote the variable distance of the orbiting body from the center of gravity, also measured in solar radii; then the velocity of the orbiting body at distance r [...] reads ve√((2a − r)/(2ar)).

“At the moment when the planetary body meets the solar surface, r = 1, hence its velocity ve√((2a − 1)/(2a))”, i.e., v = ve√(1 − 1/(2a)); v then grows with growing a. [...]

“This velocity, like the major axis, has a minimum, for as long as the planetary body moves outside the sun [or, in other words, at least part of its trajectory is not within the volume of the sun, which wouldn’t make sense...], its major axis [2a] can never be smaller than the sun’s diameter, i.e., it can never be less than 2 (solar radii). The smallest conceivable speed of a cosmic body falling on the sun is accordingly ve√(1/2) = 445,750 [m/s] or 60 geogr. miles in 1 sec.”

Note that in Mayer’s formulae distances (a, r) are given in units of the sun’s radius R; if I want to give everything in meters, which Mayer eventually does, the velocity during orbit should really be written ve√(R(2a − r)/(2ar)). And because, by virtue of eqs. (4.19) and (4.20), ve = √(2GM/R), then

\[ v(r) = \sqrt{\frac{GM(2a - r)}{ar}}. \qquad (4.21) \]
So we have a minimum and a maximum velocity of fall: 445,750 m/s and 630,400 m/s, respectively. What Mayer does next to derive estimates of how much of the associated K is released in the form of heat is not clear to me—he’s got what looks like an empirical formula, to convert impact velocity to heat, that I don’t understand; but it’s OK, as a first approximation, to just consider the case where the max possible heat would be produced out of each impact, i.e. just take the maximum velocity (Mayer’s minimum velocity is not that much smaller anyway—same order of magnitude) and assume that all that kinetic energy is turned into heat. So, one kg of matter hitting the sun at velocity ve = 6.3 × 10⁵ m/s has kinetic energy 1/2 × 1 × (6.3 × 10⁵)² J, or 2 × 10¹¹ J. Remember that we know, through the work of Pouillet, the energy output of the sun, which is about 3.6 × 10²⁶ J/s. In a year there are about 3 × 10⁷ seconds, and so over 1000 years the sun would release about 3 × 10¹⁰ s × 3.6 × 10²⁶ J/s, or roughly 10³⁷ J. Divide this by 2 × 10¹¹ J/kg, and you get about 5 × 10²⁵ kg, which is how much mass you need to fall into the sun, every thousand years, to justify the sun’s energy output. The total mass of the sun is about 2 × 10³⁰ kg, so that means that, if Mayer’s right, we’d have to see an increase in the sun’s total mass of a few hundred-thousandths of its current mass every 1000 years. This is in the most efficient case, where meteors are hitting the sun at high fall velocity, and all their kinetic energy is converted to heat. You would need to put even more extra mass in the sun, if the fall velocity was lower, etc.
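The same bookkeeping, condensed (a sketch; the 3.6 × 10²⁶ J/s output is the Pouillet-based figure used in the text):

```python
import math

v_max = 630_400.0               # m/s, Mayer's maximum fall velocity
v_min = v_max / math.sqrt(2)    # m/s, his minimum (orbit grazing the sun, a = 1)

heat_per_kg = 0.5 * v_max**2    # J/kg: all kinetic energy turned into heat
solar_output = 3.6e26           # J/s
millennium = 1000 * 3.15e7      # s

mass_needed = solar_output * millennium / heat_per_kg   # kg per 1000 years
SUN_MASS = 2e30
print(f"minimum fall velocity: {v_min:,.0f} m/s")
print(f"mass needed per millennium: {mass_needed:.1e} kg, "
      f"i.e. {mass_needed / SUN_MASS:.1e} of the sun's mass")
```

The minimum fall velocity comes out at Mayer’s 445,750 m/s (give or take rounding), and the required meteoric diet is several times 10²⁵ kg per thousand years.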
Astronomy says that this change in the sun’s mass is too much, too fast: it should have caused a gradual change in the orbits of planets, for instance, which would have been observed, and but wasn’t—and remember that people have been recording the motion of planets for millennia. In his lecture, Helmholtz concludes that Mayer’s hypothesis “is open [...] to objection175; for, assuming it to hold, the mass of the sun should increase so rapidly that the consequences would have shown themselves in the accelerated motion of the planets. The entire loss of heat from the sun cannot at all events be produced in this way; at the most a portion, which, however, may not be inconsiderable.”

This brings us to the third idea on the age of the sun presented in Helmholtz’ lecture, which is actually Helmholtz’ own idea. “If [...] there is no present manifestation of force sufficient to cover the expenditure of the sun’s heat,” writes Helmholtz, “the sun must originally have had a store of heat which it gradually gives out. But whence this store? [...] If the mass of the sun had been once diffused in cosmical space, and had then been condensed—that is, had fallen together under the influence of celestial gravity—if then the resultant motion had been destroyed by friction and impact, with the production of heat, the new world produced by such condensation must have acquired a store of heat not only of considerable, but even of colossal, magnitude.

“Calculation [which I am going to show you in a minute] shows that, assuming the thermal capacity of the sun to be the same as that of water, the temperature might be raised to 28,000,000 of degrees, if this quantity of heat could ever have been present in the sun at one time. This cannot be assumed, for such an increase of temperature would offer the greatest hindrance to condensation.
It is probable rather that a great part of this heat, which was produced by condensation, began to radiate into space before this condensation was complete. But the heat which the sun could have previously developed by its condensation, would have been sufficient to cover its present expenditure for not less than 22,000,000 of years of the past.”

So what Helmholtz is saying is not that far from Mayer’s proposal, except that in Mayer’s “model” the kinetic energy associated with an impact is turned to heat and radiated away right as the impact occurs, and there is no other source of heat in the sun; in Helmholtz’ model heat is initially stored in the sun, i.e. the sun’s temperature is raised; and then, with some delay, stored heat gets conducted through the sun’s mass, and finally emitted to space.

“The form of meteoric theory which now seems most probable, and which was first discussed on true thermodynamic principles by Helmholtz176,” writes Kelvin177 in 1862, “consists in supposing the sun and his heat to have originated in a coalition of smaller bodies, falling together by mutual gravitation, and generating, as they must do according to the great law demonstrated by Joule, an exact equivalent of heat for the motion lost in collision.” Now, we’ve seen, already, that the kinetic energy carried by a small mass, call it dm now, that falls from infinity into a star of radius r, is given by

\[ dK = \frac{G M(r) \, dm}{r}; \qquad (4.22) \]
this is the same as Eq. (4.18), but I’ve changed the notation a bit, to emphasize that the sun grows at each impact, so r changes over time and so does M = M(r). And the small mass dm contributes only a small portion dK of the total kinetic energy that’s being converted into heat and stored in the form of a temperature increase. To find out how much that is, Helmholtz’ trick is to integrate (4.22) from r = 0 (no condensation at all, the sun doesn’t yet exist as a star) to the current radius of the sun, call it R now. This requires (i) making the simplifying assumption that the sun’s density ρ is constant and has always been the same, so that

\[ M(r) = \frac{4}{3} \pi \rho r^3, \qquad (4.23) \]

and (ii) establishing a relationship between the growth in mass dm and the growth in radius dr, which just think that to increase radius from r to r + dr we need to add to the proto-sun a layer of mass dm = 4πρr² dr (remember 4πr² is the surface area of a sphere of radius r). Plug all that into (4.22) and integrate, and

\[ K = \frac{16}{3} G \pi^2 \rho^2 \int_0^R r^4 \, dr = \frac{16}{15} G \pi^2 \rho^2 R^5. \qquad (4.24) \]
Once again, let’s throw some numbers in. The mean density of the sun is 1.4 × 10³ kg/m³; its radius R is about 7 × 10⁸ m: it turns out from (4.24) that the total stored energy178 is 2.4 × 10⁴¹ J. You might guess what happens next: divide the figure you just obtained by the rate at which heat is released by the sun, e.g. the estimate we’ve got based on Pouillet which, by now you know it by heart, is 3.6 × 10²⁶ J/s. Do the algebra, and (I’ve checked a couple times) you’ll find that the time needed to release all that heat, at the present rate, is 21 million years: which is not the exact same number given by Helmholtz, but close enough, I guess.

Helmholtz thinks that the sun is still becoming denser—by gravitation—and this process produces heat: “the sun is by no means so dense as it may become”, he writes. “Spectrum analysis demonstrates the presence of large masses of iron and of other known constituents of the rocks. The pressure which endeavors to condense the interior is about 800 times as great as that in the centre of the earth: and yet the density of the sun, owing probably to its enormous temperature, is less than a quarter of the mean density of the earth.

“We may therefore assume with great probability that the sun will still continue in its condensation, even if it only attained the density of the earth [...]—this would develop fresh quantities of heat, which would be sufficient to maintain for an additional 17,000,000 of years the same intensity of sunshine as that which is now the source of all terrestrial life...”

To Helmholtz’ conclusions, which he essentially agrees with, Kelvin adds (in the 1862 paper that I’ve just cited179) that “the sun’s density must, in all probability, increase very much toward his centre, and therefore a considerably greater amount
of heat than [the above estimate of 2.4 × 10⁴¹ J] must be supposed to have been generated if [the sun’s] whole mass was formed by the coalition of comparatively small bodies. On the other hand, we do not know how much heat may have been dissipated by resistance and minor impacts before the final conglomeration; but there is reason to believe that even the most rapid conglomeration that we can conceive to have probably taken place could only leave the finished globe with about half the entire heat due to the amount of potential energy of mutual gravitation exhausted. We may, therefore, accept, as a lowest estimate for the sun’s initial heat, 10,000,000 times a year’s supply at the present rate, but 50,000,000 or 100,000,000 as possible, in consequence of the sun’s greater density in his central parts.”

Kelvin concludes that the sun is very unlikely to be older than 100 million years. “What then are we to think”, he asks sarcastically, “of such geological estimates as 300,000,000 years for the denudation of the Weald? Whether is it more probable that the physical conditions of the sun’s matter differ 1,000 times more than dynamics compel us to suppose they differ from those of matter in our laboratories; or that a stormy sea, with possibly Channel tides of extreme violence, should encroach on a chalk cliff 1,000 times more rapidly than Mr. Darwin’s estimate of one inch per century?” (Kelvin implies, I guess, that the latter is true, and that the sun’s matter can’t be that different from matter in his lab. He’ll turn out to be totally wrong, but we’ll see about that later.)
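Helmholtz’s contraction estimate—Eq. (4.24) divided by the sun’s output—checks out numerically (a sketch; uniform density, as in the text):

```python
import math

G = 6.674e-11     # m^3/(kg s^2), gravitational constant
rho = 1.4e3       # kg/m^3, mean density of the sun
R = 7e8           # m, radius of the sun

# Eq. (4.24): gravitational energy released by condensation
# from infinite dispersion down to radius R, at constant density
K = (16.0 / 15.0) * G * math.pi**2 * rho**2 * R**5

solar_output = 3.6e26            # J/s, Pouillet-based estimate
age_years = K / solar_output / 3.15e7
print(f"stored energy ~{K:.1e} J, enough for ~{age_years / 1e6:.0f} million years")
```

With these round inputs the stored energy comes out near 2.3 × 10⁴¹ J and the lifetime near 20 million years—close to the ~21 million quoted above.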
4.13 Fourier’s Analytical Theory of Heat

Shortly after his work on the age of the sun, Kelvin figured that the earth, just like the sun, was slowly getting rid of its stock of heat; and he found a way to estimate, based on this idea, the age of the earth as well. All he needed to do was make some assumption on initial temperature, and then use the theory of heat conduction that had been put together and published, some forty years earlier, by his French colleague Joseph Fourier.

In Analytical Theory of Heat, Fourier reduced everything about heat to mathematical equations: how heat propagates across a body of material, what happens if you keep heating some material for a while, what if you keep heating it indefinitely, what if you let it cool down, etc. Fourier starts off with the definition of temperature that I’ve already shown you, a few pages back. Then, to explain the nature of heat, he illustrates some simple experiments—for instance, imagine that two thermometers are deployed at two different points in space, points P1 and P2. Let’s start administering heat at a point P0, closer to P1 than it is to P2, and while we do that let’s look carefully at the two thermometers. Assuming space is occupied by some “normal” substance that conducts heat, experience tells us that the column of mercury (or whatever non-toxic material you might prefer to use) begins to rise at P1 (closer to the source of heat) before it rises at P2. At that point, some heat has been transferred from P0 to P1, but none yet from P0 to P2. We infer that the amount of heat Q that’s exchanged, in a given
amount of time δt, between whatever two points at distance d from one another, depends on that distance d. Add to this another experimental fact: in Fourier’s words, “experiments have disclosed [...] that all the other circumstances being the same, the quantity of heat which one of the molecules receives from the other is proportional to the difference of temperature of the two molecules”—when you read “molecule”, here, think: “parcel of matter”. So, if you’ve got two parcels of matter at distance d from one another, starting off at temperatures T1 and T2, the heat that’s exchanged in a given (short) amount of time is

\[ Q = (T_1 - T_2) f(d), \qquad (4.25) \]

where at this point we have no idea what the function f is like. Time has got to be short because otherwise T1 and T2 could start to change, etc.

Now consider a prism of infinite extension (look at Fig. 4.7), but of finite thickness e. Each of the two surfaces of the prism is the outer surface of a half space, that somehow (no need to know how) is kept at constant temperature. Call Ta and Tb the constant temperatures of the two half spaces; call e the thickness of the prism and z the coordinate perpendicular to the surfaces of the prism. Fourier now shows that if at some point the temperature within the prism is given by

\[ T(z) = T_a + \frac{z}{e} (T_b - T_a), \qquad (4.26) \]
then the temperature “distribution” T (z) won’t change: it has reached a “steady state”. Fourier’s proof is as follows: first, think of a vertical layer within the prism of Fig. 4.7: any thin layer parallel to the outer surfaces of the prism will do. Then, think
Fig. 4.7 A prism of infinite extent but finite thickness e separates two constant-temperature half spaces. You have to think that both width and length of the two gray rectangles are infinite
of a pair of points at distance d from one another, each on either side of, say, the left interface of the layer, and another pair, also at distance d from one another, on the opposite sides of the right interface of the layer. If d is the same for the two pairs of points, and if T is given by Eq. (4.26), then it’s not so hard to see that the difference in temperature will be the same for the two pairs of points: just sub z + d for z into (4.26), subtract (4.26) from the equation that you’ve so obtained, and you’re left with

\[ T(z + d) - T(z) = \frac{d}{e} (T_b - T_a), \qquad (4.27) \]
no matter where z is. But now there’s also Eq. (4.25), which tells us that if the temperature difference is the same, then the heat that’s exchanged is also the same. Bottom line, if T(z) is given by (4.26), then just as much heat goes into the layer through the surface at z, as is lost through the surface at z + dz (or vice-versa). And but if the net amount of heat received by the layer is zero, the temperature of the layer won’t change. The reasoning applies to any layer anywhere in the prism. Ergo, T won’t change anywhere in the prism, QED.

You might notice that Eq. (4.26) identifies a linear change in temperature, i.e., T as a function of z is a straight line; you might also notice that, the way we’ve proven (4.26), it is evident that heat continues indefinitely to flow across the prism, from the warmer to the cooler half space—but, like we said, with no change in the way temperature is distributed.

Now that we know what is meant by “steady state”, consider a more general temperature distribution, associated with a more complex, realistic situation than the one we just looked at. Call it T(z). If d is small, we have

\[ T(z + d) - T(z) \approx \frac{dT}{dz}(z) \, d, \qquad (4.28) \]

where now we don’t know what dT/dz is like—since we don’t know what T(z) is like. Take the ratio of (4.28) to (4.27)—let me mark with a prime the steady-state distribution (4.26) and the quantities that go with it—

\[ \frac{T(z + d) - T(z)}{T'(z + d) - T'(z)} \approx \frac{dT}{dz}(z) \, \frac{e}{T_b - T_a}. \qquad (4.29) \]
Now remember Eq. (4.25), which you can substitute both at the numerator and at the denominator in the left-hand side of (4.29); f(d) cancels out, and

\[ \frac{Q}{Q'} = \frac{dT}{dz}(z) \, \frac{e}{T_b - T_a}, \qquad (4.30) \]

where Q is the heat exchanged between two parcels of matter at z and z + d, when the temperature distribution is T(z), and Q′ is the same thing, when temperature is as in (4.26). It’s OK now to replace the proxy in (4.29) with an exact equality, by the way, since we can assume that d is as small as we want it to be.
The steady-state case works as a sort of reference situation, and one might as well pick the simplest possible steady-state case, with e = 1 m and Tb − Ta = 1 °C; the heat Q′ that gets exchanged in such a case could serve us as a unit of heat; except, Q′ would change depending on the size of the parcel of matter that we are dealing with, and or how long we wait for heat to be exchanged, so that’s annoying. It’s better to introduce heat-exchanged-per-unit-surface-per-unit-time, which is what people like to call heat flux; if we use F to denote flux, and δS is the small surface through which heat is exchanged, and δt the short time during which it is exchanged, then Q = F δt δS and Q′ = F′ δt δS, and

\[ \frac{F}{F'} = \frac{dT}{dz}(z), \qquad (4.31) \]

where F′ is the heat exchanged at steady state, per unit surface per unit time, when you’ve got a temperature difference of 1 °C across a distance of 1 m. That’s actually what people call thermal conductivity, except for a negative sign, i.e. thermal conductivity is denoted k = −F′, so that

\[ F(z) = -k \, \frac{dT}{dz}(z), \qquad (4.32) \]
where I’ve dropped the prime because it’s not needed anymore; and Eq. (4.32) is what’s usually referred to as Fourier’s law of heat conduction180. Now, (4.32) relates flux across a surface to the rate of temperature change with distance. In the most general case, by the way, flux is a vector, and at the right-hand side you’ve got the gradient—also a vector—instead of the z-derivative of T; but we are keeping things simple, and we are happy with looking at a setup where heat is transferred along one direction only. Then, we can turn (4.32) into a differential equation with only one unknown function, that is T = T(z, t), i.e. temperature as a function not only of distance z, but also of time, t; and, solving that differential equation, we’ll learn things about how the temperature distribution evolves over time—which is what we are ultimately after.

Think (with the help of Fig. 4.8) of a parcel of matter that receives/loses heat along the z direction. The heat it exchanges with the rest of the world through its side at z is

\[ F(z) \, dx \, dy \, dt = -k \, \frac{dT}{dz}(z) \, dx \, dy \, dt, \qquad (4.33) \]

while at z + dz

\[ F(z + dz) \, dx \, dy \, dt = -k \, \frac{dT}{dz}(z + dz) \, dx \, dy \, dt. \qquad (4.34) \]
Fig. 4.8 This parcel of matter, of volume d x d y dz, exchanges heat with the rest of the universe only along the z direction: i.e., only through those of its faces that are perpendicular to the z axis. Their area is dx dy
Manipulate the right-hand side of (4.34) with a first-order Taylor expansion:

\[ \frac{dT}{dz}(z + dz) \approx \frac{dT}{dz}(z) + \frac{d}{dz}\frac{dT}{dz}(z) \, dz, \qquad (4.35) \]

so that

\[ F(z + dz) \, dx \, dy \, dt \approx -k \left[ \frac{dT}{dz}(z) + \frac{d}{dz}\frac{dT}{dz}(z) \, dz \right] dx \, dy \, dt. \qquad (4.36) \]

Subtract (4.33) from (4.36), and

\[ \left[ F(z + dz) - F(z) \right] dx \, dy \, dt \approx -k \, \frac{d^2T}{dz^2}(z) \, dx \, dy \, dz \, dt. \qquad (4.37) \]
Now, at the left-hand side of (4.37) we’ve got, by definition of F, the total amount of heat lost (or received, should it be negative) by our parcel of matter, of volume dx dy dz, in a time dt. If I call dT the change in the temperature of the parcel of matter over dt, then Eq. (4.10) tells me that that amount of heat also equals cm ρ dx dy dz dT, and so I can replace the left-hand side with that, and

\[ c_m \rho \, dx \, dy \, dz \, dT \approx k \, \frac{d^2T}{dz^2} \, dx \, dy \, dz \, dt. \qquad (4.38) \]
After a bit of algebra181 we end up with the famous “heat conduction equation”, or “heat equation”,
\[ \frac{dT}{dt}(z, t) = \frac{k}{c_m \rho} \, \frac{d^2T}{dz^2}(z, t), \qquad (4.39) \]

where, again, it’s OK to replace proxy with equality because I can take dz to be as small as I want it to be.
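Equation (4.39) can also be integrated numerically with the most naive explicit finite-difference scheme (a sketch, with made-up material constants); starting from the linear profile of (4.26), conduction leaves the temperature distribution untouched, which is Fourier’s steady state:

```python
# explicit finite-difference integration of dT/dt = kappa * d2T/dz2
kappa = 1e-6              # m^2/s, thermal diffusivity (a rock-like value)
dz = 0.01                 # m, grid spacing
dt = 0.4 * dz**2 / kappa  # s, time step small enough for numerical stability

n = 101
Ta, Tb = 0.0, 100.0
# Eq. (4.26): the linear steady-state profile between the two half spaces
T = [Ta + (Tb - Ta) * i / (n - 1) for i in range(n)]

for _ in range(1000):
    lap = [T[i - 1] - 2 * T[i] + T[i + 1] for i in range(1, n - 1)]
    for i in range(1, n - 1):        # dT = dt * kappa * d2T/dz2
        T[i] += dt * kappa * lap[i - 1] / dz**2
    # T[0] and T[n-1] stay clamped at Ta and Tb, like the two half spaces

drift = max(abs(T[i] - (Ta + (Tb - Ta) * i / (n - 1))) for i in range(n))
print(f"max departure from the steady state: {drift:.2e}")
```

The second difference of a linear profile vanishes, so the scheme changes nothing—exactly Fourier’s argument about the layers of the prism, in discrete form.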
4.14 Kelvin’s Estimate of the Age of the Earth

Now, finally, (4.39) is precisely what was used in the mid nineteenth century to evaluate the age of the earth based on the rate at which our planet seems to be releasing heat out to the rest of the universe. This was Kelvin’s idea and it was developed in his paper “On the Secular Cooling of the Earth”182, published in 1862.

The starting point of Kelvin’s reasoning is that we know from observations that temperature in the earth grows with depth: “in all parts of the world in which the earth’s crust has been examined, at sufficiently great depths to escape influence of the irregular and of the annual variations of the superficial temperature, a gradually increasing temperature has been found in going deeper. [...] There is, on the whole, about 1 °Fahr. of elevation of temperature per 50 British feet of descent.” Not a surprise for us: see Chap. 2. (And, now that we’ve looked at Fourier’s Analytical Theory, from this we immediately figure that heat flows from within the earth to the outside, which means that the earth gives out heat to the rest of the universe, i.e., it’s cooling. And unless there are sources of heat somewhere in the planet, which Kelvin doesn’t see why there should be, and what they could be183, that means that the earth will eventually release all its heat and become as cold as outer space.)

The next step is, Kelvin assumes that some external cause “produce at one instant, and maintain for ever after, a seven thousand degrees’ [Fahrenheit] lowering of the surface temperature”. 7000 °F is about 4000 °C, and it is the melting temperature of rocks. What Kelvin means here is, assume that we start off with the earth as a globe of molten rock: temperature of say 4000 °C, and then all of a sudden have the atmosphere around it drop to roughly 0 °C, and stay at 0 °C while the earth as a result cools down. To explain this hypothesis Kelvin refers e.g.
to “Leibnitz’s theory, which simply supposes the earth to have been at one time an incandescent liquid, without explaining how it got into that state”; and but anyway we’ve seen in Chap. 2 that there’s consensus on the idea that the earth was at some point so hot as to be completely liquid: so this is OK. (Kelvin anticipates the point that some of you might be about to bring up, and writes: “it may be objected [...] [that] no natural action could possibly produce at one instant, and maintain for ever after, a seven thousand degrees’ lowering of the surface temperature.” But, says Kelvin, “a large mass of melted rock, exposed freely to our air and sky, will, after it once becomes crusted over, present in a few hours, or a few days, or at the most a few weeks, a surface so cool that it can be walked over with impunity. Hence, after 10,000 years, or, indeed, I may say after a single year, its condition will be sensibly the same as if the actual lowering of temperature experienced by the surface had been produced in
an instant and maintained constant ever after.” And in the face of the estimates we’re about to get for the age of the earth, both 1 or 10,000 years are pretty small values.)

Then, finally, there’s Fourier’s equation (4.39), and here’s the crux of Kelvin’s idea: solving (4.39) with the conditions that the earth’s initial temperature be 4000 °C, and that the temperature just outside the earth be constant at 0 °C, we can find a mathematical formula for temperature as a function of time and depth: and then we can use it to figure how long it took for the earth to cool to its current state. And that’s at least a decent order-of-magnitude estimate for the age of the earth. Kelvin works this out starting with Sects. 12 and 13 of his paper, where he demonstrates that (4.39) is verified by the function

\[ T(z, t) = D \int_0^{\frac{z}{2\sqrt{\kappa t}}} e^{-x^2} \, dx, \qquad (4.40) \]
where D is an arbitrary constant (until we deal with the initial and boundary conditions, after which it won’t be arbitrary anymore), and, to save some room so that things are legible, I introduced κ = k/(cm ρ), which people like to call “thermal diffusivity”. Kelvin’s proof is by “direct substitution”, i.e. he plugs (4.40) into (4.39) and finds that the equality holds. Let’s do exactly that, beginning with the derivative at the left-hand side, dT/dt, which can be sorted out by defining a new variable y = z/(2√(κt)), so that

\[ \frac{dT}{dt} = \frac{dT}{dy} \, \frac{dy}{dt}, \qquad (4.41) \]

and the two differentiations in (4.41) are doable:

\[ \frac{dy}{dt} = \frac{d}{dt} \left( \frac{z}{2\sqrt{\kappa}} \, \frac{1}{\sqrt{t}} \right) = -\frac{z}{4\sqrt{\kappa t^3}} \qquad (4.42) \]

and

\[ \begin{aligned} \frac{dT}{dy} &= \frac{d}{dy} \left( D \int_0^y e^{-x^2} \, dx \right) \\ &= \lim_{\delta y \to 0} \frac{D}{\delta y} \left( \int_0^{y+\delta y} e^{-x^2} \, dx - \int_0^y e^{-x^2} \, dx \right) \\ &= \lim_{\delta y \to 0} \frac{D}{\delta y} \int_y^{y+\delta y} e^{-x^2} \, dx \\ &= \lim_{\delta y \to 0} \frac{D}{\delta y} \, e^{-y^2} \, \delta y \\ &= D e^{-y^2} \\ &= D e^{-\frac{z^2}{4\kappa t}}, \end{aligned} \qquad (4.43) \]
118
4 The Age of the Earth
which if you sub both into (4.41),

\frac{dT}{dt} = -D\, e^{-z^2/(4\kappa t)}\, \frac{z}{4\sqrt{\kappa t^3}}.    (4.44)
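The step from the second to the third line of (4.43), which is just the fundamental theorem of calculus, is easy to check numerically if you don’t feel like trusting the limit argument. Here is a minimal sketch (the sample values of y and the step h are my arbitrary choices; D can stay arbitrary at this point), using the fact that \int_0^y e^{-x^2} dx = (\sqrt{\pi}/2)\,\mathrm{erf}(y), with erf available in Python’s math module:

```python
import math

D = 1.0   # D is still arbitrary at this point of the argument

def F(y):
    # the integral in (4.40): F(y) = D * integral_0^y exp(-x^2) dx,
    # available in closed form as D * sqrt(pi)/2 * erf(y)
    return D * math.sqrt(math.pi) / 2.0 * math.erf(y)

def dF_dy_closed_form(y):
    # Eq. (4.43): differentiating the integral gives D * exp(-y^2)
    return D * math.exp(-y ** 2)

# central finite difference vs. the closed form, over a range of y
h = 1e-5
for y in (0.0, 0.3, 1.0, 2.5):
    numeric = (F(y + h) - F(y - h)) / (2.0 * h)
    assert math.isclose(numeric, dF_dy_closed_form(y), rel_tol=1e-6)
print("Eq. (4.43) confirmed numerically")
```

Combining this with (4.42) through the chain rule (4.41) is exactly what produces (4.44).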
Now let’s do the derivatives at the right-hand side of (4.39), one at a time:

\frac{dT}{dz} = \lim_{\delta z \to 0} \frac{D}{\delta z}\left(\int_0^{(z+\delta z)/(2\sqrt{\kappa t})} e^{-x^2}\, dx - \int_0^{z/(2\sqrt{\kappa t})} e^{-x^2}\, dx\right)
             = \lim_{\delta z \to 0} \frac{D}{\delta z} \int_{z/(2\sqrt{\kappa t})}^{(z+\delta z)/(2\sqrt{\kappa t})} e^{-x^2}\, dx
             = \lim_{\delta z \to 0} \frac{D}{\delta z}\, e^{-z^2/(4\kappa t)}\, \frac{\delta z}{2\sqrt{\kappa t}}
             = D\, \frac{e^{-z^2/(4\kappa t)}}{2\sqrt{\kappa t}},    (4.45)

so that then

\frac{d^2 T}{dz^2} = \frac{d}{dz}\frac{dT}{dz}
                  = D\, \frac{d}{dz} \frac{e^{-z^2/(4\kappa t)}}{2\sqrt{\kappa t}}
                  = -\frac{D\, e^{-z^2/(4\kappa t)}}{2\sqrt{\kappa t}}\, \frac{d}{dz} \frac{z^2}{4\kappa t}
                  = -D\, \frac{z\, e^{-z^2/(4\kappa t)}}{4\sqrt{\kappa^3 t^3}}.    (4.46)

OK, so, now what we’ve got to do is check whether (4.44) and (4.46) verify (4.39): i.e., multiply (4.46) by κ and check that the result coincides with (4.44): which actually it’s not that hard to see that it does: and so Kelvin is right and (4.40) is a solution to (4.39). The next and final step is to determine a value of D such that, at t = 0, there be a temperature jump of 4000 °C across the outer surface of the earth (which, go back a couple of pages, is Kelvin’s simplified description of the initial molten state of the planet). To do that, we’ve got to know a bit more about the integral in (4.40), which in Kelvin’s time had already been studied, e.g. by Fourier^184. It’s useful to think for a second about the simpler function^185

\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^{x} e^{-u^2}\, du,    (4.47)
the so-called error function, which looks a lot like (4.40). If you were to do the integral in (4.47) for a range of values of x, and plot the results in a graph, you’d get the curve in Fig. 4.9. It also turns out^186 that when x goes to infinity

\lim_{x \to \infty} \mathrm{erf}(x) = 1,    (4.48)

and likewise

\lim_{x \to -\infty} \mathrm{erf}(x) = -1.    (4.49)

Now, it follows from the definition (4.47) of erf and from Eq. (4.40) that

T(z,t) = D\, \frac{\sqrt{\pi}}{2}\, \mathrm{erf}\left(\frac{z}{2\sqrt{\kappa t}}\right).    (4.50)

In our case, then, the limits (4.48) and (4.49) might correspond to: infinite depth (z → ∞), which is not so interesting; infinite elevation (z → −∞), which is even less interesting; but also to t = 0: because t is in the denominator of the argument of erf in (4.50), and so by going to 0 it sends that argument to either +∞, if z > 0, or to −∞, if z < 0. At t = 0, then,

T(z,0) = \begin{cases} D\sqrt{\pi}/2 & \text{if } z > 0 \\ -D\sqrt{\pi}/2 & \text{if } z < 0, \end{cases}    (4.51)

and look at Fig. 4.10 to see how that looks, graphically. As t grows, that evolves into the curve in Fig. 4.9, initially very steep across z = 0 and then gradually
Fig. 4.9 The so-called “error function”, or erf(x): see Eq. (4.47)
Fig. 4.10 Kelvin’s solution for temperature as a function of time t and depth z, at time t = 0 (solid black line) and at three other values t > 0 (dashed and dotted lines: see legend). Only the values of T at z > 0 are meaningful
smoother. And of course T(z = 0, t) = 0 for any value of t. Bottom line, Kelvin’s conditions are fulfilled if we pick D\sqrt{\pi}/2 = 4000 °C. Now that Fourier’s equation is solved, and in such a way as to account for the initial and boundary conditions as well, let’s see what its solution can tell us about the earth. The depth derivative of temperature T at a time t is given by Eq. (4.45), and since we’ve gotten a value for D we can actually plug numbers into it: to see, for example, how dT/dz at or near z = 0 changes as we vary t. And we can look for a value of t such that dT/dz coincides with the datum that we actually have for it: about “1 °Fahr. of elevation of temperature per 50 British feet of descent”, in the units used by Kelvin, or, which is the same, about 1 °C per 30 m (0.03 °C/m), in the units we prefer to use today. And that value of t would coincide with the time it took the earth to evolve from totally molten (T = 4000 °C throughout) to its current state. To obtain the actual number found by Kelvin, which is 100 million years, consider that “if the rock be of a certain average quality as to conductivity and specific heat, the value of [κ], as I have shown in a previous communication to the Royal Society^187, will be 400, to unit of length a British foot and unit of time a year”. You can figure from its definition that κ has dimensions of squared length divided by time, and since a UK foot is about 0.3 m, the way I understand it, Kelvin picks κ ≈ 40 m²/year. Then, according to Eq. (4.45), near the surface of the earth (z ≈ 0) we’d have

\frac{dT}{dz}(z = 0, t) = \frac{D}{2\sqrt{\kappa t}} = 4000\ \mathrm{°C} \times \frac{2}{\sqrt{\pi}} \times \frac{1}{2\sqrt{40\ \mathrm{m^2/yr} \times 10^8\ \mathrm{yr}}} \approx 0.036\ \mathrm{°C/m},    (4.52)
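This arithmetic, and its inversion, are easy to replicate. Here is a minimal sketch in Python (the function names are mine; the values of κ and the 4000 °C condition are the ones used above), which evaluates the surface gradient after a given cooling time and, conversely, solves the same formula for the cooling time that matches a measured gradient:

```python
import math

KAPPA = 40.0                              # m^2/yr, Kelvin's diffusivity
D = 4000.0 * 2.0 / math.sqrt(math.pi)     # from the condition D*sqrt(pi)/2 = 4000 C

def surface_gradient(t_years):
    """dT/dz at z = 0 after cooling for t years, from Eq. (4.45)."""
    return D / (2.0 * math.sqrt(KAPPA * t_years))

def cooling_time(gradient):
    """Invert the same formula: t = (D / (2 * gradient))^2 / kappa."""
    return (D / (2.0 * gradient)) ** 2 / KAPPA

print(surface_gradient(1.0e8))        # ~0.036 C/m after 100 million years
print(cooling_time(0.03) / 1.0e6)     # ~141 Myr for the measured 1 C per 30 m
```

Given the order-of-magnitude spirit of the exercise, the roughly 140 million years returned for the measured gradient is Kelvin’s “100 million years”.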
4.14 Kelvin’s Estimate of the Age of the Earth
121
which is just about right. Ergo, if we call “age of the earth” the time it took for the planet to cool to its present state, starting the moment its temperature dropped to just below the melting point of typical known rocks, then the earth is about 100 million years old. Kelvin must have been happy to see that this estimate was consistent with his estimate for the age of the sun, which we saw above. Of course, those are just order-of-magnitude estimates. For instance, writes Kelvin, “we are very ignorant as to the effects of high temperatures in altering the conductivities and specific heats of rocks, and as to their latent heat of fusion. We must, therefore, allow very wide limits in such an estimate as I have attempted to make; but I think we may with much probability say that the consolidation cannot have taken place less than 20,000,000 years ago, or we should have more underground heat than we actually have, nor more than 400,000,000 years ago, or we should not have so much as the least observed underground increment of temperature.”
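Before leaving Kelvin: the whole construction can be sanity-checked numerically. Here is a short sketch, assuming Kelvin’s κ = 40 m²/yr and the 4000 °C jump (the sample depths, times and finite-difference steps h are my arbitrary choices), which verifies that the solution (4.50) satisfies Fourier’s equation, and that the boundary and initial conditions hold:

```python
import math

KAPPA = 40.0                              # m^2/yr
D = 4000.0 * 2.0 / math.sqrt(math.pi)     # so that D*sqrt(pi)/2 = 4000 C

def T(z, t):
    # Eq. (4.50): T = D * sqrt(pi)/2 * erf(z / (2 sqrt(kappa t)))
    return D * math.sqrt(math.pi) / 2.0 * math.erf(z / (2.0 * math.sqrt(KAPPA * t)))

def dT_dt(z, t, h=1.0):
    return (T(z, t + h) - T(z, t - h)) / (2.0 * h)

def d2T_dz2(z, t, h=1.0):
    return (T(z + h, t) - 2.0 * T(z, t) + T(z - h, t)) / h ** 2

# the heat equation dT/dt = kappa * d2T/dz2, checked at a few (depth, time) points
for z, t in [(100.0, 1.0e4), (1000.0, 1.0e6), (5000.0, 1.0e8)]:
    assert math.isclose(dT_dt(z, t), KAPPA * d2T_dz2(z, t), rel_tol=1e-4)

assert T(0.0, 1.0e8) == 0.0                    # surface held at 0 C at all times
assert abs(T(1.0, 1.0e-12) - 4000.0) < 1e-6    # as t -> 0, 4000 C just below the surface
print("Kelvin's solution checks out")
```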
Chapter 5
The Forces That Shape the Earth: Shrinking, Isostasy, Drift
How are mountains formed? They are pretty big—look at the Alps, the Andes, the Himalaya, the Rockies—and, if you think about it, it’s not obvious that they should be there, thousands of meters above sea level: rocks are heavy, and we know of a force called gravity that pulls them towards sea level, but we don’t know, or Hutton and Lyell and company didn’t know, of any physical force capable of moving such great masses in the opposite direction—away from sea level and up into the sky. We’ve seen that Werner and his followers figured there was no such thing as uplift, but simply deposition of sediments and then erosion digging through them to create relief. But the ideas of Werner, at least those on the history of the earth, are pretty much dead at this point. In Hutton’s theory there had to be some episodes of uplift: and only after the uplift would valleys be carved out, and the overall level of the land worn down by erosion, resulting in the landscape that we know. Hutton didn’t know exactly what the force that caused the uplift was, but he figured (today we would say: correctly) that the force was related to heat. In the mid-nineteenth century, the main proponent of this approach was Lyell, through his very successful textbook Principles of Geology^188. The label that people have found useful to attach to it is “uniformitarian”, to contrast it with the “catastrophist” approach. “Catastrophists” would interpret certain features of the earth—mountains, in particular—as signs of past events that occurred over a very short time—think of a single, enormous earthquake raising an entire mountain range, rather than the slow, impalpable erosion and sedimentation, or the regular-sized quakes and eruptions favored by the “uniformitarians”.
I don’t think I have found those terms—catastrophist and uniformitarian—in books or articles published in Lyell’s time, and my guess is that people have come up with them much later—you do find them in books that are published today by historians of science. Now, like Hutton, Lyell doesn’t have a very clear idea of what pushes continents and stuff up; “Lyell devoted little space in his Principles of Geology to the genesis of mountains”,
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 L. Boschi, Our Concept of the Earth, https://doi.org/10.1007/978-3-031-71579-2_5
says Mott T. Greene^189, “and implied that even such ranges as the Alps and the Andes had risen to towering heights by incremental stages of a few inches or a few feet at a time, lifted by earthquakes and volcanism.”^190
5.1 Élie De Beaumont: The Shrinking Earth

Élie de Beaumont, or Jean-Baptiste Armand Louis Léonce Élie de Beaumont, where Élie-de-Beaumont is the family name, or simply Beaumont, is one example of trying to answer the same question from the “catastrophist” angle. Beaumont was born in 1798 (one year after Lyell) and died in 1874. Beaumont and others looked again at Werner’s picture of the Alps and other European ranges, with a crystalline central axis surrounded by parallel areas of sedimentary rock (remember Fig. 3.2). They realized that the central axis was not “primitive”, but more recent than the sedimentary strata it had intruded and deformed. So now they decide to look more closely at the sedimentary formations that are to be found on the sides. Beaumont’s early contributions to geological thought are summarized nicely by the article he published in 1831 in The Philosophical Magazine and Annals of Philosophy^191, where he points out that, “along nearly all mountain chains, [...] the most recent rocks extend horizontally up to the foot of such chains, as we should expect would be the case if they were deposited in seas or lakes of which these mountains have partly formed the shores; whilst the other sedimentary beds tilted up, and more or less contorted on the flanks of the mountains, rise in certain points even to their highest crests.” He explains that one can then use “the geological age of these beds [as a] means of determining the relative age of the mountains themselves; for it is evident, the first appearance of the chain itself is necessarily intermediate between” the younger age of the flat strata, and that of the older ones tilted and contorted by the uplift. “There is nothing so essential to remark,” continues Beaumont, “as the constant clear line of separation between these two series of beds in each chain.” It’s a sharp discontinuity, much like Hutton’s unconformities.
Shallower and more recent strata are flat—no major plutonic force has perturbed them yet; older sedimentary strata are dramatically tilted and contorted, presumably by the intrusion of the plutonic rock forming the central axis of the chain; there is nothing in between: no slightly tilted strata, progressively deposited as the mountain was slowly rising—because the mountain rose fast. “It follows from this difference, always clear and without passage, between the upheaved beds and those which are horizontal, that the elevation of the beds has not been effected in a continuous and progressive manner, but that it has been produced in a space of time comprised between the periods of deposition of the two consecutive rocks, and during which no regular series of beds was produced;— in a word, that it was sudden, and of short duration.” Now this was a controversial point, because Lyell would reply that the area in question, like any portion of the surface of the earth, would alternate epochs of sedimentation with epochs of erosion, and the uplift could have occurred—slowly—during a time of erosion, and be done before sedimentation started again. And Beaumont could not really reply to this, but
neither could Lyell actually prove that things happened the way he thought... so both speculations were plausible and the question remained open. By the way, as one reads Beaumont’s words “sudden, and of short duration”, a question inevitably comes to mind: just how short was he thinking? Minutes, years, centuries? The answer is to be found later in the same paper. If I understand him correctly, he thinks that “the sudden and passing deluge noticed among the traditions of all nations as having occurred at nearly the same epoch”—i.e., the deluge as in the Bible, Noah’s ark and all that—must have been caused by “the appearance of a new system of mountains [...]; and perhaps we may be justified in observing”, he adds, “that the chain of the Andes, whose volcanic vents are still in activity, [...] presents the most extensive, the most clearly defined, and as it were the least obliterated feature observable in the present exterior configuration of the globe.” Beaumont also thinks that the deluge was basically a tsunami (more about those in Chap. 6), and “it has been shown”, he says, “that paroxysms of internal energy, accompanied by the elevation of mountains, and followed by mighty waves desolating whole regions of the earth, were a part of the mechanism of Nature; and what has happened again and again, from the most ancient up to the most modern periods, may have happened once during the few thousand years that man has been living on its surface”, etc. So, the “paroxysm” that brings up a mountain range is the same event that causes a tsunami: which means that, on a geological time frame, we are talking about a very short duration: minutes, i.e. the time scale of an earthquake as we observe them today.
Beaumont doesn’t for one moment think that the catastrophes that formed mountain ranges are the result of some strange mechanisms that belong to the past and today don’t exist anymore; Greene calls him “an actualistic catastrophist—whatever raised these mountains had to be explained within [...] causes [...] that obeyed the known laws of nature and, further, were capable of acting in the present as well as the past: the upthrust of mountains could not be something from an earlier, uniquely paroxysmal epoch of earth history.” So then one must look at the present shape of the globe and phenomena that currently take place on it, to find what makes mountains. Having found a way to date any given mountain range that has been explored by geologists, Beaumont takes a global look at the mountain ranges of Europe and of the world, and comes to the conclusion “that in each mountainous district all the beds upheaved at the same moment have been so raised in the same general direction.” In other words: mountain ranges formed at the same time have parallel axes. (If you’ve already read a bit of geology before this book, you’re probably thinking that this doesn’t make much sense. And, by today’s standards, you are right. But you have to think that Beaumont was extrapolating on the basis of much less data than you can access now.) And what are the chances that, just by coincidence, every time two or more mountain ranges are formed at the same time, they have the same “strike”?! So Beaumont goes further, and infers that “it is [...] very probable, that not only, as is proved by observation, all beds upheaved at the same time have been so raised in the same direction, but also that this [...] is the result of this collection of beds having been thrown up at the same time by a single effort of nature: whence it would follow that the number of the epochs of elevation would not be unlimited, but that it would at
least be equal to that of the directions of those chains which are clearly distinct, a number not incompatible with that of the solutions of continuity observable in the sedimentary deposits.” There follows a long and detailed analysis of European mountain systems, which Beaumont finds to conform to and confirm his theory. And then the extrapolation: “If we attentively consider, on a terrestrial globe of sufficient size and good execution, the most prominent and the most recent systems of mountains which ridge Europe, we may remark that each of them forms a part of a vast system of parallel chains, which extends far beyond the countries geologically known to us. But as in all the parts of each of these systems situated in well examined portions of Europe, it has been more and more observed that parallel chains are in general contemporaneous, there is no reason to suppose that this law should suddenly cease, if its verification should be pushed still further. It is therefore natural to consider, until direct observations may show the contrary, that each of these vast systems, of which the European systems are respectively portions, originates in a single epoch of dislocation.” So, no matter where you travel on the surface of the earth, “parallel chains are in general contemporaneous”. So now for the answer to the initial question: what makes mountains? “the secular refrigeration, that is to say, the slow diffusion of the primitive heat to which the planets owe their spheroidal form, and the generally regular disposition of their beds from the centre to the circumference, in the order of specific gravity,—the secular refrigeration, on the march of which M. Fourier has thrown so much light [see Chap. 4], does offer an element to which these extraordinary effects may be referred. [...] In a given time, the temperature of the interior of the planets is lowered by a much greater quantity than that on their surfaces, of which the refrigeration is now nearly insensible.
We are, undoubtedly, ignorant of the physical properties of the matter composing the interior of these bodies; but analogy leads us to consider, that the inequality of cooling above noticed would place their crusts under the necessity of continually diminishing their capacities, notwithstanding the nearly rigorous constancy of their temperature, in order that they should not cease to embrace their internal masses exactly, the temperature of which diminishes sensibly. They must therefore depart in a slight and progressive manner from the spheroidal figure proper to them, and corresponding to a maximum of capacity; and the gradually increasing tendency to revert to that figure, whether it acts alone, or whether it combines with other internal causes of change which the planets may contain, may, with great probability, completely account for the ridges and protuberances which have been suddenly formed at intervals on the external crust of the earth”. To understand what this means, think of a globe made of hot lava, kept together by its own gravity, that slowly cools down. Temperature will be lower near its outer surface than near its center, so the outermost shell of the planet will be the first to reach a temperature low enough for the lava to freeze. (This is the idea of Leibniz and others, that we talked about in Chap. 2 already.) So, very soon, a rigid, brittle crust forms, sort of like the shell of an egg. But then cooling continues, and people at Beaumont’s time already figured by experiments that rocks shrink as they cool. The temperature of the crust is essentially constant by now so the crust doesn’t really shrink anymore, but the volume of the fluid material inside is still diminishing considerably: “the
inequality of cooling [...] would place their crusts under the necessity of continually diminishing their capacities, notwithstanding the nearly rigorous constancy of their temperature, in order that they should not cease to embrace their internal masses exactly, the temperature of which diminishes sensibly.” So if the planet as a whole shrinks, you end up with a vacuum between crust and interior of the planet, unless the crust collapses at some places. Now you also have to think that as the crust collapses, you cannot just erase part of it—its total surface remains constant and the only way to account for this is by folding it192 . Folding occurs along some particular lines—or rather, to use the proper geographical term, great circles, i.e. circles on the surface of the earth which lie in a plane passing through the earth’s center: like meridians (which are just great circles that pass through the poles), and unlike parallels (except for the equator). So, for some mechanical reason that remains to be determined, mountain ranges are formed along those great circles, if we believe Beaumont, and, as we have seen, if two mountain ranges are formed at the same time they follow two great circles that are parallel to one another. Or at least this is what Beaumont thought he saw, whenever he looked at a topographic map of the globe. He eventually became convinced that he could identify a relatively small number of “preferential” great circles along which mountain ranges had sprung up, and that those great circles formed a pentagonal network. This latter idea became quite famous, but was seen already by many of Beaumont’s contemporaries as rather wild. See for instance this summary by George L. Vose, from his 1866 book Orographic Geology (which we’ve met in Chap. 3 already): “M. 
Beaumont supposes, that, in the history of the earth, there have been long periods of comparative repose, during which the deposition of sedimentary matter went on regularly and continuously; that between these periods there have been short ones of paroxysmal violence, during which a great number of mountain-chains have been suddenly thrown up, all of the ranges originating at one time being very nearly parallel, no matter how far distant from one another [...]. The origin of these mountain eruptions is found in the secular cooling of the globe, the whole mass of which, with the exception of an envelope much thinner in proportion than the shell to an egg, is kept in a melted condition by heat, but is constantly cooling, and thus contracting. The external crust does not gradually collapse upon the shrinking nucleus, but becomes separated from the central portion [...], and when it gives way, falls in suddenly along determinate lines of fracture. At such times the rocks are subject to great lateral pressure, which crushes the rigid masses, and bends the pliant ones into elevations and depressions, thus producing the folds or waves called mountains. All these lines of fracture, and thus of elevation, made at one time, are supposed to be parallel,—by which is meant that they lie on different small circles parallel to the same great circle; and the various systems of parallel ranges are so grouped as to form a pentagonal network, upon which geometrical plan a vast amount of good mathematics has been wasted.” Vose’s book reviews everything that had been proposed since the time of Lyell to explain mountains. Vose has the merit of exposing the weaknesses of all theories, with no particular concern for the authors’ reputation and, uhm, academic influence193 . In fact Beaumont, who “wasted” such a “vast amount of good mathematics”, was one of the leading figures in geology at the time. His theory, which “made its first
appearance in 1829, and [...] has been presented to the public since that time under a variety of modifications, both by its author and by other geologists,” is the first one to be trashed by Vose. Vose raises a number of objections to Beaumont’s theory, all revolving around Lyell’s comment (see above) that Beaumont’s way of dating the formation of mountains was not reliable. “To avoid these difficulties, M. Beaumont has, since first putting forth his theory, so modified it as to destroy the only claim it ever had to attention,—its simplicity; inasmuch as he has multiplied the number of successive upheavals, and asserts that new lines of elevation sometimes take the direction of old ones, thus destroying entirely the use of a parallelism as a time boundary”. So, goodbye Monsieur Beaumont. But (Vose again) “in judging thus of M. Beaumont’s plan, we must be careful not to do him the injustice of allowing him no part in the formation of a theory of mountain structure. He has furnished us an immense mass of facts regarding the disposition of mountain-ranges,” etc. (Vose’s views reflect the consensus that was emerging at his time in the, as they say, “community”: which was to reject catastrophism and adhere to the Lyellian view that geological things always happen in the same way, and slowly; but, while abandoning the idea that a mountain range could be formed by a single earthquake is a positive step towards the geology of today—and many of you presumably already know this, and are thinking this, so I might as well say it—postulating no “catastrophe” possible in earth history would leave out, for instance, mass extinctions. Which of course today we know would be a problem: the dinosaurs, etc. We’ll get to this later.)
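Since Beaumont’s whole scheme leans on the notion of a great circle, it may help to make the definition given above concrete. Here is a toy sketch (function names, sample points and the tolerance are my own choices, purely for illustration): on a unit sphere, a circle is “great” exactly when the plane it lies in passes through the center, so we can test a candidate circle by fitting a plane through three of its points and measuring that plane’s distance from the center. Meridians pass the test; parallels (other than the equator) don’t:

```python
import math

def latlon_to_xyz(lat_deg, lon_deg):
    """A point on the unit sphere (our idealized globe)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def is_great_circle(p, q, r, tol=1e-9):
    """True if the circle through surface points p, q, r lies in a plane
    passing through the sphere's center, i.e. if it is a great circle."""
    u = tuple(q[i] - p[i] for i in range(3))
    v = tuple(r[i] - p[i] for i in range(3))
    # unit normal of the plane through p, q, r (cross product of u and v)
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = math.sqrt(sum(c * c for c in n))
    n = tuple(c / norm for c in n)
    # distance of that plane from the center (the origin)
    return abs(sum(n[i] * p[i] for i in range(3))) < tol

equator  = [latlon_to_xyz(0, lon) for lon in (0, 90, 200)]
meridian = [latlon_to_xyz(lat, 30) for lat in (-40, 10, 60)]
parallel = [latlon_to_xyz(45, lon) for lon in (0, 90, 200)]

print(is_great_circle(*equator))   # True
print(is_great_circle(*meridian))  # True
print(is_great_circle(*parallel))  # False: its plane misses the center
```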
5.2 Henry Rogers

The next iteration of geological catastrophist thought in Vose’s review is the work of the Rogers brothers^194, and in particular of Mr. Henry Rogers. First of all, Rogers reports on the observations he made in his Appalachian fieldwork. Most importantly, the “folds”, or, as Rogers likes to say, “waves”, or “flexures”. We haven’t really talked about folds yet, so here we go. Take, I don’t know, a few blankets or tablecloths and lay them flat, on top of one another. These are your sedimentary strata in their original shape (remember Steno). Then, you don’t know what this force corresponds to in the real world—that’s precisely what Beaumont and Rogers and Vose and so on were trying to find out—but, just to see what happens, push your stack of strata horizontally, on both sides, with both hands, in converging directions. Or push it down or pull it up in the middle. It turns out that what you see, now, resembles what typical outcrops look like in mountain ranges: folded strata, or waves of strata. (And if you think of it in terms of waves, then it will be useful to know that the trough of the wave is called, by geologists, a “synclinal”, and the crest an “anticlinal”: words that will be used again in the following.) The only thing that’s missing is erosion—for that you’d have to take a pair of scissors and cut off the top of your stack along a horizontal line. And then from the top you’d see outcrops of the “deeper” layers: just like in the real world. Now, if you go to the Alps the picture is not so simple: “where the elevating forces have acted in different directions at different times, causing interference in the waves
Fig. 5.1 Idealized vertical section including, from left to right: one symmetric flexure, AKA symmetric fold; one normal flexure, or asymmetric fold; and then two examples of what Rogers called, if I understand correctly, “folded flexures”—today people would rather call them “overfold” and “recumbent fold”. The axis plane of the flexure, or fold, by the way, is the plane bisecting the angle formed by the slopes on the opposite sides of the “crest of the wave”: here the axis planes are all perpendicular to the page, so they are rendered with (solid, thick) lines
like a chopped sea, as in the Swiss Alps and the mountains of Wales and Cumberland, the undulations are disguised, and are with extreme difficulty made out” (Vose). But the Appalachians are much better behaved. Rogers is able to identify, there, three classes of “flexures” (while you read the following, keep an eye on Fig. 5.1): “The individual flexures of a belt of undulated strata occur under three forms, which Mr. Rogers calls symmetrical, normal, and folded flexures. In the symmetrical flexure, the wave is broad [that is to say, the strata are thick] and flat, the slope on each side of the axis being the same; in the normal flexure, the crest is more elevated, the amplitude is less [i.e., the strata are thin], and the slope is steeper on one side of the axis than on the other; in the folded flexure, the crust disturbance is the most violent, the width still less, and the slope very much steeper on one side than on the other, being even vertical or inverted, i.e., so as to bring the lower formation uppermost. At the same time, the axis plane, which in the symmetrical flexure is perpendicular, in the normal flexure is inclined towards the steepest side of the wave, and in the folded flexure still more so.” Another general observation of Rogers (confirming other observations by other people who looked at other mountain ranges, elsewhere) is summarized by Vose as follows: the Alps can be described as “a double system of plication. Deep in towards the higher central chains, the folding of the strata is excessive, the waves being as it were doubled up, and pushed entirely over outwards towards the north-west; whilst, as we proceed into the great plain of Switzerland, the waves assume the normal, and finally, at the outward base of the mountains, the symmetrical type.
The same phenomena are found in going from the central axis of the chain into the plains of Lombardy;” So Rogers thinks (and Vose seems to agree) “that the forces which have plicated the mountain regions have acted in such a manner as to produce a succession of waves, which are more and more closely folded, faulted, and disarranged, as we approach the centre of action; and that, as the undulations become more and more abrupt, they assume what Mr. Rogers calls the normal, and finally the folded form.” A fault, by the way, is what happens when some relatively rigid layer is folded to
a point where the deformation is too much, and it just breaks: a rupture. Strictly speaking, if you break a chunk of whatever material in two, you get two chunks of the same material, and the surface that separates them—along which you have broken the original chunk—is the fault195 . Rogers also notices that in the higher parts of the Alps folds are systematically pushed to the south-east; he thinks that this reflects some sort of universal law: “from the phenomena of the Appalachians, Alps, and other disturbed regions, says Mr. Rogers,” says Vose, “it is obviously a general law that the axis planes are not only all inclined in one prevailing direction, but that they dip invariably towards the quarter of maximum disturbance.” So, what do these observations teach us re how mountains are formed? Again Vose: “Mr. Rogers thinks that no purely upward or vertical force, exerted either simultaneously or successively, along parallel lines, would produce a series of symmetrical flexures; and that a tangential pressure, unaccompanied by a vertical force, would result only in an imperceptible bulging of the whole region, or an irregular plication dependent on local irregularities in the amount of resistance. It is hard, he says, to account for the phenomena by any hypothesis of a gradual pressure...” So a vertical force alone will not do—Lyell is wrong—some sort of horizontal pressure is needed. 
And here is how Rogers tries to solve the problem: (i) first, let’s make the hypothesis that, for some reason that we don’t necessarily know, fluid material below the crust expands and/or releases “gaseous vapors”; (ii) this process takes place over (or, rather, under) a large area (larger, at least, than the mountain range itself as we see it in our geologic present) that as a result becomes elevated; (iii) as the fluid underneath pushes up and sort of inflates the crust, the crust (which is rigid/brittle) eventually breaks and fissures; (iv) the crust is now pushed horizontally away from the center of a fissure, through some sort of gigantic earthquake—which accounts for the forming of folds—while (v) “a melted key-stone”, as Vose calls it, emerges from the depths of the earth below, to fill the “vacuum”—the fissure itself. Vose immediately sees a couple of problems with this: “we are called upon to believe in the production of a gigantic force from the expansion of molten matter and certain gaseous vapors,” but Rogers just assumes that those vapors exist, without making any particular effort to convince us that they do; instead, Vose points out that if heat is spent to increase the volume of molten rocks, or separate the gases from the rocks, well, that heat must have come from somewhere, and if some other place inside the earth has given away some heat, that means that it has cooled, but then it should also have shrunk. “The total amount of heat being constant, the contraction at the center would just balance the outer expansion, so that the result would be no pressure at all from below against the crust”^196. And that’s just the first problem. Then, in Vose’s view, it gets worse: we are supposed to believe “that the rocky crust of the earth, which must have been rigid, or it would not have cracked and allowed the vapors to escape, [...] could be thrown into rapid pulsations”.
But “That the rocky crust, twenty-five or one hundred miles in thickness, perhaps much more, could be thrown into rapid undulations over a mile in height, and yet allow the several strata to remain well-defined and unshattered, is an assumption too monstrous to be entertained.”
Then, even if one accepted to make this monstrous assumption, the theory of Rogers would demand “that the intrusion of a melted key-stone can support an arc the height and breadth of which are measured by miles.” But, says Vose, “the injection of liquid lavas would be a poor support for a chain of mountains; and injections of granite [...] could not have acted so, as its crystalline nature shows that it cooled very slowly”197 ; which is something that James Dana (more on him later) actually had initially brought up: molten material would by its very nature be squeezed away by the weight of layers pulled by gravity back to their initial positions. Finally, to give Rogers’ theory the death blow, as if the rest hadn’t been enough, Vose brings up a point originally made by Henry De La Beche in 1849. De la Beche was the president of the Geological Society of London, and he commented on Rogers’ work in his anniversary address to the society, then published in the society’s quarterly journal (and Vose pretty much repeats De La Beche’s own words): “Those familiar with the Alps must be well-aware of the great dislocations and folds exhibited, and of the whole presenting a crushed appearance, such as we might expect from the heavy pressure of the masses composing them against each other on a line corresponding with that of the main range. If the various dislocated parts were reunited, and the folds flattened out, and the component beds restored to the condition in which they were formed, the area now occupied by the region of the Alps would have to be expanded to various distances parallel to the main range,—the flanks pressed out into Italy on the one hand, and towards the countries on the N.W. and W. on the other.” In other words, going back to the tablecloth/blanket experiment, imagine that the blankets have been folded so much that if you were to repair them and spread them out, they’d cover, horizontally, a much larger area than they do right now in their folded state. 
The same goes for mountain ranges: if one could restore eroded layers and spread them out, they’d cover a much larger portion of the surface of the earth than they do now. Now let’s say that you want to explain folding in terms of global cooling and shrinking of the earth; a lot of folding—a lot of horizontal compression—means a great reduction of the earth’s outer surface and therefore of its volume. The fact that the Alps are so folded then implies that before the formation of the chain the earth was way bigger—which is, to say the least, weird. This problem will continue to come up, because during the entire nineteenth century geologists continue to explore mountains and what they find is more and more folds: so that, with each new campaign, estimates of the total, original (flat) surface of the now folded strata are systematically increased. And none of the models I am going to tell you about are going to be able to even start to solve the issue: until the advent of a model called “plate tectonics” which is essentially what most of us believe today. Anyway, De la Beche continues, “in looking at the flexures and dislocations in the Alps we have to regard the mass of them, and in doing so we seem scarcely to arrive at the conclusion that the flanks have been driven outwards by impulses acting upon a fluid mass beneath [...]; but rather that the component beds were squeezed from both sides up against a main central line, extending along the main range of these mountains, so that the effects produced upon the partly flexible and partly more unyielding rocks would throw them into flexures and break them [...]. The contorted, broken and jammed masses would struggle to expand themselves, and to avoid being
squeezed and piled up into the atmosphere, would act with all the power due to their gravity in forcing the rocks which could yield into flexures, breaking others more rigid, and even the flexures themselves when too sharp for the cohesion of the beds, so that towards the central axis the anticlinals would be sharper and more folded, with inversions, and become less so towards the extreme flanks.” So, I don’t know how quantitative this is, but the point is that if you take two masses of partly rigid, partly ductile materials organised in horizontal layers, imagine two millefeuille cakes, if you have any experience with French pastry, and push them against each other, you are likely to see the same sort of effects observed in the Alps. Bottom line, “after carefully considering the facts brought forward by the Professors Rogers, for which the best thanks of geologists are due to them, it appears to us that lateral pressure, not from the mere injection of some liquid and molten matter [...], but from the pressure of masses of the crust of the earth against other masses along great lines of fracture on the surface, has been the cause of these flexures.” i.e., thanks, but no.
5.3 James Hall (American Geologist)

After Beaumont and Rogers, the next geologist reviewed by Vose is James Hall—American, 1811–1898, no relation to the Scottish James Hall who—see above—had tested and validated some of Hutton’s theory from the standpoint of chemistry. The American James Hall came from a poor family, but as a kid was a good student and managed to go to college at Rensselaer School in Troy, New York (being broke, he had no choice but to walk the 200 mi from near Boston, where he was born, to Troy). After graduating he ended up working for the State of New York, first as geological surveyor and then as “state paleontologist”. His headquarters were in Albany, where he had his own lab and, over the years, hired and trained quite a few apprentices who then went on to contribute their own research. Hall was also quite a character, and I know, and it is clear to you by now, that the purpose of this book is to expound some important ideas on the earth in a chronological order, to make sense of each of them on the basis of what came before, etc., so this is really meant to be a science book, although the story is told in a somewhat unusual way: not a book on the history of science, and certainly not a book on scientists—but Hall was quite a character and I can’t help reporting some of the weirdness about him: “Hall’s assistants learned more from him than just paleontology [...], for they also experienced a strong, egotistical, and irascible personality. [I’m citing from a 2005 Biographical memoir on him, written by Robert H. Dott, Jr., and published by the National Academy of Sciences—of which Hall was a founding member.] Although his sharpest attacks were reserved for his enemies in the New York legislature, most assistants were also treated to his infamous outbursts. 
Besides throwing vituperative verbal daggers, he sometimes brandished menacingly either a stout cane or even a shotgun kept at the ready near his desk.”198 Over the years, the work of Hall and his team resulted in a 13-volume Natural History of New York: Paleontology. According to the N.A.S. memoir, “it was primarily
[this magnum opus] that initially brought Hall his fame; however, the broader community of geologists now remembers him more for the curious theory of mountains presented in his presidential address to the American Association for the Advancement of Science in 1857.” And it is this curious theory of mountains that I’m going to try to tell you about here; funny that in 2005 it should be called curious—which is certainly not the most flattering qualifier for a scientific theory; in 1866 Vose clearly has a very high opinion of Hall’s work and in fact he seems to think that, while some of its passages are hard to defend, Hall’s theory is definitely not crazier than those of all those other guys—Beaumont, Rogers, etc.199 Hall’s theory of mountains was designed to explain some important observations that he made in his fieldwork. Hall spent a lot of time in the field, and ended up knowing extremely well the geology of the Northeastern U.S. And he realized—was maybe the first to realize—that, despite their enormous cumulative thickness, all strata found in Appalachia were formed in shallow ocean water. That is something one could figure out from the nature of the fossils—you have to think that organisms that live in shallow sea water are different from organisms that live in deep water—or in lakes and rivers. And they often resemble shallow-water organisms that live today. Hall inferred that the sediments piling up near the shore, as their total weight increased, caused the underlying rock to subside just as slowly, so that, over time, the depth of the seafloor would remain constant with respect to sea level, despite the accumulation of more and more sediments. In other words: sediments continuously accumulate, but at the same time the ocean floor subsides at approximately the same rate: as a result the depth of the ocean floor, where new layers are formed, is always about the same. 
Hall’s deduction placed an important constraint on whatever process formed the Appalachians: whatever that was, it was now clear that layers were formed slowly, and not “catastrophically”. Hall also saw that, as you travel from New York to, say, Ohio, the nature of the sediments changes: the huge beds seen in New York become thinner, and you get more carbonate of lime and less shale (which is a kind of rock that is made from mud). To him, that change marked the transition between (ancient) shallow and deep water. “In the region of Eastern New-York,” he wrote in the introduction to his Paleontology, “the coarse materials of mechanical origin are accompanied by littoral or shallow-sea shells: while farther from the shore and the influence of the stronger currents, the same deposit became the habitation of other forms adapted to the changed condition, and finally coral reefs occupied the bed of the ocean in the vicinity of the present Ohio valley. [...]” “Along the shores of this ocean, in a direction from northeast to southwest, from Newfoundland to the southern extremity of the Laurentian mountains and thence from Canada to Alabama, were spread these immense sediments along the line of the present mountain ranges; while on the northwestern and western sides lay the quiet ocean teeming with its inhabitants and scarcely disturbed by the gentle currents which transported the fine and almost impalpable mud”, etc. So: lots of sediment near the shore, and not much away from the shore towards where the ocean was deeper (and where different fossils, etc., are now found). Then Hall makes another important point: the Appalachian mountains are right where the
shore used to be, and where the most significant accumulation of sediments took place. What used to be deep ocean, on the other hand, is now flatland. Hall thinks this might be a general rule of mountain formation: where lots of sediments are deposited, you will eventually have mountains, while areas where not much sediment is deposited tend to stay flat. Hall gives an explanation for this observation, which is fairly long. Here’s how Dana200 sums it up in his 1873 paper “On the Origin of Mountains”201: “1. The Paleozoic strata of the Appalachian region bear evidence that they were mostly of shallow water origin. [This I already explained in the last few paragraphs.] “2. Their great thickness, consequently, was attained through a slowly progressing subsidence, the axis of which was in the direction of the Appalachian chain. [This too.] “3. This slowly progressing subsidence was occasioned by the weight of the slowly and successively accumulated sediments. The memoir202 says (p. 69): ‘[w]hen these [‘accumulations of sedimentary matter’] are spread along a sea bottom, as originally in the line of the Appalachian chain, the first effect of this augmentation of matter would be to introduce a yielding of the earth’s crust beneath, and a gradual subsidence would be the consequence’ [i.e., according to Hall, the crust is deformed by the weight of all the sediments that accumulate on top]. “4. This subsidence produced, as one of its direct results, the extensive folds and faults of the strata characterizing the Appalachian formations. [...] “5. The formation of the Appalachians (and so of all mountains) was dependent upon, and the height related to, the thickness of the sedimentary accumulations, of which they are made; and a mountain chain was not a possibility over the Mississippi basin [i.e., away from the ancient shore], because there ‘the materials of accumulation were insufficient.’ “6. 
(a) The elevation of the Appalachian mountains was not a result of the process of accumulation, or of the subsidence. (b) The elevation of mountains is, ‘of continental, and not of local, origin; there is no more evidence of local elevation along the Appalachian chain than there is along the plateau in the west.’ [...] ‘It is this ultimate rising of continental masses that I contend for, [says Hall,] in opposition to special elevatory movement along the lines of mountain chains.’ (c) After a continental elevation, the mountain range received its present shape mainly through erosion.” (There is a seventh point, dealing with metamorphism, but I’ll leave it out because it’s too complicated, and anyway Dana mentions it but doesn’t really discuss it in that paper.) Everybody agreed with points 1 and 2 at the time when Dana was writing (everybody still does, I think). Points three and four deserve the most attention: Hall drops the idea of shrinking caused by cooling, or in any case does not need it to explain the folding of strata, and instead explains everything in terms of gravity. The sediments pile up, they are heavy, they tend to bend the underlying crust. Making use of a word that we first used when we talked about Rogers, we might say that a large-scale synclinal is so formed. Now, look at the drawing in Fig. 5.2. Because of the large-scale folding, strata closer to the surface are under lateral compression, they get
Fig. 5.2 Folds, as seen by James Hall
Fig. 5.3 Major folding in the Col du Pillon, which is roughly halfway between Lausanne and Brig, in Switzerland. The picture was taken by my friend Claudio Rosenberg, a geology professor in Paris, and the person I always go to when I have a question about something somehow related to the Alps. He calls this a “folded Mesozoic sequence in the Helvetic nappes”
compressed in the horizontal direction, while deeper strata are under lateral tension. Hall proposes that lateral compression makes smaller-scale folds in the shallower layers. These folds are what we see when we look at outcrops in places like the Alps, Appalachians, etc.: see, e.g., the photo in Fig. 5.3. Both Vose and Dana see this as progress with respect to Rogers’ ideas, because (Vose): “So far as we have examined the facts, no other hypothesis than that of the slow sinking of vast masses of yielding sediments can at all satisfactorily account for the plication and other evident effects of a compressive force so invariably exhibited in mountain districts”. What
Vose means by “compressive force” here, is the lateral compression, from the sides towards the central axis of the chain: remember that according to Rogers the folds are caused by expansion from the central axis: and this is problematic because the expansion (vertical and, as a result, horizontal) postulated by Rogers would create an anticlinal along the central axis of the chain, and that would result in the shallower layers being under tension, rather than compression: but so then why is folding always so pronounced when one looks at the central axis of a mountain chain? It had been observed by De La Beche in the Alps, for instance, (Vose again) “that the component beds have been squeezed up from both sides against a main central line corresponding with the main range of mountains (which would be Mr Hall’s line of greatest accumulation), thus throwing the beds into flexures very abrupt at the centre line, but decreasing in going towards the margin.” Vose concludes: “It seems very strange, that many geologists, while admitting that [before being elevated] the strata have subsided to a depth at least equal to their thickness, [...] attribute the folding of the rocks, not to the downward, but to the upward movement; when, as Mr. Rogers observes, an upward movement would extend and not compress the surface; and evidently no folding can be made by extension.” Now, that being said, you probably realize that Hall avoids the problem of explaining how all those folded layers are pushed back up, from about sea level to altitudes of a thousand meters or so—all he has to say about that is that folds and the rising of mountains are not directly related: folds are caused by the piling up of sediments, and they are already there when relief emerges. According to Hall, not mountain ranges, but entire continents are pushed up—and Hall, like Hutton and Lyell, doesn’t yet know why or how. 
Later in that same paper, and probably also elsewhere, Dana famously sums up Hall’s theory as “a theory of the origin of mountains with the elevation of mountains left out” (which if anybody today knows anything about James Hall, this is most likely what they’ll remember), and Vose agrees: “If, as Mr. Hall says, the folding and plication have added nothing to the height, and if no local elevation has occurred, why are the Appalachian summits now above the Mississippi basin, the Rocky Mountains above the great western desert, the Alps above the plains of Lombardy, or the Andes above the Pampas of South America? [...] The height of the mountains, Mr. Hall says, is not due to the folding or plication... the cause producing the disturbance is not that producing the elevation, the accumulation of sediments has been local, but the elevation continental [...]. Now, does it not follow, that if the accumulations are local, the foldings dependent on the accumulations, the mountains present just where the foldings are most extreme and the accumulations deepest, and not where the strata are horizontal and of but little depth, that the elevation is also local, and not continental; and that the folding and elevation [...] are to be considered as proceeding from a common cause, or, at any rate, as forming parts of a common process? We think with Mr. Dana upon this point, that, when the last of the sedimentary beds is laid down, the whole is still under water; and that some force is needed to raise them above the ocean to entitle them to a place among the earth’s mountains.” And then there are some other problems. Dana explains that point 3 in his summary of Hall’s model is a “physical impossibility”. If you think about it, it is straightforward to see what’s wrong: “The third [proposition] assumes that the first 500 ft in depth of
sediments would press down the crust 500 ft, and so on to the end; but no reason is given why sediments under water should have so immense gravitating power, when the crystalline rocks of the Adirondacks, piled to a height of some thousands of feet above the water, had a firm footing close along side of the subsiding region.” How can piles of sediment be heavy enough to force the crust to subside, if mountain ranges aren’t? And the fifth proposition needs to be revised: yes, “it is evidently a common fact that where mountains have been raised, there, in general, thick accumulations of sediments were previously made; and conversely”; but for instance the Silurian beds constituting the Green Mountains of Vermont “are probably not over half the thickness of those of the Appalachian region in Pennsylvania and Virginia; and yet the average height is greater, although exposed to erosion for a vastly longer time”—and this is just an example. So, the height of mountains is to some extent, perhaps to a large extent, independent of the thickness of the sediments of which the mountains are made.
5.4 James Dana

Dana tries to do better than Hall. He goes back to the idea of shrinking by cooling, which at least was not (yet) a “physical impossibility.” His idea of the earth is similar to the models we have seen in Chap. 2: there’s a solid crust, whose thickness we don’t really know but could be as large as, like, 1000 km (remember, e.g., Hopkins’ model), and within it there is a “fluid” layer (“fluid” in quotes, because when people say “fluid” they often mean an inviscid fluid, zero viscosity, and we’ve seen already in Chap. 2 that we expect earth materials to have at least some viscosity—think of lava); the earth is slowly cooling, and the fluid stuff shrinks quite a bit by cooling (experiment shows that’s what happens to very hot earth materials when they cool). You remember, then, Beaumont’s point, see above, that if the earth as a whole shrinks, its solid outer shell will break and fold, etc.; “The contraction of the cooling underlying layer”, says Dana in his Manual, “would result either in one or all of the following results: (1) in breaking the continuity of the layer itself, and producing rents opening downward; (2) in making open spaces between it and the crust; (3) in forcing the hard crust above, if adhering to it, to become accommodated to its own decreasing size. Owing to the second of the results of contraction here mentioned, gravity, or the weight of the crust, would produce in it lateral or tangential pressure, with its consequences; and the third would add more or less to this pressure.” Bringing the earth’s presumed contraction back into the picture, Dana swaps the cause and effect of Hall’s model: it’s not the weight of sediments that makes them sink, deforming the crust underneath them, but, rather, the deformation of the crust—the collapse of the rigid outer shell of a shrinking planet—that creates troughs; and sediments then naturally pile up in those troughs203 . 
And but just like in Hall’s model, as they pile up, sediments become folded, and so on and so forth. Now, Dana has a detailed theory of how it all happened, which is based on the very important geophysical fact that the surface of the earth is made of very large
continents and very large oceans, and, needless to say, a continent and an ocean are two very different entities. For one, I don’t know how well established this was in Dana’s time, but the oceanic crust is made of rock (mostly basalt) that is quite different from the rock that makes up the continental crust (mostly granite). Starting with, I think, Eduard Suess204, people would use the word sal, or sial to refer to the average rock from the continental crust, and sima for the oceanic-crust stuff; both words are acronyms: sial stands for “rich in silicate of alumina”, which continental rock is, and sima for silicate of magnesium, which is abundant in basalt and in rocks that you tend to find in the oceanic crust205. What is most important right now is that sial and sima differ in fusibility (oceanic crust/sima is more fusible, i.e., the temperature at which it melts/freezes is lower than that at which continental crust/sial melts/freezes) and density (basalt/sima is heavier than granite/sial). From this Dana infers that, for some reason, the composition of the early, molten earth changed from place to place; meaning that the density and fusibility of rock changed from place to place: and of course the less fusible areas—those with a higher temperature of fusion—would be the first to start freezing. Solidification starts from the surface (lowest temperature) and advances downward: so, at any given time, the crust is thicker in areas where solidification has begun earlier. Continental crust, then, whose fusibility is lower, is always thicker than oceanic crust. By Archimedes’ principle206 the thinner (and also heavier) oceanic crust is depressed with respect to the thicker (and lighter) continental crust (think of a pair of icebergs, a bigger one and a smaller one). Which explains why continents and oceans exist207. 
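The iceberg analogy is easy to put in numbers. Here is a minimal sketch of Archimedes’ principle applied to two floating crust blocks; the densities and thicknesses are illustrative round modern values, not figures that Dana himself used:

```python
# Hedged sketch: two crust "icebergs" floating on a denser fluid substratum.
# A floating slab of thickness h and density rho_c, on a fluid of density
# rho_m, sticks out above the fluid surface by h * (1 - rho_c / rho_m).

def freeboard(thickness_km, rho_crust, rho_substratum):
    """Height of a floating slab above the fluid surface, at isostatic equilibrium."""
    return thickness_km * (1.0 - rho_crust / rho_substratum)

rho_mantle = 3300.0                             # kg/m^3, the "fluid" substratum
cont = freeboard(35.0, 2700.0, rho_mantle)      # thick, light (granitic) crust
ocean = freeboard(7.0, 2900.0, rho_mantle)      # thin, dense (basaltic) crust

print(f"continental freeboard: {cont:.1f} km")  # ~6.4 km
print(f"oceanic freeboard:     {ocean:.1f} km") # ~0.8 km
```

The thick, light block rides several kilometers higher than the thin, dense one, which is Dana’s point: given those two kinds of crust, continents and ocean basins follow from buoyancy alone.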
The geographic distribution of volcanoes seems to fit the model: “continents have, to a great extent, been long free from volcanoes”, while “over the bed of the ocean [...] nearly all islands are volcanic”: so then this suggests “that continental areas were first free from eruptions; they cooled first, and were and are the thickest, and the contraction affected most the oceanic areas. The great depressions thus began which are now the oceans”, etc. The quote is from Vose’s book, the section where he expounds Dana’s thought208 . “Before these depressions were very deep,” continues Vose (and see Fig. 5.4 here), “they would be too small to contain the seas; and thus the whole land would be under water,—and we know that, in Silurian times, nearly every part of the globe was beneath the sea. [...] The more prominent effects of contraction, then, will be depressions provided the contraction is unequal, apparent elevations as a consequence of the depressions, fissures and ejections of igneous matter there through, upheaval along the lines of fissure, and upliftings and foldings from lateral pressure. The results of the contracting of the ocean-bed would be seen at the junction of the contracting and non-contracting areas, i.e., at the borders of continents; and, in fact, mountain ranges do generally follow coast lines,—an arrangement particularly well seen in the Andes, Appalachians, and mountains of the Western United States. [...] As the ocean-bed became thicker (through sedimentation), the contraction would produce a lateral tension against the non-contracting areas (the continents), which would occasion pressure, upheaval, &c.” In a nutshell, Dana’s model of mountain formation combines Hall’s ideas with the shrinking-earth theory. First, large-scale troughs are formed—presumably because
Fig. 5.4 Outer surfaces of the solid earth (solid line) and of the ocean (dashed line), before (left) and after (right) contraction, according to Dana’s theory of the formation of oceans and continents. Before cooling and shrinking, the “primitive” earth (left) is a perfect ellipsoid, entirely covered by a water layer. Because the chemical composition of rock is not homogeneous over the primitive earth, the earth shrinks asymmetrically. Once the surface of the earth has become sufficiently irregular (right), the same volume of water will not be enough to cover it entirely
the earth shrinks; then, sediments pile up in the troughs209. One such trough, filled with sediments, is called a geosynclinal, or, more commonly, I think, geosyncline. The prefix geo- is there to remind us that this is a very large scale feature: you might have all sorts of smaller anti- and synclinals in the folded layers within a geosyncline210. Once the geosyncline is there, lateral forces, again related to the shrinking of the earth, squeeze it and turn it into a mountain range. Vose explains all this pretty well in his book, and then he shows that there is a big problem with Dana-Hall’s model. “A general objection, also, to the plication of the earth’s crust by the contraction dependent on secular refrigeration, may be stated here”, he says. “The shortening of a given line on the earth’s surface from a general contraction is quite insignificant as compared with the actual compression which has taken place in folded regions. If a quarter of the globe contracted laterally until the surface sunk eight miles, it would cause a lateral compression of only twelve miles and a half in the quadrant of six thousand miles, or one mile in five hundred, which, if the material was absolutely incompressible (which, of course, cannot be admitted), would produce only a single wave, a mile high and two miles wide, in a tract five hundred miles in breadth.” This is an important point, I think, so let me restate it in, I hope, simpler words. Let’s say the earth shrinks, because of cooling, by 10 km, i.e., its radius used to be 6380 km, and is now 6370 km. This means a really important loss of volume—a lot of shrinking. The length of a great circle (two, times π, times the earth’s radius), then, is reduced by about 60 km, meaning about 0.2% (or, indeed, one km, or mile, in five hundred). The crust doesn’t shrink—its temperature is already relatively low—but wrinkles form in it. 
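Vose’s arithmetic is easy to check. Using his own numbers (a quadrant of roughly six thousand miles whose surface sinks eight miles), the shortening of the quadrant is just (π/2) times the drop in radius:

```python
import math

# Checking Vose's numbers: a quadrant is a quarter of a great circle, so its
# length is (pi/2) * r, and a drop of 8 miles in r shortens it by (pi/2) * 8.
quadrant = 6000.0                     # miles, roughly a quarter of a great circle
sink = 8.0                            # drop in the earth's radius, miles
compression = (math.pi / 2) * sink    # shortening of the quadrant, miles

print(f"lateral compression: {compression:.1f} miles")    # ~12.6 miles
print(f"about one mile in {quadrant / compression:.0f}")  # roughly "one in five hundred"
```

That is indeed Vose’s “twelve miles and a half”, or about one mile in five hundred: a tiny fractional shortening compared with the folding actually seen in the field.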
Vose calculates that one wrinkle, or fold (“wave”), a few km wide and a couple km high is enough to account for 1 km of horizontal reduction211 . The thing is, if you go to, like, the Alps, and look at the folds there, you see a tremendous
amount of folding—much more than a couple-mile “wave” every five hundred miles. Which means, if you want to explain the folds in terms of the earth’s contraction, you need a reduction in the earth’s radius much bigger than just 10 km. Which, in turn, was hard to justify on the basis of what people knew, or thought they knew, of how the earth’s temperature evolved over time (which see Chap. 4). And that is not it. Vose adds that “contraction from cooling would have been most active in the most ancient times”, i.e., I guess, when temperature was higher and was diminishing more rapidly (again, Chap. 4, and Kelvin’s error-function model); “while the mountains, so far as we know, have all been made since the deposition of fossil-bearing rocks,—many of them, indeed, later than the carboniferous age”. Because, yes, the folded layers are all full of fossils. These are very good points, but somehow the geosyncline model lived on, particularly in the U.S. In Europe, Eduard Suess came up with an alternative model— which, unless I missed it, is not in Vose’s book. Suess’ model is still a shrinking-earth model—the earth cools, and shrinks because of cooling, and that’s ultimately what causes relief; it isn’t better than Dana-Hall at explaining the actual amount of folding that’s observed in the field. But it has been very popular, and Suess’ Das Antlitz der Erde (The Face of the Earth), first edition 1885, was a best-selling geology textbook for many years—so it’s worth taking a look. “Suess”, writes Naomi Oreskes212 , “assumed that Earth’s initial crust was continuous, but broke apart as the interior shrunk. The collapsed portions formed the ocean basins; the remaining elevated portions formed the continents.” Which up to here, not too far from Dana’s; but then, “with continued cooling, the original continents became unstable and collapsed to form the next generation of ocean floor, and what had formerly been ocean now became dry land. 
Over the course of geological history, there would be a continual interchange of land and sea, a periodic rearrangement of the land masses. “The interchangeability of continents and oceans explained a number of other perplexing geological observations, such as the presence of marine fossils on land [...] and the extensive interleaving of marine and terrestrial sediments in the stratigraphic record. Suess’ theory also explained the striking similarities of fossils in parts of India, Africa, and South America. Indeed, in some cases the fossils seemed to be identical, even though they were found thousands of miles apart”, and with an ocean in between213 . If Darwin was right, and “plants and animals had evolved independently in different places within diverse environments, then why did they look so similar?” Suess proposed that, in some early geological age, India, Africa and South America were all connected by land; i.e., instead of a bunch of continents, separated by oceans, there was one “supercontinent”, that he called Gondwanaland, or Gondwana214; and those species, very similar but now found in very distant places, had evolved during that age. The “striking” similarity of fossils across oceans is a very important observation—I’ll get back to it shortly. But first, there’s two interesting differences between Dana-Hall and Suess; for one, we’ve seen that in Dana’s model an ocean is always an ocean and a continent is always a continent: “The boundaries between continents and oceans took up most of the pressure”, says Oreskes, “like the seams on a dress—and
so mountains began to form along continental margins. With continued contraction came continued deformation, but with the continents and oceans always in the same relative positions. Although Dana’s theory was a version of contraction, it came to be known as permanence theory, because it viewed continents and oceans as globally permanent features.” The other big difference between Suess and Dana-Hall is that, in Dana’s scheme, the earth’s shallowest layer—the crust, whether oceanic or continental—floats above the underlying (molten, or viscous: see Chap. 2) substratum because it is less dense than the substratum: Archimedes’ principle. Also, like I said earlier, continental crust (granite) is, on average, lighter than oceanic crust (basalt), so it makes sense that the average elevation of continents should be higher than that of oceans. This simple observation, combined with Archimedes’ principle, makes it difficult to defend Suess’ theory that a continent should sink deeper than oceanic crust. But Dana’s ideas, and the theory of isostasy, which I’ll tell you about in a sec, must have been less successful in Europe than they were in the U.S.: and Suess got away with it215.
5.5 Isostasy

So, anyway, isostasy216: which is also an important introduction to what comes next. George Biddell Airy, astronomer royal from 1835 to 1881, published in 1855 a short comment on a long paper by John Henry Pratt, archdeacon of Calcutta. Both articles appeared, back-to-back, in volume 145 of the Philosophical Transactions of the Royal Society of London. Pratt was trying to make sense of data from George Everest’s217 survey of India (published in 1847), which did not match astronomical observations: “It has been found by triangulation”, says Pratt, “that the difference of latitude between the two extreme stations of the northern division of the arc, that is, between Kalianpur and Kaliana, is 5° 23′ 42.294″, whereas astronomical observations show a difference of 5° 23′ 37.058″, which is 5.236″ less than the former”218. If you remember the Schiehallion experiment, you are probably saying: of course: there’s the Himalaya right there, which has a huge mass, which deflects plumb lines. That is also what Pratt initially thought: a “very probable cause [of the discrepancy] is the attraction of the superficial matter which lies in such abundance on the north of the Indian arc [i.e., the Himalaya, I guess]. This disturbing cause acts in the right direction; for the tendency of the mountain mass must be to draw the lead of the plumb-line at the northern extremity of the arc more to the north than at the southern extremity, which is further removed from the attracting mass. Hence the effect of the attraction will be to lessen the difference of latitude, which is the effect observed.
Whether this cause will account for the error in the difference of latitude in quantity, as well as in direction, remains to be considered, and is the question I propose to discuss in the present paper.” And so Pratt embarked on the “herculean undertaking” (his words) of calculating “the attraction of the masses of which the Himalayas, and the regions beyond, are composed”, i.e., essentially, to do for the
whole of central Asia what C. Hutton had done for Mount Schiehallion219. Pratt claims at the beginning of his paper to have invented a method that simplifies “very greatly” the calculations; still, the paper is 48 pages long—almost a short novel—and it doesn’t look that simple, so we’re going to skip most of it. The bottom line is that the theoretical deflection that Pratt manages to come up with is around 30″, much larger than the observed 5″. He tries changing, within reasonable limits, the values of some parameters that are not well known (the density of rocks; the topography of some areas), but that doesn’t really change the final figure much. He comes to the conclusion that something must be wrong in the way the data were collected. “The whole subject, however, deserves careful examination; as no anomaly should, if possible, remain unexplained in a work conducted with such care, labour, and ability, as [Everest’s] measurement of the Indian arc has exhibited.” And so Airy did examine the subject carefully in his short, 4-page document “On the Computation of the Effect of the Attraction of Mountain-Masses, as Disturbing the Apparent Astronomical Latitude of Stations in Geodetic Surveys.” Airy’s explanation of Pratt’s enigma is quite simple. Remember (Chap. 2) that it was generally accepted at this point that the earth was originally fluid, and has slowly become solid, or is perhaps still in the process of becoming solid, starting from its surface. In Airy’s words: “Although the surface of the earth consists everywhere of a hard crust, [...] yet the smallness of [the surface’s] elevations and depths, the correctness with which the hard part of the earth has assumed the spheroidal form, and the absence of any particular preponderance either of land or of water at the Equator as compared with the poles, have induced most physicists to suppose, either that the interior of the earth is now fluid, or that it was fluid when the mountains took their present forms.
This fluidity may be very imperfect220; it may be mere viscidity; it may even be little more than that degree of yielding which (as is well known to miners) shows itself by changes in the floors of subterraneous chambers at a great depth when their width exceeds 20 or 30 ft; and this yielding may be sufficient for my present explanation.” Now assume, “to fix our ideas,” that the crust is fairly thin, say ten miles. Airy makes a drawing of this, including a prominence representing a “table-land, 100 mi broad in its smaller horizontal dimension, and two miles high”, so that the crust under the “table-land” would be twelve (instead of ten) miles thick. But this doesn’t make any sense, says Airy: those two extra miles of rock would result, via gravity, in a tremendous downward pull that would break the crust—unless the surrounding crustal material were able, through what Airy calls “cohesion,” to exert an equal and opposite vertical force on the table-land segment of the crust: Airy does a rough calculation and concludes that this “cohesion must be such as would support a hanging column of rock twenty miles long”; and “I need not say that there is no such thing in nature.” So, bottom line, the table-land necessarily sinks into the underlying “lava” (Airy’s word). So, what’s going on? i.e., how come we do have mountains on the surface of our planet, and they don’t sink? Airy’s explanation221 is so brief and clear that I do not need to rephrase it: “I conceive that there can be no other support [preventing the table-land from sinking] than that arising from the downward projection of a portion of the earth’s light crust into the dense lava; the horizontal extent of that projection
Fig. 5.5 Airy’s isostasy: mountains float on the earth’s denser substratum like icebergs in water
corresponding rudely with the horizontal extent of the table-land, and the depth of its projection downwards being such that the increased power of floatation thus gained is roughly equal to the increase of weight above from the prominence of the table-land. It appears to me that the state of the earth’s crust lying upon the lava may be compared with perfect correctness to the state of a raft of timber floating upon water; in which, if we remark one log whose upper surface floats much higher than the upper surfaces of the others, we are certain that its lower surface lies deeper in the water than the lower surfaces of the others.” The sketch in Fig. 5.5 shows how Airy’s idea explains the Everest-Pratt dilemma: compared to a reference model, where the earth is made up of a suite of concentric shells, a mountain is an extra mass that attracts a plumb bob towards itself. But, says Airy, the shallow layer—the crust, of which the mountain is part—is lighter than the deeper ones: it floats on the deeper ones. And so, the way Archimedes’ principle works, wherever the outer surface of the earth has some topography, there’s also got to be a root under it, to compensate. Or in other words: mountains float on the earth’s substratum like icebergs in water: they float because ice is lighter than water, and they’ve got an underwater root whose size is proportional to the elevation of the iceberg above sea level. Now, if Airy’s right, besides the extra mass above sea level that tends to deflect the plumb-bob, we’ve also got to consider the deficit of mass below sea level, where the root, right underneath the mountain, is lighter than the average rocks around it: so it attracts the bob less than the rest of the substratum does: which, if you do a force balance, you’d see that the net force acting on the bob pulls it almost towards the center of the earth: which is what Everest had observed near the Himalaya222. And this, basically, is what we call isostasy.
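The floating-iceberg balance can be put into numbers. Here is a minimal back-of-envelope sketch of the root calculation (my own illustration, not Airy's actual figures), using typical modern density values for light granitic crust and for the denser substratum:

```python
# A back-of-envelope version of Airy's floating-crust balance.
# Density values are illustrative modern estimates (assumed), not Airy's.
RHO_CRUST = 2700.0        # kg/m^3, light granitic crust
RHO_SUBSTRATUM = 3300.0   # kg/m^3, denser substratum ("lava")

def root_depth(elevation_km):
    """Root needed to float topography of the given elevation.

    Equal mass in every vertical column (Archimedes):
        elevation * RHO_CRUST = root * (RHO_SUBSTRATUM - RHO_CRUST)
    """
    return elevation_km * RHO_CRUST / (RHO_SUBSTRATUM - RHO_CRUST)

# Airy's "table-land, two miles high" is about 3.2 km:
print(round(root_depth(3.2), 1))  # -> 14.4: the root dwarfs the relief
```

With these densities the root is four to five times as deep as the mountain is high; and since each column then carries the same total mass, the surplus above sea level and the deficit at depth nearly cancel in the plumb-bob's force balance, as described above.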
5.6 Asthenosphere

An immediate implication of Airy’s theory is that you need whatever is underneath the crust to be “weak”, i.e., the opposite of rigid, i.e., capable of deforming even under relatively small stress. Otherwise, the mountain/iceberg wouldn’t be able to sink in the substratum/ocean, and there would be no root and no isostasy. The earliest paper that I am aware of, where this point is made, is “The Strength of the Earth’s Crust—Part VI. Relations of Isostatic Movements to a Sphere of Weakness—the Asthenosphere”, published by Joseph Barrell223 in 1914 in the Journal of Geology. That’s where the word asthenosphere first shows up: which Barrell invented it to mean “sphere of weakness”, as per the title—asthenos is Greek for “weak”. (In today’s terms, what Barrell means by that is that the viscosity of the asthenosphere is very low. We’ll learn about the concept of viscosity later; at this point, perhaps, just think that “weak” is something that does not resist deformation—which the rigid “crust”, instead, does.) Barrell’s paper says, essentially, the following: (i) we know that most of the topography around the globe is isostatically compensated, meaning, you go and do a gravity measurement near it, the plumb line is not deflected, or not as much as you would have expected, as in the Everest-Pratt-Airy story. And that tends to be true all over the place224. (ii) Remember that the topography changes constantly: erosion and deposition of sediments, and stuff: and yet it’s always compensated; (iii) so, isostatic compensation is not something that happened just once in the history of the earth: “isostatic equilibrium”, says Barrell, “exists at present in spite of the leveling surface actions and compressive crustal movements of all past geologic time. There must be, consequently, some internal mode of restoring more or less perfectly an isostatic condition”.
(iv) Barrell figures the mechanism that’s needed to explain isostasy is “a thick earth-shell marked by a capacity to yield readily to long-enduring strains of limited magnitude. [...] To give proper emphasis and avoid the repetition of descriptive clauses it needs a distinctive name. [...] Its comparative weakness is in that connection its distinctive feature. It may then be called the sphere of weakness—the asthenosphere”. And (v) here is how the asthenosphere works according to Barrell: “Upon the disturbance of equilibrium by erosion and deposition [...] isostatic movements consist of a rising of the eroded areas, a sinking of those which are loaded. [...] When the accumulating vertical stresses have overcome the strength of the crust, the excess pressure from the heavy area is transmitted to the zone below the level of compensation. This deep zone is in turn the hydraulic agent which converts the gravity of the excess of matter in the heavy column into a force acting upward against the lighter column and thus deforms the crust of the eroded area. By this means even the continental interiors are kept in isostatic equilibrium with the distant ocean basins. This implies a great depth and thickness to the zone of plastic flow. Although it must be plastic under moderate permanent stresses, this does not imply by any means a necessarily fluid condition, and fluidity is disproved by other lines of evidence.” (Which here Barrell is thinking about seismic waves, I guess? Because shear waves wouldn’t travel through the asthenosphere if it were
fluid: but we know they do—and we already did in Barrell’s times. We’ll get to that in a chapter or two.) So the asthenosphere has got to be fairly thick. How thick exactly? we don’t know. But (vi) the consensus is that the asthenosphere doesn’t go on forever—there must be something rigid underneath. Barrell says that the asthenosphere is sandwiched “between the lithosphere above and the centrosphere below, both of which possess the strength to bear, without yielding, large and long-enduring strains.” The reason he’s sure that the “centrosphere”, i.e. everything that’s below the asthenosphere, must be strong, is tides225 : which I mentioned briefly their role in the last part of Chap. 2; and we’ll get back to tides in some more detail, in Chap. 7.
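Barrell's restoring mechanism, point (v) above, is easy to quantify with the same Archimedes-style column balance as before: erosion unloads a column, the asthenosphere flows, and the column pops back up until equilibrium is restored. A minimal sketch, with assumed illustrative densities:

```python
# Barrell's restoring mechanism in numbers: strip rock off a column,
# and flow in the asthenosphere lifts it until columns balance again.
# Densities are assumed illustrative values, not Barrell's.
RHO_CRUST = 2700.0          # kg/m^3
RHO_ASTHENOSPHERE = 3300.0  # kg/m^3

def net_surface_lowering(eroded_km):
    """Erode a thickness of crust; isostatic rebound lifts the column
    by eroded * RHO_CRUST / RHO_ASTHENOSPHERE, so the surface only
    drops by the difference."""
    rebound_km = eroded_km * RHO_CRUST / RHO_ASTHENOSPHERE
    return eroded_km - rebound_km

# Eroding 1 km of rock lowers the surface by only about 180 m:
print(round(net_surface_lowering(1.0) * 1000))  # -> 182 (meters)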
5.7 Continental Drift

If we believe that rigid continents float on top of a viscous asthenosphere, relatively easy to deform, then it is not so weird to make the hypothesis that continents might also move horizontally: which is what Alfred Wegener226 did at the turn of the twentieth century, in his Die Entstehung der Kontinente und Ozeane, AKA The Origin of Continents and Oceans, first published in Germany in 1915227. Wegener figured that allowing continents to “drift” would be a good way to explain a number of otherwise mysterious facts. There was, of course, the old228 observation that the edges of continents fit into one another like the pieces of a puzzle (Fig. 5.6). There were other discoveries; I mentioned, already, that people had found very similar fossils across oceans. Glossopteris were large trees, now extinct, of which today we find fossils that are between 300 and 250 million years old. There existed many different species of Glossopteris, all with at least one feature in common: the tongue-shaped leaf which is what gives them their name229. Adolphe Brongniart230, who pretty much invented what we now call “paleobotany”, shows, in his Histoire des Végétaux Fossiles, or History of Fossil Plants (published between 1828 and 1837), that Glossopteris lived in parts of India and Australia. Later, people found fossils of Glossopteris in a number of other places in the southern hemisphere; in 1912, Robert Scott’s British expedition found Glossopteris even in Antarctica231. Now, if you make a map of where Glossopteris appears to have lived, see Fig. 5.7, the pattern you get is strange, to say the least: this very same flora is found in totally different climates—e.g., Antarctica and Africa—and in places that are separated by oceans: e.g., in current Argentina and in the southern half of Africa—but not in Brazil, or central Africa. However, if you do the Pangea thing (Fig.
5.6), like Wegener did, and solve the puzzle, all those areas become connected—again Fig. 5.7. Glossopteris is the most famous example, I think, of flora or fauna with a geographic distribution that needs continental drift to be explained; but not the only one232 . In Chaps. 2 and 6 of Die Entstehung, Wegener gives long lists; essentially, even before Wegener, paleontologists agreed that continents must have been connected at some point in the history of the earth—otherwise, it would be impossible to explain that identical fauna and flora should emerge at the same time, but in
Fig. 5.6 Fact: the edges of continents fit into one another like the pieces of a puzzle. Wegener decided to call “Pangaea”, or “Pangea”, the continent you get by putting together all the continents of today. Its southern half included all continents that were part of Suess’ Gondwana, and so was named after it; and later, people invented the name “Laurasia” for Pangaea’s northern half. (People also figured that, in Pangaea times, an ocean that they called Tethys separated Africa and India from the rest of Asia. Then, while North and South America separated from Europe and Africa, opening the Atlantic, Africa and India converged towards Laurasia, closing Tethys.) (Tethys, incidentally, is a minor goddess in Greek mythology, associated with rivers and seas and, generally, water.)

Fig. 5.7 The geographic distribution of Glossopteris in Pangaea. The numbers 1 through 6 stand for South America, Africa, Madagascar, India, Antarctica and Australia, respectively. The shaded area is where fossils of Glossopteris have been found
completely separated, far away parts of the world. Paleontologists were not necessarily supporters of Wegener, though, because they figured Suess’ idea of a shrinking earth, with continents sinking below oceans and oceans becoming continents, could also explain their observations—and it was more widely accepted by the geology community233. The phrase “land bridges” was used, to refer to more or less large continental masses that had allowed various forms of life to spread between continents—before collapsing. Chapter 7 of Die Entstehung is about the “paleoclimatic arguments” in favor of drift theory. Before we get into that, I need to say something about ice ages; which you’ve probably been told at some point in your life that on several occasions the earth’s climate had gotten so cold that glaciers in the more northern and southern parts of the globe advanced from the places where they usually hang out into the surrounding regions234. Then, each time, the climate got warmer again and the ice melted. So, for example, the last ice age ended some 12,000 years ago and lasted about 100,000 years. The whole of Scandinavia and the Baltic regions and most of Britain were covered in ice; and much of North America as well. So, then, you can look for the traces left by glaciers as they expanded over, and then retreated from, vast areas: the remains of glacial moraines235, and/or the tracks left, through friction, by the advancing ice on the rocks that lay under it: which you don’t expect to find very close to the equator. And not just that, of course, not just the ice: looking at the fossils embedded in sedimentary rocks, you can make inferences re the environment and climate where they lived (and at the time when their sedimentary layer was formed). And/or, from coal deposits, you can figure out the flora that the coal was made from. And so on, and so forth.
Wegener took all those data, associated with one geological period—the Carboniferous, i.e., about 300 million years ago—and he found that stuff that should be close to the equator was far from it; and that in central Africa you see things you’d rather expect to see near the poles. But, Wegener says, if you allow the continents to move, and reconstruct their positions during the Carboniferous, according to the displacements of continents that we saw already in Fig. 5.6, then everything begins to make sense: with the South Pole not so far from its current position in Antarctica, and all the continents clustered around Antarctica, so that much of Africa is quite close to the South Pole, while the equator crosses parts of North America, Europe and central Asia: see Fig. 5.8. From paleontology and paleoclimate data, by the way, one can also more or less date the displacement of continents. Because if you know, e.g., the timespan of Glossopteris, you know that the continents where Glossopteris lived must have been attached to one another during at least the beginning of that timespan, to allow Glossopteris, wherever they originally appeared, to spread over their whole territory. Repeat the same exercise with some other species’ data, and you should be able to estimate how fast continents converged or diverged, etc. Then, there’s the “geological arguments”, which Wegener covers in Chap. 5 of his book. Take two continents that, according to Fig. 5.6, are now separated but were once together, and go look at the geology of the margins that they used to share—which is where the break-up must have happened. It turns out that the geological
Fig. 5.8 Wegener reconstructed the ancient positions of the poles and of the equator, from the distribution of certain kinds of sedimentary rock. He figured that, during an ice age, glaciers all advance away from the poles, so if you can tell which way some of them were going in a given ice age, you can trace the location of a pole at that time. According to Wegener’s reconstruction, the South Pole during the Carboniferous is still within Antarctica—i.e., Antarctica hasn’t moved much; but other continents moved quite a bit with respect to Antarctica. Ninety degrees from where he had placed the ancient South Pole, Wegener also found lots of evidence for an ancient, humid equatorial zone: vast deposits of coal, that can be associated with tropical plants; and salt and wind-deposited sand, meaning ancient deserts. In other words, the equator during the Carboniferous ice age was exactly where you’d expect it to be—ninety degrees north of what was then the South Pole
formations that you see on either side look the same. “By comparing the geological structure of both sides of the Atlantic”, says Wegener, “we can provide a very clearcut test of our theory that this ocean region is an enormously widened rift whose edges were once directly connected, or so nearly as makes no difference. This is because one would expect that many folds and other formations that arose before the split occurred would conform on both sides, and in fact their terminal sections on either side of the ocean must have been so situated that they appear as direct continuations of each other in a reconstruction of the original state of affairs. Since the reconstruction itself is necessarily unambiguous because of the well-marked outlines of the continental margins and allows no scope for juggling, we have here a totally independent criterion of the highest importance for assessing the correctness of drift theory.” And then there is a long list of field observations, mostly from the work of Alexander du Toit—who was originally from South Africa, but went to do his fieldwork in South America: “Particularly thorough comparative studies have been carried out by the well-known South African geologist du Toit, who made a journey of exploration in South America for this purpose. The results of his investigation, which includes a very complete survey of the literature, were published in 1927 as No. 381 of the Carnegie Institution of Washington (157 pages) with the title A Geological Comparison of South America with South Africa. The whole work is a unique geological demonstration of the
correctness of drift theory so far as these parts of the globe are concerned. If we wanted to cite every detail in the book which favors the theory, we would have to translate it [du Toit’s book must have been published in English; and Wegener wrote and published his own book in German] from start to finish. There are many statements like the following: ‘Indeed, viewed even at short range, I had great difficulty in realising that this was another continent and not some portion of one of the southern districts in the Cape...’ (p. 26). [...] Conformities between the two sides of the ocean, he says, are now known in such numbers that it is no longer possible to imagine them accidentally coexisting, particularly since they cover vast stretches of land and the time span from the pre-Devonian to the Tertiary. Du Toit adds: ‘Furthermore, these so called coincidences are of a stratigraphical, lithological, paleontological, tectonic, volcanic and climatic nature’ ”, etc. In Fig. 5.9 you have a drastically simplified graphic summary of du Toit’s findings. The same sort of observations had been made in eastern Africa/Madagascar/India, and in Europe vs. North America236. Another geological argument in favor of the drift we’ve seen already: it’s the problem with the amount of folding seen in places like the Alps, which is way too much to be explained by global shrinking. Wegener picks it up early in Die Entstehung: “it was particularly the discovery of the scale-like ‘sheet-fault structure’ or overthrusts in the Alps”, he writes, “which made the shrinkage theory of mountain formation, which presented enough difficulties in any case, seem more and more inadequate. This new
Fig. 5.9 Geological evidence in favor of drift theory: rock formations found in different continents merge into one another when continents are lined up next to each other according to Wegener’s reconstruction. Shaded areas are what geologists call cratons (which are usually found in the middle of continents: places where not much ever happened in the way of orogenesis and deposition of sediments, etc., and you find plenty of outcrops of crystalline, basement rocks—kind of like the stuff that you find along the axis of a mountain range, remember Fig. 3.2); dashed lines are where mountain chains are, and are oriented like the chains (i.e., parallel to the chain’s axis, perpendicular to the direction along which compression has occurred, which you figure out by looking at the folds, etc.). A little chunk of the west-African craton is found in northern Brazil; and a little chunk of the Angolan craton is found around Salvador de Bahía. Likewise, there is continuity in the trend of mountain chains, from one continent to the other
concept [...] leads to the idea of far larger compressions than did the earlier theory.” It was mostly Marcel Bertrand237 who figured that the Alps were not only folded, like in Fig. 5.1 above, but they had been under so much horizontal pressure that the fold would eventually break, and a sheet of folded material from one side would eventually be pushed above the other side—over distances on the order of a few km (see Fig. 5.10). Geologists called this kind of rupture a “thrust fault”, and the sheet-like body of rock that got moved on top of the thrust fault would be a “thrust sheet”238. So, without allowing for faulting à la Bertrand, says Wegener, “Heim calculated239 in the case of the Alps a 50% contraction, but on the basis of the sheet-faulting theory, now generally accepted, [Heim calculated a] contraction to 1/4 to 1/8 of the initial span”, meaning, Albert Heim calculated that, if you were to “unfold” the Alps and spread the strata out, their north-south extension would be four to eight times longer than it is now: “Since the present-day width of the chain is about 150 km, a stretch of crust from 600 to 1200 km wide (5–10° of latitude) must have been compressed in this case. Yet in the most recent large-scale synthesis on Alpine sheet-faults, R. Staub agrees with Argand that the compression must have been even greater. On page 257 he concludes: ‘The Alpine orogenesis is the result of the northward drift of the African land mass. If we smooth out only the Alpine folds and sheets over the transverse section between the Black Forest and Africa, then in relation to the present-day distances of about 1800 km, the original distance separating the two must have been about 3000 to 3500 km, which means an alpine compression (in the wider sense of the word Alpine) of around 1500 km. Africa must have been displaced relative to Europe by this amount.
What is involved here is a true continental drift of the African land mass and an extensive one at that.’ Other geologists have put forward similar views”, etc. Wegener’s diagram of what a vertical section through a continental margin should look like, from Chap. 4 of Die Entstehung, is reproduced in Fig. 5.11 here: continents, made of sial, float on top of a layer of sima, which is heavier than sial and behaves like a viscous fluid. Wegener doesn’t seem to be aware of Barrell’s work—or, in any case, I didn’t find Barrell cited in Wegener’s book—but: “the liquid in which the crust is immersed apparently has a very high viscosity,” says Wegener, “one which is hard to imagine, so that oscillations in the state of equilibrium are excluded and the tendency to restore equilibrium after a perturbation is one which can only proceed with extreme slowness, requiring many millennia to reach completion. Under laboratory conditions, this ‘liquid’ would perhaps scarcely be distinguishable from a ‘solid.’ ” Which is pretty much Barrell’s model. To Wegener, though, the asthenosphere and what we would now call the “oceanic crust” are one and the same thing, i.e., the sima layer. (He also doesn’t seem to have an idea of, or doesn’t worry about, how deep the sima layer should be.) This model goes very well with the theory of isostasy, and with the concept of oceans and continents as two different entities, that goes back to Dana240. The continental margin is also where mountain ranges are formed: either because two continents clash with one another—like India and Asia241: which Himalaya is right at Asia’s southern margin—or because of the friction between sial and sima, as sial “drifts” around: for instance, “in the westward drift of both Americas,” writes Wegener, “their leading edges were compressed and folded by the frontal
Fig. 5.10 Development of an overlying fold into a thrust fault, after Albert Heim’s Der Bau der Schweizeralpen, 1908. Look at points a and c and see how their relative positions change over time
resistance of the [...] Pacific floor, [...] a source of viscous drag. The result was the vast Andean range which extends from Alaska to Antarctica. Consider also the case of the Australian block, including New Guinea [...]: on the leading side, relative to the direction of displacement, one finds the high-altitude New Guinea range, a recent formation”, etc. And global topography maps agree that recent mountain ranges are, more or less, where Wegener would have expected them to be.
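Heim's and Staub's unfolding numbers, quoted above, lend themselves to a quick sanity check. A back-of-envelope sketch (my own arithmetic, using the rough conversion 1° of latitude ≈ 111 km):

```python
# Quick check of the "unfolding" arithmetic quoted from Heim and Staub:
# how much crust was squeezed into the present-day width of the Alps?
KM_PER_DEGREE = 111.0      # rough length of one degree of latitude

present_width_km = 150.0   # present-day north-south width of the Alps
unfold_factors = (4, 8)    # Heim: contraction to 1/4 to 1/8 of the span

for factor in unfold_factors:
    original_km = present_width_km * factor
    degrees = original_km / KM_PER_DEGREE
    print(f"{original_km:.0f} km (~{degrees:.1f} degrees of latitude)")
# 600 to 1200 km of crust squeezed into 150 km: compare that with the
# mere ~10 km of radius reduction available from a shrinking earth.
```

Which makes the point of the whole section in two lines of arithmetic: the shortening recorded in one mountain belt alone is orders of magnitude more than contraction theory can supply.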
Fig. 5.11 Wegener’s “vertical section of the margin of a continent”, from Chap. 4 of Die Entstehung. The layer filled with horizontal dashes is the water layer—the ocean. Oblique lines mean continent, and the dots stand for the sima layer, which is both under the continents and directly under the seafloor: in Wegener’s model, it is both asthenosphere and oceanic crust
The one thing that’s missing in Wegener’s theory is a driving force: some mechanism capable of displacing continents. It’s not a small detail. Wegener speculates, but the ideas he manages to come up with are not (yet) convincing—and he’s aware of that242. Chapter 9 of his book is called “The Displacement Forces”, and it starts like this: “the determination and proof of relative continental displacements, as shown by the previous chapters, have proceeded purely empirically, that is, by means of the totality of geodetic, geophysical, geological, biological and paleoclimatic data, but without making any assumptions about the origin of these processes. This is the inductive method, one which the natural sciences are forced to employ in the vast majority of cases. The formulation of the laws of falling bodies and of the planetary orbits was first determined purely inductively, by observation; only then did Newton appear and show how to derive these laws deductively from the one formula of universal gravitation. This is the normal scientific procedure, repeated time and again. “The Newton of drift theory has not yet appeared. His absence need cause no anxiety; the theory is still young and still often treated with suspicion. In the long run, one cannot blame a theoretician for hesitating to spend time and trouble on explaining a law about whose validity no unanimity prevails. It is probable, at any rate, that the complete solution of the problem of the driving forces will still be a long time coming, for it means the unravelling of a whole tangle of interdependent phenomena, where it is often hard to distinguish what is cause and what is effect”, etc. We’ll see in Chap.
8 that one key to unravelling the whole tangle is radioactivity—and Wegener was aware of its implications, because in the introductory chapter of his book he mentions it as evidence against the cooling/shrinking earth theory: “even the apparently obvious basic assumption of contraction theory, namely that the earth is continuously cooling,” he says, “is in full retreat before the discovery of radium. This element, whose decay produces heat continuously,243 is contained in measurable amounts everywhere in the earth’s rock crust accessible to us. Many measurements
lead to the conclusion that [...] if the inner portion had the same radium content [as the shallowest layers—those we have access to], the production of heat would have to be incomparably greater than its conduction outwards from the centre, which we can measure” (see Chap. 4, of course); but then Wegener stops, because he has no way of estimating the amount of radioactivity in the deep earth. Arthur Holmes will pick up the radioactivity argument after Wegener’s death, in the 1930s and ’40s; and from it he will build a convincing, and essentially correct—or, at least, so we think today—model of a driving mechanism. We’ll meet Holmes in Chap. 8. The historian of science Henry Frankel244 tells of the several meetings, in the 1920s, held by earth science associations around the world to discuss the continental drift hypothesis. People at those meetings, he says, “were overwhelmingly against the theory”: the supporters of Wegener were always a small minority. The meeting that tends to be mentioned in earth science books is the symposium of the American Association of Petroleum Geologists, organized in 1926 by Willem van Waterschoot van der Gracht245,246—which “was the first international meeting to be held on drift theory”, says Frankel. “Wegener himself, [Frank B.] Taylor—the American geologist who came up with his own drift hypothesis independently of Wegener—and van der Gracht were the only proponents of the drift hypothesis; of the other eleven participants, only two, John Joly and G. A. F. Molengraaff, were at all sympathetic toward the drift programme.” And Anthony Hallam (not in his book, but in “Alfred Wegener and the Hypothesis of Continental Drift”, Scientific American, vol. 232, 1975): “Wegener first presented his ideas to the scientific community in 1912, but it was not until 50 years later that they gained general currency. [...] In the interim Wegener’s theory had at best been neglected, and it had often been scorned. 
At the nadir proponents of continental drift were dismissed contemptuously as cranks.”
Chapter 6
The Vibrations of the Earth
If you’ve watched the news after an earthquake, you’ve heard of its epicenter, which is the point at the surface of the earth right above the fault. The fault of course occupies more than a point: it’s a whole area that breaks, so that one side of it “slips” with respect to the other. The epicenter should be sort of like in the middle of that area, projected to the surface of the earth. From the news you might also have figured that earthquake vibrations (“seismic waves”) are felt even quite far from the epicenter. In fact, in a fairly big quake247 , chances are that a sensitive instrument like the ones seismologists use today (I am not talking about the small accelerometer in your smartphone, but some very expensive device, installed by professionals in an extremely quiet place) will pick it up even at the opposite side of the world. When a big earthquake strikes New Zealand, people in Italy won’t feel it, but seismometers in Italy will record it. And because there’s so much ground between New Zealand and Italy, the signal recorded in Italy will carry info about not only what happened on that fault “down under”, but also about the properties of rocks, deep inside the earth, along the path traveled by the waves. (Because, of course, depending on the composition of a rock, its temperature, its density and whatnot, seismic waves will propagate through it with a different speed.) 
The point I am trying to make, to begin this chapter which as you might have guessed is about what today we call seismology—a word derived from the Greek, where seismos means, precisely, earthquake—, the point I am trying to make is that what we call an earthquake is really the combination of two processes: there’s the rupturing, the breaking-up of a large mass of rock somewhere inside the earth, along some sort of crack (the “fault”): and this process is the origin, the source of the quake; and then there’s the propagation of vibrations, generated by that sudden rupture, away from the crack itself. To understand what I mean, and why it’s important, think of this: if you’ve been at a concert in some concert hall, you might have noticed that the music doesn’t sound exactly the same if you’re right by the stage versus in the back of the hall: and the difference between what you hear in those two places (and I am not talking about the
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 L. Boschi, Our Concept of the Earth, https://doi.org/10.1007/978-3-031-71579-2_6
difference in the volume of the sounds you hear, which is pretty obvious: if you are right by the speaker, volume is higher than if you are far from it; but now I am talking about everything but the volume) tells you something about what the concert hall is like: so for the earth it’s the same: the music sounds different depending on the venue, and where you stand in it. Now, because this book is about the earth more than it is about earthquakes, I should warn you that I’ll probably end up telling you much more about the seismic waves themselves than about the fracture; and more about seismic waves than about how and why those fractures occur, and whether they might be predicted, etc. (i.e., the questions that the average seismologist is asked anytime she reveals to some innocent civilians what her profession is. Sorry, but we are not really going to get into that.)
6.1 Robert Mallet, 1848

A good starting point to learn about earthquakes and earthquake waves is the long article “On the Dynamics of Earthquakes; Being an Attempt to Reduce Their Observed Phenomena to the Known Laws of Wave Motion in Solids and Fluids”, that Robert Mallet published in 1848 in vol. 21 of the Transactions of the Royal Irish Academy. Today you’ll see Mallet mentioned in many textbooks as the founding father of seismology: and rightly so, I would say, because it’s hard to find work on what today we call seismology, published before Mallet’s paper; plus, apparently, the very word, seismology, was invented by Mallet, together with epicenter and a few more that are less frequently used, but still used in earthquake-related discourse. Incidentally, Mallet is your quintessential Victorian scientist: industrialist and son of an industrialist (he inherited his dad’s iron foundry), and as such independently rich, as they say, he practiced science as a personal interest. “I have followed,” this is how Mallet begins, “I have followed, in the subsequent pages, rather the course of my own progress than a strictly systematic arrangement, and shall therefore commence with the explanation of the phenomenon above alluded to, and which led me to the subsequent investigation and results. “The phenomenon alluded to, is the displacement of the separate stones of pedestals or pinnacles, or of other portions of the masonry of buildings, by the motion of earthquakes, in such a manner that the part moved presents evidence of having been twisted upon its bed round a vertical axis. This has been hitherto attempted to be explained by assuming a vorticose motion to occur.” In fact, Lyell himself had a drawing, in his textbook, showing two obelisks (see Fig. 6.1), each made of three separate stones resting on top of one another, damaged by an earthquake in 1783 in Calabria. “Two obelisks [...] placed at the extremities of a magnificent facade in the convent of S. 
Bruno, in a small town called Stefano del Bosco,” he says, “were observed to have undergone a movement of a singular kind. The shock which agitated the building is described as having been horizontal and vorticose. The pedestal of each obelisk remained in its original place; but the separate stones above were turned partially round, and removed sometimes nine inches from
Fig. 6.1 Obelisks damaged by an earthquake. (After Lyell’s Principles)
their position without falling”. And Lyell “contents himself”, says Mallet, “with the vorticose account of the Neapolitan Academy”. Mallet cites “some few other notices of similar phenomena”; there’s a church in Valparaiso two of whose buttresses were apparently “twisted off from the wall”; there’s Charles Darwin, who was in a major earthquake, also in Chile, in 1835, and described the damage in his Naturalist’s voyage. Darwin didn’t think that the motion had been “vorticose”: the “twisting displacement”, writes Darwin, cited by Mallet, “at first appears to indicate a vorticose movement beneath each point thus affected, but that is highly improbable.” Mallet comments: “The sagacity of Darwin shewed him that the vorticose hypothesis was improbable, and that in order to its being at all tenable, a separate vortex must be admitted for every separate stone found twisted, the axis of rotation of the vortex having been coincident with that of the stone.” Which Mallet thinks is a “paramount improbability”. In other words: seeing an obelisk damaged as in the drawing in Fig. 6.1, reporters would intuitively think that the obelisk must have been twisted, rotated around its own axis. That’s what Lyell and Darwin and Mallet call a “vorticose” displacement. People would think that the earthquake wave must have imparted a rotation to the base of the obelisk, and then, because there’s friction between the stones that make the obelisk, the whole thing would rotate around the same axis. Then, depending on its inertia and on the friction, each stone would come back to rest at a different point in its rotation, and when everything is, again, still, you get the drawing in Fig. 6.1. 
Mallet thought that this was crazy because he couldn’t think of an elastic wave (and he figured that ultimately what you experience when you are hit by an earthquake is nothing but an elastic wave) that would manifest itself as a rotation (vortex): waves for Mallet are a back-and-forth, up-and- down sort of thing, and that’s that248 . So here is how Mallet reasons to explain the data without invoking vortexes. The forces that act on a given body when the quake comes are: friction [surface force] and weight [body force]: “If a stone [...] rest upon a given base, and [...] motion be suddenly communicated horizontally to that base in any direction, the stone itself will be solicited to move in the same direction. The measure of force with which the movement of the base is capable of affecting the stone or other incumbent body is equal to the amount of friction of the latter upon its base”.
Friction is the force generated between the surfaces of two objects in contact with each other, which opposes any force that would accelerate one object relative to the other. So what happens is, the base on which the obelisk rests is moved; the obelisk applies to its base the resisting friction μW, where W is the weight of the obelisk and μ a coefficient that depends on the properties of the two surfaces that are in contact; by action-reaction (Newton), the base applies an equal and contrary force on the obelisk. So there are two forces acting on the obelisk: its own weight, which can be considered as if it was applied to the obelisk’s center of mass, and friction, applied on the obelisk’s base. Three different scenarios, then, might unfold:

1. the center of mass is far from the surface of contact; think a very tall obelisk. There is a significant torque (remember Chap. 2) from the friction; the obelisk falls (rotates in the plane defined by a vertical line going through the center of mass, and the direction of motion).

2. the center of mass is near the surface of contact, and within the plane defined by the vertical line and the direction of motion. In this case, the torque is small and there is no rotation. There is no requirement that the horizontal acceleration μW/m (where m is the obelisk’s mass, and I’ve used Newton’s second law) resulting from friction coincide with that of the ground, and typically (in Mallet’s words) “the bed will move more or less from under the stone, or the stone will appear to move in a contrary direction to that of the motion of its bed.”

3. it could be that the center of mass is near the surface, so the stone won’t fall... but it might be away from the center of adherence, and from the plane defined by the vertical and the direction of motion: “in which case the effect of the rectilinear motion in the plane of the base will be to twist the body round upon its bed”. 
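Mallet argues entirely in words; but scenarios 1 and 2 amount to the standard rigid-block criteria for toppling versus sliding, and those are easy to sketch numerically. The following is a minimal sketch under a quasi-static approximation; the function name and parameters are mine, not Mallet's:

```python
def block_response(a, mu, b, h, g=9.81):
    """Classify the response of a rigid block to a horizontal ground
    acceleration a (m/s^2), in the quasi-static approximation.

    mu -- friction coefficient between block and bed
    b  -- horizontal distance from the centre of mass to the base edge (m)
    h  -- height of the centre of mass above the base (m)
    """
    a_slide = mu * g      # friction limit: above this the bed slips under the stone
    a_topple = g * b / h  # overturning limit: above this the stone rotates about its edge
    if a < min(a_slide, a_topple):
        return "at rest"
    # whichever threshold is crossed first decides the failure mode
    return "topples" if a_topple < a_slide else "slides"

# A squat stone (low centre of mass) slips on its bed -- Mallet's scenario 2:
print(block_response(a=6.0, mu=0.5, b=0.5, h=0.5))   # slides
# A tall obelisk (high centre of mass) falls over -- scenario 1:
print(block_response(a=2.0, mu=0.6, b=0.3, h=3.0))   # topples
```

The tall obelisk topples because its overturning threshold g·b/h is reached long before the friction limit μ·g; for the squat stone the order is reversed, and "the bed will move more or less from under the stone".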
That is to say, like in case 1, we have a non-negligible torque: but this is a torque that causes a rotation in the horizontal, rather than vertical, plane. So Lyell’s obelisks are probably an instance of scenario 3. There’s no vorticose motion, the soil oscillates back and forth along just one direction; but mass is not distributed exactly uniformly within each of the stones that form the obelisk; and/or the surfaces of contact between the stones are not everywhere equally rough; and/or the shape of the stones is not perfectly symmetric. And so the centers of mass and adherence are not perfectly aligned, and that’s what causes friction to twist the stones around. With that, Mallet was confident he had “proved that no vorticose motion is requisite to account for the twisting of bodies, as observed in earthquakes; that nothing more than a simple horizontal rectilinear motion is demanded.”249 And the first task that he had set for himself was completed. But he must have enjoyed it, because he doesn’t end the paper there: instead he goes on, as anticipated, to put together the first draft of a comprehensive theory of quakes. So what else can we say, or what else could Mallet say, about earthquake displacement? What kind of horizontal motion do we have? Mallet reasons that if motion was just a net displacement in one direction, “all bodies must, as the effect of one shock, fall in the one direction, and not in opposite directions.” But this “is contrary
to observed facts”. Bodies in an earthquake fall in multiple directions, and the only phenomenon that can account for this is “the transit of a wave of elastic compression”. What I think Mallet means is, think of a place somewhere in the distance: let it be the epicenter of a quake, and now think of a disturbance, energy (not matter, just energy: a displacement) moving away from the epicenter and straight towards you. Let it be a compression along the direction of the epicenter itself. When a structure is hit, it will first be pushed away from the epicenter, then back towards the epicenter and probably back and forth a couple times. In all this, it might be one or two or more oscillations before a structure that is hit by the earthquake wave breaks off from whatever connects it to the ground (in the simplest case, just friction), and falls. You can’t predict at what point of its back-and-forth oscillating it actually falls, i.e., which way it falls: and that would explain the kind of damage that we see when we’re in an earthquake. It follows, and that’s possibly the most important point in Mallet’s paper, a point that might seem obvious today but wasn’t obvious in 1846, it follows that to understand earthquakes one needs to “reduce their observed phenomena to the known laws of wave motion250 in solids and fluids”, as per the paper’s title. “The idea that earthquake motion consists of a wave of some sort is not new,” writes Mallet, “although so entirely neglected by the great mass of recent geological authors. To the Rev. John Mitchell [sic] [...] the merit of this idea appears to be due. In a paper communicated to the Royal Society, read in 1760, he [...] distinctly enunciates [...] that the motion of the earth is due to a wave propagated along its surface from a point where it has been produced by an original impulse”.
6.2 John Michell, 1760

Now what I am going to do is tell you a bit about Michell’s (whose name should be spelled without a “t”) 1760 paper, because I want to show you how difficult it was to arrive at a conclusion that today you might call obvious, i.e., that quakes generate elastic waves, and that if you feel an earthquake, that means you’re experiencing an elastic wave. Michell wrote his paper not long after the 1755 Lisbon earthquake, and from that earthquake he got both the inspiration for his speculations, and the data needed to substantiate them. Michell starts out by refuting the idea that quakes are caused by some sort of meteorological event. “That these concussions should owe their origin to something in the air, as it has sometimes been imagined,” he writes, “seems very ill to correspond with the phenomena.” He quotes several reports of quakes, from which no correlation between the quake and atmospheric events emerges. Next he announces that he is going to prove that quakes are due to “subterraneous fires. These fires, if a large quantity of water should be let out upon them suddenly, may produce a vapour, whose quantity and elastic force may be fully sufficient” to crack rocks: because the vapour, heated up, will try to expand and push until the rocks break and let it out. The evidence, available to Michell, in favour of this idea revolves around volcanoes and the fact
that areas with much volcanism would also be the sites of many earthquakes. Or in Michell’s words, “those places that are in the neighbourhood of burning mountains are always subject to frequent earthquakes”. Michell thinks subterranean fires feed both earthquakes and volcanoes, and here is his reasoning: the earth is layered, the same sequence of layers usually extends over very long horizontal distances (“we have an instance of this in the chalky and flinty countries of England and France, which (excepting the interruption of the Channel, and the clays, sands &c. of a few counties) compose a tract of about three hundred miles each way”, says Michell, who’s writing some ten years after Guettard published his Carte minéralogique, see Chap. 4, and, obviously, has seen it), and layers “commonly lie more inclining from the mountainous countries, than the countries themselves: these circumstances make it very probable, that those strata of combustible materials, which break out in volcanos on the tops of the hills, are to be found at a considerable depth under ground in the level and low countries near them.” To understand what Michell is saying, here, just think of those sketches we’ve seen in Chap. 3, that showed sections across mountain ranges. If you extend the “stratigraphy” away from the central axis of the range, and under the sedimentary plains that surround the range (it’s a fact that sedimentary plains always surround mountain ranges), then, because the layers are inclined, they dip at some angle into the earth and the depth of any layer grows with growing distance from the axis. So, now, if that mountain range has volcanoes, to Michell that means there’s a layer “of combustible material” which must provide the fuel for the volcano, and that layer is tilted, like all other layers, and so if you stand on the sedimentary plain at some distance from the mountain range, the layer of combustible material runs under your feet at some considerable depth, see Fig. 
6.2. Michell: “If this should be the case, and if the same strata should be on fire in any places under such countries, as well as on top of the hills, all vapours, of whatsoever kind, raised from these fires, must be pent up, unless so far as they can open themselves a passage between the strata; whereas the vapours raised from volcanos find a vent,
Fig. 6.2 Michell’s idea of a “combustible layer”, which might cause both earthquakes and volcanic eruptions
and are discharged in blasts from the mouths of them. Now, if, when they find such a vent, they are yet capable of shaking the country to the distance of ten or twenty miles round, what may we not expect from them, when they are confined? [...]. If we suppose that these vapours, when pent up, are the cause of earthquakes, we must naturally expect, from what has been just said, that the most extensive earthquakes should take their rise from the level and low countries; but more especially from the sea, which is nothing else than waters covering such countries. Accordingly we find, that the great earthquake of the 1st November 1755 [...] took its rise from under the sea; this is manifest, from that wave which accompanied it [...]. The same thing is to be understood of the earthquake that destroyed Lima in the year 1746”. I should probably mention that none of this fits with today’s theories of how earthquakes work; but it would be a while before the subterraneous fire and gas explosion idea was eventually abandoned. Mallet, as we shall see in a minute, doesn’t really spend much time discussing the origin of quakes: because the data he has in 1846 are still not enough to do better than Michell. (What he has, that Michell didn’t have, is the theory of elastic waves: and that’s what his contribution is mostly about.) But so anyway, later we’ll find out what’s wrong in Michell’s theory; in 1760 or so one could be content with it, because it agreed with the (not so many) observations of earthquakes and ideas in geology of the time. For now let’s proceed as if Michell were right and look at his theory of how the disturbance originated by a quake propagates at large distances from the source. Michell: “let us suppose the roof over some subterraneous fire to fall in.251 
If this should be the case, the earth, stones, &c., of which it was composed, would immediately sink in the melted matter of the fire below: hence all the water contained in the fissures and cavities of the part falling in, would come in contact with the fire, and be almost instantly raised into vapour. From the first effort of this vapour, a cavity would be formed (between the melted matter and the superincumbent earth) filled with vapour only, before any motion would be perceived at the surface of the earth: this must necessarily happen, on account of the compressibility of all kinds of earth, stones, &c. but as the compression of the materials immediately over the cavity, would be more than sufficient to make them bear the weight of the superincumbent matter, this compression must be propagated on account of the elasticity of the earth, in the same manner as a pulse is propagated through the air; and again the materials immediately over the cavity, restoring themselves beyond their natural bounds, a dilatation will succeed to the compression; and these two following each other alternately, for some time, a vibratory motion will be produced at the surface of the earth252 ” So this is elastic wave propagation as we understand it today: displacement propagating through the rock just like sound traveling through air: transfer of energy without transfer of mass, which for us is precisely what a wave is. Michell doesn’t call it a “wave” though; he calls it a “vibratory motion”. And then (p. 37), “as a small quantity of vapour almost instantly generated at some considerable depth below the surface of the earth, will produce a vibratory motion, so a very large quantity [...] will produce a wave-like motion. The manner in which this wave-like motion will be propagated, may, in some measure, be represented by the following experiment. Suppose a large [...] carpet [...] to be raised at one edge, and then suddenly brought down again to the
floor, the air under it, being by this means propelled, will pass along, till it escapes at the opposite side [...]. In a like manner, a large quantity of vapour may be conceived to raise the earth in a wave,” etc. And this, which is what Michell calls a “wave”, is not at all what we mean today with the word wave. Again, for us, the propagation of a wave is transfer of energy without transfer of mass. Material points hit by a wave oscillate around some reference position without ever getting very far from it. In what Michell calls a wave-like motion, “the air [...] will pass along, till it escapes at the opposite side”, which clearly means transfer of mass (air) and so is definitely something else. And we’ll see with Mallet that that’s not what happens. In any case, the idea that an earthquake has an origin, a source, where some catastrophic phenomenon takes place, and that then the effects of this phenomenon are propagated away from the source to even some very large distance, is due to Michell. And Michell also reviewed eyewitnesses’ accounts of the recent Lisbon earthquake, and other major events in the past, too, to substantiate his theory. He finds all data to confirm that “the motion of the earth is partly tremulous, and partly propagated by waves, which succeed one another at larger and sometimes at smaller distances; and this latter motion [“propagated by waves”] is generally propagated much farther than the former [“tremulous”]” (at this point you understand what he really means by “waves” and by “tremulous”). For instance, “at Jamaica in 1687–8 [...] it is said, that a gentleman there saw the ground rise like the sea in a wave, as the earthquake passed along, and that he could distinguish the effects of it, to some miles distance, by the motion of the tops of the trees on the hills253 . 
“Again, in an account of [the 1692 Jamaica earthquake], it is said, ‘the ground heaved and swelled, like a rolling swelling sea,’ insomuch, that people could hardly stand upon their legs by reason of it. [...] The same was also observed at Lisbon, in the earthquake of the 1st November 1755, as may be plainly collected from many of the accounts that have been published concerning it, some of which affirm it expressly: and this wave-like motion was propagated to far greater distances than the other tremulous one, being perceived by the motion of waters, and the hanging branches in churches, through all Germany, amongst the Alps, in Denmark, Sweden, Norway, and all over the British Isles.” Michell even realizes that from people’s accounts of earthquakes one can actually figure out where the source is. Except for the instruments we have now, which should in principle be way more accurate than shocked civilians’ recollections, what Michell proposes to do is pretty much what seismologists do today when they realize some quake just took place and get to work to “localize” its epicenter. Michell: “If we would inquire into the place of origin of any particular earthquake, we have the following grounds to go upon. “First, The different directions, in which it arrives at several distant places: if lines be drawn in these directions, the place of their common intersection must be nearly the place sought: but this is liable to great difficulties; for there must necessarily be great uncertainty in observations, which cannot, at best, be made with any great precision, and which are generally made by minds too little at ease to be nice observers of what passes...”, and actually Michell doesn’t really say how one can see this direction (other than the anecdote above): it seems you’d need an observer in a very advantageous position, and a very violent wave coming in.
“Secondly, We may form some judgment concerning the place of the origin of a particular earthquake, from the time of its arrival at different places; but this also is liable to great difficulties. In both these methods, however, we may come to a much greater degree of exactness, by taking a medium amongst a variety of accounts, as they are related by different observers. But, “Thirdly, we may come to the greatest degree of exactness in those cases, where earthquakes have their source from under the ocean; for, in these instances, the proportional distance of different places from that source may be very nearly ascertained, by the interval between the earthquake and the succeeding wave: and this is the more to be depended on, as people are much less likely to be mistaken in determining the time between two events, which follow one another at a small interval, than in observing the precise time of the happening of some single event.” Michell proceeds to “localize” the great Lisbon earthquake based on eyewitness reports of the times of arrival of both the tsunami and the seismic “wave” (though Michell would not have called it that). A place and time is assumed, and the distance between that place and the places where observations were made is compared with the observed time of arrival of both the seismic and tsunami waves. It is observed that “the times, which the [tsunami] wave took up in travelling, are not in the same proportion with the distances of the respective places from the supposed source of the motion; this, however, is no objection against the point assumed, since it is manifest, wherever it was, that it could not be far from Lisbon, as well because the [tsunami] wave arrived there so very soon after the earthquake [i.e., after the seismic waves], as because it was so great [...]. 
The true reason of this disproportion, seems to be the difference in the depth of the water; for, in every instance in the above table, the time will be found to be proportionally shorter or longer, as the water through which the wave passed was deeper or shallower. Thus the motion of the wave to Kingsale [perhaps Kinsale, in Ireland?] or Mountsbay [Cornwall] through waters not deeper in general than 200 fathoms was slower than that to Madeira (where the waters are much deeper)”. There are many ideas here that will come back in more recent acoustics and seismology. Seismologists in the twenty-first century still mostly look at “the time of its arrival at different places” to find “the place of the origin of a particular earthquake”. If you don’t have data from different places, you can measure how much time passes between the seismic wave that propagates through the solid ground and the water wave that propagates through the water; the speed of waves in rocks is different (higher) than in water, so the delay between seismic wave and tsunami wave grows with the “epicentral distance”, and can be and has been tabulated after recording many earthquakes: so from a measure of that delay you can get an estimate of how far you are from the epicenter254. All this assumes, of course, that the quake is big enough to move some large mass of water, and that you are near the shore255: in which case, get away from the beach, get to the roof of a tall, solidly constructed building, and look at the tsunami from there. Finally, those of you who already know how a tsunami works might have noticed that Michell’s intuition that a “difference in the depth of the water” means a difference in the speed of the water (tsunami) wave is very much right. We’ll get back to that.
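Michell's two quantitative intuitions, that the seismic-tsunami delay encodes the distance to the source, and that the tsunami travels faster over deeper water, are easy to put into formulas. A minimal sketch (the function names and sample numbers are mine; the shallow-water speed √(gd) is the modern long-wave result, which we'll derive later, not something Michell wrote down):

```python
import math

def tsunami_speed(depth, g=9.81):
    """Long-wave (shallow-water) tsunami speed in m/s: sqrt(g * depth).
    Deeper water -> faster tsunami, which is Michell's inference from
    the Kinsale/Mounts Bay vs. Madeira travel times."""
    return math.sqrt(g * depth)

def epicentral_distance(delay, v_seismic, v_tsunami):
    """Distance (m) to the source, from the delay (s) between the seismic
    arrival and the tsunami arrival.  Both waves cover the same distance
    d, so delay = d/v_tsunami - d/v_seismic; solve for d."""
    return delay * v_seismic * v_tsunami / (v_seismic - v_tsunami)

# Example: over a 4000 m deep ocean the tsunami does roughly 200 m/s
# (about 700 km/h), while seismic waves in rock do several km/s; so a
# one-hour delay puts the source a few hundred km away.
v_t = tsunami_speed(4000.0)
d = epicentral_distance(delay=3600.0, v_seismic=8000.0, v_tsunami=v_t)
```

Because the seismic speed is so much larger than the tsunami speed, the distance is close to simply delay × v_tsunami: on the tsunami's timescale, the shaking arrives almost instantaneously.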
6.3 Back to Mallet

Now let’s get to Mallet’s idea of what an earthquake is. Michell is right, says Mallet, that earthquakes are waves. Or, to be precise, some kind of big commotion deep in the earth, that generates a sudden displacement of matter which in turn generates waves. But “Michell wholly mistook the nature of the wave itself”. We’ve seen above that Michell got the first part of the signal, what today we would call the body wave, right: he speaks in fact of an initial “compression” that “must be propagated on account of the elasticity of the earth, in the same manner as a pulse is propagated through the air”; but then he gets the rest of the seismogram wrong: remember the story about the earth’s crust being like a carpet, and “the air under it, being [...] propelled, will pass along, till it escapes at the opposite side”, etc. Mallet figures that it doesn’t make much sense to call that a “wave”, particularly now that rigorous “laws of wave motion in solids and fluids” had been developed (long after Michell’s paper, and within Mallet’s lifetime). But let’s not get ahead of ourselves. First of all, Mallet reviews and organizes the data256. Comparing all available accounts of earthquakes, of which at this point we’ve seen some examples, he comes to the conclusion that when you feel a quake you feel one or more of the following: (i) an elastic wave propagating through the solid crust of the earth; (ii) an acoustic wave propagating through the air, if the earthquake is on land; (iii) an ocean wave (tsunami wave), which only occurs when the source is under the bed of the ocean. Also: when the elastic wave propagates across a continent, the shaking of the ground propagates into the air as well, and some of this shaking happens at frequencies that our ears can hear257. According to the reports read by Mallet, it sounds “like the bellowing of oxen, the rolling of waggons, or of distant thunder, accompanying the shock”. 
This is not the same thing as the acoustic wave (ii), because the acoustic wave (ii) is the acoustic wave that might be generated directly at the source, particularly if the source is an exposed fault, i.e., one that actually cuts through the surface of the earth, or a volcanic eruption or a big explosion at the surface of the earth, etc. The acoustic wave (ii) is presumably louder than this secondary acoustic wave. As for the tsunami wave, seismology courses of today barely mention it, but in Mallet’s time there were basically no instruments available to measure earthquakes in a scientific way (i.e., “quantitatively”: with units of measurements, numbers, and an estimate of the uncertainty that the measurement carries), and he had to make his inferences about the nature of quakes mostly on the basis of eyewitness accounts, and many of those accounts were observations of anomalously large waves in the ocean after a quake, which Mallet figured were an important piece of information that one had to take into account if one wanted to understand what earthquakes were and what they did, etc. (For example, if you have a bunch of tsunami observations from different places, they are as good as seismic waves to localize the epicenter.) So in his big paper Mallet spends quite some time on tsunami waves.
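The localization idea hinted at above—find the source that best explains “the time of its arrival at different places”—can be sketched as a brute-force search. Everything below is made up (station coordinates, arrival times, wave speed); it illustrates the principle, not Mallet’s actual procedure:

```python
import math

# Made-up setup: three stations (coordinates in km), observed arrival
# times (s) of the same wave, and an assumed propagation speed.
stations = [(0.0, 0.0), (100.0, 0.0), (0.0, 80.0)]
arrivals = [23.0, 19.1, 23.0]   # fabricated "observations"
v = 4.0                          # km/s

def misfit(x, y, t0):
    """Sum of squared residuals for a trial source at (x, y), origin time t0."""
    return sum((t0 + math.hypot(x - sx, y - sy) / v - t) ** 2
               for (sx, sy), t in zip(stations, arrivals))

# crude grid search over candidate epicenters and origin times
best = min(((x, y, t0) for x in range(0, 101, 5)
                       for y in range(0, 81, 5)
                       for t0 in range(0, 21)),
           key=lambda trial: misfit(*trial))
```

With these fabricated arrivals, the search converges on a source near (60 km, 40 km); real location algorithms replace the grid search with least-squares inversion, but the misfit being minimized is the same.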
Mallet explains that the tsunami wave is not the same thing as an elastic wave propagating through water: he first introduces the tsunami wave as “the wave analogous to a tidal258 wave produced and propagated on the surface of the ocean,” which should not be confused with the “wave (of sound namely) [that] will be propagated through the mass of the ocean water likewise, with a velocity far greater than the former, and will reach the land, and be heard there as a sound long before the great surface wave will have rolled in. The terrestrial sound wave is isochronous with the great elastic wave or earth wave of shock.”259
6.4 Hydrodynamics

Mallet doesn’t go through the theory of water waves at all, but I am going to do it nevertheless, because the theory existed (although not in its final, or, should I say, current form, which is probably due to Horace Lamb and his Hydrodynamics textbook) in Mallet’s time and because in any case the concepts that will emerge now are going to be useful again later when I shall cover seismic surface waves: which differ from seismic body waves in the same way as tsunami waves differ from sound waves in water. In his Principes Généraux du Mouvement des Fluides, published in 1755 and presumably known to Mallet, Euler gives plenty of examples of how to apply calculus (which at the time was a new thing, introduced and made popular by Newton and Leibniz) to many physical problems. So let’s see how Euler derives an equation that is quite fundamental for describing mathematically how a mass of water behaves when some kind of force is applied to it. Euler is thinking of water, but more in general he speaks of “fluids”, and I am going to stick to his language: Euler in that paper defines a fluid as a substance such that, if one could take any surface within it, and measure pressure on either side of it, he would find that pressure is independent of how the surface is oriented260,261 . It follows that pressure in the fluid is a function of position, and time, but not of direction: if we call x1 , x2 , x3 the Cartesian coordinates, then pressure is fully defined at each point x1 , x2 , x3 and at any given time t by a single value that I am going to call p = p(x1 , x2 , x3 , t). Now let’s focus on a tiny chunk of fluid within the larger body of fluid that we are looking at; for the sake of simplicity, as they say, let’s take this “parcel of matter” to be a rectangular prism. Call δx1 , δx2 , δx3 the lengths of the edges of the prism.
Now the prism receives pressure from the rest of the fluid around it, meaning there will be for example a force p(x1 , x2 , x3 , t)δx2 δx3 (remember pressure is force per unit surface, so you’ve got to multiply it by surface area if you want the force) applied at x1 on its vertical face perpendicular to the x1 axis; and likewise there will be a force p(x1 + δx1 , x2 , x3 , t)δx2 δx3 acting at x1 + δx1 on the face parallel to the first (see Fig. 6.3), and the same for all other faces262 . Pressure is the only force that acts on the outer surface of the prism (because there is no friction in fluids, so we are only concerned with push and pull); but there might be other forces that act on the fluid as a whole, e.g., its weight263 . For now let us call f 1 , f 2 , f 3 the Cartesian components of the combination of all “body” forces acting at a point of the fluid;
Fig. 6.3 Pressure acting on two parallel sides (shaded) of an infinitely small, regular prism. The total pressure compressing the prism along the x1 -direction is p(x1 , x2 , x3 ) − p(x1 + δx1 , x2 , x3 )
they are also functions of x1 , x2 , x3 , t; and let us measure them per unit of mass, so that ρ f1 (x1 , x2 , x3 , t) δx1 δx2 δx3 is the total body force acting on our prism at time t in the direction 1 (ρ denotes density, as usual, and so ρδx1 δx2 δx3 is the mass of the prism). To do the force balance, remember Newton’s law that says that the product of mass times acceleration of the prism must coincide with the sum total of the forces acting on it, and this for each Cartesian component of both force and acceleration, so, if we do it along all three Cartesian axes,

$$
\begin{cases}
\rho f_1(x_1,x_2,x_3,t)\,\delta x_1\delta x_2\delta x_3 + p(x_1,x_2,x_3,t)\,\delta x_2\delta x_3 - p(x_1+\delta x_1,x_2,x_3,t)\,\delta x_2\delta x_3 = \rho\,\delta x_1\delta x_2\delta x_3\,\dfrac{dv_1}{dt},\\[6pt]
\rho f_2(x_1,x_2,x_3,t)\,\delta x_1\delta x_2\delta x_3 + p(x_1,x_2,x_3,t)\,\delta x_1\delta x_3 - p(x_1,x_2+\delta x_2,x_3,t)\,\delta x_1\delta x_3 = \rho\,\delta x_1\delta x_2\delta x_3\,\dfrac{dv_2}{dt},\\[6pt]
\rho f_3(x_1,x_2,x_3,t)\,\delta x_1\delta x_2\delta x_3 + p(x_1,x_2,x_3,t)\,\delta x_1\delta x_2 - p(x_1,x_2,x_3+\delta x_3,t)\,\delta x_1\delta x_2 = \rho\,\delta x_1\delta x_2\delta x_3\,\dfrac{dv_3}{dt},
\end{cases} \qquad (6.1)
$$

where v1 , v2 , v3 stand for the components of the velocity of the prism in the directions 1, 2, 3, and so dv1/dt, etc., are the components of acceleration in the same directions. We might as well divide everything by δx1 δx2 δx3 , so that

$$
\begin{cases}
\rho f_1(x_1,x_2,x_3,t) + \dfrac{p(x_1,x_2,x_3,t) - p(x_1+\delta x_1,x_2,x_3,t)}{\delta x_1} = \rho\,\dfrac{dv_1}{dt},\\[6pt]
\rho f_2(x_1,x_2,x_3,t) + \dfrac{p(x_1,x_2,x_3,t) - p(x_1,x_2+\delta x_2,x_3,t)}{\delta x_2} = \rho\,\dfrac{dv_2}{dt},\\[6pt]
\rho f_3(x_1,x_2,x_3,t) + \dfrac{p(x_1,x_2,x_3,t) - p(x_1,x_2,x_3+\delta x_3,t)}{\delta x_3} = \rho\,\dfrac{dv_3}{dt}.
\end{cases} \qquad (6.2)
$$
However small we take the prism to be, what we’ve done so far remains valid: and the equation we’ve just written continues to hold even as δx1 becomes infinitely small, i.e., it tends to zero, which means that we can replace the second terms at the left-hand sides with the derivatives of pressure with respect to x1 , x2 , x3 , and

$$
\begin{cases}
f_1(x_1,x_2,x_3,t) - \dfrac{1}{\rho}\dfrac{\partial p}{\partial x_1}(x_1,x_2,x_3,t) = \dfrac{dv_1}{dt},\\[6pt]
f_2(x_1,x_2,x_3,t) - \dfrac{1}{\rho}\dfrac{\partial p}{\partial x_2}(x_1,x_2,x_3,t) = \dfrac{dv_2}{dt},\\[6pt]
f_3(x_1,x_2,x_3,t) - \dfrac{1}{\rho}\dfrac{\partial p}{\partial x_3}(x_1,x_2,x_3,t) = \dfrac{dv_3}{dt}
\end{cases} \qquad (6.3)
$$
(you might have noticed that I’ve also divided everything by ρ). These few steps were probably fairly straightforward, but the one that comes now is delicate. On the left of (6.3) we have the derivatives of pressure with respect to position, at time t: which means that if we had a bunch of instruments capable of measuring p at several locations simultaneously, all we would have to do is take the difference between measurements made at two nearby points, e.g. x1 , x2 , x3 and x1 + δx1 , x2 , x3 at the same time, and divide by the distance (δx1 ). OK. But at the right-hand side we have a derivative with respect to time. And, over time, the fluid is flowing. The acceleration experienced by our prism of fluid at time t is the difference between the velocity of the same prism at time t + δt and at time t (divided by δt). And but the thing is, at time t + δt the prism will have moved to a new location, with coordinates x1 + v1 δt, x2 + v2 δt, x3 + v3 δt. And so, for example,

$$ \frac{dv_1}{dt}(x_1,x_2,x_3,t) \approx \frac{v_1(x_1+v_1\delta t,\,x_2+v_2\delta t,\,x_3+v_3\delta t,\,t+\delta t) - v_1(x_1,x_2,x_3,t)}{\delta t}, \qquad (6.4) $$

etc. Through a first-order Taylor expansion264 , I can rewrite the numerator of the right-hand side,

$$
\begin{aligned}
v_1(x_1+v_1\delta t,\,x_2+&v_2\delta t,\,x_3+v_3\delta t,\,t+\delta t) - v_1(x_1,x_2,x_3,t) \\
&\approx \frac{\partial v_1}{\partial x_1}(x_1,x_2,x_3,t)\,v_1(x_1,x_2,x_3,t)\,\delta t \\
&+ \frac{\partial v_1}{\partial x_2}(x_1,x_2,x_3,t)\,v_2(x_1,x_2,x_3,t)\,\delta t \\
&+ \frac{\partial v_1}{\partial x_3}(x_1,x_2,x_3,t)\,v_3(x_1,x_2,x_3,t)\,\delta t \\
&+ \frac{\partial v_1}{\partial t}(x_1,x_2,x_3,t)\,\delta t,
\end{aligned} \qquad (6.5)
$$

etc., and if we plug this expression into (6.4), we find that
$$ \frac{dv_1}{dt}(x_1,x_2,x_3,t) \approx v_1\,\frac{\partial v_1}{\partial x_1} + v_2\,\frac{\partial v_1}{\partial x_2} + v_3\,\frac{\partial v_1}{\partial x_3} + \frac{\partial v_1}{\partial t}, \qquad (6.6) $$
where actually you can replace the ≈ with = because, again, we can take all intervals, including δt, as small as we like. You can repeat the same exercise to find equivalent equations for v2 and v3 . Plug that all into (6.3), and

$$ f_1(x_1,x_2,x_3,t) - \frac{1}{\rho}\frac{\partial p}{\partial x_1} = v_1\,\frac{\partial v_1}{\partial x_1} + v_2\,\frac{\partial v_1}{\partial x_2} + v_3\,\frac{\partial v_1}{\partial x_3} + \frac{\partial v_1}{\partial t}, \qquad (6.7) $$

plus two similar equations for the two other components. Or, in vector notation,

$$ \mathbf{f}(\mathbf{x}) - \frac{1}{\rho}\nabla p(\mathbf{x},t) = \mathbf{v}(\mathbf{x},t)\cdot\nabla\mathbf{v}(\mathbf{x},t) + \frac{\partial\mathbf{v}}{\partial t}(\mathbf{x},t), \qquad (6.8) $$
where ∇ stands for the vector265 (∂/∂x1 , ∂/∂x2 , ∂/∂x3 ), and you might begin to appreciate that writing physical things as vectors can actually simplify your life. (Which is why vectors were invented.) This (vectorial) equation, by the way, or set of (scalar) equations, is called Euler’s equation, because the first time it shows up in the literature is in Euler’s Principes Généraux. (It is the last equation at page 286 of Euler’s text.) In practice, what this equation is, it is the equivalent of Newton’s force-equals-mass-times-acceleration for each individual point of a continuum. If you want to know how a very large body, i.e., a continuum of density ρ, deforms under the action of some forces (like, in our case, tectonic forces) you solve this equation, which means you find functions v(x, t) and/or p(x, t) that describe how each point of the continuum moves and/or how pressure at that point changes over time (we’ll see in a minute an example of how that can be done in practice). Except, of course, be careful: at this point we are talking about fluids, and accordingly we’ve started out by saying that forces across any surface within the continuum are perpendicular to the surface itself, which is something that won’t be true for rocks, so later (soon) when we do the same thing for the “solid earth”, we are going to find a slightly different equation266 . Now there are two things I want to do with Euler’s equation. First, I will use it to derive some other mathematical formulae that, as you shall see, describe how elastic waves propagate through fluids267 . To do that, I’ll take the simplest possible approach and just do as if we lived within an infinitely large unbounded atmosphere made of air, and/or as if sound waves were propagating through an infinite unbounded ocean of water. This is OK as a first approximation, because both the ocean and the atmosphere are very large compared to the spatial extent of many of the phenomena that we study.
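As a concrete reading of the component equations (6.3): given pressure sampled on a grid, the acceleration of a fluid parcel follows from the local pressure gradient. A minimal sketch, with made-up numbers:

```python
rho = 1000.0   # kg/m^3, water
dx = 0.01      # m, spacing between grid points
p = [101325.0, 101335.0, 101345.0]   # Pa, made-up pressures at 3 points
f1 = 0.0       # no body force along x1 in this example

# centered finite difference for dp/dx1 at the middle grid point
dp_dx1 = (p[2] - p[0]) / (2 * dx)    # = 1000 Pa/m here
# Euler's equation, component 1: dv1/dt = f1 - (1/rho) * dp/dx1
dv1_dt = f1 - dp_dx1 / rho           # m/s^2; the fluid accelerates toward low pressure
```

Numerical solvers for fluid flow are, at bottom, this computation repeated at every grid point and time step.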
The second thing I want to do is, use math to predict what’s going to happen if the fluid does not occupy an infinite amount of space. This will bring us back to the reason we’ve gotten into this in the first place, i.e., to understand mathematically the tsunami wave: because what we are going to find out, when we combine Euler’s equation with a “boundary condition” that describes the interface between ocean and atmosphere, is that some additional waves emerge, propagating along that interface with different speed and different properties than the “sound waves” that we’ve just met, that wouldn’t exist if the ocean were just an infinite unbounded body of water.
6.5 Elastic Waves (“Body Waves”) in an Unbounded Fluid

So, exercise one, elastic waves in an unbounded fluid. To make it as easy as can be, we are going to make some simplifications. Like I said, we shall neglect viscosity; in addition, we shall assume that the fluid deforms in such a way that v always changes smoothly with respect to position, i.e., the gradient of v is small and the so-called “advective” term268 in Euler’s equation, i.e., the first term at the right-hand side of (6.8), can be neglected. Call p0 the pressure that one would measure in the absence of motion; then, when v = 0 everywhere in the medium, Euler’s equation (6.8) boils down to

$$ \mathbf{f}(\mathbf{x}) - \frac{1}{\rho}\nabla p_0(\mathbf{x},t) = 0. \qquad (6.9) $$
Now, introduce δp such that p = p0 + δp; subtract (6.9) from (6.8), and you should get

$$ -\frac{1}{\rho}\nabla\delta p(\mathbf{x},t) = \mathbf{v}(\mathbf{x},t)\cdot\nabla\mathbf{v}(\mathbf{x},t) + \frac{\partial\mathbf{v}}{\partial t}(\mathbf{x},t), \qquad (6.10) $$

which, if we neglect advection, like I said we would,

$$ -\frac{1}{\rho}\nabla\delta p(\mathbf{x},t) = \frac{\partial\mathbf{v}}{\partial t}(\mathbf{x},t). \qquad (6.11) $$
Equation (6.11) is one differential equation in the two unknown functions δp(x, t) and v(x, t). Or rather, ∇ and v being vectors, it’s three differential equations, in the four unknown functions δp and v1 , v2 , v3 . Either way, before we can solve it mathematically, we need to establish some additional relation between the unknowns269 . And for that, we need some more physics. For instance, it’s useful to know that the pressure and volume of the fluid are not independent of one another. It turns out that, in the process of wave propagation through a fluid, the product of p times V^γ must stay constant, with γ also a constant that depends on the nature of the fluid270 . If you differentiate pV^γ with respect to V , you get

$$ \frac{d}{dV}\left(pV^\gamma\right) = \gamma\,\frac{p}{V}\,V^\gamma + V^\gamma\,\frac{dp}{dV}. \qquad (6.12) $$

But if pV^γ is constant, then d(pV^γ)/dV = 0, and so

$$ \gamma\,\frac{p}{V}\,V^\gamma + V^\gamma\,\frac{dp}{dV} = 0, \qquad (6.13) $$

or

$$ \frac{dp}{dV} = -\gamma\,\frac{p}{V}. \qquad (6.14) $$
Above we’d written p = p0 + δp, so if we plug that into (6.14), keeping in mind that p0 is just a constant,

$$ \frac{d\,\delta p}{dV} = -\gamma\,\frac{p_0}{V}. \qquad (6.15) $$

Let us use this result to manipulate the expression for the derivative of δp with respect to time t,

$$ \frac{d\,\delta p}{dt} = \frac{d\,\delta p}{dV}\,\frac{dV}{dt} = -\gamma\,\frac{p_0}{V}\,\frac{dV}{dt}, \qquad (6.16) $$

which will be useful in a minute. After the thermodynamics, some mechanics. Let u(x, t) denote the fluid’s displacement, so that v(x, t) = (d/dt) u(x, t). So, for example, in Fig. 6.4, displacement is only along the x1 direction; the whole volume element moves to the right by an amount u1 (x1 ) which varies as a function of x1 (because the element might also expand or contract): the difference between the displacement at the two sides that are perpendicular to the x1 axis, call it δu1 , multiplied by their area, gives the volume
Fig. 6.4 Volume element (rectangular prism with sides δx1 , δx2 , δx3 along the three Cartesian axes) before (a) and after (b) being displaced and expanded along the direction x1 . The dashed line in diagram (b) marks the element’s extent along x1 before expansion—i.e., as it was in (a)
change, δV = δu1 δx2 δx3 . In general, in a fluid that moves and expands, or contracts, matter can move in all directions, and

$$ \delta V = \delta u_1\,\delta x_2\,\delta x_3 + \delta u_2\,\delta x_1\,\delta x_3 + \delta u_3\,\delta x_1\,\delta x_2. \qquad (6.17) $$

Now, divide both sides of (6.17) by the volume as it was before displacement/deformation, which you might call V0 ,

$$ \frac{\delta V}{V_0} = \frac{\delta u_1}{\delta x_1} + \frac{\delta u_2}{\delta x_2} + \frac{\delta u_3}{\delta x_3}. \qquad (6.18) $$

Pushing this to the limit where δx1 , etc., are all infinitely small, the ratios δu1/δx1 , etc., become the same thing as the derivatives ∂u1/∂x1 , etc., and

$$ \frac{\delta V}{V_0} = \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3} = \nabla\cdot\mathbf{u}, \qquad (6.19) $$
where ∇· is what people call “divergence”271 . Equation (6.19) says that the divergence at a given point in a material that’s being deformed coincides with the relative volume change that you would observe near that point (which, by the way, is a result that we shall often come back to, as you will see). We might differentiate (6.19) with respect to time—it’ll be useful in what we are about to do:

$$ \frac{dV}{dt} = V_0\,\frac{d}{dt}\left(\nabla\cdot\mathbf{u}\right) = V_0\,\nabla\cdot\mathbf{v}, \qquad (6.20) $$

after switching the order of differentiation. With that, we are ready to turn Eq. (6.11) into something that we can actually try to solve. Let us sub (6.20) into (6.16); we get

$$ \frac{d\,\delta p}{dt} = -\gamma\,p_0\,\nabla\cdot\mathbf{v}. \qquad (6.21) $$

Let us next differentiate (6.21) with respect to time t,

$$ \frac{d^2\delta p}{dt^2} = -\gamma\,p_0\,\frac{d}{dt}\left(\nabla\cdot\mathbf{v}\right), \qquad (6.22) $$
which you can turn around, to get

$$ \frac{d}{dt}\left(\nabla\cdot\mathbf{v}\right) = -\frac{1}{\gamma p_0}\,\frac{d^2\delta p}{dt^2}. \qquad (6.23) $$

Now take the divergence of (6.11) and multiply it by ρ, too, and

$$ \rho\,\frac{d}{dt}\left(\nabla\cdot\mathbf{v}\right) = -\nabla^2\delta p, \qquad (6.24) $$

where the “Laplacian”272 ∇² = ∇·∇, and sub (6.23) into (6.24),

$$ \nabla^2\delta p(\mathbf{x},t) = \frac{\rho}{\gamma p_0}\,\frac{d^2\delta p}{dt^2}(\mathbf{x},t), \qquad (6.25) $$

or, if you denote c² = γp0/ρ (which we shall see in a minute how this makes sense in practice),

$$ \nabla^2\delta p(\mathbf{x},t) = \frac{1}{c^2}\,\frac{d^2\delta p}{dt^2}(\mathbf{x},t), \qquad (6.26) $$
which is what people call the wave equation in three dimensions. Scalar wave equation, actually, because δp is a scalar, and there’s also a vectorial wave equation, which is the same thing but with δp replaced by a vector—we’ll have to deal with it soon. Now, it was Jean le Rond d’Alembert273 who first noticed that Eq. (6.26) is solved by any function with the form

$$ \delta p(\mathbf{x},t) = \delta p(\mathbf{k}\cdot\mathbf{x} \pm \omega t), \qquad (6.27) $$

where ω is a real, positive constant, k is a 3-vector with constant coefficients, i.e., k = (k1 , k2 , k3 ), and

$$ \frac{\omega^2}{|\mathbf{k}|^2} = c^2. \qquad (6.28) $$

In other words: if, in the argument of δp, space and time appear only combined together as in k · x ± ωt, with ω and k related through (6.28), then δp, whatever its form, will solve the wave equation. Cosines and sines, exponentials and logarithms, powers and roots, anything will do. This can be verified by direct substitution, which is what I am going to do next. It’s best to first define z = k · x ± ωt. Then, by way of the chain rule274 , the derivative of δp, with respect to x1 , can be written

$$ \frac{\partial\,\delta p}{\partial x_1} = \frac{\partial z}{\partial x_1}\,\frac{\partial\,\delta p}{\partial z} = k_1\,\frac{\partial\,\delta p}{\partial z}, \qquad (6.29) $$
and likewise

$$ \frac{\partial\,\delta p}{\partial x_2} = k_2\,\frac{\partial\,\delta p}{\partial z}, \qquad (6.30) $$

and

$$ \frac{\partial\,\delta p}{\partial x_3} = k_3\,\frac{\partial\,\delta p}{\partial z}, \qquad (6.31) $$

and

$$ \frac{\partial\,\delta p}{\partial t} = \pm\,\omega\,\frac{\partial\,\delta p}{\partial z}. \qquad (6.32) $$

If we do the second derivatives in the same way, we get

$$ \frac{\partial^2\delta p}{\partial x_1^2} = \frac{\partial z}{\partial x_1}\,\frac{\partial}{\partial z}\!\left(k_1\,\frac{\partial\,\delta p}{\partial z}\right) = k_1^2\,\frac{\partial^2\delta p}{\partial z^2}, \qquad (6.33) $$

and likewise

$$ \frac{\partial^2\delta p}{\partial x_2^2} = k_2^2\,\frac{\partial^2\delta p}{\partial z^2}, \qquad (6.34) $$

$$ \frac{\partial^2\delta p}{\partial x_3^2} = k_3^2\,\frac{\partial^2\delta p}{\partial z^2}, \qquad (6.35) $$

and

$$ \frac{\partial^2\delta p}{\partial t^2} = \omega^2\,\frac{\partial^2\delta p}{\partial z^2}, \qquad (6.36) $$

which is true whatever the sign at the right-hand side of (6.32). It follows that

$$ \nabla^2\delta p = \left(k_1^2 + k_2^2 + k_3^2\right)\frac{\partial^2\delta p}{\partial z^2}, \qquad (6.37) $$

and if you plug that, and (6.36), into (6.26),

$$ \left(k_1^2 + k_2^2 + k_3^2\right)\frac{\partial^2\delta p}{\partial z^2} = \frac{\omega^2}{c^2}\,\frac{\partial^2\delta p}{\partial z^2}. \qquad (6.38) $$

If you compare the two sides of (6.38), you see that they coincide as long as k1² + k2² + k3² = ω²/c²: which is nothing but (6.28). And so, yes, d’Alembert is right, QED. As you have seen, there was no need to specify any particular property for the function δp: which proves that d’Alembert’s solution works, like I said, whatever the form of δp. Now that the math is done, let us try to understand what Eq. (6.27) means in practice. It’s easier if we rotate the reference frame, so that one of the axes, let’s say
x1 , is parallel to k. How the reference frame is oriented doesn’t do anything to the physics of the problem, so no worries. In this frame, k2 = k3 = 0 and k · x = k1 x1 , so that (6.27) collapses to

$$ \delta p(\mathbf{x},t) = \delta p(k_1 x_1 \pm \omega t). \qquad (6.39) $$

This reference frame is nice because it makes it immediately apparent that, at a given time, δp only depends on the distance along the direction defined by k, i.e., it is a function of x1 only. Now, imagine you know what δp(x1 ) is like at a reference time t = 0: let that be represented by the black curve in Fig. 6.5. To visualize what happens after a time δt, take, e.g., the case where the sign in front of ω, in (6.39), is a minus, and look at one particular feature of the black curve: the easiest to spot is the main positive peak (but it could be any peak, actually, or any point along the curve, for that matter). Call x_p(0) the coordinate of your favourite peak at zero time. Now, wait for δt seconds and then take another snapshot of δp. Consider, at time δt, the point x_p(t) such that

$$ k_1 x_p(t) - \omega\,\delta t = k_1 x_p(0): \qquad (6.40) $$

using both the left- and right-hand sides of (6.40) as arguments for δp, we must have that

$$ \delta p\!\left[k_1 x_p(t) - \omega\,\delta t\right] = \delta p\!\left[k_1 x_p(0)\right], \qquad (6.41) $$

which means that the value that δp used to have, at zero time, at the point x_p(0) is now observed at the point x_p(t) given by (6.40). Or, in other words, that whatever peak you were looking at has “moved” from x_p(0) to x_p(t). This is true, like I said, for any value of x_p(0): so what is happening is that the entire snapshot taken at time δt is exactly the same as that taken at time t = 0, only shifted along the direction x1 . (In the direction of growing x1 , i.e., in Fig. 6.5, towards the right. Had we chosen a plus in front of ω, it would have been the left.) A function of x and t with this behavior might be called a wave—a “pressure” wave in this case, since we are talking about a perturbation in pressure. And the best word to describe what it is doing is, it propagates. The wave speed is the speed at which any point along the wave—think, e.g., its main peak—is shifted from x_p(0) to x_p(t). The equality (6.40), if you rearrange it a bit, says that the distance covered during δt is

$$ x_p(t) - x_p(0) = \frac{\omega}{k_1}\,\delta t. \qquad (6.42) $$

And but, because k2 = k3 = 0 in this frame, ω/k1 is the same as ω/|k|, which is the same as c, and

$$ x_p(t) - x_p(0) = c\,\delta t; \qquad (6.43) $$
if you divide the distance by the time needed to cover it,

$$ \frac{x_p(t) - x_p(0)}{\delta t} = c, \qquad (6.44) $$
and I think it’s pretty clear that c, then, can be interpreted as a speed275 . Careful, though: what “moves” at this speed is not matter, but a “disturbance”: the shape of the wave. If you think sound, as in meaningful sound—speech, or music, or whatever—what happens is that the information-carrying signal propagates, for example across a lecture room, but the air particles in the room almost stay put—they only experience a little vibration and the associated pressure variation. In Fig. 6.5 I am showing you δp as a function of x1 , at two selected instants in time. You could also pick a value for x1 : say, for the sake of simplicity, x1 = 0, and plot δp at that location as a function of time, which would be δp(−ωt), i.e., the mirror image276 , shown in Fig. 6.6, of the curve in Fig. 6.5. And if you were to make the same observation at a horizontal distance δx1 from the origin, you would observe the same thing, delayed by a time δx1/c (which is distance divided by propagation speed: if you think about it, it all makes sense). Because δp only changes as a function of x1 (and t), if we were to repeat the same observation shown in Fig. 6.6 at any point x2 , x3 on the “vertical” plane with the same x1 coordinate, we would always be seeing the exact same δp. In other words, the values of δp at all points of the vertical plane defined by a given value of x1 oscillate in sync. After a time δt, the exact same δp is observed along the entire vertical plane of horizontal coordinate x1 + (ω/k1)δt. Imagine that, instead of the wave of Figs. 6.5 and 6.6, we had an impulse—a click, a gunshot, a Dirac delta277 ; then it’s easier, I guess, to visualize the wave in 3-D as a planar disturbance, propagating along the direction perpendicular to the plane itself. Whatever the complexity of the signal, we call this kind of wave a plane wave. The direction of propagation coincides with the x1 axis—which, if we go back to the
Fig. 6.5 Wave propagation. The black curve is the δp observed at time t = 0 along a line parallel to the axis x1 . The grey curve is the δp observed along the same line, after a time δt. We have shown that the grey curve is the same as the black one, shifted along the x1 direction. The function x p (t) is the x1 coordinate of the wave’s main positive peak, at time t
Fig. 6.6 The same wave as in Fig. 6.5, but recorded at two fixed locations, 0 and δx1 , as a function of t
initial reference frame, is the same as saying that it is parallel with k. The speed of propagation still coincides with c = ω/|k|—nothing has changed on that front. Before I move on to exercise two, I think it’s worth pointing out a couple more things. In what I’ve done there’s no constraint on the direction of k—plane waves in an unbounded medium can propagate whichever way they want. ω and the magnitude of k are also free to change, in principle, but if you change either one, you also have to change the other so that the ratio ω/|k|, AKA c, stays the same: which makes sense, because remember, c = √(γp0/ρ), and γ, p0 and ρ are not arbitrary coefficients, but physical parameters that define the, uhm, experimental setup that we are dealing with. Depending on their values, we can use the formulation we’ve just put together to study how audible sound propagates in the earth’s atmosphere, how ultrasound propagates through water, etc. There’s a big limitation to what we’ve done so far, though. We’ve shown that plane waves work as a possible solution of Euler’s equation in unbounded three-dimensional space: but they are not the only possible solution. In fact, in your everyday life you’re experiencing spherical waves much more often than plane waves. A spherical wave is like a plane wave, except that planar surfaces are replaced by spherical ones—e.g., an impulsive sound is emitted at a point, and, as it propagates away from it, the loci of points that are hit by the impulse at the same time are spherical surfaces278 . This is what happens, e.g., when someone speaks, or when sound comes out of a loudspeaker—if you are OK with considering those sources small enough, compared to the distance at which the listeners sit, that we can approximate them as points, with no volume or structure. Which is usually OK, really. We’ll look at one example of spherical waves (not “sound”, though, but actual seismic waves—compressional and shear) later in the chapter.
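Both claims made here—the value of c = √(γp0/ρ), and the fact that any profile of the form δp(k·x − ωt) solves the wave equation—can be checked numerically. A sketch (the Gaussian waveform is chosen arbitrarily, and the air parameters are rough textbook values):

```python
import math

# sound speed from c = sqrt(gamma * p0 / rho); rough values for air
gamma, p0, rho = 1.4, 101325.0, 1.2
c = math.sqrt(gamma * p0 / rho)      # comes out near the familiar ~340 m/s

# d'Alembert check in 1-D: wave(x, t) = f(k*x - w*t) solves
# d2p/dx2 = (1/c^2) d2p/dt2 provided w/k = c.  f is arbitrary; here a Gaussian.
k = 2.0
w = c * k
wave = lambda x, t: math.exp(-(k * x - w * t) ** 2)

h = 1e-4                              # finite-difference step
x0, t0 = 0.3, 0.001
d2x = (wave(x0 + h, t0) - 2 * wave(x0, t0) + wave(x0 - h, t0)) / h ** 2
d2t = (wave(x0, t0 + h) - 2 * wave(x0, t0) + wave(x0, t0 - h)) / h ** 2
# d2x and d2t / c^2 agree to within finite-difference error
```

Swapping the Gaussian for any other smooth profile leaves the check unchanged, which is exactly d’Alembert’s point.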
Either way, anyway, as long as we consider only observers far enough from the source, the surfaces’ curvature is small, and, locally, there isn’t much difference between a plane wave and a spherical wave emitted by a faraway source. And one last thing: all this doesn’t mean much, in practice, if you don’t prescribe some initial conditions (boundary conditions don’t apply here, because we don’t have boundaries). Thus far, I’ve shown you that plane waves can exist in an elastic fluid; but to use this knowledge to make any sort of “prediction” about the future—like,
what sound am I going to hear in a few seconds?—we need to have some info about the status of the system at least a little earlier—like, the lightning I just saw at the horizon279 .
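The role of the initial condition can be sketched in one dimension: knowing the snapshot at t = 0 is enough to “predict” the field at any later time, since the profile just translates at speed c (toy numbers throughout):

```python
# toy 1-D "forecast": the profile at time t is the t = 0 snapshot,
# rigidly shifted by c*t (for a wave moving toward growing x1)
c = 340.0                                        # m/s, assumed speed
snapshot = lambda x: max(0.0, 1.0 - abs(x))      # triangular pulse at t = 0

def p_at(x, t):
    # d'Alembert: whatever the initial snapshot showed at x - c*t
    # is what an observer at x sees at time t
    return snapshot(x - c * t)

peak_later = p_at(340.0, 1.0)   # the t = 0 peak, one second later, 340 m away
```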
6.6 “Surface Waves” in a Fluid Half Space

And now exercise two: water waves in what people call a half space. This is more complex than exercise one, so we simplify some more, picking the displacement (or, which is the same, the velocity) to be irrotational, and the medium to be incompressible. An incompressible medium is a medium which, even when it’s under pressure, doesn’t lose or gain volume; it will be deformed but its volume and, as a result, its mass density stay constant. Through Eq. (6.19), we know that’s the same as the displacement having zero divergence; and because velocity is just the time-derivative of displacement, velocity then must have zero divergence as well, ∇ · v = 0. At the pressures and temperatures that we are talking about (those found at the earth’s surface) we know that water is pretty much incompressible, so it’s OK to make this simplification. Now, the thing about the displacement being irrotational: displacement is a combination of translation (a body moves from one place to another), deformation (its shape changes), and rotation (it rotates, which of course might happen without any translation or deformation). If there is no rotation, then, that’s when we say that the displacement is irrotational. I am going to try to show you how one identifies the rotational component of displacement. We know from Chap. 2, Eq. (2.11), that, at a generic location x, the displacement field280 that describes a pure rotation reads

$$ \mathbf{u} = \delta\boldsymbol{\Theta} \times \mathbf{x}, \qquad (6.45) $$

where × means a “cross product”281 , δ𝚯 = 𝛀δt, with 𝛀 the angular velocity vector and δt a small increment in time. To keep things simple, let’s assume 𝛀 to be constant, so that δ𝚯 is constant, too. Now, the curl of a vector is the cross product of nabla with that vector282 , so to “take the curl” of both sides of what we just wrote means

$$ \nabla\times\mathbf{u} = \delta t\,\nabla\times(\boldsymbol{\Omega}\times\mathbf{x}). \qquad (6.46) $$

Let us first work out 𝛀 × x, which283

$$ \boldsymbol{\Omega}\times\mathbf{x} = \left(\Omega_2 x_3 - \Omega_3 x_2,\ \Omega_3 x_1 - \Omega_1 x_3,\ \Omega_1 x_2 - \Omega_2 x_1\right). \qquad (6.47) $$

Now take the curl of what you just got,
$$
\begin{aligned}
\nabla\times(\boldsymbol{\Omega}\times\mathbf{x}) &= \left(\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3}\right) \times \left(\Omega_2 x_3 - \Omega_3 x_2,\ \Omega_3 x_1 - \Omega_1 x_3,\ \Omega_1 x_2 - \Omega_2 x_1\right) \\
&= \begin{pmatrix}
\frac{\partial}{\partial x_2}(\Omega_1 x_2 - \Omega_2 x_1) - \frac{\partial}{\partial x_3}(\Omega_3 x_1 - \Omega_1 x_3) \\
\frac{\partial}{\partial x_3}(\Omega_2 x_3 - \Omega_3 x_2) - \frac{\partial}{\partial x_1}(\Omega_1 x_2 - \Omega_2 x_1) \\
\frac{\partial}{\partial x_1}(\Omega_3 x_1 - \Omega_1 x_3) - \frac{\partial}{\partial x_2}(\Omega_2 x_3 - \Omega_3 x_2)
\end{pmatrix}
= 2\begin{pmatrix}\Omega_1\\ \Omega_2\\ \Omega_3\end{pmatrix} = 2\,\boldsymbol{\Omega},
\end{aligned} \qquad (6.48)
$$

because the vector 𝛀 is constant. Substituting into (6.46),

$$ \nabla\times\mathbf{u} = 2\,\delta t\,\boldsymbol{\Omega}, \qquad (6.49) $$
which means that the curl of a displacement field that, at a point x, describes a rotation, coincides with the angular velocity vector associated with that same rotation, multiplied by 2δt. You can turn this around, as follows: if the curl of a displacement field u at a location x is nonzero, that means that there exists a vector 𝛀, proportional to the curl of u (compare (6.49): 𝛀 = ∇ × u/(2δt)), such that u = δt 𝛀 × x at that location. So, locally, at x, you can still think of the displacement as a rotation. Bottom line, the curl of a displacement field is strictly related to how much rotation is happening. If the curl is zero, that means that there’s no rotation at all—there could still be distortion, though284 . If the curl is nonzero, we could imagine splitting the displacement field in three parts: (i) translation (which is curl-free—it’s also divergence-free, actually: it’s a constant vector); (ii) curl-free deformation; (iii) a combination of pure rotation, and/or “curly” deformation (which you can always think of as a combination of rotations). In the following, like I said, the assumption is made that water displacement is irrotational. Unlike with incompressibility, I don’t think there’s a particularly good a-priori reason to make this assumption: the reason we make it is that we know, a posteriori, that the theory that comes out of this is good at describing what happens in the real world. Now that the concepts of curl and of irrotational vector field are clarified, more or less, we are ready to derive ocean surface waves mathematically285 . Let’s go back to Euler’s equation (6.8); we need, first, to play around with v · ∇v, which appears at its right-hand side, and whose i-th component can be written, after adding and subtracting the same thing to and from it, and via Einstein’s convention286 ,

$$ v_k\,\frac{\partial v_i}{\partial x_k} = v_k\left(\frac{\partial v_i}{\partial x_k} - \frac{\partial v_k}{\partial x_i}\right) + v_k\,\frac{\partial v_k}{\partial x_i}. \qquad (6.50) $$
This is the same as saying, in tensorial notation,

$$ \mathbf{v}\cdot\nabla\mathbf{v} = \mathbf{v}\cdot\left[\nabla\mathbf{v} - (\nabla\mathbf{v})^T\right] + \mathbf{v}\cdot(\nabla\mathbf{v})^T. \qquad (6.51) $$

But, it follows from the definition of curl287 that

$$ \mathbf{v}\times(\nabla\times\mathbf{v}) = -\mathbf{v}\cdot(\nabla\mathbf{v}) + \mathbf{v}\cdot(\nabla\mathbf{v})^T, \qquad (6.52) $$

or

$$ \mathbf{v}\cdot(\nabla\mathbf{v}) = -\mathbf{v}\times(\nabla\times\mathbf{v}) + \mathbf{v}\cdot(\nabla\mathbf{v})^T, \qquad (6.53) $$

and so if I plug that into (6.8),

$$ \mathbf{f} - \frac{1}{\rho}\nabla p = -\mathbf{v}\times(\nabla\times\mathbf{v}) + \mathbf{v}\cdot(\nabla\mathbf{v})^T + \frac{\partial\mathbf{v}}{\partial t}. \qquad (6.54) $$
Now look at the second term at the right-hand side of what we’ve just written, i.e., v · (∇v)T : its k-th component reads 3 ∂vi 1 ∂ vi = v2 ∂ xk 2 ∂ xk i=1 i 3 1 2 ∂ = v ; ∂ xk 2 i=1 i
(6.55)
remember Chap. 4: the expression ½ Σ3i=1 v2i coincides with the kinetic energy per unit mass of water in the vicinity of the material point at x; let’s call it K and substitute into Eq. (6.54),
\[ \mathbf{f} - \frac{1}{\rho}\nabla p = -\mathbf{v}\times(\nabla\times\mathbf{v}) + \nabla K + \frac{\partial\mathbf{v}}{\partial t}. \tag{6.56} \]
The idea of all this is that, to predict what the motion is going to be like, you never need to know, like, the value of K, or v, for the whole continuum: once you solve the differential Eq. (6.56), meaning, once you find a function v(x, t) for which (6.56) is verified at any x and t, then you just implement it—find its numerical value—at the place and time that’s relevant for you; e.g., at your observation point, where you’ve installed the buoy that will move up and down at each passing wave: so you can compare prediction with data. Now, we said we’d look at irrotational (AKA, curl-free) displacements only. So we have ∇ × v = 0, and (6.56) becomes
\[ \mathbf{f}(\mathbf{x},t) - \frac{1}{\rho}\nabla p(\mathbf{x},t) = \nabla K(\mathbf{x},t) + \frac{\partial\mathbf{v}}{\partial t}(\mathbf{x},t). \tag{6.57} \]
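The chain of manipulations (6.50)–(6.56) amounts to the vector identity (v · ∇)v = ∇K − v × (∇ × v). A quick numerical sketch can check it on any smooth test field; the field v below is invented purely for the test, and all derivatives are taken by central finite differences.

```python
import math

def v(x):
    """An arbitrary smooth test field (made up just for this check)."""
    return (x[0] * x[1], math.sin(x[2]), x[0] + x[1] * x[2])

H = 1e-5  # finite-difference step

def partial(f, x, i, j):
    """Central difference d f_i / d x_j at point x."""
    xp = list(x); xm = list(x)
    xp[j] += H; xm[j] -= H
    return (f(xp)[i] - f(xm)[i]) / (2 * H)

def advective(x):
    """(v . grad) v, i.e. v_k d v_i / d x_k, component by component."""
    return [sum(v(x)[k] * partial(v, x, i, k) for k in range(3))
            for i in range(3)]

def grad_K(x):
    """Gradient of K = |v|^2 / 2."""
    K = lambda y: 0.5 * sum(c * c for c in v(y))
    out = []
    for j in range(3):
        xp = list(x); xm = list(x)
        xp[j] += H; xm[j] -= H
        out.append((K(xp) - K(xm)) / (2 * H))
    return out

def curl_v(x):
    return [partial(v, x, 2, 1) - partial(v, x, 1, 2),
            partial(v, x, 0, 2) - partial(v, x, 2, 0),
            partial(v, x, 1, 0) - partial(v, x, 0, 1)]

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

# Check (v . grad) v = grad K - v x (curl v) at an arbitrary point.
x0 = [0.4, -1.2, 0.8]
lhs = advective(x0)
rhs = [g - c for g, c in zip(grad_K(x0), cross(v(x0), curl_v(x0)))]
for l, r in zip(lhs, rhs):
    assert abs(l - r) < 1e-6
```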
180
6 The Vibrations of the Earth
And ∇ × v = 0 has another, important consequence: there’s a mathematical theorem called Stokes’ theorem which implies that when a vector field is, like v, curl-free, its integral \(\oint_C \mathbf{v}\cdot d\mathbf{r}\) along any closed curve C is zero288 . One consequence of this is that, if v is curl-free, then there must exist a scalar function φ(x, t) such that v can be written as its gradient289 ,
\[ \mathbf{v}(\mathbf{x},t) = \nabla\phi(\mathbf{x},t), \tag{6.58} \]
and
\[ \mathbf{f}(\mathbf{x},t) - \frac{1}{\rho}\nabla p(\mathbf{x},t) = \nabla K(\mathbf{x},t) + \nabla\left(\frac{\partial\phi}{\partial t}(\mathbf{x},t)\right). \tag{6.59} \]
Then, now, consider that the force-per-unit-mass, f(x, t), in practice, is just gravity: if g is the gravitational acceleration caused by the earth’s attraction (we can neglect the rest, like the moon and the sun and so on, for the time being), then
\[ \mathbf{f}(\mathbf{x},t) = -g\,\hat{\mathbf{x}}_3 = -\nabla(g x_3), \tag{6.60} \]
which we can sub into (6.59),
\[ -\nabla(g x_3) - \frac{1}{\rho}\nabla p(\mathbf{x},t) = \nabla K(\mathbf{x},t) + \nabla\left(\frac{\partial\phi}{\partial t}(\mathbf{x},t)\right). \tag{6.61} \]
But then, move things around a bit, and what we’ve just written becomes
\[ \nabla\left( g x_3 + \frac{p}{\rho} + K + \frac{\partial\phi}{\partial t} \right) = 0, \tag{6.62} \]
which is the same as saying that the argument of ∇ at the left-hand side is constant with respect to position in space, or
\[ g x_3 + \frac{p}{\rho} + K + \frac{\partial\phi}{\partial t} = C, \tag{6.63} \]
with C an arbitrary constant. Now subtract from both sides the ratio p0/ρ, where p0 is a reference value for pressure that could be, e.g., atmospheric pressure at the ocean surface, which is approximately constant, of course; and water density ρ is constant, too, in our approximation, so p0/ρ is constant. Subtracting it from both sides of (6.63),
\[ g x_3 + \frac{\delta p}{\rho} + K + \frac{\partial\phi}{\partial t} = C - \frac{p_0}{\rho}, \tag{6.64} \]
where δp = p − p0 . At the right-hand side we still have an arbitrary constant: we might call it C′; and let’s neglect the square of the velocity, like Lamb does in his book, the idea being that we are looking at relatively small deformations and so velocity is small too, presumably, and a squared velocity is second-order small. So, anyway, that means that K ≈ 0, and so
\[ g x_3 + \frac{\delta p}{\rho} + \frac{\partial\phi}{\partial t} = C'. \tag{6.65} \]
Now, some more info that we haven’t taken into account, yet: at the ocean surface, the atmosphere can’t really oppose the motion of the water—other than with its weight, which is already accounted for in the δp term; but there can’t be any additional pressure acting on that surface. The ocean bottom, on the other hand, is rigid—can’t be moved significantly by water—so, no vertical displacements there. Mathematically, that translates into the boundary conditions
\[ \delta p(x_1, x_2, \eta, t) = 0, \tag{6.66} \]
for all values of x1 , x2 and t, where η = η(x1 , x2 , t) is the deformation of the surface; and
\[ v_3(x_1, x_2, -H, t) = 0, \tag{6.67} \]
where H is the depth of the ocean floor. (And, remember, x3 is positive upwards: so, yes, at the ocean floor, x3 = −H .) If you sub δp = 0 and x3 = η into (6.65),
\[ g\,\eta(x_1, x_2, t) + \frac{\partial\phi}{\partial t}(x_1, x_2, 0, t) = C'. \tag{6.68} \]
The function φ(x, t), as we introduced it a minute ago through Eq. (6.58), doesn’t really have a physical meaning per se; only its derivatives with respect to the spatial coordinates do. And C′ being a constant, it won’t contribute to the derivatives at all. So we can give it whatever value we want, and the easiest is C′ = 0, so we are going to do just that, and
\[ g\,\eta(x_1, x_2, t) + \frac{\partial\phi}{\partial t}(x_1, x_2, 0, t) = 0. \tag{6.69} \]
Now, by definition of η,
\[ v_3(x_1, x_2, 0, t) = \frac{\partial\eta}{\partial t}(x_1, x_2, t); \tag{6.70} \]
and then, through Eq. (6.58),
\[ \frac{\partial\phi}{\partial x_3}(x_1, x_2, 0, t) = \frac{\partial\eta}{\partial t}(x_1, x_2, t). \tag{6.71} \]
Likewise, the boundary condition (6.67) can be written
\[ \frac{\partial\phi}{\partial x_3}(x_1, x_2, -H, t) = 0. \tag{6.72} \]
I anticipated that, to keep things simple, we’d take the medium to be incompressible—which water approximately is, I guess, anyway. Incompressibility means that, whatever happens, there’s no relative change in the volume of any chunk of matter: and, through Eq. (6.19), we know that that’s the same as the velocity field having zero divergence, i.e.,
\[ 0 = \nabla\cdot\mathbf{v} = \nabla\cdot(\nabla\phi) = \nabla^2\phi. \tag{6.73} \]
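Equation (6.73) says that φ must have zero Laplacian everywhere in the fluid. As a quick numerical sanity check (the wavenumber K below is an assumed value, picked only for illustration), a potential that decays exponentially in one direction and oscillates sinusoidally in the other satisfies it: the two second derivatives cancel exactly.

```python
import math

K = 2.0  # assumed wavenumber, illustrative only

def phi(x1, x3):
    """A candidate potential: exponential in depth, sinusoidal along x1."""
    return math.exp(K * x3) * math.sin(K * x1)

def laplacian(f, x1, x3, h=1e-4):
    """Finite-difference Laplacian d2f/dx1^2 + d2f/dx3^2."""
    return ((f(x1 + h, x3) - 2 * f(x1, x3) + f(x1 - h, x3)) / h**2
            + (f(x1, x3 + h) - 2 * f(x1, x3) + f(x1, x3 - h)) / h**2)

# d2/dx1^2 contributes -K^2 * phi, d2/dx3^2 contributes +K^2 * phi: they cancel.
assert abs(laplacian(phi, 0.3, -0.5)) < 1e-5
```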
Equations (6.69), (6.71), (6.72) and (6.73) contain all the constraints we have been able to put together re our fluid half-space problem: they’re the full physical description of the problem. All that’s left to do is try to find a solution, i.e., find functions φ and η such that all four equations are verified. That’s not trivial at all and I don’t expect that even the best of you would be able to guess how that is done right away, unless you’ve seen the solution to some similar problem before. (I, for one, certainly didn’t know what was going on, when I was exposed to these things for the first time. And the second time, too). Anyway, so, the trick is that if you look at the ocean surface, you see that there are “surface waves” propagating across it, and so you might as well make an educated guess that the disturbance we’re looking for is a plane wave, similar to those we’ve met in exercise one; the direction of propagation is parallel to the (unperturbed) ocean surface, i.e., the half-space surface, and we might take the x1 axis to coincide with it, so that
\[ \phi(\mathbf{x},t) = f(x_3)\, q(k x_1 - \omega t), \tag{6.74} \]
\[ \eta(\mathbf{x},t) = a\, h(k x_1 - \omega t), \tag{6.75} \]
with constant a, k and ω—these parameters are the same for both φ and η, of course, because surface deformation η and velocity v are two expressions of the same wave. The functions f , q and h remain to be determined. Equations (6.74) and (6.75) say that, if we look at the ocean surface from above, we indeed see a planar, or, I should say, linear wave propagating across it in the direction x1 (if you are not sure about this, go back to exercise one). The fact that all the x3 -dependence is contained in f , which, in turn, depends on x3 only, means that, deeper in the ocean, we’d see the same wave as we see at the surface, only with different amplitude. You might have noticed that, for the sake of “simplicity”, we are only looking at waves traveling in the growing-x1 direction—that’s what the minus in front of ω does.
Before we plug our guess for the solution into those four equations above, there’s one more trick that we shall play. It’s an important one: it’s the first example, in this book, of solving a differential problem via the Fourier transform. The Fourier transform could be the topic of entire books—it probably is, actually, but at this point I’ll just tell you what I think is the most important point of the whole story: which is that Fourier, back in 1807 or so290 , had the intuition that pretty much any function defined over a finite interval, i.e., any function that could reasonably be used to describe real-world stuff, could be written as a combination of sinusoidal functions291,292 , i.e.,
\[ u(x) = \int_{-\infty}^{+\infty} U_c(\nu)\cos(\nu x)\, d\nu + \int_{-\infty}^{+\infty} U_s(\nu)\sin(\nu x)\, d\nu \tag{6.76} \]
(or something similar293 ), where the pair of functions Uc (ν), Us (ν), which work kind of like the coefficients of a linear combination (of sines and cosines), are called, together, the “Fourier transform” of u(x). After Fourier’s contribution, physicists faced with problems like the one we are dealing with right now started to systematically look for sinusoidal solutions, i.e., they’d pick the functions q and h in (6.74) and (6.75) to be a sine and a cosine, as in294
\[ \phi(\mathbf{x},t) = f(x_3)\sin(k x_1 - \omega t), \tag{6.77} \]
\[ \eta(\mathbf{x},t) = a\cos(k x_1 - \omega t), \tag{6.78} \]
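To make Fourier's claim a bit more concrete, here's a small sketch. The square wave is a classical textbook example (not part of this derivation): its sine-series partial sums get closer and closer to the target function, away from the jumps, as more sinusoids are added.

```python
import math

def square(x):
    """Target function: a unit square wave (the sign of sin x)."""
    return 1.0 if math.sin(x) > 0 else -1.0

def partial_sum(x, n_terms):
    """Partial Fourier sum of the square wave: (4/pi) * sum of sin(nx)/n over odd n."""
    return (4 / math.pi) * sum(math.sin((2*m + 1) * x) / (2*m + 1)
                               for m in range(n_terms))

# Away from the jumps, more sinusoids = better approximation.
x = math.pi / 2
err_10 = abs(partial_sum(x, 10) - square(x))
err_1000 = abs(partial_sum(x, 1000) - square(x))
assert err_1000 < err_10 < 0.1
```

The same idea is what licenses the sinusoidal guesses (6.77) and (6.78): whatever the actual disturbance is, it can be built by superposing solutions of this shape.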
with k > 0, ω > 0. (ω in (6.78) is called the frequency of the sinusoidal wave: because imagine you are sitting at some fixed location x1 : you’ll see the surface of the water going up and down at a rate of ω/2π cycles per second (AKA Hertz). For more obscure reasons, k is called the wavenumber. If you take a snapshot of the wave at a given t, k tells you how dense the crests and/or troughs are with respect to x1 : if you walk a distance 2π/k along x1 , you cover precisely one full cycle of the wave: and so 2π/k is the wavelength; the smaller the wavenumber, the longer the wave, etc.) What one needs to do, now, is find a function f (x3 ), and combinations of values of ω and k, such that (6.77) and (6.78) solve the system of (6.69) and (6.71)–(6.73). If you can do that, then the general solution—the general ocean wave—will be a superposition of instances of ∇φ and/or η where f (x3 ), ω and k are replaced by the values that we so determined. I guess this will become clearer when we actually work it out. So let’s just do it. Substituting (6.77) into (6.73) gives
\[ -f(x_3)\,k^2 \sin(k x_1 - \omega t) + \frac{\partial^2 f}{\partial x_3^2}(x_3)\sin(k x_1 - \omega t) = 0, \tag{6.79} \]
which if you divide it by sin(kx1 − ωt) boils down to
\[ -f(x_3)\,k^2 + \frac{\partial^2 f}{\partial x_3^2}(x_3) = 0, \tag{6.80} \]
a second-order ODE, whose general solution is
\[ f(x_3) = A e^{k x_3} + B e^{-k x_3}; \tag{6.81} \]
and it follows that
\[ \phi(x_1, x_3, t) = \left(A e^{k x_3} + B e^{-k x_3}\right)\sin(k x_1 - \omega t). \tag{6.82} \]
Then, if we plug this into the boundary condition (6.72), we get
\[ k\left(A e^{-k H} - B e^{k H}\right)\sin(k x_1 - \omega t) = 0 \tag{6.83} \]
for all x1 and t, which implies that
\[ A e^{-k H} - B e^{k H} = 0, \tag{6.84} \]
or
\[ B = A e^{-2 k H}; \tag{6.85} \]
we should update, then, the formula for φ, replacing (6.82) with
\[ \phi(x_1, x_3, t) = A\left(e^{k x_3} + e^{-2 k H} e^{-k x_3}\right)\sin(k x_1 - \omega t). \tag{6.86} \]
Substituting this into Eq. (6.71), we get
\[ \frac{\partial\eta}{\partial t}(x_1, x_2, t) = k A\left(1 - e^{-2 k H}\right)\sin(k x_1 - \omega t); \tag{6.87} \]
subbing (6.78), which we still haven’t used, into the left-hand side of what we just wrote,
\[ a\,\omega\sin(k x_1 - \omega t) = k A\left(1 - e^{-2 k H}\right)\sin(k x_1 - \omega t), \tag{6.88} \]
or
\[ a\,\omega = k A\left(1 - e^{-2 k H}\right). \tag{6.89} \]
It follows that
\[ A = \frac{a\,\omega}{k\left(1 - e^{-2 k H}\right)} = \frac{a\,\omega\, e^{k H}}{k\left(e^{k H} - e^{-k H}\right)}, \tag{6.90} \]
which if you sub that into (6.86), you get
\[ \phi(x_1, x_3, t) = \frac{a\,\omega}{k}\,\frac{e^{k(x_3+H)} + e^{-k(x_3+H)}}{e^{k H} - e^{-k H}}\,\sin(k x_1 - \omega t). \tag{6.91} \]
From (6.91), we can reconstruct velocity and displacement:
\[ v_1(x_1, x_3, t) = \frac{\partial\phi}{\partial x_1}(x_1, x_3, t) = a\,\omega\,\frac{e^{k(x_3+H)} + e^{-k(x_3+H)}}{e^{k H} - e^{-k H}}\,\cos(k x_1 - \omega t); \tag{6.92} \]
\[ v_3(x_1, x_3, t) = \frac{\partial\phi}{\partial x_3}(x_1, x_3, t) = \frac{a\,\omega}{k}\,\frac{k e^{k(x_3+H)} - k e^{-k(x_3+H)}}{e^{k H} - e^{-k H}}\,\sin(k x_1 - \omega t) = a\,\omega\,\frac{e^{k(x_3+H)} - e^{-k(x_3+H)}}{e^{k H} - e^{-k H}}\,\sin(k x_1 - \omega t); \tag{6.93} \]
\[ u_1(x_1, x_3, t) = \int dt\, v_1(x_1, x_3, t) = a\,\omega\,\frac{e^{k(x_3+H)} + e^{-k(x_3+H)}}{e^{k H} - e^{-k H}}\int dt\,\cos(k x_1 - \omega t) = a\,\omega\,\frac{e^{k(x_3+H)} + e^{-k(x_3+H)}}{e^{k H} - e^{-k H}}\,\frac{1}{-\omega}\sin(k x_1 - \omega t) = -a\,\frac{e^{k(x_3+H)} + e^{-k(x_3+H)}}{e^{k H} - e^{-k H}}\,\sin(k x_1 - \omega t); \tag{6.94} \]
\[ u_3(x_1, x_3, t) = \int dt\, v_3(x_1, x_3, t) = a\,\omega\,\frac{e^{k(x_3+H)} - e^{-k(x_3+H)}}{e^{k H} - e^{-k H}}\int dt\,\sin(k x_1 - \omega t) = a\,\frac{e^{k(x_3+H)} - e^{-k(x_3+H)}}{e^{k H} - e^{-k H}}\,\cos(k x_1 - \omega t). \tag{6.95} \]
If you look this stuff up in textbooks, you’re likely to find Eqs. (6.94) and (6.95) in a more compact form, involving the so-called hyperbolic cosine and hyperbolic sine, cosh and sinh, which are defined295
\[ \cosh(x) = \frac{1}{2}\left(e^{x} + e^{-x}\right) \tag{6.96} \]
Fig. 6.7 Hyperbolic cosine and hyperbolic sine
and
\[ \sinh(x) = \frac{1}{2}\left(e^{x} - e^{-x}\right), \tag{6.97} \]
and look at Fig. 6.7 to see what they look like. If you prefer hyperbolic cosine and sine to the exponentials, you’ll want to write
\[ u_1(x_1, x_3, t) = -a\,\frac{\cosh\left[k(x_3+H)\right]}{\sinh(k H)}\,\sin(k x_1 - \omega t) \tag{6.98} \]
and
\[ u_3(x_1, x_3, t) = a\,\frac{\sinh\left[k(x_3+H)\right]}{\sinh(k H)}\,\cos(k x_1 - \omega t); \tag{6.99} \]
let us try to visualize the displacement that they describe. The way it changes over time is controlled by the sin and cos in (6.98) and (6.99). Because the argument of both is kx1 − ωt, we know, from what we’ve learned a few pages ago about body waves in water, that, again, what we have is a wave; and that it propagates in the x1 direction, i.e., parallel to the ocean surface: we might call it a “surface wave”; or a “gravity wave”, because the force that pulls the water back at each oscillation is gravity—see above. Surface waves are more complicated than the body waves we’ve seen earlier, though, because what is being propagated now is not (or not only) pressure, but a vectorial displacement—with one component, u 2 , which is identically zero, while the other two, u 1 and u 3 , aren’t. To visualize what happens, let us position ourselves, for the sake of simplicity, at the origin of the reference frame, x1 = 0 and x3 = 0. At time t = 0, displacement here will have components
\[ u_1(x_1, x_3, t) = 0 \tag{6.100} \]
and
\[ u_3(x_1, x_3, t) = a\,\frac{\sinh\left[k(x_3+H)\right]}{\sinh(k H)}, \tag{6.101} \]
because the sin of zero is zero and the cos of zero is one. So if we were previously sitting at the origin, now we’ve been pushed up vertically a distance a sinh[k(x3 + H)]/sinh(kH) from our initial position, and we occupy the point that’s marked A in Fig. 6.8. In the moments that follow t = 0, the cos in (6.99) decreases296 while the sin in (6.98) grows. At time t = π/(2ω), the argument of both sin and cos becomes −π/2, so the cos becomes zero, the sin becomes minus one, and
Fig. 6.8 Trajectory of a material point hit by the ocean wave described by Eqs. (6.98) and (6.99). The coordinates of A and C are (0, ±a sinh[k(x3 + H)]/sinh(kH)); the coordinates of B and D are (±a cosh[k(x3 + H)]/sinh(kH), 0). You see from Eqs. (6.96) and (6.97), and/or from Fig. 6.7, that cosh[k(x3 + H)] > sinh[k(x3 + H)] if k(x3 + H) > 0, which it is. Bottom line, a water parcel hit by one of these waves moves along an ellipse, whose horizontal axis is larger than the vertical one. It moves “in the direction of wave-propagation when it is at a crest, and in the opposite direction when it is in a trough” (Lamb), i.e., its motion is prograde
\[ u_1(x_1, x_3, t) = a\,\frac{\cosh\left[k(x_3+H)\right]}{\sinh(k H)}, \tag{6.102} \]
\[ u_3(x_1, x_3, t) = 0. \tag{6.103} \]
This means we’re back at our initial elevation x3 = 0, but pushed forward (positive x1 direction) to a distance a cosh[k(x3 + H)]/sinh(kH) from the origin, i.e., point B in Fig. 6.8. It shouldn’t be too difficult now to see that at time t = π/ω we shall find ourselves at point C, or (0, −a sinh[k(x3 + H)]/sinh(kH))—below our initial position. Likewise, at time t = 3π/(2ω) we’ll be at D, and when t = 2π/ω we’ll be back at A: and so on and so forth, because sin and cos, of course, are periodic. You see from Fig. 6.8 that the trajectory of a parcel of water, hit by the ocean wave, is elliptical; in Lamb’s words, “a surface-particle is moving in the direction of wave-propagation when it is at a crest, and in the opposite direction when it is in a trough”—which, by the way, is what people call a “prograde” motion. The way displacement depends on depth is controlled by the argument, k(x3 + H), of the hyperbolic functions at the numerators. At the ocean surface, x3 = 0, they boil down to cosh(kH) and sinh(kH). Then, as depth grows, x3 grows in the negative, and so k(x3 + H) decreases from kH to 0—which is its value at the ocean bottom x3 = −H. As a result the cosh (you should be looking at Fig. 6.7 as you read this) drops from cosh(kH) to 1, and the sinh from sinh(kH) to 0. In practice, motion is still elliptical, but the size of the ellipse decreases with increasing depth; it is largest at the ocean surface, and smallest at the ocean bottom (where vertical displacement u3 is zero, actually, as required by the boundary conditions). More interpretation: our “surface wave” propagates with speed ω/k—we see this straight away from (6.98) and (6.99) if we remember the piece about body waves, above. There’s more to the speed of surface waves, though, that emerges from the math we’ve just worked out. To see what I am talking about, go back to Eq. (6.69), and sub expression (6.91) for φ, and expression (6.78) for η into it.
You should get:
\[ \frac{a\,\omega}{k}\,\frac{e^{k H} + e^{-k H}}{e^{k H} - e^{-k H}}\,\frac{\partial}{\partial t}\sin(k x_1 - \omega t) = -g\, a\cos(k x_1 - \omega t), \tag{6.104} \]
which, after some algebra, becomes
\[ \frac{a\,\omega^2}{k}\,\frac{\cosh(k H)}{\sinh(k H)} = g\, a, \tag{6.105} \]
or
\[ \frac{\omega^2}{k^2} = \frac{g}{k}\,\frac{\sinh(k H)}{\cosh(k H)}, \tag{6.106} \]
or
\[ \frac{\omega^2}{k^2} = \frac{g}{k}\tanh(k H), \tag{6.107} \]
because, not surprisingly, the hyperbolic tangent tanh is defined as the ratio of sinh to cosh. So then
\[ \omega = \sqrt{g\, k \tanh(k H)}. \tag{6.108} \]
What this is saying is that frequency ω and wavenumber k are not independent: at each frequency—imagine that a vibration with frequency exactly equal to ω is transmitted into the water—there’s only one possible value for the wavenumber k, and vice-versa, and their relationship is fully determined by the depth H of the ocean floor. All this also means that, unlike in the case of body waves, which propagate at one and only one speed, regardless of their frequency and/or wavenumber, surface waves travel at different speeds depending on their frequency—and/or wavenumber297 , which is the same thing, because see Eq. (6.108). Waves with such a property are called “dispersive”, and the equation that relates their speed (which by now you know to coincide with the ratio ω/k) to ω, or to k, or that relates ω with k—all these are called the “dispersion relation” and/or “dispersion curve”.
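Here is what the dispersion relation (6.108) gives numerically, in a short sketch (g and the 4000-m ocean depth are illustrative assumptions, not values from the text): for very long waves the phase speed tends to √(gH), independent of wavenumber, while short waves travel at √(g/k), slower the shorter they are.

```python
import math

g = 9.81          # gravitational acceleration, m/s^2
H_depth = 4000.0  # assumed ocean depth, m (illustrative)

def phase_speed(k):
    """c = omega/k = sqrt(g * tanh(k*H) / k), from the dispersion relation."""
    return math.sqrt(g * math.tanh(k * H_depth) / k)

# Long waves (k*H << 1): tanh(kH) ~ kH, so c -> sqrt(g*H), independent of k.
k_long = 1e-6
assert abs(phase_speed(k_long) - math.sqrt(g * H_depth)) < 0.1

# Short waves (k*H >> 1): tanh(kH) -> 1, so c -> sqrt(g/k): dispersive.
k_short = 1.0
assert abs(phase_speed(k_short) - math.sqrt(g / k_short)) < 1e-6
```

The long-wave limit is why a very long wave in a 4000-m-deep ocean travels at roughly √(9.81 × 4000) ≈ 198 m/s regardless of its wavelength.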
6.7 Back to Mallet, Again

So now we’ve done all the math for the case of waves in water, whether they be sound waves or surface waves. This math was more or less available to Mallet when he wrote his paper; and he seems to be aware of it at least to some extent. But as for elastic waves in solid media, things don’t seem to be so clear to him. Remember, Mallet speaks of: (i) an elastic wave propagating through the solid crust of the earth; (ii) an acoustic wave propagating through the air, if the earthquake is on land; (iii) an ocean wave (tsunami wave) which only occurs when the source is under the bed of the ocean. There is only one “elastic wave” in Mallet’s idea of an earthquake. He calls it “earth wave”, or “elastic earth wave, or shock”, or “earth wave of shock”, and doesn’t say anything about the separation of P and S waves, which had already been predicted theoretically (in a minute we’ll see how) by Cauchy and/or his French colleagues Navier and Poisson: but Mallet ignores the works of those Frenchmen and instead refers to experiments done in Germany by the Weber brothers298 , which were done in water and other fluid substances (e.g., mercury), and “it is not yet known”, says Mallet, “that precisely analogous motions take place amongst the particles of solids within their limits of elasticity, when transmitting a wave [...]; but it seems highly probable that there is a close analogy in the motions of the particles in both cases, so that the particles of a table land of solid rock, transmitting an elastic wave, or earthquake shock, from a distant primary impulse, do, in all probability, describe similar elliptic curves to those of a watery wave, and varying in the same way with respect to depth, though much smaller, because withheld within the elastic limits due to the particular rocky mass or masses, and transmitted also with inconceivably greater velocity. This, however, is certain, that the surface of the solid earth does undulate;
it has been repeatedly observed to do so. The transit of the wave along the surface, even, has been observed, as in the great earthquake of Jamaica, in 1692: hence, it is certain that any particle upon the surface of the earth, under such circumstances, must describe either a circle or an ellipse, most probably the latter, whether those within the mass of the undulating land do so or not.” So what Mallet is describing here is a seismic surface wave. Or, to be more precise, a Rayleigh wave, sort of like the same thing as a surface water wave, but in a solid. (And we’ll find out more about Rayleigh waves later; we’ll also learn that there’s another kind of surface waves, which involve a shearing motion and so have less in common with tsunami waves than Rayleigh waves do: and they are called Love waves.) In a sense, Mallet is ahead of his time, because he’s saying that surface waves exist in solids, but a rigorous demonstration of this will be given for the first time only about forty years later. On the other hand, Mallet totally ignores body waves, which involve no circular motion: for him, the “earth wave of shock” is just what today we’d call the surface wave. How fast does the “earth wave” propagate? we’ve seen a figure, already, 2000 m/s or so, which Michell and/or Mallet had estimated based on all the accounts of earthquakes that they could read. But how about a precise, direct measurement? “Hassenfratz and Gay Lussac observed, in the quarries under Paris, that sound travels with immense velocity in rock, and the same observation has been made in blasting the rocks in the Cornish mines299 ; but no measures of its velocity have been ascertained [neither by Hassenfratz and Gay Lussac, nor in Cornwall], and the only trustworthy measurements we possess, of the velocity of sound in any mineral solids, are those obtained by Biot, as to the time of wave transit through cast iron. 
He found, as has been previously stated, that sound is transmitted through cast iron at the rate of 11,090 ft [that is, about 3380 m] per second.” Mallet doesn’t mention this here, but Biot’s experiments were done on an elastic pipe (or “rod”, or “bar”: anyway, something long and thin), and Biot measured how fast the elastic wave travels from one end of the pipe to the other300 . Now, there’s a parameter called “modulus of elasticity”, or “Young’s301 modulus”, that tells us how much a certain material deforms when under pressure or tension in only one direction; let’s call it E (for “elasticity”) and then E = −δp/(δL/L), where δp is the additional pressure that you might impart, or the reduction in pressure, with respect to the initial condition, and L the initial dimension of the sample in the direction in which pressure is changed, and δL the corresponding change in size302 : so, to fix ideas, think of the sample as a bar of length L, with the pressure or traction applied at one end of the bar, pointing in the direction parallel to the bar; and so then δL/L is just the relative change in the length of the bar. It can be shown303 that E is related to the speed of elastic waves. For example, the speed at which elastic waves propagate along an elastic pipe is precisely \(v_E = \sqrt{E/\rho}\), with ρ denoting density, as usual, and the subscript E to remind us that this is true in this particular setup; but then if you attempt a similar measurement in a body with a different geometry, say a sphere, the formula might not work anymore (and in fact it won’t).
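Biot's figure can also be turned around: from v_E = √(E/ρ) one gets E = ρ v_E². The sketch below does just that for cast iron; the density value is an assumption of mine, picked only for illustration, not a number from the text.

```python
# Invert v_E = sqrt(E/rho) for E, using Biot's cast-iron velocity quoted in
# the text (about 3380 m/s) and an assumed cast-iron density of ~7200 kg/m^3.
rho_iron = 7200.0   # kg/m^3 (assumed, illustrative)
v_iron = 3380.0     # m/s, Biot's measured rod velocity

E_iron = rho_iron * v_iron**2   # Young's modulus, in Pa

# This lands near 8e10 Pa, i.e. on the order of 80 GPa -- a plausible
# Young's modulus for cast iron.
assert 7e10 < E_iron < 9e10
```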
So, v E is not the same thing as the speed of elastic waves in a body of any geometry, or in an unbounded continuum (and the earth is very large compared to the wavelength of seismic waves, so it would be OK to approximate it with an unbounded continuum), but Mallet doesn’t care, because in any case all the data he has are measures of E and v E in cast iron, and of E (not v E ) in some rocks. To make the best of what is available to him, he considers that, as we have seen, v E is proportional to the square root of E, and so the squared ratio of v E in a given rock to the v E of cast iron coincides with the ratio of their Es, or
\[ v_E(\text{rock}) \approx v_E(\text{cast iron}) \times \sqrt{E(\text{rock})/E(\text{cast iron})}, \tag{6.109} \]
where everything at the right-hand side is “known” through observation. (You might object that v E is also inversely proportional to the square root of ρ, and the densities of rock and cast iron are not the same, but “let it be understood,” says Mallet, “that no present importance attaches to any of these numbers, which are all but crude approximations, and to be viewed as mere illustrations of the application of my theory, rather than proofs of, or deductions from it.”) Mallet applied Eq. (6.109) to various rocks (limestone, sandstone, “clay slate”...), and found velocities between 3640 and 12,757 ft/s, which translates to between ∼ 1000 m/s and ∼ 4000 m/s. (These must all be surface samples so the effects of depth/temperature/pressure upon E are not accounted for, of course.) All his results are given in a table, in feet per second:

Limestone (soft Lias)            3640
Sandstone (millstone grit)       5248
Portland stone (oolite)          5723
Limestone (primary marble)       6696
Limestone (hard carboniferous)   7075
Clay slate (Leicestershire)     12757
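A short sketch of Mallet's table and of Eq. (6.109) in action: first the unit conversion from his feet per second to m/s, then the square-root scaling itself (the modulus ratio at the end is an invented number, purely to illustrate the scaling, not one of Mallet's actual inputs).

```python
import math

FT_TO_M = 0.3048

# Mallet's table of "earth wave" velocities, in ft/s (from the text).
mallet = {
    "Limestone (soft Lias)": 3640,
    "Sandstone (millstone grit)": 5248,
    "Portland stone (oolite)": 5723,
    "Limestone (primary marble)": 6696,
    "Limestone (hard carboniferous)": 7075,
    "Clay slate (Leicestershire)": 12757,
}

in_m_per_s = {rock: v * FT_TO_M for rock, v in mallet.items()}
# Slowest ~1100 m/s, fastest ~3900 m/s: the "~1000 to ~4000 m/s"
# range quoted in the text.
assert 1000 < in_m_per_s["Limestone (soft Lias)"] < 1200
assert 3800 < in_m_per_s["Clay slate (Leicestershire)"] < 4000

# Eq. (6.109): scale a reference velocity by the square root of the
# modulus ratio.
def v_rock(v_ref, E_rock, E_ref):
    return v_ref * math.sqrt(E_rock / E_ref)

# A rock with one quarter the modulus of the reference carries waves at
# half the reference speed (illustrative ratio, not Mallet's data).
assert abs(v_rock(3380.0, 1.0, 4.0) - 1690.0) < 1e-9
```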
Mallet understands, and explains in his paper, that you can also turn this problem around: so far we have been looking at rocks wondering how fast waves propagate through them; but what if you have a measurement of the time it takes the wave to travel from the epicenter of a quake to a seismic instrument? Then, if you divide distance by time you get speed, which you can compare to the data above to guess what kind of rocks might lie between source and instrument304 . And this could be very useful, because, for instance, “however well modern geologists have surveyed and mapped the formations constituting the land which we can see and handle, of the nature of the bottom of the great ocean we know nothing; no human eye ever has or ever can behold it; we cannot even reach its deep abysses with the sounding line; yet the ocean covers nearly three-fourths of our entire globe, and of this vast area the geology is an utter blank. If, however, we are enabled hereafter to determine accurately the time of earthquake shocks, in their passage from land to land, under the ocean bed, we shall be enabled almost with certainty to know the sort of rock formation through which they have passed, and hence to trace out at least approximate geological maps of the floor of the ocean. For, knowing the time of transit of the
wave, we can find the modulus of elasticity which corresponds to it, and finding this, discover the particular species of rock formation to which this specific elasticity belongs.” In other words, each material has its own seismic velocity (seismic waves travel through each given material with a speed that is more or less unique to that material), so if you know that speed, you can tell what material that is. Later we shall see this is useful for example to know what materials lay “inside” the earth, at some depth beneath our feet; here Mallet thinks “horizontally”, and is concerned mostly with ocean versus continent differences in the velocity of what will later be called seismic surface waves. There are a couple other ideas in Mallet’s paper that anticipate the current toolbox of seismologists. For example, there’s the concept of “earthquake cotidal lines”, i.e., “snapshots” of what today we would call the wavefront: Mallet understands that it makes sense to connect, on a map, all points that are hit by the earthquake at the same time. “The plan, or horizontal outline of each of these waves, will be more or less circular or elliptical at first, according as the origin or centre of disturbance is at a single point, or along a line of impulse more or less regular, and the crests of the earthquake waves of every order of which we are about to speak, or, as they may be called, the earthquake cotidal lines, will, in their progress of propagation through the earth and sea, alter their curvilinear forms, by changes in their respective velocities, becoming more and more distorted from the original form; but, in every case, these cotidal lines will form closed figures.” He also provides a sketch (Fig. 6.9). 
Then, Mallet knows the laws of optics (refraction, reflection, etc., which had been established experimentally since at least the time of Newton); and since light at the time of Mallet was thought to be an elastic wave, propagating through a substance called aether305 in the same way as sound propagates through air, he figures that those laws should apply also to seismic waves. It follows, says Mallet, that “when the earth wave passes abruptly from a formation of high elasticity to one of low elasticity, or vice versa, it will be partly reflected; a wave will be sent back again, producing a shock in the opposite direction; it will be partly refracted, that is to say, its course onwards will be changed [...]. This is exactly what has been observed to take place. Thus, Dolomieu informs us that, in Calabria, the shocks were felt most formidably, and did most mischief, at the line of junction of the deep diluvial plains with the slates and granite of the mountains, and were felt more in the former than in the hard granite of the latter.” This is all true; or in any case, I am not 100% sure of what exactly happened in Calabria, but it sounds like it could be a case of “constructive interference”, or “resonance”, as seismologists sometimes call it, which is what happens when a discontinuity reflects the seismic waves (just like in optics), and if you have a longish wave train that continues to come in while it’s already being reflected, then the reflected wave sums up with the incoming wave and the earth is shaken even more violently.
In the next chapter we will meet Augustin Fresnel (1788–1827), who studied interference in optics and made it the basis of a new theory of light; so in case what I’ve just said seems obscure, that will be the occasion to clarify it; we will also see in more detail how it is that all sorts of seismic waves are reflected by “discontinuities” and refracted whenever the velocity at which they propagate changes, because of the properties of the medium.
Fig. 6.9 Mallet’s cotidal lines, after his famous paper. “This represents”, says Mallet, “an imaginary portion of the earth’s surface, the seat of earthquake disturbance [...]. The uncoloured portions of the map represent the sea, the depth of soundings of which are given in two sections [shaded areas], intersecting at right angles, through the point which is the origin of the earthquake impulse.” Besides the soundings, different shades of gray mean different groups of “geological formations”, each with approximately uniform wave velocity. “The lines of junction of these groups are indicated by a strong, dotted line [...]. The dark, continuous, and nearly circular curves, represent the crest of the earth wave of shock, at successive small intervals of time, as it traverses each of the formations of the map, starting from the centre of impulse”. If you look carefully, you should see that the velocity of what Mallet calls the “earth wave of shock” is highest in the south-west, and lowest in the north-east
Another concept introduced to seismology by Mallet, and still very important for seismologists today, is that of the attenuation of seismic waves, which Mallet phrases in a somewhat convoluted way: “the earth wave or shock propagated in all directions at once, from the centre of impulse, if this be situated, with formations of low elasticity at one side of it, and those of high elasticity at the other, may reach very distant regions by transit through the latter, while it may be scarcely felt in closely adjacent ones situated upon the former.” To understand what this means you might want to go back to the concept of wave that I tried to give you earlier on, that is: a parcel of matter is set into motion by some external force (say, the rupture of a fault: a quake), and the neighbouring parcels of matter, through the intermolecular forces that keep matter together, tend to stop it from moving; through the principle of action and reaction, then, those neighbouring parcels also receive a force that accelerates them; and they move; and by the same mechanism they set into motion their own
neighbours, and so on. Now, at each step of this process some energy is dissipated, for example because no medium (no rock) is perfectly uniform and homogeneous and there will be friction between the chunks of different materials that form a medium; and/or there will be little pores or intrusions or whatever that partly reflect, like Mallet himself has explained, the wave, etc.: which are mechanisms that are not described by the simple concept of wave we’ve used so far306 . So what Mallet is saying is that there exist “formations of low elasticity” where energy is more rapidly dissipated as the wave travels on, so that even at fairly short distances from the epicenter the wave won’t be felt at all; and other formations, “of high elasticity”, where waves “may reach very distant regions”. The typical case history, taught in seismology courses, is that of North America, where we have big earthquakes along the West Coast that are not felt across the Rocky Mountains, versus smaller quakes in the Midwest and East that are felt very far away, e.g., the New Madrid quake that in 1811 is said to have woken up president Madison in Washington, D.C. And New Madrid is in Missouri, so we are talking a really large distance. (Just to be clear, Mallet does not relate this particular story in his paper; but the story illustrates very well what he’s talking about.) Before we leave Mallet, there’s another thing he does in his 1848 paper that I think is worth talking about: he refutes Michell’s ideas of what an earthquake actually is, and expounds his own. Re “Michell’s views, as to the agencies of vast regions of subterraneous vapour, pent up in a state of high elastic compression”, that would blow up so causing an earthquake, Mallet says, “we have not a particle of evidence that any large tract of the earth’s surface ever is afloat upon, much less buoyed up by, and elevated upon, vast masses of elastic vapour or gases. 
The only evidences we have of subterraneous vapour playing any part at all in the forces of elevation, are at the foci of volcanic action, by the projection of solid masses from craters, the occasional splitting and blowing out of the sides of these, the spouting of geysers, &c.; but these [...] are only minor phenomena in the great machinery of the elevation of the earth’s crust; and if any considerable tracts of surface [...] were afloat upon elastic vapour, rapid and perceptible falls of surface, ending with shocks, [...] would be felt at those moments of the eruption when [...] the caverns below were eased of their pent-up winds, and their roofs again dropped suddenly into contact with their molten floors. But not such facts have ever been observed; in all eruptions, however violent, the principal phenomena indicate the steady, upward pressure of liquid, but not aeriform, matter from below. [...] We may, therefore, affirm that there is no ground in observed facts, for supposing large tracts of the earth’s crust ever to float upon, or be elevated by, subterraneous seas of elastic gaseous matter.” Mallet also addresses the nature of the seismic wave postulated by Michell. Remember the carpet metaphor? The seismic wave as a mass of air that escapes from under a carpet if you shake it? The carpet being the crust, etc.? In Mallet’s words, Michell’s theory is that the thin crust undulates not by propagation of a wave through “the elastic plate itself, [...] but that the crust is forced to undulate by the passage below it of a wave of the fluid upon which it rests [...]. Now, we have shewn that the rate of transit of the great earth wave [...] is immense, that it, at least, equals the velocity of sound in the same solids. The question then is, can a wave propagated
under the conditions thus assigned by Michell, have any such velocity of transit?” The answer is no: even admitting that the “crust of our earth bears any analogy to the flexible carpet of his experiment,—admitting that the enormous shell of, at least, forty-five miles in average thickness, could have flexibility enough to follow the constraining motion of the wave of fluid upon which it rests,—then”, if Michell’s reasoning is right, the “earth wave” must travel at the velocity “of the fluid wave below [...]. [A]nd although it would be at present impossible, for want of data, to calculate the exact velocity with which this subterraneous lava wave could move, it may be certainly affirmed that its velocity would be immeasurably short of the observed or theoretic velocity of the great earth wave, or true shock, in earthquakes307 [...]. The rate of ascertained progress of the great Lisbon shock, the only one [...] that has been observed with any pretension to accuracy, is stated by Michell at twenty miles per minute [...]—a speed [...] at least twenty times as great as it is possible to admit the velocity of propagation of a similar wave in imperfectly fluid lava; and yet this velocity is probably underrated, or, if not, the elasticity of the earth’s crust must be greatly impaired by its increased temperature due to depth.” So we abandon Michell’s gaseous idea. What, then, is the force behind earthquakes? Mallet thinks that earthquakes are produced by the same forces that raise mountains. 
It was an “established fact”, at the time of Mallet, “that forces of some kind [Mallet understood that the nature of these forces was not well established], acting from below upwards, produce local elevations of portions of the earth’s solid crust, often attended with dislocation and fracture of the crust [...]; that these elevations take place with various degrees of rapidity, sometimes continuing to lift the land slowly for many years [...]; at other times producing an upheaval of several feet in a very short time, and that such elevations occur both on land and beneath the ocean.” Don’t forget Mallet is writing in 1848 (or earlier) and ideas on the origin of mountains and on the geological time scale are foggy308 , no consensus on either (“many years”... “very short time”...). “In such local elevations, then, I find the efficient cause of the earthquake shock, which I define to be a wave of elastic compression, produced either by the sudden flexure and constraint of the elastic materials forming a portion of the earth’s crust, or by the sudden relief of this constraint by withdrawal of the force, or by their giving way, and becoming fractured.” Mallet proposes that his readers think of the Earth’s crust “as a platform or beam, supported or held fast at the edges or ends, and loaded or pressed upwards, more or less uniformly, from beneath. If the platform or beam bend under the strain, all the particles below a certain neutral plane will be thrown into a state of compression, those above it into the opposite condition of extension [see Fig. 6.10]; and if this bent and constrained condition of the plate or crust be suddenly produced—if the pressure from below be suddenly brought upon it, a wave of elastic compression will, at the moment of flexure, be produced, and propagated at once in every direction [...]. This wave will be negative above the neutral plane, and positive below it. 
[What Mallet means by “negative” and “positive” is not quite clear, but today a seismologist would agree that the direction and sign of the initial displacement caused by a quake depends on where you are with respect to the fault. To make a simple example, if you are on the downgoing side of a vertical fault, you are going to move down first, and then rebound upwards, etc.]
Fig. 6.10 Mallet’s idea of what might start a quake. “If the platform or beam bend under the strain, all the particles below a certain neutral plane will be thrown into a state of compression, those above it into the opposite condition of extension; and if this bent and constrained condition of the plate or crust be suddenly produced—if the pressure from below be suddenly brought upon it, a wave of elastic compression will, at the moment of flexure, be produced, and propagated at once in every direction [...]. This wave will be negative above the neutral plane, and positive below it.”
“Again, after the plate or crust has been so elevated, whether quickly or slowly, if the constraining forces be suddenly relaxed, so that the plate, like a bent bow, is permitted to become straight again, that is to say, drops down from its state of flexure to its former level, or partially towards it, the resilience—the sudden return of the extended particles above the neutral plane, and of the compressed particles below it to their condition of repose—will produce and propagate in all directions a similar wave of elastic compression, which will be positive above the neutral plane, and negative below it. “Thus, sudden elevation, or sudden depression of a tract of country, must always be attended with the production and transit through the surrounding crust, whose level has not been disturbed, of an elastic wave, or true earthquake shock, even although not preceded in either case by dislocation or fracture of the rocky crust; the amount of extension and compression of the particles above and below the neutral plane having been, in these cases, within the elastic limits of the particular rocks constituting the elevated crust.” i.e., in contrast with today’s beliefs, Mallet does not think that the crust needs to break to generate elastic waves. A sudden flexure, even
without rupture, is enough. This is not quite right by today’s terms, because today we know that earthquakes are always accompanied by rupturing of faults; but it doesn’t matter. Because the big intuition of Mallet, that quakes are caused by tectonic forces, is still essentially right even for the seismology of today. Speaking of which (the seismology “of today”), now I have to tell you more about the theory. Because the theory in Mallet is quite simple, he really only seems to have Euler’s equation, which only works in fluids with little or no viscosity; and the experiments by the Weber brothers, which show that there exists such a thing as surface waves, but those only done with fluids, too; and Mallet assumes that the concept of surface waves can be extended to solid materials, like rock. Which he’s right, because clearly Mallet is someone with above-average scientific intuition. But then again, in Mallet’s writings I don’t seem to find any knowledge of shear waves, and body waves in general are neglected when interpreting observations, the main shock being identified with what today we’d call a Rayleigh wave. Post-Mallet, in the course of the second half of the nineteenth century more precise instruments to measure earthquakes will be built, and installed all over the globe, and the new data will contain more info than the older ones, and their interpretation will require more advanced theory309 . The theory of seismic waves that is currently taught in seismology courses, and written in textbooks, starts out with an equation that is most often called Navier’s, or Navier-Stokes’, and the easy way to derive it, that you find in twentieth-century textbooks, is as follows.
6.8 The Stress Tensor

First, we have to introduce the so-called “stress tensor”, which is a convenient way of keeping track of what the surface forces are for any given surface within the medium. What I mean by that is that, in principle, the surface force changes depending on how the surface that it acts upon is oriented (that’s not the case in water, and in fluids with no viscosity, as we’ve already seen: but it is the case in solids). This could make things quite complicated, except that it turns out that there exists a 3 × 3 matrix, or tensor, that we shall call τ, or stress tensor, such that the surface force per unit area

T = n̂ · τ,     (6.110)

where n̂ is a unit vector perpendicular to the surface on which T acts. In giving this definition, one has to be careful with how the surface is “oriented”; let’s call + and − its two sides: the convention is that if the T in Eq. (6.110) is supposed to be the force applied by the material on the + side to that on the − side, then n̂ has got to point towards the + side, and vice-versa: and that’s an important part of the definition of τ (it controls the sign of its coefficients). So, what Eq. (6.110) does is, if I have the nine components of τ at a given point, it tells me what T is, at that same point, on any surface going through that point. τ itself
Fig. 6.11 The infinitesimal tetrahedron shown in this diagram is bounded by the infinitesimal surfaces dS1, dS2, dS3, which are perpendicular to the Cartesian axes, and share the vertex O. Call (τ11, τ12, τ13) the traction acting on dS1; (τ21, τ22, τ23) the traction on dS2, etc. Taken together, those three vectors form what we call the stress tensor τ. Next, consider the infinitesimal planar surface dS—the fourth face of the tetrahedron. The unit vector n̂ is perpendicular to dS; it might form whatever angles with the Cartesian axes. It turns out, and this is a very important result, that the traction T on dS is related to those on dS1, etc., through Eq. (6.110). I call N, by the way, the intersection of dS and a straight line perpendicular to dS and passing through O. I call h the length of the segment ON
doesn’t depend on how the surface is oriented: all that info is carried by n̂ alone. That such a tensor τ actually exists follows from some, uhm, geometrical considerations that we are going to look at together, right now: start by considering a tetrahedron whose three edges are parallel to the Cartesian axes x1, x2, x3, just like in the diagram (Fig. 6.11). I anticipate that we are going to make the tetrahedron become infinitely small, so let’s call dS1 (because remember that the letter d in front of some symbol, like S1 here, usually means that symbol is referred to something infinitely small) the face of the tetrahedron that’s perpendicular to axis x1, or rather I should say, perpendicular to the unit vector (1, 0, 0), dS2 the one that’s perpendicular to x2 or equivalently to (0, 1, 0), etc., just keep looking at the diagram. Let’s call τ11, τ12, τ13 the Cartesian components of the surface-force-per-unit-area acting on dS1, and likewise τ21, τ22, τ23 those that act on dS2, etc. Equation (6.110) then automatically holds, by the definition of τ, so long as n̂ points in the direction of one of the Cartesian axes. And but what we are going to do next is, we are going to see that (6.110) also holds for any other orientation of n̂. Start with the formula for the volume of a tetrahedron, which is310 one third of its base times its height; if I call h the height of the tetrahedron from its base dS, then its volume

dV = (1/3) h dS.     (6.111)
But the formula works independently of which base you pick, so at the same time

dV = (1/3) OA dS1,     (6.112)

dV = (1/3) OB dS2,     (6.113)

dV = (1/3) OC dS3.     (6.114)
Now look again at Eq. (6.111) and at the diagram: by trigonometry,

h = OA cos(AÔN),     (6.115)
where AÔN is a way of referring to the angle made at O by the segments OA and ON. If that doesn’t immediately make sense to you, consider that the triangle formed by the points A, N and O has a right angle at N (if it didn’t, h wouldn’t be the height of the tetrahedron: read the caption). Likewise,
h = OB cos(BÔN),     (6.116)

and

h = OC cos(CÔN).     (6.117)
So now if, for example, I plug this into Eq. (6.111), it follows that

dV = (1/3) OC cos(CÔN) dS.     (6.118)
But I’ve also found that dV = (1/3) OC dS3, see above, and so I can equate to one another the right-hand sides of these two equations, and if I cancel out all that can be cancelled out, I am left with

dS3 = dS cos(CÔN).     (6.119)
Next, I’ll play the same game with the equations that contain OA, and OB, to find that

dS1 = dS cos(AÔN),     (6.120)

dS2 = dS cos(BÔN).     (6.121)
Now remember, we’d decided to call n̂ = (n1, n2, n3) the unit vector perpendicular to dS. If you dot-multiply n̂ with (1, 0, 0) you get n1, of course. But by definition of dot product, and because the norms of both n̂ and (1, 0, 0) are 1, the dot product
of n̂ times (1, 0, 0) coincides with the cosine of the angle they form, which happens to be AÔN, i.e.

n1 = n̂ · (1, 0, 0) = cos(AÔN),     (6.122)

and likewise

n2 = n̂ · (0, 1, 0) = cos(BÔN),     (6.123)

n3 = n̂ · (0, 0, 1) = cos(CÔN).     (6.124)
This means that the expressions we’ve found for the three small surfaces dSi, with the index i taking the values 1, 2, 3, can be rewritten

dSi = dS ni.     (6.125)
Now let our tetrahedron represent a chunk of material under the action of body and surface forces; besides the surface forces that act on dS1, dS2 and dS3, that we already know how to write in terms of various τij’s, there’s also the surface-force-per-unit-area T that acts on the surface dS, and a body force per unit volume that I am going to call f. So the force balance for the first Cartesian component reads

ρ dV d²u1/dt² = f1 dV + T1(dS1) dS1 + T1(dS2) dS2 + T1(dS3) dS3 + T1 dS,     (6.126)

where u stands for displacement (so d²u/dt² is acceleration, and d²u1/dt² its component along the x1 axis, etc.), and Ti(dSj) means the i-th Cartesian component of the surface-force-per-unit-area acting on the surface dSj. Now, what we are interested in are the forces that the material within the tetrahedron is subjected to; which means that when you use Eq. (6.110) to replace T1(dS1) with expressions that involve τ, the unit vector n̂ must point from within the tetrahedron towards the outside. So, for example, on the surface dS1 we have n̂ = (−1, 0, 0), and as a result T1(dS1) = −τ11, etc. Do the same for T1(dS2) = −τ21, etc., and (6.126) becomes

ρ dV d²u1/dt² = f1 dV − τ11 dS1 − τ21 dS2 − τ31 dS3 + T1 dS.     (6.127)
You get similar equations for the other two Cartesian components (check for yourself). Replace dSi with the expression we had just found, dS ni (this now makes things quite simple, so that’s why we’ve gone through all the trouble),

ρ dV d²u1/dt² = f1 dV − τ11 n1 dS − τ21 n2 dS − τ31 n3 dS + T1 dS,     (6.128)
etc., or, if you allow me to switch to vector notation,

ρ dV d²u/dt² = f dV − n̂ · τ dS + T dS.     (6.129)
Now replace dV with (1/3) h dS, simplify what can be simplified, and

ρ (h/3) d²u/dt² = (h/3) f − n̂ · τ + T.     (6.130)
In deriving all this, I’ve been implicitly assuming that f and d²u/dt² be constant within the tetrahedron, and but that will be the case if we take the tetrahedron to be sufficiently small—and the tetrahedron can be as small as we want it to be. So then let’s make it infinitely small: h goes to zero and the left-hand side and first term at the right-hand side of (6.130) become zero. We are left with 0 = −n̂ · τ + T, or T = n̂ · τ, which is precisely what we wanted to prove. You should probably realize that this equation, Eq. (6.110), is not trivial at all: I started out by just calling τ the tensor containing the components of surface forces on three arbitrarily chosen, perpendicular surfaces at a given point... and I discovered that from this τ thing I can calculate the surface force acting on any surface passing through that same point. Now that is sorted out, we can proceed to derive a “momentum equation” for a solid pretty much in the same way as we did for Euler’s equation in a fluid, i.e., via a force balance on a small rectangular prism. The total surface force acting, for example, in the x1 direction includes the contribution of the τ11 terms that push or pull on the two sides that are perpendicular to the x1 axis311, which is the same as what we had called p earlier on; and that reads τ11(x1 + dx1) dx2 dx3 − τ11(x1) dx2 dx3... and but it also includes the τ21 terms on the sides that are perpendicular to the axis x2, and τ31 on the faces perpendicular to x3. It follows that the x1-component of the sum of all surface forces acting on the prism reads

[τ11(x1 + dx1, x2, x3) − τ11(x1, x2, x3)] dx2 dx3
+ [τ21(x1, x2 + dx2, x3) − τ21(x1, x2, x3)] dx1 dx3
+ [τ31(x1, x2, x3 + dx3) − τ31(x1, x2, x3)] dx1 dx2.     (6.131)

If you divide (6.131) by the volume dx1 dx2 dx3 you get the total force per unit volume; which, if you also let the prism become infinitely small, i.e., dx1, dx2 and dx3 all tend to zero, becomes

∂τ11/∂x1 (x1, x2, x3) + ∂τ21/∂x2 (x1, x2, x3) + ∂τ31/∂x3 (x1, x2, x3).     (6.132)

Likewise, the x2-component reads

∂τ12/∂x1 (x1, x2, x3) + ∂τ22/∂x2 (x1, x2, x3) + ∂τ32/∂x3 (x1, x2, x3),     (6.133)
etc.: all three components, taken together, can be written

(∂/∂x1, ∂/∂x2, ∂/∂x3) · ⎛τ11 τ12 τ13⎞
                        ⎜τ21 τ22 τ23⎟ ,     (6.134)
                        ⎝τ31 τ32 τ33⎠

i.e., ∇ · τ, which you might call the divergence of (the matrix) τ. Now, just like we did with the fluid when we derived Euler’s equation, plug this into the force balance, which also involves mass (per unit volume) times acceleration, and body force (per unit volume), and

ρ d²u/dt² = ∇ · τ + f,     (6.135)

i.e., yes, Navier-Stokes’ equation.
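A minimal numerical sketch of Eq. (6.110) may help fix ideas (the values in τ are made up, chosen only to show that one and the same state of stress gives different tractions on differently oriented surfaces):

```python
# Traction from the stress tensor, Eq. (6.110): T_i = n_j * tau_ji
# (sum over j). The numbers in tau are arbitrary illustrative values;
# note the matrix is symmetric, as a stress tensor must be.
tau = [[1.0, 0.5, 0.0],
       [0.5, 2.0, 0.3],
       [0.0, 0.3, 3.0]]

def traction(n, tau):
    """Surface force per unit area on a surface with unit normal n."""
    return [sum(n[j] * tau[j][i] for j in range(3)) for i in range(3)]

# The same state of stress, two differently oriented surfaces:
print(traction((1, 0, 0), tau))  # perpendicular to x1 -> [1.0, 0.5, 0.0]
print(traction((0, 0, 1), tau))  # perpendicular to x3 -> [0.0, 0.3, 3.0]
```

When n̂ points along a Cartesian axis, the traction is just the corresponding row of τ, exactly as in the definition via the faces dS1, dS2, dS3.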
6.9 Hooke’s Law

We’ve seen that Euler’s equation (6.8) becomes useful, for example, if we find a mathematical relation between displacement and pressure, which we did, and plug that into Euler’s equation, which becomes a differential equation with only one unknown function, i.e., pressure δp = δp(x1, x2, x3, t). And this is something that we can solve, and in fact I’ve already shown you two cases (unbounded infinite space; half space) where we did, and it wasn’t even too hard. We can play the same game with Navier-Stokes, because τ and displacement are related, just like displacement and pressure are. What this relationship exactly is depends on the nature of the medium we’re dealing with312. For example, we call “elastic” a medium that responds instantly to stress: the delay between the moment a force or traction is applied, and the moment the medium has deformed to respond to that, is zero. That is obviously an idealization, but one that works very well in many practical applications, incl. seismology: because, indeed, the response of a rock to a sudden change in stress is so fast compared to, say, the speed of a seismic wave, that we might very well regard it as zero, and be happy with it313. Robert Hooke first gave a mathematical description of elasticity: he formulated it as ut tensio sic vis, meaning that the deformation of a spring (tensio) is proportional to how forcefully (vis) you stretch or compress it. You might have learned before that if F is the force you apply on the spring—parallel to the spring—and x the associated change in the spring’s length, then

F = −kx,     (6.136)

which is essentially Hooke’s empirical result. The spring’s constant, k, depends on the material the spring is made of, etc. This is a purely one-dimensional thing, though, and in seismology (and engineering, etc.) we have to deal with all three dimensions.
The modern way to derive Hooke’s law in 3-D is via the principle of energy conservation. Let me first write down the work W (remember Chap. 4) done by all (body and surface) forces acting on a mass that, before being deformed/displaced, occupies a volume V,

W = ∫_V dV f · u + ∫_∂V dS T · u
  = ∫_V dV f · u + ∫_∂V dS (n̂ · τ) · u.     (6.137)

The rate of work (the derivative of work with respect to time) then is314

dW/dt = ∫_V dV f · ∂u/∂t + ∫_∂V dS (n̂ · τ) · ∂u/∂t.     (6.138)

But, it can be shown315 that τ is always symmetric, and so

dW/dt = ∫_V dV f · ∂u/∂t + ∫_∂V dS (∂u/∂t) · τ · n̂
      = ∫_V dV f · ∂u/∂t + ∫_V dV ∇ · [(∂u/∂t) · τ]
      = ∫_V dV { f · ∂u/∂t + ∇ · [(∂u/∂t) · τ] }
      = ∫_V dV [ fk ∂uk/∂t + ∂/∂xi (τji ∂uj/∂t) ],     (6.139)

where in the second step I’ve used the divergence theorem316, which says that, for any vector field v, ∫_∂V dS v · n̂ = ∫_V dV ∇ · v. And then I’ve switched from tensor to “index” notation because that’s going to make the next couple of steps, I think, somewhat simpler. And, Nota Bene, I am using Einstein’s convention re repeated indexes317: from now on I am going to use it by default. Now remember Navier-Stokes318: fk = ρ ∂²uk/∂t² − ∂τjk/∂xj. Substitute that into the first term inside the integral in (6.139), and

fk ∂uk/∂t = ρ (∂²uk/∂t²)(∂uk/∂t) − (∂τjk/∂xj)(∂uk/∂t).     (6.140)

Plug that into (6.139), and
dW/dt = ∫_V dV [ρ (∂²uk/∂t²)(∂uk/∂t) − (∂τjk/∂xj)(∂uk/∂t) + ∂/∂xi (τji ∂uj/∂t)]
      = ∫_V dV [ρ (∂²uk/∂t²)(∂uk/∂t) − (∂τij/∂xi)(∂uj/∂t) + ∂/∂xi (τji ∂uj/∂t)]
      = ∫_V dV [ρ (∂²uk/∂t²)(∂uk/∂t) − (∂τji/∂xi)(∂uj/∂t) + ∂/∂xi (τji ∂uj/∂t)]
      = ∫_V dV [∂/∂t (½ ρ (∂uk/∂t)(∂uk/∂t)) + τji ∂/∂xi (∂uj/∂t)],     (6.141)
where in the first step I’ve played with the names of “dummy” indexes (those that get summed over), first replacing j with i and then k with j in the second term of the integrand—because that helps us see what the various terms of the integrand have in common; in the second step I’ve used the symmetry of τ to swap i with j, again in the second term; at the third step, I’ve remembered what happens to the derivative of the product of two functions, to combine the last two terms and to rewrite the first in a slightly different way. Now, and this is the trickiest step of the whole procedure, if I introduce a tensor

ε = ½ [∇u + (∇u)ᵀ],     (6.142)

i.e.,

εij = ½ (∂ui/∂xj + ∂uj/∂xi),     (6.143)
then (6.141) becomes319

dW/dt = ∫_V dV [∂/∂t (½ ρ (∂uk/∂t)(∂uk/∂t)) + τji ∂εji/∂t],     (6.144)
or, if you are patient enough to tolerate my switching back and forth between index and tensor notation,

dW/dt = d/dt ∫_V dV ½ ρ |∂u/∂t|² + ∫_V dV τ : ∂ε/∂t,     (6.145)

where the colon mark between τ and the derivative of ε means we are summing over both indices of both tensors that are involved, which is precisely what (6.141) and (6.144) do to the last term at their right-hand side. That’s called a double-dot product, I think. Also, I’ve swapped a volume-integral with a time-derivative, and made the simplifying assumption that ρ is constant (which is OK, because we’re going to be interested in very small volumes V, as you’re about to see).
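In case the colon notation is new to you, here is a two-line sketch of the double-dot product, with made-up values standing in for τ and ∂ε/∂t:

```python
# Double-dot product appearing in Eq. (6.145): tau : A = sum_ij tau_ij * A_ij.
# Both tensors below hold arbitrary illustrative values (symmetric, as
# tau and the strain rate are).
tau = [[1.0, 2.0, 0.0],
       [2.0, 3.0, 1.0],
       [0.0, 1.0, 4.0]]
A = [[0.25, 0.0, 0.2],
     [0.0, 0.5, 0.0],
     [0.2, 0.0, 0.125]]
double_dot = sum(tau[i][j] * A[i][j] for i in range(3) for j in range(3))
print(double_dot)  # 1*0.25 + 3*0.5 + 4*0.125 = 2.25 (the rest vanishes here)
```

Nothing more than a sum over all nine index pairs, i.e., the trace of the matrix product τ Aᵀ.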
Now, the “really cool” thing, so to say, of (6.145), is that the first term at its right-hand side is just the time-derivative of the kinetic energy of the mass that’s in V. This can be interesting if we write out the law of energy conservation as

dW/dt = d/dt (U + K),     (6.146)

i.e., the rate of mechanical work equals the rate in the change of energy in the system, which in turn is the sum of the system’s internal energy320 U, and its kinetic energy K. This neglects the rate of heat transfer, which should be summed to dW/dt at the left-hand side: but the elastic deformations of rocks are pretty fast (seconds) compared to the time needed for heat to propagate through rocks (millions of years), so they can be treated as if they were adiabatic, in the sense that there’s not enough time for any relevant heat exchange to take place while these deformations occur, and that’s a totally safe approximation. Now replace dW/dt in (6.146) with its expression (6.145), and
dV V
d 1 ∂u 2 ∂εε ρ + = (U + K ), dV τ : 2 ∂t ∂t dt V
and but the first term at the left-hand side, like I said, is just with ddtK at the right-hand side, and we’re left with dV τ : V
∂εε dU = . ∂t dt
dK dt
(6.147)
, so that cancels out
(6.148)
You can now take V to be arbitrarily small (call it δV, to remind ourselves it’s small, and call δU its internal energy), which amounts to dropping the integral at the left-hand side, i.e.,

τ : ∂ε/∂t δV = d(δU)/dt,     (6.149)

or

d(δU/δV) = τ : (∂ε/∂t) dt
         = τij (∂εij/∂t) dt
         = τij dεij,     (6.150)

where everything gets summed over both i and j. It follows from (6.150) that

∂/∂εij (δU/δV) = τij.     (6.151)
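Since ε, defined back in (6.143), carries the whole right-hand side of (6.150), here is a small numerical sketch of it: the displacement field u below is made up purely for illustration, and its symmetric gradient is taken by central finite differences.

```python
# Small-deformation tensor, Eq. (6.143): eps_ij = (1/2)(du_i/dx_j + du_j/dx_i),
# evaluated by central finite differences. The displacement field u is a
# made-up linear field, for illustration only.
def u(x1, x2, x3):
    return (1e-6 * x2, 2e-6 * x1 + 3e-6 * x3, -1e-6 * x3)

def eps(x, h=1e-4):
    def du(i, j):  # du_i/dx_j at x, by central difference
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        return (u(*xp)[i] - u(*xm)[i]) / (2 * h)
    return [[0.5 * (du(i, j) + du(j, i)) for j in range(3)] for i in range(3)]

E = eps((0.3, 0.7, 0.1))
# For this field du1/dx2 = 1e-6 and du2/dx1 = 2e-6, so eps_12 = 1.5e-6;
# and eps is symmetric by construction.
assert all(abs(E[i][j] - E[j][i]) < 1e-18 for i in range(3) for j in range(3))
print(E[0][1])
```

Note that for a linear field like this one the finite differences are essentially exact; the point is only to see the symmetrization at work.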
Now, let’s say that, in general, the relative deformation, δui/δxj, is very small, for all combinations of i and j; that turns out to be the only situation when the tensor ε is
useful, which is why people sometimes call it the small deformation tensor. So, if all the δui/δxj, i.e., if all components of ε are small, then it’s OK to replace the function δU/δV(ε) with its (multi-dimensional) Taylor expansion to second order about ε = 0,

δU/δV(ε) ≈ δU/δV(0) + [∂/∂εij (δU/δV)]_{ε=0} εij + [∂/∂εij ∂/∂εkl (δU/δV)]_{ε=0} εij εkl.     (6.152)

If there’s no deformation, ε = 0, there’s also no change in U, so δU = 0 and δU/δV(0) = 0—and the first term at the right-hand side of (6.152) is gone. Also, in the undeformed state there can’t be any stress—because stress would necessarily bring deformation. So, ε = 0 implies τ = 0. But then, because of Eq. (6.151), we also have that [∂/∂εij (δU/δV)]_{ε=0} = 0 for all i, j. Based on all this, (6.152) becomes simply

δU/δV(ε) ≈ [∂/∂εij ∂/∂εkl (δU/δV)]_{ε=0} εij εkl.     (6.153)

The term that multiplies εij εkl at the right-hand side is fully identified by the four indexes i, j, k, l: it is a fourth-order tensor, and we can write it more compactly as cijkl, so that

δU/δV(ε) ≈ cijkl εij εkl.     (6.154)

And but finally, if you differentiate both sides of (6.154) with respect to εij, and use (6.151), you have

τij = cijkl εkl,     (6.155)

where the equality holds as long as relative deformations are small (i.e., all components of ε are small). And this is what people call Hooke’s law in three dimensions. The way we’ve derived (6.155), it should be clear, although I left it implicit so far, so let me point it out now, that, just like u is a displacement, a change in shape with respect to some reference, equilibrium situation, so τ is a change in stress, with respect to that same reference. For instance, the interior of the earth is under a lot of (“hydrostatic”) pressure from the weight of the earth itself; that’s balanced by the compression of the rocks inside the earth, so that nothing needs to move—equilibrium—and yet there’s a lot of pressure across any surface: pressure from within a rock surface balancing the pressure that the rock surface receives from the rest of the world. Hooke’s law, then, might relate, e.g., the displacement associated with a seismic wave with the stress that causes it: hydrostatic pressure and the subsequent compression of mantle rocks at equilibrium don’t need to enter the equation321. Equation (6.155) concerns directly only surfaces that are perpendicular or parallel to the Cartesian axes (remember τ11, τ12, τ13 are the components of the surface force acting on a surface that’s perpendicular to axis x1, etc.); but that’s okay, because for
one there is no constraint, in principle, on how you orient your Cartesian axes, and plus we’ve seen that in any case the traction across any surface can be reconstructed from τ via Eq. (6.110). All indexes in (6.155) are associated with the Cartesian axes and so their values can only be 1, 2, 3: c, then, is 3 × 3 × 3 × 3: that’s how many coefficients you need, in principle, if you are to describe the linear relations that exist between each of the 3 × 3 coefficients of τ and each of the 3 × 3 coefficients of ε. But then we can tell right away that the number of coefficients in c is much smaller than 81, just by virtue of the fact that both ε and τ are symmetric, from which it follows322 that cijkl = cjikl, and cijkl = cijlk, for all values of i, j, k, l. And then it’s reduced even further as a consequence, again, of energy conservation. Let me show how that works. If you go back to Eq. (6.151) (which, like you just saw, follows from energy conservation) and replace τij at its right-hand side with its expression (6.155), you get

∂/∂εij (δU/δV) = cijkl εkl.     (6.156)

If you differentiate both sides of what you just got, with respect to εpq, for whatever pair of indexes p, q,
δU δV
=
∂ (ci jkl εkl ), ∂ε pq
(6.157)
and but now, instead of trying to compute the derivative at the right-hand side, stop, and swap the order of differentiation at the left-hand side, ∂ ∂ ∂εi j ∂ε pq
δU δV
=
∂ (ci jkl εkl ). ∂ε pq
Equation (6.156), if you think about it, provides you with an expression for that you can substitute into (6.158), which gives ∂ ∂ (c pqkl εkl ) = (ci jkl εkl ). ∂εi j ∂ε pq
(6.158) ∂ ∂ε pq
δU δV
(6.159)
And now you can do the derivatives on both sides: at the left-hand side all terms in the sum are independent of εi j except for the one that contains εi j itself, and so you are left with c pqi j ; and likewise at the right-hand side you are left with ci j pq , and the bottom line is that it follows from the principle of energy conservation that c pqi j = ci j pq ,
(6.160)
for all combinations of i, j, p, q. If we put together all the symmetries we’ve found, i.e., ci jkl = c jikl , ci jkl = ci jlk and ci jkl = ckli j , it turns out that instead of 34 = 81, only 21 coefficients are enough to completely define c. To convince yourself of that, think of the number of coefficients
of the $3\times3\times3\times3$ tensor $c_{ijkl}$ as the number of possible permutations (which is almost like saying combinations, except that the order matters, and $ij$ is not the same as $ji$) of $i$ and $j$, that is 9, times the number of possible permutations of $k$ and $l$: also 9. Which would give 81, which is also $3\times3\times3\times3$: OK. But then, in our case, the order of $i$ and $j$ doesn't matter, because we've seen that $c$ enjoys that particular symmetry: and so then there are only 6 different combinations of $i$ and $j$. And the same for $k$ and $l$. And so from 81, we've already reduced the number of independent coefficients of $c$ to $6\times6=36$. Now, there might be more compact or elegant ways of doing this, I don't know, but draw a two-dimensional tensor, a matrix, with all the independent coefficients of $c$ that are left. To decide where to place each coefficient, just pick an order for the 6 combinations of $ij$ and of $kl$ that are left, e.g. 11, 22, 33, 12, 13, 23; and then the entry on the first row, first column of our new matrix will be $c_{1111}$; on the first row, second column we shall have $c_{1122}$; on the, I don't know, fourth row and second column, $c_{1222}$, etc.: you get the idea,

$$\begin{pmatrix}
c_{1111} & c_{1122} & c_{1133} & c_{1112} & c_{1113} & c_{1123}\\
c_{2211} & c_{2222} & c_{2233} & c_{2212} & c_{2213} & c_{2223}\\
c_{3311} & c_{3322} & c_{3333} & c_{3312} & c_{3313} & c_{3323}\\
c_{1211} & c_{1222} & c_{1233} & c_{1212} & c_{1213} & c_{1223}\\
c_{1311} & c_{1322} & c_{1333} & c_{1312} & c_{1313} & c_{1323}\\
c_{2311} & c_{2322} & c_{2333} & c_{2312} & c_{2313} & c_{2323}
\end{pmatrix},\tag{6.161}$$
which you see has precisely 6 × 6 = 36 coefficients. Now look at the matrix. Take any element in its "upper triangular part"—i.e., everything that's above the diagonal—say, on the row number m and column n, and then look in the lower triangular part at the element on row number n and column m: you'll find that those two elements coincide, because the latter is obtained from the former by swapping the first two indexes in $c_{ijkl}$ with the second two: and but now remember the third symmetry of c that we'd discovered, $c_{ijkl}=c_{klij}$. We can count, then, how many independent entries there are in c: in the upper triangular part there's (36 − 6)/2 = 15 entries (subtract the number of diagonal entries from the total before dividing by 2); if you sum to that the 6 independent entries that we've got on the diagonal you get 21; and that's it, because the lower-triangular-part coefficients are not independent of the upper-triangular-part ones. And so from 81 to 21, and the material we are looking at is still totally general—other than being elastic. What happens next in most seismology or geodynamics or continuum mechanics courses is that an important assumption is made about the physics of the problem—which is usually OK in the first approximation, and simplifies the math quite a bit: it is assumed that the material is isotropic. If you look it up in a dictionary, "isotropic" is explained as: having the same value when measured in different directions; identical in all directions; invariant with respect to direction, etc. An isotropic material is a material that opposes the same resistance to deformation, regardless of the direction of deformation. So if there's a compression in the x1 direction, the response to that shall be pressure in the x1 direction; and if in another experiment you apply to the same material the same amount of compression, but in the direction x2, then you'll
get in response the same pressure—but in the direction x2. And that for all possible directions, of course—not just along the Cartesian axes. Now, remember that the coefficient $c_{ijkl}$, multiplied with the component $\varepsilon_{kl}=\frac{1}{2}\left(\frac{\partial u_k}{\partial x_l}+\frac{\partial u_l}{\partial x_k}\right)$ of deformation, gives the contribution of the latter to the component $\tau_{ij}$ of stress. If the material is isotropic, that contribution stays exactly the same even if you rotate the medium around. Which is just another way of saying that all coefficients $c_{ijkl}$ stay the same when you rotate the reference frame: i.e., c is what we call an isotropic tensor. Now, obviously, an isotropic tensor is a very special tensor, meaning its various coefficients have to be in some very specific relationships to one another for the condition of isotropy to be satisfied. These relationships were found by mathematicians323; and if you put them all together, and plus don't forget the other symmetries of c that we've discovered above, it all boils down to

$$c_{ijkl}=\lambda\delta_{ij}\delta_{kl}+\mu(\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk}),\tag{6.162}$$

where $\delta_{ij}$ (AKA, Kronecker's delta) is a symbol that stands either for 1, if i and j have the same value, or for 0 if they don't (and so you can also say that $\delta_{ij}$ is the i, j coefficient of the "identity matrix"). So, then, in (6.162) only two independent coefficients are left324, which are usually called λ and μ. If you double-dot-multiply expression (6.162) for $c_{ijkl}$ with $\varepsilon_{kl}=\frac{1}{2}\left(\frac{\partial u_k}{\partial x_l}+\frac{\partial u_l}{\partial x_k}\right)$, which amounts to implementing the right-hand side of (6.155),

$$\begin{aligned}
c_{ijkl}\varepsilon_{kl}&=\lambda\delta_{ij}\delta_{kl}\,\frac{1}{2}\left(\frac{\partial u_k}{\partial x_l}+\frac{\partial u_l}{\partial x_k}\right)+\mu(\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk})\,\frac{1}{2}\left(\frac{\partial u_k}{\partial x_l}+\frac{\partial u_l}{\partial x_k}\right)\\
&=\lambda\delta_{ij}\frac{\partial u_k}{\partial x_k}+\frac{\mu}{2}\left(\frac{\partial u_i}{\partial x_j}+\frac{\partial u_j}{\partial x_i}+\frac{\partial u_i}{\partial x_j}+\frac{\partial u_j}{\partial x_i}\right)\\
&=\lambda\delta_{ij}\frac{\partial u_k}{\partial x_k}+\mu\left(\frac{\partial u_i}{\partial x_j}+\frac{\partial u_j}{\partial x_i}\right),
\end{aligned}\tag{6.163}$$

and (6.155) boils down to

$$\begin{aligned}
\tau_{ij}&=\lambda\delta_{ij}\frac{\partial u_k}{\partial x_k}+\mu\left(\frac{\partial u_i}{\partial x_j}+\frac{\partial u_j}{\partial x_i}\right)\\
&=\lambda\delta_{ij}\varepsilon_{kk}+2\mu\varepsilon_{ij},
\end{aligned}\tag{6.164}$$

which is Hooke's law for an isotropic medium. The two parameters λ and μ are all that's left of the initial 81 coefficients of c: if the material is isotropic, all the info you need, to calculate how it responds to any deformation, is stored in those two numbers alone325.
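If you'd rather let a computer do the bookkeeping, here is a quick Python sketch (entirely my own illustration; the numerical values of λ and μ are arbitrary): it counts the equivalence classes of index quadruples under the three symmetries, finding 21, and then checks that the isotropic tensor of (6.162), double-dot-multiplied with a strain, reproduces Hooke's law (6.164).

```python
import itertools
import numpy as np

# Count independent components of an elastic tensor c_ijkl with the
# symmetries c_ijkl = c_jikl = c_ijlk = c_klij.
def canonical(idx):
    i, j, k, l = idx
    a, b = sorted((i, j)), sorted((k, l))
    return min(tuple(a + b), tuple(b + a))

classes = {canonical(t) for t in itertools.product(range(3), repeat=4)}
print(len(classes))  # -> 21

# Isotropic tensor (6.162): c_ijkl = lam*d_ij*d_kl + mu*(d_ik*d_jl + d_il*d_jk)
lam, mu = 30.0, 20.0  # illustrative Lame parameters, arbitrary units
d = np.eye(3)
c = (lam * np.einsum('ij,kl->ijkl', d, d)
     + mu * (np.einsum('ik,jl->ijkl', d, d) + np.einsum('il,jk->ijkl', d, d)))

# Double-dot with a random symmetric strain and compare with (6.164):
# tau_ij = lam * eps_kk * d_ij + 2 * mu * eps_ij
rng = np.random.default_rng(0)
g = rng.standard_normal((3, 3))
eps = 0.5 * (g + g.T)
tau = np.einsum('ijkl,kl->ij', c, eps)
assert np.allclose(tau, lam * np.trace(eps) * d + 2 * mu * eps)
print("Hooke's law (6.164) reproduced")
```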
6.10 Shear and Compressional Waves

From Hooke's law (6.164) we see right away that μ quantifies the response to shearing. Because if i is different from j, then λ is irrelevant ($\delta_{ij}=0$), and $\tau_{ij}=2\mu\varepsilon_{ij}$. If μ is very large, a little shearing results in a lot of stress. Or, if you turn the argument around, if μ is very large then you need a lot of lateral stress to balance even a very small shear deformation. The response to compression/expansion is controlled by both λ and μ; but if there's a lot of compression/expansion and little shearing, then λ becomes much more important than μ: even a small change in λ would change the (normal) stresses quite a bit. (Again, you can turn it around: if there's lots of normal stresses, even a small change in λ would affect mass density considerably.) After all this, how about we substitute the expression (6.164), that we just found, for τ, into (6.135)? It seems like a natural thing to do. Things get pretty messy if we allow λ and μ to change with location; but, if we take them to be constant326, the divergence of τ becomes

$$\begin{aligned}
\frac{\partial\tau_{ij}}{\partial x_i}&=\frac{\partial}{\partial x_i}\left[\lambda\delta_{ij}\frac{\partial u_k}{\partial x_k}+\mu\left(\frac{\partial u_j}{\partial x_i}+\frac{\partial u_i}{\partial x_j}\right)\right]\\
&=\lambda\delta_{ij}\frac{\partial}{\partial x_i}\frac{\partial u_k}{\partial x_k}+\mu\frac{\partial}{\partial x_i}\left(\frac{\partial u_j}{\partial x_i}+\frac{\partial u_i}{\partial x_j}\right)\\
&=\lambda\frac{\partial}{\partial x_j}\frac{\partial u_k}{\partial x_k}+\mu\frac{\partial^2 u_j}{\partial x_i^2}+\mu\frac{\partial}{\partial x_i}\frac{\partial u_i}{\partial x_j}\\
&=\lambda\frac{\partial}{\partial x_j}\frac{\partial u_k}{\partial x_k}+\mu\frac{\partial^2 u_j}{\partial x_i^2}+\mu\frac{\partial}{\partial x_j}\frac{\partial u_i}{\partial x_i}\\
&=(\lambda+\mu)\frac{\partial}{\partial x_j}\frac{\partial u_k}{\partial x_k}+\mu\frac{\partial^2 u_j}{\partial x_i^2}.
\end{aligned}\tag{6.165}$$
Or, in vector notation,

$$\nabla\cdot\boldsymbol{\tau}=(\lambda+\mu)\nabla(\nabla\cdot\mathbf{u})+\mu\nabla^2\mathbf{u}.\tag{6.166}$$
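Identity (6.166) can also be double-checked symbolically; here is a sympy sketch of mine (the displacement field is an arbitrary smooth choice, there only to exercise the identity):

```python
import sympy as sp

# Symbolic check of (6.166): for constant Lame parameters,
# div(tau) = (lam + mu) * grad(div u) + mu * laplacian(u).
x1, x2, x3 = sp.symbols('x1 x2 x3')
lam, mu = sp.symbols('lamda mu')
X = (x1, x2, x3)
u = [x1**2 * x2 + x3, sp.sin(x1) * x3**2, x1 * x2 * x3]  # arbitrary smooth field

div_u = sum(sp.diff(u[k], X[k]) for k in range(3))
# Hooke's law (6.164): tau_ij = lam*delta_ij*div(u) + mu*(du_i/dx_j + du_j/dx_i)
tau = [[lam * int(i == j) * div_u + mu * (sp.diff(u[i], X[j]) + sp.diff(u[j], X[i]))
        for j in range(3)] for i in range(3)]

# j-th component of div(tau) minus the right-hand side of (6.166):
residuals = [sp.simplify(sum(sp.diff(tau[i][j], X[i]) for i in range(3))
                         - (lam + mu) * sp.diff(div_u, X[j])
                         - mu * sum(sp.diff(u[j], X[i], 2) for i in range(3)))
             for j in range(3)]
assert residuals == [0, 0, 0]
print("identity (6.166) verified")
```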
Then, if, like I was saying, we plug that into (6.135),

$$\rho\frac{d^2\mathbf{u}}{dt^2}=(\lambda+\mu)\nabla(\nabla\cdot\mathbf{u})+\mu\nabla^2\mathbf{u}+\mathbf{f},\tag{6.167}$$
which people sometimes call the “Navier-Cauchy” equation327 . Imagine you know the values of density, rigidity and the other Lamé parameter, λ, of a continuum, but you don’t know how the continuum deforms (after it is deformed with respect to its no-stress, equilibrium setup). Then (6.167) is a tool to find the deformation and how it evolves over time. Of course, (6.167) alone is not enough: you need to know what that initial deformation u(x, 0) is. Without prescribing u(x, 0) you can’t make any
specific prediction about the future: but looking at u(x, 0) with a bit of mathematical insight, we can make some inferences about the nature of u in elastic media, in general. Stokes published a paper328 in 1849, where he solves (6.167) and shows that his solution includes two contributions: waves of dilatation, as he called them (AKA waves of rarefaction, condensational waves, compressional waves, P waves...), and waves of distortion (AKA distortional waves, shear waves, S waves). He does this in infinite three-dimensional space, meaning he does not combine Eq. (6.167) with any boundary condition. (There will be some initial conditions, but we'll see about that later.) I am not going to show you Stokes' derivation exactly as you would read it in his paper, though, but a radically abridged version, the main difference being that Stokes actually finds some formulae for the displacement u in terms of the initial conditions, while I won't give any formulae for u. What I hope to do is just try to convince you that u is made up of waves of dilatation and waves of distortion, or P and S waves, which is what we tend to call them today. Stokes starts out with a slightly different version of (6.167), which you get if you divide both sides of (6.167) by ρ, and if you neglect body forces f—which, as usual, amounts, in practice, to neglecting gravitation—, i.e.,

$$\frac{\partial^2}{\partial t^2}\mathbf{u}(\mathbf{x},t)=\frac{\lambda+\mu}{\rho}\nabla[\nabla\cdot\mathbf{u}(\mathbf{x},t)]+\frac{\mu}{\rho}\nabla^2\mathbf{u}(\mathbf{x},t).\tag{6.168}$$
First, he takes the divergence of both sides,

$$\nabla\cdot\frac{\partial^2}{\partial t^2}\mathbf{u}(\mathbf{x},t)=\frac{\lambda+\mu}{\rho}\nabla\cdot\nabla[\nabla\cdot\mathbf{u}(\mathbf{x},t)]+\frac{\mu}{\rho}\nabla\cdot\nabla^2\mathbf{u}(\mathbf{x},t);\tag{6.169}$$

remember that ∇ · ∇ is exactly the same as ∇²; swap the order of differentiation at the left-hand side and in the last term at the right-hand side; you're left with:

$$\frac{\partial^2}{\partial t^2}[\nabla\cdot\mathbf{u}(\mathbf{x},t)]=\frac{\lambda+2\mu}{\rho}\nabla^2[\nabla\cdot\mathbf{u}(\mathbf{x},t)].\tag{6.170}$$
Then, he (Stokes) goes back to (6.168) and takes the curl of both sides,

$$\nabla\times\frac{\partial^2}{\partial t^2}\mathbf{u}(\mathbf{x},t)=\frac{\lambda+\mu}{\rho}\nabla\times\nabla[\nabla\cdot\mathbf{u}(\mathbf{x},t)]+\frac{\mu}{\rho}\nabla\times\nabla^2\mathbf{u}(\mathbf{x},t);\tag{6.171}$$

now, it turns out329 that the curl of a gradient is always 0: the first term at the right-hand side cancels out, and after swapping the order of differentiation in the other two terms,

$$\frac{\partial^2}{\partial t^2}[\nabla\times\mathbf{u}(\mathbf{x},t)]=\frac{\mu}{\rho}\nabla^2[\nabla\times\mathbf{u}(\mathbf{x},t)].\tag{6.172}$$
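The curl-of-a-gradient fact is also quick to verify symbolically; here is a small sympy sketch of mine (not, of course, part of Stokes' paper):

```python
import sympy as sp

# The curl of a gradient vanishes, which is why the (lam+mu)/rho term
# drops out of (6.171). Check it for an arbitrary scalar field phi.
x1, x2, x3 = sp.symbols('x1 x2 x3')
X = (x1, x2, x3)
phi = sp.Function('phi')(x1, x2, x3)

grad_phi = [sp.diff(phi, xi) for xi in X]
curl = [sp.diff(grad_phi[2], x2) - sp.diff(grad_phi[1], x3),
        sp.diff(grad_phi[0], x3) - sp.diff(grad_phi[2], x1),
        sp.diff(grad_phi[1], x1) - sp.diff(grad_phi[0], x2)]

# Mixed partials commute, so every component cancels identically.
assert all(sp.simplify(c) == 0 for c in curl)
print("curl(grad phi) = 0")
```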
So, instead of solving (6.167) directly, which he didn't know how to do and maybe there's no way of doing it, Stokes reduces it to the two Eqs. (6.170) and (6.172), where the unknown function of the former is the divergence of displacement, and the unknown of the latter is its curl. Stokes actually does more than that, in his paper, and finds ways to reconstruct u based on its divergence and its curl: but it gets quite complex and we don't really need to go through all that to understand what we need to understand about seismic waves, at least so far as this book is concerned. The important point, right now, is that (6.170) and (6.172) are the same equation, except in (6.170) the unknown function is a scalar, while in (6.172) it's a three-dimensional vector; but that doesn't make that much of a difference, because each component of ∇ × u in (6.172) obeys the exact same equation, i.e., (6.172) is just three times the same equation: the wave equation, actually, which see above, Eq. (6.26)330. As of 1849, this equation had already been studied quite a lot, and solved: Stokes refers to a paper by Poisson331, but rather than going through Poisson's algebra, which is quite complex, I'll show you how Liouville, another French mathematician, solves the same problem with a simpler procedure in a paper published332 in 1856. Because, in fact, Liouville just simplifies Poisson's treatment, using some other earlier result from Poisson himself333. He formulates the problem as

$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\phi(\mathbf{x},t)=\nabla^2\phi(\mathbf{x},t),\tag{6.173}$$
where c² is a real and positive but otherwise arbitrary constant334, with initial conditions

$$\phi(\mathbf{x},0)=f(\mathbf{x}),\qquad \frac{\partial\phi}{\partial t}(\mathbf{x},0)=F(\mathbf{x}),\tag{6.174}$$

for some prescribed functions f and F. And of course what you want to do is you want to find a general formula for φ(x, t) that solves (6.173) and adheres to the conditions (6.174). Liouville switches to "spherical coordinates"335, i.e., first he introduces a function $\Psi(\mathbf{x},\boldsymbol{\xi},t)=\phi(\mathbf{x}+\boldsymbol{\xi},t)$, so that if (6.173) is true, then also336

$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\Psi(\mathbf{x},\boldsymbol{\xi},t)=\left(\frac{\partial^2}{\partial\xi_1^2}+\frac{\partial^2}{\partial\xi_2^2}+\frac{\partial^2}{\partial\xi_3^2}\right)\Psi(\mathbf{x},\boldsymbol{\xi},t),\tag{6.175}$$

and then he "transforms the coordinates",

$$\begin{cases}\xi_1=r\sin\vartheta\cos\varphi,\\ \xi_2=r\sin\vartheta\sin\varphi,\\ \xi_3=r\cos\vartheta,\end{cases}\tag{6.176}$$
Fig. 6.12 Spherical coordinates: r , which is the magnitude of the position vector r; ϑ, AKA colatitude; ϕ, AKA longitude
where r is the distance between x and x + ξ, i.e., r = |ξ|, and see the diagram in Fig. 6.12 to see what exactly the angles ϑ and ϕ are. Through quite a bit of algebra337, one can translate differentiation with respect to ξ1, ξ2, ξ3 to differentiation with respect to r, ϑ, ϕ, i.e.

$$\left(\frac{\partial^2}{\partial\xi_1^2}+\frac{\partial^2}{\partial\xi_2^2}+\frac{\partial^2}{\partial\xi_3^2}\right)\Psi=\frac{1}{r}\frac{\partial^2}{\partial r^2}(r\Psi)+\frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial\Psi}{\partial\vartheta}\right)+\frac{1}{r^2\sin^2\vartheta}\frac{\partial^2\Psi}{\partial\varphi^2}.\tag{6.177}$$

It also follows from (6.174) that Ψ must obey the initial conditions

$$\begin{aligned}
\Psi(\mathbf{x},r,\vartheta,\varphi,t=0)&=f(x_1+r\sin\vartheta\cos\varphi,\;x_2+r\sin\vartheta\sin\varphi,\;x_3+r\cos\vartheta),\\
\frac{\partial\Psi}{\partial t}(\mathbf{x},r,\vartheta,\varphi,t=0)&=F(x_1+r\sin\vartheta\cos\varphi,\;x_2+r\sin\vartheta\sin\varphi,\;x_3+r\cos\vartheta);
\end{aligned}\tag{6.178}$$

and so here we are with a brand new differential equation, or, should I say, "initial-value problem", where the unknown function is Ψ(x, r, ϑ, ϕ, t) and x can be treated as a constant. Now substitute (6.177) into (6.175); then, and here's the really clever trick, multiply both sides by sin ϑ and integrate over all possible values of ϑ and ϕ,

$$\frac{1}{c^2}\int_0^\pi d\vartheta\int_0^{2\pi}d\varphi\,\sin\vartheta\,\frac{\partial^2\Psi}{\partial t^2}=\int_0^\pi d\vartheta\int_0^{2\pi}d\varphi\left[\frac{\sin\vartheta}{r}\frac{\partial^2}{\partial r^2}(r\Psi)+\frac{1}{r^2}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial\Psi}{\partial\vartheta}\right)+\frac{1}{r^2\sin\vartheta}\frac{\partial^2\Psi}{\partial\varphi^2}\right].\tag{6.179}$$
And now some ad hoc rearranging of the order in which we do the integrals and the derivatives,

$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,\Psi=\frac{1}{r}\frac{\partial^2}{\partial r^2}\left(r\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,\Psi\right)+\frac{1}{r^2}\int_0^{2\pi}d\varphi\int_0^\pi d\vartheta\,\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial\Psi}{\partial\vartheta}\right)+\frac{1}{r^2}\int_0^\pi\frac{d\vartheta}{\sin\vartheta}\int_0^{2\pi}d\varphi\,\frac{\partial^2\Psi}{\partial\varphi^2},\tag{6.180}$$
and you see that the second and third integrals at the right-hand side are not so hard to do, in fact they are quite easy: remember the "fundamental theorem" and you'll see that

$$\int_0^\pi d\vartheta\,\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial\Psi}{\partial\vartheta}\right)=\left[\sin\vartheta\,\frac{\partial\Psi}{\partial\vartheta}\right]_0^\pi=0\tag{6.181}$$

(sin 0 is 0 and so is sin π), and likewise,
$$\int_0^{2\pi}d\varphi\,\frac{\partial^2\Psi}{\partial\varphi^2}=\left[\frac{\partial\Psi}{\partial\varphi}\right]_0^{2\pi}=0,\tag{6.182}$$
which the very last step maybe is not so trivial, but look again at Fig. 6.12: for given values of ϑ and r, ϕ = 0 and ϕ = 2π point to the exact same location in space, and so $\frac{\partial\Psi}{\partial\varphi}$ must have the same value whether ϕ be 0 or 2π, hence (6.182). Bottom line, the last two terms at the right-hand side of (6.180) both cancel out, and if we multiply both sides by r we are left with

$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\left(r\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,\Psi\right)=\frac{\partial^2}{\partial r^2}\left(r\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,\Psi\right).\tag{6.183}$$

Equation (6.183) stipulates that the unknown function Ψ(r, ϑ, ϕ, t) must be such, that the strange expression $r\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,\Psi$ be solution to the one-dimensional wave equation (the variables being time t and distance r): and that is an equation that people in Stokes', and Liouville's, but also in Poisson's times knew really well, because d'Alembert and company had studied it in so much detail already during the previous century338. We know from d'Alembert that the general solution to the scalar wave equation is a function that depends on the sum and/or the difference of r and the product of c times t, i.e., in its argument time and distance appear only in the form ct ± r, and so in our case we can write

$$r\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,\Psi=\upsilon(ct+r)+\chi(ct-r),\tag{6.184}$$
where for now υ and χ are totally arbitrary functions. But then Liouville points out that when r = 0 Eq. (6.184) boils down to

$$0=\upsilon(ct)+\chi(ct),\tag{6.185}$$

from which it follows that χ(ct) = −υ(ct), whatever the value of ct, and

$$r\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,\Psi=\upsilon(ct+r)-\upsilon(ct-r).\tag{6.186}$$
We can also apply initial conditions (6.178); the first of which implies

$$r\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,f(x_1+r\sin\vartheta\cos\varphi,\;x_2+r\sin\vartheta\sin\varphi,\;x_3+r\cos\vartheta)=\upsilon(r)-\upsilon(-r);\tag{6.187}$$

for the second one, i.e., the condition on $\frac{\partial\Psi}{\partial t}$, you first have to differentiate both sides of (6.186) with respect to t:

$$r\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,\frac{\partial\Psi}{\partial t}=c\,\upsilon'(ct+r)-c\,\upsilon'(ct-r)\tag{6.188}$$

(where υ′ is the derivative of υ), and then set t = 0, and

$$r\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,F(x_1+r\sin\vartheta\cos\varphi,\;x_2+r\sin\vartheta\sin\varphi,\;x_3+r\cos\vartheta)=c\,\upsilon'(r)-c\,\upsilon'(-r),\tag{6.189}$$

where in this last step, finally, we've made use of the second of (6.178). Okay so now, and I apologize, because this is quite cumbersome, but it's also easy to do if you are patient, now the simplest way to constrain υ on the basis of (6.187) and (6.189) is to first differentiate the former with respect to r,

$$\frac{\partial}{\partial r}\left[r\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,f(x_1+r\sin\vartheta\cos\varphi,\;x_2+r\sin\vartheta\sin\varphi,\;x_3+r\cos\vartheta)\right]=\upsilon'(r)+\upsilon'(-r),\tag{6.190}$$

and then multiply by c each side of the equation you've just gotten, and sum it with the corresponding side of (6.189),

$$\begin{aligned}2c\,\upsilon'(r)=\;&r\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,F(x_1+r\sin\vartheta\cos\varphi,\;x_2+r\sin\vartheta\sin\varphi,\;x_3+r\cos\vartheta)\\&+c\,\frac{\partial}{\partial r}\left[r\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,f(x_1+r\sin\vartheta\cos\varphi,\;x_2+r\sin\vartheta\sin\varphi,\;x_3+r\cos\vartheta)\right],\end{aligned}\tag{6.191}$$
and with that, we have υ (or, I should say υ′, which in practice, as you will see in a minute, is even more useful), which admittedly was a lot of work (and in retrospect it is perhaps not so surprising that Poisson hadn't seen all this right away?), but we are almost done. To translate the expression he's found for υ into an expression for φ, which was his initial unknown function, Liouville goes back to (6.186), and, instead of with respect to t, differentiates it with respect to r. What he obtains is

$$\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,\Psi+r\,\frac{\partial}{\partial r}\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,\Psi=\upsilon'(ct+r)+\upsilon'(ct-r),\tag{6.192}$$

which has to hold for any value of r, including r = 0. And so if I pick precisely r = 0, it follows from it that
$$\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,\Psi(\mathbf{x},\mathbf{0},t)=2\,\upsilon'(ct);\tag{6.193}$$
the second argument of Ψ is 0, because remember, Ψ = Ψ(x, ξ, t), and but ξ = 0, because r = 0 and look at Eqs. (6.176). But so then Ψ in (6.193) is constant with respect to ϑ and ϕ; and so we can pull it out of the integrals; and
$$\phi(\mathbf{x},t)\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi=2\,\upsilon'(ct),\tag{6.194}$$
after replacing Ψ(x, ξ, t) = φ(x + ξ, t). Now, the two integrals at the left-hand side are not so hard to do: $\int_0^{2\pi}d\varphi$ is just 2π, while $\int_0^\pi d\vartheta\,\sin\vartheta$ is cos(0) − cos(π) = 2. So,

$$\phi(\mathbf{x},t)=\frac{1}{2\pi}\,\upsilon'(ct):\tag{6.195}$$

and with this Liouville is really done, because all he needs to do is replace r with ct in (6.191), and then plug the resulting expression for υ′(ct) into (6.195), and:

$$\begin{aligned}\phi(\mathbf{x},t)=\;&\frac{t}{4\pi}\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,F(x_1+ct\sin\vartheta\cos\varphi,\;x_2+ct\sin\vartheta\sin\varphi,\;x_3+ct\cos\vartheta)\\&+\frac{1}{4\pi}\frac{\partial}{\partial t}\left[t\int_0^\pi d\vartheta\,\sin\vartheta\int_0^{2\pi}d\varphi\,f(x_1+ct\sin\vartheta\cos\varphi,\;x_2+ct\sin\vartheta\sin\varphi,\;x_3+ct\cos\vartheta)\right],\end{aligned}\tag{6.196}$$

where notice that c has simplified away, except for inside the argument of f and F. Before we interpret φ in terms of the divergence or curl of displacement, let's look some more at Eq. (6.196): because there's already some interesting info, there, about what in general goes on in a system described by (6.173). To see what I mean, doesn't matter whether φ is the divergence or the curl of something, or whatever, just think of it as some sort of generic "disturbance"; think of (x1 + ct sin ϑ cos ϕ, x2 + ct sin ϑ sin ϕ, x3 + ct cos ϑ) as the position vector identifying a location at a distance
ct from x, in the direction pointed at by the angles ϑ and ϕ. You see, then, that Eq. (6.196) states that at time t, the disturbance at x will be nonzero if and only if the initial disturbance f and/or rate of disturbance F are nonzero at at least one point that lies at a distance of exactly ct from x. More in general, because there are integrals over ϑ and ϕ, the value of φ(x, t) is determined by the combination of the values of F and f at all points that lie at a distance ct from x; or in other words, that lie over a sphere of radius ct, centered at x. Which then, incidentally, you see that it makes sense to interpret c as the speed of propagation. You can also turn this interpretation around and instead of one observer seeing disturbances coming in from multiple sources, think of multiple observers, distributed all over the place, recording signal that comes from a single, localized source (a "point source"), i.e., F and/or f are everywhere zero except for one point and, perhaps, its immediate vicinity. To keep things simple, let this signal be an "impulse", i.e., something that on a seismogram would look like a single wiggle, one peak along an otherwise flat line. Equation (6.196) says that, in such a case, no receiver would record anything significant at any time, except for the time t such that the product ct coincides with the distance between the source and that receiver. Which another way to put it is, the disturbance propagates out of the source along a spherical wavefront, and in fact we call this a spherical wave; the center of the "sphere of disturbance" is the source and its radius grows proportionally with time339.

With this picture in mind, let's go back to Eqs. (6.170) and (6.172). If you think of φ as the divergence of displacement, and replace c with $\sqrt{\frac{\lambda+2\mu}{\rho}}$, then (6.196) is solution to (6.170); if φ is the curl of displacement and $c=\sqrt{\frac{\mu}{\rho}}$, then (6.196) is solution to (6.172). I guess to understand what happens in an earthquake, think of it as some sudden release of energy that takes the form of a displacement u(x, 0) which in general will have nonzero curl and divergence. (Typically, the initial displacement u(x, 0) is everywhere zero except for in the immediate vicinity of a fault, where the two sides of the fault move suddenly, or "slip", with respect to one another.) What we've learned from the work of Liouville and Stokes is that both the curl and the divergence of u evolve over time according to Eq. (6.173), i.e., as spherical waves, each with its own wavespeed340. In practice, there's two propagating displacement fields: one, the faster one, with nonzero divergence but zero curl; followed by a slower one that carries no divergence, but has nonzero curl. To convince yourself of this, remember that the divergence of displacement is a measure of how the medium is being compressed and/or expanded. Zero divergence means no compression and no dilatation. So the divergence-free wave, the one that travels at speed $c=\sqrt{\frac{\mu}{\rho}}$, is a wave of pure "distortion", like Stokes called it, or of pure shear like we call it today, a shear or S wave. All compression and dilatation is restricted to the other wave, the one that travels at $c=\sqrt{\frac{\lambda+2\mu}{\rho}}$, and that Stokes called wave of dilatation, and we call compressional or P wave. See Fig. 6.13 to get a feeling of what P and S waves look like in practice341. Another pretty conspicuous thing about P and S waves is that the P wave is always faster, because λ, μ, ρ are all real and positive, by definition (density can't possibly be negative, etc.), and so
Fig. 6.13 Deformation of a chunk of matter, at three subsequent moments in time (top to bottom), as it gets hit by a P (left) or an S wave (right). Strictly speaking, what we are seeing is a portion of an infinite, homogeneous medium, and far from the source: if that wasn’t the case, the wavefront wouldn’t be a plane, and we might see other phases than just regular P and regular S: reflections, refractions, surface waves and so on. (We’ll see about that later in this chapter, and in the next chapter.)
$\frac{\lambda+2\mu}{\rho}>\frac{\mu}{\rho}$, and the same is true of their square roots, of course. This explains their names, P and S: the letter P stands for prima, which is Latin for first, because the P is the fastest and always the first peak in a seismogram; and S stands for secunda, Latin for second, because the S wave always comes in second after the P.
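To put numbers on this: with illustrative, roughly crust-like values of λ, μ and ρ (my own choices, not values quoted anywhere in this chapter), the two speeds come out as follows. The λ = μ special case, for which $v_P/v_S=\sqrt{3}$, is sometimes called a Poisson solid.

```python
import math

# P and S wave speeds from the Lame parameters; the numbers below are
# illustrative, crust-like values, not data from the book.
lam = 30e9    # lambda, Pa
mu = 30e9     # rigidity, Pa
rho = 2700.0  # density, kg/m^3

v_p = math.sqrt((lam + 2 * mu) / rho)
v_s = math.sqrt(mu / rho)
print(f"v_P = {v_p:.0f} m/s, v_S = {v_s:.0f} m/s")

# (lam + 2*mu)/rho > mu/rho whenever lam, mu, rho > 0: P is always first.
assert v_p > v_s
# With lam == mu (a "Poisson solid"), the ratio is exactly sqrt(3).
assert abs(v_p / v_s - math.sqrt(3)) < 1e-12
```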
6.11 Rayleigh Waves

There's more that we can do with the momentum Eq. (6.135). When we looked at waves in fluids, we started out with no boundary conditions—acoustic waves in "free space", as they say—and then added a boundary condition—we replaced free space with half space. We shall do the same now—to find so-called seismic surface waves—starting out with the paper where they were first "discovered", i.e., Rayleigh's "On
Waves Propagated along the Plane Surface of an Elastic Solid" (Proceedings of the London Mathematical Society, 1885). Rayleigh solved the solid, elastic half-space problem mathematically—he showed that surface waves should exist in theory; we shall see that only some years later, other authors came to the conclusion that surface waves are indeed observed in seismograms. Similar to what we did with ocean waves, Rayleigh assumes that u(x, t) varies in time like342 $e^{i\omega t}$ (where i denotes the imaginary unit343): the idea being, let's find a function $\tilde{\mathbf{u}}(\mathbf{x})$ such that

$$\mathbf{u}(\mathbf{x},t)=\tilde{\mathbf{u}}(\mathbf{x})\,e^{i\omega t}\tag{6.197}$$

is a solution of the momentum Eq. (6.168), plus the boundary conditions at the half space's surface. The fact that all dependence on time t is contained in the factor $e^{i\omega t}$ means that displacement will be a sinusoidal oscillation at a single frequency344,345 ω, i.e., there will be one cycle of oscillation every 2π/ω seconds. In what we are going to do, ω will always stay arbitrary, and so what we are going to find is a formula that gives $\tilde{\mathbf{u}}(\mathbf{x},\omega)$ for any value of ω. (And I wrote $\tilde{\mathbf{u}}(\mathbf{x},\omega)$ rather than $\tilde{\mathbf{u}}(\mathbf{x})$, to emphasize that if ω changes $\tilde{\mathbf{u}}(\mathbf{x})$ can also change, which in general it will.) This is useful because, one: Eq. (6.168) is a linear differential equation346, and so, as you know347, the linear combination of any number of its solutions is again a solution. Two: we know from Fourier348 that any function can be written as a linear combination of sinusoids $e^{i\omega t}$. If you put one and two together, finding $\tilde{\mathbf{u}}(\mathbf{x},\omega)$ means finding the coefficients of some linear combination of sinusoids that solves (6.168); and if you find all such combinations of sinusoids—all possible $\tilde{\mathbf{u}}(\mathbf{x},\omega)$—you've found the most general solution of (6.168): because there exists no function that can't be written as a linear combination of sinusoids.
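Point two is easy to demonstrate numerically: let the discrete Fourier transform find the coefficients of an arbitrary sampled signal, then rebuild the signal as an explicit sum of sinusoids (a numpy sketch of mine; the "signal" is just random numbers):

```python
import numpy as np

# Any sampled signal can be written as a linear combination of sinusoids
# e^{i omega t}: the discrete Fourier transform finds the coefficients.
rng = np.random.default_rng(1)
n = 256
t = np.linspace(0.0, 1.0, n, endpoint=False)
signal = rng.standard_normal(n)            # an arbitrary "seismogram"

coeffs = np.fft.fft(signal)                # one complex coefficient per frequency
freqs = np.fft.fftfreq(n, d=t[1] - t[0])

# Rebuild the signal explicitly, one sinusoid at a time.
rebuilt = np.zeros(n, dtype=complex)
for c, nu in zip(coeffs, freqs):
    rebuilt += c * np.exp(2j * np.pi * nu * t) / n

assert np.allclose(rebuilt.real, signal)   # sinusoids alone reproduce it
print("signal reconstructed from sinusoids alone")
```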
Another reason this is useful is that sinusoidal functions can be very easy to deal with, if you remember your trigonometry: we'll see them in action in a second. Now, I could keep the tilde in $\tilde{\mathbf{u}}(\mathbf{x},\omega)$, but what a mess, so I'll just drop it and replace the unknown function u(x, t) in (6.168) with $\mathbf{u}(\mathbf{x},\omega)e^{i\omega t}$, and but we have to remember that u(x, ω) with argument ω is not the same function as u(x, t) with argument t. After this substitution, and if you also recall that the derivative of $e^{i\omega t}$ with respect to t is just $i\omega e^{i\omega t}$, Eq. (6.168) takes the form

$$-\rho\omega^2\mathbf{u}(\mathbf{x},\omega)=(\lambda+\mu)\nabla[\nabla\cdot\mathbf{u}(\mathbf{x},\omega)]+\mu\nabla^2\mathbf{u}(\mathbf{x},\omega)\tag{6.198}$$

(hint: we've divided both sides by $e^{i\omega t}$), and what you've done is you've simplified it, getting rid of a double derivative. If you take the divergence of (6.198),

$$-\rho\omega^2\nabla\cdot\mathbf{u}(\mathbf{x},\omega)=(\lambda+\mu)\nabla^2[\nabla\cdot\mathbf{u}(\mathbf{x},\omega)]+\mu\nabla^2[\nabla\cdot\mathbf{u}(\mathbf{x},\omega)],\tag{6.199}$$

or

$$\rho\omega^2\nabla\cdot\mathbf{u}(\mathbf{x},\omega)+(\lambda+2\mu)\nabla^2[\nabla\cdot\mathbf{u}(\mathbf{x},\omega)]=0,\tag{6.200}$$
or

$$\nabla^2[\nabla\cdot\mathbf{u}(\mathbf{x},\omega)]+\frac{\rho\omega^2}{\lambda+2\mu}\nabla\cdot\mathbf{u}(\mathbf{x},\omega)=0.\tag{6.201}$$

Which, if you think of the divergence ∇ · u(x, ω) as an unknown scalar function, that you might call θ, like Rayleigh does, θ(x, ω) = ∇ · u(x, ω), then more compactly

$$\nabla^2\theta(\mathbf{x},\omega)+\frac{\rho\omega^2}{\lambda+2\mu}\theta(\mathbf{x},\omega)=0.\tag{6.202}$$
To solve this equation Rayleigh plays, with the spatial dependence of the functions he’s looking for, the same trick that he’s just played with their time dependence, i.e., he looks for a solution to (6.202) that is “monochromatic in space”349 . In practice, that would be a function which, if you take its picture at a given moment in time, looks like a sinusoid. In one dimension, that is. In two dimensions, i.e., if we are looking for a function of, say, the Cartesian coordinates x1 and x2 (what we are doing when solving (6.202) is actually to look for a function of x1 , x2 , x3 and t, but let’s take one step at a time), it is helpful to choose it to be proportional to the product of a sinusoidal function of x1 times a sinusoidal function of x2 , like sin(k1 x1 ) times sin(k2 x2 ), or cos(k1 x1 ) times cos(k2 x2 ) or sine times cosine, or whatever, where k1 and k2 are some constant real numbers. This looks like the plot in Fig. 6.14. The values of k1 and k2 control how frequent the oscillations of our function are in space,
Fig. 6.14 The product of sin(k1 x1 ) times sin(k2 x2 ), with k1 = 2 and k2 = 3, and values of both x1 and x2 between 0 and 2π . Because k2 > k1 , change is quicker in the x2 direction. I hope the 3-D rendition helps to understand the picture; in any case, look at the bar on the right to see what shade of gray corresponds to what value of the thing I’ve plotted
i.e., k1 and k2 are to distance (along the respective Cartesian axes) what ω is to time. A function of x1, x2 and x3 that is monochromatic in all three directions is harder to visualize, but it doesn't matter at this point, because Rayleigh only spatially-Fourier-transforms along x1 and x2, and not x3 (and in a minute we'll understand why). So then let's do like Rayleigh does and replace θ(x1, x2, x3, ω) with the function $\theta(k_1,k_2,x_3,\omega)\,e^{ik_1x_1}e^{ik_2x_2}$, which is monochromatic with respect to x1 and x2, but we still haven't specified how it behaves with respect to x3. Again, θ(k1, k2, x3, ω) and θ(x1, x2, x3, ω) are not the same function, just like u(x, ω) and u(x, t) aren't: but I can't stand tildes. The spatial Fourier transform of (6.202) reads

$$\left(-k_1^2-k_2^2+\frac{\partial^2}{\partial x_3^2}+\frac{\rho\omega^2}{\lambda+2\mu}\right)\theta(k_1,k_2,x_3,\omega)=0,\tag{6.203}$$

or

$$\frac{\partial^2}{\partial x_3^2}\theta(k_1,k_2,x_3,\omega)=\left(k_1^2+k_2^2-\frac{\rho\omega^2}{\lambda+2\mu}\right)\theta(k_1,k_2,x_3,\omega).\tag{6.204}$$
You should be able to see, by now, that the general solution to (6.204) is350

$$\theta(k_1,k_2,x_3,\omega)=P(k_1,k_2,\omega)\,e^{-\sqrt{k_1^2+k_2^2-\frac{\rho\omega^2}{\lambda+2\mu}}\,x_3}+Q(k_1,k_2,\omega)\,e^{+\sqrt{k_1^2+k_2^2-\frac{\rho\omega^2}{\lambda+2\mu}}\,x_3},\tag{6.205}$$

where the functions P and Q are, for the time being, unknown. With that, we've found a mathematical expression for the divergence of the displacement u that solves Eq. (6.198). In a minute we'll see how Rayleigh uses that to figure out u itself. For that, we'll need to go back to (6.198). Before we do that, there's one more step we can take to simplify our result (6.205). Because the thing is, okay, (6.205) "solves" (6.204), but remember: what Rayleigh is after are elastic waves generated by earthquakes. And if we think of the earth, in the first approximation, as a half space, there is no way that the divergence of the displacement associated with those earthquakes can keep growing indefinitely with growing depth x3. The hypocenters of quakes are at finite depths (within a planet, nothing can possibly be at an "infinite" depth below the surface!), and the displacement they cause dies away with distance by conservation of energy, and so it's got to go to zero when x3 grows to infinity. The general solution (6.205) is OK with that, as long as you require that Q be zero for all values of k1, k2, ω, and that the square root of $k_1^2+k_2^2-\frac{\rho\omega^2}{\lambda+2\mu}$ be positive351. Bottom line,

$$\theta(\mathbf{x},\omega)=P(k_1,k_2,\omega)\,e^{i(k_1x_1+k_2x_2)}\,e^{-\sqrt{k_1^2+k_2^2-\frac{\rho\omega^2}{\lambda+2\mu}}\,x_3}.\tag{6.206}$$
And that's all we can say at this point about the divergence of u. So let's go back to Eq. (6.198) and manipulate it some more to find more info on u itself. With a bit of algebra, Rayleigh rearranges (6.198) to

$$\nabla^2\mathbf{u}(\mathbf{x},\omega)+\frac{\rho\omega^2}{\mu}\mathbf{u}(\mathbf{x},\omega)=-\frac{\lambda+\mu}{\mu}\nabla[\nabla\cdot\mathbf{u}(\mathbf{x},\omega)],\tag{6.207}$$

which at the left-hand side now we've got something that looks very similar to Eq. (6.202), and at the right-hand side something that contains the divergence of u; and Rayleigh decides to think of this, actually, as a system of differential equations, that is, the combination of the two equations

$$\begin{cases}\nabla^2\mathbf{u}(\mathbf{x},\omega)+\dfrac{\rho\omega^2}{\mu}\mathbf{u}(\mathbf{x},\omega)=-\dfrac{\lambda+\mu}{\mu}\nabla\theta(\mathbf{x},\omega),\\[2mm] \nabla\cdot\mathbf{u}(\mathbf{x},\omega)=\theta(\mathbf{x},\omega).\end{cases}\tag{6.208}$$
This is OK, because it is clear that, to be a solution of (6.208), u(x, ω) must be a solution of (6.207), too, and vice-versa. And it's useful, because after "separating" θ from u like we just did, we can temporarily set θ to zero and attack the so-called "homogeneous"352 system associated with (6.208),

$$\begin{cases}\nabla^2\mathbf{u}(\mathbf{x},\omega)+\dfrac{\rho\omega^2}{\mu}\mathbf{u}(\mathbf{x},\omega)=\mathbf{0},\\[2mm] \nabla\cdot\mathbf{u}(\mathbf{x},\omega)=0,\end{cases}\tag{6.209}$$

which we are already familiar with after solving (6.202). If we do the spatial Fourier thing again, the first equation in (6.209) becomes

$$\frac{\partial^2}{\partial x_3^2}\mathbf{u}(k_1,k_2,x_3,\omega)+\left(\frac{\rho\omega^2}{\mu}-k_1^2-k_2^2\right)\mathbf{u}(k_1,k_2,x_3,\omega)=0,\tag{6.210}$$
with the general, physically acceptable (no indefinite growth of u with growing x3) solution

$$\mathbf{u}_H(k_1,k_2,x_3,\omega)=\mathbf{U}(k_1,k_2,\omega)\,e^{-\sqrt{k_1^2+k_2^2-\frac{\rho\omega^2}{\mu}}\,x_3},\tag{6.211}$$

where uppercase U (a vector with components U1, U2, U3) can be thought of as an arbitrary constant when we solve (6.210), which is an ordinary differential equation with x3 as the only variable: but in general U will change with k1, k2, ω. And for the reasons you now know, the square root of $k_1^2+k_2^2-\frac{\rho\omega^2}{\mu}$ has to be real and positive. Before we're done with the homogeneous system, we must substitute (6.211) into the second equation in (6.209), which results in

$$ik_1U_1+ik_2U_2-\sqrt{k_1^2+k_2^2-\frac{\rho\omega^2}{\mu}}\;U_3=0,\tag{6.212}$$
and so the combination of (6.211) and (6.212) describes all possible solutions of the homogeneous system (6.209). Rayleigh notices that one particular solution of the non-homogeneous system (6.208) is
$$\mathbf{u}_N(\mathbf{x},\omega)=-\frac{\lambda+2\mu}{\rho\omega^2}\,\nabla\theta(\mathbf{x},\omega).\tag{6.213}$$
That that is the case "can be proved by direct substitution", says Rayleigh; so, to check whether he's right, we are going to replace u with its expression (6.213) in the equations it is supposed to solve, do the algebra, and hopefully we'll see that the equations are indeed verified. If you plug (6.213) into the second equation of (6.208) (I'd rather start with that one, because it's easier) you get
$$-\frac{\lambda+2\mu}{\rho\omega^2}\,\nabla\cdot\nabla\theta(\mathbf{x},\omega)=\theta(\mathbf{x},\omega);\tag{6.214}$$
but ∇·∇ = ∇², and if you divide both sides of (6.214) by (λ+2μ)/(ρω²) and bring everything that's not zero to the left-hand side, you'll see that (6.214) and (6.202) are really the same equation: which means that we already know that the second of (6.208) has to be satisfied by θ(x, ω). If you plug (6.213) into the first of (6.208) (swapping ∇² with ∇ in the first term at the left-hand side),
$$\frac{\lambda+2\mu}{\rho\omega^2}\,\nabla\nabla^2\theta(\mathbf{x},\omega)+\frac{\lambda+2\mu}{\mu}\,\nabla\theta(\mathbf{x},\omega)=\frac{\lambda+\mu}{\mu}\,\nabla\theta(\mathbf{x},\omega).\tag{6.215}$$
Then remember that, by virtue of Eq. (6.202), ∇²θ(x, ω) = −(ρω²/(λ+2μ)) θ(x, ω). If we substitute that into (6.215),
$$-\nabla\theta(\mathbf{x},\omega)+\frac{\lambda+2\mu}{\mu}\,\nabla\theta(\mathbf{x},\omega)=\frac{\lambda+\mu}{\mu}\,\nabla\theta(\mathbf{x},\omega),\tag{6.216}$$
and it becomes easy to see that left- and right-hand side are the same. So (6.213) is indeed a solution to (6.208), QED. Maybe that doesn't look like such a great result, because, after all, θ is the divergence of u, and so all that (6.213) does is establish that there's this funny relationship between u_N and its divergence? But it doesn't show clearly what u looks like, as in, what kind of signal is a seismometer going to measure? Yeah, but wait: remember that before we started studying (6.198), or its rearranged form (6.207) (and (6.208) is just another way of writing (6.207)), we had already shown that θ must be a solution to Eq. (6.202), and we'd found the general formula for that, which is (6.206). So we just need to substitute (6.206) into (6.213). Careful, though, because that's trickier than it looks: you must not confuse θ(x, ω) with θ(k1, k2, x3, ω); what I mean is
6 The Vibrations of the Earth
$$\nabla\theta(\mathbf{x},\omega)=\nabla\!\left[\theta(k_1,k_2,x_3,\omega)\,e^{\mathrm{i}(k_1x_1+k_2x_2)}\right]=e^{\mathrm{i}(k_1x_1+k_2x_2)}\left(\mathrm{i}k_1,\ \mathrm{i}k_2,\ \frac{\partial}{\partial x_3}\right)\theta(k_1,k_2,x_3,\omega).\tag{6.217}$$
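By the way, the rule we keep using (each ∂/∂x1 turning into a multiplication by ik1 in the wavenumber domain) is easy to check numerically; here's a tiny sketch of mine in Python, using NumPy's FFT (the test function and the grid are arbitrary choices, not anything from Rayleigh):

```python
import numpy as np

n = 256
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
f = np.sin(3.0 * x)                                      # sample function
k = 2.0 * np.pi * np.fft.fftfreq(n, d=2.0 * np.pi / n)   # angular wavenumbers

# differentiate by multiplying the transform by i*k, then transforming back
df = np.fft.ifft(1j * k * np.fft.fft(f)).real

print(np.abs(df - 3.0 * np.cos(3.0 * x)).max() < 1e-10)  # True
```

The derivative comes out accurate to roundoff, which is exactly the property being exploited when (ik1, ik2, ∂/∂x3) is applied to θ.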
Substitute now this into (6.213), and (6.213) becomes
$$\mathbf{u}_N(\mathbf{x},\omega)=-\frac{\lambda+2\mu}{\rho\omega^2}\left(\mathrm{i}k_1,\ \mathrm{i}k_2,\ \frac{\partial}{\partial x_3}\right)P(k_1,k_2,\omega)\,e^{-\sqrt{k_1^2+k_2^2-\frac{\rho\omega^2}{\lambda+2\mu}}\,x_3}\,e^{\mathrm{i}(k_1x_1+k_2x_2)},\tag{6.218}$$
or, more compactly,
$$\begin{aligned}\mathbf{u}_N(k_1,k_2,x_3,\omega)&=-\frac{\lambda+2\mu}{\rho\omega^2}\,P(k_1,k_2,\omega)\left(\mathrm{i}k_1,\ \mathrm{i}k_2,\ \frac{\partial}{\partial x_3}\right)e^{-\sqrt{k_1^2+k_2^2-\frac{\rho\omega^2}{\lambda+2\mu}}\,x_3}\\&=-\frac{\lambda+2\mu}{\rho\omega^2}\,P(k_1,k_2,\omega)\left(\mathrm{i}k_1,\ \mathrm{i}k_2,\ -\sqrt{k_1^2+k_2^2-\frac{\rho\omega^2}{\lambda+2\mu}}\right)e^{-\sqrt{k_1^2+k_2^2-\frac{\rho\omega^2}{\lambda+2\mu}}\,x_3}.\end{aligned}\tag{6.219}$$
With that, we have all that's needed to finally write the general solution to (6.208),
$$\begin{aligned}\mathbf{u}(k_1,k_2,x_3,\omega)&=\mathbf{u}_H(k_1,k_2,x_3,\omega)+\mathbf{u}_N(k_1,k_2,x_3,\omega)\\&=\mathbf{U}(k_1,k_2,\omega)\,e^{-\sqrt{k_1^2+k_2^2-\kappa^2}\,x_3}-\frac{1}{h^2}\,P(k_1,k_2,\omega)\left(\mathrm{i}k_1,\ \mathrm{i}k_2,\ -\sqrt{k_1^2+k_2^2-h^2}\right)e^{-\sqrt{k_1^2+k_2^2-h^2}\,x_3},\end{aligned}\tag{6.220}$$
where, to avoid clutter, as they say, I've followed Rayleigh and introduced the symbols h² = ρω²/(λ+2μ) and κ² = ρω²/μ. To get u(x, t) from u(k1, k2, x3, ω), all you need to do is multiply the latter by e^{i(k1x1+k2x2+ωt)}. If you multiply or divide a monochromatic solution u(x, t) by a factor that might depend on k1, k2 and ω (but not on x or t), what you get is still a solution of (6.168); it's probably enough to look at (6.168) to convince yourself of that... So, to simplify the algebra that follows, Rayleigh divides (6.220) by P(k1, k2, ω). Because the modulus of U at this point is also arbitrary (the components of U are related by (6.212), but (6.212) is linear, so if you multiply U1, U2 and U3 by the same constant nothing happens to it), there's no need, really, to distinguish U from U/P; so I am not going to replace U with another symbol, we already have enough symbols as it is, and353
$$\mathbf{u}(k_1,k_2,x_3,\omega)=\mathbf{U}(k_1,k_2,\omega)\,e^{-\sqrt{k_1^2+k_2^2-\kappa^2}\,x_3}-\frac{1}{h^2}\left(\mathrm{i}k_1,\ \mathrm{i}k_2,\ -\sqrt{k_1^2+k_2^2-h^2}\right)e^{-\sqrt{k_1^2+k_2^2-h^2}\,x_3}.\tag{6.221}$$
This and (6.212), which relates the components of U to one another, contain all the info that we've now gathered on all possible solutions to Eq. (6.168) in a half space. Equation (6.221) is already quite a result. (It's the same equation as Eq. (19) in Rayleigh's paper, by the way.) But we are not done yet, because remember, we are
looking for waves in a half space, bounded by a flat surface on the other side of which is the atmosphere. The atmosphere is a fluid with little or no viscosity, and as such doesn't significantly resist lateral motion, so the half space's outer surface is free from stress. But this boundary condition hasn't shown up anywhere in our calculations so far: and so there's no guarantee that solutions described by (6.221) and (6.212) will result in zero stress at x3 = 0. In fact, only some of them do, depending on the relative values of U, k1, k2, ω. So now what we need to do is: find formulae for τ13, τ23, τ33 that contain u(k1, k2, x3, ω); in those formulae, replace u with its expression (6.221); equate the resulting formulae to zero, i.e., "prescribe", like we said, that the surface at x3 = 0 be free of stress: and you will get relationships between U, k1, k2 and ω that must hold for a displacement of the form (6.221) to be possible in our half space. But let's see this in detail. (And I should warn you, what follows is going to be somewhat hardcore, as in lots of convoluted algebra, even though, I promise, I am making a serious effort to make it as transparent as possible for you; some of you might want to skip all that and jump directly to Eq. (6.258), which will be Rayleigh's final formula for what have now come to be called Rayleigh waves.) The expressions that relate τ13, etc. to u are
$$\tau_{13}=\mu\left(\frac{\partial u_1}{\partial x_3}+\frac{\partial u_3}{\partial x_1}\right),\tag{6.222}$$
$$\tau_{23}=\mu\left(\frac{\partial u_2}{\partial x_3}+\frac{\partial u_3}{\partial x_2}\right)\tag{6.223}$$
and
$$\tau_{33}=\lambda\,\nabla\cdot\mathbf{u}+2\mu\,\frac{\partial u_3}{\partial x_3}.\tag{6.224}$$
In the steps that follow it's probably a good idea to compact things a bit, the way Rayleigh does it, and introduce r = √(k1² + k2² − h²) and s = √(k1² + k2² − κ²); so that then for example the first component of u reads
$$u_1(\mathbf{x},\omega)=\left[U_1(k_1,k_2,\omega)\,e^{-sx_3}-\frac{\mathrm{i}k_1}{h^2}\,e^{-rx_3}\right]e^{\mathrm{i}(k_1x_1+k_2x_2)},\tag{6.225}$$
which now I am writing again as a function of x, rather than k1, k2, x3 and ω, because, as you might expect from (6.222), I am about to differentiate with respect to x1, etc. This switching back and forth, x1 and x2 to k1 and k2 and vice versa, or t and ω, might be annoying, I know, but hopefully you're getting used to it, so... if I differentiate with respect to x3,
$$\frac{\partial}{\partial x_3}u_1(\mathbf{x},\omega)=\left[-sU_1(k_1,k_2,\omega)\,e^{-sx_3}+\frac{\mathrm{i}rk_1}{h^2}\,e^{-rx_3}\right]e^{\mathrm{i}(k_1x_1+k_2x_2)},\tag{6.226}$$
from which it follows that
$$\frac{\partial}{\partial x_3}u_1(k_1,k_2,x_3,\omega)=-sU_1(k_1,k_2,\omega)\,e^{-sx_3}+\frac{\mathrm{i}rk_1}{h^2}\,e^{-rx_3};\tag{6.227}$$
and likewise if you differentiate in the same way u3 with respect to x1,
$$\frac{\partial}{\partial x_1}u_3(k_1,k_2,x_3,\omega)=\mathrm{i}k_1U_3(k_1,k_2,\omega)\,e^{-sx_3}+\frac{\mathrm{i}rk_1}{h^2}\,e^{-rx_3}.\tag{6.228}$$
If you sub that into (6.222) and then also do ∂u2/∂x3 and ∂u3/∂x2, which isn't really different in any deep way from what we've done above, and sub the result into (6.223), the two equations you get are
$$\tau_{13}=\mu\left[-sU_1(k_1,k_2,\omega)\,e^{-sx_3}+\frac{2\mathrm{i}rk_1}{h^2}\,e^{-rx_3}+\mathrm{i}k_1U_3(k_1,k_2,\omega)\,e^{-sx_3}\right]\tag{6.229}$$
and
$$\tau_{23}=\mu\left[-sU_2(k_1,k_2,\omega)\,e^{-sx_3}+\frac{2\mathrm{i}rk_2}{h^2}\,e^{-rx_3}+\mathrm{i}k_2U_3(k_1,k_2,\omega)\,e^{-sx_3}\right].\tag{6.230}$$
The condition (6.224) on τ33 requires some more work; Rayleigh notices that, the way we've defined h and κ, it follows that μ = ρω²/κ² and λ = ρω²/h² − 2ρω²/κ², and rewrites (6.224) in the form
$$\tau_{33}=\left(\frac{\rho\omega^2}{h^2}-2\,\frac{\rho\omega^2}{\kappa^2}\right)\nabla\cdot\mathbf{u}+2\,\frac{\rho\omega^2}{\kappa^2}\,\frac{\partial u_3}{\partial x_3}=\frac{\rho\omega^2}{\kappa^2}\left[\left(\frac{\kappa^2}{h^2}-2\right)\nabla\cdot\mathbf{u}+2\,\frac{\partial u_3}{\partial x_3}\right].\tag{6.231}$$
Besides the formula for ∂u3/∂x3, which we've learned a minute ago how to derive, we now also substitute into (6.231) the expression (6.206) for ∇·u, or rather ∇·u(k1, k2, x3, ω) = e^{−rx3} (because, remember, now P = 1), and
$$\tau_{33}=\frac{\rho\omega^2}{\kappa^2}\left[\left(\frac{\kappa^2}{h^2}-2\right)e^{-rx_3}-2sU_3(k_1,k_2,\omega)\,e^{-sx_3}-2\,\frac{r^2}{h^2}\,e^{-rx_3}\right].\tag{6.232}$$
Prescribing the boundary conditions τ13 = 0, etc., at x3 = 0 means in practice that we first set x3 = 0 in (6.229), (6.230) and (6.232), and then require the results to be zero, so that what we get is
$$-sU_1(k_1,k_2,\omega)+\frac{2\mathrm{i}rk_1}{h^2}+\mathrm{i}k_1U_3(k_1,k_2,\omega)=0,\tag{6.233}$$
$$-sU_2(k_1,k_2,\omega)+\frac{2\mathrm{i}rk_2}{h^2}+\mathrm{i}k_2U_3(k_1,k_2,\omega)=0\tag{6.234}$$
and
$$\kappa^2-2h^2-2sh^2U_3(k_1,k_2,\omega)-2r^2=0.\tag{6.235}$$
Next, Rayleigh plays with these three equations (and Eq. (6.212), which we haven't really used, yet) until he finds "the equation by which the time of vibration is determined as a function of the wave lengths and of the properties of the solids". You might start by manipulating (6.233) and (6.234) so that they give both U1 and U2 in terms of U3:
$$U_1(k_1,k_2,\omega)=\frac{2\mathrm{i}rk_1}{sh^2}+\frac{\mathrm{i}k_1}{s}\,U_3(k_1,k_2,\omega),\tag{6.236}$$
$$U_2(k_1,k_2,\omega)=\frac{2\mathrm{i}rk_2}{sh^2}+\frac{\mathrm{i}k_2}{s}\,U_3(k_1,k_2,\omega).\tag{6.237}$$
Then substitute both into (6.212), which becomes
$$\mathrm{i}k_1\left[\frac{2\mathrm{i}rk_1}{sh^2}+\frac{\mathrm{i}k_1}{s}\,U_3(k_1,k_2,\omega)\right]+\mathrm{i}k_2\left[\frac{2\mathrm{i}rk_2}{sh^2}+\frac{\mathrm{i}k_2}{s}\,U_3(k_1,k_2,\omega)\right]-sU_3(k_1,k_2,\omega)=0.\tag{6.238}$$
Remembering that i is the square root of −1, and multiplying both sides by −sh², this is reduced to
$$2r\left(k_1^2+k_2^2\right)+h^2\left(k_1^2+k_2^2+s^2\right)U_3(k_1,k_2,\omega)=0,\tag{6.239}$$
which we might solve for U3,
$$U_3(k_1,k_2,\omega)=-\frac{2r\left(k_1^2+k_2^2\right)}{h^2\left(k_1^2+k_2^2+s^2\right)}.\tag{6.240}$$
Substituting (6.240) into (6.235),
$$\kappa^2-2h^2+\frac{4rs\left(k_1^2+k_2^2\right)}{k_1^2+k_2^2+s^2}-2r^2=0,\tag{6.241}$$
or
$$\left(k_1^2+k_2^2+s^2\right)\left(\kappa^2-2h^2-2r^2\right)=-4rs\left(k_1^2+k_2^2\right).\tag{6.242}$$
Rayleigh first squares (6.242), then replaces r and s with their expressions in terms of k1, k2, h, κ, and
$$\left[2\left(k_1^2+k_2^2\right)-\kappa^2\right]^4=16\left(k_1^2+k_2^2-h^2\right)\left(k_1^2+k_2^2-\kappa^2\right)\left(k_1^2+k_2^2\right)^2;\tag{6.243}$$
this scary-looking formula can be simplified quite a bit through the very simple trick of dividing everything by (k1² + k2²)⁴, which gives
$$\left(2-\frac{\kappa^2}{k_1^2+k_2^2}\right)^4-16\left(1-\frac{h^2}{k_1^2+k_2^2}\right)\left(1-\frac{\kappa^2}{k_1^2+k_2^2}\right)=0,\tag{6.244}$$
so that then if you introduce χ² = κ²/(k1² + k2²) and η² = h²/(k1² + k2²), it all boils down to
$$\left(2-\chi^2\right)^4-16\left(1-\eta^2\right)\left(1-\chi^2\right)=0.\tag{6.245}$$
If you work out all the powers,
$$\chi^8-8\chi^6+24\chi^4-16\chi^2-16\eta^2\chi^2+16\eta^2=0.\tag{6.246}$$
Remember that χ and η are related: η²/χ² = h²/κ² = μ/(λ+2μ), or η² = [μ/(λ+2μ)] χ², which we can substitute into (6.246), to turn it into a polynomial equation with χ as its only unknown354
$$\chi^8-8\chi^6+\left(24-16\,\frac{\mu}{\lambda+2\mu}\right)\chi^4+16\left(\frac{\mu}{\lambda+2\mu}-1\right)\chi^2=0.\tag{6.247}$$
We can use this equation to find a numerical value for χ² (notice that all the powers of χ in it are integer powers of χ², too); once we have that, we can immediately get η², too. Rayleigh doesn't say anything about how one is supposed to solve (6.247), but because its order is larger than four, I guess he presumes his readers know that it has to be done numerically or graphically355. To do that, of course, you have to replace λ and μ (or, actually, the ratio μ/(λ+2μ): because that's how λ and μ show up in (6.247)) with their numerical estimates, which might come from laboratory experiments on how rocks deform under pressure and shearing, etc. The orders of magnitude of λ and of μ are usually about the same, so Rayleigh picks λ = μ "as an example of finite compressibility"; then μ/(λ+2μ) = 1/3, and, according to Rayleigh, Eq. (6.247) has one acceptable root (one root, that is, that keeps the square roots defining r and s real), which is χ² = 0.8453. Before we go on, let's think about Eq. (6.247) some more. By the definitions of κ and χ,
$$\omega=\kappa\sqrt{\frac{\mu}{\rho}}=\chi\sqrt{\frac{\mu}{\rho}\left(k_1^2+k_2^2\right)},\tag{6.248}$$
from which you understand, if it wasn't clear yet, that (6.247) is precisely what Rayleigh calls "the equation by which the time of vibration is determined as a function of the wave lengths and of the properties of the solids": the "properties of the solids" are ρ, λ, μ; the wave lengths are k1, k2; the "time of vibration" is ω (or rather, the inverse of ω, which is period: but you get the point). So, in other words, if you know the rigidity μ and compressibility λ, you have all the info you need to find the "roots" of (6.247); and if you also know the density ρ, then you can plug the numerical value of each root into (6.248) to find the ratio of ω to √(k1² + k2²). You see that for every value of ω there's one "spatial wavelength" √(k1² + k2²); or vice versa356. Now that that's clarified, let's work our newly gained knowledge into the displacement solution, Eq. (6.221). First, plug the value of χ² that we've just found into the expressions we have for κ² and h²:
$$\kappa^2=\left(k_1^2+k_2^2\right)\chi^2=0.8453\left(k_1^2+k_2^2\right)\tag{6.249}$$
and
$$h^2=\frac{1}{3}\left(k_1^2+k_2^2\right)\chi^2=0.2818\left(k_1^2+k_2^2\right).\tag{6.250}$$
It follows that
$$s^2=k_1^2+k_2^2-\kappa^2=0.1547\left(k_1^2+k_2^2\right)\tag{6.251}$$
and
$$r^2=k_1^2+k_2^2-h^2=0.7182\left(k_1^2+k_2^2\right).\tag{6.252}$$
Before we substitute all this into (6.221), remember that we've also found a formula, Eq. (6.240), for the third component of the vector U(k1, k2, ω) in terms of k1, k2, h, r, s; and we have formulae, Eqs. (6.236) and (6.237), that give us U1 and U2 if we give them U3. The time has come to put all that together. If we substitute (6.240) into (6.236), (6.237),
$$U_1(k_1,k_2,\omega)=\frac{2\mathrm{i}rk_1}{sh^2}\left[1-\frac{k_1^2+k_2^2}{k_1^2+k_2^2+s^2}\right],\tag{6.253}$$
$$U_2(k_1,k_2,\omega)=\frac{2\mathrm{i}rk_2}{sh^2}\left[1-\frac{k_1^2+k_2^2}{k_1^2+k_2^2+s^2}\right];\tag{6.254}$$
Now plug into (6.253) and (6.254) the ratio of r over s, and s², from Eqs. (6.251) and (6.252); there's some arithmetic and some canceling out, until
$$U_1(k_1,k_2,\omega)=0.5773\,\frac{\mathrm{i}k_1}{h^2},\tag{6.255}$$
$$U_2(k_1,k_2,\omega)=0.5773\,\frac{\mathrm{i}k_2}{h^2};\tag{6.256}$$
which is quite nice: or about as nice as these things can possibly get. And something similar happens if you play the same game with Eq. (6.240),
$$U_3(k_1,k_2,\omega)=-1.4679\,\frac{\sqrt{k_1^2+k_2^2}}{h^2}.\tag{6.257}$$
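Incidentally, all the numerical values that have shown up since (6.247) are easy to verify today. Here is a sketch of mine in Python (the variable names and the bisection approach are not in the book; bisection just stands in for whatever numerical or graphical method Rayleigh's readers would have used): it solves (6.247) with λ = μ, then reproduces the coefficients of (6.249) through (6.257), in units where k1² + k2² = 1.

```python
import math

a = 1 / 3  # mu/(lambda + 2*mu), for lambda = mu

def f(X):
    # left-hand side of (6.247) divided by chi^2, written in X = chi^2
    return X**3 - 8 * X**2 + (24 - 16 * a) * X + 16 * (a - 1)

# bisection on (0, 1): f(0) < 0 and f(1) = 1 > 0, so a root is bracketed,
# and chi^2 < 1 is the only range where both r and s come out real
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (lo, mid) if f(lo) * f(mid) <= 0 else (mid, hi)
chi2 = (lo + hi) / 2

print(round(chi2, 4))             # 0.8453, Rayleigh's root (exactly 2 - 2/sqrt(3))
print(round(math.sqrt(chi2), 4))  # 0.9194

# the coefficients of (6.249)-(6.257), with k1^2 + k2^2 = 1:
kappa2, h2 = chi2, a * chi2           # (6.249) and (6.250)
s = math.sqrt(1 - kappa2)             # (6.251): s = 0.3933...
r = math.sqrt(1 - h2)                 # (6.252): r = 0.8475...
U1_coeff = 2 * r * s / (1 + s**2)     # (6.253) reduced: the factor of i*k1/h^2
U3_coeff = -2 * r / (1 + s**2)        # (6.240) reduced: the factor of sqrt(...)/h^2
print(round(U1_coeff, 5))             # 0.57735, the book's 0.5773 (it's 1/sqrt(3))
print(round(U3_coeff, 4))             # -1.4679
```

All the truncated decimals of the text come out of a few lines of arithmetic, which is a reassuring check on the algebra above.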
Finally: let us substitute the results (6.251), (6.252), (6.255), (6.256) and (6.257) into Rayleigh's general solution (6.221). This time I'll write it component by component357,
$$\begin{cases}u_1(k_1,k_2,x_3,\omega)=\dfrac{\mathrm{i}k_1}{h^2}\left[0.5773\,e^{-0.3933\sqrt{k_1^2+k_2^2}\,x_3}-e^{-0.8475\sqrt{k_1^2+k_2^2}\,x_3}\right],\\[2mm]u_2(k_1,k_2,x_3,\omega)=\dfrac{\mathrm{i}k_2}{h^2}\left[0.5773\,e^{-0.3933\sqrt{k_1^2+k_2^2}\,x_3}-e^{-0.8475\sqrt{k_1^2+k_2^2}\,x_3}\right],\\[2mm]u_3(k_1,k_2,x_3,\omega)=\dfrac{\sqrt{k_1^2+k_2^2}}{h^2}\left[-1.4679\,e^{-0.3933\sqrt{k_1^2+k_2^2}\,x_3}+0.8475\,e^{-0.8475\sqrt{k_1^2+k_2^2}\,x_3}\right].\end{cases}\tag{6.258}$$
We might as well put the horizontal distances x1 and x2, and time, back into the equation; and if I replace h² with its expression (6.250),
$$\begin{cases}u_1(\mathbf{x},t)=\dfrac{\mathrm{i}k_1}{k_1^2+k_2^2}\left[2.0486\,e^{-0.3933\sqrt{k_1^2+k_2^2}\,x_3}-3.5486\,e^{-0.8475\sqrt{k_1^2+k_2^2}\,x_3}\right]e^{\mathrm{i}(k_1x_1+k_2x_2+\omega t)},\\[2mm]u_2(\mathbf{x},t)=\dfrac{\mathrm{i}k_2}{k_1^2+k_2^2}\left[2.0486\,e^{-0.3933\sqrt{k_1^2+k_2^2}\,x_3}-3.5486\,e^{-0.8475\sqrt{k_1^2+k_2^2}\,x_3}\right]e^{\mathrm{i}(k_1x_1+k_2x_2+\omega t)},\\[2mm]u_3(\mathbf{x},t)=\dfrac{1}{\sqrt{k_1^2+k_2^2}}\left[-5.2090\,e^{-0.3933\sqrt{k_1^2+k_2^2}\,x_3}+3.0075\,e^{-0.8475\sqrt{k_1^2+k_2^2}\,x_3}\right]e^{\mathrm{i}(k_1x_1+k_2x_2+\omega t)}.\end{cases}\tag{6.259}$$
You see that what these formulae describe is a plane wave traveling in the direction (k1, k2) parallel to the x1x2 plane, right? So you might as well pick your reference frame358 so that the direction of propagation is parallel to the axis x1, i.e., k2 = 0. The second of (6.259) then boils down to u2(x, t) = 0, i.e., whatever is hit by this kind of wave will only move within a plane that is both perpendicular to the free surface (the earth's surface, in practice) and parallel to the direction of propagation. The other two become
$$\begin{cases}u_1(\mathbf{x},t)=\dfrac{\mathrm{i}}{k_1}\left[2.0486\,e^{-0.3933\,k_1x_3}-3.5486\,e^{-0.8475\,k_1x_3}\right]e^{\mathrm{i}(k_1x_1+\omega t)},\\[2mm]u_3(\mathbf{x},t)=\dfrac{1}{k_1}\left[-5.2090\,e^{-0.3933\,k_1x_3}+3.0075\,e^{-0.8475\,k_1x_3}\right]e^{\mathrm{i}(k_1x_1+\omega t)}.\end{cases}\tag{6.260}$$
This is where Rayleigh drops the imaginary parts, so that
$$\begin{cases}u_1(\mathbf{x},t)=\dfrac{1}{k_1}\left[-2.0486\,e^{-0.3933\,k_1x_3}+3.5486\,e^{-0.8475\,k_1x_3}\right]\sin(k_1x_1+\omega t),\\[2mm]u_3(\mathbf{x},t)=\dfrac{1}{k_1}\left[-5.2090\,e^{-0.3933\,k_1x_3}+3.0075\,e^{-0.8475\,k_1x_3}\right]\cos(k_1x_1+\omega t),\end{cases}\tag{6.261}$$
which if you consider, now, that instruments that might measure these things would presumably be deployed at the surface of the Earth, that is, at x3 = 0, and do the arithmetic,
$$\begin{cases}u_1(\mathbf{x},t)=\dfrac{1.500}{k_1}\,\sin(k_1x_1+\omega t),\\[2mm]u_3(\mathbf{x},t)=-\dfrac{2.2015}{k_1}\,\cos(k_1x_1+\omega t).\end{cases}\tag{6.262}$$
Equation (6.261) is the same as Rayleigh's (39). (The numbers are not quite the same, I am not sure why, but the ratio of the coefficient of the sine to that of the cosine is the same359, and remember that if u is a solution of the linear Eq. (6.168), then u times a scalar constant is also a solution: and so we are good.) There are quite a few things to say about Eqs. (6.261) and (6.262). Their most conspicuous implication is that there can exist in a half space, and so presumably also just below the surface of the earth, some kind of "surface waves" that propagate along that surface, sort of like guided by it; the exponentials in (6.261) show that the amplitude of these waves decays quickly with depth, meaning: the deeper the observer is, the less the wave will be "felt" by it, and at some depth surface waves become negligible. The coefficients of x3 in the exponents are proportional to k1, if the wave propagates in the x1 direction; or, more in general, to √(k1² + k2²), which, remember Eq. (6.248), is proportional to ω: ergo, the higher the frequency, the faster the decay with depth, i.e., the closer to the surface you need to be to feel the oscillation; conversely, low-frequency surface waves "sample", as they say, relatively large depths in the earth's interior. How fast do surface waves propagate? If you look at the argument of sin and cos in (6.261) and (6.262), you see that wavespeed is given by the ratio of ω to k1, which, remember again Eq. (6.248), is
$$\frac{\omega}{k_1}=\chi\sqrt{\frac{\mu}{\rho}}=0.9194\sqrt{\frac{\mu}{\rho}},\tag{6.263}$$
that is, they're a bit slower than shear waves. Finally, if I'm sitting somewhere, and I'm hit by a surface wave, how will I move? Well, we've seen that, if I take the x1 axis to be oriented like the direction of propagation, then u2 = 0, i.e., there is no motion transversal to the direction of propagation; "it is to be remarked", says A. E. H. Love, "that the displacement involved in [surface waves] is two-dimensional. If we think of the plane boundary as horizontal, the components of displacement are a vertical component, and a horizontal component, and it is important to notice that the horizontal component is parallel to the direction of propagation"360. We can also figure out the trajectory of a material point under the effect of a surface wave, in the same way as we did when we were looking at tsunami waves; and interestingly, now, when the material point is at the crest of the wave, it moves horizontally in the direction opposite that of propagation: and vice versa when it's at the trough: which is the opposite of what we've found for tsunami waves. This kind of motion is called "retrograde". And because the coefficient of the cosine in the expression for u3 is larger than that of the sine in u1, we expect the trajectory to be elliptical and we expect its vertical axis to be larger than the horizontal one, i.e., the material point will oscillate vertically more than horizontally. See the sketch in Fig. 6.15. Rayleigh closes his 1885 paper with a very clever remark: "It is not improbable that the surface waves here investigated play an important part in earthquakes [...].
Fig. 6.15 "Retrograde" trajectory of a material point hit by a Rayleigh wave traveling in the decreasing-x1 direction, as per Eq. (6.262). For the sake of simplicity we look at the point that, in the unperturbed state, occupies the origin, i.e. x1 = 0 in (6.262). At the time t = 0, (6.262) says that the material point is displaced to u1 = 0, u3 = −2.2015 (in units of 1/k1); at the time t = π/(2ω) it is at u1 = 1.5, u3 = 0; and so on and so forth
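The orbit in the sketch is straightforward to reproduce from (6.262); here is a quick numerical check of mine in Python (the values of k1 and ω are arbitrary, so displacements come out in units of 1/k1):

```python
import math

k1, omega = 1.0, 1.0  # arbitrary wavenumber and frequency, just to draw one orbit

def surface_displacement(t, x1=0.0):
    # Eq. (6.262): displacement of a particle at the free surface (x3 = 0)
    phase = k1 * x1 + omega * t
    return 1.500 / k1 * math.sin(phase), -2.2015 / k1 * math.cos(phase)

orbit = [surface_displacement(2 * math.pi * n / 360) for n in range(360)]
print(max(u1 for u1, _ in orbit))  # 1.5    : horizontal semi-axis
print(max(u3 for _, u3 in orbit))  # 2.2015 : vertical semi-axis, the larger one
# At t = 0 the particle sits at the top of the ellipse (u3 = -2.2015, i.e.
# upward, with x3 pointing down into the earth) while du1/dt > 0; the phase
# k1*x1 + omega*t meanwhile travels toward decreasing x1: hence "retrograde".
```

The vertical semi-axis indeed comes out larger than the horizontal one, as the text says.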
Diverging in two dimensions only, they must acquire at a great distance from the source a continually increasing preponderance." To understand what he means, you have to consider, first, that yes, he's solved the half space problem with plane waves alone, so he's not proved rigorously that there could be surface waves emanating from an earthquake, along wavefronts that are not planar; but it's difficult to imagine that that wouldn't be the case, and Rayleigh takes for granted that it is. Now, the wavefront of a surface wave is a cylinder; think of "cylindrical" and spherical wave propagation from the point of view of energy conservation. Imagine a half space with a cylindrical wave, an impulse emitted from a point. Shortly after the explosion, or quake, or whatever, say after a time t1, oscillations are confined to within a cylinder of radius 0.9194 √(μ/ρ) t1 around the source. After a longer time t2, the impulse has traveled farther, and now displacement is distributed over a cylinder of larger radius 0.9194 √(μ/ρ) t2. The thing is, energy is conserved, so the total kinetic energy associated with those oscillations has to be the same at any moment: t1 or t2 or whatever. If the same amount of energy at time t2 is expressed by the oscillation of an area (the surface area of the cylinder) that has grown with respect to time t1, well, then the oscillation of each single material point over that area around time t2 has to be smaller361 than around time t1. As the radius of the cylinder grows, which it does proportionally to time, its surface area grows in proportion to the radius. OK. Now, think of a P or S wave: their wavefronts are spheres, whose surface area grows like the square of the radius. And so, accordingly, the energy gets dispersed faster, and the farther an observer is from the epicenter of a quake, the larger the surface wave is in comparison to the body
wave: "a continually increasing preponderance." That was Rayleigh's guess, based on his math plus intuition, and it would later turn out to be right (as we shall see). In his book that I've quoted a paragraph ago, by the way, Love already refers to the surface waves predicted by Rayleigh as "Rayleigh waves"; and this is what seismologists call them now, and what we are going to call them in the rest of this book. With that, we're about as up to date, with the theory of seismic waves, as people were around the end of the nineteenth century. According to Stokes and Rayleigh and all the literature I've cited, at that point it was expected that a seismogram might have three prominent wiggles, or phases: the P wave, which would come in first, then the S, then the surface wave. Or perhaps, more than three: because the "halfspace" models of Stokes and Rayleigh and co. would totally neglect reflected waves, which in more realistic models (that is, models with some important discontinuities between their surface and their center) would need to be summed to the direct ones. Or fewer than three, because the math you've seen above proves that compressional and transverse and Rayleigh waves can exist in an elastic half space: but how can we be sure, at this point, that earthquakes can actually generate such waves? To borrow the words of Duncan Agnew362, "seismologists faced a new problem: sorting out the different 'phases' observed, and relating them to different kinds of waves propagating inside the Earth. Theorizing about the Earth's interior was an active subject in the 19th century, but [...] one in which a maximum of ingenuity was applied to minimal amounts of data.
It was agreed that the depths of the Earth were hot and dense, but what parts were solid or liquid (or even gaseous) was the subject of much debate, though by the 1890s there was general agreement (from Kelvin’s and [George] Darwin’s tidal studies) that a large part of the Earth must be solid, and thus capable of transmitting the two known types of elastic waves, along with the elastic surface wave proposed by Rayleigh in 1885. But it was not known which of these wave types would actually occur, and how much they might become indistinguishably confused during propagation. [...] The most obvious distinction was between the large ‘main phase’ and the preceding ‘preliminary tremors,’ which were early suggested to be surface waves and body waves, respectively. In 1900 R. D. Oldham363 , using measurements of the 1897 Assam earthquake, classified the preliminary tremors into two phases, which he identified with longitudinal and transverse body waves, the main phase being a surface wave”. In Oldham’s work, says Agnew, “measurements of particle motion in the main phase showed motions quite different from Rayleigh’s theory”. But then, there was “the demonstration in 1911 by the mathematician A. E. H. Love that surface waves with particle motion transverse to the direction of propagation were possible in a layered Earth, thus satisfying the characteristics of the main phase in seismograms.”
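Rayleigh's "continually increasing preponderance" from a few paragraphs back is easy to put in numbers. A minimal sketch of mine, assuming only that energy is conserved over the expanding wavefront (distances are in arbitrary units):

```python
# Surface-wave wavefront: a cylinder of roughly fixed height, area ~ r,
# so amplitude ~ 1/sqrt(r). Body-wave wavefront: a sphere, area ~ r**2,
# so amplitude ~ 1/r.
for r in (10.0, 100.0, 1000.0):
    ratio = (r ** -0.5) / (r ** -1.0)  # surface amplitude over body amplitude
    print(f"r = {r:6.0f}: surface-to-body amplitude ratio = {ratio:.1f}")
# the ratio grows like sqrt(r): the farther from the source, the more
# the surface wave dominates the seismogram
```

Which is, in one loop, the "increasing preponderance" that Rayleigh predicted and that observations would later confirm.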
6.12 Richard Oldham, 1900

Before we move on to Love and Love waves, let me tell you some more about Oldham's 1900 paper. Which is a turning point, because for instance it is the first paper ever to state, based not only on theory but also on data, that P and S and Rayleigh waves can indeed be identified on recordings of distant earthquakes; it is also the first paper to show travel-time curves, that is, plots that show the arrival time of each of those phases as a function of distance of the observer from the quake. Oldham starts out with a literature review. He knows that, according to the theory, when you look at a seismogram you should be able to identify compressional, shear and surface waves. Guillaume Wertheim364, he says, "finds in the descriptions of great earthquakes indications of two distinct types of disturbance, [...] which he attributes to the two forms ["condensational", i.e., compressional (or "dilatational"), and "distortional", i.e., shear] of elastic wave motion." But Wertheim's memoir "seems for long to have been devoid of influence on seismological research." In fact, like I said, Mallet, around the same time, totally ignores shear waves. "It would not be materially incorrect to say that Robert Mallet's classic works were based on the hypothesis that earthquake motion was solely that of a condensational wave." Besides Wertheim's old work, there's Rayleigh's more recent, 1885 paper, that now you know everything about. In Oldham's words, Rayleigh waves are "superficial waves analogous to deep-water waves, but owing their propagation to the elasticity of the substance in which they are propagated instead of to gravity365. The concluding passage of [Rayleigh's] paper suggests that 'it is not improbable that the surface waves here investigated play an important part in earthquakes'", etc. (see above).
Now, Oldham mentions two papers by the Italian Cancani (1894 and 1895), where Wertheim's work is cited, but then the surface waves postulated theoretically by Rayleigh are confused with the distortional waves (shear waves). This is a problem, because "Dr. Cancani's views have [...] been accepted by other seismologists", and "the idea that the surface undulations due to great earthquakes are of the nature of distortional, and the so-called 'preliminary tremors' of condensational, waves, seems to have taken root": and for example John Milne, who seems to have been one of the most prominent seismologists of his time, doesn't contemplate, in his papers, the possibility that Rayleigh waves could be observed in a seismogram; and wiggles that might correspond to Rayleigh waves are attributed to distortional waves traveling across the Earth rather than along its surface. So what Oldham does now is, he takes data from stations mostly in Italy366, and then picks a bunch of quakes for which the data fulfil some conditions, because "to ensure [...] accuracy, it is necessary to select the earthquakes dealt with. In the first place, it is essential that the disturbance should originate in a single effort of short duration. [...] Secondly, the time and place of origin must be tolerably accurately known. The limits of error adopted have been 1 min of time and 1° of arc. Thirdly, I have excluded all cases where there were not a sufficient number of independent records to serve as a check on, and confirmation of, each other. Fourthly, as a separation of the condensational and distortional plane waves is not to be looked for in
the heterogeneous materials near the surface of the earth, the records from places at distances of less than 20° of arc367 from the origin have not been taken into consideration." (I think the last point means that, because P and S waves are presumably generated at about the same time, i.e., when the fault breaks, your instrument needs to be at some distance from the source if you want the P and S arrivals to be well separated from one another.) (Re the second point: how do we know the "time and place of origin"? This is explained by Oldham case by case—quake by quake—in the paper, but basically the idea is to use some existing study of the earthquake based on local data: for instance if the epicenter is on land, you'll have records of the perceived intensity (damage, etc.), which will typically decrease with distance away from the epicenter; so if you have enough of this kind of data, and you place them on a map, you can extrapolate (interpolate) the epicentral location. Now take a nearby station. If you are near the quake, seismic waves are not separated yet (see point four above), and the whole disturbance was estimated to travel at about 3 km/s (the early work of, e.g., Mallet: see above). So just divide distance by 3 km/s and you get a time of propagation, that you can subtract from the time at which the quake has "arrived" at your nearby station, to find the "origin" time of the quake.) Oldham sees that there are, indeed, three prominent phases that can be identified in most of the seismograms he's looking at: and he tries to determine what each of them is. After discussing the data quake by quake (restricted, of course, to the earthquakes he's selected), which takes about half his paper, he shows one table for each of those phases.
One row of a table corresponds to the combination of one quake and one station that recorded it (a “source-receiver pair”, as seismologists would say, today), and they are sorted by growing quake-station distance (“epicentral distance”). All tables have the following columns: distance between epicenter and station (measured in degrees: that would be the angle, with vertex at the center of the earth, between epicenter and station); difference, in minutes, between origin time and time of arrival at the station of the phase in question (he calls it “interval”; later it would be called “travel time”); their ratio, in kilometers per second, which is the average speed of propagation for that phase (he calls it “rate”). There are three of those tables... no, four, because for the third phase he picks both time of “commencement” (the “onset” of that phase) and of maximum amplitude. And Oldham sees that the ratio in the last column grows with growing distance for the first two phases, and is constant for the third phase: “Here we get a result different from that obtained in the case of the first and second phases, in that the time intervals increase in practically the same ratio as the distances, and there is no indication of an increase of apparent velocity with the distance.” In other words, if one plots “interval” versus distance (which Oldham did: see Fig. 6.16), what happens is data from the first two phases form two curved lines (apparent velocity grows with distance, so the curves go flat as distance grows), while data from the third phase form a straight line. This can be confusing because what on earth is “apparent” velocity? The difficulty is in the use of the word “distance” here, and I could have made it simpler for you by changing the language and skipping some steps in Oldham’s reasoning, but we might as well follow Oldham, since this paper is a blueprint of a lot of the seismology
Fig. 6.16 After Oldham (1900): "Time curves of the three phases of earthquake motion recognisable in distant records". On the horizontal axis there's epicentral distance, in degrees; on the vertical axis, time in minutes. Pluses and dots are observations (let's not get into why there are two different symbols, or what the numbers mean, which I am not even sure about), and the solid lines almost touch most of those data points, while still being very smooth: see the text in the figure for which line corresponds to which phase
that was to come in the following years, and until today; Oldham decides to make a distinction between actual velocity, i.e., distance (which in physics is not an angle, but a length: measured in units of length, like meters or kilometers, not degrees) over time, and something that he calls (and seismologists will continue to call) apparent velocity. Oldham explains this in detail (see Fig. 6.17): "the rate of propagation may [...] mean one of two things, either (1) the apparent rate of propagation as measured at the surface of the earth, or (2) the true rate of propagation as measured along the actual wave path", where "measured at the surface of the earth" means: measured in degrees: and this is the "distance" that Oldham notes in his tables. (Because, remember your trigonometry: if you multiply the angle, taken at the center of the earth and expressed in radians, by the earth's radius, you get a length measured along the surface, or the length of the "arc" that connects the two points in question.) Apparent and true "velocities, whose distinction is well recognised [...] must, however, be further subdivided, and for the present purpose the four following values recognised:
6.12 Richard Oldham, 1900
Fig. 6.17 Thin solid line: the surface of the earth; thick dashed line: source-receiver distance along the surface of the earth—used to calculate apparent velocity of a seismic wave; thick solid line: the source-receiver distance tout court. If seismic velocities inside the earth were constant, the thick solid line would also coincide with the body-wave propagation path
“v = the apparent rate of propagation at any given point on the surface. This is what is commonly meant by apparent rate of propagation.
“va = the apparent average rate of propagation as between two points on the surface of the earth. [...] It is this value which has been given in the tabular statements above as the rate.
“V = the true rate of propagation at any given point of the wave path.
“Va = the true average rate of propagation, obtained by dividing the distance [between epicenter and station] by the time interval.”
By the way, the “wave path” (which later we shall also call “ray path”) would be the curve that’s everywhere perpendicular to the wavefront; if you want to know how long it takes, after the quake, before the wave hits the receiver, you divide distance along the wave path by velocity of propagation—straightforward kinematics. If the speed of propagation is not constant across a medium, though, things get more complicated. If you’ve studied refraction in optics (which Oldham knew all about, because the laws of reflection and refraction, the so-called “geometrical optics”, had been known at least since Newton) you know what I am talking about: as it enters an area of the medium where for whatever reason wave propagation is slower, the wave is deflected, meaning the wavefront is deflected, like, retarded or sped up, and so is the wave path. If you don’t know what I am talking about, it’s okay, because we’ll look into that in the next chapter: for now just believe me, if seismic velocities are heterogeneous then the wave path is not a straight line. And so, back to Oldham: “In the special case of waves propagated along the surface with a uniform velocity these four values (v, va, V, Va) are identical; if wave motion is propagated at a uniform speed and along rectilineal [sic] paths through the earth the values of V and Va will be identical but different from v and va, which will also differ from each other. 
In any other case the four quantities must necessarily be different.” Let’s look at the data (the tables, or Fig. 6.16), starting with the third phase. Like I was saying, Oldham shows that its average apparent velocity va is approximately the same for all source-receiver pairs for which there are data. That means that, as far as the third phase is concerned, va does not change with distance between source and receiver. What does that imply? Well, look at Fig. 6.17, showing the trajectory of a (body) wave, traveling on a straight line across the planet, versus that of a surface wave, traveling along its surface.
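Before following the argument further, the geometry of Fig. 6.17 is worth putting in numbers. Here is a minimal sketch of the kind of arithmetic involved: even if a body wave travelled at a constant speed v along the straight chord (the thick solid line in Fig. 6.17), its apparent average velocity (arc distance divided by travel time along the chord) would still grow with epicentral distance, simply because the arc grows faster than the chord. The value v = 10 km/s is an arbitrary placeholder, not a number from Oldham's paper.

```python
import math

R = 6371.0   # earth's radius, km
v = 10.0     # assumed constant true velocity along the chord, km/s (placeholder)

def apparent_velocity(delta_deg):
    delta = math.radians(delta_deg)            # epicentral distance: angle at earth's center
    chord = 2.0 * R * math.sin(delta / 2.0)    # length of the rectilinear (straight) wave path
    arc = R * delta                            # distance "measured at the surface"
    travel_time = chord / v                    # time spent along the chord
    return arc / travel_time                   # apparent average velocity, va

for deg in (20, 60, 100, 140):
    print(deg, round(apparent_velocity(deg), 2))
```

The ratio of arc to chord, and with it the apparent velocity, grows from about half a percent above v at 20 degrees to roughly 30 percent above v at 140 degrees: so even rectilinear propagation at constant speed makes va grow with distance. Oldham's point, as we are about to see, is that the growth he sees in the data is markedly stronger than this.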
If the third phase were a body wave, its apparent and true average velocity would differ, and their ratio would change (grow) with source-receiver distance: because the ratio of apparent (along-surface) to true travelled distance would change (grow). And so unless the true velocity changes in a very peculiar way, which could be, but is unlikely, the apparent velocity of a body wave can’t be independent of distance: it should rather grow with distance. Oldham infers that the third phase is very likely to be a surface wave. And because va doesn’t change much even though sources and stations are all over the globe, he also infers that the velocity of propagation of surface waves is approximately constant throughout the surface of the globe. The first two phases behave differently; their apparent velocity actually grows with distance; so neither of those two phases can be “surface” waves; they must be traveling through the Earth, and since there’s two of them, one (the faster) must be the P, and the other one the S wave predicted by the theory. The next question is, how fast do they propagate? Oldham notices that the apparent velocities of P and S waves both grow with source-receiver distance, and this “increase is markedly greater than what would be due to rectilinear propagation”368. This is quite an important observation; but before we discuss it, I have to make sure you understand what is meant by “rectilinear propagation”. What’s rectilinear, in rectilinear propagation, is the wave path, meaning that if the source is (approximately) a point in space, the wavefront is a sphere whose radius grows with time, while its centre stays the same: and so wherever you place the receiver in space, the source-receiver path that’s always perpendicular to the wavefront is a straight line. 
This happens provided that the velocity of propagation is constant in space and time, which was always implicitly the case in this book so far, so that there’s no reason for the wavefront to deviate from its spherical shape (or planar shape, if we’re dealing with plane waves; anyway, you get the point). So then, if you turn the argument around: when propagation is rectilinear, i.e., the wavefront is spherical, that means that velocity must be constant. So, Oldham says he figured by “a simple calculation” that the increase in apparent body-wave velocity with source-receiver distance “is markedly greater than what would be due to rectilinear propagation”. Presumably he’s calculated how the difference between the length of arc and chord (the straight line) in Fig. 6.17 grows with the angle, at the center of the earth, between source and receiver,369 which, if you divide it by some estimate of velocity (and in doing so you’re implicitly assuming velocity is constant), should give you the difference between the body-wave curves in Fig. 6.16 and the straight line corresponding to that velocity. But what Oldham is saying is that, whatever the velocity estimate you use, the curve you get from this calculation doesn’t have the same trend as that seen in the data. Ergo: propagation is not rectilinear and wave speeds are not constant across the earth. Oldham: “The earliest suggestion that the propagation of earthquake movement was along curved, and not straight, wave paths is contained in a paper by Dr. A. Schmidt370, of Stuttgart, in which he pointed out that the assumption, made by all previous investigators, of a constant rate of propagation and a rectilinear wave path, was an improbable one. The very different conditions of temperature and pressure in the interior of the earth cannot be without influence in modifying the elasticity and consequently the rate of propagation, and
an investigation of the observed rates of propagation of certain earthquakes indicates that this modification results in an increase of the rate of propagation with the depth below the surface.” Rudzki371 has “investigated [this problem] mathematically [...] for the case of a spherical body, such as the earth. He shows that the wave path which would be straight in the case of a homogeneous solid, or one in which the rate of transmission was constant for all distances from the centre, would be convex towards the centre if the velocity of transmission increased as the distance from the centre diminished, that is as the depth below the surface increased, and concave towards the centre if the opposite were the case. He then investigates the form of the wave paths on the assumption that the velocity of propagation is a constant function of the radial distance from the centre.” So, you might be disappointed, now, and I sort of hope you are, because I am not giving you any details about those papers by Schmidt and Rudzki. But no worries, because in the next chapter we’ll learn more about body waves and their paths of propagation. Oldham then addresses the question of whether the velocity that he finds looking at seismograms fits independent, experimental observations. Oldham: “Little is known of the rate of transmission of elastic waves through rock. Direct experimental determinations, by measuring the rate of transmission of the disturbance set up by an explosion, give little assistance, as the velocities obtained are much less than should result from the elastic constants of the rock, a difference probably due to the weathered and fissured condition of all rocks near the surface. The only experiments of much value are those of Professor Gray and Milne [Quarterly Journal of the Geological Society, vol. 39, 1883, p. 
140], who measured the elastic constants of certain rocks, and obtained results from which the following rates of transmission of condensational and distortional waves were deduced”... and Oldham then gives a table, with three rocks—granite, marble and slate—and, for each rock, values of compressional and shear velocity and their ratio; the numbers differ a bit from rock to rock, but on average they are 4.09 km/s, 2.38 km/s and 1.68. In a later paper by Knott [Transactions of the Seismological Society of Japan, vol. 12, 1888] different density values are used, which result in slightly higher velocities: 4.3 and 2.5 km/s for compressional and shear velocity, respectively. These are experiments made on rock sampled at or near the surface, so wave velocity is that of the shallow earth, i.e., near the source. Oldham extrapolates the curves he has obtained, to estimate what velocity one would observe for each phase if it were possible to identify both P and S phases near the source—i.e., sampling only the shallow part of the earth: “Having thus two forms of wave motion propagated through the earth, it is natural to regard them as condensational and distortional, and this inference is strengthened if we complete the curves and carry them on to the origin. They then give initial velocities of transmission of about 5 and 3 km/s respectively.” Which Oldham considers to be “in close accord” with the experimental data just mentioned. Finally, re the third phase, Oldham concludes that it “corresponds to the arrival of a form of wave motion which is propagated round the surface and not through the interior of the earth; [...] that the true and apparent velocities of propagation are
everywhere the same, but that the rate of propagation varies in the case of different earthquakes, being dependent in some way on the size of the waves set up by it. In the case of the greatest earthquakes, which are recorded at distances of 60° and over, this rate of propagation appears to be practically always about 2.9 km/s for the principal and largest waves, and may rise to over 4.0 km/s for the long low waves which outrun the principal ones.” So, the third phase is precisely the surface wave found in theory by Rayleigh? Oldham is not so sure: “The nature of these waves has yet to be elucidated. The elastic surface waves investigated by Lord Rayleigh should travel, in material of the nature of the rocks with which we are acquainted, at a rate of about 0.9 of the rate of propagation of a distortional plane wave in an infinite solid. This for continuous rock of the nature of that which forms the crust of the earth is about 2.6 km/s, so that if we take the rate of propagation of the greatest surface waves at 2.9 km/s, the excess is just about what the defect should be.” What Oldham is saying is that the third phase, as seen in seismograms, is “faster” than what Rayleigh thought “his” waves should be. Much faster, actually. Because we’ve seen that the speed of Rayleigh waves was supposed to be 0.9 times the shear velocity of near-surface rocks: which is 2.6. And 0.9 times 2.6 is about 2.3 km/s. And Oldham observes 2.9 km/s from his data. So, yeah, something “has yet to be elucidated” about surface waves. Also, says Oldham, “the form of the molecular movement in the waves investigated by Lord Rayleigh, does not seem to be consonant with that recorded in the neighbourhood of the origin. 
At great distances it may be in closer accord, but apart from this, the rate of propagation of the purely elastic surface waves is not a function of either their length or amplitude, while that of the great surface undulations of an earthquake appears to be a function of one or both of these.” This is a very dense piece of text, but what I think Oldham means is, first, that the so-called particle motion associated with the third phase is not just the retrograde elliptical motion confined to the vertical plane that came out of Rayleigh’s calculations. It also has a “transverse component”, i.e., at the arrival of the third phase, the seismograph’s needle might also move along the axis that’s perpendicular to the direction of propagation but parallel to the surface of the Earth. Secondly, according to Rayleigh’s formulae, the speed at which his waves propagate depends on μ and ρ, remember Eq. (6.263), but not on their own frequency: and that’s a problem, because according to Oldham’s data low-frequency waves are faster than high-frequency ones: wave speed “may rise to over 4.0 km/s for the long low waves”. Both remarks would be confirmed in later, more realistic versions of the theory, beginning, I guess, with Love’s 1911 book, Some Problems of Geodynamics, which is what we are going to look at next. By “more realistic” I mean that Rayleigh did his math assuming that the earth behaves like a half space, and that’s a quite brutal approximation, because the properties of the rocks that make up the earth: ρ, λ, μ change quite rapidly with depth: and as soon as you take that into account, new calculations, similar to those that Rayleigh did, show that there are surface waves that carry a transverse motion, that those waves are faster than Rayleigh waves, and that both kinds of surface waves, anyway, are dispersive: which means (see the bit
on ocean waves earlier in this chapter) that their speed of propagation is, indeed, a function of their frequency. But let’s not get ahead of ourselves. When Some Problems came out in 1911 A. E. H. Love was already the author of a monumental Treatise on the Mathematical Theory of Elasticity, which was also a success, reprinted many times, etc. Some Problems is a much shorter book, dealing, like the title says, with some specific questions, like that of whether the third phase seen in seismograms coincided with the surface wave postulated by Rayleigh, or Rayleigh’s theory needed to be revised... “a high-level mathematical treatment”, say the notes on the back of my 1967 Dover paperback edition, “of certain basic problems connected with the structure and vibrations of the earth’s mass. Winner of the Adams Prize (1911) in the University of Cambridge372, and long regarded as an indispensable body of theory for seismologists and geophysicists”, etc.: a famous book, and it’s got a very clear summary of the “state-of-the-art” in 1911 seismology, which here are some excerpts of—going, again, through Rayleigh’s and Oldham’s results, before something new is proposed: “Systematic records [...] of the disturbances that are transmitted to distant stations when a great earthquake takes place,” says Love, “began to be made about the year 1889373. [...] The records showed two very distinct stages, the first characterized by a very feeble movement, the second by a much larger movement. These are the so-called ‘preliminary tremor’ and ‘main shock’. The idea that these might be dilatational and distortional waves, emerging at the surface, took firm root among seismologists for a time374. In the light of increasing knowledge this idea had to be abandoned. “The theory of the dilatational and distortional waves takes no account of the existence of a boundary. 
When the waves from a source of disturbance within a body reach the boundary they are reflected, but in general the dilatational wave gives rise on reflexion to both kinds of waves, and the same is true of the distortional wave. Any subsequent state of the body can, of course, be represented as the result of superposing waves of the two kinds reflected one or more times at the boundary, with an allowance for the motions that take place between the two waves, but this mode of representation is very difficult to follow in detail. In particular it is not easy to see without mathematical analysis how such waves can combine to form a disturbance travelling with a definite velocity, less than either [the velocity of P waves or that of S waves] over the surface. Yet such is the case. Lord Rayleigh showed in 1885 that an irrotational displacement involving dilatation and an equivoluminal displacement involving rotation can be such that (1) neither of them penetrates far beneath the surface, (2) when they are combined the surface is free from traction. Such displacements may take the form of standing simple harmonic waves of a definite wave-length and period, or they may take the form of progressive simple harmonic waves of a definite wave-length and wave-velocity. [...] The displacement involved in [Rayleigh-waves] is two-dimensional. If we think of the plane boundary as horizontal, the components of displacement are a vertical component, and a horizontal component, and it is important to notice that the horizontal component is parallel to the direction of propagation. [...] The vertical component at the surface
is larger than the horizontal component. [...] [Rayleigh waves] might play an important part in earthquakes. Since they do not penetrate far beneath the surface, they diverge practically in two dimensions only, and so acquire a continually increasing preponderance at a great distance from the source.” We’ve seen all this a few pages ago, more or less, right? Then Love mentions that Rayleigh’s ideas were first applied to the interpretation of seismograms by Oldham. And he mentions Oldham’s convincing argument about the apparent and true velocity of the third phase being equal. Now, at the time when Love is writing, Oldham’s “suggestion that the first and second phases of the preliminary tremors should be regarded as dilatational and distortional waves, transmitted through the body of the earth, has been very generally accepted, but the proposed identification of the main shock with Rayleigh-waves has been less favourably received, [...] because observation has shown that a large part of the motion transmitted in the main shock is a horizontal movement at right angles to the direction of propagation.” People have been analyzing new and more precise seismograms (better technology, more instruments deployed around the world, etc.), to find “that the large waves of the main shock, like the preliminary tremors, show more than one phase. The first phase is characterized by relatively long periods and a preponderance of transverse movement, the second phase by shorter periods with again a preponderance of transverse movement; the distinction between these two seems not to be very important. In the third phase the horizontal movement is mainly in the direction of propagation, the periods are shorter than those which occur in the two preceding phases, and these periods gradually diminish. This phase brings the largest movements”. 
In these few sentences are a couple of observations that were missing in both Rayleigh’s and Oldham’s papers: first, that in what Oldham used to call the third phase, and which is safer to call the main shock (and today we’d call it the surface waves: but Love and his contemporaries couldn’t be sure that those were just surface waves, yet), one can actually identify “more than one phase”: initially, in fact, displacement is approximately only transverse to the direction of propagation, and parallel to the surface of the earth. Then, this first phase of the main shock is followed by a second phase that also carries transverse motion, but at higher frequency: which is suggestive of dispersion, indeed. Finally, a third phase of the main shock appears to coincide with the Rayleigh wave, whose associated “horizontal movement is mainly in the direction of propagation”.
6.13 Love Waves Love’s main idea, which he works out in chapter XI of Some Problems, is to prove mathematically that, besides the Rayleigh waves, there exist also another kind of surface waves, that carry only transverse motion—see the sketch in Fig. 6.18. Incidentally, those are precisely what today we call Love waves. We’re going to see
Love’s derivation next, and we are also going to see that, because Love now introduces some “structure” on top of Rayleigh’s half space, the waves he finds are also going to be dispersive. So, go back to Navier-Stokes, Eq. (6.135), and write it component by component (you’ll see that that’s convenient, for what we have to do next), plus neglect the gravity term—gravity points in the vertical direction, and the displacement Love is looking for is parallel to the earth’s surface, i.e., horizontal: so gravity won’t affect it much. What’s left of Navier-Stokes’s equation is ρ
∂τ ji ∂ 2ui = . ∂t 2 ∂x j
(6.264)
Let the reference frame be the same as Rayleigh’s, with x_1 the direction of propagation and x_3 the direction of increasing depth. Then, a surface-wave displacement like that in Fig. 6.18 can be written

$$\mathbf{u}(\mathbf{r}, t) = \left( 0,\ h(x_3)\, e^{i(k x_1 - \omega t)},\ 0 \right), \tag{6.265}$$
where, by definition of surface wave, we expect the unknown function h(x3 ) to decay with increasing depth below the surface. Like Rayleigh, Love chooses to look for a monochromatic solution. (Again, kind of like taking the Fourier transform, but not quite.) As you can see in (6.265), Love is looking for motions that are nonzero only along x2 and that do not change as a function of x2 itself (plane waves). This actually
Fig. 6.18 Love-wave motion (small gray arrows) is parallel to the surface of the earth and transversal to the direction of propagation (large gray arrow). The sketch shows a section of the half space modeled by Love. Its upper face coincides with the surface of the earth
makes the algebra much simpler. First, we can substitute u_1 = u_3 = ∂u_2/∂x_2 = 0 into the elasticity law (6.164), which then collapses to

$$\tau_{ij} = \mu \left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right). \tag{6.266}$$

As a result, the equation of motion (6.264) becomes

$$\rho \frac{\partial^2 u_i}{\partial t^2} = \mu \frac{\partial}{\partial x_j} \left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right), \tag{6.267}$$

or, after substituting, again, u_1 = u_3 = ∂u_2/∂x_2 = 0,

$$\rho \frac{\partial^2 u_2}{\partial t^2} = \mu \left( \frac{\partial^2 u_2}{\partial x_1^2} + \frac{\partial^2 u_2}{\partial x_3^2} \right). \tag{6.268}$$
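Incidentally, (6.268) is easy to sanity-check numerically: a plane shear wave traveling along x_1 at speed v_S = √(μ/ρ), with no x_3 dependence, should satisfy it. A minimal sketch, with arbitrary nondimensional numbers that are not from the book:

```python
import math

# Check that u2 = sin(k*(x1 - vS*t)), with vS = sqrt(mu/rho), satisfies
#   rho * u2_tt = mu * (u2_x1x1 + u2_x3x3)        [Eq. (6.268)]
# by comparing second finite differences on both sides. All numbers are
# arbitrary, nondimensional placeholders.

rho, mu, k = 1.0, 4.0, 1.0
vS = math.sqrt(mu / rho)                 # = 2.0

def u2(x1, x3, t):
    return math.sin(k * (x1 - vS * t))   # plane shear wave, constant along x3

def second_diff(f, step):
    """Centered second finite difference of f at 0."""
    return (f(step) - 2.0 * f(0.0) + f(-step)) / step**2

x1, x3, t, step = 0.3, 0.7, 0.2, 1e-4
u_tt = second_diff(lambda d: u2(x1, x3, t + d), step)
u_11 = second_diff(lambda d: u2(x1 + d, x3, t), step)
u_33 = second_diff(lambda d: u2(x1, x3 + d, t), step)

lhs = rho * u_tt
rhs = mu * (u_11 + u_33)
print(lhs, rhs)   # the two sides agree to finite-difference accuracy
```

The two printed numbers match to several decimal places, which is the statement that shear disturbances obeying (6.268) travel at √(μ/ρ).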
Now, Love knows—he’s presumably figured it out by trial and error—that he won’t be able to find transverse surface waves unless his earth model includes at least one layer, between the atmosphere (which, like Rayleigh, he treats like a vacuum) and the rest of the solid earth, whose density and seismic velocities differ from those below: actually, whose shear velocity is lower than below. So, in his book, he solves (6.268) in a setup consisting of a flat, homogeneous top layer of thickness H, overlying a homogeneous half space. Call ρ_1 and μ_1 the density and rigidity of the layer, and ρ_2 and μ_2 those of the half space. As for the shear velocities, v_S1 = √(μ_1/ρ_1) ≤ v_S2 = √(μ_2/ρ_2). The equation of motion (6.268) must hold in both layer and half space. Substituting into it both the layer and half-space versions of (6.265) we get

$$\frac{\partial^2 h(x_3)}{\partial x_3^2} = \left( k^2 - \frac{\omega^2}{v_{Sj}^2} \right) h(x_3), \tag{6.269}$$

where e^{i(kx_1 − ωt)} has canceled out, and j = 1, 2 (layer and half space, like I said). You’ve seen versions of (6.269) before, and you know that its complete solution is

$$h_j(x_3) = A_j\, e^{-\sqrt{k^2 - \omega^2/v_{Sj}^2}\, x_3} + B_j\, e^{+\sqrt{k^2 - \omega^2/v_{Sj}^2}\, x_3} \quad (j = 1, 2), \tag{6.270}$$

where the arbitrary constants A_j and B_j are to be determined via the boundary conditions, and will have different values in the layer versus the half space. Substituting (6.270) into (6.265), we find the general solution for displacement

$$u_2(\mathbf{r}, t) = \left( A_j\, e^{-\sqrt{k^2 - \omega^2/v_{Sj}^2}\, x_3} + B_j\, e^{+\sqrt{k^2 - \omega^2/v_{Sj}^2}\, x_3} \right) e^{i(kx_1 - \omega t)} \quad (j = 1, 2). \tag{6.271}$$
The quantities A_1, B_1, A_2, B_2, k and ω are (partly) constrained when realistic boundary conditions are prescribed, namely: (i) that displacements tend to zero when x_3 → ∞ (no seismic sources below a certain depth); (ii) that the outer surface be free of stresses; (iii) that displacements and tractions be continuous at the interface x_3 = H between layer and half space. Boundary condition (i) is satisfied if we require either Re(−√(k² − ω²/v_S2²)) < 0 and B_2 = 0, or Re(+√(k² − ω²/v_S2²)) < 0 and A_2 = 0, where Re stands for real part, of course. Let us pick the first option. The condition (ii) can be written

$$\tau_{31}(x_3 = 0) = \tau_{32}(x_3 = 0) = \tau_{33}(x_3 = 0) = 0. \tag{6.272}$$

We already know that τ_31 = τ_33 = 0 since u_1 = u_3 = 0 by construction; we are left with τ_32 = μ ∂u_2/∂x_3, and the condition that ∂u_2/∂x_3 vanish at x_3 = 0; using the expression (6.271) for u_2 and substituting x_3 = 0, this becomes

$$\left( -\sqrt{k^2 - \omega^2/v_{S1}^2}\, A_1 + \sqrt{k^2 - \omega^2/v_{S1}^2}\, B_1 \right) e^{i(kx_1 - \omega t)} = 0 \tag{6.273}$$

at all times t and distances x_1, hence A_1 = B_1. Before we get to condition (iii), notice that the solution in the top layer can now be written in the simpler form

$$\begin{aligned} u_2(\mathbf{r}, t) &= A_1 \left( e^{-\sqrt{k^2 - \omega^2/v_{S1}^2}\, x_3} + e^{+\sqrt{k^2 - \omega^2/v_{S1}^2}\, x_3} \right) e^{i(kx_1 - \omega t)} \\ &= 2 A_1 \cos\left( i \sqrt{k^2 - \omega^2/v_{S1}^2}\, x_3 \right) e^{i(kx_1 - \omega t)}, \end{aligned} \tag{6.274}$$

and likewise in the half space

$$u_2(\mathbf{r}, t) = A_2\, e^{-\sqrt{k^2 - \omega^2/v_{S2}^2}\, x_3}\, e^{i(kx_1 - \omega t)}. \tag{6.275}$$

Condition (iii) involves the continuity of displacements at x_3 = H; displacements on the two sides of x_3 = H are now given by (6.274) and (6.275), whose common factor e^{i(kx_1 − ωt)} cancels out, and375

$$2 A_1 \cos\left( i \sqrt{k^2 - \omega^2/v_{S1}^2}\, H \right) = A_2\, e^{-\sqrt{k^2 - \omega^2/v_{S2}^2}\, H}. \tag{6.276}$$
Condition (iii) then also involves continuity of tractions, that is, continuity of μ ∂u_2/∂x_3, which means

$$2 \mu_1 A_1\, i \sqrt{k^2 - \omega^2/v_{S1}^2}\, \sin\left( i \sqrt{k^2 - \omega^2/v_{S1}^2}\, H \right) = \mu_2 \sqrt{k^2 - \omega^2/v_{S2}^2}\, A_2\, e^{-\sqrt{k^2 - \omega^2/v_{S2}^2}\, H} \tag{6.277}$$

at x_3 = H. With this, we have no more boundary conditions left, and the two Eqs. (6.276) and (6.277) are not sufficient to determine the four parameters A_1, A_2, ω and k uniquely. This means that the physical problem we are facing does not have a single solution, but, rather, like, a spectrum of solutions. Love is clever enough to extract, from the formulae we got so far, some info re which solutions are “allowed”. He solves both (6.276) and (6.277) for the ratio A_2/A_1, getting two different expressions for the same thing, which he then equates to one another. This gives

$$\frac{2 \cos\left( i \sqrt{k^2 - \omega^2/v_{S1}^2}\, H \right)}{e^{-\sqrt{k^2 - \omega^2/v_{S2}^2}\, H}} = \frac{2 \mu_1\, i \sqrt{k^2 - \omega^2/v_{S1}^2}\, \sin\left( i \sqrt{k^2 - \omega^2/v_{S1}^2}\, H \right)}{\mu_2 \sqrt{k^2 - \omega^2/v_{S2}^2}\; e^{-\sqrt{k^2 - \omega^2/v_{S2}^2}\, H}}. \tag{6.278}$$

After some algebra,

$$\tan\left( i \sqrt{k^2 - \omega^2/v_{S1}^2}\, H \right) = \frac{\mu_2 \sqrt{k^2 - \omega^2/v_{S2}^2}}{\mu_1\, i \sqrt{k^2 - \omega^2/v_{S1}^2}}. \tag{6.279}$$
The next steps are probably easier to follow if you define a new quantity, c = ω/k, which, as you see from Eq. (6.265), coincides with the wave velocity (remember the acoustic-wave piece earlier in this chapter). Replace k with ω/c in (6.279), and

$$\tan\left( i \omega H \sqrt{\frac{1}{c^2} - \frac{1}{v_{S1}^2}} \right) = \frac{\mu_2 \sqrt{\dfrac{1}{c^2} - \dfrac{1}{v_{S2}^2}}}{\mu_1\, i \sqrt{\dfrac{1}{c^2} - \dfrac{1}{v_{S1}^2}}}, \tag{6.280}$$

or, equivalently,

$$\tan\left( \omega H \sqrt{\frac{1}{v_{S1}^2} - \frac{1}{c^2}} \right) = \frac{\mu_2 \sqrt{\dfrac{1}{c^2} - \dfrac{1}{v_{S2}^2}}}{\mu_1 \sqrt{\dfrac{1}{v_{S1}^2} - \dfrac{1}{c^2}}}. \tag{6.281}$$

In principle, given ω, we can use (6.281) to find one or more values of c (other quantities in (6.281), i.e., μ_1, μ_2, v_S1, v_S2, H, are fixed) for which the boundary conditions at x_3 = H are satisfied.
In principle, given ω, we can use (6.281) to find one or more values of c (other quantities in (6.281), i.e., μ1 , μ2 , v S1 , v S2 , H , are fixed) for which the boundary conditions at x3 = H are satisfied. Before trying to do any algebra, notice that if c > v S2 , the numerator of the right-hand side of (6.281) is imaginary (the square
root of a negative real number); the denominator is real: and so the right-hand side of (6.281) is imaginary: but its left-hand side is real: which can’t be: and so we can exclude c > v_S2, and only look for solutions with c < v_S2. Likewise, if c < v_S1 then the right-hand side of (6.281) is imaginary: and so we are only going to be looking for c > v_S1: or, putting things together, we need that

$$v_{S1} < c < v_{S2}. \tag{6.282}$$
Now, there’s no way to solve (6.281) algebraically, but we can learn quite a bit from just plotting both its left- and right-hand sides and comparing them to one
2 − 1/c2 , so that (6.281) takes the another. Before you do that, define τ = H 1/v S1 more compact form 2 2 μ2 vH2 − vH2 − τ 2 S1 S2 . (6.283) tan (ωτ ) = μ1 τ
The left- and right-hand sides of (6.283) are sketched in Fig. 6.19 against τ, for a fixed value of ω, in the interval 0 < τ < H √(1/v_S1² − 1/v_S2²), which is the same as v_S1 < c < v_S2. The right-hand side of (6.283) is always positive in that interval; it tends to +∞ for τ → 0, that is to say c → v_S1; it equals 0 when c = v_S2. Each intersection (keep looking at Fig. 6.19) of left- and right-hand side identifies a value of c for which boundary conditions are satisfied at the given value of ω: i.e., one possible solution (6.274), (6.275). You can infer that, for a given ω, the number of solutions grows with the value of H √(1/v_S1² − 1/v_S2²), i.e., with the maximum value of τ, that is to say, with the difference between v_S1 and v_S2. The lowest-c solution is called the “fundamental-mode” surface wave; additional solutions are called “overtones”376, and sorted (first overtone, second overtone, etc.) by increasing velocity. There’s as many overtones as cycles of tan(ωτ) within the allowed-τ interval. Pairs of values of ω and c determined via Fig. 6.19 can be substituted into (6.274) and (6.275), where, remember, k = ω/c. That gives us the displacements associated with the fundamental mode and with each of the overtones. How u_2(x_1, x_3, t) changes with x_3 is totally controlled by the function h(x_3), for which see Eq. (6.270), except that at this point we know that in the top layer B_1 = A_1, and in the half space B_2 = 0. So then in the top layer we have
$$\begin{aligned} h_1(x_3) &= A_1 \left( e^{-\sqrt{k^2 - \omega^2/v_{S1}^2}\, x_3} + e^{+\sqrt{k^2 - \omega^2/v_{S1}^2}\, x_3} \right) \\ &= A_1 \left( e^{-i \omega \sqrt{1/v_{S1}^2 - 1/c^2}\, x_3} + e^{+i \omega \sqrt{1/v_{S1}^2 - 1/c^2}\, x_3} \right) \\ &= 2 A_1 \cos\left( \omega \sqrt{1/v_{S1}^2 - 1/c^2}\, x_3 \right), \end{aligned} \tag{6.284}$$
Fig. 6.19 Left- (solid line) and right-hand (dashed line) sides of Eq. (6.283). Intersections identify possible values of surface-wave speed c at frequency ω, i.e., half-space surface wave “modes”. This is the graph you get if ω is 0.02 Hz, the thickness of the top layer is 200 km, and shear velocity is 2 km/s in the top layer and 4 km/s in the bottom layer, with the same rigidity in both layers. This is not particularly realistic, but it is not completely crazy, either. If you read the values of τ at the intersections, you can figure the values of c that are associated with them, via the definition of τ, i.e., via c = 1/√(1/v_S1² − τ²/H²). You should get c = 2.01 km/s for the fundamental mode, and 2.13 km/s, 2.43 km/s and 3.23 km/s for the three overtones
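If you’d rather find the intersections of Fig. 6.19 numerically than read them off the graph, here is a minimal sketch: plain-Python bisection, one root per branch of the tangent, for the same hypothetical model quoted in the caption (H = 200 km, shear velocities 2 and 4 km/s, equal rigidities, frequency 0.02 Hz, i.e. ω = 2π · 0.02 rad/s).

```python
import math

# Roots of the Love-wave dispersion relation (6.283),
#   tan(omega*tau) = (mu2/mu1) * sqrt(tau_max**2 - tau**2) / tau,
# for the model of Fig. 6.19.

H, vS1, vS2 = 200.0, 2.0, 4.0
mu_ratio = 1.0                          # mu2/mu1 (same rigidity in both layers)
omega = 2.0 * math.pi * 0.02            # rad/s
tau_max = H * math.sqrt(1.0 / vS1**2 - 1.0 / vS2**2)

def f(tau):
    """Left- minus right-hand side of Eq. (6.283); a mode is a zero of f."""
    return math.tan(omega * tau) - mu_ratio * math.sqrt(tau_max**2 - tau**2) / tau

# One root per branch of the tangent: for omega*tau in (n*pi, n*pi + pi/2),
# tan grows from 0 toward +infinity while the right-hand side decreases,
# so f changes sign exactly once there. Plain bisection is enough.
speeds = []
n = 0
while n * math.pi / omega < tau_max:
    a = n * math.pi / omega + 1e-6
    b = min((n + 0.5) * math.pi / omega, tau_max) - 1e-6
    for _ in range(80):
        m = 0.5 * (a + b)
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    tau = 0.5 * (a + b)
    speeds.append(1.0 / math.sqrt(1.0 / vS1**2 - (tau / H)**2))   # c from tau
    n += 1

print([round(c, 2) for c in speeds])    # fundamental mode first, then overtones
```

This should spit out the same four speeds as in the caption: about 2.01 km/s for the fundamental mode, and 2.13, 2.43 and 3.23 km/s for the three overtones.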
which we’ve found that c must be bigger than v S1 , so the expression we are left with is real, no imaginary part. In the half space we have h 2 (x3 ) = A2 e−ω
√
2 1/c2 −/v S2 x3
,
(6.285)
where we know now that c < v S2 and so, again, it’s all real—no worries. I implemented and plotted all this—see Fig. 6.20. In a nutshell, everything dies away fairly quickly with x3 . The overtones are oscillatory in the top layer—the higher the overtone, the more the oscillations—but also decay exponentially in the half space. A surface wave, indeed. Now, you can do Rayleigh waves in a layer-plus-half-space model, too, more or less the same way as we did Love waves, except the algebra is more complicated. There’s ways to do the maths of both Love and Rayleigh waves in models with multiple layers, too377 ; but that’s too much for this book, I think. The bottom line is that everything we’ve just learned for Love waves works for Rayleigh waves, too—there’s a discrete set of “modes”, displacement vanishes with depth similar to Fig. 6.20, etc.
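For what it’s worth, a plot like Fig. 6.20 is a few lines of code: evaluate (6.284) in the layer and (6.285) in the half space, with A_2 fixed by continuity at x_3 = H. A sketch for the model of Fig. 6.19; the fundamental-mode speed c = 2.013 km/s is read off that figure (an assumption of mine, not a value computed here), and A_1 = 1 is as arbitrary as in Fig. 6.20.

```python
import math

# Love-wave amplitude vs depth, fundamental mode, layer-over-half-space
# model of Fig. 6.19: H = 200 km, vS1 = 2 km/s, vS2 = 4 km/s, 50-s period.

H, vS1, vS2 = 200.0, 2.0, 4.0
omega = 2.0 * math.pi * 0.02            # rad/s
c = 2.013                               # km/s, fundamental-mode speed (from Fig. 6.19)
A1 = 1.0                                # arbitrary, like in Fig. 6.20

gamma1 = math.sqrt(1.0 / vS1**2 - 1.0 / c**2)   # real, because c > vS1
gamma2 = math.sqrt(1.0 / c**2 - 1.0 / vS2**2)   # real, because c < vS2

# Continuity of displacement at x3 = H, Eq. (6.276), fixes A2 in terms of A1:
A2 = 2.0 * A1 * math.cos(omega * gamma1 * H) * math.exp(omega * gamma2 * H)

def h(x3):
    """Amplitude vs depth: cosine in the layer, Eq. (6.284); decaying
    exponential in the half space, Eq. (6.285)."""
    if x3 <= H:
        return 2.0 * A1 * math.cos(omega * gamma1 * x3)
    return A2 * math.exp(-omega * gamma2 * x3)

for depth_km in (0, 50, 100, 150, 200, 250, 300):
    print(depth_km, round(h(depth_km), 4))
```

The amplitude starts at 2A_1 at the surface, decreases through the layer (the fundamental mode doesn’t oscillate; overtones would), is continuous at 200 km, and then dies off exponentially: the depth decay of Fig. 6.20.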
Fig. 6.20 How Love-wave displacement changes in the vertical direction, at any time t and location x1 , in the layer-over-half-space model of Fig. 6.19: i.e., the function h(x3 ). To make this plot I’ve taken A1 = 1—totally arbitrary, for lack of better info (like, the size of the quake or stuff like that)—and I’ve gotten A2 from A1 via the requirement that h(x3 ) be continuous at x3 = H
6.14 Dispersion, Group, Phase

Before I wrap surface waves up (for the time being), there’s one last thing that I need to point out. So far, to show you what Love waves are like, I’ve picked a value for the frequency, ω, and then I’ve shown you how you can find the multiple Love waves that can exist at that frequency—each with its speed c, its depth-dependent amplitude h, etc. (And, like I am saying, we could do the exact same thing for Rayleigh waves.) We can also look at this from another angle, and ask how each of those values of c changes when the frequency of the surface wave changes. It amounts to repeating what we just did for a whole bunch of values of ω, then plotting all the c’s you get versus ω, and the result is the set of curves in Fig. 6.21: the fundamental-mode one, and then one per overtone. When we did ocean waves earlier in this chapter, which are basically surface waves in water, we saw that their speed of propagation depends on their frequency, too (or, their wavenumber depends on their frequency; but then again, speed is just frequency divided by wavenumber, so...). I mentioned that this phenomenon is called dispersion, and that waves with this property are said to be dispersive378, etc. The dispersion of seismic surface waves is not too difficult to spot on seismograms, particularly those that are recorded at stations very far from the epicenter of the quake: see Fig. 6.22.
Fig. 6.21 Love-wave velocity c versus period T, for the layer-plus-half-space model of Figs. 6.19 and 6.20. (At T = 50 s, i.e., ω = 0.02 Hz, you’ve got the four modes of Figs. 6.19 and 6.20.) Plots like this are called dispersion curves. I’ve decided to plot against period, which is just the inverse of frequency. I’ve always found it easier to think of a surface wave in terms of how long one cycle of oscillation lasts, rather than how many cycles over time. Not sure why; maybe because they are so slow (like, of the order of a minute or so, at least for big, faraway quakes)

Fig. 6.22 Dispersion of a surface wave. The longer the distance from the source, the wider the separation, in time, between different frequencies
To see what dispersion entails, let’s start with the simplest example possible: a surface wave that consists of two “modes”, or whatever you want to call them: two sinusoidal signals at slightly different frequencies ω − δω and ω + δω, and slightly different “wavenumbers” k − δk and k + δk, respectively (remember k = ω/c). For simplicity, let them be just cosines; the total displacement then is
$$\begin{aligned}
u_2 &= \cos\left[(k-\delta k)x_1 - (\omega-\delta\omega)t\right] + \cos\left[(k+\delta k)x_1 - (\omega+\delta\omega)t\right] \\
&= \cos\left[(kx_1 - \omega t) - (\delta k\, x_1 - \delta\omega\, t)\right] + \cos\left[(kx_1 - \omega t) + (\delta k\, x_1 - \delta\omega\, t)\right] \\
&= 2\cos(kx_1 - \omega t)\cos(\delta k\, x_1 - \delta\omega\, t),
\end{aligned} \qquad (6.286)$$
Fig. 6.23 Top: two sinusoidal waves with slightly different frequencies, at fixed x1 = 0 and time from 0 to 60 s. Bottom: their sum, cf. Eq. (6.286)
where I’ve resorted once again to the property of the cosine that cos(α − β) + cos(α + β) = 2 cos α cos β. Equation (6.286) says that, when two surface waves propagating at very close frequencies ω − δω and ω + δω are combined379, the resulting displacement can be written as the product of two sinusoidal waves: one with frequency ω (which is the average of the frequencies of the two waves we started out with) and speed ω/k, the other with a much lower frequency δω, and speed δω/δk, which is approximately the same as dω/dk. To see what this means in practice, check out Fig. 6.23; the low-frequency/long-period signal is said to “modulate” the higher-frequency wave. Imagine that ω is within the audible frequency range, and that what we are looking at are pressure waves in the atmosphere (which we learned about earlier in this chapter); δω, which is way lower than ω, is out of the audible range; then what you hear is a “tone” (a single-frequency sound—ω is the frequency that your ears perceive) whose volume goes up and down380 at a rate δω: which is what people call the “modulation frequency” (more when talking about radio waves than seismic waves, really), while ω is the “carrier frequency” (radio slang, too). Seismologists would rather speak of group and phase381: the group is the envelope, the low-frequency thing, the modulation, and the speed at which it travels, dω/dk, is the group velocity—while ω/k is the phase velocity382. I realize that this can be confusing, so I’ve made Fig. 6.24, too, which shows three “snapshots” of the modulated wave, taken at three different times: and you should see that the envelope of the whole thing—the group—propagates with its own speed—the group velocity: it gets shifted from the left to the right in subsequent snapshots.
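A quick numerical sanity check of Eq. (6.286) and of the group/phase distinction; the specific numbers for k, ω, δk, δω below are arbitrary illustration values of mine, not taken from the book.

```python
import math

# Numerical check of Eq. (6.286): two cosines at nearby frequencies combine into
# a carrier at (k, omega) modulated by a slow envelope at (dk, domega).
k, omega = 1.0, 2.0
dk, domega = 0.05, 0.08

def two_waves(x, t):
    # left-hand side of (6.286): the two "modes" added together
    return (math.cos((k - dk) * x - (omega - domega) * t)
            + math.cos((k + dk) * x - (omega + domega) * t))

def carrier_times_envelope(x, t):
    # right-hand side of (6.286): carrier times modulation
    return 2.0 * math.cos(k * x - omega * t) * math.cos(dk * x - domega * t)

# the two expressions agree at arbitrary points in space and time...
for x, t in [(0.0, 0.0), (1.3, 2.7), (10.0, 5.5), (-4.2, 8.1)]:
    assert abs(two_waves(x, t) - carrier_times_envelope(x, t)) < 1e-12

# ...and the two factors travel at different speeds:
phase_speed = omega / k      # carrier speed, omega/k
group_speed = domega / dk    # envelope speed, delta-omega/delta-k ~ d(omega)/dk
print(phase_speed, group_speed)
```

With these numbers the carrier moves at 2.0 and the envelope at 1.6, i.e., phase and group velocity differ, which is all dispersion means.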
Fig. 6.24 From top to bottom, the modulated wave of Fig. 6.23, shown as a function of distance along the direction of propagation x1, at three successive instants in time: t = 0, 10, 20 s
Now let’s be totally general and look at the combination of a “packet” of Love or Rayleigh waves, each with its own frequency, covering a continuous range of frequencies. Which means that, if a single-frequency wave is written S(ω) cos(kx1 − ωt), with S a generic amplitude term, then we are going to be looking at
$$u(x_1, t) = \int_{\omega_0-\varepsilon}^{\omega_0+\varepsilon} S(\omega)\cos(k x_1 - \omega t)\, d\omega, \qquad (6.287)$$
where ω0 is some reference frequency—basically you decide to look at an interval of frequencies around some ω0, with ε ≪ ω0. To save some space in the steps that follow, introduce ψ = kx1 − ωt; then, because ε is small, it makes sense to replace ψ(ω) with its Taylor expansion around ω0, i.e.,
$$\psi(\omega) \approx \psi(\omega_0) + (\omega - \omega_0)\frac{d\psi}{d\omega}(\omega_0). \qquad (6.288)$$
Sub that into (6.287) (assuming that S doesn’t change significantly between ω0 − ε and ω0 + ε), and
$$\begin{aligned}
u(x_1,t) &\approx \mathrm{Re}\int_{\omega_0-\varepsilon}^{\omega_0+\varepsilon} S(\omega_0)\, e^{i\psi(\omega_0)}\, e^{i(\omega-\omega_0)\frac{d\psi}{d\omega}(\omega_0)}\, d\omega \\
&= \mathrm{Re}\left[ S(\omega_0)\, e^{i\psi(\omega_0)}\, e^{-i\omega_0\frac{d\psi}{d\omega}(\omega_0)} \int_{\omega_0-\varepsilon}^{\omega_0+\varepsilon} e^{i\omega\frac{d\psi}{d\omega}(\omega_0)}\, d\omega \right] \\
&= \mathrm{Re}\left[ S(\omega_0)\, e^{i\psi(\omega_0)}\, e^{-i\omega_0\frac{d\psi}{d\omega}(\omega_0)}\; \frac{e^{i(\omega_0+\varepsilon)\frac{d\psi}{d\omega}(\omega_0)} - e^{i(\omega_0-\varepsilon)\frac{d\psi}{d\omega}(\omega_0)}}{i\,\frac{d\psi}{d\omega}(\omega_0)} \right] \\
&= \mathrm{Re}\left[ S(\omega_0)\, e^{i\psi(\omega_0)}\; \frac{e^{+i\varepsilon\frac{d\psi}{d\omega}(\omega_0)} - e^{-i\varepsilon\frac{d\psi}{d\omega}(\omega_0)}}{i\,\frac{d\psi}{d\omega}(\omega_0)} \right] \\
&= \mathrm{Re}\left[ S(\omega_0)\, e^{i\psi(\omega_0)}\; \frac{2i\sin\!\left(\varepsilon\frac{d\psi}{d\omega}(\omega_0)\right)}{i\,\frac{d\psi}{d\omega}(\omega_0)} \right] \\
&= 2 S(\omega_0) \cos\left[\psi(\omega_0)\right]\, \frac{\sin\!\left(\varepsilon\frac{d\psi}{d\omega}(\omega_0)\right)}{\frac{d\psi}{d\omega}(\omega_0)}.
\end{aligned} \qquad (6.289)$$
This, now, is compact enough that we can replace ψ and its ω-derivative with their explicit forms, i.e., ψ = kx1 − ωt and
$$\frac{d\psi}{d\omega} = \frac{d}{d\omega}\left(k x_1 - \omega t\right) = \frac{dk}{d\omega}\, x_1 - t. \qquad (6.290)$$
(Yes, k is a function of ω, because we’ve just found that, given an earth model that isn’t a simple uniform half space, there exists only a limited range of possible solutions, i.e., only a discrete set of possible k’s per value of ω.) Substituting into (6.289), we find that our multiple-frequency surface wave looks like
$$u(x_1,t) \approx 2 S(\omega_0)\cos(k x_1 - \omega_0 t)\; \frac{\sin\!\left[\varepsilon\left(\frac{dk}{d\omega}(\omega_0)\, x_1 - t\right)\right]}{\frac{dk}{d\omega}(\omega_0)\, x_1 - t}, \qquad (6.291)$$
or383
$$u(x_1,t) \approx 2 S(\omega_0)\cos(k x_1 - \omega_0 t)\; \frac{\sin\!\left[\varepsilon\left(\frac{x_1}{\frac{d\omega}{dk}(\omega_0)} - t\right)\right]}{\frac{x_1}{\frac{d\omega}{dk}(\omega_0)} - t}. \qquad (6.292)$$
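The narrow-band packet result can itself be checked numerically: integrate the band of cosines in (6.287) by quadrature and compare with the closed form of (6.289). In the sketch below I take k(ω) = ω/c0 with c0 constant (no dispersion), a case for which the formula happens to be exact; all numbers are arbitrary choices of mine.

```python
import math

# Check (6.287) vs (6.289): numerical integral of S*cos(k(w)*x1 - w*t) over
# [w0 - eps, w0 + eps] against 2*S*cos(psi(w0))*sin(eps*a)/a, a = dpsi/dw(w0).
# Here k(w) = w/c0 (constant speed c0), so dpsi/dw = x1/c0 - t exactly.
S, c0 = 1.0, 3.0
w0, eps = 1.0, 0.05
x1, t = 40.0, 7.0

def integrand(w):
    return S * math.cos(w / c0 * x1 - w * t)

# trapezoid rule over the narrow frequency band
n = 2000
h = 2.0 * eps / n
num = 0.5 * (integrand(w0 - eps) + integrand(w0 + eps))
num += sum(integrand(w0 - eps + i * h) for i in range(1, n))
num *= h

a = x1 / c0 - t    # dpsi/domega evaluated at w0, per Eq. (6.290)
approx = 2.0 * S * math.cos(w0 / c0 * x1 - w0 * t) * math.sin(eps * a) / a
print(abs(num - approx) < 1e-8)  # True: quadrature and closed form agree
```

With a genuinely dispersive k(ω) the closed form would only be approximate, with the error controlled by the size of ε, which is the whole point of the Taylor step (6.288).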
The right-hand side of Eq. (6.292) should remind you of (6.286): it is the product of a wave of frequency ω0 and speed ω0 /k, times one of much lower frequency ε, and
Fig. 6.25 One-and-a-half-hour-long recording of the great Tohoku earthquake of March 2011, made at the HRV station near Harvard, Massachusetts. Seismologists, for reasons that we don’t need to get into, like to plot ground velocity, rather than displacement, so that is what I did, too. Top: vertical component; middle: north-south component; bottom: east-west component. Zero time is the moment when the fault breaks, reconstructed as precisely as possible from all available data
speed dω/dk. Again, the wave traveling at speed dω/dk is the modulation factor—or, rather, dω/dk is the speed at which a whole packet of surface waves propagates: it’s the group velocity. To conclude this chapter, let’s take a look at a real, three-component seismogram: see Fig. 6.25. I haven’t talked about it at all, but you should realize that, in all this, what happens at the source is very important. A quake—a huge mass of rock fracturing and slipping along a fault—is a pretty messy event, and this is reflected by the seismograms, which are modulated by some complex “source time function”, as seismologists call it. We are not surprised, then, that things in Fig. 6.25 are nowhere near as neat as in, like, Figs. 6.22, 6.23 and 6.24. I hope you can still recognize the P, coming in before everything else at around 15 min, mostly on the vertical component384; the S at 25 min or so, mostly on the other components; the Love wave on the horizontal components beginning at, I don’t know, 40 min; and finally the
Rayleigh wave (Rayleigh waves are always slower than Love waves at the same frequency), on the vertical and radial components, beginning at about 45 min. You can also tell that lower-frequency surface-wave energy comes in earlier than higher-frequency energy: surface-wave dispersion. In this example source and receiver are very far apart (Japan and Massachusetts), so the body-wave energy at this point is much more spread out than the surface-wave energy, and the amplitude of the Love and Rayleigh waves is much larger than that of the body waves.
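One last numerical aside on dispersion before we change subject. For the deep-water ocean waves met earlier in this chapter, the dispersion relation is ω = √(gk) (assuming that standard deep-water form), and group velocity is exactly half the phase velocity; the centered finite difference in this sketch is also how you would estimate dω/dk from any sampled dispersion curve, Love and Rayleigh waves included.

```python
import math

# Group vs phase velocity for deep-water gravity waves, omega = sqrt(g*k):
# a case where the two velocities differ by exactly a factor of two.
g = 9.81  # m/s^2

def omega(k):
    return math.sqrt(g * k)

k0, dk = 0.1, 1e-6
phase = omega(k0) / k0                                   # omega/k
group = (omega(k0 + dk) - omega(k0 - dk)) / (2.0 * dk)   # centered difference ~ d(omega)/dk
print(round(group / phase, 4))  # 0.5: the envelope lags the crests
```

This is why, in a dispersive medium, individual crests can be seen running through the packet: they move at the phase velocity while the packet as a whole moves at the (here slower) group velocity.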
Chapter 7
The Structure of the Earth: Seismology
“Of all regions of the earth none invites speculation more than that which lies beneath our feet, and in none is speculation more dangerous; yet, apart from speculation, it is little that we can say regarding the constitution of the interior of the earth. We know, with sufficient accuracy for most purposes, its size and shape: we know that its mean density is about 5½ times that of water, that the density must increase towards the centre, and that the temperature must be high, but beyond these facts little can be said to be known. Many theories of the earth have been propounded at different times: the central substance of the earth has been supposed to be fiery, fluid, solid, and gaseous in turn, till geologists have turned in despair from the subject, and become inclined to confine their attention to the outermost crust of the earth, leaving its centre as playground for mathematicians. “The object of this paper is not to introduce another speculation, but to point out that the subject is, at least partly, removed from the realm of speculation into that of knowledge by the instrument of research which the modern seismograph has placed in our hands. [...] The seismograph, recording the unfelt motion of distant earthquakes, enables us to see into the earth and determine its nature with as great a certainty, up to a certain point, as if we could drive a tunnel through it and take samples of the matter passed through.”385 This is the second chapter in a row, in this book, that talks about seismology. So then maybe people will say that is because I, the author, am a seismologist myself, and so I am biased in thinking that my discipline is more important than the others. But I don’t think that’s the case. Because, yes, I did a Ph.D.
in seismic tomography, which indeed is the discipline, or the method or the set of methods by virtue of which we can use seismic waves to make maps of the properties (seismic velocities: and maybe also density, rigidity, compressibility; maybe temperature) of the interior of the earth: which is what this chapter is about. But, to be honest, I don’t really do those things anymore, or only very little, because I got really bored with it after my Ph.D., maybe even during my Ph.D., and so I started a long-term process that within, I don’t

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 L. Boschi, Our Concept of the Earth, https://doi.org/10.1007/978-3-031-71579-2_7
know, ten years (because despite what some people might say, if you want to have a career in academia the worst thing you can do is drift away from your thesis topic, because your thesis topic is what you are known for and you’re only really taken seriously when you talk about your thesis topic, and more so if your Ph.D. supervisor is influential. And your thesis topic, and/or the name of your thesis supervisor, is what will ultimately land you a job; which, if you happen to be an academician yourself, you know what I am talking about, right?) would bring me to operate in a really different field, like musical acoustics386 or whatever387. The quotation I started out with is from R. D. Oldham, again, but this is not his major 1900 paper that we’ve looked at in the previous chapter, but another paper, published in 1906, which is much shorter, and illustrates a very specific exercise in data analysis, namely: similar to the kind of stuff Oldham had done in the 1900 paper, take a bunch of quakes and plot the arrival time of the P and S waves (at this point, Oldham calls them first and second phases of preliminary tremors, respectively) versus distance between epicenter and recording instrument. What Oldham came up with, with the data he was able to collect in 1906, is the diagram in Fig. 7.1. You see from Fig. 7.1 that the travel times of the first and second phase of preliminary tremors are aligned each along its respective curve, the first preliminary tremor, i.e., the P-wave curve, being below the S-wave curve, of course (P waves are faster). There are fluctuations in the data, and even in their averages, along both curves; which might partly reflect errors made by people when they read the travel times on seismograms388; but “another possible source of discrepancy”, says Oldham, “is the possibility that the rate of propagation is not uniform in every direction, and that the time taken by wave-motion in travelling, say from Japan to Europe, is
Fig. 7.1 After Oldham (1906): circled dots denote averages, taken at epicentral distances where a relatively large number of data points are found
different from that taken by the same form of wave-motion in travelling from an equal distance in America”: or in other words, even the deep earth might be heterogeneous, at least to some extent, as we shall see more clearly later—at this point, Oldham is just speculating.
7.1 The Earth’s “Core”, Seen by Seismic Waves

All this having been said, some “features”389 of the data in Fig. 7.1 look pretty “robust”. The rate of increment in distance with respect to time in both curves changes (grows) with source-receiver distance, i.e., the curves are not flat, but “convex”, rather: which, if you remember Oldham’s reasoning from his 1900 paper, means that those phases can’t be surface waves, and instead they must be body waves sampling the deep interior of the earth. And, similar to the previous chapter, we can estimate the speed of P and S waves deep inside the earth, and even how it changes with depth, etc., on the basis of how long it takes each phase to travel from the source to the various receivers, etc. But that’s not the main point of this paper. The point is that “it will be seen that the time-curves of the first two phases are very similar in shape, up to 120° from the origin; but beyond this they differ radically in form. That of the first phase, after an irregularity between 130° and 140°, becomes very flat and proceeds almost horizontally from 150° to 180°; that of the second phase comes to an end at 130° from the origin, and is continued some 11 min farther up”, i.e., as you’ve probably noticed right away, the S-wave curve goes through, like, a major jump, a discontinuity, when source-receiver distance—on the horizontal axis of Fig. 7.1—is about 130°, on the immediate left of which we have a few data points showing travel times of about half an hour, and on the immediate right, after a gap of 10° or so, of about forty-five minutes, which obviously is a lot longer. This is a pretty strange and unexpected observation and not easily explained, and the fact that the P-wave curve doesn’t show anything like that adds, perhaps, to the mystery. Here is, essentially, what Oldham concludes: he concludes, first, that, at a depth that is more than the max depth touched by the path of the S-wave (look at the drawing in Fig.
7.2; initially, just look at the straight ray paths that don’t get “deflected” by “discontinuities”... we’ll get to the broken lines in a sec; you should realize that the max depth reached by a wave path, the “chord”, grows with the angle “subtended” by the chord) that travels 130◦ along the surface, S-wave velocity must drop, i.e., beyond that depth S-waves propagate at a much slower rate, which accounts for their appearing much later than at receivers that are only a few degrees closer to the source. Secondly: one can make a rough estimate for that depth, assuming the S-wave velocity doesn’t change much above and below it, so that within each shell wavefronts would be spherical and ray paths straight, so that it’s easy to know the distance traveled by the wave—we measure it along the straight ray paths, which are the chords that subtend the angular distances that we’ve used so far—see again Fig. 7.2—and so that we can first estimate velocity in the top layer: just divide length of chord by propagation time. Then for the paths that “sample” what Oldham begins to call the “core”: (i)
you subtract the length of the wave path within the top layer from its total length (this gives you the length of the portion of the wave path that’s within the core); (ii) you divide the length within the top layer by the top-layer velocity, and obtain the time spent by the S wave in the top layer; (iii) you subtract the time spent in the top layer from the total propagation time, and you are left with the time spent by the S wave within the core: which, (iv) if you divide the length of the wave path within the core by it, finally, you get your estimate of S-wave velocity within the core, which according to Oldham “drops from an average of over 6 [above the core] to about 4 1/2 [in the core] kilometer per second.” There is, of course, at least one problem with all this, and it is that the “reduction in speed” across the core boundary “means [...] a great deviation of the wave-paths as they enter the central core. As a first approximation to the actual course of the wave-paths, I give [...] a representation [Fig. 7.2] of what they would be on the supposition of a central core, occupying .4 of the radius, in which the rate of propagation is one half of that in the outer shell: in this it will be seen that the wave-paths emerging at 150° reach their emergence after passing on the opposite side of the centre”. We’ll see the details of this shortly, but basically a big change in wave speed means a big deformation of the wave front/deflection of the wave path, similar to what you see in Oldham’s sketch, Fig. 7.2, for the paths that actually enter into the core. Ultimately that’s a bit of a mess, because if velocity changes that much, then that also results in a major deflection of the wave fronts (and/or wave paths) across the discontinuity, so that the waves that we see emerging on the opposite side of the core would have traveled a much longer path than if there were no deflection, and that would also affect our estimate of the velocity and so on and so forth.
And Oldham is aware of the problem, clearly, but, to my understanding, doesn’t try to fix it. I guess it’s enough for him to have shown that, in any case, there must be a major drop in velocity when entering the core; which incidentally, speculates Oldham, might also mean that the material that makes up the core is very different from whatever is between us and the core: meaning not just a change in temperature and density, which was pretty much expected with increasing depth throughout the planet, but a change in the chemistry, in the chemical composition of the “rocks” in the core versus (most of) the rest of the earth, which we might begin to call “mantle”. Finally, the last of Oldham’s major conclusions: you can also make an estimate of the size of the core, because “we have seen that [the core] is not penetrated by the wave-paths which emerge at 120◦ [see above: curves maintain their character all the way to 120◦ —so, up to that point, they can’t be sampling the core]; and the great decrease at 150◦ shows that the wave-paths emerging at this distance have penetrated deeply into it. Now, the chord of 120◦ reaches a maximum depth from the surface of half the radius”, which would be a first, very rough estimate for the size of the core390 . Much of what Oldham says is still valid today, and his way of interpreting observations is, like, a blueprint for tons of seismology papers that would come after. Later in this chapter we shall see that S waves do not actually travel through the (outer part of) the earth’s core at all, for reasons that will become clear as we get there; so the weird shape of the S travel time curve in Fig. 7.1 will have to be interpreted in some other way391 : but let’s not get ahead of ourselves.
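Oldham’s four-step recipe is easy to put into numbers, if we take his straight-chord ray paths literally. The sketch below forward-computes a travel time from his round figures (core at 0.4 of the radius, shell velocity over 6 km/s, core velocity about 4½ km/s) and then runs steps (i)–(iv) to recover the core velocity; the 150° path is an illustrative choice of mine, not one of his data. It also checks his core-size bound: a chord of 120° bottoms out at half the radius.

```python
import math

# Oldham's straight-chord estimate of the core S velocity.
R = 6371.0          # km, Earth radius
r_core = 0.4 * R    # Oldham's supposition: core occupies 0.4 of the radius
v_top = 6.0         # km/s, S velocity in the outer shell ("over 6" in Oldham)
delta = math.radians(150.0)  # epicentral distance of a core-crossing path (my choice)

# Straight chord subtending delta: total length and closest approach to the center.
L_total = 2.0 * R * math.sin(delta / 2.0)
d_min = R * math.cos(delta / 2.0)                 # < r_core for a core-crossing path
L_core = 2.0 * math.sqrt(r_core**2 - d_min**2)    # (i) chord segment inside the core
L_shell = L_total - L_core

# Forward problem (a stand-in for an observed travel time): slower core.
v_core_true = 4.5   # km/s, "about 4 1/2" in Oldham
T_obs = L_shell / v_top + L_core / v_core_true

# Oldham's inversion, steps (ii)-(iv):
t_shell = L_shell / v_top   # (ii) time spent above the core
t_core = T_obs - t_shell    # (iii) time left over for the core leg
v_core = L_core / t_core    # (iv) core S velocity
print(round(v_core, 2))     # recovers 4.5 by construction

# Core-size bound: a 120-degree chord bottoms at depth R*(1 - cos(60 deg)) = R/2.
depth_120 = R * (1.0 - math.cos(math.radians(120.0) / 2.0))
print(round(depth_120 / R, 2))  # 0.5
```

The round trip is of course circular as a test of physics; its point is only to make the chord geometry of steps (i)–(iv) explicit.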
Fig. 7.2 After Oldham (1906): “As a first approximation to the actual course of the wave-paths,” explains Oldham, “I give [...] a representation of what they would be on the supposition of a central core, occupying .4 of the radius, in which the rate of propagation [AKA the seismic wave speed] is one half of that in the outer shell”. All that was used to make this figure was, I think, Snell’s law (see below), and a protractor (which is a simple tool to measure/draw angles)
If we look at Oldham’s 1906 paper from the point of view of the seismology that we did in Chap. 6 (Stokes, Rayleigh, Love, etc.), it’s maybe surprising to see that, in the end, all the convoluted theory that we covered is not really needed to make the inferences that Oldham starts to make about the properties of the interior of the Earth: once you know for sure that what you observe when you look at a seismogram are P waves, followed by S, followed by Love, followed by Rayleigh waves—and to know that for sure, yes, you do need the theory—once you know that for sure, you can use seismic data to “constrain” the structure of the earth by means of just the laws of “geometrical optics”. Oldham says it very clearly: “Wave-motion originating at any point in the earth will be propagated in all directions from it, and whatever the nature of these waves their wave-paths will be straight lines so long as the velocity of propagation remains constant; [...] if [the velocity] varies, the course of the wavepath will be altered according to the laws of refraction, which are to be found in every text-book of physics.” So, yes, the laws of refraction form a subset of the laws of geometrical optics, which will be the topic of the next part of this chapter.
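To get a feel for how much a halving of the wave speed bends a ray, here is Snell’s law applied at Oldham’s hypothetical core boundary; the incidence angles are arbitrary values of mine.

```python
import math

# Snell's law at Oldham's supposed core boundary, where "the rate of propagation
# is one half of that in the outer shell": sin(theta2) = (v_core/v_shell)*sin(theta1).
v_ratio = 0.5  # core speed / shell speed, Oldham's supposition

def refracted(theta1_deg):
    # angle (from the normal) of the transmitted ray, in degrees
    return math.degrees(math.asin(v_ratio * math.sin(math.radians(theta1_deg))))

for theta1 in (20.0, 40.0, 60.0):
    # entering a slower medium bends the ray sharply toward the normal
    print(theta1, round(refracted(theta1), 1))  # -> 9.8, 18.7, 25.7
```

That strong bending toward the normal is exactly the “great deviation of the wave-paths as they enter the central core” that Oldham drew with his protractor in Fig. 7.2.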
7.2 The “Moho”

Before we move on to that, though, here’s another example, contemporary with Oldham, of a pretty important inference about the earth, made in about 1909 by interpreting, again, some seismic recordings via geometrical optics. The topic is the
crust of the earth—you’ll remember this word has come up many times in this book already, initially (Chap. 2) meaning the outer shell of the earth that has already solidified while the deeper part of the planet, people thought, is still “fluid”; later this idea would have to be revised and then of course in any case the final blow to any remaining claim that most of the earth be fluid was dealt by papers like those of Oldham that we are looking at now, showing that S waves propagate at least half the way to the center of the earth (remember: S can’t propagate in fluids: in Chap. 6 we worked out the equation of motion in fluids, and we found no such thing as shear waves): so that the crust, in that old sense of the word, would have to be extremely thick, would have to extend at least all the way down to the outer boundary of the core, or (if Oldham were right and S waves did propagate into the core) even all the way to the center. Although... but I digress. The point I was trying to make is that, in Oldham’s time one wouldn’t call “crust” the solid, as opposed to fluid, region of the earth, because at that point people were pretty much sure that most and maybe all of the earth was solid, anyway. But the word crust continues to be in use and is associated with a thin, outermost shell, right below our feet; and Oldham, possibly for the first time, defines this shell “seismically”: because looking at seismograms recorded in the vicinity of quakes, one observes (page 457 of Oldham’s 1906 paper, again) “that no simple form of wave-motion can be transmitted through the heterogeneous rocks forming the outermost crust of the earth, and that the records from instruments situated near the origin of an earthquake cannot show any sorting-out of different kinds of wave-motion. 
It is only in more homogeneous material that this sorting-out can take place, and it is only at a distance of 10° of arc, or about 700 mi, from the origin that the three-phase character of the record begins to appear. The waves emerging at this distance have evidently traversed more homogeneous material for a part of their course; in this part there has been a sorting-out of the forms of wave-motion into which the disturbance has been converted, and the fact that the sorting-out can be detected at so comparatively small a distance from the origin shows that the outer crust must be, comparatively, very thin. I have not been able to collect sufficient data for an accurate estimate of its thickness, but this cannot be more than a score of miles392, and below it comes material of a very different character, which not only allows a sorting-out of different forms of wave-motion, but [...] transmits these at a velocity much greater than is met with in the outer crust.” So, for Oldham, crust means a thin layer with lots of heterogeneity, waves being reflected all over the place (again, the laws of geometrical optics...) so that if you look at a seismic record from an instrument deployed relatively close to an epicenter, the main arrivals of P and S will be followed by all sorts of reflected phases that are impossible to tell from one another—the so-called “coda”393. Instruments far from the epicenter record P and S waves that have traveled through the deeper earth (think the chords in Fig. 7.2) and are separated from one another by a “quiet” interval without major wiggles: Oldham figures that whatever is below the crust is, more or less, homogeneous—no sharp obstacles that would create additional seismic phases—and that, because you don’t need to be too far from an epicenter to get nicely separated P and S, the crust should be pretty thin (“cannot be more than a score of miles”).
But: “I have not been able to collect sufficient data for an accurate estimate of its thickness”. Enter
Andrija Mohorovičić, Croatian, a prof. of astronomy and geophysics at the University of Zagreb. According to a memoir published in Historical Seismologist394, “at the beginning of his career Mohorovičić focused on meteorology”, but “about the turn of the century [his] scientific interest turned almost exclusively to seismology, although he continued to spend a tremendous amount of time on routine meteorological observatory duties. It is indeed remarkable for a scientist who decided to turn his career upside-down in his mid-forties by starting research from scratch in a new, almost nonexistent field in his country, to subsequently achieve such an international reputation. The reason for this dramatic change is not known—one can only speculate that intense seismic activity around the Croatian capital in the late 19th century ignited the spark in his curious mind. [...] The meticulous analyses of recordings of the Kupa Valley earthquake of 8 October 1909 [...] enabled him to prove the existence of the crust-mantle boundary, which later became known as the Mohorovičić discontinuity, popularly known as the Moho”: which is the pretty important observation that I wanted to tell you about. What happened in 1909 is that a biggish earthquake struck about 40 km away from Zagreb, so Mohorovičić had some good records of it from his own station, and he decided to study them together with all the additional recordings of the same quake that he could collect from instruments around Europe—unlike Oldham, who compiled global travel-time curves including observations made almost all the way to the antipodes of the epicenter, Mohorovičić is happy with stations that are at most, like, a thousand km away, or so.
There’s a lot of commentary in Mohorovičić’s 1910 paper395, about P, S (which incidentally, this paper might be the first where compressional waves are called P, or undae primae, as Mohorovičić also says, and shear waves are called S or undae secundae) and surface-wave phases, which Mohorovičić tries to identify in his data, and the depth of the source396, etc., but the main point as far as we are concerned (the point, anyway, that earned Mohorovičić at least one paragraph in all geophysics textbooks) is that, looking at all those data, Mohorovičić noticed that in seismograms at distances between 300 and 700 km (which roughly means 3° to 7°: much shorter than what we’ve seen in Oldham’s papers) “two kinds of first preliminary waves” (which is just another way of saying P waves, S being the “second preliminary waves”) are observed. More precisely, two kinds of P waves “exist, both reaching all locations from 300 to 700 km distance”; but “from the epicenter to approximately 300 km distance only the first kind arrives, whereas from 700 km distance onward only the second kind arrives”. See the diagram in Fig. 7.3: Mohorovičić calls “first preliminary waves of the second kind”, or undae primae inferiores, those that are the fastest to arrive, while undae primae superiores397, wherever they coexist with the inferiores, are slower to arrive. This is obviously an important observation, and a “robust” one (it would be confirmed, later, in other papers by many other people). But Mohorovičić’s discussion of it is, IMHO, somewhat obscure. He seems to realize clearly, though, that to explain the existence of a previously unknown seismic phase we need to introduce a previously unknown, or not clearly defined, discontinuity: which later, actually, came to be called the Mohorovičić discontinuity, or Moho for short. Essentially, Mohorovičić figured that if wave speed under the discontinuity is much higher than above it, waves refracted into that lower layer, and
Fig. 7.3 Travel time curves (distance on the horizontal axis and time on the vertical one) observed by Mohorovičić in 1909, after Jarchow and Thompson (1989). Mohorovičić’s diagram has much more information; Jarchow and Thompson show just the phases that we care about right now: undae primae superiores (or P), which are the first to hit the receiver at distances below 300 km, and undae primae inferiores (or Pn), that precede P starting at 300 km; and the same for undae secundae, S and Sn, superiores and inferiores. (Used with permission of Annual Reviews, Inc., from Craig Jarchow and George Thompson, “The Nature of the Mohorovičić Discontinuity”, Annual Reviews of Earth and Planetary Sciences, vol. 17, 1989; permission conveyed through Copyright Clearance Center, Inc.)
then reflected back up (by some other, deeper discontinuity, or something), might travel from source to receiver in a shorter time than waves that propagate directly from source to receiver without leaving the top layer398. In the years that followed, people would figure out that the Moho exists not only under Europe, but everywhere globally. The word “crust” in the earth sciences became synonymous with the global shell that goes from the surface of the earth down to the Moho. People also figured out that the depth of the Moho changes quite a bit across the globe: Mohorovičić estimated about 50 km using data from continental Europe, but under large mountain ranges like the Himalaya, or the Andes, it would turn out to be even deeper (as much as, like, 80 km or so), while under the oceans it is much shallower: less than 10 km. The way the depth of the Moho is estimated is, once again, through the laws of geometrical optics: which, finally, is what I am going to talk about next399.
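Mohorovičić’s two P arrivals make quantitative sense in the simplest possible model: a flat crust of thickness h over a faster half space. The direct wave wins at short range; the head wave refracted along the discontinuity wins beyond a crossover distance. The velocities and the 50-km crust below are round numbers of my own choosing, not Mohorovičić’s published values, but the crossover lands in the same few-hundred-km ballpark as his observed ~300 km.

```python
import math

# Direct ("P", undae superiores) vs head-wave ("Pn", undae inferiores) travel
# times in a flat crust over a faster mantle.
h = 50.0     # km, crustal thickness (Mohorovicic's continental estimate)
v1 = 5.6     # km/s, crustal P velocity (assumed)
v2 = 8.0     # km/s, sub-Moho P velocity (assumed)

def t_direct(x):
    return x / v1

def t_head(x):
    # down to the Moho at the critical angle, refracted leg at v2, back up
    return x / v2 + 2.0 * h * math.sqrt(1.0 / v1**2 - 1.0 / v2**2)

# closed-form crossover distance where the two travel times are equal:
x_cross = 2.0 * h * math.sqrt((v2 + v1) / (v2 - v1))
print(round(x_cross))  # ~238 km with these numbers

assert abs(t_direct(x_cross) - t_head(x_cross)) < 1e-6
```

Beyond x_cross the refracted wave arrives first, which is exactly the pattern of Fig. 7.3: only P out to a few hundred km, then Pn taking over.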
7.3 Geometrical Optics

So, now: in the previous chapter we’ve solved the “momentum equation” formally for the cases of infinite three-dimensional space, and infinite half space. We’ve solved it in both cases by somehow reducing it to the “wave equation”. We’ve seen how the concept of wavefront emerges from solving the wave equation. We’ve seen waves
traveling along the boundary between the half space and the vacuum—Love and Rayleigh waves. We’ve seen body waves (P and S waves) propagating undisturbed in infinite space. But now think of a body wave emitted by some source in a space that is not infinite, and/or not uniform. If the area near the source is homogeneous, initially propagation is no different from what happens in infinite uniform space—a P circular wavefront followed by an S circular wavefront. But what happens when a wavefront meets an obstacle? For example, in a half space like those of the previous chapter, what happens when the P wave meets the free surface? We could try to answer this question mathematically—we’ll actually do that later in the chapter—but I guess, given how these phenomena were studied historically (which still influences how we think about them today), I prefer to take the experimental approach and first see how people understood wave propagation in complex media by observation and a bit of speculation—before they’d resort to heavy mathematics in the nineteenth and twentieth centuries (and twenty-first, of course). It all started with optics: people have studied optical wave propagation for the longest time, long before they’d seriously worry about sound and elastic-wave propagation... and even before they began to think of light as a wave: you might have come across the atomist view of light in high-school philosophy, if you’ve done any philosophy in high school: most of the Greek thinkers agreed that light consisted of some sort of “atoms”, corpuscles flowing continuously from whatever object we look at, straight into our eyes. And a couple of millennia on, Newton (who published a very much read and cited Opticks treatise) still essentially thought the atomists were right. It was probably Christiaan Huygens, a contemporary, more or less, of Newton, who first suggested that light resembled more, like, a wave... but we’ll get back to Huygens in a minute. 
Whatever light was, people started long ago to observe and wonder about, for instance, what it is that light does when it hits a reflecting surface—a mirror; you can measure your position in space, and the position of some object that you see reflected on a mirror, and the position of its image as you see it in the mirror from your observation point: and you’ll see that the angle formed by your position and the position of the image on the mirror and the perpendicular to the mirror is the same as the angle formed by the same perpendicular, same image, and the actual object whose reflection you are looking at (Fig. 7.4). Ernst Mach400 wrote a Principles of Physical Optics401 where he goes into quite some detail through the historical development of optics, and he mentions Euclid’s Catoptrics as one of the first ever treatises in optics, from which one gathers that Euclid knew that the angles of incidence and reflection coincide with one another; and this three centuries B.C. Mach also cites the first emergence of the so-called principle of least action, when he mentions that “according to Damianus [of Larissa, presumably], probably fifth to sixth centuries CE [...] author of a short monograph in optics, Hero of Alexandria (second century A.D.), [...] ‘has proved that the reflected straight lines which form two equal angles are shorter than those which are reflected at unequal angles at the same mirror surface towards the same end points.”’ (Fig. 7.5.) And also according to Hero, through Damianus, through Mach: “if the nature of our ray of vision did not
Fig. 7.4 The equality of the angles of incidence and reflection, or: α = β
Fig. 7.5 Mach: “the reflected straight lines [thick dashed lines] which form two equal angles [α and β] are shorter than those [thin dashed lines] which are reflected at unequal angles at the same mirror surface towards the same end points.” Which is a way of stating the so-called principle of least action
permit of aimless wandering, the ray would be reflected at equal angles”, which is something a philosopher would call a teleological argument, I guess. I’ll get to the principle of least action in a sec.
7.4 Snell’s Law

Refraction is trickier than reflection. Refraction’s simplest definition is that it’s what happens to light as it propagates across an interface separating two different “media”: like air and water, or whatever. According to Mach, one of the first authors to study refraction was “Ptolemy (second century A.D.) [, who] knew that a ray of light on
Fig. 7.6 If A is the observer’s position and D is the object’s position, Snell’s law says that, depending on where you place D, B will be shifted around, and so the lengths of both BD and BC will change: but the ratio of BD to BC stays constant. Trigonometry shows that this is the same as stating that, no matter where you place D, the ratio of sin(i) to sin(r) is constant
entering a denser medium approaches the normal and on passing out into a less dense medium recedes from the normal. He also made visual measurements [...] with the help of a graduated vertical circle which was half immersed in water, and carried at the centre a small pin, and on the periphery two movable indicators. He drew up refraction tables for every 10° up to 80°. According to his view, the ratio of the angle of incidence to the angle of refraction remains constant [...] for the same pair of media.” Which is not yet the law of refraction as we think of it today, and strictly speaking it holds only at small angles, where angles are roughly proportional to their sines. Another piece of the puzzle is provided by Alhazen, who (Mach, again) “knew that a ray crossing the interface between two media, when reversed, retraced its original path. This important fact might be designated the reciprocal law.” According to Mach, “the correct quantitative form of the law of refraction was found, as Huygens mentions in his Dioptrica, by Willebrord Snellius402. [...] On the basis of experiments, Snell gave the law in the following form. [Look at the diagram in Fig. 7.6.] Let BE represent a surface of water, and D an object at the bottom. An eye at A sees D in the direction ABC at the point C of the vertical line DE [...]. Then the ratio BD : BC is constant”, i.e., it’s independent of the angle made by the incoming ray and the refracting surface (“angle of incidence”). So in practice if you move D around, C will also move, but BD/BC won’t change. 
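The difference between Ptolemy’s rule (ratio of angles constant) and the sine version that Snell would eventually arrive at is easy to see numerically. A quick sketch, assuming a modern air-to-water sine ratio of about 1.33 (a value Ptolemy of course didn’t have):

```python
import math

N = 1.33  # sin(i)/sin(r) for air-to-water: a modern, assumed value

print(" i (deg)      i/r   sin(i)/sin(r)")
for i_deg in (10, 30, 50, 70):
    i = math.radians(i_deg)
    r = math.asin(math.sin(i) / N)  # refraction angle, from the sine law
    print(f"{i_deg:8d} {i / r:8.3f} {math.sin(i) / math.sin(r):13.3f}")
```

The sine ratio stays put at 1.33 while the angle ratio drifts from about 1.33 up to about 1.56, which is why a constant-angle-ratio table like Ptolemy’s can only work at small angles.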
There’s another way of describing Snell’s finding, that might sound more familiar; if you call i the incidence angle made at B by BC and BG (where BG is also vertical, i.e., parallel to DE and perpendicular to BE), and r the angle made at B by the refracted ray BD with the vertical BG, you can convince yourself via trigonometry that BD = GD/sin(r), and BC = GD/sin(i): it follows that BD/BC = sin(i)/sin(r); and we have proven that Snell’s statement is equivalent to: the ratio sin(i)/sin(r) is constant. This looks a bit more like Snell’s law (which
Fig. 7.7 Let A be the eye of an observer standing by a swimming pool, and B an object floating at some depth under water. Call D and E the intersections of the water surface with the vertical lines traced from A and B, respectively. Because of refraction, the direction of the wave path from B to A changes at C. Fermat makes the assumption that light takes the fastest path to propagate from B to A, and derives Snell’s law based on that assumption alone
is the name that has come to be given to the law of refraction403) the way it is usually written in textbooks or taught in physics courses (in case you’ve been through that, already); but one important ingredient is still missing: there is no mention of velocity of propagation in Snell’s formulation of his law (according to Mach, at least). A generation or so after Snell, the mathematician Pierre de Fermat realized that he could arrive at Snell’s result in a different way, i.e., starting off with the assumption that, please now look at Fig. 7.7, light chooses the fastest path between A and B—that is, not the geometrically shortest path, which would be just the straight line connecting the two points, but the one which requires the shortest time for light to propagate from one point to the other404. Here’s the reasoning: take two points A and B; to fix ideas you can think that A is the eye of the observer, standing by a swimming pool, and B is an object floating at some depth under water. Call D and E the intersections of the water surface with the vertical lines traced from A and B, respectively; because of refraction, the direction of the wave path from B to A changes at C: the problem is, assuming Fermat’s assumption is right, to find where C is—or, which is the same, the values of the angles α and β. Let us call h1 and h2 the perpendicular distances AD and BE, just so that the math that follows doesn’t take too much room, and for the same reason let DE be denoted by e and DC by x. If the velocities of wave propagation above and below the water surface are v1 and v2 respectively, then, according to Fermat, the value of x needs to be such that the quantity

\[
\frac{AC}{v_1} + \frac{CB}{v_2} = \frac{\sqrt{h_1^2 + x^2}}{v_1} + \frac{\sqrt{h_2^2 + (e-x)^2}}{v_2} \tag{7.1}
\]
be as small as possible, i.e.,

\[
0 = \frac{d}{dx}\left[\frac{\sqrt{h_1^2 + x^2}}{v_1} + \frac{\sqrt{h_2^2 + (e-x)^2}}{v_2}\right] = \frac{x}{v_1\sqrt{h_1^2 + x^2}} - \frac{e-x}{v_2\sqrt{h_2^2 + (e-x)^2}} = \frac{\sin\alpha}{v_1} - \frac{\sin\beta}{v_2}, \tag{7.2}
\]

or

\[
\frac{\sin\alpha}{\sin\beta} = \frac{v_1}{v_2}. \tag{7.3}
\]
And you see that Fermat’s result is fully consistent with that of Snell—the ratio of the sines of the incidence to refraction angles is constant—plus Fermat also finds that the ratio of the sines must also coincide with the ratio of the velocities of propagation— which, like I said, was missing from Snell’s original version of Snell’s law: but would be confirmed later. Of course, Fermat hasn’t really proven that nature systematically follows the so-called least-action path; but it’s, uhm, remarkable that, if the assumption of least action is made, the experimental result of Snell is reproduced by direct mathematical inference, as you’ve seen.
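Fermat’s reasoning is also easy to verify numerically: minimize the travel time of Eq. (7.1) over x by brute force, and check that the sines of the resulting angles obey Eq. (7.3). A sketch, with made-up geometry and velocities:

```python
import math

def travel_time(x, h1, h2, e, v1, v2):
    """Total travel time along the broken path A-C-B of Fig. 7.7, i.e. Eq. (7.1)."""
    return math.sqrt(h1**2 + x**2) / v1 + math.sqrt(h2**2 + (e - x)**2) / v2

def fastest_x(h1, h2, e, v1, v2, n=200_000):
    """Crude brute-force search for the x that minimizes the travel time."""
    return min((k * e / n for k in range(n + 1)),
               key=lambda x: travel_time(x, h1, h2, e, v1, v2))

h1, h2, e = 1.0, 2.0, 3.0  # geometry of Fig. 7.7 (arbitrary illustrative values)
v1, v2 = 3.0, 2.25         # faster above the water, slower below (arbitrary units)

x = fastest_x(h1, h2, e, v1, v2)
sin_a = x / math.sqrt(h1**2 + x**2)              # sine of the incidence angle
sin_b = (e - x) / math.sqrt(h2**2 + (e - x)**2)  # sine of the refraction angle
print(sin_a / sin_b, v1 / v2)  # the two ratios should agree: Snell's law
```

The grid search is of course a blunt instrument, but it makes the point without calculus: the time-minimizing path bends at the interface exactly as Snell says.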
7.5 Huygens’ Principle

Now: realize that in “deriving” the laws of reflection and refraction, the way they historically came to be accepted, we didn’t really need to answer the question of what light really is, i.e., waves or corpuscles405; but sooner or later the issue had to be addressed, and that’s what Christiaan Huygens did in 1678 when he communicated his wave theory of light to the Royal Academy of Science in Paris406. Essentially, Huygens surmises that the speed of light is not infinite407, and “that light consists in the motion of some sort of matter”, although such “sort of matter” can’t be touched or seen and the only indication that it exists is that light propagates through it... Huygens appropriately calls it “ethereal matter”, which later authors will shorten to ether. To understand what Huygens means by light consisting in motion of matter, you have to think of ether, whatever that is, as made up of a lot of molecules, small parcels of (invisible) matter amassed near one another. Imagine you set one of those parcels in motion. You might hit it once, or attach it to a spring so that it oscillates back and forth, or move it around in whatever way you want. That particle will hit its neighbors, the other particles that are around it, and give away some of its kinetic energy to those other particles, which will then move in the same way. Particles that have been set in motion by the first one will transmit their motion to their own
neighbors, and so on and so forth. Now, Huygens figured that this is the same thing as saying that each particle, when it “receives” an impulse, immediately becomes a “source” (we might call it a “secondary source”) for the same impulse, transmitting it to other particles that it is in contact with: which is how Huygens’ principle is usually stated. It goes without saying that this is, pretty much, a general description of a wave, that is also good for any other medium besides ether, i.e., media we can actually perceive, be they liquid or solid or gaseous; and Huygens’ idea is cited as a founding principle for optics but also for acoustics and wave propagation in general. The thing about Huygens’ principle is that, at the time when it came out, it could explain a lot of well-known observations in optics, which people had not really understood yet: Newton, e.g., needed some very convoluted reasoning to reconcile the facts of refraction with his idea of light as a beam of flowing corpuscles; while on the contrary you can fairly easily derive the law of refraction from Huygens’ principle. Huygens does it in the third chapter of his Treatise. His proof works like this: look at Fig. 7.8: to keep things simple, we consider a plane wave, i.e., a wave with a planar, or, since to make the drawing we are taking a two-dimensional section of the world, rectilinear wavefront. If you are far enough from the source, a spherical wavefront is, locally, approximately planar, so the demonstration is still pretty general. So, think of two media separated by a flat interface, and let the velocities of propagation in the two media be v1 and v2. To fix ideas, say v2 > v1. 
Draw a line, in the medium of velocity v1, that represents the wavefront at a time t0; then, following Huygens’ idea, pick a lot of equidistant points (ideally, you should take an infinity of them) along that line, think of them as sources, and draw circular wavefronts around them, all with the same radius, which is the distance v1δt covered by the wave in a time δt. It follows from Huygens’ principle that the envelope of all the circles you’ve just drawn is the wavefront at time t0 + δt. Now think of what happens as the wavefront hits the interface between the two media: you are going to have a new source that is right on the interface: so if you redo the same thing as before, now you’ve got to draw around this new source a circle with radius v2δt, which is larger than v1δt because v2 > v1. Look at the drawing: the wavefront now is not a straight line anymore, but a broken line, whose inclination changes across the interface. Think of two subsequent moments in time, during this phase when the wavefront is interacting with the interface; call them t1 and t2 = t1 + δt, and Fig. 7.8 shows the situation of the wavefront at those two moments. Huygens’ trick now is to do some trigonometry on the two right triangles that are identified by: the wavefront in the first medium at time t1; the wavefront in the second medium at time t2; the segment (call x its length) spanned, between t1 and t2, by the intersection of wavefront and interface; and again in the first medium the perpendicular to the wavefront that meets the interface at the point where that intersection lays at time t2; and in the second medium the perpendicular to the wavefront that meets the interface at the point where that intersection lays at time t1... all this should be clear as you look at the drawing. Trigonometry in the top triangle: x sin(i) = v1δt, where i is the incidence angle just like we’ve defined it earlier; bottom triangle: x sin(r) = v2δt, where r is the refraction angle. 
Take the ratio of the two equations we’ve just written, and sin(i)/sin(r) = v1/v2, which is nothing other than Snell’s law, QED!
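Huygens’ construction itself can be mimicked numerically: strike points along the interface at the times the incident plane wavefront reaches them, grow around each a circular wavelet with speed v2, and measure the inclination of the envelope. A sketch (velocities, angle and grid are arbitrary illustrative choices):

```python
import math

v1, v2 = 1.0, 1.5          # wave speeds above / below the interface (arbitrary units)
inc = math.radians(25.0)   # incidence angle

# Secondary sources along the interface (y = 0): the source at abscissa s is
# struck at time s*sin(inc)/v1 by the incoming plane wavefront.
n_src, L = 4001, 10.0
sources = [k * L / (n_src - 1) for k in range(n_src)]
T = 1.2 * L * math.sin(inc) / v1   # observation time: every source already struck

def radius(s):
    """Radius, at time T, of the secondary wavelet emitted at interface point s."""
    return v2 * (T - s * math.sin(inc) / v1)

def envelope(x):
    """Farthest point, at abscissa x, reached by any of the secondary wavelets."""
    return max(math.sqrt(max(radius(s)**2 - (x - s)**2, 0.0)) for s in sources)

# Slope of the straight part of the envelope, away from edge effects:
x1, x2 = 6.0, 9.0
m = (envelope(x2) - envelope(x1)) / (x2 - x1)
r = math.atan(abs(m))   # the refracted wavefront makes angle r with the interface
print(math.sin(r), v2 * math.sin(inc) / v1)   # the two should agree (Snell)
```

The x1, x2 window just keeps the slope measurement away from the ends of the finite array of secondary sources, where the envelope curves instead of being straight; in between, the envelope of the wavelets reproduces sin(r) = v2 sin(i)/v1, exactly as in Huygens’ drawing.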
Fig. 7.8 Deriving Snell’s law from Huygens’ principle
7.6 Diffraction

All this having been said, there are some problems with Huygens’ scheme. Huygens defines the wavefront—the envelope of all “secondary wavelets”—as “the termination of the movement”—which makes perfect sense, because, if v is wave speed, in no way can a disturbance propagate over a time dt through a distance larger than vdt. Next, he shows convincingly that, according to this principle, a plane wave injected in a homogeneous medium continues to propagate as a plane wave, and likewise a spherical wave stays spherical. But Huygens’ principle doesn’t say what happens behind the wavefront. Think of a propagating plane wave that, in time, is just an impulse: a “click”. If all parcels of matter that receive the impulse become sources, shouldn’t we expect that the signals they generate will coalesce, to some extent, also behind the wavefront, giving rise to some sort of “coda”? The thing is, if you throw a pebble in a pond and look at the circular wave that then appears, you know that’s not the case at all: but why? And/or, for each set of secondary sources that are hit simultaneously, how come we don’t have a second plane wave, emanating from the same secondary sources, in the same direction as the first one but in the opposite sense? And if Huygens is right, how come things actually cast shadows? According to his principle, parcels of ether along the profile of some object turn into point sources, and as such should spread light in all directions—spherical waves. But it’s easy to see that (direct) light doesn’t make it into the object’s cone of shadow408. The latter point is probably the weirdest one, at least to me, so let’s look into it first. This requires some background. So far I’ve told you about experimental observations of reflection and refraction, which could be explained both in terms of Newton’s corpuscular theory of light (with some effort), and in terms of Huygens’ wave theory of light. 
But then there are a whole bunch of observations that can’t really be explained either by Newton or by Huygens alone, and these generally fall under the umbrella of diffraction. Diffraction, by the way, is a word that gets used a lot, with meanings that can somewhat vary, referring to a whole bunch of effects that the idea that light is a ray (or, more generally, the idea that, given a source and a receiver, a wave propagates from the former to the latter along a unique path) is not enough to explain. Let’s start from the beginning. Mach: “Grimaldi409 [discovered] diffraction phenomena, which he observed with accuracy, and whose essential details he recorded with such fidelity that the subject could no longer remain in obscurity. Grimaldi found that when a small opaque body was placed in the cone of sunlight which emerged from a small aperture [...], the shadow which it cast was somewhat broader than it would have been if the light were propagated rectilinearly past the edges of the opaque object. [...] Grimaldi thus made the following statement: ‘Lumen propagatur seu diffunditur non solum directe, refracte ac reflexe, sed etiam quodam quarto modo diffracte.’ He observed three coloured fringes bordering on the shadow, each fringe being narrower than its next neighbour on the side of the shadow. With a sufficiently bright light source fringes could also be seen inside the shadow [...], an odd number of the brighter fringes always being present. Light must therefore penetrate into the shadow. The fringes both within and outside the shadow
Fig. 7.9 Grimaldi’s experiment
were generally parallel to the edges of the latter”, etc. In other words, imagine you have light propagating through an aperture: a point source, in practice; place a screen in front of the aperture, and a small obstacle between aperture and screen (see the sketch in Fig. 7.9); light and shadow form a pattern on the screen that is not like what you might predict from straightforward geometry: i.e., there are a dark area and an illuminated area, whose sizes depend as you might expect on the size of the obstacle and its distance from screen and aperture; but on top of that, and this is the crux of Grimaldi’s contribution, there are illuminated fringes within the shadow, and dark fringes in the light. (And the distribution and size of the fringes also depends on the distances between aperture, obstacle, screen, and their sizes, etc.) And this strange effect is what Grimaldi called diffraction.
7.7 Fresnel: Interference Principle and Huygens-Fresnel Principle

I can’t think of how corpuscular theory could possibly explain this, and I don’t think the contemporaries of Newton and Huygens could either, although they probably gave it some convoluted tries. On the other hand, it turned out that Huygens’ recipe lacked just one key ingredient to solve the riddle: we might call it interference, and the person who’s most responsible for adding it to the mix is Augustin Fresnel. I guess there’s some debate as to whether the real champ of interference isn’t Thomas Young410, rather than Fresnel—whole books have been devoted to this question—which maybe in the framework of this book doesn’t matter that much; but having done some reading I tend to adhere to the account given by Emile Verdet in his introduction
to the collected edition of Fresnel’s Oeuvres, published in three volumes between 1866 and 1870. Fresnel had died of tuberculosis in 1827 at the age of only 39, after a short but productive life411. If we believe Verdet, Fresnel introduced the “interference principle” in its complete, correct form, and as a consequence of that was able to come up with a much more convincing formulation of Huygens’ principle than Huygens himself: which in fact, in current textbooks, is usually called the Huygens-Fresnel principle. According to Verdet (my translation), Huygens “has no difficulty to establish that elementary waves [read: impulses, clicks] have a common envelope, which is what we call the wave [the wavefront, actually], and that there could be no movement beyond that envelope [i.e., farther away from the source(s)], but he does not really prove that within that envelope, also, movement is negligible.” A first intuition that helps Fresnel solve this problem is: rather than think of waves as impulses (which is what Huygens tended to do), think of them as periodic disturbances412; a second intuition concerns diffraction and consists of figuring that diffraction effects (Grimaldi’s fringes, etc.) are caused not by obstacles (a small object casting a shadow; the edges of a small aperture, etc.) reflecting the waves (which apparently Thomas Young thought they did), but simply by the screening off of some sources by the obstacle. Anyway, all this probably sounds pretty abstract, so what I am going to do next is, I am going to show you examples of how Fresnel uses these ideas to explain (quantitatively) some strange experimental results. 
Before I do so, though, I should probably clarify, again, that this is not really about optics: it is about seismic and acoustic waves just as much as it is about optics, or even more: because in Huygens’ paradigm, to be challenged only at the beginning of the twentieth century, a wave of light propagating through ether is totally the same thing as a seismic wave propagating through rock, i.e., it’s all about shifting matter around; almost a century after Fresnel, yes, the whole idea of ether will be abandoned, plus physicists in the early twentieth century will show that light doesn’t exactly, or doesn’t always, behave like a wave; but that doesn’t matter as far as this book is concerned, because conclusions drawn re the propagation of elastic waves, like earthquake or sound waves, remain valid. As a first exercise413, let’s take a simple setup where a spherical... or rather, to make it even simpler, let the setup be purely two-dimensional, and let a circular (or if you insist on thinking three-dimensionally, cylindrical) wave be emitted by a single point (in three dimensions, a line) which we call R in Fig. 7.10 (in the three-dimensional case, which from now on I’ll just forget because the 2-D example we are going to work out leads to the same results, the line source would be perpendicular to the plane of the diagram, intersecting it at R). Then there’s an obstacle, which to keep things simple will just be a segment of negligible thickness, through which waves can’t propagate at all; it’s just totally fixed and motionless. Let A be one end of the obstacle (we don’t care about the other end). On the other side of the obstacle there’s a screen, parallel to the obstacle. Now, what happens in such a setup, similar to Grimaldi’s experiment, is that you are going to observe fringes of darkness in the illuminated area of the screen, to the left of point T, and some (dimly) illuminated fringes in the shadow to the right of T. 
Let’s just focus on what happens on the screen to the left of T , and let’s see how Fresnel uses Huygens’ principle plus the
Fig. 7.10 After Fresnel, Oeuvres, p. 270
interference principle to predict where exactly the dark fringes should be along the screen. I haven’t really explained how the interference principle works in practice, actually, so here you go: the point, as anticipated, is that (in Verdet’s words) “Fresnel simply supposes that light is produced by periodic vibrations of very short duration, propagating at an immense [but finite] speed”, where it is understood that a periodic vibration consists of a sequence of half-oscillations equal and opposite to one another. And, for instance, look at Fig. 7.11: if an observer is hit by two waves in such a way that motion in one direction associated with the first wave comes at the exact same time as motion in the opposite direction associated with the second wave, well, the two displacements add up to zero and the observer doesn’t move (for this to work, of course, the two waves must have the same amplitude and frequency)—“destructive” interference. Likewise, if two maxima of equal sign come in at the same time, the two waves add up and the observer moves twice as much (with respect to the case where it is hit by one wave only). Incidentally, Fresnel initially worked out these ideas without specifying the direction of oscillation; which is great, because it means that the inferences that follow are independent of how waves are polarized414 (whether they be P, or S, etc. Of course, the kind of interference we are talking about now only concerns combinations of waves that are polarized in the same way; otherwise I guess things would be much more complicated). Now that the meaning of interference is clarified, we can go back to Fresnel’s experiment in Fig. 7.10. Fresnel, like Young before him, interprets the fringes as areas where motion (of ether) is minimum, because interference is destructive—interference between the direct, circular wave coming out of R, and landing at F at a time RF/v, where v is the speed of light, and (in Fresnel’s first formulation of the
Fig. 7.11 Very simple examples of constructive (left) and destructive (right) interference. In both cases, the sum of the top and middle signals gives the signal at the bottom
theory, which is taken directly from Young) another circular wave, reflected415 off the edge of the obstacle at A—hitting F after a time (RA + AF)/v. So then there will be a darker fringe at point F if the difference between the distances RF and RA + AF is precisely half a wavelength: because if that’s the case, the crest of the direct wave will come in exactly at the same time as the trough of the diffracted one, and vice-versa: destructive interference. So what Fresnel does is, (i) he finds via trigonometry a formula relating the difference RA + AF − RF with the distance x between T and F, then (ii) finds the numerical value of x that corresponds to RA + AF − RF being exactly half a wavelength, and (iii) compares that to the position of the first fringe, as observed experimentally. This is all done in his paper “Mémoire sur la Diffraction de la Lumière”, which got him the Grand Prix biannuel de Physique in 1819. If you look at Fig. 7.10 and remember your trigonometry, hopefully you’ll agree that

\[
CF = c + x + b\tan(\theta), \tag{7.4}
\]

but

\[
\tan(\theta) = \frac{c}{a}, \tag{7.5}
\]

and so

\[
CF = c + \frac{bc}{a} + x. \tag{7.6}
\]

Also, by the Pythagorean theorem,

\[
RA = \sqrt{a^2 + c^2} \tag{7.7}
\]

and

\[
AF = \sqrt{b^2 + \left(\frac{bc}{a} + x\right)^2} \tag{7.8}
\]
and

\[
RF = \sqrt{(a+b)^2 + \left(c + \frac{bc}{a} + x\right)^2}. \tag{7.9}
\]
Now, before combining RA + AF − RF, and get a really ugly formula, let’s simplify things a bit, which we can, because for instance in Eq. (7.7) c² is very small compared to a²: the experiment is implemented for a small value of the angle θ, so that c is also relatively small; in (7.8) (bc/a + x)² is much smaller than b² (the ratio of b to a is of the order of unity; x is small because θ is small), and in (7.9) (c + bc/a + x)² is much smaller than (a + b)² (for similar reasons). We can then take the Taylor expansion (first order should be OK; it’s OK for Fresnel, anyway) of the function f(z) = √(K + z²) near z = 0, with K an arbitrary constant,

\[
f(z) \approx \sqrt{K} + \frac{z^2}{2\sqrt{K}}, \tag{7.10}
\]
and use it to find approximate expressions for RA, AF, RF. Namely,

\[
RA \approx a + \frac{c^2}{2a}; \tag{7.11}
\]

\[
AF \approx b + \frac{1}{2b}\left(\frac{bc}{a} + x\right)^2 = b + \frac{bc^2}{2a^2} + \frac{cx}{a} + \frac{x^2}{2b}; \tag{7.12}
\]

\[
RF \approx a + b + \frac{1}{2(a+b)}\left(c + \frac{bc}{a} + x\right)^2 = a + b + \frac{1}{2(a+b)}\left(c^2 + \frac{b^2c^2}{a^2} + x^2 + 2cx + 2\frac{bc^2}{a} + 2\frac{bcx}{a}\right). \tag{7.13}
\]
Now it is convenient to do the arithmetic, because a lot of stuff actually cancels out. To see how that happens, consider first

\[
RA + AF \approx a + b + \frac{1}{2}\left(\frac{c^2}{a} + \frac{bc^2}{a^2} + \frac{x^2}{b} + 2\frac{cx}{a}\right), \tag{7.14}
\]

which begins to look kind of similar to RF as in (7.13), and in fact if you both multiply and divide by a + b the term inside the parentheses at the right-hand side of (7.14),

\[
RA + AF \approx a + b + \frac{1}{2(a+b)}\left(c^2 + 2\frac{bc^2}{a} + \frac{b^2c^2}{a^2} + \frac{ax^2}{b} + x^2 + 2cx + 2\frac{bcx}{a}\right), \tag{7.15}
\]
and it becomes easy to find that

\[
RA + AF - RF \approx \frac{ax^2}{2b(a+b)}. \tag{7.16}
\]
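As a sanity check on the chain of approximations, (7.16) can be compared against the exact square roots (7.7)–(7.9); the numbers below are arbitrary, chosen only to respect the smallness assumptions (c small compared to a, x small, b/a of order unity):

```python
import math

a, b = 1.0, 1.0      # source-obstacle and obstacle-screen distances (arbitrary)
c, x = 0.01, 0.001   # small offsets, consistent with Fresnel's approximations

# Exact path lengths, Eqs. (7.7)-(7.9):
RA = math.sqrt(a**2 + c**2)
AF = math.sqrt(b**2 + (b * c / a + x)**2)
RF = math.sqrt((a + b)**2 + (c + b * c / a + x)**2)

exact = RA + AF - RF
approx = a * x**2 / (2 * b * (a + b))   # Eq. (7.16)
print(exact, approx)   # the two should agree to well within a percent
```

Note how tiny the path difference is (a fraction of a micrometer, in these units): comparable to a wavelength of light, which is the whole point of the experiment.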
It follows that the value of x corresponding to the first dark fringe is given by

\[
\frac{\lambda}{2} \approx \frac{ax^2}{2b(a+b)}, \tag{7.17}
\]

or

\[
x \approx \sqrt{\frac{\lambda b (a+b)}{a}}, \tag{7.18}
\]

where λ is wavelength (see, again, Fresnel, same place, page 271). It all looks very good, but the problem is that it doesn’t fit the data at all. “The position of dark and bright fringes as inferred from this formula”, says Fresnel, “is almost exactly the opposite of what is observed experimentally.” Some convoluted attempts to get the theory to fit the data by some strange corrections follow, but eventually (p. 282) Fresnel concludes that “the interference principle, applied to the direct rays plus the rays that are reflected [...] by the edges of an obstacle, [is] insufficient to explain diffraction phenomena.” But, and here’s where Fresnel begins to contribute some really new ideas, surpassing both Huygens and Young: “I will show now that we can give a satisfactory explanation and a general theory [of diffraction phenomena], in the framework of wave physics, without the help of any additional assumptions other than the principle of Huygens’ and that of interference.” What we need to do is, we need to think of, e.g., the obstacle in Fig. 7.10 not as something that reflects waves, but rather as something that just blocks them off, without reflecting anything; and we should be able to predict diffraction phenomena just by applying Huygens’ principle—summing only the sources that the obstacle hasn’t screened off. So let’s see how this idea helps Fresnel to find, for the problem of Fig. 7.10, a solution that actually fits the data. I am jumping to p. 313 of Oeuvres, i.e., his “application of the theory of interferences to Huygens’ principle”; which is still part of his 1818 award-winning paper. Because the calculations are going to be completely different, it’s better to have a brand new diagram to look at, see Fig. 7.12. The motion observed at P will be the sum of the effects of all secondary sources everywhere in space, unless of course the signal they emit is blocked by the obstacle—again, a segment whose relevant endpoint is A. 
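Before following Fresnel to his new construction, it’s worth getting a feel for the scale that the rejected formula (7.18) predicts; with illustrative numbers of mine (not Fresnel’s actual setup), visible light and distances of about a meter put the first fringe a millimeter or so from the edge of the geometric shadow, which gives an idea of why diffraction took careful experiments to pin down:

```python
import math

lam = 600e-9     # wavelength of orange-ish visible light, in meters (illustrative)
a, b = 1.0, 1.0  # source-obstacle and obstacle-screen distances, in meters (illustrative)

# First dark fringe position according to Eq. (7.18):
x = math.sqrt(lam * b * (a + b) / a)
print(f"first fringe at about {x * 1000:.2f} mm from the shadow edge")
```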
Fresnel introduces an important simplification: in the absence of obstacles, we know that a point emits a spherical (or, in 2-D, circular) wave. This is both an experimental result (throw a pebble in a pond, and watch), and the fundamental assumption of Fresnel—his interpretation of Huygens' principle. So, in this particular case, the wavefront is circular until it hits the obstacle; at the moment when that happens, motion is restricted to the arc of circumference denoted by a thick black line in Fig. 7.12: there cannot be any diffraction effect yet, while any motion to the right of A is going to be reflected back towards the upper end of
7.7 Fresnel: Interference Principle and Huygens-Fresnel Principle
279
Fig. 7.12 The thick curved line is the wavefront of the circular wave emitted at O. The thin curved line is an arc of circumference centered at P. (After Fresnel, Oeuvres, p.313.)
the page—or absorbed by the obstacle, depending on its properties, but in any case it is never going to make it to P, i.e., it doesn’t affect the diffraction pattern that we shall observe on the screen. So then, predicting the motion at P caused by the wave emitted at O amounts to summing the motions caused at P by circular waves emitted at all points of the thick black line starting at A; in the language of mathematical analysis, we are going to have to do an integral along that curve; which might perhaps be tricky, but the good news is that by stopping the integral at A we, according to Huygens’ principle, are taking full account of diffraction. And in fact, here is how Fresnel formulates Huygens’ principle, which from this point on becomes the Huygens-Fresnel principle: “the vibrations at each point of a light wave may be treated as the sum of elementary movements which are sent there, independently of one another, by all the parts of the wave, considered in one of its previous positions”, where you can interpret the phrase “considered in one of its previous positions” to refer specifically, again, to the thick black line in Fig. 7.12: which, calculation-wise, is the most convenient of all possible previous positions of the wave that one might think of considering: because, at earlier positions, there would still be no account for the effect of the obstacle, while later positions are polluted by diffraction effects that are hard to quantify. At that optimal moment in time, instead, the portion of wavefront that is not blocked off is circular, and each of its points is subjected to the same motion. So that means that, if you think of the points along the wavefront as secondary sources, each of them will emit the exact same energy—a circular wave of the same amplitude. If the signal emitted by O is sinusoidal in time, all points along the wavefront will simultaneously emit the same sinusoidal signal, and P will be receiving the same
280
7 The Structure of the Earth: Seismology
signal, with the same maximum amplitude416, from each of those secondary sources, except for a delay that is proportional to the difference in the distance between P and each secondary source. Let M be the intersection of OP with the wavefront that goes through A, and N another generic point along that wavefront. Let s be the length of the wavefront between M and N. We are going to have to do an integral along the wavefront, starting at A: so what Fresnel does is, he thinks of s as the incremental length along that wavefront, or the "integration variable": and so, to put the integral in a form that can be calculated, finds a way to write the integral—the amplitude of motion—as a function of s. Let's see how. If we take as a reference, and call sin(ωt) (i.e., zero delay), for example, the signal received at P at time t from the secondary source at M, with ω the angular frequency, or 2π times the inverse of the period, then the signal received at P at the same time from the source at N reads sin[ω(t − u/c)], where c is wave speed, and the distance u is defined as in Fig. 7.12. sin[ω(t − u/c)] coincides with the signal received at P at the earlier time t − u/c from M: the extra distance u costs the signal from N an extra time u/c to make it to P. It will become useful to transform
$$\sin\left[\omega\left(t - \frac{u}{c}\right)\right] = \sin(\omega t)\cos\left(\frac{\omega u}{c}\right) - \cos(\omega t)\sin\left(\frac{\omega u}{c}\right), \tag{7.19}$$
via the properties of sinusoidal functions. Waves are circular, so the wavefront is an arc of circumference, and it follows that the angle made at O by MO and NO measures s/a exactly. If s is small, which it is in this kind of experiment (just like θ in Fig. 7.10), we can also think of the arc of length s as, approximately, a straight line MN: and then if we draw from N a line perpendicular to OP, intersecting it at Q, we can think of QMN as a right triangle. Now, the chord MN deviates from the tangent to the circle at M by half the central angle; it follows that the angle made at N by QN and MN must coincide with s/(2a). It follows from this, and from the fact, again, that s is small (and therefore s/(2a) is even smaller, and417 sin[s/(2a)] ≈ s/(2a)), that the length of QM is about s²/(2a); it also follows that the length of QN is about √(s² − s⁴/(4a²)) (Pythagorean theorem). Keeping all this in mind, let's look at triangle PQN. It's a right triangle, with hypotenuse PN of length b + u, and legs QN and PQ. The length of PQ being b + s²/(2a), we can apply the Pythagorean theorem to write
$$(b+u)^2 = \left(b + \frac{s^2}{2a}\right)^2 + s^2 - \frac{s^4}{4a^2}, \tag{7.20}$$
or
$$b^2 + u^2 + 2ub = b^2 + \frac{bs^2}{a} + \frac{s^4}{4a^2} + s^2 - \frac{s^4}{4a^2}, \tag{7.21}$$
which after a bit of algebra boils down to
$$u^2 + 2ub = \frac{bs^2}{a} + s^2. \tag{7.22}$$
But actually b is much bigger than u, so then ub is much bigger than u², which we can reasonably delete from the latter equation; it's then easy to solve for u the (approximate) equation that's left, which gives
$$u \approx \frac{s^2(a+b)}{2ab} \tag{7.23}$$
(see Fresnel, p. 215), i.e., the expression that we were after, of u as a function of s. Finally, substitute formula (7.23) for u into (7.19), and the signal at P is given by418,419
$$\text{signal at }P = \sin(\omega t)\int\cos\left[\frac{\omega s^2(a+b)}{2abc}\right]ds - \cos(\omega t)\int\sin\left[\frac{\omega s^2(a+b)}{2abc}\right]ds, \tag{7.24}$$
where the integral is to be conducted along the wavefront that goes through A. The cosine and sine of ωt don't depend on s, so I have already pulled them out of the integration. The integrals in (7.24) are not easy to do, but Fresnel does them numerically420, see the tables in Oeuvres, p. 319. What's most important is that this time he manages to fit the data—the diffraction pattern actually observed on the screen in Fig. 7.12—really well, or "within the limits of experimental error", as they say. This is sort of a proof of Fresnel's (as opposed to Young's) interpretation of diffraction, i.e., of what we've come to call the Huygens-Fresnel principle421. There's another application of Huygens-Fresnel that recurs in textbooks and webpages and lecture notes, although I have to admit I haven't been able to find it in Fresnel's work itself—though that doesn't mean that it isn't there at all... It leads to an exercise that is quite useful, I think, to explain once and for all the concept of wave path, or ray path, which is most important for what I have to do next: so here it goes. The idea is, use Huygens-Fresnel to prove that, in a homogeneous medium, a wavefront that at a given moment is plane continues to be plane—a plane wave keeps propagating as a plane wave. Maybe that's obvious to you, but take it as a test that Huygens-Fresnel makes sense. We shall solve this problem with essentially the same set of tricks concocted by Fresnel to solve the previous one, except of course the setup is, again, different: see Fig. 7.13. We've got a plane wave propagating from left to right, and the x, z plane coincides with its wavefront; and P is an arbitrary point at a distance b from the wavefront. To calculate the signal at P, by Huygens-Fresnel, similar to what we've done earlier, we have to sum the contributions from all points (i.e., secondary sources) of the wavefront. We do it by subdividing the wavefront in a clever way.
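Incidentally, the integrals in Eq. (7.24), which Fresnel evaluated by hand, are easy to reproduce today. With the change of variable $v = s\sqrt{2(a+b)/(\lambda ab)}$ (and remembering that ω/c = 2π/λ), the two integrands become cos(πv²/2) and sin(πv²/2): what are now called the Fresnel integrals. Here is a minimal numerical sketch; the Simpson-rule routine, the function name and the sample upper limit are mine, not Fresnel's:

```python
# Fresnel integrals C(w), S(w): the integrals of Eq. (7.24) after the
# change of variable v = s*sqrt(2(a+b)/(lambda*a*b)), which Fresnel
# tabulated numerically (Oeuvres, p. 319).
import math

def fresnel_C_S(w, n=2000):
    """Composite-Simpson estimates of C(w), S(w) = int_0^w cos, sin(pi v^2/2) dv."""
    if n % 2:
        n += 1                                 # Simpson's rule needs an even number of intervals
    h = w / n
    C = S = 0.0
    for i in range(n + 1):
        v = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        phase = math.pi * v * v / 2.0
        C += weight * math.cos(phase)
        S += weight * math.sin(phase)
    return C * h / 3.0, S * h / 3.0

C1, S1 = fresnel_C_S(1.0)
print(C1, S1)   # tabulated values: C(1) = 0.7799..., S(1) = 0.4383...
```

(For w going to infinity both integrals tend to 1/2, which is what ultimately fixes the overall amplitude of the diffracted signal.)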
Call O the point you intercept, on the wavefront, by drawing from P the perpendicular to the
Fig. 7.13 Plane wave propagation according to Huygens-Fresnel
wavefront itself. Then pick a set of points M1, M2, M3, M4, etc., on the wavefront, at varying distances from O (and P) such that PM1 = b + λ/2, where λ is, again, wavelength, PM2 = b + λ, PM3 = b + 3λ/2, PM4 = b + 2λ, and so on and so forth. Notice that, by Pythagoras,
$$OM_1^2 = PM_1^2 - b^2 = b\lambda + \frac{\lambda^2}{4}, \tag{7.25}$$
$$OM_2^2 = PM_2^2 - b^2 = 2b\lambda + \lambda^2, \tag{7.26}$$
and likewise
$$OM_3^2 = PM_3^2 - b^2 = 3b\lambda + \frac{9\lambda^2}{4}, \tag{7.27}$$
and so on. Unless we pick P to be incredibly close to the wavefront (which we don't have to), the wavelength λ is way smaller than b, λ² way smaller than bλ, and we are left with
$$OM_n^2 \approx nb\lambda, \tag{7.28}$$
or
$$OM_n \approx \sqrt{nb\lambda}, \tag{7.29}$$
for all integer values of n = 1, 2, 3, etc. From that, it's not difficult to calculate the area of the ring between the circles, centered in O, that go through Mn and Mn+1, for whatever n: because the areas of the two circles are $\pi OM_n^2 \approx n\pi b\lambda$ and $\pi OM_{n+1}^2 \approx (n+1)\pi b\lambda$, and so their difference is (approximately) πbλ: which is constant, independent of n, and equal to the area of the circle of radius OM1, centered at O. So, now, by Huygens-Fresnel, let's combine the contributions, to the signal at P, from all rings around O. Call m1 the contribution at a given time t, from the combination of all secondary sources that are within the circle of radius OM1, centered at O; call m2 the contribution, at the same time, of the first ring, with inner radius OM1 and outer radius OM2; then m3 is the contribution from the ring with outer radius OM3, etc. Rings (and innermost circle) all have the same area, so the contribution of each is controlled by its distance from P; the larger the distance, the more the geometrical spreading and the lower the amplitude. But differences in distance, from one ring to the other, are on the order of λ, which we know is much smaller than b: so geometrical spreading shouldn't change much. We might reasonably write
$$m_n \approx \frac{m_{n-1} + m_{n+1}}{2}. \tag{7.30}$$
Simultaneous arrivals from different rings will be "off phase" with respect to one another, according to the difference in distance between each ring and P. We are talking, again, about sinusoidal waves—all waves in Fresnel are sinusoidal422—so for each source within, say, the n-th ring that contributes, say, a crest to P, there's one in the n + 1-th ring that contributes a trough (and this is true not only of crests and troughs, but also of any intermediate "phase" between them); to take account of this, we might assign a minus or plus sign to each mn, and write
$$\text{signal at }P = m_1 - m_2 + m_3 - m_4 + m_5 - m_6 + \cdots; \tag{7.31}$$
which now there’s a funny trick we can play, putting together (7.30) and (7.31); because you can always replace m 1 with m21 + m21 ; likewise m 3 = m23 + m23 , etc.; imagine that you do this to all terms with odd indexes in (7.31), which becomes m1 m3 m5 m3 m5 m1 + − m2 + + − m4 + + − m6 + . . . 2 2 2 2 2 5 (7.32) But now it follows from (7.30) that the second, third and fourth term at the righthand side of (7.32) cancel one another out, and so do the fifth with the sixth and the seventh, and so on and so forth to infinity, until we are left with signal at P =
signal at P ≈
m1 . 2
(7.33)
Equation (7.33) stipulates that all contributions to the signal observed at P, other than that coming from secondary sources at O and its immediate vicinity, interfere
destructively and cancel out. This is true independently of where O and P are, provided the former lies on the perpendicular drawn from the latter to the wavefront. That perpendicular to the wavefront is the ray path between O and P, of course, and the time it takes for the wavefront to propagate between its two successive positions (the one that includes O, and the one that includes P) coincides with the length of that ray path, divided by wave speed. But now we've also shown that, in a sense, all energy recorded at P has travelled along that very path; because contributions from all other paths cancel each other out. Things get more complicated, though, if the medium is not uniform, if the wavefront is not flat, etc.: an obstacle along the wavefront, away from O, will disrupt interference and will in principle have an effect also on the signal recorded at P, etc.423
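The collapse of the alternating sum (7.31) onto m1/2, Eqs. (7.30) through (7.33), is also easy to check numerically. In the sketch below the model for the mn (amplitude decaying like one over distance, with b much larger than λ) is my own toy choice, not a number from the text; averaging two consecutive partial sums tames the oscillation of the truncated series:

```python
# Numerical check of Eqs. (7.30)-(7.33): an alternating sum of slowly
# varying ring contributions m_n collapses to m_1 / 2.
b, lam = 1.0, 1e-4                           # b >> lambda, as in the text

def m(n):
    # n-th ring: all rings have (approximately) the same area, so only
    # geometrical spreading -- distance roughly b + n*lambda/2 -- matters
    return 1.0 / (b + n * lam / 2)

N = 200_000                                  # N even, so the last kept term is -m(N)
S = sum((-1) ** (n + 1) * m(n) for n in range(1, N + 1))
S = S + m(N + 1) / 2                         # average of the partial sums S_N and S_(N+1)
print(S, m(1) / 2)                           # the two numbers all but coincide
```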
7.8 Reflection and Refraction Coefficients

Fresnel derives the laws of reflection and refraction from the interference principle, too. Above, I've already shown other ways in which those laws are derived. What's perhaps more relevant at this point, Fresnel contributed a way to compute the so-called reflection and refraction coefficients, i.e., to determine how much of the energy carried by a wave gets reflected off an interface, and how much of it gets refracted through the same interface, depending on the angle at which the wave hits the interface—the angle of incidence. The idea first emerged in a paper424 that Fresnel presented in 1823, and only much later C. G. Knott425 extended it to seismology. Knott's contribution was quite important too, at least technically, because seismic waves—if you consider that in the earth we've got both P and S waves, and that, as we shall see, one of the things that might happen at an interface is that P and S might to some extent get converted into one another... in the end the physics of seismic waves is not entirely the same as the physics of light-propagating-in-ether as conceived by Fresnel426. So Knott had to work pretty hard to generalize Fresnel's ideas on reflection and refraction to seismology. Before I try to repeat what Fresnel and Knott did, I need to give you some more background on P and S waves, for which, instead of the 1849 paper by Stokes that we spent some time on in the previous chapter, we are now going to look at Chap. XIII of Love's Treatise427. The main thing I need to do is I need to introduce seismic plane waves, which Love's book does very clearly, I think. I am going to follow essentially that, with a couple of (what I hope are) simplifications. One of the things Love does is he takes a function
$$\mathbf{u}(\mathbf{x}, t) = \mathbf{A}\cos[k(\mathbf{p}\cdot\mathbf{x} - ct)], \tag{7.34}$$
which is a monochromatic plane wave traveling along the direction p, and shows that it is a solution of Eq. (6.167). In formula (7.34), it makes sense for p to be a vector of unit norm, i.e., p · p = 1, so that all the info it carries is the direction of propagation,
while info on wavelength is carried by k, and c is the speed of propagation (and if you put c and k together you get frequency, too). So, Love plugs (7.34) into (6.167). If you do the algebra, that gives
$$\begin{aligned} k^2c^2\rho\,\mathbf{A}\cos[k(\mathbf{p}\cdot\mathbf{x} - ct)] &= (\lambda+\mu)\nabla\left[k(\mathbf{p}\cdot\mathbf{A})\sin[k(\mathbf{p}\cdot\mathbf{x} - ct)]\right] + \mu k^2\mathbf{A}\,(\mathbf{p}\cdot\mathbf{p})\cos[k(\mathbf{p}\cdot\mathbf{x} - ct)] \\ &= (\lambda+\mu)k^2(\mathbf{p}\cdot\mathbf{A})\,\mathbf{p}\cos[k(\mathbf{p}\cdot\mathbf{x} - ct)] + \mu k^2\mathbf{A}\cos[k(\mathbf{p}\cdot\mathbf{x} - ct)], \end{aligned} \tag{7.35}$$
which then if you simplify what can be simplified, you are left with
$$\rho c^2\mathbf{A} = (\lambda+\mu)(\mathbf{p}\cdot\mathbf{A})\,\mathbf{p} + \mu\mathbf{A}, \tag{7.36}$$
or
$$\left(c^2 - \frac{\mu}{\rho}\right)\mathbf{A} = \frac{\lambda+\mu}{\rho}(\mathbf{p}\cdot\mathbf{A})\,\mathbf{p}. \tag{7.37}$$
Now, the first thing that comes to mind looking at Eq. (7.37) is that the vector at the left-hand side points in the direction of A while the right-hand side points in the direction of p: so then one might infer that, for (7.37) to hold, A and p have to be parallel. But then p · A = A (where of course A is the magnitude of A), and if you dot-multiply both sides of (7.37) with p, you are left with
$$\left(c^2 - \frac{\mu}{\rho}\right)A = \frac{\lambda+\mu}{\rho}A, \tag{7.38}$$
or
$$c = \sqrt{\frac{\lambda+2\mu}{\rho}}. \tag{7.39}$$
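Before interpreting this result, a quick numerical check of Eq. (7.36) might be reassuring. The material constants below are arbitrary, vaguely rock-like values of my own choosing (SI units); the check covers both A parallel to p, with c given by (7.39), and A perpendicular to p, with c² = μ/ρ, the case discussed just below:

```python
# Numerical check of Eq. (7.36): rho c^2 A = (lambda + mu)(p.A) p + mu A.
import math

lam, mu, rho = 3.0e10, 3.0e10, 3000.0        # Lame parameters (Pa), density (kg/m^3)
p = (1.0, 0.0, 0.0)                          # unit propagation direction

def residual(A, c):
    """Norm of rho*c^2*A - (lam+mu)(p.A)p - mu*A: zero when (7.36) holds."""
    pA = sum(pi * Ai for pi, Ai in zip(p, A))
    r = [rho * c * c * Ai - (lam + mu) * pA * pi - mu * Ai
         for pi, Ai in zip(p, A)]
    return math.sqrt(sum(x * x for x in r))

cP = math.sqrt((lam + 2 * mu) / rho)         # Eq. (7.39)
cS = math.sqrt(mu / rho)
print(residual((1.0, 0.0, 0.0), cP))         # A parallel to p, c = cP: negligible residual
print(residual((0.0, 1.0, 0.0), cS))         # A perpendicular to p, c = cS: negligible residual
```

Any other combination of polarization and speed leaves a large residual, i.e., does not solve (7.36).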
What we’ve just found is that the plane wave (7.34) is indeed solution to (6.167) if A is parallel to p, which means that displacement is parallel to the direction of propagation, and the speed of propagation c is given by (7.39). Which we’ve just described a plane P wave. And but now look again at (7.37). Say A is perpendicular to p, so the right-hand side of (7.37) is zero. Then, (7.37) will still hold if also its left-hand side is zero: which it is if c2 = μρ . What this means is that (7.34) is solution to (6.167) also if displacement is perpendicular to the direction of propagation, and propagates at speed μρ : now we’ve described a plane S wave. What happens when a plane wave428 hits a discontinuity or a boundary? Fresnel tries to give a quantitative answer to this question in his 1823 paper. He knows already, through Huygens’ principle and/or his version of Huygens’ principle, that there will be both reflection and refraction, and that both reflected and refracted waves will
be plane waves—see above. He points out that the frequency of both reflected and refracted waves should be the same as that of the incident wave: "we can affirm that each oscillation [...] in glass takes the same amount of time as each oscillation in air [i.e., the period stays the same, or, if you prefer, the frequency stays the same], otherwise there would be a discontinuity, or discordance between the oscillations that come before and those that follow", and indeed such discontinuity, or discordance, doesn't make any sense, if you think (with Huygens) that the vibration of each parcel of matter is determined by the vibrations of its neighbors. Glass and air, by the way, are just an example: Fresnel's statement is general. Then Fresnel also proposes to make the assumption that vibrations are small enough that they don't, like, break the materials that they propagate through, which means that displacement on both sides of the interface should be the same, or in other words that displacement must be continuous across the interface: otherwise, fractures and/or holes would be opening up along the interface. He also figures that, by virtue of conservation of energy... think of a single incident wavelet that carries a certain energy; if you neglect anelastic dissipation, the energy carried by the wavelet before it hits the interface should coincide with the sum of the energy carried by its reflected and refracted counterparts after they bounce off/pass through the interface429. These two "conditions" can be expressed mathematically and used to determine the values of the amplitude of the reflected and refracted waves in terms of the amplitude of the incident one. Alternatively, and that's what's usually done in geophysics books, incl. this one, starting with the work of Knott that I've already mentioned, one can require that besides displacement, also stress be continuous across the interface—and then energy conservation becomes redundant430.
Whichever path you choose, the calculations are quite messy, as you are about to see. One might need to know the reflection and refraction coefficients in a bunch of different setups, by which I mean that for example you might want to know the reflection coefficient for a P wave, traveling from deep within the earth and hitting its outer surface, which reflects it back towards the interior of the planet. In the first approximation, the earth's surface is free from external stresses, so all energy gets reflected back into the earth: which simplifies things. But if we look at a discontinuity within the earth, then the reflection coefficient will be different, because part of the energy gets refracted (transmitted) through the discontinuity and so there's also a refraction coefficient, which we might want to calculate. And then maybe the incoming ("incident") wave is S, not P: and the calculations are different, again. In addition, as I think I anticipated a few pages ago, an incident P wave is usually partly converted to S, both through reflection and refraction; and likewise an incident S wave is usually partly converted to P. So, for example, when a P wave hits an interface, in general that gives rise to: a reflected P wave, a refracted P wave, a reflected ("converted") S wave, a refracted ("converted") S wave. And when an S wave hits an interface... you get the idea. And that's not all yet, because there's one last thing that we have to worry about, and that is the polarization of S waves. Which I haven't even mentioned in this book yet, I'm afraid! Well, the thing is, I told you that S-wave displacement is "perpendicular" to the direction of wave propagation... which, if you think about it, means that the displacement vector must be "tangential"
to the wavefront, but could point in any direction within the plane that’s parallel to the wavefront. Now, when a P wave hits a discontinuity, and/or the outer surface of the earth, particle motion is confined to the plane that’s perpendicular to the discontinuity, and contains the incident wave path. S waves generated by conversion of that incoming P wave can only involve displacement within that same plane. Typically, in the first approximation, we think of the earth as a layered half space (or a sphere made of concentric shells, which, if you zoom into a relatively small portion of it, will look pretty much like a layered half space...), where layers are all horizontal and parallel to the outer surface. Then, it makes sense to call “vertical” the plane where P-wave motion, as well as converted S motion, occur: and S waves polarized in such a way are called SV , where V stands for vertical. People also often talk of S H waves, which are S waves polarized in such a way that particle motion is perpendicular to both the wave path, and the vertical plane; H , of course, stands for horizontal431 . What I am going to do in the following is, I am going to work out some examples of how reflection and refraction coefficients are derived in some of the cases I’ve mentioned. We might start with the reflection and conversion of a P wave at a surface that is free from stresses, which should be a relatively easy exercise. This is called a “free surface”, in seismology argot, and we’ve already dealt with it in Chap. 6, the piece on Rayleigh waves. To a very good approximation the earth’s surface, for example, is a free surface. In practice, because there’s nothing on its outer side that can resist stress, we require that stresses acting on it from within be zero, everywhere and all the time. As for the displacement, there’s (approximately) nothing on the outer side of the discontinuity, so it doesn’t make sense to require continuity of displacement. 
Displacement is "free" along the boundary. Let us work in a Cartesian reference frame where x1 and x2 are the horizontal coordinates, x3 the vertical one (positive downwards), and we pick x1 to lie in the same plane as the direction of propagation: so then a P wave is given by Eq. (7.34), with p2 = 0 and the vector A parallel to p. If we call θ1 the angle of incidence (check out Fig. 7.14), then
$$\mathbf{p} = (\sin\theta_1, 0, -\cos\theta_1), \tag{7.40}$$
Fig. 7.14 Reflection and conversion of a P wave at a free surface
and we can write A = A(sin θ1 , 0, − cos θ1 ).
(7.41)
The incident P wave gives rise to another, reflected P wave, and a reflected, converted S wave. By the law of reflection we can anticipate that the angle at which the reflected P leaves the free surface coincides with θ1; the reflected P, then, propagates in the direction
$$\mathbf{p}^{RP} = (\sin\theta_1, 0, \cos\theta_1), \tag{7.42}$$
where RP stands for "reflected P", and we can write its amplitude
$$\mathbf{A}^{RP} = A^{RP}(\sin\theta_1, 0, \cos\theta_1). \tag{7.43}$$
As for the converted S, let’s call θ2 the angle along which it propagates, so p R S = (sin θ2 , 0, cos θ2 )
(7.44)
(where RS stands for “reflected S”, of course, which in this case is synonymous with converted S), and its amplitude A R S = A R S (− cos θ2 , 0, sin θ2 ),
(7.45)
perpendicular to p R S and within the vertical plane, so that the converted S is SV polarized. The reflection and refraction coefficients—which, remember, is what we are after—are the ratios of A R P and A R S to A. After writing A, p, etc., the way we did, we can substitute into (7.34) to get some more explicit expressions for the displacements associated with the incident and reflected P, and with the converted S: u P = A(sin θ1 , 0, − cos θ1 ) cos [k P (sin θ1 x1 − cos θ1 x3 − c P t)] ,
(7.46)
u R P = A R P (sin θ1 , 0, cos θ1 ) cos [k R P (sin θ1 x1 + cos θ1 x3 − c P t)] ,
(7.47)
u R S = A R S (− cos θ2 , 0, sin θ2 ) cos [k S (sin θ2 x1 + cos θ2 x3 − c S t)] ,
(7.48)
where $c_P = \sqrt{(\lambda+2\mu)/\rho}$ is the speed at which P waves propagate, and $c_S = \sqrt{\mu/\rho}$ that of S; the constants $k_P$ and $k_{RP}$ appearing in the incident versus reflected P formulae might in principle differ, but if that were the case the incident and reflected P would oscillate at different frequencies (you should be able to see that, in an expression like (7.34), frequency is proportional to c times k): which goes against one of Fresnel's main hypotheses (see above). So from now on we might as well just replace $k_{RP} = k_P$. The displacement one would observe at the surface (or anywhere within the half space, for that matter), would be the sum of the displacements carried by the incident
and both reflected waves, u = uP + uR P + uRS.
(7.49)
Now we might start thinking about stress: which if we have formulae for displacement, through Hooke's law (6.155) we can find formulae for stress. That might sound like a lot of work—the stress tensor τ has nine coefficients—but the only coefficients of τ that we care about are those that concern the outer surface, where the boundary condition is prescribed. The outer surface is horizontal, we can pick our reference frame so that its equation is x3 = 0, and it follows that stress on that surface is fully accounted for by the coefficients τ13, τ23, τ33. Now, because $u_2^P = u_2^{RP} = u_2^{RS} = 0$, i.e., $u_2 = 0$, and $\partial u_3/\partial x_2 = 0$, it also follows from Hooke's law (6.155) that τ23 = 0. Based, again, on (6.155), based on $u_2$ being zero and based on everything being constant with respect to x2, we are left with
$$\tau_{13} = \mu\left(\frac{\partial u_1}{\partial x_3} + \frac{\partial u_3}{\partial x_1}\right) \tag{7.50}$$
and
$$\tau_{33} = \lambda\left(\frac{\partial u_1}{\partial x_1} + \frac{\partial u_3}{\partial x_3}\right) + 2\mu\frac{\partial u_3}{\partial x_3} = \lambda\frac{\partial u_1}{\partial x_1} + (\lambda+2\mu)\frac{\partial u_3}{\partial x_3}. \tag{7.51}$$
One still needs to do some algebra to see what expressions for $\partial u_1/\partial x_1$, etc., are to be plugged into (7.50) and (7.51). Differentiating expressions (7.46) through (7.48) with respect to x1 and x3, we find
$$\frac{\partial u_1^P}{\partial x_1} = -A k_P\sin^2\theta_1\sin[k_P(\sin\theta_1 x_1 - \cos\theta_1 x_3 - c_P t)] \tag{7.52}$$
and similar formulae for $\partial u_1^P/\partial x_3$, $\partial u_3^P/\partial x_1$, $\partial u_3^P/\partial x_3$;
$$\frac{\partial u_1^{RP}}{\partial x_1} = -A^{RP} k_P\sin^2\theta_1\sin[k_P(\sin\theta_1 x_1 + \cos\theta_1 x_3 - c_P t)] \tag{7.53}$$
and formulae for $\partial u_1^{RP}/\partial x_3$, etc.;
$$\frac{\partial u_1^{RS}}{\partial x_1} = -A^{RS} k_S\cos\theta_2\sin\theta_2\sin[k_S(\sin\theta_2 x_1 + \cos\theta_2 x_3 - c_S t)], \tag{7.54}$$
etc. You've guessed what happens next; we remember (7.49), write
$$\tau_{13} = \mu\left(\frac{\partial u_1^P}{\partial x_3} + \frac{\partial u_3^P}{\partial x_1}\right) + \mu\left(\frac{\partial u_1^{RP}}{\partial x_3} + \frac{\partial u_3^{RP}}{\partial x_1}\right) + \mu\left(\frac{\partial u_1^{RS}}{\partial x_3} + \frac{\partial u_3^{RS}}{\partial x_1}\right) \tag{7.55}$$
and substitute into (7.55) the formulae we've just found for $\partial u_1^P/\partial x_3$, etc. And then the same for τ33. After some algebra,
$$\begin{aligned}\tau_{13} = {}& -\mu A k_P\sin(2\theta_1)\sin[k_P(\sin\theta_1 x_1 - \cos\theta_1 x_3 - c_P t)] \\ &+ \mu A^{RP}k_P\sin(2\theta_1)\sin[k_P(\sin\theta_1 x_1 + \cos\theta_1 x_3 - c_P t)] \\ &+ \mu A^{RS}k_S\cos(2\theta_2)\sin[k_S(\sin\theta_2 x_1 + \cos\theta_2 x_3 - c_S t)],\end{aligned} \tag{7.56}$$
where, in case you want to redo this by yourself, it is useful to remember the properties of the sine and cosine; in particular, cos²θ − sin²θ = cos(2θ), and 2 sin θ cos θ = sin(2θ). Likewise,
$$\begin{aligned}\tau_{33} = {}& -A k_P(\lambda + 2\mu\cos^2\theta_1)\sin[k_P(\sin\theta_1 x_1 - \cos\theta_1 x_3 - c_P t)] \\ &- A^{RP}k_P(\lambda + 2\mu\cos^2\theta_1)\sin[k_P(\sin\theta_1 x_1 + \cos\theta_1 x_3 - c_P t)] \\ &+ \mu A^{RS}k_S\sin(2\theta_2)\sin[k_S(\sin\theta_2 x_1 + \cos\theta_2 x_3 - c_S t)].\end{aligned} \tag{7.57}$$
At the boundary x3 = 0, Eqs. (7.56) and (7.57) translate to the conditions
$$(A^{RP} - A)k_P\sin(2\theta_1)\sin[k_P(\sin\theta_1 x_1 - c_P t)] = -A^{RS}k_S\cos(2\theta_2)\sin[k_S(\sin\theta_2 x_1 - c_S t)] \tag{7.58}$$
and
$$(A^{RP} + A)(\lambda + 2\mu\cos^2\theta_1)k_P\sin[k_P(\sin\theta_1 x_1 - c_P t)] = \mu A^{RS}\sin(2\theta_2)k_S\sin[k_S(\sin\theta_2 x_1 - c_S t)]. \tag{7.59}$$
Both (7.58) and (7.59) must hold everywhere on the free surface and at any moment in time, i.e., for all values of x1 and t, and the only way this can happen is if the arguments of the sin[...] at the left- and right-hand side of both equations coincide,
$$k_P(\sin\theta_1 x_1 - c_P t) = k_S(\sin\theta_2 x_1 - c_S t) \tag{7.60}$$
(both (7.58) and (7.59) result in this same condition). Equation (7.60) must also hold for all x1 and t, including t = 0, where it reduces to
$$\frac{k_P}{k_S} = \frac{\sin\theta_2}{\sin\theta_1}, \tag{7.61}$$
and x1 = 0, where
$$\frac{k_P}{k_S} = \frac{c_S}{c_P}. \tag{7.62}$$
From (7.61) and (7.62) it also follows that432
$$\frac{\sin\theta_2}{\sin\theta_1} = \frac{c_S}{c_P}, \tag{7.63}$$
or
$$\theta_2 = \arcsin\left(\frac{c_S}{c_P}\sin\theta_1\right), \tag{7.64}$$
i.e., you can get the direction of propagation of the converted S wave, from the ratio of S to P velocity and the angle of incidence of the P-wave, θ1. This is all very useful stuff, but more work is needed to get the reflection coefficients. Let's go back to Eqs. (7.58) and (7.59), from which we can now simplify away the sines—we've just found that their arguments always coincide—so
$$(A^{RP} - A)k_P\sin(2\theta_1) = -A^{RS}k_S\cos(2\theta_2) \tag{7.65}$$
and
$$(A^{RP} + A)(\lambda + 2\mu\cos^2\theta_1)k_P = \mu A^{RS}\sin(2\theta_2)k_S. \tag{7.66}$$
We might use Eq. (7.62) to write $k_P$ and $k_S$ in terms of $c_P$ and $c_S$...
$$(A^{RP} - A)\frac{c_S}{c_P}\sin(2\theta_1) = -A^{RS}\cos(2\theta_2) \tag{7.67}$$
and
$$(A^{RP} + A)\frac{c_S}{c_P}(\lambda + 2\mu\cos^2\theta_1) = \mu A^{RS}\sin(2\theta_2) \tag{7.68}$$
...and we can get rid of the λ's and μ's that are left; keeping in mind that $c_P = \sqrt{(\lambda+2\mu)/\rho}$ and $c_S = \sqrt{\mu/\rho}$, write
$$\frac{\lambda + 2\mu\cos^2\theta_1}{\mu} = \frac{\lambda + 2\mu(1-\sin^2\theta_1)}{\mu} = \frac{\lambda+2\mu}{\mu} - 2\sin^2\theta_1 = \frac{c_P^2}{c_S^2} - 2\sin^2\theta_2\,\frac{c_P^2}{c_S^2}, \tag{7.69}$$
where in the last step we've used (7.63). But then
$$\frac{\lambda + 2\mu\cos^2\theta_1}{\mu} = \frac{c_P^2}{c_S^2}\left(1 - 2\sin^2\theta_2\right) = \frac{c_P^2}{c_S^2}\left(\cos^2\theta_2 - \sin^2\theta_2\right), \tag{7.70}$$
and so433
$$\lambda + 2\mu\cos^2\theta_1 = \frac{c_P^2}{c_S^2}\,\mu\cos(2\theta_2), \tag{7.71}$$
which, plugged into (7.68), reduces it to
$$(A^{RP} + A)\frac{c_P}{c_S}\cos(2\theta_2) = A^{RS}\sin(2\theta_2). \tag{7.72}$$
Finally, divide both sides of both (7.67) and (7.72) by A, and we are left with the linear system
$$\begin{cases} \dfrac{A^{RP}}{A}\sin(2\theta_1) + \dfrac{c_P}{c_S}\dfrac{A^{RS}}{A}\cos(2\theta_2) = \sin(2\theta_1) \\[2mm] \dfrac{A^{RP}}{A}\cos(2\theta_2) - \dfrac{c_S}{c_P}\dfrac{A^{RS}}{A}\sin(2\theta_2) = -\cos(2\theta_2), \end{cases} \tag{7.73}$$
that we can solve for the two unknowns $A^{RP}/A$ and $A^{RS}/A$, i.e., the reflection coefficients. For example, from the first of (7.73) you find that
$$\frac{A^{RP}}{A} = 1 - \frac{c_P}{c_S}\frac{A^{RS}}{A}\frac{\cos(2\theta_2)}{\sin(2\theta_1)}; \tag{7.74}$$
then you plug that into the second of (7.73), and after a not insignificant amount of algebra, which however is pretty straightforward, I think, you get
$$\frac{A^{RS}}{A} = \frac{2\frac{c_P}{c_S}\sin(2\theta_1)\cos(2\theta_2)}{\sin(2\theta_1)\sin(2\theta_2) + \frac{c_P^2}{c_S^2}\cos^2(2\theta_2)}, \tag{7.75}$$
which substituted back into (7.74), plus some more algebra, gives
$$\frac{A^{RP}}{A} = \frac{\sin(2\theta_1)\sin(2\theta_2) - \frac{c_P^2}{c_S^2}\cos^2(2\theta_2)}{\sin(2\theta_1)\sin(2\theta_2) + \frac{c_P^2}{c_S^2}\cos^2(2\theta_2)} \tag{7.76}$$
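Equations (7.64), (7.75) and (7.76) are easy to put into numbers. In the sketch below the function name and the velocities (a "Poisson solid", with P velocity √3 times the S velocity) are my own illustrative choices, not the text's:

```python
# Free-surface reflection coefficients for an incident P wave,
# Eqs. (7.75)-(7.76), with theta2 from Eq. (7.64).
import math

def free_surface_P(theta1, cP, cS):
    """Return (A_RP/A, A_RS/A) for incidence angle theta1 (radians)."""
    theta2 = math.asin(cS / cP * math.sin(theta1))              # Eq. (7.64)
    D = (cP ** 2 / cS ** 2) * math.cos(2 * theta2) ** 2
    E = math.sin(2 * theta1) * math.sin(2 * theta2)
    rp = (E - D) / (E + D)                                      # Eq. (7.76)
    rs = 2 * (cP / cS) * math.sin(2 * theta1) * math.cos(2 * theta2) / (E + D)   # Eq. (7.75)
    return rp, rs

cP, cS = 5800.0, 5800.0 / math.sqrt(3.0)                        # m/s, Poisson solid
rp0, rs0 = free_surface_P(0.0, cP, cS)
print(rp0, rs0)   # -> -1.0 0.0
```

At normal incidence, sin(2θ1) = 0, so all the P energy is reflected back as P and nothing is converted to S; the minus sign on the reflected amplitude is a matter of convention for the direction of positive displacement.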
... and with that, we have formulae that allow us to calculate the reflection coefficients associated with a P wave hitting a free surface: all we need to know is its angle of incidence θ1, and the P- and S-wave velocities in the medium—remember θ2 is obtained from θ1 via (7.64). The second example that I propose to look at, which now is going to be easy for you because it is not very different from the one we just did—still, keep reading, because it will serve us to introduce a couple of new concepts—, the second example is that of an SV wave hitting a free surface, which reflects it back partly as SV, and partly as a "converted" P wave. Let's write the incident SV as
$$\mathbf{u}^S = A^S(\cos\theta_1, 0, \sin\theta_1)\cos[k_S(\sin\theta_1 x_1 - \cos\theta_1 x_3 - c_S t)]; \tag{7.77}$$
the reflected P and SV are just as in Eqs. (7.47) and (7.48), except you have to swap θ1 with θ2 because we’ve decided in (7.77) to call θ1 the angle of incidence of the incoming SV . (Anyway, just look at the drawing in Fig. 7.15.) The algebra that follows is similar to the above, and, if you decide to do it yourself, hopefully you’ll get
Fig. 7.15 Reflection and conversion of an SV wave at a free surface
$$\begin{aligned}\tau_{13} = {}& \mu A^S k_S \cos(2\theta_1)\sin\left[k_S(\sin\theta_1 x_1 - \cos\theta_1 x_3 - c_S t)\right] \\ & + \mu A^{RP} k_P \sin(2\theta_2)\sin\left[k_P(\sin\theta_2 x_1 + \cos\theta_2 x_3 - c_P t)\right] \\ & + \mu A^{RS} k_S \cos(2\theta_1)\sin\left[k_S(\sin\theta_1 x_1 + \cos\theta_1 x_3 - c_S t)\right], \end{aligned} \qquad (7.78)$$
which replaces (7.56), and

$$\begin{aligned}\tau_{33} = {}& -\mu A^S k_S \sin(2\theta_1)\sin\left[k_S(\sin\theta_1 x_1 - \cos\theta_1 x_3 - c_S t)\right] \\ & - A^{RP} k_P (\lambda + 2\mu\cos^2\theta_2)\sin\left[k_P(\sin\theta_2 x_1 + \cos\theta_2 x_3 - c_P t)\right] \\ & + \mu A^{RS} k_S \sin(2\theta_1)\sin\left[k_S(\sin\theta_1 x_1 + \cos\theta_1 x_3 - c_S t)\right] \end{aligned} \qquad (7.79)$$
instead of (7.57). At x3 = 0, both τ13 and τ33 must be zero, and

$$(A^S + A^{RS})\, k_S \cos(2\theta_1)\sin\left[k_S(\sin\theta_1 x_1 - c_S t)\right] = -A^{RP} k_P \sin(2\theta_2)\sin\left[k_P(\sin\theta_2 x_1 - c_P t)\right], \qquad (7.80)$$

$$\mu\, (A^{RS} - A^S)\, k_S \sin(2\theta_1)\sin\left[k_S(\sin\theta_1 x_1 - c_S t)\right] = A^{RP} k_P (\lambda + 2\mu\cos^2\theta_2)\sin\left[k_P(\sin\theta_2 x_1 - c_P t)\right]. \qquad (7.81)$$
Just as Eq. (7.60) followed from (7.58) or (7.59), it follows from (7.80) or (7.81) that

$$k_S(\sin\theta_1 x_1 - c_S t) = k_P(\sin\theta_2 x_1 - c_P t), \qquad (7.82)$$

from which it is inferred that

$$\frac{\sin\theta_2}{\sin\theta_1} = \frac{c_P}{c_S}, \qquad (7.83)$$
so again θ2 can be calculated from θ1 . Let’s derive the reflection coefficients, the way we did above, reducing (7.80) and (7.81) to the system
$$\frac{A^{RS}}{A^S}\cos(2\theta_1) + \frac{c_S}{c_P}\,\frac{A^{RP}}{A^S}\sin(2\theta_2) = -\cos(2\theta_1),$$
$$\frac{A^{RS}}{A^S}\sin(2\theta_1) - \frac{c_P}{c_S}\,\frac{A^{RP}}{A^S}\cos(2\theta_1) = \sin(2\theta_1), \qquad (7.84)$$

and then finding its solution

$$\frac{A^{RS}}{A^S} = \frac{\sin(2\theta_1)\sin(2\theta_2) - \frac{c_P^2}{c_S^2}\cos^2(2\theta_1)}{\sin(2\theta_1)\sin(2\theta_2) + \frac{c_P^2}{c_S^2}\cos^2(2\theta_1)}, \qquad (7.85)$$

$$\frac{A^{RP}}{A^S} = -\frac{\frac{c_P}{c_S}\sin(4\theta_1)}{\sin(2\theta_1)\sin(2\theta_2) + \frac{c_P^2}{c_S^2}\cos^2(2\theta_1)}. \qquad (7.86)$$
All this is similar to what we've found earlier, with one important difference. To see it clearly, rewrite (7.83) as

$$\sin\theta_2 = \frac{c_P}{c_S}\sin\theta_1; \qquad (7.87)$$

because cP > cS, then cP/cS > 1, then there will be some values of θ1 for which sinθ2 > 1: which is a problem, because a sine can never be larger than 1... The values in question are those such that

$$\sin\theta_1 > \frac{c_S}{c_P}, \qquad (7.88)$$

or

$$\theta_1 > \arcsin\left(\frac{c_S}{c_P}\right), \qquad (7.89)$$
and people call arcsin(cS/cP) the "critical angle". So if θ1 is smaller than the critical angle, no problem, there'll be reflected P and SV waves and now we have all we need to calculate their amplitude. But if θ1 is equal to or larger than the critical angle, things are not so simple. For instance, if θ1 coincides with the critical angle, then θ2 = π/2 and sin(2θ2) = 0, and Eq. (7.85) boils down to A^RS/A^S = −1, which would mean that all the SV energy that propagates from below the free surface is reflected back in the form of SV (with the sign of the displacement switched at the moment of reflection), and there is no conversion to P; but at the same time Eq. (7.86) in this case doesn't show A^RP to be zero: so I guess reflection of SV with no conversion to P can't happen—the boundary conditions won't hold—unless there's some P energy propagating "along" the surface—i.e., in the half-space, with θ2 = π/2. In this case, I guess it doesn't make much sense to think of u^RP as converted from SV—energy wouldn't be conserved. u^RP just needs to be there on its own or we wouldn't possibly be observing reflection of SV, either. Which is kind of weird.
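Below the critical angle, Eqs. (7.85) and (7.86) are straightforward to evaluate numerically; here is a quick sketch (the velocity values are, again, placeholder assumptions of mine):

```python
import math

def sv_free_surface_coeffs(theta1, cP=5800.0, cS=3200.0):
    """Reflection coefficients for SV incident on a free surface,
    Eqs. (7.85)-(7.86); cP, cS are assumed speeds (m/s)."""
    theta_cr = math.asin(cS / cP)                     # critical angle, Eq. (7.89)
    if theta1 >= theta_cr:
        raise ValueError("post-critical: converted P becomes inhomogeneous")
    theta2 = math.asin((cP / cS) * math.sin(theta1))  # Eq. (7.87)
    s1, s2 = math.sin(2 * theta1), math.sin(2 * theta2)
    c1 = math.cos(2 * theta1)
    D = s1 * s2 + (cP / cS) ** 2 * c1 ** 2            # common denominator
    RS = (s1 * s2 - (cP / cS) ** 2 * c1 ** 2) / D     # A^RS / A^S, Eq. (7.85)
    RP = -(cP / cS) * math.sin(4 * theta1) / D        # A^RP / A^S, Eq. (7.86)
    return RS, RP
```

At θ1 = 0 this returns A^RS/A^S = −1 and A^RP/A^S = 0: vertically incident SV bounces straight back with its sign flipped, and no conversion happens.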
If θ1 is larger than the critical angle, then we might try to make some sense of what's happening if we switch to complex numbers. Similar to what we did in the previous chapter, let's write plane waves as complex exponentials, i.e., let's replace (7.34) with

$$u(x, t) = A e^{ik(p\cdot x - ct)}, \qquad (7.90)$$

which anyway is the same as (7.34) plus the imaginary unit times A times a sine which has the same argument as the cosine434. Based on (7.90), the three waves that are involved when some SV hits a free surface are rewritten

$$u^S = A^S(\cos\theta_1, 0, \sin\theta_1)\, e^{i\left[k_S(\sin\theta_1 x_1 - \cos\theta_1 x_3 - c_S t)\right]}; \qquad (7.91)$$
$$u^{RP} = A^{RP}\left(\frac{c_P}{c_S}\sin\theta_1,\; 0,\; \sqrt{1 - \frac{c_P^2}{c_S^2}\sin^2\theta_1}\right) e^{i k_{RP}\left[\frac{c_P}{c_S}\sin\theta_1 x_1 + \sqrt{1 - (c_P^2/c_S^2)\sin^2\theta_1}\; x_3 - c_P t\right]}, \qquad (7.92)$$

$$u^{RS} = A^{RS}(\cos\theta_1, 0, \sin\theta_1)\, e^{i\left[k_S(\sin\theta_1 x_1 + \cos\theta_1 x_3 - c_S t)\right]}, \qquad (7.93)$$
where I have replaced sinθ2 with its expression (cP/cS) sinθ1, and I have obtained an equivalent expression for cosθ2 through the equality cosθ2 = √(1 − sin²θ2). After writing things this way, we have a clearer view of what happens to the term that we had initially called "converted P wave", or u^RP, when θ1 grows beyond the critical angle. Keep your eyes, in particular, on the term √(1 − (cP²/cS²) sin²θ1), which controls the amplitude of the x3 component of u^RP, and appears in the argument of the complex exponential (the sinusoid) also in u^RP. If (cP/cS) sinθ1 < 1, which is the same as saying sinθ1 < cS/cP or, look at (7.88), that θ1 is smaller than the critical angle, then √(1 − (cP²/cS²) sin²θ1) is a real number, and (7.92) is just a sinusoidal function, in time and space: and this is the same as what we'd found before435. When (cP/cS) sinθ1 > 1, then √(1 − (cP²/cS²) sin²θ1) becomes the square root of a negative number, i.e., an imaginary number,

$$\sqrt{1 - (c_P^2/c_S^2)\sin^2\theta_1} = i\,\sqrt{(c_P^2/c_S^2)\sin^2\theta_1 - 1}. \qquad (7.94)$$
To understand what that means for u^RP, write (7.92) component by component,

$$\begin{cases} u_1^{RP} = A^{RP}\,\frac{c_P}{c_S}\sin\theta_1\; e^{i k_{RP}\left[\frac{c_P}{c_S}\sin\theta_1 x_1 + \sqrt{1-(c_P^2/c_S^2)\sin^2\theta_1}\, x_3 - c_P t\right]}, \\[4pt] u_3^{RP} = A^{RP}\sqrt{1 - \frac{c_P^2}{c_S^2}\sin^2\theta_1}\; e^{i k_{RP}\left[\frac{c_P}{c_S}\sin\theta_1 x_1 + \sqrt{1-(c_P^2/c_S^2)\sin^2\theta_1}\, x_3 - c_P t\right]}. \end{cases} \qquad (7.95)$$

Now substitute (7.94) into (7.95), remember that −i = e^{−iπ/2}, and
$$\begin{cases} u_1^{RP} = A^{RP}\,\frac{c_P}{c_S}\sin\theta_1\; e^{i k_{RP}\left(\frac{c_P}{c_S}\sin\theta_1 x_1 - c_P t\right)}\, e^{-k_{RP}\sqrt{(c_P^2/c_S^2)\sin^2\theta_1 - 1}\; x_3}, \\[4pt] u_3^{RP} = -A^{RP}\sqrt{\frac{c_P^2}{c_S^2}\sin^2\theta_1 - 1}\; e^{i\left[k_{RP}\left(\frac{c_P}{c_S}\sin\theta_1 x_1 - c_P t\right) - \frac{\pi}{2}\right]}\, e^{-k_{RP}\sqrt{(c_P^2/c_S^2)\sin^2\theta_1 - 1}\; x_3}. \end{cases} \qquad (7.96)$$

The argument of the first exponential at the right-hand sides of both of (7.96) is imaginary, while the argument of the second exponential at the right-hand sides is real. It follows that the first exponential is a sinusoidal wave propagating along x1, in the direction of growing x1; the second exponential, if you remember that x3 is defined as positive upwards, is 1 at the surface, and then decays to zero with increasing depth (negative x3 growing in absolute value). The real part of displacement (7.96), if we take it for example at x1 = 0 to simplify our life, reads (cos(−k_RP c_P t), 0, cos(−k_RP c_P t − π/2)), which is the same as (cos(−k_RP c_P t), 0, sin(−k_RP c_P t)). Maybe you're beginning to see that this looks a lot like a Rayleigh wave, right? Except, if you think of the trajectory described by (cos(−k_RP c_P t), 0, sin(−k_RP c_P t)), it turns out that particle motion here is prograde (i.e., clockwise with respect to an observer, if the wave propagates from the left to the right of that observer), while the motion of a particle that's hit by a Rayleigh wave is always retrograde (remember Fig. 6.15). (Just to be clear, we are talking, of course, about motion right at the surface, x3 = 0.) So it might be a bad idea to call this a Rayleigh wave: a term that is sometimes used instead is "inhomogeneous" wave; another is "evanescent" wave. I guess both adjectives refer to the fact that the amplitude of these waves is not "homogeneous" with depth, but "vanishes" exponentially as depth grows. Whatever. In any case, this is what Aki and Richards436 call them.
7.9 Stoneley Waves

Aki and Richards' is, at least to my knowledge, the most exhaustive textbook437 on the theory (as in analytical theory, meaning mathematical derivations and all that) of what happens when a seismic wave meets a discontinuity. In the following I'll keep referring to Aki and Richards, because it's the book I happen to know best. What I'm going to do next is I am going to give you a summary of how Aki and Richards further elaborate on the stuff we've seen so far—you've seen how keen I am, in general, to redo all the maths that's behind the really important results in our discipline; this is really a bit too much, though, and for this once you'll have to allow me to cheat... So, first of all, Aki and Richards do not stop at the cases of a P or SV wave hitting a free surface, but they generalize to waves hitting internal discontinuities, as well, and as a result, like I was saying, you have not only reflected, but also refracted waves; they also generalize to combinations of different waves hitting the discontinuity: meaning that, for example, a P and an SV wave might both impinge on the discontinuity, and in general that will generate both reflected and refracted P and SV, plus, possibly, inhomogeneous waves. So Aki and Richards find reflection
and refraction coefficients for all those combinations. (And they do SH waves, too, but those are easy, because they can't convert to P or SV, they just stay SH; you can check the math out in their book, or in that of Ewing et al., and I am pretty sure that if you were able to follow what I did above, SH reflection and refraction is going to be a piece of cake for you.) When refraction comes into play, it turns out that not only the reflected P, and the refracted P, but also the refracted SV wave might be inhomogeneous. This will necessarily happen, beyond some critical angle, if cS in the layer where refracted waves propagate is higher than cS in the layer where the incident waves propagate. And so we meet the SV inhomogeneous wave, which is not the same thing as the P inhomogeneous wave that we've met before. Aki and Richards438 show that if you sum a P inhomogeneous and an SV inhomogeneous wave traveling along a free surface, the boundary conditions can still be verified, with no incoming, reflected or refracted body waves, and but what you get is actually a Rayleigh wave, retrograde motion and all439. Because there's no incident body wave, nor reflected or refracted body waves, we are left with just two inhomogeneous waves along the surface, i.e., only two coefficients to be constrained via the boundary conditions; Aki and Richards: "there is one less amplitude ratio to be determined, with no reduction in the number of boundary conditions": so to make sure the boundary conditions are still met, it turns out that only Rayleigh waves with a certain ratio of frequency to wavenumber are allowed to exist—the numerical value of that ratio depending on cP and cS: and Aki and Richards obtain an equation describing this constraint440, by which you shouldn't be too surprised, because that is the exact same equation, originally discovered by Rayleigh, that showed up already in this book as Eq. (6.246).
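That constraint—Rayleigh's equation—can be solved numerically. The sketch below uses the form (2 − c²/cS²)² = 4 √(1 − c²/cP²) √(1 − c²/cS²), which is one common way of writing it (an assumption on my part, since the equation itself appears back in Chap. 6), and finds the root by bisection:

```python
import math

def rayleigh_speed(cP, cS):
    """Find the Rayleigh-wave speed c, with 0 < c < cS, solving
    (2 - c^2/cS^2)^2 = 4 sqrt(1 - c^2/cP^2) sqrt(1 - c^2/cS^2)."""
    def F(c):
        x = (c / cS) ** 2
        return (2.0 - x) ** 2 - 4.0 * math.sqrt(1.0 - (c / cP) ** 2) * math.sqrt(1.0 - x)
    # F is negative just above c = 0 and positive just below c = cS:
    lo, hi = 1e-6 * cS, cS * (1.0 - 1e-12)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if F(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Poisson solid (cP = sqrt(3) cS): c comes out at about 0.92 cS
c = rayleigh_speed(math.sqrt(3.0), 1.0)
```

The root depends only on the ratio cP/cS, which is exactly the statement that only Rayleigh waves with one particular speed (hence one ratio of frequency to wavenumber) are allowed.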
So here you have a second way of "discovering" Rayleigh waves mathematically, although personally I find Rayleigh's original derivation more, I don't know, intuitive. The thing about Aki and Richards' approach, though, is that you can do the same thing at internal discontinuities: inhomogeneous waves might exist there, too, on both sides of any interface: like, you can have P, SV inhomogeneous waves traveling along the top side of an interface, and/or P, SV inhomogeneous waves traveling along its bottom side, at a different speed. Taken together, all these waves are called Stoneley waves, because Robert Stoneley was the first to figure this out and do the math, etc.441 Aki and Richards: "such a wave is composed of inhomogeneous waves decaying upward in the upper medium and decaying downward in the lower medium, so that particle motions are effectively confined to the vicinity of the interface". Just like the Rayleigh wave, the Stoneley wave can exist on its own, but, again, "you need all the plane waves interacting with the boundary [to be] inhomogeneous waves": incoming P and SV body waves alone won't do it. And so there is also no reason to expect that a Stoneley wave will generate body waves propagating away from the interface: instead, it will continue to propagate along the discontinuity forever, or until it's destroyed by anelastic dissipation (which, anelastic dissipation I've neglected from the theory, here, to keep things simple: but it does exist in the real world: and so it is understood that we don't expect waves to just keep bouncing around forever...). This is all done in Chap. 5 of Aki and Richards in what we might call the plane-wave approximation, i.e., assuming that P and S waves behave as in Eq. (7.90) here.
So most of Aki and Richards' Chap. 5 can be understood based on what I've done in the last, I don't know, ten or twenty pages, except there's a lot more algebra. In their Chap. 6, though, Aki and Richards study the reflection and refraction of spherical waves, i.e., waves that propagate out of a point. To simplify things, they also cheat a bit this time, and only look at "acoustic" waves, which here we've met already, in Chap. 6: it's the elastic waves that propagate in fluids. Aki and Richards start off with a function that describes a spherical, sinusoidal sound wave,

$$p(x, t) = \frac{1}{|x|}\, e^{i\omega\left(\frac{|x|}{c} - t\right)}, \qquad (7.97)$$

where the factor 1/|x| says that the wave amplitude decays like the inverse of the distance from the origin of the wave itself—where this function, actually, is singular. Because of that factor, (7.97) is not quite a d'Alembert-type wave, so it's not trivial to see that it should be a solution of the wave equation (6.26): Aki and Richards actually show that it is, and but their proof442 qualifies, at least to me, as reasonably hardcore math. The main trick, anyway, is to manipulate the right-hand side of (7.97) until it can be read as a superposition of plane waves: and then Aki and Richards apply their findings re the reflection and refraction of plane waves, see above, to the individual plane-wave contributions in the mutated version of (7.97).
7.10 Head Waves

Once Aki and Richards have gotten their spherical-wave formulae, they interpret them: and it turns out, importantly, that, besides all the direct and reflected and refracted body waves, and the Rayleigh/Love/Stoneley-type inhomogeneous waves that we already know about, there's another kind of wave443, which they call head waves: in Aki and Richards' words, these are "waves that travel from source to receiver via a path involving refraction along the boundary at a body-wave speed". (They also call them "conical waves", for reasons that I'd rather not get into—the name hasn't stuck anyway—and other authors, like Ewing and co-workers, call them "refraction arrival".) So, just to be clear, head waves don't show up at all in the "plane wave" theory (Aki and Richards' Chap. 5) and only emerge when a spherical wave meets a discontinuity. When a spherical body wave hits a discontinuity at exactly the critical angle, energy gets refracted into the other side—deeper layer in the diagram of Fig. 7.16—of the discontinuity and propagates along the discontinuity, at the body-wave velocity found in that deeper layer. Unlike what happens with Stoneley waves, head-wave energy is not trapped at the discontinuity, though: as the head wave travels on, energy gets emitted back into the upper layer in the form of body waves, whose angle of propagation coincides, again, with the critical angle (again Fig. 7.16). That means that head-wave energy is consumed relatively quickly. But let's say we have an instrument that's not so far from a quake; if propagation velocity in the lower layer is much higher than in the layer above, then the
Fig. 7.16 Propagation path of a head wave. The angle θcr is the so-called “critical angle”
head wave will hit the receiver before the direct P wave. Now, apparently, if you remember Mohorovicic and the Moho, it is generally accepted that the undae primae inferiores seen by Mohorovicic were head waves, "refracted along" what would later come to be called the Moho. Here is how Ewing et al. (1957, pp. 93–94) put it: "when an impulsive source and a receiver are located in a lower-velocity medium separated by a distance large compared with the distance of either from the plane of contact with the higher-velocity medium, it is observed that the first disturbance arrives at a time corresponding to propagation along the path shown in [Fig. 7.16]. From observations of travel times it can be inferred that the part of the path along the interface is traversed at the higher velocity [of the deeper layer], the remainder of the path at the lower velocity [of the shallower layer], and the angle of incidence is equal to the critical angle [...]. This is the well-known refraction arrival [AKA, the head wave], first used by Mohorovicic in 1909 for deducing continental crustal layering. It is the basis of the seismic-refraction method of exploration. The refraction arrival presented a serious difficulty in that no energy would be expected for this path from the viewpoint of geometric optics. This difficulty was first resolved by Jeffreys444 who, using wave theory, found terms corresponding to the refraction arrival"; which Jeffreys' derivation is, I think, different from the more recent one of Aki and Richards, but in any case reserved for those, among you, that are really into math... For those who like to look at data, the bottom line is that at very short epicentral distances (look at Fig. 7.17), let's say closer than 300 km or so, could be less, could be more, depending on how deep the hypocenter is, we have the direct P coming in first; then at some point, because the speed of propagation below the Moho is so much higher than above, the head wave associated with the Moho (that seismologists like to call445 Pn) always becomes the first arrival. At long distances, Pn is eventually too weak to be observed, the direct P can't even exist anymore because the earth is round, and the first arrival in a seismogram is the refracted P that propagates along a path that goes almost straight down from the hypocenter and then gets refracted as velocity grows with depth (velocity almost always grows with depth in the earth: we'll get to that in a minute), until eventually it gets refracted back up, almost vertically towards the surface: and that "phase" of the seismogram makes for the great majority of P-wave observations made by seismologists.
Fig. 7.17 The solid concentric circles are boundaries between different layers of the earth; the outermost one is the earth surface, where receivers are deployed. The dotted, dot-dashed and dashed lines are examples of the ray paths of "direct" P, Pn (head, diffracted...), and refracted P waves, respectively. Of the two "direct" P paths that are shown, one is tangent to the base of the top layer: beyond its epicentral distance, no direct P can be observed. (In the real world, of course, that distance changes locally depending on Moho depth; but you get the idea.)
7.11 The Herglotz-Wiechert-Bateman Method

Now, this pretty much ends the theory of wave propagation that you need, I think, to understand the rest of this book. And having established the concepts of wavefront, reflection, refraction, interference, we can start thinking in terms of phases and ray paths, and it should be clear that travel time is controlled by wave speed along the ray path. What we are going to see next, then, is how this theory can be "used" to constrain the structure of the earth. We'll do some of that in this chapter, and some more in Chap. 9, too. Historically, as far as I know, the first quantitative method to infer values of seismic velocity from travel-time measurements is the one that's known as Herglotz-Wiechert-Bateman, after the names of the people that first invented it: mostly Gustav Herglotz and Harry Bateman. Here is how it works (and why). Let x1 and x3 be two Cartesian axes: x1 on the horizontal plane and x3 positive downwards. Let the velocity of seismic waves, let's say P waves, but the exercise that follows would work out in exactly the same way if we were talking of S waves, let seismic velocity v be a function of depth only, v = v(x3). Now think of a ray path in the plane x1, x3, and let ds be the length of a small segment of the ray path, see the drawing in Fig. 7.18. If θ is the incidence angle, then, through basic trigonometry, there's a relationship between ds and the corresponding increment in depth: ds = dx3/cosθ; from which it also follows that dx1 = dx3 sinθ/cosθ. Speaking of sinθ, we've learned (Snell's law) that if at some depth there's a change in v, then at the same depth θ also has to change, in such a way that the ratio sinθ/v is the same on both sides of that discontinuity. Now, any velocity "profile" v(x3) can be approximated by a sequence of uniform, flat layers, with v constant within each layer—if v changes rapidly as a function of x3 we'll just have to have more layers. So, whether or not there are sharp discontinuities in v, the ratio sinθ/v stays constant along the ray
Fig. 7.18 Ray path segment in a layered half space
path; people like to call p = sinθ/v the "ray parameter". It follows that sinθ = pv, cosθ = √(1 − p²v²) (again, basic trigonometry), and

$$dx_1 = \frac{pv}{\sqrt{1 - p^2 v^2}}\, dx_3 = \frac{p}{\sqrt{\frac{1}{v^2} - p^2}}\, dx_3. \qquad (7.98)$$
If we could integrate along the entire depth range spanned by the ray path, let's call x3max the depth of its "bottoming point" (see the drawing, again), we would find the distance between the source and the point where the ray path emerges at the surface again,

$$x_1(p) = 2p \int_0^{x_3^{max}(p)} \frac{1}{\sqrt{\frac{1}{v^2(x_3)} - p^2}}\, dx_3, \qquad (7.99)$$

where the value of p identifies the ray path, and as such determines both x1 and x3max: and so I've written them both as functions of p. To keep things simple, I've placed the source at the surface (x3 = 0) of the half space, so there's only one integral in (7.99), which goes from 0 to x3max; and then I multiply it by 2. If the source were at some depth x3S I'd have to integrate between x3S and x3max, and then also from 0 to x3max, and sum. The math that follows is slightly easier if we replace 1/v with u, the inverse of velocity, which some people like to call slowness; then

$$x_1(p) = 2p \int_0^{x_3^{max}(p)} \frac{1}{\sqrt{u^2(x_3) - p^2}}\, dx_3. \qquad (7.100)$$
Now, what Herglotz and Bateman446 did was, they turned Eq. (7.100) into a tool to obtain, from measurements of travel time versus source-receiver distance—similar to the data in Fig. 7.1, for example—an estimate of the earth’s velocity profile v = v(x3 ). To show you how they did this, I am going to first derive the simple relationship that
Fig. 7.19 Incremental epicentral distance Δx1, and incremental distance along the ray path (seismic velocity times increment in travel time)
exists between p and travel time: you can see from the drawing in Fig. 7.19 that to a difference Δx1 in epicentral distance corresponds a difference v(0)Δt in distance traveled along the ray path—Δt is travel-time difference and v(0) is the seismic velocity just below the outer surface. Through the usual bit of trigonometry we know that those two quantities are related, Δx1 sinθ = v(0)Δt, but so then

$$\frac{1}{p} = \frac{v(0)}{\sin\theta} = \frac{\Delta x_1}{\Delta t}, \qquad (7.101)$$

and if we bring this to the limit where Δx1 and Δt are infinitely small,

$$\frac{1}{p} = \frac{dx_1}{dt}. \qquad (7.102)$$
So if we have data like those of Oldham, epicentral distance x1 as a function of t, we can calculate p as a function of t by just differentiating447 x1 with respect to t; and once that is done, you can think of the data as a table with three columns, each with the values of p, x1, or t. From such a table, one can extract a function p(x1), or x1(p), or whatever. And this will become useful in a minute. Before we get there, though, we need to use an old result from Niels Abel448 to put Eq. (7.100) in a different form. (And maybe I should warn you that the next couple of pages are going to be somewhat dense, as in mathematically dense: so much so that you might want to consider skipping to the final result of what I am about to do, i.e., Eq. (7.110).) Abel showed that if a function h is defined in terms of another, arbitrary function f, as follows,
$$h(x) = \int_x^a \frac{f(\xi)}{\sqrt{\xi - x}}\, d\xi, \qquad (7.103)$$

then f can be obtained from h through

$$f(\xi) = -\frac{1}{\pi}\, \frac{d}{d\xi} \int_\xi^a \frac{h(x)}{\sqrt{x - \xi}}\, dx, \qquad (7.104)$$
and vice-versa. For the time being it's probably best if you take this for granted449 and go on with the rest of my derivation. The trick, now, is to manipulate Eq. (7.100) to put it in the form (7.103); let's divide both sides of (7.100) by 2p, and use as integration variable ξ = u² instead of x3. Then,
$$\frac{x_1(p)}{2p} = \int_{u^2(0)}^{p^2} \frac{1}{\sqrt{\xi - p^2}}\, \frac{dx_3}{d\xi}\, d\xi, \qquad (7.105)$$

where, to understand the new upper integration limit, just think that when x3 = x3max(p), i.e., at the bottoming point of the ray path with ray parameter p, the incidence angle is π/2, and sin(π/2) = 1, and so u(x3max(p)) = p. Now if you look carefully at (7.105), you'll see that it looks very much like (7.103): in fact, define a = u²(0), x = p², ξ = u², h(x) = h(p²) = x1(p)/(2p) and f(ξ) = f(u²) = −dx3/d(u²), swap the integration limits, and (7.105) becomes precisely (7.103). That means that we can use Abel's result (7.104) to find an expression for dx3/d(u²),

$$\frac{dx_3}{d(u^2)} = \frac{1}{\pi}\, \frac{d}{d(u^2)} \int_{u^2}^{u^2(0)} \frac{1}{\sqrt{p^2 - u^2}}\, \frac{x_1(p)}{2p}\, d(p^2). \qquad (7.106)$$
Equation (7.106) can be simplified, first by integrating both sides with respect to u², so you get rid of those derivatives; and then noticing that d(p²)/dp = 2p, i.e., d(p²) = 2p dp, so it is natural to change the integration variable from p² to p, and

$$x_3(u) = \frac{1}{\pi} \int_u^{u(0)} \frac{x_1(p)}{\sqrt{p^2 - u^2}}\, dp, \qquad (7.107)$$
where the integration limits are u and u(0) because when p² = u² then p = u, etc.—u is positive by definition so nothing to worry about450. And, assuming u decreases (v grows) with depth x3, which it almost always does in the earth, u is always smaller than u(0) and the argument of the square root in (7.107) is always positive. Now, you might think that with Eq. (7.107) we are pretty much done, because if we plug a set of data in the form x1 = x1(p) into the integrand at the right-hand side, we could evaluate the integral numerically for any value of u, etc. Problem is, 1/√(p² − u²) is infinite when p = u, i.e., the integrand blows up at the integration limit p = u, so in practice there isn't much we can do with (7.107), yet. The good news451 is that 1/√(x² − c²), with c a constant, is the derivative of cosh⁻¹(x/c); in case you are wondering, the hyperbolic arccosine cosh⁻¹ is the inverse function of the hyperbolic cosine
$$\cosh x = \frac{e^x + e^{-x}}{2}. \qquad \text{(copy of Eq. 6.96)}$$
So, we know (Herglotz and Bateman knew) that

$$\int \frac{1}{\sqrt{x^2 - c^2}}\, dx = \mathrm{sgn}(x)\, \cosh^{-1}\left(\frac{x}{c}\right) + d, \qquad (7.108)$$
where sgn(x) means the sign of x, either +1 or −1, and d is an arbitrary constant, which is there because the left-hand side of (7.108) is an indefinite integral. This formula can be used to integrate the right-hand side of (7.107) by parts. Keeping in mind that the sign of p is positive by definition,

$$\begin{aligned} x_3(u) &= \frac{1}{\pi} \int_u^{u(0)} \frac{x_1(p)}{\sqrt{p^2 - u^2}}\, dp \\ &= \frac{1}{\pi} \left[\cosh^{-1}\left(\frac{p}{u}\right) x_1(p)\right]_u^{u(0)} - \frac{1}{\pi} \int_u^{u(0)} \cosh^{-1}\left(\frac{p}{u}\right) \frac{dx_1}{dp}\, dp. \end{aligned} \qquad (7.109)$$
Now, because p is constant along a ray path, p must coincide with the value of u(x3) = 1/v(x3) at the bottoming point of the ray path, where the incidence angle is π/2 and its sine is 1. But so then the integration limit u(0) coincides with the value of p associated with a "ray" that bottoms at the surface of the earth, and but then x1(u(0)) = 0, because a ray that bottoms at the surface of the earth can't cover any distance at all! Also, and again I am asking you to look up the properties of the hyperbolic arccosine, cosh⁻¹(1) = 0. It follows that the first term at the right-hand side of (7.109) is 0. As for the remaining integral, it is natural to switch the variable of integration from p to x1. Because, again, x1(u(0)) = 0, we're left with

$$x_3(u) = \frac{1}{\pi} \int_0^{x_1(u)} \cosh^{-1}\left(\frac{p(x_1)}{u}\right) dx_1. \qquad (7.110)$$
You can think of (7.110) as the essence of Herglotz's and/or Bateman's contribution452: and here's how you use it in practice: imagine you have a data set, a table of values of epicentral distance x1 and corresponding travel time t for a certain seismic phase; we've already seen (Eq. (7.102), etc.) that we can immediately get a value of p for each pair x1, t; which means we can substitute numerical values (data) for the function p(x1) at the right-hand side of (7.110). If we pick an arbitrary, positive value for u, then we have all we need to evaluate numerically453 the integral at the right-hand side of (7.110); multiply the result of that by 1/π, and you get the value of x3 where that value of u is found: the depth, below the surface of the earth, where seismic velocity coincides with 1/u. Repeat this exercise for a suite of plausible values of u, and you get a curve of u versus x3, or, if you take v = 1/u, a "velocity profile": you've obtained, from travel time data, a "model" of earth structure (and see one vintage example in Fig. 7.20).
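Here is a minimal numerical sketch of that recipe. Instead of real data I use a half space where velocity grows linearly with depth, v(x3) = v0 + g·x3 (my choice of test model, with made-up v0 and g), because in that case the function x1(p) is known in closed form, and we can check that Eq. (7.110) gives the correct depth back:

```python
import math

v0, g = 5.0, 0.05   # km/s and (km/s)/km: assumed test values, not real-earth ones

def X(p):
    # epicentral distance covered by a ray with parameter p in v = v0 + g*x3
    return 2.0 * math.sqrt(max(0.0, 1.0 - (p * v0) ** 2)) / (p * g)

def p_of_x1(x1):
    # invert X(p) by bisection; X decreases as p grows from 0 to 1/v0
    lo, hi = 1e-9, 1.0 / v0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if X(mid) > x1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def hwb_depth(u, n=4000):
    # Eq. (7.110): the ray with p = u bottoms where it emerges at x1(u) = X(u);
    # midpoint rule over "data" p(x1) sampled from the synthetic model
    x1_max = X(u)
    dx = x1_max / n
    total = 0.0
    for i in range(n):
        x1 = (i + 0.5) * dx
        total += math.acosh(max(1.0, p_of_x1(x1) / u)) * dx
    return total / math.pi

u = 1.0 / (v0 + g * 20.0)   # slowness of the true model at 20 km depth
depth = hwb_depth(u)        # comes out very close to 20 km
```

With real data you'd build p(x1) from the travel-time table via Eq. (7.102) rather than from a formula, but the rest of the procedure is the same.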
Fig. 7.20 Velocity of P and S waves in the earth, from its center to the surface, as estimated by Beno Gutenberg in his 1914 “Über Erdbebenwellen VIIA” paper. Distance from the earth’s center (horizontal axis) is in thousands of kilometers; seismic velocities (vertical axis) in km/s
This argument holds as long as there’s only one depth x3 where slowness is precisely u; we know that seismic velocities in the earth generally grow (slowness decreases) with increasing depth, and so in principle we should be OK. To be honest, there probably exists at least one small low-velocity layer, i.e., a short depth range where velocity actually decreases with depth before starting to grow again, which might mess things up. But (as we shall see in the next chapter) that’s a small effect and limited to short epicentral distances: it shouldn’t cause any trouble in a global data set like, e.g., those of Oldham, with quakes recorded at stations that are very far from the epicenter. Wiechert’s student Beno Gutenberg used his supervisor’s method to derive a global model of the earth that would achieve a certain popularity, so to say; the paper that usually gets cited is “Über Erdbenwellen VIIA. Beobachtungen an Registrierungen von Fernbeben in Göttingen und Folgerungen über die Konstitution des Erdkörpers”454 , Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse 1914 (Like its title suggests, this is really part 7 of a series of papers über Erdbenwellen, on seismic waves, that Wiechert, Zöppritz455 , Geiger and Gutenberg had all contributed to since 1907), signed by Gutenberg alone. Figure 7.20 is taken from there and shows a picture of the earth’s mantle that would not change too much in the forthcoming years; beyond the mantle, there’s a major discontinuity at about 2900 km depth, which is the top of Oldham’s core, except Oldham thought it to be much shallower, while Gutenberg’s estimate of its size also wouldn’t be modified much by future seismologists. I guess the reasons Gutenberg’s estimate is more precise than Oldham’s are: more data, and a better estimate of velocity and how it changes with depth within the mantle: which Gutenberg had obtained
via the Herglotz-Wiechert-Bateman method. As for estimating velocities in the core, we already know from Oldham that seismic velocities drop drastically as we get into the core: which is a problem for HWB. So at those depths, Gutenberg had to make an a priori guess of velocity, calculate travel times of waves bottoming in the core, and see if his guess fit the data: a trial-and-error approach that eventually resulted in the diagram of Fig. 7.20. Something that might perturb some of you in this model is that Gutenberg gives estimates of both P and S velocity in the core; and maybe you have heard that the core is supposed to be liquid, actually, which implies that shear waves can't propagate through it (remember Chap. 6)—and so it doesn't make any sense to speak of shear-wave velocity in the core. In fact, most textbooks (those that I am aware of, anyway) simply say that the core must be liquid because S waves are not detected at the antipodes of a quake, no matter how big: if the core were solid, S waves would be able to propagate through it and so then they would be seen emerging at the antipodes: but since they are not, well, then the core has got to be liquid. This is a bit of a simplification, because S arrivals are observed at or near the antipodes of a quake (although they come in very late with respect to P and other phases, and/or their amplitude on the seismogram is very much "damped": more on all that in a minute), which is how Oldham and Gutenberg, who were not stupid, came up with some estimates for S velocity in the core. The discussion on whether the core is liquid or solid had to do with more than just seismic travel times, as one sees from Harold Jeffreys' paper, "The Rigidity of the Earth's Central Core"456, which is sort of the turning point after which the majority of people pretty much accepted the idea that the earth's core is in the liquid state, etc.
7.12 The Rigidity of the Earth’s Central Core

Jeffreys looked at two pieces of evidence: solid-earth tides, and the Chandler wobble. Let’s look first at what he did with tides. You all know about ocean tides457: during the earth’s rotation on its axis, any given portion of the ocean surface finds itself, at some point, relatively close to the moon: and the moon then slightly pulls it, by its gravitational attraction, away from the solid earth. The effect goes away—or, rather, moves to another part of the world—as the earth keeps revolving. What people pay less attention to is the fact that the solid earth is also attracted to the moon by the same mechanism: it doesn’t move as much as the water, because rocks are way more rigid than water; but they are not perfectly rigid, and so the solid earth does deform slightly—enough that people in Jeffreys’ time had already been able to measure “earth tides”. And, starting with Kelvin in 1863, they had figured that, from knowing the size of the solid earth’s response to the moon’s attraction, one could estimate our planet’s average rigidity—the μ of Hooke’s law. So what Jeffreys did was, he took various models of the earth, starting with Wiechert’s two-shell model, and assigned to each shell a different rigidity, assuming that density was already sufficiently well constrained: then, he calculated how large
earth tides should be in each of those models458, and checked whether that fit existing observations of earth tides. Jeffreys gets his μ from values of $v_S$ estimated by the seismologists, as in Fig. 7.20, because $v_S = \sqrt{\mu/\rho}$, and so if we have $v_S$ and ρ, we also have μ. But then, if ρ is Wiechert’s ρ, things don’t quite work. “The analysis of seismological data”, says Jeffreys, “leads to the conclusion that the velocity [of S waves] is about 4 km/s at a depth of some tens of kilometres, increases steadily to about 7 km/s at [a depth of] about 0.2a [a being the radius of the earth] and remains roughly constant from there to about 0.45a. Combining these data with the standard Wiechert law of density, we find for the rigidity [like I said, $\mu = \rho v_S^2$, where ρ is Wiechert’s ρ] at the surface the value $5.12 \times 10^{11}$ [c.g.s.459], at the base of the rocky layer $15.6 \times 10^{11}$ and in the upper part of the metallic core $40 \times 10^{11}$.” By “upper part of the metallic core” Jeffreys means, at this point, the depth range just below 1500 km depth, which is where Wiechert had his big density jump—essentially the core-mantle boundary according to Wiechert—and of course the “rocky layer” is everything between that and the Moho. But these values don’t fit tide data. They are “at every point [...] substantially greater”, says Jeffreys, “than those found to be consistent with the tidal data.” Jeffreys figures that estimates of $v_S$ from seismology are fairly reliable; Wiechert’s ρ, on the other hand, is non-unique: meaning, it’s not that hard to find a different profile of ρ that still fits the precession/inertia-tensor data that Wiechert used. So Jeffreys does precisely that—he proposes to move the major density jump, that Wiechert had placed at a radius of 0.78a, down to 0.545a—about halfway towards the center of the earth.
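Jeffreys’ three numbers are, by the way, easy to reproduce from $\mu = \rho v_S^2$; here is a minimal check, where the densities (roughly Wiechert’s, as I read them: 3.2 g/cm³ for the rocky layer and 8.2 g/cm³ for the metallic core) are my assumption, a sketch rather than Jeffreys’ actual input:

```python
# Quick check of Jeffreys' rigidity values: mu = rho * v_S**2, in c.g.s. units.
# The densities are assumed (my reading of Wiechert's two-shell model):
# 3.2 g/cm^3 for the rocky layer, 8.2 g/cm^3 for the metallic core.

def rigidity_cgs(rho_g_cm3, v_s_km_s):
    """mu = rho * v_S^2, returned in dyn/cm^2 (c.g.s.)."""
    v_cm_s = v_s_km_s * 1.0e5   # km/s -> cm/s
    return rho_g_cm3 * v_cm_s**2

print(rigidity_cgs(3.2, 4.0))   # near-surface: 5.12e11, exactly Jeffreys' value
print(rigidity_cgs(3.2, 7.0))   # base of rocky layer: ~15.7e11 (Jeffreys: 15.6e11)
print(rigidity_cgs(8.2, 7.0))   # top of the "metallic core": ~40e11
```

The fact that two round densities reproduce all three values is consistent with Wiechert’s model being made of two homogeneous shells.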
(Based on Wiechert’s precession data, and on the total mass of the earth, which is known since Cavendish, the average densities of mantle and core are $4.27 \times 10^3$ and $12.04 \times 10^3$ kg/m$^3$ in Jeffreys’ new model.) At the same depth, Jeffreys postulates a major jump in rigidity as well. Above it, the rigidity is that inferred from seismology; below it, rigidity needs to be extremely small—we might as well set it to zero, i.e., we are back to the old idea that the deep interior of the earth is liquid—but now this idea is found to fit a variety of different data: seismology, density, tides. And the Chandler wobble as well. I think I explained briefly in Chap. 2 that there had been an issue, with free precession, that was only clarified after Chandler’s observation of it—which was the first time, as you know, that someone ever observed it, and which came quite late—1891—while the precession of equinoxes, AKA forced precession, had been observed and known for ages—even before the physics behind it was explained by Euler. The frequencies of both free and forced precession are related, through Euler’s theory, to the ratio $(C - A)/A$, where A and C, as we know, are the components of the inertia tensor460. Long before Chandler’s observation, people were able to estimate $(C - A)/A$ from forced precession data. The frequency of free precession is simply $(C - A)\Omega/A$, where $\Omega$ is the frequency of the earth’s rotation on itself, i.e., one day to the minus one. Forced-precession data gave461 $(C - A)/A = 1/305$, and so people expected to observe another precession—the free precession—with a period of 305 d, or about ten months. What happened, instead, is that nobody was able to observe anything of that kind for a long time, until Chandler
saw a very subtle (like, a tenth of a second of arc) precession with a period of, as we know, about 427 d, or fourteen months. Shortly after Chandler published his finding, Simon Newcomb462 explained that the reason Euler’s theory had predicted a wrong value for what would come to be called “Chandler wobble” was that, in Euler’s theory, the earth is treated as a perfectly rigid body. But the earth, like we were just saying, is not perfectly rigid. As the direction of the earth’s rotation axis changes, the centrifugal force that is associated with rotation changes as well, and that causes a deformation that affects A and C, and the period of free precession as a result. Now, and here’s why this is relevant to Jeffreys’ work in 1926: the extent to which $(C - A)/A$ is perturbed depends, via Hooke’s law and all that, on the value of the earth’s rigidity. And, long story short, the μ that Newcomb inferred from the period of Chandler wobble is consistent with the μ derived from earth-tide observations. Which substantiates Jeffreys’ conclusion that the earth’s core had got to have very little rigidity—that it had to be fluid.
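The arithmetic of the mismatch takes two lines; a minimal sketch of Euler’s rigid-earth prediction, with $\Omega$ set to one cycle per day (ignoring, for this back-of-the-envelope purpose, the sidereal/solar-day distinction):

```python
# Euler's rigid-earth prediction for the free-precession (Chandler) period:
# frequency = [(C - A)/A] * Omega, with Omega = one rotation per day.

def free_precession_period_days(flattening_ratio):
    """Period, in days, of Euler's free precession, given (C - A)/A.
    With Omega = 1 cycle/day, the period is simply A/(C - A) days."""
    return 1.0 / flattening_ratio

print(free_precession_period_days(1.0 / 305.0))  # 305 days: about ten months
# Chandler actually observed ~427 days (~fourteen months); the discrepancy
# is what Newcomb attributed to the earth's non-rigidity.
```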
7.13 Inge Lehmann and the Inner Core

After Jeffreys’ contribution, people kept looking into the core, and about ten years later another important observation was made—a major discontinuity in the physical properties of the core was found, some 2000 km deeper than the core-mantle boundary. To see how that was discovered, start by looking at Fig. 7.21. In models like Jeffreys’, because $v_P$ in the core is much lower than in the mantle, by Snell’s law P
Fig. 7.21 The P-wave (emitted at E) ray path that bottoms closest to the core-mantle boundary emerges at the location labeled “1” (dashed line). Paths that bottom deeper (solid lines) enter the core, where v P is low, and so are refracted away from “1”. The smallest incidence angle at the CMB results in the “core”-P observation closest to the epicenter, i.e., point “3” (thin solid line). Between “1” and “3” is the P-wave “shadow zone”
waves passing from the mantle into the core are refracted downwards: and that gives us the path marked E2 in the diagram. If velocity increases a bit with depth within the mantle, P waves that just miss the core-mantle boundary are refracted upwards, hence path E1. If you go and draw all possible ray paths—i.e., shooting from E at all possible take-off angles—you’ll see that there is a minimum distance corresponding to E3. And but, between E1 and E3 there’s nothing: a seismometer placed at those distances from the quake won’t measure any direct P wave: that’s called the P-wave “shadow zone”; “1” is measured to happen at about 105° from the source, and “3” at about 142°. But the thing is, as new data accumulated, people started realizing that the shadow zone isn’t completely blank. Inge Lehmann, from the Royal Danish Geodetic Institute, and one of the very few women in this story463, happened to look at recordings, made at Danish stations in Greenland, of a big quake that happened in New Zealand in 1929. Greenland was totally in the shadow zone of its epicenter, and yet Lehmann was able to catch some clear P arrivals at those stations. Those arrivals were quite late, if compared, e.g., with the travel time of the direct P wave that you measure right before the shadow zone at “1”, or after the shadow zone, at “3” and “2”. Lehmann realized that you could explain those arrival times with a model that includes a core-within-the-core: an inner core, as it later came to be called. In the earth model that she proposed, $v_P$ is 8 km/s in the “outer” core, and suddenly grows to 8.6 km/s on the other side of the inner-core boundary. P waves emitted from an epicenter in New Zealand, refracted into the liquid outer core, and then reflected off the surface of the inner core and refracted back into the mantle, would be observed in Greenland at about the time she measured from seismograms. That is path E5 in Fig. 7.22.
Fig. 7.22 The P wave that emerges at “4” (i.e., in the shadow zone) is reflected by the inner-core boundary. The P wave that emerges at “5” is refracted into the inner core and then again through outer core and mantle. These phases travel a longer way than “outer core” P phases emerging at similar epicentral distances: and so they appear much later in seismograms
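The refraction at the core-mantle boundary that opens up the shadow zone is just Snell’s law with a velocity drop. A minimal sketch, with round-number velocities of my choosing (in the spirit of Fig. 7.20, not taken from it):

```python
import math

# Snell's law across an interface: sin(theta_2)/v_2 = sin(theta_1)/v_1.
# Assumed round-number values: v_P ~ 13.6 km/s at the base of the mantle,
# ~8 km/s at the top of the core (hypothetical, for illustration only).

def refraction_angle_deg(theta1_deg, v1, v2):
    """Angle (degrees, from the normal) of the transmitted ray."""
    s = math.sin(math.radians(theta1_deg)) * v2 / v1
    return math.degrees(math.asin(s))

# A P wave hitting the CMB at 40 degrees incidence is bent toward the
# vertical, i.e., "refracted downwards" in the language of the text:
print(refraction_angle_deg(40.0, 13.6, 8.0))   # ~22 degrees
```

Because $v_2 < v_1$, the transmitted ray is always closer to the normal than the incident one, which is what pulls core-traversing paths like E2 away from the region between “1” and “3”.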
Lehmann published her famous paper464 in 1936. In the years that followed, people like Gutenberg, Jeffreys, etc., looked at more data and confirmed Lehmann’s speculation. P waves refracted from the outer core into the inner core, and back (path E6 in Fig. 7.22), were observed. The inner core was the last of the most prominent discontinuities of the earth’s interior to be mapped: with Lehmann’s discovery, the grand structure of the earth as we conceive it today was, essentially, established. As you read this, though, you might wonder: what is meant, exactly, by “structure”? We have been talking, mostly, about the velocity of seismic waves; about the earth’s rigidity, and, to some extent, density. But the density models we’ve seen so far (Wiechert’s, etc.) have much less complexity than the $v_P$ and $v_S$ profiles elaborated by seismologists: they look like they might be a much simplified version of how density changes with depth in the real world. And then of course there’s also the question of the chemical composition of the earth; and of its temperature. Post-Lehmann, there’s still a lot to be learned about the deep interior of the earth.
7.14 From Seismic Velocities to Density: The Williamson-Adams Method

In 1923, Erskine Williamson and Leason Adams465 publish an article, “Density Distribution in the Earth”, in the Journal of the Washington Academy of Sciences, which they start out by saying that, other than the mass of the earth and “the constant of precession and other astronomic and geodetic data from which the moments of inertia of the Earth may be calculated”, now also “seismologic data from which the elastic constants of the materials in the interior may be computed [...] provide the basis for the present estimate of the density and composition of the Earth at various depths.” That is, they are announcing that they’ve found a way to convert models of the earth’s seismic velocity, which we’ve learned how they can be derived (Herglotz-Wiechert-Bateman, for example), into models of the earth’s density. Let’s see how they do this. “It has been shown from the theory of elasticity”, they say, “that any disturbance in a sphere of elastic isotropic material should give rise to various kinds of waves traveling with velocities depending only on the density and elastic constants of the material at each point. Waves of two of these kinds pass through the Earth, while the others, which are less simple to analyze, travel over the surface. A seismograph recording the time of arrival of the various waves at some other point would show the arrival first of the two waves passing through the Earth and later that of the various surface ones. One of the ‘through-waves’ consists of transverse vibrations and travels with a velocity

$$v_S = \sqrt{\frac{\mu}{\rho}} \qquad (7.111)$$
while the other consists of longitudinal vibrations and travels with the higher velocity
$$v_P = \sqrt{\frac{K + \frac{4}{3}\mu}{\rho}} \qquad (7.112)$$
μ being the rigidity and K the bulk modulus.” (Williamson and Adams use the letter R instead of μ.) You probably remember that we derived all this stuff in Chap. 6; except that in Chap. 6 I barely mentioned the bulk modulus and wrote everything in terms of Lamé’s parameters μ, OK, but also λ. So to make sense of Eq. (7.112), let me show you the relationship that there is between λ and K. (And in case you are wondering, why do we need to introduce a new parameter? why don’t we just stick to λ and μ?: well, that’s a good question, and but the thing is, K is useful now because by its definition it is related to pressure in a simple way; and pressure is closely connected to density, as you can imagine, which is what Williamson and Adams are after... we’ll get to that in a minute.) Bulk modulus is the name people give to the ratio of a change in pressure to the corresponding relative change in volume466,

$$K = -\frac{dp}{dV/V}; \qquad (7.113)$$

we’ve also seen467 that $\frac{dV}{V} = -\frac{d\rho}{\rho}$, with ρ being density, so Eq. (7.113) can be rewritten

$$K = \rho\frac{dp}{d\rho}. \qquad (7.114)$$
Now that you also know what the stress tensor is, though, you might realize that it doesn’t make much sense to speak of pressure in a solid—surface forces in a solid are described by the six independent coefficients of τ, three of which (the diagonal ones) are related to the pressure, and might differ from one another—meaning “pressure” might change depending on direction. So then the bulk modulus is defined by prescribing that shear stresses—the non-diagonal entries of the stress tensor—be all zero, and that normal stresses be the same in all directions. In practice, that’s the same as setting up an experiment where a chunk of material is under isotropic stress with no shearing (some people would call that hydrostatic stress, whatever, either way it just means that $\tau_{11} = \tau_{22} = \tau_{33}$, and all the other components of τ are zero), and then we measure the change in volume; then by definition the bulk modulus is given by

$$K = \frac{\tau_{11}}{dV/V}. \qquad (7.115)$$

Equation (7.115) is basically (7.113), except for the sign, because while pressure is conventionally taken to be positive when it tries to reduce the volume of the element of material it’s applied to, in Chap. 6 we’ve chosen to call positive a normal stress that pulls the material within a given volume towards the outside. And instead of $\tau_{11}$, of course, you might write $\tau_{22}$ or $\tau_{33}$: they are all the same468. Now to see how K can be expressed in terms of λ, write out the elastic stress-strain relationship for each of the diagonal entries of the stress tensor τ,
$$\begin{cases}
\tau_{11} = \lambda\sum_{k=1}^{3}\frac{\partial u_k}{\partial x_k} + 2\mu\varepsilon_{11},\\
\tau_{22} = \lambda\sum_{k=1}^{3}\frac{\partial u_k}{\partial x_k} + 2\mu\varepsilon_{22},\\
\tau_{33} = \lambda\sum_{k=1}^{3}\frac{\partial u_k}{\partial x_k} + 2\mu\varepsilon_{33},
\end{cases} \qquad (7.116)$$
where I’ve temporarily dropped Einstein’s summation convention, so there’s no danger of confusing anybody. If you sum up the three equations in (7.116),

$$\tau_{11}+\tau_{22}+\tau_{33} = 3\lambda\sum_{k=1}^{3}\frac{\partial u_k}{\partial x_k} + 2\mu\sum_{k=1}^{3}\frac{\partial u_k}{\partial x_k} \qquad (7.117)$$

(because remember the definition of ε), or

$$\tau_{11}+\tau_{22}+\tau_{33} = (3\lambda+2\mu)\nabla\cdot\mathbf{u}. \qquad (7.118)$$
Now, if stress is purely hydrostatic, which is what it is in the definition of K, the left-hand side of (7.118) coincides with −3 times dp, and
$$dp = -\left(\lambda + \frac{2}{3}\mu\right)\nabla\cdot\mathbf{u}. \qquad (7.119)$$

But we know from Chap. 6 that $\nabla\cdot\mathbf{u} = \frac{dV}{V}$, so

$$dp = -\left(\lambda + \frac{2}{3}\mu\right)\frac{dV}{V}. \qquad (7.120)$$

If you replace $\frac{dV}{V}$ with $-\frac{d\rho}{\rho}$, see above,

$$\rho\frac{dp}{d\rho} = \lambda + \frac{2}{3}\mu, \qquad (7.121)$$
and if you compare this with (7.114) it follows that the bulk modulus $K = \lambda + \frac{2}{3}\mu$, or, for λ, $\lambda = K - \frac{2}{3}\mu$. Sub that into our old formula for $v_P$,

$$v_P = \sqrt{\frac{\lambda+2\mu}{\rho}} = \sqrt{\frac{K-\frac{2}{3}\mu+2\mu}{\rho}} = \sqrt{\frac{K+\frac{4}{3}\mu}{\rho}}, \qquad (7.122)$$
i.e., Eq. (7.112), QED. Now, from Eqs. (7.111) and (7.112), plus some algebra, we find that

$$v_P^2 - \frac{4}{3}v_S^2 = \frac{K}{\rho}, \qquad (7.123)$$
which is the first ingredient of Williamson and Adams’ recipe469. Also, if one could measure the weight per unit surface (read: pressure) of all rocks that sit on top of a depth $a - r$ in the earth (where a is the radius of the earth), that should coincide with470

$$p(r) = \int_r^a \rho(r')\,g(r')\,dr', \qquad (7.124)$$
where g is acceleration of gravity—not at the surface of the earth, but as a function of distance r from the center of the earth. So the derivative
$$\begin{aligned}
\frac{dp}{dr} &= \lim_{\delta r \to 0}\frac{1}{\delta r}\left[\int_{r+\delta r}^a \rho(r')g(r')\,dr' - \int_r^a \rho(r')g(r')\,dr'\right]\\
&= -\lim_{\delta r \to 0}\frac{1}{\delta r}\int_r^{r+\delta r} \rho(r')g(r')\,dr'\\
&= -\lim_{\delta r \to 0}\frac{1}{\delta r}\,\rho(r)g(r)\,\delta r\\
&= -\rho(r)g(r),
\end{aligned} \qquad (7.125)$$

where the minus sign at the right-hand side makes sense (besides being algebraically correct), because the deeper you go into the earth, the smaller r is, and but the larger the pressure p. What Williamson and Adams then do is they combine Eqs. (7.114) and (7.125), which if you rewrite the former as

$$dp = \frac{K}{\rho}\,d\rho \qquad (7.126)$$
and substitute that into the latter, you get

$$\frac{K(r)}{\rho(r)}\frac{d\rho(r)}{dr} = -\rho(r)g(r), \qquad (7.127)$$

or

$$\frac{d\rho(r)}{dr} = -\frac{\rho^2(r)\,g(r)}{K(r)}. \qquad (7.128)$$
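The bookkeeping between the Chap. 6 formulas, (7.122) and (7.123) is easy to sanity-check numerically; a minimal sketch, with made-up (vaguely mantle-like) elastic constants of my choosing:

```python
# Numeric check of Eq. (7.123), with hypothetical elastic constants.
# With K = lambda + (2/3) mu, we should find v_P^2 - (4/3) v_S^2 = K / rho.

lam, mu, rho = 8.0e10, 6.0e10, 3.3e3   # Pa, Pa, kg/m^3: made-up values

vp2 = (lam + 2.0 * mu) / rho           # v_P^2, from the Chap. 6 formula
vs2 = mu / rho                         # v_S^2, Eq. (7.111) squared
K = lam + (2.0 / 3.0) * mu             # bulk modulus, from Eq. (7.122)

print(vp2 - (4.0 / 3.0) * vs2)         # left-hand side of (7.123)
print(K / rho)                         # right-hand side: the two must agree
```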
In playing these little calculus tricks, though, we must not forget that the derivative at the right-hand side of Eq. (7.114) refers to a setup where pressure and density change because stuff is being mechanically compressed or expanded; that is, for example, you’ve got some specific material, compress it by dp, while keeping all other parameters in your lab exactly the same471, and, through (7.114), its density changes by $d\rho = \frac{\rho}{K}dp$. Now imagine that the way you increase (decrease) pressure is by taking your measurement of ρ at a slightly larger (smaller) depth inside the earth, which is what Eqs. (7.127) and (7.128) implicitly mean. Then, (7.114) is still OK, but only so long as the material at r + dr is the same as that at r. If at some depth you’ve got some significant chemical change, that’s something that could involve the same variation in density through a much smaller (or larger) variation in pressure, or in any case a variation in pressure that is not described by (7.114), or (7.126). Equation (7.114) would become useless, and everything Williamson and Adams are deriving based on it, starting with Eq. (7.128), would cease to be valid, too. In their paper, they summarize this with the quite terse phrase: “on the assumption of homogeneity”472,473. (We’ll see in a minute what one can do if this assumption is dropped.) What Williamson and Adams ultimately want is a differential equation that they can somehow solve for ρ(r), and you see, I guess, that we are getting closer to it. One problem with Eq. (7.128), though, is that the function g(r), a priori, is unknown, just like ρ(r). Something we can do about this is, remember all we’ve learned about gravity in Chap. 1, and write

$$g(r) = \frac{G\,M(r)}{r^2}, \qquad (7.129)$$

where M(r) is the mass of the fraction474 of the earth that’s contained within a sphere of radius r, centered at the center of the earth.
So here’s Williamson and Adams’ clever trick: write

$$\begin{aligned}
M(r) &= M_e - \int_r^a dr' \int_0^{\pi} d\vartheta \int_0^{2\pi} d\varphi\; \rho(r')\,r'^2\sin\vartheta\\
&= M_e - \int_r^a dr'\,\rho(r')\,r'^2 \int_0^{\pi} d\vartheta\,\sin\vartheta \int_0^{2\pi} d\varphi\\
&= M_e - 2\pi \int_r^a dr'\,\rho(r')\,r'^2 \int_0^{\pi} d\vartheta\,\sin\vartheta\\
&= M_e - 4\pi \int_r^a dr'\,\rho(r')\,r'^2,
\end{aligned} \qquad (7.130)$$
which everything you need to know to understand the math in (7.130) was covered in Chap. 1, and but the important thing is that the integral over r′ at the right-hand side has to be calculated from r to a, i.e., it requires no knowledge of the earth’s density (or of any other of the earth’s parameters) below r. Now we’ve got all we need to put together Williamson and Adams’ puzzle. Equation (7.128) can be rewritten
$$\frac{d\rho(r)}{dr} = -\frac{\rho(r)\,G\,M(r)}{r^2\left[v_P^2(r) - \frac{4}{3}v_S^2(r)\right]}, \qquad (7.131)$$
using (7.123) and (7.129). Imagine that some seismologists give you their earth model: tables of $v_P$ and $v_S$ for a set of closely spaced depths from the surface to the center of—or as deep as possible into—the earth, which they’d derived from measurements of the propagation time of seismic waves, and nothing else (remember Herglotz and co.). When r = a, i.e., at the earth’s surface, you’ve got values for all the functions that go into the right-hand side of (7.131): ρ could be an average from the rocks that one finds more frequently in outcrops, etc.475; g is about 10 m/s$^2$, as you know, and M(a) is just the total mass of the planet. So you can find a numerical value for $\frac{d\rho(a)}{dr}$. But then the thing is, once you’ve gotten the derivative, you can also estimate

$$\rho(a - \delta r) \approx \rho(a) - \frac{d\rho(a)}{dr}\delta r; \qquad (7.132)$$

and you can keep repeating this476, decreasing r each time by a little jump δr: until or unless you run into some important chemical change—because see above. At each step, of course, you’d have to calculate M(r) via Eq. (7.130)—but that’s OK, because as you’ve made your way from a to r, you’ve obtained all the values of ρ at radii above r: which is all you need to implement (7.130). The result of this exercise is a density model ρ(r) that is entirely based on seismological data, $v_P(r)$ and $v_S(r)$. I am not getting into the details of the model that Williamson and Adams obtained in this way, because it is based on relatively few data, compared to what is about to come; plus, Williamson and Adams in 1923 are unsure as to where exactly the core-mantle boundary should be—Jeffreys’ paper that we just looked at would come out in 1926—which makes their discussion more complicated.
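The whole downward-stepping procedure, Eqs. (7.130)–(7.132), fits in a few lines of code. Here is a minimal sketch; the velocity profiles are made-up smooth functions, not a real seismological model, so the output densities are only illustrative of the mechanics of the method:

```python
import math

# A minimal sketch of the Williamson-Adams downward iteration.
# v_p(r) and v_s(r) below are hypothetical, made-up profiles.

G = 6.674e-11            # gravitational constant, SI
a = 6.371e6              # earth radius, m
Me = 5.97e24             # total mass of the earth, kg

def v_p(r):              # hypothetical v_P(r), m/s: grows gently with depth
    return 8000.0 + 5000.0 * (1.0 - r / a)

def v_s(r):              # hypothetical v_S(r), m/s
    return 4500.0 + 2500.0 * (1.0 - r / a)

def williamson_adams(rho_surface=3.3e3, dr=1.0e4, r_stop=4.0e6):
    """Step rho(r) downward from the surface using
    drho/dr = -rho*G*M(r) / (r^2 * (v_P^2 - (4/3) v_S^2)),   Eq. (7.131),
    updating M(r) shell by shell as in Eq. (7.130)."""
    r, rho, M = a, rho_surface, Me
    profile = [(r, rho)]
    while r > r_stop:
        phi = v_p(r)**2 - (4.0 / 3.0) * v_s(r)**2   # = K/rho, Eq. (7.123)
        drho_dr = -rho * G * M / (r**2 * phi)
        rho = rho - drho_dr * dr                    # Eq. (7.132): step down by dr
        M = M - 4.0 * math.pi * rho * r**2 * dr     # strip off the shell just crossed
        r = r - dr
        profile.append((r, rho))
    return profile

prof = williamson_adams()
print(prof[0], prof[-1])   # density grows smoothly with depth
```

Note how, exactly as in the text, M(r) at each step needs only the densities already computed at radii above r.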
But one important point that Williamson and Adams are able to make, and that remains pretty much valid even if you change the size of the core a bit, is that density, all the way down to wherever the core is, doesn’t grow that fast with depth: the max they get, at the largest depth that they look at, is just about $5.5 \times 10^3$ kg/m$^3$, which is known since Cavendish to be the average density of the earth. Consider, on top of that, that the volume of the core, whatever reasonable estimate we take for its radius, is pretty small compared to that of the rest of the earth: “It is therefore impossible”, they write, “to explain the high density of the Earth on the basis of compressibility alone. The dense interior can not consist of ordinary rocks compressed to a small volume. We must therefore fall back upon the only reasonable alternative, namely, the presence of a heavier material, presumably some metal, which, to judge from its abundance in the Earth’s crust, in meteorites, and in the sun, is probably iron. We thus arrive at the conclusion accepted by the majority of geophysicists, but, in addition, we have here (1) a quantitative estimate of the increase of density due to compression alone, and (2) direct evidence of the presence in the interior of the Earth of a dense material such as iron.”
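The mass bookkeeping behind this inference is easy to reproduce. A minimal sketch, plugging in Jeffreys’ numbers from Sect. 7.12 (core radius 0.545a, mean mantle density $4.27 \times 10^3$ kg/m$^3$) together with the Cavendish-style mean density; the exact round values are my assumptions:

```python
# Back-of-the-envelope version of the "dense core" argument. Assumed inputs:
# Jeffreys' core radius 0.545a, his mean mantle density 4.27e3 kg/m^3, and a
# mean density of the whole earth of ~5.52e3 kg/m^3 (Cavendish and successors).

f = 0.545**3                 # fraction of the earth's volume taken up by the core
rho_mean = 5.52e3            # mean density of the whole earth, kg/m^3
rho_mantle = 4.27e3          # mean mantle density, kg/m^3

# Mean core density needed so that mantle + core add up to the earth's mass:
rho_core = (rho_mean - rho_mantle * (1.0 - f)) / f
print(rho_core)              # ~1.2e4 kg/m^3: almost three times the mantle's
```

The result is consistent with the $12.04 \times 10^3$ kg/m$^3$ of Jeffreys’ model quoted back in Sect. 7.12: rocks alone cannot do it, hence the iron.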
7.15 Francis Birch

Calling it “direct evidence” is perhaps a bit of a stretch. The arguments that support Williamson and Adams’ inference are pretty solid, and it is true that “the majority of geophysicists” would have agreed with it (and, as I am writing this just about a century later, they still do). But the thing is, the Williamson-Adams procedure spits out density but doesn’t really give any direct info on chemical composition, and so if one really wants to find out what the earth is made of—and I mean not just the core, but the mantle as well—, something else needs to be done. Someone who contributed a lot of work to try and answer this question is Francis Birch, whose efforts follow logically from those of Williamson and Adams. Birch was a student of Percy Bridgman477 at Harvard, then became a prof. himself, also at Harvard, and was encouraged by Bridgman and his geologist friend Reginald Daly to look at the chemical composition of the earth478. The way I understand it, the crux of Birch and co.’s intuition and contribution is that density “models” of the earth derived through the Williamson-Adams thing can and should be compared with laboratory results. Williamson and Adams give us estimates of how, according to the travel times of seismic waves in the earth, pressure and density change with depth. Which is the same as saying, how the bulk modulus and/or density change with pressure, i.e., how the materials that make up the earth respond to compression. And but these are things one could measure in one of those so-called “high-pressure laboratories”—if the materials in question were known. If the materials are not known and you want to guess what they actually are, what you can do is you bring a sample of something to the lab, put it under pressure, and measure how its density and bulk modulus, and/or its density and $v_P$ and $v_S$, change with changing pressure.
(We are talking, of course, very high pressures, similar to what is estimated by Williamson-Adams for the earth’s mantle: at, say, a thousand km depth, that would be about half a million bar, which, just to get some sort of feeling for what that means, think that atmospheric pressure here at the earth’s surface is about one bar. So being at 1000 km depth in the earth would be like having half a million times the earth’s whole atmosphere pushing down on us.) Imagine you do this for a whole bunch of different materials: if you’d done the Williamson-Adams thing right, you should be able to find at least one that shows a K /ρ-vs-ρ relationship similar to those you see, at various depths in the earth, from seismic data via Williamson-Adams. And if you find one, chances are that material would be at least similar to the material you’d find in the earth layer or layers where the same K /ρ-vs-ρ curve is seen. Maybe that was a bit convoluted, as an explanation, but check out Fig. 7.23, which I took from Birch’s 1961 paper479 , “Composition of the Earth’s Mantle”. The dashed lines, there, are curves of “bulk sound velocity” versus density, that people had gotten through Williamson-Adams-type analyses: two for the mantle and two480 for the core. Solid lines come out of the lab experiments that I was talking about, each a different experiment on a sample of a different metal. So then one checks which solid lines are close to which dashed lines, and the inference is that the mantle is likely to be
Fig. 7.23 After Francis Birch, “Composition of the Earth’s Mantle”, Geophysical Journal of the Royal Astronomical Society, vol. 4, 1961. Dashed lines are “bulk sound velocity” versus density curves derived from seismology models plus the Williamson-Adams thing (to get density). Solid lines are laboratory measurements of what Birch calls “hydrodynamical sound velocity”: and I am not sure what exactly that means, in practice, because I don’t know much about how those laboratory experiments are done. But Birch explains that $v_S$ and $v_P$ as “furnished” by seismology “can be used to provide the comparable function $(K/\rho)^{1/2}$”, i.e., in other words, hydrodynamical sound velocity is a proxy for bulk sound velocity. (Which is another reason why it is not useless to look at bulk sound velocity in the earth.)
made mostly of stuff that contains magnesium (Mg), for example, or aluminum (Al), while the core is, presumably, mostly iron (Fe). In another one of his papers481, by the way, Birch also figures that the inner core is solid iron (while the outer core, as we know, is fluid—liquid iron). The idea being that the melting temperature of iron rises fast enough with pressure (depth) to exceed the actual temperature in the core—the place where this happens would be the inner core boundary482. Now, the Williamson-Adams method plus the work of Birch is not the end of the story, because, remember, the Williamson-Adams Eq. (7.131) holds only as far as the change in density with depth is accounted for by mechanical compression alone. If chemical composition also changes with depth, that messes everything up, as we’ve seen. By just noticing how little mass was accounted for by the whole mantle, compared to the total mass of the earth, Williamson and Adams figured the core must be made of some much heavier, chemically different substance, and so didn’t try to extend their procedure beyond what they thought might be the core-mantle boundary. In the 1930s people also began to suspect the shell going from, say, 200 to about 1000 km depth, where Herglotz-Wiechert-Bateman-type analyses had shown seismic velocity to grow very quickly with growing depth (see Fig. 7.20): people figured that
through that depth range chemical composition might also change. That shell would be referred to as the transition layer, or transition zone. And if that was true, one couldn’t just go and apply Williamson-Adams through the transition zone. It would be OK to do Williamson-Adams above and below the transition zone, and extrapolate in between... but that would require knowing what density was like just below the transition zone—and assumptions had to be made and values had to be guessed. So, Birch’s solution to this was to replace the Williamson-Adams equation (7.131) with a new formula for $\frac{d\rho}{dr}$, one that was as general as possible. He figured that, other than through (i) compression, density could change through (ii) a change in temperature, or (iii) in chemical composition, or (iv) in the way matter is packed together; because even if the abundances of the various elements stay the same, their molecules might be distributed according to different structures, depending on pressure and temperature; and the switch from one such structure to another is called a phase change483. (I know, you’ve learned that the phases of matter are solid, liquid, gas? maybe a couple more? and now we’re calling “phase change” a shift from one kind of crystal to another? but that’s just the way people call things; it’s not always the best one; and maybe there isn’t a best way to call things, anyway.) The change of density in an interval of radius dr, explains Birch in his 1961 paper that I’ve already mentioned, can be generalized to include the effects of change in iron content and in phase, as well as of compression and thermal expansion,
$$\frac{d\rho}{dr} = \left(\frac{\partial\rho}{\partial T}\right)_p\frac{\partial T}{\partial r} + \left(\frac{\partial\rho}{\partial p}\right)_T\frac{\partial p}{\partial r} + \left(\frac{\partial\rho}{\partial w}\right)_{T,p,y}\frac{\partial w}{\partial r} + \left(\frac{\partial\rho}{\partial y}\right)_{T,p,w}\frac{\partial y}{\partial r}, \qquad (7.133)$$

where w denotes “mean atomic weight, and y” is, in Birch’s words, “a parameter which defines the closeness of packing or mean atomic volume.” The subscripts T, p, etc. are there to insist on the fact that partial differentiation of ρ, e.g. with respect to p, means taking the rate of change of ρ while T is kept constant, etc. “The mean atomic weight of a compound”, says Birch, “is the molecular weight divided by the number of particles in the molecular formula; for example, the molecular weight of SiO2 is 60.09, the number of particles 3, and w is 20.03.” Bottom line, (7.133) “is the generalization of the Williamson-Adams equation, to the extent that compositional variation may be represented484 by the single parameter w.” Equation (7.133) appears in Birch’s 1961 paper, page 306, but not exactly in the same form as here, because Birch had already done some manipulation of the first two terms at its right-hand side, in his 1952 paper where the third and fourth terms were still missing. I am going to use the next couple pages to look into that in some detail; so then we shall end up with Birch’s 1961 version of (7.133), and at that point we’ll see Birch’s conclusions re phase and composition changes in the transition zone. So: $\left(\frac{\partial\rho}{\partial T}\right)_p$ is how much ρ changes because of a change in T while p is kept
constant: which is something that can be measured experimentally and that we might call volumetric coefficient of thermal expansion485 —Birch simply calls it volume thermal expansion and denotes it with α. At constant p, it is empirically known that
a small relative change in volume will be proportional to a small absolute change in temperature, and the coefficient that relates those changes is precisely α,

dV/V = −dρ/ρ = α dT.  (7.134)

It follows that

(∂ρ/∂T)_p = −αρ.  (7.135)
Birch also replaces the second term at the right-hand side of (7.133) with gρ²/K_T (which, incidentally, I am pretty sure the sign is wrong, but we'll get to that in a minute), where K_T denotes what Birch calls isothermal incompressibility. Incompressibility is synonymous with bulk modulus; because look at Eqs. (7.113) and (7.114), and you see that the larger the K for a given dp, the smaller the change in density. Now, when we defined K through (7.113) and (7.114) earlier, we were talking about purely mechanical compression, i.e., with no heat exchanges: Birch calls that adiabatic incompressibility486. Isothermal incompressibility K_T can be defined through the very same equations, except one requires T to stay constant through the changes in p and V and ρ—which can be achieved if some heat exchange is allowed. And of course, in general, K and K_T won't have the same value. So, in a process that's isothermal rather than adiabatic, (7.113) is still valid provided that one replaces K with K_T, i.e.

dV/V = −dp/K_T;  (7.136)

it follows that
dρ/ρ = dp/K_T,  (7.137)

and but then, if you divide both sides by dp, and multiply by ρ,

(∂ρ/∂p)_T = ρ/K_T.  (7.138)
Now through this and Eq. (7.125), the second term at the right-hand side of (7.133) can be written

(∂ρ/∂p)_T ∂p/∂r = −ρ²g/K_T;  (7.139)

and it follows from this and from (7.135) that if you neglect for a moment the effects of chemical and phase changes, Eq. (7.133) becomes

dρ/dr = −ρ²g/K_T − αρ ∂T/∂r.  (7.140)
If you compare (7.140) with Eq. (5) in Birch's 1952 paper, you see that they are the same equation except for the sign of the first term at the right-hand side—which, like I said, I think is just a typo in Birch's paper. If you compare (7.140) with Eq. (7.128), i.e., the Williamson-Adams equation, you see that the first term at the right-hand side of (7.140) looks a lot like the right-hand side of (7.128): except that, as we've just seen, K and K_T are not the same thing; the second term at the right-hand side of (7.140) accounts for the change in temperature with depth, which hopefully you remember that Williamson and Adams had chosen to neglect—so it makes sense that it should be missing from (7.128). Then Birch rewrites (7.140) "in such a way as to exhibit the effect of a departure from adiabatic conditions, by which [i.e., by "adiabatic conditions"] we mean a relation between the changes of pressure and temperature with depth such that"

dT/dp = αT/(ρC_p),  (7.141)
where C_p is the specific heat capacity, at constant pressure487 (and in a minute we'll see why that is useful). If it's not obvious for you why (7.141) should actually describe dT/dp in an adiabatic process, that's fine, because that is a thermodynamics result that's not so easy to derive, given that we haven't done so much thermodynamics in this book so far488; if you believe it489, though, then it makes sense to write the "adiabatic temperature gradient"

(dT/dr)_S = (∂T/∂p)_S dp/dr = −[αT/(ρC_p)] ρg = −αgT/C_p,  (7.142)

where we've used Eq. (7.125), and the subscript S stands for: no heat exchange490. And then Birch writes

dT/dr = −αgT/C_p − σ,  (7.143)

"with σ merely denoting the difference between the actual gradient of temperature and the adiabatic gradient", and if we substitute (7.143) into (7.140),

dρ/dr = −ρ²g/K_T + α²ρgT/C_p + αρσ
       = −(ρ²g/K_T)[1 − α²TK_T/(ρC_p)] + αρσ.  (7.144)
Finally, and this is where things start to look a bit simpler, Birch invokes a “thermodynamic relation between isothermal and adiabatic incompressibility”491 ,
K_T/K = 1 − α²TK_T/(ρC_p),  (7.145)

which492, if you sub into (7.144), you get

dρ/dr = −ρ²g/K + αρσ  (7.146)
(same as Eq. (7) in Birch's 1952 paper), and you can see quite clearly that if σ = 0, i.e., if processes in the earth's mantle are adiabatic, (7.146) collapses back to Williamson-Adams. Now remember that to write (7.140) we'd temporarily switched off changes in phase and composition from (7.133): if we switch them back on, (7.146) is replaced by the general equation

dρ/dr = −ρ²g/K + αρσ + (∂ρ/∂w)_{T,p,y} ∂w/∂r + (∂ρ/∂y)_{T,p,w} ∂y/∂r,  (7.147)
which is what Birch wrote in his 1961 paper. Equation (7.147) is the generalization of (7.128), which was the cornerstone of Williamson-Adams; but, if I am reading his paper right, Birch does not care for rewriting the Williamson-Adams algorithm after swapping (7.128) for (7.147). He is happy with Williamson-Adams wherever Williamson-Adams appears to work. The thing is, though, he sees that in some cases the results given by Williamson-Adams don't make any sense at all. For instance, so far we've been thinking of seismic velocity and density as functions of pressure (or depth). But if you've got lab measurements of density and seismic velocity under various pressures (i.e., at various depths), which I told you a few pages up that Birch did, then you can plot a laboratory-based velocity-vs-density curve493: and do the same with the seismology-based output of Williamson-Adams. Now, in principle, the curve one gets from lab data and the one that comes from Williamson-Adams should coincide, within observational error. And but Birch found that in many cases, particularly in the transition zone, they wouldn't. So that meant that, right there in the transition zone, the terms in (7.147) that had been neglected by Williamson and Adams would need to be considered. In practice, then, one needs to dump the Williamson-Adams result, and go back to seismic-velocity versus depth curves derived from quake data. Seismic velocity will change if ρ changes; and but also if w changes, even without a change in ρ; so Birch writes (he doesn't exactly write all this in his paper, but it's as if he did)
∂v_p/∂r = (∂v_p/∂ρ)_w dρ/dr + (∂v_p/∂w)_ρ dw/dr
  ≈ (∂v_p/∂ρ)_w [−ρ²g/K + (∂ρ/∂w)_{T,p,y} ∂w/∂r + (∂ρ/∂y)_{T,p,w} ∂y/∂r] + (∂v_p/∂w)_ρ dw/dr
  ≈ −(ρ²g/K)(∂v_p/∂ρ)_w + (∂v_p/∂ρ)_w (∂ρ/∂y)_{T,p,w} ∂y/∂r + [(∂v_p/∂ρ)_w (∂ρ/∂w)_{T,p,y} + (∂v_p/∂w)_ρ] dw/dr,  (7.148)

where of course dρ/dr was replaced by its expression (7.147), but the non-adiabatic contribution to the thermal gradient is neglected, i.e., σ = 0 (which might look totally arbitrary at this point, but there will be more on that in the next chapter). It follows from (7.148) that

Δv_P ≈ −(ρ²g/K)(∂v_p/∂ρ)_w Δr + (∂v_p/∂ρ)_w Δy + [(∂v_p/∂ρ)_w (∂ρ/∂w)_{T,p,y} + (∂v_p/∂w)_ρ] Δw,  (7.149)

where Δv_P and Δw are the finite changes in P velocity and mean atomic weight over a finite increase Δr in radius. To understand what Δy is, remember y had been introduced quite vaguely, see above, as "a parameter which defines the closeness of packing". Birch figures it's easiest if dy is chosen to be "the change of density arising from change of phase, at constant T, p, and w, in the interval dr." So then (∂ρ/∂y)_{T,p,w} = 1, and (7.149) follows.
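Before we plug in Birch's numbers, it might help to see Eq. (7.146) at work. Here is a minimal numerical sketch, not Birch's own calculation: it marches dρ/dr = −ρ²g/K + αρσ downward through a mantle shell with simple Euler steps, and all the numerical values are made-up, order-of-magnitude placeholders (constant g and K, in particular, which is a crude assumption).

```python
# Sketch: integrate Birch's Eq. (7.146), drho/dr = -rho^2 g/K + alpha*rho*sigma,
# downward through a mantle shell with simple Euler steps. All numbers are
# illustrative placeholders, NOT values from Birch's papers.

def integrate_birch(rho_top, g, K, alpha, sigma, r_top, r_bottom, n=1000):
    """March density from r_top down to r_bottom (radii in m, rho in kg/m^3).

    K is the bulk modulus (Pa), alpha the volume thermal expansion (1/K),
    sigma the superadiabatic part of the temperature gradient (K/m).
    With sigma = 0 this is exactly the Williamson-Adams integration.
    """
    dr = (r_bottom - r_top) / n   # negative: we step toward the center
    rho = rho_top
    for _ in range(n):
        drho_dr = -rho**2 * g / K + alpha * rho * sigma
        rho += drho_dr * dr
    return rho

# Illustrative upper-mantle-ish numbers (assumed, order of magnitude only):
rho0, g, K, alpha = 3400.0, 10.0, 2.0e11, 2.0e-5
rho_adiabatic = integrate_birch(rho0, g, K, alpha, sigma=0.0,
                                r_top=5971e3, r_bottom=5471e3)
rho_superadiab = integrate_birch(rho0, g, K, alpha, sigma=1e-4,
                                 r_top=5971e3, r_bottom=5471e3)
# A superadiabatic gradient (sigma > 0) partially offsets compression,
# so it yields a slightly smaller density increase:
print(rho_adiabatic, rho_superadiab)
```

Density grows downward in both runs; switching σ on only trims the growth a little, which is the quantitative sense in which (7.146) "collapses back to Williamson-Adams" when σ = 0.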
What follows is just a rough calculation, as worked out by Birch in his 1961 paper, but a convincing one. Birch figured the transition zone would go from about 400 to about 900 km of depth, and he knew from seismology that between those depths v_P grows by about 2.32 km/s. So Δv_P in (7.149) can be replaced by 2.32 km/s, and Δr by −500 km. Estimates of (∂v_p/∂ρ)_w, (∂ρ/∂w)_{T,p,y} and (∂v_p/∂w)_ρ are found in laboratory studies, including those of Birch himself494. In the first term at the right-hand side of (7.149), the values of K, ρ and g are those at about 400 km depth, i.e., right above the transition zone, and known through seismology and Williamson-Adams. Plugging all the numbers in, (7.149) boils down to

2.32 ≈ 1.09 + 3.31 Δy − 0.36 Δw,  (7.150)

where the units are km/s, of course. Equation (7.150) means that "ordinary adiabatic compression alone [which is what the 1.09 km/s term at the right-hand side comes from] accounts for less than half of the observed change" in P velocity. Then you need a negative Δw and/or a positive contribution Δy to density from phase change to explain the rest of Δv_P. Let's say there's no phase change in the transition zone, i.e., Δy = 0. Then, Eq. (7.150) gives
1.23 ≈ −0.36 Δw,  (7.151)

or Δw ≈ −3. "As w is unlikely to fall below 21," writes Birch, "the material above the transition layer would then have a mean atomic weight of 24–25 [...] This seems excessively high." So it's unlikely that there be no phase change at all. Then let's say, instead, that "w is uniform through the transition layer," i.e., Δw = 0; in that case Eq. (7.150) reduces to

1.23 ≈ 3.31 Δy,  (7.152)

and if you solve for Δy you find that "the change of density through phase change is 0.37 g/cm³." Now, the changes "of density through phase change" that are expected to occur to earth materials at the kind of pressures that one might find in the transition zone are between, roughly, 0.2 and 0.5 g/cm³: based, once again, on lab experiments, for which Birch cites a bunch of papers, starting with the famous one by J. D. Bernal495 in 1936, that started the whole idea of phase changes in the earth. It follows that explaining that large jump in velocity, 2.32 km/s, by compression alone is impossible; explaining it in terms of chemical changes is difficult, because you would need to have rocks with a very strange composition (w = 24 or so) above the transition zone; instead, the explanation in terms of phase changes is in agreement with laboratory data.
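Birch's little budget is easy to verify with a few lines of arithmetic. The sketch below uses only the coefficients quoted above (in km/s) and re-derives the two limiting cases:

```python
# Check of Birch's budget, Eq. (7.150): 2.32 ~ 1.09 + 3.31*dy - 0.36*dw,
# with the two limiting cases discussed in the text. Coefficients are the
# ones quoted above; everything is in km/s.

observed_dvp = 2.32   # P-velocity increase across the transition zone
compression = 1.09    # adiabatic-compression term
coef_phase = 3.31     # km/s per (g/cm^3) of density change by phase change
coef_chem = -0.36     # km/s per unit change of mean atomic weight

residual = observed_dvp - compression   # 1.23 km/s left to be explained

# Case 1: no phase change (dy = 0), all residual blamed on composition:
dw = residual / coef_chem
print(round(dw, 1))   # about -3.4: an implausibly large change in w

# Case 2: uniform composition (dw = 0), all residual blamed on phase change:
dy = residual / coef_phase
print(round(dy, 2))   # about 0.37 g/cm^3: consistent with lab estimates
```

The −3.4 here is the more precise version of the Δw ≈ −3 quoted in the text; either way it forces w above the transition zone to unreasonable values, which is the whole point.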
7.16 Bulk Composition of the Earth

Besides the work of Birch, and those who followed in his footsteps, trying to establish how different elements are distributed inside the earth, there's ways to estimate what one might call the "bulk" composition of the earth, i.e., what the earth as a whole is made of. This has very little to do with seismology and the speed of seismic waves, or with gravity and density and tides and precession and rigidity—it's almost purely chemical. I'd say there's basically two fundamental constraints that people have been looking at, and which I'm going to call the peridotite story and the chondrite story. The peridotite story starts with three findings—the first two of which we're already aware—re the properties of mantle rocks: (i) lab experiments such as those of Bridgman and Birch tell us that the earth's mantle is likely to be made of silicate rocks (see, e.g., the diagram of Fig. 7.23); (ii) we know from seismology that, under upper-mantle pressures, the ρ of mantle rocks is about 3 g/cm³, and their v_P is 8 km/s; (iii) we also know that basalt is the most common magmatic rock observed on earth. (I am getting a bit ahead of myself here, but we'll see in the next chapter that, when people started exploring the oceans, it was found that the entire ocean floor, below a very thin layer of sediments, is made of basalt. You find quite a bit of basalt in continents too—think, e.g., the Massif Central.) It is reasonable to expect, then, that, if we could take a mantle rock and melt it, we'd get basalt.
So: the stuff that makes up the mantle must be (i) a silicate rock, with (ii) the ρ and v_P (at mantle p) that we know from seismology (which can be checked in a laboratory), and (iii) it must make basalt when it's melted. There's one rock that is occasionally (rarely) found at the surface of the earth496 that has all those properties: it's called, you got it, peridotite. And so one very first guess as to the bulk composition of the earth's mantle (not yet the entire earth, because we know at this point that mantle and core are two very different beasts) is that it should be made of something that closely resembles peridotite.
The idea being that, early in the history of the earth, iron, which is heavy, sank toward the center of the earth, separating from the rest. And if you measure how much silicate rock and how much iron you find, on average, in chondrites, that pretty much corresponds to the ratio of silicates and iron we expect to have on earth, given the relative sizes (and densities) of the earth’s mantle and core. In conclusion, we have a model of what the bulk composition of the earth is like—the chondrite model. This will be useful in the next chapters, as we try to figure out the dynamic processes that take place inside the earth: convection, the magnetic field and all that.
7.17 Olivine, Peridotite, Perovskite, Post-Perovskite

Now that we have an idea of what the earth's mantle is made of, we might try to look in some more detail at what happens in the transition zone, with the phase of those rocks and minerals. Post-Birch, lots of laboratory experiments have determined in some detail what the transition-zone phase changes are probably like. The way this is done is, you reproduce phase changes in the lab, taking, e.g., an olivine sample (olivine, chemical formula Mg2SiO4 or, less frequently, Fe2SiO4,
being the most common mineral in peridotite) and putting it under increasing pressure, and temperature, until a phase change happens. Which then, of course, you record the p and T where the phase changed. Depending on p, the phase change might occur at a different T , and vice versa. But people also measure seismic velocities in their rock samples, before and after the phase change. And seismology-based earth models (think, once again, Williamson-Adams) have both p and seismic velocities. So you can go and take a seismology model and look for a depth where both ρ and seismic velocities coincide with those found in your experiment: and if you find such a depth, then chances are that you identified the right phase changes. And the T where your phase change occurred becomes an estimate of T in the mantle, at the depth of the phase change. People who look at these things see at least two major phase changes in the transition zone. At each phase change, as p grows, the lattice structure of what was originally olivine becomes more densely packed. Lab data show that this should happen at about 410 and 660 km depth, and should be accompanied, in both cases, by a sharp increase in seismic velocities. Which is precisely what seismologists see, from earthquake data, around those two depths500 . (And there are no other places on earth, between the Moho and the core-mantle boundary, where seismic velocity is observed to grow so quickly.) The mineral that olivine eventually transforms to when you bring it to the p and T of below-the-660 is called perovskite501,502 . The chemical formula is MgSiO3 or FeSiO3 : different from that of olivine, but with the same elements. Another phase change has been detected, way more recently, near the bottom of the mantle, a couple of hundred km above the core-mantle boundary. As early as, like, the 1940s, seismologists thought that they could see another sharp increase in seismic velocities at about that depth. 
Not as sharp as in the transition zone, but sharp. (And below the velocity jump, actually, the v_P and v_S gradients are both smaller than above.) In some early 1-D earth models503, where different depth ranges were identified with letters from A to F, the lower mantle was called D; which then the anomalous region between 2700 km and the CMB was called D″ ("D double prime"), and the rest of the lower mantle was called D′. For whatever reason people have quit using letters for all layers except D″, which we still call that way. It was speculated that D″ could be the expression of a phase change, but mineral physicists weren't able to look into this until the early 2000s, because that involved, as you can imagine, bringing their samples to enormous p and T: those of the lowermost mantle. As soon as technology made it possible, high-pressure physicists like Kei Hirose and Shigeaki Ono were able to find a new phase, which is what perovskite transforms to if you bring it to lowermost-mantle p and T. They called it post-perovskite. (The chemical formula of post-perovskite is the same as that of perovskite, MgSiO3, but the crystal structure is different.) Now, like I was saying, this whole story is also "used" to constrain the values of temperature at the depths where phase changes happen. It's worthwhile to look at this a bit more closely. In practice, the results of laboratory experiments are summarized by temperature-pressure diagrams like that in Fig. 7.24, showing, for a given pressure, at what temperature a phase change will occur. The curve showing this relationship,
Fig. 7.24 Let the thick solid lines be the Clapeyron slopes (found experimentally) of two known phase transitions taking place in the earth. Let’s say that seismology tells us that the first phase transition occurs at a given depth, and that pressure at that depth has the value p1 . Then, if we draw a vertical line crossing the horizontal (pressure) axis at p1 , the intersection of that with the first Clapeyron slope gives us the temperature (vertical axis) T1 at that same depth. And then we can do the same for other transition zone depth(s) (e.g., p2 and T2 here), and extrapolate a “geotherm” (dashed line), i.e., a T -vs-pressure, or, which is the same, T -vs-depth curve
for one given phase transition, is called Clapeyron slope504. So, for a given phase change, draw a straight line on the temperature-pressure plot, corresponding to the pressure (depth) where, like I just explained, seismology models place that phase change: the intersection of that with the Clapeyron slope of that same phase change tells me what the temperature is at that precise depth. From this exercise, T is found to be ≈1800 K at the 410, 2000 K at the 660, and 2500 K at the perovskite-to-post-perovskite transition505, i.e., at 2700 km depth, if we take that phase change to coincide with the top of the D″. From the transition-zone values we see right away that the gradient of T (which, as you might remember from Chap. 4, is as high as 10–50 K/km just below the earth's surface and in the crust) must drop to drastically lower values at relatively shallow depth. Then, temperature doesn't grow that much over most of the mantle. But we'll get back to this, in Chaps. 8 and 9.
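The Fig. 7.24 exercise can be sketched in a few lines of code. Everything numerical below, except the 1800 K and 2000 K anchor temperatures quoted in the text, is invented for illustration; in particular the Clapeyron-line parameters are NOT laboratory values.

```python
# Sketch of the geotherm-from-Clapeyron-slopes exercise of Fig. 7.24:
# each phase transition gives one (pressure, temperature) anchor point,
# and a straight line through the anchors extrapolates a crude geotherm.
# Clapeyron lines are parameterized as T = T_ref + (p - p_ref)/(dp/dT),
# with made-up reference values.

def temperature_on_clapeyron(p, p_ref, T_ref, dpdT):
    """Temperature at pressure p (GPa) on a Clapeyron line of slope dp/dT."""
    return T_ref + (p - p_ref) / dpdT

# Hypothetical Clapeyron parameters (p_ref [GPa], T_ref [K], slope [GPa/K])
# for the 410-km and 660-km transitions -- NOT lab values:
transition_410 = (13.5, 1800.0, 0.004)
transition_660 = (23.5, 2000.0, -0.002)

# Say seismology places the transitions at these pressures (here, exactly
# the reference pressures, so we just recover the anchor temperatures):
T1 = temperature_on_clapeyron(13.5, *transition_410)
T2 = temperature_on_clapeyron(23.5, *transition_660)
print(T1, T2)  # 1800.0 2000.0

# Linear extrapolation between the two anchors gives a rough T(p) geotherm:
def geotherm(p):
    return T1 + (p - 13.5) * (T2 - T1) / (23.5 - 13.5)

print(round(geotherm(30.0)))  # extrapolated T below the 660: 2130 (K)
```

Note how gentle the resulting gradient is (about 20 K per GPa here) compared with the near-surface value quoted in the text: that is the drop in dT/dz the last paragraph is talking about.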
Chapter 8
The Forces that Shape the Earth: Convection and Plates
8.1 The Discovery of Radioactivity

Anybody who, like me, has spent some significant amount of time in the city of Paris knows that the weather, there, is very often overcast. It was so in the last days of February, 1896, as Henri Becquerel ran an experiment in his laboratory involving uranium and X-rays. Becquerel's lab was in the museum of natural history, literally next door from old Buffon's apartments at the Jardin des Plantes, and—today—from the new site of the Institut de Physique du Globe. If you ever pass by, you might see a marble plaque that says, "dans le laboratoire de physique appliquée du museum Henri Becquerel a découvert la radioactivité le 1er mars 1896" (in the museum's applied physics laboratory Henri Becquerel discovered radioactivity on March 1st, 1896). Becquerel had thought that what uranium did was absorb energy from the sun and then release it—in the form of high-energy electromagnetic radiation, i.e., X-rays. To see more precisely how that happened, he decided to expose to sunlight some material that contained uranium, then place it near photographic plates, to record the radiation that was emitted. The way Becquerel saw things, no sunlight meant the experiment couldn't be done; but for some reason, after that one overcast day, he left the uranium near a photographic plate anyway. And it turned out that, even though the uranium could not have absorbed any energy from the sun, the imprint of radiation still showed up in the plate. Becquerel realized that, somehow, the energy radiated into the plates must have been stored in the uranium independently of its being exposed to the sun, and decided to look into this. He did some new experiments where he saw that the radiations he was dealing with were not electromagnetic waves, but some strange phenomenon that hadn't been observed before.
He discovered that if you use a collimator506 to control the direction of the radiation, and have the radiation propagate through a magnetic field507 , the magnetic field could deflect the radiation, which would land at some other location on the photographic plate; and that was something that did not happen at all with X-rays.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 L. Boschi, Our Concept of the Earth, https://doi.org/10.1007/978-3-031-71579-2_8
It took a number of years to figure it all out, but, simplifying quite a bit, the point that's relevant to us here is that radioactivity happens when the structure of an atom changes suddenly, in some drastic way. This can be either through the loss of some of the particles—protons, neutrons, electrons—that form the atom, and that are shot out from it at high speed, i.e. carrying a lot of kinetic energy; or through a change in the way those particles are, uhm, stuck to one another in the atomic structure: in which case there is no emission of matter from the atom, but of electromagnetic energy. The radiations observed by Becquerel were made of electrons and/or protons: electrons and protons are electrically charged, which explains why their trajectory would be deflected by a magnetic field508. Now, remember that an element, in modern physics and chemistry, is defined by how many protons are in its nucleus: if something happens that changes that number, we are going to have an element turn into another element: which is called "radioactive decay". Some elements—I should say, certain isotopes of some elements, but we'll get to that in a minute—are radioactive, meaning their atoms have a certain tendency to "decay" into something else. This is a random process, sort of: there's no way of telling when a given atom is going to decay. But, if you take a very large number of atoms of a radioactive element, and check how many of them have decayed after a certain amount of time t—that's actually quite predictable, and is given, with little uncertainty, by the formula

P(t) = P(0) e^{−λt},  (8.1)
where P(0) is the number of atoms of that element at time t = 0, and P(t) the same thing, of course, at time t—and if the number of atoms is given per unit mass, you might call it concentration. The parameter λ is a constant, which depends on what element we look at; and the larger λ is, the faster the decay; in principle, the value of λ associated with a given element can be determined empirically, looking at a sample of radioactive material in a lab. Translated into words, Eq. (8.1) says that the concentration of a radioactive element decays exponentially.
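Eq. (8.1) is easy to play with in code. The sketch below uses a made-up decay constant, just to show the exponential behavior and the related idea of a half-life, t_half = ln(2)/λ, the time after which half the parents are left:

```python
import math

# Eq. (8.1) in code: exponential decay of a parent population.
# The decay constant here is hypothetical, chosen only for illustration.

def parents_left(p0, lam, t):
    """Concentration of parent atoms after time t, Eq. (8.1)."""
    return p0 * math.exp(-lam * t)

lam = 1.0e-3                       # assumed decay constant, per year
t_half = math.log(2) / lam         # half-life implied by lam
print(round(t_half, 1))            # 693.1 (years)

# After one half-life, half the parents remain; after two, a quarter:
print(round(parents_left(1000.0, lam, t_half)))      # 500
print(round(parents_left(1000.0, lam, 2 * t_half)))  # 250
```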
8.2 Radiometric Dating

One of the reasons I am telling you about radioactivity in this book, is that Eq. (8.1) can be, and is, used to date igneous rocks, i.e., to measure the time that has passed since a rock has "formed"—since it has solidified from the liquid state into the form as we know it now. The way this works is, in a nuclear reaction, each time an atom of what we call the "parent" element decays, one atom of the so-called "daughter" element is born. Zircon is a mineral that contains uranium; its chemical formula has no U in it; but the thing is, very small amounts ("trace" amounts) of U are included in the crystalline structure of zircon. Zircon, on the other hand, excludes lead: Pb just doesn't fit in its structure. Finally, uranium is radioactive and its daughter is lead. Let the moment when a zircon crystal forms, out of some melt509, coincide with
the origin t = 0 of the time axis. Because Pb is excluded from zircon at t = 0, at that point there's only parents in it, and no daughters; it follows that any daughters found—as "defects" in the "lattice"510—in the zircon at time t must have been born from the initial set of parents—exactly as many parents have decayed as daughters were born. Then, the concentration of daughters must read

D(t) = P(0) − P(t).  (8.2)

But then

P(0) = D(t) + P(t),  (8.3)

and, if we sub this into (8.1),

P(t) = [D(t) + P(t)] e^{−λt}.  (8.4)

Which then

e^{λt} = [D(t) + P(t)]/P(t),  (8.5)

and so

t = (1/λ) ln{[D(t) + P(t)]/P(t)}.  (8.6)
At the right-hand side we have the concentrations of daughters and parents at time t, i.e., now: so, we can bring our sample to the lab and measure everything511 and calculate t, which is what we call the age of the rock. You can date an igneous rock even if it doesn't carry any mineral with the properties of zircon; that is, even if there is no reason to assume that the concentration of daughters at t = 0 be zero. If we call D(0) the initial concentration of daughters, then their concentration at a generic time t must be given by D(0) plus the concentration of daughters that came from the radioactive decay of parents: Eq. (8.2) is replaced by

D(t) = P(0) − P(t) + D(0).  (8.7)

Equation (8.1) is still valid, though. It's convenient to turn it around,

P(0) = P(t) e^{λt},  (8.8)

and then sub that into (8.7), which gives

D(t) = D(0) + (e^{λt} − 1) P(t).  (8.9)
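The zircon case, Eq. (8.6), is simple enough to code up directly. The decay constant below is the commonly used one for 238U decaying (through a chain) to 206Pb; the "measured" D and P values are made up for illustration:

```python
import math

# Eq. (8.6) in code: age of a zircon from measured concentrations of
# daughters D and parents P, assuming D(0) = 0 (lead excluded at t = 0).

LAMBDA_U238 = 1.55125e-10   # decay constant of 238U, per year

def zircon_age(D, P, lam=LAMBDA_U238):
    """t = (1/lam) * ln[(D + P)/P], Eq. (8.6)."""
    return math.log((D + P) / P) / lam

# A made-up measurement: equal numbers of daughters and parents means
# exactly one half-life has elapsed, i.e. ln(2)/lam:
age = zircon_age(D=1.0, P=1.0)
print(round(age / 1e9, 2))   # 4.47 (billion years)
```

That 4.47-billion-year half-life of 238U is, not coincidentally, of the same order as the age of the earth itself, which is what makes the uranium-lead system so convenient for dating very old rocks.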
Now the trick is, instead of measuring the absolute concentration of parents and daughters, imagine that we measure their concentration relative to a non-radiogenic isotope of the daughter element: i.e., atoms of the daughter element that are not
actually daughters—they were not generated by radioactive decay512. If we call D_i the concentration of the non-radiogenic isotope that we use as reference, this amounts to dividing both sides of (8.9) by D_i,

D(t)/D_i = D(0)/D_i + (e^{λt} − 1) P(t)/D_i.  (8.10)
Say you could measure the concentrations of daughters and parents, relative to D_i, at the moment just before the rock solidifies. You might take as many samples as you want and repeat the measurement: atoms haven't coalesced to form crystalline structures—minerals—yet, so parents and daughters and non-radiogenic isotopes are randomly—uniformly—distributed all over our parcel of (molten) rock, and, no matter where you look, you'll find about the same amounts of parents and radiogenic and non-radiogenic daughters. In a Cartesian plot where P/D_i is along the horizontal axis and D/D_i along the vertical one, data cluster around one point. Next, repeat the measurement right after the rock has solidified—i.e., what we define as t = 0—the moment when the rock, the way we see it now, is born513. As the rock cools, chemical reactions occur that lead to the formation of different minerals; D_i and D take part in the same reactions with equal probabilities; so, if you take several different samples of a given mineral found in this rock, you will find roughly the same ratio of D to D_i in all samples. You will also find the same ratio of D to D_i in different minerals, i.e., even if you look at (multiple samples of) different minerals and take the average. The ratio of D to D_i depends only on the concentrations of the daughter element and its non-radiogenic isotope at zero time—it's actually the ratio of those concentrations. It's got nothing to do with the difference between those two isotopes, because chemically the two isotopes are the same. D(0) is not the same for all minerals, because it depends on which mineral (its chemistry) we are looking at—but D(0)/D_i is. That is not the case with P, or P/D_i: parent and daughter have different chemistry, so they will be found with different concentrations in different minerals: which is what the horizontal line (labeled t0) in Fig. 8.1 tries to show.
Data points form a few separate clusters, each corresponding to a different mineral that we have sampled. The solid line labeled t1 in Fig. 8.1 shows what happens, some significant time t1 − t0 after the rock has become solid. Within each mineral the parents decay, and the larger P(0), or P(0)/D_i, is for a given mineral, the larger D(t) will be—more parents will give birth to more daughters. We might write

D_k(t)/D_{i,k} = D_k(0)/D_{i,k} + (e^{λt} − 1) P_k(t)/D_{i,k},  (8.11)

where k identifies the mineral we are looking at. At a time t, we have a set of as many data points D_k(t)/D_{i,k} as there are minerals (values of k); Eq. (8.11) says those data points are all aligned along a straight line with slope e^{λt} − 1 and intercept D_k(0)/D_{i,k}. So, here's how you use all this to estimate the age of a rock: take a rock sample and make
Fig. 8.1 Radiometric dating: the "isochron" method. In this example, rubidium-87 (87Rb) is the parent (concentration P); strontium-87 (87Sr) is the daughter (concentration D); strontium-86 (86Sr) is the reference (concentration Di). At any given time (t0 or t1 here), each circle corresponds to one and only one mineral. The minerals found in a sample of granite, for example, could be plagioclase feldspar, K-feldspar, hornblende, biotite, muscovite: each of which has a different initial rubidium/strontium ratio depending on its chemistry. But the initial (time t0) 87Sr/86Sr ratio is the same for all, because, chemically, 87Sr and 86Sr are the same thing. After some time, though, the amount of daughter (87Sr) found in a sample will be proportional to the amount of parent (87Rb) found in the same sample at time t0; as a result, data points at a time t1 would still be on a straight line, but now with a nonzero slope—an "isochron"
a bunch of measurements of D_k/D_{i,k}, P_k/D_{i,k}, with at least one data point per different mineral found in the sample; plot them in a Cartesian graph, similar to Fig. 8.1: you'll get a straight line, so measure its slope: this is supposed to coincide with e^{λt} − 1: add 1 to it and then take the natural logarithm of the result: divide by λ, and what you get is t, or the age of the rock. The straight line is an "isochron" curve—all samples along it must have the same age, which is what iso and chron(os) mean in Greek; and this way of dating a rock is called "isochron dating". Bottom line, we now have ways to tell the absolute age of a rock formation. If it's an igneous rock, we can determine quite precisely how much time passed since it formed as a solid chunk of rock, out of molten material; but even if it's a sedimentary rock we can come up with an age, provided that the sediments contain a non-negligible amount of originally igneous material; if they do, we can date all the igneous rock we find in the sediment; igneous stuff found in the sediments is a product of erosion, i.e., it's chunks of solid rock that have been eroded away from mountains and carried by a river to the sea shore, where sediments accumulate: it follows that the sedimentary formation as a whole has to be younger than the youngest of all igneous samples found in it: and OK, that doesn't tell you precisely how old the sedimentary rock is, but at least we can establish some bounds514,515.
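The isochron recipe just described can be sketched end-to-end. Below, the data are synthetic, generated from a made-up age of one billion years precisely so we can check that the fit recovers it; the decay constant is a commonly quoted value for 87Rb, and the mineral ratios are invented:

```python
import math

# Sketch of isochron dating: given (P/Di, D/Di) measurements from several
# minerals of the same rock, fit a straight line; its slope is e^(lam*t) - 1,
# so t = ln(slope + 1)/lam. Data below are synthetic.

lam = 1.42e-11                 # decay constant of 87Rb, per year
true_age = 1.0e9               # synthetic age: 1 billion years
slope_true = math.exp(lam * true_age) - 1.0

# Synthetic minerals: same initial D/Di (0.70), different P/Di ratios:
p_ratios = [0.1, 0.5, 1.0, 2.0, 3.5]
d_ratios = [0.70 + slope_true * p for p in p_ratios]

# Least-squares slope of the isochron (plain formulas, no libraries):
n = len(p_ratios)
mean_p = sum(p_ratios) / n
mean_d = sum(d_ratios) / n
slope = sum((p - mean_p) * (d - mean_d)
            for p, d in zip(p_ratios, d_ratios)) \
        / sum((p - mean_p) ** 2 for p in p_ratios)

# Slope -> age, exactly the "add 1, take the log, divide by lam" recipe:
age = math.log(slope + 1.0) / lam
print(round(age / 1e9, 3))     # 1.0 (Gyr): the synthetic age is recovered
```

With real measurements the points scatter about the line, and the least-squares fit (rather than reading the slope off two points) is what gives the age an error bar.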
8 The Forces that Shape the Earth: Convection and Plates
8.3 Earth’s Radioactivity and Surface Heat Flow

The other reason I am telling you about radioactivity has to do with Kelvin’s idea that the earth is cooling. Kelvin’s estimate of the age of the earth, see Chap. 4, rested on the assumption, which Kelvin was very confident was a safe assumption, that there be no sources of heat within the earth, other than cooling and conduction516. Now that radioactivity’s been discovered, though, the assumption isn’t very credible at all. Because each time a radioactive atom decays, like I said, it shoots out a chunk of its nucleus, or an electron, or a photon: and the kinetic energy carried by those particles is, essentially, heat: remember the kinetic theory of heat517. After the work of Becquerel, it was straightforward for people studying the earth to verify that the materials that make up our planet are radioactive, and to measure how much; at least those that we can sample, that is to say, the rocks that make up the crust. And the crust turned out to be significantly radioactive. Not to the point that that could be bad for our health518, but, taken together, a pretty large amount of heat is produced. As for whatever is below the crust, we can’t be sure of its chemical composition—or we certainly couldn’t in Becquerel’s day—but, given the prevalence of radioactivity in earth materials that we can sample, it seemed likely that the deep earth should be at least slightly radioactive. Now, that meant that the heat flow that is measured at the surface must account for both energy loss by conduction, and radioactive heat. Kelvin’s assumption was equivalent to saying that all heat flow at the surface must be accounted for by cooling and conduction alone. Kelvin found, remember Chap. 4, that
$$\left[\frac{dT}{dz}(z=0,t)\right]_{\rm cooling} = D\,\frac{1}{2\sqrt{\kappa t}}, \qquad (8.12)$$
where the subscript “cooling” is there to remind us that, for this equation to make sense, the thermal gradient at the left-hand side must be due solely to cooling; other than that, everything is like in Chap. 4: t is the age of the earth, D and κ are constants that can be estimated, very roughly, based on what we know empirically about rocks in general. This is all good; but then Kelvin mistook the left-hand side of (8.12) for the heat flow/thermal gradient that’s actually observed at the earth’s surface. But we’ve just learned that the observed heat flow is the sum of two contributions: that of “passive” cooling, yes, but also that of radioactive heat. Both contribute a flow of heat from the earth to the rest of the universe, of course: i.e., their contributions have the same sign. It follows that subbing the data into the left-hand side of (8.12) is equivalent to overestimating the amount of heat flow that’s due to cooling and, as a result, to underestimating the age of the earth519. (Because the age’s square root, says Eq. (8.12), is inversely proportional to heat flow.) Ernest Rutherford520, Robert John Strutt521 and Arthur Holmes522 were some of the first people to worry about this. This is how Holmes tells the story in one of his papers523: “Rutherford suggested in 1905 that the heat constantly evolved in virtue of the disintegration of the earth’s supply of radium might be sufficient to maintain
the observed temperature gradient. On the assumption that 100 calories524 per hour represented the heat output of one gram of radium, he calculated that if each gram of the substance of the earth contained 4.6 × 10⁻¹⁴ grams of radium, the heat so produced would be equivalent to that brought to the earth’s surface by conduction.” Strutt then built an “apparatus for the estimation of radium”, and “applied it successfully to a large number of representative rocks. [...] The average of twenty-eight igneous rocks gave 1.6 × 10⁻¹² grams of radium per gram of rock, about thirty-five times as much as Rutherford had demanded for thermal equilibrium of the earth.” And “Strutt’s figures”, says Holmes, “have been confirmed by several other observers, and rocks from every continent have now been examined.” These results were “very surprising in light of Rutherford’s calculation.” The way I understand this, Rutherford’s point is that the earth, rather than slowly cooling down, could, in principle, be in a steady state, remember Eq. (4.26), with its deep interior working as a reservoir of energy, releasing an approximately constant amount of heat via radioactive decay. If this were the case, the total heat released per unit time by radioactive stuff in the earth must coincide with the total heat, per unit time, let out by the earth through its outer surface—so one could convert surface heat flow to the total amount of “radium” inside the earth (I guess “radium” should be understood here as just a (brutal) proxy for whatever radioactive elements one could expect to find in the deep earth). Which is what Rutherford does. But then Strutt finds experimentally that typical earth rocks—those, at least, that we have samples of: mostly from the crust—contain much more radium than would be needed to maintain the steady state.
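Rutherford’s steady-state budget is easy to redo with round numbers. The inputs below are assumptions of mine (round, modern-ish values), not Rutherford’s exact figures, but they land on the same order of magnitude:

```python
# Back-of-the-envelope redo of Rutherford's 1905 steady-state argument.
surface_heat_flow = 2.0e-6      # cal/(cm^2 * s): Holmes' 1931 value (assumed)
earth_surface = 5.1e18          # cm^2 (assumed round value)
earth_mass = 6.0e27             # g (assumed round value)
radium_heat = 100.0 / 3600.0    # cal/(g * s): 100 calories per hour per gram

total_heat = surface_heat_flow * earth_surface   # cal/s leaving the earth
radium_needed = total_heat / radium_heat         # g of radium, steady state
fraction = radium_needed / earth_mass            # g of radium per g of earth
print(fraction)  # a few times 1e-14: Rutherford's order of magnitude
```

Compare the result with Strutt’s measured 1.6 × 10⁻¹² grams per gram, and you see the factor of a few tens that made the result “very surprising”.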
And that’s “very surprising”, for sure, because if there is so much radioactivity in the earth, and yet the thermal gradient near the surface is what it is, then where does all the radioactive heat go? One inference made by Strutt and Holmes is that substratum rocks, of which close to nothing was known at that point, are much less radioactive than those in the crust. If Strutt’s estimate of the radioactivity of earth’s rocks were only valid within the crust—which is a small portion of the total volume of the planet—and could be replaced by a much lower figure as far as the substratum was concerned, then maybe the discrepancy could be explained away. We’ve seen in Chap. 7 that, already in Holmes’ time, the composition of the substratum could be estimated—with much uncertainty—from that of meteorites; and if you have an idea of the abundances of various elements in the substratum, and you know which ones are radioactive, and which aren’t, etc., you can make a rough guess as to how radioactive the substratum might be. Long story short, Holmes does the math, and it seems likely that the substratum be less radioactive than the crust. In his paper, Holmes also points out that meteorites that are rich in iron are systematically poor in radioactive elements, which means that the core is probably the least radioactive part of the earth—if radioactive at all. Holmes comes back to all this in later papers. Let’s see the figures he comes up with in 1931525: first, direct observations of thermal gradient and heat-flow-per-unit-area-per-unit-time: the average temperature gradient below the Earth’s surface is dT/dz = 3.2 × 10⁻⁴ °C/cm; the average thermal conductivity of rocks is k = 6 × 10⁻³ cal/(cm × sec × °C); Fourier’s law of heat conduction, remember Chap. 4, Eq. (4.32), says that heat-flow-per-unit-area-per-unit-time F = −k dT/dz. After plugging the numbers in, this gives ≈ 2 × 10⁻⁶ cal/(cm² × sec). In a year there are 3.15 × 10⁷ seconds, so that’s the same as F ≈ 60 cal/(cm² × year): and that’s how much heat we actually see flowing out. Next, the radioactivity of crustal rocks: 1 m³ of granite releases 42.2 calories/year. But 1 m³ is the same as 10⁶ cm³, or 1 cm² times 10⁶ cm, or 1 cm² times 10 km. It follows that, if the crust were 10-km thick and entirely made of granite—which is OK for a back-of-the-envelope estimate—and if there were no radioactivity in the earth other than in the crust, radioactive heat flow would amount to 42.2 calories per cm² per year. Now, if one compares the two figures, the inference is that about two thirds (42.2/60) of the heat released by the earth are accounted for by the granite that’s probably in the crust. In fact, with a 14-km-thick (instead of 10-km-thick) granitic crust, you’d make up for all of the observed 60 calories/year. Basalt is less radioactive than granite; Holmes does the math with basalt data, too, and finds that it would take a 52-km thick basaltic crust to account for the 60 calories per cm² per year. Or, he adds, a 60-km thick peridotite crust. If one assumes that the substratum is “so poor in radioactive elements that its heat output can be ignored”, writes Holmes, one could still think of the earth, or at least most of it, as passively cooling; but the stuff we expect the substratum to be made of, peridotite for example, is at least slightly radioactive: this is something that people measure in the lab. And so the geochemical evidence, says Holmes, “is certainly in favor of a slightly radioactive substratum and this implies the production of an excess of heat over and above that conducted to the surface.”
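Holmes’ back-of-the-envelope arithmetic can be redone in a few lines of Python, with the numbers straight from the text:

```python
# Holmes' 1931 numbers, as quoted in the text.
dT_dz = 3.2e-4     # degC/cm, average near-surface thermal gradient
k = 6.0e-3         # cal/(cm * s * degC), conductivity of rocks
year = 3.15e7      # seconds in a year

F = k * dT_dz * year   # cal/(cm^2 * year) observed flowing out: ~60
granite = 42.2         # cal/year from the granite in a 10-km column under 1 cm^2
thickness = 10.0 * F / granite   # km of granite that would supply all of F
print(F, thickness)    # ~60 and ~14, as in the text
```

The same two lines, run with the basalt or peridotite heat productions Holmes quotes, give his 52-km and 60-km crustal thicknesses.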
8.4 Viscosity

This is where convection comes into play. Radioactivity could provide sufficient energy for the substratum to be in a “glassy”, viscous state, so that it could actually flow; and also sufficient energy to keep the flow going. (Certainly the energy can’t be spent to heat the earth up, because it’s unlikely that the earth’s temperature be rising—if it were, how could we be walking on a solid crust?) And then, convective circulation could be Wegener’s missing engine: the force that moves continents around. One parameter that is really important when it comes to the earth’s convection, and but hasn’t gotten much attention in this book thus far, is the earth’s so-called viscosity. I think you’ll remember from Chap. 6 that an elastic material is a material that responds more or less immediately to forcing, with stress and deformation that are proportional to one another. A viscous material is one that responds to forcing with some delay—in a viscous material, stress and the time-derivative of deformation are proportional to one another: which means that the moment you turn stress on,
deformation is zero, but its time-derivative is non-zero—and so the deformation itself will start to be nonzero a moment later, so to say. Now, as for the earth, it turns out that the earth behaves like an elastic thing in the way it responds to sudden disturbances in the short term. After an earthquake, for example, there are elastic waves: what we observe is pretty much what we predicted in Chap. 6, when we assumed the earth to be elastic, and derived the equations of Navier-Cauchy, and P and S waves and all that. But, the earth behaves like a viscous fluid in the way it responds to disturbances in the long term—meaning, in geological time—we’ll see some examples shortly. Viscosity is very much a non-trivial concept. Henry Frankel526 tells the story of how Jeffreys debated against Wegener, and other proponents of continental drift, whether the continents might “plough their way” through the “ocean floor”. In Frankel’s words, Jeffreys thought “that the moving continents would disintegrate rather than remain intact, since seafloor material is stronger than continental material. [...] Having the continents plough their way through the seafloor would be like attempting to thrust a leaden chisel into steel.” Which means, if I understand it correctly, that, according to Frankel, according to Jeffreys, if you apply a force that pushes crustal material (sial) through substratum material (sima), a priori you might expect either to break the sial, or to pierce through the sima. The sima is composed of basaltic rock, Jeffreys continues, which is stronger than the sial: and so the sial—the chisel—breaks. But is the sima really “stronger” than the sial? How a material responds to forcing also depends on the duration of that forcing in time. If you make a ball with silly putty and throw it against a wall, it bounces back, and its shape doesn’t change—it’s still a ball. If you slowly push it with your hand, though, you can deform it into another shape. 
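One standard way of quantifying the silly-putty behaviour—my gloss, not an argument from Frankel or Jeffreys—is the “Maxwell time” η/μ: forcings much shorter than η/μ see an elastic solid, forcings much longer see a viscous fluid. With round, commonly quoted values for the mantle:

```python
# Maxwell time tau = eta / mu: the timescale separating "elastic" from
# "viscous" response. Round values, assumed for illustration only:
mu = 1.0e11    # Pa, order of magnitude of mantle rigidity
eta = 1.0e21   # Pa*s, often-quoted order of magnitude of mantle viscosity

tau_seconds = eta / mu
tau_years = tau_seconds / 3.15e7
print(tau_years)  # a few hundred years
```

A few hundred years sits neatly between the period of a seismic wave and the duration of an ice age—which is exactly the point: the same material is “steel” to an earthquake and “silly putty” to a drifting continent.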
To describe the “rheology” of (earth) materials, or how they respond to forcing in general, we need at least two parameters: rigidity (μ), that we’ve met before, and viscosity, that we are beginning to figure out now. Over a short time, rocks are elastic and their behaviour is controlled by rigidity; over a long time they are viscous, and their behaviour is controlled by their viscosity. According to Frankel, and to Wegener’s acolytes who got into that controversy with him, Jeffreys calls the sima strong because he’s thinking of its rigidity, and but doesn’t realize that what would control the sima’s response to the very slow “ploughing” of a drifting continent is its viscosity: and the viscosity of sima actually turns out to be pretty low—while its rigidity is indeed high, and higher than that of sial527,528 . Probably the one person that more than anyone else is responsible for giving viscosity a quantitative treatment is George Stokes. We’ve already met the equation that today people call Navier-Stokes (Chap. 7), though we only really played with it in cases where the material we were looking at was elastic. Remember that NavierStokes, Eq. (6.135), is valid independent of the so-called rheology of the material; specifying the rheology amounts to introducing a relationship between the stress tensor τ and the displacement vector u, which people call a “constitutive relation”. Which, in Chap. 6, was Hooke’s law, because, like I said, all we cared about at that point was elasticity: and you might also remember that the equation that we get if we plug Hooke’s law into Navier-Stokes is sometimes called the Navier-Cauchy
equation. We’ve also briefly met Euler’s equation, which can be derived in various ways, but is also what Navier-Stokes collapses to if a material is such that there can be no shear stress in it, even when neighboring parts of it move at different velocities with respect to one another—i.e., when ∂u_i/∂x_j ≠ 0 for some i, j. It reads

$$\tau_{ij} = -p\,\delta_{ij}. \qquad (8.13)$$
It follows from (8.13) that $p = -\frac{1}{3}\tau_{kk}$, and so it makes sense to call p the mean pressure. A material described by (8.13) is called an inviscid fluid, and a good approximation of that is water. Now, people call linear viscous fluid a material that behaves as per Eq. (8.13) when there’s no motion (dε_ij/dt = 0); otherwise stress and the rate of deformation are in a linear relationship to one another. Formally,

$$\tau_{ij} = -p\,\delta_{ij} + C_{ijkl}\,\frac{d\varepsilon_{kl}}{dt}, \qquad (8.14)$$
where C_ijkl is a 3 × 3 × 3 × 3 tensor, kind of like the one we met when we did elasticity, and its coefficients are independent of the deformation529. Also similar to elasticity, if one factors in the symmetries of τ and ε, isotropy and all that, the 81 parameters in C_ijkl reduce to two, that we might call λ and η, and (8.14) collapses to

$$\tau_{ij} = -p\,\delta_{ij} + \lambda\,\frac{d\varepsilon_{kk}}{dt}\,\delta_{ij} + 2\eta\,\frac{d\varepsilon_{ij}}{dt}. \qquad (8.15)$$

Now, remember ε_kk is the divergence of u: then, if the material is incompressible, ε_kk = 0, and (8.15) is replaced by

$$\tau_{ij} = -p\,\delta_{ij} + 2\eta\,\frac{d\varepsilon_{ij}}{dt}. \qquad (8.16)$$
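Equation (8.16) is simple enough to check on a toy flow. Here is a minimal sketch (the numbers are made up for illustration) applying it to simple shear, v = (γ̇z, 0, 0), where the only nonzero stresses should be the pressure on the diagonal and a shear stress ηγ̇:

```python
# Toy check of the incompressible viscous constitutive relation (8.16)
# on simple shear, v = (gamma_dot * z, 0, 0). Indices: 0=x, 1=y, 2=z.
eta, p, gamma_dot = 2.0, 5.0, 0.3   # made-up illustrative values

def grad_v(i, j):
    """partial v_i / partial x_j for the flow above."""
    return gamma_dot if (i, j) == (0, 2) else 0.0

def tau(i, j):
    """tau_ij = -p*delta_ij + 2*eta*(d eps_ij/dt), with
    d eps_ij/dt = (grad_v(i,j) + grad_v(j,i)) / 2."""
    rate = 0.5 * (grad_v(i, j) + grad_v(j, i))
    return -p * (i == j) + 2.0 * eta * rate

print(tau(0, 2))  # eta * gamma_dot = 0.6: the only shear stress
print(tau(0, 0))  # -5.0: on the diagonal, just minus the pressure
```

Note that the divergence of v is zero for this flow, so the λ term of (8.15) wouldn’t contribute anyway.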
If we want to do like we did in Chap. 6 to study elastic media, then the next step is to sub one of the new constitutive relations that we’ve just written, (8.15) or (8.16), into Navier-Stokes530,531. People have tried both, but the general, compressible case turned out to be quite messy for a number of reasons, and so most of the literature covers only the incompressible approximation, hoping that that is realistic enough. You see from Eq. (8.16) that, if you have incompressibility, then the single parameter η controls how the material responds to the rate of deformation, and in fact that’s what people typically use as a quantitative measure of viscosity: so, from now on in this book, η and “viscosity” are going to be the same thing. So, then, take (8.16) and substitute that into Navier-Stokes. If we call v the velocity, or time-derivative of displacement, v = du/dt, then the k-th component of the divergence of τ can be written
$$\begin{aligned}
\frac{\partial \tau_{ik}}{\partial x_i} &= -\frac{\partial p}{\partial x_k} + \eta\,\frac{\partial}{\partial x_i}\left(\frac{\partial v_k}{\partial x_i} + \frac{\partial v_i}{\partial x_k}\right)\\
&= -\frac{\partial p}{\partial x_k} + \eta\left(\frac{\partial^2 v_k}{\partial x_i^2} + \frac{\partial}{\partial x_k}\frac{\partial v_i}{\partial x_i}\right)\\
&= -\frac{\partial p}{\partial x_k} + \eta\,\frac{\partial^2 v_k}{\partial x_i^2},
\end{aligned} \qquad (8.17)$$

which, in case you are wondering, the reason the last term at the right-hand side disappears in the last step is that

$$\frac{\partial v_k}{\partial x_k} = \frac{\partial}{\partial x_k}\frac{d u_k}{dt} = \frac{d}{dt}\frac{\partial u_k}{\partial x_k}, \qquad (8.18)$$

and but, again, we’ve decided that we’re dealing with incompressible stuff, so ∂u_k/∂x_k = 0. Other than that, you should be used by now to me swapping the order of differentiation, if that’s somehow helpful, whenever there’s multiple derivatives. So, then, Navier-Stokes becomes

$$\rho\,\frac{d\mathbf{v}}{dt} = -\nabla p + \eta\,\nabla^2 \mathbf{v} + \mathbf{f}. \qquad (8.19)$$
Equation (8.19) was used by Norman Haskell532 in at least three papers533 that he published between 1935 and 1937, to estimate some average value for the viscosity of the earth’s interior. That’s the first time534 that people tried to come up with some measure of the earth’s viscosity535. Now, viscosity, like I said, is an essential ingredient of what comes next, so before going on with the rest of the chapter, I am going to take some time and tell you precisely what Haskell did.
8.5 Postglacial Rebound

Haskell got his idea from the observation, made shortly before his time, that, today, the surface of the earth is in the process of “bouncing” back up, after being relieved of the load of ice from the last ice age (see Chap. 5). Whether the earth be viscous, or elastic, or something in between, its surface must have been deformed under the weight of all the ice; and after the ice was gone, and its weight removed, it must have gotten back to the shape it had before the glaciation. Except, if the earth is viscous, the “post-glacial” uplift won’t be a fast, elastic rebound, but a slow one—the higher the viscosity, the slower the process. People realized, actually, that the earth’s surface is still bouncing back from the last ice age, because, for instance, that would be the best (the only?) explanation for the decline in sea level observed around the
Baltic (and, later, in plenty other places that had been covered by ice): the decline was actually an uplift of the solid earth, not related to a change in the overall volume of water around it536. Haskell figured he could make a mathematical model of uplift, with the earth’s viscosity η as a parameter, and then compare the model to measurements of the uplift, looking for a value of η that would make model and data “fit” each other. Because in fact, he writes, “the progress of this recovery has been dated537 [...] and F. Nansen538 has constructed curves illustrating the uplift of the Fennoscandian region539 from 16,000 B.C. to the present time.” Haskell’s model consisted of a half space made of a material that obeyed Eq. (8.19), with η and ρ constant throughout. To calculate uplift as a function of time for a given value of η one needs initial conditions, as we are about to see, i.e. the shape of the earth’s surface at “zero time”: but “zero time” didn’t have to be the end of the ice age: any moment after the removal of the ice load would do540. And since Nansen provided uplift data for a sequence of times after the glaciation, Haskell just had to pick one, and start his “simulation” right there. So, then, Haskell takes Eq. (8.19), but “since, in the case of the earth,” he says, “we are dealing with extremely small accelerations and very high viscosity, we may neglect the inertial terms in the equations of motion in comparison with those arising from viscous forces”, i.e., the acceleration at the left-hand side of (8.19) is very small, because everything is really slow, and (8.19) can be approximated by

$$\eta\,\nabla^2 \mathbf{v} = \nabla p - (0, 0, \rho g). \qquad (8.20)$$
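How small is “extremely small”? A Reynolds-number estimate—not a computation from Haskell’s paper, just an order-of-magnitude sketch with assumed round mantle values—makes the point:

```python
# Reynolds number Re = rho*v*L/eta, the ratio of inertial to viscous
# forces, with assumed (round) mantle values:
rho = 3.3e3           # kg/m^3, mantle density
v = 0.01 / 3.15e7     # m/s: about 1 cm per year
L = 1.0e6             # m: a ~1000 km length scale
eta = 1.0e21          # Pa*s

Re = rho * v * L / eta
print(Re)  # ~1e-21: inertia is utterly negligible, as Haskell assumes
```

With inertia twenty-odd orders of magnitude smaller than viscous forces, dropping the left-hand side of (8.19) costs essentially nothing.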
Equation (8.20) is the same as Eq. (1.1) in Haskell’s 1935 paper, with f from (8.19) replaced by (0, 0, ρg), i.e., gravity—no other body force is considered to be relevant. To keep things simple, Haskell assumes the ice load (and so then the initial deformation, too) to be symmetric around a vertical axis (a cylinder, a cone, or something like that); he places the origin of his reference frame at the intersection of that axis with the (unperturbed) surface of the earth at z = 0; and then he finds it convenient to switch from Cartesian to cylindrical coordinates (Fig. 8.2). That means that the position of a point in space is described by (i) its depth z (same as the Cartesian depth), (ii) the distance r from the origin to the point’s projection on the z = 0 surface, and (iii) the angle ϕ between the segment connecting the origin and said projection, and a reference direction that lies on the z = 0 plane. And then one can define a set of unit vectors r̂, ẑ, ϕ̂, where ẑ is just like the Cartesian unit vector associated with the vertical axis, except it points downward—depth is positive downward; r̂ always points away from the origin along the z = 0 plane; and ϕ̂ is perpendicular to both r̂ and ẑ, and points in the sense of growing ϕ—which we can take to be counterclockwise. Then, e.g., we might write velocity v = v_r r̂ + v_z ẑ + v_ϕ ϕ̂—and you have to be careful, because while ẑ is everywhere the same, r̂ and ϕ̂ point in different directions depending on where we are.
Fig. 8.2 The cylindrical coordinates of the point P are r, ϕ, z. The cylindrical coordinate z and the Cartesian coordinate z are the same thing
In cylindrical coordinates, Eq. (8.20) becomes541

$$\begin{aligned}
\frac{1}{r}\frac{\partial}{\partial r}\left(r\,\frac{\partial v_r}{\partial r}\right) - \frac{v_r}{r^2} + \frac{\partial^2 v_r}{\partial z^2} &= \frac{1}{\eta}\frac{\partial \bar p}{\partial r},\\
\frac{1}{r}\frac{\partial}{\partial r}\left(r\,\frac{\partial v_z}{\partial r}\right) + \frac{\partial^2 v_z}{\partial z^2} &= \frac{1}{\eta}\frac{\partial \bar p}{\partial z},
\end{aligned} \qquad (8.21)$$

where, like in Haskell’s paper, p̄ means p − ρgz, and so of course ∂p̄/∂r = ∂p/∂r, but ∂p̄/∂z = ∂p/∂z − ρg. There’s also the requirement that there be no compression, or expansion, i.e. (see Chap. 6) ∇ · u = 0, or ∇ · v = 0, which in cylindrical coordinates reads

$$\frac{1}{r}\frac{\partial}{\partial r}(r\,v_r) + \frac{\partial v_z}{\partial z} = 0. \qquad (8.22)$$
Before we see how the system of partial differential Eqs. (8.21) plus (8.22) can be solved, there are a couple more things that need to be specified. Haskell calls ζ = ζ(r, t) the function that describes the outer surface of the earth, which because there’s symmetry is only a function of r, not of ϕ. “If we assume”, he writes, “that ζ remains small in comparison with other distances entering into the problem, such as the radius of the applied load, we may replace the value of ∂v_z/∂z at z = ζ by its value at z = 0,” etc. This is useful when applying the boundary condition, which is, like in Chap. 6 when we were looking at seismic surface waves, that the earth’s surface be free of stresses (except for the pressure from the ice load, of course: but, like I said, Haskell has the computation start after the ice is gone, so we can set that to zero, too). The components of τ that concern the outer surface are τ_zz, τ_rz, τ_ϕz, which in cylindrical coordinates, through Eq. (8.16),

$$\tau_{zz} = -p + 2\eta\,\frac{\partial v_z}{\partial z} \qquad (8.23)$$
and

$$\tau_{rz} = \eta\left(\frac{\partial v_r}{\partial z} + \frac{\partial v_z}{\partial r}\right), \qquad (8.24)$$
which are simpler than they would be if the load weren’t symmetric about the origin: but because the initial load is symmetric, then so must be the initial deformation, and but then also the uplift that will follow, etc.; and from that it also follows that τ_ϕz = 0. That leaves us with the two boundary conditions542,543

$$\bar p(r, 0, t) + \rho g\,\zeta(r, t) - 2\eta\,\frac{\partial v_z}{\partial z}(r, 0, t) = 0 \qquad (8.25)$$

and

$$\frac{\partial v_r}{\partial z}(r, 0, t) + \frac{\partial v_z}{\partial r}(r, 0, t) = 0, \qquad (8.26)$$
plus the requirement that both v and τ be zero at infinity (very large r and/or z), because if that weren’t the case energy wouldn’t be conserved, etc. The way Haskell solves the system of Eqs. (8.21) and (8.22) is by separation of variables, i.e., he makes the hypothesis that functions R_1(r), Z_1(z), R_2(r), etc. exist, such that

$$v_r = R_1(r)\,Z_1(z), \qquad (8.27)$$

$$v_z = R_2(r)\,Z_2(z) \qquad (8.28)$$

and

$$\bar p = R_3(r)\,Z_3(z); \qquad (8.29)$$

then he substitutes (8.27) through (8.29) into (8.21) and (8.22), which gives
$$\ddot R_1 Z_1 + \frac{1}{r}\dot R_1 Z_1 - \frac{1}{r^2} R_1 Z_1 + R_1 \ddot Z_1 = \frac{1}{\eta}\dot R_3 Z_3, \qquad (8.30)$$

$$\ddot R_2 Z_2 + \frac{1}{r}\dot R_2 Z_2 + R_2 \ddot Z_2 = \frac{1}{\eta} R_3 \dot Z_3 \qquad (8.31)$$

and

$$\dot R_1 Z_1 + \frac{1}{r} R_1 Z_1 + R_2 \dot Z_2 = 0, \qquad (8.32)$$
where each dot stands for one differentiation, and of course R1 and R2 , etc. are differentiated with respect to r , and Z 1 , etc. with respect to z. The trick, now, is to divide (8.30) by R1 Z 1 , (8.31) by R2 Z 2 , and (8.32) by R2 Z 1 , so that
$$\frac{\ddot R_1}{R_1} + \frac{1}{r}\frac{\dot R_1}{R_1} - \frac{1}{r^2} = \frac{1}{\eta}\frac{\dot R_3 Z_3}{R_1 Z_1} - \frac{\ddot Z_1}{Z_1}, \qquad (8.33)$$

$$\frac{\ddot R_2}{R_2} + \frac{1}{r}\frac{\dot R_2}{R_2} = \frac{1}{\eta}\frac{R_3 \dot Z_3}{R_2 Z_2} - \frac{\ddot Z_2}{Z_2} \qquad (8.34)$$

and

$$\frac{\dot R_1}{R_2} + \frac{1}{r}\frac{R_1}{R_2} = -\frac{\dot Z_2}{Z_1}. \qquad (8.35)$$
The left-hand side of Eq. (8.35) depends on r but not on z; its right-hand side depends on z but not on r: this is precisely the situation you are after when you try to solve a partial differential equation by separation of variables: because the only way (8.35) can be true is if both its left- and right-hand sides are constant with respect to both r and z: it follows that you can equate each side of (8.35) to a constant, which means that you’ve got two ordinary differential equations to solve, instead of one partial differential equation: but ODEs are often easier than PDEs, so... Anyway, (8.35) separates as it is, but (8.33) and (8.34) require more work. Equation (8.33) separates if

$$\dot R_3 = A\,R_1; \qquad (8.36)$$

Equation (8.34) separates if

$$R_3 = B\,R_2. \qquad (8.37)$$
So let’s try that. Equation (8.33) separates into

$$\frac{\ddot R_1}{R_1} + \frac{1}{r}\frac{\dot R_1}{R_1} - \frac{1}{r^2} = -\lambda_1^2, \qquad (8.38)$$

$$\frac{A}{\eta}\frac{Z_3}{Z_1} - \frac{\ddot Z_1}{Z_1} = -\lambda_1^2, \qquad (8.39)$$

while (8.34) separates into

$$\frac{\ddot R_2}{R_2} + \frac{1}{r}\frac{\dot R_2}{R_2} = -\lambda_2^2, \qquad (8.40)$$

$$\frac{B}{\eta}\frac{\dot Z_3}{Z_2} - \frac{\ddot Z_2}{Z_2} = -\lambda_2^2, \qquad (8.41)$$

where −λ₁² and −λ₂² are two completely arbitrary constants544. Let’s look at (8.38) and (8.40) first; let’s start out with a bit of algebra, so that from (8.38) we get

$$r^2 \ddot R_1 + r\,\dot R_1 + \left(\lambda_1^2 r^2 - 1\right) R_1 = 0 \qquad (8.42)$$
and from (8.40)

$$r^2 \ddot R_2 + r\,\dot R_2 + \lambda_2^2 r^2 R_2 = 0. \qquad (8.43)$$
This is interesting, and useful, because (8.42) and (8.43) are two slightly different instances of the general equation545,

$$x^2 \ddot y(x) + x\,\dot y(x) + \left(a^2 x^2 - b^2\right) y(x) = 0, \qquad (8.44)$$
where a and b are constant and x of course is the independent variable and y = y(x) the function one looks for. And this equation, kind of like the wave equation and the conduction equation, that we’ve also learned, and some other important equations, had been studied in much detail by mathematicians and physicists long before Haskell, who, as a result, already knew all that he needed to know about it: in particular, he knew that its general solution is

$$y(x) = c_1 J_b(ax) + c_2 Y_b(ax), \qquad (8.45)$$
where c_1 and c_2 are arbitrary constants, and I’ll tell you in a sec what J_b and Y_b stand for. Equation (8.42) is just like (8.44) with a = λ_1 and b² = 1, so its solution must be

$$R_1(r) = C_1 J_1(\lambda_1 r) + C_2 Y_1(\lambda_1 r); \qquad (8.46)$$

likewise (8.43) is (8.44) with a = λ_2 and b = 0, so

$$R_2(r) = D_1 J_0(\lambda_2 r) + D_2 Y_0(\lambda_2 r). \qquad (8.47)$$
The functions J_n and Y_n are functions that tend to show up relatively frequently in physics; they are called Bessel functions of the first kind (J) and of the second kind (Y)—and (8.44) is called the Bessel equation. One could write volumes on the Bessel equation and on Bessel functions546, but to go on with this book I don’t think you need to know a lot about them; in practice, they are all oscillatory functions, like sines or cosines, but unlike sines and cosines they decay to zero as their argument grows to infinity (only +∞, positive infinity, matters, because Bessel functions are not allowed to have negative arguments). The main difference between the various Bessel functions (see Fig. 8.3) is the value they have when their argument is 0. J_0(x), which is called the 0-th order Bessel function of the first kind, is 1 when x = 0. But J_1(x), J_2(x), etc.—all other Bessel functions of the first kind—are 0 at x = 0. On the other hand, Bessel functions of the second kind, Y_n(x), all tend to −∞ when x tends to 0. Now, there’s no way we’re ever going to have infinite post-glacial deformation; it would be non-physical, as they say, so we can happily set C_2 and D_2 to zero, and

$$R_1(r) = C\,J_1(\lambda_1 r), \qquad (8.48)$$

$$R_2(r) = D\,J_0(\lambda_2 r), \qquad (8.49)$$
Fig. 8.3 Bessel functions of the first (J ) and second (Y ) kind, of orders 0, 1 and 2
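If you want to convince yourself of these properties numerically, J_n can be computed from its standard integral representation, J_n(x) = (1/π) ∫₀^π cos(nθ − x sin θ) dθ (the representation is a classical formula; the code and tolerances below are mine):

```python
import math

def J(n, x, steps=2000):
    """Bessel function of the first kind, from the standard integral
    representation J_n(x) = (1/pi) * int_0^pi cos(n*t - x*sin(t)) dt,
    evaluated with the trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # endpoint terms
    for k in range(1, steps):
        t = k * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

print(J(0, 0.0))  # ~1.0: J0 equals 1 at the origin
print(J(1, 0.0))  # ~0.0: the higher-order Jn vanish at the origin

# The identity the text uses a little later, d/dx [x*J1(x)] = x*J0(x),
# checked with a centered finite difference at x = 2:
x, h = 2.0, 1e-5
lhs = ((x + h) * J(1, x + h) - (x - h) * J(1, x - h)) / (2 * h)
print(abs(lhs - x * J(0, x)) < 1e-6)  # True
```

(A real computation would use a library routine, of course; the point here is only that these Bessel facts are easy to check.)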
where I figured that, for the sake of simplicity, I might as well drop the subscripts from the surviving constants C_1 and D_1. It follows from (8.37) and (8.49) that

$$R_3(r) = B D\,J_0(\lambda_2 r). \qquad (8.50)$$
Equations (8.48)–(8.50) solve the “radial” part of our problem (8.21) plus (8.22), but not for any values of B, C, D, λ_1 and λ_2. We still need to check that (8.48)–(8.50) verify Eqs. (8.36) and (8.35), and what we are going to find out is that they do, but only if those parameters, B, D, etc. comply with certain requirements. To begin with, it follows from (8.36) and (8.48) that

$$\dot R_3(r) = A C\,J_1(\lambda_1 r). \qquad (8.51)$$
Now, Bessel functions have a number of properties that are useful to know; they are related to each other in various ways, for instance the derivative of J_0 is equal to −J_1, and so

$$\dot J_0(\lambda_2 r) = -\lambda_2\,J_1(\lambda_2 r). \qquad (8.52)$$

So, if you differentiate both sides of (8.50),

$$\dot R_3(r) = -B D \lambda_2\,J_1(\lambda_2 r). \qquad (8.53)$$
But then, that means the right-hand side of (8.53) must coincide with the right-hand side of (8.51),

$$-B D \lambda_2\,J_1(\lambda_2 r) = A C\,J_1(\lambda_1 r), \qquad (8.54)$$

which the only way this can happen is if

$$\lambda_2 = \lambda_1, \qquad (8.55)$$
so that the Bessel functions at the left- and right-hand side are “in phase” with one another, and

$$-B D \lambda = A C, \qquad (8.56)$$

where I’ve dropped the subscript from λ, too, now, as we’ve just learned that λ_1 and λ_2 must be the same thing. By the logic of separation of variables, the left-hand side of Eq. (8.35) must also be constant. It can be rewritten

$$\begin{aligned}
\frac{\dot R_1}{R_2} + \frac{1}{r}\frac{R_1}{R_2} &= \frac{1}{r R_2}\left(r\,\dot R_1 + R_1\right)\\
&= \frac{1}{r R_2}\frac{\partial}{\partial r}\left(r R_1\right)\\
&= \frac{1}{r R_2}\frac{\partial}{\partial r}\left[r\,C\,J_1(\lambda r)\right]\\
&= \frac{C\lambda}{R_2}\,J_0(\lambda r)\\
&= \frac{C\lambda}{D},
\end{aligned} \qquad (8.57)$$

where I’ve used (8.48), first, with λ_1 duly replaced by λ, and then another one of those nice properties of Bessel functions, namely the one that says that547 d/dx [x J_1(x)] = x J_0(x), and finally (8.49) to replace J_0 with R_2/D. This actually doesn’t introduce any new constraints on the free parameters, but has to be taken into account when solving the z-equations. The z-equations we need to solve are (8.39) and (8.41), with both λ_1 and λ_2 replaced by λ, and

$$\frac{\dot Z_2}{Z_1} = -\frac{C\lambda}{D}, \qquad (8.58)$$

which follows from (8.35) and (8.57). To solve this problem, start by multiplying both sides of (8.39) by Z_1, and differentiate, and move things left and right so that

$$\dot Z_3 = \frac{\eta}{A}\left(\dddot Z_1 - \lambda^2 \dot Z_1\right); \qquad (8.59)$$

playing around with (8.41) in a similar way, you can get

$$\dot Z_3 = \frac{\eta}{B}\left(\ddot Z_2 - \lambda^2 Z_2\right), \qquad (8.60)$$

but then that means that

$$\frac{1}{A}\left(\dddot Z_1 - \lambda^2 \dot Z_1\right) = \frac{1}{B}\left(\ddot Z_2 - \lambda^2 Z_2\right). \qquad (8.61)$$
The trick now is to look at (8.58), which if you differentiate it once you get

$$\dot Z_1 = -\frac{D}{C\lambda}\,\ddot Z_2, \qquad (8.62)$$

and if you differentiate twice more,

$$\dddot Z_1 = -\frac{D}{C\lambda}\,\ddddot Z_2. \qquad (8.63)$$
Sub that into (8.61), and you get a fourth-order, ordinary differential equation,

$$-\frac{1}{A}\left(\frac{D}{C\lambda}\,\ddddot Z_2 - \lambda\,\frac{D}{C}\,\ddot Z_2\right) = \frac{1}{B}\left(\ddot Z_2 - \lambda^2 Z_2\right), \qquad (8.64)$$

with only one unknown function, that is Z_2. Equation (8.64) can be compacted quite a bit through (8.56), which says that BD/(AC) = −1/λ, so that

$$\frac{1}{\lambda^2}\,\ddddot Z_2 - \ddot Z_2 = \ddot Z_2 - \lambda^2 Z_2, \qquad (8.65)$$

or

$$\ddddot Z_2 - 2\lambda^2 \ddot Z_2 + \lambda^4 Z_2 = 0. \qquad (8.66)$$
This is another equation which, if you’ve studied some relatively advanced math, is probably a piece of cake for you to solve. But even if you wouldn’t know how to attack it, it should be OK for you, at this point in this book, to verify “by direct substitution” that the functions e^{−λz}, ze^{−λz}, e^{λz} and ze^{λz} are all solutions to (8.66). Also, there’s four of them, and they are linearly independent548, and because (8.66) is a fourth-order ODE, that means their linear combination, with (four) arbitrary coefficients, is the general solution of (8.66). It reads

$$Z_2(z) = E e^{-\lambda z} + F z e^{-\lambda z} + G e^{\lambda z} + H z e^{\lambda z}, \qquad (8.67)$$
where E, F, G, H are the arbitrary coefficients I am talking about. Now, solutions that grow when z grows are non-physical (imagine a load, placed at the surface of the earth, that causes more deformation at large depths than near the surface itself, and the larger the depth the larger the deformation... doesn’t make much sense, right?), so, if we agree that from now on λ has got to be positive, we must set G = H = 0, and we are left with
8 The Forces that Shape the Earth: Convection and Plates
$$Z_2(z) = Ee^{-\lambda z} + Fze^{-\lambda z}\qquad(8.68)$$
(we could also have decided that $\lambda < 0$, and set $E$ and $F$ to zero). Now that $Z_2$ is sorted out, we can derive $Z_1$ and $Z_3$ from it. Equation (8.58) says that

$$Z_1 = -\frac{D}{C\lambda}\dot Z_2 = \frac{D}{C}\left(E - \frac{F}{\lambda} + Fz\right)e^{-\lambda z};\qquad(8.69)$$

Equation (8.39) says that
$$Z_3 = \frac{\eta}{A}\left(\ddot Z_1 - \lambda^2 Z_1\right);\qquad(8.70)$$
we need to differentiate (8.69) twice:

$$\dot Z_1 = -\lambda\frac{D}{C}\left(E - \frac{F}{\lambda} + Fz\right)e^{-\lambda z} + \frac{D}{C}Fe^{-\lambda z} = -\lambda\frac{D}{C}\left(E - \frac{2F}{\lambda} + Fz\right)e^{-\lambda z},\qquad(8.71)$$

and

$$\ddot Z_1 = \lambda^2\frac{D}{C}\left(E - \frac{2F}{\lambda} + Fz\right)e^{-\lambda z} - \lambda\frac{D}{C}Fe^{-\lambda z} = \lambda^2\frac{D}{C}\left(E - \frac{3F}{\lambda} + Fz\right)e^{-\lambda z}.\qquad(8.72)$$
Sub (8.69) and (8.72) into (8.70), and

$$Z_3 = -\frac{2DF\lambda\eta}{AC}e^{-\lambda z}.\qquad(8.73)$$
Both the $r$- and $z$-problems are solved, so we can substitute everything into (8.27)–(8.29), which become

$$v_r = \left(E - \frac{F}{\lambda} + Fz\right)J_1(\lambda r)e^{-\lambda z},\qquad(8.74)$$

$$v_z = (E + Fz)J_0(\lambda r)e^{-\lambda z}\qquad(8.75)$$

and

$$\bar p = 2F\eta\,J_0(\lambda r)e^{-\lambda z},\qquad(8.76)$$
which I’ve managed to simplify a bit using (8.56), and replacing $DE$ and $DF$ with $E$, $F$ (or, which is the same, setting $D$ to 1): because $D$ would only show up multiplied by $E$ and $F$, so there’s no way we could separate it from them, even through the boundary conditions: but that’s OK, because we don’t need to. Anyway, so, through the boundary conditions we can/must reduce the arbitrariness in $E$, $F$. We’ve already required the solution to go to zero at large $r$ and large $z$. But we haven’t yet worried about the surface. After the load is removed (the extra ice has melted and we’re back to normal) the surface must be free of stress, and we’ve seen that formally that is described by Eqs. (8.25) and (8.26). If you differentiate the expressions we’ve found for $v_r$ and $v_z$,
$$\frac{\partial v_r}{\partial z} = (-\lambda E - \lambda Fz + 2F)J_1(\lambda r)e^{-\lambda z}\qquad(8.77)$$

and

$$\frac{\partial v_z}{\partial r} = -\lambda(E + Fz)J_1(\lambda r)e^{-\lambda z},\qquad(8.78)$$
which if you set $z = 0$ in both and then sub into (8.26), you get

$$(-\lambda E + 2F)J_1(\lambda r) - \lambda E J_1(\lambda r) = 0,\qquad(8.79)$$

where $J_1(\lambda r)$ cancels out, so that you’re left with

$$F = \lambda E.\qquad(8.80)$$
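Here, too, one can let sympy do the algebra. A sketch (assuming, as (8.77)–(8.79) suggest, that the shear-free condition (8.26) amounts to $\partial v_r/\partial z + \partial v_z/\partial r = 0$ at $z = 0$), checking that $F = \lambda E$ does the job:

```python
import sympy as sp

# Assume the shear-free surface condition (8.26) reads
# dvr/dz + dvz/dr = 0 at z = 0, per (8.77)-(8.79).
r, z, lam, E = sp.symbols('r z lam E', positive=True)
F = lam * E                                                      # Eq. (8.80)
vr = (E - F/lam + F*z) * sp.besselj(1, lam*r) * sp.exp(-lam*z)   # Eq. (8.74)
vz = (E + F*z) * sp.besselj(0, lam*r) * sp.exp(-lam*z)           # Eq. (8.75)
shear = (sp.diff(vr, z) + sp.diff(vz, r)).subs(z, 0)
assert sp.simplify(shear) == 0
```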
So then, the solutions (8.74)–(8.76) can be rewritten

$$v_r = \lambda zE\,J_1(\lambda r)e^{-\lambda z},\qquad(8.81)$$

$$v_z = (1 + \lambda z)E\,J_0(\lambda r)e^{-\lambda z}\qquad(8.82)$$

and

$$\bar p = 2\lambda\eta E\,J_0(\lambda r)e^{-\lambda z},\qquad(8.83)$$
which one might substitute into the other boundary condition, (8.25), to find a value for $E$. Problem is, though, we have never specified a value for $\lambda$, and everything we’ve done would work whatever the value of $\lambda$; so the $E$ we would get out of this exercise would be itself a function of $\lambda$. The way differential equations work549, if you have multiple values of $\lambda$ for which (8.81)–(8.83) solve (8.21) and (8.22), then the sum of the solutions associated to those values of $\lambda$ is also a solution to (8.21) and (8.22). For each value of $\lambda$ between 0 and $\infty$ (remember that, earlier, we had to pick $\lambda > 0$), one can find a value $E = E(\lambda)$ through (8.25), so Haskell figures that, most generally,
$$v_r = z\int_0^\infty \lambda E(\lambda)J_1(\lambda r)e^{-\lambda z}\,d\lambda,\qquad(8.84)$$

$$v_z = \int_0^\infty (1 + \lambda z)E(\lambda)J_0(\lambda r)e^{-\lambda z}\,d\lambda\qquad(8.85)$$

and

$$\bar p = 2\eta\int_0^\infty \lambda E(\lambda)J_0(\lambda r)e^{-\lambda z}\,d\lambda;\qquad(8.86)$$
and so that is what we are going to plug into (8.25). Which gives

$$\begin{aligned}
\rho g\zeta(r,t) &= -2\eta\int_0^\infty \lambda E(\lambda)J_0(\lambda r)\,d\lambda + 2\eta\left[\frac{\partial}{\partial z}\int_0^\infty (1+\lambda z)E(\lambda)J_0(\lambda r)e^{-\lambda z}\,d\lambda\right]_{z=0}\\
&= -2\eta\int_0^\infty \lambda E(\lambda)J_0(\lambda r)\,d\lambda + 2\eta\left[\int_0^\infty \left(\lambda E(\lambda)J_0(\lambda r)e^{-\lambda z} - \lambda(1+\lambda z)E(\lambda)J_0(\lambda r)e^{-\lambda z}\right)d\lambda\right]_{z=0}\\
&= -2\eta\int_0^\infty \lambda E(\lambda)J_0(\lambda r)\,d\lambda - 2\eta\left[z\int_0^\infty \lambda^2 E(\lambda)J_0(\lambda r)e^{-\lambda z}\,d\lambda\right]_{z=0}\\
&= -2\eta\int_0^\infty \lambda E(\lambda)J_0(\lambda r)\,d\lambda.
\end{aligned}\qquad(8.87)$$

What we are going to do next is, we are going to turn (8.87) into an ODE, with $t$ as independent variable, and $E = E(\lambda, t)$ as the unknown function we are looking for (because, yes, $E$ has got to be able to change with $t$, if we want (8.87) to hold). The way we are going to do this is, consider that $\zeta$ is the vertical displacement measured at $z = 0$, or the deformation of the $z = 0$ surface; anyway

$$v_z(r, 0, t) = \frac{d\zeta}{dt}(r, t),\qquad(8.88)$$
and if we differentiate (8.87) with respect to time

$$-2\eta\int_0^\infty \lambda\,\frac{dE}{dt}(\lambda, t)\,J_0(\lambda r)\,d\lambda = \rho g\frac{d\zeta}{dt}(r, t) = \rho g\,v_z(r, 0, t) = \rho g\int_0^\infty E(\lambda, t)\,J_0(\lambda r)\,d\lambda,\qquad(8.89)$$
where at the right-hand side I’ve replaced $v_z$ with its expression (8.85), setting $z = 0$. But so then it all boils down to

$$\frac{dE}{dt}(\lambda, t) + \frac{\rho g}{2\eta\lambda}E(\lambda, t) = 0,\qquad(8.90)$$
which is the ODE we were looking for. We’ve seen equations like this one in this book before, and you might guess that its solution is

$$E(\lambda, t) = G(\lambda)\,e^{-\frac{\rho g}{2\eta\lambda}t},\qquad(8.91)$$
with $G(\lambda)$ an arbitrary “constant” (constant in $t$, that is: it can still differ from one $\lambda$ to another). Which can be fully determined via the initial conditions, i.e., some estimate of the vertical velocity at the surface at zero time, which Haskell can extrapolate from measurements of surface uplift made at different times; plus, another interesting property of the Bessel functions. In practice: sub (8.91) into (8.85) and set both $z$ and $t$ to 0: then

$$v_z(r, 0, 0) = \int_0^\infty d\lambda\,G(\lambda)J_0(\lambda r).\qquad(8.92)$$
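One nice thing to read off (8.91) before carrying on: each wavenumber $\lambda$ relaxes with its own time constant, $\tau = 2\eta\lambda/(\rho g)$. A back-of-envelope sketch (the load wavelength and the density are assumed here for illustration, not taken from the text):

```python
import math

# Decay time of one wavenumber, read off the exponent of (8.91):
# tau = 2 * eta * lam / (rho * g).
eta = 1e21               # viscosity, Pa s (Haskell's value)
rho, g = 3.3e3, 9.8      # density (kg/m^3) and gravity (m/s^2), assumed
L = 3.0e6                # load "wavelength" ~ size of Fennoscandia, m (assumed)
lam = 2.0 * math.pi / L  # corresponding wavenumber, 1/m

tau_yr = 2.0 * eta * lam / (rho * g) / 3.15e7
print(round(tau_yr))     # a few thousand years
```

A few thousand years to relax a Fennoscandia-sized depression: just the time scale over which postglacial rebound is in fact observed.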
Now multiply both sides by $rJ_0(\lambda' r)$, where $\lambda'$ can have any value between 0 and $\infty$:

$$v_z(r, 0, 0)\,rJ_0(\lambda' r) = rJ_0(\lambda' r)\int_0^\infty d\lambda\,G(\lambda)J_0(\lambda r).\qquad(8.93)$$
Integrate both sides over $r$, from 0 to $\infty$:

$$\int_0^\infty dr\,v_z(r, 0, 0)\,rJ_0(\lambda' r) = \int_0^\infty dr\,rJ_0(\lambda' r)\int_0^\infty d\lambda\,G(\lambda)J_0(\lambda r),\qquad(8.94)$$
which, if I swap the order of the two integrals at the right-hand side, becomes

$$\int_0^\infty dr\,v_z(r, 0, 0)\,rJ_0(\lambda' r) = \int_0^\infty d\lambda\,G(\lambda)\int_0^\infty dr\,rJ_0(\lambda' r)J_0(\lambda r).\qquad(8.95)$$
Now, somewhat similar to sines and cosines550, Bessel functions are “orthogonal” in the sense that

$$\int_0^\infty dr\,rJ_n(\lambda r)J_n(\lambda' r) = \frac{\delta(\lambda - \lambda')}{\lambda},\qquad(8.96)$$
whatever the value of $n$, and with $\delta$ denoting the Dirac delta function551. (So, that means that the integral at the left-hand side of (8.96) is zero unless $\lambda = \lambda'$.) But then, by the very nature of the Dirac function, (8.95) becomes

$$\int_0^\infty dr\,v_z(r, 0, 0)\,rJ_0(\lambda' r) = \int_0^\infty d\lambda\,G(\lambda)\frac{\delta(\lambda - \lambda')}{\lambda} = \frac{G(\lambda')}{\lambda'},\qquad(8.97)$$
and we might drop the prime and just write

$$G(\lambda) = \lambda\int_0^\infty dr\,v_z(r, 0, 0)\,rJ_0(\lambda r);\qquad(8.98)$$
and you see that (8.98) can be used to calculate $G(\lambda)$ “numerically” (i.e., in 1935, by hand) for as many values of $\lambda$ as we might want. Once you have $G(\lambda)$ for “all” $\lambda$s, you can also calculate $v_z$ numerically; because if you sub (8.91) into (8.85) you get

$$v_z(r, z, t) = \int_0^\infty (1 + \lambda z)G(\lambda)e^{-\frac{\rho g}{2\eta\lambda}t}J_0(\lambda r)e^{-\lambda z}\,d\lambda,\qquad(8.99)$$
which, now that we have a way of knowing $G(\lambda)$, is enough, in principle, to determine $v_z$ at any point $r, z$ and at any time $t$ that one cares about. We are really only interested in $z = 0$, though, because that is where we have data. So let’s set $z = 0$ in (8.99), and it becomes

$$v_z(r, 0, t) = \int_0^\infty G(\lambda)J_0(\lambda r)e^{-\frac{\rho g}{2\eta\lambda}t}\,d\lambda.\qquad(8.100)$$
What we have here is a formula to calculate the vertical component of postglacial rebound, at a given distance from the presumed center of the ice load, for a given value of $\eta$: which then you can compare to the data, recorded for multiple $t$s at the same location, and see if they look anything alike: which if they don’t you try again with another $\eta$, and so on and so forth until you get a decent fit. That’s precisely what Haskell did, using Nansen’s data: and that’s how he came up with $\eta \approx 10^{21}$ Pa s.
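Haskell’s by-hand quadratures can be sketched numerically. Everything below that is not in the text is assumed: the initial uplift-rate profile is a made-up Gaussian (not Nansen’s data), and the grids are chosen by eye; the point is just the pipeline (8.98) → (8.100).

```python
import numpy as np
from scipy.special import j0
from scipy.integrate import trapezoid

rho, g, eta = 3.3e3, 9.8, 1e21      # density, gravity, Haskell's viscosity (SI)
v0, R = 3e-10, 1.0e6                # peak uplift rate (m/s) and load radius (m), assumed

r = np.linspace(0.0, 5 * R, 2000)   # radial grid, m
vz0 = v0 * np.exp(-(r / R) ** 2)    # assumed initial profile vz(r, 0, 0)

lam = np.linspace(1e-8, 2e-5, 2000)  # wavenumbers, 1/m

# Eq. (8.98): G(lam) = lam * int_0^inf dr r vz(r,0,0) J0(lam r)
G = lam * trapezoid(r * vz0 * j0(np.outer(lam, r)), r, axis=1)

def vz_surface(r_obs, t):
    """Eq. (8.100): int_0^inf G(lam) J0(lam r) exp(-rho g t / (2 eta lam)) dlam."""
    decay = np.exp(-rho * g * t / (2.0 * eta * lam))
    return trapezoid(G * j0(lam * r_obs) * decay, lam)

# At t = 0 the synthesis must return the input profile; as t grows the
# rebound rate decays, each wavenumber with its own time constant.
print(vz_surface(0.0, 0.0) / v0)                     # close to 1
print(vz_surface(0.0, 1e11) / vz_surface(0.0, 0.0))  # noticeably below 1
```

In Haskell’s actual procedure one would, at this point, vary `eta` until the decay of `vz_surface` matches the uplift-rate data.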
8.6 Convection

That was quite a detour, but let us get back to the main reason we got involved with viscosity: we were wondering whether convection might actually take place in the earth; it should be clear by now that a high-viscosity substratum would resist deformation more, and therefore be less likely to convect, than a lower-viscosity one; but is $10^{21}$ Pa s, our best estimate for the earth’s viscosity, high or low? is it low enough to allow convection? Rayleigh, who studied convection (and not only from the geophysical perspective) in the early twentieth century, found a way to answer this question: a rule of thumb that people refer to as Rayleigh’s number and which now I am going to try to explain to you, what it is and how it works. I am going to have to start with some explanation of what convection precisely is, just to make sure we are on the same page. I would guess that most of you have heard about convection in school, at some point; that you’ve been told to look at boiling water circulating in a pot, etc. Anyway, in the words of Count Rumford552, who was among the first to look at convection “scientifically” at the end of the eighteenth century, “in the course of a set of experiments on the communication of Heat, in which I had occasion to use thermometers of an uncommon size (their globular bulbs being above four inches in diameter) filled with various kinds of liquids, having exposed one of them, which was filled with spirits of wine, in as great a heat as it was capable of supporting, [...] I saw the whole mass of the liquid in the tube in a most rapid motion, running swiftly in two opposite directions, up and down at the same time. [...] Some fine particles of dust [...], which were intimately mixed with the spirit of wine, on their being illuminated by the sun’s beams, became perfectly visible, [...] and by their motion discovered the violent motions by which the spirit of wine in the tube of the thermometer was agitated. [...] 
On examining the motion of the spirits of wine with a lens, I found that the ascending current occupied the axis of the tube, and that it descended by the sides of the tube.”553 So, Rumford’s stuff is pretty old, but then apparently there isn’t much research about convection (though there’s a lot of research about conduction of heat, like we’ve seen in the last chapter), until the work of Henri Bénard at the turn of the twentieth century. What Bénard did is, he looked at thin layers of fluid, think some oily (viscous) liquid (apparently, Bénard had a preference for whale oil) distributed on a fairly large pan, and observed what happened when you heated it, for example from below, while the temperature of the atmosphere above stayed pretty much
constant. And the main point is, he found that a pattern of convection cells (i.e., according to the Oxford English Dictionary, “self-contained area[s] in a fluid in which upward motion of warmer fluid in the centre is balanced by downward motion of cooler fluid at the periphery”) forms, and that, seen from above, the cells were, strangely enough, hexagonal. That was, incidentally, the topic of his Ph.D. thesis, Les Tourbillons Cellulaires dans une Nappe Liquide Propageant de la Chaleur par Convection, en Régime Permanent, which he defended in 1901 at the Faculté des sciences, University of Paris.
8.7 Rayleigh Number

Now, then, Rayleigh looked at Bénard’s work and was able to do the math and come up with a theoretical explanation for Bénard’s results, which he published (Rayleigh published) in 1916554; we are not going to get into all the details of that, but one thing that’s definitely worth looking at is Rayleigh’s introduction of what would later come to be called “Rayleigh number.” Which is a helpful concept when one has to think about convection, in general, not just in thin layers. The Rayleigh number is a dimensionless number whose value tells us whether a system, which might be the interior of the earth, for example, or of another planet, is convecting or not. Let us derive Rayleigh’s formula. Convection happens because some parts of a viscous material are heated up, and heating changes their density, usually actually decreases their density, so then if density goes down, the material that’s been heated also becomes lighter, and if it’s lighter that means it’s also more buoyant (geophysicists love this word, I am not sure why: they use it all the time), i.e., by Archimedes’ principle, it tends to rise up. We’ve already met $\alpha$, AKA the volumetric coefficient of thermal expansion: remember that $\delta\rho = \rho\alpha\delta T$, by definition, and if you multiply $\delta\rho$ by the volume $V$ of the rock parcel whose temperature’s been changed by $\delta T$, you get a change in mass, and if you multiply that by gravitational acceleration, $g$, you get the force that’s pushing the rock up, the buoyancy force,

$$F = -V\rho\alpha\delta T g.\qquad(8.101)$$
I figured we might use a reference frame where g and ρ etc. are functions of depth (rather than distance from the center of the earth); depth is positive downwards (otherwise it wouldn’t be called depth): and but F in (8.101) must point upwards, i.e., it’s got to be negative, if δT is positive, and vice versa: if you cool a parcel of rock rather than heating it, it’ll get denser and tend to sink. Hence the negative sign at the right-hand side. It’s useful (you’ll see why in a sec) to write V = a1 d 3 , where d is some measure of the lateral extent of the parcel—a length —and the constant a1 depends on the parcel’s shape and is “dimensionless”, as they say, i.e., just a pure number, with no physical dimension or unit of measurement. (For example, if the parcel is spherical,
then $d$ can be its radius and $a_1 = \frac{4}{3}\pi$, so that $V = \frac{4}{3}\pi d^3$, which we know must be the case for a sphere of radius $d$.) In general,

$$F = -a_1 d^3\rho\alpha\delta T g,\qquad(8.102)$$
where we don’t specify the value of $a_1$, and so things might change, to some extent, depending on the shape of the rock parcel that’s initially warmed up. Another parameter that contributes to deciding whether a certain $\delta T$ can trigger convection is the material’s viscosity. You shouldn’t be surprised, since I’ve just used so many pages to explain viscosity in general, and how the viscosity of the earth is “known”, sort of. Now, viscosity enters the Rayleigh number story through the so-called Stokes’ law, which we haven’t met yet, which stipulates that the speed at which the rock parcel either rises or sinks is inversely proportional to viscosity (the “stronger”, i.e., more viscous the medium, the more it will resist flow) and directly proportional to buoyancy force. Formally555,

$$v = \frac{a_2 F}{\eta d},\qquad(8.103)$$
where $\eta$ is viscosity, $F$ is buoyancy force, and the constant $a_2$, like $a_1$, depends on the geometry of the rock parcel that’s been heated (or cooled), whose spatial extent is still $d$. If you substitute (8.102) into (8.103),

$$v = -\frac{a_1 a_2 d^2\rho\alpha\delta T g}{\eta}.\qquad(8.104)$$
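Plugging numbers into (8.104) already gives geologically interesting speeds. A sketch, taking $a_1 a_2 \sim 1$ and dropping the sign; the parcel size and $\delta T$ below are assumed for illustration, while the material values are the substratum values quoted later in this section:

```python
# Order-of-magnitude rise speed from (8.104), with a1*a2 ~ 1, sign dropped.
d = 1.0e5      # parcel size, m (assumed)
dT = 100.0     # thermal anomaly, deg C (assumed)
rho, alpha, g, eta = 4e3, 2e-5, 10.0, 1e21   # substratum values (see below)

v = d**2 * rho * alpha * dT * g / eta        # m/s
v_cm_yr = v * 3.15e7 * 100.0                 # convert to cm per year
print(v_cm_yr)   # a few cm per year
```

A few centimeters per year: the same order of magnitude, it will turn out, as the speed at which continents drift.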
Now what happens is that, in the time it takes the heated (cooled) rock parcel to rise (sink) a certain distance towards the surface (center) of the earth, its $T$ will be decreased (increased) via conduction with the surrounding rock, which is cooler (hotter) and becomes even more so as depth is reduced (increased). So there will be a time $\delta t$ after which the temperature of the rock parcel will be the same as that of the surrounding rock. To figure out $\delta t$ we might use Newton’s cooling law, which we met a couple times in Chap. 4 and but basically, in case you forgot, just states the empirical fact that the rate at which heat is exchanged between two bodies is proportional to the difference in their temperatures; or, the rate of heat loss of a body is directly proportional to the difference between its temperature and that of its environment. In Chap. 4 we wrote it556

$$\frac{dT}{dt} = -\frac{k\delta S\,\delta T}{c_m m\delta z} = -\frac{k\delta S\,[T(t) - T_{\rm amb}]}{c_m m\delta z} = -\frac{k\delta S}{c_m m\delta z}T(t) + \frac{k\delta S}{c_m m\delta z}T_{\rm amb}.\qquad(8.105)$$
Now $T(t)$ would be the temperature of the rock parcel at the time $t$, $\delta S$ the area of its outer surface, and $T_{\rm amb}$ the temperature of the rock all around it, which we assume to be constant. And, of course, $m$ is the mass of the rock parcel; $\delta z$ its vertical extent; $c_m$ the heat capacity of the stuff it’s made of and $k$ its thermal conductivity, both defined in Chap. 4. Equation (8.105) states that the function $T(t)$ is proportional to its own derivative, plus a constant term. There is one function that is proportional to its own derivative, and that’s the exponential function; so we might write

$$T(t) = A + Be^{Ct},\qquad(8.106)$$
and you’ll forgive me if I use, again, $A$, $B$ and $C$ to mean three (not so) arbitrary constants. When $t$ becomes very large, $T(t)$ must converge to $T_{\rm amb}$, which means that $C$ must be real and negative (otherwise the exponential term would become infinitely large); but then $e^{Ct}$ can only decrease to zero as $t$ grows, and we must have $A = T_{\rm amb}$. When $t = 0$, on the other hand, (8.106) boils down to $T(0) = A + B$, which means that we must have $B = T(0) - T_{\rm amb}$. Sub into (8.106) these expressions we’ve just found for $A$ and $B$, differentiate with respect to $t$, and you’ll see that, for (8.105) to hold, we must have $C = -\frac{k\delta S}{c_m m\delta z}$. Put all these findings together, and we have the solution

$$T(t) = T_{\rm amb} + [T(0) - T_{\rm amb}]\,e^{-\frac{k\delta S}{c_m m\delta z}t}.\qquad(8.107)$$

We might replace $\delta z$ with the parcel’s “size” parameter $d$, and $k = \kappa c_m\rho$ (remember we met $\kappa$ in Chap. 4: we called it thermal diffusivity). Then, if you look at the argument of the exponential in (8.107),
$$-\frac{k\delta S}{c_m m\delta z} = -\frac{\kappa c_m\rho\,\delta S}{c_m m d} = -\frac{\kappa\delta S}{Vd} = -\frac{\kappa}{a_3 d^2},\qquad(8.108)$$
where the last step amounts to considering that, whatever the shape of the parcel, its volume $V$ is proportional to $d^3$, and $\delta S$ to $d^2$; similar to $a_1$ and $a_2$, $a_3$ is a constant that depends on the shape of the parcel557. Substituting into (8.107), we find that the solution to (8.105) can be written

$$T(t) = T_{\rm amb} + [T(0) - T_{\rm amb}]\,e^{-\frac{\kappa}{a_3 d^2}t},\qquad(8.109)$$
where the arbitrary constant is chosen so that at $t = 0$, $T = T(0)$. We’re interested in the time $\delta t$ needed so that $T(\delta t) = T_{\rm amb}$, which if we sub into (8.109) we get

$$[T(0) - T_{\rm amb}]\,e^{-\frac{\kappa}{a_3 d^2}\delta t} = 0.\qquad(8.110)$$
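If you’d rather not juggle the constants by hand, sympy’s ODE solver confirms (8.109); a quick cross-check that it solves the cooling law (8.105) with rate $\kappa/(a_3 d^2)$:

```python
import sympy as sp

# Check that (8.109) solves dT/dt = -(kappa/(a3*d^2)) * (T - Tamb), T(0) = T0.
t = sp.symbols('t', positive=True)
kappa, a3, d, Tamb, T0 = sp.symbols('kappa a3 d Tamb T0', positive=True)
T = sp.Function('T')
sol = sp.dsolve(sp.Eq(T(t).diff(t), -(kappa/(a3*d**2))*(T(t) - Tamb)),
                T(t), ics={T(0): T0})
expected = Tamb + (T0 - Tamb)*sp.exp(-kappa*t/(a3*d**2))
assert sp.simplify(sol.rhs - expected) == 0
```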
Strictly speaking, this never happens (unless $T(0)$ is already the same as $T_{\rm amb}$, but obviously that isn’t very interesting), because the exponential in (8.110) is never zero. But, exponentials become small pretty quickly; so we might consider that already when the exponent in (8.110) is about $-1$, that is when $\delta t\sim a_3 d^2/\kappa$, then the exponential in (8.110) is sensibly less than unity ($e$, AKA Euler’s number, is about 3, so at that point the left-hand side of (8.110) will be about $[T(0) - T_{\rm amb}]/3$) and then as $t$ grows it will quickly become smaller; so, anyway, we can take

$$\delta t\approx\frac{a_3 d^2}{\kappa},\qquad(8.111)$$
as an estimate of the order of magnitude of the time needed for the rock parcel to lose its thermal buoyancy. So, now, to find out how far the rock parcel can go during this time, you multiply the speed at which it moves, from Stokes’ law (8.103), with time itself. This is, of course, another brutal approximation, because in practice $v$ won’t be constant over time, but it will decrease as time passes and the rock’s temperature changes, and its density changes, too, as a result, and so does its buoyancy; but the Rayleigh number is not something that you need to know with any precision: we care about orders of magnitude. So, let’s call $l$ the distance covered by the rock parcel, and

$$l\approx|v\,\delta t|\approx\frac{a_1 a_2 d^2\rho\alpha\delta T g}{\eta}\,\frac{a_3 d^2}{\kappa}\approx\frac{\rho\alpha\delta T g d^4}{\eta\kappa},\qquad(8.112)$$
where we have replaced $a_1 a_2 a_3$ with 1, which is a pretty brutal approximation, but OK for our goals558: because the next step consists of saying: if the distance $l$ traveled by that chunk of rock before it cools down and loses its buoyancy is much bigger than the size of the chunk of rock itself, then chances are that convection is actually triggered, i.e., that stuff starts to flow around. That is why people look at the ratio

$$\frac{l}{d}\approx\frac{\rho\alpha\delta T g d^3}{\eta\kappa},\qquad(8.113)$$
which they call the Rayleigh number, or Ra, and the rule of thumb is that if the viscosity and diffusivity and expansivity, and the thermal anomaly δT that’s introduced by heating or cooling, and the area over which there is a thermal anomaly, are such that the Rayleigh number is much larger than 1, then we reasonably expect the material we’re looking at to convect. I know that “much larger than 1” is pretty vague, but there is no recipe to find out what really the value of the Rayleigh number is, above which there’d be convection for sure: it has to be determined experimentally, case by case559 , and as far as the earth’s interior is concerned one can only make
guesses. In practice, we can go and substitute into (8.113) all the info we have from seismology plus Williamson-Adams-type stuff ($\rho$ and $g$, which are both functions of $z$), and from laboratory studies of earth materials ($\alpha$ and $\kappa$, which, of course, again, carry lots of uncertainty, because in principle you should measure that, thermal expansion and diffusivity, at pressures and temperatures as high as we expect to find deep inside the earth; and that’s not easy to do at all, and so people have to extrapolate, etc.), and viscosity as we figure it out from post-glacial rebound data and models, and then we can make guesses as to $d$ and $\delta T$, and see how large the Rayleigh number would be. Here, then, are some order-of-magnitude values560 for the earth’s substratum: $\rho = 4\times10^3$ kg m$^{-3}$; $\alpha = 2\times10^{-5}$ °C$^{-1}$; $g = 10$ m s$^{-2}$; $\eta = 10^{21}$ Pa s; $\kappa = 8\times10^{-7}$ m$^2$ s$^{-1}$. Substitute all that into (8.113) and

$$Ra\approx\frac{4\times10^3\times2\times10^{-5}\times10}{10^{21}\times8\times10^{-7}}\,\delta T\,d^3\approx10^{-15}\,\delta T\,d^3,\qquad(8.114)$$
which $10^{-15}$ is pretty small, but if you imagine a temperature anomaly $\delta T$ of the order of, I don’t know, $10^2$ °C over a $d$ of about $10^6$ m, $Ra$ is already as large as $10^5$, which is definitely much larger than 1 (in experiments, it all depends on the setup, like I said, but to get convection to start it’s often sufficient for $Ra$ to be of the order of $10^2$). Ergo, chances are that there is convection in the substratum, if sufficiently large areas of it are heated, by radioactive decay from within, or conduction from the core, which is presumably hot (molten iron at very high pressure...) and cooling, or why not both. Speaking of the core, we might as well repeat this same exercise using the values of $\rho$ and $g$ found, via Williamson-Adams, in the core, and experimental values for molten iron at high pressure561: $\rho = 10^4$ kg m$^{-3}$; $\alpha = 10^{-5}$ °C$^{-1}$; $g = 5$ m s$^{-2}$; $\eta = 10^{-3}$ Pa s; $\kappa = 7\times10^{-6}$ m$^2$ s$^{-1}$. Substitute, again, in (8.113) and

$$Ra_{\rm core}\approx\frac{10^4\times10^{-5}\times5}{10^{-3}\times7\times10^{-6}}\,\delta T\,d^3\approx10^8\,\delta T\,d^3.\qquad(8.115)$$
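The two estimates are easy to reproduce; a sketch of (8.114)–(8.115), using the values just listed:

```python
def ra_prefactor(rho, alpha, g, eta, kappa):
    """The factor multiplying dT * d^3 in the Rayleigh number (8.113)."""
    return rho * alpha * g / (eta * kappa)

# Substratum (8.114) and outer-core (8.115) values from the text:
mantle = ra_prefactor(rho=4e3, alpha=2e-5, g=10.0, eta=1e21, kappa=8e-7)
core = ra_prefactor(rho=1e4, alpha=1e-5, g=5.0, eta=1e-3, kappa=7e-6)
print(mantle)   # ~1e-15
print(core)     # ~1e8

# With dT ~ 1e2 C over d ~ 1e6 m, the substratum's Ra is already ~1e5:
print(mantle * 1e2 * (1e6)**3)
```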
Which it’s easy to see that, because viscosity is so low, the Rayleigh number of the core, or, I should say, of the fluid outer core, is huge, even if we were to start out with a thermal anomaly ($\delta T\,d^3$) much smaller than what I just suggested for the substratum. And so, chances are that the outer core is convecting very vigorously, as geophysicists like to say. There’s one particular aspect of all this Rayleigh-number story that I want to stress, because it’s got at least one important consequence: like I said, $\delta t$ is an estimate of the time it takes our rock parcel to acquire by conduction the temperature of the stuff that’s around it. If viscosity is low (and so the speed of convection, $v$, is large) then that means that conduction is not that important, because the parcel is transported somewhere else before it has time to exchange much heat by conduction. People like
to use the word “adiabatic” (remember Chap. 7) to describe this situation; in reality there is conduction in the substratum, and it becomes important relatively close to the surface where it causes temperature to vary quite rapidly with depth: but, it’s just so tremendously slow compared to convection (which is not that fast, either, but we’ll get to that later), that we can almost neglect it. So the big consequence that I was talking about is that it’s probably OK to use the adiabatic gradient that we’ve met in Chap. 7, Eq. (7.142), as an estimate of the temperature gradient in the substratum. The bottom line of all this is that already by the mid twentieth century, before people figured out what today is called plate tectonics—which I was careful not to mention much so far in this book—it was reasonable to think that there’d be convection in the earth’s mantle, or “substratum”, whatever people called it at the time. At the very end of his 1944 textbook, Principles of Physical Geology, Arthur Holmes surmised (Fig. 8.4) that if there’s convection below some depth in the earth, and if there’s some sort of friction between the convecting substratum and the rigid stuff that’s on top of it, then convective currents might drag continents and oceans around; ascending currents might even cause continents to break up, “with consequent [...] ocean floor development on the site of the gap”; and the push from developing “oceanic plates” (which but Holmes is not using the word “plate” yet) into continental ones explains mountain building: which often occurs at the “margin” of continents. Except for the
Fig. 8.4 Arthur Holmes’ “purely hypothetical mechanism for ‘engineering’ continental drift”, as described in his Principles of Physical Geology (1944). First (above), we have a single continental block sitting on top of an ascending current in the mantle. The current eventually (below) “drag[s] the two halves of the original continent apart, with consequent mountain building in front where the currents are descending, and ocean floor development on the site of the gap, where the currents are ascending”
idea that plates be “dragged” by currents in the substratum (we’ll get, later, to why that doesn’t quite work), everything else in Holmes’ diagrams and accompanying short piece anticipates quite accurately the conclusions of plate tectonics. Holmes is careful to qualify his “cartoon”562 of continental drift as “purely hypothetical”; and it’s likely that most of his colleagues in 1944 thought of it as nothing more than a funny speculation; or maybe just didn’t pay attention to it at all (after all, it’s the very last illustration in his book); but there are people, too, around this time, who connect with Holmes’ ideas, as they observe previously unnoticed stuff (besides all the evidence already collected by Wegener) that seems to suggest that Holmes might actually be right. “In the 1920s, a group of American scientists led by William Bowie had begun a program in cooperation with the U.S. Navy to measure gravity at sea. Bowie and John Hayford563 had demonstrated that isostasy applied over the continents, but did it also apply over the oceans? What was the structure of the crust under the ocean basins? What was the ocean floor made of?”, etc. “The world’s expert on the subject was a Dutch geodesist, Felix Vening Meinesz (1887–1966), who had invented a novel gravimeter [...]. In 1923, in a series of Dutch submarine expeditions to Indonesia, [...] he had discovered major gravity anomalies associated with the Java Trench.” (this is taken from Naomi Oreskes, from her introduction to Plate tectonics—An Insider’s History of the Modern Theory of the Earth564.) I think what is meant here is, gravity anomalies besides the anomaly that you already expect to see just because the ocean’s much deeper than average (in geology slang, a “trench” is a place where the ocean is very deep). 
So already there’s that mass deficiency, which reduces the gravitational force one might measure above the trench: and but in addition, it turns out that the gravity is even less than what you’d expect to find after taking seafloor depth into account. “In 1928,” says Oreskes, “Bowie invited Vening Meinesz to the United States, and a series of gravity expeditions followed,” and it was found, again, that wherever the ocean is very deep—not only at Java—you also get a negative gravity anomaly: Vening Meinesz’ observation was a global thing; it reflected a global phenomenon. “Vening Meinesz proposed that convection currents565 might be dragging the crust downward into the denser mantle below, explaining both the ocean deeps and the negative gravity anomalies associated with them”. In other words: in the absence of convection, we had the weight of, say, the crust, and the resistance of whatever is underneath it, and they had to balance one another—otherwise the crust would sink: as a result, mass excesses/deficits near the surface are compensated by mass deficits/excesses at depth, and if you measure gravity along the earth’s surface, to a good approximation it should be everywhere the same. Now, what Vening Meinesz is saying is that, at trenches, there’s systematically a gravity anomaly, and, to explain it, you need some additional term in the force balance. What exactly this term could be isn’t obvious—there’s the viscous drag from the convecting mantle, which, at the trench, becomes close to vertical (a horizontal drag wouldn’t enter in the isostasy force balance, which is along the vertical component); there’s the resistance of the crust to flexure, which happens as, again, convection tries to suck it into the downgoing current; there may be something else that I can’t think of right now—and is certainly difficult to quantify.
Fig. 8.5 Sketches of Griggs’ “analogue” experiment, as described in his 1939 paper. On the left, the case when both drums rotate at the same time; on the right, only one drum rotates. These are original drawings, similar to those in Griggs’ paper. (Used with permission of Yale University—American Journal of Science, from David Griggs, “A Theory of Mountain Building”, American Journal of Science, vol. 237, 1939; permission conveyed through Copyright Clearance Center, Inc.)
This is when people start to do convection models, to figure what might be going on in the earth’s mantle. But at this point in history, like, in the 1930s and ’40s and ’50s, computers practically don’t exist, or their use is not so widespread yet, and they aren’t really that powerful anyway, which means that, for the time being, there isn’t much that you can do with mathematical modeling alone. So people are also looking at so-called analogue experiments: for example David Griggs566 put “a mixture of heavy oil and sand” on top of a “substratum” made of “very viscous waterglass”: all this in a tank, with two rotating drums that set the substratum in motion, resulting in one or two “convection” cells. The “type of thrusts formed at the junction of two downcurrents” are shown in Fig. 8.5, where you can see both the “downfold” and the negative topography on top of it. Griggs also looks at what happens if you have one of the drums staying still while the other one is rotated. “When only one drum is rotated (corresponding to the development of a single convection cell), the effect on the crust is different—the crustal downfold formed is not so narrow, and is asymmetrical in that it is steeper on the side facing the current”, etc.: see the drawing on the right, Fig. 8.5 again. The quotes are from Griggs’ paper, “A Theory of Mountain-Building”567, where he describes his experiments, with photos and everything. So, the point here is that Griggs managed to reproduce in the lab, as a result of convection, a thickening of the crust, plus trench, similar to what Vening Meinesz and co. had observed in the field. Maybe trenches and the accompanying gravity anomaly could be explained by convection in the deep earth? Toward the end of his paper Griggs brings up another piece of evidence: “Visser, Leith and Sharpe, and Gutenberg and Richter all agree that the foci of deep earthquakes in the circum-Pacific region seem to lie on planes inclined 45° toward the continents. 
It might be possible that these quakes were caused by slipping along the convection-current surfaces. These flow surfaces would be expected to dip toward the continent on the hypothesis of Pacific up-currents.” Of the people cited by Griggs: Simon Willem Visser was a Dutch geophysicist, and I haven’t found much about him except he was at some point working as a seismologist (probably the seismologist?) at the Dutch Royal Magnetic and Meteorological Observatory in Batavia (now Jakarta): which is one of those places where you get
a lot of deep earthquakes. Gutenberg and Richter we’ve already met and they’ve contributed so much, it’s hard to find where they might have talked of deep quakes in particular. And but Andrew Leith and J. A. Sharpe published a paper in the Journal of Geology, in 1936, so a couple years before Griggs’ paper, called “Deep-Focus Earthquakes and their Geological Significance”. I don’t think I’ve used the word “focus” before, in relation to earthquakes, so here’s how Leith and Sharpe define it: “An earthquake is the rapid release of energy in the form of elastic waves. The region in which the process causing the energy release occurs is the ‘focal region,’ and the center of this region is the ‘focus.’ The region on the surface of the earth immediately above the focal region is the ‘epicentral region,’ and the point immediately above the focus, the ‘epicenter’ ”, etc. Leith and Sharpe explain that “owing largely to the very natural feeling that the material of the earth as a result of increasing temperature should become increasingly plastic with depth, seismologists and geologists have believed until comparatively recently that all earthquake foci are confined to the crust. The scarcity of seismic observations indicating the contrary, and [remember the bit on Barrell and the asthenosphere, in Chap. 5] the demand of many of the interpreters of surface structures and of the observations of gravitational anomalies for an asthenosphere, or zone of weakness, within which elastic stresses could not be maintained, combined to make this belief very strong.” Apparently, that changed with “the publication of a series of studies by Wadati568 beginning in 1928. Wadati found near Japan a class of earthquakes characterized by numerous phenomena which could be accounted for only by the assumption that these earthquakes had abnormally deep foci. 
Among the observations that demanded this were [i] the anomalously widespread areas over which the shaking was sensible (although never destructive), [ii] an anomalously high apparent surface velocity of the earthquake waves, and [iii] a large interval between the times of arrival of the P and S phases at the epicenter. [iv] From this interval the depth of focus could be computed.” So, here’s my explanation of what Leith and Sharpe mean, point by point: [i] if the earthquake focus (the place, at some depth, where the earth actually ruptures) is shallow, you’re going to feel most of it near the epicenter (the place, on the earth’s surface, right above the focus), and the effects will decay at a certain rate with increasing distance from the epicenter. But if the focus is deep, even if you are right at the epicenter you are at some important distance from the focus; and your distance from the focus grows relatively slowly as you move away from the epicenter, along the earth’s surface; so in that case the effects of the quake will be more uniform over the area where it’s felt, and if the quake is deep but big, that area must be very large—yet, you’re less likely to have destructive effects even at the epicenter. [ii] Take a straight line, or rather a great circle, on the earth’s surface, say a meridian, that goes through the epicenter of a deep quake; imagine you install a bunch of seismometers along that line. The apparent velocity (see Sect. 6.12) is what you measure if you forget that the focus is deep, and just look at how much time passes between the moment the seismic wave hits the first instrument and the second, the third, etc. If the focus were shallow, like for most quakes (Leith and Sharpe estimate about
8.7 Rayleigh Number
90% of all quakes), you wouldn’t see much difference between true and apparent seismic velocity. But if the focus is deep, think about the wavefront coming from below upwards: it might reach many of those instruments at about the same time, so apparent velocity is going to be high; certainly higher than true velocity. And the deeper the focus, the smaller those arrival-time differences, the higher the apparent velocity. [iii] The time between the arrival of P and S grows with the distance covered by both waves; so, at a fixed epicentral distance, it must grow with growing depth of the focus. [iv] The way you locate the epicenter of a quake is, you measure the delay between P and S arrivals, at least at three stations that recorded the quake: you compare that to the curve showing how that delay grows with epicentral distance—a curve determined empirically from lots of data taken from lots of quakes, mostly shallow, for which the epicenter has been determined already—so that you get epicentral distances for those three stations; then you use a map and a drawing compass to find the one point that lies at those distances from those three stations. If a quake is very deep, you can play the same trick, but in three dimensions (and if you neglect the growth of seismic velocities with depth, it’s OK to keep using the same empirical curve—there’ll be an error, but not huge, hopefully). Based on all this, Wadati put together a database of epicenters and focal depths of Japanese quakes, made a map with all the epicenters he’d found, and noticed that their focal depths would increase roughly east to west and south to north: see in Fig. 8.6 the plot he came up with in his 1935 paper569.
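Points [ii] and [iii] can be made concrete with a little arithmetic. The sketch below assumes a homogeneous earth with straight ray paths and illustrative (not measured) P and S velocities—far cruder, of course, than what Wadati or Leith and Sharpe actually did:

```python
import math

# Illustrative, homogeneous-earth velocities in km/s; real velocities vary with depth.
VP, VS = 8.0, 4.5

def sp_delay(distance_km):
    """S-minus-P arrival-time difference (seconds) over a straight ray of given length."""
    return distance_km / VS - distance_km / VP

def apparent_velocity(x_km, depth_km, v=VP):
    """Apparent velocity (km/s) measured between two nearby surface stations at
    epicentral distances x and x+1 km, for a focus at the given depth."""
    dx = 1.0
    t1 = math.hypot(x_km, depth_km) / v          # travel time to the first station
    t2 = math.hypot(x_km + dx, depth_km) / v     # travel time to the next one
    return dx / (t2 - t1)

# [iii] the S-P interval at the epicenter grows with focal depth:
print(sp_delay(30))     # shallow focus, 30 km below the epicenter
print(sp_delay(500))    # deep focus, 500 km below

# [ii] for a deep focus, apparent velocity far exceeds the true 8 km/s:
print(apparent_velocity(100, depth_km=500))
```

For a 500-km-deep focus, stations 100 km from the epicenter see the wavefront sweep by at several times the true wave speed, exactly the anomaly Wadati noticed.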
Leith and Sharpe show a similar map of deep quakes in Japan, updated with more data I guess, but they also show that sort of the same thing happens under South America, where “the earthquakes with depths exceeding 200 km lie, with two exceptions, inland. There appear to be two lines of earthquakes of intermediate depth, trailing off northwestward and southwestward from the oval zone enclosing the deeper earthquakes. As in the Japanese region, the average depth appears to increase with the distance inland, and the number of shallow earthquakes to decrease”, and there’s a map of South America with quake epicenters represented by symbols that change depending on focal depth: similar to Wadati’s map of Japan. A few years and one world war later, there’s a paper by Hugo Benioff570, which has more data571 and a clearer picture of how the foci of deep quakes are distributed. What Benioff shows, that Wadati and Leith and Sharpe hadn’t shown, is where the foci stand in several vertical sections, e.g. under South America: look at Fig. 8.7. Essentially, in Benioff’s words, “the great fault intersects the earth’s surface along the curved line of oceanic deeps”, which means that foci are aligned along one big “fault”, that starts at the trench (“oceanic deep” is synonymous with “trench”) and then “dips” into the earth’s mantle down to 700 km or so. The angle that the fault makes with the earth’s surface is small initially, but then at larger depth grows to about 45◦ or so572. Later, Benioff’s “great fault” will come to be called the “Wadati-Benioff zone”. In his paper, Benioff looks mostly at South America, which Leith and Sharpe already did, though they didn’t have so many foci to plot, but also at Tonga-Kermadec573. Towards the end he adds: “A preliminary examination of a number of other deep-focus earthquake sequences which are associated with oceanic deeps
Fig. 8.6 Wadati put all epicenters he had on a map, noting the focal depth of each: here, deep quakes are shown as black circles, shallow ones as white circles, and quakes whose focal depth was unclear as white squares. He then extrapolated lines of approximately constant focal depth: which are the contour lines you see in this plot. The focal depth associated with each line is written near it. The pattern is surprisingly simple: focal depth grows from Japan toward Kamchatka in the north (and extreme south-west) of the Japanese archipelago, and from the Pacific Ocean toward Korea in the south/center. (Used with permission of John Wiley and Sons, from Cliff Frohlich, “Kiyoo Wadati and Early Research on Deep Focus Earthquakes”, Journal of Geophysical Research, vol. 92, 1987; permission conveyed through Copyright Clearance Center, Inc.)
indicates that they may all be generated on great faults similar to those described in this paper. Among these may be mentioned the Aleutian sequence, another one which extends from Southern Japan to Kamchatka, several sequences in the East Indies [meaning Indonesia], the Central American sequences, and the West Indies [meaning the Caribbean] sequences.” Unless I missed it, though, Benioff doesn’t mention continental drift or convection at all; he speaks of a “downwarping” of “oceanic blocks”, etc., but there’s no reference to, like, a large-scale displacement of those blocks. Reading all this in the twenty-first century you might see, already, that the contributions of Holmes, of Griggs, of Wadati and Benioff, and of Wegener of course,
Fig. 8.7 From Benioff, 1949. The map on the left (Fig. 9 in Benioff’s paper) is like that of Wadati in Fig. 8.6, except there are no contour lines. The “vertical sections” on the right have horizontal distance on the x-axis and depth on the y-axis. Horizontal distance is defined as distance from one of the three segments marked A, B and C on the map. Each section includes all the foci that lie within one of three distinct areas, A, B and C, each associated with one of those segments. (Used with permission of Geological Society of America, from Benioff, H., “Seismic Evidence for the Fault Origin of Oceanic Deeps”, The Geological Society of America Bulletin, vol. 60, 1949; permission conveyed through Copyright Clearance Center, Inc.)
all point in the same direction—continental drift controlled by convective flow. But, back when those books and papers were published—at the same time as a lot of other studies expressing different views—things weren’t so clear. People would start accepting the idea that oceans and continents move around only much later—about two decades later—and after a whole lot of different observations, coming from different, uhm, “disciplines”...
8.8 The Earth’s Magnetic Field

Probably the most important piece of evidence came from studies of the earth’s magnetic field. Which, admittedly, it’s crazy that I got this far into this book without ever talking about it: and I am going to have to take care of it now. To get started, think of a lodestone, or a bar magnet. If you’ve ever seen a bar magnet (like for example the one in Fig. 8.8), you know that it’s got two poles; if you place another magnet near it, they’ll attract or repel one another depending on how their poles are positioned with respect to one another. To fix ideas, actually, consider the case where one of the bar magnets is much bigger (i.e., has much more mass) than the other, so that then only the smaller one will be moved around by the attraction574; the bigger one just stays still, so you don’t have to worry about it. In practice, the combination of the two poles of the big magnet acting on the two poles of the smaller one adds up to a torque, and the small magnet in general will tend to rotate about an axis (which goes through it at the midpoint between its poles). Imagine you position the small magnet in lots of different places around the bigger one (Fig. 8.8), and at each location you draw the direction along which it eventually aligns: you’ll end up with a diagram like in Fig. 8.9. Today, this and all the other basic empirical facts of electromagnetism are summarized in four experimental laws called Maxwell’s equations575. Now, the thing about the earth is, the earth has a magnetic field that also looks like the one in Fig. 8.9: it’s as if the earth was one giant lodestone, basically, oriented like its rotation axis. This has been known at least since William Gilbert, who wrote in his book, De Magnete, that “the whole earth is a big magnet”.
The reason he was able to make this statement was that, beginning about a century before his time (De Magnete first came out in 1600), people had been using compasses—read: small bar magnets, kept fixed at some location on the earth’s surface, but allowed to rotate about their vertical axis, on the horizontal plane—to orient themselves,
Fig. 8.8 A bar magnet. Its two poles are labeled S and N . Smaller bar magnets (compasses) are torqued by its magnetic field and align with it as a result: notice that there’s a clear pattern in their apparent disorder. I got this image from Flickr: it was posted there on January 9, 2008, by Dayna N. Mason. She calls it “bar magnet on a compass array”. She was kind enough to give me permission to reuse it
Fig. 8.9 The magnetic field of a dipole: lines of force. The pattern is the same as in Fig. 8.8 but shown over a broader area. You see the field’s section on a flat surface, but of course the field is 3-D, and symmetric about the N -S axis of the dipole
most of all in navigation: and by the time he wrote his book, Gilbert had data from many places around the world, showing the result of the small versus big bar magnet experiment, applied to the earth576. Post-Gilbert, the “dipole” (two-pole) pattern of the magnetic field would be confirmed by lots of measurements of both its declination—its angle with respect to the earth’s “true north”, i.e. the place where the earth’s rotation axis intersects the surface—and its inclination—its angle with the horizontal. On top of the macroscopic dipole thing, though, people started noticing some smaller perturbations that change with time. Today, that’s called the secular variation of the field. We’ll get back to that. After looking at all those data, the question naturally arises: what makes the earth’s magnetic field? Gilbert’s idea, actually, can be ruled out: the earth’s magnetic field does look like that of a magnetic dipole, yes, but it cannot be created by a “giant magnet”. The reason we’re sure of that is that, like Frank Press and Raymond Siever explain in their famous textbook577, “heat destroys magnetism, and magnetic materials lose their permanent magnetism when the temperature exceeds a certain value called the Curie point (after Pierre Curie). For most magnetic materials, the Curie point is about 500◦ C, a temperature exceeded below depths of about 20 or 30 kilometers in the Earth [as we know from chapter 4]. In other words, the Earth cannot be permanently magnetized below this depth. So the notion that there is something like a bar magnet near the center of the Earth is eliminated, even though it represents the field nicely.” Maybe you’re not convinced, you’re saying, yes, when depth and pressure grow temperature does go up, too, but what if the Curie point goes up as well? and what if the Curie point goes up faster than the rocks’ actual temperature?
if that were the case, in principle, we could still have permanently magnetized stuff in the deep earth. That doesn’t seem very likely, though: because in fact people have studied in the lab how the Curie point changes with changing pressure: and as far as we know the Curie point does not rise when pressure rises. On the other hand, Gilbert was probably right that the field does come from something that’s inside the earth. Carl Friedrich Gauss figured out a way to verify this, looking at observations of the earth’s magnetic field made on the earth’s surface578. He applied his method to whatever data were available to him at the time and figured that pretty much everything had to come from the inside; and then later people could do the same thing with more and more, and better, data, coming to the same conclusion. “Another way to make magnetic fields”, say Press and Siever, “is with electric currents”. Walter M. Elsasser579 explains how that could happen in the earth in his 1958 Scientific American580 article, with a very simple model that Press and Siever also show in their book. It goes like this: there’s Ohm’s law581, which says (among other things) that whenever some electrically conductive material that’s immersed in a magnetic field moves, electric current flows through the material. In Elsasser’s cartoon there’s a disk of conductive material that rotates around an axis, in a magnetic field perpendicular to it, and so there will be current in the disk as a result of the disk’s motion—if some force keeps spinning the disk, current will continue to be generated in it. If you close the loop with some electric wire, like in Fig. 8.10, current will flow through the wire. Then besides Ohm’s law, there’s also Ampère’s law, which is the last of Maxwell’s equations582, and says that an electric current flowing in a closed loop generates a magnetic field perpendicular to the plane of the loop.
So if you wind the wire so that it forms a coil, you’ll get an additional magnetic field, with the same direction and (if the wire is wound right) sign as the field that started the whole thing. And Elsasser’s point is that even if the initial field dies away—a magnet can’t be magnetized forever, etc.—the system, once started, might in principle continue to function, sort of feeding on itself. Provided, of course, that some external force continues to keep the disk in motion. “No-one believes”, write for example Press and Siever, “that a rotating disk like the one [in Fig. 8.10] actually exists in the core; what is proposed is that the fluid iron [because remember from Chap. 7 that we have good reasons to think the outer core is made mostly of liquid iron] is stirred into convective motion by heat generated from residual radioactivity in the core. A small, stray magnetic field would be enough to produce electric currents, which would then create their own magnetic field, starting up a self-exciting dynamo.” We’ve seen that the outer core’s Rayleigh number is probably very high; so you just need to trigger some initial heat anomaly, not even very large, and convection will start. It’s not that hard to think of something that could do that. One option is radioactivity in the core—not clear how much of that we should expect, but it is a possibility. The inner core must be cooling, like the rest of the planet, which is the same as saying that some heat must be continuously conducted from there into the outer core. So, even without radioactivity, the outer core could be heated from below. It is also likely that as the core—both inner and outer—loses heat and gets colder, a
Fig. 8.10 An electrically conductive disk rotates in a magnetic field perpendicular to its surface. On the left, the magnetic field is generated by a bar magnet parallel to the disk’s rotation axis. By Ohm’s law, the magnetic field combined with the rotation of the disk causes electric current to flow through the disk, and through a wire connected to the disk. On the right, there’s no magnet anymore, but the wire is coiled around where the magnet used to be: by Ampère’s law, the current flowing through a wire coiled in that way generates a magnetic field that’s similar to the one on the left. Once you get it started with some initial current and/or initial magnetic field, the system on the right can keep the magnetic field in place, for as long as the disk continues to rotate: mechanical energy spent to turn the disk around is converted into electromagnetic energy. (And yes: a mechanism like this is what we call a dynamo.) (After Elsasser, “The Earth as a Dynamo”, Scientific American, vol. 198, 1958.)
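Incidentally, the feedback loop of Fig. 8.10 is often caricatured with the equation of the so-called Bullard disk dynamo: the current I in the coil obeys L dI/dt = (MΩ − R)I, where Ω is the disk’s spin rate, R the circuit’s resistance, and M and L coupling and self-inductance constants. The toy integration below (all parameter values are arbitrary, chosen only to expose the threshold behavior; this is not a model of the real core) shows the key property: spin the disk fast enough and a tiny seed current grows instead of dying away.

```python
# Toy "disk dynamo" (after Bullard): L dI/dt = (M*Omega - R) * I.
# All numbers below are arbitrary illustrative values, not core properties.
def run_dynamo(omega, L=1.0, M=1.0, R=1.0, i0=1e-3, dt=1e-3, steps=20000):
    i = i0  # a tiny "seed" current, standing in for a stray initial field
    for _ in range(steps):
        i += dt * (M * omega - R) * i / L  # forward-Euler time step
    return i

print(run_dynamo(omega=0.5))  # spun too slowly: the seed current dies away
print(run_dynamo(omega=2.0))  # spun fast enough: the current feeds on itself
```

The threshold (here MΩ > R) is the caricature of Elsasser’s point: the dynamo keeps itself going only as long as some external power keeps the disk turning fast enough.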
larger portion of it will become solid, i.e. the solid inner core grows at the expense of the liquid outer shell: this gives us two possible sources of buoyancy (i.e., triggers of convection), because, as explained by Gary Glatzmaier and Peter Olson583 , “as liquid iron solidifies into crystals onto the outside of the solid inner core, latent heat584 is released as a by-product. This heat contributes to thermal buoyancy. In addition, less dense chemical compounds, such as iron sulfide and iron oxide, are excluded from the inner core crystals and rise through the outer core, also enhancing convection.” There’s also at least another argument in favor of convection in the outer core, and here is how Elsasser explains it in that Scientific American paper: “The nature of the earth’s magnetic field itself is the strongest argument for the dynamo theory. For many years geophysicists have known that the magnetic field is irregular and constantly changing [its secular variation]. The compass needle does not point exactly to the north, and it deviates from true north by different amounts in various parts of the world. There are many eddies, as it were, in the field. They are eddies in a literal sense, because they change as time goes on. Seamen and fliers who use magnetic charts for navigation know only too well that the maps must be brought up to date every few years.” And “this [...] represents overwhelming evidence that the core of the earth is in motion; the variations and changes in the field must reflect these motions.” Elsasser is simplifying things a bit, here: I don’t think it’s entirely obvious that secular “changes in the field must reflect” flow of matter in the outer core. This can be demonstrated, though, if you make the assumption that the magnetic field lines move together with
the conducting fluid that generates them (people like to call this the “frozen-flux” approximation585): then, as convection stirs the outer core, the pattern of the magnetic field that we can observe at the earth’s surface is stirred at more or less the same rate. (Incidentally, it follows that “from the measured changes in the field we can compute the speed of [convective motions in the outer core]. It turns out that matter in the core is moving at the rate of about a hundredth of an inch per second,” says Elsasser, meaning a quarter of a millimeter per second, or about 10 km/year. Which is much faster than continental drift/plate tectonics, and/or convective flow in the mantle. But I shouldn’t be getting ahead of myself.) Bottom line, it really does look like there should be convection in the earth’s core. Like the spinning disk in Fig. 8.10, electrically conductive liquid iron in the outer core moves around, and, by Ohm’s law, that must get some electric current going through the iron itself. And by Ampère’s law, electric current can make a magnetic field. In the earth’s core there’s nothing like the electric wire of Fig. 8.10, though. Electric current flows through the convecting iron, but there’s no way to tell which way it flows. But so then, how come the earth’s magnetic field is so close to the field of a simple dipole? i.e., that magnetic field that we get when electric current flows in one closed loop? I have looked at a whole bunch of geophysics books that try to give some “hand-waving” answer to this, and, I might be wrong, but my feeling is that there is no simple explanation. The one thing everybody agrees upon is that the thing that’s most likely responsible for lining up convection cells, so as to create a net magnetic field, must be the earth’s rotation, because the axis of the earth’s magnetic field is aligned with the rotation axis586.
Everybody also agrees that, technically, the way the earth’s rotation controls the direction of the dipole is through an effect that is called the Coriolis force587. To see what that is, think of a parcel of matter moving with respect to a rotating body: if you look at it from within a reference frame that rotates with that body, it will look to you as if its velocity is changing: as if it accelerates; and the acceleration you see is proportional to its speed relative to the body. If you do a force balance within the rotating frame, you’ll have to factor this acceleration in—even though it’s there only because your reference frame is moving: that’s why it’s called an apparent force. That is the Coriolis force. If rotation is counterclockwise—which is the case of the earth if you look at it from above its north pole—the Coriolis force curves the trajectories of stuff moving in the northern hemisphere always to the right; while in the southern hemisphere it curves them to the left588. That means that chunks of molten iron which, deep in the outer core, become hot and buoyant and start to rise up, are all deflected in the same way by the Coriolis force. So, loops of convecting material—convection cells—all circulate in the same sense. The fact that the total magnetic field of the earth is nonzero, and actually oriented like the rotation axis, is likely to be a consequence of this. Now, there’s only so much we can say about the earth’s magnetic field by speculation, thinking about Maxwell’s equations and rotation and all that and trying to be smart. People have tried (and are still trying, as far as I know) to find out more about it in two ways: with laboratory experiments and with numerical models. We’ll get to numerical models in a minute. In the paper we’ve been looking at, Elsasser describes (leaving out all the practical difficulties of putting it together) a lab-made
dynamo that’s supposed to resemble that of the earth. Essentially, he says, it’s a large bowl (like, a couple meters in diameter) filled with mercury. You can, e.g., heat the mercury up; thermal anomalies cause displacements of mass, which in turn give rise to electrical currents through the mercury, which you can measure. Elsasser mentions observations of the “decay time”: the time it takes those currents to die out once you stop the heating and the convection. If you repeat the experiment with bigger and bigger bowls, the decay time turns out to grow very quickly as the bowls get bigger: in fact, it turns out to be proportional to the squared diameter of the bowl. One can extrapolate to the size of the core; “Calculations show”, writes Elsasser, “that an electric current circulating in the earth’s core would require about 10,000 years to decay. Ten thousand years would be ample time for this current and its associated magnetic field to be ‘pushed around’ considerably by motions in the fluid.” To understand what Elsasser means by “pushing around”, remember the frozen-flux approximation. Magnetic-field lines are stuck to, frozen into, the conducting fluid, so when the fluid is moved by convection, it sort of carries the field lines with itself. “We can think of the field lines as so many ribbons which are pulled and stretched by the motion of particles in the fluid”, says Elsasser. If the speed of particle motion changes along a ribbon, the part of the ribbon where particles are moving faster will be pulled farther than that where they are moving slower, and the ribbon will be stretched. “In the process of stretching [ribbons] gain energy—energy which is imparted by the motion of the particles. This basic process of the conversion of mechanical energy into electrical and magnetic energy—not essentially different from the operation of a dynamo—can be shown mathematically to account satisfactorily for the electric currents and the magnetism of the earth’s core”.
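Elsasser’s scaling argument—decay time proportional to the square of the conductor’s size—is easy to turn around: take his ~10,000-year figure for the core and shrink it down to lab scale. (This back-of-the-envelope version ignores, among other things, the different conductivities of mercury and liquid iron, so treat the numbers as order-of-magnitude only.)

```python
# Decay time scales with the square of the conductor's size (Elsasser).
# Starting from his ~10,000-year estimate for the core and scaling down:
CORE_RADIUS_M = 3.5e6       # rough size of the core, in meters
CORE_DECAY_YEARS = 1.0e4    # Elsasser's decay-time estimate for the core

def decay_time_years(size_m):
    """Decay time for a conducting body of the given size, via the L**2 scaling."""
    return CORE_DECAY_YEARS * (size_m / CORE_RADIUS_M) ** 2

# A lab bowl a couple of meters across, so roughly 1 m in "radius":
bowl_years = decay_time_years(1.0)
print(bowl_years * 365.25 * 24 * 3600)  # in seconds: a small fraction of a second
```

Which is why the lab currents die out almost instantly while the core’s can persist for millennia: the factor between the two is the square of a factor of a few million in size.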
As for numerical models of the earth’s dynamo, those are even more complex than, e.g., models of mantle convection, because we’ve seen that the dynamo is the product of both convection and electromagnetism: meaning, you’ve got to solve both the Navier-Stokes equation—which governs the flow of mass in the core—and the induction equation—controlling the magnetic field itself. (And, of course, Navier-Stokes still brings with it all the complications—all the additional equations—that we’ve met earlier: energy conservation, gravity and possibly self-gravitation, the constitutive relation—which requires having a decent estimate for the value of viscosity in the core...). And you can’t solve the two problems separately, because they are coupled and feed back into one another589. The first, or in any case most famous early paper showing realistic numerical models of flow in the outer core, and the magnetic field that that flow generates, is that of, again, Glatzmaier, with Paul Roberts, published in Nature in 1995. Glatzmaier and Roberts’ simulation was, at the time, impressive for its “computational cost”. They reported that “the simulation required over 2,000,000 computational time steps that, over a period of more than a year, took more than 2,000 CPU hours on a Cray C-90.” According to a contemporary brochure from Cray, in the early nineties the C90 was “the most powerful supercomputer available for production applications, [...] a major milestone of technology leadership”; so, being able to run their software on a machine way more powerful than anything that had been available to them before is probably what made the difference, and made Glatzmaier and Roberts the first
researchers to come up with a realistic simulation of the geodynamo. One important feature of which was, the dynamo would just keep going, which is one of the main things we require of it, to fit what we know about the earth. Glatzmaier and Roberts wrote it like this: “Our present solution, with a finitely conducting inner core, spans > 40,000 years, [...] with no indication that it will decay away, which is suggestive evidence that our solution is a self-sustaining convective dynamo. The solution begins with random small-scale temperature perturbations and a seed magnetic field. After an initial period of adjustment (∼10,000 years) during which the dipole part of the field gradually becomes dominant, our time-dependent solution maintains its dipole polarity until near the end of the simulation, when it reverses in little more than 1,000 years and then maintains the new dipole polarity for roughly the remaining 4,000 years of the simulation.” Admittedly, 40,000 years is kind of short for geologic time, but in the years that followed new simulations would confirm their inference, that the dynamo could go on pretty much forever. More importantly, though, notice what Glatzmaier and Roberts say toward the end of the quote: that, within those 40,000 years, the polarity of the dipole reverses. That’s actually what made Glatzmaier and Roberts publish their paper, even though, like I said, 40,000 is not that many years: but having simulated “a reversal of the dipole polarity” was important enough that they “decided to report on our results now instead of waiting for our simulation to span a much longer period of time.” A polarity reversal is when an approximately dipolar magnetic field maintains its general pattern (i.e., it continues to be approximately dipolar, and the axis stays the same), but the north and south pole swap places. 
At the time when Glatzmaier and Roberts ran their simulation, it had already been inferred from the magnetization of rocks that the earth’s magnetic field had gone through many polarity reversals in the past millions and hundreds of millions of years—you’ll learn everything about this in the coming pages. But it seemed, to say the least, difficult to explain polarity reversals based on the theory—and so, when Glatzmaier and Roberts got one, it was a big deal. Glatzmaier and Roberts’ model was the first to incorporate all the main properties of the earth’s field, including the weirdest one. And that was a strong indication that our current ideas of the earth’s magnetic dynamo are probably, to a large extent, correct.
8.9 Magnetization of Rocks

Now, the geodynamo and the earth’s core are interesting topics per se, but the reason I had brought them up just now is that, in a sense, when we look at the earth’s magnetic field we are also looking at the movement of continents and oceans, which is supposed to be the main topic of this chapter. To understand what I mean by that, we need some more physics: we need to know what happens when a material that has a potential for being magnetized is immersed in a magnetic field.
Fig. 8.11 Simplified model of an atom: electrons circulate around the nucleus; the thick solid arrow shows the direction of the atom’s own magnetic field
The first thing to keep in mind is, magnetic fields show up whenever electrical charges move around. But then, there are also permanent magnets: which, clearly, can’t be made by moving charges, because for charged particles to move, energy is needed—but permanent magnets work without a power supply. To understand how permanent magnets work, start with the structure of the atom, Fig. 8.11: a nucleus made of protons and neutrons, and electrons orbiting around the nucleus. Electrons are charged particles, and they are moving, so each atom carries its own small magnetic field, perpendicular to the electrons’ orbits. Now, put together a bunch of atoms (Fig. 8.12): their magnetic fields’ directions are random, and so when you sum them up, they more or less cancel each other out: so, in principle, any given chunk of matter will generate no significant magnetic field. There exist some special materials, though, called ferromagnetic materials (iron, nickel, cobalt, and their alloys), whose atoms, for a number of reasons that, sorry, I am not going to get into (the main ingredients are: chemistry, quantum mechanics, and thermodynamics: look this up in a good physics book), tend to align over relatively long distances: like, one millimeter or so. When I say “align” I mean that their magnetic fields all point approximately in the same direction, and don’t cancel out. A volume over which atoms line up is called a domain. Under normal conditions, domains are not aligned with one another (Fig. 8.13), and so, again, even a random piece of iron or nickel doesn’t have a net magnetic field. But unlike individual atoms, domains have the property that, if you immerse your piece of ferromagnetic material in a magnetic field, they rotate, to align with the direction of the magnetic field, as in Fig. 8.14. A sufficiently strong magnetic field, or one that lasts long enough, can cause all domains to approximately line up: Fig. 8.15.
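The cancellation-versus-alignment argument is easy to check numerically: sum up a lot of randomly oriented unit moments and the net result is of order √n, i.e., a vanishing fraction of the n you’d get if they all pointed the same way. A minimal sketch (moments reduced to 2-D unit vectors, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # number of "domains", each contributing a unit magnetic moment

# Randomly oriented domains (as in Fig. 8.13): the moments nearly cancel.
angles = rng.uniform(0.0, 2.0 * np.pi, n)
random_net = np.hypot(np.cos(angles).sum(), np.sin(angles).sum()) / n

# Domains lined up by an external field (as in Fig. 8.15): the moments add.
aligned_net = 1.0  # every moment points the same way, so the net is n/n

print(random_net)   # close to zero (of order 1/sqrt(n))
print(aligned_net)
```

So a random piece of iron is magnetically quiet not because its atoms carry no moments, but because statistics makes their sum negligible.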
Now, the thing is, when you remove the (external) magnetic field, domains have no reason to go back to their initial orientation: they’ll stay like they are, and as a result their individual magnetic fields will add up constructively, giving rise to a net, permanent magnetic field. The new field is parallel to the one that initially lined up the domains—it “remembers” its orientation, even though now that field is gone. People gave the name hysteresis to this “memory effect”—from a Greek stem that means, essentially, “delay”. (Incidentally, hysteresis is how, e.g., a computer hard drive works. The disk’s “platter” is covered with a thin film of ferromagnetic material; storing one “bit” of information on the
8 The Forces that Shape the Earth: Convection and Plates
Fig. 8.12 Randomly aligned atoms yield no significant net magnetic field
Fig. 8.13 Magnetic domains are randomly aligned, so there is still no net magnetic field
Fig. 8.14 An external magnetic field (dashed arrows) causes the domains to start to rotate
disk means, in practice, that the disk’s magnetic “head” goes to one location along the platter, and “magnetizes” it one way or the other—call it north/south, or 0/1: the two digits of binary arithmetic.) Actually, hysteresis is not so relevant to us here, because that’s not exactly how rocks get magnetized. For one, rocks are always immersed in the earth’s magnetic field—you can’t just “remove” the earth’s magnetic field; secondly, the field at the
8.9 Magnetization of Rocks
Fig. 8.15 A strong and/or long-lasting external field causes the domains to line up in the same direction
earth’s surface is actually pretty weak—it wouldn’t be strong enough to turn those “domains” around. That is, unless you do something to the rocks. One thing you can do is heat a rock up. In De magnete, which I’ve already cited, William Gilbert describes an experiment where a steel rod is heated until it’s red hot, and then quickly cooled, like by dumping it in cold water, and then what happens is that the steel is magnetized—it has become a magnet. Today we understand that when temperature is very high—above the Curie point—the forces that maintain the molecular structure of the steel—that keep the “domains” in place, etc.—become weaker, and the earth’s magnetic field is enough to line up the domains. The steel then cools down real quick; the domains, etc., have no time to lose their alignment, and they are frozen as they are: the steel will stay magnetized. Now think of a lava that contains some magnetic materials. When it’s spit out from the interior of the earth, the lava is super hot, and the metals that it contains will be just like the steel in Gilbert’s experiment. At that moment, the magnetic domains align themselves with the earth’s magnetic field. As soon as it’s erupted, though, the lava cools down to become igneous rock, and the cooling is so quick that there’s no time to lose the alignment: so the rock becomes a permanent record of the orientation of the magnetic field at the time and place when the rock became solid. People like to call this process “thermoremanent magnetization.” Another way that the earth’s magnetic field manages to magnetize rocks is when sediments are being deposited. Think of the tiny grains that accumulate along the shore: how each grain is positioned as they pile up on top of one another is pretty much random; but grains that contain magnetic minerals are all pushed the same way by the earth’s magnetic field, and, on average, their magnetic domains will tend to align with it—a small, but not necessarily negligible effect. 
That’s called “detrital remanent magnetism.” I should probably also mention that the word “paleomagnetism” refers to any kind of remanent magnetism—magnetization “resulting from the orientation of the Earth’s magnetic field at the time of rock formation in a past geological age” (Britannica). Thermoremanent magnetization is by far the easiest to observe. It was Patrick Blackett and Keith Runcorn who made the connection between all this and the idea of continental drift. In the early 1950s, when the next chapter of our story unfolds, Blackett was a physics prof. at the University of Manchester, and in 1948 he had received a Nobel prize for research that doesn’t have much to do with this book; he also had an interest in the earth’s magnetic field, though: during
the 1940s he had worked on a theory alternative to the geodynamo, to explain the origin of the field—but his theory never fit the data, and he eventually dropped it, which is why I won’t talk about it in this book. Keith Runcorn was Blackett’s grad student in Manchester, and wrote a thesis revolving around Blackett’s magnetic field ideas. He graduated in 1949 and moved to Cambridge to pursue his own career. In 1951, both Blackett and Runcorn were physicists, mainly, trying to learn some geology so they could figure out how the earth’s magnetic field works. In Runcorn’s words590, “in 1951 Blackett visited me for a few days [in Cambridge] to concert plans and, as neither of us had any training in geology, I took him to Sedgwick museum where we were jocularly greeted by Professor W. B. R. King, who said, ‘Have you come to learn all about geology this morning?’ Well Blackett had. It was one of his remarkable characteristics that he was able, by vigorous questioning of a chosen expert (who often felt less of an expert afterwards) and by a judicious choice of literature (in this case Arthur Holmes’ Physical Geology) to get up a new subject with remarkable speed and effectiveness.” One important thing that Blackett learned from geologists is that the magnetic field of the past is recorded in the remanent magnetism of rocks. After reading very carefully Arthur Holmes’ book, he realized that this could be used to test the idea of continental drift. (Of which, according to Runcorn, “Blackett was, from the beginning, entirely convinced.”) To fix ideas, take for example the Deccan “large igneous province”. A large igneous province is a place where a lot of volcanic rock has been erupted. 
Blackett got samples of basalt from the Deccan, looked at their remanent magnetism, and found that it does not agree with the current direction of the magnetic field: instead, it changes as a function of the age of the rock: younger basalts are aligned with today’s field; older basalts aren’t. For Blackett, that was evidence of continental drift. Because, if India had been at a different latitude in the past, the remanent magnetism of volcanic rocks that had solidified back then would be aligned with the magnetic field that we have at that latitude: and not at the latitude where India is today: see the drawing in Fig. 8.16. This assumes, of course, that, while continents “drift”, the magnetic field doesn’t. Runcorn wasn’t a fan of continental drift, initially, or not as much as Blackett was, and took the totally opposite view: that India had not moved, but that, instead, the orientation of the earth’s magnetic field had changed. He figured that that could be explained by a displacement of the axis of the magnetic field, which is the same as saying, a displacement of the poles of the magnetic field—a “polar wander”. And if you have measurements of the field’s orientation over time, at a given place, you can use them to reconstruct the trajectory of polar wander. Of course, this is something you can do with samples from India, but also from any other place on the surface of the earth where you can find volcanic rocks of different ages. It goes without saying that, no matter where you take the data from, if Runcorn’s idea is right you should always get the same trajectory for the pole. There’s a whole lot of papers published (mostly by Runcorn, and people working with him) in the mid 1950s that try to test these ideas. This is how Cox and Doell591 sum it up: “Since the average magnetic dipole and rotational axes have been parallel in
Fig. 8.16 The magnetic field generated by the remanent magnetization of Deccan basalt (thick black arrow), which Blackett measured, points in a different direction than the earth’s magnetic field measured today at the location where the basalt is picked up (dot-dashed lines). Blackett thought that that doesn’t make sense, unless you allow for India to have “drifted”, from the southern into the northern hemisphere
late Pleistocene through Recent time, [...] the principle of uniformitarianism suggests polar wandering and continental drift as possible interpretations for paleomagnetic results that do not agree with the present field configuration”, i.e., like I was saying, if e.g. in Deccan basalt you measure a remanent magnetization that is not aligned with today’s magnetic field, then you’ve got to infer that either the Deccan used to be somewhere else, or the magnetic field has changed. The latter “interpretation, in the form of a polar-wandering curve, was suggested by Creer and others (1954, p. 165) to explain pre-Tertiary paleomagnetic data from North America and Europe. When more data became available, the possibility of obtaining a better fit with separate paths for North America and Europe was pointed out by Irving (1956, p. 39) and Runcorn (1956, p. 82–83) [...]. Subsequent paleomagnetic data from India, Australia, and Japan have led many authors to postulate different polar-wandering curves for each of these regions. [...] Very large relative drifts and rotations are required to bring them into coincidence.”592 Look at Fig. 8.17, which shows five “polar-wandering curves” compared by Cox and Doell: each obtained using data from one of five different places: and they’re all different. What this means, basically, is: OK, let’s say the pole has “wandered”: even so, that doesn’t account for the discrepancy between the magnetic field of today and that of the past. You need continents to move with respect to one another, too, just like Blackett had always thought. In their paper Cox and Doell aren’t making any big claims re continental drift, though. “In order to test properly the hypotheses of large-scale continental drifts since the Cretaceous and Eocene, it will be necessary to have additional paleomagnetic data from well-dated Cretaceous and Eocene rocks from all continents”, etc. I have a feeling that they’re being so cautious because they
Fig. 8.17 After Cox and Doell (1960): “Postulated paleomagnetic polar-wandering curves for Europe, North America, Australia, India, and Japan. Є-Cambrian, S-Silurian, D-Devonian, C-Carboniferous, P-Permian, T-Triassic, J-Jurassic, K-Cretaceous, E-Eocene, M-Miocene, Pl-Pliocene”. (Used with permission of Geological Society of America, from Allan Cox and Richard R. Doell, “Review of Paleomagnetism”, The Geological Society of America Bulletin, vol. 71, 1960; permission conveyed through Copyright Clearance Center, Inc.)
need to publish their work, and most of the community at this point is not wild about continental drift: manuscripts submitted for publication are reviewed by peers, and most of Cox and Doell’s peers in the 1950s don’t believe that continents move.
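The latitude argument behind Fig. 8.16 can be made quantitative. For a geocentric dipole field, the inclination I of the field at magnetic latitude λ satisfies tan I = 2 tan λ. This standard relation is not spelled out in the text, so take the following as a sketch of how paleomagnetists turn the inclination frozen into a rock into the latitude at which that rock must have solidified:

```python
import math

def paleolatitude_deg(inclination_deg):
    """Geocentric-dipole relation: tan(I) = 2 * tan(latitude).

    Given the magnetic inclination I recorded in a rock (degrees,
    positive = pointing down into the ground), return the magnetic
    latitude at which the rock acquired its magnetization."""
    inc = math.radians(inclination_deg)
    return math.degrees(math.atan(math.tan(inc) / 2.0))

# A steep downward inclination implies formation at high northern latitude;
# a negative (upward) inclination implies the southern hemisphere.
print(paleolatitude_deg(60.0))    # about 40.9 degrees north
print(paleolatitude_deg(-30.0))   # about -16.1 degrees: southern hemisphere
```

This is exactly the kind of arithmetic behind Blackett’s Deccan argument: a reversed-looking, upward inclination in an old Indian basalt points to a formation latitude south of the equator, far from where India sits today.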
8.10 Maps of the Ocean Floor

But the thing is, evidence for drift continues to pour in from all sorts of places. Consider the depth of the ocean bottom, the bathymetry. Since the turn of the twentieth century, that was measured using sonar-type devices, that is, devices that both emit and record sound: they send out a specific signal, called a “ping”, and then if they record, after some delay, anything that sounds like it, that’s the ping reflected by some obstacle: and the delay is a measure of how far the obstacle is. (The speed of sound in water, roughly one and a half km per second, is close to constant, despite changes in temperature, etc.: so you just multiply that by the delay, and what you get is twice the distance.) You can use this to see how deep the ocean is, but also to catch an enemy submarine. So, obviously, “echo-sounding” technology was
much improved during world war two, and this continued with the cold war. Long story short, throughout the 1950s and ’60s tons of bathymetry data are collected, mostly through the efforts of Maurice Ewing and his growing army of scientists at the Lamont Geological Observatory593, who begin to try to make sense of them. Initially there’s lots of data from the Atlantic ocean, and Lamont’s Bruce Heezen and Marie Tharp notice one strange thing, or actually two: (i) that there is a whole huge mountain chain that runs roughly north-south, sort of cutting the Atlantic seafloor in two, but following, more or less, the North America/Europe and South America/Africa matching coastlines, and (ii) along the axis of the chain there’s a canyon, sort of: a V-shaped fracture which they figured could be a rift, i.e., a place where the surface of the earth is being pulled apart594,595. There were two reasons to think that canyon was a rift, both explained by Heezen in his 1960 Scientific American (vol. 203) paper, “The Rift in the Ocean Floor”. The first reason is that “folded mountains such as the Jura of northern France and the Appalachians of the U. S.”, says Heezen, “are huge wrinkles in strata of sedimentary rock. These sediments are five to 10 miles thick; the fact that the sediments of the mid-oceanic ridge are only half a mile thick seems to rule out the possibility that it is a range of folded mountains.” There’s only “one continental feature that does bear resemblance to the mid-ocean ridge and rift. This is the rift valley system of the rugged East African Plateau. Geologists who held with the shrinking earth theory were at a loss to explain this formation when it was first entered on topographic maps at the turn of the century. 
The valleys appeared to be huge tension cracks in the earth’s crust—as if the crust had been stretched at right angles to the axes of the valleys until it had split.” The second reason Heezen and Tharp figured the canyon along the ridge axis must be a rift is that a lot of earthquakes happen under it. “Before the seismograph came into use late in the 19th century,” writes Heezen, “the determination of earthquake epicenters depended upon reports from earthquake-devastated areas. As a consequence most earthquakes at sea went unrecorded. The early seismograph networks in Europe and America soon showed that earthquakes occur quite frequently in the mid-Atlantic [...]. With more accurate location of epicenters [because more and more instruments are deployed, and the more instruments you have, the more precise your locations are going to be], geologists came to realize that [...] [a]lmost all of the epicenters of mid-ocean earthquakes lie within [the rift]. The few outside usually lie within the limits of error of the earthquake-detecting network.” And that to Heezen means that “the rift valley is undoubtedly an active fracture in the crust of the earth, and crustal movement along this fracture generates the earthquakes”, etc. (Fig. 8.18 shows where ocean quakes typically occur: you see how clear the picture is: you look at the dots and you see the ridge.) Strangely enough, Heezen’s inference from all this was that the earth expands. There’s a section in his paper about this idea. “I have recently suggested that the earth is neither shrinking nor remaining at the same size; rather, it is expanding.” We’re seeing new “crust” being formed at the mid-ocean ridges, so unless that crust is absorbed back into the earth somewhere else—and Heezen doesn’t see how, or where—the outer surface of the earth must be growing in size, and of course if
Fig. 8.18 Earthquake epicenters (black circles) in the Atlantic ocean are all within a very narrow belt, following quite precisely the mid-ocean ridge (which, like other plate boundaries, is denoted here by a gray line). There’s a plot similar to this in Heezen’s 1960 paper, where he used the relatively few data that were available back then. My plot includes all quakes detected between January 1, 2010, and December 31, 2015 (no particular reason for picking these dates), all magnitudes and depths, etc. Data from the United States Geological Survey
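As an aside, the delay-to-depth conversion for the echo sounders described at the start of this section is simple enough to write down (the six-second example is mine, just to show the scale):

```python
SPEED_OF_SOUND_WATER_KM_S = 1.5   # roughly constant, as noted in the text

def depth_from_echo_km(round_trip_delay_s):
    """Convert a sonar ping's round-trip delay into water depth.

    The pulse travels down to the bottom and back up, so the
    speed-times-delay product is twice the depth: halve it."""
    return SPEED_OF_SOUND_WATER_KM_S * round_trip_delay_s / 2.0

# A ping that comes back after 6 seconds implies 4.5 km of water,
# a typical abyssal depth.
print(depth_from_echo_km(6.0))   # 4.5
```

Sweep a ship along a track while pinging continuously and you get a bathymetric profile; that, repeated over thousands of cruise tracks, is the raw material of Heezen and Tharp’s maps.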
that’s the case, that means that the earth as a whole must be expanding. Plus, he claims, it can be shown “that expansion of the earth would change the relative positions of the continents in a way that would satisfy the different polar-wandering curves—much as inflation of a balloon changes the orientation of points drawn upon it.” (He doesn’t really prove this, though, and he doesn’t refer his readers to anyone who’s actually worked this out.) “The idea of an expanding earth”, he writes, “is not new. The British physicist P. A. M. Dirac first proposed 25 years ago, on cosmological grounds, that the force of gravity decreases in proportion to the age of the universe. R. H. Dicke of Princeton University has calculated that the decrease of the gravitational constant would permit the circumference of the earth to grow 1,100 miles in 3.25 billion years. J. Tuzo Wilson of the University of Toronto has pointed out that an 1,100-mile increase in the circumference of the earth would increase the surface area by an amount equal to the total area of the mid-ocean ridge.” Dirac’s was just a speculation, though, and so far as I know it still is. Rifting along the mid-ocean ridges is (geologically speaking) fast: people in 1960 already had estimates of the age of ocean-bottom rocks from samples (“cores”) collected e.g. in Ewing’s cruises, and everything seemed to be pretty young, meaning the crust created by rifting had to be created in a geologically short time. So, to begin with, the idea that the earth was inflating was difficult to fit e.g. with Holmes’ energy budget, convection, and radioactivity estimates given the composition; and on top of that, expansion also had to be really fast: the whole thing just wasn’t very convincing.
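Heezen’s quoted numbers are easy to check with elementary sphere geometry. The sketch below redoes the arithmetic only (the present-day circumference value is my assumption, and whether the area gained really equals the total ridge area, as Wilson claimed, is not something this calculation can verify):

```python
import math

# Figures quoted by Heezen (1960): per Dicke's calculation, the earth's
# circumference grows by 1,100 miles over 3.25 billion years.
MILES_ADDED = 1100.0
PRESENT_CIRCUMFERENCE_MI = 24901.0   # assumed present equatorial circumference

r_now = PRESENT_CIRCUMFERENCE_MI / (2.0 * math.pi)
r_then = (PRESENT_CIRCUMFERENCE_MI - MILES_ADDED) / (2.0 * math.pi)

# Growth in the surface area of the sphere: 4*pi*(R^2 - R0^2).
area_gained_sq_mi = 4.0 * math.pi * (r_now**2 - r_then**2)

print(f"radius gained: {r_now - r_then:.0f} miles")
print(f"surface area gained: {area_gained_sq_mi / 1e6:.1f} million sq. miles")
```

The radius gain works out to about 175 miles, and the area gain to roughly 17 million square miles, over more than three billion years. Spread over that much time it is a glacially slow inflation, which is part of why the expansionist reading of the young, fast-rifting seafloor never quite added up.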
8.11 Robert Dietz and Harry Hess

So the following year, Robert Dietz published a paper in Nature, “Continent and Ocean Basin Evolution by Spreading of the Sea Floor”, where he came up with an alternative “concept [...] derived through an attempt to interpret sea-floor bathymetry.” He explains, there, that “geologists have traditionally recognized that compression of the continents (and they assumed of the ocean floors as well) was the principal tectonic problem. It was supposed that the Earth was cooling and shrinking. But recently, geologists have been impressed by tensional structures, especially on the ocean floor. To account for sea floor rifting, Heezen, for example, has advocated an expanding Earth”. And Heezen wasn’t the only “expansionist” around: “Carey’s596 tectonic analysis has resulted in the need for a twenty-fold increase in volume of the Earth.” So Dietz (like, e.g., Ewing, and others) thinks this is crazy, though he doesn’t say it with these words, and proposes in his paper an alternative mechanism that he calls “spreading of the sea floor”, which “offers the less-radical answer that the Earth’s volume has remained constant.” Simplifying a bit, Dietz’ reasoning goes like this: (i) we know from Heezen that the mid-ocean ridges are rifting: it looks like new seafloor is being formed at the ridges, injected from the “mantle” or “substratum” underneath. Dietz doesn’t think this means that the earth is expanding, though, because, like I was saying, (ii) we know—e.g., from all the samples collected by Ewing and his people at Lamont—that seafloor rocks are always young. But then, (iii) there must be some places where the stuff that had been formed at the ridges is eaten back into the earth—otherwise we’d have found some very old seafloor somewhere, just like in continents you can find rocks that are billions of years old. And so the total amount of oceanic crust under the oceans would stay about the same. 
But then, if there’s no expansion, what is the force that opens the rifts, and pushes new seafloor away from them? Well, (iv) below the shallowest, rigid (high-viscosity) layer(s) of the earth (which Dietz makes a point of calling “lithosphere”, rather than crust: I’ll get back to this distinction shortly) there is convection: we know that much from the work of Holmes, for example, and Griggs, too. So, (v) what if the lithosphere were “coupled with the convective overturn of the mantle”? Meaning, if there’s enough friction—or, more precisely, “viscous drag”—between the convecting viscous mantle below and the rigid lithosphere above, then the former will carry the latter with itself: just like in Griggs’ experiment. People would later use the conveyor-belt, or tapis-roulant, metaphor to describe this mechanism (which, incidentally, today isn’t considered to be the most representative of what really might happen between lithosphere and mantle: but we’ll get to that later). “The gross structures of the sea floor”, writes Dietz, “are direct expressions of this convection. The median rises mark the up-welling sites or divergences; the trenches are associated with the convergences or down-welling sites [...]. The high heat-flow under the rises is indicative of the ascending convection currents”, because, of course, the way convection works, the stuff that rises does so because it’s hot—and therefore buoyant. “By viscous drag, the continents initially are moved along with the sima
until they attain a position of dynamic balance overlying a convergence. There the continents come to rest, but the sima continues to shear under and descend beneath them; so the continents generally cover the down-welling sites. If new upwells do happen to rise under a continental mass, it tends to be rifted. Thus, the entire North and South Atlantic Ocean marks an ancient rift which separated North and South America from Europe and Africa.” So, in Dietz’ scheme, continents move. But “this is not exactly the same as continental drift. The continents do not plow through oceanic crust impelled by unknown forces; rather they ride passively on mantle material as it comes through the surface at the crest of the ridge and then moves laterally away from it.” This is actually a quote from Harry Hess’ famous 1962 paper, where he says pretty much the same things that Dietz had said the year before597. In Dietz’ words, “a principal objection to Wegener’s continental drift hypothesis was that it was physically impossible for a continent to ‘sail like a ship’ through the sima; and nowhere is there any sea floor deformation ascribable to an on-coming continent. Seafloor spreading obviates this difficulty: continents never move through the sima—they either move along with it or stand still while the sima shears beneath them.” It’s the conveyor-belt idea that I was telling you about. In all this, though, the “leading edges” of continents, says Hess, “are strongly deformed when they impinge upon the downward moving limbs of convecting mantle”: that’s how mountains are formed. Both Hess and Dietz invoke a new observation, or suite of observations, from seismology to support their claims. One of the things Maurice Ewing did when he went to sea, or had people do when he sent them to sea, was seismic refraction (remember Chap. 
7); in 1959 he published, with his brother John, their “seismic-refraction measurements in the Atlantic Ocean basins, in the Mediterranean Sea, on the mid-Atlantic ridge, and in the Norwegian sea”, which showed many things, and in particular that the sediment cover (“lithified” stuff below, and a few hundred meters of “unconsolidated” sediments on top) under the ocean is relatively thin, about 1 km or so on average. Cruises in the Pacific saw the same thing—Hess cites “Raitt’s (1956) refraction profiles on the East Pacific Rise.”598 But, says Hess, “looking over the reported data on rates of sedimentation in the deep sea, rates somewhere between 2 cm and 5 mm/1000 yrs seem to be indicated”. Now do the math: assume that there’s no seafloor spreading, no continental drift, nothing, just sediments piling up on the bottom of the ocean for all of geologic time; take into account that sediments are compacted under their own weight, so that sedimentary rock, particularly in the deeper layers, is denser than sediments are when they are first deposited. Hess does this math, and figures 8.5 km is, like, the smallest possible thickness that he can come up with. And that’s much bigger than 1 km. The theory of seafloor spreading solves the problem, because it says that sediments deposited at sea bottom are carried along as the seafloor spreads away from the ridge and finally, writes Hess, “will move to the axis of a downward-moving limb of a convection current, be metamorphosed, and probably eventually be welded onto a continent.” So there’s no time for them to form really big piles, as might happen on continents.
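Hess’s back-of-the-envelope is easy to reproduce. Be warned that the ocean age and the compaction factor below are illustrative assumptions of mine, not his exact inputs; the point is only that, with no spreading, even the slowest deposition rate builds a pile several times thicker than the observed 1 km or so:

```python
# Illustrative numbers only: the two rates bracket Hess's quoted range,
# while the ocean age and compaction factor are assumptions made here
# for the sketch, not figures from his paper.
OCEAN_AGE_YR = 3.5e9          # if ocean basins were permanent features
RATE_LOW_MM_PER_KYR = 5.0     # 5 mm per 1000 yr
RATE_HIGH_MM_PER_KYR = 20.0   # 2 cm per 1000 yr
COMPACTION = 0.5              # crude allowance for self-compaction at depth

def pile_thickness_km(rate_mm_per_kyr):
    """Sediment pile accumulated over OCEAN_AGE_YR at a constant rate,
    reduced by a flat compaction factor, converted mm -> km."""
    uncompacted_mm = rate_mm_per_kyr * (OCEAN_AGE_YR / 1000.0)
    return uncompacted_mm * COMPACTION / 1e6

print(pile_thickness_km(RATE_LOW_MM_PER_KYR))    # 8.75 km
print(pile_thickness_km(RATE_HIGH_MM_PER_KYR))   # 35.0 km
```

The low-rate answer lands near Hess’s figure of about 8.5 km, and the high rate gives tens of kilometers: either way, far more sediment than the refraction surveys actually see.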
One useful thing that is in Dietz’ paper and not (or not equally clear) in Hess’ paper is the distinction between “crust” and “lithosphere”. Dietz explains that there’s a problem with the word “crust”, which “has been effectively pre-empted from its classical meaning by seismological usage applying it to the layer above the Moho”: meaning, as you know from Chap. 7, people discovered a sharp seismic discontinuity at, I don’t know, 35 km depth under continents, and 12 under oceans (12 below sea level, or 7 below the ocean floor)—these are the figures that are in Dietz’ paper—and they somewhat arbitrarily decided to call “crust” everything that’s above it. But that’s confusing, because the “classical meaning” of crust is, the outer layer of the earth which has very high viscosity, as opposed to the substratum that has low viscosity and, in geological time, can flow. But the thing is, there is no reason to think that the seismic discontinuity found by Mohorovičić should be that: and in fact it probably isn’t: because “deviations from isostasy prove that approximately the outer 70 km of the Earth (under the continents and ocean basins alike) is moderately strong and rigid even over time-spans of 100,000 years or more”. This “outer rind” should be called lithosphere. “Beneath lies the asthenosphere [which] is a domain of rock plasticity and flowage where any stresses are quickly removed. No seismic discontinuity marks the [lithosphere-asthenosphere boundary] and very likely it is actually a zone of uniform composition showing a gradual transition in strength as pressure and temperature rise”599. So, convective flow moves asthenospheric material around, and the lithosphere is what rides on top of the flow. The general pattern of mantle flow can be guessed, to some extent, from the scale of earth’s topography—the size of continents and oceans. 
Hess writes, “cells would have the approximate diameter of 3000 to 6000 km in cross section (the other horizontal dimension might be 10,000–20,000 km, giving them a banana-like shape).” The speed at which seafloor spreads is the same as the speed at which continents drift, which had been roughly measured by Runcorn—it’s the speed at which the relative magnetic pole of a given continent apparently “wanders”—and it’s “a few cm/yr” in Dietz’ paper, or “lies between a fraction of a cm/yr to as much as 10 cm/yr” according to Hess. At the end of his paper, Dietz makes a suggestive remark, which I don’t see in Hess’ paper: that “Vacquier, V., et al. (in the press) recently have completed excellent sea-floor magnetic surveys off the west coast of North America. A striking north-south lineation shows up which seems to reveal a stress pattern (Mason, R. G., and Raff, A. D., in the press).” I am not sure what Dietz means by “stress pattern”, but the north-south lineations recorded by Victor Vacquier et al., and identified as such by Ronald Mason and Arthur Raff600, would reveal something important indeed. To explain what that is—and why it’s important—I need to talk more about studies of the magnetic field.
8.12 Geomagnetic Reversals

In parallel with the work of Blackett, Runcorn, Irving and co., other magnetists were interested in understanding the source of the earth’s magnetic field, geodynamo and all that: this you already know. Something important that had happened on that front, starting early in the twentieth century—hope you don’t mind if I go back a few decades: I won’t be long—was the discovery of geomagnetic reversals, which I mentioned earlier in this chapter, in the geodynamo piece. So here’s the full story on that: when you look at magnetized rocks, say for example several layers of frozen lava, from multiple eruptions, of course the shallower layers are the younger ones; and in the twentieth century one can also date rocks with radioactivity. So, when people looked at older volcanic rocks, which should be magnetized according to how the magnetic field was at the moment of the eruption, they noticed that the polarity of the magnetic field recorded in the rock sometimes was, not just slightly different, but opposite the polarity of the earth’s magnetic field as we see it today. I am not an expert of geomagnetic data, but I can imagine that early on, when not so many of them were available, they must have been very “noisy” and not particularly coherent, so presumably initially it was thought that whether or not the field could be accurately “recorded” depended on some properties of the rocks where it was recorded; perhaps some minerals somehow “reversed” the field direction? But then people came up with the idea that the whole magnetic field might have been “inverted”: the dipole flipped over by exactly 180°, in a geologically short time. The first to publish this was Bernard Brunhes, the director of the Geophysical Observatory of Puy de Dôme (which is, more or less, the place where Guettard had found basalt—in Chap. 
3): he wrote several articles in 1905–1906 where he showed samples of old basaltic lava flow from his area, whose magnetization appeared to be reversed with respect to that of more recent lavas, and proposed that that could be explained with a reversal of the magnetic field. Of course a few samples from one place aren’t enough to really conclude anything: but then in 1926 Paul-Louis Mercanton601 published a couple of papers re the magnetization of basalts that he had found in Greenland, showing again a reversal in their magnetization; and then there’s Motonari Matuyama, who looked at many volcanic rocks from Japan, and saw that all rocks that had been erupted “recently” were polarized according to the magnetic field of today; while rocks of Pleistocene age, or older, were polarized the opposite way—“reversed”. This was the first time that the age of a magnetic reversal could be estimated: “there is a group of specimens whose directions of magnetization falls around the present earth’s field”, wrote Matuyama. “A number of other specimens forms another group almost antipodal to the former. [Although] the ages of the collected basalts are not always clearly known, [...] we may consider that in the earlier part of the Quaternary Period the earth’s magnetic field in the area under consideration was probably in the state represented by the second group, which gradually changed to the state represented by the first group.”
Matuyama’s work did not immediately get the attention it deserved. In their famous 1964 Science paper602, Allan Cox603 and his colleagues from the U. S. Geological Survey note that “during the two decades (1928 to 1948) that followed [Matuyama’s] work there was [...] little scientific reaction to the hypothesis of geomagnetic field reversal.” But, “the observational basis for field reversals has been greatly extended since 1950, partly as a result of improvement in techniques of determining the stability and reliability of rock magnetism for determining past geomagnetic field directions, and partly as a result of the vast increase in paleomagnetic data available”, etc. With the data604 available to them in 1964, Cox et al. are able to date three reversals over the past four million years—plus some “events” where the field flips twice in a relatively short time: see their diagram in Fig. 8.19 here. The “normal” epoch we live in is given the name of Brunhes; it is preceded by the Matuyama reversed epoch, which is preceded by the Gauss normal one. I don’t know if Cox and co. actually named those epochs for the first time; anyway, the names stuck, and the earlier, reversed epoch would be called Gilbert. I don’t know if there are other names for earlier epochs.
Fig. 8.19 After Cox et al. (1964): “Magnetic polarities of 64 volcanic rocks and their potassium-argon ages. Geomagnetic declination for moderate latitudes is indicated schematically.” (Used with permission of The American Association for the Advancement of Science, from Allan Cox, Richard R. Doell, G. Brent Dalrymple, “Reversals of the Earth’s Magnetic Field: Recent Paleomagnetic and Geochronologic Data Provide Information on Time and Frequency of Field Reversals”, Science, vol. 144, 1964; permission conveyed through Copyright Clearance Center, Inc.)
8 The Forces that Shape the Earth: Convection and Plates
8.13 The Zebra Pattern
“Meanwhile, throughout the 1950s,” says Oreskes, “researchers at Scripps and Lamont had been collecting sea floor magnetic data, with funds and logistical support provided by the U.S. Navy. In 1961, Scripps scientists Ronald Mason and Arthur Raff published a widely read paper documenting a distinctive pattern of normal and reversely magnetized rocks off the northwest coast of the United States”: that’s the “striking north-south lineation” which, Dietz wrote (see above), “seems to reveal a stress pattern”, whatever that is. The meaning of Mason and Raff’s zebra pattern—as it is understood and taught today—is explained for the first time in a 1963 paper by Cambridge grad student Frederick Vine and his supervisor Drummond Matthews605 , who looked at data from “a detailed magnetic survey over a central part of the Carlsberg Ridge as part of the International Indian Ocean Expedition” by a British survey ship, the HMS Owen. Vine and Matthews plotted magnetization as a function of distance from the ridge, and saw that its sign changes abruptly every 20 km or so. “Some 50 per cent of the oceanic crust might be reversely magnetized”, they wrote, “and this [...] has suggested a new model to account for the pattern” of this remanent magnetization. “The theory is consistent with, in fact virtually a corollary of, current ideas on ocean floor spreading [and they cite Dietz’ Nature paper] and periodic reversals in the Earth’s magnetic field. If the main crustal layer [...] of the oceanic crust is formed over a convective up-current in the mantle at the centre of an oceanic ridge, it will be magnetized in the current direction of the magnetic field. Assuming impermanence of the ocean floor, [...] the thermo-remanent component of its magnetization is [...] either essentially normal, or reversed with respect to the present field of the Earth. 
Thus, if spreading of the ocean occurs, blocks of alternately normal[ly] and reversely magnetized material would drift away from the centre of the ridge and parallel to the crest of it. “This configuration of magnetic material could explain the lineation or ‘grain’ of magnetic anomalies observed over the Eastern Pacific to the west of North America”, i.e. it could explain Mason and Raff’s observation. In other words, new seafloor acquires its remanent magnetization at the moment it emerges near the surface, as molten rock, and freezes, see above; so the direction of magnetization is determined by the earth’s magnetic field as it is at that moment. If Dietz and Hess are right and the seafloor actually spreads, then the seafloor that we see 20 km away from the ridge was formed some time ago—the time it took the conveyor belt to carry it 20 km away: which at 1 cm/year would be 2 million years—which brings us back to the Matuyama reversed epoch, see Fig. 8.19. Now do this calculation, say, every few km starting at the rift and moving away from it, perpendicularly to the ridge direction, and you get a jumpy positive/negative curve; which actually looks a lot like the data. And if you repeat it for a whole bunch of sections along the ridge, you get the zebra pattern (Fig. 8.20).
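If you want to play with this yourself, here is a minimal sketch (mine, not from any of the papers) of the distance-to-polarity calculation just described. The half-spreading rate and the reversal ages are assumed round numbers, roughly in the ballpark of Cox et al.’s 1964 time scale:

```python
# Toy reconstruction of the "jumpy" polarity curve described in the text.
# Assumptions (not from the original): epoch boundaries at roughly 1.0,
# 2.5 and 3.4 Myr, and a half-spreading rate of 1 cm/yr = 10 km/Myr.

HALF_RATE_KM_PER_MYR = 10.0          # 1 cm/yr
BOUNDARIES_MYR = [1.0, 2.5, 3.4]     # approximate epoch boundaries
# Epochs, youngest first: Brunhes (normal), Matuyama (reversed),
# Gauss (normal), Gilbert (reversed).

def age_of_seafloor(distance_km):
    """Age of the seafloor a given distance from the ridge axis."""
    return distance_km / HALF_RATE_KM_PER_MYR

def polarity(distance_km):
    """+1 for normal, -1 for reversed remanent magnetization."""
    age = age_of_seafloor(distance_km)
    n_flips = sum(1 for b in BOUNDARIES_MYR if age > b)
    return +1 if n_flips % 2 == 0 else -1

# Sample every 2 km out to 40 km: the sign flips near 10, 25 and 34 km,
# which, repeated along the ridge, is the stripe pattern.
profile = [(d, polarity(d)) for d in range(0, 42, 2)]
print(profile)
```

Note that, e.g., `polarity(20)` comes out reversed: 20 km at 1 cm/yr is 2 Myr, i.e., the Matuyama epoch, exactly as in the text.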
Fig. 8.20 The earliest magnetic-zebra-pattern images that I know of are in Mason and Raff’s papers. But they are kind of messy, because Mason and Raff happened to be collecting their data in an area where the geometry of the mid-ocean ridge is not straightforward—lots of transform faults—we’ll see in a minute what those are. Anyway, messy. A couple years later the magnetists at Lamont—James R. Heirtzler, Xavier Le Pichon, J. Gregory Baron—published (vol. 13 of Deep-Sea Research, 1966) a similar study of the Reykjanes ridge, which is a segment of the mid-Atlantic ridge south-west of Iceland, and their zebra pattern is very clear: just a bunch of parallel stripes. The figure here shows Heirtzler et al.’s image, but is taken from the 1966 Science paper by Vine (“Spreading of the ocean floor: new evidence”), where he analyzes stuff from both Scripps and Lamont. The area of study is signaled by a rectangle in the map on the left, and the axis of the ridge by two solid-line segments in the diagram on the right. As for the diagram: black stands for normal magnetization, white for reversed. (Used with permission of The American Association for the Advancement of Science, from F. J. Vine, “Spreading of the Ocean Floor: New Evidence: Magnetic Anomalies May Record Histories of the Ocean Basins and Earth’s Magnetic Field for 2 × 108 Years”, Science, vol. 154, 1966; permission conveyed through Copyright Clearance Center, Inc.)
8.14 Transform Faults
There’s one more piece of evidence in favor of the new theory that I need to explain to you so that then you can chat about plate tectonics even with the experts. The keyword is “transform faults”; which is not a great keyword, because it is used by different people in different situations to mean different things: but I am going to use it the way Wilson and Sykes and others used it in the heyday of plate tectonics. First, let me give you a crash course in how geologists or seismologists classify earthquake faults. The classification per se is simple, but the names that are used are a mess because, for whatever reason, there’s a bunch of different ways for saying basically the same things. The easiest ones to picture are “strike-slip” faults, which are such that the angle that the plane of the fault makes with the surface of the earth (the so-called “dip angle”) is about 90◦ , and the slip—the displacement of one “wall” of the fault with respect to the other, when you have a quake—is parallel to the surface of the earth. The San Andreas, in California, is the most famous strike-slip fault. When the dip angle is not 90◦ , people like to call footwall the side of the fault that’s underneath, and hanging wall the one that’s on top—see the sketch in Fig. 8.21; so
Fig. 8.21 Basic types of fault ruptures. Arrows show which way each block is moving; one highlighted horizontal layer and a vertical intrusion (dark gray) should help you to visualize the geometry of the faults
then if the hanging wall slips downward with respect to the footwall, that’s called a “normal fault”, while if it climbs up the footwall, then that’s a “thrust fault”, or “reverse fault”. According to the new theory of Hess and Dietz and co., you’d expect to have normal faulting where there’s divergence between blocks of lithosphere—mid-ocean ridges, and rifts of all kinds: these are called divergent margins—and reverse faulting where there’s convergence: like in Chile and Japan and Indonesia: remember Fig. 8.6. Figuring out the plane and direction of slip, which seismologists had gotten pretty good at, was a good additional test that the new theory made sense. It turns out that the test is really decisive if applied to a plate boundary that is in the middle of an ocean: let me explain why. You remember Fig. 8.18, which only showed where the epicenters of quakes are. As more “stations”, i.e. permanently installed recording instruments, became available, seismologists managed to locate the epicenters more precisely, and you could tell that some quakes occurred right at the rift, like Heezen had figured, but others were along the so-called “fracture zones”, i.e. those strange features, that we haven’t talked about yet, that also showed up clearly in new bathymetry data in the 1950s, and that are, basically, discontinuities in the ridges and zebra patterns, kind of like a strike-slip fault cutting through a ridge (Fig. 8.22). So, J. Tuzo Wilson understood that looking at quakes happening along fracture zones was yet another way to test Hess’ and Dietz’ (and Holmes’) hypothesis. Here’s the idea: say Hess and co. are wrong: plates don’t spread, and mid-ocean ridges are just mountain chains hanging out at the bottom of the ocean. Then take the quakes that are measured on a fracture zone, right between two segments of the ridge. 
If, indeed, the fracture zone is just a strike-slip fault that cuts through the ridge, quakes should continue to push the two segments of the ridge crest away from one another:
8.14 Transform Faults
387
Fig. 8.22 Bathymetry of a “fracture zone”. The coastlines of South America and Africa are marked by a white line
Fig. 8.23 After Sykes (1967): “Sense of motion associated with transform faults and transcurrent faults [...]. Double line represents crest of mid-oceanic ridge; single line, fracture zone.” (Used with permission of John Wiley and Sons, from Lynn R. Sykes, “Mechanism of Earthquakes and Nature of Faulting on the Mid-oceanic Ridges”, Journal of Geophysical Research, vol. 72, 1967; permission conveyed through Copyright Clearance Center, Inc.)
just like the bottom sketches in Fig. 8.23. So, for example, if you sit on the southern side of the strike-slip fault to the left, bottom panel of Fig. 8.23, you’ll see the other side of the fault moving toward your left, etc. But now let’s say Hess is right: mass is being emitted at the ridge, so if you are sitting in that same spot, you should see the other side of the fault moving toward your right! that’s the situation sketched at the top left of Fig. 8.23. Wilson explained this idea in a paper that was published in Science in 1965, “A New Class of Faults and Their Bearing on Continental Drift”. He figured that fracture-zone faults resulting from seafloor spreading—assuming they actually existed, because at this point Wilson was just speculating—were strike-slip faults, but different from other strike-slip faults found, e.g., on continents. So he gave them a name and called them “transform” faults, meaning, he wrote, “faults in which the displacement stops or
changes form and direction”, or “which terminate abruptly at both ends, but which nevertheless may show great displacement”. The thing is, though, that’s something you can say, more or less, of any fault between different blocks of lithosphere that slip past one another—so that’s actually very confusing. Wilson called “transcurrent” a strike-slip fault that wasn’t a transform fault; but again, if you think about it, all transcurrent faults are ultimately transform faults, so this classification is not very useful606 . Wilson’s idea for testing seafloor spreading was, though. He had no quake data in his paper, but seismologists had everything that was needed to do the test, and in 1967 Lynn Sykes published the results in the Journal of Geophysical Research, in a paper called “Mechanism of Earthquakes and Nature of Faulting on the Mid-oceanic Ridges”. Bottom line, fracture-zone faults were strike-slip faults, slipping the way Wilson had guessed. Sykes wrote: “The mechanism of 17 earthquakes on the mid-oceanic ridges and their continental extensions were investigated using data from the World-Wide Standardized Seismograph Network607 of the U.S. Coast and Geodetic Survey and from other long-period seismograph instruments. Mechanism solutions of high precision can now be obtained for a large number of earthquakes with magnitudes as small as 6 in many areas of the world. Less than 1% of the data used in this study are inconsistent with a quadrant distribution of first motions608 [...]; in many previous investigations 15 to 20% of the data were often inconsistent with the published solutions. Ten of the earthquakes that were studied occurred on fracture zones that intersect the crest of the mid-oceanic ridge. The mechanism of each of the shocks that is located on a fracture zone is characterized by a predominance of strike-slip motion on a steeply dipping plane; the strike of one of the nodal planes for P waves is nearly coincident with the strike of the fracture zone. 
The sense of strike-slip motion in each of the ten solutions is in agreement with that predicted for transform faults; it is opposite to that expected for a simple offset of the ridge crest along the various fracture zones”, etc. Now you might be wondering why fracture zones and/or transform faults even exist at all. Why doesn’t the seafloor just break along a great circle? Strangely enough, we still aren’t really able to answer that.
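Wilson’s argument about the sense of slip can be captured in a few lines of code. What follows is my own toy model, not anything from Wilson’s or Sykes’ papers: a ridge running north-south, x measured eastward, symmetric spreading, and a fracture zone running east-west between two offset crest segments:

```python
# Toy illustration of why the slip sense on a transform fault is
# opposite to that of a transcurrent fault with the same crest offset.
# Geometry (assumed): ridge runs N-S, x increases eastward; the fracture
# zone runs E-W between two crest segments at different x-positions.

def spreading_velocity(x, crest_x):
    """Horizontal velocity (+1 = east, -1 = west) of seafloor created
    at a crest located at crest_x, under symmetric spreading."""
    return +1 if x > crest_x else -1

def transform_sense(crest_north_x, crest_south_x):
    """Slip sense on the fracture zone between the two crest segments,
    if the seafloor actually spreads (Wilson's transform fault)."""
    x = 0.5 * (crest_north_x + crest_south_x)   # a point between the crests
    v_north = spreading_velocity(x, crest_north_x)
    v_south = spreading_velocity(x, crest_south_x)
    # Facing north from the south side, east is to your right:
    return "right-lateral" if v_north > v_south else "left-lateral"

def transcurrent_sense(crest_north_x, crest_south_x):
    """Slip sense needed to *produce* the crest offset by strike-slip
    motion alone, i.e., with no spreading."""
    # Northern crest displaced west => north side moved west => left-lateral.
    return "left-lateral" if crest_north_x < crest_south_x else "right-lateral"

# Northern segment offset 100 km west of the southern one:
print(transform_sense(-100.0, 0.0))     # right-lateral
print(transcurrent_sense(-100.0, 0.0))  # left-lateral: the opposite sense
```

The two functions always disagree, which is exactly the testable prediction: the first motions measured by Sykes matched `transform_sense`, not `transcurrent_sense`.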
8.15 Rigid Blocks
Anyway, I don’t know if you had heard or read about plate tectonics before, but if you did, it might look to you now like this is it: combine the contributions of Dietz, plus Hess, plus Vine and Matthews, plus Wilson, plus Sykes—all the stuff that I have covered so far in this chapter—and what you get is the theory that’s taught today under the name of plate tectonics. That’s actually not quite true, though, because what’s missing, the thing that marks the transition from “continental drift” to “plate tectonics”, is that each plate (let’s start calling them that way, even though that word actually doesn’t show up until 1968 or so), meaning each continent but also each oceanic plate, is a rigid thing that moves
like a block, with no large internal deformation. That’s the one ingredient of plate tectonics that isn’t stated explicitly in the papers we’ve seen, although, yes, the word block had been used, e.g. by Dietz. So there’s a paper by Teddy Bullard: Bullard, Everett and Smith, 1965, “The Fit of the Continents around the Atlantic”609 , testing the hypothesis that the motion of a plate can be described as a rigid-body rotation around an axis that goes through the center of the earth. Bullard et al. take Africa and South America and show that they can get their boundaries to fit one another really well, if you keep one fixed and just rotate the other around an axis—the only difficulty is to find the right axis. And they do the same for North America and Europe. The underlying idea, of course, is that Africa and America were one continent, then they broke up according to Holmes’/Dietz’/Hess’ seafloor spreading idea; both new continents were pushed away from one another by the new oceanic plate forming at the rift, and all this could be described mathematically by one rotation—one axis, one angle, one angular velocity. No deformation within the plates. Incidentally, Bullard and co. don’t fit coastlines—that’s why I just spoke of “boundaries” of continents—“the fit of the two coastlines”, they write, “is not close and is in any case not very meaningful, since the position of the coastline would, in many places, be greatly changed by a small rise or fall of sea level. The real ‘edge of the continent’ is the continental slope where the sea floor runs down steeply from 50 or 100 fm. [fathoms, I guess: one fathom is about 2 m] to over 2000 fm. in a few miles.” They don’t know a priori which bathymetry contour (which “isobath”) they should take—how deep—but what they do is they repeat the exercise that I am (they are) about to describe with a suite of different contours, and pick the one that gives the best continent-to-continent fit, i.e. 500 fathoms. Bullard et al. 
define the “edge of a continent” as a discrete set of closely-spaced points—longitudes and latitudes—like in Fig. 8.24. By the “fixed-point theorem”, AKA Euler’s theorem, “any displacement of a spherical surface over itself”, says Bullard, “leaves one point fixed610 ; that is any displacement of a contour line or of a continent may be considered as a rigid rotation611 about a vertical axis through some point on the surface of the Earth”—“vertical” meaning, it goes through the center of the earth. So, Bullard’s problem consists of finding (i) the coordinates of this point—the “centre of rotation”, as he calls it—which in Fig. 8.24 is the vertex of the angle φ—and (ii) the value of the angle φ itself, or how much either continent has rotated with respect to the other. And here is his recipe to solve it. Call Pn , with n an integer index, any one of the points that define the edge of the continent to the west; measure the distance612 between the centre of rotation and Pn ; interpolate the points along the edge of the eastern continent, to find a point P′n which is exactly as far from the centre of rotation as Pn is. Then, for a second, pretend the centre of rotation is the north pole, and measure the difference in “longitude” between Pn and P′n , with respect to that “pole”; call it φn . Now imagine you rotate one continent, with respect to the other, by an angle φ = φ0 . Post-rotation, there will be a residual “longitude” discrepancy: (φn − φ0 ). If you redo the same thing, starting from a point on the eastern continent’s edge and
Fig. 8.24 Think of the two solid lines as the eastern boundary of South America (left) and western margin of Africa (right). The black dots (Pn , P′n , with n = 0, 1, 2, . . .) are the locations where the longitude and latitude of the margins were sampled. The angle φ is measured at a presumed “rotation pole” of one plate with respect to the other. (Used with permission of The Royal Society (U.K.), from Bullard, E., Everett, J. E., Smith, A. G., “The Fit of the Continents around the Atlantic”, Philosophical transactions of the Royal Society of London: Series A, Mathematical and Physical Sciences, vol. 258, 1965; permission conveyed through Copyright Clearance Center, Inc.)
then interpolating the western continent’s one, you should get a similar discrepancy, though not exactly the same—call it φ′n . If you repeat these exercises for both discrete sets of points—primed and unprimed—you can quantify the overall discrepancy613 associated with your guess for where the centre of rotation is, and for the value of φ0 ,

$$Q^2 = \frac{1}{2N}\sum_{n=1}^{N}\left[(\phi_n - \phi_0)^2 + (\phi'_n - \phi_0)^2\right], \qquad (8.116)$$

where N is how many points you have in each of your discrete sets—to the east and to the west—and you don’t want to have a different number of points on one contour versus the other, of course. Bullard et al. call Q 2 the “mean square misfit”614 . Now, what Bullard et al. want to do is, they make a guess as to where the centre of rotation might be, and then they need to find the value of φ0 such that Q 2 is minimum. To do this, they differentiate the right-hand side of (8.116) with respect to φ0 , set the result to zero615 and solve for φ0 , i.e.,
8.15 Rigid Blocks
$$\frac{\partial Q^2}{\partial \phi_0} = -\frac{1}{N}\sum_{n=1}^{N}\left[(\phi_n - \phi_0) + (\phi'_n - \phi_0)\right], \qquad (8.117)$$

and if I set that to 0 and solve for φ0 ,

$$\phi_0 = \frac{1}{2N}\sum_{n=1}^{N}\left(\phi_n + \phi'_n\right). \qquad (8.118)$$
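In code, eqs. (8.116) and (8.118) amount to very little. Here is a minimal sketch of both; the angle lists at the bottom are made-up numbers, just for illustration:

```python
# Minimal version of Bullard et al.'s misfit computation, eqs. (8.116)
# and (8.118): given the "longitude" differences phi_n (measured from one
# contour to the other) and phi'_n (measured the other way round), the
# best rotation angle is the grand mean of all 2N differences, and Q^2
# the mean square misfit. Numbers below are synthetic, not Bullard's data.

def best_rotation_angle(phi, phi_prime):
    """phi_0 from eq. (8.118)."""
    n = len(phi)
    assert len(phi_prime) == n, "same number of points on each contour"
    return (sum(phi) + sum(phi_prime)) / (2 * n)

def mean_square_misfit(phi, phi_prime, phi0):
    """Q^2 from eq. (8.116)."""
    n = len(phi)
    return sum((p - phi0) ** 2 + (q - phi0) ** 2
               for p, q in zip(phi, phi_prime)) / (2 * n)

# Synthetic differences scattered around a "true" rotation of 57 degrees:
phi       = [56.8, 57.3, 57.1, 56.9]
phi_prime = [57.2, 56.7, 57.0, 57.0]
phi0 = best_rotation_angle(phi, phi_prime)
print(phi0, mean_square_misfit(phi, phi_prime, phi0))
```

Note that with a single point per contour the formula reduces to the plain average of the two differences, as it should.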
“The most convenient method of finding the minimum”, says Bullard, “is to start from an estimated position and search systematically about it. This was done with the computer EDSAC 2 by the process illustrated [in Fig. 8.25]. The misfit Q(θ, λ) was calculated from (8.116) for an assumed centre of rotation with latitude θ and longitude λ using the best angle of rotation given by (8.118). The latitude of the centre of rotation was then increased by some small angle δ (usually 2◦ ) and Q(θ + δ, λ) found. If this was smaller than Q(θ, λ), further increases in latitude were made till for some integer, r , Q(θ + (r + 1)δ, λ) was found to be larger than Q(θ + r δ, λ). The longitude of the centre of rotation was then increased in a similar manner until a position of least misfit was found, giving Q(θ + r δ, λ + sδ), where s is an integer. The whole process was then repeated with increments −½δ, starting from this point, in order to locate the minimum more accurately. This multiplication of the interval by −½ was continued until the increment had fallen below some chosen value (usually 0.1◦ ).” Bullard and co. run their algorithm to find φ0 for Africa and South America, and then for North America and Europe, and it all works out nicely. They find that a very good fit between Africa and South America, for example, is found if the center of rotation is at latitude 44◦ N, longitude 30.6◦ W, and the rotation angle φ0 is 57◦ ; so if you do just that, the continents fit together as in Fig. 8.26. The northern Atlantic is comparatively messy because “the fitting of the lands around [it] requires the bringing together of three major continental masses, North America, Greenland and Europe.” So what they do is, first they fit Greenland to Europe, and then Greenland plus the rotated Europe to North America. “A fit of Greenland to northern Europe on the 500 fm. contour was first attempted. In this fit Iceland was ignored altogether [...]. 
Iceland is composed of Tertiary and Recent igneous rocks and its omission is clearly justified.” Good agreement (check out Fig. 8.27) of Greenland and Europe, with the center of rotation at 73◦ N, 96.5◦ W and a 22◦ rotation around that axis. It’s even “a closer fit than the fit of South America to Africa but over a shorter length of contour.” Greenland plus Europe do not fit North America that well, and I guess I don’t need to keep bombarding you with numbers, but it’s still OK (again Fig. 8.27). Bullard et al. eventually also check how North America fits Africa. They find that “if northwest Africa is fitted to eastern North America, Africa overlaps the position of southern Spain determined from the north Atlantic fit [...]. It is thus impossible to get any reasonable fit without some distortion of the continents.
Fig. 8.25 Qualitative sketch showing Bullard and co.’s way to find the best center of rotation for a given pair of plates, again from their 1965 paper. (I guess it’s kind of a simplified version of what people that are into “inverse theory” would call “steepest descent”, or “gradient descent”.) (Used with permission of The Royal Society (U.K.), from Bullard, E., Everett, J. E., Smith, A. G., “The Fit of the Continents around the Atlantic”, Philosophical transactions of the Royal Society of London: Series A, Mathematical and Physical Sciences, vol. 258, 1965; permission conveyed through Copyright Clearance Center, Inc.)
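Bullard’s search procedure, quoted above and sketched qualitatively in Fig. 8.25—march in latitude while the misfit drops, then in longitude, then reverse and halve the increment—is easy to put into code. This is my own minimal reading of the quoted recipe, with a made-up quadratic misfit standing in for the real Q of eq. (8.116):

```python
# Sketch of the EDSAC 2 coordinate search Bullard describes. Q is any
# function returning the misfit for a trial pole (theta, lam), in degrees.
# This is a loose paraphrase of the quoted procedure, not Bullard's code.

def bullard_search(Q, theta, lam, delta=2.0, tol=0.1):
    while abs(delta) >= tol:
        # March in latitude while the misfit keeps decreasing.
        while Q(theta + delta, lam) < Q(theta, lam):
            theta += delta
        # Then march in longitude.
        while Q(theta, lam + delta) < Q(theta, lam):
            lam += delta
        delta *= -0.5    # reverse direction and halve the step
    return theta, lam

# Made-up misfit with its minimum at (44 N, 30.6 W), roughly Bullard's
# pole for the South America-Africa fit:
Q = lambda t, l: (t - 44.0) ** 2 + (l + 30.6) ** 2
print(bullard_search(Q, theta=40.0, lam=-25.0))
```

With the default δ = 2° and tolerance 0.1° the search lands within a fraction of a degree of the true minimum, which is the kind of accuracy Bullard quotes.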
“The least distortion that will avoid the large overlap is a rotation of Spain to close up the Bay of Biscay and bring its north coast against the 500 fm. contour of western France. Such a rotation [...] is supported by some paleomagnetic evidence”, etc. And then funnily enough, the poet W. H. Auden is quoted. “That there may be something anomalous about the position of Spain was noticed by W. H. Auden (1950), who wrote in 1937 of ‘...that arid square, that fragment nipped off from hot Africa, soldered so crudely to inventive Europe.’ ” Look at Fig. 8.28 to see also how Spain needs to be rotated to fit the rest of the puzzle. After this, which is what people call a “seminal” paper, the treatment of Bullard et al. was extended, by many other researchers, to the entire planet. In 1968, Jason Morgan summed up the new continental-drift picture616 : “The surface of the earth is divided into about twenty units, or blocks”—check out Fig. 8.29 here. “The boundaries between blocks are of three types and are determined by present day tectonic activity”, meaning, you look at the mechanisms of the quakes that happen along a boundary and from those you tell what kind of boundary that is617 : i.e., which of the following three types: “the rise type at which new crustal material is being formed. [...] The trench type at which crustal material is being destroyed; that is, the distance between two landmarks on opposite sides of a trench gradually decreases and at least one of the landmarks will eventually disappear into the trench floor. [...] The fault type at which crustal surface is neither created nor destroyed [in Wilson’s terms, that would be called a transform or transcurrent boundary]. Each block [...] is surrounded by
Fig. 8.26 After Bullard et al., “the fit of Africa and South America at the 500 fm. contour [...], Mercator’s projection. Overlaps in [gray], gaps in [black].” (Used with permission of The Royal Society (U.K.), from Bullard, E., Everett, J. E., Smith, A. G., “The Fit of the Continents around the Atlantic”, Philosophical transactions of the Royal Society of London: Series A, Mathematical and Physical Sciences, vol. 258, 1965; permission conveyed through Copyright Clearance Center, Inc.)
some combination of these three types of boundaries.” The “assumption that gives this model mathematical rigor”, says Morgan, is “that each crustal block is perfectly rigid.” If you take, e.g., any two islands within the Pacific block, even very far from each other, and measure their distance, very precisely, says Morgan, and then do it again after a few years, under this assumption you’d get the exact same number. If you measure the distance between, say, Tokyo, which lies on another block, and any one of those islands, the distance would change. Morgan gets results like those of Bullard for other combinations of blocks; showing that, indeed, you can get blocks to fit one another at ridges, with approximately no distortion: which means that Morgan’s, and Bullard’s, and others’ assumption that the blocks be rigid is very likely to be a good one. As for the velocity of plate motions, that could be calculated from paleomagnetic data: because the ages of reversals are known, and the distance between a given reversal and the ridge can be measured. Bottom line, the reconstruction of former plate motions from this point on is just a matter of collecting and processing more and more data. In their 1968 Journal of Geophysical Research paper618 , Bryan Isacks and coworkers at Lamont summarize the results of Jason Morgan, and others, e.g., Xavier
Fig. 8.27 Bullard et al.’s fit of continents around the north Atlantic, again from the same paper. Overlaps and gaps in gray and black, respectively. (Used with permission of The Royal Society (U.K.), from Bullard, E., Everett, J. E., Smith, A. G., “The Fit of the Continents around the Atlantic”, Philosophical transactions of the Royal Society of London: Series A, Mathematical and Physical Sciences, vol. 258, 1965; permission conveyed through Copyright Clearance Center, Inc.)
Fig. 8.28 Bullard et al.’s fit of Africa plus South America to Europe plus Greenland plus North America. Maybe the Iberian peninsula wasn’t always part of Europe. (Used with permission of The Royal Society (U.K.), from Bullard, E., Everett, J. E., Smith, A. G., “The Fit of the Continents around the Atlantic”, Philosophical transactions of the Royal Society of London: Series A, Mathematical and Physical Sciences, vol. 258, 1965; permission conveyed through Copyright Clearance Center, Inc.)
Le Pichon, who “have demonstrated in a general way and with remarkable success”, says Isacks, “that such movement is self-consistent on a worldwide scale and that the movements agree with the pattern of seafloor spreading rates determined from magnetic anomalies at sea and with the orientation of oceanic fracture zones”. Others, like Dan McKenzie and Bob Parker, “used the mobile lithosphere concept to explain focal mechanisms of earthquakes, volcanism, and other tectonic features in the northern Pacific.” So, kinematically, the theory of plate tectonics is self-consistent. But what about its dynamics? What about the forces that drive it? In Isacks’ paper there’s a cartoon of plate tectonics (see the sketch in Fig. 8.30), where flow in the asthenosphere and plate motion on top of it actually point in opposite directions. Because the asthenosphere, by its very nature, must have really low viscosity, it seems unlikely to Isacks and company that there should be that much viscous drag between it and the lithosphere. They think the conveyor-belt idea of Holmes/Dietz/Hess doesn’t work. But we’ll have to wait a few more years before this question is answered more, uhm, “quantitatively”, as they say.
Fig. 8.29 Jason Morgan’s twenty “blocks”. (Used with permission of John Wiley and Sons, from W. J. Morgan, “Rises, Trenches, Great Faults, and Crustal Blocks”, Journal of Geophysical Research, vol. 73, 1968; permission conveyed through Copyright Clearance Center, Inc.)
Fig. 8.30 This is how Isacks et al. revisited in 1968 the sketch of plate tectonics drawn by Holmes in 1944 (Fig. 8.4 here), and/or Dietz in 1961 and Hess in 1962. Incidentally, Isacks’ paper is the earliest paper where I happened to find the phrase “new global tectonics”—as well as “plate”, which is their synonym for “lithospheric block”. In Morgan’s paper, published that same year, the word “plate” still doesn’t appear. (Used with permission of John Wiley and Sons, from Bryan Isacks, Jack Oliver, Lynn R. Sykes, “Seismology and the New Global Tectonics”, Journal of Geophysical Research, vol. 73, 1968; permission conveyed through Copyright Clearance Center, Inc.)
Before we get to that, let’s think a bit more about what is above the asthenosphere, i.e., the so-called lithosphere, about which all we know is that it is the combination of the crust as defined by seismologists—from the earth’s surface to the Moho—and some seismically different stuff underneath it, which seismologists already call “mantle”, but which still must have pretty high viscosity, because it’s part of the rigid plate and moves with it. Now, we are about to discover that oceanic and continental
lithosphere are two very different beasts; and we’ll have to study them separately. Oceanic lithosphere is easier, so to speak, because, actually, there is a simple way to estimate how thick it may be. So we are going to start with that estimate. As you shall see, it is not very clear-cut; truth be told, it’s only good at predicting the rate at which thickness grows with increasing distance from the ridge: but it’s still pretty useful, as I am going to try to show you now.
8.16 A Model of the Oceanic Lithosphere
The idea is that, to some approximation, the moment a new chunk of igneous material freezes at one side of a mid-ocean ridge, you can think of it as a “slab” of homogeneous composition at constant, and uniform, and very high temperature (look at the sketch in Fig. 8.31); then its upper side is cooled by ocean water at roughly 0 °C, and as time passes—and more seafloor is formed and the slab migrates away from the ridge—its temperature will be controlled by the same equation that governs the cooling of a half space. This is only true within some approximation, though. First, because the slab doesn’t really have infinite thickness. Second, because the slab doesn’t just cool down passively, but contains radioactive material that gives out some additional heat (based on what we know of chemical composition under the oceans, though, we don’t think that there’s much radioactivity there, and that additional heat should be pretty small). Third, because in reality there should be some heat conduction horizontally along the plate: but we know that temperature changes very rapidly in the vertical direction (remember Chap. 4: old observations made, e.g., in mines): and heat flux is proportional to temperature gradient: so it’s OK to assume that heat is conducted only upward, straight into the ocean. So then, based on Eq. (4.50), T in the slab is approximately given by

$$T(z,t) = T_0 \operatorname{erf}\left(\frac{z}{2\sqrt{\kappa t}}\right), \qquad (8.119)$$
Fig. 8.31 A sketch of oceanic lithosphere near a ridge. The two vertical “slabs” have just solidified from melt, and are ready to migrate away from the rift
8 The Forces that Shape the Earth: Convection and Plates
where T0 is the temperature of the slab at the moment it freezes (which happens very quickly after it emerges to the earth’s surface). If you look that up in the literature or on the web, you are likely to find that T0 is estimated at around 1300 ◦C, which is a typical value for the melting point of mantle rocks in laboratory studies. Also from lab data, the order of magnitude of thermal diffusivity of the materials we’re dealing with, at that kind of temperature, would be something like 10⁻⁵ km²/yr, or 10 km²/Myr. Now, one way to think of the oceanic lithosphere is, think about it “thermally”: meaning, assume that the base of the lithosphere is the place where T grows beyond a certain value—high enough for some kind of phase change to happen619, that makes deeper and hotter material very low-viscosity and prone to important viscous deformation: which is basically a description of the stuff that makes up the asthenosphere. It makes sense, then, to try and use Eq. (8.119) to calculate how deep the base of the lithosphere is. The error function, erf, doesn’t have an inverse function in closed form, but what people have been doing is they’ve calculated erf(x) for lots of values of x, compiling a “table” that you can find in books (or, nowadays, online, or you can just calculate it yourself with software such as Matlab and Python and all that). Call T1 the temperature we’re after. You read the table and look for the value ζ such that

erf(ζ) = T1/T0 = T1/(1300 ◦C).    (8.120)
For example, if T1 = 1000 ◦C, then T1/(1300 ◦C) = 0.77, so we look for the value of ζ such that erf(ζ) ≈ 0.77, and the table will tell you that that happens when ζ ≈ 0.85; and but so

z/(2√(κt)) ≈ 0.85,    (8.121)
or

z ≈ 1.7√(κt) = 1.7√(8 × 10⁻⁷ × t),    (8.122)
after replacing κ with the value I gave you a few pages back, right before estimating the Rayleigh number via Eq. (8.114). Then620, e.g., z ≈ 26 km if t = 10 Myr.
Had we chosen a different value for T1, we’d get a different coefficient in front of √(κt) in (8.122): if we think of ζ as the inverse function of erf, then we can write

z ≈ 2 ζ(T1/T0) √(κt),    (8.123)
which says that the depth at which any chosen T1 is found under an ocean is proportional to the square root of the age t of the lithosphere. One can repeat the exercise for many values of T1 and t, and the result is you end up with a diagram like that in Fig. 8.32. When you look at Fig. 8.32 you should keep in mind that there’s a simple relation between t and the distance between slab and ridge: because the slab drifts away from the ridge at a speed that can be calculated from ocean-bottom paleomagnetic data,
Fig. 8.32 “Isotherms” in oceanic lithosphere as a function of lithosphere age, according to the half-space cooling model. The temperature of each isotherm is as indicated in the plot
Fig. 8.33 The same isotherms, but now the horizontal axis shows distance from the ridge
see above, and that we know—from the data themselves—to be pretty much constant across each ocean. It does change from ocean to ocean, but the order of magnitude is always a few cm/year. To fix ideas, let’s take the speed to be 1 cm/year, and convert time to distance in Fig. 8.32: we get the diagram in Fig. 8.33, which is the same as Fig. 8.32 except for the units on the horizontal axis. You can see it as a cross-section of the temperature distribution in the oceanic lithosphere, along the direction of seafloor spreading and with the origin at the earth’s surface and mid-ocean ridge. In practice, you see from both Figs. 8.32 and 8.33 that there’s basically no lithosphere at the ridge, and then the lithosphere, whatever the T of its base is, thickens with increasing age or, which is the same, distance from the ridge. Whatever the t or distance from ridge, the thickness of the lithosphere is always growing—proportionally to √t—although at a diminishing rate.
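Each isotherm in Figs. 8.32 and 8.33 is just Eq. (8.123) evaluated for one fixed T1. Here is a minimal Python sketch of that calculation, using the rough parameter values quoted above (T0 = 1300 ◦C, κ = 8 × 10⁻⁷ m²/s); since the standard library has erf but no inverse erf, the “table lookup” of Eq. (8.120) is replaced by a numerical bisection. Take it as an order-of-magnitude exercise, not a definitive implementation.

```python
import math

# Rough values from the text:
T0 = 1300.0            # slab temperature at the ridge, deg C
KAPPA = 8.0e-7         # thermal diffusivity, m^2/s
SECONDS_PER_MYR = 3.1536e13

def erf_inverse(y, lo=0.0, hi=6.0, tol=1e-12):
    """Solve erf(zeta) = y for 0 <= y < 1 by bisection (erf is increasing)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def isotherm_depth_km(T1, age_myr):
    """Depth (km) at which temperature T1 (deg C) sits in lithosphere of a given age (Myr)."""
    zeta = erf_inverse(T1 / T0)          # stands in for the erf table of Eq. (8.120)
    t = age_myr * SECONDS_PER_MYR
    return 2.0 * zeta * math.sqrt(KAPPA * t) / 1000.0   # Eq. (8.123)

# For T1 = 1000 deg C and t = 10 Myr, zeta comes out near 0.85 and
# z near the ~26 km quoted in the text.
print(round(isotherm_depth_km(1000.0, 10.0), 1))
```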
This all looks very nice, but the model has at least one major problem. We can’t really be 100% sure of what the temperature T1 is, at which the lithosphere-asthenosphere transition occurs, but remember: T0 is the “temperature of the slab at the moment it freezes”: it’s the temperature of material that has recently been pulled up by convection from within the mantle: as such, T0 is actually a reasonable estimate for the temperature of the asthenosphere. Good, so then let’s replace T1 with T0 in Eq. (8.120); then, that means that the z we are looking for, the z of the base of the lithosphere, is given by erf(ζ) = 1: but if you remember how erf works, that only happens when ζ = ∞: which in turn can only happen if t = 0, i.e. at the very moment the lithosphere freezes at the ridge; or at z = ∞, that is, at infinite depth. Which, I think you’ll agree, doesn’t make much sense. What happened is, the approximations we made were enough to mess things up. The model is still useful, though, and in fact you find it explained in all sorts of textbooks. One thing people do in practice is, they set the temperature of the base of the lithosphere to a (large) fraction of T0, typically T1 = 0.9 T0 or so: and that way one gets at least a feeling for how thick the oceanic lithosphere may be. But one can also forget about thickness of the lithosphere altogether, and, strangely enough, it turns out that the half-space cooling idea agrees fairly well with observations of bathymetry (how deep the seafloor is under the ocean surface), heat flux (measures of thermal gradient, in practice, right below the seafloor), and even seismic waves, which is not bad at all. Let me explain. The trick to check the bathymetry fit is, work out the isostatic balance at the ridge versus somewhere far from the ridge. The way isostasy works, the topography above some slab of lithosphere depends on how dense and how thick the slab is.
You have to be careful about one thing, though: the density of what we decided to call lithosphere depends on temperature, and, as we’ve just learned, temperature changes quite a bit both with depth and with distance from the ridge. To sort this out, remember how we defined the volumetric coefficient of thermal expansion in Chap. 7 (the piece on Williamson-Adams, and Birch),

α = −(1/ρ)(dρ/dT),    (8.124)

which you can manipulate to give

αρ δT = −δρ.    (8.125)
The minute it starts freezing and cooling at the ridge, the entire slab has the same temperature and density as the asthenosphere, call them T0 and ρ0. As temperature then changes to a new value T, density must change, too, to a value ρ, and (8.125) says that

αρ0(T0 − T) = ρ − ρ0.    (8.126)

In our case, T means the temperature of the lithosphere: the function T = T(z, t) that we just came up with.
Fig. 8.34 Isostatic balance applied to the cooling/spreading seafloor model
So now let’s do the isostatic balance (and you might look at Fig. 8.34). Consider a “column” of lithosphere which has already been spreading away from the ridge for a time t. Call L its thickness and call w the depth of the ocean right on top of it. Take the “base” of the column, i.e. its horizontal cross-section, whatever you want to call it, to be of unit area, which keeps the calculations simple. Then, the weight of the column, incl. the water, is

w g ρw + g ∫₀ᴸ ρ(z, t) dz = wr g ρw + (L + w − wr) g ρ0,    (8.127)
where, you guessed it, ρw is the density of water, and g is gravitational acceleration (we take it to be approx. constant everywhere in the lithosphere). The right-hand side is the weight of the asthenosphere plus water at the ridge, where we take T and ρ, like I said, to be constant with depth, and wr is the depth of the top of the ridge. The equality holds because of the isostasy principle. After some algebra, (8.127) becomes

∫₀ᴸ ρ(z, t) dz = (w − wr)(ρ0 − ρw) + Lρ0,    (8.128)
which, if you consider that Lρ0 = ∫₀ᴸ ρ0 dz, boils down to

∫₀ᴸ [ρ(z, t) − ρ0] dz = (w − wr)(ρ0 − ρw),    (8.129)
and if you sub Eq. (8.126) into (8.129),

αρ0 ∫₀ᴸ [T0 − T(z, t)] dz = (w − wr)(ρ0 − ρw).    (8.130)
Now we want to check what this model has to say about bathymetry—and whether that fits bathymetry data: so then solve (8.130) for w − wr:

w − wr = [αρ0/(ρ0 − ρw)] ∫₀ᴸ [T0 − T(z, t)] dz.    (8.131)
T(z, t) at the right-hand side is given by (8.119): if we substitute that in, we get

w − wr = [αρ0 T0/(ρ0 − ρw)] ∫₀ᴸ [1 − erf(z/(2√(κt)))] dz
       = [2αρ0 T0 √(κt)/(ρ0 − ρw)] ∫₀^(L/(2√(κt))) [1 − erf(ξ)] dξ
       ≈ [2αρ0 T0 √(κt)/(ρ0 − ρw)] ∫₀^∞ [1 − erf(ξ)] dξ
       ≈ 2αρ0 T0 √(κt) / [√π (ρ0 − ρw)],    (8.132)
where the first step was just your typical change of variables; the second involved recalling that κ must be about 10 km²/Myr: so, as long as t is not too short, L/(2√(κt)) is above unity: and if you remember that erf(ξ) is pretty much constant and equal to 1 when ξ is above unity, you see that we’re not making that much of an error if we replace L/(2√(κt)) (as an integration limit) with ∞. Finally, via some extra math that maybe we can skip for now621, it is known that ∫₀^∞ [1 − erf(ξ)] dξ = 1/√π.
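If you want to check the last two steps of (8.132) numerically, here is a short sketch: it first verifies the ∫₀^∞[1 − erf(ξ)]dξ = 1/√π claim by brute-force quadrature, then evaluates the predicted subsidence w − wr with order-of-magnitude parameter values (α is an assumed round number; ρ0, ρw, T0 are the values discussed in the next section):

```python
import math

ALPHA = 3.0e-5          # thermal expansion, 1/degC (assumed round number)
RHO0 = 3300.0           # asthenosphere density, kg/m^3
RHOW = 1030.0           # sea water density, kg/m^3
T0 = 1450.0             # asthenosphere temperature, deg C
KAPPA = 8.0e-7          # thermal diffusivity, m^2/s
SECONDS_PER_MYR = 3.1536e13

# The integrand [1 - erf(xi)] decays very fast, so a left-endpoint Riemann
# sum out to xi = 6 is plenty to confirm the 1/sqrt(pi) (~0.5642) result.
n, hi = 60000, 6.0
dx = hi / n
integral = sum((1.0 - math.erf(i * dx)) * dx for i in range(n))
print(round(integral, 4), round(1.0 / math.sqrt(math.pi), 4))

def subsidence_m(age_myr):
    """Ocean deepening w - wr (in m) relative to the ridge: last line of Eq. (8.132)."""
    t = age_myr * SECONDS_PER_MYR
    return (2.0 * ALPHA * RHO0 * T0 * math.sqrt(KAPPA * t)
            / (math.sqrt(math.pi) * (RHO0 - RHOW)))

# 50-Myr-old seafloor comes out roughly 2.5 km deeper than the ridge crest,
# which is the right ballpark for real oceans.
print(round(subsidence_m(50.0)))
```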
8.17 The Half-Space Cooling Model Versus the Data

At the left-hand side of (8.132) we’ve got something that can be measured directly: the bathymetry wr at the ridge, and the bathymetry w at that distance from the ridge where lithosphere of age t is found. We have ways to estimate most parameters at the right-hand side (remember the values I gave above), or we can actually use (8.132) to find values of those parameters such that the match between the data at the left-hand side and the numbers we get at the right-hand side is as good as can be. In his textbook Fundamentals of Geophysics, where these things are derived more or less the way I’ve derived them here, Bill Lowrie says that, with values of α and κ that are similar, in order of magnitude, to the ones I gave you earlier, and with ρw = 1030 kg/m³ (sea water is slightly denser than pure water), the fit is best if the density and temperature of the asthenosphere are ρ0 = 3300 kg/m³ and
Fig. 8.35 The dotted curve is the depth of the ocean bottom calculated with Eq. (8.132), i.e., with the so-called half-space cooling model. The triangles are worldwide averages of bathymetry data. When t is large, the cooling model doesn’t fit the data. (Used with permission of Elsevier, from Jean-Claude Mareschal, Claude Jaupart, Catherine Phaneuf, Claire Perry, “Geoneutrinos and the energy budget of the Earth”, Journal of Geodynamics, vol. 54, 2012; permission conveyed through Copyright Clearance Center, Inc.)
T0 = 1450 ◦C. The curve at the right-hand side of (8.132), proportional to √t, is closest to the data when lithosphere is not so old, say, less than 80 Myr; it doesn’t work equally well for older lithosphere: see Fig. 8.35.
As for heat flux data: it’s not that hard to calculate what heat flux you’d expect to measure at the ocean floor, on the basis of the oceanic lithosphere model we’ve put together. Remember flux is given by Fourier’s law of heat conduction, i.e., Eq. (4.32), F(z) = −k dT(z)/dz, where k is specific conductivity. We’ve agreed that, in the oceanic lithosphere, T(z) is approximately given by Eq. (8.119): so then plug (8.119) into Fourier’s equation, which becomes

F(z, t) = −kT0 (d/dz) erf(z/(2√(κt)))
        = −(2kT0/√π) (d/dz) ∫₀^(z/(2√(κt))) e^(−u²) du    (8.133)
        = −(kT0/√(πκt)) e^(−z²/(4κt))

(which, if you are not convinced, remember Chap. 4: the definition of erf, Eq. (4.47), and then the algebra in Eq. (4.45), which is pretty much the same algebra that’s needed here). Now, z in (8.133) is depth below the bottom of the ocean (the top of the lithosphere), so the heat flux that one measures on the ocean floor should be

F(0, t) = −kT0/√(πκt),    (8.134)
Fig. 8.36 This diagram, borrowed from Turcotte and Schubert’s Geodynamics, shows “heat flow as a function of the age of the ocean floor. The data points are from sediment-covered regions of the Atlantic and Pacific Oceans”. The solid line is heat flow predicted by the half-space cooling model. (Used with permission of Cambridge University Press, from Turcotte and Schubert, Geodynamics, second edition, 2002; permission conveyed through Copyright Clearance Center, Inc.)
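The predicted curve in Fig. 8.36 is cheap to reproduce: here is a sketch evaluating Eq. (8.134) at a few ages, with the κ and T0 used above and an assumed typical conductivity k (the text doesn’t pin k down, so the numbers are order-of-magnitude only):

```python
import math

K = 3.0                 # thermal conductivity, W/(m K) (assumed typical value)
T0 = 1450.0             # asthenosphere temperature, deg C
KAPPA = 8.0e-7          # thermal diffusivity, m^2/s
SECONDS_PER_MYR = 3.1536e13

def heat_flux_mw(age_myr):
    """Magnitude of Eq. (8.134), heat flux through the ocean floor, in mW/m^2."""
    t = age_myr * SECONDS_PER_MYR
    return 1000.0 * K * T0 / math.sqrt(math.pi * KAPPA * t)

# Flux falls off as 1/sqrt(t): young seafloor gives out much more heat
# than old seafloor, as in Fig. 8.36.
for age in (5.0, 20.0, 50.0, 100.0):
    print(age, round(heat_flux_mw(age)))
```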
i.e., Eq. (8.133) with z set to 0. Same as above, the thickness of the lithosphere doesn’t really matter: it cancels out. Similar to ocean depth, which grows proportionally to the square root of t: as t grows, heat flux through the ocean floor decreases like the inverse of the square root of t. (It makes sense, because older lithosphere has already cooled some, and so has less heat left to give out.) It turns out that, if we pick the same values for the various parameters as we did in the bathymetry fit, this matches the data pretty well (see Fig. 8.36). Also same as bathymetry, the fit is best when the lithosphere is young, and not that good when the lithosphere is older. From both Figs. 8.35 and 8.36, we might say the half-space cooling model doesn’t really work anymore when the lithosphere is older than about 80 Myr: ocean floor in the HSCM keeps becoming deeper with increasing distance from the ridge, and the heat flux keeps decreasing, and but, like I said, that’s not what the ocean-bottom data (both bathymetry and flux) show. The reason why that happens, people think, is that there’s some heating from below (from mantle convection) that keeps the lithosphere from cooling beyond a certain temperature, and so keeps it from thickening, so that then it doesn’t subside, either; and the heat flux at the ocean floor doesn’t decrease anymore even as t grows. So people have come up with (similar, but) alternative models. One famous example of a model that doesn’t have this problem is the so-called “plate model”. In the plate model, the temperature is not controlled by the equation that governs the cooling of a half space, i.e. Eq. (8.119): instead, the heat diffusion equation, Eq. (4.39), is solved in the case where T is kept constant not only at the top of the lithosphere, where it might be set to about 0 ◦C (same as the HSCM), but also at some arbitrary
depth, which should correspond to the bottom of the lithosphere at a large distance from the ridge (where its depth is not supposed to change anymore). Down there, T is required to always coincide with the temperature T0 of the asthenosphere622. In Chap. 4 I only derived a formal solution of the conduction equation in the half-space case: just one boundary condition: at the top. This case is more complicated and I haven’t covered it. The solution is given e.g. by Don Turcotte and Gerry Schubert in sec. 4-17 of their famous textbook, Geodynamics, or in the much cited 1977 Journal of Geophysical Research (vol. 82) paper by Barry Parsons and John Sclater, “An Analysis of the Variation of Ocean Floor Bathymetry and Heat Flow with Age”. In both cases, the reader who needs to know how the solution is actually derived is told to go look in the old book by H. S. Carslaw and J. C. Jaeger, Conduction of Heat in Solids (page 100 of the 1959 edition). Parsons and Sclater show essentially that plate model and HSCM predict the same bathymetry and heat flux near the ridge, while the plate model fits the data much better than HSCM away from it. As for the depth of the lithosphere-asthenosphere boundary, that can’t really be defined clearly on the basis of either model. Incidentally, both HSCM and plate model were proposed around the same time at the end of the 1960s. As far as I can tell, the plate model first shows up in a 1967 Journal of Geophysical Research (vol. 72) article by Dan McKenzie, “Some Remarks on Heat Flow and Gravity Anomalies” (but the explanation of it in Parsons and Sclater, see above, is more complete, I think); and the HSCM is first found in Turcotte and Oxburgh, “Convection in a Mantle with Variable Physical Properties”, Journal of Geophysical Research, vol. 74, 1969: see their equations (2) through (4) and compare with Eq.
(8.132) and its derivation here (which is a bit simpler because I set the temperature of the ocean bottom at precisely 0 ◦C: presumably not too far from reality, anyway). Similar to Lowrie above, Parsons and Sclater (1977) actually do an “inversion” of the plate model, to find the values of the various parameters that fit the model best. “Inverting the data”, they write, “gives a plate thickness of 125 ± 10 km, a bottom boundary temperature of 1350 ± 275 ◦C, and a thermal expansion coefficient of (3.2 ± 1.1) × 10⁻⁵ ◦C⁻¹”. By “plate thickness”, let me insist on this point, they refer to the thickness of the lithosphere far from the ridge, after it has become constant. Finally, what can seismic waves tell us about the oceanic lithosphere? A lot of the info we have re the oceanic lithosphere turns out to come from observations of seismic surface waves. You’ve learned in Chap. 6 that surface waves are stuck “near” the surface of the earth. You’ve also learned that how near depends on their frequency: because a surface wave is made up of vibrations at all sorts of frequencies—of which the gravest (say, 0.01 Hz, or 100 s in period) “sample” hundreds of km deep below the surface, and the highest (say around 1 Hz) only concern the top few km. So you filter seismograms around the frequency you want, and depending on how low or high that is, you will be looking at a different depth. We’ll see shortly how this works in practice, and what exactly it tells you about the earth. There are two reasons why seismologists who want to map the structure of lithosphere and asthenosphere prefer to look at surface rather than body waves. The first reason concerns only the oceans; there are very few permanent seismic stations in
oceans, because it’s difficult to deploy stuff under thousands of meters of water, and there aren’t that many islands in oceans, either. So, what people knew in the 1960s about the so-called “seismic structure” under oceans came from seismic prospection à la Ewing: meaning, you blow something up in the water, and record the (purely compressional) waves that are reflected back up to the sensors that are towed by your ship, after being refracted through the crust and the top of the mantle. This can’t “sample” very deep, though, unless you set off some incredible amount of explosive. Now, long-period surface waves can sample deep, and plus, the depth to which they can sample doesn’t change with the distance between source and receiver: it’s the same all along the wave path. If there’s a big quake in Chile, for example, and you record its surface waves in Hawaii, you have info on upper-mantle structure everywhere along the great circle that connects the epicenter to your receiver in Hawaii: which is quite a large slice of the Pacific ocean. The other reason is that, in the words of James Dorman et al.623, “conventional body-wave studies of the upper mantle are difficult because the travel-time method fails for shallow shocks if the velocity decreases moderately with depth [...]. Surface-wave travel-time methods, on the other hand, encounter no particular difficulty in the presence of a low-velocity layer.” To understand what Dorman is saying here, you have to think that, yes, like we learned in Chap. 7, the speed of both P and S waves, above the earth’s core, almost always grows with increasing depth. That means that both P and S waves are almost always refracted in the same way and, in the first approximation, their ray paths look like the direct-P curve in Fig. 7.21. And that contributes to making things relatively easy.
But, it turns out that just below the lithosphere there’s a relatively thin layer, which Gutenberg first observed and called the low-velocity zone, or LVZ, and which seems to coincide with the asthenosphere, where the seismic velocities are actually lower than they are in the lithosphere—i.e., they decrease with increasing depth. It being so thin, it was OK to neglect it earlier, when we were looking at the global structure of the earth, from surface all the way to center. But now we are concerned with smaller-scale structure near the top, and so we need to be more careful. The sketch in Fig. 8.37, from Gutenberg and Richter’s 1939 paper in the Bulletin of the Seismological Society of America, shows what a low-velocity layer does to body-wave ray paths. “We assume two layers,” say Gutenberg and Richter, “within each of which the velocity is constant, being smaller in the lower layer; the layers are assumed to be separated by a discontinuity.” Then they illustrate the sketch in Fig. 8.37: “the focus [the earthquake source] F is at the surface. With decreasing angle of incidence, the rays reach the surface successively at A, B, and C. The ray to C is tangent to the discontinuity at the point D. From D the first ray through the discontinuity is refracted to E, and reaches the surface at G. With slightly decreasing angle of incidence, rays of type F H I J result, reaching the surface at points nearer than G. The minimum distance reached by these rays is [at J ]. Between this and C is a shadow zone. With still smaller angles of incidence the distance of emergence to the surface again increases.” Even without understanding everything that Gutenberg and Richter are saying here, two things should be clear. First: if you record at, like, J or G a seismic wave that’s been emitted by F, what you see will be a combination
Fig. 8.37 After Gutenberg and Richter (1939). The outer (arc of) circle represents the earth’s surface; the inner one is the lithosphere/asthenosphere boundary. Each of those two layers is taken to be homogeneous, so, within either layer, a seismic ray path is a straight line. But the paths are deflected (refracted) by the discontinuity (Chap. 7). F denotes the focus, or earthquake source—we take it to be at the surface, for the sake of simplicity. A, B, C, J and G are possible locations for the receivers—the instruments that record the quake. D is where a ray path coming out of F is tangent to the lithosphere/asthenosphere boundary. That ray path emerges back to the surface at C; after C, there is a shadow zone—an area where a receiver wouldn’t record body waves at all. Receivers deployed beyond the shadow zone, e.g. at J or G, record two refracted body wave arrivals each, each associated with its own ray path. If one’s goal is to use these data to make inferences re the structure of the earth, this complicates things: which is why, in this range of (short) epicentral distances, people often prefer to use surface waves. (Used with permission of Seismological Society of America, from Gutenberg and Richter, “New Evidence for a Change in Physical Conditions at Depths near 100 Kilometers”, The Bulletin of the Seismological Society of America, vol. 29, 1939; permission conveyed through Copyright Clearance Center, Inc.)
of signals that have traveled along different paths, arriving at your instruments one shortly after the other: it’s hard to tell which is which, and along which path each has travelled. Second: incidentally, the fact that the shadow zone is observed between C and J implies that there must be a reduction in seismic velocity below the lithosphere: and this is how, as early as 1926 (in a paper that, as far as I know, has only been published in German), Gutenberg first concluded that there must be an LVZ624. Anyway, so, sorry for the digression, but that’s why Dorman and co. and other seismologists started looking at surface waves, rather than body waves, to try and see the oceanic lithosphere: because the trajectories, the ray paths of body waves are messed up by the LVZ and so you don’t really easily know what range of depth a certain “arrival”, a certain peak on a seismogram is sensitive to. The paths of surface waves, on the other hand, are parallel to the earth’s surface anyway, and, as such, they can’t be messed up by the LVZ. What, then, are the “surface-wave travel-time methods” mentioned by Dorman, exactly? Remember Chap. 6: we’ve seen, there, that, if we agree that the crust and upper mantle are approximately just a pile of flat layers, you can calculate how the speed c of a surface wave changes as a function of its frequency ω: AKA the dispersion curve. (I’ve worked that out explicitly only for the case of Love waves propagating in a—very simple—half-space-plus-one-layer model; but I should have convinced you that you can follow more or less the same procedure to derive c = c(ω) for Rayleigh waves as well, and with as many layers as you wish625.) What’s done, now, is, on the one hand, surface-wave dispersion is observed in seismograms; on the other hand, says Dorman, “dispersion is computed for assumed models in an
attempt to fit observed dispersion”. It’s sort of the same principle as we’ve seen earlier in this chapter, when we’ve looked at how Bullard searched for the center of rotation of South America with respect to Africa: you make a guess for the values of the parameters you’re after—in this case, the thicknesses of layers and the values of λ, μ, ρ within layers—plug those into the surface-wave math in Chap. 6, and find numerically a curve of c = c(ω) (see Fig. 6.21), which you can compare with the data: and if your result “fits” the data, then that means that your guess for the values of the parameters is OK, and you have an acceptable model of earth structure. If it doesn’t, you make another guess, and keep going until you get what you consider to be a good fit. What people doing stuff like Dorman et al. saw under the ocean was: a thin (10 km or less) top layer of very low vS (about 3.5 km/s), which one naturally interprets as being the crust; below that, a relatively high-velocity layer (about 4.5 km/s), which can only be the (rest of the) lithosphere; below the lithosphere, the LVZ, with slightly lower vS (say, around 4.2 km/s? it depends on where you look, too—both where you look geographically, and where in the literature) than the lithosphere. Below the LVZ/asthenosphere, vS starts growing again. In the two papers they published in Science in 1974, Edgar Kausel and Leon Knopoff and his grad student Alan Leeds subdivide “the Pacific Basin into eight zones, each having a different value of lithospheric age [...]. The [...] eight regions represent a subdivision of the tectonic history of the upper mantle of the Pacific from ridge to trench.” For each of the eight regions they have a bunch of dispersion curves (Fig. 8.38), and they look for the layered model that fits those dispersion curves best. It turns out that the older the region’s age, the thicker its lithosphere (Fig. 8.39).
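The guess-compare-repeat procedure that Dorman describes can be sketched in a few lines of Python. The toy example below is not what Leeds et al. actually did (they fit Rayleigh-wave dispersion through multilayered models); it just forward-models fundamental-mode Love-wave dispersion for the simple layer-over-half-space case of Chap. 6, then grid-searches the layer thickness that best fits a set of “observed” phase velocities. All the numbers (velocities, densities, periods) are illustrative assumptions.

```python
import math

B1, B2 = 3.5, 4.5                        # shear velocities, km/s (layer, half-space)
MU1, MU2 = 2800 * B1**2, 3300 * B2**2    # rigidities, up to a common unit factor

def love_c(omega, h):
    """Fundamental-mode Love-wave phase velocity (km/s) for layer thickness h (km).

    Root of tan(omega*h*s1)*mu1*s1 = mu2*s2, the Chap. 6 dispersion relation,
    found by bisection on c in (B1, B2); for the periods used below the
    tangent's argument stays under pi/2, so the root is unique.
    """
    def f(c):
        s1 = math.sqrt(1/B1**2 - 1/c**2)
        s2 = math.sqrt(1/c**2 - 1/B2**2)
        return math.tan(omega * h * s1) * MU1 * s1 - MU2 * s2
    lo, hi = B1 + 1e-9, B2 - 1e-9        # f < 0 at lo, f > 0 at hi
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

periods = [20, 25, 30, 40, 50, 60]       # s
omegas = [2 * math.pi / T for T in periods]

h_true = 10.0                            # the "unknown" thickness, km
observed = [love_c(w, h_true) for w in omegas]

# Grid search: try many thicknesses, keep the one with smallest misfit.
best = min((sum((love_c(w, h) - d)**2 for w, d in zip(omegas, observed)), h)
           for h in [2 + 0.5 * k for k in range(37)])
print(best[1])   # recovers a thickness close to 10 km
```

In a real inversion the “observed” curve comes from filtered seismograms and the model has many layers, so the search is over many parameters; the logic, though, is exactly this guess-forward-model-compare loop.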
To see if that agrees with the half-space cooling model, one can plot Leeds’ depth estimates on top of a diagram like that of Fig. 8.32: you get Fig. 8.40: which shows some qualitative agreement, perhaps, but nothing to write home about626.
8.18 Subduction

So, anyway, oceanic lithosphere models may need some improvement, and we are not really sure of where exactly the asthenosphere ends, but at least one thing is clear: oceanic plates get colder with age, that is, with distance from the ridge: as they get colder their density goes up, and eventually they are not buoyant anymore, and, apparently, they start to sink into the mantle. Because, remember the Wadati-Benioff zones? which Griggs and company figured would be the downgoing branches of the earth’s convection cells? Maybe you also remember that Wadati-Benioff zones are systematically found below margins between oceans and continents? So, there you go, there’s oceanic and continental plates, and those downgoing slabs are nothing but old oceanic lithosphere, sinking. That makes sense, because “continental rocks”, says Dan McKenzie627, “contain more silica than those from the oceans; they are therefore less dense and cannot sink into the mantle. Thus continental rocks remain on the surface and are generally considerably older and more deformed than those
Fig. 8.38 Rayleigh-wave paths and dispersion curves of the Pacific ocean. (Used with permission of The American Association for the Advancement of Science, from Edgar G. Kausel, Alan R. Leeds, Leon Knopoff, “Variations of Rayleigh Wave Phase Velocities across the Pacific Ocean”, Science, vol. 186, 1974; permission conveyed through Copyright Clearance Center, Inc.)
Fig. 8.39 After Leeds et al. (1974): “Variation of upper mantle structure as a function of [the average] lithospheric age [in each region]. Depths are measured from the top of the lithosphere. Region numbers are shown beneath the age scale.” (Used with permission of The American Association for the Advancement of Science, from Alan R. Leeds, Leon Knopoff, Edgar G. Kausel, “Variations of Upper Mantle Structure under the Pacific Ocean”, Science, vol. 186, 1974; permission conveyed through Copyright Clearance Center, Inc.)
Fig. 8.40 Thickness of lithosphere from the half-space cooling model and from Leeds et al.’s Rayleigh-wave data. (Used with permission of Cambridge University Press, from Turcotte and Schubert, Geodynamics, second edition, 2002; permission conveyed through Copyright Clearance Center, Inc.)
from the ocean basins628 . However, the oceanic crust and upper mantle are [i.e., the oceanic lithosphere is] formed from the mantle below the plates. Therefore the mean chemical composition of the oceanic part of a plate is the same as that of the material below, from which it differs only by being colder. For this reason the oceanic plates are denser than the mantle below and can sink through it.” People have found ways to calculate how old oceanic lithosphere needs to be before it starts sinking. Reading about the half-space cooling model, you might have wondered why the cooling lithosphere doesn’t start sinking right away after
freezing—the moment it freezes, it is denser than the asthenosphere under it, and so... But the thing is, those models were quite simplified. The oceanic lithosphere, actually, is not just one layer that’s everywhere denser than the asthenosphere: but it’s at least three layers: there’s the oceanic crust, which is less dense than the asthenosphere; and if there is a crust, then there is also a “depleted oceanic mantle lithosphere”, which is depleted in whatever stuff you took away from it to make the crust; and below that, finally, there’s the mantle lithosphere proper629,630 . You might be wondering what on earth I am talking about, so here’s an explanation, which I’ll try to keep as short as I can: think of a chunk of peridotite rising, by convection, towards the earth’s surface below a mid-ocean ridge. Above the asthenosphere the peridotite is fully solid631 . As the peridotite rises, pressure goes down; the ambient temperature goes down too, but remember, rock in the mantle isn’t very good at conducting heat, so before it has time to cool by conduction, the peridotite starts to melt632 , i.e., some combination of the minerals that form it—not all of them—melt. At that point, you can think of our peridotite parcel as “a sponge soaked up with liquid”633 , except that the liquid comes from the sponge itself: it’s formed of whatever materials, within the peridotite, melt first. The proportion of liquid grows as the sponge rises towards the surface, and, as that happens, the chemical and mineral composition of both liquid and sponge gradually change. Eventually what happens is that, while the liquid is still buoyant, the sponge isn’t buoyant anymore. So, the materials that make up the peridotite separate up. The stuff that’s become liquid is released upwards, while the solid remains stuck—you might call it residual, because that’s the residue that’s left after extracting the fluid. People call this “differentiation”, or “segregation”. 
The stuff that’s risen up eventually gets to a point where, even at low pressure, it’s too cold to stay liquid, and so it solidifies again and becomes the basalt that makes up the oceanic crust. The dry sponge that’s left behind forms a layer which is the “depleted oceanic mantle lithosphere” that I was telling you about634 . So, the oceanic lithosphere should be a three-layer thing, as in Fig. 8.41, with oceanic crust on top, depleted lithosphere in the middle, and normal lithosphere below635 . Back to my question: how much time before a slab of oceanic lithosphere is old and cold and dense enough to sink? The trick is to consider the buoyancy of the plate as a whole, i.e. the average density of the lithosphere including all three layers: which, if all densities and thicknesses are labeled as in the sketch of Fig. 8.41, reads

$$\frac{\rho_C h + \rho_D \delta + \rho_L (d - h - \delta)}{d}. \tag{8.135}$$
Now, remember that $\rho_L$ and $d$ grow with age as the lithosphere cools636 , while everything else stays approximately the same: so then (8.135) grows. At some point, $\rho_L$ and $d$ will be such that the average density of the oceanic lithosphere coincides with the density of the asthenosphere,

$$\frac{\rho_C h + \rho_D \delta + \rho_L (d - h - \delta)}{d} = \rho_A. \tag{8.136}$$
412
8 The Forces that Shape the Earth: Convection and Plates
Fig. 8.41 The oceanic lithosphere, in detail. ρC , ρ D , ρ L , ρ A are the mean densities of: the oceanic crust, the depleted mantle lithosphere layer immediately under the crust, the rest of the mantle lithosphere, the (sublithospheric mantle, or) asthenosphere, respectively. The total thickness of the lithosphere, incl. the crust, is d; h is the thickness of the crust and δ is the thickness of the depleted layer
Then, the age of this chunk of lithosphere is just about right for it to still be buoyant; but wait just a bit longer, $\rho_L$ and/or $d$ grow slightly, so does the left-hand side of (8.136), and the chunk starts to go down. To find its age when this happens, first you solve (8.136) for $d$,

$$d = \frac{h(\rho_L - \rho_C) + \delta(\rho_L - \rho_D)}{\rho_L - \rho_A}, \tag{8.137}$$

then you remember Eq. (8.123), which says that the thickness of oceanic lithosphere equals a factor times the square root of its age, and the factor depends on the temperature at the base of the lithosphere. Let’s call that proportionality factor $\gamma$. So, then, $d = \gamma\sqrt{t}$, and

$$t = \left[ \frac{h(\rho_L - \rho_C) + \delta(\rho_L - \rho_D)}{\gamma(\rho_L - \rho_A)} \right]^2. \tag{8.138}$$

One can substitute numbers637 into (8.138): we’ve got estimates for all the quantities involved, from seismic prospection, from seismic surface waves, and from analyzing samples in the lab: and it turns out that the age of so-called neutral buoyancy is somewhere between 20 and 30 Myr. That’s pretty small, compared to the average age of oceanic lithosphere near convergent margins—i.e., at the onset of subduction—like we observe it in the real world from, e.g., paleomagnetic data: which is more like 100 Myr. But the thing is, the fact that a chunk of lithosphere isn’t buoyant doesn’t mean it’s going to sink right away: just like, remember how the Rayleigh number works, the fact that a hotter-than-average chunk of mantle is buoyant doesn’t mean it immediately rises. There’s other factors involved, like viscosity, mostly, and I am not going to get into that, but you shouldn’t be too surprised that some significant amount of geologic time needs to pass before the sinking begins.
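To make the arithmetic of (8.137)–(8.138) concrete, here is a quick numerical sketch in Python. The densities, thicknesses, and the half-space-cooling factor γ below are illustrative round numbers of my own choosing—not values taken from this book or endorsed by any particular study—picked only to show that plausible inputs do land in the 20–30 Myr range quoted above:

```python
import math

# Illustrative values (my own assumptions, not from the text):
# densities in kg/m^3, thicknesses in m.
rho_C, rho_D, rho_L, rho_A = 2900.0, 3300.0, 3330.0, 3270.0
h, delta = 7e3, 33e3          # crust and depleted-layer thicknesses

# Eq. (8.137): lithospheric thickness d at neutral buoyancy.
d = (h * (rho_L - rho_C) + delta * (rho_L - rho_D)) / (rho_L - rho_A)

# Half-space cooling gives d ~ 2.32 * sqrt(kappa * t); with thermal
# diffusivity kappa ~ 1e-6 m^2/s, the factor gamma = 2.32 * sqrt(kappa).
gamma = 2.32 * math.sqrt(1e-6)

# Eq. (8.138): age of neutral buoyancy, converted to Myr.
t_seconds = (d / gamma) ** 2
t_myr = t_seconds / (1e6 * 365.25 * 86400)
print(round(d / 1e3), "km,", round(t_myr), "Myr")   # 67 km, 26 Myr
```

Notice how sensitive the answer is to the small density contrast $\rho_L - \rho_A$ in the denominator of (8.137): changing it by a few tens of kg m⁻³ moves the age by tens of Myr, which is part of why the quoted range is as wide as it is.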
8.19 What Moves the Plates?
With that, we have the full story of what happens to a slab of oceanic lithosphere, from the moment it emerges at the ridge to when it finally sinks back into the mantle. The way I told you the story, I haven’t worried much about the so-called engine that drives plate motion, because, I guess, initially people were happy with the explanation given by Holmes, first, and then by Dietz and Hess: that plates “ride” the asthenosphere, i.e. friction (AKA viscous drag) between plates and asthenosphere makes sure that the plates go wherever mantle convection pulls them. The direction and speed of plate motions, then, would be the same as the direction and speed of flow in the asthenosphere: which is where the top, horizontal branches of mantle convection cells are found. But then people started to question whether things actually happen this way. There’s a bunch of papers written in the mid-1970s that attack the question of what “drives” plate tectonics, take, e.g., that of William Chapple and Terry Tullis, published in Journal of Geophysical Research, 1977: “What is causing the motion of the plates? [...] In one sense, any part of the circulation system [mantle convection] which allows motion to occur is causing plate motion, since all parts of the circulation system must function in order to have motion. As our views of the particular nature of thermal convection have become sharper, progressing from the views of Holmes [1928] and Griggs [1939] through those of Dietz [1961] and Hess [1962] to those represented by plate tectonics [...], the question of what forces drive various parts of the circulation has focused on the plates themselves. Thus a question which is frequently discussed is whether the shear stress on the bottom of the plates is in the sense to aid or to retard their motion.
This is basically a detail of exactly how the convection works: Are the plates merely passively riding along on the top of the horizontally moving limb of a convection cell (the plates are like coal on a conveyor belt), or are the plates the fundamental part of the top of the cell that drags along the material underneath (the plates are the conveyor belt, and the material under them merely the rollers)?” So then what people like Chapple and Tullis do, or Don Forsyth and Seiya Uyeda do, in a 1975 paper in the Geophysical Journal of the Royal Astronomical Society that’s cited in pretty much every geophysics textbook, what they do is, first, they identify all possible forces acting on a plate (the gravitational pull from the cold sinking end of the slab; the viscous drag from the mantle, etc.). We can’t just go and measure those forces, but we know that the system is approximately in equilibrium: we know from paleomag data, etc., that the velocities at which plates move are constant—zero acceleration. So then for each plate one can do a force balance, and set the inertia term to zero, and what’s left is an equation connecting all those forces that we don’t know. For each plate we have one such equation, and if we take them all together, we end up with a system of equations containing a bunch of unknowns. And the trick, then, is to solve that system, as a way to find numerical values for all those unknown forces. But I should explain this in some more detail.
Fig. 8.42 Forsyth and Uyeda’s catalogue of all forces that might act on a tectonic plate, based on their famous 1975 paper
I think it’s agreed, today, that the forces that could be exerted on a plate are those in Fig. 8.42, which is based on the work of Forsyth and Uyeda that I just mentioned. First of all, there’s (i) the mantle drag force FDF , i.e., the “friction”, or “viscous coupling” between a plate and the asthenosphere below it. In, e.g., Dietz’ and Hess’ initial, conveyor-belt model, the idea was that FDF was a driving force for plate tectonics, pulling oceanic plates away from mid-ocean ridges. But it might also be, see, e.g., Fig. 8.30, that flow in the asthenosphere, if there is any, does not contribute positively to plate motion: in which case FDF would be a resistive force. Forsyth and Uyeda are open to both possibilities. (ii) Oceanic plates are not the same thing as continental plates, and oceanic asthenosphere is not the same thing as continental asthenosphere; their rheologies differ. So, while it is likely that FDF is the same for all oceanic plates, or for all continental plates, there will be a difference between FDF under continents versus FDF under oceans. Forsyth and Uyeda actually call FDF the drag force acting on oceanic plates, and then introduce a correction FCD such that the drag on a continental plate is equal to FDF + FCD . (iii) “At the diverging boundary,” say Forsyth and Uyeda, “plates are pushed apart by way of gravitational sliding638 .” That’s the ridge push FRP . (iv) “At the transform fault boundary, there should be some resistive force [friction between the two plates, basically. I guess here “transform fault” refers to any strike-slip-type boundary: i.e., San Andreas is catalogued as a “transform” fault], which we call transform fault resistance, FTF .” (v) As for converging boundaries, “there is a negative buoyancy force acting on the downgoing slab part of the [...] oceanic plate.
This body force will pull the whole oceanic plate toward the trench and is called slab-pull, FSP .” (vi) And but “since the slab is plunging into the [mantle below the asthenosphere], where the viscous resistance may be much higher than in the asthenosphere [remember that the asthenosphere is very low-viscosity], the descending slab
may meet significant resistance”, which Forsyth and Uyeda call the slab resistance, FSR . (vii) At the convergent margin between “the two plates, their relative motion is resisted by a frictional force. This force, the colliding resistance, FCR is opposite in direction but identical in magnitude for the two plates because of the principle of action and reaction.” (viii) “For the continental or overthrust plate, another force, called suction, that pulls the plate toward the trench was proposed by Elsasser639 (1971). We denote it FSU .” This suction thing is not easy to understand, let alone explain. The best I can come up with is, subducting slabs in the mantle can “induce mantle circulation patterns that exert shear tractions at the base of nearby plates” (the quote is from a Nature paper by Clint Conrad and Carolina Lithgow-Bertelloni, 2002). Shear traction is really another name for viscous drag, but this is not related to the drag exerted by the rest of the mantle along the whole bottom of the plate: it is, rather, a local thing, limited to the vicinity of the trench. This way, the subducting portion of a plate indirectly sucks into the mantle also the so-called “overriding” plate (the plate on the other side of the same trench)640 . This is purely speculative, of course, but I don’t think there’s anything more convincing than this on the market, yet. Now all the forces are listed; and but Forsyth and Uyeda explain that it is not possible to tell a priori which is (or are) the most important in driving the plates: “For instance,” they say, “if the ridge push, FRP , is the only driving force, why does the Philippine Sea plate, which has no ridge on its boundary, move [...]? Similarly, the slab pull, FSP , cannot be the sole driving force because the plates on both sides of the mid-Atlantic Ridge are moving apart without being attached to any significant downgoing slab. 
Apparently, some combined effect of these forces is responsible for maintaining the plate motions and it is the intent of the present paper to decipher the relative importance of these possible forces.” And so, on with the force balance, like I was saying, which will give us a linear system of equations that we can solve to find the forces themselves. To do the force balance that we need to do, it’s a good idea to remember (see above: Bullard and all that) that a tectonic plate doesn’t move in a straight line, but rather rotates along the surface of the earth. So, for example, when we say, like we just did, that a tectonic plate moves with zero acceleration, what we really mean is that its angular acceleration, with respect to the axis around which it rotates, is zero. “Cartesian” acceleration isn’t zero; but we don’t have to worry about that if we do our force balance through the Eulerian, rotational version of Newton’s second law, i.e. Eq. (2.15). Equation (2.15), the way I derived it in Chap. 2, applies to the whole earth—V in (2.15) denotes the volume of the earth. The outer surface of the planet is, to a good approximation, free of stresses, which is the same as saying that there are no unbalanced surface forces acting on the earth: so I didn’t worry about surface forces in my derivation. But now I want an equation that works for a single tectonic plate, which you see that most of the forces in Fig. 8.42 are actually surface forces: so then
if I go back to the derivation of (2.15), the moment I introduce the integral in Eq. (2.8), I should consider that there are surface forces (which it’s convenient to express as forces-per-unit-area) FS as well as body forces (per unit volume) FB . Equation (2.15), then, should be replaced by
$$\int_V dV\, [\mathbf{r} \times \mathbf{F}_B(\mathbf{r})] + \int_{\partial V} dS\, [\mathbf{r} \times \mathbf{F}_S(\mathbf{r})] = \frac{d}{dt}\left(\mathbf{I} \cdot \boldsymbol{\omega}\right). \tag{8.139}$$
The inertia tensor $\mathbf{I}$ does change over time, because we’ve seen that the distribution of mass within the earth changes over time (erosion, post-glacial rebound, mantle convection, you name it), but those changes are slow, and their contribution to the right-hand side is small compared to the torques we are dealing with right now: so
$$\int_V dV\, [\mathbf{r} \times \mathbf{F}_B(\mathbf{r})] + \int_{\partial V} dS\, [\mathbf{r} \times \mathbf{F}_S(\mathbf{r})] = \mathbf{I} \cdot \frac{d\boldsymbol{\omega}}{dt}; \tag{8.140}$$
now, at the right-hand side you’re left with the time-derivative of the plate’s angular velocity $\boldsymbol{\omega}$: which is the same as saying, the plate’s angular acceleration: which I just told you that the starting point of Forsyth and co.’s reasoning is that that is equal to zero: from which it follows that

$$\int_V dV\, [\mathbf{r} \times \mathbf{F}_B(\mathbf{r})] + \int_{\partial V} dS\, [\mathbf{r} \times \mathbf{F}_S(\mathbf{r})] = 0. \tag{8.141}$$
Now, doing the force (torque) balance for a plate means that at each location r, within the plate or along its surface, we replace FB (r) and FS (r) with the sums of all body and surface forces, respectively, from Forysth and Uyeda’s list, that happen to be nonzero at that location. You swap sum and integration and end up with a sum of integrals,
$$\int_V dV\, [\mathbf{r} \times \boldsymbol{\mathcal{F}}_{SP}(\mathbf{r})] + \int_V dV\, [\mathbf{r} \times \boldsymbol{\mathcal{F}}_{RP}(\mathbf{r})] + \int_{\partial V} dS\, [\mathbf{r} \times \boldsymbol{\mathcal{F}}_{DF}(\mathbf{r})] + \dots = 0, \tag{8.142}$$

where $\boldsymbol{\mathcal{F}}_{SP}$ is the force-per-unit-volume version of FSP , $\boldsymbol{\mathcal{F}}_{DF}$ is the force-per-unit-area version of FDF , etc.; to keep this short, I am not explicitly writing down all the forces we’ve talked about. Then people simplify things as much as they can, replacing each term like $\mathbf{r} \times \boldsymbol{\mathcal{F}}_{SP}(\mathbf{r})$ with the product of some geometrical stuff, which depends on the directions of the driving or resistive force, and of plate motion, with a constant, unknown factor that describes the magnitude of the force (per unit volume, or unit area). Forsyth and Chapple do this in different ways (and later papers find some other ways to do it, too), and I’d rather not work out all the details; but one fundamental point which is the same for both is that the unknown factors depend on the nature
of the force (slab pull, ridge push, asthenosphere drag, etc.) but not on the plate, i.e. they are the same for all plates641 . There’s one vectorial Eq. (8.141) per plate: each is really three scalar, linear equations containing as many unknowns as there are forces acting on that plate. Working things out as I just outlined, and denoting $x_1, x_2, \dots, x_8$ the unknown factors (there’s eight of them because there’s eight forces in Forsyth and Uyeda’s list642 ), each scalar equation is reduced to the form

$$\sum_{j=1}^{8} A_{ij} x_j = 0, \tag{8.143}$$
where the coefficients $A_{ij}$ can be calculated knowing the geometry of the plates and of their motion. The index $i$ keeps track of which scalar equation we’re looking at—which plate, and which component of the torque equation. If you agree with both Forsyth and Chapple that the globe should be subdivided into twelve plates643 , then that means there’s 36 of those scalar equations—and you can think of $A_{ij}$ as the coefficients of a 36 × 8 matrix644 . The number of equations in the system is more than four times the number of unknowns, which means that the system is what people call an overdetermined system, or overdetermined linear inverse problem. An overdetermined problem in general does not have an exact solution. Basically, a system that has as many equations as unknowns, i.e., as many constraints as “degrees of freedom”, has got one solution; if you have more unknowns than you have equations, then there’s an infinity of possible solutions; and if you have more equations than unknowns, then it might well be that there’s no solution at all—unless some equations are duplicates of one another, that is645 . If the empirical constraints, see above, that we have re the plate motions were all perfectly coherent with one another, then yes, we should have an exact solution; but given that the coefficients in (8.143) were determined in a very, uhm, approximate fashion, we shouldn’t really expect that to happen in this case. So, there are no values of the components of $\mathbf{x}$ such that $\mathbf{A} \cdot \mathbf{x}$ is exactly $\mathbf{0}$; but an approximate solution can be found: we can look for the values of $x_1, x_2, \dots, x_8$ such that the magnitude of $\mathbf{A} \cdot \mathbf{x}$ is as small as it can possibly be646 . We can then plug those values back into (8.142), and evaluate the net contribution of each of the integrals that appear at its left-hand side, i.e. of each of the eight forces.
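One standard way of extracting the best approximate solution of an overdetermined homogeneous system like (8.143)—not necessarily the exact procedure of Forsyth and Uyeda—is to minimize the magnitude of A·x subject to |x| = 1, so as to exclude the useless trivial solution x = 0; the minimizer is then the right singular vector of A associated with its smallest singular value. A short sketch, with a random matrix standing in for the real 36 × 8 matrix of torque-balance coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((36, 8))   # random stand-in for the 36 x 8 matrix
                                   # of torque-balance coefficients A_ij

# Minimizing |A x| with no constraint gives x = 0; fixing the scale,
# |x| = 1, the minimizer is the right singular vector associated with
# the smallest singular value of A.
U, s, Vt = np.linalg.svd(A)
x = Vt[-1]                         # singular values in s are sorted in
                                   # decreasing order, so Vt[-1] is it

residual = np.linalg.norm(A @ x)   # equals s[-1], the smallest singular value
```

Note that any scalar multiple of x fits the system equally well, so an approach like this only constrains the relative magnitudes of the unknown forces—which is in fact all that Forsyth and Uyeda’s analysis claims to deliver.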
Forsyth and Chapple (and, I think, anyone who attacked the problem later, no matter how) agree that, overall, slab pull is the biggest contribution647 —it is of course 0 if the plate doesn’t have a subducting slab attached to it, but it’s one of the biggest forces whenever it does. The other important forces are suction, and colliding resistance. Everything else is much smaller, including ridge push and the viscous drag from the mantle. Which means, importantly, that you can forget the conveyor-belt story proposed by Holmes, Dietz and Hess as the general explanation for how convection drives plate tectonics. For some plates, viscous drag might still be a driving force, but, on the other hand, it appears to resist the motion of others; overall, the motion of plates doesn’t reflect the global pattern of convective motions at the top of the mantle.
8.20 Mountain Building

Now that you know all that there is to know about subduction, you are ready to learn about mountain building as well. It was important that we looked at subduction, first, because it is, uhm, very likely, to say the least, that the two are related. In 1970, John F. Dewey and John M. Bird published a paper called “Mountain Belts and the New Global Tectonics”648 , which stated that once we have a map of where plate boundaries are, and we know which are convergent and which aren’t, then we see right away that the biggest and youngest mountain ranges are along convergent plate boundaries; and from this simple fact they make the very important inference that it is, essentially, the energy of plates bumping into one another that pushes stuff up to make mountains. They wrote: “The emergence of the theory of lithosphere plate tectonics, the new global tectonics, for the first time provides a unifying worldwide explanation for tectonic processes. [...] “Young mountain belts and modern island arcs are associated with the highly seismic and volcanic belts adjacent to consuming plate margins. This association occurs at continent/ocean boundaries (e.g., the Andes), in island arcs within oceans (the New Hebrides), and where continental collision has occurred (the Himalayas) or is impending (the eastern Mediterranean). The close similarity of the features of these younger orogenic belts with older ones strongly indicates that similar processes have been involved in their development. “We take the view that plate tectonics is too powerful and viable a mechanism in explaining modern mountain belts to be disregarded in favor of ad hoc, nonactualistic models for ancient mountain belts, and that understanding of mountain belts can only come from a full integration of their features with observed sedimentary, volcanic, and tectonic processes of modern oceans and continental margins”, etc.
Dewey and Bird are saying that plate tectonics is the final answer to the old question of how mountains are formed. Remember how Werner and Hutton and Hall and Dana and Wegener, etc., all addressed it, over about a century and a half, and but none of them seemed to be able to come up with a totally convincing explanation? With plate tectonics, finally, we have one. And we shouldn’t think that plate tectonics only explains mountain ranges that are being formed now at current plate boundaries: older ranges, the ones that are now largely eroded, and whose folded layers go back to ancient eras, like, say, the Appalachians, were presumably formed at old plate boundaries that are now inactive. So if you understand how young mountain ranges are being formed now by the interaction of today’s plates, you understand how all mountain ranges have been formed in the history of the earth: good old uniformitarianism. Dewey and Bird are also saying that there’s different kinds of convergent plate boundaries. Basically, three: ocean-continent, continent-continent and ocean-ocean. And young mountain ranges are either along a line that goes from the Alps to the Himalaya—where there’s mostly continent-continent collision: Africa or Arabia
or India against Eurasia—or around the Pacific ocean, where there’s both ocean-continent and ocean-ocean collision. To begin to understand what happens when different kinds of plates collide, remember that the oceanic lithosphere is only buoyant while it’s young; continental lithosphere on the other hand, because of the minerals it’s made of, is always buoyant. At the margin between a continental and an oceanic plate, the oceanic plate will eventually subduct under the continental one, for the reasons we’ve seen. At the margin between two oceanic plates, the older one, which is less buoyant, will eventually subduct under the younger one. Either way, a deep-sea trench forms along the plate boundary, like Vening Meinesz and co. had observed, and Griggs explained with his experiments. There are always volcanoes along convergent margins where an oceanic plate subducts (whether under another ocean or a continent). The reason this happens is that subduction is not a clear-cut process where exactly the same stuff that had emerged at the mid-oceanic ridge is sucked back into the mantle: it is reasonable to think that the subducting plate manages to drag a certain amount of sedimentary material along, despite its buoyancy; it is also reasonable to think that, before subducting, lithospheric rock has become soaked up with water from the ocean. When the cold oceanic lithosphere dives back into the asthenosphere, chemical things happen that, besides oceanic lithosphere, involve water and sediments, and, without getting into too much detail, the result is a very viscous magma, whose composition is controlled by that of sediments, rather than by that of the oceanic lithosphere. Viscous magma, because it’s viscous, doesn’t flow that easily, which means that it solidifies before it can get very far from the volcano itself. This is how you get volcanoes that are not very wide, but are very high.
The subduction of an oceanic plate under another oceanic plate won’t make a major mountain range, but it will make volcanoes, and those will form an island arc, like the Aleutian islands, the Marianas, the Philippines, and the Japanese archipelago649 . The subduction of an oceanic plate under a continental one can make a mountain range, but doesn’t necessarily do so. Both plates are compressed as they collide into one another; because the oceanic plate is heavy and happy to sink into the mantle, horizontal compression is not as large as in the case of continent-continent collision, about which I’ll tell you in a second, but still there is some horizontal compression: and there are at least two major mountain ranges that exist because of ocean-continent subduction: the Rockies650 and the Andes. A continent-continent margin happens when an ocean closes down. Imagine a ridge that becomes inactive, so that no more oceanic lithosphere is formed, there. What’s left of the oceanic plate continues to cool and, once it’s cold and dense enough, to subduct, until the whole ocean is gone (see the drawing to the left in Fig. 8.43). So then the continents on both sides of that former ocean are pulled against one another and collide—the oceanic slab is still pulling, even though it’s totally sunken. But the thing is, like I just said, a slab of continental lithosphere, as a whole, is lighter than the same volume of asthenosphere material, and so it can’t sink. What probably happens is that crust and mantle lithosphere separate from one another—somehow
Fig. 8.43 Continental collision happens after an entire oceanic plate is “consumed” by subduction. The sketches show two continental plates before (left) and during (right) collision; on the left you also see the oceanic plate that initially separates them. You can probably guess what all the acronyms stand for, but anyway, CC is continental crust, S is sediment, LM is lithospheric mantle, AM is asthenospheric mantle, and OC is oceanic crust. Based on the cartoon by Hervé Martin, in “Earth Structure and Plate Tectonics; Basic Knowledge”, Advances in Astrobiology and Biogeophysics, vol. 1, 2005
the plate breaks apart; mantle lithosphere alone is not that buoyant, and might sink— continental subduction. The crust is buoyant, though: it can’t possibly sink, and as a result it folds. And if you ever go see, e.g., the Alps, you’ll see how crazy those folds are (for now, you can go back to the picture in Fig. 5.3). The result of all this is a mountain range, and the two macroscopic examples that we have in our era are the Alps and the Himalaya (and everything that’s geographically in between).
Chapter 9
Our Concept of the Earth
In this final chapter I am going to try to give you a relatively coherent picture of the interior of our planet, putting together everything we’ve learned so far in this book, plus some more recent stuff—mostly from the so-called “disciplines” of geodynamics, geochemistry and seismology. These days, geodynamicists come up with models of the pattern of convection in the earth’s mantle, which should help us to figure out where it is that the hot stuff rises, and the cold stuff sinks, etc. Geochemists try to extrapolate, from the data they collect at the earth’s surface, what the composition of the mantle could be like. Seismologists derive, from earthquake data, tomography maps of the deep earth: sort of like X-ray scans of the planet. And there are also the mineral physicists, who continue to do high-pressure experiments (Chap. 7), at higher and higher pressures. All these things—mantle circulation, composition, and seismic structure—are related to the temperature distribution inside the earth: and at this point in this book we have established a number of “facts” (as factual as facts might be when you are looking at the deep earth) about T at various depths. We have direct observations of the rate at which T grows with depth in the (shallower parts of the) crust. We have estimates of T at the base of the lithosphere: see the oceanic lithosphere piece in Chap. 8. Values of T at the depths where the phase of minerals is thought to change (the so-called “transition zone”) are constrained by laboratory studies à la Birch (Chap. 7).
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 L. Boschi, Our Concept of the Earth, https://doi.org/10.1007/978-3-031-71579-2_9
9.1 Extrapolating Temperature via the Adiabatic Gradient

From all I said in Chap. 8, we have a number of reasons to think that there is convection in the mantle: and we know that this implies that the temperature gradient, in the mantle, must be close to adiabatic651 : it was one of the conclusions of the Rayleigh-number section. It follows that, e.g., we can use the formula for the adiabatic gradient $\left(\frac{dT}{dr}\right)_S$, Eq. (7.142), to extrapolate T from the bottom of the lithosphere down to the transition zone: and then check that the extrapolated T that we get at various depths in the transition zone matches, more or less, the T of phase changes happening at those depths652 . If it does, we can be more confident that the ideas outlined so far in this book about the composition, etc., of the transition zone, and the dynamics of the mantle—i.e., that there is, indeed, convection—are OK. From the transition zone, then, we can also extrapolate further into the lower mantle and all the way down to the core. But let’s start from the upper mantle. At the bottom of the lithosphere/top of the asthenosphere we’ve got estimates of all the parameters involved in $\left(\frac{dT}{dr}\right)_S$. Like I said in Chap. 8, temperature, there, should be about 1600 °K, because that’s roughly the melting point of mantle rocks in laboratory studies, and we know that there is at least some melting going on in the asthenosphere; from Chap. 7 we have an idea of the composition of rocks below the lithosphere, which means that we can measure, in the lab, their volumetric coefficient of thermal expansion653 $\alpha$ and their specific heat capacity at constant pressure654 $C_p$: on average, estimates are about $\alpha \approx 3 \times 10^{-5}$ °K⁻¹ and $C_p \approx 1$ kJ kg⁻¹ °K⁻¹. Plug it all into the adiabatic gradient formula, and

$$\left(\frac{dT}{dr}\right)_S = -\frac{\alpha g T}{C_p} \approx -0.5\ \mathrm{°K\,km^{-1}}, \tag{9.1}$$

where we take the acceleration of gravity $g \approx 10$ m s⁻², i.e., its value at the earth’s surface655 .
Now, if, like I was saying, I assume that the gradient stays constant all the way to the transition zone, then I can estimate T down to a distance r from the earth’s center by linear extrapolation,

$$T(r) \approx T(r_0) + \left(\frac{dT}{dr}\right)_S (r - r_0). \tag{9.2}$$
If, for instance, we take r to be the radius of the so-called 410 discontinuity, i.e., about 6000 km, and $r_0$ that of the bottom of the continental lithosphere (remember Fig. N.52 in Chap. 8), i.e., about 6200 km, we have $r - r_0 \approx -200$ km, and

$$T(r) \approx 1600\ \mathrm{°K} + 0.5\ \mathrm{°K\,km^{-1}} \times 200\ \mathrm{km} \approx 1700\ \mathrm{°K}. \tag{9.3}$$
This value, 1700 °K, is “close enough” to the temperature at which olivine turns into spinel (the 410-km phase transition), that mineral physicists find in their labs (see Chap. 7), if you consider the experimental uncertainty associated with those estimates. We can repeat this same exercise between the 410 and 520 km phase changes656 , and between the 520 and 660, and it all works, and then, from the 660, we can go into the lower mantle. You might have noticed that in the calculation we’ve just done it was assumed that $\alpha$ and $C_p$ do not change with growing pressure and temperature, or, which is the same, that they do not change with depth, and that it is OK to take some constant, average values for $\alpha$ and $C_p$, representative of the entire depth range that we are looking at. This was a reasonable approximation to make, because we only went from the bottom of the lithosphere to the top of the transition zone: about 200 km, which is not a lot. But the lower mantle is a couple thousand km thick. In fact, laboratory experiments tell us that the $C_p$ of the stuff we find in the mantle doesn’t change much, at mantle-like temperatures, even when you increase T and p a lot; but they also tell us that $\alpha$ does change quite a bit. The gravitational acceleration g also changes significantly from upper mantle to core. So what people do is, they use estimates for $\alpha$ as a function of r taken, once again, from lab experiments657 , see Fig. 9.1 for an example, and estimates of $g = g(r)$ like those we found in Chap. 7, the Williamson-Adams way, and then extrapolate small step by small step,

$$T(r - \delta r) \approx T(r) + \frac{\alpha(r)\, g(r)\, T(r)}{C_p}\, \delta r, \tag{9.4}$$
“iteratively”, sort of like in the Williamson-Adams thing. Look at Fig. 9.2 to see estimates of T down to the core-mantle boundary, obtained this way. At 2700 km we have T ≈ 2500 ◦ K. That’s roughly the depth where the perovskite-to-postperovskite
Fig. 9.1 Coefficient of thermal expansion α as a function of depth in the earth’s mantle, courtesy of Tomoo Katsura. The solid line is Tomo’s best estimate, and the dashed lines mark the interval within which he expects the true value to lie
Fig. 9.2 Tomoo Katsura’s estimates of T as a function of depth in the earth’s mantle, extrapolated via the adiabatic-gradient formula. Again, the solid line is the best estimate, and the dashed line tells you how wrong that might be
transition probably takes place, and that’s exactly the T that, given the phase change, is expected from, e.g., Hirose’s work (which see Chap. 7). There’s at least two more constraints on temperature near the core-mantle boundary. For one thing, laboratory experiments can give us the temperature at which mantle rocks melt at the pressures that are found at the CMB658; and but we know that the mantle at the CMB is solid, from which it follows that the mantle’s temperature at the CMB must be lower than the melting temperature: and so we have an upper bound, which different studies have found to be between 3500 ◦K and 4200 ◦K. Secondly, the CMB itself is not a phase boundary: yes, we’ve got solid stuff on one side and fluid stuff on the other, but they are not the same stuff. (Iron could still be liquid, I guess, at lowermost-mantle temperature, and/or the phase of lowermost-mantle rock would not change, if one raised its temperature and pressure to outer-core values.) So, then, mineral physics doesn’t help us to constrain T at the CMB directly. On the other hand, we know from Chap. 7 that the boundary between fluid outer core and solid inner core (which people call the “inner-core boundary”, AKA the “ICB”) is totally a phase boundary: it is there because, as pressure grows, fluid iron becomes solid659. And so mineral physics does give us an estimate of T at that boundary. We also know fairly well, from seismology (again Chap. 7), the depth of the inner-core boundary. And we know that the outer core is convecting (remember the piece on the earth’s magnetic field and “dynamo” in Chap. 8), which means that the temperature gradient in the outer core is adiabatic: and so we can come up with a quantitative estimate for that gradient, and use it to extrapolate T from the ICB to the CMB: just like we did in the mantle via Eq. (9.4), but in the upwards direction.
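To make the small-step extrapolation of Eq. (9.4) concrete, here is a minimal Python sketch. For simplicity it holds α, g and Cp constant (my assumed round numbers, roughly right for the upper mantle), whereas a real calculation would use the α(r) of Fig. 9.1 and the g(r) of Chap. 7.

```python
def adiabat(T0, z0, z1, alpha=3e-5, g=10.0, Cp=1000.0, dz=1e3):
    """Step Eq. (9.4) downward: dT = (alpha * g * T / Cp) * dz.
    Depths z0, z1 and step dz in meters, temperatures in K;
    alpha (1/K), g (m/s^2) and Cp (J/(kg K)) held constant for simplicity."""
    T = T0
    for _ in range(int((z1 - z0) / dz)):
        T += alpha * g * T / Cp * dz
    return T

# base of the lithosphere (~200 km deep, ~1600 K) down to the 410:
print(round(adiabat(1600.0, 200e3, 410e3)))
```

With these numbers the gradient near the top is αgT/Cp ≈ 0.5 ◦K/km, and the result lands near the 1700 ◦K of Eq. (9.3).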
This all sounds very nice, except the core is probably not made of pure iron, and we are not sure what its exact composition is, and this leads to quite a bit of uncertainty in our estimates of T at the ICB660 and, consequently, at the CMB as well. Bottom line, we are talking about 3800–4600 ◦K at the core-mantle boundary.
Fig. 9.3 Rough sketch of how the temperature profile of the earth’s lithosphere and mantle might look. At the planet’s outer surface T is about 0 ◦C, or 273 ◦K. Temperatures at the 410 and 660 are about 1700 ◦K and 1950 ◦K, as per Chap. 7 and Fig. 9.2. Values of T below the 410 and down to 200 km above the CMB (2700 km depth) are the same as in Fig. 9.2. The lithosphere (0–200 km depth) is a thermal-boundary layer: within the TBL, T as a function of depth looks like erf(x) when x is close to zero. Temperature at the CMB (estimated based on what we know of the earth’s core) is much higher than what we get by extrapolation from above, but it is likely that another TBL exists between there and the convecting part of the mantle: here I took that TBL to be as thick as the top one, and if T = 4000 ◦K at the CMB (and it’s unlikely to be much lower than that), then the temperature extents of the two TBLs are comparable
Let’s put together all the info we have, and draw a sketch of what the temperature profile could be between the earth’s surface and CMB: this is what you see in Fig. 9.3. In the lithosphere, heat transfer happens purely by conduction, and so the temperature profile versus depth is error-function-like (Chap. 4). People call this a thermal boundary layer (AKA, a “TBL”): you tend to have at least one of these every time you have convection. Starting at the base of the lithosphere, we might take Katsura’s estimates from Fig. 9.2, all the way to near the CMB. There, even if we take into account all the uncertainty, there clearly is a discrepancy between CMB temperatures found by extrapolation from the mantle, versus from the core. The way the discrepancy is explained away is, we make the hypothesis that another thermal boundary layer should exist at the bottom of the mantle: i.e., again, heat transfer near the CMB is purely conductive, and described mathematically by the error-function solution of Chap. 4: except that this time the T of the reservoir (the core) is higher than that of the half space (the mantle). It’s reasonable to take this TBL to be about as thick as the other661, and if T = 4000 ◦K at the CMB (and it’s unlikely to be much lower than that), then the temperature extents of the two TBLs are comparable. Anyway, so, how this came to be is, I think, a good question. At the earth’s surface we started out, some billion years ago, with a very hot planet radiating energy into space, a relatively cold atmosphere must have formed, etc.: Kelvin’s story makes sense. But how did a temperature jump ever arise in the middle of the planet? I don’t think people agree on the details, yet, but whichever way it happened, it must have
to do with the heavier stuff—mostly iron—separating and collapsing towards the center of the planet to form the core; the lighter stuff—the rocks—turns out to be a very poor conductor of heat, as you know from Chap. 7. So what might happen is that the rocks near the CMB get hotter and hotter as they receive heat by conduction from the core, and but then conduction within the mantle is so slow that it takes forever for heat to be conducted upwards towards the surface. And even convection isn’t enough to rid the CMB of all the heat. So, temperature just above the CMB rises: and a bottom TBL arises. Anyway, after all this, I hope you’ll agree that the idea that the mantle is convecting is, to say the least, not invalidated by the info we have on temperature in the deep earth. People tend to think that there is convection in the earth’s mantle, and quite a lot of research has been done, and is being done, to figure out the details. Like I said, geodynamicists, geochemists and seismologists are all contributing to this, each discipline its own way.
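The erf-shaped conduction profile of a thermal boundary layer (the half-space cooling solution of Chap. 4) is easy to evaluate numerically. In this sketch the surface and interior temperatures, the diffusivity, and the 100-million-year “age” are illustrative assumptions of mine, just to show the shape of the curve.

```python
import math

def T_halfspace(z, t, Ts=273.0, Tm=1600.0, kappa=1e-6):
    """Conductive cooling of a half space (Chap. 4):
    T(z, t) = Ts + (Tm - Ts) * erf(z / (2 sqrt(kappa t))),
    with depth z in meters, time t in seconds, diffusivity kappa in m^2/s."""
    return Ts + (Tm - Ts) * math.erf(z / (2.0 * math.sqrt(kappa * t)))

t = 100e6 * 3.15e7  # ~100 million years, in seconds
for z_km in (0, 50, 100, 200):
    print(z_km, "km:", round(T_halfspace(z_km * 1e3, t)), "K")
```

By 200 km depth, T is already within about 1% of the interior value: which is why, in Fig. 9.3, the profile flattens onto the adiabat below the lithosphere.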
9.2 Geochemical Models of Mantle Convection

I am going to tell you, first, about the geochemistry. My Ph.D. thesis was, to a large extent, about the pattern of mantle convection, the question being, in a nutshell, does stuff circulate across the entire mantle from core to crust? or are there, like, layers of convective cells, not mixing with one another? Early in my grad school career a paper came out in Nature by Albrecht Hofmann662,663, reviewing all the information that we can get from geochemistry about the structure of the earth. Back then I had a really hard time with geochemistry664, and I guess I understood, uhm, part of that paper. But it was clear to me that that was the place to look for everything geochemical that could be relevant to my thesis; and it still gets cited quite a lot: so I’ll use it as a guide for what I am going to tell you next. Hofmann explains that “geochemists began to study the composition and evolution of the earth’s mantle in the 1960s, when it became clear that plate tectonics is driven by [...] mantle convection, which carries deep-mantle material upwards until it begins to melt [...]. This melt rises to the surface where it extrudes as basaltic lava and delivers a chemical ‘message’ from the mantle to the geochemist.” To understand how to read the message, you need to know what incompatible elements are. An “incompatible element”, says Hofmann, is a “chemical element that is excluded from solid minerals of the upper mantle and therefore preferentially enters an available melt phase.” So, in practice, every time something in the upper mantle melts partially, incompatible elements tend to go with the melt. Any basalt, for example, had once been molten, so it carries within itself the incompatible elements that used to be in the mother rock, and but left the mother rock at the moment of melting.
So the idea is, we hypothesize that in the beginning (post separation of mantle and core, i.e., the iron and all the very heavy stuff has already collapsed towards the center of the earth) there be only what we might call primitive mantle rock; that at least some of it partially melts, and separates into (i) a melt “enriched” in incompatible elements,
and (ii) the part that doesn’t melt, which is “depleted” in incompatible elements. The melt is buoyant and rises to the surface to become the crust; the depleted rock remains in the mantle. Primitive mantle can’t be sampled from where we are, but it can be defined theoretically: “Primitive mantle”, says Hofmann, is the “silicate portion of the earth as it existed after separation of the core but before it was differentiated into crust and present-day mantle. [...] The composition of the primitive mantle is inferred from the composition of the stony meteorites known as chondrites.” So, I guess, more or less the composition of chondrites minus all the iron, like we figured in Chap. 7. In Fig. 9.4 you see the concentration of various elements in continental and oceanic crust; there’s two kinds of oceanic crust: mid-ocean-ridge basalt, MORB in short, and ocean-island basalt, or OIB. MORB, as its name says, is the crust that’s formed at ridges, i.e., the bulk of oceanic crust; OIB is what you find in the volcanic islands that are scattered across the oceans, like Hawaii, or Tristan da Cunha in the southern Atlantic ocean, and Tubuai in French Polynesia, which is where Hofmann took data from. People soon found out that the composition of OIB is quite different from that of the MORB that lies around it. The first thing to notice in the curves of Fig. 9.4 is that, relative to primitive mantle, all types of crust (MORB, OIB, continental) are enriched in incompatible
Fig. 9.4 After Hofmann (1997): concentration of various elements, sorted according to their degree of “compatibility” (most incompatible elements to the left), in continental crust, MORB, and three different OIBs (see the legend). Each concentration is given relative to (i.e., is “normalized” to) the “primitive-mantle” concentration of the same element. (Used with permission of Springer Nature, from A. W. Hofmann, “Mantle geochemistry: the message from oceanic volcanism”, Nature, vol. 385, 1997; permission conveyed through Copyright Clearance Center, Inc)
elements: i.e., all curves in Fig. 9.4 grow from right to left. Like I was saying, this is “because they originate as melts,” writes Hofmann, “which concentrate incompatible elements from the mantle”. If you look more closely, you’ll also see that, overall, the most incompatible elements (rubidium, barium, thorium,...) are much less present in the MORB/oceanic crust than they are in the continental crust: that’s because the continental crust has differentiated before the oceanic crust: which means that the oceanic crust has formed from the mantle after the mantle had already lost quite a bit of its incompatible stuff. And if you look even more closely, you’ll see that a few incompatible elements like, e.g., niobium (Nb), that are relatively rare in the continental crust, are relatively abundant in MORB—and vice versa for elements that are relatively abundant in the continental crust. Kind of like an anti-correlation between the anomalies in the MORB and continental-crust curves. This fits the “differentiation” story, too: the idea being that, for whatever reason, not a lot of niobium went into the continental crust back then, which means that a relatively large amount of it stayed in the mantle, which means that the MORB that gets created now originates from a relatively niobium-rich melt. The OIBs, on the other hand, “are much more enriched in incompatible elements than MORB”. They are about as enriched as the continental crust—but the continental crust is supposed to have separated early, from a very primitive mantle (with all the incompatible stuff still in it). So, the fact that OIB is enriched doesn’t agree very well with the rest of the story. 
So, then, “in the 1970s,” says Hofmann, “geochemists developed the idea of a chemically layered mantle, with an upper depleted and a lower undepleted (also called primitive) layer.” MORB, which, as we’ve just seen, is relatively depleted in the most incompatible elements, must come from a more depleted layer, while OIB must come from a less depleted one. Now, if you look at the seafloor near Hawaii, you’ll find a chain of older seamounts, forming a straight line, south-east to north-west. The way they look, those seamounts are the remains of old volcanoes: i.e., the volcanoes that are now Hawaii have moved, from the north-west to the south-east, relative to the Pacific plate. But actually, given that we know, now, that plates move, and given that, at Hawaii, the Pacific plate moves south-east to north-west, it makes more sense to think of the source of Hawaiian volcanism as something that stands still, while the plate moves on top of it665,666 . (Hawaii is not the only place where something like this is seen. There’s also, e.g., the Louisville seamount chain, with the Louisville hotspot667 , in the southern Pacific.) The reason I am bringing this up now is that this is what made people think that the stuff that gets erupted at ocean islands has its source in some deep and stable part of the mantle—not in the uppermost mantle, which is being moved around horizontally by convection. (Remember that, initially, people even thought that the upper mantle was kind of like a conveyor belt on which the plates are passively sitting.) Tuzo Wilson was the first, I think, to propose that a narrow pile (“plume”) of hot buoyant stuff could rise from deep in the mantle all the way to Hawaii668 . Now, as we know from Chap. 7, it was pretty much established by the 1970s that there is a major change in the mineralogical composition of the mantle across the so-called 660—the transition between upper and lower mantle. People put all this evidence together and came up
Fig. 9.5 Some examples of the “proliferation of speculative models” in geochemistry. (Used with permission of Springer Nature, from A. W. Hofmann, “Mantle Geochemistry: The Message from Oceanic Volcanism”, Nature, vol. 385, 1997; permission conveyed through Copyright Clearance Center, Inc)
with what Hofmann calls the “standard” geochemical model of mantle convection, which see Fig. 9.5a. MORB must come from the upper mantle, OIB from the lower mantle, and upper and lower mantle are both convecting and are kept separate by, e.g., an intrinsic compositional density contrast at 660 km: i.e., composition of the upper mantle is such that it’s too light to ever sink into the lower mantle. The thing is, though, if you look carefully at Fig. 9.4, you’ll see that the composition of OIBs changes quite a lot from one ocean island to the other669 . So much so that, Hofmann figures, the OIBs “could not be derived from [one] chemically uniform, primitive layer of the mantle”. As more and more data of this kind670 were collected, all showing important variations in OIB composition, the “standard model”, says Hofmann, eventually became “untenable”. Hence “a proliferation of speculative models”, which see Fig. 9.5. In some of them we still have “layered” convection like in the old standard model, except that now “plumes originate from the base671 of the upper layer but may entrain material ‘leaking’ from the lower layer” (Fig. 9.5b). “Other models are based on ‘whole-mantle’ convection” (Fig. 9.5c); and but “recently, hybrid models have been developed in which the two, normally separated convecting layers are intermittently perturbed when sinking pieces of subducted lithosphere and rising plumes occasionally penetrate the 660-km boundary” (Fig. 9.5d). One way to figure out which models work and which don’t is to calculate approximately how much of today’s mantle needs to be primitive, based, essentially, on
the composition of the primitive mantle, and on how depleted the current depleted mantle is, and how enriched the continental crust. The way you do it is, take the concentrations of some incompatible element in the continental crust, depleted mantle (which MORB comes from) and in the primitive, chondrite-minus-iron, earth: call them ic, im, ip; and then if Mc is the mass of the continental crust, and Mm the mass of the depleted part of the mantle,

(Mc + Mm) ip = Mc ic + Mm im,    (9.5)
where ic, etc., can be given in moles per unit mass. Mc + Mm is the mass of primitive mantle (could be less than its total, initial mass) that separated to form either crust or depleted mantle: so all (9.5) is saying is that the amount of our isotope that was initially part of that mass went either into the continental crust or into the depleted mantle: none was created, or destroyed: conservation of mass, basically. Equation (9.5) holds as long as the separation of continental crust from the mantle was (geologically) fast—and the concentrations ic, etc., are those of when the separation took place—not those of today. This could be a problem, in principle, because we are dealing with radioactive stuff and so concentrations will change, even if depleted mantle and continental crust are chemically isolated; but there are isotopes of incompatible elements whose half life is so long that, to a good approximation, we can consider their concentration as we observe it today to be the same as their concentration multiple billion years ago: and those are the isotopes that people use in this kind of calculation: for example ⁸⁷Sr (daughter of ⁸⁷Rb, see Chap. 8) and ¹⁴³Nd, or neodymium-143 (daughter of ¹⁴⁷Sm, samarium-147): their half lives are about 50 and about 100 billion years, respectively. That is, like, more than ten and more than twenty times the estimated age of the whole planet. So, now, ic, im, ip can all be measured; Mc can be estimated, too: and you can solve (9.5) to find the value of Mm. There’s a bunch of different elements that you can use, though, besides Sr and Nd, and, depending on which you use, you get quite different values of Mm. “Initial results based on Nd isotopes”, says Hofmann, “indicated depleted-reservoir sizes between 25 and 30% of the mantle. This corresponds to the mass of the mantle above the 660-km discontinuity and is thus consistent with the layered mantle model”.
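Solving Eq. (9.5) for Mm takes one line of algebra: Mm = Mc (ic − ip)/(ip − im). Here is a sketch; the concentrations are normalized to primitive mantle (so ip = 1), and the 50× crustal enrichment, the 0.2× depletion, and the two masses are my illustrative round numbers, not Hofmann’s.

```python
def depleted_mantle_mass(Mc, ic, im, ip):
    """Solve the mass balance (Mc + Mm) * ip = Mc * ic + Mm * im,
    i.e. Eq. (9.5), for the mass Mm of the depleted mantle."""
    return Mc * (ic - ip) / (ip - im)

Mc = 2.1e22        # mass of the continental crust, kg (rough)
M_mantle = 4.0e24  # total mass of the mantle, kg (rough)

Mm = depleted_mantle_mass(Mc, ic=50.0, im=0.2, ip=1.0)
print(Mm / M_mantle)  # fraction of the mantle that is depleted
```

With these inputs the depleted reservoir comes out at roughly 30% of the mantle: about the size of the mantle above the 660, as in Hofmann’s “initial results”.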
But then other studies found all sorts of different sizes, “between 30 and 90% [...]. Perhaps the best-constrained mass balance is that for radiogenic argon”, from which Hofmann finds that Mm is about half the total mass of the mantle. That’s still a lot of primitive mantle left, but not as much as the entire lower mantle. Besides models where OIB composition is explained by “reservoirs” of primitive stuff in the mantle, Hofmann proposes a uniformitarian (his word) model, based on the idea that a certain amount of crustal material is recycled into the mantle, via subduction, and carried back up to the crust via plumes, contributing to making OIB: then it gets recycled, again: and so on and so forth. “Subduction of oceanic lithosphere”, says Hofmann, “continuously recycles oceanic crust and thus injects about 20 km³ of enriched material per year into the mantle. This is the uniformitarian basis of” a recycling model where, shortly after subduction, subducted oceanic crust
is transformed into higher-pressure phases which are several percent denser than the average mantle. “The assumed segregation of the dense former oceanic crust from its lithospheric base and the main mantle mass is inhibited at normal mantle viscosities, but is enhanced in the hot boundary layer at the base of the convection system (660 or 2900 km). Storage and isotopic evolution of some of this material for periods of 1–2 [billion years], followed ultimately by incorporation into plumes rising from this boundary layer, could generate” the entire range of observed compositions of OIB. Whether they are based on segregation of primitive mantle, or recycling of the crust, there’s two more “geochemical” constraints that circulation models must verify. First: the whole idea of Hofmann’s reservoirs is that they are, at least to a good approximation, closed geochemical systems: meaning the radioactive and radiogenic stuff that’s in them when they separate stays there—they are reservoirs, and don’t mix with the rest of the mantle. It follows that, looking at an OIB, you can determine, with the radioactive-dating approach (which see the beginning of Chap. 8), when the presumed reservoir that it came from separated from the rest of the mantle and became a reservoir. And it turns out that some OIBs definitely come from a reservoir of really primitive mantle, that must have been isolated since the birth of the planet; and but other OIBs come from more recent reservoirs that are more likely to be formed by recycling of crust. Hofmann concludes “that much of the chemical heterogeneity evident in oceanic basalts is caused by recycling of oceanic, and to a much lesser extent continental, crustal material, which resurfaces predominantly in the ‘mantle plumes’ that create volcanic islands.” Second: the earth’s heat budget, which we talked about when discussing Arthur Holmes’ contributions, after the discovery of radioactivity. (Which was at the beginning of Chap. 8.)
Remember that we can more or less measure how much heat per unit time the earth is releasing, and we can estimate how much heat is produced by radioactive decay within the crust, whose composition we know with reasonable accuracy. So then the difference between those two quantities is how much heat is released by mantle plus core. For example Louise Kellogg et al.672 explain that “of the 44 TW of the present-day heat flux out of Earth, 6 TW is generated within the crust by radioactive decay of U, Th, and K, [so that the remaining] 38 TW must be provided either by [radioactive] generation of heat within the mantle and core or by cooling of the planet.” From, e.g., the chondrite model (remember Chap. 7), we have an idea of how much radioactivity we can expect to have in the planet as a whole—and so, if we subtract from that the 6 TW that come from the crust, we have an estimate of how much radioactivity is to be found in the mantle-and-core. From the theory of Kelvin (Chap. 4) we can place some constraints on how much heat can be produced by cooling (we know, now, the age of the earth; we know that temperature in the past should never have risen above some extreme values...). Now, “geochemical analyses of basalts,” say Kellogg et al., “show that the source region of MORBs is depleted in heat production by a factor of 5–10 relative to a chondritic silicate value. Thus, if the MORB source region made up most of the mantle, the mantle heat production would be only 2–6 TW, comparable to that of the crust. Matching the observed heat flux would require rapid cooling of the planet by, on average, 175 ◦K per 10⁹ years, which requires excessive internal temperatures
during the Archaean”, etc. In other words: judging from MORB samples, the upper mantle doesn’t produce much radioactive heat: and so, to account for the release of heat that we see at the planet’s surface per unit time, we need the primitive-mantle reservoir to carry enough radioactive, heat-producing elements, and be big enough, to provide, well, we don’t know how many TW really673 , but definitely more than a couple. Bottom line, we all pretty much agree that, to explain the variability in OIB composition, you need to have some chemical heterogeneity in the mantle, more complex than just a couple of layers. Other than this, it gets quite complicated, as you see, and I don’t think there’s consensus, yet, on why exactly OIBs are the way they are: but these are the ideas that are around.
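Kellogg et al.’s budget can be redone on the back of an envelope. Only the 44 and 6 TW and the 5–10× depletion factor come from the text quoted above; the 20 TW chondritic mantle production, and the mantle mass and heat capacity used to turn the shortfall into a cooling rate, are rough values I am assuming here.

```python
SECONDS_PER_GYR = 3.15e16

total_flux = 44e12  # W: observed heat flux out of the earth
crust = 6e12        # W: generated by crustal radioactivity
mantle_and_core = total_flux - crust   # 38 TW left to account for

# if the whole mantle were like the depleted MORB source, i.e. 5-10x
# poorer in heat production than a chondritic silicate value (~20 TW assumed):
chondritic_mantle = 20e12
morb_like = [chondritic_mantle / d for d in (5.0, 10.0)]  # only a few TW

# the shortfall would then have to come from secular cooling:
missing = mantle_and_core - max(morb_like)
M_mantle, Cp = 4.0e24, 1200.0  # kg and J/(kg K), rough values
cooling_rate = missing / (M_mantle * Cp) * SECONDS_PER_GYR
print(round(cooling_rate), "K per billion years")
```

That comes out at a couple hundred ◦K per billion years, the same ballpark as the 175 quoted by Kellogg et al.: hence the “excessive internal temperatures during the Archaean”.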
9.3 Geodynamical Models of Mantle Convection

Geodynamicists ask the same questions: what’s the pattern of mantle convection, and whether stuff in the mantle is well mixed or not—but they have a different way of trying to answer them: mostly, they simulate things, in the laboratory or through calculations. We’ve seen some “analogue” geodynamical models already, in Chap. 8 (remember Dave Griggs?); people still do that, but what we rely on most, nowadays, is mathematical modeling, which, mantle convection being a pretty complex problem, is almost always done numerically674. Which is why in this chapter you won’t see analytical solutions—you won’t see math formulae that tell you directly how mass flows. What I am going to do now is, I am going to put together all the laws that control those motions, many of which we’ve already seen, actually, in previous chapters. Those laws are then implemented numerically in computers; this book is not really about how that is done—there are many books on that, anyway, even written by and for geophysicists; here we shall jump directly to the results of such exercises. Not their fine details; just the general traits shared by most of the convection models that are around today. I’ll try to identify some “features”, as they say, on which people seem to agree, of which there’s not so many; my feeling is that there’s a really huge spectrum of models on the, uhm, market, which can be quite different from one another: as we shall see, there’s still too much uncertainty on some of the parameters that control how those models behave. Anyway, like I said, let’s put together all the physical laws that need to be taken into account by a good model. This involves both the conduction of heat and the displacement of mass. We’ve studied heat conduction quite a bit already, and we know that, in the earth’s mantle, conduction approximately only occurs within thermal boundary layers—which are likely to be at the top and bottom of the mantle675.
As for mass transport—convection sensu stricto—which happens everywhere but in thermal boundary layers: that is described by the usual “momentum equation”, i.e., Navier–Stokes, which so far we’ve mostly used in the “elastic” approximation, but which, if you remember what I said in Chap. 6, is valid independent of the rheology.
Earlier on we’ve substituted Hooke’s law in it, now we’ll have to use something that describes the long-term, geological-time behavior of earth materials. The way we’ve learned them earlier on in this book, the heat law and Navier–Stokes don’t talk to one another—one deals with temperature, the other with mass and displacement. But there’s also thermodynamics, telling us how changes in pressure (related to stress) and volume (related to displacement) and temperature affect one another; and the principle of energy conservation of Chap. 4, that couples mechanical (changes in stress and deformation) and thermal effects. If a model accounts for all these laws, then a change in temperature, which could happen via conduction of heat, can cause a change in density, which could make stuff more or less buoyant, and so move it, via Navier–Stokes. Which, simply put, is what we think happens in the real world. Models of mantle convection are made with all these ingredients, and so let’s look at them (at the ingredients) one by one.

(i) The momentum equation, that is to say, the Navier–Stokes equation of Chap. 6,

ρ ∂²u/∂t² = ∇ · τ + f.    (copy of Eq. 6.135)
The body force per unit volume, f, at the right-hand side, could include gravitation, and but also the centrifugal and Coriolis forces676, depending on the level of accuracy you hope to achieve.

(ii) A constitutive relation, or stress-strain relation, or rheological equation, whatever you like to call it. We’ve met at least three of those in this book so far: Hooke’s law, in Chap. 6, which describes what we call a perfectly elastic medium, and then the equation of an inviscid fluid (Eq. (8.13), Chap. 8) and that of a linear viscous fluid, compressible or incompressible (Eqs. (8.15) and (8.16), also in Chap. 8). Navier–Stokes is valid independent of the rheology, so the idea is you replace τ in it with the function of deformation that is at the right-hand side of one of those equations. The goal, as usual, is to end up with a differential equation having u(x, t)—displacement of mass, i.e., flow, i.e., convection—as its unknown function. That’s what we did in Chap. 6, with Hooke’s law, which gave us the Navier–Cauchy equation (6.167), which works for seismology but not for geodynamics; and in Chap. 8, too, when we looked at Haskell’s famous paper, where he assumed the mantle to be a linear incompressible viscous fluid: which results in Haskell’s equation (8.19). At this point in this book, though, we know very well that the stuff that makes up the mantle is neither fluid nor elastic. We know that it is very close to elastic in the short term (i.e., the minutes and hours that follow an earthquake, for example); and fluid over very long time scales—at the very least thousands of years (think the time scale of postglacial rebound, where we are talking uplifts of, like, a few mm per year). If we want to be realistic, we need a rheology equation that mimics both behaviors.
The simplest example of viscoelastic rheology is what people call a Maxwell material. It was, yes, James Clerk Maxwell, of Maxwell’s equations fame, who came up with it in his 1867 paper “On the Dynamical Theory of Gases”677. Maxwell gives a simple, one-dimensional example of how viscoelastic rheology might work. “The phenomena of viscosity in all bodies may be described,” he says, “as follows:–

“A distortion or strain of some kind, which we may call ε, is produced in the body by displacement. A state of stress or elastic force which we may call τ is thus excited. The relation between the stress and the strain may be written τ = E ε, where E is the coefficient of elasticity for that particular kind of strain. In a solid body free from viscosity, τ will remain = E ε, and

dτ/dt = E dε/dt.    (9.6)
“If, however,” goes on Maxwell, “the body is viscous, τ will not remain constant, but will tend to disappear at a rate depending on the value of τ, and on the nature of the body. If we suppose this rate proportional to τ, the equation may be written”

(1/E) dτ/dt + τ/η = dε/dt.    (9.7)

Maxwell uses different symbols for what I’ve called τ, ε, η, but I thought I’d stick to the notation you tend to find in today’s books and papers and stuff. In practice, like I was saying, Eq. (9.7) works for a one-dimensional thing where you have a spring attached to a damper, see Fig. 9.6. The spring is the elastic object by definition: you apply some stress τ—which in 1-D is actually basically a change in pressure, something we called δp elsewhere in this book—to it, and the stress equals Young’s modulus E times the relative deformation ε (which, again, in 1-D, is ε = δL/L, with L the length of the spring pre-deformation, and δL the amount of extension or contraction you’ve administered). When you push or pull (i.e., apply some kind of τ) on a damper678, on the other hand, initially there is no ε at all: only the time-derivative of ε is affected—without the spring, in fact, we’d just have

dε/dt = τ/η.    (9.8)
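As an aside, we can put rough numbers on this spring-plus-damper picture. At constant strain (dε/dt = 0), Eq. (9.7) says stress decays as τ(t) = τ0 exp(−Et/η), with characteristic time η/E (the “Maxwell time”). The values of η and E below are my order-of-magnitude assumptions for mantle rock, not measured constants.

```python
import math

eta = 1e21  # Pa s: a Haskell-like mantle viscosity (Chap. 8)
E = 1e11    # Pa: elastic modulus of mantle rock, order of magnitude

maxwell_time = eta / E  # seconds; the elastic-to-viscous crossover
print(maxwell_time / 3.15e7, "years")

def stress(t, tau0=1.0):
    """Stress relaxation at constant strain: Eq. (9.7) with d(eps)/dt = 0."""
    return tau0 * math.exp(-E * t / eta)

for years in (1, 100, 10_000):
    print(years, "yr:", stress(years * 3.15e7))
```

The Maxwell time comes out at a few hundred years: over seismic time scales stress barely decays (the material behaves elastically), while over many Maxwell times it relaxes away entirely (viscous flow): which is just the combination of behaviors we were asking for.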
At the next instant in time, though, its derivative being nonzero, ε must become nonzero as well: and the damper begins to deform. If you put spring and damper together like in Fig. 9.6, says Maxwell, then the total ε is the sum of the ε of the damper plus the ε of the spring; and if you differentiate this with respect to time, and use (9.6) and (9.8), you get Eq. (9.7). If η is very large in comparison with E, which is the case in earth materials, then the ratio of τ to η will tend to be small, i.e., you’ll need a lot of stress τ before you can see any effect from the damper, or you’d have to wait for a very
9.3 Geodynamical Models of Mantle Convection
435
Fig. 9.6 Maxwell’s model of viscoelasticity, in one dimension. A perfectly elastic spring (or bar, or whatever, but, in any case, perfectly elastic) with elastic constant E is attached to a viscous damper, or dashpot, with viscosity η
long time (which, indeed, is what happens when you look at the earth). The lower the viscosity, the smaller the value of τ needed, and/or the shorter the time you need to wait to see some significant viscous flow, etc. Before being able to use Maxwell’s idea in three-dimensional models, we would need to translate it to 3-D, kind of like we did in Chap. 6, where we started with Hooke’s law in 1-D and ended up with the mighty 3 × 3 × 3 × 3 tensor c, and λ and μ and all that. But I figure we’ve already done enough of that kind of stuff in this book and we don’t need to do more. Plus, a Maxwell material is, like I said, the simplest case of viscoelastic rheology that one can think of, but is definitely not the only one679 . I am not going to go through all the models of viscoelasticity that have been proposed over the years, because there’s too many of them680 , and nobody really knows which one is most realistic. The problem being, I guess, that viscous phenomena are very slow681 . To test your viscoelastic model against the real world, you’d need to collect data for a geological amount of time: which is not very practical. (iii) Newton’s law of gravitation, plus his “shell theorem” (Chap. 1), are summarized in two equations682 , both valid at any given point within the earth. The first introduces a “gravitational potential”, U (r), i.e., a function such that fgrav (r) = ρ(r )∇U (r),
(9.9)
where f_grav is (gravitational) force per unit volume, and U is (potential) energy per unit mass. If g is gravitational acceleration, we might also write this as

g(r) = ∇U(r).
(9.10)
The second equation, which people often call Poisson's equation, says that, for f_grav to be indeed gravitational attraction according to Newton's laws, U must verify

∇²U(r) = −4πG ρ(r),    (9.11)

or, if you combine it with (9.10),

∇ · g(r) = −4πG ρ(r),    (9.12)

where G, as usual, is the gravitational constant.
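In spherical symmetry, Eq. (9.12) gives back the shell theorem of Chap. 1: r² g_r(r) = −G m(r), with m(r) the mass below radius r. A quick numerical sanity check, using round numbers for the earth's mean density and radius (my choices, not values from this section):

```python
import math

# Spherically symmetric version of Eq. (9.12), div g = -4*pi*G*rho:
# (1/r^2) d(r^2 g_r)/dr = -4*pi*G*rho(r), so
#   r^2 |g_r(r)| = 4*pi*G * integral_0^r rho(s) s^2 ds = G * m(r),
# i.e. only the mass below radius r attracts (the shell theorem).
G = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2
R = 6.371e6      # earth radius, m
rho = 5515.0     # mean density, kg/m^3 (uniform toy earth)

def g_surface(n=100000):
    # trapezoid rule for m(R) = 4*pi * integral of rho*s^2 ds
    dr = R / n
    m = 0.0
    for i in range(n):
        s0, s1 = i * dr, (i + 1) * dr
        m += 0.5 * (rho * s0**2 + rho * s1**2) * dr
    m *= 4.0 * math.pi
    return G * m / R**2

g = g_surface()
print(g)   # close to the familiar 9.8 m/s^2
```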
9 Our Concept of the Earth
Pressure p is also related to gravity, of course. We saw in Chap. 7 that, in the simple case of a spherically symmetric earth,

dp/dr = −ρ g(r).    (copy of Eq. 7.125)
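Just to get a feel for Eq. (7.125): integrating it from the surface down, for the same uniform-density toy earth as above (my assumption, not the real, layered earth), gives the pressure at the center. A minimal sketch:

```python
import math

G = 6.674e-11
R = 6.371e6
rho = 5515.0   # uniform toy density, kg/m^3 (assumed)

def central_pressure(n=100000):
    # Integrate dp/dr = -rho*g(r) from the surface (p = 0) to the center.
    # For uniform density, g(r) = (4/3)*pi*G*rho*r inside the sphere.
    dr = R / n
    p = 0.0
    for i in range(n):
        r = R - (i + 0.5) * dr          # midpoint of each shell
        g = (4.0 / 3.0) * math.pi * G * rho * r
        p += rho * g * dr
    return p

pc = central_pressure()
print(pc / 1e9)   # central pressure in GPa, roughly 170 for this toy model
```

The real earth, with its dense core, comes out quite a bit higher than this uniform-density estimate; the point here is only that Eq. (7.125) plus a density profile is all you need.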
Strictly speaking, Eqs. (7.125) and (9.11) are valid as long as the earth is spherically symmetric. Anyway, I guess, in all we are doing here we might consider the earth to be almost spherically symmetric. Should one decide to drop this approximation, a more complicated/general description of gravitation needs to be implemented. In any case, the physics is all in Newton's laws.

(iv) Conservation of mass: i.e., the equation, which we first met in Chap. 6, that relates changes in volume with the divergence of mantle flow;

δV/V0 = ∇ · u,    (copy of Eq. 6.19)
where V0 is the initial volume (before the displacement u occurs) of some parcel of material, and δV is how much that volume has changed. Now let m be the mass of the parcel. You might write m = ρV, and of course m = ρ0 V0, where ρ0 is density before displacement. It follows that V = m/ρ, and so

∂V/∂ρ = −m/ρ²,    (9.13)

or

δV = −(m/ρ0²) δρ.    (9.14)

Sub into Eq. (6.19), and

∇ · u = −(m δρ/ρ0²)/(m/ρ0) = −δρ/ρ0.    (9.15)
This equation, which people call the continuity equation683, is sometimes found in the form

∇ · v = −(1/ρ0) ∂ρ/∂t,    (9.16)

which is what you get if you differentiate both sides of (9.15) with respect to time. Equations (9.15) and (9.16) contain the same info as Eq. (6.19), as you've just seen: but, for what we are doing now, they are preferable to it, because, unlike volume, density is something that can be defined at one individual point of a continuum.
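A one-line sanity check of Eq. (6.19) and Eq. (9.15), with a made-up deformation: a small uniform compression u = −a x (a tiny) has ∇ · u = −3a, and a unit cube shrinks to (1 − a)³ of its volume:

```python
# Uniform small compression: u = (-a*x, -a*y, -a*z), with a small.
a = 1.0e-6

V0 = 1.0
V = (1.0 - a) ** 3          # each edge of the unit cube shortens by (1-a)
dV_over_V0 = (V - V0) / V0  # relative volume change

div_u = -3.0 * a            # divergence of u, computed analytically

# density change of the (mass-conserving) parcel, cf. Eq. (9.15):
drho_over_rho0 = V0 / V - 1.0

print(dV_over_V0, div_u, drho_over_rho0)
```

To first order in a, dV_over_V0 equals div_u, and drho_over_rho0 equals −div_u, which is exactly what (6.19) and (9.15) say.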
People often assume that mantle stuff is incompressible (this is what Haskell did in his postglacial rebound papers, for example), i.e., ∇ · u = ∇ · v = 0, because that simplifies the calculations, and because it's probably a reasonable approximation, based on what we know from laboratory experiments.

(v) The so-called energy conservation equation. Navier–Stokes, above, is basically a version of Newton's law, and, as such, does not account for "dissipation" via heat. So that's why we also need generalized energy conservation, which, as you remember from Chap. 4, associates changes in a system's mechanical energy with changes in the system's temperature, and/or heat exchanges between the system and the outside world. In practice, so far in our applications of Navier–Stokes we've made the implicit assumption that mechanical energy can't be dissipated through heat. But, in the real world, that's not at all what happens—if that were the case, e.g., seismic waves wouldn't be attenuated, and would continue to propagate around the earth for ever, and so on and so forth.

We might formulate the principle of generalized energy conservation as follows: let's say that heat Q and work W are exchanged between a system, like the earth's mantle in our case, and the rest of the universe. Work performed on the system, i.e., received by the system, is a positive number; same for heat. Work performed by the system is a negative number, and so is heat that is lost. Both W and Q contribute to the system's energy E. If we could measure E before and after whatever processes resulted in Q and W, and take the difference, we'd get

ΔE = W + Q.    (9.17)

Imagine you measure W and Q over a given amount of time, and make that amount of time infinitely small: then it follows from (9.17) that

dE/dt = dW/dt + dQ/dt.    (9.18)
Now the trick is to derive from (9.18) an equation, presumably a differential equation, that involves the displacement or velocity fields: u or v. That equation then can be combined with the equations that Haskell (see above) already had—which also had u or v as unknown function. And then that would be a new system of differential equations, that we must solve simultaneously. So let's work first on dW/dt, and then on dE/dt and dQ/dt, to see how they can be expressed in terms of v.

Call f the field of body forces (mostly gravity) per-unit-volume acting on the earth's mantle, or whatever object we want to study; call T the surface forces per-unit-area (the so-called tractions684); then, by definition of work,

W = ∫_V f · u dV + ∫_∂V T · u dS.    (9.19)
In the kind of problems we're interested in right now, displacements are pretty small—convection in the earth's mantle is clearly a very slow process, otherwise we'd be seeing all sorts of fast, catastrophic changes at the surface. Body forces and tractions can be very large; but then again, because convection is slow, changes in T and f are small. So if we differentiate both sides of (9.19) with respect to time, but neglect second-order stuff685, we are left with

dW/dt ≈ ∫_V f · ∂u/∂t dV + ∫_∂V T · ∂u/∂t dS
      ≈ ∫_V f · v dV + ∫_∂V T · v dS.    (9.20)
Now, remember T = n̂ · τ, where τ is the stress tensor and n̂ is a unit vector that's everywhere perpendicular to ∂V; then

dW/dt ≈ ∫_V f · v dV + ∫_∂V n̂ · τ · v dS
      ≈ ∫_V f · v dV + ∫_∂V (τ · v) · n̂ dS,    (9.21)
which, by virtue of the divergence theorem,

dW/dt ≈ ∫_V f · v dV + ∫_V ∇ · (τ · v) dV
      ≈ ∫_V [ f · v + ∇ · (τ · v) ] dV
      ≈ ∫_V [ f_i v_i + ∂(τ_jk v_k)/∂x_j ] dV
      ≈ ∫_V [ f_i v_i + (∂τ_jk/∂x_j) v_k + τ_jk ∂v_k/∂x_j ] dV
      ≈ ∫_V [ (f_k + ∂τ_jk/∂x_j) v_k + τ_jk ∂v_k/∂x_j ] dV.    (9.22)

But by Navier–Stokes, f_k + ∂τ_jk/∂x_j = ρ ∂v_k/∂t; so

dW/dt ≈ ∫_V [ ρ (∂v_k/∂t) v_k + τ_jk ∂v_k/∂x_j ] dV
      ≈ ∫_V [ ρ (1/2) ∂(v_k v_k)/∂t + τ_jk ∂v_k/∂x_j ] dV
      ≈ ∫_V ρ (1/2) ∂(v · v)/∂t dV + ∫_V τ : ∇v dV.    (9.23)
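The double-dot product τ : ∇v that shows up in (9.23) is, in components, Σ_{j,k} τ_jk ∂v_k/∂x_j. Here's a tiny numpy check of that definition; the two 3 × 3 arrays are random numbers, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

tau = rng.normal(size=(3, 3))      # some stress tensor (made-up numbers)
tau = 0.5 * (tau + tau.T)          # stress tensors are symmetric
gradv = rng.normal(size=(3, 3))    # gradv[j, k] stands for dv_k/dx_j

# tau : gradv = sum over j and k of tau_jk * dv_k/dx_j
double_dot = np.einsum('jk,jk->', tau, gradv)

# same thing with explicit loops, to make the definition obvious
check = sum(tau[j, k] * gradv[j, k] for j in range(3) for k in range(3))

print(double_dot, check)
```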
(There was no particular reason, by the way, to switch to scalar notation during those intermediate steps, except I figured it'd be easier for me to work them out, and for you to follow.)

Now that we've managed to write dW/dt in terms of v and τ—and remember that τ can also be replaced by some expression containing v, or u, once we've decided what kind of constitutive relation we shall use to approximate the rheology of the earth's mantle—now that that's done, let's look at dQ/dt, or the rate at which heat is lost or received. This is how much heat is produced from within the mantle—by radioactivity, mostly—plus (or minus) the heat that's received from (lost to) the outside world—received, by conduction, from the core, or lost through the crust and the outer surface. We can write it like that:

dQ/dt = ∫_V ρ H dV − ∫_∂V q · n̂ dS,    (9.24)
where H is how much heat is produced from within the mantle, per unit mass, and q is heat flux, i.e., the 3-D version of what in Chap. 4 I called F—which was a scalar, because we worked in one dimension back then; but in 3-D, flux also has a direction686. These are parameters that we need to have a decent estimate for, if we are to come up with a realistic model for mantle convection. Or, we can try to guess their values, and if we get a model that is not totally crazy we might infer that our guess could be OK: and we can do this again and again, trial-and-error-like, until it works. Anyway, more on that later.

Now for the left-hand side of (9.18): we are going to think of E as the combination of kinetic plus internal energy of the system. If you know anything about the so-called kinetic theory of matter, you might be confused by what I mean here by kinetic energy. My kinetic energy is a purely mechanical, "macroscopic" thing: you know it if you know how fast mass is flowing everywhere in the mantle. It does not include the kinetic energy that's associated with the random "microscopic" motions of molecules, which control, instead, the system's temperature: we are going to account for that as part of the internal energy687. The kinetic energy of a volume element dV at a location r reads (1/2) ρ v(r) · v(r) dV (remember Eq. (4.8), in Chap. 4). Let us just call υ = υ(r) the internal energy per unit mass, so that

E = ∫_V [ (1/2) ρ v · v + ρυ ] dV,    (9.25)

or

dE/dt = d/dt ∫_V [ (1/2) ρ v · v + ρυ ] dV.    (9.26)
In a minute we'll take care of writing υ in terms of stuff that can be measured or estimated—temperature, etc. Before we do that, let us substitute the expressions we now have for dW/dt, dQ/dt and dE/dt into (9.18), which becomes

d/dt ∫_V [ (1/2) ρ v · v + ρυ ] dV = ∫_V ρ (1/2) ∂(v · v)/∂t dV + ∫_V τ : ∇v dV + ∫_V ρ H dV − ∫_∂V q · n̂ dS.    (9.27)

As long as ρ does not change significantly with respect to time, we can bring the time-derivative at the right-hand side out of the integral, and

d/dt ∫_V [ (1/2) ρ v · v + ρυ ] dV = d/dt ∫_V (1/2) ρ v · v dV + ∫_V τ : ∇v dV + ∫_V ρ H dV − ∫_∂V q · n̂ dS;    (9.28)
then you see that the first term at the left-hand side cancels out with the first term at the right-hand side,

d/dt ∫_V ρυ dV = ∫_V τ : ∇v dV + ∫_V ρ H dV − ∫_∂V q · n̂ dS.    (9.29)
Now, apply the divergence theorem to the last integral at the right-hand side:

d/dt ∫_V ρυ dV = ∫_V τ : ∇v dV + ∫_V ρ H dV − ∫_V ∇ · q dV,    (9.30)
so that all integrals are over V; the equation holds independent of how we choose V; we can take V to be an infinitely small volume around any point in space: which means that everywhere in the continuum we must have688

ρ dυ/dt = τ : ∇v + ρ H − ∇ · q,    (9.31)

or

ρ dυ/dt = τ : ∇v + ρ H + k∇²T,    (9.32)

because we know689 that heat flux q = −k∇T, where k is thermal conductivity.

The internal energy υ is controlled by the system's pressure, volume and temperature. The relationship between υ and p and V and T can be written, e.g.,

dυ/dt = C_V dT/dt + (α K_T T − p) (∇ · v)/ρ,    (9.33)

which is a thermodynamic equation kind of like the ones that we've met in Chap. 7, and can be derived in ways similar to what we've done back then690.
You might remember also from Chap. 7, by the way, that C_V and K_T are specific heat capacity at constant volume, and isothermal incompressibility, respectively. α is the volumetric coefficient of thermal expansion, which we've already used in this chapter, too. So, finally, if you substitute (9.33) into (9.32),

ρ C_V dT/dt = τ : ∇v + p ∇ · v − α K_T T ∇ · v + ρ H + k∇²T
            = (τ + pI) : ∇v − α K_T T ∇ · v + ρ H + k∇²T,    (9.34)

where as usual I is the identity matrix, and if you are not convinced by the last step, just think of it in scalar form: I : ∇v = Σ_{i,j} δ_ij ∂v_i/∂x_j, where δ_ij is Kronecker's symbol. Which then, by the properties of Kronecker's symbol, Σ_{i,j} δ_ij ∂v_i/∂x_j = Σ_i ∂v_i/∂x_i, which is nothing but ∇ · v. The contribution of stresses to (9.34) is all in the first term at the right-hand side; if you remember the relation between τ and p—if you don't, go back to Chap. 7, around Eq. (7.118)—the expression τ + pI means, in practice, that we are taking the normal stresses p away from the stress tensor691.

(vi) Boussinesq approximation: which is not mandatory, but is a very common way of simplifying things a bit. Joseph Boussinesq, a prof. at the Sorbonne692, phrased it like this in a book693 he published in 1903: "in most displacement caused by heat in fluids within a gravitational field, volumes or density are approximately conserved, and this even though the change in weight per unit of volume is precisely the reason of the phenomena under question. Hence the possibility of neglecting density variations where they are not multiplied by gravity g". In practice, if we believe Boussinesq's observation, which people tend to do even today in geodynamic modeling, we can account for density changes caused by temperature changes while at the same time keeping the incompressibility approximation, ∇ · u = 0, which makes the math much simpler: because with the Boussinesq approximation, δρ in (9.15) isn't multiplied by g and so can be set to 0, and we are left with

∇ · u = 0,    (9.35)

or, which is the same,

∇ · v = 0.
(9.36)
In the Navier–Stokes equation, where it gets multiplied by g, density is allowed to change by thermal expansion/compression. By definition of α, the thermal density change is given by

δρ(x, t) = −α(r) ρ(r) [T(x, t) − T(x, 0)],
(9.37)
where T (x, 0) is the temperature field in the mantle at time t = 0, meaning before we set our model in motion.
To give you an idea of what this means in practice, let's take Navier–Stokes, and, like Haskell, assume the mantle to be a viscous fluid, except this time we are not going to neglect inertia, so

ρ dv/dt = −∇p + η∇²v − ρg,    (9.38)
where I have replaced the generic body-force per-unit-volume f with gravity f_grav, and then f_grav with the product of mass-per-unit-volume ρ(r) times (gravity) acceleration g(r). If we sub ρ in (9.38) with ρ0 + δρ, where ρ0(r) is the density profile we start out with, and δρ is given by (9.37), i.e., if we sub ρ with

ρ = ρ0(r) − α(r) ρ0(r) [T(x, t) − T(x, 0)],
(9.39)
then

ρ dv/dt = −∇p + η∇²v − ρ0 g + α ρ0 [T(x, t) − T(x, 0)] g.    (9.40)
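The last term in (9.40) is the thermal buoyancy force per unit volume. To get a feel for its size, here is a back-of-the-envelope evaluation with round, upper-mantle-ish numbers; all four values are my own assumptions, not figures given in the text:

```python
# Thermal buoyancy per unit volume, alpha * rho0 * dT * g, cf. Eq. (9.40).
alpha = 3.0e-5    # volumetric thermal expansion, 1/K (assumed)
rho0 = 3300.0     # upper-mantle density, kg/m^3 (assumed)
dT = 100.0        # temperature anomaly, K (assumed)
g = 9.8           # gravity acceleration, m/s^2

f_buoy = alpha * rho0 * dT * g
print(f_buoy)     # on the order of 1e2 N per cubic meter

# Compare with the weight of the same material, rho0*g: the relative
# density perturbation driving the flow is just alpha*dT, a fraction
# of a percent.
print(f_buoy / (rho0 * g))
```

Tiny compared to the material's own weight, which is Boussinesq's point: the perturbation matters only because it is multiplied by g.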
So this is our ingredient (i), the momentum equation, already mixed with ingredients (ii) and (iv). As for (v), energy conservation, that simplifies quite a bit, too, because, for one, in a purely viscous mantle there's no shear stresses; on top of that we have ∇ · v = 0, like I said, and so (9.34) boils down to

ρ0 C_V dT/dt = ρ0 H + k∇²T.    (9.41)
Finally, Poisson's equation—ingredient (iii)—where density variation isn't multiplied by g—becomes

∇ · g = −4πG ρ0(r),
(9.42)
i.e., in the Boussinesq approximation, g is not perturbed by mantle flow. (Pressure and gravity are still related to one another via Eq. (7.125), of course.) So, now, modeling flow means finding a velocity field v(x, t) and a temperature field T (x, t) that solve (9.36), (9.40) and (9.41), while meeting some initial and boundary conditions. The Boussinesq approximation has to do with density changes, and is totally independent of rheology. We just looked at the viscous case because it’s the simplest possible example, but you can use Boussinesq in combination with whatever rheology you want. (vii) The most recent addition to all this is that people in the 1990s have started to take into account the effect of chemical heterogeneity: which means, in practice, that instead of Eq. (9.39) you have something like
ρ = ρ0(r) − α ρ0(r) [T(x, t) − T(x, 0)] + Σ_{i=1}^{N} B_i C_i(x, t),    (9.43)

where the function C_i = C_i(x, t) is the concentration of material i at point x at time t. The coefficient B_i depends on how dense material i is with respect to the average density of all materials. Materials could be, e.g., depleted mantle, primitive mantle, continental lithosphere, etc.: the modeller decides how many different materials she wants to try to simulate, i.e., the value of N. The total amount of a given material that's hanging around in the earth is conserved: you can't just create it or destroy it at any place in the planet, so, at any given point, the change in C_i over time coincides with how much of it is flowing into/away from that point, i.e.,

∂C_i/∂t = −[ (∂u_1/∂t) ∂C_i/∂x_1 + (∂u_2/∂t) ∂C_i/∂x_2 + (∂u_3/∂t) ∂C_i/∂x_3 ]
        = −v · ∇C_i,    (9.44)
for all materials: i = 1, 2, . . . , N. Subbing (9.43), instead of (9.37), into (9.38), we get the thermochemical version of (9.40), which together with (9.36), (9.41) and (9.44) forms the system of differential equations that a thermochemical model has to solve. The spatial patterns of C_i(x, t) will tell us where, according to the model, the different reservoirs are.

In the end, numerical models of mantle circulation are computer programs that calculate a displacement field u(x, t), a temperature field T(x, t), and (in the thermochemical case) concentration fields C_i(x, t), meeting requirements (i) through (v), plus (vii), plus some boundary and initial conditions. The boundary condition often consists of requiring that flow near the outer surface of the model coincides with the displacement of plates—which can be reconstructed, based on what we learned in Chap. 8, for the, I don't know, past couple hundred million years. Of course, the pattern of the displacements, and of the temperature and composition changes that will come out of your calculations, depends very much on the parameters you put in to describe earth structure/rheology: some of which, as you know by now, are, uhm, less well constrained than others. One particularly delicate parameter is viscosity: which is super important in controlling how mass flows, but pretty poorly constrained. What we know about viscosity comes mostly694 from studies of postglacial rebound—as you might remember from Chap. 8. There's lots of papers around, calculating rebound based on different viscosity models—i.e., models with multiple layers, with different assumptions as to their possible depth extent—and looking at more and more data, etc., and checking which models fit the data and which don't.
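The advection equation (9.44), by the way, is easy to play with numerically. Here is a minimal 1-D, single-material toy (grid size, velocity and initial "blob" all invented for the example), using a first-order upwind scheme on a periodic domain, just to show that the scheme moves the concentration around without creating or destroying material:

```python
# Toy 1-D version of Eq. (9.44): dC/dt = -v dC/dx, periodic domain.
n = 200
L = 1.0
dx = L / n
v = 1.0                     # constant velocity (made up)
dt = 0.4 * dx / v           # CFL-stable time step

# initial concentration: a blob in the middle of the domain
C = [1.0 if 0.4 <= i * dx <= 0.6 else 0.0 for i in range(n)]
total0 = sum(C) * dx        # total amount of "material" at t = 0

for _ in range(300):
    # first-order upwind: for v > 0, use the value just behind;
    # C[i-1] with i = 0 wraps around, giving periodic boundaries
    C = [C[i] - v * dt / dx * (C[i] - C[i - 1]) for i in range(n)]

total = sum(C) * dx
print(total0, total)        # total material is conserved by the scheme
```

(Real mantle-circulation codes use fancier advection schemes, or tracer particles, because first-order upwind smears sharp boundaries badly; but conservation is the property that matters here.)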
The models that people have published over the years are different from one another; but, basically, there are a few things on which most people, if asked, will probably agree: (i) that the viscosity of the lithosphere, or at least of the upper part of the
lithosphere, is so high that one might as well take it to be infinite: a rigid lid; (ii) that below the rigid lid there's a low-viscosity (something like 10^19, 10^20 Pa s) layer that roughly corresponds with the asthenosphere—which is a nice confirmation of the whole idea of asthenosphere; (iii) below that, and down to the transition zone, viscosity is probably close to Haskell's estimate: 10^21 Pa s; (iv) lower-mantle viscosity, which is harder to constrain because rebound data are not as sensitive to it as they are to shallower depths, might be, like, two or three times higher than that of the upper mantle, but probably of the same order of magnitude. This is vague, I know, but it's the best that we can do. If we feed this model to mantle-circulation software, we tend to get convection patterns that span the entire mantle, with hot deep mantle stuff being transported from the base to the very top of the mantle, and vice-versa cold stuff sinking from the earth's surface all the way down to the core. We get, that is, the geophysicists' whole-mantle convection pattern. Stuff rises and/or sinks across the transition zone—as it passes through those discontinuities, its phase will change, like we have learned—but if there's no major (as in, order-of-magnitude) change in viscosity, a phase change alone can't be a barrier to flow—sinking and rising stuff will go right through it695. In this scenario, the mantle tends to be chemically well-mixed. There might be spots that don't mix with the rest, but certainly not entire layers. That is a problem, because most geochemists "favour", I am quoting Hofmann again, "some type of layered convection": at least two shells, each with its own convective system. Geodynamicists have been trying to see if they can come up with models that are OK with this constraint.
This is trickier than whole-mantle convection, because, e.g., if flow below and above, say, the 660, doesn't point in the same direction (is not "mechanically coupled") there would be too much friction between upper and lower mantle, unless either one is way less viscous (like, four orders of magnitude less viscous) than the other. But that's unlikely, based on what we just learned about the earth's viscosity. On the other hand, if flow is mechanically coupled, then you end up having cold stuff in the upper mantle sinking right on top of rising hot lower-mantle stuff: and that would be weird: how do you even start a hot anomaly, e.g., at the base of the upper mantle, if the lower mantle below it is colder than average? I am not sure whether someone has managed to come up with a numerical model that does this, but if they did, it probably required quite a bit of "fine tuning", as they say, of all those parameters. And there's also another thing: if convection is layered, that means you've got two thermal boundary layers near the 660: one above it, which is the bottom of the upper-mantle convective system, and one below it—the top of the lower mantle. In a thermal boundary layer temperature changes real quick with depth—just like in the lithosphere: and I don't think we know for sure what total temperature change to expect, but it should be close, in order of magnitude, to what we see in the lithosphere, which is something like 1000 K. To some extent that could be "good": because remember the piece on the earth's temperature at the beginning of this chapter: if we extrapolate T at the CMB from above, the CMB turns out to be too cold with respect to extrapolations "from below" (i.e., from the ICB phase-change temperature). If we
had a thermal boundary layer somewhere within the mantle, like at the 660, then our estimate of T just below that TBL would be higher than in a whole-mantle-convection model, and then as we extrapolate across the lower mantle via the adiabatic gradient, we'd end up with higher T at the CMB. But how much higher? All the quantities we are dealing with here have big uncertainties, but anyway, my feeling is that we might easily end up with estimates of lower-mantle T that are actually too high. When you put all these things together, I guess the evidence is rather against the layered-convection idea.
9.4 Global Seismic Tomography

The final piece of information (at least for the time being) to try and solve the riddle of mantle circulation, chemical mixing, etc., comes, like I anticipated, from seismology696. In Chap. 8 there's a piece on how the speed of surface waves changes, e.g., across an oceanic plate. If you don't remember what I am talking about, go back to Fig. 8.38: looking at surface waves traveling between, I don't know, twenty or thirty source-receiver pairs, Kausel and company find values of phase velocity as small as 3.7 and as large as 4.1 km/s; and they see that the spatial distribution of those values is correlated with the age of the various parts of the plate. In the mid-seventies, seismologists doing this kind of thing would look at, like, a few dozen travel-time observations at a time697. By the early-to-mid eighties, the size of your typical seismology data set had grown by about an order of magnitude698. People like Adam Dziewonski, John Woodhouse, Brad Hager, were able to look at a few hundred travel time "picks" at a time: enough to try and compile some relatively detailed three-dimensional maps of P and even S wave velocities—covering not just a lithospheric plate, but the entire mantle. So, what I am going to do next is, I am going to tell you in some detail about how this is done. It's a technique called seismic tomography, and, I should warn you, it was the number-one tool that I used to do the research for my Ph.D. thesis. Which means that I studied it in quite some depth, so that at some point I became, like, an "expert" of it. But I'll try not to be a geek, and spare you the useless technicalities.

Start by thinking of a homogeneous medium, i.e., a material through which some kind of waves, for example P waves, propagate at a speed that is constant both in time and space. Take a source and a receiver within that medium, say you know the exact position of both.
Question: how long does it take for the signal emitted by the source to hit the receiver? After all we've learned in Chap. 7, we know that the travel time must coincide with the length of the ray path, call it l, divided by the wave speed (or velocity, if you want to call it the way seismologists call it) v of the wave in question;

t = l/v.    (9.45)
We also know that, because the medium is homogeneous, there's no refraction or anything of that kind, so the ray path is a straight line, and l is just the distance between source and receiver. Now, take a medium that is not homogeneous, and ask the same question. It is still true that travel time equals length of ray path divided by velocity; but what's the ray path? If the medium consists of a bunch of flat layers, each of which is homogeneous, which is a first approximation for many real setups in small-scale seismology (in many places, the crust is just a pile of flat sedimentary layers), then we have a recipe to find the ray path: make a first guess, i.e., take a path leaving the source at an angle θ0, within the plane defined by source, receiver and the vertical; trace it—trace a straight line—to the first interface; use Snell's refraction law to see what the incidence angle becomes in the next layer; trace a straight line in that layer, and so on and so forth. We know from Chap. 7 that seismic velocities almost always grow with depth, so at some point the so-called "incidence" angle becomes bigger than the critical angle, refraction becomes reflection, etc. Eventually, then, the ray comes back to the earth's surface, and if you are lucky it does so at precisely the spot where the receiver sits. If not, what you do is you start over with a new "take-off angle", say θ0 + δθ where δθ is small: and you iterate the process until you hit the receiver (see Fig. 9.7). Once that is done, you can measure the length of each segment of ray path—one per layer—and

t = 2 Σ_{i=1}^{k} l_i / v_i,    (9.46)
where k is how many layers are crossed by the "good" ray path; l_i is the length of the good ray path within the i-th sampled layer; and v_i is wave velocity in that same layer (constant throughout the layer, because we decided that each layer is homogeneous). The factor 2 is there because each layer is crossed twice: the first time as emitted energy propagates from the source down into the earth, and the second time as refracted/reflected energy travels back up towards the surface (the receiver). (If both source and receiver are at the earth's surface, that is. Otherwise, some layers are crossed only once, and (9.46) needs to be replaced by a slightly more complex formula. In practice, receivers—seismic "stations"—are pretty much always at the surface. (Or, like, a few meters deep.) Sources can be deep, as you know, but in most cases their depth is very small compared to the size of the mantle.)

Fig. 9.7 Vertical section through a pile of layers, and seismic ray paths traced through them. The thin solid lines are the surface of the earth and, below it, boundaries between layers; the two dashed lines are ray paths that do not hit the receiver; the thick solid line is the correct ray path, given where source and receiver are

Finally, take a heterogeneous medium: velocity changes with position within the medium, in some general way. This problem is not that different from the one we just solved, if we think of the medium as made of small, but homogeneous voxels, cells, blocks, whatever you call them, each with constant velocity. Then, the recipe I just gave you still works, except there's no reason for the ray path to stay in one plane, like it does when we have flat layers only. So now you need to start out with two angles, e.g., one with respect to the vertical and one with respect to the north, or whatever direction in the horizontal plane. And the path might end up having some really funny shape. We still end up with Eq. (9.46), though, where now l_i is the length of the ray path segment lying in voxel number i, and v_i the wave velocity in that same voxel. Maybe you've guessed what happens next: if you take the voxels to be infinitely small, the sum in (9.46) becomes an integral, and

t = ∫_ray path dl / v[x(l)].    (9.47)
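The layer-stack recipe of Fig. 9.7, shoot a ray, apply Snell's law at each interface, then iterate on the take-off angle until you hit the receiver, can be sketched in a few lines. The layer thicknesses, velocities and receiver offset below are invented for the example, and the "iterate the take-off angle" step is done by bisection:

```python
import math

# Hypothetical crustal model: (thickness in km, P velocity in km/s).
layers = [(2.0, 3.0), (3.0, 4.5), (5.0, 6.0)]

def shoot(theta0):
    """Trace a downgoing ray (take-off angle theta0 from the vertical)
    until it reflects at the bottom of the stack, or goes beyond the
    critical angle at some interface; return (offset, travel time) for
    the full down-and-up path, symmetric for surface source/receiver."""
    x, t = 0.0, 0.0
    theta = theta0
    for i, (h, v) in enumerate(layers):
        l = h / math.cos(theta)        # path length within this layer
        x += h * math.tan(theta)
        t += l / v                     # one term of Eq. (9.46)
        if i + 1 < len(layers):
            # Snell: sin(theta')/v' = sin(theta)/v
            s = math.sin(theta) * layers[i + 1][1] / v
            if s >= 1.0:               # beyond critical: reflect here
                break
            theta = math.asin(s)
    return 2.0 * x, 2.0 * t            # up-going leg doubles both

# Bisection on the take-off angle, to hit a receiver 10 km away.
target = 10.0
lo, hi = 0.01, 1.2                     # radians
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if shoot(mid)[0] < target:
        lo = mid
    else:
        hi = mid

offset, t = shoot(0.5 * (lo + hi))
print(offset, t)                       # offset ≈ 10 km; t is two-way time
```

(Real codes are cleverer about the search, since the offset-versus-angle curve is not monotonic once rays start reflecting at different interfaces; bisection happens to find one of the valid crossings here.)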
The expression v[x(l)] might look a bit strange, so let me explain. Because the medium is heterogeneous, v is a function of the position vector, i.e., v = v(x). As we increment l and integrate according to (9.47), though, we shall only be concerned with values of v along the ray path. So, then, I take x = x(l) to be the function that tells me what my coordinates are, if I have walked a distance l from the source, along the ray path. (Just so that we are clear: l is not measured along the straight line connecting source and receiver, but along the ray path, whatever its geometry; you might call it the "incremental length of the ray path".) Bottom line, for a given value of the integration variable l, the value of x that you need to plug into v(x) is x(l), hence Eq. (9.47).

Now that you understand where (9.47) comes from and what it means, I am going to show you how you can use it to set up a "tomographic inverse problem"—that is to say, an algorithm that allows you to estimate v, which is inside the integral in (9.47) and so is not something we can compute directly from measurements of travel time t. Tomographers always start out with some "reference" model, which could be one of the Herglotz-Wiechert-Bateman-type models we've met earlier. What they measure, rather than the "absolute" travel time, then, is a "delay time", or "travel-time anomaly": the difference δt between the travel time you'd calculate699, for a given source-receiver pair, in the reference model, and the t from a seismogram700. What they try to find, strictly speaking, is not v, but the discrepancy δv with respect to the reference velocity. If you replace v with v0 + δv in (9.47), you get

t0 + δt = ∫_ray path′ dl / ( v0[x(l)] + δv[x(l)] ),    (9.48)
where v0 is the reference model and t0 is the travel time predicted by the reference model, for that source-receiver pair. The integration domain, "ray path", is primed, to remind us that, if v changes, the ray path will change as well. Equation (9.48) doesn't look that great, but let us do a first-order Taylor expansion701 of the function f(v) = 1/v around v = v0,

1/v ≈ 1/v0 − (v − v0)/v0²
    ≈ 1/v0 − δv/v0².    (9.49)
Plug that into (9.48), and

t0 + δt ≈ ∫_ray path′ [ 1/v0(x) − δv(x)/v0²(x) ] dl
        ≈ ∫_ray path′ dl/v0(x) − ∫_ray path′ δv(x)/v0²(x) dl
        ≈ ∫_ray path dl/v0(x) − ∫_ray path δv(x)/v0²(x) dl,    (9.50)
where I haven’t bothered to state explicitly the dependence of x on l—which of course is still implicitly true, though; and in the last step I’ve replaced the primed, “perturbed” ray path with the reference, “unperturbed” one: this is OK, because as long as δv is small, it is reasonable to assume that changes in the ray path won’t have more than a second-order effect on δt—and what we have here is a first-order Taylor expansion anyway. Now, by virtue of Eq. (9.47),

$$ t_0 = \int_{\text{ray path}} \frac{1}{v_0(\mathbf{x})}\, dl. \tag{9.51} $$
Both sides of (9.51), then, cancel out from the respective sides of (9.50), and we are left with

$$ \delta t \approx -\int_{\text{ray path}} \frac{\delta v(\mathbf{x})}{v_0^2(\mathbf{x})}\, dl. \tag{9.52} $$

Equation (9.52) might look a bit nicer than (9.48), but you are probably still at a loss as to how we are going to be able to turn it around, and constrain δv on the basis of an observation of δt. Well, so, the trick is, you write δv—or, actually, δv/v0, which we like better because it is a dimensionless number—you write δv/v0 as the linear combination of a set of known functions of x: let’s say there’s n of them and let’s call them f1(x), f2(x), f3(x), . . . , fn(x); what I mean, then, is, you assume that the relative velocity perturbation can be written
9.4 Global Seismic Tomography
449
$$ \frac{\delta v(\mathbf{x})}{v_0(\mathbf{x})} = \sum_{i=1}^{n} c_i f_i(\mathbf{x}), \tag{9.53} $$
where c1, c2, c3, . . . , cn are constant coefficients, whose values are unknown—they must be, because δv/v0 is unknown—but the functions fi(x) are known. (The fi’s are usually called “basis functions”. In a minute I’ll give you an example of what they could be like, in practice.) Let’s sub Eq. (9.53) into (9.52),
$$ \begin{aligned} \delta t &\approx -\int_{\text{ray path}} \frac{1}{v_0(\mathbf{x})} \sum_{i=1}^{n} c_i f_i(\mathbf{x})\, dl \\ &\approx -\sum_{i=1}^{n} c_i \int_{\text{ray path}} \frac{1}{v_0(\mathbf{x})} f_i(\mathbf{x})\, dl, \end{aligned} \tag{9.54} $$
where it’s OK to pull the ci’s out of the integral, because, even though we don’t yet know them, we know that they’re just constant numbers—by definition. Tomographers speak of the fi’s, by the way, as a set of basis functions702. Now, imagine you’ve got just one quake, one recording instrument, and, as a result, one delay-time observation δt. This is not going to be enough to constrain v(x) all over the planet—or whatever region of the planet you are interested in, right? Even before I get into the mathematical details, you’ll agree with me that if that’s what you hope to achieve, you should collect many δt’s, from as many sources and receivers as possible. So, let’s go ahead, and assume you’ve compiled a database of δt: we shall use the index j to tell one observation from the other, and so write

$$ \delta t_j \approx -\sum_{i=1}^{n} c_i \int_{\text{ray path}_j} \frac{1}{v_0(\mathbf{x})} f_i(\mathbf{x})\, dl. \tag{9.55} $$
Each datum δtj comes from its own source-receiver pair, and so it’s got its own ray path associated to it; the index j added to “ray path” at the right-hand side serves to remind us of that. Now, everything in Eq. (9.55) is known, except for the ci’s: because δtj is your datum: you pick it from the seismogram; v0 is the reference velocity—you take it from the reference model, which you have chosen; and the fi’s are the basis functions, which you’ve also chosen. It follows that the integrals at the right-hand side can be implemented, and replaced with numbers: that isn’t necessarily an easy task (you don’t really want to do it “by hand”, especially if you’ve got a lot of data, and/or a lot of basis functions), but it is something that you can write an algorithm for, which then you can run iteratively, once per source-receiver couple, on a computer. If you do just that, the computer will spit out a different number for each pair of values of i and j; and we might store all those numbers in a matrix, call it A, where i is the column index and j is the row index, so that (9.55) can be written in the compact form
$$ \delta t_j \approx \sum_{i=1}^{n} A_{ji} c_i, \tag{9.56} $$

with

$$ A_{ji} = -\int_{\text{ray path}_j} \frac{1}{v_0(\mathbf{x})} f_i(\mathbf{x})\, dl. \tag{9.57} $$
If you think of the δtj’s as the entries of a vector δt, and of the ci’s as the entries of a vector c, you get the even more compact form

$$ \delta \mathbf{t} \approx \mathbf{A} \cdot \mathbf{c}. \tag{9.58} $$
If you remember Chap. 2, you’ll realize that (9.56) and (9.58) describe a linear system, where the vector c has all the unknown coefficients, and everything else, like I said, is just numbers that we know. You’ll also realize that, once you’ve computed A, estimating δv/v0 from the δtj’s amounts to solving that linear system, or rather, I should say, that linear inverse problem. Solving (9.58) gives you the ci’s, and once you have those, all you have to do is implement (9.53) to calculate the value of δv/v0 anywhere you want. What you have, at that point, is what people call a tomographic model (or tomographic map, tomographic image, etc.) Now, all this looks great, but actually there’s a problem. Because the thing is, we can’t fully control how we “sample” the medium that we are trying to “image”. Because that depends on how many instruments we have access to, and where we can place them (i.e., it is relatively easy to deploy seismometers on land; it is difficult to deploy them under the ocean; two thirds of our planet’s surface is covered with oceans), and how many earthquakes occur, and where, and of what magnitude. To see what I’m talking about, consider a specific example. Let’s say that our δtj’s are Rayleigh-wave travel times, which we could measure after filtering the vertical component of seismograms in a narrow frequency band. (Like, e.g., Kausel’s data that I just mentioned, phase velocities around 4 km/s, were filtered around 0.025 Hz—which means the period is 40 s, or 1/0.025.) This way, we don’t have to worry about dispersion: single-frequency Rayleigh-wave velocity is a function of (co)latitude ϑ and longitude ϕ—nothing else. Let’s pick the simplest set of basis functions that one can possibly think of, i.e., let’s subdivide the earth’s surface into pixels703, with constant latitudinal and longitudinal increments.
Formally, this means that our basis functions can be written

$$ f_i(\vartheta, \varphi) = \begin{cases} 1 & \text{if } \vartheta_i \le \vartheta < \vartheta_i + \delta\vartheta \text{ and } \varphi_i \le \varphi < \varphi_i + \delta\varphi \\ 0 & \text{otherwise,} \end{cases} \tag{9.59} $$

where δϑ is the latitudinal extent of pixels, δϕ their longitudinal extent (to keep things simple, we take both to be constant for all pixels), and ϑi, ϕi are the colatitude and longitude of the north-west corner of the i-th pixel (if we pick longitude to grow from west to east). In other words, fi is one within pixel i and zero everywhere else, i.e., it is what (some) people like to call the “characteristic function” of pixel i.
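To make the whole recipe of Eqs. (9.53) through (9.60) concrete, here is a toy sketch in Python. Everything in it is invented for illustration (a flat, unit-square “earth” instead of a sphere, straight rays, a uniform reference velocity, function names like `pixel_index` and `row_of_A` that are mine): pixel characteristic functions, the matrix A assembled by numerical integration along each ray, synthetic delay times, and the least-squares solution.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy, flat, unit-square "earth" divided into 10 x 10 pixels, so
# n = 100 basis functions f_i: the characteristic functions of Eq. (9.59).
npix = 10
n = npix * npix
v0 = 4.0                                   # uniform reference velocity, km/s
c_true = 0.03 * rng.standard_normal(n)     # "true" dv/v0: a few percent

def pixel_index(x, y):
    """Index i of the pixel containing (x, y): f_i = 1 there, 0 elsewhere."""
    col = min(int(x * npix), npix - 1)
    row = min(int(y * npix), npix - 1)
    return row * npix + col

def row_of_A(src, rec, nsteps=400):
    """One row of the matrix of Eq. (9.57): A_ji is minus the length of
    ray j inside pixel i, divided by v0 (straight rays, crude quadrature)."""
    a = np.zeros(n)
    dl = np.linalg.norm(rec - src) / nsteps
    for s in np.linspace(0.0, 1.0, nsteps, endpoint=False):
        x, y = src + s * (rec - src)
        a[pixel_index(x, y)] -= dl / v0
    return a

# Many random source-receiver pairs, one delay time per pair (Eq. 9.55).
pairs = [(rng.random(2), rng.random(2)) for _ in range(600)]
A = np.array([row_of_A(src, rec) for src, rec in pairs])
dt = A @ c_true            # synthetic, noise-free "observations"

# Least squares (Eq. 9.60): lstsq applies the generalized inverse
# (A^T A)^(-1) A^T in a numerically stable way.
c_ls, *_ = np.linalg.lstsq(A, dt, rcond=None)

sampled = np.abs(A).sum(axis=0) > 0        # pixels crossed by at least one ray
print(np.abs(c_ls[sampled] - c_true[sampled]).max())   # small where sampled
```

In this noise-free toy, pixels that are crossed by enough rays are recovered essentially exactly; real data are noisy, and real matrices are often close to rank-deficient, which is why tomographers end up regularizing the inversion in one way or another.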
Now look at the integral in (9.57). If ray path j doesn’t go through pixel i (which will be the case most of the time, unless pixels are really huge), that integral is zero. Now imagine there’s a pixel that isn’t sampled by any ray path at all: there will be no way to constrain the velocity within that pixel, and so, in that case, there’s no unique solution to problem (9.58): you could assign any value to that pixel’s ci, and nothing would change. Bottom line, you can’t control how many (nonzero) columns your matrix A has: that depends on the spatial distribution of sources and receivers. Next, think about the number of rows in (9.58). For each observation δtj, that is, for every source-receiver pair for which you’ve been able to pick a travel time, you’ve got one row. You can’t control how many rows you have704: that depends on how many earthquakes occur, and on how well your seismometers are recording them (the so-called noise floor, combined with the magnitude of the quakes: noisy instruments will miss the smaller events, etc.). Because of all this, A is not a square matrix, it doesn’t have an inverse, and you can’t solve (9.58) exactly. The best you can do is find the values of the ci’s such that the difference between the left- and right-hand sides of (9.58) is as small as it can possibly be. What people usually do is they take the squared difference |δt − A · c|², and find c such that this expression is minimum: which is called the least-squares criterion705. If you work out all the math706, you’ll find that the least-squares value of c is given by the formula
$$ \mathbf{c}_{LS} = \left( \mathbf{A}^T \cdot \mathbf{A} \right)^{-1} \cdot \mathbf{A}^T \cdot \delta\mathbf{t}, \tag{9.60} $$
where, for whatever reason, (AT · A)−1 · AT is called the generalized inverse of A. Equation (9.60) is the mother of all equations for seismic tomographers. In practice, A is a rectangular matrix with more rows than columns; that’s because data are noisy, and if you want to reduce the effects of noise, it’s best to “oversample” and have more δtj’s than you have ci’s. (If you have only little data, you’ll have to be content with few basis functions, i.e., with lower resolution; you can have high resolution only if you have a lot of data—and geographically well distributed, too.) The matrix AT · A, on the other hand, is square; its size is determined by the number of columns in A—the number of rows doesn’t matter, so just collect and use as many data as you can. Because AT · A is square, it is possible in principle to find its inverse707, and Eq. (9.60) can be implemented as it is. So, anyway: after doing all that, and finding the coefficients cLS, one plugs them in the place of the ci’s in Eq. (9.53), to find a phase-velocity map, δv(ϑ, ϕ)/v0, which, if the period around which we filtered the data is 60 seconds, should look something like the map in Fig. 9.8. Three-dimensional tomography maps are derived from body-wave travel times, either P or S, in pretty much the same way—except that in 3-D the maths is slightly more complicated. Figures 9.9 and 9.10 show some examples of the results that people get. There are a lot of things that one might say about Figs. 9.8, 9.9, 9.10. In fact, I’ve spent a significant part of my, uhm, career, listening to discussions, which in some cases were arguments rather than discussions, if not outright fights708, listening and
Fig. 9.8 Global map of the phase velocity of Rayleigh waves filtered around 60 s (period), i.e., 0.017 Hz (frequency), with coastlines (thin solid lines) and plate boundaries (thick solid lines). Phase velocity is expressed as percent difference with respect to its average. The bar below the plot shows which shade of gray corresponds to which level of heterogeneity: as you see, that rarely exceeds ±3%. This map was done by yours truly, using data from Göran Ekström’s website at Lamont. You can tell that the “parameterization” I’ve chosen is a grid of pixels; their size is 3◦ × 3◦ at the equator, but their longitudinal extent gets bigger the closer they are to the poles, so that a pixel’s area is about the same at all latitudes
occasionally taking part (because you have to do what you have to do) in discussions about what is right in figures like these, and what is wrong, and what they mean in terms of mantle circulation and earth structure, etc.; and, I must say, this was not the most interesting and/or productive part of my career. Because there’s a significant amount of arbitrariness in the way tomo models are derived (e.g., in the way you pick your basis functions—which don’t necessarily need to be pixels, but then anyway even if they’re pixels, what size should they be?—or in the way you deal with having to find the inverse of AT · A when AT · A is sparse and numerically bad, etc.709): and what you end up arguing about—the features that differ from model to model—is precisely the arbitrary part of the models: that stuff that isn’t really “robust”, as they say. What I think is more interesting is, rather, to focus on the features that do not change depending on the model we look at. And I’ll try to enumerate those, now. The first thing I should point out is that the heterogeneity we are looking at is small. We are talking about a few percent tops. This shouldn’t come as a surprise to us, because we’ve been through Chap. 7 and we know that the earth is, to a good approximation, spherically symmetric: the properties of the materials we find in it change with depth (because pressure and temperature change a lot with depth), but not so much with latitude and longitude, i.e., “laterally”. The second thing that seems important to me is that, wherever we have some a-priori idea of what lateral heterogeneity in temperature should be like, well, it turns out that that correlates with the heterogeneity in seismic velocities—both vP
Fig. 9.9 Vertical sections, AKA cross sections, through two tomographic models. Showing tomography in cross section means you show seismic velocities as functions of depth (vertical axes) and distance along a great circle (horizontal axis, with a star to show the origin). Here are two small slices of cross section: one (top) cuts under Japan (east-west from the Pacific Ocean, through almost the northern tip of Honshu and across the North Korea-China border, into China); the other is under Central America (from the Pacific Ocean off the coast of Guatemala, along the Honduras-Guatemala border, all the way to central Cuba). The models are (to the left) the vP model of Yoshio Fukao and Masayuki Obayashi (Journal of Geophysical Research, vol. 118, 2013) and (to the right) the vS model of J. Ritsema et al. (Geophysical Journal International, vol. 184, 2011). These are places where plates converge against one another, and we expect one to subduct under the other, and if higher-than-average seismic velocity correlates with lower-than-average temperature, which we think it does, then this tomography indeed shows subducting plates. (In case you doubt it, look at the white dots, which show earthquake hypocenters as reconstructed by seismologists—and see Chap. 8, the bit on Wadati and Benioff, Figs. 8.6 and 8.7)
Fig. 9.10 Tomography maps, published in 2019 (Journal of Geophysical Research, vol. 124) by Steve Grand and his team, of P-wave (left), and S-wave (right) velocity at 1250 (top) and 2750 (bottom) km depth in the mantle. You see that the results of P- and S-wave tomography are, to a large extent, similar. The so-called mid mantle, around 1000–1500 km below the earth’s surface, is “dominated” by two “linear” high-velocity anomalies—one under the Americas and the other going roughly from Anatolia to Indonesia, which people think are Farallon and Tethys: former oceanic plates now sinking into the lower mantle. At the bottom of the mantle, 2500–2900 km or so, the most prominent features one sees are two large low-velocity areas: one directly below the middle of the Pacific ocean; the other under southern and western Africa and the Indian Ocean
and vS. (In fact, and this is the third point I want to make, tomo models of vP and tomo models of vS are also similar to each other: look at Fig. 9.10, too.) If you look at Fig. 9.8 (and remember that Rayleigh-wave velocity is an average of the values of vS from the crust to, like, a couple hundred km depth in the upper mantle), you see that phase velocity is lower-than-average—and so vS must be lower-than-average—wherever you have mid-ocean ridges—or under eastern Africa, where you have a rift valley, which is like a mid-ocean ridge that is being born right now. These are all places where mantle stuff is rising, to form new lithosphere—and it’s rising because it’s hot: high temperature correlates with low seismic velocity. Likewise, phase velocity is higher-than-average under continents, where we know710 from Chap. 8 that temperature is likely to be much lower than under oceans. In Fig. 9.9 we’re looking at vP and vS, but same thing: these are vertical cuts through areas where plates converge: and plate tectonics says that one plate should be subducting under the other. The subducting plate is colder than the mantle around it, conduction being so incredibly slow in mantle materials—we looked into this in Chap. 8: and it turns out that we see higher-than-average seismic velocity more or less where we expect the subducting slab to be. Of course, we are not always totally sure of where the slab should be. We do find hypocenters of deep quakes down to the
660 or so, and those only happen where plates are expected to subduct, and we can use them to trace the subducting slabs. But below the 660, who knows? This is the kind of thing that people get into fights about. The high velocities mapped under the Kuriles seem to stop above the 660, as if the slab weren’t able to go through it. But the high-velocity thing under the Aegean arc looks like a slab that goes right through the 660, unhindered by whatever viscosity discontinuity might happen there. So do plates sink all the way to, like, the CMB, or do they stop at the 660 (or somewhere else in the lower mantle)? If you ask most of my colleagues, the answer you’ll probably get is, it depends on the slab. But then, is convection layered or whole-mantle? It’s still difficult to give a straight answer to this one, I guess. We’ll get back to that in a minute. Incidentally, physics explains why high T means low seismic velocity; but the whole story is quite complicated. In a nutshell, it has to do with how atoms oscillate around their equilibrium positions. I touched on the kinetic theory of matter in Chap. 4711 . Physicists and chemists figured that atoms in minerals form a lattice712 , a network of regularly spaced points; but atoms are not standing still: they oscillate around the network’s nodes. In the first approximation, they oscillate like a mass attached to a spring, or like a pendulum713 , i.e., like the cosine or sine of some frequency times time: this is called, as some of you probably know, harmonic oscillation. Strictly speaking, though, this is only correct at low temperatures. If we could look at an atom’s behavior while T grows, we’d see that, at higher T , it oscillates kind of like a spring that responds to compression more than it responds to expansion: i.e., the restoring force that pulls the atom back to its equilibrium position is stronger when interatomic distance is reduced than when it is increased. 
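Here is a little numerical sketch of that idea (everything in it, the potential, the parameter values, the function names, is invented for illustration): integrate the motion of a particle in a slightly asymmetric potential, one whose restoring force is stronger under compression than under extension, and watch the time-averaged position drift away from equilibrium as the oscillation amplitude (our stand-in for temperature) grows.

```python
a = 0.5   # anharmonicity parameter (made up); a = 0 gives the harmonic case

def force(x):
    # From the (asymmetric) potential V(x) = x**2 / 2 - a * x**3 / 3:
    # the restoring force is stronger for x < 0 (atoms pushed together)
    # than for x > 0 (atoms pulled apart).
    return -x + a * x * x

def mean_position(amplitude, dt=1e-3, nsteps=500_000):
    """Integrate the oscillator (leapfrog scheme) from rest at
    x = amplitude, and return the time-averaged position."""
    x, v = amplitude, 0.0
    total = 0.0
    for _ in range(nsteps):
        v += 0.5 * dt * force(x)
        x += dt * v
        v += 0.5 * dt * force(x)
        total += x
    return total / nsteps

# The hotter the crystal, the larger the oscillation amplitude, and the
# further the average position drifts from equilibrium (thermal expansion):
for amp in (0.1, 0.2, 0.4):
    print(amp, mean_position(amp))
```

With a = 0 the average position stays at zero no matter the amplitude; with a > 0 it grows roughly like the amplitude squared, which is the mechanism behind thermal expansion.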
What we see is still an oscillation, but the atom’s position doesn’t evolve like a sinusoid anymore; because atoms, now, like to spend more time away from one another than near one another; this is what people call an anharmonic oscillation. The higher T is, the more anharmonic the oscillation: this causes “thermal expansion” (which we’ve kind of taken for granted, as an empirical result, throughout this book: if you raise the T of most materials, their volume grows as well). As the average distance between atoms grows, the (average) restoring force that keeps them together becomes weaker. And this actually reduces the seismic velocities, too: because the weaker the restoring force, the longer it takes each atom, when the crystal is hit by an elastic wave, to respond and bounce back towards its initial position. Bottom line, people study anharmonicity to quantify the derivatives of vS and vP with respect to T, i.e., to translate perturbations in seismic velocities into perturbations in temperature. The story gets even more complicated if you consider that the materials we are dealing with are viscoelastic rather than elastic. Which we did when we studied mantle circulation, but not with seismology. In all the seismology we covered so far, it was OK to treat the earth like an elastic medium; but, if you think about it, that poses some problems. A perfectly elastic medium, like a perfect spring714, keeps oscillating forever—energy conservation plus no dissipation. In a perfectly elastic earth, the waves emitted by a quake would continue to bounce back and forth across the planet forever—which we know it’s not the case. But if the stuff that makes up the earth works like the spring-plus-damper system of Fig. 9.6715, then, each time
new atoms are hit by a seismic wave, some of the energy is sucked into the damper, which doesn’t bounce back instantly—the damper’s ε is not proportional to τ (its time-derivative is)—and so doesn’t contribute to the propagation of the wave. Which eventually dies away. This is what we observe in earthquakes, and it’s called seismic attenuation. So, yes, viscoelasticity is relevant to seismic waves, too, and the theory of seismic waves should be put together starting from some viscoelastic stress-strain relation rather than from Hooke’s law (like we did in Chap. 6). Which people naturally tried to do (although, like I said, nobody knows for sure what viscoelastic stress-strain relation we should use), and found that viscosity enters not only in the formulae for the parameters that describe attenuation—the rate at which the amplitude of a seismic wave decays with distance—but also in the formulae for the seismic velocities. And the reason I mention this right now, is that temperature controls the damper’s viscosity, just like it controls anharmonicity: and so we must study viscoelasticity—or anelasticity, as seismologists prefer to say—we must study anelasticity as well, and not just anharmonicity, if we are to properly interpret seismic velocities in terms of temperature. Once we’ve taken all these effects into account, it turns out (i) that (small) lateral variations in compressional and shear velocity, δvP/vP and δvS/vS, are both proportional to lateral variations in temperature, δT: i.e., both shear and compressional waves are fast where T is low, and slow where T is high: which is exactly what we see in Figs. 9.9 and 9.10, over most of the earth’s mantle. (ii) The derivative of δvP/vP versus T is different from the derivative of δvS/vS versus T—and the way both derivatives change with depth (i.e., with temperature) is also different; over much of the mantle, tomography (again, Figs. 9.9 and 9.10) shows that δvS/vS is about twice as large as δvP/vP, which is more or less what you expect based on anharmonicity plus anelasticity. Differences in temperature, then, are enough to explain heterogeneities in seismic velocity. But: (iii) there are places, particularly the very bottom of the mantle (Fig. 9.11), where δvS/vS and δvP/vP are less correlated716 than elsewhere. We might need something else than thermal variations, then, to explain that kind of heterogeneity: and the only candidate we have left is chemical heterogeneity. (iv) There are also places where the ratio of δvS/vS to δvP/vP is as high as 3, or more: and this is too large to be explained in terms of anharmonicity plus anelasticity. Again, this happens most of all in the deep mantle (look at the maps in Fig. 9.10), and, again, we don’t see how it can happen, unless we accept that rock in those places is chemically different from rock in the rest of the mantle. It’s not clear, though—not to me at least—what kind of chemical changes you need to have. But people have some ideas, as we shall see in a minute.
Fig. 9.11 Correlation (dashed line) and ratio (solid line) of δvS/vS and δvP/vP, according to the model of Thorsten Becker and yours truly (Geochemistry, Geophysics, Geosystems, vol. 3, 2002). To find correlation as a function of depth, you sample both P and S models over a regular latitude-longitude grid at a bunch of depths, and then, at each depth, implement the correlation formula. To find ratio as a function of depth, we took, again, both models at one given depth, calculated the ratio of δvS/vS to δvP/vP at each gridpoint at that depth, and then averaged over all gridpoints at that depth (excluding gridpoints where δvP/vP and/or δvS/vS are too small to be significant). We iterated this over all depths. You see that, at the bottom of the mantle, correlation is relatively low, and δvS/vS is much bigger than δvP/vP
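The recipe in the caption is easy to sketch in code (Python; the two “models” below are random arrays standing in for real δvP/vP and δvS/vS maps, with δvS/vS built, by construction, to be about twice δvP/vP, so the numbers that come out are known in advance):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-ins for two tomographic models sampled on the same grid:
# 20 depth levels x 500 gridpoints per level (sizes are arbitrary).
ndepths, npts = 20, 500
dvp = 0.01 * rng.standard_normal((ndepths, npts))
dvs = 2.0 * dvp + 0.002 * rng.standard_normal((ndepths, npts))

correlation = np.empty(ndepths)
ratio = np.empty(ndepths)
for k in range(ndepths):
    # Correlation of the two maps at depth k:
    correlation[k] = np.corrcoef(dvs[k], dvp[k])[0, 1]
    # Ratio averaged over gridpoints, excluding near-zero dv_P/v_P
    # (as in the caption, to avoid dividing by insignificant values):
    ok = np.abs(dvp[k]) > 0.5 * dvp[k].std()
    ratio[k] = np.mean(dvs[k][ok] / dvp[k][ok])

print(correlation.min())   # close to 1 for these synthetic maps
print(ratio.mean())        # close to 2, as built in
```

With real models, of course, both curves vary with depth, which is exactly the signal plotted in Fig. 9.11.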
9.5 Putting Everything Together

Now, the research we’ve seen so far in this chapter was begun in the 1970s, developed through the 1980s, and more or less concluded in the 1990s. In 2000, Paul Tackley717 published, in vol. 288 of Science, a review paper called “Mantle Convection and Plate Tectonics: Toward an Integrated Physical and Chemical Theory”, where he puts together all findings from geochemistry, seismology and geodynamics, kind of like I am trying to do with this chapter, and comes to some tentative conclusions about how it all works. He starts out by saying, “the physics of plate tectonics is poorly understood; other planets do not exhibit it. Recent seismic evidence for convection and mixing throughout the mantle seems at odds with the chemical composition of erupted magmas requiring the presence of several chemically distinct reservoirs within the mantle. [...] “Despite three decades of research, some first-order questions regarding the plate-mantle system remain largely unresolved. The first of these is why Earth developed plate tectonics at all, given that other terrestrial planets such as Mars and Venus do not currently exhibit this behavior. [...] The second major question is how the
differing chemical compositions of erupted magmas, which require several chemically distinct reservoirs within the mantle, can be reconciled with other observational and dynamical constraints favoring whole-mantle convection [...], which should mix chemical heterogeneities.” The observational and dynamical constraints are seismic tomography and geodynamical modeling, of course, respectively, which we’ve just learned everything about. I don’t think Tackley (or anyone else) has a real, deep answer to the first question, other than listing some features of numerical models: e.g., that convection does not need to generate plates, it rather generates a “rigid” lid. What’s interesting, for us, right now, in his paper, has to do with the second question. Because Tackley goes through all plausible ideas of mantle circulation that had been put forward at that point, and shows what works and what doesn’t with all of them, from each discipline’s angle. The main thing is that “arguments against [the idea of “layered” convection] have been reinforced by high-resolution global models of 3-D mantle velocity structure obtained using seismic tomography, which indicate recently subducted downgoing oceanic plates (‘slabs’) in the lower mantle. Supporting this interpretation are mantle convection models in which plate motions for the last 140 million years are imposed as a surface boundary condition; these models display volumetric structure resembling the seismically imaged structure718 . In many places, the slabs appear to temporarily flatten out above the 660-km discontinuity before descending into the lower mantle, as expected from the actions of a major mineralogical phase transition at this depth”. (Meaning that the stuff that’s below the 660 is also denser than the stuff above—it’s in a more closely-packed phase—and so the stuff that’s above is more buoyant and tends to float above the 660: until it cools down enough to sink, I guess. 
Plus, we know now that people looking at, like, rebound data see a viscosity change around that depth, and the viscosity change might well be related to the phase change. And the change from lower to higher viscosity could slow the slabs down, and/or stop (some of) them altogether, as they try to sink into the lower mantle.) Put simply, Tackley says that layered convection would be a good way to explain the geochemists’ observations, but just doesn’t fit the findings of seismic tomography and of geodynamical modeling. Simple whole-mantle convection, on the other hand, he says, doesn’t work, because there’s way too much difference in composition between OIB and MORB, for them to come from the same source. So, then, it could be that there are some reservoirs of primitive materials scattered around in the mantle; or maybe there’s a primitive layer, yes, but not as large as the entire lower mantle. The former idea was proposed by my friends Thorsten Becker and Jamie Kellogg, who were working on their theses at the same time as me, and in the same department—their supervisor was Rick O’Connell. Thorsten et al. came up with the concept of primitive-mantle reservoirs in the form of “blobs”—that’s the word they used—, i.e., huge chunks of mantle “with a viscosity 10–100 times that of normal mantle”, says Tackley in his review, that “deform slowly enough to remain unmixed after billions of years [...]. The blobs would be hidden from mid-ocean ridge spreading centers but sampled by plumes.” This is illustrated in Fig. 9.12: model “C”. According to Tackley’s simulations back in 2000, this wouldn’t work: because of the “difficulty in balancing thermal and compositional buoyancy. The blobs should
Fig. 9.12 In his 2000 Science paper, Tackley identifies six circulation models that summarize, more or less, all the ideas on mantle convection and chemical mixing, etc., at the time his paper was written. The first two—pure whole-mantle convection, with no reservoirs, and pure layered convection, where the entire lower mantle would be a reservoir—we already went through, and abandoned. What you see here are sketches of the four remaining models: “C”, “D”, “E” and “F”. ERC stands for “enriched recycled crust”: crustal material that separated early on from the primitive mantle, and so is enriched in incompatible elements, but is then carried back into the mantle by subducting slabs. P stands for “primitive”. DMM stands for “depleted MORB mantle”: the kind of mantle that today’s MORB comes from. (Used with permission of The American Association for the Advancement of Science, from Paul J. Tackley, “Mantle Convection and Plate Tectonics: Toward an Integrated Physical and Chemical Theory”, Science, vol. 288, 2000; permission conveyed through Copyright Clearance Center, Inc.)
either float to the top or sink to the bottom, forming a layer.” In other words, if I understand his point correctly, Tackley is saying that it is very unlikely that blobs should have the same buoyancy as the mantle around them, and at the same time a different composition and a different viscosity. Then, again, if you look in the literature today, you can find numerical models where “large-scale heterogeneity with viscosity” that’s 20 times that of the rest of the mantle, is “sufficient to prevent efficient mantle mixing, even on large scales.”719 And Nicolas Coltice, whom I mentioned earlier, and who runs, I think, the same simulation software as Paul Tackley, wrote720 in 2021 that his simulations “helped build consensus between the [...] viewpoints [of layered versus whole-mantle convection, i.e., geochemistry vs. geophysics]: the mantle is not dynamically stratified in shells but distributed chemically distinct domains can resist mixing from top to bottom over billions of years.” So, I am not sure who’s right, then. (At the moment everybody’s right, I guess, and everything is possible.)
Model “D” in Tackley’s paper (and in Fig. 9.12 here) is essentially Hofmann’s “uniformitarian” model, see above, where some early, “enriched” MORB gets recycled into the mantle by subduction, then is brought back to the crust—at OIBs—by plumes, then subducted back again into the mantle, and so on and so forth, so there’s no need for multiple stable reservoirs: everything is explained by this cycle. It doesn’t look like you can make this happen in dynamic simulations, though: “The dynamical feasibility of this model with respect to convective mixing”, says Tackley, “remains to be demonstrated.” The sketch in Fig. 9.12 is not the snapshot of a simulation, but just a “cartoon”, representing graphically the geochemists’ speculation. As far as I can tell, “E” and “F” are Tackley’s and most of my colleagues’ favorite models. They are actually very similar, the idea being that there is one huge reservoir of primitive mantle material at the bottom of the lower mantle. Seismologists don’t seem to see it clearly: they don’t see any major, sharp discontinuity in the lowermost mantle: there’s the D″, yes, but that’s not as sharp as, like, the 410 or the 660 (it is not observed at all latitudes and longitudes, and its depth can change quite a bit depending on where you look); plus, the D″ layer, about a couple hundred km thick, would be too small to account for the amount of primitive mantle that we expect to still have in the earth. The “character” of tomography models, on the other hand, does change, say, a few hundred, maybe even 1000 km above the CMB. In some models, slabs don’t look like they sink all the way down to the CMB, but seem to stop at some shallower depth; and the bottom, I don’t know, 500 km of the mantle are “dominated”, as they say, by two “prominent” low-velocity regions of very large lateral extent, one roughly under the central Pacific Ocean, the other under southern Africa and parts of the Indian and Atlantic oceans. You can see this, more or less, in Figs.
9.9 and 9.10. I think it was Adam Dziewonski who first talked about them, in the 1980s or early 90s, and called them “superplumes”; later they were called “megaplumes”, and “large low-shear-velocity provinces” (v P is also low, there, though, to some extent), and probably by some other names, too. Now, that’s also the depth range where v P and v S are not that well correlated—the S is Pacific megaplume is seen in S-, but not in P-velocity maps, and, on average, δv vS δv P more three times as large as v P . Which, like I was saying, might mean that the stuff that forms, e.g., the megaplumes is chemically different from the rest of the mantle at the same depth. Model “E” is based on this idea: the seismologists’ megaplumes are the reservoir we’re looking for, i.e., they are “piles”, as Tackley calls them in his paper, of primitive mantle material, which “would be hot as a result of radiogenic heating, with thermal upwellings arising from their tops, thus explaining the distribution of hotspots721 and the topography of southern Africa722 .” (Two reservoirs, then, actually, rather than one: but Tackley calls them “a discontinuous layer”.) “This model fits most of the seismic observations”, but “the volume of the piles is rather small to satisfy geochemical constraints”—remember the mass balance thing in Hofmann’s paper, which basically you need, like, half of the mantle to be primitive: and the superplumes are much smaller than that. Same for the heat budget. The reservoir in model “F”, instead, forms one continuous layer: “a global, undulating layer [meaning, I guess, that its outer boundary isn’t spherical, but has a lot
of “topography”] of average thickness 1300 km (but varying by ∼1000 km [voilà]) and constituting ∼30% of mantle mass [...]. This model”, says Tackley, “fits current geochemical constraints and some seismic constraints, particularly the stopping of slabs [1000–1500 km] above the CMB in some seismic tomographic models. However, it is necessary to explain why such a layer is invisible seismically, because the undulating boundary and the recycled layer above it should generate [...] considerable [...] seismic heterogeneity at mid-mantle depths”: which doesn’t show in tomography models. Model “F” was Louise Kellogg et al.’s idea, published in their 1999 Science paper that I cited a few pages back. Both “E” and “F” are “dynamically plausible”, i.e., the way they are sketched in Fig. 9.12 is a rendition of what computer simulations look like. But, basically, model “E” is tuned to reproduce the main features of seismic tomography, and doesn’t quite fit geochemistry data; “F” is tuned to satisfy the constraints of geochemistry, but is not totally convincing from the seismology angle. And these are the models of mantle circulation that were around when Tackley wrote his review: which are also, more or less, the models of mantle circulation that are around today. Because, and I realize that this might sound somewhat pessimistic, but the thing is, I don’t think our idea of the earth has changed much in the last twenty, twenty-five years or so. That’s kind of depressing, to me at least, given that that covers approximately the span of my “career” in the earth sciences. And it’s not like we don’t try: we collect more and more data, run models, go to meetings. But the theories that we come up with, to explain what we measure and the results of our calculations, don’t get any simpler—if anything, they get more complex. Which in other words means, we’ve got an enormous, and growing, amount of info about the earth, but that’s not helping us to understand it better than our Ph.D. supervisors already did, a generation ago. There’s plate tectonics, mantle convection, olivine, the iron core and the geodynamo—those concepts have been around for quite a while, and now all we are doing is adding footnotes to them. Maybe our supervisors were smarter than we are. Or maybe it’s just that we’ve gotten to the point where we already know all that we’re ever going to be able to find out about our planet; and there are no more questions around, which, if answered, would really change our concept of the earth. I am not saying that this is necessarily the case: it’s just my feeling. Maybe I am totally wrong, and there’s a big discovery right around the corner, that will revolutionize our way of thinking about earthquakes, volcanoes, mountain ranges, continents and oceans. I don’t know. This is a question for you now: as far as I am concerned, this is how I am going to end this book.
Exercises: Further Food for Thought
1. Explain Eratosthenes’ method for estimating the Earth’s radius.
2. From the value of the angle between moon and sun at half moon, can you derive the ratio of moon-earth and moon-sun distances? Do the math, assuming the angle in question to be 87° (which was Aristarchus’ estimate). Comment on your result.
3. Consider two spheres of equal mass. The density of the first one changes within the sphere, but only as a function of distance from the sphere’s center (no “lateral” variations); the density of the second one is uniform throughout the sphere. An observer measures each sphere’s gravitational attraction; the measurements are made at equal distance from the center of each sphere. How different are the two measurements? why?
4. How were the first estimates of the Earth’s mass made? Explain briefly.
5. You are standing 10 km away from (the center of mass of) a mountain, and, albeit slightly, you are gravitationally attracted to it. Via some incredibly precise instrument, you measure the mountain’s force of attraction on your body, and find that it is equal to 0.007% of your weight. You estimate the mass of the mountain to be about 10^15 kg. Calculate the mass of the earth. (Hint: you do not need to know the so-called gravitational constant, G.)
6. Find the formula for the volume of a sphere with radius R, via a three-dimensional integral in spherical coordinates.
7. Define the Chandler wobble (you probably want to make a sketch) and explain how a measure of the Chandler wobble can be used to constrain the internal structure of the earth.
8. The radius of the moon is about 1700 km; its mean density is about 3300 kg/m³. Calculate the (approximate) gravity acceleration of the moon, as it would be observed on the moon’s surface. How does that compare to the earth’s gravitational acceleration as we experience it?
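For exercises 5 and 8, a quick numerical sanity check can be coded up in a few lines. The script below is mine, not part of the book; it only uses the numbers given in the exercises, plus the tabulated values of G and the earth’s radius:

```python
import math

# Exercise 5: the ratio of the mountain's pull to your weight is
#   F_mountain / F_earth = (M_mountain / M_earth) * (R_earth / d)**2,
# so G (and your own mass) cancels, and M_earth follows from the measured ratio.
ratio = 0.00007           # 0.007% of your weight
d = 10e3                  # distance to the mountain's center of mass (m)
M_mountain = 1e15         # estimated mountain mass (kg)
R_earth = 6370e3          # mean earth radius (m)
M_earth = M_mountain * (R_earth / d) ** 2 / ratio
print(f"earth mass ~ {M_earth:.1e} kg")       # about 6e24 kg

# Exercise 8: for a uniform sphere, g = G M / R**2 = (4/3) * pi * G * rho * R.
G = 6.674e-11
rho_moon, R_moon = 3300.0, 1700e3
g_moon = 4.0 / 3.0 * math.pi * G * rho_moon * R_moon
print(f"moon surface gravity ~ {g_moon:.2f} m/s^2")  # roughly a sixth of 9.8
```
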
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 L. Boschi, Our Concept of the Earth, https://doi.org/10.1007/978-3-031-71579-2
9. Find the general solution φ(t) to the differential equation
a dφ/dt + φ = 0,
with a a constant.
10. L’Hôpital’s rule concerns pairs of functions f(x), g(x) such that lim_{x→c} f(x) = 0 and lim_{x→c} g(x) = 0, where c is a real number. It states that the limit of the ratio of two such functions coincides with the limit of the ratio of their first derivatives,
lim_{x→c} f(x)/g(x) = lim_{x→c} f′(x)/g′(x),
where h′ denotes the first derivative of a function h. Prove that L’Hôpital’s rule is valid, provided that both f and g are differentiable (i.e., their derivatives exist) within an interval (a continuous range of real values) that contains x = c, and that g′(c) ≠ 0. (Hints: you might use the formal definition of derivative. The fact that both f and g go to zero when x goes to c should also help you.)
11. Schwarz’s theorem, AKA Clairaut’s theorem, stipulates that, if a function f = f(x1, x2, ..., xN) can be differentiated at least twice at the point (a1, a2, ..., aN), and if all its second derivatives are continuous at that same point, then their value, at that point, is independent of the order in which they are taken, or in other words,
(∂/∂xi)(∂f/∂xj)(a1, a2, ..., aN) = (∂/∂xj)(∂f/∂xi)(a1, a2, ..., aN).
Use the definition of derivative to prove Schwarz’s/Clairaut’s theorem.
12. Prove that the sine and the cosine functions are linearly independent from one another.
13. Given two functions f(x) and g(x), let us define h(x) as the function such that h(x) = f(g(x)) for every x. Then, the derivative h′(x) of h is given by h′(x) = f′(g(x)) g′(x). This very important result is called the chain rule. Now, try to give a proof of the chain rule. Remember: you can use the definition of derivative as the limit of the incremental ratio.
14. Let u1(x), u2(x), ..., un(x) be n linearly independent solutions to the homogeneous, linear ODE L y(x) = 0, where the operator L does not involve derivatives of order higher than n. Let u_NH be one particular solution to the nonhomogeneous ODE L y(x) = h(x). Write the general solutions of L y(x) = 0 and L y(x) = h(x).
15. Show that the following equality is true:
∫_a^b f(ϕ(x)) ϕ′(x) dx = ∫_{ϕ(a)}^{ϕ(b)} f(u) du,
where ϕ′ denotes the derivative of the function ϕ, and, at least between a and b, f has an antiderivative (that you might call, e.g., F). Incidentally, this equality is the basis for a very useful method of integration that is called integration by substitution. (Hint: you might use, first, the fact that F′ = f—definition of antiderivative—and then the “chain rule”.)
16. Use integration-by-substitution to solve
∫_1^2 (x + 1)⁴ dx.
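A numerical check of exercise 16 (the substitution u = x + 1 turns the integral into ∫_2^3 u⁴ du): the little midpoint-rule integrator below is my own sketch, not anything from the book:

```python
# Exercise 16, via the substitution u = x + 1: the integral becomes
# the integral of u**4 from u = 2 to u = 3, i.e. (3**5 - 2**5)/5 = 42.2.
def midpoint_integral(f, a, b, n=100000):
    """Approximate the integral of f on [a, b] with the midpoint rule."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

exact = (3**5 - 2**5) / 5       # antiderivative u**5/5 at the new limits
numeric = midpoint_integral(lambda x: (x + 1)**4, 1.0, 2.0)
print(exact, numeric)           # both ~42.2
```
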
17. Solve analytically the following indefinite integral:
∫ (2x⁴ + 3)⁵ x³ dx.
18. Newton’s shell theorem says that: (i) a spherically symmetric body (i.e., a spherical shell, or sphere, whose density only depends on distance from the center) attracts external objects gravitationally as though all of its mass were concentrated at its center; (ii) a spherically symmetric shell exerts no net gravitational force on objects inside it, wherever they are within the shell. Give analytical proofs of both statements.
19. Show, by direct substitution, that the combination of
f1(t) = a cos(ct) − b sin(ct),
f2(t) = a sin(ct) + b cos(ct),
where a, b and c are arbitrary constants, solves
df1/dt + c f2 = 0,
df2/dt − c f1 = 0.
20. A, B and C are matrices. Let A · B = C. Write out this matrix multiplication using index notation. If A has n columns, how many rows does B need to have for the dot product A · B to exist? how many columns? what can we say about the dimensions (numbers of rows and columns) of C, given the dimensions of A and B?
21. The following expressions should be read according to Einstein’s summation convention. Convert them to matrix notation.
(i) Dαβ = Aαμ Bμν Cβν,
(ii) Dαβ = Aαμ Bβγ Cμγ,
(iii) Dαβ = Aαγ (Bγβ + Cγβ).
22. Convert to matrix notation:
Dβν = Σ_{μ=1}^{n} Σ_{α=1}^{n} Aμν Bαμ Cαβ.
23. What does the inertia tensor of an axially symmetric ellipsoid look like?
24. Write the Taylor series (i.e., a sum up to n = ∞) of cos(x) around x = 0. What can you say about the convergence of this series when x is close to 0?
25. Find the Taylor expansion, up to degree 2, approximating the function f(x) = 1 + x + x² near x = 1.
26. Write the Taylor series of sin(x) around x = 0. Use the resulting formula to evaluate the limit
lim_{x→0} sin(x)/x.
27. Find the Taylor expansion formula, up to at least second order, for a generic two-variable function f(x1, x2). (Hint: you might introduce a variable t such that x1 = a1 + αt and x2 = a2 + βt, where α and β are constants.)
28. At some point in the history of geology, two schools of thought argued about the origin of mountains: Plutonists and Neptunists. Explain their different points of view. Which theory was, in your opinion, more accurate, and why?
29. What are Steno’s principles? Try to state them as precisely as you can.
30. Prove that the function
T(z, t) = D ∫_0^{z/(2√(κt))} e^(−x²) dx
verifies the PDE
∂T/∂t (z, t) = κ ∂²T/∂z² (z, t).
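Exercise 30 can also be verified numerically before you attempt the proof: the snippet below (my own sketch; the values of D, κ, z and t are arbitrary) compares finite-difference estimates of the two sides of the PDE, using the fact that the integral in T(z, t) is, up to a constant, the error function:

```python
import math

# Exercise 30, checked numerically: the integral of e**(-x**2) from 0 to
# z/(2*sqrt(kappa*t)) equals (sqrt(pi)/2) * erf(z / (2*sqrt(kappa*t))).
D_const, kappa = 1.0, 1e-6        # arbitrary toy values
def T(z, t):
    return D_const * math.sqrt(math.pi) / 2.0 * math.erf(z / (2.0 * math.sqrt(kappa * t)))

z0, t0 = 0.5, 1.0e4               # arbitrary point where we test the PDE
dz, dt = 1e-3, 1.0
dTdt = (T(z0, t0 + dt) - T(z0, t0 - dt)) / (2 * dt)
d2Tdz2 = (T(z0 + dz, t0) - 2 * T(z0, t0) + T(z0 - dz, t0)) / dz**2
# the PDE dT/dt = kappa * d2T/dz2 should hold to finite-difference accuracy
print(dTdt, kappa * d2Tdz2)
```
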
31. What is “uniformitarianism” in the earth sciences?
32. The walls of an industrial furnace are made of refractory material with a thickness of 0.15 m and a thermal conductivity of 1.7 Watts per meter per Kelvin degree. At steady state, the temperatures measured at the internal and external surfaces of a wall are 1400 K and 1150 K, respectively. How much heat is transferred, per unit time, from the furnace to the outside through a portion of rectangular wall, with an area of 0.5 m × 1.2 m?
33. Coriolis proved that the total kinetic energy of a system (i.e., a set of material points), minus the total work done on the system, is “conserved”, i.e., it is always constant. He did that based solely on Newton’s second principle, that force equals mass times acceleration. Can you repeat Coriolis’ proof?
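Exercise 33 can be explored numerically, too. The sketch below (an arbitrary, made-up force acting on a single material point) integrates Newton’s second law with a small timestep and checks that the change in kinetic energy equals the work done:

```python
# Exercise 33 checked numerically for one material point: integrate F = m*a
# and verify that the change in kinetic energy equals the work done by the
# force along the trajectory.
m = 2.0                       # mass (kg), arbitrary
def force(t):
    return 3.0 + 0.5 * t      # some arbitrary time-dependent force (N)

dt, v, x, work = 1e-5, 0.0, 0.0, 0.0
ke0 = 0.5 * m * v**2
for i in range(100000):       # integrate for 1 s
    f = force(i * dt)         # force, held constant over this small step
    a = f / m
    dx = v * dt + 0.5 * a * dt**2
    work += f * dx            # dW = F dx
    v += a * dt
    x += dx
ke = 0.5 * m * v**2
print(ke - ke0, work)         # the two coincide
```

Note that, with this discretization, the identity holds step by step: within each step the force is constant, and the step’s change in kinetic energy is exactly f·dx, which is precisely the content of Coriolis’ theorem.
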
34. Describe Joule’s experiment, and explain how it established a quantitative relationship between heat and mechanical work. Nota Bene: the concept of heat capacity should be part of your explanation.
35. In one of his Popular Scientific Lectures, Helmholtz shows that the sun’s heat does not originate from simple combustion. To do that, he makes an approximate calculation of how quickly the sun would be entirely consumed, if it was made of coal (a very efficient fuel), and shows that this would happen in too short a time compared to the duration of geological processes, known to have happened on earth. Can you repeat his proof? (Hints: the solar constant, or how much heat per unit time per unit area is received by the earth from the sun, is about 1400 J/(s·m²). The distance between earth and sun is about 1.5 × 10^11 m. The area of the outer surface of a sphere is 4π times the sphere’s squared radius. The mass of the sun is roughly 2 × 10^30 kg. Burning 1 kg of coal releases 3 × 10^7 J of heat.)
36. A meteorite hits the earth. Can you estimate (the order of magnitude of) its speed at the moment of impact? Does it depend on the meteorite’s mass? (Hints: neglect the meteorite’s initial velocity—i.e., its velocity before it is close enough to us to be significantly attracted by the earth’s gravity field. Neglect the effects of the earth’s atmosphere.)
37. Briefly explain the method used by Kelvin (1862) to estimate the age of the earth.
38. The error function is defined
erf(x) = (2/√π) ∫_0^x e^(−u²) du.
Find the analytical formula of its first derivative with respect to x.
39. What is a syncline? what is an anticline?
40. I pick a random place on the surface of the (solid) earth. Is it more likely to be: (i) 4000–5000 m below sea level, (ii) 4000–5000 m above sea level, or (iii) 1000–2000 m above sea level?
41. What is Glossopteris, and why is it relevant to the theory of continental drift?
42. For a long time, before the advent of continental drift and plate tectonics, most people thought that the earth was slowly shrinking. What was the reasoning behind this idea?
43. Isostasy: briefly explain Pratt’s and Airy’s models. How do these models address the issue of gravity anomalies observed (or not observed) by George Everest (1847) in northern India?
44. Determine the average elevation of a mountain range based on the principle of isostasy according to Airy, knowing that the average depth of the crust-mantle interface is about 65 km under the range, and 35 km in the surrounding continental areas. Use a simple two-layer model with the crust and mantle having constant densities of ρc = 2.8 × 10³ kg m⁻³ and ρm = 3.3 × 10³ kg m⁻³, respectively. If the range is eroding at a rate of 2 mm per year, what will its elevation be in 10 million years?
45. Prove that the divergence of the curl of a vector quantity is always 0.
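For exercise 44, here is one possible numerical solution, a sketch under my own reading of the erosion question, which the exercise leaves somewhat open: I assume the quoted 2 mm/yr is crust removed from the top, with the range staying in Airy-type isostatic balance throughout.

```python
# Exercise 44, Airy isostasy: the extra crustal root b beneath the range
# must buoyantly support the topography h, so rho_c * h = (rho_m - rho_c) * b.
rho_c, rho_m = 2.8e3, 3.3e3            # crust, mantle densities (kg/m^3)
b = 65e3 - 35e3                        # root: 30 km of extra crust (m)
h = b * (rho_m - rho_c) / rho_c        # initial elevation (m)
print(f"elevation ~ {h/1e3:.2f} km")   # about 5.4 km

# One reading of the erosion question (an assumption of mine): 2 mm/yr of
# crust is stripped off the top, and the range rebounds isostatically, so
# only a fraction (rho_m - rho_c)/rho_m of the eroded thickness is lost
# as elevation.
eroded = 2e-3 * 10e6                   # 20 km of crust removed in 10 Myr
h_after = h - eroded * (rho_m - rho_c) / rho_m
print(f"after 10 Myr ~ {h_after/1e3:.2f} km")
```
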
46. Prove that the curl of the gradient of a scalar quantity is always 0.
47. What is “stress” for a physicist? Define the stress tensor. Write the relation that links stress to surface force.
48. Write the relation that links stress and displacement (deformation) in an elastic medium.
49. Show how the following linear partial differential equation,
∂u(x, t)/∂t = ∂²u(x, t)/∂x²,
can be reduced to two linear ODEs.
50. The deformation u of a perfectly elastic string under tension is described by the linear partial differential equation
T ∂²u(x, t)/∂x² = ρ ∂²u(x, t)/∂t²,
where T is the tension (force per unit area) imparted to the string, ρ is mass density, and x and t denote distance (along the string) and time, respectively. Show how this equation can be reduced to two linear ordinary differential equations. Try to solve both ODEs, and combine them so as to obtain a function describing the string’s deformation in space and time. Identify (sketch) the first few normal modes of the string.
51. The deformation u of a perfectly elastic membrane under tension is described by the two-dimensional wave equation
(1/c²) ∂²u(x1, x2, t)/∂t² = ∂²u(x1, x2, t)/∂x1² + ∂²u(x1, x2, t)/∂x2²,   (∗)
where x1 and x2 are Cartesian coordinates and c² is a real, scalar constant, related to the membrane’s density and tension. Consider all functions
u(x1, x2, t) = u(k1 x1 + k2 x2 ± ct),   (∗∗)
where k1 and k2 are real numbers. Are functions with the form (∗∗) solution to equation (∗)? Under what conditions?
52. Assume you have three vectors x, y and z which satisfy the following relations, y = B · x and z = A · y, where A and B are matrices. First, rewrite these relations using index notation. Then, write the relation between z and x using index notation.
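Exercise 52 can be made concrete with a few lines of code: below, the component relation z_i = Σ_j Σ_k A_ij B_jk x_k is computed both in two steps and in one, with explicit sums standing in for the repeated indices (the matrices and vector are arbitrary toy values of mine):

```python
# Exercise 52 in code: y_i = B_ij x_j and z_i = A_ij y_j together imply
# z_i = A_ij B_jk x_k (sum over the repeated indices j and k).
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [1.0, 1.0]]
x = [2.0, 5.0]
n = 2

# two steps: y = B.x, then z = A.y
y = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
z_two_steps = [sum(A[i][j] * y[j] for j in range(n)) for i in range(n)]

# one step: z_i = sum over j and k of A_ij B_jk x_k
z_direct = [sum(A[i][j] * B[j][k] * x[k] for j in range(n) for k in range(n))
            for i in range(n)]
print(z_two_steps, z_direct)   # identical
```
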
53. Show that, if p and q are integer and p ≠ q, then
∫_{−π}^{π} sin(px) sin(qx) dx = 0.
Show also that ∫_{−π}^{π} sin(px) cos(qx) dx = 0, for any values of p and q (including p = q).
54. Show that
∫_{−π}^{π} sin²(px) dx = π
for any value of p. Does p need to be integer for this equality to hold?
55. Displacement u in a homogeneous, elastic, isotropic medium obeys Navier–Cauchy’s equation
ρ d²u/dt² = (λ + μ)∇(∇ · u) + μ∇²u + f,
where x is position, t time, and λ, μ are the so-called Lamé’s parameters. Show that the divergence of u, in a medium such as this, behaves like a wave, i.e., it must be a solution to the wave equation. What can you say about its speed of propagation?
56. The d’Alembert solution of the one-dimensional wave equation reads u(x, t) = u(x ± ct), where x is distance, t is time, u is displacement, and c a real, scalar constant. Define, in terms of those symbols, the velocity at which a small portion of a medium moves, when it is “hit” by the wave, and the speed of propagation of the wave.
57. Describe the motion of a buoy as a sea wave passes through. Describe the motion of a point, stuck to the earth’s surface, as a Rayleigh wave passes through. (Drawings are fine.) Do they differ? how?
58. Surface waves are said to be “dispersive”. What does that mean?
59. Formulate the Huygens-Fresnel principle. What did Augustin Fresnel specifically contribute to it?
60. Use the Huygens-Fresnel principle to prove that, in a homogeneous medium, a wavefront that at a given moment is plane keeps propagating as a plane wave. Maybe that’s obvious to you, but take it as a test that Huygens-Fresnel makes sense.
61. Consider an unbounded continuum of approximately constant density ρ, whose deformation is described at all points x in three-dimensional space by the displacement field u(x). First, show that the relative compression/expansion of the continuum around x coincides with the divergence ∇ · u(x). (Hint: by “relative compression/expansion around x”, I mean a variation δdV/dV, where dV is an arbitrarily small volume element around x.) Next, use what you’ve just found to determine a relationship between ∇ · u(x) and the change in mass contained by a fixed volume V.
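The orthogonality relations of exercises 53 and 54 are easy to check by numerical quadrature before proving them; the midpoint-rule integrator below is my own sketch:

```python
import math

# Exercises 53-54, checked by numerical quadrature on [-pi, pi].
def integral(f, n=200000):
    """Midpoint-rule integral of f over [-pi, pi]."""
    h = 2 * math.pi / n
    return sum(f(-math.pi + (i + 0.5) * h) for i in range(n)) * h

p, q = 3, 5   # distinct integers
i_sinsin = integral(lambda x: math.sin(p * x) * math.sin(q * x))
i_sin2 = integral(lambda x: math.sin(p * x) ** 2)
print(i_sinsin)            # ~0 for p != q
print(i_sin2, math.pi)     # ~pi for integer p
```
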
62. Let x denote distance, t time, and v vertical velocity (e.g. along a string). Rewrite the standing wave v(x, t) = sin(kx) cos(ωt) as a linear combination of traveling waves. (Hint: use trigonometry. For instance, you might look up the formulae for the sine of the sum and/or difference of two numbers, etc.)
63. Starting with Navier–Cauchy’s equation (6.167), prove that both P (compressional) and S (shear) waves can exist in an elastic medium.
64. Give an estimate for the volume of the earth’s mantle. By mantle I mean everything that’s deeper than the crust, and less deep than the core. Remember, the earth is approximately spherical, its radius is, on average, about 6370 km, the radius of the core is about 3480 km, and the Moho is about 20 km deep, on average. Explain your reasoning.
65. Consider a case of radioactive decay, where, the moment your sample solidified, it contained no daughters, D(0) = 0. (Can you name one mineral for which this is always the case? And why is that?) Let’s say we measure the concentrations of daughters and parents in the sample as it is now, and the parameter λ such that P(t) = P(0)e^(−λt). From such information, find out how long ago the sample became solid.
66. Write the relation that links stress and displacement (deformation) in a viscous fluid.
67. Newton’s second principle stipulates that a force coincides with the product between the mass to which it is applied, and the acceleration that same mass undergoes, caused by that force. Show that Navier–Stokes’ equation is a direct consequence of Newton’s second.
68. Write the relation between the divergence of a displacement field u(x) at a location x, and the density change at that same location. Explain.
69. The divergence of displacement coincides with the relative change in volume: explain.
70. Imagine you have a “model” of P- and S-wave velocity, i.e., the numerical values of two functions vP(r) and vS(r), where r is distance from the center of the earth. The bulk modulus, K, is related to vP and vS via the equation
vP²(r) − (4/3) vS²(r) = K(r)/ρ(r),
where ρ is density. K is also related to gravitational acceleration, g: we saw that
dρ(r)/dr = −ρ²(r) g(r)/K(r).
Using these two equations, and Newton’s gravitational law applied to a sphere with radius r, write an iterative formula allowing to determine ρ as a function of r, based on the known functions vP(r), vS(r). (Hint: if you are lost, go back and look for the Williamson-Adams method.)
71. How did Francis Birch estimate the composition of the earth’s mantle and core?
72. On what basis is it believed that the earth’s core is primarily composed of iron? On what basis is it believed that the earth’s outer core is liquid?
73. How did Erskine Williamson and Leason Adams obtain a density profile of the earth from its seismic (P and S) velocity profile?
74. Show analytically that ∫_0^∞ [1 − erf(ξ)] dξ = 1/√π.
75. Define qualitatively Rayleigh’s number. What are the main parameters that affect it? Try to remember the formula, and write it down as best you can.
76. A howitzer in the earth’s northern hemisphere fires towards a target located directly south. The gunners are very careful and have taken everything (wind direction, etc.) into account, except that, somehow, they have forgotten the Coriolis force. Will they hit the target? If not, where will the projectile land relative to the target?
77. Verify that the functions f(x) = e^(−λx), xe^(−λx), e^(λx) and xe^(λx) are all solutions to
d⁴f/dx⁴ (x) − 2λ² d²f/dx² (x) + λ⁴ f(x) = 0.
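Exercise 70’s iterative formula can be sketched in code. Everything in the script below is a toy of my own making (the seismic profile, the starting density, the radii); it only illustrates the structure of a Williamson-Adams-style downward integration:

```python
import math

# Exercise 70, a minimal Williamson-Adams-style iteration (toy numbers, not a
# real earth model): integrate d(rho)/dr = -rho*g/Phi downward from the top of
# the "mantle", with Phi(r) = vP(r)**2 - (4/3)*vS(r)**2 = K/rho, and g(r)
# computed from the mass enclosed within radius r.
G = 6.674e-11
M_earth = 5.97e24
r_top, r_bot = 6.35e6, 3.48e6          # toy mantle radii (m)

def Phi(r):                            # assumed, smooth seismic profile (m^2/s^2)
    vP = 8000.0 + 3000.0 * (r_top - r) / (r_top - r_bot)
    vS = 4500.0
    return vP**2 - 4.0 / 3.0 * vS**2

dr = 1e3
r, rho, m = r_top, 3300.0, M_earth     # start at the top with an assumed density
rhos = [rho]
while r > r_bot:
    g = G * m / r**2
    rho += rho * g / Phi(r) * dr       # d(rho)/dr < 0, so rho grows downward
    m -= 4 * math.pi * r**2 * rho * dr # strip off the shell we just crossed
    r -= dr
    rhos.append(rho)
print(rhos[0], rhos[-1])               # density increases with depth
```
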
78. What is “postglacial rebound”? How can the observation of this phenomenon be used to estimate the viscosity of the earth?
79. Verify that ∇ × (∇ × v) = ∇(∇ · v) − ∇²v, for any vector field v.
80. Plate tectonics: briefly list some geomagnetic, seismological, and bathymetric observations that have contributed to establishing plate tectonics as a new paradigm for the theory of the earth.
81. What is the “Clapeyron slope” of the transition between two “phases” (e.g., olivine and spinel; perovskite and postperovskite) of a material? How do people use it to determine the depth, below earth’s surface, where the phase transition occurs? I recommend that you answer this one with the help of a little diagram.
82. How was the “transition zone” in the earth’s mantle first observed? How is it explained in terms of mineral physics?
83. Explain the differences and similarities between the theories of continental drift and plate tectonics.
84. What are the forces that drive the plates around? Harry Hess and others initially proposed that the convective mantle functions like a “conveyor belt”, on which plates stand passively. Do you think that that is a fully accurate description of what’s probably happening? why, or why not?
85. Prove that the least-squares solution of the inverse problem
A · x = d
is
x = (A^T · A)^{−1} · A^T · d.
(Hint: remember that the least-squares solution of A · x = d is the value of x such that |A · x − d|² is minimum.)
86. Prove that ∇ × (∇ × v) = ∇(∇ · v) − ∇²v, for any vector v.
87. Draw the focal mechanisms (“beachballs”) associated with a strike-slip, a thrust-fault and a normal-fault earthquake.
88. Hurricanes happen in the northern hemisphere, and cyclones happen in the southern hemisphere. Other than that, hurricanes and cyclones are the same—except for one thing: hurricanes always swirl counterclockwise, and cyclones always swirl clockwise. Why do you think that’s the case?
89. Apply Gaussian elimination to the square matrix:
⎛ 3 −2  4  7 ⎞
⎜ 2  1  0 −3 ⎟
⎜ 2  8 −8  2 ⎟
⎝ 1  1  2 −1 ⎠ .
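A small routine for exercise 89 (my own sketch of plain Gaussian elimination, reducing the matrix to row-echelon form):

```python
# Exercise 89: Gaussian elimination, reducing a matrix to row-echelon form.
def gauss_eliminate(M):
    """Reduce a list-of-rows matrix to row-echelon form, in place."""
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols):
        # find a row with a nonzero entry in this column, to use as pivot
        pr = next((r for r in range(pivot_row, rows) if abs(M[r][col]) > 1e-12), None)
        if pr is None:
            continue
        M[pivot_row], M[pr] = M[pr], M[pivot_row]
        # zero out everything below the pivot
        for r in range(pivot_row + 1, rows):
            factor = M[r][col] / M[pivot_row][col]
            for c in range(col, cols):
                M[r][c] -= factor * M[pivot_row][c]
        pivot_row += 1
    return M

M = [[3.0, -2.0, 4.0, 7.0],
     [2.0, 1.0, 0.0, -3.0],
     [2.0, 8.0, -8.0, 2.0],
     [1.0, 1.0, 2.0, -1.0]]
gauss_eliminate(M)
for row in M:
    print(row)        # upper-triangular (row-echelon) result
```

The same routine, applied to the augmented matrices of exercises 90 and 91, will show you whether those systems have zero, one, or infinitely many solutions.
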
90. Apply Gaussian elimination to the rectangular matrix:
⎛ 1 2 3 −1 ⎞
⎜ 4 5 6  3 ⎟ .
⎝ 7 8 9  5 ⎠
Does this help you find a solution (x1, x2, x3) to the inverse problem
⎛ 1 2 3 ⎞   ⎛ x1 ⎞   ⎛ b1 ⎞
⎜ 4 5 6 ⎟ · ⎜ x2 ⎟ = ⎜ b2 ⎟ ,
⎝ 7 8 9 ⎠   ⎝ x3 ⎠   ⎝ b3 ⎠
where b1, b2, b3 have arbitrary, constant values?
91. Use Gaussian elimination to find the solution(s) (if any) to the following system of linear equations:
⎧ x + 2y + 3z = 9
⎨ 2x − 2z = −2
⎩ 3x + 2y + z = 7
92. What does it mean for a normal mode to be split?
93. The so-called heat equation, in one dimension, reads
∂u(x, t)/∂t = a ∂²u(x, t)/∂x²,
where u denotes temperature, x and t are distance and time, respectively, and a is a real, positive constant. Solve the heat equation analytically, under the following assumptions: (i) the temperature at the bar’s endpoints is fixed (i.e. each endpoint is in contact with a heat reservoir kept at a constant temperature; call the temperatures T1 and T2 and call L the length of the bar); (ii) the initial temperature u(x, 0) = u0(x) is known for all x; (iii) the bar is thermally insulated from the outside world, except for its two endpoints.
94. Discretize the one-dimensional heat equation according to the finite-difference method. Show how this method can be used to determine how T evolves over time throughout a bar of finite length, under the following assumptions: (i) the temperature at the bar’s endpoints is fixed (i.e. each endpoint is in contact with a temperature reservoir kept at a constant temperature; call them T1 and T2 and call L the length of the bar); (ii) the initial temperature T(x, 0) = T0(x) is known for all x; (iii) the bar is thermally insulated from the outside world (except for its two endpoints).
95. You are starting a Ph.D. in seismology, and your supervisor asks you to write code to model the propagation of seismic waves. What are the main equations that your software should implement?
96. You are starting a Ph.D. in geodynamics, and your supervisor asks you to write code to model flow in the earth’s mantle. What are the main equations that your software should implement? What are the main differences between what you and your seismology colleague (previous question) need to do?
97. You are starting a Ph.D. in geomagnetism, and your supervisor asks you to write code to model the geodynamo. What are the main equations that your software should implement?
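For exercise 94, a minimal explicit (forward-time, centered-space) finite-difference sketch might look like this; the bar length, diffusivity, boundary temperatures and timestep are all made-up numbers of mine, chosen to respect the scheme’s stability limit:

```python
# Exercise 94, explicit finite differences for the 1-D heat equation:
#   T_j(n+1) = T_j(n) + a*dt/dx**2 * (T_{j+1}(n) - 2*T_j(n) + T_{j-1}(n)),
# with fixed temperatures at the two endpoints of the bar.
L, nx = 1.0, 21
a = 1.0e-4                    # diffusivity (toy value)
dx = L / (nx - 1)
dt = 0.4 * dx**2 / a          # below the stability limit 0.5*dx**2/a
T1, T2 = 100.0, 0.0           # reservoir temperatures at the two ends

T = [50.0] * nx               # some initial temperature profile
T[0], T[-1] = T1, T2
for step in range(20000):
    Tn = T[:]                 # previous timestep
    for j in range(1, nx - 1):
        T[j] = Tn[j] + a * dt / dx**2 * (Tn[j + 1] - 2 * Tn[j] + Tn[j - 1])
# at steady state the profile must be linear between T1 and T2
print(T[nx // 4], T[nx // 2])
```

After enough steps the numerical solution relaxes to the steady, linear conduction profile, which is a useful check that the discretization is stable and correct.
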
Notes
1. Arenarius, or The Sand Reckoner: because the grain of sand was the unit of “size” that Archimedes chose to use.
2. Archimedes writes: “His [Aristarchus’] hypotheses are that the fixed stars and the sun remain unmoved, that the earth revolves about the sun on the circumference of a circle, the sun lying in the middle of the orbit, and that the sphere of fixed stars, situated about the same center as the sun, is so great that the circle in which he supposes the earth to revolve bears such a proportion to the distance of the fixed stars as the center of the sphere bears to its surface.”
3. Multiply both sides of (1.2) by s(s − t), and you get s(t − d) = l(s − t). But then st − sd = ls − lt, or t(s + l) = s(d + l). Divide both sides by t(d + l), and
s/t = (s + l)/(d + l) = (1 + s/l)/(1 + d/l),
which is Eq. (1.3), and
l/t = (l/s) · (s/t) = (l/s) · (1 + s/l)/(1 + d/l) = (1 + l/s)/(1 + d/l),
i.e., Eq. (1.4).
4. I guess, for instance, imagine you measure the angle subtended, at the earth, by the moon (before or after the lunar eclipse), call it θl; then you identify the point where the eclipse starts and the point where it ends, and the distance between those two points is 2d; and if the angle subtended by those two points is θd, then d = L sin(θd/2). And likewise l = L sin(θl/2). And you don’t know L, but if you take the ratio of the two equations I just wrote, you get that d/l = sin(θd/2)/sin(θl/2), and you are good.
5. See David P. Stern, From Stargazers to Starships, which, as this is being written, is a website accessible under the nasa.gov domain.
6. Yes, the John Maynard Keynes of Keynesian economics. This is from a lecture (“Newton, the Man”) that he was to have delivered at the Royal Society of London, during an event planned for the tercentenary of Isaac Newton’s birth; which was in 1942 and as you can imagine had to be postponed because at that point London was being bombed by the Germans. When finally the event took place in 1946, Keynes had just died and the lecture was given by his brother Geoffrey. Keynes was a collector of books; particularly first editions of books that he considered to be relevant to the history of thought: and he managed to acquire many of Newton’s unpublished manuscripts. If you are interested in this sort of thing, and if the website still exists when you read this, check out the bit about the Sotheby’s sale of Newton’s “Portsmouth Papers” at newtonproject.ox.ac.uk: “an event which [...] finally revealed the full extent of Newton’s interests in alchemy and unorthodox theology.”
7. When you think about it, what wasn’t “magical” in a force that attracts any pair of objects to one another, without them being in contact, and even at large distances?
Consider that, in Newton’s times, no sophisticated experiments, like those you read about in the textbooks of today, had yet shown that the theory of gravity actually does fit all sorts of observations. 8. In fact, those guys were all Newton’s colleagues, and if he wasn’t friendly with Woodward (long story), he was on good terms with the other two. Whiston’s New Theory of the earth so impressed Newton, that he had him hired as assistant professor in Cambridge, and later favoured his election as Lucasian professor (when Newton resigned: another long story). As for Burnet, upon reading his Sacred Theory of the Earth Newton wrote to him a long letter (Keynes Ms. 106(A), King’s College, Cambridge, UK), where issues like, e.g., the length of day at the time of God’s Creation are discussed. 9. Or, Les Mots et les Choses, Gallimard, 1966. 10. Which you’ve probably at least heard about before? In a nutshell, the idea is that a material point rotates around an axis because there’s some kind of force that pulls it towards that axis; if that force were to be turned off, the point then would just continue to move away along a straight line, by its own inertia and the first of Newton’s laws. Now, if you—the observer—sit on some rotating body, like us on the earth, and forget that you are rotating with it, the inertia that pulls you away from the rotation axis does feel like a force (the typical example is when you are sitting in a vehicle that makes a turn, and you feel like you’re being pulled away from that turn). So in such a case people speak,
instead of just inertia (as they probably should), of centrifugal force, and then they sort of correct the error by calling it apparent force. We’ll meet the centrifugal force again, in later chapters.
11. This, of course, assumes that the earth is not exactly rigid; which was fine, because back in Newton’s day most people would rather think that, below the rocky crust that we walk upon, most of the interior of the earth was actually liquid. But more of that in the next chapter.
12. I am not going to get into Descartes’ “vortex” theory of planetary motion, though: it’s quite complex and we know now that it doesn’t really work in terms of explaining observations.
13. This is a subtle but important point: when you make a scientific observation your measurement is never a single number, but rather an interval of numbers that are more or less equally acceptable estimates as far as you can tell—and how broad that interval is depends on the accuracy of the instrument and/or the method you use to make the measurement. Physicists call this range of values “uncertainty”. If you measure the difference in curvature between two places that are not that far from one another (say, Paris and Barcelona, which are only about 7° away from one another along the same meridian), chances are that the difference you get will be small, and possibly smaller than the uncertainty: in which case, in practice, you won’t be able to tell whether the curvature is the same, or in which place curvature really is higher; so, if the experiment was to be successful, it was preferable to take the measurements at places expected to have curvatures as different as possible.
14. Horizontal distances, by the way, could be measured by “chaining,” i.e., by deploying chains or rods of known length on the ground, along a straight line whose direction is precisely measured via astronomical observations.
Levels were used to check that rods were flat, and changes in elevation had to be dealt with, of course. Egyptian cadastral surveys, to which, e.g., Eratosthenes had access, were based on techniques such as these. A more clever, but less precise technique called "triangulation", that worked whenever changes in elevation were too severe, or the "line" to be "surveyed" crossed inaccessible terrain, was developed at the beginning of the seventeenth century (Willebrord Snellius, AKA Snell, Eratosthenes Batavus—De Terrae Ambitus Vera Quantitate, 1617. Snell used triangulation to redo Eratosthenes' experiment, but with more precise measurements: hence the reference to Eratosthenes in the title of his book; Batavia being a region in the Netherlands, which is where Snell was from). In a nutshell, triangulation is based on the idea that, if you know the distance between the points A and B, and the angle they form at a third point C, you can determine their distances from C, e.g. via the "law of sines" (see note 723). So if A and B are close to one another, so that it is easy to measure their distance, but distant enough that you can tell them from one another as you observe them from a faraway point C, then you can use trigonometry to measure the distances AC and BC based on AB. To do this right, one had to use a very precise instrument known as quadrant, equipped with a telescope that, when
looking a kilometer or so away, allowed one to distinguish points that were only a few centimeters apart.
15. And Galileo had already shown that the gravitational acceleration g was a constant that could be measured; and each object would fall to the ground, to the surface of the earth, with the same g, no matter how big or heavy: so then measuring the mass of an object was equivalent to measuring its weight: the gravitational pull, mass times g, exerted on it by the earth.
16. We might call vector a mathematical entity that is only fully defined if you know its "size" and its "direction". There are many things in physics that are inherently vectorial: take, for example, Newton's second law, which Newton—who didn't have vectors—stated as: "the alteration of motion is ever proportional to the motive force impressed; and is made in the direction of the right line in which that force is impressed"; i.e., the sizes of force and acceleration are proportional to one another (through the mass), and their directions coincide. Now that we have vectorial algebra, we can state the same law with a formula, F = ma, where m is the mass of the object to which the force F is applied, and, as a result, the acceleration a (Newton's "alteration of motion", i.e., the time-derivative of velocity) is imparted. Both force and acceleration are vectors—both are defined by a size and a direction. And velocity as well, of course, and displacement. When there's more than one force acting on the same object, e.g., its weight; electromagnetic forces in case it's electrically charged or magnetized (more on this in Chap. 8); someone kicking it around, etc., Newton's second can be written
  m \frac{d\mathbf{v}}{dt} = \sum_{i=1}^{n} \mathbf{F}_i ,   (N.1)
where F_i (i = 1, 2, ..., n) are the n forces in question. Observation tells us that the combined effect of those forces is described by the parallelogram law, see Fig. N.1: which then is also how the vector sum is generally defined.
There's many ways to distinguish vectors from "scalars"—i.e., from quantities that are fully defined by a single number. I've chosen to use boldface font for vectors; but you might find Newton's second written with arrows over the vector symbols,
  m \frac{d\vec{v}}{dt} = \sum_{i=1}^{n} \vec{F}_i ,
for instance, or with underbars,
  m \frac{d\underline{v}}{dt} = \sum_{i=1}^{n} \underline{F}_i .
Fig. N.1 How to sum vectors, i.e., the parallelogram law. Think of two displacements of the same object, happening one after the other. a Let the vector u describe the first of those displacements; and the vector v the second. b If you want to know where the object ends up, drag v—without changing its size and direction, of course—so that its starting point coincides with the endpoint of u: now the endpoint of v marks the final position of the object. Not surprisingly, the vector that now connects the starting point of u with the endpoint of v is what we call the vector sum u + v. c Alternatively, you can move v so that its starting point coincides with the starting point of u; then draw the parallelogram that has u and v as two of its sides: and you see that u + v coincides with one of the diagonals of the parallelogram—the one that bisects the angle formed by u and v
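The parallelogram construction of Fig. N.1 is, in components, just an entry-by-entry sum. Here is a minimal numeric sketch (not from the book; the displacement values are invented):

```python
# Vector sum by components: the parallelogram law of Fig. N.1.
# Displacements are represented as (x, y) pairs; the values are made up.
u = (3.0, 1.0)   # first displacement
v = (1.0, 2.0)   # second displacement

def vec_sum(a, b):
    # add component by component: the tip-to-tail construction
    return (a[0] + b[0], a[1] + b[1])

# the order of the two displacements doesn't matter (parallelogram law)
assert vec_sum(u, v) == vec_sum(v, u) == (4.0, 3.0)
```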
Besides vector sum, it's useful to define other algebraic operations involving vectors, in particular the scalar and vector products. People call scalar product, or "dot" product of u and v, the scalar quantity
  u · v = |u| |v| cos θ,
where θ is the angle between the directions of u and v. (And we use two vertical segments, one before and one after the symbol denoting the vector, as in |u|, etc., to mean the magnitude—the "size" of the vector.) From the definition of dot product that I just gave, it immediately follows that u · v = v · u, that u · u = |u|², and that u · v = 0 if θ = π/2.
We call vector product of u and v a vector whose size, or magnitude, is
  |u × v| = |u| |v| sin θ,
while its direction is given by the so-called "right-hand rule", which see Fig. N.2. It follows from this definition that: the cross product of a vector by itself is zero, i.e., u × u = 0, and that
  u × v = −v × u
Fig. N.2 The right-hand rule for the cross product
(which, if you are not convinced, remember that sin(θ) = −sin(−θ)). One last thing before we wrap this up. I told you that a vector is defined by its magnitude and its direction; in two-dimensional space—i.e., on a plane—one angle is enough to identify direction, and a vector is defined by two numbers total. In three dimensions you'll need two angles to identify direction, and three numbers total, then, to define your vector. Now, this is not the only way you can define a vector. You can also just give the lengths of the vector's projections onto the axes of a Cartesian frame. If we call i, j, k, respectively, the unit vectors (vectors whose magnitude is 1) aligned with the x, y and z axes, then any vector u can be written
  u = u_x i + u_y j + u_z k,   (N.2)
where u_x denotes the projection of u on the x axis, etc. You see, graphically, that Eq. (N.2) verifies the parallelogram law: look at Fig. N.3. The scalar quantities u_x, u_y and u_z are what people call the components of u. Equation (N.2) applies to vectors v and F_i in Eq. (N.1), which becomes
  m \frac{d}{dt}\left(v_x i + v_y j + v_z k\right) = \sum_{i=1}^{n} \left(F_{i,x}\, i + F_{i,y}\, j + F_{i,z}\, k\right) ;
if you take the dot product of both sides of this with i, because i · i = 1, i · j = 0 and i · k = 0, we are left with
  m \frac{dv_x}{dt} = \sum_{i=1}^{n} F_{i,x} .
Fig. N.3 The unit vectors i, j, k (left), aligned with the Cartesian axes, and the components u_x, u_y, u_z of a generic vector u (right)
Likewise, dot-multiplying by j and k we find
  m \frac{dv_y}{dt} = \sum_{i=1}^{n} F_{i,y} , \qquad m \frac{dv_z}{dt} = \sum_{i=1}^{n} F_{i,z} .
In general, any vectorial equation can be translated into multiple scalar equations: one per component. Expression (N.2) allows us to determine the components of the cross product u × v in terms of the components of u and v. It follows from the definition of cross product that
  i × j = k,  j × k = i,  k × i = j;   (N.3)
because of the properties of cross product, which see above, it follows from (N.3) that
  j × i = −k,  k × j = −i,  i × k = −j.   (N.4)
Now, u × v = (u_x i + u_y j + u_z k) × (v_x i + v_y j + v_z k). But so then,
  u × v = u_x v_x \, i × i + u_x v_y \, i × j + u_x v_z \, i × k
        + u_y v_x \, j × i + u_y v_y \, j × j + u_y v_z \, j × k
        + u_z v_x \, k × i + u_z v_y \, k × j + u_z v_z \, k × k,
where we can use (N.3) and (N.4), and i × i = j × j = k × k = 0, to simplify a bunch of stuff out and find the relatively simple formula for the components of the cross product,
  u × v = (u_y v_z − u_z v_y)\, i + (u_z v_x − u_x v_z)\, j + (u_x v_y − u_y v_x)\, k.   (N.5)
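As a quick numeric check of these formulas (a sketch; the book itself contains no code), here are the dot product written out in components and the cross product per Eq. (N.5):

```python
# Numeric check of the vector algebra in note 16: dot product from the
# components, cross product via Eq. (N.5). Vectors are plain 3-tuples.

def dot(u, v):
    # u . v = u_x v_x + u_y v_y + u_z v_z
    return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]

def cross(u, v):
    # Eq. (N.5): (u_y v_z - u_z v_y, u_z v_x - u_x v_z, u_x v_y - u_y v_x)
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

u = (1.0, 2.0, 3.0)
v = (4.0, 5.0, 6.0)

# u x v = -(v x u), and u x u = 0, as stated in the note
assert cross(u, v) == tuple(-c for c in cross(v, u))
assert cross(u, u) == (0.0, 0.0, 0.0)
# the cross product is perpendicular to both factors, so the dot products vanish
assert dot(u, cross(u, v)) == 0.0
assert dot(v, cross(u, v)) == 0.0
```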
17. If you are not comfortable with that, then please be patient, and go back to your trigonometry.
18. This point will keep coming up, so I might as well explain it in some detail, even if it might feel like we are spiraling down a bottomless pit. But I promise, things get easier later on, once you've digested a handful of tough concepts. So let's quickly go back to the Greeks. A good starting point is the work of Zeno of Elea, a "pre-socratic" philosopher, i.e., older than even Plato, who's the most ancient scientist we've discussed so far. His "paradoxes" are most famously described by Aristotle in his Physics, Book VI. At the time, there existed no convincing answer to them. Here's a paradox: The Greek hero Achilles runs a race with a tortoise. Being a very fast runner, Achilles gives the tortoise a 10-meter head start. Now, according to Zeno, Achilles will never catch up with the tortoise, because during the time it takes Achilles to get to the point where the tortoise has started, the tortoise will advance by at least some small distance; Achilles will then quickly cover this new, shorter distance; but while Achilles does that, the tortoise will still advance. And this goes on ad infinitum, so that Achilles can never catch up. In Aristotle's words: "the slowest racer will never be caught by the fastest; for he who pursues must always begin by reaching the point from which he has gone, so that the slowest always has some advance." Aristotle and Zeno were obviously smart enough to realize that, in the real world, the fastest racer does catch up with the slowest: it's just a matter of time. They figured there was something wrong in their algebra, but they couldn't figure what.
Today, post-Leibniz and post-Newton, you might see that the flaw is in the implicit assumption, in Zeno’s reasoning, that if you keep summing numbers up, their sum will “diverge” to infinity, i.e., it will eventually become larger than any possible large number you might think of, even if the numbers you sum up get systematically smaller. Starting with Leibniz and Newton, modern calculus says, wait a second: it all depends on the rate at which those increments shrink. I don’t know how clear that is, so let me show you what that means in practice, for instance in the case of Achilles and the turtle. Let d1 be the distance between Achilles and the turtle the moment the race starts; if Achilles runs at a speed v A , he’ll cover that distance in a time t1 = d1 /v A . In the same interval of time the turtle, whose speed vT is smaller than v A , crawls a distance d2 = vT t1 .
Achilles covers that distance in a time
  t_2 = d_2/v_A = \frac{v_T}{v_A^2}\, d_1,
during which the turtle covers the distance
  d_3 = v_T t_2 = \frac{v_T^2}{v_A^2}\, d_1,
which Achilles in turn covers in a time
  t_3 = d_3/v_A = \frac{v_T^2}{v_A^3}\, d_1,
etc. The time Achilles needs to catch up with the turtle is the sum of all intervals t_1, t_2, t_3, ..., which, even from the point of view of modern calculus, are still an infinity of intervals, and we might call each of them t_i with i = 1, 2, ..., ∞. We can see from the above that t_i = \frac{d_1}{v_A}\left(\frac{v_T}{v_A}\right)^{i-1}; as i grows, t_i shrinks, and in the language of calculus we might say that as i goes to infinity (i → ∞), t_i becomes infinitely small, or infinitesimal. We might also call the t_i s a sequence of real numbers; and because t_i "tends" to a finite limit (zero), we also call it a convergent sequence. This is not enough yet to conclude that the sum of all t_i s is finite—but if we had found the t_i s to form a divergent sequence, there would have been no hope at all that the sum be finite. The sum of all the elements of a sequence is called a series. We might write
  t_1 + t_2 + t_3 + \cdots = \sum_{i=1}^{\infty} t_i = \frac{d_1}{v_A} \sum_{i=1}^{\infty} \left(\frac{v_T}{v_A}\right)^{i-1} .   (N.6)
It is not trivial to determine whether the right-hand side of (N.6) is finite. It usually helps, though, to look at the sequence \sum_{i=1}^{n} t_i, with n = 1, 2, ..., ∞, i.e., the sequence of the so-called partial sums of the series. We can think of the right-hand side of (N.6) as the limit to which the sequence of partial sums tends as n goes to infinity. To find that limit, now, one would need to be very smart, or, like me, find the solution, which obviously is very old, and usually taught in calculus courses, in some book. It goes like this: first, you multiply \sum_{i=1}^{n} (v_T/v_A)^{i-1} by v_T/v_A, and then you look at the difference \sum_{i=1}^{n} (v_T/v_A)^{i-1} - \frac{v_T}{v_A} \sum_{i=1}^{n} (v_T/v_A)^{i-1}. Notice that
  \sum_{i=1}^{n} \left(\frac{v_T}{v_A}\right)^{i-1} - \frac{v_T}{v_A} \sum_{i=1}^{n} \left(\frac{v_T}{v_A}\right)^{i-1} = \left(1 - \frac{v_T}{v_A}\right) \sum_{i=1}^{n} \left(\frac{v_T}{v_A}\right)^{i-1} .   (N.7)
But you can also write
  \sum_{i=1}^{n} \left(\frac{v_T}{v_A}\right)^{i-1} - \frac{v_T}{v_A} \sum_{i=1}^{n} \left(\frac{v_T}{v_A}\right)^{i-1} = \sum_{i=1}^{n} \left(\frac{v_T}{v_A}\right)^{i-1} - \sum_{i=1}^{n} \left(\frac{v_T}{v_A}\right)^{i} = 1 - \left(\frac{v_T}{v_A}\right)^{n} .   (N.8)
It follows from Eqs. (N.7) and (N.8) that
  \left(1 - \frac{v_T}{v_A}\right) \sum_{i=1}^{n} \left(\frac{v_T}{v_A}\right)^{i-1} = 1 - \left(\frac{v_T}{v_A}\right)^{n} ,
and so
  \sum_{i=1}^{n} \left(\frac{v_T}{v_A}\right)^{i-1} = \frac{1 - (v_T/v_A)^{n}}{1 - v_T/v_A} .   (N.9)
Now it's relatively easy to see that, when n goes to infinity, the sequence at the right-hand side of (N.9) converges to \frac{1}{1 - v_T/v_A}, provided that v_T/v_A < 1, i.e., provided that Achilles is faster than the turtle, which he is. Bottom line, the series (N.6) is convergent, and Achilles overtakes the turtle in a time
  t = \frac{d_1}{v_A} \frac{1}{1 - v_T/v_A} = \frac{d_1}{v_A - v_T} .
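The conclusion of note 18 is easy to check numerically: the partial sums of the series (N.6) settle on the closed-form catch-up time d_1/(v_A − v_T). A short sketch, with invented speeds and head start:

```python
# Achilles vs. the tortoise, note 18: partial sums of the series (N.6)
# versus the closed-form catch-up time d1/(vA - vT). Numbers are made up.
d1 = 10.0   # head start, meters
vA = 10.0   # Achilles' speed, m/s
vT = 1.0    # tortoise's speed, m/s

closed_form = d1 / (vA - vT)

t = 0.0
term = d1 / vA          # t_1
for i in range(60):     # each interval is (vT/vA) times the previous one
    t += term
    term *= vT / vA

print(t, closed_form)   # the partial sum converges to the closed form
assert abs(t - closed_form) < 1e-12
```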
19. I'll assume you've read note 18 and/or you are familiar with the idea that, through calculus, one can sum up an infinity of infinitely small values and get a finite number as a result. This is what integrals do. Let's see a practical example. You have a sequence of measurements of the speed at which, I don't know, you have been walking around during the day—maybe you've got one of those apps installed on your smartphone—and you want to find out the total distance you've covered. Hopefully your speed measurements have been taken quite frequently; frequently enough that it's OK to assume speed to be about constant between one measurement and the other; so then let k be an integer number that identifies each measurement, and v_k the kth speed that's been measured and Δt_k the time between measurements k and k + 1; then, let's say you've got N measurements total, your total displacement should be about
  x \approx \sum_{k=1}^{N} v_k \, \Delta t_k .   (N.10)
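Equation (N.10) can be tried out directly. In this sketch (not the author's data; the speed function and the sampling are invented) the sum approaches the exact displacement as the sampling gets finer:

```python
import math

# Riemann-sum estimate of displacement, Eq. (N.10): x is approximately the
# sum of v_k * dt_k. Invented example: v(t) = sin(t) on [0, pi], whose
# exact integral is 2.
def displacement(N):
    dt = math.pi / N                 # uniform sampling interval dt_k
    x = 0.0
    for k in range(N):
        t_k = (k + 0.5) * dt         # sample the speed mid-interval
        x += math.sin(t_k) * dt      # v_k * dt_k
    return x

print(displacement(10), displacement(1000))  # both near 2; finer sampling is closer
assert abs(displacement(1000) - 2.0) < 1e-4
assert abs(displacement(1000) - 2.0) < abs(displacement(10) - 2.0)
```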
If you want to estimate your total displacement more accurately, what you do is you sample your speed v more often, i.e., N grows and all the Δt_k s accordingly shrink. We might bring this process to the limit, where each Δt_k becomes arbitrarily small, while their number becomes arbitrarily large. This might be expressed by
  \lim_{\Delta t_k \to 0} \sum_{k=1}^{\infty} v_k \, \Delta t_k .
That is precisely what we call an integral; in this case, the integral of speed over time. Except, we usually write it in a different way, i.e.,
  \int_{t_A}^{t_B} v(t)\,dt = \lim_{\Delta t_k \to 0} \sum_{k=1}^{\infty} v_k \, \Delta t_k ,
where t_A and t_B are the times at which the sum should start and end (the first and last instants of the time period over which you want to know your displacement), i.e., the "limits of integration", and v(t) replaces v_k, because, when k grows to infinity, it doesn't make sense to think of speed as a discrete set of numbers: it should really be a function, which should have a value for any possible value of t between t_A and t_B. If you draw v(t) as a function of t, in a Cartesian graph where t is on the horizontal axis and v on the vertical one (Fig. N.4), you see that the right-hand side of (N.10) can be interpreted as an approximation of the area between the curve of v versus t and the horizontal axis (with a negative sign, if the curve lies below the axis). The smaller the Δt_k's, the better the approximation; and the integral of v(t) is precisely the area under the curve. It follows from the definition of integral that we just gave, that if you've got a third point, call it t_C, along the t axis between t_A and t_B, then
Fig. N.4 As the Δt_k become infinitely small, the gray area converges to the integral of v(t)
  \int_{t_A}^{t_B} v(t)\,dt = \int_{t_A}^{t_C} v(t)\,dt + \int_{t_C}^{t_B} v(t)\,dt ,
etc.
20. This is the first time I actually do a derivative in this book, so it's the right time to tell you what a derivative is. Or remind you, in case you already know. Like in note 19, let's think in terms of speed and displacement, which is probably the most obvious example one can think of. In math and physics, displacement is the distance between two locations subsequently occupied by, e.g., a material point that's moving around. Speed is the rate at which displacement evolves over time; so, in the first approximation, you can think of speed as the ratio
  v \approx \frac{x(t_1) - x(t_2)}{t_1 - t_2} ,   (N.11)
where x(t_1) is the distance covered at time t_1 and x(t_2) is the distance at time t_2. Once again, calculus brings the concept to the limit, where the time increment is arbitrarily small, and
  v(t_1) = \lim_{t_2 \to t_1} \frac{x(t_1) - x(t_2)}{t_1 - t_2} ,
which is called the derivative of the function x(t) with respect to the variable t, evaluated at t = t_1. Another way of expressing the same concept is, e.g.,
  v(t) = \lim_{\delta t \to 0} \frac{x(t + \delta t) - x(t)}{\delta t} ,   (N.12)
which maybe is clearer, maybe not. One important thing that might or might not be obvious to you, is that when you’ve got a bunch of data, like, I don’t know, numerical values of displacements x at given times t, doing the derivative of x with respect to t or, as they say, differentiating x with respect to t amounts to implementing the approximate Eq. (N.11) directly, either by hand or with some kind of computer or calculator or something. Which (N.11) is approximate in the sense that when you implement it, what you get is a reasonable estimate for, let’s say, speed averaged over the time interval beginning at the instant t1 and ending at t2 ; Eq. (N.12), on the other hand, is an exact equation, because it defines the exact speed as you would measure it at precisely the time t. So: when you have numerical data, you can at best use an expression similar to (N.11) to estimate a derivative; when you have an analytical formula for the function x = x(t), you can use mathematical analysis to find an analytical formula for the derivative. Here’s a simple example: say you are dealing with some quantity that is proportional to the square of t, and t is time. So then f (t) = ct 2 , with c a constant, and
  \frac{df}{dt} = \lim_{\delta t \to 0} \frac{c(t + \delta t)^2 - ct^2}{\delta t}
        = c \lim_{\delta t \to 0} \frac{(t + \delta t)^2 - t^2}{\delta t}
        = c \lim_{\delta t \to 0} \frac{\delta t^2 + 2t\,\delta t}{\delta t}
        = c \lim_{\delta t \to 0} \frac{\delta t\,(\delta t + 2t)}{\delta t}
        = 2ct .
The point being, you don't need to plug any numbers into (N.11) to know what the derivative of something that goes like t^2 is. (And of course, if for a moment you go back to note 19: it works the same way for integrals: if all you've got is a bunch of numbers, you can at most implement (N.10) "numerically", i.e., plug the numbers in and "crunch" them, by hand or with a computer. But if you have the analytical formula for a function, you can use analysis to "solve" its integral: as we sometimes do in this book, incl. in this very chapter.) Now, this is not a calculus book, but if you go and read any one of those, you'll find tons of simple functions like this for which the derivatives can be calculated analytically in similar ways, and general rules to find the derivatives of more complicated functions, and so on and so forth. Of all the rules of calculus, the most important one is probably the so-called fundamental theorem of calculus, which, simply put, states that the process of differentiating and that of integrating are sort of the inverse of one another. More rigorously: let f be a continuous function of x (one that never, like, "jumps" from one value to another; i.e., it doesn't have "discontinuities"; or, in other words, you can always find a value of Δx small enough for the difference f(x + Δx) − f(x) to be smaller than any threshold you want). The fundamental theorem then says that if F(x) = \int_{a}^{x} f(t)\,dt, then \frac{d}{dx} F(x) = f(x), and vice versa. To prove it, start by writing out the incremental ratio,
  \frac{d}{dx} F(x) = \lim_{\Delta x \to 0} \frac{\int_{a}^{x+\Delta x} f(t)\,dt - \int_{a}^{x} f(t)\,dt}{\Delta x} = \lim_{\Delta x \to 0} \frac{\int_{x}^{x+\Delta x} f(t)\,dt}{\Delta x} .
Because f is a continuous function, there must exist, like I just said, a value of Δx small enough for f to be approximately constant between x and x + Δx. When Δx → 0,
  \frac{d}{dx} F(x) = f(x) \lim_{\Delta x \to 0} \frac{\int_{x}^{x+\Delta x} dt}{\Delta x} = f(x) \lim_{\Delta x \to 0} \frac{\Delta x}{\Delta x} = f(x) ,
QED. The whole story, as you see, works independent of the value of a. This means that for each function f(x) you have an infinity of "primitive" functions F(x) (the primitive of a function f is another function which, when differentiated, gives back the function f), one per possible value of a. If F(x) is a primitive of f(x), then the function F(x) + C, with C an arbitrary constant, is also a primitive of f, because
  \frac{d}{dx}\,[F(x) + C] = \frac{d}{dx} F(x) = f(x) .
But, if F(x) is known and the integration limits x_A, x_B are known, then
  \int_{x_A}^{x_B} f(t)\,dt = \int_{a}^{x_B} f(t)\,dt - \int_{a}^{x_A} f(t)\,dt = F(x_B) - F(x_A)
is uniquely determined—for whatever values of x_A, x_B, provided that f(x) is continuous between x_A and x_B.
21. You might notice that this is only true if P is outside the shell of radius r; if it were in the volume enclosed by the shell, we would have ξ = r − D when ϑ = 0. This is not important now, but it will become so later: see note 24.
22. See note 20.
23. If you don't see that right away, just compare (1.22) with Newton's gravitational law (1.6), which, remember what we've said, is written assuming the two masses attracting one another, m_1 and m_2, are material points, i.e., they have infinitely small volume.
24. Equation (1.22) says that the gravitational attraction from any spherical body is exactly the same, no matter how mass is distributed within the sphere, as long as mass distribution is spherically symmetric. The simplest possible case is that all mass be concentrated at the center of the sphere—hence the right-hand side of (1.22). So far, we've proven this result only in the case where the observation point lies outside the sphere. That is enough to go on with the rest of this chapter. But there's another result, closely associated with (1.22), that was originally proved by Isaac Newton himself together with (1.22)—the two results are usually taught together under the "shell theorem" moniker. We'll use it later in this book, and I might as well prove it here. It can be stated as follows: the gravitational field inside a sphere at a distance r from the sphere's center is the same as if the total mass within a distance r from the center were concentrated at the sphere's center. Or in other words: if the observation point P is inside the sphere—call r the distance between P and the center of the sphere—look at Fig. N.5 and compare with Fig.
1.8—then only the gravitational attraction from whatever mass lies at a distance < r from the center of the sphere matters; the rest—the attraction by the shell that includes all mass that lies at radii bigger than r —cancels out.
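Before the proof, here is a numerical sanity check of this statement (a sketch, not part of the original derivation): discretize a uniform thin shell into rings and sum its pull at an interior point, where it should vanish, and at an exterior point, where it should equal that of a point mass at the center. G and the surface density are set to 1.

```python
import math

# Numerical check of the shell theorem: the net pull of a uniform thin
# spherical shell vanishes at any interior point, and equals that of a
# point mass at the center for exterior points. Units with G = 1 and
# unit surface density.

def shell_pull(R, d, n=100_000):
    """Axial pull of a thin shell of radius R on a point at distance d
    from the center, summed over thin rings at polar angle theta."""
    total = 0.0
    dtheta = math.pi / n
    for k in range(n):
        theta = (k + 0.5) * dtheta
        dm = 2.0 * math.pi * R**2 * math.sin(theta) * dtheta  # ring mass
        xi2 = R**2 + d**2 - 2.0 * R * d * math.cos(theta)     # law of cosines
        xi = math.sqrt(xi2)
        cos_alpha = (d - R * math.cos(theta)) / xi  # component toward the center
        total += dm * cos_alpha / xi2
    return total

M = 4.0 * math.pi * 1.0**2                            # total mass of a unit shell
assert abs(shell_pull(1.0, 0.5)) < 1e-6               # interior point: no net pull
assert abs(shell_pull(1.0, 2.0) - M / 2.0**2) < 1e-4  # exterior: like a point mass
```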
Fig. N.5 Second instalment of the shell theorem: the observation point P lies within the globe. When ϑ = 0, then ξ = r − D, which differs from the P-is-outside case. When ϑ = π, then ξ = r + D, same as before. Note that the integral in (N.13) is restricted to the shaded area
To prove this, remember note 21, and look at Fig. N.5, too. The proof I've done earlier is still OK all the way to Eq. (1.18), exclusive; Eq. (1.18) doesn't apply anymore, because, like I said in note 21, the integration limits must change. To show that the outer shell, radii D to R, does not contribute to a, replace (1.18) with
  a_{out} = \frac{2\pi G}{D} \int_{D}^{R} dr\, \rho(r)\, r \int_{r-D}^{D+r} d\xi\, \frac{\cos\alpha(r, \xi)}{\xi} .   (N.13)
We can solve this exactly, like we just solved (1.18), i.e., via the law of cosines, Eq. (1.19); if you use that to rewrite cos α in (N.13) you get—remember the algebra we did earlier—
  a_{out} = \frac{\pi G}{D^2} \int_{D}^{R} dr\, \rho(r)\, r \int_{r-D}^{D+r} d\xi\, \frac{D^2 + \xi^2 - r^2}{\xi^2}
          = \frac{\pi G}{D^2} \int_{D}^{R} dr\, \rho(r)\, r \left[ (D^2 - r^2)\left(\frac{1}{r-D} - \frac{1}{D+r}\right) + 2D \right]
          = 0 ,
QED.
25. Bouguer, La Figure de la Terre (1749), pp. 369–370.
26. And the careful reader will ask, but how do you know the latitudes of those two points, if you measure latitude with a plumb line, and the line is deflected, and the amount of that deflection is precisely what you need to figure out?! Well, what you do is, you move straight to the east and/or west of your observation
point, and redo the measurement. In Bouguer's words, "we shall move along the same east-west line, to a sufficiently great distance, so that there is no longer any deflection to fear: and if we measure latitude in this second place, with the same care and by the same means as in the first, it is obvious that all the difference that we shall see will be due to attraction [by the mountain]. To conduct your second observation at a point precisely to the East or to the West of the place of the first, you will need to observe the azimuth of the Sun, at its rising or setting, with respect to some remarkable point on the horizon", etc.
27. And so, actually, it is best to describe them as vectors: see note 16.
28. In case you haven't heard this expression before, physicists call "center of mass" of an object of whatever shape the point toward which other masses accelerate, by the effect of gravitational attraction from that object alone. Imagine you subdivide the object in small parcels of mass—material points: each of them exerts its own gravitational attraction on the observer. If you sum all those attractions up, you find the net acceleration of the observer, and the direction that that points to is where the center of mass is.
29. See note 16.
30. That's what he says in La Figure de la Terre, p. 369.
31. Less than a century later, the theory of isostasy would suggest that such anomalies are to be found systematically under mountains. We'll learn all about that later on.
32. Philosophical Transactions of the Royal Society, vol. 65, 1775.
33. Between 1763 and 1767, Mason and Dixon traced the border between Pennsylvania and Maryland, which put an end to about a century of occasionally violent controversies between the governors of the two states.
It all had started in 1681 when the King of England had awarded William Penn a territory, between Maryland to the South and New York to the north, whose southern boundary was to extend eastward along the fortieth parallel, until it intersected a circle of twelve miles radius centered at the town of New Castle (Delaware): then the arc of that circle would be the boundary from the point of intersection to Delaware Bay. There were two problems with this. A relatively minor problem was that the point, within New Castle, where the circle was to be centered was left unspecified; a more serious problem was that, in any case, the fortieth parallel simply does not intersect at all the circle so defined. Yet, William Penn would naturally interpret the king's charter as stipulating that his lower boundary should be no more than twelve miles north of New Castle. The controversy was settled, at least in theory, in 1732, when the Committee for Trade and Plantations in England established that the boundary should be fifteen miles south of the city of Philadelphia. After a number of additional controversies and technical and logistical problems delayed the actual surveying, Mason and Dixon are finally assigned the job of going to the field and translating the words of the agreement into an actual line, or an alignment of "monuments" of some kind, marking the border. They come to Philadelphia in November 1763, identify the southernmost house in Philadelphia (on what is today South Street), and proceed to measure its latitude. Then they realize that by traveling "fifteen
miles south" from there they would end up in the Delaware river, and decide to first move 31 miles west, to a place called Brandywine. From there they marched south along a meridian, and, after checking its latitude, identified a "post marked west" that would be the first point on their line. From this point on, all Mason, Dixon and their team had to do was to "chain" (see note 14) hundreds of miles westward... except that one "chains" along a straight line, and a straight line on the surface of the earth is, in three dimensions, a great circle. A great circle is a circle whose center and radius coincide with those of the earth. (Of course, the earth is not exactly circular, so a great circle is not exactly a circle, but you get the idea.) The shortest path between any two points at the surface of the earth is the segment of great circle that goes through them: think the route of an airplane. A meridian is a great circle. But a parallel is not a great circle: and so the parallel selected to be the Pennsylvania-Maryland border had to be reconstructed from a sequence of great circles with a posteriori calculations and corrections ("tables of offsets"). So what Mason and Dixon did was (1) at the starting point of their east-west survey, at the exact prescribed latitude of the Pennsylvania-Maryland border, calculate the initial bearing that, by tracing a straight line, would lead them to hit the parallel again after covering exactly ten minutes of a degree (this can be done by spherical trigonometry); (2) after the ten minutes are measured and the intersection reached, check that no major errors were made by measuring latitude via astronomy observations (astronomy observations were quite time-consuming, and that is why they were done at only those selected points along the survey, sort of as a double check); (3) if everything is OK, iterate the procedure, running a new ten-minute great-circle arc from the newly determined intersection. They occasionally had to use triangulation, for instance when crossing large rivers or whenever they found large abrupt topography on their path: but they tried to stick as much as possible to chaining, as that would be more precise. (That's not what the French had done in the Andes: Bouguer and co. had no choice but to triangulate, since they were working in the middle of a major mountain range. Then again, Ecuador/Peru was probably still logistically better than any other equatorial location, in Africa or South America: i.e., it was better than the jungle.) See: Mason, A.H., The journal of Charles Mason and Jeremiah Dixon, Memoirs of the American Philosophical Society, vol. 76, 1969.
34. C. Mason and J. Dixon, "Observations for Determining the Length of a Degree of Latitude in the Provinces of Maryland and Pennsylvania, in North America", Philosophical Transactions of the Royal Society, Vol. 58, 1768.
35. For some reason, Maskelyne refers to himself in the third person.
36. See note 28.
37. See note 14.
38. "An Account of Observations Made on the Mountain Schiehallion for Finding its Attraction", Philosophical Transactions of the Royal Society, vol. 65.
39. But Hutton's 1779 "Account of the Calculating Made from the Survey and Measures Taken at Schiehallion in Order to Ascertain the Mean Density of the Earth," over 100 pages long, is quite convincing. (As a side result, it introduces
for the first time the idea of "contour lines," invented by Hutton as a convenient way to show survey results over a geographic map.)
40. Maskelyne is cautious re the accuracy of his results (how dense are rocks within Schiehallion, really? How accurate the observations?, etc.) but thinks they are robust enough to at least make the claim that the earth cannot possibly be a hollow shell—as some thinkers of the day, including, e.g., Edmond Halley, had surmised.
41. Michell is interesting to us also because he wrote possibly the first ever scientific treatise on earthquakes and the propagation of earthquake waves; a later chapter of this book will cover that, too.
42. Incidentally, you might be confused by high school memories of Cavendish' experiment as the experiment where the "gravitational constant" was first quantified. The gravitational constant is the G of Eq. (1.6), etc. Of course, once M_e is found, you just have to substitute that, and some other known numerical values, in some of the above equations, to determine G. But Cavendish doesn't even mention G in his paper, and, the way he does his algebra (dividing Eq. (1.29) by (1.30) as done also here), an explicit value for G never needs to be specified. Over the nineteenth century people continued to try and repeat Cavendish' experiment, via increasingly sophisticated concoctions that should afford better accuracy. But only towards the end of the century did it become common practice for those researchers to publish estimates not only of the earth's density, but also of the gravitational constant.
43. Which see Chap. 8: Maxwell's equations and all that.
44. See note 20.
45. And you might write, like many textbooks do, T = I \frac{d^2\varphi}{dt^2}, with T denoting torque and I the moment of inertia.
46. When I did vectors earlier in this chapter (note 16), I didn't spell out that one cool thing about the unit vectors i, j, k is that any position, but really any vectorial entity in three-dimensional space can be written as their linear combination: i.e., in practice, their weighted sum: which is what Eq. (N.2) from note 16 is. A set of vectors with that property is called a basis of that space. And of course i and j alone are a basis in bi-dimensional space—i.e., on a plane. And, like, theoretically speaking, you can imagine a space with four dimensions, and then you'd need a fourth vector, on top of i, j, k, to form a basis in that space. In most practical applications, in 3-D, i, j and k form the simplest and most convenient basis that one can think of, because they are perpendicular to one another. But they are not the only possible basis. In fact, any trio of vectors that are linearly independent from one another would do. Many of you have seen these things before, those who didn't can probably look them up quite easily, so I am not going to spend a lot of time on it, but anyway, we say that a set of n vectors v_1, v_2, . . ., v_n are linearly dependent if there exists a set of scalar coefficients c_1, c_2, . . ., c_n, at least one of which isn't zero, such that
  c_1 v_1 + c_2 v_2 + \cdots + c_n v_n = 0,
where the left-hand side is precisely what we call a linear combination of v1, v2, . . ., vn—in case it wasn’t clear. When a bunch of vectors aren’t linearly dependent, then that’s when we say that they are linearly independent. At this point you are probably wondering, what is he talking about? wasn’t this supposed to be about recipes to solve differential equations? and/or maybe you’re thinking, this is the wrong endnote—too many endnotes in this book, anyway, etc. But no, the note is the right note—the differential equations note, anyway: the thing is, after the invention of vectors and tensors at the end of the nineteenth century, people quickly realized that the toolbox of linear algebra, the concepts of linear combination, linear dependence, basis, etc., could be adapted to other topics in math, like for example the study of functions. And, as you will see in a second, studying functions and studying/solving differential equations are two very closely related activities. So, now: just like vectors, functions can also be linearly dependent or independent from one another. Take a set of n functions; call them f1(x), f2(x), . . ., fn(x): we say that they are linearly dependent if there exists a suite of numbers c1, c2, . . ., cn, at least one of which is nonzero, such that

c1 f1(x) + c2 f2(x) + · · · + cn fn(x) = 0

for all values of x. Incidentally, the expression at the left-hand side is called a linear combination of the functions f1(x), f2(x), . . ., fn(x). Functions that are not linearly dependent are said to be linearly independent.724 Now, differential equations. A differential equation is an equation that involves one or more unknown functions and their derivatives.
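Before moving on to differential equations proper, the linear-(in)dependence business can be made concrete with a quick numerical sketch (Python with numpy, my own illustration, not something from this book): three vectors in 3-D are linearly dependent exactly when the matrix built from them has rank less than three.

```python
import numpy as np

# three vectors in 3-D; v3 = v1 + v2, so the trio is linearly dependent
v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 0.0])
v3 = v1 + v2
M = np.column_stack([v1, v2, v3])
assert np.linalg.matrix_rank(M) == 2  # rank < 3: linearly dependent

# swap v3 for k = (0, 0, 1) and the trio becomes a basis of 3-D space
k = np.array([0.0, 0.0, 1.0])
assert np.linalg.matrix_rank(np.column_stack([v1, v2, k])) == 3
```

The rank test is just a compact way of asking whether some nonzero combination c1 v1 + c2 v2 + c3 v3 sums to zero.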
What I mean by “unknown functions” is that, when you solve a differential equation, you’re not looking for the numerical value(s) of some variables: you’re looking for a function: so, there’s an independent variable, which in the example we were just looking at—the torsion pendulum—is time, t, and there’s a function of t, φ = φ(t) in our case, i.e., a formula that assigns a value of the angle φ to each value of time t. That formula is what we are looking for—it is the solution of the differential equation. If all derivatives in the differential equation are with respect to only one variable, then that equation is called an ordinary differential equation, or ODE. Otherwise, it’s a partial differential equation, or PDE. We call “order” of a differential equation the order of the highest derivative appearing in the equation. Equation (1.35), for example, is a second-order ODE. The simplest differential equation that one can think of is

dy/dx = h(x),

where h(x) denotes a known function of x. This ODE is simple enough that you can solve it by calculating the integral of both sides, i.e.,
y(x) = ∫^x h(x′) dx′ + C,

where C is an arbitrary constant (which remember the fundamental theorem, note 20). A linear differential equation looks like

dⁿy/dxⁿ + a₁(x) dⁿ⁻¹y/dxⁿ⁻¹ + · · · + aₙ₋₁(x) dy/dx + aₙ(x) y = h(x);    (N.14)

what makes us call it linear is that there are no products of the unknown function y(x) and its derivatives, and no nonlinear functions of y(x) and its derivatives. To make the algebra that follows less cumbersome, let us define the differential operator

L = dⁿ/dxⁿ + a₁(x) dⁿ⁻¹/dxⁿ⁻¹ + · · · + aₙ₋₁(x) d/dx + aₙ(x),

so that (N.14) collapses to

L y(x) = h(x).    (N.15)
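The simplest case above, dy/dx = h(x), can be checked symbolically; here is a minimal sketch in Python with sympy (the particular h(x) is my own arbitrary choice), which also previews the point made below, that each integration brings in one arbitrary constant.

```python
import sympy as sp

x, A, B = sp.symbols('x A B')
h = sp.exp(2*x)  # a sample known right-hand side (arbitrary choice)

# dy/dx = h(x): integrate once, gaining one arbitrary constant
y = sp.integrate(h, x) + A
assert sp.simplify(sp.diff(y, x) - h) == 0

# d2y/dx2 = h(x): integrate twice, gaining two arbitrary constants
y2 = sp.integrate(sp.integrate(h, x) + A, x) + B
assert sp.simplify(sp.diff(y2, x, 2) - h) == 0
assert {A, B} <= y2.free_symbols  # the general solution carries both constants
```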
Now, solving (N.15) means finding a formula for y(x) that represents all possible functions which, if the operator L is “applied” to them, give h(x). In practice, that formula will carry some arbitrary constants: and any possible solution to (N.15) can be represented by that same formula, depending on how one chooses the numerical values of those constants. There’s one case where it is relatively “easy” to solve (N.15)—that is when all the coefficients a₁ = a₂ = · · · = aₙ₋₁ = aₙ = 0. Then, all you need to do is “integrate” n times, similar to what we’ve done a minute ago with dy/dx = h(x). Let me show you: we have

dⁿy/dxⁿ (x) = h(x),

which is the same as

d/dx [dⁿ⁻¹y/dxⁿ⁻¹ (x)] = h(x).

Integrate both sides, and

dⁿ⁻¹y/dxⁿ⁻¹ (x) = ∫^x dx′ h(x′) + A,

where ∫^x dx′ h(x′) is just a primitive of h, and A is arbitrary. Now you can do the exact same thing again: first you manipulate the left-hand side of what we just wrote,
d/dx [dⁿ⁻²y/dxⁿ⁻² (x)] = ∫^x dx′ h(x′) + A,

and then you integrate,

dⁿ⁻²y/dxⁿ⁻² (x) = ∫^x dx′ ∫^x′ dx″ h(x″) + Ax + B,

with B another arbitrary constant. And likewise

dⁿ⁻³y/dxⁿ⁻³ (x) = ∫^x dx′ ∫^x′ dx″ ∫^x″ dx‴ h(x‴) + A x²/2 + Bx + C
(with C yet another arbitrary constant), and so on and so forth, until at the left-hand side you have the derivative of order n − n, i.e., of order 0, i.e., the function y(x) itself. Because at each integration we gain one arbitrary constant, it follows that the general solution y(x) that we get in this way carries exactly n arbitrary constants, total. Now, this is not even close to being a rigorous proof, but my hope is that you’ll believe me, if I tell you that the general solution y(x) of any linear ODE of order n always contains n arbitrary constants—even when the coefficients a1 , a2 , etc. are all nonzero. Now, I’m not even going to try to give you recipes to solve differential equations, because, for one, there wouldn’t be enough room in this book to do that; secondly, you don’t really need that to be able to understand the rest of this book; and finally, there’s no recipe good enough to solve more than a few equations—which means, it really makes sense to start learning recipes if you are ready to study a lot of them—a whole cookbook. This having been said, here are a couple pretty general things about differential equations that, IMHO, are worth being aware of. We call homogeneous ODE a differential equation like (N.15), but with h(x) = 0. That means, in practice, that there is no term in the equation that doesn’t contain the unknown function and/or its derivatives. We also say that L y(x) = 0 is the homogeneous differential equation associated with L y(x) = h(x). Homogeneous equations are special, because imagine you know at least two solutions of the homogeneous equation, call them y1 (x) and y2 (x). Precisely because y1 (x) and y2 (x) are solutions, we have L y1 (x) = 0 and L y2 (x) = 0. If you sum both left- and right-hand sides together, you get
L[y₁(x) + y₂(x)] = 0.

Because, remember, L is just a bunch of derivatives, and the sum of the derivatives of multiple functions is the same as the derivative of the sum of the functions. For the same reason, we also have

L[c₁y₁(x) + c₂y₂(x)] = 0

for any coefficients c₁, c₂. In other words, and here is where I begin to use concepts from the vector algebra toolbox that I’ve introduced at the beginning of this note, in other words, if both y₁(x) and y₂(x) solve the homogeneous equation L y(x) = 0, then all their linear combinations also solve the same equation. Notice that this would not work if the equation were not homogeneous. (To convince yourself of that, try to repeat the four steps I just did, but for the case of the non-homogeneous equation L y(x) = h(x).) Imagine, next, that you are aware of not just two, but n linearly independent solutions, which let’s call them y₁(x), y₂(x), . . ., yₙ(x). Remember that n is the order of the ODE. The algebra we just did still works, and the linear combination Σ_{i=1}^n cᵢyᵢ(x) is a solution to L y(x) = 0, whatever the values of the cᵢ’s. The thing about this solution, though, is that it contains precisely n arbitrary constants: and so, remember what we learned earlier, it must be the general solution to the homogeneous ODE. It’s important that you realize that this only works if the yᵢ’s are linearly independent from one another. If they weren’t, you could always reduce the number of terms in the sum—i.e., the number of arbitrary constants. (Again, vector algebra concepts applied to differential equations.) What we have just discovered re homogeneous ODEs is useful also when dealing with non-homogeneous ODEs, because look: we know that

L [Σ_{i=1}^n cᵢyᵢ(x)] = 0;

if we call y_NH(x) the solution of the non-homogeneous ODE, we have L y_NH(x) = h(x), and if we sum the left- and right-hand sides of what we just wrote, we get

L [Σ_{i=1}^n cᵢyᵢ(x) + y_NH(x)] = h(x).

This last equation means that Σ_{i=1}^n cᵢyᵢ(x) + y_NH(x) is a solution of the non-homogeneous equation L y(x) = h(x); but it is also a solution that carries
n arbitrary constants: as many as the order of the ODE. Which means that what we have here is the general solution to the non-homogeneous equation. The reason I am spending time on this is that it can be quite helpful when you solve a non-homogeneous ODE. Because depending on the form of h(x), it might be tricky to find a general solution—but then, if that’s the case, you can drop h(x), find the general solution to the homogeneous equation (which is usually easier), and then see if you can find one solution to the non-homogeneous one: if you do, sum it to the “homogeneous” solution, and you’re done.
47. If φ(t) is given by (1.36), then

dφ/dt = −φ₀ (2/L)√(k/m) sin[(2/L)√(k/m) t],

which if you differentiate that again with respect to t,

d²φ/dt² = −φ₀ [4k/(mL²)] cos[(2/L)√(k/m) t].

But using Eq. (1.36), it follows from this that

d²φ/dt² = −[4k/(mL²)] φ(t),

which is essentially (1.35), QED.
48. This, by the way, is called solving a differential equation by “direct substitution”: substitution of the solution one has guessed into the equation that is to be solved.
49. The best way to do that, and the way Cavendish did it, is probably to measure the angle between the endpoints of oscillations. At each oscillation you get a new measurement. The pendulum oscillates around φe, so taking the average of all such measurements, you can get φe.
50. In the most general case, one could think of forces that are not simply of attraction or repulsion (e.g., some electromagnetic forces: which see Chap. 9); but that really isn’t our case: gravity is a force of pure attraction, and the other forces that keep solid matter together, “intermolecular” forces or whatever, are either attractive or repulsive, as we shall see in a later chapter.
51. See note 16. You should have been able to read and understand Chap. 1 without knowing anything about vectors and vector algebra, but vectors are going to get used more and more in this book, so you probably want to go through that note sooner or later. This might be the right time.
52. Which is an algebraic operation that involves vectors: again, see note 16, Chap. 1.
53. See note 19.
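The direct-substitution check of notes 47 and 48 can be replayed symbolically; a quick sketch in Python with sympy, assuming the forms of (1.35) and (1.36) quoted in note 47:

```python
import sympy as sp

t, phi0, k, m, L = sp.symbols('t phi0 k m L', positive=True)
omega = (2/L)*sp.sqrt(k/m)           # angular frequency in Eq. (1.36)
phi = phi0*sp.cos(omega*t)           # the guessed solution, Eq. (1.36)
lhs = sp.diff(phi, t, 2)             # second time-derivative of phi
rhs = -(4*k/(m*L**2))*phi            # right-hand side of Eq. (1.35)
assert sp.simplify(lhs - rhs) == 0   # (1.36) solves (1.35): direct substitution
```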
Fig. N.6 A body rotates rigidly (its shape doesn’t change) by an angle θ. A parcel of matter within the body occupies the position (x, y) before the rotation. In note 54 I show you how you can find the position (x′, y′) the same parcel of matter occupies after the rotation
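The rotation of Fig. N.6, which note 54 below works out by trigonometry, is easy to sanity-check numerically; a sketch in Python with numpy (the angle and the point are arbitrary choices of mine):

```python
import numpy as np

theta = 0.3                        # rotation angle, radians (arbitrary)
x, y = 2.0, 1.0                    # coordinates of P before the rotation (arbitrary)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
xp, yp = R @ np.array([x, y])      # coordinates of P' after the rotation

# the formulas derived in note 54:
assert np.isclose(xp, x*np.cos(theta) - y*np.sin(theta))
assert np.isclose(yp, y*np.cos(theta) + x*np.sin(theta))
# rigid rotation: the distance r from the rotation axis doesn't change
assert np.isclose(np.hypot(xp, yp), np.hypot(x, y))
```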
54. This is pretty important, but then again it is mostly trigonometry, and, like I said, I was not really planning on proving any trigonometry results in this book; plus, it’s long, so it would distract you from the more relevant stuff that I am trying to get across. But yeah, it is important, sort of, so I am going to do it, but hide it, like this, in an endnote. The question is, consider a solid, rigid body that rotates around an axis; imagine we’ve got the Cartesian coordinates of a material point P, I call them x and y (there’s also z, but we don’t care about z because we can always choose z to be parallel with the rotation axis, so that the rotation won’t change it); as the body rotates, after a little while the same material point occupies a position P′, coordinates x′ and y′. So, like I was saying, the question is: if I know x, y, and the angle θ by which the body has rotated around its axis, what are the values of x′ and y′? To answer this question, look at Fig. N.6. In that drawing, the origin O of the reference frame is somewhere on the rotation axis, and, like I anticipated, the plane formed by the x and y axes is perpendicular to the rotation axis. Let r = √(x² + y²) = √(x′² + y′²)—the distance of a given material point from the rotation axis must stay constant, or that wouldn’t be a rigid body like we said it is—and like, in a first approximation at least, we can consider the earth to be. All this having been said, let us define a bunch of points in the diagram of Fig. N.6, that will be useful in all the trigonometry we are about to do. Trace from P′ a line that’s perpendicular to the segment OP and call A the intersection of that line with OP. From A, trace the perpendicular to the x axis, and
call B the intersection of that with the x axis. Again from P′, trace a line that’s perpendicular to the x axis and call C its intersection with the x axis, and E its intersection with OP. Finally, call φ the angle between OP and the x axis. It’s all in the drawing, anyway. Now look at the triangles OCE and AEP′: both have one right angle (at C and at A); their angles at E coincide (they are so-called “opposite angles”, i.e. non-adjacent angles formed by a pair of intersecting lines, and so they’ve got to coincide); but so then since the sum of the angles of any triangle is always the same (180°, as I am sure most of you remember from school), then also the remaining pair of angles have to coincide, i.e. φ has to be equal to the angle formed at P′ by the triangle AEP′. This helps us to calculate x′ and y′ in terms of x, y and θ. Start with x′. Notice that x′ = OC = OB − BC = OB − AD. But

AD = AP′ sin φ = r sin θ sin φ = y sin θ,

and

OB = OA cos φ = r cos θ cos φ = x cos θ,

and so

x′ = x cos θ − y sin θ,

which answers half the question. Now we are going to find y′ in terms of x and y and θ, sort of in the same way. y′ = P′C = P′D + AB. But

AB = OA sin φ = r cos θ sin φ = y cos θ,

and

P′D = AP′ cos φ = r sin θ cos φ = x sin θ,

and so

y′ = y cos θ + x sin θ,

which answers the rest of the question. If you think of (x, y) and (x′, y′) as vectors, then you might write the vectorial equation that relates them,

⎛x′⎞   ⎛cos θ  − sin θ⎞   ⎛x⎞
⎝y′⎠ = ⎝sin θ    cos θ⎠ · ⎝y⎠ ,

which if you’ve never seen a matrix, go straight to note 55.
55. A matrix is a collection of numbers, aligned along rows and columns. An m × n matrix is a matrix with m rows and n columns. When m = n, that’s a square matrix. Vectors are matrices, too: a vector with n coefficients can be thought of as a 1 × n or n × 1 matrix. In general, a matrix looks like, e.g.,

    ⎛ A11  A12  . . .  A1n ⎞
A = ⎜ A21  A22  . . .  A2n ⎟
    ⎜  .    .    .      .  ⎟
    ⎝ Am1  Am2  . . .  Amn ⎠

and, usually, we call Aij the coefficient that lies on row i and column j of the matrix A. Matrices are closely related to linear systems. A linear system,

⎧ A11 x1 + A12 x2 + · · · + A1n xn = b1
⎨ A21 x1 + A22 x2 + · · · + A2n xn = b2
⎨ . . .
⎩ Am1 x1 + Am2 x2 + · · · + Amn xn = bm,    (N.16)

is made up of three parts, which are: the matrix A, whose coefficients Aij are the (known, of course) coefficients of the linear system; the unknown vector x = (x1, x2, . . . , xn); the right-hand side vector b = (b1, b2, . . . , bm), which, like A, is known. Equation (N.16) can be reduced to the very compact form

A · x = b    (N.17)
via matrix multiplication, i.e., the dot product of matrices. Let me explain. Take an m × n matrix, called B, and an n × p matrix C: the number of columns in B and the number of rows in C are the same, both equal to n. If that is the case, we can multiply B with C according to the definition

D_ij = Σ_{k=1}^n B_ik C_kj    (i = 1, 2, . . . , m; j = 1, 2, . . . , p),

where the matrix D of components D_ij is what we call the dot product of B times C. What we just wrote is equivalent to D = B · C. A vector is a matrix, like I said, so this same definition also works if you want to dot B with a vector, call it u,

c_i = Σ_{k=1}^n B_ik u_k    (i = 1, 2, . . . , m),

or c = B · u, where c, the dot product of B times u, is a vector with as many entries as B has rows. With this, it should be clear that Eqs. (N.16) and (N.17) are the same thing. The definition of dot product I just gave also works if you dot two vectors with one another: if the vectors are called u and v, we’d get Σ_{i=1}^n uᵢvᵢ: which coincides with the dot product between vectors that I had introduced in Chap. 1, note 16. (You are beginning to see that there are various ways of writing matrix algebra. When, for instance, you write a linear system in the form A · x = b you are using Gibbs’ notation—J. Willard Gibbs being pretty much the inventor of this notation, “popularized” (so to say) by his 1901 textbook Vector Analysis. But you could also write

Σ_{j=1}^n A_ij x_j = b_i    (i = 1, 2, . . . , m),

which is the most explicit way that there is to spell everything out. Or you could write

A_ij x_j = b_i    (i = 1, 2, . . . , m),
which is Albert Einstein’s way of writing the same thing: according to “Einstein’s convention”, whenever an index is repeated in an algebraic expression, that means that we are supposed to sum over that index, so that, for instance, the sum over j is implicit in what I just wrote.) It follows from its very definition that matrix multiplication is not commutative: i.e., in general, A · B is not the same thing as B · A. (In fact, if you think about it, even if A · B exists, that doesn’t mean that B · A exists. Actually, it won’t, unless A has as many rows as B has columns—which is the case, e.g., when both A and B are square.) On the other hand, matrix multiplication is associative, i.e., you can verify that (A · B) · C = A · (B · C) = A · B · C, where C is a matrix that has as many rows as B has columns. From the definition of matrix multiplication follows the definition of matrix inverse, i.e., the inverse of A is a matrix A⁻¹ such that
A · A⁻¹ = I,    (N.18)

and

A⁻¹ · A = I,    (N.19)

where I is the so-called identity matrix, i.e., a square matrix whose diagonal entries are all 1, while its off-diagonal entries are all zeros,

    ⎛ 1 0 0 . . . 0 0 ⎞
    ⎜ 0 1 0 . . . 0 0 ⎟
I = ⎜ 0 0 1 . . . 0 0 ⎟ .
    ⎜ . . .   .   . . ⎟
    ⎜ 0 0 0 . . . 1 0 ⎟
    ⎝ 0 0 0 . . . 0 1 ⎠

If (N.18) is true, then (N.19) must also be true.725 I is called identity matrix, by the way, because whatever matrix you dot it with, whether to the left or to the right,726 you get that matrix back. Other tools in matrix algebra that will come handy are: (i) matrix sum, i.e., C = A + B ⟺ C_ij = A_ij + B_ij for all values of i, j, where of course A and B must have the same dimensions—the same numbers of rows and columns; (ii) multiplication of a matrix A by a scalar k: B = kA ⟺ B_ij = k A_ij for all values of i, j;
(iii) transposition—the transpose, Aᵀ, of a matrix A is the matrix you get by swapping the row and column indexes of all entries. What used to be A_ij becomes A_ji. The transpose of an m × n matrix is an n × m matrix. If you think of a vector as a matrix with only one column, then its transpose is a matrix with only one row. For the general, matrix-algebra definition of dot product to make sense, if you are dotting two vectors with one another, the one to the left has got to be a row vector and the one to the right a column vector. When you dot a matrix with a vector, a vector to the right of the dot must be a column vector; a vector to the left must be a row vector. There’s lots of other stuff you probably will need to learn, at some point, about matrix algebra; but, for the time being, this should be enough to get you started.
56. Given that cos 0 = 1, you shouldn’t be too surprised that if x is close to 0, then cos x is close to 1. As for the sine of a small number, consider that if you know the derivative of a function at some point x, you can use its value to estimate the function itself in the vicinity, x + δx, of that point: because, by definition

df/dx = lim_{δx→0} [f(x + δx) − f(x)]/δx

(see Chap. 1, note 20), and so, if δx is small,

df/dx ≈ [f(x + δx) − f(x)]/δx,

but then, if you turn this around,

f(x + δx) ≈ f(x) + (df/dx) δx.

Now, trigonometry teaches us that the derivative of the sine is the cosine: it follows that

sin(x + δx) ≈ sin x + cos x δx,

and, if x is zero, sin 0 is zero, cos 0 is one, and sin δx ≈ δx, QED. What I am doing here, really, is just an example of a general physical mathematical trick known as Taylor expansion, or Taylor series, which Brook Taylor figured out in the early 1700s, and published in his Methodus Incrementorum Directa et Inversa. His theorem says that a function f can be written as a “power series”,

f(x) = Σ_{n=0}^∞ [f^(n)(a)/n!] (x − a)ⁿ,    (N.20)
where f^(n)(x) stands for the nth derivative of f(x), n! is the factorial of n,

n! = n × (n − 1) × (n − 2) × · · · × 3 × 2 × 1,    (N.21)

i.e., the product of the first n positive integer numbers, starting with 1, and a is any point on the real axis. People often write Eq. (N.20) more explicitly,

f(x) = f(a) + f^(1)(a)(x − a) + f^(2)(a)(x − a)²/2 + f^(3)(a)(x − a)³/6 + · · · + f^(n)(a)(x − a)ⁿ/n! + · · ·    (N.22)
To convince yourself that Taylor’s right, apply the fundamental theorem (note 20) to f(x), i.e., write

f(x) = f(a) + ∫_a^x dt f^(1)(t).    (N.23)

Then, do the same thing with the first derivative of f(x),

f^(1)(x) = f^(1)(a) + ∫_a^x dt f^(2)(t),

and sub into (N.23), which becomes

f(x) = f(a) + ∫_a^x dt f^(1)(a) + ∫_a^x dt ∫_a^t dt′ f^(2)(t′).    (N.24)

Apply the fundamental theorem to the second derivative of f(x):

f^(2)(x) = f^(2)(a) + ∫_a^x dt f^(3)(t),

and sub into (N.24),

f(x) = f(a) + ∫_a^x dt f^(1)(a) + ∫_a^x dt ∫_a^t dt′ f^(2)(a) + ∫_a^x dt ∫_a^t dt′ ∫_a^t′ dt″ f^(3)(t″).    (N.25)

You might have guessed what the next step is: apply the fundamental theorem to the third derivative of f(x),

f^(3)(x) = f^(3)(a) + ∫_a^x dt f^(4)(t),

and sub into (N.25),

f(x) = f(a) + ∫_a^x dt f^(1)(a) + ∫_a^x dt ∫_a^t dt′ f^(2)(a) + ∫_a^x dt ∫_a^t dt′ ∫_a^t′ dt″ f^(3)(a) + ∫_a^x dt ∫_a^t dt′ ∫_a^t′ dt″ ∫_a^t″ dt‴ f^(4)(t‴),    (N.26)

and so on and so forth,

f(x) = f(a) + ∫_a^x dt f^(1)(a) + ∫_a^x dt ∫_a^t dt′ f^(2)(a) + ∫_a^x dt ∫_a^t dt′ ∫_a^t′ dt″ f^(3)(a) + ∫_a^x dt ∫_a^t dt′ ∫_a^t′ dt″ ∫_a^t″ dt‴ f^(4)(a) + · · · ,    (N.27)

and sorry for all the primes. Now,

∫_a^x dt f^(1)(a) = f^(1)(a) ∫_a^x dt = f^(1)(a) (x − a).

Likewise,

∫_a^x dt ∫_a^t dt′ f^(2)(a) = f^(2)(a) ∫_a^x dt ∫_a^t dt′ = f^(2)(a) ∫_a^x dt (t − a) = f^(2)(a) (x − a)²/2,

and likewise,

∫_a^x dt ∫_a^t dt′ ∫_a^t′ dt″ f^(3)(a) = f^(3)(a) ∫_a^x dt ∫_a^t dt′ ∫_a^t′ dt″ = f^(3)(a) ∫_a^x dt (t − a)²/2 = f^(3)(a) (x − a)³/(3 × 2).

Substituting all this into (N.27), we get

f(x) = f(a) + f^(1)(a)(x − a) + f^(2)(a)(x − a)²/2 + f^(3)(a)(x − a)³/6 + · · · .    (N.28)
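Here is what “stopping the sum early” looks like in practice; a sketch in Python, taking f = sin and a = 0 (my own choice of example):

```python
import math

def taylor_partial_sum(derivs_at_a, a, x):
    # partial sum of (N.20): sum of f^(n)(a)/n! (x - a)^n over the derivatives supplied
    return sum(d*(x - a)**n/math.factorial(n) for n, d in enumerate(derivs_at_a))

a, x = 0.0, 0.1
# derivatives of sin at a = 0 cycle through 0, 1, 0, -1, 0, 1, ...
derivs = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]

assert math.isclose(taylor_partial_sum(derivs[:2], a, x), x)        # first order: sin x ≈ x
assert abs(taylor_partial_sum(derivs, a, x) - math.sin(x)) < 1e-12  # a few more terms: much closer
```

Each extra pair of terms shrinks the error by roughly a factor of (x − a)², which is the whole point of truncating early when x is close to a.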
Fig. N.7 Rigid-body rotation. The vector Ω is what we call the rigid body’s angular velocity. The origin of the reference frame is on the rotation axis. If r is the position of a small chunk of rigid body, then the time-derivative of r is the velocity (not angular velocity: just velocity) v of that chunk. The vectors Ω, r and v are related by v = Ω × r
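The relation v = Ω × r of Fig. N.7 is easy to check numerically; a sketch in Python with numpy (the particular Ω and r are arbitrary choices of mine):

```python
import numpy as np

Omega = np.array([0.0, 0.0, 2.0])  # angular velocity along the rotation axis (arbitrary)
r = np.array([1.0, 1.0, 1.0])      # position of a chunk of the rigid body (arbitrary)
v = np.cross(Omega, r)

# magnitude: v = Omega r sin(alpha), alpha being the angle between Omega and r
cos_a = Omega @ r/(np.linalg.norm(Omega)*np.linalg.norm(r))
sin_a = np.sqrt(1.0 - cos_a**2)
assert np.isclose(np.linalg.norm(v), np.linalg.norm(Omega)*np.linalg.norm(r)*sin_a)

# direction: v is perpendicular to both Omega and r
assert np.isclose(v @ Omega, 0.0) and np.isclose(v @ r, 0.0)
```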
Now, you see that, in principle, we could keep playing this game forever, applying the fundamental theorem to derivatives of higher and higher order, etc.: and if we did, you see that what we would get is precisely (N.22), QED. Strictly speaking, the equality in (N.20) and (N.22) holds only if you extend the sum all the way to infinity: which would make Taylor’s result not very useful in practice. But the thing is, if (x − a) is small, then (x − a)² is even smaller, and (x − a)³ is much smaller, and so on, which means that as long as x is close to a, then it’s OK to stop the sum early. (Very often, first order is already OK, as you are going to see if you keep reading this book.)
57. See note 20, where I tried to explain what a derivative is.
58. This can also be proven by old-school algebra and trigonometry, without matrices and stuff. Look at the diagram in Fig. N.7: if the rigid body only rotates, with no translation, in a time δt the chunk of rigid body at position r moves to a point r + δr. By trigonometry, the magnitude of δr is δr = r δφ sin α, where α is the angle between the direction of r and the rotation axis (i.e., α is between r and Ω); and but so then the magnitude of velocity, v = |dr/dt|, is

v = r sin α dφ/dt.

By definition of angular velocity, we have

Ω = dφ/dt,

and

v = Ω r sin α,

which is the magnitude of the cross product between Ω and r. As for the direction of velocity, for δr to be purely a rotation around the axis defined by
Ω, it must be perpendicular to the plane identified by Ω and r, i.e., perpendicular to both Ω and r. Now, δr and v point to the same direction, and so it follows that v is also perpendicular to both Ω and r. This, together with the expression we just found for the magnitude of v, implies that

v = Ω × r,
which is equivalent to (2.12), QED. You might have noticed that I have picked the origin of the reference frame right on the rotation axis. This just makes everything easier. I guess you can obtain the same result, no matter where you take the origin to be. But I haven’t checked.
59. Which one can prove, if one remembers just what one has learned in note 16. In practice, you need to use Eq. (N.5) from there, and a lot of patience.
60. See note 55.
61. If you take N to be infinite and the mass elements infinitely small, or, alternatively, if you approximate the integral in (2.1) with a sum over small elements.
62. Actually, the earth is an ellipsoid, and r is normalized distance from the center of a unit sphere. What we are assuming, strictly speaking, is that in the primed coordinate system, where the surface of the earth is described by the equation r′₁² + r′₂² + r′₃² = 1, density ρ(r′₁, r′₂, r′₃) is really only a function of r′₁² + r′₂² + r′₃², i.e. of distance from the center of the earth. This distinction is subtle and can pretty much be neglected, I guess, at least in the first approximation: because the earth is an ellipsoid, yes, but quite close to a sphere.
63. Integration by parts is a trick to solve some integrals—usually when the product of two functions is involved. To understand how that works, we need to first look at the derivative of the product of two functions; let us call them f(x) and g(x) and let us do the limit of the incremental ratio (see note 20) to find the derivative of f(x)g(x) at a point x₀:

d(fg)/dx (x₀) = lim_{x→x₀} [f(x)g(x) − f(x₀)g(x₀)]/(x − x₀).

It is fine (and, as we shall see in a second, useful) to add and subtract the same quantity, f(x₀)g(x)/(x − x₀), to the right-hand side:

lim_{x→x₀} [f(x)g(x) − f(x₀)g(x) + f(x₀)g(x) − f(x₀)g(x₀)]/(x − x₀)
    = lim_{x→x₀} {[f(x) − f(x₀)]/(x − x₀)} g(x) + lim_{x→x₀} f(x₀) [g(x) − g(x₀)]/(x − x₀)
    = g(x₀) df/dx (x₀) + f(x₀) dg/dx (x₀).

And so,
d(fg)/dx (x₀) = g(x₀) df/dx (x₀) + f(x₀) dg/dx (x₀).

This works whatever the value of x₀: so we might as well write

d(fg)/dx (x) = g(x) df/dx (x) + f(x) dg/dx (x).

That is an important result per se; and but now we are going to use it to see how integration by parts works. If we take the integral of both sides of the last equation I wrote, we have

∫ dx d(fg)/dx (x) = ∫ dx g(x) df/dx (x) + ∫ dx f(x) dg/dx (x),

but then, by the fundamental theorem, that means that

f(x) g(x) = ∫ dx g(x) df/dx (x) + ∫ dx f(x) dg/dx (x).

This equation becomes useful if you swap things around, i.e.,

∫ dx g(x) df/dx (x) = f(x) g(x) − ∫ dx f(x) dg/dx (x),    (N.29)
because if you happen to have to calculate the integral of the product of two functions, say g(x) and h(x), and you happen to know a primitive of h(x), i.e. a function f(x) such that df/dx (x) = h(x), then you can replace your integral with the right-hand side of (N.29)—all you need to do is take the derivative of g(x). This doesn’t always help: but it might turn out that the integral of f(x) dg/dx (x) is easier to solve than the integral you started out with: and that is when you actually resort to integration by parts.
64. In the case of the earth, there are indeed some relatively important external torques. There’s a portion of the gravitational attraction from the sun, and a portion of that from the moon, that are not accounted for by the rotation of earth around sun, and moon around earth. Those are the torques that cause tides. We’ll see about all that later.
65. Newton’s laws are valid, in the forms in which we’ve seen them so far in this book, in a fixed reference frame. In fact, they were initially verified by Newton against astronomical data, which are expressed in the reference frame of the fixed stars. The earth rotates around its own axis with angular velocity Ω. So, if your reference frame is attached to the surface of the earth, then your reference frame rotates with angular velocity Ω. Now, imagine that some object is not rotating together with the earth, but is fixed, with respect to Newton’s fixed reference frame. If you are rotating together with the earth, like you probably
actually are as you are reading this, that hypothetical object would look to you as if it were moving. Let’s quantify this effect. The algebra we need to do looks a lot like what we did in note 54, which gave us Eq. (2.9). The difference is that now what’s rotating is not a material point, but a system of reference in which the coordinates of that point are given. What we want to find is a formula that gives the coordinates x′₁, x′₂, x′₃ of a point in the rotating system, if its coordinates x₁, x₂, x₃ in the fixed system are known. We shall call i′, j′, k′ the unit vectors that identify the axes of the rotating frame, and i, j, k their counterparts in the fixed frame. It’s convenient to take the vertical axis in both frames to coincide with the rotation axis, i.e. k and k′ are both parallel to Ω (and coincide with one another as a result, and so do x₃ and x′₃). It follows from all this that one can write

x′₁ i′ + x′₂ j′ = x₁ i + x₂ j,

which is the same as

x′₁ (i′₁, i′₂) + x′₂ (j′₁, j′₂) = x₁ (1, 0) + x₂ (0, 1),

or

x′₁ i′₁ + x′₂ j′₁ = x₁,
x′₁ i′₂ + x′₂ j′₂ = x₂.    (N.30)

Like I said, the primed and unprimed frames differ only through a rotation around the vertical axis. We might as well call Ωδt the angle by which the primed system has rotated with respect to the fixed one; then the components of the unit vector i′ in the fixed system are cos(Ωδt) and sin(Ωδt), i.e.

i′ = cos(Ωδt) i + sin(Ωδt) j,

or

(i′₁, i′₂) = (cos(Ωδt), sin(Ωδt)),

and likewise

(j′₁, j′₂) = (− sin(Ωδt), cos(Ωδt)),

and if you plug that into (N.30), plus you remember that x′₃ = x₃, it follows that

⎛x₁⎞   ⎛cos(Ωδt)  − sin(Ωδt)  0⎞   ⎛x′₁⎞
⎜x₂⎟ = ⎜sin(Ωδt)    cos(Ωδt)  0⎟ · ⎜x′₂⎟ .    (N.31)
⎝x₃⎠   ⎝   0           0      1⎠   ⎝x′₃⎠

Now, people who are used to playing around with matrices might see right away that if you take the matrix at the right-hand side of (N.31), swap the second entry of the first row with the first entry of the second row, and then dot the
matrix so obtained, on the left, with both sides of (N.31), what you get reads

\[
\begin{pmatrix}
\cos(\Omega\delta t) & \sin(\Omega\delta t) & 0 \\
-\sin(\Omega\delta t) & \cos(\Omega\delta t) & 0 \\
0 & 0 & 1
\end{pmatrix}
\cdot
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix}
\tag{N.32}
\]
(i.e. dot-multiplying the matrix at the left-hand side of (N.32) with the matrix at the right-hand side of (N.31) gives the identity matrix). If you remember Eq. (2.10), you'll predict what comes next: we subtract the (column) vector \((x_1, x_2, x_3)\) from both sides of (N.32), and

\[
\begin{aligned}
\begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix}
-
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
&=
\begin{pmatrix}
\cos(\Omega\delta t) - 1 & \sin(\Omega\delta t) & 0 \\
-\sin(\Omega\delta t) & \cos(\Omega\delta t) - 1 & 0 \\
0 & 0 & 0
\end{pmatrix}
\cdot
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \\
&\approx
\begin{pmatrix}
0 & \Omega\delta t & 0 \\
-\Omega\delta t & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}
\cdot
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \\
&\approx
\begin{pmatrix} \Omega\delta t\, x_2 \\ -\Omega\delta t\, x_1 \\ 0 \end{pmatrix},
\end{aligned}
\]
which indeed looks a lot like (2.10). In analogy with (2.10), etc., it follows that

\[
\frac{d\mathbf{x}'}{dt} = -\boldsymbol{\Omega} \times \mathbf{x},
\tag{N.33}
\]
which is the same as (2.12) except for the sign: makes sense, because a positive—counterclockwise—rotation of the earth would be equivalent (for an observer that rotates with the earth) to an apparent negative (clockwise) rotation of any truly fixed object that she might be looking at. Now, the bottom line of all this as far as we're concerned is that if we (as we rotate together with the earth) see that something is moving with velocity \(-\boldsymbol{\Omega}\times\mathbf{x}\), that means that its velocity in a fixed reference frame is precisely zero. And in general, if we want to translate the velocity we've observed in our rotating frame to the one we would have observed in the fixed frame, then we've got to sum the vector \(\boldsymbol{\Omega}\times\mathbf{x}\) to our observation. Incidentally, if you differentiate (N.33) against time you get an expression for acceleration, which is more complex than (N.33), in that it also involves the derivative of \(\boldsymbol{\Omega}\) with respect to time, etc. Plugging that expression into Newton's second law (force equals mass times acceleration), you get the formula for the so called "centrifugal" and "Coriolis" forces. As you might know, in fact, those are "apparent" forces which we observe only so long as we make our measurements in a rotating frame. Anyway, no need to get into that now.
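The coordinate bookkeeping in this note is easy to spot-check numerically. Here is a minimal sketch (my addition, not part of the text; the function name and numerical values are made up for illustration), verifying that a point which is fixed in the inertial frame appears to move, in the frame rotating about the vertical axis, with velocity \(-\boldsymbol{\Omega}\times\mathbf{x}\), as (N.33) says:

```python
import math

# Spot-check of (N.33): a point fixed in the inertial frame appears,
# in the rotating frame, to move with velocity -Omega x x.
# The rotation is about the vertical (x3) axis, as in the note.
OMEGA = 7.2921e-5  # rad/s, roughly the earth's spin rate

def to_rotating_frame(x, t):
    """Rotating-frame coordinates of a point with fixed-frame coordinates x,
    i.e. Eq. (N.32) with the angle Omega*(delta t) replaced by Omega*t."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (c * x[0] + s * x[1], -s * x[0] + c * x[1], x[2])

# a point that does NOT rotate with the earth (fixed-frame coordinates):
x = (6.4e6, 2.0e6, 1.0e6)  # metres

# apparent velocity in the rotating frame, by a finite difference:
dt = 1e-3
x0, x1 = to_rotating_frame(x, 0.0), to_rotating_frame(x, dt)
v_apparent = [(b - a) / dt for a, b in zip(x0, x1)]

# -Omega x x, with Omega = (0, 0, OMEGA):
v_formula = [OMEGA * x[1], -OMEGA * x[0], 0.0]

assert all(abs(va - vf) < 1e-3 for va, vf in zip(v_apparent, v_formula))
```

The finite-difference velocity agrees with the cross-product formula to well within the discretization error, and the vertical component is untouched, exactly as the derivation above predicts.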
66. I took a shortcut, without really solving (2.34), right? As a result, (2.35) and (2.36) are only valid for a short time interval \(\delta t\) after the moment \(t\) at which we know \(\boldsymbol{\omega}\). But, just think about it: at the time \(t+\delta t\), \(\boldsymbol{\omega}\) points to a new direction, but then, from there, it keeps revolving around the same axis, and at the same rate, according to

\[
\begin{aligned}
\omega_1(t+2\delta t) &= \omega_1(t+\delta t)\cos\left(\tfrac{C-A}{A}\,\omega_3\,\delta t\right) - \omega_2(t+\delta t)\sin\left(\tfrac{C-A}{A}\,\omega_3\,\delta t\right),\\
\omega_2(t+2\delta t) &= \omega_1(t+\delta t)\sin\left(\tfrac{C-A}{A}\,\omega_3\,\delta t\right) + \omega_2(t+\delta t)\cos\left(\tfrac{C-A}{A}\,\omega_3\,\delta t\right),
\end{aligned}
\]

and so on and so forth. So I guess that we can just write

\[
\begin{aligned}
\omega_1(t) &= \omega_1(0)\cos\left(\tfrac{C-A}{A}\,\omega_3\,t\right) - \omega_2(0)\sin\left(\tfrac{C-A}{A}\,\omega_3\,t\right),\\
\omega_2(t) &= \omega_1(0)\sin\left(\tfrac{C-A}{A}\,\omega_3\,t\right) + \omega_2(0)\cos\left(\tfrac{C-A}{A}\,\omega_3\,t\right).
\end{aligned}
\]
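That check can also be done numerically. Here is a minimal sketch (my addition, not from the book; it assumes that (2.34) takes the standard torque-free symmetric-top form \(\dot\omega_1 = -\tfrac{C-A}{A}\omega_3\,\omega_2\), \(\dot\omega_2 = +\tfrac{C-A}{A}\omega_3\,\omega_1\), with \(\omega_3\) constant, which is what the closed form above solves):

```python
import math

# Spot-check of the closed-form free-precession solution against a direct
# numerical integration. ASSUMPTION: (2.34) has the torque-free
# symmetric-top form  dw1/dt = -k*w2,  dw2/dt = +k*w1,  w3 = const,
# with k = (C - A)/A * w3.
A, C = 1.0, 1.00327        # earth-like ratio of moments of inertia
w3 = 2.0 * math.pi         # one revolution per day (time unit: days)
k = (C - A) / A * w3       # free-precession rate; 2*pi/k is ~306 days

w1_0, w2_0 = 0.7, 0.3      # equatorial components of omega at t = 0
w1, w2 = w1_0, w2_0

dt, t = 1e-3, 0.0
for _ in range(int(305 / dt)):          # integrate over ~one wobble period
    # midpoint (2nd-order Runge-Kutta) step
    m1 = w1 - 0.5 * dt * k * w2
    m2 = w2 + 0.5 * dt * k * w1
    w1, w2 = w1 - dt * k * m2, w2 + dt * k * m1
    t += dt

# closed-form solution from the note
w1_exact = w1_0 * math.cos(k * t) - w2_0 * math.sin(k * t)
w2_exact = w1_0 * math.sin(k * t) + w2_0 * math.cos(k * t)

assert abs(w1 - w1_exact) < 1e-6 and abs(w2 - w2_exact) < 1e-6
```

With the earth-like value (C − A)/A ≈ 0.00327 used here, 2π/k comes out at about 306 days, i.e. the rigid-earth free-precession period of roughly 300 days discussed later in these notes, the one that Chandler's observed ~430 days famously disagreed with.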
By the way, you can sub that into (2.34) and check that it really is a solution: a nice exercise.

67. That is, start the derivation from Eq. (2.15) instead of (2.32), and replace F(r) in (2.15) with the appropriate formula—Newton's gravitational law—for the sun's and moon's attractions.

68. No proof, because that would take too much room, but, e.g., the sun alone forces the earth's rotation axis to go through one full precession in727

\[
\frac{2\,a^3\,(1-e^2)^{3/2}\,C\,\omega_3}{3\,G M_S\,(C-A)\cos(\varepsilon)}
\]

days (if \(\omega_3 = 1\), or hours if \(\omega_3 = 24\), etc.). Where: \(M_S\) is the mass of the sun; \(\varepsilon\) the obliquity728; \(a\) half the length of the major axis of the (elliptical) orbit of the earth around the sun; \(e\) the orbit's eccentricity729; and I think you know what the rest is.

69. To see why we sum free and forced precession, think of (2.32) as a homogeneous system of ODEs, associated with the non-homogeneous system (2.15): we have seen in note 46 that if you sum one solution of a non-homogeneous ODE with the general solution of the associated homogeneous ODE, what you get is the general solution of the non-homogeneous ODE.

70. Chandler wasn't an academician, at all. While still in high school, though, he "performed mathematical computations" for Benjamin Peirce, a prof. of mathematics and astronomy at Harvard. After graduating high school, which he did in 1861, "he became an assistant to B. A. Gould, one of the best-known American astronomers of that time". But, for whatever reason, he didn't go to college. Instead, "in 1866, Chandler accepted a position as an aid in the U.S. Coast Survey. The recently developed technology of telegraphy made it possible to transfer time accurately over intercontinental distances and, with transoceanic cables, even across the oceans. The Coast Survey was making use of this new technology to develop a unified national geodetic reference system. Chandler's first field experience was as a member of a team that was
making astronomic longitude observations at Calais, Maine, by using the trans-Atlantic cable to communicate directly with the Royal Greenwich Observatory in London. [...] In 1869, B. A. Gould was preparing to accept a new position as director of the Cordoba Observatory in Argentina. Gould invited Chandler to join him, but the young man had fallen in love with Carrie Margaret Herman, a niece of his stepmother. Chandler left the Coast Survey and accepted a position in New York as an actuary with the Continental Life Insurance Company. In October 1870, he and Carrie Herman were married. During the next 6 years, his first three daughters were born", etc. (Quotes from W. E. Carter, "Seth Carlo Chandler, Jr.: Discoveries in Polar Motion", Eos, vol. 68, 1987.)

71. Already in the 1800s, people knew at least something about ρ(r): Mount Schiehallion, and Cavendish' experiment, and all that. I am really simplifying things here, though.

72. If this is the first time you read this book, and you are anxious to move on to other topics, skip this note. Otherwise: the big thing about Chandler's observation, when he first made it public, is that the figure he had come up with—430 days or so—was inconsistent with previous estimates of \(A/[(C-A)\omega_3]\), derived from precession-of-equinoxes data, which said that free precession should have a much shorter period: about 300 days. Such a big discrepancy couldn't be explained away by a bad density profile, because, like I said, the precession of the equinoxes was fit pretty well already. People suspected that the whole theory could be wrong, or that Chandler might have messed something up. Simon Newcomb, an American astronomer, sorted out the problem shortly after Chandler's publication, and published his explanation in 1892. Put simply, yes, there was a problem with the theory, in that the theory was based on the assumption that the earth be a perfectly rigid body: which it isn't: for one, most of it is covered by a fluid layer: the oceans; secondly, the materials that form the earth can deform, as we know, e.g., from earthquakes (in Chap. 6 we are going to find out all about the earth's elasticity). So, if one takes the earth's elasticity and the oceans' fluidity into account, then Chandler's result is explained. The important point being, I guess, that if the rotation axis changes, then a non-rigid body tends to change its shape under the effect of the centrifugal force, etc. (remember Chap. 1), and the oceans and solid earth do precisely that, which in turn perturbs A and C, etc. I'll spare you (and myself) all the maths, but there are formulae, which have been determined, that relate the period of free precession to the unperturbed earth's A/(C − A), taking account of the earth's elasticity (or "defect of rigidity", as A. E. H. Love calls it), and people who wish to constrain earth's structure based on wobble data have to use those, rather than (2.36).

73. "Über die Massenverteilung im Inneren der Erde", i.e., "On the Distribution of Mass in the Earth's Interior", Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen, vol. 221.

74. To be honest, there are, and were, more observations to fit than those of precession and total mass. In the first half of the eighteen hundreds James MacCullagh derived a mathematical relation between the acceleration of gravity, g, as you
would measure it at the earth's surface, and the latitude at which you'd measure it, and it turned out that A and C appear in it as parameters. Which meant that Wiechert could plug his estimates of A and C, based on his density model, into those expressions, and check that the function of g versus latitude that he would get coincides, more or less, with how the g that we actually observe changes with latitude. An additional constraint on top of those I've already mentioned, and I guess Wiechert must have used it as well. MacCullagh's theory is heavier than what I want to do in this book, though.

75. "Researches in Physical Geology", Philosophical Transactions of the Royal Society of London, vol. 129, 1839, and "Researches in Physical Geology—Second Series", which came out in 1840 in vol. 130 of the same journal.

76. Travaux du Comité Français d'Histoire de la Géologie, 1998, 3ème Série (tome 12), pp. 59–70.

77. Educated at the École des Mines (the French national school of mines, in Paris), prof. of geology at the National Museum of Natural History between 1819 and 1861 (the year he died). For a while he was also director of the Museum, where he initiated the galerie de géologie. There's a biographic memoir on Cordier published in 1895 by M. Bertrand—another famous French geologist—in vol. 27 of Annales des Mines. We learn, there, that Cordier was one of the scientists who took part in Napoleon's expedition to Egypt (1798). In his diary he wrote: "most of the Frenchmen who visited the Pyramids didn't see in them anything but tasteless stacks of stones. The others, understanding the ingenuity that was needed to build them, call them monuments to tyranny and unhappiness; don't they realize that there is no great monument in the world that can't be attributed to those three great scourges of mankind: ambition, superstition, and tyranny?!"

78. For instance, Ingrid Stober and Kurt Bucher write that "early experience with geothermal phenomena has also been reported from the mining industry. Agricola realized in 1530 already730 that the temperature in underground mines increases with depth. The first reported [underground] temperature measurements with a thermometer are probably those by De Gensanne in 1740 in a mine near Belfort, France. Alexander von Humboldt measured a temperature increase of 3.8 °C per 100 m depth [...] in the mining district of Freiberg, Saxony, in the year 1791" (Geothermal Energy: From Theoretical Models to Exploration and Development).

79. Remember the definition of angular momentum that I gave earlier, when we derived Eq. (2.7). Now, it follows from Eq. (2.7) that, if there are no forces, i.e., if \(F_i = 0\) for all values of i, then the time derivative of total angular momentum is zero: which means that the total angular momentum doesn't change over time—which means that it is conserved—which is the law of conservation of angular momentum. It holds for a whole rigid body, like in Eq. (2.7), but it holds also for an individual material point, which is described by (2.7) with N = 1. I don't think I've said anything about what "angular momentum" means in practice. There's a classic example, that everyone who's been in a physics
course must have heard: imagine you're sitting on a swivel chair, and you are spinning it—and yourself—around. If, when you start spinning, your arms and/or legs are spread out, you will gain speed by pulling them towards your body—and the rotation axis. Conversely, you can slow yourself down if you spread arms and legs out, away from the rotation axis. This is all in Eq. (2.7); or, which is the same, Eq. (2.8). If you are not imparting any new force on yourself other than your initial spin, (2.8) boils down to

\[
\frac{d}{dt}\int_V dV\,\rho(\mathbf{r})\,\mathbf{r}\times\frac{d\mathbf{r}}{dt} = 0,
\]

i.e., yes, the angular momentum doesn't change. Now, if, like I was saying, you spread out (away from the rotation axis) parts of your body, that means that, on average, \(\mathbf{r}\) in \(\mathbf{r}\times\frac{d\mathbf{r}}{dt}\) grows, and so, then, for angular momentum to be conserved, the velocity, \(\frac{d\mathbf{r}}{dt}\), must decrease. And the opposite is true if you pull arms and legs and/or whatever other parts of your body together, close to the rotation axis. If you don't believe it, give it a try.

80. A French monthly cultural magazine that was founded in 1829 and, actually, is still around.

81. More famous for his contributions to the study of electromagnetism: we'll meet him again when I cover that, in Chap. 8. He came of age in Lyon during the French revolution, and in 1793, according to Isaac Asimov's Biographical Encyclopedia of Science and Technology (second edition, Doubleday 1982), "Ampère's father, who was a well-to-do merchant and one of the city's officials, was guillotined. Ampère went into a profound depression as a result, out of which [...] he struggled with difficulty. In 1803 his beloved wife of but four years died and this again hit him hard. Indeed, he never recovered from that blow. (In 1818 he married a second time, and this time the marriage was unhappy.)

"At Napoleon's insistence, Ampère continued a fruitful career as a professor of physics and chemistry at Bourg, and then in 1809 as a professor of mathematics in Paris.

"He was, like Newton, the classic example of an 'absent-minded professor.' Many stories (not necessarily true) are told of him, including one in which he forgot to keep an invitation to dine with the Emperor Napoleon, probably the only occasion on which the emperor was ever disappointed in this manner—and with impunity, for Napoleon appointed him inspector general of the national university system in 1808."

82. Held 1824–1836 at the Collège de France731.

83. When you look at the night sky, usually you just see lots of fairly bright spots—each a planet, or a star.
If you look with a telescope, though, you might be able to see some hazy areas—nebulae. When William Herschel started looking at nebulae in 1774, people weren’t clear about what they might really be—whether they were clusters of stars, so far from us that we can’t tell one star from the
other, or vast masses of gaseous material distributed over enormous volumes in interstellar space—like Edmond Halley had proposed some thirty years earlier. Herschel had emigrated from Germany to Britain at the age of nineteen, in the aftermath of one of the many strange wars that took place in Europe during that century. He settled in Bath together with his sister Caroline, and both became fairly successful musicians—their father had been an oboist. Herschel figured he could do better as an astronomer, and started playing around with telescopes (incl. looking at nebulae, trying to figure out what they were). He turned out to be pretty good at that—in 1781 he came across a strange-looking star which he soon realized was actually a planet—one that nobody had ever seen before: Uranus. This attracted the attention of King George III, who then hired Herschel to live near Windsor and be, like, a court astronomer, while still doing his thing. Both William and Caroline quit Bath, and their careers in music, and went on to put together a tremendous catalogue of nebulae, which eventually included thousands of them, and was way bigger than anything people had been able to come up with thus far. The way they did it was kind of by brute force: “To decide the riddle of the nebulae”, writes Michael Hoskin732 , Herschel “would need big telescopes with mirrors that could collect enough light to make possible an examination of these faint and mysterious objects. Accordingly, on arrival in the Windsor area, he set himself to build a large reflector, of 20 feet focal length and mirrors 18 inches in diameter. Soon after this was completed he would be petitioning the King for funds to build a 40 feet monster with mirrors 4 feet in diameter”; and Hoskin infers that “the study of nebulae must have been his primary motivation” for using something so big. 
Anyway, “the nature of the nebulae remained an issue until 13 November 1790, when [Herschel] came across a star surrounded by nebulosity and concluded (correctly) that the nebulosity was associated with the star and was not a vast and distant star system”: just like Halley had thought. 84. Transactions of the Royal Society of Edinburgh, vol. 23, pages 167-169. 85. Possibly the most influential British physicist of the Victorian era. “In 1846 he was elected Professor of Natural Philosophy at Glasgow and held that post until 1899. He is best known for his work, in combination with James Joule, on the laws of thermodynamics, and for the invention of navigational and electrical measuring instruments. In 1892 he became the first scientist to be honoured with a peerage and took the title Baron Kelvin from Kelvin Grove in Glasgow, where his grandmother had lived [...]. He was awarded the Order of Merit in 1902 and died of a severe chill on 17th December 1907.” (From the website of Westminster Abbey, where Kelvin is buried.) We’ll meet him again a number of times in this book, starting with Chap. 4. 86. “Hopkins [...] is one of those rare scientists who is famous for his teaching. As a tutor at Cambridge University, he trained a whole generation of outstanding mathematical physicists including G. G. Stokes, William Thomson, and James Clerk Maxwell.” (Stephen G. Brush, “Discovery of the Earth’s core”, American Journal of Physics, vol. 48, 1980.)
87. Carl Gustav Christoph Bischof. Professor of Chemistry and Technology at the University of Bonn, author of Lehrbuch der Chemischen und Physikalischen Geologie, or Elements of Chemical and Physical Geology, a 3000-page long affair, and the geochemistry book of the late eighteen hundreds. 88. 1835 edition. 89. Who was one of the so called "fathers of the Church". In De Pallio (translated by S. Thelwall), he says: "There was a time when [the earth's] whole orb [...] underwent mutation, overrun by all waters. To this day marine conchs and tritons' horns sojourn as foreigners on the mountains". 90. Codex Leicester 9B. 91. Steno's life story is strange in today's terms—but probably not so much for a learned man of the seventeenth century. He was born in Copenhagen, 1638, and went to study medicine in Paris. In 1665 he moved to Florence, where the Grand Duke of Tuscany, Ferdinand II de' Medici, appointed him to a hospital post that left him enough time to do research. In 1666, Ferdinand asked him to dissect the head of a huge shark that had been caught near Livorno. "Steno dissected it and published his findings in 1667", says Steno's page in the website of UC Berkeley museum of paleontology. "While examining the teeth of the shark, Steno was struck by their resemblance to certain stony objects, called glossopetrae or 'tongue stones', that were found in certain rocks. Ancient authorities, such as the Roman author Pliny the Elder, had suggested that these stones fell from the sky or from the moon. Others were of the opinion, also going back to ancient times, that fossils naturally grew in the rocks. [...] Steno, however, argued that glossopetrae looked like shark teeth because they were shark teeth, that had come from the mouths of once-living sharks, and come to be buried in mud or sand that was now dry land.
There were differences in composition between glossopetrae and living sharks' teeth, but Steno [argued] that fossils could be altered in chemical composition without changing their form."

92. From Nicolas Steno's Dissertation Concerning a Solid Body Enclosed by Process of Nature within a Solid: An English Version with an Introduction and Explanatory Notes, 1916. This is at pages 229–230.

93. Benoît de Maillet was a French diplomat who spent much of his life in Egypt and wrote an anonymous treatise, whose title is just his last name, spelled backwards. Or more precisely, Telliamed: or, Discourses between an Indian Philosopher and a French Missionary, on the Diminution of the Sea, the Formation of the Earth, the Origin of Men and Animals, and Other Curious Subjects, Relating to Natural History and Philosophy. Telliamed was one of those clandestine pamphlets that circulated in manuscript, rather than printed form, through much of the eighteenth century, for fear that they might be censored as blasphemous by the political/religious establishment of the time. It was written between the end of the XVII and the beginning of the XVIII centuries, but the first printed edition only appeared in 1748 (which is ten years after de Maillet's death), in bowdlerized form. I am not sure what exactly the problem was—maybe the underlying idea that the earth was older than a couple thousand years, which we'll get back to later; or, just in general, the fact that, according to de Maillet, creation didn't go exactly as written in the Bible? Anyway, if you are curious, an English translation of Telliamed based on an early (1729), unexpurgated manuscript was published in the US in 1968: University of Illinois Press, translated, edited and annotated by Albert V. Carozzi.

94. Great Geological Controversies, second edition, Oxford Science Publications, 1990.

95. Kurze Klassifikation und Beschreibung der Verschiedenen Gebirgsarten, or, Short Classification and Description of the Various Kinds of Mountains (Dresden, 1787).

96. It seems that around this time it became, uhm, fashionable to go and explore mountain ranges—something that today some people like to do for fun, but wasn't really common back then. If you are curious about this, look up Horace Bénédict de Saussure; and maybe also Peter Simon Pallas.

97. In an early version of Werner's stratigraphy, 3 and 4 were thought of as a single unit, and they were called "tertiary".

98. I am no chemist, but I guess the phrase chemical precipitate here means some solid material, initially dissolved in water but at some point separating from the water by a chemical reaction, or some other phenomenon we don't necessarily know yet, in this case, and then sinking by gravity to the bottom of the ocean. This is different from mechanically deposited sediments, i.e., stuff that gets eroded away from pre-existing rocks, broken into tiny fragments, and then transported (e.g. by water or wind) and deposited and compacted by its own weight, etc.

99. See note 98.

100. Guettard's essays are collected in his Mémoires sur Différentes Parties des Sciences et Arts; this one is the cinquième mémoire of volume 3 (published in 1770). In the same volume, there's also the septième mémoire, Sur les Dépôts Faits par la Mer, or On Sediments Deposited by the Sea: the quote on the sediments not extending far out to sea is from there. The way I understand it, already in Guettard's time, it had been noticed, through navigation, that sandbanks and shoals were found mostly near the shore and near the mouths of rivers; then water would quickly become deeper, and stay deep over most of the oceans. It also had been noticed that the extent of sandbanks, in distance from the river mouth, was proportional to the speed of the river waters as they flow into the sea. Guettard was aware that the data available to him were few, and his inference that sediments pile up only near the shore was really just an "educated guess", as they say. But today, of course, the depth of seas and oceans has been charted with great precision all over the place—as we shall see in Chap. 8—and Guettard's speculation is confirmed.

101. In his 1866 book, Orographic Geology: the Origin and Structure of Mountains, George L. Vose writes that "according to Hutton, the lowest rocks were crystalline masses, formed by the cooling of a globe of molten matter. The upper surface of such rocks was first denuded by the atmosphere and the sea; the sediments thus produced were then deposited at the bottom of the ocean;
and there, by heat ascending from below and from the pressure above, were first consolidated, next converted into a crystalline bedded rock; and finally so far heated as to fuse into granite, and to be re-absorbed into the interior, to be again upheaved, ready to repeat the same series of operations, to which there is neither beginning nor end” (italic mine); i.e., what is now an igneous rock used to be a sedimentary rock. (We’ll meet Vose again, and look again at his book, in the next couple of chapters.) 102. Calcareous spar: “An old, common name used for crystalline calcium carbonate prior to 1845 when the name was changed to Calcite. Calcareous spar is still occasionally used for calcite in suppliers’ catalogs.” From cameo.mfa.org. 103. Meaning about 2600 ◦ C or 2800 ◦ C. The Wedgwood scale is a temperature scale, now long abandoned, designed by a potter—Josiah Wedgwood—to measure the quite high temperatures at which clay becomes pottery. Thermometers that function at such temperatures are, or I should say were called pyrometers. “The first instrument worthy of the name ‘pyrometer’ used in relation to metallurgy, mineralogy, and glass and ceramics making,” writes Sally Newcomb733 , “was invented by Josiah Wedgwood of Etruria, friend, correspondent, and laboratory ware supplier to well-known scientists of his day, including Joseph Black, Joseph Priestley, Sir James Hall, and Lavoisier, and many others. Wedgwood was both a fine potter and an excellent businessman, who wished to maintain the quality of his known wares and initiate new forms, slips, colors, and glazes.” Before Wedgwood, potters and or people working at very high temperatures would use the colors, that stuff would turn to when heated, as a reference for their temperatures; e.g., the adjective “red hot”, meaning that color turns to red, corresponds to roughly 1000 ◦ F or 580 ◦ C. 
Wedgwood figured, in his own words (“An Attempt to make a Thermometer for measuring the higher Degrees of Heat, from a red Heat up to the strongest that vessels made of Clay can support”, Philosophical Transactions, vol. 72, 1782), cited by Newcomb, that “a red, bright red, and white heat, are indeterminate expressions.” To try and come up with a proper temperature scale, says Newcomb, Wedgwood initially tried to map more precisely the colors one gets at different temperatures: he “took a mixture of calces of iron [calces, plural of calx I guess, are the stuff that forms when you bring iron to really high temperature—you kind of roast it, and then this layer of ashy material—calx—forms on its surface] mixed with clay, formed circular disks an inch in diameter and a quarter of an inch thick, and placed them in a kiln ‘in which [again Wedgwood, from the 1782 paper] the fire was gradually augmented, with as much uniformity and regularity as possible, for near 60 hours; the pieces taken out at equal intervals of time during the successive increase of heat, and piled in their order on each other in a glass tube.’ “The pieces,” writes Newcomb, “demonstrated a gradation of colors that at first Wedgwood felt could be used to indicate the heat to which a mixture of the same kind had been exposed. Practically, however, Wedgwood found that both color perception and description were not exact enough for his purposes. He then said [still the 1782 paper]: ‘In considering this subject attentively, another property
of argillaceous bodies occurred to me; a property which obtains, in a greater or less degree, in every kind of them that has come under my examination, so that it might be deemed a distinguishing character of this order of earths; I mean, the diminution of their bulk [read: volume] by fire [read: heating]; I have the satisfaction to find, in a course of experiments made with this view, that it is a more accurate and extensive measure of heat than the different shades of color'." So Wedgwood dropped color, and adopted the amount of contraction of the clay on heating as a measure of temperature. "He reported that his highest temperature, 240 °W, corresponded to 32,277 °F, and on down to the freezing point of water at −8.042 °W or 32 °F. Zero on Wedgwood's scale was 'red-heat fully visible in day-light'. "Despite Wedgwood's hopes, and the optimistic reports of contemporaries, his Fahrenheit values were incorrect; the relation between scales could not be reduced to a simple function, and the amount of contraction of the clay on heating was not linear. [...] Even so, Wedgwood's pyrometer pieces were the first even remotely reliable way to quantify the heat of furnaces." I'll tell you more about heat and temperature in Chap. 4. 104. From the Dictionary of National Biography, 1885–1900: "Wedgwood, Josiah by Arthur Herbert Church. Wedgwood, Josiah (1730–1795), potter, thirteenth and youngest child of Thomas and Mary Wedgwood (born Stringer), was baptised in the parish church of Burslem, Staffordshire, on 12 July 1730. He was the fourth in descent from Gilbert Wedgwood of the Mole in Biddulph, born in 1588, who settled in Burslem about 1612, when he married Margaret, one of the two daughters and coheirs of Thomas Burslem. This Gilbert was a great-great-grandson of John Wedgwood of Dunwood, whose marriage took place in 1470.
The Wedgwoods were a prolific race, so that, in spite of the possession of some property in lands and houses, it was necessary for the cadet branches of the family to make a living by adopting the staple occupation of the district. Thus it came to pass that Josiah Wedgwood's father, as well as several of his uncles and cousins, were potters—some masters, some journeymen. "Before Josiah had completed his ninth year his father died, and the boy's school career, such as it was, closed. He at once began work at Burslem in the pottery of his eldest brother, Thomas, and soon became an expert 'thrower' on the wheel. An attack of virulent smallpox when he was about eleven greatly enfeebled him, particularly affecting his right knee. However, on 11 Nov. 1744, when Josiah was in his fifteenth year, he was apprenticed for five years to his brother Thomas. Unfortunately—so it seemed at the time—he was soon compelled, by a return of the weakness in his knee, to abandon the thrower's bench and to occupy himself with other departments of the potter's art. He thus obtained a wider insight into the many practical requirements of his craft, learning, for instance, the business of a 'modeller,' and fashioning various imitations of onyx and agate by the association of differently coloured clays. Towards the close of his apprenticeship Josiah developed a love for original experimenting, which was not appreciated by his master and eldest brother,
who declined on the expiry of his indentures to take him into partnership. The young and enthusiastic innovator was not fortunate in his next step, when he joined—about 1751—Thomas Alders and John Harrison in a small pot-works at Cliff Bank, near Stoke. He succeeded, indeed, in improving the quality and increasing the out-turn of the humble pottery, but his copartners did not appreciate nor adequately recompense the efforts of one who was so much in advance of them in mental power and artistic perception. A more congenial position was, however, soon offered to him by a worthy master-potter, Thomas Whieldon of Fenton. With this new partner Wedgwood worked for about six years, until the close of 1758. [...] “It was probably during the first half of 1759 that Wedgwood, now in his twenty-ninth year, became a master-potter. His capital was extremely small; but he knew his strength, and ventured to take on lease a small pot-works in Burslem, part of the premises belonging to his cousins John and Thomas Wedgwood. Although the annual rent paid for this Ivy House Works was but 10l., this sum did not represent its market value. The kilns and buildings soon became unequal to the demands made upon them. More accommodation was wanted, not only for an increased number of workmen, but also for carrying out the modern system of division of labour which Wedgwood was introducing, and for improved methods of manipulation. But the master-potter himself was everything and everywhere, and not only superintended all departments, but was the best workman in the place [...]. “Dissatisfied with the clumsiness of the ordinary crockery of his day, he aimed at higher finish, more exact form, less redundancy of material. He endeavoured to modify the crude if naive and picturesque decorative treatment of the common wares by the influence of a cultivated taste and of a wider knowledge of ornamental art. 
Such changes were not effected without some loss of those individual and human elements which gave life to many of the rougher products of English kilns during the seventeenth and eighteenth centuries. But there was much to be said on the other side. Owing to their uniformity in size and substance, dozens of Wedgwood’s plates could be piled up without fear of collapse from unequal pressure. In glaze and body his useful wares were well adapted for their several purposes. And then the forms and contours of the different pieces showed perfect adjustment to their use: lids fitted, spouts poured, handles could be held.” Wedgwood was appointed queen’s potter in 1762 as his business kept expanding. In 1766, he “acquired for £ 3,000 a suitable site between Burslem and Stoke-upon-Trent for a new factory and residence. Later on he added considerably to this domain, and built thereon for his workmen a village, to which he gave the name Etruria, as well as the mansion Etruria Hall and an extensive and well-equipped pot-works. The new Etruria factory was opened on 13 June 1769”, etc., which explains the “Etruria” mentioned by S. Newcomb in note 103. “The out-turn and sale of the products of Wedgwood’s factory greatly increased after the opening of the Etruria works in 1769. The ornamental as well as
the useful ware became better and better known and appreciated, not only in England but on the continent”, etc. Wedgwood became quite successful. “Home and foreign patronage, royal, noble, or distinguished, greatly extended his reputation and his business. The two dinner services finished in 1774 for the Empress Catherine II of Russia consisted of 952 pieces, of cream-coloured ware, the decoration of which, in enamel with English views and with ornamental leaf borders, added a sum of over £ 2,000 to the original cost of the plain services, which was under £ 52”. Etruria is now a suburb of Stoke-on-Trent. The company still functions and, as I write this in the fall of 2021, you can check their website: www.wedgwood.com. At Josiah Wedgwood’s death it was taken over by his eldest son, John, who “joined the business rather reluctantly, mainly interested in horticulture”. In fact, he was one of the seven founders of the Royal Horticultural Society (Jules Janick, “The Founding and Founders of the Royal Horticultural Society”, Chronica Horticulturae, vol. 48, 2008). Incidentally, Josiah is the grandfather of Charles Darwin: his daughter, Susannah, married to one Robert Darwin, was Charles’ mother. And then Charles married cousin Emma Wedgwood. Another of Josiah Wedgwood’s children, Thomas Wedgwood, is credited as one of the inventors of photography: “son of the famous potter Josiah Wedgwood, reported his experiments in recording images on paper or leather sensitized with silver nitrate. He could record silhouettes of objects placed on the paper, but he was not able to make them permanent” (Encyclopedia Britannica). Apparently Thomas was good friends with Samuel Taylor Coleridge, the poet.
105. Online Etymology Dictionary: www.etymonline.com.
106. John Clerk and John Playfair. John Clerk is best known for his influential (meaning it influenced people like Horatio Nelson, etc.) Essay on naval tactics. John Playfair was prof. of natural history at the university of Edinburgh, and today he is mostly remembered for his Illustrations of the Huttonian Theory of the Earth—basically, a pop-sci adaptation of Hutton’s work.
107. Geologists might call schist any rock that shows schistosity: i.e., a “foliated” texture; in other words, any rock that is relatively easily split into thin flakes or plates. (Today we know that schistosity is acquired, by some igneous and some sedimentary rocks, during metamorphism: so, all schists are metamorphic rocks, but they might have been, originally, either igneous or sedimentary.)
108. In Chap. IV, book I of his Principles of Geology.
109. See note 106.
110. “Charles Darwin on the Origin and Diversity of Igneous Rocks”, Earth Sciences History, vol. 15, 1996.
111. See note 101.
112. Technically not a geologist, but a prof. of civil engineering, at M.I.T. and elsewhere, known for his work on how to build railroads; which I guess topography is part of: and that must be how he got into geology, and mountains. This excerpt is at pages 60–61 of his book.
113. Which, by the way, contributed to the failure of early stratigraphic schemes à la Werner: which were based on the idea that the chemical composition of sediments would control the relative time of their deposition, and thus their place in the sequence. 114. Georges Cuvier was born in 1769 into a Huguenot family—the Huguenots were (are) Protestants who had gotten kicked out of France and settled all over Europe in the seventeenth century. He was born in Montbéliard, not far from Germany, Switzerland, the Jura and all that, then independent from France and mostly Protestant—but became French when the revolution annexed that whole region. After studying biology in Stuttgart, having no money, he worked for a while as tutor for the son of a Protestant noble. But Cuvier was a good student, and a good scientist; he started doing his researches that I am going to tell you about shortly, and eventually got in touch with the French intelligentsia. In the 1790s he got his first academic job, at the Jardin des Plantes, which by then was called National Museum of Natural History. 115. Depending who you ask, there’s different ideas as to who first came up with the correlation principle in geology. In his book From Stone to Star (Harvard University Press, 1984), Claude Allègre says, “it seems that the first document establishing the principle of correlation between sedimentary strata is due to Antoine Lavoisier [...], who would become famous as a chemist [a very important chemist, as you shall see in a minute]. In 1789 Lavoisier published a suite of articles where, through some very clear diagrams734 , he shows that each series of geological layers contains a characteristic series of fossils, which one can use to tell different geological layers from one another. 
It seems, though, that his works were totally overlooked.” And but there’s also books that say that the idea of geological correlation is really due to William Smith—the author, in 1815, of the first geological map of England. 116. Via the so-called “Cuvier’s principle of correlation of parts”, Cuvier was able to reconstruct how those animals had been in life, based on relatively few remains—a few bones. The idea is that, e.g., if an animal eats flesh, then it’s got to have teeth made for chewing flesh; but also a digestive tract that is good at digesting flesh, and muscles to catch and kill its prey, etc. On the other hand, a species that’s good at foraging plants but whose digestive tract is tuned to meat won’t survive. So, bottom line, for instance, if all you have are the teeth of an ancient being, and the teeth are the teeth of a flesh-eater, then you can be sure of a whole lot of other little things about that being. And if besides the teeth you have a few other clues, then you might be able to extrapolate what the whole body was like. According to Asimov’s Biographical Encyclopedia, see note 81, Cuvier “came to understand the necessary relationship of one part of a body with another so well that from the existence of some bones he could infer the shape of others and so, little by little, reconstruct the entire animal (a process that even today strikes laymen with amazement and incredulity). [...] Cuvier’s appreciation of how one part of an organism made other qualities necessary is exemplified in a famous story. One of his students dressed up in a devil’s costume and, with
others, invaded Cuvier’s room in the dead of night and woke him with a grisly ‘Cuvier, Cuvier, I have come to eat you.’ Cuvier opened one eye and said, ‘All creatures with horns and hooves are herbivores. You can’t eat me.’ Then he went back to sleep.”
117. Which is in the immediate banlieue of Paris. Once a year they also have, for some reason, a huge fireworks show, which happens right in front of the old factory. It’s not like I am a big fireworks fan, but I did happen to go see it, once, and it was pretty good.
118. There’s this old idea, in modern science, that whenever there are multiple possible explanations for some observation, you should always pick the simplest one; where by “simple” I mean the one that requires fewer extra assumptions. Like, if your car won’t start, and the gauge says the tank is empty, the simplest explanation is that you’re out of gas; it’s also plausible that there still is some gas in the tank, but both the engine and the gauge are broken: but it’s more reasonable to reject the latter explanation, and adhere to the former. This was first articulated, apparently, by William of Occam, or Ockham, a Franciscan friar and theology/philosophy author who lived in England in the fourteenth century. Philosophers call razor a logical rule of thumb that will help you shave off unlikely explanations for observations—this one is known as Occam’s razor.
119. In Great Geological Controversies: see note 94.
120. “Fall in the House of Ussher”, Natural History, 100, November 1991, pages 12–21.
121. The thing is, in our culture, we are systematically taught to think of science in terms of progress. Science is something that goes forward, so that, ultimately, its past is not that important. What’s important is not what was thought in the past, but what we know today. And the idea that the people of the future might find all sorts of faults in our theories doesn’t even cross our minds. It’s as if we were convinced, or anyway, as if science textbook writers of today were convinced, of being the best possible culture, with the best possible view of the world, and nothing to learn from the past or from the others. Which obviously gives them licence to ridicule something they don’t even know735. Gould also cites a longer and more detailed study by James Barr, “Why the World Was Created in 4004 B.C.: Archbishop Ussher and Biblical Chronology,” Bulletin of the John Rylands University Library of Manchester, vol. 67, pp. 575–608.
122. In Chap. 3 there was a long quote from Hutton, ending with his hypothesis that the “power of extreme heat [...] might be capable of producing an expansive force sufficient for elevating the land”, etc. Here’s what Hutton wrote next: “Having thus ascertained a regular system in which the present land of the globe had been first formed at the bottom of the ocean and then raised above the surface of the sea, a question naturally occurs with regard to time; what had been the space of time necessary for accomplishing this great work? [...] “We shall be warranted in drawing the following conclusions; 1st. That it had required an indefinite space of time to have produced the land which now appears; 2dly. That an equal space had been employed upon the construction of that former land from whence the materials of the present came; Lastly, That there is presently laying at the bottom of the ocean the foundation of future land”.
123. “To make the rock of that lower formation and then tilt it up”, says John McPhee in Basin and Range, “and wear it down and deposit sediment on it to form the rock above would require an immense quantity of time, an amount that was expressed in the clean, sharp line that divided the formations—the angular unconformity itself. [...] Alive in a world that thought of itself as six thousand years old, a society which had placed in that number the outer limits of its grasp of time, Hutton had no way of knowing that there were seventy million years just in the line that separated the two kinds of rock, and many millions more in the story of each formation—but he sensed something like it, sensed the awesome truth, and as he stood there staring at the riverbank he was seeing it for all mankind.”
124. This refers to an assumption made by Newton in, I think, his 1701 paper about how to measure very high temperatures, where he suggests the following assumption about how the temperature of a cooling body decreases over time: his words, translated from Latin into English (by U. Besson, in “The History of the Cooling Law: When the Search for Simplicity Can be an Obstacle”, Science and Education, 2010), are: “supposing (ponendo) that the excess of the degrees of heat of the iron above the heat of the atmosphere, found by the thermometer, were in geometrical progression when the times are in an arithmetical progression...” Notice that Newton uses the word “heat” to mean what today we would rather call “temperature”: the two concepts had not been separated yet (and we’ll see later what we mean, today, by each of those two words). In today’s mathematical symbols, Newton’s “law of cooling” reads

d/dt [T(t) − T_amb] = −a [T(t) − T_amb],

or, which is the same (remember how we worked out the equation of the torsion pendulum in Chap. 1),

T(t) − T_amb = A e^(−at),

where T_amb is the ambient temperature, T(t) the temperature of a body immersed in this ambient temperature, and a and A are constants.
125. The results of which did not appear in the first volume of Natural History, about which see above, but in its fifth supplementary volume, published in 1778: if you find that volume, you have to look through it for a section titled Les Époques de la Nature.
126. If you heat iron to a really really high temperature, so that it becomes incandescent, it turns red first, then orange, then white: which is what you call white heat. (When it’s red, it’s red heat.) In Buffon’s time, there was no way to measure the temperature of something so hot with any accuracy (remember note 103), but today we know white heat means more than 1300 °C.
127. The Age of the Earth (Stanford University Press, 1991).
128. See note 94.
129. Incidentally, I’ve read the following in Alan R. Rogers’ lecture notes, which I’ve found online, on the age of the earth: “If you have ever read Watership Down, then you have read about this valley—it lies between two ridges of chalk, called the North Down (where Darwin lived) and the South Down. In the book, the rabbits started in The Weald and traveled south to Watership Down, a part of South Down.” Rogers is professor of anthropology at the university of Utah; the notes are from his course on The Evidence for Evolution, which is also the title of a book he’s published in 2011 with the University of Chicago Press.
130. Lyell sees earthquakes mostly as the expression of those mysterious Huttonian forces, presumably related to heat, that somehow push land up and create mountains.
131. John Phillips was an English geologist, a contemporary of Darwin. His mother happened to be the sister of William Smith (see note 115). Both of Phillips’ parents passed away when he was still a child, and he was adopted by Smith, who brought him along in the field trips he took to make his famous geological maps; so that’s how he got involved with geology. (See, e.g., the article about Phillips in the Strange Science website, www.strangescience.net/phillips.htm.)
132. Life on Earth: Its Origin and Successions, MacMillan and co., 1860.
133. This and the next quotes are from a section, titled “Antiquity of the Earth”, of Phillips’ Life on Earth, which see note 132.
134. This sentence was mysterious to me. Here’s something I found in the 1842 edition of the Encyclopedia Britannica, under physical geography: “The enormous mass of water discharged by the Marañon, according to Sabine, discolours the ocean, and has even considerable rapidity [...] at a distance of 300 miles from the American coast. Fresh water may be obtained from the surface of this current, at a great distance from the shore.” Sabine presumably means Edward Sabine, born in Dublin in 1788, educated at the Royal Military Academy, second lieutenant in the Royal Artillery since 1803, later (1818), after serving in the war with the United States, appointed astronomer on the British Naval Northwest Passage Expedition, and from then on he is more of a scientist than a soldier. “Between 1821 and 1823 Sabine continued to travel extensively, conducting further pendulum experiments at many different latitudes”. (This is from the Dictionary of Canadian Biography, which you can find online—“Sabine was especially active in promoting magnetic observations in Canada.”)
135. Strange adjective, and funny that it should be capitalized, right? Let me explain. I don’t know to what extent I have made it clear for you, but one of the main points of Hutton’s theory of how the earth works is that the geological processes that we see today in (very slow) action—weathering, erosion, sedimentation, igneous intrusions and so on and so forth—are the same that always occurred, and presumably will always occur in an eternal cycle. Lyell picked up this idea and expanded it in his own work, making it sort of a fundamental dogma on which
to base his interpretation of whatever geological observation. This view would later come to be called “uniformitarianism”, or perhaps “Uniformitarianism” with a capital u in reference to the whole school of thought formed by Lyell’s followers. In the words of Claude Allègre, from the third chapter of From Stone to Star: “[the] theory of uniformitarianism attributes infinite duration to geological time: like the movement of planets, of which we don’t know when it started, or when it will end, geological phenomena repeat themselves, always in the same way, from the infinity of past times to the infinity of the future.” In the absence of any vestiges of a beginning, or prospects of an end, infinite repetition seems a reasonable working hypothesis. We’ll meet the so-called uniformitarians again in the next chapter, where I’ll briefly touch upon their conflict with the so-called “catastrophists”.
136. That is Robert Everest, author of A Journey through the United States and Part of Canada (John Chapman, London, 1855); according to the title page of which, he’d been chaplain to the East India Company: which is presumably why he lived in India, where he decided to study the Ganges. The inspiration for such a geological endeavor came, perhaps, from his famous older brother, George Everest, who “distinguished himself during engineering training at military schools in England”, says Britannica, and then “joined the East India Company in 1806”—i.e., still as a teenager: because he was born in 1790—“and served the next seven years in Bengal. During the British occupation of the Dutch East Indies, [George] Everest worked on the survey of Java (1814–16), then returned to India. From 1818 to 1843, except for two leaves to recover his health, he worked on the survey of India, as superintendent from 1823 and as surveyor general from 1830. During his term as surveyor general, Everest introduced the most accurate surveying instruments of the day”, etc. “[George] Everest was elected a Fellow of the Royal Society in 1827 and was knighted in 1861. Mount Everest, the world’s highest peak, which had been called Peak XV, was renamed in his honour in 1865.”
137. Darwin thinks that species evolve from, say, a form A found in fossils that are very old, to a form Z found in more recent, possibly still existing beings. He thinks that between A and Z there must be a number of intermediate steps, some of which are found in the “geological record” as fossils of intermediate age between A and Z, and some of which aren’t. The fact that some are not found is not a fatal blow to Darwin’s theory, precisely because of the “imperfection of the geological record”. Those forms of life of which we find no trace today might very well have existed, if all rock strata that would have carried their remains have been destroyed by weathering, erosion and all that. In this sense, although it might sound like a paradox, Darwin does in fact “rely” upon the imperfection of the geological record.
138. Darwin refers to Lyell’s book, Geological Evidences of the Antiquity of Man, which at the time must have been a work in progress.
139. F. Darwin and A. C. Seward, More Letters of Charles Darwin, 2 vols., Appleton, New York, 1903, vol. 2, p. 129, Nov. 25, 1860.
140. Before I tell you about how physicists like Kelvin tried to sort out the age of the earth, you should know that there were, in the mid eighteen hundreds, other attempts at estimating it, based on all sorts of methods, of which I’ll mention just one736, because I find it interesting, or in any case it’s something that I could never possibly have thought about, and, although it ultimately doesn’t work, it seems quite clever to me: the idea is that the reason ocean water is salty is that sediments, carried by rivers to shores, contain salt—the result of the erosion of sodium-bearing rocks. So if you make the assumption that the rate at which those sediments are removed from mountains and taken to the oceans has stayed the same over the whole history of the earth (the typical uniformitarian approach, which in the absence of better information, people have a tendency to accept), you just have to divide the total amount of sodium in the ocean right now by that, and you get the age of the earth. Or in other words:

earth’s age in years = (volume of sodium in the ocean today) / (volume of sodium added to the ocean every year).
Both quantities on the right-hand side can be measured, although of course I guess there is some important uncertainty on such measurements, but hopefully that’s good enough for an order-of-magnitude estimate: which was done in 1899 by John Joly737 and gave about 100 Myrs: not too far from Kelvin’s estimates that we are about to look at—much less than implied by Darwin’s study of the Weald. There’s a paper about “John Joly (1857–1933) and his determinations of the age of the Earth”, by Patrick Wyse Jackson, published in a special volume by the Geological Society of London, The Age of the Earth: From 4004 B.C. to A.D. 2002, from which we learn that, “brilliant in its simplicity, Joly’s paper of 1899 fired the imagination of both scientific and general audiences, and for perhaps a decade his ‘sodium method’ held sway amongst geochronologists.”738 But then Wyse Jackson also explains that Joly’s idea succumbed to the many issues that were raised by its critics. Essentially, the data on river discharge and ocean sodium content were not very reliable, like I said; the role of the atmosphere, also absorbing and redistributing sodium through rain, was not quantified clearly, etc. And then Arthur Holmes (an important figure in early-twentieth-century geophysics, whom we’ll meet again), writes Wyse Jackson, “reasoned that the rocks would have had to lose more sodium into the oceans than they had ever contained, for Joly’s figures to add up. [...] The theory was finally consigned to the scientific scrap-heap by several geologists [...] who, in the damning words of Arthur Holmes, ‘rejected it as worthless’ ”. The quote is from Holmes’ 1926 paper, “Estimates of Geological Time, with Special Reference to Thorium Minerals and Uranium Haloes”, published in the Philosophical Magazine.
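If you want to see how the division plays out, here is a minimal sketch of Joly’s arithmetic. The two input figures below are my own round, illustrative assumptions, of roughly the right order of magnitude (they are not Joly’s actual data), but they land in the same ~100 Myr ballpark:

```python
# A minimal sketch of Joly's 1899 "sodium method".
# Both inputs are assumed, round, order-of-magnitude figures,
# NOT Joly's actual measurements.

sodium_in_ocean_kg = 1.4e19         # assumed total sodium dissolved in the oceans
sodium_influx_kg_per_year = 1.6e11  # assumed annual sodium delivery by rivers

# Uniformitarian assumption: the river influx has stayed constant over
# the whole history of the earth, and no sodium ever leaves the ocean.
age_years = sodium_in_ocean_kg / sodium_influx_kg_per_year

print(f"estimated age: {age_years:.1e} years")  # on the order of 1e8, i.e. ~100 Myr
```

Of course, as Wyse Jackson recounts, both inputs were poorly known at the time, and the “no sodium ever leaves the ocean” assumption is exactly where the method breaks down.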
anyway, check out Wyse Jackson’s paper first if you are interested in this—it comes with a long bibliography739 and it will guide you through the whole debate on the subject. 141. Like Mayer, actually, Helmholtz was a medical doctor by background, although he had always wanted to be a physicist—after he entered the university system as professor of physiology, he managed to evolve into a physics professor later in his career. It’s interesting, though, that some of his main contributions had to do with the eye—optics—and the ear—acoustics. His book On the Sensations of Tone as Physiological Basis for the Theory of Music, which has a lot of physics, too, was quite influential for a very long time. 142. The (digital) copy I’ve found, in English translation, was published by one J. Fitzgerald, New York, in I think 1932 (it’s the scan of a library copy, and the library stamp says April 27, 1932), as n. 24 in a series called Humboldt Library of Popular Science Literature. It was also published by Dover, in 1962. And I guess there are still more editions of it. 143. In principle, there’s another “force” that could keep the earth’s surface warm: and that’s the heat that comes from the interior of the earth: because temperature rises as one goes deeper into the earth (Chap. 2) and that means that the interior of the earth is hotter than the surface and so must be heating the atmosphere, etc. But it turns out that the contribution of heat from inside the earth is negligible compared to that of the sun; this is something that Joseph Fourier figured out in his studies on heat, and published in 1824 (“Mémoire sur les Températures du Globe Terrestre et des Espaces Planétaires”, Annales de Chimie et de Physique; and again in 1827 with minor corrections, in the Mémoires de l’Académie Royale des Sciences de l’Institut de France).
Fourier’s argument is as follows: (i) if you take data from all over the globe, you’ll see that no matter where you are, the rate of temperature increase with depth below 30 m or so is about the same everywhere: so heating from inside is globally uniform; (ii) if you measure surface temperature all over the globe, that changes a lot with season, time of day, etc.; (iii) but if you measure temperature at a depth of, again, about 30 m, you see that that does not change over time—it’s the same whether it’s day or night, summer or winter, etc.; (iv) so, it is constant over time; but it does change with location: “temperature in such places [30 m or so below the surface, like I said] doesn’t change appreciably over the course of the year, it is constant; but it is very different in different climates: it is the result of the perpetual action of sun rays and of the unequal exposure [to sunlight] of the different parts of the [earth’s] surface, from the equator to the poles”. Bottom line, if the temperature near the surface were controlled by heat “flowing” from within, it should be about constant no matter where you are on the globe, because so is the heat that flows from within; but it is not! Instead, it changes depending, essentially, on latitude, i.e. how much an area of the globe is exposed to sunlight, on average over time. That means that heating from the sun is much more important than heating from within, QED. By the way, Fourier’s reasoning at point (i) implicitly assumes that you accept that the rate of change of temperature versus increasing depth is proportional to heat flowing from
within. That’s maybe intuitive, but not trivial; we’ll get back to that, and to Fourier’s work, later in this chapter.
144. You have to be patient with this. Today, if you’re a student, you’d get a bad mark for mistaking “force” for “energy”; but at the time, people hadn’t really decided what to call things.
145. Whom we shall meet again, a couple chapters down the way, when we discuss the propagation of waves.
146. As Laplace puts it in his Celestial Mechanics, after “Newton published his discovery of universal gravitation, mathematicians have [...] succeeded in reducing to this great law of nature all the known phenomena of the system of the world,” which is a bit of an overstatement, but the point is that, after Newton summarized, with a few simple laws, centuries of complex observations about the motions of celestial bodies and lots of other stuff, it was mostly a mathematician’s job to derive, from Newton’s equations, new formulae that would usefully describe a variety of more complex phenomena. The book of Laplace is an effort, in the author’s own words, “to present a connected view of these theories, which are now scattered in a great number of works.”
147. Published in 1829, in Boston, by Hilliard, Gray, Little and Wilkins.
148. Yes, that’s the Gustave Coriolis of Coriolis force fame.
149. In his words, translated from the French: “We shall propose the denomination dynamic work, or simply work, for the quantity ∫P ds [in my notation, the integral of F_{i,k} du_{i,k} from u_{i,k}(0) to u_{i,k}(t)] defined as we have just said. This name will not be confused with any other mechanical denomination [...]. As for the denomination force vive, given until now [...] to the product of the mass [...] by the square of the velocity, we shall keep it [...]; only, we shall apply this denomination to half of that product.”
150. To understand why it makes sense to call “work” (or “labour”, “travail” or whatever) the product of a force times a displacement, check out this quote of Helmholtz, from On the Conservation of Force.
Introduction to a Series of Lectures Delivered at Carlsruhe in the Winter of 1862–1863 (translated to English by Edmund Atkinson and published in Scientific Papers, The Harvard Classics, vol. 30, Collier and Son, New York, 1910): “in principle, any mechanical machine can be run by a weight that falls. Consider for example a clock driven by a descending weight. “We get [...] from this example, a measure for the amount of work. Let us assume that a clock is driven by a weight of a pound, which falls five feet in twenty-four hours. If we fix ten such clocks, each with a weight of one pound, then ten clocks will be driven twenty-four hours; hence, as each has to overcome the same resistances in the same time as the others, ten times as much work is performed for ten pounds fall through five feet. Hence, we conclude that the height of the fall being the same, the work increases directly as the weight. “Now, if we increase the length of the string so that the weight runs down ten feet, the clock will go two days instead of one. [...] The weight being the same, the work increases as the height of fall. Hence, we may take the product of
the weight into the height of fall as a measure of work [...]. We may apply this measure of work to all kinds of machines, for we should be able to set them all in motion by means of a weight sufficient to turn a pulley. We could thus always express the magnitude of any driving force, for any given machine, by the magnitude and height of fall of such a weight as would be necessary to keep the machine going with its arrangements until it had performed a certain work. Hence it is that the measurement of work by foot pounds is universally applicable.”
151. In Coriolis’ times, motion just meant the translation or rotation of objects that were visible to a naked eye. Those of you who have already studied some physics might think of heat also as the motion of many microscopic particles: i.e., heat is just another form of kinetic energy. Yes, but this was not trivial to demonstrate, back then: we are getting there shortly.
152. “Robert Boyle was born into the aristocracy (as the fourteenth child and seventh son of the earl of Cork) and was an infant prodigy”, says Asimov (Biographical Encyclopedia). “He went to Eton at eight, at which time he was already speaking Greek and Latin, traveled through Europe (with a tutor) at eleven, and at fourteen was in Italy studying the works of Galileo [...]. While in Geneva he was frightened by an intense thunderstorm into a devoutness that persisted for the rest of his life. He never married”. Boyle’s most important contribution is the idea of an element as a substance that can’t be broken down into two or more different substances; that if you combine multiple elements to form a compound, then there’s always some way to separate the compound and get the individual elements back; and that, in general, elements can, and should, be identified by experiment. This is all in his book, The Sceptical Chemist, which was published in 1661.
153. A professor of chemistry at the university of Jena, contemporary of Newton.
154. Mainstream Greek thought classified all matter according to four and only four possible substances, or elements, i.e. earth, water, wind and fire. In today’s language, we might say that the Greeks classified matter according to its “phases”, i.e. solid (earth), liquid (water), and gas (wind); plus fire, which instead doesn’t correspond to any “phase”, but can somehow be associated with what today we would call heat, or perhaps, more generally, energy. Heraclitus more than anyone else at the time saw fire as something special compared to the other three ‘elements’; according to Martin Barnett (“The Development of the Concept of Heat-I: From the Fire Principle of Heraclitus through the Caloric Theory of Joseph Black”, The Scientific Monthly, vol. 62, 1946), Heraclitus was “the first European to give the concept of ‘fire’ a prominent position in a system of natural philosophy [...]. As Windelband (A History of Philosophy, English edition by Tufts, Macmillan, New York, 1926) notes, when Heraclitus declared fire ‘to be the essence of all things, he understood by this not a material or substance which survived all its transformations, but just the transforming process itself in its ever-darting, vibrating activity, the soaring up and vanishing which corresponds to the becoming and passing away.’ ” Yes, so we might even say that Heraclitus saw heat/fire as a process rather than a substance. “Hence”,
Notes
531
continues Barnett, “it would be a superficial view, indeed, which regarded the Fire Theory of Heraclitus as the root of the phlogiston theory or the caloric hypothesis which flourished over a thousand years later. Rather, in his conviction that all things are in a continual flux, that permanence is illusion due to the strife of opposites, he is closer to the kinetic view of heat.” The phlogiston and caloric hypotheses, as we shall see in a minute, are two theories of heat that regard heat as a substance; i.e., they postulate that, if a body grows hotter, it is because it’s absorbing some sort of fluid material called heat, etc. So, actually, they more closely resemble Plato’s and Aristotle’s categorization of matter, where fire is not “the essence of all things”, but rather just another element, like earth and water and wind. The kinetic theory of heat, instead, is a theory of heat according to which heat is the motion, the oscillation of all the particles, atoms and molecules, that make up matter. The faster they oscillate, the hotter the body. And this is mainstream physics now. (And we’ll get back to it in a few chapters.) The Arabs and/or the alchemists integrated the four elements with several “properties”, or “principles” of matter. According to the Greeks, the properties of an element would stem from the intrinsic nature of the atoms that made up a certain element, “thus the peculiar fluid nature of liquids”, explains Barnett, “was attributed to the smoothness and roundness of their atoms”, etc. But for instance “Geber, an Arabian alchemist of the ninth century,” introduced two properties, that he called “mercury” and “sulfur”: “the first to account for the luster, volatility, fusibility, and malleability of metals, the second to account for their color, combustibility, affinity, and hardness”, etc. 
You can tell from their very names though—which are the names of two of today’s elements—that there was a certain ambiguity as to what a “property”/“principle” actually was: e.g., is it something that can be isolated? does it have weight?, etc. Phlogiston was the principle that controlled combustion, the “combustible principle”. 155. We’ve met Cavendish already in Chap. 1—he “weighed the earth”—, but I don’t think I told you about his, uhm, weirdness. You can read about it in many places, but Asimov neatly sums it up: “Cavendish [...] spent four years at Cambridge, but he never took his degree, partly because he would not participate in the obligatory religious exercises. He also seems to have thought he could not face the professors during the necessary examinations. In all his life, he had difficulty facing people. “[Cavendish] might, in an emergency, exchange a few words with one man,” Asimov explains, “but never with more than one man, and never with a woman. He feared women to the point where he could not bear to look at one740 . He communicated with his female servants by notes (to order dinner, for instance) and any of these female servants who accidentally crossed his path in his house was fired on the spot. He built a separate entrance to his house so he could come and leave alone,” etc. 156. British Unitarian (i.e., there’s no such thing as Trinity, but only one individual God) minister, enlightenment philosopher, radical politician—against slave trade, in favour of the French Revolution and American independence, and a
Fig. N.8 Monsieur and Madame Lavoisier, as painted in 1788 by Jacques-Louis David—now at the Metropolitan Museum of Art, New York
personal friend of Benjamin Franklin. (Which all these things didn’t make him very popular in Britain, apparently, and he eventually emigrated to the U.S., where he died in 1804.) His experiments with gases are probably the main reason he is still remembered. “When Priestley dissolved carbon dioxide in water he tasted the solution and found that he had created a pleasantly tart and refreshing drink, the one we call seltzer or soda water today. The Royal Society awarded him the Copley medal741 for this. Since it required only flavoring and sugar to produce soda pop, Priestley may be viewed as the father of the modern soft-drink industry” (Asimov). 157. A Frenchman living in the eighteenth century, Lavoisier’s day job was in the administration of the Ferme Générale: a private company to which the king outsourced the collection of indirect taxes—e.g. on tobacco, salt, customs and stuff like that. This was towards the end of the Ancien Régime, that is, shortly before the Revolution. Lavoisier made lots of money through the Ferme (which he used to finance his scientific work), and even married the daughter of another “tax farmer”, as they were called—Jacques Paulze. Madame Lavoisier (Marie-Anne Pierrette Paulze Lavoisier) turned out to be a good scientist herself, working with her husband in his lab—translating into French the works of Lavoisier’s English colleagues (Lavoisier did not speak English, she did); drawing illustrations of his experiments, etc. (See Fig. N.8, which is actually a painting by David742 ).
Being a tax farmer was OK until the French Revolution happened, starting in 1789. Then, tax farmers became enemies of the people743 . Lavoisier’s position was particularly delicate—a few years back, Jean-Paul Marat744 , a leader of the revolution, had applied for membership of the French Academy of Sciences, and had been turned down: and Lavoisier was one of the most prominent members of the Academy. Bottom line, Lavoisier was arrested in November, 1793, and tried on May 8, 1794. He was guillotined the same day and buried in a mass grave745 . 158. See note 154. 159. Mostly a manufacturer of thermometers, known, yes, for his temperature scale, Fahrenheit was German, with Hanseatic nationality—the Hanseatic league being a long-extinct confederation of merchant towns in northern Europe, from the Netherlands to current Lithuania. He was born in Gdańsk in 1686 and died in Amsterdam, 1736. 160. A Dutch contemporary—roughly—of Newton, well known at the time for his textbook of chemistry. Which I haven’t read, but apparently it was a tremendous compilation of experimental facts in chemistry: “it is nothing less than a complete collection of all the chemical facts and processes which were known in Boerhaave’s time, collected from a thousand different sources, and from writings equally disgusting from their obscurity and their mysticism. Every thing is stated in the plainest way, stripped of all mystery, and chemistry is shown as a science and an art of the first importance,” etc., says Thomas Thomson in his History of Chemistry, London 1830, quoted by Tenney L. Davis in The vicissitudes of Boerhaave’s textbook of chemistry, ISIS, 1928. Anyway, the funny thing about Boerhaave’s book is that it “would probably never have been written if the ‘disgustful task’ (labor ingratissimus) had not been forced upon its author”, says Tenney Davis. 
Because “his students had collated their lecture notes and had edited a textbook which was published under his name as if it had been his own production. He disowned it entirely, and pointed out that it contained errors on nearly every page—which was gross exaggeration. But the book was very successful; it was not long before such copies of it as could be had were sold at an advanced price, and a second edition—also without the author’s consent—seemed imminent. There can be no doubt that the book was a substantially accurate account of the lectures. Boerhaave blamed himself for its faults, and the universal approbation which it received put him in mind of Petrarch who lamented the decadence of his age which reckoned him among the greatest of its poets. He therefore wrote and published the book as he wished it to be, to correct the faults of the spurious edition, and because he was unwilling to accept the praise which accrued to the inferior and unconsidered publication.” Hence, the “vicissitudes”. The spurious edition, Institutiones et Experimenta Chemiae, “appeared in 1724,” says Davis, “and contained nothing to show where or by whom it was printed. Although the title page indicates Paris as the place of publication, Burton746 is positive that the book was printed at Leiden. Another edition [...] appears to have been published at Venice in 1726.” An English version, A
New Method of Chemistry, Including the Theory and Practice of that Art: Laid Down on Mechanical Principles and Accommodated to the Uses of Life came out in 1727, printed in London by Osborn and Longman. Boerhaave’s own version, Elementa Chemiae, finally came out at Leiden in 1732. Davis says that “the work was enormously successful and appeared in many editions in Latin, French, German, and English”; he also says that “indeed it is difficult to say which version is the better textbook”: whether the spurious or the genuine one. 161. Which is one way to answer the question, are heat and temperature the same thing? They aren’t, because the same quantity of heat, administered to the same quantity of different substances, results in different increments of temperature. Which Black probably never formulated in these terms, though. 162. One relatively simple technique to measure heat capacity is via the so-called water calorimeter. You need to find some small object made of the substance whose heat capacity you want to measure. Then, you immerse that object in a tank full of water. Define the heat capacity of water to be 1. Let m be the mass of the object, and m′ the total mass of the water in the tank. Call T and T′ the initial temperatures of the small object and of the water, respectively, and T″ the final common temperature. Then, the total heat received by the water must equal m′(T″ − T′), while the total heat lost by the object is −m c_m (T″ − T), where c_m is the specific heat we are trying to measure. If you equate these two terms to one another,
\[
m\, c_m (T - T'') = m' (T'' - T'),
\]
and you can calculate c_m. 163. See note 150. 164. According to Wikipedia, “Henry Smith Williams (1863–1943) was a medical doctor, lawyer, and author of a number of books on medicine, history, and science”, some co-authored with his brother, Edward Huntington Williams. 
It seems that both the Williams were early researchers of alcohol and drug addiction, and fought, as early as the, like, 1920s, to make a distinction between “soft” and “hard” drugs and “for the kinder treatment of addicts[,] which eventually lead to [H. Williams’] writing of the book, Drug Addicts are Human Beings”. It seems that these ideas didn’t fare very well with the American establishment of the time, as Edward Williams ended up being arrested and imprisoned in 1931; “in his book, Chasing the Scream, Johann Hari describes how the 1931 arrest and subsequent imprisonment of Williams’ brother, Edward, was orchestrated by Harry J. Anslinger, head of the Federal Bureau of Narcotics [...]. In his 1938 book, Williams predicted with a high degree of accuracy that, fifty years later, drug-smuggling would grow to become a five-billion-dollar industry”, etc. 165. Mayer graduated in medicine at Tübingen, 1838. “While surgeon to a Dutch India vessel cruising in the tropics,” says Williams, Mayer “observed that the venous blood of a patient seemed redder than venous blood usually is observed to be in temperate climates. He pondered over this seemingly insignificant fact,
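Going back to the water calorimeter of note 162: the heat-balance arithmetic there is simple enough to sketch in a few lines of Python. (A minimal sketch: the function name and all the numbers are made up for illustration, and water’s specific heat is defined as 1, as in the note.)

```python
# Water-calorimeter bookkeeping from note 162 (a sketch; the numbers
# are hypothetical, and water's heat capacity is defined to be 1).
def specific_heat(m_obj, T_obj, m_water, T_water, T_final):
    """Solve m*c_m*(T_obj - T_final) = m_water*(T_final - T_water) for c_m."""
    return m_water * (T_final - T_water) / (m_obj * (T_obj - T_final))

# A hypothetical 0.5-kg sample at 100 degrees, dropped into 2 kg of
# water at 20 degrees, with everything settling at 24 degrees:
c_m = specific_heat(0.5, 100.0, 2.0, 20.0, 24.0)
print(round(c_m, 3))  # heat capacity relative to water
```

Swap in a measured final temperature and you get c_m directly; that is really all there is to the method.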
and at last reached the conclusion that the cause must be the lesser amount of oxidation required to keep up the body temperature in the tropics. Led by this reflection to consider the body as a machine dependent on outside forces for its capacity to act, he passed on into a novel realm of thought, which brought him at last to independent discovery of the mechanical theory of heat, and to the first full and comprehensive appreciation of the great law of conservation.” 166. “Mayer’s failure to be appreciated and the fact that he was on the losing side in controversies as to priority affected him strongly”, writes Asimov. “The year 1848 saw additional disasters, with the death of two of his children and his brother’s involvement with revolutionary activities. Mayer tried to commit suicide in 1849 by jumping from a third-story window but failed in that too, merely injuring his legs severely, and laming himself permanently. In 1851 he was taken to a mental institution where primitive and cruel methods for treating the sick prevailed. He was eventually released but he never fully recovered. He lived in such obscurity that when Liebig lectured on Mayer’s views in 1858 he referred to the man as being dead.” In the years that followed, though, Mayer’s work got some of the attention it deserved, when people like, e.g., Helmholtz recognized it publicly. Eventually, Mayer was granted the title of “von”, and received the Copley medal (see note 741) in 1871. Better than nothing, I guess. 167. “On the Existence of an Equivalent Relation between Heat and the Ordinary Forms of Mechanical Power”, The London, Edinburgh and Dublin Philosophical Magazine and Journal of Science, vol. 27, 1845. 168. The way we know it is, essentially, you place a glass prism in front of the objective lens of a telescope. 
The speed at which light propagates through glass depends on the frequency of light, so if you’ve got a beam of white light (which, as you might already know, if light is white that means, really, that all possible wave lengths, or frequencies of visible light are present in that beam at the same time), then each wave length is refracted by the prism at a different angle (if you don’t know what I am talking about, wait until Chap. 7, Fresnel and all that), and so the prism becomes a tool to decompose light: which is what you see, e.g., on the cover of Pink Floyd’s The Dark Side of the Moon: the “spectrum” of visible light. Now, the thing is, when you heat some material to white heat (see note 126), it turns out that the white light that it emits is not really perfectly white, but some frequencies are missing—and we see dark bands in the spectrum, where those colours should be. Where the dark bands are, i.e. which frequencies are missing, depends on the material: which means that by looking at a white-light spectrum, you can tell which substance it was that emitted that light. And if, like I was saying, you place your prism in front of a telescope, you can tell what substances stars are made of. Provided, of course, that someone has brought that substance to white heat before, in a lab, and looked at its spectrum. But, in general, that’s always the case: i.e., the universe is all pretty much made of the same stuff.
169. Which consisted of a silver cylinder, one of whose circular surfaces, “soigneusement noirci au noir de fumée” (carefully blackened with lamp black), would be pointed towards the sun. The pyrheliometer was filled with water, so when the blackened surface got heated by sunlight, the temperature of water would change, and one could measure that change, etc.747 170. This gets quite messy, but it’s mostly geometry, I guess. One thing that is not clear to me, though, is where Pouillet gets the estimates he uses for the thickness of the atmosphere, which he takes to coincide with the radius of the earth, divided by 80. Pouillet doesn’t say, I don’t think. I’ve found something in Charles Hutton, yes, the Hutton of Mt. Schiehallion, in A Philosophical and Mathematical Dictionary, 1815: “Various attempts have been made to ascertain the height to which the atmosphere extends above the earth. These commenced soon after it was discovered, by means of the Torricellian tube, that air is endued with weight and pressure”, etc., so one way is to measure pressure, assume an average value for density of air, convert to height: but what value for density? But then later in the same place Hutton also says, “it was found by Kepler, and Lahire after him, who computed the height of the atmosphere from the duration of twilight [i.e. the time during which we still receive light from the sun, through refraction by atmosphere, though the sun is below the horizon], and from the magnitude of the terrestrial shadow in lunar eclipses, that the effect of the atmosphere to reflect and intercept the light of the sun, is only sensible to the altitude of between 40 and 50 miles”, and so on, which involves some quite complex geometry, I guess, but 50 miles multiplied by 80 is not so far from the radius of the earth, so who knows, maybe these are the values that Pouillet had in mind? 171. Which, actually, is very close to the current estimate of 1367 J/(s·m²). 172. 
Mayer apparently wrote a paper on this in 1841, couldn’t find a journal that would publish it, and eventually published it at his own expense in 1848. He just wasn’t lucky at all when it came to getting his work to be accepted. See also note 166. 173. You’re presumably confused as to why it should be called escape velocity, if it’s the velocity of the meteor as it falls into the star... So, the question is: how fast does a rocket need to be at take-off, so that it escapes the earth’s gravity field and never comes back? (Think of a projectile with no engine of its own; all that gets it going, and keeps it going—for a while—is inertia, and how far it goes depends entirely on its initial velocity.) The rocket never really “escapes” the earth’s gravity field, of course, because earth’s attraction decays like one over distance squared, but is never exactly 0; but there are other planets and stars in the universe: and the rocket might get to be so far from earth that the earth’s pull on it becomes smaller than those of other celestial bodies that it has approached in the meantime: in that case, it’s never going to come back. A reasonable condition for that to happen is that, at some time t, we’ve got K(t) ≫ |U(t)|, which is the same as saying K(t) ≈ E.
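If you want to put numbers on K, U and E: here is a minimal sketch, anticipating the formula v_e = √(2GM/R) that this note arrives at (rounded textbook values for the earth’s G, M and R are assumed).

```python
# Escape velocity from the energy argument of note 173 (a sketch;
# rounded textbook values for the earth are assumed).
G = 6.674e-11   # gravitational constant, m^3 / (kg s^2)
M = 5.972e24    # mass of the earth, kg
R = 6.371e6     # radius of the earth, m

v_esc = (2 * G * M / R) ** 0.5      # v_e = sqrt(2GM/R), roughly 11.2 km/s
# Per unit mass, E = K(0) + U(0) vanishes exactly at the escape velocity:
E = 0.5 * v_esc**2 - G * M / R
print(v_esc, E)
```

Start any slower than v_esc and E goes negative, which is precisely the "falls back" case discussed in the note.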
Whether this can be the case depends on whether E is positive or negative. If E < 0, for K(t) to be ≈ E we’d need K to be negative, which is impossible by definition. So there isn’t such a distance where this could be the case, and the rocket falls back. Notice that E < 0 is the same as K(0) < −U(0); or K(0) < GMm/R, where R in this case is the radius of the earth... so that means that, in this case, the initial velocity is so small that K(0) is smaller than GMm/R. If E ≥ 0, on the other hand, the rocket will get to a point where U(t) is about zero but there’s still some kinetic energy left in it, K(t) ≈ E. If E is known, the equality E = U(0) + K(0) allows us to calculate the initial velocity v(0) from E, via K(0) = E − U(0): which then you get v(0) from K(0). Now, the smallest possible value of E for which this can be done is E = 0; then K(0) = −U(0), or
\[
\frac{1}{2} m v^2(0) = \frac{GMm}{R},
\]
and the initial velocity v(0) at the left-hand side is, by definition, the escape velocity, which with a bit of algebra
\[
v_e = \sqrt{\frac{2GM}{R}};
\]
and this is the same formula as Eq. (4.19), i.e., escape velocity and free-fall velocity coincide, QED. 174. Which I am going to give here, but frankly, it is quite heavy, so feel free to skip this footnote and go ahead. If this is the first time you’re reading this book, you probably really should skip the rest of this footnote for the time being. If you are still here, take Newton’s law of gravitation applied to a pair of bodies, say the sun, of mass M, and a planet of mass m; written in vector form, it says that the force imparted by M upon m is
\[
\mathbf{F} = \frac{GMm}{|\mathbf{r}_M - \mathbf{r}_m|^3} (\mathbf{r}_M - \mathbf{r}_m),
\]
where \(\mathbf{r}_M\) and \(\mathbf{r}_m\) are the positions of the sun and the planet, and it doesn’t matter where you place the origin of the reference frame. If you take it to be at the center of the sun, then \(\mathbf{r}_M = \mathbf{0}\), and
\[
\ddot{\mathbf{r}}_m = -\frac{GM}{r_m^3}\, \mathbf{r}_m, \tag{N.34}
\]
where, by virtue of Newton’s second, I replaced F with \(m \ddot{\mathbf{r}}_m\), and then I divided both left- and right-hand sides by m. Each of the two dots above r at the left-hand side means one differentiation of r with respect to time, so \(\ddot{\mathbf{r}}_m\) is the acceleration of m with respect to M. I decided, by the way, that I can write the
magnitude of \(\mathbf{r}_m\), or of whatever vector, in two ways: \(|\mathbf{r}_m|\) but also simply \(r_m\). In the following I am going to drop the subscript m, because we don’t really need it. One consequence of (N.34) is that
\[
\frac{d}{dt}(\mathbf{r} \times \dot{\mathbf{r}}) = \mathbf{0}
\]
(where the single dot above r means one differentiation with respect to time), i.e. the cross product \(\mathbf{r} \times \dot{\mathbf{r}}\), which in the following we are going to call \(\mathbf{h}\), is constant in time748 . To see why that is the case, take the time-derivative749 of \(\mathbf{h} = \mathbf{r} \times \dot{\mathbf{r}}\) and sub (N.34) into the expression you get:
\[
\frac{d\mathbf{h}}{dt} = \dot{\mathbf{r}} \times \dot{\mathbf{r}} + \mathbf{r} \times \ddot{\mathbf{r}}
= \mathbf{0} - \frac{GM}{r^3}\, \mathbf{r} \times \mathbf{r}
= \mathbf{0},
\]
because the cross product of a vector with itself is zero. Saying that \(\mathbf{r} \times \dot{\mathbf{r}}\) is constant, by the way, is like saying that the plane where \(\mathbf{r}\) and \(\dot{\mathbf{r}}\) both lie is constant: because remember that the cross product of two vectors is perpendicular to the plane that’s defined by those two vectors; and if the cross product doesn’t change, that plane can’t change, either. Now, \(\mathbf{r}\) tells you where the planet of mass m is at a given moment, and \(\dot{\mathbf{r}}\) is the vector difference between the place where it is right now and the next place where it’s going to be after a short time (divided by that short time). We might write
\[
\mathbf{r}(t + \delta t) \approx \mathbf{r}(t) + \delta t\, \dot{\mathbf{r}}(t),
\]
where the approximate equality becomes exact if \(\delta t\) tends to zero. That means that the next position occupied by planet m will be on the plane defined by \(\mathbf{r}\) and \(\dot{\mathbf{r}}\). And we’ve just found that that plane cannot ever possibly change. It follows that planet m won’t ever leave that plane; i.e., the plane defined by the orbit of m around M is constant. It is useful to play around with another cross product, that of \(\ddot{\mathbf{r}}\) with \(\mathbf{h}\):
\[
\begin{aligned}
\ddot{\mathbf{r}} \times \mathbf{h} &= -\frac{GM}{r^3}\, \mathbf{r} \times \mathbf{h} \\
&= -\frac{GM}{r^3} \left[ \mathbf{r} \times (\mathbf{r} \times \dot{\mathbf{r}}) \right] \\
&= -\frac{GM}{r^3} \left[ (\mathbf{r} \cdot \dot{\mathbf{r}})\,\mathbf{r} - (\mathbf{r} \cdot \mathbf{r})\,\dot{\mathbf{r}} \right] \\
&= -\frac{GM}{r^3} \left( r \dot{r}\, \mathbf{r} - r^2\, \dot{\mathbf{r}} \right) \\
&= -GM \left( \frac{\dot{r}\, \mathbf{r}}{r^2} - \frac{\dot{\mathbf{r}}}{r} \right) \\
&= GM\, \frac{d}{dt} \frac{\mathbf{r}}{r},
\end{aligned} \tag{N.35}
\]
where the second step happens because of a property of the cross product, i.e. that for whatever trio of vectors, call them \(\mathbf{u}\), \(\mathbf{v}\) and \(\mathbf{w}\), we have that \(\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) = (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{u} \cdot \mathbf{v})\mathbf{w}\). The third step happens because: (i) if you take the derivative of \(\mathbf{r} \cdot \mathbf{r}\),
\[
\frac{d}{dt}(\mathbf{r} \cdot \mathbf{r}) = \frac{d\mathbf{r}}{dt} \cdot \mathbf{r} + \mathbf{r} \cdot \frac{d\mathbf{r}}{dt} = 2\, \mathbf{r} \cdot \dot{\mathbf{r}};
\]
and but (ii), if you take the derivative of \(r^2\),
\[
\frac{d}{dt} r^2 = 2\, r \dot{r};
\]
but (iii) \(\mathbf{r} \cdot \mathbf{r} = r^2\), and so \(\mathbf{r} \cdot \dot{\mathbf{r}} = r \dot{r}\). Now if we take the time-integral of both sides of (N.35),
\[
\int_{t_0}^{t} dt' \left[ \ddot{\mathbf{r}}(t') \times \mathbf{h} \right] = GM \int_{t_0}^{t} dt'\, \frac{d}{dt'} \frac{\mathbf{r}(t')}{r(t')};
\]
which remember that \(\mathbf{h}\) is constant: but then, it follows that
\[
\int_{t_0}^{t} dt'\, \ddot{\mathbf{r}}(t') \times \mathbf{h} = GM \int_{t_0}^{t} dt'\, \frac{d}{dt'} \frac{\mathbf{r}(t')}{r(t')} = GM \left[ \frac{\mathbf{r}(t)}{r(t)} - \frac{\mathbf{r}(t_0)}{r(t_0)} \right],
\]
which if you work out the integral at the left-hand side becomes
\[
\dot{\mathbf{r}}(t) \times \mathbf{h} - \dot{\mathbf{r}}(t_0) \times \mathbf{h} = GM \left[ \frac{\mathbf{r}(t)}{r(t)} - \frac{\mathbf{r}(t_0)}{r(t_0)} \right],
\]
or
\[
\dot{\mathbf{r}}(t) \times \mathbf{h} = GM \left[ \frac{\mathbf{r}(t)}{r(t)} + \mathbf{e} \right], \tag{N.36}
\]
where the vector
\[
\mathbf{e} = \frac{1}{GM}\, \dot{\mathbf{r}}(t_0) \times \mathbf{h} - \frac{\mathbf{r}(t_0)}{r(t_0)} \tag{N.37}
\]
is constant. Dot both sides of (N.36) with \(\mathbf{r}\):
\[
\mathbf{r} \cdot (\dot{\mathbf{r}} \times \mathbf{h}) = GM\, \mathbf{r} \cdot \left( \frac{\mathbf{r}}{r} + \mathbf{e} \right) = GM\, (r + r e \cos\theta) = GM\, r\, (1 + e \cos\theta),
\]
where \(\theta\) denotes the angle formed by \(\mathbf{r}\) and \(\mathbf{e}\). At the left-hand side, by the properties of the cross product750 ,
\[
\mathbf{r} \cdot (\dot{\mathbf{r}} \times \mathbf{h}) = \mathbf{h} \cdot (\mathbf{r} \times \dot{\mathbf{r}}) = \mathbf{h} \cdot \mathbf{h} = h^2,
\]
and so
\[
h^2 = GM\, r\, (1 + e \cos\theta),
\]
which can also be turned around, to write
\[
r(\theta) = \frac{h^2}{GM\, (1 + e \cos\theta)}. \tag{N.38}
\]
If you’ve done some (college-level) geometry before, you might see a resemblance between (N.38) and the equation of a “conic section”751 in polar coordinates: which, if the conic is either an ellipse (or a circle, which is a special case of the ellipse) or a hyperbola752 , in its most general form reads
\[
r(\theta) = \frac{a(1 - b^2)}{1 + b \cos\theta},
\]
where a and b are arbitrary parameters. (In the case of the ellipse, for example, a is half the length of the “major axis”, while b coincides with its so-called “eccentricity”: see Fig. N.9.) And but so, indeed, if b = e and
\[
a(1 - b^2) = \frac{h^2}{GM},
\]
Fig. N.9 The ellipse. The major axis has length 2a, while the minor axis has length 2b. An ellipse’s eccentricity is a measure of how flattened/elongated the ellipse is, and it is defined \(e = \sqrt{1 - \frac{b^2}{a^2}}\); you see that a circle (shown as a dashed line, for reference), is an ellipse with a = b and no eccentricity. Eccentricity and ellipticity, by the way, are not the same thing, but let’s not get into that
Fig. N.10 The vectors \(\mathbf{r}\) and \(\dot{\mathbf{r}}\) define the plane of an orbit. The vector \(\mathbf{h} = \mathbf{r} \times \dot{\mathbf{r}}\), by definition, is perpendicular to the plane of the orbit. The vector \(\mathbf{e}\) is perpendicular to \(\mathbf{h}\) and as such must lie in the plane of the orbit; \(\mathbf{e}\) is constant over time, so we might use it to define the angle \(\theta\), which fully describes position along the orbit
or
\[
a = \frac{h^2}{GM(1 - e^2)},
\]
then the ellipse/hyperbola equation and (N.38) are the exact same thing: i.e., we’ve discovered that the mass m must travel along an ellipse or a hyperbola. This will be useful in a minute. Now, let’s take a look at the vector \(\mathbf{e}\), i.e. Eq. (N.37). By the definition of \(\mathbf{h}\), \(\mathbf{r}\) and \(\mathbf{h}\) are perpendicular to one another, so the second term at the right-hand side of (N.37) is perpendicular to \(\mathbf{h}\). But so is the first term, which is the cross product of \(\mathbf{h}\) with another vector. But so then, the entire right-hand side of (N.37) must be perpendicular to \(\mathbf{h}\), i.e. \(\mathbf{e}\) is perpendicular to \(\mathbf{h}\). By its definition, \(\mathbf{h}\) is perpendicular to the plane defined by \(\mathbf{r}\) and \(\dot{\mathbf{r}}\), i.e. the plane of the orbit of m: it follows that what we’ve just shown is that \(\mathbf{e}\) lies on the plane of the orbit (Fig. N.10).
Let us call \(\hat{\mathbf{r}}\) the unit vector that points in the direction of \(\mathbf{r}\), and \(\hat{\boldsymbol{\theta}}\) the unit vector that sits in the plane of the orbit and is perpendicular to \(\hat{\mathbf{r}}\), pointing in the direction of increasing \(\theta\). Then, velocity \(\dot{\mathbf{r}}\) can be written753
\[
\dot{\mathbf{r}} = \dot{r}\, \hat{\mathbf{r}} + r \dot\theta\, \hat{\boldsymbol{\theta}},
\]
and its squared magnitude,
\[
|\dot{\mathbf{r}}|^2 = \dot{r}^2 + r^2 \dot\theta^2.
\]
Plug the expression we just found for \(\dot{\mathbf{r}}\) into the definition of \(\mathbf{h}\),
\[
\mathbf{h} = \mathbf{r} \times \dot{\mathbf{r}} = \mathbf{r} \times (\dot{r}\, \hat{\mathbf{r}} + r \dot\theta\, \hat{\boldsymbol{\theta}}) = r \dot\theta\, \mathbf{r} \times \hat{\boldsymbol{\theta}},
\]
because of course \(\mathbf{r} \times \hat{\mathbf{r}} = \mathbf{0}\): by definition, \(\mathbf{r}\) and \(\hat{\mathbf{r}}\) point in the same direction. But \(\mathbf{r}\) and \(\hat{\boldsymbol{\theta}}\) are perpendicular to one another, and so it follows, from the equation I just wrote, that
\[
h = r^2 \dot\theta.
\]
But we saw a minute ago that \(h^2 = GM\, a(1 - e^2)\): so, then, \(r^4 \dot\theta^2 = GM\, a(1 - e^2)\), or
\[
r^2 \dot\theta^2 = \frac{GM\, a(1 - e^2)}{r^2} = GM\, a(1 - e^2)\, \frac{(1 + e\cos\theta)^2}{[a(1 - e^2)]^2} = GM\, \frac{(1 + e\cos\theta)^2}{a(1 - e^2)}.
\]
Look, now, at the other contribution to \(|\dot{\mathbf{r}}|^2\), i.e., \(\dot{r}^2\), the time-derivative of r:
\[
\begin{aligned}
\dot{r} &= \dot\theta\, \frac{dr}{d\theta} \\
&= \dot\theta\, \frac{d}{d\theta} \left[ \frac{a(1 - e^2)}{1 + e\cos\theta} \right] \\
&= \dot\theta\, \frac{a e (1 - e^2) \sin\theta}{(1 + e\cos\theta)^2} \\
&= \dot\theta\, \frac{e \sin\theta\, r}{1 + e\cos\theta} \\
&= \dot\theta\, e \sin\theta\, \frac{r^2}{a(1 - e^2)} \\
&= \frac{h}{r^2}\, e \sin\theta\, \frac{r^2}{a(1 - e^2)} \\
&= \frac{e \sin\theta}{a(1 - e^2)}\, \sqrt{GM\, a(1 - e^2)} \\
&= \frac{\sqrt{GM}\, e \sin\theta}{\sqrt{a(1 - e^2)}}.
\end{aligned}
\]
But so then
\[
\dot{r}^2 = \frac{GM\, e^2 \sin^2\theta}{a(1 - e^2)},
\]
and
\[
\begin{aligned}
|\dot{\mathbf{r}}|^2 = \dot{r}^2 + r^2 \dot\theta^2 &= \frac{GM\, e^2 \sin^2\theta}{a(1 - e^2)} + GM\, \frac{(1 + e\cos\theta)^2}{a(1 - e^2)} \\
&= \frac{GM}{a(1 - e^2)} \left( 1 + e^2\cos^2\theta + 2e\cos\theta + e^2\sin^2\theta \right) \\
&= \frac{GM}{a(1 - e^2)} \left( 1 + e^2 + 2e\cos\theta \right) \\
&= \frac{2\, GM}{a(1 - e^2)} (1 + e\cos\theta) + \frac{GM}{a(1 - e^2)} (e^2 - 1) \\
&= GM \left( \frac{2}{r} - \frac{1}{a} \right) \\
&= GM\, \frac{2a - r}{a r}.
\end{aligned}
\]
If you take the square root of this, to get \(|\dot{\mathbf{r}}|\) from \(|\dot{\mathbf{r}}|^2\), you end up with Eq. (4.21), which, if you go back to the main body of text, now will be shown to be exactly equivalent to Mayer’s formula754 . 175. Helmholtz cites Kelvin as the first originator of this objection.
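For what it’s worth, the two key results of note 174—that \(\mathbf{h} = \mathbf{r} \times \dot{\mathbf{r}}\) is conserved, and that \(|\dot{\mathbf{r}}|^2 = GM(2/r - 1/a)\)—can be checked numerically. A minimal sketch, with GM = 1, made-up initial conditions, and a leapfrog integrator (my choice for illustration, not the book’s):

```python
# Numerical sanity check of note 174 (a sketch, GM = 1, hypothetical
# initial conditions): integrate r'' = -GM r / |r|^3 with a simple
# kick-drift-kick leapfrog scheme; |r x r'| should stay constant, and
# the speed should obey |r'|^2 = GM (2/r - 1/a) all along the orbit.
GM = 1.0
x, y = 1.0, 0.0          # initial position
vx, vy = 0.0, 1.2        # initial velocity (E < 0, so: an ellipse)
a = 1.0 / (2.0 / (x**2 + y**2) ** 0.5 - (vx**2 + vy**2) / GM)  # semi-major axis
h0 = x * vy - y * vx     # z-component of r x r' (the orbit stays in the plane)

dt = 1e-4
for _ in range(200_000):
    r3 = (x**2 + y**2) ** 1.5
    vx -= 0.5 * dt * GM * x / r3; vy -= 0.5 * dt * GM * y / r3   # half kick
    x += dt * vx; y += dt * vy                                    # drift
    r3 = (x**2 + y**2) ** 1.5
    vx -= 0.5 * dt * GM * x / r3; vy -= 0.5 * dt * GM * y / r3   # half kick

r = (x**2 + y**2) ** 0.5
print(abs(x * vy - y * vx - h0))                    # ~0: h is conserved
print(abs(vx**2 + vy**2 - GM * (2.0 / r - 1.0 / a)))  # ~0: Eq. (4.21) holds
```

The leapfrog kicks are always parallel to \(\mathbf{r}\), so they can’t change \(\mathbf{r} \times \dot{\mathbf{r}}\) at all; the second check is only approximate (it rests on the integrator conserving energy), but the residual is tiny for this step size.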
176. Popular lecture delivered on the 7th February, 1854, at Königsberg, on the occasion of the Kant commemoration. I am not sure that that is the same popular lecture that I’ve found in my book. 177. “On the Age of the Sun’s Heat”, Macmillan’s Magazine, vol. 5, 1862. 178. You might need to convince yourself that the units in (4.24) are indeed Joules, or units of energy anyway. Remember that G can be given in m³/(kg·s²); ρ² then would need to be in kg²/m⁶; and R⁵ in m⁵. Simplify all you can simplify and you shall be left with kg·m²/s²: which we know to be exactly the same as J. 179. See note 177. 180. I guess earlier in this chapter I mentioned Newton’s cooling law? which is not the same thing as this one, but the two are related, and I might show you briefly how one can get Newton’s equation from Fourier’s, via just some math. We’ve just written that Q = F δt δS, so if you take a small amount of heat Q exchanged in a short time δt, you can write
\[
F = \frac{1}{\delta S} \frac{dQ}{dt}.
\]
Then, from this and Fourier’s law (4.32), we get
\[
-k \frac{dT}{dz} = \frac{1}{\delta S} \frac{dQ}{dt},
\]
which, if you solve for the rate of Q,
\[
\frac{dQ}{dt} = -k\, \delta S\, \frac{dT}{dz}.
\]
If, again, you take a short distance δz, you might write
\[
\frac{dQ}{dt} = -\frac{k\, \delta S\, \delta T}{\delta z}, \tag{N.34}
\]
where δT is the change in temperature across the distance δz. At steady state, by the way, T and z are in a linear relationship to one another, like we’ve just seen, i.e.: dT/dz is constant along z, and it follows that (N.34) is valid even without the requirement that δz be small, etc. To get Newton’s cooling law from (N.34), remember Eq. (4.10), that says that Q = c_m m dT, where c_m is the heat capacity and m is the mass of the rock parcel, so then dQ/dt = c_m m dT/dt, and
\[
c_m m \frac{dT}{dt} = -\frac{k\, \delta S\, \delta T}{\delta z},
\]
or
\[
\frac{dT}{dt} = -\frac{k\, \delta S\, \delta T}{c_m m\, \delta z},
\]
which compare with the first equation in note 124, and you see that this is indeed Newton’s cooling law, if α (which for Newton was just some a priori unknown constant) is replaced with kδS/(c_m m δz). We won’t need this stuff in this chapter, actually, but we’ll get back to it in Chap. 8. 181. Equation (4.39) is the first partial differential equation that we meet in this book. PDEs differ from ODEs in that the functions that solve them have to be functions of more than one variable—because in a PDE we’ve got derivatives with respect to multiple variables: in this case depth, z, and time, t. When you differentiate a function of more than one variable, you’ve got to specify whether you take its total derivative, which reads, e.g., dT/dt—the total derivative of T with respect to t—or its partial derivative, e.g., ∂T/∂t. The distinction is relevant if the variables are not independent from one another. For example, if we took T(z, t) to be the temperature of the parcel of matter that occupies the position z at time t, then
\[
\frac{dT}{dt} = \frac{\partial T}{\partial t} + \frac{\partial T}{\partial z} \frac{\partial z}{\partial t},
\]
where the first term at the right-hand side accounts for changes, if any, in the temperature field as a whole; and the second term is there because, even when the T field is constant over time, z might change, i.e., the particle might move to a place with a different T—and so its T will change as well. In what we are doing right now we don’t have to worry about this: there’s no displacement of matter, just conduction pure and simple: and so partial and total derivatives coincide. But we’ll have to be more careful when we study convection, mantle flow and all that, in Chap. 9. 182. Transactions of The Royal Society of Edinburgh, vol. 23. 183. “The fact that the temperature increases with the depth implies a continual loss of heat from the interior, by conduction outwards through or into the upper crust”, writes Kelvin. 
“Hence, since the upper crust does not become hotter from year to year, there must be a secular loss of heat from the whole earth. It is possible that no cooling may result from this loss of heat, but only an exhaustion of potential energy, which in this case could scarcely be other than chemical affinity between substances forming part of the earth’s mass. But it is certain that either the earth is becoming on the whole cooler from age to age, or the heat conducted out is generated in the interior by temporary dynamical (that is, in this case; chemical) action. To suppose, as Lyell, adopting the chemical hypothesis, has done [in Principles of Geology], that the substances, combining together, may be again separated electrolytically by thermo-electric currents, due to the heat generated by their combination, and thus the chemical action and its heat continued in an endless cycle, violates the principles of natural philosophy in exactly the same manner, and to the same degree, as to believe that
a clock constructed with a self-winding movement may fulfil the expectations of its ingenious inventor by going for ever.” 184. “The admirable analysis by which Fourier arrived at solutions including this,” writes Kelvin, “forms a most interesting and important mathematical study. It is to be found in his Théorie Analytique de la Chaleur.” 185. Which people like to call “error function”, or “erf”. The reason the error function is called this way has to do with the practical meaning it has when it is used in statistics/probability. If you are into these things, you might have noticed that, with a slight transformation of the integration variable, the erf becomes the integral of a generic Gaussian function. This is not something I need to get into in this book, but just in case you were wondering. 186. The proof is quite long for such a small result, but here it is. By Eq. (4.47), what we are looking for is the limit \(\lim_{x\to\infty} \int_0^x e^{-u^2}\, du\), multiplied by \(\frac{2}{\sqrt{\pi}}\). Start out by calling
\[
A(x) = \int_0^x e^{-u^2}\, du.
\]
Then
\[
A^2(x) = \int_0^x e^{-u^2}\, du \int_0^x e^{-v^2}\, dv = \int_0^x \int_0^x du\, dv\, e^{-(u^2 + v^2)}. \tag{N.35}
\]
Now look at Fig. N.11; look at the circle of radius x—i.e., radius equal to the side of the square over which we have to integrate—and at the circle which goes through the vertex of the square marked by the letter C. (The radius of the latter is given by \(\sqrt{x^2 + x^2} = \sqrt{2x^2} = x\sqrt{2}\).) The diagram should be enough to convince you that the area covered by the circle of radius x is smaller than that of the square, while that covered by the other circle is larger. Now, because the function we’re integrating, \(e^{-(u^2+v^2)}\), is always positive whatever the values of u and v, its integral will always grow if the area over which we integrate grows. But so then
\[
\int_{R(x)} du\, dv\, e^{-(u^2+v^2)} \le A^2(x) \le \int_{R(x\sqrt{2})} du\, dv\, e^{-(u^2+v^2)}, \tag{N.36}
\]
where R(ξ) means the circle of radius ξ, centered at the origin. The integrals in (N.36) are easier to do if we switch to polar coordinates:
Notes
547
Fig. N.11 The double integral in Eq. (N.35) is over the shaded square, the length of whose sides is x. The area of the quadrant of radius x is clearly smaller than that of the shaded square; the area of√the quadrant or radius x 2 is larger
−(u 2 +v 2 )
du dv e R (ξ)
=
ξ
=
π 2
0
0
π 2
r dr dθ e−r
ξ
dθ
dr r e−r
0
0
π ξ 2 = dr r e−r 2 0 π 2 = (1 − e−ξ ) 4 where to do the last step I’ve used the fact that d −r 2 2 e = −2r e−r . dr
It follows that
du dv e−(u
2
+v 2 )
R (x)
and likewise
√
du dv e−(u
R (x 2)
But so then (N.36) becomes
2
+v 2 )
=
=
2
π 2 (1 − e−x ), 4 π 2 (1 − e−2x ). 4
2
548
Notes
π 2 (1 − e−x ) ≤ A2 (x) 4 π 2 ≤ (1 − e−2x ). 4 x 2 But now remember, we are interested in the limit of A(x), or 0 e−u du, when x tends to infinity. If we go there, the equation—actually, the inequality—we’ve just written boils down to π π ≤ A2 (x) ≤ , 4 4 or A2 = π4 , i.e. A = that
or
√ π , 2
but so then, by the definition of A, we’ve just proven √ x π −u 2 , lim e du = x→∞ 0 2 2 lim √ x→∞ π
x
e−u du = 1, 2
0
or lim erf(x) = 1,
x→∞
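And if, instead of redoing the integrals, you’d rather just watch the convergence happen, here is a minimal numeric sanity check (a sketch in Python: `math.erf` is the standard library’s implementation of the same function, and the crude midpoint-rule quadrature is my own stand-in for the integral—neither is anything from the proof itself):

```python
import math

def gauss_integral(x, n=100_000):
    """Midpoint-rule approximation of the integral of exp(-u^2) from 0 to x."""
    h = x / n
    return h * sum(math.exp(-((i + 0.5) * h) ** 2) for i in range(n))

# A(x) should approach sqrt(pi)/2 ~ 0.8862..., and erf(x) should approach 1:
for x in (1.0, 2.0, 4.0):
    print(x, gauss_integral(x), math.erf(x))
```

Already at x = 4 the integral agrees with √π/2 ≈ 0.8862 to several decimal places: the limit exists, and in practice the convergence is very fast.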
QED.

187. “On the Periodical Variations of Underground Temperature”, Transactions of the Royal Society of Edinburgh, 1860.

188. Published in three volumes between 1830 and 1833.

189. In Geology in the Nineteenth Century: Changing Views of a Changing World, Cornell University Press, 1982. Greene is an American historian of science. In 2015 he published an impressive, 675-page biography of Alfred Wegener—an important character of this story as well.

190. I know: we haven’t discussed earthquakes yet; we’ll do it later. We’ll see that in the mid nineteenth century people didn’t have a very clear idea of what earthquakes were; I guess Lyell and co. thought that these strange phenomena might be caused by some processes extraneous to mountain building—e.g., the explosion of buried masses of gas—and actually contribute to mountain building... while today we would rather state that an earthquake is just a byproduct of the same, great “tectonic” forces that build mountains.

191. This article bears the not-so-catchy title: “Researches on some of the Revolutions which Have Taken Place on the Surface of the Globe; Presenting Various Examples of the Coincidence between the Elevation of Beds in Certain Systems of Mountains, and the sudden Changes which have produced the Lines of Demarcation observable in certain Stages of the Sedimentary Deposits,” but the running head more intriguingly cuts it to “Researches on Some of the Revolutions which Have Taken Place on the Globe”.
192. To visualize this, think of an apple forgotten in a corner of your kitchen, and left to dry. After a few days, the semi-dry apple has gotten smaller; its inside has shrunk; the total surface of the skin, which was dry to begin with, though, doesn’t change, and so the skin is now covered with wrinkles. According to “contraction” models à la Beaumont, the earth’s interior behaves like the apple’s flesh—except that it shrinks not because of drying, but because of cooling; and the earth’s crust behaves like the skin: mountain ranges are the crust’s wrinkles. To speed up the process, put the apple in the oven and bake it—and, indeed, the contraction model has often been referred to as the baked-apple model.

193. Which maybe is partly explained by the fact that George Vose wasn’t an earth scientist: according to the Appletons’ Cyclopaedia of American Biography, 1900 edition, “during 1849–’50 he studied at the Lawrence scientific school of Harvard, then began his career as assistant engineer on the Kennebec and Portland railroad, and until 1859 was engaged on various railroads. From 1859 till 1863 he was associate editor of The American Railway Times in Boston [...]. He was professor of civil engineering in Bowdoin college from 1872 till 1881, and held a similar chair in the Massachusetts institute of technology from 1881 till 1886. His larger works include Handbook of Railroad Construction (Boston, 1857); Orographic Geology, or the Origin and Structure of Mountains (1866)”, etc. The M.I.T. website says he was head of their department of civil and environmental engineering between 1882 and 1887.

194. Henry Darwin Rogers and William Barton Rogers. They had two more brothers, who were chemists, and their father, who had emigrated to the U.S. from Ireland, was a scientist, too—“Professor of Natural Philosophy and Mathematics in the ancient College of William and Mary, founded at Williamsburg, Va.” All four children followed in their father’s footsteps.
“All were able university professors. They labored jointly as well as separately to increase and diffuse knowledge. On this account they were more or less distinguished. [...] Each followed his routine course; but often they engaged jointly in one investigation, so that the public sometimes confounded their labors and gave credit to one which truly belonged to another. Their works were frequently mentioned at home and abroad as of ‘the brothers Rogers,’ and always in respectful and kindly terms. Mistakes of the sort never disturbed the perfect harmony that always existed between them, as they might have done had the brothers been rivals or competitors for reputation.”

Henry D. and William B. became geologists, and they did a lot of stuff. “At the meeting [of the Association of American Geologists and Naturalists755] held in Boston, in 1842, [they] presented [...] a paper on ‘The Laws of Structure of the more Disturbed Zones of the Earth’s Crust’, embracing what is called the wave theory of mountain chains. This theory was a result of an extensive study of the Appalachian chain in Pennsylvania and Virginia, and was supported by reference to many geological sections and facts. They were first to assert that the structure of mountain chains everywhere is the same in all essential features,
an assertion which has been confirmed by the observations of Murchison in the Ural mountains, and by Darwin in the Andes.

“The meeting was memorable. Dr. Samuel George Morton presided. Among the distinguished naturalists present were the elder Silliman, Professor Hitchcock, Dr. Charles T. Jackson, the French astronomer, Nicollet, Sir Charles Lyell, and the palaeontologist, Hall. Several able and elaborate essays were read and discussed, but the prominent feature of the meeting was the Rogers paper, which was delivered as an oral statement. William B. Rogers first described the physical structure of the mountain chain extending 1500 miles, from Vermont to Alabama, and then Henry D. Rogers followed, explaining the phenomena and expounding the hypothesis deduced from them.

“John L. Hays, of Cambridge, Mass., who was present, says, June 1, 1882: ‘I have frequently read it [the paper] since. To me it is now comparatively tame in expression. It lacks the inspiration of the scene and the man, the illustrative diagrams, the emphasis of voice and finger pointing out the distinguishing phenomena, and the fervor of spontaneous utterance. The impression I have of this exposition as delivered is, that next to the Phi Beta Kappa oration of Wendell Phillips at Harvard, it is the most lucid and elegant effort of oral statement to which I ever listened. It may be true that eloquence is but a secondary quality in the philosopher; but in respect to the matter of this memoir and the general researches and deductions of the brothers Rogers here named, in their peculiar field of exploration, it may be safely asserted that they have made the most original and brilliant generalizations recorded in the annals of American geology, and have thrown light on the structure of mountain chains generally, which entitles them to a place by the side of the great expositor of this subject, Eli [sic] de Beaumont, of France.’ ” The quotes are from William Ruschenberger, “A Sketch of the Life of Robert E. Rogers, M.D., LL.D., with Biographical Notices of his Father and Brothers”, Proceedings of the American Philosophical Society, vol. 23, 1886.

195. From the glossary given at the end of Lyell’s book: “FAULT, in the language of miners, is the sudden interruption of the continuity of strata in the same plane, accompanied by a crack or fissure varying in width from a mere line to several feet, which is generally filled with broken stone, clay, &c. The strata a, b [see Fig. N.12] must at one time have been continuous; but a fracture having taken place at the fault F, either by the upheaving of the portion A, or the sinking of the portion B, the strata were so displaced, that the bed a in B is many feet lower than the same bed a in the portion A.”

196. Nota Bene: here Vose assumes that there is no net source of heat within the earth: we will discover later that this is not such a trivial assumption to make.

197. We saw this in Chap. 3: James Hall’s experiments, for instance.

198. “Perhaps the most extreme self-righteous attack”, says Dott, “was upon James T. Foster, a school teacher in Greenbush, New York”.
Foster had published a, uhm, “popularized” geological chart—a two-meter wide color lithograph, showing strata, intrusions, a volcano, fossils, and so on, full of what Hall thought were “monstrosities”: geological eras and ancient species all mixed up, etc.
Fig. N.12 A fault. F is the fault, which breaks the section in two portions called A and B; a and b and c are layers, cut and offset by the fault. (After Lyell, Principles.)
Anyway, one day in 1849, according to John Mason Clarke (who had been Hall’s assistant for twelve years until Hall’s death), “Hall happened to be in the office of Samuel S. Randall, the deputy Superintendent of Public Instruction, and Mr. Randall directed his attention to something new and interesting in the educational way—a Geological Chart, so-called, prepared for the use of the schools and submitted to the superintendent for his endorsement. [...] This was a proof sheet of ‘Foster’s Complete Geological Chart’ prepared by a schoolteacher at Greenbush, across the river from Albany, by the name of James T. Foster. Though he lived but a mile or so away, Hall had never heard of this person—this audacious fellow. Assuming his most suave manner and gentlest voice, as he ever did when on the verge of explosion, he begged from Mr. Randall the loan of the chart that he might examine it at his leisure; and with it securely rolled under his arm, he burst out in a torrent of denunciation and invective over the imprudent document, and made his way out.” Hall sent the chart to Louis Agassiz (whom we’ll meet again, see note 234), at Harvard, who replied “with a terrific demolition of the Chart and told Hall that he might use the letter as he saw fit. Hall saw fit to print it in the Albany newspapers with one of his own of much the same tone. And thereupon James T. Foster brought an action for civil damages against Hall and Agassiz separately for libel in the amount of $40,000 for the former and $20,000 for the latter.” Which is a lot of money now, but was even more money in the nineteenth century. In early 1850, while Hall prepared his and Agassiz’ defense, says Clarke, “the printing of the offending chart proceeded at the old book house of Websters and Skinners in Albany and the edition was packed off to New York to be put on the market. It would appear that the shipment was made by the Hudson river night boat, and that Hall, hearing of it, took passage by the same boat.
The charts never reached New York. George H. Cook, the distinguished State Geologist of New Jersey, in those days intimate with scientific affairs in Albany, told me the
story of this little moonlight excursion, from which the only logical deduction was that Hall threw the entire edition of the Foster Chart into the Hudson. ‘Do you think he really did that?’ said I wonderingly to Professor Cook. ‘Do you think he would not?’ was the answer.”

Incidentally, “after many postponements, Professor Agassiz’ case was called for trial in March, 1851. The suit”, says Clarke, “had now grown to momentous proportions as the reigning sensation in the scientific community of the country and in the civic community of Albany as well. Though the issue was trivial, the defendant was a national possession, known and honored by all intellectual America [...]; it was already a cause célèbre and many letters came to Hall from men who expressed a wish to give testimony. [...] The trial lasted several days”, and, needless to say, Agassiz was acquitted. “The case against Hall”, concludes Clarke, “was never called. And thus ended the Foster Chart.”

199. In the twentieth century, after being revisited by James Dana, who gave it the name of “geosynclinal theory”, which see below, Hall’s theory would be essentially the most widely accepted idea of mountain formation—until plate tectonics, that is.

200. James Dwight Dana might be, like, the leading figure in American geology in the mid 1800s. He was a prof. at Yale for many years, starting in 1850. Before that, and shortly after graduating (also from Yale), he had been “instructor of mathematics to midshipmen in the U.S. navy, and in this capacity visited the seaports of France, Italy, Greece, and Turkey while on the ‘Delaware’ and the ‘United States.’ [...] In December, 1836, he was appointed mineralogist and geologist to the U. S. exploring expedition, then about to be sent by the government of the United States to the Southern and Pacific oceans under the command of Capt. Charles Wilkes. The expedition sailed in August, 1838, and Mr. Dana was on board the ‘Peacock’ until it was wrecked on a sand-bar at the mouth of Columbia river. In June, 1842, after an absence of three years and ten months, Mr. Dana returned home. Besides the mineralogy and geology, he had under his supervision the zoological departments, including the crustacea and corals. During the thirteen years that followed he was occupied principally in studying the material that he had collected, making drawings, and preparing the reports for publication.” (The text is from Appleton’s Cyclopaedia of American Biography, edition of 1900.) Dana also did a lot more stuff, some of which gets mentioned in this chapter.

201. American Journal of Science, vol. 5.

202. I guess Dana refers to the “statement from the introduction to the third volume of Professor Hall’s New York Paleontology,” that he had just mentioned; that introduction contains the ideas first expressed by Hall in his 1857 presidential address.

203. In Dana’s words, “while the weight of accumulating sediments will not cause subsidence, a slow subsidence of a continental region has often been the occasion for thick accumulations of sediments.”

204. A prof. at the University of Vienna, from the mid 1850s to the 1900s.
205. “If a curve be constructed showing the number of analyses in which the silica percentages fall between certain limits [...], two peaks appear, one at 52.5 per cent. and the other at 73 per cent. [...] If similar curves be constructed for the other oxides, it can be shown that the two most frequent igneous rocks so far sampled for analysis have compositions representing granite and acid basalt [...] respectively.” Then: “It is worth while noting the harmony between these results and the composition of the crust [...]. From geodetic observations, such as the determination of the force of gravity, and from considerations based on the doctrine of isostasy, it is believed that the continents consist mainly of rock of granitic composition (sial) resting upon, and passing into, a substratum (sima), the upper part of which is of basaltic composition.” (George W. Tyrrell, The Principles of Petrology, Chapman and Hall, 1926.)

206. Which in this context people call the isostasy principle. We’ll get to that shortly.

207. In the mid nineteenth century, people start to record the depth of the seafloor around the globe. A weighted line is dropped over the side of a ship, and its length recorded once the weight, or “sinker”, has hit the bottom: the “line-and-sinker” method. One way to analyze these data (and all sorts of scientific data, in general) is to put them in a histogram. Look at Fig. N.13: on the horizontal axis is seafloor depth, subdivided into finite intervals, or “bins”; on the vertical axis is how many, or what percent of measurements fall into each bin. For this to be useful, measurements should be made at more-or-less equally spaced points, or you should divide the seafloor in pixels of equal area, and take the average of all measurements made within each pixel. Then, what you count is not the fraction of measurements, but the fraction of seafloor area where a certain depth is recorded.

Fig. N.13 Histogram showing the global statistical distribution of the oceans’ depth. The “bins” are 1000 m wide. For example: about 20% of the ocean floor is between 4000 and 4500 m deep

The curve in Fig. N.13 is shaped like a bell; or, in scientific slang, the data are “distributed normally”: that means that, over most of the oceans, the depth is close to about 4000 m—the value that corresponds to the middle of the bell, which is usually not far from the average of all measurements—and places where the ocean is deeper than 6000, or shallower than 2000 m, are quite rare. Something interesting happens if you combine measurements of seafloor depth, AKA “bathymetry”, with measurements of elevation taken on continents, i.e., topography, and do the histogram thing again: see Fig. N.14. The “distribution” of data that you get now is not “normal”, but is what people call “bimodal”, i.e., it’s got two clear maxima (on average, the altitude on continents is slightly less than 1000 m).

Fig. N.14 Altitude of the earth’s surface with respect to sea level. On the left, the histogram, same as in Fig. N.13, but now including topography measurements from continents, too; and the graph is rotated by 90°, so elevation is on the vertical axis, and frequency on the horizontal one. On the right, another way of visualizing the same data: on the vertical axis, again, elevation; but on the horizontal axis, for each elevation, the fraction (percent) of the earth’s surface that lies above that elevation: i.e., 0% of the earth’s surface is higher than the Himalaya’s more than 8000 m, and 100% of it is above the Marianas’ −11,000 m. Essentially, “the statistics of the different altitudes of the earth’s crust highlights the following, peculiar result: there are two altitudes that happen most frequently, while the intermediate ones are very rare” (Wegener)

Alfred Wegener, whom you are going to meet shortly, wrote quite a bit about isostasy in his On the Origin of Continents and Oceans, e.g., “it’s hard to find, in all of geophysics, a clear and reliable law like the one that stipulates the existence of two, thus defined, levels. It is surprising, then, that, it being known already for a long time, an explanation for it has not been sought. If, according to the interpretation currently accepted by geologists, elevations were formed by uplift, and depressions by subsidence of the same initial level, then they both would be the more uncommon, the larger their entity, and their statistical distribution should be normal. It should have, that is, only a single peak, corresponding to the average level of the earth’s crust (−2450 m). Instead of that, two maxima are seen, and around each maximum the curve has an approximately normal trend. [...] It is inevitably inferred, that the continents and the floors of oceans consist of two different layers of the earth’s crust, which behave, so to say, like water and large blocks of ice.” Continental and oceanic crust don’t just “happen” to be above or below sea level: instead, continents and oceans are really different entities: continents seem to “float” on whatever stuff the floors of oceans are made of.

208. You can find Dana’s own version of this in his Manual, e.g. starting at page 814 of the third edition.

209. Like I said, here Dana’s and Hall’s “models” differ. In Dana’s words, “while the weight of accumulating sediments will not cause subsidence, a slow subsidence of a continental region has often been the occasion for thick accumulations of sediments.”

210. Dana: “The making of the Alleghany range was carried forward at first through a long-continued subsidence—a geosynclinal (not a true synclinal, since the rocks of the bending crust may have had in them many true or simple synclinals as well as anticlinals), and a consequent accumulation of sediments”, etc.

211. Again from the 1873 paper that I’ve quoted above. I don’t know how Vose figured that the fold should be “a mile high and two miles wide”. If, for the sake of simplicity, the fold is triangular, with a base of about 6 km and a height of about 2, then the length of each of the two other sides would be, by the Pythagorean theorem, about 3.5 km, and the lost km would be accounted for. So, my fold is bigger than Vose’s—same order of magnitude, though.

212. In “Before Continental Drift: Versions of Contraction Theory”, Chap. 1 of a volume she edited on the history of the theory of plate tectonics, i.e., Plate Tectonics: an Insider’s History of the Modern Theory of the Earth, Westview Press, 2003.

213. In his paper on “The Mammals of Madagascar”, published in the Quarterly Journal of Science, vol. 1, 1864, Philip Sclater756 wrote: “Organic beings are not scattered broadcast over the earth’s surface without regularity or arrangement, as the casual observer might suppose, nor are they distributed according to the variations of climate or of any other physical external agent, although the latter have, unquestionably, much influence in modifying their forms. But each species (or assemblage of similar individuals), whether of the animal or vegetable kingdom, is found to occupy a certain definite and contiguous geographical area on the earth. [...] The various parts of the world are characterized by possessing special groups of animals and vegetables, and [...] such tracts of land as are most nearly contiguous have their Faunae and Florae most nearly resembling one another; while, vice versa, those that are farthest asunder are inhabited by most different forms of animal and vegetable life. When any exception to this rule occurs, and two adjacent lands possess dissimilar forms, or two regions far apart exhibit similar forms, it is the task of the student of geographical distribution to give some reason why this has come about, and so to make ‘the exception prove the rule.’ ”
214. Gondwana being a region of India, where one of those problematic species of fossils—the Glossopteris tree, which see below—was found.

215. We’ve seen already in note 207 how bathymetry and topography data essentially confirm that continents float on top of oceans. Alfred Wegener, like I was saying, saw that this was a super convincing proof of the theory of isostasy; and also, that it totally disproved Suess’ idea of sinking continents. See chapter 4 of Wegener’s The Origin of Continents and Oceans.

216. According to Tony Watts757, the word isostasy, or isostacy, was invented by Clarence Dutton, circa 1880, in his review of Osmond Fisher’s book, Physics of the Earth’s Crust. (These are all late-nineteenth-century American geologists.) Fisher had picked up Airy’s idea, which I am going to tell you about, that the crust is lighter rock floating on a denser substratum. “In Dutton’s view,” says Watts, “one of the ‘fundamental doctrines’ of Fisher’s book was the notion that the broader features of the Earth’s surface were simply those that were due to its flotation. [You’ll see exactly what this means in a second.] This idea, he thought, should form an important part of any true theory of the Earth’s evolution”, etc.; and he uses—Dutton uses—the word isostacy in a footnote to his review. Then he uses it again, a few years later, in an actual article: “On Some of the Greater Problems of Physical Geology”, Bulletin of the Philosophical Society of Washington, vol. 11, 1889.

217. See note 136.

218. This figure seems pretty small, and one wonders whether it couldn’t be just “an error in the geodetic calculations”. But that’s unlikely. Remember (note 14) how triangulation works: you can directly measure the length of a “base”, then triangulate over a long distance without doing any direct measurement of distance—you only measure angles—and calculate distances from the angles. Now, Everest had measured “two bases, about seven miles long, at the [northern and southern] extremities of the arc”, to then compute “the length of the northern base [...] from the measured length of the southern one, through a chain of triangles stretching along the whole arc, about 370 miles in extent,” and then compared his result with that of direct measurement; the difference was “0.6 of a foot, an error which would produce, even if wholly lying in the meridian, a difference of latitude no greater than 0.006″.” That is, about a thousand times smaller than the discrepancy that we are talking about, which is about 5″.

219. With the additional problem that the geography (most importantly, the topography) of central Asia was not that well known. Pratt makes a lot of extrapolations from Alexander von Humboldt’s memoir, Aspects of Nature.

220. There were arguments, in Airy’s time, on whether the interior of the earth was fluid or solid: see Chap. 2. Today we know that most of the earth is solid: only the so-called “outer core” is fluid—but that’s almost 3000 km deep, and deeper: so that has nothing to do with isostasy. The thing is, though, Airy’s isostasy doesn’t need an actual liquid below the crust: a viscous layer is enough. We’ll get back to all these concepts later in the book.
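Back to the arithmetic of note 218 for a moment: one arcsecond of latitude corresponds to roughly a hundred feet of meridian, which is why a 0.6-foot base discrepancy translates into about six thousandths of an arcsecond. A minimal sketch of the conversion (the 0.6-foot figure is Everest’s; the 69-statute-miles-per-degree number is my own round average, not anything quoted in the text):

```python
# Rough check of the "0.006 arcseconds" figure in note 218.
foot_error = 0.6                       # ft: discrepancy between the two base measurements
meridian_degree_ft = 69.0 * 5280       # ft in one degree of latitude (assumed average)
arcsec_ft = meridian_degree_ft / 3600  # ft in one arcsecond of latitude (~101 ft)

latitude_error = foot_error / arcsec_ft  # latitude error, in arcseconds
print(round(latitude_error, 4))          # about 0.006, as the note says
```

Which is, indeed, some three orders of magnitude below the 5-arcsecond discrepancy under discussion.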
221. Which, incidentally, is just another way of stating Archimedes’ principle—which, if you don’t know or don’t remember what it is, I am sure you are motivated now to look it up758.

222. And but not what Maskelyne and co. had observed near Schiehallion: but then, again, Airy’s isostasy doesn’t always work, and not at all scales: more on this later.

223. “Joseph Barrell’s scientific life coincided with the ‘Golden Era’ of Geology in America, and in him American Geology has lost a leader who promised to stand as high as the highest. ‘Those whom the gods love, die young.’ [...] Upon his colleagues at Yale, Barrell’s death [...] fell as a heavy blow. Coming to us as a matured and highly educated young man, we saw Barrell grow into a leading geologist who exceeded our hopes and more than justified our choice of him to fill the chair of Structural Geology created for him at Yale. He was a power among us, and it was around him that our graduate courses in Geology were built.

“Barrell’s death occurred in New Haven, Connecticut, on May 4, 1919, after a week’s illness with pneumonia and spinal meningitis. He left a wife, Lena Hopper Bailey, and four sons [...]. Standing 5 feet 10.5 inches in height, of the blue-eyed Nordic type, with a full head of wavy light brown hair, he was pale and spare of build, and yet of great muscular strength—the “strong man” of his class at Lehigh. Once seen, he was easily remembered, and he was quickly picked out in a crowd. This was due in part to his tall slender build, his long and awkward stride, and his confident bearing, but more especially to the strength of character reflected in his large features, particularly the wide mouth and long, narrow nose, concave in profile. He prided himself on the longevity of his ancestors, and from a careful study of insurance tables held the unshakable belief that some thirty years of active life were ahead of him.
“Modest and optimistic, with a strong independence of mind, he prized true worth highly, and was easily aroused to criticism of posers and social climbers, and of shams and errors. Simple in attire, and fond of simple living, his intellectual ideals were of the highest. He cared little for popularity, or for adverse criticism, and not much more for praise. His colleague, Professor Gregory759, says that what he valued most was ‘uninterrupted time for research and intellectual fellowship established through writings. His intellectual power was so obvious and so continuously displayed that twenty years of intimacy has left on me an impression of a mind rather than of a man. His mind was of surpassing fertility, imagination, and machine-logic.’ In appearance and mentality, Barrell reminded his older colleagues at Yale of James Dana; whatever subject these two great geologists touched was made clearer and the way indicated for new lines of research.” (From Barrell’s biographical memoir written by Charles Schuchert760 for the 1925 annual meeting of the U.S. National Academy of Sciences.)

224. More precisely, whether it tends to be true all over the place depends on the scale of the topography features we are talking about. As he writes his 1914 paper, Barrell has already looked into this: “the areal limits and degree of perfection
of isostatic adjustment [have] been dealt with in the previous parts of this investigation. It has been found that, although the relations of continents and ocean basins show with respect to each other a high degree of isostasy, there is but little such adjustment within areas 200–300 km. in diameter [...]. Individual mountains and mountain ranges may stand by virtue of the rigidity of the crust.” Besides the rigidity of the crust, there might be other ways to explain topography that’s not compensated isostatically. But we won’t get to this until much later.

225. At least, that would be my guess: I don’t think Barrell states that explicitly in his paper.

226. Wegener graduated in astronomy from the University of Berlin; then worked (with his brother Kurt) at the Lindenberg Meteorological Observatory in Beeskow (not far from Berlin), becoming a specialist in using air balloons761 to study, if I understand correctly, the flow of air masses across the atmosphere; as such, he took part in a Danish expedition to Greenland in 1906–1908; after which he became lecturer in meteorology at the University of Marburg (north of Frankfurt); he then wrote a textbook, Thermodynamik der Atmosphäre, or Thermodynamics of the Atmosphere, which apparently was quite a success; he went to Greenland again, a bunch of times; fought in World War One and was wounded; in 1924 he became professor of meteorology and geophysics in Graz; and but he died in his fourth polar expedition, 1930. In all this, he managed to develop the theory of continental drift, and write his famous book that explains it, and of which four different editions were published during his life. Basically, Wegener did a lot of different things: which makes sense, because to see continental drift you have to look at the earth from a lot of different angles at the same time.
In the foreword to the fourth edition of his continental drift book, he wrote: “Scientists still do not appear to understand sufficiently that all earth sciences must contribute evidence toward unveiling the state of our planet in earlier times, and that the truth of the matter can only be reached by combining all this evidence [...]. It is only by combining the information furnished by all the earth sciences that we can hope to determine ‘truth’ here, that is to say, to find the picture that sets out all the known facts in the best arrangement and that therefore has the highest degree of probability.” And but in the following I’ll quote from the 1929 edition, which is the fourth one and I think is quite different from those before. About as old as global cartography, which developed very quickly between the end of the fifteenth, and the beginning of the sixteenth century—the discovery of America, and all that. “The jigsaw-puzzle fit of the coast lines on each side of the Atlantic Ocean must have been noticed almost as soon as the first reliable maps of the New World were prepared”, writes Anthony Hallam (in the 1975 Scientific American article). “The complementary shapes of the continents soon provoked speculation about their origin and history, and a number of early theories suggested that the shapes were not the product of mere coincidence. In 1620 Francis Bacon called attention to the similarities of the continental outlines, although he did not go on to suggest that they might
once have formed a unified land mass. In the succeeding centuries several other proposals attempted to account for the correspondence, usually as a result of some postulated catastrophe, such as the sinking of the mythical Atlantis. The first to suggest that the continents had actually moved across the surface of the earth was Antonio Snider-Pellegrini762 in 1858, but he too attributed the event to a supernatural agency: the Great Flood.”
229. Because glôssa means “tongue” in ancient Greek, and pterís means “fern”.
230. Of the National Museum of Natural History, in Paris: same as Buffon, Cuvier, etc. We’ve already met his father, Alexandre Brongniart, in Chap. 4: Cuvier’s co-worker. Incidentally, Adolphe’s grandfather, and Alexandre’s father, Alexandre-Théodore Brongniart, is the architect who designed the Paris Bourse, i.e., stock exchange, AKA Palais Brongniart, and the plan of the Père Lachaise cemetery.
231. “It was on the return from the Pole, only three days before Capt. Scott began his diary with the words: ‘In a very critical condition,’ and less than six weeks before the last entry763 was made, that the fossils were collected. In his account of the Geological History of South Victoria Land Mr. Debenham writes: ‘The 35 lbs. of specimens brought back by the Polar Party from Mt. Buckley contain impressions of fossil plants of late paleozoic age, some of which a cursory inspection identifies as occurring in other parts of the world. When fully examined, they will assuredly prove to be of the highest geological importance.’ The impressions referred to are leaves of Glossopteris, one of the few genera that can be identified with confidence from fragmentary specimens. Their occurrence 300 miles from the Pole is not merely a fact of the greatest interest, but, considering the fragmentary nature of the material, it is a cause of profound satisfaction that the best preserved specimens collected by Dr. Wilson and Lieut. Bowers leave no shadow of doubt as to their generic position.
The discovery of Glossopteris was singularly fortunate; it is one of the most widely distributed and characteristic members of a flora which, from the point of view both of paleo-phytogeography and of geological history in the widest sense, is of the greatest interest. As Mr. Debenham says in his summary of Geological Journeys: ‘The notes made by Dr. Wilson and the specimens collected by him and by Lieut. Bowers are perhaps the most important of all the geological results. The plant fossils collected by this party are the best preserved of any yet found in this quadrant of the Antarctic, and are of the character best suited to settle a long-standing controversy between geologists as to the nature of the former union between Antarctica and Australasia.’ ” (from Albert Seward, Antarctic Fossil Plants, printed by order of the Trustees of the British Museum, 1914).
232. See chapter 6 of Die Entstehung. Today we have many more data than Wegener had, so if you look this up, you’ll find lots of other examples. One that comes up all the time, for some reason, and that is mentioned also by Wegener, is Mesosaurus, a smallish reptile that lived about 270 million years ago in Brazil and... South Africa. “If Mesosaurus was able to swim well enough to cross the ocean, it should have diffused far more widely. Since it did not, this [...] suggests that South America and Africa must have been joined at that time.” (Anthony
Hallam, “Continental Drift and the Fossil Record”, Scientific American, vol. 227, 1972.) 233. “It is odd that neither paleontologists nor geophysicists paid much heed to Wegener’s cogent argument. The paleontologists were almost unanimous in rejecting the notion, perhaps because they did not fully appreciate the force of Wegener’s geophysical proposals. [...] As for the geophysicists, they largely ignored the considerable body of fossil evidence for continental drift that Wegener had assembled. Perhaps they failed to appreciate its significance or perhaps they mistrusted data of a merely qualitative kind and were suspicious of the seemingly subjective character of taxonomic assessments.” (Hallam, again: see note 232.) 234. The scientist who first came up with the theory of ice ages was Louis Agassiz, a geologist at the university of Neuchâtel, who eventually emigrated to the U.S. (you might remember him from note 198). As a kid, Agassiz had always been big about the natural sciences, but his parents wanted him to study medicine—so he studied both that and zoology, and around the time when he got his medical degree (University of Munich, 1830) he also published his major treatise on Brazilian Fish, which eventually (1833) landed him his job as prof. of natural sciences in Neuchâtel. Neuchâtel is in Northern Switzerland, near the Jura mountains: a part of the world where you can see a lot of erratic boulders and or moraines—piles of unconsolidated debris that, people thought, had been displaced by water: rivers, or floods. Two colleagues of Agassiz—Jean de Charpentier and Ignatz Venetz—came up with the idea that all that material had actually been spread around by glaciers—another quite common feature of the Swiss landscape. Agassiz wasn’t a big fan of this idea, but decided to go and check for himself: so for a couple of years—1836, 1837—he did his fieldwork in the valley of the Rhine... 
and he came to the conclusion that his friends were not only right—they were too cautious. Agassiz compared what he saw at current glaciers versus far away from them. And he saw that moraines looked the same way, and that rocks are scratched and or polished, by the glaciers of today, in exactly the same way as rocks found very far away from those glaciers. Agassiz verified that glaciers actually move—he planted a row of rods in the ice, came back months later to see that the rods had actually moved, and quite a bit. It was reasonable, then, to think that the moraines were indeed displaced by the glaciers: and, old moraines being found very far from current glaciers, it was also reasonable to think that, at some point in the recent history of the earth, glaciers had advanced to very large distances with respect to the areas they occupy today. In 1837 he gave a famous address at the meeting of the Helvetic Society of Natural Sciences, of which he was president: he presented his study, and his inference that the world had gone through a great ice age, i.e., that a sheet of ice similar to the one that now covers the Arctic had spread from the North Pole to central Europe and Asia. This was probably somewhat shocking, at the time. It took some years, and a lot more fieldwork, before
Agassiz’ ideas were accepted by the community—Lyell and Humboldt and co.—but, eventually, he made it.
235. See note 234.
236. Marcel Bertrand, a famous French geologist from the School of Mines in Paris, had pointed out the concordance of geological structures on both sides of the northern half of the Atlantic, in a paper published in 1887. This was long before Wegener came up with his theory. Bertrand didn’t think that continents had moved: but rather, that they had been connected by an additional continental mass that occupied what is now the Atlantic Ocean—a “land bridge” that later sank, in compliance with shrinking-earth theory à la Suess.
237. See note 236.
238. A translation, I guess, of Bertrand’s nappe, or nappe de charriage. Actually, a nappe is more like a tablecloth, I think, but OK.
239. The way I understand this, calculations such as that of Albert Heim were done by (i) drawing a “section” of the Alps, perpendicular to the axis of the mountain range; (ii) reconstructing, by extrapolation, whatever had been taken away by erosion; and then (iii), measuring the length of the curved and/or faulted layers, i.e., their horizontal extension if one could lay them flat again. You shouldn’t be surprised, then, that there is a lot of uncertainty in these estimates.
240. See note 207.
241. Wegener, actually, thought, on the basis of I don’t know what data, that India was “originally [...] joined to Asia by a long stretch of land, mostly under shallow water. After the separation of India from Australia [...] and from Madagascar [...], this long junction zone became increasingly folded by the continuing approach of present-day India to Asia; it is now the largest folded range on earth, i.e., the Himalaya and the many other folded chains of upland Asia.” In fact, see the difference between Wegener’s, and a more recent version of Pangea, in the middle and bottom maps of Fig. 5.6.
242. Three driving forces are proposed in Die Entstehung’s 1929 edition.
(i) The so-called pole-flight force, first illustrated by the Hungarian physicist—and gravity specialist—Loránd Eötvös, in 1913, and then in 1921 by Wladimir Köppen764, who understood its potential relevance to continental drift. The idea is as follows: the earth is ellipsoidal, and so the locus of all points where the gravitational potential takes one same value is also ellipsoidal: this is called an equipotential surface. Now, geometry tells us that the “flattening”, or “ellipticity” of equipotential surfaces decreases with depth. In Köppen’s words, cited by Wegener, the equipotential surfaces “are not mutually parallel but are slightly tilted with respect to each other, except at the equator and poles, where they are normal to the radius of the earth. [...] The point of application of the buoyancy force acting on a floating body lies at the centre of gravity of the displaced medium, but the point at which the body’s own weight acts is at its own center of gravity, the direction of both forces being normal to the [equipotential] surface [passing through] the point in question; the force
Fig. N.15 The pole-flight force is the sum of the continent’s weight—acting as if applied at the continent’s center of mass, and perpendicular to the equipotential surface passing through the continent’s center of mass—and of Archimedes’ force associated with the displaced sima, acting as if applied at the center of mass of the displaced sima, and perpendicular to the equipotential surface passing through the center of mass of the displaced sima. It is a very small force, always pointing towards the equator, and vanishing both at the equator itself and at the poles, where weight and buoyancy are exactly opposed. Of course, nothing in this figure is even remotely to scale
vectors are therefore not opposed, but have a resultant of small magnitude, which, if the buoyancy centre lies below the centre of gravity [which, in the case of a continent, it does] points towards the equator.” This is illustrated by the diagram in Fig. N.15, which is adapted from figure 44 in Wegener’s book. The idea that continents are pushed away from the poles goes well with Wegener’s reconstruction of continental displacement, which shows that all continents used to be attached to Antarctica, before separating from it and “drifting” northward, i.e., towards the equator, indeed. You can more-or-less quantify the pole-flight force acting on a given continent, if you estimate the continent’s mass, the mass of the displaced sima, the locations of the centers of mass of both continent and displaced sima, etc. Then, you can do a force balance, if you can come up with an estimate for the other force acting on the continent, which is the friction (or “viscous drag”, about which more in Chap. 8) between sima and sial. Wegener does all this in Chap. 9 of his book, and but, needless to say, can only come up with very vague estimates. One thing is clear, though: the pole-flight force is very small, and Wegener himself thinks that it might be “enough to displace the continental blocks through the sima, but not enough to produce
the great fold-mountain ranges that rose in association with the flight of the continents from the poles.” So, that is not it. (ii) The gravitational attraction from sun and moon, which we briefly looked at in Chap. 2. Remember Hopkins’ argument that if the earth had a frictionless fluid core, then that and the “crust” (i.e., in Hopkins’ terms, everything but the core) would both respond to the sun’s and moon’s attraction, independently of one another. Now, there certainly is some friction between the continents and the sima, but still, if friction is low enough, then the sun’s and moon’s attraction could result in a differential movement of continents with respect to the sima. At this point, though, very little, if anything at all, was known of the sial-sima friction. So this was very speculative. Wegener quotes Wilhelm Schweydar, who “states: ‘The theory of the precession of the rotational axis of the earth under the influence of the attractive forces of sun and moon presupposes that the individual portions of the earth cannot undergo large mutual displacements. The calculation of the shift of the earth’s axis in space is more difficult if the continents are assumed to have drifted. In this case, a distinction must be made between the axis of rotation of the continents and that of the earth as a whole. I have calculated that the precession of the rotational axis of a continent lying between latitudes −30◦ and +40◦ and meridians 0◦ and 40 ◦ W is about 220 times larger than that of the axis of the whole earth. The continent will tend to rotate about an axis which differs from the general axis of rotation. Hence forces arise which act not only in a meridional direction, but also in a westerly one, and which seek to displace the continent; the meridional force reverses its direction in the course of the day and does not enter the problem. These forces are much larger than that producing flight from the poles. 
The force is strongest at the equator and zero at +36◦ of latitude. I hope later to give a more precise description of the problem. As a result of the theory, a westerly displacement of the continents would be possible.’ Although this, too, is only a provisional statement (the promised final version has unfortunately still not appeared), it nevertheless seems very likely that the most clearly recognisable general movement of the continents, their westerly drift, could be explained by the attractive forces of the sun and moon acting on the viscous earth.”765 Like Wegener is saying, Schweydar didn’t come up with a “final”, fully convincing version of his theory. So this argument is interesting, but vague. (iii) Convection: Wegener mentions “the concept of convection currents in the sima”, which “recently, several authors [...] have made use of”. We’ll learn all about convection in Chaps. 8 and 9, but maybe you already know that that’s what happens when you heat up a fluid: the hotter parts become lighter and, by Archimedes’ principle, tend to rise up—and the colder and denser parts, then, to sink, etc. Wegener cites “Joly’s766 idea that the sima under the continental blocks is heated by the large radium content”, etc.: wherever the concentration of radioactive stuff is high enough, radioactive decay releases energy that warms the sima up, and this triggers convective circulation, etc.
Displacement of the sima might carry the continents around, via friction, and or break them apart where there’s divergence in the sima’s motion. But Wegener himself isn’t convinced, because “the relatively great fluidity of the sima”, that would be needed (or so he thinks) for this model to make sense, “has been regarded as unlikely by the majority of authors to date.”
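Of the three proposed mechanisms, (i) is the only one simple enough to caricature in a few lines of code. The sketch below encodes only what Fig. N.15 states—a very small horizontal force, always pointing equatorward, vanishing at the equator and at the poles, where weight and buoyancy are exactly opposed; the sin 2φ latitude dependence and the amplitude K are my own illustrative choices, not numbers from Wegener.

```python
import math

# Toy pole-flight force per unit mass, as a function of latitude phi.
# The sin(2*phi) shape is an illustrative choice consistent with the
# qualitative description: zero at the equator and at the poles, largest
# at mid-latitudes, always directed toward the equator. K is arbitrary.
K = 1.0e-6  # amplitude, in arbitrary units of force per unit mass

def pole_flight(phi_deg):
    """Equatorward force per unit mass at latitude phi_deg (degrees)."""
    phi = math.radians(abs(phi_deg))
    return K * math.sin(2.0 * phi)

assert pole_flight(0.0) == 0.0           # vanishes at the equator
assert abs(pole_flight(90.0)) < 1e-15    # vanishes at the poles (up to rounding)
assert pole_flight(45.0) >= pole_flight(20.0) > 0.0  # largest at mid-latitudes
```

However one chooses K, a force of this shape pushes continents equatorward in both hemispheres, which is what made it attractive given Wegener’s picture of continents detaching from Antarctica and drifting north.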
243. We’ll see in Chap. 8 that one of these three ideas will later turn out to make sense. But in Wegener’s time, there weren’t sufficient elements to figure it out. Wegener is adamant: “Our discussion will have shown the reader that the problem of the forces which have produced and are producing continental drift [...] is still in its infancy.”
244. If you don’t know what radioactivity is, you can’t possibly understand this sentence. But that’s OK, because we’ll get back to this in Chap. 8, and you won’t need to worry about radioactivity again, before then.
245. “Arthur Holmes and Continental Drift”, The British Journal for the History of Science, vol. 11, 1978.
246. Who also put together a collection of papers which were presented at the 1926 symposium, i.e., Theory of Continental Drift: A Symposium on the Origin and Movement of Land Masses Both Inter-Continental and Intra-Continental, as Proposed by Alfred Wegener, 1928. In case you are wondering about the very long name, van der Gracht was from a Dutch patrician family, which I guess tend to have that kind of name. Interestingly, he had degrees in both law and geology, and but worked as a mining engineer for Shell.
247. Say, at least, a magnitude 5, for those of you who already know about these things.
248. We’ll do the theory of wave propagation later in this chapter; but you can probably already understand Mallet’s difficulty here, with only a bit of common sense: an elastic wave happens when for some reason a parcel of matter moves, somewhere, and the parcels of matter around it react to its motion, through friction or pressure or whatever the forces that hold matter together are—which Mallet and his contemporaries didn’t yet fully understand, but they knew existed.
By Newton’s action-reaction principle, if the parcels of matter around the one that moved first resist its motion, i.e., apply a force to it to pull it back to its initial position, then an equal and opposite force is also applied by parcel zero to its neighbors: and they’ll move a bit. And then the same will happen to the neighbors of the neighbors, and so on: and this is what you see happening in the water when you throw a stone in a pond767 . And so now it’s easy to imagine that for example it all starts out with an expansion, an explosion for instance, and matter around the area that has expanded doesn’t want to be compressed, and applies an equal and opposite pressure, etc.; or it all starts with some kind of shearing, i.e., the transversal displacement of a parcel of matter with respect to its neighbors: and the neighbors will resist shearing, applying a shearing traction in return. Even without knowing the theory you should be able to figure all this: but a rotational motion (vortex) that gives rise, in the
surrounding matter, to rotations (vortexes) around many different axes... it’s just very hard to imagine, I guess. 249. Which, incidentally, is not to say that Mallet claims, here, that there is no vertical motion at all. But Mallet was trying to make sense of the data he had, which weren’t many, and were mostly just reports of damage that had occurred to buildings (remember, no instrument had been invented yet to measure the intensity of a quake). Now, if you ask an engineer, they’ll explain to you that it is the horizontal motions that have the most damaging effects on buildings. So to explain why, e.g., a certain building has collapsed, you need only a relatively small horizontal motion, or else an incredibly tremendous vertical motion: and Mallet sticks to the simplest explanation. 250. Mallet’s theoretical understanding of the elastic waves that propagate on earth has a lot of gaps compared to what we would read today in a seismology textbook, although of course it is fairly advanced for 1846. He refers to Weber and Weber, “Wellenlehre auf Experimente gegründet”, Leipzig 1825, a monograph where it was shown experimentally that a water particle hit by an ocean-type wave describes a circle. This is water, though, and experiments like those are not so easy to implement in solid media (although Mallet himself would spend time to do precisely that, later in his career). “It is not yet known”, writes Mallet, “that precisely analogous motions take place amongst the particles of solids within their limits of elasticity, when transmitting a wave of the first order; but it seems highly probably [sic] that there is a close analogy in the motions of the particles in both cases [...].” So you see that Mallet’s knowledge of elastic wave propagation at this point was mostly “qualitative”.
And but it appears, from his paper, that even the very idea of elastic wave—or the idea that many phenomena that we experience in our everyday life are explained in terms of elastic waves—which is something that today we learn when we are kids, I think, don’t we?—even that idea is not really taken for granted in Mallet’s time, because, see: “Difficult as it may seem at first to reconcile to our common notions of solid bodies, that such internal motions should pass through them, causing compression or even intermobility of their particles within their elastic limits, yet, through large measurable spaces, and with immense velocity, such motions are, in fact, before us constantly : the vibrations of the air of a drawing-room shake the solid walls of the house, when a tune is played upon a piano-forte, or otherwise the tune could not be heard in an adjoining house. [...] So also when the great Spanish powder magazine, said to have contained 1500 barrels, was blown up, near Corunna, at the conclusion of Sir John Moore’s retreat, I have been told by officers who were present that the ground rocked sensibly for miles away, and the wave was felt at a distance before the sound of the explosion was heard.” Mallet has to explain all this, because it is not obvious: even to a professional, scientific audience. 251. Michell (pp. 41–42 of his paper) reasons on how this can happen: “so long as the roof rests on the matter on fire, no part of it can fall in, unless the matter below could rise and take its place”, so it is not clear how the water, above, can actually come into contact with the ’fire’ below; Michell then suggests
that “the matter of which any subterraneous fire is composed, must be greatly extended beyond its original dimensions by the heat” (because, he mentions in a footnote, “all bodies we are acquainted with are liable to be extended by heat”), and so “whilst the matter spreads itself, or grows hotter, the parts over the fire will be gradually raised and bent; and [...], as the fire continues to increase, the earth will at last begin to be raised somewhat beyond the limits of” the fire itself, i.e., a sort of vacuum is created between ’fire’ and overlying crust, and then “this space will be gradually filled with water”, etc. All this is fairly hard to believe, but hey. 252. Michell adds: “If these alternate dilatations and compressions should succeed one another at very small intervals, they would excite a like motion in the air, and thereby occasion a considerable noise. The noise that is usually observed to precede or accompany earthquakes, is probably owing partly to this cause, and partly to the grating of the parts of the earth together”. The audible sounds that come with an earthquake are discussed in all early seismology papers; they were a regular feature of eyewitnesses’ accounts of quakes and as such a robust piece of observation, that needed to be accounted for by a proper theory of earthquakes. Today, people still hear noises during earthquakes, but for some reason seismologists are not much concerned with that. They record the motion of the ground with seismometers and accelerometers, and do not look at recordings of the motion of the air picked up by microphones. A notable exception is Earthquake Sounds: a “catalog of earthquake-related sounds” compiled by Karl V. Steinbrugge, and “originally published in [the Bulletin of the Seismological Society of America] vol. 64, no. 5 in 1974 and updated with eight additional sounds in 1985. The collection was sold by the Society for many years, first as an audio cassette tape and later as a compact disc.
SSA is now making it available for free in the form of digital audio files in MP3 format.” Those sounds were all collected by sheer chance, by people who happened to be recording audio just when a quake hit. 253. A similar observation is made, e.g., by Spallanzani, in Travels in the Two Sicilies, cited by Mallet in 1848: “a thick cloud (of dust) rose from the shore of Calabria opposite, where was the centre of the earthquake, the propagation of which was apparent by the successive falls of buildings from the point of the Faro to the city of Messina”. 254. I don’t know how clear this is, so let me formulate it in a different way: each time you have a quake, the time at which its seismic waves are first recorded, i.e., their “arrival time”, is observed at different places around the world. Let’s just consider quakes that occur in some populated area, and that are big enough that there is damage, or witnesses, which means that even without doing any advanced science we’ll have some approximate idea of where the epicenter is. Now for each instrument, installed even very far across the world, that has recorded the quake, you add a line to a table of numbers, whose first column shows distance between the instrument and the epicenter, and the second column shows the difference between arrival time and the time at which the earthquake occurred (if you can get this info from observers close to the epicenter, I guess); or, the difference between, say, the arrival time of the seismic wave, and that of the tsunami wave (for which you need absolutely no info other than what you can read at that particular station). This is what I meant by “tabulating” the data. And you can sort the data by increasing “epicentral” distance, and see that both travel time, and the time between seismic and tsunami waves, grow with distance. So now next time one particular quake hits a station, you can collect the arrival time data, and if you know at what time the quake has occurred you can compute the travel time and see at what point of your table that would sit: and from that extrapolate the distance. And if you don’t know the “origin time” of the quake, well then you can still do the same thing, using the delay between seismic and tsunami wave. 255. And but actually, if you are not near the shore but you have an instrument that can measure the P and the S wave, then you can measure the delay between those, and because S waves are slower than P waves, the rest of the reasoning holds. We’ll see later what P and S waves are and why they propagate at different speeds. 256. Speaking of which, about ten years later Mallet would publish The Earthquake Catalogue of the British Association (1858), which he compiled together with his son, John William Mallet, and which is presumably the very first earthquake catalogue ever made (an earthquake catalogue being a collection of observations of earthquakes that have occurred either globally or in some specific region of the earth, observations meaning for example the estimated coordinates of the epicenter, intensity of shaking, etc.). Mallet also produced a “seismographic map of the world” (Fig. N.16). “The method of colouring”, he explains, “was this. The whole of the recorded earthquakes of the Catalogue were subdivided [...] into three great classes:– “1o.
Great earthquakes, being those in which, over large areas, numerous cities, &c., were overthrown, multitudes of persons killed, rocky masses dislocated [...]. “2o. Mean earthquakes, or those which, although perhaps having a wide superficial area, were recorded to have produced much less destructive effects upon cities, &c., and little or no changes upon natural objects [...]. “3o. Minor earthquakes, [of which] we find notices almost daily from every quarter. To distinguish these three classes upon the map, three different intensities of water-colour tint were prepared—all from the same colour (red ochre and Indian yellow). The first and most intense having been decided to designate the first class, that for the second was obtained of one-third of the intensity, by dilution with three volumes of water”, etc. Mallet then has, for each earthquake, the tint so selected distributed over the whole area where the quake has been felt. In the case of (then) recent events, that was pretty well known; otherwise, assumptions are made based on whatever data are available. This means, by the way, that Mallet does not plot the epicenters (which if today you Google “earthquake catalogue” and look for “images”, chances are you’ll
Fig. N.16 Mallet’s seismographic map of the world, from his “Fourth report on the facts and theory of earthquake phenomena”, Report of the Twenty-eighth Meeting of the British Association for the Advancement of Science; held at Leeds in September 1858
get a bunch of maps of epicenters); he plots the level of shaking, which is something that today we would call “shakemap”. 257. In most cases, most of the vibrations of the ground that come with earthquakes are too low-frequency to be heard, i.e., they are “infrasounds”. But some are audible. 258. The phrase “tidal wave” is one of the most ambiguous in geophysics. Different people mean different things by it. I think that Mallet, here, refers to the crest of a tide as it moves around the earth? But I might be wrong. 259. The speed of the tsunami wave had been estimated by Michell, and found to vary greatly with sea depth: like, according to Michell, on the order of 500 m/s between Portugal and England, and but more like 2000 m/s between Portugal and the other side of the Atlantic. It was known, on the other hand, that the speed of a sound wave in water is always about 1500 m/s, independent of water depth, etc. Incidentally, Mallet compares these figures with his estimate (presumably based on many accounts of quakes of the kind I’ve quoted above) of “the velocity of the sound wave through the earth”, which is what he calls the elastic wave, and it is not clear if he is thinking more along the lines of what today we would call a compressional, or P wave, or rather a Rayleigh wave; on the other hand I am pretty sure that whatever he has in mind is not a shear, or S wave; he estimates that the velocity of this generic seismic wave “will probably be from 7000 to 10,000 feet per second” (i.e., 2000–3000 m/s).
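The tabulate-and-invert procedure of notes 254 and 255, and the speed contrast just described, can be sketched in a few lines of code; the two speeds below are illustrative round numbers of mine (in the spirit of the estimates just quoted), not data from any catalogue.

```python
# Distance from the delay between a fast and a slow wave (notes 254-255):
# the delay grows linearly with distance, so a table of (distance, delay)
# pairs can be read backwards, turning an observed delay into a distance.
# Speeds are illustrative round numbers, not historical measurements.

V_SEISMIC = 2500.0  # m/s, generic seismic wave (cf. Mallet's 2000-3000 m/s)
V_TSUNAMI = 200.0   # m/s, a stand-in tsunami speed (really depth-dependent)

def delay(distance, v_fast=V_SEISMIC, v_slow=V_TSUNAMI):
    """Time between the fast and the slow arrival at one station, in s."""
    return distance / v_slow - distance / v_fast

def distance_from_delay(dt, v_fast=V_SEISMIC, v_slow=V_TSUNAMI):
    """Invert delay() for distance, in m."""
    return dt * v_fast * v_slow / (v_fast - v_slow)

# The "table" of note 254, for a few epicentral distances (in meters):
table = [(d, delay(d)) for d in (1.0e5, 5.0e5, 1.0e6, 5.0e6)]

# Delays grow with distance, and a delay maps back to the right distance:
assert all(t1 < t2 for (_, t1), (_, t2) in zip(table, table[1:]))
assert abs(distance_from_delay(delay(2.0e6)) - 2.0e6) < 1e-3
```

The same two-line inversion works for the S-minus-P delay of note 255, with the two speeds replaced by the P- and S-wave speeds.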
So the seismic wave is always much faster than the tsunami wave. (Based on this, and I am mentioning it just in case you go read Mallet’s paper, and I don’t want you to be confused, based on this Mallet distinguishes the tsunami wave, whose energy entirely propagates through water, and “a small forced sea wave” that, “when in shallow water”, accompanies the seismic wave below, i.e., the oscillating of the sea bottom moves the water, too: so if you are near the shore and you’re hit by a seismic wave, there might be some sort of motion of the sea surface that comes with it: but that’s not the tsunami, which will come much later (tsunami waves are slow) and might be much larger.)
260. In the words of G. G. Stokes769: “The equations of fluid motion commonly employed depend upon the fundamental hypothesis that the mutual action of two adjacent elements of the fluid is normal to the surface which separates them. From this assumption the equality of pressure in all directions is easily deduced”. This is not the only way to define a fluid, and things might change for example if a fluid has viscosity; you should be aware of it, in case you read this then read some other book where the word fluid is taken to have a different meaning. But for now, and since water is a very simple fluid, with little viscosity, this definition is OK, and you can keep reading without even knowing what viscosity is.
261. If you take any surface, oriented whichever way you want, within the prism, there’s pressure acting on that surface as well. But the thing is, by the law of action and reaction, there’s just as much pressure acting on both sides of the surface, and pushing in opposite directions: the total effect of this pair of pressures on the motion of the prism as a whole is null, because they cancel each other out. And so we are left with the total pressure that, like I said, the rest of the fluid outside of the prism puts on the prism itself.
262. If you are really fussy, the earth rotates around its axis, so if the fluid we are talking about is part of the earth, which presumably it is given what this book is about, then there are also the centrifugal force and Coriolis’ force (see note 65, for the time being; we’ll get back to this in Chap. 8), which, ideally, one should take into account. They are relatively small, though, and we can neglect them at this point.
263. In Chap. 2 I covered the Taylor expansion for functions of only one variable, see note 56.
264. What you have here is the Taylor expansion of a multivariable function—four variables total—which I haven’t covered. The formula is fairly easy, and intuitive, I guess, if we stop at first order, as I am doing here; its general version is quite ugly,

f(x_1, \dots, x_n) = \sum_{i_1=0}^{\infty} \cdots \sum_{i_n=0}^{\infty} \frac{(x_1 - a_1)^{i_1} \cdots (x_n - a_n)^{i_n}}{i_1! \cdots i_n!} \, \frac{\partial^{i_1 + \cdots + i_n} f}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}}(a_1, \dots, a_n)

(where, of course, f is a function of the n variables x1, x2, ..., xn, and we are expanding it around the point x1 = a1, x2 = a2, etc.), and the proof would take
more room and more time than I want to spend on it. But maybe, to convince yourself that this makes sense, you can verify that it works for a function of two variables only, up to, say, second order at least. The way you do it is, introduce a variable t such that x1 = a1 + αt and x2 = a2 + βt, where α and β are arbitrary constants. Then f(t) = f(x1(t), x2(t)) is a function of t only, and you can Taylor-expand it the old way. Finally, use the chain rule (note 274):

$$ \frac{df}{dt} = \frac{\partial f}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial f}{\partial x_2}\frac{dx_2}{dt}; $$

this could be a nice exercise, I guess.

265. In Eq. (6.8) ∇p and ∇v stand for the gradient of p and of v, respectively. So, before I tell you more about ∇, AKA the "nabla770 operator", let me explain what the gradient is. Then, before this note is finished, we'll get to divergence and curl, which are also quantities that the nabla helps us to represent mathematically. Let me start with a Taylor expansion in three dimensions (note 264). Say you know the value of the function T(x1, x2, x3), and of its first derivatives with respect to the spatial coordinates x1, x2, x3, at a location (x1, x2, x3). Then, provided that δx1, δx2, δx3 are small,

$$ T(x_1 + \delta x_1,\, x_2 + \delta x_2,\, x_3 + \delta x_3) \approx T(x_1, x_2, x_3) + \frac{\partial T}{\partial x_1}(x_1, x_2, x_3)\,\delta x_1 + \frac{\partial T}{\partial x_2}(x_1, x_2, x_3)\,\delta x_2 + \frac{\partial T}{\partial x_3}(x_1, x_2, x_3)\,\delta x_3. $$
Let's define the vectors

$$ \delta\mathbf{x} = (\delta x_1, \delta x_2, \delta x_3) $$

and

$$ \nabla T = \left(\frac{\partial T}{\partial x_1}, \frac{\partial T}{\partial x_2}, \frac{\partial T}{\partial x_3}\right), \tag{N.37} $$

so that then we can write

$$ T(x_1 + \delta x_1,\, x_2 + \delta x_2,\, x_3 + \delta x_3) \approx T(x_1, x_2, x_3) + \delta\mathbf{x} \cdot \nabla T. $$

What we call gradient is precisely the vector (∂T/∂x1, ∂T/∂x2, ∂T/∂x3), i.e., nabla applied to the scalar function T. To see what the gradient does in practice, let us write δx = |δx| x̂, where x̂ is the unit vector parallel to δx. It follows from (N.37), then, that

$$ \frac{T(x_1 + \delta x_1,\, x_2 + \delta x_2,\, x_3 + \delta x_3) - T(x_1, x_2, x_3)}{|\delta\mathbf{x}|} \approx \hat{\mathbf{x}} \cdot \nabla T. \tag{N.38} $$

If you let |δx| go to zero, you see that (N.38) says that the component of the gradient of T along any direction x̂ coincides with the rate of change of T
in that direction, i.e., the derivative of T with respect to distance along that direction. Which means, if you know the gradient of T at a location x1, x2, x3, then you know how fast T changes along any direction near that location: to find out, you just have to dot the gradient with the unit vector that points in that direction. Now, imagine you dot ∇T(x1, x2, x3) with all possible unit vectors x̂, i.e., pointing in all possible directions; the magnitude |x̂ · ∇T| coincides with the magnitude |∇T| multiplied by the cosine of the angle formed by x̂ and ∇T: and so it is always smaller than |∇T|: except for when x̂ and ∇T are parallel: which then the cosine in question is one. Remember, x̂ · ∇T is how fast T changes in the direction of x̂: and but now we've just proven that x̂ · ∇T is largest when x̂ points in the same direction as ∇T: i.e., that the gradient of T always points in the direction along which T changes most rapidly. Next, consider a direction ŝ such that, if (δs1, δs2, δs3) points that way, then

$$ \frac{T(x_1 + \delta s_1,\, x_2 + \delta s_2,\, x_3 + \delta s_3) - T(x_1, x_2, x_3)}{|\delta\mathbf{s}|} = 0, $$

i.e., such that T is constant along that direction. By definition, that means that the component of ∇T in that direction is zero. Say you have a surface over which T is constant. Then, all directions ŝ that are tangent to that surface give ŝ · ∇T = 0. It follows that the gradient of a function is perpendicular to any surface along which that function is constant. It's OK to take the gradient of a vector function, too—in fact, we just did it, in Eq. (6.8). The gradient of the vector v = (v1, v2, v3) is the matrix

$$ \nabla\mathbf{v} = \begin{pmatrix} \dfrac{\partial v_1}{\partial x_1} & \dfrac{\partial v_1}{\partial x_2} & \dfrac{\partial v_1}{\partial x_3} \\ \dfrac{\partial v_2}{\partial x_1} & \dfrac{\partial v_2}{\partial x_2} & \dfrac{\partial v_2}{\partial x_3} \\ \dfrac{\partial v_3}{\partial x_1} & \dfrac{\partial v_3}{\partial x_2} & \dfrac{\partial v_3}{\partial x_3} \end{pmatrix}, $$

each row of which is the gradient of one component of v: the first row of ∇v is the gradient of v1, and so on and so forth. You can dot-multiply ∇ with a vector. What you get, then, is not a gradient, though—it's what we call the divergence of that vector, i.e.,

$$ \nabla \cdot \mathbf{v} = \left(\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3}\right) \cdot (v_1, v_2, v_3) = \frac{\partial v_1}{\partial x_1} + \frac{\partial v_2}{\partial x_2} + \frac{\partial v_3}{\partial x_3}. $$

We call the curl of v the cross product

$$ \nabla \times \mathbf{v} = \left(\frac{\partial v_3}{\partial x_2} - \frac{\partial v_2}{\partial x_3},\ \frac{\partial v_1}{\partial x_3} - \frac{\partial v_3}{\partial x_1},\ \frac{\partial v_2}{\partial x_1} - \frac{\partial v_1}{\partial x_2}\right), \tag{N.39} $$
and we call Laplacian ∇² the dot product of ∇ with itself,

$$ \nabla^2 = \nabla \cdot \nabla = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2}, $$
which (like the gradient, and unlike divergence and curl) you can apply to both scalars and tensors. Maybe this was a bit too abstract for you? But very soon you are going to see practical examples of all this; like, e.g., the physical meaning of divergence and curl when applied to displacement.

266. Which, if you look up Euler's paper, you'll find on page 286, right at the beginning of article 21. Euler's derivation there is not quite the same as what I've done here, I think, but my simpler derivation here is, I believe, also originally due to Euler. If you are curious about these things, check out chapter 1 of Lamb's Hydrodynamics, where Lamb's Eq. (2), art. 4, is again Euler's equation. Lamb, there, also explains the difference between the Eulerian approach, and the alternative Lagrangian approach, which is something you might hear about at some point in your life, although not in this book. Lamb: "The equations of motion of a fluid have been obtained in two different forms, corresponding to the two ways in which the problem of determining the motion of a fluid mass, acted on by given forces and subject to given conditions, may be viewed. We may either regard as the object of our investigations a knowledge of the velocity, the pressure, and the density, at all points of space occupied by the fluid, for all instants; or we may seek to determine the history of any particle. The equations obtained on these two plans are conveniently designated, as by German mathematicians, the Eulerian and the Lagrangian forms of the hydrokinetic equations, although both forms are in reality due to Euler."

267. Which in a way is the same as saying, how sound propagates through fluids.
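Back to note 265 for a second: the gradient, divergence, and curl formulas there are easy to sanity-check with finite differences. A sketch—the fields T and v below are my own choices, picked so the analytic answers are obvious:

```python
# Finite-difference check of the nabla formulas of note 265, on fields
# chosen (by me, for illustration) so the analytic answers are easy:
# T = x1^2 * x2 + x3, and v = (x2, x1, x3^2).
h = 1e-5  # step for central differences

def partial(f, i, x):
    """Central-difference estimate of the partial derivative of f
    with respect to x[i], at the point x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f(xp) - f(xm)) / (2 * h)

def T(x):
    return x[0] ** 2 * x[1] + x[2]

def v(x):
    return (x[1], x[0], x[2] ** 2)

x0 = [1.0, 2.0, 3.0]

# gradient of the scalar T: analytically (2*x1*x2, x1^2, 1) = (4, 1, 1) at x0
grad_T = [partial(T, i, x0) for i in range(3)]

# divergence of v: analytically 0 + 0 + 2*x3 = 6 at x0
div_v = sum(partial(lambda x, i=i: v(x)[i], i, x0) for i in range(3))

# curl of v, component by component per (N.39); this particular v is curl-free
curl_v = (partial(lambda x: v(x)[2], 1, x0) - partial(lambda x: v(x)[1], 2, x0),
          partial(lambda x: v(x)[0], 2, x0) - partial(lambda x: v(x)[2], 0, x0),
          partial(lambda x: v(x)[1], 0, x0) - partial(lambda x: v(x)[0], 1, x0))

print(grad_T, div_v, curl_v)
```

Note, incidentally, that this v has zero curl even though it shears things around—the same point note 284 makes with the square prism.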
It’s up to you whether you want to use the word “sound”: what might be confusing about this word is that, the way it’s used by civilians, you would expect it to refer only to something you can hear, i.e., to elastic waves that, when they hit our ears, they interact with them to produce some feeling in our brain; and in fact we’ve invented the words infrasound and ultrasound to mean waves that are of too low, or too high frequency, respectively, for human ears to pick them up. But from the point of view of a seismologist, or an engineer, the most important thing about waves that propagate in fluids is that they are just compressional waves (or pressure waves), i.e., they involve no “shearing”, no “lateral displacement” of one side of any arbitrary surface within the fluid with respect to the other (and if you are not convinced by this statement, don’t worry: because in a minute we shall prove this mathematically, based on Euler’s equation that we’ve just found). And so, they like to use the word “sound” to mean any elastic wave of this kind, even if it can’t be heard. Maybe the expression “acoustic waves” is less confusing; which, incidentally, people also speak of “acoustic media” to
refer to media where only compressional waves can propagate, i.e., fluids with zero viscosity.

268. People say that some property of a fluid is "advected" when it gets transferred from one place to another, as a result of a transfer of mass between those two places. So, for example, think of meteorology, say there's a warm front south of where you are: the warm front moves north and eventually hits you: what happens is that hotter air has been transferred to where you are, and so temperature around you is higher, now: not because the air that was around you yesterday has gotten hotter, but because it's been replaced by air that was hotter to begin with.

269. In note 46, Chap. 1, I covered ordinary differential equations, and showed you ways to tell if an ODE can be solved, uniquely or not, given its order and whether or not boundary and/or initial conditions are provided, and how many, etc. What we have here is a partial differential equation, or PDE, and with PDEs things are not that clear-cut anymore. In this book we shall deal with relatively simple PDEs, that can be separated into systems of ODEs. In note 292, e.g., you'll see how the one-dimensional wave equation—a PDE—is solved after being reduced to a pair of ODEs.

270. Back in the 1600s and 1700s, starting with Newton, people had tried to calculate the speed of sound in air, but systematically found values that were way too small compared to experimental771 results. The reason was, essentially, that they assumed that temperature was constant during the propagation of a sound wave, and not affected by the propagation of the wave itself; and when temperature is constant, thermodynamics says (if you don't know about this—which there's nothing wrong with that—please be patient and wait until the next chapter) that the product of p times V (without γ) is constant.
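Just to put numbers on the discrepancy: the isothermal assumption gives a sound speed √(p/ρ), while the adiabatic one gives √(γp/ρ). The air values below are standard textbook numbers, my choice, not figures quoted in this chapter:

```python
import math

# Standard sea-level values for air (assumed here for illustration):
p = 101325.0   # pressure, Pa
rho = 1.225    # density, kg/m^3
gamma = 1.4    # ratio of specific heats, c_p / c_v

c_isothermal = math.sqrt(p / rho)           # Newton's (isothermal) estimate
c_adiabatic = math.sqrt(gamma * p / rho)    # adiabatic estimate

print(f"isothermal: {c_isothermal:.0f} m/s, adiabatic: {c_adiabatic:.0f} m/s")
```

The adiabatic value is a factor √γ ≈ 1.18 larger, which brings it close to the measured ~340 m/s; the isothermal one is roughly 15% too small—Newton's problem in a nutshell.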
It was, apparently, Laplace and his student Jean-Baptiste Biot who understood in the early 1800s that acoustic wave propagation in air is not an isothermal (i.e., constant temperature), but an adiabatic (no exchange of heat) process. Bernard Finn772 sums it up: Laplace figured, he writes, that "when the sound wave compressed—then rarefied—the air, [...] the temperature did not remain constant. Under compression, for instance, heat was liberated. Because of the speed with which the compression-rarefaction process took place, this heat did not have time to dissipate; thus the local temperature was raised, the local pressure was raised, and the speed of sound was that much greater than what Newton had predicted." In Chap. 7, note 488, you'll find the theoretical proof that pV^γ is constant in an adiabatic process, and that γ is the ratio of specific heat capacity at constant pressure to specific heat capacity at constant volume.

271. See note 265.

272. See note 265.

273. "D'Alembert", says Asimov, "was brought up by a glazier and his wife, after having been found abandoned at the church of St. Jean-le-Rond [in Paris, near Notre Dame—now demolished], from which he derived his name. He was the illegitimate son of an aristocrat who did, however, contribute to his support. In
later years, when his talents were clearly evident, his mother tried to claim him, but d'Alembert proudly refused her. 'The glazier's wife is my mother,' he said. He never married and lived with his foster parents till he was forty-seven."

274. Which is a formula to find the derivative of the function of a function; by which I mean, given two functions f(x) and g(x), and their derivatives f′(x) and g′(x), the chain rule gives us the derivative of h(x) = f(g(x)). Namely,

$$ \frac{d}{dx}h(x) = f'(g(x))\,g'(x). \tag{N.40} $$

To prove (N.40), start with the definition of derivative,

$$ \frac{d}{dx}h(x) = \lim_{\delta x \to 0} \frac{h(x + \delta x) - h(x)}{\delta x} = \lim_{\delta x \to 0} \frac{f(g(x + \delta x)) - f(g(x))}{\delta x}. $$

Now call δg(x) a quantity such that g(x + δx) = g(x) + δg(x). Then,

$$ \frac{d}{dx}h(x) = \lim_{\delta x \to 0} \frac{f(g(x) + \delta g(x)) - f(g(x))}{\delta x} = \lim_{\delta x \to 0} \left[\frac{f(g(x) + \delta g(x)) - f(g(x))}{\delta g(x)}\,\frac{\delta g(x)}{\delta x}\right], \tag{N.41} $$

after dividing and multiplying the ratio at the right-hand side by the same thing, i.e., δg(x). Again by definition of derivative,

$$ \lim_{\delta x \to 0} \frac{\delta g(x)}{\delta x} = g'(x), \tag{N.42} $$

the derivative of g with respect to x. In addition, if g doesn't have discontinuities in the interval of x that we're interested in, then lim_{δx→0} δg = 0, and but so then

$$ \lim_{\delta x \to 0} \frac{f(g(x) + \delta g(x)) - f(g(x))}{\delta g(x)} = f'(g(x)). \tag{N.43} $$

Substitute (N.42) and (N.43) into (N.41), and you get (N.40), QED.

275. Had we chosen a plus in front of ω, we'd have a minus now in front of ω/k1: a negative speed, which makes sense for a wave propagating from right to left, i.e., in the negative-x1 direction.

276. Plus, of course, the units along the horizontal axis wouldn't be the same anymore.
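The chain rule (N.40) is also easy to check numerically; the particular f and g below are arbitrary picks of mine:

```python
import math

f = math.sin                  # outer function
fp = math.cos                 # its derivative f'
g = lambda x: x ** 2          # inner function
gp = lambda x: 2 * x          # its derivative g'

def h(x):
    """h = f(g(x)), the composite whose derivative (N.40) gives."""
    return f(g(x))

x = 0.7
dx = 1e-6
numeric = (h(x + dx) - h(x - dx)) / (2 * dx)   # central-difference estimate
chain = fp(g(x)) * gp(x)                       # the chain-rule value (N.40)
print(numeric, chain)
```

The two numbers should agree to many decimal places; shrink dx and the agreement improves (up to floating-point roundoff).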
277. A strange function, defined by P. A. M. Dirac and usually denoted δ(x), whose value is zero everywhere except at x = 0, whose integral over x from minus to plus infinity is one, and which "maps" all functions to their value at x = 0, i.e.,

$$ \int_{-\infty}^{+\infty} \delta(x)\,f(x)\,dx = f(0); $$

and/or more generally, via a change of the integration variable,

$$ \int_{-\infty}^{+\infty} \delta(x - x_0)\,f(x)\,dx = f(x_0). $$
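A discrete caricature of the sifting property: replace δ(x − x0) by a tall, narrow box of unit area, and the integral picks out (approximately) f(x0). The widths and the test function here are my own choices:

```python
import math

def sift(f, x0, eps=1e-4, n=2001):
    """Integrate f(x) * delta_eps(x - x0) numerically, where delta_eps is a
    box of width eps and height 1/eps (so its area is exactly one)."""
    dx = eps / n
    total = 0.0
    for k in range(n):
        x = x0 - eps / 2 + (k + 0.5) * dx   # midpoint of the k-th sub-interval
        total += f(x) * (1.0 / eps) * dx
    return total

print(sift(math.cos, 0.3))  # should be very close to cos(0.3)
```

As eps shrinks, the result converges to f(x0)—which is the sifting property in the limit.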
The Dirac delta gets used a lot in wave physics, where it represents the simplest possible source—an "impulse"; it turns out ("Betti's theorem" is the key word, if you want to look this up) that once you know how a medium responds to an impulse, you are able to calculate how it'll respond to any possible source, via a relatively simple mathematical tool called "convolution".

278. Which we'll see later when we do actual seismic waves... this is just an appetizer, really.

279. If you have the feeling that things are getting too complex for you, it might be a good idea to take a step back and look at a related, but simpler problem. I've been teaching these things for a while, and one thing I typically do around this point in the course is, work out the problem of the deformation of an elastic string under tension, AKA the guitar string. Which is an old, classic problem in physics, and, as I will show you in a second, boils down to the so-called one-dimensional wave equation. Like (6.26), that is, but simpler. Let us begin with a force balance, to figure out which differential equation controls the deformation of the string. (Yes, it will turn out to be the scalar wave equation, of course: but let us make sure that we understand why.) The idea is that when you pull an elastic string, all its points are subject to the same force T, called "tension". Take the x axis to be parallel to the string, as it is when the string is still. A guitarist, then, pulls the string at a point—take the y axis to point in the direction of the pull. The string is deformed now, shaped like a triangle in the x, y plane. At time t = 0, the guitarist lets go, and the string starts to move. Call u(x, t) the string deformation, which, for simplicity, we take to be small with respect to the length of the string773; because the initial deformation was in the y direction, u(x, t) will continue to point everywhere in the y direction. See the sketch in Fig. N.17.
Like I said, we are going to do a force balance; we'll do it for an infinitesimal segment of the string, of length δx; the y-component of T is T sin θ, and the y-component (the only one that matters, here) of Newton's second law reads

$$ T\sin\theta(x + \delta x) - T\sin\theta(x) = \rho\,\delta x\,\frac{\partial^2 u(x,t)}{\partial t^2}, \tag{N.44} $$

Fig. N.17 Section of an elastic string under a tension T. The y-component of T, or T(x) sin[θ(x)], causes a displacement u, also in the direction y. The thick line is, like, a snapshot of the deformation u(x, t) of the string at a given time t; when undeformed, the string is parallel to the x axis

where ρ now is the string's density, but defined as mass per unit length. Now, trigonometry says that, since the deformation—and so the slope of the string—is small,

$$ \sin\theta(x) \approx \frac{\partial u(x,t)}{\partial x}: $$

substituting into (N.44),

$$ T\,\frac{\partial u(x + \delta x, t)}{\partial x} - T\,\frac{\partial u(x,t)}{\partial x} = \rho\,\delta x\,\frac{\partial^2 u(x,t)}{\partial t^2}, $$

and dividing both sides by ρ δx,

$$ \frac{T}{\rho}\,\frac{\partial^2 u(x,t)}{\partial x^2} = \frac{\partial^2 u(x,t)}{\partial t^2}. $$

We have ρ > 0, T > 0 by definition: then T/ρ is always positive, we call it c², and

$$ c^2\,\frac{\partial^2 u(x,t)}{\partial x^2} = \frac{\partial^2 u(x,t)}{\partial t^2}, \tag{N.45} $$

which is, indeed, the one-dimensional wave equation. (A PDE, or partial differential equation, like the heat conduction Eq. 4.39 that we've met a couple of chapters ago.) One (not the only one, as we shall see) way to solve (N.45) is to just notice that any deformation of the form u(x, t) = f(x − ct) or u(x, t) = g(x + ct), where both f and g could be any function of a single variable, solves it. This is the 1-D version of d'Alembert's solution, which we've just met in 3-D. Again, the best way to prove that this works is by direct substitution: call z = x + ct, and notice that, if u(x, t) = g(x + ct), then

$$ \frac{\partial u}{\partial x} = \frac{dg}{dz}\,\frac{\partial z}{\partial x} = \frac{dg}{dz}, $$
because ∂z/∂x = 1. Likewise,

$$ \frac{\partial u}{\partial t} = \frac{dg}{dz}\,\frac{\partial z}{\partial t} = c\,\frac{dg}{dz}. $$

As for the second derivatives,

$$ \frac{\partial^2 u}{\partial x^2} = \frac{\partial}{\partial x}\frac{\partial u}{\partial x} = \frac{\partial}{\partial x}\frac{dg}{dz} = \frac{d^2 g}{dz^2}\,\frac{\partial z}{\partial x} = \frac{d^2 g}{dz^2}, $$

and

$$ \frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial t}\left(c\,\frac{dg}{dz}\right) = c^2\,\frac{d^2 g}{dz^2}. $$
Now if you substitute into (N.45) the expressions we've just found for both second derivatives, you see that what you get is an identity, i.e., (N.45) is verified, which means that u(x, t) = g(x + ct) is indeed a solution. You can play the exact same game with u(x, t) = f(x − ct), to convince yourself that it is a solution as well. So, then, the most general d'Alembert solution reads u(x, t) = f(x − ct) + g(x + ct). You can prescribe some initial conditions: on the displacement, u(x, 0) = u₀(x) for some known function u₀, i.e.,

$$ f(x) + g(x) = u_0(x); \tag{N.46} $$

and on velocity, which is

$$ \frac{\partial u}{\partial t} = f'(x - ct)\,\frac{\partial(x - ct)}{\partial t} + g'(x + ct)\,\frac{\partial(x + ct)}{\partial t} = -c\,f'(x - ct) + c\,g'(x + ct), $$

where the prime stands for differentiation of a single-variable function, i.e., f′(z) = df/dz, etc. So if v₀(x) is the initial velocity, then our condition reads
$$ -c\,f'(x) + c\,g'(x) = v_0(x). \tag{N.47} $$
Both (N.46) and (N.47) must hold for all values of x between 0 and L. We can get rid of the derivatives at the right-hand side of (N.47) if we integrate both sides,

$$ \int_{x_0}^{x} v_0(z)\,dz = -c\int_{x_0}^{x} f'(z)\,dz + c\int_{x_0}^{x} g'(z)\,dz = -c\,[f(x) - f(x_0)] + c\,[g(x) - g(x_0)], $$

or,

$$ g(x) - f(x) = g(x_0) - f(x_0) + \frac{1}{c}\int_{x_0}^{x} v_0(z)\,dz. \tag{N.48} $$

And the value of x₀, by the way, is arbitrary—but that doesn't matter, anyway, as I'll show you in a minute. Sum (N.46) and (N.48) together, left-hand side with left-hand side and right-hand side with right-hand side, and you get

$$ 2g(x) = u_0(x) + g(x_0) - f(x_0) + \frac{1}{c}\int_{x_0}^{x} v_0(z)\,dz, $$

which holds for whatever value of x, and so also for

$$ 2g(x + ct) = u_0(x + ct) + g(x_0) - f(x_0) + \frac{1}{c}\int_{x_0}^{x + ct} v_0(z)\,dz. \tag{N.49} $$

Next, subtract (N.48) from (N.46), and

$$ 2f(x - ct) = u_0(x - ct) - g(x_0) + f(x_0) - \frac{1}{c}\int_{x_0}^{x - ct} v_0(z)\,dz. \tag{N.50} $$

Finally, sum (N.49) and (N.50), plus some algebra, and we get the one d'Alembert solution that satisfies the initial conditions that we prescribed, i.e., we get

$$ u(x,t) = g(x + ct) + f(x - ct) = \frac{1}{2}\left[u_0(x + ct) + u_0(x - ct)\right] + \frac{1}{2c}\int_{x - ct}^{x + ct} v_0(z)\,dz, $$

where, like I promised, x₀ is nowhere to be seen. Now, everything we've done so far is OK on an infinite string—no boundaries, and so no boundary conditions—and makes sense as long as the functions u₀(z), v₀(z) are defined for all values of z—from minus to plus infinity. But in the real world guitar strings are of finite length, and they are fixed at their two
ends—at the "nut" and at the "bridge". So u₀(x) and v₀(x) are only defined for x between 0 and L—it doesn't even make sense to speak of u₀(x) and v₀(x) outside that interval—and our result is not (yet) very helpful. If you pick the reference frame so that one end is at x = 0, and let L be the length of the string, so that the other end is at x = L, the boundary conditions read

$$ u(0, t) = 0; \qquad u(L, t) = 0, \tag{N.51} $$

which must be true at all times t. It turns out that these conditions are met by the solution

$$ u(x,t) = \frac{1}{2}\left[U_0(x + ct) + U_0(x - ct)\right] + \frac{1}{2c}\int_{x - ct}^{x + ct} V_0(z)\,dz, \tag{N.52} $$

where the function U₀(z) denotes the odd extension of u₀(z), and V₀(z) the odd extension of v₀(z). By "odd extension" of u₀ (or v₀, or whatever) I mean another function U₀ (V₀, etc.) that: (i) coincides with u₀ in the interval where u₀ is defined (i.e., between x = 0 and L in our case); (ii) is an odd function of x, i.e., U₀(−x) = −U₀(x), so that in particular U₀(x) = −u₀(−x) when x is between −L and 0; (iii) is periodic, with period 2L, that is to say, U₀(x + 2nL) = U₀(x) for all integer values of n. You can verify by direct substitution that (N.52) satisfies both boundary (fixed endpoints) and initial conditions. You can see all this as a simpler version, sort of, of the 3-D acoustic wave exercise that we just did, and hopefully if you get back to it now, things will be clearer for you. As for the guitar string, there's still something else that I want to tell you about it, but this will have to wait until I do the Fourier transform, later in this chapter. If you are curious and want to find out right away, go straight to note 291, to learn the Fourier stuff, and then back to the string in note 292.

280. Starting with Chap. 1, I often talked about the gravity field of the earth; this expression—"gravity field"—is so common that I used it without thinking; I hope that you somehow understood what I meant—the context should have helped. At this point, though, as I begin to speak of all kinds of fields—the displacement field, right now, and then later on we'll have the magnetic field, and there's the temperature field, of course, etc.—I guess I need to be explicit re what that word actually means in physics. A "field" is, basically, a physical quantity, which could be a scalar, vector, or tensor, that has a value at each point in space and time. Today we are totally used to thinking in terms of fields—everybody has heard of "force fields"; like I said, everybody knows what the gravity field is, etc. In the old days, though, like, in Newton's Principia you are not going to find, say, the gravitational attraction from the earth, or whatever other planet or star, written as a function of the place x and time t where and when the measurement is taken, as in
$$ \mathbf{F}(\mathbf{x}) = -G\,\frac{M_A\,m\,(\mathbf{x} - \mathbf{x}_A)}{|\mathbf{x} - \mathbf{x}_A|^3}, $$

where x_A are the coordinates of the attracting body (or its center of mass, rather), M_A its mass, m the mass you use to make your measurement, and G the gravitational constant—you know this. You don't find this in Newton. Newton thought of the gravitational attraction as a force between a pair of massive objects; post-Newton, though, people realized that writing stuff as a function of position is a much more convenient way to keep track of everything.

281. Remember the definition of "cross product": see note 16.

282. See note 265.

283. See note 16.

284. Yes, there can be distortion with zero curl: take, for instance (look at Fig. N.18), a square prism (a prism whose base is a square), and deform it into a rhombic prism (a prism whose base is a rhombus). Pick your reference frame to be aligned with the edges of the square prism—let its base lie on the x1, x2 plane, etc. The transformation from square to rhombic prism involves no deformation in the x3 direction; all components of u are constant with respect to x3; and ∂u1/∂x2 = ∂u2/∂x1. The curl, then,

$$ \nabla \times \mathbf{u} = \left(\frac{\partial u_3}{\partial x_2} - \frac{\partial u_2}{\partial x_3},\ \frac{\partial u_1}{\partial x_3} - \frac{\partial u_3}{\partial x_1},\ \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2}\right) = (0, 0, 0). $$
285. Most of the theory that follows is due, in its original form, to George Biddell Airy—whom we'll meet again, in later chapters, in his capacity as British Astronomer Royal, in the mid 1800s, and one of the fathers of the theory of isostasy. The theory of ocean waves, the way I am about to show it to you here, is still known and referred to as Airy wave theory. Here I'm rather following Lamb's Hydrodynamics (Chaps. II and IX), like I said, which I think is what most modern books are probably based on? I looked at Airy's paper, "Tides and Waves" (1845), which Lamb also cites, but I think Lamb is clearer.

286. See note 55.

287. This is quite a bit of algebra, but you can prove it, starting with the definitions of cross product (note 16 in Chap. 1) and of curl (note 265). It just takes a lot of patience.

288. Stokes' theorem says that the integral of a vector v along a curve ∂S coincides with the integral of ∇ × v over the surface (any surface) S enclosed by ∂S. Formally,

$$ \int_S \hat{\mathbf{n}} \cdot (\nabla \times \mathbf{v})\,d^2 r = \oint_{\partial S} \mathbf{v} \cdot d\mathbf{r}, \tag{N.53} $$

where n̂ is the unit vector that is everywhere normal to S. The vector dr, of infinitesimal magnitude, is everywhere tangent to ∂S. The direction to which
Fig. N.18 Left: a square prism (thick solid line, shaded) deforms into a rhombic prism (thin solid line). Right: looking at just the base of the prism, it's easier to see that deformation is only in the x1 and x2 directions, and that u1 grows with x2 at the same rate as u2 with x1, i.e., ∂u1/∂x2 = ∂u2/∂x1. As a result, the shape of the prism is distorted, but the displacement field is curl-free
dr points is somewhat difficult to define; but important, because it determines the sign of the integral. So, imagine you are sitting on the side of S towards which n̂ points: then, dr is positive if it points in the counterclockwise direction. In fact, this doesn't matter that much to us, because we only care about the case where v is everywhere curl-free: and in such a case, it follows from (N.53) that the integral of v · dr over any closed curve is zero. Anyway, here's a proof for the general theorem. First off, I am going to prove (N.53) in the case where S is flat—it lies on a plane; we'll worry later about generalizing this to the case where S isn't flat, i.e., ∂S is an arbitrary curve in three-dimensional space. Consider a volume V bounded by S, by a replica of S which is parallel to S and at an arbitrary distance h (which eventually we'll take to be arbitrarily small) from S, and by the surface that connects the two S's: see Fig. N.19 to see what I mean. Let n̂ be the vector that is everywhere normal to S—a constant vector, then, because S is flat—and consider the vector v × n̂. (This is strange, I know, but you will see in a minute why we are doing this.) Apply the divergence theorem:

$$ \int_V \nabla \cdot (\mathbf{v} \times \hat{\mathbf{n}})\,d^3 r = \int_{\partial V} \hat{\mathbf{n}}_V \cdot (\mathbf{v} \times \hat{\mathbf{n}})\,d^2 r, \tag{N.54} $$

where n̂_V has the same meaning as before. Play with the left-hand side of (N.54), first:
Fig. N.19 Volume V used to prove (a special case of) Stokes' theorem. For simplicity, I drew the surface S (the shaded area) as a circle, but it could really have any shape—only, it has to be flat. The unit vector n̂ is normal to S; the unit vector n̂_V is everywhere normal to ∂V, and pointing outward (on S, it coincides with −n̂); the unit vector t̂ is tangent to the curve ∂S, i.e., the closed curve that bounds S
$$ \int_V \nabla \cdot (\mathbf{v} \times \hat{\mathbf{n}})\,d^3 r = \int_V \left[\hat{\mathbf{n}} \cdot (\nabla \times \mathbf{v}) - \mathbf{v} \cdot (\nabla \times \hat{\mathbf{n}})\right] d^3 r = \int_V \hat{\mathbf{n}} \cdot (\nabla \times \mathbf{v})\,d^3 r = \int_0^h dz \int_S \hat{\mathbf{n}} \cdot (\nabla \times \mathbf{v})\,d^2 r, $$
where in the first step I’ve used one of those ∇ formulae; in the second step, the fact that nˆ is constant, because S is flat; and in the third step, I’ve chosen a clever frame of reference where the z-axis is perpendicular to S. Next, play with the right-hand side of (N.54): integrating all over ∂V is like summing the integrals over the two flat surfaces—S and its copy—with the integral along the “edge” of height h. Let us choose h to be small enough that everything within V is constant—h is totally arbitrary, so this is fine; then, the integral over the two flat surfaces are equal to one another except for the sign (the direction nˆ points to), and cancel out. So,
$$ \int_{\partial V} \hat{\mathbf{n}}_V \cdot (\mathbf{v} \times \hat{\mathbf{n}})\,d^2 r = \int_0^h dz \oint_{\partial S} \hat{\mathbf{n}}_V \cdot (\mathbf{v} \times \hat{\mathbf{n}})\,dr. $$
The unit vector n̂_V, normal to ∂V along ∂S, is perpendicular to both n̂ and the unit vector, call it t̂, that is everywhere tangent to ∂S: it follows that n̂_V = t̂ × n̂ (if we define everything like in Fig. N.19), and so

$$ \int_{\partial V} \hat{\mathbf{n}}_V \cdot (\mathbf{v} \times \hat{\mathbf{n}})\,dr = \int_0^h dz \oint_{\partial S} (\hat{\mathbf{t}} \times \hat{\mathbf{n}}) \cdot (\mathbf{v} \times \hat{\mathbf{n}})\,dr. $$
The way the cross product works, we have that

$$ (\hat{\mathbf{t}} \times \hat{\mathbf{n}}) \cdot (\mathbf{v} \times \hat{\mathbf{n}}) = (\hat{\mathbf{t}} \cdot \mathbf{v})(\hat{\mathbf{n}} \cdot \hat{\mathbf{n}}) - (\hat{\mathbf{t}} \cdot \hat{\mathbf{n}})(\hat{\mathbf{n}} \cdot \mathbf{v}). $$
But tˆ · nˆ = 0, because tˆ and nˆ are perpendicular to one another; and of course nˆ · nˆ = 1; and so
$$ \int_{\partial V} \hat{\mathbf{n}}_V \cdot (\mathbf{v} \times \hat{\mathbf{n}})\,dr = \int_0^h dz \oint_{\partial S} (\hat{\mathbf{t}} \cdot \mathbf{v})\,dr. $$
So, finally, sub these new expressions that we just got, for both the left- and right-hand sides, into (N.54); replace the integral from 0 to h with just h—which is OK because h is arbitrarily small; but then h cancels out, and
$$ \int_S \hat{\mathbf{n}} \cdot (\nabla \times \mathbf{v})\,d^2 r = \oint_{\partial S} (\hat{\mathbf{t}} \cdot \mathbf{v})\,dr = \oint_{\partial S} \mathbf{v} \cdot d\mathbf{r}, $$
because t̂ dr is exactly the vector dr, the way we defined it at the beginning of this note. And the equation we have just gotten is exactly (N.53), i.e., what we had set out to prove. I still have to show you that (N.53) is true not only when S and ∂S lie on a plane, but also when ∂S is an arbitrary closed curve in three-dimensional space. But the thing is, whatever its shape, you can always think of S as a polyhedron made up of an arbitrarily large number of arbitrarily small flat triangles. (It could also be some other geometrical shape, I guess, but, to fix ideas, let them be triangles—and let's not get into the problem of what's the best way to interpolate an arbitrary surface: it's not for this book, and it's not really that much of a problem, since our triangles are, basically, infinitely small.) Because they're flat, we know now that (N.53) must hold for each of those triangles; so imagine you write (N.53) once per triangle, and sum all left-hand sides together, and all right-hand sides. At the left-hand side you'll get the integral over the sum of all triangular surfaces, i.e., the integral over the non-flat surface S; at the right-hand side, you end up integrating v · dr twice along each side of each triangle—and but the sign of dr will be one time negative, one time positive: everything cancels out except for the sides that belong to ∂S: you only integrate once along those. Bottom line, you end up with (N.53), again, but this time with no constraints on how S and/or ∂S should look.

289. It follows from Stokes' theorem that, because ∇ × v = 0, then

$$ \oint_C \mathbf{v} \cdot d\mathbf{r} = 0 $$

for any curve C. But then it follows that you can define a function V(x) as follows:

$$ V(\mathbf{x}) = \int_0^{\mathbf{x}} \mathbf{v} \cdot d\mathbf{r}, \tag{N.55} $$
the point being that, because the integral of v · dr over any closed curve is zero, the value of the integral at the right-hand side does not depend on the path along which you integrate—only on its starting and end points; if that wasn't the case, the definition of V(x) I just gave would be ambiguous—it wouldn't work as the definition of a function. Now, based on what I showed you about the gradient in note 265, the increment of a function of x over a (small) change of position, δx, coincides with the dot product of δx with the gradient of that function, i.e., in our case
$$ \delta\mathbf{x} \cdot \nabla \int_0^{\mathbf{x}} \mathbf{v}(\mathbf{r}) \cdot d\mathbf{r} = \int_0^{\mathbf{x} + \delta\mathbf{x}} \mathbf{v}(\mathbf{r}) \cdot d\mathbf{r} - \int_0^{\mathbf{x}} \mathbf{v}(\mathbf{r}) \cdot d\mathbf{r} = \int_{\mathbf{x}}^{\mathbf{x} + \delta\mathbf{x}} \mathbf{v}(\mathbf{r}) \cdot d\mathbf{r} = \mathbf{v}(\mathbf{x}) \cdot \delta\mathbf{x}. $$

(This is strictly valid only when δx becomes infinitely small—but that's fine by us, since we are just interested in what happens at the point x.) Because of Stokes' theorem, you can take δx to point to whatever direction you want; for instance, you could choose it to point to each of the Cartesian directions, which would give you a separate equality for each of the components of v; anyway, even without doing that, I think it's clear from the equation I just wrote that, δx being arbitrary,
$$ \nabla \int_0^{\mathbf{x}} \mathbf{v}(\mathbf{r}) \cdot d\mathbf{r} = \mathbf{v}(\mathbf{x}); $$
or ∇V(x) = v(x), i.e., V as defined by (N.55) is indeed the "potential" we are looking for.

290. That year, he submitted to the Académie des Sciences a paper that, apparently, had the Fourier series idea in it, but was never published. The manuscript still exists, though, and anyway, Fourier included that material in his later publications. (See Ivor Grattan-Guinness, "Joseph Fourier and the Revolution in Mathematical Physics", IMA Journal of Applied Mathematics, vol. 5, 1969.)

291. Which is the so-called Fourier's theorem; which I am not going to give you a rigorous proof of, because that would take too much space, etc., but I am going to show you some properties of sines and cosines until you will see that after all Fourier's result is not so strange. We start off with the set of all functions, which there's an infinity of them, defined in the interval −π to π on the real axis. And in particular we are going to look at sines, and cosines, between −π and π, the underlying question being, can we form a basis of this "space of functions", using only cosines, or only sines, or a combination of cosines and sines. Which when I say, e.g., cosines, I mean all possible functions cos(mx), with m an integer number—which there's also an infinity of them—and the same for the sines; and when I say basis I
mean, remember note 46 (Chap. 1), by basis I mean a discrete set of functions such that any function defined between −π and π can be written as their linear combination. It is useful, for what we are going to do next, to look at the integral between −π and π of the product cos(mx) cos(nx), where m and n are integers. Through the old trigonometric equality
\[
\cos(A)\cos(B) = \frac{\cos(A+B)+\cos(A-B)}{2},
\]
valid for whatever pair of angles A and B, we have
\[
\int_{-\pi}^{\pi}\cos(mx)\cos(nx)\,dx = \int_{-\pi}^{\pi}\frac{\cos[(m+n)x]+\cos[(m-n)x]}{2}\,dx.
\]
The integral at the right-hand side can be solved analytically—in two different ways, though, depending on whether \(m \neq n\) or \(m = n\). If the former is true, we have that
\[
\int_{-\pi}^{\pi}\frac{\cos[(m+n)x]+\cos[(m-n)x]}{2}\,dx
= \frac{1}{2}\left[\frac{\sin[(m+n)x]}{m+n}+\frac{\sin[(m-n)x]}{m-n}\right]_{-\pi}^{\pi}
= 0,
\]
because the sine of all integer multiples of π or −π is zero773. On the other hand, if m = n then the above wouldn’t work, because it involves a division by 0, but instead we can write
\[
\begin{aligned}
\int_{-\pi}^{\pi}\frac{\cos[(m+n)x]+\cos[(m-n)x]}{2}\,dx
&= \int_{-\pi}^{\pi}\frac{\cos(2mx)+1}{2}\,dx\\
&= \frac{1}{2}\int_{-\pi}^{\pi}\cos(2mx)\,dx + \frac{1}{2}\int_{-\pi}^{\pi}dx\\
&= \frac{1}{2}\left[\frac{\sin(2mx)}{2m}\right]_{-\pi}^{\pi} + \frac{1}{2}\,[x]_{-\pi}^{\pi}\\
&= 0 + \frac{\pi-(-\pi)}{2}\\
&= \pi.
\end{aligned}
\]
You can show, via pretty much the same strategy, that the integral from −π to π of the product of two sines, sin(mx) sin(nx), is also zero unless m = n, in which case it’s π: just like with the cosines. As for the integral of the product of a sine with a cosine, that doesn’t need much algebra, but just being aware of what odd and even functions are, and of some of their properties. So: we call “odd” a function f (x) such that f (−x) = − f (x) for all x; we call “even” a function f (x) such that f (−x) = f (x) for all x.
The integral of an odd function of x over an interval that’s symmetric around x = 0, i.e., from x = −a to x = a, whatever the value of a, is zero, because
\[
\int_{-a}^{0} f(x)\,dx
= -\int_{-a}^{0} f(-x)\,dx
= \int_{0}^{-a} f(-x)\,dx
= -\int_{0}^{a} f(y)\,dy
\]
(where in the last step I did a change of variables, y = −x), and so
\[
\int_{-a}^{a} f(x)\,dx
= \int_{-a}^{0} f(x)\,dx + \int_{0}^{a} f(x)\,dx
= -\int_{0}^{a} f(y)\,dy + \int_{0}^{a} f(x)\,dx
= 0.
\]
Now: cos(mx) is an even function; sin(nx) is an odd function. It’s easy, I think, to see that the product of an even times an odd function is an odd function. It follows that the function cos(mx) sin(nx) is odd. But so, then, the integral of cos(mx) sin(nx) between −π and π must be zero. And that is independent of whether m = n or not. So, before I go ahead, here’s all we’ve found so far re sines and cosines between −π and π, in compact form:
\[
\int_{-\pi}^{\pi}\cos(mx)\cos(nx)\,dx = \begin{cases}\pi & \text{if } m = n\\ 0 & \text{if } m \neq n\end{cases}
\]
and likewise
\[
\int_{-\pi}^{\pi}\sin(mx)\sin(nx)\,dx = \begin{cases}\pi & \text{if } m = n\\ 0 & \text{if } m \neq n\end{cases}
\]
and
\[
\int_{-\pi}^{\pi}\sin(mx)\cos(nx)\,dx = 0 \quad \text{for all } m, n.
\]
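These three relations are easy to check numerically, by the way. Here’s a quick sketch (mine, not from the text; plain Python with NumPy, midpoint-rule quadrature) that approximates the integrals for a few values of m and n:

```python
import numpy as np

def inner(f, g, num=4096):
    """Midpoint-rule approximation of the integral of f(x)*g(x) over [-pi, pi]."""
    dx = 2 * np.pi / num
    x = -np.pi + dx * (np.arange(num) + 0.5)
    return np.sum(f(x) * g(x)) * dx

for m in range(1, 4):
    for n in range(1, 4):
        cc = inner(lambda x: np.cos(m * x), lambda x: np.cos(n * x))
        ss = inner(lambda x: np.sin(m * x), lambda x: np.sin(n * x))
        sc = inner(lambda x: np.sin(m * x), lambda x: np.cos(n * x))
        expected = np.pi if m == n else 0.0
        assert abs(cc - expected) < 1e-9  # cos-cos: pi on the diagonal, else 0
        assert abs(ss - expected) < 1e-9  # sin-sin: same pattern
        assert abs(sc) < 1e-9             # sin-cos: always 0
print("orthogonality relations check out")
```

(The midpoint rule is essentially exact here, because for smooth periodic integrands sampled uniformly over a full period its error is negligible.)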
So, now, then, back to the original question: can we write any function between −π and π as a combination of cosines? and/or as a combination of sines? and/or of sines and cosines? Let’s start with the cosines, i.e., with the set of functions 1, cos(x), cos(2x), cos(3x), ..., cos(nx), ..., etc., all the way to n = ∞. Can this be a basis of the “space” of all functions f(x), with −π ≤ x ≤ π? The way we answer this is, start off with the hypothesis that, indeed, the cosines form a basis, and then see if we can find a function that cannot be written as their
linear combination: if we succeed in finding just one such function, then we’ve disproved the initial hypothesis and gotten an answer to the question we started out with, i.e., the cosines aren’t a basis. (Which, by the way, is an example of what people call reductio ad absurdum: you assume that something’s true, but then prove that making such an assumption inevitably leads to absurdity.) Anyway, so, if the cosines are a basis, then given any function f(x), a set of coefficients c0, c1, etc., must exist, such that
\[
f(x) = c_0 + c_1\cos(x) + c_2\cos(2x) + \cdots = \sum_{k=0}^{\infty} c_k\cos(kx). \tag{N.56}
\]
Now, if this is true, then we can use one of the properties of the cosines that we just found to calculate the values of the ck’s. Because if we multiply both sides of (N.56) by cos(mx), for some arbitrary value of m,
\[
\cos(mx)\,f(x) = \cos(mx)\sum_{k=0}^{\infty}\left[c_k\cos(kx)\right] = \sum_{k=0}^{\infty}\left[c_k\cos(kx)\cos(mx)\right],
\]
and then integrate over x from −π to π,
\[
\int_{-\pi}^{\pi} f(x)\cos(mx)\,dx = \sum_{k=0}^{\infty} c_k\int_{-\pi}^{\pi}\cos(kx)\cos(mx)\,dx = \pi c_m,
\]
because, of course, all integrals in the sum at the right-hand side are zero, except for the one with k = m. So but then you can turn it around,
\[
c_m = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos(mx)\,dx,
\]
which is a formula to find cm for any value of m. So far, so good. But now, consider the case when f (x) is some arbitrary odd function. Then, the cosine being even, the integral at the left-hand side of what we just wrote is zero, whatever the value of m. It follows that, if f is odd, then cm = 0 for all m; but then it also follows from (N.56) that f (x) is zero: which it wasn’t supposed to be, and that’s the paradox we were looking for. Bottom line, the cosines of mx aren’t a basis of our function space. And you can apply pretty much the same proof to sines, to find that they aren’t a basis, either: try if you have time and want to strengthen your mathematical muscles.
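The reductio is also easy to watch happen numerically. In the sketch below (my own illustration, not from the text), the would-be coefficients cm of the odd function f(x) = x all come out zero, so the cosine “expansion” of f reconstructs the zero function rather than f:

```python
import numpy as np

def cosine_coeff(f, m, num=4096):
    # c_m = (1/pi) * integral of f(x) cos(mx) over [-pi, pi], midpoint rule
    dx = 2 * np.pi / num
    x = -np.pi + dx * (np.arange(num) + 0.5)
    return np.sum(f(x) * np.cos(m * x)) * dx / np.pi

f = lambda x: x  # an odd function, and certainly not identically zero
coeffs = [cosine_coeff(f, m) for m in range(8)]
# every c_m vanishes: summing the series c_0 + c_1 cos(x) + ... gives zero
# everywhere, which cannot equal f -- so the cosines alone are not a basis
assert all(abs(c) < 1e-12 for c in coeffs)
```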
These results are some—not all—of the ingredients of Fourier’s theorem, which says, more or less, that the combination of all the cosines and the sines seen before, taken together, is a basis; or, in other words, that (almost) any function f(x), with −π ≤ x ≤ π, can be written as a linear combination of the functions 1, cos(x), sin(x), cos(2x), sin(2x), ..., cos(nx), sin(nx), ..., i.e.,
\[
f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[a_n\cos(nx) + b_n\sin(nx)\right], \tag{N.57}
\]
where the coefficients an and bn are given by
\[
a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos(nx)\,dx, \qquad
b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin(nx)\,dx. \tag{N.58}
\]
Notice that the sum starts at n = 1 and there’s no b0: that’s because when n = 0 then sin(nx) = 0, whatever the value of x. The cosine of zero, instead, as you know, is one. Notice also that cos(nπ) = cos(−nπ) = ±1, and sin(nπ) = sin(−nπ) = 0, whatever the value of n, so Eq. (N.57) won’t hold—and Fourier’s theorem won’t work—unless the function f is such that f(−π) = f(π). Which is why I added “almost”, before “any function”, a paragraph ago774. You might also be wondering why a0 in (N.57) is divided by 2. To see why that is the case, imagine you are looking for the Fourier coefficients of a constant function—f(x) = c, with c an arbitrary real value. Then, Eq. (N.58) gives775 a0 = 2c, and an = bn = 0 for all n bigger than 0: and you see that without that factor 1/2 we would be in trouble. I am stopping short of giving the entire proof of Fourier’s theorem776, because that would be too long and complex, even for this long and complex book. But you should be able to see, if you remember what we just did, that, if (N.57) is true, then (N.58) has to be true as well. Because if you multiply both sides of (N.57) by sin(kx),
\[
f(x)\sin(kx) = \frac{a_0}{2}\sin(kx) + \sum_{n=1}^{\infty}\left[a_n\cos(nx)\sin(kx) + b_n\sin(nx)\sin(kx)\right],
\]
and integrate both sides from −π to π,
\[
\int_{-\pi}^{\pi} f(x)\sin(kx)\,dx = \frac{a_0}{2}\int_{-\pi}^{\pi}\sin(kx)\,dx
+ \sum_{n=1}^{\infty}\left[a_n\int_{-\pi}^{\pi}\cos(nx)\sin(kx)\,dx + b_n\int_{-\pi}^{\pi}\sin(nx)\sin(kx)\,dx\right],
\]
which this time I went through all the algebra pretty fast, but I figured this is not that far from stuff we’ve done a few moments ago—when I showed you how to get a formula for cm from Eq. (N.56); now, all integrals at the right-hand
side are zero, except for \(\int_{-\pi}^{\pi}\sin(nx)\sin(kx)\,dx\), which is nonzero only when n = k, in which case it is equal to π. And so we are left with
\[
\int_{-\pi}^{\pi} f(x)\sin(kx)\,dx = \pi b_k,
\]
and if you replace k with n in this equation—which at this point it doesn’t matter, since there’s only one index, and no sums—if you do that and divide both sides by π, then you have the second of (N.58), QED777. And you can play a similar game with cos(kx), to find the first of (N.58) as well778. So, even without a formal proof, I am hoping that the Fourier story begins to make some sense to you. Now, it all looks very nice, but of course a tool that can only be used on functions that are defined between779 −π and π is not going to be that useful in the real world. The thing is, though, it is not that difficult to extend Fourier’s theorem to functions defined from −L to L, where L is a totally arbitrary real number. Consider, instead of cos(mx), cos(nx), the functions \(\cos\left(\frac{m\pi x}{L}\right)\) and \(\cos\left(\frac{n\pi x}{L}\right)\). Try to integrate their product from −L to L; you see that
\[
\int_{-L}^{L}\cos\!\left(\frac{m\pi x}{L}\right)\cos\!\left(\frac{n\pi x}{L}\right)dx
= \int_{-\pi}^{\pi}\cos(my)\cos(ny)\,\frac{L}{\pi}\,dy
= \frac{L}{\pi}\int_{-\pi}^{\pi}\cos(my)\cos(ny)\,dy,
\]
via the change of variables \(y = \frac{\pi x}{L}\). But now we know how to do the integral at the right-hand side, because we’ve just learned that a couple of pages ago, and we know that integral is 0 if m ≠ n and it is π if m = n. It follows that
\[
\int_{-L}^{L}\cos\!\left(\frac{m\pi x}{L}\right)\cos\!\left(\frac{n\pi x}{L}\right)dx = \begin{cases} L & \text{if } m = n\\ 0 & \text{if } m \neq n.\end{cases}
\]
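And you can verify the rescaled orthogonality numerically for any L you like (same quadrature trick as before; again a sketch of mine, not the book’s):

```python
import numpy as np

def inner_L(m, n, L, num=4096):
    # integral of cos(m*pi*x/L) * cos(n*pi*x/L) over [-L, L], midpoint rule
    dx = 2 * L / num
    x = -L + dx * (np.arange(num) + 0.5)
    return np.sum(np.cos(m * np.pi * x / L) * np.cos(n * np.pi * x / L)) * dx

for L in [1.0, 2.5, 100.0]:  # a "totally arbitrary real number"
    assert abs(inner_L(3, 3, L) - L) < 1e-9 * L  # m == n: the integral is L
    assert abs(inner_L(3, 5, L)) < 1e-9 * L      # m != n: the integral is 0
```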
I am not going to go through all the algebra, because that would be quite repetitive, frankly, but you can do the same thing with \(\sin\left(\frac{m\pi x}{L}\right)\), etc., and of course you can also show that the integral of that times all the corresponding cosines between −L and L is zero, etc., and the bottom line is that the cosine and sine of \(\frac{m\pi x}{L}\) between −L and L have the same properties as the cosine and sine of mx between −π and π. (Where of course in all this m (and n, and k) always stand for generic integer numbers.) So we can rewrite Fourier’s theorem: any function f(x), defined for −L ≤ x ≤ L and/or periodic, with period 2L, outside that interval, and with f(L) = f(−L), can be written as a linear combination of the functions 1, \(\cos\left(\frac{\pi x}{L}\right)\), \(\sin\left(\frac{\pi x}{L}\right)\), \(\cos\left(\frac{2\pi x}{L}\right)\), \(\sin\left(\frac{2\pi x}{L}\right)\), ..., \(\cos\left(\frac{n\pi x}{L}\right)\), \(\sin\left(\frac{n\pi x}{L}\right)\), ..., i.e.,
\[
f(x) = \frac{A_0}{2} + \sum_{n=1}^{\infty}\left[A_n\cos\!\left(\frac{n\pi x}{L}\right) + B_n\sin\!\left(\frac{n\pi x}{L}\right)\right], \tag{N.59}
\]
where the coefficients An and Bn are given by
\[
A_n = \frac{1}{L}\int_{-L}^{L} f(x)\cos\!\left(\frac{n\pi x}{L}\right)dx, \qquad
B_n = \frac{1}{L}\int_{-L}^{L} f(x)\sin\!\left(\frac{n\pi x}{L}\right)dx. \tag{N.60}
\]
This is what people call a Fourier series (which is not quite the same thing as the Fourier transform, by the way, as we are about to see). One might say that Eq. (N.59) is the Fourier-series “expansion”, or Fourier expansion, of f(x), and An and Bn are the Fourier coefficients of f(x), etc. For what I need to do next, it’s preferable to switch to another, uhm, notation of the Fourier series, that would have been more difficult to derive from scratch—the way I did it via cosines and sines—but that is relatively easy to obtain now, and is more compact and will make the stuff that follows less cumbersome. For this, though, we are going to need complex numbers780 and the formula
\[
e^{iz} = \cos(z) + i\sin(z), \tag{N.61}
\]
where z could be any real number, and i is the imaginary unit. The first step (no complex numbers yet) consists of plugging (N.60) into (N.59), which gives
\[
\begin{aligned}
f(x) &= \frac{1}{2L}\int_{-L}^{L} f(\xi)\,d\xi
+ \sum_{n=1}^{\infty}\frac{1}{L}\left[\int_{-L}^{L} f(\xi)\cos\!\left(\frac{n\pi\xi}{L}\right)d\xi\,\cos\!\left(\frac{n\pi x}{L}\right)
+ \int_{-L}^{L} f(\xi)\sin\!\left(\frac{n\pi\xi}{L}\right)d\xi\,\sin\!\left(\frac{n\pi x}{L}\right)\right]\\
&= \frac{1}{2L}\int_{-L}^{L} f(\xi)\,d\xi
+ \frac{1}{L}\sum_{n=1}^{\infty}\int_{-L}^{L} d\xi\, f(\xi)\left[\cos\!\left(\frac{n\pi\xi}{L}\right)\cos\!\left(\frac{n\pi x}{L}\right)
+ \sin\!\left(\frac{n\pi\xi}{L}\right)\sin\!\left(\frac{n\pi x}{L}\right)\right];
\end{aligned}
\]
but trigonometry tells us that
\[
\sin(A)\sin(B) = \frac{\cos(A-B)-\cos(A+B)}{2},
\]
and the similar equality for the cosines—which see above in this same note—and if you plug both into the right-hand side of what we just wrote, you get
\[
f(x) = \frac{1}{2L}\int_{-L}^{L} f(\xi)\,d\xi
+ \frac{1}{L}\sum_{n=1}^{\infty}\int_{-L}^{L} d\xi\, f(\xi)\cos\!\left[\frac{n\pi(\xi-x)}{L}\right]. \tag{N.62}
\]
Now, here come the complex numbers. If you replace z with −z in (N.61), and sum the formula you get with (N.61) itself, you’ll find that
\[
\cos(z) = \frac{e^{iz}+e^{-iz}}{2}.
\]
Substitute that into (N.62), and
\[
\begin{aligned}
f(x) &= \frac{1}{2L}\int_{-L}^{L} f(\xi)\,d\xi
+ \frac{1}{2L}\sum_{n=1}^{\infty}\int_{-L}^{L} d\xi\, f(\xi)\left[e^{i\frac{n\pi(x-\xi)}{L}} + e^{-i\frac{n\pi(x-\xi)}{L}}\right]\\
&= \frac{1}{2L}\int_{-L}^{L} f(\xi)\,d\xi
+ \frac{1}{2L}\sum_{n=1}^{\infty}\int_{-L}^{L} d\xi\, f(\xi)\,e^{i\frac{n\pi(x-\xi)}{L}}
+ \frac{1}{2L}\sum_{n=1}^{\infty}\int_{-L}^{L} d\xi\, f(\xi)\,e^{-i\frac{n\pi(x-\xi)}{L}}\\
&= \frac{1}{2L}\int_{-L}^{L} f(\xi)\,d\xi
+ \frac{1}{2L}\sum_{n=1}^{\infty}\int_{-L}^{L} d\xi\, f(\xi)\,e^{i\frac{n\pi(x-\xi)}{L}}
+ \frac{1}{2L}\sum_{n=-\infty}^{-1}\int_{-L}^{L} d\xi\, f(\xi)\,e^{i\frac{n\pi(x-\xi)}{L}}\\
&= \frac{1}{2L}\sum_{n=-\infty}^{\infty}\int_{-L}^{L} d\xi\, f(\xi)\,e^{-i\frac{n\pi\xi}{L}}\,e^{i\frac{n\pi x}{L}},
\end{aligned}
\tag{N.63}
\]
where notice that the sum now starts at −∞, rather than 0: the idea is that \(e^{-i\frac{n\pi(x-\xi)}{L}}\) with positive n is the same as \(e^{i\frac{n\pi(x-\xi)}{L}}\) with negative n. The n = 0 term coincides with the integral of f (no exponentials, sines or cosines), because \(e^0 = 1\). Now if we call
\[
C_n = \frac{1}{2L}\int_{-L}^{L} f(\xi)\,e^{-\frac{in\pi\xi}{L}}\,d\xi, \tag{N.64}
\]
then (N.63) becomes
\[
f(x) = \sum_{n=-\infty}^{\infty} C_n\, e^{\frac{in\pi x}{L}}, \tag{N.65}
\]
which is the complex-number version of the Fourier series781 . You see how compact that is, compared to what we had before. The Fourier series (N.59) plus (N.60), or (N.64) plus (N.65), is more powerful than (N.57) plus (N.58), because L is arbitrary, while π is just π. Those of you who have already given some thought to signal processing and all that might realize that any signal that starts and ends with 0: no displacement, no sound—which is (approximately) true of your typical earthquake seismogram, for example—can be “expanded” in a Fourier series: just think of L as the half-duration of the seismogram, with the recording starting before the quake, and ending after the last surface wave has passed: silence. But we can make this even more general if we send L to ∞. We need to make the hypothesis that the integral at the right-hand side of (N.64) will “converge”, as they say, i.e., it will not become infinite, even if L −→ ∞. If you rearrange (N.64),
\[
2LC_n = \int_{-L}^{L} f(\xi)\,e^{-\frac{in\pi\xi}{L}}\,d\xi, \tag{N.66}
\]
this implies that Cn has to become infinitely small as L becomes infinitely large, in such a way that their product stays finite. As L grows indefinitely we also have that
\[
\lim_{L\to\infty}\frac{\pi}{L} = 0.
\]
It follows that each time the value of n changes by 1, the increment in \(\frac{n\pi}{L}\) is infinitesimal: which is the same as saying that \(\frac{n\pi}{L}\) is now a continuous variable: let’s call it ω. Then Cn, which is a “function” of n, is also a function of ω. Let’s denote F(ω) = 2LCn, so that (N.66) becomes
\[
F(\omega) = \int_{-\infty}^{\infty} f(\xi)\,e^{-i\omega\xi}\,d\xi, \tag{N.67}
\]
which, actually, is what people call the Fourier-transform formula, i.e., F(ω)—the “continuous” counterpart of the “discrete” Cn’s—is the Fourier transform of f(x). Now look at Eq. (N.65). If we sub \(\frac{F(\omega)}{2L}\), or, which is the same, \(\frac{F\left(\frac{n\pi}{L}\right)}{2L}\), for Cn, Eq. (N.65) becomes
\[
f(x) = \sum_{n=-\infty}^{\infty}\frac{F\!\left(\frac{n\pi}{L}\right)}{2L}\, e^{\frac{in\pi x}{L}}.
\]
Multiply and divide the right-hand side by \(\frac{\pi}{L}\), so that we can write
\[
f(x) = \sum_{n=-\infty}^{\infty}\frac{F\!\left(\frac{n\pi}{L}\right)}{2\pi}\, e^{\frac{in\pi x}{L}}\,\frac{\pi}{L}.
\]
Now, again, send L to ∞: replace \(\frac{n\pi}{L}\) by ω; replace the sum over n with an integral; realize that \(\frac{\pi}{L}\) is the increment in ω that we get if we increment n by one, which, as we go to the limit, is nothing but dω: so replace \(\frac{\pi}{L}\) with dω, and
\[
f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\,e^{i\omega x}\,d\omega. \tag{N.68}
\]
Equation (N.68) is the Fourier-transform equivalent of the Fourier-series formula (N.65). It is also called the inverse Fourier transform. Notice that (N.67) and (N.68) are the same equation except for the names of the functions and of the variables, the sign of the argument of the exponential, and the factor \(\frac{1}{2\pi}\) in front of the integral in (N.68). In real-world applications of the Fourier transform, x often stands for time, and Eq. (N.68) amounts to writing, e.g., a seismogram f(x) as the combination
of purely sinusoidal oscillations \(e^{i\omega x}\), each with its own frequency ω. Once you’ve calculated its transform F(ω), it’s relatively easy to filter it, which in the simplest and most brutal case means you just replace F(ω) with zero at frequencies ω that you are not interested in. (We’ll learn, later, e.g., that surface waves are always lower-frequency than body waves, so if you want to focus on the surface-wave component of a seismogram, you often filter out the higher frequencies, and vice versa.) You can inverse-Fourier-transform the filtered F(ω), and see how the filtered seismogram looks in time. Of course x doesn’t need to be time; it could be anything, in principle; it could be distance, and then ω would be a “spatial frequency”, etc. If you have a function of two variables, you can apply the Fourier transform twice, once per variable: which is how you Fourier-transform an image, for example. And again you can filter in the so-called “Fourier domain”, or “frequency domain”, and then inverse-transform back to regular two-dimensional space, etc. Another very useful property of the Fourier transform has to do with the derivative. If you differentiate both sides of (N.68) with respect to x, you get
\[
\frac{d}{dx}f(x) = \frac{1}{2\pi}\frac{d}{dx}\int_{-\infty}^{\infty} F(\omega)\,e^{i\omega x}\,d\omega
= \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\frac{d}{dx}e^{i\omega x}\,d\omega
= \frac{1}{2\pi}\int_{-\infty}^{\infty} i\omega\, F(\omega)\,e^{i\omega x}\,d\omega,
\]
which if you compare this to (N.68), you see that what this is saying is that iωF(ω) is the Fourier transform of the derivative of f. This could be very helpful when you are trying to solve a differential equation782—which, as you know, differential equations contain all sorts of derivatives. We’ll see this in action very soon.

292. In note 279 I told you all about how d’Alembert solved the guitar string problem. There’s something more about the string that I haven’t covered, there, yet, because I needed some of the Fourier-transform material for that. Now that Fourier is done, I can wrap up the string, too.
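(Before getting back to the string: the transform pair (N.67)–(N.68) of note 291 is easy to sanity-check numerically. The sketch below is my own; the Gaussian test function is just a convenient choice, because its transform is known in closed form—under the convention (N.67), the transform of \(e^{-x^2/2}\) is \(\sqrt{2\pi}\,e^{-\omega^2/2}\)—and the forward integral is approximated by brute-force quadrature:)

```python
import numpy as np

def fourier_transform(f, omega, x_max=20.0, num=20001):
    # F(omega) = integral of f(x) exp(-i omega x) dx, truncated to [-x_max, x_max]
    x = np.linspace(-x_max, x_max, num)
    dx = x[1] - x[0]
    return np.sum(f(x) * np.exp(-1j * omega * x)) * dx

f = lambda x: np.exp(-x**2 / 2)
for w in [0.0, 0.5, 1.0, 2.0]:
    F_num = fourier_transform(f, w)
    F_exact = np.sqrt(2 * np.pi) * np.exp(-w**2 / 2)
    assert abs(F_num - F_exact) < 1e-8
```

(In real applications nobody integrates like this, of course: one samples the signal and uses the fast Fourier transform, e.g. numpy.fft, which computes a discrete analogue of the coefficients (N.64) very efficiently; the sketch is only meant to make the definition concrete.)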
If you haven’t read note 291, though, please go back and read it; and if you haven’t read note 279, you can still go ahead, but it won’t be equally interesting, I think. So, anyway, a funny thing about the wave equation is that there are various ways to solve it, and the story goes that, in the old days, it was solved by both d’Alembert and Daniel Bernoulli, and Bernoulli solved it with a totally different approach and his results and those of d’Alembert didn’t look like they were saying the same thing. Which led to much controversy, starting in the mid-1700s and until, basically, the time of Fourier. Bernoulli figured that equation (N.45) of note 279, which is a PDE, like I said, and as such not very attractive, could be reduced to a pair of ODEs that are straightforward to solve. The way you do it is you make the hypothesis that the
solution to the PDE can be written as the product of a function that depends only on distance x, times another function that depends only on time t,
\[
u(x,t) = \xi(x)\,\sigma(t).
\]
Then you sub this into (N.45),
\[
c^2\,\ddot{\xi}(x)\,\sigma(t) = \xi(x)\,\ddot{\sigma}(t),
\]
and if you divide both sides of this by the product ξ(x)σ(t),
\[
c^2\,\frac{\ddot{\xi}(x)}{\xi(x)} = \frac{\ddot{\sigma}(t)}{\sigma(t)}. \tag{N.69}
\]
(Each dot above σ or ξ means: differentiate once with respect to the only variable that σ or ξ is a function of. It’s just to save space.) Now, the left-hand side of (N.69) depends only on x—it won’t change if we change t; the right-hand side of (N.69) depends only on t. Equation (N.69) says that left- and right-hand side must coincide for all values of x and t: but if its left-hand side is constant with respect to t, and its right-hand side constant with respect to x, the only way for this to happen is that both left- and right-hand side are constant with respect to both x and t—and equal to the same constant, which we are going to call, I don’t know, λ. Formally, then,
\[
c^2\,\frac{\ddot{\xi}(x)}{\xi(x)} = \lambda \qquad\text{and}\qquad \frac{\ddot{\sigma}(t)}{\sigma(t)} = \lambda,
\]
or
\[
\ddot{\xi}(x) = \frac{\lambda}{c^2}\,\xi(x), \qquad \ddot{\sigma}(t) = \lambda\,\sigma(t).
\]
Which now what we have are two ODEs, which are actually the same ODE, which is actually an ODE that we’ve solved before. So, success. This became a standard method, by the way, that occasionally works to solve some PDEs, and people call it solution by separation of variables. The general solution of a second-order, homogeneous ODE is the linear combination of two linearly independent solutions, so
\[
\xi(x) = A\cos\!\left(\frac{\sqrt{-\lambda}}{c}\,x\right) + B\sin\!\left(\frac{\sqrt{-\lambda}}{c}\,x\right)
\]
and
\[
\sigma(t) = A'\cos\!\left(\sqrt{-\lambda}\,t\right) + B'\sin\!\left(\sqrt{-\lambda}\,t\right),
\]
where A, B, A′, B′ are four independent, arbitrary constants. Until we prescribe some boundary and/or initial conditions, that is. Again, the guitar string’s endpoints must be fixed, so we still have the boundary conditions (N.51). Because they are independent of t, in Bernoulli’s approach they might as well be reduced to ξ(0) = 0 and ξ(L) = 0. From ξ(0) = 0 it immediately follows that A = 0. From ξ(L) = 0, it follows that
\[
B\sin\!\left(\frac{\sqrt{-\lambda}}{c}\,L\right) = 0, \tag{N.70}
\]
and so
\[
\frac{\sqrt{-\lambda}}{c}\,L = k\pi \quad (k = 1, 2, 3, \dots),
\]
or
\[
\lambda = \lambda_k = -\left(\frac{k\pi c}{L}\right)^2 \quad (k = 1, 2, 3, \dots),
\]
which if we sub this back into the expression for ξ(x) (remember A = 0), we get a discrete, infinite set of solutions
\[
\xi_k(x) = B\sin\!\left(\frac{k\pi}{L}\,x\right) \quad (k = 1, 2, 3, \dots),
\]
that verify the boundary conditions. We must also substitute the same thing into the expression we got for σ(t), which, as a result, is replaced by
\[
\sigma_k(t) = A'\cos\!\left(\frac{k\pi c}{L}\,t\right) + B'\sin\!\left(\frac{k\pi c}{L}\,t\right) \quad (k = 1, 2, 3, \dots).
\]
Then, all the products
\[
u_k(x,t) = \xi_k(x)\,\sigma_k(t)
= \alpha\cos\!\left(\frac{k\pi c}{L}\,t\right)\sin\!\left(\frac{k\pi}{L}\,x\right)
+ \beta\sin\!\left(\frac{k\pi c}{L}\,t\right)\sin\!\left(\frac{k\pi}{L}\,x\right) \quad (k = 1, 2, 3, \dots),
\]
where α = A′B and β = B′B, are Bernoulli’s solutions of the guitar string equation with fixed endpoints. But then (remember note 46) all their linear combinations,
\[
u(x,t) = \sum_{k=1}^{\infty}\left[\alpha_k\cos\!\left(\frac{k\pi c}{L}\,t\right)\sin\!\left(\frac{k\pi}{L}\,x\right)
+ \beta_k\sin\!\left(\frac{k\pi c}{L}\,t\right)\sin\!\left(\frac{k\pi}{L}\,x\right)\right], \tag{N.71}
\]
are solutions as well783. We should also prescribe to Bernoulli’s solution (N.71) the same initial conditions like d’Alembert did—i.e., that the initial displacement coincide with some function u0(x), and the initial velocity with another function v0(x); see note 279. The displacement condition, substituting t = 0 in (N.71), reads
\[
\sum_{k=1}^{\infty}\alpha_k\sin\!\left(\frac{k\pi}{L}\,x\right) = u_0(x),
\]
and the velocity condition, after differentiating (N.71) against time, and then subbing t = 0,
\[
\sum_{k=1}^{\infty}\frac{k\pi c}{L}\,\beta_k\sin\!\left(\frac{k\pi}{L}\,x\right) = v_0(x).
\]
If you multiply both sides of both equations with sin(nπx/L), then integrate over x from 0 to L,
\[
\int_0^L\sum_{k=1}^{\infty}\alpha_k\sin\!\left(\frac{k\pi}{L}\,x\right)\sin\!\left(\frac{n\pi}{L}\,x\right)dx = \int_0^L u_0(x)\sin\!\left(\frac{n\pi}{L}\,x\right)dx,
\]
and
\[
\int_0^L\sum_{k=1}^{\infty}\frac{k\pi c}{L}\,\beta_k\sin\!\left(\frac{k\pi}{L}\,x\right)\sin\!\left(\frac{n\pi}{L}\,x\right)dx = \int_0^L v_0(x)\sin\!\left(\frac{n\pi}{L}\,x\right)dx.
\]
If you let me play my usual games with sums and integrals, etc., this becomes
\[
\sum_{k=1}^{\infty}\alpha_k\int_0^L\sin\!\left(\frac{k\pi}{L}\,x\right)\sin\!\left(\frac{n\pi}{L}\,x\right)dx = \int_0^L u_0(x)\sin\!\left(\frac{n\pi}{L}\,x\right)dx,
\]
and
\[
\sum_{k=1}^{\infty}\frac{k\pi c}{L}\,\beta_k\int_0^L\sin\!\left(\frac{k\pi}{L}\,x\right)\sin\!\left(\frac{n\pi}{L}\,x\right)dx = \int_0^L v_0(x)\sin\!\left(\frac{n\pi}{L}\,x\right)dx.
\]
The integral on the left-hand side of both equations is easy to do, if you’ve read note 291: because the function we have to integrate—the product of two sines of something times x, i.e., the product of two odd functions—is even: so its integral between 0 and L is half its integral between −L and L. In note 291
Fig. N.20 Top: the cosine term of the first overtone (k = 2), i.e., \(\cos\left(\frac{2\pi c}{L}t\right)\sin\left(\frac{2\pi}{L}x\right)\), at four different times t, as per the legend—T being the overtone’s period, \(T = \frac{2L}{kc}\). (The string is 1 m long, weighs 0.3 grams per meter and a tension T = 60 N is applied to it: then the period is about 2 ms, which means a frequency of about 447 Hz—not quite an A.) Bottom: the first four modes of the same string at t = 0.
we actually worked out the integral between −L and L, and saw that it is equal to zero if k ≠ n, and equal to L if k = n. Bottom line, you can replace both integrals at the left-hand sides with \(\frac{L}{2}\delta_{kn}\). Then
\[
\alpha_n = \frac{2}{L}\int_0^L u_0(x)\sin\!\left(\frac{n\pi}{L}\,x\right)dx \tag{N.72}
\]
and
\[
\beta_n = \frac{2}{n\pi c}\int_0^L v_0(x)\sin\!\left(\frac{n\pi}{L}\,x\right)dx, \tag{N.73}
\]
which means that if u0(x) and v0(x) are prescribed, if they are known, then we have a way to calculate all the coefficients αk, βk in (N.71), and there is no more arbitrariness in the solution. It’s worth it, now, to have a closer look at (N.71), and I made Fig. N.20 to help with that. The factor \(\sin\left(\frac{k\pi}{L}x\right)\) describes how each of Bernoulli’s solutions depends on distance along the string. That’s called a “normal mode”, or just “mode”, or “mode of oscillation”784; it’s associated with the frequency \(\frac{k\pi c}{L}\) of the time-dependent factors in uk, i.e., you might say that mode k oscillates at frequency \(\frac{k\pi c}{L}\): which is what the top plot in Fig. N.20 tries to show. We could see this, in practice, if we could manage to have u0(x) coincide with one of the
modes, that is, \(u_0(x) = \sin\left(\frac{k\pi}{L}x\right)\) for some integer value of k, and v0(x) = 0, which then you see from Eqs. (N.72) and (N.73) that only αk is nonzero, and all the other terms in (N.71) vanish: so the string behaves exactly like mode k. For each mode (except for the one with k = 1) there are some points along the string where the displacement is 0 at all times t. If, for instance, k = 2 and x = L/2, then
\[
\sin\!\left(\frac{k\pi}{L}\,x\right) = \sin\!\left(\frac{2\pi}{L}\,\frac{L}{2}\right) = \sin(\pi) = 0.
\]
In general, \(\sin\left(\frac{k\pi}{L}x\right) = 0\) when \(x = \frac{L}{k}, \frac{2L}{k}, \frac{3L}{k}, \dots\), i.e., when \(x = \frac{jL}{k}\), with 0 ≤ j ≤ k. If the string happens to oscillate according to just mode k, then, whatever the value of t, those points never move. They are called “nodes” of mode k. By the way, people don’t really speak of “mode k”, but rather of the fundamental mode, which is the k = 1 mode, and then, starting with k = 2, of the first, second, third, etc., overtone. You see that the fundamental mode is the gravest one, and then frequency grows with the overtone number. Namely, the string’s fundamental frequency is \(\frac{\pi c}{L}\), and the overtone frequencies are \(\frac{2\pi c}{L}, \frac{3\pi c}{L}, \frac{4\pi c}{L}, \dots\), i.e., all integer multiples of the fundamental one785. I hope it’s not too difficult to see that the waves we saw earlier, in Figs. 6.5 and 6.6, etc., are not the same thing as the oscillations in Fig. N.20. In fact, people like to refer to the former as traveling waves, and to the latter—the normal modes—as standing waves, to emphasize their differences. The most conspicuous one perhaps being that traveling waves can’t possibly have nodes. When Bernoulli came up with standing waves, and d’Alembert with traveling waves, and both claimed to have found the correct solution of the string problem, the feeling was that one of them must have done something wrong. But instead, they were both right. The controversy was settled post-Fourier, and let me briefly show you how.
I’ll only do it in the simpler case where there’s no initial velocity: you just pull the string, hold it, and let it go, v0 = 0. Then, d’Alembert’s formula (N.52) of note 279 collapses to
\[
u(x,t) = \frac{1}{2}\left[U_0(x+ct) + U_0(x-ct)\right]; \tag{N.74}
\]
and (N.73) with v0(x) = 0 gives βn = 0 for all n, and Bernoulli’s (N.71) becomes
\[
u(x,t) = \sum_{k=1}^{\infty}\alpha_k\cos\!\left(\frac{k\pi c}{L}\,t\right)\sin\!\left(\frac{k\pi}{L}\,x\right). \tag{N.75}
\]
Which, OK, these two expressions don’t seem to be saying the same thing. But, we have Fourier’s theorem now, note 291, which Bernoulli and d’Alembert didn’t have, and Fourier’s theorem says that the function U0(z), being periodic with period equal to 2L (with period equal to L, actually, but if you’re periodic with period L then you’re also periodic with period 2L), can be written786
\[
U_0(z) = \sum_{n=0}^{\infty}\left[a_n\cos\!\left(\frac{n\pi z}{L}\right) + b_n\sin\!\left(\frac{n\pi z}{L}\right)\right],
\]
for some real coefficients an, bn. Because of the boundary conditions, we discovered in note 279 that U0(z) also needs to be odd—the odd extension of u0(z). It can be shown that, the cosine being an even function, the cosine-coefficients of the Fourier series of an odd function are all zero: an = 0 for all n. And so
\[
U_0(z) = \sum_{n=1}^{\infty} b_n\sin\!\left(\frac{n\pi z}{L}\right)
\]
(where the n = 0 term is zero (sine of zero...) and so the sum might as well start at n = 1). And but so then
\[
U_0(x \pm ct) = \sum_{n=1}^{\infty} b_n\sin\!\left[\frac{n\pi}{L}(x \pm ct)\right]. \tag{N.76}
\]
Now, trigonometry review:
\[
\sin(\alpha+\beta) = \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta)
\]
and
\[
\sin(\alpha-\beta) = \sin(\alpha)\cos(\beta) - \cos(\alpha)\sin(\beta),
\]
for whatever α and β. It follows that
\[
\sin\!\left[\frac{n\pi}{L}(x+ct)\right] = \sin\!\left(\frac{n\pi x}{L}\right)\cos\!\left(\frac{n\pi ct}{L}\right) + \cos\!\left(\frac{n\pi x}{L}\right)\sin\!\left(\frac{n\pi ct}{L}\right) \tag{N.77}
\]
and
\[
\sin\!\left[\frac{n\pi}{L}(x-ct)\right] = \sin\!\left(\frac{n\pi x}{L}\right)\cos\!\left(\frac{n\pi ct}{L}\right) - \cos\!\left(\frac{n\pi x}{L}\right)\sin\!\left(\frac{n\pi ct}{L}\right), \tag{N.78}
\]
which is already, like, half the solution of the Bernoulli versus d’Alembert thing: because what this is saying is that any sinusoidal traveling wave is the sum of two sinusoidal standing waves. (And if we sum and/or subtract Eqs. (N.77) and (N.78) to/from one another, we immediately find that each standing wave, each
mode of the string can be written as the combination of two traveling waves, propagating along the string in opposite directions.) Substituting into (N.76), we get
\[
U_0(x+ct) = \sum_{n=1}^{\infty} b_n\left[\sin\!\left(\frac{n\pi x}{L}\right)\cos\!\left(\frac{n\pi ct}{L}\right) + \cos\!\left(\frac{n\pi x}{L}\right)\sin\!\left(\frac{n\pi ct}{L}\right)\right]
\]
and
\[
U_0(x-ct) = \sum_{n=1}^{\infty} b_n\left[\sin\!\left(\frac{n\pi x}{L}\right)\cos\!\left(\frac{n\pi ct}{L}\right) - \cos\!\left(\frac{n\pi x}{L}\right)\sin\!\left(\frac{n\pi ct}{L}\right)\right],
\]
and but so then it turns out that the d’Alembert solution (N.74) can also be written
\[
\frac{1}{2}\left[U_0(x+ct) + U_0(x-ct)\right] = \sum_{n=1}^{\infty} b_n\sin\!\left(\frac{n\pi x}{L}\right)\cos\!\left(\frac{n\pi ct}{L}\right),
\]
which is Bernoulli’s solution (N.75), if you just pick αn = bn. (The case with nonzero initial velocity should be the same thing, only heavier. You write the term \(\frac{1}{2c}\int_{x-ct}^{x+ct} dz\, V_0(z)\) in (N.52) as its own primitive, whatever it is, evaluated at x + ct minus the same thing at x − ct. Then you replace that, too, with its Fourier series. And then it should all look a lot like what we just did, except we end up with two terms, but that’s fine, because we need to equate our result not to Bernoulli’s (N.75), but to Bernoulli’s (N.71), with both sines and cosines.) So, you see, traveling waves in a string with fixed endpoints are really just combinations of normal modes. This wouldn’t make any sense if the string weren’t finite, with some kind of boundary conditions prescribed at its endpoints; but, in general, “boundary value” problems are solved by discrete sets of normal modes, each with its own frequency. Engineers study the normal modes of buildings, seismologists study the normal modes of the earth, and so on and so forth. And yes, any seismic disturbance propagating across the earth can always be thought of and, in principle, written as a linear combination of modes.

293. If you look it up, you’ll see that there are a number of ways to write this—you could use a series versus an integral, complex numbers or only real numbers, the combination of the sine and the cosine of an imaginary number, or its exponential. And things could be normalized against different constants. Be prepared for ambiguity and chaos, and/or see note 291, where I tried to sort it all out, as much as I could.

294. There are, of course, good reasons to pick g to be a sine, and h a cosine. If this was 1850 or so, and we’d be doing this for the first time ever, we might
actually try two cosines first, or two sines, or whatever: to discover eventually that the combination in (6.77) and (6.78) is the one that works.

295. If you are already familiar with Euler’s formula, Eq. (N.61) of note 291—which I am not saying you necessarily should be, yet, but in case you are—compare it with (6.96) and (6.97): and you should see right away why the hyperbolic cosine and sine are called the way they are.

296. If you are confused, remember that we are at x1 = 0, and consider that cos(−ωt) = cos(ωt) by the properties of the cosine.

297. Equation (6.107) also says that the speed of a tsunami wave depends on H, the depth of the ocean: the value of tanh grows with the value of its argument, so, then, the speed of a tsunami always grows with H. This has some important consequences. Mallet (page 76 of his paper) mentions that, when in the open sea, the “slope” of a tsunami wave is “so gentle, that it might even pass under a ship without being noticed.” But when it “leaves the open sea, [...] and advances into the shallow water [...] near the shore, its front slope becomes short and steep, and its rear slope long and gentle”. Mallet doesn’t explain why that is the case, but I guess one can see it as a simple consequence of energy conservation. Which, let me explain what I mean by that. A wave is generated by some sort of disturbance: a displacement, let’s say, whose size can be measured in terms of kinetic energy. The larger the energy carried by the wave, the larger the amplitude, of course. Now, saying that the wave propagates is the same as saying that that energy gets transmitted, from the initial location of the disturbance to places around it, and from those places to places around them, and so on (see note 248). In all this process, no additional energy is injected into the medium, and so the total energy carried by the wave at any moment of time equals the energy accounted for by the initial disturbance.
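To put rough numbers on the speed-versus-depth statement above: the sketch below assumes the standard water-wave phase speed \(c = \sqrt{(g/k)\tanh(kH)}\) (Eq. (6.107) itself lives in the main text, so I’m taking this form as given here), and checks that, for a long wave, c stays close to the shallow-water value \(\sqrt{gH}\) and drops as the depth shrinks:

```python
import math

def phase_speed(wavelength_m, depth_m, g=9.81):
    # c = sqrt((g/k) * tanh(k*H)), with wavenumber k = 2*pi/wavelength
    k = 2 * math.pi / wavelength_m
    return math.sqrt((g / k) * math.tanh(k * depth_m))

# a tsunami with a ~100 km wavelength, over a shoaling ocean floor
for depth in [4000.0, 1000.0, 100.0, 10.0]:
    c = phase_speed(100e3, depth)
    # kH is small at these depths, so c is close to the long-wave limit sqrt(g*H)
    assert abs(c - math.sqrt(9.81 * depth)) / c < 0.05
# and speed grows monotonically with depth, as the note says
assert phase_speed(100e3, 10.0) < phase_speed(100e3, 4000.0)
```

(So a 4-km-deep ocean carries the wave at roughly 200 m/s, while a few meters of water carry it at walking-to-running speed: the slowdown near shore that the energy argument below turns into amplitude growth.)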
And/or, the energy carried by our tsunami wave is always the same, along its entire journey. (OK, some of it is dissipated, actually, as water isn’t perfectly elastic—nothing is. So, the total energy carried by the wave actually goes down a bit, as the wave propagates on.) Now, if something suddenly slows down the wave, that fixed amount of energy gets concentrated, over a given amount of time, within a smaller space: which means bigger displacements. Think, oh, I don’t know, think of a whole lot of people running from a large open space toward a small gate. They’re forced to slow down, as they queue to get in, and so, near the gate, you get many more people per unit area than away from it. The same with the energy/amplitude of a wave, as the wave propagates from an area where it can travel fast, to one where it is slower. What happens with the tsunami wave, like I said, is that its speed grows with how deep the ocean is. Near the shore, the ocean quickly goes from 3 or 4 km deep to, like, a few meters? which means speed goes down quite a bit, and amplitude, which was negligible in the open sea, now grows quickly and dangerously. Which is why tsunamis can cause a great deal of damage.

298. “We are called upon to chronicle the death, at Leipzig, of Prof. Ernst Heinrich Weber, whose name is so closely united with the fundamental principles of modern optics and acoustics. He was born at Wittenberg, June 24, 1795, and
after having studied at the university of that city received, in 1815, the degree of M.D. Two years later he published a short work on the anatomy of the sympathetic nerves, which brought his name at once into prominence. The following year he was appointed extraordinary professor of anatomy at the University of Leipzig, and in 1821 he became ordinary professor of human anatomy. [...] Science is, however, chiefly indebted to Prof. [Ernst Heinrich] Weber for the classical researches carried out by him and his brother Wilhelm Eduard while still young men, on which is grounded the celebrated wave theory. The work in which their investigations are recorded—Die Wellenlehre auf Experimente gegründet (1825)—is a remarkable relation of the most delicate and ingenious observations ever undertaken to establish a series of physical laws. Among the most notable of these might be mentioned the experiments on waves of water [...] by means of which they found that the particles near the surface move in circular paths, while those deeper in the liquid describe ellipses, the horizontal axes of which are longer than the vertical. [...] These and a multitude of other facts, studied and elaborated in the most scrupulous and conscientious manner, form the basis for the whole theoretical structure accepted at present as explanatory of the phenomena of light and sound. So thoroughly and scientifically were these researches carried out that subsequent physicists have never been called upon to correct them.” (Nature, February 7, 1878. Article not signed.) 299. I have found in the Journal des Mines, vol.
17, year XIII of the French Republic, i.e., 1804, a summary, signed Laplace and Haüy, of Jean-Henri Hassenfratz and Joseph Gay-Lussac’s experiments sur la propagation du son (on the propagation of sound): “the experiments we are about to report on were carried out in quarries located below Paris; as they can easily be repeated in the interior of mines, and as it is possible that they are of a nature to lead to results which could become, in certain circumstances, useful to the very art of mine exploitation, we thought that our readers would be grateful if we made known to them the following report, which two illustrious scholars made to the National Institute, on the experiments we want to talk about.” And actually this is very right: understanding and measuring seismic velocity would turn out to be extremely useful to the art of mine exploitation. “Physicists have long recognized that air is not the only medium capable of transmitting sound. [...] But little attention had been paid until now to comparing the speed of sound propagated through the intermediary of air, with that which occurs through other bodies [...] having descended into one of the quarries situated below Paris787, [Hassenfratz] instructed someone to strike with a hammer against a mass of stone forming the wall of one of the galleries cut into these quarries. In the meantime, he would gradually move away from the point where the stone was being struck, pressing one ear against the mass of stone. Soon he could distinguish two sounds, one transmitted by the stone and the other by the air. The first, which faded much more rapidly as the observer moved further away, ceased to be heard at a distance of 134 paces788, and the second, for which the air served as a vehicle, only died out at a distance of
400 paces. Moreover, the sound transmitted by the stone reached the ear much more quickly than that propagated through the air.” 300. In Select Reviews, and Spirit of the Foreign Magazines, by E. Bronson and others, 1810, there’s a review of the “Experiments on the Propagation of Sound through solid bodies and through Air in very long Tubes, published by Monsieur Biot in the Mémoires de la Société d’Arcueil”. Here goes: “it is well known that air is not essential to the propagation of sound, which can be transmitted through any elastick medium, solid, liquid or gaseous. The celerity of its flight is also much greater in the denser substances. This fact has been ascertained in Denmark and England, by direct experiments on the sound conducted through beams of wood and stretched wires; through water and sheets of ice. It was very conspicuous in the observation made by Hassenfratz in the subterranean quarries extending under the site of Paris. The ingenious Chladni proposed to determine the relative swiftness of transmission through a solid body, merely from the note which a rod of the given materials yields when excited into a tremor by friction. “M. Biot, whose attention is ever alert, has seized the occasion of some considerable improvements now going forward in the capital of France, to repeat similar experiments with great precision. The pipes intended to convey water to that metropolis consist of cylinders of cast iron, each eight feet three inches in length: the joints are secured by a collar of lead nearly half an inch thick, covered with pitched cotton rag, and strongly compressed by screws. Into one end of the compound pipe was inserted an iron hoop, holding a bell with a clapper, and at the other end, the observer was stationed. In these observations, M. Biot was occasionally assisted by M. Bouvard or M. Malus, colonel of artillery, and by Martin, a chronometer maker.
On striking the clapper at once against the bell and the inside of the tube, two distinct sounds were heard at the remote extremity, the one sent through the iron, and the other conducted along the air. The interval between those sounds was measured by a chronometer that marked half seconds. In the first experiment, the pipe consisted of 78 pieces; its length, exclusive of the lead rings was 647 feet; and the interval between the two sounds was ascertained, from a mean of fifty trials, to be 0.542”. But the ordinary propagation of sound through the atmosphere would, at that temperature, have required 0.579”; and consequently the difference, 0.037”, must give the time of transmission through the metallick tube. In another experiment, the assemblage of pipes, including the leaden joints extended to 2550 feet, or nearly half a mile; and on a medium of 200 trials the two sounds were heard at the interval of 2.79 s... [...]. [Biot] concludes [...] that sound is transmitted ten or twelve times faster through cast iron than through the atmosphere. [the former being ∼3.3 km/s.] “[...] Perhaps cast iron is more languid in its tremors than the purer, malleable iron. Chladni had assigned the celerity of vibration through iron and glass at 17500 feet in a second [∼5.3 km/s]; and Leslie had shown, in one of the curious notes annexed to his book on heat, that through a fir board, the velocity of impulsion, which he proved to be the same as that of vibration, is 17300
Fig. N.21 Infinitesimal segment of a thin rod. Pressure, either positive or negative, is applied to the shaded surfaces, perpendicular to the x1 axis
feet in a second. We wish that some experiments, on a large scale, were made on the time of transmission of sound through water.” [The work of Colladon comes later, in 1826. The experiments of the brothers Weber in 1825.] 301. Asimov says that Thomas Young, “the son of a Quaker banker, was an infant prodigy who could read at two and who had worked his way twice through the Bible at six. During his youth, he learned a dozen languages including not only Greek, Latin, and Hebrew, but also Arabic, Persian, Turkish, and Ethiopian. He could also play a variety of musical instruments, including the bagpipes.” Young studied medicine in Edinburgh (where his chemistry prof. was Joseph Black, which see Chap. 4) and Göttingen, and eventually opened a medical practice in London. Apparently, he wasn’t very successful as a medical doctor; but through his studies he understood quite a few new things about how the eye works. From there he went on to study light: and he got involved with the debate over whether light is a wave, or a substance. In Young’s time, the theory of light and of elastic wave propagation were still one and the same thing: light was thought to be an elastic wave. Young’s contribution, as a result, is relevant to acoustics and seismology just as it is relevant to optics. 302. Negative if the change in pressure is positive, and vice versa—and so the minus sign is there, in front of δp, because Young, or whoever first invented what was to be called Young’s modulus, wanted it to be positive, I guess. 303. Take the x1 axis to be aligned with the rod. We assume the rod to be deformed—pushed, or pulled—only along the x1 direction. Call u(x1, t) the deformation—a displacement in the x1 direction, either positive or negative. Consider an infinitesimal segment, length δx1, of the rod. Expand or compress that particular segment. By definition of E, the extra pressure acting on the segment, as a result of deformation, from the left (look at Fig.
N.21), is δ p(x1 , t) = −E δu(x1 , t)/δx1 . Here δu(x1 , t) is the difference in displacement across x1 , i.e., how much one side of the section at x1 is moving, compared to the other side: if you divide that by the initial length δx1 of the segment, you get a relative deformation, which is what matters in the definition of E. (And it’s
OK that the sign should be negative, because if u > 0, i.e., if the displacement points to the right, then δp must be pushing in the positive x1 direction, and this tends to reduce the length of the segment.) Likewise, the extra pressure from the right is δp(x1 + δx1, t) = E δu(x1 + δx1, t)/δx1. (And the sign has changed, because now positive u means expansion of the segment.) Then, the force balance (Newton’s second) reads

$$\rho\,\delta x_1\,\delta x_2\,\delta x_3\,\frac{\partial^2 u(x_1,t)}{\partial t^2} = \delta p(x_1+\delta x_1,t)\,\delta x_2\,\delta x_3 + \delta p(x_1,t)\,\delta x_2\,\delta x_3 = E\,\delta x_2\,\delta x_3\,\frac{\delta u(x_1+\delta x_1,t)}{\delta x_1} - E\,\delta x_2\,\delta x_3\,\frac{\delta u(x_1,t)}{\delta x_1},$$

where ρ is mass density, as usual, and ρ δx1 δx2 δx3 is the mass of the segment. If we divide both sides by that,

$$\frac{\partial^2 u(x_1,t)}{\partial t^2} = \frac{E}{\rho}\,\frac{1}{\delta x_1}\left[\frac{\delta u(x_1+\delta x_1,t)}{\delta x_1} - \frac{\delta u(x_1,t)}{\delta x_1}\right] = \frac{E}{\rho}\,\frac{\partial^2}{\partial x_1^2}\,u(x_1,t), \tag{N.79}$$

which is fine, because δx1 can be as small as we want it to be. Compare (N.79) with (N.45), the equation of the elastic string (note 279): they are the exact same equation—the one-dimensional wave equation—with the ratio of tension to density in (N.45) replaced by E/ρ in (N.79). Tension over density is the squared speed of elastic wave propagation in the string: bottom line, √(E/ρ) is the speed at which deformation waves propagate along the bar, QED. 304. Which is just a very much simplified instance, because we are just looking at the “travel time”, and not at the form of the wave (which also changes through propagation because different frequencies might propagate at different speeds, and because of reflections, etc., but we’ll learn about all that later), of the concept that I threw at you at the beginning of this chapter, i.e., that music sounds different depending on where you sit in the concert hall, and so do earthquakes: so from measurements of quakes you can also determine what the venue is like. 305. Aether was invisible, immeasurable etc., the only way we knew or thought we knew about it was because we had no better way of explaining the propagation of light. People were convinced for a long time that aether actually exists, though. If you are curious to see how it was eventually determined that it doesn’t, this is not directly related to geophysics, but look up “Michelson–Morley experiment”. 306. And it’s not trivial at all to find out how their effect can be expressed mathematically. It generally boils down to an exponential function, decaying with growing
time and/or distance from the epicenter, that multiplies the wave functions that we’ve seen above... but at this point it doesn’t matter. 307. Notice here the ambiguity in the way the word “wave” is used. For us wave means energy that propagates: matter is not transferred around. If you are hit by a quake, you’ll be rocked around for a minute or two then find yourself more or less in the same place where you were before, while at the same time the wave that hit you kept advancing and by now is, like, 50 km away. Michell—see above—thinks of the main earth wave as an actual translation of matter: the propagation of energy while matter oscillates in place is what Michell calls a tremulous motion. In the quote I’ve just copied, it is not clear whether Mallet understood what Michell meant; in any case he’s right both ways, as chances are (were) that both the velocity at which an elastic wave would propagate through viscous lava, and the velocity at which the lava itself could be moved around, are very low: Mallet and everybody at his time had already seen (or heard about) how slowly lava moves when it flows out of a volcano. 308. See Chaps. 4 and 5. 309. In case you are wondering how seismometers work: the best way to record ground motion would be to have a mass that sort of floats in a fixed point in space, unattached to the earth or to anything else, unaffected by gravity. Attach something that writes—the “stylus”—to it, and have it write on a roll of paper, stuck to a rotating drum. Drum and paper, unlike the “floating” mass and the stylus, are attached to the earth’s surface. When the ground oscillates the drum will oscillate, too, but the floating mass won’t; the stylus will record the motion of the floating mass relative to the earth—or, which is the same, the motion of the earth. Of course this is impossible: no way you can have such a fixed, gravity-free mass. 
And but on the other hand, if the mass were rigidly attached to the earth’s surface, you wouldn’t record any motion at all—mass and pen would move together with the earth, so their relative displacement would be zero. A simple way around this is to connect the mass to the earth through a spring and/or a damper: see Fig. N.22. Call x(t) the spring’s deformation, which is the same as the displacement of the mass (m) with respect to the ground (the spring is attached on one side to the ground and on the other to the mass), and u(t) the displacement of the ground in the same direction (to fix ideas, think the vertical component). Of course, x and u aren’t the same thing. The total displacement of the mass is x + u, and the force balance reads

$$m\,\frac{d^2}{dt^2}\left[x(t) + u(t)\right] = -K\,x(t) - D\,\frac{d}{dt}x(t), \tag{N.80}$$
where K is the spring’s elastic constant (if you don’t know or don’t remember what that is, don’t worry too much: we’ll look into it shortly) and D the damper’s, uhm, resistance (it will be a little while before we do dampers, and “viscosity”—but basically it works more or less like friction). It is OK to neglect gravity, which has a tiny effect compared to spring and/or damper. Of course, if
Fig. N.22 Simplified diagram of a one-component seismometer. True seismometers usually have three components—vertical, east-west, north-south. Here we only see the vertical one. K and D denote spring and damper, respectively, and m means “mass”
there’s no spring, the equation is still valid, with K = 0; and likewise, D = 0 if there’s no damper. There are all sorts of seismometers on the market. You might want to rearrange (N.80),

$$\frac{d^2}{dt^2}x(t) + \frac{D}{m}\,\frac{d}{dt}x(t) + \frac{K}{m}\,x(t) = -\frac{d^2}{dt^2}u(t), \tag{N.81}$$

to see more clearly that what we have is a second-order, linear, nonhomogeneous ODE, with x(t) as the unknown function and the ground acceleration ∂²u(t)/∂t² as a “forcing term”. In practice, x is what gets recorded on the rotating drum, while what we really want to measure is u—the ground’s displacement. What we need to do, then, is solve (N.81): which means, in practice, find a relation between x and u, through which we can then calculate u from x. The simplest way to achieve this is via the Fourier transformation (note 291) of both sides of (N.81), which gives

$$-\omega^2\,x(\omega) + i\omega\,\frac{D}{m}\,x(\omega) + \frac{K}{m}\,x(\omega) = \omega^2\,u(\omega),$$

where ω is frequency, x(ω) is the Fourier transform of x(t), and u(ω) that of u(t). It follows that

$$u(\omega) = \left(\frac{K}{m\omega^2} + i\,\frac{D}{m\omega} - 1\right) x(\omega),$$

i.e., in the “frequency domain”, translating a raw seismogram to ground displacement amounts to a simple algebraic operation (to be implemented numerically
Fig. N.23 Tetrahedron with three right angles, like that in Fig. 6.11. The origin of the reference frame is chosen to be at O, and the Cartesian axes are parallel with three of the tetrahedron’s four edges. All angles at O are right angles
at each sampled frequency ω). And then you can inverse-Fourier-transform the result to go back to the time domain. 310. I am just going to do it for the case of tetrahedra like that in Fig. 6.11, of which three out of four sides are right triangles. That’s easier, and that’s what’s relevant to us anyway. So then (look at Fig. N.23 now), let us call a = a(x3) and b = b(x3) the lengths of the sides of a horizontal section taken at some height x3 through the tetrahedron; you see that a and b grow smaller as x3 grows larger. The area of the section is simply a(x3)b(x3)/2 (because it’s a right triangle). Now think of an infinitely thin slice of tetrahedron, with the section at x3 as its base. Its volume is a(x3)b(x3)dx3/2. So, what we need to do is, we need to integrate that volume over x3. To do that analytically, we need formulae for the functions a(x3) and b(x3): they are straight lines, with

$$\frac{a(x_3)}{a(0)} = \frac{c - x_3}{c} \quad\text{and}\quad \frac{b(x_3)}{b(0)} = \frac{c - x_3}{c},$$

where c is the tetrahedron’s vertical extent, as per Fig. N.23; and of course when x3 = 0 then a = a(0), etc. Based on what we just wrote,

$$a(x_3) = \frac{a(0)}{c}\,(c - x_3) = a(0) - \frac{a(0)}{c}\,x_3,$$

and

$$b(x_3) = b(0) - \frac{b(0)}{c}\,x_3.$$

Then the formula for the volume of the thin slice at x3, call it dV(x3), is

$$\begin{aligned} dV(x_3) &= \frac{a(x_3)\,b(x_3)}{2}\,dx_3 \\ &= \frac{1}{2}\left(a(0) - \frac{a(0)}{c}\,x_3\right)\left(b(0) - \frac{b(0)}{c}\,x_3\right) dx_3 \\ &= \frac{1}{2}\left(a(0)b(0) - 2\,\frac{a(0)b(0)}{c}\,x_3 + \frac{a(0)b(0)}{c^2}\,x_3^2\right) dx_3 \\ &= \frac{a(0)b(0)}{2}\left(1 - \frac{2x_3}{c} + \frac{x_3^2}{c^2}\right) dx_3, \end{aligned}$$

and we integrate that to get the volume, i.e.,

$$\begin{aligned} V &= \int_0^c dV(x_3) \\ &= \frac{a(0)b(0)}{2}\int_0^c \left(1 - \frac{2x_3}{c} + \frac{x_3^2}{c^2}\right) dx_3 \\ &= \frac{a(0)b(0)}{2}\left(\int_0^c dx_3 - \frac{2}{c}\int_0^c x_3\,dx_3 + \frac{1}{c^2}\int_0^c x_3^2\,dx_3\right) \\ &= \frac{a(0)b(0)}{2}\left(\left[x_3\right]_0^c - \frac{2}{c}\left[\frac{x_3^2}{2}\right]_0^c + \frac{1}{c^2}\left[\frac{x_3^3}{3}\right]_0^c\right) \\ &= \frac{a(0)b(0)}{2}\left(c - c + \frac{c}{3}\right) \\ &= \frac{a(0)b(0)}{2}\,\frac{c}{3}, \end{aligned}$$

which if you consider that a(0)b(0)/2 is the area of the base of the tetrahedron, and c is the tetrahedron’s height, then this is exactly the same as Eq. (6.111), QED. 311. It is still true that surface forces on any surface within the prism cancel out, because the stress on one side of the surface has to be equal and opposite to the stress on the other side, by Newton’s action-reaction principle, and that holds true for all three components of T and therefore for all nine coefficients of τ. The other thing you have to consider is that the faces of the prism are all perpendicular to one another; and so the components of traction on any one of those surfaces coincide with the coefficients of the row of τ that’s associated
with the orientation of that surface. It’s just the way τ is defined, and simplifies things a bit. 312. Navier-Stokes is a very general equation. For instance, if you require all shear stresses to be zero, and τ11 = τ22 = τ33, which we know by observation is precisely what happens in a fluid that has no viscosity, Navier-Stokes boils down to Euler’s equation (just replace δp with τ11, or vice-versa). 313. A viscoelastic medium, on the other hand, is one that is elastic in the short term, but then, in the long term, continues to respond to stress, deforming very slowly. Earth materials are, in general, viscoelastic. And but their viscous response is so slow that it really isn’t significant before, like, a few years, or even decades after a quake. In this chapter, we’re not going to worry about viscosity and viscoelasticity at all, and just treat the earth as a perfectly elastic body. 314. To be honest, (6.138) is only approximately true. It involves swapping integration over space and differentiation with respect to time, i.e., we are supposed to differentiate the whole integral, and instead we differentiate the integrand, first, and then integrate. It is easy to see that this is not quite exact, because, for instance, the volume V will also change over time, and that should be taken into account somehow, when you differentiate an integral over V: which we don’t. We get away with this because the deformations we deal with are very small, so we can neglect changes in V, and also

$$\frac{du}{dt} = \frac{\partial u}{\partial t} + \sum_{i=1}^{3}\frac{\partial u}{\partial x_i}\,\frac{\partial x_i}{\partial t} \approx \frac{\partial u}{\partial t}. \tag{N.82}$$
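To get a feel for how small the neglected (advective) term in (N.82) actually is, here is a quick numerical sketch; the wave parameters (10-micron amplitude, 1 km wavelength, 1 Hz) are made up by me, but seismically plausible.

```python
import numpy as np

# u(x, t) = A sin(kx - wt): a small-amplitude elastic wave.
# Parameters are hypothetical but in the right ballpark for seismic waves.
A = 1e-5                   # amplitude: 10 microns
k = 2 * np.pi / 1000.0     # wavenumber: 1 km wavelength
w = 2 * np.pi * 1.0        # angular frequency: 1 Hz

x, t = 300.0, 0.2          # an arbitrary point and time

du_dt = -A * w * np.cos(k * x - w * t)   # partial time derivative: local shaking
du_dx = A * k * np.cos(k * x - w * t)    # spatial derivative: strain

# In du/dt = du/dt|_partial + (du/dx)(dx/dt), the particle velocity dx/dt is
# itself approximately the partial derivative du/dt, so the extra term is:
advective = du_dx * du_dt

print(abs(advective / du_dt))   # about 5e-8: the correction is negligible
```

The ratio is just the strain A·k, of order 1e-7 here, which is why dropping the advective term costs essentially nothing for seismic displacements.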
In addition, as you are about to see, we don’t differentiate the entire integrand, but do as if body and surface forces were constant. This is also fine, because, in our applications, body and surface forces change in time at a rate much slower than the displacement they cause. 315. To prove that τ is symmetric, go back to Euler’s stuff that we covered in Chap. 2; take Eq. (2.8) and work out the cross products at both left- and right-hand sides. You should get, for the ith component (i = 1, 2, 3),

$$\frac{d}{dt}\int_V dV\,\rho(\mathbf{r})\,e_{ijk}\,r_j\,\frac{dr_k}{dt} = \int_V dV\,e_{ijk}\,r_j\,F_k(\mathbf{r}), \tag{N.83}$$

where we’ve chosen to write the cross product via the Levi-Civita tensor (see note 749) e_ijk. Now, the force-per-unit-volume, F, includes all body forces-per-unit-volume f acting on the mass within V, and plus all the forces-per-unit-surface, AKA tractions T acting on all possible surfaces within V, and on the boundary ∂V of V. The forces-per-unit-surface acting on surfaces within V cancel out by Newton’s action-reaction: for each traction acting on either side of any of those
surfaces, you’ve got an equal and opposite traction on the other side; we are left with the tractions on ∂V, and (N.83) becomes

$$\frac{d}{dt}\int_V dV\,\rho(\mathbf{r})\,e_{ijk}\,r_j\,\frac{dr_k}{dt} = \int_V dV\,e_{ijk}\,r_j\,f_k(\mathbf{r}) + \int_{\partial V} dS\,e_{ijk}\,r_j\,T_k(\hat{\mathbf{n}}), \tag{N.84}$$

where the unit vector n̂ is defined to be everywhere perpendicular to ∂V, and serves to remind us that the traction T is the traction acting on ∂V and not on any other surface. Let’s work out the time-derivative at the left-hand side:

$$\begin{aligned} \frac{d}{dt}\int_V dV\,\rho(\mathbf{r})\,e_{ijk}\,r_j\,\frac{dr_k}{dt} &= \int_V dV\,\rho(\mathbf{r})\,e_{ijk}\,\frac{d}{dt}\left(r_j\,\frac{dr_k}{dt}\right) \\ &= \int_V dV\,\rho(\mathbf{r})\,e_{ijk}\left(\frac{dr_j}{dt}\,\frac{dr_k}{dt} + r_j\,\frac{d^2 r_k}{dt^2}\right) \\ &= \int_V dV\,\rho(\mathbf{r})\,e_{ijk}\,r_j\,\frac{d^2 r_k}{dt^2}, \end{aligned} \tag{N.85}$$

where $e_{ijk}\frac{dr_j}{dt}\frac{dr_k}{dt}$ has vanished because $e_{ijk} = -e_{ikj}$, etc. Now sub this back into (N.84),

$$\begin{aligned} \int_V dV\,\rho(\mathbf{r})\,e_{ijk}\,r_j\,\frac{d^2 r_k}{dt^2} &= \int_V dV\,e_{ijk}\,r_j\,f_k(\mathbf{r}) + \int_{\partial V} dS\,e_{ijk}\,r_j\,T_k(\hat{\mathbf{n}}) \\ &= \int_V dV\,e_{ijk}\,r_j\,f_k(\mathbf{r}) + \int_{\partial V} dS\,\hat{n}_l\,\tau_{lk}\,e_{ijk}\,r_j \\ &= \int_V dV\,e_{ijk}\,r_j\,f_k(\mathbf{r}) + \int_V dV\,\frac{\partial}{\partial r_l}\left(\tau_{lk}\,e_{ijk}\,r_j\right) \end{aligned} \tag{N.86}$$

via Eq. (6.110), first, and then the divergence theorem. It follows that

$$\int_V dV\left[\rho(\mathbf{r})\,e_{ijk}\,r_j\,\frac{d^2 r_k}{dt^2} - e_{ijk}\,r_j\,f_k(\mathbf{r}) - e_{ijk}\,\frac{\partial}{\partial r_l}\left(r_j\,\tau_{lk}\right)\right] = 0; \tag{N.87}$$

but the volume V is arbitrary, so

$$\rho(\mathbf{r})\,e_{ijk}\,r_j\,\frac{d^2 r_k}{dt^2} - e_{ijk}\,r_j\,f_k(\mathbf{r}) - e_{ijk}\,\frac{\partial}{\partial r_l}\left(r_j\,\tau_{lk}\right) = 0, \tag{N.88}$$

or

$$\rho(\mathbf{r})\,e_{ijk}\,r_j\,\frac{d^2 r_k}{dt^2} - e_{ijk}\,r_j\,f_k(\mathbf{r}) - e_{ijk}\,r_j\,\frac{\partial\tau_{lk}}{\partial r_l} - e_{ijk}\,\tau_{jk} = 0, \tag{N.89}$$

or

$$e_{ijk}\,r_j\left[\rho(\mathbf{r})\,\frac{d^2 r_k}{dt^2} - f_k(\mathbf{r}) - \frac{\partial\tau_{lk}}{\partial r_l}\right] - e_{ijk}\,\tau_{jk} = 0. \tag{N.90}$$
But the bracketed thing is zero because of Eq. (6.135), which we’d just figured out. And so we are left with

$$e_{ijk}\,\tau_{jk} = 0. \tag{N.91}$$

By the definition of e_ijk, this is the same as saying that τ12 = τ21, τ13 = τ31 and τ23 = τ32 (while there are no implications for the diagonal terms of τ), i.e., that τ is a symmetric matrix. Don’t forget that what we just got follows from Eq. (2.8) of Chap. 2; and but Eq. (2.8) of Chap. 2 was a corollary of Newton’s second: bottom line, it follows from Newton’s second that the stress tensor τ must always be symmetric, QED. 316. Which is a very important result that is going to get used multiple times in this book, as you will see. I am not going to prove it rigorously, but, as I often do, I’ll work out a simple exercise to try and convince you that the divergence theorem makes sense. So, here it goes: consider a continuum that’s expanding and/or contracting; we can either look at a fixed portion of its mass, and we shall see that its volume changes; or we can look at a fixed volume, and we’ll see that the mass that’s contained within it will change. We are actually going to do the latter. Before we do that, though: relative changes in volume and density coincide (except for the sign), because remember, if m is mass, then m = ρV, and/or V = m/ρ, and so

$$\frac{dV}{d\rho} = -\frac{m}{\rho^2},$$

which is the same as

$$\delta V = -\frac{m}{\rho^2}\,\delta\rho,$$

or

$$\frac{\delta V}{V} = -\frac{\delta\rho}{\rho}.$$

Remember Eq. (6.19), which, if we replace volume change with density change, becomes

$$\frac{\delta\rho}{\rho} = -\nabla\cdot\mathbf{u}.$$

We can use this to calculate the total gain or loss of mass, call it δm, of a fixed volume V of continuum: just integrate over V, and

$$\delta m = \int_V \delta\rho\,dV = \int_V \rho\,\frac{\delta\rho}{\rho}\,dV = -\rho\int_V \nabla\cdot\mathbf{u}\,dV,$$
Fig. N.24 Think of the small prism as an element of mass flowing through an infinitesimal portion d S of the surface ∂V . (In my derivation of the divergence theorem, actually, ∂V is a closed surface; but here we see only part of it.) The vector u is the displacement field at d S; and nˆ is the unit vector normal to d S, pointing away from V
where, for the sake of simplicity, I’ve taken the density ρ, pre-deformation, to be constant over the volume V (which I guess that, anyway, you can always take V small enough for this to be the case). There’s another way to calculate δm. Consider the surface ∂V, that bounds the fixed volume V, and calculate how much mass is carried through ∂V, into V or out of it, by the displacement u. Look, first, at an infinitesimal portion of ∂V, call it dS: picture a prism (see Fig. N.24), whose base is dS and whose height is the dot product of u with the unit vector n̂, normal to dS: u · n̂ is the component of u that actually carries stuff across dS; if u is parallel to dS, that means there’s no flow of mass across dS. So, anyway, the amount of mass that “flows” through dS is ρ times the volume u · n̂ dS of the prism in question. And if I integrate over ∂V, I get

$$\delta m = -\rho\int_{\partial V}\mathbf{u}\cdot\hat{\mathbf{n}}\,dS,$$

because, like I said, we take ρ to be constant. (The minus sign is there because, by definition, n̂ points away from V: if u also points away from V, then u · n̂ is positive, but δm must be negative.) The two equations I just wrote have the same left-hand side, so their right-hand sides must coincide, i.e. (ρ cancels out),

$$\int_V \nabla\cdot\mathbf{u}\,dV = \int_{\partial V}\mathbf{u}\cdot\hat{\mathbf{n}}\,dS,$$
which is exactly what the divergence theorem says. Not quite a proof, I guess, but a convincing example. 317. See note 55. 318. Yes, a total and a partial derivative are not the same thing: and in (6.135) we have a total derivative: but in the steps that follow it is more convenient to have the partial one. Whether the derivative is partial or total, though, Navier-Stokes is still approximately valid: see note 314.
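And if you’d rather see numbers than prose: here is a minimal numerical check of the divergence theorem. The field u = (xy, yz, zx) and the unit cube are arbitrary choices of mine, picked so that both integrals are easy; each side comes out equal to 3/2.

```python
import numpy as np

# Check the divergence theorem on the unit cube [0,1]^3 for the
# (arbitrarily chosen) field u = (x*y, y*z, z*x), with div u = y + z + x.
n = 100
h = 1.0 / n
c = (np.arange(n) + 0.5) * h                  # cell midpoints along one axis

# Left-hand side: volume integral of div u, by the midpoint rule.
X, Y, Z = np.meshgrid(c, c, c, indexing="ij")
volume_integral = np.sum(Y + Z + X) * h**3    # exact value: 3/2

# Right-hand side: outward flux through the six faces. On the faces at
# x=0, y=0, z=0 the relevant component of u vanishes, so they contribute 0.
A, B = np.meshgrid(c, c, indexing="ij")
flux = 0.0
flux += np.sum(1.0 * A) * h**2   # face x=1, normal +x: u.n = x*y = y
flux += np.sum(1.0 * B) * h**2   # face y=1, normal +y: u.n = y*z = z
flux += np.sum(1.0 * A) * h**2   # face z=1, normal +z: u.n = z*x = x

print(volume_integral, flux)     # both approximately 1.5
```

Because the integrands here are linear, the midpoint rule is essentially exact, and the two sides agree to machine precision.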
319. I doubt that you are ready to take this at face value, so let me give you a few extra details. Let’s look at just the relevant chunk of (6.141), temporarily dropping Einstein’s notation,

$$\begin{aligned} \sum_{i=1}^{3}\sum_{j=1}^{3}\tau_{ji}\,\frac{\partial}{\partial x_i}\left(\frac{\partial u_j}{\partial t}\right) &= \sum_{i=1}^{3}\sum_{j=1}^{3}\tau_{ji}\,\frac{\partial}{\partial t}\left(\frac{\partial u_j}{\partial x_i}\right) \\ &= \sum_{i=1}^{3}\tau_{ii}\,\frac{\partial}{\partial t}\,\frac{\partial u_i}{\partial x_i} + \sum_{i=1}^{3}\sum_{j\neq i}\tau_{ji}\,\frac{\partial}{\partial t}\,\frac{\partial u_j}{\partial x_i} \\ &= \sum_{i=1}^{3}\tau_{ii}\,\frac{\partial}{\partial t}\,\frac{\partial u_i}{\partial x_i} + \sum_{i=1}^{3}\sum_{j>i}\tau_{ji}\,\frac{\partial}{\partial t}\left(\frac{\partial u_j}{\partial x_i} + \frac{\partial u_i}{\partial x_j}\right) \\ &= \sum_{i=1}^{3}\tau_{ii}\,\frac{\partial}{\partial t}\,\frac{\partial u_i}{\partial x_i} + \sum_{i=1}^{3}\sum_{j\neq i}\tau_{ji}\,\frac{1}{2}\,\frac{\partial}{\partial t}\left(\frac{\partial u_j}{\partial x_i} + \frac{\partial u_i}{\partial x_j}\right) \\ &= \sum_{i=1}^{3}\sum_{j=1}^{3}\tau_{ji}\,\frac{1}{2}\,\frac{\partial}{\partial t}\left(\frac{\partial u_j}{\partial x_i} + \frac{\partial u_i}{\partial x_j}\right) \\ &= \sum_{i=1}^{3}\sum_{j=1}^{3}\tau_{ji}\,\frac{\partial \varepsilon_{ji}}{\partial t}, \end{aligned}$$
QED, where all the tricks I played are legit because τ is a symmetric tensor (which see note 315). I hope you are convinced, now? Yes, maybe you do need to go through it a couple times—but everything is in there. 320. That is, all energy that is not kinetic energy. So, heat, I guess, plus potential energy associated with whatever force fields we are immersed in. According to the kinetic theory of heat (which I mentioned briefly in note 154, Chap. 4, by the way), heat is kinetic energy, too, except it’s the kinetic energy associated with all those motions that are too small to be seen—the displacement of, like, atoms and molecules. So, right now, forget kinetic theory—heat is heat and kinetic energy is only that associated with “macroscopic” motion. 321. As long as you assume that seismic displacements are too small to actually perturb significantly the density distribution, and therefore the gravitational field. 322. We’ve said that τij = cijkl εkl, and but also that τij = τji for any i, j, so but that’s the same as saying that cijkl εkl = cjikl εkl, which since Eq. (6.155) has got to hold for any possible set of values of εkl—i.e., independent of how the continuum gets deformed—it follows that cijkl = cjikl. As for ε being symmetric, to my understanding, that doesn’t really imply that cijkl = cijlk; but it does imply that we can choose cijkl to be the same as cijlk: and so we do. Let me explain. I am going to drop Einstein’s notation for a second, just to see things more clearly:

$$\tau_{ij} = \sum_{k=1}^{3}\sum_{l=1}^{3} c_{ijkl}\,\varepsilon_{kl},$$
and but because εkl = εlk we can rewrite the right-hand side,

$$\tau_{ij} = \sum_{k=1}^{3}\sum_{l=k+1}^{3}\left(c_{ijkl}\,\varepsilon_{kl} + c_{ijlk}\,\varepsilon_{lk}\right) + \sum_{k=1}^{3} c_{ijkk}\,\varepsilon_{kk}.$$

But εkl = εlk, and so

$$\tau_{ij} = \sum_{k=1}^{3}\sum_{l=k+1}^{3}\left(c_{ijkl} + c_{ijlk}\right)\varepsilon_{kl} + \sum_{k=1}^{3} c_{ijkk}\,\varepsilon_{kk}.$$
What this means is that, from the point of view of physics, there is no way of separating cijkl and cijlk; in practice, one could only ever measure their sum: and so it wouldn’t make much sense to define them to be different from one another—that wouldn’t reflect the true number of degrees of freedom in the system—it’s simpler to just decide that cijkl = cijlk, which is what is conventionally done. Later in this chapter we’ll calculate how many independent coefficients are left in c after all this is taken into account. 323. I don’t think I’ve ever learned how exactly they are derived, either in college or graduate school. I tried to look it up in Aki and Richards (Quantitative Seismology, second edition, University Science Books, 2002), which is still the standard textbook for seismology, I think, and Aki and Richards don’t give a proof, either, but towards the end of their Sect. 2.2 they refer to the book by Jeffreys and Jeffreys, Methods of Mathematical Physics. Which, incidentally, Harold Jeffreys was a British physicist, and/or applied mathematician, or geophysicist, who contributed quite a lot of influential work to the field, but is systematically remembered for being, during his entire life, a stubborn opponent of continental drift/plate tectonics789. And this despite having lived for almost a century (1891–1989). The second Jeffreys, the second author of the book I’m citing, is his wife, Bertha Swirles Jeffreys. In the next note I am showing their proof, based on how I understood it from their book. 324. The idea is that saying that something is isotropic is the same as saying that the relationship between τ and ε is exactly the same regardless of how stress and strain are oriented, i.e., regardless of which way stuff is being pushed, pulled, or sheared. Which, in turn, is the same as saying that if you rotate the reference frame, the tensor c stays the same. We discussed rotations before, in Chap.
2, but, before I go on, I should probably explain what happens to a tensor, rather than a vector, when the reference frame is rotated. So, let R be the rotation matrix that translates the components of a vector given with respect to some initial reference frame, call that v, into the components of the same vector, but defined in the primed frame, call that v′:

$$\mathbf{v}' = \mathbf{R}\cdot\mathbf{v}. \tag{N.94}$$
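Incidentally, where this note is eventually headed (namely, that a tensor built only out of Kronecker deltas, like Eq. (N.97) below, is left unchanged by any rotation) is easy to check numerically. Here is a sketch, with arbitrary values of λ, μ, ν chosen by me, using the 4th-order transformation rule derived in this note:

```python
import numpy as np

rng = np.random.default_rng(0)

# An isotropic 4th-order tensor of the form of Eq. (N.97):
# c_ijkl = lam*d_ij*d_kl + mu*d_ik*d_jl + nu*d_il*d_jk  (lam, mu, nu arbitrary)
d = np.eye(3)
lam, mu, nu = 2.0, 0.7, 0.3
c = (lam * np.einsum("ij,kl->ijkl", d, d)
     + mu * np.einsum("ik,jl->ijkl", d, d)
     + nu * np.einsum("il,jk->ijkl", d, d))

# A random rotation matrix: orthonormalize a random 3x3 matrix via QR.
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(R) < 0:
    R[:, 0] = -R[:, 0]      # make it a proper rotation (det = +1)

# Rotate the tensor: c'_ijkl = R_im R_jn R_kp R_lq c_mnpq, i.e. Eq. (N.95).
c_rot = np.einsum("im,jn,kp,lq,mnpq->ijkl", R, R, R, R, c)

print(np.max(np.abs(c_rot - c)))   # tiny: rotation leaves c unchanged
```

Repeating this with any other rotation (or with a c that is not a pure delta combination, to see the check fail) is a good way to convince yourself of what the rest of the note derives by hand.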
Now, given a matrix T in the initial reference frame, we call T′ the matrix such that T′ · v′ = R · T · v, i.e., T′ · v′ is the primed counterpart of T · v, just like v′ is the primed counterpart of v. But, because of (N.94), then we can also write T′ · R · v = R · T · v. The vector v being totally arbitrary, it follows that T′ · R = R · T, which, if you dot both sides (to the right) with R^−1, what you get is T′ = R · T · R^−1, because of course R · R^−1 is the identity matrix. Bottom line, to rotate a matrix—a second-order tensor—you need to dot it twice with the rotation matrix R: once, actually, to the right, with the inverse, R^−1, of the rotation matrix; and once, to the left, with R itself. It turns out (and please look that up, if you are interested: no proof in this book) that, in general, to rotate a tensor of arbitrary order, the number of times you need to dot it with R or R^−1 coincides with the tensor's order. The tensor we are concerned with right now is a 4th-order tensor, dimensions 3 × 3 × 3 × 3: component by component, we have

c′_ijkl = Σ_{m=1}^{3} Σ_{n=1}^{3} Σ_{p=1}^{3} Σ_{q=1}^{3} R_im R_jn c_mnpq (R^−1)_pk (R^−1)_ql.
It also turns out that R^−1 = R^T, i.e., the inverse of a rotation matrix coincides with the rotation matrix's transpose790. But then,

c′_ijkl = R_im R_jn c_mnpq (R^T)_pk (R^T)_ql = R_im R_jn c_mnpq R_kp R_lq   (N.95)
(using Einstein's summation convention, of course). Now, like I was saying, if we want the medium, i.e., c, to be isotropic, that's the same as requiring that rotating the reference frame does nothing to c, that is to say,

c′_ijkl = c_ijkl   (N.96)
for all values of i, j, k, l. The question is, what does this tell us specifically about the individual components of c? Well, first of all, you can rotate the reference frame about an axis directed like the vector (1, 1, 1); this brings the primed axis x′1 to coincide with the original axis x2, primed x′2 to coincide with original x3, and primed x′3 with x1. The result of this rotation is just to swap the suffixes with one another, i.e., c′_1111 = c_2222; c′_2222 = c_3333; c′_3333 = c_1111. But then if we combine that with (N.96), which gives c′_1111 = c_1111, etc., it follows that

c_1111 = c_2222 = c_3333,
which is a first constraint on the components of c. Then, let us rotate through 90° about the axis x3. The rotation matrix (remember Chap. 2) in this case reads

R = (  0  1  0
      −1  0  0
       0  0  1 ).

Let us calculate, e.g., the coefficient c′_3111 that results from this rotation: using (N.95),

c′_3111 = R_3m R_1n R_1p R_1q c_mnpq
        = R_3m R_1n R_1p c_mnp2
        = R_3m R_1n c_mn22
        = R_3m c_m222
        = c_3222

(because R_12 = R_33 = 1; and R_21 doesn't matter, actually, in this case); combining this with (N.96), we have c_3111 = c_3222, which is another constraint on c. The idea, then, is to do as many of such rotations as you can, each time reducing the number of "degrees of freedom" of c, until eventually you are left with

c_ijkl = λ δ_ij δ_kl + μ δ_ik δ_jl + ν δ_il δ_jk,   (N.97)
where δ_ij, etc., means Kronecker's symbol. We can get rid of ν, too, via the symmetries we already knew before we got into this note. Those are:

c_ijkl = c_jikl   (N.98)

and

c_ijkl = c_ijlk,   (N.99)
which, like I said, follow immediately from (6.155) and the fact that τ and ε are both symmetric; and

c_ijkl = c_klij,   (N.100)

i.e., Eq. (6.160), which follows from the requirement that energy be conserved. It follows from (N.98) that

0 = c_ijkl − c_jikl
  = λ δ_ij δ_kl + μ δ_ik δ_jl + ν δ_il δ_jk − λ δ_ji δ_kl − μ δ_jk δ_il − ν δ_jl δ_ik
  = (μ − ν) δ_ik δ_jl + (ν − μ) δ_il δ_jk;

likewise, from (N.99),

0 = c_ijkl − c_ijlk
  = λ δ_ij δ_kl + μ δ_ik δ_jl + ν δ_il δ_jk − λ δ_ij δ_lk − μ δ_il δ_jk − ν δ_ik δ_jl
  = (μ − ν) δ_ik δ_jl + (ν − μ) δ_il δ_jk

(which is exactly the same condition as we had just written). From (N.100) we get

0 = c_ijkl − c_klij
  = λ δ_ij δ_kl + μ δ_ik δ_jl + ν δ_il δ_jk − λ δ_kl δ_ij − μ δ_ki δ_lj − ν δ_kj δ_li
  = (λ − λ) δ_ij δ_kl + (μ − μ) δ_ik δ_jl + (ν − ν) δ_il δ_jk,

which, obviously, is always verified, whatever the values of λ, μ, ν. But the two previous requirements are only met if μ = ν: which if we substitute into (N.97), we get Eq. (6.162), QED.
325. They are called Lamé parameters, after Gabriel Lamé, the French mathematician who first defined them. Incidentally, Lamé was a good friend and co-worker of Émile Clapeyron, whom we shall meet later (the Clapeyron slope and all that, maybe you've heard of it already? if not, it will be in the next chapter anyway). In 1820, Lamé and Clapeyron were both détachés to Saint Petersburg, hired by the Tsar to teach in the new Institut et Corps du Génie des Voies de Communication. Alexander I had figured, I guess, that Russia must catch up with western Europe, where "progress" in the early 1800s had been crazy fast.
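None of this is in the book itself, but the two claims of note 324 are easy to spot-check numerically. Here is a minimal sketch (numpy; the values of λ and μ are made up): it verifies (1) that the isotropic tensor c_ijkl = λ δ_ij δ_kl + μ (δ_ik δ_jl + δ_il δ_jk)—Eq. (6.162), i.e., (N.97) with μ = ν—is unchanged by an arbitrary rotation applied as in (N.95), and (2) that for any c, the 90° rotation about x3 gives c′_3111 = c_3222, the constraint worked out above.

```python
import numpy as np

def rotate(c, R):
    # apply (N.95): c'_ijkl = R_im R_jn R_kp R_lq c_mnpq
    return np.einsum('im,jn,kp,lq,mnpq->ijkl', R, R, R, R, c)

lam, mu = 3.0, 2.0  # made-up Lamé parameters
d = np.eye(3)
c_iso = (lam * np.einsum('ij,kl->ijkl', d, d)
         + mu * (np.einsum('ik,jl->ijkl', d, d) + np.einsum('il,jk->ijkl', d, d)))

# (1) a random rotation (orthogonal matrix from a QR decomposition) does nothing to c_iso
Q, _ = np.linalg.qr(np.random.default_rng(0).normal(size=(3, 3)))
assert np.allclose(rotate(c_iso, Q), c_iso)

# (2) for an arbitrary tensor, the 90-degree rotation about x3 maps c_3222 into c'_3111
R90 = np.array([[0., 1., 0.], [-1., 0., 0.], [0., 0., 1.]])
c_any = np.random.default_rng(1).normal(size=(3, 3, 3, 3))
assert np.isclose(rotate(c_any, R90)[2, 0, 0, 0], c_any[2, 1, 1, 1])
```

(Indices in the code are zero-based, so c′_3111 is `[2, 0, 0, 0]` and c_3222 is `[2, 1, 1, 1]`.)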
So, then, European teachers were needed to form a new generation of scientists and engineers, etc.
326. The reason this is OK is that, in practice, the earth is, to a good approximation, and down to pretty small scale, a layered medium—each layer being approximately homogeneous in temperature, composition, density, etc. So the solution we shall find assuming constant λ and μ is still valid within each individual layer; and then it's a matter of prescribing some boundary conditions at the boundaries between layers...
327. Equation (6.167) was found in this exact same form by Augustin-Louis Cauchy via a different method, in 1828. I read Cauchy's papers, and they are quite good, I thought, and if I understand them correctly, what he does is, he starts off with the hypothesis that matter is made of tiny particles, that lie at some distance from one another and that are kept together by some "intermolecular" force, of which little is (was) known; and that the intermolecular force between a pair of particles is stronger the closer the particles are to one another, kind of like gravity (except, a priori, Cauchy's force could be both attractive or repulsive—and eventually things would work out the same way). Based on only this hypothesis and quite a bit of math, he derives both (6.167) and Hooke's law. Most of this is in "Sur l'Équilibre et le Mouvement d'un Système de Points Matériels Sollicités par des Forces d'Attraction ou de Répulsion Mutuelle" (1828).
328. "On the Dynamical Theory of Diffraction", Transactions of the Cambridge Philosophical Society, vol. 9.
329. To prove it, go back to the definition of curl in note 265, and take the vector v to coincide with the gradient of a generic scalar function ψ(x), i.e., v = (∂ψ/∂x1, ∂ψ/∂x2, ∂ψ/∂x3). You should get

∇ × v = ( ∂/∂x2 (∂ψ/∂x3) − ∂/∂x3 (∂ψ/∂x2),  ∂/∂x3 (∂ψ/∂x1) − ∂/∂x1 (∂ψ/∂x3),  ∂/∂x1 (∂ψ/∂x2) − ∂/∂x2 (∂ψ/∂x1) ).

When you differentiate the same function twice, or more, against different variables, it doesn't matter which derivative you take first791; which means that ∂/∂x2 (∂ψ/∂x3) = ∂/∂x3 (∂ψ/∂x2), etc. But then you see that all components of the vector at the right-hand side must be zero, i.e., ∇ × (∇ψ) = 0 for any function ψ(x1, x2, x3), QED.
330. And notes 292 and 279.
331. "Mémoire sur la Propagation du Mouvement dans les Milieux Élastiques", which if I understand correctly was written, or read, in 1818, but then published only in 1831, in vol. 10 of the Mémoires de l'Académie, Paris.
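The identity proved in note 329 is also easy to spot-check numerically (this is an illustration of mine, not the book's): pick a smooth test function ψ (the one below is made up; any smooth choice works), code its gradient analytically, and evaluate the curl of that gradient by central finite differences—it should vanish to within roundoff.

```python
import numpy as np

def grad_psi(x):
    # analytic gradient of the made-up test function psi = x1**2 * x2 + x1 * exp(x3)
    x1, x2, x3 = x
    return np.array([2*x1*x2 + np.exp(x3), x1**2, x1*np.exp(x3)])

def curl(f, x, h=1e-6):
    e = np.eye(3)
    # Jacobian J[i, j] = d f_i / d x_j by central differences
    J = np.array([[(f(x + h*e[j])[i] - f(x - h*e[j])[i]) / (2*h)
                   for j in range(3)] for i in range(3)])
    return np.array([J[2, 1] - J[1, 2], J[0, 2] - J[2, 0], J[1, 0] - J[0, 1]])

x0 = np.array([0.7, -1.2, 0.3])
# curl(grad psi) = 0, because mixed partial derivatives commute
assert np.allclose(curl(grad_psi, x0), 0.0, atol=1e-6)
```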
332. In the Journal de Mathématiques Pures et Appliquées, and the paper is titled "Sur Deux Mémoires de Poisson"; it's, like, a comment or critique of some of Poisson's work.
333. Liouville (translated by yours truly): "I will show that a formula obtained by Poisson as early as 1807 (in a mémoire printed in the 14th volume of the Journal de l'École Polytechnique: see pages 334–338) could have, I was going to say should have, led the illustrious author to the beautiful solution [of Eq. (6.173) here] that he only gave in 1819, and but which, since then, was already available to him as the immediate consequence of a very simple calculation." Incidentally, Siméon Poisson had been Joseph Liouville's doctoral supervisor, and had passed away in 1840.
334. In Liouville's paper, c is simply set to 1.
335. Which is the trick that Poisson had used in 1807 to solve a similar problem, in some other paper, and then had not thought of when working on the equation that we are looking at now.
336. Where the term in parentheses at the right-hand side is a Laplacian, of course, but in a reference frame of coordinates ξ = (ξ1, ξ2, ξ3)—and but I don't think that introducing a specific new symbol for that, like ∇ξ, or whatever, would be helpful, so I am just going to write things out in a more explicit way.
337. Basically, you apply the chain rule, see note 274, multiple times.
338. See notes 279 and 292.
339. It's important, I think, to see the difference between this and a plane wave—remember Eq. (6.39) and all that discussion.
340. If the earthquake is a big one, with a large portion of a fault breaking, it might be inadequate to think of it as a point source; but you can always think of it as the combination (the sum) of multiple point sources, so our reasoning still applies.
341.
Something you can also do is you can find a Slinky, which I assume you know what that is, but if you don't you can easily look it up, find a Slinky, lay it on a table, and hit it sideways, i.e., in a direction perpendicular to its axis and parallel to the plane of the table: and you shall see a shear wave propagating through the Slinky. If, instead, you hit it in the direction of its axis, what you are going to see is a compressional wave.
342. This time I am going to use complex sinusoids, rather than sines and cosines, but remember that e^{iωt} = cos(ωt) + i sin(ωt), and note 291. Rayleigh does it this way, too, in his paper. Displacement u is a physical entity that one can measure, so you might ask, why should it be written as a combination of complex numbers? why don't we write it as a combination of sines and cosines, with real coefficients? Using a complex exponential instead of cos and sin means that this new function ũ(x) that we are going to be looking for is going to be complex, too; and complex numbers are weird and confusing. Maybe so, but, as it turns out, using e^{iωt} rather than cos and sin actually simplifies the algebra—I'll show you. Oh, and I might as well anticipate that, eventually, we'll happily neglect the imaginary part of u(x, t) and only care
about the real one. That's okay, because plug u = Re(u) + i Im(u) into (6.168), bring everything to the left-hand side:

ρ ∂²/∂t² [Re(u) + i Im(u)] − (λ + μ)∇{∇ · [Re(u) + i Im(u)]} − μ∇²[Re(u) + i Im(u)] = 0

(you'll forgive me for omitting, just this once, the arguments (x, t) of u), where λ, μ, ρ are all real, so

ρ ∂²Re(u)/∂t² − (λ + μ)∇[∇ · Re(u)] − μ∇²Re(u) + i {ρ ∂²Im(u)/∂t² − (λ + μ)∇[∇ · Im(u)] − μ∇²Im(u)} = 0;

which the only way that this can happen is if both real and imaginary parts are independently zero. So let's write out explicitly the real-part part,

ρ ∂²Re(u)/∂t² − (λ + μ)∇[∇ · Re(u)] − μ∇²Re(u) = 0,
and you see that this equation is the same as (6.168), except its solution is required to be real, and it follows that Re[u(x, t)] has to be a solution of (6.168) if u(x, t) is a solution of (6.168). We also need to show that this applies to boundary conditions as well, i.e., if u(x, t) meets whatever b.c.'s you prescribe, then its real part meets them as well: but it's basically the same thing. In a minute I am going to show you that the half-space boundary conditions consist of requiring that the right-hand sides of Eqs. (6.222), (6.223) and (6.224) be zero; then, if you feel like, you can go ahead and check that it all works. It's a good exercise.
343. Which I touched upon earlier in this chapter when I did the Fourier transform (note 291). If you haven't looked into that, then, perhaps just read note 780 now.
344. Or, as some might say, u(x, t) is a "monochromatic" function of time, with frequency ω: an analogy, that might or might not be helpful, with light, whose color is defined by frequency, so that light that carries only one frequency has only one color, i.e., is monochromatic. Anyway, if a seismologist speaks of monochromatic waves, this is what they mean.
345. What Rayleigh does here is similar, but not exactly the same thing as taking the Fourier transform (note 291) of Eq. (6.168) and then solving in the Fourier domain. The subtle difference between that and looking for a monochromatic solution is that, if one takes the Fourier transform, then ω should really be thought of as a variable rather than a constant. Or, in other words, when we Fourier-transform the momentum equation, we switch to a different description of the world where time doesn't matter anymore, and everything that used to be a function of time becomes a function of frequency. And, in practice, the
monochromatic solution, that will then be found, is a function of time, with no need to inverse-Fourier transform back to the time domain.
346. Nature is described, for the most part, by linear differential equations: so it's not like seismologists are so incredibly lucky—what you are learning right now applies to other disciplines, too.
347. See note 46.
348. See note 291.
349. Which is similar to taking a spatial Fourier transform; and, mathematically, a spatial Fourier transform doesn't differ at all from a Fourier transform with respect to time, or whatever other variable you might find yourself dealing with.
350. See note 291.
351. Where if you are just looking for a monochromatic (in x1, x2 and t) solution, you can "forget" (like Rayleigh does) that P and Q depend on k1, k2, ω, and treat them as arbitrary constants. But if you prefer to think of θ(k1, k2, x3, ω) as the Fourier transform of θ(x1, x2, x3, t), then you better keep in mind that P and Q are functions of k1, etc.: because to inverse-Fourier transform them back to the space-time domain, in fact, you'll have to do some integral over those very variables. If you know about complex numbers, because you've read about them in note 780, or elsewhere, I should add that we also need √(k1² + k2² − ρω²/(λ + 2μ)) to be real, because if it were complex, e^{−√(k1² + k2² − ρω²/(λ + 2μ)) x3} would continue to oscillate forever with x3 going to infinity, which also is against the idea that there are no quakes at infinite depth.
352. If you haven't read note 46 of Chap. 1, I advise you to do so now. There, I explain that a helpful trick when you have to solve a non-homogeneous differential equation (or system of equations) like (6.208) is to first find a solution of the associated homogeneous equation. Because then, the general solution is given by the sum of that, plus one particular solution of the non-homogeneous equation.
353. Rayleigh, by the way, explains this step much more tersely than I do, with just the words: "we will write for brevity P=1". But it did take me some time to convince myself that it is OK to do so.
354. Rayleigh doesn't write (6.247) explicitly, and is content instead with its form (6.246), which is the same as Eq. (24) in his paper. He summarizes the last few steps: "As k1 and k2 occur here [i.e., Eq. (6.243) above, which also appears in Rayleigh's paper but without a number, right between his Eqs. (23) and (24)] only in the combination (k1² + k2²), a quantity homogeneous with h² and κ², we may conveniently replace (k1² + k2²) by unity", which again is quite, uhm, dense.
355. Which is what the Abel–Ruffini theorem says, AKA Abel's impossibility theorem. Classic problem in nineteenth-century math. You can look it up.
356. If we were looking for the general solution of (6.168) via Fourier analysis, rather than for its generic monochromatic solution, we would need to be very careful with this; because inverse-Fourier transforming involves integration over k1, k2, ω, but now we are seeing that only some values of combinations of k1, k2 and ω actually work. So let's say that we integrate first over ω and then over k1 and k2: then, the integral over ω would need to boil down to the value taken by the integrand at ω = χ√((μ/ρ)(k1² + k2²)), plus its value at ω = −χ√((μ/ρ)(k1² + k2²)) (±χ being the two possible values taken by χ for a given value of χ², which is the real root of (6.247)). In other words, we've got to multiply the coefficient of each monochromatic contribution by δ[ω − χ√((μ/ρ)(k1² + k2²))] + δ[ω + χ√((μ/ρ)(k1² + k2²))], where δ is the Dirac function (see note 277).
357. It shouldn't be too hard to see that (6.258) is the same as Eq. (37) in Rayleigh's paper. Only difference, it's been a few pages since we defined r and s and at this point we might even have forgotten what they are, so to remind ourselves of that, I replaced them with their expressions in terms of k1, k2, and some of the numerical constants I've found, which in some cases I had to take the square root of, etc. Oh yes, and Rayleigh also divides both sides by h².
358. And multiplies both sides by the factors e^{i(k1 x1 + k2 x2 + ωt)}, that in his formulation contain all dependence on x1, x2, t. The general solution to (6.168) is a linear combination of such plane waves, associated to all possible values of k1, k2, i.e., traveling in all directions. But here we are not looking to write such general solution, but rather to understand what each of the monochromatic solutions looks like.
359. Check it out: 0.4227/0.6204 = 0.6813 (these are the numbers in Rayleigh's paper) and 1.500/2.2015 = 0.6814.
360. A. E. H. Love, Some Problems in Geodynamics, 1911, art. 160.
361. Seismologists like to call this effect "geometrical spreading".
362. "History of Seismology", in International Handbook of Earthquake and Engineering Seismology, vol. 81A, 2002.
363. R. D. Oldham, 1900, "On the Propagation of Earthquake Motion to Great Distances", Philosophical Transactions of the Royal Society A, vol. 194.
364. "Mémoire sur la Propagation du Mouvement dans les Corps Solides et Liquides", Comptes Rendus, vol. 29, 1849.
365. Oldham implicitly refers to surface waves in water. Which we've seen earlier in this chapter that the restoring force that allows those surface waves to exist is gravity. People even call them gravity waves, sometimes.
366.
“In collecting the facts I have been almost exclusively indebted to the careful and detailed accounts of earthquakes, both sensible and insensible, recorded in Italy, which are regularly published through the enlightened liberality of the Italian Government. Detailed descriptions of the records of most of the more important seismographs established in Italy have been printed, at first in the Bollettino of the Central Meteorological Office in Rome, and afterwards in that of the Italian Seismological Society. From these I have extracted the times of (1) the commencement of the record, (2) of any marked sudden increase of
movement, and (3) of any change in its character, so far as these are mentioned". He does have data from a few other stations around the world, though: Charkow; Nicolaiew; Tokio; Strassburg [i.e., Strasbourg]; Shide; Potsdam; Grenoble; Edinburgh—but only for a small subset of the quakes.
367. Just to get an idea, 20° is roughly the distance between London and Athens.
368. That those two phases of preliminary tremor were the P and S waves was not seen as an obvious thing. In his 1906 paper "On the Constitution of the Interior of the Earth", Oldham responds to some other authors, not convinced of the P-wave/S-wave interpretation; according to Oldham, "the first and second phase [...] must be referred to different forms of wave-motion, which I interpreted as being, probably, the two known forms—compressional and distortional—which can be transmitted by an homogeneous solid"... this interpretation, writes Oldham, is supported "by the records of Prof. Vicentini's792 type of seismograph, composed of two heavy masses, one free to move horizontally, the other free to move vertically. In the records of great earthquakes originating at a distance of 90° of arc or more, it is found that the former gives a very small displacement for the first phase, while the latter frequently registers the maximum displacement of the whole disturbance. In the second phase the conditions are reversed, and, while the mass free to move vertically seldom gives any indication of disturbance, that which is free to move horizontally gives a very large displacement. This difference in the character of the record in the two phases shows that the movement is different, and incidentally tends to support the interpretation that I have proposed: for, if the first phase represents the outcrop of a disturbance transmitted through the earth as a condensational wave, the vertical movement would preponderate over horizontal at distances of 90° or more; while, if the second phase is caused by distortional waves, horizontal movement would preponderate in it."
369. The length of the arc is RΔ, with Δ the source-receiver distance in radians, and R the radius of the earth; the length of the chord is (trigonometry) 2R sin(Δ/2), and so the difference is R[Δ − 2 sin(Δ/2)]: see the plot in Fig. N.25.
370. Published only in German, in 1888: "Wellenbewegung und Erdbeben. Ein Beitrag zur Dynamik der Erdbeben", or "Wave Motion and Earthquakes. A Study on the Dynamics of Earthquakes".
371. Oldham refers to Maurycy Pius Rudzki793, and to a paper of his, published in 1898, whose title, translated, again, from German, is something like "On the Apparent Speed of Earthquake Propagation. I. Study Based on the Theory of Earthquakes". I guess the speed of earthquake propagation is the speed at which the waves that earthquakes generate propagate. (Because, you know, before it sends out elastic waves, a quake is a fracture that begins at a point, and then also propagates, and that's how the fault breaks, etc., so there might be some ambiguity, in this title, that we don't want. But I guess in Rudzki's time very little was known of fracture mechanics, so...)
372. According to Wikipedia, "the Adams Prize is one of the most prestigious prizes awarded by the University of Cambridge. It is awarded each year by the Faculty
Fig. N.25 In the first approximation, a body wave travels along a straight line, through the earth's interior, from source to receiver; a surface wave travels along the surface of the earth. (See Fig. 6.17.) For whatever source-receiver pair, then, the distance covered by the surface wave is always longer than that covered by the body wave. The curve in this plot is the difference between the two, as a function of the source-receiver angle at the center of the earth
of Mathematics at the University of Cambridge and St John's College to a UK-based mathematician for distinguished research in the Mathematical Sciences."
373. That's the year when Ernst von Rebeur-Paschwitz discovered that his instruments could record, from Germany, large earthquakes happening in Japan. He published that in Nature, and it was quite a turning point for seismology.
374. What Love means here is that before Rayleigh came up with his "surface-wave" concept, people were expecting to see just P and S waves in seismograms; now, unless you are very close to the epicenter, seismograms after a fairly big quake consist of some small wiggles followed some minutes later by a bigger arrival. Early seismologists before Rayleigh and Oldham presumably didn't see much detail in the initial wiggles, and figured they were all just one phase, the P; and the later phase, the big one, would be S. Pre-Rayleigh, they hadn't thought of surface waves, so this interpretation made sense.
375. Once you accept to work with complex numbers, taking the cosine or sine of a complex or imaginary number shouldn't be a problem. The usual formula e^{±ix} = cos(x) ± i sin(x) holds, or, which is the same,

cos(x) = (1/2) (e^{ix} + e^{−ix})

and

sin(x) = (1/(2i)) (e^{ix} − e^{−ix}).

Now x might happen to be, like I was saying, an imaginary number, say x = iy with y real. All you have to do is plug that into the former equation, and

cos(iy) = (1/2) (e^{−y} + e^{y}),

i.e., we've just found that the cosine of the imaginary number iy is actually a real number—the hyperbolic cosine of y—if you remember Eq. (6.96).
376. If you've read note 292, you might remember that a guitar string also has a fundamental mode, and an infinity of overtones. And that it can only oscillate at a discrete set of frequencies—each mode with its own frequency. Surface waves are a bit different: a surface wave in a half space can exist with whatever frequency you want—at any given frequency, Eq. (6.283) gives you the speeds at which fundamental mode and overtones propagate.
377. If you really want to see the details of how that works, the books you need to look at are probably Aki and Richards' Quantitative Seismology (first edition: 1980, or second edition: 2002), and/or Ewing, Jardetzky and Press' Elastic Waves in Layered Media (1957). We'll meet both books again.
378. In principle, any wave can be dispersive—not just surface waves. As far as seismic waves though, dispersion is prominent in Love and Rayleigh waves and much more subtle—if it can be observed at all—in body waves. If you look at the Love-wave math we've just done, you'll see that dispersion is there because there's structure in the medium—shear-wave velocity, in particular, has different values in the half-space versus the top layer. In Rayleigh's simple half-space model there was no structure and no dispersion. In the real world, you might get dispersion in media where the parameters that describe Hooke's law also change with frequency. This is something that people look at, including seismologists, but I don't think it is usually a big effect in earth materials. Check out the work of, e.g., Shun-Ichiro Karato, though, if you want to find out more.
379. That is to say, summed—which is OK, because remember, the wave equation, which they need to solve, is linear.
380. You must have experienced this, actually, if you've ever tried to tune a guitar, or listened to someone doing so. The way you tune a guitar is, you try to play the same note on two different strings; if the guitar is out of tune the two tones you'll get will be at slightly different frequencies, and you'll hear the "beating" pattern I've just described—louder, quieter, louder, quieter, and so on. So then you change the strings' tension until the beating pattern is gone: which means that at least those two strings are in tune with one another.
381. Seismologists use this word to refer to different things, which I always found very confusing. Earlier in this chapter, following Oldham and others, I told you about the phases of a seismogram: the compressional wave, shear wave,
the Love and Rayleigh waves. But right now phase is used with a different meaning, that comes from Fourier-analysis/complex-number jargon. A complex number z can always be written as the product z = M e^{iφ} (see note 780), where both M and φ are real, and M is called magnitude and φ is called phase angle, or, yes, phase. Now take Eq. (N.68) from note 291. In general F(ω) (the Fourier transform of f) is complex, and so F(ω) = M(ω) e^{iφ(ω)}, and people call M and φ the magnitude and phase of the Fourier transform. To understand what φ(ω) is in practice, think, first, of a monochromatic plane wave with frequency ω0, like cos(kx − ω0 t). You should know by now that

cos(kx − ω0 t) = (1/2) [e^{i(kx − ω0 t)} + e^{−i(kx − ω0 t)}].   (N.101)

The Fourier transform (from time t to frequency ω) of this is

F(ω) = (1/2) e^{ikx} [δ(ω + ω0) + δ(ω − ω0)],

with δ denoting, as usual, the Dirac function. (If you are not convinced, just substitute into Eq. (N.68) of note 291, and solve the integral via the properties of δ: you should get (N.101) back.) The phase of this is kx. Now take a sinusoidal wave with the same frequency, but delayed with respect to the first one, i.e., cos[kx − ω0 (t + δt)]. Repeat the above, and you see that the phase of this guy is kx − ω0 δt. So the phase difference between the two sinusoids is their frequency times the delay of one with respect to the other, i.e., it's the delay in cycles (in radians). You might also say that the phase of a signal, given the place x and time t of your measurement, tells you precisely which point of the sinusoid you are going to observe. And so it makes sense, then, that phase velocity (as opposed to group velocity) should refer to how fast that "point" is propagating in space.
382. To find a formula giving group and phase velocity in terms of one another, start by taking the phase-velocity formula, c = ω/k, and turning it around, k = ω/c; then,

dk/dω = d/dω [ω/c(ω)]
      = 1/c(ω) − [ω/c²(ω)] dc/dω   (N.102)
      = [1/c(ω)] [1 − (ω/c(ω)) dc/dω].
It follows that794

dω/dk = c(ω) / [1 − (ω/c(ω)) dc/dω].   (N.103)

Equation (N.103) says that, in the absence of dispersion (i.e., dc/dω = 0) phase and group velocities coincide. In practice, the values of c and dω/dk are always comparable, and but the large difference in frequency results in a large difference in the wave length of the phase and group terms.
383. I don't think I've shown this to you anywhere in the book so far. Let two variables x and y be related through a function f,
y = f(x).   (N.104)

Call f′(x) the first derivative of f, i.e.,

df/dx = f′(x).

Now differentiate both left- and right-hand sides of (N.104) with respect to y:

(d/dy) y = (d/dy) f(x).

But the derivative of y with respect to itself is 1,

1 = (d/dy) f(x),

and to differentiate f, a function of x, against y, we need to apply the chain rule (note 274),

1 = (df/dx)(dx/dy),

or

1 = f′(x) (dx/dy).

Divide both sides by f′(x), and

1/f′(x) = dx/dy.

This is a general result; but in our case, right now, it implies that

dk/dω = 1/(dω/dk).
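Equations (N.102)–(N.103) can be spot-checked numerically, too; this sketch is mine, not the book's, and the dispersion curve c(ω) below is an arbitrary smooth test function, not an earth model. It compares the group velocity dω/dk computed from (N.103) with a direct numerical derivative of ω with respect to k = ω/c(ω):

```python
import numpy as np

omega = np.linspace(1.0, 2.0, 2001)
c = 3.0 + 0.1 * omega**2      # made-up phase velocity c(omega)
dcdw = 0.2 * omega            # its exact derivative dc/domega

# group velocity from (N.103):
u_formula = c / (1.0 - (omega / c) * dcdw)

# group velocity from scratch: k = omega/c(omega), then d(omega)/dk numerically
k = omega / c
u_numeric = np.gradient(omega, k)

# the two agree (endpoints excluded: one-sided differences are less accurate there)
assert np.allclose(u_formula[1:-1], u_numeric[1:-1], rtol=1e-3)
```

With dc/dω > 0, as here, the denominator of (N.103) is smaller than one, so group velocity exceeds phase velocity; with dc/dω = 0 the two coincide, as the text says.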
384. Because remember Fig. 6.17, and consider that the P ray path isn’t a straight line, but gets refracted by the growth of P velocity with depth (more about this in Chap. 7), and so is slightly concave, and P-wave displacement is parallel to the ray path, etc. 385. From Richard Dixon Oldham, “The Constitution of the Interior of the Earth, as Revealed by Earthquakes”, Quarterly Journal of the Geological Society of London, vol. 62, 1906. 386. I am a musician, part-time at least. 387. Incidentally, my father was a fairly well known seismologist: which is maybe partly why I don’t like to be called a seismologist, and when people ask me what it is that I do for a living I’ll say wave physics, or acoustics, or earth sciences, whatever, anything but seismology. As in: I didn’t just follow in my dad’s footsteps: I did my own thing. To each their own complexes. 388. “In dealing with the records,” says Oldham, “it is necessary to bear in mind that they are liable to certain errors. In the first place, many earthquakes consist, not of a single impulse only, but of two or more, separated from each other by intervals of some minutes; and it is not uncommon for the disturbance due to the first impulse to be overlooked, either because it fails to overcome the inertia of the instrument, or because the disturbance is too small to be recognized. Secondly the disturbances, instead of beginning abruptly, as is sometimes the case, may come in gradually; and when this is the case, it is easy for the times of commencement of each phase to vary by a minute or more, on the records of different instruments, or even in the reading of the same record by different individuals. 
Either of these causes will make the recorded time late, but it also happens, not commonly, though often enough for the contingency to be borne in mind, that one station or a group of stations is affected by some small local disturbance, which almost coincides with a distant earthquake, and leads to the apparent commencement being too early. Apart from these sources of error, there is also that which may easily occur in determining the time of origin of the earthquake: this will introduce an error into all the intervals, which is constant for each earthquake but varies for different ones both in amount and direction; it will, consequently, be eliminated when an average from a sufficiently-large number of earthquakes is taken, and will be partly eliminated even with the few which are at present available. The other sources of error are partly eliminated by averaging, but it is necessary to reject any records which are abnormally early or late”, etc. 389. I might be using quotes quite a bit in this chapter; because this chapter deals with the stuff I started doing when I moved to the U.S. a long time ago and started a Ph.D., and my boss assigned me to the project of “mapping” the structure of the earth on the basis of seismic data. Which as I write about it now, I have this memory of being a young graduate student flooded with new info and learning all those new words, like “robust”, and “features”, which maybe I had heard before, of course I did, but with some other meaning, and now they were acquiring new meaning, and in this new meaning I would be
using them all the time, like several times in any conversation I might have with my boss about things like “slabs subducting below the 660”, etc., which just a couple months earlier I was finishing a bachelor’s in physics and those words would mean nothing to me, but now all of a sudden they carried lots of meaning, to the point of being, like, at the center of my existence. I mean, the way I spoke even changed. And maybe too much. Because within a year or two I started thinking that maybe that change wasn’t really a good thing at all. That maybe that was not what I had really wanted, in taking that trip across the ocean. But what was it, that I really wanted? Don’t worry if you have no idea of what “slabs subducting below the 660” might be; that was my situation in the initial months, and maybe years of my Ph.D. career, when people around me, in meetings I took (not-so-active) part in, would discuss all the time about things I knew nothing or close to nothing about, with words I couldn’t understand. We’ll learn about “slabs”, “subduction” and “the 660” in the next chapter, actually. 390. Oldham actually tries to do a bit better than this: when we assume that wave paths coincide with the chords of Fig. 7.2 we make an approximation, because in reality “the wave-paths up to [120◦ or so] are convex towards the centre of the earth”. It follows that “it may be taken that the central core does not extend beyond 0.4 of the radius from the centre”. The reasoning goes as follows: as I kind of anticipated towards the end of the previous chapter, if seismic velocity changes within a medium, wavefronts are inevitably deformed and wave paths accordingly deflected, and in this chapter we’ll learn some rules that control such deflections: after which, Oldham’s statement that “wave-paths [...] are convex towards the center of the earth” (i.e., kind of like in the diagram in Fig. N.26) will fully make sense. 
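The rule behind Oldham's "convex" wave paths can be sketched in a few lines of code (a toy of mine, not Oldham's calculation): in a stack of flat layers, Snell's law says sin(i)/v—the "ray parameter"—is conserved along a ray, with i the angle from the vertical; so if speed v grows with depth, the ray turns flatter and flatter going down, i.e., it bends the way Fig. N.26 shows. The layer speeds below are made up, just to illustrate the trend.

```python
import numpy as np

v = np.array([5.0, 6.0, 7.0, 8.0])     # km/s, made-up speeds increasing with depth
i0 = np.radians(30.0)                   # take-off angle (from vertical) in the top layer
p = np.sin(i0) / v[0]                   # ray parameter, conserved by Snell's law
angles = np.degrees(np.arcsin(p * v))   # incidence angle in each successive layer

assert np.all(np.diff(angles) > 0)      # the ray flattens with depth: convex path
```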
(The sudden deflections of wave paths as they hit the core boundary will also make sense.) And so then knowing that ray paths are somewhat convex—not tremendously convex, because, as far as Oldham could tell at that point, growth in seismic velocity with depth is fairly slow; but still convex—Oldham figures that the largest depth reached by any actual path should be a bit larger than that hit by the corresponding chord, and ends up applying, like, a very, uhm, pragmatic correction to his first estimate for the radius of the core, reducing it from 0.5 to 0.4 times the radius of the earth: which in fact, you might have noticed if you read figure captions carefully, is the value used to calculate Fig. 7.2. As for why the paths are convex, rather than concave or whatever, we’ll see about that in a minute. 391. Which if we read carefully Oldham’s conclusions, we see that he sort of had an intuition for all that, when he mentions that “the second-phase waves [i.e., the S waves] are either not transmitted at all, or, more probably, transmitted at about half the rate which prevails in the outer shell.” 392. “There is some, seismological, indication of a want of uniformity in this thickness,” Oldham adds, in a footnote, “for earthquakes originating off the eastern coast of Japan exhibit a three-phase character at less distances from the origin
than appears to be the case in Europe, indicating a lesser thickness of the outer crust in the former region.”

Fig. N.26 Ray paths of seismic waves, after Oldham (1906). “The broken lines”, says Oldham, “represent the first phase, the broken-and-dotted lines the second phase”. (As you know by now, he’s talking about the P and S waves.) Paths are convex towards the center of the earth: that’s because of Snell’s law, which I’ll tell you about momentarily, and because seismic velocity grows with depth (more later on that, too)

393. Plus, the P and S waves themselves didn’t have time yet to separate themselves from one another, and plus Oldham forgets that the same is true of surface waves, which are also still there in the mix, not yet dispersed and not yet greatly delayed with respect to the rest. 394. Davorka Herak and Marijan Herak, “Andrija Mohorovičić (1857–1936)—On the occasion of the 150th anniversary of his birth”, Historical Seismologist, November/December 2007. 395. Which only came out in a Polish and a German journal, back then; but in 1992 an English translation was published in vol. 9 of Geofizika, with the title “Earthquake of 8 October 1909”. 396. The October 8 event is felt strongly in Zagreb but is only a small wiggle in faraway seismograms; a large “aftershock” on October 10 is felt more weakly in Zagreb, but shows up larger at faraway stations such as Hamburg: Mohorovičić infers that the hypocenter of the aftershock must be deeper than that of the October 8 event. 397. In the travel time versus distance plot, the P arrivals that Mohorovičić calls of the first kind form a curve that’s above the one formed by P of the second kind: hence the names inferiores and superiores. All this nomenclature, by the way, is dropped in the subsequent literature, so even if your goal in life is to be a seismologist, there’s probably no benefit in learning all this by heart. Just so that you know.
398. Which is fine. But if you look at real data from today (more sensitive instruments, better installations etc.) the picture will probably be more complex, or richer than in Fig. 7.3. For one, there might be at least one additional discontinuity between the Earth’s surface and the Moho, so that the travel-time curve associated with the first P wiggle will have (at least) one additional kink. People mention, most often, the so-called Conrad discontinuity, after the name of the seismologist who first observed it—Victor Conrad (note 795). The Conrad only shows up in continental crust, though—not under oceans. Also, the P arrivals interpreted by Mohorovičić and Conrad were not simply “reflected back up by some other, deeper discontinuity, or something”, as I wrote earlier, hoping that this would initially go unnoticed. It turns out that when a body wave hits a discontinuity, it doesn’t give rise only to reflected and refracted waves, but also to so-called head waves, which the short description of what they are is: they are refracted waves that propagate for a while exactly along and just below the Moho and the Conrad, before bouncing back towards the surface. (If this sounds weird, it’s OK, because I am going to do head waves in some detail later in this chapter.) And it seems that head waves are precisely what Mohorovičić and Conrad had observed. 399. Beno Gutenberg, “Dispersion und Extinktion von Seismischen Oberflächenwellen und der Aufbau der Obersten Erdschichten”, Physikalische Zeitschrift, vol. 25, 1924. From Knopoff’s bio of Gutenberg: “Gutenberg confirmed and made precise the observations of Tams, Angenheister, and Macelwane in 1921–22, in which the velocities of propagation of surface waves were faster across the oceanic than across the continental portions of the Earth’s surface (1924).
For this he used measurements of the velocities of both Love and Rayleigh waves at a number of periods from all the prewar seismographic records at Strassburg, from all records at Jena, and selected records from other stations. He proposed a method of inversion of the dispersion of surface waves to determine upper mantle structure that was similar to the method ultimately applied in the late 1950s. His inversion for crustal thickness (he managed to confuse group and phase velocities) gave a thick crust under the continents and a thinner crust under the oceans, with a crustal thickness of only 5 km under the Pacific. The latter was a remarkably foresightful result in view of direct substantiation about thirty years later when exploration of the oceanic crust became possible.” So, yes, the main method for accurate estimates is exploration—reflection and refraction—but Gutenberg apparently anticipates later results by looking at global surface-wave data. 400. Mach (professor of mathematics, physics, and of “the History and Philosophy of the Inductive Sciences” in Graz, Prague and Vienna, over most of the second half of the nineteenth century) was an intellectual giant, contributing to everything from physics to philosophy to psychology to physiology to the history of science. “Mach’s name is, of course, associated with the speed of sound, where Mach 1 means the speed of sound in a given medium. Since the speed of sound varies with the density of the medium it is traveling through, Mach numbers are
not absolute quantities but relational ones. [...] Most importantly for engineers, Mach Number is the ratio of the speed of the object to the speed of sound in the given medium; [Mach’s] work is essential to modern aerodynamics, and through it the word ‘Mach’ has bizarrely entered into popular culture as an icon for razors, sound systems, fighter pilots, and high speed fuels.” (The Stanford Encyclopedia of Philosophy.) 401. Methuen and co., 1926. Later reprinted by Dover. 402. Whom we met in note 14, because another thing he did was, he used triangulation to measure the size of the earth. 403. Snell’s law is occasionally additionally credited to Descartes, and referred to as Snell-Descartes’ law, because of an alternative proof that Descartes claimed to have given at about the same time as Snell. Mach discusses Descartes briefly but is not impressed with Descartes’ deductions: “after actually reading these discussions in Chap. II of Descartes’ Dioptrices, [1637] it will scarcely be assumed [...] that Descartes discovered the law of refraction.” 404. You might notice the similarity and difference with Hero of Alexandria, whom I mentioned earlier, who also figured light would, for some mysterious, sort of mystical reason, choose the fastest path from object to reflecting surface to observer; and but Hero was dealing with reflection, not refraction, and obviously if waves propagate everywhere at the same speed, the shortest path is also the fastest, and vice versa. 405. Mach: “In 1657, Pierre de Fermat received from Marin Cureau de la Chambre a copy of a newly published treatise, in which La Chambre noted Hero’s principle and complained that it did not work for refraction. Fermat replied that refraction might be brought into the same framework by supposing that light took the path of least resistance, and that different media offered different resistances. His eventual solution, described in a letter to La Chambre dated 1 January 1662, construed ‘resistance’ as inversely proportional to speed, so that light took the path of least time. That premise yielded the ordinary law of refraction, provided that light traveled more slowly in the optically denser medium.” 406. And incidentally, as these lines are being written, early twenty-first century, I don’t think the question has been definitively clarified, yet. But it doesn’t matter that much for this book, because this book is not concerned with light but with mechanical waves; or rather, it is concerned with light only in so far as one mistakes light for a mechanical wave, which is what Huygens and most physicists after him and until the advent of quantum mechanics, I guess, did. He published in 1690 his Treatise on Light, “in which are explained the causes of that which occurs in reflexion, & in refraction. I wrote this Treatise during my sojourn in France twelve years ago,” writes Huygens in the preface, “and I communicated it in the year 1678 to the learned persons who then composed the Royal Academy of Science, to the membership of which the King had done me the honour of calling me. Several of that body who are still alive will remember having been present when I read it, and above the rest those amongst them who applied themselves particularly to the study of Mathematics; of whom I cannot
cite more than the celebrated gentlemen Cassini, Rømer, and De La Hire. And although I have since corrected and changed some parts, the copies which I had made of it at that time may serve for proof that I have yet added nothing to it save [some relatively minor details]. “One may ask why I have so long delayed to bring this work to the light. The reason is that I wrote it rather carelessly in the language in which it appears, with the intention of translating it into Latin, so doing in order to obtain greater attention to the thing. [...] But the pleasure of novelty being past, I have put off from time to time the execution of this design, and I know not when I shall ever come to an end of it, being often turned aside by business or by some other study. Considering which I have finally judged that it was better worth while to publish this writing, such as it is, than to let it run the risk, by waiting longer, of remaining lost.” 407. Ole Rømer and Giovanni Domenico Cassini, see note 406, both at the Royal Observatory in Paris, had actually proposed an empirical estimate for the speed of light, in 1676. What happened is, Rømer had been looking at Io, one of Jupiter’s satellites, to measure its orbital period—how long it took it to complete one full revolution around Jupiter. At some point during its orbit, Io is eclipsed by Jupiter, so Rømer figured all he had to do was measure the time between two successive eclipses. To get a more precise measurement, he’d repeat his measurement many times—over a few years, that is—and then take the average. But then he noticed something weird: his measures of Io’s orbital period were systematically shorter, about eleven minutes shorter than average, when the earth was nearest to Jupiter, i.e., at point E1 in Fig. N.27; and when the earth was at E2, they’d be about eleven minutes longer than the average. 
Rømer had the very clever intuition that the discrepancy was there because the speed of light is finite; and it takes light about eleven times two, that is, twenty-two minutes to travel the diameter of the earth’s orbit around the sun—from E1 to E2. But then, if that’s true, you can use it to measure how fast light travels: just divide twice the earth-sun distance (which people had known for a while: remember Chap. 1) by 22 min. The value for the speed of light that Rømer obtained this way is about 220,000 km/s, which is not too far from the current estimate—roughly 300,000 km/s. 408. Also: if every point is a secondary source, why do I, when I look into a mirror, see only one, coherent reflected image? 409. Francesco Maria Grimaldi was a Jesuit, who was born and lived in Bologna in the mid-1600s. He, essentially, had a beam of light go through two narrow apertures, one in front of the other, and fall on a screen. “On the whole,” says Mach, “Grimaldi discovered many beautiful and important results, which were the outcome of skilled observation; these he described with great accuracy. Although he did not draw any definite theoretical conclusions, he made many important suggestions. His book indicates him to be a man of a lovable and upright nature, who announced his convictions openly and did not suppress his doubts. At the conclusion of the first posthumous edition of his book, we
Fig. N.27 How Ole Rømer measured the speed of light. S is where the sun is, E1 and E2 are two successive positions of the earth, and J1 and J2 are where Jupiter is when the earth is at E1 and E2, respectively. Eclipses of Io begin earlier than average when Jupiter is close to the earth, and later than average when Jupiter is far from the earth
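Rømer’s estimate, as note 407 describes it, amounts to a single division; here it is with modern round numbers (the 150-million-km earth-sun distance is today’s value, not the seventeenth-century one):

```python
# Light crosses the diameter of the earth's orbit in about 22 minutes
# (twice the observed ~11-minute shift in Io's eclipse timings).
earth_sun_km = 150e6        # modern round value for the earth-sun distance
delay_s = 22 * 60           # 22 minutes, in seconds
speed_km_s = 2 * earth_sun_km / delay_s
print(round(speed_km_s))    # roughly 227,000 km/s
```

With the cruder contemporary value of the earth-sun distance the same division lands near the 220,000 km/s mentioned in note 407; either way, the order of magnitude of the modern 300,000 km/s comes out right.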
learn that the brothers of his Order composed the epitaph: ‘P. Franciscus Maria Grimaldus vixit inter nos sine querela’.” 410. See Chap. 6, and note 301. 411. Fresnel (note 796) was born in a small town in Normandy, and but admitted to the École Polytechnique of Paris, 1804. “Can you imagine what life must have been”, asks G.-A. Boutry (note 797), “in that new school suddenly opened in the capital city of an old nation which had as suddenly burst back into youth? The Napoleonic spirit was everywhere. Students who wore—and still wear—a military uniform; Professors and Assistants (most of them under thirty) whose names were Laplace, Ampère, Poisson, Cauchy, Monge, Gay-Lussac; lectures brought to an end at the sound of drums. Napoleon was first and foremost an artillery officer; in the École Polytechnique he meant officers to be trained first of all. Second in his thoughts came the Administration he was then building: an army of engineers and civil servants, of roadbuilders and peacemakers sent everywhere in the country where the last echoes of the great storm were slow to die: those ungrateful pupils who were not strong enough to give good military service had to staff this second army.” Fresnel is sent to the Vendée department, which is between Nantes and La Rochelle, where some new roads had to be built. He has to coordinate a whole lot of people to get the job done—not something that he enjoys (“there is nothing I loathe more than having to lead men”, he wrote in a letter), but somehow he manages. So he gets promoted and sent to Nyons, southeastern France, to work on some other road. Apparently, this is when he begins to look into the problems
in optics that he will become famous for. This is also the time, in 1814, when Napoleon is exiled in Elba. When the former emperor comes back to France to seize power again, Fresnel, who is no Bonapartist, “unhooks from the wall a sword he has never used”, says Boutry, “and hastens to Toulouse, there to join the loyalist army. He seriously believes in this army which does not believe in itself. But this haste is too much for his frail body: what with the rough journey and the ironical welcome he receives from his general, he has to take to his bed almost as soon as he has reported for duty. After a few days, finding it useless to stay, he returns to Nyons [which is, instead, “intensely Bonapartist”], there to be received by a hostile crowd which breaks his windows and threatens him. Of course he is cashiered and has to report to police headquarters at regular intervals. [...] Now is the time to think and to learn.” In July 1815, Fresnel is given permission to go back to live with his mother, in Normandy. In September 1815 he writes a letter to François Arago, a very famous physicist, where he explains that he got some interesting new theoretical/experimental results re the theory of light—the first instalment of his work in optics that is covered, to some extent, in this book. 412. Some years later, Fourier will show that, in practice, every function can be thought of as the combination of a set of periodic functions (see Chap. 6). Which means that, in a linear regime—e.g., in a world described by the wave equation, which is a linear PDE—a result that can be proved, independent of their frequencies, for sinusoidal functions, then holds for any function. Which, in turn, means that the results of Fresnel’s approach are much more general than one might initially think. 413. See Fresnel, Oeuvres, p. 270. 414. Fresnel, actually, did some important research on the polarization of light, too; which is another reason he’s famous. Without abandoning the idea that light is the motion of ether, he could prove that light waves are always polarized perpendicular to the direction of propagation: sort of like S waves. But this is not relevant to our goals, now. 415. Fresnel: “The first hypothesis that comes to mind is that [the fringes] be produced by the meeting of direct rays and rays reflected by the edge of the opaque body [...]. That seems to be Mr. Young’s opinion.” 416. You might remember from the previous chapter that amplitude decays with distance from the source—the so-called geometrical spreading—but this is neglected by Fresnel, presumably because the angle at O is small, and so distances along the wavefront (distances between secondary sources) are much smaller than the distances between source and obstacle and screen: from which it follows that all waves landing at a given point on the screen should have undergone about the same amount of geometrical spreading. 417. I am surprised that this hasn’t come up yet, but I guess it hasn’t. So, anyway, one way to prove it is via a Taylor expansion of sin(x) around x ≈ 0, which

sin(x) ≈ sin(0) + cos(0) x − sin(0) x²/2 − cos(0) x³/6 + ··· .
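Before reading on, a quick numerical preview of where this expansion is heading (the snippet and its variable names are mine): as x shrinks, sin(x)/x approaches 1, and the error |sin(x) − x| shrinks like the cubic term.

```python
import math

# sin(x)/x -> 1 as x -> 0, and the leading error term of the Taylor
# expansion is cubic: |sin(x) - x| is close to x**3 / 6.
for x in (0.1, 0.01, 0.001):
    ratio = math.sin(x) / x
    cubic_coeff = abs(math.sin(x) - x) / x**3   # should approach 1/6
    print(x, ratio, cubic_coeff)
```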
This says, basically, that when x tends to 0, then sin(x) ≈ x, and or sin(x)/x ≈ 1. 418. What I mean by “signal” here is motion as a function of time, which is what a seismometer, or a microphone, records. In actual optics people measure the intensity of light, which is basically what the eye perceives, and is independent of time—that involves an integration or averaging over time as well, see Fresnel at p. 315. But we are not concerned, in this book, with diffraction of light. 419. This equation is equivalent to the formulae given by Fresnel at p. 315, Oeuvres, if you consider that ω = 2π/T, with T the period, and but c = λ/T, with λ the wavelength: it follows from all this that ω/c = 2π/λ, and you can do the remaining algebra in (7.24). 420. If you are wondering what it means to do an integral numerically, go back to Chap. 1, notes 19 and 20. 421. There are lots of experiments like this in Fresnel’s Oeuvres, but this is one of the clearest, and or most convincing. In fact, it is explained also, e.g., by Mach, p. 282–3, although I am not quite sure I agree with Mach’s figure 262, and Nahum Kipnis, History of the Principle of Interference of Light, Springer 1991, at p. 187. Kipnis’ figure 37 makes more sense than that of Mach, but then again Kipnis gives very few details of the derivation. 422. And but, again, remember Fourier. 423. Seismologists still tend to stick to simple ray theory, though: if you ask them why they think they can get away with that, they’ll probably say that’s because the earth is not that heterogeneous, so that most diffraction-like effects are small. There are also seismologists that disagree with this view, though, and do things in more sophisticated ways. 424. “Mémoire sur la Loi des Modifications que la Réflexion Imprime à la Lumière Polarisée”, read to the Academy of Sciences, January 7, 1823; you can find it in Oeuvres, p. 767. 425. Cargill Gilston Knott, “Reflexion and Refraction of Elastic Waves, with Seismological Applications”, London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, vol. 48, 1899. In a short intro to the article, Knott says he is publishing it “at Lord Kelvin’s suggestion”—Kelvin being the main “editor”, as we would say today, of the Philosophical Magazine. The first part of this article reproduces “a paper I published eleven years ago in the Transactions of the Seismological Society of Japan. [Like, e.g., John Milne, see note 309, Knott is one of those British scientists who were hired by the Japanese government in the late eighteen hundreds, and then spent in Japan a big chunk of their careers.] This Society ceased to exist some years ago; a fact which may serve as a further reason for reproducing a paper, in which the problem of the behaviour of an elastic wave incident on the interface of rock and water was for the first time fully worked out,” etc. But then in this 1899 version Knott adds two new sections: “Part II. contains detailed numerical calculations for rock-rock interface and for rock-air interface, similar to the calculations for rock-water interface in Part I. Part III. gives the mathematical investigation and the various sets of formulas on which these calculations are based.”
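Since Snell’s law and critical incidence will do a lot of work in what follows, here is a minimal numerical sketch of the bookkeeping (the function name and the velocity values are mine, chosen as round crust-vs-mantle-like numbers, not taken from Knott):

```python
import math

def refraction_angle_deg(theta1_deg, v1, v2):
    """Snell's law, sin(theta2)/sin(theta1) = v2/v1.
    Returns the refraction angle in degrees, or None past the
    critical angle (no refracted wave: total reflection)."""
    s = math.sin(math.radians(theta1_deg)) * v2 / v1
    if s > 1.0:
        return None
    return math.degrees(math.asin(s))

v1, v2 = 6.0, 8.0                                # hypothetical P velocities, km/s
critical_deg = math.degrees(math.asin(v1 / v2))  # refracted ray grazes the interface
print(refraction_angle_deg(20.0, v1, v2))        # ~27.1 degrees
print(critical_deg)                              # ~48.6 degrees
print(refraction_angle_deg(60.0, v1, v2))        # None
```

At exactly the critical angle the refracted ray travels along the interface at the faster, lower-medium speed; that grazing configuration is the geometric picture behind the head waves of note 398.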
426. For those of you who know a bit about the physics of light and electromagnetic waves and all that: the main problem with the analogy of light and seismic waves is that (even if we decide to believe in the idea of light as a mechanical wave propagating in ether) light is necessarily made of transverse waves (this is part of what Fresnel discovered and is famous for; but it’s not really relevant to this book, plus I don’t remember, or, to be honest, don’t even know how he figured it out; so this is not something that I am going to cover: sorry), and so the analogy would only work if seismic P waves didn’t exist at all: only S. In that case, there would be no conversion between P to S at interface, and the physics (and math) would be simpler. 427. I’ve mentioned Love’s Treatise on the Mathematical Theory of Elasticity somewhere in Chap. 6 (though in that chapter we looked mostly at his later, shorter Some Problems of Geodynamics). I’ve found in my father’s books (see note 387) a copy of the 1906 edition of the Treatise, so that’s what I am going to refer to. 428. To understand why it’s useful to do what we are going to do for plane waves only, it’s important to consider that, to study what happens at an interface, it should be enough to look at an arbitrarily small portion of it (of the interface). Technically, (7.34) describes a disturbance with a wavefront of infinite extension. But then again, if you take, say, a spherical wavefront that has already propagated to some distance from its source, and you just look at a portion of the wavefront that is small compared to distance from the source, what you are seeing is essentially the same as a portion of a plane wavefront. This wouldn’t be the case if the source were very close to the interface, so that the curvature of the wavefront would still be significant the moment the wavefront hits the interface. 
But then again, you could take a smaller portion of the interface, small enough that curvature becomes negligible, and so on and so forth. Bottom line, we are going to derive formulae for the reflection and refraction coefficients of plane waves, and but they are going to be a good proxy for more general cases; or in any case, a good illustration of the physics of wave reflection and refraction. 429. If you are not convinced, think of an ideal setup where, initially, nothing at all is moving except for a wavelet propagating towards an interface. You might measure the velocity at which particles hit by the wavelet oscillate at any given time, before any reflection/refraction has occurred; sum the squared value of such velocity, multiplied by half the mass of the corresponding particle, over all particles that are moving at that time: and that’s the total kinetic energy of the system—which it’s OK to make the simple assumption that the potential energy, whatever that is, is not changing much through the small oscillations caused by the wavelet. So you’ve got the total energy of the system. And now do the same thing for all reflected and refracted wavelets that propagate around, at any moment in time after the wavelet has interacted with the interface. Sum the energy of all those wavelets. Because energy is conserved, if you do everything right you’ll get the same value as before reflection/refraction.
430. Knott (1899, Part III) actually proves, at least for some setups (e.g., “distortional wave at the interface of two elastic solids”, etc.) that if displacement and stress are continuous across an interface, then the energy stays the same, before and after a wavelet hits the interface. 431. SV- and SH-polarized S waves are relatively easy to mathematize, and that’s what you tend to find in textbooks. More in general, an S wave is polarized at some arbitrary angle, somewhere between purely SV and purely SH. That’s the most general case, but it would also be quite messy to work out mathematically, the costs (in pages and brain energy) of doing so probably exceeding the benefits. Just think of all the conversions when such a wave hits a discontinuity... and what if the discontinuity is not horizontal? what if it doesn’t have a regular shape? for all that, today we have numerical methods (which I’ll tell you more about in Chap. 9, note 674). For the time being, let’s stick to trying to understand some basic physics, which is best achieved when the experimental setup is kept simple. 432. Equation (7.63) stipulates that the ratio between the sines of θ₂ and θ₁ coincides with the ratio of the velocities of propagation. This might remind you of Snell’s law, which says that the ratio of the velocities coincides with the ratio between the sines of the angles of incidence and refraction. 433. If you consider that 1 − 2 sin²θ₂ = cos²θ₂ + sin²θ₂ − 2 sin²θ₂ = cos²θ₂ − sin²θ₂ = cos(2θ₂). 434. In case you are wondering what on earth that imaginary part is supposed to be, if u is a displacement, and a displacement is a real value that you can measure, which cannot possibly have an imaginary part: well, that’s a very good question, actually, but remember Chap. 6, when we talked about Rayleigh waves? You might recall that there are two ways to think about this.
For one, if a complex displacement like (7.90) solves the equation of motion, then that means that both its real and imaginary parts solve it; and the same is true for the boundary conditions. So, if we have a complex solution like (7.90)—and you can check if you are not sure: (7.90) does solve the Navier-Cauchy equation (6.167)—then automatically we also have a real solution: its real part; and in the following we can think of the real part of whatever expressions we find as the physically meaningful solution, and forget about the imaginary one. Another way to look at this (which might or might not help you, but I’ll throw it at you just in case) is, think of (7.90) as sort of the main ingredient of the Fourier transform, in both time and space, of a more general function u(x, t). To make this clearer, actually, let’s introduce frequency ω = −kc and wavenumber κ = kp. Then, (7.90) takes the form

u(x, t) = A e^(i(κ·x + ωt)).   (N.105)
You see that this—(7.90) or (N.105)—is monochromatic, i.e., a sinusoidal function of x and t with given frequency ω and wavenumber κ. Now, forget about monochromatic waves, for a moment, and think of a real-world measurement of displacement in time and space (made, e.g., with an
array of seismometers). Observed displacement is, obviously, a real-valued function of x and t. Imagine you Fourier-transform it in both time and space. Then, what you get is a complex function of ω and κ; call it A(ω, κ). The inverse Fourier transform of A(ω, κ), Eq. (N.68), in both time and space, reads (all four integrals running from −∞ to ∞)

(1/(2⁴π⁴)) ∫ dκ₁ ∫ dκ₂ ∫ dκ₃ ∫ dω A(ω, κ) e^(iκ·x) e^(iωt),   (N.106)

which, if you do the integrals, should give you your observed, real-valued displacement function back. You see that (N.106) is just the right-hand side of (N.105), integrated over frequency and all three components of the wavenumber. The point that I am trying to make is, I guess, that (7.90) or (N.105) might be a complex-valued function, which apparently doesn’t make sense: but if you know the value of A in (7.90) for all values of κ and ω, then what you have is the Fourier transform of the displacement. And the function A(ω, κ) should be such that its inverse Fourier transform—the actual displacement one observes in the field—is real. Or in other words: you can think of the complex displacement (7.90) or (N.105) as one monochromatic contribution to a real displacement field; its imaginary part is necessary to describe things mathematically, but doesn’t need to reflect any imaginary number observed in the real world—which wouldn’t make any sense. 435. OK, now (7.93) also has a nonzero imaginary part, but see note 434. 436. K. Aki and P. G. Richards, Quantitative Seismology—Second Edition, University Science Books, 2002. I am not sure how it is now, but this is the book that my generation were told you have to read if you want to be a seismologist. It was first published by Freeman in 1980, and had become hard to find by the time its second edition came out. OK, there’s also Ewing et al.’s—Ewing, W. M., W. S. Jardetzky and F. Press, Elastic Waves in Layered Media, McGraw-Hill, 1957—which is a pretty detailed work as well, and some of you might prefer it over Aki and Richards’, I guess. 437. It’s all in Sect. 5.3 of the second edition. In particular, take a look at what they say from the end of page 153 to the beginning of page 157. 438. Nota Bene, though, to get Rayleigh waves “you need all the plane waves interacting with the boundary [to be] inhomogeneous waves”, so you can’t really say, from this, that Rayleigh waves are born of P and SV body waves impinging on a free surface... 439. Equation (5.56) of the second edition. 440. “Elastic Waves at the Surface of Separation of Two Solids”, Proceedings of the Royal Society A, vol. 106, 1924. 441. Other than Stoneley waves, there are quite a few other contributions from Stoneley, which you can read about, e.g., in the bio that Harold Jeffreys (see note 323) wrote for the Royal Society when his friend passed away in 1976. The two appear to have been very close, judging also from the “reminiscences” published by Bertha Swirles Jeffreys in Geophysical Journal International, 1994: “Robert Stoneley entered St John’s College, Cambridge, in 1912, two years after Harold Jeffreys, and they soon formed a friendship that Harold greatly valued throughout his life. [...] Their first cycling tour in 1915 was to the Isle of Wight, a pilgrimage to the Observatory of John Milne who died in 1913. This was followed by other tours and they went to Harold’s home in County Durham. After the war, when Harold’s father visited us, he said to me, after a happy afternoon in the Stoneleys’ house and garden, ‘One of the salt of the earth’.” Stoneley was a lecturer in mathematics at the universities of Sheffield, then Leeds, then Cambridge, where he stayed until his retirement in 1961; then he actually moved to the U.S. where he worked for the Coast and Geodetic Survey in D.C., and then as a prof. of geophysics at Pittsburgh University. According to Harold Jeffreys’ bio, “we were at a conference at Stuttgart once, where in post-war reconstruction the Germans had scraped together all the rubble from demolished buildings and piled it outside the town as a memorial. He deprecated this, saying that it should have been used in making foundations for new buildings. Presumably his father’s business [Stoneley’s father, also Robert, had been a builder] inspired this remark.” 442. Which begins with section 6.1 of their book. 443. Aki and Richards (p. 189 of the second edition): “When the spherical wave interacts with a plane boundary between two different half-spaces, the resulting wave systems can naturally be divided into three major types: (i) waves that are directly reflected from or transmitted through the boundary; (ii) waves that travel from source to receiver via a path involving refraction along the boundary at a body-wave speed (head waves); and (iii) waves of Rayleigh or Stoneley type, with amplitude decaying exponentially with distance from the interface.” 444. Jeffreys, H., “On Compressional Waves in Two Superposed Layers”, Mathematical Proceedings of the Cambridge Philosophical Society, vol. 23, 1926. 445. To distinguish it from other, less prominent head waves that are sometimes observed, traveling along other, less prominent discontinuities: see note 398. 446. The bibliographic references that are usually given (besides Bateman’s paper, which doesn’t get mentioned as often, but we’ll get to that in a section) are: G. Herglotz, “Über das Benndorfsche Problem der Fortpflanzungsgeschwindigkeit der Erdbebenstrahlen”, Physikalische Zeitschrift, vol. 8, 1907, and/or E. Wiechert and L. C. Geiger, “Die Bestimmung des Weges der Erdbebenwellen im Erdinnern”, Physikalische Zeitschrift, vol. 11, 1910. I haven’t been able to find copies of those papers, plus I am not that great at reading German. Anyway, after some googling around, my guess is that Gustav Herglotz, who was more of a physicist, figured out the theory (his main intuition being that an old formula from Abel, i.e., Eqs. (7.103) and (7.104) here, which we shall see in a second, could be used to manipulate (7.99) usefully), while Emil Wiechert, who was more of an earth scientist, applied the theory to data from seismometers and started deriving models of the earth. Both were profs. at Göttingen at the time their papers were published, though Herglotz then moved to Vienna in 1908.
Wiechert, incidentally, was Beno Gutenberg's798 teacher and thesis supervisor. Gutenberg would later use the same method as Herglotz and Wiechert to derive some models of the earth that would become the reference for most of his contemporaries... and are actually not that far from the reference models we look at today. Harry Bateman, an English mathematician, had the same idea as Herglotz, and worked it out by himself, while he was a young lecturer at the universities of Liverpool and Manchester; in 1910 he published a short paper with a very long title, "The solution of the integral equation connecting the velocity of propagation of an earthquake wave in the interior of the earth with the times which the disturbance takes to travel to the different stations on the earth's surface" (Philosophical Magazine, vol. 19). At the very end of the paper, which bears the date of January 14, 1909, there's a "note added Feb. 28th, 1910.—It has been pointed out to me that the problem has been treated from a similar point of view by G. Herglotz in a short note, Phys. Zeitschr. 1907, p. 145." Also in 1910, Bateman moved to the U.S. where he taught at Bryn Mawr, Johns Hopkins, and eventually Caltech. I looked him up and the first thing I noticed, on Wikipedia, was a quote of his Caltech colleague Theodore von Kármán (one of the founders of the Jet Propulsion Laboratory, apparently): "In 1926 Caltech had only a minor interest in aeronautics. The professorship that came nearest to aeronautics was occupied by a shy, meticulous Englishman, Dr. Harry Bateman. He was an applied mathematician from Cambridge who worked in the field of fluid mechanics. He seemed to know everything but did nothing important. I liked him." There's also the obituary that F. D. Murnaghan (whom we'll meet again) published in the Bulletin of the American Mathematical Society.
“In 1914”, he says, “I was awarded a Traveling Studentship in Mathematical Physics by the National University of Ireland and was looking about for some place to study. My professor, A. W. Conway, told me that there was a young man, Bateman, at Hopkins and that he thought that I could not do better than study with him. I followed this advice and, looking back over a third of a century, I judge the advice to have been sound. Bateman, a frail slight man of 32, was lecturing on The Absolute Calculus and Electrodynamics [...]. [S]ix students started the course and by March I was, if my memory is correct, the only student. I do not think that this diminution of the size of his class bothered the lecturer very much, and I have sometimes thought that if the vicissitudes of student life had prevented my attendance, the lecture would have been none-the-less delivered. [...] I remember [...] a feeling of amazement, mingled with discouragement, which came over me when I discovered the thoroughness of the man. He already possessed a large, carefully indexed card-catalogue on each card of which was written in his minute, but beautifully clear, handwriting an abstract of a paper which he had read. I am told that in later years this card catalogue crowded him out of his office and almost out of his home. [...] His memory was phenomenal. No matter what stubborn integral or intractable differential equation you
showed him, a moment's thought and a reference to the card catalogue never failed to produce something useful."
447. I am talking about numerical differentiation: because these are numerical data, and we know nothing of the analytical form of the functions, which probably doesn't even exist. So, e.g., given two data points, $p_1(t_1)$ and $p_2(t_2)$, the ratio $\frac{p_2 - p_1}{t_2 - t_1}$ is a (rough) estimate of the derivative of $p$ with respect to $t$, between $t_1$ and $t_2$. There are better ways to do this, but you get the idea.
448. A Norwegian prodigy mathematician who died from tuberculosis at age 26. He did all of his work, which is quite a bit of work, in just a few years. According to Asimov, "Abel, the son of an alcoholic pastor, lived his life in poverty and [...] had to support the family when his father died, but he managed to attend the University of Christiania (now Oslo) in Norway's chief city. There a teacher recognized his talent, encouraged him, and helped him financially", etc.
449. But if you are ready for the proof, here it is: let us start out by dividing both sides of (7.103) by $\sqrt{x - \eta}$, for some arbitrary $\eta$, and integrating over $x$ from $a$ to $\eta$,
$$\int_a^\eta \frac{h(x)}{\sqrt{x-\eta}}\,dx = -\int_a^\eta dx \int_a^x d\xi\,\frac{f(\xi)}{\sqrt{\xi - x}}\frac{1}{\sqrt{x-\eta}},\qquad(N.107)$$
where we've also swapped the integration limits in (7.103). That maybe doesn't look like that much of an improvement, but wait. At the right-hand side of (N.107) we are supposed to integrate over $\xi$ from $a$ to $x$, and repeat that for all values of $x$ between $a$ and $\eta$, then "sum" over $x$. That is illustrated by the first diagram in Fig. N.28: you see that what we're doing is we are integrating $\frac{f(\xi)}{\sqrt{\xi-x}}\frac{1}{\sqrt{x-\eta}}$, which is a function of $x$ and $\xi$, over the shaded triangle. If one integrates the same function first over $x$ between $\xi$ and $\eta$, repeats that for all values of $\xi$ between $a$ and $\eta$ and then "sums" (second diagram in Fig. N.28), the same result is obtained. Which is the same as saying that (N.107) can be rewritten
$$\begin{aligned}\int_a^\eta \frac{h(x)}{\sqrt{x-\eta}}\,dx &= -\int_a^\eta d\xi \int_\xi^\eta dx\,\frac{f(\xi)}{\sqrt{x-\eta}\sqrt{\xi-x}}\\ &= -\int_a^\eta d\xi\, f(\xi) \int_\xi^\eta dx\,\frac{1}{\sqrt{x-\eta}\sqrt{\xi-x}}.\end{aligned}$$
Fig. N.28 Change of integration limits in Abel's problem
Now, probably that wasn't that easy even for Abel to figure out, but if you replace the integration variable $x$ with a new one, let's call it $\gamma$, such that $x = \xi\cos^2\gamma + \eta\sin^2\gamma$, then: $dx = 2(\eta-\xi)\cos\gamma\sin\gamma\,d\gamma$; $x = \xi$ if $\gamma = 0$ and $x = \eta$ if $\gamma = \pi/2$; and something funny happens:
$$\begin{aligned}\int_a^\eta \frac{h(x)}{\sqrt{x-\eta}}\,dx &= -\int_a^\eta d\xi\, f(\xi) \int_0^{\pi/2} \frac{2(\eta-\xi)\sin\gamma\cos\gamma\,d\gamma}{\sqrt{(\xi-\eta)\sin^2\gamma}\,\sqrt{(\xi-\eta)\cos^2\gamma}}\\ &= 2\int_a^\eta d\xi\, f(\xi) \int_0^{\pi/2} d\gamma\\ &= \pi\int_a^\eta d\xi\, f(\xi).\end{aligned}$$
If we differentiate both sides with respect to $\eta$, we find that
$$\begin{aligned}f(\eta) &= \frac{1}{\pi}\frac{d}{d\eta}\int_a^\eta \frac{h(x)}{\sqrt{x-\eta}}\,dx\\ &= -\frac{1}{\pi}\frac{d}{d\eta}\int_\eta^a \frac{h(x)}{\sqrt{x-\eta}}\,dx,\end{aligned}$$
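Note 447's two-point difference, and the "better ways" it alludes to, can be sketched numerically. The travel-time curve below is made up, just so there is something to differentiate; none of the numbers come from the book.

```python
import numpy as np

# Hypothetical travel-time samples t(x): a smooth made-up curve standing in
# for "numerical data" whose analytical form we pretend not to know.
x = np.linspace(100.0, 1000.0, 10)           # distances, km
t = 40.0 * np.sqrt(1.0 + (x / 200.0) ** 2)   # travel times, s

# Note 447's two-point estimate of the derivative between consecutive samples:
slope = (t[1:] - t[:-1]) / (x[1:] - x[:-1])

# One of the "better ways": central differences (np.gradient), second-order
# accurate away from the endpoints.
central = np.gradient(t, x)

# Since here we secretly do know t(x), we can compare with the true dt/dx:
true_slope = x / (1000.0 * np.sqrt(1.0 + (x / 200.0) ** 2))
print(np.max(np.abs(central - true_slope)))  # small
```

The two-point ratio estimates the slope between samples; the central difference estimates it at the samples, and does noticeably better for the same data.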
which, if you replace $\eta$ with $\xi$ (that's just a matter of how you call things) is precisely Eq. (7.104), QED.
450. Incidentally, Eq. (7.107) here is exactly the same as Eq. (9.34) in Aki and Richards' book.
451. OK, I know I give proofs for almost everything in this book, except trigonometry. But maybe I'll skip this one, if you don't mind? It's easy to look up, though, I think: you'll find videos on Youtube, and all that.
452. To be honest, the earth is spherical, and that's particularly noticeable if your epicenter and receiver are very far from one another, so in most textbooks you're likely to find the spherical-earth version of Herglotz-Wiechert's formula. The derivation goes as follows. The main point is that, in a sphere made up of concentric shells each with its own seismic velocity, which is roughly what the earth is, the ratio $\sin\theta/v$ is not constant along a ray path. Because, think of a spherical shell with uniform seismic velocity: ray paths within it are straight, but its boundaries are curved, and so the angle formed by the ray with the outer boundary of the shell isn't the same as the angle formed with the inner boundary.
Fig. N.29 Snell's law in spherical geometry. C is the center of the earth. Seismic velocity has the value v0 in the outer shell and v1 in the central sphere
Look at Fig. N.29, though, and write the law of sines (see note 723) for the triangle whose vertices are A, B, C:
$$\frac{BC}{\sin\alpha} = \frac{AC}{\sin\beta} = \frac{AB}{\sin\gamma}.$$
In our case, $\alpha$ is the incidence angle of the ray path at the outer boundary of the shell, let's call it $\theta_0$; the incidence angle at the inner boundary is $\pi - \beta$, let's call it $\theta_0'$; let's also call $r_0$ the outer radius of the shell, which coincides with $AC$, and $r_1$ its inner radius (equal to $BC$): substituting all this into the law of sines,
$$\begin{aligned}\frac{r_1}{\sin\theta_0} &= \frac{r_0}{\sin(\pi - \theta_0')}\\ &= \frac{r_0}{\sin\theta_0'}.\end{aligned}\qquad(N.108)$$
As for Snell's law, said $v_0$ and $v_1$ the seismic velocities in the shell we're looking at and in the one immediately below it, respectively, it is still true that
$$\frac{\sin\theta_0'}{\sin\theta_1} = \frac{v_0}{v_1},$$
and but notice that of course at the left-hand side we have the sine of $\theta_0'$, and not $\theta_0$. If we solve (N.108) for $\sin\theta_0'$ and plug the result into Snell's law as we've just written it,
$$\frac{r_0\sin\theta_0}{r_1\sin\theta_1} = \frac{v_0}{v_1},$$
or
$$\frac{r_0\sin\theta_0}{v_0} = \frac{r_1\sin\theta_1}{v_1},$$
and it is inferred that in a layered spherical earth, what is constant along a ray path is not $\sin\theta/v$, but $r\sin\theta/v$. And so the ray parameter is defined
$$p = \frac{r\sin\theta}{v}.\qquad(N.109)$$
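The bookkeeping above is easy to check numerically: apply the law-of-sines step (N.108) within each shell and Snell's law at each boundary, and $r\sin\theta/v$ comes out the same at every depth. The shell radii and velocities below are made up for illustration.

```python
import numpy as np

# A toy spherically layered model: shell boundaries (km) and the velocity
# (km/s) in each shell, top to bottom. Values are arbitrary.
radii = np.array([6371.0, 6000.0, 5500.0, 5000.0])
v     = np.array([8.0, 9.0, 10.0])

theta = np.radians(30.0)              # incidence angle at the surface
p = radii[0] * np.sin(theta) / v[0]   # ray parameter, Eq. (N.109)

for i in range(len(v) - 1):
    # Within shell i the ray is straight; the law-of-sines step (N.108)
    # converts the angle at the shell's top to the angle at its bottom:
    sin_bottom = radii[i] * np.sin(theta) / radii[i + 1]
    # Snell's law at the boundary between shells i and i+1:
    theta = np.arcsin(sin_bottom * v[i + 1] / v[i])
    # r*sin(theta)/v is the same number at every boundary:
    print(radii[i + 1] * np.sin(theta) / v[i + 1])  # equals p each time
```

Each printed value equals the surface value of $p$: the geometric factor from (N.108) and the velocity factor from Snell's law cancel exactly.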
Now let's look again at Fig. N.29; within some approximation, which is OK as long as the shell is not too thick,
$$AB \approx \frac{r_0 - r_1}{\sin\left(\frac{\pi}{2} - \theta_0\right)} = \frac{r_0 - r_1}{\cos\theta_0},$$
and so
$$r_1\gamma \approx AB\,\cos\left(\frac{\pi}{2} - \theta_0\right) = \frac{r_0 - r_1}{\cos\theta_0}\sin\theta_0.$$
This relationship is true independent of whether $r_0$ and $r_1$ are discontinuity radii, and if we bring it to the limit where $\gamma$ is very small and $AB$ very short, etc., we might write
$$d\Delta = \frac{\sin\theta}{r\cos\theta}\,dr,$$
where I've decided to call $d\Delta$ the increment in (angular) distance from source to receiver. Similar to what we've done in the flat-earth approximation, let's write $d\Delta$ in terms of $p$: it follows from (N.109) that $\sin\theta = vp/r$, and so
$$\begin{aligned}d\Delta &= \frac{vp}{r^2\sqrt{1 - \frac{v^2p^2}{r^2}}}\,dr\\ &= \frac{p}{r\sqrt{\frac{r^2}{v^2} - p^2}}\,dr,\end{aligned}$$
which is the spherical-earth equivalent of Eq. (7.98), and
$$\Delta(p) = 2\int_{r_B(p)}^{r_{\mathrm{earth}}} \frac{p}{r\sqrt{\frac{r^2}{v^2} - p^2}}\,dr,\qquad(N.110)$$
where of course $r_{\mathrm{earth}}$ means the radius of the earth, while $r_B(p)$ is the radius of the "bottoming point" for the ray path of parameter $p$. You'll agree that (N.110) looks a lot like (7.99); but it's not quite the same thing because of that $r$ at the denominator... Anyway, what we're going to do next is not that different from what we've done in the flat-earth case. If we divide both sides by $2p$ and change the integration variable from $r$ to $\xi = r^2/v^2(r)$, then
$$\frac{\Delta(p)}{2p} = \int_{p^2}^{\xi_0} \frac{1}{\sqrt{\xi - p^2}}\,\frac{1}{r}\frac{dr}{d\xi}\,d\xi,$$
where I've called $\xi_0$ the ratio $\frac{r_{\mathrm{earth}}^2}{v^2(r_{\mathrm{earth}})}$, and plus at the bottoming point $\xi = \frac{r_B^2(p)}{v^2(r_B(p))} = p^2$, because the incidence angle, there, is $\frac{\pi}{2}$. Now, what we've gotten is very similar to (7.105), and/or to Abel's Eq. (7.103), and we can do the Abel thing like we did before, to find, through Eq. (7.104), that
$$\frac{1}{r}\frac{dr}{d\xi} = -\frac{1}{\pi}\frac{d}{d\xi}\int_\xi^{\xi_0} \frac{\Delta(p)}{2p\sqrt{p^2 - \xi}}\,d(p^2).\qquad(N.111)$$
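Jumping ahead a little, this whole machinery can be tested on a case where everything is known in closed form: in a sphere with uniform velocity, rays are straight chords, so $\Delta(p) = 2\arccos(pv/r_{\mathrm{earth}})$. The sketch below (all numbers made up) evaluates the forward integral (N.110) numerically and checks it against the chord formula, and then feeds the closed-form "data" to the working formula, Eq. (N.113), that this derivation is about to produce, to see that the bottoming radius comes back out.

```python
import numpy as np
from scipy.integrate import quad

# Uniform-velocity sphere: radius and velocity are arbitrary round numbers.
R, v = 6371.0, 10.0

def delta_forward(p):
    """Eq. (N.110), integrated numerically; the ray bottoms at r_B = p*v."""
    rB = p * v
    integrand = lambda r: p / (r * np.sqrt(r**2 / v**2 - p**2))
    val, _ = quad(integrand, rB, R)  # quad never samples the singular endpoint
    return 2.0 * val

p = 0.7 * R / v                          # a ray that bottoms at 0.7*R
chord = 2.0 * np.arccos(p * v / R)       # straight-chord prediction
forward_err = abs(delta_forward(p) - chord)

# Inversion: the "data" are p(Delta) = (R/v)*cos(Delta/2); Eq. (N.113)
# should recover the bottoming radius of the ray with parameter p1.
p1 = 0.5 * R / v                         # ray that bottoms (we hope) at R/2
delta1 = 2.0 * np.arccos(p1 * v / R)
p_of_delta = lambda d: (R / v) * np.cos(d / 2.0)
# maximum() guards against the ratio dipping below 1 by rounding at d=delta1:
integrand = lambda d: np.arccosh(np.maximum(p_of_delta(d) / p1, 1.0))
integral, _ = quad(integrand, 0.0, delta1)
r_min = np.exp(np.log(R) - integral / np.pi)

print(forward_err, r_min)  # r_min ≈ R/2 = 3185.5
```

The round trip closes: the forward integral reproduces the chord geometry, and the inversion gives back $r_{\min} = p_1 v$.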
The next few steps, that will give us an equation equivalent to (7.107), are a bit more convoluted than in the flat-earth case. Anyway, it goes like this. Integrate both sides of (N.111) over $\xi$ from $r_{\min}^2/v^2(r_{\min})$ to $r_{\mathrm{earth}}^2/v^2(r_{\mathrm{earth}})$, where $r_{\min}$ is some arbitrary (for the time being) radius, obviously smaller than $r_{\mathrm{earth}}$;
$$\int_{\frac{r_{\min}^2}{v^2(r_{\min})}}^{\frac{r_{\mathrm{earth}}^2}{v^2(r_{\mathrm{earth}})}} \frac{1}{r}\frac{dr}{d\xi}\,d\xi = -\frac{1}{\pi}\int_{\frac{r_{\min}^2}{v^2(r_{\min})}}^{\frac{r_{\mathrm{earth}}^2}{v^2(r_{\mathrm{earth}})}} d\xi\,\frac{d}{d\xi}\int_\xi^{\xi_0} \frac{\Delta(p)}{2p\sqrt{p^2 - \xi}}\,d(p^2).$$
Now, first, let's do the integral at the left-hand side,
$$\int_{\frac{r_{\min}^2}{v^2(r_{\min})}}^{\frac{r_{\mathrm{earth}}^2}{v^2(r_{\mathrm{earth}})}} \frac{1}{r}\frac{dr}{d\xi}\,d\xi = \int_{r_{\min}}^{r_{\mathrm{earth}}} \frac{dr}{r} = \ln\frac{r_{\mathrm{earth}}}{r_{\min}}.$$
At the right-hand side,
$$-\frac{1}{\pi}\int_{\frac{r_{\min}^2}{v^2(r_{\min})}}^{\frac{r_{\mathrm{earth}}^2}{v^2(r_{\mathrm{earth}})}} d\xi\,\frac{d}{d\xi}\int_\xi^{\xi_0} \frac{\Delta(p)}{2p\sqrt{p^2 - \xi}}\,d(p^2) = -\frac{1}{\pi}\int_{\frac{r_{\mathrm{earth}}^2}{v^2(r_{\mathrm{earth}})}}^{\xi_0} \frac{\Delta(p)}{2p\sqrt{p^2 - \frac{r_{\mathrm{earth}}^2}{v^2(r_{\mathrm{earth}})}}}\,d(p^2) + \frac{1}{\pi}\int_{\frac{r_{\min}^2}{v^2(r_{\min})}}^{\xi_0} \frac{\Delta(p)}{2p\sqrt{p^2 - \frac{r_{\min}^2}{v^2(r_{\min})}}}\,d(p^2);$$
but remember that $\xi_0 = r_{\mathrm{earth}}^2/v^2(r_{\mathrm{earth}})$ and so the first integral at the right-hand side is null, and
$$-\frac{1}{\pi}\int_{\frac{r_{\min}^2}{v^2(r_{\min})}}^{\frac{r_{\mathrm{earth}}^2}{v^2(r_{\mathrm{earth}})}} d\xi\,\frac{d}{d\xi}\int_\xi^{\xi_0} \frac{\Delta(p)}{2p\sqrt{p^2 - \xi}}\,d(p^2) = \frac{1}{\pi}\int_{\frac{r_{\min}^2}{v^2(r_{\min})}}^{\xi_0} \frac{\Delta(p)}{2p\sqrt{p^2 - \frac{r_{\min}^2}{v^2(r_{\min})}}}\,d(p^2) = \frac{1}{\pi}\int_{p_1}^{p_0} \frac{\Delta(p)}{\sqrt{p^2 - p_1^2}}\,dp,$$
where I have changed the integration variable $p^2$ to $p$; I've also figured that, because $p$ is constant along a ray path, $p$ must coincide with the value of $r/v(r)$ at the bottoming point of the ray path, where the incidence angle is $\pi/2$ and its sine is 1. But so then the integration limit $r_{\min}/v(r_{\min})$ coincides with the value of $p$ associated with the ray that bottoms at $r_{\min}$, which let's call it $p_1$, and the limit $r_{\mathrm{earth}}/v(r_{\mathrm{earth}})$ corresponds to a "ray path" that bottoms at the surface of the earth, i.e., like, the "ground zero" of $p$, which I've called $p_0$. Bottom line, we've reduced (N.111) to
$$\ln\frac{r_{\mathrm{earth}}}{r_{\min}} = \frac{1}{\pi}\int_{p_1}^{p_0} \frac{\Delta(p)}{\sqrt{p^2 - p_1^2}}\,dp.\qquad(N.112)$$
Just like in the flat-earth case, we are not done yet, because the integrand in (N.112) is infinite at $p = p_1$, etc. We can integrate by parts, though, again like for the flat earth, and
$$\begin{aligned}\ln\frac{r_{\mathrm{earth}}}{r_{\min}} &= \frac{1}{\pi}\int_{p_1}^{p_0} \frac{\Delta(p)}{\sqrt{p^2 - p_1^2}}\,dp\\ &= \frac{1}{\pi}\left[\Delta(p)\cosh^{-1}\left(\frac{p}{p_1}\right)\right]_{p_1}^{p_0} - \frac{1}{\pi}\int_{p_1}^{p_0} \cosh^{-1}\left(\frac{p}{p_1}\right)\frac{d\Delta}{dp}\,dp.\end{aligned}$$
But: $\cosh^{-1}(1) = 0$; $\Delta(p_0) = 0$ because a path that bottoms at the surface of the Earth can't propagate at any distance from the source; and at the right-hand side we might switch the variable of integration from $p$ to $\Delta$, with, again, $\Delta(p_0) = 0$. We're left with
$$\ln\frac{r_{\mathrm{earth}}}{r_{\min}} = \frac{1}{\pi}\int_0^{\Delta(p_1)} \cosh^{-1}\left(\frac{p}{p_1}\right)d\Delta,$$
or
$$\ln(r_{\min}) = \ln(r_{\mathrm{earth}}) - \frac{1}{\pi}\int_0^{\Delta(p_1)} \cosh^{-1}\left(\frac{p}{p_1}\right)d\Delta.\qquad(N.113)$$
Equation (N.113) is the spherical-earth equivalent of (7.110), and it is used pretty much in the same way. Remember, again, that $p$ can be calculated for all data points in a sufficiently dense set of observations $t = t(\Delta)$ of the travel time of a seismic phase; i.e., seismic data à la Oldham, etc., can be provided as a table of values of $\Delta$ versus $p$. Let's say we select a value for $p_1$, and we use whatever estimate we have of $v(r_{\mathrm{earth}})$, from lab experiments made on rocks collected at the earth's surface, to plug a value into $p_0$. Then, we have all we need to evaluate numerically the right-hand side of (N.113). And what we get is $\ln(r_{\min})$, that is, in practice, the radius of the bottoming point of the ray whose parameter is $p_1$. And if we divide $r_{\min}$ by $p_1$, we find the value, $v(r_{\min})$, of seismic velocity at $r_{\min}$: meaning, we can add a point to a seismic velocity profile $v = v(r)$. Then, we can do the same for another value of $p_1$: and so on and so forth until we are happy with how detailed our curve $v = v(r)$ is. Just like in the flat-earth case, the trick fails if you've got a layer of significant thickness where velocity decreases instead of growing with depth.
453. The integrand, now, is well behaved: because, yes, $\cosh^{-1}(x)$ blows up at $x = 0$, but $p$ is not 0 when $\Delta = 0$, so...
454. "On Earthquake Waves VIIA. Observations on Recordings of Remote Earthquakes in Göttingen, and Inferences on the Structure of the Earth".
455. A promising German geophysicist who died very young: "Karl Bernhard Zöppritz", says the website of the Göttingen seismic station, "died in 1908 at the age of 26 years of an infectious disease, which he had contracted the winter before. Unfortunately, many of his research findings had not yet been published. This was taken care of by Wiechert and his Göttingen-based colleagues Ludwig Geiger and Beno Gutenberg.
The outstanding foresight and grasp of young Karl Zöppritz clinched the decision of the German Geophysical Society (DGG) to name the Karl Zöppritz Prize after him, which was awarded for the first time in 2003 to honour the outstanding achievements of young geophysicists."
456. Geophysical Supplements to the Monthly Notices of the Royal Astronomical Society, vol. 371, 1926.
457. If you feel that you need to know how ocean tides actually work, quantitatively, then this note is for you. Start by looking at the sketch in Fig. N.30.
Fig. N.30 The earth and moon. C is the center of the earth, M the center of the moon, B the center of mass of the earth-moon system
The earth and the moon orbit about their center of mass B, with angular velocity ω; the earth is so much bigger and heavier than the moon that B is actually inside the earth. The force that makes the tides is observed—by us, living on earth—in a reference frame that moves with the earth, so it makes sense to do our math in that reference frame. For the time being, though, let's forget about the earth's spin on its axis: we take the axes of our reference frame to always point in the same directions—the frame's orientation with respect to the fixed stars doesn't change. An observer sitting in this frame feels (at least) three forces: gravitational attraction from the earth itself, of course; gravitational attraction from the moon; and because the moon-earth system is rotating, and the observer moves with it, an apparent force, which really is just the effect of observing things from a point that is actually accelerating—that apparent force is called centrifugal force. Like I said, our reference frame is not spinning around the earth's rotation axis, so this centrifugal force is associated with the rotation of the whole earth-moon system around its center of mass: and not with the earth's spin. Now, at the center of the earth, centrifugal acceleration and the moon's attraction perfectly counterbalance one another. (If they didn't, the moon would either fall on us or fly away from us. From this equality, by the way, you can figure out the angular velocity of the earth-moon system—but let's not get into that—I haven't shown you how to calculate centrifugal force, and I don't want to do it just yet.) At other locations on earth they don't; their sum—which people sometimes call "tidal force"—is nonzero, and that's what causes tides. In the reference frame we've chosen, the center of the earth describes a circle around B; and but (because the earth is a rigid body) all other locations in the reference system we've chosen—all possible observers—must describe circles around points that are neither B nor C: each observer around its own point. The best way to understand it is to make a drawing like the one in Fig. N.31.
If the drawing is clear enough, which I hope it is, you should be able to see that all observers attached to our reference frame are experiencing the exact same centrifugal force, which is exactly the same as the centrifugal force experienced by the center of the earth: parallel to the line that connects C, B and M, and pointing away from the moon.
Fig. N.31 The dotted line is the trajectory of the center of the earth, C. The dash-dotted line is the trajectory of an observer O, attached to (the surface of) a sphere whose outer surface always coincides with that of the earth (if you approximate the earth with a sphere), except it does not spin around the earth's axis. Everything else is like in Fig. N.30, of course
Fig. N.32 Black arrows: centrifugal force of the earth-moon system; gray arrows: the moon’s gravitational attraction
Let's think, now, about the force balance at the observation point—call it O. Let's take O to be at the earth's surface, and let's first see what happens when it's between B and M, along the line that connects those two points. Figure N.32 shows the centrifugal force (which, like I told you, always points away from the moon); the gravitational attraction from the moon only (which, needless to say, always points towards the moon); and their sum. The moon's attraction, because of the way gravity works, remember chapter 1, is stronger the closer the observer is to the moon, and so it is at its strongest at the current location of our observer. If we plugged numbers in, we'd see that it turns out that at that location the moon's attraction prevails over the centrifugal force, and their sum points towards the moon. Next, check out (keep looking at Fig. N.32) what happens at A—the exact antipodes of O: there, the magnitude of the moon's attraction is as weak as can be, and weaker than the centrifugal force: and it turns out that their sum points away from the moon. In all this, I haven't even mentioned the gravitational attraction from the earth: but the thing is, that force is exactly the same at the two locations we have considered—only, it points in opposite directions—because it always points towards C. You'll understand better why that makes it irrelevant—to tides—if we switch on the earth's spin around its own axis. The earth's spin is much faster than the rotation of the moon-earth system around B (which, from our earth-centered perspective, is the same thing as the orbiting of the moon around us, and which takes a month). During the time it takes the earth to complete one full revolution around its axis (one day), M approximately remains in the same place: and so, an observer who's glued to the earth's surface, and at some point passes through O, will pass through A just about twelve hours later. That observer experiences the earth's gravity's pull, of course; but over the course of the day, that pull will always be the same; on the other hand, the additional force that we've just studied will be relatively large at O and A, and weaker elsewhere—at D (look at Fig. N.32 again), for example, it will be close to zero799. Now, finally, let's ask ourselves, what does this small, changing force do to the earth? Well, the figure of the earth is determined by the earth's own gravity, which keeps the planet together, and by the resistance to deformation offered by the materials that form the earth. Tidal force and gravity pull in opposite directions at both O and A, so it's as if gravity were a bit weaker at those points—we aren't squeezed towards the center of the earth as strongly as we are elsewhere. It's a tiny difference, but enough to cause a slight bulge in the figure of the earth. This bulge changes over time the same way that the tidal force does.
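The force balance just described is easy to put into numbers, at least at the two points O and A where everything lies along one line. The constants below are the usual rough values for the earth-moon system (not taken from the book).

```python
import numpy as np

G = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2
M = 7.342e22     # mass of the moon, kg
d = 3.844e8      # mean earth-moon distance, m
R = 6.371e6      # radius of the earth, m

# Centrifugal acceleration in the co-orbiting (non-spinning) frame: the same
# everywhere, equal and opposite to the moon's pull at the earth's center.
a_centrifugal = G * M / d**2

# Net ("tidal") acceleration at O (sublunar point) and at A (its antipodes);
# positive means pointing toward the moon.
a_O = G * M / (d - R)**2 - a_centrifugal   # attraction wins: toward the moon
a_A = G * M / (d + R)**2 - a_centrifugal   # centrifugal wins: away from it

print(a_O, a_A)  # roughly +1.1e-6 and -1.1e-6 m/s^2, vs g = 9.8 m/s^2
```

Both residuals are about a ten-millionth of the earth's own surface gravity, which is why the bulge is "slight"; note they point in opposite directions, giving the two daily tides.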
When a continent goes through (or near) O or A, the effect is essentially zero, because rocks are quite effective at resisting deformation. But when an ocean passes by, water doesn't resist deformation as much, and the differential force is enough to pull it "slightly" away from the center of the earth, causing tides. This "bulging" happens twice a day, at any location on earth, and the effect is stronger depending on where you are relative to the earth's rotation axis, and to the moon (or, which is the same, to O and A). I've tried to sketch this, and the spatial pattern of the tidal force, in Fig. N.33.
Fig. N.33 The tidal force (arrows), location by location, and (solid lines) how the ocean surface responds to it. (Used with permission of Elsevier, from D. C. Agnew, "Earth tides", in Treatise of Geophysics: Geodesy, 2007; permission conveyed through Copyright Clearance Center, Inc)
458. A very complex calculation, done analytically through a lot of complex math, building on Kelvin's paper of 1863, which is already pretty hardcore. What I am going to do here is, I am going to cover just some of Kelvin's/Jeffreys' (and George Darwin's, and A. E. H. Love's) theory, so that you have at least a feeling for how you go from measuring the earth tides to estimating the earth's average μ. I feel bad about not going through the whole thing, but it would end up being such a big addition to a book that maybe is already too long, that I figure that maybe just for this once... Anyway, the first ingredient of the recipe is Poisson's equation, which is a convenient way to describe how gravity works in the earth800. It is based on Newton's so-called "shell theorem", which we met in Chap. 1, and which says that (i) the gravitational attraction from a planet (approximately spherical) is exactly the same no matter how mass is distributed within the planet, provided the distribution is spherically symmetric—density only changes with distance from the center of the planet, that is; and that (ii) if you measure gravitational attraction at a point within the planet, what you observe is the same that you would observe if there were no mass between you and the planet's outer surface; if you are inside the planet, at a distance $r$ from its center, only the gravitational attraction from whatever mass lies at a distance $< r$ from the center matters: the rest cancels out801. So, if $\mathbf{r}_0$ is where the center of the planet is, and $\mathbf{r}$ is where you are, and $M$ is the mass of a spherical chunk of planet, with center at $\mathbf{r}_0$ and radius $|\mathbf{r} - \mathbf{r}_0|$, then, by Newton's gravity law, the pull per-unit-mass that you experience is
$$\mathbf{F} = -\frac{GM}{|\mathbf{r} - \mathbf{r}_0|^2}\,\frac{\mathbf{r} - \mathbf{r}_0}{|\mathbf{r} - \mathbf{r}_0|}.\qquad(N.114)$$
It will be helpful to the algebra we need to do, if we write this force as the gradient of a function—of a "potential". Consider the function
$$U(\mathbf{r}) = \frac{GM}{|\mathbf{r} - \mathbf{r}_0|}.\qquad(N.115)$$
Take its gradient,
$$\begin{aligned}\nabla U(\mathbf{r}) &= GM\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)\frac{1}{|\mathbf{r} - \mathbf{r}_0|}\\ &= -\frac{GM}{|\mathbf{r} - \mathbf{r}_0|^2}\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)\sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2}\\ &= -\frac{GM}{|\mathbf{r} - \mathbf{r}_0|^2}\,\frac{(x - x_0,\; y - y_0,\; z - z_0)}{|\mathbf{r} - \mathbf{r}_0|}\\ &= -\frac{GM}{|\mathbf{r} - \mathbf{r}_0|^2}\,\frac{\mathbf{r} - \mathbf{r}_0}{|\mathbf{r} - \mathbf{r}_0|},\end{aligned}\qquad(N.116)$$
and you see that what you just got is the same thing as the right-hand side of (N.114). So, we've proven that the force-per-unit-mass $\mathbf{F}$ of Newton's law (N.114) is the gradient of the scalar function $U$ defined by (N.115),
$$\mathbf{F} = \nabla U.\qquad(N.117)$$
Technically, $U$ is a (potential) energy per unit mass. Before I use this result, I need to play around with Eq. (N.114) some more. Let me dot-multiply it left and right by $\hat{\mathbf{n}}\,dS$, where $dS$ is an arbitrarily small surface element, and $\hat{\mathbf{n}}$ is the unit vector that's perpendicular to $dS$; then, I am going to integrate both sides over the surface $\partial V$ of the sphere whose center is in $\mathbf{r}_0$ and whose radius is $|\mathbf{r} - \mathbf{r}_0|$. This reads
$$\int_{\partial V} \mathbf{F}\cdot\hat{\mathbf{n}}\,dS = -\int_{\partial V} \frac{GM}{|\mathbf{r} - \mathbf{r}_0|^2}\,\frac{\mathbf{r} - \mathbf{r}_0}{|\mathbf{r} - \mathbf{r}_0|}\cdot\hat{\mathbf{n}}\,dS.\qquad(N.118)$$
But, if the center of $\partial V$ is in $\mathbf{r}_0$, like I've just said, then (i) $|\mathbf{r} - \mathbf{r}_0|$ is constant all over $\partial V$, and (ii) $\frac{\mathbf{r} - \mathbf{r}_0}{|\mathbf{r} - \mathbf{r}_0|}$ and $\hat{\mathbf{n}}$ are the same thing. And but so then
$$\begin{aligned}\int_{\partial V} \mathbf{F}\cdot\hat{\mathbf{n}}\,dS &= -\frac{GM}{|\mathbf{r} - \mathbf{r}_0|^2}\int_{\partial V} \frac{\mathbf{r} - \mathbf{r}_0}{|\mathbf{r} - \mathbf{r}_0|}\cdot\hat{\mathbf{n}}\,dS\\ &= -\frac{GM}{|\mathbf{r} - \mathbf{r}_0|^2}\int_{\partial V} dS\\ &= -\frac{GM}{|\mathbf{r} - \mathbf{r}_0|^2}\,4\pi|\mathbf{r} - \mathbf{r}_0|^2\\ &= -4\pi GM.\end{aligned}\qquad(N.119)$$
Now, by the divergence theorem (remember Chap. 6), the left-hand side of (N.119) can be replaced by
$$\int_{\partial V} \mathbf{F}\cdot\hat{\mathbf{n}}\,dS = \int_V \nabla\cdot\mathbf{F}\,dV,\qquad(N.120)$$
where $V$ is the volume of the sphere bounded by $\partial V$. The right-hand side of (N.119) can be replaced by
$$-4\pi GM = -4\pi G\int_V \rho(|\mathbf{r} - \mathbf{r}_0|)\,dV,\qquad(N.121)$$
where $\rho(r)$ is the earth's density profile. But then, it follows from (N.119) that
$$\int_V \nabla\cdot\mathbf{F}\,dV = -4\pi G\int_V \rho\,dV,\qquad(N.122)$$
or
$$\int_V \left(\nabla\cdot\mathbf{F} + 4\pi G\rho\right)dV = 0.\qquad(N.123)$$
This can be simplified even more if you consider that it has to hold for any spherical volume $V$ whose center is the center of the earth. You can rewrite it as
$$\int_V \left(\nabla\cdot\mathbf{F} + 4\pi G\rho\right)dV + \int_{\delta V} \left(\nabla\cdot\mathbf{F} + 4\pi G\rho\right)dV = 0,\qquad(N.124)$$
where $\delta V$ is a spherical shell of arbitrary thickness, whose inner surface coincides with the outer surface of $V$. By (N.123) the first integral in (N.124) is zero, though, and so
$$\int_{\delta V} \left(\nabla\cdot\mathbf{F} + 4\pi G\rho\right)dV = 0.\qquad(N.125)$$
Now, like I said, the thickness of $\delta V$ is arbitrary, so we can take it as thin as we want, to the point that $\mathbf{F}$ and $\rho$ can be treated as if they were constant within $\delta V$. It follows that for (N.125) to hold, we need to have
$$\nabla\cdot\mathbf{F} + 4\pi G\rho = 0,\qquad(N.126)$$
or
$$\nabla\cdot\mathbf{F} = -4\pi G\rho.\qquad(N.127)$$
This is where we use our result (N.117), which if we sub that into (N.127), we get
$$\nabla^2 U = -4\pi G\rho,\qquad(N.128)$$
which people like to call Poisson's equation. Strictly speaking, it's only valid if the earth is spherically symmetric—otherwise the shell theorem doesn't hold and the derivation I just did doesn't work. But, as we are beginning to learn,
the earth is close enough to being spherically symmetric, so that Poisson's equation is generally considered to be OK. The second thing one needs to do is, basically, take the Navier-Stokes equation and replace the force per-unit-volume $\mathbf{f}$ in it with the force that is caused by gravitational attraction from the moon802. And then what you get is a differential equation—actually, a system of differential equations, that you need to solve. Kelvin did all this in 1863, for a very simple earth model: an incompressible ($\nabla\cdot\mathbf{u} = 0$) sphere of uniform density and rigidity. I am going to show you just the initial steps of how this is done. And I am not going to follow Kelvin's method, because I found a paper published in 1950 by Hitoshi Takeuchi803 that does the same thing but is, I think, easier to read, at least if you are born in the twentieth—or twenty-first, I guess—century. So, anyway, we might start with Navier-Stokes (Chap. 6), which I'll replicate here so you don't have to go back and look for it,
$$\rho\,\frac{\partial^2\mathbf{u}}{\partial t^2} = \nabla\cdot\boldsymbol{\tau} + \mathbf{f}.\qquad\text{(copy of 6.135)}$$
What we want to do is, we want to calculate the effect of gravitational attraction from the moon and/or the sun—to fix ideas, let's say we do the moon—for the sun we'd follow the same procedure anyway, with different numerical values. So, $\mathbf{f}$ should be given by equations (N.115) and (N.117), with $M$ the mass of the moon and $\mathbf{r}_0$ the position of its center: except that in (N.117) we have $\mathbf{F}$, which is a force per unit mass; and so the force-per-unit-volume $\mathbf{f}$ here is density $\rho$ (the earth's density at the point where the moon's pull is observed) times804 the $\mathbf{F}$ of (N.117), i.e.,
$$\mathbf{f} = \rho\mathbf{F} = \rho\nabla U.\qquad(N.129)$$
Now, the moon's attraction that is felt here on earth is very small, when you compare it, at any given location, with the magnitude of the gravitational force from the earth itself. So, it's a good idea then to go with what mathematicians call the "perturbative" approach, i.e., to write each of $\rho$, $\boldsymbol{\tau}$, $\mathbf{f}$, $U$, etc., as the sum of an equilibrium term, which is the value they would have if the moon's gravity wasn't pulling, plus a small perturbation, which is precisely the effect of the moon's gravity pulling. Then one does the math, and the equilibrium terms cancel out, as you shall see, and the terms where you multiply together two or more very small things are even smaller than the rest, and so can be neglected—but let's see this in action. Take density, for instance. If we call $\rho_0$ the equilibrium density,
$$\rho(\mathbf{r}) = \rho_0 - u_r\frac{d\rho_0}{dr} - \rho_0\nabla\cdot\mathbf{u},\qquad(N.130)$$
where u, as usual, is displacement, and u_r is the radial component of displacement⁸⁰⁵. The second term at the right-hand side of (N.130) is saying that the moment the earth is deformed, the density distribution changes because of that deformation⁸⁰⁶; the third term is saying that rocks might be expanded or compressed, as part of the deformation we are talking about. (Remember that δρ/ρ0 = −∇·u: see Chap. 6: Eq. (6.19) and note 316.) Kelvin assumes that the earth is approximately incompressible, and so ∇·u ≈ 0, and

ρ(r) ≈ ρ0 − u_r dρ0/dr.    (N.131)
Kelvin takes ρ0 to be constant throughout the planet, so that dρ0/dr = 0, and then (N.131) boils down to just

ρ(r) ≈ ρ0.    (N.132)
Potential energy per-unit-mass U can also be thought of as the sum of an equilibrium term U0, plus a perturbation δU resulting from the moon's pull. Force per unit volume, then, reads

f = ρ0∇U0 + ρ0∇δU.    (N.133)
Kelvin takes the earth to be perfectly elastic, i.e., the relationship between deformation and stress is as described by Hooke's law—Eq. (6.155). There's a difference, though, with respect to when we were looking at seismic waves, in that now the force that sets the system into motion is gravity, including the gravitational attraction of the earth itself. The displacements that seismologists are concerned with are too small, throughout most of the globe at least, to affect gravity that much: and so in seismology it is often assumed (and was implicitly assumed in Chap. 6) that the earth's gravitational attraction stays unperturbed and is perfectly balanced by the pressure p0 that you'd measure inside the earth just as a result of the planet's own weight⁸⁰⁷—and which would also stay approximately unperturbed. Here, we've got to be more careful, because, like Eq. (N.133) says, the forcing that causes tides is entirely due to perturbations in the gravity field: which, then, can't be neglected. And so the same is true for the associated perturbation in p0. Hooke's law already accounts for the change in pressure that's directly due to the deformation—compression or expansion: shearing doesn't matter right now—but now on top of that we have to consider that, if the earth is deformed, the initial pressure p0 will also change; and so then we should replace Hooke's law with
τij = (−p0 + δp0) δij + μ (∂u_i/∂x_j + ∂u_j/∂x_i).    (N.134)
Plugging (N.132) and (N.133) and (N.134) into Navier-Stokes, we end up with
ρ0 ∂²u/∂t² = ∇·[(−p0 + δp0)I + μ(∇u + (∇u)ᵀ)] + ρ0∇U0 + ρ0∇δU.    (N.135)

Navier-Stokes must hold also at equilibrium, i.e., when u = 0 and δp0 = 0 and δU = 0: so then

0 = ∇·(−p0 I) + ρ0∇U0,    (N.136)

and if we use this in (N.135),

ρ0 ∂²u/∂t² = ∇·[δp0 I + μ(∇u + (∇u)ᵀ)] + ρ0∇δU.    (N.137)
Now, Kelvin figures that earth tides are small and slow enough that acceleration can be neglected. (This is a trick that geodynamicists play quite often: see, e.g., Norman Haskell's paper on postglacial rebound, that we are going to look at in the next chapter. Seismologists, on the other hand, can't do it: the accelerations associated with earthquakes and seismic waves are too big.) So, then

∇·[δp0 I + μ(∇u + (∇u)ᵀ)] + ρ0∇δU = 0,    (N.138)
which, after some algebra, becomes

∇δp0 + μ[∇²u + ∇(∇·u)] + ρ0∇δU = 0.    (N.139)
If, like Kelvin in his paper, we take the earth to be incompressible, then ∇·u = 0, as you know, and

∇δp0 + μ∇²u + ρ0∇δU = 0.    (N.140)
(This, by the way, in case you're checking, is Eq. (129) in Takeuchi's paper. The sign of Takeuchi's p in his (129) is different from the sign of my δp0 here, but that's OK because my δp0 is the same as Takeuchi's −p.) Now, in Takeuchi's derivation, something happens that is not immediately obvious, but it will become so if you keep reading. Equation (N.140) is a vectorial equation, i.e., it's really three equations—one per Cartesian component. So Takeuchi differentiates each of them by the corresponding Cartesian component, and sums up the three equations so obtained (which is the same as taking the divergence of (N.140); the μ term then drops out, because ∇·(∇²u) = ∇²(∇·u) = 0 when ∇·u = 0), and if you do that too you should get

∇²(δp0 + ρ0 δU) = 0,    (N.141)

where ρ0 is constant, so you can pull it in and out of the differentiation—the differentiation does nothing to it. Next, Takeuchi takes Poisson's equation (N.128), and replaces ρ with ρ0, because remember Eq. (N.132), and U with U0 + δU, and
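Takeuchi's step can be verified symbolically. The following sketch (mine, not from Takeuchi's paper) takes the divergence of the left-hand side of (N.140) for a displacement field that is divergence-free by construction, and checks that what survives is exactly the left-hand side of (N.141); the scalar fields dp, dU and psi are arbitrary placeholders:

```python
# Symbolic check of the passage from (N.140) to (N.141), under ∇·u = 0.
import sympy as sp

x, y, z, mu, rho0 = sp.symbols('x y z mu rho0')
dp = sp.Function('dp')(x, y, z)    # stands in for δp0
dU = sp.Function('dU')(x, y, z)    # stands in for δU
psi = sp.Function('psi')(x, y, z)  # arbitrary scalar, only used to build u

# A displacement field with ∇·u = 0 by construction:
u = sp.Matrix([sp.diff(psi, y), -sp.diff(psi, x), 0])

def grad(f):
    return sp.Matrix([sp.diff(f, x), sp.diff(f, y), sp.diff(f, z)])

def div(v):
    return sp.diff(v[0], x) + sp.diff(v[1], y) + sp.diff(v[2], z)

def lap(f):
    return sp.diff(f, x, 2) + sp.diff(f, y, 2) + sp.diff(f, z, 2)

# Left-hand side of (N.140), component by component:
lhs = grad(dp) + mu * u.applyfunc(lap) + rho0 * grad(dU)

# Its divergence minus ∇²(δp0 + ρ0 δU) should vanish identically:
residual = sp.simplify(div(lhs) - lap(dp + rho0 * dU))
print(residual)   # 0
```

The μ term disappears exactly because the Laplacian commutes with the divergence, and the divergence of u is zero.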
∇²(U0 + δU) = −4πGρ0.    (N.142)
But Poisson's equation must hold also at equilibrium, i.e., ∇²U0 = −4πGρ0; if you subtract that from (N.142) you are left with

∇²δU = 0.    (N.143)
If you sub (N.143) in (N.141), that reduces to

∇²δp0 = 0.    (N.144)

Equations (N.140), (N.143) and (N.144) are five scalar differential equations in the five unknown functions u1, u2, u3, δU and δp0 (which are all functions of spatial coordinates only—no time—because Kelvin/Takeuchi assume equilibrium: inertia is neglected), i.e., a system of differential equations, with as many unknown functions as there are equations, which one can solve, and which in fact both Kelvin and Takeuchi solved analytically, in different ways which are both quite complicated—and that's something that I've decided to skip in this book. Because I tried, but there is no way that I can make it clearer or simpler than it is in Takeuchi's paper.
459. c.g.s., which is how Jeffreys calls his units for rigidity in his paper, means "centimeter-gram-second"; implying that rigidity, μ, which has the same units as force per unit area (remember Hooke's law), is given in dyne/cm². 1 dyne, by the way, is the same as 10⁻⁵ Newton, and 1 Newton is 1 kg·m/s².
460. See note 68.
461. According to Newcomb, see note 72, the ratio (C − A)/A "is given with an error not exceeding a few hundredths of its total amount by the magnitude of the precession and nutation. The value found by Oppolzer is 1/305, giving the time of rotation as 305 days." Newcomb gives no bibliographic reference for Oppolzer, and I haven't been able to find that contribution by myself; as for "precession and nutation", I believe that's essentially what I call "forced precession", here.
462. See note 72.
463. "Lehmann, in an autobiographical memoir, states that she attended the first coeducational school in Denmark, founded and maintained by Hanna Adler, an aunt of Niels Bohr. She recalls that boys and girls were treated completely alike in this school; both played football and learned needlework. 'No difference between the intellect of boys and girls was recognized, a fact that brought me disappointments later in life when I had to realize that this was not the general attitude.' She took a degree in mathematics and physical science at the University of Copenhagen, and began her career in seismology in 1925; in 1928 she was appointed chief of the Seismological Department of the Royal Danish Geodetic Institute, a post she held until her retirement in 1953." (S. G. Brush, "Discovery of the Earth's Core", American Journal of Physics, vol. 48, 1980: which I quoted in Chap. 2 already, note 86.)
464. "P′" (yes, that's the title; all of it. P′ is how Lehmann called all P waves that got reflected into the core, whether only outer or both outer and inner), Publications du Bureau Central Séismologique International. Série A: Travaux Scientifiques, vol. 14.
465. "In the summer of 1914, Erskine Williamson, a 28-year-old Scottish mathematical physicist, joined a new laboratory [i.e., the Geophysical Laboratory at the Carnegie Institution] in Washington, DC. Although he had been hired as a research assistant in geology, he began to attack a broad range of problems in physical chemistry, thermodynamics, and heat flow. [...] Leason Adams was an experimentalist who had arrived at the laboratory just before Williamson." (Russell J. Hemley, "Erskine Williamson, Extreme Conditions, and the Birth of Mineral Physics", Physics Today, April 2006.)
466. Nota Bene: the bulk modulus shouldn't be confused with Young's modulus—which we met in Chap. 6, note 303. Young's modulus is related with the change in length of a rod—or whatever you call it: some object whose sizes in all directions except for one are negligible; K, as you see, is related with change in volume.
467. See note 316.
468. In case you are disturbed by a pressure increment, dp, being replaced by a component of stress, consider that all the entries of τ are perturbations with respect to whatever stress you might have at equilibrium. Remember the derivation of Hooke's law in Chap. 6.
469. √(K/ρ), or √(v_P² − (4/3)v_S²), is also sometimes called bulk sound velocity, or bulk sound speed. This is really not a good idea, because there is no elastic wave in nature that propagates at a speed that's equal to the square root of the ratio of K to the density of the material it propagates through. Unless by coincidence, I guess. Or in fluids, where μ is zero and then v_S = 0 and √(K/ρ) = v_P, etc. But in the context of earth (i.e., solid, or viscous) materials, which is where usually people bring up K and speak of bulk sound velocity, I can't see any good reason to call it that way. Incidentally, the fact that the ratio of Young's modulus to density coincides with the (squared) speed of waves propagating along a thin pipe (see Chap. 6) adds to the confusion, because yeah, sure, but Young's modulus and bulk modulus are not the same thing... and so, bottom line, there's no way you can measure the so-called bulk sound speed by looking at the propagation of a specific wave; you must either measure v_P and v_S and combine them through (7.123), or measure K and divide it by ρ. And but don't get me wrong: the ratio of K to ρ is a useful parameter to look at, as we shall see. Just don't call it velocity, or speed, is all.
470. This is true in the assumption of hydrostatic equilibrium. Which is not always exactly true, as we shall see: convection, plate tectonics, postglacial rebound, earthquakes, and so on and so forth—but close enough to being true, because most of those displacements are reasonably small.
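To make note 469 concrete, here is a trivial numerical sketch (all values invented for illustration) of the two equivalent routes to the so-called bulk sound speed; they agree because, by definition, K = ρ(v_P² − (4/3)v_S²):

```python
from math import sqrt, isclose

rho = 4000.0    # density, kg/m^3 (made-up, vaguely mantle-like)
v_p = 11000.0   # P-wave speed, m/s (made-up)
v_s = 6000.0    # S-wave speed, m/s (made-up)

# Route 1: get K from v_P and v_S, then take sqrt(K/rho).
K = rho * (v_p**2 - (4.0 / 3.0) * v_s**2)
v_bulk_1 = sqrt(K / rho)

# Route 2: combine v_P and v_S directly, as in (7.123).
v_bulk_2 = sqrt(v_p**2 - (4.0 / 3.0) * v_s**2)

print(isclose(v_bulk_1, v_bulk_2))   # True
```

Neither route involves timing an actual wave traveling at that speed, which is the whole point of the note.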
471. The effects we are talking about are triggered by the propagation of seismic waves—it's seismic waves that perturb p, etc.; not a change in temperature, for instance, or anything else. Incidentally, the bulk modulus K is also called adiabatic incompressibility: adiabatic meaning that it doesn't account for variations in density/volume that might result from heat lost or received.
472. This is also what they mean when they say that the density changes that they calculate are "due to compression."
473. Temperature, incidentally, could do it too, i.e., if at some depth for whatever reason temperature changed sharply, that could also mess up Eq. (7.114), etc. But Williamson and Adams "ignore the effect of temperature, but with some confidence that in relation to density it is a minor factor", because "at high pressures the expansion coefficient becomes less than at low pressures", and "the pressure half way down to the center of the Earth is more than a million atmospheres, and it is not at all improbable that at this pressure the total thermal expansion would be relatively small." Admittedly, this is not fully convincing, but the thing is, Williamson and Adams are doing the best that they can with the info they have, and they have—everybody in their time had—zero info about temperature at large depths. Francis Birch and others would later replace Williamson and Adams' simple equation with others, that also involve thermal compression, etc.
474. Remember Newton's "shell theorem" in Chap. 1. And or see note 458.
475. Actually, given that we are working under the assumption of homogeneity, it would be best to take as ρ(a) the density that one finds in rocks just below the crust, meaning rocks found at the surface, but which there's good reason to think were formed just below the crust. In fact, we'll understand this point better when we do some petrology in the next chapter, but already back then people figured there might be some important chemical change across the Moho. "It is generally agreed", say Williamson and Adams, "that although the average density of surface rocks is from 2.7 to 2.8, corresponding to granite or granodiorite, nevertheless the granitic layer is relatively [thin] (say 5 to 20 [mile deep]); and that underneath this very thin skin of granitic (and sedimentary) rocks lies a more basic material such as gabbro or even pyroxenite or peridotite." (You don't need to know or remember what gabbro and granodiorite and pyroxenite are, to see that what Williamson and Adams are saying is that the nature of the rocks that you would find on the two sides of the Moho is likely to be very different...)
476. What we have here is a simple example of how a differential equation can be solved numerically: without finding any analytical formula—which maybe doesn't even exist—for the solution. I'll tell you more about numerical methods to solve differential equations in Chap. 9.
477. A famous physicist at the time, Nobel prize in 1946 "for the invention of an apparatus to produce extremely high pressures, and for the discoveries he made therewith in the field of high pressure physics."
478. Birch studied electrical engineering at Harvard, receiving his bachelor's in 1924. He got a fellowship to go study abroad, then, and went to France for a
couple of years: Dijon and Strasbourg. After that, he went back to Harvard, to do a Ph.D. in high-pressure physics, under Bridgman. Right when Birch was doing his Ph.D., Bridgman and his friend Reginald Daly, a geologist, started an "interdepartmental program" to study the physics of the earth and of the materials that make up the earth, most of which are usually under very high pressures—the weight of all rock layers that are on top of them. Birch, then, got a job as "research associate" in this new program and that's how he started his career. He never quit Harvard and over the years he became assistant prof., full prof., and even chairman of the Geological Sciences division. "When he became chairman of Harvard's Division of Geological Sciences in 1958," says James Thompson (Memorial to Francis Birch 1903–1992, Geological Society of America, 1993), "the geologists were housed in an ancient museum building, and Francis occupied Dunbar Laboratory, a converted garage off Hammond Street [I looked it up, and it's been demolished, probably a long time ago, like in the 1970s or so] in Cambridge. He worked tirelessly, first with potential benefactors and later with the architect, to obtain the modern laboratory space that was so desperately needed. The Hoffman Laboratory of Experimental Geology and Geophysics was completed in 1963, the last year of his chairmanship. [Incidentally, that is where I did my Ph.D. My boss, who had been at Harvard for quite some time already, would mention Birch from time to time, and how important he had been to geophysics, etc. We would hold our group meetings in a meeting room called Birch seminar room. Birch had died in 1992, which means about four years before I started my Ph.D.] "Although he had no prior background in geology, Francis quickly established close and profitable working relationships with the geology faculty, a group of quite different people, each dealing with one aspect or another of our richly varied Earth. 
To maintain effective communication amid this diversity, there was an informal lunch table at the Harvard Faculty Club [the Harvard faculty club was a five minute walk from the Hoffman Labs, past the School of Design, and the Fogg Museum where I would go from time to time to think about things other than my Ph.D.; which, to tell you the truth, at the beginning that was a rare event, thinking about things other than my Ph.D., because I had decided I would be a scientist and all my energies would be invested in that; but then as the years passed I figured maybe I wouldn't be a scientist after all, and I started reading a lot of books and stuff, that weren't books of science. At the Fogg, I remember seeing exhibitions of Franz Marc, Mondrian. And or then I would take breaks and go look for all sorts of music and literature in the second-hand shops of Harvard Square, Porter Square, Davis Square. Very much of what I read at the time went back to the first half of the twentieth century, Henry Miller and Céline that I read in old New Directions paperbacks, I would try to buy as many of those as I could, printed between the nineteen-fifties and sixties, books that were doing all they could to break with the past, to break with everything, to break with I wasn't sure what, maybe I was trying to figure out what. Grad students were not allowed in the Faculty Club unless they went there together with professors; it did happen that I went there, when I ate with my boss because we had some guests at the department, or something, like visiting faculty or a prospective post-doc or something. I remember the food being good, but that is in comparison with how I typically ate at the time, which wasn't very well. I doubt that in my time, which was the nineties, there was still anything like an informal lunch table for geologists and physicists. I guess the thing is, Harvard like all universities had become too big, and people had become too specialized for that sort of thing. I think nowadays we are specializing even more, and plus there's the Internet, etc.] where one could learn of the discoveries, hopes, and enthusiasms of one's colleagues. [By my third year in graduate school, I had lost most if not all of the enthusiasm that I might have initially felt, anyway.] Francis quickly became a central member of the group. Ideas were exchanged freely in an atmosphere of trust and mutual respect. All of us gained much from it. Ideas spawned there often found their way into print. Francis, always meticulous in acknowledging his sources, would sometimes surprise us by carefully citing, months or years later, something that had been scrawled casually on an overturned placemat."
479. Geophysical Journal of the Royal Astronomical Society, vol. 4.
480. Birch doesn't explain, in his paper, why two curves and not just one—one for the mantle and one for the core. Or if he does, I must have missed it. Anyway, those estimates of density are certainly not perfect, and I guess all values within each pair of curves can be considered reasonable.
481. "The Alpha-Gamma Transformation of Iron at High Pressures, and the Problem of the Earth's Magnetism", American Journal of Science, vol. 238, 1940.
482. There's more evidence than just Birch's reasoning for the inner core being solid. Apparently, people eventually became convinced of that through looking at seismic data. Not (traveling) waves, though, because waves propagating as deep as (almost) the center of the earth and then back up are not easy to measure: you need a reasonably big quake near the antipodes of a seismic station (remember Lehmann's work); and even if you have that, the travel time and waveform of a disturbance propagating across the entire planet is controlled by all sorts of structure it might encounter along its way—not just in the core but everywhere above it as well. So, the deeper you try to look, the less clearly you see. But, starting already with Kelvin, Lamb, Love, Rayleigh, etc., people had understood that the earth, being a finite, bounded medium—with a stress-free outer surface—behaves kind of like a guitar string. What I mean by that (remember note 292) is that when, e.g., a quake suddenly, and for a brief amount of time, perturbs the earth's shape and or internal structure, then the earth's subsequent free oscillations are the linear combination of a discrete set of spatial patterns (modes: which more about in Chap. 9), each with its own temporal frequency. And the thing is, the values of the modes' frequencies are related to the earth's structure, each mode in its own way, sort of like the string's frequencies are related to the string's length, density, tension. Modes are not easy to measure, either: you need to wait until the initial, "transient" vibrations that immediately follow the quake (incl. the first arriving P and S and surface waves) dissipate away, and you're left with the lower-frequency, small-amplitude signal of the relatively graver modes. Seismologists had been debating re whether or not they were able to actually observe modes, when in 1960 and 1964 two huge earthquakes occurred: the first in Chile and the second in Alaska. The bigger the quake, the higher the signal-to-noise ratio, the better the measurements, of course: so with the new data (and in the meantime the earth was being monitored by a constantly growing network of high-quality seismic instruments), people finally had pretty clear observations of the modes and their frequency; several different authors, each looking at their own data, got very similar estimates of the modes' frequencies: and that could hardly be a coincidence. Now, the thing about modes is, different modes are related to different parts of the earth, depending on their spatial patterns. There's some modes, e.g., that consist of oscillations of only the shallowest layers of our planet, and there's also modes where the innermost shells are what moves most—which means that the frequencies of the latter modes will very much depend on the properties of the core. So if you can nail down the right mode, it's as if you could look directly to the center of the earth, having removed all the stuff that lies in between. Chaim Pekeris, a geophysicist at M.I.T., put together the first mathematical models of the modes, with which you could take a model of density/seismic velocities in the earth, and calculate what the mode frequencies would be like if the earth were like that: and of course if those didn't resemble the stuff that people were now learning to measure from actual quake data, that meant that the model wasn't good. And soon it turned out that the best models to fit all modes were those that included a solid inner core.
483. I think the first paper ever to talk about it—the one that everybody cites at least, though I haven't been able to find a copy, is Desmond Bernal's "Hypothesis on the 20° Discontinuity", published in 1936, vol. 268 of a journal called Observatory that doesn't seem to exist anymore.
484. On the basis of, I guess, his experiences in the lab, Birch thinks that it is OK to use w as an index of chemical composition. There are certainly earth materials that have the same w as one another but different chemistry, but "the mean atomic weight", writes Birch in the 1961 paper, "is a significant parameter for a number of physical properties", which, I guess, means that we can consider materials that have the same w to be chemically similar to one another. Birch also adds that "in the context of geochemical abundances, it is evidently an index of heavy metal, especially of iron content, and in the following discussion of major divisions of the mantle the heavy metal content will be identified with iron content."
485. See note 124.
486. See note 471.
487. Meaning, if I administer a quantity δQ of heat while keeping p constant, then the change in temperature δT that will result from δQ is related to δQ through

δQ = C_p m δT,
where m is the body's mass. In Chap. 4 we've met the heat capacity per unit volume, which we'd called c_V, with a lowercase c, and per unit mass (c_m), and but we hadn't specified whether those quantities were calculated at constant p or V. So perhaps the notation I am using here is slightly inconsistent with that of Chap. 4. Or at best, confusing. But I guess we can live with it.
488. This note is going to be, like, a crash course in thermodynamics, limited to the stuff you need to know to understand the rest of this book—still, quite a bit of stuff. I guess I'm using this word, thermodynamics, without explaining what that means, but that's OK because, for one, you are going to learn it along the way, and plus, like all words that try to pack a field of knowledge in one box, one can't really define it so precisely. I mean, for instance: who's the first thermodynamicist? Not clear at all. And whoever that was, chances are he had no idea that he was starting a new field. That having been said, it's clear, at least, that thermodynamics deals with the relationships that exist between heat, work, and the temperature of matter. And, as you shall see, if we have to pick one founding father for the whole thing, it could probably be argued that it all started with the work of Sadi Carnot. But I shouldn't be getting ahead of myself. Some thermodynamics we've already done, actually: the law of energy conservation (with energy meaning both heat and mechanical energy) that we've seen in Chap. 4 is also thought of as the first principle of thermodynamics. The next notion we need is that of entropy, and the underlying so-called second law of thermodynamics. You've probably heard of entropy before, because the word is often used, like, metaphorically to mean a lot of things; but here let's stick to entropy in thermodynamics. As far as topics in physics go, entropy is one of the tough ones.
In the following I'll renounce my obsession for the history of ideas, to put clarity first and give you the simplest explanation of entropy that I could manage to come up with. I'll give you the idea of entropy that eventually emerged from many contributions (Carnot and Clausius being the most important names, probably), without bothering, or not as much as I usually do, with chronological order and with who did what. To see what entropy is about, before we define it quantitatively, let's imagine a mass of an ideal gas—an ideal gas, by the way, is a gas that obeys the theoretical law

pV = mRT,

relating pressure p, volume V and temperature T through the positive constant R, and the mass m of the gas we're looking at. (This is not how the law is usually formulated: it turns out that if you measure mass as a number of molecules rather than in grams or kilograms or whatever, R is a universal constant: one that does not depend on the substance we're working on: which here, instead, it does. But for our goals, we don't really need R to be universal, and I think the algebra is simpler this way.) While there exists no gas in nature that behaves exactly like an ideal gas at all pressures and temperatures, the so-called ideal
gas law is still an empirical law: people observed the behavior of many real gases, and came up with pV = mRT because that turns out to approximate the data for most gases, at least when temperature and pressure are not very very low or high.—To begin to understand entropy, I was gonna say, let's imagine some ideal gas initially confined to some limited volume, that all of a sudden is left free (the gas is) to expand through empty space. What happens is that the volume occupied by the gas grows while its temperature stays the same. The reason the temperature stays the same is, temperature is controlled by the gas' internal energy: and energy must be conserved. Energy comes and goes to and from the gas either through work that's performed on or by the gas, or heat that the gas receives or gives away. In this experiment, though, no work is performed, and plus there's no way the gas can exchange any heat with the external world if the external world amounts to empty space. So volume goes up and temperature stays constant. Now compare that same amount of gas, before and after the expansion. Its total energy is conserved, as we've seen. But the funny thing is, through its expansion, the gas has lost some of its ability to, as they say, perform work. Because think of a concentrated mass of gas at high pressure: if you could enclose it in some container with, say, a piston, then the gas' natural tendency to expand could be used, for example, to move the piston, i.e., to do work. But once the gas has expanded, the potential for producing such work has been lost—if we were to compress the gas to its initial volume, that would actually require us to do some work on the gas, losing what we might later gain in work done by the gas. This change in the gas' potential to do work has nothing to do with energy: energy doesn't change throughout the free expansion. So, we have another quantity for it—and that is what we call entropy.
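As a quick numerical aside (mine, not in the text): in this per-unit-mass form of pV = mRT, R does depend on the gas; for dry air it is about 287 J/(kg·K), and the law then gives back the familiar density of air at room conditions:

```python
# Density from the per-unit-mass ideal gas law: ρ = m/V = p/(RT).
R_air = 287.0   # J/(kg K): specific gas constant of dry air (gas-dependent!)
p = 101325.0    # Pa, one atmosphere
T = 293.15      # K, about 20 degrees Celsius

rho = p / (R_air * T)
print(round(rho, 3))   # about 1.2 kg/m^3
```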
To actually define entropy, though, like with a mathematical formula, we need some preliminaries. First of all, you need to learn a few things about isothermal processes, i.e., processes where the temperature is kept constant (like in the free expansion we just saw), and adiabatic ones, where there's no exchange of heat between the material we're looking at and the rest of the world. We are going to study what happens to p and V during isothermal versus adiabatic processes; and for our goals here, it will be enough to do it in the case of ideal gases. With ideal gases, isotherms are really easy, because T is constant, and so if pV = mRT, then p is proportional to 1/V, which is pretty much all we need to know about isotherms, in view of what comes next. In an adiabatic transformation, on the other hand, p, V and T can all vary at the same time so things are a bit more complicated. To find the relation between p and V in such a case, you might start with energy conservation,

dQ = dU + dW,    (N.145)

where dQ and dW are net received heat and work—negative if given away or performed rather than received—and dU is the change in the gas' internal
energy. It's useful to write dW in terms of p and V: it can be done this way: think of the work that the freely-expanding gas didn't do: if, like I said, we had enclosed it in an insulated tank, with a piston on top, then the gas' pressure p might have pushed the piston up, which means the gas would have done work, and this work would coincide with the product of the force applied by the gas on the piston, times the vertical displacement of the piston. Which, if you think about it, the force is just p times the surface area of the piston, and dW is that times the displacement of the piston: but then the surface area of the piston times its displacement equals the change dV in the gas' volume, and so (N.145) can also be written

dQ = dU + p dV.

Now compare a constant-volume (dV = 0) transformation, where, if you solve it for dU, Eq. (N.145) boils down to

dU = dQ − p dV = dQ = mC_V dT    (N.146)
(with C_V denoting the specific heat capacity at constant volume, see note 487), with a constant-pressure transformation, where

dU = dQ − p dV = mC_p dT − p dV = mC_p dT − mR dT

(in the last step we've used the ideal gas' equation pV = mRT, keeping in mind that this is a transformation where p doesn't change so dp = 0). If, in the two transformations, the temperature T of the same mass m of material changes by the same amount dT, then also the change dU in internal energy must be the same in the two cases, because internal energy U depends only on T. But so then, if you subtract the last two equations from one another you get

0 = m(C_p − C_V) dT − mR dT,

and if you divide by m dT,

C_p − C_V = R,

which is a relation between three numerical constants, and so it must hold independently of whether the transformation we're looking at is isothermal, or adiabatic, or whatever. (Incidentally, I think I've mentioned that R > 0, so this relation also proves that C_p > C_V.) We're pretty close now to finding the "adiabatic" relation between p and V that we've been looking for. We saw a minute ago, Eq. (N.146), that a temperature variation dT, if everything else stays the same, translates into a variation dU = mC_V dT of internal energy. In an adiabatic transformation, then, dQ = 0 and good old energy conservation boils down to
mC_V dT = −p dV,

which if you solve for dT,

dT = −p dV / (mC_V).
Now take the ideal gas' equation and differentiate it: V dp + p dV = mR dT; replace R and dT with the expressions that we just found;

V dp + p dV = −p dV (C_p − C_V)/C_V,
which, don't forget, is valid only as long as the transformation is adiabatic. Usually people like to call

γ = C_p/C_V
(which, by the way, remember C_p > C_V, so γ > 1), and so the last equation we'd written becomes

V dp + p dV = p dV (1 − γ).

Now subtract p dV and divide by pV, left and right, and

dp/p + γ dV/V = 0.

Consider that if log is the natural logarithm⁸⁰⁸, then d log(x)/dx = 1/x, and but so then d log(x) = dx/x, and I can rewrite the last equation,

d log(p) + γ d log(V) = 0,

or

d[log(p) + γ log(V)] = 0,

or

d[log(p) + log(V^γ)] = 0.
This means that log(p) + log(V^γ) is constant—i.e., constant during an adiabatic process, of course. But then also

e^{log(p) + log(V^γ)} = e^{log(p)} e^{log(V^γ)} = p V^γ
must be constant throughout any adiabatic transformation (note 809). Which is the property of adiabats that we were looking for. The next piece of information you need is about reversible and irreversible processes. I like Feynman's definition from his famous Lectures on Physics (note 810). To him, a reversible transformation in thermodynamics is kind of like a mechanism that works without friction: "when we have a practically frictionless machine", he says, "if we push it with a little force one way, it goes that way, and if we push it with a little force the other way, it goes the other way." A reversible process, then, is, like, "the [thermodynamics] analog of frictionless motion". For example, a "heat transfer whose direction we can reverse with only a tiny change" in the temperatures of the bodies that are involved. "If the difference in temperature is finite", i.e., "if we put a hot object at a high temperature against a cold object, so that the heat flows, then it is not possible to make that heat flow in a reverse direction by a very small change in the temperature of either object. [...] But if one makes sure that heat flows always between two things at essentially the same temperature, with just an infinitesimal difference to make it flow in the desired direction, the flow is said to be reversible." If you heat one of the two objects a little, heat will flow from it toward the other, and but if we cool it a little, heat will flow toward it from the other. And the same goes for work: imagine some gas in a tank, closed by a piston on top of it: if you add some small weight to the piston you'll increase the pressure and decrease the volume slightly; but if you diminish the weight a little you can get back to the initial state, and if you keep (slowly) diminishing it you can reduce the pressure, grow the volume, etc.
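Coming back for a second to the adiabat relation we just derived: it is easy to check numerically. A minimal sketch (the value γ = 1.4 and the starting state are assumed here for illustration, they are not from the text) that integrates dp/p + γ dV/V = 0 step by step and verifies that p V^γ stays put:

```python
# Integrate dp/p + gamma*dV/V = 0 along an expansion and check that
# p*V**gamma is an invariant. gamma = 1.4 and the starting state are
# assumed values, for illustration only.
gamma = 1.4
p, V = 1.0e5, 1.0             # starting pressure (Pa) and volume (m^3)
invariant0 = p * V**gamma

dV = 1.0e-5
for _ in range(100_000):      # expand from V = 1 to V = 2
    p += -gamma * p * dV / V  # dp = -gamma * p * dV / V
    V += dV

invariant1 = p * V**gamma
assert abs(invariant1 - invariant0) / invariant0 < 1e-3
```

The crude stepping is enough here: the relative drift in p V^γ after doubling the volume is tiny, while p itself has dropped by a factor 2^γ ≈ 2.6.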
On the other hand, if you've got a system where work is converted to heat via friction (think Joule's experiments), or, in general, a system where some energy is lost through friction, that's definitely irreversible... (Incidentally, that means there isn't such a thing as a really, purely, 100% reversible process in nature—because something is always lost to friction, etc.) One reversible process, or sequence of reversible processes, that is really important to entropy and to the history of how entropy came to be "discovered" is the so-called "Carnot cycle", or Carnot "engine". It's named after Sadi Carnot (note 811), a French engineer who thought it up in his book Reflections on the Motive Power of Heat, published in 1824. It's purely conceptual: Carnot never actually built his engine. It's interesting that Carnot did not know about the first principle (which as you know from Chap. 4 came to be accepted as such much later than 1824), and yet already formulated what would come to be called the second law of thermodynamics; and made some important inferences based on that and his engine. We'll get to that in a bit. The engine (see Fig. N.34) consists of a homogeneous substance kept in a tank whose base can conduct heat, while all its other surfaces are insulating—they conduct no heat at all. The top of the tank is free to move up and down—a piston, with its own nonnegligible weight. The substance could in principle be
Fig. N.34 The four stages of a Carnot engine. This sketch is based on a similar figure that I found in Fisica Generale, Editrice Ambrosiana, 1964, an early Italian translation of Halliday and Resnick’s famous physics textbook
anything: in practice, some sort of gas; Carnot obviously was thinking about steam engines, which were all the rage around 1824. To make things simpler, we'll assume it's an ideal gas—then later we'll show that that does not affect the general validity of what we are about to learn. To get the machine to work we also need two "thermal reservoirs", i.e., two bodies kept at constant temperatures T1 and T2, with T1 > T2. The way it works is, the gas is alternately heated up by the hot reservoir, and cooled by the cold one, and this gives rise—and in a second I'll explain exactly why and how—to a repetitive up-and-down motion of the piston, that then can be used to set some machinery in motion. More precisely, the engine has four stages, each of which is a slow, reversible process, and they go like this (now you might look also at Fig. N.35): (a) The gas' initial state is described by the values p1, V1, T1 for pressure, volume and temperature, respectively. The tank has been placed on the hot thermal reservoir (temperature T1), so T of the gas stays constant—thermal equilibrium between gas and reservoir is maintained—while the gas expands. V grows to a larger value V2 (the piston moves up) and p decreases to p2. This is called an isothermal expansion. Some heat is received by the gas through the base of the tank and some work is done by the gas, to raise the piston. (b) The tank is removed from the first reservoir and placed on an insulating slab. So, at this stage no heat can be exchanged at all between the inside of the tank and the outside world. Temperature will tend to go back down, but since the internal energy that's associated with it can't be lost as heat, it will still be turned into positive work done on the piston. It follows that V still grows while both T and p decrease: an adiabatic expansion. If inside the Carnot
Fig. N.35 Pressure as a function of volume through a Carnot cycle
engine there’s an ideal gas, like we postulated, then p is proportional to V1γ , with γ > 1, like we’ve seen, which means that p at stage 2 goes down faster than at stage 1. If one leaves the gas alone, the expansion will continue, I guess, until p is such that the weight of the piston is perfectly balanced, i.e., it’s too small to push the piston further up. (c) Before we get to that point, though, T of the gas becomes as low as T2 — the temperature of the cold reservoir. (Which of course isn’t a coincidence: when designing the engine, T2 must be chosen so that that will be the case.) At this very moment, the tank is transferred from the insulating slab to the cold reservoir, which then keeps T constant at T2 (Later we shall refer to p3 and V3 as the values of pressure and volume at this point of the cycle). In the meantime, some additional weight is applied to the piston, pushing it down slowly (remember that the process has got to be reversible). So, at this stage, work is done by the outside world on the gas, the gas’ volume is reduced, its pressure goes up, and for T to stay constant some heat must be given by the gas to the reservoir. (And by the way, you’ve guessed it, this is called isothermal compression.) (d) At some point the tank is transferred again to the insulating slab. The piston continues to be pushed down, same as in the previous stage. Now there’s nothing to keep the gas cool, and heat can’t be given away, so the build-up of pressure under the weight of the piston partly translates into a growth in T ,
which eventually rises back to its initial value T1. (Adiabatic compression.) As soon as T = T1, the tank goes on the hot reservoir again, and we are precisely back at stage one. (At the beginning of this stage, pressure and volume have values that we might call p4 and V4: at the end, they're also back to p1 and V1, of course.) The design of the Carnot engine is probably not the most practical, but the fact that one can think it up shows that, at least in principle, a reversible or approximately (one has to minimize and/or neglect friction, assume insulation is perfect, etc.) reversible engine is possible. Starting with Carnot himself, people used this idea to introduce some practical concepts about how heat-based engines work, how you can make them more efficient, etc. The concept of entropy is an example of that—and it's not Carnot's idea, by the way, but usually attributed to Rudolf Clausius, who figured it out as a result of his own reflections on Carnot's work. Using energy conservation—which Carnot didn't have, but we don't care, because like I said, for this one time: clarity first—using energy conservation, let's see what we can say about the work and heat received and given away by the engine. Carnot's stages 1 and 3 are isotherms, and so

p1 V1 = p2 V2 and p3 V3 = p4 V4.

Stages 2 and 4 are adiabats, and so

p2 V2^γ = p3 V3^γ and p4 V4^γ = p1 V1^γ.

If you take the product of the left-hand sides of these four equations and equate it to the product of their right-hand sides, and divide both sides of the resulting equation by p1 p2 p3 p4, you get

V1 V3 V2^γ V4^γ = V2 V4 V3^γ V1^γ,

or, which is the same,

(V2 V4)^(γ−1) = (V1 V3)^(γ−1),

or

V2 V4 = V1 V3,    (N.147)
which is going to be useful in a second. Q1 is the heat received from the hot reservoir during stage 1; stage 1 is isothermal, so the internal energy of the gas is constant, and by energy conservation Q1 must coincide with the work W1 performed by the gas throughout stage 1,

Q1 = W1 = ∫_a^b p A ds = ∫_{V1}^{V2} p dV;

but (the gas is ideal) p = m R T1/V, so

Q1 = m R T1 ∫_{V1}^{V2} (1/V) dV = m R T1 [log(V2) − log(V1)] = m R T1 log(V2/V1).

Stage 3 is also an isotherm; the heat given away by the gas (a positive number) is

Q2 = −∫_{V3}^{V4} p dV = −m R T2 ∫_{V3}^{V4} (1/V) dV = m R T2 log(V3/V4).

Taking the ratio of the expressions we've just found for Q1 and Q2,

Q1/Q2 = T1 log(V2/V1) / [T2 log(V3/V4)].

And but now see that it follows from Eq. (N.147) that V2/V1 = V3/V4, and so what we've found is that, in a Carnot cycle,

Q1/Q2 = T1/T2,

or

Q1/T1 − Q2/T2 = 0.    (N.148)
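Equation (N.148) is easy to check with numbers. Here's a sketch for an ideal-gas Carnot cycle; the value of mR, the two temperatures, γ and the extent of the isothermal expansion are all assumed, for illustration only. (On an adiabat, pV^γ = const. combined with pV = mRT gives T V^(γ−1) = const., which is what fixes V3 and V4 below.)

```python
import math

# One ideal-gas Carnot cycle, with assumed numbers: one mole (mR = 8.314
# J/K), hot and cold reservoirs at 500 K and 300 K, monatomic gamma = 5/3.
mR, T1, T2, gamma = 8.314, 500.0, 300.0, 5.0 / 3.0
V1, V2 = 1.0, 2.0        # isothermal expansion at T1 (stage 1)

# Adiabats (stages 2 and 4): T * V**(gamma - 1) is constant, so
V3 = V2 * (T1 / T2) ** (1.0 / (gamma - 1.0))   # end of adiabatic expansion
V4 = V1 * (T1 / T2) ** (1.0 / (gamma - 1.0))   # start of adiabatic compression

Q1 = mR * T1 * math.log(V2 / V1)   # heat received at T1
Q2 = mR * T2 * math.log(V3 / V4)   # heat given away at T2

assert math.isclose(V2 * V4, V1 * V3)   # Eq. (N.147)
assert math.isclose(Q1 / T1, Q2 / T2)   # Eq. (N.148)
```

Notice that Q1 > Q2: the difference is the work done by the engine over one cycle.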
This turns out to be very important. Because: you can combine multiple Carnot cycles to form a more complex cycle, see Fig. N.36, where the three cycles abija, bcghb, and defgd are combined to form one; Eq. (N.148) holds for each of the pairs of isotherms ab and ij, bc and gh, and de and gf: it follows
Fig. N.36 Reversible cycle formed by three Carnot cycles
Fig. N.37 Any reversible cycle can be approximated by a combination of Carnot cycles. The thick solid line marks a multiple Carnot cycle that approximates the arbitrary, sort-of-egg-shaped cycle marked by a thin solid line. Again, all individual isotherms are fully included in the approximate cycle: the dashed lines mark the segments of adiabats that are excluded—adiabats don't contribute to Σ_i Q_i/T_i so that's OK. Like Fig. N.34, this one too is based on a sketch in Halliday and Resnick's 1964 Fisica Generale
that, if we assign to the isotherms indexes from 1 to 6,

Σ_i Q_i/T_i = 0.    (N.149)
And but then, that means you can take any cycle, approximate it with a sequence of Carnot cycles, see Fig. N.37, and (N.149) still holds; and bringing this to the limit, where an infinitely small quantity dQ of heat is exchanged at each infinitely short isothermal stage of the cycle, (N.149) can be replaced by

∮ dQ/T = 0,    (N.150)

where the new funny version of the integral symbol that I've just introduced, ∮, means we're integrating along a closed curve. Equation (N.150) is true independent of which closed curve we pick, provided that the sequence of transformations that it describes are all reversible. Now choose any two states, call them a and b, of the system—if that helps, think of them as two points on a p, V diagram. You can draw as many closed curves as you want passing through those two points: provided all curves represent reversible transformations, the integrals ∮ dQ/T along each of them will be zero. So then let's define a function S of the state of the system, that has got some arbitrary value at one reference state, let it be a, and that at all other states, for instance at b, is given by

S(b) = S(a) + ∫_a^b dQ/T.    (N.151)
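(As a numerical illustration of the path-independence that makes this definition work, one can integrate dQ/T along two different reversible paths between the same two states of an ideal gas and compare. All the numbers below—an air-like gas with m = 1 kg, R = 287 and C_V = 717.5 J/(kg K), and the two states—are assumed, for illustration only.)

```python
import math

# For an ideal gas, reversibly, dQ = m*C_V*dT + p*dV with p = m*R*T/V,
# so dQ/T = m*C_V*dT/T + m*R*dV/V. Integrate that along two different
# paths from state a to state b and check that the results agree.
m, R, C_V = 1.0, 287.0, 717.5     # assumed air-like values
Ta, Va = 300.0, 1.0               # state a
Tb, Vb = 450.0, 2.5               # state b

def integral_dQ_over_T(path, n=20_000):
    """Integrate dQ/T along a parametrized path t -> (T(t), V(t))."""
    total, (T0, V0) = 0.0, path(0.0)
    for i in range(1, n + 1):
        T1, V1 = path(i / n)
        total += m * C_V * (T1 - T0) / (0.5 * (T0 + T1))  # C_V dT / T
        total += m * R * (V1 - V0) / (0.5 * (V0 + V1))    # R dV / V
        T0, V0 = T1, V1
    return total

# Path 1: straight line in the (T, V) plane.
line = lambda t: (Ta + t * (Tb - Ta), Va + t * (Vb - Va))

# Path 2: isochoric heating first, isothermal expansion after.
def corner(t):
    if t < 0.5:
        return (Ta + 2.0 * t * (Tb - Ta), Va)
    return (Tb, Va + (2.0 * t - 1.0) * (Vb - Va))

assert math.isclose(integral_dQ_over_T(line), integral_dQ_over_T(corner),
                    rel_tol=1e-6)
```

Both integrals come out the same (equal to m C_V log(Tb/Ta) + m R log(Vb/Va), in fact)—which is just (N.150) at work.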
Because the state b is arbitrary—any values of p, V, T will do—that is a perfectly general definition. Now, (N.151) stipulates that the total change in S in a reversible transformation from the state a to b is the integral of dQ/T, calculated along the one curve that describes that transformation. But, and here's the crux of all this, there's also Eq. (N.150), which says that the integral along that particular curve will be equal to the integral along any other curve connecting a and b—as long as it's a curve that describes a reversible transformation. In other words: the value of the integral at the right-hand side of (N.151) is independent of the integration path. But so then, the way we've defined it, S is a function of nothing else but of where b is: i.e., of the state of the system: S is what people call a state variable, just like p, V, T and the internal energy U. This brings me back to the beginning of this note, that is now officially competing for being the longest crazy endnote in this crazy book: to the beginning of this endnote and to the problem of defining entropy. Because it turns out that S is just what we needed to quantify entropy—in fact: that S like we've defined it here with Eq. (N.151) is precisely what people call entropy. To show that that makes sense, I need to do two more things: first, I have to generalize my results to when the substance in Carnot's engine is not an ideal gas: because
remember, the equations I've started out with, to derive (N.148) with all that follows from it, are ideal-gas equations, and won't work if the stuff that's in the tank isn't an ideal gas. Then, I will try to convince you that, no matter what irreversible process you might think of: if indeed it's irreversible, it will always cause the entropy of the system to grow. It was, again, Carnot who first figured out all this, in his Réflexions, as he was trying to conceive the most efficient engine possible. To generalize Eq. (N.148)—the conservation of Q/T—he started out by assuming what would later be proved to be a very important, empirical law of physics—today we call it the second law of thermodynamics. Based on what he understood of engines, Carnot assumed that you cannot just take an amount of heat out of a reservoir (whose temperature is T1), and convert it to work—unless there's another reservoir (whose temperature is lower than T1) where some of the heat extracted from the first reservoir is dumped. I like how Feynman puts it (Chap. 44 of the first volume of the Lectures): "In other words," he says, "if the whole world were at the same temperature, one could not convert any of its heat energy into work". The way I understand it, this—the second law—was just a hypothesis, which must have seemed very realistic to Carnot—and has been verified by experiment ever since. What's important in Carnot's contribution, though, is that he showed that his hypothesis had a very important consequence. He did it through the following, relatively simple thought experiment. Imagine you've got a reversible engine A that receives an amount Q1 of heat from a reservoir that's kept at temperature T1, produces some work W and dumps heat Q1 − W to a reservoir kept at temperature T2, with T2 < T1. Then let's say there's a second engine, B, not necessarily reversible, which also receives Q1 (same as A) at T1, does work W′, and dumps heat (Q1 − W′) at T2.
Engine A is reversible: so one can actually use B to "run A backwards" (look at Fig. N.38): provided that W′ > W, we can inject into A a portion of the work W′ produced by B, to slowly and reversibly raise A's piston, or whatever, and then take Q1 − W from the colder reservoir and give Q1 to the hot one. The net result of this would be: the reservoir at T1 receives just as much heat as it gives away; and heat equal to W′ − W is taken from the other reservoir and turned entirely into work. But this goes against Carnot's hypothesis—the second law. So, if we accept Carnot's hypothesis, we must conclude that this can't happen—i.e., that W′ can't be bigger than W; in other words, that no engine absorbing heat at T1 and dumping it at T2 can do more work than a reversible engine operating between the same two temperatures. "Now," says Feynman, "suppose that engine B is also reversible. Then, of course, not only must W′ be not greater than W, but now we can reverse the argument and show that W cannot be greater than W′." Meaning, you can swap the roles of A and B with one another, and then repeat the argument: and because B now is reversible and as such can be run backwards, the argument works in exactly the same way, implying that the work done by A can't be more than that done by B. But so then, "if both engines are reversible they must both do the same amount of work", goes on Feynman, "and we thus come
Fig. N.38 Reversible engine A being driven backwards by engine B. The boxes at the top and bottom of the diagram are reservoirs, kept at temperatures T1 and T2 , with T1 higher than T2
to Carnot’s brilliant conclusion: that if an engine is reversible, it makes no difference how it is designed, because the amount of work one will obtain if the engine absorbs a given amount of heat at temperature T1 and delivers heat at some other temperature T2 does not depend on the design of the engine. It is a property of the world, not a property of a particular engine.” In other words, the answer to Carnot’s question is that any reversible engine operating between a pair of reservoirs at temperatures T1 and T2 produces the same amount of work. Now add the law of energy conservation to that. In any reversible engine—no loss of energy to friction and all that—energy is conserved, so net work done must coincide with net heat received, and if we call Q 1 the heat received by the gas at stage 1 and Q 2 the heat given away by the gas at stage 2, then we also know that W = Q1 − Q2. It follows that any reversible engine operating between T1 and T2 , whatever substance be inside it, when receiving an amount of heat equal to Q 1 , will release Q 2 = Q 1 − W which, since W is the same for all such engines, will also be the same for all such engines. And then it follows that (N.148) will continue to hold. Bottom line, everything we derived above applies to all reversible cycles, and S is a state variable, no matter what substance we are talking about. Now that that’s sorted out, take the freely-expanding-gas example that we started out with: an irreversible process where no heat is exchanged, no work done, the V of the gas grows, its p goes down, and T stays the same. And what happens to S as the gas expands? Because the process is irreversible, we
couldn’t just apply Eq. (N.151) to it—it would result in zero entropy because d Q is zero all along the p, V curve that you might measure. Instead, we need to think of a reversible process that brings the gas back from its final precisely to its initial, pre-expansion state. We slowly (i.e., reversibly) compress the gas, which means we’re doing some work on it; we also have to take some heat away from it so that its T stays the same: it’s an isothermal compression like we’ve met before when we discussed the Carnot cycle. If we call a and b the states of the system respectively at the beginning and at the end this transformation (i.e., at the end and beginning of the previous, irreversible expansion), then, now, yes, the change in entropy between a and b is precisely given by Eq. (N.151). The isothermal compression we are doing requires, like I said, that the system loses some heat: a negative δ Q: which means we are reducing the system’s entropy: or, alternatively, that the free expansion had increased it. Another example that you’ll find in textbooks, although each time slightly differently phrased, is the following: you’ve got a cold and a hot body and you bring them together, so that one touches the other: heat then flows from the hot to the cold one in a fast, irreversible process. Assume that both bodies are so large that the overall temperature of each will approximately stay the same during this process. (Unless one waits for a very long time, which we won’t. So basically the two bodies function as a pair of heat reservoirs.) If nothing else is being done to either of the two bodies, whatever change happens in their states is entirely determined by this loss and/or gain of heat. 
So, even though the process we're talking about is irreversible, it's OK to think of a pair of reversible processes—one where all you do is you (reversibly) take away an amount Q of heat from the hot body, and another where you give the exact same amount of heat to the cold one: and so the change in the state of each body will be the same as that entailed by the irreversible process we started out with: which since entropy is a state variable, the change of entropy also must be the same. So then use Eq. (N.151), with constant temperatures in both cases, and the change in entropy is

δS_c = Q/T_c

for the cold reservoir (I've called T_c its temperature), and

δS_h = −Q/T_h

for the hot one. Now, you might think of a closed (no exchange of heat and/or work with the rest of the world) system formed by the two reservoirs, and then calculate its total change in entropy through the two reversible transformations above as the sum
δS = δS_c + δS_h = Q/T_c − Q/T_h = Q (1/T_c − 1/T_h);

which, needless to say, T_h > T_c, or 1/T_h < 1/T_c, and so δS > 0. And you can try with as many examples of irreversible processes as you can think of... the bottom line being that the entropy is conserved in reversible processes, but as soon as you've got some effects, like friction—ubiquitous in the real world—, that make a process irreversible, entropy is always going up, and never down. And the inevitable gain in entropy signals the loss of potential for doing work (despite energy being conserved), that always comes with irreversible transformations (note 812). So, anyway, it took forever, and I hope this doesn't in any way disappoint you; but the thing is, I couldn't possibly prove (7.141), which is what this note was supposed to be about, without you knowing what entropy is—which maybe you did already; but then again, maybe you didn't. Now that entropy is wrapped up—at least to the best of my competences—we can go back to our original goals. We've got two more biggish steps to go through before we are done. The first is, we need to prove that our newly discovered entity S has the property that
(∂T/∂p)_S = (∂V/∂S)_p.

You can do that via conservation of energy, Eq. (N.145), which now we can write T dS = dU + p dV: consider processes where only p and S change; differentiate both sides, and

T dS + T (∂S/∂p)_S dp = (∂U/∂p)_S dp + (∂U/∂S)_p dS + p (∂V/∂p)_S dp + p (∂V/∂S)_p dS.
The second term at the left-hand side is zero—S won't change, even if you change p, in a process where S is required to be constant...; collect all factors that multiply dS, and all factors that multiply dp, so that

[T − (∂U/∂S)_p − p (∂V/∂S)_p] dS − [(∂U/∂p)_S + p (∂V/∂p)_S] dp = 0.

Now, the perturbations dS and dp are small but, other than that, arbitrary; so for this equation to be verified, we need that
T − (∂U/∂S)_p − p (∂V/∂S)_p = 0,

(∂U/∂p)_S + p (∂V/∂p)_S = 0.

The trick now is, differentiate the first equation of the pair with respect to p, and the second with respect to S:

(∂T/∂p)_S − ∂²U/∂S∂p − (∂V/∂S)_p − p ∂²V/∂S∂p = 0,

∂²U/∂p∂S + p ∂²V/∂p∂S = 0,

where, whenever I take a double derivative, it is understood that the differentiation with respect to S is done while keeping p constant, and vice versa. Now sum to one another the pair of equations that we've just ended up with, and it turns out that

(∂T/∂p)_S − (∂V/∂S)_p = 0,    (N.152)

QED. The second and final step is, think a bit more about (∂V/∂S)_p, which, in practice, is the ratio of a very small change in V to a very small change in S, in a transformation where p doesn't change. Now, by Eq. (7.134), in such a transformation dV = αV dT (and you might remember that α is the coefficient of thermal expansion—a property of the material—ideal gas or whatever—we are dealing with). And by all we've learned in this note, dS = dQ/T, and but at constant pressure dQ = m C_p dT... So plug all this into (N.152), and

(∂V/∂S)_p = (V α δT) / [(m C_p/T) δT] = T V α/(C_p m) = αT/(ρ C_p),
from which Eq. (7.141) follows immediately: which ends this note.
489. If you don't believe it, and but want to believe it, you might have to read the previous, long note.
490. S stands for entropy, actually: which you'll know about after reading note 488.
491. Which he seems to take for granted, but it wasn't known to me when I first saw it in his paper, despite having been for five or six years a diligent, if not terrific, undergraduate physics and then graduate geophysics student. Or in any case, I couldn't remember it. So I figured it was worth trying to give a proof for it: see the next note.
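(A numerical aside, before the next note: the identity (∂T/∂p)_S = (∂V/∂S)_p proved in note 488 can be spot-checked by finite differences for an ideal gas, for which, per unit mass, S(T, p) = C_p log T − R log p up to a constant. The gas constants and the evaluation point below are assumed, for illustration only.)

```python
import math

# Spot check of (dT/dp)_S = (dV/dS)_p for an ideal gas, m = 1 kg.
# Assumed air-like constants; S(T, p) = C_p*log(T) - R*log(p) + const.
R, C_p = 287.0, 1004.5

def T_of(S, p):
    # invert S = C_p*log(T) - R*log(p) (with the constant set to zero)
    return math.exp(S / C_p) * p ** (R / C_p)

def V_of(S, p):
    return R * T_of(S, p) / p      # ideal-gas law, m = 1

S0, p0 = 200.0, 1.0e5              # assumed evaluation point
hS, hp = 1.0e-3, 1.0e2             # finite-difference steps

dT_dp_const_S = (T_of(S0, p0 + hp) - T_of(S0, p0 - hp)) / (2.0 * hp)
dV_dS_const_p = (V_of(S0 + hS, p0) - V_of(S0 - hS, p0)) / (2.0 * hS)

assert math.isclose(dT_dp_const_S, dV_dS_const_p, rel_tol=1e-5)
```

Analytically, both sides equal RT/(p C_p) here, so the agreement is no accident.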
492. The proof is, I guess, quite complex, but anyway here it is... the idea is we are going to find a relation between the specific heat capacities, C_p and C_V, and then we are also going to find out that the ratio of C_p to C_V is simply related to the ratio of K to K_T, and combining both relations we'll find the one used by Birch. We've already met specific heat capacity at constant pressure, see note 487; for the derivation that follows it's better to write it in terms of entropy (see note 488), which is easy because from δS = δQ/T it follows right away that

δS = C_p m δT/T,

and but then

C_p = (T/m) (∂S/∂T)_p;    (N.153)

and likewise if volume, rather than pressure, is kept constant,

C_V = (T/m) (∂S/∂T)_V.    (N.154)
Now, let’s think more about the kind of transformations that define C V , where V is kept constant. A small amount of heat is administered to the system, causing a change in temperature dT . The associated change in entropy reads dS =
∂S ∂T
dT, V
In the case of a more general transformation where volume is allowed to change, we might write the change in entropy as the sum813 of the change resulting from only the change in T , plus that resulting from only the change in V , dS =
∂S ∂T
dT + V
∂S ∂V
d V. T
d V might also be the result of the change in T , and but also of a change in p, so ∂V ∂V dT + dp. dV = ∂T p ∂p T If we substitute the latter equation into the former, we get dS =
∂S ∂T
dT + V
∂S ∂V
T
∂V ∂T
dT + p
∂S ∂V
T
∂V ∂p
dp. (N.155) T
Now from this we can find the T-derivative of S at constant pressure, because, if dp = 0, then it follows from (N.155) that

(∂S/∂T)_p = (∂S/∂T)_V + (∂S/∂V)_T (∂V/∂T)_p;

and but then, if you sub (N.153) and (N.154) into what we've just written, you see that

C_p = C_V + (T/m) (∂S/∂V)_T (∂V/∂T)_p.

If we bring C_V to the left-hand side, then

C_p − C_V = αV (T/m) (∂S/∂V)_T,    (N.156)

with α the volumetric coefficient of thermal expansion, defined by Eq. (7.134), i.e.,

(∂V/∂T)_p = αV.    (N.157)

The next step towards the equation we're after, relating K and K_T, consists of replacing (∂S/∂V)_T with another of those partial derivatives, but one that doesn't involve entropy S. The way we are going to do this is very similar to how we've found the equality (∂T/∂p)_S = (∂V/∂S)_p in note 488. Start with energy conservation, i.e., change in internal energy equals heat plus work, or

dU = T dS − p dV;    (N.158)
again, consider transformations where only T and V might change. It follows from (N.158) that

(∂U/∂T)_V dT + (∂U/∂V)_T dV = T (∂S/∂T)_V dT + T (∂S/∂V)_T dV − p dV,

or

[(∂U/∂T)_V − T (∂S/∂T)_V] dT + [(∂U/∂V)_T − T (∂S/∂V)_T + p] dV = 0.

Now, dT and dV are supposed to be small, but other than that their values are arbitrary, so for this last equation to hold it must be that

(∂U/∂T)_V − T (∂S/∂T)_V = 0

and

(∂U/∂V)_T − T (∂S/∂V)_T + p = 0

at the same time. Differentiate the last two equations, the first with respect to V and the second with respect to T,

∂²U/∂T∂V − T ∂²S/∂T∂V = 0,

∂²U/∂V∂T − T ∂²S/∂V∂T − (∂S/∂V)_T + (∂p/∂T)_V = 0,

where, for the sake of simplicity, as far as second derivatives are concerned, I've omitted the subscripts, so it is understood that when differentiating with respect to V I take T to be constant, etc. In any case, if we subtract from one another the two equations that we've just ended up with, the second derivatives cancel out, and it all boils down to

(∂S/∂V)_T = (∂p/∂T)_V,

which if we substitute that into (N.156),

C_p − C_V = αV (T/m) (∂p/∂T)_V = (αT/ρ) (∂p/∂T)_V.    (N.159)
This can be manipulated even further to replace (∂p/∂T)_V with an expression containing only α and K_T and no partial derivatives: consider that in general

dV = (∂V/∂p)_T dp + (∂V/∂T)_p dT,

and so in a transformation where dV = 0,

(∂V/∂p)_T dp = −(∂V/∂T)_p dT;

and if you divide both sides by (∂V/∂p)_T dT, then have dT go to zero,

(∂p/∂T)_V = −(∂V/∂T)_p / (∂V/∂p)_T.

Now remember Eq. (7.136), from which it follows that V/K_T = −(∂V/∂p)_T, and Eq. (N.157), and what we've just obtained can be reduced to

(∂p/∂T)_V = α K_T,

and so if you plug that into (N.159),

C_p − C_V = α² K_T T / ρ.    (N.160)
That’s the relation between C p and C V that we were looking for. Now, the next and final step (for this note) is to write both C p and C V in terms of K and K T . Start by taking the ratio of C p to C V , from (N.153) and (N.154), ∂S Cp ∂T p = ∂S . CV ∂T V Then, consider that in general814 dp =
∂p ∂S
dS + T
∂p ∂T
so that if you set dp = 0 and solve for the ratio
∂S ∂T
= − p
∂p ∂T ∂p ∂S
dT, S dS dT
then let dT go to zero,
S ; T
∂S and then with the same strategy you can find a similar expression for ∂T , V writing d V in terms of d S and dT , putting d V = 0 and solving for the ratio of d S to dT . Substituting all that into the expression that we just found for the ratio of the specific heat capacities,
∂V Cp S ∂S T = ∂V . ∂p CV ∂T S ∂p ∂T ∂S
T
(N.161)
Next, one can play with the ratios of the derivatives. The idea is that you can always think of a derivative as the limit of an incremental ratio; so for instance if you have the ratio of two derivatives, you can replace both with incremental ratios, simplify what can be simplified, and then reevaluate the limits of the ratio and... maybe it's easier to explain this formally,

(∂p/∂T)_S / (∂V/∂T)_S = lim_{δT→0} (δp/δT) / lim_{δT→0} (δV/δT) = lim_{δT→0} (δp/δT)(δT/δV) = lim_{δT→0} δp/δV = (∂p/∂V)_S

(because if δT is going to 0 with constant S, then δV must be going to zero as well). Playing the same game with the ratio of the two derivatives taken at constant T, one finds that

(∂V/∂S)_T / (∂p/∂S)_T = (∂V/∂p)_T;

substituting both expressions into (N.161),

C_p/C_V = (∂p/∂V)_S (∂V/∂p)_T,

from which if you apply both (7.113) and, again, (7.136), it immediately follows that

C_p/C_V = K/K_T.

We are almost done; divide each side of (N.160) by C_p and subtract both from 1, so that

1 − (C_p − C_V)/C_p = 1 − α² K_T T/(ρ C_p).

After working out the algebra at the left-hand side,

C_V/C_p = 1 − α² K_T T/(ρ C_p),

or
K_T/K = 1 − α² K_T T/(ρ C_p),

which is precisely (7.145), QED.
493. One of Birch's important experimental findings, that he describes in the 1961 paper, is that P velocity and density are in a linear relationship to one another for a given rock sample. Experiments systematically show that. This then came to be called Birch's law. Strictly, that is true only above a certain pressure (10 kbars), but that's OK for Birch's and our goals, because we're really looking at the (relatively) deep earth right now. Birch explains that most of the measurements are made on rocks with w = 21; for other atomic weights he's got fewer data, but after having determined "a linear relationship between velocity and density for a constant mean atomic weight [...] of about 21", writes Birch, "it is [...] possible to interpret the other more scattered points in terms of series of lines of constant mean atomic weight parallel to but displaced from the well-determined line for w = 21." This means, in practice, that if you've got an earth model where v_P changes with depth in a way that's not linear, not a straight line, or even a model where v_P changes linearly but with a slope that's not the same as that established by Birch at w = 21: then, either the model's wrong, or, if it is right, it means that there must be a change in the properties of earth materials with depth, that is not only due to compression. The microscopic structure of the materials needs to change with depth. This is not necessarily a change in chemistry: the crystalline structure (which later we shall call the phase) of the same compound of elements might change, so that matter is more or less densely packed.
494. Namely, Birch says that (∂v_P/∂ρ)_w ≈ 3.31 (km/s)/(g/cm³), that (∂ρ/∂w)_{T,p,y} ≈ 0.13 g/cm³, and that (∂v_P/∂w)_ρ ≈ −0.79 km/s.
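Just to attach numbers to (N.160) and (7.145): a sketch with assumed, merely representative deep-earth-ish values (these are not Birch's numbers):

```python
# Evaluate C_p - C_V (Eq. N.160) and K_T/K (Eq. 7.145) for assumed,
# representative deep-earth values; illustration only.
alpha = 2.0e-5   # volumetric thermal expansion, 1/K
K_T = 130.0e9    # isothermal incompressibility, Pa
T = 1500.0       # temperature, K
rho = 3300.0     # density, kg/m^3
C_p = 1200.0     # specific heat at constant p, J/(kg K)

Cp_minus_Cv = alpha**2 * K_T * T / rho              # about 24 J/(kg K)
KT_over_K = 1.0 - alpha**2 * K_T * T / (rho * C_p)  # about 0.98

assert 0.0 < Cp_minus_Cv < 0.05 * C_p   # C_p exceeds C_V by only ~2%
assert 0.95 < KT_over_K < 1.0           # K exceeds K_T by a similar ~2%
```

So for solid-earth materials the two heat capacities, and likewise the two incompressibilities, differ by only a couple of percent—small, but not always negligible.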
495. See note 483. 496. It is definitely not trivial to get ahold of a peridotite sample, because it’s a mantle rock. But you do find peridotite in ophiolites, xenolithes, and diamonds: geological things that I’ll tell you a bit about in Chap. 8: see notes 635, 628 and 866. 497. And I don’t know what the name “chondrules” comes from, but let’s stop this here. 498. Meaning, the time since the igneous rocks that are in them have solidified to their current form. I know I haven’t told you yet about how we find the absolute age of an igneous rock: but I am going to do it at the very beginning of the next chapter, so please be patient. As much as I want this book to be as sequential as possible, things in this discipline are so entangled together that sometimes it’s just too hard! 499. Maybe you are wondering how we know the composition of the sun? “Spectrum analysis”, wrote Helmholtz in his Popular Scientific Lectures (which see Chap. 4), “has taught us that a number of well-known terrestrial elements are
met with in the atmosphere of the fixed stars, and even of the nebulae.” [Re nebulae, see Chap. 2, note 83.] “You all know that a fine bright line of light, seen through a glass prism, appears as a colored band, red and yellow at one edge, blue and violet at the other, and green in the middle. Such a colored image is called a spectrum—the rainbow is such a one, produced by the refraction of light, though not exactly by a prism; and it exhibits therefore the series of colors into which white sunlight can thus be decomposed. The formation of the prismatic spectrum depends on the fact that the sun’s light, and that of most ignited bodies, is made up of various kinds of light, which appear of different colors to our eyes, and the rays of which are separated from each other when refracted by a prism. “Now if a solid or a liquid is heated to such an extent that it becomes incandescent, the spectrum which its light gives is, like the rainbow, a broad colored band without any breaks, with the well-known set of colors: red, yellow, green, blue and violet, and in no wise characteristic of the nature of the body which emits the light. “The case is different if the light is emitted by an ignited gas, or by an ignited vapor—that is, a substance vaporized by heat. The spectrum of such a body consists, then, of one or more, and sometimes even a great number, of entirely distinct bright lines, whose position and arrangement in the spectrum is characteristic for the substances of which the gas or vapor consists, so that it can be ascertained, by means of spectrum analysis, what is the chemical constitution of the ignited gaseous body. 
Gaseous spectra of this kind are shown in the heavenly space by many nebulae; for the most part they are spectra which show the bright line[s] of ignited hydrogen and oxygen, and along with it a line which, as yet, has never been again found in the spectrum of any terrestrial element [I think Helmholtz is talking about Helium, which is also found in the sun (see below)—it is named, in fact, after the Greek word for sun—and only later would be observed on earth]. Apart from the proof of two well known terrestrial elements, this discovery was of the utmost importance, since it furnished the first unmistakable proof that the cosmical nebulae are not, for the most part, small heaps of fine stars, but that the greater part of the light which they emit is really due to gaseous bodies. “The gaseous spectra present a different appearance when the gas is in front of an ignited solid whose temperature is far higher than that of the gas. The observer sees then a continuous spectrum of a solid, but transversed by fine dark lines, which are just visible in the places in which the gas alone, seen in front of a dark background, would show bright lines. The solar spectrum is of this kind, and also that of a great number of fixed stars. The dark lines of the solar spectrum, originally discovered by Wollaston, were first investigated and measured by Fraunhofer, and are hence known as Fraunhofer’s lines”, etc. Over a century later (2007), in their big review paper for Treatise on Geochemistry, Herbert Palme and Hugh O’Neill say that, indeed, “the abundances of all major and many minor and trace elements in the Sun are known from [...] spectroscopy of the solar [surface]. Accuracy and precision of these data have
continuously improved over the last 70 years. [...] The more accurate the solar abundances are, the better they fit with the abundances of [...] chondrites, a small group of meteorites that appear closest in composition to the Sun." It is inferred that "a first approximation to the composition of the bulk Earth is [...] to assume that Earth has the average solar-system composition for rock-forming elements, that is, excluding highly volatile elements such as H, N, C, O, and noble gases." So I guess one takes the sun/the chondrites, which have about the same composition, removes He (noble gas) and H and other less abundant volatile elements, and looks at what's left. And that's the model.
500. Another phase change is expected at 520 km, actually, from lab experiments, but doesn't show up in seismology data as clearly as the other two.
501. The word "perovskite", like many other words in this book, is ambiguous—that's not my fault, because those words were ambiguous long before this book was written. If anything, their ambiguity is one of the reasons I am writing this book. Anyway: perovskite is, strictly speaking, calcium titanium oxide: a mineral with a peculiar crystal structure, that was first discovered by one Gustav Rose in 1839—and apparently because it was found in Russia—in the Urals—the mineral was given the name of a famous Russian mineralogist: Lev Perovski. But then, people started to call "perovskite" any mineral with that same lattice structure (ABX₃, as in MgSiO₃, FeSiO₃, etc.): and the mineral that you find when you bring olivine to the p and T of the lower mantle is, in this sense, a perovskite.
502. There are also names for the minerals one expects to find between the 410 and the 520, and between the 520 and the 660; but let's not get into that.
503. It was Keith Bullen who first came up with that nomenclature, in (I think) his papers written in the 1930s and 1940s, and then in his book, The Earth's Density (1975). In practice, Bullen's models consisted of several layers, separated by discontinuities, which Bullen placed at the depths where he found particularly sharp changes in $v_P$ and $v_S$, or in their "gradients"—their rate of change with respect to depth. In practice, A is the crust all the way to the Moho; B goes from the Moho to the 410; C from the 410 to about 900–1000 km, which is where Bullen placed the bottom of the transition zone; D is what we now call the lower mantle: from the bottom of the transition zone to the core-mantle boundary; and E is the outer core and G the inner core. F is a shell, about 200 km thick or so, at the bottom of the outer core, where seismic-velocity gradients had seemed, in early studies (Jeffreys, Gutenberg, in the 1930s and 1940s and 1950s), to change quite a bit. (I don't think there's anything that looks like the F layer in any of the current models, anyway.)
504. As in Émile Clapeyron, the friend of Gabriel Lamé, that we met in note 325.
505. This gets more convoluted than I would like, but that's the way it is: it seems that there's yet another phase transition in the mantle, deeper than the perovskite-to-post-perovskite one. What happens is that at the very bottom of the mantle temperature grows very fast (in Chap. 9 we'll also understand why that could be): so fast that, despite p also growing a bit, at some depth the temperature profile might intersect the perovskite/post-perovskite Clapeyron curve, again: and so post-perovskite then would turn into perovskite again. I don't know how robust this observation is; but if we accept it, then we have one more constraint for T near the core-mantle boundary. Estimates of T derived in this way are not in disagreement with what we think about that part of the earth—but let's get back to this in Chap. 9.
506. Some sort of conduit, made of material that's refractory to the radiation: think a pipe, which is the only path for the radiation to eventually hit the photographic plate. Radiation coming out of the collimator is propagating, more or less, in the direction defined by the collimator.
507. More on electric and magnetic fields later in this chapter.
508. Or, it will explain it, when you read the stuff on electromagnetism later in this chapter.
509. Or more precisely: the moment when a zircon crystal achieves what people who know more than me about this stuff call radioisotopic closure. That's the moment when radioactive parents and daughters stop diffusing into or out of the mineral: and it happens when the mineral has cooled below a certain temperature.
510. Britannica: "crystal defect: imperfection in the regular geometrical arrangement of the atoms in a crystalline solid. These imperfections result from deformation of the solid, rapid cooling from high temperature, or high-energy radiation (X-rays or neutrons) striking the solid. Located at single points, along lines, or on whole surfaces in the solid, these defects influence its mechanical, electrical, and optical behaviour."
511. The instrument people use, by the way, to measure the, uhm, isotopic abundance of elements in a sample is the mass spectrometer. I have to admit, I am totally not a laboratory person. This is embarrassing, but... I am not sure how beneficial it would be to you, to me or to this book, if now I were to try and give an explanation of how a mass spectrometer works. Plus, the book is already very long as it is. I'll let you look this up elsewhere, if you are curious.
512. Different isotopes of a given element are instances of that element that differ by the number of neutrons in the nucleus—not the number of protons: which, like I said, is actually what we use to distinguish elements from one another. And, a non-radiogenic isotope is an isotope that cannot be born from radioactive decay—and doesn't decay, either, so its concentration won't change over time. (Because, yes, it just turns out that, for a number of reasons that we don't need to get into, different isotopes of the same element might or might not be radiogenic.)
513. Or, more precisely, the moment when radioisotopic closure is achieved; see note 509.
514. Zircon, by the way, is very resistant to weathering, and, as a result, you can find very nice samples of zircon crystals embedded in sedimentary rock. This, together with the nature of the nuclear reactions that happen in it, see above, and the fact that it is very chemically inert, i.e., it rarely takes part in chemical reactions with other substances; all these properties make zircon the most powerful radiometric clock that there is.
515. This was a very short account of radiometric dating; which entire books could be written, and have been written, about. The main thing that I am not talking about is that there are quite a few different decay schemes, or systems; I mentioned uranium-lead, which is what happens in zircon, and rubidium-strontium. The scheme you use depends on the samples you are able to find, of course; but also on how quick the decay is—the value of λ in Eq. (8.1). People like to talk of the half-life of a radioactive isotope, which is the time it takes for its concentration to be reduced, by radioactive decay, to one half its initial value. If we call $t_{1/2}$ the half-life, then
$$\frac{P(t_{1/2})}{P(0)} = \frac{1}{2},$$
by definition, and so, by Eq. (8.1),
$$t_{1/2} = -\frac{1}{\lambda}\ln\frac{1}{2}.$$
Which the point I am trying to make is that it doesn't matter whether the λ of a decay process is specified, or the $t_{1/2}$: one is the inverse of the other, times a numerical constant. So, then, for instance, the half-life of uranium, when transforming to lead, can be either 4.5 billion or 704 million years, depending on which isotope of uranium we start out with. The half-life of rubidium-87, decaying to strontium, is 48.8 billion years. The half-life of radiocarbon, which is the radioactive isotope of carbon (i.e., carbon-14, with 8 neutrons and, of course, 6 protons), is only about 5700 years. That is why radiocarbon dating is great in archaeology—or, in general, for dating "recent" stuff; while uranium-lead and rubidium-strontium are more suited for your typical geological samples.
516. See note 183.
517. Which I've mentioned in Chap. 4, note 154.
518. Actually, there's one way in which radioactivity coming directly from inside the earth can be bad for our health. Radon is the radiogenic, and radioactive, daughter of radium, which is in the same decay series as uranium and thorium—eventually decaying to a stable—non-radioactive—isotope of lead. The thing about radon is, radon is a gas: so, the radon that's produced in the earth's crust through radioactive decay is much lighter than the rocks around it, and will tend to seep out of the ground and into the atmosphere. If that happens outdoors, no big deal, because radon quickly disperses in the atmosphere. But if this happens in somebody's house, then the radon can accumulate to a concentration that can actually be bad for you—as radon itself decays, the radiation it emits can damage people's tissues, etc.
519. The current accepted estimate for the age of the earth, by the way, which is about 4.5 billion years, was established by Clair Patterson815 in a couple of papers published in 1955–56. Patterson used the isochron method to date meteorite samples816 found on earth: and it turned out—from his work and that of other
people looking at the same thing—that the great majority of meteorites are all of the same age, which is about four and a half billion years. So, like we already did at the end of Chap. 7 to speculate about the earth's chemical composition, Patterson made the hypothesis that most meteorites are left-overs from the formation of the solar system. Besides the usual chondrites, Patterson also looked at iron meteorites, which are a kind of meteorites that, yes, contain a lot of iron, but also, more importantly for us right now, contain almost no uranium. The reason this is important is that uranium is radioactive—the product of its decay is lead—and but has a very long half-life, which means it decays very slowly, which means you need some very important amount of uranium to generate, even in several billion years, any significant amount of lead. Now, a meteorite is a closed system, so all or almost all the lead we find in a meteorite where there's no uranium must be primordial, i.e., it must have been there at the moment when the meteorite formed—when the solar system, and the earth, formed. And the lead's so-called "isotopic composition" (i.e., the relative abundances of its various isotopes) in iron meteorites must be about the same as it was in the earth, at the moment the earth was formed. Patterson's method to date the earth is partly based on this idea. Patterson didn't date individual rocks, but thought of the entire earth as a closed system, and measured the isotopic composition, i.e., the average relative amounts of the various existing isotopes of lead in all the lead we have on earth. "Let the isotopic composition of lead in the earth at the time it was formed", he says in his 1955 paper817, "be Pb-204 = 1, Pb-206 = x, Pb-207 = y, [...] where x, y [...] are the number of atoms of these isotopes for each atom of Pb-204; let the isotopic composition of lead in the earth today be Pb-204 = 1, Pb-206 = x′, Pb-207 = y′ [...]." Now, Pb-204 is stable and non-radiogenic, so there should be about as much Pb-204 in the earth now as there was when the earth formed—acquired its current mass. Pb-206 and Pb-207 are stable and radiogenic, the ends of decay chains starting with U-238 and U-235. "Then", says Patterson, "Pb-206 = (x′ − x) is the amount of radiogenic Pb-206 generated by U-238 since the earth was formed, and similarly, Pb-207 = (y′ − y) is the amount of radiogenic Pb-207 generated by U-235. These amounts of radiogenic leads may be expressed in terms of the present U-238/U-235 ratio, k, the disintegration constants of U-235 and U-238, λ₁ and λ₂, and the period of decay, T, as
$$\frac{y' - y}{x' - x} = \frac{1}{k}\,\frac{e^{\lambda_1 T} - 1}{e^{\lambda_2 T} - 1}.$$
This expression is solved for T." Of course, to solve for T you need to know x and y, i.e., the initial amounts of Pb-206 and Pb-207 in the earth. But, and here's why I mentioned iron meteorites a minute ago, Patterson assumes that "when the earth was formed, it contained lead with an isotopic composition the same as that in iron meteorites". So, "the
values of x and y are measured in the lead from iron meteorites and the values of x′ and y′ are measured in a sample of the average lead in the earth's crust. The age of the earth calculated in this manner, about 4.5 × 10⁹ yr, is considerably greater than the generally accepted estimate", but perfectly consistent with the age of meteorites—which could hardly be a coincidence.
520. A pioneer of nuclear physics who won the Nobel prize in 1908 "for his investigations into the disintegration of the elements, and the chemistry of radioactive substances".
521. AKA, the fourth baron of Rayleigh—son of John William Strutt, AKA the baron of Rayleigh. I am going to use Strutt for the son, and Rayleigh for the father.
522. Who kicked off his career as a grad student of Strutt's.
523. "The Terrestrial Distribution of Radium", in Science Progress in the Twentieth Century (1906–1916), vol. 9, 1914.
524. The calorie (cal) is a unit of energy, just like the Joule. One cal is defined as the amount of heat you need to raise by 1 °C the temperature of 1 g of water. Today we tend to use the Joule, instead, as you have seen in earlier chapters; 1 cal = 4.184 J.
525. "Radioactivity and Earth Movements", Transactions of the Geological Society of Glasgow, vol. 18.
526. In his paper "Arthur Holmes and Continental Drift", see note 244. The bit on Jeffreys is at pages 134 to 136.
527. In his 1906 paper that we've met before, Oldham writes: "In the case of the waves emerging at 90° from the origin, the material traversed has, on the average, nearly 12 times the resistance of granite to compression, and 15 times its rigidity [...]. It must, however, be borne in mind that this high degree of rigidity, as against stresses of very short duration, is quite compatible with the yielding to stresses of long duration, which is required by known facts of structural geology, and need not necessarily be inconsistent with those movements, of the nature of convection-currents, which Mr. Fisher [Physics of the Earth's Crust, see note 216] believes to exist in the interior of the earth."
528. In one of his contributions to Theory of Continental Drift: a Symposium, etc., van der Gracht replied to Jeffreys that his objection was really "a confusion of 'rigidity' with 'residual rigidity' or 'strength'. The sima has greater 'rigidity' than the sial, but the latter, as a whole, evidently has greater 'strength'. That is, it has a greater resistance against long enduring stresses [...]. Whether a substance is 'hard' or 'soft', like steel or lead, has not necessarily anything to do with its 'strength'. In looking at similies, substances should be used whose properties are more within our grasp, because we can handle them under room temperature, and ordinary pressures and time intervals. So I shall refer [...] to beeswax and pitch. Pitch has great 'rigidity' [...] but extremely little 'strength'. While beeswax has little rigidity, but considerable strength. We can perfectly well press a chisel of soft beeswax into a block of hard pitch, provided we push in our chisel slowly enough. That is what happens both in an isostatic adjustment and at the front of a continent floating forward into the sima."
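Patterson's relation from note 519, by the way, is easy to solve for T numerically. The sketch below is mine, not Patterson's, and the lead isotope ratios are illustrative textbook-style values (roughly, iron-meteorite lead for the primordial x, y and average modern terrestrial lead for x′, y′), not his actual measurements:

```python
import math

# Decay constants from half-lives (note 515: lambda = ln 2 / t_half)
lam1 = math.log(2) / 704e6    # U-235, per year
lam2 = math.log(2) / 4.468e9  # U-238, per year
k = 137.88                    # present-day U-238/U-235 atom ratio

# Illustrative lead isotope ratios (Pb-206/Pb-204, Pb-207/Pb-204):
x, y = 9.307, 10.294          # primordial (iron-meteorite) lead
xp, yp = 18.7, 15.6           # modern average terrestrial lead

target = (yp - y) / (xp - x)

def f(T):
    # Patterson's relation minus the measured ratio; its root is the age.
    return (math.exp(lam1 * T) - 1) / (k * (math.exp(lam2 * T) - 1)) - target

# Bisection between 1 and 10 billion years (f is increasing in T).
lo, hi = 1e9, 1e10
for _ in range(100):
    mid = (lo + hi) / 2
    if f(lo) * f(mid) <= 0:
        hi = mid
    else:
        lo = mid
print(lo / 1e9)  # roughly 4.4-4.5: about four and a half billion years
```

The point is that once x, y, x′, y′ are measured, T is the only unknown left, which is exactly what Patterson means by "this expression is solved for T."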
529. A non-linear viscous fluid would be one where η itself changes depending on the rate of deformation. Linear fluids are also called Newtonian (I don't know why), and non-linear ones, you've guessed it, are also called non-Newtonian.
530. I don't think that's really what Stokes does in his famous 1849 paper "On the Theories of the Internal Friction of Fluids in Motion, and of the Equilibrium and Motion of Elastic Solids" (published in the Transactions of the Cambridge Philosophical Society), where he first figured out the viscous version of Navier-Stokes, etc. Here, like in most current textbooks (see, for example, Rheology of the Earth, by Giorgio Ranalli), I sort of have Eq. (8.16) fall from the sky, implicitly asking you to believe it as an experimental fact: that there are materials in nature that behave like that, and we call them viscous fluids, etc. And in Chap. 6 I did the same with Hooke's law. Indeed, there are experiments that show those constitutive relations—elastic and viscous—work, at least under certain conditions. But the constitutive relation can also be derived from basic principles, like Newton's law, and simple hypotheses about how the particles of a fluid interact with one another; and I think that's more like what Stokes did in his paper.
531. Doing things this way, we end up with two theories, each with its own mathematical machinery: one for elastic media and one for viscous media. Alternatively, one can come up with only one stress-strain relationship, that boils down to the elastic one when time is very short, and to the viscous one when time is very long. People call viscoelastic a medium that's described by a stress-strain relationship like that. There isn't only one viscoelastic stress-strain relationship: people have looked into various options and I don't think we are clear as to which is most realistic. There's whole books written on this: see e.g. the one by Ranalli, that I have cited in note 530.
532. Norman Haskell was born in Illinois and but went to Harvard—he received his Ph.D. in physics from there in 1936. "After doing research work at Columbia University and the California Institute of Technology, he worked for the Air Force Cambridge (Mass.) Research Laboratory from 1948 until his retirement in 1968. He taught theoretical geophysics at the Massachusetts Institute of Technology from 1950 to 1954, and was a technical adviser to the United States delegation at the Geneva Conference on a nuclear test ban treaty in 1958." This is from his obituary, published in the New York Times (April 15, 1970), where it is also said that "he was 64 years old and lived in Cummaquid", which is a village in Cape Cod, about halfway along the peninsula, looking on Cape Cod Bay.
533. The one that I am following most closely is "The Motion of a Viscous Fluid Under a Surface Load", Physics, vol. 6, p. 265.
534. Or one of the first times: Haskell makes a reference to a German paper that had just come out818. But Haskell's work is the one that is still getting cited in textbooks and papers today; and his order-of-magnitude estimate of the earth's mean η is still considered to be OK.
535. Why isn’t this done in laboratory studies, similar to seismic velocities? Well, I guess people try to do so, but viscosity, says Haskell, is “difficult to measure experimentally, different observers giving widely varying values”. 536. The idea of postglacial rebound was first uttered by Thomas Jamieson, a British geologist, in a paper “On the History of the Last Geological Changes in Scotland”, published in vol. 21 of the Quarterly Journal of the Geological Society of London (1865). “In Scandinavia and North America,” says Jamieson, “as well as in Scotland, we have evidence of a depression of the land following close upon the presence of the great ice-covering; and, singular to say, the height to which marine fossils have been found in all these countries is very nearly the same. It has occurred to me that the enormous weight of ice thrown upon the land may have had something to do with this depression. Agassiz considers the ice to have been a mile thick in some parts of America; and everything points to a great thickness in Scandinavia and North Britain. We don’t know what is the state of the matter on which the solid crust of the earth reposes. If it is in a state of fusion, a depression might take place from a cause of this kind, and then the melting of the ice would account for the rising of the land, which seems to have followed upon the decrease of the glaciers.” People didn’t pay much attention to this, though, apparently. The concept of post-glacial rebound took hold after Gerard De Geer, a Swedish geologist, published, in 1888, a map of the rate of uplift all over Scandinavia. He did it by figuring out—the way geologists are able to go to the field, look at outcrops and stuff, and figure out these things—the highest shore-line ever reached by water, place by place. The pattern he came up with, Fig. N.39, correlates well with what people looking at moraines, etc. (see note 234), estimated to be the area covered by glaciation. 
Which then “it does, therefore, not seem to be easy avoiding the conclusion at which Jamieson arrived already in 1865,” wrote De Geer, “namely that the immense loading of the ice gradually caused a local depression of the Earth’s crust which is supposed to be in a fairly sensitive state of equilibrium, and that the area after the deglaciation slowly raised again although it has scarcely succeeded in fully reaching its original level”. (This is from a paper which I think exists only in Swedish. I took the translation from Martin Ekman’s book, The Changing Level of the Baltic Sea during 300 Years: A Clue to Understanding the Earth, published by the Summer Institute for Historical Geophysics of the Åland Islands819 , 2009.) 537. See note 536. 538. Fridtjof Nansen, The Earth’s Crust, its Surface Forms and Isostatic Adjustment, 1928. 539. Finland plus Norway plus Sweden, or Scandinavia minus Denmark. 540. If we were to pick the zero of t before the end of the ice age, we would need info on the mass and extent of the load—which Haskell didn’t have. 541. Transforming Eq. (8.19) from Cartesian to cylindrical coordinates is a laborious exercise in vector algebra, that I am now going to solve for you. It all boils down to rewriting the cylindrical-coordinate components of the gradient, whose
Fig. N.39 De Geer’s 1888 map of uplift—the isolines are the loci of points of equal highest shore line. This was redrawn by Martin Ekman for his book The Changing Level of the Baltic Sea during 300 Years: A Clue to Understanding the Earth, 2009. Used with permission
Cartesian components are $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$, $\frac{\partial}{\partial z}$, in terms of derivatives with respect to the cylindrical coordinates r, ϕ and z. To start off, Fig. 8.2 and a bit of trigonometry tell us that the formulae that relate Cartesian and cylindrical coordinates are
$$\begin{cases} x = r\cos\varphi,\\ y = r\sin\varphi,\\ z = z,\end{cases} \tag{N.162}$$
and
$$\begin{cases} r = \sqrt{x^2+y^2},\\ \varphi = \arctan\frac{y}{x},\\ z = z,\end{cases} \tag{N.163}$$
and
$$\begin{cases} \hat{x} = \hat{r}\cos\varphi - \hat{\varphi}\sin\varphi,\\ \hat{y} = \hat{r}\sin\varphi + \hat{\varphi}\cos\varphi,\\ \hat{z} = \hat{z},\end{cases}$$
where $\hat{x}$, $\hat{y}$ and $\hat{z}$ are unit vectors directed like the Cartesian axes, while $\hat{r}$ lies on the x, y plane and points from the origin to the point (x, y), and $\hat{\varphi}$ lies on the same plane, is perpendicular to $\hat{r}$ and points in the sense of increasing ϕ. So, then, consider e.g. the derivative of a function f = f(x, y, z) with respect to x:
$$\begin{aligned}
\frac{\partial f}{\partial x} &= \frac{\partial f}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial f}{\partial \varphi}\frac{\partial \varphi}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x}\\
&= \frac{\partial f}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial f}{\partial \varphi}\frac{\partial \varphi}{\partial x}\\
&= \frac{\partial f}{\partial r}\,\frac{\partial}{\partial x}\sqrt{x^2+y^2} + \frac{\partial f}{\partial \varphi}\,\frac{\partial}{\partial x}\arctan\frac{y}{x}\\
&= \frac{\partial f}{\partial r}\,\frac{x}{r} + \frac{\partial f}{\partial \varphi}\,\frac{-\frac{y}{x^2}}{1+\frac{y^2}{x^2}}\\
&= \frac{\partial f}{\partial r}\,\frac{x}{r} - \frac{\partial f}{\partial \varphi}\,\frac{y}{r^2}\\
&= \frac{\partial f}{\partial r}\cos\varphi - \frac{\partial f}{\partial \varphi}\,\frac{\sin\varphi}{r}.
\end{aligned}$$
At this point in this book, with some patience, which certainly you've got a lot of if you got all the way to here, you should be able to figure out most or all of what I've just done: I've used the chain rule; the trivial fact that $\frac{\partial z}{\partial x} = 0$; Eqs. (N.162) and (N.163), of course; and the trigonometry result $\frac{\partial}{\partial x}\arctan[g(x)] = \frac{1}{1+g^2(x)}\frac{\partial g}{\partial x}$ for any decently behaved function g(x). One then does pretty much the same thing with the derivative of f versus y, which gives
$$\frac{\partial f}{\partial y} = \frac{\partial f}{\partial r}\sin\varphi + \frac{\partial f}{\partial \varphi}\,\frac{\cos\varphi}{r}.$$
As for $\frac{\partial f}{\partial z}$, we don't need to do anything about that, because z is the same coordinate in both systems. The gradient of f then can be rewritten
$$\begin{aligned}
\nabla f &= \frac{\partial f}{\partial x}\hat{x} + \frac{\partial f}{\partial y}\hat{y} + \frac{\partial f}{\partial z}\hat{z}\\
&= \left(\frac{\partial f}{\partial r}\cos\varphi - \frac{\partial f}{\partial \varphi}\frac{\sin\varphi}{r}\right)\hat{x} + \left(\frac{\partial f}{\partial r}\sin\varphi + \frac{\partial f}{\partial \varphi}\frac{\cos\varphi}{r}\right)\hat{y} + \frac{\partial f}{\partial z}\hat{z}\\
&= \left(\frac{\partial f}{\partial r}\cos\varphi - \frac{\partial f}{\partial \varphi}\frac{\sin\varphi}{r}\right)(\hat{r}\cos\varphi - \hat{\varphi}\sin\varphi)\\
&\quad + \left(\frac{\partial f}{\partial r}\sin\varphi + \frac{\partial f}{\partial \varphi}\frac{\cos\varphi}{r}\right)(\hat{r}\sin\varphi + \hat{\varphi}\cos\varphi) + \frac{\partial f}{\partial z}\hat{z}\\
&= \frac{\partial f}{\partial r}\hat{r} + \frac{1}{r}\frac{\partial f}{\partial \varphi}\hat{\varphi} + \frac{\partial f}{\partial z}\hat{z}.
\end{aligned}$$
The second derivatives of f with respect to x, y:
$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \left(\cos\varphi\,\frac{\partial}{\partial r} - \frac{\sin\varphi}{r}\frac{\partial}{\partial \varphi}\right)\left(\cos\varphi\,\frac{\partial f}{\partial r} - \frac{\sin\varphi}{r}\frac{\partial f}{\partial \varphi}\right) \tag{N.164}$$
and
$$\frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \left(\sin\varphi\,\frac{\partial}{\partial r} + \frac{\cos\varphi}{r}\frac{\partial}{\partial \varphi}\right)\left(\sin\varphi\,\frac{\partial f}{\partial r} + \frac{\cos\varphi}{r}\frac{\partial f}{\partial \varphi}\right). \tag{N.165}$$
A formula for the Laplacian of f in cylindrical coordinates is obtained summing the right-hand sides of (N.164) and (N.165). This requires some algebra, but it is not that bad. Think about it this way: the sum of the products of the first terms, $\cos^2\varphi\,\frac{\partial^2}{\partial r^2}$ and $\sin^2\varphi\,\frac{\partial^2}{\partial r^2}$, gives a second derivative with respect to r; the products of the "outside" terms, $\cos\varphi\,\frac{\partial}{\partial r}$ times minus and plus $\frac{\sin\varphi}{r}\frac{\partial f}{\partial \varphi}$, cancel out; the sum of the products of the "inside" terms gives a derivative with respect to r, divided by r. And finally, the sum of the products of the last terms gives a second derivative with respect to ϕ, divided by r². Put it all together, and
$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial \varphi^2} + \frac{\partial^2}{\partial z^2}.$$
Or, which is the same (work out the r-derivative and see),
$$\nabla^2 = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial \varphi^2} + \frac{\partial^2}{\partial z^2}.$$
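If you'd rather not grind through the algebra by hand, the cylindrical Laplacian can be checked symbolically. This is a sketch of mine, not from the book; the test function F is an arbitrary choice:

```python
import sympy as sp

# Cylindrical coordinates and an arbitrary smooth test function of X, Y, Z.
r, phi, z = sp.symbols('r phi z', positive=True)
X, Y, Z = sp.symbols('X Y Z')
F = X**2 * Y + Z**3          # hypothetical test function

# Cartesian Laplacian, then rewritten in cylindrical coordinates.
lap_cart = sp.diff(F, X, 2) + sp.diff(F, Y, 2) + sp.diff(F, Z, 2)
subs = {X: r*sp.cos(phi), Y: r*sp.sin(phi), Z: z}
lap_cart_cyl = lap_cart.subs(subs)

# Same function expressed in r, phi, z, fed to the cylindrical formula.
f_cyl = F.subs(subs)
lap_cyl = (sp.diff(r*sp.diff(f_cyl, r), r)/r
           + sp.diff(f_cyl, phi, 2)/r**2
           + sp.diff(f_cyl, z, 2))

# The two expressions agree, as the derivation in the text says.
print(sp.simplify(lap_cart_cyl - lap_cyl))   # 0
```

Swapping in any other smooth F gives the same zero, which is as close to a proof as a computer-algebra check gets.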
In Haskell's paper, things are simplified by the fact that the glacial load is assumed to be concentrated at one point, and so we just have to place the origin of the reference frame at that point, for everything to be symmetric with respect to ϕ: which means that the derivatives of the components of v with respect to ϕ are zero, and the component of v in the ϕ direction is also zero. Things are also complicated, though, by the fact that Haskell takes the Laplacian of a vector, rather than a scalar. In practice what we have is
$$\begin{aligned}
\nabla^2 \mathbf{v} &= \left[\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial \varphi^2} + \frac{\partial^2}{\partial z^2}\right](v_r \hat{r} + v_z \hat{z})\\
&= \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial v_r}{\partial r}\right)\hat{r} + \frac{v_r}{r^2}\frac{\partial^2 \hat{r}}{\partial \varphi^2} + \frac{\partial^2 v_r}{\partial z^2}\hat{r} + \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial v_z}{\partial r}\right)\hat{z} + \frac{\partial^2 v_z}{\partial z^2}\hat{z}\\
&= \left[\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial v_r}{\partial r}\right) - \frac{v_r}{r^2} + \frac{\partial^2 v_r}{\partial z^2}\right]\hat{r} + \left[\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial v_z}{\partial r}\right) + \frac{\partial^2 v_z}{\partial z^2}\right]\hat{z},
\end{aligned} \tag{N.166}$$
698
542.
543.
544.
545.
Notes
where, to differentiate rˆ with respect to ϕ, you can write it in Cartesian coor∂ 2 rˆ r. dinates, rˆ = (cos ϕ, sin ϕ), and then it’s not so hard to show that ∂ϕ 2 = −ˆ Equation (8.21) follows fairly straightforwardly from (N.166). In Haskell’s paper, the right-hand side of his Eq. (1.41), which is equivalent to (8.25) here, is nonzero and depends on r and t. Because, in fact, Haskell works out the theory for the general problem of a viscous half-space’s response to a generic time-dependent load. But as far as the practical application that he was and we are after is concerned, i.e., to estimate the η of the earth, he sets the load to zero—he studies the earth’s “recovery after removal of load.” When I got rid of “inertia” in the step that brought us from (8.19) to (8.20), I also practically eliminated time from the equations of motion. In the next few pages we are going to look for functions of r and z, leaving t behind. Implicitly, though, we are requiring that the arbitrary coefficients that emerge from solving (8.21) and (8.22) be t-dependent. This dependence on t will become explicit again when we invoke the boundary conditions, (8.25) and (8.26), that I am about to write, and that actually carry the time-dependence: because the topography of the free surface, ζ, is a function of time. And only by knowing already where we are going I can predict that it is convenient to write them as minus a positive (i.e., square, assuming λ is real) number. Check out what we did in Chap. 6, note 292, when we solved the wave equation the Bernoulli way: it’s the same thing. Which, in turn, is equivalent to what people call the Bessel equation820 , z 2 y¨ (z) + z y˙ (z) + (z 2 − b2 )y(z) = 0.
(N.167)
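The claim that ∂²r̂/∂ϕ² = −r̂, which makes the −v_r/r² term appear in (N.166), is easy to check numerically. A minimal sketch in Python (mine, not the book’s): write r̂ in Cartesian components and take a finite-difference second derivative.

```python
import math

def r_hat(phi):
    # the unit radial vector, written in Cartesian components
    return (math.cos(phi), math.sin(phi))

def second_derivative(f, phi, h=1e-5):
    # central finite-difference approximation of d^2 f / d phi^2,
    # applied component-wise
    fp, f0, fm = f(phi + h), f(phi), f(phi - h)
    return tuple((a - 2.0 * b + c) / h**2 for a, b, c in zip(fp, f0, fm))

phi = 0.7  # any angle will do
d2 = second_derivative(r_hat, phi)
minus_r = tuple(-c for c in r_hat(phi))
print(d2, minus_r)  # the two should agree to several decimal places
```

The same trick (differentiate the Cartesian components, then read the result back in polar form) is how one checks all the unit-vector derivatives that show up in cylindrical coordinates.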
To convince yourself that (N.167) and (8.44) are the same thing, define z = ax; then by the chain rule,

$$ \frac{dy}{dx} = \frac{dy}{dz}\frac{dz}{dx} = a\frac{dy}{dz}, $$

and likewise

$$ \frac{d^2 y}{dx^2} = a^2 \frac{d^2 y}{dz^2}; $$

sub all that into (8.44), and

$$ a^2 \frac{z^2}{a^2}\frac{d^2 y}{dz^2} + a\,\frac{z}{a}\frac{dy}{dz} + \left(z^2 - b^2\right) y = 0, $$

from which (N.167) follows, QED.
546. I thought about giving you a convincing, self-contained proof that Jn and Yn are linearly independent from one another, that (8.45) is indeed the general solution to (8.44), etc., and I started to look into how I could do it in a reasonable amount
of space: and but that basically opened a Pandora’s box of mathematical stuff that I wasn’t planning on doing in this book, and that to some extent I also had never studied myself: and because this is not supposed to be a mathematics book, and it is already quite long as it is, and I will have to find a publisher for it: because of all these reasons I decided to give up on this idea. This having been said, I still think it’s worthwhile to at least get a feeling of how a problem like (8.44) can be attacked, and an idea of what the Bessel functions actually are. So after some deliberation, I figured I’ll show you how Bessel functions of the first kind emerge sort of naturally when you try to solve (8.44). Which you remember from note 545 that solving (8.44) is equivalent to solving (N.167)—the actual Bessel equation. Equation (N.167) can be solved by the so-called Frobenius method821, which consists of substituting into it a tentative solution that reads

$$ y(z) = \sum_{n=0}^{\infty} a_n z^{n+r} \tag{N.168} $$
(with z > 0 and a₀ ≠ 0), i.e., a “power series”; and then of doing the math to see if (N.167) is verified—or, rather, to find values of the aₙ’s and of r such that (N.167) is verified. To begin with, differentiate Eq. N.168, which gives

$$ \frac{dy}{dz} = \sum_{n=0}^{\infty} (n+r)\, a_n z^{n+r-1}, $$

and

$$ \frac{d^2 y}{dz^2} = \sum_{n=0}^{\infty} (n+r)(n+r-1)\, a_n z^{n+r-2}. $$

Then, substitute all that into the Bessel equation (N.167), and

$$ \sum_{n=0}^{\infty} (n+r)(n+r-1)\, a_n z^{n+r} + \sum_{n=0}^{\infty} (n+r)\, a_n z^{n+r} + z^2 \sum_{n=0}^{\infty} a_n z^{n+r} - b^2 \sum_{n=0}^{\infty} a_n z^{n+r} = 0, $$

or

$$ z^r \sum_{n=0}^{\infty} \left[ (n+r)(n+r-1) + (n+r) - b^2 \right] a_n z^n + z^r \sum_{n=0}^{\infty} a_n z^{n+2} = 0, $$

which after some algebra becomes

$$ \sum_{n=0}^{\infty} \left[ (n+r)^2 - b^2 \right] a_n z^n + \sum_{n=0}^{\infty} a_n z^{n+2} = 0. \tag{N.169} $$
For (N.169) to be verified independent of the value of z, we need the coefficient of each power of z in it, including the constant term (i.e., the term where the exponent of z is 0), to be 0. Remember that Frobenius requires that a₀ ≠ 0. For that to be true, and the constant term, (r² − b²)a₀, to be 0 at the same time, we need r² − b² = 0, or r = ±b. Whether we pick the + or −, the next steps are going to be exactly the same, so let’s say r = b for now. Requiring the coefficients of zᵏ in (N.169) to be zero for all values of k from 1 to ∞, we get

$$ \left[ (1+b)^2 - b^2 \right] a_1 = 0 \tag{N.170} $$

(which is the coefficient of z), and

$$ \left[ (k+b)^2 - b^2 \right] a_k + a_{k-2} = 0 \tag{N.171} $$

for all k ≥ 2. Equation (N.170) is verified if a₁ = 0. Next, solve (N.171) for aₖ, and you get

$$ a_k = -\frac{a_{k-2}}{(k+b)^2 - b^2} = -\frac{a_{k-2}}{k(k+2b)}, $$
which is what people call a “recurrence relation”—if you have a₀ and a₁, you can use it to find all the aₖ with k > 1. We actually do have a₁: it’s zero: and so if k is odd the recurrence relation gives a₃ = a₅ = a₇ = ⋯ = 0. If k is even we get:

$$ a_2 = -\frac{a_0}{2^2\,(1+b)}; $$

$$ a_4 = -\frac{a_2}{4(4+2b)} = (-1)^2 \frac{a_0}{2^4\, 2!\,(1+b)(2+b)}; $$

$$ a_6 = -\frac{a_4}{6(6+2b)} = (-1)^3 \frac{a_0}{2^6\, 3!\,(1+b)(2+b)(3+b)}, $$

and so on and so forth: which you can also write

$$ a_{2k} = (-1)^k \frac{a_0}{2^{2k}\, k!\,(1+b)(2+b)(3+b)\cdots(k+b)}. $$
Now we can sub what we’ve found into (N.168), and the bottom line is that822

$$ y(z) = a_0 z^b \left[ 1 + \sum_{n=1}^{\infty} \frac{(-1)^n z^{2n}}{2^{2n}\, n!\,(1+b)(2+b)(3+b)\cdots(n+b)} \right] \tag{N.172} $$

solves the Bessel equation, whatever the value of a₀. (Incidentally, you see now why Frobenius decided that a₀ shouldn’t be zero: because if a₀ = 0 then the Frobenius solution is nothing but the trivial solution that y(z) = 0 for all z: which is certainly correct, but not very interesting.) If we replace b with −b in (N.172) we also get a solution of the Bessel equation, because see above how we dealt with r being equal to either b or −b. The functions y(z) that we just found are precisely what we call “Bessel functions of the first kind”. They are usually denoted Jb(z), and the number b (or −b) is called their order. Now, Jb is one solution of the Bessel equation. The Bessel equation is a second-order823, linear, ordinary differential equation. The fact that it’s second order means that we need two linearly independent solutions to fully determine the general solution—the general solution being their generic linear combination (see note 46). So, don’t ask why, because that’s part of the stuff in the Pandora’s box that I’ve decided should stay closed; don’t ask why, but a second solution that’s linearly independent from Jb is824 the Bessel function of the second kind, which you can get from Jb and J−b through

$$ Y_b(z) = \lim_{\alpha \to b} \frac{\cos(\alpha\pi)\, J_\alpha(z) - J_{-\alpha}(z)}{\sin(\alpha\pi)}. $$
I don’t know how useful it is to show you this formula, given that I am not explaining anything about it; but it would be even weirder to not even say what Yb is. And this is the most compact formula I could find. The “lim” in it makes sense, if you consider that, without going to the limit, and when b is integer, the ratio is like 0/0: undefined. But the limit, actually, exists. To wrap this up, we might write the general solution of the Bessel equation (N.167),

$$ y(z) = A\, J_b(z) + B\, Y_b(z), $$

where A and B are two arbitrary coefficients; and Eqs. (8.46) and (8.47) follow.
547. Its general form, that you might find in books or on the Internet, is

$$ \frac{d}{dx}\left[ x^n J_n(x) \right] = x^n J_{n-1}(x). $$
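If you want to see the Frobenius recurrence at work, here is a short Python sketch (mine, not the book’s) that builds Jb(z) from the a₂ₖ above. The normalization a₀ = 1/(2ᵇ Γ(b+1)) is the conventional choice that makes the series agree with the Bessel functions you find tabulated.

```python
import math

def J(b, z, terms=30):
    # Bessel function of the first kind, built from the Frobenius series
    # (N.172), with the conventional normalization a0 = 1 / (2**b Gamma(b+1))
    a0 = 1.0 / (2.0**b * math.gamma(b + 1.0))
    coeff, total = 1.0, 1.0
    for n in range(1, terms):
        # ratio of consecutive even coefficients, from the recurrence:
        # a_{2n} / a_{2n-2} = -1 / (2n (2n + 2b)) = -1 / (4 n (n + b))
        coeff *= -1.0 / (4.0 * n * (n + b))
        total += coeff * z**(2 * n)
    return a0 * z**b * total

print(J(0, 1.0), J(1, 2.0))  # compare with tabulated values of J0(1), J1(2)
```

Thirty terms is overkill for small z—the factorial in the denominator kills the series very quickly—but it costs nothing.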
548. The n functions fᵢ(x), i = 1, 2, …, n are linearly dependent if there exist n coefficients a₁, a₂, …, aₙ, not all zero, such that the linear combination $\sum_{i=1}^{n} a_i f_i(x) = 0$ for all x. If you can show that n such coefficients don’t exist, then you’ve shown that the functions are linearly independent. If our four functions were linearly dependent, that would mean that there’d be four coefficients a₁, a₂, a₃, a₄ such that

$$ a_1 e^{-\lambda x} + a_2 x e^{-\lambda x} + a_3 e^{\lambda x} + a_4 x e^{\lambda x} = 0 \tag{N.173} $$

for all x. But, if we plug x = 1 in (N.173),

$$ a_1 e^{-\lambda} + a_2 e^{-\lambda} + a_3 e^{\lambda} + a_4 e^{\lambda} = 0, $$

or

$$ (a_1 + a_2) e^{-\lambda} + (a_3 + a_4) e^{\lambda} = 0, $$

which (unless λ = 0, but then we wouldn’t have four solutions to begin with) can only be verified if

$$ a_1 = -a_2 \tag{N.174} $$

and

$$ a_3 = -a_4. \tag{N.175} $$

If, instead, we sub x = −1 into (N.173),

$$ a_1 e^{\lambda} - a_2 e^{\lambda} + a_3 e^{-\lambda} - a_4 e^{-\lambda} = 0, $$

or

$$ (a_1 - a_2) e^{\lambda} + (a_3 - a_4) e^{-\lambda} = 0, $$

which means that we must have

$$ a_1 = a_2 \tag{N.176} $$

and

$$ a_3 = a_4. \tag{N.177} $$

Now, there’s no way both (N.174) and (N.176) can be verified unless a₁ = a₂ = 0; and, likewise, from the combination of (N.175) and (N.177) we must infer that a₃ = a₄ = 0. This means that (N.173) can’t be true unless the coefficients a₁ through a₄ are all 0; i.e., it is impossible to find a set of coefficients a₁, a₂, a₃, a₄ that are not all zero; or, our four functions are linearly independent, QED.
549. Because (8.21)–(8.22), with vr, vz and p̄ as unknown functions, is a homogeneous system of differential equations: remember my short introduction to differential equations, note 46.
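The same conclusion can be reached numerically: sample the four functions at four points and check that the resulting 4×4 matrix is nonsingular—a nonzero determinant means the only combination that vanishes at those points is the trivial one. A quick sketch (my own, with an arbitrary λ = 0.5 and arbitrary sample points):

```python
import math

lam = 0.5                      # arbitrary nonzero lambda
xs = [-1.0, 0.0, 1.0, 2.0]     # four arbitrary sample points

def row(x):
    # the four candidate solutions evaluated at x
    return [math.exp(-lam * x), x * math.exp(-lam * x),
            math.exp(lam * x),  x * math.exp(lam * x)]

M = [row(x) for x in xs]

def det(m):
    # determinant via Gaussian elimination with partial pivoting
    m = [r[:] for r in m]
    n, d = len(m), 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda k: abs(m[k][i]))
        if p != i:
            m[i], m[p] = m[p], m[i]
            d = -d
        if m[i][i] == 0.0:
            return 0.0
        d *= m[i][i]
        for k in range(i + 1, n):
            f = m[k][i] / m[i][i]
            for j in range(i, n):
                m[k][j] -= f * m[i][j]
    return d

print(det(M))  # nonzero: only the trivial combination vanishes at these points
```

Strictly speaking a nonzero determinant at sampled points proves independence, while a zero one would only be suggestive (you might have picked unlucky points); for these four functions any generic choice of points works.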
550. I explained the orthogonality of sines and cosines in note 292.
551. See note 277.
552. An early example of a frequently moving scholar, Benjamin Thompson AKA Count Rumford was born near Boston, and fought for the English in the Revolutionary War—first as a spy, apparently, then as lieutenant colonel and commander of a British regiment in New York. This involved abandoning his first wife and daughter—forever, as it turned out. Exiled to Britain, he made his reputation as a scientist while working various high-profile jobs for the government. In 1784, with King George’s permission, he moved to Munich, where he served as confidential adviser of the elector of Bavaria. He did really well, there, e.g., he introduced potatoes to Bavaria for the first time (as well as James Watt’s steam engine). The elector created him a count, and Thompson picked Rumford as his title—that was the original name of Concord, New Hampshire: the town where he had lived with his American wife. “Through his arrogance and the general unpleasantness of his character,” writes Asimov in his Encyclopedia, Rumford “finally outwore his welcome in Bavaria too, particularly after the death of the elector. That, and the pressure of Napoleon’s victories, made it advisable for Rumford to return to England in 1799. [...] “In 1804 Rumford went to Paris, [and] (having outlived his first wife) proceeded to marry Lavoisier’s widow (who was rich and who kept the famous name of her martyred first husband). It was a late marriage—he being slightly over fifty, she slightly under—and an unhappy one, their first quarrel coming the day after their marriage. After four years they separated and Rumford was so ungallant as to hint that she was so hard to get along with that Lavoisier was lucky to have been guillotined. However, it is quite obvious that Rumford was no daisy himself.”
553. Rumford, Essays, Political, Economical, and Philosophical (T. Cadell and W. Davies, London, 1798), Essay VII.
To be honest, I found the citation in S. C. Brown, “The Discovery of Convection Currents by Benjamin Thompson, Count of Rumford”, American Journal of Physics, vol. 15, 1947. Brown mentions that “the effect went without a name for many years. It was not until 1834 that the well-known William Prout, writing in one of the Bridgewater Treatises825 suggested: ‘There is at present no single term in our language employed to denote this mode of propagation of heat; but we venture to propose for that purpose the term convection (convection, a carrying or conveying), which not only expresses the leading facts, but also accords very well with the two other terms [conduction and radiation].’ Even after this name was suggested, one finds that it was another 20 years before convection found its way to the universal acceptance which the term now enjoys.” 554. “On the Convective Currents in a Horizontal Layer of Fluid when the Higher Temperature Is on the Under Side,” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, vol. 32. 555. In a paper that he published in 1851 in the Transactions of the Cambridge Philosophical Society, Stokes shows that when a sphere of radius d moves
at velocity v through a viscous material with viscosity η, the viscous drag, or resisting force offered by the material, is F = 6πηdv, which, if you turn it around and pick a₂ = 1/(6π), is exactly Eq. (8.103). (Which of course, we have to make the assumption that the buoyancy force and the viscous drag are equal, i.e., convection is, as they say, a steady-state, dynamic-equilibrium sort of thing where Archimedes’ push (or pull) is exactly balanced by the drag of viscosity, or friction or whatever you want to call it.) Stokes gave a mathematical proof of this starting from the equations of motion for a viscous fluid—Eq. (8.19): see above—before his result was also verified experimentally. I hope you won’t be disappointed if I skip Stokes’ proof here—it’s really long and complex, to the point that even I think that that might be a bit too much. Instead, I’ll show you a simplified “proof”, which I hope will be enough to convince you that Stokes’ law, at least, makes sense. It starts off by reasoning that the drag force F might depend on: the viscosity η of the fluid; the size (e.g., d, the radius of the sphere) of the object that’s trying to move through the fluid; the speed v at which it moves; and nothing else. We assume that, mathematically, this can be written

$$ F = k\, \eta^x d^y v^z, \tag{N.178} $$

where k is a dimensionless constant, and we don’t know a priori the values of x, y, z. But we can figure them out if we do what people call a dimensional analysis. The dimensions of F are mass (M) times acceleration, or mass times length (L) times time (T) to the power of minus two, which we can write as: [F] = M L T⁻². The dimensions of the left- and right-hand sides of Eq. (N.178) must coincide,

$$ M\, L\, T^{-2} = [\eta]^x L^y L^z T^{-z}. \tag{N.179} $$

To figure out the dimensions of η, look at Eqs. (8.15) or (8.16), which is where we defined it: η is stress divided by a rate of deformation. The rate of deformation is strain, which is dimensionless, over time, and stress is force per unit area; bottom line:

$$ [\eta] = [\text{force}][\text{area}]^{-1}[\text{time}] = M\, L\, T^{-2}\, L^{-2}\, T = M\, L^{-1}\, T^{-1}; $$

and but so, subbing in (N.179),

$$ M\, L\, T^{-2} = M^x L^{-x} T^{-x} L^y L^z T^{-z} = M^x L^{y-x+z} T^{-x-z}. \tag{N.180} $$

Looking at just M in (N.180), it follows immediately that x = 1. Then we are left with L T⁻² = L^(y−1+z) T^(−1−z): looking at T, we see that we must have z = 1, which leaves us with L = Lʸ, or y = 1. Plugging into (N.178) the values we found for x, y, z, we get Stokes’ law

$$ F = k\, \eta\, d\, v. $$
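The exponent matching above is really just a little linear system, one equation per dimension (M, L, T), which you can solve mechanically; a sketch, not from the book:

```python
from fractions import Fraction as F

# exponents of (M, L, T) in each quantity
eta   = [1, -1, -1]   # [eta] = M L^-1 T^-1
d     = [0,  1,  0]   # [d]   = L
v     = [0,  1, -1]   # [v]   = L T^-1
force = [1,  1, -2]   # [F]   = M L T^-2

# solve x*eta + y*d + z*v = force, dimension by dimension:
# M-equation:  x = 1
x = F(force[0], eta[0])
# T-equation: -x - z = -2  ->  z = 2 - x
z = F(2) - x
# L-equation: -x + y + z = 1  ->  y = 1 + x - z
y = F(1) + x - z
print(x, y, z)  # the exponents in F = k eta^x d^y v^z
```

Three dimensions, three unknowns: that’s why the recipe “F depends on η, d, v and nothing else” pins the exponents down completely, leaving only the dimensionless k undetermined.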
The dimensionless constant k cannot be determined via dimensional analysis; but it can be determined experimentally: if you have spheres of various masses and sizes fall through fluids of various viscosities, etc., you’ll find that, for a sphere, k = 6π. (You can also work out all the algebra in Stokes’ 1851 paper, and find the same thing.) For other shapes, you’ll get other values of k, which is why in Eq. (8.103) I just have an unspecified constant a₂.
556. See note 180.
557. For example, it’s not hard to show that, if the parcel is spherical, then a₃ = 2/3.
558. In the case of a sphere a₁ = 4π/3, a₂ = 1/(6π) (note 555) and a₃ = 2/3. Their product a₁a₂a₃ = 4/27 is one order of magnitude smaller than one: which gives you an idea of how precise the formulae we are about to derive are going to be: not that precise; but still useful for a rough evaluation of things.
559. People run experiments with fluids of different viscosities in tanks of various sizes, and stuff like that.
560. After Y. Ricard, “Physics of Mantle Convection”, in Treatise of Geophysics, 2015.
561. You should be familiar with all the units of measurements here, except maybe that of viscosity, which is Pa s, or Pascal (after Blaise Pascal, who had been mostly a mathematician before, apparently, seeing some sort of light (in November, 1654, “from about half-past ten in the evening until about half-past midnight”) and becoming a mystic, kind of, and the author of the Pensées, etc.) second. The Pascal is a unit for pressure, and 1 Pascal is the same as 1 Newton per square meter, or, if you do the arithmetics, 1 kg m⁻¹ s⁻². So, then, 1 Pa s = 1 kg m⁻¹ s⁻¹: which see also note 555: the dimensions are right.
562. Again, after Ricard’s paper. I’ve heard scientists use this word to describe simple physical models that can be explained with sketches, or waving your arms around—which is not pejorative at all.
563. American geodesists. Both were chief of the geodesy division of the U.S. Coast and Geodetic Survey—Bowie succeeded Hayford in 1909.
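Just to get a feel for the numbers: balancing the buoyancy of a sphere, (4π/3)Δρ g d³, against the Stokes drag 6πηdv gives v = (2/9)Δρ g d²/η, i.e., a₁a₂ = 2/9 in the notation above. A back-of-the-envelope sketch in Python; the density contrast, blob size, and viscosity below are round numbers I picked for illustration, not values from the book:

```python
# Stokes settling/rise speed of a sphere in a very viscous fluid:
# buoyancy (4*pi/3) * drho * g * d**3 balanced by drag 6*pi*eta*d*v
def stokes_velocity(drho, g, d, eta):
    return (2.0 / 9.0) * drho * g * d**2 / eta

# illustrative round numbers (my choice): a 50-km-radius blob with a
# 100 kg/m^3 density contrast, in a mantle-like viscosity of 1e21 Pa s
v = stokes_velocity(drho=100.0, g=9.8, d=50e3, eta=1e21)
cm_per_year = v * 100.0 * 3600 * 24 * 365
print(v, cm_per_year)  # of the order of centimeters per year
```

With these (assumed) inputs the speed comes out at plate-tectonic rates, a couple of centimeters per year, which is the kind of rough consistency check the 4/27-versus-1 discussion in note 558 is warning you to expect: order of magnitude, no better.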
Bowie, in particular, was very much into measuring gravity, and figuring whether and to what extent relief was isostatically compensated: in 1927 he published a whole book called Isostasy. He was one of the founders of the American Geophysical Union, and
one of its first presidents, and now there’s the Bowie medal, which is a big deal: the highest honour of the A.G.U.
564. See note 212.
565. The volume has contributions from many of the actual authors that first formulated the theory—several of which we are about to meet.
566. In his 1929 Nature paper Vening Meinesz doesn’t mention convection. He says, “a second result of importance has been found over the Nares Deep, north of Porto Rico. It is in harmony with the few values observed in that part during my former cruise with the Dutch submarine K XIII from Holland via Panama to Java. It shows great departures from isostatic equilibrium, which probably may be ascribed to stresses working in the earth’s crust in connexion with the formation of the deep.” I haven’t read his Dutch papers that Oreskes quotes, though.
567. Dave Griggs’ father, Robert Griggs, was a prominent botanist, who explored the Valley of Ten Thousand Smokes in Alaska, and gave it its name—because “the whole valley as far as the eye could reach was full of hundreds, no, thousands—literally, tens of thousands—of smokes curling up from its fissured floor”: the valley had just been formed, apparently, by the eruption of the Novarupta volcano, 1912. So, under the influence, perhaps, of such a father, Dave got interested in geophysics, as well as skiing, climbing, and all those things that geologists love to do. In the summer of 1936 he was driving across Europe with a friend, to go climb the Caucasus mountains, when they both almost died in a car crash. He was close to losing his legs, but eventually recovered. At that point he had just graduated from Harvard, where he became a “junior fellow”, i.e., a graduate researcher—but not a Ph.D. candidate—a position that doesn’t exist anymore, I think. Between 1934 and 1941 he published a number of papers, including the 1939 one that I am telling you about. When the U.S. entered WWII he served at the M.I.T. Radiation Laboratory, which is one of the places where radar was being developed, and eventually ended up flying actual missions, to introduce airborne bombing radar systems to crews in Europe. Which is how he was hit by “flak” in a leg and received the Purple Heart Medal.
568. American Journal of Science, vol. 237, 1939.
569. “[Kiyoo] Wadati was born in Nagoya City in 1902, and graduated from the Institute of Physics of the Imperial University of Tokyo in 1925. After graduation, he was engaged in the observation of earthquakes at the Central Meteorological Observatory. He was a director of the Central Meteorological Observatory, a director-general of the Japan Meteorological Agency, and President of Saitama University” (Y. Suzuki, “Kiyoo Wadati and the Path to the Discovery of the Intermediate-Deep Earthquake Zone”, Episodes, vol. 24, 1998).
570. Wadati, K., “On the Activity of Deep-Focus Earthquakes in the Japan Islands and Neighbourhoods”, Geophysical Magazine, vol. 8, 1935.
571. H. Benioff, “Seismic Evidence for the Fault Origin of Oceanic Deeps”, Bulletin of the Seismological Society of America, vol. 60, 1837–1856, 1949. Born in L. A., Benioff worked at the Pasadena Seismological Laboratory, which was initially part of the Carnegie Institution, until it was transferred to Caltech in 1937—which is when Benioff was appointed assistant prof. Besides being a seismologist, Benioff must have had a thing for music. “He became intensely interested in the physics of musical instruments”, writes Frank Press in Benioff’s biographical memoir for the National Academy of Sciences, “and worked toward the development of an electronic violin, cello, and piano. The last instrument reached fruition under sponsorship of the Baldwin Piano Company and was used in public concerts by famous artists. His motivation always was to heighten the pleasure of the listener and to lighten the task of the performer while preserving the fidelity of the instrument.” He also contributed one side of the vinyl album Out of This World, with true recordings of earthquakes that he sped up, so that they would become audible826 (which is something some people do, sometimes, and it’s called sonification, or, more precisely, audification). Except for track 2, Original Motion San Clemente, which is actually not sped up, so it can’t be heard; Benioff says in the liner notes, specifically re this track, that “motion of the phonograph arm is not necessarily identical with original earth motion, is more than occurred at Pasadena; however motion originated from direct amplification of original tape signals. Any audible sound here will be from the skipping of grooves by the arm and cartridge” and “skipping is intentional and indigenous to the nature of the subject matter.”
572. The number of seismic “stations” distributed across the world grew exponentially after WWII. The reason was, you can use them to monitor nuclear tests—and you want to know whether some countries are testing nuclear weapons without saying.
573. There’s a range of depths, by the way, with clearly no foci at all: and one might wonder about the rheology of materials in that depth range—maybe not brittle enough for rupturing? but let’s not get into that now.
574. Which is the long trench between New Zealand and Samoa.
575.
Remember gravity: the sun both attracts and is attracted by each planet, and the force applied by the sun on a planet is equal and opposite to the force applied by the same planet on the sun. But the sun’s mass is way bigger than that of any of the planets, so, to a very good approximation, the sun is not affected by the planets’ gravity fields, while the motions of all planets are controlled by the gravity field of the sun. The same happens with magnets and or magnetized bodies, except that, instead of gravitational attraction, you have the attractive or repulsive force that’s associated with a magnetic field. “The crowning work of Maxwell’s life”, writes Asimov, “was carried on between 1864 and 1873, when he placed into mathematical form the speculations of Faraday concerning magnetic lines of force. [...] “In working on the concept of lines of force, Maxwell was able to work out a few simple equations that expressed all the varied phenomena of electricity and magnetism and bound them indissolubly together. Maxwell’s theory showed that electricity and magnetism could not exist in isolation. Where one was, so was the other, so that his work is usually referred to as the electromagnetic theory.”
Like Newton’s, Maxwell’s equations are empirical laws, so, in a way, there isn’t much to say about them other than they are found and verified by doing and repeating all kinds of experiments: you don’t prove them with math. That doesn’t necessarily make them an easy topic to study, though. For one, just reading them once probably won’t be enough for you to understand what they mean. And plus, even if you’re really good at math, the very concepts of electric charge, and of electric and magnetic fields, are not that intuitive at all. The phenomenon that first made people think of what later would be called electricity is the so-called amber effect (or “triboelectric”827 effect): take a piece of glass and a piece of resin—meaning, solidified, like, “fossilized” resin: amber—check that neither does anything funny—yet—like attracting other things—and rub them together. Stop rubbing, and leave glass and resin in contact with each other: nothing happens. Now separate them: you’ll notice that, now, they’ll resist separation, attracting one another. Now take a second piece of glass and a second piece of resin, and rub them together, too. Here’s what you will observe: (i) that the two pieces of glass repel each other; (ii) that each piece of glass attracts each piece of resin, and vice versa; (iii) that the two pieces of resin repel each other. It is inferred that attractive or repulsive force must be connected to some acquired property of matter—what today we call “charge”. Things that are charged with the same kind of charge repel one another, while opposite charges attract one another. Charge is to electric force what mass is to gravitational force, and measuring the force between two electrically charged bodies is the same as measuring their charges—the force defines the charge, sort of. The experiment can be repeated with other materials, with similar results. But not all materials will work. In fact, most won’t. 
Those that have been found to work form an ordered list called the triboelectric series: (i) friction between two bodies made of materials that are on that list will cause them to become charged; (ii) which one of them becomes positively charged depends on their relative positions in the series. Glass is close to the top of the series, which means that, when rubbed, it almost always gets a plus charge. Resin (amber) is close to the bottom. The Greeks wrote about this (Thales of Miletus, apparently, was first, some five or six centuries B.C., in a book that has been lost) and but then, like with everything else that today we call science, not much was done about it until the sixteenth, seventeenth centuries. William Gilbert discusses the amber effect in De Magnete, and invents the “latin” word electrica (which he transliterates from the Greek word for “amber”, and which will be then transliterated into English as electrics) to refer to that and related phenomena: and that’s already in 1600. I’m not going to give you the full story of all research and experiments with electrics done through the following centuries, until James Clerk Maxwell wrote the epilogue in the mid-1800s, but one milestone contribution is that of Ben Franklin, whose life spanned almost the entire eighteenth century, and who came up with the one-fluid theory of electricity; meaning, he thought that elec-
tric charge was some sort of invisible fluid present in all matter; that rubbing certain things together, like amber and resin, caused this fluid to change location, via what today we’d call an electric current. Even though there appeared to be two kinds of electric charge, the flow of only one invisible fluid was enough to explain electric phenomena (hence, one-fluid theory). Because what happens, according to Franklin, is that, when matter contains an excess of that fluid, it is positively charged, and when it has a deficit of it, it’s negatively charged. (Which is basically still correct in today’s terms, except that we don’t speak of an invisible fluid, but of a flow of electrons: and you see now where the word “electron” comes from, too.) The observation that convinced Franklin that this was the case, was that, when you do the resin-glass thing, the glass receives the same, but opposite, charge strength as the cloth used to rub the glass828 . So whatever the glass has gained (lost), the cloth has lost (gained) just as much: Franklin figured that that could hardly be a coincidence. He dropped the phrases “vitreous electricity” and “resinous electricity”, that had been used before, and said that the glass gets a positive charge while the resin gets a negative one: an excess versus a deficit of the same fluid. I don’t think Franklin figured why rubbing a pair of bodies would cause charge to flow from one to the other. That requires knowing about molecules and atoms and electrons, which Franklin didn’t. But his description of electricity is powerful enough to explain quite a few experiments that people would do at the time. Here are three examples—they are also going to be useful in a minute, when we look at magnetism and at the differences between electricity and magnetism. The first one goes like this: you’ve already gotten glass and resin, now get a metal bar, too. Check that nothing is charged. 
Then rub glass with resin, and then make the glass touch the metal. The metal becomes electrically charged (attracts and repels stuff, etc.), with the same (positive) polarity as the glass. Some of Franklin’s fluid has flowed from glass to metal. (Had we touched it with the charged resin, the metal would have gotten negatively charged, instead: same process, but flow goes in the opposite direction. In the triboelectric series, metals are between glass and resin.) Put the resin away: the metal stays charged. Second experiment: do the same thing, except the glass gets close to one end of the metal, but never touches it. As long as the glass stays close to the metal, the metal shows an electrical charge, too. This time, though, the charge is not the same all over the bar, but the end of the bar that’s close to the glass is charged negatively; the other end of the bar is charged positively. You take the glass away, everything goes back to how it was before the experiment: no part of the metal bar shows to be charged. What we’ve seen this time is that the electrical fluid is repelled by the force field that we’ve generated when we’ve charged the glass. Once we take the field, i.e. the repulsive force away, the fluid flows back to where it was. Because there’s no contact between glass and metal, there can be no transfer of electric fluid, either. Third experiment: take two pieces of metal, uncharged, at rest and touching one another. Again rub glass with resin, and again bring the glass near one
of the metal pieces: but with no contact. The piece of metal that’s nearer to the glass gets a positive charge, and the other piece of metal gets a negative charge instead. If now, before putting the glass away, you separate the two pieces of metal from one another, you’ll see that they stay charged: because by separating them you prevent the fluid from flowing back, like it did in the previous experiment. People also figured that electric current, or flow of electric fluid, or of electric charge, which really is the same thing, is proportional to the strength of the electric field, that attracts or repels charge and causes it to move. It’s useful to introduce the so-called density of electric current, which is a vector, and is usually denoted j. The density of electric current is how much charge, per-unit-time and per-unit-area, is flowing across a surface that’s perpendicular to the direction of flow; the direction of j is the direction of flow. Then, the amount of charge flowing through some arbitrary surface S per unit time, which people tend to call I, is given by

$$ I = \int_S \mathbf{j} \cdot \hat{\mathbf{n}} \, dS, \tag{N.181} $$

where n̂ is a unit vector perpendicular to S. When one speaks of “current”, usually she means I as in (N.181). Like I was saying, experiment shows that if you’ve got an electric field E within a medium that can conduct electric current, then there will be current in the medium, in the direction of E, and the stronger the field, the stronger the current. In other words,

$$ \mathbf{j} = \sigma \mathbf{E}, \tag{N.182} $$

where σ is called the conductivity of the medium: and it will change, of course, if the medium is changed. Equation (N.182) is called Ohm’s law829. I have talked a great deal about electricity already, but nothing on magnetism. Today we take courses on “electromagnetism”, and speak of “electromagnetic” waves, etc., but actually, people didn’t figure out that electric and magnetic phenomena are connected, until the work of Hans Christian Ørsted in the late 1810s. Before then, magnetism was just the property of some rocks—or rather chunks of mineral—called lodestones—and the mineral itself would later be called magnetite—rocks that had the peculiar property of attracting iron. Ørsted saw that if you place a magnetized needle near a wire through which electric current is flowing, then the needle is deflected in a direction that depends on the polarity of the current and orientation of the wire. (He self-published his discovery in the form of a latin treatise, Experimenta circa Effectum Conflictus Electrici in Acum Magneticam, “which was sent to all of the most renowned natural philosophers and scientific societies”830.) Saying that an electric current deflects a magnetized needle is the same as saying that an electric current generates a magnetic field: Ørsted established a first link between electric and magnetic phenomena.
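For a concrete instance of (N.181) and (N.182), take a uniform field E applied along a straight wire: then j is uniform and parallel to n̂ over the wire’s cross-section, and the surface integral collapses to j times the area. The numbers below are illustrative, roughly copper-like, chosen by me rather than taken from the book:

```python
import math

sigma = 6.0e7     # conductivity, S/m (roughly copper; illustrative value)
E = 0.01          # electric field along the wire, V/m
radius = 1.0e-3   # wire radius, m

j = sigma * E                 # Ohm's law (N.182): current density, A/m^2
area = math.pi * radius**2    # cross-section, perpendicular to the flow
I = j * area                  # (N.181), with j uniform and parallel to n-hat
print(j, I)  # current density, and total current through the wire
```

A hundredth of a volt per meter across a millimeter-radius copper-like wire already drives a current of the order of an ampere, which is why metals count as good conductors.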
At first sight a lodestone, or magnet, behaves in a similar way as an electrically charged body, although instead of attracting electric charges, it attracts other magnets, and or stuff that’s made with certain materials: mostly metals, and most of all iron. So, is magnetism, like, another kind of electricity? Is there such a thing as a magnetic fluid? Let’s try to repeat, with two bar magnets, the experiments we’ve done with glass, resin, etc. Say you hold one of the magnets in your hand, while the other is just sitting somewhere. Bring one end of your magnet near either end of the other: they’re going to either repel or attract each other. Now bring the same end of your magnet near the other end of the other magnet: if earlier there’d been repulsion, now there’d be attraction, and vice versa. Finally, bring the usual end of your magnet near the middle of the other: there’ll be neither attraction nor repulsion. Bottom line, a magnet has always got two poles. (The word “dipole” is often used, as in “magnetic dipole”.) These experiment ideas come from a famous book831 by Einstein and Leopold Infeld. Right after the one I’ve just described, they propose another experiment: “Each magnet has two poles”, they say. “Can we not isolate one of them? The idea is very simple: just break a magnet into two equal parts. We have seen that there is no force between the pole of one magnet and the middle of the other. But the result of actually breaking a magnet is surprising and unexpected. If we repeat the experiment [just described], with only half a magnet [...], the results are exactly the same as before! Where there was no trace of magnetic force previously, there is now a strong pole.” The inference is that the idea of a magnetic “fluid”, that worked to explain electric phenomena, won’t work for magnetism. This, say Einstein and Infeld, “forces us to introduce a somewhat more subtle theory. [...] 
We may imagine that the magnet consists of very small elementary [magnets] which cannot be broken into separate poles. Order reigns in the magnet as a whole, for all the elementary [magnets] are directed in the same way. We see immediately why cutting a magnet causes two new poles to appear on the new ends, and why this more refined theory explains the facts” of both experiments832 . I said, already, that Ørsted was the first to observe and report that a magnetic needle’s orientation is affected by a nearby electrical current; but there’s more to Ørsted’s experiment than that. Here’s how Einstein and Infeld describe it: “Suppose we have a voltaic battery and a conducting wire. [...] Let us assume that the wire is bent to form a circle, in the centre of which a magnetic needle is placed, both wire and needle lying in the same plane.” As soon as the wire is connected to the battery, and current begins to flow, “the magnetic needle turns from its previous position. One of its poles now points to the reader if the page of this book represents the plane of the circle. The effect is that of a force, perpendicular to the plane, acting on the magnetic pole. Faced with the facts of the experiment, we can hardly avoid drawing such a conclusion about the direction of the force acting. [...] The force between the magnetic pole and the small portions of the wire through which the current flows cannot lie along lines connecting the wire and needle, or the particles of flowing electric fluid and the elementary magnetic dipoles. The force is perpendicular to these
712
Notes
lines!” Einstein and Infeld remark that that was quite a novelty in the mid-1800s, because “the forces of gravitation, electrostatics, and magnetism, obeying the laws of Newton and Coulomb833 , act along the line adjoining the two attracting or repelling bodies.” But when you mix up magnetism and electric currents, it’s a different story. I’ve shown you a bunch of experiments in electricity, magnetism, and electromagnetism. There are many, many more of those. And what James Clerk Maxwell did was, he looked at all available experimental work, and managed to describe it all with only four compact equations834 . I’ll show you what they look like, first, then I’ll tell you what each symbol means, and then I’ll try to give you an explanation, as simple as I can possibly come up with, of what they mean in practice. That’ll be enough to understand the magnetic field of the earth, and “geomagnetism” in general. So here they are:

∇ · E = ρc/ε, (N.183)

∇ · B = 0, (N.184)

∇ × E = −∂B/∂t, (N.185)

(1/μ) ∇ × B = j + ε ∂E/∂t. (N.186)
The vector E is what we call the electric field, B is the magnetic field; ε and μ are parameters of the material that the field is immersed in, like air for most of our lab experiments that we discussed, or rocks inside the earth, if you are already thinking about the magnetic field of the earth. ε is called permittivity, μ permeability—which both depend on what medium the fields are immersed in—and I know I’ve used these letters before to mean other things, but hey, there are only so many letters in the Latin and Greek alphabets, and this is a big book. Finally, ρc is the density of electrical charge in space, and j is still electric current density. To Maxwell’s four equations we might add the “force law”,

F = q(E + v × B),

that describes the force received by a charge q immersed in a magnetic and/or electric field, and moving within it with velocity v. Because, strictly speaking, E and B are not themselves forces: the force that’s felt by a charge in an electric field E is parallel to the direction of E, though; and a dipole835 in a magnetic field B is pushed to align with the direction of B. Before we look at individual equations, notice that the first one is only relevant to electric charge and electric field; the second one only to the magnetic field:
Fig. N.40 Vector fields with positive (a), negative (b), and no (c, d) divergence
but in the last two equations, electricity and magnetism are combined. Ørsted’s experimental result is implicit in (N.186)—I’ll get back to this. Equations (N.183) and (N.184) are both called Gauss’ law—Gauss’ law of electricity and Gauss’ law of magnetism. To understand what they mean, you just have to remember what we learned about divergence (i.e., the dot product of nabla and some vector, which is what we have at the left-hand side of both Gauss’ laws) in Chap. 6. Back then, the vector field we were dealing with described the displacement of some material: and we learned that the divergence of that vector field was a measure of how much the material was being compressed or expanded, at each point in the vector field. We saw how, mathematically, that was described by Eq. (6.19), which says that displacement u(x) and the relative change in mass density, δρ(x)/ρ, are related through ∇ · u(x) = −δρ(x)/ρ. So, wherever you’ve got compression (a positive anomaly in ρ), divergence is negative, and the more compression you’ve got, the larger the divergence is in absolute value. Wherever you’ve got expansion, divergence is positive, and a large divergence means a lot of expansion, etc. Now think about what this would look like—and look at the sketches in Fig. N.40. A place where you’ve got compression is a place that the vectors representing displacement point to: matter wants to cluster around that place, it’s being pushed or pulled into that place, so there must be compression around that place: negative divergence. A place where you’ve got expansion is a place those vectors point away from: matter is being pushed or pulled away from it, so density, there, must be going down: positive divergence. Now forget about displacement of matter, and look at Eq. (N.183). Divergence still works in the same way, though the physics is different. 
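The sign conventions just described are easy to check numerically; below is a minimal sketch (my own illustration, not from the book), where fields shaped like those of Fig. N.40a, b are written as (x, y, 0) and (−x, −y, 0) and the divergence is estimated by central finite differences:

```python
import numpy as np

def divergence(F, p, h=1e-5):
    """Central-difference estimate of div F at point p."""
    div = 0.0
    for i in range(3):
        e = np.zeros(3); e[i] = h
        div += (F(p + e)[i] - F(p - e)[i]) / (2 * h)
    return div

# vectors pointing away from the origin: expansion, positive divergence
outward = lambda p: np.array([p[0], p[1], 0.0])
# vectors pointing into the origin: compression, negative divergence
inward = lambda p: np.array([-p[0], -p[1], 0.0])

p = np.array([0.7, -0.4, 0.0])
print(divergence(outward, p), divergence(inward, p))   # ≈ +2 and -2
```

The numbers come out positive where the field "spreads out" and negative where it "clusters", exactly as in the compression/expansion picture.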
Relative change in mass density is replaced by charge density: so, wherever charge is positive and high, positive divergence, the electric field vectors point away from there. Wherever charge is negative and high (in absolute value), negative divergence, the electric field points that way. So: Gauss’ law (N.183) tells you where the electric field points; but it also tells you what the strength of the electric field is. There’s another empirical law that’s called Coulomb’s law, and that was established experimentally by Charles-Augustin Coulomb836 before Maxwell’s and Gauss’ contributions; it
says, essentially, that the attraction between two electric charges is proportional to both charges and inversely proportional to the square of their distance (kind of like the law of gravitation). I am going to show you that Coulomb’s law is implicit in Gauss’ law of electricity; meaning, you can derive the former from the latter: and this is how it goes: we’re talking about isolated charges, so it’s OK to replace ρc in Eq. (N.183) with qδ(r) (where δ is the Dirac function), meaning we’ve got a charge of strength q occupying a very small—strictly speaking, infinitely small; in practice, negligible—volume, at the reference-frame origin. Then, integrate both sides of (N.183) over a sphere of radius R and volume V,

∫_V dV ∇ · E = (1/ε) ∫_V dV qδ(r). (N.187)

To solve this kind of integral—over a sphere—we’d usually switch to spherical coordinates and blah blah blah (see, e.g., note 578); but this is easier than that, because at the left-hand side we can apply the divergence theorem, and at the right-hand side we have the properties of Dirac’s delta, and (N.187) immediately boils down to

∫_∂V dS E · n̂ = q/ε,

where ∂V is the surface of a sphere of radius R, dS the area of an infinitely small element of ∂V, and n̂ a unit vector that’s everywhere perpendicular to ∂V. But the only charge you’ve got inside that sphere is that q in the middle: so, no matter where you measure it, E must point along the radius of the sphere (towards or away from its center, depending on the sign of q): but so then E and n̂ must always be parallel: but then E · n̂ = E, the magnitude of E. It follows that

E ∫_∂V dS = q/ε,

or

4πR² E = q/ε,

because all that was left to integrate was the surface area of the sphere. Solve for E,

E = q/(4πεR²).

Now, like I said, the so-called force law says that the force felt by a charge immersed in an electric field E is precisely the product of the charge times the electric field. So, say we deploy a charge q0 in the field E that we are looking at; use F to denote the force acting on q0 ; then
F = q0 E = q0 q/(4πεR²),

which is exactly what Coulomb had observed. You see that Eq. (N.183) contains that, too. Once Gauss’ law of electricity is clear to you, then Gauss’ law of magnetism (N.184) should be fairly straightforward to interpret: what it says is that the divergence of a magnetic field is always zero. In practice, that means that the vectors that describe a magnetic field never point towards one particular point: rather, they form loops. You might be confused, now, because, if you think of either pole of a magnetic dipole, we’ve seen that that attracts or repels, e.g., iron objects, just like a positive electrical charge attracts negative ones, or vice versa: so shouldn’t there be some divergence, positive or negative, in those places? But the thing is, remember: if you break a magnetic dipole in two, you don’t get one positive and one negative magnetic monopole, but two dipoles. That means that you can never think of one end of a magnetic dipole without taking into account the other one: and if you consider the effect of both, you see that B looks like the sketch in Fig. 8.9. The third of Maxwell’s equations, Eq. (N.185), is known as Faraday’s law, because it describes the results of some famous experiments done by Michael Faraday in the 1820s and early 1830s. Faraday had observed that if you move a magnet through a loop of wire, an electric current flows through the wire; and the current also flows if you move the loop of wire over a magnet that doesn’t move: which is the same, really. Maxwell looked at Faraday’s results and figured they could be stated mathematically as: the curl of the electric field is equal to minus the time-derivative of the magnetic field: which is Eq. (N.185). To understand what that means, let’s go back to the idea of curl. The curl of a vector field like in Fig. N.40a or b is zero (to convince yourself of that, consider that, e.g., the field in Fig. 
N.40a can be written something like xex + ye y , and then take the curl of that: you’ll find that it is zero). The curl of a vector field like in Fig. N.40d, on the contrary, is relatively large (the field in Fig. N.40d is similar to −yex + xe y : take the curl of that: you’ll get 2ez ). In general, the curl of a vector field is a measure of how much the field “swirls” around something: the fields in Fig. N.40a, b don’t swirl at all: no curl; the field in Fig. N.40d is pure swirling: large curl. The curl points in the direction perpendicular to the swirling, and the stronger the swirling, the larger the curl. Faraday’s law, then, is saying (look at the drawings in Fig. N.41) that a magnetic field that changes over time always comes with a “curly” (“swirly”?) electric field, whose curl points to the same direction as the magnetic field: i.e., an electric field that swirls around the direction of the magnetic field. Or vice versa, an electric field that has a curl always comes with a magnetic field pointing in the direction of the curl, i.e. perpendicular to the plane of swirling: and that magnetic field changes in time: and the stronger the swirling, the faster the change.
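If you’d rather not do the algebra, the two curl claims above can be verified numerically; here is a minimal sketch (my own illustration), with the fields of Fig. N.40a and N.40d written as (x, y, 0) and (−y, x, 0):

```python
import numpy as np

def curl(F, p, h=1e-5):
    """Central-difference estimate of (curl F) at point p."""
    c = np.zeros(3)
    I = np.eye(3)
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3
        dFk_dj = (F(p + h * I[j])[k] - F(p - h * I[j])[k]) / (2 * h)
        dFj_dk = (F(p + h * I[k])[j] - F(p - h * I[k])[j]) / (2 * h)
        c[i] = dFk_dj - dFj_dk
    return c

radial = lambda p: np.array([p[0], p[1], 0.0])   # Fig. N.40a: no swirling
swirl = lambda p: np.array([-p[1], p[0], 0.0])   # Fig. N.40d: pure swirling

p = np.array([0.3, -0.2, 0.0])
print(curl(radial, p), curl(swirl, p))   # ≈ (0, 0, 0) and (0, 0, 2)
```

The swirling field’s curl points along ez, perpendicular to the plane of the swirl, with magnitude 2, as claimed in the text.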
Fig. N.41 Faraday’s law: a “swirly” electric field E generates a magnetic field B. The time-derivative of the magnetic field, Ḃ, is proportional to the curl of E
Fig. N.42 Illustration of Ampère’s law: electric current flows through the wire, which makes a loop; near the center of the loop, the magnetic field generated by the current is perpendicular to the plane of the loop
The fourth of Maxwell’s equations, Eq. (N.186), is known as Ampère’s law. Half of Ampère’s law was actually contributed by Maxwell himself, based on some new experiments, some decades after Ampère’s work. Before Maxwell, Ampère’s law read simply

∇ × B = μj, (N.188)

without the time-derivative of E that we have in (N.186). Taken by itself, (N.188) means that electric current always comes with a magnetic field, which swirls around the direction in which the current flows: around the wire, say, that carries the current. If the wire happens to make a loop, as in Fig. N.42, at and near the center of the loop the magnetic field generated by the current is perpendicular to the plane of the loop. All this looks OK, but, actually, there is a problem. Take the divergence of both sides of (N.188): the divergence of a curl is always zero837 , so it follows that ∇ · j = 0.
If ∇ · j is everywhere zero, that means that its volume integral over any given volume is also zero. But then, remember the divergence theorem:
∫_V ∇ · j dV = ∫_∂V j · n̂ dS,
If the left-hand side of this is always zero, like we’d just concluded it must be, then the right-hand side is always zero as well: but the right-hand side is the net current flowing through ∂V: so we’ve just found that that current is always zero, too. In other words, we’ve found that electric current must have the following general property: the current flowing into any volume is always equal to the current flowing out of the same volume. Which can’t possibly be right: in Feynman’s words (vol. II, chapter 18 of his Lectures), “the flux of current from a closed surface is the decrease of the charge inside the surface. This certainly cannot in general be zero because we know that the charges can be moved from one place to another.” That is, imagine that the volume V bounded by the closed surface ∂V contains an electrically charged body of some kind. In practice, nothing can stop us from taking that charged body and putting it somewhere outside of V. But because current and moving electric charge are one and the same, in so doing we violate Eq. (N.188). But then, if we can violate it, that means that Eq. (N.188) can’t be a general law, as we assumed it was. Maxwell understood the problem, and proposed that it could be avoided if the term ε ∂E/∂t were added to the right-hand side of (N.188). It was a speculation, really, but one that later would be confirmed, as Feynman says, by “untold numbers of experiments.” Now, taking the divergence of (N.186) we get

∇ · j + ∇ · (ε ∂E/∂t) = 0.

In the second term, swap the divergence and the time-derivative, and

∇ · j + ε ∂(∇ · E)/∂t = 0. (N.189)

But (N.183) says that the divergence of E is ρc/ε. Substituting that into (N.189),

∇ · j = −∂ρc/∂t,
which “expresses”, says Feynman, “the very fundamental law that electric charge is conserved—any flow of charge must come from some supply.” So, in practice, this means two things. First, that Maxwell’s equations also contain the law of “conservation of charge”, and secondly, that indeed Maxwell’s speculation makes sense.
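Charge conservation, ∇ · j = −∂ρc/∂t, can also be watched in action in a toy one-dimensional setup (my own sketch, not from the text): a blob of charge drifts at constant speed around a periodic grid, and the total charge stays put while the charge itself moves around.

```python
import numpy as np

# 1-D check of charge conservation, d(rho)/dt = -d(j)/dx, with j = v * rho:
# on a periodic grid the total charge must stay constant even though the
# blob of charge drifts along. (Illustrative values throughout.)
nx, L, dt = 200, 1.0, 1e-4
dx = L / nx
x = np.linspace(0.0, L, nx, endpoint=False)
rho = np.exp(-((x - 0.5) / 0.1) ** 2)   # a blob of charge
v = 1.0                                  # drift speed of the charges
Q0 = rho.sum() * dx                      # total charge at t = 0

for _ in range(500):
    j = v * rho                                          # current density
    djdx = (np.roll(j, -1) - np.roll(j, 1)) / (2 * dx)   # central difference
    rho = rho - dt * djdx                                # rho_t = -d(j)/dx

print(abs(rho.sum() * dx - Q0))   # ~ 0: total charge is conserved
```

The blob has moved, but the sum of all the charge on the grid is unchanged, which is exactly what the continuity equation promises.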
Fig. N.43 Magnetic field of a circular current loop. Electric current flows along the dashed line; the solid arrows show the direction of the magnetic field
One consequence of Ampère’s law is that you can make a magnetic dipole by having electric current flow in a loop, or coil. Without doing any difficult math, just think about the magnetic field that’s generated by electric current that flows in a circular loop, or multiple circular loops on parallel planes and with the same symmetry axis, like in Fig. N.43. Keep it simple for now, and neglect the change in E over time, i.e. ∂E/∂t = 0, so that version (N.188) of Ampère’s law is actually OK. By Ampère’s law, then, the magnetic field swirls everywhere around the direction of the current; each small segment of electrical wire contributes a swirly magnetic field, and if you sum them all up, they’ll add up constructively to form a relatively strong magnetic field within the loop, perpendicular to the plane of the loop (again, look at Fig. N.43). If you compare Figs. N.43 and 8.9, you see that the magnetic field you get from a circular current loop is just like that of a bar magnet: a magnetic dipole. 576. A medical doctor by background, Gilbert published De Magnete one year before being appointed court physician to Queen Elizabeth I. “Gilbert, like Galileo, was a pioneer of experimentation and refuted many superstitions by direct testing. Indeed, Galileo considered Gilbert the chief founder of experimentalism. Gilbert showed that garlic did not destroy magnetism, as it was believed to do, by smearing a magnet with it and demonstrating that the magnet’s powers remained unimpaired. “Gilbert further showed not only that a compass needle points roughly north and south, but also that if it is suspended to allow vertical movement it points downward toward the earth (“magnetic dip”). [...] Gilbert’s great contribution was to suggest that the earth itself is a great spherical magnet and that the compass needle points not to the heavens [...] but to the magnetic poles of the planet” (Asimov). 577. Earth. The quote is from the 1981 W. H. Freeman edition.
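Back to the current loop of Fig. N.43: its dipole-like field can be checked numerically. The sketch below is my own illustration, with made-up values of current and radius; it sums the contributions of many short wire segments using the Biot–Savart rule, dB = (μ0 I/4π) dl × r/r³ (not derived in this note, but it is the standard magnetostatic consequence of Ampère’s law), and compares the result with the textbook closed form for the on-axis field of a circular loop.

```python
import numpy as np

mu0 = 4e-7 * np.pi
I_cur, R, z = 1.0, 1.0, 0.5   # illustrative: 1 A, 1 m loop, field point at z = 0.5 m

# discretize the loop into short straight segments
N = 2000
phi = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
pts = np.stack([R * np.cos(phi), R * np.sin(phi), np.zeros(N)], axis=1)
nxt = np.roll(pts, -1, axis=0)
dl = nxt - pts                    # segment vectors along the loop
mid = 0.5 * (pts + nxt)           # segment midpoints
r = np.array([0.0, 0.0, z]) - mid # vectors from each segment to the field point

# Biot-Savart sum over all segments
B = mu0 * I_cur / (4 * np.pi) * np.sum(
    np.cross(dl, r) / np.linalg.norm(r, axis=1)[:, None] ** 3, axis=0)

# closed form for the on-axis field of a circular current loop
B_exact = mu0 * I_cur * R**2 / (2 * (R**2 + z**2) ** 1.5)
print(B[2], B_exact)   # the numerical sum matches the closed form closely
```

The transverse components cancel by symmetry, and the axial component agrees with the analytic formula; move the field point around and you recover the whole dipole-like pattern of Fig. N.43.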
578. It’s all in a paper that Gauss published in 1839, in that year’s volume of the Resultate aus den Beobachtungen des Magnetischen Vereins, a journal where Gauss himself and his colleague Wilhelm Weber would accept contributions on new geomagnetic observations and how to make them, etc. The paper is called Allgemeine Theorie des Erdmagnetismus, or “General Theory of Terrestrial Magnetism”. To see what Gauss did, start with Ampère’s law (N.186), i.e., number four of Maxwell’s equations (note 575). If you pick an observation point where there are no electric currents (j = 0)—we’ll get back, in the following, to why it makes sense to assume that you can pick such a point—and no (rapidly varying) electric fields (∂E/∂t = 0), that boils down to

∇ × B = 0.

We know from Chap. 6 (notes 288 and 289) that when a vector field is, like B, curl-free (AKA irrotational), then there must exist a scalar field such that B can be written as its gradient. People like to introduce the function V = V(x), and write

B = −∇V,

and call V a “potential”. Then take the divergence of both sides, remember Gauss’ law of magnetism, (N.184), and you get838

∇²V = 0. (N.190)
Because the earth is—roughly—a sphere, it makes sense to translate the problem into spherical coordinates (see Chap. 6 and Fig. 6.12). If r is distance from the center of the earth, and ϑ colatitude and ϕ longitude, then Eq. (N.190) becomes839

∂/∂r (r² ∂V/∂r) + (1/sin ϑ) ∂/∂ϑ (sin ϑ ∂V/∂ϑ) + (1/sin²ϑ) ∂²V/∂ϕ² = 0. (N.191)

This can be solved by separation of variables (see note 292), i.e., assume that V, which is a function of r, ϑ and ϕ, can be written as the product of a function that depends only on r, one that depends only on ϑ, etc.,

V(r, ϑ, ϕ) = R(r) Θ(ϑ) Φ(ϕ),

and then plug that into (N.191); if you also divide by R(r) Θ(ϑ) Φ(ϕ) you get

(1/R(r)) d/dr [r² dR(r)/dr] + (1/(Θ(ϑ) sin ϑ)) d/dϑ [sin ϑ dΘ(ϑ)/dϑ] + (1/(Φ(ϕ) sin²ϑ)) d²Φ(ϕ)/dϕ² = 0.
This immediately separates into an ODE where r is the independent variable, i.e.,

(1/R(r)) d/dr [r² dR(r)/dr] = λ, (N.192)

and a PDE in ϑ and ϕ, i.e.,

(1/(Θ(ϑ) sin ϑ)) d/dϑ [sin ϑ dΘ(ϑ)/dϑ] + (1/(Φ(ϕ) sin²ϑ)) d²Φ(ϕ)/dϕ² = −λ,

with λ an arbitrary constant, of course. If we multiply the last equation we got by sin²ϑ, we get

(sin ϑ/Θ(ϑ)) d/dϑ [sin ϑ dΘ(ϑ)/dϑ] + (1/Φ(ϕ)) d²Φ(ϕ)/dϕ² = −λ sin²ϑ,

and that actually separates into

(1/Φ(ϕ)) d²Φ(ϕ)/dϕ² = −μ (N.193)

and

(sin ϑ/Θ(ϑ)) d/dϑ [sin ϑ dΘ(ϑ)/dϑ] + λ sin²ϑ = μ, (N.194)
with μ an arbitrary constant, and so now instead of one PDE, we’ve got three ODEs to solve, which are (N.192), (N.193) and (N.194). The two linearly independent solutions to (N.193), that we need to put together its general solution, are

Φ(ϕ) = cos(√μ ϕ)

and

Φ(ϕ) = sin(√μ ϕ).

Now remember that ϕ is an angle that can take values between 0 and 2π; or, which is the same, ϕ and ϕ + 2π are the exact same thing. But then, Φ(ϕ) and Φ(ϕ + 2π) must also be the same thing. For that to be the case, we need √μ to be an integer number. To remind ourselves of this, we replace it with the letter m—because m is one of those letters that people like to use to denote integer numbers. So then the next equation we need to solve is (N.194), with μ = m², i.e.,

sin ϑ d/dϑ [sin ϑ dΘ(ϑ)/dϑ] + [λ sin²ϑ − m²] Θ(ϑ) = 0. (N.195)
The recipe to solve this, which you can find in ODE cookbooks, says that you should define x = cos ϑ, so that (chain rule)

d/dϑ = (dx/dϑ) d/dx = −sin ϑ d/dx,

and sin ϑ = √(1 − x²); and, if we substitute all that into what we just wrote, we get

(1 − x²) d/dx [(1 − x²) df(x)/dx] + [λ(1 − x²) − m²] f(x) = 0,

or

d/dx [(1 − x²) df(x)/dx] + [λ − m²/(1 − x²)] f(x) = 0, (N.196)
where f(x) = Θ(ϑ(x)), and x can only take values in the interval between −1 and 1. Next, you have to (i) set m = 0 initially—you’ll generalize to m ≠ 0 later; (ii) plug into (N.196) a power-series type solution, kind of like the Frobenius thing of note 546; except this time it’s a little easier, because it turns out you can set r = 0 (again, see note 546), i.e. f(x) = Σ_{k=0}^∞ ak x^k; (iii) if you plug that into (N.196), with m = 0, you get a recursive relation that one can use to find all the ak’s: except that the power series won’t converge in the case when x = ±1; (iv) for convergence to happen, it turns out that we also need λ = l(l + 1), where l is an integer number, l = 0, 1, 2, . . .; (v) Equation (N.196) with m = 0 and λ = l(l + 1) is one of those “classic” equations: it’s called the Legendre840 equation and its solution is the so-called Legendre polynomial of degree l; (vi) which people like to denote Pl(x), and the most compact, general formula for it is

Pl(x) = (1/(2^l l!)) d^l/dx^l [(x² − 1)^l],

with x between −1 and 1 (it is shown, in math books, that Pl is a polynomial of degree l); (vii) then, like I said, generalize this to m ≠ 0: the solutions of (N.196) with λ = l(l + 1) and m an integer number, AKA the general Legendre equation, are given by the formula

Plm(x) = (−1)^m (1 − x²)^(m/2) d^m/dx^m Pl(x), (N.197)
which is called the associated Legendre function of degree l and order m, and which collapses back to Pl when m = 0; (viii) and the solutions to (N.195) are Θ(ϑ) = Plm(cos ϑ), which you can get by implementing (N.197) for any l, m, and then replacing x with cos ϑ; (ix) notice that, because Pl(x) is a polynomial of degree l, then Plm(x) = 0 if m > l. (And no, this time I haven’t worked out all the steps, and I am sorry about it: but with all the math you’ve hopefully learned through this book already, maybe this bullet list is enough for you to figure all the details out. If not, this is stuff that you really can find in many places.) Finally, we still haven’t attacked the “radial” Eq. (N.192), in which we can now replace λ with l(l + 1). After some algebra, (N.192) becomes

r² d²R(r)/dr² + 2r dR(r)/dr − l(l + 1)R(r) = 0,

and you can check by direct substitution that two linearly independent solutions (which is all we need, because this is a second-order ODE) are

R(r) = r^l and R(r) = r^−(l+1).

So then, if we compile the general solutions to the ϕ, ϑ and r ODEs, and multiply them together, the most general solution to (N.191) turns out to be841

V(r, ϑ, ϕ) = Σ_{l=1}^∞ Σ_{m=0}^l { [alm cos(mϕ) + blm sin(mϕ)] r^−(l+1) + [clm cos(mϕ) + dlm sin(mϕ)] r^l } Plm(cos ϑ),

where alm, blm, clm, dlm are all arbitrary constants, and there’s one of each per combination of l and m; l is allowed to take all integer values between 1 and ∞, while m ≤ l. If you take the gradient of V,
B = (Br, Bϑ, Bϕ) = −(∂V/∂r, (1/r) ∂V/∂ϑ, (1/(r sin ϑ)) ∂V/∂ϕ),
and so

Br = Σ_{l,m} { (l + 1) [alm cos(mϕ) + blm sin(mϕ)] r^−(l+2) − l [clm cos(mϕ) + dlm sin(mϕ)] r^(l−1) } Plm(cos ϑ),

Bϑ = −Σ_{l,m} { [alm cos(mϕ) + blm sin(mϕ)] r^−(l+2) + [clm cos(mϕ) + dlm sin(mϕ)] r^(l−1) } d/dϑ Plm(cos ϑ),

Bϕ = Σ_{l,m} { [alm sin(mϕ) − blm cos(mϕ)] r^−(l+2) + [clm sin(mϕ) − dlm cos(mϕ)] r^(l−1) } (m/sin ϑ) Plm(cos ϑ), (N.198)

where, of course, Σ_{l,m} stands for Σ_{l=1}^∞ Σ_{m=0}^l. Through the expressions we’ve just written, one can separate the “internal” and “external” contributions to the magnetic field. To understand how that works, consider that the terms, in those expressions, that are proportional to r^(l−1) diverge to infinity with growing r, and that doesn’t make any sense physically: even if the source of the magnetic field were outside the earth, no source can generate a field of infinite magnitude. Likewise, terms that are proportional to r^−(l+2) explode when r goes to 0, i.e. at the center of the earth: and that doesn’t make any physical sense, either. This means that the solution that we’ve found for V and B is only valid within some finite range of values of r: but that’s actually OK and shouldn’t surprise us: because we had started out by saying that our observation point must be free of electrical currents: but electrical currents are the most likely sources of the magnetic field we’re looking at: so, whether the source/s of the magnetic field is/are outside or within the earth, our solution doesn’t hold near them and is only OK away from them: in some r-interval that is free of sources. Now, and this is the crux of this whole note, having written B the way we just did allows us to disentangle the contributions of internal and external sources. Because the r^(l−1) terms describe a field that grows if we move further from the earth’s center: that must be related to sources that are above us, in the atmosphere or in space, whatever: external sources; and the r^−(l+2) terms, which grow with depth, must be related to sources within the planet. If we had only one measurement of Br, Bϑ, Bϕ, made at a single observation point, this wouldn’t be very helpful. But, let’s say we’ve got multiple observations made at different places along the surface of the earth—which we actually do. 
Then, for each observation, you can replace the left-hand sides of (N.198) with numbers—the observed values of Br , Bϑ , Bϕ : and you have as many instances of (N.198) as there are observation points. If you are OK with stopping the sums in (N.198) at some finite value842 of l, then you can think of this as a linear system, or system of linear algebraic equations, whose unknowns are the coefficients alm , blm , etc., and everything else is known—r , ϑ, ϕ are, each time, the coordinates of the observation point. If you have enough data, and if the data are good, you can solve this inverse problem, presumably in least-squares843 sense: and this is how people, starting with Gauss, figured that the magnetic field that
we see on earth does not originate from space: because when they solved for alm , etc., they found clm , dlm to be zero or close to zero: important external sources just wouldn’t fit the data. 579. Elsasser, who died in 1991, is probably mostly remembered for his ideas on the earth’s magnetic field; but he was a physicist with a very broad range of interests who contributed to everything from biology to meteorology to nuclear physics. He was born in Germany, went to school and even started his academic career there, but fled the country when Hitler came to power, first to Paris and then to the U.S. He was faculty at a bunch of schools there, including Caltech, Princeton, and Johns Hopkins. After Pearl Harbor he was drafted and spent the war working as a researcher for the U.S. Signal Corps, in New Jersey, and then for the Radio Propagation Committee of the National Defense Research Council in the Empire State Building. And that is when, on weekends, he worked out his geomagnetic theory. He wondered about biology, the nature of life and all that, and was skeptical of physics’ tendency to, uhm, simplify. Meaning, physicists are happy if they can establish some simple, compact laws (think Newton’s laws and Maxwell’s equations) that are enough to explain a wide variety of intricate phenomena. Elsasser figured that that just wasn’t good enough to really understand the world. In France he met Théophile Cahn, a famous physiologist; “Walter was impressed”, writes Harry Rubin (in Elsasser’s biographical memoir for the National Academy of Sciences), “with Cahn’s claim that biology was first and foremost the realm of utter complexity. [...] He realized that [...] the man who thinks in traditional terms of physics [...] must change his ways and learn to accommodate complexity and individuality if he is to encounter life and understand its nature.” 580. “The Earth as a Dynamo”, Scientific American, vol. 198, 1958. 581. 
We’ve met Ohm’s law—relating electric current density j to electric field E—in that long note on Maxwell’s equations, note 575, where Ohm’s was Eq. (N.182). But I didn’t tell you all about it. If there’s a magnetic field B, and if the medium that conducts the current is not at rest but moving with velocity v, then v and B also affect the current, and it turns out that

j = σ(E + v × B),

where, remember, σ is the conductivity. Nota bene: if σ and v are nonzero, current will flow even if there’s no electric field, E = 0. 582. Ampère’s law, or Maxwell’s (N.186) in note 575. If you already went through that, you might remember that that law is essentially a description of Ørsted’s and Ampère’s experiments—which are also mentioned there. If you haven’t checked out that note yet, you should probably do it at some point. 583. G. A. Glatzmaier and P. Olson, “Probing the Geodynamo”, Scientific American Special Editions 15, pp. 28–35, 2005.
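Since note 578 leans so heavily on Legendre polynomials, here is a small numerical sketch (my own, not from the book) that builds Pl through Bonnet’s recursion, (l+1) P_{l+1}(x) = (2l+1) x Pl(x) − l P_{l−1}(x), a standard identity, and checks it against the closed form for P2 and against the orthogonality of P2 and P3:

```python
import numpy as np

def legendre(l, x):
    """Legendre polynomial P_l(x) via Bonnet's recursion."""
    p_prev, p = np.ones_like(x), np.asarray(x, dtype=float)
    if l == 0:
        return p_prev
    for k in range(1, l):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

x = np.linspace(-1.0, 1.0, 2001)
# closed forms: P_2 = (3x^2 - 1)/2, P_3 = (5x^3 - 3x)/2
err = np.max(np.abs(legendre(2, x) - (3 * x**2 - 1) / 2))
# orthogonality: integral of P_2 * P_3 over [-1, 1] (trapezoid rule)
f = legendre(2, x) * legendre(3, x)
integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x))
print(err, integral)   # both ≈ 0
```

With cos ϑ in place of x, these are exactly the functions that multiply the r-dependent terms in the expansion of note 578.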
584. When matter changes its “phase”, like, from solid to liquid, or from liquid to solid, or from liquid to gas, etc., it usually absorbs, or emits, a certain amount of heat that doesn’t translate into a change in temperature. For example, to melt ice, you need, first, to warm it up, from its initial below-zero temperature to 0 ◦ C, which you do by supplying heat to it—you warm it up, and its temperature goes up. OK. But then, after temperature has risen all the way to 0 ◦ C, it stays at zero for a while, even if you keep heating the (melting) ice. Put simply, what happens is that some of the heat is spent to modify the, uhm, structure of matter: and so it cannot be spent to change temperature, which stays the same, until the ice has melted completely. That extra heat is what people call latent heat. And if we were cooling water down, to make ice, the opposite would happen—the same amount of latent heat would have to be taken away from the freezing water, while temperature remains equal to 0 ◦ C, until no water is left and all we have is ice.
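To put rough numbers on the ice example (an illustrative back-of-the-envelope sketch; the constants are rounded textbook values, not taken from the note):

```python
# Heat needed to turn 1 kg of ice at -10 C into water at +10 C.
# Rounded constants: c_ice ~ 2100 J/(kg K), latent heat of fusion
# L_fusion ~ 334 kJ/kg, c_water ~ 4186 J/(kg K).
m = 1.0
c_ice, L_fusion, c_water = 2100.0, 334e3, 4186.0

Q_warm_ice = m * c_ice * 10     # -10 C -> 0 C (temperature rises)
Q_melt = m * L_fusion           # 0 C ice -> 0 C water (temperature stuck at zero)
Q_warm_wat = m * c_water * 10   # 0 C -> +10 C (temperature rises again)

print(Q_warm_ice, Q_melt, Q_warm_wat)
```

The latent-heat term dwarfs both sensible-heat terms, which is why the thermometer appears "stuck" at zero for so long while ice melts.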
We can use Ohm’s law to find an expression for E in terms of j and B, j E = − v × B; σ plug that into Faraday’s law, which gives ∇×
∂B j −v×B =− ; σ ∂t
and then use Ampère’s law to get rid of j from the expression we’ve just gotten: ∇×
1 ∂B ∇ ×B−v×B =− . μσ ∂t
The curl has the property846 that ∇ × (∇ × A) = ∇(∇ · A) − ∇ 2 A,
(N.199)
for whatever vector A: we can use it to rewrite (N.199),
$$\begin{aligned}\frac{\partial \mathbf{B}}{\partial t} &= -\nabla\times\left(\frac{1}{\mu\sigma}\,\nabla\times\mathbf{B} - \mathbf{v}\times\mathbf{B}\right)\\
&= -\frac{1}{\mu\sigma}\,\nabla\times(\nabla\times\mathbf{B}) + \nabla\times(\mathbf{v}\times\mathbf{B})\\
&= \frac{1}{\mu\sigma}\,\nabla^2\mathbf{B} - \frac{1}{\mu\sigma}\,\nabla(\nabla\cdot\mathbf{B}) + \nabla\times(\mathbf{v}\times\mathbf{B})\\
&= \frac{1}{\mu\sigma}\,\nabla^2\mathbf{B} + \nabla\times(\mathbf{v}\times\mathbf{B})\end{aligned}\tag{N.200}$$
(because according to Gauss’ law of magnetism, ∇ · B = 0): and that’s what people call the induction equation. Now, let’s make another approximation. Fluid iron is very conductive; its conductivity is so high we might replace it with ∞: which is the same as saying that 1/σ = 0: but then the first term at the right-hand side of (N.200) cancels out, and the induction equation collapses to
$$\frac{\partial \mathbf{B}}{\partial t} = \nabla\times(\mathbf{v}\times\mathbf{B}).\tag{N.201}$$
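A quick order-of-magnitude aside: if, instead of taking σ → ∞, we keep σ finite and set v = 0, then (N.200) reduces to a diffusion equation, ∂B/∂t = (1/μσ)∇²B, whose characteristic decay time is τ ≈ μσL². Here is a back-of-the-envelope sketch in Python; the values of σ and L are round-number assumptions of mine (conductivity of liquid iron of order 10⁶ S/m, flow length scale of order 10⁶ m), not figures from the text:

```python
import math

# With v = 0, (N.200) becomes dB/dt = (1/(mu*sigma)) * laplacian(B):
# a diffusion equation, with characteristic decay time tau ~ mu * sigma * L**2.

mu_0 = 4 * math.pi * 1e-7   # vacuum permeability, H/m
sigma = 1e6                 # ASSUMED conductivity of liquid iron, S/m
L = 1e6                     # ASSUMED length scale of core flow, m

tau_seconds = mu_0 * sigma * L**2
tau_years = tau_seconds / (365.25 * 24 * 3600)

print(f"magnetic diffusion time: about {tau_years:.0f} years")
```

With these (assumed) numbers τ comes out on the order of tens of thousands of years—very short compared to the age of the earth—which is one way of seeing why the ∇ × (v × B) term, i.e. the dynamo, is needed to sustain the field.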
This will be very useful for our goals... but first, you have to bear with me for a minute: consider the dot product of the field B with a unit vector n̂ that is everywhere perpendicular to a surface S in the outer core, and integrate that over S: that’s called the flux847 of B through S. Let S be a “material surface” (meaning, a surface defined by a set of particles: i.e., if you deform the material, the surface deforms with it) within the outer core. What we need to do now, and in a minute you’ll see why we are doing it, we need to take the time derivative of the flux of B. That’s not easy, actually, because, besides B itself changing over time, you must not forget that matter is flowing around, and so the shape of the material surface S changes over time as well. Let’s write the general formula for the derivative, and we’ll see what we can do:
$$\frac{d}{dt}\int_S \mathbf{B}\cdot\hat{\mathbf{n}}\,dS = \lim_{dt\to 0}\frac{\int_{S(t+dt)}\mathbf{B}(t+dt)\cdot\hat{\mathbf{n}}\,dS - \int_{S(t)}\mathbf{B}(t)\cdot\hat{\mathbf{n}}\,dS}{dt}.$$
Because dt is arbitrarily small, it’s OK to replace B(t + dt) at the right-hand side with a first-order Taylor expansion, i.e. $\mathbf{B}(t+dt)\approx\mathbf{B}(t)+\frac{\partial\mathbf{B}}{\partial t}(t)\,dt$, and
$$\begin{aligned}\frac{d}{dt}\int_S \mathbf{B}\cdot\hat{\mathbf{n}}\,dS &= \lim_{dt\to 0}\frac{\int_{S(t+dt)}\left[\mathbf{B}(t)+\frac{\partial\mathbf{B}}{\partial t}(t)\,dt\right]\cdot\hat{\mathbf{n}}\,dS - \int_{S(t)}\mathbf{B}(t)\cdot\hat{\mathbf{n}}\,dS}{dt}\\
&= \lim_{dt\to 0}\frac{\int_{S(t+dt)}\mathbf{B}(t)\cdot\hat{\mathbf{n}}\,dS - \int_{S(t)}\mathbf{B}(t)\cdot\hat{\mathbf{n}}\,dS}{dt} + \int_{S(t)}\frac{\partial\mathbf{B}}{\partial t}(t)\cdot\hat{\mathbf{n}}\,dS,\end{aligned}\tag{N.202}$$
where you might notice that in the last term I have replaced S(t + dt) with S(t), after taking the limit as dt → 0. That term doesn’t look like it can be made any simpler than it already is, so let’s focus on the other one. That would be the derivative of the flux of B, if B were constant and equal to its value at the time t. It can be worked out to something much simpler if we play a mathematical trick: consider the closed surface that’s formed by S(t), S(t + dt) and the surface, let’s call it Σ, spanned by the edge of S(t) as it moves to become the edge of S(t + dt). Because it’s a closed surface, the divergence theorem applies, and
$$\int_{S(t+dt)}\mathbf{B}(t)\cdot\hat{\mathbf{n}}\,dS - \int_{S(t)}\mathbf{B}(t)\cdot\hat{\mathbf{n}}\,dS + \int_{\Sigma}\mathbf{B}(t)\cdot\hat{\mathbf{n}}\,dS = \int_V \nabla\cdot\mathbf{B}(t)\,dV,$$
where V is the volume enclosed by S(t) and S(t + dt) and Σ. But, wait a second, there’s Maxwell’s equation (N.184) that says that ∇ · B = 0; but so then
$$\int_{S(t+dt)}\mathbf{B}(t)\cdot\hat{\mathbf{n}}\,dS - \int_{S(t)}\mathbf{B}(t)\cdot\hat{\mathbf{n}}\,dS = -\int_{\Sigma}\mathbf{B}(t)\cdot\hat{\mathbf{n}}\,dS.$$
The negative sign before the second term at the left-hand side is there because, to apply the divergence theorem, we need n̂ to point outward from V, and so we need to flip it around, either along S(t) or along S(t + dt). Anyway, we can go and plug what we’ve just found into (N.202), and
$$\frac{d}{dt}\int_S \mathbf{B}\cdot\hat{\mathbf{n}}\,dS = -\lim_{dt\to 0}\frac{\int_{\Sigma}\mathbf{B}(t)\cdot\hat{\mathbf{n}}\,dS}{dt} + \int_{S(t)}\frac{\partial\mathbf{B}}{\partial t}(t)\cdot\hat{\mathbf{n}}\,dS.$$
If we call dl an arbitrarily small element of the edge of S(t), with direction parallel to the edge of S(t), then at each point along the edge of S(t) we have that n̂ dS = dl × v dt: the magnitude of dl × v dt is the product of the magnitude of the two vectors times the sine of the angle they form: i.e. the area of an infinitely small rectangle with base along S(t) and height equal to the distance, at that point, between S(t) and S(t + dt): i.e., the area of dS; the direction, by the right-hand rule, is outward from Σ—which is the direction of n̂. It follows that
$$\frac{d}{dt}\int_S \mathbf{B}\cdot\hat{\mathbf{n}}\,dS = -\oint_{\partial S(t)}\mathbf{B}(t)\cdot(d\mathbf{l}\times\mathbf{v}) + \int_{S(t)}\frac{\partial\mathbf{B}}{\partial t}(t)\cdot\hat{\mathbf{n}}\,dS,$$
where ∂S denotes the edge of S. The way we’d defined the dot and cross products in Chap. 1, it can be shown848 that for any trio of vectors a, b and c, we have a · (b × c) = (c × a) · b: so, in our case,
$$\frac{d}{dt}\int_S \mathbf{B}\cdot\hat{\mathbf{n}}\,dS = -\oint_{\partial S(t)}[\mathbf{v}\times\mathbf{B}(t)]\cdot d\mathbf{l} + \int_{S(t)}\frac{\partial\mathbf{B}}{\partial t}(t)\cdot\hat{\mathbf{n}}\,dS.$$
Then by Stokes’ theorem (see note 578),
$$\begin{aligned}\frac{d}{dt}\int_S \mathbf{B}\cdot\hat{\mathbf{n}}\,dS &= -\int_{S(t)}\{\nabla\times[\mathbf{v}\times\mathbf{B}(t)]\}\cdot\hat{\mathbf{n}}\,dS + \int_{S(t)}\frac{\partial\mathbf{B}}{\partial t}(t)\cdot\hat{\mathbf{n}}\,dS\\
&= \int_{S(t)}\left\{\frac{\partial\mathbf{B}}{\partial t}(t) - \nabla\times[\mathbf{v}\times\mathbf{B}(t)]\right\}\cdot\hat{\mathbf{n}}\,dS.\end{aligned}$$
If you compare this to (N.201), it’s easy to see that the term in curly brackets must be zero: but so then the whole right-hand side is zero and we must conclude that, as long as we are OK with assuming that the fluid iron down there is perfectly conductive, then the so-called flux of B through any material surface S within the outer core is constant through time,
$$\frac{d}{dt}\int_S \mathbf{B}\cdot\hat{\mathbf{n}}\,dS = 0.$$
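Incidentally, the scalar-triple-product identity used a few lines up, a · (b × c) = (c × a) · b, is easy to spot-check numerically. A minimal sketch (the three test vectors are arbitrary numbers of mine, standing in for B, dl and v):

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x*y for x, y in zip(a, b))

# Arbitrary test vectors, playing the roles of B, dl and v.
a, b, c = (1.0, -2.0, 3.0), (0.5, 4.0, -1.0), (-2.0, 0.0, 2.5)

lhs = dot(a, cross(b, c))   # a . (b x c)
rhs = dot(cross(c, a), b)   # (c x a) . b

assert abs(lhs - rhs) < 1e-12
print("a.(b x c) =", lhs, "  (c x a).b =", rhs)  # both come out 32.5
```

The same identity, applied with a = B, b = dl, c = v, is what turns B · (dl × v) into [v × B] · dl in the derivation above.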
Remember that S moves with the material—it’s a “material surface”—so when we say that the flux is constant through time that means that the flux through any bunch of particles that form a surface remains constant, even as the particles flow across the core. And but for the flux to be constant as the particles move, we need the magnetic field lines to move with the flow. (Meaning, each and every field line: because, the way we’ve derived things, S is totally arbitrary, and can be as small as we want it to be.) At some point someone must have described this with the phrase: the flux of the magnetic field is “frozen into” the moving fluid, or something similar: and this is where the term “frozen flux” comes from. 586. It can hardly be a coincidence that magnetic dipole and rotation axis are almost parallel to one another. Although, OK, they are parallel right now: but what about the past? Paleomagnetic data (which you’re about to learn everything about, in a few pages) show that, over the last couple million years, the earth’s magnetic field was always a dipole, parallel to the current axis of rotation. But, during this time, has the rotation axis moved? One way to check could be to look at the geological signature of past changes in the climate of certain places, i.e., paleoclimate data (which remember the piece on Wegener’s continental drift, Chap. 5): because, e.g., you must have ice near the rotation poles, and no ice near the equator, and large, expanding/retreating ice sheets leave geological traces; and if the rotation poles move...
In “Review of Paleomagnetism”, their paper in vol. 71 of the Bulletin of the Geological Society of America, which came out in 1960, i.e., only a couple years after Elsasser’s Scientific American paper, Allan Cox and Richard D. Doell say that “estimates of displacements of the rotational axis during the past several decades, as found from astronomical observations, indicate [...] that the geographic pole moved at most 15 feet between 1900 and 1940. If the average rate of motion during the past half million years had been twice that amount, the total polar shift would have been only 1◦ .” Which is pretty small. “Since there is no other evidence to suggest that the axis of rotation differed significantly from the present one, the late Pleistocene and Recent paleomagnetic results [i.e., paleomagnetic data covering the last couple million years] constitute strong evidence in support of the dynamo theory for the origin of the earth’s magnetic field.” 587. Elsasser doesn’t mention Coriolis force in his paper, presumably to keep things simple. Anyway, Gaspard-Gustave Coriolis, the same Coriolis who first used the word “work” (or, actually, travail) in its current scientific meaning—we’ve met him in Chap. 4 already—, also did some work on machines with rotating parts, looking at (apparent) forces which only show up when one does the math in a rotating frame of reference; i.e. a frame of reference that is attached to whatever part of the machine actually rotates. 
The first thing you need to know about Coriolis and centrifugal forces—which are close relatives, and I might as well tell you about centrifugal force, too, while I’m at it—the first thing you need to know is that the laws of Newton, that force equals mass times acceleration, etc., are valid as long as everything—displacement, velocity, acceleration—is measured with respect to a reference frame that is not accelerating: it might be moving at constant speed on a rectilinear trajectory, but must have zero acceleration. Reference frames with such a property are called “inertial”. To understand what I am talking about, think of a car, accelerating at a constant rate. To keep things simple, take acceleration to be parallel with velocity: the car’s direction stays constant, but its speed grows. Say there is a plumb-bob hanging from the roof of the car. What happens to the bob as the car accelerates? If you look at the car from the outside (meaning, from an inertial reference frame), you’d answer that the bob wants to stay where it is (Newton’s first), and but the string keeps it attached to the car, so ultimately, besides its weight, there’s only one force acting on the bob and that’s the tension from the string. If the horizontal component of the string’s tension equals the product of the bob’s mass with the car’s acceleration, then the bob accelerates together with the car, with the exact same acceleration as the car. (This is what the drawing in Fig. N.44 tries to show.) Now imagine you’re in the car, and imagine you forgot that the car is accelerating: seeing that the bob is hanging at an angle with respect to the vertical, you’d have to conclude that there’s some mysterious, additional force that’s pulling it towards the back of the car (Fig. N.45). The physical consequence of this force—the inclination of the plumb line—is evident to you as a passenger,
Fig. N.44 A car moves along a straight line, with constant acceleration. A bob hangs from the roof of the car. Two forces act on the bob: its weight, and the tension from the string. The vertical component of the string’s tension coincides with the bob’s weight; its horizontal component with the car’s acceleration, multiplied by the bob’s mass: then, the bob accelerates together with the car
Fig. N.45 Imagine you’re in the car, but you don’t know that the car accelerates. The bob looks motionless to you (because you are also accelerating, together with it and with the car), but the plumb line is not vertical. To explain this, you must invoke an additional force, that pulls the bob towards the back of the car, so that all forces cancel out
but if you were looking at things from the outside you’d know that there’s no need to call it a force: it’s just the inertia of the bob that wants it to stay put. People call that an “apparent” force. You can solve physics problems both in the inertial and accelerating frames; but in the latter case, you’ve got to include in all your force balances an apparent force of magnitude ma, where a is the reference frame’s (the car’s) acceleration, m the mass of whatever object you’re looking at, e.g. the bob: and the apparent force points opposite the car’s acceleration. The second thing you need to know is that centrifugal and Coriolis are apparent forces that emerge when you work in a frame that’s rotating at uniform angular velocity with respect to an inertial frame. If you haven’t forgotten the bit on Euler’s equations in Chap. 1, you’ll remember that a rotating body necessarily accelerates—even if the angular velocity, i.e. the rotational speed, is constant. Which means, we’re going to get apparent forces, for instance, in reference frames that rotate uniformly around an axis. Still, it can be convenient, when you try to solve physics problems, to refer motions to a rotating frame: for example in geophysics, if you use a reference
Fig. N.46 The point P is motionless in the rotating frame, but rotates, together with that frame, with respect to the inertial frame
frame that’s anchored to the earth’s geography, longitude and latitude and depth or altitude, which is what we usually do, well, naturally that reference frame is not inertial, but rotates together with our planet. One way to deal with this is, find a mathematical expression for the apparent force that we’ve got to add, whenever we write equations of motion in the rotating frame. Start by introducing an inertial frame, with origin O at the center of the earth, and axes x, y, z, with z parallel to the rotation axis; and a frame that is rotating about the z axis with angular velocity (again, remember Euler from Chap. 1) ω = ω ẑ (and ẑ, of course, is a unit vector parallel to the z axis and pointing in the direction of positive z), origin again at O, and the axes called x′, y′ and z′. To keep things simple, let’s limit this to the case where the z and z′ axes coincide: the earth’s rotation axis doesn’t move around much, meaning we can consider it to be anchored to an inertial frame. In practice, we shall have solved our problem if we answer the following questions (maybe you won’t immediately see why that is so, but keep reading): (i) if the point P is fixed with respect to the rotating frame, what is its velocity v with respect to the inertial frame? (ii) if P is moving with velocity v′ with respect to the rotating frame, what is its velocity v with respect to the inertial frame? (iii) if P has acceleration a′ with respect to the rotating frame, what is its acceleration a with respect to the inertial frame? And here are some answers: (i) At time t, P occupies the position r with respect to the inertial frame. The point rotates with the rotating frame—its distance from the z axis doesn’t change—so, a
short while later, t + δt, it occupies the position r + δr. Provided that δt is sufficiently short, the magnitude of the vector δr, look at the drawing in Fig. N.46, is given by
$$\delta r = r\,\omega\,\sin\vartheta\,\delta t;$$
if (as for the earth, if we look at it from the northern hemisphere) rotation is counterclockwise, and if (as it’s usually done) ϕ is taken to grow counterclockwise, then the direction of δr is that of increasing ϕ, i.e.
$$\delta\mathbf{r} = r\,\omega\,\sin\vartheta\,\delta t\,\hat{\boldsymbol{\varphi}};$$
which if you remember how the cross product is defined, right-hand rule and all that, we might as well write
$$\delta\mathbf{r} = \boldsymbol{\omega}\times\mathbf{r}\,\delta t.$$
Now divide both sides by δt, make it indefinitely small, and
$$\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$$
is the velocity of point P (which is still fixed in the rotating frame) with respect to the inertial frame. (ii) It follows from (i) that if P is, instead, moving with respect to the rotating frame, with velocity v′, then its velocity v with respect to the inertial frame must be
$$\mathbf{v} = \mathbf{v}' + \boldsymbol{\omega}\times\mathbf{r}.\tag{N.203}$$
Which answers the question. And but from this we also infer that
$$\left(\frac{d}{dt}\right)_{\mathrm{inertial}} = \left(\frac{d}{dt}\right)_{\mathrm{rotating}} + \;\boldsymbol{\omega}\times,\tag{N.204}$$
meaning that the rate of change of a spatial, vectorial quantity (displacement, velocity, acceleration...) observed in the inertial frame is the same as observed in the rotating frame, plus a correction term that’s equal to the cross product of ω times the quantity itself (as observed in the inertial frame). (iii) To answer the remaining question, differentiate Eq. (N.203) and then use (N.204), i.e.,
$$\begin{aligned}\mathbf{a} &= \frac{d\mathbf{v}}{dt}\\
&= \frac{d}{dt}\left(\mathbf{v}' + \boldsymbol{\omega}\times\mathbf{r}\right)\\
&= \frac{d\mathbf{v}'}{dt} + \boldsymbol{\omega}\times\mathbf{v}' + \boldsymbol{\omega}\times\frac{d\mathbf{r}}{dt}\\
&= \mathbf{a}' + \boldsymbol{\omega}\times\mathbf{v}' + \boldsymbol{\omega}\times\left(\mathbf{v}' + \boldsymbol{\omega}\times\mathbf{r}\right)\\
&= \mathbf{a}' + 2\,\boldsymbol{\omega}\times\mathbf{v}' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}).\end{aligned}$$
If we turn this equation around,
$$\mathbf{a}' = \mathbf{a} - 2\,\boldsymbol{\omega}\times\mathbf{v}' - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}).$$
We might multiply by the mass m of the material point P, or whatever rigid body we are looking at,
$$m\mathbf{a}' = m\mathbf{a} - 2\,m\,\boldsymbol{\omega}\times\mathbf{v}' - m\,\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}).$$
And but now here’s the thing. Newton’s laws hold in the inertial frame, which means that if a force F is applied to P or whatever, then
$$m\mathbf{a}' = \mathbf{F} - 2\,m\,\boldsymbol{\omega}\times\mathbf{v}' - m\,\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}).\tag{N.205}$$
Equation (N.205) has the answer to our initial problem—that of finding a “mathematical expression for the apparent force that we’ve got to add”, etc. It says—the equation says—that if you try to apply Newton’s laws in the rotating system, you’re going to have to account for an “apparent force” which coincides with everything that’s at the right-hand side, minus F (which is the “true” force). And those two terms are what we call Coriolis force and centrifugal force, in this order. Now to understand what those “apparent” forces actually do, let’s start with the simple case where P is not moving with respect to the rotating frame: v′ = 0, and the only apparent force we need to worry about is the centrifugal force, −m ω × (ω × r). Look at Fig. N.46, remember the right-hand rule, and you see that ω × r = ω r sin ϑ ϕ̂. If you cross-multiply that with ω (and remember that ω = ω ẑ),
Fig. N.47 Projectiles veer to the right in the northern hemisphere and to the left in the southern hemisphere. a A cannon is in the northern hemisphere and points to the north, i.e. the projectile’s initial velocity v′, in the rotating frame, is directed towards the north: then, the cross product ω × v′ must point into the page; −ω × v′ points—almost—towards you, the reader: bottom line, Coriolis’ force deflects the projectile toward the right of the target. b The cannon is still in the northern hemisphere, but now it points to the west. Then, ω × v′ points away from the rotation axis; −ω × v′ points towards the axis, and but has a nonzero component pointing towards the north. It follows that, again, the projectile lands to the right of the target. c The cannon is in the southern hemisphere now, and points to the north. ω × v′ points into the page (east); −ω × v′ points towards you: the projectile lands to the left of the target
ω × r) = ω × ωr sin ϑϕˆ ω × (ω = ωr sin ϑ ω × ϕˆ = ω 2 r sin ϑ zˆ × ϕˆ , which is perpendicular to both zˆ and ϕˆ and (right-hand rule once again) points towards the earth’s axis: but then that means that ω × r), which carries a minus sign, must point away from −m ω × (ω the rotation axis: which is why it is called centrifugal force. The remaining apparent force in (N.205) is Coriolis force, −2 m ω × v , which doesn’t show up if v = 0, i.e., if the body we’re looking at is motionless with respect to the rotating frame. To figure out what Coriolis does, imagine (look at the sketches in Fig. N.47) you fire a cannon ball somewhere in the northern hemisphere; if the cannon points towards the south, so will v , and so ω × v will point to the east: and so Coriolis force (the minus sign!) will point to the west: i.e., to the right of the target you were aiming at (which stands due south). Now point the cannon to the north and fire again: this time ω × v points to the east, Coriolis then must point to the west, that is, again, to the right of the target. Point the cannon to the west: Coriolis points towards the rotation axis, and but if you think of it as the sum of a component that’s perpendicular and one that’s tangential to the surface of the earth, the tangential component points to the north—to the right of the target, again. And so on. If you do the same exercise in the southern hemisphere, though, if you are in the southern hemisphere and point your cannon to the north, for example, ω × v then points to the east, Coriolis points to the west, and the cannon ball lands to the left of the target. Point the
cannon in other directions, like above, and you’ll find that Coriolis again deflects the ball to the left. At the equator, the component of Coriolis that’s tangential to the earth’s surface is zero, so there’s no deflection, either right or left. (Cannonballs still miss the target, though, because Coriolis pulls them down, I guess, and so they won’t fly as far as they would without Coriolis.) Long story short, projectiles veer to the right in the northern hemisphere and to the left in the southern hemisphere. And I’ve chosen to use the cannon example, because it’s easy to picture, I guess, but the same applies, e.g., to masses of air in the atmosphere (see note 588); or to fluid iron in the core, as we are about to see. 588. Incidentally, Coriolis force also explains which way hurricanes spin. A hurricane starts when air moves into a low pressure center, from all around it. Because the earth rotates, air moving towards the low-pressure center is everywhere deflected to the right (if we are in the northern hemisphere) and that can only start a counterclockwise spin. According to the same logic, hurricanes in the southern hemisphere spin clockwise. 589. We’ve derived the induction equation already: Eq. (N.200) in note 585. At that point, the reason for deriving it was I wanted to show you the frozen-flux application and what that means for the relationship between flow of matter in the core and temporal changes of the magnetic field that we can observe at the earth’s surface. But the induction equation is also the partial differential equation—one vectorial PDE, or three scalar PDEs—that you need to solve if you want to find B. At first sight it doesn’t even look much worse than the stuff we’ve dealt with when we were doing wave propagation in Chap. 6. But when you look more carefully, you realize that this equation can’t be solved on its own, or in any case it can’t be solved unless you know v, i.e. 
you’ve got to know the velocity field of convecting material in the outer core... and you need to know it at all times t for which you want to know B. And if you want to do things right, you’ve got to solve for both B and v simultaneously —one big system of PDEs—because there’s not only gravity now to pull things around, but the electric and magnetic fields also exert a force on whatever materials are charged—and remember, the core is made of iron, which is not unlikely to be charged—and so you should add that, as an additional body force in the Navier-Stokes equation (6.135). Which is what I meant when I wrote that, in the outer core, the induction equation is coupled to Navier-Stokes. 590. from a “Memorial Meeting for Lord Blackett, O.M., C.H., F.R.S. at the Royal Society on 31 October 1974”, published in 1975 in the Notes and Records of the Royal Society of London. 591. In their 1960 paper: see note 586. 592. The articles cited by Cox and Doell in this excerpt are: K. M. Creer, E. Irving849 and S. K. Runcorn, “The Direction of the Geomagnetic Field in Remote Epochs in Great Britain”, Journal of Geomagnetism and Geoelectricity, vol. 6, 1954;
E. Irving, “Palaeomagnetic and Palaeoclimatological Aspects of Polar Wandering”, Geofisica Pura e Applicata, vol. 33, 1956; S. K. Runcorn, “Palaeomagnetic Comparisons between Europe and North America”, Proceedings of the Geological Association of Canada, vol. 8, 1956. 593. Lamont, a branch of Columbia University, was founded by Maurice Ewing in 1949. Ewing was a researcher at the Woods Hole Oceanographic Institution during WWII, when the U.S. Navy needed help to improve their sonar systems. Together with his colleague Joe Worzel, he discovered a “shadow zone” for sound waves in the ocean—a temperature gradient that creates an important velocity gradient that bends waves: sort of like the velocity gradient in the earth’s core causes a shadow zone for seismic waves, which we learned about in Chap. 7. This is quite important per se, because you can use it, e.g., to hide your submarine from enemy ships. On top of that, just like you get a “head wave”, or “guided wave” wherever you have a discontinuity/a strong vertical gradient within the earth (again, Chap. 7), sound would propagate along Ewing’s strong temperature/sound velocity gradient, to very long distances. Which was a quite useful thing to know, too, for a number of reasons that I am not getting into—but, if you are curious about this, look up “SOFAR (SOund Fixing and Ranging)” and “SOSUS (SOund SUrveillance System)”. Bottom line, “by war’s end,” says Naomi Oreskes in the intro to Plate Tectonics, “the U.S. Navy was convinced of the value of geophysical research. Through the newly established Office of Naval Research (ONR), funds began to flow generously into American laboratories. Three institutions particularly benefited from ONR support: Woods Hole, the Scripps Institution of Oceanography, and the newly created Lamont Geological Observatory at Columbia University, now directed by Ewing.” And then “in 1953, soon after founding Lamont, Ewing acquired its first ship, the Vema.
He secured the use of a second ship, the Conrad, in 1962. Instead of making voyages of a few months with defined goals to limited areas, Lamont’s ships soon circled the globe annually, accumulating 40,000 or so miles and 300 days at sea year after year. Work did not stop; information was collected continuously with every usable kind of measurement, whether or not anyone had asked for it. Doc always wanted it. Crews soon had a motto: A core a day keeps the doctor away.” (W. Wertenbaker, “William Maurice Ewing: Pioneer Explorer of the Ocean Floor and Architect of Lamont”, GSA Today, October 2000.) (“Doc”, by the way, was Ewing’s nickname. And a “core”, or “core sample”, is a cylindrical sample of rock (or whatever substance), extracted with a so-called “core drill”.) 594. Heezen (1960): “The segment of the ridge that underlies the mid-Atlantic [...] came to light in 1873 when investigators aboard the British ship Challenger, on its epoch-making three-and-a-half-year oceanographic cruise around the world, employed a 200-pound weight on a hemp line to take laborious soundings approximately every 100 miles. These revealed that the middle of the Atlantic, contrary to what had been expected, is less than half as deep as two broad troughs located about a third of the way across from either shore. On the basis
of such widely spaced soundings it was not possible to tell whether the central elevation was mountainous or simply a broad, smooth rise. In 1925, 1926 and 1927 the German ship Meteor, making the first extensive application of echosounding gear, produced detailed profiles of the ocean bottom that showed the Mid-Atlantic Ridge to be a rugged mountain range.”850 And but then, with the work of Ewing and company in the 1950s, lots of new data, and: “In 1953 Marie Tharp of the Lamont Geological Observatory and I were making a detailed physiographic diagram of the floor of the Atlantic, based upon a large number of echo-sounding profiles. As the preliminary sketch emerged, Miss Tharp was startled to see that she had drawn a deep canyon down the center of the Mid-Atlantic Ridge. [...] The mid-Atlantic rift averages more than 6,000 feet in depth and ranges from eight to 30 miles in width for hundreds of miles.” For comparison, “the Grand Canyon of the Colorado River averages perhaps 4,000 feet in depth and four to 18 miles in width along its most majestic 50–60 miles”. 595. Marie Tharp is one of the very few characters, in this story, who aren’t guys. “Tharp, who was born in 1920, came of age during a time that was suspicious of women who chose to make science their life’s work”, says Erin Blakemore in “Seeing is Believing: How Marie Tharp Changed Geology Forever”, Smithsonian Magazine, August 2016. “In retrospect, it makes plenty of sense that the daughter of a soil surveyor for the U.S. Department of Agriculture would inherit a taste for both geology and cartography. But given the scant number of women in geology at the time—women obtained fewer than 4 percent of all earth sciences doctorates between 1920 and 1970—it’s surprising that Tharp was able to pursue her passion.” In the 1940s, though, as the young men had all gone off to war, Tharp was able to enter a master’s program in geology at the University of Michigan.
She worked, briefly, in the oil industry, and then at Lamont, where she spent her entire career. “Navy regulations meant Tharp couldn’t go out on the research vessels that Ewing and her other colleagues chartered”, says Blakemore. So what she did was, she sat at her table and worked out the bathymetry data collected, mostly, by Bruce Heezen. That was a lot of work, because we are talking about the 1940s, 1950s: no computers. Anyway, she slowly transformed Heezen’s numbers into a map of the depth of the ocean floor, and, as the map grew more complete, something strange showed up: a huge chain of submarine mountains—a “ridge”—in the middle of the Atlantic, with a deep “valley” running along its axis. Apparently, Tharp figured right away that that valley was a rift, i.e., that the floor of the Atlantic was opening up. “When I showed what I found to Bruce,” Tharp recalled, Blakemore says, “he groaned and said ‘It cannot be. It looks too much like continental drift.’ [...] Bruce initially dismissed my interpretation of the profiles as ‘girl talk’.” Eventually, though, Heezen agreed re the rift (he still wasn’t into continental drift, though, as you are about to see), and published several papers about it. “When Heezen—who published the work and took credit for it—announced his findings in 1956,” says Blakemore, “it was no less
than a seismic event in geology. But Tharp, like many other women scientists of her day, was shunted to the background.” As I kept reading about Tharp and co., some curious stuff emerged. In “The Contrary Map Maker” (The New York Times magazine, Dec. 31, 2006.), Stephen S. Hall recounts that “Heezen and Tharp argued about almost every point on these maps.” He quotes Paul J. Fox, who at the time was a grad student at Lamont: “Bruce would breeze in on occasion”, says Fox, “and he and Marie were Hamlet-like in their relationship. There would be hooting and hollering, and occasionally you’d hear crashing sounds. Bruce would kick the wastebasket across the room, or Marie would throw a map weight at him. He would say: ‘Marie, this is rubbish. Rubbish!’ And then he would take a big rubber eraser, lick the end of it and then erase a big chunk of the map, something Marie had been working on for three weeks, in a wild, expressive manner. And then she’d say, ‘Bruce, how could you do that?’ ” Ewing, it turns out, wasn’t convinced by what Heezen and Tharp were doing— he wasn’t a believer in plate tectonics, it seems, or in Heezen’s theory that the earth is expanding (which I am about to tell you about), for that matter, and, Hall says, “began to clash with Heezen over both ideas and ego. Heezen had become a tenured professor, but Ewing did what he could to thwart the mapping project. He refused to share important data about the sea floor with the map makers—data that Heezen’s graduate students sometimes surreptitiously ‘exported’ to Tharp and her assistants. He stripped Heezen of his departmental responsibilities, took away his space, drilled the locks out of his office door and dumped his files in a hallway. Most important, Ewing blocked Heezen’s grant requests”, etc. “Although [Tharp] and Heezen fought like cats and dogs over the accuracy of the map, the adversity with Ewing united them as fiercely as a wedding pact. 
They dined and drank together like husband and wife, and Heezen received his graduate students at her house. Yet according to people who knew them, their unusually intense relationship was platonic.” 596. Samuel Warren Carey, geology Prof. in Tasmania, an early (pre-WWII) advocate of continental drift. There is a S. W. Carey Medal, awarded by the Geological Society of Australia since 1992. During the war, Carey was a captain in the special forces. 597. So whose idea was it? There’s a short contribution by Dietz, again on seafloor spreading, in a collection of papers called The Crust of the Pacific Basin, which is vol. 6 of the Geophysical Monograph Series published by the American Geophysical Union. In a “note added in proof”, Dietz writes: “The writer’s attention has been drawn to a reprint by H. Hess also suggesting a highly mobile sea floor. Full credit of priority is to be accorded him for any merit which this suggestion has.” The story behind this is told by Henry W. Menard (another recurrent name in the history of plate tectonics; at the time a faculty member at the University of California in San Diego), in his 1986 book The Ocean of Truth: a Personal History of Global Tectonics. In 1960 “Hess began to circulate a manuscript titled ‘The Evolution of Ocean Basins’ prepared for The Sea, Ideas and Observations, a planned new series of volumes”. The manuscript, for
whatever reason, was read by many more people than unpublished manuscripts usually are; meanwhile, the publication of The Sea was delayed, and so Hess decided to publish his manuscript in Petrologic Studies: A Volume to Honor A. F. Buddington, Buddington being a Princeton colleague, and friend, who had just retired. Bottom line, Hess’ paper only came out in late 1962. “It may seem surprising”, says Menard, “that Hess himself did not rush to publication in Nature or Science with his revolutionary ideas. The explanation may be complex, but the ideas were just ideas—of which he had many. At the time he may not have thought of them as anything other than the ‘geopoetry’ he called them851. The more revolutionary they were, the more he might feel the desirability for review by selected colleagues.” So anyway, in early 1961 Dietz sent his manuscript, which was to be submitted to Nature, to Menard, who was his friend, plus he also lived in San Diego, I guess, so that he’d take a look and give some feedback. At this point Hess’ paper is not published yet, and but Menard, like many others, has seen the manuscript. “I was dumbfounded”, writes Menard. “Hess had sent me a copy of ‘The Evolution of Ocean Basins’ soon after he bound the manuscript. Dietz’s manuscript, ‘Continent and Ocean Basin Evolution by Spreading of the Sea Floor,’ was amazingly similar in more than the name. “I remember what then happened 23 years ago with perfect clarity. (Dietz remembers it exactly the same way.) I phoned him with the news. He expressed surprise because he had not received or seen Hess’s manuscript.
He came over to my office to read the manuscript for the first time.” The manuscripts were very similar, but then again, they were both making a sort-of natural connection between two concepts—rifting at mid-oceanic ridges, and convection—that many people, at the time, were discussing independently: so then, Menard thinks, it is very possible that Hess and Dietz had the same idea at the same time. “But did they? I certainly cannot be sure,” says Menard, “and I am the only person thanked by both writers for critical discussions of the manuscripts before publication. [...] “In 1968 Hess wrote, rather casually it appeared, in a one-page response to a criticism by Meyerhoff, ‘The cogent term ‘sea-floor spreading,’ which so nicely summed up my concept, was coined by Dietz (1961) after he and I had discussed the proposition at length in 1960.’” In 1962 Dietz had also written, “also rather casually it seemed”, that brief acknowledgement. “Dietz wrote this remarkable note on the strong urging of myself, Ed Hamilton, and, I believe, others. I remember that, at the time, we thought only one person could have priority and that, by circulating his manuscript, Hess had it. We also thought the similarities in the papers would prove embarrassing unless priority was openly ceded. These urgings were despite the fact that I was certain, and said so, that Dietz had written his paper before he saw Hess’s manuscript. I assumed that somehow Dietz had heard of Hess’s ideas at some meeting or cocktail party and then forgotten about it. This is a common occurrence”, etc.
Meyerhoff’s paper that Hess responded to, by the way, says essentially that it doesn’t really matter whether Hess has priority, or Dietz, because, actually, it’s Arthur Holmes who had come up with seafloor spreading more than ten years before. Menard agrees: “If there was any priority, it may have belonged to Arthur Holmes for the ideas in his textbook of 1944”. But, “Hess until his death and Dietz as late as 1984 were familiar only with Holmes’s paper on convection and drift in 1931.” In that paper “Holmes merely stretched the crust”; only in his 1944 book “Holmes proposed that stretching was merely an initial stage”, to be followed by rifting and seafloor spreading etc.: “a view that was unwittingly duplicated by Hess in 1962”, etc.852 598. Russell Raitt was a marine geophysicist from the University of California at San Diego, who went to collect data in the Pacific and found similar things, i.e., that the sedimentary layer at the top of the crust was thin; his 1956 paper is in volume 67 of the Bulletin of the Geological Society of America. 599. Maybe you remember from Chap. 5 that, because there’s isostasy on earth, there must be an asthenosphere (Barrell’s 1914 paper, etc.). But, because we know e.g. from looking at tides (Chap. 3) that the bulk of the earth has relatively high viscosity, it follows that the asthenosphere must be just a layer—how thick exactly we don’t know, but thin compared to the size of the planet. You might also remember that the way we figure there is (or there is no) isostasy under some relief is by looking at gravity data: if there’s isostasy, even a big mountain won’t much perturb the direction of a plumb line nearby853. 
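To put a rough number on that plumb-line argument, here is a back-of-the-envelope sketch; the mountain’s size, density, and the observer’s distance are my own illustrative values, not from the text:

```python
import math

# Toy numbers of my own (not from the text): horizontal pull of an
# *uncompensated* mountain on a nearby plumb line, treating the mountain
# as a point mass. With isostatic compensation, a low-density root largely
# cancels this pull -- which is why gravity is the test for isostasy.
G = 6.674e-11                 # gravitational constant, m^3 kg^-1 s^-2
g = 9.81                      # surface gravity, m s^-2

volume = 10e3 * 10e3 * 2e3    # a 10 km x 10 km x 2 km mountain, in m^3
mass = 2700.0 * volume        # crustal density ~2700 kg m^-3
d = 10e3                      # plumb line 10 km from the mass, in m

deflection = G * mass / d**2 / g          # radians, since tan(x) ~ x here
print(math.degrees(deflection) * 3600)    # a few arcseconds
```

A deflection of a few arcseconds is measurable (Maskelyne’s Schiehallion measurement, back in Chap. 1, was of this order); for a compensated mountain the observed deflection comes out much smaller.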
So, in case you are wondering what it is that Dietz means when he says that deviations from isostasy imply that the lithosphere is 70-km thick, the point is, whatever topography is not “compensated isostatically”, i.e., whatever topography doesn’t have an iceberg-type root (and isn’t of anomalously low density, either, as in the so-called Pratt’s model of isostasy), must be supported by the rigidity and high viscosity of the rocks. That is, say there’s some area with elevated topography that’s not compensated: by gravity, that chunk of lithosphere will want to sink, but the lithosphere has strength, and resists this sinking (just like the table on which I am writing doesn’t visibly bend under the weight of my laptop) until it becomes too much. Now, the thicker the lithosphere, the easier it is for it to support some topography without isostasy. So, I guess, from the size of the topographic highs that are known to have no roots, one can try and guess the thickness of the lithosphere. This is deliberately vague, and I am sorry, but I am going to cover the lithosphere and its thickness in some detail, soon. Dietz’ estimate of 70 km, for which he gives no bibliographic reference or any other explanation, is not going to be confirmed, so you might as well forget it. 600. Mason and Raff published two papers in 1961: R. G. Mason and A. D. Raff, “Magnetic Survey off the West Coast of North America, 32°N latitude to 42°N latitude”, Bulletin of the Geological Society of America, vol. 72, p. 1259–1265. And, A. D. Raff and R. G. Mason, “Magnetic Survey off the West Coast of North America, 40°N latitude to 52°N latitude”, Bulletin of the Geological Society of America, vol. 72, p. 1267–1270, 1961. They are published back to back in the same journal, as you see. Both are among the important papers in plate
tectonics. As for Vacquier et al., that’s probably “Horizontal Displacements in the Floor of the Northeastern Pacific Ocean”, by Vacquier, Raff (again) and Robert E. Warren, same volume of the same journal, pages 1251–1258, so just before the other two. At the time, Mason, Raff, Vacquier were all at the University of California in San Diego, Scripps Institution of Oceanography854.
601. A Swiss physicist, born in Lausanne. According to his entry in the Swiss Historical Dictionary, he achieved quite a number of things in his fairly long (1876–1963) life: physics prof. at the University of Lausanne, director of the meteorological service of Vaud (the canton where Lausanne is), and then of the Central Meteorological Station of Zürich—AKA MeteoSuisse, or MeteoSchweiz—which you are familiar with, if you’ve ever been in Switzerland and googled the weather forecast; a glaciologist by passion, he took part in some early expeditions to Greenland and to Jan Mayen. Which is presumably what inspired his papers on the magnetization of Greenlandic rocks, and the reason he’s featured in our story.
602. “Reversals of the Earth’s Magnetic Field”, Science, vol. 144, 1964.
603. The Cox of Cox and Doell: see note 586.
604. UC Berkeley prof. John Verhoogen and his team, incl. Allan Cox (see above), who was Verhoogen’s student at the time, “wanted to determine”, says Naomi Oreskes in Plate Tectonics, “whether reversals reflected the ambient magnetic field or were a consequence of the physical properties of the minerals involved. Cox began a project analyzing hundreds of samples from the Snake River basalts in the northwest United States, and found results that confirmed the work of Matuyama [...]: the patterns were coherent, and they appeared to depend upon the age of the basalt flows. To pin this down, Cox needed accurate ages for the flows. 
“At this point, a key instrumental development emerged.” We know already how radiometric dating works, in principle, but then it can be more or less effective depending on how old the rocks you want to date are, and on the half-life of the isotopes you look at: e.g. isotopes with a very long half-life are not that great if you need to date relatively young rock. So, for instance, Oreskes writes, “the radiometric uranium-lead (U-Pb) method for dating rocks had been around since the 1910s, but given the long half-life of uranium, it was accurate only for very old materials. However, Berkeley geochemists had developed the potassium-argon (K-Ar) dating technique to the point where it was accurate for very young rocks, including basalts that might be only a few hundred thousand years old.”
605. “Magnetic Anomalies over Oceanic Ridges”, Nature, vol. 199.
606. “One early autumn morning in 1964,” writes John Dewey, “I was sitting in my room in the Sedgwick Museum in Cambridge [...], when Toronto’s Tuzo Wilson, on sabbatical leave, sauntered in clearly bursting to tell anyone who would listen about his new ideas. He had discovered that I was the new lecturer in structural geology and said, ‘Dewey, I have just discovered a new class of fault.’ ‘Rubbish,’ I said, ‘we know about the geometry and kinematics of every kind of fault known to mankind.’ Tuzo grinned and produced a simple
colored folded-paper version of his now-famous ridge/transform/ridge model and proceeded to open and close, open and close it with that wonderful smile on his face.” (Simon Winchester, Krakatoa: The Day the World Exploded: August 27, 1883, Viking Press, 2003.) 607. In the 1960s, says Duncan Agnew in his paper on the history of seismology (which you might remember from Chap. 6), “seismology became what physics had been since the 1940s, a science viewed as relevant to national security—in this case not for the weapons it could build, but for those it could detect.” After the Americans and the Soviets acquired nuclear arsenals sufficient to disintegrate the planet multiple times, there finally was some general international agreement that the so-called proliferation of nuclear weapons should end, and nuclear tests should be banned: and but it was also recognized that it didn’t make much sense to ban them if one didn’t have some reliable way of monitoring them—of checking whether people weren’t doing their tests in secret. “A US government panel”, writes Agnew, “recommended a large-scale program of ‘fundamental research in seismology.’ This resulted in the creation in 1960 of the VELA-UNIFORM program—which, though a project funded by the US government, provided support to a large number of seismologists outside the United States. [...] A large fraction of the VELA-UNIFORM funds went for improved instrumentation, including a considerable amount for seismic array development [...]. VELA-UNIFORM’s most important instrumental contribution to seismology was certainly the World Wide Standard Seismograph Network (WWSSN). This provided seismologists, for the first time, with easy access to records from standardized and well-calibrated sensors spread around the world.” 608. We’ve learned in Chap. 6 that the first vibration to come out of an earthquake rupture is the P, AKA compressional wave. 
Imagine an earthquake occurs, and you are standing near the fault; imagine that, the way the fault breaks, the side of the fault that you stand upon is slipping towards you: then, the P wave will initially push you away from the epicenter (then, because it’s a wave—oscillatory motion—it’ll pull you back towards your initial position, etc.). If, on the other hand, your side of the fault is slipping away from you, then the P wave will initially pull you towards the epicenter, etc. Now consider that, in general, we’ve got quite a lot of seismic instruments around quake epicenters: both near and far away. After a quake, you can look at P-wave displacement all over the place, and check which way the first motion goes—away or towards the epicenter. For each P-wave observation, trace the P-wave propagation path between source and receiver; then, you can project the sign of the first motion—push or pull—onto an imaginary sphere, whose center coincides with the earthquake source—to be precise, we should speak of hypocenter rather than epicenter, see Chap. 6, although most quakes are relatively close to the earth’s surface so that, oftentimes, the distinction is not that important. So, anyway, what you get from this exercise is a sphere, centered at the hypocenter, and the surface of the sphere is, say, black in places where an observer would be pushed away from the source, and white in places
where she would be pulled towards it: see Fig. N.48. This is what Sykes means by “quadrant distribution of first motions”. Today, people call diagrams like that on the left side of Fig. N.48 “focal mechanisms”—or, for obvious reasons, “beachballs”. It’s not difficult to see how the beachball is oriented, for each of the three basic fault geometries that I was telling you about: look at Fig. N.49.

Fig. N.48 Simplified sketch of (left) slip along a fault, and (right) the associated focal mechanism AKA beachball. On the left, the thick gray line shows where the fault is—the fault is, of course, a plane, and what we are looking at is a section along a plane that is perpendicular to the fault; the arrows show mass displacement. The labels “T” and “C” stand for “tension” and “compression”, respectively

Fig. N.49 For each basic type of fault (left to right, as indicated), from top to bottom: a sketch of the fault’s geometry; the beachball as would be seen from the same viewpoint as for the sketches; the beachball as would be plotted on a geographic map (i.e., seen from below)
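The quadrant pattern itself can be sketched in a few lines. For the simplest geometry, a vertical strike-slip fault observed at the surface, the far-field P-wave polarity varies with station azimuth as sin 2(azimuth − strike), which is exactly what carves the beachball into four alternating quadrants. A minimal sketch (the function name and the sample angles are mine):

```python
import math

def first_motion(azimuth_deg, strike_deg):
    """P-wave first-motion polarity for a vertical strike-slip fault.

    The far-field P radiation goes as sin(2 * (azimuth - strike)):
    positive means compression ("push", a black quadrant of the beachball),
    negative means dilatation ("pull", a white quadrant).
    """
    x = math.sin(2.0 * math.radians(azimuth_deg - strike_deg))
    return "push" if x > 0 else "pull"

# Stations at various azimuths around a fault striking due north (0 deg):
# the polarities alternate push/pull/push/pull around the source.
for az in (30, 120, 210, 300):
    print(az, first_motion(az, 0.0))
```

For dipping faults and non-horizontal takeoff angles the full double-couple radiation pattern is needed, but the quadrant structure survives.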
Fig. N.50 Beachballs across the globe. What you see are all quakes bigger than a certain size (seismic magnitude of 5 or more, in case you are wondering, but don’t worry if you don’t know what that means—it might be important, yes, depending on what you do for a living, but you don’t need it to understand the rest of this book), recorded globally between January 1st, 1988, and January 1st, 1990. The data are from the CMT catalog: see www.globalcmt.org to find out all about it
People started plotting beachballs on geographic maps, each beachball at the location of its quake’s epicenter. It was agreed that the lower hemisphere of the beachball would be shown—as if you were looking at the beachball from the center of the earth: which is not the most obvious choice, if you ask me, but that’s the way it is. Images like that in Fig. N.50 confirmed that the orientation of earthquake faults depends in a straightforward way on relative plate motions, with, like, strike-slip events being predominant in places like San Andreas, or along fracture zones; normal faults at rifts; thrust faults at convergent margins.
609. Philosophical Transactions of the Royal Society of London, Series A, Mathematical and Physical Sciences, vol. 258.
610. Euler’s rotation theorem, AKA Euler’s fixed-point theorem, states that “when a sphere is moved around its centre it is always possible to find a diameter whose direction in the displaced position is the same as in the initial position.” By diameter Euler means a diameter of the sphere, i.e., two points on the surface of the sphere that are exactly 180° away from one another. So, yes, he and Bullard are saying the same thing, although, to be precise, there’s always two points that stay fixed—an axis, as Bullard himself is about to say. Euler’s proof is essentially a difficult exercise in trigonometry—which is one of the things that I had decided to leave out of this book.
611. That doesn’t mean that, to go from one position to the other, the plate must do a single rotation: but it certainly can do it through a single rotation, no matter what the starting and ending positions are.
612. As in distance defined on a sphere, i.e., along a great circle.
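Euler’s theorem (note 610) is easy to check numerically, by the way: a rotation matrix always has an eigenvector with eigenvalue 1, and that eigenvector is the fixed axis, even when the rotation is built by composing rotations about different axes. A quick sketch, assuming NumPy is available (the two rotation angles are arbitrary):

```python
import numpy as np

def rot_z(a):
    """Rotation by angle a (radians) about the z axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rot_x(a):
    """Rotation by angle a (radians) about the x axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

# Compose two rotations about different axes: the result is still a single
# rotation, so it must leave some axis (Euler's "diameter") unchanged.
R = rot_x(0.7) @ rot_z(1.1)

w, v = np.linalg.eig(R)
axis = np.real(v[:, np.argmin(np.abs(w - 1))])  # eigenvector for eigenvalue 1
print(np.allclose(R @ axis, axis))               # the axis is fixed by R
```

The other two eigenvalues come out as a complex-conjugate pair, e^{±iθ}, with θ the net rotation angle about that fixed axis.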
613. I guess one could define discrepancy based on only the primed, or only the unprimed φn’s. But Bullard says that “for symmetry the mean of the two was used”.
614. This is essentially the least-squares criterion, though Bullard doesn’t call it that in his paper. We’ll meet least squares again very soon.
615. I don’t think I’ve ever spelled out, in this book, yet, that looking for places where a function has a minimum, or a maximum, is the same as looking for places where its derivative is zero (provided, of course, that its derivative exists). Because, think about it, a place where a function (a function whose derivative exists: i.e., a function that is continuous and smooth) has a maximum or a minimum is a place where that function is neither growing nor shrinking: at that location, then, the function’s derivative can only be zero.
616. Which he doesn’t call continental drift anymore, but he doesn’t call it “plate tectonics” yet, either.
617. See note 608.
618. Bryan Isacks, Jack Oliver, Lynn R. Sykes, “Seismology and the New Global Tectonics”, Journal of Geophysical Research, vol. 73, 1968.
619. This is horribly vague, I know; but we’ll get to that shortly.
620. If you try to redo this, be very careful with the units. For instance, remember that κ is given in m² over seconds, so t needs to be converted to seconds.
621. Anyway, in case, here is how I would do it. First, the integral of the error function between 0 and some real number, call it z, is relatively easy to do by parts, if you consider that erf(x) is like erf(x) multiplied by 1, and but 1 is the derivative of x, and the derivative of erf(x) is
$$
\begin{aligned}
\frac{d}{dx}\,\mathrm{erf}(x) &= \frac{2}{\sqrt{\pi}}\,\frac{d}{dx}\int_0^x e^{-t^2}\,dt\\
&= \frac{2}{\sqrt{\pi}}\,\lim_{\delta x\to 0}\frac{1}{\delta x}\left(\int_0^{x+\delta x} e^{-u^2}\,du-\int_0^{x} e^{-u^2}\,du\right)\\
&= \frac{2}{\sqrt{\pi}}\,\lim_{\delta x\to 0}\frac{1}{\delta x}\int_x^{x+\delta x} e^{-u^2}\,du\\
&= \frac{2}{\sqrt{\pi}}\,e^{-x^2}.
\end{aligned}
$$
So, then, integrating, like I was saying, by parts, gives
$$
\begin{aligned}
\int_0^z \mathrm{erf}(x)\,dx &= \big[x\,\mathrm{erf}(x)\big]_0^z-\frac{2}{\sqrt{\pi}}\int_0^z x\,e^{-x^2}\,dx\\
&= z\,\mathrm{erf}(z)+\frac{1}{\sqrt{\pi}}\big[e^{-x^2}\big]_0^z\\
&= z\,\mathrm{erf}(z)+\frac{1}{\sqrt{\pi}}\,e^{-z^2}-\frac{1}{\sqrt{\pi}}
\end{aligned}
$$
(because $-2xe^{-x^2}$ is the derivative of $e^{-x^2}$). This result helps us to work out the integral of $1-\mathrm{erf}$, AKA the complementary error function, AKA erfc:
$$
\begin{aligned}
\int_0^z \big[1-\mathrm{erf}(x)\big]\,dx &= z-\int_0^z \mathrm{erf}(x)\,dx\\
&= z\big[1-\mathrm{erf}(z)\big]-\frac{1}{\sqrt{\pi}}\,e^{-z^2}+\frac{1}{\sqrt{\pi}}.
\end{aligned}
$$
Now, when z goes to infinity, we know that $1-\mathrm{erf}(z)$ goes to zero very quickly, much quicker than z goes to infinity: and so, even without proving this in any rigorous way, we are pretty sure that
$$
\lim_{z\to\infty}\Big\{z\big[1-\mathrm{erf}(z)\big]\Big\}=0.
$$
Also, $\lim_{z\to\infty} e^{-z^2}=0$, and so we are left with
$$
\int_0^\infty \big[1-\mathrm{erf}(x)\big]\,dx=\frac{1}{\sqrt{\pi}},
$$
QED.
622. The name, “plate model”, is ambiguous to me. I think the “plate” in question would be the layer between the ocean bottom and the depth where T = T0? But that obviously isn’t supposed to coincide with the lithosphere (except at large t), which we might still pick to be an isotherm like in the HSCM. And “plate” and “lithosphere” are pretty much synonyms in plate tectonics’ language. But anyway.
623. In “Study of Shear-Velocity Distribution in the Upper Mantle by Mantle Rayleigh Waves”, by James Dorman, Maurice Ewing, and Jack Oliver, Bulletin of the Seismological Society of America, vol. 50, 1960. You see how, when it comes to mapping the “subsurface” under the ocean, Maurice Ewing is everywhere. In 1957 he also published, together with Wenceslas S. Jardetzky and Frank Press, a book called Elastic Waves in Layered Media, where all the theory behind that kind of studies is explained, I would say, pretty well. It was published by McGraw-Hill and but I don’t think it’s ever been reprinted.
624. After making sure that it’s there, on seismological grounds, Gutenberg tries to figure out why the LVZ exists. In a 1954 paper (Bulletin of the Geological Society of America, vol. 65) he writes: “At the depths of the asthenosphere channel the temperature is not far from the melting point, and the effect of increase in temperature with depth, which decreases the wave velocity, is probably not compensated by the opposite effect of the increase in pressure, whereas above and below the channel the pressure effect prevails.” I am not sure how clear the sentence is to you—I always have to re-read it a couple times—but basically, remember that (i) rising temperature alone (with no change in pressure)
reduces seismic speed; but (ii) increasing depth makes both temperature and pressure rise, and then in general (iii) their combined effect (of T and p) is to increase seismic speed. But what Gutenberg is saying here is that maybe what happens in the LVZ is that the growth in temperature with depth is somewhat faster than elsewhere, and the growth in pressure somehow slower, and then as a result seismic speed might decrease with depth instead of growing. In a 1959 paper (“The asthenosphere low-velocity layer”, Annals of Geophysics, vol. 12), he mentions a contribution by one G. S. Gorshkov, who also has observed the LVZ in seismic data (Bulletin of Volcanology, vol. 19, 1958) and has written that “the most probable reason for this phenomenon is a transition from a crystalline state into an amorphous condition”. Whatever an “amorphous condition” might be, it’s more than just being “not far from the melting point”: it sounds like the stuff that makes up the LVZ is not exactly a “normal” solid; and Gutenberg comments: “These results give additional weight to the over fifty year old concept of the asthenosphere as a relatively weak layer, in which gradual movements, subcrustal currents, movements to make isostasy possible, and movements of extended portions of continental blocks relative to the main portion of the mantle may take place more easily than in the lithosphere.” Meaning, Gutenberg suspects that both the weakness of the layer and the fact that seismic waves propagate through it at lower-than-average speed, both are effects of some kind of transformation in the, who knows, “structure” of matter in the layer? Kind of like what happens at the transition zone? In the following years people became convinced—which I guess most of us still are—that LVZ and asthenosphere are one and the same, and developed a more precise idea of what is going on, involving a chemical phenomenon called “partial melting”. 
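Before moving on: the erfc integral worked out in note 621 is easy to check numerically, with nothing fancier than a crude midpoint rule (truncating the integral at x = 10 is safe, because erfc decays like a Gaussian):

```python
import math

# Numerical check of note 621's result: the integral of the complementary
# error function erfc(x) = 1 - erf(x) from 0 to infinity equals 1/sqrt(pi).
# Midpoint rule on [0, 10]; the tail beyond x = 10 is utterly negligible.

def erfc_integral(upper=10.0, n=100_000):
    dx = upper / n
    return sum(math.erfc((i + 0.5) * dx) for i in range(n)) * dx

print(erfc_integral())         # numerical estimate
print(1 / math.sqrt(math.pi))  # exact value from note 621
```

The two printed numbers agree to several decimal places, which is as much as a midpoint rule with this step size can promise.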
To understand what that is, you have to think that a rock is not a “pure” substance—like, say, water, which freezes and/or melts at 0 °C, which is its melting point, and that’s it—but the combination of a bunch of minerals855, each with its own properties. Simply put, different minerals might tend to melt at different temperatures, so, as a result, when you are dealing with rocks, it doesn’t really make sense to speak of one melting point. What happens instead—and people have been measuring this stuff in laboratories—is, for a given rock you can identify a solidus, which is the temperature below which the entire rock is solid, and a liquidus, which is the temperature above which the entire rock is fluid, and usually solidus and liquidus differ, and in between you’ve got a combination of solid and molten material: and this is partial melting. Solidus and liquidus both depend on pressure; as a general rule, both grow with increasing pressure. In the mantle, the solidus is increased by more than 100 °C per GPa of pressure, while the adiabatic temperature gradient is about ten times smaller; in the lithosphere, though, as you know by now—at least in the oceanic lithosphere—the T gradient is much bigger. So, anyway, what might happen, and probably happens, is shown in Fig. N.51.

Fig. N.51 The “geotherm” (solid line) is the earth’s actual temperature as a function of pressure (which is pretty much the same as saying, as a function of depth). At each pressure, the “solidus” is the temperature below which all earth materials are solid, while the “liquidus” is the temperature above which all those materials are liquid. Between the earth’s surface—approximately zero pressure and zero temperature—and the depth/pressure where the geotherm intersects the solidus, rock is all solid—that’s the lithosphere. Below that depth, we have partial melting (which is what the gray area is meant to represent): that’s the asthenosphere. Eventually, the geotherm intersects the solidus again: the asthenosphere is a relatively thin layer, below which there’s no partial melting anymore

The temperature of rock in the mantle is described by the adiabatic geotherm, AKA the “adiabat”. Now, follow the geotherm from right to left. As pressure
goes down (depth is reduced), at some point the adiabat meets the solidus, meaning the temperature of the rock is such that some of the rock (one or more minerals, I guess) is molten. Pressure keeps going down, but before we get to a point where a large percent of the peridotite is molten, temperature starts to go quickly down: and soon the geotherm hits the solidus again and we are back to a combination of pressure and temperature where the rock is entirely solid. So then, if all this is true, it follows that there’s a range of pressures, which is the same as saying a range of depths (remember that pressure and depth always go together in the earth), between the two intersections of solidus and geotherm, where the rocks that make up the earth are partially molten. That range of depths corresponds more or less to where the low-velocity zone has been observed to be. It also turns out (from laboratory experiments) that partial melting reduces the speed of seismic waves. There’s quite a bit of uncertainty on that—as well as on the chemical and mineral composition of rocks at those depths, because it goes without saying that we don’t have direct samples—so you can’t really be sure, but it seems that the reduction in seismic speeds that one would extrapolate
from lab data is, indeed, about the same as the reduction that seismologists observe in the LVZ. And it turns out, from even more experiments that people did, that partial melting also reduces the viscosity of rocks. Here the uncertainty is huge, like, entire orders of magnitude of viscosity for a few percent of more or less melt856: but it seems pretty clear that there should be an effect of this kind. So, bottom line: partial melting is likely to happen in about the same depth range where we find seismic velocities to be low (the LVZ); if it does happen, it is likely to cause a reduction in seismic velocity comparable to the reduction we observe; and finally, it would also reduce viscosity in the same depth range. And with all that, I would say that we have at least a reasonable idea of what the asthenosphere is, and why it is where it is found to be, and not elsewhere. Incidentally: I am not that much of a petrologist/geochemist/rock physicist—whatever you need to be to know about this stuff; but if you want to look further into it, I guess I liked Claude Allègre’s explanations in Chaps. 5 and 7 of The Behavior of the Earth. Allègre mentions Peter Wyllie and Don Anderson as (some of) the people who first came up with these ideas back in the 1960s. Not all people who looked into this share the same views, though (check out, e.g., Shun-Ichiro Karato’s book The Dynamic Structure of the Deep Earth); if, today, you go to a meeting where people discuss what the asthenosphere is, etc., there are probably going to be arguments.
625. If you really want to see in some detail how that is done, check out Haskell, “The Dispersion of Surface Waves on Multilayered Media” (Bulletin of the Seismological Society of America, vol. 43, 1953).
626. Today we tend to think that the lithosphere-asthenosphere boundary isn’t really a “sharp” discontinuity—a surface across which vS and vP jump from a value to another that is quite different: like at the Moho, for instance, or at the core-mantle boundary. Rather, it is a relatively slow transition. If you try to model it with a discontinuity, like in Dorman’s and Leeds’ papers, what you are doing is you are making an approximation that maybe is not so great: and it follows that the “inversion” results don’t describe the real world that well. That might explain why Leeds’ data don’t look very conclusive—although they do show clearly that there’s pretty much no lithosphere at and near the ridge, and but the lithosphere, whatever exactly that is, does thicken as its age and distance from the ridge grow. Anyway, do try and go ask your local geophysicist about the depth of the lithosphere-asthenosphere boundary. My guess is that you are not going to receive a very precise answer.
627. D. P. McKenzie, J. M. Roberts and N. O. Weiss, “Convection in the Earth’s Mantle: Towards a Numerical Simulation”, Journal of Fluid Mechanics, vol. 62, 1974.
628. There’s much more than this, as you might guess, that one could say about the continental lithosphere. There is a whole theory of the continental lithosphere, different and, I would say, more complicated than that of the oceanic lithosphere. The paper on continental lithosphere that we probably need to be
aware of is Tom Jordan’s “Composition and development of the continental tectosphere”, published in Nature in 1978. Part of my Ph.D. thesis was about looking at seismic data to find out how thick the lithosphere is under continents—similar, actually, to what I just showed you—but with much more data. So, to make some sense of what I was doing, I had to try and read Jordan’s paper early on in my “career”. With all due respect for Tom Jordan’s writing skills, and probably because of my incompetence at the time (by which I don’t mean that I am not incompetent today, but I was certainly much more incompetent then than I am today), Jordan’s reasonings were quite obscure to me, and stayed so for a while. I’ll try to make it simpler for you and first give you an overview of what the paper does; then I’ll dig into some of the details. First, Jordan shows some estimates of the earth’s temperature at relatively large depth: values of T(z) under continents obtained from a thermal model (so far we’ve seen thermal models of the oceanic lithosphere, but, like I said, the continental lithosphere is a different story, which I’ll tell you about in a minute), and from xenolith data. The latter are info that are extrapolated from rocks found at the surface of the earth, and I’ll try to explain to you how that works, too. We’ve seen that under the oceans the T(z) curve is close to the error function, which means in practice a very rapid growth of T with growing z as long as z is relatively small, say less than 100 km, and then—at larger depth—almost no change in T with z. 
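That error-function behavior is cheap to play with numerically. A sketch with round illustrative numbers of my own (surface and mantle temperatures, diffusivity, plate age are not from the text), minding the unit conversion that note 620 warns about:

```python
import math

def hscm_temperature(z_m, t_myr, Ts=0.0, Tm=1350.0, kappa=1e-6):
    """Half-space-cooling-style geotherm T(z) = Ts + (Tm - Ts)*erf(z / (2*sqrt(kappa*t))).

    kappa is in m^2/s, so the age t (given here in millions of years)
    must be converted to seconds -- the unit trap of note 620.
    """
    t_s = t_myr * 1e6 * 365.25 * 24 * 3600.0
    return Ts + (Tm - Ts) * math.erf(z_m / (2.0 * math.sqrt(kappa * t_s)))

# Temperature at a few depths under 60-Myr-old ocean floor: steep growth
# in the first ~100 km, then nearly flat, as described in the text.
for z_km in (0, 25, 50, 100, 200):
    print(z_km, round(hscm_temperature(z_km * 1e3, 60.0)))
```

With these numbers the length scale 2√(κt) comes out near 90 km, which is why the “knee” of the curve sits around 100 km depth.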
Below about 100 km, actually, we don’t think that’s the lithosphere anymore: it’s the asthenosphere, and then the mantle, and there’s convection going on there, with little or no conduction of heat (remember the Rayleigh-number story): so, below the base of the lithosphere, wherever exactly that is, but not so far from 100 km, we should replace the error function with the adiabatic T(z), which we can calculate: just take the value of T at 100 km (from the HSCM or plate model), and plug it into the formula for the adiabatic gradient, Eq. (7.142). Knowing what we know of mantle composition, we can make some estimates for all parameters involved in that formula. So that gives us a model for the temperature profile, as they say, in the sub-oceanic upper mantle below the lithosphere and down to the transition zone. In practice, the adiabatic “gradient” ∂T/∂z in the mantle is small, i.e., T grows slowly with depth below the lithosphere—about half a degree K per km. As for the continents, the data and models that Jordan reviews for his Nature paper (and that I am going to tell you about in a minute; but first, like I said, let’s see what the main points of Jordan’s paper are) all agree that, above 100 km, the T(z) curve under continents is less steep than that of T under the oceans; below 100 km (where the “sub-oceanic thermal gradient” is adiabatic) the opposite is true and the curve under continents is steeper than that under the oceans. To see what that means, look at Fig. N.52.

Fig. N.52 If you take a bunch of different models of temperature in the mantle, some for under the oceans and others for under the continents, and plot their averages, with some error bar based on their standard deviation, you are likely to see something like this

One implication of this is that the continental lithosphere might be thicker than the oceanic lithosphere. Jordan takes the base of the lithosphere to be at the depth where the mode of heat transfer changes; what I mean by that is that in the lithosphere we have that
relatively steep conductive T(z) curve; in the mantle we have the “adiabat”, which is the same whether it’s under oceans or continents857: so, extrapolate the sub-continental lithosphere T curve you got from your thermal model and xenolith data; trace the mantle adiabat; find their intersection: that’s the base of the continental lithosphere. In the sketch of Fig. N.52 that happens at almost 300 km depth. Could be more than that, or less, depending which continent (or which part of a continent) we’re looking at. From the fact that continents have deep cold roots Jordan makes another inference, which is the main twist of his paper. As a general rule, colder means denser: if the lithosphere has the same chemical composition under oceans and continents, then it follows from the results in Fig. N.52 that the continental lithosphere is denser, perhaps much denser than the oceanic lithosphere. Jordan then does the isostasy exercise that we’ve seen a couple times already in this book: if, as usual, α is the volumetric coefficient of thermal expansion858, then the relative density difference between oceanic and continental lithosphere is δρ/ρ = −α δT(z), where δT is the “ocean-continent temperature contrast” (which you can read from Fig. N.52) at depth z. The absolute density difference at z, then, is −α δT(z) ρ; the mass excess per unit area (which, if the asthenosphere is doing its job and everything is isostatically compensated, should be the same under oceans versus continents) is the integral of that over z. To do the integral for an average continent, Jordan takes the top integration limit at 40 km, i.e. the average continental Moho, more or less859; the bottom integration limit is the bottom of the lithosphere, defined as I just told you. So, Jordan works this out, and finds that oceanic and continental plates are not
isostatically balanced: in fact, he finds the mass-excess-per-unit-area to be about 10⁷ kg m⁻², meaning that the pressure at the base of a continental root (that mass excess times g, so roughly 10⁸ Pa) is higher than it is, at the same depth, under an ocean. And of course, says Jordan, this can't be. Because if this, at a given moment in time, were true, the continental lithosphere would just sink, to be replaced with warmer-and-therefore-buoyant "average" mantle. So but then, we need to figure out what's wrong in the above reasoning, i.e., "what could stabilise the large sub-lithospheric thermal gradients against convective disruption"? Or, what's the factor that's been left out, in the above derivation, that, when we put it back in, shows that oceanic and continental plates are isostatically balanced? Jordan's answer to this is: "chemical variations. [...] Thermally produced mass excesses" under the continents "are locally cancelled by chemically produced mass deficiencies". Simply put, if you take two equal volumes of the same material and bring each to a different temperature, their weight will differ. But, if you take equal volumes of two different materials at different temperatures, then you can't be sure that their weight will differ: because temperature and chemistry might conspire to reduce the difference to zero. Imagine you could have a pair of samples of lithospheric mantle rock: a sub-continental one and a sub-oceanic one. They are both in your lab, at the same temperature, and they occupy equal volumes. Imagine that, under such conditions, the sub-continental one is lighter than the sub-oceanic one. Imagine you bring the sub-continental one to a much lower temperature: it will shrink and become denser. For some value of T, it will acquire the exact same density as the sub-oceanic one.
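As a sanity check on the order of magnitude, here is the integral done numerically — with a made-up, constant δT of 400 °C between 40 and 250 km standing in for a Fig. N.52-type profile. All numbers below are illustrative, not Jordan's actual model:

```python
# Sketch of Jordan's mass-excess-per-unit-area integral.
# delta_T(z), the ocean-continent temperature contrast, is a made-up
# placeholder (constant 400 C); real profiles vary with depth.

ALPHA = 3e-5     # volumetric thermal expansion coefficient, 1/C
RHO = 3300.0     # mantle density, kg/m^3
Z_TOP = 40e3     # top integration limit: average continental Moho, m
Z_BOT = 250e3    # bottom limit: base of the continental lithosphere, m

def delta_T(z):
    """Ocean-continent temperature contrast at depth z (placeholder)."""
    return 400.0  # C

# midpoint-rule integral of rho * alpha * delta_T over the root
n = 1000
dz = (Z_BOT - Z_TOP) / n
excess = sum(RHO * ALPHA * delta_T(Z_TOP + (i + 0.5) * dz) * dz
             for i in range(n))
print(f"mass excess per unit area: {excess:.2e} kg/m^2")  # order 1e7
```

With these (invented) values the integral comes out to about 8 × 10⁶ kg m⁻², i.e. the order-10⁷ figure quoted in the text.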
Jordan figures that that’s exactly what happens in the real world: “the mineralogical assemblages within the thick continental root zones are less dense than those beneath the oceans if referenced to the same pressure and temperature”: but become just as dense if brought to lower temperature, hence the isostatic balance, etc.860 So, says Jordan: let δT be the temperature contrast between an ocean and a continent, averaged over all depths within the tectosphere. We just saw that if chemical compositions under ocean and continent were the same, then the relative density contrast would be −α δT . Now, that also means that, to have your isostatic balance back, you need some chemical difference, between suboceanic mantle and continental tectosphere, that would cause a density contrast of just about +α δT , canceling out with the above. “The average temperature contrast between the ocean basins and shields861 probably lies in between 300 and 500 ◦ C. Thus for α = 3 × 10−5 per ◦ C the hypothesis requires a compositionally induced density difference between −0.9 and −1.5% at these depths.” Now the question is, is this OK from the point of view of chemistry? Yes, apparently: what Jordan does is, he comes up with an estimate of what minerals we expect to have—the “mineral composition”—in the continental tectosphere, and finds that if you just take basalt away from that, the residue would be less dense than the parental material. So, if, for whatever reason, the continental tectosphere is depleted in basalt with respect to the oceanic mantle at the same depth, then “the continental tectosphere can be in hydrostatic equilibrium with
its surrounding mantle." To check whether this is really the case, though, we need to know how depleted in basalt the continental tectosphere is, and what its density is as a result—we need a good estimate of chemical composition in the continental tectosphere. Of course, Jordan has a way to get that estimate, and to show, then, that indeed the mantle under continents has got less basalt per unit volume than is found, on average, anywhere in the mantle. He then calculates the density, at the p and T of the mantle, of his basalt-depleted tectosphere, and finds that it is lower than that of the average mantle: which confirms his whole scheme. So this, now, is where I start digging into the details. Because before I tell you about how Jordan figures out composition and mineralogy, I need to tell you more about xenoliths—which are involved in that, too. And before I cover xenoliths, I need to give you some more info on thermal models of the continental lithosphere—like, the continental version of the oceanic HSCM or plate model. So, if you're still with me after this introduction, let's start with a good recipe for continental thermal models. The math is almost the same as what we've done under the oceans: still based on the heat transfer equation. But, for one, continental plates are not made of young stuff that cools as it ages: so we can forget about the temperature at the ridge versus temperature at the margin, and all that. Second, the continental crust is both thicker and more radioactive than the oceanic crust: which means that some significant amount of heat is released by radioactive decay within the continental crust: which is not simply passively cooling down, then. Strictly speaking, this is true also of the oceanic crust, as far as we can tell, but its main constituent, basalt, is less radioactive than granite, which is what continents are mostly made of; plus, the oceanic crust is, like, five times thinner, on average?
So it was OK to neglect that in the oceanic HSCM and plate model. In continental models that you might find in textbooks or in the literature, though, radioactive heat production in the crust is not neglected, and I am going to stick to that. As for below the crust, there seems to be some general consensus that radioactivity is weaker862, both under continents and oceans; but since we are going to learn how to include it in our thermal theory, we might as well take that into account, too: again, that's what people do in textbooks and stuff. So, then, the mathematical model is basically like HSCM, except we need to add heat production from within. And here is how it's done: differentiate Fourier's law of heat conduction, Eq. (4.39), with respect to depth, which gives
$$\frac{\partial F}{\partial z} = -k\frac{\partial^2 T}{\partial z^2}, \qquad (N.206)$$
so that the net heat flux coming in or out of some arbitrary layer of thickness δz would be
$$\delta F \approx -k\frac{\partial^2 T}{\partial z^2}\,\delta z. \qquad (N.207)$$
By conservation of energy, the net flux out of the layer has to coincide with the net amount of heat produced by the layer: so if we call H the amount of heat produced per unit mass per unit time, then we have δF = ρH δz (where ρ, as usual, is mass density), and if we plug that into (N.207) we get
$$\rho H \approx -k\frac{\partial^2 T}{\partial z^2}, \qquad (N.208)$$
because δz cancels out. Now, if we take ρ and H to be constant with depth (you'll see in a second why that is actually OK for our current purposes), then solving Eq. (N.208) is a relatively trivial problem. Let's add some boundary conditions: let's say we know T and F at zero depth; denote their values, there, T₀ and F₀, so the b.c.'s are T(0) = T₀, F(0) = F₀. Then, integrate both sides of (N.208) once, and get
$$\rho H z \approx -k\frac{\partial T}{\partial z} + c_1, \qquad (N.209)$$
where c₁ is an arbitrary constant. But wait: we've got a b.c. that says that F(0) = F₀; but F(0) is just the value taken by −k ∂T/∂z (Fourier's law of heat, again) when z = 0. So, then, at z = 0, (N.209) becomes
$$0 \approx -F_0 + c_1, \qquad (N.210)$$
from which it follows that c₁ ≈ F₀, and the arbitrary constant is not arbitrary anymore. Plug that into (N.209) and integrate a second time. You get
$$\frac{\rho H z^2}{2} \approx -kT(z) + F_0 z + c_2, \qquad (N.211)$$
with a brand new arbitrary constant that I called c₂. But play the same trick again: at z = 0 we are left with c₂ ≈ kT(0), and so c₂ can be replaced with kT₀. The unique (approximate) solution to our problem, then, is
$$T(z) \approx -\frac{\rho H z^2}{2k} + \frac{F_0}{k}z + T_0. \qquad (N.212)$$
Equation (N.212) is the model we were after863 : if you know temperature and flux at zero depth, and have estimates of mass density ρ, rate of heat production H and specific conductivity k in the layer you want to look at—which, in our case, is the lithosphere—if you have numbers for all that, then you can implement (N.212) to find T (z) anywhere in the layer. Actually, though, if we want to model the lithosphere, that means that our “zero depth” is where the Moho is: so but we can’t just go out and measure864 T0 and F0 . What we can do is, we can solve the above for within the crust, first: because we have many measurements of T and F at the surface of continents.
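The two-step recipe — crust first, then mantle lithosphere from the Moho boundary conditions — can be sketched in a few lines. All the parameter values below (surface flux, conductivities, heat productions, Moho depth) are plausible craton-ish numbers I picked for illustration, not values from any particular study:

```python
# Conductive geotherm with uniform heat production, Eq. (N.212):
#   T(z) = T0 + (F0/k) z - (A/(2k)) z^2,   with A = rho*H  (W/m^3),
# and the flux that goes with it: F(z) = F0 - A z.
# Step 1: solve in the crust from surface data; step 2: reuse the resulting
# Moho T and F as boundary conditions for the mantle lithosphere.
# All numbers are illustrative, not from any specific study.

def geotherm(z, T0, F0, k, A):
    """Temperature at depth z below the layer's top (SI units, T in C)."""
    return T0 + (F0 / k) * z - A * z**2 / (2.0 * k)

def flux(z, F0, A):
    """Heat flux at depth z below the layer's top."""
    return F0 - A * z

# step 1: the crust (surface T and F are measurable)
T_surf, F_surf = 10.0, 45e-3        # C, W/m^2 (craton-like surface flux)
k_crust, A_crust = 2.5, 0.6e-6      # W/m/C, W/m^3
moho = 40e3                          # m
T_moho = geotherm(moho, T_surf, F_surf, k_crust, A_crust)
F_moho = flux(moho, F_surf, A_crust)

# step 2: the mantle lithosphere, restarting from the Moho
k_man, A_man = 3.0, 0.05e-6
T_200 = geotherm(160e3, T_moho, F_moho, k_man, A_man)  # 200 km total depth

print(f"Moho: T = {T_moho:.0f} C, F = {F_moho*1e3:.0f} mW/m^2")
print(f"T at 200 km depth: {T_200:.0f} C")
```

With these made-up inputs the Moho comes out around 540 °C with 21 mW/m² flowing in from below, and T at 200 km lands near mantle-adiabat values — consistent with a continental lithosphere a bit less than 200 km thick, as in Fig. N.53.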
The other parameters, ρ, H and k, we can figure out from seismic models, i.e., estimates of v_P, v_S versus depth. Because we have lab experiments that tell us what rocks (what chemical composition, etc.) could correspond to given values of v_P, v_S in the crust; we have more lab data that tell us how dense (remember Chap. 7), but also how conductive and how radioactive different rocks are: so, put all that together, and we have all the parameters we need to implement (N.212) and find T(z) in the crust. Now, if we know T(z) everywhere in the continental crust, that means we also know T at the Moho, and from Fourier's law of heat we also know F at the Moho. Which means that, now, we do have the boundary conditions that we needed, at the top of the mantle lithosphere. With this, we are ready to find T(z) in the lithosphere, again via (N.212). The way we come up with estimates of ρ, H and k in the continental lithosphere is still the same as for the crust, although, as a general rule, the deeper we go into the earth, the more our estimates of composition are based on speculation rather than observation. Anyway, people seem to agree that the continental lithosphere isn't very radioactive; what I mean by this is that, while the average radioactive heat production in the continental crust is thought865 to be between 0.4 and 0.7 µW/m³, which is considered to be high, in the continental lithosphere it is one order of magnitude lower: between 0.02 and 0.10 µW/m³. When all the algebra is done, one ends up with a "geotherm" that should look more or less like the continental curve in Fig. N.52. With quite a bit of uncertainty, though, like I was saying, because how good is our estimate of H? How safe is it to assume that H doesn't change with depth in the crust and/or in the lithosphere? So what Jordan does in his paper—and that's the other thing I wanted to tell you more about—is he substantiates this thermal model with some totally independent data: that is, xenoliths.
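To get a feel for how much the poorly known H matters, sweep the crustal heat production across that quoted range (all other parameters are, again, values I made up for illustration) and watch the computed Moho temperature move by about a hundred degrees:

```python
# Sensitivity of the Moho temperature, via Eq. (N.212), to the assumed
# crustal heat production A = rho*H. Other parameter values are illustrative.
T0, F0, k, z = 10.0, 45e-3, 2.5, 40e3   # C, W/m^2, W/m/C, m

def moho_T(A):
    """Moho temperature for a given volumetric heat production A (W/m^3)."""
    return T0 + (F0 / k) * z - A * z**2 / (2.0 * k)

for A in (0.4e-6, 0.55e-6, 0.7e-6):     # W/m^3: the quoted crustal range
    print(f"A = {A*1e6:.2f} uW/m^3  ->  T_Moho = {moho_T(A):.0f} C")
# the spread across the range is ~100 C: the geotherm is only as good
# as the heat-production estimate that goes into it
```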
A xenolith is some relatively small rock sample that's found embedded in some other rock with which it has very little or nothing in common. That's what the word stands for: xeno is Greek for "alien", as in xenophobic, etc. Some igneous rocks that we find at the earth's surface contain xenoliths whose chemical composition, given what we know about the composition of the mantle (remember Chap. 7), suggests they have been carried all the way up from down there866. This is important, because we don't have much direct evidence of what the mantle (incl. the lithosphere, below the crust) is like. So, it appears that these guys are somehow dragged by the magma (which is a totally different rock, different chemistry etc.) in its ascent towards the surface; they do not melt (except a tiny bit along the surface of contact with the magma) and so they stay as they were when they sat down there in the mantle—which means that their structure is a structure that formed, i.e., that they solidified, at the pressure and temperature of the mantle. So, let's say that I find a number of mantle xenoliths: each with its own somewhat different properties—they come from the mantle, OK, but they are not all exactly the same. In a lab, I can take a sample and play with it, like, heat it until it's completely molten, and then put it under pressure, and change the pressure,
Fig. N.53 Surface heat flux changes from place to place over the globe, and so does the composition of the crust—and, as a consequence, how much heat is produced locally by radioactivity in the crust. Depending on the value of those parameters that you plug into a mathematical thermal model of the continental lithosphere—like, e.g., Eq. (N.212)—you get a different geotherm. In this figure taken from Rudnick et al. (1998), you see a whole bunch of geotherms, obtained in this way from a range of heat-flux values and plausible rates of crustal radioactive heat production: those are the solid lines. The symbols—if you connect the dots—are paleogeotherms derived from mantle xenoliths. They are from three geographic areas only (three “shields”), but if you were to look at more data from elsewhere over the globe, you’d see that they all fall more or less in the same region of the p-T (or depth-T ) diagrams. If the geotherm you get from a given assumed value of radioactive heat production is close enough to the “cloud” of xenolith data, you infer that that’s a good estimate of heat production. “Cratonic peridotite xenolith p-T data points”, says Rudnick, “fall between geotherms corresponding to crustal heat production of 0.4–0.5 µW/m3 , which corresponds to a lithospheric thickness of [slightly] less than 200 km”: which is probably a good rule-of-thumb value for how thick the continental lithosphere is on average. (Remember that that includes the crust, too: so that means that the boundary, whatever exactly that is, between continental lithosphere and asthenosphere is a bit less than 200 km deep.) (Used with permission of Elsevier, from Roberta L. Rudnick, William F. McDonough, Richard J. O’Connell, “Thermal structure, thickness and composition of continental lithosphere”, Chemical Geology, vol. 145, 1998; permission conveyed through Copyright Clearance Center, Inc)
and for each pressure check at what temperature it becomes solid again. This gives me a curve in the pressure-temperature diagram; but doesn’t tell me at what p and T the sample that I started out with was formed. So, this is not enough for me to draw the geotherm under the location where samples were found. There’s another piece of information that I can use, though, to figure that out: depending on the p and T at which the xenolith becomes solid, it acquires a different structure (like, crystalline structure: what earlier on we’ve also called “phase”). As I experiment with xenolith samples in the lab, I will be able to reconstruct all those structures: and comparing such observations with the structure of a xenolith found in the field, I will know at which p and T that xenolith was formed. This is how Fig. N.53 was put together, one data point per xenolith.
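Assembling a paleogeotherm from the individual xenolith points is then just curve fitting. A toy version, with synthetic (depth, T) data and a least-squares straight line — real paleogeotherms use many more points and need not be linear:

```python
# Fit a straight line T(z) = a + b z through xenolith-derived (depth, T)
# points: a toy "paleogeotherm". The data below are synthetic, for
# illustration only (generated from T = 400 + 6.5 z).

points = [(80, 920), (100, 1050), (120, 1180), (140, 1310), (160, 1440)]
# depth in km, T in C: one data point per xenolith

n = len(points)
sz = sum(z for z, _ in points)
st = sum(t for _, t in points)
szz = sum(z * z for z, _ in points)
szt = sum(z * t for z, t in points)

b = (n * szt - sz * st) / (n * szz - sz * sz)   # gradient, C per km
a = (st - b * sz) / n                            # intercept, C

print(f"paleogeotherm: T(z) = {a:.0f} + {b:.2f} z")
```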
Bottom line, if I have enough samples of mantle xenoliths from a given area of the globe, I can put together what people call the "paleogeotherm": a curve showing T as a function of p (and then also as a function of depth, because, again, remember Chap. 7) in the mantle—mostly the top of the mantle: mantle xenoliths usually don't come from deeper than the lithosphere—under that area. This whole game is called "thermobarometry". What Jordan does in his paper is he compares paleogeotherms made from xenoliths with geotherms coming out of mathematical models, and finds that they are compatible—if you pick your model parameters right: see the examples in Fig. N.53. The other thing mantle xenoliths are used for in Jordan's paper is to estimate the density of the continental tectosphere. (Which, remember, is the ultimate piece of information that's needed, to verify that the chemically distinct continental roots postulated by Jordan are isostatically compensated and don't need to sink.) It's clear by now, I hope, that mantle xenoliths carry info about what happens (pressure, temperature, composition...) at least down to 200 km or so depth in the mantle. We know how to figure out the p and T at which a xenolith has solidified. We can also measure the chemical composition of a xenolith. Then, from the chemistry, Jordan extrapolates the mineral composition of a xenolith, at the depth and p and T where it last solidified; and, once the mineralogy is known, he can also estimate the density. He finds, essentially, that the xenoliths he's looking at, which sample the continental tectosphere and are all "garnet lherzolites", are indeed lighter than the average mantle rock one expects to find at the depth those guys came from; and he finds that that density difference is just right so that the temperature anomaly is compensated and there's no mass excess under continents—see above.
629. I keep trying to remind myself to say "mantle lithosphere" instead of just lithosphere, because the crust is part of the lithosphere too. So there's crust and mantle lithosphere, and together they form the lithosphere. But people keep talking of "crust and lithosphere". Which can cause confusion, particularly when you think of depths and thicknesses. People are so sloppy sometimes.
630. The crust and depleted lithosphere layers have approximately constant thickness across an oceanic plate—this much we know, I think, from seismic prospection and stuff. So whether or not they are there doesn't change the isostatic balance we calculated earlier, when we looked at the cooling and thickening of the oceanic lithosphere, and its heat flux and bathymetry, etc. But it does make a difference, as you are going to see shortly, if you compare the average density of an oceanic plate to the density of the asthenosphere underneath—for a while, before it gets too old, the lithosphere is indeed buoyant: which explains why it doesn't sink right away, etc.
631. In the asthenosphere it was partially molten.
632. People call this "decompression melting".
633. The quote is from Claude Allègre, The Behaviour of the Earth.
634. To be "depleted in something" means that you have very little of something, that you've lost a certain amount of it. Implicitly, the depleted lithosphere layer is compared to lithosphere that's found underneath it, that hasn't gone through partial melting and differentiation.
635. There's one important data set that I haven't mentioned, to confirm this theory of the oceanic lithosphere, i.e., what people call ophiolites. An ophiolite, just to be clear, is not a particular kind of rock, but a structure, a sequence of layers, which can be several km thick; ophiolites are found in continents, mostly along mountain ranges; but ophiolites are strangely reminiscent of the layered structure that the model I've just described predicts in oceanic lithosphere. Most importantly, samples of peridotite from ophiolites are very much like you'd expect the depleted oceanic mantle lithosphere—the "dry sponge"—to be. They are also "serpentinized", as geologists like to say—serpentinization is what happens to peridotite under the action of water—another indication that ophiolites are of oceanic origin. Today, people agree that ophiolites are chunks of oceanic lithosphere that have been transported on top of continental plates—"obducted". That could happen in a collision between a continental and an oceanic plate. So, you look at an ophiolite, in the middle of, say, the Alps, and you are also looking at what the oceanic lithosphere looks like.
636. Yes: we've seen that density changes quite a bit across the oceanic lithosphere, with distance from the ridge and also with depth, and that would make things cumbersome, so it might be a good idea to just take some average value for it, which is what ρ_L is going to be. It's not my idea: I took it from Geoffrey F. Davies' 1992 paper, "On the Emergence of Plate Tectonics", published in Geology, vol. 20, pages 963–966. (In case you actually go and read that paper, be careful because there are a few errors (typos, really: the numbers that Davies gets are OK) in the formulae, for which Davies published a correction in the following volume of the same journal, issue of June 1993.
But the formulae that I give here are the right ones.) Even if ρ_L is just a constant, the buoyancy of the plate changes with the thickness of the lithosphere, d: when there's too much lithosphere, which is heavy, the oceanic crust, which is light, is not enough to compensate for it, and the plate as a whole isn't buoyant anymore. Anyway, this is what we're going to see in the equations that I am about to derive; keep reading.
637. In his paper, Davies picks the densities ρ_C and ρ_A to be about 2990 and 3300 kg/m³. Then, he says that
$$\rho_L = \rho_A (1 + \alpha \Delta T), \qquad (N.213)$$
where α is the thermal expansion coefficient (which Davies estimates at 3 × 10⁻⁵ °C⁻¹), and ΔT is what Davies calls the "mean temperature deficiency of lithosphere", and, he says, should be about 700 °C. To see what Davies means by that, and why 700 °C, start from Eq. (8.119). Remember that we can approximate the thickness of the oceanic lithosphere like this:
$$d \approx 2.32 \sqrt{\kappa t}, \qquad (N.214)$$
which is Eq. (8.123) after having chosen the arbitrary T that defines the bottom of the lithosphere as 0.9 times the T of the asthenosphere (I'd warned you that we'd do something like this). Now the trick is, calculate the mean temperature ⟨T(t)⟩ of the lithosphere at age t,
$$\begin{aligned} \langle T(t)\rangle &= \frac{1}{d}\int_0^d T_0\,\operatorname{erf}\!\left(\frac{z}{2\sqrt{\kappa t}}\right)dz\\ &= \frac{T_0}{d}\left[d\,\operatorname{erf}\!\left(\frac{d}{2\sqrt{\kappa t}}\right)+2\sqrt{\frac{\kappa t}{\pi}}\left(e^{-\frac{d^2}{4\kappa t}}-1\right)\right]\\ &= T_0\operatorname{erf}\!\left(\frac{d}{2\sqrt{\kappa t}}\right)+\frac{2T_0}{d}\sqrt{\frac{\kappa t}{\pi}}\left(e^{-\frac{d^2}{4\kappa t}}-1\right), \end{aligned} \qquad (N.215)$$
where I have used one of those properties of the error function that we find in math books or websites867, and the rest is algebra. If you sub (N.214) into (N.215), you see right away that √(κt) simplifies away everywhere, and that, as a result, ⟨T(t)⟩ doesn't depend on t and we might just call it ⟨T⟩. The "mean temperature deficiency of the lithosphere" can be derived from (N.215),
$$\begin{aligned} \Delta T &= \langle T\rangle - T_0\\ &= T_0\left[\operatorname{erf}\!\left(\frac{d}{2\sqrt{\kappa t}}\right)+\frac{2}{d}\sqrt{\frac{\kappa t}{\pi}}\left(e^{-\frac{d^2}{4\kappa t}}-1\right)-1\right]\\ &= T_0\left[0.9+\frac{2}{2.32\sqrt{\pi}}\left(e^{-\frac{2.32^2}{4}}-1\right)-1\right], \end{aligned}$$
after subbing (N.214) and remembering that erf(2.32/2) = 0.9 (that's where that 2.32 came from in the first place). So then if we crunch the numbers, ΔT ≈ −0.46 T₀. The value for ΔT picked (without explanation) by Davies is 700 °C in magnitude, i.e., about half your typical estimate of T in the asthenosphere868. Assuming that the lithosphere is made approximately of the same stuff that the asthenosphere is made of, the average density surplus that corresponds to ΔT can then be calculated via the coefficient of thermal expansion, i.e. via (N.213). So, anyway, plug the numbers in, and it follows that ρ_L = 3369 kg/m³. As for ρ_D, Davies estimates that it should be just about ρ_L minus 30 kg/m³, i.e. ρ_D = 3339 kg/m³, which I assume comes from lab experiments with partial melting of peridotite? (Davies doesn't say.) The thicknesses of crust and depleted lithosphere are 7 and 20 km, respectively. With these values, if you do the arithmetic, you should get t = 22 Myr.
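The number crunching in this note is easy to verify — the constants below are the ones quoted in the text (Davies' values), nothing else is assumed:

```python
# Check Davies' numbers: mean temperature deficiency of the lithosphere
# and the resulting average lithospheric density, Eq. (N.213).
from math import erf, exp, sqrt, pi

# <T>/T0 for d = 2.32 sqrt(kappa t): t drops out, leaving pure numbers
mean_T_ratio = erf(2.32 / 2) + (2 / (2.32 * sqrt(pi))) * (exp(-2.32**2 / 4) - 1)
dT_ratio = mean_T_ratio - 1
print(f"Delta T / T0 = {dT_ratio:.2f}")   # about -0.46

alpha, rho_A, dT = 3e-5, 3300.0, 700.0    # 1/C, kg/m^3, C (Davies' values)
rho_L = rho_A * (1 + alpha * dT)
print(f"rho_L = {rho_L:.0f} kg/m^3")      # 3369
```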
Fig. N.54 Suction force according to Forsyth and Uyeda. The arrows point in the direction of mass displacement; the ellipse denotes circular flow, where mantle material is dragged downward by the sinking slab, and in turn drags the overriding plate towards the trench
638. I am not sure how obvious this sounds to you? I guess, buoyancy brings hot material up, right at the ridge. The stuff is brought up to a higher elevation than the mean depth of the seafloor. Once it's up there, buoyancy doesn't matter anymore, and the material just wants to go back down ("slide" down) by gravity—"gravitational sliding". People estimate ridge push based on the difference in pressure at the base of the plate, at the ridge minus far from the ridge. You can check out Richter869 and McKenzie's paper, "Simple Plate Models of Mantle Convection", Eqs. (21) through (25), Journal of Geophysics, vol. 44, 1978.
639. "Sea-Floor Spreading as Thermal Convection", Journal of Geophysical Research, vol. 76.
640. This is not what Elsasser had originally thought. Forsyth and Uyeda write: "Elsasser visualizes this force as due to a continual downwarping of the oceanic plates at trenches, creating an empty space which is continually filled by the seaward movement of the continental plate." Which, by the way, people call trench rollback. "Suction in a fashion to fill the void, however," continue Forsyth and Uyeda, "appears to be incompatible with the notion that a trench is a colliding boundary across which plates are pushing each other. If the seaward movement of continental plates is due merely to the push transmitted from the other side of the plate, the force is already taken care of by F_RP and introduction of F_SU is unnecessary", which I think is a very good point. And but then they end up with an explanation that's pretty much the same as what Conrad and Lithgow-Bertelloni are saying: "there may be some mechanism", say Forsyth and Uyeda, "to generate F_SU on the continent side of a trench. McKenzie (1969) and Sleep and Toksöz (1971) postulate that a secondary hydrodynamic flow is induced in the upper mantle above a sinking slab and this flow would generate tensile stress behind island arcs", etc.
Do you mind if I don't copy all of Forsyth and Uyeda's bibliography references here? If you want to look deeper into this, you should probably check out their paper first, anyway, and you'll find the references there. In exchange for that, I give you in Fig. N.54, here, a sketch of the suction mechanism that they propose.
641. This doesn't mean that all plates that subduct receive the same amount of pull: but, rather, that the magnitude of the pull depends only on the size of the subducting slab—in Forsyth and Uyeda's approximation, only on the horizontal length of the trench that comes with the slab.
642. Sure, each force is a 3-component vector, of course: but its direction is determined by the geometry of the plate's motion, which we know: so all we need to find out is the magnitude of each force.
643. There are small plates, and there are things like the Adriatic plate—which I know about because it's near where I live, I guess—that people don't quite agree on whether it is its own plate or should be attached to the rest of the African plate. Forsyth and Uyeda's twelve plates are the biggest ones, and we tend to be pretty sure that they are all separated from one another.
644. Actually, Chapple and Tullis "feel" that F_SP "is relatively well understood", and decide to estimate it, based on the thickness of the slab (of the plate), on its density, dip angle and depth "at which the first motions from in-slab earthquakes indicate a crossover from downdip tension to downdip compression", i.e. the depth at which the slab is not sinking (and so, not pulling) anymore, presumably because it has become as hot as the rest of the mantle. They think their estimate is OK and so treat F_SP "as a known force rather than an unknown force parameter." It follows that, for Chapple and Tullis, there are only seven terms in the sum at the left-hand side of Eq. (8.143), and its right-hand side is not zero, but contains a component of the slab pull force for the plate in question. Just so that you realize that there are many different ways in which things can be done.
645. This is not very rigorous/detailed, I know. If you want to learn more about this stuff, you can look up, e.g., "rank of a matrix".
646. That is, we can "solve" the problem, like Forsyth and Uyeda say, "in the least-squares sense": see note 843.
647. Which, incidentally, they could have guessed before doing their calculations (and, I don't know, maybe they did). Because if you look at which plates are attached to subducting slabs, and at how fast the various plates move, it turns out the faster plates are actually those that subduct.
648. In vol. 75 of Journal of Geophysical Research.
649. Japan, actually, used to be part of Eurasia, and some relief was formed there, as a result of ocean-continent subduction; there are high mountains in Japan that are not volcanoes: like one that is called Mount Kita, for example, which you can look up. But at some point a rift opened up, it seems, and the Sea of Japan formed, and Japan became a bunch of big islands, and then many volcanoes formed, there, as a result of ocean-continent convergence. The Sea of Japan is an example of what people call a backarc basin, or backarc; while the forearc is, roughly, the region between the axis of the arc—where all the volcanoes are—and the trench.
650. I know: there's no convergent margin on either side of the Rockies. San Andreas is a transcurrent (transform, whatever you prefer to call it) boundary—no squeezing of the plates. But there used to be an oceanic plate right there, which eventually (kind of recently, i.e. within the past 100 Myrs) entirely sank under North America; people call it the Farallon plate—we'll meet it again. One figures Farallon had to be there after reconstructing the past motions of
plates that are still around, from paleomag data, etc., the way we've learned earlier in this chapter.
651. The amount of heat that's transferred via conduction is approximately negligible with respect to heat transfer by convection; but conduction isn't strictly zero: it follows that the real gradient should be slightly smaller than the adiabatic one. Given how little we know in general about the deep earth, though, we can definitely live with this.
652. We can also extrapolate upwards from the transition zone, starting from the T of phase changes, and then check whether the extrapolated T at the base of the lithosphere matches our independent estimates.
653. Equation (7.134).
654. See note 487.
655. We could also do it the Williamson-Adams way, and let g vary with depth, using the estimates that we get from seismology: see Chap. 8. But for the back-of-the-envelope calculation we are doing right now, that's not really worth it.
656. I haven't talked much about the 520 phase transition, because it's less famous than the other two, I guess. But it's there.
657. Which is way more complex than you might think. Here I am only going to skim the surface of the whole topic, just so you have some notion of how well we know those things, and what the difficulties are, etc. To begin with, remember that α is defined by
$$\frac{dV}{V} = -\frac{d\rho}{\rho} = \alpha\,dT, \qquad (\text{copy of Eq. 7.134})$$
or, which is the same,
$$\alpha = -\frac{1}{\rho}\left(\frac{\partial\rho}{\partial T}\right)_p, \qquad (N.216)$$
or
$$\alpha = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p. \qquad (N.217)$$
In principle, one might measure α directly: like, you keep a sample of mantle-like material under constant, very high mantle-like pressure; then you raise its T (which is already at very high, mantle-like values, too) and see what happens to its volume—or to its weight, whatever is easier. And then sub the data directly into Eq. (N.216) or (N.217). But high-pressure/high-temperature experiments are a tricky thing to do, and, in practice, there is no way to do things like I've just described. There are, on the other hand, various, complicated ways of collecting what people call p-V-T data: info on what kind of values of pressure, volume and temperature are allowed to happen at the same time in mantle stuff. Just like with ideal gases, which you might have heard about already, p, V and T must obey an equation of state—which for ideal gases is
the very simple pV = kT, with k a constant (see note 488). We are not sure what the equation of state is for mantle material at mantle p and T, but there's a whole community of people who try to pin it down, using all sorts of different experiments and calculations. The idea is that once the equation of state is established, parameters such as α, C_p, etc., can be derived from it. At this point, different authors come up with different equations of state, depending on the assumptions they choose to make re composition, temperature, pressure, etc., in the mantle. As far as I understand, right now, and for a while, the most popular equation of state that people assume to be a good approximation of whatever really happens in the mantle is the so-called "Birch-Murnaghan"870 equation of state. To derive Birch-Murnaghan, you can start from the equation of energy conservation, written in the form
$$p\,dV = T\,dS - dU, \qquad (N.218)$$
which see Chap. 7, note 488, where S is entropy, and U the system's "internal" energy. Consider a transformation, any transformation, where T is kept constant: it follows from (N.218) that
$$p = T\left(\frac{\partial S}{\partial V}\right)_T - \left(\frac{\partial U}{\partial V}\right)_T = -\left(\frac{\partial(U - TS)}{\partial V}\right)_T.$$
We know from continuum mechanics that changes in the quantity U − TS (which, incidentally, people like to call “Helmholtz free energy”) can be written871,872,873

δ(U − TS) = a f²,   (N.219)

where

a = (9/2) K_T0 V_0,   f = (1/2) [(V_0/V)^(2/3) − 1],   (N.220)

V_0 is the reference volume of the system, in the state that you consider to be “unstrained”, and K_T0 is the value of K_T in that same state, where, by the way, temperature T = T_0. Same as in Chap. 7, where I first introduced it, K_T denotes the isothermal incompressibility—not to be confused with the adiabatic incompressibility AKA bulk modulus. The quantity f is what people call “finite strain”, and is a measure of the compression/expansion that’s happening in the system (see note 871). From (N.218) and (N.219), keeping in mind that T is being kept constant, you can write
p = −(∂(U − TS)/∂V)_T = −(∂(a f²)/∂V)_T = −2a f (∂f/∂V)_T.   (N.221)
Now, differentiate f with respect to V , i.e.,

(∂f/∂V)_T = (1/2) ∂/∂V [(V_0/V)^(2/3) − 1]
          = (1/2) ∂/∂V (V_0/V)^(2/3)
          = (V_0^(2/3)/2) ∂(V^(−2/3))/∂V
          = −(1/(3V_0)) (V_0/V)^(5/3);   (N.222)

replace (N.220) and (N.222) into (N.221), and

p = −2a f (∂f/∂V)_T
  = 9 K_T0 V_0 · (1/2) [(V_0/V)^(2/3) − 1] · (1/(3V_0)) (V_0/V)^(5/3)
  = (3/2) K_T0 [(V_0/V)^(7/3) − (V_0/V)^(5/3)],   (N.223)
which gives p as a function of V (starting from some reference volume V_0) at constant T , i.e., it is the equation of state we were looking for, and, like I said, is called the Birch-Murnaghan equation of state. To be more precise, actually, it is called the second-order Birch-Murnaghan equation of state874. In principle, Birch-Murnaghan describes an isothermal transformation—it gives p as a function of V (or, same thing, V as a function of p) for some fixed T = T_0. But if you re-implement it for various values of T_0, you shall still identify the combinations of values of p, V , T that are allowed to happen in your system: i.e., you still end up with the general equation of state p = p(V, T). (Or T = T(p, V), or V = V(p, T), whichever is the simplest way for you to think about it.) (Which is not to say that one should turn Birch-Murnaghan around, algebraically, to express V as a function of p and T_0. Rather, you should think of (N.223) as something that can be implemented numerically to find a “table” of p-V-T values—a surface in the p-V-T space.) One tricky thing about Eq. (N.223) is that it will work—it will provide numerical values of p as a function of V and V_0 and T_0—as long as you know K_T0: but, in general, K_T0 is not known at all the values of p and V and T_0 where we need Birch-Murnaghan to function—i.e., the entire mantle. So, what people do is they “fit” (which remember, for instance, how the contributions of all those plate-driving forces were determined, in Chap. 8) the function p = p(V, T_0) to pressure-volume-temperature data from the lab. The “cost function” would be something like |p(V, T_0) − p_OBS(V, T_0)|², where p(V, T_0) is short for Eq. (N.223), p_OBS(V, T_0) points to the values of p observed in the lab at those same values of V and T_0, and one looks for the value of K_T0 such that the cost function is minimum. Once that is done, you can go ahead and calculate α from (N.223). By definition, α = (1/V) (∂V/∂T)_p—see above—so, for example, I guess, you can pick a value of p, and find the corresponding V , at temperatures T_0 = T ± δT, via (N.223). Then

α ≈ (1/V(p, T)) · [V(p, T + δT) − V(p, T − δT)] / (2δT).
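The whole recipe—evaluate the second-order Birch-Murnaghan relation on a grid, invert it for V by bisection, “fit” K_T0 by least squares, then get α from a centered difference in T—can be sketched in a few lines of Python. Everything numerical here is made up for illustration: the V_0 = 1 and K_T0 = 130 reference values, the K_T0 search grid, and especially the linear V0_of_T that stands in for the (much more complicated) temperature dependence of the reference state.

```python
def p_bm2(V, V0, KT0):
    """Second-order Birch-Murnaghan: p = (3/2) K_T0 [(V0/V)^(7/3) - (V0/V)^(5/3)]."""
    x = V0 / V
    return 1.5 * KT0 * (x ** (7.0 / 3.0) - x ** (5.0 / 3.0))

def V_of_p(p, V0, KT0):
    """Invert the equation of state for V by bisection (p grows as V shrinks)."""
    lo, hi = 0.3 * V0, V0  # bracket: very compressed ... unstrained
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if p_bm2(mid, V0, KT0) > p:
            lo = mid  # mid is still too compressed: the V we want is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fit_KT0(pv_data, V0):
    """Toy "fit": scan candidate K_T0 values, keep the one minimizing the squared misfit."""
    candidates = [50.0 + 0.5 * i for i in range(500)]  # made-up search range
    return min(candidates,
               key=lambda K: sum((p_bm2(V, V0, K) - p) ** 2 for V, p in pv_data))

def alpha(p, T, dT=5.0):
    """alpha ~ [V(p, T+dT) - V(p, T-dT)] / (2 dT V(p, T)).
    The linear V0_of_T below is a hypothetical placeholder; a real calculation
    would let K_T0 depend on temperature too."""
    def V0_of_T(T):
        return 1.0 * (1.0 + 2.0e-5 * (T - 300.0))  # invented reference volume vs. T

    def V(p, T):
        return V_of_p(p, V0_of_T(T), 130.0)

    return (V(p, T + dT) - V(p, T - dT)) / (2.0 * dT * V(p, T))
```

Generating synthetic p-V pairs with some K_T0 and feeding them to fit_KT0 recovers that K_T0, which is the (much simplified) spirit of the real p-V-T fits.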
A very delicate and confusing (at least to me) point here is that there is not one single equation of state, and people choose their equation of state depending on the data that they have, and that they need to fit, to constrain their unknown parameters. For instance, if your data are recorded in an experiment where an isothermal transformation is done, indeed it is best to “fit” them via Birch-Murnaghan; but if the experiment isn’t isothermal... you see the point. So, it is not unusual to read a paper where multiple equations of state are used.
658. Actually, you can’t really bring rocks to pressures as high as those we expect to find at the CMB, or deeper; what people do is, they subject solid rocks to some really high pressure, and then check how hot it’s got to be before it melts; they do this repeatedly, each time increasing the pressure, as far as their equipment allows: and so then they get a curve (e.g. Fig. N.55) of melting temperature as a function of pressure (which remember, pressure and temperature are associated via Williamson-Adams-type diagrams, so they’re kind of like the same thing). After that is done, they assume that T continues to vary with depth according to the same trend as in their measurements, and that’s how they can get to whatever depth they want—but with quite a bit of uncertainty, of course.
659. See note 482.
660. Which, people say, could be between 5150 and 6200 K.
661. Which means, incidentally, that its top would be at about the same depth as the perovskite-to-post-perovskite transition.
662. For a moment, right after that paper came out, I thought its author was Paul Hoffman. Different first name, I know, and last name spelled differently, and not even a geochemist, but a geologist in my department at the time—he became famous, sort of, for his “snowball theory” of the earth; but let’s not get into
Fig. N.55 “Melting curve of (Mg0.88 Fe0.12 )SiO3 perovskite between 25 and 96 GPa”, from Elise Knittle and Raymond Jeanloz, “Melting Curve of (Mg,Fe)SiO3 Perovskite to 96 GPa: Evidence for a Structural Transition in Lower Mantle Melts”, Geophysical Research Letters, vol. 16, 1989. Based on the curve shown here, Knittle and Jeanloz state that “by extrapolation to 136 GPa”, which is the pressure at CMB, the melting T of perovskite “is 4500 (+ 500) K at the core-mantle boundary”. (“H & J” stands for Dion Heinz and, again, Jeanloz: the two had done some earlier experiment, at lower pressure than that of Knittle and Jeanloz.) (Used with permission of John Wiley and Sons, from Raymond Jeanloz, Elise Knittle, “Melting curve of (Mg,Fe)SiO3 perovskite to 96 GPa: Evidence for a structural transition in lower mantle melts”, Geophysical Research Letters, vol. 16, 1989; permission conveyed through Copyright Clearance Center, Inc)
that. I remember this: I am in the hallway of the, yes, Hoffman Lab—which is the name of the department of earth and planetary sciences where I did my Ph.D.—but no relation to any of the Hofmanns or Hoffmans involved in this story—I am standing there with a bunch of other grad students, chatting, and one of them has a copy of the latest Nature and says, look, there’s a review paper by Paul Hoffman, here, saying that tomography models are fuzzy. Which indeed, very last paragraph, Hofmann says, “the controversy over the issue of convective layering and depth of plume sources is likely to be resolved when the so far rather fuzzy seismic imaging of the Earth’s interior becomes comparable in resolution to that achieved by geochemical mapping, so that geophysical and geochemical data can be more specifically correlated.”875 Which was difficult for me and the people in my group to ignore, because we (and a few other groups, but not so many) were the people doing the fuzzy imaging of the earth’s interior. And for a few moments we thought, Paul H. is declaring war on the seismology group. Which was kind of funny, actually. Then, I think, a prof. passed by, was asked about this, and explained the qui pro quo.
663. Hofmann, A. W., “Mantle Geochemistry: the Message from Oceanic Volcanism”, Nature, vol. 385, 1997.
664. I still do, actually.
665. And people dated Hawaiian seamounts and verified that it all fits.
666. I briefly mentioned hotspots in note 765 to Chap. 5, the piece about Wegener and continental drift. Then, strangely, they didn’t come up again. Hotspots are places at the surface of the earth where you have lots of (mostly basaltic) volcanism, but that are not near plate boundaries, so that you can’t explain their volcanism in terms of what happens at plate boundaries876. Starting with Tuzo Wilson, whom I mentioned when I started telling you about geochemical reservoirs, it has been suggested that the stuff that gets erupted at hotspots comes directly from the lower mantle, through narrow, uhm, “conduits”, called mantle plumes. This is another one of those things that people fight about. Hotspots, by the way, don’t move much relative to one another (compared to how plates move relative to one another): so people like to use them as a reference frame to describe plate motions. (See also note 765.)
667. See note 666.
668. “A Possible Origin of the Hawaiian Islands”, Canadian Journal of Physics, vol. 41, 1963.
669. The composition of MORBs varies too, but not nearly as much. See note 670.
670. Not only “elemental” geochemistry data, as in Fig. 9.4, but also “isotope” geochemistry. People look at the “isotopic signature” of samples, let’s say of basalts, which is what we are concerned with right now. (If you remember the beginning of Chap. 8, those are the same data that we use to date rocks.) So people started to look at the concentration of isotopes of incompatible elements in basalt, and they found something peculiar: if you take a lot of samples, from all sorts of MORBs and OIBs, and for each sample measure the concentrations of a pair of radiogenic isotopes of incompatible elements, and plot one against the other, you systematically find a trend: i.e., a correlation877, or anticorrelation, like in Fig.
N.56; and when this is done for different pairs of isotopes, it turns out that whether it’s correlation or anticorrelation depends on whether either or both of the isotopes’ parents are more or less compatible than their daughters. If both parents, say, are each less compatible than their daughters, you get a correlation; if both parents are more compatible than their daughters, also a correlation; but if one parent is less compatible than its daughter, and the other parent is more compatible than its daughter, then what you get is an anticorrelation. This is what happens in Fig. N.56. 87Sr comes from the decay of 87Rb, which is less compatible than Sr; 143Nd comes from 147Sm, which is more compatible than Nd. And, here is how geochemists explained this result. When something partial-melts, it separates; when this happens, like we were saying, some of each element will go with the melt, some will stay with the solid “residue”. How much of a given element goes and how much stays depends on the element’s compatibility. Now, consider 87Sr and its parent 87Rb. When a rock that carries some of both partial-melts, Rb is less compatible than Sr (see Fig. 9.4), so Rb tends to go with the melt more than the Sr does. On the other hand, 147Sm, which is the parent of 143Nd, is more compatible than its daughter: so after partial melting, you’ll tend to find a lower concentration of parent-with-respect-to-daughter in the melt than in the solid residue. Bottom line: after partial
Fig. N.56 The correlation of radiogenic isotopes in MORB and OIB. (Data taken from Andreas Stracke et al., Geochemistry, Geophysics, Geosystems, vol. 23, 2022. There’s a very similar figure in Hofmann’s Nature paper, too; but Stracke’s data should be more up to date.) The values on the axes are the ratio of the concentration of 87 Sr versus the concentration of 86 Sr (horizontal axis), and same for 143 Nd and 144 Nd on the vertical axis. The isotopes 86 Sr and 144 Nd are stable isotopes of strontium and neodymium: i.e., their numbers don’t change over time because they’re neither radioactive nor radiogenic. When people measure these things, they always use the concentration of a stable isotope as a reference
melting, the ratio of 87Rb (parent) to 87Sr (daughter) is higher in the melt than it is in the solid residue, while the ratio of 147Sm (parent) to 143Nd (daughter) is lower in the melt than it is in the solid residue. And but then, as each radioactive parent continues to decay in both melt and residue, it’ll turn into its radiogenic daughter: and so what’s going to happen is that the ratio of the concentration of, e.g., 87Sr to that of the non-radiogenic isotope of Sr that we use as reference (i.e., 86Sr) will be higher in the melt than it is in the residue, etc. (We need a lot of time, by the way, for this to happen, because the half-life of most of those parents is on the order of the age of the earth.) Now, to understand how we end up with Fig. N.56: again, let’s make the hypothesis that, early in the history of the earth, the stuff that now makes up the mantle and continental and oceanic crust was all mixed together—we called that mixture the “primitive mantle”. And let’s say that, at some point, there’s a big event where you separate mantle from continental crust. The continental crust is the result of partial melting of a “primitive” mantle: meaning, the primitive mantle partial-melts, and the melt rises to the earth’s surface, freezes and becomes the crust. So the crust must have a lot of incompatible stuff, incl., e.g., a lot of Rb. This means that from now on in the crust, relative to the mantle, there will be
much more radiogenic Sr than non-radiogenic Sr. The opposite will be true of Nd, because Nd’s parent, Sm, like I said, is more compatible than Nd. So when the crust forms, it takes more Nd than Sm, and the implication is that, after some time, in the crust we shall have a smaller percent of radiogenic Nd than we have in the mantle. You see that there is an anticorrelation, already—except that this model would account for only two data points in the plot of Fig. N.56: one for the continental crust and one for primitive-mantle-minus-continental-crust. The latter partial-melts below mid-ocean ridges, and the result is MORB. There’s some variability in the concentrations of Nd and Sr isotopes, etc., in MORB, but basically MORB data are very much clustered in the top left corner of the plot in Fig. N.56; the continental crust is at the opposite end of the “line”, to the bottom right of the plot. This still doesn’t quite explain the plot, though, right? Because what are all those data points between MORB and continental crust? We know from Chap. 8 that the MORB gets recycled back into the mantle via subduction, so you’d expect the mantle to keep, more or less, its composition—and relative concentrations of various isotopes, etc. Well, those data points are, of course, the OIBs. Which, compared to the MORB, are scattered all over the place. And this confirms what we were just saying, that OIBs tend to be chemically quite different from one another, depending on where you collect the data.
671. Models, both numerical (which I’ll tell you all about in a minute) and analogue (which see, e.g., the piece on Griggs, in Chap. 8), show that plumes tend to form, and to start rising, from the top of thermal boundary layers.
672. In “Compositional Stratification in the Deep Mantle”, Science, vol. 283, 1999; besides Kellogg, the two other authors are Brad Hager and Rob van der Hilst.
673. Because, for instance, we don’t really know with any accuracy how fast the mantle and core are cooling, right?
674. It’s important that we are clear on what numerical simulations are, because that’s one of the most important tools in geophysics today—both for seismologists and geodynamicists. I am not going to go into any detail of how programs that calculate mantle circulation and/or seismic wave propagation are written—there are people who do that for a living, anyways—but I think it’s worthwhile to take a simple problem that we’ve already studied in some detail from the analytical angle, and look at how it can be solved numerically878. I’ll take the guitar string: you know from note 279 that the (small) deformation u of an elastic string under tension is controlled by the wave equation

∂²u(x, t)/∂x² = (1/c²) ∂²u(x, t)/∂t²,   (copy of N.47)

where x is distance along the string, t is time, and c is the wave speed along the string. Before we integrate Eq. (N.47) numerically, we have to figure out how we are going to discretize it. Because in (N.47) x and t are continuous variables that could in principle have any real value whatsoever; and the derivatives are ratios of infinitely small things: computers can’t deal with this kind of stuff.
So, x and t will only be allowed a finite, discrete set of values—I am going to call them x_i and t_k, with i = 1, 2, ..., M, and k = 1, 2, ..., N—and we’ll have to pick an approximate formula for the derivatives. To make sure that we know how approximate the formula is going to be, we derive it from the Taylor expansion (note 56) formula. Let me show you how this is done in practice, and why it is a smart thing to do. Consider, for example, a function f of x, defined along the string—could be the displacement, but let’s be general for now. Because we decided to discretize the problem, we are always going to evaluate f only at the points x_1, x_2, etc. Now, for the sake of simplicity, let’s take those points to be equally spaced, i.e., x_{i+1} = x_i + δx, for whatever value of i, with δx, the spacing between points, a constant. Presumably, δx is pretty small, so then Taylor says that

f(x_{i+1}) = f(x_i + δx) = Σ_{k=0}^∞ (1/k!) (dᵏf/dxᵏ)(x_i) δxᵏ
           ≈ f(x_i) + (df/dx)(x_i) δx + (1/2)(d²f/dx²)(x_i) δx² + (1/6)(d³f/dx³)(x_i) δx³,   (N.224)

and likewise

f(x_{i−1}) = f(x_i − δx) ≈ f(x_i) − (df/dx)(x_i) δx + (1/2)(d²f/dx²)(x_i) δx² − (1/6)(d³f/dx³)(x_i) δx³.   (N.225)

Now the trick is, subtract (N.225) from (N.224),

f(x_{i+1}) − f(x_{i−1}) ≈ 2 (df/dx)(x_i) δx + (1/3)(d³f/dx³)(x_i) δx³,

which notice that terms with δx² cancel out with one another; then solve for (df/dx)(x_i), which gives

(df/dx)(x_i) ≈ [f(x_{i+1}) − f(x_{i−1})] / (2δx),

after neglecting the δx³ term—but not the δx² ones, which just happened to cancel out. So our approximation is accurate to third order in δx. Had we used the values of f at two immediate neighbors, like x_i and x_{i+1}, things wouldn’t have canceled out and we’d have ended up with an expression that’s only second-order accurate. Anyway, so, that’s a good way to discretize the first derivative. As for the second derivative, well, just sum (N.224) and (N.225),
f(x_{i+1}) + f(x_{i−1}) ≈ 2 f(x_i) + (d²f/dx²)(x_i) δx²,

which now the δx³ terms have canceled out, and, solving for d²f/dx², you get

(d²f/dx²)(x_i) ≈ [f(x_{i+1}) + f(x_{i−1}) − 2 f(x_i)] / δx²,

which is fourth-order accurate. Now, there’s two second derivatives in (N.47): one with respect to distance and one with respect to time. Using our discretization scheme,

(∂²u/∂x²)(x_i, t_k) ≈ [u(x_{i+1}, t_k) + u(x_{i−1}, t_k) − 2u(x_i, t_k)] / δx²,

and

(∂²u/∂t²)(x_i, t_k) ≈ [u(x_i, t_{k+1}) + u(x_i, t_{k−1}) − 2u(x_i, t_k)] / δt²,

and so the discretized wave equation altogether reads

[u(x_{i+1}, t_k) + u(x_{i−1}, t_k) − 2u(x_i, t_k)] / δx² = (1/c²) [u(x_i, t_{k+1}) + u(x_i, t_{k−1}) − 2u(x_i, t_k)] / δt².

Now, let’s say that we know the (discretized) shape of the string at and before time t_k; i.e., we know the value of u(x_i, t_k) at all points x_i, when t = t_k, t_{k−1}, t_{k−2}, etc. We just don’t know (yet) what’s going to happen to the string after t_k. But we can begin to find out, if we simply solve for u_i^{k+1}, AKA u(x_i, t_{k+1}), which gives

u(x_i, t_{k+1}) = c² (δt²/δx²) [u(x_{i+1}, t_k) + u(x_{i−1}, t_k) − 2u(x_i, t_k)] + 2u(x_i, t_k) − u(x_i, t_{k−1}),   (N.226)

because if we know what happened to the string before t_{k+1}, then we know everything at the right-hand side: and we can just plug numbers in and crunch them—and if we do that for all values of i = 1, 2, ..., M, the result will be the shape of the string at time t_{k+1}. What’s more, we can iterate this procedure, play this game as many times as we want: because once we know what the string is like at t = t_{k+1}, then we are ready to find how it will be like at t = t_{k+2}—just increment k in (N.226) by one—and so on and so forth. This is the basis of the so-called finite-difference method: with only equation (N.226), we are almost ready to implement it on a computer—I just need to clarify a couple things. First, the boundary conditions. I don’t know if you noticed, but anyway, when i = 1 and/or i = M, you see that to calculate u(x_1, t_{k+1}), or u(x_M, t_{k+1}), via (N.226), you would need to know the values of u
at x_0 and x_{M+1} at t = t_k: which you don’t, of course, because those points are not even part of the string. But the thing is, we have boundary conditions to take into account, and the boundary conditions are simply that the endpoints of the string be fixed,

u(x_1, t_k) = 0   (N.227)

and

u(x_M, t_k) = 0,   (N.228)

for all k. So, when i = 1, (N.226) is replaced by (N.227), and when i = M, (N.226) is replaced by (N.228). Second, how do we start the whole thing? Well, just like with the analytical solutions that we saw in notes 279 and 292, we need to prescribe some initial conditions. Typically, we know the string’s initial displacement, that is its shape at time t = t_1: which means that, for all i = 2, 3, ..., M − 1, we assign a value to u(x_i, t_1) (values at i = 1 and i = M are already assigned). We also know the string’s initial velocity879, or v = du/dt. Velocity actually doesn’t appear in the finite-difference equation; but from v(x_i, t_1) you can compute u(x_i, t_2) ≈ u(x_i, t_1) + v(x_i, t_1) δt (which follows from the definition of velocity as the time-derivative of the displacement) for all i, and so then the initial conditions consist of knowing u(x_i, t_1) and u(x_i, t_2). The first step of the finite-difference process, then, is

u(x_i, t_3) = c² (δt²/δx²) [u(x_{i+1}, t_2) + u(x_{i−1}, t_2) − 2u(x_i, t_2)] + 2u(x_i, t_2) − u(x_i, t_1),   (N.229)

for i = 2, 3, ..., M − 1. And then we just iterate. Before you go and try to code this thing in Matlab or Python or whatever programming language you like, there’s one last thing you have to be careful with: you can’t just pick the values of δx and δt randomly. Let’s say that the ratio of δx to δt is smaller than c: to fix ideas, let’s say it’s c halves, i.e., δx/δt = c/2, or c = 2δx/δt. Let’s also say that, at time t_k, there’s some displacement at x_i, while the rest of the string is yet unperturbed. Then, at the next sampled instant in time, t = t_{k+1}, we should have the same displacement at a distance cδt = 2δx from x_i, i.e., at x_{i−2} or x_{i+2}. And but then it’s as if the chunks of string at x_{i−1} and x_{i+1} never moved: which doesn’t make sense, physically: and it will mess things up numerically. The rule of thumb—the so-called stability condition—is that

δx/δt ≥ c.

And with this you are ready to go and code.
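For what it’s worth, here is what such a code might look like in Python—a minimal sketch of Eq. (N.226), with the fixed ends (N.227)-(N.228), the velocity-based first step, and the stability check; the grid size and the sinusoidal initial “pluck” in the usage lines are arbitrary choices of mine, not anything the derivation prescribes.

```python
import numpy as np

def simulate_string(u1, v1, c, dx, dt, nsteps):
    """March the discretized wave equation forward in time, Eq. (N.226).
    u1: initial displacement u(x_i, t_1); v1: initial velocity v(x_i, t_1)."""
    assert dx / dt >= c, "stability condition violated: need dx/dt >= c"
    u_prev = np.array(u1, dtype=float)
    u_curr = u_prev + np.array(v1, dtype=float) * dt  # u(x_i, t_2) from the velocity
    u_prev[0] = u_prev[-1] = u_curr[0] = u_curr[-1] = 0.0  # fixed ends, (N.227)-(N.228)
    r = (c * dt / dx) ** 2
    for _ in range(nsteps):
        u_next = np.empty_like(u_curr)
        # interior points: Eq. (N.226), vectorized over i = 2, ..., M-1
        u_next[1:-1] = (r * (u_curr[2:] + u_curr[:-2] - 2.0 * u_curr[1:-1])
                        + 2.0 * u_curr[1:-1] - u_prev[1:-1])
        u_next[0] = u_next[-1] = 0.0
        u_prev, u_curr = u_curr, u_next
    return u_curr

# usage: a string plucked into a sinusoidal shape, zero initial velocity
M = 101
x = np.linspace(0.0, 1.0, M)
u = simulate_string(np.sin(np.pi * x), np.zeros(M), c=1.0,
                    dx=x[1] - x[0], dt=x[1] - x[0], nsteps=50)
```

Replacing the scalar c with an array c(x_i) in the update (and using the largest value in the stability check) is all it takes to handle a string whose properties vary along its length.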
By the way, in case you are wondering what’s the use of implementing this numerically when we already have plenty of ways to solve the problem analytically: well, if all we were interested in was just to model a guitar string, then yes, the analytical stuff is probably enough. But, to understand why numerical methods are useful, consider what would happen if the density of the string, ρ, weren’t constant—say half the string has one density, and the other half has some other density. Or, which is the same, the speed c of elastic waves in one half of the string has a different value than in the other half of the string. I don’t know if there is an analytical solution for that—there probably is, but it would take some work to derive it (or to look it up); on the other hand, all we have to do to our numerical code is replace the constant c in Eq. (N.229) with a function c = c(x), which has one value when x is between x_1 and x_{M/2}, and another value between x_{M/2+1} and x_M, and we should be good. (OK, yes, we also need to make sure to use the largest of the two values of c in the stability condition.) This is a very stupid example, but in the real world there’s many problems where it’s impossible to find an analytical solution, and numerical modeling is the only way to go. In the old days, i.e., before computers, geophysicists derived all sorts of analytical solutions, simplifying their problems in such a way that it would be possible to solve them that way. For example, we saw how Rayleigh and Love derived their surface-wave formulae assuming that the earth could be treated—at least locally—like a half space, or a half space with a flat layer on top. And, likewise, Haskell computed the postglacial rebound of a flat earth. In the twentieth century, people came up with analytical models where the earth was a sphere, homogeneous first, and then made up of a bunch of shells—each of them homogeneous, but with different properties from one another. And so on, and so forth: lots of complex math, but still very important approximations which are not needed if you go numerical. Numerical methods can’t solve everything—and they are expensive, because you need big computers, and the resources and competences to operate them—but a lot of stuff that was unthinkable some years ago, now has become possible.
675. Actually, this is also something that numerical models should help us to figure out.
676. See note 587, Chap. 8.
677. Philosophical Transactions of the Royal Society of London, vol. 157.
678. A damper, or “dashpot”, is any device that resists motion via viscous friction. Think, e.g., door closers: where the thing that stops the door from slamming shut is, precisely, a dashpot.
679. See note 715.
680. A major complication being that viscosity η, which in the first approximation we treat like a constant parameter, probably changes with τ as well. A famous example of this is ketchup: “merely tilting [...] or even turning [a bottle of ketchup] upside down will only dislodge a little sauce from the neck. [...] To liquefy the sauce, you need to shake the bottle vigorously or thwack it with your hand. If you are not careful, a lot more will end up on your food than you
intended. As experienced users know, there is no need to rush after shaking because the effect takes a certain amount of time: You can relax, remove the cap and take aim.” (H. Joachim Schlichting, “Ketchup Is Not Just a Condiment: It Is Also a Non-Newtonian Fluid”, Scientific American, 2021.) Materials with this property are called non-Newtonian, and I think most of my colleagues in geodynamics will agree that the earth’s mantle is non-Newtonian, at least to some not-entirely-negligible extent.
681. The speed of mantle convection is reflected by the speed at which plates move—the idea being that plates move because of convection, and wouldn’t move if there were no convection, etc.—and so we are talking about a few cm/year: which is not a lot.
682. Which I derived in note 458 already, back when we were talking about earth tides.
683. Because it says that a continuous medium stays continuous: the flux of matter described by the displacement field u is such that gains or losses of mass are explained by changes in density: holes won’t just open in the medium because of its deformation. Which is really just another way of describing conservation of mass, the way it is stated by Eq. (6.19).
684. Remember (Chap. 6) that within the continuum surface forces cancel out; we are left with, in the case of the mantle, friction at the core-mantle boundary and friction between the mantle and the plates.
685. That is, terms that are proportional to the square of changes in T and/or f. Remember the Taylor expansion (note 56), where I said that depending on how precise you want to be, you can decide to stop at first order, or second order, or third order, etc.: and the first-order term—the term that contains the first derivative of f(x)—was proportional to δx; the second-order term to δx²; etc.? It’s the same thing.
686. OK: what I mean, really, is the following: in Chap. 4 I worked out the theory of heat in one dimension only—the earth as a half space—and I defined heat flux via Eq. (4.31). To derive the 3-D counterpart of that, you need to start with the rate of change of T along an arbitrary direction, defined, e.g., by a unit vector n̂. If you remember how the gradient works (note 265), then you realize that Eq. (4.30) is replaced by

e_Q = n̂ · ∇T

(where, in the 1-D case of Chap. 4, the role of n̂ · ∇T was played by the finite difference involving T_b − T_a), and if, then, you go on with the derivation like we did in Chap. 4, eventually, instead of (4.32), you’ll get

F = −k ∇T · n̂.

People call q = −k∇T the heat flux vector, and the total heat flowing in or out of a volume bounded by ∂V is given by the second term at the right-hand side of Eq. (9.24).
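For what it’s worth, the flux q = −k∇T is a one-liner once you have the gradient; in this sketch the temperature field T(x, y) = x² + 3y and the value of k are invented just to have something to differentiate, and the gradient is taken by centered differences (the same trick used for the guitar string a few notes back).

```python
import numpy as np

# Invented temperature field T(x, y) = x^2 + 3y on a unit square; k is made up too.
k = 2.0
x = np.linspace(0.0, 1.0, 101)
y = np.linspace(0.0, 1.0, 101)
X, Y = np.meshgrid(x, y, indexing="ij")
T = X**2 + 3.0 * Y

# np.gradient uses centered differences in the interior, one-sided at the edges.
dTdx, dTdy = np.gradient(T, x, y)
qx, qy = -k * dTdx, -k * dTdy  # heat flux vector q = -k grad(T)
# analytically grad(T) = (2x, 3), so q = (-2kx, -3k)
```

Summing n̂ · q over the faces of a volume (times the face areas) then gives the total heat flowing in or out of it, which is the role the flux plays in Eq. (9.24).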
687. Internal energy is defined as in notes 488 and 492, where I did most of the thermodynamics that’s in this book.
688. In case you want to check me against other people, by the way, Eq. (9.31) here is precisely the same as Eq. (30) in the 2008 Treatise on Geophysics article by Yanick Ricard, called “Physics of Mantle Convection”; and/or Eq. (5.4.7) in Lawrence Malvern’s 1969 textbook, Introduction to the Mechanics of a Continuous Medium.
689. See note 686.
690. In note 488 we had written the change in internal energy (not per-unit-mass, yet) as

dU = T dS − p dV,

which, picking T and V as the independent state variables, and differentiating the right-hand side only,

dU = T (∂S/∂T)_V dT + T (∂S/∂V)_T dV − p dV
   = T (∂S/∂T)_V dT + [T (∂S/∂V)_T − p] dV.   (N.230)

In the same note—Eq. (N.154)—we also saw that specific heat capacity C_V = (T/m) (∂S/∂T)_V, where m is the mass of the continuum we’re looking at. Plug that into (N.230), and

dU = m C_V dT + [T (∂S/∂V)_T − p] dV.   (N.231)

We saw that (∂S/∂V)_T = (∂p/∂T)_V, and if we sub that into (N.231),

dU = m C_V dT + [T (∂p/∂T)_V − p] dV;

finally, we saw that (∂p/∂T)_V = α K_T, and so

dU = m C_V dT + [α K_T T − p] dV.

We are ready now to switch to internal energy per unit mass, which we are going to denote υ. Divide everything by m, and

dυ = C_V dT + (α K_T T − p) dV/m
   = C_V dT + (α K_T T − p) dV/(ρV)
   = C_V dT − (α K_T T − p) dρ/ρ²,   (N.232)

where we’ve used the relationship dV/V = −dρ/ρ (note 316). To derive the time-derivative of υ from (N.232), you just need to divide both sides of (N.232) by the length of the time interval during which T and ρ—and υ as a consequence—are varying,

dυ/dt = C_V dT/dt − (α K_T T − p) (1/ρ²) dρ/dt.
But then Eq. (9.16) says that dρ/dt = −ρ∇ · v: and if we use that in the equation we just wrote, we get precisely (9.33). Which, incidentally, is exactly the same as e.g. Eq. (35) in Ricard’s Treatise article (note 688). OK, yes, in Chap. 7 I am talking about dp, rather than p: but that’s just because it’s convenient to consider that the equilibrium part of the stresses has already canceled out, and what we are left with are small perturbations with respect to it; back then, we were thinking mostly about seismic waves propagating across the earth.
691. And some other places in France, but mostly the Sorbonne.
692. Théorie Analytique de la Chaleur, vol. 2.
693. Same title, yes, like Fourier’s book.
694. But not only. You can constrain rheology from measuring the extent of anelastic effects in the propagation of seismic waves: how much of a wave’s amplitude is reduced by dissipation—which is related to viscosity. You can also make circulation models including gravity perturbations, and see how those perturbations fit gravity data collected at the earth’s surface: which, if viscosity in your model is right, the fit should be good. I am not telling much about all this, though, because it’s more complicated than rebound, and—I suspect—less “robust”.
695. Which, actually, the whole idea, which seems to be substantiated by both laboratory and seismology data, that the transition-zone discontinuities—the so-called “410” and “660”, see Chap. 7—are phase rather than composition changes—the whole idea that in the transition zone the mineralogy of mantle rock changes, but not its chemistry—is very much in favour of whole mantle convection. Because geochemical layered-convection models rest on the assumption that lower and upper mantle are made of very different chemical elements, and the theory of phase changes says that, actually, the chemicals in the lower and upper mantle should be exactly the same. The only thing that changes is the way matter is packed together—the lattice. See Chap. 7, like I said, the transition-zone piece.
696. The way people in global seismology translate earthquake data to 3-D earth models is usually via seismic tomography, which is the data-processing technique that I am telling you about in this chapter. Another way of doing it is by looking at the earth’s normal modes, AKA free oscillations: which is what I am going to tell you about in this note. Although, actually, I anticipated the idea in note 292: but, since then, we’ve learned quite a few new things—and so now I am able to explain modes in a more, uhm, quantitative fashion. To keep things simple, though, I won’t really try to derive the earth’s modes—as in, the modes of a (in the first approximation) spherical, elastic, isotropic solid. It’ll be enough, as far as this book is concerned, to look at the modes of a spherical, inviscid fluid body. And we are not going to look at displacement, which is a 3-vector, but only at pressure—a scalar. (The sphere being fluid, hence incapable of shear, other components of the stress tensor are 0.) Because the earth’s outer surface is approximately free from stresses880, and we want our fluid sphere to behave, at least qualitatively, kind of like the earth, we shall require that pressure be 0 over the outer surface of a sphere whose center and radius (pre-deformation) coincide with those of the earth. This is algebraically hard to do if you insist on describing motion in a Cartesian reference frame, but becomes easy if you switch to spherical coordinates: which is what I am going to do. We know from Chap. 6, the bit on acoustic waves in water, that pressure881 in a fluid medium obeys the 3-D wave equation,

∇²δp(x, t) = (1/c²) d²δp(x, t)/dt²,   (copy of 6.26)
and we’ve learned in Chap. 8, note 839, what ∇² looks like in spherical coordinates. We’ve already used these two results together: look at note 578, where we solved Laplace’s equation (N.190)—which is like (6.26) minus its right-hand side—via separation of variables. I am going to do the exact same thing here, i.e., assume that there exist four functions, R = R(r), Θ = Θ(ϑ), Φ = Φ(ϕ) and T = T(t), such that

δp(r, ϑ, ϕ, t) = R(r)Θ(ϑ)Φ(ϕ)T(t)   (N.233)

solves (6.26); then plug (N.233) into the spherical-coordinate-Laplacian version of (6.26), and divide by R(r)Θ(ϑ)Φ(ϕ)T(t), so that

[1/(r²R(r))] d/dr [r² dR/dr] + [1/(r² sin ϑ Θ(ϑ))] d/dϑ [sin ϑ dΘ/dϑ] + [1/(r² sin²ϑ Φ(ϕ))] d²Φ/dϕ² = [1/(c²T(t))] d²T/dt²   (N.234)

—same as Eq. (N.191) except for the right-hand side. Next, I can do the separation-of-variables thing: T is a function of t only, and the left-hand side of (N.234) does not depend on t: it follows that we must have
[1/T(t)] d²T/dt² = −ω²,

with −ω² a constant: which I decided to write as minus a squared number because I remember the modes of a guitar string, note 292, where we had ended up with the exact same ODE: and we found that its general solution can then be written T(t) = a cos(ωt) + b sin(ωt), or, which is the same, T(t) = Ae^{iωt}, with A an arbitrary, complex constant. Now sub −ω² into the right-hand side of (N.234), and multiply both left and right-hand sides of what you get by r²:

[1/R(r)] d/dr [r² dR/dr] + [1/(sin ϑ Θ(ϑ))] d/dϑ [sin ϑ dΘ/dϑ] + [1/(sin²ϑ Φ(ϕ))] d²Φ/dϕ² + ω²r²/c² = 0.

The r-dependent stuff separates out,

[1/R(r)] d/dr [r² dR/dr] + ω²r²/c² = ν²,

or

r² d²R/dr² + 2r dR/dr + (ω²r²/c² − ν²) R(r) = 0,   (N.235)

and

[1/(sin ϑ Θ(ϑ))] d/dϑ [sin ϑ dΘ/dϑ] + [1/(sin²ϑ Φ(ϕ))] d²Φ/dϕ² = −ν².

The ϑ, ϕ PDE can also be separated into one ODE in ϑ and one ODE in ϕ: we’ve seen this in note 578, too: the ODEs are

[1/Φ(ϕ)] d²Φ/dϕ² = −m²,   (copy of N.193)

and

[sin ϑ/Θ(ϑ)] d/dϑ [sin ϑ dΘ/dϑ] + ν² sin²ϑ = m²,   (copy of N.194)
with m yet another arbitrary constant. Similar to note 578, then, Φ(ϕ) = Be^{imϕ}, with B an arbitrary, complex constant; and solutions to (N.194) can only be found if ν² = l(l + 1), with l a nonnegative integer: possible solutions are Θ(ϑ) = Plm(cos ϑ), where Plm means the associated Legendre function of degree l and order m, as before. The product

Θ(ϑ)Φ(ϕ) = B Plm(cos ϑ) e^{imϕ} = Ylm(ϑ, ϕ)

is going to be a factor in the general solution for δp(r, ϑ, ϕ, t) that we are seeking; it’s also what we call the spherical harmonic (see note 841) of degree l and order m.
The radial equation (N.235) is not quite the same as that in note 578. We did meet it before, though: look at the post-glacial rebound piece in Chap. 8: if you replace x with r, y with R, a with ω/c and b with ν, Eq. (N.235) is almost the same thing as the Bessel equation (8.44); the only thing that doesn’t fit is that factor 2 that multiplies the second term at the left-hand side of (N.235). But that’s no big deal, because let’s introduce a function, I don’t know, f(r) = √r R(r), so that

R(r) = f(r)/√r.   (N.236)

We have that

dR/dr = f′(r)/√r − f(r)/(2√r³)

and

d²R/dr² = f″(r)/√r − f′(r)/√r³ + 3f(r)/(4√r⁵),

which sub into (N.235), and

√r³ f″(r) − √r f′(r) + [3/(4√r)] f(r) + 2[√r f′(r) − f(r)/(2√r)] + (ω²r²/c² − ν²) f(r)/√r = 0,
or, after some algebra,

r² d²f/dr² + r df/dr + (ω²r²/c² − ν² − 1/4) f(r) = 0,

which, now, is exactly Eq. (8.44), with r instead of x, f instead of y, ω²/c² instead of a² and ν² + 1/4 instead of b². We know from Chap. 8 that this is solved by the Bessel functions882, i.e.,

f(r) = C J_{√(ν²+1/4)}(ωr/c),   (N.237)

where C is an arbitrary constant, and we can drop the Bessel functions of the second kind, Yb, just like I did in Chap. 8, because they become infinite at r = 0, which wouldn’t make any sense, physically, in our case—pressure, whether at the center of our fluid sphere or elsewhere, can’t possibly be infinite. It follows from (N.236) and from (N.237) that

R(r) = (C/√r) J_{√(ν²+1/4)}(ωr/c).

And actually, since we know, now, that ν² must be equal to l(l + 1), with l integer, for the solution of the ϑ-equation to exist, we might write

R(r) = (C/√r) J_{l+1/2}(ωr/c)   (N.238)

(because l(l + 1) + 1/4 = (l + 1/2)².) It took a little while, but now we have solutions for all four ODEs that our initial PDE—the 3-D wave equation—separated into. Putting them all together via (N.233),

δp(r, ϑ, ϕ, t) = (D/√r) J_{l+1/2}(ωr/c) Ylm(ϑ, ϕ) e^{iωt},   (N.239)

where D = ABC: the three arbitrary constants boil down to one. We still haven’t applied the boundary condition—the requirement, see above, that δp (and/or p) be zero at the outer surface of the sphere. That has to hold at all possible values of ϑ, ϕ, and t: and the only way that this can happen is if R(a) = 0, or

(1/√a) J_{√(ν²+1/4)}(ωa/c) = 0,   (N.240)
where a is the radius of the sphere. Now, as you can see from Figs. 8.3 or N.63, the Bessels are oscillatory functions whose sign, like that of a sine or a cosine, changes frequently. The values of the argument at which a Bessel function is zero—the “zeros” of the Bessel—are not exactly regularly spaced, like those of a sine or a cosine, but they are known—in the old days you would look them up in books; today on the Internet—if you don’t have some software, installed in your computer, that can spit them out for you (Matlab, Python, etc., have all sorts of routines and libraries for that). So, let’s call z_{l0}, z_{l1}, . . . , z_{ln}, . . ., all the way to n = ∞, the “zeros” of J_{l+1/2}; then, the b.c. (N.240) translates to

ωa/c = z_{ln}   (N.241)

(with n = 0, 1, 2, . . . , ∞). But a, like I said, is the radius of the sphere and c the speed of sound in the fluid that the sphere is made of. So, in practice, (N.241) means that only a discrete set of values of ω are allowed, ω = ω_{ln}, and

ω_{ln} = c z_{ln}/a.   (N.242)

All pressure fields δp(r, ϑ, ϕ, t) described by (N.239) can exist in our fluid sphere, provided that ω coincides with one of the ω_{ln}’s. All their linear combinations,

δp(r, ϑ, ϕ, t) = Σ_{n=0}^{∞} Σ_{l=0}^{∞} Σ_{m=−l}^{l} (D_{ln}/√r) J_{l+1/2}(ω_{ln} r/c) Ylm(ϑ, ϕ) e^{iω_{ln}t},   (N.243)
can also exist. Remember (note 578) that Legendre functions are zero, and, as a consequence, spherical harmonics are zero, when |m| > l: that’s why the sum over m in (N.243) starts at −l and ends at l. People call each term of that sum—defined by a triplet of values for n, l, m—a normal mode of the fluid sphere: in fact, they’re standing waves in three dimensions, just like, e.g., the modes of the guitar string (see note 292) are standing waves in 1-D. Modes with n = 0 are what people call “fundamental modes” (unlike a guitar string, a sphere has more than one fundamental mode); all other modes are “overtones”. At any given depth within the sphere, the pattern of oscillation associated with a mode is described by the corresponding spherical harmonic Ylm(ϑ, ϕ): if one could excite that mode alone, the value of δp would oscillate back and forth wherever Ylm is nonzero, while pressure would remain constant and equal to its hydrostatic value wherever Ylm is zero—just like at the nodes of the guitar string—except that nodes over the sphere are meridians and parallels. Of course, in a real-world experiment, in general, a multitude of modes will always be excited at the same time, resulting in a pretty complex pattern; but if you record how δp changes over time at a given point within the sphere, take the Fourier transform of that and plot its power spectrum (i.e., the squared magnitude of f(ξ), as per Eq. (N.67), plotted as a function of frequency ξ), you should see a bunch of fairly well isolated peaks emerge, at frequencies given by (N.242). Which peaks are higher depends on how the sphere has been set into motion—i.e., on the initial conditions: just like with the guitar string.

Fig. N.57 After Dahlen and Tromp’s Theoretical Global Seismology: “Radial-component, long-period amplitude spectrum [...] of the ground acceleration recorded at station TUC in Tucson, Arizona following the June 9, 1994 deep-focus Bolivia earthquake.” Dahlen and Tromp label each peak with the name of the corresponding mode, which you can figure out by comparison with the theoretical spectrum for the same quake, same receiver (the locations of source and receiver, and the geometry of the earthquake fault, won’t change the modes; but they will determine how much each mode is excited with respect to the others: so no matter where the quake and/or the station are, you’ll see the same peaks: but the amplitudes of the peaks might change quite a bit); S stands for spheroidal mode (long story); the subscript to the left is the value of n (i.e., the index of the radial function); the subscript to the right is the value of l. m is not specified, because remember that in the first approximation the modes’ frequencies are degenerate, so it’s hard to tell apart frequencies associated with different m’s; but you can see, e.g., that mode 1S4 is clearly split. (Used with permission of Princeton University Press, from Dahlen and Tromp, Theoretical Global Seismology, 1998; permission conveyed through Copyright Clearance Center, Inc.)

Now, the earth works kind of like our fluid sphere: it’s got a stress-free outer boundary, which limits the frequencies of oscillation—and the wavelengths of the radial patterns of the modes—to a discrete spectrum. If (remember note 482) you take a seismogram, Fourier-transform it, and plot the transform’s spectrum, you’ll also see peaks, corresponding to the values of the modes’ frequencies ω_{ln}. See the example in Fig. N.57. The earth is not an inviscid fluid, though, of course, so there might be shear stresses in it, and, as a result, the math that describes the normal modes of the earth is more complex than what we’ve seen here. In any case, seismologists
have put together all the theory needed to calculate our planet’s deformation—after a quake or any other relatively small disturbance—as a sum of its normal modes: they can do that based on whatever model of its seismic velocity and density, provided it’s reasonably close to spherically symmetric (which is probably the case for the true earth). Looking at the modes’ frequencies is useful to figure out the global structure of the earth—particularly in places like the core, where it’s not that easy to measure body waves that have propagated all the way to that depth. Mode frequency data can also constrain the earth’s relatively weak 3-D heterogeneity that I am talking about in this chapter. The frequencies of a perfectly spherically symmetric planet’s modes, just like the ω_{ln} of the fluid sphere, depend on the degree l, and on the “overtone number” n, i.e., the index of the radial function (which in the case of a solid, elastic and/or viscous planet is not just a simple Bessel function—but let’s not worry about that, now); but, if the planet is slightly asymmetric, like we are learning the earth is, then the frequencies are said to “split”—there’s a different frequency for each value of m (Fig. N.57). Now, how exactly that is done is a bit of a long story (which you can learn, e.g., in Lapwood and Usami’s Free Oscillations of the Earth883 or in Dahlen and Tromp’s Theoretical Global Seismology884), but, basically, seismologists are able to constrain the earth’s 3-D heterogeneity based on measurements of the modes’ frequencies’ splitting—their deviations with respect to their “degenerate” (yes, this is how we call them), i.e., non-m-dependent, theoretical, spherically-symmetric-earth values.
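As an aside, the fluid-sphere eigenfrequencies of Eq. (N.242) are easy to get numerically. Below is a minimal sketch (assuming Python with scipy; the radius and sound speed are made-up round numbers, not a real earth model) that finds the zeros z_{ln} of J_{l+1/2} by bracketing sign changes, then converts them to frequencies via ω_{ln} = c z_{ln}/a:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import jv


def bessel_zeros(order, n_zeros, dx=0.05):
    """First n_zeros positive zeros of J_order, found by bracketing sign changes."""
    zeros, x = [], dx
    f_prev = jv(order, x)
    while len(zeros) < n_zeros:
        f_next = jv(order, x + dx)
        if f_prev == 0.0:
            zeros.append(x)
        elif f_prev * f_next < 0:  # sign change: a zero hides in (x, x + dx)
            zeros.append(brentq(lambda t: jv(order, t), x, x + dx))
        x, f_prev = x + dx, f_next
    return zeros


a = 6.371e6  # sphere radius in m (earth-sized, for illustration only)
c = 8.0e3    # sound speed in m/s (a made-up, mantle-like number)

for l in range(3):
    z = bessel_zeros(l + 0.5, 3)
    f_mHz = [c * zn / a / (2.0 * np.pi) * 1e3 for zn in z]  # omega = c z / a
    print(f"l = {l}: f =", [round(f, 3) for f in f_mHz], "mHz")
```

For l = 0, J_{1/2}(x) is proportional to sin(x)/√x, so its zeros are just nπ and the frequencies come out evenly spaced; for higher l they don’t—exactly the irregular spacing of Bessel zeros mentioned above.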
In practice, this affords us a lower resolution than the tomography methods that you are going to learn about here; but modes, for one, are directly sensitive also to the earth’s density, rather than just vS and vP; and they have access to places, inside the planet, that are not so well illuminated by the body-wave data that we have. Bottom line, if you really want the best possible 3-D picture of the earth’s interior, you should really make use of both databases.
697. Consider that, back then, they’d have to collect hard copies of seismograms, as in rolls of actual paper. They’d spread them on their desks, then, and look for wiggles, and “pick” the arrivals of various phases, etc. All done with pen and paper: no computers.
698. Around this time, data became digital, so that it became easier to collect them, and to process them, etc.: the speed and the ease of doing these things have been improving steadily since then.
699. Using, e.g., the recipe I just gave you, and Eq. (9.46), because those models are all made of layers.
700. It goes without saying that to measure t you need to know when precisely the quake occurred—if we are talking quakes. If we are talking “active” seismology, no problem—you’ll know at what time you set off your explosive. If we are indeed talking quakes, then the source must have been located beforehand, like I showed you it can be done in Chap. 7; once the epicenter is known, time can be figured out from empirical tables such as that of Fig. 7.1, Chap. 7. Except,
that figure is more than a century old, and today we have many more data, and much better tables as a result. 701. See note 56. 702. Even though, in practice, they won’t really form a complete basis of the space of all possible functions δv(x): for that you’d need an infinite set of functions, and that’s not something that we can afford in the real world. 703. Basis functions don’t necessarily need to be pixels (or, if you are in three dimensions, voxels), of course. In the early days of global tomography, people like Dziewonski and Woodhouse would prefer to use spherical harmonics (which see note 841). Any spherical harmonic is nonzero over most of the globe, so when the f i ’s are spherical harmonics, the integral in (9.57) is nonzero for all i’s, and, as a result, the coefficients of A are all nonzero. This has an interesting practical consequence: compared to the A you get from a pixel parameterization, and if you have, say, about the same number of pixels as you have spherical harmonics, the spherical-harmonics A will fill up much more space in your computer’s memory and/or drive. Because sparse matrices—matrices whose coefficients are mostly zero, like the “pixel” version of A—can be stored in a clever way, where rather than filling your computer with zeros, you write into one array the values of the nonzero coefficients only, while in another array, of the same size as the first, you’ll have one integer number per nonzero coefficient—a number that tells you where that nonzero coefficient is in the matrix—its row and column. And that turns out to save you a lot of memory and disk space. 
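The two-array storage scheme just described can be sketched in a handful of lines (a toy, coordinate-style version with function names of my own invention; real codes use formats like CSR, e.g. via scipy.sparse):

```python
def sparse_pack(dense):
    """Keep only nonzero entries: one array of values, one of flattened (row, col) indices."""
    n_rows, n_cols = len(dense), len(dense[0])
    values, indices = [], []
    for i in range(n_rows):
        for j in range(n_cols):
            if dense[i][j] != 0:
                values.append(dense[i][j])
                indices.append(i * n_cols + j)  # row and column folded into one integer
    return values, indices, (n_rows, n_cols)


def sparse_unpack(values, indices, shape):
    """Rebuild the dense matrix from the two arrays."""
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for v, k in zip(values, indices):
        dense[k // n_cols][k % n_cols] = v
    return dense


A = [[0, 0, 3],
     [1, 0, 0],
     [0, 2, 0]]
values, indices, shape = sparse_pack(A)
print(values, indices)  # [3, 1, 2] [2, 3, 7]
assert sparse_unpack(values, indices, shape) == A
```

Here 9 entries shrink to 3 values plus 3 indices; for a tomography matrix that is mostly zeros, that kind of saving is what makes very fine pixel grids affordable.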
In the early 1990s, Yoshio Fukao, Rob van der Hilst, Steve Grand and others figured that this meant that, given the same amount of computer power—memory, disk space, speed—which important seismology labs around the world could probably afford roughly the same kind of so-called “supercomputers” at the time—pixel tomography could achieve finer resolution than spherical-harmonic tomography. In other words, if one abandoned spherical harmonics in favor of pixels, taking advantage of this sparse-matrix storage trick, they’d be able to “parameterize” seismic velocities on very fine grids, and ultimately come up with “high-resolution” tomography maps—significantly higher than those you could get via spherical harmonics. They turned out to be right; and the models they got, and published in the middle of the nineties, were not, like, totally different from those of Dziewonski and company—but they had just enough more detail, at least in some areas, that some previously, uhm, unseen “features” became prominent. The most important being the so-called Farallon and Tethys slabs: which see Figs. 9.9 and 9.10: subducting plates sinking all the way to the bottom of the mantle, apparently proving beyond any reasonable doubt that convection is a “whole-mantle” thing—not layered. Adam Dziewonski was skeptical of all this pixel stuff; particularly of Rob van der Hilst’s work—to be honest, I don’t think they liked each other that much, or in any case Adam didn’t like Rob that much, or didn’t like his work that much, which for Adam would amount to the same thing, and for Rob too,
probably; anyway, so, this was happening at the time when I moved to the U.S. to do my Ph.D. precisely under Adam, and Adam told me to write my own pixel/sparse-matrix-storage code and see whether Rob hadn’t done something wrong, somewhere. (OK, he didn’t put it exactly that way, but almost.) So I wrote the code, and started getting results that looked a lot like what Rob had gotten; and I couldn’t really figure out what exactly was wrong with what Rob had done. Except for maybe some rather minor technicalities that I don’t really want to get into right now: but at the time those technicalities seemed like a big deal, I guess, or we made a big deal out of them, and got into a fight, sort of, with Rob, which I don’t think was very useful for the, uhm, advancement of science (or of my career, for that matter). Anyway, things pass, I worked on other stuff, too, graduated, went back to Europe, etc. And I don’t think Rob is still mad at me, although I am pretty sure he was for a while. He shouldn’t be, though, actually, because, after all, he is a big shot at M.I.T., and I a small professor in Italy.
704. This is not true if you are doing “active” seismology—which is the kind of seismology where people make their own sources, via explosives or whatever; this is usually done at relatively small scale—kilometers, tens of kilometers—where it is also feasible to deploy arrays of equally spaced sensors. And so the problems that I am talking about are not that severe anymore. You can’t do this at larger scales, because man-made explosions are small compared to earthquakes, and cannot be recorded at large distances. Yes, OK, there’s nuclear explosions, which are quite large, but have a number of, uhm, drawbacks, as I am sure you realize.
705. We’ve met the least-squares criterion in Chap. 8 already: that’s how Bullard estimated plate displacements, and that’s how people studying the earth’s magnetic field extrapolated the field’s global pattern from geographically sparse data. See notes 843 and 894.
706. Which I did in notes 843 and 894 already.
707. That is not always true, actually. Algorithms that implement (9.60) might fail even though Aᵀ · A is square. To see how that might happen, let’s look at one such algorithm—the simplest one, and the one to which most other algorithms are related: Gaussian elimination885. Take, e.g., a very simple problem M · x = b, where M is a 3 × 3 matrix and b is a 3-vector,

    ⎛ 1 2 1 ⎞        ⎛ 2 ⎞
M = ⎜ 2 6 1 ⎟,   b = ⎜ 7 ⎟,   (N.244)
    ⎝ 1 1 4 ⎠        ⎝ 3 ⎠

and the vector x = (x, y, z)ᵀ is what we need to find out. The linear system

⎧ x + 2y + z = 2,
⎨ 2x + 6y + z = 7,   (N.245)
⎩ x + y + 4z = 3,
is just another way of writing (N.244). Gaussian elimination is a stepwise procedure, where at each step you multiply one row of the system (N.245) by a factor, and sum or subtract the result to or from another row. (This of course doesn’t change the info contained in the system—it only rearranges it.) At the first step, you want to get rid of the coefficient of x (the first component of x) in the second row of the system (we won’t touch the first row). For that, you subtract from the second row of (N.245) twice the first row, and replace the second row with the result of that,

⎧ x + 2y + z = 2,
⎨ 2y − z = 3,   (N.246)
⎩ x + y + 4z = 3.

Next, we take care of the coefficient of x in the third row of (N.246): you just subtract the first row from the third, and replace the latter with the result of that, and

⎧ x + 2y + z = 2,
⎨ 2y − z = 3,
⎩ −y + 3z = 1.

Then there’s the coefficient of y, i.e., the second component of x, in the third row. If you sum the third row with half the second, and replace the third with the result, then

⎧ x + 2y + z = 2,
⎨ 2y − z = 3,   (N.247)
⎩ (5/2) z = 5/2.

The system (N.247) is easy to solve, because there’s only one unknown left in its last row, so we see right away that z = 1. If we plug that into the second-to-last row, which has only two unknowns, we immediately get y = 2; and so on and so forth. People say that a linear system like (N.247) is “upper triangular”; and we just solved it by “back substitution”. Gaussian elimination is the algorithm by virtue of which we reduced the system (N.245) to its upper-triangular form (N.247). This was a super simple example, with only three unknowns; but you can check for yourself, Gaussian elimination will work on a system of any size886.
Now, think about how Aᵀ · A might look like, in practice. The i, j coefficient of Aᵀ · A, which we might denote [Aᵀ · A]ij, is the dot product of columns i and j of A. If a pixel, say pixel k, is not “sampled” at all by the data (i.e., is not crossed by any ray path in the database), then [Aᵀ · A]ik = [Aᵀ · A]kj = 0 for all values of i and j, and this includes the diagonal entry, [Aᵀ · A]kk = 0. If pixel k is sampled by few data only (“undersampled”), then [Aᵀ · A]ik, etc., are not zero, but they might be very small. Other entries of Aᵀ · A, e.g., diagonal entries corresponding to oversampled pixels, might be comparatively large.
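Here, for what it’s worth, is the elimination-plus-back-substitution recipe in code (a bare-bones sketch, no pivoting, function name mine), run on the toy system (N.245); the line computing `factor` is exactly where a zero on the diagonal, i.e., an unsampled pixel, would make it blow up:

```python
def gauss_solve(M, b):
    """Solve M x = b by Gaussian elimination plus back substitution (no pivoting)."""
    n = len(b)
    A = [row[:] + [rhs] for row, rhs in zip(M, b)]  # augmented matrix [M | b]
    for k in range(n):                    # zero out everything below the diagonal
        for i in range(k + 1, n):
            factor = A[i][k] / A[k][k]    # a zero pivot A[k][k] fails right here
            for j in range(k, n + 1):
                A[i][j] -= factor * A[k][j]
    x = [0.0] * n                         # back substitution, bottom row first
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (A[i][n] - s) / A[i][i]
    return x


M = [[1, 2, 1], [2, 6, 1], [1, 1, 4]]
b = [2, 7, 3]
print(gauss_solve(M, b))  # [-3.0, 2.0, 1.0]: i.e., z = 1, y = 2, x = -3
```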
When data coverage is dense and spatially uniform—i.e., all pixels are sampled, and they are all sampled about equally well—then this is not going to happen, and Gaussian elimination works fine on Aᵀ · A. But when data coverage is nonuniform, and you do Gaussian elimination (or some similar algorithm), then you’ll end up trying to divide some numbers by zero—which can’t be done, of course, and so the “inversion” will fail: the code you are using will probably just stop, and give you some kind of error message. Or maybe there will be no divisions by zero, strictly speaking, but still, you’ll have to divide a relatively very large number by a very small one, which (I am not going to get into the subtle details of how computers do arithmetic) is something that can easily create problems—meaning, large “numerical errors”: so large that your cLS will be crap. What most tomographers do, in practice, to avoid such crap, is a trick called “regularization”, or “damping”. The idea is, you combine the info about the earth’s structure that is carried by seismology data, and contained in Eq. (9.58), with some extra, independent info coming from some other source. The simplest example that I can think of is: you have good reasons to believe that your reference model is not that bad, otherwise you wouldn’t have picked it as a reference model in the first place; plus, we know that the earth is, to a good approximation, spherically symmetric—the velocity heterogeneities that tomographers are after are, in most cases, pretty small. So, it is OK to require that some measure of the overall magnitude of δv/v0 be as small as it could possibly be, while |δt − A · c|² is still reasonably close to its minimum. People sometimes call this magnitude the “norm” of the tomography model; you can define it, for example, as the integral ∫Ω (δv/v0)²(x) d²x, where Ω means the full solid angle, i.e., the entire surface of the earth. To translate this new constraint into a linear equation in c, use Eq. (9.53), which implies that

∫Ω (δv/v0)²(x) d²x = ∫∫ (δv/v0)²(ϑ, ϕ) dϑ dϕ = ∫∫ [Σi ci fi(ϑ, ϕ)]² dϑ dϕ,

with the sums running over i = 1, 2, . . . , n.
Requiring that the expression at the right-hand side be minimum is the same as requiring that (see note 615)
0 = ∂/∂ck ∫∫ [Σi ci fi(ϑ, ϕ)]² dϑ dϕ
  = ∫∫ ∂/∂ck [Σi ci fi(ϑ, ϕ)]² dϑ dϕ
  = 2 ∫∫ [Σi ci fi(ϑ, ϕ)] ∂/∂ck [Σi ci fi(ϑ, ϕ)] dϑ dϕ
  = 2 ∫∫ [Σi ci fi(ϑ, ϕ)] [Σi δik fi(ϑ, ϕ)] dϑ dϕ
  = 2 ∫∫ [Σi ci fi(ϑ, ϕ)] fk(ϑ, ϕ) dϑ dϕ
  = 2 Σi ci ∫∫ fi(ϑ, ϕ) fk(ϑ, ϕ) dϑ dϕ
  = 2 Σi ci δik
  = 2ck
for all values of k = 1, 2, . . . , n. This can be written

⎧ c1 = 0,
⎨ c2 = 0,
⎪ . . .
⎩ cn = 0,

or I · c = 0, where I is the n × n identity matrix. The “damped” or “regularized” least-squares solution c^D_LS is the least-squares solution of the linear system you get by combining this with (9.58),

⎛ δt ⎞   ⎛ A ⎞
⎝ λ0 ⎠ ≈ ⎝ λI ⎠ · c,

where λ is a constant, real number, whose value reflects how much one trusts the regularization constraint relative to the actual data. (If the data are relatively noisy, you’ll tend to pick a large value for λ, etc.; if λ is very large compared to the average value of the entries of A, though, the data become negligible and we’ll just get c = 0 as a solution; if λ = 0, we are back to the case with no regularization; ideally, you want to be somewhere in between.) Because, obviously, λ0 = 0, we might as well write

⎛ δt ⎞   ⎛ A ⎞
⎝ 0 ⎠ ≈ ⎝ λI ⎠ · c.

I hope it is clear that the left-hand side of this is the vector you get when you append n zeros to δt, i.e., if there’s m travel-time observations in the data set, you replace δt with (δt1, δt2, . . . , δtm, 0, 0, . . . , 0)ᵀ. Likewise, you get the matrix on the right-hand side by appending the n × n matrix λI to A, that is,

⎛ A ⎞   ⎛ A11 A12 . . . A1n ⎞
⎝ λI ⎠ = ⎜ A21 A22 . . . A2n ⎟
        ⎜ . . . . . . . . .  ⎟
        ⎜ Am1 Am2 . . . Amn ⎟
        ⎜ λ   0   . . . 0   ⎟
        ⎜ 0   λ   . . . 0   ⎟
        ⎜ . . . . . . . . .  ⎟
        ⎝ 0   0   . . . λ   ⎠

Now, if we replace A with the matrix above, and δt with the zero-padded vector above, in (9.60), we find (I spare you all the algebra, but I promise you, it is not so hard: so, if you feel so inclined, you should go ahead and give it a try)

c^D_LS = (Aᵀ · A + λ²I)⁻¹ · Aᵀ · δt,
AKA the damped least squares solution, like I said.
Minimizing the model’s norm isn’t the only way to regularize an inverse problem. For instance, another popular choice is minimization of what people call the model’s roughness, of which a typical measure is the integrated, squared gradient of δv/v0, i.e., ∫Ω |∇(δv/v0)(x)|² d²x. This can be translated into a condition on the coefficients c, in much the same way as we just did, except it’s trickier, because what’s the gradient of a pixel function defined as in (9.59)? There are ways around this, of course, but nothing we should spend a lot of time on, in this book. One option is to drop pixels in favor of smooth basis functions, like, e.g., spherical harmonics (see note 703).
708. See note 703.
709. See note 707.
710. See note 628.
711. See note 154.
712. I must have mentioned that the temperature of some amount of material, as we observe it at the “macroscopic” scale, is determined by how fast atoms oscillate (or, in a gas, by how fast they move around and collide with one another).
713. Remember the piece on radiometric dating in Chap. 8. Which we covered in Chap. 1, actually, when we looked at Cavendish’ work.
714. Or a perfectly elastic string (see notes 279 and 292).
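Note 707’s damped least-squares formula is also easy to sanity-check numerically: solving the zero-padded, λI-augmented system by ordinary least squares should give the same coefficients as the closed form (Aᵀ · A + λ²I)⁻¹ · Aᵀ · δt. A sketch, with a small random A and δt standing in for a real tomography problem:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((12, 4))   # 12 fake "travel-time" rows, 4 "pixels"
dt = rng.standard_normal(12)
lam = 0.5                          # the damping parameter lambda

# closed form: (A^T A + lambda^2 I)^{-1} A^T dt
n = A.shape[1]
c_closed = np.linalg.solve(A.T @ A + lam**2 * np.eye(n), A.T @ dt)

# least squares on the augmented system [A; lambda I] c ~ [dt; 0]
A_aug = np.vstack([A, lam * np.eye(n)])
dt_aug = np.concatenate([dt, np.zeros(n)])
c_aug = np.linalg.lstsq(A_aug, dt_aug, rcond=None)[0]

print(np.allclose(c_closed, c_aug))  # True: the two routes agree
```

The augmented system minimizes |A·c − δt|² + λ²|c|², whose normal equations are exactly (Aᵀ·A + λ²I)·c = Aᵀ·δt—which is why the two routes have to agree.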
Fig. N.58 Kelvin-Voigt viscoelasticity model: perfectly elastic spring with elastic constant E attached—in parallel—to a dashpot with viscosity η
715. Or some other spring-plus-damper system: I mentioned that Maxwell’s is the simplest form of viscoelasticity that there is: but there are others. People have figured out, and used, the equations that you get if you connect dashpot and spring in parallel, see Fig. N.58, rather than in series (that’s called the Kelvin-Voigt model, I think); they’ve also taken a Maxwell solid and connected that, in parallel, with another spring—that’s called the Zener model, or, for some reason, the “standard linear solid”. And then, there’s materials which behave in such a way that you can’t really accurately represent them with any of these models, because their very viscosity, the way we’ve defined it in this book, would change depending on how much stress they’re under. I don’t really want to get into this, though. (But see note 680.)
716. See note 877.
717. A specialist, maybe the specialist of mantle circulation models. At the time he was at the University of California, L.A.; then a few years later he became professor at the ETH of Zürich, where he leads a big group, doing all kinds of numerical models of mantle flow.
718. This is a fairly optimistic statement, I guess—even now, i.e., twenty years after Paul’s paper was published. If reconstructed plate motions are prescribed as the boundary condition, then yes, circulation models kind of look a bit like tomography. But without that constraint, not really. As I write my book, I know that, e.g., Nicolas Coltice and his people at the École Normale Supérieure, in Paris, are managing to make models that, without a-priori imposed plate motions, do look statistically like the earth: at their surface plates form, whose number and sizes are about the same as we observe in the real world; and below the surface we see slabs and plumes and superplumes, like in tomography models.
For whatever reason, it seems that transform faults don’t emerge naturally in circulation models—“Our next major goal is to study transform faults”, says Coltice887 , “such as the San Andreas fault. In my model these active boundaries are diffuse and broad, but on Earth they are prominent structures, and we don’t know how they initiate and evolve”, etc.
719. M. D. Ballmer, C. Houser, J. W. Hernlund, R. M. Wentzcovitch and K. Hirose, “Persistence of Strong Silica-Enriched Domains in the Earth’s Lower Mantle”, Nature Geoscience, vol. 10, 2017.
720. In his American Scientist article, which see note 718.
721. Because, yes, most hotspots, i.e., the places where OIB is erupted—which are mostly ocean islands and, if Wilson’s idea is correct, the surface expression of mantle plumes—are scattered in the Pacific Ocean and around southern Africa.
722. The average elevation is quite big, like 1 km above sea level: and it seems that this can’t be explained in terms of a thicker-than-average root, i.e., isostasy doesn’t work. The explanation that most people agree upon, for this and other similar places across the planet, is that isostasy doesn’t hold because you can’t do a simple, static force balance as per Archimedes’ principle: because the mantle under southern Africa might be flowing very strongly in the upwards direction: which means it’s pushing southern Africa up: which messes up the isostatic force balance: it’s an extra force whose magnitude we don’t really know, but which we have to take into account. Seismic tomographers agree that the mantle under southern Africa is slow, therefore (based on what we learned in this chapter) hot, therefore presumably buoyant and rising. The contribution to topography that isn’t accounted for by isostasy is called dynamic topography.
723. The law of sines says that, in any triangle, the ratio of the length of a side to the sine of its opposite angle is the same for all three sides.
724. A very simple example of linearly independent functions: the sine and the cosine. The question is, can I find a pair of values for the coefficients c1, c2 (other than the obvious c1 = c2 = 0), such that c1 cos x + c2 sin x = 0 for all x?
Well, when x = 0 the expression I just wrote boils down to c1 cos 0 + c2 sin 0 = 0, but cos 0 = 1, and sin 0 = 0, and so the only way for this to be 0 is that c1 = 0. Likewise when x = π2 we have c1 cos
π π + c2 sin = c2 , 2 2
which the only way for this to be 0 is c2 = 0. But so then, c1 cos x + c2 sin x can’t possibly be 0 for all x unless c1 = c2 = 0: which is the same as saying that the sine and the cosine are linearly independent, QED. 725. Let X be a matrix such that A · X = I. (N.248)
Take the dot product X · A and dot it with itself, (X · A) · (X · A) = X · (A · X) · A = X·I·A = X · A,
because of (N.248), and because matrix multiplication is associative. Now, if you’ve got two matrices, call them M1 and M2, such that M1 · M2 = M2, and M2 happens to be invertible, then, multiplying both sides by the inverse of M2, it follows that M1 = I. In our case, M1 = M2 = X · A; and X · A is indeed invertible, because taking determinants in (N.248) gives det(A) det(X) = 1, so that det(X · A) = det(X) det(A) = 1, which is not zero. Bottom line, I’ve proven that if A · X = I, then X · A = I, QED.
726. Provided, of course, that you pick the identity matrix with the right number of rows/columns.
727. I took the formula from a paper by a James G. Williams, “Contributions to the Earth’s Obliquity Rate, Precession, and Nutation”, The Astronomical Journal, vol. 108, 1994.
728. The angle between the earth’s own axis of rotation, and the plane of its orbit around the sun.
729. The ratio of the difference, versus the sum, of the farthest and closest distance between earth and sun: i.e., a measure of how far from circular the earth’s orbit around the sun is.
730. I’ve seen this Agricola reference a bunch of times, but went—admittedly, very quickly—through Agricola’s De Re Metallica and wasn’t able to find the place where he talks about this. The date 1530 probably refers to the book Agricola published in that year, Bermannus, which I haven’t found; but, if I understand it correctly, De Re Metallica, which came out after Agricola’s death, is really just a much expanded version of Bermannus. Agricola, by the way, is the Latin name of Georg Bauer, a German renaissance scholar who studied mining and metals. In the preface of De Re Metallica he wrote that he would exclude from his book “all those things which I have not myself seen, or have not read or heard of [...]. That which I have neither seen, nor carefully considered after reading or hearing of, I have not written about.” A foreshadow of enlightenment, I guess. If you read any (history of the) earth sciences, you are likely to find him mentioned as the “father of mineralogy”.
731. The Collège de France hires a few very prominent scholars to teach to literally anyone who’s interested in taking the lectures—which are free and open to all. (You need to understand French, though, because the lectures are delivered in French.) It’s in Paris, not far from the Sorbonne building on rue des Écoles, and has been around since 1530.
732. “Unfinished Business: William Herschel’s Sweeps for Nebulae”, History of Science, vol. 43, 2005.
733. The World in a Crucible: Laboratory Practice and Geological Theory at the Beginning of Geology, Geological Society of America, 2009.
734. Presumably drawn by his wife, see note 742.
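Note 725’s result—that for square matrices a right inverse is automatically a left inverse—is easy to sanity-check numerically. A minimal sketch (NumPy assumed; the matrix is an arbitrary example of mine, not anything from the book):

```python
import numpy as np

# A generic square matrix (invertible with probability 1 for random entries).
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

# Build X so that A . X = I, i.e. X is a right inverse of A.
X = np.linalg.inv(A)

# Note 725 says X must then also be a left inverse: X . A = I.
print(np.allclose(A @ X, np.eye(3)), np.allclose(X @ A, np.eye(3)))  # prints: True True
```

Of course `np.linalg.inv` already hands you the two-sided inverse; the point of the check is only that the right- and left-hand products both come out as the identity.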
735. Long before I thought about writing this book, one day near the entrance of my department’s library (I was still in Switzerland back then) I saw that there were a whole bunch of books in the discard tray. There was a copy of Hutton’s Abstract of a Dissertation, which I picked up, and then there was a xeroxed booklet by one Vladimir Belousov, which I had never heard of, but it turns out he was a pretty big guy in the earth sciences in the middle of the twentieth century. Anyway, the little book’s title was Essays on the History of Geology, and I picked it up as well. He’s quite sharp in dissing “old” science: “Geology traveled a difficult way”, he says. “It did not progress by leaps and starts from one genius to another. On the contrary, it was a statistical summing up of the many little intellectual steps in most directions, and in fantastic collisions or combinations. This latent molecular process led to gradual extinction of delusions and affirmation of sound and practical ideas which evolved with the time both in terms of wording and contents. “Imperceptibly, the numerous zigzags of imaginative thought which had bred fantastic ideas of male and female minerals turned by the start of the 18th century into the concept of ‘mineral juices’ which contained the sound idea of rocks and minerals as acqueous sediments.” In Belousov’s view of science, I guess, there’s one right path, the path of Progress, of sound scientific ideas, which eventually are bound to prevail on delusions and imaginative thought. Interestingly, Belousov was also one of the few scientists who, in the early 1970s, didn’t go for the theory of plate tectonics. “It is evident”, he wrote in 1970, “that not a single aspect of the ocean-floor spreading hypothesis can stand up to criticism. This hypothesis is based on a hasty generalization of certain data whose significance has been monstrously overestimated. It is replete with distortions of actual phenomena of nature and with raw statements. 
It brought into the earth sciences an alien rough schematization permeated by total ignorance of the actual properties of the medium.” (“Against the Hypothesis of Ocean-Floor Spreading”, Tectonophysics, vol. 9, 1970). (If you have never heard about plate tectonics, that’s fine: you’ll learn about it in Chap. 9. When you get to the end of that chapter, maybe you’ll think about this note again.)
736. Check out, e.g., the book by Hallam, see note 94, if you want to find out about at least some of the others.
737. “An Estimate of the Geological Age of the Earth”, Scientific Transactions of the Royal Dublin Society, vol. 7.
738. We also learn from Wyse Jackson that “John Joly was born on 1 November 1857 in Hollywood House (the Rectory), Bracknagh, County Offaly, the third and youngest son of the Reverend John Plunket Joly (1826–1858) and Julia Anna Maria Georgina née Comtesse de Lusi. The Joly family originated from France, but came to Ireland from Belgium in the 1760s. [...] After his father’s sudden death at a young age, the Joly family moved to Dublin where John Joly received his secondary education [...]. Although he did not excel in the classroom, he was popular, nevertheless, and became known as ‘The Professor’ on account of his tinkering with chemical apparatus and other gadgets. [...]
“In adulthood Joly was a distinctive and unforgettable man. Tall, with hair swept off his forehead, a bushy moustache, and pince-nez perched on his nose, he spoke with what was considered to be a foreign accent, but in reality the rolled r’s were simply used to conceal a slight lisp. [...] Although Joly considered a career abroad, he acceded to his widowed mother’s wishes and remained in Ireland, at Trinity College, Dublin”, etc.
739. Which includes a 1715 paper by Edmond Halley (the astronomer who first realized that the comet that now bears his name kept coming back), “A Short Account of the Cause of Saltiness of the Oceans, and of the Several Lakes that Emit No Rivers; With a Proposal by Help Thereof, to Discover the Age of the World”, published in the Philosophical Transactions. “In 1715”, explains Wyse Jackson, “Halley had proposed to the Royal Society that salt concentrations in lakes that had no discharge rivers should be measured every 100 years, as he considered that from the incremental increase of the salt, the age of the lake could eventually be deduced. Once enough data had been collected over time, inferences about the age of the ocean, and therefore of the Earth, could be drawn from the results.” Halley, then, “recommended to the Society that experiments be started ‘for the benefit of future Ages’. But the Society does not seem to have heeded his advice and Halley’s idea was only rediscovered in 1910”. I haven’t read Halley’s paper, but clearly his idea is very close to that of Joly; anyway, in 1899 when he published his own work, “Joly was unaware, as were his contemporaries, of the pioneering work of the English astronomer Edmond Halley”, etc.
740. I can’t help but think of Raj from The Big Bang Theory. Maybe the screenwriters were inspired by this?
741. It is said on the Royal Society’s website that “the Copley Medal is the Society’s oldest and most prestigious award. The medal is awarded for outstanding achievements in research in any branch of science.
“First awarded in 1731 following donations from Godfrey Copley, FRS, it was initially awarded for the most important scientific discovery or for the greatest contribution made by experiment. The Copley Medal is thought to be the world’s oldest scientific prize and it was awarded 170 years before the first Nobel Prize. Notable winners include Benjamin Franklin, Dorothy Hodgkin, Albert Einstein and Charles Darwin. The medal is of silver gilt, is awarded annually, alternating between the physical and biological sciences (odd and even years respectively), and is accompanied by a gift of £25,000”.
742. Marie-Anne actually studied art in David’s atelier around 1785-86; that’s how she learned to draw. Art critic Thomas B. Hess wrote about this painting, “to understand the revisionist, in fact revolutionary, tone of the portrait, put yourself in the shoes of a spectator in 1788. Consider the cluster of the four hands. There are no rings. How many Rococo or Baroque portraits of wealthy patrons can you recall in which the subjects aren’t bejeweled—rings, bracelets, brooches, pearls, fobs, medals—and wearing costly silks or furs? The Lavoisiers’ only appurtenances are those of the intellect—some scientific instruments, a portfolio of graphics. [...] It’s a starkness that must have shocked—like a trumpet blast announcing the appearance of the New Rational Man and the New Liberated Woman united in a marriage of intellects, with all the fripperies of class cleaned away.” (In “David’s plot”, New York Magazine, 1977.)
743. In the 2020s, David’s painting (Fig. N.8) was X-rayed in ways that were not possible back in the 1970s: something that people do, nowadays, to see how the painter did it—directly the way we see it now, or via multiple versions and corrections. “The results revealed”, write Silvia Centeno et al., “that the first version depicted not the progressive, scientific-minded couple that we see today, but their other identity, that of wealthy tax collectors and fashionable luxury consumers.” (S. A. Centeno, D. Mahon, F. Carò and D. Pullins, “Discovering the Evolution of Jacques-Louis David’s portrait of Antoine-Laurent and Marie-Anne Pierrette Paulze Lavoisier”, Heritage Science, vol. 84, 2021.) At some point before or during the revolution, the Lavoisiers must have realised that being “wealthy tax collectors and fashionable luxury consumers” wasn’t safe, and asked David to replace jewels with science tools, get rid of Marie-Anne’s fancy hat, etc.—do away with “all fripperies of class”, indeed (see note 742).
744. Which recalls another famous painting by David: The Death of Marat. At that point Jacques-Louis David, to whom Lavoisier and/or his wife had commissioned their portrait about five years earlier (see note 742), had become a very powerful man—“one of the three members of the dread Committee for Public Safety”, writes Hess (see note 742). “Surviving documents from May, 1793, when Lavoisier was arrested, and from November, 1794, when he was executed, indicate that David was busy signing warrants for arrest, death, and pardon. The piece of paper condemning Lavoisier hasn’t been found, nor are we sure about David’s attitude toward his former patron. [...] In the portrait, Lavoisier looks dotingly at his wife. She aims a beautifully straight blue gaze at you—or at the painter, at David, 40 years old and just entering his fame. Is it a look of complicity? Are Marie-Anne and Jacques-Louis beginning a plot that will end in Antoine-Laurent’s death sentence?”
745. After Lavoisier’s death, Marie-Anne ran a “salon” attended by scientists that she also worked with—without her work ever being recognized. She eventually married Count Rumford (see Chap. 8), but their marriage only lasted a few months.
746. William Burton, An Account of the Life and Writings of Herman Boerhaave, London 1746.
747. If you want to know more, you can check out Pouillet’s paper, “Mémoire sur la Chaleur Solaire, sur les Pouvoirs Rayonnants et Absorbants de l’Air Atmosphérique, et sur la Température de l’Espace”, Extrait des Comptes Rendus de l’Académie des Sciences, 1838; or Jean-Louis Dufresne, “La Détermination de la Constante Solaire par Claude Pouillet”, La Météorologie, 2008.
748. Because, yes, if the derivative of something with respect to time is zero, that’s the same as saying that the rate of change of that something with respect to time is zero, which is the same as saying that that something is constant.
749. As far as products go, the cross product is kind of weird; but it’s not hard to see that the derivative of the cross product of two whatever vectors a and b
coincides with the cross product of the derivative of a, times b, plus the cross product of a times the derivative of b—i.e., the usual rule for differentiating a product. Before I show you the proof, you should know that Tullio Levi-Civita, who was a prof. at the mathematics department in Padua, not far from where yours truly is sitting right now, but about a century ago, Levi-Civita figured that the cross product of two vectors, see note 16, can be written more compactly if one defines a 3 × 3 × 3 tensor e_ijk, the value of whose i, j, k coefficient is 1 if i, j, k is an even permutation of 1, 2, 3; −1 if it is an odd permutation of 1, 2, 3; zero if any index is repeated. In other words,

e_ijk = +1 if (i, j, k) is (1, 2, 3), (2, 3, 1), or (3, 1, 2);
e_ijk = −1 if (i, j, k) is (3, 2, 1), (1, 3, 2), or (2, 1, 3);
e_ijk = 0 if i = j, and/or j = k, and/or k = i.
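This definition lends itself to a quick numerical sanity check: contracting e_ijk with the components of two vectors should reproduce their cross product, component by component. A sketch (NumPy assumed; the test vectors are arbitrary values of mine):

```python
import numpy as np

# Levi-Civita symbol: +1 on even permutations of (1,2,3), -1 on odd ones, 0 otherwise.
e = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    e[i, j, k] = 1.0   # even permutations
    e[i, k, j] = -1.0  # swapping the last two indices makes them odd

a = np.array([1.0, -2.0, 0.5])
b = np.array([3.0, 0.0, 4.0])

# {a x b}_i = e_ijk a_j b_k, summing over the repeated indices j and k.
cross_via_e = np.einsum('ijk,j,k->i', e, a, b)
print(np.allclose(cross_via_e, np.cross(a, b)))  # prints: True
```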
With this, the proof becomes relatively easy: all you have to do is rewrite the ith component of a × b via the Levi-Civita symbol, {a × b}_i = e_ijk a_j b_k (summing over repeated indices), and

d/dx {a × b}_i = d/dx (e_ijk a_j b_k)
= e_ijk (da_j/dx b_k + a_j db_k/dx)
= e_ijk (da_j/dx) b_k + e_ijk a_j (db_k/dx)
= {(da/dx) × b}_i + {a × (db/dx)}_i,

QED.
750. For whatever three vectors u, v and w, we have that u · (v × w) = w · (u × v).
751. I am not going to cover conics in any detail, because you just can’t do everything. But, in a nutshell: conics are the curves you get if you cut a double cone (a cone with two “nappes”) along a plane, see Fig. N.59. Depending on the angle that that plane makes with the axis of the cone, you can get: a circle, if the plane is perpendicular to the axis; an ellipse, if the angle is less than π/2 but still large enough that the plane only cuts through one of the two nappes of the cone—and so the ellipse is a closed curve; a parabola, if the angle has just the right value so that, again, the plane only cuts through one of the nappes, but the intersection is an unbounded curve, i.e., not closed: which is the case if, keep looking at Fig. N.59, the plane is inclined from the axis of the cone by the same angle as the side of the cone; and, finally, a hyperbola, if the angle between cutting plane and axis of the cone is even smaller, and both nappes of the cone are intersected. It can be shown, although I won’t do it here, that all conics can be described with the same equation,
Fig. N.59 A conic section is the curve you get if you cut a double cone along a plane. Depending on the angle that the “cutting plane” makes with the axis of the cone, you get a different kind of conic
r(θ) = A / (1 + B cos θ),
where r and θ are polar coordinates in the plane of the conic, and A and B are arbitrary constants: depending on their values, you can get anything from a circle to a hyperbola. And you can see that Eq. (N.38) of note 174 is a particular instance of this.
752. Parabolas have e = 1, so this won’t work for them. We don’t care though, because what we’ve proven is that the curve we are dealing with is either an ellipse or a hyperbola.
753. Which, yes, is not trivial. To prove it, start by writing r as the product of its magnitude, r, times the unit vector r̂ which points in the same direction as r. It follows that

v = d/dt (r r̂)
= ṙ r̂ + r dr̂/dt.

The components of r̂ along the Cartesian axes x and y are cos θ and sin θ, respectively, where θ is defined as in Fig. N.10. If I call x̂ and ŷ the unit vectors directed each like the Cartesian axis of the same name, then

v = ṙ r̂ + r dr̂/dt
= ṙ r̂ + r d/dt (x̂ cos θ + ŷ sin θ)
= ṙ r̂ + r θ̇ (−x̂ sin θ + ŷ cos θ).
If you dot −x̂ sin θ + ŷ cos θ with r̂, you’ll see that the result is zero, i.e. the two vectors are perpendicular to one another. If you look at the signs of its components, and at Fig. N.10 again, you’ll see that −x̂ sin θ + ŷ cos θ is nothing but the unit vector θ̂ that we had just defined. But so then that means that

v = ṙ r̂ + r θ̇ (−x̂ sin θ + ŷ cos θ) = ṙ r̂ + r θ̇ θ̂,

QED.
754. Initially I had wanted to copy the proof of Laplace—super-important physicist—from his Mécanique Céleste, which I’ve read—not the whole thing, but some parts—in the English translation of Nathaniel Bowditch. But Laplace’s derivation is tremendously difficult; at least for me; but maybe I have to realize that some people just have talents and skills that I don’t have: and Laplace obviously is one of them. In his introduction to the English edition that he curated, Bowditch explains that Laplace’s goal “in composing this work, as stated by him in his preface, was to reduce all the known phenomena of the system of the world to the law of gravity, by strict mathematical principles; and to complete the investigations of the motions of the planets, satellites, and comets, begun by Newton in his Principia. This he has accomplished, in a manner deserving the highest praise, for its symmetry and completeness; but from the abridged manner, in which the analytical calculations have been made, it has been found difficult to be understood by many persons, who have a strong and decided taste for mathematical studies, on account of the time and labour required to insert the intermediate steps of the demonstrations, necessary to enable them easily to follow the author in his reasoning.” Which yes, that’s totally my case, I guess. “To remedy, in some measure, this defect has been the chief object of the translator in the notes.
It is hoped that the facility, arising from having the work in our own language, with the aid of these explanatory notes, will render it more accessible to persons who have been unable to prepare themselves for this study by a previous course”, etc. It may be that Laplace finds the steps that he skips so easy that they are not even worth mentioning. But he is Laplace. To Bowditch, and probably to many other people, they are not so obvious: so that’s why Bowditch adds all his long footnotes. Much sympathy to him, for sharing my obsession with making everything as clear as it could possibly be. The fourth and last volume of Bowditch’s translation includes a “memoir of the translator by his son, Nathaniel Ingersoll Bowditch”. It tells the whole story of the Bowditch family, which had always resided in Salem, Mass., “from its earliest settlement, having been, for the four last generations, ship-masters.” That’s where Nathaniel was born, in 1773, “being the fourth of seven children of Habakkuk Bowditch, by his wife Mary, who was the daughter of Nathaniel Ingersoll.”
Bowditch “early showed a great fondness for mathematics”, but “even the slight elementary instruction which he might have obtained at [the elementary school he went to, in Salem, “kept by a Mr. Watson”], he was obliged to forego altogether at the age of ten years and two months, when he was taken by his father into his cooper’s shop, that he might by his labor assist in the support of the family. After remaining here a short time, he entered as a clerk or apprentice into the ship-chandlery shop of Messrs. Ropes and Hodges, when he was about twelve years of age. In this shop he remained till his employers retired from business, at which time, as early as 1790, he entered the similar shop of Mr. Samuel Curwin Ward, where he remained until he sailed on his first voyage, in 1795. Here, when not engaged in serving customers, he spent his time in reading, and particularly in the study of mathematics, for which he then felt a confirmed and decided taste.” We are told that “Dr. Bowditch began life with the same pursuits which his ancestors had followed for so many generations. Between the years 1795 and 1804, he made five voyages, all under the command of Captain Henry Prince, of Salem. On his fifth and last voyage, he acted as both master and supercargo. He sailed upon the first of these voyages, January 11, 1795, in the ship Henry, bound to the Isle of Bourbon, and was absent exactly one year. His three next voyages were in the ship Astrea, which sailed, in 1796, for Lisbon, Madeira, and Manilla, and arrived at Salem in May, 1797; and again in August, 1798, sailed for Cadiz, thence to the Mediterranean, loaded at Alicant, and arrived at Salem in April, 1799; and in July, 1799, sailed from Boston to Batavia and Manilla, and returned in September, 1800;—and his fifth voyage was in the Putnam, which sailed from Beverly, November 21, 1802, bound for Sumatra, and arrived at Salem December 25, 1803.” Then, shortly after his last voyage, “Dr.
Bowditch was elected President of the Essex Fire and Marine Company, which situation he held till his removal to Boston, in 1823. Here, also, he displayed his usual good judgment and discretion, and his usual success attended him.” So, Bowditch was never, how should I put it, a pro scientist; “But the long intervals which a sailor’s life afforded, he chiefly devoted to his favorite study, pursuing with unremitting zeal those researches in which he had already made such progress, notwithstanding the interruptions and embarrassments of his earlier days. Here, with only the sea around him, and the sky above him, protected alike from all the intruding cares and engrossing pleasures of life, he especially delighted to hold converse with the master-spirits who had attempted to explain the mysteries of the visible universe, and the laws by which the great energies of nature are guided and controlled [...].” While still a sailor, he began to publish in scientific journals. And to write books. “The most important result of this period of Dr. Bowditch’s life, was the publication of The New American Practical Navigator, a manual in which were imbodied a scientific explanation of the principles of navigation, and also the practical application of these principles in the simplest and most effective manner [...].
“Dr. Bowditch [...] did not himself consider this work as one which would much advance his scientific reputation. It was, in his view, only a ‘practical manual.’ But it was the work by which, almost exclusively, he was, for a long time, known in this country, and it laid the basis of a wide-spread popularity, such as few, if any, works upon scientific subjects have ever gained for their authors. Several years ago, he was much amused by the following incident. Two young men came into the shop of his bookseller to purchase a copy of the Navigator. Upon being shown one bearing on its title-page the number of the edition, and purporting to have been revised and corrected by the author, one said to the other, ‘That is all a mere cheat; the old fellow must have been dead years ago!’ They were astonished, and perhaps a little embarrassed, at being introduced to an active, sprightly gentleman, in full health and good spirits, as the author of this work, which they had known from their earliest entrance upon a sailor’s life. It was in honor, especially, of the memory of him who had written the Practical Navigator, that, when the news of his death was received at Cronstadt, all the American shipping, and many of the English and Russian vessels, hoisted their flags at half-mast in that naval depôt of the Czars,—a tribute of respect which had been previously paid in the ports of Baltimore, Boston, and Salem.” And but Bowditch’s really important contribution was probably, indeed, the translation of Laplace’s magnum opus: “notwithstanding”, writes his son, “all these duties and engagements, and all the occasional scientific labors which have been mentioned, such was his wonderful economy of time, that [...] he also completed what has justly been characterized as the gigantic undertaking of making the translation and commentary now before the reader,—a work upon which, almost exclusively, will rest his fame as a man of science. [...] 
His great design”, like we were saying, writes Bowditch’s son, “was to supply those steps in the author’s demonstrations, which were not discoverable without much study and research, and which had rendered the original work so abstruse and difficult, as to lead a writer in the Edinburgh Review to say there were not twelve individuals in Great Britain who could read it with any facility [‘We will venture to say, that the number of those in this island who can read that work with any tolerable facility, is small indeed. If we reckon two or three in London and the military schools in its vicinity, the same number at each of the English Universities, and perhaps four in Scotland, we shall hardly exceed a dozen; and yet we are fully persuaded that our reckoning is beyond the truth.’]. Dr. Bowditch himself was accustomed to remark, ‘Whenever I meet in Laplace with the words «thus it plainly appears», I am sure that hours, and perhaps days, of hard study will alone enable me to discover how it plainly appears.’ ”
755. Which would later be renamed to “American Association for the Advancement of Science”.
756. Famous British zoologist of the Victorian era. One thing that he is remembered for is, he came up with the theory of Lemuria, a sunken continent between India and Madagascar, which explains why lemurs lived in those places—and only those places. Not too crazy in view of baked-apple (see note 192) geology, etc.,
like I am trying to show you. But the thing is, for whatever reason, Lemuria entered some not-so-scientific theories re the origin of mankind: the famous occultist Helena Blavatsky thought lemurs were one of the “root races” of humanity; and then, at the turn of the twentieth century, Frederick Spencer Oliver wrote A Dweller on Two Planets (Baumgardt Publishing Company, Los Angeles, 1905; “one of the most important texts of the 19th Century Atlantis canon” according to amazon.com), where he came up with the, uhm, theory that Lemuria was actually in the Pacific Ocean and Lemurians are giants now living, in secrecy, yes, in Mount Shasta, California. And, to be completely honest with you, the reason I can’t forget this is that it provided the inspiration for one of my favorite Pixies songs, Velouria (4AD, 1990). If you are curious about how the idea of Lemuria became “a fascination of pseudoscientists and occultists”, and how it “lives on in surprising places”, you might want to check out “Lemuria, the Weirdest Continent that Never Existed”, by Frank Jacobs, in bigthink.com, September 2023.
757. Isostasy and Flexure of the Lithosphere, Cambridge University Press, 2001.
758. “If a solid lighter than a fluid be forcibly immersed in it, the solid will be driven upwards by a force equal to the difference between its weight and the weight of the fluid displaced”: this is how Archimedes himself phrased it in On Floating Bodies888. The idea is that the only “traction” (surface force889 per unit surface) acting within a fluid is the so-called hydrostatic pressure, which at a depth d below the surface is p = ρf g d, with ρf the density of the fluid, and g the acceleration of gravity. At equilibrium, that force must coincide with the weight-per-unit-surface of the solid that’s immersed in the fluid, which is given by ρs g h, with ρs the solid’s density and h its thickness.
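Numerically, the balance ρs g h = ρf g d is a one-liner; here is a worked sketch with illustrative round densities of my own choosing (ice floating on seawater, not an example from the book):

```python
# Floating balance: rho_s * g * h = rho_f * g * d  (g cancels out).
rho_s = 917.0    # density of ice, kg/m^3 (approximate)
rho_f = 1025.0   # density of seawater, kg/m^3 (approximate)
h = 100.0        # total thickness of the floating slab, in m

d = h * rho_s / rho_f  # depth of the submerged part
print(f"{d:.1f} m of a {h:.0f} m slab sit below the surface")  # prints: 89.5 m ...
```

About nine tenths of the slab ends up under water—the familiar iceberg rule, and the same bookkeeping that gives mountain ranges their deep crustal roots.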
To fix ideas, think of it as a rectangular prism; we can always decompose a solid of any shape into an infinity of infinitely small rectangular prisms. So then

ρs g h = ρf g d,

which is the fundamental equation of isostasy. As long as the interior of the earth, or at least its shallowest layers, behave approximately like a fluid over relatively long times (i.e., if, indeed, the earth has an asthenosphere), we can use this equation to estimate, e.g., the depth of the root of a mountain range like in Fig. 5.5. Call h1 the thickness of the shallow layer in Fig. 5.5; call h0 the height of the mountain range and h2 the thickness of the root. Call ρ1 and ρ2 the densities of the shallower and deeper layers, respectively. The pressure at the base of the root of the mountain must be ρ1 g (h0 + h1 + h2). This must coincide with the pressure that you find at the same depth, under areas where there’s no significant topography,
ρ1 g h1 + ρ2 g h2. If you equate the two expressions and solve for h2, the terms that carry h1 cancel out and

h2 = h0 / (ρ2/ρ1 − 1).

759. Herbert E. Gregory, another fairly famous geologist at the time and, at some point, the head of the geology department at Yale. According to this same biographical sketch, “the three summer months of [1901] were spent [by Barrell] in Europe with Professors Herbert E. Gregory and Charles H. Warren, travelling ‘by foot, by bicycle, and by third-class trains, the object being to see the countries and study Geology rather than to do sightseeing in the cities’.” The quote, if I understand correctly, is from Joseph Barrell himself, something he wrote “for the twenty-fifth anniversary of his graduation at Lehigh”.
760. Another Yale prof. at the time; a paleontologist.
761. Almost every bio of Wegener that I’ve read mentions that he and his brother Kurt flew in their balloon for 52 and a half hours, non-stop, which was a world record at the time. I thought this could be an urban legend, but I checked, and I found Kurt Wegener’s own report, which was published in various languages in aeronautics journals across the world, e.g., in the American Magazine of Aeronautics, whose vol. 2, no. 4 (November 1907), carries the version that I’ve read. Kurt explains that the goal of their balloon trip wasn’t to set any sort of record—free ballooning was just their normal way of collecting atmospheric data, working, as they were, at a meteorological observatory. But the thing is, their fateful trip “should have begun without fail on the evening of Wednesday, April 4th, 1906, so as to enable the companion, my brother, Dr. Alfred Wegener, who had made many observations by day, to likewise make observations by night.
We desired on the following day to ascertain the meteorological conditions at a high altitude.” But “at the inflation, which, as usual, was made by the balloon battalion at Reinickendorf, the net of our balloon ‘Brandenburg’ became damaged, and we were able to only make our ascension at 9 o’clock in the morning of April 5th in another balloon hurriedly made ready. “Because of this our plans had to be changed; our efforts now had to be bent to make an easy trip the first day in order to be sure to be in position to make the necessary observations during the following night, and then the following day be able to take the balloon to a higher altitude. [...] “The mid-day heat made a considerable change in the atmosphere, and we noticed that the partly warmed air rose while the colder air sank away from it. This vertical downward current is the principal enemy of long balloon trips, for it always means a sacrifice of ballast when a balloon is suddenly caught by one of these downward currents. “The balloon flew rapidly over the Tegeler Sea, Neu-Ruppin and Wittstock, and toward noon sailed east from Wismar toward the ocean (Baltic Sea). Soon
Fig. N.60 The Wegener brothers’ 52-hour balloon trip, from American Magazine of Aeronautics, 1907
the ocean became visible in all its greatness below us, and the smoke we could see in the distance showed that we had the same wind as we had before. Cities, forests and water flew past under us. [See Fig. N.60.] “It was by much hard work that we succeeded in putting the basket in order, and in doing this work we discovered many defects in our equipment which we did not notice in our hurried departure. For provisions we had only for each man one pound of chocolate, two cutlets, one orange and one flask of seltzer. We also discovered we had forgotten something even of more importance. In our hurry to get away we had forgotten to take our heavier coats, and we only wore light summer jackets. “We had lived fairly well the first day; almost three-quarters of a pound of chocolate and one cutlet were used. In spite of this, as a result of our heavy work with the ballast bags and because of the extreme smallness of our basket (1
m. × 1.20), we had many pains, which were very noticeable at each movement of the basket. “During the night we had another queer experience. The aeronaut knows that the basket begins to shake slightly when the drag rope drags on the earth. We had decided to remain at 2–300 m. altitude because the current below us travelled too much toward the west, directly toward the North Sea. “The basket began to shake slightly while I lay rolled up in the bottom of the basket vainly trying to sleep. I imagined that my brother, who had assumed command of the balloon, had fallen asleep, and that the balloon had sunk so that it was resting on the guide rope. I called to ‘The guide rope is dragging.’ But he answered, ‘No, I am only shivering from the cold.’ It was he, shaking from the frost, which did not leave him until after we had landed. [...] “Our rations on the second day were cut down some. Six chocolate bonbons, one cutlet and a half orange per person. [...] “The first part of the [second] night we passed very restless. After passing Kiel our course began to turn sharply toward the right, so that we feared we would be driven over the North Sea, and we prepared to land. While we were considering the question of landing, our balloon rose to an altitude of 2400 m., where we suffered considerably from the cold and exposure. It must have been 10 deg. C. [I think the Magazine of Aeronautics must have messed this one up: a minus sign must be missing or Alfred wouldn’t have been shaking so much] up there, and we could hardly endure it. Soon, however, the course again changed toward the south, and we again got control of our balloon. During the night the balloon drifted over the Elbe and Hamburg [...]. “In the morning the sun again pulled us into a higher altitude, and we gained an altitude of 3700 m. and continued to fly over cities and towns. The cold at this altitude—16 deg. C. [again]—combined with our cramped quarters and insufficient nourishment, made it unbearable. 
After we had been at this altitude about two hours, at 11:30 o’clock we again began to descend, although we still had six bags of ballast. “When we came nearer to the earth we drifted under a heavy cloud and into a strong downward current, and we had to use four bags of ballast in two hours, and were compelled to land near Laufach in Spessart after a trip of 52 hours’ duration. We had in fact broken the record for duration.”
762. A peculiar character. According to Giovanni Modaffari (see note 890), Snider-Pellegrini was “raised in Trieste, then the Austrian Empire’s main outlet to the sea, by a family of bankers belonging to the French nobility. In the first part of his life, he was a [...] businessman, and in 1831 was one of the founders of the Generali insurance company.” In the late 1830s, though, apparently, he was found guilty of some sort of fraud, and spent one year in prison in Trieste. “Bankrupted after what appears to have been a series of judicial mishaps,” says Modaffari, Pellegrini “left Trieste in a hurry, reaching Paris first, and then Texas, making a failed attempt to set up a colony modelled on the ideals of Charles Fourier. “On returning to Europe, from 1848 onwards, in Civitavecchia and in London, he became a prominent associate (and financial supporter) of the Italian Revolutionary governments, and, in 1849, saved the life of its most famous member, Giuseppe Mazzini. Between 1857 and 1861, he also published a series of books in Paris on economic geography: Du développement du commerce de l’Algérie (1857) and on political and religious issues: Le Pape et son pouvoir temporel (1860); Dernière réponse aux évêques et à tous les avocats du pouvoir temporel du Pape (1860). [...]. However, the book that really enabled Pellegrini to miraculously escape oblivion was La Création et ses Mystères dévoilés [Creation and its Mysteries Unveiled] (1858), where he outlined a bizarre theory on the creation of the universe, starting from a Genesis-inspired structure divided into six days, with several original observations including two illustrations of the Continental Drift hypothesis predating Alfred Wegener’s diagrams by more than half a century”, etc. (The title page of La Création reads: “a work which expounds clearly the nature of all beings, the elements of which they are composed and their relationship with the earth and the astral bodies, the nature and location of the sun’s fire, the origin of the Americas and of its native inhabitants, the forced formation of new planets, the origin of languages and the causes of the diversity of physiognomies”, etc.)
763. Scott and his four companions all died on their way back. Theirs was the second expedition ever to reach the South Pole, Amundsen and co. having preceded them by five weeks.
764. German-Russian climatologist, Wegener’s colleague and, incidentally, father-in-law.
765. In fact, Wegener thought that continents move predominantly towards the West. I don’t know how he figured that out; but today we know that, indeed, the motion of “tectonic plates” with respect to the so-called “hotspot” reference frame, averaged over the entire globe, is nonzero and points to the West. So I guess Wegener was, in some sense, right; but the concept of “hotspot” didn’t exist in his time, and from his book I can’t tell what frame he refers the westward drift of the continents to. Hotspots, by the way, are relatively small areas with a lot of volcanism—Hawaii is the perfect example—that, in plate-tectonics theory, are thought to be approximately fixed with respect to plate motion: the perfect reference frame. But I am really getting ahead of myself, now.
766. John Joly: see Chap. 4, and note 738.
767. This idea of what a wave is, by the way, is called Huygens’ principle. The idea is probably older than Huygens—a Dutch physicist who’s roughly a contemporary of Newton—but what Huygens did is he proposed that the propagation of light worked in that way too, i.e., that light is a wave: while Newton himself, for instance, thought that light is more, like, beams of particles that travel from the source of light to our eyes, etc. And if you ask a physicist in 2023, she’ll probably tell you that they were both right, in a way... we’ll get back to Huygens’ principle in Chap. 7.
768. “On the Theories of the Internal Friction of Fluids in Motion, and of the Equilibrium and Motion of Elastic Solids”, Transactions of the Cambridge Philosophical Society, vol. 8, 1845.
769. I read somewhere, probably Wikipedia, that nabla is Hebrew for “harp”. Harps do have that shape, kind of; but I have no idea who decided to pick a Hebrew word for that, and why.
770. Experiments were done by firing a gun, and, looking at (and listening to) it from far away, measuring the time between the flash and the bang.
771. AKA Barney Finn, historian of science, now (2023) “Curator Emeritus of the Electricity Collections at the Smithsonian Institution’s National Museum of American History” in Washington, D.C. The quote is from his paper “Laplace and the Speed of Sound”, vol. 55 of ISIS, 1964.
772. If you are not convinced, look at a guitar while it’s being played: you hear sound, but you can’t really see the strings move, can you?
773. People like to phrase this as, cos(mx) and cos(nx) are orthogonal to one another for x between −π and π, and if m ≠ n. The integral from −π to π of the product of two functions is seen as the function-space equivalent of the vector-space dot product. (The name that people give to this operation is inner product; which, like, the dot product of two vectors is a particular case of inner product, etc.) I don’t think you need to be aware of this to understand this book, but just in case you look these things up elsewhere, and are confused by the, uhm, semantic differences.
774. Another way of stating Fourier’s theorem is to say that any periodic function f can be written according to (N.57) and (N.58)—not just for −π ≤ x ≤ π, but over the entire x axis. Which requiring that f be periodic is the same as requiring that f(−π) = f(π), and then that f replicates itself without any change from π to 3π, and then from 3π to 5π, and so on and so forth, and on the other side from −π to −3π, from −3π to −5π, and so on and so forth. It doesn’t really matter so much in view of what we are going to do next with the Fourier transform, but anyway.
775. When n = 0, (N.58) boils down to
$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} c\, dx = \frac{2c\pi}{\pi} = 2c.$$
When n > 0 we have
$$a_n = \frac{c}{\pi}\int_{-\pi}^{\pi} \cos(nx)\, dx = \frac{c}{\pi}\left[\frac{\sin(nx)}{n}\right]_{-\pi}^{\pi} = \frac{2c}{\pi}\,\frac{\sin(n\pi)}{n} = 0,$$
and
$$b_n = \frac{c}{\pi}\int_{-\pi}^{\pi} \sin(nx)\, dx = -\frac{c}{\pi}\left[\frac{\cos(nx)}{n}\right]_{-\pi}^{\pi} = -\frac{c}{\pi}\left(\frac{\cos(n\pi)}{n} - \frac{\cos(-n\pi)}{n}\right) = 0.$$
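These coefficient integrals can also be checked numerically; here is a minimal sketch in Python (the constant c = 3 and the midpoint-rule grid are arbitrary choices, just for illustration):

```python
import math

# Numerical check of the Fourier coefficients of the constant function
# f(x) = c on [-pi, pi]: we expect a0 = 2c, and an = bn = 0 for n > 0.
c = 3.0      # arbitrary constant, just for illustration
N = 100000   # number of midpoint-rule integration points

def coeff(trig, n):
    # (1/pi) * integral over [-pi, pi] of c * trig(n x) dx,
    # approximated by the midpoint rule
    dx = 2 * math.pi / N
    total = sum(c * trig(n * (-math.pi + (k + 0.5) * dx)) for k in range(N))
    return total * dx / math.pi

print(round(coeff(math.cos, 0), 6))    # 6.0, i.e. 2c
print(abs(coeff(math.cos, 1)) < 1e-9)  # True: a1 vanishes
print(abs(coeff(math.sin, 1)) < 1e-9)  # True: b1 vanishes
```

The same three lines with any other n > 0 give the same vanishing result, as the closed-form integrals above predict.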
776. A book that I like, that gets into all the important details of all this, is Francis B. Hildebrand’s Advanced Calculus for Applications. What I have is the second edition, Prentice-Hall 1976.
777. You see from all this that it is not so stupid to have a0 divided by 2 in (N.57): because without that 2, we’d have to introduce an extra formula for a0 in (N.57), different from those for a1, a2, etc.
778. I am cheating, actually: because one needs to do some extra work for the case when k = 0, i.e., to prove that $a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\, dx$. That is part of the long and complex derivation that I decided to avoid. But you can look it up: like I said in note 776, I think Hildebrand’s is a good book; and there’s more.
779. Or periodic, with period 2π: see note 774.
780. Complex numbers emerged sort of naturally in mathematics, when people looking for formulae to solve polynomial equations (see also note 355) were confronted with the square roots of negative numbers. So they invented the imaginary unit as the square root of −1. If i is how you denote the imaginary unit, $i = \sqrt{-1}$, then for any negative number x,
$$\sqrt{x} = \sqrt{-|x|} = i\sqrt{|x|}.$$
What you have here, the product of i with a real number, is what we call an imaginary number. A complex number is the sum of a real plus an imaginary number, i.e., of its real part and its imaginary part, as in z = x + i y.
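These definitions are easy to poke at with Python’s built-in complex type and the standard cmath module; a minimal sketch (the sample values −4 and 3 + 4i are arbitrary):

```python
import cmath

# The imaginary unit: i * i = -1 (Python spells i as 1j).
print((1j) ** 2)        # (-1+0j)

# Square root of a negative number x: sqrt(x) = i * sqrt(|x|).
print(cmath.sqrt(-4))   # 2j, i.e. i * sqrt(4)

# A complex number is a real part plus an imaginary part, z = x + i y.
z = 3 + 4j
print(z.real, z.imag)   # 3.0 4.0
```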
Occasionally, I might need to use functions that extract from z only its real or only its imaginary part, i.e., if z = x + iy then x = Re(z) and y = Im(z). Complex numbers were pretty much a curiosity until it was discovered, by Leonhard Euler and co., that they could be used to simplify calculations involving trigonometric functions—which, in fact, is why we are using them in this book. Formula (N.61) of note 291 is attributed to Euler, who included it in his Introduction to the Analysis of the Infinite (1748). To prove Euler’s formula, define a function
$$f(x) = \frac{\cos x + i \sin x}{e^{ix}} = (\cos x + i \sin x)\, e^{-ix}$$
and take its derivative,
$$\frac{df}{dx} = (-\sin x + i \cos x)\, e^{-ix} - i (\cos x + i \sin x)\, e^{-ix} = (-\sin x + i \cos x - i \cos x - i^2 \sin x)\, e^{-ix} = (-\sin x + \sin x)\, e^{-ix} = 0.$$
This means that f(x) is constant, actually. But then, if you plug x = 0 into the definition of f(x), you get f(0) = 1: it follows that f(x) = 1 for all x, and so
$$\frac{\cos x + i \sin x}{e^{ix}} = 1,$$
or $\cos x + i \sin x = e^{ix}$, QED. One last thing before I wrap this up, this might sound strange, but given a complex number, I don’t know, z = a + ib, think of the right triangle whose legs are z’s real and imaginary parts, a and b. If you call c the hypotenuse, then you have $c = \sqrt{a^2 + b^2}$, $a = c \cos(\theta)$
and $b = c \sin(\theta)$, where θ is the angle formed by the hypotenuse and leg a, or, by the magic of trigonometry,
$$\theta = \arctan\left(\frac{b}{a}\right).$$
Now if you go back to z,
$$z = a + ib = c \cos(\theta) + i\, c \sin(\theta) = c\, e^{i\theta} = \sqrt{a^2 + b^2}\; e^{i \arctan(b/a)},$$
and you just proved that any complex number can be written as the product of a magnitude $\sqrt{a^2 + b^2}$ and the exponential of i times a phase, arctan(b/a). I promise that this will be useful later.
781. You might have noticed that because cos(A − B) = cos(B − A), you could go back to (N.62), swap x and ξ, and end up with swapped signs in the arguments of the exponentials in (N.64) and (N.65). There is nothing wrong with it. The Fourier machine can be assembled in different ways, of which you just have to pick one. If you check this out in books, you’ll see that different authors often use different conventions.
782. And, as you might have guessed if you remember our first encounter with J. Fourier in Chap. 4, Fourier initially developed his method as a tool to solve heat-transport problems—i.e., the differential equation (4.39), plus various boundary and initial conditions.
783. If you look at only one point x along the string, or look at the whole string at a fixed instant in time, t, then Eq. (N.71) is a Fourier series. Something similar emerges when you try to solve the heat equation (4.39) via separation of variables. Many textbooks have an exercise where you are asked to predict, based on (4.39), the evolution of T within a rod—the rod being insulated over its entire length, except for its two ends, whose temperatures are kept at the same constant value. In practice, imagine you plunge the rod into a tank full of, e.g., icy water (T ≈ 0°): the T of both the rod’s ends will quickly become the same as that of the water (whose T won’t change much, provided there’s enough water compared to the size of the rod); and then heat will be transported somehow through the rod until thermal equilibrium is reached and the rod is as cold as the water.
After separating variables and finding a general solution, you prescribe the boundary conditions: just like the string’s displacement, now it’s the rod’s temperature that is fixed at the rod’s ends—the b.c.’s are the requirement that T at both ends be zero at all times. Similar to the string problem, you end up
finding that the “spatial” part of the solution is an infinite sum of sines, with discrete spatial frequencies: a Fourier series. Which might explain why Fourier invented his thing as a by-product, so to say, of his work on heat. Anyway, the exercise goes quite smoothly as long as the prescribed, constant T is the same at both ends: if it isn’t things quickly get messy, and I am not even sure you can actually find a solution. Which is why Kelvin, I guess, went for the fairly complex error-function solution that we looked at in Chap. 4. Anyway, try separation of variables on (4.39): it’s a good exercise.
784. Another phrase that you might hear to refer to the same thing is: “free oscillations”. A guitar string—or any other object, for that matter—can be forced to oscillate at whatever frequency: you just need to apply some forcing to it, that varies in time like a sinusoid: e.g., take your electric toothbrush (without the brush), place it on the string, and turn it on. That would be called a “forced oscillation”. The modes’ frequencies are those you observe after you’ve turned off all external forcing, and the string is left free to oscillate in any way it wants—until air resistance stops it. Hence, free oscillations.
785. Here’s a funny and mysterious (I think) thing. Take a bunch of sinusoidal, i.e., single-frequency sounds: and let the frequencies be all multiples of one another. That could be, for example, all the sounds emitted by a guitar when you hit one string. Our brain has the very strange property of perceiving all those frequencies, taken together, as a single entity: what we call a musical note. Read Arthur Benade’s Fundamentals of Musical Acoustics (Oxford University Press, 1976, then Dover 1990) if you are curious to find out more about this. A great book.
786. If you are careful, you might have noticed that we didn’t actually write it in this exact same way in note 291, but it’s not a big deal—it’s just a matter of convention, i.e., of how you define $A_0$ and $a_0$.
787. In the catacombs under Paris.
788. One of Laplace, Hassenfratz and co.’s paces should mean about 80 cm, I think.
789. According to S. G. Brush (see note 86), “Jeffreys may be considered the heir to the Kelvin-Darwin tradition of mathematical geophysics in Britain; like Darwin, he held the Plumian chair of astronomy at Cambridge University. For the younger generation of geophysicists his reputation is somewhat tarnished by his persistent opposition [...] to the theory of continental drift”, etc. The Darwin in question is George (1845–1912), the son of Charles.
790. The proof is easy if rotation is around one and only one of the Cartesian axes: just take the transpose of the rotation matrix formula, i.e., Eq. (2.9) in Chap. 2, and dot it with the rotation matrix itself. If rotation is around an arbitrarily oriented axis, I guess you can think of it as a sequence of rotations around each of the Cartesian axes. Or perhaps start by showing that the rows/columns of the rotation matrix are the components of the rotated axes with respect to the initial axes: and as such any two rows of R can be thought of as vectors that are perpendicular to one another... and the equality $R^{-1} = R^T$ follows in a couple of moves.
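The transpose-times-matrix check just described can be sandboxed in plain Python; a minimal sketch (the angle 0.6 rad is an arbitrary choice, and the sign convention below may differ from Eq. (2.9), but orthogonality holds either way):

```python
import math

# Rotation about the z axis by an arbitrary angle; multiplying R by its
# transpose should give the identity, i.e. R^-1 = R^T.
theta = 0.6  # arbitrary angle in radians
c, s = math.cos(theta), math.sin(theta)
R = [[c,  -s,  0.0],
     [s,   c,  0.0],
     [0.0, 0.0, 1.0]]

# matrix product R * R^T: entry (i, j) is row i of R dotted with row j of R
RRt = [[sum(R[i][k] * R[j][k] for k in range(3)) for j in range(3)]
       for i in range(3)]

identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
ok = all(abs(RRt[i][j] - identity[i][j]) < 1e-12
         for i in range(3) for j in range(3))
print(ok)  # True
```

The off-diagonal entries vanish because cs − sc = 0, and the diagonal ones reduce to cos²θ + sin²θ = 1, which is exactly the perpendicular-rows argument of the note.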
Either way, dear reader, a nice exercise for you.
791. You can prove this result if you just apply the definition of derivative, which if you don’t remember it go back to note 20.
792. Giuseppe Vicentini was a physics prof. at the university of Padua, 1894 to 1931. “As a scientist, he was more inclined to the development and enhancement of methods, to the ever more precise determination of constants, and systematic investigations into the physical properties of bodies, rather than the discovery of new phenomena or new laws. He [...] devised a seismic instrument of remarkable sensitivity, the three-component microseismograph, which was highly successful and widespread in seismic stations both in Italy and abroad. [...] The seismographic facility at Vicentini’s Institute of Physics was visited by many famous seismologists of the time such as Milne, Omori, Algué and Wiechert.” (From the website of the Istituto Nazionale di Geofisica e Vulcanologia.)
793. “Chair of Geophysics at the Jagiellonian University [in Krakow], where, in 1895, the first Institute of Geophysics in the world was created” (Michael A. Slawinski, Waves and Rays in Elastic Continua, Samizdat Press, 2007).
794. Or you can do it in this other way, too: start from $\omega = k\, c(\omega)$, from which it follows that
$$\frac{d\omega}{dk} = c(\omega) + k(\omega) \frac{dc}{dk} = c(\omega) + k(\omega) \frac{dc}{d\omega} \frac{d\omega}{dk}.$$
That gives
$$c(\omega) = \frac{d\omega}{dk}\left(1 - k(\omega) \frac{dc}{d\omega}\right),$$
or
$$c(\omega) = \frac{d\omega}{dk}\left(1 - \frac{\omega}{c(\omega)} \frac{dc}{d\omega}\right),$$
which, if you solve for $\frac{d\omega}{dk}$, you immediately get (N.103), as you should. I don’t know which path is the clearest, for you; I like both.
795. Conrad was the first head of the Austrian Seismological Service, 1904 to 1910. In 1910 he took a chair at the Chernivtsi National University in Bukovina, in what is now Ukraine. But then (1918) the Austro-Hungarian Empire fell apart, Bukovina became part of Romania, and Conrad had to go back to Vienna and the Seismological Service, having lost both his professorship and his personal assets. Being a socialist, and of Jewish descent, he emigrated to the U.S. in 1934. He was prof. at Harvard, among other places, and passed away in Cambridge, Massachusetts, 1962.
796. I went to look for Fresnel’s grave in the Père Lachaise cemetery. It’s not far from the apartment I still own in Paris, and where I am staying at the moment (summer of 2021). So I took the short walk via rue de la Roquette where the
women’s prison used to be, went through the main entrance and then quickly to division 4, because I’d checked the Internet before leaving my place and it said Fresnel was buried in division 4 of the cemetery. Or so I thought. In fact I must have gotten confused because after not having been able to find the grave for a while... and division 4 is quite small: two narrow sections along the main alley that from the Roquette entrance leads to the crematorium and the centre of the graveyard, so after not finding Fresnel’s grave for a while (which I’d looked at the picture, too) I googled it again on my phone, and it turns out it’s in division 14. There’s no website anywhere in the Internet saying Fresnel is in division 4, so I don’t know what I’d read earlier. Anyway, I walked to division 14, which is also small, and not far from 4. Then I walked around it, first all the way up to the big roundabout with the statue of Casimir Perier—which no idea who that is, and I guess I could google that, too, but whatever; but he must have been very rich and/or famous in his day, because the statue is quite conspicuous and plus he’s even gotten an alley in the cemetery named after him: which is the one I’d just walked. Anyway, no sign of Fresnel around there, so I kept looking, covering the entire perimeter of the division, then venturing inside, among broken graves and luxuriant vegetation. Long story short, I couldn’t find Fresnel’s grave even after one hour of research, or more, incl. googling it once again to see what it looks like and what the graves around it look like—there are a couple of pictures you can find, on Wikipedia and/or some specialized websites for people who are into people’s tombs—and trying to spot at least the bigger monument which seems to lie just behind it. But no luck. I didn’t mind, though. I like cemeteries, and the almost hopeless research I was pursuing seemed to totally make sense, in view of my book project—the book you are now reading, I mean.
I guess research is ultimately what the book is about, and research is always almost hopeless—even if we can’t help discussing successful research—research that leads to convincing findings, and has impact—much more than forgotten research that didn’t lead anywhere—even though the latter is so much more common... But I digress. Another reason I figured I wasn’t just wasting time, walking back and forth through division 14, was that it felt special to be finally looking for something that cannot be precisely located through some app, etc. It seemed almost impossible to find myself in such a situation, in Paris, in 2021, but indeed there I was, forced to actually walk around to find something. I started to think that the info I got from the web was wrong, and that to find Fresnel I’d have to literally go somewhere where nobody (nobody with an interest to correct Wikipedia’s entries, anyway; but in any case, how many people interested in visiting Fresnel’s grave can there be, alive today on this planet? he’s certainly not listed among the tombes célèbres of tourist brochures; and neither is Joseph Fourier, for that matter) had gone for a very long time—maybe for decades. And that is also something this book is supposed to do. Thought I, in my boundless ambition. After a while, obviously, I gave up and left the Cemetery. Outside, on the Boulevard, people were having lunch in the terrasses, and everything looked beautiful—what a trite word; but what I mean is that every small shop, restaurant, as well as people’s outfits, etc., appeared to be carefully designed to look special and catch the eye. (This is, I think, what Paris is ultimately about.) And people were having lunch and chatting away without worries, or so it seemed. The truth being, some of them probably were worryless, like I might have been, I don’t know, fifteen years earlier when I lived in Zürich, and worked as a researcher without much teaching to do and with lots of resources to do whatever research, hopeless or not, I might want to do, and extra time to try to write songs or novels, etc., and friends to spend the evenings with, without much thinking and without much worrying... some of the lunchgoers, like I was saying, were worryless, and but others presumably with some significant concerns, possibly worse than my own 2021 concerns, and insomnia comparable or worse than mine. But I was almost sure that none of the people hanging out in the streets I was walking were struggling with a thousand-page book project which who knows where it’s going, really, and will I be able to make it exactly as I think it should be—turn my abstract idea into words, I mean: what is it going to be like, when it’s done? If it ever will be done? A project that involves deriving some difficult equations, finding uncharted graves, and plus the occasional hint of autobiography. Thinking those thoughts, I walked into a bookstore that I like in rue Léon Frot, and bought a comic book—or a graphic novel, as they say—which would help me to unwind when I’d take my break later in the afternoon. Then I went to the sandwich shop downstairs from my place and ordered a sandwich. Do you want a tarte aux poires, said the guy behind the counter, they are very good, c’est ma mère qui les fait. So I got the cake, too, and I went upstairs and ate lunch.
An important seismologist, as you can tell by the number of times he is being cited in this chapter. Born and educated in Germany, he emigrated to the U.S. in 1930, with a faculty position at Caltech. In Germany, he had worked at the Central Bureau of the International Association of Seismology, in Strassburg (now Strasbourg), and then in the Central Station for Meteorology, near Berlin. But he was never able to get a proper academic job, possibly (probably) because he was Jewish. In the 1920s, and until he went to the U.S., he had a day job in his father’s soap factory, and did seismology in his spare time.
799. You might be wondering about the new centrifugal force that emerges when the earth spins around its axis. Why don’t we redo the force balance, to take that into account, too? Because, just like earth’s gravity, this centrifugal force is not relevant to tides: its magnitude is fully determined by the distance between observer and rotation axis: so, if the observer, as she should, sits at a given geographic location on the earth’s surface, then over the course of a day the tidal force she experiences will change, like we have learned, but the spinning earth’s centrifugal force won’t. It won’t affect how the sea level goes up and down at that—or any—location.
800. We met Siméon Poisson in Chap. 6 already, while we were looking at P and S waves. He published his “Remarques sur une Équation qui se Présente dans la Théorie des Attractions des Sphéroïdes”, i.e., on the equation that we are about to derive, Eq. (N.128), which was then named after him, in the Nouveau Bulletin des Sciences, par la Société Philomatique, n. 75, 1813. 801. See how I got to Eq. (1.22) in Chap. 1, and see note 24. 802. We would proceed in the same way also if we wanted to look at tides caused by the sun—we’d just have to change mass and distance of the attracting body. But like I said, we are only going to do the moon. 803. “On the Earth Tide of the Compressible Earth of Variable Density and Elasticity”, Transactions of the American Geophysical Union, vol. 31. Which I thought the Transactions of the American Geophysical Union only published the abstracts of the A.G.U. meetings—but no, this is an actual paper, and quite long at that. As you can tell from the title, by the way, Takeuchi does the general case: his earth is compressible and neither ρ nor μ are constant. But in section 9 of the paper he applies his method to Kelvin’s simple model, too, to re-derive Kelvin’s old results. 804. To convert a quantity that is measured per units of mass to a measure per units of volume, we need to multiply it by the mass, so, e.g., in the case of force, we go from force-per-unit-mass to force tout court; then we need to divide by volume, to go from force to force-per-unit-volume. The combination of the two operations is the same as multiplying by density, ρ. 805. In a spherical frame with origin at the center of the earth. We don’t need to totally switch to spherical coordinates, but it’s useful to introduce u r , because that’s the only component of displacement that has any effect on ρ, ρ0 being spherically symmetric. 806. If the parcel of rock at r moves to r + u r , then the density we end up seeing at r is the density that used to be at r − δr . 
Hence the minus sign in front of $u_r \frac{d\rho_0}{dr}$ in (N.130).
807. People like to call it hydrostatic pressure: presumably because it’s the pressure you’d observe in the unperturbed—“static”—state, and also because it is purely a pressure—no shear—just like in an inviscid fluid—“hydro”.
808. The logarithm is the power to which you need to raise a fixed number (the base of the logarithm) to produce a given number: as in, “the logarithm to the base 10 of 100 is 2”—because 10² = 100. The natural logarithm is the logarithm to the base e, where e is Euler’s number: this means that the natural logarithm of the exponential of x is equal to x.
809. You might remember that I had to anticipate this result in Sect. 6.5, so that I could tell you about acoustic waves.
810. Which is the book that all physicists are supposed to, like, know by heart. I must confess I’ve never owned it and I’ve never read the whole thing, but as I started working on this book, I’ve found it useful to study at least parts of it—from the copy I found in my father’s library (notes 387 and 427). Richard Feynman was a famous physicist, a great teacher—you can check out some of his lectures on YouTube—and he even got the physics Nobel prize, for his “fundamental
work in quantum electrodynamics, with deep-ploughing consequences for the physics of elementary particles”.
811. Sadi Carnot was the son of Lazare Carnot, who was a physicist, but also a reasonably important figure in the French revolution, and minister of war under Bonaparte. Sadi published just one book, few copies, neglected during his lifetime, but then rediscovered by Kelvin. “Carnot’s Réflexions sur la Puissance Motrice du Feu, published in 1824, escaped notice at the time, was only now and then slightly referred to later, [...] and might, perhaps, have still remained unknown to the world except for the fact that Sir William Thomson, that greatest of modern mathematical physicists, fortunately, when still a youth and at the commencement of his own great work, discovered it, revealed its extraordinary merit, and, readjusting Carnot’s principles in accordance with the modern views of heat-energy, gave it the place that it is so well entitled to in the list of the era-making books of the age.” This is from the editor’s note to the 1897 American (Wiley and Sons) edition of Carnot’s book, translated as Reflections on the Motive Power of Heat. The same book is dedicated, by the American publisher, “to Sadi Carnot, president of the French republic, that distinguished member of the profession of engineering whose whole life has been an honor to his profession and to his country”, etc., because, yes, Marie-François-Sadi Carnot, a nephew of the original Sadi and named after him (he was born a couple years after his uncle’s premature death), had become, like his grandfather, a statesman and eventually the president of France. He would be stabbed to death by an Italian anarchist, Sante Caserio, in 1894. Anyway, the American edition of Reflections also has a bio of Sadi Carnot written by his brother, and the future president’s father, Hippolyte Carnot, translated from the “French copy of the book as published by Gauthier-Villars, the latest reproduction of the book in the original tongue.” “Nicolas-Léonard Sadi Carnot”, says Hippolyte, “was born June 1, 1796, in the smaller Luxembourg. This was that part of the [Luxembourg] palace where our father then dwelt as a member of the Directory. Our father had a predilection for the name of Sadi, which recalled to his mind ideas of wisdom and poetry. His firstborn had borne this name, and despite the fate of this poor child, who lived but a few months, he called the second also Sadi, in memory of the celebrated Persian poet and moralist [Sadi of Shiraz, who lived in the thirteenth century]. “Scarcely a year had passed when the proscription, which included the Director, obliged him to give up his life, or at least his liberty, to the conspirators of fructidor. Our mother carried her son far from the palace in which violation of law had just triumphed. She fled to St. Omer, with her family, while her husband was exiled to Switzerland, then to Germany. “Our mother often said to me, ‘Thy brother was born in the midst of the cares and agitations of grandeur, thou in the calm of an obscure retreat. Your constitutions show this difference of origin.’
Anyway, the American edition of Reflections also has a bio of Sadi Carnot written by his brother, and the future president’s father, Hippolyte Carnot, translated from the “French copy of the book as published by Gauthier-Villars, the latest reproduction of the book in the original tongue.” “Nicolas-Léonard Sadi Carnot”, says Hippolyte, “was born June 1, 1796, in the smaller Luxembourg. This was that part of the [Luxembourg] palace where our father then dwelt as a member of the Directory. Our father had a predilection for the name of Sadi, which recalled to his mind ideas of wisdom and poetry. His firstborn had borne this name, and despite the fate of this poor child, who lived but a few months, he called the second also Sadi, in memory of the celebrated Persian poet and moralist [Sadi of Shiraz, who lived in the thirteenth century]. “Scarcely a year had passed when the proscription, which included the Director, obliged him to give up his life, or at least his liberty, to the conspirators of fructidor. Our mother carried her son far from the palace in which violation of law had just triumphed. She fled to St. Omer, with her family, while her husband was exiled to Switzerland, then to Germany. “Our mother often said to me, ‘Thy brother was born in the midst of the cares and agitations of grandeur, thou in the calm of an obscure retreat. Your constitutions show this difference of origin.’
Notes
“My brother in fact was of delicate constitution. He increased his strength later, by means of varied and judicious bodily exercises. He was of medium size, endowed with extreme sensibility and at the same time with extreme energy, more than reserved, almost rude, but singularly courageous on occasion. When he felt himself to be contending against injustice, nothing could restrain him. The following is an anecdote in illustration. “The Directory had given place to the Consulate. [Lazare] Carnot, after two years of exile, returned to his country and was appointed Minister of War. Bonaparte at the same time was still in favor with the republicans. He remembered that Carnot had assisted him in the beginning of his military career, and he resumed the intimate relation which had existed between them during the Directory. When the minister went to Malmaison to work with the First Consul, he often took with him his son, then about four years old, to stay with Madame Bonaparte, who was greatly attached to him. “She was one day with some other ladies in a small boat on a pond, the ladies rowing the boat themselves, when Bonaparte, unexpectedly appearing, amused himself by picking up stones and throwing them near the boat, spattering water on the fresh toilets of the rowers. The ladies dared not manifest their displeasure, but the little Sadi, after having looked on at the affair for some time, suddenly placed himself boldly before the conqueror of Marengo, and threatening him with his fist, he cried ‘Beast of a First Consul, will you stop tormenting those ladies!’ “Bonaparte, at this unexpected attack, stopped and looked in astonishment at the child. Then he was seized by a fit of laughter in which all the spectators of the scene joined.” Later, Sadi turned out to be very much inclined to science. He ended up at the École Polytechnique, and became a military engineer. 
His career was very much affected, it seems, by the ups and downs in his father’s political career: “The events of 1815 brought General Carnot back into politics during the Cent Jours which ended in a fresh catastrophe. “This gave Sadi a glimpse of human nature of which he could not speak without disgust. His little sub-lieutenant’s room was visited by certain superior officers who did not disdain to mount to the third floor to pay their respect to the son of the new minister.” And but “Waterloo put an end to their attentions. The Bourbons re-established on the throne, [Lazare] Carnot was proscribed and Sadi sent successively into many trying places to pursue his vocation of engineer, to count bricks, to repair walls, and to draw plans destined to be hidden in portfolios. He performed these duties conscientiously and without hope of recompense, for his name, which not long before had brought him so many flatteries, was henceforth the cause of his advancement being long delayed.” Eventually Sadi got his promotion and then took a long leave of absence “and availed himself of it to lead, in Paris and in the country round about Paris, a studious life interrupted but once, in 1821, by a journey to Germany to visit
our father in his exile at Magdeburg. We had then the pleasure of passing some weeks all three together. “When, two years later, death took from us this revered father and I returned alone to France, I found Sadi devoting himself to his scientific studies, which he alternated with the culture of the arts. In this way also, his tastes had marked out for him an original direction, for no one was more opposed than he to the traditional and the conventional. On his music-desk were seen only the compositions of Lully that he had studied, and the concerti of Viotti which he executed.” This is about the time when Sadi wrote his Réflexions, which he published in 1824. Which was strange, because Carnot had curiosity for a lot of different stuff: while “he made himself familiar with the processes of manufacture; mathematical sciences, natural history, political economy—all these he cultivated with equal ardor [...]. I have seen him not only practise as an amusement, but search theoretically into, gymnastics, fencing, swimming, dancing, and even skating”, despite all that, says his brother, he also “had such a repugnance for bringing himself forward that, in his intimate conversations with a few friends, he kept them ignorant of the treasures of science which he had accumulated. They never knew more than a small part of them. How was it that he determined to formulate his ideas about the motive power of heat, and especially to publish them? I still ask myself this question,—I, who lived with him in the little apartment where our father was confined in the Rue du Parc-Royal while the police of the first Restoration were threatening him. 
Anxious to be perfectly clear, Sadi made me read some passages of his manuscript in order to convince himself that it would be understood by persons occupied with other studies.” In the last years of his life, Sadi kept working super-hard on topics related to thermodynamics—“he undertook profound researches on the physical properties of gases and vapors”. And but “his excessive application affected his health towards the end of June, 1832.” He caught “an inflammation of the lungs, followed by scarlet-fever [...]. There was a relapse, then brain fever; then finally, hardly recovered from so many violent illnesses which had weakened him morally and physically, Sadi was carried off in a few hours, August 24, 1832. Towards the last, and as if from a dark presentiment, he had given much attention to the prevailing epidemic, following its course with the attention and penetration that he gave to everything.” 812. Another way to look at this: the concept of entropy emerges when we need to explain the difference between the ideal, perfectly reversible Carnot cycle, and a cycle that’s not reversible. In a cycle that’s not reversible, the cycle integral is not exactly zero, so at the end of the cycle some entropy has been produced. So the people who first came up with the idea of entropy were sort of saying: entropy is that thing that is nonzero in a real cycle, and but zero in an ideal Carnot cycle. 813. This is OK because remember note 488: entropy is a state variable, i.e., how it changes with p and V during a (OK, reversible) transformation depends only
on the initial and final values of p and V—the initial and final states—and not on how we get from one state to the other. We can take the small transformation from p, V to p + dp, V + dV to be reversible, and so take, for instance, the constant-p-then-constant-V (or constant-V-then-constant-p) path to calculate the variation in S. (Which is what I am doing here.)
814. p, V, T and S are not all independent of one another; i.e., if changes in p, V, T are given, then the change of S can be determined from those. So, then, it’s OK to take S as independent variable instead of V—there just can’t be more than three independent variables, in this game, I guess.
815. Patterson, a geochemistry prof. at Caltech for his entire career, is famous for two things. One is, he came up with what we still think is the most reliable estimate for the age of the earth, i.e., about four and a half billion years—we are about to see how he achieved that; the other is lead pollution. Patterson had to deal with lead contamination while measuring uranium and lead isotopes to find better ways to date rocks: because it turned out that the concentration of lead in his samples would change significantly every time the samples were exposed to air. Patterson understood that finding so much unwanted lead in his lab meant there was a lot of lead in the atmosphere, and car exhaust was a likely source for it. Later, after he had finished his Ph.D. and got an academic job, he decided to make a science project out of this idea. So he got funding for that (incl. from oil companies, which perhaps were hoping for a different outcome than what actually happened), and went to sea and measured the concentration of lead in shallow versus deep ocean waters; he found that there was systematically more lead near the surface of the ocean than at depth: which suggested that, again, lead must come from above, from the atmosphere. So Patterson’s initial idea must be right. Then he went to Greenland and Antarctica and drilled glaciers, to get ice samples from layers of ice of different ages. He measured the concentration of lead in those samples, and found that it had started to grow, suddenly and rapidly, in the 1920s. It was largely because of Patterson’s scientific contribution891 that, through the 1970s, 80s and 90s, lead was “phased out” from consumer products and, eventually, from gasoline.
816. Which you can, because a meteorite is a closed system, and has been a closed system since the moment it was formed.
817. C. Patterson, G. Tilton and M. Inghram, “Age of the Earth”, Science, vol. 121, 1955.
818. R. W. van Bemmelen and H. P. Berlage, “Versuch einer Mathematischen Behandlung Geotektonischer Bewegungen unter Besonderer Berücksichtigung der Undationstheorie”, Gerlands Beiträge zur Geophysik, vol. 43, 1934.
819. “The Summer Institute for Historical Geophysics”, says a brief note in the book’s front matter, “is the author’s one-man institute, performing research, issuing publications and now and then giving lectures within the field of historical geophysics. For further information, or for access to publications in the series ‘Small Publications in Historical Geophysics’ issued by the institute, go to: www.historicalgeophysics.ax”.
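Incidentally, the path-independence of entropy claimed in notes 813 and 814 is easy to check numerically. Here is a minimal sketch (mine, not from the book; it assumes a monatomic ideal gas, and uses a heat-at-constant-volume-then-expand-at-constant-temperature path and its reverse, rather than the constant-p/constant-V paths of note 813, but the point is the same): integrate dS = nCv dT/T + nR dV/V along two different paths connecting the same pair of states, and verify that you get the same ΔS.

```python
# Numerical check (illustration only, not from the book) that the entropy
# change of an ideal gas between two states does not depend on the path.
import math

n, R = 1.0, 8.314          # moles, gas constant in J/(mol K)
Cv = 1.5 * R               # monatomic ideal gas

def delta_S(path):
    """Integrate dS = n*Cv*dT/T + n*R*dV/V along a list of (T, V) points."""
    s = 0.0
    for (T1, V1), (T2, V2) in zip(path, path[1:]):
        Tm, Vm = 0.5 * (T1 + T2), 0.5 * (V1 + V2)   # midpoint rule
        s += n * Cv * (T2 - T1) / Tm + n * R * (V2 - V1) / Vm
    return s

# state A = (300 K, 1 m^3), state B = (600 K, 2 m^3)
N = 10000
heat_then_expand = [(300.0 + 300.0 * i / N, 1.0) for i in range(N + 1)] + \
                   [(600.0, 1.0 + i / N) for i in range(1, N + 1)]
expand_then_heat = [(300.0, 1.0 + i / N) for i in range(N + 1)] + \
                   [(300.0 + 300.0 * i / N, 2.0) for i in range(1, N + 1)]

exact = n * Cv * math.log(2.0) + n * R * math.log(2.0)
print(delta_S(heat_then_expand), delta_S(expand_then_heat), exact)
# all three ≈ 14.41 J/K
```

Both paths reproduce the closed-form result nCv ln(T₂/T₁) + nR ln(V₂/V₁), which is what “state variable” means in practice.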
Fig. N.61 The stellar parallax. Say there’s a “near star” that is closer to the earth than most “distant”, essentially motionless stars. Over one year, i.e. one revolution of the earth around the sun, the near star’s position relative to the distant stars changes along an (approximately) circular trajectory. This defines the angle p, or parallax angle, as shown
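Since Fig. N.61 is all about turning a parallax angle into a distance, here is that computation in numbers—my own illustration, not from the book: for small p, the star’s distance is about one astronomical unit divided by p in radians. Plugging in the parallax Bessel measured for 61 Cygni in 1838 (about 0.314 arc seconds) gives roughly the ten-and-a-half light years he announced.

```python
# Parallax angle -> distance (illustration only, not from the book).
# For a small parallax p, the star's distance is ~ 1 AU / p (p in radians).
import math

AU = 1.496e11            # earth-sun distance, m
LIGHT_YEAR = 9.4607e15   # m

def distance_from_parallax(p_arcsec):
    p_rad = math.radians(p_arcsec / 3600.0)   # arc seconds -> radians
    return AU / math.tan(p_rad)

# Bessel's 1838 parallax for 61 Cygni: about 0.314 arc seconds
print(distance_from_parallax(0.314) / LIGHT_YEAR)  # ≈ 10.4 light years
```

Note the inverse relation: the smaller the parallax, the farther the star—which is why only the nearest stars have measurable parallaxes.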
820. Friedrich Wilhelm Bessel’s most famous contribution, actually, was probably that he found a way to measure the parallax of a star. Which if you don’t know what the parallax of a star is, check out Fig. N.61: it’s all in there. The important thing about measuring the parallax of a star is that through the parallax you can figure out how far that star is from us. To see how that is done, see, again, Fig. N.61. The Bessel functions are functions that showed up in some of Bessel’s work (totally unrelated to postglacial rebound, for sure: but let’s not get into it). Bessel studied and described the functions in some detail; people then started calling them by his name, and the name stuck.
821. Named after the German mathematician Georg Ferdinand Frobenius, who in the second half of the nineteenth century was a prof. at the Federal Institute of Technology in Zürich and at the University of Berlin.
822. If you look this up, you’ll see that, in formulae for the solutions to the Bessel equation (AKA the Bessel functions), the product of (b + 1) times (b + 2), etc., at the denominator of (N.172) is written in compact form via the Gamma (Γ) function. Γ is kind of like a generalization of the factorial (which see note 56) for real—rather than just integer—numbers, defined e.g. by the property that Γ(x + 1) = xΓ(x) for any number x. I don’t think Γ is particularly useful for us right now, but just in case.
823. You might notice the ambiguity, here, between what we called the order of the Bessel function, which is equal to b and can in principle take any value, and the order of the Bessel equation, which is just 2, by definition, because the
highest derivative of the unknown function that appears in the Bessel equation is a second derivative. That is not ideal, but it’s just the way it is.
824. You might be wondering, since we’ve just found that if Jb is a solution then also J−b is a solution, why don’t we just use those? The answer is that Jb and J−b are not, or not necessarily, linearly independent. The reason why that is the case is in the Pandora’s box—and in a lot of good mathematics books which, if you got to this footnote, you might actually be interested in reading.
825. “In 19th-century England the eight Bridgewater Treatises were commissioned so that eminent scientists and philosophers would expand on the marvels of the natural world and thereby set forth ‘the Power, Wisdom, and Goodness of God as manifested in the Creation.’ ” (Britannica)
826. Our ears can’t hear anything graver than about 20 Hz, and most seismic waves carry frequencies rather close to 1 Hz. If you speed up a sound, all frequencies become higher—just think of it: speeding up a sound is the same as increasing the rate of oscillation per unit time.
827. Tribo is ancient Greek for “rubbing”, “friction”, etc.
828. Which I guess Franklin could pick some reference charge, and measure the force (in ways similar to how Cavendish measured gravity: the torsion pendulum, etc.) that attracted/repelled that to/from a pair of resin and glass pieces that had been rubbed together (and that carried no previous charge, of course).
829. If you’ve heard of Ohm’s law before, you’re likely to have seen it written in a different way. The two ways of writing it are equivalent—you can look it up. In any case, it doesn’t matter much for our current goals.
830. C. L. Madsen, “Hans Christian Ørsted”, The Journal of the Society of Telegraph Engineers, vol. 5, 1876.
831. The Evolution of Physics—the Growth of Ideas from Early Concepts to Relativity and Quanta, Simon and Schuster, New York 1938.
832. I don’t know how “immediate” this is to you; it wasn’t totally immediate to me, I don’t think. But anyway, we shall get back to this when I’ll tell you about “magnetization” of rocks, a few pages down.
833. I’ll get to Coulomb’s law in a second. I am not bothering to make a distinction between electrostatics and electrodynamics, because Maxwell’s equations contain both anyway; but in case you’re wondering, people speak about electrostatics when they look at electric charges that are not moving: in that case, Coulomb’s law is enough to describe all their effects. But a moving charge happens to generate a magnetic field on top of the electric one, and Coulomb’s law doesn’t account for that. That’s what we call electrodynamics, and, taken together, Maxwell’s equations do describe those effects, too.
834. Which is not quite true, actually. When Maxwell wrote his own papers, between the 1860s and 1880s, vector algebra hadn’t been invented yet, or hadn’t yet been put in a definitive form; so Maxwell wrote everything in scalar notation, which meant twelve equations at the very least. The vector version that we study today, and which I am about to give you, is due to Oliver Heaviside—telegraph
operator, self-taught physicist, unsung hero of electromagnetism who never made it into academia.
835. In case you are wondering, I am talking of a magnetic dipole because there doesn’t exist in nature such a thing as a, uhm, magnetic “monopole”, or “magnetic charge”. Or in any case, it’s never been observed.
836. Coulomb did this with a very precise torsion balance, similar to that of Michell (which remember Chap. 1); but he used it to measure electric fields, instead of the earth’s mass.
837. We learned about grad, div, curl and all that in Chap. 6 already, and we saw, e.g., that the curl of the gradient of any scalar field is always zero (note 329). But we didn’t look at the divergence of a curl. To see that that, too, is always zero, just take the divergence of expression (N.39) for the curl, from note 329, and see what happens.
838. Remember (Chap. 6) that ∇² is called the “Laplacian” operator; and, not surprisingly, then, an equation that says that the Laplacian of some function is zero is called “Laplace’s equation”.
839. In note 541 we have managed to, I quote myself, “rewrit[e] the cylindrical-coordinate components of the gradient, whose Cartesian components are ∂/∂x, ∂/∂y, ∂/∂z, in terms of derivatives with respect to the cylindrical coordinates r, ϕ and z.” Now, to translate (N.190) into spherical coordinates, we need to rewrite the spherical-coordinate components of the gradient in terms of derivatives with respect to the spherical coordinates r, ϑ and ϕ, defined like in Fig. 6.12. Similar to note 541, use Fig. 6.12 and trigonometry to see that
$$
\begin{cases}
x = r \sin \vartheta \cos \varphi, \\
y = r \sin \vartheta \sin \varphi, \\
z = r \cos \vartheta,
\end{cases} \tag{N.249}
$$

and

$$
\begin{cases}
r = \sqrt{x^2 + y^2 + z^2}, \\
\vartheta = \arctan \left( \frac{\sqrt{x^2 + y^2}}{z} \right), \\
\varphi = \arctan \left( \frac{y}{x} \right),
\end{cases} \tag{N.250}
$$

and

$$
\begin{cases}
\hat{\mathbf{x}} = \hat{\mathbf{r}} \sin \vartheta \cos \varphi + \hat{\boldsymbol{\vartheta}} \cos \vartheta \cos \varphi - \hat{\boldsymbol{\varphi}} \sin \varphi, \\
\hat{\mathbf{y}} = \hat{\mathbf{r}} \sin \vartheta \sin \varphi + \hat{\boldsymbol{\vartheta}} \cos \vartheta \sin \varphi + \hat{\boldsymbol{\varphi}} \cos \varphi, \\
\hat{\mathbf{z}} = \hat{\mathbf{r}} \cos \vartheta - \hat{\boldsymbol{\vartheta}} \sin \vartheta,
\end{cases} \tag{N.251}
$$
where xˆ , yˆ , zˆ , rˆ and ϕˆ are all defined like in note 541; ϑˆ is perpendicular to both rˆ and ϕˆ and points in the sense of increasing ϑ. Next, you can play with the derivatives of the generic function f (x, y, z),
822
Notes
$$
\frac{\partial f}{\partial x} = \frac{\partial f}{\partial r} \frac{\partial r}{\partial x} + \frac{\partial f}{\partial \vartheta} \frac{\partial \vartheta}{\partial x} + \frac{\partial f}{\partial \varphi} \frac{\partial \varphi}{\partial x}:
$$

first, you should work out, based on (N.250), the derivatives of r, ϑ and ϕ with respect to x. Then, replace x and y and z with their expressions (N.249) in terms of r, ϑ and ϕ. Then, you do the (almost) same thing again with ∂f/∂y and ∂f/∂z. Finally, remember that

$$
\nabla f = \hat{\mathbf{x}} \frac{\partial f}{\partial x} + \hat{\mathbf{y}} \frac{\partial f}{\partial y} + \hat{\mathbf{z}} \frac{\partial f}{\partial z},
$$

and replace x̂, ŷ and ẑ with their expressions (N.251), and ∂f/∂x, etc., with what we’ve just found: so that what you get is a formula for the gradient where no Cartesian stuff appears: everything is referred to spherical coordinates, i.e.,

$$
\nabla f = \hat{\mathbf{r}} \frac{\partial f}{\partial r} + \hat{\boldsymbol{\vartheta}} \frac{1}{r} \frac{\partial f}{\partial \vartheta} + \hat{\boldsymbol{\varphi}} \frac{1}{r \sin \vartheta} \frac{\partial f}{\partial \varphi}.
$$
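If you’d rather not grind through all that algebra, here is a quick numerical sanity check of the gradient formula (my own sketch, not something from the book): evaluate the spherical-coordinate expression by finite differences for an arbitrary test function, and compare with the exact Cartesian gradient projected onto r̂, ϑ̂ and ϕ̂.

```python
# Finite-difference check (illustration only) of
#   grad f = r^ df/dr + theta^ (1/r) df/dtheta + phi^ (1/(r sin(theta))) df/dphi.
import math

def f_cart(x, y, z):                 # arbitrary smooth test function
    return x * x * y + 3.0 * z

def grad_cart(x, y, z):              # its exact Cartesian gradient
    return (2.0 * x * y, x * x, 3.0)

def f_sph(r, th, ph):                # same function, spherical arguments
    return f_cart(r * math.sin(th) * math.cos(ph),
                  r * math.sin(th) * math.sin(ph),
                  r * math.cos(th))

def grad_sph(r, th, ph, h=1e-6):     # components along r^, theta^, phi^
    dfdr  = (f_sph(r + h, th, ph) - f_sph(r - h, th, ph)) / (2 * h)
    dfdth = (f_sph(r, th + h, ph) - f_sph(r, th - h, ph)) / (2 * h)
    dfdph = (f_sph(r, th, ph + h) - f_sph(r, th, ph - h)) / (2 * h)
    return (dfdr, dfdth / r, dfdph / (r * math.sin(th)))

# pick a point; project the exact Cartesian gradient onto the spherical unit vectors
r, th, ph = 2.0, 0.7, 1.1
x = r * math.sin(th) * math.cos(ph)
y = r * math.sin(th) * math.sin(ph)
z = r * math.cos(th)
gx, gy, gz = grad_cart(x, y, z)
rhat = (math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th))
that = (math.cos(th) * math.cos(ph), math.cos(th) * math.sin(ph), -math.sin(th))
phat = (-math.sin(ph), math.cos(ph), 0.0)
proj = lambda e: gx * e[0] + gy * e[1] + gz * e[2]
print(grad_sph(r, th, ph))
print(proj(rhat), proj(that), proj(phat))   # same three numbers
```

The two printed triplets agree to many decimal places, which is as good a confirmation of the 1/r and 1/(r sin ϑ) factors as any.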
This is the same procedure that we used in note 541 to figure out the gradient and Laplacian in cylindrical coordinates; except it’s even more tedious. I am not going to go through the details—I am pretty sure you can do it yourself, with some patience. I’ve just shown you the gradient formula; as for the Laplacian, you should get:

$$
\nabla^2 f = \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial f}{\partial r} \right) + \frac{1}{r^2 \sin \vartheta} \frac{\partial}{\partial \vartheta} \left( \sin \vartheta \, \frac{\partial f}{\partial \vartheta} \right) + \frac{1}{r^2 \sin^2 \vartheta} \frac{\partial^2 f}{\partial \varphi^2},
$$
from which Eq. (N.191) of note 578 follows.
840. After Adrien Marie Legendre. Born in Paris, 1752, “into a well-to-do family, [Legendre] had enough money to allow him to dedicate himself to mathematics. He made important contributions to the theory of numbers and to a branch of calculus that dealt with what are called elliptical integrals, though in the latter case he was quickly surpassed by the work of Abel (see note 448) and Jacobi. Legendre rejoiced in these new discoveries regardless of the fact that they overshadowed his own years of labor. [...] The upheaval of the French Revolution cost him his financial independence, but by that time he was sufficiently well known to be able to support himself by teaching and by accepting government positions. He never did as well as he deserved, apparently because of the enmity of Laplace, a small-minded man” (Asimov).
841. The product of a generalized Legendre polynomial, with cos ϑ as its argument, times a sinusoidal function of ϕ, is what people call a spherical harmonic function,

$$
Y_l^m(\vartheta, \varphi) = P_l^m(\cos \vartheta) \, e^{i m \varphi}.
$$
(Remember that $e^{im\varphi} = \cos(m\varphi) + i \sin(m\varphi)$, with i the imaginary unit.) The thing about spherical harmonics, which makes them quite useful in a number of applications, is that, if you take all of them, with l going all the way to infinity, what you have is a basis for the space of functions defined over the surface of a sphere: i.e., any function defined on a sphere can be written as a linear combination of the $Y_l^m$’s. And, if you define the inner product in that space as the integral of the arithmetic product of functions over the whole surface of the sphere, then it can also be shown that the $Y_l^m$’s are all orthogonal to one another, i.e., remember the sines and the cosines, the inner product of two harmonics is always zero—except for the inner product of a harmonic with itself892. This makes them a very convenient basis: like sines and cosines are, when you work in one dimension. Also similar to sines and cosines, spherical harmonics are oscillatory functions—see Fig. N.62 for some examples—so that, in practice, finding the spherical-harmonic coefficients of a function defined on a sphere is, like, the two-dimensional, spherical-geometry equivalent of finding the Fourier series, or transform, of a one-dimensional function of, e.g., time or distance. The earth being (almost) a sphere, earth scientists deal with a lot of functions that are defined on a sphere: if you are considering a career in geophysics, chances are that you’ll meet spherical harmonics again.
842. When l and m are large, what we get are very rapidly oscillatory functions, which are related to the very small-scale details of the magnetic field—which we don’t expect to be able to constrain, anyway.
843. Real-world problems like this—constraining the global pattern of the earth’s magnetic field from a set of local measurements—most often don’t have an exact solution.
There are several reasons for that: (i) you can’t really decide how many data are available: the best you can do is, don’t try to constrain more parameters than you have data—in the case we’re treating right now, just make sure that the max value of l that you decide to pick is not too large; (ii) even so, you also have little control on how densely and uniformly distributed the observation points are: or in other words, as a general rule, not all the coefficients alm , etc., are equally well constrained by the data—some might be totally unconstrained; and finally, (iii) data are “noisy”—no measurement is perfect. Bottom line, let us write our linear inverse problem in the form A · x = d,
(N.252)
where each row of A is calculated from one of the (N.198), plugging in the coordinates r, ϑ, ϕ of the observation point. The vector d contains the numerical values of all observations of Br, Bϑ, Bϕ; x contains the unknown coefficients, alm, etc., for all values of l and m. Because we decided to have more data than unknowns, A will have more rows than columns, and, as a general rule, it won’t have an inverse—which is the same as saying, the linear system does not have a solution893.
Fig. N.62 Some examples of spherical harmonics, plotted on the surface of a sphere, at various degrees l (growing from top to bottom) and orders m (growing left to right). Different shades of gray stand for different values of the function, with gray to black meaning increasingly negative and gray to white increasingly positive
What people do, though, is, they say, let’s pick the vector x that is, so to say, closest to being a solution, i.e., x such that the magnitude of A · x − d is as small as it can possibly be. The measure of magnitude that people like the most is the squared norm |A · x − d|²—hence “least squares”. Finding x such that this expression is minimum is the same as finding x such that (see note 615)

$$
\frac{\partial}{\partial x_i} \left| \mathbf{A} \cdot \mathbf{x} - \mathbf{d} \right|^2 = 0
$$

for all i’s, i.e., all model parameters. If you do the math894, you’ll find that this implies that

$$
\mathbf{A}^T \cdot \mathbf{A} \cdot \mathbf{x} = \mathbf{A}^T \cdot \mathbf{d}.
$$

And but A^T · A is a square matrix: whatever the dimensions of A, the matrix A^T by definition has as many rows as A has columns. With some luck, then, A^T · A will have an inverse, and

$$
\mathbf{x} = \left( \mathbf{A}^T \cdot \mathbf{A} \right)^{-1} \cdot \mathbf{A}^T \cdot \mathbf{d},
$$
and people like to call (A^T · A)^{−1} · A^T the “generalized” inverse of A.
844. P. A. Davidson, An Introduction to Magnetohydrodynamics, Cambridge University Press, 2001.
845. It is OK to neglect the displacement current here, because, as a general rule, in geomagnetism ∂E/∂t is always much smaller than the other terms in Ampère’s law, Maxwell’s (N.186) in note 575, i.e.,

$$
\left| \nabla \times \mathbf{B} \right| \gg \mu \epsilon \left| \frac{\partial \mathbf{E}}{\partial t} \right|. \tag{N.253}
$$

To check that this is actually true, call L a typical length scale, and T a typical time scale of electromagnetic variations (meaning, how much you have to move in space, or how much time you have to wait, in order-of-magnitude sense, before you see some significant variation in E and/or B). It follows from (N.253), then, that

$$
\frac{B}{L} \gg \mu \epsilon \frac{E}{T},
$$

and we can use this to try and find out how large or how small L and T need to be for displacement current to be negligible. (This, by the way, is called a “scale analysis” in science slang.) By Faraday’s law, Maxwell’s (N.185) in note 575, we have that E/L ∼ B/T, and substituting gives

$$
\frac{B}{L} \gg \mu \epsilon \frac{B L}{T^2}.
$$

B cancels out, and after rearranging we are left with

$$
T^2 \gg \mu \epsilon L^2. \tag{N.254}
$$

Now, the largest conceivable L for geomagnetic variations is of the order of Earth’s circumference, say 40,000 km; if (N.254) is true when L = 40,000 km, it will also be true for any other reasonable value that L might take. Permittivity and permeability of the core can’t be measured directly, of course; but we know that, whatever material we look at, their orders of magnitude are always the same as those of the values they take in a vacuum (with rare exceptions that need not concern us—nothing that we expect to find in significant quantities in the core, or generally within the earth). So, filling in numbers gives T ≫ 8 s, which means that for all variations with a time scale longer than 8 s we can safely neglect the displacement current term. In geomagnetism we are usually looking at variations with timescales of minutes and longer (much longer for
core field variations) so this is a good approximation. (It is known, by the way, as the “quasi-static approximation”.)
846. In note 329 I worked out a nabla-related proof already; in note 837 I almost worked out another one. You can use those examples as, uhm, inspiration, to figure out this one.
847. If you ask me, by the way, I don’t like this way of calling things: in everyday language if you say “flux” you’re probably talking about some displacement of matter through something—a synonym of “flow”, I’d say? So using this word here would make sense if B were a displacement: but B is the direction of the magnetic field: it doesn’t directly describe any motion of matter. Anyway.
848. I think that the best way to go about this is to use the Levi-Civita symbol $e_{ijk}$, see note 749, to rewrite the ith component of u × v,

$$
\{\mathbf{u} \times \mathbf{v}\}_i = e_{ijk} u_j v_k.
$$

Apply this to a · (b × c):

$$
\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = a_i e_{ijk} b_j c_k = e_{ijk} a_i b_j c_k;
$$

and then apply it to (c × a) · b:

$$
(\mathbf{c} \times \mathbf{a}) \cdot \mathbf{b} = b_i e_{ijk} c_j a_k = e_{ijk} b_i c_j a_k = e_{jki} b_j c_k a_i = e_{jki} a_i b_j c_k.
$$

Now, you need two permutations of the indexes i, j, k to go from $e_{ijk}$ to $e_{jki}$: which means, by the definition of the Levi-Civita symbol, that $e_{ijk} = e_{jki}$. But then, the right-hand sides of the two equalities we just wrote are the same thing. It follows that a · (b × c) = (c × a) · b, QED.
849. Both Edward, or “Ted”, Irving and Kenneth Creer were Runcorn’s grad students in Cambridge. Irving actually failed to graduate—he did land a research job in Australia, though, and went on to have a pretty good academic career. Check out the interview he gave to Earth magazine in the 2000s—“Down to Earth with: Ted Irving”, by Nate Burgess.
850. The so-called East Pacific Rise, the mid-oceanic ridge in the Pacific ocean, is discovered around the same time.
“At the turn of the [twentieth, obviously] century”, writes Heezen, “widely spaced soundings made by Alexander Agassiz from the Albatross had revealed a relatively high elevation on the bottom of the eastern Pacific southeast of central Mexico. In 1929 echo soundings from the oceanographic vessel Carnegie showed that this so-called Albatross Plateau is in reality a mountain range”, etc. Alexander Agassiz was the son of
Louis Agassiz, of ice-age fame, who had emigrated to the US in the mid-1800s, becoming a prof. at Harvard. Alexander graduated from Harvard and then “studied engineering in the Lawrence scientific school,” says the Biographical Dictionary of America, 1906, “and after taking his degree pursued a post-graduate course in chemistry, at the same time teaching that science in a young ladies’ seminary conducted by his father. [...] He became interested in coal mining in Pennsylvania, and in 1866 made some investigations in the copper mines of Lake Superior, and became president of the Calumet and Hecla mining company, which corporation paid to its stockholders over $50,000,000 in dividends prior to 1895. This brought Agassiz a very large fortune, which he used in munificent gifts to the Harvard museum, of which he became assistant curator, and, after the death of his father, curator. These gifts aggregated over $500,000, and were mostly spontaneous responses to needs that presented themselves in his daily work.”
851. Early on in his paper, Hess writes: “I consider this paper an essay in geopoetry”—in reference to Jan Umbgrove, a Dutch geologist who in the intro to his own textbook (The Pulse of the Earth, Martinus Nijhoff, 1947) had written, “it often happens that the historian, who supplies from his own imagination the missing lines of his scientific prose, allows himself to be carried away by an unbridled poetical inspiration and soars to giddy heights. This is inevitable, yet he should never forget that hypotheses constitute a necessary evil and should be discarded as soon as contradictory facts come to light. “We may expect to find a similar ‘geopoetical’ aspect in many a geological treatise, in addition to the normal geological prose. However, authors should always keep their theories strictly separated from descriptions and conclusions of a more rigorously documented kind.” So what Hess means is that his study should be regarded as mere speculation: that will later have to be proved or disproved by data. He also writes: “mantle convection is considered a radical hypothesis not widely accepted by geologists and geophysicists.” (Which is kind of funny, though, because so many people were talking about it as a realistic hypothesis at that point, and had been doing so for a while now.) But then he goes on, “if it were accepted, a rather reasonable story could be constructed”, and that’s followed by the seafloor spreading story, again, although Hess doesn’t call it that way.
852. Why do people care so much about stuff like “priority”, anyway?
853. Incidentally, while we know from Everest’s data and Pratt’s and Airy’s studies that Himalaya has a deep isostatic root, we might guess that Mount Schiehallion (Chap. 1) doesn’t. If it did, Maskelyne’s value for the density of the earth would have turned out to be much more different from Cavendish’ estimate than it actually is.
854. “Victor Vacquier Sr., a Scripps Institution of Oceanography geophysicist who developed key instruments for mapping the Earth’s magnetic field and whose research provided a strong experimental foundation for the now widely accepted theory of plate tectonics, died of pneumonia Jan. 11 in La Jolla. He was 101. [...] Vacquier was born Oct. 13, 1907, in St. Petersburg, Russia. In 1920,
828
855.
856. 857. 858. 859. 860.
861.
Notes
after the Russian revolution, he and his family escaped to Helsinki, crossing the frozen Gulf of Finland in a one-horse sleigh. The family lived in France until he and his mother immigrated to the United States in 1923.” (From Vacquier’s obituary in the Los Angeles Times, January 24, 2009.) I don’t think I told you what the difference between a “rock” and a “mineral” is. That’s probably partly because those words must have been used interchangeably for a long time before people decided to attach each of them to a different, specific thing. So, today, if you hear a geologist utter the word mineral, they mean something with a chemical formula: some crystalline substance that’s identified by the relative quantities of elements that make it up, and how they are connected together. For example, calcite is a mineral. It’s made of calcium carbonate, i.e., CaCO3 , with the atoms packed together in a certain way. Two minerals might share the exact same chemical formula, but they differ because of the way their crystals are made: e.g., aragonite is also CaCO3 , but its atoms are not packed in the same way as in calcite: hence different physical/chemical properties, and a different name. A rock, on the other hand, is something you pick up on the field. In principle you might be able to find a rock that’s made of one and only one mineral, but usually rocks will carry a whole bunch of different minerals within them. Think, e.g., granite, where you can actually see tiny chunks of each of the minerals that form it. And but however, uncertainty is always huge—multiple-order-of-magnitude huge—when you’re dealing with viscosity in geophysics. It’s just the way it is. There’s convection in the mantle, and there always has been, so everything should be well mixed, no regional differences in chemistry or stuff like that. If you don’t remember what that is, see Chap. 7, Eq. (7.134). And mass excess above that depth is neglected, I guess. 
Incidentally, Jordan makes a distinction between lithosphere and “tectosphere”. The lithosphere is still the shallowest, high-viscosity, rigid layer of the earth, that includes the crust but is thicker than the crust. The tectosphere is “the region of the Earth occupied by [...] coherent [tectonic] plates”: it’s whatever forms a plate and moves when the plate moves. In practice, according to Jordan, deep continental roots are not necessarily part of the lithosphere, i.e., they don’t necessarily have the same composition, and rigidity, and viscosity, etc. But they must be part of the tectosphere, because otherwise it would be a very strange coincidence to only find them under continents. (To be honest, though, I wonder whether we really need an extra word for this. I think most of my colleagues would agree with Jordan’s ideas on continental roots; but the word “tectosphere” is rarely spoken.)

Geologists will tell you that shields and cratons are not the same thing: but as far as this book is concerned, we can just treat them as if they were. And as for what a craton is, if you don’t remember, go back to the caption of Fig. 5.9.

I don’t know why cratons exist: I might be wrong, but I don’t think anyone knows for sure—although there’s plenty of people who are trying to find out.
862. For example from studies of the basalt that you find at mid-ocean ridges, whose composition, as we know by now, shouldn’t be too far from that of the mantle.
863. And it’s a fine enough model, I think, given how little we know of how radioactive the lithosphere exactly is. (And maybe the crust, too.) But it is not the only thermal model that there is, that could work for the continental lithosphere. If you check out some textbooks—like the ones by Turcotte and Schubert, that I’ve mentioned before, or Mary Fowler’s The Solid Earth—an Introduction to Global Geophysics (the one I have is the second edition, published by Cambridge University Press in 2005)—you find that people like a model where H is not constant but decays exponentially as depth grows.
864. In case you’re wondering, no, there is no way you can dig as deep as the Moho. People tried: in the U.S., the National Science Foundation funded a project called “Mohole”, involving a lot of the big names in American 1950s/1960s geophysics, like Harry Hess, Maurice Ewing, etc., the idea being to pick a place where the crust is particularly thin, and drill all the way down to the mantle, collecting samples. Oil companies had found ways to drill the ocean floor, even in places where the ocean is deep, operating the drills from floating platforms and ships. So, it was decided to try and drill through oceanic crust, starting in the Caribbean, off Guadalupe Island. But it wasn’t easy, and after a number of attempts, 1961 to 1966, Mohole became “no hole”—the interesting story of how that happened is told, e.g., by Daniel Sweeney, “Why Mohole Was No Hole”, Invention & Technology, vol. 9, 1993. There’s also the “Kola superdeep borehole”, drilled by the Soviet Union, and then Russia, between 1970 and 1994. Which, as of today, is still the deepest hole on earth. The Soviets figured that drilling under 3000 m of water was too much of a mess, and chose to go for the continental Moho.
According to Mihai Andrei895, temperature under Kola turned out to grow, with growing depth, faster than initially estimated—which caused all sorts of unexpected problems. The Russians managed to hit 12 km in 1983: but that same year, the drill broke—you had to work with such superlong drills... and a 5-km section of it was left in the hole. The record depth of 12 km was never surpassed, and the facility in Kola was closed a couple of years after the collapse of the Soviet Union.
865. The values that follow, both for crust and lithosphere, are taken from the 1998 Chemical Geology paper by Roberta Rudnick, Bill McDonough and Rick O’Connell, “Thermal Structure, Thickness and Composition of Continental Lithosphere”. Roberta and Rick were profs. at my department during my Ph.D. Bill (Roberta’s husband) was mostly running Roberta’s geochemistry lab, if I remember correctly. This paper was written during my stay there, clearly, but it’s funny that at the time I probably understood nothing or close to nothing of what they were doing. Still, like I said, I was supposed to look at the thickness of continental roots and stuff like that (and I did, and published papers on it). Rick gave a course in geodynamics, which I took, and but I think there were large portions of it that remained obscure to me. I was a grad student—not an
undergrad—so Rick wouldn’t be strict, the way he graded me. It was clear that I was totally not a geologist. I did better in the seismology and applied math courses that I took at the same time. Otherwise, I probably would have been in trouble.
866. The best example of a rock that carries mantle xenoliths is kimberlite, so called because it is largely found in Kimberley, South Africa; it is found in the earth’s crust in the form of long vertical structures that people call kimberlite pipes; and it is the “matrix” rock for diamonds. Peter H. Nixon and others (“Kimberlites and Associated Inclusions of Basutoland: a Mineralogical and Geochemical Study”, The American Mineralogist, 1963) write that “the term ‘kimberlite’ was proposed by Lewis896 (1888) for the brecciated, serpentinized, peridotitic rocks occurring in the Kimberley area of South Africa” (H. C. Lewis, “The matrix of the Diamond”, Geological Magazine, vol. 5, 1888).
867. Unless we prefer to figure it out by ourselves. Which we can, if we want to. Because in note 621 we’ve learned that
$$\int_0^z \operatorname{erf}(x)\, dx = z \operatorname{erf}(z) + \frac{1}{\sqrt{\pi}} e^{-z^2} - \frac{1}{\sqrt{\pi}}.$$
The integral we’re faced with right now is slightly different, but, through a change of the integration variable, $\xi = \frac{z}{2\sqrt{\kappa t}}$, we can rewrite it
$$\begin{aligned}
\int_0^d \operatorname{erf}\left(\frac{z}{2\sqrt{\kappa t}}\right) dz &= 2\sqrt{\kappa t} \int_0^{\frac{d}{2\sqrt{\kappa t}}} \operatorname{erf}(\xi)\, d\xi \\
&= 2\sqrt{\kappa t} \left[ \frac{d}{2\sqrt{\kappa t}} \operatorname{erf}\left(\frac{d}{2\sqrt{\kappa t}}\right) + \frac{1}{\sqrt{\pi}} e^{-\frac{d^2}{4\kappa t}} - \frac{1}{\sqrt{\pi}} \right] \\
&= d \operatorname{erf}\left(\frac{d}{2\sqrt{\kappa t}}\right) + 2\sqrt{\frac{\kappa t}{\pi}}\, e^{-\frac{d^2}{4\kappa t}} - 2\sqrt{\frac{\kappa t}{\pi}},
\end{aligned}$$
and Eq. (N.215) of note 637 follows, QED.
868. Perhaps, Davies just thought: inside the lithosphere, the erf in the T-versus-depth diagram is almost a straight line, connecting the origin (T ≈ 0 °C at the earth’s surface, z = 0) to the point where z = d and T = T0 (that’s what we have at the base of the lithosphere). So, if you take the average of that over depth, you get half T0: which is about 700 °C. And it’s easy to see that this doesn’t change if you change t: similar to my result. I guess that looked so simple to Davies, that he figured no explanation was needed.
869. Frank M. Richter, a prof. at the University of Chicago. No relation with the Charles F. Richter of Richter scale fame, I don’t think.
870. You already know who Francis Birch was. Francis Dominic Murnaghan was a mathematics prof. (and head of department) at Johns Hopkins; I don’t think he and Birch ever worked together, but Birch used some earlier work by Murnaghan to sort out the Birch-Murnaghan equation. Hence the name.
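Identities like the erf integral of note 867 are easy to sanity-check numerically. Here is a minimal Python sketch: it compares a brute-force midpoint-rule quadrature of the left-hand side against the closed-form right-hand side. The values of d, κ and t below are arbitrary test numbers (chosen so that d/(2√(κt)) is of order one), not anything from the book.

```python
# Numerical check of the erf-integral identity of note 867.
import math

def lhs(d, kappa, t, n=100000):
    # midpoint-rule approximation of the integral of erf(z / (2 sqrt(kappa t)))
    # for z between 0 and d
    h = d / n
    s = 2.0 * math.sqrt(kappa * t)
    return sum(math.erf((i + 0.5) * h / s) for i in range(n)) * h

def rhs(d, kappa, t):
    # closed form: d erf(d/(2 sqrt(kappa t))) + 2 sqrt(kappa t / pi) (e^{-d^2/(4 kappa t)} - 1)
    s = 2.0 * math.sqrt(kappa * t)
    return d * math.erf(d / s) + 2.0 * math.sqrt(kappa * t / math.pi) * (math.exp(-(d / s)**2) - 1.0)

d, kappa, t = 100e3, 1e-6, 3.0e15   # arbitrary test values, in SI units
print(lhs(d, kappa, t), rhs(d, kappa, t))   # the two numbers should agree
```

Dividing either number by d gives the depth-averaged temperature profile that note 868 is about.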
871. I hope that, back in Chap. 6, I made it sufficiently clear that the formulae we derived to relate stress and deformation (AKA strain) are only valid as long as (the changes in) both stress and deformation are very small—approximately infinitesimal. This was a good-enough proxy to deal with the propagation of seismic waves through the earth (tiny disturbances, provided that you are not too close to the actual fault—which we don’t need to be: not in this book, anyway) and/or to deal with postglacial rebound and mantle convection. But an equation of state, which is what we are after right now, needs to work in the more general case where stress and deformation are not infinitesimal, but “finite”: because an equation of state is supposed to describe the behavior of matter in laboratory experiments, where changes in stress and strain could be relatively large, and/or across a broad range of depths within the earth, etc.
To see the rationale for the definition of finite strain that I am about to give, start by considering a cube of some material, whose edge length, before it is strained, we shall call $x_0$, and whose “unstrained” volume is then $V_0 = x_0^3$. Imagine that the cube is compressed uniformly in all directions (this is a simplification with respect to the most general case, where the amount of compression/expansion might depend on direction; but that’s enough for our current goals): so that after compression it is still a cube, but with edge length equal to some new value that we might call $x$, and volume $V = x^3$. Let us call $u = x - x_0$ the change in edge length. We have:
$$x^2 - x_0^2 = x^2 - (x - u)^2 = 2xu - u^2. \tag{N.255}$$
If we write
$$u = \frac{\partial u}{\partial x}\, x$$
and sub that into (N.255), we get
$$\begin{aligned} x^2 - x_0^2 &= 2x \frac{\partial u}{\partial x} x - \left(\frac{\partial u}{\partial x}\right)^2 x^2 \\ &= -2 \left[ \frac{1}{2} \left(\frac{\partial u}{\partial x}\right)^2 - \frac{\partial u}{\partial x} \right] x^2. \end{aligned}$$
The term in square brackets is precisely what we call (“Eulerian”) finite strain, i.e.,
$$f = \frac{1}{2}\left(\frac{\partial u}{\partial x}\right)^2 - \frac{\partial u}{\partial x} = \frac{x_0^2 - x^2}{2x^2}. \tag{N.256}$$
It follows from this that
$$x_0^2 - x^2 = 2x^2 f,$$
or, which is the same,
$$x_0^2 = x^2 (1 + 2f),$$
or
$$\frac{x_0}{x} = \sqrt{1 + 2f}.$$
From this we can get the ratio of unstrained versus strained volume,
$$\frac{V_0}{V} = \frac{x_0^3}{x^3} = (1 + 2f)^{\frac{3}{2}}, \tag{N.257}$$
which shows in what sense $f$ is a measure, like I was saying, of how much the material is being compressed or expanded. You can turn things around and find an expression for $f$ in terms of $V_0$ and $V$: it follows from (N.257) that
$$1 + 2f = \left(\frac{V_0}{V}\right)^{\frac{2}{3}},$$
or
$$f = \frac{1}{2}\left[\left(\frac{V_0}{V}\right)^{\frac{2}{3}} - 1\right],$$
which I use in note 657, where I figure out the Birch-Murnaghan equation of state. This derivation of f approximately follows Tomoo Katsura’s lecture notes and webpage; Katsura quotes Jean-Paul Poirier’s textbook897: he mentions that in Poirier’s treatment, the finite strain is a tensorial entity: which makes it more general, but also more complex. However, says Katsura, “if we just want to understand compression at high pressure and [the] Birch-Murnaghan equation of state”, then it’s OK to just look at uniform compression—and once we decide to do that, see above, the problem turns from tensorial to scalar, and becomes much simpler to deal with.
872. After defining finite strain f in the previous note, maybe you are expecting me to explain equation (N.219) (of note 657), where it is said that changes in Helmholtz free energy are proportional to the square of f. I am afraid that my “explanation” will disappoint you, though. So far as I understand, in fact, (N.219) is largely an assumption that people make, and that turns out to “fit the data” well. What people typically say in books and papers is, first, that change in “strain energy” can be assimilated to change in U − T S, and, second, that, in the classic case where strain is infinitesimal, change in strain energy is
proportional, indeed, to the squares of the strain coefficients—the components of $\varepsilon$. So, then, when one switches from infinitesimal to finite strain, it is assumed that
$$\delta(U - TS) = a f^2 + b f^3 + c f^4 + \cdots,$$
and then, the more precise you hope your equation of state to be, the higher the power of $f$ where you’ll stop the sum. If you are happy with $f^2$, which we are in this book, then what you get is the second-order Birch-Murnaghan equation; if you include $f^3$ you get third-order Birch-Murnaghan, and so on. Equation (N.256) shows that the finite strain $f$ is a ratio of squared displacements: it’s really not the same thing as strain the way we’d defined it before, but it is still a dimensionless measure of deformation: and people figure that, that being the case, it’s OK to assume that it will stand in the same relation with change in strain energy, as strain does. I wish there were a better explanation for Eq. (N.219) than this one, but I decided that I would stop trying to find it when I saw that Tomoo Katsura, who is probably one of the people around the world that know more than anybody else about this stuff, candidly admits in his own website that he doesn’t know why $f$ is defined the way it is—i.e., based on a difference of squared lengths. He actually redoes all the math starting with $x - x_0 = u$, and then again with $x^3 - x_0^3 = x^3 - (x - u)^3$, instead of (N.255), and defining
$$f = \frac{x - x_0}{x}$$
and
$$f = \frac{x^3 - x_0^3}{3x^3},$$
respectively, instead of (N.256). From this you can get two more equations of state, and see how well they fit the data: and neither can fit them as well as Birch-Murnaghan—the “squared-lengths” one. “I do not know”, concludes Katsura, “why the equation of state based on the squared difference in length [I think he means “difference in squared lengths”] gives proper values. However, experimental results support it. If you know the physical reason why the squared length difference is essential, please let me know it”: and he gives his email address. 873. One more note, to show you where the formula for a, Eq. (N.220) of note 657, comes from. Here it goes: if Eq. (N.219) of note 657 is true, then the formula
(N.221) for $p$ must also hold, and if we differentiate it with respect to $V$ we get
$$\begin{aligned} \left(\frac{\partial p}{\partial V}\right)_T &= -2a\, \frac{\partial}{\partial V}\left( f \frac{\partial f}{\partial V} \right)_T \\ &= -2a \left[ \left(\frac{\partial f}{\partial V}\right)_T^2 + f \left(\frac{\partial^2 f}{\partial V^2}\right)_T \right]. \end{aligned}$$
In the unstrained state, when $V = V_0$, etc., this collapses to
$$\left(\frac{\partial p}{\partial V}\right)_{T=T_0,\, p=0} = -2a \left(\frac{\partial f}{\partial V}\right)^2_{T=T_0,\, p=0},$$
because in the unstrained state, of course, $f = 0$. You can solve that for $a$, to find
$$a = -\frac{1}{2}\, \frac{\left(\frac{\partial p}{\partial V}\right)_{T=T_0,\, p=0}}{\left(\frac{\partial f}{\partial V}\right)^2_{T=T_0,\, p=0}}. \tag{N.258}$$
Now, the trick is to find formulae for $\left(\frac{\partial p}{\partial V}\right)_{T=T_0,\, p=0}$ and $\left(\frac{\partial f}{\partial V}\right)_{T=T_0,\, p=0}$ that we can plug into (N.258). First, look at Eq. (N.222), also of note 657: in the unstrained state it boils down to
$$\left(\frac{\partial f}{\partial V}\right)_{T=T_0,\, p=0} = -\frac{1}{3V_0}. \tag{N.259}$$
Next, remember the definition of the isothermal incompressibility, Eq. (7.136), which can be rewritten
$$\left(\frac{\partial p}{\partial V}\right)_T = -\frac{K_T}{V}.$$
In the unstrained state this gives
$$\left(\frac{\partial p}{\partial V}\right)_{T=T_0,\, p=0} = -\frac{K_{T0}}{V_0}, \tag{N.260}$$
and $K_{T0}$ is defined as the value of $K_T$ when $T = T_0$ and $p = 0$. All that’s left to do is, sub (N.259) and (N.260) into (N.258). We find that
$$a = -\frac{1}{2}\, \frac{-\frac{K_{T0}}{V_0}}{\left(-\frac{1}{3V_0}\right)^2} = \frac{9}{2} K_{T0} V_0,$$
QED.
874. If you’re wondering why “second-order”: go back to note 872. In the real world, by the way, people tend to trust the third-order Birch-Murnaghan more than the second-order one, and in some cases—depending on the materials you are trying to model—they will want to go to higher orders.
875. Was he serious, though? Seismic images might be “fuzzy”, I guess they are. But geochemical “mapping”? Does that even have any “resolution” at all? In Hofmann’s words, geochemists “estimated the proportion of depleted mantle to be [...] between 30% and 90% [of the total mass of the mantle], depending on specific assumptions about the composition of the continental crust and depending on the inclusion or omission of the OIB source reservoir(s)”...
876. Which see Chap. 8, at the very end, the bit on Dewey and Bird’s work.
877. By correlation I mean a statistical thing, that gets used quite a lot in geophysics, though, because geophysicists spend a lot of time comparing all kinds of different data sets to one another. Basically, it is a way of quantifying how similar two sequences of numbers are. If I call $y_i$ the concentration of one isotope in sample $i$, and $z_i$ the concentration of the other isotope in the same sample, and let’s say I have $N$ rock samples, then the two isotopes’ correlation is defined as
$$r = \frac{\sum_{i=1}^N (y_i - \bar{y})(z_i - \bar{z})}{\sqrt{\sum_{i=1}^N (y_i - \bar{y})^2 \sum_{i=1}^N (z_i - \bar{z})^2}},$$
where $\bar{y}$ is the average of all the $y_i$’s,
$$\bar{y} = \frac{\sum_{i=1}^N y_i}{N},$$
and $\bar{z}$ the average of all the $z_i$’s. People call $r$ the Pearson898 correlation coefficient, because there are other ways to define correlation, but let’s not get into that. You see that if $y_i = z_i$ for all $i$’s, then their correlation is 1; if $y_i = -z_i$ for all $i$’s then $r = -1$, and in that case we say that the two sets are perfectly anticorrelated. If $y_i$ is not equal to $z_i$ for all $i$’s (or $-z_i$ for all $i$’s), then one way to look at this is, think of $u_i = y_i - \bar{y}$ as the $i$th component of a vector $\mathbf{u}$, and likewise $v_i = z_i - \bar{z}$, so that
$$r = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}|\,|\mathbf{v}|};$$
now, if you remember how the dot product of two vectors works899, clearly if $\mathbf{u}$ and $\mathbf{v}$ aren’t the same, then $\mathbf{u} \cdot \mathbf{v} < |\mathbf{u}|\,|\mathbf{v}|$, and so $r < 1$. The closer $\mathbf{u}$ and $\mathbf{v}$ are to one another—the smaller the angle they form—the higher the correlation $r$.
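A tiny numerical illustration of note 877, with the Pearson coefficient computed both ways—straight from the definition, and via the dot-product form. The two "isotope concentration" lists below are made up for the example.

```python
# Pearson correlation coefficient, computed two equivalent ways.
import math

y = [1.0, 2.0, 3.1, 4.2, 5.0]    # made-up "concentrations" of isotope 1
z = [2.1, 3.9, 6.2, 8.1, 10.2]   # made-up "concentrations" of isotope 2

ybar = sum(y) / len(y)
zbar = sum(z) / len(z)
num = sum((yi - ybar) * (zi - zbar) for yi, zi in zip(y, z))
den = math.sqrt(sum((yi - ybar)**2 for yi in y) * sum((zi - zbar)**2 for zi in z))
r = num / den

# same thing, thinking of u_i = y_i - ybar, v_i = z_i - zbar as vector components
u = [yi - ybar for yi in y]
v = [zi - zbar for zi in z]
r2 = (sum(ui * vi for ui, vi in zip(u, v))
      / (math.sqrt(sum(ui**2 for ui in u)) * math.sqrt(sum(vi**2 for vi in v))))

print(r, r2)   # identical; close to 1, since z is roughly proportional to y
```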
Fig. N.63 Bessel functions of the first kind, of orders 1/2, 3/2, all the way to 11/2 (as per the legend)
878. Earlier in this book, we’ve already met at least one case of numerical solution: I am talking about the Williamson-Adams method, which see Chap. 7. Maybe read note 674 now, then go back to Williamson-Adams, and see what the two exercises have in common.
879. Nota Bene: I am talking about the velocity at which the mass of the string actually moves—not the wave speed.
880. Which is the b.c. that we prescribed both when we looked at surface waves (Chap. 6) and postglacial rebound (Chap. 7).
881. Or, rather, the variation δp with respect to hydrostatic pressure.
882. In the postglacial rebound problem that we looked at in Chap. 8, we ended up with Bessel functions of integer order, $J_1$, $J_2$, etc. But if you look at how Bessel functions are defined/derived—note 546—you see that there’s no reason why the order shouldn’t be any real number. And in case you are wondering, the Bessel functions of order 1 + 1/2, 2 + 1/2, etc., are shown in Fig. N.63.
883. Cambridge University Press, 1981.
884. Princeton University Press, 1998.
885. Named after Carl Friedrich Gauss. Although, actually, “many people have contributed to Gaussian elimination, including Carl Friedrich Gauss. His method for calculating a special case was adopted by professional hand computers in the nineteenth century. Confusion about the history eventually made Gauss not only the namesake but also the originator of the subject. We may write Gaussian elimination to honor him without intending an attribution.” (From Joseph F. Grcar, “Mathematicians of Gaussian Elimination”, Notices of the American Mathematical Society, vol. 58, 2011.)
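Apropos of note 882: one nice thing about half-integer orders is that the corresponding Bessel functions have closed forms, e.g. $J_{1/2}(x) = \sqrt{2/(\pi x)}\,\sin x$. The sketch below sums the defining power series of $J_\nu$ directly (in the spirit of note 546; the cutoff of 40 terms is an arbitrary choice, plenty for moderate $x$) and compares with the closed form.

```python
# Bessel function of arbitrary real order, summed from its power series,
# J_nu(x) = sum_m (-1)^m / (m! Gamma(m + nu + 1)) (x/2)^{2m + nu},
# checked against the closed form for nu = 1/2.
import math

def J(nu, x, terms=40):
    return sum((-1)**m / (math.factorial(m) * math.gamma(m + nu + 1))
               * (x / 2.0)**(2 * m + nu)
               for m in range(terms))

x = 3.7   # arbitrary test point
print(J(0.5, x))
print(math.sqrt(2.0 / (math.pi * x)) * math.sin(x))   # should match
```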
886. If you look at seismic tomography literature, you’ll read, e.g., of “LU factorization”, or “Cholesky factorization” of $\mathbf{A}^T \cdot \mathbf{A}$. People speak of matrix factorization because it can be shown (but I am not going to do it in this book) that applying Gaussian elimination to a system like (N.245) is the same as finding a lower triangular matrix $\mathbf{L}$, and an upper triangular matrix $\mathbf{U}$, such that $\mathbf{A}^T \cdot \mathbf{A} = \mathbf{L} \cdot \mathbf{U}$. (I’ll let you guess what “upper triangular” means.) Now, $\mathbf{A}^T \cdot \mathbf{A}$ is symmetric and positive-definite (you should be able to convince yourself of that with a little bit of algebra): in such a case it turns out that $\mathbf{U} = \mathbf{L}^T$, and the algorithm that’s typically used to factorize $\mathbf{A}^T \cdot \mathbf{A} = \mathbf{L} \cdot \mathbf{L}^T$ is what people call Cholesky factorization—it takes advantage of the symmetry of $\mathbf{A}^T \cdot \mathbf{A}$ and turns out to be a bit faster than LU.
887. “The Chicken, the Egg and Plate Tectonics”, American Scientist, vol. 109, 2021.
888. See, e.g., The Works of Archimedes, Cambridge University Press, 1897.
889. If this expression doesn’t make sense to you, nothing to worry about: it just means that you need to come back here after you’ve read Chap. 6 of this book.
890. “The Geographer who Hid Giuseppe Mazzini Under his Bed: the Forgotten Story of Antonio Snider-Pellegrini and his Role in the Italian Revolutions of 1848–1849”, Qualestoria, vol. 50, 2022.
891. If you want the full story, read Jason Kelly, “Immeasurable”, University of Chicago Magazine, 2015.
892. In most definitions of $Y_l^m$, actually, $P_l^m e^{im\varphi}$ is also multiplied by a “normalization factor”: a number, which changes with $l$ and $m$, such that the inner product of any two spherical harmonics is either 1 or 0.
893. Because, if $\mathbf{A}$ has an inverse (note 55), then you can dot both sides of (N.252) with $\mathbf{A}^{-1}$, and
$$\mathbf{A}^{-1} \cdot \mathbf{A} \cdot \mathbf{x} = \mathbf{A}^{-1} \cdot \mathbf{d},$$
or
$$\mathbf{I} \cdot \mathbf{x} = \mathbf{A}^{-1} \cdot \mathbf{d},$$
or
$$\mathbf{x} = \mathbf{A}^{-1} \cdot \mathbf{d}.$$
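To make the Cholesky factorization of note 886 concrete, here is a bare-bones pure-Python version of the standard algorithm (not the optimized codes tomographers actually use): it factorizes $\mathbf{A}^T \cdot \mathbf{A} = \mathbf{L} \cdot \mathbf{L}^T$ for a small, made-up matrix $\mathbf{A}$, then multiplies $\mathbf{L}$ by its transpose to check.

```python
# Cholesky factorization of the small symmetric positive-definite
# matrix M = A^T A; the 3x2 matrix A is an arbitrary example.
import math

A = [[2.0, 1.0], [1.0, 3.0], [0.0, 1.0]]
n = len(A[0])
M = [[sum(A[k][i] * A[k][j] for k in range(len(A))) for j in range(n)]
     for i in range(n)]                      # M = A^T A

L = [[0.0] * n for _ in range(n)]            # lower triangular factor
for i in range(n):
    for j in range(i + 1):
        s = sum(L[i][k] * L[j][k] for k in range(j))
        if i == j:
            L[i][j] = math.sqrt(M[i][i] - s)
        else:
            L[i][j] = (M[i][j] - s) / L[j][j]

# check: L L^T should reproduce M
LLT = [[sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
       for i in range(n)]
print(M)
print(LLT)   # same matrix
```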
894. Which is a tedious exercise in matrix algebra; but its implications are so important that, the way I see things, it’s worth working it out at least once in your lifetime: so here goes. Written in scalar notation, the left-hand side of the equation we just wrote becomes
$$\begin{aligned}
\frac{\partial}{\partial x_i} \sum_j \left( \sum_k A_{jk} x_k - d_j \right) \left( \sum_l A_{jl} x_l - d_j \right) &= \frac{\partial}{\partial x_i} \left[ \sum_j \sum_{k,l} A_{jk} A_{jl} x_k x_l - \sum_j \sum_k A_{jk} x_k d_j - \sum_j \sum_l A_{jl} x_l d_j + \sum_j d_j^2 \right] \\
&= \frac{\partial}{\partial x_i} \left[ \sum_{k,l} x_k x_l \left( \sum_j A_{jk} A_{jl} \right) - 2 \sum_k x_k \left( \sum_j A_{jk} d_j \right) + \sum_j d_j^2 \right] \\
&= 2 \sum_k x_k \left( \sum_j A_{jk} A_{ji} \right) - 2 \sum_j A_{ji} d_j \\
&= 2 \sum_k x_k \left(\mathbf{A}^T \cdot \mathbf{A}\right)_{ki} - 2 \sum_j \left(\mathbf{A}^T\right)_{ij} d_j,
\end{aligned}$$
where the only relatively fancy trick I’ve played was to swap the order of summation (sum first over $j$ and then over $k$ and $l$ rather than the other way around, etc.). All this needs to be zero; which, in matrix notation, if you consider that the dot product of a matrix with its transpose must be a symmetric matrix (and so $(\mathbf{A}^T \cdot \mathbf{A})_{ki} = (\mathbf{A}^T \cdot \mathbf{A})_{ik}$), reads
$$\mathbf{A}^T \cdot \mathbf{A} \cdot \mathbf{x} - \mathbf{A}^T \cdot \mathbf{d} = \mathbf{0};$$
and the least-squares formula that I am going to be using follows immediately.
895. “The World’s Deepest Hole Lies Beneath this Rusty Metal Cap—The Kola Superdeep Borehole”, ZME Science, 2014.
896. I’ve read on Wikipedia, July 2022, that “Henry Carvill Lewis (November 16, 1853 – July 21, 1888) was an American geologist and mineralogist. [...] Lewis took interest in investigating paranormal claims. In 1886, he attended séances of the medium William Eglinton and detected him in fraud. The exposure was published in the Proceedings of the Society for Psychical Research in 1887.”
897. Introduction To The Physics Of The Earth’s Interior, Cambridge University Press, 1991.
898. After Karl, or Carl Pearson, a British statistician who invented a whole bunch of statistical tools, like, e.g., this one. He graduated in mathematics from Cambridge, 1879, and then studied a whole lot of different things: philosophy, law, German history, German, etc., and Darwin’s theory of evolution, which at the time was a fairly new thing. He figured he could use his background in maths to study the statistics of evolution (think “genetic algorithms”, and all that), which is great, but then, after he became prof. at University College London, he got involved with his colleague there, Francis Galton. And the thing is, Galton, who became his friend and associate, was a eugenicist, and Pearson totally followed his ideas; Galton, actually, literally invented the very word, eugenics, which if you look it up in a dictionary you’ll see that that means,
essentially, “the study of selective breeding of humans, to increase the occurrence of heritable characteristics regarded as desirable” (The Guardian, June 19, 2020). Read, e.g., what Karl Pearson himself has to say in National Life from the Standpoint of Science, “an address delivered at Newcastle, November 19, 1900”: “Nurture and education may immensely aid the social machine, but they must be repeated generation by generation; they will not in themselves reduce the tendency to the production of bad stock. Conscious or unconscious selection can alone bring that about. “What I have said about bad stock seems to me to hold for the lower races of man. How many centuries, how many thousands of years, have the Kaffir and the Negro held large districts in Africa undisturbed by the white man? Yet their inter-tribal struggles have not yet produced a civilization in the least comparable with the Aryan. Educate and nurture them as you will, I do not believe that you will succeed in modifying the stock. History shows me one way, and one way only, in which a high state of civilization has been produced, namely, the struggle of race with race, and the survival of the physically and mentally fitter race. If you want to know whether the lower races of man can evolve a higher type, I fear the only course is to leave them to fight it out among themselves, and even then the struggle for existence between individual and individual, between tribe and tribe, may not be supported by that physical selection due to a particular climate on which probably so much of the Aryan’s success depended. “If you bring the white man into contact with the black, you too often suspend the very process of natural selection on which the evolution of a higher type depends. You get superior and inferior races living on the same soil, and that co-existence is demoralizing for both.
They naturally sink into the position of master and servant, if not admittedly or covertly into that of slave-owner and slave. Frequently they intercross, and if the bad stock be raised the good is lowered. Even in the case of Eurasians, of whom I have met mentally and physically fine specimens, I have felt how much better they would have been had they been pure Asiatics or pure Europeans.” And so on. In 2020, UCL “renamed two lecture theatres and a building that honoured the prominent eugenicists Francis Galton and Karl Pearson. “The university said on Friday that the Galton lecture theatre had been renamed lecture theatre 115, the Pearson lecture theatre changed to lecture theatre G22 and the Pearson building to the north-west wing” (The Guardian, see above).
899. To be honest, based on the material we covered in this book re vectors and vector algebra, you only have enough elements to prove this for the case when u and v are 2- or 3-component vectors: quite small data sets. If the number of samples is arbitrary, then you’d need some more math than we’ve actually done.
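One last aside before the index: the result of note 894—that the $\mathbf{x}$ solving the “normal equations” $\mathbf{A}^T \cdot \mathbf{A} \cdot \mathbf{x} = \mathbf{A}^T \cdot \mathbf{d}$ minimizes $|\mathbf{A} \cdot \mathbf{x} - \mathbf{d}|^2$—is easy to check numerically. A minimal sketch, with a made-up 4×2 matrix $\mathbf{A}$ and data $\mathbf{d}$ (a straight-line fit, in effect), the 2×2 normal equations solved by Cramer’s rule:

```python
# Least squares via the normal equations, plus a brute-force check
# that the solution really is a minimum of the misfit.
import itertools

A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]   # made-up example
d = [1.1, 1.9, 3.2, 3.8]

ATA = [[sum(A[k][i] * A[k][j] for k in range(4)) for j in range(2)] for i in range(2)]
ATd = [sum(A[k][i] * d[k] for k in range(4)) for i in range(2)]
det = ATA[0][0] * ATA[1][1] - ATA[0][1] * ATA[1][0]
x = [(ATd[0] * ATA[1][1] - ATd[1] * ATA[0][1]) / det,
     (ATA[0][0] * ATd[1] - ATA[1][0] * ATd[0]) / det]

def misfit(x):
    # |A x - d|^2
    return sum((sum(A[k][i] * x[i] for i in range(2)) - d[k])**2 for k in range(4))

print(x, misfit(x))
# no small perturbation of x should do better:
print(all(misfit([x[0] + du, x[1] + dv]) >= misfit(x)
          for du, dv in itertools.product([-0.01, 0.0, 0.01], repeat=2)))
```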
Index
A Achilles, 482–484 Acoustic wave, 164, 189, 218, 274, 572, 573, 579, 777, 814 Adams, Leason, 310, 471 Adiabat, 748 Adiabatic conditions, 320 Adiabatic gradient, 320, 357, 422, 445, 750 Adiabatic process, 170, 320, 666, 668 Agassiz, Alexander, 826 Agassiz, Louis, 551, 827 Age of the earth, 81, 82, 84, 86, 93, 104, 111, 116, 117, 121, 332, 431, 525, 527, 690, 692, 768, 793, 818 Age of the sun, 109, 111, 121, 544 Agnew, Duncan, 233, 742 Airy, George Biddel, 141, 580 Aki, Keiiti, 640 Alchemists, 96, 531 Alchemy, 9, 81, 476 Allègre, Claude, 522, 526, 749, 757 Alps, 72, 84, 123, 124, 128–131, 135, 136, 139, 149, 150, 162, 418, 420, 561, 758 American Geophysical Union (A.G.U.), 705, 738 Ampère, André-Marie, 45 Ampère’s law, 716 Anderson, Don L., 422, 749 Andes, 264, 418 Angular momentum, 35, 45, 513, 514 Anharmonicity (anharmonic), 455, 456 Anticline, 467 Appalachian Mountains (Appalachia, Appalachians), 133, 134
Apparent force, 477, 650, 730, 731, 733, 734 Aragonite, 828 Archimedes, 3, 138, 141, 143, 352, 557, 562, 563, 704, 791, 801 Aristarchus, 3–9, 463 Aristotle, 8, 9, 51–53, 55, 61, 482, 531 Asimov, Isaac, 514 Asthenosphere, 144, 145, 150, 152, 360, 381, 395, 396, 398, 400, 402, 405, 406, 408, 411–414, 417, 419, 422, 444, 740, 746–751, 756, 757, 759, 801 Atom, 328, 332, 371, 455 Atomic weight, 318, 323, 664, 686 Attenuation, 193, 456 Auden, W.H., 392 Auvergne, 64, 65, 72, 73 B Bar magnet, 364, 365, 367, 711, 718 Barrell, Joseph, 144, 802 Basalt, 426 Bateman, Harry, 300, 642 Bathymetry, 376, 377, 379, 386, 389, 400, 402–405, 553, 556, 737, 757 Beachball, 388, 472, 743, 744 Beaumont, see Élie de Beaumont74 Becker, Thorsten, 457, 458 Becquerel, Henri, 327 Belousov, Vladimir, 793 Benioff, Hugo, 361 Bernoulli, Daniel, 593 Bertrand, Marcel, 513 Bessel equation, 342, 698, 699, 701, 779, 819, 820
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 L. Boschi, Our Concept of the Earth, https://doi.org/10.1007/978-3-031-71579-2
842 Bessel, Friedrich Wilhelm, 819 Bessel function, 342–344, 349, 350, 699, 701, 780, 783, 819, 836 Bible (Scriptures), 125, 604 Biot, Jean-Baptiste, 573 Birch, Francis, 316, 317, 471, 661, 662, 830 Birch-Murnaghan equation of state, 764, 832 Bird, John M., 418 Bischof, Carl Gustav Christoph, 516 Blackett, Patrick, 373 Black, Joseph, 98, 99, 518, 530 Blavatsky, Helena, 801 Blob, 458 Body wave, 164, 165, 169, 186, 188–190, 197, 233, 238, 239, 255, 259, 265, 297, 298, 405, 407, 593, 625, 626, 632, 640, 783 Boerhaave, Herman, 98 Bonaparte, Napoleon, 636 Bouguer, Pierre, 13 Boussinesq approximation, 441, 442 Boussinesq, Joseph, 441 Bowditch, Nathaniel, 798 Bowie, William, 358 Boyle, Robert, 96 Breislak, Scipione, 63 Bridgewater Treatises, 703, 820 Bridgman, Percy, 316 Brongniart, Alexandre, 79 Brunhes, Bernard, 382 Brush, Stephen G., 515, 660, 810 Buffon, Georges-Louis Leclerc de, 1 Bulk modulus, 311, 312, 316, 319, 470, 660, 661, 763 Bulk sound velocity (=bulk sound speed), 313, 316, 317 Bullard, Teddy (Edward), 389–395, 408, 744, 785 Buoyancy (buoyant), 352, 353, 355, 367, 411, 412, 414, 419, 458, 459, 561, 562, 704, 758, 760 Burchfield, Joe, 89 Burnet, Thomas, 10 Buttes Chaumont (park), 58
C Calcareous spar, 70, 71 Calcite, 518, 828 Calcium carbonate, 70, 828 Calculus, 165, 314, 482–484, 486, 487, 642, 822 Caley, Earle, 53
Index Caloric, 100, 530, 531 Calx, 518 Carboniferous, 80, 140, 147, 148 Carey, Samuel Warren , 738 Carnot cycle (also: Carnot engine), 669–674 Carnot engine (also: Carnot cycle), 669–674 Carnot, Hyppolyte, 815 Carnot, Lazare, 815 Carnot, Sadi (Nicolas-Léonard Sadi), 665, 669–671, 675, 815–817 Cassini, Giovanni Domenico, 634 Catastrophism, catastrophist, 123 Cauchy, Augustin-Louis, 189, 619, 635 Cavendish, Henry, 24 Central Massif (=Massif Central), 64 Centrifugal force, 12, 45, 308, 477, 512, 569, 650–652, 729, 733, 813 Chalk, 58 Chandler, Seth Carlo, 42, 307, 308, 511, 512 Chapple, William, 413 Chemical reaction, 96, 97, 103, 104, 330, 517, 689 Chimborazo, Mount, 26 Chladni, Ernst, 603 Cholesky factorization, 837 Chondrite, 323, 324, 427, 431, 688, 691 Circular wave, 272, 275, 276, 279 Clapeyron, Émile, 618 Clapeyron slope, 326, 471, 618 Clay, 58, 59, 191, 518, 519, 550 Coal, 59, 64, 65, 80, 148, 413, 467, 827 Coleridge, Samuel Taylor, 521 Colliding resistance, 415, 417 Collimator, 327, 689 Coltice, Nicolas, 790 Complex number, 295, 590, 591, 600, 620, 622, 625, 627, 807–809 Compressibility, 161, 171, 177, 182, 228, 229, 257, 315, 436, 612, 657 Compressional wave (=condensational wave, dilatational wave, P wave, wave of dilatation ), 210, 211, 263, 456, 573, 626, 641, 742 Conductivity (=thermal conductivity), 114, 333, 354, 440, 466 Conic (curve), 797 Constitutive relation (=stress-strain relation, rheological equation), 335, 336, 369, 433, 439, 693 Continental drift, 145, 147, 150, 153, 335, 357, 358, 362, 363, 368, 373–376, 380, 387, 388, 467, 471, 554, 558,
Index 560, 561, 564, 615, 692, 728, 737, 738, 745, 767, 805, 810 Continental root, 752, 757, 828, 829 Continent-continent margin (=continentcontinent collision, continental collision), 418, 419 Continuum, 168, 179, 191, 210, 436, 440, 469, 612, 614, 774, 775 Continuum mechanics, 208, 763 Convection (=convective current, convective flow), 357, 703 Conveyor belt, 384, 413, 428, 471 Copernicus, 8, 9, 55 Coral, 54, 133 Cordier, Louis, 44 Core-Mantle Boundary (=CMB), 307–309, 315, 317, 325, 423, 424, 688, 689, 749, 766, 774 Core (of the earth), 563 Coriolis force, 368, 433, 471, 529, 733–735 Coriolis, Gustave, 529 Correlation, 78, 522, 768, 835 Cosmogony, 45, 46 Coulomb Charles-Augustin, 713 Cox, Allan, 383, 729, 741 Craton (cratonic), 149 Cray C-90, 369 Creer, Kenneth, 826 Cross product, 34–37, 177, 479, 481, 482, 506, 538, 540, 541, 580, 582, 610, 728, 732, 734, 795, 796 Crust (of the earth), 47, 116, 141, 143, 144, 152, 164, 189, 194, 195, 262, 315, 333, 377, 396, 549, 553–555, 690, 692, 694, 706, 828, 830 Crystal, 318, 325, 328, 455 Curie, Pierre, 365 Curie point, 365, 366, 373 Curl, 467 Curvature, 12 Curvature of the earth, 12 Cuvier, Georges, 79 Cylindrical coordinates, 339, 695, 697, 821, 822 Cylindrical wave, 232
D D” (=D double prime), 23 D’Alembert, Jean le Rond, 172 Dalrymple, Brent, 84, 383 Damped least-squares solution, 789 Damper (=dashpot), 434, 435, 456, 606, 607
843 Dana, James Dwight, 131, 134–141, 418, 552–554 Darwin, Charles, 78, 85–91, 140, 157, 521, 525, 526, 549, 794 Darwin, George, 810 Dashpot (=damper), 434, 435, 773, 790 David, Jacques-Louis, 532, 795 Davies, Geoffrey F., 758 Da Vinci, Leonardo, 55 Deccan, 374, 375 Deffeyes, Kenneth, 66 De Geer, Gerard, 694 De La Beche, Henry, 131 Deluge (Flood) (Biblical), 10 De Maillet, Benoît, 516 Denudation, 62 Deparis, Vincent, 44 Depletion (depleted), 752 Derivative, 334 Descartes, Réné, 12, 13, 477, 633 Detrital remanent magnetism, 373 Dewey, John F., 741 Diamond, 13, 830 Dietz, Robert, 379 Difference between “rock” and “mineral”, 828 Diffraction, 211 Dirac, P.A.M., 378 Dirac’s delta (=Dirac’s function), 714 Discontinuity, 124, 192, 259, 260, 263, 287, 632, 639 Dispersion (of waves), 189, 242, 249–253, 407–409, 450, 626, 628, 632 Dissipation, 286, 297, 437, 455, 776 Divergence, 171, 172, 177, 178, 182, 202, 203, 211, 212, 216, 217, 219–223, 336, 379, 386, 436, 438, 440, 467, 469, 470, 564, 570–572, 581, 611– 613, 654, 713–717, 719, 727, 821 Dixon, Jeremiah, 24, 491 Doell, Richard D., 729 Dorman, James, 406 Dot product, 95, 199, 465, 479, 480, 501, 503, 572, 584, 613, 713, 726, 786, 792, 806, 835, 838 Du Toit, Alexander, 148 Dyke (dike), 60 Dziewonski, Adam, 445, 460, 784
E Earthquake, 51, 59, 60, 88, 123–125, 128, 130, 155–164, 189–197, 217, 221,
231–236, 238–242, 254, 257, 262, 263, 274, 325, 335, 359–362, 377, 385, 388, 395, 406, 407, 421, 433, 450, 451, 453, 456, 461, 472, 492, 512, 548, 566–568, 591, 605, 620, 623–625, 629–631, 642, 649, 658, 660, 664, 706, 707, 742, 744, 761, 776, 782, 785 Earth’s rotation, 33, 42, 306–308, 365, 368, 511, 650, 652, 731 East Pacific Rise, 380, 826 Einstein, Albert, 502, 794 Einstein’s notation (or: Einstein’s convention, Einstein’s summation convention), 178, 203, 312, 465, 614, 616 Electromagnetism, 364, 369, 514, 689, 710, 712, 821 Electron, 332 Élie de Beaumont, Jean-Baptiste Armand Louis Léonce, 124 Ellipse, 796 Ellipsoid, 38 Elsasser, Walter, 366 Energy budget (=heat budget), 378, 403 Enlightenment, 1, 8, 10, 11, 55, 84, 85, 531, 792 Enriched, 426–428, 430, 459, 460 Epicenter, 155, 156, 159, 162–164, 191, 194, 232, 235, 237, 249, 258, 262, 263, 305, 308, 309, 360, 361, 377, 378, 386, 406, 566, 567, 606, 625, 644, 742, 744, 783 Epicentral distance, 163, 235, 236, 258, 299, 300, 302, 304, 305, 309, 361, 407 Equation of state, 441, 762–764, 831–833 Eratosthenes, 2, 3, 6, 463, 477 Erosion, 331 Error function (=erf), 398 Escape velocity, 107, 537 Euclid, 265 Eugenicist, 838, 839 Euler, Leonhard, 30 Euler’s equation, 168 Everest, George, 467, 526 Everest, Mount (Mount Everest), 526 Everest, Robert, 526 Ewing, John, 380 Ewing, Maurice, 377, 379, 380, 406, 736–738, 746, 829 Expansivity (=volumetric coefficient of thermal expansion), 318 Extrusive rock, 74
F Factorial, 819 Fahrenheit, Daniel Gabriel, 98 Farallon (plate), 761 Fault, 134, 155, 361, 387, 688 Fault, normal, 386, 472, 744 Fault, strike-slip, 385–387 Fault, thrust, 150, 386, 472, 744 Fault, transcurrent, 387, 388 Fault, transform, 385, 387, 388, 790 Fermat, Pierre de, 268, 633 Ferromagnetism (ferromagnetic material), 371 Feynman, Richard, 814 Finite differences (=finite-difference method), 473 Finite strain, 763, 831–833 Finn, Bernard (Barney), 573 Fleeming Jenkin, Charles, 91 Flood (Deluge) (biblical), 81 Focal mechanism, 395, 472, 743 Fold (folding), 127, 129, 131, 134, 136, 140, 149 Forced oscillation, 810 Forsyth, Donald W. (Don), 413 Fossil, 52, 55, 62, 78, 80, 133, 140, 145–147, 526, 550, 556, 559, 560, 694 Foster, James T., 550 Foucault, Michel, 10, 54 Fourier, Joseph, 98, 111, 528, 584 Fourier series, 584, 590, 591, 599, 600, 809, 810, 823 Fourier transform, 183, 221, 243, 579, 592, 593, 607, 621, 622, 627, 639, 640, 781 Fracture zone, 386–388, 395, 744 Frankel, Henry, 153 Franklin, Benjamin, 532 Free oscillation (=normal mode), 663, 777, 783, 810 French revolution, 65, 514, 531, 533, 815, 822 Fresnel, Augustin, 192, 273, 469 Friction, 44, 92, 101, 102, 109, 150, 157, 158, 563, 564, 603, 669, 773, 820 Frobenius, Georg Ferdinand, 819 Frobenius’ method, 699 Frozen-flux approximation, 369 Fukao, Yoshio, 784
G Galton, Francis, 838, 839
Gamma function, 819 Ganges (river), 90, 91 Gauss, Carl Friedrich, 366, 836 Gaussian elimination, 472, 785–787, 836, 837 Gay-Lussac, Joseph, 602 Geber, 531 Generalized inverse, 451 Genesis, Book of, 55 Geochemical model, 426, 429 Geochemistry (geochemist), 421, 427, 457, 459, 687, 766–768, 818, 829 Geodynamics (geodynamicist), 208, 240, 404, 405, 421, 433, 457, 473, 638, 774, 829 Geodynamo, 367, 370, 374, 382, 461, 473 Geopoetry, 739, 827 Geosyncline (geosynclinal), 139, 140 Gibbs, J. Willard, 501 Gilbert, William, 364, 373, 708 Glaciation, 337, 338, 694 Glatzmaier, Gary, 367 Glossopteris, 145–147, 467, 556, 559 Godin, Louis, 13 Gondwana (Gondwanaland), 140, 146 Gould, Stephen Jay, 81 Gradient, 44, 114, 180, 211, 320, 322, 332, 333, 400, 422, 424, 570–572, 584, 619, 653, 654, 750, 762, 774, 789, 821 Grand, Steve, 784 Granite, 62, 331, 334 Gravitation, Newton’s law of, 23, 435, 537 Gravity, 58 Gravity wave, 186, 623 Greene, Mott T., 124 Gregory, Herbert E., 802 Griggs, David, 359 Grimaldi, Francesco Maria, 634 Group velocity, 251, 254, 627 Guettard, Jean-Étienne, 64 Guitar (guitarist), 575, 578, 579, 593, 595, 626, 663, 769, 773, 778, 781, 810 Gutenberg, Beno, 305, 632, 642
H Hadrian’s Wall, 82 Hager, Bradford H. (Brad), 445, 769 Half-life, half life (of an isotope), 430, 690, 691 Half-Space Cooling Model (HSCM), 399, 402–404, 408, 410
Hallam, Anthony, 61, 153, 558, 560 Hall, James (American geologist), 132 Hall, James (Scottish physicist), 70 Haskell, Norman, 658 Hassenfratz, Jean-Henri, 602 Hawaii, 406, 427, 428, 805 Hayford, John, 358 Head waves, 298, 299, 632, 641, 736 Heat, 65, 70, 71, 73, 83, 87, 92, 93, 96, 98, 99, 102–104, 109, 111, 114, 152, 333, 356, 366, 405, 431, 437, 467, 523–525, 528, 530, 531, 534, 545, 660, 666, 670, 677, 682, 755, 762, 809 Heat budget (=energy budget), 431, 460 Heat capacity (=capacity for heat), 99–101, 103, 320, 354, 422, 441, 467, 544, 573, 667, 681, 775 Heat equation (AKA heat conduction equation), 115, 472, 473 Heat flow, 116, 332–334, 404, 405, 669 Heat flux, 114, 397, 403–405, 431, 439, 440, 753, 756, 757, 774 Heaviside, Oliver, 820 Heezen, Bruce, 377, 737 Heim, Albert, 150, 561 Helmholtz free energy, 832 Helmholtz, Hermann von, 92 Heraclitus, 530, 531 Herglotz, Gustav, 300, 641, 642 Herschel, Caroline, 515 Herschel, William, 45, 514, 792 Herzog, Werner, 13 Hess, Harry, 379, 380, 471, 738, 829 Hill, John, 54 Himalayas, 141, 418 Hoffman, Paul, 765 Hofmann, Albrecht W., 426 Holmes, Arthur, 153, 332, 357, 374, 431, 527, 740 Hooke, Robert, 202 Hooke’s law, 202 Hopkins, William, 43 Hotspot, 767, 791, 805 Huguenot, 522 Humboldt, Alexander von, 556 Hutton, Charles, 25, 142 Hutton, James, 98 Huygens, Christiaan, 265 Huygens’ principle, 269, 270 Hyperbola, 541, 796, 797 Hyperbolic cosine (cosh), 185, 186, 303, 626 Hyperbolic sine (sinh), 185, 186 Hyperbolic tangent (tanh), 189
Hypocenter, 221, 453, 454, 631, 742 Hysteresis, 371
I Ice age, 337, 694 Ideal gas, 665–667, 671, 676, 680, 762 Identity matrix, 209, 441, 502, 510, 616, 788, 792 Igneous rock, 66, 68, 74, 328, 329, 331, 333, 373, 391, 518, 521, 553, 686, 755 Imaginary number, 295, 600, 625, 626, 640, 807 Imaginary unit, 219, 295, 590, 807, 823 Incompressibility, 178, 182, 319, 320, 336, 441, 661, 763, 834 Inertia tensor, 37, 40, 41, 43, 94, 307, 416, 466 Inner-Core Boundary (ICB), 309, 424 Integral, 17, 19, 35, 36, 38–40, 95, 96, 118, 119, 214, 279, 281, 304, 314, 417, 485, 493, 508, 539, 546, 575, 580–582, 587, 596, 637, 717, 787 Integration by parts, 507, 508 Internal energy, 125, 205, 439, 440, 666, 667, 670, 673, 675, 682, 775 Intrusion, 74, 81, 124, 131, 194, 525, 550 Intrusive rock, 74 Inversion, 632 Iron, 43, 83, 190, 191, 315, 324, 371, 518, 691, 715 Irreversible process, 676–679 Irrotational (=curl-free), 177–179, 719 Irving, Edward (Ted), 735, 826 Isacks, Bryan, 396, 745 Island arc, 418, 419, 760 Isochron, 331 Isostasy (also isostacy), 123, 141, 143, 144, 150, 358, 381, 400, 401, 467, 490, 553–558, 580, 705, 740, 747, 751, 791, 801 Isothermal, 319, 320, 441, 573, 666, 671, 673, 675, 678, 764, 765, 834 Isotope, 329 Isotropic tensor, 209
J Jamieson, Thomas, 694 Jardin des Plantes, 327, 522 Jeanloz, Raymond, 766 Jeffreys, Bertha, see Swirles Jeffreys, Bertha
Jeffreys, Harold, 299, 306–308, 310, 335, 615, 640, 641, 659, 810 Joly, John, 527 Jordan, Thomas H. (Tom), 750 Joule, James, 100 K Karato, Shun-Ichiro, 626, 749 Katsura, Tomoo, 832 Kausel, Edgar G., 408, 410 Kellogg, Louise, 431, 461 Kelvin-Voigt model, 790 Kelvin (William Thomson), 92 Kepler, 8, 9, 55, 536 Keynes, John Maynard, 9, 476 Kimberlite, 830 Kinetic energy, 93, 96, 105, 106, 108, 109, 179, 205, 232, 269, 328, 332, 439, 466, 530, 537, 601, 614, 638 Kirwan, Richard, 74, 75 Knopoff, Leon, 408–410 Knott, Cargill Gilston, 637 Koestler, Arthur, 8, 52 Kola borehole, 829 Kronecker’s delta (also: Kronecker’s symbol), 209, 618 L La Condamine, Charles Marie de, 20, 21 Lamé’s parameters, 210, 469, 618 Lamé, Gabriel, 618 Lamont (Lamont-Doherty Earth Observatory), 377, 379, 385, 393, 736, 737 Laplace, Pierre-Simon, 45 Large igneous province, 374 Large low-shear-velocity province (megaplume, superplume), 460 Lattice structure, 325, 688 Laurasia, 146 Lava, 44, 47, 55, 64, 65, 71–74, 77, 84, 126, 137, 142, 143, 195, 373, 382, 426, 606 Lavoisier, Antoine, 96–98, 100, 518, 522, 532, 533, 703, 795 Lavoisier, Marie, see Paulze Lavoisier, Marie-Anne Pierrette Law of cosines, 18, 19, 489 Law of sines, 477, 645 Law of superposition, 56 Laws of stratigraphy (=stratigraphy, laws of/principles of), 56 Layered convection, 459
Lead (Pb), 328 Lead pollution, 818 Least squares, 789 Leeds, Alan R., 409 Legendre, Adrien Marie, 822 Legendre function, 722, 779, 781 Legendre polynomial, 721, 822 Lehmann, Inge, 308 Leibniz, Gottfried Wilhelm, 14 Leith, Andrew, 360 Lemur, 800, 801 Lemuria, 800, 801 Le Pichon, Xavier, 385, 395 Levi-Civita tensor, 610, 826 Levi-Civita, Tullio, 796 Lewis, Henry Carvill, 838 Limestone, 58, 62, 70, 71, 77, 80, 81, 191 Liouville, Joseph, 620 Liquidus, 747, 748 Lisbon earthquake, 159, 162, 163 Lithosphere, 145, 379, 381, 388, 395, 396, 398, 399, 401, 404, 405, 408, 421, 740, 750, 755, 757, 758, 828 Lodestone (=magnet), 364, 710 Love wave, 190, 234, 242, 248, 249, 254, 407 Lower mantle, 325, 423, 428–430, 444, 445, 454, 455, 458–460, 688, 766, 767 Lowrie, William (Bill), 402 Low-velocity zone (LVZ), 406 LU factorization, 837 Lyell, Charles, 50–52, 56, 74, 75, 86–89, 91, 93, 123–125, 128, 130, 136, 156, 157, 525, 545, 548–550, 561 Lyngourion, 55 M Matthews, Drummond, 384 MacCullagh, James, 512 Mach, Ernst, 265 Magnet (=lodestone), 364 Magnetic domain, 372, 373 Magnetic field, 712 Magnetic field of the earth, 368, 712 Magnetite, 710 Magnetization, 370, 373, 382, 384, 385, 820 Mallet, Robert, 156–159, 161, 162, 164, 165, 189–197, 234, 235, 567–569, 601, 606 Mantle, 305, 324, 326, 357, 358, 380, 404, 423, 424, 428, 430, 444, 454, 459, 467, 688, 705, 749, 752, 755, 758, 774
Mantle drag force, 414 Mantle plume (=plume), 428–431, 458, 460, 766, 767, 769, 790 Marat, Jean-Paul, 533 Margin (=plate margin, plate boundary), 386, 419 Margin, convergent/converging (=boundary, convergent/converging), 412, 419, 744 Margin, divergent/diverging (=boundary, divergent/diverging), 386 Marsh, Othniel Charles, 52 Maskelyne, Nevil, 24 Mason, Charles, 24, 25, 491 Mason, Ronald, 381, 384 Massif Central (Central Massif), 64, 323 Mass spectrometer, 689 Matuyama, Motonari, 382 Maxwell, James Clerk, 433, 434, 708 Maxwell material, 434, 435 Maxwell’s equations, 364, 368, 708, 715, 716, 719 Mayer, Julius Robert (von), 100 Mazzini, Giuseppe, 805, 837 McDonough, William, 756 McKenzie, Dan, 395 McPhee, John, 49, 524 Megaplume (=large low-shear-velocity province, superplume), 460 Mercanton, Paul-Louis, 382 Mercury (element), 99, 102, 103, 111, 189, 369 Metamorphic rock, 66, 74, 521 Metamorphism, 74, 134, 521 Meteorite, 74, 105, 315, 324, 333, 427, 467, 688, 690–692 Michell, John, 26 Mid-ocean ridge (=mid-oceanic ridge), 377–379, 385, 386, 397, 399, 411, 414, 454, 458, 769, 829 Mid-Ocean Ridge Basalt (=MORB), 427–432, 458–460, 768, 769 Millefeuille (pastry), 132 Milne, John, 234, 637, 641 Modulus of elasticity (=Young’s modulus), 190, 192 Moho, 261, 263, 264, 299, 307, 325, 381, 396, 470, 632, 661, 688, 749, 751, 754, 755, 829 Mohorovičić, Andrija, 263 Monte Nuovo, 64 MORB (=mid-ocean ridge basalt), 427–429, 767
Morgan, Jason (W. J.), 392 Mountain building (orogenesis), 357, 359, 418, 548 Musical note, 810 N Nabla, 570 Napoleon, see Bonaparte, Napoleon Nappe de charriage (or simply: nappe), 561 Navier–Cauchy equation, 336, 433 Navier–Stokes equation, 369, 433, 441, 656, 735 Nebula, 45, 514, 515, 687, 792 Neodymium (Nd), 430, 768 Neptunism (neptunist), 57 Neutron, 328, 371, 689, 690 Newcomb, Sally, 518 Newton, Isaac, 9, 476, 488 Newton’s law of cooling, 83 Newton’s law of gravitation, 23, 435, 537 Newton’s laws, 13, 14, 28, 33, 35, 41, 94, 105, 435, 436, 476, 508, 724, 733 Niobium (Nb), 428 Normal fault, see fault, normal Normal mode (=free oscillation), 777 Nuclear test ban treaty, 693 Nucleus, 127, 328, 371, 689 Numerical (solution of differential equations), 765 Nutation, 42, 659, 792 O Obduction (obducted), 758 Occam (or Ockham), William of, 523 Occam’s razor, 81 Ocean-Island Basalt (=OIB), 427 O’Connell, Richard, 756 Ohm’s law, 366–368, 710, 725 Oldham, Richard Dixon, 234–242, 258–263, 302, 305, 306, 623–626, 629–631, 649, 692 Oliver, Frederick Spencer, 801 Olivine, 423 Olson, Peter, 367 O’Neill, Hugh, 687 Ophiolite, 686, 758 Optics, 190, 192, 237, 261, 262, 264, 265, 270, 528, 601 Ordinary Differential Equation (ODE), 701 Oreskes, Naomi, 140, 358, 736, 741 Origin of Continents and Oceans, the (Die Entstehung der Kontinente und
Ozeane, Die Entstehung) (Wegener’s book), 145, 553, 556 Origin of Species, the (Darwin’s book), 78, 85 Orogenesis (mountain building), 149 Ørsted, Hans-Christian, 710 Ovid, 53 P Paleobotany, 145 Paleoclimate (palaeoclimate), 147 Paleomagnetism (palaeomagnetism), 373, 376; see also paleomagnetic/palaeomagnetic Paleontology (palaeontology), 132, 133, 147, 516, 552 Palme, Herbert, 687 Pangaea (Pangea), 145–147, 561 Parabola, 796, 797 Parallelogram law, 478–480 Parameterization, 452, 784 Parker, Bob, 395 Parsons, Barry, 405 Partial Differential Equation (PDE), 339, 341, 468, 735 Pascal, Blaise, 705 Patterson, Clair, 690, 818 Paulze Lavoisier, Marie-Anne Pierrette, 532, 795 Pearson, Karl (or Carl), 838, 839 Peridotite, 323, 325, 334, 411, 686, 748, 756, 758, 759 Permeability, 712, 825 Permittivity, 712, 825 Perovski, Lev, 688 Perovskite, 324, 766 Phase transition, 326, 423, 458, 471, 688 Phase velocity, 452, 627 Phillips, John, 89 Phlogiston, 96–98, 530, 531 Pink Floyd (rock band), 535 Pixies, The (rock band), 801 Plates, 327, 357, 386, 392, 408, 410, 413, 415, 443, 520, 752, 753, 761, 805 Plate tectonics, 131, 357, 358, 368, 385, 388, 389, 392, 395, 413, 414, 417, 418, 420, 426, 457, 459, 461, 467, 471, 552, 554, 660, 736, 738, 740, 741, 746, 758, 793 Plato, 8, 9, 531 Playfair, John, 75, 521 Pliny (=Pliny the Elder), 64, 516 Plumb line, 20–26, 144, 729, 730, 740
Plume (mantle plume), 428–431, 458, 460, 766, 767, 769, 790 Plutonism (plutonist), 65, 74 Poisson, Siméon, 620 Polar wander, 374 Postglacial rebound, 337, 694 Post-perovskite, 324–326, 689 Pouillet, Claude, 104 Power spectrum, 781 Pratt, John Henry, 141 Precession, 41, 42, 44, 307, 512, 563, 659 Press, Frank, 707 Priestley, Joseph, 97, 518 Primitive mantle, 427, 428, 430, 431, 443, 459, 460, 768 Principia Mathematica, or: Principia (Newton’s book), 9, 82, 579, 798 Principle of conservation of energy (also: conservation of energy, energy conservation, principle of energy conservation), 100 Principle of correlation of parts, 522 Principle of cross-cutting relationships, 57 Principle of geological correlation (also: geological correlation), 79 Principle of lateral continuity, 56 Principle of original horizontality, 56 Principles of Geology (Lyell’s book), 50 Prograde motion, 580 Proton, 328, 371, 690 Ptolemy, 8, 266 Pumice, 54 Puy de Dôme, 382 Pyrheliometer, 104, 536 Pythagoras, 53, 55, 282
Q Quasi-static approximation, 826
R Radioactive decay, 328–330, 333, 356, 431, 470, 563, 689, 690, 753 Radiocarbon, 690 Radiogenic (isotope), 330 Radioisotopic closure, 689 Radiometric dating, 328, 331, 690, 741, 789 Radon, 690 Raff, Arthur, 381 Raitt, Russell, 740 Ranalli, Giorgio, 693 Rayleigh (John William Strutt), 692
Rayleigh number, 352, 353, 355, 356, 366, 398, 412 Rayleigh wave, 190, 197, 218, 225, 233, 234, 240, 242, 248, 249, 252, 255, 261, 265, 287, 296, 297, 407, 409, 452, 469, 568, 626, 627, 632, 639, 640, 746 Ray path (also: wave path), 284, 300, 304, 308, 407, 445–449, 648 Ray (seismic ray), 407, 446 Rebeur-Paschwitz, Ernst, 625 Reflection (reflected wave), 192, 289 Refraction (refracted wave), 285, 286, 297, 632 Regularization (regularized), 787, 788 Remanent magnetization, 375, 384 Reservoir, 333, 431, 459, 460, 473, 670–673, 676, 835 Resolution, 451, 766, 783, 784 Retrograde motion, 231, 296, 297 Reversal (of the earth’s magnetic field; also: polarity reversal, geomagnetic reversal), 370, 382, 393 Reversible engine, 672, 676, 677 Reversible process, 669, 670, 679 Reykjanes ridge, 385 Rheology, 335, 432, 433, 435, 442, 443, 776 Ricard, Yanick, 775 Richards, John, 53 Richards, Paul, 75 Richet, Pascal, 64 Richter, Charles, 830 Ridge push, 414, 415, 417, 760 Rift, 148, 377, 380, 454, 737, 761 Rigidity, 210, 244, 248, 257, 306, 311, 512, 558, 656, 659, 692 Roberts, Paul, 369 Rocky Mountains (Rockies), 136, 194 Rogers, Henry Darwin, 549 Rogers, William Barton, 549 Rømer, Ole, 634, 635 Rotation matrix, 615–617, 810 Roulin, François-Désiré, 45 Rubidium (Rb), 428 Rudnick, Roberta, 829 Rudzki, Maurycy Pius, 624 Rumford, Benjamin Thompson, Count of, 351 Runcorn, (Stanley) Keith, 373, 735 Rutherford, Ernest, 332 S Sabine, Edward, 525
Salisbury Crags, 71 Samarium (Sm), 430 Sandstone, 71, 77, 78, 80, 191 Schiehallion, Mount, 26, 142, 827 Schist, 62, 521 Schubert, Gerald (Gerry), 405 Sclater, John, 405 Sclater, Philip, 555 Scott, Robert, 145 Scripps (Scripps Institution of Oceanography), 736, 827 Scriptures (Bible), 82 Seafloor spreading (spreading of the ocean floor, ocean-floor spreading), 380, 387–389, 395, 399, 738, 740, 827 Sediment, 56, 57, 59, 61, 63, 67, 69, 71, 77, 81, 84, 85, 90, 91, 123, 133–137, 139, 140, 144, 149, 323, 331, 373, 377, 380, 419, 420, 517, 522, 524, 527, 554, 793 Sedimentary rock, 57, 66–68, 73, 74, 84, 85, 124, 147, 148, 331, 377, 380, 518, 521 Sedimentation, 57, 60, 62, 64, 66, 92, 123, 124, 138, 380, 525 Segregation, 411, 431 Seismic station, 405, 649, 663, 811 Seismograph (or seismometer), 240, 257, 310, 377, 388, 623, 624 Seismology, 155, 156, 163, 164, 193, 197, 202, 208, 235, 241, 257, 260, 261, 263, 284, 287, 307, 317, 321–326, 356, 380, 396, 421, 424, 433, 445, 446, 455, 457, 461, 473, 565, 566, 604, 615, 623, 626, 629, 640, 657, 659, 688, 742, 745, 762, 766, 776, 782–785, 787, 813, 830 Seismometer (or seismograph), 155, 223, 257, 309, 360, 450, 451, 606, 607, 637, 640, 641 Separation of variables, 340, 341, 344, 594, 719, 777, 809, 810 Serpentinization, 758 Shadow zone, 308, 309, 406, 407, 736 Sharpe, J.A., 360 Shear wave (distortional wave, S wave), 144, 197, 211, 231, 234, 262, 263, 306, 620, 626 Shell theorem, 435, 465, 488, 489, 652, 655, 661 Shield, 752, 756 Sial, 138, 150, 335, 562, 692 Siever, Raymond, 365
Silicate rock, 324 Sima, 138, 150, 152, 335, 379, 380, 562–564, 692 Slab pull, 415, 417, 761 Slab resistance, 415 Small deformation tensor, 206 Smith, William, 522 Snell’s law, 261, 266–270, 300, 308, 631, 633, 639, 645, 646 Snellius, Willebrord (also: Snell), 267, 477 Snider-Pellegrini, Antonio, 559 Solar constant, 104, 467 Solidus, 747, 748 Sonar, 376, 736 Sorbonne, 104, 441, 776, 792 SOund Fixing and Ranging (SOFAR), 736 SOund SUrveillance System (SOSUS), 736 Sparse matrix, 784, 785 Spectrum analysis, 110, 686, 687 Spherical coordinates, 212, 463, 714, 719, 777, 814 Spherical harmonics, 779, 781, 784, 789, 822–824, 837 Spherical wave, 176, 217, 232, 272 Spinel, 423, 471 Spring, 52, 91, 202, 269, 434, 455, 606, 607, 790 Stahl, Georg Ernst, 96 Steam engine, 92, 670, 703 Steno, Nicolas (Niels Steensen), 56, 516 Stokes’ theorem, 180, 583, 584, 728 Stoneley, Robert, 297, 641 Stoneley waves, 296–298 Stracke, Andreas, 768 Strain tensor, see small deformation tensor Stratigraphy, 71, 78–80, 160, 517 Stratigraphy, principles of (laws of), 56 Stress tensor, 197, 198, 289, 311, 335, 438, 441, 468, 612, 777 Strike-slip fault, see fault, strike-slip String, 28 Strontium (Sr), 331, 690, 768 Strutt, John William (or: Rayleigh), 692 Strutt, Robert John, 333, 692 Subduction (subducted/subducting plate, subducted/subducting slab), 769 Suction force, 760 Suess, Eduard, 138, 140 Sun, 2–4, 6, 8, 11, 13, 92, 93, 104, 105, 107–110, 315, 324, 327, 351, 463, 467, 475, 508, 528, 536, 537, 563, 635, 656, 687, 688, 707, 819
Superplume (large low-shear-velocity province, megaplume), 460 Surface wave, 165, 186, 188, 190, 231–233, 237, 238, 240, 241, 243, 247–250, 253, 255, 405, 407, 591, 625, 626 Swirles Jeffreys, Bertha, 615, 640 Sykes, Lynn R., 387, 388 Syncline, 467
T Tackley, Paul, 457 Tapis roulant, 379 Tectonics, 131, 357, 358, 368, 385, 388, 389, 395, 396, 413, 414, 417, 418, 420, 426, 454, 457, 459, 461, 467, 471, 552, 554, 615, 660, 736, 738, 741, 745, 746, 758, 793, 805, 827, 837 Tectosphere, 750, 752, 753, 757, 828 Telliamed, 57, 517 Temperature (definition of), 111 Tertiary, 149, 391 Tertullian, 55 Tethys, 146, 454, 784 Tharp, Marie, 377, 737, 738 Theophrastus, 53–55 Thermal Boundary Layer (TBL), 425, 432, 444, 445, 769 Thermal conductivity (=conductivity), 114, 333, 354, 440, 466 Thermal diffusivity, 117, 354, 398 Thermochemical (thermochemical convection, thermochemical pile), 443 Thermoremanent magnetization, 373 Thomson, William (Kelvin), 92, 515, 815 Thorium, 428, 527, 690 Thrust fault, see fault, thrust Tohoku earthquake, 254 Tomography (tomography model, tomo model), 257, 421, 445, 451, 453, 456, 458, 460, 461, 766, 776, 783, 784, 787, 790, 837 Topography, 142–144, 151, 359, 381, 400, 460, 461, 491, 521, 553, 556, 558, 698, 740, 791, 801 Torque, 29, 31, 35, 42, 417, 492 Torsion pendulum, 493, 524, 820 Transcurrent fault, see fault, transcurrent Transformation of coordinates, 36, 38 Transition zone, 318, 321–326, 421, 422, 444, 471, 688, 747, 750, 776 Transform fault, see fault, transform Transform fault resistance, 414
Tsunami, 125, 163–165, 169, 189, 190, 231, 567–569, 601 Tullis, Terry, 413 Turcotte, Donald L. (Don), 405
U Uniformitarian, 90, 123, 430, 460, 526, 527 Uniformitarianism, 92, 375, 418, 466, 526 Upper mantle, 406–408, 410, 422, 423, 426, 428, 429, 432, 444, 454, 632, 746, 750, 760, 776 Uranium (U), 327, 328, 527, 690, 691, 741 Ussher, James, 81 Uyeda, Seiya, 413
V Vacquier, Victor, 381 Van der Hilst, Robert D. (Rob), 431, 784 Van Waterschoot van der Gracht, Willem, 153 Vector sum, 22, 478, 479 VELA-UNIFORM, 742 Vening Meinesz, Felix, 358 Verhoogen, John, 741 Vicentini, Giuseppe, 811 Vine, Frederick J., 384 Viscoelasticity (viscoelastic), 435, 456, 610, 790 Viscosity, 44, 137, 144, 150, 169, 197, 225, 334–338, 351, 353, 355, 356, 369, 379, 381, 395, 396, 412, 434, 435, 443, 444, 455, 456, 458, 459, 471, 569, 606, 610, 704, 705, 740, 749, 773, 776, 790, 828 Visser, Simon Willem, 359 Vis viva, 93, 95, 103, 105 Volcano (also: volcanism), 21, 64, 419 Volumetric coefficient of thermal expansion (=expansivity), 318, 352, 400, 422, 441, 682, 751 Volvic, 64 Vose, George L., 75, 77, 517
W Wadati-Benioff zone, 361, 408 Wadati, Kiyoo, 362, 706 Water calorimeter, 534 Watership Down, 525 Wave equation, 172, 212, 214, 264, 298, 342, 468, 469, 573, 575, 576, 593, 605, 626, 636, 698, 769, 771, 777, 780
Wave length, 183, 227, 229, 535, 628 Wavenumber, 183 Weald, 86 Wedgwood, Josiah, 518, 519, 521 Wedgwood scale, 518 Wedgwood, Thomas, 520 Wegener, Alfred, 145, 153, 548, 553, 556, 802 Wegener, Kurt, 802 Werner, Gottlob, 61 Wertheim, Guillaume, 234 Whiston, William, 9–11, 476 Whole-mantle convection, 444, 458, 459 Wiechert, Emil, 43, 641 Williams, Henry Smith, 100 Williamson-Adams method, 310, 317, 471, 836 Williamson, Erskine, 310, 471, 660 Wilson, J. Tuzo, 378 Woodhouse, John, 445, 784 Woods Hole (Woods Hole Oceanographic Institution), 736 Woodward, John, 10
World-Wide Standardized Seismograph Network (WWSSN), 388 WWII, 706, 707, 736 Wyse Jackson, Patrick, 527
X Xenolith, 686, 750, 751, 753, 755–757, 830
Y Young’s modulus (same as: modulus of elasticity), 190, 192, 434, 604, 660 Young, Thomas, 93, 94, 274–276, 278, 281, 604, 605, 636
Z Zeno, 482 Zenophanes, 52, 55 Zeno’s paradoxes, 482 Zircon, 328 Zöppritz, Karl, 649