Some Problems of Unlikely Intersections in Arithmetic and Geometry
 1400842719, 9781400842711


Annals of Mathematics Studies Number 181


Some Problems of Unlikely Intersections in Arithmetic and Geometry

Umberto Zannier with Appendixes by David Masser

PRINCETON UNIVERSITY PRESS PRINCETON AND OXFORD 2012

Copyright © 2012 by Princeton University Press
Published by Princeton University Press, 41 William Street, Princeton, New Jersey 08540
In the United Kingdom: Princeton University Press, 6 Oxford Street, Woodstock, Oxfordshire OX20 1TW
press.princeton.edu
All Rights Reserved

Library of Congress Cataloging-in-Publication Data
Zannier, U. (Umberto), 1957–
Some problems of unlikely intersections in arithmetic and geometry / Umberto Zannier ; with appendixes by David Masser.
p. cm. – (Annals of mathematics studies ; no. 181)
Includes bibliographical references and index.
ISBN 978-0-691-15370-4 (hardcover : acid-free paper) – ISBN 978-0-691-15371-1 (pbk. : acid-free paper)
1. Intersection theory. 2. Algebraic varieties. 3. Algebraic geometry. I. Masser, David William, 1948- II. Title.
QA564.Z36 2012 516.3′5–dc23 2011037619

British Library Cataloging-in-Publication Data is available
This book has been composed in LaTeX.
The publisher would like to acknowledge the author of this volume for providing the camera-ready copy from which this book was printed.
Printed on acid-free paper ∞
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1

Contents

Preface
Notation and Conventions
Introduction: An Overview of Some Problems of Unlikely Intersections

1 Unlikely Intersections in Multiplicative Groups and the Zilber Conjecture
  1.1 Torsion points on subvarieties of $\mathbb{G}_m^n$
  1.2 Higher multiplicative rank
  1.3 Remarks on Theorem 1.3 and its developments
    1.3.1 Fields other than $\overline{\mathbb{Q}}$
    1.3.2 Weakened assumptions
    1.3.3 Unlikely intersections of positive dimension and height bounds
    1.3.4 Unlikely intersections of positive dimension and Zilber’s conjecture
    1.3.5 Unlikely intersections and reducibility of lacunary polynomials (Schinzel’s conjecture)
    1.3.6 Zhang’s notion of dependence
    1.3.7 Abelian varieties (and other algebraic groups)
    1.3.8 Uniformity of bounds
  Notes to Chapter 1
    Sparseness of multiplicatively dependent points
    Other unlikely intersections
    A generalization of Theorem 1.3
    An application of the methods to zeros of linear recurrences
    Comments on the Methods

2 An Arithmetical Analogue
  2.1 Some unlikely intersections in number fields
  2.2 Some applications of Theorem 2.1
  2.3 An analogue of Theorem 2.1 for function fields
  2.4 Some applications of Theorem 2.2
  2.5 A proof of Theorem 2.2
  Notes to Chapter 2
    Simplifying the proof of Theorem 1.3
    Rational points on curves over $\mathbb{F}_p$
    Unlikely Intersections and Holomorphic GCD in Nevanlinna Theory

3 Unlikely Intersections in Elliptic Surfaces and Problems of Masser
  3.1 A method for the Manin-Mumford conjecture
  3.2 Masser’s questions on elliptic pencils
  3.3 A finiteness proof
  3.4 Related problems, conjectures, and developments
    3.4.1 Pink’s and related conjectures
    3.4.2 Extending Theorem 3.3 from $\overline{\mathbb{Q}}$ to $\mathbb{C}$
    3.4.3 Effectivity
    3.4.4 Extending Theorem 3.3 to arbitrary pairs of points on families of elliptic curves
    3.4.5 Simple abelian surfaces and Pell’s equations over function fields
    3.4.6 Further extensions and analogues
    3.4.7 Dynamical analogues
  Notes to Chapter 3
    Torsion values for a single point: other arguments
    A variation on the Manin-Mumford conjecture
    Comments on the Methods

4 About the André-Oort Conjecture
  4.1 Generalities about the André-Oort Conjecture
  4.2 Modular curves and complex multiplication
  4.3 The theorem of André
    4.3.1 An effective variation
  4.4 Pila’s proof of André’s theorem
  4.5 Shimura varieties
  Notes to Chapter 4
    Remarks on Edixhoven’s approach to André’s theorem
    Some unlikely intersections beyond André-Oort
    Definability and o-minimal structures

Appendix A Distribution of Rational Points on Subanalytic Surfaces, by Umberto Zannier
Appendix B Uniformity in Unlikely Intersections: An Example for Lines in Three Dimensions, by David Masser
Appendix C Silverman’s Bounded Height Theorem for Elliptic Curves: A Direct Proof, by David Masser
Appendix D Lower Bounds for Degrees of Torsion Points: The Transcendence Approach, by David Masser
Appendix E A Transcendence Measure for a Quotient of Periods, by David Masser
Appendix F Counting Rational Points on Analytic Curves: A Transcendence Approach, by David Masser
Appendix G Mixed Problems: Another Approach, by David Masser

Bibliography
Index


Preface

The present monograph arose from the Hermann Weyl Lectures, which I had the honor and pleasure to deliver during May 2010 at the Institute for Advanced Study in Princeton. The series in question consisted of four lectures, entitled, respectively:

1. “An Overview of Some Problems of Unlikely Intersections”
2. “Unlikely Intersections in Multiplicative Groups and the Zilber Conjecture”
3. “Unlikely Intersections in Elliptic Surfaces and Problems of Masser”
4. “About the André-Oort Conjecture”

The denomination “unlikely intersections” roughly speaking refers to varieties which we do not expect to intersect, due to natural dimensional reasons: for instance, if X, Y are varieties of dimensions r, s in a space of dimension n > r + s, we usually expect X ∩ Y to be empty. Moreover, if this emptiness does not occur for whole families of varieties which arise for independent reasons, we should expect some (simple) structural reason behind this; discovering and proving such a motivation for the existence of the said unexpected intersections is the basic pattern of the problems in this realm. We stress that the families considered here are not part of continuous or algebraic ones, but are genuinely discrete ones, a fact which introduces certain arithmetical aspects into the picture, seemingly not of the most common diophantine type. In the lectures, I focused on some known problems that can be viewed in this perspective, which sometimes unifies them. I tried to offer an overview of some of the problems and especially of a method that arose recently, without any attempt to be complete, but limited mainly to the issues with which I am more familiar. Also, according to the general spirit of these lectures, I tried to give some survey and sketch of the proofs, rather than fine details on technical points. In these notes I have basically followed similar principles, and I have presented essentially the same general topics and theorems as in the lectures, with a few exceptions. Naturally, I have added a substantial amount of detail. Similarly to the lectures, these notes are addressed to a “general” reader: they may be considered rather elementary, and known facts are often recalled for convenience. Also, the subject shows rapid evolution, so probably several results shall soon be superseded; nevertheless, hopefully this should not affect too much the said spirit of the notes. 
The introduction, which roughly corresponds to the first of the lectures, gives an overview of the topics and of the content of the subsequent chapters, which should correspond to the remaining three lectures; however, Chapter 2 presents material which in a sense is more specific and could not be mentioned in the lectures. The chapters are concluded with notes, where other, more specific questions are included. Moreover, the volume contains seven (short) appendixes (six of which are by David Masser), which in particular illustrate some essentials of the proofs of certain auxiliary tools needed for the main results.

Acknowledgments. It is a real pleasure to thank the Institute for Advanced Study for the invitation to deliver the Hermann Weyl Lectures, and for the generous hospitality. In particular, I deeply and heartily thank Enrico Bombieri and Peter Sarnak, also for their very important general help and advice and for rewarding encouragement. I am also much indebted to David Masser for several illuminating discussions, revisions, and the precious appendixes. I further thank Yves André, Matt Baker, Daniel Bertrand, Yuri Bilu, Paula Cohen-Tretkoff, Pietro Corvaja, Philipp Habegger, Ben Hutz, Lars Kühne, Aaron Levin, Vincenzo Mantova, Jonathan Pila, Francesco Veneziano and Shou-Wu Zhang for very helpful clarifications and references. I further thank the staff of the Princeton University Press, especially Vickie Kearn and Ben Holmes, for very kind and helpful assistance.

Notation and Conventions

For a set A, we shall usually denote by |A| its cardinality. If k is a field, we shall denote by $\bar{k}$ an algebraic closure. For a (commutative) ring R, we shall denote by $R^*$ the group of invertible elements in R.

By the rank of an abelian group Γ, written multiplicatively, we shall mean the maximum number of elements $\gamma_1, \ldots, \gamma_r \in \Gamma$ such that no relation exists of the shape $\gamma_1^{a_1} \cdots \gamma_r^{a_r} = 1$ with integer exponents $a_i$ not all zero. For instance, a torsion abelian group has rank 0.

By algebraic variety we mean a subset of an affine or projective space, defined by a set of algebraic equations. By “variety” we shall usually mean “algebraic variety”; however, in Chapters 3 and 4 there is extensive appearance of transcendental varieties. By saying that a(n algebraic) variety X is defined over a field k we mean that it may be defined by a set of equations with coefficients in k, and we sometimes write X/k; we often identify the variety with the set of its points over an algebraic closure of k. Usually we shall consider varieties defined over $\overline{\mathbb{Q}}$ (or occasionally over $\mathbb{C}$ or some finite field). For an algebraic variety X defined over k, we denote by k(X) the field of rational functions on X with coefficients in k. By X(k) we mean the set of points of X with coordinates in k. By algebraic point we usually mean a point in $X(\overline{\mathbb{Q}})$. If X is a variety defined by equations $f_i = 0$ over a field k with an automorphism σ, by $X^\sigma$ we mean the variety defined by the equations $f_i^\sigma = 0$. (It is not difficult to see that this does not depend on the set of defining equations.)

We shall denote by $\mathbb{G}_m$ the (affine) variety $\mathbb{A}^1 \setminus \{0\}$, i.e., the affine line deprived of the origin, endowed with the multiplicative group law. We extend this law coordinatewise to $\mathbb{G}_m^n$. When working with schemes over a base variety B, we shall often continue to indicate by $\mathbb{G}_m$ the “constant” scheme $\mathbb{G}_m \times B$, sometimes denoted $\mathbb{G}_{m/B}$. Any vector $a = (a_1, \ldots, a_n) \in \mathbb{Z}^n$ induces an algebraic homomorphism from $\mathbb{G}_m^n$ to $\mathbb{G}_m$, denoted by $x = (x_1, \ldots, x_n) \mapsto x^a := x_1^{a_1} \cdots x_n^{a_n}$.
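
The homomorphism property of the monomial map $x \mapsto x^a$ can be checked on sample points. The following is only a minimal sketch: the function name `monomial` and the sample points are hypothetical choices, and rational coordinates are assumed simply to keep the arithmetic exact.

```python
from fractions import Fraction
from math import prod


def monomial(x, a):
    """Evaluate x^a = x_1^{a_1} * ... * x_n^{a_n}.

    x is a tuple of nonzero rationals (a point of G_m^n with rational
    coordinates, an assumption made for exact arithmetic) and a is a
    tuple of integer exponents of the same length.
    """
    if len(x) != len(a):
        raise ValueError("x and a must have the same length")
    if any(xi == 0 for xi in x):
        raise ValueError("coordinates must be nonzero")
    return prod((Fraction(xi) ** ai for xi, ai in zip(x, a)), start=Fraction(1))


# Hypothetical sample points and exponent vector.
x = (Fraction(2), Fraction(-3, 5), Fraction(7, 4))
y = (Fraction(1, 3), Fraction(5), Fraction(-2))
a = (2, -1, 3)

# Coordinatewise product in G_m^n, then compare both sides.
xy = tuple(xi * yi for xi, yi in zip(x, y))
assert monomial(xy, a) == monomial(x, a) * monomial(y, a)
```

Negative exponents are allowed because the coordinates are invertible, which is exactly why the map is defined on $\mathbb{G}_m^n$ rather than on affine space.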

For G a(n algebraic) group and m an integer, the symbol [m] shall denote the (algebraic) map of multiplication by m in G.

In Chapter 4 we shall usually put $\Gamma := \mathrm{SL}_2(\mathbb{Z})$.

The symbol “$f \ll g$,” for complex functions f, g of a variable x, means as usual that $|f(x)| \le c|g(x)|$ for an unspecified positive number c, independent of x in the domain, but which may depend on other data, thought of as fixed as x varies. (This is occasionally indicated explicitly by writing, e.g., “$f \ll_{S,\epsilon} g$.”) We also use f = O(g) with the same meaning. In the same context, the symbol “f ∼ g” means that the ratio f(x)/g(x) tends to 1 when the variable x tends to some limit point implicit in the discussion.


Introduction: An Overview of Some Problems of Unlikely Intersections

The present (rather long) introduction is intended to illustrate some basic problems of the topic, and to give an overview of the results and methods treated in the subsequent chapters. This follows the pattern I adopted in the lectures.

Let me first say a few words on the general title. This has to do with the simple expectation that when we intersect two varieties X, Y (whose type is immaterial now) of dimensions r, s ≥ 0 in a space of dimension n, in absence of special reasons we expect the intersection to have dimension ≤ r + s − n, and in particular to be empty if r + s < n. (This expectation may of course be justified on several grounds.)

More specifically, let X be fixed and let Y run through a denumerable set $\mathcal{Y}$ of algebraic varieties, chosen in advance independently of X, with a certain structure relevant for us, and such that dim X + dim Y < n = dim(ambient); then we expect that only for a small subset of $Y \in \mathcal{Y}$, we shall have $X \cap Y \neq \emptyset$, unless there is a special structure relating X with $\mathcal{Y}$ which forces the contrary to happen. We shall usually express this by saying that X is a special variety. When X is nonspecial, the said (expected) smallness may be measured in terms of the union $\bigcup_{Y \in \mathcal{Y}} (X \cap Y)$ of the intersections: how is this set distributed in X? Is this set finite? Similarly, we may study analogous situations when dim Y = s is any fixed number, whereas dim(X ∩ Y) > dim(X) + s − n for several $Y \in \mathcal{Y}$.

We note that often these problems can also be seen as expressing some kind of local-global principle: a point of intersection of X with some $Y \in \mathcal{Y}$ encodes a local property of a suitable set of coordinate functions on X at that point; we expect this property to occur only at a few points, unless it is the specialization of a global property of these functions on X. Such a global property should correspond to X being “special.”

Now, it turns out that some known problems involving arithmetic and geometry can be put into this (rough) context, where the varieties in $\mathcal{Y}$ are usually described by equations of growing degrees, and depending on discrete parameters.¹ For instance, $\mathcal{Y}$ could consist of denumerably many prescribed points, which is indeed the case in many of the basic issues in this topic; such points shall be called the special points. In each case, the special varieties shall constitute the natural (for the structure in question) higher-dimensional analogue of the special points. Then we expect that a nonspecial variety X of positive codimension shall contain only a few special points, for instance a set which is not Zariski-dense. Let me give a simple example at the basis of the problems to be discussed.

¹ So, especially because the degree (over $\mathbb{Q}$) grows, the varieties in $\mathcal{Y}$ considered here do not vary in families in the common algebraic or continuous meaning. In particular, the said intersections (whether unlikely or not) usually cannot be described by (rational or integral) points on algebraic varieties. This fact introduces an arithmetical aspect in the problems seemingly of different type compared to the usual diophantine questions. So, for instance, although a line meeting a given curve in $\mathbb{A}^3$ may be considered to produce an “unlikely intersection,” the diophantine issue of describing the set of such lines which are defined over $\mathbb{Q}$ does not fall in the realm considered here.

Lang’s problem on roots of unity

Such an issue was raised by S. Lang in the 1960s. He posed the following attractive problem, a kind of simple prototype of other questions we shall touch: suppose that X : f(x, y) = 0 is a complex plane irreducible curve containing infinitely many points (ζ, θ) whose coordinates are roots of unity; what can be said of the polynomial f? Actually, equations in roots of unity go back to long ago: P. Gordan already in 1877 studied certain equations of this type, linear and with rational coefficients, related to the classification of finite groups of homographies. In part inspired by this and by subsequent papers, e.g., of H.B. Mann, the subject was also investigated in a systematic way by J.H. Conway and A.J. Jones [CJ76]. In their terminology, we may view such problems as trigonometric diophantine equations; in fact, if we write the coordinates in exponential shape, we have a trigonometric equation f(exp(2πiα), exp(2πiβ)) = 0, to be solved in rational “angles” $\alpha, \beta \in \mathbb{Q}/\mathbb{Z}$.²

Observe also that the points with roots of unity coordinates are precisely the torsion points in the algebraic group $\mathbb{G}_m^2$. (We recall that as a variety $\mathbb{G}_m$ is simply $\mathbb{A}^1 \setminus \{0\}$, the affine line with the origin removed; we endow it with the multiplicative group law to make it into an algebraic group.) The torsion points constitute the set $\mathcal{Y}$ of special points in this problem.

Lang actually expected only finitely many torsion points to lie in X, unless a special (multiplicative) structure occurred, which he formulated as X being a translate of an algebraic subgroup by a torsion point, which we call a torsion coset. This amounts to the equation f(x, y) = 0 being (up to a monomial factor) of the shape $x^a y^b = \rho$, for integers a, b not both zero (their sign is immaterial here) and ρ a root of unity. This structure is actually clearly unavoidable because it yields infinitely many torsion points in X. We call it the special structure for the problem in question, and we call the torsion cosets the special irreducible (sub)varieties for this issue; they are also named torsion varieties.

The result foreseen by Lang can be rephrased by stating that an irreducible curve contains infinitely many special (=torsion) points if and only if it is a special (=torsion) curve. As mentioned in [Lan83], this expectation of Lang was soon proved by Ihara, Serre, Tate (see next chapter and also [Lan65] for an account of these proofs); it was accompanied by other questions (also of others), such as what happens in higher dimensions and for other algebraic groups, and provided further motivation for them.

Let us give a description of these evolutions and of some other related issues, which will serve also as a sort of summary for the topics of these notes. They involve several different methods, of which I shall describe only a small part in some detail.

² The quoted paper of Conway-Jones constructs a theory which reduces trigonometric diophantine equations to usual diophantine equations. It also mentions a number of classical applications, including the one noted by Gordan (1877), of the problem of solving equations in roots of unity.

Summary

Chapter 1: Unlikely Intersections in Multiplicative Groups and the Zilber Conjecture

We shall discuss unlikely intersections in (commutative) multiplicative algebraic groups (over a field of characteristic zero). For our purposes we may perform a finite extension of the ground field, and then such groups are of the shape $\mathbb{G}_m^n$; they are also called (algebraic) tori. The above-mentioned Lang’s problem is the simplest nontrivial issue in this context. The natural generalization to higher dimensions also formed the object of a conjecture of Lang, proved by M. Laurent [Lau84] and independently by Sarnak-Adams [SA94]. The final result, of which we shall sketch a proof (by a number of methods) in Chapter 1, may be stated in the following form:

Theorem. Let Σ be a set of torsion points in $\mathbb{G}_m^n(\overline{\mathbb{Q}})$. The Zariski closure of Σ is a finite union of torsion cosets.

The torsion cosets are by definition the translates of algebraic subgroups by a torsion point; it turns out that they may be always defined by finitely many equations of the shape $x_1^{a_1} \cdots x_n^{a_n} = \theta$, for integers $a_i$ and root of unity θ. They are the special varieties in this context; in the case when the dimension is 0 we find the special points, i.e., the torsion points. Then, a rephrasing is that the Zariski closure of a(ny) set of special points is a finite union of special varieties.
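
To see the theorem at work in the simplest case, consider the line x + y = 1 in $\mathbb{G}_m^2$, which is not a torsion coset. If x and y = 1 − x both have absolute value 1, then x must be a primitive sixth root of unity, so the line contains exactly two torsion points. The following is a minimal numerical sketch of this check; the function names, the order bound, and the floating-point tolerance are assumptions made for illustration only, and a brute-force search up to a bound is of course not a proof.

```python
import cmath
from fractions import Fraction


def roots_of_unity(max_order):
    """Map each angle alpha in [0, 1) with denominator <= max_order
    to the root of unity exp(2*pi*i*alpha)."""
    angles = {Fraction(k, n) for n in range(1, max_order + 1) for k in range(n)}
    return {alpha: cmath.exp(2j * cmath.pi * float(alpha)) for alpha in sorted(angles)}


def torsion_points_on_line(max_order, tol=1e-9):
    """Return pairs of angles (alpha, beta) with
    exp(2*pi*i*alpha) + exp(2*pi*i*beta) = 1 up to the tolerance tol."""
    roots = roots_of_unity(max_order)
    return sorted(
        (alpha, beta)
        for alpha, x in roots.items()
        for beta, y in roots.items()
        if abs(x + y - 1) < tol
    )


# Only the two primitive sixth roots of unity show up.
print(torsion_points_on_line(30))
# → [(Fraction(1, 6), Fraction(5, 6)), (Fraction(5, 6), Fraction(1, 6))]
```

In the language of the rational “angles” α, β used above, the two solutions are (α, β) = (1/6, 5/6) and (5/6, 1/6).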

Note that if we start with any (irreducible) algebraic variety $X \subset \mathbb{G}_m^n$ and take Σ as the set of all torsion points in X, we find that Σ is confined to a finite union of torsion cosets in X, and hence is not Zariski-dense in X unless X is itself a torsion coset. This result in practice describes the set of solutions of a system of algebraic equations in roots of unity, and confirms the natural intuition that all the solutions are originated by a multiplicative structure of finitely many subvarieties of X. As noted above, such solutions in roots of unity had been treated from a somewhat different viewpoint also by Mann [Man65] and Conway-Jones [CJ76].

These conjectures and results inspired analogous and deeper problems; for instance, we may replace $\mathbb{G}_m^n$ by an abelian variety and ask about the corresponding result. In the case of a complex curve X of genus at least 2 embedded in its Jacobian, such a problem was raised independently by Yu. Manin and separately by D. Mumford, already in the 1960s; it predicted finiteness for the set of torsion points on X. Actually, Mumford’s question apparently motivated in part Lang’s above problem, as mentioned in [Lan65]; then Lang was led to unifying statements. The Manin-Mumford conjecture was proved by M. Raynaud, who soon was able to analyze completely the general case of an arbitrary subvariety X of an abelian variety A (see, e.g., [Ray83]). The final result by Raynaud may be phrased in the above shape:

Raynaud’s Theorem. Let A be an abelian variety defined over a field of characteristic 0 and let Σ be a set of torsion points in A. Then the Zariski closure of Σ is a finite union of translates of abelian subvarieties of A by a torsion point.

The fundamental case occurs when A is defined over $\overline{\mathbb{Q}}$, which indeed implies (by specialization) the general case of characteristic zero, whereas the conclusion is generally false, e.g., in the case of $\overline{\mathbb{F}}_p$, since any point is then torsion. The special varieties of this abelian context are the torsion-translates of abelian subvarieties, and the special points are again the torsion points. There are now several known proofs of the Manin-Mumford conjecture and its extensions, but none of them is really easy, and the matter is distinctly deeper than the toric case. A new proof appears in the paper [PZ08] with a method which applies also to other issues on unlikely intersections where other arguments do not apply directly. This method and its implications constitute one of the main topics we shall discuss, however, in Chapters 3 and 4.

This context evolved in several deep directions, first with the study of points in X ∩ Γ for a subgroup Γ of finite rank (Lang, P. Liardet, M. Laurent, G. Faltings, P. Vojta...), or later with the study of algebraic points of small height (after F. Bogomolov); actually, certain issues on “small points” implicitly motivated some of the studies we shall discuss later. For the sake of completeness, in Chapter 1 we shall briefly mention a few results on this kind of problem, without pausing, however, on any detail.

Still in another direction, we may continue the above problems to higher multiplicative rank, by intersecting a given variety $X \subset \mathbb{G}_m^n$ not merely with the set of torsion points, but with the family of algebraic subgroups of $\mathbb{G}_m^n$ up to any given dimension; when this dimension is 0, we find back the torsion points. The structure of algebraic subgroups of $\mathbb{G}_m^n$ (recalled below) shows that a point $(u_1, \ldots, u_n) \in X$ lies in such a subgroup of dimension r if and only if the coordinates $u_1, \ldots, u_n$ satisfy at least n − r independent relations $u_1^{m_1} \cdots u_n^{m_n} = 1$ of multiplicative dependence, i.e., the rank of the multiplicative group $u_1^{\mathbb{Z}} \cdots u_n^{\mathbb{Z}}$ is at most r. We find maximal dependence when the point is torsion, i.e., we may take r = 0. But of course the intersection shall be unlikely as soon as r + dim X < n. So, under this condition, we already expect a sparse set of intersections, unless X is “special” in some appropriate algebraic sense. For instance, we do not expect X to be special if the coordinates $x_1, \ldots, x_n$ are multiplicatively independent as functions on X; in this case we should expect this independence to be preserved by evaluation at most points of X. We clearly see here a kind of local-global principle alluded to above.

A prototype of this kind of problem was studied already by Schinzel in the 1980s (see, e.g., Ch. 4 of [Sch00]) in the course of his theory of irreducibility of lacunary polynomials, so with independent motivations; he obtained fairly complete results only when X is a curve in a space up to dimension n ≤ 3. For the case of arbitrary dimension n, confining again to curves X, this study was the object of a joint paper with E. Bombieri and D. Masser [BMZ99], and then was studied further by P. Habegger, G. Rémond, G. Maurin, and also by M. Carrizosa, E. Viada, and others in the case of abelian varieties (here the results are less complete). We shall present in some detail the two main results of the paper [BMZ99] in Chapter 1. The common assumption is that the curve X is not contained in a translate of a proper algebraic subgroup of $\mathbb{G}_m^n$; this means that the coordinate functions $x_1, \ldots, x_n$ are multiplicatively independent modulo constants. Under this assumption we have (working over $\overline{\mathbb{Q}}$):

Theorem 1. If X is an irreducible curve over $\overline{\mathbb{Q}}$, not contained in any translate of a proper algebraic subgroup of $\mathbb{G}_m^n$, then the Weil height in the set $\bigcup_{\dim G \le n-1} (X \cap G)$ is bounded above.³

Here and below, G is understood to run through algebraic subgroups of $\mathbb{G}_m^n$. Note that the sum dim X + dim G here may be equal to n, so these intersections X ∩ G may be indeed considered “likely intersections”; actually, it is easily proved that they constitute an infinite set. However, the result shows that they are already sparse: in fact, for instance the well-known (easy) Northcott’s theorem immediately implies that there are only finitely many such points of bounded degree. On the other hand, note also that a priori it is not even clear that these intersections do not exhaust all the algebraic points on our curve X. In particular, Theorem 1 applies to the curve $x_1 + x_2 = 1$, predicting that such $x_1, x_2$ which are multiplicatively dependent have bounded height; this example, first proposed by Masser (and an analogue of an example by S. Zhang and D. Zagier for lower bounds for heights), played a motivating role in the early work.

To go on, let us now look at algebraic subgroups G with dim G ≤ n − 2; now we impose at least two multiplicative relations, and we have, so to say, double sparseness, and truly unlikely intersections. This is confirmed by the following result:

Theorem 2. If X is an irreducible curve over $\overline{\mathbb{Q}}$, not contained in any translate of a proper algebraic subgroup of $\mathbb{G}_m^n$, then $\bigcup_{\dim G \le n-2} (X \cap G)$ is finite.

³ We shall briefly recall this notion of height in Chapter 1.

For instance, the set A (resp. B) of algebraic numbers x (resp. y) such that x, 1 − x (resp. y, 1 + y) are multiplicatively dependent has bounded height, by Theorem 1, whereas A ∩ B is finite, by Theorem 2 (applied to the line in $\mathbb{G}_m^3$ parametrized by (t, 1 − t, 1 + t)). We shall give sketches of proofs of these theorems, which roughly speaking depend on certain comparisons between degrees and heights.
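
For rational x the membership in A can be decided by comparing prime factorizations: x and 1 − x are multiplicatively dependent exactly when their vectors of prime exponents are proportional (a possible sign is removed by squaring). A short argument with coprime numerators and denominators shows that the only rational elements of A are −1, 1/2 and 2. The sketch below, restricted to rational numbers as a simplifying assumption (the theorems concern all algebraic numbers), confirms this within a small box; the function names and the search bound are hypothetical choices.

```python
from fractions import Fraction


def prime_exponents(q):
    """Prime exponent vector of |q|, as a dict {prime: exponent}, for nonzero rational q."""
    q = Fraction(q)
    if q == 0:
        raise ValueError("zero is not in the multiplicative group")
    exps = {}
    for value, sign in ((abs(q.numerator), 1), (q.denominator, -1)):
        p = 2
        while p * p <= value:  # trial division, adequate for small inputs
            while value % p == 0:
                exps[p] = exps.get(p, 0) + sign
                value //= p
            p += 1
        if value > 1:
            exps[value] = exps.get(value, 0) + sign
    return exps


def multiplicatively_dependent(u, v):
    """True if u^a * v^b = 1 for some integers (a, b) != (0, 0)."""
    eu, ev = prime_exponents(u), prime_exponents(v)
    primes = sorted(set(eu) | set(ev))
    # The exponent vectors are proportional iff every 2x2 minor vanishes.
    return all(
        eu.get(p, 0) * ev.get(q, 0) == eu.get(q, 0) * ev.get(p, 0)
        for p in primes
        for q in primes
    )


def dependent_points(bound):
    """Rationals x = num/den with |num|, den <= bound, x not 0 or 1,
    such that x and 1 - x are multiplicatively dependent."""
    found = set()
    for den in range(1, bound + 1):
        for num in range(-bound, bound + 1):
            x = Fraction(num, den)
            if x not in (0, 1) and multiplicatively_dependent(x, 1 - x):
                found.add(x)
    return sorted(found)


print(dependent_points(10))
# → [Fraction(-1, 1), Fraction(1, 2), Fraction(2, 1)]
```

The three values correspond to the relations $(-1)^2 = 1$, $(1/2)(1/2)^{-1} = 1$ and $(1 - 2)^2 = 1$; over number fields of higher degree the set A is infinite, which is why Theorem 1 bounds the height rather than the cardinality.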

The assumption that X is not contained in a translate of a proper algebraic subgroup, rather than just in a proper algebraic subgroup (or, equivalently, in a torsion-translate of a proper algebraic subgroup) makes a subtle difference; it is necessary for the first result to hold, but this necessity was not clear for the second one. This turned out to be in fact an important issue, because for instance it brought into the picture the deep Lang’s conjectures (alluded to above) on the intersections (of a curve) with finitely generated groups. Only recently has it been proved by G. Maurin [Mau08] that the weaker assumption suffices. Before Maurin’s proof, the attempt to clarify this issue (as, for instance, in [BMZ06]) led to other natural and independent questions, like an extension of Theorem 1 to higher dimensional varieties; it was soon realized that for this aim new assumptions were necessary, and in turn this led to the consideration of unlikely intersections of higher dimensions. Such a study was performed in [BMZ07], where among other things a kind of function field analogue of Theorem 2 was obtained, for unlikely intersections of positive dimension (with algebraic cosets); also, the issue of the height was explicitly stated therein with a “bounded height conjecture.” This was eventually proved by P. Habegger in [Hab09c] with his new ideas. With the aid of this result, a new proof of Maurin’s theorem was also achieved in [BHMZ10]. In the meantime, after the paper [BMZ99] was published, it turned out that quite similar problems had been considered independently and from another viewpoint also by B. Zilber, who, with completely different motivations arising from model theory, had formulated in [Zil02] general conjectures for varieties of arbitrary dimensions (also in the abelian context), of which the said theorem of [Mau08] is a special case. Another independent formulation of such conjectures (even in greater generality) was given by R. Pink (unpublished). 
These conjectural statements contain several of the said theorems4 and in practice predict a certain natural finite description for all the unlikely intersections in question. We shall state Zilber’s conjecture (in the toric case) and then present in short some extensions of the above theorems, some other results (e.g., on the said unlikely intersections of positive dimension) and some applications, for instance, to the irreducibility theory of lacunary polynomials. We shall also see how Zilber’s conjecture implies uniformity in quantitative versions of the said results. Further, in the notes to the chapter we shall offer some detail about an independent method of Masser to study the sparseness of the intersections considered in Theorem 1 (actually with a milder assumption), and we shall see other more specific questions. Chapter 2: An Arithmetical Analogue

The unlikely intersections of the above Theorem 2 for X a curve in $\mathbb{G}_m^n$ correspond to (complex) solutions to pairs of equations $x^a = x^b = 1$ on X, where we have abbreviated, e.g., $x^a := x_1^{a_1} \cdots x_n^{a_n}$. The $x_i$ are the natural coordinate functions on $\mathbb{G}_m^n$, whereas a, b vary over all pairs of linearly independent integral vectors. In other words, we are considering common zeros of two rational functions u − 1 and v − 1 on X, where both $u := x^a$, $v := x^b$ are taken from a finitely generated multiplicative group Γ of rational functions, namely, $\Gamma = x_1^{\mathbb{Z}} \cdots x_n^{\mathbb{Z}}$, the group generated by the coordinates $x_1, \ldots, x_n$.

⁴ However, they do not consider heights.

In this view, we obtain an analogous issue for number fields k on considering a finitely generated group Γ in $k^*$ (e.g., the group of S-units, for a prescribed finite set S of places) and on looking at primes dividing both u − 1 and v − 1, for u, v running through Γ. These primes now constitute the unlikely intersections; note that now there are always infinitely many of them (if Γ is infinite). For given u, v, a measure of the magnitude of the set of these primes is gcd(1 − u, 1 − v). It turns out that if u, v are multiplicatively independent this can be estimated nontrivially; for instance, we have:

Theorem: Let $\epsilon > 0$ and let $\Gamma \subset \mathbb{Q}^*$ be a finitely generated subgroup. Then there is a number $c = c(\epsilon, \Gamma)$ such that if $u, v \in \mathbb{Z} \cap \Gamma$ are multiplicatively independent, we have $\gcd(1 - u, 1 - v) \le c \max(|u|, |v|)^{\epsilon}$.
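To get a feeling for the size of these greatest common divisors, one may experiment numerically. The following is only a minimal sketch, not part of the proof: the function name `gcd_exponent` and the sample pairs (2, 3) and (2, 4) are choices made here for illustration. It prints the ratio log gcd(a^n − 1, b^n − 1) / log max(a^n, b^n) for a multiplicatively independent pair and, for contrast, for a dependent one, where the gcd equals 2^n − 1 and the ratio stays near 1/2.

```python
from math import gcd, log


def gcd_exponent(a: int, b: int, n: int) -> float:
    """Return log gcd(a^n - 1, b^n - 1) / log max(a^n, b^n)."""
    g = gcd(a**n - 1, b**n - 1)
    return log(g) / (n * log(max(a, b)))


if __name__ == "__main__":
    for n in (1, 2, 4, 12, 60, 120, 360):
        independent = gcd_exponent(2, 3, n)  # 2 and 3 are multiplicatively independent
        dependent = gcd_exponent(2, 4, n)    # 4 = 2^2, so this pair is dependent
        print(f"n={n:4d}  (2,3): {independent:.3f}   (2,4): {dependent:.3f}")
```

A finite table of this kind proves nothing, of course; it only illustrates the contrast between the two situations.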

We shall give a proof of this statement, relying on the subspace theorem of Schmidt, which we shall recall in a simplified version that is sufficient here. A substantial difference with the above context is that here we estimate the individual intersections, whereas previously we estimated their union over all pairs u, v ∈ Γ. However, in this context a uniform bound for the union does not hold. In the special cases $u = a^n$, $v = b^n$ (with a, b fixed integers, $n \in \mathbb{N}$), the displayed results were obtained in joint work with Y. Bugeaud and P. Corvaja [BCZ03], whereas the general case was achieved in [CZ03] and [CZ05], also for number fields (where the gcd may be suitably defined, also involving archimedean places). All of these proofs use the above-mentioned Schmidt subspace theorem (which is a higher-dimensional version of Roth’s theorem in diophantine approximation; see, e.g., [BG06] for a proof and also for its application to the present theorem).

These results admit, for instance, an application to the proof of a conjecture by Győry-Sárközy-Stewart (that the greatest prime factor of (ab + 1)(ac + 1) → ∞ as a → ∞, where a > b > c > 0). They also have applications to various other problems, including the structure of the groups $E(\mathbb{F}_{q^n})$, n → ∞, for a given ordinary elliptic curve $E/\mathbb{F}_q$ (Luca-Shparlinski [LS05]) and to a Torelli Theorem over finite fields (Bogomolov-Korotiaev-Tschinkel, [BT08] and [BKT10]). J. Silverman [Sil05] formulated the result in terms of certain heights and recovered it as a consequence of a conjecture of Vojta for the blow-up of $\mathbb{G}_m^2$ at the origin; we shall recall in brief these interpretations.

There are also analogous estimates over function fields (obtained in [CZ08b]); they also provide simplification for some proofs related to results in Chapter 1, which yields further evidence that such an analogy is not artificial. These estimates also have implications for the proof of certain special cases of Vojta’s conjecture on integral points over function fields, specifically for $\mathbb{P}^2$ minus three divisors, and for counting rational points on curves over finite fields. We shall present (also in the notes) some of these results and provide detail for some of the proofs.

Chapter 3: Unlikely Intersections in Elliptic Surfaces and Problems of Masser

D. Masser formulated the following attractive problem. Consider the Legendre elliptic curve $E_\lambda : Y^2 = X(X - 1)(X - \lambda)$ and the points $P_\lambda = (2, \sqrt{2(2 - \lambda)})$, $Q_\lambda = (3, \sqrt{6(3 - \lambda)})$ on it. Here λ denotes an indeterminate, but we can also think of specializing it. Consider then the set of complex values $\lambda_0 \neq 0, 1$ of λ such that both $P_{\lambda_0}$ and $Q_{\lambda_0}$ are torsion points on $E_{\lambda_0}$. It may be easily proved that none of the points is identically torsion on $E_\lambda$ and that actually the points are linearly independent over $\mathbb{Z}$, so it makes sense to ask:

Masser’s problem: Is this set finite?

Clearly this is also a problem of unlikely intersections: for varying $\lambda \in \mathbb{C}$, the $E_\lambda$ describe a(n elliptic) surface, with a rational map λ to $\mathbb{P}^1$. The squares $E_\lambda^2$, again for varying $\lambda \neq 0, 1$, describe a threefold, i.e., the fibered product (with respect to the map λ) $E_\lambda \times_{\mathbb{P}^1 \setminus \{0,1,\infty\}} E_\lambda$. (This is an elliptic group-scheme over $\mathbb{P}^1 \setminus \{0, 1, \infty\}$.) The $P_\lambda \times E_\lambda$ and $E_\lambda \times Q_\lambda$ both describe a surface, whereas the points $P_\lambda \times Q_\lambda$ describe a curve in this threefold; also, each condition $mP_\lambda = O_\lambda$ (i.e., the origin of $E_\lambda$) or $nQ_\lambda = O_\lambda$ also corresponds to a surface (if $mn \neq 0$), whereas a pair of such conditions yields a curve, since the points are linearly independent on $E_\lambda$ (as we shall see). Putting together these dimensional data, we should then expect (i) that each point gives rise to infinitely many “torsion values” $\lambda_0$ of λ (which may be proved, however, a bit less trivially than might be expected), but (ii) that it is unlikely that the same value $\lambda_0$ works for both points, which would correspond to an intersection of two curves in the threefold. Hence, a finiteness expectation in Masser’s context is indeed sensible.

Of course, this is just a simple example of analogous questions that could be stated. For instance, we could pick any two points in $E_\lambda(\mathbb{Q}(\lambda))$ and ask the same question. In the general case we would expect finiteness only when the points are linearly independent in $E_\lambda$; any (identical) linear dependence would correspond to a special variety on this issue of unlikely intersections. In fact, it turned out that S. Zhang had raised certain similar issues in 1998, and R. Pink in 2005 raised independently such a type of conjecture in the general context of semiabelian group schemes, generalizing the present problems. In these notes we shall stick mainly to special cases like the one above, but we shall discuss also some recent progress toward more general cases.
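The expectations (i) and (ii) can be tested by hand in the smallest cases. The sketch below is an illustration only, under assumptions made here: it uses the duplication formula x(2P) = (x² − λ)² / 4x(x − 1)(x − λ) on the Legendre curve, it searches only rational values λ = p/q in a small box, and it looks only at torsion of order dividing 4. The helper names `double_x`, `order_divides_4`, and `torsion_values` are hypothetical.

```python
from fractions import Fraction


def double_x(x, lam):
    """x-coordinate of 2P on Y^2 = X(X-1)(X-lam), given x = x(P).

    Returns None when 2P is the origin, i.e., when P is a 2-torsion point.
    """
    den = 4 * x * (x - 1) * (x - lam)
    if den == 0:
        return None
    return (x * x - lam) ** 2 / den


def order_divides_4(x0, lam):
    """True if a point of E_lam with abscissa x0 is killed by 4."""
    x2 = double_x(Fraction(x0), lam)
    # 4P = O exactly when 2P is the origin or a point of order 2,
    # and the points of order 2 have abscissa 0, 1 or lam.
    return x2 is None or x2 in (0, 1, lam)


def torsion_values(x0, max_num=12, max_den=6):
    """Rational lam = p/q with |p| <= max_num, 1 <= q <= max_den, lam != 0, 1,
    such that the point with abscissa x0 has order dividing 4 on E_lam."""
    found = set()
    for q in range(1, max_den + 1):
        for p in range(-max_num, max_num + 1):
            lam = Fraction(p, q)
            if lam not in (0, 1) and order_divides_4(x0, lam):
                found.add(lam)
    return found


if __name__ == "__main__":
    print(sorted(torsion_values(2)))  # torsion values for P (abscissa 2)
    print(sorted(torsion_values(3)))  # torsion values for Q (abscissa 3)
```

In this tiny range the two sets of torsion values turn out to be disjoint, in accordance with (ii); this is of course no evidence for finiteness, which concerns all orders and all complex λ.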

Note that if we take a fixed “constant” elliptic curve E, say over $\mathbb{Q}$, in place of $E_\lambda$, and if we take two points $P, Q \in E(\mathbb{Q}(\lambda))$, the obvious analogue of Masser’s question reduces to the Manin-Mumford problem for $E^2$, namely, we are just asking for the torsion points in the curve described by $P_\lambda \times Q_\lambda$ in $E^2$ (the fact that now E is constant allows us to work with the surface $E^2$ rather than the threefold $E^2 \times \mathbb{P}^1$). The special varieties here occur when the points are linearly dependent: $aP_\lambda = bQ_\lambda$ for integers a, b not both zero.⁵ However, the known arguments for Raynaud’s theorem seem not to carry over directly to a variable elliptic curve as in Masser’s problem.

In Chapter 3, we shall actually start with the Manin-Mumford conjecture (in extended form, i.e., Raynaud’s theorem), sketching the mentioned method of [PZ08] to recover it, and we shall then illustrate in some detail how the method also applies in this relative situation. The method proves the finiteness expectation, as in the papers [MZ08] and [MZ10b]. (In more recent work [MZ10d], it is shown that this method suffices also for any choice of two points on $E_\lambda$, with coordinates in $\mathbb{Q}(\lambda)$ and not linearly dependent.)

The principle of the method is, very roughly, as follows. Let us consider for simplicity the case of the Manin-Mumford issue of torsion points on a curve X in a fixed abelian variety $A/\mathbb{Q}$. We start by considering a transcendental uniformization $\pi : \mathbb{C}^g/\Lambda \to A$, where Λ is a full lattice in $\mathbb{C}^g$. By means of a basis for Λ, we may identify $\mathbb{C}^g$ with $\mathbb{R}^{2g}$, and under this identification the torsion points on A (of order N) become the rational points in $\mathbb{R}^{2g}$ (of denominator N). Thus the torsion points on X of order N give rise to rational points on $Z := \pi^{-1}(X)$ of denominator N; in the above identification, Z becomes a real-analytic surface. Now the proof compares two kinds of estimates for these rational points:

• Upper bounds: By work of Bombieri-Pila [BP89], generalized by Pila [Pil04] and further by Pila-Wilkie [PW06], one can often estimate nontrivially the number of rational points with denominator dividing N on a (compact part of a) transcendental variety Z;⁶ the estimates take the shape $\ll_{\epsilon,Z} N^{\epsilon}$, for any $\epsilon > 0$, provided, however, we remove from Z the union of connected semialgebraic arcs⁷ contained in it. (This proviso is necessary, as these possible arcs could contain many more rational points.) In the present context, purely geometrical considerations show that if X is not a translate of an elliptic curve, then Z does not contain any such semialgebraic arc. Thus the estimate holds indeed for all rational points on Z.

• Lower bounds: Going back to the algebraic context of A and X, we observe that a torsion point x ∈ X carries all its conjugates $x^\sigma \in X$ over a number field k of definition for X; these conjugates are also torsion, of the same order as x. Further, by a deep estimate of Masser [Mas84] (coming from methods stemming from transcendence theory), their number (i.e., the degree [k(x) : k]) is $\gg_A N^{\delta}$, where δ > 0 is a certain positive number depending only on (the dimension of) A.

Conclusion: Plainly, comparison of these estimates proves (on choosing $\epsilon < \delta$) that N is bounded unless X is an elliptic curve inside A, as required.

⁵ If E has complex multiplication, we should, however, take into account these relations with a, b ∈ End(E).

⁶ The original paper [BP89] strongly influenced the subsequent work; it was concerned with curves, also algebraic, and the attention to transcendental ones came also from a specific question of Sarnak.
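The role of the proviso on semialgebraic arcs in the upper bounds may be seen in a toy case. The sketch below is an illustration under assumptions made here and is not the situation of the proof (where Z is a real-analytic surface): it counts points (a/N, b/N) with 0 ≤ a ≤ N on the algebraic arc y = x and on the transcendental arc y = 2^x, using exact integer arithmetic. The function names are hypothetical.

```python
def count_points(N, on_curve):
    """Count points (a/N, b/N) with 0 <= a <= N and 0 <= b <= 2N that pass
    the exact integer test on_curve(a, b, N)."""
    return sum(on_curve(a, b, N) for a in range(N + 1) for b in range(2 * N + 1))


def on_line(a, b, N):
    # y = x  <=>  b/N = a/N
    return a == b


def on_exp(a, b, N):
    # y = 2^x  <=>  (b/N)^N = 2^a  <=>  b^N = 2^a * N^N
    return b**N == 2**a * N**N


if __name__ == "__main__":
    for N in (1, 5, 10, 20, 40):
        print(N, count_points(N, on_line), count_points(N, on_exp))
```

On the line the count grows like N, whereas on the graph of 2^x only the two endpoints x = 0, 1 survive, since 2^{a/N} is irrational for 0 < a < N. This is the kind of dichotomy that the estimates of Bombieri-Pila and Pila-Wilkie make precise in general.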

This method would work also for the simpler context of torsion points in subvarieties of $\mathbb{G}_m^n$, as in the original question by Lang. (See, e.g., [Sca11b].) In the case of Masser’s above-mentioned problem, things look somewhat different, since we have a family of elliptic curves $E_\lambda$, and then the uniformizations $\mathbb{C}/\Lambda_\lambda \to E_\lambda$ make up a family as well. Also, when we take conjugates of a torsion point in $E_{\lambda_0}$, we fall in other curves $E_{\lambda_0^\sigma}$. However, in spite of these discrepancies with the “constant” case, it turns out that the issue is still in the range of the method: we may use a varying basis for the lattices $\Lambda_\lambda$ (constructed by hypergeometric functions) and again define real coordinates which take rational values on torsion points. And then the above-sketched proof-pattern still works in this relative picture. We shall present a fairly detailed account of this.

Further problems: Masser formulated also analogous issues, e.g., for other abelian surfaces and further for a larger number of points (depending on more parameters). There are also interesting applications, such as an attractive special case considered by Masser, concerning polynomial solutions in $\mathbb{C}[t]$ of Pell’s equations, e.g., of the shape $x^2 = (t^6 + t + \lambda)y^2 + 1$. Independently of this, Pink also stated related and more general conjectures for group-schemes over arbitrary varieties. In Chapter 3 we shall also mention some of these possible extensions, and some more recent work. The said method in principle applies in greater generality, but in particular it often needs as a crucial ingredient a certain height bound, which is due to Silverman for the case of a single parameter. In higher dimensions, an analogous bound was proved only recently by P. Habegger, which may lead to significant progress toward the general questions of Masser and Pink.

A dynamical analogue. It is worth mentioning an analogue of these questions in algebraic dynamics; this comes on realizing that the torsion points on an elliptic curve correspond to the preperiodic points for the so-called Lattès map, namely the rational map $\mathbb{P}^1 \to \mathbb{P}^1$ of degree 4 which expresses x(2P) in terms of x(P), for a point P ∈ E, where x is the first coordinate in a Weierstrass model. (For the Legendre curve this map is $(x^2 - \lambda)^2 / 4x(x - 1)(x - \lambda)$.) A reformulation in this context of Masser’s question becomes: Are there infinitely many $\lambda \in \mathbb{C}$ such that 2, 3 are both preperiodic for the Lattès map $x_\lambda$ relative to $E_\lambda$? At the AIM meeting in Palo Alto (Jan. 2008) I asked whether such a question may be dealt with for other rational functions depending on a parameter, for instance $x^2 + \lambda$. M. Baker and L. DeMarco recently succeeded, in fact, in proving [BD11] that, if d ≥ 2, then $a, b \in \mathbb{C}$ are preperiodic with respect to $x^d + \lambda$ for infinitely many $\lambda \in \mathbb{C}$, if and only if $a^d = b^d$ (this is the special variety in this problem of unlikely intersections). The method of Baker-DeMarco is completely different from the one above, and we shall only say a few words about it; it is likely that it applies to the Lattès maps as well (although this has not yet been carried out). In any case it provides a complement to the method used by Masser, Pila, and myself. We shall conclude by mentioning and discussing as well some dynamical analogues of the Manin-Mumford conjecture.

⁷ By “semialgebraic arc” we mean the image of a $C^\infty$ nonconstant map from an interval to a real algebraic curve.

Chapter 4: About the André-Oort Conjecture

Recently, J. Pila found that the method that we have very briefly described in connection with Chapter 3, on the problems of Masser and the conjectures of Manin-Mumford and Pink, can be applied to another bunch of well-known problems that go under the name “André-Oort conjecture.” This conjecture concerns Shimura varieties and is rather technical to state in the most general shape; fundamental examples of Shimura varieties are the modular curves⁸ and, more generally, the moduli spaces parametrizing principally polarized abelian varieties of given dimension, possibly with additional (level) structure (see below for simple examples and Chapter 4 for more). At this point let us merely say that the pattern of the relevant statement is similar to the others we have found; in fact, there is a notion of Shimura subvariety of a Shimura variety, and if we interpret these subvarieties as being the “special” ones (in dimension 0 we find the special points), the general statement becomes:

André-Oort conjecture: The Zariski closure of a set of special points in a Shimura variety is a special subvariety, or, equivalently: If a subvariety of a Shimura variety has a Zariski-dense set of special points, then it is a special subvariety.

For the case when the said subvariety is a curve, this statement first appeared (in equivalent form) as Problem 1 on p. 215 of Y.
André’s book [And89]; then it was stated independently by F. Oort at the Cortona Conference in 1994 for the case of moduli spaces of principally polarized abelian varieties of given dimension [Oor97]. (We also note that in [And89], p. 216, André already pointed out a similarity with the Manin-Mumford conjecture.)

To illustrate the simplest instances of this conjecture, let me briefly recall a few basic facts in the theory of elliptic curves (see, e.g., [Sil92]). Every complex elliptic curve E may be defined (in $\mathbb{A}^2$) by a Weierstrass equation $E : y^2 = x^3 + ax + b$, where $a, b \in \mathbb{C}$ are such that $4a^3 + 27b^2 \neq 0$. It has an invariant $j = j(E) := 1728 \frac{4a^3}{4a^3 + 27b^2}$, which is such that $E, E'$ are isomorphic over $\mathbb{C}$ if and only if they have the same invariant. Hence we may view the affine (complex) line $\mathbb{A}^1$ as parametrizing (isomorphism classes of) elliptic curves through the j-invariant; this is the simplest example of modular curves, which in turn are the simplest Shimura varieties of positive dimension. Other modular curves are obtained as finite covers of $\mathbb{A}^1$, parametrizing elliptic curves plus some discrete additional structure, for instance a choice of a torsion point of given order, or a finite cyclic subgroup of given order.

A “generic” complex elliptic curve has an endomorphism ring equal to $\mathbb{Z}$; however, it may happen that the endomorphism ring is larger, in which case it is an order in the ring of integers of an imaginary quadratic field (and any such order is possible). In such cases the invariant j turns out to be an algebraic number (actually algebraic integer), and equal to a value j(τ) of the celebrated “modular function” j(z) at some imaginary quadratic number τ in the upper-half plane $H := \{z \in \mathbb{C} : \Im z > 0\}$. These values are called “singular (moduli)” or “CM” (from “complex multiplication,” which is the way in which the endomorphisms arise). They turn out to be the “special” points in the modular curves (and they are the smallest Shimura varieties, of dimension 0). We shall be more detailed on this in Chapter 4, with an explicit section devoted to a brief review of all of this; see also [Lan73], [Shi94], [Sil92] and [Sil02].

Now, the André-Oort conjecture for (the modular) curves is a trivial statement, since there are no varieties strictly intermediate between points and an irreducible curve. However, if we take the product of two modular curves, such as $\mathbb{A}^1 \times \mathbb{A}^1 = \mathbb{A}^2$, we may choose X as any (irreducible) plane curve in this product, as an intermediate variety. To see the shape of the conjecture in this context, let us briefly discuss the special points and the special curves in $\mathbb{A}^2$. The special points in $\mathbb{A}^2$ are the pairs of special points in $\mathbb{A}^1$. The special curves turn out to be of two types. A first (trivial) type is obtained by taking $\{x\} \times \mathbb{A}^1$ or $\mathbb{A}^1 \times \{x\}$, where $x \in \mathbb{A}^1$ is a special point. A second type is obtained by the modular curves denoted $Y_0(n)$; such a curve is defined by a certain symmetric (if n > 1) irreducible polynomial $\Phi_n(x_1, x_2) \in \mathbb{Q}[x_1, x_2]$, constructed explicitly, e.g., in [Lan73], Ch. 5. The curve $Y_0(n)$ parametrizes (isomorphism classes of) pairs $(E_1, E_2)$ of elliptic curves such that there exists a cyclic isogeny $\phi : E_1 \to E_2$ of degree n. We note that all of these curves contain infinitely many special points; for the first type this is clear, whereas for the second type it suffices to recall that isogenous curves have isomorphic fields of endomorphisms.⁹

⁸ These are algebraic curves, with an affine piece analytically isomorphic to Γ\H, where H is the upper-half-plane and Γ is a so-called congruence subgroup of $SL_2(\mathbb{Z})$.

Coming back to our (irreducible) plane curve $X \subset \mathbb{A}^2$, note that we may view any point $(x_1, x_2) \in X$ as corresponding to a pair of (isomorphism classes of) elliptic curves $E_1, E_2$ with $j(E_i) = x_i$ for i = 1, 2. This point is special if $E_1, E_2$ are both CM-elliptic curves. Note that we may fix the coordinate $x_1$ as we please, but of course this shall determine $x_2$ up to boundedly many possibilities, so if $x_1$ is a special point, it shall be unlikely that some suitable $x_2$ shall also be special. The André-Oort conjecture in this setting states that there are infinitely many special points on the irreducible curve $X \subset \mathbb{A}^2$ precisely if X is of one of the two types we have just described above.

This basic case of the conjecture is the precise analogue for this context of the question of Lang about torsion points on a plane curve, which we have recalled at the beginning, and which provided so much motivation for subsequent work. In this special case the conjecture was proved by André himself in [And98]. (The proof readily generalizes to arbitrary products of two modular curves.) The conclusion may be viewed as the complete description of the fixed algebraic relations holding between infinitely many pairs of CM-invariants. (An attractive example concerns the Legendre curves $E_\lambda : y^2 = x(x - 1)(x - \lambda)$. The result implies the finiteness of the set of complex $\lambda_0$ such that both $E_{\lambda_0}$ and $E_{-\lambda_0}$ have complex multiplication.) We shall reproduce this proof by André in Chapter 4 (together with an effective variation: see the remarks below). Another argument for André’s theorem, by B. Edixhoven, appeared almost simultaneously [Edi98]; however, this was conditional on the Generalized Riemann Hypothesis for zeta-functions of imaginary quadratic fields. A merit of this argument is that it opened the way to generalizations; for instance, Edixhoven treated arbitrary products of modular curves [Edi05], and other extensions were obtained by Edixhoven and A. Yafaev [EY03]. Recently this has also been combined with equidistribution methods (by L. Clozel and E. Ullmo), and a conditional proof of the full conjecture has been announced recently by B. Klingler, Ullmo, and Yafaev. (See also [Noo06] and [Pil11] for other references.) We shall briefly comment on Edixhoven’s method in the notes to Chapter 4.

More recently Pila in [Pil09b] succeeded in applying unconditionally the method we have mentioned while summarizing Chapter 3 (at the basis of the papers [MZ08], [PZ08], and [MZ10b]) to these problems in the André-Oort context. In place of the uniformization of an abelian variety by a complex torus, he used the uniformization of modular curves provided by the modular function. He first obtained in [Pil09b] some unconditional results including André’s original one, but with an entirely different argument. In this same paper he also succeeded in mixing the Manin-Mumford and André-Oort issues in special cases. Moreover, in a more recent paper [Pil11] he goes much further and obtains a combination of the André-Oort, Manin-Mumford, and Lang’s statements for arbitrary subvarieties of $X_1 \times \ldots \times X_n \times E_1 \times \ldots \times E_m \times \mathbb{G}_m^{\ell}$ or of $X_1 \times \ldots \times X_n \times A$, where $X_i$ are modular curves, $E_i$ are elliptic curves, and A is an abelian variety, all defined over $\mathbb{Q}$. The special points are now the products of CM-points (relative to $\mathbb{C}^n$) times torsion points (relative to A or to $E_1 \times \cdots \times E_m \times \mathbb{G}_m^{\ell}$). The special subvarieties are products of special subvarieties of the three main factors $X_1 \times \ldots \times X_n$, $E_1 \times \ldots \times E_m$ or $\mathbb{G}_m^{\ell}$. As to each of these factors, we already mentioned the abelian (resp. toric) case: the special subvarieties are the torsion translates of abelian subvarieties (resp. algebraic subgroups). As to the “modular” factor $X_1 \times \ldots \times X_n$, the special varieties are, roughly speaking, defined by some CM constant coordinates plus several modular relations $\Phi_{n_{ij}}(c_i, c_j)$, relative to vectors $(c_1, \ldots, c_n) \in \mathbb{C}^n$. (If we view the modular curves as quotients of H, the special subvarieties become images of points $(\tau_1, \ldots, \tau_n) \in H^n$ with relations either of the shape $\tau_i = a_i$, $a_i$ fixed quadratic imaginary numbers, or $\tau_r = g_{rs}\tau_s$, $g_{rs}$ fixed in $GL_2^+(\mathbb{Q})$.)

⁹ We mean fields $\mathrm{End}(E) \otimes \mathbb{Q}$.

These results of Pila are quite remarkable, especially taking into account that only a few unconditional cases of the André-Oort conjecture have been published after the one obtained by André.¹⁰ In Chapter 4 we shall sketch Pila’s argument for André’s theorem, in order to illustrate the method and to compare it with André’s. This method of Pila is along the same lines of what we have seen in connection with Masser’s questions; this time he relies on estimates not merely for rational points, but for algebraic points of bounded degree on transcendental varieties (he carries this out in [Pil09a] and [Pil11]). Also, the nature of these varieties is more complicated than before (they are not subanalytic), and in order to apply his results, Pila then relies on the fact that the varieties in question are of mixed exponential-subanalytic type. Substantial effort also comes in characterizing the algebraic part of the relevant transcendental varieties, namely the union of connected semialgebraic positive dimensional arcs contained in it (we have already mentioned that these arcs have to be removed in order to obtain estimates of the sought type). This issue represents the geometric step of the method; it is the part taking into account the special subvarieties. This was a relatively easy matter in the paper [MZ08] but became more difficult in [PZ08] and [MZ10b] and is a major point in [Pil11]. We shall conclude the chapter by a brief discussion of Shimura varieties. In the notes to this chapter we shall briefly discuss Edixhoven’s approach to André’s theorem, and then give a few definitions in the context of definability and o-minimal structures: these concepts concern collections of real varieties satisfying certain axiomatic properties, which are fundamental in the said analysis of rational points (by Pila and Pila-Wilkie).

¹⁰ See, for instance, [Zha05] for some results of Shou-Wu Zhang on quaternion Shimura varieties.

A few words on effectivity. The mentioned method used jointly with Masser and separately with Pila for (relative) Manin-Mumford, and then by Pila toward the André-Oort conjecture, should be in principle effective in itself; but the tools on which it relies may or may not be effective, depending on the cases. For the Manin-Mumford statement and the problems of Masser, the necessary lower bounds for degrees are effective, and the upper bounds coming from the Bombieri-Pila-Wilkie papers should also be effective. (However, this has not yet been formally proved.) On the contrary, the lower bound used in Pila’s proofs concerning the André-Oort statement is not yet known to be effective: this is because it comes from Siegel’s lower bounds for the class number of imaginary quadratic fields. (This leads only to an upper bound for the number of exceptions.) Precisely the same kind of ineffectivity occurs in André’s proof. However, inspection shows that one may in fact dispense with his (ingenious) opening argument, which is the only one using class-number estimates, at the cost of working out with additional precision the last part of his proof. (Here one also has to use effective lower bounds for linear forms in logarithms, in addition to effective lower bounds of Masser for algebraic approximations to values of the inverse of the modular function at algebraic points, a step which already appeared in the proof.) This variation leads to a completely effective statement. This is very recent work of L. Kühne [Küh11], and has been independently realized also in joint work with Yu. Bilu and Masser [BMZ11]; we shall give a sketch of this in Chapter 4, after reproducing André’s proof.
Just to mention an explicit instance, we may think of the plane curve $x_1 + x_2 = 1$, which appeared already in the context of Theorem 1 as a motivating simple example; it might be sensible to take it into account again by asking about the CM-pairs on it: Which are the pairs of CM-invariants $j, j'$ such that $j + j' = 1$? Now, $x_1 + x_2 = 1$ is not one of the $Y_0(n)$, so we know from André that the list of such pairs is finite. In fact, the said effective argument may be carried out explicitly in this case, and actually the special shape of the equation allows us to avoid all the ingredients from transcendental number theory which appear in the treatment of the general case. This leads to a simple elementary argument (carried out independently by Kühne and Masser) showing that There are no such pairs.

For the sake of illustration, here are numerical values of some CM-invariants¹¹ (for which I thank David Masser), which certainly would somewhat suggest a priori that it is difficult to find solutions:

$j(\sqrt{-1}) = 1728$, $j\left(\frac{-1 + \sqrt{-3}}{2}\right) = 0$, $j(\sqrt{-2}) = 8000$, $j(\sqrt{-5}) = 632000 + 282880\sqrt{5}$.

A celebrated example (with which we see that $e^{\pi\sqrt{163}}$ is remarkably near to an integer) is also

$j\left(\frac{-1 + \sqrt{-163}}{2}\right) = -262537412640768000$.

Less well known is (see [Mas03], p. 20)

$j\left(\frac{-1 + \sqrt{-427}}{2}\right) = -7805727756261891959906304000 - 999421027517377348595712000\sqrt{61}$.

See also [Sil02], and further [Ser97], A.4, for the complete table of rational integer values of j at CM-points.

¹¹ See the end of Section 4.2 for some corresponding Weierstrass equations.
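Some of these numerical facts are easy to check by machine. The following sketch is only an illustration under assumptions made here: the list `RATIONAL_CM_J` is the classical table of the thirteen rational values of j at CM-points (the table referred to in [Ser97]), typed in by hand, and the function names are hypothetical. It evaluates the j-invariant of a Weierstrass equation by the formula $j = 1728 \cdot 4a^3/(4a^3 + 27b^2)$, looks for pairs with j + j′ = 1 among the rational CM-invariants only, and measures how close $e^{\pi\sqrt{163}}$ is to the integer 640320³ + 744.

```python
from decimal import Decimal, getcontext
from fractions import Fraction
from itertools import combinations_with_replacement


def j_invariant(a, b):
    """j-invariant of y^2 = x^3 + a x + b, for rational a, b."""
    a, b = Fraction(a), Fraction(b)
    disc = 4 * a**3 + 27 * b**2
    if disc == 0:
        raise ValueError("singular curve: 4a^3 + 27b^2 = 0")
    return 1728 * 4 * a**3 / disc


# Assumption: the classical table of rational CM j-invariants, typed by hand.
RATIONAL_CM_J = [0, 1728, -3375, 8000, -32768, 54000, 287496, -884736,
                 -12288000, 16581375, -884736000, -147197952000,
                 -262537412640768000]


def pairs_summing_to_one(values):
    """All unordered pairs (j1, j2) from values with j1 + j2 == 1."""
    return [(j1, j2)
            for j1, j2 in combinations_with_replacement(values, 2)
            if j1 + j2 == 1]


def near_integer_gap(digits=60):
    """Return (640320^3 + 744) - exp(pi * sqrt(163)), computed with Decimal."""
    getcontext().prec = digits
    pi = Decimal("3.14159265358979323846264338327950288419716939937510")
    value = (pi * Decimal(163).sqrt()).exp()
    return Decimal(640320**3 + 744) - value


if __name__ == "__main__":
    print(j_invariant(1, 0), j_invariant(0, 1))
    print(pairs_summing_to_one(RATIONAL_CM_J))
    print(near_integer_gap())
```

The search over rational values is of course only a tiny part of the statement, since CM-invariants are in general algebraic integers of large degree; the elementary argument of Kühne and Masser treats all of them.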

Appendixes

As announced in the preface, the book is concluded with seven short appendixes, the last six of which were written by David Masser. Let us briefly illustrate the corresponding contents.

Appendix A. This is concerned with the estimates for rational points on subanalytic surfaces, an ingredient at the basis of the method used in Chapter 3 (see the above paragraph “Upper bounds”) and Chapter 4. We shall sketch a proof of a theorem of Pila, basic for the application to Masser’s question in Chapter 3; this result (obtained in [Pil05]) concerns estimates for the number of rational points of denominator ≤ N on the “transcendental part” of a compact subanalytic surface. The method in part follows Bombieri-Pila’s paper [BP89], which treated curves. Although we shall not touch the more general results of Pila needed for the applications of Chapter 4, we hope that this appendix may contribute to illustrate some of the main ideas in the context.

Appendix B (by D. Masser). Recall that Theorem 2 (in the above description of Chapter 1) predicts finiteness for the unlikely intersections of a curve in a torus; the proof yields an estimate for their number, which depends (among others) on the height of the curve. The question arose if this dependence can be eliminated. This appendix presents a deduction by Masser that it is indeed so, provided one assumes Zilber’s conjecture; for simplicity, Masser works with lines in $\mathbb{G}_m^3$. (A generalization of these arguments to arbitrary varieties is also sketched in Subsection 1.3.8 below.)

Appendix C (by D. Masser). Consider an elliptic curve E with a finitely generated subgroup Γ of points, all defined over a rational field K(t) (K being a number field); we assume that E has nonconstant invariant. This appendix presents a direct argument of Masser, proving an upper bound for the height of algebraic values $t_0$ of t for which the rank of the specialized group at $t_0$ decreases. This is a basic case of a general result by Silverman, containing the essentials of it. It is relevant here because a special case is in turn important in the finiteness proofs in Chapter 3 (and is carried out separately therein as Proposition 3.2).

Appendix D (by D. Masser). This presents a sketch of a proof of a lower bound by Masser for the degree d of (the field of definition of) a point of order n on an elliptic curve E defined over $\mathbb{Q}$; this takes the shape $d \ge cn/\log n$, for an explicit positive c = c(E). This is crucial for application to the proofs in Chapter 3. In that context one uses inequality (3.3.2), which essentially follows from the present bound after an explicit estimation of c(E); this last step is indicated as well by Masser in this appendix.

Appendix E (by D. Masser). This appendix deals with transcendence of values of the modular function $j : H \to \mathbb{C}$. Masser starts by proving the theorem of T. Schneider that if j(τ) is algebraic (for a τ ∈ H) then either τ is quadratic or transcendental. Masser also shows how to adapt this proof to a quantitative version in which the distance |τ − β| between τ and an algebraic β is bounded below effectively by $\exp(-c(1 + h(\beta))^k)$; this is important in the proof of André’s theorem. (See Lemma 4.3, where this result is used with $k = 3 + \epsilon$, as in (4.3.6). This supplementary precision was obtained by Masser in [Mas75], but any k is sufficient for application to André’s theorem.)

Appendix F (by D. Masser). We have mentioned the paper [BP89] by Bombieri-Pila on estimates for the distribution of rational points on real curves (discussed also in Appendix A). In this appendix, Masser gives a different argument, stemming from transcendence theory, to obtain similar bounds for the rational points lying on the graph of a transcendental real-analytic function.

Appendix G (by D. Masser). In this last appendix Masser considers a mixed Manin-Mumford-André-Oort statement, similarly to the end of Chapter 4; see especially Remark 4.4.4(ii), discussing Theorem 1.2 in [Pil09b]. Here Masser sketches a third argument for such a result, sticking for simplicity to a special case; he proves that there are only finitely many complex $\lambda \neq 0, 1$ for which the Legendre curve $y^2 = x(x - 1)(x - \lambda)$ has complex multiplication and the point $P = (2, \sqrt{4 - 2\lambda})$ is torsion on it. This proof follows the method discussed in Chapter 3, and, contrary to Pila’s, avoids recourse to Siegel’s class-number estimates, as well as the appeal to the results of Pila [Pil09b] on quadratic points on definable varieties; in this direction it uses only Pila’s results on rational points on subanalytic surfaces (sketched in Appendix A). This approach in particular eliminates the ineffectivity coming from Siegel’s theorem.

Chapter 1

Unlikely Intersections in Multiplicative Groups and the Zilber Conjecture

As anticipated in the introduction, in this first chapter we shall describe some results of unlikely intersections in the case of multiplicative algebraic groups (also called “tori”) $\mathbb{G}_m^n$, together with a sketch of some of the proofs. (The important analogue for abelian varieties shall be discussed later in Chapter 3 with other methods.)

Remark 1.0.1 Algebraic subgroups and cosets. Before going on, it shall be convenient to recall briefly the simple theory giving the structure of algebraic subgroups and cosets of $\mathbb{G}_m^n$. (Simple proofs may be found, e.g., in [BG06], Ch. 3.)

Every algebraic subgroup G of $\mathbb{G}_m^n$ may be defined by equations $x^a = 1$ (on denoting $x^a := x_1^{a_1} \cdots x_n^{a_n}$), where the vector $a = (a_1, \ldots, a_n)$ runs through a lattice $\Lambda = \Lambda_G \subset \mathbb{Z}^n$; of course it suffices to choose equations corresponding to a basis of Λ. The correspondence $G \leftrightarrow \Lambda_G$ is one-to-one. If $\Lambda_G$ has rank r then G has dimension n − r, and is irreducible if and only if $\Lambda_G$ is primitive (i.e., is a factor of $\mathbb{Z}^n$). We say that G is proper if $G \neq \mathbb{G}_m^n$.

Any irreducible algebraic subgroup G is also called a “subtorus” and, setting d := dim G, it becomes isomorphic (as an algebraic group) to $\mathbb{G}_m^d$, the isomorphism being induced by a suitable monomial change of coordinates $x_i \mapsto x^{a_i}$ on $\mathbb{G}_m^n$. Such a G may also be parametrized by $x_i = t_1^{b_{i1}} \cdots t_d^{b_{id}}$, $i = 1, \ldots, n$, $b_{ij} \in \mathbb{Z}$.

The torsion points in $\mathbb{G}_m^n$ are those whose coordinates are roots of unity; they are Zariski-dense in any algebraic subgroup G. Further, any coset of G in $\mathbb{G}_m^n$ may be defined by equations $x^a = c_a$, $a \in \Lambda_G$, for suitable nonzero constants $c_a$; this coset is a torsion coset (i.e., a translate of G by a torsion point) if and only if the $c_a$ are roots of unity (which amounts to the fact that it contains a torsion point). Any torsion coset of an irreducible G is a component of an algebraic subgroup of the same dimension.
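The description of subgroups by lattices can be tried out on torsion points. The following is only a minimal sketch, under the assumption that a torsion point $(e^{2\pi i\theta_1}, \ldots, e^{2\pi i\theta_n})$ is encoded by its rational exponent vector θ; the function names are hypothetical and not taken from the text. With this encoding, $x^a = 1$ holds exactly when $a \cdot \theta$ is an integer.

```python
from fractions import Fraction
from itertools import product

def in_subgroup(theta, lattice_basis):
    """theta encodes the torsion point (exp(2*pi*i*theta_j))_j.
    Return True iff x^a = 1 for every vector a in lattice_basis,
    i.e. iff every dot product a . theta is an integer."""
    return all(
        sum(a_j * t_j for a_j, t_j in zip(a, theta)).denominator == 1
        for a in lattice_basis
    )

def multiply(l, theta):
    """The map [l]: x -> x^l, written on exponent vectors modulo 1."""
    return tuple((l * t) % 1 for t in theta)

def count_torsion_points(N, n, lattice_basis):
    """Count the points of order dividing N in the subgroup of G_m^n
    cut out by the equations x^a = 1, a in lattice_basis."""
    grid = [Fraction(k, N) for k in range(N)]
    return sum(1 for theta in product(grid, repeat=n)
               if in_subgroup(theta, lattice_basis))
```

For instance, in $\mathbb{G}_m^2$ the subtorus $x_1 x_2 = 1$ contains 6 points of order dividing 6, while the reducible subgroup $x_1^2 = 1$ contains 12 of them.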

As usual, we shall denote by [l] the multiplication-by-l map on $\mathbb{G}_m^n$: $[l] : x \mapsto x^l$. It is not too difficult to prove that if X is a nonempty irreducible subvariety of $\mathbb{G}_m^n$ such that $[l]X \subset X$ for an integer l > 1, then X is a torsion coset (and conversely). (See, for instance, [Zan09], Theorem 4.6. A proof may be also obtained by projection, from the simpler case of hypersurfaces, for instance similarly to the argument in [BZ95]; see further Remark 1.1.1 in the next section for some details.)

1.1 Torsion points on subvarieties of $\mathbb{G}_m^n$

Let us start with Lang’s original problem of torsion points on plane curves in $\mathbb{G}_m^2$ and its generalization to higher dimensions, i.e., describing the torsion points on a subvariety $X \subset \mathbb{G}_m^n$. Before offering a general statement in this direction, let us recall that equations in roots of unity go back a long way; for example, P. Gordan studied in [Gor77] the equation $\cos 2\pi x + \cos 2\pi y + \cos 2\pi z = -1$, in rationals x, y, z, with the purpose of classifying finite subgroups of $PGL_2(\mathbb{C})$.¹ Other equations of similar shape arise in the enumeration of polytopes satisfying suitable conditions. See [CJ76] for a brief description of this, and also for a general theory of trigonometric diophantine equations (in the authors’ terminology), aimed to reduce every mixed diophantine equation (in angle-variables and usual variables) to a usual one. That paper was also partly inspired by H.B. Mann’s [Man65], classifying solutions in roots of unity to linear equations with rational coefficients. (We shall see later that these results are relevant also for Lang’s issue, and represent one of the possible tools to achieve a complete solution of it.)

Coming back to Lang’s formulation, as mentioned in the introduction, work of M. Laurent [Lau84] and independently of Sarnak-Adams [SA94] led to the following general result:

Theorem 1.1. Torsion points theorem. Let Σ be any set of torsion points in $\mathbb{G}_m^n(\overline{\mathbb{Q}})$. The Zariski closure of Σ is a finite union of torsion cosets.

This may be reformulated by saying that

The torsion points in a subvariety $X \subset \mathbb{G}_m^n$ all lie and are Zariski-dense in a finite number of torsion cosets contained in X. In particular, if X is irreducible, the torsion points in X are Zariski-dense if and only if X is a torsion coset.²

This result may be proved in several different ways. Since the arguments are elementary and the essentials may be explained in short, we recall some of the possible approaches.

Let first X be an irreducible curve in $\mathbb{G}_m^2$, as in the case of the original problem of Lang, and let $\zeta = (e^{2\pi i r/N}, e^{2\pi i s/N})$ be a torsion point on X, of exact order N (i.e., (r, s, N) = 1). We have to prove that either N is bounded (in terms of X) or X is a torsion coset. If X is defined by an irreducible polynomial equation f(x, y) = 0, to be a torsion coset amounts to f being, up to a monomial factor, of the shape $x^a y^b - \rho$, for $a, b \in \mathbb{Z}$ and ρ a root of unity.

- Intersecting X and [l]X. One approach consists in observing that a conjugation $\sigma \in \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ sends the torsion point ζ to a power $\zeta^l$, where $l = l_\sigma$ may be chosen as any integer coprime to N. Now, if ζ ∈ X, then $\zeta^\sigma = \zeta^l$ lies in $X^\sigma$. If we choose σ fixing a number field k of definition for X,³ we then have $[l]\zeta := \zeta^l \in X$ (where [l] denotes the algebraic map of multiplication by l in $\mathbb{G}_m^n$); and of course $[l]\zeta \in [l]X$. This fact also holds for the conjugates of ζ over k, hence either [l]X and X have a component in common (which amounts to being equal, since they are irreducible) or by Bezout’s theorem we get $[k(\zeta) : k] \le |X \cap [l]X| \le \deg X \cdot \deg([l]X)$.⁴ As recalled in the opening Remark 1.0.1, the first alternative, which amounts to $[l]X \subset X$, may be shown to entail that X is a torsion coset, as in the following:

1 He exploited relations among traces of representing matrices, using that the trace of a matrix of finite order in $GL_2(\mathbb{C})$ is a sum of two roots of unity. There are also other methods, however, to achieve the said classification.

2 This result may be generalized to produce a description of the torsion points of $\mathbb{G}_m^d$, which lift to points defined over the cyclotomic closure of the ground number field, under a (ramified) finite-degree cover $\pi : X \to \mathbb{G}_m^d$. In turn, this has applications to Hilbert irreducibility. See [DZ07], [Zan10], and Note 2 to Chapter 3.

3 Note that we may indeed suppose that X is defined over $\overline{\mathbb{Q}}$, else it contains only finitely many algebraic points, in particular torsion points.

4 Equivalently, in this kind of argument one may note that $\zeta \in X \cap [l]^{-1}X$ and work by comparing X and $[l]^{-1}X$.

Remark 1.1.1 [l]X ⊂ X implies that X is a torsion coset. For the present case of curves this implication may easily be proved directly: for instance, note that if $[l]X \subset X$ then $f(\theta x, \eta y)$ divides $f(x^l, y^l)$ for all l-th roots of unity θ, η. Comparison of degrees then shows that X is invariant by at least l distinct multiplicative translations $(x, y) \mapsto (\theta x, \eta y)$, where (θ, η) has order dividing l (and varies in a set of ≥ l elements). Replacing l by powers $l^m$, m = 1, 2, . . ., one deduces that X is invariant under multiplicative translations $X \to \tau \cdot X$ by an infinite set of $\tau \in \mathbb{G}_m^2$; and certainly it is invariant by multiplicative translation by any element in the Zariski-closure of that set. This Zariski-closure must contain a curve Z, and $Z \cdot X \subset X$, so X must be a translate of Z and now one easily deduces $Z \cdot Z \subset \tau Z$ and the sought conclusion follows. Such a strategy applies in any dimension; see also [Zan09], Theorem 4.6, for a different general argument, and see further [Hin88], Lemme 10, for another method relying on degrees and valid for any extension of a complex abelian variety by a torus. (Still another, p-adic analytic, argument related to the Skolem-Mahler-Lech theorem is given in the appendix to [Lan65].)

In the other cases, the estimate $\deg([l]X) \le l \cdot \deg X$⁵ yields $[k(\zeta) : k] \le l \cdot (\deg X)^2$. On the other hand, $[k(\zeta) : k] \ge \phi(N)[k : \mathbb{Q}]^{-1}$, so on choosing l “small” and such that σ fixes⁶ k we deduce that N is bounded, concluding the argument.
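To make the last step explicit, the two estimates may be combined as follows. This is only a sketch: it assumes the standard lower bound $\phi(N) \gg_\epsilon N^{1-\epsilon}$ and a choice of l of size $\ll_{k,\epsilon} N^{\epsilon}$ (as indicated in footnote 6), for a fixed $0 < \epsilon < 1/2$.

```latex
\[
\frac{\phi(N)}{[k:\mathbb{Q}]} \;\le\; [k(\zeta):k] \;\le\; l\cdot(\deg X)^2
\quad\Longrightarrow\quad
N^{1-\epsilon} \;\ll_{k,\epsilon}\; N^{\epsilon}\,(\deg X)^2
\quad\Longrightarrow\quad
N \;\ll_{k,\epsilon}\; (\deg X)^{2/(1-2\epsilon)} .
\]
```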

This approach is the one adopted in the first solutions to Lang’s question by J-P. Serre and by J. Tate. (Lang’s paper [Lan65] reproduces this, together with another argument by Y. Ihara; see also [Lan83], Sec. 8.6.)

Finally, note that actually for this argument we do not need that $X^\sigma = X$: in fact, arguing as above would lead either to the sought bound or to $[l]X \subset X^\sigma$; but an iteration of this yields $[l^m]X \subset X$ for some m ≥ 1, so again we deduce that X is a torsion coset. This observation is relevant if we want to forget about the field of definition of X; in fact, a detailed analysis on these lines, with careful choice of σ, was carried out by F. Beukers and C. Smyth [BS02]; in particular, they use that any root of unity ζ has a conjugate of the shape −ζ or $\pm\zeta^2$, and obtain upper bounds on the number of torsion points remarkably dependent only on the degree of X (in fact, bounded by 22 times the area of the Newton polygon of f(x, y)).

- Constructing auxiliary monomials. Another, partially similar, approach, due to Liardet, consists in constructing a nontrivial monomial $\mu = x^a y^b$, $a, b \in \mathbb{Z}$, of “small” degree, such that µ(ζ) = 1; we want $ar + bs \equiv 0 \pmod N$ and by an easy application of the pigeon-hole principle this may be done with $0 < \deg(\mu) := \max(|a|, |b|) \le \sqrt{N}$: consider, for instance, the $([\sqrt{N}] + 1)^2$ pairs $(u, v) \in \mathbb{Z}^2$ with $0 \le u, v \le \sqrt{N}$, and take the difference of two pairs producing a same value $ur + vs \pmod N$. Then ζ lies in the intersection of X with the curve H defined by µ = 1 (which is an algebraic subgroup of $\mathbb{G}_m^2$), and the same holds for the conjugates of ζ over k. By Bezout’s theorem again, one may conclude that either $\phi(N) \ll_k \sqrt{N} \deg X$ or H contains X. In the first case, N is bounded. In the second case, i.e., X ⊂ H, we again easily find (on factoring $x^a y^b - 1$) that X must be of the predicted special shape.

Note that in this argument the bound takes the shape $N \ll_{k,\epsilon} (\deg X)^{2+\epsilon}$ for every $\epsilon > 0$, where the exponent ‘2’ is best-possible.⁷ (See [Zan09], Prop. 4.1 and Ex. 4.4, for more details.)
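The pigeon-hole construction of the auxiliary monomial is entirely explicit, and can be sketched as follows. The function name is hypothetical; the code simply runs through the $([\sqrt{N}] + 1)^2$ pairs (u, v) described above and returns the difference of the first two pairs giving the same residue of ur + vs modulo N.

```python
from math import isqrt

def auxiliary_monomial(r, s, N):
    """Return integers (a, b) != (0, 0) with a*r + b*s = 0 (mod N)
    and max(|a|, |b|) <= floor(sqrt(N)), found by the pigeon-hole principle."""
    m = isqrt(N)
    seen = {}  # residue of u*r + v*s mod N  ->  first pair (u, v) producing it
    for u in range(m + 1):
        for v in range(m + 1):
            residue = (u * r + v * s) % N
            if residue in seen:
                u0, v0 = seen[residue]
                return (u - u0, v - v0)
            seen[residue] = (u, v)
    # There are (m + 1)^2 > N pairs but only N residues, so a collision must occur.
    raise AssertionError("unreachable")
```

The monomial $x^a y^b$ so obtained takes the value 1 at $\zeta = (e^{2\pi i r/N}, e^{2\pi i s/N})$.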

5 For instance, this may be seen by Bezout’s theorem, on intersecting a (general) line with [l]X: this yields an equation $ax^l + by^l + c = 0$, where (x, y) ∈ X. Instead, $\deg([l]^{-1}X) = |\{(x, y) : ax + by + c = 0, (x^l, y^l) \in X\}|$.

6 It is easy to see, by means of a simple “Eratosthenes sieve,” that, for instance, l can be taken $\ll_{k,\epsilon} N^{\epsilon}$.

7 For this, take X defined by $f(x, y) = \sum_{n=0}^{p-1} x^q y^r$, where p is a prime, $n = qR + r$, $R = [\sqrt{p} + 1]$ and $0 \le r \le [\sqrt{p}]$. Then X may be proved to be irreducible with $\deg X < 2\sqrt{p}$ and X has the torsion point $(\zeta_p^R, \zeta_p)$ of order p.

Remark 1.1.2 Taking conjugates. We remark at once that the idea of taking conjugates, present in both proofs that we have seen, shall appear again in different shapes in many of the problems considered in our future discussions, and has proved to be fundamental in the context. Often, Bezout’s theorem may then be used as above, to compare estimates; eventually one concludes that either the degree (over $\mathbb{Q}$) of the torsion point is bounded (and then the same holds for its order N) or we have a geometrical consequence for which we fall in the special varieties. (In the present case they are the torsion cosets.) For instance, an advanced version of this kind of idea, relying on deep results by Serre on homotheties in the image of Galois on torsion points, led M. Hindry [Hin88] to a quantitative proof of the Manin-Mumford statement (to be discussed later in Chapter 3 with different methods).

- Considering heights. Still another approach, working for arbitrary X and ambient dimension n, comes from considerations of heights; we explain the essentials. Let f(ζ) = 0, where $f \in \mathbb{Z}[x]$⁸ and $\zeta \in \mathbb{G}_m^n$ is a torsion point. Then we have by Fermat’s Little Theorem, $f(\zeta^p) \equiv f(\zeta)^p = 0 \pmod p$ (in the ring $\mathbb{Z}[\zeta]$). Now, $\zeta^p$ has zero Weil height,⁹ so if p is large enough with respect to f, this congruence yields $f(\zeta^p) = 0$. Hence we gain a new equation; actually, applying this argument to a set of defining equations for X shows that if the torsion points are Zariski-dense in X we must have $X \subset [p]^{-1}X$; but this implies $[p]X \subset X$ for large p, which as already noted in turn implies that X is special. This procedure may be carried out for arbitrary varieties, as done in [BZ95], and leads more generally to a positive lower bound for the height of algebraic points in X, valid outside the translates of torsion cosets (i.e., a solution of the toric case of Bogomolov problems, first obtained by S. Zhang, recalled below in Remark 1.1.7); see [BZ95] or [Zan09] for the details of this approach.

Remark 1.1.3 Weil height. We have referred here to the logarithmic Weil absolute height on $\overline{\mathbb{Q}}$. We refer, e.g., to [BG06] for this, but we recall here that this height may be defined as follows: if $\alpha \in \overline{\mathbb{Q}}$ has minimal polynomial $a_0(x - \alpha_1) \cdots (x - \alpha_d)$ over $\mathbb{Z}$, we have $d\,h(\alpha) = \log|a_0| + \sum_{i=1}^{d} \max(0, \log|\alpha_i|)$. One usually writes $H(\alpha) := \exp h(\alpha)$. Another, equivalent, definition is in terms of absolute values: $H(\alpha) = \prod_{v \in M_k} \sup(1, |\alpha|_v)$, where k is any number field containing α and v runs through the places of k, suitably normalized in terms of k (which is important to make the definition independent of k). For instance, the height of a rational number p/q in lowest terms is $H(p/q) = \max(|p|, |q|)$. This height satisfies $h(x^{-1}) = h(x)$, $h(xy) \le h(x) + h(y)$, $h(x + y) \le h(x) + h(y) + \log 2$. It vanishes precisely at 0 and at the roots of unity. (See [BG06].) There is also a related notion of Weil height $h(x_0 : \ldots : x_n)$ for algebraic points in projective spaces $\mathbb{P}_n$: see (2.1.1); we have $h(x_0 : x_1) = h(x_1/x_0)$ if $x_0 \neq 0$.

For an algebraic point $z = (x_1, \ldots, x_n) \in \mathbb{G}_m^n(\overline{\mathbb{Q}})$, we may set $h(z) := \sum_{i=1}^{n} h(x_i)$. Such a height defines a semidistance on $\mathbb{G}_m^n(\overline{\mathbb{Q}})$ and a distance on $\mathbb{G}_m^n(\overline{\mathbb{Q}})/\text{torsion}$. We further recall the easy but very important Northcott’s theorem, which states that there are only finitely many algebraic points of bounded height and bounded degree.

There is a whole geometric theory of heights, initiated by A. Weil, which is of the utmost importance in arithmetic, for which we refer to [BG06]; in this view, a height on a variety V may be constructed by restriction, once V is embedded in some projective space, and in turn these embeddings are associated to divisors on V. For instance, in the case of abelian varieties, all of this leads to a notion of “canonical height,” more sophisticated than the above one, usually called “Néron-Tate height.” See further the survey by Zhang [Zha98b] for the viewpoint of Arakelov’s theory.
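For rational numbers the height is completely elementary, and the stated properties can be checked numerically. The following is a minimal sketch restricted to $\mathbb{Q}$ (it does not treat general algebraic numbers); the names `H` and `h` mirror the notation of the remark.

```python
from fractions import Fraction
from math import log

def H(x):
    """Multiplicative Weil height of a rational number: H(p/q) = max(|p|, |q|),
    with p/q in lowest terms (Fraction normalizes automatically)."""
    return max(abs(x.numerator), abs(x.denominator))

def h(x):
    """Logarithmic Weil height h(x) = log H(x) of a rational number."""
    return log(H(x))
```

With this definition, h vanishes on the rationals 0, 1 and −1 only, in agreement with the fact that ±1 are the only rational roots of unity.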

- Considering degrees in cyclotomic fields. Let us mention a further approach, perhaps even simpler than the above ones and working for arbitrary X and ambient dimension n; it comes from the following elementary result of H.B. Mann [Man65] on linear relations among roots of unity: If $a_0 + a_1\zeta_1 + \ldots + a_r\zeta_r = 0$ is a relation with $a_i \in \mathbb{Q}^*$ and the $\zeta_i$ roots of unity of exact common order N, then either some proper subsum also vanishes, or N is squarefree and any prime divisor p of N satisfies p ≤ r + 1.

8 Indeed, one may either reduce to rational coefficients or argue similarly over a number field.

9 See next Remark 1.1.3 for more on this notion.

This was later refined in [CJ76] (to take into account all primes dividing N) and in turn extended to number fields, with a different proof, in [DZ00]. As in Mann’s argument, this may be proved on writing the $\zeta_i$ as products of roots of unity of prime-power order, and using knowledge of (relative) degrees of cyclotomic fields; e.g., one observes that a primitive p-th root of unity has minimal polynomial $x^{p-1} + \ldots + x + 1$ over the field generated over $\mathbb{Q}$ by roots of unity of order prime to p. (The above conclusion does not hold in positive characteristic, since, for instance, every nonzero element in a finite field is a root of unity; however, see [DZ02] for an analogue of Mann’s and Conway-Jones’s results, concerning linear relations holding modulo a prime number.) Now, even if the said relation has some vanishing subsums, Mann’s result, applied to a minimal vanishing subsum, leads to a bound depending only on r for the minimum order of the ratios $\zeta_i/\zeta_j$, $i \neq j$. (See also [SA94] for a related result.)

This result concerns linear relations, but one may use a Veronese embedding of X to deal with the general case. In fact, let $f_i = 0$ be a defining system of polynomial equations for X (defined over $\overline{\mathbb{Q}}$, as one may suppose); on viewing the equations $f_i(\theta_1, \ldots, \theta_n) = 0$ in roots of unity $\theta_j$ as linear equations in the monomials appearing in $f_i$ (Veronese!), one can apply Mann’s result; this yields that some nontrivial monomial (taken from a fixed finite set) in $\theta_1, \ldots, \theta_n$ equals 1. Equivalently, $(\theta_1, \ldots, \theta_n)$ lies in one at least among finitely many proper algebraic subgroups depending only on X; each of these subgroups is isomorphic to a finite union of torsion translates of some $\mathbb{G}_m^h$, for some h < n, which allows one to prove Theorem 1.1 after some further rather straightforward work by induction on dimensions.

From a quantitative viewpoint, we also mention that one may carry out explicit upper bounds for the number of solutions of a linear equation as above, in roots of unity $\zeta_i$ such that no proper subsum vanishes (which also means $a_0 \neq 0$). Some of these bounds are especially significant, in that they hold for arbitrary complex coefficients and do not depend on them but only on r. Among the first such bounds are the ones of Schlickewei [Sch96] (improved later by Evertse [Eve99]), and by Bombieri and the author [BZ95] (who deal also with points of small height). These results lead to uniform bounds (depending only on n and deg X) for the number of components (torsion cosets) in the Zariski-closure of the set of torsion points on a variety $X \subset \mathbb{G}_m^n$ (as in [BZ95]).
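Mann’s result quoted above can be observed experimentally in the smallest case r = 2 with all coefficients equal to 1. The sketch below (a hypothetical helper, using floating-point arithmetic with a tolerance rather than exact cyclotomic arithmetic) searches for relations $1 + \zeta_1 + \zeta_2 = 0$ among the M-th roots of unity; Mann’s statement predicts that only roots of unity of order dividing 6 can occur.

```python
import cmath

def vanishing_triples(M, tol=1e-9):
    """Return all pairs (j, k), 0 <= j, k < M, such that
    1 + e(j/M) + e(k/M) = 0 up to tol, where e(t) = exp(2*pi*i*t)."""
    roots = [cmath.exp(2j * cmath.pi * j / M) for j in range(M)]
    return [(j, k) for j in range(M) for k in range(M)
            if abs(1 + roots[j] + roots[k]) < tol]
```

For M = 60 the only solutions found are the two orderings of the primitive cube roots of unity, in agreement with the prediction.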

- Galois equidistribution. Let, as above, $a_0 + a_1\zeta_1 + \ldots + a_r\zeta_r = 0$, with $a_i \in \mathbb{Q}^*$ and $\zeta_i$ roots of unity. Another method to prove that the minimum order of the $\zeta_i/\zeta_j$, $i \neq j$, is bounded (in terms of r) comes also on exploiting that the conjugates of a root of unity of large order tend to be uniformly distributed in the unit circle. For instance, taking into account that $\int_0^1 \exp(2\pi i\alpha)\,d\alpha = 0$, we should expect the mean value of the said conjugates to be “small”. Indeed, one can note that the trace $\mathrm{Tr}^{\mathbb{Q}(\theta)}_{\mathbb{Q}}(\theta)$ of a root of unity θ of order n equals the value $\mu(n) \in \{0, \pm 1\}$ of the Möbius function, and so is small compared to the degree φ(n). Then, let us assume that in the said relation $a_0$ has maximum modulus among the $a_i$ (which can be achieved after division by some root of unity); taking the trace of the relation, from $\mathbb{Q}(\zeta_1, \ldots, \zeta_r)$ to $\mathbb{Q}$, one then easily deduces that $1 \le \sum_{i=1}^{r} \phi(n_i)^{-1}$, where $n_i$ is the order of $\zeta_i$. This implies that some $n_i$ is bounded in terms only of r, as wanted. One can then continue as before using a Veronese map.

This simple equidistribution principle represents a germ at the basis of several modern sophisticated methods, which we shall sometimes allude to in the course of our future discussions. (See especially the survey [Zha98b] by Zhang.)
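The identity between the trace of a root of unity and the Möbius function is easy to test numerically. The sketch below (hypothetical function names, floating-point arithmetic) sums the primitive n-th roots of unity, i.e. the conjugates of $e^{2\pi i/n}$, and rounds the result to the nearest integer.

```python
import cmath
from math import gcd

def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n is divisible by p^2
            result = -result
        p += 1
    if n > 1:  # one remaining prime factor
        result = -result
    return result

def trace_of_root_of_unity(n):
    """Trace from Q(theta) to Q of a root of unity theta of exact order n:
    the sum of all primitive n-th roots of unity, rounded to an integer."""
    total = sum(cmath.exp(2j * cmath.pi * k / n)
                for k in range(1, n + 1) if gcd(k, n) == 1)
    return round(total.real)
```

The sum has φ(n) terms of modulus 1, yet its value never exceeds 1 in absolute value; this is the cancellation that the equidistribution heuristic predicts.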

- Counting rational points on real-analytic curves. There is a further method for the said problems; this is rather more involved than the ones we have seen, but it applies in other contexts at the heart of this book, as in Chapters 3 and 4. So it may be worthwhile to present at once here, in a few words, one of the simplest descriptions of this method.

We illustrate the idea for the original question of Lang concerning an irreducible plane curve f = 0, which we assume not to be of the said special shape. The equation $f(\exp 2\pi i\theta, \exp 2\pi i\eta) = 0$, $\theta, \eta \in \mathbb{R}$, defines a real-analytic variety $\widetilde{Z}$ in $\mathbb{R}^2$, possibly finite, or of dimension 1 (see Example 1.1 below); the torsion points that we are seeking correspond to rational points on $\widetilde{Z}$, and we may in fact restrict to $Z := \widetilde{Z} \cap [0, 1]^2$. We observe that Z is compact, since it is closed in the unit compact square.¹⁰ We may assume that Z is a curve and that it does not contain any line segment, otherwise f = 0 would define an algebraic coset, which we are excluding.¹¹ Hence (with the aid, e.g., of Puiseux series) we may express the compact Z as a finite union of graphs of strictly convex differentiable functions (with respect to respective suitable linear changes of coordinates).

Now, a torsion point of order n corresponds to a rational point of exact denominator n on Z. If there is such a point, by conjugating over a number field of definition for f, we obtain (as in previous proofs) at least $\gg \phi(n) \gg_\epsilon n^{1-\epsilon}$ such points on Z. To obtain a bound for n, it thus suffices to prove that Z cannot contain so many rational points of denominator dividing n. For this task, rather subtle methods are now available, stemming from a paper of Bombieri-Pila [BP89], which we shall widely mention and discuss in Chapters 3 and 4 and in Appendix A. For the present modest purpose it is sufficient to use a precursor result by Jarnik [Jar26], who proved that a strictly convex arc y = f(x) of length L contains at most $3(16\pi)^{-1/3}L^{2/3} + O(L^{1/3})$ integral lattice points. (Here the constant in the O is absolute; see next remark for a sketch of proof.) This estimate concerns integral points, but by dilation we immediately obtain that a convex arc A : y = f(x) in $[0, 1]^2$ contains at most $\ll n^{2/3}$ rational points of denominator n. Now, in the above discussion it then suffices to choose ε as any positive number < 1/3 to obtain a bound for n, and the sought conclusion.
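The counting problem at the end of this argument can be explored on a toy example. The sketch below (hypothetical names; it counts points whose coordinates have denominator dividing n, and takes the parabola $y = x^2$ as an illustrative strictly convex arc, which is an assumption of this example and not a curve from the text) uses exact rational arithmetic.

```python
from fractions import Fraction

def count_points_on_graph(g, n):
    """Count the points (a/n, g(a/n)), 0 <= a <= n, lying in [0, 1]^2
    whose second coordinate is also a rational with denominator dividing n.
    g must map Fractions to Fractions."""
    count = 0
    for a in range(n + 1):
        y = g(Fraction(a, n))
        if 0 <= y <= 1 and (y * n).denominator == 1:
            count += 1
    return count

def parabola(x):
    """The strictly convex arc y = x^2 on [0, 1]."""
    return x * x
```

For n = 36 one finds 7 such points on the parabola (those with a divisible by 6), far fewer than the roughly $n^{1-\epsilon}$ points that the conjugation argument would produce.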
See also the survey paper [Sca11a] for a more general illustration of this method in the toric context (especially from the viewpoint of model theory).

10 Of course, we are strongly using throughout the fact that f is a polynomial; for instance, this setting would not apply to $y = x^\alpha$ with real irrational α.

11 In fact, if Z contained a segment of the line η = aθ + b, then the function $x^a$ would be algebraic, so a would be rational, and $f(x, \beta x^a) = 0$, where $\beta = e^{2\pi i b}$; in turn, f would be a binomial.

Remark 1.1.4 Jarnik’s theorem. We sketch a simple argument (similar to the original one) for a result of Jarnik’s type, but slightly weaker. We order the integer points $P_1, \ldots, P_k$ on y = f(x) by increasing abscissa. The differences $\Delta_i := P_{i+1} - P_i$ are integer vectors, pairwise distinct (by convexity), with positive abscissa and such that $\sum_{i=1}^{k-1} |\Delta_i| \le L$. Let Q(m) denote the number of integer vectors of positive abscissa and squared-length m. Clearly, by the above, if s is the largest integer such that $\sum_{m \le s^2} Q(m) < k$, we must have $\sum_{m \le s^2} \sqrt{m}\,Q(m) \le L$.

Now, by counting lattice points in the half-disk of radius r, we have $\sum_{m \le r^2} Q(m) \sim \frac{\pi}{2} r^2$. Hence, $s \sim (2k/\pi)^{1/2}$. Also, by partial summation we have $\sum_{m \le s^2} \sqrt{m}\,Q(m) \sim \frac{\pi}{3} s^3$. All of this easily yields $k \le (c + o(1))L^{2/3}$, where $c = (9\pi/8)^{1/3}$. (In this last argument, some of the vectors involved in the various Q(m) are parallel, so should not be counted more than once; this explains the better constant in Jarnik’s result, which can be obtained with a small refinement of the above.)

Example 1.1 Intersections with products of circles. Especially in connection with the last sketched method, we note that the torsion points in the original problem of Lang for $\mathbb{G}_m^2$ yield points in the intersection of the curve $X \subset \mathbb{G}_m^2$ with $S_1 \times S_1$ where $S_1 = \{z : |z| = 1\} \subset \mathbb{G}_m(\mathbb{C})$ is the unit circle. Note that $S_1 \times S_1$ is a “natural” set here: it is the closure of the set of torsion points in $\mathbb{G}_m^2$ for the complex topology and also the maximal compact subgroup of $\mathbb{G}_m^2$. We observe that these intersections are also unlikely to be infinite in number, since both $S_1 \times S_1$ and X(C) have real dimension 2, in the 4-dimensional real variety $\mathbb{G}_m^2(\mathbb{C})$; at first sight this could motivate the thinking that these more general intersections already make up a finite set. However, this is not true in general; indeed, writing $S_1^2 = \{(e^{2\pi i\theta}, e^{2\pi i\phi}) : \theta, \phi \in [0, 1)\}$, there are curves X, not of special type (i.e., not algebraic cosets), for which the said intersection actually contains a whole arc of a real-analytic curve in the (θ, φ)-plane: examples in this sense may be constructed by taking any rational function $\rho \in \mathbb{C}(z)$ sending the unit circle into itself (e.g., any product of Möbius transformations $z \mapsto \frac{az + b}{\bar{b}z + \bar{a}}$, $|a| \neq |b|$) and considering the graph X := {(z, ρ(z))}. (Still, of course, this graph contains only a finite number of torsion points if ρ is not a monomial; this corresponds to the fact that the real-analytic arc mentioned above, defined by $2\pi i\phi = \log(\rho(e^{2\pi i\theta}))$, θ, φ ∈ [0, 1), contains only finitely many rational points.) One can construct even more general examples: in any case, they come from nonspecial curves $X \subset \mathbb{G}_m^2$ such that $[-1]X^c = X$, where c denotes complex conjugation (but this is not a sufficient condition); such curves may be defined by equations of the shape $g(x, y) + \bar{g}(x^{-1}, y^{-1}) = 0$ for a rational function g. Of course there are analogues in higher dimensions.
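That the Möbius transformations of Example 1.1 send the unit circle into itself can be checked numerically. The following is a minimal sketch with a hypothetical helper name; the particular values of a and b used in the usage note are arbitrary choices satisfying |a| ≠ |b|.

```python
import cmath

def circle_preserving_map(a, b):
    """Return the Moebius transformation z -> (a*z + b) / (conj(b)*z + conj(a)).
    For |a| != |b| the denominator does not vanish on the unit circle."""
    a, b = complex(a), complex(b)
    def rho(z):
        return (a * z + b) / (b.conjugate() * z + a.conjugate())
    return rho
```

For example, with a = 2 + i and b = 0.5 − 0.3i, every point $e^{2\pi i\theta}$ is mapped to a point of modulus 1 (up to rounding), so the graph of ρ meets $S_1 \times S_1$ in a whole real curve.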

We conclude this section with a few more remarks, mentioning some important issues related to the present context.

Remark 1.1.5 Abelian varieties and other algebraic groups. Completely similar results are known, due to M. Raynaud, for abelian varieties over a field of characteristic zero (see Theorem 3.1). A main motivation this time came from a question raised independently by Manin and by Mumford, predicting finiteness for the set of torsion points on a curve of genus at least 2, embedded in its Jacobian. (Actually, Mumford’s question motivated and influenced some of Lang’s problems and conjectures: see Lang’s comments in his paper [Lan65].) The corresponding theorems lie distinctly deeper than the toric case. Raynaud’s subtle proof was local and, so to say, by reduction modulo $p^2$; he exploited the fact that multiplication by p on the points defined over a p-adic ring strongly restricts the reduction mod $p^2$. (Just as a simple instance, note that there are only p reductions mod $p^2$ of p-adic integers of the shape $a^p$, $a \in \mathbb{Z}_p$.) As mentioned in Remark 1.1.2, another proof, similar in nature to the first one sketched for Lang’s question, is due to Hindry [Hin88]. (His proof was quantitative and treated arbitrary commutative algebraic groups.) We have also briefly discussed such a topic in the abelian context in the introduction, and shall return to it with some detail in Chapter 3, with a sketch of proof of the original conclusion foreseen by Manin-Mumford by means of a recent method.

All of this may be extended also to semiabelian varieties (see the next remark for references), i.e., extensions G of an abelian variety A by some torus $\mathbb{G}_m^n$. (Namely, G is a commutative algebraic group for which there is an exact sequence $0 \to \mathbb{G}_m^n \to G \to A \to 0$. See [Ser88], VII.3, for the algebraic theory.) Here is a simple, albeit interesting and illustrative case of Manin-Mumford for a product $G = \mathbb{G}_m \times E$, where E is a complex elliptic curve. This G may be described by coordinates (u, x, y), where $y^2 = x^3 + ax + b$ is an equation for E. If we take the curve u = x inside G, the relevant statement becomes: there are only finitely many torsion points $(x_0, y_0) \in E$ for which $x_0$ is a root of unity. (See Subsect. 3.4.1 for a relative version of this.)

Further, one can consider extensions of an abelian variety by the additive group $\mathbb{G}_a$; of course, the fact that (in characteristic 0) $\mathbb{G}_a$ has no torsion changes some features of the problem: for instance, when the extension is split, i.e., a product $\mathbb{G}_a \times A$, then the torsion points are in {0} × A and we reduce to the abelian case. For nonsplit extensions, however, the issue seems to introduce nontrivial new points. (I owe to D. Bertrand the comment that this could be dealt with by the methods of Chapter 3.) Anyway, the general case of an arbitrary commutative algebraic group (in char. 0) is completely treated by Hindry in the already quoted article [Hin88].¹²

On the contrary, the case of noncommutative linear algebraic groups seems not to have been explicitly investigated and actually presents different aspects. For instance, in such groups the torsion points of given order are not a finite or discrete set but may make up whole families, obtained, for instance, as conjugacy classes of given torsion elements. Still, in subvarieties of $GL_n$ one can study the torsion points via the eigenvalues, which must be roots of unity. This allows one to reduce some relevant problems to the analysis of torsion points on subvarieties of $\mathbb{G}_m^n$, using the above-presented results.
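The simple instance mentioned in connection with Raynaud’s proof can be verified by direct computation. The sketch below (hypothetical function name) uses that the reduction mod $p^2$ of $a^p$ depends only on a mod $p^2$, so that it suffices to let a run over the residues modulo $p^2$.

```python
def pth_power_residues(p):
    """Set of residues mod p^2 taken by a^p, for a running over Z/p^2."""
    return {pow(a, p, p * p) for a in range(p * p)}
```

For p = 3 one obtains the three residues 0, 1 and 8 modulo 9; in general exactly p of the $p^2$ residue classes occur.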

Remark 1.1.6 Intersections with subgroups of finite rank. For the sake of completeness, we also mention very briefly some other most important developments arising from this context, which, however, we shall not discuss directly: Lang considered not merely torsion points on a variety, but the points in

G

12 Concerning an extension G of a complex elliptic curve by a , joint work in progress with Corvaja and Masser should show that if X is a curve in G, not a translate of an algebraic subgroup, then already the intersection of X with the topological closure of the set of torsion points of G is finite. At the light of Example 1.1 above, this shows a discrepancy with the case of 2m .

G

22

Chapter 1

G

X ∩ Γ where Γ is a subgroup of finite rank, either of n m or of an abelian variety A. (Note that the case of torsion points occurs when the rank is zero.) Lang was first concerned merely with finitely generated Γ, and the transition to “Γ of finite rank” was seemingly suggested to him by the above-mentioned Mumford’s “Jacobian” question: see Lang’s introductory comments in [Lan65]. Lang again expected the Zariski closure of X ∩ Γ to be a finite union of translates (not necessarily torsion) of algebraic subgroups. See [Lan65] for these conjectures and for a proof in the case of plane curves in tori; the case of finitely generated groups had been proved earlier by Lang in [Lan60]; for these theorems he relied on deep results of diophantine approximation. In the general case of tori of any dimension, this conjecture was proved by Liardet for the case of curves (see [Lan83] for an account) and by M. Laurent [Lau84] for arbitrary subvarieties; Laurent relied on the difficult subspace theorem of W.M. Schmidt (see [BG06]). The conjecture was proved by G. Faltings in the abelian case (after a method introduced by P. Vojta for the case when X is a curve); note that this abelian case contains the Mordell conjecture as a rather special case (which had also been proved previously by Faltings with different methods). See [BG06] for Bombieri’s proof of the Mordell conjecture by this method of Vojta. Subsequently, M. McQuillan worked on the topic, in particular combining all of this and proving the conjecture for semiabelian varieties (see [McQ95]). These more general problems of Lang on intersections with groups of finite rank shall implicitly appear in some of the problems discussed in the next sections below.


Remark 1.1.7 Bogomolov problems. Recall that the torsion points in $\mathbb{G}_m^n$ are those of zero Weil height. As pointed out by F. Bogomolov, one can consider more generally points of “small” Weil height. In this view, Theorem 1.1 was extended in [BZ95] with the “height” method just sketched in one of the approaches above: one proves that there exists a number $c(X) > 0$ such that all the points $P \in X(\overline{\mathbb{Q}})$ with height $h(P) < c(X)$ are contained in finitely many torsion cosets in $X$ (and are Zariski-dense therein). This is analogous to a conjecture explicitly raised by Bogomolov, about points of small Néron-Tate height on an abelian variety, refining the Manin-Mumford conjecture. For the case of tori, this result was first proved by S. Zhang [Zha92] prior to [BZ95] by different and more sophisticated ideas from Arakelov theory. (See further the survey [Zha98b] by Zhang, also for further precision on Bogomolov’s question and on Zhang’s results.) Subsequently, Bilu [Bil97] gave a further different proof based on Galois equidistribution. The paper [BZ95] also obtained uniform lower bounds, at the cost of excluding all the positive dimensional translates (not necessarily by torsion points) of algebraic subgroups lying in $X$. These more general translates shall further appear and play a significant role below.¹³ Generally speaking, the whole investigation of small points (in the toric context) turned out to be related to and highly motivating for several questions considered next in this chapter.

The original problem of Bogomolov for the abelian context was solved later than the toric case by E. Ullmo [Ull98] (for curves inside an abelian variety) and by Zhang [Zha98a] and L. Szpiro-Ullmo-Zhang [SUZ97] generally, again with methods from Galois equidistribution (see [Zha98b] for a brief account). The case of semiabelian varieties was dealt with by A. Chambert-Loir in [CL00] (suitable assumptions are now necessary for analogous conclusions). B. Poonen combined in a single statement these viewpoints with the ones of the previous remark. We shall soon see in the next section how various kinds of lower bounds for heights are relevant for applications in the present context. We shall not pause further on these issues for now, despite their importance.

1.2 Higher multiplicative rank

The torsion points constitute a multiplicative group of rank 0 and those of given order make up a corresponding (finite) algebraic subgroup of dimension 0. So in Lang’s question we are just intersecting $X$ with the family of such algebraic subgroups. It can then be asked what happens more generally on intersecting $X$ with the union of algebraic subgroups of fixed dimension $d > 0$; by the very choice of our terminology, we shall continue to have unlikely intersections as long as this dimension $d$ is $< n - \dim X$. This typical issue of unlikely intersections was raised in the paper [BMZ99]. That paper considered only the case when $X$ is a curve defined over $\overline{\mathbb{Q}}$, and proved two kinds of results. To state them it is convenient to introduce first a simple notation, setting (for an arbitrary variety $X \subset \mathbb{G}_m^n$)

$$X^{(d)} := \bigcup_{\dim G \le d} (X \cap G), \qquad G \text{ an algebraic subgroup of } \mathbb{G}_m^n. \tag{1.2.1}$$

13 See also [BG06] and [Zan09] for the said uniform lower bounds for $c(X)$, in terms of $n$, $\deg X$, in the toric case and for references, especially to subsequent quantitative work by F. Amoroso, S. David, P. Philippon, and others.

Note that $X^{(0)}$ is precisely the set of torsion points on $X$, whereas $X^{(n)} = X$ and $X^{(d)}$ is empty for $d < 0$. More generally, the opening description of algebraic subgroups shows that $(\xi_1, \ldots, \xi_n) \in X^{(d)}$ if and only if $(\xi_1, \ldots, \xi_n) \in X$, and there are at least $n - d$ independent multiplicative relations $\xi_1^{m_1} \cdots \xi_n^{m_n} = 1$ among $\xi_1, \ldots, \xi_n$. Equivalently: $X^{(d)}$ consists of the points $(\xi_1, \ldots, \xi_n)$ on $X$ such that the coordinates $\xi_i$ generate a multiplicative group $\xi_1^{\mathbb{Z}} \cdots \xi_n^{\mathbb{Z}}$ of rank at most $d$. Still another description is the following:

$X^{(d)}$ is the set of points $\xi \in X$ such that the Zariski closure of the subgroup $\{\xi^m : m \in \mathbb{Z}\}$ of $\mathbb{G}_m^n$ has dimension $\le d$. (Note that such a closure is automatically an algebraic subgroup.)

This follows immediately, e.g., from the previous criterion, on recalling that functions $m \mapsto \xi_i^m$ from $\mathbb{Z}$ to $k^*$ ($k$ a field), $i = 1, \ldots, d$, are algebraically independent over $k$ if and only if the $\xi_i \in k^*$ are multiplicatively independent (in turn, this is a consequence, e.g., of Artin’s theorem on linear independence of distinct characters).
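The rank criterion for membership in $X^{(d)}$ can be tried out numerically. The following is only a minimal sketch under a simplifying assumption that is not made in the text: the coordinates are nonzero rational numbers, so that the rank of the group they generate is the rank of the matrix of their prime exponents (signs contribute only torsion). The helper names `prime_exponents` and `multiplicative_rank` are hypothetical.

```python
from fractions import Fraction

def prime_exponents(q):
    """Return {prime: exponent} for |q|, where q is a nonzero rational."""
    q = Fraction(q)
    if q == 0:
        raise ValueError("q must be nonzero")
    exps = {}
    # Numerator and denominator are coprime, so their exponents never cancel.
    for n, sign in ((abs(q.numerator), 1), (q.denominator, -1)):
        p = 2
        while p * p <= n:
            while n % p == 0:
                exps[p] = exps.get(p, 0) + sign
                n //= p
            p += 1
        if n > 1:
            exps[n] = exps.get(n, 0) + sign
    return exps

def multiplicative_rank(values):
    """Rank of the multiplicative group generated by nonzero rationals."""
    vecs = [prime_exponents(v) for v in values]
    primes = sorted({p for v in vecs for p in v})
    rows = [[Fraction(v.get(p, 0)) for p in primes] for v in vecs]
    rank = 0
    # Plain Gaussian elimination over the rationals on the exponent matrix.
    for col in range(len(primes)):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

print(multiplicative_rank([2, 4, 8]))   # one generator suffices
print(multiplicative_rank([2, 3, 6]))   # one relation among three numbers
```

With this convention, a rational point $(\xi_1, \ldots, \xi_n)$ of $X$ lies in $X^{(d)}$ exactly when `multiplicative_rank` of its coordinates is at most $d$; for general algebraic coordinates one would need genuinely different tools.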

The study of $X^{(0)}$ has been the object of the previous section. As to larger values of $d$, the paper [BMZ99] focused on the case of an arbitrary curve $X$ in $\mathbb{G}_m^n$ and considered first, so to say, the opposite extremal (nontrivial) case of the set $X^{(n-1)}$, i.e., the intersection of the curve $X$ with the union of all proper algebraic subgroups, i.e., those of dimension (at most) $n - 1$. Note that these are “likely intersections.” In fact, we expect that a “general” algebraic subgroup of codimension 1 shall meet $X$ somewhere, and it is in fact easily proved that $X^{(n-1)}$ is indeed infinite.

Let now $x_1, \ldots, x_n$ denote the coordinate functions on $X$. By the above remark, a point $z \in X$ lies in $X^{(n-1)}$ if and only if the values $x_1(z), \ldots, x_n(z)$ are multiplicatively dependent, i.e., they satisfy a nontrivial relation $\prod_{i=1}^n x_i(z)^{m_i} = 1$. Certainly this dependence at $z$ shall not be surprising if the $x_i$ are likewise dependent as functions on $X$, for then any point will give rise to dependence of values. So let us assume that the coordinate functions are multiplicatively independent. Note that, since algebraic subgroups are described by multiplicative relations among coordinates, this independence amounts to $X$ not being contained in any proper algebraic subgroup of $\mathbb{G}_m^n$, or that $X^{(n-1)} \neq X$;¹⁴ it is a very natural assumption. In this case, independent work by Masser [Mas89b] had already shown that $X^{(n-1)}$ must be a very sparse set (in a precise sense to which we shall return in Note 1 below). Actually, let us assume for the moment the stronger fact that the $x_i$ are independent modulo constants, i.e., that $X$ is not contained in any translate of a proper algebraic subgroup of $\mathbb{G}_m^n$.¹⁵ Then the following holds:

Theorem 1.2. ([BMZ99], Theorem 1.) If $X$ is an irreducible curve defined over $\overline{\mathbb{Q}}$, not contained in any translate of a proper subtorus of $\mathbb{G}_m^n$, the Weil height of (algebraic) points in $X^{(n-1)}$ is bounded.

14 We may also rephrase this by saying that a generic point of $X$ does not lie in $X^{(n-1)}$.

15 The general translates of algebraic subgroups, rather than the torsion translates, also appeared to be relevant in uniformity issues concerning small heights of algebraic points on subvarieties of $\mathbb{G}_m^n$, as shown in [BZ95].


Thinking of the semi-distance defined by the height (see Remark 1.1.3), in a sense this conclusion says that the points in $X^{(n-1)}$ are not too far from the torsion points (i.e., those of zero height).

Example 1.2 As a simple example, which actually provided much motivation for these questions, take $X$ to be the line $x + y = 1$ in $\mathbb{G}_m^2$. This curve is related to the question considered by Lang as a prototype of his conjectures: namely, the question of points on this same line, with coordinates in a group $\Gamma$ of finite rank, e.g., $S$-units having sum 1; this issue appeared in fact in previous papers of Siegel, who proved its relevance in solving other interesting diophantine equations. Also, it had already appeared as a significant example in the study of points of small height, after Zhang’s solution of the Bogomolov problems for tori (Zagier [Zag93] had found a best-possible lower bound for the height of nontorsion points on the line). In the present context, this line was originally considered by Masser. The theorem now can be restated as claiming that the algebraic numbers $t$ such that $t^a (1 - t)^b = 1$ for integers $a, b$ not both zero, have bounded Weil height. (See [Mas09] for a simple direct argument and [CZ00] for a proof of the best-possible bound $h(t) \le \log 2$ for such algebraic numbers $t$.)
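For rational $t$ the restated theorem can be checked by brute force. The sketch below is an illustration only and rests on assumptions not in the text: it looks only at rational numbers $t = p/q$ with $|p|, q \le 40$ (an arbitrary search box), tests the multiplicative dependence of $t$ and $1 - t$ through their prime factorizations, and uses $h(p/q) = \log\max(|p|, |q|)$ for a fraction in lowest terms. The function names are hypothetical.

```python
from fractions import Fraction
from math import gcd, log

def exponent_vector(q):
    """Prime exponents of |q| for a nonzero rational q, as {prime: exponent}."""
    exps = {}
    for n, sign in ((abs(q.numerator), 1), (q.denominator, -1)):
        p = 2
        while p * p <= n:
            while n % p == 0:
                exps[p] = exps.get(p, 0) + sign
                n //= p
            p += 1
        if n > 1:
            exps[n] = exps.get(n, 0) + sign
    return exps

def is_dependent(t):
    """True if t^a (1 - t)^b = 1 for some integers (a, b) != (0, 0)."""
    u, v = exponent_vector(t), exponent_vector(1 - t)
    primes = sorted(set(u) | set(v))
    # Two integer vectors are linearly dependent iff all 2x2 minors vanish;
    # signs are harmless because a relation can always be squared.
    return all(u.get(p, 0) * v.get(r, 0) == u.get(r, 0) * v.get(p, 0)
               for p in primes for r in primes)

def weil_height(t):
    """Logarithmic Weil height of a rational number in lowest terms."""
    return log(max(abs(t.numerator), t.denominator))

def dependent_rationals(H):
    """Rationals t != 0, 1 in the box |p|, q <= H with t and 1 - t dependent."""
    found = set()
    for q in range(1, H + 1):
        for p in range(-H, H + 1):
            if p == 0 or p == q or gcd(p, q) != 1:
                continue
            t = Fraction(p, q)
            if is_dependent(t):
                found.add(t)
    return sorted(found)

points = dependent_rationals(40)
print(points)
print(all(weil_height(t) <= log(2) + 1e-12 for t in points))
```

The search only illustrates the bound $h(t) \le \log 2$ on rational points; the theorem itself concerns all algebraic $t$, which such an enumeration cannot reach.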

Sketch of proof of Theorem 1.2. A proof of this theorem is not too difficult and runs as follows: take a point $z \in X^{(n-1)}$ (it is automatically algebraic), so $\prod_{i=1}^n x_i(z)^{m_i} = 1$ for some integers $m_i$ not all 0. The $m_i$ may be very large, but using an elementary pigeon-hole principle we can “mimic” such a relation with one having bounded exponents $b_i$.

More precisely, suppose $m_n = \max |m_i|$. Then, for a given (large) integer $B > 1$, by the pigeon-hole principle one finds a positive integer $q \le B^{n-1}$ such that the $q m_i / m_n$ are integers $b_i$ up to an error $\le B^{-1}$. (For this, consider the set of vectors of fractional parts $\{(\{u m_1/m_n\}, \ldots, \{u m_{n-1}/m_n\}) : u = 0, \ldots, B^{n-1}\}$ and take the difference of two such vectors falling in the same cube of side $B^{-1}$.) We now write $\delta_i := b_i m_n - q m_i$, so $|\delta_i| \le B^{-1} m_n$. Raising the above relation to the $q$-th power, we obtain $\left(\prod_{i=1}^n x_i(z)^{b_i}\right)^{m_n} = \prod_{i=1}^n x_i(z)^{\delta_i}$. Taking heights and recalling the properties stated in the above Remark 1.1.3, we get

$$h\Big(\prod x_i(z)^{b_i}\Big) \le B^{-1} \sum h(x_i(z)). \tag{1.2.2}$$
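The pigeon-hole step that produces $q$ and the bounded exponents $b_i$ is entirely constructive, and can be sketched as follows. This is a minimal illustration, assuming (as in the sketch of proof) that the last exponent $m_n$ is positive and of maximal absolute value; the function name `pigeonhole_exponents` is hypothetical. Exact integer arithmetic is used, so no rounding issues arise.

```python
def pigeonhole_exponents(m, B):
    """Given integers m = [m_1, ..., m_n] with m_n = max |m_i| > 0 and B > 1,
    return (q, b) with 1 <= q <= B**(n-1) and |b_i*m_n - q*m_i| < m_n / B."""
    n = len(m)
    mn = m[-1]
    if mn <= 0 or any(abs(mi) > mn for mi in m):
        raise ValueError("the last entry must be positive and of maximal absolute value")
    seen = {}
    # B**(n-1) + 1 values of u fall into B**(n-1) cubes of side 1/B.
    for u in range(B ** (n - 1) + 1):
        # Index of the cube containing the fractional parts {u*m_i/m_n}, i < n.
        box = tuple(((u * mi) % mn) * B // mn for mi in m[:-1])
        if box in seen:
            u0 = seen[box]
            q = u - u0
            # Difference of integer parts: the integer nearest to q*m_i/m_n
            # up to an error < 1/B.
            b = [(u * mi) // mn - (u0 * mi) // mn for mi in m[:-1]] + [q]
            return q, b
        seen[box] = u
    raise AssertionError("unreachable by the pigeon-hole principle")

m = [123456, -98765, 40000, 1000003]
q, b = pigeonhole_exponents(m, 10)
deltas = [bi * m[-1] - q * mi for bi, mi in zip(b, m)]
print(q, b, deltas)
```

The returned $b_i$ satisfy $b_n = q \neq 0$ and $|b_i| \le B^{n-1} + 1$, and the $\delta_i$ printed at the end are exactly the small exponents that appear on the right-hand side before taking heights.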

On the other hand, the $b_i$ are not all zero (e.g., $b_n \neq 0$), and they are bounded independently of $z$: $|b_i| \le B^{n-1} + 1$. Hence, $\psi := \prod x_i^{b_i}$ is a nonconstant¹⁶ function on $X$ taken from a finite set independent of $z$; thus, for our purposes we may assume that $\psi$ is fixed. Let us now look at an irreducible algebraic relation $F_j(\psi, x_j) = 0$ between $\psi$ and $x_j$; note that the degree of $F_j$ in the first variable does not exceed the degree $\deg x_j$ of $x_j$ as a function on $X$, and is thus bounded independently of $B$. Then, evaluating this relation at the point $z$, it follows from elementary properties of the height that $h(x_j(z)) \le c_j h(\psi(z)) + O_B(1)$, whence by (1.2.2) we get $h(x_j(z)) \le c_j B^{-1} \sum_{i=1}^n h(x_i(z)) + O_B(1)$, where $c_j$ is independent of $z$ or $B$. Finally, summing this last bound over $j = 1, \ldots, n$ yields the sought result for large enough $B$ (it suffices that $B > 2 \sum c_j$). (See [BMZ99] for this and other arguments; see also [Zan09], pp. 127–130.)

Some Remarks on Theorem 1.2

1. Local-global principle. We have already noted that the above assumption on $X$ amounts to the coordinate functions being multiplicatively independent modulo constants. Then we clearly see a local-global principle: If the functions are independent (modulo constants), the values are usually independent, or else: If too many values are dependent, the functions must be dependent. Note that a priori the values of the functions at any algebraic point could always be multiplicatively dependent, even if the functions are not.

16 Here we use that $X$ is not contained in a proper algebraic coset.


2. Necessity of assumptions. The said assumption on $X$ is easily seen to be necessary in the theorem: in fact, suppose that $x_1^{a_1} \cdots x_n^{a_n}$ is a constant $c \neq 0$ on $X$ (where the $a_i$ are not all zero). Then, if $h(c) > 0$, we may consider, for instance, the algebraic subgroups defined by $x_i = x_1^{b a_1} \cdots x_n^{b a_n}$, $b \in \mathbb{N}$, to obtain unbounded height in $X^{(n-1)}$; on the other hand, if $h(c) = 0$, then $c$ is a root of unity and hence $X$ is contained in a proper algebraic subgroup $G$, so $X = X^{(n-1)}$ and the height is unbounded in $X^{(n-1)}(\overline{\mathbb{Q}})$.

3. Finiteness implications. It is very easy to check that $X^{(n-1)}$ is always an infinite set: it equals the whole set of all zeros of the nonzero rational functions on $X$ expressed as $x_1^{a_1} \cdots x_n^{a_n} - 1$, whose degree may be arbitrarily large; we may take for instance the zeros of $x_i^m - 1$, where $x_i$ is nonconstant and $m \in \mathbb{Z}$.¹⁷ However, Theorem 1.2 of course implies (through Northcott’s theorem recalled in Remark 1.1.3) finiteness of the set of points in $X^{(n-1)}$ which are defined over a given number field, or even which have bounded degree over $\mathbb{Q}$.¹⁸ This is indeed what we should expect a priori, since the degree of algebraic subgroups tends to infinity, so it is unlikely that infinitely many intersections with $X$ shall contain points of small degree over $\mathbb{Q}$. Note also that this conditional finiteness is a measure that these “likely intersections” are already rather sparse. (Other sparseness conclusions come from previous work by Masser; see the notes below.)

4. Precedent analogues. The result may be seen as an analogue for tori (actually for the family $\mathbb{G}_m \times X$) of Silverman’s specialization results for pencils of abelian varieties [Sil83], which bound the height of values of the parameter for which the rank of a finitely generated group decreases under specialization. (We shall prove and use a simple case of this theorem in Chapter 3, when discussing Masser’s questions: see Proposition 3.2 and also Appendix C for a direct argument of Masser for the general elliptic case.) Another analogue is a theorem of Manin-Demjanenko (see [Ser97], p. 62), which considers maps $f_1, \ldots, f_m$ from a variety $X$ to an abelian variety $A$; they are supposed to be independent in the sense that $\sum n_i f_i$ can be constant only if the integers $n_i$ are all zero. A result states that, if the rank of the Néron-Severi group of $X$ is 1, then the values $f_i(P) \in A$ can be linearly dependent over $\mathbb{Z}$ only for finitely many points $P \in X(k)$, where $k$ is a fixed number field. This last condition makes a substantial difference with our context, in which we consider all points over $\overline{\mathbb{Q}}$. See especially the next Theorem 1.3, giving finiteness for “doubly dependent” points over $\overline{\mathbb{Q}}$.

5. Higher dimensions. It is natural and also highly relevant for some applications, as we shall see, to ask for a possible analogue of Theorem 1.2 for $X$ of arbitrary dimension. In the general case, by analogy we should consider the set denoted $X^{(n - \dim X)}$ in (1.2.1). An analogue was proved (by Bombieri and the author) for $X$ of codimension 1, in Lemma 1 of the appendix to Schinzel’s book [Sch00]. (This boils down to a height estimate for roots $\xi$ of a “lacunary” equation $a_0 + a_1 \xi^{m_1} + \ldots + a_r \xi^{m_r} = 0$; one can prove that if no subsum vanishes, then $h(\xi) \ll_r \max(h(a_i) + 1) / \max(|m_i|)$. See also [Zan09], Ex. 3.7.) Further progress came from the paper [BMZ08a], where the case of $X$ a plane in $\mathbb{G}_m^n$ was settled. However, none of the methods underlying these papers provided enough information in complete generality. We note at once that the basic assumption for Theorem 1.2, that $X$ is not contained in a proper coset, is not generally sufficient to guarantee the sought boundedness conclusion, already if $\dim X = 2$. A simple (counter)example is as follows:

Example 1.3 Unbounded height. Consider the surface $X$ in $\mathbb{G}_m^4$ defined by $x + y = 1$, $z + w = c$ for a constant $c \neq 0$. This is easily checked not to be contained in any proper (algebraic) coset. However, if we intersect it with the algebraic subgroups of codimension 2 defined by $z = x^a$, $w = y^b$, we obtain unbounded height: in fact, by a theorem of Zhang we have $h(x) + h(y) \ge h_0 > 0$ for algebraic numbers $x, y$ not roots of unity and satisfying $x + y = 1$,¹⁹ so $h(z) + h(w) \ge \min(|a|, |b|) h_0$.

17 We also note that the multiplicity of a zero of any such nonzero function is bounded, as easily follows from the “abc” theorem for function fields of curves; see [BG06].

18 For instance, for the above example of $X : x + y = 1$, it is easy to see that the rational points in $X^{(1)}$ are $(-1, 2)$, $(2, -1)$, $(1/2, 1/2)$.

More recently, P. Habegger succeeded in [Hab09c] in proving a natural analogue of Theorem 1.2, i.e., a height bound holding for the “likely intersections” of an appropriate subset of $X$ with the union of algebraic subgroups of complementary dimension. To illustrate this, we pause to point out that in joint work with Bombieri and Masser it had already been noted that the surface $X$ of Example 1.3 has “unlikely intersections of positive dimension” with the algebraic cosets of dimension 2: in fact, the cosets of dimension 2 defined by $x = x_0$, $y = 1 - x_0$ intersect $X$ in a curve rather than in finitely many points. On the one hand, this led to the study of new types of unlikely intersections, as we shall describe below. On the other hand, it turned out that to have a height bound for $X^{(n - \dim X)}$, it is indeed generally necessary to remove from $X$ all the unlikely intersections of positive dimension with algebraic cosets. (For the above surface, this would leave the resulting space empty.) This led to a suitable conjecture, eventually proved by Habegger. We shall say more on Habegger’s theorem and this issue in the next section.

Coming back to the case of curves $X$, Theorem 1.2 accounts for the “likely intersections.” For the unlikely ones (that is, the points in $X^{(n-2)}$) the above argument gives nothing more, but the result nevertheless may be applied to obtain the finiteness of the unlikely intersections:

Theorem 1.3. ([BMZ99], Theorem 2.) If $X$ is an irreducible curve defined over $\overline{\mathbb{Q}}$, not contained in any translate of a proper subtorus of $\mathbb{G}_m^n$, then $X^{(n-2)}$ is finite.

Note that now the relevant points $z$ are such that the coordinate values $x_1(z), \ldots, x_n(z)$ satisfy two independent multiplicative relations. Inspection shows that the proof is effective in the sense that, if equations for $X$ are given, it allows us to produce the complete list of the relevant points. An equivalent formulation of this theorem in dimension $n = 3$ was actually first obtained by Schinzel in 1989, who studied this topic with independent motivations: see Theorem 45 of [Sch00].

Example 1.4 An example of the theorem is as follows: There are only finitely many numbers $t \neq 0, 1, -1$ such that $t$, $t + 1$, $t - 1$ satisfy two independent multiplicative relations of the shape $(t - 1)^a t^b (t + 1)^c = 1$, with $a, b, c$ integers. The condition is equivalent to the fact that the group $t^{\mathbb{Z}} (t - 1)^{\mathbb{Z}} (t + 1)^{\mathbb{Z}}$ has rank at most 1. (In the paper [CZ00] it is shown, by a different method, that there are exactly 34 such numbers $t$, listed therein.)

Sketch of proof of Theorem 1.3. Let $z \in X^{(n-2)}$, so the coordinates $x_i(z)$ generate a multiplicative subgroup $\Gamma = \Gamma_z \subset \overline{\mathbb{Q}}^*$ of rank $r \le n - 2$. We may thus write (on invoking elementary abelian group decomposition) $x_i(z) = \zeta_i \prod_{j=1}^r g_j^{m_{ij}}$ for generators $g_j \in \mathbb{Q}(z)$ of the free part of $\Gamma$, integers $m_{ij}$ and roots of unity $\zeta_i \in \mathbb{Q}(z)$. It is important that the $\zeta_i$ may be indeed chosen to lie in $\mathbb{Q}(z)$; we let $N$ be their exact common order. The proof now splits into several steps.

Small bounds for the height. First, geometry of numbers may be applied to the euclidean normed space provided by the height on the subgroup of $\overline{\mathbb{Q}}^*/\mathrm{tors}$ generated by the $x_i(z)$ (modulo torsion), as in Lemma 2 of [BMZ99]. Namely, given generators $g_i$, $i = 1, \ldots, r$, for the said subgroup, one constructs a norm which satisfies $\|(a_1, \ldots, a_r)\| = h(g_1^{a_1} \cdots g_r^{a_r})$ for integers $a_1, \ldots, a_r$ (see for instance [Zan09], Ch. 3, pp. 116–119). By using Minkowski’s theorem on successive minima, one deduces that the generators $g_j$ may be chosen with “good” behavior with respect to this norm, or, equivalently, to the Weil height. By this we mean that we have $h(g_1^{a_1} \cdots g_r^{a_r}) \ge c \max_{i=1}^r (|a_i| h(g_i))$, where $c$ is a positive constant depending only on $r$.

19 Zhang’s general theorem was made explicit by Zagier [Zag93] in this case, with a best-possible $h_0$.


This choice of the generators leads to inequalities $\max_j |m_{ij}| h(g_j) \ll_r h(x_i(z))$ for $i = 1, \ldots, n$. (Note the crucial fact that the implied constant here does not depend on $z$, only on $r$.) By Theorem 1.2 we thus have

$$h(g_j) \ll 1/|\mathbf{m}_j|, \tag{1.2.3}$$

uniformly in $z$, where $|\mathbf{m}_j|$ denotes the euclidean length of $\mathbf{m}_j := (m_{1j}, \ldots, m_{nj})$. We also define $V := \prod_{j=1}^r |\mathbf{m}_j|$; this is an upper bound for the volume of the parallelepiped spanned by the $\mathbf{m}_j$.

Upper bound for the degree. The next step is to find a proper algebraic subgroup $H \subset \mathbb{G}_m^n$ of small degree and containing $z$. It is convenient to choose $H$ of largest possible dimension, i.e., $n - 1$, and this corresponds to finding a nonzero vector $b \in \mathbb{Z}^n$ orthogonal to the $\mathbf{m}_j$ and such that $\zeta_1^{b_1} \cdots \zeta_n^{b_n} = 1$. Again by standard geometry of numbers (or pigeon-hole principle) this may be done with $0 < |b| \le (NV)^{1/(n-r)}$; here we see explicitly how a small rank $r$ is advantageous. We have $z \in H$, i.e., $\prod_i x_i(z)^{b_i} = 1$. In practice, using that $z$ is contained in an algebraic subgroup of codimension 2 (and a certain degree), we have constructed a larger algebraic subgroup, of codimension 1, containing $z$, but whose degree has a much smaller dependence on the $|m_i|$.²⁰

The same equation holds for the conjugates $z^\sigma$ of $z$ over a number field $k$ of definition for $X$: $\prod_i x_i(z^\sigma)^{b_i} = 1$. Hence, $X \cap H$ contains all of these conjugates. Note that $X$ is not contained in $H$ by assumption, hence by Bezout’s theorem we have $|X \cap H| \le \deg X \deg H \ll \max |b_i| \le (NV)^{1/(n-r)}$. This bound therefore holds for the degree of $z$, i.e., $[k(z) : k] \ll (NV)^{1/(n-r)}$.

Comparison of bounds. Now comes the final step; it is a fact that a small nonzero height forces large degree; here we cannot enter in detail into this question, which started with Lehmer’s problem (see [BG06]) and nowadays involves deep extensions to higher dimensions and also in a geometrical context, where varieties replace numbers (see [Zan09], especially the Appendix by F. Amoroso). We limit ourselves to mentioning that in this direction Amoroso and S. David [AD99] have proved (after the case of rank 1 treated by E. Dobrowolski) the following theorem:

Theorem 1.4. (Amoroso-David [AD99].) Let $\epsilon > 0$. If $g_1, \ldots, g_r$ are multiplicatively independent algebraic numbers in a number field of degree $d$, then $h(g_1) \cdots h(g_r) \gg_{\epsilon, r} d^{-1-\epsilon}$.²¹

We apply this result to the above defined $g_i$, setting $d := [k(z) : k]$. By the above inequalities for $h(g_i)$ we get $h(g_1) \cdots h(g_r) \ll V^{-1}$ and $d \ll (NV)^{1/(n-r)}$, whence, by this theorem, $V^{n-r-1-\epsilon} \ll N^{1+\epsilon}$; in turn, we get $[k(z) : k] \ll N^{1/(n-r-1-\epsilon)}$. To go on, we recall that the $\zeta_i$ lie in $k(z)$, hence $\phi(N) \ll [k(z) : k]$, where $\phi$ is Euler’s function; the value $\phi(N)$ is well known to be $\gg N^{1-\epsilon}$. This is sufficient to conclude if $r \le n - 3$, because then, choosing a fixed positive $\epsilon < 1/4$, we obtain that $N$ and $[k(z) : k]$ are uniformly bounded, so by bounded height (provided by Theorem 1.2) the points $z$ have only finitely many possibilities. If $r = n - 2$ we get merely $\phi(N) \ll [k(z) : k] \ll N^{1/(1-\epsilon)}$.

An idea is now as follows: the extension $k(z)/k$ is “almost” normal, because the above inequalities show that it has a small degree (i.e., $\ll N^{2\epsilon/(1-\epsilon)}$) over the cyclotomic extension $k(\zeta_1, \ldots, \zeta_r)/k$ of degree $\gg \phi(N)$. Hence, we may add to $k(z)$ a conjugate $g'$ of some $g_\ell$ of minimal height, and keep the degree of the resulting field not much larger than it was, i.e., $\ll N^{1/(1-\epsilon)} N^{2\epsilon/(1-\epsilon)} \ll N^{1+5\epsilon}$ if $\epsilon$ is small enough. But now, we can apply the Amoroso-David theorem to the $r + 1$ numbers $g_1, \ldots, g_r, g'$, and (taking into account that $h(g') = h(g_\ell)$ is minimal) improve our inequalities by a sufficient amount to conclude exactly as before, provided, however, that $g_1, \ldots, g_r, g'$ are multiplicatively independent. Let us then suppose that they are multiplicatively dependent, no matter the choice of the conjugate $g'$ of $g_\ell$; then the multiplicative group generated by the conjugates of $g_\ell$ has rank $s \le r$ and, through a basis for the free part of this group, we obtain a representation of the Galois group into $GL_s(\mathbb{Q})$. But, as is well known, the order of a finite subgroup of $GL_s(\mathbb{Q})$ is bounded only in terms of $s$ (see [Ser07]). This yields that some positive integral power $g_\ell^M$ of $g_\ell$ has a uniformly bounded degree over $\mathbb{Q}$. At this point, taking a norm to $k(\zeta_1, \ldots, \zeta_r)$ and using a simple Galois cohomology argument (see Lemma 6 of [BMZ99], which we omit here) allows us to control a minimal such exponent $M$; this in turn yields a “good” lower bound $h(g_\ell) \gg M^{-1}$, sufficient to conclude, again by comparison with the degree of $g_\ell$ (by the case $r = 1$ of the Amoroso-David Theorem).

20 Note that we have used geometry of numbers twice. Masser has replaced this “double” use by a “single” use; he works with any free generators $g_j$ and introduces the function $N(t_1, \ldots, t_r) = h(g_1^{t_1} \cdots g_r^{t_r})$; as noted above, this may be extended to $\mathbb{R}^r$ and induces a norm there; see [BG06] or [Zan09], pp. 116–118. Then Minkowski’s second theorem may be applied. For this argument, see [Mas09], p. 330.

21 The original statement is more precise, and replaces the “$d^{-\epsilon}$” by a certain explicit function of $d$ and $r$; this is immaterial for our purposes.
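The exponent bookkeeping in the comparison of bounds can be checked with exact rational arithmetic. The sketch below only tracks exponents of $N$ and ignores all implied constants; the function names are hypothetical, and the criterion used (the chain $N^{1-\epsilon} \ll \phi(N) \ll [k(z):k] \ll N^{e}$ bounds $N$ as soon as $e < 1 - \epsilon$) is the one read off from the sketch of proof of Theorem 1.3.

```python
from fractions import Fraction

def volume_exponent(n, r, eps):
    """Exponent v with V << N^v, from V^(n-r-1-eps) << N^(1+eps)."""
    denom = n - r - 1 - eps
    if denom <= 0:
        raise ValueError("need n - r - 1 - eps > 0")
    return (1 + eps) / denom

def degree_exponent(n, r, eps):
    """Exponent e with [k(z):k] << (N V)^(1/(n-r)) << N^e."""
    return (1 + volume_exponent(n, r, eps)) / (n - r)

def comparison_bounds_N(n, r, eps):
    """True when e < 1 - eps, so that the comparison with phi(N) bounds N."""
    return degree_exponent(n, r, eps) < 1 - eps

eps = Fraction(1, 5)
# The degree exponent simplifies to 1/(n - r - 1 - eps), as in the text.
print(degree_exponent(5, 2, eps) == 1 / (5 - 2 - 1 - eps))
print(comparison_bounds_N(5, 2, eps))   # case r <= n - 3
print(comparison_bounds_N(5, 3, eps))   # crucial case r = n - 2
```

For $r \le n - 3$ and $\epsilon < 1/4$ the exponent is at most $1/(2 - \epsilon) < 1 - \epsilon$, which is why the first argument concludes there, while for $r = n - 2$ the exponent $1/(1 - \epsilon)$ exceeds $1 - \epsilon$ and a further idea is needed.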

This last, somewhat tricky argument may be replaced by a simpler one, using a subsequent theorem proved jointly with Amoroso in [AZ00]:

Theorem 1.5. ([AZ00].) Let $k$ be a given number field, let $\epsilon > 0$ and let $\xi$ be an algebraic number, not a root of unity. Also, define $d_{ab}$ as the minimal degree of $\xi$ over any abelian extension of $k$. Then we have $h(\xi) \gg_{k, \epsilon} d_{ab}^{-1-\epsilon}$.

The point is that the implied constant depends only on $k$, $\epsilon$, not on an abelian extension over which $\xi$ has degree $d_{ab}$.²² In our case we may apply this result with $\xi = g_i$. The degree $d_{ab}$ may be taken to be at most $[k(z) : k(\zeta_1, \ldots, \zeta_n)]$, which we have established to be small, i.e., $\ll N^{3\epsilon}$ (for small enough $\epsilon > 0$). Then, from this bound and Theorem 1.5 we get $h(g_i) \gg N^{-4\epsilon}$ for each generator $g_i$, whence $V^{-1} \gg \prod_i h(g_i) \gg N^{-4r\epsilon}$, so $V \ll N^{4r\epsilon}$. In turn, $[k(z) : k] \ll (NV)^{1/(n-r)} \ll N^{(1+4n\epsilon)/2}$. But for small $\epsilon$ comparison with $N^{1-\epsilon} \ll \phi(N) \ll [k(z) : k]$ again shows that $N$ is bounded, finally concluding the argument.

A further and perhaps even simpler argument to conclude, valid also in the crucial case $r = n - 2$, uses a result presented in the next chapter, i.e., Theorem 2.2, in place of the last Theorem 1.5. We shall sketch such a deduction in the notes to Chapter 2.

Remark 1.2.1 Relation between Theorem 1.2 and Theorem 1.3. We note that the above proof of Theorem 1.3 uses Theorem 1.2 in a crucial way, i.e., to derive the opening inequality (1.2.3). However, it turns out that if we have an a priori bound for the height of the relevant points, the finiteness of Theorem 1.3 can be proved without appealing to Theorem 1.2, and actually with the milder hypothesis that $X$ is not contained in any torsion coset (rather than in any coset). Namely, one may prove that for any number $B$, if the curve $X$ is not contained in any proper torsion coset, the set of points in $X^{(n-2)}$ with height at most $B$ is finite. This deduction was explicitly carried out in [BMZ06] and is relevant to the issues discussed in Section 1.3.2 below.

22 The recent paper [AZ10] also proves that the implied constant may be chosen to depend only on $[k : \mathbb{Q}]$.


1.3 Remarks on Theorem 1.3 and its developments

We shall now briefly describe some developments arising from the above results. (See also A. Chambert-Loir’s Séminaire Bourbaki [CL11] for an alternative presentation of a relevant part of this material.)

1.3.1 Fields other than $\overline{\mathbb{Q}}$

We have worked throughout with curves defined over $\overline{\mathbb{Q}}$, which is a relevant restriction, because the above proofs use arithmetical tools. However, whereas the very statement of Theorem 1.2 involves $\overline{\mathbb{Q}}$ directly through the Weil height, the statement of Theorem 1.3 makes sense for curves defined over any field, and one can thus ask for which fields the corresponding conclusion still holds.

In fact, although the original arguments, as they stand, do not carry over outside $\overline{\mathbb{Q}}$, this theorem was subsequently extended to complex curves by specialization arguments in [BMZ03]; the principle was: (i) to specialize to $\overline{\mathbb{Q}}$ the coefficients of defining equations for the curve, obtaining a curve over $\overline{\mathbb{Q}}$; (ii) to extend the said specialization to the unlikely intersections, with the purpose of applying the result already obtained in the algebraic case. However, somewhat unexpectedly, these arguments turned out to be not very simple, the difficulty being that a priori infinitely many unlikely intersections could collapse and give rise only to finitely many ones after specialization; however, it was eventually possible to control this collapsing, allowing the said principle to work. In greater generality, a deduction of the complex case from the algebraic one in this context was later achieved with another (more natural) method in the paper [BMZ08b] (see, for instance, Cor. 1 and Cor. 2 therein). We remark that the complex case in fact covers the case of an arbitrary field of characteristic zero (by the usual Lefschetz Principle).

As to fields of positive characteristic, plainly the analogous conclusion would not hold over $\overline{\mathbb{F}}_p$, every point being torsion. However, the case of function fields over $\mathbb{F}_p$ is open and certainly presents interesting and difficult issues in this and similar directions. (See, for instance, [AV92] for a discussion of an analogue of the Mordell-Lang conjecture for semi-abelian varieties in positive characteristic and for proofs of some cases. We further mention a Mordell-Lang conjecture for Drinfeld modules, due to L. Denis [Den92]. There are also preliminary investigations, by D. Brownawell and Masser, about more general unlikely intersections for $\mathbb{G}_m^n$ and Drinfeld modules, as well as the simpler so-called $F$-modules.)

1.3.2 Weakened assumptions

Whereas we have noted that the assumption on $X$ (not to be contained in a translate of a proper algebraic subgroup) is certainly necessary for Theorem 1.2 above, the question arose whether for Theorem 1.3 the weaker assumption suffices that $X$ is not contained in any proper algebraic subgroup (rather than translate). This is equivalent to the assumption that $X$ is not contained in any proper torsion coset. This question was raised explicitly in [BMZ99]; indeed, in that paper the stronger assumption was needed because the proof of Theorem 1.3 used Theorem 1.2 (in this connection see also Remark 1.2.1 above). It is an apparently small issue, but actually it turned out to be a relevant and difficult one, carrying into the picture the question of points in $Y \cap \Gamma$, where $Y$ is a curve and where now $\Gamma$ is a(ny) finitely generated (or finite-rank) subgroup. (These $Y$ and $\Gamma$ arise as follows: if $X$ is contained in a translate $\tau H$ of minimal dimension $m$, then we may choose $Y = \tau^{-1} X$ and $\Gamma$ as the group generated by the coordinates of $\tau$, replacing the ambient space by $H \cong \mathbb{G}_m^m$.) Now, as pointed out in Remark 1.1.6 above, the investigation of $Y \cap \Gamma$ is a deep problem, at the heart of Lang’s conjectures. Actually, we shall soon note how the improved statement for Theorem 1.3 would directly contain the multiplicative case of Lang’s conjectures for curves.

Example 1.5 An instance of this sharpened form of Theorem 1.3, related to the previous Example 1.4, is as follows: There are only finitely many numbers $t$ such that $t$, $t + 1$, $t - 1$, $2$, $3$ satisfy two independent relations of the shape $(t - 1)^a t^b (t + 1)^c 2^d 3^e = 1$. (This condition means that the group $t^{\mathbb{Z}} (t - 1)^{\mathbb{Z}} (t + 1)^{\mathbb{Z}} 2^{\mathbb{Z}} 3^{\mathbb{Z}}$ has rank at most 3.) This conclusion would follow from the statement analogous to Theorem 1.3, but with the said milder assumption, applied to the curve $X$ in $\mathbb{G}_m^5$ parametrized by $(t, t - 1, t + 1, 2, 3)$; note that this curve is contained in the (nontorsion) translate $(1, 1, 1, 2, 3) \cdot H$ of the algebraic subgroup $H \cong \mathbb{G}_m^3$ defined by $x_4 = x_5 = 1$, but it is not contained in any proper algebraic subgroup.

Jointly with Bombieri and Masser, a proof of the noted stronger statement for curves in low dimensions was obtained in [BMZ06]. In its starting point, the proof method was partially along the same lines of the above proof of Theorem 1.3, to obtain the variation stated in Remark 1.2.1 above; namely, with the weaker assumption we could obtain the sought conclusion provided we had a bound for the height of the relevant points, gotten without appealing to Theorem 1.2. To achieve this bound, we first took Galois conjugates to eliminate the multiplicative translation. Namely, if X lies in a coset of dimension m, after an automorphism of nm we may write n−m is constant; the significance of this X = Y × {z0 }, where Y is a curve in m m and z0 ∈ m is that now Y may be assumed to satisfy the former assumption, i.e., not to lie in any proper σ −1 , for algebraic coset in m m . Now, starting from a point z ∈ X ∩ G one considers the points z · z σ conjugates z of z (over a field of definition k for X). Note that such new points lie in the surface Y · Y −1 × {1}, in bijection with Y · Y −1 := {uv −1 : u, v ∈ Y }, so that z0 has disappeared. At this stage, one would need a height bound for surfaces, similar to Theorem 1.2, which could be transferred in the desired bound for the height on Y , and in turn to a bound for h(z); however, we could prove this surface version of Theorem 1.2 only for surfaces in dimension 3 (which are relevant when X is contained in a coset of dimension at most 3, as in the above Example 1.5). Combining all of this also with “Lang’s conjecture for curves” (a theorem of Liardet) led to the proof of the improved version of the theorem for the said special cases. As in Remark 5 to Theorem 1.2, the required general height bound (for surfaces, but also in higher dimensions) was proved recently by Habegger and eventually led to a complete extension of the arguments of [BMZ06]; we shall soon discuss Habegger’s result in some detail. 
Anyway, in the meantime, the above basic question was settled affirmatively in 2008 by G. Maurin [Mau08], who weakened the said assumption and proved:

Theorem 1.6. (Maurin, [Mau08].) If $X$ is an irreducible curve defined over $\overline{\mathbb{Q}}$, not contained in any proper torsion coset of $\mathbb{G}_m^n$, then $X^{(n-2)}$ is finite.

As already remarked, this result, applied to curves $X = Y \times (\alpha_1, \dots, \alpha_m)$, where $Y$ is a curve in $\mathbb{G}_m^{n-m}$ and the $\alpha_i \in \overline{\mathbb{Q}}^*$, implies the toric case of Lang’s conjecture for curves over $\overline{\mathbb{Q}}$. To see the essentials of this implication, take an algebraic subgroup $G$ of codimension 2, defined by $x_1^{-a} x_{n-m+1}^{a_1} \cdots x_n^{a_m} = x_2^{-b} x_{n-m+1}^{b_1} \cdots x_n^{b_m} = 1$, $ab \neq 0$. Then the intersection $X \cap G$ corresponds to points $(y_1, \dots, y_{n-m}) \in Y$ such that $y_1 = \theta \alpha_1^{a_1/a} \cdots \alpha_m^{a_m/a}$ and $y_2 = \mu \alpha_1^{b_1/b} \cdots \alpha_m^{b_m/b}$, with roots of unity $\theta, \mu$. In other words, the projection on the first two coordinates is a point with coordinates in the division group of the group $\Gamma$ generated by the $\alpha_i$. The conclusion of Maurin’s theorem yields finiteness unless this projection is an algebraic coset, as predicted by Lang (and first proved by him in [Lan65] for finitely generated $\Gamma$ and by Liardet in general).

In his proof, Maurin used methods introduced by G. Rémond, quite different from the illustrated ones. He used suitable so-called Vojta’s inequalities (in the toric context) in the theory of heights; this kind of idea had indeed been introduced by Vojta in his proof of Faltings’ theorem, previously the Mordell conjecture, and concerned abelian varieties (see [BG06], Chapter 11, especially 11.7). Later, Rémond succeeded in obtaining very useful and far-reaching analogues for tori, with rather subtle and technical proofs.

Maurin’s arguments work only over $\overline{\mathbb{Q}}$, but a deduction of an extension of his theorem to general complex curves appears in the paper [BMZ08b], Cor. 1. As mentioned above, Habegger’s cited height bound was eventually applied to the setting and arguments of the quoted paper [BMZ06], involving Galois conjugates as in the above sketch. This led indeed to another proof (moreover “probably” effective) of this result of Maurin, carried out in detail in the paper [BHMZ10].

1.3.3 Unlikely intersections of positive dimension and height bounds

Now we shall briefly discuss the said new tool provided only recently by Habegger, namely a suitable higher dimensional analogue of Theorem 1.2. This is related to other unlikely intersections and so, before stating Habegger’s result, recall that we have already noted in Remark 5 to Theorem 1.2 that for a height bound to hold in $X^{(n-\dim X)}$, suitable new features of $X$ have to be taken into account. Actually, following [BMZ07], let us define a suitable subset of $X$:

Definition. For a subvariety $X \subset \mathbb{G}_m^n$, we define $X^{oa}$ to be the complement in $X$ of the union of unlikely intersections of positive dimension (with cosets), namely the components of some positive dimension $\delta > 0$ of some intersection $X \cap gG$, where $gG$ is a coset of an algebraic subgroup $G$ of dimension $\dim G \le n + \delta - \dim X - 1$ (see footnote 23).

Note that $X^{oa}$ is empty when $X$ is contained in a proper coset, but also in other cases, as for the above counterexample of a surface in $\mathbb{G}_m^4$, i.e., Example 1.3 in Remark 5 of the previous section. (We again remark that, as for Maurin’s theorem, the distinction between torsion translates and general translates is often substantial in the whole context.) In [BMZ07] it is proved that $X^{oa}$ is Zariski-open in $X$, and its structure is discussed in detail (see Subsection 1.3.4 for more). Now, it turns out that the sought bound for the height in $X^{(n-\dim X)}$ may generally fail to hold outside $X^{oa}$; this was also discussed in [BMZ07] (see especially Section 5), where we put forward a conjecture (Bounded height conjecture), later to become the theorem by Habegger:

Theorem 1.7. (Bounded height conjecture = Habegger’s theorem; [Hab09c].) Let $X$ be an irreducible variety over $\overline{\mathbb{Q}}$. The Weil height is bounded in the intersection of $X^{oa}$ with the union of algebraic subgroups of dimension $\le n - \dim X$.

23 The notation $X^{o}$ came from a previous notation introduced in [BZ95], concerning the case of cosets entirely contained in $X$; that paper considered lower bounds for the height and showed that the distinction between “cosets” and “torsion cosets” is relevant in uniformity issues. The letter “a” refers to “anomalous,” which had the same meaning as “unlikely.”

Habegger’s proof is seemingly effective (which would lead to an effective version of Maurin’s theorem, as in [BHMZ10]). It is rather involved, and we cannot reproduce a full sketch of it here; we only add a brief description of the main points. We may suppose that $X^{oa}$ is not empty. Writing $r := \dim X$, we are interested in the height inside $X \cap G$, where $G$ is an algebraic subgroup of dimension $n - r$. Such a $G$ is defined by equations $x^{a_i} = 1$, for $i = 1, \dots, r$, where $a_i = (a_{i1}, \dots, a_{in}) \in \mathbb{Z}^n$ are linearly independent. Habegger considers the function (a homomorphism) $\psi = \psi_{a_1,\dots,a_r} : \mathbb{G}_m^n \to \mathbb{G}_m^r$, $\psi(x) = (x^{a_1}, \dots, x^{a_r})$, and more generally seeks upper and especially lower bounds (uniform as $G$ varies!) for the height $h(\psi(x))$ in terms of $h(x)$, for $x \in X$. (Note that $\psi(x) = 1$ precisely if $x \in X \cap G$.)

Now, a first step is to approximate a rational power of such a function $\psi$ with some other homomorphism $\varphi(x) = (x^{b_1}, \dots, x^{b_r})$ in a finite set. This may be done just by taking a large parameter $Q$ and mimicking the vectors $Q a_i / \sup |a_{ij}|$ by integer vectors $b_i$ in a finite set (similarly and even more simply than in the proof of Theorem 1.2). From the equality $\psi(x) = 1$, one then easily obtains that $h(\varphi(x)) \ll h(x)$ for $x \in X \cap G$, with an implicit constant dependent only on $n$. Now comes the crucial point: to bound below $h(\varphi(x))$, $x \in X$, by a large multiple of $h(x)$ plus a bounded quantity, in order to obtain by comparison the sought upper bound for $h(x)$. Since $\varphi$ is in a finite set, we can suppose it is fixed. By standard height machinery (developed, e.g., in [BG06]) this $h(\varphi(x))$ is (essentially) the height $h_D(x)$, taken with respect to the divisor $D = \varphi^*(\infty)$, where we view $\varphi$ as a function from a suitable compactification $\bar{X}$ of $X$ to $\mathbb{P}^r$, and where $\infty$ denotes a divisor in $\mathbb{P}^r$ corresponding to the standard height on $\mathbb{G}_m^r$.

Habegger compactifies $X$ as the closure of the graph of $\varphi$ in $\mathbb{P}^n \times \mathbb{P}^r$; he then uses a theorem of Siu in algebraic geometry (plus standard functorial height properties) to bound below the height $h_D(x)$, for $x$ in a dense open subset of $X$. He obtains $h_D(x) \gg C_1^{-1} \cdot (\deg \varphi) \cdot h(x) + O_{\varphi}(1)$, where $C_1$ is a certain intersection product related to $\varphi$, $D$ and $\bar{X}$; in turn, this $C_1$ may be suitably bounded above (by $\ll Q^{r-1}$) using a multiprojective version of Bezout’s theorem due to P. Philippon. To get a suitable lower bound for the degree $\deg \varphi$ (with respect to $X$), Habegger introduces supplementary compactness arguments, by using a result of J. Ax (Theorem 7.2 of [Hab09c]) to work with virtual monomials, corresponding to real (rather than integral) exponents. The lower bound for $C_1^{-1} \cdot \deg \varphi$ turns out to be $\gg Q$, which yields $h(\varphi(x)) \gg Q h(x) + O_Q(1)$; although the constant in the $O$-term may depend on $Q$, the one implicit in the $\gg$ does not, so the coefficient of $h(x)$ for large $Q$ kills the constant in the previous upper bound. This is enough to get a bound for the height of $x$ in some dense Zariski-open subset of $X$. To obtain the precise $X^{oa}$, Habegger finally uses a descent argument on dimension.
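
The final comparison of the upper and lower bounds in this sketch can be written out schematically. The constants $c_1$, $c_2$, $c_3(Q)$ below are placeholder names introduced here for the implicit constants in the two estimates (they are not notation from [Hab09c]); $c_1, c_2 > 0$ are assumed independent of $Q$, while $c_3(Q) \ge 0$ may depend on $Q$.

```latex
% Schematic comparison for x in X \cap G, with x in the dense open subset
% of X on which the lower bound holds. Recall h(x) >= 0.
\begin{align*}
h(\varphi(x)) &\le c_1\, h(x)
  && \text{(upper bound, from } \psi(x) = 1\text{)},\\
h(\varphi(x)) &\ge c_2\, Q\, h(x) - c_3(Q)
  && \text{(lower bound)},\\
(c_2 Q - c_1)\, h(x) &\le c_3(Q)
  && \text{(by comparison)}.
\end{align*}
% Fixing Q \ge 2 c_1 / c_2 gives c_2 Q - c_1 \ge c_2 Q / 2 > 0, hence
\[
h(x) \le \frac{2\, c_3(Q)}{c_2\, Q}.
\]
```

Since $Q$ is fixed once and for all in terms of $c_1, c_2$, the right-hand side is a constant, which is the shape of bound the argument needs.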

We also mention another result appearing in the same paper of Habegger (Cor. 2 in [Hab09c]), again related to $X^{oa}$, similar in spirit to Theorem 1.3 or Maurin’s theorem, and valid for $X$ of any dimension. For this he introduces the following notion of degeneracy:

Definition. A subvariety $X \subset \mathbb{G}_m^n$ of dimension $r$ is nondegenerate if for any linearly independent vectors $a_1, \dots, a_r \in \mathbb{Z}^n$ the monomials $x^{a_1}, \dots, x^{a_r}$ are algebraically independent on $X$.

For $\dim X > 1$, this is stronger than $X$ not being contained in a proper algebraic subgroup or coset; it may be rephrased by saying that $X$ is nondegenerate if every dominant homomorphism $\mathbb{G}_m^n \to \mathbb{G}_m^h$, $h \ge d = \dim X$, is generically finite on $X$. He proves:

Theorem 1.8. (Habegger [Hab09c].) Let $X \subset \mathbb{G}_m^n$ be a nondegenerate subvariety of dimension $d$ defined over $\overline{\mathbb{Q}}$. Then the set $X^{(n-d-1)}$ is not Zariski-dense in $X$.

This result contains Theorem 1.3, because plainly a curve is nondegenerate if and only if it is not contained in a translate of a proper algebraic subgroup. We also note that if $X^{oa}$ is nonempty, then $X$ is nondegenerate. In fact, otherwise $x^{a_1}, \dots, x^{a_d}$ are algebraically dependent on $X$ for suitable linearly independent $a_i \in \mathbb{Z}^n$. Then, after an automorphism of $\mathbb{G}_m^n$ we may suppose that the $a_i$ are the first $d$ canonical vectors and that $X$ is contained in a hypersurface $f(x_1, \dots, x_d) = 0$. Let $\pi$ denote the projection on the first $d$ coordinates. Then $\pi|_X$ sends $X$ to a variety of dimension at most $d - 1$, hence the nonempty fibers have dimension $> 0$. But these fibers are just intersections of $X$ with translates of the torus $H$ of dimension $n - d$ defined by $x_1 = \dots = x_d = 1$, so they are unlikely intersections of positive dimension. Thus, removal of these fibers empties $X$. (See also Lemmas 2.1, 2.2 of [BMZ07].) A converse of this last assertion is also true; for instance, it follows implicitly from Theorem 1.4 in [BMZ07].
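
A minimal illustration of the fiber argument above, with a surface chosen here for concreteness (it is not an example from the text): take $n = 3$, $d = 2$ and $X = C \times \mathbb{G}_m$ for a curve $C \subset \mathbb{G}_m^2$ given by $f(x_1, x_2) = 0$.

```latex
% Hypothetical example: X = C \times G_m inside G_m^3, C : f(x_1, x_2) = 0.
% Here n = 3, d = dim X = 2, and x_1, x_2 are algebraically dependent on X.
\begin{align*}
\pi &: X \to \mathbb{G}_m^2, \quad (x_1, x_2, x_3) \mapsto (x_1, x_2),
  \qquad \pi(X) = C, \quad \dim \pi(X) = 1 = d - 1,\\
\pi^{-1}(p) &= \{p\} \times \mathbb{G}_m = X \cap (p, 1)\cdot H,
  \qquad H = \{x_1 = x_2 = 1\}, \quad \dim H = 1 = n - d .
\end{align*}
% Each fiber has dimension \delta = 1, and the coset (p,1)H is unlikely:
\[
\dim H = 1 \le n + \delta - \dim X - 1 = 3 + 1 - 2 - 1 = 1 .
\]
% The fibers over the points p of C cover X, hence X^{oa} is empty.
```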

1.3.4 Unlikely intersections of positive dimension and Zilber’s conjecture

The brief history that we have just recalled shows that the study of the height for the set $X^{(n-\dim X)}$ of “likely” intersections naturally involves other truly “unlikely” objects in higher dimensions, related to $X^{oa}$. As mentioned above, a classification of these new unlikely intersections of dimension $> 0$ was carried out in [BMZ07] (and then by Rémond in [Rém09b] for abelian varieties). Roughly speaking, in [BMZ07] it is proved in particular that

Theorem 1.9. ([BMZ07].) The maximal unlikely intersections of positive dimension come from intersections with cosets $cH$ where the algebraic subgroups $H$ have only finitely many possibilities. It also follows that $X^{oa}$ is Zariski-open in $X$.

Note that here we consider not merely the intersections with torsion translates but with all translates of algebraic subgroups. (The corresponding sets have somewhat different significance; this appeared already in the paper [BZ95], concerning lower bounds for heights of algebraic points, where removal of certain nontorsion cosets led to better uniformity.) The paper [BMZ07] actually gives more precise descriptions than Theorem 1.9, moreover with explicit bounds for the involved degrees; this also allows us to find effectively the relevant algebraic subgroups.

Independently of the quoted papers, B. Zilber was led to the topic with completely different motivations from model theory; he sought a model-theoretical study of the formal theory of exponentiation. In other words, he looked at fields having a formal analogue of the exponential function, and he sought, for instance, an analogue of the Nullstellensatz for varieties defined by algebraic operations plus the exponentiation in question. He also had in mind, as an axiom which should hold for such a formal exponential (and actually “responsible for geometric properties of fields with exp”), the celebrated Schanuel’s conjecture: this asserts that for complex numbers $x_1, \dots, x_n$ linearly independent over $\mathbb{Q}$, the transcendence degree $\operatorname{tr\,deg}(x_1, \dots, x_n, \exp(x_1), \dots, \exp(x_n)) \ge n$ (see footnote 24). In the paper [Zil02] Zilber stated a conjecture (Conjecture 1) of which Theorem 1.3, Maurin’s theorem, and Theorem 1.9 are special cases. Here is our statement of it.

24 In the case when $x_1, \dots, x_n$ are algebraic numbers, the conjecture is the famous Lindemann’s theorem, which contains, for instance, the transcendency of $e$ and $\pi$; see [Bak90]. This is essentially the only known case of this conjecture. An analogue for power series has been proved by Ax, and is also relevant in Zilber’s approach.

Zilber’s conjecture. Let $X$ be an irreducible variety in $\mathbb{G}_m^n$, defined over $\mathbb{C}$. Then there is a finite collection $\Omega = \Omega_X$ of translates $T$ of tori of dimension at most $n - 1$ by torsion points such that for every torsion coset $K$ and every component $Y$ of $X \cap K$ satisfying $\dim Y > \dim X + \dim K - n$ one has $Y \subseteq T$ for some $T$ in $\Omega$.

Since every torsion coset has finite index in a suitable algebraic subgroup, one may rephrase the conclusion by replacing the torsion cosets $T$ with proper algebraic subgroups of $\mathbb{G}_m^n$. Note that the conclusion of Zilber’s conjecture allows us inductively to reduce dimension by studying the intersections with finitely many torsion translates of proper subtori, and each torus is of course isomorphic to $\mathbb{G}_m^l$ for some $l < n$. In a sense (by dimension-descent) it predicts a finite description for all the unlikely intersections (of any dimension) with torsion cosets. Actually, note that an unlikely intersection of positive dimension $Y \subset X \cap K$ (i.e., a component of $X \cap K$ of dimension at least $\max(1, \dim X + \dim K - n + 1)$) produces, by further intersection with a torus $H$ of codimension 1, and not contained in $K$, an intersection $Y \cap H \subset X \cap (K \cap H)$, which, for “general” $H$, is still unlikely: in fact, if $H$ meets $Y$, the intersection $Y \cap H$ has dimension $\ge \dim Y - 1 \ge \dim X + \dim K - n = \dim X + \dim(K \cap H) - n + 1$. This phenomenon sometimes allows induction steps, and also shows that the crucial case of the conjecture substantially occurs when $\dim Y = 0$, i.e., when $Y$ is a point. Considering such special cases when $Y$ is a point, the conclusion of the conjecture asserts that

The set $X^{(n-\dim X-1)}$ is contained in a finite union of proper algebraic subgroups (see footnote 25).

Example 1.6 We note that, on taking $X$ to be a curve, this case of the conjecture indeed implies, and is indeed equivalent to, the refined version of Theorem 1.3 mentioned in Subsection 1.3.2, i.e., Maurin’s Theorem 1.6, if we work over $\overline{\mathbb{Q}}$. In fact, the last statement with $\dim X = 1$ reads as predicting that $X^{(n-2)}$ is contained in a finite union of proper torsion cosets. Hence, either $X^{(n-2)}$ is finite or $X$ is actually contained in a torsion coset. As already mentioned, Maurin’s theorem has been extended to complex curves in [BMZ08b], Cor. 1; thus this establishes the Zilber conjecture for curves. Another known case occurs when $\dim X = n - 2$; this was carried out in the appendix to [Sch00] (for tori $K$). The proof was fairly involved (see 1.3.5 below for a bit more on it); another more natural proof (for general torsion cosets $K$) appeared later in [BMZ07] with a partially different method.

Similarly to what happens in the case of Maurin’s theorem, we may take $X$ of the shape $Y \times (\alpha_1, \dots, \alpha_m)$, where $Y$ is a given variety in $\mathbb{G}_m^{n-m}$ and the $\alpha_i$ are nonzero constants, and consider the intersections $X \cap K$ for suitable torsion cosets $K$ of dimension $n - \dim Y - 1$; then it is not difficult to realize that this conjecture contains, for instance, the conclusion of Lang’s conjecture (i.e., Laurent’s theorem) for the intersection $Y \cap \Gamma$, where $\Gamma = \alpha_1^{\mathbb{Z}} \cdots \alpha_m^{\mathbb{Z}}$. This fact itself hints that the statement is indeed very deep. Theorem 1.8 above is clearly in this direction, though with the nondegeneracy assumption and with a conclusion less precise than what would follow from this conjecture. Theorem 1.9, which takes into account only intersections of positive dimension, may be seen as a function-field version of Zilber’s statement (for general algebraic cosets in place of algebraic subgroups).

Zilber’s original formulation was slightly different, and he called atypical the intersections of unlikely dimension. (See, e.g., Conjecture 1 in [Zil02].) He also had other conjectures “with parameters,” i.e., for variable subvarieties. (This corresponds to considering subschemes of $\mathbb{G}_m^n \times B$ over $B$, as in conjectures of Pink to be mentioned in 3.4.1.) Zilber related his conjecture to Schanuel’s, and more precisely he showed that the two conjectures imply a uniform version of Schanuel’s conjecture (Prop. 5 of [Zil02]), in which one classifies the points $(x_1, \dots, x_n, \exp(x_1), \dots, \exp(x_n))$ lying in a given variety $X \subset \mathbb{C}^{2n}$ of dimension $< n$: it predicts in particular that the corresponding points $(x_1, \dots, x_n)$ all lie in a finite number of hyperplanes translated by $2\pi i \cdot \mathbb{Z}^n$. He also stated and sketched a proof of a function field version (Cor. 3 of [Zil02]), which can lead to a(n ineffective) case of Theorem 1.9.

One can also formulate an abelian analogue (as done by Zilber and also by R. Pink in unpublished work [Pin05b]), which this time in particular would contain the Mordell-Lang conjecture. It is possible that the said height bounds of Habegger, combined with the methods of [BHMZ10], allow further steps toward Zilber’s conjecture, for instance an analogue for surfaces of Theorem 1.3 and of Maurin’s theorem. This is the object of work in progress. As to further recent work in these directions, let us finally mention that in a paper [Mau10] (extracted from his thesis) Maurin has announced other powerful results in the same spirit as Habegger’s Theorem 1.8. For instance, he obtains in particular the following strengthening:

Theorem 1.10. (Maurin, [Mau10], Theorem 1.1.) Let $X \subset \mathbb{G}_m^n$ be a nondegenerate algebraic variety of dimension $d$, defined over $\overline{\mathbb{Q}}$, and let $\Gamma \subset \mathbb{G}_m^n(\overline{\mathbb{Q}})$ be a finitely generated subgroup. Then the union $\bigcup_{\dim H \le n-d-1} (X \cap \Gamma H)$ of the intersections is not Zariski-dense in $X$.

25 Actually, by cutting the positive dimensional unlikely intersections with algebraic subgroups, one may, conversely, deduce the full conjecture from this last statement.

This result bears to Theorem 1.8 (which is the case $\Gamma = \{1\}$) the same relation that Maurin’s Theorem 1.6 bears to Theorem 1.3. Also, this plainly contains Maurin’s Theorem 1.6. In the same paper [Mau10], Maurin also obtains another proof, however ineffective, of Habegger’s Theorem 1.7. As in his former result, Maurin uses suitable Vojta’s inequalities, in refined uniform shape, together with other tools related to Rémond’s methods.

1.3.5 Unlikely intersections and reducibility of lacunary polynomials (Schinzel’s conjecture)

Motivated by his studies on the reducibility of lacunary polynomials (see footnote 26), A. Schinzel formulated a conjecture (see Conjecture 1 in [Sch00]), which, roughly speaking, predicted that

If $P, Q$ are given coprime polynomials in $k[x_1, \dots, x_n]$, and if $a_1, \dots, a_n$ are integers, then $P(t^{a_1}, \dots, t^{a_n})$ and $Q(t^{a_1}, \dots, t^{a_n})$ may have only roots of unity as common roots, unless the vector $(a_1, \dots, a_n)$ belongs to one of certain finitely many proper subgroups of $\mathbb{Z}^n$ depending only on $P, Q$.
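
To make Schinzel’s statement concrete, here is the simplest kind of instance, worked out by hand. The polynomials $P = x_1 - 2$ and $Q = x_2 - 4$ are chosen here purely for illustration and do not come from [Sch00].

```latex
% Illustrative choice: n = 2, P = x_1 - 2, Q = x_2 - 4 (coprime in k[x_1, x_2]).
\begin{align*}
P(t^{a_1}, t^{a_2}) = Q(t^{a_1}, t^{a_2}) = 0
  &\iff t^{a_1} = 2 \ \text{and}\ t^{a_2} = 4 = (t^{a_1})^2\\
  &\implies t^{\,a_2 - 2a_1} = 1 .
\end{align*}
% A common root t satisfies t^{a_1} = 2, so t is not a root of unity,
% which forces a_2 - 2 a_1 = 0. Conversely, if a_2 = 2 a_1 with a_1 \ne 0,
% every t with t^{a_1} = 2 is a common root. The exceptional vectors are
\[
(a_1, a_2) \in \{(a, 2a) : a \in \mathbb{Z}\} \subsetneq \mathbb{Z}^2 ,
\]
% a single proper subgroup depending only on P and Q.
% Geometrically: the point (2, 4) lies on the subgroup x_2 = x_1^2 of G_m^2.
```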

26 Here by “lacunary polynomial” we think of a polynomial with a given number of terms and given coefficients but variable degrees; that is, an expression $a_1 t^{m_1} + \dots + a_r t^{m_r}$, where $r$ and the $a_i$ are fixed. For instance, when the $a_i \in \mathbb{Q}$, one can seek a description of the integers $m_1, \dots, m_r$ such that the polynomial is reducible over $\mathbb{Q}$. The case of binomials is classical and goes back to Capelli; already the case of trinomials is much more difficult. Some of the first results go back to E. Selmer, W. Ljunggren, H. Tverberg, who obtained the factorization over $\mathbb{Q}$ of $x^n \pm x^m \pm 1$ for $0 < m < n$.

This question fell in the above realm, because the common zeros of the said polynomials $P(t^{a_1}, \dots, t^{a_n})$ and $Q(t^{a_1}, \dots, t^{a_n})$ correspond to nothing else than the unlikely intersections of the variety $X : P = Q = 0$, of codimension 2 in $\mathbb{G}_m^n$, with the 1-dimensional irreducible algebraic subgroup parametrized by $x_i = t^{a_i}$, $i = 1, \dots, n$. For $n \le 3$, such a statement is actually equivalent to Theorem 1.3, and the conclusion had been proved by Schinzel already in 1989 (see Theorem 45 of [Sch00]); however, those arguments did not extend to larger values of $n$. The methods outlined in this chapter, especially for the proof of Theorem 1.3, could be applied to this context, providing a complete solution of the conjecture of Schinzel, obtained jointly with Bombieri (see the appendix by the author to [Sch00]). By means of this result, Schinzel was then able to complete his irreducibility theory for arbitrary $l$-nomials with given coefficients. (See [Sch00], Ch. 4, for this.) In fact, the said appendix to [Sch00] contains a proof of a version of Zilber’s conjecture for the case of intersections of any given variety $X$ with 1-dimensional tori (which is essentially equivalent to Schinzel’s conjecture).

Let us say a few words on the proofs. One first obtains an analogue of Theorem 1.2 for the intersections of an arbitrary variety with tori of dimension 1 (Lemma 1 in [Sch00], already mentioned in no. 5 of the previous section). Then one again compares heights and degrees, similarly to the arguments for Theorem 1.3; however, this does not lead directly to the sought conclusion, and instead produces certain unlikely intersections of positive dimension from the given unlikely points: this is an “ascent” procedure, and the problem remains to prove the finiteness of the maximal unlikely intersections. For this, the said proof now used a “descent,” on cutting successively with tori of codimension 1, to produce many unlikely intersections again of dimension 0 (see the remarks after the statement of Zilber’s conjecture). In all of this one controls the involved degrees. At this stage, one uses once more the previous ascent, starting with all the new unlikely points so far obtained; if we had no finiteness of maximal unlikely intersections, this eventually would produce so many unlikely intersections of positive dimension as to violate other simple general considerations of degree, concluding the argument.

A partially different and also more natural and simpler approach for this geometric part of the proof (i.e., to control the maximal unlikely intersections of positive dimension) was later obtained in [BMZ07], as an application of Theorem 1.9 above. This actually led to a strengthening of Schinzel’s statement in which one can intersect with arbitrary 1-dimensional torsion cosets (not merely with the 1-dimensional subgroups), obtaining a description of $X^{(1)}$.

Beyond Schinzel’s application, an application of these results has been given by M. Filaseta, A. Granville, and Schinzel [FGS08] (and then by L. Leroux in full generality) to obtain an algorithm for computing the gcd of lacunary polynomials (in time bounded by a power of the height and log degree). Assuming an effective version of the full Zilber conjecture, this has been recently extended to polynomials in several variables by Amoroso, Leroux, and M. Sombra (work in progress). Another application occurs with the proof by Thang Le of a conjecture of K. Schmidt in algebraic dynamical system theory: see [Le10]. (This says that the growth rate of the number of connected components of actions of $\mathbb{Z}^n$ on a compact metric space is equal to the entropy of the torsion part.) A further consequence obtained in [Le10] is the proof of a conjecture of Silver and Williams in topology (saying that the growth rate of homology torsion of abelian coverings is equal to the Mahler measure of the first nonzero Alexander polynomial).

1.3.6 Zhang’s notion of dependence

We have noted that Theorem 1.2 may be rephrased in terms of dependence of values of independent functions, and in this view it is a kind of local-global principle. We have also noted that, although Theorem 1.2 indicates that the corresponding set of points is sparse, actually such a set is infinite, and we need two independent relations to recover finiteness. Shou-Wu Zhang has formulated a different subtler notion of dependence, according to which independent maps on a curve take independent values at all but finitely many points. (However, now the relevant maps shall go to higher-dimensional tori.) Such a statement is more natural and translates the “unlikely” hypotheses into a language of “dependence”; it makes a more precise local-global principle here.

Zhang’s concept of dependence is formulated not quite in terms of the functions but in terms of images; he argues with general commutative algebraic groups, but let us stick for simplicity to the present case of tori $\mathbb{G}_m^n$. Let $f_1, \dots, f_r : X \to \mathbb{G}_m^h$ be rational maps. We say that they are Zhang-independent if no nontrivial monomial $f_1^{a_1} \cdots f_r^{a_r}$ has an image whose Zariski-closure is a torsion coset. This notion is rather different from the naive one; for instance, if the $f_i : X \to \mathbb{G}_m$ are rational functions (so $h = 1$) on $X \subset \mathbb{G}_m^n$, then they turn out to be Zhang-independent if and only if they are multiplicatively independent constants! We skip here any further details; a discussion of Zhang’s dependence and its precise relations with the present context can be found in the appendix to [BMZ06]. In particular, it is proved therein that Maurin’s theorem may be restated as the following assertion:

If $f_1, \dots, f_r : X \to \mathbb{G}_m^h$ are Zhang-independent, then there are only finitely many points $z \in X$ such that the values $f_1(z), \dots, f_r(z)$ are multiplicatively dependent.
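
The claim about the case $h = 1$ can be checked directly from the definition. The short verification below is a sketch added here for illustration; it uses only the definition of Zhang-independence as stated above and assumes $X$ irreducible.

```latex
% Case h = 1: f_1, ..., f_r : X -> G_m rational functions on an irreducible X,
% and g := f_1^{a_1} \cdots f_r^{a_r} with (a_1, ..., a_r) \ne 0.
\begin{itemize}
  \item If $g$ is nonconstant, the Zariski closure of $g(X)$ is all of
        $\mathbb{G}_m$, which is itself a torsion coset; so the $f_i$ are
        \emph{not} Zhang-independent.
  \item If $g$ is a constant $c$, the closure of the image is $\{c\}$,
        which is a torsion coset exactly when $c$ is a root of unity.
\end{itemize}
% Hence Zhang-independence holds iff every nontrivial monomial g is a
% constant that is not a root of unity. Taking g = f_i shows each f_i is
% constant, and "no monomial is a root of unity" is the same as
% multiplicative independence of the constants f_1, ..., f_r.
```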

1.3.7 Abelian varieties (and other algebraic groups)

The whole present context has a natural analogue for (semi)abelian varieties. The question of torsion points in the abelian context has been already discussed briefly in Remark 1.1.5; it was first considered in a conjecture raised independently by Manin and Mumford, solved by Raynaud. We shall consider the Manin-Mumford issue later in Chapter 3, with other methods. As to higher rank, an abelian analogue of Theorem 1.3 is not yet known in full generality, due in particular to the absence of lower bounds for the Néron-Tate height analogous to the ones of Amoroso and David for the Weil height. In some cases sufficiently good bounds are known, e.g., in the case of products of elliptic curves and of CM abelian varieties; here one uses also a geometric analogue of the Amoroso-David bound (i.e., a Bogomolov explicit bound, due also to Amoroso-David). Corresponding theorems have been proved by E. Viada, G. Rémond, and M. Carrizosa in that context. We will not pause further on this, and refer the reader to, for instance, the papers [Via03], [Rém09a], [Rém09b], [Via08], and [Car10] for details and other references. Fully general analogues for abelian varieties are known for the height upper bounds (due to Habegger [Hab09b]) and for the classification of higher-dimensional unlikely intersections (1.3.4 above), due to Rémond (see [Rém09b]).

As to other algebraic groups, for instance linear noncommutative ones, these issues seem not to have been taken into account to date in the literature; we have already observed in Remark 1.1.5 that, via conjugation, the torsion points (and also the algebraic subgroups) may make up whole families, which, however, can sometimes be studied via the above conclusions. Still in other cases, like the additive groups $\mathbb{G}_a^n$, the situation is of a distinctly different flavor, as already remarked in the first footnote in the introduction. (For a subvariety $X$ of $\mathbb{G}_a^n$, the algebraic subgroups are not a discrete family but are parametrized by the Grassmannian, and the set analogous to $X^{(d)}$ would be the intersection of $X$ with the union of linear spaces of dimension at most $d$; this problem is classical in geometry and has no arithmetical ingredient unless we add, e.g., the restriction to subspaces defined over $\mathbb{Q}$ or a number field. In this case, when, for instance, $X$ is a curve and $d = n - 2$, the issue boils down to Faltings’ theorem = Mordell conjecture when $n = 3$, whereas for $n \ge 4$ partially new issues seem to emerge. For $\dim X > 1$ and $d = n - \dim X - 1$, one is led in general to problems nowadays beyond hope.)

1.3.8 Uniformity of bounds

The issue arises about the exact dependence of the bounds, e.g., in Theorem 1.2 and Theorem 1.3 above. An explicit and effective bound for the height in Theorem 1.2 may be obtained by the above-sketched arguments. (Good estimates follow on appealing to explicit proofs of Siegel’s theorem about comparison of heights on curves, carried out, for instance, by Habegger in his thesis.) As to Theorem 1.3, the proof we have sketched is also effective; it goes through Theorem 1.2, and in particular leads to a bound for $|X^{(n-2)}|$ dependent on the height of the curve $X$. The question arises whether there is a bound dependent only on the ambient dimension and the degree of $X$. Masser has verified that, at least for the case of a line in $\mathbb{G}_m^3$, such a uniform bound would follow from Zilber’s conjecture: see Appendix B for his argument. Here we sketch how the idea can be applied in general, which is also a good illustration of the power of the general Zilber’s conjecture:

For simplicity we consider Theorem 1.3 rather than Maurin’s theorem, and assume that $X$ is not contained in any proper coset. Let $z_1, \dots, z_h \in X^{(n-2)}$; here we think of $h$ as being fixed but sufficiently large in terms of $\deg X$ and $n$. We may view these points as producing a point $z_1 \times \dots \times z_h \in X^h \subset (\mathbb{G}_m^n)^h \cong \mathbb{G}_m^{nh}$; this point actually lies in $(X^h)^{((n-2)h)}$. Since $X$ is a curve, if $(x_1, \dots, x_n) \in X$ we have fixed equations $f_j(x_1, x_j) = 0$, $j = 2, \dots, n$. Then, setting $z_p = (z_{p1}, \dots, z_{pn})$, we have $f_j(z_{11}, z_{1j}) = \dots = f_j(z_{h1}, z_{hj}) = 0$. For each fixed $j = 2, \dots, n$, these equations may be viewed as a linear system in the coefficients of $f_j$, the entries being certain monomials in the subset of coordinates $z_{11}, \dots, z_{h1}, z_{1j}, \dots, z_{hj}$ of $z_1, \dots, z_h$. Then, if each $f_j$ has at most $m$ terms and if $h \ge m$, we obtain that all $m \times m$-minors of the corresponding matrix vanish. On letting $j$ go through $2, \dots, n$, this gives a universal (determinantal) variety $V = V_{n,h,m}$ in $(\mathbb{G}_m^n)^h$, which contains our point $z_1 \times \dots \times z_h$. The dimension of $V$ is easily estimated to be at most $h + (m-1)(n-1)$, so if $h + (m-1)(n-1) + (n-2)h < nh$, i.e., if $h > (m-1)(n-1)$, we are in position to apply to $V$ the conclusion of Zilber’s conjecture. Let us then set $h = (m-1)(n-1) + 1$; we obtain that there exist finitely many proper torsion cosets in $(\mathbb{G}_m^n)^h$ containing all the points $z_1 \times \dots \times z_h \in (X^{(n-2)})^h$.

These torsion cosets (in particular their number and degrees) depend only on $V$, hence only on $m, n$. Let us now fix, for instance, $z_2, \dots, z_h$, letting $z_1$ vary. We distinguish the said torsion cosets (supposed to be of codimension 1) depending on whether they contain a torsion-translate of $\mathbb{G}_m^n \times \{1\}^{n(h-1)}$ or not (see footnote 27). If $|X^{(n-2)}|$ is large enough in terms of $n, m, \deg X$, and for enough choices of $z_1$ the point $z_1 \times \dots \times z_h$ lies in a coset of the second type, then we easily see that $X$ must be contained in a proper coset (of bounded degree), against the assumptions. Hence we may assume that for all choices of $z_2, \dots, z_h \in X^{(n-2)}$ the point $z_2 \times \dots \times z_h$ is contained in a coset of the first type. But cosets of the first type correspond to proper cosets in $\mathbb{G}_m^{n(h-1)}$. Then, iterating the last argument shows that after all $|X^{(n-2)}|$ is anyway bounded only in terms of $\deg X$ and $n$.
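
For concreteness, the dimension count in this sketch can be instantiated for a line in $\mathbb{G}_m^3$. The specific shape $f_j = x_j - \alpha_j x_1 - \beta_j$ (so that each $f_j$ has $m = 3$ terms) is an assumption made here for illustration.

```latex
% Illustrative instance: X a line in G_m^3, so n = 3 and m = 3.
\begin{align*}
h &= (m-1)(n-1) + 1 = 2\cdot 2 + 1 = 5, \\
\dim V &\le h + (m-1)(n-1) = 5 + 4 = 9, \\
\dim (\mathbb{G}_m^{n})^{h} &= nh = 15, \\
\dim V + (n-2)h &\le 9 + 5 = 14 < 15 = nh .
\end{align*}
% The point z_1 x ... x z_5 lies in a product of subgroups of dimension
% at most n - 2 = 1 each, hence in a subgroup of dimension at most 5, so
% it is an unlikely intersection and Zilber's conjecture applies to V.
```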

This method of taking determinants to go to a universal variety to gain uniformity has been applied by H.P. Schlickewei and others in different contexts; see, for instance, [BZ95] for the context of the Bogomolov problem for tori. Another method for the above deductions, depending on model theory, and seemingly ineffective, is sketched in Zilber’s paper [Zil02], Theorem 1. One might ask for “complete” uniformity, for instance, for the degrees of the relevant algebraic subgroups of the maximal codimension, containing the points in $X^{(n-2)}$; however, as observed by Masser in Appendix B, uniform bounds here cannot generally hold.

Some (unconditional) uniform estimates for the maximum degree over $\mathbb{Q}$ of the points in $X^{(n-2)}$ appear in [CZ02a] for special cases when $X$ is a line in $\mathbb{G}_m^n$, defined over $\mathbb{Q}$; the underlying method is completely different from what we have seen and relies on Khovanski's theory of fewnomials, for which we refer to [Kho91]. (We note that this theory is also related to methods and results in model theory, which in turn are highly relevant for the results of Chapters 3 and 4; this indicates further connections with model theory.)

27 Note that the first alternative means that the coset may be defined by a binomial equation which does not depend on the first n variables.

Notes to Chapter 1

1. Sparseness of multiplicatively dependent points

As briefly mentioned earlier, it follows already from results in 1989 by Masser [Mas89b] that $X^{(n-1)}$ is a sparse set, in a certain sense explained below. His method predates Theorem 1.2 and is independent of the corresponding proof; we now say a few words about it. The fact at bottom is that if nonzero algebraic numbers $\alpha_1, \ldots, \alpha_n$ in a number field $k$ of degree at most $d$ are multiplicatively dependent, then there is a relation

$$\alpha_1^{m_1} \cdots \alpha_n^{m_n} = 1, \qquad 0 < \max |m_i| \le c(n, d)\,(\max h(\alpha_i))^{n-1}.$$

The proof of this uses the pigeon-hole principle, as in Theorem 1.2, to mimic a multiplicative relation with one having bounded exponents; more specifically, from equation (1.2.2) above, worked out with $\alpha_j$ in place of $x_j(z)$, we get $h(\alpha_1^{b_1} \cdots \alpha_n^{b_n}) \le B^{-1} \left(\sum h(\alpha_i)\right)$ for some integers $b_i$ not all zero and with absolute value $\le B^{n-1}$. By Northcott's theorem, for $B > C \max h(\alpha_i)$, where $C$ is large enough with respect to $n, d$, this forces $\alpha_1^{b_1} \cdots \alpha_n^{b_n}$ to be a root of unity. This lies in $k$, so it has degree $\le d$, and hence has order bounded only in terms of $d$; then, raising to this order produces the sought relation.

Let now $x_1, \ldots, x_n$ be the coordinate functions on a curve $X \subset \mathbb{G}_m^n$, defined over $\overline{\mathbb{Q}}$. By letting, for instance, $x_1$ be nonconstant and considering all the points $z \in X$ such that $x_1(z)$ is a rational number of height $\le T$, we obtain at least $\gg \exp(2T)$ algebraic points $z \in X$ of (logarithmic) height $\ll T$ and bounded degree. Let us now apply the above remark with $\alpha_i := x_i(z)$. We obtain that if $z \in X^{(n-1)}$ there is a relation $x_1(z)^{m_1} \cdots x_n(z)^{m_n} = 1$, with $0 < \max |m_i| \ll T^{n-1}$. But the total number of such choices for the $m_i$ is $\ll T^{n(n-1)}$. Each such choice produces at most $\ll \max |m_i| \ll T^{n-1}$ values of $z$, provided $X$ is not contained in any proper torsion coset. So, out of $\gg \exp(2T)$ such algebraic points (of bounded degree and height $\ll T$), only $\ll T^{n^2}$ can give rise to multiplicative dependence.
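The existence of a multiplicative relation with small exponents can be made concrete in the simplest setting of rational numbers. The following is only a brute-force sketch, not Masser's pigeon-hole argument: the function name `find_relation` and the search bound are our own illustrative choices, and the search simply enumerates all exponent vectors with $\max |m_i|$ below the bound.

```python
from fractions import Fraction
from itertools import product


def find_relation(alphas, bound):
    """Search for integers (m_1, ..., m_n), not all zero, with max |m_i| <= bound
    and alpha_1**m_1 * ... * alpha_n**m_n == 1.

    `alphas` must be nonzero rationals (ints or Fractions).
    Returns the first exponent vector found, or None if there is none.
    """
    alphas = [Fraction(a) for a in alphas]
    for m in product(range(-bound, bound + 1), repeat=len(alphas)):
        if not any(m):
            continue  # skip the trivial relation m = 0
        value = Fraction(1)
        for a, e in zip(alphas, m):
            value *= a ** e  # exact arithmetic, negative exponents allowed
        if value == 1:
            return m
    return None
```

For instance, 2, 3, 6 are multiplicatively dependent (since 2 · 3 = 6), whereas 2, 3, 5 are independent by unique factorization, so the search returns `None` for any bound.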

Note that this method works also for curves contained in a proper algebraic nontorsion coset, so the conclusion is not quite contained in Theorem 1.2. (Also, the method may be easily applied to related questions, such as producing algebraic points on $X$ whose coordinates are multiplicatively independent modulo $k^*$, where $k$ is a given number field.)

2. Other unlikely intersections

The power of the general conjecture of Zilber is well illustrated by showing how it applies to other related but seemingly different issues. An example already occurred in connection with the uniformity of bounds in 1.3.8 of the previous section. As another instance, let us briefly discuss the following attractive question raised by A. Levin:

Suppose that $X, Y$ are curves in $\mathbb{G}_m^n$. What can be said about the points $x \in X$ such that some power $x^m$ ($m = m_x \neq 0$) lies in $Y$?

When $n = 2$, in general the curves $[m]X$ and $Y$ shall certainly intersect for almost all integers $m$,²⁸ and the whole relevant set of intersections shall be infinite; the same happens if $X, Y$ lie in the same torsion coset of dimension 2. And if $[l]X = Y$ for some integer $l \neq 0$, then we can take $m_x = l$ for all $x$, and any $x \in X$ shall be in the set. But otherwise, for a random point $x \in X$, we expect this to be unlikely to happen. So, we are considering unlikely intersections, though seemingly different compared to what we have seen. Nevertheless, this fits into the previous context as follows.

Consider the surface $X \times Y \subset \mathbb{G}_m^{2n}$. Any point $x$ as above produces a point $z := x \times x^m$ which lies in the intersection of $X \times Y$ with the algebraic subgroup of codimension $n$ defined in $\mathbb{G}_m^{2n}$ by $x_{n+j} = x_j^m$, for $j = 1, \ldots, n$. For $2 + n < 2n$, i.e., for $n > 2$, the intersection is unlikely. Then, if we apply the conclusion of Zilber's conjecture we obtain that each such point $z$ is contained in the union of certain finitely many proper torsion cosets in $\mathbb{G}_m^{2n}$, given in advance. Now, take one of these cosets, $gG$. If $gG$ contains $X \times Y$, then also $X, Y$ must be contained in proper algebraic cosets; this case is easily analyzed and may be dealt with by induction. If $gG$ does not contain $X \times Y$, then $gG \cap (X \times Y)$ is (at most) a curve. If this curve is a torsion coset, then by projection we see that also $X$ or $Y$ is a torsion coset, and again this leads to an easy analysis. If this curve is not a torsion coset, then using Zilber's conjecture again (now Maurin's theorem suffices) leads to finiteness. In conclusion, in this way we may deduce from Zilber's conjecture that the only cases when the said set is infinite indeed come from the opening observations.

28 Exceptions may occur, but it is not difficult to describe them; for instance, it may be shown that $X$ must then be a translate of a subtorus.

3. A generalization of Theorem 1.3

We point out that another proof of Theorem 1.3 is due to Habegger, moreover in an improved form. Namely, let us define, for a subset $H \subset \mathbb{G}_m^n(\overline{\mathbb{Q}})$ and $\epsilon > 0$, the "truncated $\epsilon$-cone" $C(H, \epsilon) := \{xy,\ x \in H,\ y \in \mathbb{G}_m^n(\overline{\mathbb{Q}}) : h(y) \le \epsilon(1 + h(x))\}$. Let us also define $X^{(d)}(\epsilon)$ as the intersection of $X$ with the union of $C(H, \epsilon)$ for $H$ ranging over all algebraic subgroups of dimension $d$. Clearly, for any $\epsilon > 0$ this contains $X^{(d)}$. In Theorem 1 of [Hab09a], Habegger proves finiteness of $X^{(n-2)}(\epsilon)$, provided $X$ is a curve as in Theorem 1.3, and provided $\epsilon > 0$ is smaller than a positive quantity that can be effectively computed in terms of $n$ and the dimension and the height of $X$.

For this proof Habegger uses a different method, based this time on a geometric version of the result by Amoroso-David, due to the same authors; they prove an explicit and uniform Bogomolov property, stating that: For a curve $Y \subset \mathbb{G}_m^n$ ($n > 1$) not in a proper coset and for $z \in Y(\overline{\mathbb{Q}})$, the height satisfies $h(z) \gg_{\epsilon,n} (\deg Y)^{(-1-\epsilon)/(n-1)}$, up to a finite set of such points $z$. It is crucial here that the implicit constant depends only on $\epsilon$ and $n$. Habegger applies this result to auxiliary curves $Y$; such curves are finite in number and depend on the points under consideration; they are suitably constructed by geometry of numbers as homomorphic images of $X$ to some $\mathbb{G}_m^l$, and dependently on a large parameter. (Roughly, if the point $x \in H$ satisfies $x^a = x^b = 1$, then one, for instance, mimics $a, b$ with bounded vectors $a', b'$ and uses the homomorphism $x \mapsto (x^{a'}, x^{b'})$ from $X$ to $\mathbb{G}_m^2$, whose image defines $Y$.) Habegger's paper contains also further results for $X$ of higher dimension. Other conclusions in this direction (for instance, with the milder assumption of Maurin's theorem) are also contained in the recent paper [Mau10] by Maurin.

4. An application of the methods to zeros of linear recurrences

Motivated by a certain problem in algebraic geometry (concerning tangents to locally toric curves in $\mathbb{P}^n$), M. Bolognesi and G. Pirola were led in [BP10b] to the following question, related to zeros of linear recurrences. Let $\xi$ be a complex number with $|\xi| > 1$ and let $f \in \mathbb{C}[x]$ be a complex polynomial of given degree $d$.

Question: How many integral zeros can a function of the shape $\xi^m - f(m)$ have?

Special cases of this question actually appeared in previous papers on related geometrical problems. The question fits into the widely studied issue of bounding the number of integral zeros of exponential polynomials (which express solutions of linear recurrences). It is a corollary of a very general deep result of W.M. Schmidt that the number of zeros here is bounded only in terms of $d$; however, in our special context we ask for more precise information.

By translation we can assume that the least zero is $m = 0$ and by rescaling we can assume that the other zeros are coprime positive integers $0 < m_1 < m_2 < \ldots < m_r$. For $d = 0$ plainly we can have at most a single integral zero. Let $d \ge 1$ and, for given distinct integers $u_0 = 0, u_1, \ldots, u_{d+1}$, consider the determinant of the matrix with rows $(1, u_i, \ldots, u_i^d, \xi^{u_i})$, $i = 0, 1, \ldots, d+1$, in the variable $\xi$. A zero $\xi$ of the determinant of this matrix, with $|\xi| > 1$ (it exists for $d > 0$ and suitable $u_i$), shall give a solution to our problem, with $r = d + 1$ (i.e., $d + 2$ integral zeros). The question arises whether there can be $d + 3$ or more zeros.

N. Elkies and Bolognesi-Pirola have independently shown by a simple elegant argument that this cannot happen for $d = 1$: they multiply by a conjugate equation to obtain $|\xi|^{2m} = Q(m)$ for $m = 0 < m_1 < m_2 < m_3$, where $Q$ is a real quadratic polynomial. Now Rolle's theorem applied three times leads to a contradiction. However, in general such a method only shows that there are at most $2d + 1$ zeros and does not answer the question for $d > 1$. In the appendix to the paper [BP10b], joint with P. Corvaja, it is shown that there are at most finitely many exceptions for $d = 2$, which moreover can be found effectively.

Here, the relevance of this issue comes from the fact that the method of proof of this result is strictly related to the one for Theorem 1.3: if there are five integral zeros, we obtain as above a $4 \times 5$ matrix of rank $< 4$. So we obtain at least two independent vanishing determinants. This vanishing defines a certain linear variety $X$ of codimension 2 in $\mathbb{G}_m^5$, with a point $p := (\xi^{m_1}, \ldots, \xi^{m_5})$ in a 1-dimensional torus $H$: an unlikely intersection (in $X^{(1)}$). Now, contrary to Theorem 1.3, the variety $X$ varies together with the torus $H$ and the point $p$, dependently on the $m_i$; nevertheless, this variation has a tame nature (polynomial in the entries $m_i$). Consequently, the said determinantal equations yield a good upper bound for $h(\xi)$, in analogy with Theorem 1.2 (now one uses essentially Lemma 1 in the appendix to [Sch00], on heights of solutions of lacunary equations). This is then compared to a lower bound coming from Dobrowolski's theorem (i.e., the case $r = 1$ of the theorem of Amoroso-David). So one obtains a good lower bound for $[\mathbb{Q}(\xi) : \mathbb{Q}]$. Finally, as in the proof of Theorem 1.3, we may construct a 2-dimensional torus of small degree containing the said point. Via Bezout's theorem this yields an upper bound for $[\mathbb{Q}(\xi) : \mathbb{Q}]$, which contradicts the previous lower bound if $\max |m_i|$ is large enough. (Of course many verifications are necessary along the way.) In principle, this method can be applied for any given $d$. We do not pause further on this issue and refer to [BP10b], especially the appendix, for details.
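The determinantal construction of $d + 2$ integral zeros can be checked by hand in the smallest case $d = 1$. The sketch below uses a toy instance of our own choosing (it is not taken from [BP10b]): with $(u_0, u_1, u_2) = (0, 1, 3)$ the determinant equals $\xi^3 - 3\xi + 2 = (\xi - 1)^2(\xi + 2)$, so $\xi = -2$ is a zero with $|\xi| > 1$, and the corresponding linear polynomial is $f(m) = 1 - 3m$. The helper names are hypothetical.

```python
from fractions import Fraction


def det3(rows):
    """Determinant of a 3x3 matrix given as three rows (exact arithmetic)."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def recurrence_matrix(xi, us):
    """Rows (1, u_i, xi**u_i) of the matrix for degree d = 1."""
    return [(1, u, Fraction(xi) ** u) for u in us]


def integral_zeros(xi, f, search):
    """Integers m in `search` with xi**m == f(m), computed exactly."""
    return [m for m in search if Fraction(xi) ** m == f(m)]


xi = -2
us = (0, 1, 3)


def f(m):
    # the linear polynomial interpolating xi**u at u = 0 and u = 1
    return 1 - 3 * m


zeros = integral_zeros(xi, f, range(-40, 41))
```

Here `det3(recurrence_matrix(xi, us))` vanishes, and within the searched window the function $(-2)^m - (1 - 3m)$ has exactly the three integral zeros $0, 1, 3$, in agreement with the statement that $d + 3$ zeros cannot occur for $d = 1$.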

Comments on the Methods

Before going to the next chapters, let us pause to review the steps of the methods of proof that we have encountered so far, for instance sticking to Theorem 1.3. The main points were to take conjugates and to compare estimates for the degree of the relevant unlikely points. Namely, let $G$ be an algebraic subgroup of codimension 2, and $z \in X \cap G$; we compared:

(LB) A Lower Bound for the degree of $z$ (in terms of $\deg G$): this comes from an upper bound for the height of points $z \in X \cap G$ (through Theorem 1.2), which in turn yields very small height for suitable generators of the group $x_1(z)^{\mathbb{Z}} \cdots x_n(z)^{\mathbb{Z}}$ (through geometry of numbers). Then one applies results from Lehmer's problems (i.e., Amoroso-David), which relate degree and height.

(UB) An Upper Bound for the degree of any point in X ∩ G (hence for the degree of z): this comes from Bezout’s theorem for X ∩ H, applied to an algebraic subgroup H containing G, of codimension 1 and small degree (constructed by geometry of numbers). We shall see that these degree estimates and comparisons have relevance also in the other methods we shall discuss, but are obtained through rather different tools. In other contexts there is also a geometric part (GP), which here is hidden in Step (UB); namely, to apply Bezout’s theorem we need that X is not contained in G. Here this is ensured by our very assumptions, whereas in other cases one must work out the corresponding possibility and prove that this leads to special varieties in the appropriate sense. (An example occurs with the first solution to Lang’s original question that we sketched at the beginning of this chapter. One has to deduce that if [l]X ⊂ X then X is a torsion coset.)

Chapter 2

An Arithmetical Analogue

In this chapter we shall pause somewhat and, although not entirely leaving the context, we shall move slightly in another direction, to describe a purely arithmetical analogue of Theorem 1.3 above. This material was not mentioned in the lectures, for lack of time and because its nature is partially different from the rest.

2.1 Some unlikely intersections in number fields

Recall that Theorem 1.3 considers intersections of a curve $X$ with algebraic subgroups of codimension 2; these intersections come from pairs of multiplicative relations of the shape $x_1^{a_1} \cdots x_n^{a_n} = 1$, $x_1^{b_1} \cdots x_n^{b_n} = 1$, where $x_i$ are fixed rational (coordinate) functions on $X$ and $(a_1, \ldots, a_n), (b_1, \ldots, b_n) \in \mathbb{Z}^n$ are linearly independent integer vectors. Note that, under the assumption that $X$ is not contained in any proper algebraic subgroup, the independence of these vectors amounts to the multiplicative independence of the rational functions on $X$ given by $u := x_1^{a_1} \cdots x_n^{a_n}$, $v := x_1^{b_1} \cdots x_n^{b_n}$; these functions lie in the multiplicative subgroup $\Gamma = x_1^{\mathbb{Z}} \cdots x_n^{\mathbb{Z}}$ of $k(X)^*$ generated by the $x_i$. In other words, in Theorem 1.3 we are considering the points $z \in X$ such that $u(z) - 1 = v(z) - 1 = 0$, i.e., the common zeros of $u - 1$, $v - 1$, for varying multiplicatively independent $u, v \in \Gamma$, and we prove (global) finiteness of the set of such points.

A question in some way analogous is then to take multiplicatively independent numbers $u, v$ from a finitely generated subgroup $\Gamma \subset \mathbb{Q}^*$ and consider the prime numbers dividing (the numerators of) both $u - 1$ and $v - 1$. And naturally we may work with $\overline{\mathbb{Q}}$ in place of $\mathbb{Q}$. The unlikely intersections here are the said prime numbers. Now, of course, we do not have finiteness of such primes for $u, v$ running in $\Gamma$: if $\Gamma$ has rank $> 1$, each prime occurs as a divisor of infinitely many numbers $t - 1$ for pairwise multiplicatively independent $t \in \Gamma$. However, we may ask to bound the set of unlikely primes for given $u, v$. For this, we may look for a bound for $\gcd(u - 1, v - 1)$, in terms of $u, v$.¹ (There is, of course, an appropriate notion of gcd over $\overline{\mathbb{Q}}$, which we shall give below.)

1 Note that in this way we count also multiplicities of the present unlikely intersections.

2 The converse deduction is clear. The statement is a special case of a theorem of Pourchet-van der Poorten, solving a problem of Pisot on the divisibility between values of two linear recurrence sequences; see [Rum88] for a

This issue arose also in a context independent of the previous chapter; for instance, an old elementary problem asked to prove that for integers $a, b \ge 2$, if $a^n - 1$ divides $b^n - 1$ for all natural numbers $n$, then $b$ must be a power of $a$.² Thinking of the functions $n \mapsto a^n - 1$, $n \mapsto b^n - 1$, we again see here a local-global principle: if there is divisibility between the values, there is identical divisibility in the appropriate ring of functions. There are two different short proofs of this statement that I know (one is completely elementary, the other one uses Chebotarev's theorem). In the paper [CZ98], the first of these arguments was refined by introducing in it the subspace theorem of Schmidt (a higher-dimensional extension of Roth's theorem, for which we refer to [BG06]). The said conclusion was obtained under the assumption holding merely for an infinity of integers $n$ rather than all integers $n$. Then, in the paper [BCZ03] with Y. Bugeaud, a relevant variation was introduced, which allowed one to quantify this principle in an essentially optimal way. The conclusion was that for all $\epsilon > 0$, $\gcd(a^n - 1, b^n - 1) \ll_{a,b,\epsilon} \exp(\epsilon n)$ provided $a, b$ are multiplicatively independent.³ (This independence is a necessary assumption: otherwise $a = c^r$, $b = c^s$ for integers $c > 1$, $r, s$, and the gcd is $\gg c^n$.) Finally, the papers [CZ03] and [CZ05] proved the following, denoting by $H(t)$ the exponential Weil height of $t$, i.e., $H(r/s) = \exp h(r/s) = \max(|r|, |s|)$ for coprime integers $r, s$:

Theorem 2.1. Version 1. ([CZ05].) Let $\Gamma$ be a finitely generated subgroup of $\mathbb{Q}^*$, and let $\epsilon > 0$. Then the Zariski closure of the set of pairs $u, v \in \Gamma$ with $\gcd(u - 1, v - 1) \ge \max(H(u), H(v))^{\epsilon}$ is the union of a finite set with finitely many proper algebraic subgroups of $\mathbb{G}_m^2$.

The paper [CZ03] worked over $\mathbb{Q}$, whereas [CZ05] contains a generalization to number fields and to pairs of bivariate polynomials other than $u - 1$, $v - 1$. Taking into account that algebraic subgroups of $\mathbb{G}_m^2$ are of the shape $x^a y^b = 1$, this implies (and it is easy to see that it is, in fact, equivalent to):

Theorem 2.1. Version 2. For multiplicatively independent $u, v \in \Gamma$ we have $\gcd(u - 1, v - 1) \ll_{\Gamma,\epsilon} \max(H(u), H(v))^{\epsilon}$.
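The contrast between multiplicatively dependent and independent pairs can be explored numerically in the special case $u = a^n$, $v = b^n$. The following is a small experiment of our own, not part of the cited papers; the function names are illustrative. It compares $\log \gcd(u - 1, v - 1)$ with $\log \max(H(u), H(v)) = n \log \max(a, b)$. Note that the theorem is asymptotic, with an implied constant depending on $\epsilon$, so for small $n$ the ratio for an independent pair need not be small.

```python
from math import gcd, log


def gcd_pair(a, b, n):
    """gcd(a**n - 1, b**n - 1) for integers a, b >= 2 and n >= 1."""
    return gcd(a ** n - 1, b ** n - 1)


def exponent_ratio(a, b, n):
    """log gcd(u - 1, v - 1) / log max(H(u), H(v)) for u = a**n, v = b**n."""
    return log(gcd_pair(a, b, n)) / (n * log(max(a, b)))


if __name__ == "__main__":
    for n in (6, 12, 60, 120):
        independent = exponent_ratio(2, 3, n)  # 2, 3 multiplicatively independent
        dependent = exponent_ratio(4, 8, n)    # 4 = 2**2, 8 = 2**3 are dependent
        print(n, round(independent, 3), round(dependent, 3))
```

For the dependent pair $a = 4$, $b = 8$ one has $\gcd(4^n - 1, 8^n - 1) = 2^n - 1$ exactly, so the ratio tends to $1/3$ and the bound of Version 2 fails, as it must.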

The result may be generalized to arbitrary $\Gamma \subset \overline{\mathbb{Q}}^*$, with the appropriate notion of gcd. As to the definition of gcd in a number field, one may measure it through a gcd height, defined by Silverman as

$$h_{\gcd}(x, y) := \sum_{v \in M_k} \min\left(\log^+ |x^{-1}|_v,\ \log^+ |y^{-1}|_v\right),$$

where $M_k$ is the set of places of the number field $k$, normalized so that the product formula holds and the absolute logarithmic Weil height is $h(x) = \sum_v \log^+ |x|_v$; here $\log^+ z = \max(0, \log z)$. The analogous result over number fields reads as follows, where we denote by $\mathcal{O}_S = \mathcal{O}_{k,S}$ the ring of $S$-integers (those algebraic numbers in $k$ which are integral outside the places in $S$) and by $\mathcal{O}_S^*$ the group of invertible elements in $\mathcal{O}_S$, i.e., the so-called $S$-units:

Theorem 2.1. Version 3. For $\epsilon > 0$ there exists a constant $C = C(k, S, \epsilon)$ such that for multiplicatively independent $u, v \in \mathcal{O}_S^*$, we have $h_{\gcd}(u - 1, v - 1) \le \epsilon \max(h(u), h(v)) + C$.

Observe that, in fact, it suffices to have all these statements for $\Gamma = \mathcal{O}_{k,S}^*$, where $k$ is a number field and $S$ is a finite set of places (because any finitely generated $\Gamma$ is included in one such group).
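Over $k = \mathbb{Q}$ the gcd height can be computed directly from the definition, place by place, and for nonzero integers it reduces to $\log \gcd(x, y)$. The following is a minimal sketch under these assumptions: it treats only rational arguments, the helper names are our own, and the factorization is by naive trial division.

```python
from fractions import Fraction
from math import isclose, log


def prime_divisors(n):
    """Set of primes dividing the nonzero integer n (trial division)."""
    n, p, found = abs(n), 2, set()
    while p * p <= n:
        while n % p == 0:
            found.add(p)
            n //= p
        p += 1
    if n > 1:
        found.add(n)
    return found


def valuation(x, p):
    """p-adic valuation of the nonzero rational x."""
    def v(n):
        count = 0
        while n % p == 0:
            n //= p
            count += 1
        return count
    return v(x.numerator) - v(x.denominator)


def log_plus(t):
    """log^+ t = max(0, log t) for t > 0."""
    return log(t) if t > 1 else 0.0


def h_gcd(x, y):
    """Gcd height of nonzero rationals x, y, summed over the places of Q."""
    x, y = Fraction(x), Fraction(y)
    # archimedean place
    total = min(log_plus(abs(1 / x)), log_plus(abs(1 / y)))
    # finite places: log^+ |1/x|_p = max(0, v_p(x)) * log p, which is positive
    # only when p divides the numerator of x
    for p in prime_divisors(x.numerator) & prime_divisors(y.numerator):
        total += min(valuation(x, p), valuation(y, p)) * log(p)
    return total
```

For nonzero integers only the finite places contribute, so that for instance `h_gcd(12, 18)` equals $\log 6$; for non-integral arguments such as $1/2$ and $1/4$ the archimedean place contributes as well.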

To put these results in broader perspective, we note that the $S$-units are the $S$-integral points relative to $\mathbb{G}_m$, which is $\mathbb{P}^1 \setminus \{0, \infty\}$ as a variety. Accordingly, there are several other natural

careful exposition, and see [CZ02b] for a sharpening.

3 As to an opposite bound, one can recall the estimate by Alford, Granville and Pomerance: $\gcd(a^n - 1, b^n - 1) \gg \exp(n^{\gamma / \log\log n})$ for an infinity of $n$ and all $a, b$; one takes $n$ to be a multiple of several $p - 1$, for suitable primes $p$.

diophantine statements on $S$-units, not unrelated to the present ones; one of the simplest asserts the finiteness of the solutions of the $S$-unit equation, i.e., the $x, y \in \mathcal{O}_S^*$ such that $x + y = 1$; this expresses the finiteness of $S$-integral points for $\mathbb{P}^1 \setminus \{0, 1, \infty\}$, and is the genus-0 case (already rather deep) of the famous Siegel's theorem on integral points. (It is also an analogue of Picard's little theorem for holomorphic functions.) The celebrated abc-conjecture of Masser and Oesterlé may be seen as a far-reaching uniform version of this finiteness assertion. We recall that the simplest version of this conjecture (yet seemingly out of the present horizons) asserts that for any $\epsilon > 0$ there exists a number $c(\epsilon)$ such that for any coprime integers $a, b, c$ with $a + b + c = 0$ we have $\max(|a|, |b|, |c|) \le c(\epsilon) P^{1+\epsilon}$, where $P$ is the product of distinct prime numbers dividing $abc$. See [BG06] Ch. 14, for an ample discussion, also of some implications of this statement.

Example 2.1 Let us pause to see what could be obtained directly for our gcd from the abc-conjecture; let $u, v \in \mathbb{Z}$ be $S$-units and put $d = \gcd(u - 1, v - 1)$, so $u - 1 = dm$, $v - 1 = dn$, where we suppose that $|u| \ge |v|$, so that $|n| \le |m|$. This yields $nu - mv = n - m$. Viewing this as an abc-equation, we derive from the abc-conjecture (taking into account that $u, v$ are $S$-units) that $|nu| \ll |mn(m - n)|^{1+\epsilon}$, whence $|u| \ll |m^2|^{1+2\epsilon}$ and in turn $|d| \ll |m|^{1+4\epsilon}$, which leads to $|d| \ll |u|^{\frac{1}{2}+2\epsilon}$. This is indeed more or less the best that can be obtained without using the multiplicative independence hypothesis.

Coming back to our context, the said finiteness statement for $x + y = 1$ may in fact be reformulated and sharpened by means of Lang's general formulation of Roth's theorem in diophantine approximation (see [Lan83]); with the aid of this result one may obtain: For an $S$-unit $x \in \mathcal{O}_S^*$, the contribution to the height $h(1/(x - 1))$ coming from the places in $S$ is negligible with respect to $h(x)$. In other words, the equation $x + y = 1$ has only finitely many solutions where $x$ is an $S$-unit and $y$ is an "almost" $S$-unit. Contrary to this and other statements on $S$-units, the above Theorem 2.1 (and the related results) instead bounds a contribution to the height coming from places outside $S$. This is a relevant difference because $S$ is finite, whereas there is no a priori control on its complement.

As in [CZ05], we note that a further formulation of Theorem 2.1 takes this shape: For multiplicatively independent $u, v \in \Gamma$ we have $h\left(\frac{u-1}{v-1}\right) \sim h(1 : u : v)$, expressing that there is not much cancellation in the fraction $(u - 1)/(v - 1)$. On the right we have referred to the projective Weil height, defined, for a point $P = (x_0 : \ldots : x_n) \in \mathbb{P}^n(k)$, as

$$h(P) = h(x_0 : \ldots : x_n) := \sum_{v \in M_k} \log \sup(|x_0|_v, \ldots, |x_n|_v).⁴ \tag{2.1.1}$$

In fact, as pointed out by Masser (and implicitly verified in [CZ05]), we have $h_{\gcd}(x, y) = h(1 : x : y) - h(x : y) = h(1 : x : y) - h(x/y)$. Such height estimates may be further generalized to estimates for $h(f(u, v))$ for a rational function $f = P/Q$; this amounts to controlling the gcd between numerator $P(u, v)$ and denominator $Q(u, v)$, evaluated at $(u, v)$. (This is obtained in [CZ05] by the resultant elimination between numerator and denominator to reduce to the case of separated variables.) J. Silverman (see, e.g., [Sil05]) went further and interpreted the gcd as a height on a blowup; he observed that

$$h_{\gcd}(u - 1, v - 1) = h_{\tilde{X},E}(\pi^{-1}(u, v)) + O(1), \tag{2.1.2}$$

4 By the so-called product formula, this does not depend on the choice of projective coordinates of $P$. See [BG06].

where $\tilde{X}$ = blowup of $\mathbb{G}_m^2$ at $(1, 1)$ and $E$ is the exceptional divisor.⁵ With this illuminating reformulation, Silverman was able to read the result as a case of the following:

Vojta's conjecture: Let $X$ be an irreducible smooth projective variety, $D$ a normal crossing divisor, $K$ a canonical divisor, $A$ an ample divisor, $\epsilon > 0$. Then $h_{D,S}(P) + h_K(P) \le \epsilon h_A(P)$ for $P$ out of a proper subvariety $Z \subset X$. Here $h_{D,S}$ is the $S$-part of the height.

To relate the above Theorem 2.1, Version 3, with this statement, we take $X$ = blowup of $\mathbb{P}^2$ at $(1, 1)$, $D = L_1 + L_2 + L_3$, where $L_i$ are the pullbacks of the lines defined by the homogeneous coordinates; then a canonical divisor is $\sim -3L_1 + E$, where $E$ is the exceptional divisor. Now the equivalence follows from (2.1.2). All of this also indicates that the result really is related to integral points on surfaces.

Silverman's view also led to the "correct" generalization of the statements to other algebraic groups; for instance, in the case of elliptic curves one should obtain $\log \gcd(D(nP), D(nQ)) \le \epsilon n^2 + O(1)$, where $D(P)$ is the denominator at a rational point $P$ and $P, Q$ are given linearly independent points in $E(\mathbb{Q})$. However, the proofs seem not to carry over to contexts other than $\mathbb{G}_m^2$, and even the simplest next cases, like $\mathbb{G}_m^3$, $E \times \mathbb{G}_m$ or $E^2$ ($E$ an elliptic curve), remain open.

Sketch of proof of Theorem 2.1. We restrict for simplicity to $k = \mathbb{Q}$ and integer $S$-units $u, v \in \mathbb{Z} \cap \mathcal{O}_S^*$; here $S$ is a finite set of prime numbers; actually, we think of it as a set of absolute values, containing the usual one as well. The idea is that if there is a big gcd between $1 - u$, $1 - v$, then the fraction $q := (1 - u)/(1 - v)$ has a small denominator $D = D(u, v)$, where we suppose $|v| \ge |u|$. The main point is now to approximate such a fraction with a sum of $S$-units. At any place $\nu \in S$ where $v$ is large, we may use, for instance, a truncated geometric series and write, for a fixed $h$,

$$q := \frac{1 - u}{1 - v} = -(1 - u)\left(\frac{1}{v} + \frac{1}{v^2} + \ldots + \frac{1}{v^h}\right) + O(u v^{-h-1}). \tag{2.1.3}$$
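The error term in (2.1.3) can be made completely explicit: summing the finite geometric series shows that the difference between $q$ and the truncated expansion equals $q v^{-h}$, which has size about $|u| |v|^{-h-1}$. The following sketch verifies this identity in exact rational arithmetic; the function names and the sample values $u = 8$, $v = 27$ (which are $S$-units for $S = \{2, 3, \infty\}$) are our own choices for illustration.

```python
from fractions import Fraction


def quotient(u, v):
    """q = (1 - u) / (1 - v), computed exactly (v != 1)."""
    return (1 - Fraction(u)) / (1 - Fraction(v))


def truncated_series(u, v, h):
    """Main term of (2.1.3): -(1 - u) * (1/v + 1/v**2 + ... + 1/v**h)."""
    u, v = Fraction(u), Fraction(v)
    return -(1 - u) * sum(v ** (-i) for i in range(1, h + 1))


def remainder(u, v, h):
    """Exact error q - (main term); it equals q / v**h."""
    return quotient(u, v) - truncated_series(u, v, h)
```

Since the remainder is exactly $q v^{-h}$, multiplying by the denominator $D$ of $q$ shows that the linear form used below takes a value of size about $D |u| |v|^{-h-1}$ at the archimedean place.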

In the present special case when $u, v$ are integers in $\mathbb{Z}$, it shall be enough to use this approximation with respect to the usual absolute value. We may view this approximation as giving a small linear form $q + v^{-1} + \ldots + v^{-h} - uv^{-1} - \ldots - uv^{-h}$ in the $2h + 1$ quantities $q, v^{-1}, \ldots, v^{-h}, uv^{-1}, \ldots, uv^{-h}$. The proof now appeals to the subspace theorem of Schmidt (more precisely a version of Schlickewei with arbitrary places); this is a far-reaching extension of Roth's theorem, to the case of diophantine approximation in higher dimensions, i.e., to linear spaces rather than points. For the reader's convenience, we recall here the version we need, for rational points. (See [BG06] for more general ones and for a proof.)

Subspace theorem. Let $S$ be a finite set of absolute values of $\mathbb{Q}$, including the usual one and normalized so that $|p|_\nu = p^{-1}$ if $\nu$ corresponds to the prime $p$. For $\nu \in S$, let $L_{1\nu}, \ldots, L_{N\nu}$ be linearly independent linear forms in $N$ variables with rational coefficients, and let $\epsilon > 0$. Then there is a finite union of proper subspaces of $\mathbb{Q}^N$ containing all the solutions $x \in \mathcal{O}_S^N$ to the inequality $\prod_{\nu \in S} \prod_{i=1}^{N} |L_{i\nu}(x)|_\nu < H(x)^{-\epsilon}$.⁶

5 We are using here the geometric definitions of heights, for which we refer to [BG06]. With this in mind, to verify (2.1.2), let $x = X/Z$, $y = Y/Z$ in place of $u - 1$, $v - 1$, where $X, Y, Z$ are homogeneous coordinates on $\mathbb{P}^2$. Also, let $W$ be the blowup of $\mathbb{P}^2$ at the origin $(0 : 0 : 1)$; the function $f(p) = x/y = X/Y : W \to \mathbb{P}^1$ is regular. The pullback $f^*(\infty)$ is the proper transform in $W$ of $Y = 0$. Hence, if we write $x = q x_1$, $y = q y_1$ with coprime $x_1, y_1$ and $q$ their gcd, we have $H(x/y) = \max(|x_1|, |y_1|)$, $H(x, y, 1) = \max(|x|, |y|)$; this last one is the height with respect to a line, which is equivalent to $E$ + proper transform of $Y = 0$, where $E$ is the exceptional divisor. Hence $q$ is the height with respect to $E$.

6 We remark that for convenience we have stated only a version of the subspace theorem for linear forms over $\mathbb{Q}$, whereas most versions are given over $\overline{\mathbb{Q}}$.

Here $H(x)$ is the projective (exponential) Weil height. A ($p$-adic) version of Roth's theorem is obtained setting $N = 2$ (projective dimension 1). It is easy to recover from this the finiteness result for the $S$-unit equation $x + y = 1$, $x, y \in \mathcal{O}_{\mathbb{Q},S}^*$. (We refer to [BG06] for a deduction of this result and its analogue in higher dimensions.)

To go on, we first give a result weaker than Theorem 2.1, to better illustrate the principle of the proof. We apply this subspace theorem with $N = 2h + 1$. The variables $x_0, \ldots, x_{2h}$ shall be evaluated at the $S$-integer points obtained by multiplying by $D$ the above quantities, i.e., $x = (x_0, \ldots, x_{2h}) := D(q, v^{-1}, \ldots, v^{-h}, uv^{-1}, \ldots, uv^{-h})$. We choose the following set of linear forms: at the infinite place we choose the above "small" linear form $x_0 + x_1 + \ldots + x_h - x_{h+1} - \ldots - x_{2h}$ and $x_1, \ldots, x_{2h}$ as the other $2h$ linear forms; at a finite place in $S$ we choose just the $2h + 1$ coordinates $x_0, \ldots, x_{2h}$ as linear forms. Note that the values at our points of the last $2h$ coordinates are $D$ times $S$-units, these last ones being small at several places in $S$: this average smallness is expressed by the product formula $\prod_{\nu \in S} |\eta|_\nu = 1$ for an $S$-unit $\eta$. So the contribution of the last $2h$ linear forms in the double product in the subspace theorem is $\ll D^{2h}$. As to the first linear form, in view of (2.1.3), its contribution is $\ll D|v|^{-h} \prod_{\nu \in S \setminus \{\infty\}} |Dq|_\nu \le D|v|^{-h}$. Also, $H(x) \le D|v|^h$.

Suppose now that $D \le |v|^{\frac{1}{2} - \delta}$, for a given $\delta > 0$, i.e., that $\gcd(u - 1, v - 1) \ge |v|^{\frac{1}{2} + \delta}$. Then the double product is $\ll D^{2h+1} |v|^{-h} \le |v|^{\frac{1}{2} - (2h+1)\delta}$. For large $h > h_0(\delta)$ this is smaller than some negative power of the height $H(x)$ so the subspace theorem yields a linear relation taken from a finite set:

$$aq = \sum_{i=1}^{h} b_i v^{-i} + \sum_{j=1}^{h} c_j u v^{-j},$$

where the coefficients $a, b_i, c_j$ are rational numbers, not all zero, and with only finitely many possibilities independent of the point $(u, v)$. In conclusion, one finds that either there is one such relation or

$$\gcd(1 - u, 1 - v) \ll \max(|u|, |v|)^{\frac{1}{2} + \delta}. \tag{2.1.4}$$

Note that the said relation may be written in the shape $a v^h (1 - u) = (1 - v)(P(v) + uQ(v))$ for polynomials $P, Q$ of degree $\le h$, not both identically zero. But such a relation cannot hold identically in $u, v$; hence, by the known results on $S$-unit points on curves,⁷ all but finitely many points $(u, v)$ on the plane curve so defined correspond to components defined by a polynomial of the shape $U V^r - \alpha$. This yields the upper bound (2.1.4) for the gcd, apart from a set of points $(u, v)$ that can be partitioned into finitely many subsets on each of which we have $u = \alpha v^r$ for constant $\alpha, r$. And since the gcd is large, this $\alpha$ has to be 1 if the corresponding set of $(u, v)$ is infinite.

We have not yet gained the bound we want, only an exponent $\frac{1}{2} + \delta$ in place of $\delta$ for the $\gcd(1 - u, 1 - v)$ (albeit with a sharper conclusion if the bound does not hold). Our device to sharpen this bound is to construct other small linear forms. Somewhat surprisingly, we may obtain them just on multiplying the approximation (2.1.3) above by several powers $u^l$.⁸ Defining $q_l := u^l q$, we have

$$q_l = -u^l (1 - u)\left(\frac{1}{v} + \frac{1}{v^2} + \ldots + \frac{1}{v^h}\right) + O(u^{l+1} v^{-h-1}).$$

7 They correspond to Lang's conjectures mentioned in Chapter 1, and the special case useful here was proved by Lang himself in [Lan65].

8 A posteriori, this is vaguely analogous to a device used by Siegel, concerning algebraic dependence of values of E-functions: he multiplied a hypothetical relation $P(\xi_1, \ldots, \xi_n) = 0$ by monomials in the $\xi_i$ to obtain other relations. See [Bak90], pp. 115–116.

We may do this for $l = 0, 1, \ldots, k - 1$ (the previous construction occurs for $k = 1$), obtaining $k$ small linear forms in the $N := k + h + hk$ quantities $q_0, \ldots, q_{k-1}, v^{-1}, \ldots, v^{-h}, uv^{-1}, \ldots, uv^{-h}, \ldots, u^k v^{-1}, \ldots, u^k v^{-h}$. For a fixed but large $k > k_0(\delta)$, by choosing a large enough $h > h_0(k, \delta)$ as before, we obtain by completely similar calculations an estimate $\gcd(1 - u, 1 - v) \ll \max(|u|, |v|)^{\frac{1}{k+1} + \delta}$, unless there is a relation $R(u) v^h (1 - u) = (1 - v) P(u, v)$, with polynomials $R, P$ taken from a finite set, not both zero, of degrees $\le h, k$ in $v, u$ respectively. As before, no such relation can be trivial, and as before the known results on $S$-unit points on curves lead to a multiplicative dependence $u^r v^s = 1$ with exponents $r, s$ bounded in terms of $k$ and of $\delta$. Since $\delta$ may be chosen as any positive number, this proves Theorem 2.1.

Remark 2.1.1 In this procedure of taking several quantities $q_l = u^l q$, we seem to be using again and again the same information, so nothing should be gained. Note, however, that the exceptional linear relations delivered by the subspace theorem become more and more general to include any possible multiplicative dependence, which would not be taken into account on taking $k = 1$, or even on taking bounded $k$.

2.2 Some applications of Theorem 2.1

The above result has been applied to a number of different issues. We briefly describe some of these links, mentioning the proofs only in passing.

1. Conjecture of Győry-Sárközy-Stewart

This conjecture stated that for integers $a > b > c > 0$, the greatest prime factor of $(ab+1)(ac+1)(bc+1)$ tends to $\infty$ as $a \to \infty$. It had been studied mainly with methods from transcendence theory (e.g., by Stewart and Tijdeman), with partial results. The above theorem actually implies the following sharpening (obtained in [CZ03]):

Theorem. For $a > b > c > 0$, the greatest prime factor of $(ab+1)(ac+1)$ tends to $\infty$ as $a \to \infty$.

For a proof, it suffices to note that otherwise there are infinitely many $S$-units $u = ab+1$, $v = ac+1$, with $\gcd(u-1, v-1) \ge a \ge \max(|u|,|v|)^{1/2}$. By Theorem 2.1, $u, v$ have to be multiplicatively dependent for large $a$: $u = t^r$, $v = t^s$ for some integers $t, r, s > 0$, $(r,s) = 1$. But then one easily finds $\gcd(u-1, v-1) = t-1$, forcing $r, s \le 2$, and a contradiction easily follows.

2. Families of S-unit equations

We have already mentioned $S$-unit equations, of which the linear ones $au + bv = 1$ (fixed $a, b \in k^*$, variable $u, v \in O_S^*$) often appear. The above results allow us to study certain families in which the coefficients $a, b$ may also vary; more precisely, we assume that $a = a(t)$, $b = b(t) \in PGL_2(k)$ are linear fractional in the $S$-integer variable $t \in O_S$; we thus have the equation $a(t)u + b(t)v = 1$. Such equations correspond to integral points on certain affine surfaces; to treat them with the present results, one first easily reduces to the case when $a(t), b(t)$ have the same denominator (otherwise the denominators must be essentially units, and standard results may be applied). Then the equation yields a divisibility $\alpha + \beta u + \gamma v \mid \alpha' + \beta' u + \gamma' v$ in $O_S$, where $\alpha, \beta, \ldots$ come from the coefficients of $a(t), b(t)$.
This implies a large gcd between two linear forms in $1, u, v$, and on eliminating one concludes that the solutions fall into finitely many families where $u$, $v$ or $u/v$ is constant. (This may be further generalized with other methods to $a(t)u + b(t)v = c(t)$, with polynomials $a, b, c$


such that $\deg a + \deg b = \deg c$, as in a paper of Levin [Lev06], or $\deg a = \deg b = \deg c$, as in the paper [CZ10b]. See also the paper of Corvaja [Cor07].)

3. Elliptic curves over finite fields

F. Luca and I. Shparlinski have applied Theorem 2.1, Version 3, in [LS05] to prove the following:

Theorem. ([LS05].) Let $E/\mathbb{F}_q$ be an ordinary elliptic curve and let $\epsilon > 0$. Then for large $n$ the group $E(\mathbb{F}_{q^n})$ has a cyclic factor of order $> q^{n(1-\epsilon)}$.

Of course, $|E(\mathbb{F}_{q^n})| \le 2q^n$ (and actually by Hasse's theorem $|E(\mathbb{F}_{q^n})| = q^n + O(q^{n/2})$); hence, the result says that the group $E(\mathbb{F}_{q^n})$ tends to be "almost" cyclic as $n \to \infty$.

A proof runs as follows: If $\Phi$ is the $q$-Frobenius on $E$, we have $E(\mathbb{F}_{q^n}) = \ker(\Phi^n - I)$. It is known that $E(\mathbb{F}_{q^n})$ is the direct sum of two cyclic groups and thus decomposes as $(\mathbb{Z}/r_n) \oplus (\mathbb{Z}/s_n)$ for $r_n \mid s_n$. Then the $r_n$-torsion points are contained in the kernel of $\Phi^n - I$, whence $\Phi^n - I = r_n\psi_n$ for a $\psi_n \in \mathrm{End}(E)$. But then the Frobenius eigenvalues $a, b$ satisfy $a^n \equiv b^n \equiv 1 \pmod{r_n}$, so $r_n$ divides $\gcd(a^n - 1, b^n - 1)$. Now, if $a, b$ are multiplicatively dependent, then $E$ may be easily shown to be supersingular (note that $|a| = |b| = \sqrt{q}$, so $a^r b^s = 1$ implies $r = -s$). If they are independent, Version 3 of Theorem 2.1 applies and yields $r_n \ll \exp(\epsilon n)$, which implies the result.
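As a small numerical companion to the group orders discussed above, the following sketch tabulates $|E(\mathbb{F}_{q^n})|$ from $q$ and the Frobenius trace $t = a + b$ alone, using the standard formula $|E(\mathbb{F}_{q^n})| = q^n + 1 - (a^n + b^n)$ and the recurrence $s_n = t s_{n-1} - q s_{n-2}$ for $s_n = a^n + b^n$. The function name and the sample values $q = 5$, $t = 2$ are assumptions chosen for illustration; they are not taken from [LS05].

```python
def curve_orders(q, t, n_max):
    """Return [|E(F_{q^n})| for n = 1..n_max] from q and the Frobenius trace t.

    Uses s_n = a^n + b^n with s_0 = 2, s_1 = t, s_n = t*s_{n-1} - q*s_{n-2},
    where a, b are the Frobenius eigenvalues (a*b = q, a + b = t).
    """
    s_prev, s_cur = 2, t
    orders = []
    for n in range(1, n_max + 1):
        orders.append(q**n + 1 - s_cur)
        s_prev, s_cur = s_cur, t * s_cur - q * s_prev
    return orders


# Hypothetical ordinary curve over F_5 with trace t = 2 (t is prime to 5).
orders = curve_orders(5, 2, 6)
```

The list only gives the total orders; the splitting into the two cyclic factors of orders $r_n$ and $s_n$ is not determined by $q$ and $t$ in general, which is where the gcd estimate enters.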

C. Magagna [Mag08] has carried out generalizations in higher dimensions, and, for instance, has proved the following:

Theorem. ([Mag08].) If $E, E'/\mathbb{F}_q$ are ordinary and not isogenous elliptic curves, then $|E(\mathbb{F}_{q^n})|$ and $|E'(\mathbb{F}_{q^n})|$ have a gcd $\ll \exp(\epsilon n)$ for $n \to \infty$.

Recall that if $E, E'$ are isogenous, then the relevant orders are equal. This yields an isogeny criterion. F. Bogomolov, M. Korotiaev, and Y. Tschinkel used results of this type, again for Frobenius eigenvalues, this time of Jacobians, to prove in [BKT10] a kind of group-theoretic analogue of Torelli's theorem over finite fields. (Very roughly: they obtain an isogeny criterion for two Jacobians of curves over a finite field by looking merely at the isomorphisms between the abstract groups of rational points on the Jacobians.) In a related paper [BT08], they also use similar criteria to draw geometric information on two varieties over a finite field from divisibility information on the cardinality of the set of rational points.

4. Periods of toral automorphisms

A further application was motivated by the study of dynamics and periods of toral automorphisms (especially the existence of quantum limits different from Lebesgue measure). The analysis of Z. Rudnick reduced this issue to the search for lower bounds for periods of integer matrices modulo growing integers $N$. More precisely, let $\mathrm{ord}(A, N)$ be the minimal integer $r > 0$ such that $A^r \equiv 1 \pmod N$, where $A$ is a square matrix over $\mathbb{Z}$, and $N$ is an integer coprime to $\det A$. For fixed $A$ of infinite order, one has the easy lower bound $\mathrm{ord}(A, N) \gg \log N$, which is generally the best possible (e.g., in dimension 1). Now, Theorem 2.1 (in Version 3 for number fields) allows an improvement under suitable assumptions. To state it, we say that a matrix $A$ is exceptional if it is diagonalizable (over $\overline{\mathbb{Q}}$) and for some integer $r > 0$ the eigenvalues of $A^r$ are powers of a single rational integer or of a single unit $\lambda$ of a real quadratic field. It may be checked that (given that $A$ is diagonalizable) this is the same as saying that the dimension of the Zariski closure of the set of powers $\{A^s : s \in \mathbb{Z}\}$ is 0 or 1. Now, the paper [CRZ04] proves the following:

Theorem. ([CRZ04].) If $A$ is not exceptional then $\mathrm{ord}(A, N)/\log N \to \infty$.

In fact, one may check that there is also a converse deduction. The idea of the proof is very simple: one diagonalizes $A$ (over an extension field) and uses Theorem 2.1, generalized to number fields, on pairs of eigenvalues. (If instead $A$ is not diagonalizable, a simpler self-contained argument suffices to prove an even sharper conclusion.) This result may also be applied to a matrix representation of the Frobenius map to derive the results in item 3 above.

5. Periodic points of quadratic maps

Still another application, due to J.K. Canci, appears in the paper [Can10], where he studies periodic points for quadratic rational maps $f : \mathbb{P}^1 \to \mathbb{P}^1$, assuming, as in previous work by several authors, that $f$ has good reduction outside $S$. More precisely, he studies moduli spaces for pairs $(f, P)$, where $f$ is a quadratic rational map in $k(t)$ ($k$ a number field) with good reduction outside $S$, and $P \in \mathbb{P}^1(k)$ is a rational periodic point for $f$, of fixed period $m$. He first proves: Up to conjugation in $PGL_2(O_S)$ there are only finitely many pairs $(f, P)$, where $P$ is periodic of any period $\ge 4$. This uses some new arguments and the $S$-unit theorem for surfaces, classifying the $S$-unit solutions to $x + y + z = 1$. For period 3, however, the $S$-unit theorem no longer seems sufficient, and Canci uses the above result to prove:

Theorem. ([Can10].) Up to conjugation in $PGL_2(O_S)$, the pairs $(f, P)$ as above with $P$ periodic of period 3 lie in finitely many conjugacy classes plus the family $f(x) = (x-1)(ax+1)/a$, for $a \in O_S^*$.

6. Heights of values of polynomials at S-unit points

We mention another very recent application, due to A. Levin and D. McKinnon. In the paper [LM11], the authors prove that for $f(x)$ a polynomial over a number field $k$, for $S$-units $u \in O_S^*$ outside some natural exceptional set, the prime ideals of $O_k$ dividing $f(u)$ "mostly" have degree 1 over $\mathbb{Q}$ (in the sense that these primes give almost all the contribution to the height of $f(u)$). Their proof uses Theorem 2.1 in the version for number fields in [CZ05]. They also observe a link of their result with Vojta's conjecture. This is, of course, in accordance with the above discussion.
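To make the quantity $\mathrm{ord}(A, N)$ from item 4 above concrete, here is a minimal brute-force sketch that computes it for a small integer matrix. The function names and the sample matrix (the hyperbolic matrix with rows $(2, 1)$ and $(1, 1)$) are assumptions for illustration; the search is exponential in the size of $N$ and is meant only for experiments.

```python
def mat_mul_mod(A, B, N):
    """Multiply two square integer matrices modulo N."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) % N
             for j in range(size)] for i in range(size)]


def matrix_order(A, N, max_steps=10**6):
    """Smallest r > 0 with A^r = I (mod N), or None if none is found within max_steps."""
    size = len(A)
    identity = [[int(i == j) % N for j in range(size)] for i in range(size)]
    base = [[entry % N for entry in row] for row in A]
    power = base
    for r in range(1, max_steps + 1):
        if power == identity:
            return r
        power = mat_mul_mod(power, base, N)
    return None


sample = [[2, 1], [1, 1]]          # determinant 1, so coprime to every N
periods = {N: matrix_order(sample, N) for N in (2, 5, 7, 11)}
```

In dimension 1 the easy lower bound is attained: for the $1 \times 1$ matrix $(2)$ and $N = 2^m - 1$ the order is exactly $m$, which is of size $\log N$.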

2.3 An analogue of Theorem 2.1 for function fields

A version of Theorem 2.1 bounding the gcd has been obtained when $u, v$ are rational functions on a (nonsingular complete) curve $X$ over an algebraically closed field $\kappa$ of characteristic 0, and have all their poles and zeros in a prescribed finite set $S \subset X$; these functions form a multiplicative group that is the analogue of the group of $S$-units of a number field, and we shall denote it by $O_S^*$, as before. It contains $\kappa^*$ and it is finitely generated over it.$^{9}$ The ring $O_S$ shall now consist of the rational functions on $X$ having no pole outside $S$. The analogue of $\gcd(u-1, v-1)$ is clearly the number of common zeros (with multiplicity) of $u - 1$ and $v - 1$, i.e., the unlikely intersections among $X$ and the two hypersurfaces defined by $u - 1$, $v - 1$ in the ambient space.

We have the following theorem (joint with Corvaja [CZ08b]), in which we denote by $X$ an irreducible projective nonsingular curve over $\kappa$, of genus $g$, and by $\chi := 2g - 2 + |S|$ the Euler characteristic of the affine curve $X \setminus S$:

Theorem 2.2. Let $u, v \in \kappa(X)$ be nonconstant and have all their zeros and poles in $S$, and also suppose they are multiplicatively independent. Then
$$\sum_{p \in X(\kappa)\setminus S} \min(\mathrm{ord}_p(1-u), \mathrm{ord}_p(1-v)) \le 3\sqrt[3]{2}\,(\deg(u)\deg(v)\chi)^{1/3}.$$

$^{9}$ It is easy to see that its rank over $\kappa^*$ is precisely $|S| - 1 - r$, where $r$ is the rank of the group generated by the classes of $P - Q$ in the Jacobian of $X$, for $P, Q \in S$.


This result is plainly analogous to Theorem 2.1, but much more explicit and uniform. A proof may be obtained using the same ideas as for Theorem 2.1, namely, the same approximating linear forms; Schmidt's subspace theorem is replaced by a Wronskian argument. Such a proof is self-contained, and therefore we have decided to outline it in the last section below.

Remark 2.3.1 (i) If we choose $S$ to be exactly the set of zeros and poles of $u, v$, the condition $p \notin S$ in the theorem can be omitted and we may replace "ord" by "$\mathrm{ord}^+$," defined as $\max(0, \mathrm{ord})$.

(ii) We note that the left side can be trivially bounded by $\min(\deg(u), \deg(v))$; hence, the result may become useful only under appropriate circumstances, e.g., when $\chi$ is small. For instance, for fixed $\chi$, we get a bound $\ll \max(\deg(u), \deg(v))^{2/3}$, which often improves on the said trivial bound. As to lower bounds, we do not have significant ones when $u, v$ have large degree, and we do not have any definite idea on a best-possible asymptotic result, e.g., for fixed $S$. As to the constant $3\sqrt[3]{2}$ (which is relevant in some applications), it cannot be lowered below $(4/3)^{1/3}$ (take $u = t^3$, $v = -t - t^2$, $\kappa(X) = \kappa(t)$). (However, a slight variation in the proof below would lead to a small asymptotic improvement on the above value.)

(iii) It is easy to see what happens in the multiplicatively dependent case: if $u^r v^s = \theta$ is a relation with coprime $r, s$ not both 0, and $\theta$ a root of unity, then the gcd-sum is 0 if $\theta \ne 1$ and is anyway $\le \max(\deg(u), \deg(v))/\max(|r|, |s|)$, so again this is small if such a "minimal" relation has large exponents. This is essentially best-possible: if $\theta = 1$, we may write $u = t^s$, $v = t^{-r}$ for a suitable monomial $t$ in $u, v$; in turn, the gcd comes out from the zeros of $t - 1$, and this inequality becomes an equality if $S$ is exactly the set of zeros/poles of $u, v$ (so if the exponents are small, the gcd is in fact large).

(iv) The result may be viewed as bounding the "badness" of the singularity at $(1, 1)$ of the plane curve determined by $(u, v)$. For instance, if $(1, 1)$ is not singular on this curve, the sum expressing the gcd is at most 1. To my knowledge, the resulting estimate does not follow from standard methods or theorems on singularities of plane curves.

(v) The result is linked with the "abc" for function fields and implies a weak version of it (and similarly there is a link of Theorem 2.1 with the usual "abc," already outlined, e.g., in Example 2.1). In fact, let $x + y = 1$ with $x, y \in O_S^*$ not constant. For a constant $\xi \in \kappa^*$ this yields $1 - \xi x = y(1 - (\xi - 1)xy^{-1})$. We may suppose that $x, y$ are multiplicatively independent (otherwise they have to be constant). Then Theorem 2.2 may be applied to $u := \xi x$, $v := (\xi - 1)xy^{-1}$. Assuming also $\deg x \ge \deg y$, for suitable $\xi$ we obtain easily that $\deg x \le 54\chi$. Of course the abc gives this without the factor 54, but this shows a definite implication in this sense (which for number fields would be almost as good as abc).
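The extremal example of Remark 2.3.1 (ii) can be checked numerically. The sketch below takes $u = t^3$, $v = -t - t^2$ on the projective line, with $S = \{0, -1, \infty\}$ and hence $\chi = 1$; the only candidates for common zeros of $1 - u$ and $1 - v = 1 + t + t^2$ are the two primitive cube roots of unity, which the code tests directly. The function name is an assumption for illustration.

```python
import cmath


def theorem_2_2_bound(deg_u, deg_v, chi):
    """Right-hand side 3 * (2 * deg(u) * deg(v) * chi)^(1/3) of Theorem 2.2."""
    return 3 * (2 * deg_u * deg_v * chi) ** (1 / 3)


# u = t^3, v = -t - t^2; zeros and poles lie in S = {0, -1, infinity}, genus 0.
chi = 2 * 0 - 2 + 3
omega = cmath.exp(2j * cmath.pi / 3)      # primitive cube root of unity

# The roots of 1 - v = 1 + t + t^2 are omega and its conjugate; keep those
# that are also zeros of 1 - u = 1 - t^3.
common_zeros = [z for z in (omega, omega.conjugate())
                if abs(z**3 - 1) < 1e-9 and abs(-z - z**2 - 1) < 1e-9]
gcd_sum = len(common_zeros)               # both common zeros are simple
ratio = gcd_sum / (3 * 2 * chi) ** (1 / 3)   # gcd_sum / (deg u * deg v * chi)^(1/3)
```

Here `ratio` agrees with $(4/3)^{1/3}$ up to rounding, which is the lower limit for the constant quoted in the remark, and `gcd_sum` stays below `theorem_2_2_bound(3, 2, chi)`.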

For the sake of possible applications, we also state explicitly the following immediate

Corollary 2.3. Let $X$ be as above, defined over a number field $k$. Let $u, v \in k(X)$ be nonconstant and have all their zeros and poles in $S$, and also suppose they are multiplicatively independent. Let $z \in X(\overline{\mathbb{Q}})$ be a common zero of $u - 1$, $v - 1$. Then
$$[k(z) : k] \le 3\sqrt[3]{2}\,(\deg(u)\deg(v)\chi)^{1/3}.$$

A proof follows at once from Theorem 2.2 (with $\kappa = \overline{\mathbb{Q}}$) on noting that since $u, v$ are defined over $k$, the conjugates of $z$ over $k$ are also common zeros of $u - 1$, $v - 1$. (Observe also that we can take $S$ to be just the set of zeros/poles of $u, v$, so no common zero of $u - 1$, $v - 1$ can lie in $S$.)

This corollary fits into the pattern of the proofs in the previous chapter because it yields "half" of them, namely the upper bound part. If we can suitably compare with a lower bound, we can then achieve finiteness. A good illustration is given by the following application (and also by note 1 below), in which we consider the special but significant case when $u = f(t)^n$, $v = g(t)^n$, where $f, g \in k[t]$ are fixed multiplicatively independent nonconstant polynomials, and where $n \to \infty$. In this case we may take $X = \mathbb{P}^1$ in Theorem 2.2, obtaining in particular that $\deg(\gcd(f^n - 1, g^n - 1)) \ll n^{2/3}$, which improves on the obvious upper bound $\ll n$.


Actually, we can do better on using Corollary 2.3: let $z$ be any common zero of $f^n - 1$, $g^n - 1$, supposing $n$ is minimal for this $z$. On the one hand, the corollary yields $[k(z) : k] \ll n^{2/3}$. On the other hand, the equations $f(z)^n = 1$, $g(z)^n = 1$ and minimality of $n$ show that $[k(z) : k] \gg \phi(n) \gg n^{1-\epsilon}$ for any $\epsilon > 0$. Comparing the two bounds, we find that the set of possible $n$ is finite, and the same holds for the set of possible common zeros. This conclusion is a result of N. Ailon and Z. Rudnick [AR04],$^{10}$ which we state as a further corollary:

Corollary 2.4. (Ailon-Rudnick [AR04].) Let $f, g$ be multiplicatively independent polynomials over a number field $k$. Then $\deg(\gcd(f^n - 1, g^n - 1))$ is bounded as $n \to \infty$.

Remark 2.3.2 Comparing Theorem 1.3 and Theorem 2.2. We pointed out that Theorem 1.3, like the present Theorem 2.2, is also a function-field analogue of Theorem 2.1. Let us see what the differences are between these function-field statements. In Theorem 1.3 we obtained a finiteness result for common zeros of $u - 1$, $v - 1$ over all pairs of multiplicatively independent functions $u, v \in \Gamma$; in this sense this was stronger, because here, on the contrary, we obtain merely an individual bound for each pair. However, the present bound is much more uniform in the data, and is nontrivial for not "too" large $|S|$; this works for much more general pairs of functions than those in a fixed finitely generated group $\Gamma$. It also turns out that this may be used to simplify the proof of Theorem 1.3, via Corollary 2.3. (This is outlined in the appendix to [CZ08b] and also appears in the notes below.) To exemplify the meaning of these results within a simple context, we observe that

• Theorem 1.3 proves, for instance, the finiteness of the set of complex numbers $t$ such that $t, t-1, t+1$ satisfy two multiplicative relations $t^a(t-1)^b(t+1)^c = 1$, $t^{a'}(t-1)^{b'}(t+1)^{c'} = 1$, with linearly independent integer vectors $m := (a, b, c)$, $m' := (a', b', c')$.
• Theorem 2.2 proves that for each such pair of vectors $m, m'$ and for arbitrary nonzero complex numbers $\alpha \ne \beta$, $\gamma$, $\gamma'$, there are at most $3\sqrt[3]{4}\,(|m||m'|)^{1/3}$ numbers $t$ such that $\gamma t^a(t-\alpha)^b(t-\beta)^c = 1$, $\gamma' t^{a'}(t-\alpha)^{b'}(t-\beta)^{c'} = 1$, where $|m|$ is the sum of the absolute values of the coordinates.

This latter bound may seem weak, but, as in the proof of Corollary 2.4, sometimes it may be combined with other information to yield finiteness.
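Corollary 2.4 above can be observed experimentally in a simple case. The sketch below computes $\deg \gcd(f^n - 1, g^n - 1)$ for the multiplicatively independent pair $f = t$, $g = t + 1$ using exact rational arithmetic and a plain Euclidean algorithm; the helper names and the choice of $f, g$ are assumptions made for illustration. For this pair a common zero $z$ must satisfy $|z| = |z + 1| = 1$, so it is a primitive cube root of unity, and the degree is 2 when $6 \mid n$ and 0 otherwise.

```python
from fractions import Fraction


def poly_trim(p):
    """Drop trailing zero coefficients (coefficients are listed from degree 0 up)."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p


def poly_mul(p, q):
    """Product of two nonzero polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow_minus_one(p, n):
    """Return p^n - 1."""
    result = [Fraction(1)]
    for _ in range(n):
        result = poly_mul(result, p)
    result[0] -= 1
    return poly_trim(result)


def poly_mod(p, q):
    """Remainder of p on division by the nonzero polynomial q, over the rationals."""
    p = poly_trim(list(p))
    while len(p) >= len(q):
        factor = p[-1] / q[-1]
        shift = len(p) - len(q)
        for i, b in enumerate(q):
            p[shift + i] -= factor * b
        p = poly_trim(p)
    return p


def gcd_degree(p, q):
    """Degree of gcd(p, q) by the Euclidean algorithm (-1 if both are zero)."""
    p, q = poly_trim(list(p)), poly_trim(list(q))
    while q:
        p, q = q, poly_mod(p, q)
    return len(p) - 1


f = [Fraction(0), Fraction(1)]   # f(t) = t
g = [Fraction(1), Fraction(1)]   # g(t) = t + 1
degrees = {n: gcd_degree(poly_pow_minus_one(f, n), poly_pow_minus_one(g, n))
           for n in range(1, 13)}
```

The dictionary `degrees` stays bounded by 2 over the whole range, in line with the corollary; the general statement is of course not proved by such an experiment.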

2.4 Some applications of Theorem 2.2

1. Bounding the order of torsion points on curves

An application of Theorem 2.2 is to Lang's original question of torsion points on curves; more precisely, we seek an upper bound for the maximal order $N_X$ of a torsion point $\theta$ on a plane irreducible curve $X \subset \mathbb{G}_m^2$, of degree $d$, defined over a number field $k$ and not a torsion coset. Recall first how a bound $N_X \ll_{k,\epsilon} d^{2+\epsilon}$ came from the results mentioned in Chapter 1 (see, for example, the second method sketched in Section 1.1). In the paper [CZ08a], Theorem 2.2 was applied to prove that such an estimate may be improved if $X$ has small genus $g$, in fact to the bound $N_X \ll_{k,\epsilon} (d\sqrt{d+g})^{1+\epsilon}$. (Note that, since $g \ll d^2$, this recovers the previous estimate and improves on it when $g$ is much smaller than its natural upper bound.) The proof constructs two functions $u, v$ expressed by independent monomials in the coordinates, with small degree and taking the value 1 at $\theta \in X$: by geometry of numbers the monomials $u, v$ may be found such that $\deg(u)\deg(v) \ll d^2 N$. Now, if $\theta$ has exact order $N = N_X$, we have $[k(\theta) : k] \gg \phi(N) \gg N^{1-\epsilon}$, and the conclusion follows from Corollary 2.3, since $\chi \le 4d + 2g$.

$^{10}$ In their proof they used the result foreseen by Lang on torsion points on curves; this confirms a strict relation of the present topic with Chapter 1.

2. Integral points on surfaces over function fields


In [CZ08b] the result has been applied to solve in the affirmative a special case of Vojta's conjecture for integral points over function fields (see [BG06], Ch. 14 for a description of Vojta's conjectures for integral points on varieties). Namely, one proves the boundedness of the degree for affine curves of a given Euler characteristic lying in the affine surface $X$ obtained by removing from $\mathbb{P}^2$ two lines and a conic in general position. This case is indeed rather special, but it is the simplest borderline case of Vojta's conjecture which was still open; we call it borderline because the divisor at infinity for $X$ (with respect to $\mathbb{P}^2$) has exact degree four (the minimum degree for the conjecture to apply to $\mathbb{P}^2$), and it is the simplest splitting of degree four which was not dealt with (the case of four lines amounts to the $S$-unit equation in three variables: $x + y + z = 1$, $x, y, z \in O_S^*$). It was investigated for holomorphic functions by Mark Green in the 1970s.

A simple diophantine equation strictly related to this issue is $y^2 = 1 + u + v$, to be solved in $S$-integers $y \in O_S$ and $S$-units $u, v \in O_S^*$; this represents an affine surface, given as a double cover of $\mathbb{G}_m^2$ ramified along $1 + u + v = 0$. The Vojta conjecture predicts that the solutions are degenerate, namely, all lie on a fixed curve. In the case of number fields this is still unknown, even in special cases like $y^2 = 1 + 2^m + 3^n$; now one expects the finiteness of the solutions $y \in k$, $m, n \in \mathbb{Z}$, which is indeed implied by degeneracy.$^{11}$ In the case of function fields, already quite particular cases like $y^2 = c_0 + c_1 x^m + c_2(1-x)^n$ ($c_i \in \mathbb{C}^*$, $y \in \mathbb{C}[x]$, $m, n \in \mathbb{Z}$) do not seem to be easy to treat by standard techniques. (Here we are thinking, for instance, of the abc theorems over function fields, as in [BM86]. See [Zan04] for an ad hoc argument for the last mentioned equation.) This matter is solved in [CZ08b] for the function field case as an application of Theorem 2.2.

Also, the same paper obtains more generally a bound for the number of multiple zeros outside $S$ (counted with multiplicity $m - 1$) of $P(u, v)$, where $P$ is an irreducible polynomial and $u, v$ are $S$-units in $\kappa(X)$, multiplicatively independent modulo $\kappa^*$. The bound takes the shape $\sum (m - 1) < \epsilon \max(\deg(u), \deg(v))$ for large enough (with respect to $\epsilon$ and $\deg P$) $\max(\deg(u), \deg(v))$. The above is the special case $P(u, v) = 1 + u + v$. In turn, in [CZ10a] this has been used to prove Vojta's conjecture for integral points over function fields, for the complement of any three normal-crossing divisors in $\mathbb{P}^2$.
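For the special equation $y^2 = 1 + 2^m + 3^n$ mentioned above, small solutions are easy to find by direct search. The following sketch enumerates them for positive exponents up to a chosen bound; the function name and the bound 30 are assumptions for illustration, and the search says nothing about completeness of the list.

```python
from math import isqrt


def square_solutions(max_exp):
    """All (m, n, y) with 1 <= m, n <= max_exp, y > 0 and y*y == 1 + 2**m + 3**n."""
    found = []
    for m in range(1, max_exp + 1):
        for n in range(1, max_exp + 1):
            value = 1 + 2**m + 3**n
            y = isqrt(value)
            if y * y == value:
                found.append((m, n, y))
    return found


solutions = square_solutions(30)
```

Among the triples found are $(m, n, y) = (3, 3, 6)$ and $(5, 1, 6)$, since $1 + 8 + 27 = 36$ and $1 + 32 + 3 = 36$.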

The idea for a proof of these results is roughly as follows, where we argue with the said equation $y^2 = 1 + u + v$. Differentiating (with respect to a fixed nonconstant function in $\kappa(X)$) and denoting by $l(u) = u'/u$ the logarithmic derivative, one obtains that both $l(u)u + l(v)v$ and $u + v + 1$ are divisible by $y$ in $O_S$. One now eliminates and obtains that both $(l(u/v)/l(v))u - 1$ and $(l(v/u)/l(u))v - 1$ are also multiples of $y$, so have "many" zeros in common. Then the theorem is applied to $u_1 := (l(u/v)/l(v))u$ and $v_1 := (l(v/u)/l(u))v$ in place of $u, v$, respectively. Here it is crucial that, since $u, v \in O_S^*$, the degrees of the differentials $du/u$, $dv/v$ are bounded by $|S|$, since the only poles are in $S$ and are simple; hence, the set of zeros/poles of the new functions $u_1$ and $v_1$ is not increased by much. Precisely the same calculations lead more generally to a bound for the number of multiple zeros of an expression $1 + u + v$ for $u, v \in O_S^*$. (This represents a special case of Vojta's conjectural formulation of the abcd inequality.)

$^{11}$ For $k = \mathbb{Q}$, surprisingly D. Leitner, with congruence arguments, found the complete list of solutions; however, it is not known how to extend this to arbitrary $k$, even less to general $S$-units $u, v$.

3. Rational points on curves over finite fields

The methods of proof of the theorem (especially through Theorem 2.6 below) may be applied in positive characteristic $p$, but with some new difficulties, mainly due to the fact that derivations vanish on $p$-th powers. Working with Hasse hyperderivatives one even obtains (work in progress) Weil's bound for the number of rational points on a curve over $\mathbb{F}_q$. (One takes $u = x^{q-1}$, $v = y^{q-1}$.) Some details on this, for the case $q = p$, may be found in the notes below.

2.5 A proof of Theorem 2.2

We prove a result with parameters, from which the theorem shall follow. Here we denote by $\deg_S(f)$ the number of poles (with multiplicity) outside $S$ of the rational function $f \in \kappa(X)$. (For a certain notational convenience we have denoted here by $a, b$ the functions denoted $u, v$ elsewhere.)

Theorem 2.5. Let $a, b \in O_S^*$ be $S$-units, not 1 and not both constant, and let $h, k$ be positive integers. Then, either $\deg(a) \le h \cdot [\kappa(X) : \kappa(a, b)]$ and $\deg(b) \le k \cdot [\kappa(X) : \kappa(a, b)]$, or
$$\deg_S\left(\frac{1-a}{1-b}\right) \ge \frac{hk}{hk+h+k}\deg(b) - \frac{k}{hk+h+k}\,(\deg(a) + \deg(b)) - \frac{hk+h+k-1}{2}\,\chi.$$

An essentially equivalent formulation of the result is in terms of the gcd, where by $v$ we denote a place of $\kappa(X)/\kappa$ (i.e., a point of $X(\kappa)$):

Theorem 2.6. Let $a, b \in O_S^*$ be $S$-units, not 1 and not both constant, and let $h, k$ be positive integers. Then, either $\deg(a) \le h \cdot [\kappa(X) : \kappa(a, b)]$ and $\deg(b) \le k \cdot [\kappa(X) : \kappa(a, b)]$, or
$$\sum_{v \notin S} \min\{v(1-a), v(1-b)\} \le \frac{h+2k}{hk+h+k}\deg(b) + \frac{k}{hk+h+k}\deg(a) + \frac{hk+h+k-1}{2}\,\chi.$$

Theorem 2.2 shall follow, as shown below, by a suitable choice of h, k. Remark 2.5.1 Note the asymmetry between a, b in these last statements. This arises from the “approximating form” corresponding to identity (2.5.6) in the proof below, in which a, b play different roles. (This asymmetry disappears in Theorem 2.2.) It would be interesting to find a more symmetric analogue of Theorem 2.5.

Proof of Theorem 2.5. We start with the following simple but important remark: if we replace the curve by a cover of it, of degree $d$, then the new degrees will be multiplied by $d$ while the new $\chi$ will be at least the old one multiplied by $d$. (This amounts to the Riemann-Hurwitz formula, taking into account the possible ramifications in and out of $S$; see, e.g., [Zan09], Theorem 3.18.) By this observation, and taking $S$ to be the exact set of zeros/poles of $u, v$, it will be clear that in proving this statement (and also others in this section) we may consider the minimal function field containing the relevant quantities. Therefore, we may work with $\kappa(X) = \kappa(a, b)$. Since $a, b$ are not both constant, we have $|S| \ge 2$, so $\chi \ge 0$; hence, if $b \in \kappa$ the result holds trivially, so suppose that $b \notin \kappa$.

The argument now will mimic the above proof of Theorem 2.1, but will involve a simple Wronskian argument rather than Schmidt's subspace theorem. Thus we recall some standard facts about Wronskians. For $f_1, \ldots, f_n \in \kappa(X)$ and a nonconstant $t \in \kappa(X) \setminus \kappa$ we let the "Wronskian" $W_t(f_1, \ldots, f_n)$ be the determinant of the $n \times n$ matrix whose $j$-th row-entries are the $(j-1)$-th derivatives of the $f_i$'s with respect to $t$. It is well known that $W_t = 0$ if and only if the $f_i$'s are linearly dependent over $\kappa$. (Recall that here $\mathrm{char}\,\kappa = 0$.) Let $z \in \kappa(X)$ be another nonconstant element. Then we have the known, easily proved, formula
$$W_z(f_1, \ldots, f_n) = \left(\frac{dt}{dz}\right)^{\binom{n}{2}} W_t(f_1, \ldots, f_n). \tag{2.5.5}$$
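As a quick sanity check of the change-of-variable formula (2.5.5), the case $n = 2$ can be written out by hand; the display below is only an illustrative special case, using nothing beyond the chain rule.

```latex
\[
W_z(f_1, f_2)
  = f_1 \frac{df_2}{dz} - f_2 \frac{df_1}{dz}
  = \frac{dt}{dz}\left( f_1 \frac{df_2}{dt} - f_2 \frac{df_1}{dt} \right)
  = \left(\frac{dt}{dz}\right)^{\binom{2}{2}} W_t(f_1, f_2).
\]
```

For general $n$ the same mechanism applies: after row operations the $j$-th row contributes a factor $(dt/dz)^{j-1}$, and $0 + 1 + \cdots + (n-1) = \binom{n}{2}$.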


For a place $v \in X$, we choose once and for all a local parameter $t_v$ at $v$ and we define $W_v := W_{t_v}$. This depends on the choice of $t_v$, but (2.5.5) shows that the order $v(W_v)$ depends only on $v$.

To prove the theorem we shall consider suitable Wronskians. We define $q = (1-a)/(1-b)$ and, letting $n := hk + h + k$, we define functions $f_1, \ldots, f_n$ as follows. For $i = 1, \ldots, k$ we let $f_i := a^{i-1}q$ while we define $f_{k+1}, \ldots, f_n$ as the functions $a^r b^s$, $r = 0, 1, \ldots, k$, $s = 0, 1, \ldots, h-1$, in some order. We now choose nonconstant $t, t_v$ as above and we put
$$\omega = W_t(f_1, \ldots, f_n), \qquad \omega_v = W_{t_v}(f_1, \ldots, f_n).$$

Suppose first that $\omega = 0$; then, as we have remarked, the $f_i$ are linearly dependent over the constant field $\kappa$; recalling the definition of the $f_i$, this amounts to a relation $P_1(a)(1-a) + P_2(a, b)(1-b) = 0$, where $P_1(X)$ is a polynomial of degree $\le k-1$ and $P_2(X, Y)$ is a polynomial of degree $\le k$ in $X$ and $\le h-1$ in $Y$, and where not both $P_1, P_2$ are zero. Observe that $P_1(X)(1-X) + P_2(X, Y)(1-Y)$ is not identically zero, for otherwise $P_1(X)$ would vanish (set $Y = 1$) and then $P_2$ would also vanish, a contradiction. Then we find a nontrivial polynomial relation $P(a, b) = 0$, for a polynomial $P \ne 0$ of degree $\le k$ in $X$ and $\le h$ in $Y$. Of course, we may assume that $P$ is irreducible. Then, since $a, b$ generate our function field, we have $\deg a \le h$, $\deg b \le k$, falling into the first possibility of the sought conclusion. Therefore, in what follows we assume that $\omega \ne 0$.

To go on, we first seek a lower bound for $v(\omega_v)$, and distinguish several cases.

Case (i): $v \notin S$, $v(q) < 0$. For this case we use the homogeneity formula $W_u(\varphi g_1, \ldots, \varphi g_n) = \varphi^n W_u(g_1, \ldots, g_n)$ for the Wronskian (with respect to any variable $u$); this is easily proved, e.g., by the product rule for derivatives and elementary row operations. Using this with $\varphi := q$, $g_i := f_i/q$ we obtain $\omega_v = W_{t_v}(f_1, \ldots, f_n) = q^n W_{t_v}(g_1, \ldots, g_n)$. Observe that each $g_i$ is regular at $v$, so $v(W_{t_v}(g_1, \ldots, g_n)) \ge 0$. Therefore in this case we have $v(\omega_v) \ge n\,v(q)$.

Case (ii): $v \notin S$, $v(q) \ge 0$. Now every element of the local Wronskian matrix is $v$-integral, so the same holds for the determinant, i.e., $v(\omega_v) \ge 0$ in this case.

Case (iii): $v \in S$, $v(b) > 0$. This case contains the crucial point. Similarly to the above proof of Theorem 2.1, we consider the identity
$$a^j q - a^j(1-a)(1 + b + \cdots + b^{h-1}) = a^j b^h q. \tag{2.5.6}$$
This will be useful to approximate $a^j q$ with a polynomial in $a, b$, at the places under consideration. In fact, we may use the identity to replace, for $i = 1, \ldots, k$, the function $f_i$ with the left side of (2.5.6), with $j = i - 1$, which by (2.5.6) equals $a^{i-1}b^h q$, denoted $g_i$. Observe that this corresponds to subtracting from $f_i$ a certain $\kappa$-linear combination of $f_{k+1}, \ldots, f_n$, and thus the value of $\omega_v$ is unchanged. We have $v(g_i) = (i-1)v(a) + h\,v(b) + v(q)$. Since $v\!\left(\frac{d^l f}{dt_v^l}\right) \ge v(f) - l$, we easily find, on looking again at the individual terms in the determinant expansion, that
$$v(\omega_v) \ge \frac{k(k-1)}{2}\,v(a) + hk\,v(b) + k\,v(q) + \left(\sum_{i=k+1}^{n} v(f_i)\right) - \binom{n}{2}.$$
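Identity (2.5.6) is elementary (it follows from $(1-b)(1 + b + \cdots + b^{h-1}) = 1 - b^h$), and it can also be checked with exact rational arithmetic. In the sketch below the function name and the sample values of $a$, $b$, $h$, $j$ are assumptions chosen for illustration.

```python
from fractions import Fraction


def identity_2_5_6_sides(a, b, h, j):
    """Return the (left, right) sides of (2.5.6) for rational a, b with b != 1."""
    q = (1 - a) / (1 - b)
    geometric_sum = sum(b**s for s in range(h))      # 1 + b + ... + b^(h-1)
    left = a**j * q - a**j * (1 - a) * geometric_sum
    right = a**j * b**h * q
    return left, right


left, right = identity_2_5_6_sides(Fraction(3, 7), Fraction(-2, 5), h=4, j=2)
```

Both sides agree exactly for every admissible choice of the inputs, which reflects the fact that $a^j q$ differs from a polynomial in $a, b$ by a multiple of $b^h$.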


Case (iv): $v \in S$, $v(b) \le 0$. We now argue directly with the terms in the determinant expansion (that is, we do not perform any column operation). Since $v(f_i) = (i-1)v(a) + v(q)$ for $i = 1, \ldots, k$, we find as in the previous case that
$$v(\omega_v) \ge \frac{k(k-1)}{2}\,v(a) + k\,v(q) + \left(\sum_{i=k+1}^{n} v(f_i)\right) - \binom{n}{2}.$$

Now, summing over all places $v$ of $\kappa(X)$, taking into account the estimates obtained in the four cases, and recalling that $\sum_{v \in S} v(a) = \sum_{v \in S} v(b) = \sum_{v \in S} v(f_i) = 0$ for $i > k$, because $a, b$ are $S$-units, we get
$$\sum_{v} v(\omega_v) \ge n \cdot \Big(\sum_{v \notin S,\, v(q) < 0} v(q)\Big) + hk \sum_{v \in S,\, v(b) > 0} v(b) + k \sum_{v \in S} v(q) - \binom{n}{2}|S|. \tag{2.5.7}$$

Now, $\sum_{v \in S,\, v(b) > 0} v(b) = \sum_{v(b) > 0} v(b)$ ($b$ is an $S$-unit) and $\sum_{v(b) > 0} v(b) = -\sum_{v(b) < 0} v(b)$ [...] $> 0$.

$^{12}$ Here we might improve this to $-\sum_{v \in S} v(q) \le \deg(a) + \deg(b) - \deg_S(q)$, which also generally leads to a small improvement of the constant $3\sqrt[3]{2}$ in Theorem 2.2.


Suppose that $\chi = 0$; then $|S| = 2$, $g = 0$ and necessarily there is a relation $a^r b^s = \gamma \in \kappa^*$, where $r, s$ are not both zero: this is because some function $a^{\deg b} b^{\pm \deg a}$ has no zeros or poles. Then $\gamma \ne 1$ by assumption, so $\min(v(1-a), v(1-b)) \le 0$ for all $v$, proving the result. Therefore, suppose $\chi \ne 0$ in the sequel. Also, as we have remarked in the opening arguments of the proof of Theorem 2.5, we may assume that $\kappa(X) = \kappa(a, b)$.

Let us try to choose $h = [(4\deg(a)^2/(\chi\deg(b)))^{1/3}] - 1$, $k = [(4\deg(b)^2/(\chi\deg(a)))^{1/3}] - 1$ (so $h \ge k$). Suppose that $k < 1$. Then $[(4\deg(b)^2/(\chi\deg(a)))^{1/3}] < 2$, whence $(4\deg(b)^2/(\chi\deg(a)))^{1/3} < 2$, and so $\deg(b)^2 < 2\chi\deg(a)$. Then $\deg(b)^3 < 2\chi\deg(a)\deg(b)$. In this case we use the obvious inequality $\sum_{v \notin S} \min(v(1-a), v(1-b)) \le \deg(1-b) = \deg(b) < (2\chi\deg(a)\deg(b))^{1/3}$, proving what we need, with a better constant.

Hence, in what follows we assume that $h \ge k \ge 1$, so in particular we may apply the above Theorem 2.6. The conclusion gives two possibilities.

In the first case, we have $\deg(a) \le h$, whence $\deg(a)^3 \le (h+1)^3 \le 4\deg(a)^2/(\chi\deg(b))$, so $\deg(a)\deg(b)\chi \le 4$. Since $\chi \ge 1$, this implies $\deg(b) \le 2$, so as above we obtain $\sum_{v \notin S} \min(v(1-a), v(1-b)) \le \deg(1-b) = \deg(b) \le 2$. Again, this gives the sought result since $\deg(a)\deg(b)\chi \ge 1$.

Then we may assume $\deg(a) > h$, so the second alternative of the theorem must hold. It is easily checked that the coefficient $(h+2k)/(hk+h+k)$ is bounded by $3/(k+2)$, since $h \ge k$. Similarly, $k/(hk+h+k) \le 1/(h+2)$. Therefore,
$$\sum_{v \notin S} \min(v(1-a), v(1-b)) \le \frac{3}{k+2}\deg(b) + \frac{1}{h+2}\deg(a) + \frac{(h+1)(k+1) - 2}{2}\,\chi.$$

We now use $k + 2 \ge (4\deg(b)^2/(\chi\deg(a)))^{1/3}$, $h + 2 \ge (4\deg(a)^2/(\chi\deg(b)))^{1/3}$ and also $(h+1)(k+1) \le (4\deg(b)^2/(\chi\deg(a)))^{1/3}(4\deg(a)^2/(\chi\deg(b)))^{1/3} = (16\deg(a)\deg(b)/\chi^2)^{1/3}$, which yields
$$\sum_{v \notin S} \min(v(1-a), v(1-b)) \le 4\left(\frac{\deg(a)\deg(b)\chi}{4}\right)^{1/3} + \frac{1}{2}\,(16\deg(a)\deg(b)\chi)^{1/3} = 3\sqrt[3]{2}\,(\deg(a)\deg(b)\chi)^{1/3},$$
concluding the proof.
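The choice of parameters in the proof above can be tried out numerically. The sketch below computes $h, k$ as in the proof, evaluates the right-hand side of Theorem 2.6, and compares it with the final bound $3\sqrt[3]{2}(\deg(a)\deg(b)\chi)^{1/3}$. The function names and the sample data $\deg(a) = 100$, $\deg(b) = 50$, $\chi = 1$ are assumptions for illustration (with $\deg(a) \ge \deg(b)$, so that $h \ge k$), and floating-point cube roots are used, so this is only an experiment, not part of the argument.

```python
from math import floor


def theorem_2_6_rhs(deg_a, deg_b, chi, h, k):
    """Right-hand side of Theorem 2.6 for the parameters h, k."""
    n = h * k + h + k
    return ((h + 2 * k) / n) * deg_b + (k / n) * deg_a + (n - 1) / 2 * chi


def choose_parameters(deg_a, deg_b, chi):
    """The integers h, k chosen in the proof (computed with floating-point cube roots)."""
    h = floor((4 * deg_a**2 / (chi * deg_b)) ** (1 / 3)) - 1
    k = floor((4 * deg_b**2 / (chi * deg_a)) ** (1 / 3)) - 1
    return h, k


deg_a, deg_b, chi = 100, 50, 1            # hypothetical data, deg(a) >= deg(b)
h, k = choose_parameters(deg_a, deg_b, chi)
rhs = theorem_2_6_rhs(deg_a, deg_b, chi, h, k)
bound = 3 * (2 * deg_a * deg_b * chi) ** (1 / 3)
```

For these data one gets $h = 8$, $k = 3$, and `rhs` is noticeably smaller than `bound`, which reflects the slack in the estimates $k + 2 \ge \cdots$ and $(h+1)(k+1) \le \cdots$ used at the end of the proof.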


Notes to Chapter 2

1. Simplifying the proof of Theorem 1.3

We sketch here how Theorem 2.2 may be used to simplify the proof of Theorem 1.3 in the crucial case $r = n - 2$ (we refer here and in the sequel to the sketch of proof and notation given in Chapter 1). Recall that the values $x_i(z)$ of the restrictions $x_i$ of the coordinates on $\mathbb{G}_m^n$ to our curve $X$, evaluated at the relevant point $z$, generate a subgroup of rank $r \le n - 2$. In our illustration of the proof of Theorem 1.3 we have written equations $x_i(z) = \zeta_i \prod_{j=1}^{r} g_j^{m_{ij}}$ for certain roots of unity $\zeta_i \in \mathbb{Q}(z)$ of exact common order $N$ and certain $g_i \in (\mathbb{Q}(z))^*$, generating the free part of the group $x_1(z)^{\mathbb{Z}} \cdots x_n(z)^{\mathbb{Z}}$.

Consider now the lattice spanned by the vectors $m_j := (m_{1j}, \ldots, m_{nj})$, with volume $\ll V := \prod_{j=1}^{r} |m_j|$. By geometry of numbers its volume is the same as that of the orthogonal lattice $\Lambda$. Now, the sublattice $\Lambda' \subset \Lambda$ consisting of the vectors $a = (a_1, \ldots, a_n)$ that satisfy the supplementary condition $\zeta_1^{a_1} \cdots \zeta_n^{a_n} = 1$ has index $\le N$ in $\Lambda$ and thus $\mathrm{vol}(\Lambda') \le N\,\mathrm{vol}(\Lambda) \ll NV$. Now, letting $a, a'$ be two successive minima of $\Lambda'$ (recall that $\mathrm{rank}(\Lambda') = \mathrm{rank}(\Lambda) = r \ge 2$), we have, again by geometry of numbers (Minkowski's second theorem),
$$|a| \cdot |a'| \ll NV.$$

These vectors are independent and yield two multiplicative relations $x^a(z) = 1$ and $x^{a'}(z) = 1$ (where we have abbreviated, e.g., $x^a := x_1^{a_1} \cdots x_n^{a_n}$).

Define $u := x^a$, $v := x^{a'}$, and note that $u, v$ have zeros and poles in a fixed set independent of $a, a'$ and are multiplicatively independent because $a, a'$ are linearly independent and the coordinate functions are multiplicatively independent modulo constants. Now, $z$ is a common zero of $u - 1$, $v - 1$. By Corollary 2.3, we get the estimate
$$[k(z) : k] \ll (\deg(u)\deg(v))^{1/3} \ll (|a||a'|)^{1/3} \ll (NV)^{1/3},$$
where the implied constant depends only on $X$ and the $x_i$, not on $z, a, a'$. This bound represents the improvement with respect to the previous argument (in which, for $r = n - 2$, we obtained the exponent $1/2$ in place of $1/3$).

At this point the proof continues as before: the product $h(g_1) \cdots h(g_r)$ is, on the one hand, bounded above by $\ll V^{-1}$ and, on the other hand, bounded below by $\gg [k(z) : k]^{-1-\epsilon}$ by the theorem of Amoroso-David. These inequalities yield $V \ll N^{1/2+\epsilon}$. Finally, since the $\zeta_i$ lie in $k(z)$, we have $\phi(N) \le [k(z) : \mathbb{Q}] \ll (NV)^{1/3} \ll N^{1/2+\epsilon}$. These bounds are inconsistent for small fixed $\epsilon$, unless $N$, and hence $[k(z) : k]$, is bounded, concluding the proof.

2. Rational points on curves over $\mathbb{F}_p$

It turns out that Theorem 2.2 does not hold in positive characteristic, even assuming that the functions $u, v$ have nonzero differentials. (This was observed by Silverman: see [Sil04].) However, we shall outline here how the proof of Theorem 2.5 yields some results even in positive characteristic, with applications in counting points on plane curves over finite fields. The matter is the object of work in progress; here we limit ourselves to a sketch of some arguments for the prime fields $\mathbb{F}_p$.

For instance, we let $f \in \mathbb{Z}[x, y]$ be an absolutely irreducible polynomial; we want to estimate from above the number of solutions of the congruence $f(x, y) \equiv 0 \pmod{p}$ for large primes $p$. We fix such a prime and we think of $f$ as an element of $\mathbb{F}_p[x, y]$; we note at once that such a reduction of $f$ modulo $p$ remains absolutely irreducible if $p$ is large enough, in view of a well-known theorem by Ostrowski; we denote by $Z$ the plane curve defined by $f = 0$ and by $X$ a projective nonsingular model of $Z$. The idea is that if $Q := (x_0, y_0) \in \mathbb{F}_p^2$ lies on $Z$, and if $x_0 y_0 \neq 0$, then $x^{p-1} - 1$, $y^{p-1} - 1$ are functions on $X$, vanishing at $Q$. Hence, if $N$ is the number of such points, we have

$$N \le \sum_{\nu \in X \setminus S} \min\big(\nu(x^{p-1} - 1),\ \nu(y^{p-1} - 1)\big),$$

where $\nu$ runs through the points (= places) of $X$ that are not zeros or poles of $x, y$ (so $\nu$ denotes also the order function associated to the place). To bound the right-hand side, we may now try to imitate the proof of Theorem 2.5, with $a = x^{p-1}$, $b = y^{p-1}$; note that these functions are $S$-units for a set $S$ that is small with respect to their degree: in fact, $|S|$ is fixed while $p$ shall become large. This suggests that we may obtain a nontrivial estimate.

To perform this program, we look at the proof arguments for Theorem 2.5, with $(u, v) := (x^{1-p}, y^{1-p})$ in place of $(a, b)$; namely, we construct a suitable Wronskian as therein, depending on integers $h, k$. We choose $k = \deg_x f - 1$, $h = c\sqrt{p} + O(1)$, where $c$ is a fixed positive constant. Let us imagine carrying out the proof in the present context. Then, either we shall obtain the same conclusion as in Theorem 2.5, or the relevant Wronskian shall vanish: note that this was very easy to exclude in the previous situation, but is a different matter now, because the Wronskian criterion says that $W = 0$ if and only if the relevant functions are linearly dependent over the field of constants for the relevant derivation; in the present case, this field is $L^p$, where $L := \mathbb{F}_p(x, y)$; this is a much bigger field than the ground field $\mathbb{F}_p$ (whereas these fields coincide in zero characteristic).

Let us check that nevertheless this dependence cannot occur. Otherwise we would have a relation $Q(u, v) = 0$, where $Q(U, V) = Q_1(U)(U - 1) + Q_2(U, V)(V - 1)$, $Q_1$ resp. $Q_2$ being polynomials in $L^p[U, V]$, not both zero, of degrees $\le k - 1$ resp. $k$ in $U$ and $\le 0$ resp. $h - 1$ in $V$. Since $u = x/x^p$, $v = y/y^p$, we have $Q^*(x, y) = 0$, where $Q^*(X, Y) = Q_1(X/x^p) \cdot (1 - X/x^p) + Q_2(X/x^p, Y/y^p)(1 - Y/y^p)$. Therefore, $Q^* \in L^p[X, Y]$ as well, and $\deg_X Q^* \le k$, $\deg_Y Q^* \le h$.

Now, we have $f(x, y) = Q^*(x, y) = 0$. Consider $R(Y) \in L^p[Y]$, defined as the resultant with respect to $X$ of $f(X, Y)$, $Q^*(X, Y)$. This resultant is not identically zero. In fact, $f(X, Y)$, being absolutely irreducible, would otherwise divide $Q^*(X, Y)$, which is impossible since $\deg_X Q^* = k < \deg_X f$. However, $R(y) = 0$, because $f(X, y)$ and $Q^*(X, y)$ have a common root $X = x$. But this is a contradiction for large $p$ because $\deg_Y R \le 3hk \ll \sqrt{p}$ (from the determinant expression of the resultant), while $[L^p(y) : L^p] = p$, as is easy to see.13

In conclusion, the Wronskian cannot vanish for large $p$, whence we must fall into one of the alternatives in the conclusion of the said theorem. We can quickly check that the first alternative cannot hold, and therefore we fall into the second one:

$\sum_{\nu \notin S} \min\{\nu(1 - u), \nu(1 - v)\} \le$

k hk + h + k − 1 h + 2k deg(v) + deg(u) + χ. hk + h + k hk + h + k 2

13 In fact, for large $p$, $L/L^p(y)$ is a separable extension because it is contained in $L/\kappa(y)$; it is also purely inseparable as a subextension of $L/L^p$, whence $L = L^p(y)$. The equality $[L : L^p] = p$ is, on the other hand, standard.

Finally, $\deg(v) = (p - 1)\deg(y) = (p - 1)(k + 1)$, $\deg(u) = (p - 1)\deg(x)$, whence one easily finds an explicit estimate of the shape

$$N \le \sum_{\nu \notin S} \min\{\nu(1 - u), \nu(1 - v)\} \le p + O(\sqrt{p})$$

(where the constant implicit in the $O$ term depends only on $\deg f$ and can be optimized by careful choice of $c$). Of course, this is similar to what follows from Weil's famous estimate, in turn equivalent to the Riemann hypothesis for zeta functions of curves over finite fields. In the mentioned work in progress, the plan is to recover the Weil (upper) bound,14 on extending this argument to all finite fields $\mathbb{F}_q$, using Wronskians constructed out from Hasse hyperderivatives.15 Restricting again to prime fields $\mathbb{F}_p$, an argument similar to the one above leads to the following extension of Theorem 2.2 (obtained in recent work with Corvaja):

Theorem 2.7. Let $X$ be a smooth projective irreducible curve over an algebraic closure $\kappa$ of $\mathbb{F}_p$ and let $u, v \in \kappa(X)$ have nonzero differentials and be multiplicatively independent modulo $\kappa^*$. Then, letting $S, \chi$ be as above, we have

$$\sum_{\nu \in X(\kappa) \setminus S} \min\{\nu(1 - u), \nu(1 - v)\} \le \max\left(3\sqrt[3]{2}\,(\deg u \deg v\, \chi)^{1/3},\ 12\,\frac{\deg u \deg v}{p}\right).$$

Since the gcd on the left is trivially bounded by $\min(\deg u, \deg v)$ anyway, this may be nontrivial only if the degrees are small, specifically $\le p/12$. As a simple application, let us take $X$ to be defined by $x + y = 1$, and set $u = x^m$, $v = y^n$ with, e.g., $m, n$ divisors of $p - 1$. We obtain that

The number of $x \in \mathbb{F}_p$ with $x^m = (1 - x)^n = 1$ is bounded by $\max\left(3\sqrt[3]{2}\,(mn)^{1/3},\ \dfrac{12mn}{p}\right)$.
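For small primes the count in this application can be checked by brute force. The sketch below is only an illustration: the function names are hypothetical, and the bound is coded as read from the statement above, with $\chi = 1$ for the line $x + y = 1$ and $S = \{0, 1, \infty\}$ (an assumption of this sketch).

```python
def count_common_solutions(p: int, m: int, n: int) -> int:
    """Count x in F_p with x^m = 1 and (1 - x)^n = 1, by exhaustive search."""
    return sum(
        1
        for x in range(p)
        if pow(x, m, p) == 1 and pow((1 - x) % p, n, p) == 1
    )


def stated_bound(p: int, m: int, n: int) -> float:
    """max(3 * 2^(1/3) * (mn)^(1/3), 12mn/p), as in the application above."""
    return max(3 * 2 ** (1 / 3) * (m * n) ** (1 / 3), 12 * m * n / p)


if __name__ == "__main__":
    # m, n are taken to be divisors of p - 1, as in the text.
    for p, m, n in [(7, 6, 6), (61, 12, 20), (101, 10, 10)]:
        print(p, m, n, count_common_solutions(p, m, n), round(stated_bound(p, m, n), 2))
```

For such small $p$ the trivial bound $\min(m, n)$ is already below the stated one, so the comparison becomes interesting only for larger parameters.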

In turn, setting $p - 1 = am = bn$ we easily obtain that the number of $\mathbb{F}_p$-rational affine points on the Fermat-like curve $\xi^a + \eta^b = 1$ is at most $\max\left(3\sqrt[3]{2}\,(abp)^{2/3},\ 12p\right)$. Weil's bound yields essentially an estimate $\le p + ab\sqrt{p}$. The previous estimate improves on this when $ab$ is "much" larger than $\sqrt{p}$. (Taking $m = n$ we find back a result of Garcia-Voloch, in [GV88]; this also appears as the case $T = 1$ of Lemma 5 of [HBK00], gotten with Heath-Brown's variation of Stepanov's method, which in turn led to rather delicate estimates on exponential sums.)

3. Unlikely Intersections and Holomorphic GCD in Nevanlinna Theory

As recognized mainly by P. Vojta in the 1980s, many central diophantine statements have analogues not only in the well-known case of function fields, but also in the context of holomorphic mappings from $\mathbb{C}$ to an algebraic variety; this is now commonly referred to as "Nevanlinna theory" (see, for instance, [BG06], Chapter 13). The issues that we have considered in this chapter are no exception, and recently a "holomorphic gcd" has been introduced (in a completely independent way) and studied by J. Noguchi, J. Winkelmann, and K. Yamanoi in [NWY08]. They have succeeded in obtaining very general results; here we limit ourselves to outlining a few essentials.

We consider a holomorphic function $f : \mathbb{C} \to \mathbb{G}_m^n$, assuming it is nondegenerate, i.e., that the image $f(\mathbb{C})$ is Zariski-dense. It is a corollary of a well-known theorem of H. Cartan that for every divisor $D$ in $\mathbb{G}_m^n$ the intersection $f(\mathbb{C}) \cap D$ is nonempty (so, in a sense, $f(\mathbb{C})$ behaves like an algebraic curve, although $f$ is highly transcendental, being nondegenerate). This is a higher-dimensional generalization of Picard's theorem.

14 It is well known that a "correct" upper bound valid over all finite fields $\mathbb{F}_{q^m}$ is sufficient for the cited Riemann hypothesis.

15 These Wronskians had been used also by K.-O. Stöhr and J.F. Voloch to obtain a proof of Weil's bounds, and we can extrapolate that there are certainly analogies behind these methods. But there are also differences in some respects. For instance, it is peculiar that the present method is inspired by, and is analogous to, a result in characteristic zero, i.e., Theorem 2.2, that admits applications in contexts which appear to be of distant nature.

More precisely, one first defines a function Nf (D, r) of real positive r > 0, which measures with weights the intersections f (z) ∈ D, for 0 < |z| < r. One sets Z r nf (D, s) Nf (D, r) = ds, s 0 P where nf (D, s) := 0 1, and actually it takes some work in [PZ08] to provide suitable details for this step in full generality.8 (I owe to D. Bertrand the remark that the needed conclusion may be also readily derived as an application of the results of J. Ax’s paper [Ax72] on algebraic dependence on analytic subgroups of algebraic groups.) In any case, we shall stress that this step, of proving that there are no algebraic arcs on the relevant varieties, may represent a substantial difficulty in other applications of the method. Step (vi): All of this does not still yield any finiteness result, merely estimates for the number of torsion points on X of a given order N . The crucial issue is that if X contains an algebraic point x ∈ X( ), it automatically contains its conjugates over a field of definition. Now, Masser has proved in [Mas84] that the degree, over a field of definition for A, of any torsion point of exact order N is  N ρ for a certain ρ > 0 depending only on dim A. (See Appendix D by Masser for a sketch of the arguments in the elliptic case.) Hence, if X contains a point of order N , it contains at least  N ρ such points.9

Q

Step (vii): Finally, comparing the estimates coming from (v) and (vi), where we choose $0 < \epsilon < \rho$, we deduce that the order of the torsion points on $X$ is bounded, concluding the argument.

This same proof pattern works perfectly also for the toric case, i.e., for torsion points on subvarieties of $\mathbb{G}_m^n$; we have explicitly illustrated this in a basic situation at the end of Section 1.1. In this direction see especially the survey paper [Sca11a] (which also discusses in some detail Step (iv), especially concerning the issues of the next chapter). The toric case is, however, much less significant for the method, since we have seen in Chapter 1 (Section 1.1) that there are several other elementary approaches; hence we do not discuss this issue any further here. We do not pause further on the Manin-Mumford context either, referring to the paper [PZ08] for more details, and we turn to Masser's problems, which as noted lie in a nearby area. However, before ending the section, we include a brief history of the origins of an essential ingredient in the above method, which shall appear also in the further topics we shall touch.

Remark 3.1.1 Estimates for the distribution of rational points on transcendental varieties. At Step (iv) of the above description, we have mentioned some estimates concerning the distribution of rational points lying in a bounded region of a real analytic variety. We have quoted the paper [BP89] by Bombieri and Pila as the first one in this direction. This paper, which was partly motivated by Jarník's paper [Jar26] and a few subsequent ones, in fact contained different types of results. In the first place, it offered some estimates for the number of rational and integral points in bounded regions of plane algebraic curves; remarkably, the bounds were nearly best-possible and uniform in the coefficients defining the curve. The methods involved polynomial interpolation in two variables, and extrapolation via comparison of bounds coming from (a) mean value theorems and (b) integer valuedness of the relevant determinants. (See Appendix A below for some details on this step.)

8 More precisely, in [PZ08], Theorem 2.1, the following result is proved: Let $Z \subset \mathbb{C}^g$ be a periodic (under a lattice of rank $2g$) analytic set. Then the union of all connected real semialgebraic sets of positive dimension contained in $Z$ coincides with the union of all the cosets in $Z$ of complex linear subspaces intersecting the said lattice in a sublattice of maximal rank.

9 It may be shown that this estimate remains valid over any field of definition of zero characteristic. For instance, if $A$ is defined over a field of transcendence degree $r$, one may view $A$ as a family of abelian varieties $A_x$, for $x$ in a variety $V/\overline{\mathbb{Q}}$ of dimension $r$; in this case, one may specialize $x$ to a point of good reduction $x_0 \in V(\overline{\mathbb{Q}})$ and reduce to the result for $A_{x_0}$. However, sometimes for fields of positive transcendence degree, such estimates are distinctly easier to obtain directly (think, for example, of the case of elliptic curves).

Second, a further completely new feature of this paper, especially relevant here, was the consideration, with analogous methods, of bounds for the number of rational points of given denominator lying on transcendental arcs. For instance, it was proved that

If $f(x)$ is a real transcendental analytic function on a compact interval, and if $\Omega(N)$ denotes the number of rational points with denominator $N$ lying on the graph of $f(x)$, then for every $\epsilon > 0$ we have $\Omega(N) \ll_{f,\epsilon} N^{\epsilon}$.

(See Theorem 1 of [BP89] for an essentially equivalent statement.) This viewpoint came especially from an explicit question of Sarnak and represents a crucial issue because it motivated the developments needed for the present applications.

In fact, subsequently, with this last viewpoint in mind, Pila developed the method of [BP89] for (real subanalytic) surfaces. In this context it is no longer sufficient to assume that the surface is transcendental to obtain bounds of the same strength; in fact, if, for instance, the surface contains an arc of an algebraic curve, then there may be too many rational points already on the arc. (E.g., this shall certainly happen if the said arc belongs to a rational curve defined over $\mathbb{Q}$.) Therefore, one counts merely the rational points (of denominator dividing $N$) outside the union of the algebraic arcs lying on the surface.10 Then, in [Pil04] and in [Pil05], bounds of the shape $\ll N^{\epsilon}$ were obtained for the number of remaining points. The paper [Pil05] was based on the same method as [Pil04], but with a technical refinement; as a result, the bounds referred to rational points of denominator at most $N$ (rather than dividing $N$), which may be an important difference in some applications. In addition to the methods employed for the one-dimensional case, the proofs of these two-dimensional results needed delicate uniform estimates (due, e.g., to Gabrielov) for the number of connected components in the context of fibers of projections of real-subanalytic varieties. (Bombieri had actually pointed out from the beginning that subanalytic varieties could play an important role, due, for instance, to their good behavior under projection operations; see Example 11 in the next chapter for definitions.)
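The contrast between transcendental and algebraic arcs can be seen in a toy computation. The sketch below is an illustration only, not taken from [BP89]: the two curves, the function names, and the convention that a point "of denominator $N$" means a point $(a/N, b/N)$ with integers $a, b$ are all assumptions made here. On $y = 2^x$ over $[0, 1]$ the membership test is done exactly, via $b^N = 2^a N^N$.

```python
def count_on_exponential(N: int) -> int:
    """Count points (a/N, b/N), 0 <= a <= N, on the graph of y = 2^x.

    The condition b/N = 2^(a/N) is tested exactly as b^N == 2^a * N^N;
    floating point is used only to propose candidate values of b.
    """
    count = 0
    for a in range(N + 1):
        target = 2 ** a * N ** N
        guess = round(N * 2 ** (a / N))
        if any(b > 0 and b ** N == target for b in (guess - 1, guess, guess + 1)):
            count += 1
    return count


def count_on_parabola(N: int) -> int:
    """Count points (a/N, b/N), 0 <= a <= N, on y = x^2, i.e. those a with N | a^2."""
    return sum(1 for a in range(N + 1) if (a * a) % N == 0)


if __name__ == "__main__":
    for N in (4, 9, 25, 36):
        print(N, count_on_exponential(N), count_on_parabola(N))
```

On the transcendental graph only the endpoints survive, whereas on the parabola the count for $N = k^2$ grows like $\sqrt{N}$, which is the kind of behavior that forces algebraic arcs to be removed in the surface case.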

It was then clear that further extensions in desirable generality to even higher dimensions would require appropriate language and the consideration of suitable categories of relevant varieties. All of this was finally developed in the paper [PW06] by Pila-Wilkie, for varieties which are definable in an o-minimal structure over $\mathbb{R}$. We skip here any formal definition, postponing a brief description to the notes to the next chapter; however, we note that these categories include the compact subanalytic sets considered earlier by Pila, but also larger sets of varieties, a generality which becomes important in the applications we shall consider next, in Chapter 4. In these more recent applications ([Pil09b], [Pil11]) Pila considered also algebraic points of bounded degree rather than merely rational points (see, e.g., [Pil09a]), again within the category of varieties which are definable in an o-minimal structure, especially applying the results for the structure denoted $\mathbb{R}_{\mathrm{an,exp}}$; this is "generated," so to say, by the globally subanalytic maps and the exponential map (see the notes to Chapter 4). We shall illustrate the use of such results in the next chapter.

In Appendix A below we shall provide some detail for estimates in this context (limiting to the case of surfaces, sufficient for the present chapter): this has the mere purpose of illustrating some of the main ideas in this realm, rather than providing any complete and precise account of proofs for these tools, auxiliary to the solution of the main problems.

3.2 Masser's questions on elliptic pencils

Let us start with one of the simplest issues raised by Masser, which provided much motivation and insight for further questions. Let us consider the Legendre family of elliptic curves defined in the affine plane by an equation $y^2 = x(x - 1)(x - \lambda)$ (for $\lambda \neq 0, 1$). We can view it in several ways, more or less equivalent for the present purposes, and we shall often tacitly switch among these perspectives. Namely, we can consider it as an elliptic curve $E = E_\lambda$ defined over $\mathbb{Q}(\lambda)$, or as an elliptic group-scheme $L$ over $\mathbb{P}_1 \setminus \{0, 1, \infty\}$, or also as just a (singular) surface (in $\mathbb{P}_3$ or $\mathbb{P}_1 \times \mathbb{P}_2$) with a map $\lambda$ to $\mathbb{P}_1$ whose generic fiber $E_\lambda$ is an elliptic curve (we have an "elliptic pencil," or pencil of elliptic curves, over $\mathbb{P}_1 \setminus \{0, 1, \infty\}$). Restricting to $\lambda \neq 0, 1, \infty$, the surface becomes nonsingular:

Example 3.1 Legendre quasiprojective surface $L$ in $\mathbb{P}_1 \times \mathbb{P}_2$:

$$L : \quad Y^2 Z = X(X - Z)(X - \lambda Z), \qquad \lambda \in \mathbb{P}_1 \setminus \{0, 1, \infty\}. \tag{3.2.1}$$

10 The structure of algebraic arcs on a surface may be fairly complicated, even if the surface is subanalytic; take, e.g., the surface in $\mathbb{R}^3$ determined by $z = x^y$, $x, y \in [2, 3]$. It contains the algebraic arcs $z = x^{y_0}$, $x \in [2, 3]$, for any given $y_0 \in \mathbb{Q} \cap [2, 3]$, and no other algebraic arc, as is easy to see.

Let us now take two points in $E_\lambda(\overline{\mathbb{Q}(\lambda)})$, choosing for simplicity (I follow Masser's original example):

$$P = P_\lambda = \big(2, \sqrt{2(2 - \lambda)}\big), \qquad Q = Q_\lambda = \big(3, \sqrt{6(3 - \lambda)}\big).$$

Here we are using a slight abuse of notation: there are two points for each choice of abscissa (and this yields two complex points for each complex value of $\lambda$); of course, we may choose a fixed square root in each case. It would be perhaps more appropriate to view the points as sections $\sigma : C \to E$ of an elliptic pencil $E \to C$, where $C$ is a (smooth) curve with function field $\mathbb{Q}(\lambda, \sqrt{2 - \lambda}, \sqrt{3 - \lambda})$, and where $E$ is obtained as a pullback of the Legendre pencil $L \to \mathbb{P}_1$ by the map $\lambda : C \to \mathbb{P}_1$. However, here and in the sequel we shall often skip such precisions, when there is no risk of confusion.

Now, for any complex $\lambda_0 \neq 0, 1$ we obtain an elliptic curve $E_{\lambda_0}/\mathbb{C}$ (the fiber of $L$ above $\lambda_0$), and two points $P_{\lambda_0}, Q_{\lambda_0} \in E_{\lambda_0}$. The question fundamental for us now is:

Question: What can be said about the values $\lambda_0 \in \mathbb{C} \setminus \{0, 1\}$ of $\lambda$ such that both $P_{\lambda_0}, Q_{\lambda_0}$ have finite order on $E_{\lambda_0}$?

Here we implicitly refer to the group law on $E_{\lambda_0}$ with the point at infinity as origin.

Remark 3.2.1 Analogy with Manin-Mumford-Raynaud: Note that if the curve $E$ was fixed, namely, independent of $\lambda$ (e.g., $E : y^2 = x(x - 1)(x - 3)$), and if $P_\lambda, Q_\lambda$ were again points in $E(\overline{\mathbb{Q}(\lambda)})$ (e.g., $P_\lambda = (\lambda, \sqrt{\lambda(\lambda - 1)(\lambda - 3)})$, $Q_\lambda = (1 - \lambda, \sqrt{\lambda(1 - \lambda)(2 + \lambda)})$), the point $P_\lambda \times Q_\lambda$ would describe (for varying $\lambda$) a curve in $E \times E$. The analogue of the present question would then amount to the description of the torsion points on such a curve and would fall into a special case of the generalization by Raynaud of the Manin-Mumford conjecture treated in the previous section.11 Hence we may say that the above question represents an analogue of Manin-Mumford for elliptic curves (or abelian varieties) varying in families. We remark that, from the viewpoint of the proof methods, the fact that the ambient space is varying makes a substantial difference with respect to the constant case, and in particular it seems that it does not allow other known proofs to work in a straightforward way. We shall soon point out that this generalization of the Manin-Mumford context is in turn a case of Pink's conjectures on abelian schemes.

Some remarks on the Question

1. Unlikely intersections and special varieties of the context. The point $P_\lambda \times Q_\lambda$ corresponds to a curve on the threefold given by the fiber product $L \times_{\mathbb{P}_1} L$ (which is the set of $(u, v) \in L^2$ with $\lambda(u) = \lambda(v)$). Removing the bad values $0, 1, \infty$ of $\lambda$ from $\mathbb{P}_1$, the resulting threefold becomes an abelian group-scheme over $\mathbb{P}_1 \setminus \{0, 1, \infty\}$. (The curve given by $P_\lambda \times Q_\lambda$ may be also described as the image of $C$ by the pair of sections of the pencil $E^2 \to C$ provided by the points.) Also, if $T_m, T_n$ are torsion points of orders resp. $m, n$ in $E(\overline{\mathbb{Q}(\lambda)})$, the point $T_m \times T_n \in E(\overline{\mathbb{Q}(\lambda)})^2$ again yields a curve on $L \times_{\mathbb{P}_1} L$. The unlikely intersections come from those among these last curves that meet the curve described by $P_\lambda \times Q_\lambda$. Since we have two curves in a threefold, the intersections are indeed unlikely, and for them we expect (perhaps) finiteness for varying $m, n$.

Note, however, that if we had a nontrivial relation $aP_\lambda = bQ_\lambda$ with integers $a, b$ not both zero, i.e., if $P_\lambda$ and $Q_\lambda$ were linearly dependent over $\mathbb{Z}$, then $P_{\lambda_0}$ being torsion and $Q_{\lambda_0}$ being torsion would imply each other (if $ab \neq 0$), so indeed the finiteness expectation would not hold. (In fact, we shall soon check the apparently evident fact that there are infinitely many special values $\lambda_0$ that make a single point torsion.) Such a dependence relation defines the special varieties in this problem, namely the analogue for this context of the torsion cosets for the toric (or Manin-Mumford) context. Such special varieties are group subschemes.12

Let us also note that with the actual choice of the points, this special situation of $P_\lambda, Q_\lambda$ being dependent does not occur. Indeed, the minimal fields of definition of $P_\lambda, Q_\lambda$ are disjoint quadratic extensions of $\mathbb{Q}(\lambda)$, whence a conjugation (over $\mathbb{Q}(\sqrt{3 - \lambda})$) would yield $2aP = 0$, and similarly $2bQ = 0$. However, neither of $P_\lambda, Q_\lambda$ is (identically) torsion on $E_\lambda$, as can be seen in a number of ways. For instance, we may notice that the fields of definition of $P_\lambda$ and $Q_\lambda$ are ramified (over $\mathbb{Q}(\lambda)$) above $\lambda = 2, 3$, respectively, whereas the field generated over $\mathbb{Q}(\lambda)$ by the torsion points is ramified only above $0, 1, \infty$, which are the $\lambda$-values of bad reduction. (This last fact amounts to the injectivity of torsion under good reduction.) Or else we may specialize $\lambda$ to a rational number $\rho$, and apply the Lutz-Nagell theorem to $2P_\rho$; taking, e.g., $\rho = 6$, we obtain that $P_6$ is not torsion on $E_6$, whence $P_\lambda$ cannot be torsion on $E_\lambda$. (See, e.g., [Sil92], Prop. 3.1, p. 176, for these standard facts.)

11 For the present example, we obtain a finiteness conclusion, since the points are linearly independent over $\mathbb{Z} = \mathrm{End}(E)$, as is easy to check.

2. Torsion values of $\lambda$ for a single point. Note that a single relation $nP_\lambda = 0$ yields, for any given $n \neq 0$, a "vertical" curve in $L$. For varying $n$ this should produce infinitely many complex values $\lambda_0 \neq 0, 1$ that make $P_{\lambda_0}$ torsion (and similarly for $Q$). Indeed, this can be proved. Nevertheless, it is perhaps less obvious than it may seem, and so let us study this issue a bit.

For instance, we may compute $x$-coordinates of multiplication by $n$ on $E_\lambda$, expressing them in the shape $A_n(\lambda, x)/B_n(\lambda, x)$ for certain coprime polynomials $A_n, B_n \in \mathbb{Z}[\lambda, x]$. They remain coprime as polynomials in $\mathbb{C}[x]$ at every specialization $\lambda = \lambda_0 \neq 0, 1$, because if $A_n(\lambda_0, x)$ and $B_n(\lambda_0, x)$ had a common factor, then $E_{\lambda_0}$ would have fewer than $n^2$ points of order dividing $n$. In turn, the (possible) common roots of $A_n(\lambda, x_0)$, $B_n(\lambda, x_0)$ lie in $\{0, 1\}$ (i.e., the set of bad reduction) for each $x_0 \in \mathbb{C}$ and in particular for $x_0 = 2, 3$. Now note that the order of $P_{\lambda_0}$ ($\lambda_0 \neq 0, 1$) divides $n$ if and only if $B_n(\lambda_0, 2) = 0$ (and in particular, of course, $\lambda_0$ has to be algebraic). We have, for instance,

$$B_2(\lambda, 2) = 8(2 - \lambda), \qquad B_3(\lambda, 2) = (\lambda^2 + 8\lambda - 16)^2, \qquad B_4(\lambda, 2) = 32\lambda^2(\lambda - 2)(\lambda - 4)^2(3\lambda - 4)^2.$$
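These small cases can be cross-checked with the classical duplication formula on the Legendre curve, $x(2P) = (x^2 - \lambda)^2 / \big(4x(x - 1)(x - \lambda)\big)$. The sketch below is a sanity check under that formula and not the computation referred to in the text; the function name is hypothetical. At $x = 2$ the formula reads $(4 - \lambda)^2 / \big(8(2 - \lambda)\big)$, whose denominator is $B_2(\lambda, 2)$. The point $P_{\lambda_0}$ has order 4 when $x(2P)$ is the abscissa $0$, $1$, or $\lambda_0$ of a 2-torsion point, and order 3 when $x(2P) = x(P) = 2$.

```python
from fractions import Fraction
import math


def x_of_double(x, lam):
    """x-coordinate of 2P on y^2 = x(x-1)(x-lam), for a point P with abscissa x and y != 0."""
    return (x * x - lam) ** 2 / (4 * x * (x - 1) * (x - lam))


if __name__ == "__main__":
    two = Fraction(2)
    # Roots lam = 4 and lam = 4/3 of B_4(lam, 2): 2P is the 2-torsion point (0,0), resp. (lam,0).
    print(x_of_double(two, Fraction(4)))      # abscissa of 2P at lam = 4
    print(x_of_double(two, Fraction(4, 3)))   # abscissa of 2P at lam = 4/3
    # A root of lam^2 + 8 lam - 16 (a factor of B_3): 2P has the same abscissa as P.
    lam3 = -4 + 4 * math.sqrt(2)
    print(x_of_double(2.0, lam3))
```

The remaining root $\lambda = 0$ of $B_4(\lambda, 2)$ corresponds to $x(2P) = 1$, which occurs only at a value of bad reduction.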

Note also that none of the $B_n(\lambda, 2)$ is identically zero, otherwise $P_\lambda$ would be identically torsion, which we have already excluded. For this discussion, let $S_n$ denote the set of complex zeros $\neq 0, 1$ of the polynomial $B_n(\lambda, 2)$. If the union $S := \{0, 1\} \cup \bigcup_{n \ge 1} S_n$ were finite, then, by taking the multiples of $P_\lambda$ as $n$ varies, we would obtain infinitely many points in $E_\lambda(\mathbb{Q}(\lambda, \sqrt{2(2 - \lambda)}))$, with $x$-coordinates of unbounded degree in $\lambda$, and actually lying in the finitely generated ring $\mathbb{Q}[\lambda, \sqrt{2 - \lambda}, \prod_{\xi \in S}(\lambda - \xi)^{-1}]$; but this contradicts Siegel's theorem on integral points over function fields.13 Hence $S$ is an infinite set, and thus there are infinitely many torsion specializations for $P_\lambda$, and similarly for $Q_\lambda$. This proves actually more, namely that there are infinitely many torsion values for the point $P_\lambda$ in any infinite sequence of torsion orders.14

There are at least two other methods for proving this infinitude without appealing to Siegel's theorem; they are, respectively, analytical and through reduction modulo primes, and shall be recalled in the notes below. (The analytical method yields the refined result that for each large $n$ there are new torsion values, i.e., roots of $B_n(\lambda, 2)$ that we have not met for smaller order and which produce a point of exact order $n$. They are the roots of polynomials denominated "bicyclotomic" by Masser.) This issue of the infinitude of torsion values for a single point is analogous to the (easier) infinitude of the set $X^{(n-1)}$ considered in Chapter 1 (see Remark 3 to Theorem 1.2).

12 They are all the group subschemes which are flat over $\mathbb{P}_1 \setminus \{0, 1, \infty\}$. Other nonflat group subschemes, not contained in any of these, are obtained by adding to the zero-subscheme either the fibers $E_{\lambda_0}$ or the finite subgroups of fibers $E_{\lambda_0}$.

13 See, e.g., [Sil02]; such a case of Siegel's theorem is nontrivial but much easier to prove than the number-field case, especially the version needed here.

14 Note that if $\lambda_0 \neq 0, 1$ were a common zero of $B_m(\lambda, 2)$, $B_n(\lambda, 2)$, then the order of $P_{\lambda_0}$ would divide $(m, n) = 1$, which is impossible. Hence, the sets $S_\ell$, with $\ell$ running, for instance, through the primes, form pairwise disjoint sets. However, this is not enough to conclude infinitude of $S$ because a priori all these $S_\ell$ could be empty!

3. Sparseness of torsion values. The "torsion" values of $\lambda$ (relative either to $P_\lambda$ or $Q_\lambda$), although making up an infinite set, as we have seen, are, however, rather sparse, as follows from work of Masser [Mas89b], already mentioned above and relevant in connection with Theorem 1.2. Actually, a very special case of a theorem of Silverman (see [Sil83] or also [Sil02]) shows that the absolute Weil height $h(\lambda)$ of such values (recall they are algebraic numbers) is bounded. This not only makes plausible the finiteness expectation for "doubly torsion," truly unlikely, values (we now have double sparseness), but shall also be important information for the subsequent proof arguments. The whole situation is reminiscent of, and in fact analogous to, Theorem 1.2. Let us then give a precise statement and a brief sketch of the argument of Silverman, in the simple case of our special context. (See Appendix C by Masser for a direct proof, different from Silverman's, of a more general case concerning specialization of a finitely generated group, whereas the present one is the case of a cyclic group.)

Proposition 3.2. ([MZ10b], Lemma 6.1.) There is a number $c$ such that if either $P_{\lambda_0}$ or $Q_{\lambda_0}$ is torsion on $E_{\lambda_0}$, then $\lambda_0$ is an algebraic number with $h(\lambda_0) \le c$.

Proof. We assume a few basic facts concerning heights on elliptic curves. Let us consider a large but fixed multiple of $P_\lambda$, say $n_0 P_\lambda$. This is a point on $E_\lambda$, defined over $\mathbb{Q}(P_\lambda) = \mathbb{Q}(\lambda, \sqrt{2(2 - \lambda)})$, whose $x$-coordinate shall be a certain rational function $R(\lambda) = R_{n_0}(\lambda)$ of $\lambda$. The degree of $R(\lambda)$ is a "naive" (function field) height of $n_0 P_\lambda$, and may be compared to the Néron-Tate height (in the function field sense) by a well-known result of Zimmer [Zim76]; this says that the difference between the Néron-Tate height and the naive height is bounded by an absolute constant times the height of the elliptic curve. In our setting, since $P_\lambda$ is not identically torsion we have $\hat{h}(n_0 P_\lambda) \gg n_0^2$, whence Zimmer's result (over function fields) yields that $\deg R \ge c_1 n_0^2$ for a computable positive constant $c_1$ depending (like the subsequent $c_2, c_3, \ldots$) only on $E, P_\lambda$.15

On the other hand, if $P_{\lambda_0}$ is torsion then $\hat{h}(n_0 P_{\lambda_0}) = 0$, where now $\hat{h}$ is the (standard) Néron-Tate height on $E_{\lambda_0}$. By the same result of Zimmer (this time over the algebraic numbers), we have for the naive height $h(n_0 P_{\lambda_0}) \le \hat{h}(n_0 P_{\lambda_0}) + c_2 h(E_{\lambda_0}) + c_3 \le c_4 h(\lambda_0) + c_5$. On the other hand, by elementary estimates for the height of values of a rational function (see [BG06] or [Zan09]), the naive height is bounded below as follows: $h(n_0 P_{\lambda_0}) = h(R(\lambda_0)) \ge \deg R \cdot h(\lambda_0) - c_6 h(R) \ge c_1 n_0^2 h(\lambda_0) - c_6 h(R)$. Here $h(R)$ denotes the height of the coefficient vector of $R$. Now it suffices to choose, for instance, $n_0 > \sqrt{2c_4/c_1}$ to obtain the sought bound for $h(\lambda_0)$.
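The final step of the proof can be written out explicitly. The following is only a sketch of how the upper and lower bounds for the naive height combine, with the constants $c_1, \ldots, c_6$ as in the proof and with the harmless extra assumption $c_4 > 0$.

```latex
\begin{align*}
c_1 n_0^2\, h(\lambda_0) - c_6\, h(R)
   &\le h(n_0 P_{\lambda_0}) \le c_4\, h(\lambda_0) + c_5, \\
(c_1 n_0^2 - c_4)\, h(\lambda_0) &\le c_5 + c_6\, h(R), \\
n_0 > \sqrt{2c_4/c_1} \;\Longrightarrow\; c_1 n_0^2 - c_4 > c_4
   \;&\Longrightarrow\; h(\lambda_0) < \frac{c_5 + c_6\, h(R)}{c_4}.
\end{align*}
```

Since $n_0$ is fixed once and for all, $R = R_{n_0}$ and hence $h(R)$ are fixed as well, so the right-hand side is a constant $c$ independent of $\lambda_0$.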

The same argument yields that for each given $B \ge 0$, there is a bound $h(\lambda_0) \le c_B$ for the $\lambda_0 \in \overline{\mathbb{Q}}$ such that $\hat{h}(P_{\lambda_0}) \le B$ (the above being the case $B = 0$). The conclusion in particular implies finiteness for the set of values of $\lambda$ over a given number field or even with bounded degree over $\mathbb{Q}$, which yield torsion either for $P_\lambda$ or for $Q_\lambda$, and we again stress that this provides in itself rather strong support for a finiteness expectation for the set of values in $\overline{\mathbb{Q}}$ which yield torsion for both.

15 The computability of $c_1$ follows, e.g., from the interpretation of the Néron-Tate height on function fields as a certain intersection product; see [Sil02]. This step involving Zimmer's theorem may be avoided for the actual special choice of the point $P_\lambda$, on computing explicitly $R(\lambda)$, e.g., for $n_0 = 3$: see [MZ10b]; however, the whole argument becomes important in dealing with more general points.

3.3 A finiteness proof

Let us now sketch a proof of the said expectation, namely:

Theorem 3.3. There are at most finitely many complex numbers $\lambda_0$ such that both $P_{\lambda_0}, Q_{\lambda_0}$ are torsion on $E_{\lambda_0}$.

We shall use, roughly, the method already outlined for the Manin-Mumford statement. Here the point is that the ambient abelian variety (i.e., $E_\lambda^2$) varies with $\lambda$, and thus we cannot use a single uniformization map. However, we may use individual uniformization maps and let them vary analytically. Another important difference with the "constant" case is that, when taking the conjugates, the conjugate points shall lie on distinct elliptic curves, namely, on the $E_{\lambda_0^\sigma}$, where $\lambda_0^\sigma$ are the conjugates of a relevant algebraic number $\lambda_0$. In this respect, we shall need a lower bound for the degree of the relevant $\lambda_0$, rather than for the degree of the torsion points over the field of definition of the abelian variety, which was the estimate useful in the Manin-Mumford context. (Observe that here the degree of $P_{\lambda_0}$ over the field of definition $\mathbb{Q}(\lambda_0)$ of $E_{\lambda_0}$ is at most 2!)

To take care of the variation of the involved elliptic curves, we may view $E_\lambda$ as a complex torus depending on $\lambda$; i.e., for $\lambda \in \mathbb{C} \setminus \{0, 1\}$ we have a uniformizing map $\beta_\lambda : \mathbb{C}/L_\lambda \to E_\lambda$,16 where $L_\lambda$ is a suitable lattice in $\mathbb{C}$, varying locally analytically. If we choose real coordinates in $\mathbb{C}$ corresponding to a basis for this lattice, a torsion point on $E_\lambda$ corresponds to a rational point in $\mathbb{R}^2$, or rather to a class in $\mathbb{R}^2/\mathbb{Z}^2$; here, to get rid of the class it may be simpler to think of a fundamental parallelogram in $\mathbb{C}$ for $\mathbb{C}/L_\lambda$. Now, since $L_\lambda$ varies analytically with $\lambda$, the family of the $E_\lambda$ gets uniformized by a continuous or even analytic 2-dimensional (over $\mathbb{R}$) family of such parallelograms, parametrized by $\lambda$; using a basis of $L_\lambda$, which also varies locally analytically, we may replace this family just by $[0, 1)^2 \times \mathbb{C} \setminus \{0, 1\}$. And it shall be convenient to view the domain $\mathbb{C} \setminus \{0, 1\}$ of $\lambda$ as well as $\mathbb{R}^2$ minus two points.
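The dictionary between torsion points and rational points is elementary once coordinates with respect to a lattice basis are fixed: a class in $\mathbb{R}^2/\mathbb{Z}^2$ has finite order exactly when its coordinates are rational, and the order is the least common multiple of the reduced denominators. A minimal sketch follows (the function name is hypothetical, and the coordinates are assumed to be given as exact fractions).

```python
from fractions import Fraction
from math import lcm


def torsion_order(coords):
    """Order in (R/Z)^k of the class with the given rational coordinates.

    The order is the lcm of the denominators of the coordinates in lowest terms.
    """
    return lcm(*(Fraction(c).denominator for c in coords))


if __name__ == "__main__":
    # A point with lattice coordinates (1/4, 5/6) has order lcm(4, 6).
    print(torsion_order((Fraction(1, 4), Fraction(5, 6))))
    # A pair of points of orders 4 and 6, viewed as one four-tuple (x, y, u, v):
    pair = (Fraction(1, 4), Fraction(0), Fraction(1, 6), Fraction(1, 2))
    print(torsion_order(pair))
```

In particular, a pair of torsion points of orders $m$ and $n$ gives a four-tuple whose denominators all divide $mn$.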

Locally, by means of elliptic logarithms, we may now view $P_\lambda, Q_\lambda$ as points in the fundamental parallelogram corresponding to $\lambda$, and we may consider their coordinates in the chosen basis for $L_\lambda$; in this way we obtain two pairs $(x, y), (u, v)$ with real $x, y, u, v \in [0, 1)$, representing, respectively, $P_\lambda, Q_\lambda$; here $x, y, u, v$ are real functions of $\lambda$, which locally are real analytic. The four-tuple $(x, y, u, v)$ defines a point in $[0, 1)^4$ varying with $\lambda$ in the said way, and describing a (real) surface $X$ in $[0, 1)^2 \times (\mathbb{C} \setminus \{0, 1\})$. Now, since the points $P_\lambda, Q_\lambda$ vary algebraically with $\lambda$ in the Legendre model, and since the uniformizing maps are transcendental, we may expect that the said surface will be transcendental; actually, thinking of analytic continuation, we may expect the stronger fact that it will contain no real algebraic arc, because moving $\lambda$ along such an arc would yield algebraic dependence both for $P_\lambda, Q_\lambda$ and for their images under the said transcendental map. (A proof of this corresponds to Step (v) in the above description of the method for the Manin-Mumford context.) If we take this for granted, a theorem of Pila (see [Pil04] or a refined version in [Pil05]) asserts that the number of rational points with denominator dividing $N$, on any fixed compact piece of the surface $X$, shall be bounded by $c(X, \epsilon) N^{\epsilon}$ for any prescribed $\epsilon > 0$. By contrast, such a bound does not hold a priori if $\lambda$ is allowed to move in a noncompact region. Fortunately for our purposes, in view of Proposition 3.2 above, we may consider only the $\lambda$ with a certain prescribed bound on the height; this indeed allows us to work inside a compact part of $X$.

16 This may be given in terms of a Weierstrass ℘-function associated to $L_\lambda$; see below.


The resulting estimate corresponds to Step (vi) in the description of the method given in the previous section. At this point we work again on the Legendre model. Suppose that λ0 is a torsion value for either Pλ0 or Qλ0 , that is, suppose that at least one of these points is torsion, say of order m > 0 (depending of course on λ0 ). Then a deep result of Sinnou David [Dav97] implies (via a simple argument of Masser in [Mas89a])

\[ m \le c_1 [\mathbb{Q}(\lambda_0) : \mathbb{Q}]^2 (1 + h(\lambda_0)), \tag{3.3.2} \]

for an absolute (effective) constant $c_1$. (See Lemma 5.1 in [MZ10b].) A weaker, but still sufficient version of this is actually implicit in Masser's paper [Mas89a], which (in particular) gives lower bounds for the degree of the field of definition of a torsion point on an elliptic curve $E$; if the point has exact order $n$, the bound is of the shape $c(E) n / \log n$, for an effective $c(E) > 0$. Here the elliptic curve is varying, but the method actually yields the estimate $c(E)^{-1} \ll (d(E) + h(E))^{\kappa}$, where $d(E)$ is the degree of a field of definition for $E$ and where the implicit constant and $\kappa$ are absolute and effective. The methods for proving these results come from transcendental number theory; later, a proof shall be outlined in Appendix D by Masser. In our case, both points $P_{\lambda_0}, Q_{\lambda_0}$ are defined over a field of degree $\le 4$ over $\mathbb{Q}(\lambda_0)$, hence Masser's said estimate immediately leads to a bound similar to the displayed inequality (3.3.2), except that the right side has to be raised to an exponent $\kappa + 1$, say. However, it shall be clear from the coming discussion that raising to any fixed exponent suffices for our purposes.

Anyway, due to the boundedness of the height delivered by Proposition 3.2, inequality (3.3.2) yields the lower bound

\[ [\mathbb{Q}(\lambda_0) : \mathbb{Q}] \ge c_2 \sqrt{m} \tag{3.3.3} \]

for an absolute (and effective) constant $c_2 > 0$. Now we use the algebraic setting, supposing that $\lambda_0$ is doubly torsion, i.e., that both $P_{\lambda_0}, Q_{\lambda_0}$ are torsion on $E_{\lambda_0}$; let $\{m, n\}$ be the set of the respective orders and suppose $m \ge n > 0$. Then we note that the conjugates of $P_{\lambda_0}, Q_{\lambda_0}$ over $\mathbb{Q}$ yield other torsion points, of the same respective orders. These new points shall correspond to new doubly torsion values of $\lambda$, namely the conjugates of $\lambda_0$; and by (3.3.3) we obtain $\gg \sqrt{m}$ such values. In turn, the transcendental picture gives a corresponding set of new rational points on the said surface, and we may assume many of them lie in a prescribed compact part, due again to the boundedness of the height. Observe that these conjugate points also have order dividing $mn$, and thus shall correspond to rational points with denominator dividing $mn$.
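The passage from torsion orders to denominators can be made concrete with a small sketch. It assumes that a point is given by exact rational coordinates with respect to a lattice basis; the helper name `torsion_order` and the sample values are ours and purely illustrative. Under that assumption, the order of the class in $\mathbb{R}^2/\mathbb{Z}^2$ (or $\mathbb{R}^4/\mathbb{Z}^4$) is the least common multiple of the reduced denominators.

```python
from fractions import Fraction
from math import gcd


def torsion_order(coords):
    """Order of the class of a rational vector modulo the integer lattice.

    Hypothetical helper: `coords` are exact rationals (lattice coordinates).
    The order is the least common multiple of the reduced denominators.
    """
    order = 1
    for c in coords:
        d = Fraction(c).denominator
        order = order * d // gcd(order, d)
    return order


# Illustrative values only: P of order m = 4 and Q of order n = 6.
P = (Fraction(1, 4), Fraction(0))
Q = (Fraction(0), Fraction(5, 6))

# Common denominator of the four-tuple (x, y, u, v); it divides m * n.
joint = torsion_order(P + Q)
```

In this toy case the common denominator is the least common multiple of the two orders, which in particular divides $mn$, as in the argument above.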

To conclude it now suffices to compare the said estimates (as in Step (vi) of the previous section): we have $\ll (mn)^{\epsilon}$ rational points because of Pila's upper bounds and $\gg \sqrt{m}$ by David's (or Masser's) lower bounds; since $m \ge n$ we obtain a bound for $m$ on choosing, e.g., $\epsilon = 1/8$. As a consequence, the number of unlikely values for $\lambda$ is finite.

We shall now give more detail for the various steps.

• Period lattices. A varying period lattice $L_\lambda$ corresponding to $E_\lambda$, for complex $\lambda \ne 0, 1$, may be generated by hypergeometric function values: $L_\lambda = \mathbb{Z} f(\lambda) + \mathbb{Z} g(\lambda)$, with $f, g$ certain (explicit) analytic series, for instance, in the domain

\[ \Lambda = \{\lambda \in \mathbb{C} : |\lambda| < 1,\ |1 - \lambda| < 1\}. \]


More precisely, we consider special choices of the hypergeometric function $F(a, b, c; \lambda)$, i.e.,

\[ F(\tfrac{1}{2}, \tfrac{1}{2}, 1; \lambda) := \sum_{m=0}^{\infty} \frac{(2m)!^2}{2^{4m}\, m!^4}\, \lambda^m, \]

and we define

\[ f(\lambda) = \pi F(\tfrac{1}{2}, \tfrac{1}{2}, 1; \lambda), \qquad g(\lambda) := \pi i F(\tfrac{1}{2}, \tfrac{1}{2}, 1; 1 - \lambda). \tag{3.3.4} \]

These functions, or more precisely any analytic continuation of them, are known to be generators for a lattice of periods for $E_\lambda$, for each $\lambda \in \mathbb{C} \setminus \{0, 1\}$ (see, for instance, [Hus87], p. 187). Further, it is known that they may be analytically continued in the domain $\mathbb{C} \setminus \{0, 1\}$ with a monodromy group; this group has generators sending $(f, g)$ either into the pair $(f, 2f + g)$ or into $(f - 2g, g)$, corresponding to loops around $0$ and $1$, respectively.$^{17}$

• Elliptic logarithms of $P_\lambda, Q_\lambda$. The points $P_\lambda, Q_\lambda$ have elliptic logarithms given by determinations $z(\lambda), w(\lambda)$, also locally analytic on $\Lambda$, and capable of analytic continuation with a certain monodromy; more precisely, we may set

\[ z(\lambda) = \int_2^{\infty} \frac{dX}{\sqrt{X(X - 1)(X - \lambda)}}, \qquad w(\lambda) = \int_3^{\infty} \frac{dX}{\sqrt{X(X - 1)(X - \lambda)}}. \tag{3.3.5} \]

As to the monodromy, if a closed curve does not encircle 2 or 3, both $z, w$ are left unchanged. Instead, going, e.g., around a small closed circle whose interior contains 2 but not 3 will leave $w$ unchanged but will change the sign of $z$. (This can be seen by multiplying by $\sqrt{2 - \lambda}$ and noting that

\[ \sqrt{\frac{X - \lambda}{2 - \lambda}} = \sqrt{1 + \frac{X - 2}{2 - \lambda}} \]

has a single determination for, say, $|2 - \lambda| = 1/2$ and $|X - 2| < 1/2$.)
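As a numerical illustration of (3.3.4), the following sketch sums the hypergeometric series directly inside the domain $\Lambda$. The function names `hyp_half_half_one` and `periods` and the truncation parameter `terms` are our own choices, not notation from the text; the sketch makes no attempt at analytic continuation outside $\Lambda$.

```python
import math


def hyp_half_half_one(lam, terms=200):
    """Partial sum of F(1/2, 1/2, 1; lam) = sum_m (2m)!^2 / (2^(4m) m!^4) lam^m.

    Intended for |lam| < 1; the number of terms is an illustrative choice.
    """
    total = 0.0
    coeff = 1.0   # coefficient of lam^m, starting with m = 0
    power = 1.0   # lam^m
    for m in range(terms):
        total += coeff * power
        # Ratio of consecutive coefficients is ((2m + 1) / (2m + 2))^2.
        coeff *= ((2 * m + 1) / (2 * m + 2)) ** 2
        power *= lam
    return total


def periods(lam):
    """Generators (f, g) of the period lattice as in (3.3.4), for lam in Lambda."""
    if not (abs(lam) < 1 and abs(1 - lam) < 1):
        raise ValueError("lam must satisfy |lam| < 1 and |1 - lam| < 1")
    f = math.pi * hyp_half_half_one(lam)
    g = math.pi * 1j * hyp_half_half_one(1 - lam)
    return f, g
```

At $\lambda = 1/2$ the two series coincide, so `periods(0.5)` returns a pair with $g = i f$: the lattice is a square lattice, in agreement with the fact that $E_{1/2}$ has complex multiplication by $i$.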

These elliptic logarithms are relative to the said uniformization $\beta_\lambda : \mathbb{C}/L_\lambda \to E_\lambda$. In turn, this uniformization is obtained in terms of Weierstrass ℘-functions. The Weierstrass form $\tilde{E} = \tilde{E}_\lambda$ of $E_\lambda$ is given by

\[ \tilde{Y}^2 = 4\tilde{X}^3 - g_2 \tilde{X} - g_3, \]

with

\[ g_2 = \frac{4}{3}(\lambda^2 - \lambda + 1), \qquad g_3 = \frac{4}{27}(\lambda - 2)(\lambda + 1)(2\lambda - 1). \]

The isomorphism $\phi$ from $E_\lambda$ to $\tilde{E}$ is given by

\[ \tilde{X} = X - \frac{1}{3}(\lambda + 1), \qquad \tilde{Y} = 2Y. \]

By general theory (see, e.g., [Lan73]) there is a corresponding Weierstrass elliptic function $\wp = \wp(u) = \wp_\lambda(u)$. For any $\lambda \in \Lambda$ we get, after substituting $X = \wp(u) + \frac{1}{3}(\lambda + 1)$, the relations

\[ \wp_\lambda(z(\lambda)) = 2 - \frac{1}{3}(\lambda + 1), \qquad \wp'_\lambda(z(\lambda)) = 2\sqrt{2(2 - \lambda)}, \]

with similar equations at $w(\lambda)$. This is the elliptic logarithm property alluded to above.

17 Although this is classical, it is not so easy to locate a proof in the literature; see the paper [MZ10d], Appendix, for a proof. Another proof comes, for instance, on looking at the differential equation of order 2 satisfied by $f(\lambda)$, namely $\lambda(1 - \lambda)y'' + (1 - 2\lambda)y' - \frac{1}{4}y = 0$. One observes that locally around 0 it has two linearly independent solutions, which are, respectively, $f(\lambda)$ and a function of the shape $\log \lambda \cdot f(\lambda) + \sigma(\lambda)$, where $\sigma(\lambda)$ is a certain power series; this yields the monodromy around 0.
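The change of variables from the Legendre model $Y^2 = X(X-1)(X-\lambda)$ to the Weierstrass form can be checked with exact rational arithmetic. The sketch below is only a sanity check at sample points (the function names are ours); since both sides are polynomials of low degree in $X$ and $\lambda$, agreement on a sufficiently large grid is already convincing, but the code does not claim to be a symbolic proof.

```python
from fractions import Fraction


def g2(lam):
    return Fraction(4, 3) * (lam * lam - lam + 1)


def g3(lam):
    return Fraction(4, 27) * (lam - 2) * (lam + 1) * (2 * lam - 1)


def legendre_rhs(X, lam):
    """Right-hand side of the Legendre equation Y^2 = X (X - 1) (X - lam)."""
    return X * (X - 1) * (X - lam)


def weierstrass_rhs(Xt, lam):
    """Right-hand side of Ytilde^2 = 4 Xtilde^3 - g2 Xtilde - g3."""
    return 4 * Xt ** 3 - g2(lam) * Xt - g3(lam)


def transforms_correctly(X, lam):
    """Check Ytilde^2 = 4 Y^2 under Xtilde = X - (lam + 1)/3, Ytilde = 2Y."""
    Xt = X - Fraction(1, 3) * (lam + 1)
    return 4 * legendre_rhs(X, lam) == weierstrass_rhs(Xt, lam)
```

Evaluating at $X = 2$ recovers the relation for $\wp'_\lambda(z(\lambda))$: the Weierstrass right-hand side equals $8(2 - \lambda) = \bigl(2\sqrt{2(2-\lambda)}\bigr)^2$.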


• Real coordinates for points in $E_\lambda$. We may now express $z, w$ in the basis provided by $f, g$ for $\mathbb{C}$ over $\mathbb{R}$, obtaining real coordinates $x, y, u, v$. They need not lie in $[0, 1)$, but actually this shall be immaterial for our purposes: they are uniquely determined modulo $\mathbb{Z}$ and we need merely to take into account the denominators of their rational values. They satisfy the equations

\[ z = xf + yg, \qquad w = uf + vg. \tag{3.3.6} \]

We define a real-analytic function $\theta : \Lambda \to \mathbb{R}^4$ by

\[ \theta = (x, y, u, v). \]

We may in fact express $x, y, u, v$ in terms of $f, g, z, w$ and their complex conjugates: for instance,

\[ x(\lambda) = \frac{z(\lambda)\overline{g(\lambda)} - \overline{z(\lambda)}\, g(\lambda)}{\Delta(\lambda)}, \]

where $\Delta(\lambda) = f(\lambda)\overline{g(\lambda)} - \overline{f(\lambda)}\, g(\lambda)$; here $|\Delta(\lambda)|$ is twice the determinant of a period lattice for $E_\lambda$ (and so does not vanish for $\lambda \ne 0, 1$).

We shall have to estimate the number of rational points in the image of $\theta$, which correspond to the torsion points that we are considering. However, presently $\theta$ is defined only on $\Lambda$, so first we need to analytically continue this definition. Then we shall have to work in fixed disks where there is a well-defined continuation.

• Continuation to a "large" domain. The above functions are defined only on $\Lambda$ but we can extend them as follows. First we define a "large" domain $\Lambda_B$ as the set of $\lambda \in \mathbb{C}$ such that $B \ge |\lambda| \ge 1/B$, $|\lambda - 1| \ge 1/B$, $|\lambda - 2| \ge 1/B$, $|\lambda - 3| \ge 1/B$, where $B$ is a fixed large number. (We want $\lambda$ to stay away from $0, 1, \infty$, but also from $2, 3$, to avoid any problems with $z, w$.) For a point $p \in \Lambda_B$ we now fix a small closed disk $D_p$ around $p$ and continue analytically our functions $f, g, z, w$ to a neighborhood of $D_p$ (e.g., through a path from $\Lambda$), obtaining analytic functions $f_p, g_p, z_p, w_p$ on a neighborhood of $D_p$. Similarly to the above, we may define real functions $x_p, y_p, u_p, v_p$ on $D_p$ and a function $\theta_p := (x_p, y_p, u_p, v_p) : D_p \to \mathbb{R}^4$. By compactness we can cover $\Lambda_B$ with finitely many such disks. (We shall work separately in each disk, so it does not matter if the disks overlap, and if the continuations of our functions do not agree on the intersections.)
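The real coordinates in (3.3.6) are obtained by solving a $2 \times 2$ real-linear system, which the following sketch carries out by Cramer's rule, using $z$ together with its complex conjugate. The function name `real_coordinates` and the sample lattice basis are hypothetical; in the text, $f, g$ would be the period functions and $z$ an elliptic logarithm.

```python
def real_coordinates(z, f, g):
    """Solve z = x*f + y*g for real x, y, given complex f, g independent over R.

    Uses z = x f + y g together with conj(z) = x conj(f) + y conj(g).
    """
    z, f, g = complex(z), complex(f), complex(g)
    delta = f * g.conjugate() - f.conjugate() * g
    if delta == 0:
        raise ValueError("f and g are linearly dependent over the reals")
    x = (z * g.conjugate() - z.conjugate() * g) / delta
    y = (f * z.conjugate() - f.conjugate() * z) / delta
    # Both quotients are real in exact arithmetic; drop rounding noise.
    return x.real, y.real


# Illustrative lattice basis and point (not taken from the text).
f0, g0 = 3.0, 1 + 2j
z0 = 0.25 * f0 + 0.5 * g0
x0, y0 = real_coordinates(z0, f0, g0)
```

A point is torsion precisely when the pair returned here is rational, which is why the argument reduces to counting rational points in the image of $\theta$.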

• Restricting the domain. Now, let us assume as before that $\lambda_0$ is a doubly torsion value, producing torsion orders $m, n$ for $P_{\lambda_0}, Q_{\lambda_0}$. Let us set $d(\lambda_0) := [\mathbb{Q}(\lambda_0) : \mathbb{Q}]$. Suppose that, for some $\delta > 0$, at least $\delta d(\lambda_0)$ conjugates of $\lambda_0$ have (complex) absolute value $> B$; then the height $h(\lambda_0)$ would be $> \delta \log B$. Similarly, if at least $\delta d(\lambda_0)$ conjugates lie in the disk $|\lambda - 1| < 1/B$, then $h(1/(\lambda_0 - 1)) > \delta \log B$, whence $h(\lambda_0) > \delta \log B - \log 2$. In this way we deduce from boundedness of the height (Proposition 3.2) that the number of conjugates of $\lambda_0$ lying outside $\Lambda_B$ is $\ll d(\lambda_0)/\log B$, where the implied constant does not depend on $B$. Hence, for large enough $B$ there are at least $d(\lambda_0)/2$, say, conjugates of $\lambda_0$ lying within $\Lambda_B$. We also deduce that at least a fixed positive proportion $\eta d(\lambda_0)$ (fixed $\eta > 0$) of conjugates will fall in a fixed disk $D_p$. (Note that this disk may depend on $\lambda_0$, but it has only finitely many possibilities.) Note that this conclusion is significant because for our purposes we may assume that $\lambda_0$ has large degree $d(\lambda_0)$: in fact, the height $h(\lambda_0)$ is bounded (in view of Proposition 3.2), and if the degree also remained bounded we would get finiteness by Northcott's theorem.
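The counting of conjugates can be illustrated with the standard formula for the absolute logarithmic Weil height in terms of the minimal polynomial, $h(\alpha) = \frac{1}{d}\bigl(\log|a_0| + \sum_i \log\max(1, |\alpha_i|)\bigr)$, where $a_0$ is the leading coefficient and the $\alpha_i$ are the conjugates. The helper names and the sample algebraic number below are our own; the sketch only checks the elementary inequality that the number of conjugates of absolute value $> B$ is at most $d\,h(\alpha)/\log B$.

```python
import math


def weil_height(leading_coeff, conjugates):
    """Absolute logarithmic Weil height from minimal-polynomial data.

    `leading_coeff` is the leading coefficient of the primitive integer
    minimal polynomial; `conjugates` lists all its complex roots.
    """
    d = len(conjugates)
    total = math.log(abs(leading_coeff))
    total += sum(math.log(max(1.0, abs(a))) for a in conjugates)
    return total / d


def conjugates_beyond(conjugates, B):
    """Number of conjugates of absolute value greater than B."""
    return sum(1 for a in conjugates if abs(a) > B)


# Illustrative example: roots of x^2 - 10 x + 1 (irreducible over Q).
roots = [5 + 2 * math.sqrt(6), 5 - 2 * math.sqrt(6)]
h = weil_height(1, roots)
B = 5.0
count = conjugates_beyond(roots, B)
```

With a bound on $h$ fixed in advance, the quantity `count / len(roots)` is forced to be small once $B$ is large, which is the mechanism used to keep most conjugates inside $\Lambda_B$.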

• A subanalytic surface and its "algebraic part." We now fix a disk $D_p$ among the finitely many ones constructed as above, chosen so as to contain a maximal number of conjugates of $\lambda_0$. We set $S := S_p = \theta_p(D_p)$.


This is a real surface parametrized by $\theta_p = \theta_p(\lambda) = (x_p, y_p, u_p, v_p)$, where $x_p, y_p, u_p, v_p$ are real functions of $\lambda \in D_p$; note that $\theta_p$ is actually real analytic in a neighborhood of $D_p$. In order to apply a certain theorem of Pila, we first need to prove that $\theta_p(D_p)$ is subanalytic.$^{18}$ In fact, by definition $\theta_p(D_p)$ is in particular the projection of a subanalytic set, namely the graph of $\theta_p$ on $D_p$. (Note that $\theta_p$ is analytic on an open disk $D'$ containing $D_p$, so the said graph is the set-theoretic difference of two analytic manifolds, i.e., the graphs on $D'$ and $D' \setminus D_p$, and is thus subanalytic.) So by [Shi97], p. viii, it is subanalytic provided the restriction of the projection map to the closure of the graph (of $\theta_p$ on $D_p$) is proper. But the graph is closed and bounded, so it is compact, and any continuous map on a compact Hausdorff space is proper.

As in the above sketch, the said conjugates of $\lambda_0$ in $D_p$ produce, via $\theta_p$, rational points in $S$ with denominator dividing $mn$. We now proceed, as in Step (iv) of Section 3.3, to deduce an upper bound for the number of such rational points, where we shall use the estimates of Pila alluded to above. In order to apply such results, we shall also need to study the so-called algebraic part of $S$, which we now define:

Definition: We define $S^{alg}$ to be the union of connected semialgebraic sets of positive dimension contained in $S$. We call $S^{alg}$ the algebraic part of $S$. Accordingly, we denote by $S^{trans}$ the complement of $S^{alg}$ in $S$, calling it the transcendental part of $S$.$^{19}$

For convenience we recall that a semialgebraic set in $\mathbb{R}^h$ is a finite union of sets of the form $\{x \in \mathbb{R}^h : f_1(x) = \ldots = f_k(x) = 0,\ g_1(x) > 0, \ldots, g_\ell(x) > 0\}$, where $f_i, g_j \in \mathbb{R}[x]$, $x = (x_1, \ldots, x_h)$; see, e.g., [BM88], Def. 1.1, or [Shi97], p. 51. It has a dimension: see [BM88], p. 14.

We note that indeed the existence of any such semialgebraic set of positive dimension in $S$ could be an obstruction to the estimates in question (think especially of the case of an arc of a rational curve). But, as in Step (v) of Section 3.1, we proceed to prove that this set is actually empty in our case, namely:

Lemma 3.4. The above-defined real surface $S$ does not contain any semialgebraic arc of positive dimension, i.e., $S^{alg}$ is empty and $S^{trans} = S$.

Proof. If $S$ contained a semialgebraic curve, it would contain a Zariski-dense subset $C_0$ of a real algebraic curve $\tilde{C}$, and in particular we could write $C_0 = \theta_p(E)$ for some infinite subset $E \subset D_p$. Because of the relations $z = xf + yg$ and $w = uf + vg$, the homogeneous transcendence degree of $f_p, g_p, z_p, w_p$ over $L := \mathbb{C}(x_p, y_p, u_p, v_p)$ is at most 2. But now, if we restrict all of these functions to $E$, the transcendence degree of $L$ is at most 1. (By this we mean that the restriction to $E$ of any two among $x_p, y_p, u_p, v_p$ satisfies a fixed, nontrivial polynomial relation with complex coefficients.) Thus on $E$, the homogeneous transcendence degree of $f_p, g_p, z_p, w_p$ is at most 3 and so there is a homogeneous algebraic relation between these functions restricted to $E$; but by analytic continuation this relation would remain true on $D_p$. And by analytic continuation we could also continue this relation back to a relation among $f, g, z, w$ valid on $\Lambda$.

To rule out the existence of such a relation we can use monodromy: by continuation along a curve encircling 1 but not 0 we have that $(f, g)$ goes to $(f - 2g, g)$, and, going similarly around 0, it goes to $(f, 2f + g)$. If these paths do not touch $[2, \infty)$, then $z, w$ are left unchanged. These two linear transformations generate a group which is easily shown to be Zariski-dense in $SL_2$, and hence the homogeneous relation alluded to would hold on replacing $f, g$ with any $SL_2$ transform. But then our relation could involve only $z, w$. This would in turn imply $z = cw$ for some constant $c$. This is disproved, e.g., by continuing up to some real $\lambda < 2$ and then using monodromy about 2, which changes the sign of $z$ but leaves $w$ fixed.

18 We recall that by definition a set $Z \subset \mathbb{R}^n$ is subanalytic if it is a finite union of sets of the shape $\mathrm{Im} f_1 \setminus \mathrm{Im} f_2$, where $f_1, f_2 : M \to \mathbb{R}^n$ are proper real analytic maps on an analytic manifold $M$. See [Shi97] or also [BM88], Prop. 3.13. A useful description is that a closed subanalytic set of a real analytic manifold $M$ may be written as the image $X = \psi(N)$ of a real analytic manifold $N$ with $\dim N = \dim X$, where $\psi : N \to M$ is real analytic and proper; see [BM88], Sec. 0.1.

19 The word "connected" avoids trivialities: e.g., if $S$ contains a semialgebraic arc $C$, then $C \cup \{x\}$ is positive dimensional semialgebraic for all points $x \in S$, so if we omitted "connected" in the definition, $S^{trans}$ would often turn out to be empty in a trivial way.

Remark 3.3.1 The matter of proving $S^{trans} = S$ here has been relatively simple, similarly to the above-sketched application of this method to a proof of Raynaud's theorem for curves. However, note that already in the present case of an elliptic curve that varies in a family we have found it necessary to prove a specific algebraic independence result. Actually, as already noted (e.g., in Step (v) of Section 3.1, concerning the proof in [PZ08] of the full Raynaud theorem by this method), this step of geometric nature, of describing $S^{trans}$, which might perhaps appear straightforward compared to the arithmetical issues, on the contrary often represents a serious difficulty in further applications of the method. It is in this step that the assumption that the variety $X$ is nonspecial is exploited. We also remark that in the present case an alternative argument for this point comes from differential Galois theory, especially from Theorem 5 of [Ber89]; this theorem is also sufficient for a more general form of Masser's questions; see the discussions in the next section.
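The two monodromy substitutions used in the proof of Lemma 3.4 can be written as integer matrices acting on the column vector $(f, g)^T$. The sketch below (matrix conventions and helper names are ours) merely checks some elementary properties: both matrices have determinant 1, they do not commute, and powers of a generator give infinitely many distinct unipotent elements. It does not prove Zariski density in $SL_2$; it only exhibits the ingredients that make that claim plausible.

```python
def matmul(a, b):
    """Product of two 2x2 integer matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def power(a, k):
    """k-th power of a 2x2 matrix for an integer k >= 0."""
    result = [[1, 0], [0, 1]]
    for _ in range(k):
        result = matmul(result, a)
    return result


# Action on the column vector (f, g)^T.
AROUND_0 = [[1, 0], [2, 1]]    # (f, g) -> (f, 2f + g)
AROUND_1 = [[1, -2], [0, 1]]   # (f, g) -> (f - 2g, g)

product_01 = matmul(AROUND_0, AROUND_1)
product_10 = matmul(AROUND_1, AROUND_0)
```

Both generators are congruent to the identity modulo 2, so the group they generate sits inside the principal congruence subgroup of level 2 of $SL_2(\mathbb{Z})$.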

• Upper estimates for rational points on $S$. We can now apply the results of Pila alluded to above. In this setting the following Theorem 1.3 of [Pil04] suffices:

Theorem 3.5. (Pila [Pil04], Theorem 1.3.) Let $R \subset \mathbb{R}^n$ be a compact subanalytic surface and let $\epsilon > 0$. There is a constant $c(R, \epsilon)$ such that the number $\Omega(N)$ of rational points on $R^{trans}$ of denominator dividing $N$ satisfies $\Omega(N) \le c(R, \epsilon) N^{\epsilon}$ for all large $N$.

From this we immediately derive the following:

Lemma 3.6. For any $\epsilon > 0$ there is $c = c(S, \epsilon)$ such that for each $N > 0$ there are $\le c N^{\epsilon}$ rational points of $S$ in $\frac{1}{N}\mathbb{Z}^4$.

Proof. We have just shown that the set $S^{trans}$ is the whole of $S$. Also, $S$ is compact subanalytic. Then we may apply the previous theorem to obtain the conclusion directly.

Remark 3.3.2 In Theorem 1.1 of the paper [Pil05], Pila obtains the same kind of estimates, but for the total number of rational points with denominator at most $N$ (and a similar estimate was achieved in any dimension by Pila and Wilkie in [PW06]). Note that a straightforward application of the above theorem would give merely an exponent $1 + \epsilon$ rather than $\epsilon$ for this a priori larger set of points. For the present purposes, this kind of sharpening is immaterial (because the involved denominators divide $mn$), but it becomes important for other applications, especially the recent ones toward the André-Oort conjecture, discussed in the next chapter. In such a case the relevant points shall have bounded degree over $\mathbb{Q}$, and the notion of denominator dividing $N$ would not even have any useful analogue. As already recalled, in Appendix A below we shall present a sketch of a proof of Theorem 3.5.

• Conclusion of the proof. The conclusion is as in the summary above, but with a last small point (not mentioned there). Let $\lambda_0$ be a complex number such that $P_{\lambda_0}$ (resp. $Q_{\lambda_0}$) is a torsion point of order $m$ (resp. $n$). As remarked above, by the bounds provided by results of Masser and David, i.e., inequality (3.3.3), there are $\gg \max(m, n)^{1/2}$ conjugates $\lambda_0^{\sigma}$ of $\lambda_0$ such that $\lambda_0^{\sigma} \in D_p$, so that $\theta_p(\lambda_0^{\sigma})$ lies in $S = \theta_p(D_p)$. However, a priori $\theta_p$ could take the same value at many of these conjugates, and then we could have only few rational points on $S$. But, a posteriori, it is easy to prove that a given value can be attained at most a bounded number of times, the bound being independent of the value. Note that if $(\xi_1, \xi_2, \xi_3, \xi_4) = \theta_p(\lambda) \in \theta_p(D_p)$ is a given value of $\theta_p$, then $z_p(\lambda) = \xi_1 f_p(\lambda) + \xi_2 g_p(\lambda)$. Hence, the sought uniform boundedness is implied by the following:


Lemma 3.7. The number of values of $\lambda \in D_p$ such that $z_p(\lambda) - \xi_1 f_p(\lambda) - \xi_2 g_p(\lambda) = 0$ is bounded independently of $\xi_1, \xi_2 \in \mathbb{C}$.

Proof. This can be proved (even in the more general context of any number of linearly independent functions, analytic in a neighborhood of a compact region) by an elementary argument of complex analysis which we only sketch: suppose that there is a sequence $\rho_l := (\xi_1^{(l)}, \xi_2^{(l)}) \in \mathbb{C}^2$, $l = 1, 2, \ldots$, such that the corresponding function has at least $l$ zeros in $D_p$. Then, on dividing by $\max(1, |\xi_1^{(l)}|, |\xi_2^{(l)}|)$ we obtain a function $\alpha_l z_p + \beta_l f_p + \gamma_l g_p$ having at least $l$ zeros in $D_p$ and such that $\max(|\alpha_l|, |\beta_l|, |\gamma_l|) = 1$. We may also assume that $(\alpha_l, \beta_l, \gamma_l)$ converges to a nonzero point $(\alpha, \beta, \gamma) \in \mathbb{C}^3$. But then a simple use of Rouché's theorem shows that $\alpha z_p + \beta f_p + \gamma g_p$ has infinitely many zeros in $D_p$ and thus must vanish. This is, however, impossible, as can be seen by monodromy as in the proof of Lemma 3.4. For more details, see [MZ10b], Lemma 7.1.

Remark 3.3.3 An effective version of the lemma. A method to achieve an effective bound in this lemma is as in the following sketch, which also provides an alternative proof. We want a uniform upper bound for the number of zeros of a linear combination $F := a_1 f_1 + a_2 f_2 + a_3 f_3$, where the $f_i$ are effectively given linearly independent functions, analytic in some connected open set $\Omega$, and where the $a_i$ are any complex numbers. For simplicity we assume that $\Omega$ is a disk. We assume at least that we can obtain effective bounds for the maximum modulus of the $f_i$ and of given derivatives of them, in given circles. As in the above proof, we can suppose $\max |a_i| = 1$. It is rather standard now to show that if $F$ has many zeros in $|z| \le 2R$, then $|F^{(i)}(u)|$ must be small in $|u| \le R$, and effectively, for $i = 0, 1, 2$, say. (One applies Cauchy's integral formula to $|z| = 2R$ and the function $F(z)/\prod_j (z - z_j)$, where the $z_j$ are the said zeros.) So we are reduced to finding an effective lower bound for $\max_{i=0,1,2} \max_{|u| \le R} |F^{(i)}(u)|$; this shall deliver an upper bound for the number of zeros. But now three inequalities $|F^{(i)}(u)| \le \epsilon$, for $i = 0, 1, 2$, and the assumption on the $a_i$, yield an (effective) estimate $|W(u)| \le M\epsilon$, where $W$ is the Wronskian determinant of the $f_i$ and where $M$ is effectively computable in terms of the $f_i$ and $R$. Thus we are reduced to bounding $\max |W(u)|$ over $|u| \le R$ effectively from below. The advantage is that now the $a_i$ have disappeared. (Note that this maximum is certainly positive, for the vanishing of the Wronskian would imply linear dependence of the functions.) An effective lower bound may be given if we have an effective knowledge, e.g., of a few Taylor coefficients at some point, and a bound for all Taylor coefficients.

Finally, by this lemma and by the estimate (3.3.3), the said conjugates produce $\gg \max(m, n)^{1/2}$ rational points on $S$, with denominator dividing $mn$; we compare this lower bound with the upper bound of Lemma 3.6, with $N := mn$ and, e.g., $\epsilon = 1/5$, to obtain boundedness of $mn$ and the required finiteness of rational points.

Remark 3.3.4 An alternative approach. Shou-Wu Zhang has pointed out a result of his which should simplify the proof of this finiteness result (i.e., Theorem 3.3); we say only a few words about this and then sketch very briefly a couple of instances of how this result could be used for the present purposes. By means of the techniques introduced by Zhang in his thesis and reproduced in part in his paper [Zha92], it is possible to prove in particular the following statement (and other similar and more general ones).

Let $A$ be an abelian scheme over a curve $B$ defined over $\overline{\mathbb{Q}}$. Let $s_1, s_2 : B \to A$ be sections with nonzero canonical heights $h_1, h_2$, say. Suppose also that for an infinity of $b \in B(\overline{\mathbb{Q}})$ we have $\hat{h}_b(s_1(b)) = \hat{h}_b(s_2(b)) = 0$$^{20}$ (where $\hat{h}_b(\cdot)$ denotes now canonical height in $A_b$). Then for all $b \in B(\overline{\mathbb{Q}})$ we have$^{21}$

\[ \frac{\hat{h}_b(s_1(b))}{h_1} = \frac{\hat{h}_b(s_2(b))}{h_2}. \]

Let us see what this means in our context on taking $s_1, s_2$ to be the sections corresponding to $P_\lambda, Q_\lambda$ (for the Legendre scheme over a base curve $B$ with function field $\overline{\mathbb{Q}}(\lambda, P_\lambda, Q_\lambda)$). Now the assumption that an infinity of $\lambda_0 \in \mathbb{C}$ yield torsion for both points would then lead to the displayed equation, which in turn would imply that

(∗) Every $\lambda_0$ which yields torsion for $P_\lambda$ yields automatically torsion for $Q_\lambda$, and conversely.

Now, this is a highly important piece of information, which sometimes enables one to conclude in an easier way compared to the full above arguments. For instance, if the two points are given explicitly as above, we could specialize at some suitable $\lambda_0$ for which we can check that $P_{\lambda_0}$ is torsion but $Q_{\lambda_0}$ is not, obtaining a contradiction. (So, this would work for the case of the present section but not for the general case of two points, i.e., for the results of the paper [MZ10d] discussed shortly in 3.4.6.)

Another, less narrow, way to exploit this gain is as follows, where we refer to the notation used previously in this section. The claim (∗) above would imply that the function pair $(x, y)$ (e.g., on some open disk $D \subset \Lambda$) takes rational values precisely at the same points $\lambda \in D$ where $(u, v)$ also takes rational values. We also note that the denominators of these rational points would be related: if the rational point $(x(\lambda_0), y(\lambda_0))$ has denominator $n$, then $P_{\lambda_0}$ has order $n$ and so $\deg(\lambda_0) \ll n^2$. And if $(u(\lambda_0), v(\lambda_0))$ has exact denominator $m$, then David's inequality (3.3.2) (together with the boundedness of $h(\lambda_0)$) entails $m \ll \deg(\lambda_0)^2$. Hence we would have $m \ll n^4$ for all these rational points.

Now, this may be excluded by less demanding results than Pila's estimates for surfaces (used in Lemma 3.6 above). In fact, let $S$ be the surface parametrized by $(x, y, u, v)$ on $D$. In the first place, note that the image of $(x, y)$ contains an open disk $B \subset \mathbb{R}^2$. Hence, if we restrict to points of $S$ such that $(x, y)$ lies on a given segment of a rational line in $B$, we would cut $S$ in a curve, whose components would be transcendental by Lemma 3.4. By the above estimate, each component would have $\gg T^{1/4}$ rational points of denominator $\le T$, and we could then use the Bombieri-Pila estimates for rational points on transcendental curves (as in [BP89]), where actually any specific $\epsilon < 1/4$ would lead to a contradiction. (See Appendix A below: the analogue of Proposition A.2 for curves in place of surfaces would suffice.)

Further, we note that the above statement of Zhang would provide the same kind of information also for some of the generalizations to be soon discussed in Section 3.4 below, and the sketched method would again allow one to replace Pila's results with simpler tools. However, no information at all would be obtained, e.g., in the case (considered in Subsection 3.4.5) when the elliptic scheme $L \times_{\mathbb{P}^1 \setminus \{0, 1, \infty\}} L$ is replaced by an abelian surface scheme with a trivial endomorphism ring.

20 More generally, it would suffice that $s_1(b), s_2(b)$ have sufficiently small height.

21 We have assumed $B$ to be a curve for simplicity, but Zhang's method allows also higher-dimensional $B$, in the case when the ratio $[h_1(C) : h_2(C)]$ of geometric canonical heights of curves $C$ in $B$ is independent of $C$ if they are nonzero. Then, denoting the ratio by $[h_1 : h_2]$, and provided the "infinity of $b$" in the above statement is changed to "all $b$ in a set which is Zariski dense in $B$", we may still conclude that $\hat{h}(s_1(b)) / \hat{h}(s_2(b)) = [h_1 : h_2]$ for all $b \in B(\overline{\mathbb{Q}})$.

3.4 Related problems, conjectures, and developments

3.4.1 Pink's and related conjectures

As previously noted, the above-considered question of Masser in fact fits in other independent formulations. For instance:

(A) There is a related conjecture already put forward by Shou-Wu Zhang in 1998, seemingly the first of this type: see Section 4 of [Zha98b], Conjecture and, especially, Examples 4(a), 4(b). Zhang defines a determinantal-regulator height $h_\Lambda(x) := \det(\langle t_i(x), t_j(x) \rangle)$ associated to a finitely generated torsion-free subgroup $\Lambda = \oplus \mathbb{Z} t_i$ of sections of a family $A \to C$ of abelian varieties (over a curve $C$). He assumes that the family has simple fibers of dimension $\ge 2$ and looks not merely at torsion points but at points $x \in C$ of small determinantal $\Lambda$-height, in the spirit of the Bogomolov conjecture: for small enough $\epsilon > 0$, he predicts finiteness for the points with $h_\Lambda(x) < \epsilon$. This conjecture of Zhang and the said examples are closely related to some of the problems considered in this chapter: for instance, under the said assumptions, Example 4(a) predicts finiteness of torsion values $t(x)$ for a nontorsion section $t : C \to A$ (which especially fits in Subsection 3.4.5 below).

(B) Then, independently, issues similar to Masser's were stated in more general form by Pink in 2005, e.g., in [Pin05b], with a Conjecture 6.2 of which Theorem 3.3 proves a special case (as we shall verify). Let us state a version of this general conjecture (where we use "abelian subgroup scheme" in place of "abelian subscheme" to allow nonconnected subschemes).

Pink's conjecture. ([Pin05b].) Let $S$ be an abelian scheme over a variety $B$ defined over $\mathbb{C}$, and denote by $S^{[c]}$ the union of abelian subgroup schemes of codimension at least $c$. Let $V$ be an irreducible closed subvariety of $S$. Then $V \cap S^{[1 + \dim V]}$ is contained in a finite union of abelian subgroup schemes of $S$ of positive codimension.

Before discussing this, we immediately remark that Pink's original formulation concerned, more generally, semiabelian schemes. Now, very recently D. Bertrand found a counterexample to this, for a suitable nonsplit extension of a CM elliptic constant family $E_0 \times B$ (over a curve $B$) by $\mathbb{G}_m$:$^{22}$ see Bertrand's paper [Ber11]. We shall very briefly illustrate this in Example 3.2 below. This situation is, however, rather special and does not at all affect the abelian case, which we continue to refer to as "Pink's conjecture."$^{23}$

Here are a few comments on Pink's conjecture:

(i) The conclusion concerns points in $V \cap S^{[1 + \dim V]}$, which are by definition unlikely intersections. So the conjecture fits completely in our context. In fact, one could define, in analogy with the toric case (see (1.2.1)), $V^{(d)} := \bigcup_{\mathrm{rel.\,dim}\, T \le d} (V \cap T)$, where the union is taken over all abelian subgroup schemes $T$ of relative (i.e., over the said variety $B$) dimension $\le d$. In this notation, we have $V \cap S^{[1 + \dim V]} = V^{(n - \dim V - 1)}$, where $S$ has relative dimension $n$.

(ii) Theorem 3.3 becomes indeed a case of this statement on taking (as in the first of the above Remarks to the Question) $S = L \times_B L$, where $L$ is the said Legendre elliptic surface, $B := \mathbb{P}^1 \setminus \{0, 1, \infty\}$, and $V$ is the curve described by $P_\xi \times Q_\xi$, $\xi \in B$. Let us verify this deduction. This $S$ is an abelian scheme over $B$ of dimension 3. The set $S^{[2]}$ consists of the union of subgroup schemes of dimension at most 1; each of them has the shape of a finite (nonempty) union of torsion subgroup schemes plus a finite union of subgroups of separate fibers. So $V \cap S^{[2]}$ projects under the map $\lambda : S \to B$ to a set containing the unlikely values $\lambda_0$ in question. Hence to recover Theorem 3.3 it suffices to show that $V \cap S^{[2]}$ is finite. Now, the conclusion of the conjecture predicts that $V \cap S^{[2]}$ is contained in a finite number of subgroup schemes of positive codimension. In turn, each of them is contained in a finite union of subgroup schemes either of the shape $mP + nQ = 0$ ($(P, Q) \in S$) or obtained as a union of the zero section plus a finite union of separate fibers $E_{\lambda_0}^2$. Finally, as easily shown in the above discussion on the Question, our irreducible curve $V$ is not contained in any of these last subgroup schemes of codimension 1 and so intersects each of them only in finitely many points. So indeed we get Theorem 3.3 from the conclusion of the conjecture, completing the verification.

(iii) If we take $S$ to be an abelian surface over a curve, the conjecture amounts to cases discussed in the following subsections (especially 3.4.4 and 3.4.5); some of them have been proved, some other ones are subjects of works in progress. As already noted, when $S$ is simple the conjecture is a special case of Zhang's conjecture in (A) above, stated in Section 4 of [Zha98b].

22 Here and in the sequel, when we deal with schemes over $B$, it shall be understood that $\mathbb{G}_m$ in fact means the constant scheme $\mathbb{G}_m \times B$, sometimes denoted $\mathbb{G}_{m/B}$.

23 Pink's version is anyway slightly different, but substantially equivalent to the above one. We also note that in [Pin05b] there appears another Conjecture 1.3, embracing also the André-Oort context, which seemingly should not be affected by the said counterexample but actually supported by it: see the discussion in Sec. 2 of [Ber11].


Let us now briefly discuss the case of semiabelian (nonabelian) schemes; this was contained in Pink's original conjecture, obtained from the above one by simply replacing "abelian" with "semiabelian" throughout. (Recall that a semiabelian scheme is an extension of an abelian one by a torus.)

(iv) As a first example, we can take the abelian part to be trivial, and $B$ = a point, so $S = \mathbb{G}_m^n$: now we recover the toric conjecture of Zilber discussed in Subsection 1.3.4 of Chapter 1.

(v) The simplest mixed and nonconstant case of Pink's conjecture occurs for the (split) family $S$ of products $\mathbb{G}_m \times E_\lambda$, for $\lambda \in B := \mathbb{P}^1 \setminus \{0, 1, \infty\}$ (where $\mathbb{G}_m$ denotes the constant scheme over $B$). In this case, if, for instance, we choose the point $P_\lambda$ as in Masser's problem above, and if we choose in $\mathbb{G}_m$ the point parametrized by $\lambda$, we obtain a curve $V$, described by $\lambda \times P_\lambda$, $\lambda \in \mathbb{P}^1 \setminus \{0, 1, \infty\}$; it is then easy to see that the conjecture may be restated as asserting that

There are only finitely many roots of unity $\lambda$ such that $P_\lambda$ is torsion.

(Note the similarity to a statement in Remark 1.1.5, where $E_\lambda$ is replaced by a constant curve.) This may be proved precisely by the same method as above, using an analogue of Lemma 3.4, which in turn can be obtained by related means; details are explicitly carried out in (the present) Section 4 of [BMPZ11]. (Actually, in some respect things are simpler now compared to Theorem 3.3: in fact, it is clearly sufficient to restrict $\lambda$ to the unit circle, hence we only need estimates for rational points on transcendental curves, rather than surfaces. As outlined in Remark 3.1.1, such bounds were obtained already in the paper [BP89]. Also, the bound for the height provided by Proposition 3.2 is automatic now.)

(vi) Further mixed cases arise by taking nonsplit extensions of the Legendre family Eλ by $\mathbb{G}_m$ (see [Ser88], VII, especially no. 16, p. 183, for the classification and construction of these semiabelian schemes). A study of Pink’s conjecture in this context is in progress, carried out in [BMPZ11]. However, if we take a nonsplit extension by $\mathbb{G}_m$ of a constant elliptic scheme E0 × B over a curve B,24 the conclusion of the conjecture may not be true; in fact, as noted above, Bertrand very recently produced a counterexample in [Ber11], on taking E0 to be a CM-elliptic curve (see 4.2 for a brief review of complex multiplication). Now come a few details about Bertrand’s subtle construction.

Example 3.2 On Bertrand’s counterexample. Here we shall outline very briefly Bertrand’s example of a semiabelian scheme G of relative dimension 2 over a base complex curve B, for which there exists a curve C ⊂ G, projecting onto B, not contained in any proper subgroup scheme of G, but containing infinitely many torsion points of G.25 This clearly disproves the conclusion of the conjecture for S = G.

Bertrand takes G as any nonsplit extension by $\mathbb{G}_m$ of a constant elliptic scheme E := E0 × B, where E0 is a complex CM-elliptic curve. In what follows let us take, for instance, E0 defined by $y^2 = x^3 - x$; this has CM by $i = \sqrt{-1}$: [i](x, y) = (−x, iy); also, we take B = E0 and let π : G → E be the natural map. Now, every divisor of degree 0 on an elliptic curve defines an extension of it by $\mathbb{G}_m$ (and conversely, as in [Ser88], VII.3.16); here, if b ∈ B = E0, we take the divisor (b) − (0) on E0 to define the fiber Gb above b. Bertrand now constructs what he calls a Ribet section26 β : B → G; this is such that the section p = π ◦ β : B → E satisfies p = ([i] − [ī]) ◦ id = 2[i] ◦ id. (We skip here the technical construction, which works also for endomorphisms other than [i].) With this in hand, he defines C := β(B). Note that here it is crucial that E0 has CM, for if i were replaced by an integer, we would have π ◦ β = 0 and C would be contained in a proper subgroup scheme. The fundamental fact now is that β has a lifting property: for every torsion point ξ ∈ B(C), its lift β(ξ) is automatically torsion on Gξ. The proof of the lifting property is also technical and shall not be reproduced here.

24 This is referred to as the “semi-constant case” in [BMPZ11], and is discussed in (the present) Sec. 6 therein.
25 By torsion points of G we mean those which are torsion in some fiber.
26 See [Ber11] for a motivation for this terminology, inspired by previous work of K. Ribet.

Chapter 3

Finally, we have that C is indeed not contained in any proper subgroup scheme (otherwise (b) − (0) would be identically torsion on E0), so by the lifting property the above conditions are satisfied.

A phrasing and proof different from Bertrand’s are due to Edixhoven and appear as the Appendix to [Ber11]; we say something about this, keeping our special choice and changing the language a bit. Roughly speaking, he interprets a fiber Gξ as the generalized Jacobian of the singular curve Xξ obtained by identifying the points ξ, −ξ of E (see [Ser88], Chs. IV, V). He defines a section β′ as follows: β′(ξ) is the class in $\mathrm{Pic}^0(X_\xi)$27 of $[i]^*((\xi) - (-\xi)) - [i]_*((\xi) - (-\xi))$. (We may also note that $[i]_* = [\bar{i}]^*$.) It may be easily checked that π ◦ β′(ξ) = 2(i − ī)ξ = 4iξ (similarly to the above, just with 4 in place of 2).

To prove the lifting property, take an odd integer n and let ξ ∈ E0 be a torsion point of order n, so there exists a function f on E such that div(f) = n((ξ) − (−ξ)). Then $\mathrm{div}(f \circ [i]) = [i]^*\,\mathrm{div}(f)$ and $\mathrm{div}(\mathrm{Norm}_{[i]}(f)) = [i]_*\,\mathrm{div}(f)$. Letting $g := (f \circ [i])/\mathrm{Norm}_{[i]}(f)$, we then have $\mathrm{div}(g) = n\big([i]^*((\xi) - (-\xi)) - [i]_*((\xi) - (-\xi))\big)$, and nβ′(ξ) is the element g(ξ)/g(−ξ) of C∗. Now, by Weil’s reciprocity law (see [Sil92], Ex. 2.11),28 we have

\[
(g(\xi)/g(-\xi))^n = g(\mathrm{div}(f)) = f(\mathrm{div}(g)) = f\big(\mathrm{div}(f \circ [i]) - \mathrm{div}(\mathrm{Norm}_{[i]}(f))\big) = \frac{f(\mathrm{div}(f \circ [i]))}{f(\mathrm{div}(\mathrm{Norm}_{[i]}(f)))} = \frac{(f \circ [i])(\mathrm{div}(f))}{f([i]_*\,\mathrm{div}(f))} = 1.
\]

Hence, β′(ξ) is torsion of exponent $n^2$, proving finally what is needed for the curve C′.

In the following subsections we shall discuss some extensions of Theorem 3.3, providing further cases of Pink’s conjecture, and we shall also allude to work in progress in related directions.

3.4.2 Extending Theorem 3.3 from $\overline{\mathbb{Q}}$ to C

A first possibility of extending Theorem 3.3 was proposed already by the presenter of the Comptes Rendus note [MZ08], who remarked one couldn’t help but ask whether the theorem remains true on taking, in place of 2, 3, arbitrary fixed distinct complex abscissas ≠ 0, 1 for our points. The answer is affirmative, and one can prove the following:

Theorem 3.8. For any distinct a, b ∈ C \ {0, 1} there are only finitely many λ ∈ C such that both $P_\lambda := (a, \sqrt{a(a-1)(a-\lambda)})$ and $Q_\lambda := (b, \sqrt{b(b-1)(b-\lambda)})$ are torsion on Eλ.

Let us briefly discuss this result, which leads to questions a bit less innocuous than could perhaps be expected. (The original choice of 2, 3 was made for simplicity; the general case a posteriori presented some aspects not appearing in the special case.) The strength and ease with which we can prove this statement depend on the transcendence degree $\tau := \mathrm{tr}(\mathbb{Q}(a, b)/\mathbb{Q})$. First, observe that for a, b ≠ 0, 1 none of the points is identically torsion, and if further we have a ≠ b, then the points are linearly independent over $\mathbb{Z}$; this follows as above for the special case a = 2, b = 3.

• The case τ = 0 amounts to assuming that a, b are algebraic numbers. This case may be treated analogously to the above one a = 2, b = 3, though with some complications; we shall say more on this soon.

• The case τ = 2 is indeed easy (as remarked by the presenter): now a, b are algebraically independent. On the other hand, if Pλ0 is torsion on Eλ0, then we have the equation Bn(λ0, a) = 0 for some n > 0, hence a is algebraic over $\mathbb{Q}(\lambda_0)$ (indeed, Bn(λ, x) is essentially monic in x); but then b is transcendental over $\mathbb{Q}(\lambda_0)$ and by the same argument we deduce that Qλ0 is not torsion. In other words, the relevant set of complex numbers λ0 is empty: we have “impossible intersections.”

• The case τ = 1 is not free of interest and can be treated in a completely different, more elementary, and also more satisfactory way than the algebraic case. Note that now we have some nontrivial equation f(a, b) = 0 ($f \in \overline{\mathbb{Q}}[X, Y]$) relating a, b, and this equation defines a curve C in the square $E_\lambda^2$; namely, if (x1, y1) × (x2, y2) is a point in $E_\lambda^2$, C is defined by f(x1, x2) = 0. Let now λ0 ∈ T(a, b) be an unlikely intersection, i.e., a doubly torsion value. Then both a, b are algebraic functions of λ0, so we obtain a point on C, defined over $\overline{\mathbb{Q}(\lambda_0)}$ and torsion in $E_{\lambda_0}^2$. But now, since τ = 1, λ0 is necessarily transcendental, and therefore we may replace λ0 by λ and get a torsion point inside the square $E_\lambda^2$, lying on the curve C and defined over $\overline{\mathbb{Q}(\lambda)}$. In this view the issue may be considered just a case of Raynaud’s theorem for the square of an elliptic curve with transcendental invariant. This could be dealt with as in Section 3.1,29 or else deduced from Raynaud’s theorem over $\overline{\mathbb{Q}}$ as done by Raynaud himself, i.e., by specialization to any value λ = ξ ∈ $\overline{\mathbb{Q}}$ of good reduction, i.e., ξ ≠ 0, 1.30 The specialization argument is anyway less efficient (for instance, from a quantitative viewpoint), and here we shall instead adopt a Galois strategy, similarly to methods that we have seen for Lang’s original problem in $\mathbb{G}_m^2$.

27 By this we refer to the class relative to the “modulus” (ξ) + (−ξ), in the sense of [Ser88].
28 It may be checked that this may be applied since the relevant divisors have disjoint support.

Let us write T(a, b) for the relevant set of unlikely λ0 ∈ C, and assume that a, b are distinct transcendental complex numbers satisfying an algebraic relation f(a, b) = 0 as above, over $\overline{\mathbb{Q}}$, of degree d. The following joint result with Masser (here stated with $\overline{\mathbb{Q}}$ in place of $\mathbb{Q}$) yields other impossible intersections.

Theorem 3.9. Let a ≠ b be transcendental complex numbers, satisfying an irreducible algebraic equation over $\overline{\mathbb{Q}}$ of degree d. Then the cardinality |T(a, b)| is finite and bounded only in terms of d. Moreover there is a finite set $P_d$ of polynomials in $\overline{\mathbb{Q}}[x, y]$, depending only on d and effectively computable, such that T(a, b) is empty unless P(a, b) = 0 for some P ∈ $P_d$.

Sketch of proof. We omit quantitative precisions, which are given in [MZ10d] and especially in [MZ10a]. Actually, here we follow a slightly different argument, corresponding to the Serre-Tate method for Lang’s problem, whereas the method in the quoted papers is nearer to Liardet’s.31

There is an irreducible relation f(a, b) = 0, where $f \in \overline{\mathbb{Q}}[X, Y]$ has degree ≤ d. Let us consider the set T = T(a, b) and pick a λ0 ∈ T, letting nP, nQ denote the exact orders of the corresponding points P, Q on Eλ0, with abscissas resp. a, b. As noted above, λ0 is transcendental, like a, b; hence, we may change λ0 into λ, with the proviso that then a, b are viewed as certain algebraic functions of λ, i.e., the abscissas of the torsion points $P, Q \in E_\lambda(\overline{\mathbb{Q}(\lambda)})$ of the said orders. We want to prove a bound depending only on d for the common order of P, Q.

The curve Eλ is isomorphic (over $\overline{\mathbb{Q}(\lambda)}$) to a curve $E_j^*$ defined over $\mathbb{Q}(j)$ by a Weierstrass equation, with transcendental invariant j = j(Eλ). For instance, we may take $E_j^*$ defined by

\[
Y^2 = 4X^3 - \frac{27j}{j - 1728}\,X - \frac{27j}{j - 1728}. \tag{3.4.7}
\]

The invariant j = j(Eλ) is easily expressed as the rational function $2^8(\lambda^2 - \lambda + 1)^3/\lambda^2(1 - \lambda)^2$ of degree 6 (see, e.g., the formulas after (3.3.5)). Now $\mathbb{Q}(\lambda)$ becomes the field generated by the 2-torsion on $E_j^*$. After such isomorphism, we may also view P, Q as torsion points P∗, Q∗ ∈ $E_j^*$. The x-coordinates x(P∗), x(Q∗) of P∗, Q∗, as points on $E_j^*$, are also easily expressed as linear functions of a, b with coefficients in $\mathbb{Q}(\lambda)$.
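The two formulas just quoted can be checked against each other by exact rational arithmetic. The following is only a minimal sketch, not part of the original argument: the function names and the sample value λ = 5/7 are our own illustrative choices, and the formula j = 1728 g2³/(g2³ − 27 g3²) for a curve Y² = 4X³ − g2 X − g3 is the standard one. The sketch confirms that the curve (3.4.7) has invariant j, and that the six values of λ in an anharmonic orbit share the same j, in accordance with the degree 6 of the rational function.

```python
from fractions import Fraction


def j_legendre(lam):
    """j-invariant of the Legendre curve y^2 = x(x - 1)(x - lam)."""
    lam = Fraction(lam)
    return Fraction(256) * (lam**2 - lam + 1) ** 3 / (lam**2 * (1 - lam) ** 2)


def j_weierstrass(g2, g3):
    """j-invariant of Y^2 = 4X^3 - g2*X - g3 (standard formula)."""
    return 1728 * g2**3 / (g2**3 - 27 * g3**2)


def anharmonic_orbit(lam):
    """The six values of lambda giving isomorphic Legendre curves."""
    lam = Fraction(lam)
    return {lam, 1 - lam, 1 / lam, 1 / (1 - lam), lam / (lam - 1), (lam - 1) / lam}


lam = Fraction(5, 7)              # illustrative sample value
j = j_legendre(lam)
g = 27 * j / (j - 1728)           # common coefficient in (3.4.7)

# The curve (3.4.7) indeed has invariant j.
assert j_weierstrass(g, g) == j
# j is constant on the anharmonic orbit, which has six elements here.
assert len(anharmonic_orbit(lam)) == 6
assert all(j_legendre(mu) == j for mu in anharmonic_orbit(lam))
```

For instance, λ = −1 gives the curve y² = x³ − x with j = 1728, a value at which (3.4.7) itself is not defined.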

29 The only difference with the sketched argument lies in the lower bound of Masser for the degree of torsion points, which, however, is much easier, and also sharper, in this case of $E_\lambda^2$.
30 One could also specialize a, b to algebraic values $a_0, b_0 \in \overline{\mathbb{Q}}$ with f(a0, b0) = 0, rather than specializing λ. This would reduce us to the case τ = 0, i.e., essentially to the above-treated Masser’s question, rather than to Raynaud’s theorem for $\overline{\mathbb{Q}}$. However, on the one hand, the relative case is more delicate than the constant one, and, on the other hand, one must keep under control the possible loss of information in the specialization procedure: in fact, whereas specializing λ → λ0 requires merely good reduction, i.e., λ0 ≠ 0, 1, the specialization of a, b leads to some annoying obstacles, to control possible collapsing of doubly torsion values of λ.
31 For a part of such proof there is also an alternative argument using Tate’s parametrization, given in [MZ10d].

These coordinates together with y(P∗), y(Q∗) generate over $\mathbb{Q}(j)$ fields contained in modular function fields of levels nP, nQ, as explained, e.g., in [Lan73] or [Shi94]. As such, these fields are subfields of the corresponding field of level m := lcm(2, nP, nQ), denoted here as F; that is, F is the field generated over $\mathbb{Q}(j)$ by the coordinates of all torsion points of order m. Now, the Galois structure of $F/\mathbb{Q}(j)$ is well known (since Fricke and Weber) to be the maximal possible one. Namely, viewing the m-torsion on $E_j^*$ as a finite group isomorphic to $\mathbb{Z}/(m) \times \mathbb{Z}/(m)$, and viewing the Galois action through its natural representation as a subgroup of $\mathrm{GL}_2(\mathbb{Z}/(m))$, the noted result states that such a Galois image is in fact the whole group $\mathrm{GL}_2(\mathbb{Z}/(m))$; moreover, the Galois group of $\overline{\mathbb{Q}}F$ over $\overline{\mathbb{Q}}(j)$ corresponds to $\mathrm{SL}_2(\mathbb{Z}/(m))$ (see the quoted references).

The original equation f(a, b) = 0 relating the abscissas of the points on Eλ yields a similar equation f∗(x(P∗), x(Q∗), λ) = 0, where $f^*(X, Y, \lambda) \in \overline{\mathbb{Q}}(\lambda)[X, Y]$ again has degree ≤ d in X, Y. This equation yields a curve C in $(E_j^*)^2$, defined over $\overline{\mathbb{Q}}(\lambda)$. Note that this curve C is possibly reducible, but since f∗ defines an irreducible (over $\overline{\mathbb{Q}(\lambda)}$) curve in the XY-plane, it gives rise to at most four components in $(E_j^*)^2$: in fact, C is the inverse image of the plane curve f∗ = 0, through the x × x-map, of degree 4, from $E_j^* \times E_j^*$ to P1 × P1. Also, letting Z be one irreducible component of C, every other component is obtained as the image of Z by some endomorphism (±1) × (±1) of $E_j^* \times E_j^*$.

Let us now use the Galois group over $\mathbb{Q}(\lambda)$ of the said modular function field F of level m; this Galois group is represented as the subgroup $\Gamma \subset \mathrm{GL}_2(\mathbb{Z}/(m))$ consisting of the matrices ≡ I (mod 2), and so it contains the homotheties ℓ · I : P ↦ ℓP with (ℓ, m) = 1. The subgroup $\Omega := \Gamma \cap \mathrm{SL}_2(\mathbb{Z}/(m))$ corresponds to the field $(\overline{\mathbb{Q}} \cap F)(\lambda)$. (See, e.g., [Lan73], Cor. 1(ii), p. 68.) By the quoted results, the group Ω may be seen also as the Galois group $\mathrm{Gal}(\overline{\mathbb{Q}}F/\overline{\mathbb{Q}}(\lambda))$; in this view, if g ∈ Ω, then g fixes a field of definition for C, so $C^g = C$. Suppose now that ℓ > 1 is such that (ℓ, m) = 1, and let us extend the homothety ℓ · I (as an element of our Galois group) to an automorphism of $\overline{\mathbb{Q}(\lambda)}$ over $\mathbb{Q}(\lambda)$; if k is a (Galois) number field of definition for f, f∗, C, suppose that this automorphism acts on k as $\sigma \in \mathrm{Gal}(k/\mathbb{Q})$.32 Then we have, for all g ∈ Ω,

\[
(P_*^g, Q_*^g) \in C, \qquad (\ell P_*^g, \ell Q_*^g) \in C^\sigma. \tag{3.4.8}
\]

An easy calculation, using the Chinese remainder theorem to reduce to the case of prime-power order, allows one to estimate from above the size of the stabilizer of (P∗, Q∗) in Ω. In fact, let $p^r \,\|\, m$; then at least one of P∗, Q∗, considered as points in $(\mathbb{Z}/p^r)^2$, has order $p^r$ if p ≠ 2 and ≥ $2^{r-1}$ otherwise. Hence (by conjugating by a matrix in $\mathrm{SL}_2(\mathbb{Z}/p^r)$) we may assume it is (1, 0) or (2, 0). Suppose p ≠ 2. Now, the stabilizer of (1, 0) in $\mathrm{SL}_2(\mathbb{Z}/p^r)$ has order $p^r$, and this is an upper bound for the order of the stabilizer of both P∗, Q∗. On the other hand, the reduction of Ω modulo $p^r$ has order equal to the order of $\mathrm{SL}_2(\mathbb{Z}/p^r)$, i.e., $p^{3r-2}(p^2 - 1) = p^{3r}(1 - p^{-2})$. If p = 2, then this reduction has order ≥ 1/6th of the previous corresponding lower bound. In conclusion, the index of the stabilizer is ≥ $p^{2r}(1 - p^{-2})$ if p ≠ 2 and ≥ $2^{2r-6}$ if p = 2. In turn, on taking a corresponding product over primes, this shows that the orbit by Ω of the point $(P_*, Q_*) \in (E_j^*)^2(\overline{\mathbb{Q}(j)})$ contains at least $m^2/64\pi^2$ points. (One can also directly estimate the size of the orbit from below instead of using the stabilizer.) Now two cases can possibly occur.
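The group orders used in this count are easy to confirm by brute force for small prime powers. The sketch below is only an illustration under our own conventions (matrices stored as tuples (a, b, c, d), acting on column vectors, with hypothetical function names); it checks the order $p^{3r-2}(p^2 - 1)$ of $\mathrm{SL}_2(\mathbb{Z}/p^r)$, the order $p^r$ of the stabilizer of (1, 0), and, for the modulus 4, that the matrices ≡ I (mod 2) form a subgroup of index 6, which is the source of the factor 1/6 at p = 2.

```python
from itertools import product


def sl2(n):
    """All matrices (a, b, c, d) over Z/n with ad - bc = 1."""
    return [(a, b, c, d) for a, b, c, d in product(range(n), repeat=4)
            if (a * d - b * c) % n == 1 % n]


def stabilizer_of_e1(n):
    """Matrices of SL_2(Z/n) fixing the column vector (1, 0)."""
    return [(a, b, c, d) for (a, b, c, d) in sl2(n) if a == 1 % n and c == 0]


for p, r in [(3, 1), (5, 1), (3, 2), (2, 2), (2, 3)]:
    n = p**r
    order = len(sl2(n))
    stab = len(stabilizer_of_e1(n))
    assert order == p ** (3 * r - 2) * (p * p - 1)
    assert stab == n
    # Index of the stabilizer: p^(2r) * (1 - p^(-2)).
    print(n, order, stab, order // stab)

# Matrices congruent to the identity modulo 2 inside SL_2(Z/4): index 6.
identity = (1, 0, 0, 1)
gamma2 = [M for M in sl2(4)
          if all((x - y) % 2 == 0 for x, y in zip(M, identity))]
assert 6 * len(gamma2) == len(sl2(4))
```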

1. $C^\sigma$ has no component in common with [ℓ]C. Then by Bezout’s theorem applied to the projection of these curves to P1 × P1 (this is a bit simpler than working in $(E_j^*)^2$), we find $|C^\sigma \cap [\ell]C| \ll d^2\ell^2$, whence, by (3.4.8), $m \ll d\ell$. On the other hand, elementary prime-number theory yields the existence of a suitable prime $\ell \ll \log m$, whence we get a bound for m: $m \ll d \log d$. In this case the order of torsion is likewise bounded for both points. Each double relation $n_P P = n_Q Q = 0$ yields, on eliminating λ, a relation between a, b (nontrivial because the Bn(λ, x) are essentially monic in x), which possibly gives an element of $P_d$. Clearly, only finitely many elements can arise in this way and can be computed.

2. There is a component in common to $C^\sigma$ and [ℓ]C. Let [ℓ]Z be such component, where Z is a component of C. We have already noted that every other component is obtained as the image of one of them by some map µ of the shape (±1) × (±1) on $E_j^* \times E_j^*$. Then we have $\mu Z^\sigma = [\ell]Z$ for some such µ. Iterating shows that $[\ell^r]Z = Z$ for some r ≥ 1. But then by general theory Z is a translate by a torsion point of an irreducible algebraic subgroup. However, this would say that the original relation f(a, b) = 0 defines a union of at most four torsion translates of an algebraic subgroup. In turn, the original points P, Q on Eλ with abscissas resp. a, b would be linearly dependent on Eλ, which we can exclude, for instance, by ramification, as in the case a = 2, b = 3. This completes the proof.

32 It may happen that this σ is not uniquely determined by ℓ.
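The “elementary prime-number theory” invoked in case 1 is the fact that an integer m cannot be divisible by all small primes unless it is very large, so the least prime ℓ not dividing m is ≪ log m. The sketch below merely illustrates this numerically; the function name and the choice of test values (products of the first few primes, which are the extremal case) are our own assumptions.

```python
import math


def least_prime_not_dividing(m):
    """Smallest prime ell with gcd(ell, m) = 1, for an integer m >= 1."""
    ell = 2
    while True:
        is_prime = all(ell % q for q in range(2, math.isqrt(ell) + 1))
        if is_prime and m % ell != 0:
            return ell
        ell += 1


# Extremal case: m a product of consecutive primes.
m = 1
for p in [2, 3, 5, 7, 11, 13, 17, 19, 23]:
    m *= p
    ell = least_prime_not_dividing(m)
    # ell stays comparable to log(m) along this sequence.
    print(m, ell, round(math.log(m), 2))
```

For any other m with the same number of prime factors the least admissible ℓ can only be smaller, which is why these products are the natural test case.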

Remark 3.4.1 In this case τ = 1, it is perhaps also possible to adopt arguments similar to the case τ = 0, since Pila’s results still work. However one would have to take conjugates of λ over a field of definition, and to explore their distribution in the complex plane. As to the above method, it is possible to give further precision and to quantify explicitly all of this. This is done in the note [MZ10a] but with the said different argument. In that paper some algebraic relations are explicitly computed and emptiness of T is proved in some cases. It is also noted that, for instance, $P_4$ is not empty, and contains $4uv^3 - 3v^4 - 6uv^2 + 4v^3 + u^2$; this relation gives rise to torsion relations 2P = 3Q = 0 on $E_u$. Further, it is probably possible to obtain quantitatively better estimates by transposing in this elliptic context the cyclotomic arguments of Beukers and Smyth [BS02].
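The relation quoted in Remark 3.4.1 can be recognized as (minus) the 3-division polynomial of $E_u$ evaluated at x = v, while 2P = 0 with abscissa a ≠ 0, 1 forces a = u. The sketch below checks this with exact arithmetic. It is only an illustration: the division polynomial and duplication formula are the standard ones for y² = x³ − (1 + λ)x² + λx, the function names are ours, and the rational pair (u, v) = (32/27, 4/3) is a hand-picked algebraic specialization (so not itself in the transcendental setting of Theorem 3.9).

```python
from fractions import Fraction


def relation(u, v):
    """The polynomial quoted in Remark 3.4.1, evaluated at (u, v)."""
    return 4 * u * v**3 - 3 * v**4 - 6 * u * v**2 + 4 * v**3 + u**2


def psi3(lam, x):
    """3-division polynomial of y^2 = x^3 - (1 + lam) x^2 + lam x."""
    return 3 * x**4 - 4 * (1 + lam) * x**3 + 6 * lam * x**2 - lam**2


def x_of_double(lam, x):
    """x(2Q) for Q = (x, y) on y^2 = x(x - 1)(x - lam), assuming y != 0."""
    return (x * x - lam) ** 2 / (4 * x * (x - 1) * (x - lam))


# The quoted relation is -psi3 (checked here at sample rational points).
for u, v in [(Fraction(2), Fraction(5)), (Fraction(-3, 4), Fraction(7, 2))]:
    assert relation(u, v) == -psi3(u, v)

# A rational point of the relation: P = (u, 0) has order 2 on E_u, and
# the point Q with abscissa v satisfies x(2Q) = x(Q), i.e. 3Q = 0.
u, v = Fraction(32, 27), Fraction(4, 3)
assert relation(u, v) == 0
assert v * (v - 1) * (v - u) != 0      # Q is not a 2-torsion point
assert x_of_double(u, v) == v
```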

3.4.3 Effectivity

This last result (i.e., Theorem 3.9) is fully effective. It seems that one can also achieve effectivity for Pila’s estimates, at least those stated as Theorem 3.5 in the present context of the proof of Theorem 3.3 (and perhaps as well in subsequent proofs relying on Bombieri-Pila-Wilkie’s estimates); however, this verification has not been explicitly carried out yet. With this proviso, Theorem 3.3 (and the generalization we shall soon state) would also be effective, in the sense that one could compute the finite set of unlikely λ0 in question. In the same way, effectivity for Pila’s proofs would also lead to an effective version of the Manin-Mumford statement, since Masser’s ingredient in the method is fully effective. (Thinking of the Manin-Mumford context, see also [Bak00] for a completely explicit version in the case of Jacobians of modular curves.)

Recently, in the context of dynamics of a quadratic polynomial $x^2 + c$, an application has been found in [HHK11] of an analogue of Theorem 3.3 for arbitrary pairs of points, i.e., Theorem 3.10, discussed in the next subsection. For a number field K and for c ∈ K, the authors study the number of K-rational pre-images (by pre-iteration of $x^2 + c$) of an algebraic number a, and prove this is finite, actually at most 10 for all but finitely many c ∈ K; also, in Theorem 1.3 of [HHK11], they predict the exact number in terms of a and K. Now, Theorem 3.10 leads to a further precision in such result, in that it would become independent of K except for a finite set of a; if made effective, Theorem 3.10 would lead to a completely explicit version of that result.

3.4.4 Extending Theorem 3.3 to arbitrary pairs of points on families of elliptic curves

The method of proof of Theorem 3.3 works also for a natural generalization (including Theorem 3.8 as well), namely, allowing the points to have coordinates in $\overline{\mathbb{Q}(\lambda)}$ and even in C(λ). More precisely, in [MZ10b] the following result is proved, stated here in the language of Pink’s conjecture:

Theorem 3.10. Let A be an abelian surface scheme over a variety defined over C, isogenous to the fibered product of two isogenous elliptic schemes. Let V be an irreducible closed curve in A. Then $V \cap A^{[2]}$ is contained in a finite union of abelian subgroup schemes of positive codimension.

As mentioned in 3.4.3, this may be applied to issues in arithmetic dynamics, as in [HHK11]. A reformulation can also be given in terms of an abelian pencil (or also family of arbitrary dimension) A → C, A isogenous to the square of an elliptic pencil E → C, with a section η : C → A; then η corresponds to two sections σ, τ : C → E. We have:

Theorem 3.11. Let E → C be a family of elliptic curves over a base variety C over a field of characteristic zero, and let σ, τ be two sections linearly independent over $\mathbb{Z}$. Then there are only finitely many p ∈ C such that σ(p), τ(p) are both torsion in Ep.

In the proof of these theorems it is easy to reduce to the case when $A = L = E_\lambda \times_{\mathbb{P}_1 \setminus \{0,1,\infty\}} E_\lambda$ is the fibered square of the Legendre curve (as in Theorem 3.3), and the points have coordinates in C(λ) (which corresponds to C being a curve). Now the theorem reads: If two such points are not identically linearly dependent, they assume simultaneously torsion values above only finitely many λ ∈ C. In this phrasing it is a kind of local-global principle.

When the points are defined over $\overline{\mathbb{Q}(\lambda)}$, the proof of the theorem follows the same lines as for Theorem 3.3, but with some new technical obstacles about which we shall soon comment. The general case when the transcendence degree is > 1 (i.e., dim C > 1) can be treated by means of arguments very similar to the ones just given for Theorem 3.9 (essentially, the case τ = 1 of Theorem 3.8).

As to the “algebraic case,” the relevant new issues appear as follows. Let K be a finite extension of $\overline{\mathbb{Q}}(\lambda)$ containing the coordinates of P, Q. Similarly to the proof of Theorem 3.3, one introduces the hypergeometric functions f, g, whose values are lattice generators, and the elliptic logarithms z, w. These are now functions on a curve with function field K. It is not difficult to obtain the required analytic continuations, but the proof that the relevant surface does not contain semialgebraic curves is much less direct. For the special case of Theorem 3.3 this was done in the course of the proof of Lemma 3.4: we proved that a semialgebraic curve would lead to the homogeneous algebraic dependence of f, g, z, w. And then, by using simple monodromy of f and g, we proved this did not hold. Extending these arguments to the present case involves monodromy on all of f, g, z, w and turns out to be more complicated. (See the appendix to [MZ10b] for this argument.)

Daniel Bertrand noted that one could use here the algebraic independence of z, w over C(f, g), which he proved in 1990 using some D-module theory applied to the Picard-Fuchs differential operator (of which f, g are solutions). See [Ber89], Theorem 5, for a much stronger algebraic independence assertion than needed, which indeed leads to a rather shorter proof.33 With this in hand, the argument is the same as in Lemma 3.4: z, w lie in C(x, y, u, v, f, g), and restricting to the set E mentioned therein yields restricted functions x̄, ȳ, ū, v̄ generating a field of transcendence degree at most 1 over C(f̄, ḡ). Hence z̄, w̄ are algebraically dependent on C(f̄, ḡ), a dependence which can be analytically continued, giving a contradiction with Bertrand’s theorem. In conclusion, this implies that the relevant surface does not contain semialgebraic curves, and allows the rest of the proof to go through.

A further step is to consider the analogous question for the product of two nonisogenous curves, i.e., a family E1 × E2 → C, and a nontorsion section, not mapping into the origin of any of the factors. This has been carried out in recent work [MZ10c], by similar methods, to obtain:

Theorem 3.12. Let A be a nonsimple abelian surface scheme over a variety defined over C. Let V be an irreducible closed curve in A. Then $V \cap A^{[2]}$ is contained in a finite union of abelian subgroup schemes of positive codimension.

This time the abelian family A may be assumed to be isogenous to the product Eλ × Eκ of Legendre families, where λ, κ are transcendental but are algebraically dependent. We do not pause on new details of the proofs, and only note that new cases appear, depending also on the field of definition of an algebraic relation between λ, κ. Similarly to what happens for the situation treated at 3.4.2 above, in some cases a specialization argument may be used to reduce to the algebraic case; and as before, one may specialize either λ or the coordinates of the involved points, a choice which changes some features of the problem. (Depending on such choice one may also be led to a certain effective uniform version of Siegel’s theorem over function fields, of some independent interest, carried out in [MVZ10].)

33 See also André’s paper [And92] for further generality.

3.4.5 Simple abelian surfaces and Pell’s equations over function fields

The remaining case in Pink’s conjecture for families of abelian surfaces is the case of simple abelian surfaces. In particular, we find simple Jacobians of a curve of genus 2;34 this would also yield an attractive application, suggested by Masser, to a Pell’s equation to be solved in polynomials. For instance, by means of the above method one should be able to answer negatively the following question:

Question B. For a given λ ∈ C, consider the Pell’s equation $X^2 - (t^6 + t + \lambda)Y^2 = 1$, to be solved in nonconstant polynomials X, Y ∈ C[t]. Can this be done for infinitely many λ ∈ C?

The link with the previous context is as follows: a nonconstant solution X = x(t), Y = y(t) ∈ C[t] to the Pell’s equation, for λ = λ0 ∈ C, would produce a function x + uy on the nonsingular hyperelliptic curve Cλ0 of genus 2 (if $\lambda_0^5 \neq 5^5/6^6$) with function field C(t, u) given by $u^2 = t^6 + t + \lambda_0$. The divisor of this function would be supported on the two points denoted ∞+, ∞− ∈ Cλ0 above t = ∞ (all of this depends of course on λ0); hence, such a divisor would be of the shape m(∞+ − ∞−) and so the difference ∞+ − ∞− would be a torsion divisor on the Jacobian of the curve. Contrary to previous situations, now we have only a single point ∞+ − ∞− (dependent on λ0) which has to become torsion on the Jacobian Jλ0 of Cλ0; but the ambient space now has dimension 2, and in fact a point of the Jacobian corresponds to a pair of points on the curve. Hence it is again unlikely that this happens. (Note also that the set of points ∞+ − ∞− ∈ Jλ0 describes a curve in a threefold for varying λ0, like each torsion condition mP = 0.) We fall in another case of Pink’s conjecture. (As noted in 3.4.1, this implicitly appears also in Example 4(a), Section 4, of Zhang’s survey [Zha98b] as a case of a certain Bogomolov-type conjecture discussed therein.)
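The nonsingularity condition on λ0 quoted above can be recovered by a short computation, which we record as a sketch (the notation $t_0$ for a hypothetical multiple root is ours): the affine curve $u^2 = t^6 + t + \lambda_0$ is singular exactly when $t^6 + t + \lambda_0$ has a multiple root.

```latex
\[
f(t) = t^6 + t + \lambda_0, \qquad f'(t) = 6t^5 + 1 .
\]
% A multiple root t_0 satisfies f'(t_0) = 0, i.e. t_0^5 = -1/6, hence
\[
0 = f(t_0) = t_0 \cdot t_0^5 + t_0 + \lambda_0
           = \tfrac{5}{6}\, t_0 + \lambda_0 ,
\qquad\text{so}\qquad t_0 = -\tfrac{6}{5}\,\lambda_0 .
\]
% Substituting back into t_0^5 = -1/6:
\[
-\Big(\tfrac{6}{5}\Big)^{5} \lambda_0^5 = -\tfrac{1}{6}
\quad\Longleftrightarrow\quad
\lambda_0^5 = \frac{5^5}{6^6} .
\]
```

Conversely, if $\lambda_0^5 = 5^5/6^6$ then $t_0 = -6\lambda_0/5$ is a common root of f and f′, so the five excluded values of λ0 are exactly those for which the curve degenerates.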
Remark 3.4.2 We note that the divisor expressed as the difference ∞+ − ∞− of the said points is not identically torsion, a fact which admits a self-contained easy proof: if it were, the Pell’s equation would have an identical nontrivial solution (x, y) over C(λ)[t]. Conjugating over C(λ) the function x + uy, and noting that the points at infinity on Cλ are defined over C(λ), would give us another function with the same divisor, and hence $x^\sigma + u y^\sigma = c(x + uy)$, where c ∈ C(λ). Necessarily $c^2 = 1$, so the square $(x^2 + u^2y^2) + u(2xy)$ would yield a nontrivial solution $(x^2 + u^2y^2, 2xy)$ of the Pell’s equation with entries in C(λ)[t]. In turn, this would give a nontrivial solution $Z^2 - (t^6 + t + \lambda)W^2 = \delta(\lambda)$ with Z, W ∈ C[t, λ] and δ ∈ C[λ] \ {0}; we may also assume that δ does not divide both Z, W. If δ is constant, comparison of degrees in λ yields a contradiction. If δ is nonconstant, specializing λ to a root ξ ∈ C of δ also yields a contradiction, because $t^6 + t + \xi$ cannot be a square in C(t).

34 It is known that any simple principally polarized abelian surface is the Jacobian of a suitable curve of genus 2; see [LB92], Cor. 8.2.

This kind of problem requires further investigations anyway, and would lead naturally to related and crucial issues, as, for instance, establishing whether the said Jacobian is identically simple or isogenous to the product of elliptic curves. In fact, if the Jacobian is (generically) isogenous to a product of elliptic curves, a priori there might be infinitely many torsion values even if ∞+ − ∞− is not identically torsion. In this respect, I owe to Masser the following example: consider the Pell’s equation $X^2 - (t^6 + t^2 + \lambda)Y^2 = 1$. Now, the curve $u^2 = t^6 + t^2 + \lambda$ has genus 2 and maps $(t, u) \mapsto (t^2, u)$ and $(t, u) \mapsto (t^2, tu)$ resp. to the elliptic curves $u^2 = t^3 + t + \lambda$ and $u^2 = t^4 + t^2 + \lambda t$ (which can be shown to be nonisogenous). Also, the (elliptic) Pell’s equation $X^2 - (t^4 + t^2 + \lambda_0 t)Y^2 = 1$ shall be solvable in polynomials Xλ0(t), Yλ0(t) for infinitely many λ0 ∈ C (because there are infinitely many torsion values for a single point on an elliptic curve over C(λ)). From this we find the solutions $(X_{\lambda_0}(t^2), tY_{\lambda_0}(t^2))$ to the original Pell’s equation. On the other hand, it may be proved, as in the previous Remark 3.4.2, that there are no identical solutions (i.e., over C(λ)[t]).
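The passage from the quartic Pell’s equation to the sextic one in Masser’s example rests on the identity $t^2(t^6 + t^2 + \lambda) = s^4 + s^2 + \lambda s$ with $s = t^2$, which gives $X(t^2)^2 - (t^6 + t^2 + \lambda)(tY(t^2))^2 = (X^2 - (s^4 + s^2 + \lambda s)Y^2)(t^2)$ for arbitrary polynomials X, Y. The sketch below checks this with exact coefficients. It is only an illustration: polynomials are plain coefficient lists (lowest degree first), the helper names are ours, and the sample X, Y are arbitrary polynomials, not actual Pell solutions.

```python
from fractions import Fraction


def pmul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def psub(p, q):
    """Difference p - q, with trailing zero coefficients removed."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    out = [a - b for a, b in zip(p, q)]
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out


def at_t_squared(p):
    """Substitute s = t^2, i.e. p(s) -> p(t^2)."""
    out = [Fraction(0)] * (2 * len(p) - 1)
    for i, a in enumerate(p):
        out[2 * i] = a
    return out


def pell_form(X, Y, D):
    """The polynomial X^2 - D * Y^2."""
    return psub(pmul(X, X), pmul(D, pmul(Y, Y)))


lam = Fraction(3, 5)                      # illustrative value of lambda_0
E = [0, lam, 1, 0, 1]                     # s^4 + s^2 + lam*s
D = [lam, 0, 1, 0, 0, 0, 1]               # t^6 + t^2 + lam
X = [Fraction(1), Fraction(2), Fraction(-1)]   # arbitrary sample polynomials
Y = [Fraction(4), Fraction(0), Fraction(7)]

lifted = pell_form(at_t_squared(X), pmul([0, 1], at_t_squared(Y)), D)
assert lifted == at_t_squared(pell_form(X, Y, E))
```

In particular, whenever $X^2 - (s^4 + s^2 + \lambda_0 s)Y^2 = 1$, the lifted pair solves the sextic equation, as stated in the text.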

Remark 3.4.3 For certain hyperelliptic curves over $\mathbb{Q}(\lambda)$ the simplicity of the Jacobian has been proved by Masser in [Mas99], who actually determines the endomorphism ring as being $\mathbb{Z}$. Other examples come from N.M. Katz’s work [Kat93], Theorems 5.13, 5.17. To prove the mere simplicity, one may also specialize λ → λ0 ∈ $\mathbb{Q}$ to obtain a curve over $\mathbb{Q}$ and then use a method sketched in [Sto95] and [CF96], by means of counting points over $\mathbb{F}_p$ and $\mathbb{F}_{p^2}$ to reconstruct the local Zeta-function of the Jacobian J; then one looks whether such Zeta-function splits into the product of local Zeta-functions of elliptic curves, after a field extension. If this does not happen for some prime of good reduction for J, it implies simplicity (in view of results of Serre-Tate on preservation of good reduction of abelian varieties under isogeny). For the Jacobian of $y^2 = x^6 + x + \lambda$, this “negative” test works, e.g., on choosing p = 11 and λ0 ≡ 0 (mod 11) or p = 7 and λ0 ≡ 1 (mod 7).
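The first step of the test described in Remark 3.4.3, counting points over $\mathbb{F}_p$ and $\mathbb{F}_{p^2}$ and reconstructing the numerator of the local Zeta-function, can be sketched as follows. This is a brute-force illustration only, under assumptions of ours: p is an odd prime of good reduction, $\mathbb{F}_{p^2}$ is modelled as pairs (a, b) standing for a + b√d with d a nonresidue, the smooth model has two rational points at infinity (the leading coefficient 1 is a square), and the standard genus-2 relations a1 = N1 − p − 1, a2 = (N2 − p² − 1 + a1²)/2 are used. The subsequent splitting test “after a field extension” is not implemented here.

```python
from collections import Counter


def nonresidue(p):
    """Smallest quadratic nonresidue modulo the odd prime p."""
    squares = {(x * x) % p for x in range(1, p)}
    return next(d for d in range(2, p) if d not in squares)


def count_points(p, lam, k):
    """Number of points over F_{p^k}, k in (1, 2), on the smooth model
    of y^2 = x^6 + x + lam; field elements are pairs a + b*sqrt(d)."""
    d = nonresidue(p)

    def mul(u, v):
        return ((u[0] * v[0] + d * u[1] * v[1]) % p,
                (u[0] * v[1] + u[1] * v[0]) % p)

    def f(x):
        x2 = mul(x, x)
        x6 = mul(mul(x2, x2), x2)
        return ((x6[0] + x[0] + lam) % p, (x6[1] + x[1]) % p)

    second = range(p) if k == 2 else range(1)
    field = [(a, b) for a in range(p) for b in second]
    roots = Counter(mul(y, y) for y in field)   # value -> number of square roots
    affine = sum(roots[f(x)] for x in field)
    return affine + 2                           # the two points at infinity


def l_polynomial(p, lam):
    """Coefficients [1, a1, a2, p*a1, p^2] of the Zeta-function numerator."""
    n1, n2 = count_points(p, lam, 1), count_points(p, lam, 2)
    s1, s2 = n1 - p - 1, n2 - p * p - 1
    a1, a2 = s1, (s2 + s1 * s1) // 2
    return [1, a1, a2, p * a1, p * p]


for p, lam0 in [(7, 1), (11, 0)]:               # the two cases quoted above
    print(p, lam0, count_points(p, lam0, 1), l_polynomial(p, lam0))
```

A hand count gives 9 points over $\mathbb{F}_7$ for λ0 = 1 and 8 points over $\mathbb{F}_{11}$ for λ0 = 0, which the sketch reproduces; deciding simplicity from the resulting polynomial then follows the method of [Sto95] and [CF96].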

It is worth noting that the method of proof of Theorem 3.3 is likely to apply also to the problem of the splitting of the specialization of a generically simple Jacobian of dimension 2 into a product of elliptic curves of special type. This is the object of work in progress, in which, for instance, calculations of Masser should lead to a proof of the following result: Let Aλ be the Jacobian of the curve y 2 = x(x − 1)(x − λ)(x − λ2 )(x − λ4 ). There are at most finitely many λ0 ∈ C (λ0 not 0 or a root of unity of order 2, 3, or 4) such that Aλ0 is isogenous to the product of two isogenous CM elliptic curves or is simple and has CM type. (By “CM-type” we mean that the endomorphism ring has rank at least 4.) The arguments use a similar method as above, applied this time to a Siegel matrix for the Jacobian, which for the relevant λ0 (for the first condition) has entries in a complex quadratic field. Several deep ingredients are involved. This time the height estimates for λ0 come from a theorem of Andr´e (p. 201 of [And89]) obtained with G-function theory and similar to Silverman’s bound, but bounding the height by a certain power of the degree of λ0 : h(λ0 )  d(λ0 )k .35 Then this must be supplemented with endomorphism estimates of Masser and W¨ ustholz [MW94] on isogeny degrees (see Remark 4.2.1) to relate h(λ0 ) with the height of the complex multipliers which arise (but further delicate 35 Andr´ e’s

theorem requires a certain multiplicative reduction condition, here satisfied at λ = ∞.

3.4 Related problems, conjectures, and developments

87

calculations with discriminants of suitable lattices with respect to the Rosati quadratic form are necessary). To my knowledge this is the first result of this type (previous conclusions concerned specializations of simple Jacobians at algebraic points of bounded degree). Remark 3.4.4 The first conclusion may be described as predicting finiteness of complex specializations which increase “by three levels” the endomorphism algebra of Aλ : one level for being isogenous with a product of two elliptic curves (which raises End from to at least 2 ), a second level by making the two factors isogenous (this raises End to 2 × 2 rational matrices), and a third level by requiring the factors to be CM (now the matrices have entries in a quadratic field). Note also that this “CM -level” might in fact be considered to count twice (because of the two factors). The question arises whether one can get finiteness, e.g., already at the second level. An example when this can be done comes from the (nonsimple!) abelian surface Eλ × E−λ : in Example 4.1 below it is noted that Andr´e’s theorem of the next chapter implies finiteness for the set of complex values of λ which make both factors to be CM , increasing the endomorphism algebra from 2 to a product of two quadratic fields. (Of course, the second conclusion of the above statement also prevents the second level in the case of a simple abelian surface.) On the other hand, a single level does not suffice, e.g., in this last example; in fact it can be proved that there are infinitely many λ0 ∈ C such that Eλ0 is isogenous to E−λ0 , so that the abelian surface in question becomes infinitely many times isogenous to a square: see Remark 3.4.5 in the notes below. 
Other finiteness results at the first level are available, with different methods, if one restricts to specializations over a number field; this happens, for instance, for E_λ × E_{−λ}, and also for Jacobians of y² = f(x)(x − λ): see [EEHK09], Theorem 8.³⁶ The mere sparseness of the relevant values of λ in a given number field comes from previous results by Masser in [Mas89b]: by a method as in Note 1 to Ch. 1, he estimates the number of relevant λ in a given number field with height at most T, using also the already quoted estimates on isogeny degrees of Masser-Wüstholz.


In this connection (and thinking also of the last remark), the following question also arose: Are there infinitely many complex λ₀ such that the three curves E_{λ₀}, E_{−λ₀}, E_{2λ₀} are all isogenous? Actually, a finiteness answer is implicit in a conjecture of Pink in [Pin05b]; it almost follows from a result of Habegger and Pila, stated as Theorem 4.7 in Note 2 to the next chapter. (The “symmetry” condition therein is not verified; however, it would be verified for the analogous issue for E_{λ₀}, E_{−λ₀}, E_{λ₀²} or for E_{λ₀}, E_{−λ₀}, E_3.)

3.4.6 Further extensions and analogues

Beyond abelian schemes, further generalizations of the problems would concern group extensions by 𝔾_m of a pencil of elliptic curves, or other semiabelian varieties, as mentioned in the above discussion of Pink’s conjecture in Subsection 3.4.1. The generalizations we have mentioned may certainly appear rather special if compared with general statements one might think about, such as Pink’s conjecture. On the other hand, in the first place they cover the general case of a curve inside an abelian scheme of relative dimension 2. Second, it clearly appears that the basic arguments for the proof of Theorem 3.3 cannot be carried out in a straightforward way already for the said cases. Hence, providing complete proofs may help to clarify underlying obstacles and important issues which can appear in even more general investigations.

For the case of higher relative dimensions, Masser had already formulated analogues for 3 (or more) points:

Question C: Are there infinitely many λ₀ ∈ C such that there are two independent relations between the points (2, √(2(2 − λ₀))), (3, √(6(3 − λ₀))), (5, √(20(5 − λ₀))) on E_{λ₀}? We expect, of course, a negative (i.e., finiteness) answer; this would represent an analogue of Theorem 1.3 for families of elliptic curves; it might be very difficult. The case of a constant elliptic curve has been settled in [Via08], but with methods that would not cover unconditionally the obvious analogue for four or more points. (This is covered for arbitrary powers of a CM elliptic curve in [Via03].)

Question D: What can be said about the set of complex pairs (a, b) ∈ C² such that three points with abscissas, e.g., resp. 1, 2, 3 are torsion on the elliptic curve y² = x³ + ax + b? Here we again would expect at least that the set of relevant pairs is degenerate, i.e., not Zariski-dense in C². In principle, this could also be treated by the above described methods. However, again some new obstacles appear compared to what we have seen. A first matter is the lack in this situation of a bound for the height, i.e., an analogue of Proposition 3.2 above. We have two parameters, and Silverman’s method (outlined in the corresponding proof) does not yield what is needed. One may note that the elliptic curve y² = x³ + ax + b is of course isomorphic to a curve depending only on one parameter (on going to a bounded degree extension of ℚ(a, b)); however, the three points would then still depend on two variables, and the said difficulty would not disappear. Nevertheless, Habegger has recently proved the required bound for the height. Here is a version of his result (in the special case of dimension 2):

36 This finiteness is obtained on observing that a “bad” specialization λ → λ₀ ∈ k yields a degeneracy of the Galois structure of torsion points of given (large) order ℓ, with respect to the generic Galois group, which is “large,” e.g., since E_λ is not isogenous to E_{−λ} identically. (In more general cases, this step may require a deep theorem of Deligne on monodromy: Théorie de Hodge II, Pub. Math. I.H.E.S., 40 (1971), 5–57.) In turn, this produces a k-rational point of a certain curve (depending on ℓ), and Faltings’ finiteness theorem may be applied when the genus is ≥ 2. In more recent work by some of these authors, the known technique of going to a symmetric power of the curve is applied to obtain finiteness for the bad specializations of bounded degree.


Theorem 3.13. (Habegger’s elliptic height bound.) Let S/ℚ̄ be a surface, λ ∈ ℚ̄(S), and let P, Q be linearly independent rational sections of the Legendre elliptic threefold E_λ ×_{P¹} S over S. Then there are a number B and a dense Zariski open subset Z ⊂ S such that if z ∈ Z is such that P(z), Q(z) are torsion, then (z ∈ S(ℚ̄) and) h(z) ≤ B.


See [Hab11a] for a proof of a more general statement (Theorem 1.3). In spite of this important advance, the analysis of the set S^trans for this problem (indispensable to apply the results by Pila-Wilkie) leads to still another obstacle. A possible proof that the relevant S^alg is empty is related to a function-field analogue of the Schanuel conjecture in the context of a family of elliptic curves (the constant case being known over function fields), and moreover to a function-field analogue of Question C. (A relevant paper in the direction of the first issue is the recent one [BP10a] by Bertrand and Pillay.) It may be that such an approach will eventually lead to a degeneracy proof, but, anyway, Habegger has recently overcome all of this and also found a different path to this question, solving the algebraic independence issue at the basis. See [Hab11b] for a proof that the relevant set of pairs in Question D is not merely degenerate, but actually finite (Theorem 1 therein).³⁷

This approach by Habegger contains a number of new ideas. First, for the values of the parameters which lead to “triple-torsion,” he uses geometry of numbers to construct a dependence relation with small coefficients among the three points. The relation itself defines a curve in the space of parameters; this curve is not fixed, but using a uniform version of Pila’s estimates, he is able to control its variation, thus reducing to the case of a single parameter. For this to work, it is essential to have a good enough exponent in a lower bound for the degree of torsion points, as in 3.3.3 (whereas any positive exponent would suffice in the previously illustrated proofs). In all of this he also uses the Tate parametrization to transform the dependency issues which arise about the three points into corresponding multiplicative dependencies of elements in a suitable complete field, which are simpler to control.

37 The finiteness is due to the special choice of the three points; for more general choices, the independence of the three points could merely lead to degeneracy, in accordance with a conjecture made in [MZ10b]. An explicit example, given by Habegger, occurs when the three abscissas are −1, 0, 1: there are infinitely many values of a such that the three points are torsion on y² = x³ + ax, so we have an infinity of values with b = 0.

3.4.7 Dynamical analogues

We conclude this excursus on possible developments from Theorem 3.3 by mentioning a few analogues in algebraic dynamics. In the same spirit as Masser’s problems, I asked the following question at the AIM meeting in Palo Alto (January 2008):

Question E. What can be said about the complex numbers λ such that both 0 and 1 are preperiodic for the map x ↦ x² + λ?

Note that torsion points on an elliptic curve are just the preperiodic points for the doubling map p ↦ [2]p; this is the same as saying that the abscissa of the point is preperiodic for the so-called Lattès map x(p) ↦ x([2]p) associated to the Legendre elliptic curve. Hence Masser’s original question could be reformulated in terms of the complex numbers λ such that 2, 3 are both preperiodic for the Lattès map; for the Legendre curve, this map is explicitly given by the rational function of degree 4

x ↦ (x² − λ)² / (4x(x − 1)(x − λ)).

This makes quite explicit the connection of the dynamical question with Masser’s one. Recently the question for x² + λ has been answered by M. Baker and L. DeMarco [BD11]; they actually deal with a more general question and prove:

Theorem 3.14. ([BD11].) For a given integer d ≥ 2 and given a, b ∈ C, there are only finitely many λ₀ ∈ C such that both a, b are preperiodic for x ↦ x^d + λ₀, unless a^d = b^d, in which case the relevant set is infinite.

Their arguments involve dynamical Galois equidistribution, and are completely different from the ones we have outlined above for Masser’s problems.³⁸ The arguments should also apply to points of small canonical height (with respect to the maps in question), which of course include preperiodic points. Such arguments, as they stand, do not apply to the Lattès maps; but in principle it is possible that they work in this more general context, provided certain supplementary ingredients are proved; it seems likely that this can indeed be done (although at the moment no complete proof has yet been written down). Then this could provide an alternative proof for results like Theorem 3.3 above. (A difficulty behind this extension is related to the fact that the set of preperiodic points for the Lattès map is always dense in the Riemann sphere, because the torsion points on an elliptic curve are dense in it. In the general case of rational functions, the closure of this set is generally smaller.) In their proofs, Baker-DeMarco use the (generalized) Mandelbrot set M_t = M_t(d) for x^d + λ, defined as the set of complex numbers λ such that the iterates of t under the map stay bounded. It is a compact full set (i.e., has connected complement) in C.

38 We note that here the preperiodic points are supposed to be given, whereas it is the map which varies. Hence, the said equidistribution has a different meaning with respect to the more common Galois equidistribution of preperiodic points of growing order, with respect to a fixed map.
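As a sanity check on the degree-4 formula for the Lattès map quoted above, the following minimal Python sketch compares it with the usual tangent-line doubling formula on y² = x³ − (1 + λ)x² + λx, in exact rational arithmetic. The names `lattes`, `double_x`, and `check_grid`, and the choice of sample grid, are illustrative assumptions and do not come from the text; agreement on a finite grid is only numerical evidence for the underlying identity.

```python
from fractions import Fraction


def lattes(x, lam):
    """Degree-4 rational map x(p) -> x([2]p) for the Legendre curve."""
    return (x * x - lam) ** 2 / (4 * x * (x - 1) * (x - lam))


def double_x(x, lam):
    """x-coordinate of [2]p from the tangent-line formula on
    y^2 = x^3 - (1 + lam) x^2 + lam x; only x and y^2 are needed."""
    y2 = x * (x - 1) * (x - lam)
    slope_sq = (3 * x * x - 2 * (1 + lam) * x + lam) ** 2 / (4 * y2)
    return slope_sq + (1 + lam) - 2 * x


def check_grid():
    """Compare both formulas on a grid of rational (x, lam); return the
    number of sample pairs tested (points with y = 0 are skipped)."""
    tested = 0
    for xn in range(-6, 7):
        for ln in range(-6, 7):
            x, lam = Fraction(xn, 3), Fraction(ln, 5)
            if x == 0 or x == 1 or x == lam:
                continue  # 2-torsion abscissas: the tangent is vertical
            assert lattes(x, lam) == double_x(x, lam)
            tested += 1
    return tested


if __name__ == "__main__":
    print("pairs tested:", check_grid())
```

Since both expressions depend only on x and on y² = x(x − 1)(x − λ), no rational point on the curve is needed for the comparison.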


In the case of algebraic a, b, they consider (for a relevant λ) the discrete probability measure δ_λ on C supported equally on the Galois conjugates of λ (which indeed has to be an algebraic number). Using the fact that a is preperiodic for λ, an arithmetic Galois equidistribution theorem based on the product formula shows that (if the relevant set of λ is infinite) the measures δ_λ converge weakly to the equilibrium measure µ_a for M_a (in the sense of complex potential theory) on the Riemann sphere. But then by symmetry they get µ_a = µ_b, which implies M_a = M_b. And then by a complex-analytic argument using Green’s functions and univalent function theory they deduce that a^d = b^d. (For the special case of the original question, it is possible to show directly that i ∈ M_0 but i ∉ M_1. Also, Baker-DeMarco conjecture that the only relevant values of λ in that case are 0, −1, −2, but at the moment there is no known way to compute this set effectively.) As for Theorem 3.3, this method does not work as it stands when a, b are transcendental; for this case, however, the authors are able to get the proof by similar arguments, this time involving the Berkovich projective line in place of the Riemann sphere. (See the recent book by Baker and R. Rumely [BR10] for all these auxiliary results.) It seems to us interesting that similar problems, but for different rational functions, involve (at present) entirely different ways to be solved.

The result of Baker-DeMarco has been more recently extended by D. Ghioca, L.C. Hsia, and T. Tucker in the paper [GHT11]. In place of x^d + λ, they consider families of polynomials of the shape f_λ(x) = x^d + Σ_{i=0}^{d−2} c_i(λ)x^i, for polynomials c_i(λ). Also, in place of constant points a, b, they consider polynomials a(λ), b(λ) and assume that there exist infinitely many λ₀ ∈ C such that a(λ₀), b(λ₀) are both preperiodic for f_{λ₀}. Under certain (mild) conditions on a(λ), b(λ), which we omit here, they prove, for instance, that this may hold only if f_λ(a(λ)) = f_λ(b(λ)) identically. (Their paper also contains other results and corollaries.) These authors further consider a more general setting, embracing the issues so far discussed: let X_λ be an algebraic family of complex quasiprojective varieties with endomorphisms Φ_λ : X_λ → X_λ, and let P_λ, Q_λ ∈ X_λ be two algebraic families of points. Under what conditions do there exist infinitely many λ such that both P_λ, Q_λ are preperiodic for Φ_λ?

As to the paper [BD11], another result is that if f, g ∈ C(z) are rational functions of degree at least 2, with infinitely many preperiodic points in common, then they have exactly the same set of preperiodic points. This had been proved by Mimar for functions over ℚ̄, but the authors use Berkovich spaces to extend this in general. In particular, then f, g must have the same Julia set, which is the set of accumulation points of the (repelling) preperiodic set. For Lattès maps this reduces to the Raynaud theorem for the product of two elliptic curves. This is the case X = P¹ of a more general statement appearing as Theorem 1.2 in the recent paper [YZ10] by X. Yuan and Zhang.
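For the original Question E discussed above, the values λ = 0, −1, −2 and the assertion that i ∈ M_0 but i ∉ M_1 can be illustrated by iterating x ↦ x² + λ directly. The following is a minimal Python sketch under stated assumptions: the function name `is_preperiodic`, the step limit, and the escape bound are hypothetical choices of ours, and a search over a small range of integers λ proves nothing about the full set of complex λ.

```python
def is_preperiodic(start, lam, max_steps=50, bound=10**6):
    """Follow the orbit of `start` under x -> x*x + lam.

    Returns True as soon as a value repeats (the orbit is then finite),
    and False if |x| exceeds `bound` or no repetition is detected within
    `max_steps` steps. The inputs used below are integers or Gaussian
    integers small enough for the arithmetic to be exact.
    """
    seen = set()
    x = start
    for _ in range(max_steps):
        if x in seen:
            return True
        if abs(x) > bound:
            return False
        seen.add(x)
        x = x * x + lam
    return False  # undecided within the step limit


if __name__ == "__main__":
    # Integers lam in [-5, 5] for which both 0 and 1 have finite orbit.
    both = [lam for lam in range(-5, 6)
            if is_preperiodic(0, lam) and is_preperiodic(1, lam)]
    print(both)
    # For lam = i the orbit of 0 is finite while the orbit of 1 escapes.
    print(is_preperiodic(0, 1j), is_preperiodic(1, 1j))
```

For λ = i the orbit of 0 is 0, i, −1 + i, −i, −1 + i, ..., which is bounded, while the orbit of 1 begins 1, 1 + i, 3i, −9 + i, 80 − 17i and escapes.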


Again in the context of algebraic dynamics, let us also mention that there are conjectural analogues of the Manin-Mumford conjecture, where preperiodic points replace torsion points. Recently, Ghioca, Tucker, and Zhang have formulated the following conjecture in [GTZ11] (see Conjecture 1.4):

Conjecture. (Ghioca-Tucker-Zhang [GTZ11].) Let X be a projective variety, let ϕ : X → X be an endomorphism defined over C with a polarization, and let Y be a subvariety of X which has no component included in the singular part of X. Then Y is preperiodic under ϕ if and only if there exists a Zariski-dense subset of smooth points x ∈ Y ∩ Prep_ϕ(X) such that the tangent subspace of Y at x is preperiodic under the induced action of ϕ on the Grassmannian Gr_{dim(Y)}(T_{X,x}).³⁹

39 Here we have denoted by T_{X,x} the tangent space of X at the point x; we implicitly mean that, since an iterated power of ϕ fixes x, the differential of this power acts as an endomorphism of such tangent space.


Here by polarization we mean an ample line bundle L on X such that ϕ*(L) = L^{⊗d} for some integer d > 1. Following [GTZ11], we note that the apparently more natural statement which results on omitting the condition on the tangent space is false. See the false statement 1.1 and Theorem 1.2 in [GTZ11], which provides a counterexample for the diagonal Y in a square X = E² of an elliptic curve E with complex multiplication; one takes ϕ = (α, β), where α, β ∈ End(E) have the same norm d > 1 and are such that α/β is not a root of unity. On the other hand, even if the tangent-space condition appears to be a heavy restriction, the conjecture implies the Manin-Mumford one, on taking ϕ to be multiplication by 2 on the relevant abelian variety (so the induced action on the tangent space of a fixed point for ϕ^m is dilation by 2^m, which yields a fixed point in the Grassmannian). A deduction in the opposite direction is also proved in [GTZ11].⁴⁰ There is also a different conjectural analogue of Manin-Mumford appearing in [YZ10], on which we do not pause here; instead, we offer the following modified version due to Zhang:

Conjecture. (Zhang) Let X, L, ϕ be as above, and let Y be a proper closed subvariety of X with a dense subset of preperiodic points. Then there are two polarized endomorphisms f and g commuting with ϕ, and a proper g-periodic closed subvariety Z such that f(Y) ⊂ Z.

This statement is easily shown to be true for abelian varieties (using Raynaud’s theorem), and conversely Raynaud’s theorem in turn would follow easily from the statement. Finally, the above counterexample in fact fits into the present conjecture.

Let us conclude this section by mentioning the following result, which may be viewed as fitting in some of the above dynamical statements (thinking of families of Lattès maps); it follows from Theorem 3.12: Let E → C be a pencil of elliptic curves over a curve C, and let σ be a nontorsion section. Let X ⊂ C² be an irreducible curve. Then, either there are only finitely many points (p, p′) ∈ X such that σ(p), σ(p′) are both torsion, or σ × σ : X → E² has an image dense in a group subscheme.

Summing up, already in the realm of the above problems there are several directions of development both of the problems and of the main method presented here. We can also inquire whether this method can be used for the problems discussed in the first chapter. However, in the next chapter we shall turn to another interesting direction, the André-Oort conjecture, where Pila has already achieved significant conclusions by a suitable variation of such method.

40 Also, one direction of the conjecture (i.e., the existence of preperiodic points with the said property on a preperiodic Y ) follows from a result of Fakhruddin, as in [GTZ11].


Notes to Chapter 3

1. Torsion values for a single point: Other arguments

We discuss again the issue raised at point no. 2 of Section 3.2. Namely, we showed therein that the (likely) values λ₀ ∈ C such that P_{λ₀} is a torsion point on the curve E_{λ₀} indeed make up an infinite set; we remarked that this assertion is a little less obvious than it may seem, and we justified it by appeal to Siegel’s theorem on S-integral points for elliptic curves over function fields (a much easier theorem than the version for number fields, albeit a nontrivial one). Because of this apparently misleading aspect of the issue, it is perhaps not out of place to insert in these notes two more methods for proving this infinitude. We note at once that all these methods work for arbitrary points of E_λ defined over an algebraic closure of ℚ(λ).


Analytical method. This has to do with the setting introduced in Section 3.3, especially equations (3.3.4), (3.3.5), and (3.3.6). Let us consider the map θ₁ : λ ↦ (x, y) as a real map from a small disk D in the domain Λ to ℝ². If we view D as a disk in ℝ², it is a C^∞-map, and, essentially by the implicit function theorem, either θ₁(D) contains an open set or there is a differentiable arc in D (an infinite set suffices) on which θ₁ is constant, equal to (x₀, y₀). In the first case we are done, because the said open set contains infinitely many rational points, which give rise, on taking inverse images under θ₁, to infinitely many torsion P_λ. In the second case, we have by definition that z(λ) − x₀f(λ) − y₀g(λ) vanishes for λ on a whole arc, whence it vanishes identically in λ, by analytic continuation. But we have already seen that this is impossible. (To prove this for the present points with constant abscissa, one can use, e.g., monodromy on going through small loops around 0 and around 1. In general, the already quoted results of [Ber89] suffice.)


Note that this actually proves that for each large enough integer n there are values λ₀ ∈ C such that P_{λ₀} has exact order n: in fact, this follows in quantitative form from the well-known easy fact that an interval [αn, βn], α < β, contains asymptotically (β − α)φ(n) integers prime to n. (Using an easy two-dimensional analogue one can also get the right lower bound ≫ n² for the number of such points.) These sharper conclusions (even the mere existence of such points) seem not to follow from the other methods outlined here.

Remark 3.4.5 This method also adapts to give a proof of the assertion made in Remark 3.4.4 that there are infinitely many complex λ₀ such that E_{λ₀} and E_{−λ₀} are isogenous. We sketch the argument: let j(E_λ) = 2⁸(λ² − λ + 1)³/(λ²(1 − λ)²) be the j-invariant of E_λ. We can write j(E_λ) = j(τ) for a certain τ = τ(λ) ∈ H, which locally is an analytic function of λ ≠ 0, 1, (1 ± √−3)/2, ∞ (or of some fractional power of λ, at the remaining points). The sought property follows, for instance, at the values of λ for which the ratio τ(λ)/τ(−λ) is a rational number. This ratio may be easily shown to be nonconstant as a function of λ. Now, the curve described by (j(E_λ), j(E_{−λ})) has genus zero, and is defined by a certain polynomial P(u, v) (of separated degrees 6). Consider the complex function of τ ∈ H given by P(j(τ), j(nτ)), where n ∈ ℕ. For large n, this is nonconstant and factors through the modular curve Y₀(n) (see next chapter for the essentials of this theory). Hence, there is some zero τ₀ ∈ H ∪ {∞}; also, this may be shown to be actually in H, from the Fourier expansion of j(τ). For large n, this yields a λ₀ ∈ C such that τ(λ₀) = τ₀, τ(−λ₀) = nτ₀. But then the above analytic function takes some rational value (i.e., n); therefore it takes infinitely many rational values (all the rational numbers in a neighborhood of n). The result may also be proved on letting n vary and showing that there are infinitely many values j(τ₀) which arise, on using that j(τ₁) = j(τ₂) if and only if τ₂ = g(τ₁) for some g ∈ SL₂(ℤ).
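The counting fact used in the analytical method above, namely that an interval [αn, βn] contains asymptotically (β − α)φ(n) integers prime to n, can be explored numerically. The sketch below is only an illustration: the names `phi`, `coprime_count`, and `compare`, and the default choice α = 1/4, β = 3/4, are assumptions of ours rather than anything taken from the text.

```python
from fractions import Fraction
from math import ceil, floor, gcd


def phi(n):
    """Euler's totient function, computed naively."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def coprime_count(n, alpha, beta):
    """Number of integers m with alpha*n <= m <= beta*n and gcd(m, n) = 1."""
    lo, hi = ceil(alpha * n), floor(beta * n)
    return sum(1 for m in range(lo, hi + 1) if gcd(m, n) == 1)


def compare(n, alpha=Fraction(1, 4), beta=Fraction(3, 4)):
    """Return (actual count, main term (beta - alpha) * phi(n))."""
    return coprime_count(n, alpha, beta), (beta - alpha) * phi(n)


if __name__ == "__main__":
    for n in (1009, 2000, 2310, 30030):
        actual, predicted = compare(n)
        print(n, actual, float(predicted))
```

For integers n with many prime factors the two quantities differ by a bounded-looking error, which is consistent with the statement being asymptotic rather than exact.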


Reduction modulo p. This method was found in conversation with P. Corvaja. Let us write (similarly to above) x(nP_λ) = A_n(λ)/B_n(λ), for coprime polynomials A_n, B_n ∈ ℤ[λ]; we have to prove that the union for varying n of the set of complex roots of B_n is infinite. Now, let us first observe that no prime p ≥ 7 may divide any B_n. For otherwise P_λ would be identically torsion on the reduction of E_λ modulo p. On the other hand, viewing this reduction as an elliptic curve over 𝔽_p(λ), we note that the field of definition of the reduction of P_λ is ramified above λ = 2, that is, at a place other than λ = 0, 1, ∞. Then, by general theory (see, e.g., [Sil92], Ch. 7), the order of P_λ would be a power of p. Then, by specialization λ = m ∈ 𝔽_p \ {0, 1}, we would find that every elliptic curve y² = x(x − 1)(x − m) has a point of order a positive power of p, defined over 𝔽_p(√(2(2 − m))); this is, however, impossible if 2(2 − m) is a quadratic residue modulo p, because the curve has points of order 2 and then the group E_m(𝔽_p) would have order ≥ 2p, whereas its order is < 2p. For p ≥ 7 we may find such an m, proving the claim.⁴¹ Now, if B_n had only finitely many irreducible factors over ℤ for varying n, then these factors evaluated at some fixed λ₀ ∈ ℤ, not a root of any factor, would have only finitely many prime factors, and by the above claim the same would hold for the values B_n(λ₀), n ∈ ℕ. But then the reduction of P_{λ₀} at any prime ℓ outside this finite set would have infinite order on the reduction of E_{λ₀}, a contradiction.
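The order comparison in the argument above (the group E_m(𝔽_p) has even order smaller than 2p when p ≥ 7, so it cannot contain a point of order p) can be checked by brute force for small primes. The following Python sketch is an illustration under our own assumptions: the helper names `curve_order`, `residue_choices`, and `check_prime` are hypothetical, and the point count uses Euler's criterion rather than any method mentioned in the text.

```python
def curve_order(p, m):
    """Number of points of y^2 = x(x-1)(x-m) over F_p (p an odd prime),
    including the point at infinity."""
    total = 1  # point at infinity
    for x in range(p):
        rhs = x * (x - 1) * (x - m) % p
        if rhs == 0:
            total += 1  # the single point (x, 0)
        elif pow(rhs, (p - 1) // 2, p) == 1:
            total += 2  # Euler's criterion: rhs is a nonzero square
    return total


def residue_choices(p):
    """Values m in F_p other than 0, 1 with 2(2 - m) a nonzero square mod p."""
    return [m for m in range(2, p)
            if pow(2 * (2 - m) % p, (p - 1) // 2, p) == 1]


def check_prime(p):
    """True if some admissible m exists and, for each of them, the group
    order is even, smaller than 2p, and hence not divisible by p."""
    choices = residue_choices(p)
    if not choices:
        return False
    for m in choices:
        order = curve_order(p, m)
        if order % 2 != 0 or order >= 2 * p or order % p == 0:
            return False
    return True


if __name__ == "__main__":
    for p in (7, 11, 13, 17, 19, 23, 29, 31):
        print(p, residue_choices(p)[:3], check_prime(p))
```

The inequality "order < 2p" is where the hypothesis p ≥ 7 enters, through the Hasse bound p + 1 + 2√p < 2p.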


2. A variation on the Manin-Mumford conjecture

The Manin-Mumford conjecture and the corresponding Raynaud’s theorems deal with torsion points on a subvariety of an abelian variety A over a number field k. Now, in place of torsion points, one could consider more generally points defined over the field k(T) generated over k by the set T = T_A of all torsion points of A. For instance, given a finite-degree (ramified) cover π : X → A of an abelian variety,⁴² one can ask which torsion points lift to points of X(k(T)). In the paper [Zan10], p. 405, the following conjecture is stated:

Conjecture. Let π : X → A be a cover defined over k and suppose that π(X(k(T))) ∩ T is Zariski-dense in A. Then there exists an isogeny of abelian varieties ρ : B → A and a birational map ψ : X → B such that π = ρ ◦ ψ.

On taking a dual isogeny, it is clear that a converse statement holds after a possible finite extension of k. Also, as remarked in [Zan10], this conjecture would directly imply the Manin-Mumford conjecture. In fact, suppose that the (smooth) curve C/k of genus g ≥ 2 contains infinitely many torsion points in a given embedding in its Jacobian A. Consider the cover π : C^g → A given by π(x₁, . . . , x_g) = x₁ + . . . + x_g; it is surjective of degree g!, and ramified along the whole “generalized diagonal” x_i = x_j for some i ≠ j. With our assumptions, C^g has the Zariski-dense set (C ∩ T)^g of points defined over k(T), which are sent to T by π. So the assumptions for the present conjecture are verified, and let us then apply its conclusion. This would provide a birational map ψ : C^g → B, which would be a morphism (by [BG06], Cor. 8.2.22), which is impossible: for instance, then π would be unramified, which is not true for g ≥ 2. It is probably possible to deduce similarly as well the full Raynaud’s theorem directly from this conjecture. As observed in [Zan10], the conjecture is relevant in studying Hilbert irreducibility on covers of abelian varieties.

There is also a known analogue in the toric context, where k(T) is replaced with the cyclotomic closure k^c of k. Such a result is stated and proved in [DZ07] (and recalled as Theorem [DZ] in [Zan10]). The proof uses among others Theorem 1.1 (i.e., the former Lang’s conjecture on torsion points on subvarieties of tori), confirming that these statements are not unrelated to the present context. This leads to strong explicit versions of Hilbert’s irreducibility theorem over k^c. For instance, one deduces that if a polynomial f(x, y) ∈ k^c[x, y] of positive degree d in y is such that f(x^d, y) is absolutely irreducible, then f(ζ, y) is irreducible in k^c[y] for all but finitely many roots of unity ζ. See [DZ07] and also [Zan10], Theorem 2.1, for a version of this result in several variables; in [Zan10] this is applied to prove by “descent” a Hilbert irreducibility result for points on a cover of 𝔾_m^n lying above a dense cyclic subgroup of 𝔾_m^n(k). The methods of [DZ07] also lead to results about the field of definition of preperiodic points of polynomial maps f : 𝔸¹ → 𝔸¹; it is proved that there may be infinitely many over k^c only if f is linearly conjugate to a cyclic or Chebyshev polynomial.

41 As pointed out by Corvaja, this can be proved also by specializing λ at supersingular values.

42 By “cover” here we mean a dominant rational map of finite degree.


Comments on the Methods

As we did at the end of the notes to Chapter 1, let us pause to summarize the main components of the methods of proof that we met in this chapter, also with the purpose of comparison with the case of 𝔾_m^n.


Methods of Chapter 1. In that case we considered conjugates and compared upper and lower bounds for their number:

(LB) The lower bounds were obtained either directly (cyclotomic degrees, Galois action) or by height estimates and comparison degree ↔ height (Dobrowolski, Amoroso-David), which in turn use techniques stemming from transcendental number theory.

(UB) The upper bounds were derived by Bézout’s theorem: the said conjugates lie in certain varieties, and one can apply Bézout to their intersection, provided there are no components of positive (= unlikely) dimension. One either obtains a contradiction or instead there are such unlikely components. To exploit this information represents what we may call the

(GP) Geometric part: If the varieties arising from conjugation have a component in common,⁴³ one draws geometrical consequences that show that X is a special variety.

Present methods. The present method contains features that are common with the previous one, but also has different principles. Again we take conjugates and consider different bounds for their number.

(LB) Lower bounds are obtained by Masser’s (or David’s) theorems [Mas84] on the degree of torsion points on abelian varieties; the corresponding techniques come from transcendence (and predate the methods alluded to above for comparison height ↔ degree). In the relative case, these methods relate heights and degrees (in a different way compared to the toric case).

(UB) The upper bounds are obtained by the methods introduced by Bombieri and Pila in [BP89], and developed by Pila [Pil04], [Pil05] and Pila-Wilkie [PW06], to estimate the distribution of rational points on (algebraic and) transcendental analytic curves and varieties. Whereas Bézout’s theorem is algebraic in nature and does not exploit the arithmetic of the points, these last methods are partly analytic in nature, and the rationality of the points, exploited through discreteness, plays a crucial role. The information that allows this principle to work in the present context comes from the transcendental uniformization which represents the abelian variety as a complex torus, and a subvariety X as a transcendental analytic subvariety of the complex torus. These last features represent the main difference between this method and the previous approaches for 𝔾_m^n. (We also note that the present method works as well in the toric case of torsion points. Pila’s Theorem 4.1 of the next chapter in fact treats simultaneously torsion points in the abelian and toric context, together with André-Oort aspects.)

As to the geometric part (GP), this usually appears to be more difficult than the corresponding step in the previous methods, probably because it now involves a transcendental context as well:

(GP) In this step one has to investigate the above defined algebraic part of the transcendental variety in question, i.e., the union of connected positive-dimensional semialgebraic sets contained in the variety. For instance, in Manin-Mumford questions one finds that the algebraic part (in the complex torus) consists of linear varieties (and similarly in Lang’s question); its image through the uniformizing map corresponds to cosets of algebraic subgroups of the abelian variety. It is in this piece of the proof that the geometric nature of the special varieties is exploited.

Finally, let us also note that the idea of taking conjugates, common to all of these approaches, is sometimes exploited with a different “weight”; for instance, Hindry’s cited argument for Manin-Mumford uses not merely the number of conjugates, i.e., the degree, but also the precise Galois action. In the abelian case, this kind of information is substantially more delicate and lies quite deep.

43 In some cases, as for Theorem 1.3, this was excluded by the very assumptions therein.

Chapter 4

About the André-Oort Conjecture

In this chapter we shall discuss some aspects of the so-called André-Oort conjecture. We shall observe how this resembles the pattern of the statements discussed so far, recalling also a brief history. That this subject fits in our main theme is confirmed also by recent proofs of J. Pila, who adopted the method explained in the last chapter to obtain new significant cases of this conjecture.

After a few general considerations about the shape of the André-Oort conjecture and its relation to the topics of the previous chapters, we shall continue by discussing in some detail what can probably be considered the most basic case of the conjecture, which is, moreover, distinctly simpler to state. It is a rather special one, but nevertheless very interesting and illustrative of the difficulty of these issues; it bears to the general case more or less the same relation that Lang’s problem on roots of unity bears to the study of torsion points on subvarieties of tori (or abelian varieties). It was predicted and eventually achieved by André around the 1990s.

After a section devoted to a review of the basic definitions and results in the context of modular curves and singular invariants, we shall present André’s original proof (also in a recently obtained effective form). Subsequently we shall also outline Pila’s proof of the same statement, which follows the method already considered in the previous chapter. Pila’s more recent results are much more general, but we hope that the consideration of the said special case will help to clarify further the underlying principles of the method and simultaneously show its capability for more general applications. It will also allow a comparison of different techniques for the same issue.

Finally, in the last section we shall discuss in a bit more detail the general definition of Shimura varieties, which appear in the very statements of André-Oort type. In the notes, we shall sketch a further argument for André’s theorem, due to Edixhoven, and we shall present some definitions from model theory, which appear in the theorems by Pila and Pila-Wilkie, in turn crucial for Pila’s proof.

4.1 Generalities about the André-Oort Conjecture

The statements in this direction concern Shimura varieties, a notion which is rather technical to state. We postpone to the last section a brief discussion of this; at this point let us merely say that a Shimura variety is roughly a quotient of a symmetric hermitian domain by an arithmetic group. Actually, Shimura varieties first arose as moduli spaces, i.e., algebraic varieties representing parameter spaces for sets of other varieties with certain structures. Fundamental examples of Shimura varieties are the modular curves Γ\H, where H is the upper half plane $\{\tau \in \mathbb{C} : \operatorname{Im}\tau > 0\}$ and Γ is a congruence subgroup of $\mathrm{SL}_2(\mathbb{Z})$ (we shall briefly recall the corresponding theory below); modular curves parametrize elliptic curves (if $\Gamma = \mathrm{SL}_2(\mathbb{Z})$), with possible “level” structure (for general congruence subgroups $\Gamma \subset \mathrm{SL}_2(\mathbb{Z})$). More generally, fundamental examples of Shimura varieties are given by the moduli spaces parametrizing principally polarized abelian varieties of given dimension, again possibly with additional (level or endomorphism) prescribed structure.¹

One reason why all of this fits into the context of the present book is that the pattern of the statements in the André-Oort context is similar to several other ones we have met so far. In order to illustrate this, let us first recall that there is a notion of Shimura subvariety of a Shimura variety; in dimension 0 we have “Shimura points,” also called “CM” (complex multiplication) points, a terminology which shall be explained later. If we interpret these subvarieties as being the special ones, we have the following table of comparison with the previous chapters:

Variety type:         Multiplicative torus    Abelian variety                   Shimura variety
Special subvariety:   Torsion coset           Torsion translate of ab. subv.    Shimura subv.
Special point:        Torsion point           Torsion point                     CM point

Then, here is a phrasing of the general statement, where the analogy with some previous basic ones is manifest:

André-Oort conjecture: The Zariski closure of a set of special points in a Shimura variety is a special (= Shimura) subvariety or, equivalently: If a subvariety of a Shimura variety has a Zariski-dense set of special points, then it is a special subvariety.

For the case when the relevant subvariety is a curve, this statement first appeared (in equivalent form) as Problem 1 at p. 215 of Y. André’s book [And89]; then it was stated independently by F. Oort at the Cortona Conference in 1994 for the case when the Shimura variety is a moduli space $A_{g,N}$ of principally polarized abelian varieties of given dimension [Oor97], with level structure. (We also note that in [And89], p. 215, André refers to a similar statement by Coleman, concerning the moduli space of curves of genus g. This predicted finiteness for isomorphism classes of projective curves of genus g having CM-Jacobian. Such a statement, however, was found to be false for g = 4, 6, 7, due to the presence of nontrivial special subvarieties of $A_{g,1}$; it remains open for g ≥ 8. See Section 4 of [Noo06], also for references.) Further, on p. 216 André already pointed out a similarity with the Manin-Mumford conjecture and Raynaud’s theorems.² See also [Pin05a] and [Pin05b] for conjectures of R. Pink putting together all of this, and also containing the Zilber conjecture. (Pink considers not merely special points in the relevant subvariety, but also the intersections, of unlikely dimension, of special subvarieties with it, in analogy with Chapter 1 and the Zilber conjecture. Some results in this direction, going beyond André-Oort, appear in a very recent paper by Habegger and Pila [HP11]; we shall give a few statements in the notes below.)

A special but basic case of the André-Oort conjecture was settled by André himself in [And98]. It concerns plane curves, and in this sense it is analogous to Lang’s original question on torsion points discussed in the first chapter, which provided much motivation for many developments. In such a case the André-Oort conjecture may be rephrased as asserting that

If a plane irreducible curve in $\mathbb{A}^2$ contains infinitely many pairs $(j_1, j_2)$ such that $j_1, j_2$ are invariants of elliptic curves with complex multiplication, then either the curve is a horizontal or vertical line, or it is a modular curve $Y_0(n)$, for some integer $n \in \mathbb{N}$.

¹ Sometimes a Shimura variety is automatically understood to fall into these types, as, for instance, in [LB92].
² As noted at the beginning of Sec. 4.5, André already saw a common structure between abelian varieties and definitions of Shimura varieties due to Mumford and Deligne.

In the next sections we shall discuss this statement in some detail, recalling the basic notions implicit in it, and presenting André’s original proof. Let us also note that the result easily implies an analogue for curves in any $\mathbb{A}^n$. Later on, in the notes, we shall also briefly recall some ideas of a proof by B. Edixhoven [Edi98], which, however, relies on an unproved assertion depending on the generalized Riemann hypothesis (GRH) for imaginary quadratic fields. A merit of this argument is that it opened the way to generalizations; for instance, Edixhoven treated arbitrary products of modular curves [Edi05], and other extensions, again conditional on GRH, were obtained by Edixhoven and Yafaev [EY03]. This approach has also been combined with equidistribution methods (by L. Clozel and E. Ullmo), and a conditional proof of the full conjecture has recently been announced by Klingler, Ullmo, and Yafaev. See also [Noo06] and [Pil11] for other references. (Edixhoven’s method also yields unconditional results, but under an additional restriction on the special points, which may often be replaced by GRH. We refer to the Bourbaki Seminar by Noot [Noo06] for a comprehensive discussion of this and other approaches, up to 2006.)

Whereas these results are conditional on cases of GRH, Pila succeeded in [Pil09b] in applying unconditionally the method we have discussed in the previous chapter (at the basis of the papers [MZ08], [PZ08], and [MZ10b]) to these problems. In place of the uniformization of an abelian variety by a complex torus, he used the uniformization of modular curves provided by the modular function. (These applications provide further evidence of the analogy of the present context with the topics so far discussed.) He first obtained in [Pil09b] some unconditional results, including André’s original one, but with an entirely different argument. In this same paper he also succeeded in combining the Manin-Mumford and André-Oort issues in special cases. Further, in the more recent paper [Pil11] he goes considerably further and obtains a combination of the André-Oort, Manin-Mumford, and Lang statements for arbitrary subvarieties of $Y_1 \times \dots \times Y_n \times E_1 \times \dots \times E_m \times \mathbb{G}_m^{\ell}$ or of $Y_1 \times \dots \times Y_n \times A$, where the $Y_i$ are modular curves $\Gamma_i \backslash H$ (with congruence subgroups $\Gamma_i \subset \mathrm{SL}_2(\mathbb{Z})$), the $E_i$ are elliptic curves and $A$ is an abelian variety, all over $\overline{\mathbb{Q}}$. Let us state his result precisely, anticipating a brief formal definition of the special varieties in question.

The irreducible special subvarieties are defined to be products of irreducible special subvarieties of, respectively, $Y_1 \times \dots \times Y_n$, $A$, $E_1 \times \dots \times E_m$, $\mathbb{G}_m^{\ell}$. We have already met the irreducible special subvarieties of the varieties of the last three types: they are the translates of group subvarieties by torsion points. As to the first type $Y_1 \times \dots \times Y_n$, note that we have a map $j \times \dots \times j : H^n \to Y_1 \times \dots \times Y_n$, where $j(\tau)$ is the modular function (see Section 4.2 for a review). An irreducible special subvariety is the image by this map of a subset of points $(\tau_1, \dots, \tau_n)$ of $H^n$ defined by finitely many relations of the shape either $\tau_i = \xi_i$ or $\tau_r = g_{rs}\tau_s$, where the $\xi_i$ are constant imaginary quadratic numbers and the $g_{rs}$ are fixed elements of $\mathrm{GL}_2(\mathbb{Q})^+$. When, for instance, $Y_i = \mathbb{A}^1$ for each $i$, we may also describe such subvarieties directly in $Y_1 \times \dots \times Y_n = \mathbb{A}^n$, as being defined by several modular relations $\Phi_{n_{ij}}(x_i, x_j) = 0$, $i \neq j$, relative to points $(x_1, \dots, x_n) \in \mathbb{A}^n$, plus some relations $x_l = \gamma_l$, where $\gamma_l$ is a constant singular modulus. In particular, the special points are now the products of CM-points (relative to $\mathbb{C}^n$) times torsion points (relative to $A$ or to $E_1 \times \dots \times E_m \times \mathbb{G}_m^{\ell}$). We also note the important fact that the special points are indeed Zariski-dense in the special subvarieties.

With these notions, Pila proves in [Pil11] the following result:

Theorem 4.1. (Pila [Pil11], Theorems 1.1 and 12.1.) Let $V$ be either of the shape $Y_1 \times \dots \times Y_n \times E_1 \times \dots \times E_m \times \mathbb{G}_m^{\ell}$ or $Y_1 \times \dots \times Y_n \times A$, where the $Y_i$ are modular curves, the $E_i$ are elliptic curves and $A$ is an abelian variety, all over $\overline{\mathbb{Q}}$. Let $X$ be a subvariety of $V$. Then $X$ contains only a finite number of maximal special subvarieties.

Equivalently: The Zariski closure in $V$ of any set of special points is a finite union of special subvarieties.

These results are quite remarkable, especially taking into account that there are only few unconditional published cases of the André-Oort conjecture. (For instance, beyond the one obtained by André, there are results by Zhang [Zha05] on quaternion Shimura varieties, obtained by equidistribution methods; also, recent results for Hilbert modular surfaces have been announced by C. Daw.) Below, after discussing André’s proof, we shall sketch Pila’s argument for the special case of André’s theorem in order to better illustrate the method and to compare it with André’s.

As already remarked several times, this method of Pila is along the same lines as what we have seen; a relevant difference (of technical nature) is that this time he relies on estimates not merely for rational points, but for algebraic points of bounded degree on transcendental varieties (he carries this out in [Pil09a] and [Pil11]). Another difference is that the transcendental variety in question comes from the modular function on the whole upper half plane, and even its restriction to a fundamental domain is no longer globally (compact) subanalytic, as was the case in the applications of the previous chapter. This requires Pila to work with varieties in a larger realm, especially in the so-called o-minimal structure $\mathbb{R}_{\mathrm{an,exp}}$, generated, so to say, by subanalytic maps and the exponential function. These categories were in fact taken into account already in the Pila-Wilkie paper [PW06], whose full force has to be used now. (Actually, as remarked above, Pila has to extend the corresponding results to points of bounded degree.) See the notes for some formal definitions on this context of o-minimal structures.

Substantial effort in Pila’s most general results goes also into characterizing the algebraic part of the relevant transcendental varieties, namely the union of connected semialgebraic positive dimensional arcs contained in them. As stressed in the “Comments on the Methods” in the notes to the previous chapter, this issue represents the geometric step of the whole procedure; it is the part taking into account the special subvarieties. This was a relatively easy matter in the paper [MZ08] but became more and more difficult in [PZ08] and [MZ10b]. In [Pil11] this step, somewhat surprisingly, involves a second use of the estimates (of Pila-Wilkie [PW06]) for the distribution of rational points that we have met in the previous chapter; see Remark 4.4.1 (ii) below for more on this issue. We also refer to Scanlon’s Séminaire Bourbaki [Sca11b] for a survey account of these methods and results, especially from the viewpoint of model theory.

4.2 Modular curves and complex multiplication

For convenience, we shall pause by devoting this section to a review of a few basics from the theory of modular curves and complex multiplication, which shall also be useful while discussing the proofs by André and Pila. Complete details may be found in [Lan73], [Shi94], [Sil92], and [Sil02].

Modular curves arise as spaces parametrizing elliptic curves with possible additional structure, as, for instance, the choice of a point of given order, or of a (cyclic) subgroup of given order, on the elliptic curve; they are the simplest type of Shimura variety of positive dimension. In turn, the simplest modular curve is the affine line $\mathbb{A}^1$; in this perspective $\mathbb{A}^1$ is viewed as parametrizing just complex elliptic curves, through the so-called j-invariant. Let us briefly recall how this is done.

A complex elliptic curve $E$ may always be defined as the projective closure in $\mathbb{P}^2$ of an affine plane curve defined by a Weierstrass equation $y^2 = x^3 + ax + b$, where $a, b \in \mathbb{C}$ are such that the cubic polynomial on the right has no multiple roots, i.e., $4a^3 + 27b^2 \neq 0$. It is nonsingular. The unique point at infinity $O$ in the closure of this curve in $\mathbb{P}^2$ is the origin for the familiar group law on $E$, obtained via the chord and tangent process: namely, if $P, Q, R \in E$ are collinear points, then $P + Q + R = O$.

Now, to such a curve is attached the complex number
$$j = j_E = 1728\,\frac{4a^3}{4a^3 + 27b^2};$$
this is called the j-invariant of $E$; in fact, it is known that this quantity is unchanged under complex isomorphism between two elliptic curves, and actually it turns out that conversely two elliptic curves are isomorphic over $\mathbb{C}$ if and only if they have equal j-invariants. This correspondence $[E] \leftrightarrow j_E \in \mathbb{C}$ (where $[E]$ here denotes the isomorphism class of $E$) is bijective; in fact, solving $1728\,\frac{4a^3}{4a^3+27b^2} = j_0$ for $a, b$ immediately shows that for each complex number $j_0$ we may write down a Weierstrass equation for an elliptic curve with invariant $j_0$. (Or else use equation (3.4.7) for $j_0 \neq 1728, 0$, whereas for $j_0 = 1728$ we may take any curve $y^2 = x^3 + ax$, $a \neq 0$, and for $j_0 = 0$ we may take any $y^2 = x^3 + b$, $b \neq 0$. Note that we may choose a relevant equation defined over $\mathbb{Q}(j_0)$.) This gives a sense to saying that $\mathbb{A}^1(\mathbb{C}) = \mathbb{C}$ parametrizes (isomorphism classes of) complex elliptic curves.
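The correspondence between Weierstrass equations and j-invariants can be illustrated by a short computation. The following is only a minimal sketch in exact rational arithmetic: the function names are hypothetical, and the normalization $a = b = t$ used to produce a curve with prescribed invariant $j_0 \neq 0, 1728$ is an assumption made here for illustration (it is one convenient choice, not necessarily equation (3.4.7) of the text). Solving $1728 \cdot 4t/(4t + 27) = j_0$ gives $t = 27 j_0 / (4(1728 - j_0))$.

```python
from fractions import Fraction


def j_invariant(a, b):
    """j-invariant of y^2 = x^3 + a*x + b, using the formula quoted in the text."""
    a, b = Fraction(a), Fraction(b)
    disc = 4 * a**3 + 27 * b**2
    if disc == 0:
        raise ValueError("singular cubic: 4a^3 + 27b^2 = 0")
    return 1728 * 4 * a**3 / disc


def curve_with_j(j0):
    """Return one pair (a, b) with j_invariant(a, b) == j0.

    The choice a = b = t for j0 not in {0, 1728} is an illustrative
    normalization (an assumption of this sketch); note a, b lie in Q(j0).
    """
    j0 = Fraction(j0)
    if j0 == 1728:
        return Fraction(1), Fraction(0)   # y^2 = x^3 + x
    if j0 == 0:
        return Fraction(0), Fraction(1)   # y^2 = x^3 + 1
    t = 27 * j0 / (4 * (1728 - j0))
    return t, t


if __name__ == "__main__":
    a, b = curve_with_j(8000)
    print(a, b, j_invariant(a, b))
```

For $j_0 = 8000$ this sketch returns $a = b = -3375/392$; after rescaling $y$ by a factor 2 one gets $y^2 = 4x^3 - \frac{3375}{98}x - \frac{3375}{98}$, which agrees with the CM example listed at the end of this section.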

This invariant arises also in another (complex analytic) way. Any complex elliptic curve $E$ has a natural complex structure (e.g., inherited from $\mathbb{P}^2(\mathbb{C})$), so it becomes a Riemann surface and is known to be analytically isomorphic to a complex torus $\mathbb{C}/L$, where $L$ is a lattice (of rank 2) in $\mathbb{C}$, determined by $[E]$ up to homothety; so $\mathbb{C}$ is identified with the universal cover of the topological space $E$. The analytic isomorphism of $\mathbb{C}/L$ with a curve $y^2 = x^3 + ax + b$ associated to $[E]$ is of course provided by the map $z \mapsto (\wp(z), \tfrac12 \wp'(z))$, where $\wp(z) = \wp_L(z)$ is the Weierstrass function associated to the lattice (it is a meromorphic function on $\mathbb{C}$ invariant by translations in the lattice and with double poles precisely at the lattice points).

Up to isomorphism given by a dilation, the lattice may be normalized and chosen of the shape $L = L_\tau := \mathbb{Z}\tau + \mathbb{Z}$, where τ lies in the upper half plane $H := \{\zeta \in \mathbb{C} : \operatorname{Im}\zeta > 0\}$. This τ is not uniquely determined by $[E]$ or $E$, because the lattice basis may be changed from $(\tau, 1)$ into $(a\tau + b, c\tau + d)$, where $a, b, c, d$ are integers with $ad - bc = 1$, i.e., the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ lies in $\mathrm{SL}_2(\mathbb{Z})$. After a dilation, the normalized lattice expression in the new basis becomes $L = \mathbb{Z} \cdot \frac{a\tau + b}{c\tau + d} + \mathbb{Z}$, i.e., τ has been replaced by a Γ-image of it, where Γ is just $\mathrm{SL}_2(\mathbb{Z})$. (We mean that Γ acts on $H$ by linear fractional transformations associated to 2 × 2 matrices in the usual way.) Every basis of the lattice arises in the above way, and thus it is the class of τ modulo this action that is naturally associated to $[E]$. In other words, the set of (isomorphism classes of) elliptic curves is in bijection with the (orbit) space $\Gamma \backslash H$; this quotient space can be given a natural topology and complex structure, under which it becomes a Riemann surface, actually isomorphic to the Riemann sphere deprived of one point (which we can think of as being the point at infinity on the imaginary axis), i.e., isomorphic to $\mathbb{C} = \mathbb{A}^1(\mathbb{C})$.

Very explicitly, one may also consider a fundamental domain in $H$ representing this quotient, usually chosen as the set

$$F = \{\tau \in H : -\tfrac12 \le \operatorname{Re}\tau < \tfrac12,\ |\tau| > 1\} \cup \{\tau \in H : |\tau| = 1,\ -\tfrac12 \le \operatorname{Re}\tau \le 0\}.$$

[…]

The modular polynomial $\Phi_n(x, y)$, for $n > 1$, is a symmetric polynomial in $\mathbb{Z}[x, y]$, irreducible over $\mathbb{C}$; the polynomial $\Phi_n$ has equal separate degrees in $x, y$, given by $\psi(n) := n \prod_{p \mid n} (1 + p^{-1})$. For instance, from André’s paper [And98] we find
$$\Phi_1(x, y) = x - y,$$
$$\Phi_2(x, y) = x^3 + y^3 - x^2 y^2 + 2^4 \cdot 3 \cdot 31\,(x^2 y + x y^2) - 2^4 \cdot 3^4 \cdot 5^3\,(x^2 + y^2) + 3^4 \cdot 5^3 \cdot 4027\,xy + 2^8 \cdot 3^7 \cdot 5^6\,(x + y) - 2^{12} \cdot 3^9 \cdot 5^9.$$
Other important modular curves come from other congruence subgroups, but we do not pause further on this here, since we shall be explicitly concerned only with $Y_0(n)$.

⁵ This function $j(\tau)$ may also be thought of as a function $j(L)$ on lattices $L \subset \mathbb{C}$, invariant under dilation.
⁶ See [Ser73], p. 90, for a list of remarkable divisibility properties of the coefficients in these expansions. See also the quoted sources and their references for a wealth of other striking formulas.
⁷ For this, let us consider, for instance, the subgroup of order $n$ generated by $\tau/n$. Then the action of Γ leaves the elliptic curve invariant and carries the subgroup transitively to any of the other such subgroups; the stabilizer of the subgroup is easily seen to be $\Gamma_0(n)$ and the above assertion follows.
⁸ A subgroup $\Gamma'$ of Γ is called a congruence subgroup if it contains the elements of Γ which reduce to the identity modulo $N$, for some integer $N > 0$.

Isogenies. Two (complex) elliptic curves $E, E'$ are said to be isogenous if there is a nonconstant rational (hence regular) map $E \to E'$ carrying the origin of $E$ into the one of $E'$; such a map is called an isogeny, and it necessarily has finite degree. If $E, E'$ correspond respectively to complex tori $\mathbb{C}/L$, $\mathbb{C}/L'$, as before, then an isogeny $\rho : E \to E'$ may be lifted to the universal covers, thus to an analytic map $\alpha : \mathbb{C} \to \mathbb{C}$. Clearly, for any point $z \in \mathbb{C}$, we must have $\alpha(z + L) \subset \alpha(z) + L'$, and it follows from discreteness that $\alpha(z + \lambda) - \alpha(z)$ is constant in $z$, for each $\lambda \in L$. Hence the derivative of α is $L$-periodic and regular, and hence constant. This means that α is in fact a multiplication $z \mapsto \alpha z$ by a complex number, also denoted α, such that $\alpha L \subset L'$. In particular, an isogeny is automatically a homomorphism for the group laws on $E, E'$ (which can also be proved algebraically), and in turn this implies that any isogeny is unramified.

If $L = \mathbb{Z}\tau + \mathbb{Z}$, $L' = \mathbb{Z}\tau' + \mathbb{Z}$ ($\tau, \tau' \in H$), then we have equations $\alpha\tau = a\tau' + b$, $\alpha = c\tau' + d$, where $a, b, c, d \in \mathbb{Z}$. Hence $\tau = \frac{a\tau' + b}{c\tau' + d}$ (and the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ must automatically have positive determinant). Conversely, if $\tau, \tau'$ are related by such a relation, the same argument shows that the corresponding elliptic curves are isogenous.

The degree of an isogeny (as a rational map between algebraic curves) is finite, since it is a nonconstant rational map. It is a fact (which may be checked from the above description) that to any isogeny $\rho : E \to E'$ one can associate a dual isogeny $\hat\rho : E' \to E$, such that $\rho \circ \hat\rho$ is multiplication by $\deg \rho$. In particular, being isogenous is an equivalence relation (which can also be deduced from the above matrix criterion). The kernel of an isogeny $\rho : E \to E'$ is a finite subgroup $C = C_\rho$ of $E$; its order is the degree of the isogeny, and $E'$ is isomorphic to the quotient curve $E/C$. If the isogeny is represented as multiplication by α as above, then the kernel is the set of $z \in \mathbb{C}/L$ such that $\alpha z \in L'$. It may be checked that the kernel has order $\det \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, and that it is cyclic if and only if the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is primitive, i.e., $\gcd(a, b, c, d) = 1$. This also shows that any isogeny may be factored as a composition (in any order) of a multiplication map $x \mapsto mx$ (for an $m \in \mathbb{Z}$) and a cyclic isogeny.
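The matrix description of isogenies just given lends itself to a small computation. The following is a minimal sketch, with a hypothetical function name, which reads off from an integer matrix of positive determinant the degree of the corresponding isogeny, whether its kernel is cyclic, and the factorization into a multiplication map and a cyclic isogeny. It uses the standard fact (not stated explicitly above) that multiplication by $m$ has degree $m^2$.

```python
from math import gcd


def isogeny_data(a, b, c, d):
    """Data of the isogeny attached to the integer matrix (a b; c d), det > 0.

    degree        : order of the kernel, equal to the determinant
    is_cyclic     : True exactly when the matrix is primitive
    m             : the content gcd(a, b, c, d); the isogeny is [m] composed
                    with the cyclic isogeny attached to the primitive part
    cyclic_degree : determinant of the primitive part
    """
    det = a * d - b * c
    if det <= 0:
        raise ValueError("the matrix must have positive determinant")
    m = gcd(gcd(a, b), gcd(c, d))
    primitive = (a // m, b // m, c // m, d // m)
    return {
        "degree": det,
        "is_cyclic": m == 1,
        "m": m,
        "primitive": primitive,
        "cyclic_degree": det // (m * m),
    }


if __name__ == "__main__":
    print(isogeny_data(2, 0, 0, 4))
```

For the matrix with rows (2, 0) and (0, 4) the sketch reports degree 8, a non-cyclic kernel, and the factorization into multiplication by 2 (degree 4) and a cyclic isogeny of degree 2.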
Since the choice of the kernel determines the isogeny up to isomorphism, we see that the isogenies with cyclic kernel (also called cyclic isogenies) of order $n$ correspond (up to isomorphism) to the cyclic subgroups of order $n$ of $E$; thus, such a subgroup corresponds to a primitive matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ of determinant $n$, considered up to $\mathrm{SL}_2(\mathbb{Z})$-multiplication (on the left or right).

In this way we also obtain another description of the curve $Y_0(n)$: we may view it as classifying triples $(E, E', \alpha)$ (up to isomorphism), where $\alpha : E \to E'$ is a cyclic isogeny of degree $n$. This description can be related with the above-mentioned modular polynomials $\Phi_n$; in fact, one obtains the remarkable property that, letting $A, B$ be elliptic curves over $\mathbb{C}$, with invariants $j_A, j_B$, there exists an isogeny $\rho : B \to A$ with cyclic kernel of order $n$ if and only if $\Phi_n(j_A, j_B) = 0$.

Note further that, by the above remarks, if $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is a primitive integer matrix of determinant $n > 0$, then the elliptic curves $E_\tau$ and $E_{\frac{a\tau + b}{c\tau + d}}$ are related by a cyclic isogeny of degree $n$; hence the pair of invariants $\bigl(j(\tau), j(\tfrac{a\tau + b}{c\tau + d})\bigr)$ lies in $Y_0(n)$, i.e., $\Phi_n\bigl(j(\tau), j(\tfrac{a\tau + b}{c\tau + d})\bigr) = 0$; this yields a parametrization (uniformization) of the modular curve $Y_0(n)$ by the modular function.⁹ Conversely, let us finally note for later reference that, for matrices $g_1, g_2 \in \mathrm{GL}_2(\mathbb{Q})^+$, the image of the map $\tau \mapsto (j(g_1\tau), j(g_2\tau))$ from $H$ to $\mathbb{A}^2$ is dense in a modular curve $Y_0(n)$ for some $n$: in fact, the image is the same as for the map $\tau \mapsto (j(\tau), j(g\tau))$, where $g = g_2 g_1^{-1}$. But we may replace $g$ by any integral multiple of it, obtaining the same map. Hence, we may assume that $g$ is a primitive integral matrix of positive determinant, and the conclusion follows.
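The criterion $\Phi_n(j_A, j_B) = 0$ for the existence of a cyclic isogeny of degree $n$ can be tested numerically for $n = 2$, using the polynomial $\Phi_2$ quoted above from [And98]. The following is only a sketch with a hypothetical function name. It relies on the values $j(\sqrt{-1}) = 1728$ and $j(\sqrt{-2}) = 8000$ recalled at the end of this section, and on the observation (an assumption of this sketch, not stated in the text) that $1 + \sqrt{-1}$ and $\sqrt{-2}$ are endomorphisms of degree 2 of the corresponding curves, so that each of these curves is 2-isogenous to itself.

```python
def phi2(x, y):
    """Modular polynomial Phi_2(x, y), with the coefficients quoted from [And98]."""
    return (
        x**3 + y**3 - x**2 * y**2
        + 2**4 * 3 * 31 * (x**2 * y + x * y**2)
        - 2**4 * 3**4 * 5**3 * (x**2 + y**2)
        + 3**4 * 5**3 * 4027 * x * y
        + 2**8 * 3**7 * 5**6 * (x + y)
        - 2**12 * 3**9 * 5**9
    )


if __name__ == "__main__":
    # Curves with an endomorphism of degree 2 give points (j, j) on Y_0(2).
    print(phi2(1728, 1728), phi2(8000, 8000))
    # Curves with different CM fields are not isogenous, so this is nonzero.
    print(phi2(1728, 8000) != 0)
```

Integer arithmetic is used throughout, so the vanishing is checked exactly rather than up to rounding.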

Complex Multiplication. ([Lan73], Part 2, Sec. 10; [Shi94], Ch. 5; [Sil02], Ch. II.) A special case of isogeny $E \to E'$ occurs when $E = E'$; in this case an isogeny is an endomorphism of $E$. This set of endomorphisms plus the zero one, denoted $\mathrm{End}(E)$, is actually a ring with respect to addition and composition. It always contains the trivial endomorphisms given by the multiplication maps $[m] : x \mapsto mx$ for $m \in \mathbb{Z}$; actually, for a “general” elliptic curve $E$, there are no nontrivial endomorphisms and thus $\mathrm{End}(E)$ is isomorphic to $\mathbb{Z}$.

However, for certain complex elliptic curves this ring may be larger; let us check when this can occur by looking again at the complex-analytic representation $E \cong \mathbb{C}/L$. By the above, each nonzero endomorphism corresponds to multiplication in $\mathbb{C}$ by a complex α such that $\alpha L \subset L$. As above, we have equations $\alpha\tau = a\tau + b$, $\alpha = c\tau + d$ for integers $a, b, c, d$, and then $\tau = \frac{a\tau + b}{c\tau + d}$. If $\alpha \notin \mathbb{Z}$, this yields a nontrivial quadratic equation for τ with integer coefficients, and then, since $\tau \in H$, τ has to be quadratic imaginary. Conversely, each such $\tau \in H$ yields an elliptic curve $\cong \mathbb{C}/(\mathbb{Z}\tau + \mathbb{Z})$ whose endomorphism ring is larger than $\mathbb{Z}$.

Note that in any case the endomorphisms are represented by multiplications by corresponding complex numbers, which justifies the terminology “complex multiplication” (abbreviated CM) for the whole situation in which there are nontrivial endomorphisms. A corresponding elliptic curve is also called a CM-curve, and the same for a corresponding τ; moreover, an imaginary quadratic field is often called a CM-field.¹⁰ Observe that CM-curves with the same CM-field are necessarily isogenous (because the corresponding τ are rationally related, i.e., through $\mathrm{GL}_2(\mathbb{Q})$).

Note that this picture also gives precision to the above vague sentence that the general elliptic curve has trivial (i.e., equal to $\mathbb{Z}$) endomorphism ring: in fact, the set of exceptional τ (quadratic imaginary numbers) is in particular denumerable, and in a sense sparse. In particular, we also obtain the remarkable fact that the value of the j-invariant at a CM complex number τ, i.e., the value $j(\tau)$ of the modular function, must be an algebraic number: for otherwise by specialization we would obtain nondenumerably many pairwise nonisomorphic CM elliptic curves. This algebraicity may also be proved in other ways, for instance, producing explicit equations, e.g., as in [Lan73], Theorem 4, p. 57, using the above modular polynomials. Looking at these equations (or with more abstract arguments) it may be further proved that actually the values $j(\tau)$ so obtained are algebraic integers.¹¹ They are also called CM-moduli, or singular moduli.

⁹ Reversing the arguments, this fact also rapidly leads to the construction of the modular polynomials $\Phi_n(x, y)$.
If a point of a modular curve (as, for instance, a value $j_0 \in \mathbb{A}^1(\mathbb{C})$) corresponds to a CM-elliptic curve, then it is called a “CM-point”; these points are the simplest type of Shimura varieties of dimension 0, and represent the special points of a modular curve. These numbers have most remarkable arithmetical properties (related to class field theory for imaginary quadratic fields), a small part of which we shall now very briefly summarize.

When $\tau \in H$ is a quadratic irrational, the endomorphism ring $\mathrm{End}(E_\tau)$ (which coincides with the set of complex numbers α such that $\alpha L_\tau \subset L_\tau$) is a certain order¹² $O_\tau$ in the ring $R_\tau$ of algebraic integers in $\mathbb{Q}(\tau)$; any order is (uniquely) of the shape $\mathbb{Z} + f R_\tau$ for an integer $f > 0$, called the conductor. The field $\mathbb{Q}(\tau)$ shall be of the form $\mathbb{Q}(\sqrt{d})$ for a unique integer $d < 0$ such that $-d$ is squarefree; the number $D = f^2 d$ is called the discriminant of τ and it is not difficult to check that it is in fact the discriminant of the minimal quadratic equation over $\mathbb{Z}$. The degree of $j(\tau)$ over $\mathbb{Q}$ is the class number of $O_\tau$. Actually, the extension $\mathbb{Q}(\tau, j(\tau))/\mathbb{Q}(\tau)$ is Galois ([Lan73], p. 123), with degree $[\mathbb{Q}(\tau, j(\tau)) : \mathbb{Q}(\tau)] = [\mathbb{Q}(j(\tau)) : \mathbb{Q}]$ ([Lan73], pp. 132, 133); it easily follows that $\mathbb{Q}(\tau, j(\tau))/\mathbb{Q}$ is Galois. The Galois group $\mathrm{Gal}(\mathbb{Q}(\tau, j(\tau))/\mathbb{Q}(\tau))$ is isomorphic to $\mathrm{Pic}(O_\tau)$, namely, the group of classes of projective modules of rank 1 (i.e., invertible ideals) over $O_\tau$. See [Lan73], Theorem 5, p. 133, for an explicit description of the Galois action of $\mathrm{Pic}(O_\tau)$ on $j(\tau)$, which is obtained through multiplication of ideals;¹³ in particular, it follows that the conjugates of $j(\tau)$ (both over $\mathbb{Q}$ and over $\mathbb{Q}(\tau)$) are precisely the values $j(\eta)$ for all η such that the order of the lattice $\mathbb{Z}\eta + \mathbb{Z}$ is again $O_\tau$. If we write $\mathbb{Q}(\tau) = \mathbb{Q}(\sqrt{d})$, where $d = d_\tau$ is a squarefree (negative) integer, and $D = f^2 d$ is the discriminant of the order, then one such η is given by $\frac{D + \sqrt{D}}{2}$.

A special case occurs when the conductor $f = 1$, i.e., $O_\tau$ is precisely the whole ring $R_\tau$. Let us recall for future reference the celebrated estimate of Siegel, who proved that, setting $h(R_\tau) = |\mathrm{Pic}(R_\tau)|$,
$$\log h(R_\tau) \sim \tfrac12 \log |d|, \qquad |d| \to \infty. \tag{4.2.3}$$
This is to date ineffective; however, for given $\epsilon > 0$ it is possible to estimate effectively from above the number of exceptions to an inequality $\log h(R_\tau) > (\tfrac12 - \epsilon) \log |d| - C(\epsilon)$ for a suitable explicit $C(\epsilon)$ (see [Nar04]). There is also an important effective lower bound tending to infinity with $d$, provided by work of D. Goldfeld and of B. Gross and D. Zagier, but this bound, although solving “Gauss’ problem,” is not enough for the present applications. (For a discussion of the class-number problem, see, for instance, [Nar04] and [Ser97].) Further, there is a simple (effective) formula relating the class number $h(O_\tau) := |\mathrm{Pic}(O_\tau)|$ with $h(R_\tau)$ (see [Lan73], Theorem 7, p. 95); it implies that
$$\log h(O_\tau) = \log h(R_\tau) + \log f + \sum_{p \mid f} \log\Bigl(1 - \Bigl(\frac{k}{p}\Bigr)\frac{1}{p}\Bigr) + O(1), \tag{4.2.4}$$
where $(\frac{k}{p})$ is the usual Legendre symbol and the $O(1)$ refers to an absolute constant (which may be taken $\le \log 6$).

¹⁰ CM fields of higher degree similarly arise from abelian varieties of higher dimension. See, for instance, [Shi94].
¹¹ See the end of the introduction for a few values of $j$ at quadratic imaginary numbers. Other values may be found in [Ser97], Appendix, and in [Sil02], especially Appendix A, Sec. 3.
¹² An order in $\mathbb{Q}(\tau)$ by definition is a set of the shape $\{\alpha \in \mathbb{Q}(\tau) : \alpha L \subset L\}$ for a lattice $L$; alternatively, it may be defined as a subring with 1 of the full ring of integers of finite index as a subgroup.
¹³ Principal ideals or modules act trivially because the function $j$, viewed as a function on lattices, is invariant by homothety.

Remark 4.2.1 An effectivity issue. Let us also mention that, given a Weierstrass equation with effectively given algebraic coefficients, defining an elliptic curve $E_0$, one may effectively establish whether it has complex multiplication or not; let us see how. In the first place, one may effectively compute the invariant $j_0$ of $E_0$ and estimate the maximum of its conjugates by an effective integer $M = M(E_0)$.¹⁴ Now, the Galois properties of the CM-invariants that we have recalled above show that if $E_0$ is a CM-curve, say with conductor $f$ and associated CM-field $\mathbb{Q}(\sqrt{D})$, where $D = f^2 d < 0$ is the discriminant of a quadratic $\tau_0$ such that $j_0 = j(\tau_0)$, then some conjugate of $j_0$ (over $\mathbb{Q}$) is of the shape $j(\tau_1)$, where $\tau_1 = \frac{D + \sqrt{D}}{2}$. Then the Fourier expansion (4.2.2) easily yields an inequality $|j(\tau_1)| > c_1 \exp(\pi\sqrt{|D|}) - c_2$, for explicit absolute constants $c_1, c_2 > 0$. (Actually, Masser has established the refined inequality $|j(\tau) - \exp(2\pi y)| \le 2079$ if $\operatorname{Im}\tau = y > 0$.) Hence we may effectively bound $|D|$ in terms of $M$. Since for every given $D$ one may effectively list the finitely many minimal equations of the corresponding CM-invariants (e.g., by the method explained in [Lan73], Ch. 3), we may just check whether $j_0$ in fact verifies any of these equations, concluding the argument. (Another, much more demanding method comes from the effective lower bounds for class numbers of imaginary quadratic fields provided by the above-mentioned work of Goldfeld and Gross-Zagier: since the class number $h(O_{\tau_0})$ is the degree of $j_0$, this yields an effective bound for $|D|$.) The same essentially elementary method yields an effective algorithm to “find” $\mathrm{End}(E_0)$.
By contrast, given any two elliptic curves $E_0, E_1$ defined over $\overline{\mathbb{Q}}$, it is a very difficult problem to describe the group $\mathrm{Hom}(E_0, E_1)$ of (algebraic) homomorphisms from $E_0$ to $E_1$, and even to decide merely whether this group contains any nonzero element. This issue has been completely solved (actually in much greater generality) by Masser and Wüstholz using deep methods from transcendence in a series of papers of which here we recall just [MW94] together with the references therein. For the specific question just mentioned, they prove that if $E, E'$ are isogenous elliptic curves over a number field of degree $d$, then there exists an isogeny $\rho : E \to E'$ with $\deg \rho \le C(d)\, h(j_E)^k$, where $C(d)$ is an effectively computable number depending only on $d$ and where $k$ is absolute. (See also Masser’s survey paper [Mas03] for something on the methods and further details.) With this in hand, it is a computational matter to find out if some isogeny exists at all: it suffices to check whether $\Phi_n(j_E, j_{E'}) = 0$ for $n$ up to the said bound.
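The first, elementary method of Remark 4.2.1 can be sketched as follows. The sketch takes Masser's inequality quoted above at face value, so that $c_1 = 1$ and $c_2 = 2079$, and deduces $|D| \le (\log(M + 2079)/\pi)^2$ from $|j(\tau_1)| \le M$. To narrow the list of candidates further it uses the fact, recalled above, that the degree of $j_0$ equals the class number of the order. The function names are hypothetical, and the class number is computed by counting reduced primitive binary quadratic forms, a classical technique that is not taken from the text. Here $D$ ranges over negative integers congruent to 0 or 1 modulo 4, which is an assumption on the normalization of the discriminant.

```python
from math import gcd, isqrt, log, pi


def class_number(D):
    """Number of reduced primitive forms a x^2 + b xy + c y^2 with b^2 - 4ac = D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant (D = 0 or 1 mod 4)")
    h = 0
    # A reduced form has |b| <= a <= c, hence 3 a^2 <= |D|.
    for a in range(1, isqrt(-D // 3) + 1):
        for b in range(-a + 1, a + 1):          # -a < b <= a
            num = b * b - D
            if num % (4 * a):
                continue
            c = num // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:
                h += 1
    return h


def discriminant_bound(M):
    """Upper bound for |D| when every conjugate of j0 has absolute value <= M."""
    return (log(M + 2079) / pi) ** 2


def cm_candidates(M, degree):
    """Negative discriminants compatible with the bound M and the degree of j0."""
    B = int(discriminant_bound(M))
    return [D for D in range(-3, -B - 1, -1)
            if D % 4 in (0, 1) and class_number(D) == degree]


if __name__ == "__main__":
    # j0 = 8000 is rational, so its degree is 1 and M = 8000.
    print(cm_candidates(8000, 1))
```

For each surviving $D$ one would still have to compare $j_0$ with the corresponding minimal equations, as the remark explains; the sketch only produces the finite list of discriminants to be examined.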

Just for the sake of illustration, to conclude this section, here are a few Weierstrass equations of elliptic curves with complex multiplication (and some nontrivial endomorphisms), corresponding to some of the CM $j$-invariants which we have listed in the introduction. (See also [Sil02], Ch. II.)

$$y^2 = 4x^3 + 4x, \qquad j = j(\sqrt{-1}) = 1728, \qquad (x, y) \mapsto (-x, \sqrt{-1}\,y).$$

$$y^2 = 4x^3 + 1, \qquad j = j\Bigl(\frac{-1 + \sqrt{-3}}{2}\Bigr) = 0, \qquad (x, y) \mapsto \Bigl(\frac{-1 + \sqrt{-3}}{2}\,x,\; y\Bigr).$$

$$y^2 = 4x^3 - \frac{3375}{98}\,x - \frac{3375}{98}, \qquad j = j(\sqrt{-2}) = 8000.$$

$$y^2 = 4x^3 - \frac{266625 + 119340\sqrt{5}}{9848 + 4420\sqrt{5}}\,x - \frac{266625 + 119340\sqrt{5}}{9848 + 4420\sqrt{5}}, \qquad j = j(\sqrt{-5}) = 632000 + 282880\sqrt{5}.$$

14 Of course, if it happens that $j_0$ is not an algebraic integer, this already rules out that $E_0$ has CM.
Let us also write down an endomorphism $\sigma$ corresponding to multiplication by $\sqrt{-2}$ on the third curve $E$, say. The cubic on the right has three roots $\alpha, \beta, \gamma$, which correspond to the three points of exact order 2 on $E$; say that $(\alpha, 0)$ is in the kernel of $\sigma$ (which determines $\alpha$),15 and set $Q(x) = (x - \beta)(x - \gamma)$. Then it can be checked that (up to a sign in the second coordinate)

$$\sigma(x, y) = \left(\alpha - \frac{Q(x)}{2(x - \alpha)},\; \frac{\sqrt{-2}}{4}\cdot\frac{Q(\frac{\alpha}{2} - x)}{(x - \alpha)^2}\, y\right).$$
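The $j$-invariants attached to the four Weierstrass equations listed above can be rechecked by exact arithmetic, using $j = 1728\, g_2^3/(g_2^3 - 27 g_3^2)$ for $y^2 = 4x^3 - g_2 x - g_3$. The sketch below is only illustrative: the small class `QSqrt5` is a hypothetical helper for numbers $a + b\sqrt{5}$, and the coefficients of the fourth curve are taken in $\mathbb{Q}(\sqrt{5})$, which is an assumption consistent with $j(\sqrt{-5})$ being real.

```python
from fractions import Fraction


class QSqrt5:
    """Exact number a + b*sqrt(5) with rational a, b."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __sub__(self, other):
        return QSqrt5(self.a - other.a, self.b - other.b)

    def __mul__(self, other):
        return QSqrt5(self.a * other.a + 5 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def __truediv__(self, other):
        norm = other.a ** 2 - 5 * other.b ** 2  # nonzero unless other == 0
        return self * QSqrt5(other.a / norm, -other.b / norm)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)


def j_invariant(g2, g3):
    """j = 1728 g2^3 / (g2^3 - 27 g3^2) for y^2 = 4x^3 - g2*x - g3."""
    g2_cubed = g2 * g2 * g2
    return QSqrt5(1728) * g2_cubed / (g2_cubed - QSqrt5(27) * g3 * g3)


c3 = QSqrt5(Fraction(3375, 98))
c4 = QSqrt5(266625, 119340) / QSqrt5(9848, 4420)

J_VALUES = [
    j_invariant(QSqrt5(-4), QSqrt5(0)),  # y^2 = 4x^3 + 4x
    j_invariant(QSqrt5(0), QSqrt5(-1)),  # y^2 = 4x^3 + 1
    j_invariant(c3, c3),                 # third curve, g2 = g3 = 3375/98
    j_invariant(c4, c4),                 # fourth curve, g2 = g3 in Q(sqrt 5)
]
```

For the last two curves $g_2 = g_3 = c$, so the formula collapses to $j = 1728\,c/(c - 27)$, which makes the values easy to confirm by hand as well.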

4.3 The theorem of André

In this section we shall discuss the above-mentioned theorem by André, which represents a basic and significant case of the André-Oort conjecture. Until Pila's recent results, it was one of the few known cases of the conjecture, most other proofs relying on unproved cases of the Riemann hypothesis.

The Shimura varieties appearing in this theorem are the plane $\mathbb{A}^2$, the modular curves in it, and the special points in them (i.e., the so-called complex-multiplication points recalled above). Note that the André-Oort conjecture for a Shimura variety of dimension 1 is a trivial statement, since there are no intermediate (irreducible) varieties between points and irreducible curves. So the interesting cases concern a Shimura variety of dimension at least 2, of which the simplest example is the affine plane, viewed as the square of the (modular) Shimura curve $\mathbb{A}^1$. As already remarked, this case is entirely analogous to Lang's original question in the context of torsion points on the square $\mathbb{G}_m^2$ of the multiplicative algebraic group $\mathbb{G}_m$. In place of torsion points we now find the special points in $\mathbb{A}^2 = (\mathbb{A}^1)^2$, which are just the pairs $(x_1, x_2)$ of special points of $\mathbb{A}^1(\mathbb{C})$, i.e., the pairs such that $x_1, x_2$ are both singular moduli. In this situation the André-Oort conjecture then considers a curve $X$ in $\mathbb{A}^2$, and predicts that if $X$ contains infinitely many special points, then $X$ must be a Shimura curve in the plane. Now, such Shimura curves turn out to be of two types (see the last section for a motivation):

(i) A first type occurs with the products $\{x_1\} \times \mathbb{A}^1$ and $\mathbb{A}^1 \times \{x_2\}$ of a special point ($x_1$ or $x_2$) times the Shimura curve $\mathbb{A}^1$, i.e., vertical or horizontal lines. It is clear that each of these curves contains infinitely many special points (since there are infinitely many CM-points in $\mathbb{A}^1(\mathbb{C})$).

(ii) A second type occurs with the modular plane curves $Y_0(n)$, which we have briefly introduced in the previous section. Note now that we may choose $x_1$ arbitrarily as a CM-invariant, and then we may determine $x_2$ as any complex solution of $\Phi_n(x_1, x_2) = 0$. Then any elliptic curves $E_1, E_2$ with invariants resp. $x_1, x_2$ shall be isogenous (by the above-recalled property of $Y_0(n)$), by means of a (cyclic) isogeny $\rho$. But now, since the elliptic curve $E_1$ has complex multiplication (say by $\alpha$), $E_2$ also has complex multiplication (by $\rho\alpha\hat{\rho}$). In this way we find an infinity of CM-pairs on $Y_0(n)$.

15 It may be shown that $\alpha$ is determined by $4(\alpha - \beta)(\alpha - \gamma) = (\beta - \gamma)^2$ and also by being the unique rational one among the three roots. We have $\alpha = -\frac{15}{7}$, $\{\beta, \gamma\} = \{\frac{15}{28}(2 \pm 3\sqrt{2})\}$.

So, indeed the Shimura curves contain infinitely many special points, and André's theorem is the converse assertion. Note that we have indeed a problem of unlikely intersections: we do not expect both coordinates of a point on a general curve to have the unlikely property of being singular moduli.

Theorem 4.2. Theorem of André [And98]. If an irreducible affine plane curve is not a horizontal or vertical line, then it contains infinitely many points $(x_1, x_2)$ such that both $x_1, x_2$ are singular moduli if and only if it is a modular curve $Y_0(n)$ for some $n$.

From a slightly different viewpoint, this result yields a general description of all the fixed algebraic relations holding between singular moduli $x_1, x_2$ for infinitely many such pairs.

Example 4.1 As an instance of the theorem, suggested by Masser, we conclude that there are only finitely many complex numbers $\lambda_0 \ne 0, \pm 1$ such that both Legendre elliptic curves $E_{\lambda_0}$ and $E_{-\lambda_0}$ have complex multiplication. By the theorem of André this finiteness follows if we prove that $E_\lambda$ is not (identically) isogenous to $E_{-\lambda}$ by means of a cyclic isogeny (over $\mathbb{Q}(\lambda)$).16 For this, first observe that the invariants of $E_\lambda$ and $E_{-\lambda}$ are resp.

$$j = \frac{2^8(\lambda^2 - \lambda + 1)^3}{\lambda^2(1 - \lambda)^2} = \frac{2^8(u - 1)^3}{u - 2}, \qquad j' = \frac{2^8(\lambda^2 + \lambda + 1)^3}{\lambda^2(1 + \lambda)^2} = \frac{2^8(u + 1)^3}{u + 2},$$

where $u = \lambda + \lambda^{-1}$. These two rational functions of $u$ satisfy some irreducible relation $P(j, j') = 0$ of separate degrees at most 3. (Actually, since $u \mapsto -u$ interchanges $j$ and $j'$, the relation is symmetric, and we may take $P(x, y) = x^3y - 2x^2y^2 + xy^3 - 2^6 \cdot 3^3(x^3 + y^3) + 2^6 \cdot 19(x^2y + xy^2) + 2^{17} \cdot 3^3(x^2 + y^2) - 2^{17} \cdot 3 \cdot 7\,xy - 2^{28} \cdot 3^2(x + y) + 2^{39}$.) If the curves were isogenous, then they would satisfy some relation $\Phi_n(j, j') = 0$. But $\Phi_n$ is irreducible, so this would force the separate degrees of $\Phi_n$ to be at most 3, which in turn would imply $n = 1, 2$.
However, by direct verification we may check that the said relation does not hold for these values of $n$ (which amounts to checking that $P$ is not a constant multiple of $\Phi_1$ or $\Phi_2$).

An essentially equivalent argument which avoids the irreducibility of $\Phi_n$ is as follows. The said isogeny would yield a cyclic kernel in $E_\lambda$; but both curves are defined over $\mathbb{Q}(\lambda)$, so the same would hold for the kernel (e.g., by Prop. 5.3, p. 113 of [Shi94]). So let us explore the possibilities for such a finite subgroup of $E_\lambda$. Let $E$ be an elliptic curve defined over $\mathbb{Q}(j)$ with invariant $j$ (this is easily written down: see equation 3.4.7); then $E_\lambda$ is isomorphic to $E$ over $\mathbb{Q}(\lambda)$, and $\mathbb{Q}(\lambda)$ is the field generated over $\mathbb{Q}(j)$ by the 2-torsion in $E$. Also, the Galois structure over $\mathbb{Q}(j)$ of the field generated by the torsion points of order $n$ on $E$ is well known to be the maximal possible one. (Namely, in the natural representation on $(\mathbb{Z}/n)^2$, the Galois group is $\mathrm{GL}_2(\mathbb{Z}/(n))$, as explained, e.g., in [Lan73], Ch. 6.) All of this again entails that the only possibilities for the order of the said kernel, i.e., the degree of the isogeny, would be 1 or 2, which can again be excluded by direct substitution in the modular equation.17

In the present case of Legendre parameters $\lambda, -\lambda$, an even simpler argument noted by Masser is to observe that $j$ is not integral over $\mathbb{C}[j']$ (because, for instance, they have poles at different $\lambda$-points), whereas the modular polynomials may be shown to be monic in each variable.

Concerning isogeny for these pairs of elliptic curves, we also note in passing that $E_\lambda, E_{-\lambda}$ are actually isogenous for an infinity of complex values $\lambda_0 \ne 0, \pm 1$ of $\lambda$, as has been proved in Remark 3.4.5; necessarily such $\lambda_0$ are algebraic numbers. Using what we have checked above, it can be proved that there are only finitely many such numbers if we impose a bound on the degree: this comes on looking at the generic and special Galois action on torsion points, as in the paper [EEHK09].
Sparseness of these values of $\lambda$ also follows from Masser's results in [Mas89b].18

Another simple example for André's theorem comes from considering the curve $x_1 + x_2 = 1$ in $\mathbb{A}^2$. This is not a modular curve $Y_0(n)$ for any $n$, hence there are only finitely many singular moduli $j$ such that $1 - j$ is also a singular modulus.
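The relation between the invariants of $E_\lambda$ and $E_{-\lambda}$ in Example 4.1, and the degree count that leaves only $n = 1, 2$, can be tested with exact rational arithmetic. In the sketch below the polynomial `P` is taken in a symmetric form (an assumption of this sketch, supported only by the numerical check that the code itself performs), `psi(n)` is the standard degree $n\prod_{p \mid n}(1 + 1/p)$ of $\Phi_n$ in each variable, and all function names are hypothetical.

```python
from fractions import Fraction


def j_legendre(lam):
    """j-invariant 2^8 (l^2 - l + 1)^3 / (l^2 (1 - l)^2) of the Legendre curve."""
    lam = Fraction(lam)
    return Fraction(2 ** 8) * (lam ** 2 - lam + 1) ** 3 / (lam ** 2 * (1 - lam) ** 2)


def P(x, y):
    """Candidate symmetric relation between j(E_lambda) and j(E_{-lambda})."""
    return (x ** 3 * y - 2 * x ** 2 * y ** 2 + x * y ** 3
            - 2 ** 6 * 3 ** 3 * (x ** 3 + y ** 3)
            + 2 ** 6 * 19 * (x ** 2 * y + x * y ** 2)
            + 2 ** 17 * 3 ** 3 * (x ** 2 + y ** 2)
            - 2 ** 17 * 3 * 7 * x * y
            - 2 ** 28 * 3 ** 2 * (x + y)
            + 2 ** 39)


def psi(n):
    """Degree of Phi_n in each variable: n * prod over primes p | n of (1 + 1/p)."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result += result // p  # multiply by (p + 1) / p
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        result += result // m
    return result


samples = [Fraction(2), Fraction(3), Fraction(1, 2), Fraction(5, 7)]
residuals = [P(j_legendre(lam), j_legendre(-lam)) for lam in samples]
small_levels = [n for n in range(1, 200) if psi(n) <= 3]
```

Since `P` has degree 3 in each variable while `psi(n)` exceeds 3 for every $n \ge 3$, only the levels 1 and 2 need to be excluded by hand, as in the text.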

16 Observe that if two elliptic curves are isogenous then they are automatically related by some cyclic isogeny.

17 These arguments generally allow one to check effectively whether two elliptic curves with transcendental invariants satisfying a given algebraic relation (over $\overline{\mathbb{Q}}$) are isogenous, because one can bound a priori the degree of a possible cyclic isogeny; the analogous issue for elliptic curves defined over $\overline{\mathbb{Q}}$ lies very much deeper; the most efficient effective criteria come from celebrated work of Masser and Wüstholz; see, e.g., [MW94] and Remark 4.2.1 above.

18 We point out, however, that recent results by Habegger imply that these numbers have unbounded height.

In both examples, André's proof does not lead to the complete list of exceptions, but after illustrating such proof we shall sketch an effective variation on it, recently observed by L. Kühne [Küh11] and independently by Bilu, Masser, and the author [BMZ11]; following this, Kühne and Masser have shown that there are no solutions (at least) for the second example.

We shall now present a version of André's argument in [And98].19 His method is only superficially linked with the ones at the basis of these notes, but it shall be interesting later to compare the methods; in fact, as noted earlier, we shall offer another proof due to Pila, relying on the principles of the previous chapter. André's proof as it stands is not effective, because it uses the above-recalled Siegel's noneffective lower bounds for class numbers;20 however, as mentioned above, it has been recently found by L. Kühne [Küh11] and then independently by Bilu, Masser, and the author [BMZ11], that it is possible to obtain a variation on it which is entirely effective, outlined at the end of this section.

André's proof. Let $X$ denote the curve in question, containing an infinity of CM-points. A main issue in the proof is to show that $X$ may be suitably parametrized by the $j$-function, and more precisely that there is a map $\tau \mapsto (j(g_1\tau), j(g_2\tau))$ of the upper half plane $H$ to the curve, where $g_1, g_2$ are suitable elements of $\mathrm{GL}_2(\mathbb{Q})^+$; as remarked above, this would imply that $X$ equals some curve $Y_0(n)$. With this in mind, it shall be useful to represent the said CM-points $(x_1, x_2) \in X$ as values $(j(\tau_1), j(\tau_2))$ of the modular function $j$ at points $(\tau_1, \tau_2) \in H^2$. Now, to compare $\tau_1$ and $\tau_2$ it is first relevant to know that the corresponding orders $O_{\tau_1}, O_{\tau_2}$ are "almost" equal; this constitutes the first part of the proof and is accomplished by class number estimates and Galois theory of CM-points, using that there is (by assumption) an equation of bounded degree relating $x_1 = j(\tau_1)$ with $x_2 = j(\tau_2)$. This already proves that $\tau_2 = g(\tau_1)$ for some $g \in \mathrm{GL}_2(\mathbb{Q})$; the point is to show that $g$ may be taken from a fixed finite set (so $g$ shall be constant on an infinite subsequence).
This needs a careful (asymptotic) comparison of $\tau_1, \tau_2$, and an idea for achieving it is to take advantage of the following: (i) a comparison between $x_1, x_2$ coming from Puiseux expansion at some limit point of $(x_1, x_2)$ on the curve $X$; (ii) the Fourier expansion (4.2.2), relating $\tau_i$ with $x_i = j(\tau_i)$. This expansion converges rather rapidly for $\tau$ with large imaginary part, so we would try to apply this when $\tau_1, \tau_2$ both have large imaginary part. Now, one can make $\tau_1$ have large imaginary part simply on replacing $x_1$ with some suitable conjugate. (In this, we can use the remarkable Galois properties of the singular moduli noted above.) However, we have not yet a priori achieved that $\tau_2$ also has large imaginary part; but one can actually prove this by invoking a rather deep result by Masser from transcendence theory: this yields a contradiction if $\operatorname{Im}\tau_1$ is large but $\operatorname{Im}\tau_2$ stays bounded. (Again, one uses a Puiseux expansion here and the fact that $x_2 = j(\tau_2)$ would have to converge to an algebraic number, at least on an infinite subsequence of the points in question.) Finally, the opening equality between the orders $O_{\tau_i}$ and the Puiseux and Fourier expansions will show indeed that $\tau_2 = g(\tau_1)$ for a $g$ in a finite subset of $\mathrm{GL}_2^+(\mathbb{Q})$, proving what is needed.

Now, let us perform this program in detail. As we have remarked above, the CM-points $(x_1, x_2)$ are defined over $\overline{\mathbb{Q}}$ and hence $X$ is also defined over $\overline{\mathbb{Q}}$. We may actually take the union of $X$ with its conjugates and assume it is defined over $\mathbb{Q}$ and irreducible over $\mathbb{Q}$. We shall tacitly let $(x_1, x_2)$ run through an infinite sequence of such points and possibly pass to suitable infinite subsequences whose properties shall be specified along the discussion. For a given such point $(x_1, x_2)$, we let $E_1, E_2$ be corresponding elliptic curves, with $\mathrm{End}(E_i) =: O_{\tau_i}$, $j(\tau_i) = x_i$. We write the discriminant of $O_{\tau_i}$ in the form $D_i = f_i^2 d_i$, where $f_1, f_2$ are the conductors and $d_1, d_2$ are negative squarefree integers. We shall assume that both $|D_1|, |D_2|$ tend to infinity along the said sequence (otherwise there are only finitely many values of $x_1$ or $x_2$ to take into account); this amounts to excluding the trivial cases of horizontal or vertical lines.

19 In a previous paper André had proved the result under a further restriction on the points or on the curve. A proof conditional to GRH had been given by Edixhoven [Edi98], on which we shall very briefly comment later.

20 By this we mean that this proof does not allow one to produce the complete list of special points on a nonspecial curve. Nevertheless, the argument leads to an effective bound for the number of such points.

As in the above summary, the first part of the proof exploits Galois theory and analytic number theory. Suppose first that $\mathbb{Q}(\sqrt{D_1}) \ne \mathbb{Q}(\sqrt{D_2})$. Then a basic step is to show that the field

$$K := \mathbb{Q}(\sqrt{D_1}, \sqrt{D_2}, x_1) \cap \mathbb{Q}(\sqrt{D_1}, \sqrt{D_2}, x_2)$$

is a composite of quadratic fields. For this, we give only a brief discussion of the argument. See also the references in [And98] or see [And01] or [Edi98] for another equivalent argument. We first observe directly from the definition that for any automorphism $\sigma$ of $\mathbb{C}$ one has $j(E)^\sigma = j(E^\sigma)$, where $\sigma$ operates on $E$ through a Weierstrass equation.

The basic issue is that $\mathrm{Gal}(\mathbb{Q}(\sqrt{D_1}, \sqrt{D_2})/\mathbb{Q})$ operates on $\mathrm{Gal}(K/\mathbb{Q}(\sqrt{D_1}, \sqrt{D_2}))$ by conjugation, and similarly $\mathrm{Gal}(\mathbb{Q}(\sqrt{D_i})/\mathbb{Q})$ operates on $\mathrm{Gal}(\mathbb{Q}(\sqrt{D_i}, x_i)/\mathbb{Q}(\sqrt{D_i}))$. In fact, note that $K/\mathbb{Q}$ and $\mathbb{Q}(\sqrt{D_i}, x_i)/\mathbb{Q}$ are both Galois; also, the subgroups $\mathrm{Gal}(K/\mathbb{Q}(\sqrt{D_1}, \sqrt{D_2}))$ and $\mathrm{Gal}(\mathbb{Q}(\sqrt{D_i}, x_i)/\mathbb{Q}(\sqrt{D_i}))$ of the respective Galois groups are both abelian, so the action is well defined.

Fix $i \in \{1, 2\}$ and let $\sigma = \sigma_i$ be the nontrivial element of $\mathrm{Gal}(\mathbb{Q}(\sqrt{D_i})/\mathbb{Q})$: this coincides with complex conjugation restricted to $\mathbb{Q}(\tau_i) = \mathbb{Q}(\sqrt{D_i})$. Also, let $g \in \mathrm{Gal}(\mathbb{Q}(\sqrt{D_i}, x_i)/\mathbb{Q}(\sqrt{D_i}))$. Let us check how $\sigma^{-1} g \sigma$ operates on $x_i = j(\tau_i)$; since the conjugacy action $g \mapsto \sigma^{-1} g \sigma$ is well defined, we can lift $\sigma$ as we wish to $\mathbb{Q}(\sqrt{D_i}, x_i)$, and for the present verification we choose to lift it as complex conjugation on the whole $\mathbb{Q}(\sqrt{D_i}, x_i)$. Then, first we have $j(\tau_i)^\sigma = j(\mathbb{Z}\tau_i^\sigma + \mathbb{Z})$. Second, $\tau_i$ shall correspond to a certain elliptic curve $E$ (i.e., $E$ is associated to the lattice $\Lambda_i := \mathbb{Z}\tau_i + \mathbb{Z}$) and a certain order $O_E = O_{\tau_i} \subset R_{\tau_i}$; then, if $g$ corresponds to an ideal (class) $I_g$ of $O_E$, we shall have21 $j(\Lambda_i^\sigma)^g = j(I_g\Lambda_i^\sigma)$. Finally, $j(\tau_i)^{\sigma^{-1} g \sigma} = j(I_g^\sigma\Lambda_i)$.

Observe now that the nontrivial element $\sigma$ acts on $\mathrm{Pic}(O_{\tau_i})$ as $-1$: in fact, for an ideal $I \subset O_{\tau_i}$, the norm $I \cdot \sigma(I)$ is an ideal of $\mathbb{Z}$, hence principal. Therefore, $j(\tau_i)^{\sigma^{-1} g \sigma} = j(I_g^{-1}\Lambda_i)$. In turn, $j(I_g^{-1}\Lambda_i) = g^{-1} j(\Lambda_i) = g^{-1} j(\tau_i)$.

Hence the said action of $\mathrm{Gal}(\mathbb{Q}(\sqrt{D_i})/\mathbb{Q})$ on $\mathrm{Gal}(\mathbb{Q}(\sqrt{D_i}, x_i)/\mathbb{Q}(\sqrt{D_i}))$, considered as $\mathrm{Pic}(O_{\tau_i})$ (read additively), is the map $-1$.

Now, the said action of $\mathrm{Gal}(\mathbb{Q}(\sqrt{D_1}, \sqrt{D_2})/\mathbb{Q})$ on $\mathrm{Gal}(K/\mathbb{Q}(\sqrt{D_1}, \sqrt{D_2}))$ factors through both quadratic fields.
Since these fields are presently supposed to be distinct, we may choose $\sigma$ operating nontrivially on $\tau_1$ and in any of the two possible ways on $\tau_2$; thus we see that the action on $K$ is represented both by the trivial element and by $-1$. This means that multiplication by 2 kills $\mathrm{Gal}(K/\mathbb{Q}(\sqrt{D_1}, \sqrt{D_2}))$, so that $K$ must then be a composite of quadratic extensions, as asserted.

Further, $\mathrm{Gal}(\mathbb{Q}(\sqrt{D_i}, x_i)/\mathbb{Q}(\sqrt{D_i}))$ is isomorphic to $\mathrm{Pic}(O_{\tau_i})$. By the discussion in [Lan73], Theorem 7, p. 95, and in [Lan94], pp. 125–127, the largest quotient of exponent 2 of this group has order dividing $h_2\, 4^{1+\nu(f_i)}$, where $h_2$ is the largest power of 2 dividing the class number $h(R_\tau)$ and where $\nu(m)$ denotes the number of distinct prime factors of $m$. On the other hand, it goes back to Gauss that $h_2$ divides $2^{1+\nu(d_i)}$. (See, e.g., [Nar04], Theorem 8.8.)

Hence, by what has just been proved, the degree $[\mathbb{Q}(\sqrt{D_1}, \sqrt{D_2}, x_i) : K]$ is at least the ratio $h(O_{\tau_i})/4^{1+\nu(f_i)+\nu(d_i)}$.22

21 We use here the quoted explicit Galois action, as explained in [Lan73] or [Sil02].

Now, by Siegel's estimate (4.2.3), by (4.2.4), and by easy elementary estimates on the divisor function (for instance, $\nu(m) = o(\log m)$ suffices), the ratio $h(O_{\tau_i})/4^{1+\nu(f_i)+\nu(d_i)}$ tends to infinity as $|D_i| \to \infty$.23

On the other hand, the definition of $K$ and the fact that $\mathbb{Q}(\sqrt{D_1}, \sqrt{D_2}, x_i)/\mathbb{Q}(\sqrt{D_1}, \sqrt{D_2})$ is Galois prove that the above degree $[\mathbb{Q}(\sqrt{D_1}, \sqrt{D_2}, x_1) : K]$ is bounded above by (and is in fact equal to) $[\mathbb{Q}(\sqrt{D_1}, \sqrt{D_2}, x_1, x_2) : \mathbb{Q}(\sqrt{D_1}, \sqrt{D_2}, x_2)]$, which in turn is bounded above by the degree of our curve $X$. This yields a contradiction for large $|D_1|$, and by symmetry also for large $|D_2|$.

Hence, disregarding a finite number of points, we may assume $\mathbb{Q}(\sqrt{D_1}) = \mathbb{Q}(\sqrt{D_2})$, and so $d_1 = d_2 = d$, say.24

Now put $f = \mathrm{lcm}(f_1, f_2)$, $D := f^2 d$. One knows that $[\mathbb{Q}(\sqrt{d}, x_1, x_2) : \mathbb{Q}(\sqrt{d})]$ is the class number of the order $O$ with conductor $f$ (this follows from the Galois action on $x_1, x_2$; see, for instance, [Lan73], Ch. 10.3). Then, similarly to the above proof, one deduces that $f/f_1$ and $f/f_2$ are bounded independently of $(x_1, x_2)$: we have an upper bound

$$h(O) = [\mathbb{Q}(\sqrt{d}, x_1, x_2) : \mathbb{Q}(\sqrt{d})] \le [\mathbb{Q}(x_1, x_2) : \mathbb{Q}(x_i)] \cdot [\mathbb{Q}(\sqrt{d}, x_i) : \mathbb{Q}(\sqrt{d})] \le \deg X \cdot [\mathbb{Q}(\sqrt{d}, x_i) : \mathbb{Q}(\sqrt{d})] = \deg X \cdot h(O_{\tau_i});$$

if we compare this with (4.2.4), used for $f$ on the left and $f_i$ on the right, the claim follows, after an easy treatment of the corresponding summations on $p \mid f$ and $p \mid f_i$ (whose difference may be shown to be $O(\log\log(f/f_i))$). (For these deductions see also [Edi98], Sec. 3.) This concludes the first part of the proof.
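The growth of the ratio $h(O_{\tau_i})/4^{1+\nu(f_i)+\nu(d_i)}$ can be explored on small examples. The sketch below computes class numbers by counting reduced primitive binary quadratic forms of discriminant $D < 0$, a standard elementary method that is not the one used in the text; the names `class_number`, `nu` and `andre_ratio` are hypothetical, and `andre_ratio(f, d)` assumes that $d$ is a fundamental discriminant and $f$ a conductor.

```python
from math import gcd


def class_number(D):
    """Number of reduced primitive forms a x^2 + b xy + c y^2 with b^2 - 4ac = D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant")
    h, a = 0, 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):       # reduced: -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (a == c and b < 0):  # reduced: a <= c, and b >= 0 if a == c
                continue
            if gcd(gcd(a, b), c) == 1:       # primitive forms only
                h += 1
        a += 1
    return h


def nu(m):
    """Number of distinct prime factors of |m|."""
    m, count, p = abs(m), 0, 2
    while p * p <= m:
        if m % p == 0:
            count += 1
            while m % p == 0:
                m //= p
        p += 1
    return count + (1 if m > 1 else 0)


def andre_ratio(f, d):
    """The ratio h(O) / 4^(1 + nu(f) + nu(d)) for the order of discriminant f^2 d."""
    return class_number(f * f * d) / 4 ** (1 + nu(f) + nu(d))


for d in (-23, -47, -71, -163):
    print(d, class_number(d), andre_ratio(1, d))
```

For small discriminants the ratio is below 1; the statement in the text is asymptotic, and Siegel's estimate gives no explicit threshold beyond which the ratio exceeds $\deg X$.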

Now comes the second part. We use that $X$ is defined over $\mathbb{Q}$ to replace $(x_1, x_2)$ by suitable conjugates. The conjugates of $x_1$ run through the values $j(\tau)$ of the elliptic modular function, corresponding to lattices $\mathbb{Z}\tau + \mathbb{Z}$, all of whose stabilizers coincide with the order $O_{\tau_1}$; the lattices correspond to ideals of the order, covering all ideal classes; so in particular some conjugate shall correspond to the full order $O_{\tau_1}$, and we may thus assume that $x_1 = j\bigl(\frac{D_1 + \sqrt{D_1}}{2}\bigr)$ (i.e., that $E_1 = \mathbb{C}/O_{\tau_1}$). Then the Fourier expansion $q^{-1} + 744 + 196884 \cdot q + \ldots$, $q = \exp(2\pi i\tau)$, for $j(\tau)$ shows that

$$\log|x_1| \sim \pi\sqrt{|D_1|}, \tag{4.3.5}$$

so in particular $|x_1| \to \infty$. André now has a crucial lemma:

Lemma 4.3. For $(x_1, x_2)$ running in a subsequence as above, we have that $|x_2| \to \infty$.

Proof.25 To prove this, we assume the contrary, so we may extract an infinite subsequence of points such that $x_2 \to l$, where $(\infty, l)$ is a point at infinity on $X$ (for the first projection) and $l \in \mathbb{C}$. We may also assume $x_2 \ne l$. By taking Puiseux expansions for $x_2$ in terms of $x_1$ around such point we deduce from (4.3.5) (on going to a further infinite subsequence) that for some positive constant $\alpha \in \mathbb{Q}$,

$$\log|x_2 - l| \sim -\alpha\log|x_1| \sim -\alpha\pi\sqrt{|D_1|},$$

so $x_2$ approaches $l$ very quickly. For each special point $(x_1, x_2)$ in our subsequence, we may now pick a complex $\tau_2$ in the usual modular fundamental domain $F$ (recalled in (4.2.1)), so that $j(\tau_2) = x_2$; note that $\tau_2$ shall be imaginary quadratic over $\mathbb{Q}$. Since the restriction of $j$ to $F$ is a bijection, we may also suppose that $\tau_2$ converges to some $\zeta \in \mathbb{C}$, with $j(\zeta) = l$. Then, expanding the $j$ function as a Taylor series around $\zeta$, we get

$$\log|x_2 - l| \sim \kappa\log|\tau_2 - \zeta|,$$

where $\kappa \in \{1, 2, 3\}$ (depending on $\zeta$, i.e., whether $l = j(\zeta)$ equals a regular value of $j$, or whether $l$ is in the set $\{0, 1728\}$ of the critical values of $j$). Hence, for some constant $c_1 > 0$,

$$\log|\zeta - \tau_2| \sim -c_1\sqrt{|D_1|}.$$

Now, clearly $j(\zeta) = l$ is algebraic, because it is the value of the second coordinate function at a pole of the first coordinate, and $X$ is defined over $\mathbb{Q}$; however, with this assumption Masser has established in [Mas75], I, 1.1, that for all algebraic $w \ne \zeta$ we have an inequality

$$\log|\zeta - w| > -c_2\max(1, h(w))^{3+\epsilon}, \tag{4.3.6}$$

where the positive constant $c_2$ depends only on $\zeta$, $[\mathbb{Q}(w) : \mathbb{Q}]$, and $\epsilon > 0$ (and where $h(w)$ denotes as usual the logarithmic Weil height). But, putting $w = \tau_2$ and recalling that $\tau_2$ has been chosen in the standard fundamental domain and that it is quadratic in $\mathbb{Q}(\sqrt{d})$, we easily find (on looking at a minimal equation for $\tau_2$ over $\mathbb{Z}$) that $h(\tau_2) = O(\log|D_2|)$. Recalling also $|D_2| \ll |D_1|$ from the first part of the proof, we see that the last two displayed formulas are eventually inconsistent, proving that in fact $(\infty, \infty)$ is the only possible limit point, and the conclusion of the lemma. (See Appendix E for a proof by Masser of the lower bound (4.3.6) with some exponent $k$ in place of $3 + \epsilon$, which is, however, sufficient to derive the same contradiction.)

22 These results about the largest quotient of exponent 2 in the class group are implicit in Gauss's theory of genera. André's paper [And98] refers to the treatment in [Nar04], Theorem 8.8. However, it seems to us that this covers only the cases of trivial conductor. Through the arguments which follow, this would yield merely the boundedness of those $d_1, d_2$ for which $d_1 \ne d_2$, which nevertheless is seemingly still sufficient for the rest of the proof to work.

23 This kind of estimate had been explicitly carried out by Chowla, as quoted in [And98].

24 In fact, it may be easily checked that there are only finitely many quadratic $\tau$ of discriminant $D$ in the fundamental domain $F$ (i.e., up to $\mathrm{SL}_2(\mathbb{Z})$ action). See, for instance, the discussion in the next section.

25 This is the most technical ingredient of the proof, depending on a deep transcendence estimate of Masser, dealt with here in simplified form in Appendix E. In a previous paper André did not have this lemma and proved his theorem under an additional assumption on $X$ that guaranteed the conclusion of the lemma.

Hence, in virtue of the above, we may go to a suitable infinite subsequence of the relevant points, to assume that (i) $(x_1, x_2)$ converges to some point $z_\infty$ in a complete smooth model of a component of $X$, $z_\infty$ being a pole of both coordinate functions, and that (ii) $f_2/f_1$ is a constant $\rho$ on the subsequence, so that $D_1 = f_1^2 d$, $D_2 = \rho^2 f_1^2 d = \rho^2 D_1$.

Let us now choose $\tau_1, \tau_2$ in the usual fundamental domain $F$, so that $x_i = j(\tau_i)$. In view of our opening normalization on the conjugate of $x_1$, we may write

$$\tau_1 = \frac{c + \sqrt{D_1}}{2}, \qquad \tau_2 = \frac{b + \rho\sqrt{D_1}}{2a}, \tag{4.3.7}$$

where $a, b, c$ are integers with $a \ge |b|$, $c = 0, 1$. Now, Puiseux expansion for $x_2$ in terms of $x_1$ at the limit point $z_\infty$ at infinity shows that $\log|x_2| \sim \beta\log|x_1|$ for some rational $\beta$, which must be strictly positive (in virtue of the lemma). On the other hand, Fourier expansion of the function $j(z)$ yields

$$\log|x_1| \sim \pi\sqrt{|D_1|}, \qquad \log|x_2| \sim \pi\rho\,\frac{\sqrt{|D_1|}}{a}.$$

Therefore, $\rho/a$ converges to $\beta > 0$, and since $\rho$ is constant and $a$ is an integer, $a$ must also be eventually constant on the sequence. Hence, on going to a further subsequence we may also assume that the integer $b$ takes a constant value.

We conclude that for suitable fixed coprime integers $r, s, t \in \mathbb{N}$, the curve $X$ contains infinitely many points of the shape

$$\Bigl(j(\tau),\; j\Bigl(\frac{r + s\tau}{t}\Bigr)\Bigr).$$

But these points lie also on $Y_0(st)$, and the sought conclusion follows.

Remark 4.3.1 As remarked in André's paper [And98] (see the Variante at the end), the result immediately extends to irreducible curves $X$ inside a product of modular curves $Z_1 \times Z_2$; the conclusion is that $X$ contains infinitely many special points if and only if $X$ is either a fiber of one of the two projections or a component of a Hecke correspondence. (This is defined as the image in $Z_1 \times Z_2$ of a locus $\{(\tau, g\tau) : \tau \in H\}$, where $g \in \mathrm{GL}_2(\mathbb{Q})^+$, when we think of $Z_1, Z_2$ as quotients of $H$.) To obtain this one may just apply André's theorem to the image of $X$ in $\mathbb{A}^2$ under the natural product map $Z_1 \times Z_2 \to \mathbb{A}^2$. (See also [Pil09b], 4.2.2, for a direct argument.)

André's method actually proves the conjecture for all curves in $\mathbb{A}^n$ (or in a product of any number of modular curves); this may also be derived by projection from the case $n = 2$. The general result may be stated as follows: If $X \subset \mathbb{A}^n$ is an irreducible curve containing infinitely many special points, then either some coordinate function is constant on $X$ or $X$ is a fibered product of modular curves through coordinate functions. That is, each pair of nonconstant coordinates are related by a modular relation of the type we have seen.

4.3.1 An effective variation

As promised above, we now consider the effective approach recently pointed out by L. Kühne and then independently by Bilu, Masser, and the author.26 Below we summarize the principles of this variation in a very brief sketch, keeping the notation and the pattern of the previous proof. We shall assume throughout that $|D_1| \ge |D_2|$ and shall denote by $C_0, C_1, C_2, \ldots$ suitable positive effective numbers (which may depend on an explicit presentation for the curve $X$).

The opening ingenious (ineffective) argument by André, eventually using class-number estimates, shall not appear in this approach, whereas the middle part shall remain essentially the same. Instead, the final asymptotic formulas related to Puiseux series shall be refined to expansions with error terms; this shall lead to a small linear form in logarithms, to which celebrated effective results by A. Baker can be applied.

As before, we may assume that $X$ is defined over $\mathbb{Q}$, and by conjugation that $x_1 = j\bigl(\frac{D_1 + \sqrt{D_1}}{2}\bigr)$. We skip what we have called above the first part; instead, the second part of the proof proceeds in the same way, except that the asymptotic formulas that appear have to be replaced by explicit inequalities, which is harmless. Here we recall that Masser's lower bound (4.3.6) is effective (see Appendix E for a proof by Masser of a weaker but still sufficient bound). The output of this is that if $X$ is not a horizontal line (so $x_2$ is not constant), for large enough $C_1$ and $|D_1| > C_1$, we have that $x_2$ is given by a convergent Puiseux expansion at infinity of the shape $x_2 = P(x_1)$, where $P(t) = \gamma_0 t^\beta + {}$lower-order terms, where $\gamma_0 \ne 0$ is algebraic and $\beta > 0$ is a rational, both from a finite set which may be computed.

As in André's proof, let us choose $\tau_1, \tau_2$ in the usual fundamental domain, so that $x_i = j(\tau_i)$, $i = 1, 2$. In view of the above normalization on $x_1$, we have (as in (4.3.7) above) $\tau_1 = \frac{c + \sqrt{D_1}}{2}$, $\tau_2 = \frac{b + \sqrt{D_2}}{2a}$, with integers $a, b, c$ such that $a \ge |b|$, $c = 0, 1$.
The said Puiseux expansion $x_2 = P(x_1)$ now shows that we have an inequality,

$$\bigl|\log|x_2| - \log|\gamma_0| - \beta\log|x_1|\bigr| \le C_2\exp\bigl(-C_3\sqrt{|D_1|}\bigr).$$

26 Kühne's paper contains certain explicit estimates not carried out in [BMZ11].

But the Fourier expansion (4.2.2) shows that if $\operatorname{Im}\tau = y$, then $\bigl|\log|j(\tau)| - 2\pi y\bigr| \le C_0\exp(-2\pi y)$, so this yields

$$\left|\pi\frac{\sqrt{|D_2|}}{a} - \log|\gamma_0| - \beta\pi\sqrt{|D_1|}\right| \le C_4\exp\left(-C_5\frac{\sqrt{|D_2|}}{a}\right).$$

It readily follows that for large enough $C_1$ and $|D_1| > C_1$, we have $\sqrt{|D_2|} \ge a\beta\sqrt{|D_1|}/2$, whence in particular $a \le 2/\beta$. The left side of the last displayed inequality may be written as $|\Lambda|$ for

$$\Lambda = \Bigl(\frac{\sqrt{D_2}}{a} - \beta\sqrt{D_1}\Bigr)\pi i - \log|\gamma_0| = \Bigl(\frac{\sqrt{D_2}}{a} - \beta\sqrt{D_1}\Bigr)\log(-1) - \log|\gamma_0|;$$

then standard results on linear forms in logarithms27 show that for large $C_1$ we must have $\Lambda = 0$. In turn, this forces $\log|\gamma_0| = 0$ (again by transcendence results, e.g., of Gelfond) and then $D_2 = a^2\beta^2 D_1$.

Further, since $|a|, |b|, |c|$ are bounded by $C_6$, there are integers $r, s, t$ also bounded in absolute value by $C_8$ and such that $t \ne 0$, $\tau_2 = \frac{r + s\tau_1}{t}$. Thus, all the relevant points $(x_1, x_2)$ either satisfy $|D_1|, |D_2| \le C_1$ or lie in the union of finitely many modular curves that can be written down. This concludes the argument.

Kühne and independently Masser have carried out explicitly the case of the curve $X_1 + X_2 = 1$. It turns out that the special feature of the equation allows this proof to go through even without appealing to the difficult transcendence results appearing above. Also, the relevant estimates are sufficiently realistic to allow one to check that there are no solutions at all. Namely: There are no pairs of CM-invariants $j_1, j_2$ such that $j_1 + j_2 = 1$.
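The closing statement about $j_1 + j_2 = 1$ can at least be sanity-checked on the thirteen rational singular moduli, i.e., those of the orders of class number one. The sketch below does only that and is far from the theorem of Kühne and Masser, which covers all CM-invariants; the table of values is the classical one, keyed here by discriminant, and the function name is hypothetical.

```python
# j-invariants of the thirteen imaginary quadratic orders of class number one,
# keyed by discriminant.
RATIONAL_SINGULAR_MODULI = {
    -3: 0,
    -4: 1728,
    -7: -3375,
    -8: 8000,
    -11: -32768,
    -12: 54000,
    -16: 287496,
    -19: -884736,
    -27: -12288000,
    -28: 16581375,
    -43: -884736000,
    -67: -147197952000,
    -163: -262537412640768000,
}


def pairs_on_line(moduli, total=1):
    """All pairs (D1, D2) with D1 >= D2 whose j-values add up to `total`."""
    by_value = {j: D for D, j in moduli.items()}
    return sorted((D1, by_value[total - j1])
                  for D1, j1 in moduli.items()
                  if total - j1 in by_value and D1 >= by_value[total - j1])


print(pairs_on_line(RATIONAL_SINGULAR_MODULI))            # x1 + x2 = 1
print(pairs_on_line(RATIONAL_SINGULAR_MODULI, total=1728))
```

The second call uses a different line, $x_1 + x_2 = 1728$, which does pass through the rational CM-pair $(0, 1728)$, so the search itself is not vacuous.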

4.4 Pila's proof of André's theorem

We now present the promised proof of André's theorem due to Pila by the same strategy we used in the previous chapter for the Manin-Mumford statement and for the questions of Masser. (See also the account in Scanlon's Séminaire Bourbaki [Sca11b].) This method compared upper and lower estimates for the number of conjugates of an unlikely intersection. The upper estimates in turn relied on Pila's (and Wilkie's) estimates for the distribution of rational points on transcendental varieties. Now, for an application to André-Oort, Pila had to work out suitable generalizations, which we shall describe in the course of the exposition. As noted above, this method allowed Pila to prove the case of the André-Oort conjecture appearing as Theorem 4.1 in Section 4.1, substantially more general than André's theorem. The corresponding proofs are rather more involved; however, several principles already appear in the special case discussed here.

A real surface and its algebraic part. One again starts by considering a uniformizing transcendental map $\pi$ to the relevant variety. For Theorem 3.3 (resp. for Manin-Mumford), this was the elliptic (resp. abelian) exponential; here it is obtained through the $j$-map:

$$\pi : H^2 \to \mathbb{C}^2, \qquad \pi(\tau_1, \tau_2) := (j(\tau_1), j(\tau_2))$$

(where $H$ is as before the upper half plane). Correspondingly, we consider the inverse image $Z := \pi^{-1}(X)$ of our curve; this is a two-real-dimensional subset of $H^2$. We may think of this last space as an open subset of $\mathbb{R}^4$, by means of real coordinates $(u, v, x, y)$, where $\tau_1 = u + iv$, $\tau_2 = x + iy$; $Z$ becomes a real surface, in the coordinates $u, v, x, y$. We shall actually consider the intersection $\mathcal{Z} = Z \cap F^2$ of $Z$ with the square $F^2$ of the (closure of the) usual fundamental domain $F$ in $H$ for $\mathrm{SL}_2(\mathbb{Z})$.
(Recall that $F = \{z \in H : -1/2 \le{}$ […] 1 there would be some finite branch point29 $x_0 \in \mathbb{C}$; but the above automorphism would imply that $x_0 + l$ is also a branch point, and iterating this we would find infinitely many branch points. Hence we may infer that $P$ must have degree 1 in $y$, or equivalently $\alpha$ is a (nonconstant) rational function, satisfying $\alpha(x + l) = \gamma(\alpha(x))$.

Let now $u_0 \in \mathbb{P}^1$ be a fixed point of $\gamma$, and let $x_0 \in \mathbb{P}^1$ be such that $\alpha(x_0) = u_0$. If $x_0 \in \mathbb{C}$, we find that $\alpha(x_0 + nl) = u_0$ for each integer $n$, a contradiction, since $\alpha$ is nonconstant. Therefore, $x_0 = \infty$ is the only possibility. In particular, $\gamma$ has a single fixed point, necessarily rational (because $\gamma \in \Gamma$), and by a conjugation in $\mathrm{PGL}_2(\mathbb{Q})$ we may assume it is $\infty$: namely, we replace $\alpha$ by $\tilde{\alpha} := \beta^{-1} \circ \alpha$ and $\gamma$ by $\tilde{\gamma} := \beta^{-1} \circ \gamma \circ \beta$ for a suitable $\beta \in \mathrm{PGL}_2(\mathbb{Q})$, to assume that $\tilde{\gamma}(u) = u + a$ for rational $a$. Then $\tilde{\alpha}(x + l) = \tilde{\alpha}(x) + a$, which easily implies $\tilde{\alpha}(x) = bx + c$ for $b = a/l \in \mathbb{Q}$ and a $c \in \mathbb{C}$. Taking into account the conjugation, we find that $\alpha = \beta \circ (bx + c)$.

Now, since $j(\tau), j(\alpha\tau)$ are algebraically dependent, also $j(\tau), j(\alpha\delta\tau)$ are algebraically dependent for each $\delta \in \mathrm{PGL}_2(\mathbb{Q})$ (because $j(\tau), j(\delta\tau)$ are algebraically dependent). Applying the above argument with $\alpha \circ \delta$ in place of $\alpha$, we then infer that $\alpha \circ \delta = \eta \circ (b'x + c')$, i.e., $\beta \circ (bx + c) \circ \delta = \eta \circ (b'x + c')$, where $\eta \in \mathrm{PGL}_2(\mathbb{Q})$, $b' \in \mathbb{Q}$, $c' \in \mathbb{C}$. Choosing, for instance, $\delta = 1/x$, we find $(b + cx)/x = (\beta^{-1}\eta)(b'x + c')$. Looking at the denominators of both sides, we obtain that $c' \in \mathbb{Q}$, whence $c \in \mathbb{Q}$ and $\alpha \in \mathrm{PGL}_2(\mathbb{Q})$. This proves that the locus $(\tau, \alpha(\tau))$ is sent to a modular curve by $(j, j)$; this is, however, against the assumptions, concluding the proof.

28 Our argument for this point differs from Pila's, who considers only the multivalued function $\alpha$ rather than the polynomial $P$, and in practice uses simultaneously several branches.

29 By "branch point" here we mean a critical value of the map $x$ on a smooth model of the curve $P(x, y) = 0$.

Q

Q

Q

Q

Q

Q

Q

Q

Q
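The step in the proof above asserting that α̃(x + l) = α̃(x) + a "easily implies" that α̃ is linear can be spelled out in a few lines. What follows is only a sketch of one standard route (through the derivative), not necessarily the argument the author has in mind; it assumes l ≠ 0 and uses that α̃ is a rational function.

```latex
% Assumption: \tilde\alpha is a rational function with
% \tilde\alpha(x+l) = \tilde\alpha(x) + a and l \neq 0.
\begin{align*}
\tilde\alpha'(x+l) &= \tilde\alpha'(x)
  && \text{(differentiate the functional equation)},\\
\tilde\alpha'(x) &\equiv b
  && \text{(a nonconstant rational function takes each value only finitely often,}\\
  &&& \text{\ so a periodic rational function is constant)},\\
\tilde\alpha(x) &= bx + c, \qquad c \in \mathbb{C},\\
a &= \tilde\alpha(x+l) - \tilde\alpha(x) = bl
  && \Longrightarrow\quad b = a/l .
\end{align*}
```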

Remark 4.4.1 (i) Note that the said automorphism (x, y) ↦ (x + l, γ(y)) of the curve defined by P(x, y) = 0 has infinite order and stabilizes ℂ(x); this fact by itself says that the curve has genus 0 (look, e.g., at ramification). Also, the automorphism produces by iteration a very dense set of integral points on the curve (relative to some number field and the x-coordinate). Then we may apply the estimates for integral points on algebraic curves coming from [BP89] (or else Siegel’s theorem) to derive the sought consequences (i.e., that α is a rational function and in fact in PGL₂(ℚ)). Of course, such tools lie much deeper than the above direct elementary method; however, this shows that diophantine arguments may be relevant here.

(ii) In fact, in the most recent applications of the method to the André-Oort conjecture ([Pil11]), Pila uses for a second time the diophantine results of Pila-Wilkie ([PW06]) for the corresponding step of describing the algebraic part Z^alg of the relevant variety Z. We only roughly outline the idea. (See [Pil11], 6.8, for a more complete sketch, and Secs. 6, 7, and 8 for the detailed proof.) Let Y be a maximal complex algebraic component of Z. Now, Z is Γ-invariant, hence gY ⊂ Z for all g ∈ Γ, whence gY ∩ F ⊂ Z.³⁰ Now, although gY ∩ F shall be empty for many g, it turns out from independent arguments that for ≫ T^δ elements g ∈ Γ of height ≤ T, the dimension of gY ∩ F is equal to that of Y, where δ > 0. (Suitable elements g may be constructed as translations in Γ carrying into F pieces of Y that lie outside F.) But these g may be viewed as rational points of a certain definable variety V (see the notes for some definitions), and this is a set to which the results of [PW06] may be applied. Since δ > 0, this yields that V contains some positive dimensional semialgebraic set V_a; hence, for g ∈ V_a, we have that gY ∩ F is contained in Z and has the same dimension as Y; by maximality of Y, this in turn implies that Y is invariant by translations in V_a, which finally forces (through further arguments) Y to be of the sought shape. In the very special case considered above, the Pila-Wilkie estimates are not needed and this strategy is invisible. However, note that Y appears as the curve P = 0, which indeed admits automorphisms (x, y) ↦ (x + l, γ(y)); actually, by iteration of one of them, we see that there is a continuous family of them (which would correspond to elements of the said set V_a), proving that Y is an image of 𝔾_a.

Upper estimates for algebraic points of degree ≤ 2 in Z. Let us now consider upper estimates for the distribution of the points π⁻¹(x₁, x₂), where (x₁, x₂) is a special point on X. This means that if we write (x₁, x₂) = π(τ₁, τ₂) := (j(τ₁), j(τ₂)), then τ₁, τ₂ are imaginary quadratic. In the above-defined real coordinates, π⁻¹(x₁, x₂) yields a point (u, v, x, y) ∈ ℝ⁴ for which τ₁ = u + iv and τ₂ = x + iy are imaginary quadratic, so all coordinates (u, v, x, y) are also at most quadratic over ℚ.

30 In the general situation dealt with by Pila, there appear indeed a discrete group Γ and a fundamental domain F analogous to the present ones.

In the previous chapter, we have quoted and used Pila’s estimates in [Pil04] for the number of rational points with denominator dividing a given integer N and lying on a given subanalytic surface. In Remark 3.3.2 above, we have also pointed out that such results have been extended to rational points with denominator at most N (see [Pil05]) and further in higher dimensions in [PW06]. This would still not be enough for the points we are interested in now, which can be irrational. But Pila has extended in [Pil09a] such an estimate to points whose degree over ℚ is bounded by a fixed integer k. He bounds the number of such points with Height at most T: here by “Height” we mean the exponential Weil height, or, equivalently for the present purpose, the maximum exponential Weil height of the coordinates. (Other natural heights would lead to similar bounds.)

Apart from this extension from rational points to points of bounded degree, there is, however, another important difference between the previous context and the present one. Namely, for the present application we are concerned with the function j(τ) on F; the point is that its graph is not globally subanalytic, because of the essential singularity at τ = ∞. (On the contrary, the corresponding varieties appearing in the proofs of the previous chapter were compact subanalytic.) However, the estimates of Pila-Wilkie, as well as Pila’s most recent ones, concern not merely globally subanalytic sets but more generally subsets Y ⊂ ℝ^h with the property of being definable in an o-minimal structure over ℝ. We postpone to the notes below a small discussion of these concepts, a few formal definitions, and the indication of suitable references. At this point we only say that a “structure” is a collection of subsets of the spaces ℝⁿ, n = 1, 2, . . ., satisfying certain axiomatic set-theoretical properties. If a certain fundamental finiteness property is satisfied as well, the structure is said to be “o-minimal.” The sets in such a collection are said to be definable in the structure in question. Fundamental examples of o-minimal structures (over ℝ) are (i) the collection of semialgebraic sets in some ℝⁿ; (ii) the collection ℝ_an of globally subanalytic sets (i.e., those which are subanalytic when considered as subsets of P_n(ℝ)); (iii) the structure ℝ_exp, which is the smallest structure containing semialgebraic sets and the graph of exp(x); and (iv) the structure ℝ_{an,exp}, which is the smallest structure containing the last two, and is the most relevant one for the present applications. To prove that these collections are actually o-minimal is rather delicate.

Let us then state one of Pila’s estimates. As before, for a set Y ⊂ ℝⁿ, we let Y^alg denote the union of connected positive-dimensional semialgebraic sets contained in Y. Relying also on the results of [PW06], Pila proves the following:

Theorem 4.5. ([Pil09b], Thm. 1.4, [Pil09a], Thm. 1.5) Let Y be definable in an o-minimal structure over ℝ, let k ≥ 1, and ε > 0. Also, let N_k(T) be the number of points of degree ≤ k over ℚ, and Height ≤ T, contained in Y \ Y^alg. Then there exists c = c(Y, k, ε) such that for T ≥ 1,

N_k(T) ≤ c T^ε.

Remark 4.4.2 From rational points to points of bounded degree. Let us say just a few words on how Pila deals in [Pil09a] with points of bounded degree in place of rational points. An algebraic number α of degree ≤ k satisfies an equation a₀ + a₁α + . . . + a_k α^k = 0 with integers a_i not all zero. The first device of Pila is to associate to each such α the integer vector (a₀, . . . , a_k), and similarly for an algebraic point of degree ≤ k on X. In this way we are reduced to estimating the number of rational vectors of height bounded by a parameter T, lying on a certain variety X associated to X. Now, we may apply the Pila-Wilkie estimates for rational points on X, but the problem is that we have to remove the algebraic part of X, and in the present case this would remove all the points we are interested in. This difficulty is solved by entering into the proofs of the estimates for rational points, to see what is really needed to be removed; it turns out that, taking into account the special nature of the variety X so constructed, the estimates for rational points continue to hold even if we do keep in X the vectors relevant to us. Indeed, as explained in [Pil09a], Sec. 3, already the paper [PW06] contained results for families of varieties, in which the sought estimates for rational points were achieved by removing subsets smaller than the whole algebraic part; see especially Theorem 1.10 in [PW06], i.e., Theorem 3.1 in [Pil09a], refined to the relevant Theorem 3.5 in [Pil09a].
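The first device described in Remark 4.4.2, passing from an algebraic number to an integer vector of coefficients, can be illustrated in the quadratic case relevant here. The sketch below is an illustration only: the function names and the representation α = r + s√n (with rational r, s and a nonsquare integer n) are assumptions made for this example and are not taken from [Pil09a].

```python
from fractions import Fraction
from math import gcd


def integer_vector(r, s, n):
    """Return coprime integers (a0, a1, a2), a2 > 0, with
    a0 + a1*alpha + a2*alpha**2 == 0 for alpha = r + s*sqrt(n).

    Assumptions (hypothetical input format): r, s rational, s != 0,
    n an integer that is not a perfect square (n < 0 is allowed).
    """
    r, s = Fraction(r), Fraction(s)
    # Minimal polynomial over Q: x^2 - 2r x + (r^2 - s^2 n).
    coeffs = [r * r - s * s * n, -2 * r, Fraction(1)]
    # Clear denominators with their least common multiple.
    common = 1
    for c in coeffs:
        common = common * c.denominator // gcd(common, c.denominator)
    ints = [int(c * common) for c in coeffs]
    # Divide by the gcd so that the vector is primitive.
    g = gcd(gcd(abs(ints[0]), abs(ints[1])), abs(ints[2]))
    return tuple(v // g for v in ints)


def vector_height(vec):
    """Naive height of an integer vector: the largest absolute value."""
    return max(abs(v) for v in vec)
```

For instance, `integer_vector(Fraction(-1, 2), Fraction(1, 2), -3)` encodes τ = (−1 + √−3)/2, a root of x² + x + 1, as the vector (1, 1, 1).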

We still have to deal with the present surface Z of interest for us, defined by means of j(τ). It turns out that the graph of j(τ), τ ∈ F, although not globally subanalytic, is a definable set in the structure ℝ_{an,exp}. See Example 4.14 in the notes below for a sketch of this simple deduction (which indeed follows easily from the q-expansion (4.2.2) for j(τ)).³¹ In turn, since Z is defined by the vanishing of a polynomial in j(τ₁), j(τ₂), τ₁, τ₂ ∈ F, this immediately implies (via the above-noted formal properties of o-minimal structures, recalled in the notes) that also Z is definable in ℝ_{an,exp}.

In conclusion, we may then apply Pila’s Theorem 4.5 to our present Z, with k = 2. Taking also into account Lemma 4.4 stating that Z^alg is empty, we obtain

Lemma 4.6. Let N(T) be the number of points of degree ≤ 2 over ℚ and Height ≤ T contained in Z. Then, for every ε > 0 there exists c₁(ε) such that for T ≥ 1

N(T) ≤ c₁(ε) T^ε.

A lower bound for the degree. We now come to the part concerning estimates from below for the degree of a point in Z, compared to the height. Recall that j(τ) is special (i.e., a singular modulus) if and only if τ is imaginary quadratic in H, say a root of ax² + bx + c, where a, b, c ∈ ℤ and where we assume a > 0 and (a, b, c) = 1. Since τ ∉ ℝ, we must also have c > 0. The lattice Λ = Λ_τ = ℤτ + ℤ has CM (i.e., is stabilized) by the order O_D := ℤ[(D + √D)/2] (which is in fact ℤ[aτ]), where D := b² − 4ac is the discriminant of τ. As in Section 4.2, we shall indicate by h_D := h(O_D) the class number. By means of a few elementary considerations, this may be shown to be also the number of imaginary quadratic τ of discriminant D, up to SL₂(ℤ)-action (consider the lattices ℤτ + ℤ corresponding to O_D-fractional ideals). We may take representatives for them to lie in the fundamental domain F. In view of what we have already recalled on complex multiplication, the corresponding values j(τ) form a complete set of conjugate algebraic integers, so in particular [ℚ(j(τ)) : ℚ] = h_D. As to the growth of h_D we have Siegel’s estimates (4.2.3) and (4.2.4), which imply h_D ≥ c₂(ε)|D|^(1/2−ε) for a(n ineffective) c₂(ε) > 0.³²
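The description of h_D as the number of imaginary quadratic τ of discriminant D lying in the fundamental domain lends itself to a direct enumeration. The following is a minimal sketch, assuming the usual normalization of the fundamental domain (−1/2 ≤ Re τ < 1/2, |τ| ≥ 1, and Re τ ≤ 0 when |τ| = 1); in terms of the coefficients this reads −a < b ≤ a ≤ c, with b ≥ 0 when a = c. The function names are chosen for this illustration only.

```python
from math import gcd


def reduced_forms(D):
    """List the coprime triples (a, b, c), a > 0, with b*b - 4*a*c == D < 0
    whose root tau = (-b + sqrt(D)) / (2a) lies in the fundamental domain."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    while 3 * a * a <= -D:                  # |b| <= a <= c forces 3a^2 <= |D|
        for b in range(-a + 1, a + 1):      # -a < b <= a, i.e. -1/2 <= Re(tau) < 1/2
            num = b * b - D
            if num % (4 * a):
                continue
            c = num // (4 * a)
            if c < a or (c == a and b < 0):  # need |tau|^2 = c/a >= 1, boundary rule
                continue
            if gcd(gcd(a, abs(b)), c) == 1:  # primitive triple
                forms.append((a, b, c))
        a += 1
    return forms


def class_number(D):
    """Number of such triples, i.e. h_D for the discriminant D."""
    return len(reduced_forms(D))
```

For D = −23 this returns the three triples (1, 1, 6), (2, −1, 3), (2, 1, 3), in agreement with h₋₂₃ = 3; each of them satisfies 3ac ≤ |D|.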

Now, if τ ∈ F, we have |τ| ≥ 1 so, if H(τ) is the (exponential) Weil height of τ, we have H(τ)² = ac. Taking again into account that τ ∈ F, we obtain |b| ≤ a ≤ c; hence 4ac = |D| + b² ≤ |D| + ac, whence 3ac ≤ |D| and H(τ) < √|D|. Recall now that we count points in Z according to the real coordinates given above. Namely, we write τ = u + iv. We get easily that H(u), H(v) ≤ 16H(τ). So, if Z contains a point P = (u, v, x, y) ↔ (τ₁, τ₂) where τ_i has discriminant D_i, and u, v, x, y are the corresponding real coordinates, we have H(P) := max(H(τ₁), H(τ₂)) ≤ 16 max √|D_i|. Hence by Siegel’s estimates we obtain that for any given ε > 0 there exists c₄(ε) > 0 such that for any special point P ∈ Z,

(4.4.8) H(P) ≤ c₄(ε)[ℚ(π(P)) : ℚ]^(1+ε).

31 In [Pil09b], Sec. 3, Pila deduces this from a certain result of Peterzil-Starchenko, quoted below in the notes. Actually, although such result is important for other applications, in this case, and also for the recent results of [Pil11], it is not really necessary.
32 These estimates are essentially best-possible, as there is also an (easier and effective) upper bound h_D ≤ C(ε)|D|^(1/2+ε). See [Nar04]. This is not needed in this argument, but becomes necessary for other proofs in the context, for instance, in [Pil09b].
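How (4.4.8) follows from Siegel's lower bound can be traced in a few lines. The display below is a sketch only: the auxiliary exponent ε₀ is introduced here for illustration and does not appear in the text, and we use that ℚ(π(P)) contains each j(τ_i), so that its degree is at least max h_{D_i}.

```latex
% Sketch: \varepsilon_0 \in (0, 1/2) is an auxiliary exponent chosen so that
% 1/(1 - 2\varepsilon_0) \le 1 + \varepsilon.
\begin{align*}
h_{D_i} \ge c_2(\varepsilon_0)\,|D_i|^{1/2-\varepsilon_0}
  &\;\Longrightarrow\;
  \sqrt{|D_i|} \le \bigl(h_{D_i}/c_2(\varepsilon_0)\bigr)^{1/(1-2\varepsilon_0)},\\
[\mathbb{Q}(\pi(P)):\mathbb{Q}]
  &\;\ge\; \max_i\,[\mathbb{Q}(j(\tau_i)):\mathbb{Q}] \;=\; \max_i h_{D_i},\\
H(P) \le 16\,\max_i \sqrt{|D_i|}
  &\;\le\; 16\,c_2(\varepsilon_0)^{-1/(1-2\varepsilon_0)}\,
       [\mathbb{Q}(\pi(P)):\mathbb{Q}]^{1/(1-2\varepsilon_0)}
  \;\le\; c_4(\varepsilon)\,[\mathbb{Q}(\pi(P)):\mathbb{Q}]^{1+\varepsilon}.
\end{align*}
```

The last step uses that the degree is at least 1, so raising the exponent from 1/(1 − 2ε₀) to 1 + ε can only increase the right-hand side.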

Remark 4.4.3 Very recent work of J. Tsimerman in his thesis provides lower bounds for Galois orbits of CM-points on Siegel space, for g < 6; this should replace class-number estimates and lead to further cases of André-Oort by Pila’s method.

Conclusion through comparison of bounds. As before, the idea is to consider the image of our special point (τ₁, τ₂) by our uniformizing map π = (j, j) and to consider its conjugates (over a number field of definition for X). Let the curve X be defined over a number field of degree δ, and suppose it contains the special point (x₁, x₂) = (j(τ₁), j(τ₂)) = π(P), say, where P = (u, v, x, y). Then X shall contain also the conjugates (π(P))^σ of π(P) = (x₁, x₂) over the said number field and hence it shall contain at least δ⁻¹L such conjugates, where L := [ℚ(π(P)) : ℚ] is the degree. These conjugates shall be of the shape (j(µ₁), j(µ₂)) for quadratic imaginary µ₁, µ₂ of discriminants resp. D₁, D₂. Hence, the Height of the corresponding points P′ = (µ₁, µ₂) on Z again shall not exceed 16 max √|D_i|. Therefore, the estimate (4.4.8), i.e., H(P′) ≤ c₄(ε)L^(1+ε), shall hold also for these points P′, because the degree L = [ℚ(π(P)) : ℚ] is the same for all these points. Then, on choosing ε = 1/2 and setting T := c₄(1/2)L^(3/2), we conclude that Z shall then contain at least δ⁻¹L = δ⁻¹(T/c₄(1/2))^(2/3) special points of height ≤ T.
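To see why such a lower count is incompatible, for large T, with an upper bound of the shape given by Lemma 4.6, one may compare exponents directly. The display below is a sketch under the choice of exponent 1/2 in Lemma 4.6; the resulting explicit threshold is merely illustrative, since c₄ is ineffective.

```latex
\begin{align*}
\delta^{-1}\bigl(T/c_4(1/2)\bigr)^{2/3} \;\le\; N(T) \;\le\; c_1(1/2)\,T^{1/2}
 &\;\Longrightarrow\; T^{2/3-1/2} = T^{1/6} \le \delta\,c_1(1/2)\,c_4(1/2)^{2/3}\\
 &\;\Longrightarrow\; T \le \bigl(\delta\,c_1(1/2)\,c_4(1/2)^{2/3}\bigr)^{6}.
\end{align*}
% Since T = c_4(1/2) L^{3/2}, this bounds L, hence the class numbers
% h_{D_i} \le L, hence max |D_i| by Siegel's estimate.
```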

For large enough L we shall have T ≥ 1. But now Lemma 4.6 yields, for every ε′ > 0, at most c₁(ε′)T^(ε′) quadratic points on Z of height ≤ T. On choosing again, e.g., ε′ = 1/2, we obtain a bound for T, proving boundedness of max |D_i| and finiteness of the set of special points.

Remark 4.4.4 (i) We note that there are common features in the two proofs by André and Pila, for instance, in the class number inequalities.³³ But of course it is not surprising that this aspect appears here, being intimately related with the quantities which appear in the very statement. Another similarity is in the consideration of conjugates of a relevant special point (x₁, x₂).³⁴ However, the use of these tools and arguments has rather different purposes in the two proofs, so that they may be considered quite different after all.

(ii) Pila’s paper [Pil09b] contains other interesting results, proved by variations of the same method. There is, for instance, a Theorem 1.2 concerning the Legendre surface L ⊂ 𝔸¹ × P² of the previous chapter; it has a map λ : L → 𝔸¹. We call a point z₀ ∈ L special if λ₀ := λ(z₀) is such that E_{λ₀} is a CM-curve and if z₀ is torsion on E_{λ₀}. A curve X on L is called special if it is either a fiber of λ above a special value, or a torsion curve. It is proved that if X is a nonspecial curve on L, then it contains only finitely many special points. This result is not itself contained in the André-Oort conjecture, and suggests mixed versions of it. See the next section for a natural and more general statement, due to André. In fact, a result equivalent to Pila’s, in different phrasing, and a sketch of proof appear in the notes [And01] by André. (See the Theorem on p. 12. André’s proof needs a height limit theorem of Silverman, for algebraic points of a section of an elliptic pencil, and a lower bound of Colmez for the height of CM-points.)³⁵ For his proof Pila uses tools and principles similar to the ones above, but the definability in the structure ℝ_{an,exp} of the relevant surface now requires a recent result by Peterzil-Starchenko, recalled in the notes. He also needs a lower bound for the degree of a torsion point on a CM-curve (due to A. Silverberg) and an (easy) upper bound for the class number (recalled in a footnote above). In Appendix G below, Masser shows how to derive this result (working with a special case for simplicity) using the methods of Chapter 3, and avoiding both Siegel’s class-number estimates and Pila’s results on quadratic points.

33 This issue leads to ineffectivity.
34 Concerning conjugates of a singular modulus j₀, we mention a result of W. Duke [Duk06], proving a certain equidistribution for them. A previous result of Duke [Duk88] again asserted uniform distribution, this time of the inverse images of the conjugates in SL₂(ℤ)\H ≅ F with respect to the natural invariant measure (3/π) dx · dy/y², x + iy ∈ H. This is analogous, for instance, to an equidistribution theorem for torsion points of 𝔾_m (and to similar, deeper, results for abelian varieties), saying that conjugates of a root of unity of high order tend to be equidistributed in the unit circle. (This has even been extended by Bilu to points of small height in [Bil97].) Now, such a result in higher dimensions leads in particular to a different proof of the torsion points Theorem 1.1, and, similarly, an extension of Duke’s theorems to higher dimensions would lead to a further proof of André’s theorem; however, to my knowledge such a result is not yet available. See [Zha05] for other equidistribution results with application to the André-Oort context.

Finally, Pila’s paper [Pil09b] contains a Theorem 1.3 concerning Heegner points; namely, one has a modular parametrization X₀(n) → E of an elliptic curve E and considers images in E of special points on X₀(n), which are also torsion on E; here we do not pause further on this. (As remarked by Pila, his Theorem 1.3 is a very special case of results of Buium and Poonen.)

(iii) We have already noted that the proofs we have seen do not lead to effectivity, but we have seen in Subsection 4.3.1 that a variation on André’s argument is possible to produce an effective proof. This does not appear possible at the moment using Pila’s argument. See also Section 13 of [Pil11] for other considerations on effectivity and uniformity of the bounds.

4.5 Shimura varieties

In this last section we offer a few general definitions concerning Shimura varieties, of which we have considered up to now only rather special cases. Our treatment shall be very brief, giving the essentials and a few examples. (I also thank Pietro Corvaja for several discussions and Paula Cohen-Tretkoff for other comments and references.) We have borrowed freely from André’s notes [And01], but with a few variations. (We remark that these general definitions were known since Mumford [Mum69] and Deligne [Del71]. An issue in André’s formulation is that he identifies the properties that enable one to talk about Shimura varieties and abelian varieties in terms of the same general structure. Of course, this was crucial in thinking of the Manin-Mumford conjecture and the André-Oort conjecture as being related.)

Shimura varieties. We let:

• H: a connected reductive³⁶ algebraic group defined over ℚ; for simplicity we assume the stronger condition that H is semisimple (i.e., with no nontrivial solvable normal subgroup).
• H(ℝ)⁺: the connected component of the identity in H(ℝ).
• K = K∞ ⊂ H(ℝ): a maximal compact subgroup. (We recall that all such subgroups are conjugate in H(ℝ)⁺.)

Iwasawa decomposition: We have H(ℝ) = K∞ · A · N = N · A · K∞, where A is the identity component of a maximal (split) real torus (i.e., isomorphic to 𝔾_m^r(ℝ)) and N is a maximal unipotent subgroup.³⁷ We further have A ≅ (ℝ*₊)^r, N ≅ ℝⁿ (as real analytic manifolds). Further, denote

• X⁺ := H(ℝ)/K∞ ≅ H(ℝ)⁺/K∞⁺ ≅ N · A.

35 André’s statement is given in terms of sections of an elliptic pencil over a curve. On taking a pull-back, every curve on L may be viewed as the image of a suitable section for such a pencil, so that indeed André’s version contains Pila’s; a converse deduction may also be easily proved.
36 We recall this means that the maximal normal solvable connected subgroup is a torus.
37 For this we refer, for instance, to A. Borel’s Séminaire Bourbaki [Bor50].

Hermitian type: We assume that H is of hermitian type, namely, that X⁺ = H(ℝ)/K∞ has a complex structure invariant by H(ℝ)⁺.³⁸ (Note that this does not depend on the choice of K∞, since conjugation must preserve the complex structure by assumption.)

Example 4.2 As our first and most basic example we let H = SL₂, so H(ℝ)⁺ = H(ℝ) = SL₂(ℝ). We may choose K = K∞ = SO₂(ℝ). The Iwasawa decomposition occurs with A = the subgroup of positive diagonal matrices in SL₂(ℝ), N = the subgroup of matrices $\begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}$. It is illustrated by the following formula: for every matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ ∈ SL₂(ℝ) we have, for suitable θ, t ∈ ℝ, λ ∈ ℝ*₊:

$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} \lambda & 0 \\ 0 & \lambda^{-1} \end{pmatrix} \cdot \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$

We let SL₂(ℝ) operate on H in the usual way: $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$τ := (aτ + b)/(cτ + d). Then K is easily found to be the stabilizer of i = √−1 in H(ℝ). Thus, the map g ↦ gi induces a bijection of X⁺ ≅ N · A ≅ ℝ × ℝ*₊ with H. We now have hermitian type in the above sense: indeed, now X⁺ ≅ H ⊂ ℂ has a natural complex structure, which is invariant by the action of H(ℝ)⁺, since the linear fractional maps induced by SL₂(ℝ) are biholomorphic on H. (See also [Lan75], III.1.)
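The Iwasawa decomposition of Example 4.2 can be made completely explicit, since g·i = t + iλ² determines t and λ, and the remaining rotation is then forced. The following is a small numerical sketch; the function names and the tolerance used to test det g = 1 are choices made for this illustration only.

```python
import math


def iwasawa_sl2(a, b, c, d):
    """Return (t, lam, theta) with
    [[a, b], [c, d]] = [[1, t], [0, 1]] * diag(lam, 1/lam) * rotation(theta),
    where rotation(theta) = [[cos, sin], [-sin, cos]]."""
    if abs(a * d - b * c - 1.0) > 1e-9:
        raise ValueError("matrix must have determinant 1")
    r2 = c * c + d * d              # g.i = (ac + bd + i) / (c^2 + d^2)
    t = (a * c + b * d) / r2        # real part of g.i
    lam = 1.0 / math.sqrt(r2)       # lam^2 is the imaginary part of g.i
    theta = math.atan2(-c, d)       # cos(theta) = lam*d, sin(theta) = -lam*c
    return t, lam, theta


def matmul(x, y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(x[i][m] * y[m][j] for m in range(2)) for j in range(2))
        for i in range(2)
    )


def recompose(t, lam, theta):
    """Multiply the three factors back together."""
    n = ((1.0, t), (0.0, 1.0))
    a = ((lam, 0.0), (0.0, 1.0 / lam))
    k = ((math.cos(theta), math.sin(theta)),
         (-math.sin(theta), math.cos(theta)))
    return matmul(matmul(n, a), k)
```

For the matrix with rows (2, 1) and (3, 2) one has g·i = (8 + i)/13, so t = 8/13 and λ² = 1/13, and `recompose` returns the original matrix up to rounding.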

To go on, we suppose we have a ℤ-model of H, so we can define a group H(ℤ). We let

• Γ = an arithmetic subgroup of H(ℚ), i.e., a subgroup which is commensurable with H(ℤ). (This means that Γ ∩ H(ℤ) has finite index in both groups.) We assume that Γ ⊂ H(ℝ)⁺.

Now, Γ\X⁺ =: S inherits a complex structure. It is a result of Baily-Borel (see [BB66]) that this quotient has a (unique) structure of algebraic variety over ℂ (smooth if Γ is torsion free) compatible with the said complex structure.

Remark 4.5.1 Description of the algebraic structure. We assume for simplicity that (H/Z)(ℝ) (where Z = Z(H) is the center) has no factor PSL₂(ℝ). The complex dimension d of S is (r + n)/2. Let ω = ∧^d Ω¹. Then S embeds into Proj ⊕_m Γ(S, ω^(⊗mn)) for large n sufficiently divisible; this induces the algebraic structure. We notice that Γ(S, ω^(⊗mn)) is a space of automorphic forms.

Finally, suppose also that Γ is a congruence subgroup of H(ℚ), i.e., that Γ contains the kernel of H(ℤ) → H(ℤ/n) for some n.³⁹ Then we call S = Γ\X⁺ a connected Shimura variety.

Example 4.3 Modular curves. In the above Example 4.2, we may choose, for instance, Γ = SL₂(ℤ), and S becomes the quotient Γ\H, which we have already noticed in Section 4.2 to be isomorphic to ℂ (in the complex analytic sense), the isomorphism being provided by the modular function j(τ), which is automorphic with respect to Γ. And if we take Γ = Γ₀(n), we obtain the quotient Γ₀(n)\H, this time complex analytic-isomorphic to the modular (algebraic!) curve Y₀(n). The isomorphism this time is provided by considering, together with j(τ), other modular functions and modular forms for Γ₀(n); in turn, these forms are holomorphic differentials on Γ₀(n)\H, of any degree (similarly to the description in 4.5.1). Of course, analogous examples of modular curves are obtained with other congruence subgroups of SL₂(ℤ). See [Lan73] and [Shi94].

Example 4.4 The variety P¹ \ {0, ∞} as a Shimura variety. Here is a special example of the construction just seen. Let us consider the congruence subgroup Γ₀(2) of SL₂(ℤ) consisting of upper triangular matrices modulo 2. The quotient Γ₀(2)\H is the modular curve Y₀(2). It may easily be seen that it has genus zero and that it is actually isomorphic to 𝔾_m = P¹ \ {0, ∞} as a variety (it has two “cusps,” corresponding to the missing points 0, ∞ of 𝔾_m in P¹); however, of course the group law of 𝔾_m does not enter into this picture. (If we take, for instance, the square 𝔾_m², the André-Oort conjecture reduces to André’s theorem, as for the modular curve 𝔸¹.)

38 As remarked by André, this is a very restrictive condition, which excludes, for instance, the groups SL_n for any n ≥ 3.
39 As observed in [And01], this is equivalent to Γ being the intersection of H(ℚ) with a compact open subgroup of H(A_f), i.e., the set of points of H in the ring of finite adeles of ℚ.

Example 4.5 Moduli spaces for p.p. abelian varieties. Let us now consider the group of symplectic matrices H = Sp_{2g} (i.e., 2g × 2g-matrices M such that $M \begin{pmatrix} 0 & I_g \\ -I_g & 0 \end{pmatrix} {}^tM = \begin{pmatrix} 0 & I_g \\ -I_g & 0 \end{pmatrix}$, where I_g is the identity g × g matrix). Then the unitary group K = U_g(ℂ) may be identified with a maximal compact in H(ℝ). (We think of a complex g × g-matrix as an endomorphism of real 2g-space. Then, that a complex matrix fixes the hermitian form A(ᵗB̄) yields the symplectic condition.) Then X⁺ becomes identified with the “Siegel half space” H_g, consisting of the g × g symmetric complex matrices with positive definite imaginary part; it is a complex manifold of dimension g(g + 1)/2. We may let Sp_{2g}(ℤ) act on H_g by $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$ : τ ↦ (Aτ + B)(Cτ + D)⁻¹. Then the quotient Sp_{2g}(ℤ)\H_g may be identified with the moduli space A_g of principally polarized abelian varieties of dimension g. See, for instance, [Ros86] or [LB92]. (As in the examples in [And01], p. 2, we may also consider congruence subgroups and level structure, in analogy with the case g = 1.)

Example 4.6 Hilbert modular surfaces. Let us consider a real quadratic field ℚ(√d) and the associated ring of integers O, denoting with a dash the conjugation over ℚ. The Hilbert modular group SL₂(O)/{±1} acts on H; we shall now define an action on H², obtained by a kind of restriction of scalars. For this, we let H = SL₂ × SL₂; however, we now choose the ℤ-structure which is obtained by setting Γ = H(ℤ) = {(T, T′) : T ∈ SL₂(O)}. As before, we obtain X⁺ = H × H. The quotient Γ\H × H turns out to be an algebraic surface. (See [Hir73] for a detailed study.)

Shimura subvarieties. To define properly Shimura subvarieties, André considers a different description of X⁺ as a conjugacy class of 1-parameter subgroups (see p. 2 of [And01]). See also [Pin05a]. We shall follow a slightly different path (nearer to André’s alternative approach appearing on p. 8 of the cited notes, or to [And89]); we shall be rather sketchy, mainly thinking of a few illustrative examples.

Suppose that H₁ is an algebraic subgroup of H, with similar properties. Then we may try to construct from it a Shimura variety, on forming first a quotient X₁⁺ := H₁(ℝ)/K₁, where K₁ is a maximal compact subgroup of H₁(ℝ). If this succeeds, it is natural to try to see this X₁⁺ as a subvariety of X⁺, in order to take then a left quotient by Γ ∩ H₁(ℤ) and obtain a Shimura subvariety of S. Now, at least two observations are in order:

(a) It may well happen that H₁ is not of hermitian type, although H is. (A simple example occurs with H = SL₂, H₁ = subgroup of diagonal matrices in H.)
(b) Even if H₁ is of hermitian type, it may happen that K₁ is not contained in K but merely in a conjugate qKq⁻¹, with a q ∈ H(ℝ)⁺. (We may change K₁ within H₁(ℝ), but only up to conjugation in H₁(ℝ)!) For this reason in general we cannot view X₁⁺ = H₁(ℝ)/K₁ as a subspace of X⁺ = H(ℝ)/K, but merely as a subspace of another model of X⁺, namely, X⁺(q) := H(ℝ)/qKq⁻¹.

If we are in case (a), we do not obtain a Shimura variety from H₁. If we are in case (b), we may still embed X₁⁺ in X⁺(q) and then use the natural map X⁺(q) → X⁺ defined by xqKq⁻¹ ↦ xqK. (This is well defined, because if we replace x by xqyq⁻¹ with a y ∈ K, then xqK is replaced by xqyK = xqK.) This is equivalent to viewing everything in X⁺, but replacing at the beginning H₁, first with the subgroup H₁* = q⁻¹H₁q, which has a maximal compact q⁻¹K₁q contained in K, and then with the translate qH₁* = H₁q; at this point we may take the left K-cosets of H₁(ℝ)q. (This is what André directly does on p. 8 of [And01]. See also [And89], pp. 214–215.)

Finally, let us also assume that H₁ is defined over ℚ. Then we may take the right Γ-cosets of the resulting space to obtain a connected Shimura subvariety.

Example 4.7 CM-points. We now recover CM-points from these definitions, justifying their previous appearance as special points, or Shimura subvarieties of dimension 0.

Let us consider the case H = SL₂ of Example 4.2 above, and the choice K := { $\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$ : θ ∈ ℝ } for a maximal compact in H(ℝ). The (set of complex points of the) corresponding Shimura variety is SL₂(ℤ)\H ≅ 𝔸¹(ℂ). So any connected Shimura proper subvariety must be a point. Therefore it must come either from H₁ = {I} (and we obtain the CM-points q(i) ∈ SL₂(ℤ)\H, with q ∈ SL₂(ℚ)) or from an algebraic subgroup H₁ ⊂ SL₂ of dimension 1, such that H₁(ℝ) is actually compact. (In fact, if dim H₁ ≥ 2, dimension counting shows that either it is not of hermitian type or we obtain the whole X⁺ = H after dividing by K₁.)

Now, any connected algebraic subgroup of SL₂(ℂ) of dimension 1 is known to be conjugate over ℂ either to the subgroup N of matrices $\begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}$ or to the subgroup A of diagonal matrices. This first subgroup acts on H with the single fixed point ∞. Therefore gNg⁻¹ acts by fixing g(∞) and if it is defined over ℚ, then g(∞) ∈ ℚ. So, up to conjugation over ℚ we may assume that g fixes ∞, namely, it is a triangular matrix. But then gNg⁻¹ = N. Now, N(ℝ) has a trivial maximal compact subgroup, so it is not of hermitian type, and we do not obtain a Shimura subvariety from this case.

In conclusion, we may assume that H₁ = gAg⁻¹. Again, if it is defined over ℚ, then g must send the set {0, ∞} of fixed points of the action of A on P¹ to a set {g(0), g(∞)} defined over ℚ. If this set consists of two real points, then again the maximal compact is trivial and we do not get hermitian type. So we may assume that {g(0), g(∞)} consists of two imaginary quadratic points, say in ℚ(i√d) (d ∈ ℚ₊), and conjugate over ℚ by means of complex conjugation.

Let then q ∈ SL₂(ℚ) be such that q(i√d) = g(0), so q(−i√d) = g(∞). Then q $\begin{pmatrix} \sqrt{d} & 0 \\ 0 & 1 \end{pmatrix}$ K $\begin{pmatrix} 1 & 0 \\ 0 & \sqrt{d} \end{pmatrix}$ q⁻¹ fixes g(0), g(∞), so this must be the maximal compact in H₁(ℝ) (and must be actually equal to H₁(ℝ)). Hence, by the above, the Shimura subvariety is obtained by taking the image in H = SL₂(ℝ)/K of H₁(ℝ)q $\begin{pmatrix} \sqrt{d} & 0 \\ 0 & 1 \end{pmatrix}$. This is just {q(i√d)}, a quadratic imaginary number. In turn, this produces indeed a CM-point on SL₂(ℤ)\H. The same argument also shows that all CM-points can be obtained in this way.

Example 4.8 Hecke correspondences. Some Shimura subvarieties of a square S² of a Shimura variety are obtained by means of the so-called Hecke correspondences (in a way which can of course be read also as above). To define them, let g ∈ H(ℚ). Since Γ is arithmetic it may be proved that g⁻¹Γg is commensurable to Γ. Left multiplication by g induces a map g· : X⁺ → X⁺, which in turn yields an algebraic correspondence S ↔ S. Explicitly, let u ∈ S and pick a representative x ∈ X⁺. Then this correspondence associates to u the (formal) sum of the image of gΓx in S. Note that this image is finite, since if γ₁, γ₂ ∈ Γ are in the same right coset modulo g⁻¹Γg, then gγ₁x and gγ₂x have the same image in S. This correspondence may be also viewed as a map (Γ ∩ g⁻¹Γg)\X⁺ → S².

For instance, when Γ = SL₂(ℤ) and g = $\begin{pmatrix} n & 0 \\ 0 & 1 \end{pmatrix}$, we find Γ ∩ g⁻¹Γg to be the group of matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ ∈ SL₂(ℤ) such that c ≡ 0 (mod n), i.e., Γ₀(n); the Hecke correspondence yields the modular curve Y₀(n) ⊂ 𝔸² (i.e., x₁ corresponds to x₂ precisely if (x₁, x₂) ∈ Y₀(n)). This curve is a special subvariety, and it may be obtained also by the previous construction: we take H₁ to be a skew-diagonal subgroup of H² (which is the algebraic group underlying S²): H₁ = {(x, g⁻¹xg) : x ∈ H}. (If H is simple and Aut(H) consists of inner automorphisms, all proper algebraic subgroups of H² defined over ℚ and with surjective projections arise in this way.)
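Which congruence condition cuts out Γ ∩ g⁻¹Γg for g = diag(n, 1) is easily checked by conjugating explicitly: an element γ of Γ = SL₂(ℤ) lies in g⁻¹Γg exactly when gγg⁻¹ has integer entries, and the only entry that can fail to be an integer is the lower-left one, c/n. The following sketch verifies this on small matrices; the function names and the search bound are illustrative choices, not part of the text.

```python
from fractions import Fraction
from itertools import product


def conjugate_by_g(gamma, n):
    """Return g * gamma * g^{-1} for g = diag(n, 1), with exact rational entries."""
    (a, b), (c, d) = gamma
    return ((Fraction(a), Fraction(n * b)), (Fraction(c, n), Fraction(d)))


def in_intersection(gamma, n):
    """True if gamma (assumed to lie in SL2(Z)) also lies in g^{-1} SL2(Z) g,
    i.e. if g * gamma * g^{-1} again has integer entries."""
    return all(x.denominator == 1 for row in conjugate_by_g(gamma, n) for x in row)


def small_sl2z(bound):
    """All integer 2x2 matrices of determinant 1 with entries in [-bound, bound]."""
    rng = range(-bound, bound + 1)
    return [((a, b), (c, d))
            for a, b, c, d in product(rng, repeat=4)
            if a * d - b * c == 1]
```

On every matrix produced by `small_sl2z`, the test `in_intersection(gamma, n)` agrees with the condition that n divides the lower-left entry, which is the defining condition of Γ₀(n).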

Unifying André-Oort and Manin-Mumford. By means of similar considerations, André also suggested a construction unifying the Shimura-André-Oort and the abelian-Manin-Mumford context (and formulated a “tentative” unified conjecture). We briefly repeat this. (See p. 8 of [And01] and see also Pink’s paper [Pin05a] and manuscript [Pin05b] for conjectures including Mordell-Lang as well. Here we have limited ourselves to the simplest considerations.)

Take H to be a connected linear algebraic group over ℚ, K a maximal compact in H(ℝ) containing a maximal real torus, such that H(ℝ)/K has an H(ℝ)⁺-invariant complex structure; take also an arithmetic Γ ⊂ H(ℤ). Assume further that S := Γ\H(ℝ)/K is an algebraic variety.

Call an irreducible subvariety Z ⊂ S special if there exist H1/ℚ ⊂ H/ℚ, q ∈ H(ℚ), such that H1(ℝ) ∩ K∞ is a maximal compact in H1, and Z is the image of qH1(ℝ) by the map

qH1(ℝ) → H(ℝ) → Γ\H(ℝ)/K∞ = S.

Now, observe that

1. If H is reductive, Γ is a congruence subgroup of H(ℤ) and there is a maximal torus of H/ℚ which is compact, we recover the above definition of special subvariety of a connected Shimura variety.

2. If H = 𝔾_a^{2g}, then K∞ = {1}, H(ℝ) ≅ ℝ^{2g} ≅ ℂ^g. Now Γ ⊂ H(ℤ) is a lattice (if we give H the usual ℤ-structure); assume now that S = Γ\H(ℝ) is given an H(ℝ)-invariant complex structure and a compatible structure of an algebraic variety; then S is an abelian variety. Also, the special points are torsion points (recall q ∈ H(ℚ) in the definition above). It may also be verified that the special subvarieties are translates of abelian subvarieties by a torsion point (in fact, the assumption that Z is an algebraic subvariety forces Γ ∩ H1(ℝ) to be a full lattice in H1(ℝ), necessarily with a Riemann form induced from H).

André formulates a tentative generalized conjecture: Z ⊂ S is special if and only if it contains a dense subset of special points.

Now, although Example 4.4 exhibits 𝔾_m as a Shimura variety (but disregarding the group structure), and although André's construction includes abelian varieties (and the Manin-Mumford context), it is not clear to me which construction recovers as well the multiplicative (and semiabelian) context of Chapter 1.⁴⁰ Nevertheless, as special cases of André's construction and statement we can also recover mixed André-Oort-Manin-Mumford statements; an example occurs with Theorem 1.2 of Pila's paper [Pil09b], already mentioned in Remark 4.4.4 (ii) above (in turn, a rather special case of Pila's general results of [Pil11], stated above as Theorem 4.1). This appears as an example on p. 9 of [And01] and we reproduce it now.

Example 4.9 Let H ⊂ SL_3 be the semidirect product of 𝔾_a^2 with SL_2 (with natural action). It may be represented as the group of matrices of the form $\begin{pmatrix} 1 & x_1 & x_2 \\ 0 & a & b \\ 0 & c & d \end{pmatrix}$, where $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ ∈ SL_2. Let K∞ = {0} · SO_2. Then H/K∞ ≅ ℂ × ℋ. Let also Γ be the semidirect product of ℤ^2 with SL_2(ℤ). Then the map

Γ\H(ℝ)/K∞ ≅ Γ\(ℂ × ℋ) → SL_2(ℤ)\ℋ ≅ 𝔸^1

defines the universal elliptic curve. Using the above definitions, the special points are found to be the torsion points in a CM elliptic curve. Take finally the Legendre elliptic curve E_λ and a nontorsion point P_λ ∈ E_λ($\overline{\mathbb{Q}(\lambda)}$). In this context the generalized conjecture predicts, for instance, that there are only finitely many λ0 ∈ ℂ such that E_{λ0} has complex multiplication and P_{λ0} is torsion (on E_{λ0}). As remarked above, this has been proved by André ([And01], p. 12)⁴¹ and also follows from Theorem 1.2 of [Pil09b]. See also Appendix G below for another proof by Masser.

⁴⁰ Actually, Theorem 18.2(ii) in [Bor91] implies that any reductive group H/ℚ is unirational over ℚ. In particular, the closure of H(ℚ) in the real topology of H(ℝ) contains some nonempty open set. But then, if we take any quotient Γ\H(ℝ)/K∞ isomorphic to 𝔾_m, the rational points H(ℚ) cannot project to a set contained in the torsion points of 𝔾_m. This indicates that some modification is needed to recover Lang's toric context as well from the above setting.

⁴¹ André's proof uses a theorem of Silverman on comparison between functional and numerical specialized heights (tending to infinity) on a section of an elliptic pencil, and a result of P. Colmez asserting that CM values of j have height tending to infinity.

Notes to Chapter 4

1. Remarks on Edixhoven's approach to André's theorem

Some aspects in common with the above proof by Pila (and by André) also appear in the quoted paper by Edixhoven [Edi98], where a conditional proof is given of the same theorem of André. Let us spend a few words on this approach as well.

The first part is very similar to André's and proves the equality of the CM fields of x1, x2, and the “almost” equality of the discriminants, for all but finitely many special points (x1, x2) ∈ X.

At this stage, a first, almost successful, attempt observed by Edixhoven could be as follows. From Minkowski's theorem on small representatives in ideal classes, one deduces that for the relevant points (x1, x2) there is a cyclic isogeny between the corresponding elliptic curves of degree n ≪ √|D|. Note that (x1, x2) ∈ Y_0(n). (Here we denote by D a least common multiple of the discriminants associated to x1, x2.) Now one considers conjugates, which appear in so many other arguments we have encountered in these notes. The conjugate points also lie on Y_0(n) and by Bezout's theorem for ℙ_1^2 one obtains an upper bound

$$\ll \psi(n) := n \prod_{p \mid n} (1 + p^{-1}) \ll n \log\log n \ll \sqrt{|D|}\, \log\log |D|$$

for their number (or else the sought conclusion that Y_0(n) ⊂ X). On the other hand, from Siegel's lower bound for class numbers we derive ≫_ε |D|^{1/2−ε} for the degree of the point, which, by comparison with the above, is almost sufficient, but not quite, to obtain an upper bound for |D| and to conclude.

Nevertheless, this kind of idea plays a crucial role in Edixhoven's proof and shows the noted similarity with Pila's proof, in comparing bounds for the number of conjugate points. In the actual proof (on the assumption of the generalized Riemann hypothesis for zeta functions of imaginary quadratic fields) the intersection is in fact not taken directly with Y_0(n), but with images of X under Hecke's correspondences.

More precisely, let T_n be the correspondence on 𝔸^1 (induced by Y_0(n)) that sends an elliptic curve to the sum (as a divisor) of its quotients by its cyclic subgroups of order n;⁴² then T_n × T_n may be seen as a correspondence on 𝔸^2 (sending a point (x1, x2) to the divisor (T_n x1, T_n x2)). Take a special point (x1, x2) ∈ X corresponding to a CM field K (we know this field is the same for both x1, x2) and conductors f1, f2 with lcm(f1, f2) = f. It may be checked that the ideal class [I] of an ideal I ⊂ O_{K,f} such that O_{K,f}/I is cyclic of order m induces a Galois automorphism σ such that (σ(x1), σ(x2)) is in (T_m × T_m)X. Hence the intersection X ∩ ((T_m × T_m)X) also contains (x1, x2); in turn, it contains as well the conjugate points, shown to be ≫ h(O_{K,f}) in number. (Note that here we consider isogenies between x_i, σ(x_i) (i = 1, 2) rather than between x1, x2.) On the other hand, the intersection number X · (T_m^2 X) in ℙ_1^2 may be computed as 2dd′ψ(m)^2 (where d, d′ are the separate degrees of X) or else X is contained in T_m^2 X.
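The function ψ(n) counts the cyclic subgroups of order n of an elliptic curve, hence the degree of the correspondence T_n. The following Python sketch compares the product formula with a direct count. It assumes the standard parametrization of cyclic sublattices of index n by integer matrices [[a, b], [0, d]] with ad = n, 0 ≤ b < d and gcd(a, b, d) = 1; the function names are illustrative and not taken from [Edi98].

```python
from math import gcd


def prime_divisors(n):
    """Return the distinct prime divisors of n in increasing order."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def psi(n):
    """psi(n) = n * prod over primes p dividing n of (1 + 1/p), in exact integers."""
    value = n
    for p in prime_divisors(n):
        value = value // p * (p + 1)
    return value


def count_cyclic_subgroups(n):
    """Count matrices [[a, b], [0, d]] with a*d = n, 0 <= b < d, gcd(a, b, d) = 1."""
    count = 0
    for d in range(1, n + 1):
        if n % d:
            continue
        a = n // d
        count += sum(1 for b in range(d) if gcd(gcd(a, b), d) == 1)
    return count
```

For a prime p both functions return p + 1, which is the value entering the bound 2dd′ψ(p)^2 when m = p.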

Edixhoven proves that if this inclusion occurs for suitably large m of special form (including being a prime), then indeed X is a modular curve. The proof of this is a bit involved (see Sec. 6 of [Edi98]) and we do not comment on it in any detail. However, we point out that, like Pila's proof, it also uses the transcendental uniformization provided by the j-function (one analyzes the stabilizer in SL_2(ℝ)^2 of the inverse image of X by the map (j, j) : ℋ^2 → ℂ^2). In some sense this step can be compared to the characterization of Z^alg given in Pila's proof and represents the geometric part of the proof.

⁴² Here, of course, we think of (isomorphism classes of) elliptic curves as points of 𝔸^1 through the j-invariant.

On the other hand, if the inclusion does not occur, we find h(O_{K,f}) ≪ ψ(m)^2. Then what remains (in order to obtain a contradiction and conclude) is to find integers m as above, such that ψ(m)^2 is in fact much smaller than h(O_{K,f}). Edixhoven tries to construct such integers as suitable primes, m = p, as follows. Let p be a prime splitting in O_{K,f}, and take I as a prime ideal above p. Since h(O_{K,f}) grows like |D_{K,f}|^{1/2+o(1)}, it will suffice that p ≪ |D_{K,f}|^{1/4−δ} for some fixed δ > 0. Now, as shown in [Edi98], it is a consequence of well-known work by Lagarias-Montgomery-Odlyzko (on explicit Chebotarev theorems) that such primes exist (for large |D|) if we assume the generalized Riemann hypothesis for imaginary quadratic fields. As remarked in [Edi98], the exponent 1/4 is critical, since the existence of the required primes has been proved unconditionally if 1/4 is replaced by any larger number.

We have observed that the proofs by Pila and André exploit, in part, similar tools and structures, and maybe the present strategy is even nearer to Pila's than is André's. However, in other parts (for instance, the use of Bezout's theorem, compared to estimates for rational points) the proofs also have profound differences.

2. Some unlikely intersections beyond André-Oort

In a very recent paper [HP11], Habegger and Pila have considered other unlikely intersections, as in a conjecture of Pink [Pin05b], in the spirit of André-Oort but beyond that. We outline a few essentials of what they prove.

Let us consider S := ℂ^n as a Shimura variety and, similarly to Pink's conjectures, let us denote by S^{[2]} the union of all special subvarieties of codimension at least 2. Also, let X be an irreducible curve in ℂ^n. Pink's conjecture for this setting reads: X ∩ S^{[2]} is finite unless X is contained in a proper special subvariety of ℂ^n. In this direction, Habegger and Pila have a result under a condition on X.

To define this, let deg_i(X) be the degree of X with respect to the i-th coordinate (i.e., this is the number of points of X having a given generic x_i-coordinate). Of course, this degree may be zero: this happens if x_i is constant on X. We have the following:

Definition: The curve X is asymmetric if among those numbers deg_i(X) that are positive there are no repetitions, apart possibly from a single value which is allowed to appear at most twice.

Habegger and Pila prove the following result, which may be considered an André-Oort version of Theorem 1.3:

Theorem 4.7. ([HP11].) If X is defined over ℚ̄, is not contained in a proper special subvariety of ℂ^n, and is asymmetric, then X ∩ S^{[2]} is finite.

Without the asymmetry condition, Habegger and Pila can still prove the result, but with S^{[2]} replaced by a certain smaller union of special subvarieties of codimension at least 2. The paper also contains an analogue of Mordell-Lang in ℂ^n, where finitely generated subgroups are replaced by j-images of orbits by GL_2(ℚ)^+ (so-called Hecke orbits). To state this, let us give another definition.

Definition: Fix a finite set A of algebraic numbers a_i. A point of V is called A-special if each coordinate is either special (in the usual sense) or is in the Hecke orbit of one of the a_i (defined as j(GL_2(ℚ)^+ τ_i) where j(τ_i) = a_i). Also, an A-special subvariety is a weakly special subvariety⁴³ that contains an A-special point (i.e., any fixed coordinates are A-special).

With these definitions they have the following:

Theorem 4.8. ([HP11].) The A-special points of V are contained in finitely many A-special subvarieties contained in V.

⁴³ This is defined like special, except that the constant coordinates are not required to be special values.

3. Definability and o-minimal structures

In this note we shall offer some further detail concerning the context of the estimates by Pila and Pila-Wilkie, heavily used in the last two chapters. The estimates concerned algebraic points of bounded degree on suitable real varieties arising from the problems. In the previous chapter such real varieties were compact subanalytic in some ℝ^n, and hence globally subanalytic, namely subanalytic as subsets of ℙ_n(ℝ). (See Example 4.11 below for definitions.) In the present chapter the relevant varieties are neither bounded nor globally subanalytic, because they arise from the graph of the function j(τ) on the unbounded region ℱ. However, we observed that they may be defined by considering graphs of maps constructed from the exponential function together with subanalytic functions.

Both categories of varieties arising in this way fall into the more general notion of o-minimal structure.⁴⁴ We shall now present a formal definition of this, borrowing from [Pil09b] and [PW06] (where also other references may be found, e.g., van den Dries's book [vdD98]). See also Scanlon's Séminaire Bourbaki notes [Sca11b], Section 3, for an alternative and more extended survey presentation. It is a fact that the theorems of Pila-Wilkie [PW06] and of Pila [Pil09b], [Pil11] apply to sets in any o-minimal structure (over ℝ). See especially Pila's Theorem 4.5 above.

An o-minimal structure (over ℝ) is a sequence S = {S_1, S_2, …}, where S_n is a collection of subsets of ℝ^n, such that the following axioms are satisfied for all m, n ∈ ℕ:

1. S_n is a Boolean algebra under the usual set-theoretic operations.⁴⁵

2. S_n contains the semialgebraic subsets of ℝ^n (see Section 3.3 for a definition).⁴⁶

3. If A ∈ S_m, B ∈ S_n, then A × B ∈ S_{m+n}.

4. If n ≥ m and if A ∈ S_n, then π(A) ∈ S_m, where π : ℝ^n → ℝ^m is the projection to the first m coordinates.

5. The boundary of every set in S_1 is finite.⁴⁷

A set Z ⊂ ℝ^n is called definable in S if Z ∈ S_n. A function f : ℝ^m → ℝ^n is called definable if its graph is definable in ℝ^{m+n}. We observe that the above properties entail the following consequence:

6. The composition of definable functions is definable.

Remark 4.5.2 (i) Note also that, by properties nos. 2 and 6, any rational combination and composition of definable functions is definable. (By “composition” f ◦ g of f, g we mean the function whose graph is the projection to the first and last coordinate of the set {(x, y, z) : (x, y) ∈ Γ_g, (y, z) ∈ Γ_f}, where Γ_f, Γ_g are the graphs of f, g.) (ii) An alternative way (in fact, the original one by van den Dries) to define o-minimal structures comes from model theory. More specifically, the axioms 1–4 may be rephrased by saying that the definable sets are those which can be defined using “first-order formulas” in a language containing at least {+, ×, =, […] 0, …, g_ℓ(x) > 0} where f_i, g_j ∈ ℝ[x], x = (x_1, …, x_h); see, e.g., [BM88], Def. 1.1, or [Shi97], p. 51. It has a dimension: see [BM88], p. 14. It is a theorem of Tarski-Seidenberg that the image of a semialgebraic set by a linear projection is itself semialgebraic. It readily follows that the collection of semialgebraic sets (in any real space ℝ^n) forms an o-minimal structure (which by no. 2 above is the smallest possible one).

⁴⁴ This terminology was coined by Pillay and Steinhorn; see, e.g., [PS88].

⁴⁵ We stress that the important complementation property is sometimes very difficult to check for the structures important here.

⁴⁶ However, some authors replace this requirement by a weaker one.

⁴⁷ A sequence S as above satisfying the first four properties but possibly not the fifth one is merely called a structure. In our context such last finiteness property is fundamental.

Example 4.11 Globally subanalytic sets. We recall from the previous chapter that a set Z ⊂ ℝ^n is said to be subanalytic if it is a finite union of sets of the shape Im f_1 \ Im f_2, where f_1, f_2 : M → ℝ^n are proper real analytic maps on an analytic manifold M. See [Shi97] or [BM88]. (In [BM88] other equivalent definitions are given; see Prop. 3.13. Also, a useful description is provided by the so-called uniformization theorem: A closed subanalytic set X of a real analytic manifold M may be written as the image X = ψ(N) of a real analytic manifold N with dim N = dim X, where ψ : N → M is real analytic and proper; see [BM88], Sec. 0.1.) We also recall that Z is said to be globally subanalytic if it is subanalytic when considered as a subset of ℙ_n(ℝ). Now, the Tarski-Seidenberg theorem may be extended to this context (it is here that the mere concept of “semianalytic” would not work), providing property no. 4, whereas a theorem of Gabrielov [Gab68] was interpreted (by van den Dries) as asserting that the complement of a globally subanalytic set is also globally subanalytic. Also, it is proved that the fundamental property no. 5 is satisfied.⁴⁸ This entails that the globally subanalytic sets constitute as well an o-minimal structure, denoted ℝ_an.

Example 4.12 Exponential function. Consider the collection S_n of subsets Z of ℝ^n of the shape Z = π(f^{-1}(0)), where, for some m ≥ n, π : ℝ^m → ℝ^n is the projection on the first n coordinates and where f : ℝ^m → ℝ is an exponential polynomial, that is, f(x_1, …, x_m) = Q(x_1, …, x_m, exp(x_1), …, exp(x_m)) for some polynomial Q ∈ ℝ[y_1, …, y_{2m}]. The fundamental finiteness property no. 5 for this S_1 follows from Khovanskii's theory of fewnomials (see [Kho91], Theorem 2, Ch. 1), whereas the difficult complementation property is due to Wilkie [Wil96]. Again, it follows that the collection of the S_n forms an o-minimal structure, denoted ℝ_exp. (In view of Wilkie's result, ℝ_exp may also be described as the smallest structure containing semialgebraic sets and the graph of the exponential function on ℝ.) Note that this structure is not contained in ℝ_an, as shown, for instance, by the sets {(x, x^c) : x > 0} for irrational c > 0 and {(x, exp(−1/x)) : x > 0} ⊂ ℝ^2, which are not subanalytic at the origin, but lie in ℝ_exp. It may also be shown that ℝ_exp does not contain ℝ_an. (In fact, for instance, it is known that sin x, restricted to any real interval, is not definable in ℝ_exp; see [Bia97].)⁴⁹
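The finiteness property no. 5 can be illustrated numerically, although of course not proved, by counting sign changes on growing windows. In the Python sketch below the exponential polynomial exp(x) − 3x is a hypothetical example chosen only for illustration. Its zero count stabilizes as the window grows, whereas the zero count of sin x on an unbounded half-line keeps growing, in line with the fact that sin x on a half-line is definable in no o-minimal structure.

```python
import math


def count_sign_changes(f, left, right, steps):
    """Count sign changes of f on a uniform grid of [left, right] (a heuristic zero count)."""
    count = 0
    previous = f(left)
    for k in range(1, steps + 1):
        current = f(left + (right - left) * k / steps)
        if previous * current < 0:
            count += 1
        if current != 0:
            previous = current
    return count


def exp_polynomial(x):
    """A sample exponential polynomial Q(x, exp(x)) with Q(y1, y2) = y2 - 3*y1."""
    return math.exp(x) - 3 * x


# Zero counts on the windows [-R, R] for the exponential polynomial,
# and on [0.5, R + 0.5] for the sine function.
exp_counts = [count_sign_changes(exp_polynomial, -R, R, 4000) for R in (5, 10, 20)]
sin_counts = [count_sign_changes(math.sin, 0.5, R + 0.5, 4000) for R in (10, 20, 40)]
```

A grid count can miss zeros of even multiplicity or zeros closer together than the grid step, so this is only a plausibility check.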

Example 4.13 Subanalytic + exponential. Let us consider the sets appearing either in ℝ_an or in ℝ_exp. They generate a structure, in the sense that there is a smallest collection of subsets which contains the above ones and which satisfies the above first four properties. It is a known important fact that this structure is o-minimal, namely, the fifth finiteness property is satisfied as well. (This is due to van den Dries and Miller; see, e.g., [vdDMM94] for a proof.) This structure is denoted ℝ_{an,exp}.

As we have seen, the fact that the structure ℝ_{an,exp} of this last example is an o-minimal structure is fundamental in Pila's proofs in the André-Oort context; this depends on the fact that j(τ) is definable in this structure. We sketch a deduction of this in the following:

Example 4.14 Definability of j(τ), τ ∈ ℱ, in ℝ_{an,exp}. In this example we observe that the modular function j(τ) restricted to the fundamental domain ℱ ⊂ ℋ is definable in the structure ℝ_{an,exp} (where we view it as a function on a subset of ℝ^2 through real and imaginary parts).

⁴⁸ Both properties at nos. 4 and 5 are not satisfied by the larger structure of subanalytic sets.

⁴⁹ This may seem surprising, since 2i sin x = exp(ix) − exp(−ix); this is a warning, showing that it is fundamental to take into account only the real exponential function in ℝ_exp.

Assuming this result, as noted above, the properties 1–4 of (o-minimal) structures then immediately imply that any set defined in ℱ^2 by an equation G(j(τ1), j(τ2)) = 0, where G is a polynomial, is also definable in the same structure. It follows that the surface Z of interest in Pila's proof above is definable as well. We further note that definability of j(τ), τ ∈ ℱ, is also sufficient for the proofs of the more general theorems in [Pil11] to work.

Coming to our point, we first set τ = x + iy, q = exp(2πiτ), so q = exp(−2πy)(cos 2πx + i sin 2πx) =: q1 + iq2. In practice, we use the subanalytic structure horizontally and the exponential (+ subanalytic) one vertically. In fact (by property no. 6 and Remark 4.5.2 above), the real and imaginary parts q1, q2 of q are functions of x, y which, if restricted to {y ≥ 1/2, |x| ≤ 1}, are definable in ℝ_{an,exp}.⁵⁰ Now observe that the q-expansion of j(τ) appearing as (4.2.2) yields a function F(q1, q2) := qj(τ), real analytic in the q-disk |q| ≤ exp(−π), which corresponds to the xy-strip y ≥ 1/2, |x| ≤ 1. Therefore, this function is globally subanalytic, as a function of q1, q2 for (x, y) in the said xy-strip, and is thus definable in ℝ_an. By property no. 6, the composition of F(q1, q2) with the above functions expressing q1, q2 in terms of x, y is also definable; however, this is just qj(τ) as a function of x, y. Finally, since we work in ℝ_{an,exp}, where q is definable and nowhere zero, the quotient j(τ) = (qj(τ))/q, τ ∈ ℱ, is definable as well, as required. (Note that when τ → i∞, an essential singularity appears, so j(τ) is not globally subanalytic and not definable in the smaller structure ℝ_an.)
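The two-stage structure of this argument can be mirrored numerically. In the Python sketch below, `q_coordinates` uses only the real exponential in y and sine and cosine on a bounded range of x, and `j_invariant` then composes with a power series in q. As a stated assumption, the series is not the expansion (4.2.2) itself: the code swaps in the standard identity j = 1728 E_4^3/(E_4^3 − E_6^2), with the Eisenstein series truncated after `terms` terms. All function names are illustrative.

```python
import math


def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)


def q_coordinates(x, y):
    """Real and imaginary parts (q1, q2) of q = exp(2*pi*i*(x + i*y))."""
    modulus = math.exp(-2 * math.pi * y)
    return modulus * math.cos(2 * math.pi * x), modulus * math.sin(2 * math.pi * x)


def j_invariant(x, y, terms=60):
    """Approximate j(x + i*y) from truncated Eisenstein series in q (assumes y > 0)."""
    q = complex(*q_coordinates(x, y))
    e4 = 1 + 240 * sum(sigma(3, n) * q ** n for n in range(1, terms))
    e6 = 1 - 504 * sum(sigma(5, n) * q ** n for n in range(1, terms))
    return 1728 * e4 ** 3 / (e4 ** 3 - e6 ** 2)


def q_times_j(x, y, terms=60):
    """The function F = q * j, which stays bounded as y grows."""
    q = complex(*q_coordinates(x, y))
    return q * j_invariant(x, y, terms)
```

For instance, `j_invariant(0.0, 1.0)` approximates j(i) = 1728, and `q_times_j(0.3, 3.0)` is close to 1, the value of F at q = 0.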

The definability of j(τ), τ ∈ ℱ, in ℝ_{an,exp}, sketched in this last example, may also be deduced from the following result by Peterzil and Starchenko, important for definability of a surface relevant in Theorem 1.2 of [Pil09b] (not considered in detail here), and which certainly might be relevant for other applications in the context. To state this interesting result, we view again ℂ as ℝ^2 through real and imaginary parts:

Theorem 4.9. (Peterzil-Starchenko [PS04].) Let ℘(z, τ) denote the Weierstrass function relative to the lattice ℤτ + ℤ. For τ ∈ ℋ let D_τ := {t_1 + t_2 τ ∈ ℂ : t_1, t_2 ∈ [0, 1)}. Then the restriction of this function to the domain {(z, τ) ∈ ℂ × ℋ : τ ∈ ℱ, z ∈ D_τ} is definable in ℝ_{an,exp}.

Note that D_τ is a natural fundamental domain for ℂ/L_τ, where L_τ = τℤ + ℤ is the lattice associated to τ. The proof of this theorem involves several steps; an important one concerns a description of an elliptic curve as a quotient ℂ*/q^ℤ (as in “Tate's curve”; see [Lan73]), followed by a suitable compactification of the unit punctured disk in ℂ*. From this result and standard formulas from the theory of elliptic functions, Pila readily deduces that the functions λ(τ), j(τ) : ℱ → ℂ are also definable (see [Pil09b]).

⁵⁰ For simplicity we are partially working with complex coordinates, but the obvious expressions in terms of real and imaginary parts would transfer all the arguments to spaces of real coordinates. Note, however, that the domain taken into account here is important: if x were also allowed to vary in a whole half-line L in ℝ, this deduction would not hold. In fact, sin x, x ∈ L, is not globally subanalytic; also, as remarked above, it is not definable in ℝ_exp. Actually, we can see at once that this function cannot be definable in any o-minimal structure, since the set of zeros of sin x, x ∈ L, has infinite boundary.

Appendix A

Distribution of Rational Points on Subanalytic Surfaces

by Umberto Zannier

In the present appendix we shall offer a sketch of a proof of Pila's Theorem 3.5 (proved in [Pil04]), which estimates the number of rational points with denominator dividing N on a given compact subanalytic surface; actually, we shall consider the sharper version obtained in [Pil05], in which “dividing” is replaced by “at most.”¹ Before stating this result, let us recall from Chapter 3 the following fundamental:

Definition: For a real variety X ⊂ ℝ^n, we define X^alg to be the union of connected semialgebraic sets of positive dimension contained in X. We call X^alg the algebraic part of X. Accordingly, we denote by X^trans the complement of X^alg in X, calling it the transcendental part of X.

Example A.1 The structure of algebraic arcs on a surface may be fairly complicated, even if the surface is subanalytic. We borrow from [PW06] the following example. Take the surface in ℝ^3 determined by z = x^y, x, y ∈ [2, 3]. It contains the algebraic arcs z = x^{y0}, x ∈ [2, 3], for any given y0 ∈ ℚ ∩ [2, 3], and no other algebraic arc. (This last conclusion is easy to see; indeed, suppose that x^{y(x)} is algebraic, for an algebraic function y(x) : I → [2, 3], where I is some interval. Then the logarithmic derivative of x^{y(x)}, i.e., y′(x) log x + y(x)/x, would also be algebraic. But log x is not algebraic, whence y′(x) must vanish, so that y is a constant y0; and x^{y0} is algebraic only for rational y0.) In particular, this shows that for a subanalytic surface X, the set X^alg need not be subanalytic. (See also [Pil04], p. 208.)

As anticipated, we shall work with compact subanalytic X, whereas in the paper [PW06] Pila and Wilkie work with sets in any o-minimal structure over ℝ. Let us denote by X(ℚ, N) the set of rational points on X of denominator at most N. We have:

Theorem A.1. (Pila [Pil05], Theorem 1.1.) Let X ⊂ ℝ^n be a compact subanalytic surface and let ε > 0. There is a constant c(X, ε) such that |X^trans(ℚ, N)| ≤ c(X, ε)N^ε for all large N.
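To get a feeling for the counting function, here is a small Python experiment in the simpler setting of plane curves rather than surfaces. It is only an illustration under stated assumptions: a point counts as having denominator at most N if its coordinates admit a common denominator b ≤ N, and the two sample curves (an arc of y = 2^x, which is transcendental, and an arc of the parabola y = x^2, which is algebraic) are hypothetical choices, not examples from [Pil05]. Membership is tested exactly with rational arithmetic.

```python
from fractions import Fraction


def rational_points(N, on_curve):
    """Points (x, y) with 0 <= x <= 1, 0 <= y <= 2 and common denominator <= N on the curve."""
    points = set()
    for b in range(1, N + 1):
        for a in range(0, b + 1):
            x = Fraction(a, b)
            for c in range(0, 2 * b + 1):
                y = Fraction(c, b)
                if on_curve(x, y):
                    points.add((x, y))
    return points


def on_exponential(x, y):
    """Exact test for y = 2**x: with x = s/t in lowest terms, y > 0 and y**t == 2**s."""
    return y > 0 and y ** x.denominator == 2 ** x.numerator


def on_parabola(x, y):
    """Exact test for y = x**2."""
    return y == x * x
```

With N = 16 the transcendental arc carries only the two points (0, 1) and (1, 2), while the parabola already carries seven points, and their number keeps growing with N.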

General strategy. The proof starts by showing that the points in X(ℚ, N) lie on the intersections of X with very few algebraic hypersurfaces of suitably large degree. This step is accomplished by interpolation-extrapolation techniques, similar to the ones introduced in Bombieri-Pila's original paper [BP89].² These intersections shall be subanalytic curves, and since we are considering merely points in the transcendental part, we may restrict our attention only to those components which are not algebraic. Then, a similar technique applied to these curves in place of the original surface shows that the relevant rational points in turn lie on few algebraic curves. This should give rise to an estimate for the cardinality of the resulting finite set of points outside the algebraic arcs contained in X. However, the curves which arise in the first step shall depend on the parameter N, and therefore for the subsequent step one needs a bound which is sufficiently uniform as the curve varies. This leads to an especially careful study of the case of curves.

¹ This is achieved by a technical refinement of precisely the same method of [Pil04].

² Such arguments have common features with techniques from the theory of transcendental numbers.

Let us now provide some details of this program. In the sequel we shall denote by c_1, c_2, … positive numbers depending on fixed data (like X, but not N), sometimes explicitly indicated.

Step 1: confining the points to few algebraic hypersurfaces

Proposition A.2. Let X ⊂ ℝ^n, n ≥ 3, be a compact subanalytic surface and let ε > 0. Then there is c_1 = c_1(X, n, ε) such that for large enough d (in terms only of n, ε) the set X(ℚ, N) is contained in at most c_1 N^ε algebraic hypersurfaces of degree ≤ d.

Remark A.0.3 The condition n ≥ 3 is clearly necessary. For the purposes of the above theorem, however, this is harmless, since if a connected subanalytic surface is embedded in ℝ^2, its transcendental part is subanalytic of dimension ≤ 1.

We now present a proof of this proposition. Since X is subanalytic, there exists (uniformization theorem, [BM88], 0.1) a real analytic smooth surface S and a proper analytic map ψ = (ψ_1, …, ψ_n) : S → ℝ^n with ψ(S) = X. Since X is compact and ψ is proper, S is compact. We may cover S with finitely many sets A_l analytically homeomorphic to closed disks. We shall argue separately on each A_l, which for our purposes may then be assumed to be a disk.

The idea now goes as in the original paper [BP89] by Bombieri-Pila, and is as follows. We want to prove that several points z_1, …, z_m ∈ X(ℚ) ⊂ ℚ^n lie on an algebraic hypersurface of suitably large but controlled degree, say at most d; forgetting that the points are rational, this amounts to solving nontrivially a system of m linear equations, where the entries are the monomials of degree ≤ d evaluated at the coordinates of the z_i. Such a solution shall exist automatically only if m is less than the number $\binom{d+n}{n}$ of monomials of degree ≤ d in n variables, whereas if m is larger we need that all maximal minors (of size $\binom{d+n}{n}$) coming from the system vanish. This setting corresponds to an interpolation step.

To ensure the required vanishing of appropriate minors, we use that the points are rational: on the one hand, the Taylor expansions of the functions ψ_j parametrizing X show that the determinants are small if the points z_i lie in a disk of small radius. On the other hand, the determinants in question take rational values with not too large denominators, whence they cannot be too small unless they vanish, as wanted. This is the extrapolation step. Let us provide some details for this.
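The linear algebra behind the interpolation step can be made concrete in a toy case. The Python sketch below (all names are illustrative) builds the matrix of monomials of degree ≤ d evaluated at given rational points and computes its determinant exactly. As a hypothetical example in n = 2 variables with d = 2, six rational points on the unit circle give a vanishing determinant, whereas six points lying on no common conic give a nonzero determinant whose denominator divides (N_1 ⋯ N_D)^d.

```python
from fractions import Fraction
from itertools import product
from math import comb


def monomial_exponents(n, d):
    """Exponent vectors of all monomials of degree <= d in n variables."""
    return [e for e in product(range(d + 1), repeat=n) if sum(e) <= d]


def monomial_matrix(points, d):
    """Matrix whose rows list all monomials of degree <= d evaluated at each point."""
    exponents = monomial_exponents(len(points[0]), d)
    rows = []
    for point in points:
        row = []
        for e in exponents:
            value = Fraction(1)
            for coordinate, power in zip(point, e):
                value *= Fraction(coordinate) ** power
            row.append(value)
        rows.append(row)
    return rows


def determinant(matrix):
    """Exact determinant of a square matrix of Fractions by Gaussian elimination."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


# Six rational points on the circle x^2 + y^2 = 1 (a hypersurface of degree 2).
circle_points = [(1, 0), (0, 1), (-1, 0), (0, -1),
                 (Fraction(3, 5), Fraction(4, 5)), (Fraction(5, 13), Fraction(12, 13))]
# Six rational points on no common conic; their denominators are 1, 1, 1, 1, 2, 3.
generic_points = [(0, 0), (1, 0), (0, 1), (1, 1),
                  (Fraction(1, 2), Fraction(3, 2)), (Fraction(2, 3), Fraction(5, 3))]
```

Here `comb(2 + 2, 2)` equals 6, the number of monomials, so both matrices are square; in the proof the same dichotomy (vanishing, or absolute value at least 1 after clearing denominators) is applied with much larger D.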

A determinantal-interpolation lemma. We shall now prove the needed estimate for the determinants. In the papers [BP89] and [Pil05] this was done by appealing to generalizations of the classical mean-value theorem in calculus, going back to the nineteenth century. Here we adopt the approach in [Pil04] (see Lemma 3.1), in a simplified version sufficient for the present purposes.

Lemma A.3. Let D ≥ 10, and let φ_1, …, φ_D : ℝ^2 → ℝ be functions, C^∞ on a neighborhood of a compact convex set U. There exists a constant c_2 = c_2(U, φ_1, …, φ_D) such that if z_1, …, z_D ∈ U satisfy sup_{i,j} |z_i − z_j| ≤ r ≤ 1, then

$$|\det \varphi_i(z_j)| \le c_2\, r^{D^{3/2}/4}.$$

Proof. Set ζ := z − z_1 and ζ_i := z_i − z_1, where z is a variable. We choose an integer b such that $\binom{b+2}{2}$ ≤ D and expand the φ_i in a Taylor series around z_1 up to order b, viewing the φ_i as functions in one variable restricted to the line through z_1 and z_j, contained in U. We can then write φ_i(z_j) = p_i(ζ_j) + θ_ij. Here p_i is a certain polynomial of degree ≤ b, with coefficients of the shape (1/a!) ∂^a φ_i(z_1) (a = (a_1, a_2), |a| := a_1 + a_2 ≤ b), and |θ_ij| ≤ (b + 2) r^{b+1} M, where M = sup_{z∈U, |a|=b+1} (1/a!) |∂^a φ_i(z)|.

We can view the matrix (φ_i(z_j)) as having rows (φ_i(z_1), …, φ_i(z_D)), for i = 1, …, D. Using that the determinant is multilinear-alternating (in the row vectors), we can expand det(φ_i(z_j)) into an alternating sum of determinants of D × D matrices having some rows of the shape (p_i(ζ_j))_j and the remaining rows of the shape (θ_ij)_j (where the index i refers to the row, and where j runs through {1, …, D}). In a typical such determinant, let h be the number of rows of the first type (so there are D − h rows of the second type).

Note that if h > $\binom{b+2}{2}$, there is a nontrivial linear relation Σ_i λ_i p_i(ζ) = 0, λ_i ∈ ℝ, valid identically in ζ, where the index i runs through the h rows in question. Therefore the said h rows of the determinant are linearly dependent, and the determinant vanishes. Hence only determinants with h ≤ $\binom{b+2}{2}$ survive.

Now, expansion of these determinants together with the above estimate for the |θ_ij| (on estimating the |p_i(ζ_j)| in terms only of U, φ_1, …, φ_D) easily yields an upper bound of the sought type, with an exponent for r given by (D − $\binom{b+2}{2}$)(b + 1); we obtain the desired result on choosing b = [√D] − 2.
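The final choice of b can be checked mechanically. The short Python sketch below (function names are illustrative) computes the exponent (D − C(b+2, 2))(b + 1) for b = [√D] − 2 and verifies, in exact integer arithmetic, that it is at least D^{3/2}/4, which is the exponent claimed in Lemma A.3. The comparison 4e ≥ D^{3/2} is tested in the equivalent squared form 16e^2 ≥ D^3 to avoid floating point.

```python
from math import comb, isqrt


def taylor_order(D):
    """The choice b = [sqrt(D)] - 2 made at the end of the proof."""
    return isqrt(D) - 2


def r_exponent(D):
    """Exponent of r obtained in the proof: (D - C(b+2, 2)) * (b + 1)."""
    b = taylor_order(D)
    return (D - comb(b + 2, 2)) * (b + 1)


def exponent_dominates(D):
    """True if the exponent is at least D**1.5 / 4, checked exactly via squares."""
    e = r_exponent(D)
    return e >= 0 and 16 * e * e >= D ** 3
```

For D = 10 one gets b = 1 and the exponent 14, comfortably above 10^{3/2}/4 ≈ 7.9; since r ≤ 1, a larger exponent only strengthens the bound.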

Remark A.0.4 A similar result of course holds for any ℝ^k in place of ℝ^2 (as in [Pil04]); this leads to a corresponding analogue of the proposition (where now it has to be assumed that n > k := dim X). Also, it is possible to quantify c_2 more precisely, with an estimate depending explicitly on the partial derivatives of the φ_i up to order b + 1, in the disk |z − z_1| ≤ r. This shall be useful in dealing with curves.

We can now complete the proof of the proposition, i.e., the first step, supposing, as we may, that $N$ is large enough. We let $A \subset \mathbb{R}^2$ be one of the finitely many above disks $A_l$. We fix a (large) integer $d$ and consider the functions $\phi_1,\dots,\phi_D$ defined as the monomials of degree $\le d$ in $\psi_1,\dots,\psi_n$; so $D = \binom{d+n}{n}$. Let $0 < r < 1$ be a parameter to be chosen later, and let us cover $A$ with at most $c_3(A)r^{-2}$ compact convex sets contained in disks of radius $\le r/2$. Let $B$ be one of these last small sets, and let $p_1,\dots,p_D \in \psi(B)\cap X(\mathbb{Q},N)$, writing $p_i = (\psi_1(z_i),\dots,\psi_n(z_i))$. Then Lemma A.3 implies that $|\det(\phi_i(z_j))| \le c_2\,r^{4^{-1}D^{3/2}}$. On the other hand, if $p_i$ has denominator $N_i \le N$, then $N_i^d$ is a common denominator for $\phi_1(z_i),\dots,\phi_D(z_i)$. Hence, the determinant in question is a rational number with denominator at most $(N_1\cdots N_D)^d \le N^{dD}$. Consequently, either the determinant vanishes, or$^3$

$1 \le N^{dD}\,c_2\,r^{4^{-1}D^{3/2}}$.

Let us then choose $r = \frac{1}{2}\,c_2^{-4D^{-3/2}}\,N^{-4d/\sqrt{D}}$. Since $n \ge 3$, we have $D \gg_n d^3$; so, if $d$ has been chosen large enough (in terms only of $n$), for large $N$ we shall have $0 < r < 1$, making this choice indeed legitimate. Since $N^{dD}\,c_2\,r^{4^{-1}D^{3/2}} < 1$, we conclude that any determinant as above vanishes. This means that the matrix with rows $(\phi_1(\xi),\dots,\phi_D(\xi))$, where $\xi$ runs through $B\cap\psi^{-1}(X(\mathbb{Q},N))$, does not have maximal rank. Then the space orthogonal to the row vectors is not trivial, which means that we have a nontrivial linear relation $\sum_{i=1}^{D}\alpha_i\phi_i(\xi) = 0$ ($\alpha_i$ constants not all zero) for all $\xi \in B\cap\psi^{-1}(X(\mathbb{Q},N))$. In view of the definition of the $\phi_i$, this yields a hypersurface of degree $\le d$ containing $\psi(B)\cap X(\mathbb{Q},N)$.

Hence, $X(\mathbb{Q},N)$ is contained in at most as many hypersurfaces as there are sets $B$; we have already noted that the number of such sets can be taken $\le c_3(A_l)r^{-2}$, for each of the finitely many $A_l$ chosen at the beginning and depending only on $X$. In conclusion, recalling the present choice of $r$, we obtain at most $c_4(X,d)\,N^{8d/\sqrt{D}}$ hypersurfaces. Since $D \gg d^3$, for $d$ larger than a suitable function of $n$ and $\epsilon$, the exponent of $N$ may be made $< \epsilon$, and we obtain the conclusion of the proposition.
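To see how the exponent $8d/\sqrt{D}$ of $N$ behaves, one can tabulate it for $D = \binom{d+n}{n}$. The sketch below is illustrative only; the function names and the brute-force search bound `d_max` are assumptions introduced here, not part of the argument.

```python
from math import comb, sqrt


def hypersurface_exponent(d: int, n: int) -> float:
    """Exponent 8d / sqrt(D) of N, where D = C(d+n, n)."""
    D = comb(d + n, n)
    return 8 * d / sqrt(D)


def smallest_degree(eps: float, n: int, d_max: int = 10**6) -> int:
    """Least d <= d_max with 8d / sqrt(D) < eps, found by brute-force search."""
    for d in range(1, d_max + 1):
        if hypersurface_exponent(d, n) < eps:
            return d
    raise ValueError("no degree found below d_max")
```

For $n = 2$ one has $D = (d+1)(d+2)/2$, so the exponent tends to $8\sqrt{2}$ rather than to $0$; this is consistent with the assumption $n \ge 3$ made in the proof.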

3 We are applying what Masser once called the fundamental theorem of transcendence: A positive integer is ≥ 1.


Step 2: Analysis of the intersections of X with hypersurfaces. As already mentioned in the above "Strategy," the proof of Theorem A.1 proceeds, by descent with respect to dimensions, similarly to the above Proposition A.2, replacing $X$ with the hypersurface sections of $X$ that arise from the proposition itself; these intersections shall have dimension 1 on $X^{trans}$, so indeed at this stage we should achieve finiteness, with explicit bounds. However, the obstacle appears that the hypersurface sections depend on $N$. Therefore, we shall need some uniformity of the bounds. In turn, this uniformity can be achieved only at the cost of controlling a priori the derivatives of the relevant functions that arise. To do this, it shall then be necessary to decompose the said hypersurface sections into smaller subanalytic pieces, to be worked out separately.

To carry all of this out, let us start with a first uniformity issue, dealt with by the following fundamental theorem of Gabrielov. To state it we denote, here and in the sequel, by cc(Z) the number (possibly infinite) of connected components of Z:

Theorem A.4. (Gabrielov, see [BM88], 3.14.) Let p : N × Y → Y be the second projection, where N, Y are real analytic manifolds. Let X ⊂ N × Y be relatively compact and subanalytic. There exists a number $c_5(N, Y, X)$ such that for all y ∈ Y we have $cc(X \cap p^{-1}(y)) \le c_5$.

To go on, let us first illustrate the procedure in the simpler case when we start with $X$ a compact subanalytic curve in $\mathbb{R}^n$, and again we want to estimate $|X(\mathbb{Q},N)|$; the singular points on $X$ form a finite set (by compactness, since $X$ is the image of a manifold by an analytic map); so we may remove them and work with the finitely many connected curves that remain. Hence we may assume that $X$ is compact-connected subanalytic and smooth (at nonboundary points). For $n > 2$, let us consider projections $\pi_s$ on pairs $s = \{i < j\}$ of coordinates. If all images $\pi_s(X)$ are semialgebraic, then $X$ is semialgebraic and again $X^{trans}$ is empty. Otherwise, some such projection $\pi$ has a nonsemialgebraic image. By Gabrielov's theorem, $\pi : X \to \pi(X)$ has finite fibers of uniformly bounded cardinality. It thus suffices to prove the sought estimate assuming that $X \subset \mathbb{R}^2$ is nonsemialgebraic. Note that, since $X$ is smooth and connected, the intersection with an algebraic curve is finite (otherwise, by analytic continuation, the whole $X$ would be semialgebraic).

Now, the analogue for curves of Proposition A.2 would again hold for $n \ge 2$, whereas for $n = 1$ we would have $X^{trans} = \emptyset$. Then, viewing the space of curves of degree $d$ as a (compact) projective space, we see by compactness that the intersection with algebraic curves of bounded degrees would have uniformly bounded cardinality. (This compactness argument briefly appears in [BP89], Theorem 1, and may be justified, e.g., as in the proof of Lemma 3.7; this uniformity could also be deduced from Gabrielov's theorem.) Therefore, we obtain no more points than a constant multiple of the number of said algebraic curves, and this concludes the analysis of curves.


For surfaces the principles are similar, but rather more complicated, and we shall be brief. First of all, we use the uniformization theorem as in the opening arguments to assume that $X = \psi(S)$ for a compact connected real manifold $S$ of dimension 2 and a proper real analytic map $\psi : S \to \mathbb{R}^n$. Also, we may assume that $n \ge 3$ and that some projection onto three coordinates sends $X$ to a set not contained in any algebraic hypersurface: otherwise $X^{trans}$ has dimension $\le 1$ and we fall back on the case of curves. We now apply Proposition A.2 to conclude that $X(\mathbb{Q},N)$ is contained in at most $c_1N^{\epsilon}$ algebraic hypersurfaces of large enough prescribed degree $d > d_0(n,\epsilon)$. Let $\Upsilon$ denote one of these hypersurface sections, noting it is subanalytic, and let $V = \psi^{-1}(\Upsilon)$. Each such $V$ is semianalytic$^4$ of dimension $\le 1$ with a number of connected components bounded in terms of $d$ by Gabrielov's theorem.$^5$


4 The concept of semianalytic set is similar to that of semialgebraic set, on replacing polynomial functions locally by real-analytic functions; however, it does not share the same good properties, and this led to the consideration of subanalytic sets. Semianalytic sets are subanalytic; see [Shi97] or [BM88], where this concept is used to define subanalytic.

5 Here and in the sequel, the point is to view the space of hypersurfaces H = 0 of given degree d as a compact real


We are reduced to estimating $|\Upsilon(\mathbb{Q},N)|$, that is, replacing $X$ with $\Upsilon$ and $S$ with $V$, so decreasing the dimension. As remarked above, for this task we shall need a uniform analogue of Proposition A.2 in dimension 1. Let us state this result now:

Proposition A.5. ([Pil05], Theorem 1.2.) Let $\epsilon > 0$. There are integers $d$, $D$ and a constant $c_5$, depending only on $\epsilon$, with the following property. Let $L \ge N^{-2}$ and let $I$ be a closed interval of length $|I| \le L$. Suppose $f \in C^D(I)$ has $|f'| \le 1$ on $I$ and $f^{(D)}$ nowhere vanishing in $I$. Then, denoting by $\Gamma$ the graph of $f$, $\Gamma(\mathbb{Q},N)$ is contained in at most $c_5(LN^3)^{\epsilon}$ algebraic curves of degree $d$.


We postpone a sketch of the proof of this, which refines the above arguments for Proposition A.2, similarly to [BP89]. At the moment we show how to deduce Theorem A.1 from this result, on decomposing $V$ in a way suitable for its application. Let us fix for the moment a coordinate plane $\Pi \subset \mathbb{R}^n$, with orthogonal projection $\pi : \mathbb{R}^n \to \Pi$ on two of the standard coordinates, denoted $(u, v)$. We define:


- $V_s$: the set of singular points of $V$, which has dimension $\le 0$, and we put $V_{ns} = V \setminus V_s$.
- $V_u$: the set of points $p \in V_{ns}$, where the projection $\pi\circ\psi$ has indeterminate slope; i.e., if $t$ is a local parameter at $p$, we have $du/dt = dv/dt = 0$ at $p$. This also says that $V_u$ is subanalytic.
- $V_a$: the subset of $p \in V_{ns}\setminus V_u$, where the slope $du/dv \in \{0, \pm 1, \infty\}$. Again, $V_a$ is subanalytic.

At the remaining points, $\pi(\psi(V))$ is a graph with respect to both coordinate axes; it is especially crucial that the slope shall be $\le 1$ with respect to a suitable one of the axes (which is clearly relevant for the last proposition). We also define:

- $V_b$: the set of $p \in V_{ns}\setminus V_u\setminus V_a$, where the $D$-th derivative relative to one of the axes vanishes.
- $V_c$: the set of remaining points of $V$.

Note, for instance, the relation of $V_a$, $V_b$ with the hypotheses of the proposition.

Now we show how to achieve Theorem A.1. The number of connected components of each such set is easily seen to be bounded by Gabrielov's theorem (see last footnote). So it suffices to work in each single connected component. The point is that if we work in $\psi(V_c)$, then we may project down to $\Pi$ to work with the rational points on a graph of an analytic function $f$ having slope $|f'| \le 1$ and $f^{(D)} \ne 0$. Then Proposition A.5 (with $L$ some number dependent only on $X$) confines our points to at most $c_6N^{\epsilon}$ curves of degree $d$. Hence, if we also assume that $f$ is transcendental, the intersections of these algebraic curves with $\Gamma$ are finite sets of cardinality bounded only in terms of $d$, $X$, again by Gabrielov's Theorem A.4. This takes care of the components of the sets $V_c$ with transcendental projection. As to the other sets $V_u$, $V_a$, $V_b$, we note that their very definition ensures that they have semialgebraic projections. This does not itself ensure that they are not counted in $X^{trans}$, but is a first indication in this sense.

To take this into full account, we now repeat this procedure for all $\binom{n}{2}$ coordinate planes $\Pi$; in each case we shall obtain a corresponding subdivision into sets now denoted $V_\theta^\Pi$, where $\theta$ may be either $u$, $a$, $b$ or $c$. To put together these data, we may consider a global subdivision, namely expressing $V_{ns} = \bigcap_\Pi\bigl(\bigcup_{\theta\in\{u,a,b,c\}} V_\theta^\Pi\bigr)$ as a union of $4^{\binom{n}{2}}$ sets $W$ of the shape $W = \bigcap_\Pi V_{\theta_\Pi}^\Pi$, where each $\theta_\Pi \in \{u, a, b, c\}$. Once more, the number of connected components of each set $W$ is uniformly bounded (for given $d$) by Gabrielov's theorem, so we may work in individual connected components. It is now important to note that at this stage we may disregard all components (of any $W$) arising from the intersections (as $\Pi$ varies) of sets $V_\theta^\Pi$, where either $\theta \in \{u, a, b\}$ or where $\theta = c$ but the projection to $\Pi$ has semialgebraic image: in fact, these components must lie in $X^{alg}$.

projective space $\mathbb{P}_M(\mathbb{R})$, through the coefficients of H. See especially [Pil04], Sec. 6, for details of such applications of Gabrielov's theorem.

On the other hand, if a component has a projection to some $\Pi$ with transcendental image, then Proposition A.5 applies as explained just above. This yields the required bound and completes the proof of Theorem A.1. (See [Pil05] and especially [Pil04], Sections 5, 6, for full detail about this decomposition of $V$.)

Curves: sketch of proof of Proposition A.5. Here we follow [Pil05]. A first point is to carry out the proof of an explicit analogue of Lemma A.3 for curves, actually for 1-dimensional graphs, that is, with $\mathbb{R}$ in place of $\mathbb{R}^2$, an interval $I$ in place of $U$, and with the $\phi_i$ taken as the monomials of degree $\le d$ in $x$, $f(x)$. As in Remark A.0.4, one can take care of the constant $c_2$ which appears, estimating it in terms of the derivatives of $f$ on $I$. In turn, this lemma may be applied to deduce, precisely as above, a suitable explicit analogue (for curves) of Proposition A.2, now with $d$ equal to a certain absolute constant and $c_1$ explicitly expressed in terms of the derivatives of functions parametrizing the curve. We skip these details and directly quote a result from [Pil05], where, as already mentioned, a mean-value theorem is used in place of Lemma A.3. (This has some effect on the precise estimates, but also Lemma A.3 would suffice for the present purposes.) The final estimate is as follows, where we slightly change Pila's notation.

Let $|I| \le L$, $LN^2 \ge 1$, $k \ge 1$. For $f \in C^k(I)$, set (this is Pila's $A_{L,k}^{1/k}$):

$B_{L,k}(f) := \max\Bigl\{1,\ \max_{1\le\kappa\le k}\ \sup_{t\in I}\Bigl(\frac{L^{\kappa-1}|f^{(\kappa)}(t)|}{\kappa!}\Bigr)^{1/\kappa}\Bigr\}$. (A.0.1)

Lemma A.6. ([Pil05], Cor. 2.5.) Let $f$ be as above, with graph $\Gamma$ and $|f'| \le 1$ on $I$, and let $D = \binom{d+2}{2}$. Then $\Gamma(\mathbb{Q},N)$ is contained in the union of at most $6(LN^3)^{\frac{8}{3(d+3)}}B_{L,D-1}(f)$ real algebraic curves of degree $\le d$.


We still have to eliminate the dependence on the derivatives of $f$, which appears through $B_{L,D-1}(f)$. The idea, appearing in [BP89], is that intervals where some derivative is large are short and few. This is the content of the following lemma, where we keep the previous notation.

Lemma A.7. ([BP89], Lemma 7.) Suppose $A \ge 1$, $|I| \le L$. Let $g \in C^k(I)$ be such that $|g'(x)| \le 1$ and $|g^{(\kappa)}(x)| \le \kappa!\,A^{\kappa/k}L^{1-\kappa}$ for $1 \le \kappa < k$ and for all $x \in I$. Suppose also that $|g^{(k)}(x)| \ge k!\,AL^{1-k}$ for all $x \in I$. Then $|I| \le 2A^{-1/k}L$.

Proof. Writing $I = [a, b]$, by the usual mean-value theorem we have, for some $\xi \in I$, $g(b) - g(a) = \sum_{\kappa=1}^{k-1}\frac{g^{(\kappa)}(a)}{\kappa!}(b-a)^\kappa + \frac{g^{(k)}(\xi)}{k!}(b-a)^k$. Since $|g(b) - g(a)| \le \sup_{\eta\in I}|g'(\eta)|\,|b-a| \le L$, we have $|I|^kAL^{1-k} \le \sum_{\kappa=1}^{k-1}|I|^\kappa A^{\kappa/k}L^{1-\kappa} + L$. Setting $\sigma := (|I|/L)A^{1/k}$ and dividing by $L$, this reads $\sigma^k \le \sum_{\kappa=1}^{k-1}\sigma^\kappa + 1$, whence $\sigma \le 2$, as claimed.

We now present a proof of Proposition A.5, following [Pil05]; we use a recursion argument, again introduced in [BP89]. As above, we choose $d$ large enough so that $\alpha := \frac{8}{3(d+3)} < \epsilon$, and we set $D := \binom{d+2}{2}$.

Let $g$ be a function satisfying the conditions stated for $f$ in Proposition A.5, with graph $\Gamma_g$, and define $G_g$ as the minimum number of curves of degree $\le d$ that contain $\Gamma_g(\mathbb{Q},N)$. Further, let $G(L) = G(d,N,L)$ be the maximum of $G_g$ over all such $g$ (where $LN^2 \ge 1$ as before).

For such a function $g$, we fix a number $A > 1$, to be conveniently chosen later. We are assuming that $g^{(D)}$ never vanishes in $I$, so an equation $g^{(\kappa)}(x) = c$, for a positive $\kappa \le D-1$, has at most $D - \kappa$ solutions in $I$. Hence, taking $c = \pm\kappa!\,A^{\kappa/(D-1)}L^{1-\kappa}$, we see that $I$ may be divided into at most $1 + 2\sum_{\kappa=1}^{D-1}(D-\kappa) \le D^2$ subintervals $I_\nu$ such that, for each of them and each $1 \le \kappa \le D-1$, either (i) or (ii) holds:

(i) $|g^{(\kappa)}(x)| \le \kappa!\,A^{\kappa/(D-1)}L^{1-\kappa}$ for all $x \in I_\nu$;

(ii) $|g^{(\kappa)}(x)| \ge \kappa!\,A^{\kappa/(D-1)}L^{1-\kappa}$ for all $x \in I_\nu$.
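Two elementary facts used in this part of the argument, the count $1 + 2\sum_{\kappa=1}^{D-1}(D-\kappa) \le D^2$ of subintervals and the implication "$\sigma^k \le \sum_{\kappa=1}^{k-1}\sigma^\kappa + 1$ forces $\sigma \le 2$" from the proof of Lemma A.7, are easy to test numerically. The following sketch uses hypothetical helper names and only samples a few values; it is meant as a plausibility check.

```python
def subinterval_bound(D: int) -> int:
    """The count 1 + 2 * sum_{kappa=1}^{D-1} (D - kappa) of subintervals."""
    return 1 + 2 * sum(D - kappa for kappa in range(1, D))


def sigma_is_admissible(sigma: float, k: int) -> bool:
    """Test the inequality sigma**k <= sum_{kappa=1}^{k-1} sigma**kappa + 1."""
    return sigma ** k <= sum(sigma ** kappa for kappa in range(1, k)) + 1
```

In the sampled range, the value $\sigma = 2$ already violates the inequality (the right-hand side is then $2^k - 1$), so the inequality in fact forces $\sigma < 2$.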

Suppose now that $I_\nu$ is such that alternative (i) holds for all $\kappa \le D-1$; then, by definition (A.0.1), we have $B_{L,D-1} \le A^{1/(D-1)}$; hence, we may apply Lemma A.6 to deduce that the points in question for this interval lie on at most $6(LN^3)^\alpha A^{1/(D-1)}$ real algebraic curves of degree $d$. If instead $I_\nu$ is such that (ii) holds for some $\kappa \le D-1$, let $\kappa$ be minimum. (Necessarily we have $\kappa \ge 2$, because $A > 1$, whereas $|g'(x)| \le 1$.) Then, by Lemma A.7 (applied with $k = \kappa$ and with $A^{\kappa/(D-1)}$ in place of $A$), we have $|I_\nu| \le 2A^{-1/(D-1)}L$. Therefore, the function $G(L)$ satisfies the following recurrence relation, provided $L \ge N^{-2}$:

$G(L) \le 6D^2A^{1/(D-1)}N^{3\alpha}L^\alpha + D^2G(\lambda L)$, (A.0.2)

where we have set $\lambda := 2A^{-1/(D-1)}$. Let us now choose $A := \bigl(2(2D^2)^{\frac{3(D-1)}{4d}}\bigr)^{D-1} > 1$. Since $D - 1 = d(d+3)/2$, the exponent $\frac{3(D-1)}{4d}$ equals $1/\alpha$, and with this choice we find $\lambda = 2\bigl(2(2D^2)^{\frac{3(D-1)}{4d}}\bigr)^{-1} = (1/2D^2)^{1/\alpha} < 1$.

Let us then find $n > 0$ so that $\lambda^n < \frac{1}{LN^2} \le \lambda^{n-1}$. Then we have $\lambda^{n-1}L \ge 1/N^2$. In particular, we may iterate (A.0.2), $n - 1$ times, to obtain

$G(L) \le 6D^2A^{1/(D-1)}(LN^3)^\alpha\bigl(1 + D^2\lambda^\alpha + \dots + (D^2\lambda^\alpha)^{n-1}\bigr) + D^{2n}G(\lambda^nL)$.

Now, an interval of length $< 1/N^2$ cannot contain two rational points with denominator at most $N$, so, since $\lambda^nL < 1/N^2$, we have $G(\lambda^nL) \le 1$. Taking also into account that $D^2\lambda^\alpha = 1/2$, we thus find

$G(L) \le 12D^2A^{1/(D-1)}(LN^3)^\alpha + D^{2n} \le 12D^2A^{1/(D-1)}(LN^3)^\alpha + 2^{-n}\lambda^{-\alpha}(LN^2)^\alpha \le 28D^2(2D^2)^{\frac{3(D-1)}{4d}}(LN^3)^{\frac{8}{3(d+3)}}$,

as required.

Final Remarks

(i) In the paper [PW06], dealing with arbitrary dimensions (and o-minimal structures other than $\mathbb{R}_{an}$), the recursion method that we have seen in the final argument of the proof of Proposition A.5 does not apply as it stands. In fact, in dimension 3 or more, the above dimension descent does not immediately lead to "intervals." Then this is replaced by several reparametrizations, which control the size of derivatives. A reparametrization of the graph of a given definable function $F$ amounts to replacing $F$ with a composition $F\circ\psi$ with suitable functions $\psi$, with the purpose of controlling the derivatives $(F\circ\psi)^{(\alpha)}$. More precisely, we have the following general definition (as phrased in [Sca11b]): For a definable (in some o-minimal structure) $k$-dimensional set $X \subset \mathbb{R}^n$, we say that $\phi = (\phi_1,\dots,\phi_n) : (0,1)^k \to X$ is a partial $r$-parametrization of $X$ if $\phi$ is definable and

$\sup_i\ \sup_{|\alpha|\le r}\ \Bigl|\frac{\partial^{|\alpha|}\phi_i}{\partial x_1^{\alpha_1}\cdots\partial x_k^{\alpha_k}}(x)\Bigr| \le 1$, for all $x \in (0,1)^k$.

By an $r$-parametrization of $X$ we mean a finite set of partial $r$-parametrizations of $X$ for which $X$ is covered by the union of the respective images of these maps. Such reparametrizations then replace the above recursive subdivisions of intervals (which may be read in terms of linear reparametrizations). Some basic ideas for the existence of suitable reparametrizations are already apparent in what we have seen above, in dimension 1: for instance, a graph of a function has locally slope $\le 1$ with respect to one of the axes. In general, it may be proved that if $X \subset [-1,1]^n$ is definable and if $r$ is a positive integer, then $X$ admits an $r$-parametrization. This result extends work of Yomdin and Gromov from the context of semialgebraic sets to o-minimal structures. (See [PW06], Sections 2–5, especially Theorems 2.3, 2.5, relying on the simple but crucial Lemma 3.1, and the references.)

(ii) Before the paper [BP89], the study of rational points on smooth varieties appeared, for instance, in works of V. Jarník [Jar26] (see Remark 1.1.4), H.P.F. Swinnerton-Dyer (for curves), and W.M. Schmidt (see [BP89] or [PW06] for references). For the counting of rational points on the graph of a transcendental real-analytic function, an estimate as in Theorem 1 of [BP89] has been carried out by Masser with "transcendence" methods, as shown in Appendix F below. For the case of algebraic varieties, a highly powerful p-adic analogue of the methods of [BP89] was carried out by D.R. Heath-Brown in a series of papers starting with [HB02]. Also in view of the ultrametric nature of the absolute value in question, this method often led to quantitative improvements; it inspired a vast subsequent literature, on which we do not pause here. To my knowledge, an application of such a p-adic version of the methods to a transcendental context has not yet been studied.
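The bookkeeping in the recursion (A.0.2) of the proof of Proposition A.5 rests on three small facts: the exponent $3(D-1)/(4d)$ equals $1/\alpha$, the choice of $\lambda$ gives $D^2\lambda^\alpha = 1/2$, and two distinct rationals with denominators at most $N$ differ by at least $1/N^2$. The following sketch checks them for small parameters; the function names are hypothetical, and floating point is used for $\lambda$, so that identity is verified only approximately.

```python
from fractions import Fraction
from math import comb


def recursion_parameters(d: int):
    """Return (D, alpha, 1/alpha as written in the choice of A, lambda)."""
    D = comb(d + 2, 2)
    alpha = Fraction(8, 3 * (d + 3))
    inv_alpha = Fraction(3 * (D - 1), 4 * d)      # exponent appearing in A
    lam = (1.0 / (2 * D * D)) ** float(inv_alpha)  # lambda = (1/(2 D^2))^(1/alpha)
    return D, alpha, inv_alpha, lam


def min_gap(N: int) -> Fraction:
    """Smallest distance between distinct rationals in [0, 1] with denominator <= N."""
    pts = sorted({Fraction(a, b) for b in range(1, N + 1) for a in range(b + 1)})
    return min(y - x for x, y in zip(pts, pts[1:]))
```

The gap check is restricted to $[0, 1]$, which suffices since translating by an integer does not change denominators.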

Appendix B

Uniformity in Unlikely Intersections: An Example for Lines in Three Dimensions

by David Masser

For each complex number $\tau$, the points $(x, x-1, x-\tau)$ ($x \ne 0, 1, \tau$) parametrize a line $L(\tau)$ in $\mathbb{G}_m^3$. It is easy to see that $L(\tau)$ does not lie in any proper algebraic subgroup, or even a translate of it, provided $\tau \ne 0, 1$. Thus, in this case the set $L(\tau)^{(1)}$ of points lying in the union of all one-dimensional algebraic subgroups is at most finite, by Maurin's Theorem [Mau08] or Theorem 2 of [BMZ99], provided $\tau$ is algebraic. But even if $\tau$ is transcendental the same conclusion holds, for example by the work of [BMZ08b]. That is, there are at most finitely many $x \ne 0, 1, \tau$ for which there are two independent relations:

$x^{a_1}(x-1)^{a_2}(x-\tau)^{a_3} = x^{b_1}(x-1)^{b_2}(x-\tau)^{b_3} = 1$. (B.0.1)

Assuming Zilber's conjecture, we prove here that the cardinality $|L(\tau)^{(1)}|$ is bounded above by an absolute constant, so independently of the complex $\tau \ne 0, 1$. Already Zilber in [Zil02] had observed that certain uniformity versions of his conjecture follow automatically from his original conjecture. I thank him for valuable correspondence on this issue.

Let $x$, $y$ correspond to points of $L(\tau)^{(1)}$, not necessarily different. Then there are four independent relations among the coordinates of

$(z_1, z_2, z_3, z_4, z_5, z_6) = (x, x-1, x-\tau, y, y-1, y-\tau)$ (B.0.2)

in $\mathbb{G}_m^6$. Eliminating $\tau$, we see that this point lies on a linear variety $T$ defined by

$z_1 - z_2 = 1$, $z_4 - z_5 = 1$, $z_1 - z_4 = z_3 - z_6$

of dimension 3 in $\mathbb{G}_m^6$ (which is of course defined over Q). Thus (B.0.2) lies in $T^{(2)}$, the intersection of $T$ with the union of all two-dimensional algebraic subgroups. By Zilber, $T^{(2)}$ is contained in a finite union of fixed proper subgroups

$z_1^{c_1}z_2^{c_2}z_3^{c_3}z_4^{c_4}z_5^{c_5}z_6^{c_6} = 1$. (B.0.3)

Let $C$ be a bound for all the $\max\{|c_1|, |c_2|, |c_3|, |c_4|, |c_5|, |c_6|\}$ occurring here. We show that the number of $x$ in (B.0.1) is at most $3C(2C+1)^6$.


Certainly this expression counts the $x$ which satisfy an equation $x^{c_1}(x-1)^{c_2}(x-\tau)^{c_3} = 1$ with $c_1, c_2, c_3$ not all zero of absolute values at most $C$. Namely the number of triples $(c_1, c_2, c_3)$ is at most $(2C+1)^3$ and for each triple there are at most $3C$ values of $x$. If there are no other solutions, then we are finished. Otherwise fix a solution $x$ not yet counted, and let $y$ be any solution whatsoever. Then (B.0.3) holds. Since we have not yet counted $x$, the $c_4, c_5, c_6$ are not all zero. We can rewrite (B.0.3) as

$y^{c_4}(y-1)^{c_5}(y-\tau)^{c_6} = w$ (B.0.4)

for $w = x^{-c_1}(x-1)^{-c_2}(x-\tau)^{-c_3}$. As $x$, $\tau$ were fixed, this determines at most $(2C+1)^3$ values of $w$. For each such $w$, the equation (B.0.4) determines at most $3C(2C+1)^3$ values of $y$. Thus the number of $y$ is at most $3C(2C+1)^6$. But $y$ was an arbitrary solution (even $y = x$ was not excluded). So we have finished!

We may note that a complete uniformity for $L(\tau)^{(1)}$ is not to be expected. For example, when there are integers $b_1, b_2$ not both zero with $(\tau+1)^{b_1}\tau^{b_2} = 1$, then $x = \tau + 1$ lies in $L(\tau)^{(1)}$ because (B.0.1) holds with $a_1 = a_2 = 0$, $a_3 = 1$ and $b_3 = 0$. Thus one cannot uniformly bound the exponents in (B.0.1) or even the degree of the corresponding subgroup. On the other hand, from Theorem 1 of [BMZ99] these particular $\tau$ have absolute height bounded above, so maybe one can hope for some sort of uniformity in the height.
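As a concrete instance of the closing remark, one may take $\tau$ to be the golden ratio, for which $(\tau+1)\tau^{-2} = 1$; this particular $\tau$ is an illustrative choice made here, not one taken from the text. The sketch below checks in floating point that $x = \tau + 1$ then satisfies both relations in (B.0.1), and also evaluates the bound $3C(2C+1)^6$; the function name is hypothetical.

```python
from math import isclose, sqrt


def uniform_bound(C: int) -> int:
    """The bound 3C(2C+1)^6 for the number of x, in terms of the exponent bound C."""
    return 3 * C * (2 * C + 1) ** 6


# Illustrative choice: tau**2 == tau + 1, so (tau + 1)**1 * tau**(-2) == 1.
tau = (1 + sqrt(5)) / 2
x = tau + 1

# First relation of (B.0.1), exponents (a1, a2, a3) = (0, 0, 1).
first = x ** 0 * (x - 1) ** 0 * (x - tau) ** 1
# Second relation of (B.0.1), exponents (b1, b2, b3) = (1, -2, 0).
second = x ** 1 * (x - 1) ** (-2) * (x - tau) ** 0
```

Both products equal 1 up to rounding, and the two exponent vectors are visibly independent.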

Appendix C

Silverman's Bounded Height Theorem for Elliptic Curves: A Direct Proof

by David Masser

For simplicity we restrict ourselves to independent points $P_1(t),\dots,P_n(t)$ on a nonisotrivial elliptic curve $E(t)$ all defined over $K(t)$ for a number field $K$ and a single variable $t$ (a parametrization by points on a general curve would lead to minor technical complications). In this situation the bounded height theorem [Sil83] states that there is a constant $c$ such that the absolute logarithmic height $h(\tau) \le c$ for all algebraic $\tau$ such that the specialized points $P_1(\tau),\dots,P_n(\tau)$ become dependent on the specialized elliptic curve $E(\tau)$. Silverman's proof, also valid for abelian varieties parametrized by a curve, proceeded through a limit formula for Néron-Tate heights. Here we give a direct proof using simple diophantine approximation; this also makes the effectivity clear. We use freely some standard height estimations.

Suppose then that there is a dependence $a_1P_1(\tau) + \dots + a_nP_n(\tau) = 0$. Divide by $b = \max\{|a_1|,\dots,|a_n|\}$ and approximate $\bigl|\frac{a_i}{b} - \frac{p_i}{q}\bigr| \le q^{-1}X^{-1/n}$ in the usual way with $q \le X$. With $c_i = qa_i - bp_i$ we get

$c_1P_1(\tau) + \dots + c_nP_n(\tau) = -bQ(\tau)$, (C.0.1)

where

$Q(t) = p_1P_1(t) + \dots + p_nP_n(t)$. (C.0.2)

Take arithmetic heights $h$ and Néron-Tate heights $\hat h$ on $E(\tau)$, for definiteness with respect to some fixed coordinate. We have $\hat h(P_i(\tau)) \le h(P_i(\tau)) + Z(\tau)$, where $Z(\tau)$ is the "Zimmer constant" for $E(\tau)$. It is well known that $Z(\tau) \ll h(\tau) + 1$, where the (implied) constants throughout depend only on $P_1(t),\dots,P_n(t)$ and $E(t)$, and because also $h(P_i(\tau)) \ll h(\tau) + 1$ and $|c_i| \le bX^{-1/n}$, we get $\hat h(bQ(\tau)) \ll b^2X^{-2/n}(h(\tau)+1)$ or, better,

$\hat h(Q(\tau)) \le CX^{-2/n}(h(\tau)+1)$. (C.0.3)

If $q > 1$, it is clear that not all the $p_i = 0$, so the point $Q(t)$ is nontorsion.

We now move to functional heights $h$ and Néron-Tate heights $\hat h$ on $E(t)$. Because this is nonisotrivial it is known that points of bounded height defined over $K(t)$ form a finite set. It follows that points of zero $\hat h$ form a finite group. So $\hat h(Q(t))$ is positive and even $\hat h(Q(t)) \ge C^{-1}$.
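The approximation step "in the usual way" is Dirichlet's simultaneous approximation theorem. A minimal sketch, assuming $X = Q^n$ for an integer $Q \ge 1$ so that the condition $|c_i| \le bX^{-1/n}$ becomes the integer inequality $|c_i|\,Q \le b$; the function name and the brute-force search are illustrative choices, not taken from the text.

```python
def dirichlet_step(a, Q):
    """Find q <= X = Q**n, integers p_i, and c_i = q*a_i - b*p_i with |c_i| <= b / Q.

    Here b = max |a_i|; existence of q is guaranteed by Dirichlet's theorem.
    """
    n = len(a)
    b = max(abs(ai) for ai in a)
    if b == 0:
        raise ValueError("the a_i must not all vanish")
    X = Q ** n
    for q in range(1, X + 1):
        # p_i is the nearest integer to q * a_i / b (exact integer arithmetic).
        p = [(2 * q * ai + b) // (2 * b) for ai in a]
        c = [q * ai - b * pi for ai, pi in zip(a, p)]
        if all(abs(ci) * Q <= b for ci in c):
            return q, p, c
    raise AssertionError("unreachable by Dirichlet's theorem")
```

With this choice of the $p_i$, the index $i$ with $|a_i| = b$ always gives $p_i = \pm q \ne 0$, so the resulting combination $Q(t)$ in (C.0.2) has a nonzero coefficient.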

We now embark on a ruthless strategy of "killing Zimmer constants." For a positive integer $d$ we have $h(dQ(t)) \ge d^2\hat h(Q(t)) - Z$ for the Zimmer constant $Z$ of $E(t)$. Thus, if $d$ is large enough (independent of $\tau$), we deduce $h(dQ(t)) \ge \frac{1}{2}C^{-1}d^2$. As this is just the degree in $t$ of the coordinate of $dQ(t)$, specialization gives $h(dQ(\tau)) \ge \frac{1}{2}C^{-1}d^2h(\tau) - C^*$ for some $C^*$ depending only on $d$ and $Q(t)$. Thus $\hat h(dQ(\tau)) \ge \frac{1}{2}C^{-1}d^2h(\tau) - C^* - Z(\tau)$, so $\hat h(dQ(\tau)) \ge \frac{1}{4}C^{-1}d^2h(\tau) - 2C^*$ if $d$ is large enough, and therefore

$\hat h(Q(\tau)) \ge \frac{1}{4}C^{-1}h(\tau) - 2C^*$. (C.0.4)

Now we fix $X$ in (C.0.3) with $CX^{-2/n} \le \frac{1}{8}C^{-1}$ in (C.0.4), so that there are only finitely many possibilities for $Q(t)$, and then fix $d$. The result follows.

A similar proof works for one-parameter abelian families, say with no constant part.

For the example $n = 1$, $P_1(t) = (2, \sqrt{4-2t})$ over a quadratic extension of Q(t) on the Legendre $y^2 = x(x-1)(x-t)$, it suffices to look at $3P_1(\tau)$, which has height $5h(\tau) + O(1)$; after some calculations one finds h(τ ) < 75 when $P_1(\tau)$ is torsion.

Appendix D

Lower Bounds for Degrees of Torsion Points: The Transcendence Approach

by David Masser

We start off with a sketch of the proof of the basic result, a refinement of that of my paper "Division fields of elliptic functions," Bull. London Math. Soc. 9 (1977), 49-53. Namely, that if $P$ is a point of order $n \ge 2$ on a Weierstrass elliptic curve $E$ over the field of all algebraic numbers, then it has degree $d$ at least $c\,\frac{n}{\log n}$, with effective $c = c(E) > 0$. For the sake of brevity we skip over any detailed discussion of standard transcendence techniques such as the Siegel lemma and the Schwarz lemma. After the proof we will make some comments on the nature of $c(E)$ and also on some abelian analogues.

We use the associated Weierstrass function $\wp$ and select $u$ with $P = (\wp(u), \wp'(u))$.

Step 1. We introduce large integers $N, S, T, L$, and we would like to construct a nonzero polynomial $F$ of degree at most $L$ such that

$F(\wp(z), \wp(Nz)) = 0$ at $z = u, 2u, \dots, Su$ (D.0.1)

with multiplicity at least $T$. But it is crucial to restrict to those multiples of $u$ lying close to the period lattice, say within $\frac{1}{X}$. By the box principle this holds for $\gg \frac{S}{X^2}$ of them (assuming for the moment $X \ll S^{1/2}$); during this proof, $c$, $C$ and all implied constants are positive and depend only on $E$; further $C$ is supposed to be sufficiently large. In fact, one allows the coefficients of $F$ to be in Q(P) rather than Q. This makes a solvability condition $C\frac{S}{X^2}T \le L^2$ or

$CST \le L^2X^2$ (D.0.2)

rather than $CdST \le L^2X^2$. Then Siegel's lemma, standard estimates for high-order derivatives, and use of the Néron-Tate height give for the absolute logarithmic height $h(F)$ of the coefficient vector

$h(F) \ll d + T\log T + T\log N + L$.


Step 2. Using the Schwarz lemma we extrapolate to get slightly larger multiplicity $T' > T$ in (D.0.1). This gives first the upper bound

$\log|\alpha| \le -\frac{S}{X^2}T + cLN^2\Bigl(\frac{1}{X}\Bigr)^2 \ll -\frac{S}{X^2}T$

for any corresponding nonzero value $\alpha$ in (D.0.1), provided

$ST \ge CLN^2$. (D.0.3)

But from Liouville-type height properties we get the lower bound $\log|\alpha| \gg -d(d + T'\log T' + T'\log N + L)$. For a contradiction we need

$ST \ge Cd^2X^2$, $ST \ge CdT'X^2\mathcal{L}$, $ST \ge CdLX^2$, (D.0.4)

where $\mathcal{L}$ is something logarithmic in all the parameters.

Step 3. We get a contradiction because there are now too many zeroes. In fact, if the function $\phi$ in (D.0.1) is not identically zero, then it is an elliptic function of "order" $\ll LN^2$ and therefore can have at most this number of zeroes (so there are no sophisticated zero estimates or even simple ones with resultants). We have $\gg \frac{S}{X^2}T'$ zeroes provided $P, 2P, \dots, SP$ are distinct. This latter leads to an extra condition

$S < n$, (D.0.5)

and we get our final contradiction if

$ST' \ge CLN^2X^2$. (D.0.6)

There is a final subtlety: why is $\phi$ not identically zero? Indeed $\wp(z)$ and $\wp(Nz)$ are algebraically dependent. But the relation connecting them has degree $N^2$, so we are fine as long as

$L < N^2$. (D.0.7)
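The conditions (D.0.2) to (D.0.7) are all inequalities between monomials in $C$, $n$ and the logarithmic quantity $\mathcal{L}$, so a candidate choice of parameters can be checked mechanically by comparing exponents. The sketch below does this for the choice recorded in (D.0.8) of Step 4; the encoding of monomials as exponent triples and all helper names are assumptions made for the illustration, and fractional parts are ignored exactly as in the text.

```python
from fractions import Fraction as Fr


def mono(c, n=0, l=0):
    """Monomial C**c * n**n * Lc**l (Lc = the logarithmic quantity), as exponents."""
    return (Fr(c), Fr(n), Fr(l))


def mul(*ms):
    """Product of monomials: add exponent triples componentwise."""
    return tuple(sum(e) for e in zip(*ms))


def dominates(lhs, rhs, strict=False):
    """lhs >= rhs (or >) for large C: equal n- and Lc-exponents, compare C-exponents."""
    if lhs[1:] != rhs[1:]:
        raise ValueError("monomials differ in n or in the logarithmic factor")
    return lhs[0] > rhs[0] if strict else lhs[0] >= rhs[0]


Cc, Lc, n_ = mono(1), mono(0, 0, 1), mono(0, 1)
d = mono(-11, 1, -1)
N = mono(-4, Fr(1, 2), Fr(-1, 2))
S, T, L = mono(-1, 1), mono(-12, 1, -2), mono(-9, 1, -1)
T1, X = mono(-9, 1, -2), mono(3)   # T1 stands for T'

checks = {
    "D.0.2": dominates(mul(L, L, X, X), mul(Cc, S, T)),
    "D.0.3": dominates(mul(S, T), mul(Cc, L, N, N)),
    "D.0.4a": dominates(mul(S, T), mul(Cc, d, d, X, X)),
    "D.0.4b": dominates(mul(S, T), mul(Cc, d, T1, X, X, Lc)),
    "D.0.4c": dominates(mul(S, T), mul(Cc, d, L, X, X)),
    "D.0.5": dominates(n_, S, strict=True),
    "D.0.6": dominates(mul(S, T1), mul(Cc, L, N, N, X, X)),
    "D.0.7": dominates(mul(N, N), L, strict=True),
}
```

Several of the conditions hold with equality of exponents, which reflects how tightly the parameters are balanced.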

Now we can see the necessity for $X$; the equations (D.0.2),(D.0.3),(D.0.7) would be inconsistent without $X$.

Step 4. We now have to solve (D.0.2),(D.0.3),(D.0.4),(D.0.5),(D.0.6),(D.0.7) for the unknown parameters $N, S, T, L, T'$, a routine matter for transcendentalists. We can take, for example (ignoring fractional parts),

$d = C^{-11}\frac{n}{\mathcal{L}}$ (now $\mathcal{L} = \log n$), (D.0.8)

$N = C^{-4}\Bigl(\frac{n}{\mathcal{L}}\Bigr)^{1/2}$, $S = C^{-1}n$, $T = C^{-12}\frac{n}{\mathcal{L}^2}$, $L = C^{-9}\frac{n}{\mathcal{L}}$, $T' = C^{-9}\frac{n}{\mathcal{L}^2}$,

as well as $X = C^3$ (so that indeed $X \ll S^{1/2}$ as above), where $C = C(E)$ is sufficiently large. Of course, if $d \le C^{-11}\frac{n}{\mathcal{L}}$ instead of (D.0.8), then (D.0.4) still holds; thus we get the required result.

For our applications to families of elliptic curves we have to calculate $c(E)$ or equivalently $C(E)$. By working out the dependence in $E$ in the above proofs, one gets relatively easily

$C(E) \ll (d(E) + h(E))^\kappa$,

where $d(E)$ is the degree of the field of definition of $E$ and $h(E)$ is its logarithmic height (absolute or relative; it makes no difference), this time with an implied constant that is absolute, like $\kappa$. In the Legendre situation we have an algebraic $\lambda$, and $d$ and $d(E)$ are both around $D = [\mathrm{Q}(\lambda) : \mathrm{Q}]$. So we get from (D.0.8)

$D \gg (d(E) + h(E))^{-11\kappa}\frac{n}{\mathcal{L}} \gg (D + h(\lambda))^{-11\kappa}\sqrt{n}$,

say, still with absolute constants. Thus $n \ll D^2(D + h(\lambda))^{22\kappa}$, a weaker but sufficient version of (3.3.2).

The method extends to abelian varieties $A$ of dimension $g$ to show that a point of order $n$ has degree at least $cn^\delta$ with effective $c = c(A) > 0$ and $\delta = \delta(g) > 0$. For example, in [Mas84] we obtained any $\delta < \frac{1}{6g}$ by choosing $t = t(\delta)$ and using a polynomial in $z_1,\dots,z_t$ (the so-called many-variable trick) and abelian functions in these variables (no longer with multiplicities). But now more sophisticated zero estimates are needed. In fact, the simpler choice $t = 1$ leads to any δ
T in (E.0.2). This gives first the upper bound,

$\log|\alpha| \le -ST + cLS^2 \ll -ST$,

for any corresponding nonzero value $\alpha$ in (E.0.2), provided

$T \ge CLS$. (E.0.4)

But from Liouville-type height properties we get the lower bound $\log|\alpha| \gg -T'\log T' - L$. For a contradiction we need

$ST \ge CT'\mathcal{L}$, $ST \ge CL$, (E.0.5)

where $\mathcal{L}$ is something logarithmic in all the parameters.

Step 3. We get a final contradiction when there are now too many zeroes. Here it is possible to take, say, $T' = 2T$ and then repeat the whole argument to get $T'' = 4T$, $T''' = 8T, \dots$ and so infinite multiplicity; note the function in (E.0.2) is not identically zero, otherwise $\wp(z)$, $\wp(\beta z)$ would be algebraically dependent, which would imply that $\tau$ is quadratic.

Step 4. We now have to solve (E.0.3),(E.0.4) and (E.0.5) for the unknown parameters $T, L, S$; a simple matter. We get, for example (ignoring fractional parts), $S = L^s$, $T = L^t$ for any fixed real $s, t$ with $1 + s < t < 2$, $s + t > 1$, where $L$ is sufficiently large.

For the transcendence measure we have to stop at $T'$ and specify it explicitly using a zero estimate. In [Mas75] we took $T' = T^u$ for arbitrary $u > 1$ and used an ad-hoc argument. One can use resultants and get away with something like $T' = 16T$. This leads to $\kappa = \frac{s+t}{s}$ in (E.0.1). Making $s \to 1$ and $t \to 2$, we get any $\kappa > 3$, also as in [Mas75]. Nowadays the best possible $\kappa = 1$ is known.

Appendix F

Counting Rational Points on Analytic Curves: A Transcendence Approach

by David Masser

We sketch here a proof that if f is a transcendental function real analytic on a closed real interval I, then for any ε > 0 the number of z in I such that z and f(z) are both rational with denominator at most D is at most $cD^{\varepsilon}$, where c = c(f, ε) is independent of D. It proceeds via methods as in Appendixes D and E, and again for the sake of brevity we skip over any detailed discussion of standard transcendence techniques. However, we do not claim that c is effective, and it is anyway not so clear what this would mean.

We can clearly take I = [0, 1] without loss of generality. By compactness for any integer L there is Z = Z(f, L) such that for any nonzero polynomial F of degree at most L in each variable, the function F(z, f(z)) has at most Z different zeros in I. By linear algebra, $Z \ge L^2 + 2L$. For a large integer N we can divide I into subintervals of length $1/N$. There is $N_0 = N_0(f)$ such that if $N \ge N_0$ then f is analytic on the closed disks of radius $1/N_0$ centered at the centers of the subintervals.

Step 1. Pick $N \ge N_0$ and a large integer L, and focus on a particular subinterval. If it contains at least $S = L(L+1)/2$ such points $z = z_1, z_2, \ldots, z_S$, then we can find, using Siegel's lemma, a nonzero polynomial F of degree at most L in each variable such that F(z, f(z)) = 0 at

$$z = z_1, z_2, \ldots, z_S \tag{F.0.1}$$

without multiplicity, of course. We allow the coefficients of F to be in Q, and we get for the absolute logarithmic height h(F) of the coefficient vector $h(F) \ll L \log D$, where during this proof c, C and all implied constants are positive and depend only on f; further C is supposed sufficiently large.

Step 2. Using the Schwarz lemma we extrapolate to all such z in the subinterval. Since $|z - z_s| \le 1/N$ (s = 1, . . . , S), this gives first the upper bound $\log|\alpha| \le -S \log N + cL + cS \ll -L^2 \log N$ for the corresponding nonzero values α in (F.0.1). But from considering denominators we have the lower bound $\log|\alpha| \gg -L \log D$ and so we get at once a contradiction, provided

$$C \log D \le L \log N. \tag{F.0.2}$$
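To see where condition (F.0.2) comes from, the two bounds of Step 2 can be placed side by side. This is a sketch under stated assumptions: $c_1$ and $c_2$ are hypothetical names for the implied constants in the upper and lower bounds, and $D \ge 2$ is assumed so that $\log D > 0$.

```latex
% c_1, c_2 > 0 are hypothetical names for the implied constants.
\begin{align*}
  \log|\alpha| &\le -c_1 L^2 \log N && \text{(Schwarz lemma)},\\
  \log|\alpha| &\ge -c_2 L \log D   && \text{(denominators)}.
\end{align*}
% Both can hold for some \alpha \neq 0 only if
\[
  c_1 L^2 \log N \le c_2 L \log D
  \quad\Longleftrightarrow\quad
  L \log N \le \frac{c_2}{c_1}\,\log D .
\]
% Hence any C > c_2/c_1 with C \log D \le L \log N excludes \alpha \neq 0.
```

In other words, once (F.0.2) holds with C large enough, every value of F(z, f(z)) at the rational points in the subinterval must vanish.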

Step 3. Thus the number of points z in the subinterval is at most Z(f, L); and this holds even if there were fewer than $S = L(L+1)/2$ such z. So the total number in I is at most N Z(f, L).

Step 4. For (F.0.2) we need only take $N = D^{C/L}$, making the total at most $D^{C/L} Z(f, L)$; and the required result follows on fixing some $L \ge C/\varepsilon$.

The method would become effective if a suitable zero estimate for Z(f, L) were known. It can be extended without difficulty to a pair $(f_1, f_2)$ of algebraically independent real analytic functions in place of (z, f), even taking values of bounded degree and height bounded by D. As it stands, the method seems not to extend to analytic surfaces involving, say, two complex variables; the reason is that we lack a suitable Schwarz lemma for functions vanishing on arbitrary finite subsets of $\mathbf{C}^2$.
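The bookkeeping in Steps 3 and 4 can be written out as follows. This is a sketch only: integer parts are ignored, the finitely many small D for which $N < N_0$ are assumed to be absorbed into the constant c, and the numerical values at the end are hypothetical.

```latex
% Choice of N turning (F.0.2) into an equality (integer parts ignored):
\[
  N = D^{C/L} \;\Longrightarrow\; L \log N = C \log D .
\]
% Total count over the N subintervals of I:
\[
  \#\{z\} \;\le\; N\, Z(f,L) \;=\; D^{C/L}\, Z(f,L) \;\le\; Z(f,L)\, D^{\varepsilon}
  \qquad\text{once } L \ge C/\varepsilon .
\]
% Hypothetical numbers: C = 5 and \varepsilon = 0.1 give L = 50 and C/L = 0.1.
```

With L fixed in this way, Z(f, L) depends only on f and ε, which is why it can play the role of c(f, ε); its ineffectivity is exactly the ineffectivity mentioned in the text.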

Appendix G

Mixed Problems: Another Approach

by David Masser

Here we sketch a third proof of the mixed Manin-Mumford-André-Oort example discussed at the end of Chapter 4. For simplicity we restrict ourselves to the assertion that there are at most finitely many λ ≠ 0, 1 for which the point $P = (2, \sqrt{4 - 2\lambda})$ is torsion on the Legendre curve $E_\lambda$ defined by $y^2 = x(x-1)(x-\lambda)$ when there is complex multiplication. This proof avoids the use of the ineffective Siegel estimate as well as the appeal in [Pil09b] to definability and the consideration of quadratic points. The only possible remaining ineffectivity may be in the counting results for subanalytic surfaces; this feature is still present for the results on $(2, \sqrt{4 - 2\lambda})$ and $(3, \sqrt{18 - 6\lambda})$ in [MZ08], [MZ10b].

Let u be the logarithm of P on $E_\lambda$. With hypergeometric $\omega_1, \omega_2$ define real functions $r_1, r_2$ of λ by

$$u = r_1\omega_1 + r_2\omega_2, \tag{G.0.1}$$

so $r_1, r_2$ take rational values if P is torsion. The condition of complex multiplication can be expressed by similarly defining real functions $s_1, s_2$ by $(\omega_2/\omega_1)^2 = s_2(\omega_2/\omega_1) + s_1$, or, better,

$$\frac{\omega_2^2}{\omega_1} = s_1\omega_1 + s_2\omega_2. \tag{G.0.2}$$

Then these take rational values if $E_\lambda$ has complex multiplication. Taking complex conjugates we get $r_1, r_2, s_1, s_2$ as real analytic functions of $\lambda, \bar\lambda$, and so they parametrize a surface S in $\mathbf{R}^4$.
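The role of complex conjugation in defining $s_1, s_2$ can be made explicit. The derivation below is an illustrative sketch: it writes τ for $\omega_2/\omega_1$, and the integer coefficients b and $c_0$ of the quadratic are hypothetical names introduced here.

```latex
% \tau = \omega_2/\omega_1 is non-real, so the real numbers s_1, s_2 with
% \tau^2 = s_2 \tau + s_1 are uniquely determined:
\[
  (X - \tau)(X - \bar\tau) = X^2 - (\tau + \bar\tau)X + \tau\bar\tau
  \;\Longrightarrow\;
  s_2 = \tau + \bar\tau, \qquad s_1 = -\tau\bar\tau .
\]
% Multiplying \tau^2 = s_2\tau + s_1 by \omega_1 gives (G.0.2):
\[
  \frac{\omega_2^2}{\omega_1} = s_1\omega_1 + s_2\omega_2 .
\]
% Under complex multiplication, M\tau^2 + b\tau + c_0 = 0 with integers
% M > 0, b, c_0 (hypothetical names), hence
\[
  s_2 = -\frac{b}{M}, \qquad s_1 = -\frac{c_0}{M} \in \mathbf{Q} .
\]
```

This also shows why the denominators of $s_1, s_2$ are controlled by the leading coefficient M of the quadratic, the quantity that has to be bounded later in the argument.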

Showing that $S^{\mathrm{trans}} = S$ seems harder than before, because clearly the functions $u, \omega_2^2/\omega_1, \omega_1, \omega_2$ are not homogeneously algebraically independent. But only slightly harder: let t be a parameter on a semialgebraic curve in S. First (G.0.1) shows that u is algebraic over $\mathbf{C}(t, \omega_1, \omega_2)$. Then (G.0.2) suggests that t is algebraic over $\mathbf{C}(\omega_1, \omega_2)$, and in fact this must hold, otherwise the three functions $(\omega_2/\omega_1)^2, \omega_2/\omega_1, 1$ would be linearly dependent over C, which is clearly absurd. We deduce u algebraic over $\mathbf{C}(\omega_1, \omega_2)$, which is impossible, for example, by [Ber89].

Now thanks to [Sil83] the λ in question has bounded height, and this means that we can restrict to a compact set. So [Pil04] shows that the number of rational points on S is $\ll (NM)^{\varepsilon}$, where N is the order of P and M is the leading coefficient of the minimal integral quadratic satisfied by $\tau = \omega_2/\omega_1$. Here (and later) the implied constants depend only on ε and are effective. By [Dav97] $N \ll d^2$ for $d = [\mathbf{Q}(\lambda) : \mathbf{Q}]$.

But what about M? We may note an observation of Faisant and Philibert's "Quelques résultats de transcendance liés à l'invariant modulaire j," J. Number Theory 25 (1987), 184–200. Their lemma 3 (i) (p. 187) says $M \ll \log|j(\tau)|$ even with an explicit absolute constant, provided τ lies in the standard fundamental domain. In that case,

$$M \ll [\mathbf{Q}(j(\tau)) : \mathbf{Q}]\, h(j(\tau)) \ll d(1 + h(\lambda)) \ll d. \tag{G.0.3}$$

Unfortunately we do not know that our τ lies in the standard fundamental domain; and further they supply no proof of their lemma, except in some typewritten notes. We give a related direct argument here. Note that $\tau = \omega_2(\lambda)/\omega_1(\lambda)$ must at least lie in a compact subset of the upper half plane. So its imaginary part $\Im\tau \gg 1$. But $\Im\tau = \sqrt{D}/(2M)$ for the absolute discriminant D = D(τ), so we get

$$M \ll \sqrt{D}. \tag{G.0.4}$$
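The formula for the imaginary part behind (G.0.4) follows from solving the quadratic. The sketch below assumes the minimal integral quadratic is written with hypothetical coefficient names b and $c_0$, and that $\delta_0$ is a hypothetical lower bound for the imaginary part on the compact set.

```latex
% Minimal integral quadratic of \tau (b, c_0 are hypothetical names):
\[
  M\tau^2 + b\tau + c_0 = 0, \qquad D = 4Mc_0 - b^2 > 0 .
\]
% Root in the upper half plane:
\[
  \tau = \frac{-b + i\sqrt{D}}{2M},
  \qquad
  \Im\tau = \frac{\sqrt{D}}{2M} .
\]
% If \Im\tau \ge \delta_0 > 0 on the compact set:
\[
  M \le \frac{\sqrt{D}}{2\delta_0} \ll \sqrt{D} .
\]
```

The same expression for τ gives $M\tau = (-b + i\sqrt{D})/2$, which is the argument at which j is evaluated in the next step.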

Now j(τ) is a conjugate of j(Mτ) over Q by Theorem 5 (p. 133) of [Lan73], because Z + Zτ and Z + ZMτ are proper (Z + ZMτ)-modules. It follows that $h(j(M\tau)) = h(j(\tau)) \ll 1$. Here $j(M\tau) = j\big((-b + i\sqrt{D})/2\big)$ for some integer b, and from the Fourier expansion this has absolute value $e^{\pi\sqrt{D}} + O(1)$. It follows easily that $d\,h(j(M\tau)) \gg \sqrt{D}$. Thus,

$$d \gg \sqrt{D}, \tag{G.0.5}$$

and now from (G.0.4) we recover (G.0.3). So the number of rational points on S is $\ll d^{3\varepsilon}$; and we conclude with the usual Galois argument. Here (G.0.5) seems miraculously to make effective (and improve) Siegel's noneffective $d \ge c(\delta)D^{1/2-\delta}$; but of course we used the boundedness of h(j(τ)).
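The passage from the Fourier expansion to (G.0.5) and to the final count can be spelled out as below. This is a sketch under stated assumptions: it uses the standard q-expansion of j, the elementary inequality $\log|\beta| \le n\,h(\beta)$ for an algebraic number β of degree n, and the assumption that j(Mτ) has degree at most d over Q (it is conjugate to j(τ), which is a rational function of λ).

```latex
% Fourier expansion with q = e^{2\pi i z}:
\[
  j(z) = q^{-1} + 744 + 196884\,q + \cdots
\]
% At z = M\tau = (-b + i\sqrt{D})/2:
\[
  q = e^{-\pi i b}\, e^{-\pi\sqrt{D}}, \qquad
  |q|^{-1} = e^{\pi\sqrt{D}}, \qquad
  |j(M\tau)| = e^{\pi\sqrt{D}} + O(1) .
\]
% Height inequality with degree n \le d, and h(j(M\tau)) \ll 1:
\[
  \pi\sqrt{D} + O(1) = \log|j(M\tau)| \le d\,h(j(M\tau)) \ll d .
\]
% Final count, using N \ll d^2 and M \ll \sqrt{D} \ll d:
\[
  (NM)^{\varepsilon} \ll (d^2 \cdot d)^{\varepsilon} = d^{3\varepsilon} .
\]
```

Since ε is arbitrary, this count grows more slowly than any fixed positive power of d, which is the input the Galois argument needs.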

Bibliography

[AD99]

F. Amoroso and S. David, Le problème de Lehmer en dimension supérieure [The Lehmer problem in higher dimension], J. reine angew. Math. 513 (1999), 145–179.

[And89]

Y. André, G-functions and geometry, Aspects of Mathematics, Vieweg, 1989.

[And92]

, Mumford-Tate groups of mixed Hodge structures and the theorem of the fixed part, Compositio Math. 82 (1992), 1–24.

[And98]

, Finitude des couples d’invariants modulaires singuliers sur une courbe algébrique plane non modulaire, J. reine angew. Math. 505 (1998), 203–208.

[And01]

, Shimura varieties, subvarieties and CM points, lectures NCTS, Hsinchu (2001), no. http://math.cts.nthu.edu.tw/Mathematics/english/lecnotes/lecture.html, 20 pp.

[AR04]

N. Ailon and Z. Rudnick, Torsion points on curves and common divisors of $a^k - 1$ and $b^k - 1$, Acta Arith. 113 (2004), 31–38.

[AV92]

D. Abramovich and J. F. Voloch, Toward a proof of the Mordell-Lang conjecture in characteristic p, Internat. Math. Res. Notices 5 (1992), 103–115.

[Ax72]

J. Ax, Some topics in differential algebraic geometry I : Analytic subgroups of algebraic groups, Amer. J. Math. 94 (1972), 1195–1204.

[AZ00]

F. Amoroso and U. Zannier, A relative Dobrowolski lower bound over abelian extensions, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 29 (2000), no. 3, 711–727.

[AZ10]

, A relative Dobrowolski lower bound over abelian extensions, Bull. London Math. Soc. 3 (2010), 489–498.

[Bak90]

A. Baker, Transcendental Number Theory, second ed., Cambridge Mathematical Library, Cambridge Univ. Press, 1990.

[Bak00]

M. Baker, Torsion points on modular curves, Invent. Math. 3 (2000), 487–509.

[BB66]

W. Baily and A. Borel, Compactification of arithmetic quotients of bounded symmetric domains, Ann. Math. 84 (1966), 442–528.

[BCZ03]

Y. Bugeaud, P. Corvaja, and U. Zannier, An upper bound for the g.c.d. of $a^n - 1$ and $b^n - 1$, Math. Zeit. 243 (2003), 79–84.

[BD11]

M. Baker and L. DeMarco, Preperiodic points and unlikely intersections, Duke Math. J. (2011), to appear.


[Ber89]

D. Bertrand, Extensions de D-modules et groupes de Galois différentiels, p-adic Analysis, Springer, LNM 1454, 1989, ed. Baldassarri, Bosch, Dwork, pp. 67–85.

[Ber11]

, Special points and Poincaré bi-extensions, with an appendix by B. Edixhoven, arXiv:1104.5178v1 [math.NT] 27 Apr 2011 (2011), 11 pp.

[BG06]

E. Bombieri and W. Gubler, Heights in Diophantine Geometry, New Mathematical Monographs, vol. 4, Cambridge Univ. Press, 2006.

[BHMZ10]

E. Bombieri, P. Habegger, D.W. Masser, and U. Zannier, A note on Maurin’s theorem, Rend. Mat. Acc. Lincei 21 (2010), 251–260.

[Bia97]

R. Bianconi, Nondefinability results for expansions of the field of real numbers by the exponential function and by the restricted sine function, The Journal of Symbolic Logic 62 (1997), 1173–1178.

[Bil97]

Yu. F. Bilu, Limit distribution of small points on algebraic tori, Duke Math. J. 89 (1997), 465–476.

[BKT10]

F. Bogomolov, M. Korotiaev, and Y. Tschinkel, A Torelli Theorem for curves over finite fields, Volume for John Tate’s 80th birthday, 6 N 1, vol. V, Pure and Applied Math. Quarterly, 2010, pp. 245–294.

[BM86]

W.D. Brownawell and D.W. Masser, Vanishing sums in function fields, Math. Proc. Cambridge Philos. Soc. 100 (1986), no. 3, 427–434.

[BM88]

E. Bierstone and P. Milman, Semianalytic and subanalytic sets, Publ. Math. I.H.E.S. 67 (1988), 5–42.

[BMPZ11]

D. Bertrand, D. Masser, A. Pillay, and U. Zannier, Relative Manin-Mumford for semi-abelian surfaces, preprint (2011), 15 pp.

[BMZ99]

E. Bombieri, D.W. Masser, and U. Zannier, Intersecting a curve with algebraic subgroups of multiplicative groups, Internat. Math. Res. Notices 20 (1999), 1119–1140.

[BMZ03]

, Finiteness results for multiplicatively dependent points on complex curves, Michigan Math. J. 51 (2003), no. 3, 451–466.

[BMZ06]

, Intersecting curves and algebraic subgroups: conjectures and more results, Trans. Amer. Math. Soc. 358 (2006), 2247–2257.

[BMZ07]

, Anomalous subvarieties—structure theorems and applications, Int. Math. Res. Notices 19 (2007), 33 pp.

[BMZ08a]

, Intersecting a plane with algebraic subgroups of multiplicative groups, Ann. Sc. Norm. Super. Pisa Cl. Sci. 7 (2008), no. 5, 51–80.

[BMZ08b]

, On unlikely intersections of complex varieties with tori, Acta Arith. 133 (2008), no. 4, 309–323.

[BMZ11]

Yu. F. Bilu, D.W. Masser, and U. Zannier, An effective “Theorem of André” for CM-points on a plane curve, preprint (2011), 8 pp.

[Bor50]

A. Borel, Sous-groupes compacts maximaux des groupes de Lie (d’après Cartan, Iwasawa et Mostow), Séminaire Bourbaki, SMF, 1950, Exp. no. 33.

[Bor91]

, Linear Algebraic Groups, 2d ed., GTM 126, Springer-Verlag, 1991.

[BP89]

E. Bombieri and J. Pila, The number of integral points on arcs and ovals, Duke Math. J. 59 (1989), 337–357.

[BP10a]

D. Bertrand and A. Pillay, A Lindemann-Weierstrass theorem for semi-abelian varieties over function fields, J. Amer. Math. Soc. (2010), 491–533.

[BP10b]

M. Bolognesi and G. Pirola, Osculating spaces and diophantine equations (with an appendix by P. Corvaja and U. Zannier), Math. Nachrichten (2010), 20 pp., to appear.

[BR10]

M. Baker and R. Rumely, Potential Theory and Dynamics on the Berkovich Projective Line, Mathematical Surveys and Monographs, vol. 159, American Mathematical Society, 2010.

[BS02]

F. Beukers and C. J. Smyth, Cyclotomic points on curves, Number theory for the millennium, vol. I, A K Peters, Ltd., 2002, Urbana, 2000, pp. 67–85.

[BT08]

F. Bogomolov and Y. Tschinkel, On a theorem of Tate, Cent. Eur. J. Math. 6 (2008), no. 3, 343–350.

[BZ95]

E. Bombieri and U. Zannier, Algebraic points on subvarieties of $\mathbf{G}_m^n$, Internat. Math. Res. Notices 7 (1995), 333–347.

[Can10]

J. K. Canci, Rational periodic points for quadratic maps, Ann. Inst. Fourier 60 (2010), no. 3, 20 pp.

[Car10]

M. Carrizosa, Relative Lehmer problem on CM abelian varieties, Internat. Math. Res. Notices (2010), to appear.

[CF96]

J.W.S. Cassels and E.V. Flynn, Prolegomena to a Middlebrow Arithmetic of Curves of Genus 2, London Mathematical Society Lecture Note Series, vol. 230, Cambridge Univ. Press, 1996.

[CJ76]

J.H. Conway and A.J. Jones, Trigonometric diophantine equations (on vanishing sums of roots of unity), Acta Arith. 30 (1976), 229–240.

[CL00]

A. Chambert-Loir, Points de petite hauteur sur les variétés semi-abéliennes, Ann. Sci. École Norm. Sup. 33 (2000), no. 4, 789–821.

[CL11]

, Relations de dépendance et intersections exceptionnelles, Séminaire Bourbaki, 63e année, vol. 1032, Astérisque, 2010-2011.

[Cor07]

P. Corvaja, Problems and results on integral points on rational surfaces, Diophantine Geometry (U. Zannier, ed.), CRM series, vol. 4, Edizioni della Normale, 2007, (Pisa, 2005), pp. 123–140.

[CRZ04]

P. Corvaja, Z. Rudnick, and U. Zannier, A lower bound for periods of matrices, Comm. Math. Phys. 252 (2004), no. 1, 535–541.

[CZ98]

P. Corvaja and U. Zannier, Diophantine equations with power sums and universal Hilbert sets, Indag. Math. (N.S.) 9 (1998), no. 3, 317–332.

[CZ00]

P.B. Cohen and U. Zannier, Multiplicative dependence and bounded height, an example, Algebraic Number Theory and Diophantine Analysis (Graz, 1998), de Gruyter, Berlin, 2000.

[CZ02a]

, Fewnomials and intersections of lines with real analytic subgroups in $\mathbf{G}_m^n$, Bull. London Math. Soc. 34 (2002), 21–32.

[CZ02b]

P. Corvaja and U. Zannier, Finiteness of integral values for the ratio of two linear recurrences, Inv. Math. 149 (2002), 431–451.

[CZ03]

, On the greatest prime factor of (ab + 1)(ac + 1), Proc. Amer. Math. Soc. 131 (2003), no. 6, 1705–1709.

[CZ05]

, A lower bound for the height of a rational function at S-unit points, Monatsh. Math. 144 (2005), no. 3, 203–224.

[CZ08a]

, On the maximal order of a torsion point on a curve in $\mathbf{G}_m^n$, Atti Accad. Naz. Lincei Cl. Sci. Fis. Mat. Natur. Rend. Lincei. 19 (2008), no. 1, 73–78.

[CZ08b]

, Some cases of Vojta’s conjecture on integral points over function fields, J. Algebraic Geom. 17 (2008), no. 2, 295–333.

[CZ10a]

, Algebraic hyperbolicity of ramified covers of $\mathbf{G}_m^2$, preprint (2010), 20 pp.

[CZ10b]

, Integral points, divisibility between values of polynomials and entire curves, Advances in Math. 225 (2010), 1095–1118.

[Dav97]

S. David, Points de petite hauteur sur les courbes elliptiques, J. Number Theory. 64 (1997), 104–129.

[Del71]

P. Deligne, Travaux de Shimura, Séminaire Bourbaki, LNM, vol. 244, Springer-Verlag, 1971, pp. 123–165.

[Den92]

L. Denis, Géométrie diophantienne sur les modules de Drinfeld (Columbus, OH, 1991), The arithmetic of function fields, vol. 4, de Gruyter, Berlin, 1992, pp. 285–302.

[Duk88]

W. Duke, Hyperbolic distribution problems and half-integral weight Maass forms, Invent. Math. 92 (1988), no. 1, 73–90.

[Duk06]

, Modular functions and the uniform distribution of CM points, Math. Ann. 334 (2006), no. 2, 241–252.

[DZ00]

R. Dvornicich and U. Zannier, On sums of roots of unity, Monatsh. Math. 129 (2000), no. 2, 97–108.

[DZ02]

, Sums of roots of unity vanishing modulo a prime, Arch. Math. 79 (2002), no. 2, 104–108.

[DZ07]

, Cyclotomic diophantine problems (Hilbert irreducibility and invariant sets for polynomial maps), Duke Math. J. 139 (2007), 527–554.

[Edi98]

B. Edixhoven, Special points on the product of two modular curves, Compositio Math. 114 (1998), 315–328.

[Edi05]

, Special points on products of modular curves, Duke Math. J. 126 (2005), 325–348.

[EEHK09]

J. Ellenberg, C. Elsholtz, C. Hall, and E. Kowalski, Non-simple abelian varieties in a family: geometric and analytic approaches, J. London Math. Soc. 80 (2009), no. 2, 135–154.

[Eve99]

J.-H. Evertse, The number of solutions of linear equations in roots of unity, Acta Arith. 89 (1999), no. 1, 45–51.

[EY03]

B. Edixhoven and A. Yafaev, Subvarieties of Shimura varieties, Ann. Math. 157 (2003), no. 2, 621–645.

[FGS08]

M. Filaseta, A. Granville, and A. Schinzel, Irreducibility and Greatest Common Divisor Algorithms for Sparse Polynomials, Number Theory and Polynomials (J. McKee and C. Smyth, eds.), LNS, vol. 352, London Math. Soc., 2008, pp. 155–176.

[Gab68]

A. Gabrielov, Projections of semi-analytic sets, Funct. Anal. Appl. 2 (1968), 282–291.

[GHT11]

D. Ghioca, L.C. Hsia, and T. Tucker, Preperiodic points for families of polynomials, preprint (2011), 27 pp.

[Gor77]

P. Gordan, Ueber endliche Gruppen linearer Transformationen einer Veränderlichen, Math. Ann. 12 (1877), 23–46.

[GTZ11]

D. Ghioca, T. Tucker, and S. Zhang, A dynamical Manin-Mumford conjecture, Intern. Math. Res. Notices (2011), 11 pp., to appear.

[GV88]

A. Garcia and J. Voloch, Fermat curves over finite fields, Journal Number Theory 30 (1988), 345–356.

[Hab09a]

P. Habegger, A Bogomolov property for curves modulo algebraic subgroups, Bull. Soc. Math. France 137 (2009), no. 1, 93–125.

[Hab09b]

, Intersecting subvarieties of abelian varieties with algebraic subgroups of complementary dimension, Inv. Math. 176 (2009), 405–447.

[Hab09c]

, On the bounded height conjecture, Internat. Math. Res. Notices 5 (2009), 860–886.

[Hab11a]

, Special points on fibered powers of elliptic surfaces, preprint (2011), 23 pp.

[Hab11b]

, Torsion points on elliptic curves in Weierstrass form, preprint (2011), 23 pp.

[HB02]

D. R. Heath-Brown, The density of rational points on curves and surfaces, Ann. Math. 155 (2002), no. 2, 553–595.

[HBK00]

D. R. Heath-Brown and S. Konyagin, New bounds for Gauss sums derived from kth powers, and for Heilbronn’s exponential sum, Quarterly J. Math. 51 (2000), 221–235.

[HHK11]

B. Hutz, T. Hyde, and B. Krause, Pre-images in quadratic dynamical systems, Involve (2011), 20 pp.

[Hin88]

M. Hindry, Autour d’une conjecture de Serge Lang, Inventiones Math. 94 (1988), 575–603.

[Hir73]

F. Hirzebruch, Hilbert modular surfaces, Enseignement Math., 1973.

[HP11]

P. Habegger and J. Pila, Some unlikely intersections beyond André-Oort, Compositio Math. (2011), to appear, 28 pp., http://www.maths.ox.ac.uk/contact/details/pila.

[Hus87]

D. Husemöller, Elliptic Curves, Graduate Texts in Mathematics, Springer-Verlag, 1987.


[Jar26]

V. Jarnik, Ueber die Gitterpunkte auf konvexen Kurven, Math. Zeit. 24 (1926), 500–518.

[Kat93]

N.M. Katz, Affine cohomological transforms, perversity, and monodromy, J. Amer. Math. Soc. 6 (1993), no. 1, 149–222.

[Kho91]

A. G. Khovanski, Fewnomials, Transl. Math. Monogr., vol. 88, Amer. Math. Soc., 1991.

[Küh11]

L. Kühne, On effectivity and uniformity in a result of André-Oort type, preprint (2011), 20 pp.

[Lan60]

S. Lang, Integral points on curves, Inst. Hautes Études Sci. Publ. Math. 6 (1960), 27–43.

[Lan65]

, Division points on curves, Ann. Mat. Pura Appl. 70 (1965), no. 4, 229–234.

[Lan73]

, Elliptic Functions, Addison-Wesley, 1973.

[Lan75]

, $SL_2(\mathbf{R})$, Addison-Wesley, 1975.

[Lan83]

, Fundamentals of Diophantine Geometry, Springer-Verlag, 1983.

[Lan94]

, Algebraic Number Theory, second ed., Graduate Texts in Mathematics, vol. 110, Springer-Verlag, 1994.

[Lau84]

M. Laurent, Équations diophantiennes exponentielles, Invent. Math. 78 (1984), no. 2, 299–327.

[LB92]

H. Lange and Ch. Birkenhake, Complex Abelian Varieties, Springer-Verlag, 1992.

[Le10]

T. Le, Homology torsion growth and Mahler measure, http://arxiv.org/abs/1010.4199 (2010), 25 pp.

[Lev06]

A. Levin, One-parameter families of unit equations, Math. Res. Lett. 13 (2006), no. 5–6, 935–945.

[LM11]

A. Levin and D. McKinnon, Ideals of degree one contribute most of the height, preprint (2011), 16 pp.

[LS05]

F. Luca and I. Shparlinski, On the exponent of the group of points on elliptic curves in extension fields, Internat. Math. Res. Notices 23 (2005), 1391–1409.

[Mag08]

C. Magagna, A lower bound for the r-order of a matrix modulo N , Monatsh. Math. 153 (2008), no. 1, 59–81.

[Man65]

H.B. Mann, On linear relations between roots of unity, Mathematika 12 (1965), 107–117.

[Mas75]

D.W. Masser, Elliptic Functions and Transcendence, LNM, vol. 437, Springer-Verlag, 1975.

[Mas84]

, Small values of the quadratic part of the Néron-Tate height on an abelian variety, Compositio Math. 53 (1984), 153–170.

[Mas89a]

, Counting points of small height on elliptic curves, Bull. Soc. Math. France 117 (1989), 247–265.


[Mas89b]

, Specializations of finitely generated subgroups of abelian varieties, Trans. Amer. Math. Soc. 311 (1989), 413–424.

[Mas99]

, Specializations of some hyperelliptic jacobians, Number Theory in Progress, Essays in Honour of A. Schinzel, vol. 1819, Springer-Verlag, 1999, pp. 324–333.

[Mas03]

, Heights, transcendence, and linear independence on commutative group varieties, Diophantine approximation (F. Amoroso and U. Zannier, eds.), Lecture Notes in Math., vol. 1819, Springer, 2003, (Cetraro, 2000), pp. 1–51.

[Mas09]

, Multiplicative dependence of values of algebraic functions, Analytic Number Theory (Chen et al., ed.), Lecture Notes in Math., Essays in Honour of Klaus Roth, vol. 1819, Cambridge Univ. Press, 2009, pp. 324–333.

[Mau08]

G. Maurin, Courbes algébriques et équations multiplicatives, Math. Ann. 341 (2008), no. 4, 789–824.

[Mau10]

, Equations multiplicatives sur les sous-variétés des tores, preprint (2010), 79 pp.

[McQ95]

M. McQuillan, Division points on semi-abelian varieties, Invent. Math. 120 (1995), 143–159.

[Mum69]

D. Mumford, A Note of Shimura’s Paper “Discontinuous Groups and Abelian Varieties”, Math. Ann. 181 (1969), 345–351.

[MVZ10]

D.W. Masser, F. Veneziano, and U. Zannier, Diophantine approximation on elliptic curves over function fields, preprint (2010), 15 pp.

[MW94]

D.W. Masser and G. Wüstholz, Endomorphism estimates for abelian varieties, Math. Z. 215 (1994), 641–653.

[MZ08]

D.W. Masser and U. Zannier, Torsion anomalous points and families of elliptic curves, C.R. Acad. Sci., Paris 346 (2008), no. I, 491–494.

[MZ10a]

, A note on simultaneous torsion points on Legendre elliptic curves. Bicyclotomic polynomials and impossible intersections, preprint (2010), 19 pp.

[MZ10b]

, Torsion anomalous points and families of elliptic curves, Amer. J. Math. 132 (2010), 1677–1691.

[MZ10c]

, Torsion points on families of products of elliptic curves, preprint (2010), 25 pp.

[MZ10d]

, Torsion points on families of squares of elliptic curves, Math. Ann. (2010), 28 pp., to appear.

[Nar04]

W. Narkiewicz, Elementary and Analytic Theory of Algebraic Numbers, 3d ed., Springer Monographs in Mathematics, Springer-Verlag, 2004.

[Noo06]

R. Noot, Correspondances de Hecke, action de Galois et la conjecture d’André-Oort (d’après Edixhoven et Yafaev), Astérisque, Séminaire Bourbaki, vol. 307, SMF, 2006, Exp. No. 942, pp. 165–197.

[NWY08]

J. Noguchi, J. Winkelmann, and K. Yamanoi, The second main theorem for holomorphic curves in semi-abelian varieties ii, Forum Mathematicum 20 (2008), 469–503.


[Oor97]

F. Oort, Canonical liftings and dense sets of CM-points, Arithmetic geometry, Sympos. Math., vol. XXXVII, Cambridge Univ. Press, 1997, pp. 228–234.

[Pil04]

J. Pila, Integer points on the dilation of a subanalytic surface, Quarterly J. Math. 55 (2004), 207–223.

[Pil05]

, Rational points on a subanalytic surface, Ann. Inst. Fourier (Grenoble) 55 (2005), no. 5, 1501–1516.

[Pil09a]

, On the algebraic points of a definable set, Selecta Math. 15 (2009), no. 1, 151–170.

[Pil09b]

, Rational points of definable sets and results of André-Oort-Manin-Mumford type, Internat. Math. Res. Notices 13 (2009), 2476–2507.

[Pil11]

, O-minimality and the André-Oort conjecture for $\mathbf{C}^n$, Ann. Math. 173 (2011), 1779–1840.

[Pin05a]

R. Pink, A combination of the conjectures of Mordell-Lang and André-Oort, Geometric Methods in Algebra and Number Theory, Bogomolov and Tschinkel eds., Progress in Math., vol. 253, Birkhäuser, Boston, 2005, pp. 251–282.

[Pin05b]

, A common generalization of the conjectures of André-Oort, Manin-Mumford and Mordell-Lang, manuscript (2005), 13 pp.

[PS88]

A. Pillay and C. Steinhorn, Definable sets in ordered structures. III, Trans. Amer. Math. Soc. 309 (1988), no. 2, 469–476.

[PS04]

Y. Peterzil and S. Starchenko, Uniform definability of the Weierstrass ℘ functions and generalized tori of dimension one, Selecta Math. N. S. 10 (2004), 525–550.

[PW06]

J. Pila and A. Wilkie, The rational points of a definable set, Duke Math. J. 133 (2006), no. 3, 591–616.

[PZ08]

J. Pila and U. Zannier, Rational points in periodic analytic sets and the Manin-Mumford conjecture, Atti Accad. Naz. Lincei Cl. Sci. Fis. Mat. Natur. Rend. Lincei 9 (2008), no. 2, 149–162.

[Ray83]

M. Raynaud, Courbes sur une variété abélienne et points de torsion [Curves on an abelian variety and torsion points], Invent. Math. 71 (1983), no. 1, 207–233.

[Rém09a]

G. Rémond, Autour de la conjecture de Zilber-Pink, J. Théor. Nombres Bordeaux 21 (2009), no. 2, 405–414.

[Rém09b]

, Intersection de sous-groupes et de sous-variétés. III, Comment. Math. Helv. 84 (2009), no. 4, 835–863.

[Ros86]

M. Rosen, Abelian varieties over C, Arithmetic Geometry, Cornell, G. and Silverman, J. eds., Springer-Verlag, 1986, pp. 79–101.

[Rum88]

R. Rumely, Note on van der Poorten’s proof of the Hadamard quotient theorem, I, II, Séminaire de Théorie des nombres de Paris, 1986-87, Progress in Math., vol. 75, Birkhäuser, Boston, 1988, pp. 349–382, 383–409.

[SA94]

P. Sarnak and S. Adams, Betti numbers of congruence groups (with an appendix by Zeev Rudnick), Israel J. Math. 88 (1994), nos. 1–3, 31–72.

[Sca11a]

T. Scanlon, Counting special points: Logic, diophantine geometry and transcendence theory, Bull. Amer. Math. Soc. to appear (2011), 20 pp.

[Sca11b]

, A proof of the André-Oort conjecture via mathematical logic, Séminaire Bourbaki, vol. 1037, Astérisque, 63ème année, 2010-2011, 15 pp.

[Sch96]

H.P. Schlickewei, Equations in roots of unity, Acta Arith. 76 (1996), 99–108.

[Sch00]

A. Schinzel, Polynomials with Special Regard to Reducibility, Encyclopedia of Mathematics and Its Applications, Cambridge Univ. Press, 2000.

[Ser73]

J-P. Serre, A Course in Arithmetic, vol. 7, GTM, Springer-Verlag, 1973.

[Ser88]

, Algebraic Groups and Class Fields, vol. 117, GTM, Springer-Verlag, 1988.

[Ser97]

, Lectures on the Mordell-Weil Theorem, third ed., Aspects of Mathematics, Friedr. Vieweg & Sohn, 1997.

[Ser07]

, Bounds for the orders of the finite subgroups of G(k), Group representation theory, EPLF Press, Lausanne, 2007, pp. 405–450.

[Shi94]

G. Shimura, Introduction to the Arithmetic Theory of Automorphic Functions, Princeton Univ. Press, 1971, repr. 1994.

[Shi97]

M. Shiota, Geometry of subanalytic and semialgebraic sets, Progress in Math., vol. 150, Birkhäuser, 1997.

[Sil83]

J. H. Silverman, Heights and the specialization map for families of abelian varieties, J. reine angew. Math. 342 (1983), 197–211.

[Sil92]

, The Arithmetic of Elliptic Curves, Graduate Texts in Mathematics, vol. 106, Springer-Verlag, 1992, Corrected reprint of the 1986 original.

[Sil02]

, Advanced Topics in the Arithmetic of Elliptic Curves, Graduate Texts in Mathematics, Springer-Verlag, 2002.

[Sil04]

, Common divisors of $a^n - 1$ and $b^n - 1$ over function fields, New York J. Math. 10 (2004), 37–43.

[Sil05]

, Generalized greatest common divisors, divisibility sequences, and Vojta’s conjecture for blowups, Monatsh. Math. 145 (2005), no. 4, 333–350.

[Sto95]

M. Stoll, Two simple 2-dimensional abelian varieties defined over $\mathbf{Q}$ with Mordell-Weil rank at least 19, C. R. Acad. Sci. Paris 321 (1995), no. I, 1341–1344.

[SUZ97]

L. Szpiro, E. Ullmo, and S. Zhang, Équirépartition des petits points, Invent. Math. 127 (1997), 337–347.

[Ull98]

E. Ullmo, Positivité et discrétion des points algébriques des courbes, Ann. Math. (2) 147 (1998), no. 1, 167–179.

[vdD98]

L. van den Dries, Tame topology and o-minimal structures, Cambridge Univ. Press, 1998.

[vdDMM94]

L. van den Dries, A. Macintyre, and D. Marker, The elementary theory of restricted analytic fields with exponentiation, Ann. Math. 140 (1994), no. 2, 183–205.


[vdP89]

A.J. van der Poorten, Some facts that should be better known, especially about rational functions, Number Theory and Applications, Banff, AB 1988, Kluwer Acad. Publ., Dordrecht, 1989, pp. 497–528.

[Via03]

E. Viada, The intersection of a curve with algebraic subgroups in a product of elliptic curves, Ann. Scuola Norm. Sup. Pisa Cl. Sci. 2 (2003), no. 1, 47–75.

[Via08]

, The intersection of a curve with a union of translated codimension-two subgroups in a power of an elliptic curve, Algebra Number Theory 2 (2008), no. 3, 248–298.

[Wil96]

A. Wilkie, Model completeness results for expansions of the ordered field of real numbers, J. Amer. Math. Soc. 9 (1996), 1051–1094.

[YZ10]

X. Yuan and S. Zhang, Calabi-Yau theorem and algebraic dynamics, preprint (2010), 23 pp.

[Zag93]

D. Zagier, Algebraic numbers close to both 0 and 1, Math. Comp. 61 (1993), no. 203, 485–491.

[Zan04]

U. Zannier, Polynomial squares of the form $aX^m + b(1 - X)^n + c$, Rend. Sem. Mat. Univ. Padova 112 (2004), 1–9.

[Zan09]

, Lecture Notes on Diophantine Analysis, Appunti, vol. 8, with an appendix by F. Amoroso, Edizioni della Normale, 2009.

[Zan10]

, Hilbert irreducibility above algebraic groups, Duke Math. J. 153 (2010), no. 2, 397–425.

[Zha92]

S. Zhang, Positive line bundles on arithmetic surfaces, Ann. Math. (2) 136 (1992), no. 3, 569–587.

[Zha98a]

, Equidistribution of small points on abelian varieties, Ann. Math. 147 (1998), no. 2, 159–165.

[Zha98b]

, Small points and Arakelov theory, Doc. Math., Proceedings of the International Congress of Mathematicians, Extra Vol. II, 217–225 (electronic) 2 (1998), 217–225.

[Zha05]

, Equidistribution of CM-points on quaternion Shimura varieties, Internat. Math. Res. Notices 59 (2005), 3657–3689.

[Zil02]

B. Zilber, Exponential sums equations and the Schanuel conjecture, J. London Math. Soc. 65 (2002), 27–44.

[Zim76]

H.G. Zimmer, On the difference of the Weil height and the Néron-Tate height, Math. Z. 147 (1976), 35–51.
