*230*
*93*
*614KB*

*English*
*Pages [169]*
*Year 1977*

- Author / Uploaded
- S.S. Abhyankar

*Table of contents : PrefaceNotationI Meromorphic Curves G-Adic Expansion and Approximate Roots Strict Linear Combinations G-Adic Expansion of a Polynomial Tschirnhausen Operator Approximate Roots Characteristic Sequences of a Meromorphic Curve Newton-puiseux Expansion Characteristic Sequences The Fundamental Theorem The Main Lemmas The Fundamental Theorem Applications of The Fundamental Theorem Epimorphism Theorem Automorphism Theorem Affine Curves with One Place at Infinity Irreducibility, Newton's Polygon Irreducibility Criterion Irreducibility of the Approximate Roots Newton's Algebraic PolygonII The Jacobian Problem The Jacobian Problem Statement of the Problem Notation w-Relation Structure of the w-Degree Form Various Equivalent Formulations of the... Jacobian Problem Via Newton-Puiseux Expansion Solution in the Galois Case*

Lectures on Expansion Techniques In Algebraic Geometry

By

S.S. Abhyankar

Tata Institute Of Fundamental Research Bombay 1977

Lectures on Expansion Techniques In Algebraic Geometry

By

S.S. Abhyankar

Notes by

Balwant Singh

Tata Institute Of Fundamental Research Bombay 1977

©

Tata Institute Of Fundamental Research, 1977

No part of this book may be reproduced in any form by print, microfilm or any other means without written permission from the Tata Institute of Fundamental Research, Colaba, Bombay 400005

Printed in India By K.P. Puthran at Tata Press Limited, Bombay 400025 and published by K.G. Ramanathan for the Tata Institute of Fundamental Research, Bombay 400005

Preface These notes are based upon my lectures at the Tata Institute from November 1975 to March 1976 and further oral communication between me and the note taker. The notes are divided into two parts. In §8 or Part One we prove the Fundamental Theorem on the structure of the coordinate ring of a meromorphic curve and its value group. We then give some applications of the Fundamental Theorem, the principal one among them being the Epimorphism Theorem. The proof of the Main Lemmas (§ 7) presented here is a simplified version of the original proof of Abhyankar and Moh. The process of simplification started with my lectures at Poona University in 1975 and culminated into the present version during my lectures at the Tata Institute. The simplification resulted mainly from the keen and stimulating interest in my lectures shown by the audience at these two places, especially at the Tata Institute. In Part Two we record some progress on the Jacobian problem, which is as yet unsolved. The results presented here were obtained by me during 1970-71. Partial notes on these were prepared by M. van der Put and W. Heinzer at Purdue University in 1971. However, since the notes were not complete, they were never formally circulated. I wish to thank the Tata Institute for inviting me and providing me with an opportunity to give these lectures. My special thanks go to Balwant Singh who took over the task of recording the lectures and preparing these notes entirely on his own even to the extent of relieving me of the tedium of having to read and check the manuscript.

S. S. Abhyankar v

Notation The following notation is used in the sequel. The set of integers (resp. non-negative integers, positive integers, real numbers) is denoted by Z (resp. Z+ , N, R). We write card (S ) for the cardinality of a set S and we write inf(S ) (resp. sup(S )) for the infimum (resp. supremum) of a subset S of R. If T is a subset of a set S then S − T denotes the complement of T in S . If k is a field and n is a positive integer, we denote by µn (k) the group of nth roots of unity in k. For w ∈ µn (k) we write ord(w) for the order of w i.e., ord(w) is the least positive integer r such that wr = 1. Suppose. in a given context, k is a fixed field. We then denote by the symbol a generic (i.e. unspecified) non-zero element of k. Thus if k′ is a ring containing k and a ∈ k′ then a = means that a ∈ k and a , 0. Similarly, b = c means that b = ac for some a ∈ k, a , 0. Note that a = , b = does not mean that a = b.

vii

Contents Preface

v

Notation

vii

I Meromorphic Curves

1

1 G-Adic Expansion and Approximate Roots 1 Strict Linear Combinations . . . . . . . 2 G-Adic Expansion of a Polynomial . . . 3 Tschirnhausen Operator . . . . . . . . . 4 Approximate Roots . . . . . . . . . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

3 . 3 . 6 . 11 . 16

2 Characteristic Sequences of a Meromorphic Curve 19 5 Newton-puiseux Expansion . . . . . . . . . . . . . . . . 19 6 Characteristic Sequences . . . . . . . . . . . . . . . . . 29 3 The Fundamental Theorem 43 7 The Main Lemmas . . . . . . . . . . . . . . . . . . . . 43 8 The Fundamental Theorem . . . . . . . . . . . . . . . . 58 4 Applications of The Fundamental Theorem 9 Epimorphism Theorem . . . . . . . . . . . . . . . . . . 10 Automorphism Theorem . . . . . . . . . . . . . . . . . 11 Affine Curves with One Place at Infinity . . . . . . . . . ix

67 67 79 80

Contents

x 5

Irreducibility, Newton’s Polygon 12 Irreducibility Criterion . . . . . . . . . . . . . . . . . . 13 Irreducibility of the Approximate Roots . . . . . . . . . 14 Newton’s Algebraic Polygon . . . . . . . . . . . . . . .

95 95 99 106

II The Jacobian Problem

111

6

113 113 115 116 122 133 138 153

The Jacobian Problem 15 Statement of the Problem . . . . . . . . . . . . . 16 Notation . . . . . . . . . . . . . . . . . . . . . . 17 w-Relation . . . . . . . . . . . . . . . . . . . . . 18 Structure of the w-Degree Form . . . . . . . . . 19 Various Equivalent Formulations of the... . . . . 20 Jacobian Problem Via Newton-Puiseux Expansion 21 Solution in the Galois Case . . . . . . . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

Part I

Meromorphic Curves

1

Chapter 1 G-Adic Expansion and Approximate Roots 1 Strict Linear Combinations (1.1) NOTATION. Let e be a non-negative integer and let r = (r0 , r1 , 1 . . . , re ) be an (e + 1)-tuple of integers such that r0 , 0. We define di (r) = g.c.d.(r0 , . . . , ri−1 ),

1 ≤ i ≤ e + 1.

Since r0 , 0, we have di (r) > 0 for every i. Moreover, it is clear that di+1 (r) divides di (r) for 1 ≤ i ≤ e. We put ni (r) = di (r)/di+1 (r) for 1 ≤ i ≤ e. (1.2) LEMMA. Let j, c be integers such that 1 ≤ j ≤ e and 0 ≤ c < n j (r). If n j (r) divides cr j /d j+1 (r) then c = 0. Proof. Since g.c.d. (d j (r), r j ) = g.c.d. (r0 , . . . r j ) = d j+1 (r), we have g.c.d. (n j (r), r j /d j+1 (r)) = 1. Therefore if n j (r) divides cr j /d j+1 (r) then n j (r) divides c. Therefore, since 0 ≤ c < n j (r), we get c = 0. (1.3) LEMMA. Let j, c be integers 1 ≤ j ≤ e and let c =

j X i=0

ci ∈ Z for 0 ≤ i ≤ j. Assume that 0 < c j < n j (r). Let o n j′ = inf i 1 ≤ i ≤ e + 1, di (r) divides c . 3

ci ri with

4

1. G-Adic Expansion and Approximate Roots Then j′ = j + 1. In particular, d1 (r) does not divides c and c , 0.

Proof. Since d j+1 (r) divides ri for 0 ≤ i ≤ j, it is clear that d j+1 (r) divides c. Therefore j′ ≤ j + 1. Next, since 0 < c j < n j (r), we see by lemma (1.2) that n j (r) does not divide c j r j /d j+1 (r). Therefore d j (r) j−1 X does not divide c j r j . Since d j (r) divides ci ri , we conclude that d j (r) i=0

does not divide c. This proves that j′ ≥ j + 1.

2

(1.4) DEFINITION. Let Γ be a subsemigroup of Z. By a Γ- strict linear combination a of r we mean an expression of the form a=

e X

ai ri

i=0

with a0 ∈ Γ and ai ∈ Z, 0 ≤ ai < ni (r) for 1 ≤ i ≤ e. If Γ = Z+ then we call a Γ-strict linear combination of r simply a strict linear combination of r. (1.5) PROPOSITION. Let Γ be a subsemigroup of Z and let a=

e X

ai ri ,

i=0

b=

e X

bi ri

i=0

be Γ- strict linear combinations of r. If a = b then ai = bi for every i, o ≤ i ≤ e. Proof. If the assertion is false then there exists an integer j, 0 ≤ j ≤ e, such that a j , b j and ai = bi for j + 1 ≤ i ≤ e. We may assume without loss of generality that a j > b j . Writing c = a − b and ci = ai − bi for every i, we get j X c= ci ri , c j > 0. i=0

Since c = 0 and r0 , 0, we have j ≥ 1. Therefore we have 0 ≤ a j < n j (r) and 0 ≤ b j < n j (r), which shows that 0 < c j < n j (r), since c j > 0. Therefore c , 0 by Lemma (1.3). This is a contradiction.

1. Strict Linear Combinations

5

(1.6) COROLLARY. If an integer a can be expressed as a Γ- strict linear combination of r then such an expression of a is unique. (1.7) DEFINITION. Let Γ, G be subsemigroups of Z. We say G is strictly generated (resp. Γ- strictly generated) by r if G coincides with the set of all strict (resp. Γ- strict) linear combinations of r. (1.8) PROPOSITION. Assume that e ≥ 1 and ri ≤ 0 for i = 0, 1. If 3 −d2 (r) can be expressed as a strict linear combination of r then r0 divides r1 or r1 divides r0 . Proof. Let di = di (r), 1 ≤ i ≤ e + 1. Suppose −d2 is a strict linear combination of r. Then e X −d2 = ci ri i=0

Z+ ,

with c0 ∈ ci ∈ Z, 0 ≤ ci < ni (r) for 1 ≤ i ≤ e. Since −d2 , 0, there exists i, 0 ≤ i ≤ e, such that ci , 0. Let X n o i 0 ≤ i ≤ e, ci , 0 . j=

Then we have −d2 =

j X

ci ri ,

c j , 0.

i=0

Note that, since r0 , 0, we have r0 < 0 by assumption. Now, if j = 0 then −d2 = c0 r0 , so that r0 divides d2 . Therefore in this case r0 divides r1 . Assume now that j ≥ 1. Then 0 < c j < n j (r). Since d2 divides −d2 , it follows from Lemma (1.3) that j ≤ 1. Therefore j = 1 and we have (1.8.1)

−d2 = c0 r0 + c1 r1

with c0 ∈ Z+ , c1 ∈ Z and 0 < c1 < n1 (r). The last inequalities mean, in particular, that d1 /d2 = n1 (r) > 1, so that −r0 = d1 > d2 = g.c.d. (r0 , r1 ).

1. G-Adic Expansion and Approximate Roots

6

This shows that r1 , 0, so that by assumption r1 < 0. Therefore, since d2 divides r1 , we get −d2 ≥ r1 ≥ c1 r1 ≥ c0 r0 + c1 r1 = −d2 . 4

This gives −d2 = c1 r1 , so that r1 divides d2 . Therefore r1 divides r0 . (1.9) PROPOSITION. Let p be a positive integer and let (u1 , . . . , u p ) be a p-tuple of positive integers such that ui divides ui+1 for 1 ≤ i ≤ p − 1. Let a1 , . . . , a p , b1 , . . . , b p be non-negative integers such that (1.9.1)

ai < ui+1 /ui and bi < ui+1 /ui for 1 ≤ i ≤ p − 1.

If (1.9.2)

p X

ai ui =

i=1

p X

bi ui

i=1

then ai = bi for every i, 1 ≤ i ≤ p. Proof. Let e = p − 1 and let r = (r0 , . . . , re ), where r1 = ue+1−i for 0 ≤ i ≤ e. Then dr (r) = ue+2−i for 1 ≤ i ≤ e + 1. Therefore ni (r) = ue+2−i /ue+1−i for 1 ≤ i ≤ e. Let a′i = ae+1−i , b′i = be+1−i for 0 ≤ i ≤ e. Then the equality (1.9.2) takes the form e e X X ′ ai ri = b′1 ri i=0

i=0

and conditions (1.9.1) take the form a′i < ni (r) and b′i < ni (r) for 1 ≤ i ≤ e. Moreover, we have a′0 ∈ Z+ and b′0 ∈ Z+ . Now the assertion follows from Proposition (1.5) by taking Γ = Z+ .

2 G-Adic Expansion of a Polynomial (2.1) 5

Let R be a ring (commutative, with unity) and let R[Y] be the poly nomial ring in one variable Y over R. For F ∈ R[Y], we write deg F for its Y-degree. We use the convention that deg 0 = −∞.

2. G-Adic Expansion of a Polynomial

7

(2.2) Let p be a positive integer and let G = (G1 , . . . , G p ) be a p-tuple of elements of R[Y] such that the following three conditions are satisfied: (i) Gi is monic in Y and deg Gi > 0 for every i, 1 ≤ i ≤ p. (ii) deg Gi divides deg Gi+1 for every i, 1 ≤ i ≤ p − 1. (iii) deg G1 = 1. We put ui (G) = deg(Gi ) for 1 ≤ i ≤ p, and u p+1 (G) = ∞. We then define ni (G) = ui+1 (G)/ui (G) for 1 ≤ i ≤ p. Note that n p (G) = ∞ and ni (G) is a positive integer for 1 ≤ i ≤ p − 1. Let A(G) = a = (a1 , . . . , a p ) ∈ Z p 0 ≤ ai < ni (G) for 1 ≤ i ≤ p . a

For a ∈ A(G), we put Ga = Ga11 · · · G pp .

(2.3) DEFINITION. An element F ∈ R[Y] is called a strict polynomial in G if F has an expression of the form X F aG a F= a∈A(G)

with Fa ∈ R for every a and Ga = 0 for almost all a. We write R[G A ] for the set of strict polynomials in G. Note that R[GA ] is the R-submodule of R[Y] generated by the set G A = Ga a ∈ A(G) . (2.4) LEMMA. Let a, b ∈ A(G). If a , b then deg Ga , deg Gb .

Proof. This is immediate from Proposition (1.9). For, by taking ui = ui (G), 1 ≤ i ≤ p, we have deg Ga =

p X i=1

ai ui , deg Gb =

p X

bi ui .

i=1

8

1. G-Adic Expansion and Approximate Roots

(2.5) COROLLARY. Let

6

X

F=

FaGa

a∈A(G)

be a strict polynomial in G. Then deg F = sup deg(Fa Ga ). a∈a(G)

In particular, if G = 0 then Fa = 0 for all a ∈ A(G). (2.6) COROLLARY. R[G A ] is a free R-module with G A as a free basis. (2.7) DEFINITION. Let F ∈ R[G A ]. The expression X F= F aG a , a∈A(G)

which is unique by Corollary (2.6), is called the G-adic expansion of F. (2.8) DEFINITION. For F ∈ R[G A ], we define SuppG (F) = a ∈ A(G) Fa , 0 .

(2.9) COROLLARY. Let F be a non-zero element of R[G A ]. Then deg F =

sup

deg Ga .

a∈SuppG (F)

More precisely, there exists a unique element a ∈ SuppG (F) such that deg F = deg Ga > deg Gb for every bin SuppG (F), b , a. Proof. Immediate from Lemma (2.4). 7

(2.10) LEMMA. Let e be an integer, 1 ≤ e ≤ p, and let a1 , . . . , ae be non-negative integers such that ai < ni (G) for 1 ≤ i ≤ e. Then e X ai ui (G) < ue+1 (G). i=1

2. G-Adic Expansion of a Polynomial

9

Proof. We use induction on e. If e = 1 then a1 < n1 (G) implies that a1 u1 (G) < n1 (G)u1 (G) = u2 (G). Now, suppose e ≥ 2. By induction e−1 X hypothesis, we have ai ui (G) < ue (G). Therefore i=1

e X

ai ui (G) < ue (G) + ae ue (G)

i=1

= (1 + ae )ue (G) = ne (G)ue (G) (since ae < ne (G)) = ue+1 (G). (2.11) LEMMA. Let e be an integer, 1 ≤ e ≤ p. Let a = (a1 , . . . , a p ) be an element of A(G) such that ae , 0 and ai = 0 for e + 1 ≤ i ≤ p. Then ue (G) ≤ deg Ga < ue+1 (G). Pp P Proof. we have deg Ga = i=1 ai ui (G) = ei=1 ai ui (G). Therefore, since ae > 0 and ai ≥ 0 for all i, we get ie (G) ≤ deg Ga . The inequality deg Ga < ue+1 (G) follows from Lemma (2.10). (2.12) LEMMA. Let F be an element of R[G A ] such that F < R. Let e = sup i 1 ≤ i ≤ p, ∃ a ∈ SuppG (F) with ai , 0 . Then ue (G) ≤ deg F < ue+1 (G).

Proof. By Corollary (2.9) there exists a ∈ SuppG (F) such that (2.12.1)

deg F deg Ga ≥ Gb

for every b ∈ SuppG (F). Since F < R, we have a , 0. For b ∈ SuppG (F), b , 0, let eb = sup i 1 ≤ i ≤ p, bi , 0 .

1. G-Adic Expansion and Approximate Roots

10

Then Lemma (2.11) ueb (G) ≤ deg Gb < ueb + (G). Therefore it 8 follows from (2.12.1) that we have (2.12.2)

uea (G) ≤ deg F < uea +1 (G)

and that ueb (G)uea +1 (G) for every b ∈ SuppG (F), b , 0. This last inequality shows that eb ≤ ea , so that we get e sup eb b ∈ SuppG (F), b , 0 = ea . Now the lemma follows from (2.12.2).

(2.13) THEOREM. R[G A] = R[Y]. Proof. We have to show that every element F of R[Y] belongs to R[G A ]. We do this by induction on deg F. The assertion being clear for deg F ≤ 0, let us assume that deg F ≥ 1. Since u1 (G) = 1 and u p+1 (G) = ∞, there exists a unique integer e, 1 ≤ e ≤ p, such that ue (G) ≤ deg F < ue+1 (G). Then there exists a unique positive integer be such that (2.13.2)

be ue (G) ≤ deg F < (be + 1)ue (G).

If follows from (8) that we have (2.13.3)

be < ne (G)

Since Ge and hence Gbe e is monic, there exist Q, P ∈ R[Y] such that (2.13.4) 9

F = QGbe e + P

and (2.13.5)

deg P < deg Gbe e = be ue (G) ≤ deg F.

3. Tschirnhausen Operator

11

By induction hypothesis, P ∈ R[G A ]. Therefore it is enough to prove that QGbe e ∈ R[G A ]. From (2.13.4) and (2.13.5) we see that deg F = deg(QGbe e ), which shows that we have (2.13.6)

deg Q = deg F − be ue (G) < deg F.

Therefore, by induction hypothesis, Q ∈ R[G A]. Writing X Q= Qa Ga , Qa ∈ R, a∈A(G)

we get QGbe e =

X

QaGaGbe e .

a∈A(G)

It is therefore enough to show that a + (0, . . . , be , . . . , 0) ∈ A(G) for every a ∈ SuppG (Q). Since be < ue (G) by (2.13.3), it is enough to prove that ae = 0 for every a ∈ SuppG (Q). This last assertion is clear if Q ∈ R. Assume therefore that Q < R. Then, since deg Q = deg F − be ue (G) < ue (G)

(by (2.13.6)) (by (2.13.2)),

we see by Lemma (2.12) that ae = 0 for every a ∈ SuppG (Q). This completes the proof of the theorem. (2.14) COROLLARY. Every element of R[Y] has a unique G-adic expansion. Proof. Clear from Theorem (2.13) and Corollary (2.6)

3 Tschirnhausen Operator We preserve the notation of (2.1)

10

1. G-Adic Expansion and Approximate Roots

12

(3.1) Let g ∈ R[Y] be a monic polynomial of positive degree. Let G1 = Y, G2 = g. Then the conditions (i) - (iii) of (2.2) are satisfied by G = (G1 , G2 ) with p = 2, and we note that we have n1 (G) = deg g, n2 (G) = ∞ and A(G) = a = (a1 , a2 ) ∈ Z+ × Z+ < deg g . By Corollary (2.14) every element of R[Y] has a unique G = (Y, g)− adic expansion. Let f ∈ R[Y] and let X (3.1.1) f = fa Y a1 ga2 a∈A(G)

be its G-adic expansion. For i ∈ Z+ , let X (g) = C (i) fa Y a1 f a∈A(G) a2 =i

Then we can rewrite (3.1.1) in the form (3.1.2)

f =

∞ X

i C (i) f (g)g

i=0

(i) (i) with C (i) f (g) ∈ R[Y], deg C f (g) < deg g and C f (g) = 0 for almost all i. The expression (3.1.2) is called the g-adic expansion of f . It follows from Corollary (2.14) that every element f of R[Y] has a unique g-adic ∞ X expansion. In particular, if f = Ci gi with Ci ∈ R[Y], deg Ci < deg g i=0

and Ci = 0 for almost all i, then Ci = C (i) f (g) for every i and f =

∞ X Ci gi i=0

is the g-adic expansion of f . 11

e X (3.2) LEMMA. Let f ∈ R[Y]. Suppose f = Ci gi , where e is a i=0

nonnegative integer, Ci ∈ R[Y] with deg Ci < deg g for 0 ≤ i ≤ e, and Ce , 0. Then deg f = e deg g + deg Ce . In particular, we have e deg g ≤ deg f < (e + 1) deg g.

3. Tschirnhausen Operator

13

Proof. For every i, 0 ≤ i ≤ e − 1, we have deg(Ci gi ) = i deg g + deg Ci ≤ (e − 1) deg g + deg Ci < e deg g

(since deg Ci < deg g)

≤ e deg g + deg Ce

(since Ce , 0)

e

= deg(Ce g ). This shows that deg f = e deg g + deg Ce . The asserted inequalities now follow from the fact that 0 ≤ deg Ce < deg g. (3.3) COROLLARY. Let f be an element of R[Y] such that f is monic and deg f = d deg g for some non-negative integer d. Then d

f =g +

d−1 X

Ci(i) (g)gi .

i=0

Proof. Since deg f = d deg g, Lemma (3.2) shows that f =

d X

i C (i) f (g)g

i=0

(d) with deg C (d) f (g) = 0. This means that C = C f (g) ∈ R. By Lemma (3.2) again, we have d−1 X (i) i d (3.3.1) deg( f − Cg ) = deg C f (g)g < d deg g. i=0

Since deg(Cgd ) = d deg g = deg f and since both f and g are monic, 12 it follows from (3.3.1) that C = 1.

(3.4) DEFINITION. Let d be a positive integer. Let g ∈ R[Y] be a monic polynomial of positive degree and let f ∈ R[Y] be a monic polynomial of degree d deg g. Then we have (3.4.1)

f = gd +

d−1 X i=0

i C (i) f (g)g

1. G-Adic Expansion and Approximate Roots

14

(g) the Tschirnhausen coefficient in by Corollary (3.3). We call C (d−1) f the g-adic expansion of f and denote it simply by C f (g). If d is a unit in R then the Tschirnhausen transform of g with respect to f , denoted τ f (g), is defined to be τ f (g) = g + d−1 C f (g). We call τ f the Tschirnhausen operator with respect to f . Note that deg C f (g) < deg g and τ f (g) ∈ R[Y] is monic with deg τ f (g) = deg g. In (3.5) to (3.7) below, we preserve the notation of (3.4). We assume, moreover, that d is a unit in R. (3.5) LEMMA. If C f (g) , 0 then deg C f (g) = deg( f − gd ) − (d − 1) deg g. Proof. By 3.4.1 we have f − gd =

d−1 X

i C (i) f (g)g .

i=0

Since deg C (i) f (g) < deg g for every i, the above expression is the g-adic (g) = C f (g) , 0, we see by expansion of f − gd . Therefore, since C (d−1) f Lemma (3.2) that deg( f − gd ) = (d − 1) deg g + deg C f (g). (3.6) PROPOSITION. 13

(i) If C f (g) = 0 then C f (τ f (g)) = 0. (ii) If C f (g) , 0 then deg C f (τ f (g)) < deg C f (g). Proof. (i) is clear, since τ f (g) = g if C f (g) = 0.

3. Tschirnhausen Operator

15

(ii) Let h = τ f (g) = g + d−1 C f (g). Then we have hd = gd + C f (g)gd−1 + k,

(3.6.1) where

! d X d −i k= d C f (g)i gd−i . i i=2 Let c = deg C f (g). Then 0 ≤ c < deg g. Therefore we have deg k ≤ 2c + (d − 2) deg g < c + (d − 1) deg g. Now, from (3.6.1) we get f − hd = f − gd − C f (g)gd−1 − k =

d−2 X

i C (i) f (g)g − k

(by (3.4.1)).

i=0

Since d−2 X i deg C (i) (g)g ) < (d − 1) deg g ≤ c + (d − 1) deg g f i=0

by Lemma (3.2) and since deg k < c + (d − 1) deg g, we get deg( f − hd ) < (d − 1) deg h + c.

Therefore if C f (h) , 0 then deg C f (h) < c by Lemma (3.5). If C f (h) = 0 then deg C f (h) = −∞ < c. 14 (3.7) COROLLARY. C f ((τ f ) j (g)) = 0 for all j ≥ deg g. Proof. This is clear from Proposition (3.6), since deg C f (g) < deg(g).

1. G-Adic Expansion and Approximate Roots

16

4 Approximate Roots (4.1) Let R be a ring (commutative, with unity) and let R[y] be the polynomial ring in one variable Y over R. (4.2) PROPOSITION. Let n, d be positive integers such that d divides n. Let f ∈ R[Y] be a monic polynomial of degree n. Let g ∈ R[Y] be a monic polynomial. Then the following two conditions are equivalent: (i) deg( f − gd ) < n − (n/d). (ii) deg g = n/d and C f (g) = 0. Proof. (i) ⇒ (ii). Since g is monic, it is clear from (i) that deg g = n/d. Therefore we get deg( f − gd ) < (d − 1) deg g. this shows (by Lemma (3.2)) that the g-adic expansion of f − gd has the form d

f −g =

d−2 X

C (i) (g)gi . f −gd

i=0

It follows that d

f =g +

d−2 X

C (i) (g)gi f −gd

i=0

(g) = 0. is the g-adic expansion of f and C f (g) = C (d−1) f (ii) ⇒ (i). Since deg g = n/d, we have deg f = d deg g. Therefore, since C f (g) = 0, we get f = gd +

d−2 X

i C (i) f (g)g

i=0

15

by Corollary (3.3). Therefore d−2 X (i) i d deg( f − g ) = deg C f (g)g i=0

4. Approximate Roots

17

< (d − 1) deg g

(by Lemma (3.2))

= n − (n/d). (4.3) DEFINITION. Let f ∈ R[Y] be a monic polynomial of positive degree n. Let d be a positive integer such that d divides n. An element g of R[Y] is called an approximate dth root of f (with respect to Y) if g is monic and satisfies the equivalent conditions (i) and (ii) of Proposition (4.2). (4.4) THEOREM. Let f ∈ R[Y] be a monic polynomial of positive degree n. Let d be a positive integer such that d divides n. Assume that d is a unit in R. Then there exists a unique approximate dth root of f with respect to Y. Proof. Let g = (τ f )n/d (Y n/d ). Then g is monic of degree n/d and C f (g) = 0 by Corollary (3.7). This proves the existence of an approximate dth root of f with respect to Y. Now, suppose g1 , g2 are approximate dth roots of f with respect to Y. Then deg( f − gd1 ) < n − (n/d) and deg( f − gd2 ) < n − (n/d). Therefore (4.4.1)

deg(gd1 − gd2 ) < n − (n/d).

Now, we have (4.4.2)

gd1 − gd2 = (g1 − g2 )

X

gi1 g2j .

i+ j=d−1 j

Since both g1 and g2 are monic of deg n/d, gi1 g2 is monic of degree P (d − 1)(n/d) for i + j = d − 1. Therefore d−1 i+ j=d−1 gi1 gi2 is monic with 16 X j gi1 g2 = (d − 1)(n/d) = n − (n/d). (4.4.3) deg d−1 i+ j=d−1

It follows from 4.4.1, 4.4.2 and 4.4.3 that g1 − g2 = 0.

18

1. G-Adic Expansion and Approximate Roots

(4.5) NOTATION. We denote the approximate dth root of f with respect to Y by AppdY ( f ). (4.6) COROLLARY. Let f ∈ R[Y] be a monic polynomial of positive degree n. Let d be a positive integer such that d divides n. Assume that d is a unit in R. Let g ∈ R[Y] be any monic polynomial of degree n/d. Then (τ f ) j (g) = AppdY ( f ) for all j ≥ n/d. Proof. Immediate from Corollary (3.7).

Let S be a ring (commutative, with unity) and let σ : R → S be a (unitary) ring homomorphism. Denote again by σ on R. Let f ∈ R[Y] be a monic polynomial of positive degree n. Then σ( f ) ∈ S [Y] is also a monic polynomial of degree n. Let d be a positive integer such that d divides n. Assume that d is a unit in R. Then d is also a unit in S , and we have (4.7) PROPOSITION. AppdY (σ( f )) = σ(AppdY ( f )). Proof. Put g = AppdY ( f ). Then σ(g) is monic of degree n/d. Moreover, we have σ( f ) − (σ(g))d = σ( f − gd ). Therefore deg(σ( f ) − (σ(g))d ) < n − (n/d). This shows that σ(g) = AppdY (σ( f )).

Chapter 2 Characteristic Sequences of a Meromorphic Curve 5 Newton-puiseux Expansion (5.1) NOTATION. Let k be a field. If n is a positive integer we denote 17 by µn (k) (or simply by µn if no confusion is likely) the group of nth roots of unity in k. We use the letters X, Y, t to denote indeterminates. As usual, k[[t]] denotes the ring of formal power series in t over k. We denote by k((t)) the quotient field of k[[t]]. Recall that X every element a of k((t)) has a unique expression of the form a = a j t j with a j ∈ k j∈Z

for every j and a j = 0 for j ≪ 0. We denote by ordt a the t-order of a. P Recall that if a , 0 then writing a = a j ti with a j ∈ k, we have ordt a = inf j ∈ Z a j , 0 .

P If a = 0 then ordt a = ∞. If a = a j t j ∈ k((t)) (with a j ∈ k) we define Suppt a = j ∈ Z a j , 0 .

If R is a ring and f ∈ R[Y], we write degY f (or simply deg f if no confusion is likely) for the Y-degree of f . We use the convention: deg 0 = −∞. 19

2. Characteristic Sequences of a Meromorphic Curve

20

(5.2) HENSEL’S LEMMA. Let f = f (X, Y) be an element of k[[X]][Y] such that f is monic in Y. Suppose f (0, Y) = gh, where g, h are elements of k[Y], both monic in Y, and g.c.d. (g, h) = 1. Then there exist elements g = g(X, Y), h = h(X, Y) of k[[X]][Y], both monic in Y, such that g(0, Y) = g, h(0, Y) = h and f = gh. Proof. Let n = degY f . we can write f =

∞ X

fq X q with fq ∈ k[Y] for

q=0

18

every q. Then f0 is monic in Y of degree n and deg fq < n for q ≥ 1. Let r = deg g, s = deg h. Then r + s = n. Now, in order to prove the lemma. it is enough to find, for every i ∈ Z+ , elements gi , hi of k[Y] such that 1. g0 = g and h0 = h. 2. deg gi < r and deg hi < s for all i ≥ 1. Pq 3. fq = i=0 gi hq−i for all q ≥ 0. For, then g =

∞ X i=0

gi X i , h =

∞ X

hi X i would meet the requirements of

i=0

the lemma. We define gi , hi by induction on i, these being already defined for i = 0 by condition (i). Let q be a positive integer and suppose gi , hi are already defined for i < q. Let eq = fq −

q−1 X

gi hq−i .

i=1

Then deg eq < n. Since g.c.d. (g0 , h0 ) = 1, there exist Gq , Hq ∈ k[Y] such that eq = Hq g0 + Gq h0 . Let Gq = g0 Q + gq with Q, gq ∈ k[Y] and deg gq < deg g0 = r. Then eq = hq g0 + gq h0 , where hq = Hq + Qh0 . Since deg eq < n = r + s, we get deg hq < s. Now fq =

q X i=0

gi hq−i

5. Newton-puiseux Expansion

21

and the lemma is proved. (5.3) COROLLARY. Let k be an algebraically closed field. Let u be an element of k((X)) such that ordX u = 0. Let n be an integer such that char k does not divide n. Then there exists v ∈ k((X)) such that u = vn . Proof. Since ord X u = 0 if and only if ordX u−1 = 0 and since u = vn if 19 and only if u−1 = v−n , we may assume that n is positive. Since ordX u = 0, we have u = u(X) ∈ k[[X]] and u(0) , 0. Let f (X, Y) = Y n − u. Then f (X, Y) ∈ k[[X]][Y] and f (0, Y) = Y n − u(0). Since k is algebraically n Y closed, there exist vi ∈ k, 1 ≤ i ≤ n, such that Y n − u(0) = (Y − vi ). i=1

Since u(0) , 0 and char k does not divide n, we have vi , v j for i , j. n Y (Y − vi ) then g.c.d. (g, h) = 1 Therefore if we let g = Y − v1 and h = i=2

and f (0, Y) = gh. Therefore by Hensel’s Lemma (5.2) there exists an element g(X, Y) in k[[X]][Y] such that g(X, Y) is monic in Y, g(0, Y) = g and g(X, Y) divides f (X, Y) in f (X, Y) in k[[X]][Y]. From the equality g(0, Y) = g = Y − v1 and the fact that g(X, Y) is monic in Y, we get g(X, Y) = Y − v for some v ∈ k((X)). Now g(x, v) = 0. Therefore f (X, v) = 0. This means that vn = u. (5.4) COROLLARY. Let k be an algebraically closed field. Let a be a nonzero element of k((X)) and let n = ordX a. Assume that char k does not divide n. Then there exists z ∈ k((X)) such that: (i) a = zn . (ii) ordX z = 1. (iii) k[[z]] = k[[X]] and k((z)) = k((X)). Proof. (iii) is immediate from (ii), and (ii) is immediate from (i). Therefore it is enough to prove (i). Write a = X n u with u ∈ k((X)). Then ordX u = 0. Therefore by Corollary (5.3) there exists v ∈ k((X))) such that u = vn . Let z = Xv. Then a = zn .

2. Characteristic Sequences of a Meromorphic Curve

22

(5.5) NEWTON’S LEMMA 20

Let k be an algebraically closed field. Let f (X, Y) be a non-zero element of k((X))[Y]. Assume that char k does not divide degY f (X, Y). Then there exists a positive integer m and an element y(t) ∈ k((t)) such that f (tm , y(t)) = 0. Proof. Without loss of generality, we may assume that f (X, Y) is irreducible. Let N = degY f (X, Y). We shall prove the result by induction on n. If n = 1 then the assertion is clear with m = 1. Assume theren X fore that n ≥ 2. Write f (X, Y) = fi Y n−i with fi = fi (X) ∈ k((X)) for i=0

0 ≤ i ≤ n, f0 , 0. Now, for the moment, grant the following

(5.4.1) CLAIM. In order to prove the lemma, we may, without loss of generality, make the following three assumptions: (i) f0 = 1. (ii) f1 = 0. (iii) f1 ∈ k[[X]] for every i and fi (0) , 0 for some i, 2 ≤ i ≤ n. Then (5.4.1) implies that f (X, Y) ∈ k[[X]][Y] and we have f (0, Y) = Y n + f2 (0)Y n−2 + · · · + fn (0)

21

with fi (0) , 0 for some i, 2 ≤ i ≤ n. Since char k does not divide n, it follows from the above expression for f (0, Y) that f (0, Y) is not the nth power of an element of k[Y]. Therefore, since k is algebraically closed, there exist g, h ∈ k[Y], both of them monic in Y of degree less than n, such that f (0, Y) = gh and g.c.d. (g, h) = 1. It follows by Hensel’s Lemma (5.2) that there exist g(X, Y), h(X, Y) ∈ k[[X]][Y], both of them monic in Y, such that f (X, Y) = g(X, Y)h(X, Y) and g(0, Y) = g, h(0, Y) = h. Let r = degY g(x, Y) = deg g, s = degY H(X, Y) = deg h. Then r < n, s < n and r+ s = n. Since char k does not divide n, char k does not divide at least one of r and s, say r. Then, by induction hypothesis, there exists a positive integer m and an element y(t) ∈ k((t)) such that g(tm , y(t)) = 0.

5. Newton-puiseux Expansion

23

Therefore f (tm , y(t)) = 0, and the lemma is proved modulo the Claim (5.4.1). Proof of (5.4.1). (i) Since f0 , 0, we may replace f (X, Y) by f0−1 f (X, Y). (ii) Assume (i), i.e. f0 = 1. Let Z = Y + n−1 f1 . Then f (X, Y) = f (X, Z − n−1 f1 ) = g(X, Z), say. It is clear that g(X, Z) has the form g(X, Z) = Z n + g2 Z n−2 + · · · + gn with gi ∈ k((X)), 2 ≤ i ≤ n. If m is a positive integer and y(t) is an element of k((t)) such that g(tm , y(t)) = 0 then we have f (tm , z(t)) = 0, where z(t) = y(t) − n−1 f1 (tm ). (iii) Assume that f already satisfies (i) and (ii). Since f (X, Y) is irreducible and n ≥ 2, there exists i, 2 ≤ i ≤ n, such that fi , 0. Let ui = ordX fi and let u inf ui /i 2 ≤ i ≤ n .

Let r be an integer, 2 ≤ r ≤ n, such that u = ur /r. Let W be an indeterminate and let Z = W −ur Y. Let g(W, Z) = W −nur f (W r , Y) = P Z n + ni=2 gi Z n−i , where gi = gi (W) = fi (W r )W −iur . Now ordW gi = rui − iur ≥ r iu − iur = 0 with equality for i = r. This means that gi ∈ k[[W]] for all i, 2 ≤ i ≤ n, and gr (0) , 0. Now, if m is a positive integer and y(t) is an element of k((t)) such that g(tm , y(t)) = 0 then we have 0 = g(tm , y(t)) = t−mnur f (tmur y(t)), so that f (tmr , tmur y(t)) = 0. (5.6) NOTATION. Let m be a positive integer. We write k((tm )) for the set of those a ∈ k((t)) for which Suppt a ⊂ mZ. Note that k((tm )) is a subfield of k((t)). 22 (5.7) LEMMA. Let m be a positive integer. Then k((t))/k((tm )) is a finite algebraic extension of degree m.

2. Characteristic Sequences of a Meromorphic Curve

24

Proof. The set {1, t, . . . tm−1 } is clearly a k((tm ))-vector space basis of k((t)). (5.8) DEFINITION. Let m be a positive integer and let y = y(t) be an element of k((t)). By Lemma (5.7), y is algebraic over k((tm )). Let f (tm , Y) in k((tm ))[Y] be the minimal monic polynomial of y over k((tm )). Put f = f (X, Y). Then f ∈ k((X))[Y]. By abuse of language, we shall call f the minimal monic polynomial of y over k((tm )). (5.9) LEMMA. Let m be a positive integer and let y = y(t) be an element of k((t)). Let f = f (X, Y) ∈ k((X))[Y] be the minimal monic polynomial of y over k((tm )). Then we have: (i) f is monic in Y and f is irreducible in k((X))[Y]. (ii) f (tm , y) = 0. (iii) If g = g(X, Y) is any element of k((X))[Y] such that g(tm , y) = 0 then f divides g in k((X))[Y]. (iv) degY f = [k((tm ))(y) : k((tm ))]. (v) degY f divides m. Proof. (i), (ii), (iii) and (iv) are clear from Definition (5.8). To prove (v), we note that since y ∈ k((t)), we have m = [k((t)) : k((tm ))] = [k((t)) : k((tm ))(y)][k((tm ))(y) : k((tm ))] = [k((t)) : k((tm ))(y)] degY f. 23

(5.10) LEMMA. Let m be a positive integer and let y = y(t) be an element of k((t)). Let f (X, Y) ∈ k((X))[Y] be the minimal monic polynomial of y over k((tm )). Assume that char k does not divide m and that g.c.d. ({m} ∪ Suppt y) = 1.

5. Newton-puiseux Expansion

25

Then we have: Y (i) f (tm , Y) = (Y − y(wt)), where k is the algebraic closure of w∈µm (k)

k. Moreover, the m roots y(wt), w ∈ µm (k), of f (tm , Y) = 0 are distinct. (ii) [k((tm ))(y) : k((tm ))] = degY f (X, Y) = m. Proof. By Lemma (5.9) (v) we have degY f (X, Y) ≤ m. Therefore it is enough to prove the following two statements: (1) f (tm , y(wt)) = 0 for every w ∈ µm (k). (2) If w1 , w2 ∈ µm (k), w1 , w2 , then y(w1 t) , y(w2 t). For, given (1) and (2), f (tm , Y) will have at least m distinct roots y(wt), w ∈ µm (k)., Since degY f (X, Y) ≤ m and f (X, Y) is monic in Y, both (i) and (ii) would be proved. Proof of (1). Since wm = 1, substituting wt for t in the equality f (tm , y(t)) = 0, we get f (tm , y(wt)) = 0. P P Proof of (2). Write y = y j t j with y j ∈ k. Then y(wt) = y j w j t j . Therefore if y(w1 t) = y(w2 t) then we have w1j = w2j for every j ∈ Suppt y. j Writing w = w1 w−1 2 , we get get w = 1 for every j ∈ Supp j ∈ Suppt y. 24 Since also wm = 1 and g.c.d. ({m} ∪ Suppt y) = 1, we get w = 1. This means that w1 = w2 . (5.11) REMARK. A more general version of the above lemma appears in Proposition (5.16). (5.12) LEMMA. Let p = char k. Let f = f (X, Y) be an irreducible element of k((X))[Y] such that f < k((X))[Y p ]. Let m be a positive integer and let y = y(t) be an element of k((t)) such that f (tm , y) = 0. If p divides m then y ∈ k((t p )).

26

2. Characteristic Sequences of a Meromorphic Curve

P Proof. Write y = y j t j with y j ∈ k. Suppose y < k((t p )). Then, since P p y p = y j t jp ∈ k((t p )), the minimal monic polynomial of y over k((t p )) P p is g(X, Y) = Y p − z(X), where z(X) = y j X j . Note that g(t p , Y) = Y p − z(t p ) = (Y − y) p . Let m = pr and let h(X, Y) = f (X r , Y). Then h(t p , y) = f (tm , y) = 0. Therefore g(X, Y) divides h(X, Y) in k((X))[Y], so that g(t p , Y) = (Y − y) p divides h(t p , Y) = f (tm , Y) in k((t p ))[Y]. This implies that in the algebraic closure of k((tm ))y occurs as a root of the polynomial f (tm , Y) in Y with multiplicity at least p. But this is a contradiction, since f (tm , Y), being irreducible in k((tm ))[Y] and being not an element of k((tm ))[Y p ], is a separable polynomial over k((tm )). This contradiction proves that y ∈ k((t p )). (5.13) LEMMA. Let k be an algebraically closed field. Let f = f (X, Y) be an irreducible element of k((X))[Y] such that f is monic in Y and char k does not divide degY f . Then there exists an element y(t) of k((t)) and a positive integer m such that char k does not divide m and f (tm , y(t)) = 0. 25

Proof. By Newton’s Lemma (5.5) thee exists a positive integer m and an element y(t) of k((t)) such that f (tm , y(t)) = 0. Let us choose m to be the least positive integer for which there exists an element y(t) of k((t)) with f (tm , y(t)) = 0. We then claim that char k does not divide m. For, let p = char k and suppose p divides m. Then by Lemma (5.12) y(t) ∈ k((t p )). Therefore there exists z(t) ∈ k((t)) such that y(t) = z(t p ). Now, we get 0 = f (tm , y(t)) = f ((t p )m/p , z(t p )), which shows that f (tm/p , z(t)) = 0. This contradicts the minimality of m.

(5.14) NEWTON’S THEOREM Let k be an algebraically closed field. Let f = f (X, Y) be an irreducible element of k((X))[Y] such that f is monic in Y. Let n = degY f , and assume that char k does not divide n. Then there exists an element y(t) of k((t)) such that f (tn , y(t)) = 0. Moreover, for any such y(t) we have: Y (i) f (tn , Y) = (Y − y(wt)). w∈µk (k)

(ii) The n roots y(wt), w ∈ µn (k), of f (tn , Y) = 0 are distinct.

5. Newton-puiseux Expansion

27

(iii) g.c.d. ({n} ∪ Suppt y(wt)) = 1 for every w ∈ µn (k). Proof. By Lemma (5.13) there exists a positive integer m such that (5.13.2) CLAIM. char k does not divide m and f (tm , y(t)) = 0 for some y(t) ∈ k((t)). Let us assume that m is the smallest positive integer satisfying (5.13.2). Let d = g.c.d. ({m} ∪ Suppt y(t)). We claim that d = 1. for, since d divides every j ∈ Suppt y(t), there exists z(t) ∈ k((t)) such that y(t) = z(td ). Now, we have 0 = f (tm , y(t)) = f ((td )m/d , z(td )), which shows that f (tm/d , z(t)) = 0. Therefore by the minimality of m 26 we get d = 1. Since f (X, Y) is monic in Y and irreducible in k((X))[Y] and since f (tm , y(t)) = 0, f is the minimal monic polynomial of y(t) over k((tm )). Therefore, since d = 1, by Lemma (5.10) we get n = degY f (X, Y) = m. Now, (i) and (ii) follow directly from Lemma (5.10). Since, Suppt y(wt) = Suppt y(t) for every w ∈ µn (k), (ii) follows from the fact d = 1 proved above. (5.15) REMARK. With the notation of Theorem (5.14). let y(t) = P j P y j t with y j ∈ k. If we write X 1/n for t then y(X 1/n ) = y j X j/n and f (X, y(X 1/n )) = 0. Note that y(X 1/n ) is a power series in X with fractional exponents, in fact with exponents in (1/n)Z. The equality f (X, y(X 1/n )) = 0 can thus be interpreted to mean that given an equation f (X, Y) = 0 (where f (X, Y) is an irreducible element of k((X))[Y]), we can expand Y as a fractional power series in X with exponents in (1/n)Z. We call y(X 1/n ) a Newton-Puiseux expansion of Y in fractional powers of X. Note that there are n distinct Newton-Puiseux expansions of Y, given by the n distinct roots y(wt), w ∈ µn (k).

2. Characteristic Sequences of a Meromorphic Curve

28

(5.16) PROPOSITION. Let m be a positive integer such that char k does not divide m, and let y = y(t) be an element of k((t)). Let f (X, Y) ∈ k((X))[Y] be the minimal monic polynomial of y over k((tm )). Let d = g.c.d. ({m} ∪ Suppt y). Then ( f (tm , Y))d =

Y

(Y − y(wt)).

w∈µm (k)

27

where k is the algebraic close of k. In particular, we have [k((tm ))(y) : k((tm ))] = degY f (X, Y) = m/d. Proof. Since d divides j for every j ∈ Suppt y(t), there exists z(t) ∈ k((t)) such that y(t) = z(td ). Let τ = td . Then y(t) = z(τ) and clearly we have g.c.d. ({m/d} ∪ Suppτ z(τ)) = 1. Therefore by Lemma (5.10) we have Y (5.16.1) f (τm/d , Y) = (Y − z(wτ)). w∈µm/d

where µm/d = µm/d (k). Let v be a primitive mth root of unity in k. Then vd is a primitive (m/d)th root of unity of k. Therefore di µm/d = v 1 ≤ i ≤ m/d and from 5.16.1 we get

f (tm , Y) =

m/d Y (Y − z(vdi τ)) i=1

(5.16.2)

m/d Y = (Y − z((vi t)d ))

=

i=1 m/d Y

(Y − y(vi t)).

i=1

6. Characteristic Sequences

29

Let n = m/d. Since d divides j for every j ∈ Suppt y(t), m divides n j for every j ∈ Suppt y(t). It follows that y(vrn+i t) = y(vi t) for all integers i, r. Therefore we get Y

w∈µm (k)

(Y − y(wt)) =

m Y (Y − y(v j t)) j=1

d n d−1 Y n Y Y i rn+i = (Y − y(v t)) = (Y − y(v t)) r=0 i=1 m

i=1

d

= ( f (t , Y))

(by 5.16.2).

6 Characteristic Sequences Throughout this section, we shall preserve the notation introduced in 28 (6.1) below

(6.1) Let k be an algebraically closed field and let X, Y, t be indeterminates. Let f = f (X, Y) be an irreducible element of k((X))[Y] such that f is monic in Y. We call such an f a meromorphic curve over k. Let n = degY f , and assume that char k does not divide n. Then by Newton’s Theorem (5.14) there exists an element y(t) ∈ k((t)) such that f (tn , y(t)) = 0 and Y f (tn , Y) = (Y − y(wt)). w∈µn (k)

Therefore if z(t) is any element of k((t)) such that f (tn , z(t)) = 0 then z(t) = y(wt) for some w ∈ µn (k). In particular, we have Suppt z(t) = Suppt y(t). Thus the set Suppt y(t) depends only on f and not on a root y(t) of f (tn , Y) = 0. Therefore we can make (6.2) DEFINITION. The support of f denoted Supp( f ) is defined by Supp( f ) = Suppt y(t) where y(t) is any element of k((t)) such that f (tn , y(t)) = 0.

2. Characteristic Sequences of a Meromorphic Curve

30 29

(6.3) CONVENTION. We extend the notion of divisibility in Z to the set Z ∪ {∞, −∞} by postulating that: (i) ∞ and −∞ divide every element of Z ∪ {∞, −∞}. (ii) No integer divides ∞ or −∞. Note that “a divides b” is still a reflexive and transitive relation on Z ∪ {∞, −∞}. If I is a subset of Z we denote, as usual, by g.c.d. (I) the unique non-negative generator of the ideal of Z generated by I. If I is a subset of Z ∪ {∞, −∞} such that I 1 Z then we define g.c.d. (I) = −∞. For a subset I of Z we denote by inf(I) the infimum of I. As usual, we set inf(φ) = ∞. (6.4) DEFINITION. Let J be a subset of Z bounded below and let ν be a non-zero integer. We define mi (ν, J) and di+1 (ν, J) for every i ∈ Z+ by induction on i as follows: m0 (ν, J) = ν, d1 (ν, J) = |ν|, m1 (ν, J) = inf(J) and, i ≥ 2, di (ν, J) = g.c.d. (di−1 (ν, J), mi−1 (ν, J)), mi (ν, J) = inf j ∈ J j . 0(mod di (ν, J)) .

Note that we have di (−ν, J) = di (ν, j) for every i ≥ 1.

(6.5) LEMMA. With the notation of (6.4), let J1 = J and, for i ≥ 2, let Ji = j ∈ J1 j . 0(mod di (ν, J)) .

Let d = g.c.d. ({ν} ∪ J). Then we have:

(i) di+1 (ν, J) = g.c.d. (m0 (ν, J), . . . , mi (ν, J)) for all i ≥ 0. (ii) di+1 (ν, J) divides di (ν, J) for every i ≥ 1. (iii) Ji ⊃ Ji+1 and mi (ν, J) < Ji+1 for every i ≥ 1. In particular, if Ji , φ then Ji ⊃ Ji+1 and mi (ν, J) < mi+1 (ν, J). ,

6. Characteristic Sequences 30

31

(iv) If i ≥ 2 and Ji , φ then di (ν, J) > di+1 (ν, J) ≥ d. If i ≥ 1 and ji = φ then di+1 (ν, J) = −∞. Moreover, there exists a unique non-negative integer h such that we have: (v) d1 (ν, J) ≥ d2 (ν, j) > d3 (ν, J) > · · · > dh+1 (ν, J) = d. (vi) di (ν, J) = −∞ for i ≥ h + 2. (vii) mi (ν, J) ∈ Z for 0 ≤ i ≤ h and mi (ν, J) = ∞ for i ≥ h + 1. (viii) m1 (ν, J) < · · · < mh (ν, J) < mh+1 (ν, J) = ∞. (ix) di (ν, J) = g.c.d. ({ν} ∪ { j ∈ J| j < mi (ν, J)}) for 1 ≤ i ≤ h + 1. Proof. (i) Clear from the definition by induction on i. (ii) Follows from (i). (iii) Let i ≥ 1. It follows from (ii) that Ji ⊃ Ji+1 . Moreover, since di+1 (ν, J) divides mi (ν, J), we have mi (ν, J) < Ji+1 . If Ji , φ then mi (ν, J) = inf(Ji ) belongs to Ji , so that we get Ji ⊃ Ji+1 and , mi (ν, J) < mi+1 (ν, J). (iv) Let i ≥ 2. If Ji , φ then mi (ν, J) ∈ Ji , so that di (ν, J) does not divide mi (ν, J). This shows that di (ν, J) > di+1 (ν, J). Moreover, since Ji , φ, by (iii) we have J p , φ for 1 ≤ p ≤ i. Therefore m p (ν, J ∈ J) for 1 ≤ p ≤ i, so that d = g.c.d. ({ν} ∪ J) divides g.c.d. (m0 (ν, J), . . . , mi (ν, J)) = di+1 (ν, J). This shows that di+1 ≥ d. Now, suppose i ≥ 1 and Ji = φ. Then mi (ν, J) = inf(Ji ) = ∞. Therefore di+1 (ν, J) = −∞. This proves (iv). We now claim that there exists i ≥ 1 such that Ji = φ. For, if Ji , φ for every i then, by (iv), {di (ν, J)|i ≥ 2} is a strictly decreasing infinite sequence of integers bounded below by d. This is not possible. Therefore there exists i such that Ji = φ. Let h + 1 = inf i ≥ 1 Ji = φ .

32 31

2. Characteristic Sequences of a Meromorphic Curve Then, since Ji ⊃ Ji+1 for every i ≥ 1, we have Ji , φ for 1 ≤ i ≤ h and Ji = φ for i ≥ h + 1. This proves (vi), (vii) and (viii) in view of (iii) and (iv).

(v) Since J p , φ for 1 ≤ p ≤ h, we have m p (ν, J) ∈ J for 1 ≤ p ≤ h. Therefore d divides dh+1 (ν, J). On the other hand, since Jh+1 = φ, dh+1 (ν, J) divides j for every j ∈ J. Since dh+1 (ν, J) also divides ν, we see that dh+1 (ν, J) divides d. Therefore we get dh+1 (ν, J) = d. Now, (v) follows from (i) and (iv). (ix) Fix an i, 1 ≤ i ≤ h + 1. Let J ′ = j ∈ J j < mi (ν, J)

and let d′ = g.c.d. ({ν} ∪ J ′ ). If i = 1 then J ′ = φ and we have d′ = |ν| = di (ν, J). Assume therefore that 2 ≤ i ≤ h + 1. Since mi (ν, J) = inf(Ji ), we have J ′ ∩ Ji = φ. This means that di (ν, J) divides j for every j ∈ J ′ . Therefore di (ν, J) divides d′ . On the other hand, by (viii) m p (ν, J) ∈ J ′ for 1 ≤ p ≤ i − 1. Therefore, since ν = m0 (ν, J), d′ divides g.c.d. (m0 (ν, J), . . . , mi−1 (ν, J)), which is equal to di (ν, J) by (i). Thus we get d′ = di (ν, J). (6.6) DEFINITION. Let J be a subset of Z bounded below and let ν be a non-zero integer. The m-sequence of J with respect to ν, denoted m(ν, J), is defined to be m(ν, J) = (m0 (ν, J), . . . , mh (ν, J), mh+1 (ν, J)),

32

where mi (ν, J) is defined as in Definition (6.4) and where h is the unique non-negative integer of Lemma (6.5). If ν and J are not clear from the context then we shall write h(ν, J) for h. Note then that h(−ν, J) = h(ν, J). Note also that by Lemma (6.5) we have mi (ν, J) ∈ Z for 0 ≤ i ≤ h and mh+1 (ν, J) = ∞.

6. Characteristic Sequences

33

(6.7) LEMMA. Let J be a subset of Z bounded below and let ν be a non-zero integer. Let e be an integer such that 1 ≤ e ≤ h(ν, J) + 1. Let ′ J = j/de j ∈ J, j < me (ν, J) , where de = de (ν, J). Let ν′ = ν/de . Then J ′ ⊂ Z, J ′ is bounded below, ν′ is a non-zero integer and we have h(ν′ , J ′ ) = e − 1, mi (ν′ , J ′ ) = mi (ν, J)/de , di+1 (ν′ , J ′ ) = di+1 (ν, J)/de for 0 ≤ i ≤ h(ν′ , J ′ ). Proof. A straightforward verification. In the remainder of this section we let ν be an integer such that |ν| = n. (6.8) DEFINITION. The m-sequence m(ν, f ) of f with respect to ν is defined by m(ν, f ) = m(ν, Supp( f )). Note that, since |ν| = degY f, h(ν, Supp( f )) depends only on f an does not depend upon ν. We shall write h( f ) for h(ν, Supp( f )) and mi (ν, f ) for mi (ν, Supp( f )) for 0 ≤ i ≤ h( f ) + 1. Note that mi (ν, f ) = ordt y(wt) for every w ∈ µn (k). (6.9) DEFINITION. The d-sequence d( f ) of f is defined to be d( f ) = (d1 ( f ), . . . , dh+1 ( f ), dh+2 ( f )), where h = h( f ) and di ( f ) = di (ν, Supp( f )) as defined in Definition (6.4), 33 1 ≤ i ≤ h + 2. We note that, since |ν| = degY f , d( f ) depends only on f and does not depend upon ν. (6.10) DEFINITION. The q-sequence q(ν, f ) of f with respect to ν is defined to be q(ν, f ) = (q0 (ν, f ), . . . , qn (ν, f ), qh+1 (ν, f )),

2. Characteristic Sequences of a Meromorphic Curve

34

where h = h( f ), qi (ν, f ) = m + i(ν, f ) for i = 0, 1, and q j (ν, f ) = m j (ν, f ) − m j−1 (ν, f ) for 2 ≤ j ≤ h + 1. (6.11) DEFINITION. The ssequence s(ν, f ) of f with respect to ν is defined to be s(ν, f ) = (s0 (ν, f ), . . . , sh (ν, f ), sh+1 (ν, f )), where h = h( f ), s0 (ν, f ) = q0 (ν, f ) and si (ν, f ) =

i X

q p (ν, f )d p ( f )

p=1

for 1 ≤ i ≤ h + 1. (6.12) DEFINITION. The r-sequence r(ν, f ) of f with respect to ν is defined to be r(ν, f ) = (r0 (ν, f ), . . . , rh (ν, f ), rh+1 (ν, f )), where h = h( f ), r0 (ν, f ) = s0 (ν, f ) and ri (ν, f ) = si (ν, f )/di ( f ) for 1 ≤ i ≤ h + 1. Some properties of the various sequences defined above are listed in the following proposition. These will be used in the sequel, mostly without explicit reference. 34

(6.13) PROPOSITION. Let ν be an integer such that |ν| = n. Let h = h( f ) and for every i, 0 ≤ i ≤ h + 1, let mi = mi (ν, f ), qi = qi (ν, f ), si = si (ν, f ), ri = ri (ν, f ) and di+1 = di+1 ( f ). Then: (i) di+1 divides di for 1 ≤ i ≤ h + 1. (ii) d1 ≥ d2 > d3 > · · · > dh > dh+1 = 1. (iii) d1 = n and dh+2 = −∞. (iv) r0 = s0 = q0 = m0 = ν and r1 = q1 = m1 . (v) rh+1 = sh+1 = qh+1 = mh+1 = ∞.

6. Characteristic Sequences

35

(vi) mi , qi , si , ri are integers for 0 ≤ i ≤ h. (vii) m1 < m2 < · · · < mh < mh+1 = ∞. (viii) qi is a positive integer for 2 ≤ i ≤ h. (ix) di = g.c.d. ({n} ∪ { j ∈ Supp( f )| j < mi }) for 1 ≤ i ≤ h + 1. (x) For 0 ≤ i ≤ h + 1, we have (1) (2) (3) (4)

di+1 di+1 di+1 di+1

= g.c.d. = g.c.d. = g.c.d. = g.c.d.

(m0 , . . . , mi ), (q0 , . . . , qi ), (r0 , . . . , ri ), (s0 , s1 /d1 . . . , si /di ).

In particular, each of the four sequences m(ν, f ), q(ν, f ), s(ν, f ) and r(ν, f ) determines d( f ), the sequence s(ν, f ) determining d( f ) by the recursive formula (4). (xi) each one of the four sequences m(ν, f ), q(ν, f ), s(ν, f ) and r(ν, f ) determines the other three. Proof. (i) Follows from Lemma (6.5). (ii) Follows from Lemma (6.5) and Theorem (5.14). (iii) Clear from the definition and Lemma (6.5). (iv) Clear from the definition.

35

(v) Clear from the definition. (vi) By Lemma (6.5) mi is an integer for 0 ≤ i ≤ h. Therefore it follows the definition that qi , si are integers for 0 ≤ i ≤ h and that r0 is an integer. Now by (i) d p /di is an integer for 1 ≤ p ≤ i ≤ h. Therefore for 1 ≤ i ≤ h i X ri = si /di = q p (d p /di ) p=1

is an integer.

2. Characteristic Sequences of a Meromorphic Curve

36

(vii) Follows from Lemma (6.5). (viii) Follows from (vi) and (vii). (ix) Follows from Lemma (6.5), since n = |ν|. (x) (i) follows from Lemma (6.5). (2) follows easily from (1), since q0 = m0 , q1 = m1 and qi = mi − mi−1 for 1 ≤ i ≤ h + 1. To prove (3), we note that we have (6.13.1)

ri

i−1 X

q p (d p /di ) + qi

p=1

for 1 ≤ i ≤ h + 1. Therefore, since d p /di is an integer and since di divides q p for 1 ≤ p ≤ i − 1, we get g.c.d. (di , ri ) = g.c.d. (di , qi ) = g.c.d. (q0 , . . . , qi−1 , qi )

(by (2))

= di+1

(by (2))

for 1 ≤ i ≤ h + 1. Therefore, since d1 = |q0 | = |r0 |, we get (3) for 0 ≤ i ≤ h + 1 by induction on i, (4) is immediate from (3). 36

(xi) Since each of four sequences determines d( f ) by (x), it is enough to show that each one of them together with d( f ) determines the other three. It is clear from the definition that m(ν, f ) determines q(ν, f ), q(ν, f ) and d( f ) determine s(ν, f ), and s(ν, f ) and d( f ) determine r(ν, f ). Moreover, q(ν, f ) clearly determines m(ν, f ) by the formulas m0 = q0 , mi =

i X

qp,

1 ≤ i ≤ h + 1.

p=1

6. Characteristic Sequences

37

Therefore, to complete the cycle, it is enough to show that r(ν, f ) and d( f ) determine q(ν, f ). But this is clear from the recursive formulas q0 = r0 , qi = ri −

i−1 X

q p (d p /di ), 1 ≤ i ≤ h + 1,

p=1

which we get from 6.13.1. (6.14) LEMMA. Let ν be an integer such that|ν| = n. Let h = h( f ) and let mi = mi (ν, f ), di+1 = di+1 ( f ) for 0 ≤ i ≤ h + 1. Let y(t) be an element of k((t)) such that f (tn , y(t)) = 0. Let e be an integer such that 1 ≤ e ≤ h + 1. Let w be an nth root of unity in k and let p = ord(w). Then we have: (i) ordt (y(t) − y(wt)) ≥ me if and only if p divides de . (ii) ordt (y(t) − y(wt)) ≤ me if and only if p does not divide de+1 . (iii) ordt (y(t) − y(wt)) = me if and only if p divides de and p does not divide de+1 . Proof. It is clearly enough to prove (i) and (ii). Since ordt (y(t) = m1 = ordt y(wt) and since p divides n = d1 , (i) is obvious for e = 1. Since mh+1 = ∞ and since p does not divide −∞ = dh+2 , (ii) is obvious for 37 e = h + 1. Therefore it is enough to prove (i) for e ≥ 2 and (ii) for e ≤ h. Now, for the moment, grant the following two statements: (i′ ) If 2 ≤ e ≤ h + 1 and p divides de then ordt (y(t) − y(wt)) ≥ me . (ii′ ) If 1 ≤ e ≤ h and p does not divide de+1 then ordt (y(t) − y(wt)) ≤ me . Then if 2 ≤ e ≤ h + 1 and ordt (y(t) − y(wt)) ≥ me we get ordt (y(t) − y(wt)) > me−1 , since me > me−1 . This shows by (ii′ ) that p divides de . If 1 ≤ e ≤ h and ordt (y(t) − y(wt)) ≤ me then we get ordt (y(t) − y(wt)) < me+1 since me < me+1 . This shows by (i′ ) that p does not divide de+1 . Thus, in order to complete the proof of the lemma, it is enough to prove (i′ ) and (ii′ ).

2. Characteristic Sequences of a Meromorphic Curve

38

X (i′ ) Let J = Supp( f ) = Suppt y(t). Write y(t) = y j t j with y j ∈ k, j∈J X j j h j , 0 for every j ∈ J. Then y(wt) = w y j t . Therefore we have j∈J

ordt (y(t) − y(wt)) = inf j ∈ = inf j ∈ = inf j ∈ me ,

J w j , 1 J j . 0(mod p) J j . 0(mod de )

where the inequality follows from the fact that p divides de . (ii′ ) Let c = inf i 1 ≤ i ≤ h, p does not divide di+1 .

Then, since p divides n = d1 , we see that p divides dc and p does not divide dc+1 . Moreover, c ≤ e. Now, dc+1 = g.c.d. (dc , mc ). Since p divides dc and p does not divide dc+1 , we see that p does not divide mc . Therefore wmc , 1, which shows that ordt (y(t) − y(wt)) ≤ mc ≤ me .

38

(6.15) PROPOSITION. Let ν be an integer such that |ν| = n. Let h = h( f ) and let mi = mi (ν, f ), di+1 = di+1 ( f ) for 0 ≤ i ≤ h + 1. Let y(t) be an element of k((t)) such that f (tn , y(t)) = 0. Let E = ordt (y(w1 t) − y(w2 t)) w1 , w2 ∈ µn (k), w1 , w2 , M1 = {m1 , . . . , mh }

and

M2 = {m2 , . . . , mh } .

Then M2 ⊂ E ⊂ M1 . Moreover, we have M1 , if d1 > d2 , E= M2 , if d1 = d2 .

6. Characteristic Sequences

39

Proof. If h = 0 then d1 = 1 and E = M1 = M2 = φ. We may therefore −1 assume that h ≥ 1. Since ordt (y(w1 t) − y(w2 t)) = ordt (y(t) − y(w2 w1 t)), it is clear that E = ordt (y(t) − y(wt)) w ∈ µn (k), w , 1 . Let w ∈ µn (k),

w , 1, and let p = ord(w). Then p divides n = d1 and p does not divide 1 = dh+1 . Therefore there exists e, 1 ≤ e ≤ h, such that p divides de and p does not divide de+1 . Therefore by Lemma (6.14) we get ordt (y(t) − y(wt)) = me ∈ M1 .

This proves that E ⊂ M1 . Now, let i be an integer such that 2 ≤ i ≤ h. Since di divides d1 = n, there exists w ∈ µn (k) such that ord(w) = d1 . Since i ≥ 2, di does not divide di+1 by Proposition (6.13). Therefore by Lemma (6.14) we have mi = ordt (y(t) − y(wt)) ∈ E. This proves that M2 ⊂ E. Now, suppose d1 > d2 . Then, if w is a 39 primitive nth root of unity in k, ord(w) = d1 does not divide d2 , so that by Lemma (6.14) we get m1 = ordt (y(t) − y(wt)) ∈ E, which proves that E = M1 . Finally, suppose d1 = d2 . Then, since d2 = g.c.d. (d1 , m1 ), d1 divides m1 . Therefore wm1 = 1 for every w ∈ µn (k). Since ordt y(t) = m1 = ordt y(wt), it follows that ordt (y(t) − y(wt)) > m1 for every w ∈ µn (k). This means that m1 < E, which proves that E = M2 . (6.16) PROPOSITION. Let ν be an integer such that |ν| = n. Let e be an integer such that 1 ≤ e ≤ h( f ) + 1. Let de = de ( f ) and let n′ = n/de′ ν′ = ν/de . Let f ′ be an irreducible element of k((X))[Y] such that f ′ is monic in Y and degY f ′ = n′ . Assume that Supp( f ′ ) = j/de j ∈ Supp( f ), j < me (ν, f ) . Then h( f ′ ) = e − 1, and for o ≤ i ≤ h( f ′ ) we have:

40

2. Characteristic Sequences of a Meromorphic Curve

(i) mi (ν′ , f ′ ) = mi (ν, f )de . (ii) di+1 ( f ′ ) = di+1 ( f )/de . (iii) qi (ν′ , f ′ ) = q + i(ν, f )/de . (iv) si (ν′ , f ′ ) = si (ν, f )/de2 (if i , 0). (v) ri (ν′ , f ′ ) = ri (ν, f )/de . 40

Proof. (i) and (ii) follow from Lemma (6.7). (iii), (iv) and (v) follow immediately from (i) and (ii). (6.17) PROPOSITION. Let ν be an integer such that |ν| = n. Let f ′ be an irreducible element of k((X))[Y] such that f ′ is monic in Y and degY f ′ = n. Suppose there exists z(t) ∈ k((t)) such that f ′ (tn , z(t)) = 0 and ordt (z(t) − y(t)) > mh (ν, f ), where h = h( f ). Then we have: (i) h( f ′ ) = h( f ). (ii) m(ν, f ′ ) = m(ν, f ). (iii) q(ν, f ′ ) = q(ν, f ). (iv) s(ν, f ′ ) = s(ν, f ). (v) r(ν, f ′ ) = r(ν, f ). (vi) d( f ′ ) = d( f ). Proof. Let J = Supp( f ), J ′ = Supp( f ′ ). Then the hypothesis implies that we have ′ (6.17.1) j ∈ J j ≤ mh (ν, f ) = j ∈ J j ≤ mh (ν, f ) . We shall prove the lemma under the weaker assumption (6.17.1). Note that it is enough to prove (ii). For, the rest then follows from (ii) and the definition. We first prove by induction on i that we have (6.17.2)

i ≤ h( f ′ ) and mi (ν, f ) = mi (ν, f ′ )

6. Characteristic Sequences

41

for 0 ≤ i ≤ h = h( f ). For i = 0, this is clear. Suppose now that p is an integer, 1 ≤ p ≤ h, such that (6.17.2) holds for 0 ≤ i ≤ p − 1. Then by Proposition (6.13) (x) we have d p ( f ′ ) = d p ( f ) = d p , say. Let j ∈ J j , 0 (mod d p ) , Jp = J, ′ j , 0 (mod d ) , j ∈ J p J ′p = J′,

if p ≥ 2, if p = 1, if p ≥ 2, if p = 1.

Then we have m p (ν, f ) = inf(J p ), m p (ν, f ′ ) = inf(J ′p ). Since m p (ν, f ) 41 ≤ mh (ν, f ), we have m p (ν, f ) ∈ J ′p by (6.17.1). This shows that m p (ν, f ′ ) ≤ m p (ν, f ) ≤ mh (ν, f ). Therefore by (6.17.1) m p (ν, f ′ ) ∈ J p′ so that m p (ν, f ′ ) ≤ m p (ν, f ). This proves that m p (ν, f ′ ) = m p (ν, f ) < ∞, which shows also that p ≤ h( f ′ ). Thus (6.17.2) is proved for 0 ≤ i ≤ h. In particular, we get h ≤ h( f ′ ) and dh+1 ( f ′ ) = dh+1 ( f ) = 1 by Proposition (6.13). This means that ′ Jh+1 = j ∈ J j < 0 (mod dh+1 ( f ′ ))

is empty, so that h ≥ h( f ′ ). Thus we have h( f ′ ) = h = h( f ) and by (6.17.2) we get m(ν, f ) = m(ν, f ′ ).

Chapter 3 The Fundamental Theorem 7 The Main Lemmas Throughout this section, we preserve the notation introduced in (7.1) 42 and (7.2) below (7.1) NOTATION. Let k be an algebraically closed field and let X, Y, t be indeterminates. Let n be a positive integer such that char k does not divide n. Let f = f (X, Y) be an irreducible element of k((X))[Y] such that f is monic in Y and degY f = n. Let ν be an integer such that |ν| = n. Let h = h( f ) and for every i, 0 ≤ i ≤ h + 1, let mi = mi (ν, f ) qi = qi (ν, f ) si = si (ν, f ) ri = ri (ν, f ) di+1 = di+1 ( f ). Also, let ni = di /di+1 for 1 ≤ i ≤ h. (Note that by Proposition (6.13)) ni is a positive integer for every i and ni ≥ 2 for 2 ≤ i ≤ h. Finally, we fix a root y(t) of f (tn , Y) = 0, i.e., we fix an element y(t) of k((t)) such that f (tn , y(t)) = 0. Recall then that by Newton’s Theorem (5.14) we have Y f (tn , Y) = (Y − y(wt)), w∈µn

43

3. The Fundamental Theorem

44 43

where for a positive integer m we write µm for µm (k). Let X y(t) = y jt j j

with y j ∈ k for every j. (7.2) NOTATION. We shall use the symbol to denote a generic (i.e. unspecified) non-zero element of k. Thus if k′ is an overfield of k and a ∈ k′ then a = means that a ∈ k and a , 0. Similarly, b = c means that b = ac for some a ∈ k, a , 0. Note that a = and b = does not mean that a = b. (7.3) DEFINITION. Let k′ be an overfield of k and let z be a non-zero element of k′ ((t)). If m = ordt z, we can write z in the form z = atm + tm+1 z′ with a ∈ k, a , 0 and z′ ∈ k′ ((t)). We define the initial form (resp. initial co-efficient) of z, denoted info (z) (resp. inco (z)), by info (z) = atm (resp. inco (z) = a). We also define info (0) = 0, inco (0) = 0. (7.4) DEFINITION. Let i be an integer with 1 ≤ i ≤ h + 1. We define A(i) = w ∈ µn ordt (y(t) − y(wt)) < mi , R(i) = w ∈ µn ordt (y(t) − y(wt)) ≥ mi , S (i) = w ∈ µn ordt (y(t) − y(wt)) = mi . (7.5) LEMMA. Let i be an integer, 1 ≤ i ≤ h + 1. Then: (i) R(i) = µdi . In particular, card (R(i)) = di . 44

(ii) Let i ≤ h. Then S (i) = R(i) − R(i + 1) = µdi − µdi+1 . In particular, card (S (i)) = di − di+1 . (iii) S (h + 1) = {1}. Proof.

7. The Main Lemmas

45

(i) By Lemma (6.14) we have R(i) = w ∈ µn ord(w) divides di = w ∈ µn wdi = 1 = µdi .

(ii) Since, for every w ∈ µn , ordt (y(t) − y(wt)) belongs to the set {m1 , . . . , mh+1 } by Proposition (6.15), we see that S (i) = R(i) − R(i + 1), for 1 ≤ i ≤ h. Therefore (ii) follows from (i). (iii) This is clear, since mh+1 = ∞ and the roots y(wt), w ∈ µn , are distinct. (7.6) LEMMA. Let e be an integer, 1 ≤ e ≤ h, and let m = me . Let z be an element of an overfield of k. Then we have Y (z − wm ym ) = (zne · ynme )de+1 . w∈R(e)

Proof. Let u be a primitive de th root of unity in k and let v = um . Then, since de+1 = g.c.d. (de , m), we see that v is a primitive nth e root of unity. Therefore, since R(e) = µde = ui 1 ≤ i ≤ de

by Lemma (7.5), we get Y

(z − wm ym ) =

w∈R(e)

de Y (z − vi ym ) i=1

=

de+1 ne Y−1 Y

(z − v j+ine ym )

i=0

j=1

de+1 ne Y = (z − vi ym )

(since vne = 1)

j=1

= zne − ynme

de+1

since v is a primitive nth e root of unity.

,

45

3. The Fundamental Theorem

46

(7.7) LEMMA. Let i be an integer, 1 ≤ i ≤ h + 1. Then we have Y si−1 − mi−1 di , if i ≥ 2, (y(t) − y(wt)) = ordt 0, if i = 1. w∈Q(i)

Proof. Since, for every w ∈ µn , ordt (y(t) − y(wt)) belongs to the set {m1 , . . . , mh+1 } by Proposition (6.15), we get (7.7.1)

Y

(y(t) − y(wt)) =

w∈Q(i)

i−1 Y Y

(y(t) − y(wt)).

j=1 w∈S ( j)

From this the assertion is clear for i = 1. Assume now that i ≥ 2. Since card (S ( j)) = d j − d j+1 by Lemma (7.5), we have Y (y(t) − y(wt)) = (d j − d j+1 )m j ordt w∈S ( j)

for 1 ≤ j ≤ h. Therefore from (7.7.1) we get i−1 X Y d j − d j+1 m j (y(t) − y(wt)) = ordt j=1

w∈Q(i)

=

i−1 X

q j d j − mi−1 di

j=1

= si−1 − mi−1 di . (7.8) COROLLARY. h Y X q j (d j − 1) = sh − mh . (y(t) − y(wt)) = ordt w∈µn w,1

Proof. The equality

j=1

h X

q j (d j − 1) = sh − mh is clear. Now, if h = 0

j=1

46

then n = d1 = dh+1 = 1, so that the assertion is clear in this case, since

7. The Main Lemmas

47

in the middle we have an empty sum and on the left hand side the order of an empty product. Assume now that h ≥ 1. Taking i = h + 1 in Lemma (7.7), we get Q(i) = µn − {1} and si−1 − mi−1 di = sh − mh dh+1 = sh − mh =

h X

q j (d j − 1).

j=1

(7.9) COROLLARY. Let fY (X, Y) denote the Y-derivative of f (X, Y). Then we have ordt ( fY (tn , y(t))) =

h X

q j (d j − 1) = sh − mh .

j=1

Proof. Since f (tn , Y) =

Y

(Y − y(wt))

w∈µn

we get fY (tn , y(t)) =

Y

(y(t) − y(wt))

w∈µn w,1

and the assertion follows from Corollary (7.8).

(7.10) COROLLARY. Let u(t) be an element of k((t)) such that ordt (u(t) − y(t)) > mh . Then ordt ( f (tn , u(t))) = sh − mh + ordt (u(t) − y(t)). Proof. Let w ∈ µn , w , 1. Then ordt (y(t) − y(wt)) ≤ mh by Lemma (6.14). Therefore, since u(t) − y(wt) = (u(t) − y(t)) + (y(t) − y(wt)) and since ordt (u(t) − y(t)) > mh , we get ordt (u(t) − y(wt)) = ordt (y(t) − y(wt))

47

3. The Fundamental Theorem

48

for every w ∈ µn , w , 1. Therefore Y (u(t) − y(wt)) ordt ( f (tn , u(t))) = ordt w∈µn Y = ordt (u(t) − y(t)) + ordt (y(t) − y(wt)) w,1

= ordt (u(t) − y(t)) + sh − mh

by Corollary (7.8).

(7.11) LEMMA. Let i be an integer, 1 ≤ i ≤ h + 1. Let y(t) =

P

y jt j.

j re , and from (7.20.4) we get ge (tn , y) = catre + v.

8 The Fundamental Theorem Throughout this section we preserve the notation of (7.1). In addition, we also fix the following notation: (8.1) NOTATION. For an integer e, 1 ≤ e ≤ h + 1, we get Y, ge = ge (X, Y) = Appde ( f ), Y

We note that gh+1 = f .

if e = 1, if e ≥ 2.

8. The Fundamental Theorem

59

(8.2) Fundamental Theorem (Part One). 59

Let e be an integer, 1 ≤ e ≤ h + 1. Then we have ordt ge (tn , y(t)) = re . Proof. Since gh+1 = f and rh+1 = ∞, the assertion is clear for e = h + 1. Next, we have g1 (tn , y(t)) = y(t), and ordt y(t) = m1 = r1 , which proves the assertion for e = 1. Assume now that 2 ≤ e ≤ h. Then the assertion X is immediate from Corollary (7.20) by taking a = yme and u = y j ti j>me

and noting that yme , 0.

(8.3) Fundamental Theorem (Part Two) Let R be a subring of k((X)) such that n is a unit in R and f ∈ R[Y]. Then: (i) gi ∈ R[Y] for every i, 1 ≤ i ≤ h + 1. Further, let R[Y] = R[Y]/ f R[Y] and let gi be the image of gi under the canonical map R[Y] → R[Y]. Then: (ii) R[Y] is a free R-module with the set g¯ b b ∈ B as a free basis, where g = (g1 , . . . , gh ) and B = b = (b1 , . . . , bh ) ∈ Zh 0 ≤ bi < di /di+1 for 1 ≤ i ≤ h .

h (For h = 0 interpret this notation as B = {φ} and g¯ b ∈ B = {1}.)

Proof.

1. For i = 1, g1 = Y ∈ R[Y]. For i ≥ 2 the assertion follows from the uniqueness of AppdYi ( f ). 2. We first note that since degY f = n > 0 and since f is monic in Y, the restriction of the canonical map η : R[Y] → R[Y] to R is injective. We identify R with its image in R[Y]. Then, writing F = η(F) for F ∈ R[Y], we have (8.3.1)

F = F for every F ∈ R.

3. The Fundamental Theorem

60

60

Now, let Gi = gi for 1 ≤ i ≤ h + 1. Then the (h + 1)-tuple G = (G1 , . . . , Gh+1 ) satisfies conditions (i)-(iii) of (2.2) with p = h + 1. Therefore by Corollary (2.14) every element F of R[Y] has a unique expression of the form X (8.3.2) F= FaGa , Fa ∈ R, a∈A(G)

where A(G) = a = (a1 , . . . , ah+1 ) ∈ Zh+1 0 ≤ ai < ni (G) for 1 ≤ i ≤ h + 1 . Recall that with the notation of (2.2) we have nh+1 (G) = ∞ and ni (G) = (n/di+1 )/(n/di ) = di /di+1

(8.3.3)

for 1 ≤ i ≤ h. Now, let F be any element of R[Y] and let F ∈ R[Y] be a lift of F. Then from (8.3.1) and (8.3.2) we get X a (8.3.4) F= F aG . a∈A(G)

Now Gh+1 = 0. Therefore, if a ∈ A(G) is such that ah+1 , 0 then G = 0. Therefore, in view of (8.3.3), the expression (8.3.4) reduces to the form X F= Fb′ gb , a

b∈B

where

Fb′

= F(b1 ,...,bh ,0) for b ∈ B. This proves that R[Y] is generated as b an R-module by the set g b ∈ B . Now, to prove that this set is a free basis, suppose X (8.3.5) 0= Fb′ gb b∈B

Fb′

with define

(8.3.6)

∈ R for every b and

Fb′

= 0 for almost all b. For a ∈ A(G),

′ , F(a 1 ,...,ah ) Fa = 0,

if ah+1 = 0, if ah+1 , 0.

8. The Fundamental Theorem 61

61

Let

X

F=

FaGa .

a∈A(G)

It is enough to prove that F = 0. For, this would imply by the uniqueness of the expression (8.3.2) that Fa = 0 for every a ∈ A(G), which would prove, in view of (8.3.6), that Fb′ = 0 for every b ∈ B. Now, suppose F , 0. Then, since f divides F in F[Y] by (8.3.5), we have F < R and deg F ≥ deg f = deg Gh+1 . But this is a contradiction by (8.3.6) and Lemma (2.12). Therefore F = 0, and the proof of the theorem is complete. (8.4) LEMMA. Let k((X)) be identified with the subfield k((tn )) of k((t)) by putting X = tn . Let R be a subring of k((X)) such that f ∈ R[Y]. Let R[y(t)] be the R-subalgebra of k((t)) generated by y(t). Then: (i) R[y(t)] = F(tn , y(t)) F(X, Y) ∈ R[Y] . (ii) There exists an R-algebra isomorphism

u : R[Y]/ f R[Y] → R[y(t)] which fits in a commutative diagram R[Y]H

η

HH HH H u HHH $

/ R[Y]/ f R[Y] qq qqq q q xqqq u¯

R[y(t)]

where η is the canonical homomorphism and u is defined by u(F(x, Y)) = F(tn , y(t)) for F(X, Y) ∈ R[Y]. Proof. (i) This is clear. (ii) It is clear that u is an R-algebra homomorphism. Since u( f ) = f (tn , y(t) = 0, u factors via η to give u. Since u is surjective by (i), so is u. To show that u is injective, it is enough to show that ker u = 62

3. The Fundamental Theorem

62

f R[Y]. Let F(X, Y) ∈ ker u. Then F(tn , y(t)) = 0. Therefore, since f is the minimal monic polynomial of y(t) over k((tn )), f divides F(X, Y) in k((X))[Y]. Since f is monic, it that follows f divides F(x, Y) in R[Y].

(8.5) Fundamental Theorem (Part Three) Let k((X)) be identified with the subfield k((tn )) of k((t)) by putting X = tn . Let R be a subring of k((X)) such that n is a unit in R and f ∈ R[Y]. Let R[y(t)] = F(tn , y(t)) F(X, Y) ∈ R[Y] . Let gi = gi (tn , y(t)), 1 ≤ i ≤ h. Then:

(i) R[y(t)] is a free R-module with the set {gb |b ∈ B} as a free basis, where g = (g1 , . . . , gh ) and B = b = (b1 , . . . , bh ) ∈ Zh 0 ≤ bi < di /di+1 for 1 ≤ i ≤ h .

(ii) Let F ∈ R[y(t)] and let

F=

X

F b gb ,

Fb ∈ R.

b∈B

If b, b′ ∈ B, b , b′ and Fb , 0, Fb′ , 0 then ′

ordt (Fb gb ) , ordt (Fb′ gb ). In particular, if F , 0 then there exists a unique b ∈ B such that ordt F = ordt (Fb gb ). (iii) With the notation of (ii), let b ∈ B be the unique element such that ordt (F) = ordt (Fb gb ). Then ordt F = ordt Fb +

h X i=1

bi ri .

8. The Fundamental Theorem

63

Proof. 63

(i) We first note that by Theorem (8.3) we have gi ∈ R[Y] for 1 ≤ i ≤ h. Let us now identify R[Y(t)] and R[Y] = R[Y]/ f R[Y] as Ralgebras via the isomorphism u of Lemma (8.4). With this identification, gi is the image of gi under the canonical map R[Y] → R[Y]. Therefore (i) follows directly from Theorem (8.3). (ii) Let Γ+ (R) = (n/r0 ) ord X G G ∈ R, G , 0 . Then, since n = |r0 |, it is clear that Γ+ (R) is a subsemigroup of R is a subsemigroup of Z. For b = (b1 , . . . , bh ) ∈ B such that Fb , 0, let us define b0 = (n/r0 ) ord X Fb . Then b0 ∈ Γ+ (R). Since ri = ordt gi by Theorem (8.2), we get (8.5.1)

ordt (Fb gb ) = ordt Fb +

h X i=1

bi ri =

h X

bi ri ,

i=0

since b0 r0 = n ordX Fb = ordt Fb . Similarly, if b′ ∈ B and Fb′ , 0 then (8.5.2)

b′

ordt (Fb′ g ) =

h X

b′i ri ,

b′0 ∈ Γ+ (R).

i=0

Now, since b, b′ ∈ B, we have 0 ≤ bi < di /di+1 , 0 ≤ b′i < di /di+1 for 1 ≤ i ≤ h. Thus (8.5.1) and (8.5.2) are Γ+ (R)-strict linear combinations of r = (r0 , . . . , rh ). Therefore (ii) follows from Proposition (1.5). (iii) This was proved in (8.5.1) above. (8.6) DEFINITION. Let R be a subring of k((X)) such that f ∈ R[Y]. Let w ∈ µn (k). The set n n ordt F(t , y(wt)) F(X, Y) ∈ R[Y], F(t , y(wt)) , 0 .

which is clearly independent of w ∈ µn (k) and is a subsemigroup of Z, is called the value semigroup of f with respect to R and is denoted ΓR ( f ). 64

3. The Fundamental Theorem

64

(8.7) Fundamental Theorem (Part Four). Let R be a subring of k((X)) such that n is a unit in R and f ∈ R[Y]. Let Γ+ (R) = (n/r0 ) ord X F F ∈ R, F , 0 . Then we have:

(i) Γ| (R) is a subsemigroup of Z. (ii) Γ+ (R)r0 ⊂ ΓR ( f ) and ri ∈ ΓR ( f ) for every i, 1 ≤ i ≤ h. (iii) ΓR ( f ) is Γ(R)-strictly generated by r = (r0 , . . . , rh ). In particular, suppose we are in one of the following two cases: (1) The ALGEBROID CASE: R = k′ [[X]] for some subfield k′ of k, f ∈ R[Y] and r0 = n. (2) The PURE MEROMORPHIC CASE: R = k′ [X −1 ] for some subfield k′ of k, f ∈ R[Y] and r0 = −n. Then we have: (i’) Γ+ (R) = Z+ . (ii’) ri ∈ ΓR ( f ) for every i, 0 ≤ i ≤ h. (iii’) ΓR ( f ) is strictly generated by r = (r0 , . . . , rh ). (For the definition of Γ+ (R)-strict generation, see (1.7)) Proof. (i) This is clear, since n = |r0 |. (ii) Let γ ∈ Γ+ (R). Then there exists F = F(X) ∈ R such that F , 0 and γ = (n/r0 ) ord X F. This gives γr0 = n ordX F = ordt f (tn ), which shows that γr0 ∈ ΓR ( f ). Next, since gi ∈ R[Y] by Theorem (8.3) and since ordt gi (tn , y(t)) = ri by Theorem (8.2), we get ri ∈ ΓR ( f ) for 1 ≤ i ≤ h.

8. The Fundamental Theorem

65

(iii) Let γ ∈ ΓR ( f ) and let F(X, Y) ∈ R[Y] be such that γ = ordt F(tn , y(t)). Put F = F(tn , y(t)). Then F , 0. Therefore by Theorem (8.5) (iii) we have γ = ordt F = ordt Fb (tn ) +

h X

bi ri ,

i=1

where Fb = Fb (X) ∈ R, Fb , 0, and bi ∈ Z, 0 ≤ bi < di /di+1 for 65 1 ≤ i ≤ h. Let b0 = (n/r0 ) ord X Fb . Then b0 ∈ Γ+ (R) and we have h X ordt Fb (tn ) = b0 r0 . Therefore γ = bi ri , which shows that γ is i=0

a Γ+ (R)-strict linear combination of r. Conversely, if γ =

h X

γi ri

i=0

is a Γ+ (R)-strict linear combination of r then it follows (ii) that γ ∈ ΓR ( f ). This proves (iii). (i’) is clear, and (ii’), (iii’) follow from (i’), (ii) and (iii).

(8.8) COROLLARY. With the notation of Theorem (8.7), suppose R contains an element of X-order 1 or −1. (This condition is satisfied, for example, if X ∈ R or X −1 ∈ R). Them g.c.d. (ΓR ( f )) = 1, i.e., the subgroup of Z generated by ΓR ( f ) coincides with Z. Proof. By assumption, we have n/r0 ∈ Γ+ (R) or −n/r0 ∈ Γ+ (R). Therefore by Theorem (8.7) (ii), n ∈ ΓR ( f ) or −n ∈ ΓR ( f ). Since n = |ro |, we get ro ∈ ΓR ( f ) or −ro ∈ ΓR ( f ). Also ri ∈ ΓR ( f ) for 1 ≤ i ≤ h by Theorem (8.7)(ii). Now, since g.c.d. (−r0 , r1 , . . . rh ) = g.c.d. (r0 , r1 , . . . , rh ) = dh+1 = 1, the corollary follows.

Chapter 4 Applications of The Fundamental Theorem 9 Epimorphism Theorem Let k be a field and let X, Y, Z, τ be indeterminates. (9.1) DEFINITION. Let C be a finitely generated k-subalgebra of k[Z] such that the quotient field of C is k(Z). We call C (the coordinate ring of) an affine polynomial curve over k and we call k(Z) the function field of C. If, moreover, C is generated as a k-algebra by two elements then we call C an affine polynomial plane curve. A k-algebra epimorphism (i.e., surjective homomorphism) α : k[X, Y] → C is called an embedding of C in the affine plane over k. Note that if C has an embedding in the affine plane then C is a plane curve. Moreover, the mapping α 7→ (α(X), α(Y)) gives a bijective correspondence between the embeddings of C in the affine plane and ordered pair (x, y) of elements of C such that C = k[x, y]. (9.2) DEFINITION. An embedding α : k[X, Y] → C is said to be permissible if α(X) , 0 and char k does not divide degZ α(X).

(9.3) Equation of an Embedding Let α : k[X, Y] → C be a permissible embedding of an affine plane polynomial curve C. Let x = α(X), y = α(Y). Then α(F) = F(x, y) for 67

66

4. Applications of The Fundamental Theorem

68

every F = F(X, Y) ∈ k[X, Y]. Let n = degZ x. Let k be the algebraic closure of k and let θ : k[Z] → k((τ)) be the k-algebra monomorphism defined by θ(Z) = τ−1 . Then it is clear that we have ordτ θ(F(x, y)) = − degZ F(x, y)

(9.3.1) 67

for every F(X, Y) ∈ k[X, Y]. In particular, we have ordτ θ(x) = −n. Since char k does not divide n, there exists, by Corollary (5.4), an element t ∈ k((τ)) such that ordτ t = 1 and θ(x) = t−n . Note then that we have k((t)) = k((τ)) and ordt a = ordτ a for every a ∈ k((t)). Write x = x(t) = θ(x) = t−n and y = y(t) = θ(y). We call y(t) a Newton-Puiseux expansion of y in fractional powers of x−1 . Let f = f (x, Y) ∈ k((X))[Y] be the minimal monic polynomial of y over k((tn )) (Definition (5.8)). Recall that f is the unique irreducible element of k((X))[Y] such that f is monic in Y and f (tn , y) = 0. We call f the meromorphic equation of the embedding α. (9.4) LEMMA. With the notation of (9.3), we have: (i) degY f = n. (ii) f ∈ k[X −1 , Y]. Proof. (i) We have degY f = [k((tn ))(y) : k((tn ))]. Therefore, since y ∈ k((t)), we get degY f ≤ [k((t)) : k((tn ))] = n. On the other hand, since α is surjective, we have Z ∈ k(x, y). Therefore τ−1 ∈ k(x, y) ⊂ k((tn ))(y), so that τ ∈ k((tn ))(y). Therefore degY f ≥ [k((tn ))(τ) : k((tn ))] = n

68

by Lemma (5.10), since 1 ∈ Suppt (τ). This proves (i). (ii) Since degZ x = n > 0, x is transcendental over k and k(Z) is algebraic over k(x) with [k(Z) : k(x)] = n. Let g(x, Y) ∈ k(x)[Y] be the minimal monic polynomial of y over k(x). Since α is surjective, we have k(x)(y) = k(Z). Therefore degY g(x, Y) = n. We claim that g(x, Y ∈ k[x][Y]). In order to prove the claim, we have only to show that

9. Epimorphism Theorem

69

y is integral over k[x]. Now, writing x =

n X ai Z i , ai ∈ k for 0 ≤ i ≤ n, i=1

an , 0, we have n

Z +

n−1 X

−1 i ai a−1 n Z + (a0 − x)an = 0,

i=1

which shows that Z is integral over k[x]. Since y ∈ k[Z], y is also integral over k[x]. Thus g(x, Y) ∈ k[x][Y]. Put h(X, Y) = g(X −1 , Y). Then h(X, Y) ∈ k[X −1 ][Y] ⊂ k((X))[Y] and h(X, Y) is monic in Y with degY h(X, Y) = n. Now, h(tn , y) = g(t−n , y) = g(θ(x), θ(y)) = θ(g(x, y)) = 0. This shows that f (X, Y) = h(X, Y) and (ii) is proved. (9.5) REMARK. Put ϕ = ϕ(X, Y) = f (X −1 , Y). Then by Lemma (9.4) ϕ ∈ k[X, Y]. We claim that ϕ generates ker α. To see this we note that ker α is a principal prime ideal of k[X, Y] and, since f is irreducible in k[X −1 , Y], ϕ is irreducible in k[X, Y]. Therefore it is enough to show that ϕ ∈ ker α. Now, θ(ϕ(x, y)) = ϕ(t−n , y) = f (tn , y) = 0. Since θ is a monomorphism, our claim is proved. Noting that ϕ is the unique generator of ker α which is monic in Y, we call ϕ the algebraic equation of the embedding α. If ψ is any generator of ker α then, clearly, we have ψ = ϕ for some . (9.6) REMARK. With the notation of (9.3), suppose S is a subring of k such that x and y belong to S [Z]. Consider the pair (X − x, Y − y) of elements of S [Z][X, Y] and let g = g(X, Y) ∈ S [X, Y] be the Z-resultant of X − x and Y − y. Then clearly g is monic in Y and, since degZ x = n, we have degY g = n. Moreover, we have g(x, y) = 0, so that 0 = θ(g(x, y)) = g(t−n , y). therefore it follows from Lemma (9.4) (i) that f (X, Y) = g(X −1 , Y) ∈ S [X −1 , Y]. This gives an alternative proof of part (ii) of Lemma (9.4).

(9.7) Characteristic Sequences of an Embedding Continuing with the notation of (9.3), let R = k[X −1 ]. Then f ∈ R[Y] by 69 Lemma (9.4). Let h = h( f ) and let mi = mi (−n, f ), qi = qi (−n, f ),

4. Applications of The Fundamental Theorem

70

si = si (−n, f ), ri = ri (−n, f ), di+1 = di+1 ( f ) for 0 ≤ i ≤ h + 1. The sequence (m0 , . . . , mh+1 ), (resp. (q0 , . . . , qh+1 ), resp. (s0 , . . . , sh+1 ), resp. (r0 , . . . , rh+1 ), resp. (d1 , . . . , dh+2 )) is called the characteristic m (resp. q, resp. s, resp. r, resp. d)- sequence of the permissible embedding α. Note that we have (9.7.1)

r0 = −n = − degZ α(X).

Moreover, by (9.3.1) we have (9.7.2)

r1 = ordt y = ordτ y = − degZ α(Y).

Let Z− = {a ∈ Z a ≤ 0}.

Recall that ΓR ( f ) is the subsemigroup of Z defined by ΓR ( f ) = ordt F(tn , y) F(X, Y) ∈ R[Y], F(tn , y) , 0 .

(9.8) LEMMA. With the notation of (9.7), we have: (i) ΓR ( f ) ⊂ Z− . (ii) If C = k[Z] then ΓR ( f ) = Z− .

(iii) ΓR ( f ) is strictly generated by r = (r0 , r1 , . . . , rh ). (iv) r0 < 0, r1 = ∞ or r1 ≤ 0, and ri < 0 for 2 ≤ i ≤ h. (v) If C = k[Z] and h ≥ 2 then rh = −1. 70

Proof. (i) Let F(X, Y) ∈ R[Y] be any element such that F(tn , y) , 0. Put G(X, Y) = F(X −1 , Y). Then G(X, Y) ∈ k[X, Y] and, with the notation of (9.3), we have ordt F(tn , y) = ordt G(t−n , y) = ordτ G(t−n , y) (9.8.1)

= ordτ G(θ(x), θ(y)) = ordτ θ(G(x, y)) = − degZ G(x, y)

9. Epimorphism Theorem

71

by (9.3.1). Therefore ordt F(tn , y) ≤ 0. this proves (i) (ii) In view of (i), it is enough to prove that −1 ∈ ΓR ( f ). Since α is surjective, thee exists G(X, Y) ∈ k[x, Y] such that G(x, y) = Z. Put F(X, Y) = G(X −1 , Y). Then F(X, Y) ∈ R[Y] and G(X, Y) = F(X −1 , Y). Therefore by the computation (9.8.1) we get ordt F(tn , y) = − degZ Z = −1. This shows that −1 ∈ ΓR ( f ). (iii) This is immediate from Theorem (8.7) (iii′ ). (iv) The assertion about r0 and r1 follows from (9.7.1) and (9.7.2). Now suppose 2 ≤ i ≤ h. Then we have (9.8.2)

di > di+1

by Proposition (6.13) (ii). Therefore 1 < di /di+1 , so that ri is a strict linear combination of r = (r0 , r1 , . . . , rh ). Therefore ri ∈ ΓR ( f ) by (iii), which shows by (i) that ri ≤ 0. Since di does not divide ri by (9.8.2), we have ri , 0. Therefore ri < 0. (v) It follows from (ii) and (iii) that ri = −1 for some i, 0 ≤ i ≤ h. Since h ≥ 2 and since dh divides ri for i ≤ h − 1 it follows from (9.8.2) that ri , −1 for 0 ≤ i ≤ h − 1. Therefore rh = −1. 71 (9.9) LEMMA. With the notation of (9.7), suppose −d2 ∈ ΓR ( f ). Then r0 divides r1 or r1 divides r0 . Proof. Since −d2 ∈ Z, we have d2 , −∞. This means that h ≥ 1. Therefore r1 , ∞ and it follows from Lemma (9.8) (iv) that ri ≤ 0 for i = 0, 1. since −d2 ∈ ΓR ( f ), Lemma (9.8) (iii) shows that −d2 is a strict linear combination of r = (r0 , . . . , rh ). Now, the assertion follows from Proposition (1.8). (9.10) DEFINITION. If C = k[Z], we call C the affine line over k. In Theorem (9.11) and (9.19) below we study the embeddings of the affine line in the affine plane.

(9.11) Epimorphism Theorem (First Formulation) Let k be any field and let α : k[X, Y] → k[Z] be a k-algebra epimorphism such that α(X) , 0, α(Y) , 0. Let n = degZ α(X), m = degZ α(Y).

4. Applications of The Fundamental Theorem

72

Suppose char k does not divide g.c.d. (m, n). Then n divides m or m divides n. Proof. By the symmetry of the assertion, we may assume that char k does not divide n. Then α is a permissible embedding. We now use the notation of (9.3) and (9.7) with C = k[Z]. By (9.7.1) and (9.7.2) we have r0 = −n and r1 = −m , ∞. Therefore h ≥ 1. By Lemma (9.8) (ii) we have ΓR ( f ) = Z− . Therefore −d2 ∈ ΓR ( f ), so that r0 divides r1 or r1 divides r0 by Lemma (9.9). This means that n divides m or m divides n, and the theorem is proved. The following example shows that in Theorem (9.11) we cannot relax the condition “char k does divide g.c.d. (m, n)”. (9.12) EXAMPLE. Let p = char k. Let e, s be positive integers and let e

x = Zp

y=Z+

s X

ai Z ip

i=0

72

with ai ∈ k for 0 ≤ i ≤ s and as , 0. Let α : k[X, Y] → k[Z] be the k-algebra homomorphism defined by α(X) = x, α(Y) = y. We claim that α is surjective. To prove our claim, it is enough to show that Z ∈ k[x, y]. j In fact, we show by descending induction on j that Z p ∈ k[x, y] for 0 ≤ j ≤ e, this assertion being clear for j = e. Suppose now that j ≥ 0 j+1 and Z p ∈ k[x, y]. We have j

j

yp = Z p +

s X

j

j+1

aip (Z p )i .

i=0

j

This shows that Z p ∈ k[x, y], and our claim is proved. Now, let n = degZ x = pe , m = degZ y = sp. It is clear that we can choose e, s to be such that neither n divides m nor m divides n. Specifically, take e ≥ 2 and s = qpc where q, c are integers such that q ≥ 2, q . 0 (mod p) and 0 ≤ c ≤ e − 2.

9. Epimorphism Theorem

73

(9.13) QUESTION. Let α : k[X, Y] → k[Z] be a k-algebra epimorphism such that α(X) , 0, α(Y) , 0. Let n = degZ α(X), m = degZ α(Y). Let p = char k, and let n = n′ pe , m = m′ pd , whee n′ , m′ , e, d are integers such that n′ . 0 (mod p), m′ . 0 (mod p), e ≥ 0, d ≥ 0. Is it then true that n′ divides m′ or m′ divides n′ ? (9.14) DEFINITION. Let A = k[X, Y] and let σ be a k-algebra automorphism of A. We say σ is primitive if there exists P(Z) ∈ k[Z] such that either

σ(X) = X,

σ(Y) = Y + P(X);

or

σ(X) = X + P(Y),

σ(Y) = Y.

We say σ is linear if there exist ai , bi , ci ∈ k, i = 1, 2, such that σ(X) = a1 X + b1 Y + c1 ,

σ(Y) = a2 X + b2 Y + c2 .

We say σ is elementary if σ is primitive or linear. We say σ is tame 73 if σ is a finite product of elementary automorphisms. (9.15) REMARK. It is easily checked that the set of all tame automorphisms of A is a subgroup of the group of all k-algebra automorphisms of A. In fact, it is true that all k-algebra automorphisms of A are tame. In the next section we shall deduce this fact from the Epimorphism Theorem in case char k = 0 (Theorem (10.1)) (9.16) DEFINITION. Let α, β : k[X, Y] → k[z] be k-algebra epimorphisms. We say α is equivalent (resp. tamely equivalent) to β if there exists a k-algebra automorphism (resp. tame automorphism) σ of k[X, Y] such that the diagram k[x, Y]H σ

HH HHα HH HH $

: k[Z] vv v vv vv vv β

k[X, Y]

74

4. Applications of The Fundamental Theorem

is commutative, i.e., α = βσ. (9.17) REMARK. It is clear that both equivalence and tame equivalence are equivalence relations and that tame equivalence implies equivalence. (9.18) DEFINITION. Let α : k[X, Y] → k[Z] be a k-algebra epimorphism. We say α is wild if α(X) , 0, α(Y) , 0 and char k divides both degZ α(X) and degZ α(Y).

(9.19) EPIMORPHISM THEOREM (SECOND FORMULATION). 74

Let α, β : k[X, Y] → k[Z] be k-algebra epimorphisms. Assume that neither α nor β is wild. Then α and β are tamely equivalent. In particular, α and β are equivalent. Proof. Let γ : k[X, Y] → k[Z] be the k-algebra epimorphism defined by γ(X) = Z, γ(Y) = 0. Then, since tame equivalence is an equivalence relation, it is enough to prove the following assertion: (9.19.1) If α is not wild then α and γ are tamely equivalent. Given α, we define the transpose αt of α to be the k-algebra epimorphism αt : k[X, Y] → k[Z] given by αt (X) = α(Y), αt (Y) = α(X). Clearly, α and αt are tamely equivalent and α is wild if and only if αt is wild. Put D(α) = degZ α(X) + degZ α(Y). Then D(α) = D(αt ). We now prove (9.19.1) by induction on D(α). First, suppose D(α) ≤ 1. Replacing α by αt , if necessary, we may assume that degZ α(X) ≥ degZ α(Y). Then, since α is surjective, the assumption D(α) ≤ 1 implies that degZ α(Y) ≤ 0 and degZ α(X) = 1. This means that there exist a, b, c, ∈ k, a , 0, such that α(X) = aZ + b and α(Y) = c. Let σ be the k-algebra automorphism of k[X, Y] defined by σ(X) = a(X) + b, σ(Y) = Y + c. Then σ is tame and clearly we have α = γσ. Now, suppose D(α) ≥ 2. Again, replacing α by αt , if necessary, we may assume that degZ α(X) ≥ degZ α(Y). This means, in particular, that α(X) < k. If α(Y) ∈ k then degZ α(X) ≥ 2. This is not possible, since

9. Epimorphism Theorem

75

α is surjective. Therefore α(X) < k and α(Y) < k. Let n = degZ α(X), m = degX α(Y). Since α is not wild and n ≥ m ≥ 1, it follows from Theorem (9.11) that m divides n. Let n = rm, where r is a positive integer. Write α(X) =

rm X

ai Z i ,

α(Y) =

i=0

m X

b jZ j

j=0

with ai , b j ∈ k for 0 ≤ i ≤ rm, 0 ≤ j ≤ m and bm , 0. Let σ be the 75 r k−algebra automorphism of k[X, Y] defined by σ(X) = X − arm b−r mY ′ and σ(Y) = Y. Then σ is primitive, therefore tame. Let α = ασ. Then α′ : k[X, Y] → k[Z] is a k-algebra epimorphism and α and α′ are tamely equivalent. Now, we have α′ (X) = α(σ(X)) r = α(X − arm b−r mY )

=

rm X i=0

r m X b X j . ai Z i − arm b−r j m j=0

This shows that degX α′ (X) < rm = n. Moreover, α′ (Y) = α(σ(Y)) = α(Y). Therefore degZ α′ (Y) = m, and we get D(α′ ) < D(α). Now, since α is not wild, char k does not divide g.c.d. (n, m) = m = degZ α′ (Y). This shows that α′ is not wild, so that α′ and γ are tamely equivalent by induction hypothesis. Therefore α and γ are tamely equivalent, and (9.19.1) is proved. (9.20) COROLLARY. If char k = 0 then any two k-algebra epimorphisms k[X, Y] → k[Z] are tamely equivalent. Proof. Immediate from Theorem (9.19), sine if char k = 0 then there are no wild k-algebra epimorphisms. (9.21) COROLLARY. Let char k = 0. Let ϕ be an element of k[X, Y] such that k[X, Y]/(ϕ) is isomorphic (as a k-algebra) to k[Z]. Then there exists an element ψ of k[X, Y] such that k[ψ, ϕ] = k[X, Y].

76

76

4. Applications of The Fundamental Theorem

Proof. Let α : k[X, Y] → k[Z] be the k-algebra epimorphism defined by α = vu, where u : k[X, Y] → k[X, Y]/(ϕ) is the natural surjection and v : k[X, Y]/(ϕ) → k[Z] is a k-algebra isomorphism. Then ker α = (ϕ). Let β : k[X, Y] → k[z] be the k-algebra epimorphism defined by β(X) = Z, β(Y) = 0. then ker β = (Y). By Corollary (9.20) there exists a k-algebra automorphism σ of k[X, Y] such that β = ασ. This gives (ϕ) = ker α = σ(ker β) = (σ(Y)). Therefore σ(Y) = ϕ. Let Ψ = σ(X). Then k[x, Y] = k[σ(X), σ(Y)] = k[ψ, ϕ] = k[ψ, ϕ]. (9.22) LEMMA. Let the assumptions be those of Corollary (9.21). Assume, moreover, that degY ϕ > 0. Then: (i) ϕ is monic in Y for some . (ii) ϕ(X −1 , Y) is irreducible in k((X))[Y], where k is the algebraic closure of k. Proof. Let α : k[X, Y] → k[Z] be the k-algebra epimorphism defined at the beginning of the proof of Corollary (9.21). Then ker α = (ϕ). Since degY ϕ > 0, we have X − a . 0 (mod ϕ) for every a ∈ k. This shows that degZ α(X) > 0. Therefore α is a permissible embedding. Let f = f (X, Y) ∈ k((X))[Y] be the meromorphic equation of α. It follows from Remark (9.5) that ker α = ( f (X −1 , Y)). Therefore f (X −1 , Y) = ϕ for some , and the lemma is proved. (9.23) COROLLARY. Let the assumptions be those of Corollary (9.21). Assume, moreover, that degY ϕ > 0. Then there exists an element ψ of k[X, Y] such that degY ψ < degY ϕ and k[ψ, ϕ] = k[X, Y]. Proof. By Corollary (9.21) there exists ψ ∈ k[X, Y] such that k[ψ, ϕ] = k[X, Y]. It is now enough to show that if degY ψ ≥ degY ϕ then there exists ψ′ ∈ k[X, Y] such that degY ψ′ < degY ψ and k[ψ′ , ϕ] = k[X, Y]. Let n = degY ϕ, m = degY ψ and suppose m ≥ n. In view of Lemma (9.22), replacing ϕ by ϕ, we may assume that ϕ is monic in Y. Similarly, since k[X, Y]/(ψ) = k[ψ, ϕ]/(ψ) ≈ k[ϕ] ≈ k[Z].

77

we may replace ψ by ψ and assume that ψ is monic in Y. Now,

9. Epimorphism Theorem

77

k[ψ, ϕ] = k[X, Y] implies that k′ [ψ, ϕ] = k′ [Y], where k′ = k(X). Therefore if S , T are indeterminates then the k′ −algebra homomorphism γ : k′ [S , T ] → k′ [Y] defined by γ(S ) = ψ, γ(T ) = ϕ, is surjective. Therefore by Theorem (9.11) n divides m or m divides n. Since m ≥ n, we get m = pn for some positive integer p. Let ψ′ = ψ − ϕ p . Then k[ψ′ , ϕ] = k[ψ, ϕ] = k[X, Y]. Moreover, since both ψ and ϕ are monic in Y, we have degY ψ′ < m. (9.24) THEOREM. Let char k = 0. Let ϕ = ϕ(X, Y) be an element of k[X, Y] such that n = degY ϕ > 0, ϕ is monic in Y and k[X, Y]/(ϕ) is isomorphic (as a k-algebra) to k[Z]. Let f = f (X, Y) = ϕ(X −1 , Y). Then f is irreducible in k((X))[Y]. Let h = h( f ) and let ψ = AppdY (ϕ), where d = dh ( f ). If h ≥ 2 then k[ψ, ϕ] = k[X, Y]. (As usual, k denotes the algebraic closure of k.) Proof. Let α : k[X, Y] → k[Z] be the k-algebra epimorphism defined by α = vu, where u : k[X, Y] → k[x, Y]/(ϕ) is the natural surjection and v : k[X, Y]/(ϕ) → k[z] is a k-algebra isomorphism. Then ker α = (ϕ) and, since n > 0, α is a permissible embedding. Since ϕ is monic in Y, it follows from Remark (9.5) that f is the meromorphic equation of α. We now use the notation of (9.3) and (9.7). Let g = g(X, Y) = AppdY ( f ). Then by Proposition (4.7) g(X, Y) = ψ(X −1 , Y). Since h ≥ 2, we have ordt ψ(t−n , y) = ordt g(tn , y) = rh by Theorem (8.2). Since ψ(t−n , y) = θ(ψ(x, y)), it follows from (9.3.1) that degZ ψ(x, y) = −rh . By Lemma (9.8) (v) we have rh = −1. Therefore we have (9.24.1)

degZ α(ψ) = 1.

Now, by Corollary (9.23) there exists an element ψ′ of k[X, Y] such that degY ψ′ < n and k[ψ′ , ϕ] = k[X, Y]. It follows that k[Z] = k[α(ψ′ )]. 78 Therefore we have (9.24.2)

degZ α(ψ) = 1.

It follows from (9.24.1) and (9.24.2) that we have α(ψ′ ) = aα(ψ) + b for some a, b ∈ k, a , 0. This means that ψ′ = aψ + b + λϕ

78

4. Applications of The Fundamental Theorem

for some λ ∈ k[X, Y]. Since degY ψ′ < n and degY ψ = n/d < n, we get λ = 0 and ψ′ = aψ + b. This shows that k[ψ′ , ϕ] = k[ψ, ϕ], and the theorem is proved. With the notation and assumptions of Theorem (9.24) we have the following four corollaries: (9.25) COROLLARY. If h ≥ 2 then rn (−n, f ) = −1. Proof. This was noted in the proof of the theorem above.

(9.26) COROLLARY. degY ϕ divides degX ϕ or degX ϕ divides degY ϕ. Proof. Let α : k[X, Y] → k[Z] be the permissible embedding defined in the proof of Theorem (9.24). Then, since f (X, Y) = ϕ(X −1 , Y) is the meromorphic equation of α (Remark (9.5)), it follows from Lemma (9.4) that degZ α(X) = degY ϕ = n, Let m = degX ϕ. If m = 0 then n divides m. If m > 0 then by the argument above, we get degZ α(Y) = m. Now, it follows from Theorem (9.11) that n divides m or m divides n. (9.27) COROLLARY. d2 ( f ) = d1 ( f ) or d2 ( f ) = −q1 (−n, f ). 79

Proof. As seen in the proof of Corollary (9.26), we have n = degZ α(X). Therefore d1 ( f ) = degZ α(X). Moreover, by (9.7.2) we have degZ α(Y) = −q1 (−n, f ). Now, the corollary follows from Theorem (9.11). (9.28) COROLLARY. k[X, Y]/(ψ) is isomorphic (as a k-algebra) to k[Z]. Proof. This is clear, since k[X, Y] = k[ψ, ϕ].

(9.29) REMARK. The results proved in (9.21) - (9.28) above hold also for char k > 0 (and, infact, the same proof goes through), provided we make the assumption that degY ϕ (or, by symmetry, degX ϕ) is not divisible by char k.

10. Automorphism Theorem

79

10 Automorphism Theorem As in §9, k ia an arbitrary field and X, Y, Z are indeterminates.

(10.1) Automorphism Theorem. Every k-algebra automorphism of k[X, Y] is tame. (For the definition of a tame automorphism, see (9.14). In the proof below we deduce the Automorphism Theorem from the Epimorphism Theorem in case char k = 0. For a proof in the general case the reader is referred to [5].) Proof of (10.1) in char k = 0. Let ϕ be a k-algebra automorphism of k[X, Y]. Let γ : k[X, Y] → k[Z] be the k-algebra epimorphism defined by γ(X) = Z, γ(Y) = 0, and let α = γϕ. Then α : k[X, Y] → k[Z] is also an epimorphism. Therefore by Corollary (9.20) there exists a tame k-algebra automorphism σ of k[X, Y] such that α = γσ. Thus we get γϕ = γσ. Put ψ = ϕσ−1 . Then ϕ = ψσ, and it is enough to prove that ψ is tame. Now, γψ = γ. Therefore ψ(ker γ) = ker γ. Now, ker γ = (Y). Therefore we have 80 ψ(Y) = aY

(10.1.1) for some a ∈ k, a , 0. Now,

k[Y][X] = k[ψ(Y), ψ(X)] = k[aY, ψ(X)] = k[Y][ψ(X)]. Therefore there exist P(Y) ∈ k[Y] and b ∈ k, b , 0, such that (10.1.2)

ψ(X) = bX + P(Y).

It is clear from (10.1.1) and (10.1.2) that ψ is tame. (10.2) THEOREM. Let f , g be elements of k[X, Y] such that k[ f, g] = k[X, Y]. Then deg f divides deg g or deg g divides deg f . (Here deg denotes total degree with respect to X, Y. In the proof below we deduce Theorem (10.2) from the Epimorphism Theorem in

80

4. Applications of The Fundamental Theorem

case char k = 0. For a proof in the general case the reader is referred to [5].) Proof of (10.2) in case char k = 0. Let n = deg f , m = deg g. Let f + be the homogeneous component of f of degree n, i.e., f + is a homogeneous polynomial in X, Y of degree n such that f = f + + f ′ with f ′ ∈ k[X, Y] and deg f ′ < n. It is then clear that degY f < n if an only if X divides f + . + + Similarly, degY g < m if and only if X divides g , where g is the homogeneous component of g of degree m. Since X + aY a ∈ k is an infinite

81

set of mutually coprime elements of k[X, Y], there exists a ∈ k, a , 0, such that X ′ = X + aY divides neither f + nor g+ . Therefore, replacing X by X ′ we may assume that n = degY f , m = degY g. Let k′ = k(X) and let S , T be indeterminates. Let α : k′ [S , T ] → k′ [Y] be the k′ -algebra homomorphism defined by α(S ) = f , α(T ) = g. Then the assumption k[ f, g] = k[X, Y] implies that α is an epimorphism. Therefore it follows from Theorem (9.11) that n divides m or m divides n.

11 Affine Curves with One Place at Infinity (11.1) Throughout this section, by a valuation we shall mean a real discrete valuation with value group Z. Thus if K is a field then a valuation v of K is a map v : K → Z ∪ {∞} satisfying the following three conditions: (i) v(a) = ∞ if an only if a = 0 (ii) v|K ∗ : K ∗ → Z is a surjective homomorphism of groups, where K ∗ is the group of units of K (iii) v(a + b) ≥ min(v(a), v(b)) for all a, b ∈ K. We denote by Rv the ring of v and by mv the maximal ideal of Rv . Recall that Rv = {a ∈ k|v(a) ≥ 0} and mv = {a ∈ K|v(a) > 0}. The ring Rv is a discrete valuation ring with quotient field K. If k is a subfield of K such that v(a) = 0 for every non-zero element a of k then we say, as usual, that v is a valuation of K/k. Note that in this case the residue

11. Affine Curves with One Place at Infinity

81

field Rv /mv of v is an overfield of k. We say v is residually rational over k if k = Rv /mv . Let L/K be a field extension. Let v be a valuation of K and let w be a valuation of L. We say w extends (or lies over) v if Rw ∩ K = Rv . (11.2) DEFINITION. Let k be a field and let A be a k-algebra. We say A is an affine curve over k (more precisely, the coordinate ring of an integral affine curve over k) if the following three conditions are satisfied: (i) A is finitely generated as a k-algebra. (ii) A is an integral domain. (iii) A has Krull dimension one, i.e. if K is the quotient field of A then 82 tr. degk K = 1 (11.3) DEFINITION. Let A be an affine curve over k. We say A is a plane affine curve (resp. the affine line) if A is generated as a k-algebra by two elements (resp. one element). Note that the affine line is the polynomial ring in one variable over k. (11.4) DEFINITION. Let A be an affine curve over k. We say A has only one place at infinity if the following two conditions are satisfied: (i) There exists exactly one valuation v of K/k, where K is the quotient field of A, such that A 1 Rv . (ii) The unique valuation v of condition (i) is residually rational over k. We call v the place (or valuation) of A at infinity. (11.5) EXAMPLE. An affine polynomial curve over k (Definition (9.1)) has only one place at infinity. For, if A is such a curve then A ⊂ k[Z] and the quotient field of A is k(Z), where Z is an indeterminate. If v is the Z −1 -adic valuation of k(Z)/k then it is clear that v is residually rational over k and is the unique place of A at infinity. In particular, the affine line has only one place at infinity.

82

4. Applications of The Fundamental Theorem

(11.6) LEMMA. Let v be a valuation of K/k. Let x be a non-zero element of K. If x is algebraic over k then v(x) = 0. Proof. Suppose v(x) , 0. Since x is algebraic over k if an only if x−1 is algebraic over k, we may assume that v(x) > 0. If xis algebraic over k then, since x , 0, there exist n ≥ 1 and ai ∈ k, 0 ≤ i ≤ n − 1, such that a0 , 0 and xn + an−1 xn−1 + · · · + a1 x = a0 . 83

Since v(x) > 0, we have v(xn + an−1 + . . . + a1 x) > 0. But v(a0 ) = 0. This contradiction proves that v(x) = 0. (11.7) LEMMA. Let v be a valuation of K/k such that v is residually rational over k. Then k is algebraically closed in K. Proof. Let x ∈ K be algebraic over k. We want to show that x ∈ k. We may assume that x , 0. Then x−1 is also algebraic over k. Since x ∈ Rv or x−1 ∈ Rv , we may assume, without loss of generality, that x ∈ Rv . Then since v is residually rational over k, there exists a ∈ k such that v(x − a) > 0. Now, since x − a is algebraic over k, it follows from Lemma (11.6) that x − a = 0, which shows that x ∈ k. (11.8) LEMMA. Let A be an affine curve over k with only one place v at infinity. Let K be the quotient field of A. Let x ∈ A, x < k. Then: (i) x is transcendental over k and v is the unique valuation of K/k extending the x−1 -adic valuation of k(x)/k. (ii) v(x) = −[K : k(x)]. In particular, v(x) < 0. (iii) A is integral over k[x]. Proof. (i) Since v is residually rational over k and since x < k, x is transcendental over k by Lemma (11.7). Let v′ be any valuation of K/k extending the x−1 -adic valuation of k(x)/k. Then x−1 is a non-unit in the ring Rv′ of v′ . This means that x < Rv′ . Therefore A 1 Rv′ , and the hypothesis on A implies that v = v′ .

11. Affine Curves with One Place at Infinity 84

83

(ii) Since v is the only valuation of K/k extending the x−1 -adic valuation of k(x)/k and since the residue field of v is k, [K : k(x)] equals the ramification index of v over the x−1 -adic valuation of k(x)/k, i.e., [K : k(x)] = v(x−1 ) = −v(x). (iii) Let y ∈ A. To show that y is integral over k[x], it is enough to show that y is integral over each valuation ring of k(x)/k containing k[x]. Let then Rw be such a valuation ring with valuation w, and let w1 , . . . , wr be all the extensions of w to K. Then, if Rw is the r \ integral closure of Rw in K, we have Rw = Rwi . Therefore it is i=1

enough to prove that y ∈ Rwi for every i, 1 ≤ i ≤ r. Since A ⊂ Rv′ for every valuation v′ of K/k other than v, we have only to show that wi , v for every i, 1 ≤ i ≤ r. But this is clear, since x ∈ Rwi for every i, 1 ≤ i ≤ r, and x < Rv by (ii). (11.9) COROLLARY. Let A be an affine curve over kwith only one place v at infinity. Then v(A − {0}) = v(a) a ∈ A, a , 0 is a subsemigroup of the semigroup of non-positive integers. Moreover, the only units of A are the non-zero elements of k. Proof. The first assertion is immediate from Lemma (11.8) (ii). To prove the second assertion, let x < k. Then x is transcendental over k, hence a non unit in k[x]. Since A is integral over k[x], x is a non-unit in A. (11.10) REMARK. In view of Corollary (11.9), we may omit explicit mention of k in Definition (11.4). That is, we may say A to have only one place at infinity if there exists a subfield k of A such that A is an affine curve over k with only one place at infinity in the sense of Definition (11.4). The subfield k is then uniquely determined by A. viz, it is the set of all units of A together with zero. We call k the ground field of A. 85 (11.11) DEFINITION. Let R be a ring and let R[Y] be the polynomial ring in one variable Y over R. An element f of R[Y] is said to be almost

84

4. Applications of The Fundamental Theorem

monic in Y if f , 0 and the leading coefficient of f is a unit in R, i.e. f , 0 and there exists a unit a in R such that deg( f − aY n ) < n, where n = degY f . (11.12) PROPOSITION. Let k′ be a field and let k be its algebraic closure. Let ϕ = ϕ(X, Y) be an element of k′ [X, Y] ⊂ k((X −1 ))[Y] such that degY ϕ > 0. Let A = k′ [X, Y]/(ϕ), where (ϕ) = ϕk′ [x, Y]. Assume that A is an affine curve over k′ with only one place v at infinity. Then: (i) ϕ is almost monic in Y. (ii) degY ϕ = −v(X + (ϕ)). (iii) ϕ is irreducible in k((X −1 ))[Y].

86

Proof. Let x = X + (ϕ). Since degY ϕ > 0, we have x < k′ . Therefore by Lemma (11.8) x is transcendental over k′ and A is integral over k′ [x]. In particular y = Y + (ϕ) is integral over k′ [x], and (i) is proved. Now, if K is the quotient field of A then we have degY ϕ = [K : k′ (x)]. By Lemma (11.8) we have [K : k′ (x)] = −v(x). This proves (ii). In order to prove (iii), we may, in view of (i), replace ϕ by aϕ for a suitable non-zero element a of k′ to assume that ϕ is monic in Y. Then ϕ(x, Y) ∈ k′ [x][Y] is the minimal monic polynomial of y over k′ (x). Let L be an overfield of k((x−1 )) such that we have a k′ (x)-monomorphism u : K → L and L is generated over k((x−1 )) by u(y). (Here we regard k((x−1 )) as an overfield of k′ (x) via the natural inclusions k′ ֒→ k(x) ֒→ k((x−1 )).) Let ψ(x, Y) ∈ k((x−1 ))[Y] be the minimal monic polynomial of u(y) over k((x−1 )). In order to prove (iii), it is enough to show that ϕ(x, Y) = ψ(x, Y). Now, since ϕ(x, u(y)) = u(ϕ(x, y)) = 0, ψ(x, Y) divides ϕ(x, Y) in k((x−1 ))[Y]. Therefore it is now enough to show that degY ϕ(x, Y) ≤ degY ψ(x, Y). Let n = degY ϕ(x, Y), m = degY ψ(x, Y). Then n = v(x−1 ) by (ii), and m = [L : k((x−1 ))]. Let w be a valuation of L extending the x−1 -adic valuation of k((x−1 ))/k. We claim that there exists a (unique) valuation v′ of K such that w is an extension of v′ . For, let w′ : K → Z ∪ {∞} denote the restriction of w to K. Then, writing K ∗ for the group of units of K, w′ (K ∗ ) is a subgroup of Z. Since w(x−1 ) > 0 and x−1 ∈ K, we have w′ (K ∗ ) , 0. If r is the positive generator of w′ (K ∗ ), we put v′ = r−1 w′ .

11. Affine Curves with One Place at Infinity

85

Then v′ : K → Z ∪ {∞} is surjective and our claim is proved. Now, since v′ (x−1 ) > 0, v′ is an extension of the x−1 -adic valuation of k′ (x)/k′ . Therefore v′ = v by Lemma (11.8). Now, we get n = v(x−1 ) = v′ (x−1 ) = r−1 w(x−1 ) ≤ w(x−1 ) ≤ [L : k((x−1 ))] = m, and (iii) is proved. This completes the proof of the proposition. (11.13) NOTATION. Let k be an algebraically closed field and let ϕ = ϕ(X, Y) be an element of k[X, Y] such that ϕ is monic in Y and char k does not divide degY ϕ. In particular, this means that degY ϕ > 0. Let n = degY ϕ. Assume that ϕ is irreducible in k((X −1 ))[Y]. Put f = f (X, Y) = ϕ(X −1 , Y). Then f is a irreducible element of k((X))[Y] and f is monic in Y with degY f = n. Therefore by Newton’s Theorem (5.14) there exists y(t) ∈ k((t)) such that f (tn , y(t)) = 0. Let k′ be a subfield of k such that ϕ ∈ k′ [x, Y]. Let R = k′ [X −1 ]. Then f ∈ R[Y]. Let R[Y] = R[Y]/ f R[Y] and let A = k′ [X, Y]/ϕk′ [X, Y]. It is then clear that the k′ algebra isomorphism θ′ : k′ [X, Y] → R[Y] defined by θ′ (X) = X −1 , θ′ (Y) = Y, induces a k′ -algebra isomorphism θ′ : A → R[Y]. Recall also that if k′ [t−n , y(t)] denotes the k′ -subalgebra of k((t)) generated by 87 t−n and y(t) then by Lemma (8.4) there exists k′ -algebra isomorphism u : R[Y] → k′ [t−n , y(t)] given by u(F(X, Y)) = F(tn , y(t)), where F(X, Y) denotes the image of an element F(X, Y) of R[Y] under the canonical homomorphism R[Y] → R[Y]. Putting θ = uθ′ , we get a k′ -algebra isomorphism θ : A = k′ [X, Y]/ϕk′ [X, Y] → k′ [t−n , y(t)] given by θ(F(x, Y)) = F(t−n , y(t)) for F(X, Y) ∈ k′ [X, Y], where x (resp. y) is the canonical image of X (resp. Y) in A. In the sequel we shall (11.13.1)

Identify A with k′ [t−n , y(t)] via θ.

Note that under this identification we have x = t−n and y = y(t). Let K = k′ (tn , y(t)) be the quotient field of A. Since K is a subfield of k((t)), we have a map ordt : K → Z ∪ {∞}.

86

4. Applications of The Fundamental Theorem

Let h = h( f ) and let ri = ri (−n, f ), di+1 = di+1 ( f ) for 0 ≤ i ≤ h + 1. Let ΓR ( f ) be the value semigroup of f with respect to R. Recall that Γr ( f ) = ordt F(tn , y(t)) F(X, Y) ∈ R[Y], F(tn , y(t)) , 0 . (11.14) LEMMA. With the notation of (11.13), we have: (i) ordt (A − {0}) = ΓR ( f ). (ii) ordt is a valuation of K/k′ . (iii) A is an affine curve over k′ with only one place ordt at infinity. (iv) ordt (A − {0}) is strictly generated by r = (r0 , . . . , rh ) 88

(v) r0 < 0, r1 = ∞ or r1 ≤ 0, and ri < 0 for 2 ≤ i ≤ h. Proof. (i) In view of the identification of A with k′ [t−n , y(t)] via θ, we have ΓR ( f ) = ordt F(tn , y(t)) F(X, Y) ∈ R[Y], F(tn , y(t)) , 0 = ordt F(tn , y(t)) F(X, Y) ∈ k′ [X, Y], F(t−n , y(t)) , 0 = ordt F(x, y) F(X, Y) ∈ k′ [X, Y], F(x, y) , 0 = ordt a a ∈ A, , 0 = ordt (A − {0}).

(ii) We have only to show that ordt : K → Z ∪ {∞} is surjective or, equivalently, that ordt (K ∗ ) = Z, where K ∗ = K −{0}. Now ordt (K ∗ ) is clearly the subgroup of Z generated by the semigroup ordt (A − {0}), hence by ΓR ( f ) in view of (i). Since X −1 ∈ R, the assertion now follows from Corollary (8.8). (iii) Since ϕ is monic in Y, A is integral over k′ [x]. We have ordt (x) = ordt (t−n ) = −n. Therefore ordt (x−1 ) = n = degY ϕ = [K : k′ (x)].

11. Affine Curves with One Place at Infinity

87

This shows that ordt is the only valuation of K/k′ extending the x−1 -adic valuation of k′ (x)/k and that ordt is residually rational over k′ . Now, let w be any valuation of K/k′ such that A 1 Rw . Then, since A is integral over k′ [x], we have k′ [x] 1 Rw . This means that w(x) < 0, so that w(x−1 ) > 0. Therefore w extends the x−1 -adic valuation of k′ (x), and we get w = ordt . (iv) This is immediate from Theorem (8.7) (iii′ ), since we have ordt (A− {0}) = ΓR ( f ) by (i) and we are in the pure meromorphic case. (v) We have r0 = −n < 0. Next, r1 = ordt (y). If y ∈ k′ then ordt (y) = 0 89 or ∞. If y < k′ then, since y ∈ A, we get ordt (y) < 0 by (iii) and lemma (11.8) (ii). Now, let gi = gi (X, Y) = AppdYi ( f ), 2 ≤ i ≤ h. Then gi ∈ k′ [X −1 ][Y] for every i by Theorem (8.3)(i). Put ψi = ψi (X, Y) = gi (X −1 , Y), 2 ≤ i ≤ h. Then ψi ∈ k′ [X, Y] for every i. Now, for 2 ≤ i ≤ h, we have ri = ordt gi (tn , y(t))

(by Theorem (8.2))

−n

= ordt ψi (t , y(t)) = ordt ψi (x, y)

(by (11.13.1)).

Therefore by (iii) and Lemma (11.8) (ii) it is enough to prove that ψi (x, y) < k′ for every i, 2 ≤ i ≤ h. Now, we have degY ψi = n/di . This shows that 1 ≤ degY ψi < n = degY ϕ for every i, 2 ≤ i ≤ h. Therefore, for every a ∈ k′ , ϕ does not divide ψi − a in k′ [X, Y]. This means that ψi (x, y) < k′ . (11.15) THEOREM. Let k be an algebraically closed field and let ϕ be an element of k[X, Y] such that degY ϕ > 0. Consider the following four conditions. (i) For every subfield k′ of k such that ϕ ∈ k′ [X, Y], k′ [X, Y]/ϕk′ [X, Y] is an affine curve k′ with only one place at infinity. (ii) k[X, Y]/ϕk[x, Y] is an affine curve over k with only one place at infinity.

88

4. Applications of The Fundamental Theorem

(iii) There exists a subfield k′ of k such that ϕ ∈ k′ [X, Y] and k′ [X, Y]/ϕ k′ [X, Y] is an affine curve over k′ with only one place at infinity. (iv) ϕ is almost monic in Y and ϕ is irreducible in k((X −1 ))[Y]. We have (i) ⇒ (ii) ⇒ (iii) ⇒ (iv). Moreover, if char k does not divide degY ϕ then (iv) ⇒ (i).

90

Proof. (i) ⇒ (ii) ⇒ (iii). Trivial. (iii) ⇒ (iv). Immediate from Proposition (11.12). (iv) → (i). Assume that char k does not divide degY ϕ. Let k′ be a subfield of k such that ϕ ∈ k′ [X, Y]. Then, replacing ϕ by a ϕ for a suitable a ∈ k′ , we may assume that ϕ is monic in Y. Now, (i) follows from Lemma (11.14) (iii). (11.16) COROLLARY. Let k′ be a field and let k be its algebraic closure. Let ϕ = ϕ(X, Y) be a non-zero element of k′ [X, Y] such that char k does not divide degY ϕ and k′ [X, Y]/ϕk′ [X, Y] is an affine curve over k′ with only one place at infinity. Then for every λ ∈ k, k′ (λ)[X, Y] is an affine curve over k′ (λ) with only one place at infinity Proof. Since char k does not divide degY ϕ, we have degY ϕ > 0. Therefore by Theorem (11.15) ϕ is almost monic in Y, i.e. there exists a ∈ k′ , a , 0, such that a ϕ is monic in Y. Since k = {aλ λ ∈ k}, we may replace ϕ by a ϕ and assume that ϕ is monic in Y. By Theorem (11.15) ϕ is irreducible in k((X −1 ))[Y]. Since degY (ϕ + λ) = degY ϕ is not divisible by char k for every λ ∈ k, it is enough, by Theorem (11.15), to prove that ϕ + λ is irreducible in k((x−1 ))[Y] for every λ ∈ k. Let n = degY ϕ. Put f = f (X, Y) = ϕ(X −1 , Y). Then f is an irreducible element of k((X))[Y] and f is monic in Y with degY f = n. Clearly, it is enough to prove that f + λ is irreducible in k((X))[Y] for every λ ∈ k. By Newton’s Theorem (5.14) there exists an element y(t) of k((t)) such that f (tn , y(t)) = 0. Let h = h( f ), sh = sh (−n, f ) and ri = ri (−n, f ) for 0 ≤ i ≤ h. Then by Lemma (11.14)(v) we have rh ≤ 0. First, suppose that rh = 0. Then by Lemma (11.14)(v) we have h = 1. Therefore we get 1 = dh+1 ( f ) = d2 ( f ) = g.c.d. (r0 , r1 ) = g.c.d. (−n, 0) = n. Thus in this case we have degY ( f + λ) = 1, which clearly implies that

11. Affine Curves with One Place at Infinity

89

f + λ is irreducible in k((X))[Y]. Now, suppose that rh < 0. Then sh < 0. Let fλ = f + λ. Then fλ (tn , y(t)) = λ ∈ k. Therefore ordt fλ (tn , y(t)) ≥ 0 > sh . Now, it follows from the Irreducibility Cri- 91 terion (Theorem (12.4)) proved in the next section that fλ is irreducible in k((X))[Y]. (11.17) REMARK. Let us justify the use of a result from § 12 in proving Corollary (11.16) by declaring that the result of Corollary (11.16) will not be used anywhere in the sequel. (11.18) QUESTION. Is Corollary (11.16) true without the assumption that char k does not divide degY ϕ? (11.19) PROPOSITION. Let k be a field and let n be a positive integer such that char k does not divide n. Let ϕ = ϕ(X, Y) = a0 (X)Y n + a1 (X)Y n−1 + · · · + an (X) with ai (X) ∈ k[X] for 0 ≤ i ≤ n, a0 (X) , 0. Let m = degX ϕ. Assume that k[X, Y]/ϕk[X, Y] is an affine curve over k with only one place at infinity. Then a0 (X) ∈ k and we have n degX ai (X) ≤ im for every i, 0 ≤ i ≤ n. Moreover, if m ≥ 1 then we have degX an (X) = m and n degX ai (X) ≤ i deg X an (X) for every i, 0 ≤ i ≤ n. Proof. By Proposition (11.12) ϕ is almost monic in Y. This means that a0 (X) ∈ k. Therefore, replacing ϕ by a0 (X)−1 ϕ, we may assume that a0 (X) = 1. Now, if m = 0 then the assertion is clear. Assume therefore that m ≥ 1. Then by Proposition (11.12) ϕ is almost monic in X. This shows that degX an (X) = m. Now, by Proposition (11.12) ϕ is irreducible in k((X −1 ))[Y], where k is the algebraic closure of k. Therefore by Newton’s Theorem (5.14) there exists y(t) ∈ k((t)) such that Y (Y − y(wt)). ϕ(t−n , Y) = w∈µn (k)

Let q = ordt y(wt) for all w ∈ µn (k). Then, since ai (t−n ) equals (−1)i 92

90

4. Applications of The Fundamental Theorem

times the ith elementary symmetric function of {y(wt)|w ∈ µn (k)}, we have ordt ai (t−n ) ≥ iq for 1 ≤ i ≤ n. Moreover, since Y an (t−n ) = (−1)n y(wt), w∈µn (k)

we have ordt an (t−n ) = nq, which gives q = ordX an (X −1 ) = − degX an (X). Therefore for every i, 1 ≤ i ≤ n, we get n deg X ai (X) = −n ord X ai (X −1 ) = − ordt ai (t−n ) ≤ −iq = i deg X an (X) = im. (11.20) COROLLARY. Let k be a field of characteristic zero and let f , g be elements of k[X, Y] such that k[ f, g] = k[X, Y]. Let m = degX f , n = degY f and let f = a0 (X)Y n + a1 (X)Y n−1 + · · · + an (X) with ai (x) ∈ k[X] for 0 ≤ i ≤ n. Then we have n degX ai (X) ≤ im for 0 ≤ i ≤ n. Moreover, if m ≥ 1 (resp. n ≥ 1) then f is almost monic in X (resp. Y).

93

Proof. The inequality n deg X ai (X) ≤ im is obvious for n = 0. We may therefore assume that n > 0. Then, since k[X, Y]/ f k[X, Y] is isomorphic to k[g], which is an affine curve over k with only one place at infinity (Example (11.5)), the corollary follows from Propositions (11.19) and (11.12). (11.21) DEFINITION. Let k be a field and let f be a non-zero element P of k[X, Y]. Write f = ai j X i Y j with ai j ∈ k. The degree form of f , denoted f + , is defined by X f+ = ai j X i Y j i+ j=n

11. Affine Curves with One Place at Infinity

91

where n = deg f . (Note that deg f and f + depend only on the k-vector subspace kX ⊕ kY of k[X, Y] and do not depend upon a k-basis X, Y of kX ⊕ kY.) (11.22) DEFINITION. Let f ∈ k[X, Y], f < k. We say f has only one ¯ Y], where point at infinity if f + is a power of a linear polynomial in k[x, k¯ is the algebraic closure of k. (Note that this definition depends only on the k-vector subspace kX ⊕ kY of k[X, Y] and is independent of the choice of a k-basis X, Y of kX ⊕ kY.) (11.23) PROPOSITION. Let k be a field of characteristic zero and let f be an element of k[X, Y] such that k[X, Y]/ f k[X, Y] is an affine curve over k with only one place at infinity. Then f has only one point at infinity. Proof. We may assume that k is algebraically closed. For, by interchanging X and Y, if necessary, we may assume that degY f > 0 and then apply Theorem (11.15). Now, suppose f + is not a power of a linear polynomial in k[X, Y]. Then, replacing X, Y by a suitable k-basis of kX ⊕ kY, we may assume that f + is of the form f + = Xr

q Y

(X + ai Y),

i=1

where r, q are positive integers and ai ∈ k, ai , 0, for 1 ≤ i ≤ q. Let 94 m = degX f and n = degY f . Then m = r + q and m > n ≥ q ≥ 1. By Proposition (11.12) f is almost monic in Y. Therefore n > q and we can write f in the form f = f1 + f2 + f3 , where f1 = f + , f2 = bY n for some b ∈ k, b , 0, and X ci j X i Y i f3 = i+ j mh (ν, f ). 100 Proof. We shall use the notation of (12.1). (i) Let L be a finite algebraic normal extension of k((t)) such that L contains the splitting field of ϕ(tn , Y) over k((tn )). Then there exist z1 , . . . , zn ∈ L such that we have n Y ϕ(t , Y) = (Y − zi ). n

(12.4.1)

i=1

Let v be a valuation of L extending the valuation ordt of k((t)). (See (12.2).) Let e = v(t). Then we have v(a) = e ordt a for every a ∈ k((t)). Now, we have ordt ϕ(tn , y(wt)) = ordt ϕ(tn , y(t)) > sh for every w ∈ µn . Therefore v(ϕ(tn , y(wt))) > esh for every w ∈ µn and it follows that Y ϕ(tn , y(wt)) nesh < v w∈µn n Y Y = v (zi − y(wt))) (by (12.4.1)) i=1 w∈µn n Y n f (t , zi ) = v i=1

=

n X i=1

v( f (tn , zi )).

5. Irreducibility, Newton’s Polygon

98

Therefore there exists i0 , 1 ≤ i0 ≤ n, such that, writing z = zi0 , we have v( f (tn , z)) > esh . It therefore follows from Lemma (12.3) that there exists w′ ∈ µn such that we have (12.4.2) 101

v(z − y(w′ t)) > emh .

Put y′ = y(w′ t). For w ∈ µn , let σw be the k((tn ))-automorphism of k((t)) defined by σw (t) = wt. Let τw be an extension of σw to an automorphism of L. Since k((t)) it complete with respect to the valuation ordt , v is the only valuation of L extending ordt . Therefore, since ordt = ordt ◦σw , we have v = v ◦ τw for every w ∈ µn . In particular, from (12.4.2) we get (12.4.3)

v(τw (z) − τw (y′ )) = v(z − y′ ) > emh

for every w ∈ µn . Moreover, if w1 , w2 ∈ µn , w1 , w2 , then by Proposition (6.15) we have (12.4.4) v(τω1 (y′ )−τω2 (y′ )) = e ordt (y(w1 w′ t)−y(w2 w′ t)) ≤ emh . Therefore, since τw1 (z) − τw2 (z) = (τw1 (z) − τw1 (y′ )) + (τw1 (y′ ) − τw2 (y′ )) + (τw2 (y′ ) − τw2 (z)), it follows from (12.4.3) and (12.4.4) that v(τw1 (z) − τw2 ) ≤ emh if w1 , w2. In particular, τw1 (z) , τw2 (z) if w1 , w2 . Therefore the set S = τw (z) w ∈ µn consists of n distinct elements. Since all the

n elements of S are conjugates of z over k((tn )), the minimal polynomial of z over k((tn )) has degree at least n. On the other hand, ϕ(tn , Y) ∈ k((tn ))[Y], degY ϕ(tn , Y) = n and ϕ(tn , z) = 0. Therefore ϕ(tn , Y) is irreducible in k((tn ))[Y]. This means that ϕ(X, Y) is irreducible in k((X))[Y]. This proves (i).

(ii) Since ϕ is irreducible by (i), all the roots of ϕ(tn , Y) belong to k((t)) by Newton’s Theorem (5.14). Therefore τw (z) ∈ k((t)) for every w ∈ µn . Now, taking z(t) = τw (z) with w = w′−1 , (ii) follows from (12.4.3).

13. Irreducibility of the Approximate Roots

99

13 Irreducibility of the Approximate Roots (13.1) Let k be an algebraically closed field and let f = f (X, Y) be an irre- 102 ducible element of k((X))[Y]. Assume that f is monic in Y and that char k does not divide n = degY f . Let ν be an integer such that |ν| = n. With this notation, we have the following theorem: (13.2) THEOREM. Let y(t) be an element of k((t)) such that f (tn , y(t)) = 0. Let e be an integer such that 1 ≤ e ≤ h( f ) + 1 and let ge = ge (X, Y) = AppdYe ( f ). where de = de ( f ). Then: (i) ge is irreducible in k((X))[Y]. (ii) If e ≥ 2 then there exists an element z(t) of k((t)) such that ge (tn/de , z(t)) = 0 and ordt (z(tde ) − y(t)) = me (ν, f ). Proof. (i) If e = 1 then degY ge = n/d1 = 1, so that the assertion is clear in this case. If e = h( f ) + 1 then ge = f , so that the assertion is clear also in this case. We assume now that 2 ≤ e ≤ h( f ). Write y(t) = P P j y j t j , where me = y j t with y j ∈ k for every j, and let y(t) = j sh′ (ν′ , Ge ),

where h′ = h(Ge ). For, given (13.2.1), we can apply Theorem 103

5. Irreducibility, Newton’s Polygon

100

(12.4) with f (resp. ϕ) replaced by Ge (resp. ge ) and conclude that ge is irreducible. Now, (13.2.1) is clearly equivalent to ordt ge (tn , y(t)) > sh′ (ν′ , Ge )de .

(13.2.2)

By Proposition (6.16) we have h′ = e − 1 and sh′ (ν′ , Ge )de = se−1 (ν, f )/de < se (ν, f )/de = re (ν, f ). Therefore, in order to prove (13.2.2), it is enough to prove that ordt ge (tn , y(t)) > re (ν, f ).

(13.2.3)

Now, (13.2.3) follows from Corollary (7.20) by taking a = 0 and u = 0. This completes the proof of (i). (ii) If e = h( f ) + 1 then de = 1, ge = f and me = ∞. Therefore in this case the assertion is clear by taking z(t) = y(t). Now, suppose 2 ≤ e ≤ h( f ). Then, in view of (13.2.1), it follows from Theorem ′ (12.4) that there exists z′ (t′ ) ∈ k((t′ )) such that ge (t′n , z′ (t′ )) = 0 and ordt′ (z′ (t′ ) − y(t′ )) > mh′ (ν′ , Ge ).

(13.2.4)

Therefore by Proposition (6.17) we get (13.2.5) h(ge ) = h′ , m(ν′ , ge ) = m(ν′ , Ge ), S (ν′ , ge ) = s(ν′ , Ge ). In particular, from (13.2.4) we get ordt (y′ (t′ ) − z′ (t′ )) > mh′ (ν′ , ge ).

(13.2.6) 104

Now, by Corollary (7.10) applied to (13.2.6) by replacing f (resp. y(t), resp. u(t)) by ge (resp. z′ (t′ ), resp. y′ (t′ )), we get ′

ordt′ (ge (t′n , y′ (t′ ))) = sh′ (ν′ , ge ) − mh′ (ν′ , ge ) + ordt′ (y′ (t′ ) − z′ (t′ )). From this, by (13.2.5) we get ′

ordt′ (ge (t′n , y′ (t′ ))) = sh′ (ν′ Ge ) − mh′ (ν′ Ge ) + ordt′ (y′ (t′ ) − z′ (t′ )).

13. Irreducibility of the Approximate Roots

101

Now, since t′ = tde , there exists z(t) ∈ k((t)) such that z′ (t′ ) = z(tde ), and we get ordt (ge (tn , y(t))) = de sh′ (ν′ , Ge ) − de mh′ (ν′ , Ge ) + ordt (y(t) − z(tde )). Therefore by (13.2.3) we get re (ν, f ) < de sh′ (ν′ , Ge ) − de mh′ (ν′ , Ge ) + ordt (y(t) − z(tde )) = se−1 (ν, f )/de − me−1 (ν, f ) + ordt (y(t) − z()tde ) by Proposition (6.16). This gives ordt (y(t) − z(tde )) > me−1 (ν, f ) + fe (ν, f ) − se−1 (ν, f )/de = me−1 (ν, f ) + (se (ν, f ) − se−1 (ν, f ))/de = me−1 (ν, f ) + qe (ν, f ) = me (ν, f ). Therefore, since ordt (y(t) − y(t)) = me (ν, f ), we get ordt (z(tde ) − y(t)) = ordt ((z(tde ) − y(t)) + (y(t) − y(t))) = me (ν, f ). ′

Also, from ge (t′n , z′ (t′ )) = 0 we get ge (tn/de , z(t)) = 0. This completes 105 the proof of (ii). (13.3) COROLLARY. Let f and ν be as in (13.1). Let e be an integer, 2 ≤ e ≤ h( f ) + 1. Let ge = AppdYe ( f ), where de = de ( f ). Let ν′ = ν/de . Then h(ge ) = e − 1 and for 0 ≤ i ≤ e − 1 we have mi (ν′ , ge ) = mi (ν, f )/de , qi (ν′ , ge ) = qi (ν, f )/de , si (ν′ , ge ) = si (ν, f )/de2 ′

ri (ν , ge ) = ri (ν, f )/de , di+1 (ge ) = di+1 ( f )/de .

(if i , 0).

102

5. Irreducibility, Newton’s Polygon

Proof. This is immediate from Theorem (13.2) (ii).

(13.4) COROLLARY. Let char k = 0. Let ϕ = ϕ(X, Y) be an element of k[X, Y] such that n = degY ϕ > 0, ϕ is monic in Y and k[X, Y]/(ϕ) is isomorphic (as a k-algebra) to k[Z], where Z is an indeterminate. Let f = f (X, Y) = ϕ(X −1 , Y). Then f is irreducible in k((X))[Y]. Let h = h( f ) and for 1 ≤ e ≤ h + 1 let ψe = AppdYe (ϕ), where de = de ( f ). Then k[X, Y]/(ψe ) is isomorphic (as a k-algebra) to k[Z] for every e, 1 ≤ e ≤ h + 1. 106

Proof. The irreducibility of f follows from Theorem (9.24). Now, since d1 ( f ) = n, ψ1 is monic in Y of Y-degree one. Therefore the assertion is clear for e = 1. For 2 ≤ e ≤ h + 1 we prove the assertion by decreasing induction on e. If e = h + 1 then de = 1, so that ψe = ϕ and the assertion follows from the hypothesis. Now, let 2 ≤ e ≤ h( f ) and suppose k[X, Y]/(ψe+1 ) is isomorphic to k[Z]. Let ge+1 = AppYde+1 ( f ). Then by Proposition (4.7) we have ge+1 (X, Y) = ψe+1 (X −1 , Y). Let h′ = h(ge+1 ). Then by Corollary (13.3) we have h′ = e and dh′ (ge+1 ) = de /de+1 . If d′ follows that ψe = AppYh (ψe+1 ), where dh′ = dh′ (ge+1 ). Now it follows from Corollary (9.28) that k[X, Y]/(ψe ) is isomorphic to k[Z]. (13.5) COROLLARY. With the notation and assumptions of Corollary (13.4) , let h = h( f ) and let mi = mi (−n, f ), qi = qi (−n, f ), si (−n, f ), ri = ri (−n, f ) and di+1 = di+1 ( f ) for 0 ≤ i ≤ h. Then we have: (i) ri = −di+1 for 2 ≤ i ≤ h. (ii) si = −di di+1 for 2 ≤ i ≤ h. (iii) qi = di−1 − di+1 for 3 ≤ i ≤ h. (iv) mi = d1 − di − di+1 for 2 ≤ i ≤ h. (v) If h ≥ 2 then mi < n − 2 for every i, 1 ≤ i ≤ h. Proof. (i) Fix an e, 2 ≤ e ≤ h, and let ψ = AppdYe+1 (ϕ). Then by Corollary (13.4) k[X, Y]/(ψ) is isomorphic to k[Z]. Let g = g(X, Y) =

13. Irreducibility of the Approximate Roots

103

ψ(X −1 , Y). Then g = AppdYe+1 ( f ). Let h′ = h(g). Then by Corollary (13.3) we have h′ = e and dh′ (g) = de /de+1 . Noting that degY ψ = n/de+1 and h′ = e ≥ 2, it follows from Corollary (9.25) that we have rh′ (−n/de+1 , g) = −1. By Corollary (13.3) we have rh′ (−n/de+1 , g) = re (−n, f )/de+1 = re /de+1 . Thus we have −1 = re /de+1 , and (i) is proved. (ii) This is immediate from (i), since si = di ri . (iii) By (ii) we have −di di+1 = si = si−1 + qi di = −di−1 di + qi di , since i ≥ 3. This gives qi = di−1 − di+1 . (iv) For i ≥ 3 we have mi = mi−1 + qi = mi−1 + di−1 − di+1 by (iii). Therefore, by induction on i, it is enough to prove that m2 = d1 − d2 − d3 . Now, by (ii) we have −d2 d3 = s2 = q1 d1 + q2 d2 . Therefore we get m2 = q1 + q2 = −q1 ((d1 /d2 ) − 1) − d3 . Now, by Corollary (9.27) we have d2 = d1 or d2 = −q1 . We consider the two cases separately. Case(1). d2 = d1 . Then m2 = −d3 = d1 − d2 − d3 . Case (2). d2 = −q1 . Then m2 = d2 ((d1 /d2 ) − 1) − d3 = d1 − d2 − d3 . (v) Suppose h ≥ 2. It is enough to prove that mh < n − 2. By (iv) we have mh = d1 − dh − dh+1 < d1 − 2 = n − 2, since dh+1 = 1 and dh ≥ 2.

107

104

5. Irreducibility, Newton’s Polygon

(13.6) REMARK. Corollaries (13.4) and (13.5) hold also for char k > 0 (and, in fact, the same proof goes through) provided we assume that n is not divisible by char k. 108

(13.7) PROPOSITION. Let f and ν be as in (13.1). Let e be an integer, 1 ≤ e ≤ h( f ). Let y(t) be an element of k((t)) such that f (tn , y(t)) = 0. Let k′ be an overfield of k and let y∗ (t) be an element of k′ ((t)) such that ordt (y∗ (t) − y(t)) ≥ me (ν, f ) and me (ν, f ) ∈ Suppt y∗ (t). Let ge = ge (X, Y) be defined as follows: If e ≥ 2 then ge = AppdYe ( f ), whereas if e = 1 then g1 = AppdY1 ( f ) or g1 = Y, where de = de ( f ). Let g′e denote the Y-derivative of ge . Then we have ordt g′e (tn , y∗ (t)) = re (ν, f ) − me (ν, f ). Proof. With either definition of g1 we have g′1 = 1. Therefore, since r1 (ν, f ) = m1 (ν, f ), the assertion is clear in case e = 1. Assume now that e ≥ 2. By Theorem (13.2) ge is irreducible in k((X))[Y]. Put d = de , g = ge , h′ = h(g), ν′ = ν/d, s′h′ = sh′ (ν′ , g), m′h′ = mh′ (ν′ , g). Then by Corollary (7.9) applied to g we have (13.7.1)

ordt g′ (tn/d , z(t)) = s′h′ − m′h′ ,

where g′ = g′e and z(t) ∈ k((t)) is any zero of g(tn/d , Y). Put mi = mi (ν, f ), qi = qi (ν, f ), si = si (ν, f ) and ri = ri (ν, f ) for 0 ≤ i ≤ h( f ). then by Corollary (13.3) we have h′ = e − 1, s′h′ = se−1 /d2 , m′h′ = me−1 /d. Therefore d(s′h′ − m′h′ ) = se−1 /d − me−1 = se /d − qe − me−1 = re − me . Therefore it follows from (13.7.1) that we have (13.7.2) 109

ordt g′ (tn , z(td )) = re − me

for any zero z(t) of g(tn/d , Y). By Theorem (13.2) we may choose z(t)

13. Irreducibility of the Approximate Roots

105

such that ordt (y(t) − z(td )) = me . Then, since ordt (y∗ (t) − y(t)) ≥ me and me ∈ Suppt y∗ (t) by assumption and since me < Suppt z(td ), we get ordt (y∗ (t) − z(td )) = me .

(13.7.3) Now, we have

Y

g(tn/d , Y) =

(Y − z(wt)),

w∈µn/d

where µn/d = µn/d (k). Therefore Y

g(tn , Y) =

(Y − z(wtd )).

w∈µn/d

differentiating with respect to Y and then substituting y = y∗ (t), we get g′ (tn , y∗ (t)) =

X Y (y∗ (t) − z(wtd )

v∈µn/d w,v

P1 +

X

Pv ,

v∈µn/d v,1

where Pv =

Y (y∗ (t) − z(wtd )). Thus, in order to complete the proof of w,v

the proposition, it is now enough to prove the following two statements: (i) ordt P1 = re − me . (ii) ordt Pv > re − me for every v ∈ µn/d − {1}. Since we have y∗ (t) − z(wtd ) = (y∗ (t) − z(td )) + (z(td ) − z(wtd )) and since for w , 1

110

ordt (z(td ) − z(wtd )) ≤ dm′h′ me−1

(Proposition (6.15)) (Corollary (13.3))

5. Irreducibility, Newton’s Polygon

106 =< me ,

it follows from (13.7.3) that we have (13.7.4)

ordt (y∗ (t) − z(wtd )) = ord(z(td ) − z(wtd )) < me

for w , 1. Therefore ordt P1 = ordt

Y (z(td ) − z(wtd )) w,1

= ordt g′ (tn , z(td )) = re − me by (13.7.2). This proves (i). Now, let v ∈ µn/d , v , 1. We have Pv = P1 (y∗ (t) − z(td ))(y∗ (t) − z(vtd ))−1 . Therefore by (i) we have ordt Pv = re − me + ordt (y∗ (t) − z(td )) − ordt (y∗ (t) − z(vtd )). Therefore (ii) will be proved if we show that ordt (y∗ (t) − z(td )) > ordt (y∗ (t) − z(vtd )). Since v , 1, this last inequality is clear from (13.7.3) and (13.7.4).

14 Newton’s Algebraic Polygon (14.1) 111

We revert to the notation of (7.1), (7.2) and (7.3). In addition, we fix the following notation: for an integer m, we put p(m) = inf i 1 ≤ i ≤ h + 1, m < mi .

14. Newton’s Algebraic Polygon Let d∗ (m) = d p(m) and let s p−1 + (m − m p−1 )d p , ∗ s (m) = md1 ,

107

if p = p(m) ≥ 2. if p(m) = 1.

Note that p(mi ) = i + 1, d∗ (mi ) = di+1 and s∗ (mi ) = si for 1 ≤ i ≤ h. If Z is an indeterminate, define if m < {m1 , . . . , mh }, Z − ym , P(m, Z) = Z ne − ynme , if m ∈ {m1 , . . . , mh }, e

where e = p(m) − 1. with the above notation, we have

(14.2) THEOREM. Let m be an integer. Let Z be an indeterminate and let k′ = k(Z). Let y∗ be an element of k′ ((t)) such that info (y∗ − y(t)) = (Z − ym )tm . Then info ( f (tn , y∗ )) = P(m, Z)d

∗ (m)

∗

t s (m) .

Proof. X Suppose m ∈ {m1 , . . . , mh }. say m = me . Then p(m) = e + 1. Let y(t) = y j t j . Then it easily follows from the assumption on y∗ that we 112 j m. Therefore, since (14.2.1)

y∗ − y(wt) = (y∗ − y(t)) + (y(t) − y(wt)),

5. Irreducibility, Newton’s Polygon

108

we get info (y∗ − y(wt)) = info (y∗ − y(t)) = (Z − ym )tm for w ∈ R(p). This shows that we have Y Y (Z − ym )tm (y∗ − y(wt)) = info w∈R(p)

w∈R(p)

= (Z − ym )d

(14.2.2)

∗ (m)

tmd

∗ (m)

,

since by Lemma (7.5) card (R(p)) = d p = d∗ (m). Now, suppose w ∈ Q(p) and p ≥ 2. Then by Proposition (6.15) we get ordt (y(t)) ≤ m p−1 . Since m < {m1 , . . . , mh }, we have m p−1 < m. Therefore from (14.2.1) we get (14.2.3)

113

info (y∗ − y(wt)) = info (y(t) − y(wt)) for w ∈ Q(p).

Since Q(1) = φ, (14.2.3) holds also for p = 1. Now, clearly, inco (y(t) − y(wt)) = for every w ∈ Q(p). Therefore we get Y Y ∗ (y(t) − y(wt)) (y − y(wt)) = info info w∈Q(p)

w∈Q(p)

(14.2.4)

s

= t ,

where by Lemma (7.7) we have s p−1 − m p−1 d p , if p ≥ 2, s= 0, if p = 1.

From (14.2.2) and (14.2.4) we get Y info ( f (tn , y∗ )) = info (y∗ − y(wt)) w∈µn (k)

= (Z − ym )d = P(m, Z)d

∗

∗

(m) md∗ (m)+s

t

(m) s∗ (m)

t

.

(14.3) REMARK. The above theorem is an algebraic version of the method of Newton’s polygon for constructing a root in k((t)) of the equaP tion f (tn , Y) = 0. The successive coefficients y j of a root y(t) = y j t j

14. Newton’s Algebraic Polygon

109

are found by induction on j. Thus, suppose we know y j for Xj less than a ∗ certain integer m. Let Z be an indeterminate and let y = y j t j + Ztm . j 0. Let f = x1 + x1 , g = x2 . Then J( f, g) = 1. Then J( f, g) = 1. However, ( f, g) is not an automorphic pair. For, k[x1 , x2 ]/(g) = k[x1 ] , k[x1 + x1p ], which shows that k[ f, g] , k[x1 , x2 ]. This explains the assumption char k = 0 made in the Jacobian problem.

16. Notation

115

16 Notation (16.1) Let A = k[x1 , x2 ] as in § 15. We assume henceforth that char k = 0. Let 119 w = (w1 , w2 ) be a pair of integers. By the w-gradation on A we mean the gradation on A obtained by giving weight wi to xi , i = 1, 2. Recall that this means that we write A in the form A = ⊕ A(n) w , n∈Z

i1 i2 where A(n) w is the k-subspace of A generated by monomials x1 x2 with i1 w1 + i2 w2 = n. The elements of A(n) w are called w-homogeneous elements of w-degree n. Note that by this definition 0 is w-homogeneous of w-degree n forX every n. Every element f of A can be written uniquely in the form f = fw(n) , where fn(n) is w-homogeneous of w-degree n and n

fw(n) = 0 for almost all n. We call fw(n) the nth w-homogeneous component of f . Suppose f , 0. Then there exists m ∈ Z such that fw(m) , 0 and fw(n) = 0 for all n > m. We call this m the w-degree of f and denote it by dw ( f ). Thus n o dw ( f ) = sup n ∈ Z fw(n) , 0 .

If f = 0, we define dw ( f ) = −∞. If f , 0 then the w-degree form of f , denoted fw+ , is defined by fw+ = fw(m) , where m = dw ( f ). If f = 0, we define fw+ = 0. Note that f is w-homogeneous if and only if f = fw+ . Suppose now that w = (1, 1). then the w-gradation on A is called the usual gradation on A. In this case we often omit the symbol w in the notation introduced above. Thus we write d( f ), f (n) , f + , . . . etc. for dw ( f ), fw(n) , fw+ , . . .. when w = (1, 1).

(16.2) The w-gradation on A defined in (16.1) above is with respect to the au- 120 tomorphic pair x = (x1 , x2 ). If u = (u1 , u2 ) is any automorphic pair then we can also define a gradation on A by giving weight wi to ui , i = 1, 2.

6. The Jacobian Problem

116

However, in the sequel we will mostly need to consider only the (1, 1)gradation on A with respect to an arbitrary automorphic pair u. In order to distinguish this from the usual gradation, we fix the following notation: deg f denotes d(1,1) ( f ) with respect to x, degu f denotes d(1,1) ( f ) with respect to u, If u = (u1 , u2 ) is an automorphic pair and f ∈ A, we write degu1 f (resp. degu2 f ) for the u1 - degree (resp. u2 -degree) of f regarded as a polynomial in u1 (resp. u2 ) with coefficients in k[u2 ] (resp. k[u1 ]).

(16.3) One final piece of notation: We denote by k∗ the set of non-zero elements of k and, as noted in (7.2), we use the symbol to denote a generic (i.e., unspecified element of k∗ .)

17 w-Relation We preserve the notation of §15 and §16. In particular, we have char k = 0. Let w = (w1 , w2 ) be a pair of integers. (17.1) LEMMA. Let F, G be non-zero w-homogeneous elements of A. The following two conditions are equivalent: (1) F r = G s for some r, s ∈ Z+ ; r + s > 0. (2) There exist p, q ∈ Z+ and a w-homogeneous element H of A such that F = H p , G = H q . 121

Proof. (1) ⇒ (2). Write F = H1p1 . . . Hnpn , G = H1q1 . . . Hnqn , where Hi is an irreducible w-homogeneous element of A, pi , qi ∈ Z+ for 1 ≤ i ≤ n and g.c.d. (Hi , H j ) = 1 for i , j. Then (1) implies that rpi = sqi for every i, 1 ≤ i ≤ n. Now, if r = 0 or s = 0, say r = 0, then s > 0 and qi = 0 for every i, so that G = . In this case (2) follows by taking H = F, p = 1, q = 0. We may therefore assume that r > 0 and s > 0.

17. w-Relation

117

Then for any i, pi = 0 if and only if qi = 0. Therefore we may assume that pi > 0 and qi > 0 for every i, 1 ≤ i ≤ n. Then for every i we have pi /qi = s/r = p/q, say, where p, q are positive integers such that g.c.d. (p, q) = 1. For every i, 1 ≤ i ≤ n, there exists a positive integer ti such that pi = pti , qi = qti . Let H = H1t1 . . . Hntn . Then F = H p , G = H q . (2) ⇒ (1). If p = 0 = q then F = , G = , so that F = G, which implies (1) in this case. Assume therefore that p + q > 0. Now, (2) implies that Fq = G p , which implies (1). (17.2) DEFINITION. Let f , g ∈ A. We say f and g are w-related if f , 0, g , 0, and F = fw+ and g = g+W satisfy the equivalent conditions (1) and (2) of Lemma (17.1). We say f and g are related if f and g are (1,1)-related. (17.3) LEMMA. Let f, g1 , . . . , ge be elements of A. (i) If fw+ = and g1 , 0 then f and g1 are w-related. (ii) If f and gi are w-related for every i, 1 ≤ i ≤ e, then f and g1 . . . ge are w-related. Proof. (i) We have fw+ = = (g+1w )◦ . (ii) By induction on e, it is enough to consider the case e = 2. There exist ri , si ∈ Z+ , ri + si > 0, such that F ri = Gisi , where F = fw+ , Gi = g+iw , i = 1, 2. This gives (17.3.1)

F r1 s2 +r2 s1 = (G1 G2 ) s1 s2 .

If si = 0 for i = 1 or 2, say s1 = 0, then r1 > 0 and F r1 = G◦1 = 122 shows that F = . Therefore in this case f is related to g1 g2 by (i). We may therefore assume that s1 > 0, s2 > 0. Then s1 s2 > 0, and it follows from (17.3.1) that f and g1 g2 are w-related. (17.4) PROPOSITION. Let F, G be non-zero w - homogeneous elements of A of w-degrees m, n, respectively. Consider the following five conditions:

6. The Jacobian Problem

118 (1) F and G are w-related.

(2) F and G are algebraically dependent over k. (3) J(F, G) = 0. (4) F n = Gm . (5) F |n| = G|m| . Among these five conditions we have the following implications: (1) ⇒ (2) ⇒ (3) ⇒ (4) ⇒ (5). Assume, moreover, that at least one of the following two conditions is satisfied: (i) w1 w2 > 0 and F < k or G < k. (ii) m , 0 or n , 0. Then the above five conditions (1) - (5) are equivalent to each other. Further, let d = g.c.d. (m, n). Then d > 0 and the conditions (1) - (5) are also equivalent to each of the following two conditions: (6) F n/d = Gm/d . (7) We have mn ≥ 0 and there exists a w-homogeneous element H of A such that F = H |m|/d , G = H |n|/d . 123

In order to prove the proposition, we need the following three lemmas: (17.14.1) LEMMA. Let L be a field and let L(t) be the field of rational functions in one variable t over L. Let Dt be the L-derivation of L(t) defined by Dt (t) = 1. If h is an element of L(t) such that Dt (h) = 0 then h ∈ L. (17.14.2) LEMMA. Let f , g be non-zero elements of A of w-degrees m, n, respectively. If f n = gm then mn ≥ 0.

17. w-Relation

119

(17.14.3) LEMMA. Let F, G be non-zero w homogeneous elements of A of w-degrees m, n respectively. Then we have: mF = w1 x1 D1 (F) + w2 x2 D2 (F),

(i)

nG = w1 x1 D1 (G) + w2 x2 D2 (G). w1 x1 J(F, G) = mFD2 (G) − nGD2 (F),

(ii)

w2 x2 J(F, G) = nGD1 (F) − mFd1 (G). Proof of Lemma (17.14.1). The assertion is clear if h ∈ L[t]. In general, we can write h = f /g with f , g ∈ L[t] and g.c.d. ( f, g) = 1. Then we have 0 = Dt (h) = (gDt ( f ) − f Dt (g))/g2 , which shows that gDt ( f ) = f Dt (g). Thus g divides f Dt (g) in L[t]. Therefore, since g.c.d. ( f, g) = 1, g divides Dt (g). Since degt Dt (g) < degt g, we get Dt (g) = 0, so that g ∈ L. Therefore h ∈ L[t], and the assertion follows. Proof of Lemma (17.14.2). Suppose mn < 0. then one of m, n is positive and the other is negative. We may suppose that m < 0 and n > 0. Then f n g−m = implies that f (also g) is a unit of A. Therefore f ∈ k∗ . But this means that m = 0, which is a contradiction. 124 Proof of Lemma (17.14.3). (i) We have only to observe that w1 x1 D1 xi11 xi22 + ω2 x2 D2 xi11 xi22 = (i1 w1 + i2 w2 ) xi11 xi22 . (ii) We have w x D (F) D2 (F) w1 x1 J(F, G) = det 1 1 1 w1 x1 D1 (G) D2 (G)

!

w x D (F) + w2 x2 D2 (F) D2 (F) = det 1 1 1 w1 x1 D1 (G) + w2 x2 D2 (G) D2 (G) ! mF D2 (F) = det (by (i)) nG D2 (G)

!

6. The Jacobian Problem

120

= m FD2 (G) − nGD2 (F). This proves the first equality of (ii). The second is proved similarly.

125

Proof of Proposition (17.4). (1) ⇒ (2). We have F r = G s for some non-negative integers r, s, r+ s > 0. Therefore F and G are algebraically dependent over k. (2) ⇒ (3). Let X1 , X2 be indeterminates and let ϕ = ϕ(X1 , X2 ) ∈ k[X1 , X2 ] be such that ϕ , 0 and ϕ(F, G) = 0. Then ϕ < k. Therefore degX1 ϕ+degX2 ϕ > 0. We may choose ϕ to be such that degX1 ϕ+degX2 ϕ is the least possible. Let ϕi = DX,i(ϕ), i = 1, 2, where X = (X1 , X2 ). Then we have degX1 ϕi + degX2 ϕ1 < degX2 ϕ + degX2 ϕ, i = 1, 2. Moreover ϕ1 , 0 or ϕ2 , 0. Ir follows that we have ϕ1 (F, G) , 0 or ϕ2 (F, G) , 0. Now, we have 0 = D1 (ϕ(G, G)) = ϕ1 (F, G)D1 (F) + ϕ2 (F, G)D1 (G), 0 = D2 (ϕ(F, G)) = ϕ1 (F, G)D2 (F) + ϕ2 (F, G)D2 (G). Since ϕ1 (F, G) , 0 or ϕ2 (F, G) , 0, we get ! D1 (F) D1 (G) = J(F, G), 0 = det D2 (F) D2 (G) which proves (3). (3) ⇒ (4). We have J(F, G) = 0 and we want to show that F n /Gm ∈ k. Since k = k(x1 ) ∩ k(x2 ), it is enough, by symmetry, to show that F n /Gm ∈ k(x1 ). By lemma (17.14.3), we have 0 = w1 x1 J(F, G) = mFD2 (G) − nGD2 (F). This gives D2 (F n /Gm ) = F n−1 Gm−1 (nGD2 (F) − mFD2 (G))/G2m = 0. Therefore F n /Gm ∈ k(x1 ) by Lemma (17.14.1). (4) ⇒ (5). Since F n = Gm if and only if F −n = G−m , it is enough to show that we have m ≥ 0, n ≥ 0 or m ≤ 0, n ≤ 0. But this is immediate, since mn ≥ 0 by Lemma (17.14.2).

17. w-Relation

121

Assume now that one of the conditions (i) and (ii) is satisfied. It is then enough to prove that d < 0, (5) ⇒ (1) and (1) ⇒ (7) ⇒ (6) ⇒ (2). We first note that if condition (i) is satisfied then either w1 > 0, w2 > 0 or w1 < 0, w2 < 0. In either case, since F < k or G < k, we get m , 0 or n , 0. Therefore we may assume that condition (ii) is satisfied. It is then clear that d > 0. 126 (5) ⇒ (1). Trivial, since m , 0 or n , 0. (1) ⇒ (7). There exist p, q ∈ Z+ and a w-homogeneous element E of A such that F = E p , G = E q . Let e = dw (E). Then m = pe, n = qe. It follows that mn ≥ 0. Also, since condition (ii) is satisfied, we have p > 0 or q > 0, say q > 0. Let d′ =g.c.d. (p, q) and write p = p′ d′ , ′ ′ q = q′ d′ , so that g.c.d. (p′ , q′ ) = 1. Let H = E d . Then F = H p , ′ G = H q . It is now enough to show that p′ = |m|/d, q′ = |n|/d. Since q > 0 and since g.c.d. (p′ , q′ ) = 1 = g.c.d. (|m|/d, |n|/d), it is enough to prove that p′ |n| = q′ |m|. Now, since F = E p , G = E q , we have pn = dw (G p ) = dw (E pq ) = dw (F q ) = qm. This shows that p′ |n| = q′ |m|. (7) ⇒ (6). Immediate, since mn ≥ 0. (6) ⇒ (2). Immediate, since m , 0 or n , 0. (17.5) COROLLARY. Let f , g1 , . . . , ge be non-zero elements of A. Let m = dw ( f ), ni = dw (gi ), 1 ≤ i ≤ e, and let d = g.c.d. (m, n1 , . . . , ne ). Assume that m > 0 and that f and gi are w related for every i, 1 ≤ i ≤ e. Then there exists a w-homogeneous element H ∈ A of w-degree d such that fw+ = H m/d . Proof. We prove the assertion by induction on e. Since f and g1 are wrelated, there exists, by Proposition (17.4), a w-homogeneous element H1 of A such that fw+ = H1m/d1 , where d1 = g.c.d. (m, n1 ). It follows that dw (H1 ) = d1 , so that the assertion is proved for e = 1. Now, let e > 1 and let d′ = g.c.d. (m, n1 , . . . , ne−1 ), d′′ = g.c.d. (m, ne ). By induction hypothesis and by the case e = 1, there exist w-homogeneous elements ′ H2 , H3 of A with dw (H2 ) = d′ , dw (H3 ) = d′′ , such that fw+ = H2m/d = 127 ′′ H3m/d . This shows that H2 and H3 are w-related. Therefore by the

6. The Jacobian Problem

122

case e = 1, there exists a w-homogeneous element H ∈ A of w-degree ′ d such that H2 = H d /d . (Note that d = g.c.d. (d′ , d′′ ).) Thus we get fw+ = H m/d .

18 Structure of the w-Degree Form We preserve the notation §15 and §16. In particular, we have char k = 0. Let w = (w1 , w2 ) be a pair of integers. (18.1) DEFINITION. For non-zero elements f , g of A we define δw ( f, g) = dw ( f g) − dw (x1 x2 ) − dw (J( f, g)). (18.2) LEMMA. Let f , g be non-zero elements of A. Then we have: (i) δw ( f, g) ≥ 0. J( f, g)+w , (ii) J( fw+ , g+w ) = 0,

if δw ( f, g) = 0 if δw ( f, g) > 0.

Proof.

(i) Clearly, we have dw (Di ( f )) ≤ dw ( f ) − wi , dw (Di (g)) ≤ dw (g) − wi for i = 1, 2. Therefore dw (D1 ( f )D2 (g) − D2 ( f )D1 (g)) ≤ dw ( f g) − w1 − w2 = dw ( f g) − dw (x1 x2 ).

which proves (i). 128

(ii) Let f ′ = f − fw+ , g′ = g−g+w . Then dw ( f ′ ) < dw ( f ), dw (g′ ) < dw (g). An easy computation shows that J( f, g) = J( fw+ , g+w ) + h. where h ∈ A with dw (h) < dw ( f g) − dw (x1 x2 ). Now, (ii) follows, since J( fw+ , g+w ) is (either zero or) w-homogeneous of w-degree dw ( f g) − dw (x1 x2 ).

18. Structure of the w-Degree Form

123

(18.3) LEMMA. Let f be a non-zero element of A such that dw ( f ) , 0. Suppose there exists g ∈ A such that f and J( f, g) are w-related. Then there exists h ∈ A such that f and J( f, h) are w-related and δw ( f, h) = 0. Proof. If δw ( f, g) = 0 then we may take h = g. Assume therefore that δw ( f, g) > 0. It is then enough to prove the following assertion: (18.3.1) There exists h ∈ A such that f and J( f, h) are w-related and δw ( f, h) < δw ( f, g). For, then the lemma would follow by induction on δw ( f, g). To prove (18.3.1), we note first that, since f and J( f, g) are w-related, we have j( f, g) are w-related, we have J( f, g) , 0 by definition. Therefore g , 0. Moreover, by Lemma (18.2) the assumption δw ( f, g) > 0 implies that J( fw+ , g+w ) = 0. Therefore by Proposition (17.4) f and g are w-related and there exists c ∈ k∗ such that c( fw+ )|n| = (g+w )|m| , where m = dw ( f ), n = dw (g). (Note that by assumption we have m , 0.) Define h = g|m| − c f |n| , Then J( f, h) = J( f, g|m| − c f |n| ) = |m|g|m| J( f, g). It follows from Lemma (17.3) that f and J( f, h) are w-related. Now, put p = |m|. Then we have dw (J( f, h)) = dw (g p−1 ) + dw (J( f, g)) = dw (g p−1 ) + dw ( f g) − dw (x1 x2 ) − δw ( f, g) = dw ( f g p ) − dw (x1 x2 ) − δw ( f, g) > dw ( f h) − dw (x1 x2 ) − δw ( f, g), since (g+w ) p − c( fw+ )|n| = 0. Thus we get

129

δw ( f, g) > dw ( f h) − dw (x1 x2 ) − dw (J( f, h)) = δw ( f, h). This proves (18.3.1), and the lemma is proved.

6. The Jacobian Problem

124

(18.4) COROLLARY. Let f be a non-zero element of A such that dw ( f ) , 0. Suppose there exists g ∈ A such that f and J( f, g) are w-related. Then there exist w-homogeneous elements H, G of A, a positive integer p and a non-negative integer r such that fw+ = H p and J(H, G) = H r , Proof. By Lemma (18.3), replacing g by h we may assume that δw ( f, g) = 0. Then by Lemma (18.2) we have J( f, g)+w = J( fw+ , g+w ). Since f and J( f, g) are w-related, there exist non-negative integers p, q and a w-homogeneous element H of A such that fw+ = H p , J( f, g)+w = H q . Since dw ( f ) , 0, we have p > 0. Let G = g+w . Then H q = J( f, g)+w = J(H p , G) = pH p−1 J(H, G). which shows that q ≥ p − 1. Let r = q − (p − 1). Then we have J(H, G) = H r .

130

(18.5) LEMMA. Assume the w1 w2 > 0. Let H, G be non-zero whomogeneous elements of A such that J(H, G) = H r for some positive integer r. Then H r−1 divides G in A. Proof. We want to show that G/H r−1 ∈ A. Let k be the algebraic closure of k. Since A = k[x1 , x2 ] ∩ k(x1 , x2 ), it is enough to prove that G/H r−1 ∈ k[x1 , x2 ]. We may therefore assume that k = k. Since w1 w2 > 0, we have w1 > 0, w2 > 0 or w1 < 0, w2 < 0. Since an element F of A is (w1 , w2 )-homogeneous if and only if it is (−w1 , −w2 )-homogeneous, we may assume that w1 > 0, w2 > 0. Let m = dw (H), n = dw (G). Since U(H, G) , 0, we have h < k, G < k. Therefore m > 0 and n > 0. From J(H, G) = H r we get m + n − (w1 + w2 ) = mr (Lemma (18.2)). This gives (18.5.1)

n/m = r − 1 + (w1 + w2 )/m > r − 1.

Next, by Lemma (17.14.3) we have (18.5.2)

nGD1 (H) − mHD1 (G) = w2 x2 J(H, G) = x2 H r , nGD2 (H) − mHD2 (G) = −w1 x1 J(H, g) = x1 H r .

18. Structure of the w-Degree Form

125

Let u1 , u2 be indeterminates. Identify A with the subring k[uw1 1 , uw2 2 ] of k[u1 , u2 ] by putting xi = uwi i , i = 1, 2. Then A = k[u1 , u2 ] ∩ k(x1 , x2 ). Therefore it is enough to prove the following assertion: (18.5.3)

H r−1 divides G in k[u1 , u2 ].

Put u = (u1 , u2 ) and let Du,i be the k-derivation of k(u1 , u2 ) defined by Du,i (u j ) = δi j (Kronecker delta), i, j = 1, 2. Then Du,i (F) = wi uwi i −1 Di (F) for every F ∈ A. Therefore from (18.5.2) we get (18.5.4)

nGDu,1 (H) − mHDu,1 (G) = uw1 1 −1 uw2 2 H r , nGDu,2 (H) − mHDu,2 (G) = uw1 1 uw2 2 −1 H r .

Since H, G are w-homogeneous in A, they are (1, 1)-homogeneous 131 in k[u1 , u2 ] of degrees m, n respectively. Now, (18.5.3) follows from (18.5.1) and (18.5.4) in view of the following (18.5.5) SUBLEMMA Assume that k is algebraically closed. Let H, G be non-zero homogeneous elements of A of positive degrees m, n, respectively. Let r be a positive integer such that r − 1 ≤ n/m and H r divides nGDi (H) − mHDi (G) for i = 1, 2. Then H r−1 divides G. Proof. Being homogeneous, H is a product of homogeneous linear polynomials in A. Therefore it is enough to prove that if F is a homogeneous linear polynomial in A and p is a positive integer such that F p divides H then F (r−1)p divides G. So, let F = a1 x1 + a2 x2 with a1 , a2 ∈ k, and suppose F p divides H. We want to show that F (r−1)p divides G. We may assume that F p+1 does not divide H. Moreover, by interchanging x1 and x2 , if necessary, we may assume that a1 , 0. We may then assume that a1 = 1. Write H = F p H ′ , G = F q G′ with q ∈ Z+ and H ′ , G′ ∈ A such that H ′ . 0 (mod F), G′ . 0 (mod F). We want to show that q ≥ (r − 1)p. We consider two cases: CASE (1). np = mq. In this case we have q/p = n/m ≥ r − 1, by assumption. Therefore q ≥ (r − 1)p.

6. The Jacobian Problem

126

CASE (2). np , mq. Since D1 (F) = 1, we have D1 (H) = pF p−1 H ′

(mod F p )

D1 (G) = qF q−1 G′

(mod F q ).

Therefore we get nGD1 (H) − mHD1 (G) ≡ (np − mq)F p+q−1 G′ H ′ 132

(mod F p+q ).

Since np−mq , 0 and G′ H ′ . 0 (mod F) and since, by assumption, divides nGD1 (H) − mHD1 (G), we get pr ≤ p + q − 1. This gives (r − 1)p < q. Hr

(18.6) COROLLARY. Assume that w1 w2 > 0. Let H, G be non-zero whomogeneous elements of A such that J(H, G) = H r for some positive integer r. Then there exists a w-homogeneous element G′ of A such that J(H, G′ ) = H. Proof. By Lemma (18.5) we have G = G′ H r−1 for some G′ ∈ A. Since H, G are w-homogeneous, so is G′ . Now, H r = J(H, G′ H r−1 ) = H r−1 J(H, G′ ), so that J(H, G′ ) = H. (18.7) COROLLARY. Assume that w1 > 0, w2 > 0. Let f , g be elements of A such that f and J( f, g) are w-related. Then there exist whomogeneous elements H, G of A and a positive integer p such that fw+ = H p and J(H, G) = H s with s = 0 or 1. Proof. Since f and J( f, g) are w-related, we have J( f, g) , 0, which shows that f < k. Therefore, since w1 > 0, w2 > 0, we have dw ( f ) , 0. Therefore by Corollary (18.4) there exist w-homogeneous elements H, G of A and a positive integer p such that fw+ = H p and J(H, G) = H r for some non-negative integer r. If r = 0, we are through. If r > 0 then by Corollary (18.6) there exists a w homogeneous element G′ of A such that J(H, G′ ) = H. Replacing G by G′ , the assertion is proved. (18.8) LEMMA. Assume that w1 w2 > 0. Let H, G be w-homogeneous elements of A such that J(H, G) = . Then:

18. Structure of the w-Degree Form

127

(i) If |w1 | = |w2 | then H = a1 x1 + a2 x2 with a1 , a2 ∈ k, a1 , 0 or a2 , 0. (ii) If |w1 | > |w2 | then H = z, where z = x2 or z = x1 + axw2 1 /w2 with 133 a ∈ k. Moreover, if a , 0 then w1 /w2 ∈ N. (iii) If |w1 | < |w2 | then H = z, where z = x1 or z = x2 + axw1 2 /w1 with a ∈ k. Moreover, if a , 0 then w2 /w1 ∈ N. Proof. By symmetry, it is enough to prove (i) and (ii). Since w1 w2 > 0, we have either w1 > 0, w2 > 0 or w1 < 0, w2 < 0. We may assume, without loss of generality, that w1 > 0, w2 > 0. Then, since H < k, G < k, we have (18.8.1)

dw (H) ≥ min(w1 , w2 ), dw (G) ≥ min(w1 , w2 ).

Since J(H, G) = , it follows from Lemma (18.2) that dw (HG) = dw (x1 x2 ) = w1 + w2 . (i) If w1 = w2 then dw (HG) = 2w1 . Therefore from (18.8.1) we get dw (H) = w1 . This means that H is a non-zero homogeneous polynomial in x1 , x2 of degree one. (ii) Since w1 = w2 we have dw (G) ≥ w2 by (18.8.1). Therefore dw (H) ≤ w1 . This means that deg x1 H ≤ 1. If deg x1 H = 0 then, since H is w-homogeneous, we have H = xn2 for some n ∈ N. This implies that xn−1 divides J(H, G) = . Therefore n = 1 and 2 H = x2 . Now, suppose deg x1 H = 1. Then H is w-homogeneous of w-degree w1 . Therefore we have H = bx1 + cxw2 1 /w2 with b ∈ k∗ , c ∈ k and w1 /w2 ∈ N if c , 0. Let z = x1 + b−1 cxw2 1 /w2 . Then H = z. (18.9) LEMMA. Assume that w1 + w2 , 0. Let H be a non-zero whomogeneous element of A such that J(H, x1 x2 ) = H. Then H = xi11 xi22 for some non-negative integers i2 , i2 with i1 + i2 > 0.

6. The Jacobian Problem

128

Proof. Let J(H, x1 x2 ) = cH with c ∈ k∗ . Then we have D (H) D2 (H) cH = det 1 x2 x1

!

= x1 D1 (H) − x2 D2 (H). Let d = dw (H). We can write X H=

j

j

H j1 j2 x11 x22

j1 w1 + j2 w2 =d

with Hh1 j2 ∈ k. Then x1 D1 (H) − x2 D2 (H) =

X j j ( j1 − j2 )H j1 j2 x11 x22

Therefore we have j1 − j2 = c for all those pairs ( j1 , j2 ) for those pairs ( j1 , j2 ) for which H j1 j2 , 0. Since also j1 w1 + j2 w2 = d and since ! 1 −1 ,0 det w1 w2 (because w1 + w2 , 0), there exists a unique pair (i1 , i2 ) such that Hi1 i2 , 0. This means that H = xi11 xi22 . Since J(H, x1 x2 , 0), we have H < k. Therefore i1 + i2 > 0. (18.10) LEMMA. Assume that w1 = w2 , 0. Let H, G be w - homogeneous elements of A such that H , 0 and J(H, G) = H. Then G = (a1 x1 + a2 x2 )(b1 x1 + b2 x2 ) and H = (a1 x1 + a2 x2 )i1 (b1 x1 + b2 x2 )i2 , 135

where i1 , i2 are non-negative integers with i1 + i2 > 0 and a1 , a2 , b1 , b2 are elements k such that a1 x1 + a2 x2 and b1 x1 + b2 x2 are linearly independent over k.

134

18. Structure of the w-Degree Form

129

Proof. We may assume, without loss of generality, that w1 = w2 = 1. Since J(H, G) = H, by Lemma (18.2) we have d(HG) = d(H) + d(x1 x2 ), where d = dw . This gives d(G) = 2. Now, assume for the moment that k is algebraically closed. Then there exist a1 , b1 , a2 , b2 ∈ k such that u1 = a1 x1 + a2 x2 and u2 = b1 x1 + b2 x2 are linearly independent over k and G = u21 or G = u1 u2 . Now, u = (u1 , u2 ) is an automorphic pair for A. If G = u21 then we have H = J(H, G) = Ju (H, G) ! Du,1 (H) Du,2 (H) = det 2u1 0 = u1 Du,2 (H). This is not possible, since degu2 Du,2 (H) < degu2 H. Thus we have G = u1 u2 . Now, since H = J(H, G) = Ju (H, u1 u2 ) and H is (1,1)-homogeneous with respect to u, it follows from Lemma (18.9) that we have H = ui11 ui22 with i1 + i2 > 0. Thus we have proved that we can choose elements a1 , b1 , a2 , b2 ∈ k (= algebraic closure of k) which meet the requirements of our lemma. If this choice cannot be made in k then G would be irreducible in A and it would follow from the form of H that i1 = i2 and H = Gi1 . But this is not possible, since J(H, G) , 0. (18.11) LEMMA. Assume that w1 w2 > 0. Let H, G be w-homogeneous 136 elements of A such that H , 0 and J(H, G) = H. If |w1 | > |w2 | (resp. |w1 | < |w2 |) then G = zx2 and H = zi1 xi22 (resp. G = x1 z and H = xi11 zi2 ) for some non-negative integers i1 , i2 with i1 + i2 > 0, where z = x1 + axw2 1 /w2 (resp. z = x2 + axw1 2 /w1 ) for some a ∈ k. If a , 0 then w1 /w2 ∈ N (resp. w2 /w1 ∈ N). Proof. The proof is analogous to that Lemma (18.10). First, we note that, by symmetry, it is enough to consider the case |w1 | > |w2 |. Since w1 w2 > 0, we may assume, without loss of generality, that w1 > 0,

6. The Jacobian Problem

130

w2 > 0. Then w1 > w2 . Since J(H, G) = H, we have dw (HG) = dw (H) + dw (x1 x2 ) (Lemma (18.2)). Therefore dw (G) = w1 + w2 . Since w1 > w2 , the only monomials in x1 , x2 of w-degree w1 + w2 are x1 x2 1 /w2 )+1 . Therefore we have G = bx1 x2 + and (if w1 /w2 ∈ N then) x(w 2 (w1 /w2 )+1 with b, c ∈ k and w1 /w2 ∈ N if c , 0. We claim that b , 0. cx2 For, if b = 0 then we get ! D1 (H) D2 (H) = xw2 1 /w2 D1 (H), H = J(H, G) = det 0 xw2 1 /w2 which is not possible, since deg x1 D1 (H) < deg x1 H. Thus b , 0. Let z = x1 + axw2 1 /w2 , where a = b−1 c. Then G = bzx2 . Let u1 = z, u2 = x2 . Then u = (u1 , u2 ) is an automorphic pair for A and ui is w-homogeneous of w-degree wi , i = 1, 2. Therefore H is w-homogeneous with respect to u. Moreover, we have H = J(H, G) = Ju (H, bu1 u2 ) = Ju (H, u1 u2 ). Therefore it follows from Lemma (18.9) that we have H = ui11 ui22 with i1 + i2 > 0, and the lemma is proved.

137

(18.12) LEMMA. Assume that w1 > 0, w2 > 0 and that w2 divides w1 and w2 , w1 . Let a be a non-zero element of k and let u = (u1 , u2 ) be the automorphic pair defined by u1 = x1 + axw2 1 /w2 , u2 = x2 . Let f be an element of A such that fw+ = ui11 ui22 , where i1 is a positive integer and i2 is a non-negative integer. Then degu f < deg f . (See (16.2) for the definition of degu f and deg f .) Proof. Let n = dw ( f ). Since ui is w-homogeneous of w-degree wi , i = 1, 2, fw+ is also the w-degree form of f with respect to u = (u1 , u2 ) (i.e., when we regard f as a polynomial in u1 , u2 and give weight wi to ui , i = 1, 2). Since fw+ = ui11 ui22 , we can write f in the form X p p (18.12.1) f = ui11 ui22 + b p1 p2 u1 1 u2 2 p1 w1 +p2 w2 0, and u = (u1 , u2 ) is an automorphic pair for A which has one of the following three forms: (i) If w1 = w2 then ui is homogeneous linear in x1 , x2 , i = 1, 2. (ii) If w1 > w2 then u1 = x1 + axw2 1 /w2 , u2 = x2 , with a ∈ k and w1 /w2 ∈ N if a , 0. (iii) If w1 < w2 then u1 = x1 , u2 = x2 + axw1 2 /w1 with a ∈ k and w2 /w1 ∈ N if a , 0. Moreover, if u is given by (ii) (resp. (iii)) and if i1 , 0 (resp. i2 , 0) and , 0 then degu f < deg f .

132

6. The Jacobian Problem

Proof. By symmetry, it is enough to consider the cases w1 = w2 and w1 > w2 . If w1 > w2 and if i1 , 0 and a , 0 in (ii) then the last assertion of the theorem follows immediately from Lemma (18.12).

139

Now, J( f, g) = implies that f and J( f, g) are w-related. Therefore by Corollary (18.7) there exist w-homogeneous elements H, G of A and a positive integer r such that fw+ = H r and J(H, G) = H s with s = 0 or 1. Since J( f, g) = , we have f , 0. Hence H , 0. Suppose s = 0. Then J(H, G) = . Therefore if w1 = w2 then by Lemma (18.8) H is homogeneous linear in x1 , x2 . Let u1 = H and let u2 be any homogeneous linear polynomial in x1 , x2 such that u1 , u2 are linearly independent over k. Taking i1 = r, i2 = 0, we have fw+ = ui11 ui22 . Now, if w1 > w2 then by Lemma (18.8) H = z where z = x2 or z = x1 + axw2 1 /w2 with a ∈ k and w1 /w2 ∈ N if a , 0. Let u1 = x1 + axw2 1 /w2 , u2 = x2 and let (r, 0), if z = u1 , (i1 , i2 ) = (0, r), if z = u2 .

Then we have fw+ = ui11 ui22 . Now, suppose s = 1. Then J(H, G) = H. If w1 = w2 then by j j Lemma (18.10) we have H = u11 u22 , where u1 , u2 are homogeneous linear and are linearly independent over k. Taking i1 = r j1 , i2 = r j2 , we get fw+ = ui11 ui22 . If w1 > w2 then by Lemma (18.11) we have H = j j u11 u22 , where u2 = x2 , u1 = x1 + axw2 1 /w2 with a ∈ k and w1 /w2 ∈ N if a , 0, and j1 , j2 are non-negative integers such that j1 + j2 > 0. Taking i1 = r j1 , i2 = r j2 , we get fw+ = ui11 ui22 .

(18.14) DEFINITION. Let f be an element of A such that f < k and let r be a positive integer. We say f has r points at infinity with respect to the w-gradation if fw+ is a product of r mutually coprime factors in k[x1 , x2 ], where k is the algebraic closure of k, i.e., if fw+ = hn11 . . . hnr r , where n1 , . . . , nr are positive integers and h1 , . . . , hr are irreducible elements of k[x1 , x2 ] with g.c.d. (hi , h j ) = 1 for i , j. We say simply that f has r points at infinity if f has r points at infinity with respect to the usual (i.e., (1,1)-)gradation.

19. Various Equivalent Formulations of the...

133

(18.15) COROLLARY. Let f , g be elements of A such that J( f, g) = . If w1 > 0, w2 > 0 then f (also g) has at most two points at infinity with respect to the w-gradation. In particular, f (also g) has at most two points at infinity. Proof. Immediate from Theorem (18.13).

19 Various Equivalent Formulations of the Jacobian Problem We preserve the notation of §15 and §16. In particular, we have char 140 k = 0.

(19.1) Newton Polygon of f Let u = (u1 , u2 ) be an automorphic pair for A. Let f ∈ A. Writing P f = ai1 i2 ui11 ui22 with ai1 i2 ∈ k, we put S u ( f ) = (i1 , i2 ) ai1 i2 , 0 . We call S u ( f ) the support of f with respect to u. Let Nu ( f ) be the smallest convex subset of the real plane R2 containing the set S u ( f )∪{(0, 0)}. We call Nu ( f ) the Newton Polygon of f with respect to u.

0

Newton Polygon of f (Points of S u ( f ) are indicated by dots)

6. The Jacobian Problem

134

Note that Nu ( f ) is the set of points (p1 , p2 ) ∈ R2 for which there exist (i1 , i2 ), ( j1 , j2 ) in S u ( f ) and s, t ∈ R with 0 ≤ s, t ≤ 1 such that (p1 , p2 ) = (i1 st + j1 (1 − s)t, i2 st + j2 (1 − s)t).

0 141

We write S ( f ) (resp. N( f )) for S x ( f ) (resp. N x ( f )) and call it simply the support (resp. Newton Polygon)of f . (19.2) THEOREM. Let f , g be elements of A such that J( f, g) = . Assume that f has only one point at infinity and that deg f ≥ 2. Then there exists an automorphic pair u = (u1 , u2 ) for A such that degu f < deg f . Proof. Let k be the algebraic closure of k. Since f has only one point at infinity, there exists an irreducible homogeneous element F in A such that f + = F n for some positive integer n and (19.2.1)

142

F is a power of a homogeneous linear polynomial in k[x1 , x2 ]. Since char k = 0, the homogeneous polynomial F, being irreducible in k[x1 , x2 ], factors into distinct (i.e. mutually coprime) homogeneous linear polynomials in k[x1 , x2 ]. Therefore in view of (19.2.1) we necessarily have deg F = 1, so that by a suitable homogeneous linear change of variables in A, we may assume that F = x2 and f + = x + 2n with n = deg f ≥ 2. Then (0, n) ∈ S ( f ) and i1 + i2 < n for all

19. Various Equivalent Formulations of the...

135

(i1 , i2 ) ∈ S ( f ) − {(0, n)}. It follows that (0, n) ∈ N( f ) and i1 + i2 < n for all (i1 , i2 ) ∈ N( f ) − {(0, n)}. (this means that N( f ) lies below the line through (0, n) with slope −1 and meets that line only in the point (0, n). See the figure below.) Since J( f, g) = and n ≥ 2, we have f < k[x2 ]. Therefore there exists (i1 , i2 ) ∈ S ( f ) with i1 > 0. Let q = inf (n − i2 )/i1 (i1 , i2 ) ∈ S ( f ), i1 > 0 and let (p1 , p2 ) ∈ S ( f ) be such that q = (n − p2 )/p1 . (Note that (p1 , p2 ) is one of

0

the points of S ( f ) − {(0, n)} lying on the line PQ in the above figure and that −q is the slope of the line PQ.) Let w = (w1 , w2 ), where w1 = n − p2 , w2 = p1 . Since p1 + p2 < n, we have w1 > w2 . Therefore by Theorem (18.13) we have fw+ = ur11 ur22 , where r1 , r2 are non-negative integers with r1 + r2 > 0, u2 = x2 and u1 = x1 + axw2 1 /w2 with a ∈ k 143 and w1 /w2 ∈ N if a , 0. Let (i1 , i2 ) ∈ S ( f ). Then, since i2 ≤ n and since q = w1 /w2 , we get i1 w1 + i2 w2 ≤ nw2 . This, together with the fact that p1 w1 + p2 w2 = nw2 , shows that dw ( f ) = nw2 and that the two distinct points (0, n) and (p1 , p2 ) belong to S ( fw+ ). Therefore fw+ is not a monomial in x1 , x2 . This means that r1 , 0 and a , 0. Therefore by Theorem (18.13) we have degu f < deg f , and the theorem is proved. (19.3) REMARK. Let u = (u1 , u2 ) be an automorphic pair for A. Let σ be the k-algebra automorphism of A defined by σ(xi ) = ui , i = 1, 2. Let us say that u is obtained from x by σ. We say σ is homogeneous

136

6. The Jacobian Problem

linear if there exist ai , bi ∈ k such that ui = ai x1 + bi x2 , i = 1, 2. We say σ is very primitive if there exist a ∈ k and n ∈ Z, n ≥ 2, such that u1 = x1 + axn2 , u2 = x2 or u1 = x1 , u2 = x2 + axn1 . We then note from the proof of Theorem (19.2) that there exists an automorphic pair u for A such that degu f < deg f and u is obtained from x by a homogeneous linear automorphism followed by a very primitive automorphism. (19.4) THEOREM. The following four statements are equivalent: (i) If f, g ∈ A and J( f, g) = then k[ f, g] = A. (ii) If f, g ∈ A and J( f, g) = then f has only one point at infinity. (iii) If f, g ∈ A and J( f, g) = then N( f ) is a triangle with vertices (0, n), (0, 0), (m, 0) for some non-negative integers m, n. (iv) If f, g ∈ A and J( f, g) = then deg f divides deg g or deg g divides deg f .

144

Proof. (I) ⇒ (II). This follows from Corollary (11.24). (II) ⇒ (I). If deg f ≥ 2 then, since by (II) f has only one point at infinity, it follows from Theorem (19.2) that there exists an automorphic pair u = (u1 , u2 ) for A such that degu f < deg f . Moreover, Ju ( f, g) = , so that f has only one point at infinity with respect to u. Therefore, by a repeated application of (II) and Theorem (19.2), we may assume that deg f = 1. Now, by a further linear automorphism of A, we may assume that f = x1 . Then = J( f, g) = D2 (g), which shows that g = x2 + p(x1 ) with p(x1 ) ∈ k[x1 ]. Now, it is clear that k[ f, g] = A. (I) ⇒ (III). Let m = deg x1 f , n = deg x2 f . Let T be the triangle with vertices (0, n), (0, 0), (m, 0). We claim that N f = T . This is clear if m = 0 or n = 0. Assume therefore that m ≥ 1 and n ≥ 1. Then by Corollary (11.20) f is almost monic in both x1 and x2 . This means that (m, 0) ∈ S ( f ) and (0, n) ∈ S ( f ). Therefore T ⊂ N( f ). Now, let + · · · + an (x1 ) f = a0 (x1 )xn2 + a1 (x1 )xn−1 2 with ai (x1 ) ∈ k[x1 ] for 0 ≤ i ≤ n. Then by Corollary (11.20) we have n deg x1 ai (x1 ) ≤ im for every i, 0 ≤ i ≤ n. It follows that if (p, q) ∈ S ( f )

19. Various Equivalent Formulations of the...

137

then np ≤ (n−q)m, so that np+mq−mn ≥ 0. This shows that (p, q) ∈ T . Therefore S ( f ) ⊂ T and hence N( f ) ⊂ T . Thus N( f ) = T . (III) ⇒ (II). We may assume that k is algebraically closed. Let d = deg( f ). Suppose f has at least two points at infinity. Then by a linear homogeneous change of variables (i.e. by replacing x1 , x2 by a suitable k-basis of kx1 ⊕kx2 ) we may assume that f + = xr! G, where r is a positive integer and G is a homogeneous element of A such that x1 does not divide G in A and deg G > 0. Since J( f, g) = , N( f ) is a triangle with vertices (0, n), (0.0), (m, 0), where m, n non-negative integers such that 145 m + n > 0. This shows that if n ≥ m then the monomial xn2 appears in f + with a non-zero coefficient. But this is not possible, since f + = xr1G with r > 0. Thus we have n < m. Therefore, since N( f ) is the triangle (0, n), (0, 0), (m, 0), we get f + = xm 1 . This is also not possible since x1 does not divide G and deg G > 0. (I) ⇒ (IV). This follows from Theorem (10.2). (IV) ⇒ (I). Assuming (IV), we prove (I) by induction on deg( f g). Since J( f, g) = , we have f < k, g < k. Therefore deg f ≥ 1, deg g ≥ 1 and deg( f g) ≥ 2. If deg( f g) = 2 then deg f = 1 = deg g and the assertion is clear in this case. Now, let m = deg f , n = deg g, and assume that m + n ≥ 3. Without loss of generality, we may assume that m ≥ n. Then by (IV) n divides m. Since deg( f g) ≥ 3 and J( f, g) = , we have J( f + , g+ ) = 0 by Lemma (18.2). Therefore by Proposition (17.4) we have f + = c(g+ )m/n for some c ∈ k∗ . Let h = f − cgm/n . Then deg h < deg f . Moreover, clearly J(h, g) = J( f, g) = . Therefore k[h, g] = A by induction hypothesis. Since k[ f, g] = k[h, g], (I) is proved. (19.5) REMARK. In order to solve the Jacobian problem, we may assume that the field k is algebraically closed. For, each of statements (II), (III) and (IV) of Theorem (19.4) is unaltered if we replace k by its algebraic closure. (19.6) REMARK. In the next section we give yet another equivalent formulation of the Jacobian problem in terms of a Newton-Puiseux expansion.

6. The Jacobian Problem

138

20 Jacobian Problem Via Newton-Puiseux Expansion We preserve the notation of §15 and §16. In particular, we have char k = 0. We assume, in addition, that k is algebraically closed.

(20.1) Newton-Puiseux Expansion 146

Let f , g be elements of A. Assume that n = deg x2 f > 0 and that f is monic in x2 . By a construction analogous to the one used in §9, we can expand g in fractional powers of f −1 with coefficients in the algebraic closure of k(x1 ). Explicitly, let L be the algebraic closure of k(x1 ) and let τ be an indeterminate. Let θ : L[x2 ] → L((τ)) be the L-algebra monomorphism defined by θ(x2 ) = τ−1 . It is then clear that we have ordτ θ(F) = − deg x2 F for every F ∈ L[x2 ]. In particular, we have ordτ θ( f ) = −n. By Corollary (5.4) there exists t ∈ L((τ)) such that ordτ (t) = 1 and θ( f ) = t−n . We then have L((t)) = L((τ)) and ordτ F = ordτ F for every F ∈ L((t)). Let B = k[x1 ]. Then B ⊂ L and we have A = B[x2 ]. Let X i B((t)) = ai t ∈ L((t)) ai ∈ B ∀i . Then we have

(20.1.1) LEMMA θ(A) ⊂ B((t)). Proof. We have only to show that θ(x2 ) = τ−1 belongs to B((t)). Since f is monic in x2 with deg x2 f = n, we can write f = xn2 + f1 with f1 ∈ A and deg x2 f1 < n. Therefore we get t−n = θ( f ) = τ−n (1 + τp) with p ∈ B[[τ]]. It follows that t = ζτ(1 + τq), where ζ ∈ µn (k) (= nth roots of unity in k) and ! ∞ X s i−1 i q τ p ∈ B ∈ B[[τ]]. i i=1

20. Jacobian Problem Via Newton-Puiseux Expansion

139

where s = −1/n. Replacing t by ζ −1 t, we may assume that ζ = 1. Let ∞ X τ= ai ti with ai ∈ L. Then we get 147 i=1

τ=

∞ X

ai τi (1 + τq )i .

i=1

Now, we can write (1 + τq)i = 1 + τqi with qi ∈ B[[τ]]. Let qi =

∞ X bi j τ j j=0

with bi j ∈ B. Then we get (20.1.1.1)

τ=

∞ X i=1

∞ X bi j τ j+1 . ai τi 1 + j=0

Comparing the coefficients of τ, we get a1 = 1 ∈ B. Inductively, assume that ai ∈ B for 1 ≤ i ≤ d − 1 for some integer d ≥ 2. Then, comparing the coefficients of τd in (20.1.1.1) we get 0 = ad + c, where c=

d−1 X

ai bi,d−1−i .

i=1

By induction hypothesis c ∈ B. Therefore ad ∈ B. This proves that we have (20.1.1.2)

τ = t(1 + tr.)

with r ∈ B[[t]]. Therefore we get ∞ X i i i −1 −1 (−1) t r . τ = t 1 + i=1

which shows that τ−1 ∈ B((t)).

(20.1.2) COROLLARY For any choice of t ∈ L((τ)) such that θ( f ) = t−n , we have τ = ζt(1 + tr.) for some r ∈ B[[t]] and some ζ ∈ µn (k).

6. The Jacobian Problem

140

Proof. Immediate from (20.1.1.2). In view of Lemma (20.1.1), we can restrict θ to A to get a B-algebra monomorphism θ : A → B((t)) such that θ( f ) = t−n and (20.1.3) 148

ordt θ(F) = − deg x2 F

for every F ∈ A. Let θ(g) =

(20.1.4)

X

g jt j

j

with g j = g j (x1 ) ∈ B. We call (20.1.4) a Newcon-Puiseux expansion of g in fractional powers of f −1 . Note that for fixed x1 , x2 , f, g, (20.1.4) depends on the choice of an element t such that θ( f ) = t−n . If t1 , t2 are two such choices then we have t1 = ζt2 for some ζ ∈ µn (k). Thus there are atmost n distinct Newton-Puiseux expansions of g in fractional powers of f −1 and any two of them are conjugate to each other under a B-automorphism of B((t)) given by t 7→ ζt for some ζ ∈ µn (k). In particular, the condition (JC) in Definition (20.2) below depends only on x = (x1 , x2 ), f , g and does not depend upon t. (20.2) DEFINITION. With the notation of (20.1), we say the pair ( f, g) satisfies condition (JC) (with respect to X) if the following holds: (JC) g j ∈ k for every j ≤ n − 2 and deg x1 gn−1 = 1.

(20.3) A DERIVATION OF L((t)). Continuing with the notation of (20.1), put u1 = x1 , u2 = f and u = (u1 , u2 ). Since deg x2 f > 0, u is a transcendence base of K = k(x1 , x2 ) over k. Therefore we have k-derivations Du,1 , uu,2 of K as defined in (15.1). Let di denote the unique extension of Du,i to a k-derivation of L(x2 ), i = 1, 2. Let δ : L((t)) → L((t)) be the map defined by X X d1 (a j )t j . δ a j ti = j

j

Then δ is clearly a k((t))-derivation of L((t)). We note that δ(B((t))) ⊂ B((t)).

20. Jacobian Problem Via Newton-Puiseux Expansion 149

141

Moreover, denoting again by θ the extension of θ to an L - monomorphism L(x2 ) → L((t)) of fields, we have (20.3.1) LEMMA δθ = θd1 . Proof. Since L(x2 ) is separable algebraic over k(u1 , u2 ), it is enough to show that δθ|k(u1 , u2 ) = θd1 |k(u1 , u2 ). Therefore it is enough to check that δθ(ui ) = θd1 (ui ), i = 1, 2. Now, δθ(u1 ) = δθ(x1 ) = δ(x1 ) = δ(u1 ) = 1 and θd1 (u1 ) = θ(1) = 1. Next, δθ(u2 ) = δθ( f ) = δ(t−n ) = 0 and θd1 (u2 ) = θ(0) = 0. The lemma is proved. (20.4) THEOREM. Let f , g be elements of A. Assume that f is monic in x2 and that deg x2 f > 0. Then the following two conditions are equivalent: (i) J( f, g) = . (ii) ( f, g) satisfies (JC). Proof. We use the notation of (20.1) and (20.3). Let Di = D x.i , i = 1, 2, where x = (x1 , x2 ). By the chain rule of derivation we have J( f, g) = Ju ( f, g)Jx (u1 , u2 ) = Ju ( f, g)Jx (x1 , f ) ! ! 0 1 1 0 = det det d1 (g) d2 (g) D1 ( f ) D2 ( f ) = −d1 (g)D2 ( f ). This gives θ(d1 (g))θ(D2 ( f )) = −θ(J( f, g)). Therefore by Lemma (20.3.1) we get δ(θ(g))θ(D2 ( f )) = −θ(J( f, g)). Using the expression (20.1.4) for θ(g) we get X d1 (g j )ti θ(D2 ( f )) = −θ(J( f, g)). (20.4.1) j

6. The Jacobian Problem

142

Now, let n = degx2 f . Then n ≥ 1. Since f is monic in x2 , we 150 get D2 ( f ) = nxn−1 + f ′ with f ′ ∈ A and degx2 f ′ < n − 1. Therefore 2 1−n θ(D2 ( f )) = nτ + θ( f ′ ) with ordτ θ( f ′ ) > 1 − n. It therefore follows from Corollary (20.1.2) that θ(D2 ( f )) = t1−n + e, where e ∈ L((t)) and ordt e > 1 − n. This shows that we have θ(D2 ( f ))−1 = tn−1 + h with h ∈ L((t)) and ordt h > n − 1. Therefore from (20.4.1) we get X (20.4.2) d1 (g j )t j = −θ(J( f, g))(tn−1 + h). j

Now, suppose J( f, g) = . Then we have X d1 (g j )t j = (tn−1 + h). j

This shows that d1 (g j ) = 0 for j ≤ n− 2 and d1 (gn−1 ) = , which clearly implies that ( f, g) satisfies condition (JC). Conversely, suppose that ( f, g) satisfies condition (JC). Then we have d1 (g j ) = 0 for j ≤ n − 2 and d1 (gn−1 ) = . Therefore it follows from (20.4.2) that we have X (20.4.3) tn−1 + d1 (g j )t j = −θ(J( f, g))(tn−1 + h). j≤n

This shows that ordt θ(J( f, g)) = 0. Therefore by (20.1.3) we get deg x2 J( f, g) = 0, which means that J( f, g) ∈ L. Put λ = J( f, g). Then θ(λ) = λ. Therefore comparing the coefficients of tn−1 in (20.4.3) we get = −λ, which shows that λ = , and the theorem is proved. (20.5) NOTATION. Let f, g be elements of A. Assume that n = deg x2 f > 0 and that f is monic in x2 . Then with the notation of (20.1) we have a commutative diagram L[xO 2 ] ?

A

151

θ

/ L((t)) O

θ

? / B((t))

where θ is a B-algebra monomorphism such that θ( f ) = t−n and

20. Jacobian Problem Via Newton-Puiseux Expansion (20.5.1)

143

ordt θ(F) = − deg x2 F

for every F ∈ A. Let θ(g) =

X

g j ti

j

with g j = g j (x1 ) ∈ B. Assume that the pair ( f, g) satisfies condition (JC), i.e. assume that we have (20.5.2)

g j ∈ k for every j ≤ n − 2, deg x1 gn−1 = 1.

˜ = Φ(X, ˜ Then by Theorem (20.4) we have J( f, g) = . Let Φ Y) ∈ L((X))[Y] be the minimal monic polynomial of θ(g) over L((tn )). (See ˜ is the unique irreducible element of Definition (5.8).) Recall that Φ ˜ n , θ(g)) = 0. Put Φ = Φ(X, Y) = L((X))[Y], monic in Y, such that Φ(t −1 ˜ Φ(X , Y). (20.5.3) LEMMA (i) Φ is monic in Y and degY Φ = n. (ii) Φ ∈ B[X, Y]. (iii) Φ( f, g) = 0. (iv) L[X, Y]/(Φ) is isomorphic (as an L-algebra) to L[ f, g]. Proof. (i) By definition of Φ, Φ is monic in Y. By (20.5.2) n−1 ∈ Suppt θ(g). Therefore g.c.d. ({n} ∪ Suppt θ(g)) = 1. ˜ = n. This proves 152 Now it follows from Lemma (5.10) that degY Φ (i). (ii) Let Ψ = Ψ(X, Y) ∈ B[X, Y] be the x2 -resultant of ( f − X, Y − g). Since f is monic in x2 , Ψ is monic in Y. Moreover, since ˜ deg x2 f = n, we have degY Ψ = n. Put Ψ(X, Y) = Ψ(X −1 , Y).

6. The Jacobian Problem

144

We have Ψ( f, g) = 0. Therefore 0 = θ(Ψ( f, g)) = Ψ(t−n , θ(g)) = ˜ n , θ(g)). It now follows from (i) that Φ ˜ = Ψ. ˜ Therefore Φ = Ψ(t Ψ ∈ B[X, Y]. (iii) Since Φ = Ψ as proved above, we have Φ( f, g) = Ψ( f, g) = 0. (iv) Let α : L[X, Y] → L[ f, g] be the L-algebra epimorphism defined by α(X) = f , α(Y) = g. then (ii) and (iii) Φ ∈ ker α. Since Φ is irreducible in L((X −1 ))[Y] ⊃ L[X, Y] and is monic in Y, Φ is irreducible in L[X, Y]. Therefore ker α = (Φ), and (iv) is proved. (20.5.4) A SPECIALIZATION. Since deg x1 gn−1 = 1 by (20.5.2), there exists c ∈ k such that gn−1 (x1 ) , gn−1 (c) , 0. We choose such a c ∈ k and keep it fixed in the sequel. For an element F of A = B[x2 ] (resp. B((t)), B[X, Y], B[X −1 , Y], . . .) we shall denote by F the element of k[x2 ] (resp. k((t)), k[X, Y], k[X −1 , Y], . . .) ˜ ϕ˜ = Φ. ˜ obtained from F by putting x1 = c. Let ϕ = Φ, (20.5.5) LEMMA (i) ϕ ∈ k[X, Y], ϕ is monic in Y and degY ϕ = n. (ii) ϕ˜ ∈ k[X −1 , Y] ϕ˜ is monic in Y and degY ϕ˜ = n. (iii) ϕ˜ is the minimal monic polynomial of θ(g) =

P j

g j t j over k((tn )).

(iv) ordt (θ(g) − θ(g)) = n − 1. Proof. (i) is immediate from Lemma (20.5.3). (ii) This follows from (i), since ϕ(X, ˜ Y) = ϕ(X −1 , Y).

20. Jacobian Problem Via Newton-Puiseux Expansion 153

145

˜ n , θ(g)) = 0, we have ϕ(t (iii) Since Φ(t ˜ n , θ(g)) = 0. Since gn−1 , 0, we have n − 1 ∈ Suppt θ(g). Therefore the minimal monic polynomial of θ(g) over k((tn )) has Y-degree n (Lemma (5.10)). Therefore by (ii) ϕ˜ is the minimal monic polynomial of θ(g) over k((tn )). (iv) Since g j ∈ k for j ≤ n−2, we have g j = g j for j ≤ n−2. Moreover, we have gn−1 , gn−1 . Therefore the assertion follows. (20.5.6) Characteristic Sequences of ( f, g). ˜ and we define the characteristic (See § 6.) We define h( f, g) = h(Φ) sequences of the pair ( f, g) by ˜ mi ( f, g) = mi (−n, Φ), ˜ qi ( f, g) = qi (−n, Φ), ˜ si ( f, g) = si (−n, Φ), ˜ ri ( f, g) = ri (−n, Φ), ˜ di+1 ( f, g) = di+1 (Φ), for 0 ≤ i ≤ h( f, g) + 1. (Note that these sequences depend not only on f , g, but also on x = (x1 , x2 ). However, the omission of x in the notation mi ( f, g) etc. will cause no confusion.) (20.5.7) LEMMA We have h(ϕ) ˜ = h( f, g) and mi (−n, ϕ) ˜ = mi ( f, g), qi (−n, ϕ) ˜ = qi ( f, g), si (−n, ϕ) ˜ = si ( f, g), ri (−n, ϕ) ˜ = ri ( f, g), di+1 (ϕ) ˜ = di+1 ( f, g) for 0 ≤ i ≤ h(ϕ) ˜ + 1.

6. The Jacobian Problem

146

Proof. Immediate, since g.c.d. (n, n − 1) = 1, n − 1 ∈ Suppt θ(g), n − 1 ∈ 154 Suppt θ(g) and ordt (θ(g) − θ(g)) = n − 1 by Lemma (20.5.5). In the remainder of subsection (20.5) we fix the following notation: h = h( f, g), mi = mi ( f, g), qi = qi ( f, g), si = si ( f, g), ri = ri ( f, g), di+1 = di+1 ( f, g) for 0 ≤ i ≤ h + 1. Also, for 1 ≤ i ≤ h + 1, let if i = 1, Y, ψ˜ i = d App i (ψ, ˜ if i ≥ 2 Y if i ≥ 1, Y, ψi = Appdi (ϕ), if i ≥ 2, Y ˜ ∂ ψi , ψ˜′ i = ∂Y ∂ψi ψ′i = . ∂Y (See § 4). (20.5.8) LEMMA We have: (i) h ≥ 1. (ii) m1 = − deg x2 g ≤ 0. (iii) mi < n − 1 for 1 ≤ i ≤ h − 1 and mh ≤ n − 1. 155

Proof.

(i) This is clear, since θ(g) , 0.

20. Jacobian Problem Via Newton-Puiseux Expansion

147

(ii) Follows from (20.5.1) and the fact that g , 0. (iii) This is also clear, since n − 1 ∈ Suppt θ(g) and g.c.d. (n, n − 1) = 1. (20.5.9) LEMMA For 1 ≤ i ≤ h + 1, we have (i) ψ˜ i (X, Y) = ψi (X −1 , Y), (ii) ψ˜′ i (X, Y) = ψ′i (X −1 , Y). Proof.

(i) Follows from Proposition (4.7).

(ii) Follows from (i) (20.5.10) LEMMA For F(X, Y) ∈ k[X, Y], we have degx2 F( f, g) = − ordt F(t−n , θ(g)). Proof. This follows from 20.5.1, since θ(F( f, g)) = F(t−n , θ(g)).

(20.5.11) LEMMA For 1 ≤ e ≤ h, we have degx2 ψe ( f, g) = −re . Proof. We have ψ1 (X, Y) = Y. Therefore by Lemma (20.5.10) degx2 ψ1 ( f, g) = − ordt θ(g) = −m1 = −r1 . This proves the assertion for e = 1. Assume now that e ≥ 2. Since me ≤ n − 1 by Lemma (20.5.8), it follows from Lemma (20.5.5) (iv) that we have X X θ(g) = g j t j + gme tme + g jt j. jme

Therefore, since gme , 0, it follows from Corollary (7.20) that ˜ n , θ(g)) = re . Therefore by Lemma (20.5.9) we have ordt ψe (t−n , ordt ψ(t θ(g)) = re . Now, the lemme follows from Lemma (20.5.10).

6. The Jacobian Problem

148 (20.5.12) LEMMA

For 1 ≤ e ≤ h, we have deg x2 ψ′e ( f, g) = me − re . Proof. Since ordt (θ(g) − θ(g)) = n − 1 ≥ me and me ∈ Suppt θ(g), it follows from Proposition (13.7) that ordt ψ˜′ e (tn , θ(g)) = re − me . Therefore by Lemmas (20.5.9) and (20.5.10) we get degx2 ψ′e ( f, g) = − ordt ψ′e (t−n , θ(g)) = me − re . 156

(20.6) DEFINITION. An element f of A is said to be x2 regular if f , 0 and deg f = deg x2 f . Note that f is x2 -regular if and only of x1 does not divide f + in A. In Lemmas (20.7) - (20.9) below, we let the notation and assumptions be those (20.5). We assume, moreover, that f ix x2 -regular. (20.7) LEMMA. Let e be an integer, 2 ≤ e ≤ h. Assume that ψi ( f, g) is related to f for every i, 1 ≤ i ≤ e − 1. Let F = F(X, Y) be a non-zero element of k[X, Y] with degY F < n/de . Then F(F, g) is related to f . Proof. Let R = k[X]. Let p = e − 1 and let G = (G1 , . . . , G p ), where Gi = ψi for 1 ≤ i ≤ p. Then G satisfies conditions (i)-(iii) of (2.2) and, with the notation of (2.2), we have ni (G) = di /di+1 for 1 ≤ i ≤ p − 1. Let + p A(G) = a = (a1 , . . . , a p ) ∈ (Z ) ai < di /di+1 for 1 ≤ i ≤ p − 1 .

Then by Corollary (2.14) we have the G-adic expansion X (20.7.1) F= FaGa a∈A(G)

of F with fa = Fa (X) ∈ R for every a ∈ A(G). By Corollary (2.9) we have p X ai degY Gi = degY Ga ≤ degY F < n/de = n/d p+1 i=1

for every a ∈ SuppG F. In particular, we have a p n/d p = a p degY G p < n/d p+1 .

20. Jacobian Problem Via Newton-Puiseux Expansion

149

This gives (20.7.2)

a p < d p /d p+1

for every a ∈ SuppG (F). Putting X = f , Y = g in (20.7.1), we get X (20.7.3) F( f, g) = Fa ( f )G( f, g)a ,

157

a∈S

where S = SuppG (F). Since Ga ( f ) ∈ k[ f ], we can rewrite (20.7.3) in the form X F( f, g) = λb H b b∈B(H)

with λb ∈ k for every b, where h = (h0 , . . . , h p ) with h0 = f . Hi = Gi ( f, g) for 1 ≤ i ≤ p, and where B(H) = b = (b0 , . . . , b p ) ∈ (Z+ ) p+1 bi < di /di+1 for 1 ≤ i ≤ p . Note that the condition b p < d p /d p+1 for b ∈ B(H) is justified in view of (20.7.2). Since deg x2 Hi = −ri for 1 ≤ i ≤ p (Lemma (20.5.11)) and deg x2 H0 = deg x2 f = n = −r0 , we have, for every b ∈ (B(H)). b

deg x2 H =

p X

bi (−ri ).

i=0

which is clearly a strict linear combination of (−r0 , . . . , −r p ). (See § 1.) ′ Therefore if b, b′ ∈ B(H), b , b′ , then deg x2 H b , deg x2 H b . It follows that there exists a unique b ∈ B(H) such that λb , 0 and (20.7.4)

′

deg x2 F( f, g) = deg x2 (λb H b ) > deg x2 (λb′ H b )

for every b′ ∈ B(H), b′ , b. Now, by assumption, Hi is related to f for every i, 0 ≤ i ≤ p. In particular, since f is x2 -regular, so is Hi for ′ ′ every i, 0 ≤ i ≤ p. Therefore we have deg x2 H b = deg H b for every b′ ∈ B(H), and it follows from (20.7.4) that we have F( f, g)+ = (λb H b ). Since each Hi is related to f , so is λb H b by Lemma (17.3). Thus 158 F( f, g) is related to f , and the lemma is proved.

150

6. The Jacobian Problem

(20.8) LEMMA. Let e be an integer, 2 ≤ e ≤ h. Assume that ψi ( f, g) is related to f for every i, 1 ≤ i ≤ e − 1. Then f has only one point at infinity or ψe ( f, g) is x2 -regular. Proof. By the chain rule for differentiation we have (20.8.1)

J( f, ψe ( f, g)) = ψ′e ( f, g)J( f, g) = ψ′e ( f, g).

Now, if J( f + , ψe ( f, g)+ ) = 0 then by Proposition (17.4) f and ψe ( f, g) are related. Therefore in this case, since f is x2 -regular, so is ψe ( f, g). Thus we may now assume that J( f + , ψe ( f, g)+ ) , 0. Then by (20.8.1) and Lemma (18.2) we have (20.8.2)

J( f + , ψe ( f, g)+ ) = ψ′e ( f, g)+ .

159

Since degY ψ′e = degY ψe − 1 < n/de , it follows from Lemma (20.7) that ψ′e ( f, g) is related to f . Therefore there exist non-negative integers p, q and a homogeneous element H of A such that f + = H p , ψ′e ( f, g)+ = H q . From (20.8.2) we get J(H q , G) = H q , where G = ψe ( f, g)+ . This shows that p − 1 ≤ q and J(H, G) = H r , where r = q − p + 1. If r = 0 then J(H, G) = and it follows from Lemma (18.8) (i) that H is linear in x1 , x2 , which shows that f has only one point at infinity. We may therefore assume that r > 0. Then by Lemma (18.5) H r−1 divides G. Let G = EH r−1 with E ∈ A. Then from J(H, G) = H r we get J(H, E) = H. Therefore by Lemma (18.10) we have E = (a1 x1 + a2 x2 )(b1 x1 + b2 x2 ) and H = (a1 x1 + a2 x2 )i1 (b1 x1 + b2 x2 )i2 , where i1 , i2 are non-negative integers, i1 + i2 > 0, and a1 , a2 , b1 , b2 are elements of k such that a1 x1 + a2 x2 and b1 x1 + b2 x2 are linearly independent over k. If i1 = 0 or i2 = 0 then H (and therefore f ) has only one point at infinity. Assume therefore that i1 > 0, i2 > 0. Then, since f (and therefore H) is x2 -regular, we have a2 , 0, b2 , 0. This implies that E is x2 -regular. Therefore G = EH r−1 is x2 -regular. This means that ψe ( f, g) is x2 -regular. (20.9) LEMMA. Let e be an integer, 2 ≤ e ≤ h. Assume that ψi ( f, g) is related to f for every i, 1 ≤ i ≤ e − 1. Assume also that me , n − 2. Then f has only one point at infinity or ψe ( f, g) is related to f .

20. Jacobian Problem Via Newton-Puiseux Expansion

151

Proof. If f has only one point at infinity, there is nothing to prove. Therefore by Lemma (20.8) we may assume that ψe ( f, g) is x2 -regular. By Proposition (17.4) we have to show that J( f + , ψe ( f, g)+ ) = 0. Suppose J( f + , ψe ( f, g)+ ) , 0. Then, since by (20.8.1) we have J( f, ψe ( f, g)) = ψ′e ( f, g) we get (20.9.1)

deg f + deg ψe ( f, g) − 2 = deg ψ′e ( f, g)

by Lemma (18.2). Since degY ψ′e < n/de , ψ′e , ψ′e ( f, g) is related to f by Lemma (20.7). Therefore, since f is x2 -regular, so is ψe ( f, g). Also, by assumption, ψe ( f, g) is x2 -regular. Therefore we have deg ψe ( f, g) = deg x2 ψe ( f, g) = −re by Lemma (20.5.11) and deg ψ′e ( f, g) = deg x2 ψ′e ( f, g) = me − re by Lemma (20.5.12). Therefore, since deg f = n, (20.9.1) gives n − re − 160 2 = me − re , so that me = n − 2, which is a contradiction. Therefore J( f + , ψe ( f, g)+ ) = 0, and the lemma is proved. (20.10) THEOREM (cf. Theorem (19.4)). The following three statements are equivalent: (I) If f , g ∈ A and J( f, g) = then k[ f, g] = A. (V) Let f , g ∈ A. Assume that deg x2 f > 0 and that f is x2 -regular and is monic in x2 . If the pair ( f, g) satisfies condition (JC) then we have deg x2 f = 1 or me ( f, g) < deg x2 f − 2 for every e, 1 ≤ e ≤ h( f, g). (VI) Let f , g ∈ A be as in statement (V). If the pair ( f, g) satisfies (JC) then we have deg x2 f = 1 or me ( f, g) , deg x2 f − 2 for every e, 1 ≤ e ≤ h( f, g). Proof. Consider the statement (II) If f , g ∈ A and J( f, g) = then f has only one point at infinity.

152

6. The Jacobian Problem By theorem (19.4) it is enough the implications (I) ⇒ (V) ⇒ (VI) ⇒ (II).

161

(I) ⇒ (V). Let f , g satisfy the hypothesis of (V). Then by Theorem (20.4) we have J( f, g) = . Therefore by (I) we have k[ f, g] = A. We now use the notation of (20.5). From the equality k[ f, g] = A we get L[ f, g] = L[x2 ]. This means that L[X, Y]/(Φ) is isomorphic to L[x2 ] (Lemma (20.5.3)) (iv)). Now, by Lemma (20.5.8) we have h ≥ 1. If h ≥ ˜ < 2 then it follows from Corollary (13.5) (v) that me ( f, g) = me (−n, Φ) n − 2 for every e, 1 ≤ e ≤ h, where n = degx2 f . Suppose now that h = 1. Let m1 = m1 ( f, g). Then h = 1 implies that g.c.d. (n, m1 ) = 1. Suppose m1 ≥ n − 2. Then n − m1 ≤ 2. Since m1 ≤ 0 by Lemma (20.5.8), we get n ≤ 2. If n = 2 then we must have m1 = 0. This is not possible, since g.c.d. (n, m1 ) = 1. Therefore n = 1, and (V) is proved. (V) ⇒ (VI). Trivial. (VI) ⇒ (II). Let f , g be elements of A such that J( f, g) = . We have to show that f has only one point at infinity. To do this we may replace x1 , x2 by any basis of the k-vector space kx1 ⊕ kx2 . We may therefore assume, without loss of generality, that x1 does not divide f + , i.e., f is x2 -regular. Then, in particular, deg x2 f > 0. Moreover, replacing f by f for suitable , we may assume that f is monic in x2 . By Theorem (20.4), since J( f, g) = , the pair ( f, g) satisfies condition (JC). Let n = deg x2 f = deg f . If n = 1 then, clearly, f has only one point at infinity. Assume therefore that n > 1. Then by (VI) we have me ( f, g) , n − 2 for every e, 1 ≤ e ≤ h, where h = h( f, g). Since deg f > 1 and J( f, g) = , it follows from Lemma (18.2) that J( f + , g+ ) = 0. Let us now use the notation of (20.5). Since J( f + , g+ ) = 0, it follows from Proposition (17.4) that f and g = ψ1 ( f, g) are related. Now, since me ( f, g) , n − 2 for every e, 1 ≤ e ≤ h, it follows from Lemma (20.9) by induction on e that f has only one point at infinity or f is related to ψe ( f, g) for every, e, 1 ≤ e ≤ h. If f has only one point a infinity then we have nothing more to prove. We may therefore assume that f is related to ψe ( f, g) for every e, 1 ≤ e ≤ h. In particular, since f is x2 -regular, so is ψe ( f, g) for every e. Therefore, for 1 ≤ e ≤ h, we have deg ψe ( f, g) = degx2 ψe ( f, g) = −re

21. Solution in the Galois Case

153

by Lemma (20.5.11). Therefore since deg f = n = −r0 and since g.c.d. ( f0 , . . . , rh ) = dh+1 = 1, it follows from Corollary (17.5) that there exists a homogeneous element 162 H of A of degree 1 such that f + = H n . This means that f has only one point at infinity.

21 Solution in the Galois Case In this section we show that the answer to the Jacobian problem is in the affirmative in case k(x1 , x2 )/k( f, g) is a Galois extension (Theorem (21.11)). We preserve the notation of § 15 and § 16. In addition, we fix the following notation: Let f , g be elements of A = k[x1 , x2 ] such that J( f, g) = . Put B = k[ f, g] and L = k( f, g). Recall that we have k = k(x1 , x2 ) and that char k = 0.

(21.1) Definition and Notation. As in § 11, by a valuation we shall mean a real discrete valuation. Let Ω be a field of characteristic zero and E. F be over fields of Ω such that E is a finite field extension of F. Let v be a valuation of E/Ω and let V = Rv be the discrete valuation ring of E/Ω associated to v. Let W = V ∩F. We say V lies over (or is an extension of) W. We denote by eV|W (or simply, eV ) the ramification index of V over W, i.e., eV = v(z), where z is a uniformizing parameter for W. We say V is ramified (resp. unramified) in the extension E/F if Ev > 1 (resp. eV = 1). We say W is ramified in E/F if there exists an extension V of W to E such that V is ramified in E/F. In our proof of Theorem (21.11) we shall need the following wellknown formula:

(21.2) Lemma (Hurwitz Formula). Let Ω be an algebraically closed field of characteristic zero and let E, F be function fields of one variable over Ω such that E is a finite extension 163

6. The Jacobian Problem

154

of F. Let n = [F : F] and let gE (resp. gF ) be the genus of E/Ω (resp. F/Ω). Then we have X (eV − 1), 2gE − 2 = (2gF − 2)n + V

where the summation is over all discrete valuation rings V of E/Ω and eV = eV|V∩F . For a proof of this lemma see, for instance, [4]. (21.3) COROLLARY. With the notation of Lemma (21.2) suppose that gF = 0 and that there exists atmost one discrete valuation ring of F/Ω ramified in E/F. Then E = F. Proof. By Lemma (21.2) we have 2gE − 2 = −2n +

X

(eV − 1).

V

By assumption, all those V for which eVX > 1 lie over the same discrete valuation ring of F. Therefore we have (eV − 1) ≤ n − 1 and we get V

2gE − 2 ≤ −n − 1, so that n ≤ 1 − 2gE ≤ 1.

(21.4) LEMMA. K/L is a finite (separable) extension. Proof. Since K is finitely generated over L, we have only to show that K has no non-trivial L-derivations. Let d be an L-derivation of K. Then we have 0 = d( f ) = D1 ( f )d(x1 ) + D2 ( f )d(x2 ), 0 = d(g) = D1 (g)d(x1 ) + D2 (g)d(x2 ). Since J( f, g) , 0, we get d(x1 ) = 0 = d(x2 ). Therefore d = 0. 164

(21.5) COROLLARY. f and g are algebraically independent over k and B is the polynomial ring in two variables f and g over k. (21.6) LEMMA. Let J be a prime ideal of A of height one. Then ht(J ∩ B) = 1. (Here ht denotes “height”.)

21. Solution in the Galois Case

155

Proof. Let k be the algebraic closure of k and let A = k[x1 , x2 ], B = k[ f, g]. Since A is integral over A, there exists a prime ideal J of A such that J ∩ A = J . Moreover, htJ = 1. since B is integral over B, we have ht(J ∩ B) = ht(J ∩ B). We may therefore assume that k = k. Since K/L is algebraic (Lemma (21.4)) we have J ∩ B , 0. Suppose ht(J ∩ B) > 1. Then J ∩ B = ( f − a)B + (g − b)B for some a, b ∈ k. We have J = pA for some p ∈ A. Since p divides f − a and g − b in A, p divides J( f − a, g − b) = J( f, g) = in A. This is a contradiction.

(21.7) Proposition (Birational Case) If L = K then B = A. Proof. Let Q be any prime ideal of B of height one. Then Q = qB for some q ∈ B. Since q < k, q is a non-unit in A. Therefore there exists a prime ideal J of A of height one such that q ∈ J . Then by Lemma (21.6) we have J ∩ B = Q. Therefore BQ ⊂ AJ . Now, both BQ and AJ are discrete valuation rings of the same field K. Therefore we have BQ = AJ , so that A ⊂ BQ . Thus \ BQ = B. a⊂ htQ=1

(21.8) DEFINITION. Let J be a prime ideal of A of height one. We say J is unramified B if the discrete valuation ring AJ is unramified in the extension K/L (Definition (21.1)). Note that J is unramified over B if and only if J ∩ B 1 J 2 . (21.9) LEMMA. Every prime ideal of A of height one is unramified 165 over B. Proof. Let J be a prime ideal of A of height one and let Q = J ∩ B. We have to show that Q 1 J 2 . Let J = pA, Q = qB with p ∈ A, q ∈ B. Since q < gk, we have (21.9.1)

(∂q/∂ f )B + (∂q/∂g)B 1 qB.

6. The Jacobian Problem

156 Now, we have

Di (q) = (∂q/∂ f )Di ( f ) + (∂q/∂g)Di (g) for i = 1, 2. Therefore since J( f, g) = , we get (21.9.2)

(∂q/∂ f )A + (∂q/∂g)A ⊂ D1 (q)A + D2 (q)A.

Now, suppose q ∈ p2 A. then Di (q) ∈ pA, i = 1, 2. Therefore by (21.9.2) we have (∂q/∂ f )B + (∂q/∂g)B ⊂ pA ∩ B = qB, which contradicts (21.9.1)

(21.10) LEMMA. Let u1 , u2 be elements of K such that K/k(u1 , u2 ) is a finite extension. Then there exists a ∈ K such that k(u1 + au2 ) is algebraically closed in K. Proof. For a subfield F of K let F denote its algebraic closure in K. Consider the family k(u1 + au2 )(u2 ) a ∈ k 166

of subfields of K containing k(u1 , u2 ). Since there are only finitely many fields between k(u1 , u2 ) and K (and since k is infinite), there exist a1 , a2 ∈ k, a1 , a2 , such that k(v1 )(u2 ) = k(v2 )(u2 ), where v1 = u1 + a1 u2 , v2 = u1 + a2 u2 . Since u2 ∈ k(v1 , v2 ), we get k(v1 ) ⊂ k(v2 )(u2 ) ⊂ k(v2 )(v1 ) and k(v2 ) ⊂ k(v1 )(u2 ) ⊂ k(v1 )(v2 ). Therefore we have k(v1 )(v2 ) = kk(v2 )(v1 ). Since k(v2 ) ⊂ K = k(x1 , x2 ), k is algebraically closed in k(v2 ). Therefore, since u1 , u2 and hence v1 , v2 are algebraically independent over k, k(v1 ) is algebraically closed in k(v2 )(v1 ) = k(v1 ) (v2 ). This means that k(v1 ) = k(v1 ). (21.11) THEOREM. If K/L is a Galois extension then B = A. Proof. In view of Proposition (21.7), we have only to show that L = K. Replacing f by f + ag for some a ∈ k, we may assume that k( f ) is algebraically closed in K(Lemma (21.10)). Then, denoting by Ω the

21. Solution in the Galois Case

157

algebraic closure of k( f ), we see that Ω and K are linearly disjoint over k( f ). It follows that Ω(g) and K are linearly disjoint over L, so that L = Ω(g)∩K, the intersection being taken in ΩK. Therefore, putting E = ΩK, F = Ω(g), it is enough to show that E = F. Suppose E , F. Then it follows from Corollary (21.3) that at least two (discrete) valuation rings of F/Ω are ramified in E/F. Since the (g−1 )-adic valuation ring is the only valuation ring of F/Ω not containing Ω[g], there exists a valuation ring W ′ of F/Ω such that W ′ ⊃ Ω[g] and W ′ is ramified in E/F. Let W = W ′ ∩ L. Then W is a discrete valuation ring of L containing k( f )[g] and is ramified in K/L. Since K/L is Galois, the extensions of W to K are ramified over L. Now, since W ⊃ k( f )[g], W = B for some prime ideal of B of height one. Let = qB with q ∈ B. then q is a nonunit in A. Therefore there exists a prime ideal J of A of height one such that q ∈ J . By Lemma (21.6) we have J ∩ B = . Therefore W = B = AJ ∩ L. Thus J is ramified over B. This is a contradiction 167 by Lemma (21.9). (21.12) REMARK. The above proof shows, in fact, that there cannot exist a proper Galois extension of L contained in K.

Bibliography [1] S.S. Abhyankar and T.T. Moh: Newton-Puiseux expansion and 168 generalizer Tschirnhausen transformation, J. reine angew. Math., 260, 47-83 and 261, 29-54 (1973). [2] S.S. Abhyankar and T.T. Moh:Embeddings of the line in the plane, J. refine angew, math., 276, 148-166 (1975). [3] S.S. Abhyankar and B. Singh: Embeddings of certain curves in the affine plane, to appear in Amer. J. Math. [4] H. Hasse, Zahlenthrorie, 2nd ed., Akademie-Verlag, Berlin, 1963. [5] W. van der Kulk, On polynomial rings in two variables, Nieuw Archief voor Wiskunde, (3) I, 33-41 (1953).

159