From Christoffel Words to Markoff Numbers 9780198827542, 0198827547

In 1875, Elwin Bruno Christoffel introduced a special class of words on a binary alphabet linked to continued fractions


English Pages 169 Year 2019


Table of contents :
Cover
From Christoffel Words to Markoff Numbers
Copyright
Dedication
Acknowledgements
Contents
Introduction
PART I The Theory of Markoff
1 Basics
2 Words
2.1 Tiling the Plane with a Parallelogram
2.2 Christoffel Words
2.3 Palindromes
2.4 Standard Factorization
2.5 The Tree of Christoffel Pairs
2.6 Sturmian Morphisms
3 Markoff Numbers
3.1 Markoff Triples and Numbers
3.2 The Tree of Markoff Triples
3.3 The Markoff Injectivity Conjecture
4 The Markoff Property
4.1 Markoff Property for Infinite Words
4.2 Markoff Property for Bi-infinite Words
5 Continued Fractions
5.1 Finite Continued Fractions
5.2 Infinite Continued Fractions
5.3 Periodic Expansions Yield Quadratic Numbers
5.4 Approximations of Real Numbers
5.5 Lagrange Number of a Real Number
5.6 Ordering Continued Fractions
6 Words and Quadratic Numbers
6.1 Continued Fractions Associated with Christoffel Words
6.2 Markoff Supremum of a Bi-infinite Sequence
6.3 Lagrange Number of a Sequence
7 Lagrange Numbers Less Than Three
7.1 From L(s) < 3 to the Markoff Property
7.2 Bi-infinite Sequences
8 Markoff’s Theorem for Approximations
8.1 Main Lemma
8.2 Markoff’s Theorem for Approximations
8.3 Good and Bad Approximations
9 Markoff’s Theorem for Quadratic Forms
9.1 Indefinite Real Binary Quadratic Forms
9.2 Infimum
9.3 Markoff’s Theorem for Quadratic Forms
10 Numerology
10.1 Thirteen Markoff Numbers
10.2 The Golden Ratio and Other Numbers
10.3 The Matrices μ(w) and Frobenius Congruences
10.4 Markoff Quadratic Forms
11 Historical Notes
PART II The Theory of Christoffel Words
12 Palindromes and Periods
12.1 Palindromes
12.2 Periods
13 Lyndon Words and Christoffel Words
13.1 Slopes
13.2 Lyndon Words
13.3 Maximal Lyndon Words
13.4 Unbordered Sturmian Words
13.5 Equilibrated Lyndon Words
14 Stern–Brocot Tree
14.1 The Tree of Christoffel Words
14.2 Stern–Brocot Tree and Continued Fractions
14.3 The Raney Tree and Dual Words
14.4 Convex Hull
15 Conjugates and Factors
15.1 Cayley Graph
15.2 Conjugates
15.3 Factors
15.4 Palindromes Again
15.5 Finite Sturmian Words
16 Bases and Automorphisms of the Free Group on Two Generators
16.1 Bases and Automorphisms of F(a,b)
16.2 Inner Automorphisms
16.3 Christoffel Bases of F(a,b)
16.4 Nielsen’s Criterion
16.5 An Algorithm for the Bases of F(a,b)
16.6 Sturmian Morphisms Again
17 Complements
17.1 Other Results on Christoffel Words
17.2 Lyndon Words and Lie Theory
17.3 Music
Bibliography
Index

From Christoffel Words to Markoff Numbers

Christophe Reutenauer

Great Clarendon Street, Oxford, OX2 6DP, United Kingdom Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries © Christophe Reutenauer 2019 The moral rights of the author have been asserted First Edition published in 2019 Impression: 1 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above You must not circulate this work in any other form and you must impose this same condition on any acquirer Published in the United States of America by Oxford University Press 198 Madison Avenue, New York, NY 10016, United States of America British Library Cataloguing in Publication Data Data available Library of Congress Control Number: 2018947066 ISBN 978–0–19–882754–2 DOI: 10.1093/oso/9780198827542.001.0001 Printed and bound by CPI Group (UK) Ltd, Croydon, CR0 4YY

I dedicate this book to my wife Anissa Amroun, to Catherine, to Alexandre, and to Félix.

In memoriam Aldo de Luca

Acknowledgements

Many thanks to Srečko Brlek, Christian Kassel, Mélodie Lapointe, Vladimir Shpilrain, Ilya Kapovich, Paul Schupp, Enrico Bombieri, Sergey Fomin, Florent Hivert, Dominique Perrin, Antonio Restivo, Amy Glen, Jean Berstel, François Bergeron, Hugh Thomas, David Clampitt, Damien Jamet, Caroline Series, Nancy Wallace, Herman Goulet-Ouellet, Serge Perrine, Fanny Desjardins, for discussions, help, and mail exchanges. Aldo de Luca gave me several useful comments and suggestions: ti ringrazio molto, Aldo. The thorough reading by Patrice Séébold and Gwénaël Richomme of the manuscripts saved me from many typos, sloppiness, and some mistakes: un très grand merci, Patrice et Gwénaël. I am grateful to Aaron Lauve and Neha Siddiqui, who read carefully the whole manuscript, for their mathematical and stylistic suggestions: thanks a lot, Aaron and Neha.

Christophe Reutenauer
Montréal 2018


Introduction

In a short article written in Latin in 1875 [C1], Elwin Bruno Christoffel introduced a class of words on a binary alphabet, now called Christoffel words. They were followed in the twentieth century by the theory of Sturmian sequences, introduced by Morse and Hedlund in 1940 [MH] in the framework of symbolic dynamics, developed more recently within combinatorics on words, and related to discrete geometry.

Independently of Christoffel, Andrei Andreyevich Markoff (the originator of Markov processes, but writing his name in the French way in his two articles written in French) published in 1879 and 1880 his famous theory, now called Markoff theory. It characterizes certain quadratic forms and certain real numbers by extremal inequalities. Both classes are constructed by using certain words, essentially the Christoffel words.

Markoff's theory has visibly fascinated many great mathematicians, who re-proved one of the two aspects of the theory: Frobenius (1913), Remak (1924), Perron (1921), Dickson (1930), Cassels (1949), Harvey Cohn (1955), Bombieri (2007). The link between Christoffel words and the theory of Markoff was noted by Frobenius (1913) and known to Caroline Series (1985). The present book is motivated by this link. In Part I we provide the classical theory of Markoff in its two aspects, based on the theory of Christoffel words. In Part II we give more advanced results of the theory of Christoffel words.

I. Part I presents the classical theory of Markoff. This theory has two facets, both of which are considered. The first one culminates in an approximation theorem for irrational real numbers: it gives a sequence of approximations whose error terms are progressively smaller and which hold for all irrationals except for an exceptional sequence of quadratic numbers, given in the theorem. Let us be more precise. The first level of approximation is a theorem of Hurwitz: for each irrational real number x, there exist infinitely many rational numbers p/q such that $|x - p/q| < \frac{1}{\sqrt{5}\,q^2}$.
… det(u, v) > 0 if and only if the slope α of u is smaller than the slope β of v. Indeed, u (resp. v) is positively proportional to the vector (1, α) (resp. (1, β)), so that det(u, v) is positively proportional to the determinant of the matrix $\begin{pmatrix} 1 & \alpha \\ 1 & \beta \end{pmatrix}$, which is β − α (we leave to the reader the case where the slope of u or v is infinite).

In conclusion, given two nonzero integral vectors u, v in the first quadrant, with slope(u) < slope(v), the parallelogram constructed on them is primitive if and only if det(u, v) = 1.

For further use, when we prove that a Christoffel word is uniquely the product of two others, we prove the following result.

Corollary 2.1.1 Let A, B be two integral points such that B = A + (p, q) with p, q relatively prime positive integers. There is a unique point M lying in the rectangle with diagonal AB under this diagonal, such that AMB is a primitive triangle.

Figure 2.2 The unique point M such that AMB is a primitive triangle

The result and its proof are illustrated in Figure 2.2.

Proof For M choose the integral point in the rectangle below the diagonal and closest to the diagonal. Since p, q are relatively prime, there is no integral point on the open segment (A, B). One deduces that the triangle AMB is primitive.

To prove uniqueness, suppose that there is another point M' such that the triangle AM'B is primitive. Then, the triangle AMB being primitive, the vectors $\overrightarrow{AM}$, $\overrightarrow{AB}$ form a basis of Z². Their determinant is therefore equal to ±1, and actually (by the result above) it is 1, since the slope of $\overrightarrow{AM}$ is smaller than that of $\overrightarrow{AB}$. Similarly, $\det(\overrightarrow{AM'}, \overrightarrow{AB}) = 1$. Hence we have $\det(\overrightarrow{AM}, \overrightarrow{AB}) = 1 = \det(\overrightarrow{AM'}, \overrightarrow{AB})$. Thus, taking the difference, $\det(\overrightarrow{MM'}, \overrightarrow{AB}) = 0$, so that MM' and AB are parallel. Since MM' is strictly under the diagonal, it is shorter than AB and we deduce that there is some integral point on the open segment AB (namely the point $C = A + \overrightarrow{MM'}$), a contradiction. □
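The uniqueness statement of Corollary 2.1.1 can be checked by brute force on small cases. The following is only a sketch, assuming A = (0, 0) and B = (p, q); the function name `primitive_apexes` is hypothetical and not taken from the book. It uses the criterion from the proof: AMB is primitive with M under the diagonal exactly when det(AM, AB) = 1.

```python
from math import gcd


def primitive_apexes(p, q):
    """List the integral points M of the rectangle with diagonal A=(0,0), B=(p,q)
    for which det(AM, AB) = 1 (M under the diagonal, AMB primitive)."""
    if p <= 0 or q <= 0 or gcd(p, q) != 1:
        raise ValueError("p and q must be relatively prime positive integers")
    found = []
    for x in range(p + 1):
        for y in range(q + 1):
            # det(AM, AB) = x*q - y*p for AM = (x, y) and AB = (p, q).
            if x * q - y * p == 1:
                found.append((x, y))
    return found


print(primitive_apexes(7, 4))  # [(2, 1)]
```

For (p, q) = (7, 4) the single point found is (2, 1), the point closest to the diagonal from below.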

2.2 Christoffel Words

A lattice path is a sequence of consecutive elementary steps in the plane; each elementary step is a segment [(x, y), (x + 1, y)] or [(x, y), (x, y + 1)], with x, y ∈ Z. In this section the paths will be finite, but we consider in the sequel also infinite paths.

Let p, q be non-negative, relatively prime integers. Consider the segment from some integral point A to B = A + (p, q) and the lattice path from A to B located below this segment and such that the polygon delimited by the segment and the path has no interior integer point. Given a totally ordered alphabet {a < b}, the lower Christoffel word of slope q/p is the word in the free monoid {a, b}∗ coding the above path, where a (resp. b) codes a horizontal (resp. vertical) elementary step. See Figure 2.3, which shows the path corresponding to (p, q) = (7, 4), corresponding to the lower Christoffel word aabaabaabab of slope 4/7.

Note that the slope of a lower Christoffel word is equal to the slope of the segment in the plane delimited by the extreme points of the corresponding discrete path. Note also that for the definition of the lower Christoffel word, the choice of A is immaterial, and we often take A = (0, 0). We say that the above path and the lower Christoffel word discretize from below the segment AB.
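The geometric definition translates directly into a small program. The sketch below assumes A = (0, 0) and walks the highest lattice path that stays on or below the segment: it steps up whenever the point above is still on or below the line, and steps right otherwise. The name `lower_christoffel` is an illustrative choice, not notation from the text.

```python
from math import gcd


def lower_christoffel(p, q):
    """Lower Christoffel word of slope q/p: p letters a (right), q letters b (up)."""
    if p < 0 or q < 0 or gcd(p, q) != 1:
        raise ValueError("p and q must be non-negative and relatively prime")
    x = y = 0
    letters = []
    for _ in range(p + q):
        # (x, y + 1) lies on or below the line through (0,0) and (p,q)
        # exactly when (y + 1) * p <= q * x.
        if (y + 1) * p <= q * x:
            y += 1
            letters.append("b")
        else:
            x += 1
            letters.append("a")
    return "".join(letters)


print(lower_christoffel(7, 4))  # aabaabaabab
```

Staying as high as possible under the segment is what guarantees that no integer point is enclosed between the path and the segment.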

Figure 2.3 The lower Christoffel word aabaabaabab of slope 4/7

The upper Christoffel word of slope q/p is defined similarly, by considering the lattice path located above the segment. Since the rectangle with opposite vertices A and B and sides parallel to the coordinate lines has a symmetry around its centre, it follows that the upper Christoffel word of a given slope is the reversal of the lower Christoffel word of the same slope. A Christoffel word is a word which is a lower or an upper Christoffel word.

Denote by E the automorphism of the free monoid {a, b}∗ that exchanges a and b. Then E exchanges lower and upper Christoffel words. It sends the lower Christoffel word of slope q/p onto the upper Christoffel word of slope p/q: to see it, in Figure 2.3 for example, let A = (0, 0) and take the symmetry with respect to the diagonal y = x.

Clearly, the number of a's in the lower and upper Christoffel words of slope q/p is p, while the number of b's is q. In particular, $|w|_a$, $|w|_b$ are relatively prime when w is a Christoffel word and w cannot be a power of another word. The letters a and b are Christoffel words, simultaneously upper and lower ones, of respective slope 0 and ∞; they correspond to the cases (p, q) = (1, 0) and (p, q) = (0, 1) respectively. The other Christoffel words are called proper. The words $a^n b$ and $ab^n$, for n ≥ 0, are lower Christoffel words.

Denote by C the conjugator: it is the mapping sending each word $a_1 \cdots a_n$ of length n onto $a_2 \cdots a_n a_1$. This defines an action of the cyclic group Z/nZ on the set of words of length n. The conjugates of a word w of length n are the words $C^i(w)$, i = 0, . . . , n − 1. If w is a Christoffel word, then $|w|_a$, $|w|_b$ are relatively prime, and therefore these words are all distinct; indeed, otherwise one has $C^d(w) = w$ for some proper divisor d of n, and therefore w is a positive power of a word of length d, contradicting the previous primality property.
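The three facts of this passage (upper word as reversal, the action of E, and distinct conjugates) can be observed on examples. This is a hedged sketch with A = (0, 0); the helper names `christoffel`, `exchange` and `conjugates` are assumptions made for illustration.

```python
def christoffel(p, q, upper=False):
    """Lower (default) or upper Christoffel word of slope q/p, read off its lattice path."""
    x = y = 0
    word = []
    for _ in range(p + q):
        if upper:
            # Stay above the segment: go right only if (x + 1, y) is on or above it.
            go_right = y * p >= q * (x + 1)
        else:
            # Stay below the segment: go right if (x, y + 1) would be strictly above it.
            go_right = (y + 1) * p > q * x
        if go_right:
            x += 1
            word.append("a")
        else:
            y += 1
            word.append("b")
    return "".join(word)


def exchange(w):
    """The automorphism E exchanging the letters a and b."""
    return w.translate(str.maketrans("ab", "ba"))


def conjugates(w):
    """The words C^i(w) for i = 0, ..., len(w) - 1."""
    return [w[i:] + w[:i] for i in range(len(w))]


lower, upper = christoffel(7, 4), christoffel(7, 4, upper=True)
print(upper == lower[::-1])                              # True
print(exchange(lower) == christoffel(4, 7, upper=True))  # True
print(len(set(conjugates(lower))) == len(lower))         # True
```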

2.3 Palindromes

Let p, q be relatively prime positive natural numbers. Call a cutting word (or billiard word) associated with q/p the word m obtained as follows: let A, B be two integral points such that $\overrightarrow{AB} = (p, q)$ and consider the intersections of the open segment AB with the coordinate lines. If the coordinate line is vertical, then code the intersection by a and if it is horizontal, code it by b. Then m is the word obtained by reading these letters, corresponding to the intersections, successively from left to right. For example, see Figure 2.4, which shows the cutting word associated with 4/7. Note that the cutting word m associated with q/p satisfies $|m|_a = p - 1$, $|m|_b = q - 1$. It is clear that any cutting word is a palindrome: indeed, the defining rectangle (the rectangle of diagonal AB) has a symmetry around its centre which preserves coordinate lines and the diagonal.

Figure 2.4 The cutting word abaabaaba associated with 4/7

Theorem 2.3.1 A proper lower Christoffel word of slope r is of the form amb for some palindrome m, which is the cutting word associated with r. The corresponding upper Christoffel word is bma.

Proof We prove the first assertion, and the second will follow, by what has been seen in Section 2.2. Note that, clearly, a proper lower Christoffel word begins with a and ends with b. Figure 2.5 shows the bijective correspondence between the steps of the path defining the lower Christoffel word (excluding the first and the last step) and the intersection points defining the cutting word. □

Figure 2.5 Bijection between steps and intersections

The theorem is illustrated in Figure 2.6, where the lower and upper Christoffel words of slope 4/7 are represented. Their paths together form a snake-like polygon, which illustrates the fact that they are of the forms amb and bma for the same word m, which is the cutting word associated with 4/7. The proof of the next theorem is left to the reader.

Theorem 2.3.2 Let p, q be two positive, relatively prime integers. Let P = {ip, 0 < i < q} and Q = {jq, 0 < j < p}. Then P, Q are disjoint. Let $P \cup Q = \{r_1 < r_2 < \dots < r_{p+q-2}\}$. Define the word $m = a_1 a_2 \cdots a_{p+q-2}$ on the alphabet {a < b} by the conditions $a_i = a$ if $r_i \in Q$ and $a_i = b$ if $r_i \in P$. Then m is the cutting word associated with q/p. Each cutting word may be obtained in this way.

Figure 2.6 Lower and upper Christoffel words and the cutting word

We say that m is obtained by superposition of two periodic phenomena. See [BLRS], Section I.6.5, for concrete examples of this.
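The superposition of Theorem 2.3.2 is easy to try out: merge the multiples of p and the multiples of q and read off which set each value comes from. The sketch below is illustrative only; `cutting_word` and `lower_christoffel` are hypothetical names, and the second function is included just to compare with the form amb of Theorem 2.3.1.

```python
def cutting_word(p, q):
    """Cutting word associated with q/p, built as in Theorem 2.3.2."""
    P = {i * p for i in range(1, q)}  # multiples of p, coded by b
    Q = {j * q for j in range(1, p)}  # multiples of q, coded by a
    assert P.isdisjoint(Q), "p and q must be relatively prime"
    return "".join("a" if r in Q else "b" for r in sorted(P | Q))


def lower_christoffel(p, q):
    """Lower Christoffel word of slope q/p (highest lattice path under the segment)."""
    x = y = 0
    word = []
    for _ in range(p + q):
        if (y + 1) * p <= q * x:
            y += 1
            word.append("b")
        else:
            x += 1
            word.append("a")
    return "".join(word)


m = cutting_word(7, 4)
print(m)                                           # abaabaaba
print(m == m[::-1])                                # True: m is a palindrome
print("a" + m + "b" == lower_christoffel(7, 4))    # True: the word is amb
```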

2.4 Standard Factorization

Christoffel words may be constructed recursively. The first step follows from the following result.

Theorem 2.4.1 (Borel and Laubie [BL], pp. 28–29) Each proper Christoffel word is uniquely the product of two Christoffel words.

This factorization of a proper Christoffel word is called its standard factorization. Note that, as the proof will show, the two elements in the standard factorization of a lower Christoffel word are lower Christoffel words. We say that (u, v) is a Christoffel pair if u, v, uv are all three lower Christoffel words.

Proof

1. It is enough to prove it for lower Christoffel words, since upper ones are the reversal of lower ones and since if a lower Christoffel word is the product of two Christoffel words, then they must be lower ones: indeed, with a < b, a proper Christoffel word is a lower one if and only if it begins with a, and also if and only if it ends with b.
2. Let q/p be the slope of the proper lower Christoffel word w. Let A, B be integral points such that B = A + (p, q). In the rectangle whose diagonal is the segment [A, B], choose M as in Corollary 2.1.1. Then the triangle AMB is primitive. There is no integral point on the open segment (AM), so that $\overrightarrow{AM} = (i, j)$ with i, j relatively prime. Similarly, $\overrightarrow{MB} = (k, l)$, with k, l relatively prime. Then the Christoffel word of slope q/p is the concatenation of the two Christoffel words of slopes j/i and l/k; see Figure 2.7, where w is represented by the red curve. It is the product of the two words represented by the red curves from A to M and from M to B.
3. Conversely, if a lower Christoffel word w is the product of two such words, w = uv say, then in the defining rectangle of w, with diagonal AB, we define on the path coded by w the point M, corresponding to the factorization w = uv. Then the triangle AMB is primitive; indeed, the paths corresponding to u and v are below the segments AM and MB respectively. Hence the triangle is contained in the polygon delimited by the path corresponding to w and the segment AB, and we know that this polygon has no interior integer points. The point M is unique according to Corollary 2.1.1. Thus the words u, v are unique. □

Figure 2.7 Standard factorization

In Figure 2.7, the point N is defined by the condition $\overrightarrow{AN} = \overrightarrow{MB}$. Then the parallelogram AMBN is primitive. An example of standard factorization is seen in Figure 2.3: the closest point is A + (2, 1), and therefore the standard factorization of the word is (aab)(aabaabab).

Corollary 2.4.2 (Borel and Laubie [BL], Proposition 1; see also Cohn [Co2], Section 8, Osborne and Zieschang [OZ], Lemma 2.2) The following conditions are equivalent for two lower Christoffel words u, v of commutative image (i, j) and (k, l) respectively:
(i) uv is a lower Christoffel word;
(ii) the determinant of the matrix $\begin{pmatrix} i & j \\ k & l \end{pmatrix}$ is 1;
(iii) the parallelogram constructed on the vectors (i, j) and (k, l) is primitive and the slope of (i, j) is smaller than the slope of (k, l).
In this case, w = uv is the standard factorization of w.

Proof (i) implies (ii): take the notations of the proof of Corollary 2.1.1 and of the third part of the proof of Theorem 2.4.1. We have $(i, j) = \overrightarrow{AM}$, $(k, l) = \overrightarrow{MB}$. Then, as in the proof of that corollary, $1 = \det(\overrightarrow{AM}, \overrightarrow{AB}) = \det(\overrightarrow{AM}, \overrightarrow{AM} + \overrightarrow{MB}) = \det(\overrightarrow{AM}, \overrightarrow{MB})$.
(ii) implies (iii): this follows from Section 2.1.
(iii) implies (i): Let A be the origin. Let $\overrightarrow{AM} = (i, j)$, $\overrightarrow{MB} = (k, l)$, $\overrightarrow{AN} = (k, l)$. The parallelogram is AMBN. Then the slope condition implies that M is below the segment AB. See Figure 2.7, where the discretization from below of the segment AM (resp. MB) is u (resp. v). Since the parallelogram is primitive, the discretization of AB is uv, which is therefore a lower Christoffel word.
The final assertion follows from Theorem 2.4.1. □
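The proof of Theorem 2.4.1 suggests a direct way to compute the standard factorization: follow the path of w and stop at the point M where det(AM, AB) = 1. The following sketch assumes its input really is a proper lower Christoffel word (it does not verify this), and the name `standard_factorization` is an illustrative choice.

```python
def standard_factorization(w):
    """Split a proper lower Christoffel word w into (u, v) at the point M of its
    path with det(AM, AB) = 1, where AB = (p, q) is the commutative image of w."""
    p, q = w.count("a"), w.count("b")
    x = y = 0
    for n, letter in enumerate(w[:-1], start=1):
        if letter == "a":
            x += 1
        else:
            y += 1
        # (x, y) is the end point of the prefix of length n.
        if x * q - y * p == 1:
            return w[:n], w[n:]
    raise ValueError("input is not a proper lower Christoffel word")


u, v = standard_factorization("aabaabaabab")
print(u, v)  # aab aabaabab

# Condition (ii) of Corollary 2.4.2 on the commutative images (i, j) and (k, l).
i, j, k, l = u.count("a"), u.count("b"), v.count("a"), v.count("b")
print(i * l - j * k)  # 1
```

By Corollary 2.1.1 the stopping point is unique, so the scan returns at most one split.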

Corollary 2.4.3 Let w = uv be a lower Christoffel word with its standard factorization.
(i) Then u(uv) and (uv)v are lower Christoffel words with the indicated standard factorization.
(ii) If w is of length at least 3, then either u is a proper prefix of v, in which case v = uv' is the standard factorization of v, or v is a proper suffix of u, in which case u = u'v is the standard factorization of u.

Proof Denote by (i, j) and (k, l) the commutative images of u, v respectively. By (ii) in the previous corollary, we have il − jk = 1.

(i) The commutative image of uv is (i + k, j + l). We have (i + k)l − (j + l)k = 1; by (i) in the same result, (uv)v is a lower Christoffel word, and this word has the indicated standard factorization. The proof for u(uv) is similar.

(ii) We assume now that the length of w is at least 3, that is, i + j + k + l ≥ 3. Then j + k > 0: otherwise, j = k = 0, 1 = il − jk = il, and i = l = 1. Then i + j + k + l = 2, a contradiction. It is enough by Theorem 2.4.1 to prove that either v = uv' with v' a lower Christoffel word, or that u = u'v with u' a lower Christoffel word.

a) Suppose first that i ≤ k. Consider the integral points A, M = A + (i, j), N = A + (k, l), B = A + (i + k, j + l) = M + (k, l) = N + (i, j), as in Figure 2.7 (M is below the diagonal AB since the slope j/i of $\overrightarrow{AM}$ is smaller than the slope l/k of $\overrightarrow{MB}$, because il − jk = 1). The parallelogram AMBN is primitive. Translate this parallelogram by the vector $\overrightarrow{AM}$, obtaining the parallelogram MHN'B; the condition i ≤ k implies that the x-coordinate of H = A + (2i, 2j) is not larger than that of B; see Figure 2.8. The two parallelograms are primitive, so that the only integral points in their union are their vertices A, M, H, N', B, N. In particular, there are no integral points in the interior of the triangle AHB. It follows that the discretization from below the segment AB is that of AM, followed by that of MH, and then by that of HB. Since $\overrightarrow{AM} = \overrightarrow{MH}$, the two first discretizations are equal, which means that u is a prefix of v; moreover, v = uv' where v' is the discretization of HB, hence v' is a lower Christoffel word.

Figure 2.8 u is a prefix of v

b) Suppose now that i > k. We have il − jk = 1, so that we must have l ≤ j: indeed, otherwise l > j, hence il ≥ (k + 1)(j + 1) = jk + j + k + 1 > jk + 1 (since j + k > 0, as verified at the beginning of (ii)), contradicting il − jk = 1. The involutive automorphism E of the free monoid {a, b}∗ that exchanges a and b exchanges lower and upper Christoffel words. Moreover, taking the reversal also exchanges lower and upper Christoffel words. Since w = uv, $E(\tilde{w})$ has, by the uniqueness in Theorem 2.4.1, the standard factorization $E(\tilde{v})E(\tilde{u})$. The commutative images of $E(\tilde{v})$ and $E(\tilde{u})$ are (l, k) and (j, i) respectively. Since l ≤ j, the previous argument shows that $E(\tilde{u}) = E(\tilde{v})m$ for some lower Christoffel word m. Hence $\tilde{u} = \tilde{v}E(m)$ and u = u'v, with $u' = E(\tilde{m})$, and the latter is a lower Christoffel word. □

2.5 The Tree of Christoffel Pairs

This tree, which is implicit in [BL], appears in [BdL], Figure 3, p. 200; it appears already in another form, in [Co2], Figure 1, p. 19.

It follows from Corollary 2.4.3 that if (u, v) is a Christoffel pair, then (u, uv) and (uv, v) are both Christoffel pairs. Moreover, (a, b) is clearly a Christoffel pair. Thus, if we construct an infinite binary tree with root (a, b) and with the rule described in Figure 2.9, then all nodes of this tree are Christoffel pairs. Conversely, each Christoffel pair appears on this tree: this follows from the second part of Corollary 2.4.3. The latter implies also that the Christoffel pairs on the tree are all distinct: indeed, a node (u, v), distinct from the root, is either a left child or a right child, but not both (indeed, the conditions 'u prefix of v', 'v suffix of u', are mutually exclusive), and one concludes recursively. Note that if (u, v) appears on the tree, then uv is a lower Christoffel word; moreover, each lower Christoffel word appears exactly once in the tree as the product of the two elements in the pair, by uniqueness of the standard factorization. The result below follows easily from the rule for constructing all Christoffel pairs.

Corollary 2.5.1 The set of pairs (|u|, |v|) for all Christoffel pairs (u, v) is equal to the set of pairs of relatively prime positive natural integers.

Observe that a proper lower Christoffel word w = uv with its standard factorization is such that at least one of u or v is not proper if and only if $w = a^n b$ or $w = ab^n$, n ≥ 1; its standard factorization is $(a, a^{n-1}b)$ or $(ab^{n-1}, b)$ respectively.

Figure 2.9 The rule for constructing the tree of Christoffel pairs: the node (u, v) has left child (u, uv) and right child (uv, v)
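The construction rule of the tree is short enough to run. This is a minimal sketch, assuming words are stored as Python strings and a node is addressed by its code word (a for 'left', b for 'right'); the names `children`, `node` and `level` are hypothetical.

```python
from math import gcd


def children(pair):
    """Left and right children of a Christoffel pair (u, v): (u, uv) and (uv, v)."""
    u, v = pair
    return (u, u + v), (u + v, v)


def node(code):
    """Christoffel pair reached from the root (a, b) by following the code word."""
    pair = ("a", "b")
    for step in code:
        left, right = children(pair)
        pair = left if step == "a" else right
    return pair


def level(depth):
    """All pairs at the given depth, from left to right."""
    nodes = [("a", "b")]
    for _ in range(depth):
        nodes = [child for pair in nodes for child in children(pair)]
    return nodes


print(node("abb"))  # ('aabab', 'ab')
print(level(2))     # [('a', 'aab'), ('aab', 'ab'), ('ab', 'abb'), ('abb', 'b')]
```

On small depths one can also observe Corollary 2.5.1: the lengths |u| and |v| of every generated pair are relatively prime.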


2.6 Sturmian Morphisms

Consider the totally ordered alphabet {a < b}. Recall that we denote by (u, v) any endomorphism f of the free monoid {a < b}∗ defined by f(a) = u, f(b) = v. We define now four endomorphisms as follows: G = (a, ab), D = (ba, b), $\tilde{G}$ = (a, ba), and $\tilde{D}$ = (ab, b).

We consider in the sequel of this book several infinite complete planar rooted binary trees. For each node in such a tree, there is a unique path from the root to this node; this path is coded by a word w ∈ {a < b}∗, with a meaning 'left' and b meaning 'right'. In this way we obtain a bijection between {a < b}∗ and the set of nodes of the tree, and we say that w codes the node. For example, in the tree of Figure 2.10 the node ($a^2bab$, ab) corresponds to the word abb. Denote by End({a, b}∗) the monoid of endomorphisms of the free monoid {a, b}∗.

Theorem 2.6.1 Let ρ : {a, b}∗ → End({a, b}∗) be the monoid homomorphism such that ρ(a) = G and ρ(b) = $\tilde{D}$. Let (u, v) be any node in the tree of Christoffel pairs, coded by the word x ∈ {a < b}∗. Then in the monoid of endomorphisms of the free monoid {a, b}∗, one has the equality (u, v) = ρ(x).

For example, looking at Figure 2.10, one has ($a^2bab$, ab) = ρ(abb) = $G \circ \tilde{D} \circ \tilde{D}$: indeed, $G \circ \tilde{D} \circ \tilde{D}(a) = G \circ \tilde{D}(ab) = G(abb) = aabab$ and $G \circ \tilde{D} \circ \tilde{D}(b) = G \circ \tilde{D}(b) = G(b) = ab$.

Proof This is clear for the root, that is, x = 1. The general case follows by induction. Suppose indeed that (u, v) = ρ(x). Then ρ(x)(a) = u, ρ(x)(ab) = uv, and ρ(x)(b) = v. Thus, we have ρ(xa)(a) = ρ(x) ◦ G(a) = ρ(x)(a) = u and ρ(xa)(b) = ρ(x) ◦ G(b) = ρ(x)(ab) = uv. On the other hand, ρ(xb)(a) = ρ(x) ◦ $\tilde{D}$(a) = ρ(x)(ab) = uv and ρ(xb)(b) = ρ(x) ◦ $\tilde{D}$(b) = ρ(x)(b) = v. In the first (resp. second) case, we have ρ(xa) = (u, uv) (resp. ρ(xb) = (uv, v)) in accordance with the construction rule of the tree; see Figure 2.9. □
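Theorem 2.6.1 can be checked mechanically on small code words. The sketch below represents an endomorphism (u, v) as a Python function on strings; the names `substitution`, `rho` and `node` are illustrative assumptions, and `D_TILDE` stands for the morphism written with a tilde over D in the text.

```python
def substitution(image_a, image_b):
    """The endomorphism (image_a, image_b) of the free monoid on {a, b}."""
    images = {"a": image_a, "b": image_b}

    def apply(word):
        return "".join(images[letter] for letter in word)

    return apply


G = substitution("a", "ab")
D_TILDE = substitution("ab", "b")


def rho(code):
    """rho(x1 ... xn) = rho(x1) o ... o rho(xn), with rho(a) = G and rho(b) = D_TILDE."""

    def apply(word):
        # In a composition the rightmost morphism acts first.
        for step in reversed(code):
            word = (G if step == "a" else D_TILDE)(word)
        return word

    return apply


def node(code):
    """Node of the tree of Christoffel pairs coded by the word code."""
    u, v = "a", "b"
    for step in code:
        u, v = (u, u + v) if step == "a" else (u + v, v)
    return u, v


f = rho("abb")
print(f("a"), f("b"))  # aabab ab
print(node("abb"))     # ('aabab', 'ab')
```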

Corollary 2.6.2 The image of any lower Christoffel word under the endomorphism G or $\tilde{D}$ is a lower Christoffel word.

Figure 2.10 The tree of Christoffel pairs

Proof This is clear for the lower Christoffel words of length 1, that is, a and b, since ab is a lower Christoffel word. Consider now a proper lower Christoffel word w = uv with its standard factorization. Then (u, v) appears on the tree of Christoffel pairs and, viewed as an endomorphism of the free monoid {a, b}∗, is equal to ρ(x) by Theorem 2.6.1 and its notations. In particular, ρ(x)(a) = u. Then ax codes some node (u', v') in the tree; in particular, u'v' is a lower Christoffel word. Hence by the theorem G(u) = G(ρ(x)(a)) = G ◦ ρ(x)(a) = ρ(a) ◦ ρ(x)(a) = ρ(ax)(a) = u' and similarly G(v) = v'. Thus G(w) = G(uv) = G(u)G(v) = u'v' is a lower Christoffel word. The proof that $\tilde{D}(w)$ is a lower Christoffel word is similar. □

Corollary 2.6.3 The image of any lower Christoffel word under the endomorphism $\tilde{G}$ or D is a conjugate of some lower Christoffel word.

Proof By Corollary 2.6.2, it is enough to prove that for each word w, D(w) and $\tilde{D}(w)$ are conjugate, and similarly for G and $\tilde{G}$. We do it only for D: recall that D = (ba, b) and $\tilde{D}$ = (ab, b). Thus, viewing them as endomorphisms of the free group F(a, b), we have $D = \iota \circ \tilde{D}$, where ι is the inner automorphism $g \mapsto bgb^{-1}$: indeed, this is readily verified on the generators a and b. Note that the image by $\tilde{D}$ of each nonempty word in the free monoid {a, b}∗ ends with b, so that we see that for each word w in {a, b}∗, one has $\tilde{D}(w) = ub$ and D(w) = bu. Thus D(w) and $\tilde{D}(w)$ are conjugate. □

A Sturmian morphism is an endomorphism of the free monoid {a, b}∗ which sends each lower Christoffel word onto the conjugate of a lower Christoffel word. Equivalently, it sends each conjugate of a lower Christoffel word onto the conjugate of a lower Christoffel word. Thus, we have shown that the endomorphisms G, D, $\tilde{G}$, $\tilde{D}$ are Sturmian morphisms. Sturmian morphisms will be given several characterizations in Section 16.6.
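The key step of the proof, that the image of w under the tilde morphism D is ub while D(w) = bu, can be tested exhaustively on short words. This is a sketch only; `substitution`, `words` and `are_conjugate` are hypothetical helper names, and the `_TILDE` suffix marks the morphisms written with a tilde in the text.

```python
from itertools import product


def substitution(image_a, image_b):
    """The endomorphism (image_a, image_b) of the free monoid on {a, b}."""
    images = {"a": image_a, "b": image_b}
    return lambda word: "".join(images[letter] for letter in word)


G, G_TILDE = substitution("a", "ab"), substitution("a", "ba")
D, D_TILDE = substitution("ba", "b"), substitution("ab", "b")


def words(max_length):
    """All nonempty words on {a, b} of length at most max_length."""
    for n in range(1, max_length + 1):
        for letters in product("ab", repeat=n):
            yield "".join(letters)


def are_conjugate(x, y):
    """True if y is a cyclic rotation of x."""
    return len(x) == len(y) and x in y + y


# The image under D_TILDE is u + "b", and D(w) is "b" + u, as in the proof.
print(all(D(w) == "b" + D_TILDE(w)[:-1] for w in words(8)))   # True
print(all(are_conjugate(G(w), G_TILDE(w)) for w in words(8)))  # True
print(D_TILDE("aab"), D("aab"))                                # ababb babab
```

The last line shows the lower Christoffel word aab being sent by the tilde morphism to the lower Christoffel word ababb, and by D to its conjugate babab.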







3 Markoff Numbers

The Markoff equation is a special Diophantine equation. A solution is called a Markoff triple. The main result in this chapter, Theorem 3.1.1, is a bijection between lower Christoffel words and Markoff triples.

3.1 Markoff Triples and Numbers

A Markoff triple is a multiset $\{x, y, z\}$ of positive integers satisfying the Markoff equation
$$x^2 + y^2 + z^2 = 3xyz.$$
Examples are $\{1, 1, 1\}$, $\{1, 1, 2\}$, and $\{1, 2, 5\}$. A Markoff triple is called proper if the three numbers are distinct; otherwise, we call it improper. A Markoff number is an element of a Markoff triple.

Consider the monoid homomorphism $\mu$ from the free monoid $\{a, b\}^*$ into $SL_2(\mathbb{Z})$ defined by
$$\mu(a) = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad \mu(b) = \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix}.$$
We assume that the alphabet $\{a, b\}$ is totally ordered: $a < b$. The matrix construction of the Markoff triples as shown in the following theorem was obtained by Cohn [Co1, Co2]. He used the Fricke relation (the last equation in Lemma 3.1.4), having noted the striking analogy between Markoff's equation and this relation. Uniqueness was noticed by Bombieri [B], Theorem 26, and independently by the author of [Re3], Theorem 1.

Theorem 3.1.1 The mapping sending each lower Christoffel word $w$ with standard factorization $uv$ onto the multiset
$$\left\{ \tfrac13 \mathrm{Tr}(\mu(u)),\ \tfrac13 \mathrm{Tr}(\mu(v)),\ \tfrac13 \mathrm{Tr}(\mu(w)) \right\}$$
is a bijection from the set of proper lower Christoffel words onto the set of proper Markoff triples.

From Christoffel Words to Markoff Numbers, Christophe Reutenauer. Oxford University Press (2019). © Christophe Reutenauer 2019. DOI: 10.1093/oso/9780198827542.001.0001

We give the proof after some preparatory lemmas. It is proved in the next lemma that for any lower Christoffel word $w$, $\tfrac13 \mathrm{Tr}(\mu(w)) = \mu(w)_{12}$. Thus the triple of the theorem is also equal to $\{\mu(u)_{12}, \mu(v)_{12}, \mu(w)_{12}\}$.

Lemma 3.1.2 (i) ([Re2], Lemme 3.2) If $M$ is a symmetric $2 \times 2$ matrix, and $N = \mu(a) M \mu(b)$, then $\tfrac13 \mathrm{Tr}(N) = N_{12}$. (ii) If $N = \mu(w)$ for some lower Christoffel word $w$, then $\tfrac13 \mathrm{Tr}(N) = N_{12}$.

Proof. One has
$$\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} p & q \\ q & s \end{pmatrix} \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix} = \begin{pmatrix} 2p + q & 2q + s \\ p + q & q + s \end{pmatrix} \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix} = \begin{pmatrix} 10p + 9q + 2s & 4p + 4q + s \\ 5p + 7q + 2s & 2p + 3q + s \end{pmatrix}.$$
The trace of the latter matrix is $12p + 12q + 3s$, which is three times its 12-entry $4p + 4q + s$. This proves the first assertion. If $w$ is a proper lower Christoffel word, then $w = amb$ for some palindrome $m$, by Theorem 2.3.1; then $N = \mu(w) = \mu(a)\mu(m)\mu(b)$ and $\mu(m)$ is symmetric, since $m$ is a palindrome and $\mu(a)$, $\mu(b)$ are symmetric matrices. This proves (ii) by (i) when $w$ is proper. If $w = a$ or $b$, one sees that (ii) holds by inspection. □

Lemma 3.1.3 The improper Markoff triples are $\{1, 1, 1\}$ and $\{1, 1, 2\}$.

Proof. Suppose that $\{x, x, y\}$ is a Markoff triple. Then $2x^2 + y^2 = 3x^2 y$, so that $x^2$ divides $y^2$, thus $x$ divides $y$: $y = xu$. Then $2x^2 + x^2u^2 = 3x^3u$, thus $2 + u^2 = 3xu$. Thus $u$ divides 2 and $u = 1$ or $2$. We have $\frac{2}{u} + u = 3x$; in both cases, $3x = 3$, $x = 1$, and thus $y = u$, and we obtain the triples $\{1, 1, 1\}$ and $\{1, 1, 2\}$. □

Lemma 3.1.4 (Fricke relations [Fr], (6) p. 91) Let $A$, $B$ be matrices in $SL_2(\mathbb{Z})$. Then
$$\mathrm{Tr}(A^2B) + \mathrm{Tr}(B) = \mathrm{Tr}(A)\,\mathrm{Tr}(AB), \qquad \mathrm{Tr}(AB^2) + \mathrm{Tr}(A) = \mathrm{Tr}(AB)\,\mathrm{Tr}(B),$$
and
$$\mathrm{Tr}(A)^2 + \mathrm{Tr}(B)^2 + \mathrm{Tr}(AB)^2 = \mathrm{Tr}(A)\,\mathrm{Tr}(B)\,\mathrm{Tr}(AB) + \mathrm{Tr}(ABA^{-1}B^{-1}) + 2.$$

Proof. As usual, we identify a scalar $\alpha$ with the scalar matrix $\alpha I_2$ of size $2 \times 2$. By the theorem of Cayley–Hamilton, we have $A^2 - \mathrm{Tr}(A)A + 1 = 0$, since $A$ has determinant 1. Thus $A^2 + 1 = \mathrm{Tr}(A)A$, hence $A^2B + B = \mathrm{Tr}(A)AB$ and, taking the trace, $\mathrm{Tr}(A^2B) + \mathrm{Tr}(B) = \mathrm{Tr}(A)\,\mathrm{Tr}(AB)$, proving the first relation. The second follows after exchanging $A$ and $B$. The Cayley–Hamilton theorem also implies that $A^{-1} = \mathrm{Tr}(A) - A$, so that $ABA^{-1}B^{-1} = AB(\mathrm{Tr}(A) - A)(\mathrm{Tr}(B) - B)$. Thus we have
$$ABA^{-1}B^{-1} = \mathrm{Tr}(A)\,\mathrm{Tr}(B)\,AB - \mathrm{Tr}(A)\,AB^2 - \mathrm{Tr}(B)\,ABA + ABAB. \tag{3.1}$$
Now, by the first two relations in the lemma, we have $\mathrm{Tr}(AB^2) = \mathrm{Tr}(AB)\,\mathrm{Tr}(B) - \mathrm{Tr}(A)$ and $\mathrm{Tr}(ABA) = \mathrm{Tr}(A^2B) = \mathrm{Tr}(A)\,\mathrm{Tr}(AB) - \mathrm{Tr}(B)$. Hence, taking the trace in Eq. (3.1), we obtain
$$\mathrm{Tr}(ABA^{-1}B^{-1}) = \mathrm{Tr}(A)\,\mathrm{Tr}(B)\,\mathrm{Tr}(AB) - \mathrm{Tr}(A)\,\mathrm{Tr}(AB)\,\mathrm{Tr}(B) + \mathrm{Tr}(A)^2 - \mathrm{Tr}(B)\,\mathrm{Tr}(A)\,\mathrm{Tr}(AB) + \mathrm{Tr}(B)^2 + \mathrm{Tr}(ABAB).$$
The last Fricke relation follows, by applying once again the theorem of Cayley–Hamilton for the matrix $M = AB$ under the form $\mathrm{Tr}(M^2) = \mathrm{Tr}(M)^2 - 2$. □

From the first relation, we deduce the following result, which will be used later.

Corollary 3.1.5 Let $A$, $B$ be matrices in $SL_2(\mathbb{Z})$ and $x, y, z \in \mathbb{Z}$. Then the two following conditions are equivalent:
(i) $x = \tfrac13 \mathrm{Tr}(A)$, $y = \tfrac13 \mathrm{Tr}(AB)$, $z = \tfrac13 \mathrm{Tr}(A^2B)$;
(ii) $x = \tfrac13 \mathrm{Tr}(A)$, $y = \tfrac13 \mathrm{Tr}(AB)$, $3xy - z = \tfrac13 \mathrm{Tr}(B)$.

Proof. If (i) holds, then $3xy - z = \tfrac13 \mathrm{Tr}(A)\,\mathrm{Tr}(AB) - \tfrac13 \mathrm{Tr}(A^2B)$. By the first Fricke relation in the lemma, this is equal to $\tfrac13 \mathrm{Tr}(B)$ and thus (ii) holds. The converse is similar and left to the reader. □
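The three Fricke relations of Lemma 3.1.4 are polynomial identities in the entries of $A$ and $B$, so they lend themselves to a numerical sanity check. The sketch below tests them on pseudo-random elements of $SL_2(\mathbb{Z})$ built as products of elementary matrices; the helper names and the choice of generators are assumptions made for illustration, not constructions from the text.

```python
import random


def mul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def tr(A):
    """Trace of a 2x2 matrix."""
    return A[0][0] + A[1][1]


def inv(A):
    """Inverse of a 2x2 matrix of determinant 1."""
    (a, b), (c, d) = A
    return ((d, -b), (-c, a))


# Elementary matrices of determinant 1 and their inverses (illustrative choice).
GENERATORS = [((1, 1), (0, 1)), ((1, 0), (1, 1)), ((1, -1), (0, 1)), ((1, 0), (-1, 1))]


def random_sl2z(rng, steps=8):
    """A pseudo-random element of SL2(Z), as a product of elementary matrices."""
    M = ((1, 0), (0, 1))
    for _ in range(steps):
        M = mul(M, rng.choice(GENERATORS))
    return M


def fricke_relations_hold(A, B):
    """Check the three relations of Lemma 3.1.4 for the pair (A, B)."""
    AB = mul(A, B)
    commutator = mul(mul(AB, inv(A)), inv(B))
    first = tr(mul(mul(A, A), B)) + tr(B) == tr(A) * tr(AB)
    second = tr(mul(AB, B)) + tr(A) == tr(AB) * tr(B)
    third = (
        tr(A) ** 2 + tr(B) ** 2 + tr(AB) ** 2
        == tr(A) * tr(B) * tr(AB) + tr(commutator) + 2
    )
    return first and second and third
```

For $A = \mu(a)$ and $B = \mu(b)$ the traces are 3, 6, and 15, and the third relation reads $9 + 36 + 225 = 270 + \mathrm{Tr}(ABA^{-1}B^{-1}) + 2$, which forces the commutator trace to be $-2$.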

The corollary has a symmetric statement, obtained after exchanging $A$ and $B$: the condition $x = \tfrac13 \mathrm{Tr}(B)$, $y = \tfrac13 \mathrm{Tr}(AB)$, $z = \tfrac13 \mathrm{Tr}(AB^2)$ is equivalent to the condition $x = \tfrac13 \mathrm{Tr}(B)$, $y = \tfrac13 \mathrm{Tr}(AB)$, $3xy - z = \tfrac13 \mathrm{Tr}(A)$.

The next lemma will allow us to construct recursively the Markoff triples in the proof of the theorem.

Lemma 3.1.6 (Markoff [M2] p. 397) Let $\{x, y, z\}$ be a proper Markoff triple with $x < y < z$. Then $\{x, y, 3xy - z\}$ is a Markoff triple and $3xy - z < y$.

Proof. Define a polynomial in $t$: $P(t) = t^2 - 3xyt + x^2 + y^2$. Then $z$ is a root of $P(t)$. The other root is $3xy - z = (x^2 + y^2)/z$. Hence this root is a positive integer and $\{x, y, 3xy - z\}$ is a Markoff triple. Now, we have $P(y) = y^2 - 3xy^2 + x^2 + y^2$; since $x \ge 1$, we have $3x \ge 3$, and because $y > x$ we deduce that $P(y) < 3y^2 - 3xy^2 \le 0$. Thus $y$ lies strictly between the two roots of $P$, and since the root $z$ is larger than $y$, the other one must be smaller, that is, $3xy - z < y$. □
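Lemma 3.1.6 gives a descent: replacing the largest element $z$ of a proper triple by $3xy - z$ yields a Markoff triple with a smaller sum. The following sketch iterates this step until an improper triple is reached; the function name `descend` and the error handling are illustrative choices.

```python
def descend(triple):
    """Apply z -> 3xy - z (z the maximum) until the triple is improper.

    Returns the list of sorted triples visited, starting with the input.
    """
    x, y, z = sorted(triple)
    if x * x + y * y + z * z != 3 * x * y * z:
        raise ValueError("not a Markoff triple")
    path = [(x, y, z)]
    while len({x, y, z}) == 3:  # the triple is proper
        # By Lemma 3.1.6 the new entry is a positive integer smaller than y.
        x, y, z = sorted((x, y, 3 * x * y - z))
        path.append((x, y, z))
    return path
```

For instance, starting from $\{5, 29, 433\}$, the successive new entries are $3 \cdot 5 \cdot 29 - 433 = 2$, then $3 \cdot 2 \cdot 5 - 29 = 1$, then $3 \cdot 1 \cdot 2 - 5 = 1$, ending at the improper triple $\{1, 1, 2\}$.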

Lemma 3.1.7 If $A$, $B$, $C$ are $d \times d$ square matrices over the positive integers, with $d \ge 2$, then for any $i, j = 1, \ldots, d$, one has $(AB)_{ij} > A_{ij}, B_{ij}$ and $(ABC)_{ij} > (AC)_{ij}, B_{ij}$.

Proof. We have $(AB)_{ij} = \sum_k A_{ik}B_{kj}$. By the assumption $d \ge 2$ and by positivity (each entry $\ge 1$), this is strictly larger than $A_{ij}$ and $B_{ij}$. Now,
$$(ABC)_{ij} = \sum_{k,l} A_{ik}B_{kl}C_{lj} > \sum_k A_{ik}B_{kk}C_{kj} \ge \sum_k A_{ik}C_{kj} = (AC)_{ij}.$$
Similarly, $(ABC)_{ij} > B_{ij}$. □

Corollary 3.1.8 If $w = uv$ is a lower Christoffel word with its standard factorization, then $\mu(w)_{12} > \mu(u)_{12}, \mu(v)_{12}$. If, moreover, $u$ is a prefix of $v$ (resp. $v$ is a suffix of $u$), then $\mu(u)_{12} < \mu(v)_{12}$ and $\mu(uuv)_{12} < \mu(uvv)_{12}$ (resp. $\mu(v)_{12} < \mu(u)_{12}$ and $\mu(uvv)_{12} < \mu(uuv)_{12}$). In particular, $\mu(uuv)_{12} \ne \mu(uvv)_{12}$.

Proof. Note that if $m$ is a nonempty word, the $2 \times 2$ matrix $\mu(m)$ is square and has positive integer entries. Thus we may apply the previous lemma.

We have $\mu(w) = \mu(u)\mu(v)$, so that the first assertion follows directly from the lemma. Now, if $u$ is a prefix of $v$, then it is a proper prefix (since $w$ is not a square) and we may write $v = uv'$, $v'$ nonempty. Thus the lemma implies that $\mu(u)_{12} < \mu(v)_{12}$; from the lemma also, we deduce that the 12-entry of $\mu(u)\mu(u)\mu(v)$ is smaller than that of $\mu(u)\mu(v)\mu(v) = \mu(u)\mu(u)\mu(v')\mu(v)$. The other case is treated similarly. The last assertion follows from Corollary 2.4.3 (ii) if $w \ne ab$, and by direct calculation if $w = ab$. □

The next lemma allows us to deduce the Markoff equation from the last Fricke relation.

Lemma 3.1.9 If $(u, v)$ is a Christoffel pair and $A = \mu(u)$, $B = \mu(v)$, then $\mathrm{Tr}(ABA^{-1}B^{-1}) = -2$.

Proof. We claim that for any Christoffel pair $(u, v)$, the element $uvu^{-1}v^{-1}$ of the free group $F(a, b)$ is conjugate to $aba^{-1}b^{-1}$. Using the construction of Christoffel pairs by their tree, it is easy to see that if the claim holds for $(u, v)$, then it also holds for $(u, uv)$ and $(uv, v)$. Indeed, $u(uv)u^{-1}(uv)^{-1} = u(uvu^{-1}v^{-1})u^{-1}$ and $(uv)v(uv)^{-1}v^{-1} = uvu^{-1}v^{-1}$. Finally, the claim is trivial on the root. The claim implies that $ABA^{-1}B^{-1}$ and $\mu(a)\mu(b)\mu(a)^{-1}\mu(b)^{-1}$ are conjugate matrices; thus $\mathrm{Tr}(ABA^{-1}B^{-1}) = \mathrm{Tr}(\mu(a)\mu(b)\mu(a)^{-1}\mu(b)^{-1})$. We have
$$\mu(a)\mu(b)\mu(a)^{-1}\mu(b)^{-1} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} 1 & -2 \\ -2 & 5 \end{pmatrix} = \begin{pmatrix} 12 & 5 \\ 7 & 3 \end{pmatrix} \begin{pmatrix} 3 & -7 \\ -5 & 12 \end{pmatrix} = \begin{pmatrix} 11 & -24 \\ 6 & -13 \end{pmatrix},$$
which concludes the proof. □

Proof of Theorem 3.1.1. Let $w$ be a lower Christoffel word with standard factorization $uv$ and let $(x, y, z) = (\tfrac13 \mathrm{Tr}(\mu(u)), \tfrac13 \mathrm{Tr}(\mu(v)), \tfrac13 \mathrm{Tr}(\mu(w)))$. By the Fricke relation in Lemma 3.1.4 and by Lemma 3.1.9, one has $(3x)^2 + (3y)^2 + (3z)^2 = (3x)(3y)(3z)$, hence $\{x, y, z\}$ is a Markoff triple, since by Lemma 3.1.2 (ii), $x, y, z$ are positive integers. Suppose that $w \ne ab$; then by Corollary 2.4.3 (ii) and Corollary 3.1.8, $x, y, z$ are distinct, hence $\{x, y, z\}$ is a proper Markoff triple. If $w = ab$, then, since $\mu(ab) = \begin{pmatrix} 12 & 5 \\ 7 & 3 \end{pmatrix}$, $\{x, y, z\} = \{1, 2, 5\}$ is a proper Markoff triple, too.
Thus, the mapping of the theorem is well defined and we prove now that it is surjective. Let $\{x, y, z\}$ be any proper Markoff triple. We may assume that $x < y < z$. Then by Lemma 3.1.6, $\{x, y, 3xy - z\}$ is a Markoff triple and $3xy - z < y$. The sum $x + y + (3xy - z)$ is then smaller than $x + y + z$ and we use induction on this sum. If $\{x, y, 3xy - z\}$ is an improper Markoff triple then, since $x$ and $3xy - z$ are both $< y$, it must be $\{1, 1, 2\}$ by Lemma 3.1.3; then we have $x = 1$, $3xy - z = 1$, $y = 2$, so that $z = 3 \cdot 2 - 1 = 5$ and the triple $\{x, y, z\}$ corresponds to the lower Christoffel word $ab$. If $\{x, y, 3xy - z\}$ is a proper Markoff triple, then by induction it is equal to $\{\tfrac13 \mathrm{Tr}(\mu(u)), \tfrac13 \mathrm{Tr}(\mu(v)), \tfrac13 \mathrm{Tr}(\mu(w))\}$ for some lower Christoffel word with standard factorization $w = uv$. Since the maximum is $y$, we have by Corollary 3.1.8 $y = \tfrac13 \mathrm{Tr}(\mu(w)) = \tfrac13 \mathrm{Tr}(\mu(uv))$. If $x = \tfrac13 \mathrm{Tr}(\mu(u))$ and $3xy - z = \tfrac13 \mathrm{Tr}(\mu(v))$, then by Corollary 3.1.5 $\{x, y, z\}$ corresponds to the lower Christoffel word $uuv$. Otherwise, $x = \tfrac13 \mathrm{Tr}(\mu(v))$ and $3xy - z = \tfrac13 \mathrm{Tr}(\mu(u))$ and symmetrically (see the statement after the proof of Corollary 3.1.5) $\{x, y, z\}$ corresponds to the lower Christoffel word $uvv$.

This proves surjectivity of the mapping and we now prove injectivity. Suppose that the proper Markoff triple $\{x < y < z\}$ corresponds to the two proper lower Christoffel words $w_1 = u_1v_1$ and $w_2 = u_2v_2$, with their standard factorizations. If $w_1 = ab$, then also $w_2 = ab$: otherwise, we have that $w_1 = ab$ is a proper factor of $w_2$ and therefore $\mu(w_2)_{12}$ is strictly larger than $\mu(w_1)_{12}$ by Lemma 3.1.7, and the triples corresponding to $w_1$ and $w_2$ are not equal since they do not have the same maximum. Thus, we may assume that $w_1, w_2 \ne ab$. By Corollary 2.4.3, we may assume either that $u_1$ is a proper prefix of $v_1$ and $u_2$ is a proper prefix of $v_2$, or that $u_1$ is a proper prefix of $v_1$ and $v_2$ is a proper suffix of $u_2$ (the two other cases are similar).

In the first case, we have by Corollary 2.4.3 the standard factorizations $v_i = u_iv_i'$, $i = 1, 2$. By the prefix condition and Corollary 3.1.8, we have $\mu(u_i)_{12} < \mu(v_i)_{12} < \mu(w_i)_{12}$; since the Markoff triple $\{x < y < z\}$ corresponds to $w_i = u_iv_i$, we have by Lemma 3.1.2 (ii) $x = \tfrac13 \mathrm{Tr}(\mu(u_i)) = \tfrac13 \mathrm{Tr}(A)$, $y = \tfrac13 \mathrm{Tr}(\mu(v_i)) = \tfrac13 \mathrm{Tr}(\mu(u_iv_i')) = \tfrac13 \mathrm{Tr}(AB)$, and $z = \tfrac13 \mathrm{Tr}(\mu(w_i)) = \tfrac13 \mathrm{Tr}(\mu(u_i^2v_i')) = \tfrac13 \mathrm{Tr}(A^2B)$ (if we set $A = \mu(u_i)$, $B = \mu(v_i')$). Thus (i) in Corollary 3.1.5 holds, and we deduce (ii): $\{x, y, 3xy - z\} = \{\tfrac13 \mathrm{Tr}(A), \tfrac13 \mathrm{Tr}(AB), \tfrac13 \mathrm{Tr}(B)\}$. The Markoff triple corresponding to $v_i = u_iv_i'$ is $t_i = \{\tfrac13 \mathrm{Tr}(\mu(u_i)), \tfrac13 \mathrm{Tr}(\mu(v_i')), \tfrac13 \mathrm{Tr}(\mu(v_i))\}$. Thus $t_i = \{x, y, 3xy - z\}$ and $t_1 = t_2$. By induction on the length of the words, we deduce that $v_1 = v_2$, that is, $u_1v_1' = u_2v_2'$. By uniqueness of the standard factorization, $u_1 = u_2$, which implies $w_1 = w_2$.

In the second case, one has $v_1 = u_1v_1'$ and $u_2 = u_2'v_2$. One obtains, similarly using Corollary 3.1.5 and its symmetric statement, that the triple $\{x, y, 3xy - z\}$ corresponds to both lower Christoffel words $v_1$ and $u_2$. Thus by induction $u_1v_1' = u_2'v_2$, and therefore, by uniqueness of the factorization into Christoffel words (Theorem 2.4.1), $u_1 = u_2'$, $v_1' = v_2$. Then we see that the triple $\{x, y, z\}$ corresponds to the words $w_1 = u_1v_1 = u_1^2v_1' = u_1^2v_2$ and $w_2 = u_2v_2 = u_2'v_2^2 = u_1v_2^2$. Hence the maximum of this triple is equal to both $\mu(u_1^2v_2)_{12}$ and $\mu(u_1v_2^2)_{12}$. This contradicts the last assertion in Corollary 3.1.8 applied to the Christoffel word $v_1 = u_1v_2$. □
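Theorem 3.1.1 can be explored numerically on the first levels of the tree of Christoffel pairs. The sketch below assumes the tree has root $(a, b)$ and that a node $(u, v)$ has children $(u, uv)$ and $(uv, v)$, as used in the proof of Lemma 3.1.9; it computes the triple $\{\mu(u)_{12}, \mu(v)_{12}, \mu(uv)_{12}\}$ for each pair. All function names are illustrative.

```python
# Images of the letters under mu, as defined in Section 3.1.
MU = {"a": ((2, 1), (1, 1)), "b": ((5, 2), (2, 1))}


def mul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def mu(word):
    """Image of a word under the monoid homomorphism mu."""
    M = ((1, 0), (0, 1))
    for letter in word:
        M = mul(M, MU[letter])
    return M


def christoffel_pairs(depth):
    """All pairs (u, v) of the tree down to the given depth, root (a, b)."""
    level = [("a", "b")]
    pairs = list(level)
    for _ in range(depth):
        level = [child for (u, v) in level for child in ((u, u + v), (u + v, v))]
        pairs.extend(level)
    return pairs


def markoff_triple(u, v):
    """The triple (mu(u)_12, mu(v)_12, mu(uv)_12) attached to the pair (u, v)."""
    entries = []
    for word in (u, v, u + v):
        M = mu(word)
        # Lemma 3.1.2 (ii): the trace is three times the 12-entry.
        assert M[0][0] + M[1][1] == 3 * M[0][1]
        entries.append(M[0][1])
    return tuple(entries)
```

On the 127 pairs of depth at most 6, every triple satisfies the Markoff equation and no triple is repeated, in line with the bijection of the theorem; this is only a finite check, not a proof.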

3.2 The Tree of Markoff Triples

Since all Christoffel pairs appear exactly once on the tree of Christoffel pairs (see Section 2.5), it follows by Theorem 3.1.1 that if we replace in the previous tree each node $(u, v)$ by the triple
$$\left( \tfrac13 \mathrm{Tr}(\mu(u)),\ \tfrac13 \mathrm{Tr}(\mu(uv)),\ \tfrac13 \mathrm{Tr}(\mu(v)) \right) \tag{3.2}$$
(also equal to $(\mu(u)_{12}, \mu(uv)_{12}, \mu(v)_{12})$ by Lemma 3.1.2), we obtain a tree where each proper Markoff triple appears exactly once as a node, in the form of a triple whose underlying set is this triple. This tree is called the tree of Markoff triples; see Figure 3.1.

Figure 3.1 The tree of Markoff triples. Nodes shown, level by level from the root (within a level, each node of the previous level is replaced by its left child followed by its right child):
(1, 5, 2)
(1, 13, 5), (5, 29, 2)
(1, 34, 13), (13, 194, 5), (5, 433, 29), (29, 169, 2)
(1, 89, 34), (34, 1325, 13), (13, 7561, 194), (194, 2897, 5), (5, 6466, 433), (433, 37666, 29), (29, 14701, 169), (169, 985, 2)
(1, 233, 89), (89, 9077, 34), ..., (169, 499393, 985), (985, 5741, 2)

Figure 3.2 The rule for building the tree of Markoff triples: a node $(x, z, y)$ has left child $(x, 3xz - y, z) = \left(x, \frac{x^2 + z^2}{y}, z\right)$ and right child $(z, 3zy - x, y) = \left(z, \frac{y^2 + z^2}{x}, y\right)$.

It follows also from Eq. (3.2) and Corollary 3.1.8 that in each triple $(x, z, y)$ appearing in this tree, the maximum of the three numbers is $z$; however, the minimum may be $x$ or $y$. This tree may be constructed directly, starting from the root $(1, 5, 2)$ and applying the rule given in Figure 3.2. This follows from the Fricke relations. Suppose indeed that $(x, z, y) = (\tfrac13 \mathrm{Tr}(\mu(u)), \tfrac13 \mathrm{Tr}(\mu(uv)), \tfrac13 \mathrm{Tr}(\mu(v)))$. Then
$$3xz - y = 3 \cdot \tfrac13 \mathrm{Tr}(\mu(u)) \cdot \tfrac13 \mathrm{Tr}(\mu(uv)) - \tfrac13 \mathrm{Tr}(\mu(v)) = \tfrac13 \mathrm{Tr}(\mu(u^2v))$$
by Lemma 3.1.4, so that by the rule for building the tree of Christoffel pairs (see Figure 2.9), the left child in Figure 3.2 is valid; the proof is similar for the right child.

Corollary 3.2.1 The three numbers in a Markoff triple are pairwise relatively prime.

Proof. It is enough to show that the three numbers are relatively prime (since by the Markoff equation, if a prime number divides two of them, it divides the third). This is clear for the improper triples and for the triple $(1, 5, 2)$, the root of the tree of Markoff triples. If $(x, z, y)$ is some node of this tree with $x, y, z$ relatively prime, then clearly $x$, $3xz - y$, $z$ are relatively prime, and so are also $z$, $3zy - x$, $y$. By the construction of the tree, the result follows. □
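The rule of Figure 3.2 translates directly into a level-by-level construction of the tree of Markoff triples. The following is a minimal sketch; the names `children` and `levels` are illustrative.

```python
def children(node):
    """Left and right children of a node (x, z, y), following Figure 3.2."""
    x, z, y = node
    return (x, 3 * x * z - y, z), (z, 3 * z * y - x, y)


def levels(depth):
    """Levels 0..depth of the tree of Markoff triples, root (1, 5, 2)."""
    level = [(1, 5, 2)]
    result = [level]
    for _ in range(depth):
        level = [child for node in level for child in children(node)]
        result.append(level)
    return result
```

The nodes produced at depth 3 coincide with the eight triples $(1, 89, 34), \ldots, (169, 985, 2)$ listed in Figure 3.1, and in every node the middle entry is the maximum, as stated after Eq. (3.2).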

3.3 The Markoff Injectivity Conjecture

It follows from Theorem 3.1.1 that for each Markoff number $m$, there exists a lower Christoffel word $w$ such that $m = \tfrac13 \mathrm{Tr}(\mu(w)) = \mu(w)_{12}$. We say that $m$ is the Markoff number associated with the Christoffel word $w$. In this case, $|w|_a$, $|w|_b$ are called the Frobenius coordinates of $m$. Denote $m$ by $m_w$.

It is conjectured that the mapping $w \mapsto m_w$, which is surjective by the theorem, is also injective. This is called the Markoff injectivity conjecture. It has many equivalent formulations (see the book by Martin Aigner [A] where one also finds the solution of several particular cases of the conjecture). The conjecture is attributed to Frobenius. Indeed, it is mentioned twice as an open problem in Frobenius’s article in 1913 [F], under equivalent formulations: on p. 601, Frobenius writes ‘Unentschieden ist bis jetzt die Frage, ob einer Markoffschen Zahl p zwei verschiedene Ketten entsprechen können, d.h. ob p in zwei Lösungen p, a, b und p, c, d die grösste Zahl sein kann’ and on p. 614, ‘Es ist aber nicht ausgeschlossen, dass zwei verschiedenen Brüchen ρ = χχ und σ = λλ dieselbe Zahl pρ = pσ entspricht’.¹

A problem induced by the injectivity conjecture is the following: if the conjecture is true, that is, if there is a bijection between lower Christoffel words and Markoff numbers, determine the order of lower Christoffel words induced by the natural order of Markoff numbers. Equivalently, since proper lower Christoffel words are naturally in bijection with $\mathbb{Q}^*_+$ (via their slope), determine the order of the latter set induced by the bijection between Markoff numbers and this set. Partial answers are provided in Aigner’s book [A], 10.2 and [RS].

¹ “It is not known until now if a Markoff number p may correspond to two distinct chains, that is, may be the largest element of two solutions p, a, b and p, c, d” (p. 601); “it is not excluded that to two distinct fractions ρ = χχ and σ = λλ correspond the same number pρ = pσ” (p. 614).







4 The Markoff Property

The Markoff property, introduced by Markoff, is a certain combinatorial property of bi-infinite words on a binary alphabet $\{a, b\}$. Such a word has this property if whenever there is a factor $xy$ in the word, with $\{x, y\} = \{a, b\}$, then it may be extended into a factor of the form $y\tilde{m}xymx$, where the length of $m$ is bounded. Markoff then proves that the word must be periodic; it turns out that the periodic pattern is equal to some Christoffel word (Theorem 4.2.1). We introduce here a similar property for (one-sided) infinite words, a variant of Markoff's property, and prove that such a word must be ultimately periodic, with a similar periodic pattern (Theorem 4.1.6); we may bound the length of the nonperiodic initial part of the word, so that the first theorem may be deduced from the second.

4.1 Markoff Property for Infinite Words

We say that an infinite word $s$ on the alphabet $\{a, b\}$ satisfies the Markoff property if there exist $K$ and $N$ such that for any factorization $s = \tilde{u}xyv$ with $\{x, y\} = \{a, b\}$ and $|u| \ge K$, one has $u = myu'$, $v = mxv'$, and $|m| \le N$. Note that $u$ is a finite word and $v$ an infinite one. For example, $s = a(ab)^\infty = aabababab\cdots$ satisfies the Markoff property with $K = 2$ and $N = 0$; it is easily verified that $s$ does not satisfy the Markoff property with $K = 1$. Another example is $(aab)^\infty = aabaabaab\cdots$ with $K = 2$ and $N = 1$. The numbers $N$, $K$ above are natural integers. We extend the definition to $N = -1$: we say that $s$ satisfies the Markoff property for $K$ and $N = -1$ if there is no factorization $s = \tilde{u}xyv$ with $|u| \ge K$; in this case, $s$ is ultimately constant: $s = ux^\infty$, $x = a$ or $b$, and $|u| = K$.

We consider in the whole section an infinite word $s$, on the alphabet $\{a, b\}$, that satisfies the Markoff property with the parameters $K$ and $N$.

Lemma 4.1.1 Let $s = ws_1$ for some finite word $w$, such that $|w| = K - 1$. Then no word of the form $x\tilde{m}xymy$, $\{x, y\} = \{a, b\}$, is a factor of $s_1$.

Proof. Suppose that $s_1 = hx\tilde{m}xymy\cdots$. Then $whx\tilde{m}xymy\cdots = ws_1 = s = \tilde{u}xyv$ with $\tilde{u} = whx\tilde{m}$ and $v = my\cdots$. Then $|u| \ge |w| + 1 = K$, so that we obtain from the Markoff property: $u = nyu'$, $v = nxv'$. Thus $mx\tilde{h}\tilde{w} = u = nyu'$ and $my\cdots = v = nxv'$. Then we cannot have $m = n$, since $x \ne y$. If $m$ is shorter than $n$, then $mx$, $my$ are both prefixes of $n$, a contradiction; similarly if $n$ is shorter than $m$. This proves the lemma by contradiction. □

Lemma 4.1.2 If $t$ is an infinite word on the alphabet $\{a, b\}$ such that none of the words $x(xy)^ixy(yx)^iy$, $xy(xy)^jxy(yx)^jy^2$, $\{x, y\} = \{a, b\}$, $i, j \in \mathbb{N}$, is a factor of $t$, then at most one of the words $aa$ or $bb$ is a factor of $t$.

Proof. By contradiction, write $t = hx^2ky^2t_1$, $\{x, y\} = \{a, b\}$. We may assume that $k$ is of minimum length. Then this length is nonzero, otherwise $x^2y^2$ is a factor of $t$, contradicting the first condition (with $i = 0$). By minimality of the length of $k$, we have $k = (yx)^i$, $i \ge 1$. Note that $t_1$ must begin with $x$: otherwise, $t$ contains the factor $x^2(yx)^iy^3$, hence also the factor $xyxy^3$, contradicting the second condition (with $j = 0$). Thus we may write $yt_1 = (yx)^jt_2$, for some maximum positive integer $j$, or $j$ infinite. Thus $t = hx^2(yx)^iy(yx)^jt_2 = hx(xy)^ixy(yx)^jt_2$. We must have $j \le i$: otherwise, $j > i$ and $t$ contains as a factor the word $x(xy)^ixy(yx)^iy$, a contradiction. In particular, $j$ is not infinite. Suppose that $t_2$ begins with $x$: then $t$ contains $y(yx)^jx = y^2(xy)^{j-1}x^2$, and by minimality of $|k|$ we must have $j - 1 \ge i$, that is, $j > i$, a contradiction. If $t_2$ begins with $yx$, this contradicts the maximality of $j$. Suppose that $t_2$ begins with $y^2$. Then $t$ contains $x(xy)^ixy(yx)^jy^2$. If $j = i$, then $t$ contains $x(xy)^ixy(yx)^iy$, a contradiction; if, on the other hand, $j = 1, \ldots, i - 1$, then $t$ contains $xy(xy)^jxy(yx)^jy^2$, a contradiction. Thus there is a contradiction in all cases, which proves the lemma. □
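The definition of the Markoff property at the beginning of this section can be tested on a finite prefix of an infinite word. In a factorization $s = \tilde{u}xyv$ with $u = myu'$, $v = mxv'$ and $x \ne y$, the word $m$ is necessarily the longest common prefix of $u$ and $v$, which gives a direct test. The sketch below is a hypothetical helper (`markoff_violation` is not a name from the book); it only inspects positions for which the finite prefix is long enough to decide.

```python
def markoff_violation(prefix, K, N):
    """Return the first |u| >= K at which the property fails, else None.

    `prefix` is a finite prefix of the infinite word s; positions too close
    to its end, where the test cannot be decided, are not examined.
    """
    for i in range(K, len(prefix) - 1):
        x, y = prefix[i], prefix[i + 1]
        if x == y:
            continue
        u = prefix[:i][::-1]   # u is the reversal of the part before xy
        v = prefix[i + 2:]     # known part of the infinite word v
        # m must be the longest common prefix of u and v.
        length = 0
        while length < len(u) and length < len(v) and u[length] == v[length]:
            length += 1
        if length >= len(u) or length > N:
            return i           # u is exhausted, or m is too long
        if length >= len(v):
            break              # ran out of the finite prefix: undecided
        if u[length] != y or v[length] != x:
            return i
    return None
```

On a prefix of $a(ab)^\infty$ the test passes for $K = 2$, $N = 0$ and fails at $|u| = 1$ when $K = 1$, in agreement with the examples given after the definition.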

Recall that $G$ is the endomorphism of the free monoid $\{a, b\}^*$, sending $a$ onto $a$ and $b$ onto $ab$.

Lemma 4.1.3 If $s = G(t)$ and $N \ge 0$, then $t$ satisfies the Markoff property for the parameters $K - 1$ and $N - 1$.

Lemma 4.1.4 For any $v \in \{a, b\}^*$,
$$\widetilde{G(va)} = \widetilde{G(v)a} = a\widetilde{G(v)} = G(\tilde{v})a. \tag{4.1}$$
In particular, $G(va)$ is a palindrome if and only if $v$ is a palindrome.

Proof. The first two equalities are evident. We prove the third by induction on the length. If $v = 1$, it is evident. Suppose that $v = aw$. Then $a\widetilde{G(v)} = a\widetilde{G(a)G(w)} = a\widetilde{G(w)}a = G(\tilde{w})aa$ (by induction) $= G(\tilde{w})G(a)a = G(\tilde{w}a)a = G(\widetilde{aw})a = G(\tilde{v})a$. Suppose now that $v = bw$. Then $a\widetilde{G(v)} = a\widetilde{G(b)G(w)} = a\widetilde{G(w)}\widetilde{G(b)} = a\widetilde{G(w)}ba = G(\tilde{w})aba$ (by induction) $= G(\tilde{w})G(b)a = G(\tilde{w}b)a = G(\widetilde{bw})a = G(\tilde{v})a$.

For the last assertion: $G(va)$ is a palindrome if and only if $\widetilde{G(va)} = G(va)$; this is equivalent to $G(\tilde{v})a = G(va)$, which in turn is equivalent to $G(\tilde{v}) = G(v)$, and finally, $G$ being injective, to $\tilde{v} = v$; that is, $v$ is a palindrome. □
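The identities of Lemma 4.1.4 are easy to confirm by exhaustive search over short words. The sketch below assumes only the definition $G(a) = a$, $G(b) = ab$ recalled above; the function names are illustrative.

```python
from itertools import product


def G(word):
    """The morphism G: a -> a, b -> ab."""
    return "".join("a" if letter == "a" else "ab" for letter in word)


def identity_holds(v):
    """Check reverse(G(va)) = a.reverse(G(v)) = G(reverse(v)).a, as in Eq. (4.1)."""
    return G(v + "a")[::-1] == "a" + G(v)[::-1] == G(v[::-1]) + "a"


def palindrome_equivalence_holds(v):
    """Check that G(va) is a palindrome exactly when v is a palindrome."""
    image = G(v + "a")
    return (image == image[::-1]) == (v == v[::-1])


def words(max_length):
    """All words over {a, b} of length at most max_length, including the empty word."""
    for n in range(max_length + 1):
        for letters in product("ab", repeat=n):
            yield "".join(letters)
```

For $v = ab$, for example, $G(va) = aaba$, whose reversal $abaa$ equals $G(ba)a$.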

Proof of Lemma 4.1.3. If $t$ has no factorization $t = \tilde{u}abv$ nor $t = \tilde{u}bav$ with $|u| \ge K - 1$, then clearly $t$ satisfies the property for $K - 1$ and $N - 1$. Otherwise, $t$ has factorizations of this type. Suppose that $t = \tilde{u}abv$ with $|u| \ge K - 1$. Then by Eq. (4.1), $s = G(t) = G(\tilde{u})aabG(v) = \widetilde{(G(u)a)}\,abG(v)$ and $|G(u)a| \ge K$. By the Markoff property for $s$, one has $G(u)a = nbu'$, $G(v) = nav'$, $|n| \le N$. Then $u'$ cannot be empty, so that $u' = u''a$ and $nbu'' = G(u)$ is in the image of $G$; hence $n$ must end by $a$, $n = n'a$, and $n'$ is in the image of $G$, $n' = G(m)$, $n = G(m)a$. Then $G(u) = nbu'' = G(m)abu''$, hence $u = mbu_1$. Moreover, $G(v) = G(m)aav'$ and therefore $v = mav_1$. Finally, $|m| \le |G(m)| = |n'| = |n| - 1 < |n| \le N$. If $t = \tilde{u}bav$, the proof is similar and left to the reader (hint: consider the equality $G(v) = nbv'$, which implies that $n = n'a$). □

Recall that the endomorphism D of the free monoid {a, b}∗ is defined by D(a) = ba and D(b) = b. Corollary 4.1.5 If s = G(t) or s = D(t), and if N ≥ 0, then t satisfies the Markoff property with the parameters K − 1 and N − 1. Proof If s = G(t), it is Lemma 4.1.3. If s = D(t), it follows from the lemma by exchanging a and b. 2

Theorem 4.1.6 If an infinite word $s$ on the alphabet $\{a, b\}$ satisfies the Markoff property for the parameters $K$ and $N$, then $s = uw^\infty$ for some finite word $u$ and some lower Christoffel word $w$. The length of $u$ is bounded by a function depending only on $K$ and $N$.

Proof. We claim that there exists a function $f(K, N)$, defined for integers $K \ge 0$ and $N \ge -1$, such that $f(K, -1) \ge K$ and if $N \ge 0$, then $K + 2f(N, N - 1) + 1 = f(K, N)$. Indeed, one may take $f(K, N) = 2^{N+3} + K - 2N - 5$. Then one indeed has $f(K, -1) = 2^2 + K + 2 - 5 = K + 1 \ge K$, and for $N \ge 0$, $K + 2f(N, N - 1) + 1 = K + 2(2^{N+2} + N - 2N + 2 - 5) + 1 = K + 2^{N+3} - 2N - 6 + 1 = 2^{N+3} + K - 2N - 5 = f(K, N)$.

We prove the theorem by induction on $N$ and show that $|u|$ is bounded by $f(K, N)$. If $N = -1$, then $s = ux^\infty$ with $|u| = K \le f(K, -1)$, $x = a$ or $b$, and the result is true. Suppose now that $N \ge 0$. Let $s = ws_1$ with $|w| = K - 1$; then by Lemma 4.1.1, no word of the form $x\tilde{m}xymy$ is a factor of $s_1$. It follows that $s_1$ satisfies the hypothesis of Lemma 4.1.2, so that $aa$ and $bb$ are not simultaneously factors of $s_1$. We claim that $s_1$ satisfies the Markoff property with parameters $K_1 = N + 1$ and $N$: indeed, if $s_1 = \tilde{u}xyv$, $|u| \ge N + 1$, then $s = w\tilde{u}xyv$, $|w\tilde{u}| \ge K + N \ge K$, and therefore $u\tilde{w} = myu'$, $v = mxv'$, $|m| \le N$; by the length condition on $u$, we deduce that $u = myu''$, which proves the claim. Note that if we take a longer word $w$ with $s = ws_1$, the previous arguments still apply.

Suppose that $bb$ is not a factor of $s_1$. Then $s_1$ begins by $a$ or by $ba$, and we may assume, by taking a word $w$ longer by one letter if necessary, that $s_1$ begins by $a$. Then, since $s_1$ does not contain $bb$, it follows that $s_1$ is an infinite product of the words $a$ and $ab$; in other words, $s_1 = G(t)$ for some infinite word $t$. Then by Corollary 4.1.5, $t$ satisfies the Markoff property with parameters $N$ and $N - 1$. It follows by induction that $t = uw^\infty$ for some finite word $u$ satisfying $|u| \le f(N, N - 1)$, and some lower Christoffel word $w$. Then $s = ws_1 = wG(u)G(w)^\infty$. Moreover, by Corollary 2.6.2, $G(w)$ is a lower Christoffel word. The length of $wG(u)$ is bounded by $K + 2|u| \le K + 2f(N, N - 1) \le f(K, N)$, which ends the proof in this case.

If $aa$ is not a factor of $s_1$, then, symmetrically, we may assume that $s_1 = D(t)$. Then, similarly, $t = uw^\infty$ for some finite word $u$ satisfying $|u| \le f(N, N - 1)$, and some lower Christoffel word $w$. Thus $s = ws_1 = wD(u)D(w)^\infty$. We know by Corollary 2.6.2 that $\tilde{D}(w)$ is a lower Christoffel word; moreover, we have seen in the proof of Corollary 2.6.3 that $\tilde{D}(w) = vb$ for some word $v$ and that then $D(w) = bv$. Thus $s = wD(u)(bv)^\infty = wD(u)b(vb)^\infty$. We have finally $|wD(u)b| \le K + 2f(N, N - 1) + 1 = f(K, N)$. □

The theorem has a converse: if $s = uw^\infty$ for some lower Christoffel word $w$, then $s$ satisfies the Markoff property. This may be proved directly, and can also be deduced from the converse of Theorem 4.2.1; it is also a consequence of Theorem 6.3.1 and Theorem 8.2.1.
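The arithmetic behind the bound $f(K, N) = 2^{N+3} + K - 2N - 5$ used in the proof of Theorem 4.1.6 can be confirmed on a range of parameters. The snippet below is a small illustrative check of the two properties claimed for $f$; the helper `recurrence_holds` is a name introduced here.

```python
def f(K, N):
    """The bound f(K, N) = 2^(N+3) + K - 2N - 5, for K >= 0 and N >= -1."""
    return 2 ** (N + 3) + K - 2 * N - 5


def recurrence_holds(K, N):
    """Check K + 2 f(N, N - 1) + 1 = f(K, N), required for N >= 0."""
    return K + 2 * f(N, N - 1) + 1 == f(K, N)
```

For example, $f(2, 0) = 8 + 2 - 5 = 5$ and $f(2, 1) = 16 + 2 - 2 - 5 = 11$, so the bound on the length of the nonperiodic part grows exponentially with $N$ but only linearly with $K$.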

4.2 Markoff Property for Bi-infinite Words

We say that a bi-infinite word $s$ on the alphabet $\{a, b\}$ satisfies the Markoff property if there exists $N$ such that for any factorization $s = \tilde{u}xyv$ with $\{x, y\} = \{a, b\}$, one has $u = myu'$, $v = mxv'$, and $|m| \le N$. Note that $u$, $v$, $u'$, $v'$ are right infinite words and that $\tilde{u}$ is a left infinite word, the reversal of $u$, defined in the obvious way.

Theorem 4.2.1 (Markoff [M1, M2]) If a bi-infinite word on the alphabet $\{a, b\}$ satisfies the Markoff property, then it is equal up to a shift to ${}^\infty w^\infty$ for some lower Christoffel word $w$.

Lemma 4.2.2 If $u$, $u'$ are words and $w$, $w'$ are lower Christoffel words on the alphabet $\{a < b\}$ such that $uw^\infty = u'w'^\infty$, then $w = w'$ and either $u = u'w^i$ or $u' = uw^i$ for some natural number $i$.

The slope of any word $w \in \{a < b\}^*$ is by definition the quotient $\frac{|w|_b}{|w|_a}$, which is a non-negative rational number or $\infty$. This is consistent with the definition of the slope of a Christoffel word.

Proof. Let $p_n$ be the prefix of length $n$ of $uw^\infty$; the slope of $w$ is equal to the limit when $n$ tends to infinity of the slope of $p_n$. It follows, therefore, from the hypothesis that $w$ and $w'$ have the same slope. Thus $w = w'$. The hypothesis implies that for some factorization $w = fg$, one has $w = gf$. But this is possible if and only if $f = 1$ or $g = 1$, since the words $C^k(w)$, $k = 0, \ldots, |w| - 1$, are all distinct (see Section 2.2). Thus the lemma follows. □

Proof of Theorem 4.2.1. Let $s$ be a bi-infinite word having the Markoff property. Let $s_n$ denote the infinite sequence obtained by truncating $s$ at the left of its term of rank $n$. Then $s_n$ is an infinite word satisfying the Markoff property for the parameters $K = N + 1$ and $N$, by the same length argument as in the claim in the proof of Theorem 4.1.6 (we leave this verification to the reader). It follows from Theorem 4.1.6 that $s_n = u_nw_n^\infty$ for some lower Christoffel word $w_n$ and some word $u_n$ whose length is bounded by a constant independent of $n$. It then follows from the previous lemma that $w_n = w$ is independent of $n$ and that $s = {}^\infty w^\infty$ up to a shift. □

This theorem has a converse, which may be proved directly (see [Re1], Theorem 6.1); it is also a consequence of Theorem 6.2.1 and of Lemma 9.3.2. It is interesting to note, as proved by Glen, Lauve, and Saliola [GLS], Theorem 1, that the words m that occur in the definition of the Markoff property of bi-infinite words are exactly the central words of de Luca, studied in the second part of this book; this follows from Theorem 15.5.4 (iii).







5 Continued Fractions

The theory of continued fractions appears in many textbooks; see, for example, [HW], Chapter X, [A], Chapter 1, [B], Appendix A. We thus do not prove the most basic results of this theory. We define in the first two sections the finite and infinite continued fractions and their properties. In Section 3, we state the periodicity results on continued fractions. In Section 4, we give a result of Legendre and the approximation results for real numbers. The Lagrange number, which is central for the theory of Markoff, is introduced in Section 5. In the last section we introduce the alternating lexicographical order on finite and infinite words; it corresponds to the natural order on real numbers and will be used often in the book.

5.1 Finite Continued Fractions

A finite continued fraction is an expression of the form
$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}}, \tag{5.1}$$

where $n \in \mathbb{N}$, $a_0 \in \mathbb{Z}$, $a_1, \ldots, a_n \in \mathbb{P}$. This expression is written $[a_0, a_1, a_2, \ldots, a_n]$. It represents a well-defined rational number, since by positivity there is no division by zero. It is well known that each rational number has such an expression. Moreover, a given rational number has exactly two such expressions: if one of them is as previously and $a_n > 1$ or $n = 0$, then the other is $[a_0, a_1, a_2, \ldots, a_{n-1}, a_n - 1, 1]$; if, on the contrary, $a_n = 1$ and $n > 0$, then the other is $[a_0, a_1, a_2, \ldots, a_{n-1} + 1]$. For example, $[-2] = [-3, 1]$, $[1, 1, 2] = [1, 1, 1, 1]$, $[2, 2] = [2, 1, 1]$.

Define integers $p_{-1}, p_0, \ldots, p_n$ and $q_{-1}, q_0, \ldots, q_n$ recursively by $p_{-1} = 1$, $p_0 = a_0$, $q_{-1} = 0$, $q_0 = 1$, $p_i = a_ip_{i-1} + p_{i-2}$, and $q_i = a_iq_{i-1} + q_{i-2}$ for $i = 1, \ldots, n$. Then
$$[a_0, \ldots, a_n] = \frac{p_n}{q_n}.$$
We have also, for $n \ge 1$, $p_{n-1}/q_{n-1} = [a_0, a_1, \ldots, a_{n-1}]$. Define $P(a) = \begin{pmatrix} a & 1 \\ 1 & 0 \end{pmatrix}$. One has for $n \ge 0$
$$\begin{pmatrix} p_n & p_{n-1} \\ q_n & q_{n-1} \end{pmatrix} = P(a_0)P(a_1) \cdots P(a_n). \tag{5.2}$$
It follows from this equation that
$$p_nq_{n-1} - q_np_{n-1} = (-1)^{n+1}. \tag{5.3}$$
Moreover, by transposing the matrices we obtain
$$\begin{pmatrix} p_n & q_n \\ p_{n-1} & q_{n-1} \end{pmatrix} = P(a_n)P(a_{n-1}) \cdots P(a_0).$$
We deduce that $p_n/p_{n-1} = [a_n, a_{n-1}, \ldots, a_0]$ and $q_n/q_{n-1} = [a_n, a_{n-1}, \ldots, a_1]$.
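The recursion for $p_i$, $q_i$ and the identities of this section can be illustrated with exact rational arithmetic. The sketch below uses Python's standard `fractions` module; the function names `convergents` and `evaluate` are illustrative.

```python
from fractions import Fraction


def convergents(a):
    """Pairs (p_i, q_i), i = 0..n, for the continued fraction [a_0, ..., a_n]."""
    p_prev, p = 1, a[0]   # p_{-1} = 1, p_0 = a_0
    q_prev, q = 0, 1      # q_{-1} = 0, q_0 = 1
    result = [(p, q)]
    for a_i in a[1:]:
        p_prev, p = p, a_i * p + p_prev
        q_prev, q = q, a_i * q + q_prev
        result.append((p, q))
    return result


def evaluate(a):
    """Value of [a_0, ..., a_n], computed from the innermost term outwards."""
    value = Fraction(a[-1])
    for a_i in reversed(a[:-1]):
        value = a_i + 1 / value
    return value
```

For $[3, 7, 15, 1]$ the convergents are $3/1$, $22/7$, $333/106$, $355/113$; one checks Eq. (5.3) on them, for example $333 \cdot 7 - 106 \cdot 22 = -1 = (-1)^3$, and the mirror identity $p_3/p_2 = 355/333 = [1, 15, 7, 3]$.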

5.2 Infinite Continued Fractions

An infinite continued fraction is an expression of the form
$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n + \cfrac{1}{\ddots}}}}},$$
with $a_0 \in \mathbb{Z}$, $a_n \in \mathbb{P}$ for any $n \ge 1$. It is written $[a_0, a_1, a_2, \ldots, a_n, \ldots]$. It is well known that each irrational real number has a unique expression as an infinite continued fraction, in the sense that this number is the limit of the finite continued fractions (5.1) as $n$ goes to infinity. In other words,
$$[a_0, a_1, a_2, \ldots, a_n, \ldots] = \lim_{n \to \infty} [a_0, a_1, a_2, \ldots, a_n].$$

Conversely, such an expression represents an irrational real number. Starting from $x = [a_0, a_1, a_2, \ldots, a_n, \ldots]$, define $x_n = [a_n, a_{n+1}, a_{n+2}, \ldots]$. Then clearly $x = x_0 = a_0 + \frac{1}{x_1}$, and recursively
$$x = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1} + \cfrac{1}{x_n}}}}}.$$
One shows that if the numbers $p_i$, $q_i$ are defined as in Section 5.1, then for $n \ge 1$
$$x = \frac{x_np_{n-1} + p_{n-2}}{x_nq_{n-1} + q_{n-2}}. \tag{5.4}$$

5.3 Periodic Expansions Yield Quadratic Numbers

In order to consider ultimately periodic continued fractions, it is convenient to use the notation

$$[a_0, \ldots, a_n, \overline{b_1, \ldots, b_p}] := [a_0, \ldots, a_n, b_1, \ldots, b_p, b_1, \ldots, b_p, b_1, \ldots, b_p, \ldots],$$

including the case where the prefix $a_0, \ldots, a_n$ is empty; in this latter case, the continued fraction is called periodic.

Recall that a real number $x$ is quadratic if it may be written $x = a + c\sqrt{b}$, where $a, b, c$ are rational numbers such that $b$ is not a square in $\mathbb{Q}$ and $c \neq 0$. Equivalently, $x$ is a root of some irreducible polynomial $P$ of degree 2 over the rationals; $P$ is unique up to a nonzero factor. The conjugate of the quadratic number $x$ above is $\bar{x} = a - c\sqrt{b}$. Equivalently, $\bar{x}$ is the other root of the polynomial $P$. It follows that the conjugate does not depend on the chosen expression.

Proposition 5.3.1 Let $n, a_0, \ldots, a_{n-1}$ be positive integers, $x = [\overline{a_0, \ldots, a_{n-1}}]$, $y = [\overline{a_{n-1}, \ldots, a_0}]$, and

$$\begin{pmatrix} p & q \\ r & s \end{pmatrix} = P(a_0) \cdots P(a_{n-1}) = \begin{pmatrix} a_0 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} a_{n-1} & 1 \\ 1 & 0 \end{pmatrix}.$$

Then $x, y$ are quadratic, $x = \frac{px+q}{rx+s}$, $y = \frac{py+r}{qy+s}$, $y = -1/\bar{x}$, and

$$y = \frac{p - s + \sqrt{(s-p)^2 + 4qr}}{2q}.$$

Proof Note that $q, r$ are positive. By Eq. (5.4), $x = \frac{px+q}{rx+s}$ since $x = x_n$. Thus $rx^2 + (s-p)x - q = 0$. The discriminant of this equation is $(s-p)^2 + 4qr$, hence

$$x = \frac{p - s + \sqrt{(s-p)^2 + 4qr}}{2r};$$

indeed, $x$ is positive and the other root is negative, since $q, r > 0$, so that $p - s - \sqrt{(s-p)^2 + 4qr} < 0$. Now, since the matrices in the product are symmetric, the product

$$\begin{pmatrix} a_{n-1} & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} a_0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{is equal to} \quad \begin{pmatrix} p & r \\ q & s \end{pmatrix}.$$

Thus, similarly, we have $y = \frac{py+r}{qy+s}$ and

$$y = \frac{p - s + \sqrt{(s-p)^2 + 4qr}}{2q}.$$

Now $-1/\bar{x}$ is equal to

$$-\frac{2r}{p - s - \sqrt{(s-p)^2 + 4qr}} = -\frac{2r\left(p - s + \sqrt{(s-p)^2 + 4qr}\right)}{(p-s)^2 - \left((s-p)^2 + 4qr\right)} = y.$$

Finally, $x, y$ are irrational, since their expansion into continued fractions is infinite. Hence the discriminant is not a square in $\mathbb{Q}$ and $x, y$ are quadratic. □

Concerning the previous result, one has to mention theorems of Lagrange and Galois: Lagrange’s theorem asserts that a real number $x$ has an ultimately periodic expansion into continued fractions if and only if $x$ is quadratic; Galois’s theorem says that the expansion is periodic (as those in the preceding proposition) if and only if $x$ is quadratic and satisfies $x > 1$ and $-1 < \bar{x} < 0$. A proof of both theorems may be found in Aigner’s book [A].
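Proposition 5.3.1 lends itself to a quick numerical check. The sketch below is an illustration under stated assumptions rather than part of the text: the period $1, 1, 2, 2$ is an arbitrary choice, floating-point arithmetic replaces exact computation, and the helper names `period_matrix` and `periodic_value` are hypothetical.

```python
import math


def period_matrix(period):
    """Entries (p, q, r, s) of P(a_0)...P(a_{n-1}), with P(a) = [[a, 1], [1, 0]]."""
    p, q, r, s = 1, 0, 0, 1  # identity matrix
    for a in period:
        # Right-multiply [[p, q], [r, s]] by [[a, 1], [1, 0]].
        p, q, r, s = p * a + q, p, r * a + s, r
    return p, q, r, s


def periodic_value(period, rounds=40):
    """Float approximation of the purely periodic continued fraction."""
    terms = period * rounds
    value = float(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1.0 / value
    return value


period = [1, 1, 2, 2]
p, q, r, s = period_matrix(period)
root = math.sqrt((s - p) ** 2 + 4 * q * r)
x = (p - s + root) / (2 * r)      # positive root of r x^2 + (s - p) x - q = 0
x_bar = (p - s - root) / (2 * r)  # its conjugate
y = (p - s + root) / (2 * q)      # formula of the proposition

print((p, q, r, s), x, y, -1 / x_bar)
print(periodic_value(period), periodic_value(period[::-1]))
```

For this period the matrix is $\begin{pmatrix} 12 & 5 \\ 7 & 3 \end{pmatrix}$, and the two printed lines should agree: $x$ with the value of $[\overline{1, 1, 2, 2}]$, and both $y$ and $-1/\bar{x}$ with the value of $[\overline{2, 2, 1, 1}]$.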

5.4 Approximations of Real Numbers

Consider an infinite continued fraction $x = [a_0, a_1, a_2, \ldots, a_n, \ldots]$. Let $p_i, q_i$ be as in Section 5.1. The number $a_n$ is called the $n$-th partial quotient of $x$. The rational number $p_n/q_n$ is called the $n$-th convergent of $x$; $n$ is called the rank of the convergent. The sequence of convergents of even rank is strictly increasing, while the sequence of convergents of odd rank is strictly decreasing. Both converge to $x$, and one has for any $n$

$$|x - p_n/q_n| < 1/q_n^2.$$

See [HW] for a proof of these basic results in the theory of continued fractions.

It follows from the definition of the $q_n$ that the sequence $(q_n)_{n \ge 1}$ is strictly increasing. Hence one deduces Dirichlet’s theorem, whose original proof was quite different, using the pigeonhole principle: for any irrational real number $x$ there are infinitely many rationals $p/q$ such that $|x - p/q| < 1/q^2$.

One has a stronger approximation, as follows: for any $n \ge 1$, one has either $|x - p_n/q_n| < 1/(2q_n^2)$ or $|x - p_{n+1}/q_{n+1}| < 1/(2q_{n+1}^2)$. Indeed, by contradiction, suppose that both inequalities are false; then, since $x$ is irrational, $|x - p_n/q_n| > 1/(2q_n^2)$ and $|x - p_{n+1}/q_{n+1}| > 1/(2q_{n+1}^2)$. Note that by Eq. (5.3), $q_n(q_{n+1}x - p_{n+1}) - q_{n+1}(q_n x - p_n) = (-1)^{n+1}$. Dividing by $q_n q_{n+1}$, we obtain $(x - p_{n+1}/q_{n+1}) - (x - p_n/q_n) = (-1)^{n+1}/(q_n q_{n+1})$. By a previous statement, $x - p_{n+1}/q_{n+1}$ and $x - p_n/q_n$ have opposite sign, so that $|x - p_{n+1}/q_{n+1}| + |x - p_n/q_n| = 1/(q_n q_{n+1})$. Hence $1/(q_n q_{n+1}) > 1/(2q_n^2) + 1/(2q_{n+1}^2)$ and multiplying by $2q_n^2 q_{n+1}^2$, we deduce that $q_{n+1}^2 + q_n^2 < 2q_n q_{n+1}$ and then $(q_{n+1} - q_n)^2 < 0$, a contradiction.

Let $p/q \in \mathbb{Q}$, $q > 0$. We show that convergents are optimal in the following sense: if $q < q_{n+1}$, then $|qx - p| \ge |q_n x - p_n|$. We may assume that $p/q$ is reduced. The system of two equations in the unknowns $u, v$

$$p_n u + p_{n+1} v = p, \qquad q_n u + q_{n+1} v = q, \tag{5.5}$$

has determinant $\pm 1$. Thus there is a unique integral solution $u, v$. If we had $u = 0$, then $v = q/q_{n+1} > 0$. Thus $v \ge 1$ and $q \ge q_{n+1}$, against the assumption. If $v = 0$, then $u = p/p_n = q/q_n$, and since the two fractions $p/q$ and $p_n/q_n$ are reduced and $q, q_n > 0$, $p = p_n$ and $q = q_n$; then clearly $|qx - p| \ge |q_n x - p_n|$.

We assume now that $u, v$ are nonzero and show that they have opposite sign. If $v \le -1$, then $q_n u = q - q_{n+1}v > 0$ and thus $u > 0$. If, on the contrary, $v \ge 1$, then $q_n u = q - q_{n+1}v \le q - q_{n+1} < 0$ and thus $u < 0$. Since, as we saw, $q_n x - p_n$ and $q_{n+1}x - p_{n+1}$ have opposite sign, we see that $u(q_n x - p_n)$ and $v(q_{n+1}x - p_{n+1})$ have the same sign. Now by Eq. (5.5), $qx - p = (q_n u + q_{n+1}v)x - (p_n u + p_{n+1}v) = u(q_n x - p_n) + v(q_{n+1}x - p_{n+1})$. The two last summands have the same sign, so that $|qx - p| = |u||q_n x - p_n| + |v||q_{n+1}x - p_{n+1}| > |q_n x - p_n|$ ($q_{n+1}x - p_{n+1} \neq 0$ since $x$ is irrational), as required.

We show finally that if $|x - p/q| < 1/(2q^2)$, then $p/q$ is a convergent of $x$. Indeed, since $q_0 = 1$ and $(q_n)_{n \ge 1}$ is strictly increasing, there exists $n \ge 0$ such that $q_n \le q < q_{n+1}$ ($q$ being positive). Then the previous property shows that $|qx - p| \ge |q_n x - p_n|$. Thus $|x - p_n/q_n| \le (1/q_n)|qx - p| = (q/q_n)|x - p/q| < 1/(2qq_n)$. Assuming that $p/q \neq p_n/q_n$, we obtain that $|pq_n - qp_n| \ge 1$ and thus

$$\frac{1}{qq_n} \le \frac{|pq_n - qp_n|}{qq_n} = |p/q - p_n/q_n| \le |x - p/q| + |x - p_n/q_n| < \frac{1}{2q^2} + \frac{1}{2qq_n}.$$

Thus $1/(qq_n) < 1/q^2$, hence $q < q_n$, a contradiction. Thus $p/q = p_n/q_n$. We thus have the following result.

Theorem 5.4.1 For each convergent $p/q$ of an irrational real number $x$, one has $|x - p/q| < 1/q^2$. Of two successive convergents, one of them, $p/q$ say, satisfies $|x - p/q| < 1/(2q^2)$. If $p/q$ is a rational number satisfying $|x - p/q| < 1/(2q^2)$, then it is a convergent of $x$.

The last assertion is due to Legendre [L], (9) pp. 27–29.

For later use, we state and prove the following result. We define the Fibonacci numbers $(F_n)$ by the rules: $F_{-1} = 0$, $F_0 = 1$, and $F_{n+1} = F_n + F_{n-1}$ for any natural number $n$.
Lemma 5.4.2 If $x, y$ are two real numbers whose expansions into continued fractions coincide up to the partial quotient of rank $n$, then $|x - y| < 1/F_n^2$.

Note that the numbering starts at 0: $a_0$ is the partial quotient of rank 0. Hence the expansions into continued fractions of $x$ and $y$ in the lemma coincide for their first $n + 1$ terms.

Proof If $n = 0$, then $x, y$ have the same integer part, so that $|x - y| < 1 = 1/F_0^2$. Let $n \ge 1$. By hypothesis, $x = [a_0, \ldots, a_n, a_{n+1}, \ldots]$ and $y = [a_0, \ldots, a_n, b_{n+1}, \ldots]$. Then, by Eq. (5.4), we have

$$x - y = \frac{x_n p_{n-1} + p_{n-2}}{x_n q_{n-1} + q_{n-2}} - \frac{y_n p_{n-1} + p_{n-2}}{y_n q_{n-1} + q_{n-2}},$$

and $x_n = [a_n, a_{n+1}, \ldots]$, $y_n = [a_n, b_{n+1}, \ldots]$ have the same integer part $a_n$, so that $|x_n - y_n| < 1$. Thus

$$x - y = \frac{(x_n p_{n-1} + p_{n-2})(y_n q_{n-1} + q_{n-2}) - (x_n q_{n-1} + q_{n-2})(y_n p_{n-1} + p_{n-2})}{(x_n q_{n-1} + q_{n-2})(y_n q_{n-1} + q_{n-2})} = \frac{x_n(p_{n-1}q_{n-2} - q_{n-1}p_{n-2}) - y_n(q_{n-2}p_{n-1} - p_{n-2}q_{n-1})}{(x_n q_{n-1} + q_{n-2})(y_n q_{n-1} + q_{n-2})}.$$

This is by Eq. (5.3) equal to

$$\frac{(-1)^n (x_n - y_n)}{(x_n q_{n-1} + q_{n-2})(y_n q_{n-1} + q_{n-2})}.$$

Since $n \ge 1$, $x_n$ and $y_n$ are $\ge 1$, so that the denominator is greater than or equal to the square of $q_{n-1} + q_{n-2}$. Now it is easy to prove by induction that $q_i \ge F_i$ if $i \ge -1$. Thus $q_{n-1} + q_{n-2} \ge F_{n-1} + F_{n-2} = F_n$, which ends the proof. □

Note that the proof works even if $x, y$ are rational numbers, provided the partial quotients of rank $n$ exist.
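The statements of this section can be illustrated numerically. The sketch below makes several assumptions that are not in the text: it takes $x = \sqrt{3} = [1, 1, 2, 1, 2, \ldots]$ as a test case for Theorem 5.4.1, works in floating point for that part, and uses two arbitrary finite continued fractions for Lemma 5.4.2. Convergents are computed with the usual initial values $(p_{-2}, q_{-2}) = (0, 1)$, $(p_{-1}, q_{-1}) = (1, 0)$.

```python
import math
from fractions import Fraction


def convergents(quotients):
    """Return [(p_0, q_0), (p_1, q_1), ...] for partial quotients a_0, a_1, ..."""
    p_prev, q_prev, p, q = 0, 1, 1, 0
    out = []
    for a in quotients:
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
        out.append((p, q))
    return out


def evaluate(quotients):
    """Exact value of the finite continued fraction [a_0, ..., a_k]."""
    value = Fraction(quotients[-1])
    for a in reversed(quotients[:-1]):
        value = a + 1 / value
    return value


def fibonacci(n):
    """F_n with the convention of the text: F_{-1} = 0, F_0 = 1."""
    previous, current = 0, 1
    for _ in range(n):
        previous, current = current, previous + current
    return current


# Theorem 5.4.1 on the first convergents of sqrt(3): q^2 |x - p/q| is always
# below 1, and below 1/2 for at least one of any two successive convergents.
x = math.sqrt(3)
scaled = [abs(x - p / q) * q * q for p, q in convergents([1] + [1, 2] * 5)]
print([round(value, 3) for value in scaled])

# Lemma 5.4.2: these two expansions coincide up to rank n = 2.
u = evaluate([1, 2, 3, 4, 5])
v = evaluate([1, 2, 3, 9, 2])
print(abs(u - v), Fraction(1, fibonacci(2) ** 2))
```

In the first printed list the values alternate around $1/2$, which shows that the constant 2 is reached by only one convergent out of two in this example.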

5.5 Lagrange Number of a Real Number

Let $x$ be an irrational real number. Consider the set of real numbers $L$ such that the inequality $|x - p/q| < 1/(Lq^2)$ holds for infinitely many rational numbers $p/q$. Define $L(x)$ to be the supremum of all these $L$. It is called the Lagrange number of $x$. Note that, by Theorem 5.4.1, there are infinitely many $p/q$ such that $|x - p/q| < 1/(2q^2)$. Thus $L(x) \ge 2$. Moreover, if the inequality $|x - p/q| < 1/(2q^2)$ holds, then $p/q$ is a convergent of $x$. This shows that $L(x)$ may be defined as the supremum of all numbers $L > 0$ such that $|x - p_n/q_n| < 1/(Lq_n^2)$ for infinitely many $n$, where $p_n/q_n$, $n \ge 0$, are the convergents of $x$.

Let $x$ be an irrational real number, with its infinite continued fraction $[a_0, a_1, a_2, \ldots]$, and define, as in Section 5.2, $x_n = [a_n, a_{n+1}, a_{n+2}, \ldots]$; moreover, for $n \ge 1$, define $y_n = [a_n, \ldots, a_1]$. Finally, for $n \ge 2$, let $\lambda_n(x) = x_n + y_{n-1}^{-1}$. We have

$$\lambda_n(x) = [a_n, a_{n+1}, \ldots] + [a_{n-1}, \ldots, a_1]^{-1} = a_n + [a_{n+1}, a_{n+2}, \ldots]^{-1} + [a_{n-1}, \ldots, a_1]^{-1} = [a_{n+1}, a_{n+2}, \ldots]^{-1} + [a_n, a_{n-1}, \ldots, a_1]. \tag{5.6}$$

For $n = 1$, we define $\lambda_1(x) = [a_2, a_3, \ldots]^{-1} + a_1 = [a_1, a_2, \ldots] = x_1$. The next result is essential for the determination of the Lagrange number of a real number; it is stated by Hurwitz [H1], p. 283.

Theorem 5.5.1 $L(x) = \limsup_{n \to \infty} \lambda_n(x)$.

Note that if $(u_n)$ is a sequence of real numbers, then $\limsup(u_n) = \limsup(u_{n+1})$.

Proof For $n \ge 0$, we have by Eq. (5.4)

$$x - \frac{p_n}{q_n} = \frac{x_{n+1}p_n + p_{n-1}}{x_{n+1}q_n + q_{n-1}} - \frac{p_n}{q_n} = \frac{x_{n+1}(p_n q_n - q_n p_n) + p_{n-1}q_n - q_{n-1}p_n}{(x_{n+1}q_n + q_{n-1})q_n} = \frac{(-1)^n}{(x_{n+1} + y_n^{-1})q_n^2},$$

by Eq. (5.3), and since by Section 5.1 $q_n/q_{n-1} = [a_n, a_{n-1}, \ldots, a_1] = y_n$. Thus

$$\left|x - \frac{p_n}{q_n}\right| = \frac{1}{\lambda_{n+1}(x)\,q_n^2}. \tag{5.7}$$

Let $L > 0$ be such that there are infinitely many $n$ satisfying $|x - p_n/q_n| < 1/(Lq_n^2)$. We deduce that for these $n$ one has $1/(\lambda_{n+1}(x)q_n^2) < 1/(Lq_n^2)$ and therefore $L < \lambda_{n+1}(x)$. Thus $L \le \limsup_{n \to \infty} \lambda_n(x) =: M$. It follows that $L(x) \le M$. Furthermore, given $\epsilon > 0$, there are infinitely many $n$ such that $\lambda_{n+1}(x) > M - \epsilon$. Thus, for these $n$, we have $|x - p_n/q_n| < 1/((M - \epsilon)q_n^2)$. Therefore, $L(x) \ge M - \epsilon$. This being true for any $\epsilon$, we deduce that $L(x) = M$, as was to be shown. □

Two irrational real numbers are called equivalent if their expansions into continued fractions are ultimately equal. Formally, let $x = [a_0, a_1, a_2, \ldots]$, $y = [b_0, b_1, b_2, \ldots]$. Then $x, y$ are equivalent if for some $i, j \in \mathbb{N}$, one has $a_{i+n} = b_{j+n}$ for any natural number $n$.

Corollary 5.5.2 Equivalent irrational real numbers have the same Lagrange number.

Proof It is enough to show that if $x = [a_0, a_1, \ldots]$ and $y = [a_1, a_2, \ldots]$, then $L(x) = L(y)$. Now $\lambda_n(y) = [a_{n+1}, a_{n+2}, \ldots] + [a_n, \ldots, a_2]^{-1}$ and $\lambda_{n+1}(x) = [a_{n+1}, a_{n+2}, \ldots] + [a_n, \ldots, a_1]^{-1}$. By Lemma 5.4.2, the difference of the two sequences $[a_n, \ldots, a_2]^{-1}$ and $[a_n, \ldots, a_1]^{-1}$ has the limit 0. Note that $\limsup(u_n) = \limsup(v_n)$ if $\lim(u_n - v_n) = 0$. Thus the corollary follows. □
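Theorem 5.5.1 and Eq. (5.7) can be explored numerically. The following sketch is illustrative only: it represents a continued fraction by a hypothetical function `quotients(i)` returning $a_i$, truncates infinite tails at a fixed depth, and works in floating point. The two test expansions, all 1s (the golden ratio) and $[1, 2, 2, 2, \ldots] = \sqrt{2}$, are standard facts assumed here rather than taken from this section.

```python
import math


def tail_value(quotients, start, depth=60):
    """Float approximation of x_start = [a_start, a_{start+1}, ...]."""
    value = float(quotients(start + depth))
    for i in range(start + depth - 1, start - 1, -1):
        value = quotients(i) + 1.0 / value
    return value


def head_value(quotients, n):
    """Float value of y_n = [a_n, a_{n-1}, ..., a_1]."""
    value = float(quotients(1))
    for i in range(2, n + 1):
        value = quotients(i) + 1.0 / value
    return value


def lagrange_lambda(quotients, n):
    """lambda_n(x) = x_n + 1/y_{n-1} for n >= 2, and lambda_1(x) = x_1."""
    if n == 1:
        return tail_value(quotients, 1)
    return tail_value(quotients, n) + 1.0 / head_value(quotients, n - 1)


def convergent(quotients, n):
    """(p_n, q_n), assuming (p_{-2}, q_{-2}) = (0, 1), (p_{-1}, q_{-1}) = (1, 0)."""
    p_prev, q_prev, p, q = 0, 1, 1, 0
    for i in range(n + 1):
        a = quotients(i)
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
    return p, q


def golden(i):
    return 1


def sqrt2(i):
    return 1 if i == 0 else 2


# Eq. (5.7): |x - p_n/q_n| * lambda_{n+1}(x) * q_n^2 should equal 1.
p3, q3 = convergent(sqrt2, 3)
product = abs(math.sqrt(2) - p3 / q3) * lagrange_lambda(sqrt2, 4) * q3 * q3

print(lagrange_lambda(golden, 30), math.sqrt(5))
print(lagrange_lambda(sqrt2, 30), math.sqrt(8))
print(product)
```

For these two numbers the sequence $\lambda_n$ converges, so its limit is the Lagrange number: $\sqrt{5}$ for the golden ratio and $\sqrt{8}$ for $\sqrt{2}$.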

Concerning equivalent numbers, one has to recall the following result: two irrational real numbers $x, y$ are equivalent if and only if for some matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL_2(\mathbb{Z})$, one has $y = \frac{ax+b}{cx+d}$. See for example [HW], Theorem 175 or [B], Theorem A.1.

5.6 Ordering Continued Fractions If real numbers are represented in base k, then one compares them using the lexicographical order [v] by Lemma 7.1.3. We want to show that x = 1, y = 2. By contradiction, suppose that u = w1u , v = w2v ; since [w1u ] > [w2v ], w must be of odd length; then  . Note that |a u˜ | ≥ I ≥ |m| (because |u| ≥ I + M, u = wyu and s = a0 u˜ 1w1122w2v ˜ 0 |wy| ≤ M + 1, so that |u | ≥ I − 1). Thus 1w1122w2 ˜ is a factor of s1 . Hence, since w has at least a block of odd length, we see by inspection that so has s1 , a contradiction. This proves that x = 1, y = 2. We therefore take L = I + M. The case s = a0 u2211v ˜ is treated similarly. 2 Theorem 7.1.7 Given an infinite word s = a0 a1 a2 · · · over Z, with ai > 0 for i > 0, whose Lagrange number is < 3, there exists a factorization s = a0 ws and an infinite word t over {a, b} satisfying the Markoff property, with parameters K and N, such that s = χ(t). If one assumes that λi (s) ≤ θ < 3 for any i ≥ I, then K, N and the length of w may be bounded by functions depending only on θ and I. Recall that χ is the substitution sending a onto 11 and b onto 22. Proof If an infinite word over {1, 2} has only blocks of even length, then it is in the image of χ. Let L be as in Lemma 7.1.6; in particular, L ≥ I. Write s = a0 ws ˜  with |a0 w| ˜ =L or L + 1, where the length is chosen so that, according to Lemma 7.1.2 (i) and Lemma 7.1.5, s has only 1s and 2s, and only blocks of even length. Then s = χ(t) for some t. Then we apply Lemma 7.1.6 and we let M be as in this lemma. We show that t satisfies the Markoff property. Let t = uxyv ˜ with {x, y} = {a, b} and with |u| ≥ K = (M + 1)/2 . Note that K ≥ (M + 1)/2, hence 2|u| ≥ M + 1. Con˜  = a0 wχ(t) ˜ = sider the case x = a, y = b, the other case being similar. Then s = a0 ws   a0 wχ( ˜ u)1122χ(v). ˜ Since χ(u) ˜ = χ(u), we have s = a0 (χ(u)w)1122χ(v). 
Since the length of χ(u)w is at least L (because |w| ≥ L − 1 and |u| ≥ 1), Lemma 7.1.6 implies that χ(u)w = p2u and χ(v) = p1v for some finite word p of length at most M. Since the length 2|u| of χ(u) is larger than M, p has even length; otherwise, p2 is in the image of χ, hence the last letter of p is 2, but p1 is also in the image of χ, and thus 1 is its last letter, a contradiction. Thus p is in the image of χ, p = χ(m) and m is of length at most M/2. Since |χ(u)| ≥ M + 1 ≥ |p2| and χ(u)w = p2u , we obtain χ(u) = p2u1 = χ(m)22χ(u ). Moreover, χ(v) = χ(m)11χ(v ). This implies that u = mbu , v = mav , with m of length at most M/2. We conclude by taking N = M/2 . 2

7.2 Bi-infinite Sequences

Recall that the Markoff supremum of a bi-infinite sequence $s = (a_n)_{n \in \mathbb{Z}}$ has been defined in Section 6.2.

Theorem 7.2.1 (Markoff [M1, M2]) If the Markoff supremum of a bi-infinite sequence $s = (a_n)_{n \in \mathbb{Z}}$ is $< 3$, then $s$ is equal up to a shift to $\chi(t)$ for some bi-infinite sequence $t$ over $\{a, b\}$ satisfying the Markoff property.

Proof We have $M(s) = \sup_{i \in \mathbb{Z}} \lambda_i(s) < 3$. Choose some real number $\theta$ such that $M(s) < \theta < 3$ and some $I \ge 1$ such that the Fibonacci number $F_{I-1}$ satisfies

$$M(s) + \frac{1}{F_{I-1}^2} \le \theta.$$

Fix some integer $k$ and consider the infinite word $t_k = (b_n)_{n \in \mathbb{N}}$ defined by $b_n = a_{k+n}$. We show that for $i \ge I$, one has $\lambda_i(t_k) \le \theta$. Indeed, $\lambda_i(t_k) = [b_i, \ldots, b_1] + [b_{i+1}, b_{i+2}, \ldots]^{-1} = [a_{k+i}, \ldots, a_{k+1}] + [a_{k+i+1}, a_{k+i+2}, \ldots]^{-1}$ and $\lambda_{k+i}(s) = [a_{k+i}, \ldots, a_{k+1}, a_k, a_{k-1}, \ldots] + [a_{k+i+1}, a_{k+i+2}, \ldots]^{-1}$. By hypothesis, we have $\lambda_{k+i}(s) \le M(s)$. The expansions into continued fractions of the real numbers $[a_{k+i}, \ldots, a_{k+1}]$ and $[a_{k+i}, \ldots, a_{k+1}, a_k, a_{k-1}, \ldots]$ coincide for their first $i \ge I$ terms, so that by Lemma 5.4.2 these two numbers differ by at most $1/F_{I-1}^2$. Thus

$$\lambda_i(t_k) \le M(s) + \frac{1}{F_{I-1}^2} \le \theta.$$

It follows from Theorem 7.1.7 that $t_k = w_k\chi(u_k)$ for some word $w_k$ and some infinite word $u_k$ satisfying the Markoff property with parameters $K$ and $N$; moreover, the lengths of $w_k$, $K$, and $N$ are bounded by constants which are independent of $k$. Since this is true for any $k$, it follows that $s = \chi(t)$ for some bi-infinite sequence $t$; furthermore, $t$ satisfies the Markoff property with parameter $N$. The easy details are left to the reader. □

The theorem has a converse.

Theorem 7.2.2 If a bi-infinite sequence $s$ is equal to $\chi(t)$ for some bi-infinite sequence $t$ over $\{a, b\}$ satisfying the Markoff property, then the Markoff supremum of $s$ is less than 3.

See [Re1], Theorem 7.2 (i), [B], Lemma 10, [BLRS], Theorem 8.10, or [A], Corollary 9.16. It may also be obtained as a consequence of Markoff’s methods in Chapter 9.







8 Markoff’s Theorem for Approximations

In this chapter we prove Markoff’s theorem for approximations: if $x$ is an irrational real number such that its Lagrange number $L(x)$ is $< 3$, then the continued fraction of $x$ is ultimately periodic and has as its periodic pattern a Christoffel word on the alphabet $\{11, 22\}$. Moreover, the bound is attained. See Theorem 8.2.1 and Corollary 8.2.2.

8.1 Main Lemma

We begin with a result which is a consequence of the previous chapters and which will be used to prove Markoff’s theorem for approximations in the next section.

Lemma 8.1.1 Let $s = a_0 a_1 a_2 \cdots$ be an infinite word over $\mathbb{Z}$, with $a_i > 0$ for $i > 0$, such that for some $I$ and some $\theta < 3$ one has $\lambda_i(s) < \theta$ for any $i \ge I$. Then $s = u\chi(w)^\infty$ for some lower Christoffel word $w$ and some word $u$ whose length is bounded by a function depending only on $I$ and $\theta$.

Proof By Theorem 7.1.7, we have $s = a_0 m s'$, $s' = \chi(t)$, where $t$ satisfies the Markoff property for the parameters $K, N$, and where the length of $m$ and the numbers $K, N$ are each bounded by some function depending only on $I$ and $\theta$. By Theorem 4.1.6 we have $t = vw^\infty$, where $w$ is a lower Christoffel word and the length of $v$ is bounded by some function depending only on $K$ and $N$. Consequently, $s = a_0 m \chi(v)\chi(w)^\infty$, which proves the lemma. □

8.2 Markoff’s Theorem for Approximations

Theorem 8.2.1 Let $x$ be an irrational real number. Then its Lagrange number $L(x)$ is less than 3 if and only if $x$ is equivalent to some number $x_w = [\chi(\tilde{w})^\infty]$ (or equivalently to $[\chi(w)^\infty]$) for some lower Christoffel word $w$. In this case, let $m$ be the Markoff number $\mu(w)_{12} = \frac{1}{3}\mathrm{Tr}(\mu(w))$ associated with $w$. Then

$$L(x) = L(x_w) = \sqrt{9 - \frac{4}{m^2}} \quad \text{and} \quad x_w = \frac{p - s}{2m} + \frac{1}{2}\sqrt{9 - \frac{4}{m^2}}$$

with $\mu(w) = \begin{pmatrix} p & q \\ r & s \end{pmatrix}$ and therefore $m = q = \frac{1}{3}(p + s)$.

From Christoffel Words to Markoff Numbers, Christophe Reutenauer. Oxford University Press (2019). © Christophe Reutenauer 2019. DOI: 10.1093/oso/9780198827542.001.0001

Proof Note first that since $w$ and $\tilde{w}$ are conjugate (Theorem 6.2.2), so are their images under $\chi$, and therefore $[\chi(\tilde{w})^\infty]$ and $[\chi(w)^\infty]$ are equivalent real numbers. In particular, they have the same Lagrange number by Corollary 5.5.2. The same result shows that if $x$ is equivalent to $x_w$, then $L(x) = L(x_w)$, and by Theorem 6.3.1 $L(x_w) = \sqrt{9 - 4/m^2} < 3$, where $m$ is the Markoff number as in the statement. Then $\mu(w)_{12} = \frac{1}{3}\mathrm{Tr}(\mu(w))$ by Lemma 3.1.2. Moreover, by Theorem 6.1.1 (where $y$ corresponds to $x_w$), $x_w = \frac{p-s}{2m} + \frac{1}{2}\sqrt{9 - 4/m^2}$.

Conversely, let $x$ be such that $L(x) < 3$. Let $s = a_0 a_1 a_2 \cdots$ be the infinite word such that $x$ has the continued fraction expansion $[a_0, a_1, a_2, \ldots]$. Then $L(s) = L(x) < 3$. Thus there exist $I$ and $\theta$ as in Lemma 8.1.1. We deduce that $s = u\chi(w)^\infty$ for some lower Christoffel word $w$; since $w$ and $\tilde{w}$ are conjugate, we also have $s = v\chi(\tilde{w})^\infty$ and therefore $x$ is equivalent to $x_w$. □

Markoff’s theorem is often stated as a series of progressively better approximations, with exceptions, the first exception being the golden ratio and the numbers equivalent to it. Formally, this is the following result.

Corollary 8.2.2 Let $M$ be a finite set of Markoff numbers, such that for any Markoff numbers $n, p$, if $n < p$ and $p \in M$, then $n \in M$. Let $W$ be the finite set of all lower Christoffel words corresponding to the Markoff numbers in $M$ (in other words, $W$ is the set of all lower Christoffel words $w$ such that $\mu(w)_{12} \in M$). Let $m$ be the smallest Markoff number not in $M$. Then for each irrational real number $x$ not equivalent to any $x_w$, $w \in W$, there are infinitely many rational approximations $p/q$ of $x$ such that

$$|x - p/q| < \frac{1}{\sqrt{9 - \frac{4}{m^2}}\; q^2}.$$

We give three examples. First, let $M = \emptyset$. Then $m = 1$, $W = \emptyset$, and $\sqrt{9 - 4/m^2} = \sqrt{5}$. We obtain that each real irrational number has infinitely many rational approximations satisfying $|x - p/q| < 1/(\sqrt{5}q^2)$. This is a theorem of Hurwitz [H1], Satz 1 p. 279, which we state as a corollary.

Corollary 8.2.3 For each irrational real number $x$ there are infinitely many rational fractions $p/q$ such that $|x - p/q| < 1/(\sqrt{5}q^2)$.

Note that Martin Aigner proves the following nice strengthening of the previous result of Hurwitz: of any three consecutive convergents of $x$, one of them, $p/q$ say, satisfies the inequality in the corollary (see [A], proof of Theorem 1.21 pp. 21–22). This is reminiscent of a similar result for two consecutive convergents and the constant 2 at the denominator; see Section 5.4.

Now let $M = \{1\}$. Then $m = 2$, $W = \{a\}$, $\sqrt{9 - 4/m^2} = \sqrt{8}$. Moreover, $x_a = \frac{\sqrt{5}+1}{2}$, the golden ratio. We obtain that each real irrational number $x$ not equivalent to $x_a$ has infinitely many rational approximations satisfying $|x - p/q| < 1/(\sqrt{8}q^2)$. Note that ‘$x$ equivalent to $x_a$’ means that in the expansion into continued fractions of $x$, only 1s occur after some rank.

Finally, let $M = \{1, 2\}$. Then $W = \{a, b\}$, $m = 5$, $\sqrt{9 - 4/m^2} = \frac{\sqrt{221}}{5}$, $x_b = 1 + \sqrt{2}$. We obtain that each irrational real number that is equivalent neither to $x_a$ nor to $x_b$ has infinitely many rational approximations satisfying $|x - p/q| < 1/\left(\frac{\sqrt{221}}{5}q^2\right)$. Note that ‘$x$ equivalent to $x_b$’ means that in the expansion into continued fractions of $x$, only 2s occur after some rank.

We proceed now to the proof of the corollary. This requires several lemmas.

Lemma 8.2.4 If $\lambda_{n+1}(x) > \alpha$, then $|x - p_n/q_n| < 1/(\alpha q_n^2)$. If $\lambda_{n+1}(x) < \alpha$, then $|x - p_n/q_n| > 1/(\alpha q_n^2)$.

Proof It follows directly from Eq. (5.7). □
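The constants in the three examples are instances of the single expression $\sqrt{9 - 4/m^2}$. As a minimal sketch (the function names are illustrative, and the continued fractions are truncated and evaluated in floating point), one can tabulate them and confirm the values of $x_a$ and $x_b$ from their expansions with only 1s and only 2s.

```python
import math


def lagrange_value(m):
    """The constant sqrt(9 - 4/m^2) attached to the Markoff number m."""
    return math.sqrt(9 - 4 / m**2)


def constant_expansion(a, depth=60):
    """Float approximation of the continued fraction [a, a, a, ...]."""
    value = float(a)
    for _ in range(depth):
        value = a + 1.0 / value
    return value


for m in (1, 2, 5):
    print(m, lagrange_value(m))

x_a = constant_expansion(1)  # expected: the golden ratio
x_b = constant_expansion(2)  # expected: 1 + sqrt(2)
print(x_a, x_b)
```

The printed constants increase with $m$ and stay below 3, which is what makes the successive statements of the corollary progressively stronger.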

Corollary 8.2.5 If $\alpha > 0$ and $L(x) > \alpha$, then $x$ has infinitely many rational approximations $p/q$ such that $|x - p/q| < 1/(\alpha q^2)$.

Proof Since $L(x) = \limsup_n \lambda_n(x)$, there are infinitely many $n$ such that $\lambda_{n+1}(x) > \alpha$, and we apply the previous lemma. □

Lemma 8.2.6 Let $x$ be a real irrational number such that $L(x) < 3$ (so that $x$ is equivalent to $x_v$ for some lower Christoffel word $v$ and $L(x) = \sqrt{9 - 4/m^2}$ where $m$ is the Markoff number associated with $v$). Then there are infinitely many rational approximations $p/q$ of $x$ such that $|x - p/q| < 1/(L(x)q^2)$.

This result will be a consequence of Theorem 8.3.3 in the next section.

Proof of Corollary 8.2.2 If $w$ is a word of length $n$, then each entry in $\mu(w)$ is at least equal to $2^{n-1}$: indeed, the entries of the generating matrices $\mu(a), \mu(b)$ are all $\ge 1$, and the $n$-th power of $\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ is equal to $\begin{pmatrix} 2^{n-1} & 2^{n-1} \\ 2^{n-1} & 2^{n-1} \end{pmatrix}$; thus each entry of $\mu(w)$ is at least equal to $2^{n-1}$. Therefore, if $\mu(w)_{12} \in M$, the length of $w$ is bounded. Thus $W$ is finite.

Let $x$ be a real irrational number not equivalent to any $x_w$, $w \in W$. Let $\alpha = \sqrt{9 - 4/m^2}$. Then $\alpha < 3$. If $L(x) \ge 3$, then $L(x) > \alpha$, and by Corollary 8.2.5 we find infinitely many rational approximations $p/q$ of $x$ such that $|x - p/q| < 1/(\alpha q^2)$. If $L(x) < 3$, then by Theorem 8.2.1 $x$ is equivalent to $x_v$ for some lower Christoffel word $v$, with associated Markoff number $n$ and $L(x) = \sqrt{9 - 4/n^2}$. By hypothesis, $v \notin W$, so that $n$ is not in $M$ and therefore $n \ge m$. If $n > m$, then $L(x) > \alpha$, and by Corollary 8.2.5 we also find infinitely many approximations, as above. If $n = m$, then we use Lemma 8.2.6. □

The conclusion of Lemma 8.2.6 is in general not true without the assumption $L(x) < 3$. Indeed, Perron ([P1], pp. 7–8) proves that for positive integers $b > c$, the number $x = [\overline{b, c}]$ has infinitely many rational approximations such that $|x - p/q| < 1/(L(x)q^2)$, whereas $y = [\overline{c, b}]$ has only finitely many such approximations, although $x$ and $y$ are equivalent and therefore $L(x) = L(y)$.

8.3 Good and Bad Approximations

We consider any real number $x$ with $L(x) < 3$, so that by Theorem 8.2.1 $x$ is equivalent to some number $x_w$, $w$ lower Christoffel word, and $L(x) = L(x_w) = \sqrt{9 - 4/m^2}$ with $m$ the Markoff number associated with $w$. Recall that the continued fraction expansion of $x_w$ is $[\chi(\tilde{w})^\infty]$. Moreover, $x$ has an expansion of the form $[a_0 s]$, where $s = (a_1, a_2, \ldots)$ is an ultimately periodic sequence with periodic pattern $p = \chi(w)$ and also $\tilde{p} = \chi(\tilde{w})$.

Lemma 8.3.1 Let $s$ be a periodic sequence having the periodic pattern $p = \chi(w)$ for some lower Christoffel word $w$. Then either $s = u\tilde{p}^\infty$ with $u$ a suffix of odd length of $\tilde{p}$, or $s = up^\infty$ with $u$ a suffix of even length of $p$.

Recall that a lower Christoffel word $w$ and its reversal $\tilde{w}$, which is the upper Christoffel word of the same slope, are conjugate: $w = hk$, $\tilde{w} = kh$ for some words $h, k$ (see Theorem 6.2.2). Hence $\chi(w) = \chi(h)\chi(k)$ and $\chi(\tilde{w}) = \chi(k)\chi(h)$ are conjugate too, and the conjugating words $\chi(h), \chi(k)$ are of even length, since $\chi$ maps the letters $a, b$ onto the words 11, 22 of even length.

Proof Since $p$ and $\tilde{p}$ are conjugate, $s$ also has the periodic pattern $\tilde{p}$. Thus we may write $s = u_1 p^\infty = u_2 \tilde{p}^\infty$, with $u_1$ (resp. $u_2$) a suffix of $p$ (resp. $\tilde{p}$). By the remark before the proof, the lengths of $u_1$ and $u_2$ have the same parity, so that the lemma follows. □

Lemma 8.3.2 Let $s$ be an ultimately periodic, but nonperiodic, sequence having the periodic pattern $p = \chi(w)$ for some lower Christoffel word $w$. Then one has one of the following cases, where $u, u', v \in \mathbb{P}^*$ and $a, b \in \mathbb{P}$:
(i) $s = ubv\tilde{p}^\infty$, $\tilde{p} = u'av$, with either $v$ of even length and $b < a$, or $v$ of odd length and $b > a$;
(ii) $s = ubvp^\infty$, $p = u'av$, with either $v$ of even length and $b > a$, or $v$ of odd length and $b < a$.

Proof Define the sequence $t$ as the longest periodic suffix of $s$. Then $t$ is a proper suffix of $s$ and $t$ has the periodic patterns $p$ and $\tilde{p}$. Let $a$ be the letter such that $at$ is purely periodic and $b$ be the letter which precedes $t$ in $s$. Then $b \neq a$. Next, $at = av_1 p^\infty = av_2 \tilde{p}^\infty$ where $av_1$ (resp. $av_2$) is a suffix of $p$ (resp. $\tilde{p}$). By the remark before the previous lemma, the lengths of $v_1$ and $v_2$ have the same parity. Moreover, $p = u_1 a v_1$, $\tilde{p} = u_2 a v_2$, and $s = ubt = ubv_1 p^\infty = ubv_2 \tilde{p}^\infty$. Suppose that $b < a$: if $|v_i|$ is even, we are in case (i), first part; otherwise, in case (ii), second part. Suppose now that $b > a$; then we argue similarly, and the lemma follows. □

According to Lemma 8.3.1 and Lemma 8.3.2, we classify the numbers $x = [a_0 s]$ into two classes: we say that $x$ is of type I if one of the three following conditions holds:
1. $s = up^\infty$ and $u$ is a suffix of even length of $p$;
2. $s = ubvp^\infty$, $p = u'av$, $v$ of even length, $a, b$ are letters such that $b > a$;
3. $s = ubvp^\infty$, $p = u'av$, $v$ of odd length, and $a, b$ are letters such that $b < a$.

In type I, we say that $n$ is good if $n + 1$ is the position of the last letter of some $p$ in the sequence $a_0 s = \cdots p^\infty$ (the position of $a_0$ being 0). We say that $n$ is bad if it is the position of any other letter in some $p^\infty$. Thus, if $s$ is not periodic, we disregard finitely many positions of letters at the beginning of the sequence. Type II is defined similarly, as follows:
1. $s = u\tilde{p}^\infty$ and $u$ is a suffix of odd length of $\tilde{p}$;
2. $s = ubv\tilde{p}^\infty$, $\tilde{p} = u'av$, $v$ of even length, $a, b$ are letters such that $b < a$;
3. $s = ubv\tilde{p}^\infty$, $\tilde{p} = u'av$, $v$ of odd length, and $a, b$ are letters such that $b > a$.

In type II, we say that an integer $n$ is good if in the sequence $a_0 s$, $n + 1$ is the position of the first letter in some $\tilde{p}$, and bad if it is the position of any other letter in some $\tilde{p}^\infty$.

The next result was proved by Burger, Folsom, Pekker, Roengpyta, and Snyder ([BFPRS], Theorem 1.5) in a particular case, and in full generality by Bombieri ([B], Theorem 29).

Theorem 8.3.3 Let $x$ be an irrational real number with $L(x) < 3$ and let $[a_0 s]$ be its expansion into continued fractions. Let $p_n/q_n$ be its convergents. Except for finitely many $n$, one has: if $n$ is good, then $|x - p_n/q_n| < 1/(L(x)q_n^2)$; if $n$ is bad, then $|x - p_n/q_n| > 1/(L(x)q_n^2)$.

For the definition of convergents, see Section 5.1. Lemma 8.2.6 follows from this theorem; it is indeed clear, by periodicity, that the number of good $n$ is infinite. Observe also that any $x$ as in the theorem is either of type I or of type II; this follows from the two previous lemmas. It may, however, be of both types, as in the case of the golden ratio.

We compare in the next lemma the numbers $\lambda_n(s)$ and $\lambda_n(t)$, where $t$ is a bi-infinite sequence and $s$ is a sequence that coincides with $t$ in its positive part, except for finitely many values.

Lemma 8.3.4 Let $t = (b_n)_{n \in \mathbb{Z}} \in \mathbb{P}^{\mathbb{Z}}$, $n_0 \ge 0$, and $s = (a_n)_{n \in \mathbb{N}} \in \mathbb{Z}^{\mathbb{N}}$ with $a_n > 0$ if $n > 0$. Suppose that $a_n = b_n$ for all $n > n_0$. Let $n > n_0$.
• If $n_0 = 0$ and $n$ is odd; or $n_0 \ge 1$, $n - n_0$ is odd, and $a_{n_0} > b_{n_0}$; or $n_0 \ge 1$, $n - n_0$ is even, and $a_{n_0} < b_{n_0}$; then $\lambda_n(s) < \lambda_n(t)$.
• If $n_0 = 0$ and $n$ is even; or $n_0 \ge 1$, $n - n_0$ is even and $a_{n_0} > b_{n_0}$; or $n_0 \ge 1$, $n - n_0$ is odd and $a_{n_0} < b_{n_0}$; then $\lambda_n(s) > \lambda_n(t)$.

Proof Suppose that $n_0 = 0$ and $n$ is odd. If $n = 1$, then $\lambda_1(s) = [a_1, a_2, \ldots] = [b_1, b_2, \ldots] < [b_1, b_2, \ldots] + [b_0, b_{-1}, \ldots]^{-1} = \lambda_1(t)$. If $n \ge 3$, then

$$[b_{n-1}, b_{n-2}, \ldots, b_1] > [b_{n-1}, b_{n-2}, \ldots, b_1, b_0, b_{-1}, \ldots],$$

since the word $b_{n-1}b_{n-2} \cdots b_1$ is a prefix of even length of the infinite word $b_{n-1}b_{n-2} \cdots b_1 b_0 b_{-1} \cdots$ (see Corollary 5.6.2). Hence

$$\lambda_n(s) = [a_{n-1}, a_{n-2}, \ldots, a_1]^{-1} + [a_n, a_{n+1}, \ldots] = [b_{n-1}, b_{n-2}, \ldots, b_1]^{-1} + [b_n, b_{n+1}, \ldots] < [b_{n-1}, b_{n-2}, \ldots, b_1, b_0, b_{-1}, \ldots]^{-1} + [b_n, b_{n+1}, \ldots] = \lambda_n(t).$$

Note that the latter inequality holds because of the inversion of the two sides in the former one.

Suppose now that $n_0 \ge 1$, $n - n_0$ is odd and $a_{n_0} > b_{n_0}$. Then

$$[a_{n-1}, a_{n-2}, \ldots, a_1] = [a_{n-1}, \ldots, a_{n_0+1}, a_{n_0}, a_{n_0-1}, \ldots, a_1] = [b_{n-1}, \ldots, b_{n_0+1}, a_{n_0}, a_{n_0-1}, \ldots, a_1] > [b_{n-1}, \ldots, b_{n_0+1}, b_{n_0}, b_{n_0-1}, \ldots, b_1, b_0, b_{-1}, \ldots],$$

since $a_{n_0} > b_{n_0}$ and the word $b_{n-1} \cdots b_{n_0+1}$ is of even length (see Corollary 5.6.2). Thus

$$\lambda_n(s) = [a_{n-1}, a_{n-2}, \ldots, a_1]^{-1} + [a_n, a_{n+1}, \ldots] = [a_{n-1}, a_{n-2}, \ldots, a_1]^{-1} + [b_n, b_{n+1}, \ldots] < [b_{n-1}, b_{n-2}, \ldots]^{-1} + [b_n, b_{n+1}, \ldots] = \lambda_n(t).$$

All other cases are similar and left to the reader. □
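The good and bad ranks of Theorem 8.3.3 can be observed on a small example. The sketch below rests on assumptions that are not spelled out in this section: it takes $w = ab$ as the lower Christoffel word with Markoff number $m = 5$, so that $\tilde{p} = \chi(\tilde{w}) = 2211$ and $x = [\overline{2, 2, 1, 1}]$, a number of type II whose good ranks are those $n$ with $n + 1$ divisible by 4. By Lemma 8.2.4, rank $n$ gives $|x - p_n/q_n| < 1/(L(x)q_n^2)$ exactly when $\lambda_{n+1}(x) > L(x)$, so the code compares $\lambda_{n+1}(x)$, as defined in Section 5.5, with $L(x) = \sqrt{221}/5$. Infinite tails are truncated and evaluated in floating point, and only small ranks are examined because the gap to $L(x)$ shrinks quickly.

```python
import math

PERIOD = [2, 2, 1, 1]  # assumed: chi(w~) for w = ab, with chi(a) = 11, chi(b) = 22


def a(i):
    """Partial quotient a_i of x = [2, 2, 1, 1, 2, 2, 1, 1, ...]."""
    return PERIOD[i % 4]


def tail_value(start, depth=80):
    """Float approximation of [a_start, a_{start+1}, ...]."""
    value = float(a(start + depth))
    for i in range(start + depth - 1, start - 1, -1):
        value = a(i) + 1.0 / value
    return value


def head_value(n):
    """Float value of [a_n, a_{n-1}, ..., a_1]."""
    value = float(a(1))
    for i in range(2, n + 1):
        value = a(i) + 1.0 / value
    return value


def lagrange_lambda(n):
    """lambda_n(x) = [a_n, a_{n+1}, ...] + 1/[a_{n-1}, ..., a_1], lambda_1 = x_1."""
    if n == 1:
        return tail_value(1)
    return tail_value(n) + 1.0 / head_value(n - 1)


L = math.sqrt(221) / 5  # sqrt(9 - 4/m^2) with m = 5

for n in range(9):
    kind = "good" if (n + 1) % 4 == 0 else "bad"
    comparison = ">" if lagrange_lambda(n + 1) > L else "<"
    print(n, kind, f"lambda_{n + 1} {comparison} L")
```

In this example no rank needs to be excluded: the comparison goes the expected way from $n = 0$ on, in agreement with Lemma 8.3.4 applied with $n_0 = 0$.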

Proof of Theorem 8.3.3 We take the notations given at the beginning of the section. We know that $a_0 s = (a_n)_{n \in \mathbb{N}}$ is an ultimately periodic sequence with periodic pattern $p$ and $\tilde{p}$. Denote the periodic bi-infinite sequence which coincides with $a_0 s$ for large enough $n$ by $t = (b_n)_{n \in \mathbb{Z}}$. Then $t$ has the same periodic patterns: $t = {}^\infty p^\infty = {}^\infty \tilde{p}^\infty$ up to the shift. If $i$ is the position of the first letter in some $\tilde{p}$ within $t$, or of the last letter of some $p$ within $t$, then $\lambda_i(t) = L(t)$ (see the proof of Theorem 6.2.1). We know also that $L(t) = L(x) = \sqrt{9 - 4/m^2}$.

We treat only the case where $x$ is of type II; the other type is similar. Suppose first that $s = u\tilde{p}^\infty$ and $u$ is a suffix of odd length of $\tilde{p}$. Note that $a_0 s = a_0 u \tilde{p}^\infty$, with $a_0 u$ of even length. Assume that $n$ is good, that is, $i = n + 1$ is the position of the first letter of some $\tilde{p}$ in the sequence $a_0 s$. Note that this position is even, since $u$ is of odd length and $\tilde{p}$ of even length. Then by Lemma 8.3.4 (with $n_0 = 0$, $s$ replaced by $a_0 s$, and $n$ replaced by $n + 1$) $\lambda_{n+1}(a_0 s) > \lambda_{n+1}(t) = L(x)$, and it follows from Lemma 8.2.4 that $|x - p_n/q_n| < 1/(L(x)q_n^2)$.

Assume now that $n$ is bad. We have two cases, depending on the parity of $n$. Assume that $n + 1$ is even and larger than the length of $p$. Then $a_{n+1}a_{n+2} \cdots$ has a prefix of the form $\chi(v)$ for some conjugate $v$ of $w$, different from $\tilde{w}$, and $a_n a_{n-1} \cdots a_1$ has as prefix $\chi(\tilde{v})$. By Theorem 6.2.2, one has w v.

Consider their expansion into continued fractions $u = [a_0, a_1, \ldots]$, $v = [b_0, b_1, \ldots]$. Let $n$ be the smallest integer such that $a_n \neq b_n$. We show by induction on $n$ that, taking an equivalent form, we may assume that $u > 0 > v$. If $n = 0$, then $a_0 > b_0$ and therefore $u > a_0 > v$; thus $u - a_0 > 0 > v - a_0$. By the lemma, these two numbers are the roots of the quadratic form $g$ such that $f = g \cdot \begin{pmatrix} 1 & -a_0 \\ 0 & 1 \end{pmatrix}$. Suppose now that $n > 0$, that is, $a_0 = b_0$; then with $h(x, y) = f(a_0 x + y, x)$, the roots of $h$ are the numbers $\frac{1}{u - a_0}$ and $\frac{1}{v - a_0}$. Their expansions into continued fractions are $[a_1, a_2, \ldots]$ and $[b_1, b_2, \ldots]$, and we conclude by induction on $n$.

Thus we are reduced to a quadratic form $f$ having a positive root $u$ and a negative root $v$. If one of them at least has absolute value $> 1$, then the interval $[v, u]$ contains at least two integers. If their absolute values are both $< 1$, we replace $f(x, y)$ with $f(y, x)$, obtaining an equivalent quadratic form whose roots are the inverses of the previous ones, hence having different signs and both having absolute values $> 1$.

Thus we may assume that, the two roots being $u > v$, the interval $[v, u]$ contains at least two integers. Let $h$ be the upper integer part of $v$. Then $-1 < v - h < 0$ and $u - h > 1$. These two numbers are, by the lemma, the roots of the quadratic form $g$ such that $f = g \cdot M$, where $M = \begin{pmatrix} 1 & -h \\ 0 & 1 \end{pmatrix}$, which concludes the proof. □

9.2 Infimum

Let $f_0 = f = \alpha x^2 + \beta xy + \gamma y^2$ be an indefinite binary quadratic form, such that $f(x, y) \neq 0$ for any integers $x, y$ that are not both zero; let $\xi$ and $-\frac{1}{\eta}$ be the roots of the polynomial $f(x, 1)$, and assume that $\xi, \eta > 1$, as allowed by the conclusion of Theorem 9.1.1. By hypothesis, $\xi$ and $\eta$ are irrational real numbers, and we consider their expansions into continued fractions $\xi = [a_0, a_1, a_2, \ldots]$, $\eta = [a_{-1}, a_{-2}, a_{-3}, \ldots]$, which define the bi-infinite word $(a_n)_{n \in \mathbb{Z}}$. Note that all the $a_n$ are positive.

Recall that $P(a) = \begin{pmatrix} a & 1 \\ 1 & 0 \end{pmatrix}$. Consider the bi-infinite sequence of quadratic forms $f_i$, such that for any $n \in \mathbb{Z}$ one has $f_{n+1} = f_n \cdot P(a_n)$. It exists and is unique, since we must have $f_1 = f_0 \cdot P(a_0)$, $f_2 = f_1 \cdot P(a_1)$, $\ldots$, and $f_{-1} = f_0 \cdot P(a_{-1})^{-1}$, $f_{-2} = f_{-1} \cdot P(a_{-2})^{-1}$, $\ldots$. Denote $f_n = f_n(x, y) = \alpha_n x^2 + \beta_n xy + \gamma_n y^2$.

Theorem 9.2.1 (Markoff [M1], p. 388) The infimum of the form $f$ is equal to the infimum of the real numbers $|\alpha_n|$, $n \in \mathbb{Z}$.

Lemma 9.2.2 For $n \in \mathbb{Z}$, let $\xi_n = [a_n, a_{n+1}, \ldots]$ and $\eta_n = [a_{n-1}, a_{n-2}, \ldots]$. Then $\xi_n$ and $\frac{-1}{\eta_n}$ are the roots of the quadratic form $f_n$.

Proof One has $\xi_{n+1} = \frac{1}{\xi_n - a_n}$ and $f_{n+1} = f_n \cdot P(a_n)$, so that $f_n = f_{n+1} \cdot P(a_n)^{-1} = f_{n+1} \cdot \begin{pmatrix} 0 & 1 \\ 1 & -a_n \end{pmatrix}$. This shows by Lemma 9.1.2 that if $\xi_n$ is a root of $f_n$, then $\xi_{n+1}$ is a root of $f_{n+1}$. We conclude by induction that $\xi_n$ is a root of $f_n$ for any $n \in \mathbb{N}$. For $n < 0$ and $\frac{-1}{\eta_n}$ the proof is similar. □
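The recurrence $f_{n+1} = f_n \cdot P(a_n)$ is easy to experiment with. The sketch below is an illustration under an assumption: it takes the action to be $(f \cdot P(a))(x, y) = f(ax + y, x)$, which matches the identity $f(x, y) = f_{-1}(a_{-1}x + y, x)$ used in the proof of Theorem 9.2.1. A form is stored as its coefficient triple $(\alpha, \beta, \gamma)$; the two sample forms correspond to all $a_n = 1$ and all $a_n = 2$ and are chosen for illustration.

```python
def act_by_P(form, a):
    """Coefficients of f . P(a), that is, of f(a*x + y, x)."""
    alpha, beta, gamma = form
    # Expand alpha*(a x + y)^2 + beta*(a x + y)*x + gamma*x^2.
    return (alpha * a * a + beta * a + gamma, 2 * alpha * a + beta, alpha)


def discriminant(form):
    alpha, beta, gamma = form
    return beta * beta - 4 * alpha * gamma


def leading_coefficients(form, quotients):
    """The sequence alpha_0, alpha_1, ... along f_{n+1} = f_n . P(a_n)."""
    alphas = []
    for a in quotients:
        alphas.append(form[0])
        form = act_by_P(form, a)
    return alphas


golden_form = (1, -1, -1)  # x^2 - xy - y^2, roots of f(x, 1): (1 +- sqrt(5))/2
silver_form = (1, -2, -1)  # x^2 - 2xy - y^2, roots of f(x, 1): 1 +- sqrt(2)

print(leading_coefficients(golden_form, [1] * 6))
print(leading_coefficients(silver_form, [2] * 6))
print(discriminant(golden_form), discriminant(act_by_P(golden_form, 1)))
```

In both examples the coefficients $\alpha_n$ alternate between $1$ and $-1$, so the infimum of the $|\alpha_n|$ is 1, and the discriminant is unchanged by the action, as expected for matrices of determinant $\pm 1$.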

Proof of Theorem 9.2.1, following Markoff [M1], pp. 386–388 Since the $f_n$ are all equivalent, they have the same infimum. We have $f_n(1, 0) = \alpha_n$, so that the infimum of $f$ is $\leq |\alpha_n|$.

Suppose first that all $a_n$ are equal to 1. We have $\xi = \frac{\sqrt{5}+1}{2} = \eta$. The two numbers $\xi$ and $-\frac{1}{\eta} = \frac{1-\sqrt{5}}{2}$ are conjugate and are the roots of $x^2 - x - 1$, so that, multiplying $f$ if necessary by a nonzero factor, we may assume that $f(x, y) = x^2 - xy - y^2$. The infimum of this form is 1, since the values of $f$ for $x, y$ integers are integers. Hence the theorem follows in this case, since all $\alpha_n$ are nonzero integers and $\alpha_0 = 1$.

Suppose now that at least one of the $a_n$ is $\geq 2$. By shifting the bi-infinite sequences $(a_n)$ and $(f_n)$ using Lemma 9.2.2, we may assume that $a_0 \geq 2$, so that $\xi > 2$. Suppose that $|f(x, y)| \leq |\alpha_0|$ for some integers $x, y$, not both zero. We show that $|f(x, y)| \geq |\alpha_n|$ for some $n \in \mathbb{Z}$. This will imply the theorem.

If $y = 0$, this is clear, since then $f(x, y) = \alpha_0 x^2$ and therefore $|f(x, y)| \geq |\alpha_0|$, since $x$ is a nonzero integer.

Suppose that $x = 0$. Note that $f = f_{-1} \cdot P(a_{-1}) = f_{-1} \cdot \begin{pmatrix} a_{-1} & 1 \\ 1 & 0 \end{pmatrix}$, that is, $f(x, y) = f_{-1}(a_{-1} x + y, x)$. Thus $f(x, y) = f(0, y) = f_{-1}(y, 0) = \alpha_{-1} y^2$. Thus $|f(x, y)| \geq |\alpha_{-1}|$, since $y$ is a nonzero integer.

We may therefore assume that $x, y$ are both nonzero, and we use the equation

\[ f(x, y) = \alpha_0 y^2 \left( \frac{x}{y} - \xi \right) \left( \frac{x}{y} + \frac{1}{\eta} \right). \tag{9.1} \]

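Suppose first that $xy > 0$. We may assume that $x, y > 0$. By contradiction, assume that $\frac{x}{y} < 2$; then $\frac{x}{y} < \xi$, thus

\[ |f(x, y)| = \left| \alpha_0 y^2 \left( \frac{x}{y} - \xi \right) \left( \frac{x}{y} + \frac{1}{\eta} \right) \right| > |\alpha_0| \, y^2 \left( 2 - \frac{x}{y} \right) \frac{x}{y}, \]

since $\left| \frac{x}{y} - \xi \right| = \xi - \frac{x}{y} > 2 - \frac{x}{y}$ and $\frac{1}{\eta} > 0$. We conclude that $|f(x, y)| > |\alpha_0| (2y - x) x \geq |\alpha_0|$ (because $x, y$ are integers and since $2y - x > 0$), contradicting $|f(x, y)| \leq |\alpha_0|$. We therefore have $\frac{x}{y} \geq 2$. It follows that $\frac{x}{y} + \frac{1}{\eta} > 2$. The hypothesis $|f(x, y)| \leq |\alpha_0|$ with Eq. (9.1) then implies that $y^2 \left| \xi - \frac{x}{y} \right| \left( \frac{x}{y} + \frac{1}{\eta} \right) \leq 1$, hence $\left| \xi - \frac{x}{y} \right|$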
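2, Eq. (9.1) implies that $\left| \frac{x}{y} + \frac{1}{\eta} \right| < \frac{1}{2y^2}$, so that $-\frac{x}{y}$ is a convergent of $\frac{1}{\eta} = [0, a_{-1}, a_{-2}, a_{-3}, \ldots]$. Denote by $p_n/q_n$ the convergents of this number. Then $(x, y) = d(-p_n, q_n)$ for some nonzero integer $d$ and some $n \geq 1$ ($n \neq 0$, since $x/y \neq 0$ and $p_0/q_0 = 0$). Then $f(x, y) = d^2 f(-p_n, q_n)$ and we show next that $f(-p_n, q_n) = \alpha_{-n-1}$, which will imply that $|f(x, y)| \geq |\alpha_{-n-1}|$.

We have by Eq. (5.2)

\[ \begin{pmatrix} p_{n+1} & p_n \\ q_{n+1} & q_n \end{pmatrix} = P(0) P(a_{-1}) \cdots P(a_{-n-1}). \]

Transposing and taking inverses, we obtain, since the matrices $P(a)$ are symmetric and of determinant $-1$,

\[ (-1)^n \begin{pmatrix} q_n & -q_{n+1} \\ -p_n & p_{n+1} \end{pmatrix} = P(0)^{-1} P(a_{-1})^{-1} \cdots P(a_{-n-1})^{-1}. \]

Since

\[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} q_n & -q_{n+1} \\ -p_n & p_{n+1} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} q_n \\ -p_n \end{pmatrix} = \begin{pmatrix} -p_n \\ q_n \end{pmatrix}, \]

multiplying on the left by $P(0)$ and on the right by $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, we see that

\[ (-1)^n \begin{pmatrix} -p_n \\ q_n \end{pmatrix} = P(a_{-1})^{-1} \cdots P(a_{-n-1})^{-1} \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]

Now we have $f_{-n-1} = f \cdot P(a_{-1})^{-1} \cdots P(a_{-n-1})^{-1}$, so that $f_{-n-1}(1, 0) = f\left( P(a_{-1})^{-1} \cdots P(a_{-n-1})^{-1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right)$. This implies that $\alpha_{-n-1} = f_{-n-1}(1, 0) = f(-p_n, q_n)$. □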

Corollary 9.2.3 Let $f$ be as in Theorem 9.2.1. Then $\frac{\sqrt{d(f)}}{L(f)} = M(s)$, the Markoff supremum of the bi-infinite sequence $s = (a_n)_{n \in \mathbb{Z}}$.

Proof By the theorem, $L(f) = \inf_{n \in \mathbb{Z}} |\alpha_n|$. Since by Lemma 9.2.2 $\xi_n$ and $-\frac{1}{\eta_n}$ are the roots of the polynomial $\alpha_n x^2 + \beta_n x + \gamma_n$, their difference is equal, up to the sign, to $\frac{\sqrt{d(f_n)}}{|\alpha_n|}$; thus one has $\xi_n + \frac{1}{\eta_n} = \frac{\sqrt{d(f_n)}}{|\alpha_n|} = \frac{\sqrt{d(f)}}{|\alpha_n|}$, since the first sum is positive. Thus, if $L(f) > 0$, we obtain that $\frac{\sqrt{d(f)}}{L(f)} = \sup_{n \in \mathbb{Z}} \frac{\sqrt{d(f)}}{|\alpha_n|} = \sup_{n \in \mathbb{Z}} \left( \xi_n + \frac{1}{\eta_n} \right) = M(s)$. If $L(f) = 0$, then $\inf(|\alpha_n|) = 0$, hence $M(s) = \sup\left( \xi_n + \frac{1}{\eta_n} \right) = \sup \frac{\sqrt{d(f)}}{|\alpha_n|} = \infty$, and the result is still true. □

9.3 Markoff’s Theorem for Quadratic Forms

Let $w$ be a lower Christoffel word and let $\mu(w) = \begin{pmatrix} p & q \\ r & s \end{pmatrix}$, where $q = m$ is the Markoff number associated with $w$ and therefore $m = \frac{1}{3}(p + s)$; see Section 3.3. Define the associated Markoff quadratic form by $f_w(x, y) = m x^2 + (s - p) xy - r y^2$.

Theorem 9.3.1 Let $f(x, y)$ be an indefinite binary quadratic form. Assume that its infimum $L(f)$ and its discriminant $d(f)$ satisfy the inequality $\sqrt{d(f)} < 3 L(f)$. Then $f$ is equivalent to a multiple of some Markoff form $f_w$. Let $m$ be the Markoff number associated with the lower Christoffel word $w$. One has $d(f_w) = 9m^2 - 4$, $L(f_w) = f_w(1, 0) = m$, so that the infimums of $f_w$ and $f$ are attained, and $\frac{\sqrt{d(f)}}{L(f)} = \frac{\sqrt{d(f_w)}}{L(f_w)} = \sqrt{9 - \frac{4}{m^2}}$.

Lemma 9.3.2 Let $s = (a_n)_{n \in \mathbb{Z}}$ be a bi-infinite word over P whose Markoff supremum $M(s)$ is less than 3. Then, for some lower Christoffel word $w$, $s$ is equal up to a shift to ${}^{\infty}\chi(w)^{\infty}$.

Proof This follows from Theorem 7.2.1 and Theorem 4.2.1. □

Proof of Theorem 9.3.1 The hypothesis implies that $L(f) \neq 0$; in particular, $f(x, y) \neq 0$ for any integers $x, y$ that are not both zero. By Theorem 9.1.1, we may assume that the roots of $f$ are distinct, and that one is in $(-1, 0)$ and the other is $> 1$. Taking the notations of Section 9.2, we may therefore assume that $f = f_0$. Let $s = (a_n)_{n \in \mathbb{Z}}$ and define its shift $s_k$ by $s_k = (a_{n+k})_{n \in \mathbb{Z}}$. By Corollary 9.2.3 and the hypothesis, $M(s) < 3$. From Lemma 9.3.2 we deduce that for some $k$, $s_k = {}^{\infty}\chi(w)^{\infty}$, where $w$ is a lower Christoffel word. Since $w$ and $\tilde{w}$ are conjugate by Theorem 6.2.2, we also have $s_l = {}^{\infty}\chi(\tilde{w})^{\infty}$ for some $l$. Now $f$ is equivalent to $f_l$. Moreover, by Lemma 9.2.2, the expansion into continued fractions of the root $\xi_l$ of $f_l$ is $[\chi(\tilde{w})^{\infty}]$, so that by Theorem 6.1.1 it satisfies the equation $\xi_l = \frac{p \xi_l + r}{q \xi_l + s}$, where $\mu(w) = \begin{pmatrix} p & q \\ r & s \end{pmatrix}$. The other root of $f_l$ is $-\frac{1}{[\chi(w)^{\infty}]}$, which is the conjugate of $\xi_l$ by the same theorem. Thus $f_l$ is proportional to the form $q x^2 + (s - p) xy - r y^2 = f_w$.

We have $d(f_w) = (s - p)^2 + 4qr = (s + p)^2 - 4ps + 4qr = 9m^2 - 4$, since $p + s = 3m$ and since the matrix $\mu(\tilde{w})$ has determinant 1. It follows from Theorem 6.2.1 that $M(s) = \sqrt{9 - \frac{4}{m^2}}$. Thus, by Corollary 9.2.3, $\frac{\sqrt{d(f_w)}}{L(f_w)} = \sqrt{9 - \frac{4}{m^2}}$. Thus $L(f_w) = \frac{\sqrt{9m^2 - 4}}{\sqrt{9 - \frac{4}{m^2}}} = m = f_w(1, 0)$, which ends the proof. □
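As a numerical illustration of Theorem 9.2.1 and Theorem 9.3.1, the following Python sketch follows the forms $f_n$ attached to the Markoff form of $w = ab$, using $\mu(ab) = \begin{pmatrix} 12 & 5 \\ 7 & 3 \end{pmatrix}$. The function names are hypothetical, and the period $(2, 2, 1, 1)$ of the continued fraction of $\xi = (9 + \sqrt{221})/10$ is an assumption of this example (computed by hand), not a statement taken from the text.

```python
def markoff_form(mu_w):
    """Coefficients (alpha, beta, gamma) of f_w = m x^2 + (s - p) x y - r y^2,
    where mu_w = ((p, m), (r, s))."""
    (p, m), (r, s) = mu_w
    return (m, s - p, -r)


def discriminant(f):
    """d(f) = beta^2 - 4 * alpha * gamma."""
    alpha, beta, gamma = f
    return beta * beta - 4 * alpha * gamma


def act_P(f, a):
    """The form f . P(a), that is (x, y) -> f(a*x + y, x)."""
    alpha, beta, gamma = f
    return (alpha * a * a + beta * a + gamma, 2 * alpha * a + beta, alpha)


def forms_over_period(f, period):
    """Return f_0, f_1, ..., f_k for the partial quotients a_0, ..., a_{k-1}."""
    forms = [f]
    for a in period:
        forms.append(act_P(forms[-1], a))
    return forms


f_ab = markoff_form(((12, 5), (7, 3)))         # 5x^2 - 9xy - 7y^2
chain = forms_over_period(f_ab, (2, 2, 1, 1))  # assumed period of xi
for n, f in enumerate(chain):
    print(n, f)
print("d(f) =", discriminant(f_ab))
print("min |alpha_n| =", min(abs(f[0]) for f in chain))
```

In this run the chain returns to $f_0$ after four steps, so the values $|\alpha_n|$ over one period cover all $n \in \mathbb{Z}$; their minimum is $m = 5$ and the discriminant is $9m^2 - 4 = 221$, in agreement with the statement of Theorem 9.3.1.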

Markoff’s theorem is often stated as a series of progressively better inequalities relating $d(f)$ and $L(f)$, with exceptions where they do not hold, the first exception being the quadratic form $x^2 - xy - y^2$ and the forms equivalent to a multiple of this. Formally, this is the following result:

Corollary 9.3.3 Let $M$ be a finite set of Markoff numbers, such that for any Markoff numbers $n, p$, if $n < p$ and $p \in M$, then $n \in M$. Let $W$ be the finite set of all lower Christoffel words corresponding to the Markoff numbers in $M$ (in other words, $W$ is the set of all lower Christoffel words $w$ such that $\mu(w)_{12} \in M$). Let $m$ be the smallest Markoff number not in $M$. Then for each indefinite binary quadratic form $f(x, y)$ not equivalent to a multiple of any form $f_w$, $w \in W$, one has $\sqrt{d(f)} \geq \sqrt{9 - \frac{4}{m^2}} \, L(f)$.

We give several examples. First, let $M = \emptyset$. Then $W = \emptyset$, $m = 1$, and $\sqrt{9 - \frac{4}{m^2}} = \sqrt{5}$. We obtain a result due to Korkine and Zolotareff [KZ].

Corollary 9.3.4 Let $f(x, y)$ be an indefinite binary quadratic form. Then $\sqrt{d(f)} \geq \sqrt{5} \, L(f)$.

Now let $M = \{1\}$. Then $W = \{a\}$, $m = 2$, $\sqrt{9 - \frac{4}{m^2}} = \sqrt{8}$. We have $\mu(a) = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$, so that $f_a = x^2 - xy - y^2$. Thus we obtain the following result, also due to Korkine and Zolotareff [KZ].

Corollary 9.3.5 Let $f(x, y)$ be an indefinite binary quadratic form. If $f$ is not equivalent to a multiple of the form $f_a = x^2 - xy - y^2$, then $\sqrt{d(f)} \geq \sqrt{8} \, L(f)$.

These two results are mentioned by Markoff [M1], who gives them as the motivation of his own work. The next example is $M = \{1, 2\}$. Then $W = \{a, b\}$, $m = 5$, $\sqrt{9 - \frac{4}{m^2}} = \frac{\sqrt{221}}{5}$. Since $\mu(b) = \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix}$, we have $f_b = 2x^2 - 4xy - 2y^2$. Thus for each form equivalent neither to a multiple of $f_a = x^2 - xy - y^2$ nor of $f_b = 2x^2 - 4xy - 2y^2$, one has $\sqrt{d(f)} \geq \frac{\sqrt{221}}{5} L(f)$.

Proof of Corollary 9.3.3 Suppose first that $\sqrt{d(f)} \geq 3 L(f)$. Then the conclusion holds, since $\sqrt{9 - \frac{4}{m^2}} < 3$. Suppose now that $\sqrt{d(f)} < 3 L(f)$. Then by Theorem 9.3.1, $f$ is equivalent to a multiple of some $f_v$, $v$ a lower Christoffel word, with associated Markoff number $n$, and $\frac{\sqrt{d(f)}}{L(f)} = \sqrt{9 - \frac{4}{n^2}}$. By hypothesis, we must have $v \notin W$. Thus $n \geq m$, and consequently $\sqrt{9 - \frac{4}{n^2}} \geq \sqrt{9 - \frac{4}{m^2}}$, which ends the proof. □







10 Numerology

This chapter gives several examples which may help the reader to work in concrete terms with Markoff numbers, Christoffel words, Markoff constants, and quadratic forms. Some results of Frobenius, Aigner, and Clemens are also given.

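10.1 Thirteen Markoff Numbers

The thirteen smallest Markoff numbers are 1, 2, 5, 13, 29, 34, 89, 169, 194, 233, 433, 610, 985, which are also those smaller than 1000. The corresponding Markoff constants $\sqrt{9 - \frac{4}{m^2}} = \frac{\sqrt{9m^2 - 4}}{m}$ are respectively

\[ \sqrt{5},\quad \sqrt{8},\quad \frac{\sqrt{221}}{5},\quad \frac{\sqrt{1517}}{13},\quad \frac{\sqrt{7565}}{29},\quad \frac{\sqrt{10400}}{34} = \frac{\sqrt{2600}}{17}, \]
\[ \frac{\sqrt{71285}}{89},\quad \frac{\sqrt{257045}}{169},\quad \frac{\sqrt{338720}}{194} = \frac{\sqrt{84680}}{97},\quad \frac{\sqrt{488597}}{233}, \]
\[ \frac{\sqrt{1687397}}{433},\quad \frac{\sqrt{3348896}}{610} = \frac{\sqrt{837224}}{305},\quad \frac{\sqrt{8732021}}{985}. \]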



The thirteen lower Christoffel words corresponding to the thirteen previous Markoff numbers are respectively $a$, $b$, $ab$, $a^2b$, $ab^2$, $a^3b$, $a^4b$, $ab^3$, $a^2bab$, $a^5b$, $abab^2$, $a^6b$, $ab^4$. See Figures 2.10 and 3.1.
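The list of Markoff numbers below 1000 can be recomputed with a short Python sketch. It assumes the Markoff equation $a^2 + b^2 + c^2 = 3abc$ of Section 3.1 and the usual neighbour rule that replaces one element $x$ of a triple by $3yz - x$; the function names are hypothetical.

```python
from math import sqrt


def markoff_numbers_below(bound):
    """Sorted Markoff numbers < bound, found by walking the tree of
    Markoff triples from (1, 1, 1) with the rule x -> 3*y*z - x."""
    seen, found = set(), set()
    stack = [(1, 1, 1)]
    while stack:
        a, b, c = sorted(stack.pop())
        if (a, b, c) in seen or c >= bound:
            continue  # already visited, or the largest element is too big
        seen.add((a, b, c))
        found.update((a, b, c))
        stack.extend([(3 * b * c - a, b, c),
                      (a, 3 * a * c - b, c),
                      (a, b, 3 * a * b - c)])
    return sorted(found)


def markoff_constant(m):
    """The Markoff constant sqrt(9 - 4/m^2) = sqrt(9m^2 - 4)/m."""
    return sqrt(9 * m * m - 4) / m


numbers = markoff_numbers_below(1000)
print(numbers)
for m in numbers:
    print(m, 9 * m * m - 4, markoff_constant(m))
```

The second column printed is the radicand $9m^2 - 4$, which can be compared with the numerators listed above; all the constants are smaller than 3 and increase with $m$.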

The Markoff spectrum is the set of real numbers of the form $\frac{\sqrt{d(f)}}{L(f)}$ where $f$ is a real binary indefinite quadratic form. The Lagrange spectrum is the set of real numbers of the form $L(x)$ where $x$ is a real irrational number. It follows from Theorems 8.2.1 and 9.3.1 that the intersection of both spectra with the interval $[0, 3)$ is equal to the set of Markoff constants $\sqrt{9 - \frac{4}{m^2}}$ where $m$ is a Markoff number. There are many results on these spectra which may be found in the extraordinary book of Cusick and Flahive [CF]. In particular, the Markoff spectrum contains strictly the Lagrange spectrum (loc. cit. Theorem 1, p. 35). As shown above, the intersections of the two spectra with the set of real numbers $< 3$ are both equal to a discrete subset of the reals. The intersections of both spectra with $[3, \infty)$ have gaps, and they both contain $[\sqrt{21}, \infty)$ (loc. cit. Theorems 8 and 9, p. 55).

From Christoffel Words to Markoff Numbers, Christophe Reutenauer. Oxford University Press (2019). © Christophe Reutenauer 2019. DOI: 10.1093/oso/9780198827542.001.0001

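10.2 The Golden Ratio and Other Numbers

The images of the thirteen lower Christoffel words of Section 10.1 under the homomorphism $\mu$ are the matrices

\[ \mu(a) = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix},\quad \mu(b) = \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix},\quad \mu(ab) = \begin{pmatrix} 12 & 5 \\ 7 & 3 \end{pmatrix}, \]
\[ \mu(a^2b) = \begin{pmatrix} 31 & 13 \\ 19 & 8 \end{pmatrix},\quad \mu(ab^2) = \begin{pmatrix} 70 & 29 \\ 41 & 17 \end{pmatrix},\quad \mu(a^3b) = \begin{pmatrix} 81 & 34 \\ 50 & 21 \end{pmatrix}, \]
\[ \mu(a^4b) = \begin{pmatrix} 212 & 89 \\ 131 & 55 \end{pmatrix},\quad \mu(ab^3) = \begin{pmatrix} 408 & 169 \\ 239 & 99 \end{pmatrix},\quad \mu(a^2bab) = \begin{pmatrix} 463 & 194 \\ 284 & 119 \end{pmatrix}, \]
\[ \mu(a^5b) = \begin{pmatrix} 555 & 233 \\ 343 & 144 \end{pmatrix},\quad \mu(abab^2) = \begin{pmatrix} 1045 & 433 \\ 613 & 254 \end{pmatrix}, \]
\[ \mu(a^6b) = \begin{pmatrix} 1453 & 610 \\ 898 & 377 \end{pmatrix},\quad \mu(ab^4) = \begin{pmatrix} 2378 & 985 \\ 1393 & 577 \end{pmatrix}. \]

For $w$, a lower Christoffel word, recall that the real quadratic number $x_w$, whose expansion into continued fractions is $\chi(\tilde{w})^{\infty}$, is equal to $x_w = \frac{p - s}{2m} + \frac{1}{2}\sqrt{9 - \frac{4}{m^2}} = \frac{p - s + \sqrt{9m^2 - 4}}{2m}$, with $\mu(w) = \begin{pmatrix} p & m \\ r & s \end{pmatrix}$ and $3m = p + s$; see Theorem 8.2.1. For $w$ equal to one of the thirteen words in Section 10.1, the sequence of $x_w$, in which the first $x_a$ is the golden ratio, is

\[ x_a = \frac{1 + \sqrt{5}}{2},\quad x_b = 1 + \sqrt{2},\quad x_{ab} = \frac{9 + \sqrt{221}}{10},\quad x_{a^2b} = \frac{23 + \sqrt{1517}}{26}, \]
\[ x_{ab^2} = \frac{53 + \sqrt{7565}}{58},\quad x_{a^3b} = \frac{15 + \sqrt{650}}{17},\quad x_{a^4b} = \frac{157 + \sqrt{71285}}{178}, \]
\[ x_{ab^3} = \frac{309 + \sqrt{257045}}{338},\quad x_{a^2bab} = \frac{172 + \sqrt{84680}}{194},\quad x_{a^5b} = \frac{411 + \sqrt{488597}}{466}, \]
\[ x_{abab^2} = \frac{791 + \sqrt{1687397}}{866},\quad x_{a^6b} = \frac{538 + \sqrt{837224}}{610},\quad x_{ab^4} = \frac{1801 + \sqrt{8732021}}{1970}. \]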
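The matrices and the numbers $x_w$ listed above can be recomputed from $\mu(a)$ and $\mu(b)$ alone, since $\mu$ is a homomorphism. The following Python sketch does this with exact integer arithmetic for the matrices and floating point for $x_w$; the helper names are hypothetical.

```python
from math import sqrt

MU = {"a": ((2, 1), (1, 1)), "b": ((5, 2), (2, 1))}


def mat_mul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def mu(word):
    """Image of a word over {a, b} under the homomorphism mu."""
    result = ((1, 0), (0, 1))
    for letter in word:
        result = mat_mul(result, MU[letter])
    return result


def x_w(word):
    """x_w = (p - s + sqrt(9m^2 - 4)) / (2m) with mu(w) = ((p, m), (r, s))."""
    (p, m), (r, s) = mu(word)
    return (p - s + sqrt(9 * m * m - 4)) / (2 * m)


words = ["a", "b", "ab", "aab", "abb", "aaab", "aaaab", "abbb",
         "aabab", "aaaaab", "ababb", "aaaaaab", "abbbb"]
for w in words:
    print(w, mu(w), x_w(w))
```

For each word the trace of $\mu(w)$ is three times the upper right entry, which gives a quick sanity check on the table.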

Observe that Theorem 8.2.1 and Corollary 8.2.2 imply that the golden ratio and the numbers which are equivalent to it (these are sometimes called noble numbers), are the irrational real numbers which have the worst approximations by rational numbers; they are the numbers whose Lagrange number is the minimum possible. This property is relevant in phyllotaxis, as shown in [BR].

10.3 The Matrices μ(w) and Frobenius Congruences

For any lower Christoffel word $w$, the matrix $\mu(w)$ has a special form which is given by the following result. It is a particular case of Theorem 4.13 in Aigner’s book [A] (the case $a = 2$; see also [A], Example 4.9), due to Clemens [Cle]. The existence of the numbers $u, v$ below was shown directly from the Markoff equation by Frobenius ([F], p. 602).

Theorem 10.3.1 Let $w$ be a proper lower Christoffel word with standard factorization $w = w_1 w_2$. Let $m, m_1, m_2$ be the Markoff numbers respectively associated with $w, w_1, w_2$. There exist unique natural integers $u, v$, such that $m_1 u \equiv m_2 \bmod m$, $0 < u < \frac{m}{2}$, and $u^2 + 1 = mv$. One has

\[ \mu(w) = \begin{pmatrix} 2m + u & m \\ 2m - u - v & m - u \end{pmatrix}. \]

Corollary 10.3.2 (variant of Markoff [M2]) $f_w(x, y) = m x^2 - (m + 2u) xy - (2m - u - v) y^2$.

Recall that we denote by $m_w$ the Markoff number associated with the lower Christoffel word $w$. By Theorem 3.1.1 and Lemma 3.1.2, we may write

\[ \mu(w) = \begin{pmatrix} a_w & m_w \\ c_w & 3m_w - a_w \end{pmatrix}. \]

Lemma 10.3.3 (following Aigner [A], Eq. (4.9)) If $w = w_1 w_2$ is a lower Christoffel word with its standard factorization, then $m_{w_1} = -a_w m_{w_2} + m_w a_{w_2}$ and $m_{w_2} = -a_{w_1} m_w + m_{w_1} a_w$.

Proof We have $\mu(w) = \mu(w_1)\mu(w_2)$; thus, since these matrices have determinant 1,

\[ \mu(w_1) = \mu(w)\mu(w_2)^{-1} = \begin{pmatrix} a_w & m_w \\ c_w & 3m_w - a_w \end{pmatrix} \begin{pmatrix} 3m_{w_2} - a_{w_2} & -m_{w_2} \\ -c_{w_2} & a_{w_2} \end{pmatrix} \]

and

\[ \mu(w_2) = \mu(w_1)^{-1}\mu(w) = \begin{pmatrix} 3m_{w_1} - a_{w_1} & -m_{w_1} \\ -c_{w_1} & a_{w_1} \end{pmatrix} \begin{pmatrix} a_w & m_w \\ c_w & 3m_w - a_w \end{pmatrix}. \]

It follows that $m_{w_1} = -a_w m_{w_2} + m_w a_{w_2}$ and $m_{w_2} = (3m_{w_1} - a_{w_1}) m_w - m_{w_1}(3m_w - a_w) = -a_{w_1} m_w + m_{w_1} a_w$. □
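The shape of $\mu(w)$ in Theorem 10.3.1 can be checked on small cases. The Python sketch below reads $u$ and $v$ off the matrix and tests the three conditions of the theorem. It assumes that the standard factorizations are exactly the pairs $(w_1, w_2)$ obtained from the root $(a, b)$ by the rules $(u, v) \mapsto (u, uv)$ and $(u, v) \mapsto (uv, v)$ of the tree of Christoffel pairs (Section 2.5); all function names are hypothetical.

```python
MU = {"a": ((2, 1), (1, 1)), "b": ((5, 2), (2, 1))}


def mat_mul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def mu(word):
    """Image of a word over {a, b} under the homomorphism mu."""
    result = ((1, 0), (0, 1))
    for letter in word:
        result = mat_mul(result, MU[letter])
    return result


def christoffel_pairs(depth):
    """Pairs (w1, w2) on the first `depth` levels of the tree rooted at (a, b)."""
    level, pairs = [("a", "b")], []
    for _ in range(depth):
        pairs.extend(level)
        level = [child for (u, v) in level for child in ((u, u + v), (u + v, v))]
    return pairs


def frobenius_data(w1, w2):
    """Return (m, m1, m2, u, v), with u and v read off mu(w1 w2)."""
    (p, m), (r, s) = mu(w1 + w2)
    u = p - 2 * m          # upper left entry is 2m + u
    v = 2 * m - u - r      # lower left entry is 2m - u - v
    return m, mu(w1)[0][1], mu(w2)[0][1], u, v


def satisfies_theorem(w1, w2):
    m, m1, m2, u, v = frobenius_data(w1, w2)
    s = mu(w1 + w2)[1][1]
    return (0 < 2 * u < m and u * u + 1 == m * v
            and (m1 * u - m2) % m == 0 and s == m - u)


for w1, w2 in christoffel_pairs(3):
    print(w1 + w2, frobenius_data(w1, w2), satisfies_theorem(w1, w2))
```

For instance, for $w = ab$ one finds $m = 5$, $u = 2$, $v = 1$, so that $\mu(ab) = \begin{pmatrix} 12 & 5 \\ 7 & 3 \end{pmatrix}$ has the announced form.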

Following a particular case of Definition 4.10 in [A], we define the index of $w$ as the number $I_w = \frac{a_w}{m_w}$. The following result is a particular case of Theorem 4.11 in Aigner’s book [A].

Lemma 10.3.4 For each proper lower Christoffel word $w$, one has $I_a = 2 < I_w$
0 if and only if it is a prefix of the infinite word $u^{\infty}$ for some word $u$ of length $p$. The period is nontrivial if $0 < p < n$, and we say that $w$ is periodic if it has a nontrivial period $p$. Each factor of the form $a_i a_{i+1} \cdots a_{i+p-1}$ is called a periodic pattern of $w$ ($i = 1, \ldots, n - p + 1$). A border of a word $w$ is a nontrivial proper prefix of $w$, which is also a suffix of $w$. If $w$ has no border, we say that $w$ is unbordered.

Lemma 12.2.1 (i) If $w$ is a prefix of $uw$, then $w$ has the period $|u|$. (ii) The word $w$ has a nontrivial period $p$ if and only if it has a border of length $|w| - p$.

Proof (i) Since 0 is certainly a period, we may assume that $u$ is nonempty. Then $w$ is a prefix of $uw$, hence $uw$ is a prefix of $uuw$, and so on; this implies that $w$ is a prefix of $u^{\infty}$, and therefore $w$ has the period $|u|$.

(ii) Assume that $w$ has a border $m$ of length $|w| - p$. Then $0 < p < |w|$, $w = um = mv$, and $u$ is of length $p$. Then $wv = umv = uw$, hence $w$ is a prefix of $uw$. By (i), $w$ has the nontrivial period $p$. Conversely, assume that $w$ has the nontrivial period $p$. Then, writing $w = a_1 \cdots a_n$ for some letters $a_i$, we have $a_1 \cdots a_{n-p} = a_{p+1} \cdots a_n$, which is therefore a border of $w$. □

Corollary 12.2.2 A word is unbordered if and only if it is not periodic.

Theorem 12.2.3 (de Luca and Mignosi [dLM]) A word in $\{a, b\}^*$ is central if and only if for some relatively prime natural numbers $p, q$ it has the two periods $p$ and $q$ and is of length $p + q - 2$.

Lemma 12.2.4 (de Luca [dL1], Lemma 3) If a palindrome has a palindromic suffix, then it has as period the length of the corresponding prefix.

Proof Let $w = us$, with $w, s$ palindromes. We may assume that $s$ is a proper nontrivial suffix. We have $w = \tilde{w} = \tilde{s}\tilde{u} = s\tilde{u}$. Thus $s$ is a border of $w$, and by Lemma 12.2.1 $w$ has the period $|u|$. □
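The correspondence between nontrivial periods and borders in Lemma 12.2.1 (ii) is easy to test by brute force. The following Python sketch uses hypothetical helper names and treats a period $p$ as the condition that letters at distance $p$ coincide, which is equivalent to being a prefix of $u^{\infty}$ with $|u| = p$.

```python
from itertools import product


def has_period(w, p):
    """True if letters of w at distance p always coincide (p >= 0)."""
    return all(w[i] == w[i + p] for i in range(len(w) - p))


def nontrivial_periods(w):
    """All periods p with 0 < p < |w|."""
    return [p for p in range(1, len(w)) if has_period(w, p)]


def borders(w):
    """All nonempty proper prefixes of w that are also suffixes of w."""
    return [w[:k] for k in range(1, len(w)) if w[:k] == w[-k:]]


def lemma_holds(w):
    """Nontrivial periods p correspond to borders of length |w| - p."""
    from_periods = sorted(len(w) - p for p in nontrivial_periods(w))
    return from_periods == sorted(len(b) for b in borders(w))


print(nontrivial_periods("aabaa"), borders("aabaa"))
print(all(lemma_holds("".join(t))
          for n in range(8) for t in product("ab", repeat=n)))
```

The word aabaa has the nontrivial periods 3 and 4, matching its borders aa and a of lengths 2 and 1; in particular it is periodic and bordered, as in Corollary 12.2.2.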

Lemma 12.2.5 Let $w = aw' \in \{a, b\}^*$ be a proper Christoffel word with standard factorization $w = uv$. Let $p$ be the prefix of length $|v| - 1$ of $w$. Then $w'p$ is a palindrome and has $v$ as periodic pattern and thus has the period $|v|$, which is nontrivial if $|w| \geq 3$.

Proof Suppose that $u, v$ are both proper Christoffel words. Then by Theorem 2.3.1, $u = amb$, $v = anb$, $w = ambanb$, where $m$, $n$, $mban$ are palindromes. Moreover, $w$ has by Theorem 12.1.8 the factorization into two palindromes $w = (ana)(bmb)$. Thus $p = an$, $w' = nabmb$, and $w'p = nabmban$ is a palindrome. Now, $w'p$ has $mban$ as a suffix, and $mban$ is a palindrome, as said above. Hence, by Lemma 12.2.4, $w'p$ has as a period the length of the corresponding prefix, that is, $nab$, which has the same length as $v = anb$. Note that $v$ is a proper suffix of $w$, hence a factor of $w'p$, which therefore has $v$ as a periodic pattern.

If $u, v$ are not both proper Christoffel words, then (by the last observation in Section 2.5) either $w = a^n b$, $n \geq 1$, $u = a$, $v = a^{n-1}b$, $w'p = a^{n-1}ba^{n-1}$, which has $v$ as periodic pattern, or $w = ab^n$, $n \geq 1$, $u = ab^{n-1}$, $v = b$, $w'p = b^n$, which has $v$ as a periodic pattern.

Finally, if $|w| \geq 3$, then $|v| < |v| + |w| - 2 = |p| + |w'| = |w'p|$, and the period is nontrivial. □

Lemma 12.2.6 Let $p, q$ be relatively prime natural numbers and $n = p + q$. The oriented graph with vertices $0, 1, \ldots, n - 1$ and arrows $i \to j$ if $j = i + q$ or $j = i - p$ is a cycle.

Proof Note that $q$ and $n$ are relatively prime. Since $q$ generates the cyclic group $\mathbb{Z}/n\mathbb{Z}$, the graph $G'$ with set of vertices $\mathbb{Z}/n\mathbb{Z} = \{0, 1, \ldots, n - 1\}$ and arrows $i \to j$ if $j \equiv i + q \bmod n$, is a cycle. This graph coincides with the graph $G$ of the lemma. Indeed, each arrow in $G$ is an arrow of $G'$, because $-p \equiv q \bmod n$. Moreover, for each arrow $i \to j$ in $G'$, one has either $i + q < n$ and then $j = i + q$, so that it is an arrow in $G$, or $i + q \geq n$, and then, since $i + q < 2n$, $j = i + q - n = i - p$ and it is an arrow in $G$, too. □

Corollary 12.2.7 Let $p, q$ be two relatively prime positive integers. The alphabet of a word $m$ of length $p + q - 2$ having the periods $p$ and $q$ has at most two elements. Let $m \in \{a, b\}^*$. If $p + q - 2 = 0$, $m$ is the empty word. If $p + q - 2 > 0$, then: (i) if $p$ or $q = 1$, $m$ is the power of a single letter; (ii) if $p, q > 1$, $m$ may take exactly four different values, two of them being as in (i) and the others having two letters, and each obtained from the other by exchange of these letters.

Proof Let $n = p + q > 2$ (the case $p + q - 2 = 0$ is immediate) and $m = a_1 \cdots a_{n-2}$. Let $H$ be the undirected graph obtained from the graph $G$ of the previous lemma by removing the vertices 0 and $n - 1$ and forgetting the orientations of the edges. Thus the vertices of $H$ are $1, \ldots, n - 2$. Let $1 \leq i, j \leq n - 2$ and suppose that there is an edge between $i$ and $j$ in $H$; we show that $a_i = a_j$. We may suppose that $i < j$; then $j = i + q$ or $i = j - p$, hence $a_i = a_j$ by the periodicity hypothesis. Thus, if $i, j$ are connected by a path in $H$, then $a_i = a_j$. By the previous lemma, this graph has one or two connected components. Thus the alphabet of the word must have one or two elements. In order to prove (i) and (ii), one notes that $H$ has only one connected component if and only if 0 and $n - 1$ are connected in $G$, that is, if and only if $p = 1$ or $q = 1$. □
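The count in Corollary 12.2.7 can be observed directly by enumeration. The Python sketch below lists the words over $\{a, b\}$ of length $p + q - 2$ having both periods; the function names are hypothetical.

```python
from itertools import product


def has_period(w, p):
    """True if letters of w at distance p always coincide."""
    return all(w[i] == w[i + p] for i in range(len(w) - p))


def words_with_periods(p, q, alphabet="ab"):
    """Words of length p + q - 2 over the alphabet with periods p and q."""
    n = p + q - 2
    candidates = ("".join(t) for t in product(alphabet, repeat=n))
    return [w for w in candidates if has_period(w, p) and has_period(w, q)]


print(words_with_periods(3, 5))  # case (ii): four words
print(words_with_periods(1, 4))  # case (i): powers of a single letter
```

For $p = 3$, $q = 5$ the two words using both letters are abaaba and babbab, each obtained from the other by exchanging $a$ and $b$.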

Lemma 12.2.8 If $w = amb$ is a Christoffel word with standard factorization $uv$, then $m$ has the relatively prime periods $|u|$ and $|v|$.

Proof Using Lemma 12.2.5, we see that $m$ has the period $|v|$. Symmetrically, it has the period $|u|$. We conclude using Corollary 2.5.1. □

Proof of Theorem 12.2.3 If $m$ is central, let $w$ be the Christoffel word $amb$. By the previous lemma, we may conclude, since $|m| = |u| + |v| - 2$. Conversely, let $p, q$ be two relatively prime natural integers and $m$ be a word of length $p + q - 2$ having the periods $p$ and $q$. We may assume that $m$ is not the power of a single letter (because such a word is central). Then $p, q \geq 2$. By the previous corollary, $m$ is unique, up to the exchange of $a$ and $b$. By Corollary 2.5.1, there exists a lower Christoffel word $w = uv$ (standard factorization) such that $p = |u|$ and $q = |v|$. By what has been proved above, we have $w = am'b$, where $m'$ is a central word having the periods $p$ and $q$. By uniqueness, $m = m'$ or $m = E(m')$. Hence $m$ is central, since $E$ exchanges lower and upper Christoffel words. □

For later use, we also state and prove the next result, which is a theorem due to Fine and Wilf [FW]. Corollary 12.2.9 Let p, q be two natural positive integers. If a word w of length at least p + q − gcd(p, q) has the periods p and q, then it has the period gcd(p, q).

Proof We may assume that $p, q$ are not both equal to 1. Assume first that $p, q$ are relatively prime and that $w$ is of length $n = p + q - 1$; note that $n \geq 2$. Let $H$ be the undirected graph obtained from the graph of Lemma 12.2.6 by removing the vertex 0 and forgetting the orientations of the edges. This graph is connected so that, arguing as in the proof of Corollary 12.2.7, $w$ must be the power of a single letter. If $w$ is longer, then by considering the successive factors of length $n$ (which is $\geq 2$), we see that $w$ is also the power of a single letter, and hence has period $1 = \gcd(p, q)$.

In the general case, we show that $w$ has the period $d = \gcd(p, q)$; equivalently, for each factorization $w = u x u' y u''$, where $x, y$ are letters and $u'$ is a word of length $d - 1$, one has $x = y$. It is enough to show that for each factorization $w = u_0 a_1 u_1 a_2 \cdots a_k u_k$, where $a_1, \ldots, a_k$ are letters and $u_0, \ldots, u_k$ are words such that $|u_0|, |u_k| \leq d - 1$ and $|u_i| = d - 1$ for $i = 1, \ldots, k - 1$, one has $a_1 = a_2 = \cdots = a_k$. Now one has $p + q - d \leq |w| \leq d - 1 + kd$, since $|a_i u_i| \leq d$. We deduce that the length $k$ of the word $w' = a_1 \cdots a_k$ satisfies $k \geq \frac{p}{d} + \frac{q}{d} - 1 - \frac{d - 1}{d}$, and therefore (since $\frac{p}{d} + \frac{q}{d} - 1$ is an integer) $k \geq \frac{p}{d} + \frac{q}{d} - 1$. Now $w'$ clearly has the periods $\frac{p}{d}$ and $\frac{q}{d}$, so that, by the previous $\gcd = 1$ case, $w'$ is the power of a single letter. □
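The theorem of Fine and Wilf stated in Corollary 12.2.9 can be checked exhaustively for small periods. The Python sketch below does so over a two-letter alphabet, at the critical length $p + q - \gcd(p, q)$; the function names are hypothetical.

```python
from itertools import product
from math import gcd


def has_period(w, p):
    """True if letters of w at distance p always coincide."""
    return all(w[i] == w[i + p] for i in range(len(w) - p))


def fine_wilf_holds(p, q, alphabet="ab"):
    """Every word of length p + q - gcd(p, q) with periods p and q
    also has the period gcd(p, q)."""
    d = gcd(p, q)
    words = ("".join(t) for t in product(alphabet, repeat=p + q - d))
    return all(has_period(w, d)
               for w in words if has_period(w, p) and has_period(w, q))


print(all(fine_wilf_holds(p, q) for p in range(1, 7) for q in range(1, 7)))
# One letter shorter, the conclusion may fail:
w = "abaaba"  # length 3 + 5 - 1 - 1
print(has_period(w, 3), has_period(w, 5), has_period(w, 1))
```

The word abaaba, of length $p + q - 2$ for $p = 3$, $q = 5$, shows that the bound on the length cannot be lowered in this case; it is one of the central words of Theorem 12.2.3.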

Theorem 12.2.10 (de Luca and Mignosi [dLM], Proposition 8) A word $m$ on the alphabet $\{a, b\}$ is central if and only if it is a palindrome and if $amb$ is a product of two palindromes.

The condition in the theorem is called property R in [dL1], since it goes back to Robinson [Ro].² Equivalently, $w = amb$ is a Christoffel word if and only if it is a product of two palindromes and if $m$ is a palindrome.

Proof If $m$ is central, then it is a palindrome; moreover, $amb$ is a Christoffel word which has a palindromic factorization by Theorem 12.1.8. Conversely, suppose that $m$ is a palindrome such that $amb = uv$, $u, v$ are palindromes. Since $amb$ is not a palindrome, $u, v$ are nonempty. If $v$ is of length 1, then $v = b$, $am = u$, and by Lemma 12.2.4, $am$ has the period 1, so is a power of $a$ and $m$ too; thus $m$ is central. Similarly if $u$ is of length 1. Thus we may assume that $v$ and $u$ are of a length of at least 2. Then $u = au'a$, $v = bv'b$, and $u', v'$ are palindromes. Then $amb = au'abv'b$, $m = u'abv'$. Then, by the same lemma, $m$ has the periods $p = |abv'|$ and $q = |u'ab|$, and we have $|m| = p + q - 2$. If $p, q$ are relatively prime, $m$ is central by Theorem 12.2.3. By contradiction, suppose that $p, q$ are not relatively prime; then by Corollary 12.2.9, $m$ has the period $d = \gcd(p, q) \geq 2$: $m$ is a prefix of $w^{\infty}$ for some word $w$ of length $d$. Then, since $m = u'abv'$ and $q = |u'ab|$ is a multiple of $d$, we have $u'ab = w^i$. Thus $w = w'ab$, and therefore $m = (w'ab)^j w'$, since $d$ divides $|m| + 2$. Note that $j > 0$, since $j = 0$ would imply that $m = w'$ and then $d - 2 = |m| = p + q - 2$, $d = p + q$, which cannot be true, since $p, q \geq 2$. But a word of the form $w'ab \cdots abw'$ cannot be a palindrome, since $ab \neq ba$. □

² Actually, in property R, the condition on $amb$ is replaced by the condition that $mba$ is the product of two palindromes; it is equivalent, since the property of being a product of two palindromes is conjugation-closed, as is well known.
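The two characterizations of central words, by periods (Theorem 12.2.3) and by palindromes (Theorem 12.2.10), can be compared by brute force on short words. The following Python sketch implements both tests with hypothetical function names and checks that they agree on all words of length at most 8.

```python
from itertools import product
from math import gcd


def is_palindrome(w):
    return w == w[::-1]


def has_period(w, p):
    """True if letters of w at distance p always coincide."""
    return all(w[i] == w[i + p] for i in range(len(w) - p))


def central_by_periods(m):
    """Theorem 12.2.3: periods p, q coprime with |m| = p + q - 2."""
    n = len(m) + 2
    return any(gcd(p, n - p) == 1 and has_period(m, p) and has_period(m, n - p)
               for p in range(1, n))


def central_by_palindromes(m):
    """Theorem 12.2.10: m is a palindrome and amb is a product of
    two (nonempty) palindromes."""
    w = "a" + m + "b"
    return is_palindrome(m) and any(
        is_palindrome(w[:k]) and is_palindrome(w[k:]) for k in range(1, len(w)))


words = ["".join(t) for n in range(9) for t in product("ab", repeat=n)]
print(all(central_by_periods(m) == central_by_palindromes(m) for m in words))
print([m for m in words if len(m) == 5 and central_by_periods(m)])
```

For example, aba is central (periods 2 and 3, and $a \cdot aba \cdot b = (aa)(bab)$), whereas the palindrome abba is not, since no coprime pair $p, q$ with $p + q = 6$ gives two periods of abba.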







13 Lyndon Words and Christoffel Words

Lower Christoffel words are particular Lyndon words. They are maximal for the lexicographical order among Lyndon words of given slope (Borel and Laubie). They are, together with the upper Christoffel words, the only unbordered (finite) Sturmian words (Chuan). Finally, they are the Lyndon words which are equilibrated (Melançon).

13.1 Slopes We begin by comparing Christoffel words according to their slope, and to the lexicographical order (defined in Section 5.6). Proposition 13.1.1 (Borel and Laubie [BL], p. 27) Let k, l be positive integers. Let u, v be two distinct lower Christoffel words of slope r, s respectively. Then r < s if and only if uk