Gröbner-Shirshov Bases: Normal Forms, Combinatorial And Decision Problems In Algebra 9789814619486, 9789814619493, 9789814619509

147 22 5MB

English Pages 308 Year 2020

Report DMCA / Copyright

DOWNLOAD FILE

Polecaj historie

Gröbner-Shirshov Bases: Normal Forms, Combinatorial And Decision Problems In Algebra
 9789814619486, 9789814619493, 9789814619509

Table of contents :
Contents
Preface
Chapter 1. Introduction
1.1. The Euclidean algorithm
1.2. The Gaussian elimination algorithm
1.3. Systems of polynomial equations
Chapter 2. Free algebras
2.1. Semigroups and groups
2.2. Noncommutative polynomials
2.3. Free commutative algebras
2.4. Free Lie algebras
Chapter 3. Composition-Diamond Lemma
3.1. Grobner bases for commutative algebras
3.2. Grobner–Shirshov bases for associative algebras
3.3. Grobner–Shirshov bases for tensor product of free algebras
3.4. Grobner–Shirshov bases for Lie algebras
Chapter 4. Applications of Grobner–Shirshov bases
4.1. Normal form for semigroups and groups
4.2. Free product of algebras and groups
4.3. Modules over associative algebras
4.4. Replicated algebras
4.5. Associative conformal algebras
4.6. Lifting of Grobner bases to Grobner–Shirshov bases
4.7. Lie algebra with unsolvable word problem
4.8. Grobner–Shirshov basis for the Drinfeld–Kohno Lie algebra
Chapter 5. Grobner-Shirshov bases for Lie algebras over a commutative algebra
5.1. Preliminaries
5.2. Composition-Diamond lemma for Liek[Y ](X)
5.3. Examples of Shirshov and Cartier
5.4. Cohn conjecture
5.5. Other applications
Chapter 6. Decision problems for groups
6.1. Decision problems
6.2. Computability
6.3. Group theoretic tools
6.4. Unsolvability of the word and conjugacy problems for finitely presented groups
6.5. The word problem and r.e. Turing degrees
6.6. The Higman embedding theorem
6.7. The conjugacy problem and r.e. Turing degrees
Chapter 7. (Co)Homology and Gr¨obner–Shirshov basis
7.1. Preliminaries
7.2. Algebraic discrete Morse theory
7.3. Group extensions
7.4. (Singular) Extensions of Algebras
Bibliography
Index

Citation preview

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

9287_9789814619486_tp.indd 1

3/6/20 10:11 AM

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

b2530   International Strategic Relations and China’s National Security: World at the Crossroads

b2530_FM.indd 6

This page intentionally left blank

01-Sep-16 11:03:06 AM

9287_9789814619486_tp.indd 2

3/6/20 10:11 AM

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data Names: Bokut, L. A. (Leonid Arkadevich), 1937- author. Title: Gröbner–Shirshov bases : normal forms, combinatorial, and decision problems in algebra / by Leonid Bokut (Sobolev Institute of Mathematics, Russia) [and three others]. Description: New Jersey : World Scientific, 2018. | Includes bibliographical references. Identifiers: LCCN 2018000569 | ISBN 9789814619486 (hardcover : alk. paper) | ISBN 9789814619493 (ebook) | ISBN 9789814619509 (ebook other) Subjects: LCSH: Grobner bases. | Commutative algebra. | Algebra. Classification: LCC QA251.3 G77 2018 | DDC 512/.44--dc23 LC record available at https://lccn.loc.gov/2018000569

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

Copyright © 2020 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

For any available supplementary material, please visit https://www.worldscientific.com/worldscibooks/10.1142/9287#t=suppl

Printed in Singapore

RokTing - 9287 - Grobner–Shirshov Bases.indd 1

22/5/2020 4:09:52 pm

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Contents Preface

vii

Chapter 1. Introduction 1.1. The Euclidean algorithm 1.2. The Gaussian elimination algorithm 1.3. Systems of polynomial equations

1 1 3 5

Chapter 2. Free algebras 2.1. Semigroups and groups 2.2. Noncommutative polynomials 2.3. Free commutative algebras 2.4. Free Lie algebras

9 9 11 12 14

Chapter 3. Composition-Diamond Lemma 3.1. Gr¨ obner bases for commutative algebras 3.2. Gr¨ obner–Shirshov bases for associative algebras 3.3. Gr¨ obner–Shirshov bases for tensor product of free algebras 3.4. Gr¨ obner–Shirshov bases for Lie algebras

29 29 34 39 47

Chapter 4. Applications of Gr¨ obner–Shirshov bases 4.1. Normal form for semigroups and groups 4.2. Free product of algebras and groups 4.3. Modules over associative algebras 4.4. Replicated algebras 4.5. Associative conformal algebras 4.6. Lifting of Gr¨ obner bases to Gr¨obner–Shirshov bases 4.7. Lie algebra with unsolvable word problem 4.8. Gr¨ obner–Shirshov basis for the Drinfeld–Kohno Lie algebra

53 53 57 60 62 73 80 87 89

Chapter 5. Gr¨obner-Shirshov bases for Lie algebras over a commutative algebra 5.1. Preliminaries 5.2. Composition-Diamond lemma for Liek[Y ] (X) 5.3. Examples of Shirshov and Cartier v

93 94 95 107

page v

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

vi

textbook

CONTENTS

5.4. Cohn conjecture 5.5. Other applications

109 112

Chapter 6. Decision problems for groups 6.1. Decision problems 6.2. Computability 6.3. Group theoretic tools 6.4. Unsolvability of the word and conjugacy problems for finitely presented groups 6.5. The word problem and r.e. Turing degrees 6.6. The Higman embedding theorem 6.7. The conjugacy problem and r.e. Turing degrees

115 115 116 131 146 149 154 162

Chapter 7. (Co)Homology and Gr¨ obner–Shirshov basis 7.1. Preliminaries 7.2. Algebraic discrete Morse theory 7.3. Group extensions 7.4. (Singular) Extensions of Algebras

185 186 205 224 257

Bibliography

265

Index

279

page vi

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Preface ˇ sov) in his pioneering work ([290], 1962) A.I. Shirshov (also spelled Sirˇ posed the following fundamental question: How to find a linear basis of a Lie algebra presented by generators and defining relations? He gave an infinite algorithm to solve this problem using a new notion of composition (later called “s-polynomial”, in Buchberger’s terminology [82, 83]) of two Lie polynomials and another new notion of completion of an arbitrary set of Lie polynomials (adding nontrivial compositions; the critical pair/completion (cpc-) algorithm in the later terminology of Knuth and Bendix [195] and Buchberger [85, 86]). Shirshov’s algorithm goes as follows. Consider a set S ⊂ Lie(X) of Lie polynomials in the free associative algebra kX generated by X over a field k (the algebra of non-commutative polynomials on X over k). Denote by S  the superset of S obtained by adding all non-trivial Lie compositions (“Lie s-polynomials”) of the elements of S. The problem of triviality of a Lie polynomial modulo a finite (or recursive) set S can be solved algorithmically using Shirshov’s Lie reduction algorithm from his previous paper [286], 1958. In general, an infinite sequence S ⊆ S  ⊆ S  = (S  ) ⊆ · · · ⊆ S (n) ⊆ . . . of Lie composition sets arises. The union S c of this sequence has the property that every Lie composition of elements of S c is trivial modulo S c . A set of Lie polynomials with such property now called a Lie Gr¨ obner– Shirshov (GS) basis. Then a new “Composition-Diamond1 lemma for Lie algebras” (Lemma 3 in [290]) implies that the set Irr(S c ) of all S c -irreducible (or S c -reduced) basic Lie monomials [u] in X is a linear basis of the Lie algebra Lie(X | S) generated by X with defining relations S. Here a basic Lie monomial means a Lie monomial in a special linear basis of the free Lie algebra Lie(X) ⊂ kX, known as the Lyndon–Shirshov (LS for short) basis (Shirshov [290] and Chen–Fox–Lyndon [92], see below). An LS monomial 1The name “Composition-Diamond lemma” combines the Neuman Diamond Lemma [244], the Shirshov Composition Lemma [290] and the Bergman Diamond Lemma [16].

vii

page vii

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

viii

9287 – Gr¨ obner-Shirshov Bases

textbook

PREFACE

[u] is called S c -irreducible (or S c -reduced) whenever u, the associative image of [u], does not contain the word s¯ for all s ∈ S, where s¯ is the maximal word of s as of an associative polynomial (in the deg-lex ordering). To be more precise, Shirshov used his reduction algorithm at each step S, S  , S  , (n) . . . . Then we have a directed system S → S  → S  → . . . and S c = lim −→S is what is now called a minimal GS basis (a minimal GS basis is not unique, but a reduced GS basis is, see below). As a result, Shirshov’s algorithm gives a solution to the above problem for Lie algebras. Shirshov’s algorithm, dealing with the word problem, is an infinite algorithm like the Knuth–Bendix algorithm [195], 1970, dealing with the identity problem2 for every variety of universal algebras. The initial data for the Knuth–Bendix algorithm is the defining identities of a variety. The output of the algorithm, if any, is a “Knuth–Bendix basis” of identities of the variety in the class of all universal algebras of a given signature (not a GS basis of defining relations, say, of a Lie algebra). Shirshov’s algorithm gives linear bases and algorithmic decidability of the word problem for one-relation Lie algebras, (recursive) linear bases for Lie algebras with (finite) homogeneous defining relations, and linear bases for free products of Lie algebras with known linear bases [290]. He also proved the Freiheitssatz (freeness theorem) for Lie algebras [289] (for every one-relation Lie algebra Lie(X | f ), its subalgebra generated by X \ {xi0 }, where xi0 appears in f , is a free Lie algebra). The Shirshov problem [290] on the decidability of the word problem for Lie algebras was solved negatively in [32]. More generally, it was proved that some recursively presented Lie algebras with undecidable word problem can be embedded into finitely presented Lie algebras (with undecidable word problem). It is a weak analogue of the Higman embedding theorem for groups [163]. The problem [32] whether an analogue of the Higman embedding theorem is valid for Lie algebras is still open. For associative algebras a similar problem was solved positively by V.Ya. Belyaev [14]. A simple example of a Lie algebra with undecidable word problem was given by G.P. Kukin [201]. A similar algorithm for associative algebras was implicit in Shirshov’s paper [290]. The reason is that he treats Lie(X) as the subspace of Lie polynomials in the free associative algebra kX. Then, to define a Lie composition of two Lie polynomials f and g relative to an associative word w which is a nontrivial common multiple of f¯ and g¯, he defines first the associative composition (“non-commutative s-polynomial”) (f, g)w = f b − ag, with a, b ∈ X ∗ . Then he inserts some brackets on (f, g)w to get [f b]f¯ − [ag]g¯ by applying his special bracketing lemma of [286]. We can obtain S c 2We use the standard algebraic terminology “the word problem”, “the identity problem”, see Kharlampovich, Sapir [192] for instance.

page viii

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

PREFACE

textbook

ix

for every S ⊂ kX in the same way as for Lie polynomials and in the same way as for Lie algebras (“CD-lemma for associative algebras”) to infer that Irr(S c ) is a linear basis of the associative algebra kX | S generated by X with defining relations S. All proofs are similar to those in [290] but much easier. Moreover, the cases of semigroups and groups presented by generators and defining relations are just special cases of associative algebras via semigroup and group algebras. To summarize, Shirshov’s algorithm gives linear bases and normal forms of elements of every Lie algebra, associative algebra, semigroup or group presented by generators and defining relations! The algorithm works in many cases (see below). The theory of Gr¨ obner bases and Buchberger’s algorithm were initiated by B. Buchberger (Thesis [82] 1965, paper [83] 1970) for commutative associative algebras. Buchberger’s algorithm is a finite algorithm for finitely generated commutative algebras. It is one of the most useful and famous algorithms in modern computer science. Shirshov’s paper [290] was in the spirit of the program of Kurosh (1908– 1972) to study non-associative (relatively) free algebras and free products of algebras, initiated in Kurosh’s paper [202], 1947. In that paper he proved non-associative analogs of the Nielsen–Schreier and Kurosh theorems for groups. It took quite a few years to clarify the situation for Lie algebras in Shirshov’s papers [283], 1953 and [289], 1962 closely related to his theory of GS bases. It is important to note that Kurosh’s program quite unexpectedly led to Shirshov’s theory of GS bases for Lie and associative algebras [290]. A step in Kurosh’s program was made by his student A.I. Zhukov in 1950 [318]. He algorithmically solved the word problem for non-associative algebras. In a sense, it was the beginning of the theory of GS bases for nonassociative algebras. The main difference with the future approach of Shirshov is that Zhukov did not use a linear ordering of non-associative monomials. Instead he chose an arbitrary monomial of maximal degree as the “leading” monomial of a polynomial. Also, for non-associative algebras there is no “composition of intersection” (“s-polynomial”). By this reason, this approach cannot be a model for Lie and associative algebras.3 Shirshov, also a student of Kurosh, defended his Candidate of Sciences Thesis [282] at Moscow State University in 1953. His thesis together with the paper that followed [286], 1958 may be viewed as a background for his later method of GS bases. In the thesis, he proved the free subalgebra 3After graduating from Moscow University Zhukov moved to the present Keldysh Institute of Applied Mathematics (Moscow) to do computational mathematics. S.K. Godunov in [150] mentioned his contribution in relation to the creation of the famous Godunov numerical method. So, Zhukov was a forerunner of two important computational methods!

page ix

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

x

9287 – Gr¨ obner-Shirshov Bases

textbook

PREFACE

theorem for free Lie algebras (now known as Shirshov–Witt theorem, see also Witt [310], 1956) using the elimination process later rediscovered by Lazard [209], 1960. He used the elimination process later [286], 1958 as a general method to prove the properties of regular (LS) words, including an algorithm of (special) bracketing of an LS word (with a fixed LS subword). The latter algorithm is of some importance in his theory of GS bases for Lie algebras (particularly in the definition of the composition of two Lie polynomials). Shirshov also proved the free subalgebra theorem for (anti-) commutative non-associative algebras [285], 1954. He used that later in [291], 1962 for the theory of GS bases of (commutative, anti-commutative) non-associative algebras. In his Thesis [282], Shirshov also found the series of bases of a free Lie algebra (nowadays called “Hall–Shirshov” bases, see also [290], the first issue of the journal “Algebra and Logic” founded by Malcev).4 The LS basis is a particular case of the Shirshov or Hall–Shirshov series of bases (cf. Reutenauer [267], where this series is called the “Hall series”). In the definition of his series, Shirshov used Hall’s inductive procedure (see Ph. Hall [161], 1933 and M. Hall [159], 1950): a non-associative monomial w = ((u)(v)) is a basic monomial whenever • (u), (v) are basic; • (u) > (v); • (u) = ((u1 )(u2 )), then (u2 ) ≤ (v). However, instead of ordering by the degree function (Hall words), Shirshov used an arbitrary linear ordering of non-associative monomials satisfying ((u)(v)) > (v) For example, in his Thesis [282], 1953 Shirshov used the following ordering of nonassociative monomials. Given two nonassociative words (u) and (v), consider the associative and commutative images (“contents”) u∗ 4It must be pointed out that A.I. Malcev (1909–1967) inspired Shirshov’s works very much. Malcev was an official opponent (referee) of his (second) Doctor of Sciences Dissertation at MSU in 1958. The first author of this book, L.A. Bokut, remembers this event at the Science Council Meeting, chaired by A.N. Kolmogorov, and Malcev’s words “Shirshov’s dissertation is a brilliant one!”. Malcev and Shirshov worked together at the Sobolev Institute of Mathematics in Novosibirsk since 1959 until Malcev’s sudden death at 1967, and have been friends despite the age difference. Malcev headed the Algebra and Logic Department and Shirshov was the first deputy director of the institute (S.L. Sobolev was the director). In those years, Malcev was interested in the theory of algorithms in mathematical logic and algorithmic problems of model theory. Thus, Shirshov had an additional motivation to work on algorithmic problems for Lie algebras. Both Malcev and Kurosh were delighted with Shirshov’s results of [290]. Malcev successfully nominated the paper for an award of the Presidium of the Siberian Branch of the Academy of Sciences (Sobolev and Malcev were the only Presidium members from the Institute of Mathematics at the time).

page x

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

PREFACE

textbook

xi

and v ∗ (say, for the monomial (u) = ((x2 x1 )((x2 x1 )x1 )) the corresponding image u∗ is x2 x2 x1 x1 x1 ). Compare u∗ and v ∗ lexicographically (a proper prefix of a word is greater than the entire word). If we use the lexicographic ordering, (u) (v) if u v lexicographically (with the condition u uv for v = 1), then we obtain the Lyndon–Shirshov basis.5 For example, for the alphabet {x1 , x2 } with x2 x1 we obtain basic Lyndon-Shirshov monomials by induction: x2 , x1 , [x2 x1 ], [x2 [x2 x1 ]] = [x2 x2 x1 ], [[x2 x1 ]x1 ] = [x2 x1 x1 ], [x2 [x2 x2 x1 ]] = [x2 x2 x2 x1 ], [x2 [x2 x1 x1 ]] = [x2 x2 x1 x1 ], [[x2 x1 x1 ]x1 ] = [x2 x1 x1 x1 ], [[x2 x1 ][x2 x1 x1 ]] = [x2 x1 x2 x1 x1 ], and so on. These are exactly all Shirshov regular (LS) Lie monomials and their associative images are exactly all Shirshov regular words with a oneto-one correspondence between two sets given by the Shirshov bracketing algorithm for associative words. An elementary step of Shirshov’s bracketing algorithm is to join the minimal letter of a word to previous ones by bracketing and to continue this process with the lexicographic ordering of the new alphabet. For example, suppose that x2 x1 . Then we have, for example, the following succession of bracketings: x2 x1 x2 x1 x1 x2 x1 x1 x1 x1 x2 x1 x1 , [x2 x1 ][x2 x1 x1 ][x2 x1 x1 x1 ][x2 x1 x1 ], [x2 x1 ][[x2 x1 x1 ][x2 x1 x1 x1 ]][x2 x1 x1 ], [[x2 x1 ][[x2 x1 x1 ][x2 x1 x1 x1 ]]][x2 x1 x1 ], [[[x2 x1 ][[x2 x1 x1 ][x2 x1 x1 x1 ]]][x2 x1 x1 ]], or x2 x1 x1 x1 x2 x1 x1 x2 x1 x2 x2 x1 , [x2 x1 x1 x1 ][x2 x1 x1 ][x2 x1 ]x2 [x2 x1 ], [x2 x1 x1 x1 ][x2 x1 x1 ][x2 x1 ][x2 [x2 x1 ]], where x2 x1 x1 x1 ≺ x2 x1 x1 ≺ x2 x1 ≺ x2 x2 x1 . Here [x2 x1 . . . x1 ] has a left-normed bracketing. By the way, the second series of partial bracketing of a word illustrates Shirshov’s factorization theorem [286] of 1958 that every word w can be factorized in a unique way as a nondecreasing product of LS words, i.e., w = v1 v2 . . . vn , v1 v2 · · · vn , every vi is a Shirshov regular word. Sh¨ utzenberger and Sherman [279], 1963, cited Shirshov’s paper [286] as a source where the factorization theorem was originally formulated. Two years later, Sh¨ utzenberger [280] cited [286] and (by a mistake, see the 5The Lyndon–Shirshov basis for the alphabet {x , x } is different from the above 1 2 Shirshov’s “content” basis starting with monomials of degree 7.

page xi

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

xii

9287 – Gr¨ obner-Shirshov Bases

textbook

PREFACE

reference to D. Knuth in Berstel and Perrin [17]6) Chen–Fox–Lyndon [92]. So, according to [279], [280] and to D. Knuth [17], Shirshov’s work [286] is indisputably the origin of what is now widely known as the Chen–Fox– Lyndon factorization theorem. The Shirshov special bracketing [286] goes as follows. Let us give as an example the special bracketing of the LS word w = x2 x2 x1 x1 x2 x1 x1 x1 with the LS-subword u = x2 x2 x1 . The Shirshov standard bracketing is [w] = [x2 [[[x2 x1 ]x1 ][x2 x1 x1 x1 ]]]. The Shirshov special bracketing is [w]u = [[[u]x1 ][x2 x1 x1 x1 ]]. In general, if w = aub in X ∗ then the Shirshov standard bracketing gives [w] = [a[uc]d], where b = cd. Now, c = c1 · · · ct , each ci is LS-word, c1 · · · ct in the lex-ordering (by the Shirshov factorization theorem). Then one must change bracketing of [uc]: [w]u = [a[. . . [[u][c1 ]] . . . [ct ]]d] The main property of [w]u is that [w]u is a monic associative polynomial with the maximal monomial w, so [w]u = w. Actually, Shirshov [290], 1962 needed a “double” relative bracketing of a regular word with two disjoint LS subwords. Then he implicitly used the following property: every LS subword of c = c1 · · · ct as above is a subword of some ci , 1 ≤ i ≤ t. Shirshov defined regular monomials [286], 1958, as follows: (w) = ((u)(v)) is a regular monomial iff • w is a regular word; • (u), (v) are regular monomials (then automatically u v in the lex-ordering); • if (u) = ((u1 )(u2 )) then u2 v. Once again, if we formally omit all Lie brackets in Shirshov’s paper [290] then essentially the same algorithm and essentially the same CDlemma (with the same but much simpler proof) yield a linear basis for associative algebra presented by generators and defining relations. The differences are the following: • No need to use LS monomials and LS words, since the set X ∗ is a linear basis of the free associative algebra kX; 6Let us state a quote from [17] with appropriate reference numbers according to the bibliography list of this volume: “This theorem has actually an imprecise origin. It is usually credited to Chen, Fox and Lyndon, following the paper of Sch¨ utzenberger [280] in which it appears as an example of a factorization of free monoids. Actually, as pointed out to one of us by D. Knuth in 2004, the reference [92] does not contain explicitly this statement. However, it can be recovered from their results.” Let us stress, that Schutzenberger [280] does contain explicitly the reference to Shirshov [286].

page xii

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

PREFACE

textbook

xiii

• The definition of associative compositions for monic polynomials f and g is much simpler than the definition of Lie compositions for monic Lie polynomials, since we do not need special Shirshov’s bracketing. • The definition of elimination of the leading word s¯ of an associative monic polynomial s is straightforward: a¯ sb → a(rs )b whenever s = s¯ − rs , a, b ∈ X ∗ . For Lie polynomials, it is much more involved and also uses the Shirshov special bracketing. We can formulate the main idea of Shirshov’s proof as follows. Consider a complete set S of monic Lie polynomials (“complete” means that all compositions are trivial). If s1 , s2 ∈ S, w = a1 s¯1 b1 = a2 s¯2 b2 , where w, ai , bi ∈ X ∗ and w is a LS word, then Lie monomials [a1 s1 b1 ]s1 and [a2 s2 b2 ]s2 are equal modulo the smaller Lie monomials from the ideal Id(S) generated by Id(S):  αi [ai si bi ]si , αi ∈ k, [a1 s1 b1 ]s1 = [a2 s2 b2 ]s2 + i>2

where si ∈ S, [ai si bi ]si = ai s¯i bi < w. Actually Shirshov proved a more general result: if (a1 s1 b1 ) = a1 s1 b1 , and (a2 s2 b2 ) = a2 s2 b2 with w = a1 s1 b1 = a2 s2 b2 then  αi (ai si bi ), αi ∈ k, (a1 s1 b1 ) = (a2 s2 b2 ) + i>2

where si ∈ S and (ai si bi ) = ai s¯i bi < w. Below we call a Lie polynomial sb. (asb) a Lie normal S-word provided that (asb) = a¯ This is precisely where he used the notion of composition and other notions and properties mentioned above. For associative algebras (and for commutative-associative algebras), it is much easy to prove the analogue of this property: If S is a complete monic set in kX (or k[X]), w = a1 s¯1 b1 = a2 s¯2 b2 , ai , bi ∈ X ∗ , s1 , s2 ∈ S then  a1 s1 b 1 = a2 s2 b 2 + αi ai si bi , αi ∈ k, i>2

where si ∈ S, ai s¯i bi < w. Summarizing we can say that the Shirshov’s work [290] implicitly contains the CD-lemma for associative algebras as a simple exercise that requires no new ideas. The first author, L.A. Bokut, can confirm that Shirshov clearly understood this and told him that “the case of associative algebras is the same”. The CD-Lemma for associative algebras was formulated in Bokut [33], 1976 (with a reference to Shirshov’s paper [290]), Bergman [16], 1978, and Mora [242], 1986.

page xiii

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

xiv

textbook

PREFACE

The CD-Lemma for associative algebras can be applied to every semigroup P = SmgX | S, and in particular to every group. Namely, one may consider the semigroup algebra kP over a field k. The latter algebra has the same generators and defining relations as P , i.e., kP = kX | S, where S consists of binomials of the form u − v. All compositions of such binomials u1 − v1 and u2 − v2 is again a binomial u − v. Applying Shirshov’s algorithm to a set of semigroup relations S gives rise to a complete set of semigroup relations S c . The S c -irreducible words in X constitute the set of normal forms of the elements of P . Before we go any farther, let us give some well known examples of algebras, groups, and semigroups presented by generators and defining relations. • The Grassman algebra kX | x2i = 0, xi xj + xj xi = 0, i > j. The set of defining relations is a GS basis with respect to the deg-lex ordering. A k-basis is {xi1 · · · xin | xij ∈ X, j = 1, . . . , n, i1 < · · · < in , n ≥ 0}. • The Clifford algebra kX | xi xj + xj xi = aij , 1 ≤ i, j ≤ n, where (aij ) is an n×n symmetric matrix over k. The set of defining relations is a GS basis with respect to the deg-lex ordering. A k-basis (if char k = 2) consists of the same words as for the Grassman algebra. • The universal enveloping algebra of a Lie algebra L is    U (L) = k X | xi xj − xj xi = αkij xk , i > j , where  k X = {xi | i ∈ I} is a well ordered linear basis of L and [xi xj ] = αij xk (i > j) is the multiplication table of L. The Poincar´e–Birkhoff– Witt (PBW) theorem follows: {xi1 xi2 · · · xin | i1 ≤ i2 ≤ · · · ≤ in , it ∈ I, t = 1, . . . , n, n ≥ 0} is a linear basis of U (L). • A. Kandri-Rody and V. Weispfenning [179] invented an important class of “algebras of solvable type”, which includes universal enveloping algebras. An algebra of solvable type is R = kX | sij = xi xj − xj xi − pij , i > j, pij < xi xj , where all compositions (sij , sjk )w , w = xi xj xk , are trivial modulo (S, w). Here pij is a noncommutative polynomial with all terms less than xi xj .

page xiv

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

PREFACE

textbook

xv

They created a theory of GS bases for every algebra in this class. In particular, they found a linear basis of every quotient of R. • Let L = Lie(X | S) be a Lie algebra presented by a set of generators X and a a set of defining relations S. The universal associative enveloping algebra U (L) may be presented as U (L) = kX | S (−) , where S (−) ⊆ kX stands for the set of associative polynomials obtained from S by replacing all Lie products by commutators. Then the PBW Theorem in the Shirshov’s form (see Theorem 3.33) says that S (−) is a GSbasis (in the deg-lex ordering) if and only if S is a GS-basis in Lie(X | S). As a corollary, there is a linear basis of U (L) which consists of words u1 u2 · · · un , where ui are S-reduced LS words, u1 u2 · · · un (in the lex-ordering), see [66, 67]. • Free Lie algebras LieK (X) over a commutative algebra K. A series of authors (Ph. Hall, M. Hall, A.I. Shirshov, R. Lyndon) proposed different linear K-bases of these algebras (Hall basis, Hall–Shirshov series of bases, Lyndon–Shirshov basis), see also [22, 113, 296]. • The Lie k-algebras presented by Chevalley generators and defining relations of types An , Bn , Cn , Dn , G2 , F4 , E6 , E7 , E8 . Serre’s theorem provides linear bases and multiplication tables for these algebras (they are finite dimensional simple Lie algebras over a field k). GS bases for these algebras were found in [59, 60, 61]. • The Coxeter groups are presented as SmgX | x2i = 1, mij (xi , xj ) = mji (xj , xi ), where M = (mij ) is a Coxeter matrix. J. Tits [298] (see also [21]) algorithmically solved the word problem for any Coxeter group. Finite Coxeter groups are presented by “finite” Coxeter matrices An , Bn , Dn , G2 , F4 , E6 , E7 , E8 , H3 , H4 . Coxeter’s theorem provides normal forms and Cayley tables (these are finite groups generated by reflections). GS bases for finite Coxeter groups were found in [68]. • Affine Kac-Moody algebras [173]. The Kac–Gabber theorem provides linear bases for these algebras under symmetrizability condition on the Cartan matrix. GS bases of these algebras were found in [254, 255, 256]. • Borcherds–Kac–Moody algebras [74, 75, 76, 173]. GS bases of these algebras are not known.

page xv

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

xvi

textbook

PREFACE

• Quantum enveloping algebras (V.G. Drinfeld, M. Jimbo). Lusztig’s theorem [217] provides linear canonical bases of these algebras. Different approaches were developed by C.M. Ringel [268, 269], J.A. Green [152], and V.K. Kharchenko [186, 187, 188, 189, 190]. GS bases of quantum enveloping algebras are unknown except for the case An , see [65, 107], although the linear (PBW) bases are constructed for all types [191]. • Koszul algebras. The quadratic algebras having a basis of standard monomials, called PBW-algebras, are always Koszul (S. Priddy [261]), but the converse is not true. In a different terminology, PBW-algebras are algebras with quadratic GS bases, see [253]. • Elliptic algebras (B. Feigin, A. Odesskii). These are associative algebras presented by n generators and n(n − 1)/2 homogeneous quadratic relations for which the dimensions of the graded components are the same as for the polynomial algebra in n variables. The first example of this type was the Sklyanin algebra (1982) generated by x1 , x2 , x3 with the defining relations [x3 , x2 ] = x21 , [x2 , x1 ] = x23 , [x1 , x3 ] = x22 (see [249]). GS bases for these algebras are not known. • Leavitt path algebras. GS bases of these algebras were found in A. Alahmedi et al. [6] and applied by the same authors to determine the structure of Leavitt path algebras of polynomial growth, see in [7]. • Iwahory–Hecke algebras over a field k differ from the group algebras of Coxeter groups in that instead of x2i = 1 there are relations 1/2

(xi − qi

1/2

)(x1 + qi

) = 0 or (xi − qi )(x1 + 1) = 0,

where qi are units of k. There are two known k-linear bases of the Iwahory– Hecke algebras: one in natural, the other is the Kazhdan–Lusztig canonical basis [218]. The GS bases for the Iwahory–Hecke algebras are known for the finite Coxeter matrices. A deep connection of the Iwahory–Hecke algebras (of type An ) and braid groups (as well as link invariants) was found by V.F.R. Jones [172]. • Artin braid group Bn . The Markov–Artin theorem provides the normal form and semi-direct structure of the group in the Burau generators. Other normal forms of Bn were obtained by Garside, Birman–Ko–Lee, and Adjan– Thurston. GS bases for Bn in the Artin–Burau, Artin–Garside, Birman– Ko–Lee, and Adjan–Thurston generators were found in [35, 36, 37, 38, 110]. • Artin–Tits groups. They differ from Coxeter groups in the absence of the relations x2i = 1. Normal forms are known in the spherical case, see E. Brieskorn, K. Saito [79].

page xvi

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

PREFACE

textbook

xvii

• The groups of Novikov–Boone type [71, 128, 175, 176, 177, 178, 247] with unsolvable word or conjugacy problem. These are groups with standard bases (standard normal forms or standard GS bases), see [24, 27, 34, 98]. We will discuss the relations between standard and Gr¨ obner–Shirshov bases below. • Adjan’s [4] and Rabin’s [264] constructions of groups with unsolvable isomorphism problem and Markov properties. A GS basis is known for Adjan’s construction [38]. • Markov’s [231] and Post’s [260] semigroups with unsolvable word problem. The GS basis of Post’s semigroup is found in [316]. • Plactic monoids. A theorem due to Richardson, Schensted, and Knuth provides a normal form of the elements of these monoids (see [204]). New approaches to plactic monoids via GS bases in the alphabets of row and column generators were found in [87, 91, 96, 200]. • The groups of quotients of the multiplicative semigroups of some quadratic semigroup algebras. In order to study the embedding of a semigroup into a group the notion of a group with a relative standard basis was proposed by Bokut in [28], 1968. The idea of a relative standard basis has the same origin as the idea of a GS basis, it was later formalized in [69] under the name of a relative GS-basis. To date, the method of GS bases has been adapted, in particular, to the following classes of linear universal algebras, as well as for operads, categories, and semirings. In this volume, we will discuss the following cases. Unless stated otherwise, we consider all linear algebras over a field k. • Associative and commutative algebras ([82, 83]); • Associative noncommutative algebras ([16, 33, 290]); • Associative algebras over a commutative algebra ([239]); • Lie algebras ([290]); • Lie algebras over a commutative algebra ([42]); • Modules over an associative algebra ([111, 180, 181]); • Dialgebras and trialgebras ([49]); • Associative conformal algebras ([56, 54]); At the heart of the GS method for a class of linear algebras lies a CD-lemma for a free object of the the class. For the cases above, the free objects are the free associative algebra kX, the double free associative k[Y ]-algebra k[Y ]X, the free Lie algebra Lie(X), the double free Lie k[Y ]algebra Liek[Y ] (X). The statements of all CD-lemmas are essentially the same. Consider a class Var of linear universal algebras (variety), a free algebra VarX in Var, and a well-ordered k-basis N (X) of monomials in VarX. A subset

page xvii

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

xviii

9287 – Gr¨ obner-Shirshov Bases

textbook

PREFACE

S ∈ VarX is called a GS basis if every composition of elements of S is trivial (vanishes upon elimination of leading terms s¯, s ∈ S). Then the following conditions are equivalent: (i) S is a GS basis. (ii) If f ∈ Id(S) then the leading term f¯ contains a subterm of the form s¯ for some s ∈ S. Here Id(S) stands for the ideal generated by S in VarX. (iii) The set of S-irreducible terms is a linear basis of the Var-algebra VarX | S, generated by X with defining relations S. Typical compositions are compositions of intersection and inclusion. Shirshov [290, 291] avoided inclusion compositions. He suggested instead that a GS basis must be reduced (leading words do not contain each other as subwords). In some cases (Lie algebras, di- and tri-algebras) the leading term f¯ of a polynomial f ∈ VarX may not belong to VarX. For Lie algebras, one has f¯ ∈ kX, for di- or tri-algebras f¯ belongs to an “ordinary” free algebra generated by extended set of generators. Almost all CD-lemmas require the notion of a “normal S-term”. A term of the form (asb) with only one occurrence of s ∈ S is called a normal s)b). Given S ⊂ kX, every S-word (that is, an SS-term if (asb) = (a(¯ term) is a normal S-word. Given S ⊂ Lie(X), every Lie S-monomial (Lie S-term) is a linear combination of normal Lie S-terms [290]. One of the two key lemmas asserts that if S is complete under compositions then every element of the ideal generated by S is a linear combination of normal S-terms. Another key lemma says that if S is a GS basis and the leading words of two normal S-terms are the same then these terms coincide modulo lower normal S-terms. As we mentioned above these results were proved by Shirshov [290] for Lie(X). An important class of problems related with the word problem for groups and semigroups includes estimates of unsolvability degree and embedding problems. The ideas motivated by the GS-bases technique were shown to be fruitful in the study of these problems. Even if the word problem for a particular system is not algorithmically solvable, we may wonder how “tough” is the unsolvability. As a measure of complexity, the notion of a Turing degree was introduced by E. Post in [258]. The lowest Turing degree 0 corresponds to algorithmically solvable problems, degree 1 corresponds to the halting problem of a Turing machine, which is recursively enumerable (r.e.) and in some sense “universal” among all r.e. problems. A question posed by E. Post in [258] asked whether there exist intermediate Turing degrees between 0 and 1. The problem was independently solved by R. Friedberg [143] and A. Muchnik [243] who showed that intermediate degrees of unsolvability exist.

page xviii

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

PREFACE

textbook

xix

In the papers by L.A. Bokut [24, 25, 27, 28, 29, 30, 31] the technique of groups with (relative) standard bases was developed to construct a group whose conjugacy problem has an arbitrary Turing degree of unsolvability and to construct such a semigroup Q that its semigroup algebra Z2 Q is embeddable (without zero) into a group but is not embeddable into a skew field (Malcev problem). In brief, let G =  X | Φ  be a presentation of a group. The set Φ of defining relations may be written in a semigroup form: Φ = {ui = vi : i ∈ I}, where ui , vi are some words in the alphabet X ∪ X −1 . One may determine a “one-sided” rewriting system (semi-Thue system), where the elementary transformations have the form uui v → uvi v,

i ∈ I.

A chain of elementary transformations moving u to v is denoted u →∗ v. A word u is said to be terminal if none of the transformations can act on u. Suppose the following properties hold for a given group presentation G =  X | Φ . (1) For every u ∈ S there exists a terminal word C(u) such that u →∗ C(u). (2) If u →∗ v, then u = v in G. (3) If u = v in G then there exists a word w such that u →∗ w and v →∗ w (the Church–Rosser, or confluence condition). It follows from these conditions that u = v in G if and only if C(u) and C(v) are graphically equal words. In some cases we need to replace (3) by a weaker condition. (3 ) There exists an equivalence ≡α such that u ≡α v implies u = v in G and if u = v in G then there exist w1 and w2 , w1 ≡α w2 and u →∗ w1 , v →∗ w2 . In the case when (1), (2), and (3 ) hold, u = v in G if and only if C(u) ≡α C(v). In Bokut’s papers, the notion of a group with a standard basis was introduced; for such a group, there exists a “standard” presentation giving rise to a rewriting system with properties (1)–(3) (the word “basis” here has the same meaning as “linear base” in the group algebra). Moreover, a group with a relative standard basis was defined to study the groups of fractions of multiplicative semigroups of certain rings. For such a group, there exists a standard rewriting system satisfying conditions (1), (2), and (3 ). This approach was used to show that the word problem in the Boone’s groups may have an arbitrary Turing degree of unsolvability [24]. This result was obtained earlier (using other methods) by A. Fridman [141, 142], C. Clapham [117], and W. Boone [72]. Analogously, for the Novikov groups Ap1 ,p2 [246] it is possible to show via rewriting systems that the

page xix

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

xx

textbook

PREFACE

word problem in these groups is solvable [27]. This was first shown by A. Fridman [140], the Gr¨ obner–Shirshov bases approach was applied in [98]. In contrast, the conjugacy problem for Novikov groups Ap1 ,p2 can have an arbitrary Turing degree of unsolvability (see [29]). The existence of finitely presented groups with this property was independently proved by D. Collins [125]. The groups with relative standard bases were applied to solve the problem posed by A.I. Malcev in 1937 (see the Introduction of [229] or [293, p. 559]): Whether there exists a ring which is embeddable (without zero) into a group but it cannot be embedded into a skew field? This problem was motivated by the Malcev’s example of a ring without zero divisors that cannot be embedded into a division ring (the solution of a Van der Waerden problem [226]). The Malcev’s solution of the Van der Waerden problem has a tight connection to the notion of a Gr¨ obner–Shirshov basis which had not been yet invented in 1937. Namely, consider a semigroup (or monoid) S generated by a set of eight elements X = {a, b, c, d, x, y, u, v} with defining relations Σ = {by = ax, dy = cx, bv = au}. It is easy to see that Σ is a GS basis relative to the deg-lex order. However, if we generate a group G by the set X relative to Σ then the compositions would include a relation dv = cu. Therefore, S cannot be embedded into a group. Later on, this idea led A.I. Malcev to the well known necessary and sufficient conditions for a semigroup to be embeddable into a group [227, 228]. It remains to note that the group algebra R = QS has no zero divisors (GS basis Σ allows us to find a linear basis of R). Obviously, R may not be embedded into a division ring since its subsemigroup S is not embeddable into a group. Therefore, a problem on a semigroup algebra which is embeddable (without zero) into group but not embeddable into a division ring was a very natural question. An example of a semigroup solving the Malcev problem was constructed in [30] by means of relatively confluent rewriting systems: the corresponding group of fractions has a relative standard basis with respect to a certain equivalence. There exists a semigroup algebra and its completion to a power series algebra whose multiplicative semigroup (without the zero element) can be embedded into a group but the algebra itself is not embeddable into a division algebra (settling a problem of Malcev). Explicitly, let Q4 be a semigroup generated by v0 , v1 , ai , bi , ci , si , ti , 1 ≤ i ≤ 4, with defining relations ai si = ci v0 ,

bi si+1 = ci v1 ,

bi ti+1 = ai ti ,

(here t5 = t1 , s5 = s1 ).

1 ≤ i ≤ 4,

page xx

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

PREFACE

textbook

xxi

Then the semigroup algebra Z2 Q4 over the field of two elements and the formal power series algebra Z2 Q4 can be embedded (without zero) into a group, but cannot be embedded into a division algebra [23, 26, 30, 31]7. Overall, the notion of a group with a standard basis is a non-ordered version of a GS basis: they coincide provided that one may choose a monomial order in such a way that ui > vi for every rewriting rule ui → vi . The notion of a relative standard basis provides a more precise tool for the study of combinatorial problems for groups, but it is less universal than GS-basis since closely related with a structure of a particular group. It is natural to expect that Gr¨ obner–Shirshov bases relate naturally to all instances of a (semi)group or algebra, in particular, to the cohomological properties. Indeed, the knowledge of a Gr¨obner–Shirshov basis for an associative algebra is essential for constructing the resolution proposed by D.J. Anick in 1986 [8] which is convenient for computations. Therefore, GS bases of associative algebras are vitally important for calculation of the Hochschild cohomology groups and cohomological products. As a corollary, the problem of description of all singular extensions for a given (semi)group or algebra makes use of GS-bases. The book is organized as follows. The introductory Chapter 1 contains a series of elementary examples of (commutative) Gr¨ obner bases. In particular, the well-known Gauss algorithm for solving a system of linear equations is nothing but a computation of an appropriate Gr¨obner basis by eliminations of variables. We will show on examples how the problem of solving a system of non-linear equations leads to the idea of compositions (s-polynomials). Chapter 2 contains the necessary information about free objects. In order to state a Composition-Diamond Lemma for a class of algebras one needs to know what is the normal form of elements in the free algebra of that class. We will observe the construction of free semigroups, groups, noncommutative associative algebras, and Lie algebras. For the latter, we will prepare combinatorial results used in the sequel. When proving the nonassociative Lyndon–Shirshov words to form a linear basis of the free Lie algebra, we will borrow the “weak version” of the PBW Theorem: every Lie algebra embeds into an associative algebra relative to the commutator. The GS bases theory for associative algebras provides an extremely short 7The semigroup Q is the only known semigroup with the above properties. In fact 4

(see [23]), there exists a series of semigroups Qn = Smgv0 , v1 , ai , bi , ci , si , 1 ≤ i ≤ n | ai si = ci v0 , bi si+1 = ci v1 , bi ti+1 = ai ti , where tn+1 and sn+1 are identified with t1 and s1 , respectively. We hope that all Qn for n ≥ 4 give rise to semigroup algebras (over an arbitrary field k containing nth roots of unity) settling the Malcev problem. Other examples of (not semigroup) algebras that are not embeddable into skew fields but have multiplicative semigroups embeddable into groups were constructed by A.A. Klein [194] (a student of S. Amitsur) and A.J. Bowtel [78] (a student of P.M. Cohn).

page xxi

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

xxii

9287 – Gr¨ obner-Shirshov Bases

textbook

PREFACE

proof of the PBW Theorem which is stated in Chapter 3. The last one contains the core statements of the GS bases theory for commutative algebras (Gr¨ obner bases), noncommutative associative algebras, and for Lie algebras. An important particular case is related with the tensor product of two associative algebras. In Chapter 4, we observe applications and modifications of the core statements (CD-Lemmas) for various classes of systems that often appear in mathematical practice. One of such modifications is needed for the study of Lie algebras over commutative rings. The world of Lie algebras over rings is very much different from what we know over fields. In particular, a Lie algebra over a commutative ring may not be embeddable into an associative one. In Chapter 5 we consider a breed of Gr¨obner bases and Gr¨ obner–Shirshov bases for Lie algebras which provides an approach to the study of Lie algebras over commutative rings. Chapter 6 is devoted to the algorithmic instances of the group theory involving Gr¨ obner–Shirshov bases. We state examples of finitely presented groups with unsolvable word and/or conjugacy problems, construct finitely presented groups with arbitrary recursively enumerable Turing degree of the word/conjugacy problem (of course, the degree of the word problem should not exceed the one of conjugacy problem), and prove the Higman theorem on the embedding of a finitely generated group into a finitely presented one. The final Chapter 7 reflects the application of the Gr¨ obner–Shirshov bases theory for associative algebras to the computation of the Hochschild cohomology ring. We aim to show how the knowledge of a Gr¨ obner–Shirshov basis can give us a method to describe all extensions of groups and associative algebras. The first author, Leonid Bokut, is grateful to Dr. Youngshan Chen, Dr. Li Yu, Dr. Qiuhui Mo, and Dr. Zerui Zhang for collaboration. During the early stages of this work, Viktor Lopatkin enjoyed the hospitality of South China Normal University in Guangzhou, with extremely thanks to Dr. Zhang Junhuai for the great support, and of Czech Technical ˇˇtov´ıˇcek and Cestm´ ˇ University in Prague, with special thanks to Pavel S ır Burd´ık. The research of V. Lopatkin was supported by the grant of the Government of the Russian Federation for the state support of scientific research carried out under the supervision of leading scientists, agreement 14.W03.31.0030. Pavel Kolesnikov was supported by the Program of fundamental scientific researches of the Siberian Branch of Russian Academy of Sciences, project 0314-2019-0001.

page xxii

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

CHAPTER 1

Introduction The classical Gr¨ obner and Gr¨obner–Shirshov bases theory deals with three types of polynomials: commutative associative, noncommutative associative, and Lie. It provides algorithms for passing from a set of polynomials of a fixed type to another set (of the same type) with better properties. Buchberger’s algorithm works for commutative polynomials, while Shirshov’s algorithms work for noncommutative and Lie polynomials. Two classical cases of Buchberger’s algorithm have been known for ages: the Euclidean algorithm for polynomials in one variable, and the Gaussian Elimination algorithm for linear (degree 1) polynomials in several variables. A big advantage of Buchberger’s algorithm is that, like the two special cases, it is finite. This means that for any given finite set of commutative polynomials in several variables, the computation terminates after finitely many steps. Buchberger’s algorithm is a specialization of Shirshov’s algorithms, which, in contrast, are infinite in general; nevertheless, they have been used with great success in both theoretical and applied research. 1.1. The Euclidean algorithm Let k be a field, and let f (x) and g(x) be two polynomials in one variable x over a field k. Our goal is to present an algorithm that finds gcd(f (x), g(x)), the greatest common divisor (g.c.d.) of f (x) and g(x). This can be obtained by the well-known Euclidean algorithm based on the division algorithm. Example 1.1. Take f (x) = x4 + x2 − x − 1,

g(x) = x3 − 2x2 + 1.

Do the following reduction of f (x) modulo g(x): f (x) → f (x) − xg(x) = 2x3 + x2 − 2x − 1. Here, we use the leading monomial x3 of g(x) to reduce the leading monomial x4 of f (x) (the elimination of the leading monomial of g in f ). The other elementary step of the algorithm is the division of a polynomial by 1

page 1

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

2

textbook

1. INTRODUCTION

using its leading coefficient. In our case, 1 1 2x3 + x2 − 2x − 1 → f1 (x) = x3 + x2 − x − . 2 2 The two steps make up a transformation that preserves the g.c.d., which we indicate by writing {f (x), g(x)}  {f1 (x), g(x)}. Let us apply the same steps now to f1 (x) and g(x): 5 2 3 x −x− 2 2 3 2 2 → f2 (x) = x − x − . 5 5 In order to save space below, we write this as f1 (x) → f1 (x) − g(x) =

f1 (x) → f1 (x)|g(x)=0 → f2 (x). Thus, {f1 (x), g(x)}  {f2 (x), g(x)}. Actually, instead of the two steps {f, g}  {f1 , g}  {f2 , g}, we can reduce f (x) in one step by substituting x3 = 2x2 − 1, or in other words, by using the formal equation g(x) = 0: f (x) → x4 + x2 − x − 1|x3 =2x2 −1 = 2x3 + x2 − 2x − 1 → 2x3 + x2 − 2x − 1|x3 =2x2 −1 = 5x2 − 2x − 3 3 2 → f2 (x) = x2 − x − . 5 5 Using f2 (x) = 0, we reduce g(x): 1 2 2 1 x − x− 5 5 5 1 1 =− x+ 10 10

g(x) → x3 − 2x2 + 1|x2 = 25 x+ 35 = 1 1 → x2 − x − |x2 = 25 x+ 35 2 2 → g1 (x) = x − 1, and then we obtain that

{f2 , g}  {f2 , g1 }. Finally, by using g1 (x) = 0, we reduce f2 (x) to: 2 f2 (x) → x2 − x − 5 The chain of equivalences yields {f, g} gcd(f (x), g(x)) = x − 1.

3 |x=1 = 0. 5  {x − 1}, which shows that

page 2

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

1.2. THE GAUSSIAN ELIMINATION ALGORITHM

textbook

3

Later on, we will define a Gr¨obner basis (GB) of a set of polynomials, we will see that GB{x4 + x2 − x − 1, x3 − 2x2 + 1} = {x − 1}. In general, GB{f1 (x), . . . , fm (x)} = gcd{f1 (x), . . . , fm (x)} for polynomials fi (x) in one indeterminate. 1.2. The Gaussian elimination algorithm Let k be a field and x1 , . . . , xn indeterminates. We now solve a system of linear equations a11 x1 + · · · + a1n xn = b1 , .. .

(1.1)

am1 x1 + · · · + amn xn = bm , where aij , bi ∈ k. It is well-known that this system can be successfully solved by using the Gaussian Elimination algorithm. Example 1.2. Solve the following system of equations: 2x1 + 3x2 + x3 = 1, 3x1 + 2x2 − x3 = 2. Instead of these equations, let us deal with the corresponding polynomials f = 2x1 + 3x2 + x3 − 1, g = 3x1 + 2x2 − x3 − 2. First of all, we need to order the variables. The method is to set x1 > x2 > x3 ; then in our case, x1 is the leading monomial of both f and g. Dividing f and g by their leading coefficients: 3 1 1 f → f1 = x1 + x2 + x3 − , 2 2 2 2 1 2 g → g1 = x1 + x2 − x3 − , 3 3 3 we have {f, g}  {f1 , g1 }, where the equivalence means that the corresponding linear systems are equivalent (have the same solution sets over k). Using f1 , we reduce g1 : 5 5 1 g1 → g1 − f1 = − x2 − x3 − 6 6 6 1 → g2 = x2 + x3 + 5

page 3

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4

textbook

1. INTRODUCTION

to obtain {f1 , g1 }  {f1 , g2 }. We may in fact stop here, but in order to have something really simple, we can further use g2 to eliminate x2 in f1 : 3 4 f1 → f1 − g2 = x1 − x3 − . 2 5 Thus our original system is equivalent to the system x1

4 , 5 1 x2 + x3 = − , 5 − x3 =

which is in reduced row-echelon form. As we shall see later, GB{f, g} = {f2 , g2 }. In general, the Gauss elimination algorithm produces for any consistent system (one that has solutions) an equivalent system in row-echelon form: xi1 + c1,i1 +1 xi1 +1 + · · · xi2 + c2,i2 +1 xi2 +1 + · · · ..

+ c1n xn = d1 + c2n xn = d2 (1.2)

. xil + cl,il +1 xil +1 + · · · + cln xn = dl

where 1 ≤ i1 < i2 < · · · < il ≤ n. The system (1.1) is inconsistent if and only if after finitely many steps we arrive at 0 = d, where d = 0. In this case, (1.1) is equivalent to the system 0 = 1. In terms of the Gr¨ obner bases theory, for all tuples of linear polynomials {f1 , . . . , fm } we have GB{f1 , . . . , fm } = {h1 , . . . , hl }, where the fi correspond to (1.1), and either the hj correspond to (1.2) or GB{f1 , . . . , fm } = {1}. The elimination of xi2 from h1 using h2 , of xi3 from h2 using h3 , and so on, is not absolutely necessary, but it gives a unique Gr¨ obner basis independent of the choice of the elementary transformation steps, which therefore depends only on the ordering of the indeterminates. A different ordering leads to a different basis.

page 4

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

1.3. SYSTEMS OF POLYNOMIAL EQUATIONS

5

1.3. Systems of polynomial equations What are the common features of the two cases considered above? (i) We fix a total order on monomials, which determines the leading monomial of each polynomial. For 1-variable polynomials, it is the degree order xn > xm ⇔ n > m; for multivariable linear polynomials, it is enough to order its indeterminates. (ii) We reduce a given set of polynomials by means of the following operations: (a) dividing a polynomial by its leading coefficient; (b) eliminating the leading monomial of one polynomial from the other polynomials of the set. We would like to do something similar to an arbitrary system of polynomial equations in several indeterminates. However, we will see that the reductions in (ii) are insufficient, and we need a new operation s(f, g), which is called the s-polynomial operation. An alternative name for this is the composition (f, g)w of f and g with respect to some uniquely defined monomial w. Example 1.3. Solve the following system of polynomial equations f = x21 x2 − 1 = 0,

(1.3)

g = x31 + 7x1 x2 + 1 = 0.

Let us use the deg-lex (degree-lexicographic) ordering of monomials to compare monomials first by degree and then lexicographically. For instance, x31 > x21 x2 > x1 x2 > 1. Again we will deal with the polynomials f and g rather than the equations f = 0 and g = 0. We aim to reduce the set {f, g} as much as possible by passing to an equivalent system of equations. The leading monomials of f and g in (1.3) are f¯ = x21 x2 and g¯ = x31 respectively. We cannot use g = 0 to reduce f¯, neither can we use f = 0 to reduce g¯, for neither f¯ nor g¯ is a submonomial of the other. The great discovery of Buchberger (and Shirshov for noncommutative polynomials) was that in such cases the system can be simplified further by a new spolynomial (composition) operation. Take the least common multiple w = lcm(f¯, g¯), which in our case is x31 x2 = f¯x1 = g¯x2 , and consider s(f, g) = (f, g)w = f x1 − gx2 . For our system (1.3), we have s(f, g) = (x21 x2 − 1)x1 − (x31 + 7x1 x2 + 1)x2 = −7x1 x22 − x1 − x2 1 1 → h = x1 x22 + x1 + x2 , 7 7 thus {f, g}  {f, g, h},

f¯ = x21 x2 ,

g¯ = x31 ,

¯ = x1 x2 . h 2

page 5

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

6

textbook

1. INTRODUCTION

Proceeding with s(f, h), we obtain the following: ¯ 1, w1 = lcm(f¯, ¯h) = x21 x22 = f¯x2 = hx 1 1 s(f, h) = (f, h)w1 = (x21 x2 − 1)x2 − (x1 x22 + x1 + x2 )x1 7 7 → q = x21 + x1 x2 + 7x2 , thus {f, g, h}  {f, g, h, q},

f¯ = x21 x2 ,

¯ = x1 x2 , h 2

g¯ = x31 ,

q¯ = x21 .

The reduction g → g|q=0 = −x21 x2 + 1 → x21 x2 − 1|f =0 = 0, shows that we may exclude g from the set; we are then left with {f, h, q},

f¯ = x21 x2 ,

¯ = x1 x2 , h 2

q¯ = x21 .

Next, we reduce f : f → f |q=0 = −x1 x22 − 7x22 − 1 → x1 x22 + 7x22 + 1|h=0 1 1 1 → f1 = x22 − x1 − x2 + , 49 49 7 thus {f, h, q}  {f1 , h, q},

f1 = x22 ,

¯ = x1 x2 , h 2

q¯ = x21 .

Now we see that h is also redundant: h → h|f1 =0 → x21 + x1 x2 + 7x2 |q=0 → 0, thus {f1 , h, q}  {f1 , q},

f1 = x22 ,

q¯ = x21 ,

and we conclude that (1.3) is equivalent to 1 1 1 x1 − x2 + = 0, 49 49 7 q = x21 + x1 x2 + 7x2 = 0.

f1 = x22 −

(1.4)

We cannot reduce (1.4) any further: since x22 and x21 are coprime, the eliminations are impossible, while s-polynomial operations only introduce redundancies. The theory of Gr¨ obner bases guarantees that (1.4) is the unique simplest system equivalent to (1.3) as long as we fix the order of indeterminates obner bases, x1 > x2 and use the deg-lex order on monomials. In terms of Gr¨ we have GB{f, g} = {f1 , q}.

page 6

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

1.3. SYSTEMS OF POLYNOMIAL EQUATIONS

textbook

7

To solve (1.4) let us change the order to lex-deg order to compare two monomials first lexicographically and then by degree (that is, the entire word is bigger than its prefix). For example, x21 > x1 x22 > x1 x2 > x1 > x22 > x2 > 1. Then (1.3) is equivalent to f2 = x1 + x2 − 49x22 − 7 = 0, q1 = (49x22 − x2 + 7)2 + (49x22 − x2 + 7)x2 + 7x2 = 0. Actually GB{f, g} = {f2 , q1 } in lex-deg order of monomials. This system can be solved in radicals. Now we are going to look at another problem concerning the system (1.3). Namely we address the following equality, or word problem: Problem. Find an algorithm that for a given polynomial F (x1 , x2 ) decides whether F (x1 , x2 ) = 0 modulo f = 0 and g = 0. In other words, we need to determine the validity of the implication for each F , that is, f = 0, g = 0 ⇒ F = 0. (1.5) To be more precise, this means that F (x1 , x2 ) = h1 (x1 , x2 )f (x1 , x2 ) + h2 (x1 , x2 )g(x1 , x2 ) for some polynomials h1 , h2 which is the same as saying that F is in the ideal generated by f and g; see Remark 1.5 below. The desired algorithm consists of reducing F (x1 , x2 ) as much as possible by using f1 = 0 and q = 0 (but not f and g). If zero results, then (1.5) is true; otherwise, (1.5) is false. Example 1.4. Keeping the same f and g as in (1.3), consider a polynomial F of the following form: F = x1 x32 + α1 x1 x22 + α2 x1 x2 + α3 x1 + α4 x2 + α5 , αi ∈ k. For which values of αi is (1.5) valid? The new basis {f1 , q} allows us to solve this problem. The series of reductions: F → F |f1 =0 = F1 → F1 |q=0 = F2 1 1 α1 − 3 )x1 → F2 |f1 =0 = (α2 − )x1 x2 + (α3 − 7 7 7 1 α1 − 3 )x2 + α5 + (α4 − 7 7 quickly yields the answer: (1.5) is valid if and only if 1 1 α1 + 3 , α5 = 0, α2 = , α3 = α4 = 7 7 7

page 7

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

8

9287 – Gr¨ obner-Shirshov Bases

textbook

1. INTRODUCTION

where α1 ∈ k is arbitrary. Note that we cannot reduce F at all just by using the initial polynomials f and g! We can formulate Buchberger’s algorithm as follows: Algorithm. Given any set of polynomials in several indeterminates, add the s-polynomials to the set, and reduce polynomials by eliminating their leading monomials, also dividing them by their leading coefficients. Stop when no further reductions are possible (so in particular, when all spolynomials reduce to 0). The main significance is that we can always solve the equality problem for every finite system of polynomial equations, by using Buchberger’s algorithm. Remark 1.5. Note that the question (1.5) is stated about the image of a polynomial F in the quotient algebra modulo the ideal generated by f and g. Formally, it has nothing to do with solutions of these polynomial equations in k. However, if the field k is algebraically closed then the Nullstellensatz allows us to conclude that the set of all (α, β) ∈ k 2 such that F (α, β) = 0 belongs to the set of solutions of the system f = 0, g = 0 if and only if there exists n ≥ 1 such that F n reduces to zero by means of f1 and q.

page 8

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

CHAPTER 2

Free algebras In this chapter we consider free objects in the classes of semigroups and groups. Further, we proceed to the following classes (varieties) of linear algebras over a fixed field k: commutative algebras Com, associative algebras As, Lie algebras Lie: Com : xy = yx, (xy)z = x(yz), As : (xy)z = x(yz), Lie : x2 = 0, (thus xy = −yx, the anticommutativity identity), (xy)z + (yz)x + (zx)y = 0, (the Jacobi identity). 2.1. Semigroups and groups Recall that a semigroup is a set A equipped with an associative binary product A × A → A, x(yz) = (xy)z for all x, y, z ∈ A. A semigroup is commutative if, in addition, xy = yx for all x, y ∈ A. If there exists an identity element 1 ∈ S such that x1 = 1x = x for every x ∈ X then A is called a monoid. Obviously, every semigroup may be embedded into a monoid by joining an identity element. In the class Smg of all semigroups, the free object is easy to construct. Suppose X = {x1 , x2 , . . . , xn , . . . } is a set, then the set X ∗ of all words in the alphabet X forms a semigroup relative to the operation of concatenation. The empty word here acts as an identity element, so X ∗ is actually a monoid. The monoid X ∗ has the following universal property. For every monoid M and for every map  : X → M , there exists a unique homomorphism of semigroups f : X ∗ → M such that the following diagram is commutative (here, as elsewhere, i is the inclusion map): X

/ X∗

i



 } M

∃!f

Indeed, if u = xi1 . . . xin ∈ X ∗ then f (u) = (xi1 ) . . . (xin ) ∈ M . Therefore, X ∗ is the free monoid generated by X. 9

page 9

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

10

textbook

2. FREE ALGEBRAS

The universal property implies that for every monoid M there exists a set X such that M is a homomorphic image of X ∗ . The kernel θ of this homomorphism is a congruence of the semigroup X ∗ . If S ⊂ X ∗ × X ∗ is a generating subset of the congruence θ then we may present M as a semigroup generated by the set X relative to defining relations S: M = SmgX | u = v, for (u, v) ∈ S. The most important problem related to such a presentation of a semigroup (or monoid) is to determine whether the images of two words in X ∗ coincide in M . Gr¨obner–Shirshov bases provide a way to solve this problem in general. Let us consider two examples in the rest of this section. To construct a universal object in the class of commutative semigroups (monoids) one may consider the set [X] of all words of the form xi11 . . . xinn , for n ≥ 0 and i1 , . . . , in ≥ 0, called commutative words in X (including 1). Relative to the operation (xi11 . . . xinn )(xj11 . . . xjnn ) = xi11 +j1 . . . xinn +jn , the set [X] turns into a commutative monoid which has the following universal property. For every commutative monoid C and for every map  : X → C, there exists a unique homomorphism of semigroups f : [X] → C such that the following diagram is commutative: X

i

/ [X]



 ~ C

∃!f

It is easy to see that [X]  SmgX | xi xj = xj xi , for i > j. Another important example is the free group F (X) generated by the set X. In particular, every group is a monoid, so is F (X). As a monoid, −1 −1 it is generated by the set X ∪ X −1 = {x1 , x−1 1 , x2 , x2 , . . . , xn , xn , . . . }. Therefore, there exists an epimorphism of semigroups ι : (X ∪ X −1 )∗ → F (X). The kernel of ι is a congruence generated by pairs (xi x−1 i , 1) and −1 −1 ∗ (xi xi , 1). Every congruence class in (X ∪ X ) contains a unique word u or which is reduced, i.e., does not contain a subword of the form xi x−1 i x−1 x . Hence, F (X) may be identified with the set of all reduced words in i i X ∪ X −1 .

page 10

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

2.2. NONCOMMUTATIVE POLYNOMIALS

textbook

11

2.2. Noncommutative polynomials Recall that a semigroup algebra kA of a semigroup A over a field k is the space of all formal expressions α1 u1 + · · · + αn un ,

where αi ∈ k, ui ∈ A and n ≥ 0.

This is an associative algebra over the field k which has the following universal property: For every associative algebra B and for every homomorphism  from A to the multiplicative semigroup of B there exists a unique homomorphism of associative algebras f : kA → B such that the restriction of f on A ⊂ kA coincides with . Let X = {x1 , . . . , xn , . . . } be a finite or countable linearly ordered set, x1 < x2 < . . . xn < . . . (actually X may be any well ordered set). Denote by kX the semigroup algebra kX ∗ of the free semigroup X ∗ . The elements of kX may be considered as noncommutative polynomials on X with coefficients in k,     αi1 ...im xi1 . . . xim  m ≥ 0 and αi1 ...im ∈ k . kX = i1 ,...,im

A monomial u = xi1 . . . xim is called an (associative or noncommutative) word in X, the length |u|, or degree deg(u), is m. It follows from the definition that X ∗ is a linear basis of kX. The combination of two universal properties (of X ∗ and kX ∗ ) leads to the following universal property of the algebra kX: For every associative algebra B and for every map  : X → B, there exists a unique algebra homomorphism f : kX → B such that the following triangle is commutative: i / kX X  ∃!f  | B It follows that any associative algebra B is a quotient of some kX,

B = kX/I, where I isan ideal of kX. Let S be a set of generators of I, i.e., I = Id(S) = { αi ai si bi | ai , bi ∈ X ∗ , si ∈ S, αi ∈ k}. Then by definition B = kX/Id(S) = kX | S, the associative algebra with generators X and defining relations si = 0, for si ∈ S. As in the case of semigroups, the main problem is to find a special obner–Shirshov system S c of defining relations of B that is called a Gr¨ basis of the ideal I, or a GSB of B. Such a set S c can be found using

page 11

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

12

textbook

2. FREE ALGEBRAS

the Shirshov algorithm for associative algebras applied to the initial set of defining relations S. The algebra kX is refereed to as a free ring (free associative algebra on X and k). The theory of free rings had been developed in the book [123]. One of the main properties of kX is Theorem 2.1 (Cohn’s Theorem). Any left (right) ideal I of kX is a free left (right) kX-module, i.e., there exist  S ⊂ I such that any polynomial f ∈ I has a unique presentation f = fi si , where si ∈ S and fi ∈ kX. The Theorem can be proved using GS bases theory for modules; in fact S is a GS basis of I as a left kX-module, see section 4.3. Another general property of kX over a field k of characteristic zero is the following Theorem 2.2 (Kemer’s Theorem). Any fully characteristic ideal T of kX (i.e., an ideal that is closed under substitutions xi → fi for any fi ∈ kX) is finitely generated as a fully characteristic ideal. The result is proved in the book [185]. 2.3. Free commutative algebras Let X = {x1 , x2 , . . . , xn , . . . } be a finite or countable linearly ordered set with x1 > x2 > · · · > xn > . . . variables or letters. In fact X can be any well ordered set. Remark 2.3. The above order on X, x1 > x2 > . . . , is common for the case of commutative variables and follows the implicit order of variables in the Gauss elimination algorithm. For noncommutative variables one uses the opposite order x1 < x2 . . . , see below. By k[X] one denotes the semigroup algebra of the free commutative semigroup [X] generated by X. Obviously, this is the usual polynomial algebra on X,     αi xi11 . . . xinn  n ≥ 0, ij ≥ 0 and αi1 ,...,in ∈ k . k[X] = i1 ,...,in

As in the case of noncommutative algebras, the algebra k[X] satisfies the following universal property. For every commutative algebra C and any map  : X → C, there exists a unique algebra homomorphism f : k[X] → C such that the following triangle is commutative: X

i

/ k[X]



 } C

∃!f

page 12

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

2.3. FREE COMMUTATIVE ALGEBRAS

13

The algebra k[X] is also called the free commutative-associative algebra over X and k. From the universal property it follows that any commutative algebra C is a quotient of some k[X], C = k[X]/I, where I isan ideal of k[X]. Let S be a set of generators of I, i.e., I = Id(S) = { αi ai si | ai ∈ [X], si ∈ S, αi ∈ k}. Then by definition C = k[X]/Id(S) = k[X | S], the commutative algebra with generators X and defining relations si = 0, for si ∈ S. Our main problem is, starting with S, to find a special system S c of defining relations of C that is called a Gr¨ obner basis of the ideal I, or a GB of the C. The system S c can be found using the Buchberger algorithm. We will use the following theorem. Theorem 2.4 (Hilbert’s Basis Theorem). Let X = {x1 , . . . , xn } be a finite set. Then any ideal I  k[X] is finitely generated, I = Id(f1 , . . . , fm ) =

m 

  gi fi  gi ∈ k[X], for 1 ≤ i ≤ m and m ≥ 1 .

i=1

This theorem follows by induction on n, using the following lemma. Lemma 2.5. Let R be a commutative Noetherian ring with 1 (i.e., any ideal of R is finitely generated). Then the polynomial ring R[x] (in one variable) is a commutative Noetherian ring as well. Proof. Let I  R[x] and In (the nth leading ideal) the set of leading coefficients of elements in I of degree ≤ n. Then clearly In  R, In ⊆ In+1 . Moreover, if I   R[x] with I ⊆ I  and with In = In for all n ≥ 0, then I = I  . Otherwise, take f ∈ I  \ I with least possible degree, say m. Then  , where either f1 = 0 or the degree of f1 is less f = axm + f1 , 0 = a ∈ Im than m. Since a ∈ Im , there exists a g = axm + g1 ∈ I, where either g1 = 0 or the degree of g1 less than m. Thus, f − g = f1 − g1 ∈ I  which implies that f1 = g1 since m is least. This contradicts f ∈ I. Let L0 ⊆ L1 ⊆ . . . be an ascending chain of ideals of R[x] and Lin the nth leading ideal of Li . Note that Lij ⊆ Lkm whenever i ≤ k and j ≤ m. Since R is Noetherian, the ascending chain {Lii |i ≥ 0} of ideals of R stabilizes, say at Ljj . For any n with 0 ≤ n ≤ j − 1 the ascending chain {Lin | i ≥ 0} stabilizes, say at Ltn n . Let m = max{j, t1 , . . . , tj−1 }. Then for all i ≥ m and n ≥ 0, Lin = Lmn . Thus Li = Lm . So R[x] is Noetherian. 

page 13

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

14

9287 – Gr¨ obner-Shirshov Bases

textbook

2. FREE ALGEBRAS

2.4. Free Lie algebras If A is an associative algebra over a field k then the same linear space relative to a new product (x, y) → [xy], for x, y ∈ A, given by [xy] = xy − yx is a Lie algebra denoted by A(−) . Consider a well ordered set X = {x1 , x2 , . . . }, where x1 < x2 < . . . is the order. A Lie polynomial f in X and k is a noncommutative polynomial in X and k (i.e., f ∈ kX) which is a linear combination of Lie words in X. A Lie word is defined by induction: (1) A single letter xi ∈ X is a Lie word of length, or degree, 1. (2) If u, v are Lie words of length k and l respectively, then [uv] = uv − vu is a Lie word of length k + l. Examples of Lie words are [xi xj ], [[xi xj ]xk ], [xi [xj xk ]], [[xi xj ][xk xl ]], [[. . . [xi1 xi2 ] . . . ]xik ] which is an example of a left normed Lie word, and [xi1 [xi2 . . . [xik−1 xik ] . . . ]] which is is an example of a right normed Lie word. The set of all Lie polynomials in X and k will be denoted by Liek (X) or Lie(X) if k is fixed. It is a Lie algebra under the Lie product [f g] = f g −gf . In fact, it is a Lie subalgebra of the commutator algebra kX(−) generated by X. In order to prove that the algebra Lie(X) satisfies a universal property, we have to use the following statement. Lemma 2.6. For every Lie algebra L there exists an associative algebra A such that L ⊂ A(−) . We will prove this statement in Section 3.2. The combination of Lemma 2.6 and the universal property of kX leads to the following property of Lie(X). For every Lie algebra L and for every map  : X → L there exists a unique Lie algebra homomorphism f : Lie(X) → L such that the following triangle is commutative: X

i

/ Lie(X)



 { A

∃!f

Liek (X) is called a free Lie algebra on X and k. Again, it follows that any Lie algebra L is a quotient of some Lie(X), L = Lie(X)/I, where I is an ideal a set of generators

of I, i.e.,  of Lie(X). Let S be αi [ai si bi ] | ai , bi ∈ X ∗ , si ∈ S, αi ∈ k . Then by I = Id(S) =

page 14

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

2.4. FREE LIE ALGEBRAS

textbook

15

definition C = Lie(X)/Id(S) = Lie(X | S), the Lie algebra with generators X and defining relations si = 0, for si ∈ S. Our main problem is, starting with S, to find a special system S c of defining relations of L that is called a Gr¨ obner–Shirshov basis of the ideal I, or, by abuse of terminology, a GSB of the Lie algebra L. Such a set S c can be found using the Shirshov algorithm for Lie algebras. It is not hard to see that every nonassociative word in Lie(X) is equal to a linear combination of left- (or right-) normed Lie words. So left(right-) normed Lie words form a set of linear generators of Lie(X), but they are linearly dependent. A natural question arises: what set of words is a linear basis of Lie(X)? Several such bases are known: the Hall basis (M. Hall, 1949), the HallShirshov series of bases (Shirshov, 1953, 1962), the Lyndon–Shirshov basis (Chen, Fox, Lyndon, 1958, Shirshov, 1958), right-normed basis (Chibrikov, 2006). Note that the Lyndon–Shirshov basis is a particular case of HallShirshov series of bases. In the sequel, we will need the Lyndon–Shirshov basis of Lie(X) and so discuss this notion in detail. We start with the Lyndon–Shirshov associative words. Let X = {xi | i ∈ I} where I is a well ordered set and order X by xi > xp if i > p, for i, p ∈ I. Let X ∗ be the free monoid generated by X. For u = xi1 xi2 · · · xik ∈ X ∗ , let xβ = min(u) = min{xi1 , xi2 , · · · , xik }, fir(u) = xi1 , |u| = k. We call |u| the length of u. In the sequel, cont(u) denotes the set of all letters contained in u, cont(u) = {xi1 , · · · , xit } for u = xi1 · · · xit ∈ X ∗ . Definition 2.7. Let u = xi1 xi2 · · · xik ∈ X ∗ . Then u is said to be a weakALSW if fir(u) > min(u) or |u| = 1, where ALSW stands for “associative Lyndon–Shirshov word”. Let u be a weak-ALSW, min(u) = xβ and |u| ≥ 2. We define X  (u) = {xji = xi xβ · · · xβ | i > β and j ≥ 0}.  j

Note that

xji

= xi xβ · · · xβ is just a symbol.  j

Now, we order X  (u) in the following way: xji11 > xji22 ⇔ i1 > i2 or (i1 = i2 and j2 > j1 ).

page 15

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

16

textbook

2. FREE ALGEBRAS

Suppose that u, v are weak-ALSWs and min(v) ≥ min(u) = xβ . Then we define mt  ∗ ∗ 1 vu = xm i1 · · · xit in (X (u)) ⇔ v = xi1 xβ · · · xβ · · · xit xβ · · · xβ in X ,   m1

mt

where xij > xβ , mj ≥ 0, for 1 ≤ j ≤ t. To simplify notation, we use u instead of uu . Throughout this section, we assume that x1 < x2 < x3 < · · · , which is in accordance with the definition of the order on X given above. Example 2.8. Let u = x2 x1 , v = x3 x2 . Then vu = x03 x02 , v  = x13 . The following lemma is obvious. Lemma 2.9. Let u be a weak-ALSW, with u = vw where v, w = 1 and w = xβ w1 where xβ = min(u). Then u = vu wu . Example 2.10. Let v = x3 x2 x1 , w = x2 x2 and u = vw = x3 x2 x1 x2 x2 . Then u = x03 x12 x02 x02 , vu = x03 x12 , wu = x02 x02 and u = vu wu . Unless stated otherwise, we always use the lexicographic order both on (X  (u))∗ and X ∗ (i.e., w > wt if t = 1 and zxi t1 > zxj t2 if xi > xj ). Lemma 2.11. Let u, v be weak-ALSWs with |v| ≥ 2. Then u > v ⇔ uuv >  . vuv Proof. Let xβ = min(uv). Assume that u > v. Then there are two cases to consider. Case 1. A letter in u is greater than the corresponding letter in v: u = xi1 xβ · · · xβ · · · xis−1 xβ · · · xβ xis xβ · · · xβ · · ·    l1

ls−1

ls

v = xi1 xβ · · · xβ · · · xis−1 xβ · · · xβ yz · · · ,   l1

where xis > y.

ls−1

(a) If y = xβ , then uuv = xli11 · · · xis−2 x s−1 xls · · · s−2 is−1 is l

l

l

 vuv = xli11 · · · xis−2 x s−1 · · · , s−2 is−1 l

 where ls−1 > ls1 .

 . So, uuv > vuv (b) If y > xβ , then

uuv = xli11 · · · xis−1 xls · · · s−1 is l

 vuv = xli11 · · · xis−1 yn · · · . s−1 l

 . So, uuv > vuv

page 16

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

2.4. FREE LIE ALGEBRAS

textbook

17

Case 2. The word u is a prefix of v: u = xi1 xβ · · · xβ · · · xis xβ · · · xβ   l1

ls

v = xi1 xβ · · · xβ · · · xis xβ · · · xβ yz · · · .   l1

ls

(a) If y = xβ , then uuv = xli11 · · · xliss l

 vuv = xli11 · · · xiss · · · ,

where ls > ls .

 . So, uuv > vuv   (b) If y > xβ , then vuv = uuv y n · · · and so, uuv > vuv .   Conversely, assume that uuv > vuv . We will prove that u > v. There are also two cases to consider.  = xli11 · · · xliss y n · · · . (1) uuv = xli11 · · · xliss and vuv  l l  xls · · · and vuv = xli11 · · · xis−1 x  ls · · · , where (2) uuv = xli11 · · · xis−1 s−1 is s−1 is xis > xis or (xis = xis and ls > ls ). In both cases, it is clear that u > v. 

Definition 2.12. Let u ∈ X ∗ . Then u is called an ALSW if ( ∀v, w ∈ X ∗ s.t. v, w = 1) u = vw ⇒ vw > wv. Remark 2.13. Let u, v ∈ X ∗ and the vu ∈ (X  (u))∗ be as before. We denote by |v| the length of v in X ∗ , as defined above, and |vu |X  the length of vu in (X  (u))∗ . Lemma 2.14. Let u be a Weak-ALSW with |u| ≥ 2. Then u is an ALSW in X ∗ if and only if u is an ALSW in (X  (u))∗ . Proof. (⇒). If |u |X  = 1, then u is an ALSW. Suppose that |u |X  > 1 and u = vu wu . Then u = vw. Since u is an ALSW, vw > wv which implies (vw)u > (wv)u by Lemma 2.11. Therefore, by Lemma 2.9, vu wu > wu vu and so, u is an ALSW. (⇐). Let u = vw and xβ = min(u). If fir(w) = xβ , then vw > wv. If fir(w) = xβ , then u = vu wu ⇒ vu wu > wu vu ⇒ (vw)u > (wv)u ⇒ vw > wv. Hence, u is an ALSW.



Remark 2.15. For a weak-ALSW u, it is clear that |u |X  < |u| if |u| > 1. For an ALSW u, we denote by u = (u ) and u(k) = (u )(k−1) for k > 0 generally. Therefore, X k (u) = X k−1 (u ).

page 17

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

18

9287 – Gr¨ obner-Shirshov Bases

textbook

2. FREE ALGEBRAS

Lemma 2.16. For u ∈ X ∗ , u is an ALSW if and only if there is a k ≥ 0 such that |u(k) |X k (u) = 1. Proof. We apply induction on |u|. If |u| = 1, then there is nothing to do. Assume that |u| > 1. Since |u |X  < |u| and |u(k) |X k (u) = |(u )(k−1) |X k−1 (u ) , by induction and by Lemma 2.14, the result follows.



Example 2.17. Let u = x5 x4 x5 x3 . Then u = x05 x04 x15 , u = (x05 )1 (x15 )0 and u = ((x05 )1 )1 . Therefore, by Lemma 2.16, u is an ALSW. Lemma 2.18. Let u ∈ X ∗ . Then u is an ALSW if and only if ( ∀v, w ∈ X ∗ s.t. v, w = 1)

u = vw ⇒ u > w.

Proof. (⇒). We use induction on |u|. If |u| = 1, then the result clearly holds. Suppose that |u| ≥ 2, xβ = min(u) and u = vw, with v, w = 1. If w = xβ w1 then u > w. If w = xβ w1 then u = vu wu . Since u is an ALSW, by induction, u > wu . Hence, by Lemma 2.11, u > w. (⇐). We use induction on |u|. If |u| = 1, then u = xi is an ALSW. If |u| > 1 and |u |X  = 1 then by Lemma 2.16 u is an ALSW. If |u |X  > 1 and u = vu wu then u > wu follows from u > w. By induction, u is an ALSW. Hence, by Lemma 2.14, u is an ALSW.  Lemma 2.19. Suppose that u is an ALSW, xβ = min(u) and |u| > 1. Then uxβ is an ALSW. Proof. Follows from Lemma 2.18.



Lemma 2.20. Let u and v be ALSWs. Then uv is an ALSW if and only if u > v. Proof. (⇒). Suppose that uv is an ALSW. Then, by Lemma 2.18, u > uv > v. (⇐). We use induction on |uv|. Suppose that u > v. If |uv| = 2 or v = xβ = min(uv), then the result is obvious. Otherwise, we have that    uuv > vuv , where uuv , vuv are ALSWs. By induction, uuv vuv = (uv) is an ALSW and so is uv.  Lemma 2.21. For every u ∈ X ∗ , there exists a unique decomposition u = u1 u2 · · · uk , where ui is an ALSW, for 1 ≤ i ≤ k and u1 ≤ u2 ≤ · · · ≤ uk .

page 18

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

2.4. FREE LIE ALGEBRAS

textbook

19

Proof. To prove the existence, we use induction on |u|. If |u| = 1 then it is trivial. Let |u| > 1 and xβ = min(u). If u = xβ v then v has the required decomposition and so does u. Otherwise, u is a weak-ALSW. Thus, u has the decomposition and so does u, by Lemmas 2.11 and 2.14. To prove the uniqueness, we let u = u1 · · · uk = w1 · · · ws be the decompositions such that ui , wj are ALSWs for any i, j where u1 ≤ · · · ≤ uk and w1 ≤ · · · ≤ ws . If u = xβ v, then u1 = w1 = xβ and the result follows from the induction on |u|. Otherwise, u is a weak-ALSW and   u = u1u · · · uku = w1u · · · wsu are the decompositions of u . Now, by induction again, the result follows.  Remark 2.22. In Lemma 2.21, the word uk is the longest proper suffix of u which is ALSW itself. Example 2.23. Let u = x1 x1 x2 x1 x2 x1 x1 . Then u = x1 x1 x2 x1 x2 x1 x1 = u1 u2 u3    u1

u2

u3

is the decomposition of u. Lemma 2.24. Let u be an ALSW and |u| ≥ 2. If u = vw, where w is the longest ALSW proper suffix of u, then v is an ALSW. Proof. Suppose that v is not an ALSW. Then, by Lemma 2.21, we can assume that v = v1 v2 · · · vm , with m > 1, where each vi is an ALSW and v1 ≤ v2 ≤ · · · ≤ vm . If vm > w, then vm w is an ALSW and |vm w| > |w|, a contradiction. If vm ≤ w, then we get another decomposition of u which contradicts the uniqueness in Lemma 2.21. Thus, v must be an ALSW.  Example 2.25. Let u = x5 x4 x5 x4 x3 x5 x3 . Then u = x5 x4 x5 x4 x5 x3 = vw   v

w

and u, v, w are all ALSWs. Now, for an ALSW u, we introduce two bracketing schemes. The first one, called up-to-down bracketing, is defined inductively by [xi ] = xi , [u] = [[v][w]], where u = vw and w is the longest proper ALSW suffix of u. Example 2.26. Let u = x2 x2 x1 x1 x2 x1 . Then the up-to-down bracketing may be done in a step-by-step way as follows: [ ] : u → [[x2 x2 x1 x1 ][x2 x1 ]] → [[x2 [x2 x1 x1 ]][x2 x1 ]] → [[x2 [[x2 x1 ]x1 ]][x2 x1 ]].

page 19

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

20

textbook

2. FREE ALGEBRAS

The second scheme is called down-to-up bracketing. Let us explain it first with a sample word u = x2 x2 x1 x1 x2 x1 . Join the minimal letter x1 to the preceding letters: u → x2 [x2 x1 ]x1 [x2 x1 ]. Form a new alphabet of the nonassociative words x2 , [x2 x1 ] and x1 ordered lexicographically, i.e., x2 > [x2 x1 ] > x1 . Join the minimal letter x1 to the previous letters: x2 [x2 x1 ]x1 [x2 x1 ] → x2 [[x2 x1 ]x1 ][x2 x1 ]. Form a new alphabet x2 > [x2 x1 ] > [[x2 x1 ]x1 ]. Join the minimal letter [[x2 x1 ]x1 ] to the previous letter: x2 [[x2 x1 ]x1 ][x2 x1 ] → [x2 [[x2 x1 ]x1 ]][x2 x1 ]. Form a new alphabet [x2 [[x2 x1 ]x1 ]] > [x2 x1 ]. Finally, join the minimal letter [x2 x1 ] to the previous letter: [x2 [[x2 x1 ]x1 ]][x2 x1 ] → [[x2 [[x2 x1 ]x1 ]][x2 x1 ]] = [u]. Remark 2.27. We denote by [ ] the down-to-up bracketing and by [[ ]] the up-to-down bracketing. Lemma 2.28. [u] = [[u]] for any ALSW u. Proof. We use induction on |u|. If |u| = 1, then u = xi and [xi ] = [[xi ]] = xi . Assume that u = vw, where w is the longest proper ALSW suffix of u. Then, by Lemma 2.24, v is an ALSW. If fir(w) = min(u) = xβ then w = xβ and v = xi xβ · · · xβ . By induction, we have [v] = [[v]]. Hence, in this case, [u] = [[u]] = ([v]xβ ). If fir(w) = min(u) = xβ then u = vu wu . Suppose that vu = su tu such that tu wu is an ALSW. Then tw is an ALSW and |tw| > |w|, a contradiction. This proves that wu is the longest proper ALSW suffix of u . By induction,

page 20

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

2.4. FREE LIE ALGEBRAS

textbook

21

[vu ] = [[vu ]] and [wu ] = [[wu ]]. Moreover, by the definition of [ ] and [[ ]], we have [ ] : u → vu wu → [vu ][wu ] → · · · [[ ]] : u → vu wu → [[vu ]][[wu ]] → · · · 

Therefore, [u] = [[u]].

Now we give the definition of a non-associative Lyndon–Shirshov word. Let X ∗∗ denote the set of all non-associative words in an alphabet X. Given an associative word u ∈ X ∗ , we use (u) to denote a nonassociative word obtained from u by some bracketing. Definition 2.29. Let < be the order on X ∗ as before and (u) a nonassociative word. Then (u) is called a non-associative Lyndon–Shirshov word (NLSW) if (i) u is an ALSW, (ii) if (u) = ((v)(w)), then both (v) and (w) are NLSWs, (iii) in (ii) if (v) = ((v1 )(v2 )), then v2 ≤ w in X ∗ . Remark 2.30. In Definition 2.29 (ii), we have v > w by Lemma 2.20. Theorem 2.31. Let u be an ALSW. Then there exists a unique bracketing (u) on u such that (u) is a NLSW. Proof. (Existence). Let u be an ALSW. We will prove, by induction on |u|, that up-to-down bracketing is one way of bracketing such that [[u]] is a NLSW. If |u| = 1, then there is nothing to do. Suppose that |u| > 1 and u = vw where w is the longest ALSW proper suffix of u. Then, [[u]] = [[[v]][[w]]]. By induction, both [[v]] and [[w]] are NLSWs. Now we assume that [[v]] = [[v1 ]][[v2 ]] and v2 > w. Then, v2 w is an ALSW, a contradiction. So, v2 ≤ w and hence, [[u]] is a NLSW. (Uniqueness). We assume that u is an ALSW and ( ) is a way of bracketing such that (u) is a NLSW. Then, we have to show (u) = [[u]]. We use induction on |u|. If |u| = 1, then clearly (u) = [[u]]. Suppose that |u| > 1 and u = xi1 xβ · · · xβ · · · xis xβ · · · xβ , where xij > xβ = min(u). Note that if v = xi xβ · · · xβ , xi > xβ , then [[v]] = [[· · · [[xi xβ ]] · · · xβ ]]xβ

page 21

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

22

9287 – Gr¨ obner-Shirshov Bases

textbook

2. FREE ALGEBRAS

is the unique way of bracketing such that [[v]] is a NLSW. According to the definition of NLSW, any associative word in a bracket must be ALSW. Hence, (u) = ((xi1 xβ · · · xβ )(xi2 xβ · · · xβ ) · · · (xis xβ · · · xβ )) = [[[[xi1 xβ · · · xβ ]][[xi2 xβ · · · xβ ]] · · · [[xis xβ · · · xβ ]]]]. By induction, (u ) = [[u ]] and therefore, [[u]] = [[u ]] = (u ) = (u).



If (u) is a NLSW, then we denote it by [u]. We now return to the free associative algebra kX generated by X. We consider ( ) as Lie bracket in kX, i.e., for every a, b ∈ kX interpret (ab) as ab − ba. The subset Lie(X) is the subalgebra of kX(−) generated by X. For a given noncommutative polynomial f ∈ kX, f = 0, its leading word according to the above order on X ∗ is denoted f¯ ∈ X ∗ . Then  f = αf + αi ui , where f , ui ∈ X ∗ , f > ui and α, αi ∈ k. Denote the set {u | f (u) = 0} by supp f and let deg f stands for |f |. A polynomial f is called monic if α = 1. Note that if |u| = |v| and u < v then the lexicographic order which we used before is the same as the degree-lexicographic order on X ∗ . Theorem 2.32. For every nonassociative word (u) ∈ X ∗∗ there is a presentation  (u) = αi [ui ], where each αi ∈ k, [ui ] is a NLSW and |ui | = |u|. Moreover, if (u) = ([v][w]), then ui > min{v, w}. Proof. If |u| = 1, then (u) = [u] and the result follows. If |u| > 1 then, without loss of generality, we may assume (u) = ([v][w]), where [v] and [w] are NLSWs. Indeed, if (u) = ((v)(w)) then, by induction on the length of u,   βj [wj ], (v) = αi [vi ] and (w) = where αi , βj ∈ k, [vi ] and [wj ] are NLSWs, |vi | = |v|, |wj | = |w|. It remains to prove the statement for each ([vi ][wj ]). In addition, we may assume that (u) = ([v][w]) with v > w because of ([v][w]) = −([w][v]). Proceed by induction on |v|. If |v| = 1, then (u) is a NLSW. Suppose that |v| > 1 and [v] = [[v1 ][v2 ]]. There are two subcases (a) If v2 ≤ w then (u) = (([v1 ][v2 ])[w]) is a NLSW.

page 22

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

2.4. FREE LIE ALGEBRAS

textbook

23

(b) If v2 > w then (u) = (([v1 ][v2 ])[w]) = (([v1 ][w])[v2 ]) + ([v1 ]([v2 ][w])). By induction, ([v1 ][w]) = ([v2 ][w]) =

 

Then (u) =

γi [ti ],

where ti > min{v1 , w} = w

γj  [tj ],



where tj > min{v2 , w} = w.

γi ([ti ][v2 ]) +



γj ([v1 ][tj ]).

Note that min{ti , v2 } and min{tj , v1 } > min{v, w} = w. Therefore, if (u) = ([v][w]) is not a NLSW then we may rewrite (u) as  (u) = δi ([pi ][qi ]), δi ∈ k, where min{pi , qi } > min{v, w}. If either of the words in the right-hand side is not a NLSW then apply such a rewriting again. The process stops since  cont(u) = cont(pi qi ) for all i. Example 2.33. Let (u) = (((x3 x2 )(x2 x1 ))(x2 x1 x1 )). Then (u) = (((x3 (x2 x1 ))x2 )(x2 x1 x1 )) + ((x3 (x2 (x2 x1 )))(x2 x1 x1 )), (((x3 (x2 x1 ))x2 )(x2 x1 x1 )) = (((x3 (x2 x1 ))(x2 x1 x1 ))x2 ) + ((x3 (x2 x1 ))(x2 (x2 x1 x1 ))) = (((x3 (x2 x1 x1 ))(x2 x1 ))x2 ) + ((x3 ((x2 x1 )(x2 x1 x1 )))x2 ) + ((x3 (x2 x1 ))(x2 (x2 x1 x1 ))), ((x3 (x2 (x2 x1 )))(x2 x1 x1 )) = ((x3 (x2 x1 x1 ))(x2 (x2 x1 ))) + (x3 ((x2 (x2 x1 ))(x2 x1 x1 ))) = ((x3 (x2 x1 x1 ))(x2 (x2 x1 ))) + (x3 ((x2 (x2 x1 x1 ))(x2 x1 )) + (x3 (x2 ((x2 x1 )(x2 x1 x1 )))), and hence, (u) = (((x3 (x2 x1 x1 ))(x2 x1 ))x2 ) + ((x3 ((x2 x1 )(x2 x1 x1 )))x2 ) + ((x3 (x2 x1 ))(x2 (x2 x1 x1 ))) + ((x3 (x2 x1 x1 ))(x2 (x2 x1 ))) + (x3 ((x2 (x2 x1 x1 ))(x2 x1 )) + (x3 (x2 ((x2 x1 )(x2 x1 x1 )))) is a linear combination of NLSWs. Lemma 2.34. Let [u] be a NLSW. Then [u] = u.

page 23

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

24

textbook

2. FREE ALGEBRAS

Proof. We use induction on |u|. If |u| = 1, then the result holds immediately. Let |u| > 1 and [u] = [[v][w]]. Then, by induction, [v] = v and [w] = w. Suppose that   αi vi , and [w] = w + βj wj , [v] = v + vi x2 > . . . following standard practice in the Gauss elimination algorithm and Gr¨ obner bases theory for commutative polynomials. For noncommutative and nonassociative variables we will choose x1 < x2 < . . . . Let us define the deg-lex order on the set of commutative words [X] to compare two words first by degree and then lexicographically. For example, x31 > x21 x2 > x1 x2 x3 > x21 . In general,   ik > jk , or xi11 xi22 . . . xinn > xj11 xj22 . . . xjnn iff   jk and (i1 , i2 , . . . , in ) > (j1 , j2 , . . . , jn ) lexicographically. ik = This order is a monomial order in the sense that (i) If u > v, then wu > wv for any w ∈ [X], (ii) [X] is a well ordered set, i.e., a linearly ordered set with minimal (descending chain) condition: every strictly decreasing chain u1 > u2 > · · · > ur > . . . is finite. Condition (ii) is equivalent to requiring that 1 is the least monomial. For if 1 > u for some monomial u then it follows from (i) that 1 > u > u2 > · · · which is a contradiction to (ii). The converse follows from the fact that k[X] is Noetherian, i.e., has no infinite proper ascending sequences of ideals (a consequence of Hilbert’s Basis Theorem). Indeed, if u1 > u2 > . . . is a descending chain of monomials then the chain of ideals Ik = Id(uk ) for k = 1, 2, . . . is strictly increasing. The lexicographic order may be regarded as a monomial one if we assume that a proper prefix of a word is less than the word. A more general example of a monomial order on [X] is lex-deg order which is defined as follows. Suppose X is split into disjoint nonempty subsets B1 , B2 , . . . , Bt and each [Bi ] is equipped with a monomial order >i . For two given monomials 29

page 29

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

30

9287 – Gr¨ obner-Shirshov Bases

textbook

3. COMPOSITION-DIAMOND LEMMA

u = u1 u2 · · · ut , v = v1 v2 · · · vt where each ui , vi is a monomial involving only inteterminates from block Bi we say that u > v iff u1 = v1 ,. . . , us−1 = vs−1 and us >s vs for some s with 1 ≤ s ≤ t − 1. In particular if the number of blocks t is the same as the number n of indeterminates then we have the lexicographic order. A polynomial f ∈ k[X] has a maximal (leading) word f¯ with a leading coefficient αf¯ = 0,  f = αf¯f¯ + αi ui , ui ∈ [X], αf¯, αi ∈ k. ui wl+1 ≥ · · · ≥ wm , l ≥ 1. We will use induction on (w1 , l) with w1 ≥ f¯ and l ≥ 1 using the lex order for such pairs. If w1 = f¯ or l = 1, the claim is clear. Otherwise, l > 1 and w1 = a1 s¯1 = a2 s¯2 . Since w1 is a common multiple of s¯1 and s¯2 , it must contain w = lcm(¯ s1 , s¯2 ) as a subword, w1 = b lcm(¯ s1 , s¯2 ) such that a1 s1 = bw|s¯1 →s1 , a2 s2 = bw|s¯2 →s2 . Then a1 s1 − a2 s2 = b(s1 , s2 )w ≡ 0 (mod S, w1 ). This means that we may rewrite the initial equality for f as f ≡ (α1 + α2 )a2 s2 + · · · + αl al sl + αl+1 al+1 sl+1 + . . . . If l > 2 or α1 + α2 = 0, we may apply the induction (since we decrease l). Otherwise, we decrease w1 . (ii) ⇒ (iii). For any S, the ELW’s by S in all monomials of a polynomial give rise a following presentation of an arbitrary polynomial h:   βj a j s j , (3.1) h= αi ui +

page 31

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

32

9287 – Gr¨ obner-Shirshov Bases

textbook

3. COMPOSITION-DIAMOND LEMMA

¯ This means that where each αi , βj ∈ k, ui ∈ Irr(S), sj ∈ S and aj s¯j ≤ h. Irr(S) is a set of linear generators of k[X | S]. If a linear combination of S-irreducible words ui belongs to I then by (ii) some uj contains s¯ for some s ∈ S, a contradiction. (iii) ⇒ (i). If h = (f, g)w then, as above, all αi = 0 and (f, g)w ≡ 0 (mod S, w).  Corollary 3.5. A monic set S ⊂ k[X] is a Gr¨ obner basis if and only if all compositions of polynomials from S can be reduced to zero by ELW’s by S. Proof. (⇐) This was proved above. (⇒) Let h = (f, g)w and f, g ∈ S. By ELW’s by S one can present h as before. By Theorem 3.4 all αi ’s are zero. This means that h goes to zero by ELW’s by S.  In the rest of this section we consider an algorithm which allows us to construct a Gr¨obner basis S c (a complement of S) starting from a set S of (monic) polynomials in such a way that Id(S c ) = Id(S). The process, known as the Buchberger algorithm, terminates in finitely many steps in the case of commutative algebras. Let X = {x1 , . . . , xn }, S = {s1 , . . . , sm } ⊂ k[X] and let I = Id(S) be the ideal of k[X] generated by S. If each composition of elements from S is trivial modulo (S, w) then S is a Gr¨obner basis. Otherwise, if there exists a nontrivial composition h = (f, g)w for some f, g ∈ S then let us add h to S. Note that if f, g ∈ I then h ∈ I by the definition of a composition. Then apply ELWs by S to present h in a form (3.1), and denote by h the sum of all S-irreducible terms: h = αi ui , ui ∈ Irr(S). Obviously, h ∈ I since the process of ELW by S mentioned above preserves membership of the ideal I. Alternatively, we may apply ELWs by S only to the leading ¯  is S-irreducible. term of h so that h → · · · → h , where h In any case, for a nontrivial composition h = (f, g)w we construct a polynomial h ∈ I with an S-irreducible leading term. It remains to add h to S. Let S ⊂ S  = S ∪ {h }. It is easy to see that Id(¯ s1 , . . . , s¯m ) is a proper subset of Id(¯ s1 , . . . , s¯m , s ) since h ∈ Id(¯ s1 , . . . , s¯m ). Then we continue the process and obtain a series of generator sets of Id(S), S ⊂ S  ⊂ S  ⊂ . . . .

(3.2)

such that the ideals generated by the leading monomials of S, S  , S  , . . . constitute an ascending chain of ideals in k[X]. By the Hilbert Basis Theorem the chain (3.2) is finite and hence the process terminates in finitely many steps.

page 32

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ 3.1. GROBNER BASES FOR COMMUTATIVE ALGEBRAS

textbook

33

If the initial set S is finite, then S  , S  , . . . are also finite. Therefore, every ideal I in k[X] has a finite Gr¨obner basis obtained by the Buchbereger algorithm applied to a finite set of generators of I. Definition 3.6. A Gr¨ obner basis S in k[X] is called reduced if for every s∈S we have supp(s) ⊆ Irr(S \ {s}), where supp(s) = {u1 , u2 , . . . , um } for  α u with 0 =

α s= m i i i ∈ k and ui ∈ [X]. i=1 A Gr¨obner basis S in k[X] is said to be minimal if there are no inclusions among polynomials from S, i.e., f¯ = a¯ g, a ∈ [X], for neither different f, g ∈ S. Clearly, a reduced Gr¨ obner basis is minimal. Corollary 3.7. For any ideal I of k[X] and any monomial order < on [X] there exists only one reduced Gr¨ obner basis of I. Proof. Clearly, there exists a minimal Gr¨ obner basis S ⊂ k[X] for the ideal I = Id(S). For any s ∈ S, by using (3.1), we have s = s + s , where supp(s ) ⊆ Irr(S \ {s}) and s ∈ Id(S \ {s}). Since S is a minimal Gr¨ obner basis, we have s¯ = s for any s ∈ S. Let S  = {s | s ∈ S}. It is clear that S  ⊆ Id(S) = I. By Theorem 3.4, for every f ∈ Id(S), f¯ = a¯ s = as for some a ∈ [X] and s ∈ S. Therefore,  S is a reduced Gr¨obner basis for the ideal I. Suppose that S and R are two reduced Gr¨obner bases for the ideal I. By Theorem 3.4, for every s ∈ S we have s¯ = a¯ r , r¯ = b¯ s1 for some a, b ∈ [X], r ∈ R, s1 ∈ S and hence s¯ = ab¯ s1 . Since s¯ ∈ supp(s) ⊆ Irr(S \ {s}), we have s = s1 . It follows that a = b = 1 and so s¯ = r¯. If s = r then 0 = s − r ∈ I = Id(S) = Id(R). By Theorem 3.4, s − r = a1 r¯1 = b1 s¯2 for some a1 , b1 ∈ [X] with r¯1 , s¯2 < s¯ = r¯. This means that s2 ∈ S \ {s} and r1 ∈ R \ {r}. Since s − r ∈ supp(s) ∪ supp(r), we have either s − r ∈ supp(s) or s − r ∈ supp(r). If s − r ∈ supp(s) then s − r ∈ Irr(S \ {s}) which contradicts s − r = b1 s¯2 ; if s − r ∈ supp(r) then s − r ∈ Irr(R \ {r}) which contradicts s − r = a1 r 1 . Hence, s = r and then S ⊆ R. Similarly, R ⊆ S.  The following membership problem plays a key role in “pure” algebra as well as in numerous applications: Given f ∈ k[X] and S ⊂ k[X], is f (X) ∈ Id(S)? Corollary 3.8. Given a finite set S ⊂ k[X], the membership problem for the ideal I = Id(S) is algorithmically solvable assuming we may compute with the elements of the underlying field k.

page 33

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

34

textbook

3. COMPOSITION-DIAMOND LEMMA

Proof. We may assume that S is a Gr¨obner basis of I. Induction on u = f ≥ 1. If u = 1, it is clear. Otherwise if u does not contain s¯, s ∈ S, the answer is negative. If u = a¯ s, s ∈ S, a ∈ [X] then f1 = f − αas has less leading monomial and is belonged to I. By induction we are done.  Remark 3.9. If S was infinite then by the Hilbert Basis Theorem there still exists a finite set of generators for I = Id(S). However, finding this finite set of generators is a separate problem. Corollary 3.10. The Gauss elimination algorithm for solving systems of linear equations is a particular case of Buchbereger’s algorithm. Proof. Let Ax = b,

A ∈ Matm,n (k), x = (x1 , . . . , xn ) and b = (b1 , . . . , bm )

be a system of linear equations over k. Let fi = ai x − bi , 1 ≤ i ≤ m be the corresponding system of linear polynomials. Then the Buchberger algorithm for {fi } coincides with the Gauss elimination algorithm for the system.  Theorem 3.11. Given Gr¨ obner bases of ideals I, J in k[X] we may find algorithmically a Gr¨ obner bases of I + J and I ∩ J. Proof. If S1 and S2 are finite Gr¨obner bases of I and J, respectively, then I +J is generated by S1 ∪S2 . The Buchbereger algorithm applied to S1 ∪S2 produces a Gr¨ obner basis of I + J. To find a Gr¨ obner basis of I ∩ J, consider the extended set of variables X ∪ {t} = {x1 , . . . , xn , t} and an ideal L of k[X ∪ {t}] generated by all ts1 , (1 − t)s2 for s1 ∈ S1 , s2 ∈ S2 . Obviously, L ∩ k[X] = I ∩ J. Therefore, we may apply the algorithm to the generators of L assuming a monomial order on [X ∪{t}] in which all words with t are greater than words without t (e.g., the lex-deg order with B1 = {t}, B2 = X). All polynomials in the resulting Gr¨ obner basis that do not contain t form a Gr¨obner basis of I ∩ J.  A more complicated √ algorithm exists for finding a Gr¨obner basis for the radical of I, the ideal I := {f ∈ k[X] | f m ∈ I for some m ≥ 1}. This algorithm is of crucial importance for algebraic geometry and commutative algebra, it can be found in [13]. 3.2. Gr¨ obner–Shirshov bases for associative algebras In this section, we state a proof of the Shirshov Composition-Diamond Lemma for associative algebras. Let k be a field, kX the free associative algebra over k generated by X and let X ∗ be the free monoid generated by X. The empty word is the

page 34

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ 3.2. GROBNER–SHIRSHOV BASES FOR ASSOCIATIVE ALGEBRAS

textbook

35

identity which is denoted by 1. For a word w ∈ X ∗ , we denote the length (degree) of w by |w| (deg w). Let X ∗ be a well ordered set. Let f ∈ kX, f = 0, with the leading word f¯ and f = αf¯ − rf , where 0 = α ∈ k. We call f monic if α = 1. A well order > on X ∗ is called monomial if it is compatible with the multiplication of words, that is, for u, v ∈ X ∗ , we have u > v ⇒ w1 uw2 > w1 vw2 , for all w1 , w2 ∈ X ∗ . A standard example of a monomial order on X ∗ (where X is a well ordered set) is the deg-lex order: to compare two words, first compare their degrees and then lexicographically. Let f and g be two monic polynomials in kX, and let < be a monomial order on X ∗ . Then there are two kinds of compositions: (i) If w is a word such that w = f¯b = a¯ g for some a, b ∈ X ∗ with |f¯| + |¯ g | > |w|, then the polynomial (f, g)w = f b − ag is called the intersection composition of f and g with respect to w. (ii) If w = f¯ = a¯ gb for some a, b ∈ X ∗ , then the polynomial (f, g)w = f − agb is called the inclusion composition of f and g with respect to w. The word w in the composition (f, g)w is called an ambiguity. Let S ⊂ kX be a subset of monic noncommutative polynomials, and kX is called trivial modulo (S, w), denoted w ∈ X ∗ . A polynomial h ∈ h ≡ 0 (mod S, w), if h = αi ai si bi , αi ∈ k, ai , bi ∈ X ∗ , si ∈ S, and ai s¯i bi < w. Elements of the form asb, for a, b ∈ X ∗ and s ∈ S, are called S-words. Note that S-words are actually polynomials, but they are obtained from words of the form a∗b in an extended alphabet X ∪ {∗} by replacing the new letter ∗ with s ∈ S. This explains the term “S-word”. A monic set S ⊂ kX is called a Gr¨obner–Shirshov basis (GS basis) in kX with respect to the monomial order < if every composition (f, g)w of polynomials f, g ∈ S is trivial modulo (S, w). A set S is called a minimal GS basis in kX if S is a GS basis in kX if there is no inclusion compositions in S, i.e., for any f, g ∈ S with f = g, one has f¯ = a¯ g b for a, b ∈ X ∗ . Denote sb, s ∈ S and a, b ∈ X ∗ }. Irr(S) = {u ∈ X ∗ | u = a¯ A GS basis S in kX is reduced if for all s ∈ S nwe have supp(s) ⊆ Irr(S \ {s}), where supp(s) = {u1 , u2 , . . . , un } if s = i=1 αi ui , 0 = αi ∈ k, ui ∈ X ∗ . The following lemma plays a key role for proving Composition-Diamond Lemma for associative algebras. Lemma 3.12. If S is a GS basis in kX and w = a1 s¯1 b1 = a2 s¯2 b2 , where a1 , b1 , a2 , b2 ∈ X ∗ , s1 , s2 ∈ S, then a1 s1 b1 ≡ a2 s2 b2 (mod S, w).

page 35

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

36

9287 – Gr¨ obner-Shirshov Bases

textbook

3. COMPOSITION-DIAMOND LEMMA

Proof. There are three cases to consider. Case 1. Suppose that the subwords s¯1 and s¯2 of w are disjoint, say, |a2 | ≥ |a1 | + |¯ s1 |. Then, a2 = a1 s¯1 c and b1 = c¯ s2 b2 for some c ∈ X ∗ , and s2 b2 . Now, so, w1 = a1 s¯1 c¯ a1 s1 b1 − a2 s2 b2 = a1 s1 c¯ s2 b2 − a1 s¯1 cs2 b2 = a1 s1 c(¯ s2 − s2 )b2 + a1 (s1 − s¯1 )cs2 b2 . Since s¯2 − s2 < s¯2 and s1 − s¯1 < s¯1 , we conclude that   a1 s1 b 1 − a2 s2 b 2 = αi ui s1 vi + βj uj s2 vj i

j

for some αi , βj ∈ k, S-words ui s1 vi and uj s2 vj such that ui s¯1 vi , uj s¯2 vj < w. Case 2. Suppose that the subword s¯1 of w contains s¯2 as a subword. Then s¯1 = a¯ s2 b, a2 = a1 a and b2 = bb1 , that is, w = a1 a¯ s2 bb1 for some S-word as2 b. We have a1 s1 b1 − a2 s2 b2 = a1 s1 b1 − a1 as2 bb1 = a1 s1 − as2 bb1 = a1 (s1 , s2 )s¯1 b1 . By the triviality of compositions, a1 s1 b1 ≡ a2 s2 b2 (mod S, w). Case 3. The words s¯1 and s¯2 have a nonempty intersection as subwords of w. We may assume that a2 = a1 a, b1 = bb2 , w = s¯1 b = a¯ s2 , |w| < |¯ s1 | + |¯ s2 |. Then, similar to the Case 2, we have a1 s1 b1 ≡ a2 s2 b2 (mod S, w).  Lemma 3.13. Let S ⊂ kX be a subset of monic polynomials. Then for every f ∈ kX we have   f= αi ui + βj a j s j b j , ui ≤f¯

aj sj bj ≤f¯

where each αi , βj ∈ k, ui ∈ Irr(S), and aj sj bj is an S-word. So, Irr(S) is a set of linear generators of the algebra kX | S. Proof. The result follows by induction on f¯.



Theorem 3.14 (Composition-Diamond Lemma for associative algebras). Let < be a monomial order on X ∗ , and let S ⊂ kX be a monic set. Then the following statements are equivalent. (i) S is a Gr¨ obner–Shirshov basis in kX. (ii) f ∈ Id(S) ⇒ f¯ = a¯ sb for some s ∈ S and a, b ∈ X ∗ . ∗ sb, s ∈ S, a, b ∈ X ∗ } is a linear basis of (iii) Irr(S) = {u ∈ X | u = a¯ the algebra kX | S.

page 36

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ 3.2. GROBNER–SHIRSHOV BASES FOR ASSOCIATIVE ALGEBRAS

textbook

37

Proof. n (i) ⇒ (ii). Let S be a GS basis and 0 = f ∈ Id(S). Then we have f = i=1 αi ai si bi where each αi ∈ k, ai , bi ∈ X ∗ and si ∈ S. Let w1 = w2 = · · · = wl > wl+1 ≥ · · · . We will use the induction on l and w1 to prove that f¯ = a¯ sb for some s ∈ S and a, b ∈ X ∗ . If l = 1 then f¯ = a1 s1 b1 = a1 s¯1 b1 and hence the result holds. Assume that l ≥ 2. Then, we have w1 = a1 s¯1 b1 = a2 s¯2 b2 . It follows from Lemma 3.12 that a1 s1 b1 ≡ a2 s2 b2 (mod S, w1 ). If α1 + α2 = 0 or l > 2 then the result follows by induction on l. For the case α1 + α2 = 0 and l = 2, we use induction on w1 . This shows (ii). (ii) ⇒ (iii).By Lemma 3.13, Irr(S) generates kX | S as a linear space. Suppose that i αi ui = 0 in kX | S, where 0 = αi ∈ k and ui ∈ Irr(S).   This means that i αi ui ∈ Id(S) in kX. Then i αi ui = uj ∈ Irr(S) for some j, which contradicts (ii). (iii) ⇒ (i). For any f, g ∈ S, by using Lemma 3.13 and (iii), we have  (f, g)w ≡ 0 (mod S, w). Therefore, S is a GS basis. wi = ai s¯i bi ,

Remark 3.15. Let K be a commutative ring with identity 1. It is easy to check that if we replace the field k by the commutative ring K, then all concepts and results in the above statements are true for any monic polynomial set S. In particular, we have the Composition-Diamond Lemma (CD-Lemma, for short) for a monic set S of the algebra KX over the commutative ring K. Similar to the Buchbereger algorithm of completion for commutative polynomials, there is a Shirshov algorithm for noncommutative polynomials. However, in contrast to the case of commutative algebra, the Shirshov algorithm in general does not terminate in a finite number of steps (indeed, we can not use Noetherianity in the noncommutative case). If a subset S ⊂ kX is not a GS basis then one can add to S all nontrivial compositions of polynomials from S. Continuing this process repeatedly, we finally obtain a GS basis S c that contains S. Such a process is called Shirshov’s algorithm, S c is called GS completion of S. Of course, in order to call this process an “algorithm” we have to assume that the initial set S is recursively enumerable (e.g., finite) and that the underlying field k is computable. The resulting set S c is also recursively enumerable (but not necessarily finite even if S was).  ∈ Note that, by Lemma 3.13, for each u ∈ X ∗ \ Irr(S) there is a u  < u and u − u  ∈ Id(S). The following k Irr(S)such that either u  = 0 or u theorem gives a linear basis of the ideal Id(S) if S ⊂ kX is a GS basis. Theorem 3.16. Suppose that S ⊂ kX is a Gr¨ obner–Shirshov basis. Then the set {u − u  | u ∈ X ∗ \ Irr(S)} is a linear basis of the ideal Id(S) of kX.

page 37

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

38

textbook

3. COMPOSITION-DIAMOND LEMMA

Proof. Suppose that 0 = f ∈ Id(S). Then by the CD-lemma for associative algebras, f¯ = a1 s¯1 b1 = u1 for some s1 ∈ S and a1 , b1 ∈ X ∗ which implies that f¯ = u1 ∈ X ∗ \ Irr(S). Let f1 = f − α1 (u1 − u 1 ), where α1 is the coefficient of the leading term of f and u 1 < u1 or u 1 = 0. Then f1 ∈ Id(S) and f¯1 < f¯. By induction on f¯, the set {u − u  | u ∈ X ∗ \ Irr(S)} generates Id(S) as a linear space. It is clear that the set {u − u  | u ∈ X ∗ \ Irr(S)} is linearly independent.  By a similar proof to that of Corollary 3.7, we have the following theorem. Theorem 3.17. Let I be an ideal of kX and < a monomial order on X ∗ . Then there exists unique reduced Gr¨ obner–Shirshov basis S for I. Example 3.18. Let K be a commutative ring with 1, and let A be an associative K-algebra. Suppose A is a free K-module with a linearly ordered basis X = {ai | i ∈ J}. For i, j ∈ J, the multiplication in A is given by  ai aj = αnij an , αnij ∈ K. Denote set



n∈J n ∗ n∈J αij an by μ{ai aj }. Then, relative to deg-lex order on X , the

S = {ai aj − μ{ai aj } | i, j ∈ J} is a Gr¨ obner–Shirshov basis of A in KX. Therefore, the algebra A has a presentation A = KX | S. Example 3.19. Let L be a Lie algebra over a field k with operation [··] and a well ordered linear basis X = {xi | i ∈ J}. The multiplication table of L may be presented as [xi xj ] = μ{xi , xj }, for i, j ∈ J and i > j.  Here μ{xi , xj } = n∈J αnij xn where αnij ∈ k for every i, j ∈ J and i > j. Then U (L) = kX | S (−)  is called the universal enveloping associative algebra of L, where S (−) = {xi xj − xj xi − μ{xi , xj } | i, j ∈ J and i > j}. The set S (−) is a Gr¨ obner–Shirshov basis in kX relative to the deg-lex order on X ∗ . The classical Poincar´e-Birkhoff-Witt (PBW) Theorem may be derived from the CD-Lemma. Corollary 3.20 (PBW Theorem). The set xi1 . . . xin ,

for i1 , . . . , in ∈ J with i1 ≤ · · · ≤ in and n ≥ 0,

forms a linear basis of U (L).

page 38

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ 3.3. GROBNER–SHIRSHOV BASES FOR TENSOR PRODUCT

textbook

39

In particular, every Lie algebra embeds injectively into an appropriate associative algebra. This observation allows us to conclude that Lie(X) ⊂ kX is indeed free Lie algebra generated by X (see Lemma 2.6). 3.3. Gr¨ obner–Shirshov bases for tensor product of free algebras Let X and Y be sets and kX ⊗ kY  be the tensor product algebra. In this section, we will give a Composition-Diamond Lemma for the algebra kX ⊗ kY . We will also prove a theorem on the pair of algebras (k[X] ⊗ kY , kX ⊗ kY ) following the spirit of Eisenbud, Peeva and Sturmfels’ theorem [134] on (k[X], kX). Let X and Y be two well-ordered sets, T = {yx = xy | x ∈ X and y ∈ Y }. With the deg-lex ordering (assuming y > x for every x ∈ X and y ∈ Y ) on (X ∪ Y )∗ , T is clearly a Gr¨obner–Shirshov basis in kX ∪ Y . Then, by Theorem 3.14, N = X ∗ Y ∗ = Irr(T ) = {u = uX uY | uX ∈ X ∗ and uY ∈ Y ∗ } is a set of normal words of the tensor product kX ⊗ kY  = kX ∪ Y | T . Let kN be a k-linear space spanned by N . For any u = uX uY and v = v X v Y , both from N , we define the multiplication of the normal words as follows uv = uX v X uY v Y ∈ N. It is clear that kN is exactly the tensor product algebra kX ⊗ kY , that is, kN = kX ∪ Y | T  = kX ⊗ kY . Now, we order the set N by means of the deg-lex rule. For any u = uX uY , v = v X v Y ∈ N , u > v ⇔ |u| > |v| or (|u| = |v| and (uX > v X or (uX = v X and uY > v Y ))), where |u| = |uX | + |uY | is the length of u. Obviously, > is a monomial ordering on N . Throughout this section, we will use the above ordering unless stated otherwise. For a polynomial f ∈ kX ⊗ kY , f has a unique presentation of the form  αi ui , f = αf¯f¯ + where f¯, ui ∈ N , f¯ > ui and αf¯, αi ∈ k. The proof of the following lemma is straightforward and we hence omit the details. Lemma 3.21. Let f ∈ kX⊗kY  be a monic polynomial. Then uf v = uf¯v for any u, v ∈ N .

page 39

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

40

textbook

3. COMPOSITION-DIAMOND LEMMA

We give here the definition of compositions. Let f and g be two monic polynomials of kX ⊗ kY  and w = wX wY ∈ N . Then we have the following compositions. (1) Inclusion: (1.1) X-inclusion only. g X b for some a, b ∈ X ∗ , and f¯Y , Suppose that wX = f¯X = a¯ Y g¯ are disjoint. Then there are two compositions according to wY = f¯Y c¯ g Y and wY = g¯Y cf¯Y for c ∈ Y ∗ , respectively: (f, g)w1 = f c¯ g Y − f¯Y cagb,

w1 = f X f¯Y c¯ gY ,

(f, g)w2 = g¯Y cf − agbcf¯Y ,

w2 = f X g¯Y cf¯Y .

and

(1.2) Y -inclusion only. g Y d for c, d ∈ Y ∗ and f¯X , g¯X are Suppose that wY = f¯Y = c¯ disjoint. Then there are two compositions according to wX = f¯X a¯ g X and wX = g¯X af¯X for a ∈ X ∗ , respectively: g X − f¯X acgd, (f, g)w1 = f a¯

w1 = f¯X a¯ gX f Y ,

(f, g)w2 = g¯X af − cgdaf¯X

w2 = g¯X af¯X f Y .

and

(1.3) X, Y -inclusion. g X b for some a, b ∈ X ∗ and wY = Suppose that wX = f¯X = a¯ Y Y ∗ ¯ f = c¯ g d for some c, d ∈ Y . Then (f, g)w = f − acgbd. The transformation f → (f, g)w = f − acgbd is said to be the elimination of leading word (ELW) by g in f . (1.4) X, Y -skew-inclusion. g X b for some a, b ∈ X ∗ and wY = Suppose that wX = f¯X = a¯ Y Y ∗ ¯ g¯ = cf d for some c, d ∈ Y . Then (f, g)w = cf d − agb. (2) Intersection: (2.1) X-intersection only. Suppose that wX = f¯X a = b¯ g X for some a, b ∈ X ∗ with |f¯X | + X X Y Y |¯ g | > |w |, and f¯ , g¯ are disjoint. Then there are two compog Y and wY = g¯Y cf¯Y for c ∈ Y ∗ , sitions according to wY = f¯Y c¯ respectively: (f, g)w1 = f ac¯ g Y − f¯Y cbg,

w1 = wX f¯Y c¯ gY

page 40

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ 3.3. GROBNER–SHIRSHOV BASES FOR TENSOR PRODUCT

textbook

41

and (f, g)w2 = g¯Y cf a − bgcf¯Y ,

w2 = wX g¯Y cf¯Y .

(2.2) Y -intersection only. g X for some c, d ∈ Y ∗ with |f¯Y | + Suppose that wY = f¯Y c = d¯ Y Y X X ¯ |¯ g | > |w |, and f , g¯ are disjoint. Then there are two compog X and wX = g¯X af¯X for a ∈ X ∗ , sitions according to wX = f¯X a¯ respectively: g X − f¯X adg, w1 = f¯X a¯ g X wY (f, g)w = f ca¯ 1

and (f, g)w2 = g¯X af c − dgaf¯X ,

w2 = g¯X af¯X wY .

(2.3) X, Y -intersection. g X for some a, b ∈ X ∗ and wY = f¯Y c = d¯ g Y for If wX = f¯X a = b¯ ∗ X X X Y ¯ ¯ g | > |w | and |f | + |¯ gY | > some c, d ∈ Y together with |f | + |¯ Y |w |, then (f, g)w = f ac − bdg. (2.4) X, Y -skew-intersection. g X for some a, b ∈ X ∗ and wY = cf¯Y = g¯Y d for If wX = f¯X a = b¯ ∗ some c, d ∈ Y together with |f¯X | + |¯ g X | > |wX | and |f¯Y | + |¯ gY | > Y |w |, then (f, g)w = cf a − bgd. (3) Both inclusion and intersection: (3.1) X-inclusion and Y -intersection. g X b for There are two subcases to consider. If wX = f¯X = a¯ ∗ Y Y Y ¯ some a, b ∈ X and w = f c = d¯ g for some c, d ∈ Y ∗ with Y Y Y ¯ |f | + |¯ g | > |w |, then (f, g)w = f c − adgb. If w = f = a¯ g b for some a, b ∈ X ∗ and wY = cf¯Y = g¯Y d for ∗ some c, d ∈ Y with |f¯Y | + |¯ g Y | > |wY |, then X

¯X

X

(f, g)w = cf − agbd. (3.2) X-intersection and Y -inclusion. There are two subcases to consider. If wX = f¯X a = b¯ g X for some ∗ X X X Y Y ¯ ¯ g | > |w | and w = f = c¯ g Y d for some a, b ∈ X with |f | + |¯ ∗ c, d ∈ Y , then (f, g)w = f a − bcgd. X ¯ g X for some a, b ∈ X ∗ with |f¯X | + |¯ g X | > |wX | If w = f a = b¯ Y Y Y ∗ ¯ and w = cf d = g¯ for some c, d ∈ Y , then X

(f, g)w = cf ad − bg.

page 41

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

42

textbook

3. COMPOSITION-DIAMOND LEMMA

From Lemma 3.21, it follows that for any case of compositions (f, g)w < w. If Y = ∅ then the compositions of f, g are the same as in kX. Let S be a monic subset of kX ⊗ kY  and f, g ∈ S. A composition (f, g)w is said to be trivial modulo (S, w), denoted by 

(f, g)w ≡ 0 (mod S, w),

if (f, g)w = i αi ai si bi , where ai , bi ∈ N , si ∈ S, αi ∈ k, and ai s¯i bi < w for all i. We call S a Gr¨obner–Shirshov basis (GS basis) in kX ⊗ kY  if all compositions of elements in S are trivial modulo S and corresponding w. Lemma 3.22. Let S be a GS basis in kX ⊗ kY  and s1 , s2 ∈ S. If w = a1 s¯1 b1 = a2 s¯2 b2 for some ai , bi ∈ N with i = 1, 2, then a1 s1 b1 ≡ a2 s2 b2 (mod S, w). Proof. There are four cases to consider. Case 1. Inclusion. ∗ (1.1) X-inclusion only. Suppose that w1X = s¯X sX 1 = a¯ 2 b, where a, b ∈ X , Y Y X X X X and s¯1 , s¯2 are disjoint. Then a2 = a1 a and b2 = bb1 . There are two subcases to be considered: w1Y = s¯Y1 c¯ sY2 and w1Y = s¯Y2 c¯ sY1 , where c ∈ Y ∗ . Y Y Y X Y Y Y Y Y s2 , we have w1 = s1 s¯1 c¯ s2 , a2 = a1 s¯1 c, bY1 = c¯ sY2 bY2 , For w1 = s¯1 c¯ X Y w = a1 w1 b1 b2 = a1 s¯1 ac¯ s2 b2 and hence Y Y Y sY2 bY2 − aX ¯1 cs2 bbX a1 s1 b 1 − a2 s2 b 2 = a1 s1 b X 1 c¯ 1 aa1 s 1 b2 Y = a1 (s1 c¯ sY2 − s¯Y1 cas2 b)bX 1 b2 Y = a1 (s1 , s2 )w1 bX 1 b2

≡0

(mod S, w).

For w1Y = s¯Y2 c¯ sY1 , we have w1 = sX ¯Y2 c¯ sY1 , aY1 = aY2 s¯Y2 c, bY2 = c¯ sY1 bY1 , 1 s X Y w = a1 a2 w1 b1 and hence Y Y Y X Y Y a1 s1 b 1 − a2 s2 b 2 = aX ¯2 cs1 b1 − aX s1 b 1 1 a2 s 1 aa2 s2 bb1 c¯ Y = aX sY2 cs1 − as2 bc¯ sY1 )b1 1 a2 (¯ Y = aX 1 a2 (s1 , s2 )w1 b1

≡ 0 (mod S, w). (1.2) Y -inclusion only. This case is similar to 1.1.

page 42

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ 3.3. GROBNER–SHIRSHOV BASES FOR TENSOR PRODUCT

textbook

43

(1.3) X, Y -inclusion. We may assume that s¯2 is a subword of s¯1 , i.e., w1 = X X X Y Y s2 bd, where a, b ∈ X ∗ , c, d ∈ Y ∗ , aX s¯1 = ac¯ 2 = a1 a, b2 = bb1 , a2 = a1 c Y Y and b2 = db1 . Thus, a2 = a1 ac, b2 = bdb1 , w = a1 w1 b1 and hence a1 s1 b1 − a2 s2 b2 = a1 s1 b1 − a1 acs2 bdb1 = a1 (s1 − acs2 bd)b1 = a1 (s1 , s2 )w1 b1 ≡0

(mod S, w).

∗ (1.4) X, Y -skew-inclusion. Assume that w1X = s¯X sX 1 = a¯ 2 b, where a, b ∈ X Y Y Y ∗ X X X X and w1 = s¯2 = c¯ s1 d, where c, d ∈ Y . Then a2 = a1 a, b2 = bb1 , Y X Y aY1 = aY2 c and bY1 = dbY2 . Thus, w = aX 1 a2 w1 b1 b2 and hence Y X Y X Y X Y a1 s1 b 1 − a2 s2 b 2 = aX 1 a2 cs1 b1 db2 − a1 aa2 s2 bb1 b2 Y X Y = aX 1 a2 (cs1 d − as2 b)b1 b2 Y X Y = aX 1 a2 (s1 , s2 )w1 b1 b2

≡0

(mod S, w).

Case 2. Intersection. (2.1) X-intersection only. We may assume that s¯X ¯X 1 is at the left of s 2 , i.e., X X X ∗ X X X X s2 , where a, b ∈ X and |¯ s1 |+|¯ s2 | > |w1 |. Then a2 = aX w1 = s¯1 b = a¯ 1 a X Y Y Y and bX = bb . There are two subcases to be considered: w = s ¯ c¯ s and 1 2 1 1 2 w1Y = s¯Y2 c¯ sY1 , c ∈ Y ∗ . sY2 , i.e., w1 = s¯1 bc¯ sY2 , we have aY2 = aY1 s¯Y1 c, bY1 = c¯ sY2 bY2 , For w1Y = s¯Y1 c¯ w = a1 s¯1 ac¯ s2 b2 = a1 w1 b2 and Y Y a1 s1 b1 − a2 s2 b2 = a1 s1 bbX sY2 bY2 − aX ¯2 cs2 b2 2 c¯ 1 aa1 s

= a1 (s1 bc¯ sY2 − a¯ sY2 cs2 )b2 = a1 (s1 , s2 )w1 b2 ≡0

(mod S, w).

For w1Y = s¯Y2 c¯ sY1 , i.e., w1 = s¯Y2 c¯ s1 b, we have aY1 = aY2 s¯Y2 c, bY2 = c¯ sY1 bY1 , Y X Y a w b b and hence w = aX 1 2 1 2 1 Y Y Y X Y X Y Y a1 s1 b 1 − a2 s2 b 2 = aX ¯2 cs1 bbX s1 b 1 1 a2 s 2 b1 − a1 aa2 s2 b2 c¯ Y Y = aX sY2 cs1 b − as2 c¯ sY1 )bX 1 a2 (¯ 2 b1 Y X Y = aX 1 a2 (s1 , s2 )w1 b2 b1

≡0

(mod S, w).

(2.2) Y -intersection only. This case is similar to 2.1.

page 43

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

44

9287 – Gr¨ obner-Shirshov Bases

textbook

3. COMPOSITION-DIAMOND LEMMA

Y (2.3) X, Y -intersection. Assume that w1X = s¯X sX ¯Y1 d = c¯ sY2 , 1 b = a¯ 2 , w1 = s ∗ ∗ X X X Y Y s1 | + |¯ s2 | > |w1 | and |¯ s1 | + |¯ s2 | > |w1Y |. where a, b ∈ X , c, d ∈ Y , |¯ X X X X Y Y Y Y Then a2 = a1 a, b1 = bb2 , a2 = a1 c, b1 = db2 , w = a1 w1 b2 and hence Y X Y a1 s1 b1 − a2 s2 b2 = a1 s1 bbX 2 db2 − a1 aa1 cs2 b2

= a1 (s1 bd − acs2 )b2 = a1 (s1 , s2 )w1 b2 ≡0

(mod S, w).

Y sX sY1 = (2.4) X, Y -skew-intersection. Assume that w1X = s¯X 1 b = a¯ 2 , w1 = c¯ Y ∗ ∗ X X X Y Y s¯2 d, where a, b ∈ X , c, d ∈ Y , |¯ s1 | + |¯ s2 | > |w1 | and |¯ s1 | + |¯ s2 | > |w1Y |. X X X X Y Y Y Y X Y Y Then a2 = a1 a, b1 = bb2 , a1 = a2 c, b2 = db1 , w = a1 a2 w1 bX 2 b1 and hence Y X Y X Y X Y a1 s1 b 1 − a2 s2 b 2 = aX 1 a2 cs1 bb2 b1 − a1 aa2 s2 b2 db1 Y X Y = aX 1 a2 (cs1 b − as2 d)b2 b1 Y X Y = aX 1 a2 (s1 , s2 )w1 b2 b1

≡0

(mod S, w).

Case 3. Both inclusion and intersection. (3.1) X-inclusion and Y -intersection. We may assume that w1X = s¯X 1 = ∗ X X X X a¯ sX 2 b, where a, b ∈ X . Then a2 = a1 a and b2 = bb1 . There two cases to consider: w1Y = s¯Y1 d = c¯ sY2 and w1Y = c¯ sY1 = s¯Y2 d, where c, d ∈ Y ∗ , Y Y Y s2 | > |w1 |. |¯ s1 | + |¯ Y For w1Y = s¯Y1 d = c¯ sY2 , we have aY2 = aY1 c, bY1 = dbY2 , w = a1 w1 bX 1 b2 and hence Y X Y X Y a1 s1 b 1 − a2 s2 b 2 = a1 s1 b X 1 db2 − a1 aa1 cs2 bb1 b2 Y = a1 (s1 d − acs2 b)bX 1 b2

= a1 (s1 , s2 )w1 b2 ≡ 0 (mod S, w). Y X Y sY1 = s¯Y2 d, we have aY1 = aY2 c, bY2 = dbY1 , w = aX For w1Y = c¯ 1 a2 w1 b2 db1 and hence Y X Y X Y a1 s1 b 1 − a2 s2 b 2 = aX 1 a2 cs1 b1 − a1 aa2 s2 bb1 db1 Y = aX 1 a2 (cs1 − as2 bd)b1 Y = aX 1 a2 (s1 , s2 )w1 b1

≡ 0 (mod S, w).

page 44

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ 3.3. GROBNER–SHIRSHOV BASES FOR TENSOR PRODUCT

textbook

45

(3.2) X-intersection and Y -inclusion. Assume that w1X = s¯X sX 1 b = a¯ 2 , ∗ X X X X X X X s1 | + |¯ s2 | > |w1 |. Then a2 = a1 a, b1 = bb2 . There are a, b ∈ Y with |¯ two subcases to be considered: w1Y = s¯Y1 = c¯ sY2 d and s¯Y2 = c¯ sY1 d, where ∗ c, d ∈ Y . Y For w1Y = s¯Y1 = c¯ sY2 d, we have aY2 = aY1 c, bY2 = dbY1 , w = a1 w1 bX 2 b1 and hence Y X Y a1 s1 b1 − a2 s2 b2 = a1 s1 bbX 2 b1 − a1 acs2 b2 db1 Y = a1 (s1 b − acs2 d)bX 2 b1 Y = a1 (s1 , s2 )w1 bX 2 b1

≡0

(mod S, w).

Y For w1Y = s¯Y2 = c¯ sY1 d, we have aY1 = aY2 c, bY1 = dbY2 , w = aX 1 a2 w1 b2 and hence Y X Y X Y a1 s1 b 1 − a2 s2 b 2 = aX 1 a2 cs1 bb2 db2 − a1 aa2 s2 b2 Y = aX 1 a2 (cs1 bd − as2 )b2 Y = aX 1 a2 (s1 , s2 )w1 b2

≡ 0 (mod S, w). Case 4. Suppose s¯1 and s¯2 are disjoint. For w = wX wY , by symmetry, sY2 bY2 and wY = there are two subcases to be considered: wY = aY1 s¯Y1 c¯ Y Y Y Y X X X X X ∗ X X a2 s¯2 c¯ s1 b1 , where w = a1 s¯1 a¯ s2 b2 , with a ∈ X , a2 = aX ¯X 1 s 1 a, b1 = X X ∗ a¯ s2 b2 , and c ∈ Y . X Y Y For w = aX ¯X sX ¯1 c¯ sY2 bY2 = a1 s¯1 ac¯ s2 b2 , we have a2 = a1 s¯1 ac, 1 s 1 a¯ 2 b 2 a1 s b1 = ac¯ s2 b2 and hence a1 s1 b1 − a2 s2 b2 = a1 s1 ac¯ s2 b2 − a1 s¯1 acs2 b2 = a1 (s1 − s¯1 )acs2 b2 − a1 s1 ac(s2 − s¯2 )b2 ≡0

(mod S, w).

X Y Y For w = aX ¯X sX ¯2 c¯ sY1 bY1 , we have aY1 = aY2 s¯Y2 c, bY2 = c¯ sY1 bY1 and 1 s 1 a¯ 2 b 2 a2 s hence Y Y X Y X X a1 s1 b 1 − a2 s2 b 2 = aX ¯2 cs1 a¯ sX ¯1 aaY2 s2 bX sY1 bY1 1 a2 s 2 b 2 b 1 − a1 s 2 c¯ Y Y = aX sY2 cs1 a¯ sX ¯X sY1 )bX 1 a2 (¯ 2 −s 1 as2 c¯ 2 b1 .

page 45

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

46

textbook

3. COMPOSITION-DIAMOND LEMMA

Let s1 =

n i=1

Y αi uX 1i u1i and s2 =

sX ¯X sY1 = s¯Y2 cs1 a¯ 2 −s 1 as2 c¯

n 

m j=1

Y βj u X 2j u2j , where α1 = β1 = 1. Then

αi uX s2 cuY1i − 1i a¯

i=2

=

n 

αi uX s2 − s2 )cuY1i + 1i a(¯

m 

βj uY2j c(s1 − s¯1 )auX 2j

j=2

+

n 

Y αi uX 1i as2 cu1i −

i=2 n m 

m 

βj uY2j cs1 auX 2j

j=2

X Y Y αi βj uX 1i au2j u2j cu1i −

i=2 j=2

≡0

βi uY2j c¯ s1 auX 2j

j=2

i=2



m 

m  n 

X αi βj uY2j cuY1i uX 1i au2j

j=2 i=2

(mod S, w1 )

Y X Y s1 a¯ sX ¯X s2 c¯ sY1 . Since w = aX where w1 = s¯Y2 c¯ 2 = s 1 a¯ 1 a2 w1 b2 b1 , we have Y Y sY2 cs1 a¯ sX ¯X sY1 )bX a1 s1 b 1 − a2 s2 b 2 = aX 1 a2 (¯ 2 −s 1 as2 c¯ 2 b1

≡0

(mod S, w). 

This completes the proof.

Lemma 3.23. Let S ⊂ kX ⊗ kY  with each s ∈ S monic and Irr(S) = {w ∈ N | w = a¯ sb, a, b ∈ N and s ∈ S}. Then for any f ∈ kX ⊗ kY ,   f= αi ai si bi + βj u j , ai s¯i bi ≤f¯

uj ≤f¯

where αi , βj ∈ k, ai , bi ∈ N , si ∈ S, and uj ∈ Irr(S). Proof. Let f =



αi ui ∈ kX ⊗ kY , where 0 = αi ∈ k and u1 > u2 >

i

· · · . If u1 ∈ Irr(S), then let f1 = f − α1 u1 . If u1 ∈ Irr(S) then there exist s ∈ S and a1 , b1 ∈ N such that f¯ = a1 s¯1 b1 . Let f1 = f − α1 a1 s1 b1 . In both cases, we have f¯1 < f¯. Then the result follows from induction on f¯.  By summarizing the above lemmas, we arrive at the following: Theorem 3.24 (Composition-Diamond Lemma for tensor product). Let S ⊂ kX ⊗ kY  with each s ∈ S monic and < the ordering on N = X ∗ Y ∗ as before. Then the following statements are equivalent. (i) S is a Gr¨ obner–Shirshov basis in kX ⊗ kY . (ii) f ∈ Id(S) ⇒ f¯ = a¯ sb for some a, b ∈ N and s ∈ S. (iii) Irr(S) = {w ∈ N | w = a¯ sb, a, b ∈ N and s ∈ S} is a k-linear basis for the quotient algebra kX ⊗ kY / Id(S).

page 46

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ 3.4. GROBNER–SHIRSHOV BASES FOR LIE ALGEBRAS

textbook

47

 Proof. (i) ⇒ (ii). Suppose that 0 = f ∈ Id(S). Then f = αi ai si bi for some αi ∈ k, ai , bi ∈ N and si ∈ S. Let wi = ai s¯i bi and w1 = w2 = · · · = wl > wl+1 ≥ · · · . We will prove that f¯ = a¯ sb for some a, b ∈ N and s ∈ S, by using induction on (w1 , l). If l = 1, then the result is clear. If l > 1, then w1 = a1 s¯1 b1 = a2 s¯2 b2 . Now, by (i) and Lemma 3.22, we see that a1 s1 b1 ≡ a2 s2 b2 (mod S, w1 ). Thus, α1 a1 s1 b1 + α2 a2 s2 b2 = (α1 + α2 )a1 s1 b1 + α2 (a2 s2 b2 − a1 s1 b1 ) ≡ (α1 + α2 )a1 s1 b1

(mod S, w1 ).

Now (ii) follows. (ii) ⇒ (iii). For any 0 = f ∈ kX ⊗ kY , by Lemma 3.23, we can express f as   βj u j , f= αi ai si bi + where αi , βj ∈ k, ai , bi ∈ N , si ∈ S and uj ∈ Irr(S). Then Irr(S) generates the factor algebra. Moreover, if 0 = h = βj uj ∈ Id(S), uj ∈ Irr(S), ¯ = a¯ u1 > u2 > · · · and β1 = 0, then u1 = h sb for some a, b ∈ N and s ∈ S by (ii), a contradiction. This shows that Irr(S) is a linear basis of the quotient algebra. (iii) ⇒ (i). For every f, g ∈ S, we have h = (f, g)w ∈ Id(S). The result is now trivial if (f, g)w = 0. Assume that (f, g)w = 0. Then, by Lemma 3.23 and (iii), we have  αi ai si bi . h= ¯ ai s¯i bi ≤h

¯ = (f, g) < w, we see immediately that (i) holds.  Now, by noting that h w Remark 3.25. Theorem 3.24 is valid for any monomial ordering on X ∗ Y ∗ . Remark 3.26. Theorem 3.24 is precisely the Composition-Diamond Lemma for associative algebras (Theorem 3.14) when Y = ∅. 3.4. Gr¨ obner–Shirshov bases for Lie algebras In this section, we give the Composition-Diamond Lemma for Lie algebras. Let X = {x1 , x2 , . . . } be a well-ordered set as above. Throughout this section, we extend the order on X to the deg-lex order < on X ∗ . Denote by Lie(X) = Liek (X) the free Lie algebra generated by X over a field k. Recall that the set of all nonassociative Lyndon–Shirshov words (NLSWs) forms a linear basis of Lie(X). Lemma 3.27. Let ac and cb be ALSWs, where a, b, c ∈ X ∗ and c = 1. Then w = acb is also an ALSW.

page 47

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

48

textbook

3. COMPOSITION-DIAMOND LEMMA

Proof. We use induction on |w| = n. If n = 3, then w = xi xj xk is an ALSW, because ac and cb are ALSWs and thus xi > xj > xk . In the inductive case n > 3, suppose min(w) = xβ , b = xeβ ˜b, e ≥ 0, fir(˜b) = xβ and c˜ = cxeβ . Then w = a˜ c˜b and w = a c˜ ˜b . w w w

It is clear that so is w.

aw c˜w ,

c˜w ˜bw

are ALSWs. By induction, w is an ALSW and 

Definition 3.28. Let f and g be two monic Lie polynomials in Lie(X) ⊂ kX. Then, there are two kinds of Lie compositions: (i) If w = f¯ = a¯ gb for some a, b ∈ X ∗ , then the polynomial f, gw = f − [agb]g¯ is called the composition of inclusion of f and g with respect to w. (ii) If w is a word such that w = f¯b = a¯ g for some a, b ∈ X ∗ with ¯ deg(f ) + deg(¯ g ) > deg(w), then the polynomial f, gw = [f b]f¯ − [ag]g¯ is called the composition of intersection of f and g with respect to w. By Lemma 3.27, in the Definition 3.28 (i) and (ii), w is an ALSW. Definition 3.29. Let S ⊂ Lie(X) be a nonempty subset, h a Lie polynomial and w ∈ X ∗ . We shall say  that h is trivial modulo (S, w), denoted by h ≡Lie 0 (mod S, w), if h = i αi [ai si bi ]s¯i , where each αi ∈ k, ai , bi ∈ X ∗ , si ∈ S, [ai s¯i bi ]s¯i is a RNLSW (see Definition 2.41) and ai s¯i bi < w. Definition 3.30. Let S ⊂ Lie(X) be a nonempty set of monic Lie polynomials. Then S is called a Gr¨obner–Shirshov basis in Lie(X) if every composition f, gw with f, g ∈ S is trivial modulo (S, w), i.e., f, gw ≡Lie 0 (mod S, w). Lemma 3.31. Let f, g be monic Lie polynomials. Then f, gw − (f, g)w ≡ 0

(mod {f, g}, w)

in the associative sense (see Definition 3.3). Proof. If f, gw and (f, g)w are compositions of intersection, where w = f¯b = a¯ g , then, by Corollary 2.42, we may assume that   αi ai f bi − ag − βj aj gbj , f, gw = [f b]f¯ − [ag]g¯ = f b + I1

I2

page 48

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ 3.4. GROBNER–SHIRSHOV BASES FOR LIE ALGEBRAS

textbook

49

where ai f¯bi , aj g¯bj < f¯b = a¯ g = w. It follows that f, gw − (f, g)w ≡ 0 (mod {f, g}, w). Similarly, for the case of the compositions of inclusion, we have the same conclusion.  Theorem 3.32. Let S ⊂ Lie(X) ⊂ kX be a nonempty set of monic Lie polynomials. Then S is a Gr¨ obner–Shirshov basis in Lie(X) if and only if S is a Gr¨ obner–Shirshov basis in kX. Proof. Note that, by the definitions, for any f, g ∈ S, they have composition in Lie(X) if and only if they do so in kX. Suppose that S is a Gr¨obner–Shirshov basis in Lie(X). Then, for any composition f, gw , we have  f, gw = αi [ai si bi ]s¯i , I1

where [ai s¯i bi ]s¯i are RNLSWs and ai s¯i bi < w. By Corollary 2.42,  f, gw = βj cj sj dj , I2

where each cj s¯j dj < w. Thus, by Lemma 3.31, we obtain (f, g)w ≡ 0 (mod S, w). Hence, S is a Gr¨ obner–Shirshov basis in kX. Conversely, assume that S is a Gr¨ obner–Shirshov basis in kX. Then, for every composition f, gw in S, by Lemma 3.31, we obtain f, gw ≡ (f, g)w ≡ 0

(mod S, w).

Therefore, we can assume by Theorem 3.14 that  f, gw = αi ai si bi , I1

where ai s¯i bi < w and w > a1 s¯1 b1 > a2 s¯2 b2 > . . .. Since f, gw ∈ Lie(X), we obtain that f, gw = a1 s¯1 b1 is an ALSW which shows [a1 s¯1 b1 ]s¯1 to be ¯ 1 < f, g . Then, a RNLSW. Let h1 = f, gw − α1 [a1 s1 b1 ]s¯1 . Clearly, h w by Corollary 2.42, we have h1 ≡ 0 (mod S, w). Now, by induction on f, gw , we have  f, gw = αi [ci si di ]s¯i , I2

page 49

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

50

9287 – Gr¨ obner-Shirshov Bases

textbook

3. COMPOSITION-DIAMOND LEMMA

where each [ci s¯i di ]s¯i is a RNLSW and ci s¯i di < w. This proves that S is a Gr¨ obner–Shirshov basis in Lie(X).  As an immediate corollary, we obtain the following statement. Theorem 3.33 (PBW Theorem in the Shirshov form). Let L = Lie(X | S), S ⊂ Lie(X) ⊂ kX, S a Gr¨ obner–Shirshov basis in Lie(X) and U (L) = kX | S. Then S is a GS basis in kX and a linear basis of U (L) consists of words u = u1 · · · un , where u1 ≤ · · · ≤ un in lex-order with n ≥ 0 and each ui is S-irreducible associative Lyndon–Shirshov word.  Lemma 3.34. Let S ⊂ Lie(X) with each s ∈ S monic. Let Irr(S) = {[u] | [u] is a NLSW, u = a¯ sb, s ∈ S and a, b ∈ X ∗ }. Then every h ∈ Lie(X) has a representation   αi [ui ] + h= ¯ [ui ]∈Irr(S), ui ≤h

βj [aj sj bj ]s¯j .

¯ sj ∈S, aj s¯j bj ≤h

 Proof. We can assume that h = i αi [ui ], where each [ui ] is a NLSW, 0 = αi ∈ k and u1 > u2 > · · · . If [u1 ] ∈ Irr(S), then let h1 = h − α1 [u1 ]. If [u1 ] ∈ Irr(S), then there exists s ∈ S and a1 , b1 ∈ X ∗ such that u1 = a1 s¯1 b1 . Now, let h1 = h − α1 [a1 s1 b1 ]s¯1 ∈ Lie(X). ¯ Now, the result follows from inducHence, in both cases, we have h¯1 < h. ¯ tion on h.  Theorem 3.35 (Composition-Diamond Lemma for Lie algebras over a field). Let S ⊂ Lie(X) ⊂ kX be a nonempty set of monic Lie polynomials. Let IdLie (S) be the Lie ideal of Lie(X) generated by S. Then the following statements are equivalent. (i) S is a Gr¨ obner–Shirshov basis in Lie(X). sb for some s ∈ S and a, b ∈ X ∗ . (ii) f ∈ IdLie (S) =⇒ f¯ = a¯  (ii ) f ∈ IdLie (S) =⇒ f = α1 [a1 s1 b1 ]s¯1 + α2 [a2 s2 b2 ]s¯2 + · · · , where αi ∈ k and f¯ = a1 s¯1 b1 > a2 s¯2 b2 > · · · . (iii) Irr(S) = {[u] | [u] is a NLSW, u = a¯ sb, s ∈ S and a, b ∈ X ∗ } is a k-basis for Lie(X | S). Proof. (i) ⇒ (ii). By noting that IdLie (S) ⊆ Id(S), where Id(S) is the ideal of kX generated by S, and by using Theorems 3.32 and 3.14, the result follows.  αi [ui ] = 0 in Lie(X | S) with u1 > (ii) ⇒ (iii). Suppose that [ui ]∈Irr(S)  u2 > · · · , that is, αi [ui ] ∈ IdLie (S). Then each αi must be 0. [ui ]∈Irr(S)

page 50

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ 3.4. GROBNER–SHIRSHOV BASES FOR LIE ALGEBRAS

Otherwise, say α1 = 0. Then, by (ii), we know that



51

αi [ui ] = u1 which

i

implies that [u1 ] ∈ Irr(S), a contradiction. On the other hand, for any f ∈ Lie(X), by Lemma 3.34, we have  f + IdLie (S) = αi ([ui ] + IdLie (S)). i

(iii) ⇒ (i). For every composition f, gw with f, g ∈ S, we have f, gw ∈ IdLie (S). Then, by (iii) and by Lemma 3.34,  f, gw = βj [aj sj bj ]s¯j , where each βj ∈ k, [aj sj bj ]s¯j is a normal S-word, and aj s¯j bj < w. This proves that S is a Gr¨ obner–Shirshov basis in Lie(X).  (ii) ⇒ (ii ). This part is clear. In the simplest case when S consists of only one Lie polynomial, we obtain a fundamental corollary known as the Freiheitssatz for Lie algebras. Example 3.36 (Shirshov). For an arbitrary polynomial f ∈ Lie(X), it is clear that {f } is a Gr¨ obner–Shirshov basis in Lie(X). From this it follows that the word problem is solvable for each one-relator Lie algebra Lie(X | f ).

page 51

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

b2530   International Strategic Relations and China’s National Security: World at the Crossroads

b2530_FM.indd 6

This page intentionally left blank

01-Sep-16 11:03:06 AM

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

CHAPTER 4

Applications of Gr¨ obner–Shirshov bases In this chapter, we review some applications of Gr¨ obner–Shirshov bases for noncommutative associative algebras and for Lie algebras. 4.1. Normal form for semigroups and groups Let X be a set, S ⊆ X ∗ × X ∗ , and let ρ(S) be the congruence on X ∗ generated by S, then A = SmgX | S = X ∗ /ρ(S) is the quotient semigroup and k(X ∗ /ρ(S)) is the corresponding semigroup algebra. We identify the set {u − v | (u, v) ∈ S} with S. Then it is easy to see that   αi ui + Id(S) → αi u¯i σ : kX | S → k(X ∗ /ρ(S)), is an algebra isomorphism. The Shirshov completion S c of S consists of “semigroup relations”, S c = {ui − vi | i ∈ I}. Then Irr(S c ) is a linear basis of kX | S and so σ(Irr(S c )) is a linear basis of k(X ∗ /ρ(S)). This shows that the members of Irr(S c ) are exactly the normal forms of elements of the semigroup SmgX | S. Thus, in order to find normal forms of the semigroup SmgX | S, it suffices to find a GS basis S c in kX | S. In particular, let G = X | S be the group generated by X with defining relations S, where S = {(ui , vi ) ∈ F (X) × F (X) | i ∈ I}, F (X) is the free group generated by X. Then G has a presentation (as a semigroup) G = SmgX ∪ X −1 | S, xε x−ε − 1, ε = ±1, x ∈ X,

X ∩ X −1 = ∅.

Let us demonstrate this approach with some examples. 1. A normal form for the alternating group. In this section, we first find a Gr¨ obner–Shirshov basis for the alternating group An and then we give the normal form theorem of An . Let Sn be the group of all permutations of {1, 2, · · · , n}. Then the subset An of all even permutations in Sn is a normal subgroup of Sn called 53

page 53

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

54

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

the alternating group of degree n. The following presentation of An was given in the monograph of N. Jacobson (see [168], p. 71):  An = xi , 1 ≤ i ≤ n − 2 | x31 = 1, (xi−1 xi )3 = x2i = 1, 2 ≤ i ≤ n − 2,  (xi xj )2 = 1, 1 ≤ i < j − 1, j ≤ n − 2 where xi = (12)((i + 1)(i + 2)), i = 1, 2, . . . , n − 2, with (ij) being a transposition. We now give a presentation of the group An as a semigroup:  −1 −1 3 An = Smg x−1 1 , xi , 1 ≤ i ≤ n − 2 | x1 x1 = x1 x1 = x1 = 1, (xi−1 xi )3 = x2i = 1, 2 ≤ i ≤ n − 2, (xi xj )2 = 1, 1 ≤ i < j − 1, j ≤ n − 2



Let us order the generators in the following way: x−1 1 < x1 < x2 < · · · < xn−2 Let X = {x−1 1 , x1 , x2 , . . . , xn−2 }. Now we define the words xji = xj xj−1 · · · xi , where j > i > 1 and xj1ε = xj xj−1 · · · x1 ε , for ε = ±1. The proof of the following lemma is straightforward. We omit the details. Lemma 4.1. For ε = ±1, the following relations hold in the alternating group An : −ε (1) x2ε 1 = x1 ; 2 (2) xi = 1, for i > 1; (3) xj xi = xi xj , for j − 1 > i ≥ 2; (4) xj xε1 = x−ε 1 xj , for j > 2; (5) xji xj = xj−1 xji , for j > i ≥ 2; (6) xj1ε xj = xj−1 xj1−ε , for j > 2; −ε (7) x2 x1 ε x2 = x−ε 1 x2 x1 ; ε −ε (8) x1 x1 = 1. Now, we can state the normal form theorem for the group An . Theorem 4.2. Let  An = xi , 1 ≤ i ≤ n − 2 | x31 = 1, (xi−1 xi )3 = xi 2 = 1, 2 ≤ i ≤ n − 2,  (xi xj )2 = 1, 1 ≤ i < j − 1, j ≤ n − 2 be the alternating group of degree n. Let S consist of relations (1)–(8) from Lemma 4.1. Then (i) with the deg-lex order, S is a Gr¨ obner–Shirshov basis of the alternating group An ;

page 54

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.1. NORMAL FORM FOR SEMIGROUPS AND GROUPS

textbook

55

(ii) every element w of An has a unique representation w = x1j1 x2j2 · · · xn−2jn−2 , where xtt = xt for t > 1, xii+1 = 1, xi1 = xi xi−1 · · · xε1 , for 1 ≤ ji ≤ i + 1, 1 ≤ i ≤ n − 2 and ε = ±1 (here we use xi1 instead of xi1ε ). Proof. By Lemma 4.1, it is easy to see that every element w of An has a representation w = x1j1 x2j2 · · · xn−2jn−2 , ji ≤ i + 1. Here x1j1 may take 3 values 1, x1 , x−1 1 , the value of x2j2 has 4 options, and, generally, xiji may take i + 2 values, for 1 ≤ i ≤ n − 2. So there are n!/2 different reduced words. From this fact, it follows that each representation is unique since |An | = n!/2. On the other hand, it is clear that Irr(S) consists of the same words w. Hence, by Theorem 3.14, S is a Gr¨ obner–Shirshov basis of the  alternating group An . Remark 4.3. According to Theorem 3.14, normal form in Sn−1 is x1j1 x2j2 ...xn−2jn−2 , but here xi1 = xi xi−1 · · · x1 . 2. Gr¨ obner–Shirshov basis for the Chinese monoid. The Chinese monoid CH(X, j > k, xi xj xj = xj xi xj , xi xi xj = xi xj xi , for every i > j. Theorem 4.4. With the deg-lex order on X ∗ the following relations (1)–(5) form a Gr¨ obner–Shirshov basis of the Chinese monoid CH(X): (1) (2) (3) (4) (5)

xi xj xk − xj xi xk , xi xk xj − xj xi xk , xi xj xj − xj xi xj , xi xi xj − xi xj xi , xi xj xi xk − xi xk xi xj ,

where xi , xj , xk ∈ X and i > j > k.

page 55

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

56

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

Let Λ be the set which consists of words on X of the form un = w1 w2 · · · wn , for n ≥ 0, where w1 = xt111 , w2 = (x2 x1 )t21 xt222 , w3 = (x3 x1 )t31 (x3 x2 )t32 xt333 , .. . wn = (xn x1 )tn1 (xn x2 )tn2 · · · (xn xn−1 )tn(n−1) xtnnn , with xi ∈ X, x1 < x2 < · · · < xn and all exponents non-negative. Corollary 4.5. The set Λ is a set of normal forms of all elements of the Chinese monoid CH(X). 3. Gr¨ obner–Shirshov basis for the free inverse semigroup. Let S be a semigroup. An element s ∈ S is called an inverse of t ∈ S if sts = s and tst = t. An inverse semigroup is a semigroup in which every element t has a unique inverse, denoted by t−1 . Let X −1 = {x−1 | x ∈ X} with X ∩ X −1 = ∅. Denote X ∪ X −1 by Y . We define the formal inverses for elements of Y ∗ by the rules 1−1 = 1, (x−1 )−1 = x, for x ∈ X, (y1 y2 · · · yn )−1 = yn−1 · · · y2−1 y1−1 , for y1 , y2 , · · · , yn ∈ Y. It is well known that the free inverse semigroup (with identity) generated by X has the following presentation: FI(X) = SmgY | aa−1 a = a, aa−1 bb−1 = bb−1 aa−1 , for a, b ∈ Y ∗ . We give formal definitions in Y ∗ of an idempotent, (prime) canonical idempotent and ordered (prime) canonical idempotent. Let < be a well order on Y . (i) The empty word 1 is an idempotent. (ii) If h is an idempotent and x ∈ Y , then x−1 hx is both an idempotent and a prime idempotent. (iii) If e1 , e2 , · · · , em , for m > 1, are prime idempotents, then e = e1 e2 · · · em is an idempotent. (iv) An idempotent w ∈ Y ∗ is called canonical if w has no subword of the form x−1 exf x−1 , where x ∈ Y and both e, f are idempotents. (v) A canonical idempotent w ∈ Y ∗ is called ordered if for any subword e = e1 e2 · · · em of w, fir(e1 ) < fir(e2 ) < · · · < fir(em ), where each ei is an idempotent, m > 2 and fir(u) is the first letter of u ∈ Y ∗.

page 56

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.2. FREE PRODUCT OF ALGEBRAS AND GROUPS

textbook

57

Theorem 4.6. Let S be the set of the following two kinds of polynomials in kY : • ef − f e, where e and f are ordered prime canonical idempotents such that ef > f e; • x−1 e xf  x−1 − f  x−1 e , where x ∈ Y , x−1 e x and xf  x−1 are ordered prime canonical idempotents. Then, with the deg-lex order on Y ∗ , S is a Gr¨ ober–Shirshov basis for free inverse semigroup SmgY | S. Corollary 4.7. Normal forms of elements of the free inverse semigroup SmgY | S are u0 e1 u1 · · · em um ∈ Y ∗ , where m ≥ 0, u1 , . . . , um−1 = 1, u0 u1 · · · um has no subword of form yy −1 for y ∈ Y , e1 , · · · , em are ordered canonical idempotents, and the first (last, respectively) letters of the factors of ei , for 1 ≤ i ≤ m, are not equal to the first (last, respectively) letter of ui (ui−1 , respectively). The above normal forms are analogous to the “semi-normal” forms that were given by Poliakova and Schein [252]. 4.2. Free product of algebras and groups Gr¨ obner–Shirshov bases allow us to find easily a normal form of elements in the free product of associative (Lie) algebras, semigroups, or groups. Suppose K is a commutative ring with 1. Let A, A1 , and A2 be associative K-algebras, and ei : Ai → A be K-algebra homomorphisms, i = 1, 2. Then (A, (e1 , e2 )) is said to be a free product of A1 and A2 if for every Kalgebra B and for every K-algebra homomorphisms fi : Ai → B, i = 1, 2, there exists a unique K-algebra homomorphism ϕ : A → B such that the following diagram commutes, i.e., ϕei = fi , i = 1, 2: Ai ei

 A

fi

/B >

ϕ

It is easy to see that free product is unique up to isomorphism, it is denoted A = A1 ∗ A2 . The free product always exists, namely, if Ai = KXi | Si , for i = 1, 2, with X1 , X2 disjoint then A1 ∗ A2 = KX1 ∪ X2 | S1 ∪ S2  is a presentation of the free product of A1 and A2 . Replacing associative algebras with Lie algebras (semigroups, groups) leads to the definition of a free product in the corresponding classes of algebraic systems.

page 57

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

58

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

It is easy to see that if S1 and S2 are monic sets then there are no compositions (f, g)w between f ∈ S1 and g ∈ S2 since they depend on non-intersecting sets of variables. Hence, if S1 and S2 are GS bases in the corresponding algebras then is S1 ∪ S2 is a GS basis for the free product. For every K-algebra, there always exists a monic GS basis given by the multiplication table. Theorem 4.8. Let Ai , i = 1, 2, be associative K-algebras with identity 1 and Li = Ti ∪ {1} be linearly ordered K-bases for Ai , where T1 = {ai | i ∈ I1 },

a0 = 1,

T2 = {bh | h ∈ I2 },

b0 = 1.

For every ai , aj ∈ T1 and bh , bk ∈ T2 , let  αnij an , bh bk = ai aj = n∈I1 ∪{0}



m γhk bm

m∈I2 ∪{0}

 n m where αnij , γhk ∈ K, and let M1 = ai aj − αij an | i, j ∈ I1 , M2 =

 m bh bk − γhk bm | h, k ∈ I2 . Order T1 ∪ T2 by ai < bh for all i ∈ I1 , h ∈ I2 , and assume the words in (T1 ∪ T2 )∗ are ordered in the deg-lex way. Then the following statements hold.

obner–Shirshov basis in the algebra KT1 ∪ T2 . (a) M1 ∪ M2 is a Gr¨ (b) Let Ω be the set which consists of u = c1 c2 · · · ct , for t ≥ 0, such that • if t = 0, then u = 1; • if t = 1, then u ∈ T1 ∪ T2 ; • if t > 1, then ci ∈ T1 ∪ T2 , and successive ci and ci+1 are not in the same set T1 or T2 . Then A1 ∗ A2 is a free K-module with basis Ω.   m bm . The only Proof. (a) Let aij = ai aj − αnij an and bhk = bh bk − γhk types of compositions are (aij , ajk )ai aj ak and (bhk , bkq )bh bk bq . Since      (ai aj )ap = αnij (an ap ) = αnij αsjp as = αnij αsjp as , n

ai (aj ap ) =



n

αnjp (ai an ) =

n



s

αnjp



n

n,s

 αsin as

s

and (ai aj )ap = ai (aj ap ), we have   αnij αsnp = αnjp αsin n

n

=

 n,s

αnjp αsin as ,

page 58

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.2. FREE PRODUCT OF ALGEBRAS AND GROUPS

59

for every s ∈ I1 . Then (aij , ajk )ai aj ak = aij ak − ai ajk       = ai aj − αnij an ak − ai aj ak − αnjk an =−



n

αnij (an ak )

+



n



 s

n

n

αnjk (ai an )

n

αnij αsnk





αnjk αsin

 as

n

≡ 0 (mod M1 ∪ M2 , ai aj ak ) and similarly, (bhk , bkq )bh bk bq ≡ 0 (mod M1 ∪ M2 , bh bk bq ). obner–Shirshov basis. This shows (a). Hence, M1 ∪ M2 is a Gr¨ (b) This part follows from the CD-lemma for associative algebras, Theorem 3.14.  Now we turn our attention to the free product of groups. Let Gi = Xi | Si , for i = 1, 2 with X1 ,X2 disjoint. Then the free product of the groups G1 and G2 is G1 ∗ G2 = X1 ∪ X2 | S1 ∪ S2 . A reduced sequence (or normal form) is a sequence g1 , g2 , · · · , gn , for n ≥ 0, of elements of G1 ∗ G2 such that each gi = 1, gi ∈ G1 ∪ G2 and successive gi , gi+1 are not in the same factor. (We allow n = 0 for the empty sequence.) Let Gi be a group with identity 1 and Gi = Ti ∪ {1}, i = 1, 2. Then Ti ∪{1} is a K-basis for the algebra KTi | Mi  where Mi = {ar at −{ar at } | ar , at ∈ Ti }. Then M1 ∪ M2 is a Gr¨obner–Shirshov basis of the algebra KT1 ∪ T2 . Now, by using Theorem 3.14 and section 4.1, we easily obtain the following corollary. Corollary 4.9 (The Normal Form Theorem for Free Products). Let G1 , G2 be two groups. Then, each element w of G1 ∗ G2 , the free product of groups G1 and G2 , can be uniquely expressed as w = g1 g2 · · · gn , where n ≥ 0 and g1 , g2 , . . . , gn is a reduced sequence. A similar method may be applied to find a base for the free product of Lie algebras over a field k. Let Li , i = 1, 2, be Lie algebras with bases {ai | i ∈ I} and {bj | j ∈ J}, respectively. A base of the free product Lie algebra L1 ∗ L2 consists of all those NLSWs [u] with a singular ALSW u in the alphabet {ai , bj | i ∈ I, j ∈ J}. An ALSW u is said to be singular if u contains no subwords of the form ai aj , for i > j and i, j ∈ I, or bi bj , for

page 59

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

60

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

textbook

i > j and i, j ∈ J. This follows immediately from the representation    p  L1 ∗ L2 = Lie {ai , bj | i ∈ I, j ∈ J}  [ai ak ] − αik ap , i > k, p∈I

[bl bj ] −



 q βlj bq , l > j .

q∈J q αpik , βlj

Here ∈ k are the structure constants of Li . The set of relations of the algebra L1 ∗ L2 is closed under compositions. Therefore, the statement on a linear base for L1 ∗ L2 follows from Theorem 3.35. 4.3. Modules over associative algebras Let X, Y be two sets, and let modk X Y  be the free left kX-module with basis Y . Then modk X Y  = ⊕y∈Y kXy is called a “double-free” module. We now define a Gr¨obner–Shirshov (GS) basis in modk X Y . Suppose that < is a monomial order on X ∗ , with < a well order on Y , and set X ∗ Y = {uy | u ∈ X ∗ , y ∈ Y }. We define an order < on X ∗ Y as follows: for w1 = u1 y1 , w2 = u2 y2 ∈ X ∗ Y , let w1 < w2 ⇔ u1 < u2

or u1 = u2 and y1 < y2 .

Suppose S ⊂ modk X Y  with each s ∈ S monic. Then we define just one type of composition in S, the inclusion composition, which  means αi ai si , f¯ = a¯ g for some a ∈ X ∗ , where f, g ∈ S. If (f, g)f¯ = f − ag = where αi ∈ k, ai ∈ X ∗ , si ∈ S and ai s¯i < f¯, then this composition is called trivial modulo (S, f¯). Definition 4.10. A monic set S ⊂ modk X Y  is said to be a GS basis if for every f, g ∈ S their composition (f, g)f¯ is trivial modulo (S, f¯) whenever f¯ = a¯ g for some a ∈ X ∗ . Theorem 4.11 (CD-lemma for modules). Let S ⊂ modk X Y  be a monic set, and < the order on X ∗ Y as before. Then the following statements are equivalent: (i) S is a GS basis in modk X Y . (ii) If 0 = f ∈ kXS, then f¯ = a¯ s for some a ∈ X ∗ and s ∈ S. s, a ∈ X ∗ , s ∈ S} is a linear basis for (iii) Irr(S) = {w ∈ X ∗ Y | w = a¯ the quotient modk X Y | S = modk X Y /kXS. The proof follows immediately from the observation that modk X Y  is a subalgebra of kX ∪ Y | yx = 0, y ∈ Y and x ∈ X ∪ Y . If S is a GS basis in the sense of Definition 4.10 then S∪{yx | y ∈ Y and x ∈ X ∪ Y } is a GS basis in kX ∪ Y . The converse is also true.

page 60

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.3. MODULES OVER ASSOCIATIVE ALGEBRAS

61

Let S ⊂ kX and A = kX | S. Then, for a left A-module A M , we can regard A M as a kX-module in a natural way: for f ∈ kX and m ∈ M , set f m := (f + Id(S))m. Note that A M is an epimorphic image of an appropriate free A-module. Now, we assume that AM

= modA Y | T  = modA Y /AT,

where T ⊂ modA Y . Let     T1 = (fi + Id(S))yi ∈ T fi yi ∈ modk X Y   and R = SX ∗ Y ∪ T1 . Then, as kX-modules, A M = modk X Y | R. In particular, every left ideal I of A as well as A/I may be regarded as left A-modules. Assume that S is a GS basis for the left ideal kXS with no compositions at all between different elements of S. It means that S is a minimal GS basis of the left ideal kXS. Then kXS is a free kX-module with the basis S. This shows the Cohn Theorem 2.1. Yet another application of the CD-lemma for modules leads to GS bases for the Verma modules over the Lie algebra of coefficients of a free Lie conformal algebra. Let us find a linear basis for such a module. Let B be a set of symbols. Let N be an integer constant (locality function on B). Consider X = {b(n) | b ∈ B and n ∈ Z} and let L = Lie(X | S) be the Lie algebra generated by X relative to the relations S, where        s n (−1) [b(n − s)a(m + s)]  a, b ∈ B, m, n ∈ Z . S= s s For every b ∈ B, let b(z) be the formal power series over L given by  b(z) = b(n)z −n−1 ∈ L[[z, z −1]]. n

The series b(z), for b ∈ B, generate the free Lie conformal algebra C with locality function N (see [272]). Moreover, the coefficient algebra of C coincides with L. Let B be a linearly ordered set. Define an order on X in the following way: a(m) < b(n) ⇔ m < n

or m = n and a < b.

We use the deg-lex order on X ∗ . Then, it is clear that the leading term of each polynomial in S is b(n)a(m) such that n − m > N or (n − m = N and (b > a or (b = a and N is odd))). It is straightforward to check the following Lemma 4.12. With the deg-lex order on X ∗ , S is a GS basis in Lie(X).

page 61

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

62

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

Corollary 4.13. Let U = U (L) be the universal enveloping algebra of L. Then a linear basis of U consists of monomials a1 (n1 )a2 (n2 ) · · · ak (nk ),

ai ∈ B, ni ∈ Z

such that for every 1 ≤ i < k  N − 1, if ai > ai+1 or ai = ai+1 and N is odd ni − ni+1 ≤ N, otherwise. 4.4. Replicated algebras In this section, we propose an approach to the GS bases theory for a class of varieties obtained by replication procedures. The latter may be easily described in terms of defining identities. In particular, di-replication of Lie algebras leads to the notion of a Leibniz algebra, di-replication of associative algebras gives (associative) di-algebras. Tri-replication of associative algebras gives the class known as tridendriform algebras. Suppose GS bases theory is known for algebras in a variety Var. Denote by di Var and tri Var the varieties obtained by di- and tri-replication of Var. Then we explicitly construct a “Var-envelope” for free algebras di VarX and tri VarX such that a solution of the ideal membership problem in this Var-envelope induces a solution of the same problem in di VarX or tri VarX. We will apply the technique to study universal enveloping Lie tri-algebras of Lie algebras and associative enveloping tri-algebras of Lie tri-algebras. 4.4.1. Replication of varieties. Let (Σ, ν) be a language, i.e., a set of operations together with arity function ν : Σ → Z+ . A Σ-algebra is a linear space A equipped with polylinear operations f : A⊗n → A, for f ∈ Σ and n = ν(f ). Denote by Alg = AlgΣ the class of all Σ-algebras, and let AlgX stand for the free Σ-algebra generated by a set X. (For example, if Σ consists of one binary operation then AlgX is the magmatic algebra.) For a given language (Σ, ν), define two replicated languages (Σ(2) , ν (2) ) and (Σ(3) , ν (3) ) as follows: Σ(2) = {fi | f ∈ Σ, i = 1, . . . , ν(f )}, Σ(3) = {fH | f ∈ Σ, ∅ = H ⊆ {1, . . . , ν(f )}},

ν (2) (fi ) = ν(f ); ν (3) (fH ) = ν(f ).

Denote Alg(2) = AlgΣ(2) and Alg(3) = AlgΣ(3) . Note that Σ(2) may be considered as a subset of Σ(3) via fi = f{i} . This is why we will later deal with Σ(3) assuming the same statements (in a more simple form) hold for Σ(2) . Suppose X = {x1 , x2 , . . . } is a set of generators, and let an element Φ ∈ AlgX be polylinear with respect to x1 , . . . , xn ∈ X. Every monomial

page 62

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.4. REPLICATED ALGEBRAS

textbook

63

summand of Φ may be naturally considered as a rooted tree whose leaves are labeled by variables x1 , . . . , xn and nodes are labeled by symbols of operations from Σ. Every node labeled by f ∈ Σ has one “input” and ν(f ) “outputs”, each output is attached to a subtree. For example, if Σ = {(· ∗ ·), [·, ·]} consists of two binary operations then the term [x1 , (x4 ∗x3 )]∗[x2 , x5 ] may be identified with

The natural polylinear composition rule on such trees is given as follows: Given a tree T with n leaves and tress T1 , . . . , Tn (each Ti has mi leaves), their composition T (T1 , . . . , Tn ) has m1 + · · · + mn leaves, this is a tree obtained by attaching each Ti to the ith leaf of T and by natural shift of numeration of leaves in each Ti . For example, if Σ = {(· ∗ ·), [·, ·]} consists of two binary operations, T = [x2 , x3 ] ∗ x1 , T1 = [x2 , [x1 , x3 ]], T2 = [x2 , x1 ] and T3 = x1 ∗ x2 , then T (T1 , T2 , T3 ) is presented by

Given a polylinear monomial Φ(x1 , . . . , xn ) and a nonempty subset H ⊂ {1, . . . , n}, we may define ΦH ∈ Alg(3) X in the following way. Let us emphasize leaves xi for i ∈ H in Φ and change the labels of nodes by the following rule. For every node labeled by a symbol f ∈ Σ consider the subset S of {1, . . . , ν(f )} which consists of those output numbers that are attached to subtrees containing emphasized leaves. If S is nonempty then replace f with fS , if S is empty then replace f with f{1} . For example, the monomial Φ = T (T1 , T2 , T3 ) from the example above transforms to the following one for H = {1, 2, 4}: ([[x5 , x4 ]2 , (x6 ∗1 x7 )]1 ∗1,2 [x2 , [x1 , x3 ]1 ]1,2 ) Transforming every polylinear monomial of Φ in this way, we obtain ΦH . (For Σ(2) , it is enough to consider |H| = 1.)

page 63

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

64

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

Suppose Var is a variety of Σ-algebras defined by a collection of polylinear identities S(Var) ⊂ AlgX, X = {x1 , x2 , . . . }. Definition 4.14. A Σ(3) -algebra belongs to the class tri Var if and only if it satisfies the following identities in Alg(3) X: fH (x1 , . . . , xi−1 , gS (xi , . . . , xi+m−1 ), xi+m , . . . , xn+m−1 ) = fH (x1 , . . . , xi−1 , gQ (xi , . . . , xi+m−1 ), xi+m , . . . , xn+m−1 ),

(4.1)

where f, g ∈ Σ, ν(f ) = n, ν(g) = m, H ⊆ {1, . . . , n}, S, Q ⊆ {1, . . . , m}, i∈ / H, and (4.2) ΦH (x1 , . . . , xn ) = 0, where Φ ∈ S(Var), deg Φ = n and H ⊆ {1, . . . , n}. Example 4.15. Let Σ consist of one binary product μ(a, b) = [ab]. Consider Var = Lie (char k = 2) with defining identities [x1 x2 ] + [x2 x1 ] = 0,

[[x1 x2 ]x3 ] + [[x2 x3 ]x1 ] + [[x3 x1 ]x2 ] = 0.

Then Σ(3) consists of three binary operations μ{1} (a, b) = [a  b],

μ{2} (a, b) = [a  b],

μ{1,2} (a, b) = [a ⊥ b].

The family of defining identities of the variety tri Lie of Lie tri-algebras contains replicated skew-symmetry [x1  x2 ] + [x2  x1 ] = 0,

[x1 ⊥ x2 ] + [x2 ⊥ x1 ] = 0.

This allows us to replace [a  b] with −[b  a] and express all defining relations in terms of [·  ·] and [· ⊥ ·]. It is easy to see that (4.1) turns into [[x1  x2 ]  x3 ] = −[[x2  x1 ]  x3 ],

(4.3)

[[x1 ⊥ x2 ]  x3 ] = [[x1  x2 ]  x3 ],

(4.4)

and the replication of Jacobi identity leads (up to equivalence) to the following three relations: [[x1  x2 ]  x3 ] − [x1  [x2  x3 ]] + [x2  [x1  x3 ]] = 0,

(4.5)

[[x1  x2 ] ⊥ x3 ] − [x1  [x2 ⊥ x3 ]] − [[x1  x3 ] ⊥ x2 ] = 0,

(4.6)

[[x1 ⊥ x2 ] ⊥ x3 ] + [[x2 ⊥ x3 ] ⊥ x1 ] + [[x3 ⊥ x1 ] ⊥ x2 ] = 0.

(4.7)

Note that (4.5) is the left Leibniz identity, (4.3) easily follows from (4.5). Therefore, a Lie tri-algebra may be considered as a linear space with two operations [·  ·] and [· ⊥ ·] such that the first one satisfies the left Leibniz identity, the second one is Lie, and (4.4), (4.6) hold.

page 64

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.4. REPLICATED ALGEBRAS

65

Remark 4.16. The operad governing the variety tri Lie is Koszul dual to the operad CTD governing the variety of commutative tridendriform algebras introduced in [212]. This is a particular case of a general relation between di- or tri- algebras and their dendriform counterparts [155]. In [319], tri Lie is stated as CTD! . In a similar way, one may construct the defining identities of the variety tri As starting from associativity: (x1  x2 )  x3 = x1  (x2  x3 ),

(x1  x2 )  x3 = x1  (x2  x3 ),

(x1  x2 )  x3 = x1  (x2  x3 ),

(x1  x2 )  x3 = x1  (x2  x3 ),

(x1  x2 )  x3 = x1  (x2  x3 ),

(x1  x2 )  x3 = x1  (x2 ⊥ x3 ),

(x1 ⊥ x2 )  x3 = x1 ⊥ (x2  x3 ),

(x1  x2 ) ⊥ x3 = x1 ⊥ (x2  x3 ),

(x1  x2 ) ⊥ x3 = x1  (x2 ⊥ x3 ),

(x1 ⊥ x2 )  x3 = x1  (x2  x3 ).

4.4.2. Construction of free di- and tri-algebras. In this section, we present simple construction of the free algebra tri VarX generated by a given set X in the variety tri Var. The same construction works for di Var after obvious simplifications. To make the results more readable, we restrict to the case when Σ consists of one binary product since this is the most common case in practice. In this case, Σ(3) = {, , ⊥}, the set of binary operations corresponding to H = {1}, {2}, and {1, 2}, respectively. However, there are no obstacles to the transfer of the following considerations to an arbitrary language. Let Com be the variety of associative and commutative algebras. Example 4.17. For every commutative tri-algebra C ∈ tri Com and for every algebra B ∈ Var the linear space C ⊗ B equipped with (p ⊗ a) ∗ (q ⊗ b) = (p ∗ q) ⊗ ab,

for ∗ ∈ {, , ⊥}, a, b ∈ B and p, q ∈ C,

is a tri-algebra in the variety tri Var. Another general construction for a tri-replicated system is the following. Example 4.18. Suppose A is an algebra in Var equipped with a linear map ϕ : A → A such that ϕ(ϕ(a)b) = ϕ(aϕ(b)) = ϕ(a)ϕ(b) = ϕ(ab),

a, b ∈ A.

(4.8)

Let us define three binary operations on the space A as follows: a  b = ϕ(a)b, (ϕ)

Then the system A

a  b = aϕ(b),

a ⊥ b = ab.

(4.9)

obtained belongs to tri Var.

Subsequently, we will need the following particular tri Com algebra.

page 65

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

66

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

Example 4.19. Let C2 be the 2-dimensional space with a basis {e1 , e2 } equipped with binary operations ei ⊥ ei = ei ,

e1  e1 = e1  e1 = e1 ,

e1  e2 = e2  e1 = e2 ,

other products are zero. It is easy to check that C2 ∈ tri Com. Given a set X, denote by X˙ the copy of X: X˙ = {x˙ | x ∈ X}. ˙ in the variety Var. There exists Denote by F the free algebra VarX ∪ X a unique homomorphism ϕ : F → VarX ⊂ F determined by x → x and x˙ → x, for x ∈ X. Let us define three binary operations on the space F by (4.9). Denote the system obtained by F (3) . Obviously, ϕ2 = ϕ, so relation (4.8) holds for ϕ. Hence, F (3) belongs to tri Var as a particular case of Example 4.18. For a monomial u in F , denote by degX˙ u the degree of u with respect ˙ to variables from X. Lemma 4.20. The subalgebra V of F (3) generated by X˙ coincides with the subspace W of F spanned by all monomials u such that degX˙ u > 0. Proof. It is clear that V ⊆ W . Indeed, consider u  v, u  v, u ⊥ v for u, v ∈ V . Inductive arguments allow us to assume u, v ∈ W and thus all three products also belong to W by the definition of F (3) . The converse embedding W ⊆ V is proved analogously.  It is easy to see that the space V from Lemma 4.20 coincides with the ˙ ideal of F generated by X. Theorem 4.21. The subalgebra V of F (3) is isomorphic to the free algebra in the variety tri Var generated by X. Proof. It is enough to prove universal property of V in the class tri Var. Suppose A is an arbitrary tri-algebra in tri Var, and let α : X → A be an arbitrary map. Our aim is to construct a homomorphism of tri-algebras χ : V → A such that χ(x) ˙ = α(x) for all x ∈ X. The subspace A0 = span {a  b − a  b, a  b − a ⊥ b | a, b ∈ A} is an ideal of A. The quotient A¯ = A/A0 carries a natural structure of an algebra from Var given by a ¯¯b = a ⊥ b. Consider the formal direct sum ˆ ¯ A = A ⊕ A equipped with one well-defined product a ¯b = a  b, a¯b = a  b, a ¯¯b = a ⊥ b, ab = a ⊥ b,

page 66

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.4. REPLICATED ALGEBRAS

67

¯ where c¯ = c + A0 ∈ A, ¯ for c ∈ A. Then Aˆ ∈ Var. for a ¯, ¯b ∈ A, There is a natural relation between A ∈ tri Var and Aˆ ∈ Var. For the tri-algebra C2 from Example 4.19 we have an embedding of tri-algebras ι : A → C2 ⊗ Aˆ (the latter is from Example 4.17) given by ¯ + e2 ⊗ a, ι : a → e1 ⊗ a

for a ∈ A.

Now, construct α ˆ : X ∪ X˙ → Aˆ as ¯ α ˆ (x) = αx ∈ A,

α( ˆ x) ˙ = α(x) ∈ A,

ˆ for x ∈ X. The map α ˆ induces a homomorphism of algebras ψˆ : F → A. Finally, define ψ : F → C2 ⊗ Aˆ by ˆ ˆ ), )) + e2 ⊗ ψ(f ψ(f ) = e1 ⊗ ψ(ϕ(f

for f ∈ F,

(4.10)

Let us show that ψ is a homomorphism of tri-algebras. For every f, g ∈ F and ∗ ∈ {, , ⊥}, ˆ ˆ )) ∗ (e1 ⊗ ψ(ϕ(g)) ˆ ˆ )) + e2 ⊗ ψ(f + e2 ⊗ ψ(g)) ψ(f ) ∗ ψ(g) = (e1 ⊗ ψ(ϕ(f ˆ ˆ = (e1 ∗ e1 ) ⊗ ψ(ϕ(f )ϕ(g)) + (e1 ∗ e2 ) ⊗ ψ(ϕ(f )g) ˆ ϕ(g)) + (e2 ∗ e2 ) ⊗ ψ(f ˆ g). + (e2 ∗ e1 ) ⊗ ψ(f Hence,

⎧ ˆ ˆ ⎪ ⎨e1 ⊗ ψ(ϕ(f g)) + e2 ⊗ ψ(ϕ(f )g), ˆ ˆ ϕ(g)), ψ(f ) ∗ ψ(g) = e1 ⊗ ψ(ϕ(f g)) + e2 ⊗ ψ(f ⎪ ⎩ ˆ ˆ g), e1 ⊗ ψ(ϕ(f g)) + e2 ⊗ ψ(f

for ∗ = ; for ∗ = ; for ∗ = ⊥.

(4.11)

On the other hand, it is straightforward to compute ψ(f ∗ g). Since ϕ is a homomorphic averaging operator, the results coincide with those in (4.11). It is easy to see from the definition that ψ(x) ˙ = ι(α(x)). Therefore, ψ(V ) ⊆ ι(A) ⊆ C2 ⊗ Aˆ by Lemma 4.20. Finally, the desired homomorphism χ may be constructed as χ = ι−1 ◦ ψ|V : V → A. In other words, the diagram ⊆ X˙ −−−−→ ⏐ ⏐ ϕ



V −−−−→ ⏐ ⏐χ 

F (3) ⏐ ⏐ψ 

α ι X −−−−→ A −−−−→ C2 ⊗ Aˆ

commutes.



page 67

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

68

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

Remark 4.22. All constructions of this section make sense for di-algebras. It is enough to consider only operations  and  on F given by the same rules. The role of V ⊆ F is played by the subspace of polynomials linear ˙ in X. Example 4.23. Let Var = Lie, di Lie is the variety of Leibniz algebras. Then ˙ di LieX  V ⊆ LieX ∪ X. Suppose X is linearly ordered; let us extend the order to X ∪ X˙ in the natural way: x > y ⇒ x˙ > y, ˙ x˙ > y for all x, y ∈ X. It is easy to see that all words of the form [. . . [[x˙ 1 x2 ]x3 ] . . . xn ] (with left-justified bracketing) are linearly independent in F since this is so ˙ These words correspond to for their images in U (F )  kX ∪ X. [. . . [[x1  x2 ]  x3 ]  · · ·  xn ] ∈ di LieX,

(4.12)

where [·  ·] satisfies the (right) Leibniz identity [x  [y  z]] = [[x  y]  z] − [[x  z]  y]. Obviously, every Leibniz polynomial may be rewritten as a linear combination of monomials (4.12), so the latter form a linear basis of di LieX. 4.4.3. Computing GS bases for replicated algebras. As above, let Var be a variety of algebras (with one binary product) defined by polylinear identities, di Var and tri Var are the corresponding varieties of di- and tri-algebras. Suppose X is a nonempty set of generators. By Theorem 4.21 and Remark 4.22, the free systems tri VarX and di VarX may be considered ˙ As in Section 4.4.2, let us denote these as subspaces of F = VarX ∪ X. subspaces by V . We will consider the case of tri-algebras in detail. For every S ⊆ V ⊆ F (3) denote by (S)(3) the ideal of V generated by S. Every tri-algebra A ∈ tri Var may be presented by generators and defining relations as A  tri VarX | S  V /(S)(3) for appropriate X and S. In order to understand the structure of A we have to know how to decide whether a given f ∈ V belongs to (S)(3) . This kind of problem is the main target of the Gr¨ obner–Shirshov (GS) basis method. In order to translate GS theory from the class Var to tri Var (and to di Var) we need the following, in which ϕ is the endomorphism of F defined in Section 4.4.2.

page 68

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.4. REPLICATED ALGEBRAS

69

Theorem 4.24. Let S ⊂ V ⊂ F (3) . Then (S)(3) = (S ∪ ϕ(S)) ∩ V, where (P ) stands for the ideal of F generated by its subset P . Proof. Denote I = (S ∪ ϕ(S)). Obviously,  I= Is and I0 ⊆ I1 ⊆ . . . , s≥0

where I0 = span (S ∪ ϕ(S)), Is+1 = Is + F Is + Is F,

for s ≥ 0.

(3)

Similarly, for J = (S)

we have  J= Js and J0 ⊆ J1 ⊆ . . . , s≥0

where J0 = span (S), Js+1 = Js + V  Js + Js  V + V  Js + Js  V + Js ⊥ V + V ⊥ Js , for s ≥ 0. Since S ⊂ V , ϕ(S) ⊂ VarX, and VarX ∩ V = 0, we have I0 ∩ V = J0 . Moreover, I0 = J0 + I0 , where I0 = span ϕ(S) = ϕ(J0 ) ⊆ VarX. Assume Is = Js + ϕ(Js ) for some s ≥ 0. Note that F = V + VarX and VarX = ϕ(V ) by Lemma 4.20. Then Is+1 = Is + (V + VarX)Is + Is (V + VarX) = Js + V Js + V ϕ(Js ) + VarXJs + Js V + ϕ(Js )V + Js VarX + ϕ(Js ) + ϕ(Js ) VarX + VarXϕ(Js ). It remains to note that V Js = V ⊥ Js ,

V ϕ(Js ) = V  Js ,

VarXJs = V  Js ,

Js V = Js ⊥ V,

ϕ(Js )V = Js  V,

Js VarX = Js  V.

Hence, Is+1 = Js+1 + ϕ(Js ) + ϕ(Js ) VarX + VarXϕ(Js ), but the latter three summands obviously give ϕ(Js+1 ). We have proved Is = Js + ϕ(Js ) for all s ≥ 0. Therefore, Is ∩ V = Js , and I ∩ V = J, as required.  Remark 4.25. For di-algebras, the statement of Theorem 4.24 holds true: the intersection of V  di VarX with the ideal generated by S ∪ ϕ(S) in F is equal to the ideal generated by S in the di-algebra V .

page 69

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

70

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

Let S ⊂ tri VarX. Then I = (S ∪ ϕ(S)) is a ϕ-invariant ideal of F , so one may induce tri-algebra structure on F/I. Corollary 4.26. tri VarX | S is isomorphic to the subalgebra of (F/I)(3) ˙ Similar statement holds for di-algebras. generated by X. Theorem 4.24 and Remark 4.25 provide an easy approach to GS bases theory for the classes of tri- and di-algebras (tri Var and di Var, respectively) modulo the analogous theory for the variety Var. In order to find GS basis of an ideal generated by S ⊆ tri VarX one should rewrite the relations ˙ and find the GS basis Sˆ of the from S as elements of F = VarX ∪ X, ideal in F generated by S ∪ ϕ(S). (In the case of di-algebras, it is enough to find a part of the latter GS basis, namely, those polynomials of degree ˙ To find a linear basis of tri LieX | S one should just consider ≤ 1 in X.) ˆ S-reduced monomials in F and choose those that belong to V . Example 4.27. Let X = {x, y}. Consider the Leibniz algebra L ∈ di Lie generated by X with one defining relation f = [x  y] + [y  x] + y. Define the order on X ∪ X˙ by x˙ > y˙ > x > y. According to the ˙ general scheme, denote F = LieX ∪ X. Then f may be interpreted as [xy]+[ ˙ yx]+ ˙ y˙ ∈ F , and ϕ(f ) = y. Obviously, f and ϕ(f ) have a composition = [yx]+ ˙ y, ˙ and the latter has no more compositions of inclusion (f, ϕ(f ))[xy] ˙ with ϕ(f ). Therefore, {y, [yx] ˙ + y} ˙ is a GS basis in F . The linear basis of di Liex, y | f  consists of all those non-associative Lyndon–Shirshov words in Liex, ˙ y, ˙ x, y that are linear in X˙ that do not contain y or yx ˙ as (associative) subwords. Obviously, these are y˙ and [. . . [[xx]x] ˙ . . . x]. Hence, the linear basis of L consists of y and [. . . [[x  x]  x]  · · ·  x]. Let us also state an example of a more general nature. Let L be a Lie algebra with linear basis X and multiplication table μ : X × X → kX, μ(x, y) is a linear form in X for every x, y ∈ X. Assume X is linearly ordered. Let U⊥ (L) stand for the universal object in the class of tri-Lie algebras containing L. To be more precise, U⊥ (·) is the left adjoint functor to the forgetful functor (⊥) from the category of tri-Lie algebras to Lie algebras: given a tri-Lie algebra A, A(⊥) is the Lie algebra (A, ⊥). Then

page 70

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.4. REPLICATED ALGEBRAS

textbook

71

U⊥ (L)  tri LieX | S, where S = {[x ⊥ y]−μ(x, y) | x, y ∈ X and x > y}. According to the scheme described above, we have to consider U = LieX ∪ X˙ | [x˙ y] ˙ − μ(x, ˙ y), [xy] − μ(x, y), for x, y ∈ X and x > y,   where μ(x, ˙ y) = i αi z˙i , for μ(x, y) = i αi zi . Note that polynomials from S = {[x˙ y] ˙ − μ(x, ˙ y) | x, y ∈ X and x > y} do not have nontrivial compositions, so S is a GS basis. The same applies to ϕ(S) = {[xy] − μ(x, y) | x, y ∈ X and x > y}. Moreover, polynomials from S and ϕ(S) have no compositions since they depend on different variables. Hence, S ∪ ϕ(S) is a GS basis and U is isomorphic to the free product L˙ ∗ L, where L˙ is the isomorphic copy of L. Corollary 4.26 implies U⊥ (L) ˙ is isomorphic as a linear space to the ideal of L˙ ∗ L generated by L. 4.4.4. PBW Theorem for tri-algebras. Now, let A be an associative tri-algebra with operations , , and ⊥. Then A(−) with new operations [a  b] = a  b − b  a, [a  b] = a  b − b  a,

(4.13)

[a ⊥ b] = a ⊥ b − b ⊥ a, is a Lie tri-algebra. Given L ∈ tri Lie, denote its universal enveloping associative tri-algebra by U (L). The approach of Section 4.4.3 allows to prove an analogue of the PBW Theorem for tri Lie. Suppose L ∈ tri Lie, L0 = span {[a  b] − [a  b], [a  b] − [a ⊥ b] | a, b ∈ L} as in Section 4.4.2, and let X be a basis of L such that X = X0 ∪ X1 , where X0 is a basis of L0 . Denote by μ , μ , and μ⊥ the linear forms corresponding to the operations on L. Since μ (x, y) = −μ (y, x) and μ⊥ (x, y) = −μ⊥ (y, x), we have to consider U = AsX ∪ X˙ | S ∪ ϕ(S), where S = S ∪ S⊥ , S = {xy ˙ − y x˙ − μ˙  (x, y) | x, y ∈ X}, S⊥ = {x˙ y˙ − y˙ x˙ − μ˙ ⊥ (x, y) | x, y ∈ X, x > y}. Note that ϕ(S ) = {xy − yx − μ (x, y) | x, y ∈ X} = S¯ ∪ S¯ ∪ S¯0 , where S¯ = {xy − yx − μ (x, y) | x, y ∈ X, x > y}, S¯ = {xy − yx + μ (y, x) | x, y ∈ X, x > y}, S¯0 = {μ (x, x) | x ∈ X},

page 71

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

72

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

textbook

and ϕ(S⊥ ) = {xy − yx − μ⊥ (x, y) | x, y ∈ X and x > y}. Hence, the elements of ϕ(S) have the following compositions of inclusion: μ (x, y) + μ (y, x),

μ (x, y) − μ⊥ (x, y),

μ (y, x) + μ⊥ (x, y),

μ (x, x),

where x > y. Obviously, the linear space of these compositions coincides with L0 , so we may add the letters from X0 to the defining relations. Since μ (X, X0 ) = 0 in L and μ (X0 , X) ⊂ L0 , it is enough to consider the ˙ following defining relations in AsX ∪ X: x,

for x ∈ X0 ;

(4.14)

xy − yx − μ (x, y),

for x, y ∈ X1 and x > y;

(4.15)

xy ˙ − y x˙ − μ˙  (x, y),

for x ∈ X and y ∈ X1 ;

(4.16)

x˙ y˙ − y˙ x˙ − μ˙ ⊥ (x, y),

for x, y ∈ X and x > y.

(4.17)

Let us use the same order on X ∪ X˙ as in Example 4.23 along with the ˙ ∗. deg-lex order on (X ∪ X) ˙ Theorem 4.28. Relations (4.14)– (4.17) form a GS basis in AsX ∪ X. Proof. Relations (4.15) correspond to the multiplication table of the Lie ¯ = L/L0 , so all their compositions of intersection are trivial. The algebra L same holds for (4.17): it corresponds to the Lie algebra L(⊥) . It remains to compute two families of compositions (f, g)w : (1) f = yz − zy − μ(y, z), g = xy ˙ − y x˙ − μ˙  (x, y) and w = xyz, ˙ where y, z ∈ Z1 with y > z, and x ∈ X; (2) x˙ y˙ − y˙ x˙ − μ˙ ⊥ (x, y), g = yz ˙ − z y˙ − μ˙  (y, z) and w = x˙ yz, ˙ where x, y ∈ X with x > y, and z ∈ X1 . Let us compute in detail the second composition: (f, g)w = f z − xg ˙ = −y˙ xz ˙ − μ˙ ⊥ (x, y)z + xz ˙ y˙ + x˙ μ˙  (y, z).   ˙ let us write h ≡ h if h − h = For h, h ∈ AsX ∪ X, i αi ui si ui , where ˙ Then the si belong to (4.14)– (4.17), αi ∈ k, and ui s¯i ui < w = x˙ yz. (f, g)w ≡ −yz ˙ x˙ − y˙ μ˙  (x, z) − μ˙ ⊥ (x, y)z + z x˙ y˙ + μ˙  (x, z)y˙ + x˙ μ˙  (y, z) ≡ −μ˙  (y, z)x˙ − y˙ μ˙  (x, z) − μ˙ ⊥ (x, y)z + z μ˙ ⊥ (x, y) + μ˙  (x, z)y˙ + x˙ μ˙  (y, z) = x˙ μ˙  (y, z) − μ˙  (y, z)x˙ + μ˙  (x, z)y˙ − y˙ μ˙  (x, z) + z μ˙ ⊥ (x, y) − μ˙ ⊥ (x, y)z ≡ μ˙ ⊥ (x, μ (y, z)) + μ˙ ⊥ (μ (x, z), y) − μ˙  (μ⊥ (x, y), z)

(4.18)

page 72

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.5. ASSOCIATIVE CONFORMAL ALGEBRAS

73

The right-hand side of (4.18) is equal to [x ⊥ [y  z]] + [[x  z] ⊥ y] − [[x ⊥ y]  z] ∈ L which is zero in every Lie tri-algebra.  (⊥) ¯ ), Corollary 4.29. As a linear space, U (L) is isomorphic to U (L)⊗U 0 (L ¯ where U (L) is the usual universal enveloping associative algebra of the Lie ¯ = L/L0 and U0 (L(⊥) ) is the augmentation ideal of U (L(⊥) ). algebra L

Proof. A linear basis of U (L) consists of those associative words of strictly positive degree in X˙ that are reduced relative to (4.14)–(4.17): x1 . . . xk y˙ 1 . . . y˙ m ,

for xi ∈ X1 and yj ∈ X,

where x1 ≤ · · · ≤ xk , k ≥ 0, y1 ≤ · · · ≤ ym and m ≥ 1.



4.5. Associative conformal algebras A conformal algebra is a linear space C equipped with a linear map ∂ : C → C and a family of bilinear products (· (n) ·), for n ∈ Z+ , such that the following axioms hold: for every a, b ∈ C a (n) b = 0, and

for almost all n ∈ Z+ ;

∂a (n) b = −na (n−1) b, a (n) ∂b = ∂(a (n) b) + na (n−1) b.

(4.19) (4.20)

The locality function on C is a map NC : C × C → Z+ , where NC (a, b) is equal to the minimal N such that a (n) b = 0 for all n ≥ N . If    n a (n) (b (m) c) = (−1)s (a (n−s) b) (m+s) c s s≥0

for all a, b, c ∈ C and n, m ∈ Z+ then C is said to be associative. Introduce the family of operations  {a (n) b} = (−1)n+s ∂ (s) (a (n+s) b), for a, b ∈ C and n ∈ Z+ . s≥0

In an associative conformal algebra, we have {∂a (n) b} = ∂{a (n) b} + n{a (n−1) b}, {a (n) ∂b} = −n{a (n−1) b}, a (n) {b (m) c} = {(a (n) b) (m) c}. Axiom (4.19) shows the class of associative conformal algebras is not a variety in the sense of universal algebra. However, for a given set X of generators and for a fixed function N : X × X → Z+ there exists a free object in the class of associative conformal algebras C generated by X such that NC (a, b) ≤ N (a, b) for a, b ∈ X.

page 73

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

74

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

Let (X, ≤) be a well-ordered set, and N : X × X → Z+ be an arbitrary function. Consider the set of symbols O = {∂, Lan , Rna | n ∈ Z+ and a ∈ X} Define a monomial order on O∗ in the following way: compare two words first by their degree in the variables Rna , then by the deg-lex order assuming b , Lan < Lbm if n < m or n = m and a < b; similarly for Rna , ∂ < Lan < Rm b Rm . Denote by A(X) the associative algebra generated by O relative to the following defining relations: Lan ∂ − ∂Lan − nLan−1 ,

(4.21)

a Rna ∂ − ∂Rna − nRn−1 , b a a b Rm Ln − Ln Rm ,

(4.22) (4.23)

for a, b ∈ X, n, m ∈ Z+ . Lemma 4.30. Polynomials (4.21)– (4.23) form a GS basis with respect to the order on O∗ . Proof. The only composition here is the composition of intersection of (4.21) and (4.23). It is straightforward to check that this composition is trivial.  Denote by modA (X) the free (left) A(X)-module generated by X. Lemma 4.30 implies that the linear base of modA (X) consists of words b1 bp ∂ s Lan11 . . . Lankk Rm . . . Rm c, 1 p

s, k, p, ni , mj ∈ Z+ , ai , bj , c ∈ X.

(4.24)

Definition 4.31. Let us call a word of type (4.24) normal if p = 0, ni < N (ai , ai+1 ) for i = 1, . . . , k − 1, and nk < N (ak , c). Denote by B(X, N ) the set of all normal words, and use B0 (X, N ) for the set of ∂-free normal words. Let M (X, N ) stand for the A(X)-module generated by X with the following defining relations: Lan b, 

for a, b ∈ X and n ≥ N (a, b),

(4.25)

N (a,b)−n−1

Rnb a − (−1)n

∂ (s) Lan+s b,

for a, b ∈ X and n ∈ Z+ .

(4.26)

s=0

This module is the main object of study in this section. Let us extend the order to the set of words O∗ ∪ O∗ X which occur in the computation of compositions in the free A(X)-module generated by X. For ux, vy ∈ O∗ X, where u, v ∈ O∗ and a, y ∈ X, assume ux ≺ vy if and only if u ≺ v or u = v and x < y. Moreover, set O∗ ≺ O∗ X.

page 74

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.5. ASSOCIATIVE CONFORMAL ALGEBRAS

textbook

75

Theorem 4.32. A Gr¨ obner–Shirshov basis of the A(X)-module M (X, N ) consists of (4.25), (4.26) and    n a (−1)q (4.27) Lan Lbm u + Ln−q Lbm+q u, q q≥1

where a, b ∈ X, n ≥ N (a, b), m ∈ Z+ and u ∈ B0 (X, N ). Proof. Let us show that relations (4.27) follow from the defining relations (4.25), (4.26). For u = c ∈ X, consider the compositions of intersection of c c Lan − Lan Rm , f = Rm

g = Lan b

for n ≥ N (a, b) and m ∈ Z+ : c c c Lan − Lan Rm )b − Rm (Lan b) (f, g)Rcm Lan b = (Rm



N (b,c)−1

≡ (−1)m+1

(−1)s Lan ∂ (s) Lbm+s c

s=0





s=0

t≥0

N (b,c)−1

≡ (−1)m+1

(−1)s

  n (s−t) a Ln−t Lbm+s c. ∂ t

These compositions are trivial for m ≥ N (b, c), while for m = N (b, c) − 1 we obtain (f, g)Rcm Lan b to be a multiple of Lan Lbm c. For smaller m, proceed by induction. Suppose (4.27) holds for u = c and all m > p. Then   n (s−t) a Ln−t Lbp+s c ∂ t s=0 t≥0     p+1 a b q+s+1 n = (−1) (−1) Ln Lp c + ∂ (s) Lan−q Lbp+s+q c q s,q≥1     n r+t (r) a b (−1) + ∂ Ln−t Lp+r+t c t r≥0,t≥1      p+1 a b q n a b = (−1) (−1) (4.28) Ln Lp c + Ln−q Lp+q c . q 

N (b,c)−1

(f, g)Rcp Lan b ≡ (−1)p+1



(−1)s

q≥1

Here we have applied the substitution r = s − t for the last sum in the right-hand side of (4.28). Therefore, relation (4.27) holds for m = p as well.

page 75

May 26, 2020

17:1

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

76

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

9287 – Gr¨ obner-Shirshov Bases

Denote by Σ(X, N ) the set of relations (4.25), (4.26), and (4.27). It is straightforward to check that Σ(X, N ) is closed with respect to compositions. Note that the set of normal words is exactly the set of Σ(X, N )reduced words. Hence, by Theorem 4.11, the linear basis of M (X, N ) consists of normal words.  Corollary 4.33. As a linear space (even as a k[∂]-module) M (X, N ) is isomorphic to a free associative conformal algebra generated by X with locality function N . Proof. Every associative conformal algebra C generated by a set X is an A(X)-module relative to the action of A(X) on C given by Lan u = a (n) u,

Rna u = {u (n) a}

for a ∈ X, u ∈ C and n ∈ Z+ . Relations (4.25) and (4.26) also hold in C. Moreover, the axiom of conformal associativity of C allows us to write an arbitrary element f ∈ C as a linear combination of elements ∂ s (a1

(n1 )

(a2

(n2 )

. . . (ak

(nk )

ak+1 ) . . . )),

for s ≥ 0, k ≥ 1, ai ∈ X and 0 ≤ ni < N (ai , ai+1 ). These elements are exactly the normal words, linearly independent in M (X, N ). To complete the proof it is enough to note that the A(X)-module M (X, N ) is an associative conformal algebra. Given a normal word u = ∂ s Lan11 . . . Lankk c ∈ M (X, N ), for ai , c ∈ X, define    n c k La(n1 1,...,a Lun = s!(−1)s ,s1 ),...,(nk ,sk ) Ln−s+s1 +···+sk ∈ A(X), s s1 ,...,sk ≥0

where k La(n1 1,...,a ,s1 ),...,(nk ,sk )

s1 +···+sk

= (−1)

    n1 n k a1 ... Ln1 −s1 . . . , Lankk −sk , s1 sk

and set u (n) v = Lun v. This family of operations, together with ordinary multiplication by ∂, makes M (X, N ) into an associative conformal algebra.  We will denote this algebra by Conf(X, N ). This is a free associative conformal algebra generated by a set X relative to the locality function N and is unique up to isomorphism. Remark 4.34. The conformal algebra C(X, N ) was first constructed by Roitman [272] by means of formal distributions.

page 76

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.5. ASSOCIATIVE CONFORMAL ALGEBRAS

77

As a result, we obtain the following Corollary 4.35. The ideals of Conf(X, N ) are exactly A(X)-submodules of M (X, N ). Proof. Every (two-sided) ideal of Conf(X, N ) is obviously closed with respect to the action of ∂, Lan , and Rna . Conversely, every A(X)-submodule of M (X, N ) is ∂-invariant and closed relative to left and right conformal multiplications on a ∈ X.  Let us summarize the results above to state the CD-Lemma for associative conformal algebras. Definition 4.36. Suppose S is a set of elements in Conf(X, N ). Identify S with a set of linear combinations of normal words B(X, N ) in modA (X). If S together with Σ(X, N ) is a GS basis in the A(X)-module modA (X) then we say that S is a GS basis in Conf(X, N ). By abuse of terminology, we say a GS basis S in Conf(X, N ) is a GS basis of the ideal generated by S in Conf(X, N ). As an immediate corollary of Theorem 4.11 we obtain the following statement. Theorem 4.37. The following statements are equivalent: • S is a GS basis in Conf(X, N ); • The set of S-reduced normal words forms a linear basis of Conf(X, N | S). Conformal analogue of the commutator is given by [a (n) b] = (a (n) b) − [b (n) a]. An associative conformal algebra C relative to the operations [· (n) ·], for n ∈ Z+ , turns into a Lie conformal algebra. Examples of Lie conformal algebras are known from mathematical physics and representation theory. Let us consider some of these examples and calculate GS bases of their universal envelopes under an additional locality condition. Example 4.38 (Heisenberg–Virasoro Lie conformal algebra). Let L be generated as a k[∂]-module by X = {v, h}, where [v

(0)

v] = ∂v,

[v

(1)

v] = 2v,

[v

(0)

h] = ∂h,

[v

(1)

h] = h,

other products [· (n) ·] are zero.

[h (1) v] = h,

(4.29)

page 77

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

78

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

textbook

For the Heisenberg–Virasoro Lie conformal algebra L, consider the algebra A(L; X) generated by O = {∂, Lhn , Rnh , Lvn , Rnv | n ∈ Z+ } relative to the following defining relations: Lan ∂ − ∂Lan − nLan−1 ,

for n ≥ 0 and a ∈ X,

(4.30)

a , Rna ∂ − ∂Rna − nRn−1 b b b a Rn Lm − Lm Rn , Lvn Lvm − Lvm Lvn − (n − m)Lvn+m−1 , Lhn Lvm − Lvm Lhn − nLhn+m−1 ,

for n ≥ 0 and a ∈ X,

(4.31)

Lhn Lhm



Lhm Lhn ,

for n, m ≥ 0 and a, b ∈ X,

(4.32)

for n > m ≥ 0,

(4.33)

for n, m ≥ 0,

(4.34)

for n > m ≥ 0.

(4.35)



We will use an order on O which is slightly different from used above: assume Lhn > Lvm for all n, m ≥ 0. Obviously, (4.30)–(4.35) is a GS basis relative to this modified order. Let us fix the following locality function on X: N (v, v) = N (h, v) = 2,

N (v, h) = 1,

N (h, h) = 0.

Then relations (4.25) and (4.26) turn into Lvn h, for n ≥ 1,

Lhm h, for m ≥ 0,

(4.36)

Lvn v, Lhn v, for n ≥ 2, R0v h − Lh0 v + ∂Lh1 v, R1v h + Lh1 v, Rnv h, for n ≥ 2, R0v v − Lv0 v + ∂Lv1 v, R1v v + Lv1 v, Rnv v, for n ≥ 0, h h, for m ≥ 0. R0h v − Lv0 h, Rnh v, for n ≥ 1, Rm

(4.37) (4.38) (4.39) (4.40)

The commutation relations encoding (4.29) are Lv1 v − v,

Lh1 v − h,

Lh0 v − Lv0 h.

(4.41)

It is straightforward to check that the set S of relations (4.30)–(4.41) form a GS basis in the free A(L; X)-module generated by X. For example, let us show that the composition of intersection of f = R0v Lh0 − Lh0 R0v and g = Lh0 v − Lv0 h is trivial: (f, g)Rv0 Lh0 v = (R0v Lh0 − Lh0 R0v )v − R0v (Lh0 v − Lv0 h) = R0v Lv0 h − Lh0 R0v v ≡ Lv0 R0v h − Lh0 R0v v ≡ Lv0 (Lh0 v − ∂Lh1 v) − Lh0 (Lv0 v − ∂Lv1 v) ≡ (Lv0 Lh0 − Lh0 Lv0 )v − ∂Lv0 Lh1 v + ∂Lh0 Lv1 v ≡ ∂Lh0 Lv1 v − ∂Lv0 Lh1 v ≡ Lh0 v − Lv0 h ≡ 0

(mod Σ ∪ S, R0v Lh0 v).

The set of (Σ ∪ S)-reduced normal words consists of ∂ s (Lv0 )k a,

for a ∈ X and s, k ≥ 0.

These words form a linear basis of U (L; X, N ).

page 78

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.5. ASSOCIATIVE CONFORMAL ALGEBRAS

79

Example 4.39 (GS basis of U (Vir; v, 3)). Let us compute the GS basis of the universal associative envelope of the Virasoro conformal algebra L = k[∂]v, [v (0) v] = ∂v, [v (1) v] = 2v, relative to the associative locality function N (v, v) = 3. Rnv

Here we will use a different ordering of generators O. Denote Ln = Lvn , = Rn , suppose L0 < L1 < ∂ < L2 < · · · < R0 < R1 < . . . ,

and compare monomials in O∗ first by their degree in Rn , next by the deg-lex rule. In these settings, A(L; X) is generated by O relative to the following defining relations: ∂L0 − L0 ∂,

(4.42)

∂L1 − L1 ∂ + L0 ,

(4.43)

Ln ∂ − ∂L0 − nLn−1 ,

for n ≥ 2,

Rn ∂ − ∂Rn − nRn−1 , Rn Lm − Lm Rn ,

(4.44)

for n ≥ 0,

(4.45)

n, m ≥ 0,

Ln Lm − Lm Ln − (n − m)Ln+m−1 ,

(4.46) for n > m ≥ 0.

(4.47)

Polynomials (4.42)–(4.47) form a GS basis in kO. As an A(L; X)-module, U (Vir; v, 3) is generated by X = {v} with the following relations: Ln v,

for n ≥ 3,

R0 v − 2L0 v + L1 ∂v − ∂

L2 v,

(4.49)

R1 v + L1 v − ∂L2 v,

(4.50)

R2 v − L2 v,

(4.51)

Rn v,

for n ≥ 3,

∂L2 v − 2L1 v + 2v. Relations • • • •

(4.48) (2)

(4.52) (4.53)

(4.42)–(4.53) have the following compositions of intersection: (4.46) with (4.48), w = Rm Ln v, for n ≥ 3; (4.47) with (4.48), w = Lm Ln v, for m > n ≥ 3; (4.44) with (4.53), w = Ln ∂L2 v, for n ≥ 2; (4.45) with (4.53), w = Rm Ln v, for n ≥ 0.

Calculation of these compositions gives only one new defining relation L2 L2 v.

(4.54)

page 79

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

80

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

Denote by S the set of relations (4.42)–(4.54). Relation (4.54) has a composition of intersection with (4.47), w = Ln L2 L2 v, for n ≥ 3, which is trivial modulo S and w. Yet another composition is obtained from (4.46) and (4.54) relative to w = Rn L2 L2 v, for n ≥ 0. This composition is obviously trivial for n ≥ 2. Let us show as an example the computation of that composition for n = 0: R0 L2 L2 v ≡ L2 L2 R0 v ≡ L2 L2 (L0 v − ∂L1 v + ∂ (2) L2 v) ≡ L2 L2 L0 v − L2 ∂L2 L1 v − 2L2 L1 L1 v + 2L2 ∂L1 L2 v + L2 L0 L2 v ≡ L2 L2 L0 v − 2L1 L2 L1 v − 2L2 L1 L1 v + 4L1 L1 L2 v + L2 L0 L2 v ≡ 2L2 L0 L2 v + 2L2 L1 v − 2L1 L2 v − 2L2 L1 L1 v + 2L1 L1 L2 v ≡ 4L1 L2 v + 2L1 L2 v + 2L2 v + 2L1 L1 L2 v − 2L1 L2 v − 2L2 L1 L1 v ≡ 4L1 L2 v + 2L2 v + 2L1 L1 L2 v − 2L1 L1 L2 v − 4L1 L2 v − 2L2 v ≡ 0 (mod S, R0 L2 L2 v). As a result, S is a GS basis of U (Vir; v, 3). 4.6. Lifting of Gr¨ obner bases to Gr¨ obner–Shirshov bases Now, we give some applications of Theorem 3.24. Example 4.40. Suppose that for the deg-lex ordering, S1 and S2 are GS bases in kX and kY  respectively. Then for the deg-lex ordering on X ∗ Y ∗ as before, S1 ∪ S2 is a GS basis in kX ∪ Y | T  = kX ⊗ kY . It follows that kX | S1  ⊗ kY | S2  = kX ∪ Y | T ∪ S1 ∪ S2 . Indeed, the possible compositions in S1 ∪ S2 are X-including only, Xintersection only, Y -including only and Y -intersection only. Suppose that f, g ∈ S1 and (f, g)w1 ≡ 0 (mod S1 , w1 ) in kX. Then in kX ⊗ kY , we have (f, g)w = (f, g)w1 c, where w = w1 c for any c ∈ Y ∗ . From this, it follows that each composition in S1 ∪ S2 is trivial modulo S1 ∪ S2 . A particular case of Example 4.40 is the following. Example 4.41. Let X, Y be well-ordered sets, k[X] the free commutative associative algebra generated by X. Then S = {xi xj = xj xi | xi > xj , xi , xj ∈ X} is a GS basis in kX ⊗ kY  with respect to the deg-lex ordering. Therefore, k[X] ⊗ kY  = kX ∪ Y | T ∪ S. In [134], a GS basis in kX is constructed by lifting a commutative Gr¨ obner basis and adding some commutators. Let X = {x1 , x2 , . . . , xn }, [X] the free commutative monoid generated by X and k[X] the polynomial ring. Let S1 = {hij = xi xj − xj xi | i > j} ⊂ kX.

page 80

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ ¨ 4.6. LIFTING OF GROBNER BASES TO GROBNER–SHIRSHOV BASES

textbook

81

Consider the natural map γ : kX → k[X] which maps xi to xi and the lexicographic splitting of γ, which is defined as the k-linear map δ : k[X] → kX, xi1 xi2 · · · xir → xi1 xi2 · · · xir if i1 ≤ i2 · · · ≤ ir . For every u ∈ [X], we present u = xl11 xl22 · · · xlnn , where li ≥ 0. We use an arbitrary monomial ordering on [X]. Following [134], we define an ordering on X ∗ using the ordering x1 < x2 < · · · < xn as follows: for u, v ∈ X ∗ , u > v ⇔ γ(u) > γ(v) in [X] or (γ(u) = γ(v) and u >lex v). s) It is easy to check that this ordering is monomial on X ∗ and δ(s) = δ(¯ for any s ∈ k[X]. Moreover, v ≥ δ(u) for every v ∈ γ −1 (u). For m = xi1 xi2 · · · xir ∈ [X], with i1 ≤ i2 · · · ≤ ir , denote the set of all the monomials u ∈ [xi1 +1 , · · · , xir −1 ] by U (m). The proofs of the following lemmas are straightforward. Lemma 4.42. Let a, b ∈ X ∗ , a = δ(γ(a)), b = δ(γ(b)) and s ∈ k[X]. If w = aδ(¯ s)b = δ(γ(ab)¯ s), then, in kX, aδ(s)b ≡ δ(γ(ab)s)

(mod S1 , w).

s)b = Proof. Suppose that s = s¯+s and h = aδ(s)b−δ(γ(ab)s). Since aδ(¯ ¯ < w. By noting that δ(γ(ab)¯ s), we have h = aδ(s )b − δ(γ(ab)s ), and h  γ(aδ(s )b) = γ(δ(γ(ab)s )), we obtain h ≡ 0 (mod S1 , w). Lemma 4.43. Let f, g ∈ k[X], g¯ = xi1 xi2 · · · xir , with i1 ≤ i2 ≤ · · · ≤ ir , and w = δ(f¯g¯). Then, in kX,  δ((f − f¯)g) ≡ αi ai δ(ui g)bi (mod S1 , w) where α g ), i ∈ k, ai ∈ [x ∈ X | x ≤ xi1 ], bi ∈ [x ∈ X | x ≥ xir ], ui ∈ U (¯  and γ( αi ai ui bi ) = f − f¯. Theorem 4.44. Let the orderings on [X] and X ∗ be defined as above. If S is a minimal Gr¨ obner basis in k[X], then s)} ∪ S1 S  = {δ(us) | s ∈ S and u ∈ U (¯ is a Gr¨ obner–Shirshov basis in kX. Proof. We will show that all the possible compositions of elements in S  are trivial. Let f = δ(us1 ), g = δ(vs2 ) and hij = xi xj − xj xi ∈ S  .

page 81

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

82

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

First we consider the composition of f and g. Case 1. f and g have a composition of inclusion, i.e., w = δ(u¯ s1 ) = aδ(v¯ s2 )b for some a, b ∈ X ∗ and a = δ(γ(a)), b = δ(γ(b)). s1 s¯2 ) = s¯1 s¯2 , then If s1 and s2 have no composition in k[X], i.e., lcm(¯ u = u s¯2 , γ(ab)v = u s¯1 for some u ∈ [X]. By Lemma 4.42 and Lemma 4.43, we have (f, g)w = δ(us1 ) − aδ(vs2 )b ≡ δ(us1 ) − δ(γ(ab)vs2 ) ≡ δ(u s¯2 s1 ) − δ(u s¯1 s2 ) ≡ δ(u (s1 − s¯1 )s2 ) − δ(u (s2 − s¯2 )s1 ) ≡0

(mod S  , w).

Since S is a minimal Gr¨ obner basis in k[X], the only possible compositions of δ(s1 ) and δ(s2 ) are intersection. The corresponding composition of s1 and s2 in k[X] is (s1 , s2 )w = a s1 −b s2 , where a , b ∈ [X], w = a s¯1 = b s¯2 , and |w | < |¯ s1 | + |¯ s2 |. Then w is a subword of γ(w). Hence, we deduce that  w = δ(tw ) = δ(ta s¯1 ) = δ(tb s¯2 ) and u = ta , γ(ab)v = tb for some t ∈ [X]. Then (f, g)w = δ(us1 ) − aδ(vs2 )b ≡ δ(us1 ) − δ(γ(ab)vs2 ) ≡ δ(ta s1 ) − δ(tb s2 ) ≡ δ(t(a s1 − b s2 )) ≡ δ(t(s1 , s2 )w ) ≡0

(mod S  , w)

since t(s1 , s2 )w < tw = γ(w). Case 2. If f and g have a composition of intersection, we may assume that f¯ is on the left of g¯, i.e., w = δ(u¯ s1 )a = bδ(v¯ s2 ) for some a, b ∈ X ∗ and a = δγ(a), b = δγ(b). Similarly to Case 1, we have to consider whether s1 and s2 have compositions in k[X] or not. One can check that both cases are trivial mod(S  , w) by Lemma 4.42 and Lemma 4.43. Now we consider the composition of f and hij . ¯ ij = xi xj can not be a subword of f¯ = δ(u¯ By noting that h s1 ), since i > j, the only possible compositions are intersection. Suppose that s¯1 = xi1 · · · xir xi , with i1 ≤ i2 ≤ · · · ≤ ir ≤ i. Then f¯ = δ(u¯ s1 ) = xi1 vxi for some v ∈ kX, v = δγ(v), and w = δ(u¯ s1 )xj .

page 82

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ ¨ 4.6. LIFTING OF GROBNER BASES TO GROBNER–SHIRSHOV BASES

textbook

83

If j ≤ i1 then (f, hij )w = δ(us1 )xj − xi1 v(xi xj − xj xi ) = δ(u(s1 − s¯1 ))xj + xi1 vxj xi ≡ xj δ(u(s1 − s¯1 )) + xj xi1 vxi ≡ xj (δ(u(s1 − s¯1 )) + δ(u¯ s1 )) ≡ xj δ(us1 ) ≡ 0 (mod S  , w). If j > i1 then uxj ∈ U (¯ s1 ) and (f, hij )w = δ(us1 )xj − xi1 v(xi xj − xj xi ) = δ(u(s1 − s¯1 ))xj + xi1 vxj xi ≡ δ(uxj (s1 − s¯1 )) + δ(xi1 vxi xj ) ≡ δ(uxj (s1 − s¯1 )) + δ(uxj s¯1 ) ≡ δ(uxj s1 ) ≡0

(mod S  , w). 

This completes the proof. Now we extend γ and δ as follows. γ ⊗ 1 : kX ⊗ kY  → k[X] ⊗ kY ,

uX uY → γ(uX )uY ,

δ ⊗ 1 : k[X] ⊗ kY  → kX ⊗ kY ,

uX uY → δ(uX )uY .  Y A polynomial f ∈ k[X] ⊗ kY  has a presentation f = αi uX i ui , where X Y ∗ αi ∈ k, ui ∈ [X] and ui ∈ Y . Let the orderings on [X] and Y ∗ be any monomial orderings. We order the set [X]Y ∗ = {u = uX uY | uX ∈ [X] and uY ∈ Y ∗ } as follows. For u, v ∈ [X]Y ∗ , u > v ⇔ uY > v Y or (uY = v Y and uX > v X ). Now, we order X ∗ Y ∗ : for u, v ∈ X ∗ Y ∗ , u > v ⇔ γ(uX )uY > γ(v X )v Y or (γ(uX )uY = γ(v X )v Y and uX >lex v X ). This ordering is clearly a monomial ordering on X ∗ Y ∗ . The following definitions of compositions and GS bases are taken from [239]. Let f, g be monic polynomials of k[X] ⊗ kY , L the least common multiple of f¯X and g¯X . (1) Inclusion. g Y d for some c, d ∈ Y ∗ . If Let g¯Y be a subword of f¯Y , say, f¯Y = c¯ Y Y X X Y ¯ ¯ f = g¯ then f ≥ g¯ and if g¯ = 1 then we set c = 1. Let w = Lf¯Y =

page 83

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

84

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

Lc¯ g Y d. We define the composition L L C1 (f, g, c)w = ¯X f − X cgd. g¯ f (2) Overlap. Let a non-empty beginning of g¯Y be a non-empty ending of f¯Y , say, Y ¯ g Y for some c, d, c0 ∈ Y ∗ and c0 = 1. Let f = cc0 , g¯Y = c0 d, f¯Y d = c¯ Y Y ¯ w = Lf d = Lc¯ g . We define the composition L L C2 (f, g, c0 )w = ¯X f d − X cg. g¯ f (3) External. Let c0 ∈ Y ∗ be any associative word (possibly empty). In the case that the greatest common divisor of f¯X and g¯X is non-empty and f¯Y , g¯Y are non-empty, we define the composition L L C3 (f, g, c0 )w = ¯X f c0 g¯Y − X f¯Y c0 g, g¯ f where w = Lf¯Y c0 g¯Y . Let S be a monic subset of k[X] ⊗ kY . Then S is called a GS basis (standard basis) if for every nonzero element f ∈ Id(S) its principal monomial f¯ contains s¯ as a subword for some s ∈ S. A composition being trivial modulo S and corresponding w is defined in the usual way. We also have that S is a GS basis in k[X] ⊗ kY  if and only if all the possible compositions of its elements are trivial. A GS basis in k[X] ⊗ kY  is called minimal if for any s ∈ S and all si ∈ S \ {s}, s¯i is not a subword of s¯. Similar to the proof of Theorem 4.44, we have the following theorem. Theorem 4.45. Let the orderings on [X]Y ∗ and X ∗ Y ∗ be defined as before. If S is a minimal GS basis in k[X] ⊗ kY , then S  = {δ ⊗ 1(us) | s ∈ S, u ∈ U (¯ sX )} ∪ S1 is a GS basis in kX ⊗ kY , where S1 = {hij = xi xj − xj xi | i > j}. Proof. We will show that all the possible compositions of elements in S  are trivial. For s1 , s2 ∈ S, let f = δ ⊗ 1(us1 ), g = δ ⊗ 1(vs2 ), hij = xi xj − xj xi ∈ S  and L = lcm(s¯1 X , s¯2 X ). Case I. First we consider the composition of f and g. In this case, all the possible compositions are related to the ambiguities w (in the following, a, b ∈ X ∗ and c, d ∈ Y ∗ ). I.1 Compositions of inclusion.

page 84

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

¨ ¨ 4.6. LIFTING OF GROBNER BASES TO GROBNER–SHIRSHOV BASES

textbook

85

Y I.1.1 X-inclusion only: wX = δ(us¯1 X ) = aδ(v¯ sX = s¯Y1 c¯ sY2 2 )b, w Y Y Y or w = s¯2 c¯ s1 . X I.1.2 Y -inclusion only: wX = δ(us¯1 X )aδ(v¯ sX = δ(v¯ sX 2 ) or w 2 ) Y Y Y X s2 d. aδ(us¯1 ), w = s¯1 = c¯ I.1.3 X, Y -inclusion: w = δ ⊗ 1(u¯ s1 ) = acδ ⊗ 1(v¯ s2 )bd. X Y I.1.4 X, Y -skew-inclusion: wX = δ(u¯ sX ) = aδ(v¯ s ¯Y2 = 1 2 )b, w = s Y c¯ s1 d. I.2 Overlap. Y I.2.1 X-intersection only: wX = δ(u¯ sX sX 1 )a = bδ(v¯ 2 ), w = Y Y Y Y Y s¯1 c¯ s2 or w = s¯2 c¯ s1 . X I.2.2 Y -intersection only: wX = δ(u¯ sX sX = 1 )aδ(v¯ 2 ) or w X X Y Y Y s1 ), w = s¯1 c = d¯ s2 . δ(u¯ s2 )aδ(v¯ Y I.2.3 X, Y -intersection: wX = δ(u¯ sX sX = 1 )a = bδ(v¯ 2 ), w Y Y s¯1 c = d¯ s2 . sX sX I.2.4 X, Y -skew-intersection: wX = δ(u¯ 1 )a = bδ(v¯ 2 ), Y Y Y w = c¯ s1 = s¯2 d. I.3 External. sX sX I.3.1 X-inclusion and Y -intersection: wX = δ(u¯ 1 ) = aδ(v¯ 2 )b, Y Y Y Y Y Y w = s¯1 c = d¯ s2 or w = c¯ s1 = s¯2 d. sX sX I.3.2 X-intersection and Y -inclusion: wX = δ(u¯ 1 )a = bδ(v¯ 2 ), Y Y Y Y Y Y w = s¯1 = c¯ s2 d or w = s¯2 = c¯ s1 d.

We only check cases (I.1.1), (I.1.2) and (I.1.3). Other cases are similar. ∗ Y (I.1.1) Suppose that wX = δ(us¯1 X ) = aδ(v¯ sX 2 )b, a, b ∈ X and s¯1 , Y Y Y Y Y s2 and w = s¯2 are disjoint. There are two cases to consider: w = s¯1 c¯ s¯Y2 c¯ sY1 , where c ∈ Y ∗ . We will only prove the first case and the second is similar. s1 , s¯2 ) = s¯1 s¯2 , If s1 and s2 have no composition in k[X]⊗kY , i.e., lcm(¯  ¯ X for some u ∈ [X]. By the proof of Theorem then u = u s¯X , γ(ab)v = u s 2 1 4.44, we have sY2 − s¯Y1 caδ ⊗ 1(vs2 )b (f, g)w = δ ⊗ 1(us1 )c¯ sY2 )) − δ ⊗ 1(γ(¯ sY1 c)γ(ab)vs2 ) ≡ δ ⊗ 1(us1 γ(c¯ ≡ δ ⊗ 1(u s¯X sY2 ) − δ ⊗ 1(¯ sY1 cu s¯X 2 s1 c¯ 1 s2 ) ≡ δ ⊗ 1(u s1 c¯ s2 ) − δ ⊗ 1(u s¯1 cs2 ) ≡ δ ⊗ 1(u (s1 − s¯1 )cs2 ) − δ ⊗ 1(u s1 c(s2 − s¯2 )) ≡0

(mod S  , w).

Note that the elements of S have no composition of inclusion because S is minimal and s1 and s2 have no overlap composition because sY1 and sY2 are disjoint. If s1 and s2 have an external composition in k[X] ⊗ kY , i.e., C3 (s1 , s2 , c)w = (L/¯ sX sY2 ) − (L/¯ sX sY1 c)s2 = t2 s1 γ(c¯ sY2 ) − 1 )s1 γ(c¯ 2 )γ(¯

page 85

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

86

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

t1 γ(¯ sY1 c)s2 where gcd(¯ sX ¯X ¯X ¯X 1 ,s 2 ) = t = 1, s 1 = tt1 , s 2 = tt2 , and L =  Y Y  s1 c¯ s2 ), then w is a subword of γ(w). Therefore, we have tt1 t2 , w = Lγ(¯ w = δ ⊗ 1(mw ) and u = mt2 , γ(ab)v = mt1 since ut1 = γ(ab)vt2 and gcd(t1 , t2 ) = 1. Then (f, g)w = δ ⊗ 1(us1 )c¯ sY2 − s¯Y1 caδ ⊗ 1(vs2 )b ≡ δ ⊗ 1(us1 γ(c¯ sY2 )) − δ ⊗ 1(γ(¯ sY1 c)γ(ab)vs2 ) sY2 )) − δ ⊗ 1(mt1 γ(¯ sY1 c)s2 ) ≡ δ ⊗ 1(mt2 s1 γ(c¯ ≡ δ ⊗ 1(mC3 (s1 , s2 , c)w ) ≡ 0 (mod S  , w) since mC3 (s1 , s2 , c)w < mw = γ(w). (I.1.2) Suppose that wY = s¯Y1 = c¯ sY2 d, for c, d ∈ Y ∗ , and δ(us¯1 X ), X δ(v¯ s2 ) are disjoint. Then there are two compositions according to wX = X ∗ X sX = δ(v¯ sX δ(us¯1 X )aδ(v¯ 2 ) and w 2 )aδ(us¯1 ) for a ∈ X . We only consider the first. sX sX (f, g)w = δ ⊗ 1(us1 )aδ(v¯ 2 ) − δ(u¯ 1 )acδ ⊗ 1(vs2 )d ≡ δ ⊗ 1(us1 γ(a)v¯ sX sX 2 − u¯ 1 γ(a)vγ(c)s2 γ(d)) ¯X ≡ δ ⊗ 1(uγ(a)v(s1 s¯X 2 −s 1 γ(c)s2 γ(d))) ≡ δ ⊗ 1(uγ(a)vC1 (s1 , s2 , γ(c))w ) ≡ 0 (mod S  , w), ¯X ¯Y1 = s¯X ¯X sY2 γ(d) and where w = s¯X 1 s 2 s 1 s 2 γ(c)¯ uγ(a)vC1 (s1 , s2 , γ(c))w < uγ(a)vw = γ(w). (I.1.3) We may assume that g¯ is a subword of f¯, i.e., w = δ ⊗ 1(u¯ s1 ) = X acδ ⊗ 1(v¯ s2 )bd, for a, b ∈ X ∗ and c, d ∈ Y ∗ . Then u¯ sX = γ(ab)v¯ s = mL 1 2 for some m ∈ [X], u¯ sY1 = γ(c)¯ sY2 γ(d). (f, g)w = δ ⊗ 1(us1 ) − acδ ⊗ 1(vs2 )bd ≡ δ ⊗ 1(us1 − γ(ac)vs2 γ(bd)) L L ≡ δ ⊗ 1(m X s1 − m X γ(c)s2 γ(d)) s¯1 s¯2 ≡ δ ⊗ 1(mC1 (s1 , s2 , γ(c))w ) ≡ 0 (mod S  , w), where w = Lγ(c)¯ sY2 γ(d) and mC1 (s1 , s2 , c)w < mw = γ(w). Case II. Now we consider the composition of f and hij . Similar to the proof of Theorem 4.44, they only have compositions of Xintersection. Suppose that s¯X 1 = xi1 · · · xir xi , where i1 ≤ i2 ≤ · · · ≤ ir ≤ i.

page 86

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.7. LIE ALGEBRA WITH UNSOLVABLE WORD PROBLEM

textbook

87

Then f¯ = δ ⊗ 1(u¯ s1 ) = xi1 vxi s¯Y1 for some v ∈ kX, v = δγ(v), and Y w = δ ⊗ 1(u¯ s1 )xj s¯1 . If j ≤ i1 , then sY1 (xi xj − xj xi ) (f, hij )w = δ ⊗ 1(us1 )xj − xi1 v¯ = δ ⊗ 1(u(s1 − s¯1 ))xj + xi1 vxj xi s¯Y1 ≡ xj δ ⊗ 1(u(s1 − s¯1 )) + xj xi1 vxi s¯Y1 ≡ xj (δ ⊗ 1(u(s1 − s¯1 )) + δ ⊗ 1(u¯ s1 )) ≡ xj δ ⊗ 1(us1 ) ≡0

(mod S  , w).

If j > i1 , then uxj ∈ U (¯ s1 ) and (f, hij )w = δ ⊗ 1(us1 )xj − xi1 v¯ sY1 (xi xj − xj xi ) = δ ⊗ 1(u(s1 − s¯1 ))xj + xi1 vxj xi s¯Y1 ≡ δ ⊗ 1(uxj (s1 − s¯1 )) + δ ⊗ 1(xi1 vxi xj s¯Y1 ) ≡ δ ⊗ 1(uxj (s1 − s¯1 )) + δ ⊗ 1(uxj s¯1 ) ≡ δ ⊗ 1(uxj s1 ) ≡0

(mod S  , w).

This completes the proof.



As an application of the above result, we have now constructed a GS basis in the tensor product kX ⊗ kY  by lifting a given GS basis in the tensor product k[X] ⊗ kY . 4.7. Lie algebra with unsolvable word problem Let P = Smgx, y | ui = vi , i ∈ I be a semigroup. Consider the Lie algebra AP = Lie(x, xˆ, y, yˆ, z | S), where S consists of the following relations: (1) [ˆ xx] = 0, [ˆ xy] = 0, [ˆ yx] = 0, [ˆ yy] = 0, (2) [ˆ xz] = −[zx], [ˆ y z] = −[zy], (3) zui ! = zvi !, for i ∈ I. Here, zu! means the left normed bracketing. In this section, we find a Gr¨obner–Shirshov basis for the Lie algebra AP and, by using this result, give another proof for Kukin’s theorem, see Corollary 4.47. Order x ˆ > yˆ > z > x > y and let > be the deg-lex order on {ˆ x, yˆ, x, y, z}∗ . Let ρ be the congruence on {x, y}∗ generated by the set of pairs {(ui , vi ), i ∈ I}. Let

page 87

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

88

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

textbook

(3 ) zu! = zv!, for (u, v) ∈ ρ with u > v. Theorem 4.46. With the above notation, the set S1 of relations (1), (2) obner–Shirshov basis in Lie(ˆ x, yˆ, x, y, z). and (3 ) is a Gr¨ Proof. For any u ∈ {x, y}∗ , we have zu! = zu, by induction on |u|. All possible compositions in S1 are intersection of (2) and (3 ), and inclusion of (3 ) and (3 ). ˆzu, (u, v) ∈ ρ, u > v, f = [ˆ xz] + [zx] and For (2) and (3 ), w = x g = zu! − zv!. We have xg]g¯ ([ˆ xz] + [zx], zu! − zv!)w = [f u]f¯ − [ˆ ≡ ([ˆ xz] + [zx])u! − [ˆ x( zu! − zv!)] ≡ zxu! + x ˆzv! ≡ zxu! − zxv! ≡0

(mod S1 , w).

For (3 ) and (3 ), w = zu1 = zu2 e, e ∈ {x, y}∗ , (ui , vi ) ∈ ρ, with ui > vi for i = 1, 2. We have ( zu1 ! − zv1 !, zu2 ! − zv2 !)w ≡ ( zu1 ! − zv1 !) − ( zu2 ! − zv2 !)e! ≡

zv2 !e! − zv1 ! ≡ zv2 e! − zv1 !

≡ 0 (mod S1 , w). 

obner–Shirshov basis in Lie(ˆ x, yˆ, x, y, z). Thus, the set S1 is a Gr¨ Corollary 4.47 (G.P. Kukin [201]). Let u, v ∈ {x, y}∗ . Then u = v in the semigroup P ⇔ zu! = zv! in the Lie algebra AP .

Proof. Suppose that u = v in the semigroup P . Without loss of generality, we may assume that u = au1 b, v = av1 b for some a, b ∈ {x, y}∗ and xr] = 0, and so (u1 , v1 ) ∈ ρ. For any r ∈ {x, y} relations (1) imply [ˆ zxc! = [z x ˆ]c! = [ zc!ˆ x], zyc! = [ zc!ˆ y], for any c ∈ {x, y}∗ . From − a !b! = this it follows that in AP , zu! = zau1 b! = zau1 !b! = zu1← ← − ← − zu1 a b! = zv1 a b! = zav1 b! = zv!, where ← x−−·−·− ·− x =x x · · · x , x x ···x =x  x  ···x  x −− i1 i2

in

in in−1

i1



i1 i2

in

i1 i2

in

for xij ∈ {x, y}. Moreover, (3 ) holds in AP . Suppose that zu! = zv! in the Lie algebra AP . Then both zu! and zv! have the same normal form in AP . Since S1 is a Gr¨obner–Shirshov basis in AP by Theorem 4.46, both zu! and zv! can be reduced to the same normal form of the form zc! for some c ∈ {x, y}∗ only by the relations (3 ). This implies that u = c = v in P . 

page 88

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.8. GS BASIS FOR THE DRINFELD–KOHNO LIE ALGEBRA

89

By the above corollary, if the semigroup P has undecidable word problem then so does the Lie algebra AP . 4.8. Gr¨ obner–Shirshov basis for the Drinfeld–Kohno Lie algebra In this section we find a Gr¨obner–Shirshov basis for the Drinfeld–Kohno Lie algebra Ln . Definition 4.48. Let n > 2 be an integer. The Drinfeld–Kohno Lie algebra Ln over Z is defined by generators tij = tji for distinct indices 1 ≤ i, j ≤ n − 1, and relations tij tkl = 0, tij (tik + tjk ) = 0, where i, j, k, l are pairwise distinct. Clearly, Ln has a presentation LieZ (T | S), where T = {tij | 1 ≤ i < j ≤ n − 1} and S consists of the following relations: tij tkl = 0,

for k < i < j, k < l and l = i, j;

(4.55)

tjk tij + tik tij = 0,

for i < j < k;

(4.56)

tjk tik − tik tij = 0,

for i < j < k.

(4.57)

Now we order T as follows: tij < tkl if either i < k or i = k and j < l. Let < be the deg-lex order on T ∗ . Theorem 4.49. Let S be the set of relations (4.55), (4.56), (4.57) as before, < the deg-lex order on T ∗ . Then S is a Gr¨ obner–Shirshov basis for Ln . Proof. We list all the possible ambiguities. We use (i) ∧ (j) to denote the composition of the type (i) and type (j). For convenience, we denote (4.55), (4.56), (4.57) by (1), (2), (3) respectively. For (1) ∧ (n), with 1 ≤ n ≤ 3, the possible ambiguities w are: (1) ∧ (1) : tij tkl tmr ,

for k < i < j, k < l, l = i, j, m < k < l, m < r and r = k, l),

(1) ∧ (2) : tij tkl tmk ,

for k < i < j, k < l, l = i, j and m < k < l,

(1) ∧ (3) : tij tkl tml ,

for k < i < j, k < l, l = i, j and m < k < l.

For (2) ∧ (n), with 1 ≤ n ≤ 3, the possible ambiguities w are: (2) ∧ (1) : tjk tij tmr ,

for m < i < j < k, m < r and r = i, j,

(2) ∧ (2) : tjk tij tmi ,

for m < i < j < k,

(2) ∧ (3) : tjk tij tmj ,

for m < i < j < k.

page 89

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

90

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 4. APPLICATIONS OF GROBNER–SHIRSHOV BASES

For (3) ∧ (n), with 1 ≤ n ≤ 3, the possible ambiguities w are: (3) ∧ (1) tjk tik tmr ,

for m < i < j < k, m < r and r = i, k,

(3) ∧ (2) tjk tik tmi ,

for m < i < j < k,

(3) ∧ (3) tjk tik tmk ,

for m < i < j < k.

We claim that all compositions are trivial relative to S. Here, we only prove cases (1) ∧ (1), (1) ∧ (2), (2) ∧ (1), (2) ∧ (2), the other cases can be proved similarly. For (1) ∧ (1), let f = tij tkl , g = tkl tmr , with k < i < j, k < l, l = i, j, m < k < l, m < r and r = k, l. Then w = tij tkl tmr and (f, g)w = (tij tkl )tmr − tij (tkl tmr ) = (tij tmr )tkl

(mod S, w).

There are three subcases to consider: r = i, j, r = i and r = j. It is straightforward to see that in all these cases (tij tmr )tkl ≡ 0 (mod S, w). For (1) ∧ (2), let f = tij tkl , g = tkl tmk + tml tmk , with k < i < j, k < l, l = i, j and m < k < l. Then w = tij tkl tmk and (f, g)w = (tij tkl )tmk − tij (tkl tmk + tml tmk ) = (tij tmk )tkl − tij (tml tmk ) ≡ −tij (tml tmk ) ≡ −(tij tml )tmk − tml (tij tmk ) ≡ 0 (mod S, w). For (2) ∧ (1), let f = tjk tij + tik tij , g = tij tmr , with m < i < j < k, m < r and r = i, j. Then w = tjk tij tmr and (f, g)w = (tjk tij + tik tij )tmr − tjk (tij tmr ) ≡ (tjk tmr )tij + (tik tmr )tij

(mod S, w).

There are two subcases to consider: r = k and r = k. Subcase 1. If r = k then (tjk tmr )tij + (tik tmr )tij ≡ 0 (mod S, w).

page 90

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

4.8. GS BASIS FOR THE DRINFELD–KOHNO LIE ALGEBRA

textbook

91

Subcase 2. If r = k, then (tjk tmr )tij + (tik tmr )tij = (tjk tmk )tij + (tik tmk )tij ≡ −tij (tmk tmj ) − tij (tmk tmi ) ≡ (tij tmj )tmk + (tij tmi )tmk ≡ −tmk (tmj tmi ) + tmk (tmj tmi ) ≡0

(mod S, w).

For (2) ∧ (2), let f = tjk tij + tik tij , g = tij tmi + tmj tmi , with m < i < j < k. Then w = tjk tij tmi and (f, g)w = (tjk tij + tik tij )tmi − tjk (tij tmi + tmj tmi ) = (tjk tmi )tij + (tik tmi )tij + tik (tij tmi ) − tjk (tmj tmi ) ≡ tij (tmk tmi ) − tik (tmj tmi ) − tjk (tmj tmi ) ≡ −(tij tmi )tmk + (tik tmi )tmj − (tjk tmj )tmi ≡ −tmk (tmj tmi ) − (tmk tmi )tmj + (tmk tmj )tmi ≡0

(mod S, w).

Thus S is a Gr¨obner–Shirshov basis for Ln .



Let L be a Lie algebra over a commutative ring K, L1 an ideal of L and L2 a subalgebra of L. Recall that L is a semidirect product of L1 and L2 if L = L1 ⊕ L2 as K-modules. By Theorems 3.35 and 4.49, we have immediately the following corollaries. Corollary 4.50. The Drinfeld–Kohno Lie algebra Ln is a free Z-module with a Z-basis Irr(S) = {[tik1 tik2 · · · tikm ] | tik1 tik2 · · · tikm is an ALSW in T ∗ , m ∈ N}. Corollary 4.51. The algebra Ln is an iterated semidirect product of free Lie algebras. Proof. Let Ai be the free Lie algebra generated by {tij | i < j ≤ n − 1}. Clearly, Ln = A1 ⊕ A2 ⊕ · · · ⊕ An−2 as Z-modules, and the relations (1), (2), (3) imply Ai is an ideal of Ai +  Ai+1 + · · · + An−2 .

page 91

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

b2530   International Strategic Relations and China’s National Security: World at the Crossroads

b2530_FM.indd 6

This page intentionally left blank

01-Sep-16 11:03:06 AM

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

CHAPTER 5

Gr¨ obner-Shirshov bases for Lie algebras over a commutative algebra In this chapter, we establish the Composition-Diamond lemma for Lie algebras over a polynomial algebra, i.e., for “double free” Lie algebras. It provides a Gr¨ obner–Shirshov bases theory for Lie algebras over a commutative algebra. In the case when the underlying commutative algebra is a field, we obtain an alternative proof of the Composition-Diamond Lemma for Lie algebras which is closer to Shirshov’s original proof of 1962. Let k be a field, K a commutative associative k-algebra with identity, and L a Lie K-algebra. Let LieK (X) be the free Lie K-algebra generated by a set X. Then, of course, L can be presented as K-algebra by generators X and some defining relations S, L = LieK (X | S) = LieK (X)/ Id(S). In order to define a Gr¨obner–Shirshov basis for L, we first present K in a form K = k[Y | R] = k[Y ]/ Id(R), where k[Y ] is a polynomial algebra over the field k and R ⊂ k[Y ]. Then the Lie K-algebra L has the following presentation as a k[Y ]-algebra L = Liek[Y ] (X | S ∪ RX), where RX stands for the set {f x | f ∈ R and x ∈ X}. Now by definition, a Gr¨obner–Shirshov basis for L = LieK (X | S) is a Gr¨ obner–Shirshov basis of the ideal Id(S ∪ RX) in the “double free” Lie algebra Liek[Y ] (X). As an application of the Composition-Diamond Lemma (Theorem 5.18 below), a Gr¨ obner–Shirshov basis of L gives rise to a linear basis of L as of a k-algebra. We give applications of Gr¨ obner–Shirshov bases theory for Lie algebras over a commutative algebra K (over a field k) to the Poincar´e–Birkhoff– Witt theorem. A Lie algebra over a commutative ring is called special if it is embeddable into an (universal enveloping) associative algebra. Otherwise it is called non-special. There are known classical examples by A.I. Shirshov 93

page 93

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

94

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 5. GROBNER-SHIRSHOV BASES FOR LIE ALGEBRAS

[284] and P. Cartier [89] of Lie algebras over commutative algebras over GF(2) that are not embeddable into associative algebras. Shirshov and Cartier used ad hoc methods to prove that some elements of corresponding Lie algebras are not zero though they are zero in the universal enveloping algebras, i.e., they proved non-speciality of the examples. Here we find Gr¨ obner–Shirshov bases of these Lie algebras and then use the CompositionDiamond Lemma (Theorem 5.18) to get the result, i.e., we give a new conceptual proof. P.M. Cohn [121] gave the following examples of Lie algebras Lp = LieK (x1 , x2 , x3 | y3 x3 = y2 x2 + y1 x1 ) over truncated polynomial algebras K = k[y1 , y2 , y3 | yip = 0, for 1 ≤ i ≤ 3], where k is a filed of characteristic p > 0. He conjectured that Lp is a non-special Lie algebra for any p. The algebra Lp is called the Cohn’s Lie algebra. Using Theorem 5.18 we will prove that L2 , L3 and L5 are non-special Lie algebras. We present an algorithm that allows one to check for any p, whether Cohn’s Lie algebra is non-special. We give a new class of special Lie algebras in terms of defining relations (Theorem 5.24). For example, any one relator Lie algebra LieK (X | f ) with a k[Y ]-monic relation f over a commutative algebra K is special (Corollary 5.25). It gives an extension of the list of known special Lie algebras (ones with valid PBW Theorems) (see P.-P. Grivel [154]). Let us state that list here: 1. L is a free K-module (G. Birkhoff [18], E. Witt [309]); 2. K is a principal ideal domain (M. Lazard [207, 208]); 3. K is a Dedekind domain (P. Cartier [89]); 4. K is over a field k of characteristic 0 (P.M. Cohn [121]); 5. L is K-module without torsion (P.M. Cohn [121]); 6. 2 is invertible in K and [x[yz]] = 0 for any x, y, z ∈ L (Y. Nouaze and P. Revoy [245]); 7. No nonzero element of K annihilates an element of the center of LK (A.I. Shirshov [284]). As a last application, we will prove that every finitely or countably generated Lie algebra over an arbitrary commutative algebra K can be embedded into a two-generated Lie algebra over K. 5.1. Preliminaries We start with some concepts and results from the literature concerned with the Gr¨obner–Shirshov bases theory of a free Lie algebra Liek (X) generated

page 94

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

5.2. COMPOSITION-DIAMOND LEMMA FOR Liek[Y ] (X)

textbook

95

by X over a field k. For the reader’s convenience we recall some basic definitions in the following. Let X = {xi | i ∈ I} be a well-ordered set with xi > xj if i > j for any i, j ∈ I. Let X ∗ be the free monoid generated by X. For u = xi1 xi2 · · · xim ∈ X ∗ , let the length of u be m, denoted by |u| = m. We use two linear orderings on X ∗ : (i) (lex-ordering) 1 > t if t = 1 and, by induction, if u = xi u and v = xj v  then u > v if and only if xi > xj or xi = xj and u > v  ; (ii) (deg-lex ordering) u v if |u| > |v|, or |u| = |v| and u > v. We regard Liek (X) as the Lie subalgebra of the free associative algebra kX, which is generated by X under the Lie bracket [u, v] = uv − vu. Given f ∈ kX, denote by f¯ the leading word of f with respect to the deg-lex ordering; f is said to be monic if the coefficient of f¯ is 1. We denote the set of all non-associative Lyndon–Shirshov words on X by NLSW(X). Considering a NLSW [w] as a polynomial in kX, we have [w] = w (see Lemma 2.34). This implies that if f ∈ Liek (X) ⊂ kX then f¯ belongs to the set of all associative Lyndon–Shirshov words ALSW(X). Lemma 5.1. Let w = aub be as in Theorem 2.39. Then [uc] = [u[c1 ][c2 ] . . . [cn ], ] that is [w] = [a[. . . [u[c1 ]] . . . [cn ]]d]. Lemma 5.2. Suppose that w = aubvc, where w, u, v ∈ ALSW(X). Then there is a bracketing [w]u,v = [a[u]b[v]d] on the word w such that [w]u,v = w. More precisely, ⎧ ⎪ if [w] = [a[up]q[vs]l]; ⎨[a[up]u q[vs]v l], [w]u,v = [a[u[c1 ] · · · [ct ]v · · · [cn ]]u p], if [w] = [a[u[c1 ] · · · [ct ] · · · [cn ]]p] ⎪ ⎩ with v a subword of ct . 5.2. Composition-Diamond lemma for Liek[Y ] (X) Let Y = {yj | j ∈ J} be a well-ordered set and [Y ] = {yj1 yj2 · · · yjl | yj1 ≤ yj2 ≤ · · · ≤ yjl and l ≥ 0} the free commutative monoid generated by Y . Then [Y ] is a k-linear basis of the polynomial algebra k[Y ]. Let the set X be a well-ordered set, and

page 95

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

96

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 5. GROBNER-SHIRSHOV BASES FOR LIE ALGEBRAS

let the lex-ordering < and the deg-lex ordering ≺X on X ∗ be defined as before. Let Liek[Y ] (X) be the “double” free Lie algebra, i.e., the free Lie algebra over the polynomial algebra k[Y ] with generating set X. From now on we regard Liek[Y ] (X) ∼ = k[Y ] ⊗ Liek (X) as the Lie subalgebra of k[Y ]X ∼ = k[Y ] ⊗ kX, the free associative algebra over the polynomial algebra k[Y ], which is generated by X under the Lie bracket [u, v] = uv − vu. Let TA = {u = uY uX | uY ∈ [Y ] and uX ∈ ALSW(X)} and TN = {[u] = uY [uX ] | uY ∈ [Y ] and [uX ] ∈ NLSW(X)}. As we know from Section 2.4, the elements of TA and TN are in one-to-one correspondence with each other. Remark 5.3. For u = uY uX ∈ TA , we still use the notation [u] = uY [uX ] where [uX ] is a NLSW on X. Let kTN be the linear space spanned by TN over k. For any [u], [v] ∈ TN , define  [u][v] = αi uY v Y [wiX ]  αi [wiX ] in Liek (X). where αi ∈ k, the [wiX ] are NLSWs and [uX ][v X ] = kT as a k-algebra and T is a k-basis of k[Y ] ⊗ Then k[Y ] ⊗ Liek (X) ∼ = N N Liek (X). We define the deg-lex ordering on [Y ]X ∗ = {uY uX | uY ∈ [Y ], uX ∈ X ∗ } as follows: for any u, v ∈ [Y ]X ∗ , u v if uX X v X or uX = v X and uY Y v Y , where Y and X are the deg-lex ordering on [Y ] and X ∗ respectively. Remark 5.4. By abuse of notation, from now on, in a Lie expression like [[u][v]] we will omit the external brackets, so [[u][v]] = [u][v]. Clearly, the ordering is “monomial” in the sense that [u][w] [v][w] whenever wX = uX for any u, v, w ∈ TA . Considering any [u] ∈ TN as a polynomial in the k-algebra k[Y ]X, we have [u] = u ∈ TA . For any f ∈ Liek[Y ] (X) ⊂ k[Y ] ⊗ kX,  one can present f as a klinear combination of TN -words, i.e., f = αi [ui ], where [ui ] ∈ TN . With respect to the ordering on [Y ]X ∗ , the leading word f¯ of f in k[Y ]X is an element of TA . We call f k-monic if the coefficient of f¯ is 1. On the other hand, presented as a k[Y ]-linear combination of NLSW(X),  f can be X X i.e., f = fi (Y )[uX i ], where fi (Y ) ∈ k[Y ], [ui ] ∈ NLSW(X) and u1 X

page 96

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

5.2. COMPOSITION-DIAMOND LEMMA FOR Liek[Y ] (X)

97

¯X = uX and f¯Y = f1 (Y ). We call f k[Y ]-monic if uX 2 X . . .. Clearly f 1 f1 (Y ) = 1. It is easy to see that k[Y ]-monic implies k-monic. Using the concepts above, we rewrite Lemma 2.34 as follows. Lemma 5.5 (Shirshov). Suppose that w = aub where w, u ∈ TA and a, b ∈ X ∗ . Then [w] = [a[uc]d], where [uc] ∈ TN and b = cd. Represent c in a form c = c1 c2 . . . cn , where c1 , . . . , cn ∈ ALSW(X) and c1 ≤ c2 ≤ . . . ≤ cn . Then [w] = [a[u[c1 ][c2 ] . . . [cn ]]d]. Moreover, the leading word of [w]u = [a[· · · [[[u][c1 ]][c2 ]] . . . [cn ]]d] is exactly w, i.e., [w]u = w. We still use the notion [w]u as the special bracketing of w relative to u. Let S ⊂ Liek[Y ] (X) and Id(S) be the k[Y ]-ideal of Liek[Y ] (X) generated by S. Then any element of Id(S) is a k[Y ]-linear combination of polynomials of the following form: (u)s = [c1 ][c2 ] · · · [cn ]s[d1 ][d2 ] · · · [dm ],

for m, n ≥ 0,

with some placement of parentheses, where s ∈ S and ci , dj ∈ ALSW(X). We call such (u)s an s-word (or S-word). Now, we define two special kinds of S-words. Definition 5.6. Let S ⊂ Liek[Y ] (X) be a k-monic subset, a, b ∈ X ∗ and s ∈ S. If a¯ sb ∈ TA then by Lemma 5.5 we have the special bracketing [a¯ sb]s¯ of a¯ sb relative to s¯. We define [asb]s¯ = [a¯ sb]s¯|[¯s]→s to be a special normal s-word (or special normal S-word). Definition 5.7. Let S ⊂ Liek[Y ] (X) be a k-monic subset and s ∈ S. We define the normal s-word, denoted by u!s , where u = asb, a, b ∈ X ∗ (u is an associative S-word), inductively. (i) A polynomial s ∈ S is normal of s-length 1; (ii) If u!s is normal with s-length k and [v] ∈ NLSW(X) such that X

X

|v| = l, then [v] u!s when v > u!s and u!s [v] when v < u!s are normal of s-length k + l. From the definition of the normal s-word, we have the following lemma.

Lemma 5.8. For any normal s-word u!s = (asb), for a, b ∈ X ∗ , we have u!s = a¯ sb ∈ T A .

page 97

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

98

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 5. GROBNER-SHIRSHOV BASES FOR LIE ALGEBRAS

Remark 5.9. It is clear that for an s-word (u)s = [c1 ][c2 ] · · · [cn ]s[d1 ][d2 ] · · · [dm ], (u)s is normal if and only if (u)s = c1 c2 · · · cn s¯d1 d2 · · · dm . Now we give the definition of compositions. Definition 5.10. Let f , g be two k-monic polynomials of Liek[Y ] (X). Denote the least common multiple of f¯Y and g¯Y in [Y ] by L = lcm(f¯Y , g¯Y ). If g¯X is a subword of f¯X , i.e., f¯X = a¯ g X b for some a, b ∈ X ∗ , then the polynomial L L C1 f, gw = ¯Y f − Y [agb]g¯ g¯ f is called the inclusion composition of f and g with respect to w, where w = Lf¯X = La¯ g X b. If a proper prefix of g¯X is a proper suffix of f¯X , i.e., f¯X = aa0 , g¯X = a0 b, for a, b, a0 = 1, then the polynomial L L C2 f, gw = ¯Y [f b]f¯ − Y [ag]g¯ g ¯ f is called the intersection composition of f and g with respect to w, where w = Lf¯X b = La¯ gX . If the greatest common divisor of f¯Y and g¯Y in [Y ] is not 1, then for g X c ∈ TA , the polynomial any a, b, c ∈ X ∗ such that w = Laf¯X b¯ L L g X c]f¯ − Y [af¯X bgc]g¯ C3 f, gw = ¯Y [af b¯ g ¯ f is called the external composition of f and g with respect to w. If f¯Y = 1, then for any special normal f -word [af b]f¯, a, b ∈ X ∗ , the polynomial C4 f w = [af¯X b][af b] ¯ f

is called the multiplication composition of f with respect to w, where w = af¯X baf¯b. Immediately, we have that Ci −w ≺ w, for i ∈ {1, 2, 3, 4}. Remark 5.11. (1) When Y = ∅, there are no external and multiplication compositions. This is the case of Shirshov’s compositions over a field. (2) In the cases of C1 and C2 , the corresponding w belongs to TA by the property of ALSWs, but in the case of C4 we have w ∈ TA . (3) For any fixed f , g, there are finitely many compositions C1 f, gw , C2 f, gw , but infinitely many C3 f, gw , C4 f w .

page 98

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

5.2. COMPOSITION-DIAMOND LEMMA FOR Liek[Y ] (X)

textbook

99

Definition 5.12. Given a k-monic subset S ⊂ Liek[Y ] (X) and w ∈ [Y ]X ∗ (not necessary in TA ), an element h ∈ Liek[Y ] (X) is called trivial modulo (S, w), denoted by h ≡ 0 (mod S, w), if h can be presented as a k[Y ]-linear combination of special normal S-words with leading words less than w, i.e.,  αi βi [ai si bi ]s¯i , h= i

where αi ∈ k, βi ∈ [Y ], ai , bi ∈ X ∗ , si ∈ S, and βi ai s¯i bi ≺ w. In general, for p, q ∈ Liek[Y ] (X), we write p ≡ q (mod S, w) if p − q ≡ 0 (mod S, w). The set S is called a Gr¨obner–Shirshov (GS) basis in Liek[Y ] (X) if all the possible compositions of elements in S are trivial modulo S and corresponding w. If a subset S of Liek[Y ] (X) is not a GS basis then one can add all nontrivial compositions of polynomials of S to S. Continuing this process repeatedly, we finally obtain a GS basis S c that contains S. Such a process is called Shirshov’s algorithm, S c is called the Gr¨obner–Shirshov complement of S. Lemma 5.13. Let f be a k-monic polynomial in Liek[Y ] (X). If f¯Y = 1 or f = gf  where g ∈ k[Y ] and f  ∈ Liek (X), then for any special normal f -word [af b]f¯, for a, b ∈ X ∗ the element (u)f = [af¯X b][af b]f¯ has a presentation  αi βi ui !f (u)f = [af¯X b][af b]f¯ = ui f (u)f

where αi ∈ k and βi ∈ [Y ]. Proof. There are two cases. Case 1. f¯Y = 1, i.e., f¯ =f¯X . By Lemma 5.5 and since ≺ is monomial, we have [af¯b] = [af b]f¯ − βi vi ≺afb ¯ αi βi [vi ], where αi ∈ k, βi ∈ [Y ], vi ∈ ALSW(X). Then  αi βi [af b]f¯[vi ] (u)f = [af¯b][af b]f¯ = [af b]f¯[af b]f¯ + =



βi vi ≺af¯b

αi βi [af b]f¯[vi ].

βi vi ≺af¯b

The result follows since vi ≺ af¯b and each [af b]f¯[vi ] is normal. Case 2. f = gf  , i.e., f¯X = f¯ . Then we have (u)f = [af¯ b][af b]f¯ = g([af¯ b][af  b]f¯ ). The result follows from Case 1.



page 99

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

100

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 5. GROBNER-SHIRSHOV BASES FOR LIE ALGEBRAS

The following lemma plays a key role. Lemma 5.14. Let S be a k-monic subset of Liek[Y ] (X) in which each multiplication composition is trivial. Then for any normal s-word u!s = (asb) and w = a¯ sb = u!s , where a, b ∈ X ∗ , we have u!s ≡ [asb]s¯ (mod S, w). Proof. For w = s¯ the lemma is clear. For w = s¯, since either u!s = (asb) = [a1 ](a2 sb) or u!s = (asb) = (asb1 )[b2 ], there are two cases to consider. Let  |a1 |, if (asb) = [a1 ](a2 sb); δ(asb) = s-length of (asb1 ), if (asb) = (asb1 )[b2 ]. The proof will be proceeding by induction on (w, δ(asb) ), where (w , m ) < (w, m) ⇔ w ≺ w or w = w and m < m (w, w ∈ TA and m, m ∈ N). Case 1. u!s = (asb) = [a1 ](a2 sb), where a1 > a2 s¯X b, a = a1 a2 and (a2 sb) is a normal s-word. In this case, (w, δ(asb) ) = (w, |a1 |). Since w = a¯ sb  = a1 a2 s¯b a2 s¯b, by induction, we may assume that (a2 sb) = [a2 sb]s¯ + αi βi [ci si di ]s¯i , where βi ci s¯i di ≺ a2 s¯b, a1 , a2 , ci , di ∈ X ∗ , si ∈ S, αi ∈ k and βi ∈ [Y ]. Thus,  u!s = (asb) = [a1 ][a2 sb]s¯ + αi βi [a1 ][ci si di ]s¯i . Consider the term [a1 ][ci si di ]s¯i . ¯i di ≺ If a1 > ci s¯X i di , then [a1 ][ci si di ]s¯i is a normal s-word with a1 ci s w. Note that βi a1 ci s¯i di ≺ w, then by induction, βi [a1 ][ci si di ]s¯i ≡ 0 (mod S, w). If a1 < ci s¯X i di , then [a1 ][ci si di ]s¯i = −[ci si di ]s¯i [a1 ] and [ci si di ]s¯i [a1 ] is a normal s-word with βi ci s¯i di a1 ≺ βi a2 s¯ba1 ≺ βi a1 a2 s¯b = w. Y If a1 = ci s¯X = 1, by i di , then there are two possibilities. For si Lemma 5.13 and by induction on w we have βi [a1 ][ci si di ]s¯i ≡ 0 (mod S, w). For si Y = 1, [a1 ][ci si di ]s¯i is the multiplication composition, then by assumption, it is trivial modulo (S, w). This shows that in any case, βi [a1 ][ci si di ]s¯i is a linear combination of normal s-words with leading words less than w, i.e., βi [a1 ][ci si di ]s¯i ≡ 0 (mod S, w) for all i. Therefore, we may assume that u!s = (asb) = [a1 ][a2 sb]s¯ and a1 > wX > a2 s¯X b. If either |a1 | = 1 or [a1 ] = [[a11 ][a12 ]] and a12 ≤ a2 s¯X b, then u!s = [a1 ][a2 sb]s¯ is already a normal s-word, i.e., u!s = [a1 ][a2 sb]s¯ = [a1 a2 sb]s¯ = [asb]s¯.

page 100

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

5.2. COMPOSITION-DIAMOND LEMMA FOR Liek[Y ] (X)

101

If [a1 ] = [[a11 ][a12 ]] and a12 > a2 s¯X b, then u!s = [a1 ][a2 sb]s¯ = [[a11 ][a12 ]][a2 sb]s¯ = [a11 ][[a12 ][a2 sb]s¯] + [[a11 ][a2 sb]s¯][a12 ]. Let us consider the second summand [[a11 ][a2 sb]s¯][a12 ]. Then by induction on w and by noting that [a11 ][a2 sb]s¯ is normal, we may assume that [a11 ][a2 sb]s¯ = αi βi [ci si di ]s¯i , where βi ci s¯i di a11 a2 s¯b, for si ∈ S, αi ∈ k, βi ∈ [Y ] and ci , di ∈ X ∗ . Thus,  αi βi [ci si di ]s¯i [a12 ], [[a11 ][a2 sb]s¯][a12 ] = where a11 > a12 > a2 s¯X b and w = a11 a12 a2 s¯b.  ¯i di a12 If a12 < ci s¯X i di , then [ci si di ]s¯i [a12 ] is normal with w = βi ci s βi a11 a2 s¯ba12 ≺ w. By induction, βi [ci si di ]s¯i [a12 ] ≡ 0 (mod S, w). If a12 > ci s¯X i di , then [ci si di ]s¯i [a12 ] = −[a12 ][ci si di ]s¯i and [a12 ][ci si di ]s¯i is normal with w = βi a12 ci s¯i di βi a12 a11 a2 s¯b ≺ w. Again we can apply induction. If a12 = ci s¯X i di , then as discussed above, it is either the case in Lemma 5.13 or the multiplication composition and each is trivial modulo (S, w). These show that [[a11 ][a2 sb]s¯][a12 ] ≡ 0 (mod S, w). Hence, u!s ≡ [a11 ][[a12 ][a2 sb]s¯] (mod S, w). where a11 > a12 > a2 s¯X b. Noting that [a11 ][[a12 ][a2 sb]s¯] is normal and now (w, δ[a11 ][[a12 ][a2 sb]s¯] ) = (w, |a11 |) < (w, |a1 |), the result follows by induction. Case 2. u!s = (asb) = (asb1 )[b2 ] where a¯ sX b1 > b2 , b = b1 b2 and (asb1 ) is a normal s-word. In this case, (w, δ(asb) ) = (w, m) where m is the s-length of (asb1 ). By induction on w, we may assume that  u!s = (asb) = [asb1 ]s¯[b2 ] + αi βi [ci si di ]s¯i [b2 ]. sb1 , si ∈ S, αi ∈ k, βi ∈ [Y ] and ci , di ∈ X ∗ . where βi ci s¯i di ≺ a¯ Consider the term βi [ci si di ]s¯i [b2 ] for each i. If b2 < ci s¯X i di , then [ci si di ]s¯i [b2 ] is a normal s-word with βi ci s¯i di b2 ≺ w. If b2 > ci s¯X i di , then [ci si di ]s¯i [b2 ] = −[b2 ][ci si di ]s¯i and [b2 ][ci si di ]s¯i is normal s-word with sb1 ≺ βi a¯ sb1 b2 = w. Finally, if b2 = ci s¯X βi b2 ci s¯i di ≺ βi b2 a¯ i di , then as above, by Lemma 5.13 and induction on w or by assumption, βi [ci si di ]s¯i [b2 ] ≡ 0 (mod S, w). These show that βi [ci si di ]s¯i [b2 ] ≡ 0 (mod S, w), for each i. Therefore, we may assume that u!s = (asb) = [asb1 ]s¯[b2 ], a, b ∈ X ∗ , where b = b1 b2 and a¯ sX b 1 > b 2 . Note that for [asb1 ]s¯ = s or [asb1 ]s¯ = [a1 ][a2 sb1 ]s¯ with a2 s¯X b1 ≤ b2 or [asb1 ]s¯ = [asb11 ]s¯[b12 ] with b12 ≤ b2 , u!s is already normal. Now we consider the remaining cases.

page 101

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

102

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 5. GROBNER-SHIRSHOV BASES FOR LIE ALGEBRAS

Case 2.1. Let [asb1 ]s¯ = [a1 ][a2 sb1 ]s¯ with a1 > a1 a2 s¯X b1 > a2 s¯X b1 > b2 . Then we have u!s = [[a1 ][a2 sb1 ]s¯][b2 ] = [[a1 ][b2 ]][a2 sb1 ]s¯ + [a1 ][[a2 sb1 ]s¯[b2 ]]. We consider the term [[a 1 ][b2 ]][a2 sb1 ]s¯. By noting that a1 > b2 , we may assume that [a1 ][b2 ] = ui a1 b2 αi [ui ] where αi ∈ k and ui ∈ ALSW(X). We will prove that [ui ][a2 sb1 ]s¯ ≡ 0 (mod S, w). If ui > a2 s¯X b1 , then [ui ][a2 sb1 ]s¯ is a normal s-word with w = ui a2 s¯b1 a1 b2 a2 s¯b1 ≺ w = a1 a2 s¯b1 b2 . Finally, if ui < a2 s¯X b1 , then [ui ][a2 sb1 ]s¯ = −[a2 sb1 ]s¯[ui ] and [a2 sb1 ]s¯[ui ] is a normal s-word with w = a2 s¯b1 ui a2 s¯b1 a1 b2 ≺ w, since a1 a2 s¯b1 is an ALSW. If ui = a2 s¯X b1 then, as above, by Lemma 5.13 and induction on w or by assumption, [ui ][a2 sb1 ]s¯ ≡ 0 (mod S, w). This shows that u!s ≡ [a1 ][[a2 sb1 ]s¯[b2 ]] (mod S, w). By noting that a1 > a2 s¯X b1 > b2 , the result now follows from the Case 1. Case 2.2. Let [asb1 ]s¯ = [asb11 ]s¯[b12 ] with a¯ sX b11 > a¯ sX b11 b12 > b12 > b2 . Then we have u!s = [[asb11 ]s¯[b12 ]][b2 ] = [[asb11 ]s¯[b2 ]][b12 ] + [asb11 ]s¯[[b12 ][b2 ]]. sb11 b2 < a¯ sb11 b12 , we may Let us first deal with [[asb11 ]s¯[b2 ]][b12 ]. Since a¯ apply induction on w and have that  αi βi [ci si di ]s¯i [b12 ], [[asb11 ]s¯[b2 ]][b12 ] = where βi ci s¯i di a¯ sb11 b2 and w = a¯ sb11 b12 b2 .  If b12 < ci s¯X d , then [c s d ] i i i s¯i [b12 ] is normal s-word with w = i i X βi ci s¯i di b12 ≺ w. If b12 > ci s¯i di , then [ci si di ]s¯i [b12 ] = −[b12 ][ci si di ]s¯i and [b12 ][ci si di ]s¯i is a normal s-word with w = βi b12 ci s¯i di βi b12 a¯ sb11 b2 ≺ a¯ sb11 b12 b2 = w. Finally, if b12 = ci s¯X d , then as above, by Lemma 5.13 i i and induction on w or by assumption, βi [ci si di ]s¯i [b12 ] ≡ 0 (mod S, w). These show that u!s ≡ [asb11 ]s¯[[b12 ][b2 ]] (mod S, w).  Let [b12 ][b2 ] = [b12 b2 ] + ui ≺a1 b2 αi [ui ] where αi ∈ k and ui ∈ ALSW(X). By noting that a¯ sX b11 > b12 b2 , we have [asb11 ]s¯[ui ] ≡ 0 (mod S, w) for all i. Therefore, u!s ≡ [asb11 ]s¯[b12 b2 ] (mod S, w). Note that [asb11 ]s¯[b12 b2 ] is normal so (w, δ[asb11 ]s¯[b12 b2 ] ) < (w, δ[asb1 ]s¯[b2 ] ), the result follows by induction. The proof is complete. 

page 102

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

5.2. COMPOSITION-DIAMOND LEMMA FOR Liek[Y ] (X)

103

Lemma 5.15. Let S be a k-monic subset of Liek[Y ] (X) in which each multiplication composition is trivial. Then every element of the k[Y ]-ideal generated by S can be written as a k[Y ]-linear combination of normal S-words. Proof. Note that for any h ∈ Id(S), h can be presented by a k[Y ]-linear combination of S-words of the form (u)s = [c1 ][c2 ] · · · [ck ]s[d1 ][d2 ] · · · [dl ]

(5.1)

with some placement of parentheses, where s ∈ S, cj , dj ∈ ALSW(X) and k, l ≥ 0. By Lemma 5.14 it suffices to prove that (5.1) is a linear combination of normal S-words. We will prove the result by induction on k + l. It is trivial when k + l = 0, i.e., (u)s = s. Suppose that the result holds for k + l = n. Now let us consider (u)s = [cn+1 ]([c1 ][c2 ] · · · [ck ]s[d1 ][d2 ] · · · [dn−k ]) = [cn+1 ](v)s . By the inductive hypothesis, we may assume without loss of generality sd ∈ TA that (v)s is a normal s-word, i.e., (v)s = v!s = (csd) where c¯ and c, d ∈ X ∗ . If cn+1 > c¯ sX d, then (u)s is normal. If cn+1 < c¯ sX d then sX d (u)s = − v!s [cn+1 ] where v!s [cn+1 ] is normal. Finally, if cn+1 = c¯ then by Lemma 5.14, (u)s = [cn+1 ](csd) ≡ [cn+1 ][csd]s¯. Now the result follows from the multiplication composition and Lemma 5.13.  Lemma 5.16. Let S be a k-monic subset of Liek[Y ] (X) in which each multiplication composition is trivial. Then for any normal S-word asb!s = [a1 ][a2 ] · · · [ak ] v!s [b1 ][b2 ] · · · [bl ] with some placement of parentheses, the three following S-words are linear combinations of normal S-words with the leading words less than a¯ sb: (i) w1 = asb!s |[ai ]→[c] where c ≺ ai ; (ii) w2 = asb!s |[bj ]→[d] where d ≺ bj ; (iii) w3 = asb!s |vs →v s where v  !s ≺ v!s . Proof. We first prove (iii). For k + l = 1, for example, asb!s = v!s [b1 ], it is easy to see that the result follows from Lemmas 5.15 and 5.13 since either v  !s [b1 ] or [b1 ] v  !s is normal or w3 is a multiplication composition. Now the result follows by induction on k + l. We now prove (i), and note that (ii) is similar. For k + l = 1, asb!s = [a1 ] v!s and then w1 = [c] v!s . Then either v!s [c] or [c] v!s is normal or w1 is equivalent to the multiplication composition with respect to w = X v!s v!s . Again by Lemmas 5.15 and 5.13, the result holds. For k + l ≥ 2, the result follows from (iii). 

page 103

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

104

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 5. GROBNER-SHIRSHOV BASES FOR LIE ALGEBRAS

Let s1 , s2 ∈ Liek[Y ] (X) be two k-monic polynomials in Liek[Y ] (X). If ∗ a¯ sX sX 1 b¯ 2 c ∈ ALSW(X) for some a, b, c ∈ X , then by Lemma 5.2, there X X exits a bracketing [a¯ s1 b¯ s2 c]s¯X such that [a¯ sX sX = a¯ sX sX sX 1 b¯ 2 c. 1 b¯ 2 c]s ¯X sX 1 ,¯ 2 1 ,¯ 2 Denote [as1 b¯ s2 c]s¯1 ,¯s2 = s¯Y2 [a¯ sX sX | sX , sX 1 b¯ 2 c]s¯X 1 ,¯ 2 [¯ 1 ]→s1 [a¯ s1 bs2 c]s¯1 ,¯s2 = s¯Y1 [a¯ sX sX | sX , sX 1 b¯ 2 c]s¯X 1 ,¯ 2 [¯ 2 ]→s2 [as1 bs2 c]s¯1 ,¯s2 = [a¯ sX sX | sX . sX sX 1 b¯ 2 c]s¯X 1 ,¯ 2 [¯ 1 ]→s1 ,[¯ 2 ]→s2 s2 c = Thus, the leading words of the above three polynomials are a¯ s1 b¯ X s¯Y1 s¯Y2 a¯ sX b¯ s c. 1 2 The following lemma is essential. Lemma 5.17. Let S be a Gr¨ obner–Shirshov basis in Liek[Y ] (X). For every s1 , s2 ∈ S, β1 , β2 ∈ [Y ] and a1 , a2 , b1 , b2 ∈ X ∗ such that w = β1 a1 s¯1 b1 = β2 a2 s¯2 b2 ∈ TA , we have β1 [a1 s1 b1 ]s¯1 ≡ β2 [a2 s2 b2 ]s¯2

(mod S, w).

Proof. Let L be the least common multiple of s¯Y1 and s¯Y2 . Then wY = β1 s¯Y1 = β2 s¯Y2 = Lt for some t ∈ [Y ], wX = a1 s¯X ¯X 1 b 1 = a2 s 2 b2 and   L L β1 [a1 s1 b1 ]s¯1 − β2 [a2 s2 b2 ]s¯2 = t Y [a1 s1 b1 ]s¯1 − Y [a2 s2 b2 ]s¯2 . s¯1 s¯2 X = a1 s¯X sX Consider the first case in which s¯X 2 is a subword of b1 , i.e., w 1 a¯ 2 b2 ∗ X X for some a ∈ X such that b1 = a¯ s2 b2 and a2 = a1 s¯1 a. Then

β1 [a1 s1 b1 ]s¯1 − β2 [a2 s2 b2 ]s¯2   L L X X = t Y [a1 s1 a¯ s2 b2 ]s¯1 − Y [a1 s¯1 as2 b2 ]s¯2 s¯1 s¯2 = tC3 s1 , s2 w , if L = s¯Y1 s¯Y2 , where w = LwX . Since S is a Gr¨obner–Shirshov basis, C3 s1 , s2  ≡ 0 (mod S, LwX ). The result follows from w = tLwX = tw . sY1 )[a1 s¯1 as2 b2 ]s¯1 ,¯s2 and Suppose that L = s¯Y1 s¯Y2 . By noting that (1/¯ Y s2 b2 ]s¯1 ,¯s2 are normal, by Lemma 5.14 we have (1/¯ s2 )[a1 s1 a¯ s2 b2 ]s¯1 ,¯s2 ≡ s¯Y2 [a1 s1 a¯ sX [a1 s1 a¯ 2 b2 ]s¯1

(mod S, w ),

[a1 s¯1 as2 b2 ]s¯1 ,¯s2 ≡ s¯Y1 [a1 s¯X 1 as2 b2 ]s¯2

(mod S, w ).

page 104

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

5.2. COMPOSITION-DIAMOND LEMMA FOR Liek[Y ] (X)

105

Thus, by Lemma 5.16, we have β1 [a1 s1 b1 ]s¯1 − β2 [a2 s2 b2 ]s¯2 ¯Y1 [a1 s¯X = t(¯ sY2 [a1 s1 a¯ sX 2 b2 ]s¯1 − s 1 as2 b2 ]s¯2 ) = t((¯ sY2 [a1 s1 a¯ sX s2 b2 ]s¯1 ,¯s2 ) + ([a1 s1 as2 b2 ]s¯1 ,¯s2 2 b2 ]s¯1 − [a1 s1 a¯ − [a1 s1 a¯ s2 b2 ]s¯1 ,¯s2 ) − ([a1 s1 as2 b2 ]s¯1 ,¯s2 − [a1 s¯1 as2 b2 ]s¯1 ,¯s2 ) ¯1 as2 b2 ]s¯1 ,¯s2 )) − (¯ sY1 [a1 s¯X 1 as2 b2 ]s¯2 − [a1 s = t((¯ sY1 [a1 s1 a¯ sX s2 b2 ]s¯1 ,¯s2 ) + [a1 (s1 − [¯ s1 ])as2 b2 ]s¯1 ,¯s2 2 b2 ]s¯1 − [a1 s1 a¯ − [a1 s1 a(s2 − [¯ s2 ])b2 ]s¯1 ,¯s2 − (¯ sY1 [a1 s¯X ¯1 as2 b2 ]s¯1 ,¯s2 )) 1 as2 b2 ]s¯2 − [a1 s ≡ 0 (mod S, w). ∗ Secondly, if s¯X ¯X ¯X sX 2 is a subword of s 1 , i.e., s 1 = a¯ 2 b for some a, b ∈ X ,  X s1 . Thus, by noting that then [a2 s2 b2 ]s¯2 = [a1 as2 bb1 ]s¯2 . Let w = L¯ [a1 [as2 b]s¯2 b1 ] is normal and by Lemmas 5.14 and 5.16,

β1 [a1 s1 b1 ]s¯1 − β2 [a2 s2 b2 ]s¯2   L L = t Y [a1 s1 b1 ]s¯1 − Y [a1 as2 bb1 ]s¯2 s¯ s¯2  1  L L = t Y [a1 s1 b1 ]s¯1 − Y [a1 s1 b1 ]s¯1 |s1 →[as2 b]s¯2 s¯1 s¯2   L − Y [a1 as2 bb1 ]s¯2 − [a1 s1 b1 ]s¯1 |s1 →[as2 b]s¯2 s¯2 !   " L L L = t a1 s1 − Y [as2 b]s¯2 b1 − Y ([a1 as2 bb1 ]s¯2 − [aX 1 [as2 b]s¯2 b1 ]) s¯Y1 s¯2 s¯2 L = t[a1 C1 s1 , s2 w b1 ] − Y ([a1 as2 bb1 ]s¯2 − [a1 [as2 b]s¯2 b1 ]) s¯2 ≡ 0 (mod S, w). ¯X One more case is possible: a proper suffix of s¯X 1 is a proper prefix of s 2 , X X ∗ i.e., s¯1 = ab and s¯2 = bc for some a, b, c ∈ X and b = 1. Then abc is an ALSW. Let w = Labc. Then by Lemmas 5.14 and 5.16, we have β1 [a1 s1 b1 ]s¯1 − β2 [a2 s2 b2 ]s¯2   L L = t Y [a1 s1 cb2 ]s¯1 − Y [a1 as2 b2 ]s¯2 s¯1 s¯2 L L = t Y ([a1 s1 cb2 ]s¯1 − [a1 [s1 c]s¯1 b2 ]) − t Y ([a1 as2 b2 ]s¯2 − [a1 [as2 ]s¯2 b2 ]) s¯1 s¯2 + t([a1 C2 s1 , s2 w b2 ] ≡0

(mod S, w).

page 105

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

106

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 5. GROBNER-SHIRSHOV BASES FOR LIE ALGEBRAS



The proof is complete.

Theorem 5.18 (CD-lemma for Lie algebras over a commutative algebra). Let S ⊂ Liek[Y ] (X) be a nonempty set of k-monic polynomials and Id(S) be the k[Y ]-ideal of Liek[Y ] (X) generated by S. Then the following statements are equivalent. (i) S is a Gr¨ obner–Shirshov basis in Liek[Y ] (X). (ii) f ∈ Id(S) ⇒ f¯ = βa¯ sb ∈ TA for some s ∈ S, β ∈ [Y ] and a, b ∈ X ∗. (iii) Irr(S) = {[u] | [u] ∈ TN , u = βa¯ sb, s ∈ S, β ∈ [Y ], a, b ∈ X ∗ } is a k-linear basis for Liek[Y ] (X | S) = Liek[Y ] (X)/ Id(S). Proof. (i) ⇒ (ii). Let S be a Gr¨obner–Shirshov basis  and 0 = f ∈ Id(S). Then, by Lemma 5.15, f has an expression f = αi βi [ai si bi ]s¯i , where αi ∈ k, βi ∈ [Y ], ai , bi ∈ X ∗ and si ∈ S. Denote wi = βi [ai si bi ]s¯i , for i = 1, 2, . . . . Then wi = βi ai s¯i bi . We may assume without loss of generality that w1 = w2 = · · · = wl wl+1 % wl+2 % · · · for some l ≥ 1. The claim of the theorem is obvious if l = 1. Now suppose that l > 1. Then β1 a1 s¯1 b1 = w1 = w2 = β2 a2 s¯2 b2 . By Lemma 5.17, α1 β1 [a1 s1 b1 ]s¯1 + α2 β2 [a2 s2 b2 ]s¯2 = (α1 + α2 )β1 [a1 s1 b1 ]s¯1 + α2 (β2 [a2 s2 b2 ]s¯2 − β1 [a1 s1 b1 ]s¯1 ) ≡ (α1 + α2 )β1 [a1 s1 b1 ]s¯1

(mod S, w1 ).

Therefore, if α1 + α2 = 0 or l > 2, the result follows by induction on l. For the case α1 + α2 = 0 and l = 2, we use the induction on w1 . Now the result follows. (ii) ⇒ (iii). For any f ∈ Liek[Y ] (X), we have   αi βi [ai si bi ]s¯i + αj [uj ], f= βi [ai si bi ]s¯ f¯

[uj ]f¯

i

where αi , αj ∈ k, βi ∈ [Y ], [uj ] ∈ Irr(S) and si ∈ S. Therefore, the set Irr(S) generates the algebra Liek[Y ] (X)/ Id(S).  On the other hand, suppose that h = αi [ui ] = 0 in the Lie algebra Liek[Y ] (X)/ Id(S), where αi ∈ k and [ui ] ∈ Irr(S). This means that h ∈ Id(S). Then all αi must be equal to zero. Otherwise, h = uj for some j which contradicts (ii). (iii) ⇒ (i) For every f, g ∈ S, we have   αi βi [ai si bi ]s¯i + αj [uj ]. Cτ (f, g)w = βi [ai si bi ]s¯ ≺w i

[uj ]≺w

page 106

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

5.3. EXAMPLES OF SHIRSHOV AND CARTIER

107

For τ = 1, 2, 3, 4, since Cτ (f, g)w ∈ Id(S) and by (iii), we have  Cτ (f, g)w = αi βi [ai si bi ]s¯i . βi [ai si bi ]s¯ ≺w i

Therefore, S is a Gr¨ obner–Shirshov basis.



5.3. Examples of Shirshov and Cartier In this section, all algebras (Lie or associative) are understood to be taken over an associative and commutative k-algebra K with identity and all associative algebras are assumed to have identity. Let L be an arbitrary Lie K-algebra which is presented by generators X and defining relations S, L = LieK (X | S). Let K have a presentation by generators Y and defining relations R, K = k[Y | R]. Let Y and X be deg-lex orderings on [Y ] and X ∗ respectively. Let RX = {f x | f ∈ R, x ∈ X} as above. Then L = Liek[Y |R] (X | S) ∼ = Liek[Y ] (X | S ∪ RX) as k[Y ]-algebras. The Poincar´e–Birkhoff–Witt theorem cannot be generalized to Lie algebras over an arbitrary ring. Following P.M. Cohn, a Lie K-algebra which can be embedded into an associative K-algebra is said to be special. The first non-special example was given by A.I. Shirshov ([284], see also [292]), and he also formulated that if no nonzero element of K annihilates an element of the center of LK , then a faithful representation always exists. Another classical non-special example was given by P. Cartier [89]. In the same paper, he proved that each Lie algebra over Dedekind domain is special. In both examples the Lie algebras are taken over commutative algebras over the Galois field GF(2). Shirshov and Cartier used ad hoc methods to prove that some elements of corresponding Lie algebras are not zero though they are zero in the universal enveloping algebras. P.M. Cohn [121] proved that any Lie algebra over K = k[Y | R], where char k = 0, is special. Also he claimed that he gave an example of non-special Lie algebra over a truncated polynomial algebra over a filed of characteristic p > 0. But he did not give a proof. Here we find Gr¨ obner–Shirshov bases of Shirshov’s and Cartier’s Lie algebras and then use Theorem 5.18 to get the results, we give also a proof for P.M. Cohn’s example of characteristics 2, 3, and 5. We present an algorithm that one can check for any p, whether Cohn’s conjecture is valid. Note that if L = LieK (X | S), then the universal enveloping algebra of L is UK (L) = KX | S (−)  where S (−) is just S but substituting all [u, v] by uv − vu.

page 107

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

¨ 5. GROBNER-SHIRSHOV BASES FOR LIE ALGEBRAS

108

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

textbook

Example 5.19 (Shirshov [284]). Let the field k = GF(2) and K = k[Y | R], where Y = {yi | i = 0, 1, 2, 3}, R = {y0 yi − yi | i = 0, 1, 2, 3} ∪ {yi yj | i, j = 1, 2, 3}. Let L = LieK (X | S1 , S2 ), where X = {xi , for 1 ≤ i ≤ 13}, S1 consists of the following relations [x2 , x1 ] − x11 , [x3 , x1 ] − x13 , [x3 , x2 ] − x12 , [x5 , x3 ] − x10 , [x6 , x2 ] − x10 , [x8 , x1 ] − x10 , [xi , xj ], for any other i > j, and S2 consists of the following relations y0 xi − xi for i = 1, 2, . . . , 13, x4 − y1 x1 , x5 − y2 x1 , x5 − y1 x2 , x6 − y3 x1 , x6 − y1 x3 , x7 − y2 x2 , x8 − y3 x2 , x8 − y2 x3 , x9 − y3 x3 , y3 x11 − x10 , y1 x12 − x10 , y2 x13 − x10 , y1 xk , for k = 4, 5, . . . , 11, 13, y2 xt for t = 4, 5, . . . , 12, y3 xl for l = 4, 5, . . . , 10, 12, 13. Then L is not special. Proof. Consider L = LieK (X | S1 , S2 ) as Liek[Y ] (X | S1 , S2 , RX). We order Y and X by yi > yj if i > j and xi > xj if i > j respectively. It is easy to see that for the ordering on [Y ]X ∗ as before, S = S1 ∪ S2 ∪ RX ∪ {y1 x2 − y2 x1 , y1 x3 − y3 x1 , y2 x3 − y3 x2 } is a Gr¨obner–Shirshov basis in Liek[Y ] (X). Since x10 ∈ Irr(S) and Irr(S) is a k-linear basis of L by Theorem 5.18, x10 = 0 in L. On the other hand, the universal enveloping algebra of L has a presentation: (−) (−) UK (L) = KX | S1 , S2  ∼ = k[Y ]X | S1 , S2 , RX, (−)

where S1 is just S1 but substituting all [uv] by uv − vu. The Gr¨ obner–Shirshov complement (c.f., Mikhalev–Zolotykh [239]) of (−) S1 ∪ S2 ∪ RX in k[Y ]X is (−)

S c = S1

∪ S2 ∪ RX ∪ {y1 x2 = y2 x1 , y1 x3 − y3 x1 , y2 x3 − y3 x2 , x10 }.

Thus, L is not special.



Example 5.20 (Cartier [89]). Let k = GF(2), K = k[y1 , y2 , y3 | yi2 = 0, for i = 1, 2, 3] and L = LieK (X|S), where X = {xij | 1 ≤ i ≤ j ≤ 3}

page 108

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

5.4. COHN CONJECTURE

109

and S = {[xii , xjj ] − xji (i > j), [xij , xkl ] (otherwise), y3 x33 − y2 x22 − y1 x11 }. Then L is not special. Proof. Let Y = {y1 , y2 , y3 }. Then L = LieK (X | S) ∼ = Liek[Y ] (X | S, yi2 xkl , for all i, k, l). Let yi > yj if i > j and xij > xkl if (i, j) >lex (k, l) respectively. It is easy to see that for the ordering on [Y ]X ∗ as before, we have that S  = S ∪ {yi2 xkl , for all i, k, l} ∪ S1 is a Gr¨obner–Shirshov basis in Liek[Y ] (X), where S1 consists of the following relations y3 x23 − y1 x12 , y3 x13 − y2 x12 , y2 x23 − y1 x13 , y3 y2 x22 − y3 y1 x11 , y3 y1 x12 , y3 y2 x12 , y3 y2 y1 x11 , y2 y1 x13 . The universal enveloping algebra of L has a presentation: UK (L) = KX|S (−)  ∼ = k[Y ]X|S (−) , yi2 xkl , for all i, k, l. In UK (L), we have 0 = y32 x233 = (y2 x22 + y1 x11 )2 = y22 x222 + y12 x211 + y2 y1 [x22 , x11 ] = y2 y1 x12 . On the other hand, since y2 y1 x12 ∈ Irr(S  ), y2 y1 x12 = 0 in L. Thus, L is not special.  5.4. Cohn conjecture Let K = k[y1 , y2 , y3 | yip = 0, for i = 1, 2, 3] be the algebra of truncated polynomials over a field k of characteristic p > 0. Let Lp = LieK (x1 , x2 , x3 | y3 x3 − y2 x2 − y1 x1 ). Then Lp is not special. We call Lp Cohn’s Lie algebra. Remark 5.21. In UK (Lp ), we have 0 = (y3 x3 )p = (y2 x2 )p + Λp (y2 x2 , y1 x1 ) + (y1 x1 )p = Λp (y2 x2 , y1 x1 ), where Λp is a Jacobson–Zassenhaus Lie polynomial. P.M. Cohn conjectured that Λp (y2 x2 , y1 x1 ) = 0 in Lp . Theorem 5.22. Cohn’s Lie algebras L2 , L3 and L5 are not special. Proof. Let Y = {y1 , y2 , y3 }, X = {x1 , x2 , x3 } and S = {y3 x3 − y2 x2 − y1 x1 , yip xj | 1 ≤ i, j ≤ 3}. Then Lp ∼ = = Liek[Y ] (X | S) and UK (Lp ) ∼ c k[Y ]X | S. Suppose that S is a Gr¨obner–Shirshov complement of S p in Liek[Y ] (X). Let SX ⊂ Lp be the set of all the elements of S c whose X-degrees do not exceed p.

page 109

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

110

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 5. GROBNER-SHIRSHOV BASES FOR LIE ALGEBRAS

Note that the Jacobson–Zassenhaus Lie polynomial Λp (y2 x2 , y1 x1 ) is of X-degree p. Then we have Λp (y2 x2 , y1 x1 ) ∈ Irr(S c ) if and only if p Λp (y2 x2 , y1 x1 ) ∈ Irr(SX ). Since the defining relation of Lp is homogep neous on X it follows that SX is a finite set. By Shirshov’s algorithm, one p can compute SX for Lp . First, we consider p = 2 and prove the element Λ2 = [y2 x2 , y1 x1 ] = 2 y2 y1 [x2 x1 ] = 0 in L2 . Then by Shirshov’s algorithm we have that SX consists of the following relations y3 x3 + y2 x2 + y1 x1 , yi2 xj (for 1 ≤ i, j ≤ 3), y3 y2 x2 + y3 y1 x1 , y3 y2 y1 x1 , y2 [x3 x2 ] + y1 [x3 x1 ], y3 y1 [x2 x1 ], y2 y1 [x3 x1 ]. Thus, Λ2 is in the k-linear basis Irr(S c ) of L2 . Now, by the above remark, L2 is not special. Secondly, we consider p = 3 and prove the element Λ3 = y22 y1 [x2 x2 x1 ]+ 2 3 y2 y1 [x2 x1 x1 ] = 0 in L3 . Then again by Shirshov’s algorithm, SX consists of the following relations y3 x3 − y2 x2 − y1 x1 , yi3 xj (for 1 ≤ i, j ≤ 3), y32 y2 x2 − y32 y1 x1 , y32 y22 y1 x1 , y2 [x3 x2 ] + y1 [x3 x1 ], y32 y1 [x2 x1 ], y22 y1 [x3 x1 ], y3 y22 [x2 x2 x1 ] − y3 y2 y1 [x2 x1 x1 ], y3 y22 y1 [x2 x1 x1 ], y3 y2 y1 [x2 x2 x1 ] − y3 y12 [x2 x1 x1 ]. Thus, y22 y1 [x2 x2 x1 ], y2 y12 [x2 x1 x1 ] ∈ Irr(S c ), which implies Λ3 = 0 in L3 . 5 Thirdly, let p = 5. Again by Shirshov’s algorithm, SX consists of the following relations 1) y3 x3 − y2 x2 − y1 x1 , 2) yi5 xj , 1 ≤ i, j ≤ 3, 3) y34 y2 x2 + y34 y1 x1 , 4) y34 y24 y1 x1 , 5) y2 [x3 x2 ] + y1 [x3 x1 ], 6) y34 y1 [x2 x1 ], 7) y24 y1 [x3 x1 ], 8) y33 y22 [x2 x2 x1 ] − y33 y2 y1 [x2 x1 x1 ], 9) y33 y24 y1 [x2 x1 x1 ], 10) y33 y2 y1 [x2 x2 x1 ] − y33 y12 [x2 x1 x1 ],

page 110

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

5.4. COHN CONJECTURE

111

11) y1 [x3 x2 x3 x1 ], 12) y1 [x3 x1 x2 x1 ], 13) y1 [x3 x2 x2 x1 ] + y1 [x3 x2 x1 x2 ], 14) y2 [x3 x1 x2 x1 ], 15) y32 y23 [x2 x2 x2 x1 ] − 2y32 y22 y1 [x2 x2 x1 x1 ] + y32 y2 y12 [x2 x1 x1 x1 ], 16) y33 y23 y12 [x2 x1 x1 x1 ], 17) y32 y22 y1 [x2 x2 x2 x1 ] − 2y32 y2 y12 [x2 x2 x1 x1 ] + y32 y13 [x2 x1 x1 x1 ], 18) y32 y24 y12 [x2 x1 x1 x1 ], 1 19) y32 y24 y1 [x2 x2 x1 x1 ] − y32 y23 y12 [x2 x1 x1 x1 ], 2 20) y33 y12 [x2 x2 x1 x2 x1 ], 21) y33 y2 y1 [x2 x1 x2 x1 x1 ], 22) y33 y12 [x2 x1 x2 x1 x1 ], 23) y33 y22 [x2 x1 x2 x1 x1 ], 24) y32 y22 y1 [x2 x2 x1 x2 x1 ] + y32 y2 y12 [x2 x1 x2 x1 x1 ], 25) y32 y2 y12 [x2 x2 x1 x2 x1 ] + y32 y13 [x2 x1 x2 x1 x1 ], 26) y32 y24 y12 [x2 x1 x2 x1 x1 ], 27) y3 y24 [x2 x2 x2 x2 x1 ] − 3y3 y23 y1 [x2 x2 x2 x1 x1 ] + y3 y23 y1 [x2 x2 x1 x2 x1 ] + 3y3 y22 y12 [x2 x2 x1 x1 x1 ] + 2y3 y22 y12 [x2 x1 x2 x1 x1 ] − y3 y2 y13 [x2 x1 x1 x1 x1 ], 28) y3 y23 y1 [x2 x2 x2 x2 x1 ] − 3y3 y22 y12 [x2 x2 x2 x1 x1 ] + y3 y22 y12 [x2 x2 x1 x2 x1 ] + 3y3 y2 y13 [x2 x2 x1 x1 x1 ] + 2y3 y2 y13 [x2 x1 x2 x1 x1 ] − y3 y14 [x2 x1 x1 x1 x1 ], 29) y3 y24 y13 [x2 x1 x1 x1 x1 ], 30) y32 y23 y13 [x2 x1 x1 x1 x1 ], 2 1 31) y3 y24 y12 [x2 x2 x1 x1 x1 ] + y3 y24 y12 [x2 x1 x2 x1 x1 ] + y3 y23 y13 [x2 x1 x1 x1 x1 ], 3 3 1 4 4 32) y3 y2 y1 [x2 x2 x2 x1 x1 ] − y3 y2 y1 [x2 x2 x1 x2 x1 ] + y3 y23 y12 [x2 x2 x1 x1 x1 ] 3 1 2 3 2 − y3 y2 y1 [x2 x1 x2 x1 x1 ] + y3 y22 y13 [x2 x1 x1 x1 x1 ], 3 3 33) y23 y12 [x3 x3 x1 x3 x1 ], 34) y23 y12 [x3 x1 x3 x1 x1 ],

page 111

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

112

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 5. GROBNER-SHIRSHOV BASES FOR LIE ALGEBRAS

35) y33 y22 y13 [x2 x1 x1 x1 x1 ], 2 2 36) y32 y23 y12 [x2 x2 x1 x1 x1 ] + y32 y23 y12 [x2 x1 x2 x1 x1 ] − y32 y22 y13 [x2 x1 x1 x1 x1 ]. 3 3

0 Thus, Λ5 (y2 x2 , y1 x1 ) = y24 y1 [x2 x2 x2 x2 x1 ] ∈ Irr(S c ), which implies Λ5 =  in L5 . 5.5. Other applications Now we give some examples which are special Lie algebras. Lemma 5.23. Suppose that f and g are two polynomials in Liek[Y ] (X) such that f is k[Y ]-monic and g = rx, where r ∈ k[Y ] and x ∈ X, is k-monic. Then each inclusion composition of f and g is trivial modulo {f } ∪ rX. Proof. Suppose that f¯ = [axb] for some a, b ∈ X ∗ , f = f¯ + f  and g = r¯x + r x. Then w = r¯axb and C1 f, gw = r¯f − [a[rx]b]r¯x = r¯f  − r [axb] = rf  − r f ≡ 0 (mod {f } ∪ rX, w).  Theorem 5.24. For an arbitrary commutative k-algebra K = k[Y | R], if S is a Gr¨ obner–Shirshov basis in Liek[Y ] (X) such that for any s ∈ S, s is k[Y ]-monic, then L = LieK (X | S) is special. Proof. Assume without loss of generality that R is a Gr¨obner basis in k[Y ]. Note that L ∼ = Liek[Y ] (X | S ∪ RX). By Lemma 5.23, S ∪ RX is a Gr¨ obner–Shirshov basis in Liek[Y ] (X). On the other hand, in UK (L) ∼ = k[Y ]X | S (−) ∪ RX, S (−) ∪ RX is a Gr¨ obner–Shirshov basis in k[Y ]X. Thus for any u ∈ Irr(S ∪ RX) in Liek[Y ] (X), we have u ¯ ∈ Irr(S (−) ∪ RX) in k[Y ]X. This completes the proof.  Corollary 5.25. Any Lie K-algebra L = LieK (X | f ) with one monic defining relation f = 0 is special. Proof. Let K = k[Y | R], where R is a Gr¨obner basis in k[Y ]. We can regard f as a k[Y ]-monic element in Liek[Y ] (X). Note that any subset of Liek[Y ] (X) consisting of a single k[Y ]-monic element is a Gr¨obner–Shirshov basis. Thus by Theorem 5.24, L = LieK (X | f ) ∼ = Liek[Y ] (X | {f } ∪ RX) is special. 

page 112

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

5.5. OTHER APPLICATIONS

113

Corollary 5.26. If L is a free K-module, then L is special.  l αij xl , Proof. Let X = {xi | i ∈ I} be a K-basis of L and [xi xj ] =  where αlij ∈ K and i, j ∈ I. Then L = LieK (X|[xi xj ] − αlij xl , for i > j and i, j ∈ I). Suppose that  K = k[Y | R], where R is a Gr¨obner basis in k[Y ]. Since S = {[xi xj ] − αlij xl , for i > j and i, j ∈ I} is a k[Y ]-monic Gr¨ obner–Shirshov basis in Liek[Y ] (X), by Theorem 5.24, L = LieK (X | S) ∼  = Liek[Y ] (X | S ∪ RX) is special. Now we give some other applications. Theorem 5.27. Suppose that S is a finite homogeneous subset of Liek (X). Then the word problem of LieK (X | S) is solvable for any finitely generated commutative k-algebra K. Proof. Let S c be a Gr¨ obner–Shirshov complement of S in Liek (X). Clearly, S c consists of homogeneous elements in Liek (X) since the compositions of homogeneous elements are homogeneous. Since K is finitely generated commutative k-algebra, we may assume that K = k[Y | R] with R a finite obner– Gr¨ obner–Shirshov basis in k[Y ]. By Lemma 5.23, S c ∪ RX is a Gr¨ Shirshov basis in Liek[Y ] (X). For a given f ∈ LieK (X), it is obvious that after a finite number of steps one can write down all the elements of S c whose X-degrees do not exceed the degree of f¯X . Denote the set of such elements by Sf¯X . Then Sf¯X is a finite set. The result follows from Theorem 5.18,  Theorem 5.28. Every finitely or countably generated Lie K-algebra can be embedded into a two-generated Lie K-algebra, where K is an arbitrary commutative k-algebra. Proof. Let K = k[Y | R] and L = LieK (X | S) where X = {xi | i ∈ I} and I is a subset of the set of natural numbers. Without loss of generality, we may assume that with the ordering on [Y ]X ∗ as before, S ∪ RX is a Gr¨ obner–Shirshov basis in Liek[Y ] (X). Consider the algebra L = Liek[Y ] (X, a, b | S  ) where S  = S ∪ RX ∪ R{a, b} ∪ {[aabi ab] − xi | i ∈ I}. Clearly, L is a Lie K-algebra generated by a, b. Thus, in order to prove the theorem, by using our Theorem 5.18, it suffices to show that with the ordering on [Y ](X ∪ {a, b})∗ as before, where a b xi , for xi ∈ X, S  is a Gr¨ obner–Shirshov basis in Liek[Y ] (X, a, b). It is clear that all the possible compositions of multiplication, intersection and inclusion are trivial. We only check the external compositions of

page 113

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

114

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 5. GROBNER-SHIRSHOV BASES FOR LIE ALGEBRAS

some f ∈ S and ra ∈ Ra: let w = Lu1 f¯X u2 au3 where L = lcm(f¯Y , r¯) and u1 f¯X u2 au3 ∈ ALSW(X, a, b). Then L L C3 f, raw = ¯Y [u1 f u2 au3 ]f¯ − [u1 f¯X u2 (ra)u3 ] r¯ f1   L L X ¯ = ¯Y [u1 f u2 au3 ]f¯ − r [u1 f u2 au3 ]f¯X r¯ f1   L L X X ¯ ¯ − [u1 f u2 (ra)u3 ] − r [u1 f u2 au3 ]f¯X r¯ r¯   L L ¯X = [u1 ( ¯Y f )u2 au3 ]f¯ − [u1 (r f )u2 au3 ]f¯X r¯ f1 L − r ([u1 f¯X u2 au3 ] − [u1 f¯X u2 au3 ]f¯X ) r¯ ≡ [u1 C3 f, rxw u2 au3 ] (mod S  , w) for some x occurring in f¯X and w = Lf¯X . Since S ∪ RX is a Gr¨obner– Shirshov basis in Liek[Y ] (X) we have C3 f, rxw ≡ 0 (mod S ∪ RX, w ). Thus, by Lemma 5.16, [u1 C3 f, rxw u2 au3 ] ≡ 0 (mod S  , w). 

page 114

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

CHAPTER 6

Decision problems for groups We have already seen in Section 4.7 an example of algorithmic unsolvability in connection with the word problem for Lie Algebras. In this Chapter we will cover several of the, by now, classical results for finitely presented groups mostly in terms of Turing degrees (these are discussed in Section 6.2.1.6). The proofs rely heavily on the notion of a group with a standard basis introduced by Bokut’ [24, 25] (see also [34]). It was shown later by Kalorkoti [178] that in most situations a standard basis is in fact a GS basis. Our aim is to make this chapter fairly self-contained and we therefore discuss key notions such as computability as well as key group theoretic tools. 6.1. Decision problems Informally speaking a decision problem is a set of questions each of which has a ‘yes’ or ‘no’ answer. More formally we specify a decision problem by describing a generic instance and the question we wish to answer. For example, the word problem for a finitely presented group G =  X | R  is: Instance: A word w in the free group F (X) generated by X. Question: Is w = 1 in G, i.e., is w ∈ RF (X) the normal closure of R in F (X)? This level of formality is sufficient for most situations though care must be exercised, e.g., what would it mean computationally if the generators of G were indexed by an uncountable set? In order to formulate a theory of decision problems and comparisons (or more appropriately reductions) between them we need to express them in a common format. For recursion theory, problems are normally encoded using natural numbers so that a decision problem is simply a set W ⊆ N and the problem is to decide if any candidate natural number is in W . In other areas it is more convenient to use some finite alphabet Σ and encode problems as strings over Σ. Whichever system is used there are many ways to encode problems. We simply choose one method with the caveat that the encoding must be effective, i.e., there 115

page 115

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

116

9287 – Gr¨ obner-Shirshov Bases

textbook

6. DECISION PROBLEMS FOR GROUPS

is an algorithm which when given an object can find its encoding (and vice versa). The formalization of these ideas belongs to the theory of computability. Section 6.2 discusses those aspects of the theory that are most relevant to this Chapter 6.2. Computability We begin with a general notion of machines so that we can introduce some definitions that are common to the models we consider in this chapter. We also record some basic definitions in order to avoid ambiguity. A (deterministic) machine M consists of (1) a countable set of configurations (together with a numbering), (2) a recursive subset of configurations called the terminal configurations, (3) a function from non-terminal configurations to configurations which defines the machine’s basic moves and is written C → C  . We have not yet defined ‘recursive’ however it simply means ‘effective’ or ‘algorithmic’. Cohen [119] requires that the move function be recursive. This is the case for most situations but it would rule out the notion of computing with an oracle discussed in Section 6.2.1.5. Note however that ‘computable’ without further qualification means computable without an oracle. These notions will be made precise in Section 6.2.1. All the machines we consider will be deterministic, i.e., if C → C  and C → C  then C  = C  . This is implied by the fact that → is a function rather than a relation. If there is a sequence C = C0 → C1 → · · · → Cn = C  for some n ≥ 0 we write C →∗ C  . The computation of a machine starting with a configuration C is the sequence of configurations C0 , C1 , C2 , . . . , where C0 = C and Ci → Ci+1 for each i. A computation may be finite (terminating) or infinite (non-terminating). A set gives rise to an associated decision problem, i.e., the question of deciding if a given object (usually drawn from some associated type) is a member of the set. For a machine M we define the following problems (where in each case C denotes a configuration). (1) The halting problem for M : H(M ) = {C | M halts from the configuration C}. (2) The special halting problem for M : H0 (M ) = {C | C →∗ C0 } where C0 is a distinguished terminal configuration of M .

page 116

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.2. COMPUTABILITY

117

(3) The confluence problem for M : K(M ) = {(C, C  ) | C →∗ C  and C  →∗ C  for some C  }. (4) The loop problem for M : L(M ) = {C | C →∗ C  and C  →+ C  , for some C  }, where →+ denotes one or more moves. We say that M halts from C if C ∈ H(M ) and it loops from C if C ∈ L(M ). We also say that C, C  conflow if (C, C  ) ∈ K(M ). The halting problem can be widened to a class of machines so that we are given both a machine and a configuration and ask if the given machine halts on the given configuration. A partial function f : S → T is simply a function whose domain is a subset of S. Such a function f is said to be total if its domain is S. Thus the move function of a machine is a partial function on the set of configurations (if there are no terminal configurations then it is a total function). These distinctions are essential for developing the theory of computability and will apply only for this discussion; elsewhere the notation f : S → T will denote a function in the usual sense, i.e., a total function. For a subset A of S we define its characteristic function χA in the usual way, i.e., χA : S → {0, 1} with  1 if a ∈ A, χA (a) = 0 if a ∈ A. We define the partial characteristic function χAp as above but leave it undefined for the case a ∈ A. We now recapitulate briefly some standard definitions related to alphabets and strings for the convenience of the reader. Let Σ be a non-empty set. The elements of Σ are often referred to as letters and Σ as an alphabet. A string over Σ is a finite sequence of symbols from Σ, the set of all such strings is denoted by Σ∗ . Formally a string is a map σ : {1, 2, . . . , n} → Σ for some n ∈ N. If n = 0 then the string is empty, in any case the length is n and we use (σ) to denote the length of a string σ (an alternative notation is |σ|). In many areas, such as computer science, the empty string is denoted by . However when strings are the elements of an algebraic structure, such as a group, it is more customary to call them words and denote the empty string by 1 (we will use whichever terminology is customary without further comment; “string” in discussing computability and “word” elsewhere). Put σ(i) = xi , for 1 ≤ i ≤ n. It is convenient to denote the string σ by x1 , x2 , . . . , xn (assuming that “,” is not a member of Σ) or simply by x1 x2 · · · xn if, as will always be the case for us, the adjacency of symbols causes no confusion. Two strings u = x1 x2 . . . xm and v = y1 y2 . . . yn are

page 117

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

118

9287 – Gr¨ obner-Shirshov Bases

textbook

6. DECISION PROBLEMS FOR GROUPS

defined to be equal if m = n and xi = yi in X, for 1 ≤ i ≤ n. We define uv to be x1 x2 . . . xm y1 y2 . . . yn . Clearly this is an associative operation and  acts as the identity. Indeed Σ∗ is a monoid under this operation. For example if Σ = {a, b} then Σ∗ = {1, a, b, aa, ab, ba, bb, . . .}. 6.2.1. Turing machines. 6.2.1.1. The Church–Turing Thesis. A significant amount of effort was expended in the first third or so of the twentieth century on the problem of characterizing the intuitive notion of “effective procedure.” Church [116] proposed λ-definability as a definition of effectively calculable functions from natural numbers to natural numbers1. G¨odel, in discussions with Church, was not convinced. However when Turing [299] published his analysis of what it means for a person to compute a function2 together with a notion of appropriate machines G¨ odel was convinced. Subsequently it was shown that Turing’s and Church’s proposals were equivalent. The claim that the intuitive notion of computable function is captured by these proposals is now referred to as the Church–Turing Thesis (some authors attach only one of the two names). Clearly this cannot be proved mathematically, of course it could be disproved by a counter example. To date no such counter example has been found. 6.2.1.2. Definition of Turing machines. In this section we introduce one variant of Turing’s formal model of computing; there are many possible. Informally, a Turing machine consists of a tape, a head which “scans” the tape, and a finite control. The tape is a linear sequence of squares, each of which contains one of a finite number of distinct symbols. The tape is unbounded to the left and right. There is a special symbol called the blank symbol ; at any instant during the computation of a Turing machine, all but a finite number of tape squares contain the blank symbol. (So the unbounded tape represents a potential rather than actual infinity.) At any time instant, the head scans (or is positioned over) a particular tape square. The Turing machine starts with some input written on the squares of the tape, and with the head positioned over the leftmost input symbol. The machine then proceeds in a sequence of moves or transitions. A single move of the Turing machine is determined by the symbol s scanned by the tape head, and the current state q of the finite control. For some pairs (q, s) no move is defined; in this case the machine halts. If a move is defined for the pair (q, s), the machine undergoes three changes: (1) the finite control assumes a (possibly) new state; 1Church proposed this for total functions; it was later extended to partial functions

by Kleene [193] 2In fact Turing focused on real numbers but there is no essential difference between this and an analysis for functions from strings to strings.

page 118

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.2. COMPUTABILITY

119

(2) the head prints a symbol on the scanned tape square, overwriting whatever was previously there (the new symbol might be the same as the old one); (3) the head is moved one square left or right. More formally, a Turing machine is described by a sextuple M = (Q, Γ, Σ, ¯b , qI , δ), where Q

is a finite set of states,

Γ

is a finite tape alphabet,

Σ⊂Γ ¯b ∈ Γ − Σ qI ∈ Q

is the input alphabet, is the blank symbol, is the initial state,

and δ is the transition function, which is a partial function mapping Q×Γ to Q × Γ × {L, R}. (L and R can be interpreted as left and right, respectively.) The disposition of the machine M at any instant is completely described by the sequence αqβ, where q ∈ Q is the current state of M , α is the infinite sequence of tape symbols strictly to the left hand end of the scanned symbol, and β is the infinite sequence of tape symbols starting at the scanned symbol and reading to the right. The sequence αqβ is the configuration of M . Normally we suppress leading or trailing blanks3 so that α, β are finite strings. Moreover when the state cannot be confused with a tape symbol we do not underline it. We now define a move of the machine M . Let C = . . . b2 b1 b0 qac0 c1 c2 . . . be a configuration of M . If the (partial) function δ is not defined on the pair (q, a) then the machine halts. If δ(q, a) = (q  , a , D), then the new configuration C  of M after a single move is defined by  . . . b2 b1 b0 a q  c0 c1 c2 . . . , if D = R,  C = . . . b2 b1 q  b0 a c0 c1 c2 . . . , if D = L. The computation of M on input x is the computation starting with C = qI x. We normally define the function δ by using a short hand notation consisting of quintuples: we use (q, xi , q  , y, D) to denote the fact that δ(q, xi ) = (q  , y, D). Finally note that if q is a state for which there is no quintuple starting with q then the machine halts if it enters q; such 3If the scanned square is blank and all squares to the right are blank we use αq¯ b rather than αq to denote the configuration.

page 119

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

120

9287 – Gr¨ obner-Shirshov Bases

textbook

6. DECISION PROBLEMS FOR GROUPS

states are called halting states (note however that a machine can halt in other states). We can associate a partial function f : Σ∗ → Σ∗ with any Turing machine T in various ways, all of which are equivalent. One simple definition is: given x ∈ Σ∗ we start the machine as usual with the initial configuration qI x. If the machine does not halt then f (x) is undefined. Otherwise let αqβ be the final configuration. Put β = σsγ where σ ∈ Σ∗ and s ∈ Σ∗ (s exists since ¯b ∈ Σ and at the start of the computation all but finitely many squares are blank). We define f (x) to be σ. A function is said to be recursive (or computable) if it is computed by some Turing machine4. A subset A of Σ∗ is said to be recursive if its characteristic function χA is recursive. Such a set is said to be recursively enumerable (abbreviated to r.e.) if its partial characteristic function χAp is recursive. At the intuitive level, the difference between a recursive set and an r.e. one is that for the first case there is an appropriate Turing machine that always delivers a verdict on the question ‘is a ∈ A?’ while in the second case we only have this guarantee if it so happens that a is indeed a member of A. Clearly every recursive set is r.e. but the converse is not true as we show below in Section 6.2.1.4. An alternative way of viewing r.e. sets is in terms of generating them one element at a time. A set is r.e. if it is either empty or there is a computable function f with domain N such that the elements of the set are f (0), f (1), f (2), . . . (an element might be repeated infinitely many times in this sequence). As an example, let G =  X | R  be a group with X and R r.e. (we discuss group presentations in Section 6.3). Let N be the normal closure of R in the free group F (X) generated by X. Thus N consists of all the words that are equal to the identity in G. Every element of N is of the form n # wi−1 rii wi i=1

where n ∈ N, wi ∈ F (X), ri ∈ R and i = ±1 for 1 ≤ i ≤ n. We can list (algorithmically) the members of N as follows. First we deal with the special case that R = ∅. If this is so then N = {1} and clearly this is r.e. So assume that R = ∅. It is easy to see that F (X) is r.e. (it is recursive if X is recursive), so let φ be a recursive function that lists it. Now we show one way to list all elements of form w−1 r±1 w for w ∈ F (X) 4This definition involves an existential quantifier that is itself non-constructive. That is to say, the problem of showing that a function is computable is not the same as the problem of finding an algorithm for it. It is worth comparing this with fixed point theorems: given the right conditions we know there is a fixed point but might not have any method of determining it. Kuznetsov’s [203] proof that recursively presented simple groups have solvable word problem has such a non-constructive feature.

page 120

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.2. COMPUTABILITY

121

and r ∈ R. Since R is r.e. it can be listed by a recursive function ρ. Given n we define the numbers , f , t as follows. If n is even set  = 1, n + 2 = 2f t otherwise  = −1, n + 1 = 2f t where 2 | t in each case. Define ρ(n) = φ(f − 1)−1 ρ(t) φ(f − 1). Now we define our function ν for listing N . Put ν(0) = 1. For an argument n > 0 put n = pe11 pe22 · · · pemm where p1 , p2 , . . . , pm are the prime factors of n in increasing order. We define ν(n) = ρ(e1 − 1)ρ(e2 − 1) · · · ρ(em − 1). It is easy to see that the function ν is computable and lists precisely the elements of N . However it does this in a fairly erratic way. We might ask if there is a function that lists the elements by strictly increasing length when they are freely reduced (we assume that R = ∅ so that N is infinite). This can be done if and only if N is recursive, i.e., G has solvable word problem. We will see in Section 6.4 that this is not always the case. Finally we discuss one subtlety in order to avoid possible misunderstanding. Suppose, for example, that a machine is in state q scanning a single symbol x with all other squares blank. It then moves this symbol to the right by one or more squares and again enters state q scanning the symbol x (with all other squares blank). The configurations at the start and end of this process are the same. In other words our tape does not have an origin with respect to which we can measure the position of symbols. Thus the notion of a tape is only a convenient metaphor in this treatment. In certain situations it is useful to have a tape with origin but we will not need this here. 6.2.1.3. Directed state Turing machines. A directed state Turing machine is one for which the state being entered (i.e., the state in the third position of the quintuple causing the move) determines the direction. If the direction for a state is L we call it an L-state otherwise an R-state. Such machines will help us to avoid complications when considering modular machines in Section 6.2.2. It is easy to see that every Turing machine can be replaced with an equivalent directed state one. Note that a state can be both an L-state and an R-state if it cannot be entered by any transition of T . Of course such states can be removed, alternatively the ambiguity can be avoided by regarding all such sates as L-states and we will assume that this is the case. 6.2.1.4. The halting problem. Informally, the halting problem for Turing machines is: Instance: A Turing machine T and a configuration C of T . Question: Does the computation of T starting with C terminate? The problem is decidable if and only if there is a Turing machine that delivers the correct verdict. As stated there is a difficulty because we have allowed the states and alphabets of Turing machines to be arbitrary but

page 121

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

122

9287 – Gr¨ obner-Shirshov Bases

textbook

6. DECISION PROBLEMS FOR GROUPS

of course fixed for any one machine. Thus there can be no single machine that can accept arbitrary configurations and Turing machines as input on which to deliver a verdict. However this problem is easily dealt with: there is no loss in computational power if we fix the alphabets. We can then encode any Turing machine and configuration for it in the agreed alphabet. We will use T  to denote the encoding of a Turing machine and (x, y) to denote the encoding of a pair of strings (just in this sub-section). We allow for the possibility that a constituent string is already over the encoding alphabet; thus (x, T ) denotes the coding of x and T where x is a string over the input alphabet of T . A clear requirement of the encoding scheme is that it must be effective, i.e., it can be computed by a Turing machine. Moreover given a string over the encoding alphabet we can decide if it is a coding and, if so, decode it. Suppose that there is a machine that solves the halting problem, then it can solve the following version: given an input string x for a machine T does T halt when started with input x? Such a machine computes the characteristic function of the set H = {(x, T ) | T halts on input x}. Denote the assumed machine by MH and note that MH will always halt by assumption. If a string is in H we say that MH accepts the input otherwise  it rejects it. We can now produce a new machine MH with the following behavior:  halts on input x ⇐⇒ MH rejects input x. MH

To obtain the machine we simply compute as MH does. If it rejects we do nothing while if it accepts we enter a new state that, e.g., moves right for ever. We now build a machine Mdiag with the behavior Mdiag halts on input x ⇐⇒ MH rejects input (x, x).  To obtain Mdiag we make a simple modification of the machine MH : when presented with input x it first duplicates it to produce (x, x) and then computes as before. However we now have

Mdiag halts on input Mdiag  ⇐⇒ MH rejects input (Mdiag , Mdiag ) ⇐⇒ Mdiag does not halt on input Mdiag  which is a contradiction. It follows that MH does not exist and H is not a recursive set. One of Turing’s fundamental insights was the existence of universal machines. Such a machine U is given as input (in an agreed format) the description of any Turing machine T together with input x for it. The machine U then proceeds to simulate the computation of T on x step by

page 122

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.2. COMPUTABILITY

123

step (the machine U will take a large number of steps to simulate each single step of T ). Note that the set H is r.e. for if (x, M ) ∈ H a (slightly modified) universal Turing machine can detect this and accept, while if (x, M ) ∈ H the simulating machine runs for ever and thus does not accept. Strictly speaking we should also deal with strings that are not in the format (x, M ); by the requirements of an encoding scheme such strings form a recursive set so they can be detected and rejected. Finally, there is a single machine T for which the halting problem is unsolvable, i.e., there is no algorithm to decide if T halts on a given input x (and hence arbitrary configuration). For this we just take any universal Turing machine. 6.2.1.5. Computing with oracles. It is common in mathematics and other areas to reduce one problem to another, i.e., a solution of the second problem will yield a solution to the first. This idea can be formalized in the realm of computation by the use of oracle Turing machines. We deal here with decision problems encoded as sets. Let S be a set of words over some finite alphabet Σ, then an S-oracle Turing machine is a standard Turing machine with input alphabet Σ but equipped with a special query state q? as well as answer states qY and qN . When this state is entered the machine moves in one step into one of two special sates: if the string on the tape from the scanned square up to the first symbol not in Σ is in S then the new state is qY otherwise it is qN . In all other respects computation proceeds as normal. In terms of pure computability, oracle machines are only of interest when the oracle set is not recursive. Otherwise we can amend our oracle machine with one that decides membership of the oracle set and obtain a standard Turing machine. The situation is different if we are measuring computational resource such as time. We will not be concerned with this aspect in this book so we do not go into details; they can be found in many books, e.g., see Papadimitriou [251] or B¨orger [77]. 6.2.1.6. Turing degrees. A set A is said to be Turing reducible to a set B, denoted by A ≤T B, if membership of A can be decided by a Turing machine equipped with an oracle for B. It is easily seen that ≤T is a partial order. Moreover if A ≤T B and B is r.e. (respectively, recursive) then A is also r.e. (respectively, recursive). We define an equivalence relation ≡T by A ≡T B if and only if A ≤T B and B ≤T A. The equivalence classes under this relation are called degrees of unsolvability; in developing the theory, subsets of N are used though this is not essential. Degrees are normally indicated with bold face lower case Roman letters or numerals. By convention 0 denotes the degree of all recursive sets while 1 denotes the degree corresponding to the halting problem.

page 123

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

124

9287 – Gr¨ obner-Shirshov Bases

textbook

6. DECISION PROBLEMS FOR GROUPS

A degree is said to be r.e. if it contains an r.e. set (for then all sets in the degree are r.e.). Thus both 0 and 1 are r.e. degrees. It can be seen that if a is an r.e. degree then a ≤T 1. The r.e. degrees are of particular interest since most naturally occurring problems are r.e., this includes the word and conjugacy problems for finitely presented groups. Furthermore they relate to the possible variety of axiomatizable theories. In his wide ranging paper Post [258] asked if there are r.e. degrees strictly between 0 and 1. The problem remained unsolved till 1956 when Friedberg [143] and Muchnik [243] working independently demonstrated, almost simultaneously, that there are mutually incomparable r.e. degrees. Clearly these degrees must be strictly between 0 and 1. In this chapter we will use the phrase ‘P is a-computable’ to mean that P can be computed with an oracle machine equipped with an oracle set from a. Readers are referred to Rogers [271] or B¨ orger [77] for further details on degrees and recursion theory in general. 6.2.2. Modular machines. In this section we discuss a class of machines introduced in [2], [3]. Essentially they are an arithmetic encoding of Turing machines. Their use can simplify proofs of unsolvability for finitely presented groups and other areas (see [120]). A modular machine M normally has N2 as its set of configurations. The machine M also has (1) integers m > n > 0, (2) finitely many quadruples (ai , bi , ci , R),

for i ∈ I,

(aj , bj , cj , L), for j ∈ J, such that 0 ≤ ak , bk < m and 0 ≤ ck < m2 for all k ∈ I ∪ J. Moreover, at most one quadruple begins with any given pair of integers (this ensures that M is deterministic). The basic move function of M is defined as follows. Given a configuration (α, β) put (α, β) = (um + a, vm + b) where 0 ≤ a, b < m. If M has a quadruple (a, b, c, D) then  (um2 + c, v), if D = R; (α, β) → (u, vm2 + c), if D = L. If M has no such quadruple then there is no next move and the machine halts in which case (α, β) is terminal. An extended modular machine uses Z2 as its set of configurations but otherwise it is the same as an ordinary modular machine.

page 124

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.2. COMPUTABILITY

125

Lemma 6.1. Let M be an extended modular machine and (α, β) ∈ Z2 . If (α, β) ∈ H0 (M ) then (α, β) ∈ N2 . Proof. We may assume that (0, 0) is terminal for M for otherwise H0 (M ) = ∅ and there is nothing to prove. Suppose that (α, β) ∈ H0 (M ). Then (α, β) = (α1 m+a, β1 m+b) with 0 ≤ a, b < m and M has a quadruple (a, b, c, D). Assume w.l.o.g. that D = R so that (α, β) → (α1 m2 + c, β1 ) and (α1 m2 + c, β1 ) ∈ H0 (M ). If α < 0 then α1 < 0 and so α1 m2 + c < 0 since 0 ≤ c < m2 . If β < 0 then β1 < 0. Thus either way if (α, β) ∈ N2 then (α1 m2 + c, β1 ) ∈ N2 . Since (0, 0) ∈ N2 we have a contradiction.  The letter m together with the sets of quadruples (ai , bi , ci , R) for i ∈ I and (aj , bj , cj , L) for j ∈ J will always have the above meaning for a modular machine M . The number n is used for defining input and output functions for M . We do not define these as they will not be needed and so we shall use n as a general variable (see [3] for details). Let T be a directed state Turing machine (we do not need to have directed states but this simplifies matters, e.g., see [119]). The alphabet symbols of T can be represented by 0, 1, . . . , n with 0 representing the blank square symbol. The states of T can be represented by n + 1, n + 2, . . . , t and possibly 0. In most applications we assume that T has a distinguished halting state q0 which will always be encoded by 0. We choose an integer m > t. Let C = . . . b2 b1 b0 qac0 c1 c2 . . . be a configuration of T , where we  identify and states with their encodings. Put u = bi mi and  symbols i v = ci m ; note that these sums are finite since all but finitely many squares are blank and this symbol is encoded as 0. We encode C as  (um + q, vm + a), if q is an R-state; φ(C) = (um + a, vm + q), if q is an L-state. If Q = (q, s, q  , s , D) then it is encoded as a quadruple by  (q, s, ms + q  , D), if q is an R-state; Q → (s, q, ms + q  , D), if q is an L-state. It is easy to see that for configurations C, C  of T we have C → C  in T if and only if φ(C) → φ(C  ) in M , the modular machine encoding T . It follows that if T has unsolvable halting problem then so does M . Assuming that (0, 0) is a terminal configuration for a modular machine M , we define its special halting problem to be H0 = {(α, β) | (α, β) →∗ (0, 0)}. It follows that there is a Modular machine M with H0 (M ) undecidable. To see this take a universal Turing machine and alter it so that if it halts

page 125

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

126

9287 – Gr¨ obner-Shirshov Bases

textbook

6. DECISION PROBLEMS FOR GROUPS

it proceeds to wipe the tape5 and then halt in state q0 . Recall that the confluence problem for M is K(M ) = {(α, β), (α , β  ) | (α, β) →∗ (α , β  ) and (α , β  ) →∗ (α , β  ) for some (α , β  )}. Both sets are r.e. and clearly H0 (M ) ≤T K(M ) so that the Turing degree of the first problem is never larger than that of the second. The situation is classified completely by the following result: Theorem 6.2 (Cohen [119]). Let a, b be r.e. Turing degrees with a ≤T b. Then there is a modular machine with special halting problem of degree a and confluence problem of degree b. This holds for extended modular machines as well. 6.2.3. Abstract machine results. This section contains results that are used in Section 6.7 and can be omitted on first reading. We return to the notion of a general machine M introduced in Section 6.2. It is frequently convenient to single out a special terminal configuration C0 of M and define the special halting problem of M to be H0 (M ) = { C | (C, C0 ) ∈ D(M ) }. For modular machines (defined in Section 6.2.2) we take C0 = (0, 0), provided that (0, 0) is terminal (otherwise H0 (M ) = ∅). For single-tape Turing machines (defined in Section 6.2.1) we normally take C0 to consist of the empty tape and a special halting state q0 . Any realistic machine works by obeying instructions and every basic move C → C  can be given a label which is the instruction used to produce C  from C. In the case of a Turing machine the labels are the quintuples which define it. For a modular machine the labels are its quadruples. From now on we shall assume that each basic move of a machine M has a label attached to it. A computation C0 → C1 → · · · → Cn determines a unique path Π Π consisting of the labels of each basic move. We use C0 −→ Cn to represent such a computation sequence. We denote the length of Π by l(Π). For an arbitrary configuration C and a path Π we put CΠ = C  if there is Π a computation C −→ C  , otherwise CΠ is undefined. We put Σ ≤ Π if there is a path Δ such that Π = ΣΔ. Note that if CΠ = CΠ (where both sides are defined) then either Π ≤ Π or Π ≤ Π. Moreover if equality does 5Care has to be taken here. There is no way that an arbitrary machine can determine how far to the left and right of the scanned square it needs to go in order to wipe the tape. However we can bypass this problem by delimiting the working space with special marker symbols. As the computation proceeds the marker symbols are moved one square to the right or left, as appropriate, whenever more space is needed.

page 126

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.2. COMPUTABILITY

127

not hold then M loops from C. For two pairs (Σ, Σ ) and (Π, Π ) we put (Σ, Σ ) ≤ (Π, Π ) if there is a path Δ such that Π = ΣΔ and Π = Σ Δ. A pair (C, C  ) of configurations is said to conflow via a pair of paths (Π, Π ) if CΠ, C  Π are both defined and CΠ = C  Π . A pair of paths (Π, Π ) is said to be a mass confluence for the configurations (Ci , Ci ) for 0 ≤ i ≤ n if each (Ci , Ci ) conflows via (Π, Π ) for 0 ≤ i ≤ n. The mass confluence problem for M is the problem of deciding for arbitrary configurations (Ci , Ci ) for 0 ≤ i ≤ n whether or not they mass conflow (note that n is also arbitrary). Lemma 6.3. Let (C, C  ) conflow via (Σ, Σ ) and via (Π, Π ). Then it follows that M loops from C and C  or (Σ, Σ ) ≤ (Π, Π ) or (Π, Π ) ≤ (Σ, Σ ). Proof. We may assume without loss of generality that Π = ΣΔ for some path Δ. Then CΣΔ is defined and C  Σ Δ = CΣΔ = CΠ = C  Π . Thus  either (Σ, Σ ) ≤ (Π, Π ) or M loops from C  (and hence C). Lemma 6.4. Let M have solvable loop problem and let (C, C  ) be a pair of configurations which conflow. Suppose that M does not loop from C (and hence C  ). Then there is a minimal confluence (Σ, Σ ) for (C, C  ) which can be computed. Proof. Let (Π, Π ) be any confluence for (C, C  ). If we choose a confluence (Σ, Σ ) for (C, C  ) with l(Σ) + l(Σ ) minimal the result then follows from Lemma 6.3.  Note that in the preceding lemma we require M to have solvable loop problem in order to ensure that the set P = { (C, C  ) | M does not loop from C, C  } is recursively enumerable. The lemma then states that there is a partial recursive function f from pairs of configurations to pairs of paths such that if (C, C  ) ∈ P conflow then f (C, C  ) is a minimal confluence for (C, C  ). Lemma 6.5. Suppose that M loops from C, D. Then (1) we can recursively find paths Π0 , . . . , Πr , Πr+1 . . . , Πr+s , Σr+1 , . . . , Σr+s such that any computation from C is either by a path Πi for 0 ≤ i ≤ r or by a path Πi Σni for r + 1 ≤ i ≤ r + s and n ≥ 0. (2) We can recursively find paths Γ0 , . . . , Γp , Γp+1 , . . . , Γp+q , Δp+1 , . . . , Δp+q such that the set { Γ | CΓ and DΓ are both defined }

page 127

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

128

textbook

6. DECISION PROBLEMS FOR GROUPS

is just { Γ0 , . . . , Γp } ∪ { Γi Δni | p + 1 ≤ i ≤ p + q and n ≥ 0 }. Moreover (CΓ, DΓ) = (CΓ , DΓ ) if and only if Γ = Γ or Γ = Γi Δti , Γ = Γi Δui for some i with p + 1 ≤ i ≤ p + q and t, u ≥ 0. In particular the set { (CΓ, DΓ) | CΓ, DΓ are both defined for some path Γ } is finite and recursively computable from C, D. Proof. For the first part let C = C0 , C1 , . . . , Cr , Cr+1 , . . . , Cr+s+1 = Cr+1 be the configurations of the computation from C up to the first loop. The result follows if we choose Πi for 0 ≤ i ≤ r + s to be the path such that Ci = C0 Πi and Σi for r + 1 ≤ i ≤ r + s to be the minimal non-empty path such that Ci = Ci Σi . For the second part we use the notation introduced in the previous paragraph. We can check which, if any, of the Πi for 0 ≤ i ≤ r apply to D and put these amongst Γ0 , . . . , Γp . Now for each i > p consider the paths Πi Σni for all n ≥ 0. Note that there are only finitely many configurations reachable from D and we can find out what they are. If DΠi is not defined then DΠi Σni is not defined for any i > p and n ≥ 0. Assume now that DΠi is defined. There are two possibilities to consider. not defined. (1) There is an n ≥ 0 with DΠi Σni defined and DΠi Σn+1 i Here we obtain finitely many paths Γ such that CΓ, DΓ are both defined and Γ = Πi Σti for some t ≥ 0. We put all these amongst Γ0 , . . . , Γp . Note that in this case we cannot have DΠi Σti = DΠi Σui for u > t. (2) There exist u ≥ 0 and t < u such that DΠi Σti = DΠi Σui . Let u be k−(u−t) minimal. Clearly DΠi Σki is defined for all k and DΠi Σki = DΠi Σi n for k ≥ u. Here we have finitely many different endpoints { DΠi Σi | 0 ≤ and n < u } and paths of type Λj Θkj for all k ≥ 0 where Θj = Σu−t i the Λj are the elements of { Πi Σni | 0 ≤ n < u }. We put the Λj amongst Γp+1 , . . . , Γp+q and the Θj amongst Δp+1 , . . . , Δp+q . The rest of the lemma follows easily.  Note that the preceding lemma can be extended to any number of configurations. We shall say that a machine is strongly labeled if whenever C0 → C  , C1 → C  and both basic moves have the same label then C0 = C1 . If we label modular machines by quadruples, as indicated above, then they are strongly labeled. Theorem 6.6. Suppose that M is strongly labeled and has solvable loop problem. Then the mass confluence problem for M is Turing equivalent to the confluence problem for M .

page 128

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.2. COMPUTABILITY

129

Proof. The confluence problem for any machine M is certainly Turing reducible to its mass confluence problem since the pair (C, C  ) conflows if and only if it mass conflows. We show the opposite reduction for two pairs of configurations (C, C  ), (D, D ). The general case is similar. First we decide (with respect to an oracle for the confluence problem of M ) whether or not (C, C  ), (D, D ) both conflow. If either pair fails to conflow then the answer to the problem is ‘no’. Form now on we assume that both pairs conflow and consider two cases. Case 1. M does not loop from at least one of C or D. We may assume without loss of generality that M does not loop on C. By Lemma 6.4 we can recursively find a minimal confluence (Σ, Σ ) for (C, C  ). Now if (C, C  ), (D, D ) mass conflow via (Π, Π ) then by Lemma 6.3 there is a path Δ such that Π = ΣΔ and Π = Σ Δ. An easy induction shows that if DΣΔ = D Σ Δ then DΣ = D Σ . (Note that we need M to be strongly labeled for the induction to go through.) Thus (C, C  ), (D, D ) mass conflow if and only if they do so via (Σ, Σ ). Case 2. M loops from C and D. Note that M also loops from C  and D . By Lemma 6.4 we can recursively find the finite sets { (CΓ, DΓ) | Γ is any path such that CΓ, DΓ are both defined } and { (C  Γ , D Γ ) | Γ is any path such that C  Γ , D Γ are both defined }. Now (C, C  ), (D, D ) mass conflow if and only if these two sets have nonempty intersection.  Theorem 6.7. Let M be a strongly labeled machine with solvable loop problem. Suppose that we are given (1) pairs of configurations (Ci , Ci ) for 0 ≤ i ≤ n, (2) a recursive function σ from configurations to integers which is additive, i.e., σ(Π1 Π2 ) = σ(Π1 ) + σ(Π2 ) for all paths Π1 , Π2 , (3) an integer m, (4) pairs of integers fj , fj for 0 ≤ j ≤ n. Then the problem of deciding whether or not there is a mass confluence  (Π, Π ) for the (Ci , Ci ) such that mσ(Π) fj = mσ(Π ) fj for 0 ≤ j ≤ n is Turing reducible to the confluence problem for M . Proof. We give a reduction to the mass confluence problem for M . An application of Theorem 6.6 completes the result.

page 129

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

130

9287 – Gr¨ obner-Shirshov Bases

textbook

6. DECISION PROBLEMS FOR GROUPS

Clearly we may assume that m = 0 and we may also disregard any pair fj , fj which is zero. Note also that if any fj (respectively fj ) is zero then the answer to the problem is ‘no’ unless fj (respectively fj ) is also zero. Thus we may assume that none of the given integers is zero. Now if there is to be a ‘yes’ answer then we must have fi /fi = fj /fj for 0 ≤ i, j ≤ n so suppose that this is the case. It therefore follows that we may replace the fj , fj with just two integers f, f  . We prove the theorem for two pairs of configurations (C, C  ), (D, D ). The general case is similar. First we check (with respect to an oracle for the mass confluence problem of M ) whether or not (C, C  ), (D, D ) mass conflow. If they do not then the answer to the problem is ‘no’. From now on we assume that they do mass conflow and consider two cases. Case 1. M does not loop from at least one of C or D. It was shown in Case 1 of Theorem 6.6 that we can recursively find a mass confluence (Σ, Σ ) for (C, C  ), (D, D ) such that if (Π, Π ) is any other mass confluence for the two pairs then Π = ΣΔ and Π = Σ Δ for some path Δ.   But now mσ(Π) f = mσ(Π ) f  if and only if mσ(Σ) f = mσ(Σ ) f  . Case 2. M loops from C and D. Let Π0 , . . . , Πr , Πr+1 , . . . , Πr+s , Σr+1 , . . . , Σr+s be paths for C, D constructed as in the second part of Lemma 6.5 and let Γ0 , . . . , Γp , Γp+1 , . . . , Γp+q , Δp+1 , . . . , Δp+q be paths for C  , D . It follows from Lemma 6.5 that we can recursively find finitely many pairs of paths of form (Πi , Γj ), (Πi Σi , Γj ), (Πi , Γj Δj ) or (Πi Σi , Γj Δj ) such that the following are all the mass confluences for (C, C  ), (D, D ): (1) (Πi , Γj ). (2) (Πi Σti , Γj ) for all t ≥ 0. (3) (Πi , Γj Δuj ) for all u ≥ 0. (4) (Πi Σti , Γj Δuj ) for all t, u ≥ 0. We can merge all four cases into one by considering the following problem: given paths Π, Γ, Σ, Δ decide whether or not the equation mσ(Π)+tσ(Σ) f = mσ(Γ)+uσ(Δ) f  has integer solutions t, u ≥ 0. Furthermore we are allowed to specify that t = 0 or u = 0. If we do specify that t = 0 or u = 0 then the question is trivially decidable. For the general case a solution exists only if f /f  is a power of m. If this is so then we wish to know if an equation of form at − bu = c, with a, b ≥ 0, has solutions in non-negative integers t, u. If any one of a, b, c is zero the problem is clearly decidable. Otherwise the equation has an

page 130

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.3. GROUP THEORETIC TOOLS

131

integer solution t0 , u0 if and only if gcd(a, b) divides c. If this is so then we can find a non-negative solution t1 = t0 + bx, u1 = u0 + ax for some integer x.  Finally we concentrate on modular machines. As mentioned above we take the label of a basic move C → C  to be the quadruple which is used to produce C  from C. The recursive function σ of Theorem 6.7 is defined by σ(Π) = #(R-quadruples in Π) − #(L-quadruples in Π). Remark 6.8. Let a, b be any two r.e. Turing degrees with a ≤T b. As stated in Theorem 6.2, there is an extended modular machine M with terminal configuration (0, 0) such that (1) the special halting problem for M has degree a, (2) the confluence problem for M has degree b. Unfortunately, we will need M to have solvable loop problem. Following the analysis of [119] it is easy to see that the modular machines constructed there have solvable loop problem provided they are obtained from Turing machines whose loop problem for unmarked tape is solvable. The discussion of normal and non-normal configurations in [119] shows that this holds if each Turing machine is constructed from a large scale machine whose loop problem is solvable. However, if we construct a large scale machine with halting problem of degree a, confluence problem of degree b and solvable derivability problem by fitting together two of the machines mentioned in [119], we find that the machine never loops. This is because in any computation which does not halt one register becomes arbitrarily large. Fortunately we do not use the derivability problem and so we may take it to be solvable. 6.3. Group theoretic tools For the convenience of the reader, we survey briefly some of the key tools we will need for the rest of this chapter. More details can be found in the books by Magnus, Karass and Solitar [295], Lyndon and Schupp [220] or Johnson [170] (of these [220] has the most comprehensive coverage). The definitions given here are in the same spirit as many in Chapter 2 but focused on groups. Suppose G ∼ = F/N for some free group6 and N  F , i.e., a normal subgroup. Let X be a set of free generators for F and R a set of generators for N as a normal subgroup of F . Then we say that  X | R  is a free presentation of G. As usual we will abuse notation and set G =  X | R . 6The definition of a free group is the same as for free algebraic structures of other types (e.g., semigroups) given in Chapter 2.

page 131

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

132

9287 – Gr¨ obner-Shirshov Bases

textbook

6. DECISION PROBLEMS FOR GROUPS

The elements of R are called relators. The presentation is said to be finitely generated if X is finite. The presentation is said to be finite if X and R are both finite. The simplest examples of presentations are those of free groups: if F is freely generated by X then it has the presentation  X | ∅  which we write more simply as  X | . In general if N is generated by r1 , r2 , . . . then we write  X | r1 , r2 , . . .  instead of  X | {r1 , r2 , . . .} . At times it is convenient to replace relators by equations: if, e.g., r = uv −1 then we write u = v instead of r on the right hand side. Such equations are called relations. As usual we tend to abuse notation with presentations, e.g., if X, Y are sets of generators and R, S sets of words in the two sets respectively then we write  X, Y | R, S  rather than  X ∪ Y | R ∪ S . From a purely mathematical point of view the sets X, R in a presentation  X | R  can be arbitrary so long as R ⊆ (X ∪ X −1 )∗ (see Section 6.3.4.1 for the definition of (X ∪ X −1 )∗ ). From the point of view of computability some restrictions are necessary, both in order to be able to phrase questions and to have invariance of properties. As a simple example, if X or R is not recursive then we have no way of recognizing generators or relators. More subtly, consider a subset W of N and the group FW =  x0 , x1 , x2 , . . . | xi = 1, for i ∈ W . Clearly FW is a free group (where we agree that the trivial group is free). However xi = 1 if and only if i ∈ W and if we take for W a non-recursive set then the word problem for F is not solvable. Of course the group F always has a presentation for which the word problem is indeed solvable. Thus the property of having solvable word problem seems to be determined by the presentation rather than the group. This is avoided entirely if we insist on “manageable” presentations. A group G has a recursive presentation if G =  X | R  where X is finite and R is an r.e. subset of X ∗ . It is not hard to show that any of the various decision problems for groups (word, conjugacy etc.) is solvable with respect to one recursive presentation if and only if it is so for any other. Finally note that if  X | R  is a recursive presentation then there is a computable listing of the elements of R, say as r0 , r1 , r2 , . . . (this is proved on page 120). Now let y be a new symbol not in X. Then  X, y | y, y i wi , for i ∈ N  is a presentation for the same group where the set of relators is recursive, this is the reason for the name (rather than r.e presentation). 6.3.1. Free products. If we have a collection Ai , for i ∈ I, of groups there are various ways of building a new group from this, e.g., their direct product. Ideally we want each Ai to be embedded in the new group. The free product of such a collection produces such a group that has no more relations than strictly necessary.

page 132

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.3. GROUP THEORETIC TOOLS

133

Definition 6.9. A free product of a collection Ai , for i ∈ I, of groups is a group P with the properties: (1) for each i ∈ I there is a monomorphism ji : Ai → P ; (2) for every group G and homomorphisms fi : Ai → G, for i ∈ I, there is a unique homomorphism φ : P → G that extends each fi . In terms of a commutative diagram we have Ai fi

 ~ G

/P

ji

φ

Theorem 6.10. Free products exist and are unique up to isomorphism. If Ai =  Xi | Ri  with the Xi pairwise disjoint then their free product has presentation  Xi , for i ∈ I | Ri , for i ∈ I . It follows that there is exactly one free product of {Ai | i ∈ I} and this is denoted by ∗i∈I Ai . For two groups G, H we use G ∗ H. It is easily seen that (G ∗ H) ∗ K ∼ = G ∗ (H ∗ K). 6.3.2. Free products with amalgamation. Suppose that G1 , G2 are groups with isomorphic subgroups H1 , H2 respectively and let θ : H1 → H2 be an isomorphism. In forming the free product G1 ∗ G2 there is no ‘interaction’ between the elements of the two groups. The isomorphism θ allows us to introduce a limited form of interaction while still preserving the all important property that the constituent groups are embedded in the new one. The idea is to identify elements of H1 and H2 via the isomorphism θ, i.e., in the new group we have h = θ(h) for all h ∈ H1 . As usual we define the new group by means of a universal property. Definition 6.11. With the preceding notation, we say that a group A with homomorphisms θ1 : G1 → A and θ2 : G2 → A is a free product of G1 and G2 with amalgamated subgroup if the following holds. For every group K and pair of homomorphisms f1 : G1 → K and f2 : G2 → K such that the following diagram (without the map φ) commutes H1

θ

 H2

/ G1 N NNN NNNf1 NNN θ1 NNN  φ '/ AO p8 K p p pp θ2 ppfp p p 2 pp / G2 p

there is a unique homomorphism φ as shown preserving commutativity.

page 133

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

134

9287 – Gr¨ obner-Shirshov Bases

textbook

6. DECISION PROBLEMS FOR GROUPS

At first sight it seems odd that we do not require θ1 , θ2 to be 1–1; this turns out to be a consequence of the definition as given. Clearly the definition can be generalized to arbitrarily many groups, however the special case of two groups is the most important. Theorem 6.12. Free products with amalgamation exist and are unique up to isomorphism. Proof. Using the notation of the definition, let N be the normal closure of  hθ(h−1 ); for h ∈ H1  in G1 ∗ G2 and set A = (G1 ∗ G2 )/N . The maps θi are defined by / G1 ∗ G2 Gi θi

%

π

 (G1 ∗ G2 )/N

where π is the natural map. Now there is a unique homomorphism φ : G1 ∗ G2 → K such that G1 PP PPP PPPf1 PPP θ1 PPP  P' ψ / G1 ∗O G2 n7 K n n n nnn θ2 nnnf2 n n nnn G2 commutes. It follows that ψ(g) = 1 for all g ∈ N . We thus obtain a well defined homomorphism φ : (G1 ∗ G2 )/N → K. It is a straightforward matter to check that the definition is satisfied with φ and that A is unique up to isomorphism.  We denote the group A by G1 ∗θ G2 . Theorem 6.13. Let G1 =  X1 | R1  and G2 =  X2 | R2  where X1 , X2 are disjoint. Suppose that H1 , H2 are subgroups of G1 , G2 respectively and that θ : H1 → H2 is an isomorphism. Let S = {hθ(h−1 ) | h ∈ H1 }. Then G1 ∗θ G2 =  X1 , X2 | R1 , R2 , S . In the case of a free product we have a very straightforward set of normal forms: the product of elements from alternating factors with only the first and last being allowed to be trivial. When amalgamation is allowed we need to take into account that, e.g., h1 h2 = θ(h1 )h2 so that alternation of factors is not enough by itself. We now describe a simple and widely used method of obtaining normal forms for this situation. Let Ti be a left

page 134

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.3. GROUP THEORETIC TOOLS

135

transversal for Hi in Gi for i = 1, 2. We insist that the representative of Hi is 1 otherwise the transversals can be chosen freely. For ai ∈ Gi we denote the representative of the coset ai Hi by ai so that ai = ai hi for some hi ∈ Hi which is uniquely determined by ai (once Ti is chosen). Definition 6.14. With the preceding notation, a sequence a1 , a2 , · · · an , h is a normal form of G1 ∗θ G2 if and only if h ∈ H1 and ai , ai+1 do not both belong to G1 or G2 for 1 ≤ i < n. We regards two normal forms a1 , a2 , · · · an , h and a1 , a2 , · · · an , h as equal if and only if ai = ai  in G1 or G2 as appropriate and h = h in H1 . Clearly every normal form corresponds to an element of G1 ∗θ G2 . The converse is also true and the correspondence is 1-1 from which it follows that we have embedding of the constituent groups: Theorem 6.15. With the preceding notation, every element of G1 ∗θ G2 has a unique normal form. Theorem 6.16. With the preceding notation, the homomorphisms θi : Gi → A are 1–1. 6.3.3. HNN extensions. In constructing the free product with amalgamation of two groups we have two separate groups with isomorphic subgroups which are then identified in a larger group. HNN extensions deal with the analogous case when the two subgroups reside in the same group. Let G be a group with subgroups A, B that are isomorphic via φ. Let t be a new letter (called a stable letter) and set G∗ =  G, t | t−1 at = φ(a), for a ∈ A . Thus the new relations provide a type of quasi-commutativity since we have t−1 a = φ(a)t−1 , for a ∈ A, tb = φ−1 (b)t,

for b ∈ B.

The subgroups A, B are called the associated subgroups and are often denoted by A(t), B(t) respectively. We will use the more economical notation A(t ) where A(t) denotes A and A(t−1 ) denotes B. With this notation the preceding relations can be expressed as t c = φ− (c)t ,

c ∈ A(t− ).

A more compact notation for G∗ is G∗ =  G, t | t−1 At = φ(A) 

page 135

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

136

9287 – Gr¨ obner-Shirshov Bases

textbook

6. DECISION PROBLEMS FOR GROUPS

or G∗ =  G, t | t−1 At = B, φ  As might be expected G is embedded in G∗ . Moreover those words of G∗ for which the occurrences of t cannot all be removed by application of the quasi-commutativity relations and free reduction are not equal to the identity. These claims can be proved by methods similar to those for free products with amalgamation. First we define normal forms for elements of G∗ . Choose transversals for right cosets of the subgroups A(t ) such that 1 represents the cosets A(t ); once chosen the transversals are assumed to be fixed for the rest of the discussion. We say that a sequence g0 , t1 , g1 , . . . , tn , gn is in normal form if: (1) g0 ∈ G; (2) gi is in the transversal of A(t−i ); (3) there is no consecutive subsequence t , 1, t− . Theorem 6.17. Every element of G∗ can be expressed uniquely as a product g0 t1 g1 · · · tn gn where g0 , t1 , g1 , . . . , tn , gn is in normal form. As an immediate consequence we obtain: Lemma 6.18. The group G is embedded in G∗ . Moreover, as shown in [220], a version of Theorem 6.15 follows from Theorem 6.17 by a simple argument. Consider a word g0 t1 g1 · · · tn gn (not necessarily corresponding to a normal form). We say that it is pinch reduced if it has no subword ti gi ti+1 with i = −i+1 and gi ∈ A(ti+1 ). Lemma 6.19 (Britton’s Lemma [80]). A word of G∗ that is pinch reduced and contains an occurrence of t is not equal to the identity. Moreover if u = v and the words u, v are both pinch reduced then u = u0 t1 u1 · · · tn un and v = v0 t1 v1 · · · tn vn where each ui , vi is t-free. The preceding lemma is often crucial in the analysis of the word problem as well as other decision problems. We will cover some further results in the spirit of Britton’s Lemma in Section 6.7.2 in connection with analyzing the conjugacy problem. The next simple result is used in Section 6.6. Lemma 6.20. Let G∗ =  G, t | t−1 at = φ(a)  be an HNN extension of G. Suppose that H is a subgroup of G with φ(H ∩ A) = H ∩ B. Then the subgroup  H, t  of G∗ has presentation  H, t | t−1 at = φ(a), for a ∈ H ∩ A , i.e., a presentation consisting of one for H with the extra displayed relations adjoined.

page 136

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.3. GROUP THEORETIC TOOLS

137

Proof. Let u be a word in the generators of H and t. We show that if u = 1 then this is a consequence of the relations of H and extra ones adjoined. If u is t-free then the claim is trivial. Suppose now hat u involves t so that it has a t-pinch. Thus u = u0 t− vt u1 where v is t-free and v ∈ A(t ). Since v is a word in the generators of H we have v ∈ H ∩ A(t ). Now u = u0 φ (v)u1 . From the assumption that φ(H ∩ A) = H ∩ B we deduce that φ−1 (H ∩ B) = H ∩ A so that φ (v) can be written as a word in the generators of H. The proof is now completed by induction on the number of t-symbols in u.  One obvious generalization of HNN extensions is to use more than one pair of isomorphic subgroups and corresponding isomorphisms. Thus we have a family Ai , Bi of subgroups of G with isomorphisms φ : Ai → Bi for i ∈ I. We introduce stable letters ti for i ∈ I and consider the group G∗ =  G, ti | t−1 i Ai ti = φi (Ai ) . The results of this section hold in the general situation with obvious adjustments to definitions. A further possible generalization is discussed by, e.g., Miller [241]. We end this section with a some remarks. HNN extensions are frequently defined by taking generators { ai | i ∈ I } and { bi | i ∈ I } for the associated subgroups and the presentation is written as  G, t | ai t = tbi , for i ∈ I . Our task is then to show that the map ai → bi extends to an isomorphism of the subgroups generated. This amounts to showing the following: let u be a word in the ai and v the corresponding word in the bi , if w = 1 then v = 1. Moreover, we frequently start with a base group G0 and then build a sequence G0 ≤ G1 ≤ G2 . . . of HNN extensions. 6.3.4. Groups with a standard basis. Let G0 ≤ G1 ≤ . . . ≤ Gn = G be a tower of HNN extensions with G0 free7. This is a construction that is frequently employed and it is useful to be able to obtain explicit normal forms; in particular for the analysis of various decision problems. By contrast to Gr¨obner–Shirshov bases, the method we discuss here makes use of the special structure of the construction and so it is often easier to find the normal forms. On the other hand we do not have a guarantee that such forms exist, though in most situations we can ensure this possibly by a change to the presentations of some of the groups. In fact we will see in Section 6.3.4.1 that in many situations the normal forms obtained by the method described here do indeed give a Gr¨ obner–Shirshov basis. 7The tower need not be finite for the method discussed but this simpler situation covers the cases considered in this Chapter.

page 137

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

138

9287 – Gr¨ obner-Shirshov Bases

textbook

6. DECISION PROBLEMS FOR GROUPS

We introduce the following notation: • Σα is the set of stable letters for Gα over Gα−1 ; • Φα is the set of all HNN relations Ai p = pBi for all p ∈ Σα , of Gα ; • Σα = ∪β 0, the sets Cβ are already defined for β < α. The word W ≡ R0 pi11 R1 · · · piss Rs ,

(6.3)

where s > 0, each Rj is a Σα -word, and pij ∈ Σα , is included in Cα if and only if (a) W is freely reduced, 8Recall that if X is an alphabet then we set X −1 = {x−1 | x ∈ X}, and it is assumed that X −1 is disjoint from X. We call the letters of X positive and those of X −1 negative .

page 138

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.3. GROUP THEORETIC TOOLS

139

(b) Rj ∈ C α = ∪β 0 we already know that each Cβ , for β < α, is a set of normal forms for Gβ and the rules for transforming Gβ -words into normal form are formulated. We have the following rules for transforming a Gα -word (6.3) into normal form: (1) put each Rj into normal form; (2) carry out any cancellations of letters pj in the word thus obtained; (3) exclude from the word obtained in the preceding step the leftmost occurrence of one of the subwords (6.4); (4) return to the first step until the same word is produced twice. The third step means that one has to apply the following transformations xC(A(x−1 )A2 )p → A(x)A−1 1 pB, −1 x−1 C(A(x)A−1 )A2 pB −1 , 1 )p → A(x

yC(A(y −1 )B2 )p−1 → A(y)B1−1 p−1 A,

(6.5)

y −1 C(A(y)B1−1 )p−1 → A(y −1 )B2 p−1 A−1 . It is clear that if the process given above halts on all words of form Rp , where R ∈ Gα and p ∈ Σα , then it does so on all words. To prove that it does halt on Rp we usually define a non-negative integer μ(R) depending on R and show that an application of the rules produces a word R p R with μ(R ) < μ(R). We discuss this matter further in Section 6.3.4.1 where we consider a general method. We observe that in various situations we need to add extra relations to some of the presentations of the groups in the tower in order to obtain unique normal forms. For example, if we have a finite tower of finite presentations for which the word problem is unsolvable then we cannot obtain normal forms unless we adjoin an infinite (undecidable) set of relations; otherwise we would have an algorithm for the word problem. We will return to this in Section 6.5. We make one final observation. Suppose G =  X | R  is obtained as a sequence of HNN extensions starting with a free group and G has a standard basis. Let W be a word in the elements of X (and their inverses) and suppose that C(W ) ≡ 1. In finding the normal form of W we use

page 140

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

6.3. GROUP THEORETIC TOOLS

141

free reductions and transformations of forbidden sub-words, i.e., a set of re-writes i i for i = ±1 and i ∈ I, x− i xi → 1, Lj → Rj , for j ∈ J. Let Y be the set of letters that appear in any words in these re-writes. A simple argument shows that W is in the normal closure of  Lj Rj−1 ; j ∈ J  in the free group generated by Y . 6.3.4.1. Connection with Gr¨ obner–Shirshov bases. The notion of a group with standard basis is defined independently of Gr¨obner–Shirshov basses and can be used without reference to them. However it was shown by Kalorkoti [178] that in many situations a standard basis is in fact a GS basis. We present the key points here. First we introduce an order on words that is relevant to many applications. The order is made quite general by allowing non-zero weights to be assigned to letters but in most cases each letter is given weight 1. For an alphabet X we set, as usual, X ± = X ∪ X −1 where X −1 = { x−1 | x ∈ X } (it is assumed that the two sets of the union are disjoint). Put Σ = ∪α Σα . We assume that y. We have

Λ(0) = {[x], [y]}, Λ(1) = {[x|x], [x|y 2 ]}, Λ(2) = [x|x|x], [x|x|y 2 ] , . . . . By the previous result, the set of the following polynomials (x ⊗ 1)(x ⊗ 1) − (y ⊗ 1)(y ⊗ 1), (x ⊗ 1)(y ⊗ 1)(y ⊗ 1) − (y ⊗ 1)(y ⊗ 1)(x ⊗ 1), (1 ⊗ x)(1 ⊗ x) − (1 ⊗ y)(1 ⊗ y), (1 ⊗ x)(1 ⊗ y)(1 ⊗ y) − (1 ⊗ y)(1 ⊗ y)(1 ⊗ x), (x ⊗ 1)(1 ⊗ x) − (1 ⊗ x)(x ⊗ 1), (x ⊗ 1)(1 ⊗ y) − (1 ⊗ y)(x ⊗ 1), (1 ⊗ x)(y ⊗ 1) − (y ⊗ 1)(1 ⊗ x), (y ⊗ 1)(1 ⊗ y) − (1 ⊗ y)(y ⊗ 1), is a Gr¨onber–Shirshov basis of Λ ⊗K Λ with respect to the order x ⊗ 1 > 1 ⊗ x > y ⊗ 1 > 1 ⊗ y. Thus (Λ ⊗K Λ)(0) = {[x ⊗ 1], [y ⊗ 1], [1 ⊗ x], [1 ⊗ y]}, (Λ ⊗K Λ)(1) = [x ⊗ 1|x ⊗ 1], [x ⊗ 1|(y 2 ⊗ 1)], [x ⊗ 1|1 ⊗ x], [x ⊗ 1|1 ⊗ y],

[1 ⊗ x|1 ⊗ x], [1 ⊗ x|(1 ⊗ y 2 )], [1 ⊗ x|y ⊗ 1], [y ⊗ 1|1 ⊗ y] . It is easy to define the map F : A• (Λ ⊗ Λ) → A• (Λ) ⊗ A• (Λ), for example, we have F1 ([x ⊗ 1|1 ⊗ x]) = [x] ⊗ [x], F2 ([x ⊗ 1|(y ⊗ 1)2 |1 ⊗ y]) = [x|y 2 ] ⊗ [y],

etc.

page 219

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

¨ 7. (CO)HOMOLOGY AND GROBNER–SHIRSHOV BASIS

220

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

textbook

Suppose now we have a comultiplication Δ : Λ → Λ ⊗ Λ then for some left Λ-module M , we obtain a homomorphism ) ' (  : Δ HomΛ⊗ΛAr (Λ) ⊗ As (Λ), M ⊗ M → HomΛ (Ap+q (Λ), M ⊗ M ). r+s=p+q

To be more precise, we have the following diagram A• (Λ ⊗ Λ) O

• F

/ A• (Λ) ⊗ A• (Λ) ϕ

Δ•

A• (Λ)  take a ϕ ∈ * To define Δ,

 M ⊗M ' ( HomΛ⊗Λ Ar (Λ) ⊗ As (Λ), M ⊗ M and set

r+s=p+q

 := ϕ ◦ F• ◦ Δ• . Δ(ϕ) Next, ( M ) ⊗ HomΛ (Aq (Λ), M ) to * we have 'a map from HomΛ (Ap (Λ), HomΛ⊗Λ Ar (Λ) ⊗ As (Λ), M ⊗ M , called the ∨-product [88, §7, r+s=p+q

Chapter IX], which is given by the following formula: (ξ ∨ ζ)(c ⊗ c ) := ξ(c) ⊗ ζ(c ).

(7.7)

 Thus we define the -product as a composite of ∨ and Δ: HomΛ (Ap (Λ), M ) ⊗ HomΛ (Aq (Λ), M ) WWWW WWWW  WWWW WWWW WWW+ ∨ HomΛ (Ap+q (Λ), M ⊗ M ) 4 hhhh  hhhhh Δ h h hhhh hhhh h  ' ( * HomΛ⊗Λ Ar (Λ) ⊗ As (Λ), M ⊗ M r+s=p+q

 More precisely, using Sweedler notation, set Δ(x) := (x) x(1) ⊗ x(2) , for x ∈ Λ. Then the the map Δn : An (Λ) → An (Λ ⊗ Λ) can be described as follows % & Δn [x1 | . . . |xn+1 ] = x1 (1) ⊗ x1 (2) | . . . |xn+1 (1) ⊗ xn+1 (2) (7.8) (x1 ),...,(xn+1 )

Next, any Λ-module homomorphisms μ : M ⊗ M → M induces the map μ∗ : HomΛ (Ap+q (Λ), M ⊗ M ) → HomΛ (Ap+q (Λ), M ) Thus we obtain a -multiplication in the cohomology H ∗ (Λ, M ). In particular, we have the following

page 220

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

7.2. ALGEBRAIC DISCRETE MORSE THEORY

221

Proposition 7.28. Let Λ be an F-algebra over a field F and ε : Λ → F an F-algebra homomorphism. Let the comultiplication Δ : Λ → Λ ⊗ Λ be given  by Δ(λ) = (λ) λ(1) ⊗ λ(2) , for every λ ∈ Λ. Then the graded F-module * n H ∗ (Λ, F) = n≥0 H (Λ, F) is an F-algebra with respect to a -product which is given by the following formula (ξ  ζ)(wn )  ( ' (ε ◦ μ∗ ) Γλ⊗ρ Γλp Γρn−p λp+1 · · · λn ρ1 · · · ρp ξ(vp )ζ(un−p ), = n

(7.9)

0≤p≤n λ⊗ρ∈(Λ⊗Λ)⊗n vp ∈Λ(p−1) un−p ∈Λ(n−p−1)

ζ ∈ H q (Λ, F). Next, the graded Hochschild cohomology where ξ ∈ H p (Λ, F),* ∗ module HH (Λ) = HH n (Λ) is an F-algebra with respect to a -product n≥0

which is given by the following formula (ξ  ζ)(wn ) =  ' ' λ (( Γp ξ(vp ) (λp+1 · · · λn ) ⊗ (ρ1 · · · ρp ) Γρn−p ζ(un−p ) , μ∗ Γλ⊗ρ n 0≤p≤n λ⊗ρ∈(Λ⊗Λ)⊗n vp ∈Λ(p−1) un−p ∈Λ(n−p−1)

(7.10) where the sums are taken over all paths, in the graphs ΓA (B• (Λ ⊗ Λ)), ΓA (B• (Λ)) (resp. ΓA (B• (Λ ⊗ Λ, Λ ⊗ Λ)), ΓA (B• (Λ, Λ))), of the forms: Γλ⊗ρ

n [ω1(i1 ) ⊗ ω1(j1 ) | . . . |ωn(in ) ⊗ ωn(jn ) ] −− −→ [λ1 ⊗ ρ1 | . . . |λn ⊗ ρn ],

Γλ p

[λ1 | . . . |λp ] −−→ vp ∈ Λ(p−1) , Γρ n−p

[ρp+1 | . . . |ρn ] −−−→ un−p ∈ Λ(n−p−1) , , Γλp , here wn = [ω1 | . . . |ωn ] ∈ Λ(n−1) , i1 , j1 , . . . , in , jn ∈ {1, 2} and Γλ⊗ρ n ρ Γn−p are corresponding weights. Proof. The proof follows immediately from Corollary 7.26, the previous commutative diagram, and (7.7).  Remark 7.29. As we have already discussed (see Remark 7.8), the classical Composition–Diamond lemma for free associative algebras over fields has, in some cases, the same form for free associative algebras over arbitrary commutative rings. Thus, in the same cases, all the above and hereafter results of this subsection are also true.

page 221

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

222

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 7. (CO)HOMOLOGY AND GROBNER–SHIRSHOV BASIS

Example 7.30. Let us calculate the Hochschild cohomology ring of the free associative commutative algebra Λ = K[x1 , . . . , xn ] over a commutative ring K. We have K[x1 , . . . , xn ] ∼ = Kx1 , . . . , xn /(xi xj − xj xi ), where 1 ≤ i, j ≤ n. Set x1 > x2 > · · · > xn , then it is easy to see that  {xi xj − xj xi } 1≤i 1, and let RN be the multiplication table of N . It is clear that RG = {ζ n = 1} is a Gr¨obner–Shirshov basis of G. As above, denote elements of Zn by 0, 1, . . . , n − 1. Suppose that there are maps ϕ : Zn → Aut(N ) and f : Zn × Zn → N that satisfy the identities (7.22)–(7.25): ϕp ◦ ϕq = f (p, q) · ϕp+q · f (p, q)−1 , ϕ0 = idN , f (0, p) = f (p, 0) = 1N , f (p, q) · f (p + q, r) = ϕp (f (q, r)) · f (p, q + r), for every p, q, r ∈ Zn . For k > 0 we put ϕk1 := ϕ1 ◦ · · · ◦ ϕ1 . Fix α ∈ N and define a map  f : Zn × Zn → N as follows f (p, q) =

k



1, if p + q ≤ n − 1, α, if p + q ≥ n,

and finally, let ϕ1 = ψ be an automorphism of N such that ψ m (η) := α · η · α−1 for all η ∈ N , and we set ϕp := ψ p , for 1 ≤ p ≤ n − 1. With these definitions we verify easily that the afore-mentioned identities for f and ϕ are satisfied and thus an extension is defined. Next, it follows from the identities for ϕ that ϕn1 is an inner automorphism of N . It follows that an extension of N by Zn exists if and only if, we have an automorphism of N and an element α ∈ N such that (1) the nth power of the automorphism is the inner automorphism of N given by transformation by α, and (2) α is fixed by the automorphism.

page 244

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

7.3. GROUP EXTENSIONS

245

In the next subsection we show how braid groups can be described as extensions. 7.3.6. Braid groups. In this section we investigate the working of Thurston’s automaton to describe a Gr¨obner–Shirshov basis of braid groups and then we show how braid groups can be obtained (via generators and relations) as an extension of pure braid groups by symmetric groups. We will use as generators for Bn the set of positive crossings, that is, the crossings between two (necessarily adjacent) strands, with the front strand having a positive slope. We denote these generators by σ1 , . . . , σn−1 . These generators are subject to the following relations: σi σj = σj σi , if |i − j| > 1, σi σi+1 σi = σi+1 σi σi+1 . One obvious invariant of an isotopy of a braid is the permutation it induces on the order of the strands: given a braid B, the strands define a map p(B) from the top set of endpoints to the bottom set of endpoints, which we interpret as a permutation of {1, . . . , n}. In this way we get a homomorphism p : Bn → Sn , where Sn is the symmetric group. The generator σi is mapped to the transposition si = (i, i + 1). Now we want to define an inverse map p−1 : Sn → Bn . To that end, we need the following definition [136, p.183] Definition 7.41. Each permutation π gives rise to a total order relation ≤π on {1, . . . , n} with i ≤π j if π(i) < π(j). We set Rπ := {(i, j) ∈ {1, . . . , n} × {1, . . . , n} | i < j and π(i) > π(j)}. Lemma 7.42. [136, Lemma 9.1.6] A set R of pairs (i, j), with i < j, comes from some permutation if and only if the following two conditions are satisfied: (i) If (i, j) ∈ R and (j, k) ∈ R, then (i, k) ∈ R. (ii) If (i, k) ∈ R, then (i, j) ∈ R or (j, k) ∈ R for every j with i < j < k. Now we will define ([136, p.186]) the very important concept of nonrepeating braid (simple braid). Definition 7.43. Recall that our set of generators includes only positive crossings. We call a positive braid non-repeating (i.e., simple) if any two of its strands cross at most once. We define2 Div(Δn ) ⊂ Bn+ to be the set of classes of non-repeating braids. The following lemma summarizes all the above concepts and notations. 2The notation Div(Δ ) will become clear as we proceed. n

page 245

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

246

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 7. (CO)HOMOLOGY AND GROBNER–SHIRSHOV BASIS

Lemma 7.44. [136, Lemma 9.1.10 and Lemma 9.1.11] The homomorphism p : Bn+ → Sn restricts to a bijection Div(Δn ) → Sn . A positive braid b is non-repeating (i.e., simple) if and only if |b| = |p(b)| (as usual, | · | means the length of a word). If a non-repeating braid maps to a permutation π, two strands i and j cross if and only if (i, j) ∈ Rπ . 1 2 3 4 5 6

1 2 3 4 5 6 Figure 15. The simple braid Rπ , where π = (426153). Example 7.45. For the permutation π =

 1 4

2 3 2 6

4 1

5 6 5 3

 ∈ S6 we

obtain π(1) > π(2), π(1) > π(4), π(1) > π(6), π(2) > π(4), π(3) > π(4), π(3) > π(5), π(3) > π(6), π(5) > π(6). Hence, Rπ = {(1, 2), (1, 4), (1, 6), (2, 4), (3, 4), (3, 5), (3, 6), (5, 6)}, and we obtain a non-repeating braid shown in Figure 15. Definition 7.46. [136, Proposition 9.1.8] For any two permutations π1 , π2 ∈ Sn , and corresponding simple braids Rπ1 , Rπ2 , we define Rπ1 ∧ Rπ2 as follows: Rπ1 ∧ Rπ2 = {(i, k) ∈ Rπ1 ∩ Rπ2 | (i, j) ∈ Rπ1 ∩ Rπ2 or (j, k) ∈ Rπ1 ∩ Rπ2 for all j with i < j < k}. Roughly speaking, the set Rπ1 ∩ Rπ2 should satisfy condition (ii) of Lemma 7.42, otherwise we have to put Rπ1 ∧  Rπ2 = Rε , here  ε is the identity 1 2 ... n element of Sn , i.e., the permutation . Example 7.48 can 1 2 ... n help to understand this concept (see also [136, p.185]). Let us consider two partial orderings [136, Definition 9.1.12] in the semigroup Bn+ , defined as follows: if ab = c for a, b, c ∈ Bn+ , we say that a is a head and b is a tail of c, and we write H(c) = a, T (c) = b.

page 246

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

7.3. GROUP EXTENSIONS

247

For any Rπ ∈ Div(Δn ) and for any τ ∈ Sn , we denote by τ Rπ the image of a pair under a permutation; defined by taking the image of each component and reordering, if necessary, so the smaller number comes first. It follows that Rπ−1 = πRπ . Now we can present Thurston’s automaton T having the set of states Div(Δn ). A characteristic feature of T is that, after a word Rπ is input, the state of T is the maximal tail of Rπ that lies in Div(Δn ). Definition 7.47. [136, Proposition 9.2.1] If T is in a state Rπ ∈ Div(Δn ) and we input a word Rτ ∈ Div(Δn ), then the resulting state is T(Rπ , Rτ ) := H(Rπ Rτ ) · T (Rπ Rτ ), where H(Rπ Rτ ) = Rπ π −1 (¬πRπ ∧ Rτ ), T (Rπ Rτ ) = Rτ \ (¬πRπ ∧ Rτ ), the element T (Rπ Rτ ) is the maximal tail of Rτ that gives an element of Div(Δn ) when Rπ is multiplied on the right by Rτ . Roughly speaking, the Thurston automaton works in the following way. Suppose we have two simple braids Rπ and Rτ , and we want to rewrite the word Rπ Rτ to a canonical form (see below). The braid Rπ looks for a new crossing of the braid Rτ , and if the braid Rτ allows the taking of this crossing (i.e., if there is a decomposition Rτ = Rτ  Rτ  such that Rτ  is a braid which exactly contains the needed crossing for Rπ ), then the braid Rπ takes this crossing and so on, in other words the braid Rπ is hungry and greedy for new crossing every time. Example 7.48. Let us consider the following simple braids Rπ and Rτ (see Fig. 16), where     1 2 3 4 5 1 2 3 4 5 π= , τ= . 3 4 5 2 1 4 2 5 1 3 We have Rπ = {(1, 4), (1, 5), (2, 4), (2, 5), (2, 5), (3, 4), (3, 5), (4, 5)} Rτ = {(1, 2), (1, 4), (1, 5), (2, 4), (3, 4), (3, 5)}, and we obtain πRπ = {(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5)}, ¬πRπ = {(3, 4), (3, 5), (4, 5)}.

page 247

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

248

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 7. (CO)HOMOLOGY AND GROBNER–SHIRSHOV BASIS

1 2 3 4 5

1 2 3 4 5

1 2 3 4 5

1 2 3 4 5

Figure 16. Simple braids Rπ (left), Rτ (right). Then ¬πRπ ∩ Rτ = {(3, 4), (3, 5)}. Hence, by Lemma 7.42, ¬πRπ ∧ Rτ = ¬πRπ ∩ Rτ and it follows that H(Rπ Rτ ) = Rπ ∪ π −1 (¬πRπ ∩ Rτ ) = {(1, 2), (1, 3), (1, 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)}, T (Rπ Rτ ) = Rτ \ (¬πRπ ∧ Rτ ) = {(1, 2), (1, 4), (1, 5), (2, 4)}, see Figure 17. As we have seen in the previous Example the essence of the Thurston automaton is the feeding of a braid by new crossings. It follows that there exist “the most hungry” braid and the full one. The first one is obviously the braid Rε , and the second one is called Garside braid and dented by ◦ Δn . It can be described physically by rotating the   n strands together 180 1 2 ... n clockwise. We have p(Δn ) = . One representative for n n − 1 ... 1 Δn , shown in Figure 18, is Δn := σ1 (σ2 σ1 ) · · · (σn−1 · · · σ1 ). Definition 7.49. [136, Theorem 9.2.2, Proposition 9.2.3] A braid Rw ∈ Bn+ is in the left greedy normal form if it has a decomposition Rw = Rπ1 Rπ2 · · · Rπ , where each Rπi ∈ Div(Δn ) is a simple braid and ¬πi Rπi ∧ Rπi+1 = Rε , for any 1 ≤ i < . Geometrically, if two strands that are adjacent at the boundary of Rπi and Rπi+1 cross in Rπi+1 , they also cross in Rπi . For example, the right greedy normal form of the braid Rπ Rτ is shown at the right in Figure 17. Indeed, if we look at the boundary of these braids,

page 248

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

7.3. GROUP EXTENSIONS

249

1 2 3 4 5

1 2 3 4 5

1 2 3 4 5

1 2 3 4 5

Figure 17. The “feeding process” is shown here. We see (at the left) that the lower braid (Rτ ) has crossings (look at the dashed circle) for the upper one (Rπ ), i.e., the lower braid can “feed” the upper one by giving to it these crossing. At the right we see the result of the feeding; the upper braid is exactly the braid H(Rπ Rτ ) and the lower one T (Rπ Rτ ). 1 2 3 4 5

1 2 3 4 5 Figure 18. The Garside braid Δn = σ1 (σ2 σ1 )(σ3 σ2 σ1 )(σ4 σ3 σ2 σ1 ).

page 249

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

250

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 7. (CO)HOMOLOGY AND GROBNER–SHIRSHOV BASIS

then we see that if any two strands cross in the lower braid, they also cross in the upper one. To find out more about Δn , we look at a semigroup automorphism of Bn+ called a flip, which takes each generator σi to σi := σn−i . The name is justified, because the image of a braid under this automorphism is indeed obtained by flipping this braid around the horizontal axis. It is also easy to see that ¬πRπ ∧ Δn = ¬πRπ for any Rπ ∈ Div(Δn ), in other words, the braid Δn can feed any simple braid and the feeding procedure looks like completing the braid Rπ to the $π (see Fig. 19). In particular braid Δn , meanwhile Δn is transformed to R we have (7.26) σi Δn = Δn σn−i , for all 1 ≤ i ≤ n − 1 and hence Δ2n is central in Bn+ (see [136, Proposition 9.1.16]). 1 2 3 4 5

1 2 3 4 5

1 2 3 4 5

1 2 3 4 5

Figure 19. The braid Rπ , π = (13254) is fed by Δ5 ; the Rπ , takes all missing crossings and Rπ is transformed to $π . It can be Δ5 , meanwhile, Δ5 is transformed to the R $ written as Rπ Δ5 = Rπ Δ5 (σ2 σ4 Δ5 = Δ5 σ1 σ3 ). Historical remark 2. As is well known (see [130] for details), Garside [145] solved the Conjugacy Problem for the braid group Bn by introducing

page 250

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

7.3. GROUP EXTENSIONS

251

a submonoid Bn+ and a distinguished element Δn of Bn+ (in [136] this element is denoted by Ωn because it corresponds to the last (in some sense) permutation ω which sends {1, 2, . . . , n} to {n, n − 1, . . . , 1}) that he called fundamental, and showing that every element of Bn can be expressed as a + fraction of the form Δm n w, with m being an integer and w ∈ Bn . Although F. Garside was very close to such a decomposition when he proved that the greatest common divisors exist in Bn+ , the result did not appear in his work explicitly, and it seems that the first instances of such distinguished decompositions, or normal forms, go back to the 1980’s, to the independent works by S. Adjan [5], M. El Rifai and H. Morton [135], and W. Thurston (circulated notes [297], later appearing as Chapter IX of the book [136] by D. Epstein et al.). The normal form was soon used to improve Garside’s solution of the Conjugacy Problem [135] and, extended from the monoid to the group, to serve as a paradigmatic example in the then emerging theory of automatic groups due to J. Cannon, W. Thurston, and others. Sometimes called the greedy normal form or Garside normal form, or Thurston normal form, it became a standard tool in the investigation of braids and Artin–Tits monoids and groups from a viewpoint of geometric group theory and of the theory of representations. For more details we refer the reader to [130]. Here we show that the greedy normal form can be obtained as a Gr¨ obner–Shirshov normal form. We start with a normal form of a symmetric group presented in Coxeter presentation. Proposition 7.50. Let S = {s1 , . . . , sn−1 } be the set of generators (transpositions) of the symmetric group Sn . Set si > sj whenever i > j and consider the corresponding deg-lex ordering > on S ∗ . A Gr¨ obner–Shirshov basis for the symmetric group Sn , with respect to the order >, is the following set of relations: s2i = 1, for 1 ≤ i ≤ n, si sj = sj si , for i − j ≥ 2 and 1 ≤ i, j ≤ n, si+1 si si−1 · · · sj si+1 = si si+1 si si−1 · · · sj , if i + 1 ≥ j and 1 ≤ i, j ≤ n. Proof. The proof is straightforward.



As a consequence we obtain the following Corollary 7.51 (See also [68]). The set N (Sn ) := {s1i1 s2i2 · · · snin | obner–Shirshov normal forms for Sn in the ik ≤ k + 1} consists of Gr¨ generators si := (i, i + 1) relative to the deg-lex ordering, where sαβ := sβ sβ−1 · · · sα for β ≥ α and sβ,β+1 := 1.

page 251

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

252

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 7. (CO)HOMOLOGY AND GROBNER–SHIRSHOV BASIS

Take π ∈ Sn with the normal form n(π) = s1i1 s2i2 · · · snin ∈ N (S) and set |(π)| := |n(π)|. Define the following group  Bn := Rn(π) , π ∈ Sn | Rn(π) · Rn(τ ) := Rn(πτ ) ,  whenever |n(π · τ )| = |n(π)| + |n(τ )| . Proposition 7.52. The group Bn is isomorphic to the braid group Bn . Proof. Indeed, define θ : Bn → Bn , 

θ : Bn → Bn ,

σi → Rsi , Rn(π) → n(π)|si →σi ,

for any 1 ≤ i ≤ n − 1, and n(π)|si →σi means a substituting σi for si in the word n(π). It is clear that these mappings are homomorphisms satisfying θ ◦ θ = idBn and θ ◦ θ = idBn .  Set π ⊥ τ if |n(πτ )| = |n(π)| + |n(τ )|. Proposition 7.53. Let π, τ, ξ ∈ Sn . If |n(πτ ξ)| = |n(π)| + |n(τ )| + |n(ξ)|, then π ⊥ τ ⊥ ξ, π ⊥ τ ξ and πτ ⊥ ξ. Proof. By the associative law (πτ )ξ = π(τ ξ), the statement follows easily.  Corollary 7.54. Let π, τ, ξ ∈ Sn . If πτ ⊥ ξ and π ⊥ τ , then π ⊥ τ ξ and τ ⊥ ξ. Proof. By πτ ⊥ ξ, and π ⊥ τ , |n(πτ ξ)| = |n(π)| + |n(τ )| + |n(ξ)|, and using Proposition 7.53 we complete the proof.  As above we assume that s1 < s2 < · · · < sn−1 . Next, define Rn(π) ≤ Rn(τ ) if and only if |n(π)| > |n(τ )| or |n(π)| = |n(τ )| and n(π) x−1 > y > y −1 > ζ > ζ −1 and consider the corresponding deg-lex order on the free monoid generated by x, x−1 , y, y −1 , ζ, ζ −1 . It is easy to verify that the set of relations xx−1 = 1,

x−1 x = 1,

ζζ −1 = 1,

ζ −1 ζ = 1,

yy −1 = 1, xζ = ζx,

y −1 y = 1, yζ = ζy

is a Gr¨ obner–Shirshov basis of N . Next, consider the symmetric group S3 in the Coxeter presentation S3 = s1 , s2 | s21 = 1, s22 = 1, s1 s2 s1 = s2 s1 s2 , it is also easy to see that if we set s1 > s2 and consider the corresponding deg-lex order on the free monoid generated by s1 , s2 then all relations in the presentation form exactly a Gr¨ obner–Shirshov basis of S3 .

page 254

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

7.3. GROUP EXTENSIONS

255

Define a map f : S3 × S3 → N as follows: f (s1 , s2 ) = f (s1 , s2 s1 ) = f (s2 , s1 ) = f (s2 , s1 s2 ) = f (s1 s2 , s1 ) = f (s2 s1 , s2 ) = 1, f (s1 , s1 ) = f (s1 , s1 s2 ) = f (s1 , s1 s2 s1 ) = f (s1 s2 , s1 s2 ) = f (s1 s2 s1 , s2 ) = x−1 y −1 ζ −1 , f (s2 , s2 ) = f (s2 , s2 s1 ) = f (s2 , s1 s2 s1 ) = f (s1 s2 s1 , s1 ) = x, f (s1 s2 , s2 ) = x−1 yx, f (s1 s2 , s2 s1 ) = f (s1 s2 , s1 s2 s1 ) = x−1 ζ −1 , f (s2 s1 , s1 ) = y, f (s2 s1 , s1 s2 ) = f (s2 s1 , s1 s2 s1 ) = f (s1 s2 s1 , s1 s2 ) = yx, f (s1 s2 s1 , s1 s2 s1 ) = ζ −1 . Further, define a map ϕ : S3 → Aut(N ), {S3 & π → ϕπ } as follows ϕ1 (x) = x, −1

ϕs1 (x) = x

ϕs2 (x) = x,

ϕ1 (y) = y, yx,

ϕ1 (ζ) = ζ,

ϕs1 (y) = x,

ϕs1 (ζ) = x−1 y −1 ζyx,

ϕs2 (y) = y −1 ζ −1 x−1 ,

ϕs2 (ζ) = ζ,

and we then set ϕτ π (−) := f (τ, π)−1 ϕτ (ϕπ (−))f (τ, π), for any τ, π ∈ S3 . It is easy to verify that ϕπ , π ∈ S3 are automorphisms of P and the maps ϕ, f satisfy the identities (7.22) – (7.25). By Theorem 7.37, a group E3 generated by (1, π), (x, π), (y, π), (ζ, π), where π ∈ S3 and is defined by the relations (u, π) · (v, τ ) := (nN (uϕπ (v)f (π, τ )), nS3 (πτ )), is an extension of N by S3 and by Theorem 7.36, another extension (with the same “outer” action ϕ) is equivalent to E3 because of H 2 (S3 , Z) = 0 (see Example 7.33). We claim that E3 is isomorphic to the braid group B3 . To prove this claim we recall some basic facts on pure braid groups, we refer the reader to [19, 230] for more details. Let us consider the braid group Bn , for n ≥ 2, on n-strings generated by the set {σ1 , . . . , σn−1 } and defined by relations σi σj = σj σi , for |i − j| > 1, σi σi+1 σi = σi+1 σi σi+1 , for i = 1, 2, . . . , n − 2. The subgroup of Bn generated by the elements −1 −1 −1 · · · σj−2 σj−1 Ai,j := σj−1 σj−2 · · · σi+1 σi2 σi+1

page 255

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

256

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 7. (CO)HOMOLOGY AND GROBNER–SHIRSHOV BASIS

where 1 ≤ i < j ≤ n, is called the pure braid group, and denoted by Pn . Further, Pn is defined by the relations Ai,k Ai,j Ak,j = Ak,j Ai,k Ai,j , Ap,j Ak,p Ak,j = Ak,j Ap,j Ak,p , (Ak,p Ak,j A−1 k,p )Ai,p

=

p < j,

Ai,p (Ak,p Ak,j A−1 k,p ),

Ak,j Ai,p = Ai,p Ak,j ,

i < k < p < j,

k < i < p < j or p < k.

For n = 3 we have B3 = σ1 , σ2 | σ1 σ2 σ1 = σ2 σ1 σ2 , the group P3 is generated by A1,2 = σ12 and A1,3 = σ2 σ12 σ2−1 , A2,3 = σ22 and is defined by just the following two relations A1,2 A1,3 A2,3 = A2,3 A1,2 A1,3 , A2,3 A1,2 A1,3 = A1,3 A2,3 A1,2 , i.e., we obtain A1,2 A1,3 A2,3 = A2,3 A1,2 A1,3 = A1,3 A2,3 A1,2 . By the definition of A1,2 , A1,3 , A2,3 , and by σ1 σ2 σ1 = σ2 σ1 σ2 , A1,2 A1,3 A2,3 = σ12 σ2 σ12 σ2−1 σ22 = σ12 σ2 σ12 σ2 = σ1 (σ1 σ2 σ1 )σ1 σ2 = σ1 (σ2 σ1 σ2 )σ1 σ2 = (σ1 σ2 σ1 )(σ2 σ1 σ2 ) = Δ23 . We have already shown, see (7.26), that Δ2n is in the center of Bn+ , using then the given relations in P3 we obtain: A1,2 Δ23 = Δ23 A1,2 , A2,3 Δ23 = Δ23 A2,3 . The latter implies P3 ∼ = Z × F2 , where ζ := Δ−2 3 is a generator of the free cyclic group Z, F2 is the free group generated by x := A2,3 , and y := A1,3 . Hence N ∼ = P3 . Next, by a straightforward verification one can check easily that the maps ϕ : S3 → Aut(P3 ) and f : S3 × S3 → P3 are given by ϕπ (A) = Rπ ARπ−1 ,

−1 f (π, τ ) = Rπ Rτ Rπτ ,

where Rπ is a Thurston generator of B3 . Define a map ψ : E → B3 as follows (u, π) → uRπ . Using the previous formulas for ϕ and f , we obtain ψ((u, π)) · ψ((v, τ )) = uRπ · vRτ = u(Rπ v)Rτ = uϕπ (v)Rπ Rτ = uϕπ (v)f (π, τ )Rπτ = ψ((uϕπ (v)f (π, τ ), πτ )) = ψ((u, π) · (v, τ )), therefore ψ is a homomorphism. Moreover ψ is an isomorphism, indeed, ψ −1 is defined by ψ −1 (Rπ ) := (1, π). Thus, E3 ∼ = B3 , as claimed.

page 256

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

7.4. (SINGULAR) EXTENSIONS OF ALGEBRAS

257

Consider now the general case. Define maps f : Sn × Sn → Pn and ϕ : Sn → Aut(Pn ) as follows: −1 , f (π, τ ) := Rπ Rτ Rπτ

ϕπ (A) := Rπ ARπ−1 .

It is clear that ϕπ ∈ Aut(Pn ). Further, recall that the homomorphism p : Bn → Sn is given by p(Rπ ) = π, for any π ∈ Sn , then from Rπ Rτ = f (π, τ )Rπτ we have πτ = p(f (π, τ ))πτ , hence f (π, τ ) ∈ Ker(p), i.e., f (π, τ ) ∈ Pn for any π, τ ∈ Sn . We have −1 −1 f (π, τ )f (πτ, μ) = Rπ Rτ Rπτ · Rπτ Rμ Rπτ μ −1 = Rπ Rτ Rμ Rπτ μ,

and −1 ϕπ (f (τ, μ))f (π, τ μ) = ϕπ (Rτ Rμ Rτ−1 μ )Rπ Rτ μ Rπτ μ −1 −1 = Rπ Rτ Rμ Rτ−1 μ Rπ Rπ Rτ μ Rπτ μ −1 = Rπ Rτ Rμ Rπτ μ,

hence f satisfies (7.23). Thus, a braid group Bn can be presented by the following set of generators {ARπ | A ∈ Pn and π ∈ Sn } and is determined by the following relations: ARπ · BRτ := Aϕπ (B)f (π, τ )Rπτ , −1 where Rπ , for π ∈ Sn , are Thurston generators, f (π, τ ) := Rπ Rτ Rπτ , and −1 ϕπ (B) := Rπ BRπ .

7.4. (Singular) Extensions of Algebras Let A, E, Λ be algebras over a field F. An algebra E is called an extension of the algebra A by Λ if A2 = 0, A is an ideal of E and E/A ∼ = Λ as algebras. In this section, by using Gr¨obner–Shirshov bases, we characterize completely the extensions of A by Λ. We start with discussions of derivations and we by using Composition– Diamond lemma we show how to describe all derivations of an algebra. 7.4.1. Invariants and Derivations. Recall that nth Hochschild cohomology module of an K-algebra Λ with coefficients in a Λ-bimodule M is the K-module n (Λ, Λ), M )), H n (Λ, M ) := H n (HomK (B

for n = 0, 1, . . . ,

where the elements of the complex are K-linear functions f : Λ × ···× Λ → M  n

page 257

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

258

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 7. (CO)HOMOLOGY AND GROBNER–SHIRSHOV BASIS

such that f (λ1 , . . . , λn ) = 0 whenever one λi belongs to K. The coboundary dn f is the function given by dn f (λ1 , . . . , λn+1 ) = λ1 f (λ2 , . . . , λn+1 ) +

n 

(−1)n f (λ1 , . . . , λi λi+1 , . . . , λn+1 )

i=1

+ (−1)n+1 f (λ1 , . . . , λn )λn+1 . In particular, a zero-cochain is a constant m ∈ M ; its coboundary is the function adm : Λ → M which is given by adm (λ) := mλ − λm, for every λ ∈ Λ. Hence H 0 (Λ, M ) := {m ∈ M | λm = mλ, for all λ ∈ Λ}, this is also called K-module of invariants. Similarly, a 1-cocycle is a K-module homomorphism D : Λ → M satisfying the identity D(λλ ) = D(λ)λ + λD(λ ),

for λ, λ ∈ Λ;

(the Leibniz product rule) such a function D is called a derivation of Λ. In particular, if M = Λ. As is well known, every element λ ∈ Λ determines the inner derivation adλ given by adλ (a) = λa − aλ, for a ∈ Λ. Denote by ad(Λ) the K-module of inner derivations of Λ. Thus the Kmodule HH 1 (Λ) := Der (Λ)/ ad(Λ) is the module of outer derivations of Λ. Example 7.60 (Derivations of the quantum plane). The quantum plane is defined by the F-algebra Aq2|0 := Fx, y | xy = q −1 yx, where q ∈ F and q = 0 is fixed. Set x > y and consider the corresponding deg-lex order on the free monoid generated by x, y. It is clear that {xy = q −1 yx} is a Gr¨obner– 2|0 Shirshov basis of Aq and hence the set ∪n,m≥0 {y n xm } is an F-basis of  2|0 2|0 Aq , and we can write D(w) = n,m≥0 Fn,m (α)y n xm , for every w ∈ Aq 2|0

where Fn,m : Aq → F are functionals and almost all Fn,m (α) are zero. 2|0 Conversely, let {Fn,m }n,m∈N be a family of functionals Fn,m : Aq → F such that almost all Fn,m (w) are zero, then this family determines the  2|0 2|0 linear map D(·) = n,m≥0 Fn,m (·)y n xm : Aq → Aq . We may ask when 2|0

a family of functionals determines a derivation of Aq .

page 258

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

7.4. (SINGULAR) EXTENSIONS OF ALGEBRAS

259

We have 

D(x)y =

n,m≥0



xD(y) =



Fn,m (x)y n xm y =

1 Fn,m (x)y n+1 xm , qm

n,m≥0



Fn,m (y)xy n xm =

n,m≥0

1 Fn,m (y)y n xm+1 , qn

n,m≥0

then D(x)y + xD(y) =

   1 1 F (x) + F (y) y n xm n−1,m n,m−1 qm qn n,m≥1   + Fn−1,0 (x)y n + F0,m−1 (y)xm . n≥1

m≥1

Further, we have D(y)x =



Fn,m (y)y n xm x =

n,m≥0

yD(x) =



D(y)x + yD(x) =

Fn,m (y)yy n xm =



Fn,m (y)y n+1 xm ,

n,m≥0



Fn,m (y)y n xm+1 +

n,m≥0

=

Fn,m (y)y n xm+1 ,

n,m≥0

n,m≥0

hence





Fn,m (y)y n+1 xm

n,m≥0

(Fn,m−1 (y) + Fn−1,m (x))y n xm

n,m≥1

+





F0,m−1 (y)xm +

m≥1



Fn−1,0 (x)y n .

n≥1

2|0

Let D be a derivation of Aq , then D(x)y + xD(y) = D(xy) =

1 1 1 D(yx) = D(y)x + yD(x), q q q

which implies that 1 1 1 Fn−1,m (x) + n Fn,m−1 = (Fn,m−1 (y) + Fn−1,m (x)) , for n, m ≥ 1, qm q q 1 Fn−1,0 (x) = Fn−1,0 (x), q 1 F0,m−1 (y) = F0,m−1 (y). q

page 259

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

260

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 7. (CO)HOMOLOGY AND GROBNER–SHIRSHOV BASIS

From the last two equations it follows that if q = 1 then Fn−1,0 (x) = 0 and F0,m−1 (y) = 0, for all n, m ≥ 1. By the first equation,     1 1 1 1 − − (x) = Fn,m−1 (y), F n−1,m qm q q qn and we obtain Fn,m−1 (y) = −

q n 1 − q m−1 · Fn−1,m (x), q m 1 − q n−1

for n, m ≥ 1.

2|0

Thus the vector space DerF (Aq ) is generated by the following set of derivations ∪n,m≥0 {Dn,m };  k · y n xm , if m > 0, Dn,m (x) = 0, if m = 0, ⎧ n+1 1 − q m−1 ⎨− q · · k · y n+1 xm−1 , if n > 0, Dn,m (y) = qm 1 − qn ⎩ 0, if n = 0, where q = 1 and k ∈ F is fixed. To calculate the vector space of outer derivations of the quantum plane 2|0 (i.e., the first Hochschild cohomology HH 1 (Aq )) we have to factorize 2|0 DerF (Aq ) by the module of all inner derivations. We leave the details to the reader. 7.4.2. The second cohomology of algebras. As in the case of groups, H 2 (Λ, M ) can be interpreted in terms of extensions by the algebra Λ. Definition 7.61. An extension by the algebra Λ is an epimorphism π : E → Λ of algebras. Let π : E → Λ be an extension by the algebra Λ. The kernel J := Ker(π) of π is a two-sided ideal in E, hence an E-bimodule. For each n, let Jn denote the F-submodule of E generated by all products j1 j2 · · · jn of n factors ji ∈ J. Then J = J1 ⊇ J2 ⊇ J3 ⊇ · · · , and each Jn is a two-sided ideal of E. An extension π is said to be cleft if π has an algebra homomorphism ρ : Λ → E as right inverse, i.e., π ◦ ρ = idΛ ; that is, if E contains a subalgebra mapped isomorphically onto Λ by π. An extension π is said to be singular if J := Ker(π) satisfies J2 = 0. Lemma 7.62. In each singular extension π : E → Λ the E-bimodule K may be regarded as a Λ bimodule.

page 260

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

7.4. (SINGULAR) EXTENSIONS OF ALGEBRAS

261

Proof. For π(e) = π(e ) implies e − e ∈ J, so J2 = 0 implies e · j = e · j for each j ∈ J. This defines the left action of each λ := π(e) on j.  Given any Λ-bimodule M, a singular extension of M by Λ is a short exact sequence μ

π

→E− →Λ→0 0→M− where E is an algebra, π a homomorphism of algebras, M is regarded as a E-bimodule by pullback along π, and μ : M → E is a monomorphism of E-bimodules. For given M and Λ, two such extensions are congruent if there is an algebra homomorphism ϕ : E → E such that the following diagram is commutative >EA || AAA π | | AA || AA || /M ϕ Λ BB ~> BB ~ ~ BB ~~ BB μ  ~~ π E μ

0

/0

Let A, Λ be F-algebras, A2 = 0 and A a Λ-bimodule. Let f : Λ × Λ → A be an F-bilinear map. Next, let I be a linearly set, and G := {gi }i∈I an F-basis of A, Λ = FY | R, where R = {λ − N (λ)} is a minimal Gr¨ obner–Shirshov basis for Λ and λ the leading term of the polynomial pλ = λ − N (λ) ∈ FY , where N (λ) is a Gr¨obner–Shirshov normal form of the λ. Let us set E := FG × Y | S, where S consists of the relations (gi , γ) · (gj , γ  ) = (gi γ  + γgj + f (γ, γ  ), N (γγ  )), for all gi , gj ∈ G and γ, γ  ∈ Y , and let us order the set G × Y by the deg-lex order. Theorem 7.63. The set S is a Gr¨ obner–Shirshov basis for E if and only if there the function f : Λ × Λ → A satisfies the following condition: λ · f (μ, η) − f (λμ, η) + f (λ, μη) − f (λ, μ) · η = 0,

(7.27)

for every λ, μ, η ∈ Λ. Moreover, if this is the case, E is a singular extension of A by Λ in a natural way. Proof. (⇐) Suppose that E is an extension of the algebra A by Λ. For any gi , gj , gk ∈ G and for any γ, γ  , γ  ∈ Y , we have to prove that all possible

page 261

May 26, 2020

17:1

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

262

9287 – Gr¨ obner-Shirshov Bases

textbook

¨ 7. (CO)HOMOLOGY AND GROBNER–SHIRSHOV BASIS

compositions related to ambiguities w = (gi , γ)(gj , γ  )(gk , γ  ) are trivial modulo S. Indeed, on the one hand,   w = (gi , γ)(gj , γ  ) (gk , γ  ) = (gi γ  + γgj + f (γ, γ  ), N (γγ  ))(gk , γ  ) = (gi N (γ  γ  ) + γgj γ  + f (γ, γ  )γ  + N (γγ  )gk + f (N (γγ  ), γ  ), N (N (γγ  )γ  )), and

  w = (gi , γ) (gj , γ  )(gk , γ  ) = (gi , γ)(gj γ  + γ  gk + f (γ  , γ  ), N (γ  γ  )) = (gj N (γ  γ  ) + γgj γ  + N (γγ  )gk + γf (γ  , γ  ) + f (γ, N (γ  γ  )), N (γN (γ  γ  )),

and thus the possible compositions are related to ambiguities w are trivial modulo S because of R is a Gr¨obner–Shirshov basis of Λ and then N (N (γγ  )γ  ) = N (γN (γ  γ  ), and f satisfies condition (7.27). (⇒) Conversely, if S is a Gr¨ obner–Shirshov basis for E, then, by noting that for (0, γ)(0, γ  )(0, γ  ), the compositions are trivial modulo S, we can easily check that the condition (7.27) holds in A = (A + Id(S))/ Id(S). Then by the Composition-Diamond lemma, A ∼ = A. Moreover, let us assume that S is a Gr¨obner–Shirshov bases for E. Then, by the Composition-Diamond lemma, each element e ∈ E can be written uniquely as e = (a, θ) + (0, λ) = (a, λ), where a ∈ A and λ ∈ Λ, and θ is the zero element of Λ. Hence, E ∼ = A ⊕ Λ as vector spaces with the following multiplication: for any a, b ∈ A and λ, μ ∈ Λ, (a, λ) · (b, μ) := (aμ + λb + f (λ, μ), λμ). This implies that A is an ideal of E and E/A ∼ = Λ as algebras. This completes the proof.  Throughout the subsequent examples A means an F-algebra such that A2 = 0 and A is a Λ-bimodule. Example 7.64. Let Λ = Fx | xn = ϕ(x) be a cyclic algebra, where n is a natural number and ϕ(x) is a polynomial of degree less than n such that ϕ(0) = 0. Define f : Λ × Λ → A by f (xp , xq ) = f ∈ A for all 1 ≤ p, q ≤ n and where f is a fixed element of A. Then, by Theorem 7.63, E is isomorphic to an extension of A by Λ if and only if there exist f ∈ A such

page 262

Downloaded from www.worldscientific.com by UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA on 09/12/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

May 26, 2020

17:1

9287 – Gr¨ obner-Shirshov Bases

textbook

7.4. (SINGULAR) EXTENSIONS OF ALGEBRAS

263

that fx = xf and E = Fb1 , . . . , bk , . . . , x | S, where {b1 , . . . , bk , . . .} is an F-basis of A, and S = {xn = ϕ(x) + f}. Example 7.65. Let Λ = F[x1 , . . . , xn ] be the free associative commutative algebra generated by x1 , . . . , xn . Set xi > xj whenever i < j and consider the corresponding deg-lex order. It is clear that  {xi xj − xj xi } 1≤i xq whenever p < q and consider the corresponding deg-lex order. It is clear that   n   i kpq xi xp xq − xq xq − 1≤p