Handbook of Biochemistry: Section A Proteins, Volume II (ISBN 9781351080859, 1351080857, 978-1-138-59688-7, 978-0-429-48737-8)

This first volume contains data on amino acids which consists of the coefficients of solubility in water, heat capacitie


English Pages [443] Year 2018


Table of contents:
Cover
Half Title
Title
Copyright
Advisory Board
Preface
The Editor
Table Of Contents
Nomenclature
Chapter 1 Biochemical Nomenclature
Chapter 2 Nomenclature Of Labeled Compounds
Chapter 3 The Citation Of Bibliographic References In Biochemical Journals
Chapter 4 IUPAC Tentative Rules For The Nomenclature Of Organic Chemistry Section E. Fundamental Stereochemistry
Chapter 5 Abbreviations And Symbols For The Description Of The Conformation Of Polypeptide Chains Tentative Rules (1969)
Chapter 6 A One-letter Notation For Amino Acid Sequences Tentative Rules
Chapter 7 Symbols For Amino-acid Derivatives And Peptides Recommendations (1971)
Chapter 8 Abbreviated Nomenclature Of Synthetic Polypeptides (Polymerized Amino Acids) Revised Recommendations (1971)
Chapter 9 Structures And Symbols For Synthetic Amino Acids Incorporated Into Synthetic Polypeptides
Amino Acids
Chapter 10 Data On The Naturally Occurring Amino Acids
Chapter 11 α,β-Unsaturated Amino Acids
Chapter 12 Amino Acid Antagonists
Chapter 13 Properties Of The α-Keto Acid Analogs Of Amino Acids
Chapter 14 Far Ultraviolet Absorption Spectra Of Amino Acids
Chapter 15 UV Absorption Characteristics Of N-Acetyl Methyl Esters Of The Aromatic Amino Acids, Cystine And Of N-Acetylcysteine
Chapter 16 Numerical Values Of The Absorbances Of The Aromatic Amino Acids In Acid, Neutral And Alkaline Solutions
Chapter 17 Ultraviolet Spectra Of Derivatives Of Cysteine, Cystine, Histidine, Phenylalanine, Tyrosine, And Tryptophan
Chapter 18 Luminescence Of The Aromatic Amino Acids
Chapter 19 Luminescence Of Derivatives Of The Aromatic Amino Acids
Chapter 20 Luminescence Of Proteins Lacking Tryptophan
Chapter 21 Luminescence Of Proteins Containing Tryptophan
Chapter 22 Covalent Protein Conjugates
Chapter 23 Hydrophobicities Of Amino Acids And Proteins
Chapter 24 Specific Rotatory Dispersion Constants For 0.1 M Amino Acid Solutions
Chapter 25 Circular Dichroism (CD) Spectra Of Metal Complexes Of Amino Acids And Peptides
Chapter 26 Errors Of Amino Acid Metabolism
Chapter 27 Errors Of Organic Acid Metabolism
Chapter 28 Free Amino Acids In Amniotic Fluid In Early Pregnancy (13 To 18 Weeks) And At Term
Chapter 29 Free Amino Acids In Blood Plasma Of Newborn Infants And Adults
Peptides And Polypeptides
Chapter 30 A List Of Sequential Polypeptides, Their Method Of Preparation And Product Analysis
Chapter 31 Peptides Prepared By Solid Phase Peptide Synthesis
Chapter 32 Poly(α-Amino Acids), Their Solubility And Susceptibility To Enzymatic Activities
Index


Handbook of Biochemistry and Molecular Biology
3rd Edition

Proteins, Volume I

EDITOR

Gerald D. Fasman, Ph.D., Rosenfield Professor of Biochemistry, Graduate Department of Biochemistry, Brandeis University, Waltham, Massachusetts

Boca Raton London New York

CRC Press is an imprint of the Taylor & Francis Group, an informa business

First published 1976 by CRC Press, Taylor & Francis Group, 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742. Reissued 2018 by CRC Press. © 1976 by CRC Press, Inc. © 1970, 1968 by The Chemical Rubber Co. CRC Press is an imprint of Taylor & Francis Group, an Informa business.

No claim to original U.S. Government works.

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Publisher's Note: The publisher has gone to great lengths to ensure the quality of this reprint but points out that some imperfections in the original copies may be apparent.

Disclaimer: The publisher has made every effort to trace copyright holders and welcomes correspondence from those they have been unable to contact.
ISBN 13: 978-1-138-59688-7 (hbk) ISBN 13: 978-0-429-48737-8 (ebk) Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

The following is a list of the four major sections of the Handbook, each consisting of one or more volumes:

Proteins: Amino Acids, Peptides, Polypeptides, and Proteins
Nucleic Acids: Purines, Pyrimidines, Nucleotides, Oligonucleotides, tRNA, DNA, RNA
Lipids, Carbohydrates, Steroids
Physical and Chemical Data, Miscellaneous: Ion Exchange, Chromatography, Buffers, Miscellaneous, e.g., Vitamins

ADVISORY BOARD

Gerald D. Fasman, Editor
Herbert A. Sober (deceased), Consulting Editor

MEMBERS

Bruce Ames, Professor, Department of Biochemistry, University of California, Berkeley, California 94720
Sherman Beychok, Professor, Department of Biological Sciences, Columbia University, New York, New York 10027
Waldo E. Cohn, Senior Biochemist, Biology Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830
Harold Edelhoch, National Institute of Arthritis, Metabolism and Digestive Diseases, Department of Health, Education, and Welfare, National Institutes of Health, Bethesda, Maryland 20014
John Edsall, Professor Emeritus, Biological Laboratories, Harvard University, Cambridge, Massachusetts 02138
Gary Felsenfeld, Chief, Physical Chemistry Laboratory, Laboratory of Molecular Biology, National Institute of Arthritis, Metabolism, and Digestive Diseases, National Institutes of Health, Bethesda, Maryland 20014
Edmond H. Fischer, Professor, Department of Biochemistry, University of Washington, Seattle, Washington 98195
Victor Ginsburg, Chief, Biochemistry Section, National Institute of Arthritis, Metabolism and Digestive Diseases, Department of Health, Education, and Welfare, National Institutes of Health, Bethesda, Maryland 20014
Walter Gratzer, MRC Neurobiology Unit, Department of Biophysics, King's College, University of London, London, England
Lawrence Grossman, Professor, Department of Biochemical and Biophysical Sciences, School of Hygiene and Public Health, The Johns Hopkins University, Baltimore, Maryland 21205
Frank Gurd, Professor, Department of Chemistry, Indiana University, Bloomington, Indiana 47401
William Harrington, Professor, Department of Biology, The Johns Hopkins University, Baltimore, Maryland 21218
William P. Jencks, Professor, Graduate Department of Biochemistry, Brandeis University, Waltham, Massachusetts 02154
O. L. Kline, Executive Officer, American Institute of Nutrition, 9650 Rockville Pike, Bethesda, Maryland 20014
I. M. Klotz, Professor, Department of Chemistry, Northwestern University, Evanston, Illinois 60201
Robert Langridge, Professor, Department of Biochemistry, Princeton University, Princeton, New Jersey 08540
Philip Leder, Chief, Laboratory of Molecular Genetics, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20014
I. Robert Lehman, Professor, Department of Biochemistry, School of Medicine, Stanford University, Stanford, California 94305
Lawrence Levine, Professor, Graduate Department of Biochemistry, Brandeis University, Waltham, Massachusetts 02154
John Lowenstein, Professor, Graduate Department of Biochemistry, Brandeis University, Waltham, Massachusetts 02154
Emanuel Margoliash, Professor, Department of Biological Sciences, Northwestern University, Evanston, Illinois 60201
Julius Marmur, Professor, Department of Biochemistry and Genetics, Albert Einstein College of Medicine, New York, New York 10461
Alton Meister, Professor, Department of Biochemistry, Cornell University Medical College, New York, New York 10021
Kivie Moldave, Professor, Department of Biochemistry, California College of Medicine, University of California, Irvine, California 92664
D. C. Phillips, Professor, Laboratory of Molecular Biophysics, Department of Zoology, Oxford University, Oxford, England
William D. Phillips, The Lord Rank Research Centre, Ranks Hovis McDougall Ltd., Lincoln Road, High Wycombe, Bucks, England
G. N. Ramachandran, Professor, Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
Michael Sela, Professor, Department of Chemical Immunology, The Weizmann Institute of Science, Rehovot, Israel
Waclaw Szybalski, Professor, McArdle Laboratory for Cancer Research, The University of Wisconsin, Madison, Wisconsin 53706
Serge N. Timasheff, Professor, Graduate Department of Biochemistry, Brandeis University, Waltham, Massachusetts 02154
Ignacio Tinoco, Jr., Professor, Department of Chemistry, University of California, Berkeley, California 94720
Bert L. Vallee, Professor, Biophysics Research Laboratory, Peter Bent Brigham Hospital, Harvard Medical School, Boston, Massachusetts 02115

PREFACE

The rapid pace at which new data is currently accumulated in science presents one of the significant problems of today: the problem of rapid retrieval of information. The fields of biochemistry and molecular biology are two areas in which the information explosion is manifest. Such data is of interest in the disciplines of medicine, modern biology, genetics, immunology, biophysics, etc., to name but a few related areas. It was this need which first prompted CRC Press, with Dr. Herbert A. Sober as Editor, to publish the first two editions of a modern Handbook of Biochemistry, which made available unique, in depth compilations of critically evaluated data to graduate students, post-doctoral fellows, and research workers in selected areas of biochemistry.

This third edition of the Handbook demonstrates the wealth of new information which has become available since 1970. The title has been changed to include molecular biology; as the fields of biochemistry and molecular biology exist today, it becomes more difficult to differentiate between them. As a result of this philosophy, this edition has been greatly expanded. Also, previous data has been revised and obsolete material has been eliminated. As before, however, all areas of interest have not been covered in this edition. Elementary data, readily available elsewhere, has not been included. We have attempted to stress the areas of today's principal research frontiers and consequently certain areas of important biochemical interest are relatively neglected, but hopefully not totally ignored.

This third edition is over double the size of the second edition. Tables used from the second edition without change are so marked, but their number is small. Most of the tables from the second edition have been extensively revised, and over half of the data is new material. In addition, a far more extensive index has been compiled to facilitate the use of the Handbook.

To make more facile use of the Handbook because of the increased size, it has been divided into four sections. Each section will have one or more volumes. The four sections are titled:

Proteins: Amino Acids, Peptides, Polypeptides, and Proteins
Nucleic Acids: Purines, Pyrimidines, Nucleotides, Oligonucleotides, tRNA, DNA, RNA
Lipids, Carbohydrates, Steroids
Physical and Chemical Data, Miscellaneous: Ion Exchange, Chromatography, Buffers, Miscellaneous, e.g., Vitamins

By means of this division of the data, we can continuously update the Handbook by publishing new data as they become available.

The Editor wishes to thank the numerous contributors, Dr. Herbert A. Sober, who assisted the Editor generously, and the Advisory Board for their counsel and cooperation. Without their efforts this edition would not have been possible. Special acknowledgments are due to the editorial staff of CRC Press, Inc., particularly Ms. Susan Cubar Benovich, Ms. Sandy Pearlman, and Mrs. Gayle Tavens, for their perspicacity and invaluable assistance in the editing of the manuscript. The editor alone, however, is responsible for the scope and the organization of the tables.

We invite comments and criticisms regarding format and selection of subject matter, as well as specific suggestions for new data (and their sources) which might be included in subsequent editions. We hope that errors and omissions in the data that appear in the Handbook will be brought to the attention of the Editor and the publisher.

Gerald D. Fasman Editor August 1975

PREFACE TO AMINO ACIDS, PEPTIDES, POLYPEPTIDES, AND PROTEINS, VOLUME I

The section of the Handbook of Biochemistry and Molecular Biology on Amino Acids, Peptides, Polypeptides and Proteins is divided into three volumes. The first volume contains information relating to the naturally occurring amino acids, α,β-unsaturated amino acids, amino acid antagonists, and α-keto analogues of amino acids. Data on ultraviolet absorption, fluorescence, optical rotatory dispersion and circular dichroism of amino acids and their derivatives are contained herein. Relevant data on peptide synthesis are included: amino acid derivatives, preparation of sequential polypeptides, solid-phase synthesis, and poly-α-amino acids. The second and third volumes will contain material mainly on proteins.

Although the data, for which the editor alone is responsible, are far from complete, it is hoped these volumes will be of assistance to those working in the field of biochemistry and molecular biology.

Gerald D. Fasman Editor January 1976

THE EDITOR

Gerald D. Fasman, Ph.D., is the Rosenfield Professor of Biochemistry, Graduate Department of Biochemistry, Brandeis University, Waltham, Massachusetts. Dr. Fasman graduated from the University of Alberta in 1948 with a B.S. Honors Degree in Chemistry, and he received his Ph.D. in Organic Chemistry in 1952 from the California Institute of Technology, Pasadena, California. Dr. Fasman did postdoctoral studies at Cambridge University, England, Eidg. Technische Hochschule, Zurich, Switzerland, and the Weizmann Institute of Science, Rehovoth, Israel. Prior to moving to Brandeis University, he spent several years at the Children's Cancer Research Foundation at the Harvard Medical School. He has been an Established Investigator of the American Heart Association, a National Science Foundation Senior Postdoctoral Fellow in Japan, and recently was a John Simon Guggenheim Fellow.

Dr. Fasman is a member of the American Chemical Society, a Fellow of the American Association for the Advancement of Science, Sigma Xi, The Biophysical Society, American Society of Biological Chemists, The Chemical Society (London), the New York Academy of Science, and a Fellow of the American Institute of Chemists. He has published 180 research papers.

The Editor and CRC Press, Inc. would like to dedicate this third edition to the memory of Eva K. and Herbert A. Sober. Their pioneering work on the development of the Handbook is acknowledged with sincere appreciation.

TABLE OF CONTENTS

NOMENCLATURE
Biochemical Nomenclature
Nomenclature of Labeled Compounds
The Citation of Bibliographic References in Biochemical Journals
IUPAC Tentative Rules for the Nomenclature of Organic Chemistry Section E. Fundamental Stereochemistry
Abbreviations and Symbols for the Description of the Conformation of Polypeptide Chains Tentative Rules (1969)
A One-letter Notation for Amino Acid Sequences Tentative Rules
Symbols for Amino Acid Derivatives and Peptides Recommendations (1971)
Abbreviated Nomenclature of Synthetic Polypeptides (Polymerized Amino Acids) Revised Recommendations (1971)
Structures and Symbols for Synthetic Amino Acids Incorporated into Synthetic Polypeptides

AMINO ACIDS
Data on the Naturally Occurring Amino Acids
α,β-Unsaturated Amino Acids
Amino Acid Antagonists
Properties of the α-Keto Acid Analogues of Amino Acids
Far Ultraviolet Absorption Spectra of Amino Acids
UV Absorption Characteristics of N-Acetyl Methyl Esters of the Aromatic Amino Acids, Cystine, and N-Acetylcysteine
Numerical Values of the Absorbances of the Aromatic Amino Acids in Acid, Neutral and Alkaline Solutions
Ultraviolet Spectra of Derivatives of Cysteine, Cystine, Histidine, Phenylalanine, Tyrosine, and Tryptophan
Luminescence of the Aromatic Amino Acids
Luminescence of Derivatives of the Aromatic Amino Acids
Luminescence of Proteins Lacking Tryptophan
Luminescence of Proteins Containing Tryptophan
Covalent Protein Conjugates
Hydrophobicities of Amino Acids and Proteins
Specific Rotatory Dispersion Constants for 0.1 M Amino Acid Solutions
Circular Dichroism (CD) Spectra of Metal Complexes of Amino Acids and Peptides
Errors of Amino Acid Metabolism
Errors of Organic Acid Metabolism
Free Amino Acids in Amniotic Fluid in Early Pregnancy (13 to 18 Weeks) and at Term
Free Amino Acids in Blood Plasma of Newborn Infants and Adults

PEPTIDES AND POLYPEPTIDES
A List of Sequential Polypeptides, Their Method of Preparation and Product Analysis
Peptides Prepared by Solid Phase Peptide Synthesis
Poly(α-Amino Acids), Their Solubility and Susceptibility to Enzymatic Activities

INDEX

Nomenclature


BIOCHEMICAL NOMENCLATURE

This synopsis of the recommendations of the IUPAC-IUB Commission on Biochemical Nomenclature (CBN) was prepared by Waldo E. Cohn, Director, NAS-NRC Office of Biochemical Nomenclature (OBN, located at Biology Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830), from whom reprints of the CBN publications listed below and on which the synopsis is based are available. The synopsis is divided into three sections: Abbreviations, symbols, and trivial names. Each section contains material drawn from the documents (A1 to C1, inclusive) listed below, which deal with the subjects named. Additions consonant with the CBN Recommendations have been made by OBN throughout the synopsis.

RULES AND RECOMMENDATIONS AFFECTING BIOCHEMICAL NOMENCLATURE AND PLACES OF PUBLICATION (AS OF FEBRUARY 1975)

I. IUPAC-IUB Commission on Biochemical Nomenclature
A1. Abbreviations and Symbols [General; Section 5 replaced by A6]
A2. Abbreviated Designation of Amino-acid Derivatives and Peptides (1965) [Revised 1971; Expands Section 2 of A1]
A3. Synthetic Modifications of Natural Peptides (1966) [Revised 1972]
A4. Synthetic Polypeptides (Polymerized Amino Acids) (1967) [Revised 1971]
A5. A One-letter Notation for Amino-acid Sequences (1968)
A6. Nucleic Acids, Polynucleotides, and their Constituents (1970)
B1. (Nomenclature of Vitamins, Coenzymes, and Related Compounds)
    a. Miscellaneous [A, B's, C, D's, tocols, niacins; see B2 and B3]
    b. Quinones with Isoprenoid Side-chains: E, K, Q [Revised 1973]
    c. Folic Acid and Related Compounds
    d. Corrinoids: B-12's [Revised 1973]
B2. Vitamins B-6 and Related Compounds [Revised 1973]
B3. Tocopherols (1973)
C1. Nomenclature of Lipids (1967) [Amended 1970; see also II, 2]
C2. Nomenclature of α-Amino Acids (1974) [See also II, 5]
D1. Conformation of Polypeptide Chains (1970) [See also III, 2]
E1. Enzyme Nomenclature (1972) (a) [Elsevier (in paperback); Replaces 1965 edition.]
E2. Multiple Forms of Enzymes (1971) [Chapter 3 of E1]
E3. Nomenclature of Iron-sulfur Proteins (1973) [Chapter 6.5 of E1]
E4. Nomenclature of Peptide Hormones (1974)

II. Documents Jointly Authored by CBN and CNOC [See III]
1. Nomenclature of Cyclitols (1968) [Revised 1973]
2. Nomenclature of Steroids (1968) [Amended 1971; Revised 1972]
3. Nomenclature of Carbohydrates-I (1969)
4. Nomenclature of Carotenoids (1972) [Revised 1975]
5. Nomenclature of α-Amino Acids (1974) [Listed under I, C2 in the following table]

III. IUPAC Commission on the Nomenclature of Organic Chemistry (CNOC)
1. Section A (Hydrocarbons), Section B (Heterocyclics): J. Am. Chem. Soc., 82, 5545; (a) Section C (Groups containing N, Hal, S, Se/Te): Pure Appl. Chem., 11, Nos. 1-2 (a) [A, B, and C Revised 1969: (a) Butterworth's, London (1971)]
2. Section E (Stereochemistry): (b) J. Org. Chem., 35, 2489 (1970); Biochim. Biophys. Acta, 208, 1 (1970); Eur. J. Biochem., 18, 151 (1970) [See also I, D1]

(a) No reprints available from OBN; order from publisher.
(b) Reprints available from OBN (in addition to all in IA to ID and II).


IV. Physicochemical Quantities and Units (IUPAC) (a)
J. Am. Chem. Soc., 82, 5517 (1960) [Revised 1970: Pure Appl. Chem., 21, 1 (1970)]

V. Nomenclature of Inorganic Chemistry (IUPAC)
J. Am. Chem. Soc., 82, 5523 (a) [Revised 1971: Pure Appl. Chem., 28, No. 1 (1971)] (a)

VI. Drugs and Related Compounds or Preparations
1. U.S. Adopted Names (USAN) No. 10 (1972) and Supplement [U.S. Pharmacopeial Convention, Inc., 12601 Twinbrook Parkway, Rockville, Md.]
2. International Nonproprietary Names (INN) [WHO, Geneva]

CBN RECOMMENDATIONS APPEAR IN THE FOLLOWING PLACES (a)

Documents covered: A1 (f), A2 (Revised), A3 (Revised), A4 (Revised) (g), A5, A6 (h); B1*, B1b (Revised), B1d (Revised), B2 (Revised), B3 (Revised); C1 (f), Amendments, C2; D1*; E2, E3, E4; II,1 (Revised), II,2 (f), Amendments, II,3, II,4, Amendments.

Entries are given as volume, first page, grouped by journal:

Arch. Biochem. Biophys.: 136,1; 150,1(R); 121,6*; 151,597(R); 125(3),i; 145,425; 118,505; 165,1(R); 161(2),iii(R); 162,1(R); 165,6(R); 123,409; 145,405; 147,1; 160,355; 128,269*; 136,13; 147,4
Biochem. J.: 101,1; 126,773(R); 104,17*; 127,753(R); 113,1; 120,449; 102,15; 147,15(R); 147,1(R); 137,417(R); 147,11(R); 105,897; 116(5); 121,577; 126,769; 135,5; 151,1; 112,17*; 113,5; 127,613; 125,673; 127,741; 151,507
Biochemistry: 5,1445; 11,1726(R); 6,362*; 11,942(R); 7,2703; 9,4022; 13,1555(R); 13,1056(R); 14,449; 6,3287; 9,3471; 10,4825; 12,3582; 14,2559; 8,2227; 10,4994; 10,3983; 10,4827; 14,1803
Biochim. Biophys. Acta: 263,205(R); 133,1*; 278,211(R); 168,6; 247,1; 107,1(a-c); 387,397(R); 354,155(R); 152,1; 202,404; 229,1; 165,1*; 164,453; 248,387; 244,223; 286,217; 258,1; 310,295
Eur. J. Biochem.: 1,259; 27,201(R); 1,379*; 26,301(R); 5,151; 15,203; 2,1; 53,15(R); 45,7(R); 40,325(R); 46,217(R); 2,127; 12,1; 53,1; 17,193; 24,1; 35,1; 5,1*; 10,1; 25,2; 21,455; 25,397
J. Biol. Chem.: 241,527; 247,977(R); 242,555*; 247,323(R); 243,3557; 245,5171; 241,2987; 245,4229*; 242,4845; 245,1511; 245,6489; 246,6127; 248,5907; 250,3215; 243,5809*; 247,613; 247,2633
Pure Appl. Chem. (b): 40,(R); 31,649(R); 33,439(R); 31,641; 40,; 38,439; 33,447(R); 37,285(R); 31,285(R)
Biochimie (Bull. Soc.) (c): 50,3; 49,121*; 49,325*; 51,205*; 50,1577; 49,331; 50,1363; 54,123; 51,3*; 51,819
Molek. Biol. (d): 1,872; 2,282*; 2,466*; 5,492(R); 3,473; 6,167; 2,784; 7,289
Z. Phys. Chem. (e): 348,245; 348,256*; 348,262*; 349,1013*; 350,793; 351,1055; 348,266; 351,1165*; 350,279; 353,852; 350,523*; 351,663

* First, unrevised version. (R) = revised version.
(a) Reprints available from OBN.
(b) No reprints available from OBN; order from publisher.
(c) In French.
(d) In Russian.
(e) In German.
(f) Also in other journals.
(g) Also in Biopolymers, 11, 321.
(h) J. Mol. Biol., 55, 299.
(i) J. Mol. Biol., 52, 1.


ABBREVIATIONS

Abbreviations are distinguished from symbols as follows (taken from Reference A1):

a. Symbols, for monomeric units in macromolecules, are used to make up abbreviated structural formulas (e.g., Gly-Val-Thr for the tripeptide glycylvalylthreonine) and can be made fairly systematic.
b. Abbreviations for semi-systematic or trivial names (e.g., ATP for adenosine triphosphate; FAD for flavin-adenine dinucleotide) are generally formed of three or four capital letters, chosen for brevity rather than for system. It is the indiscriminate coining and use of such abbreviations that has aroused objections to the use of abbreviations in general.

[Abbreviations are thus distinguished from symbols in that they (a) are for semi-systematic or trivial names, (b) are brief rather than systematic, (c) are usually formed from three or four capital letters, and (d) are not used, as symbols are, as units of larger structures. ATP, FAD, etc., are abbreviations. Gly, Ser, Ado, Glc, etc., are symbols (as are Na, K, Ca, O, S, etc.); they are sometimes useful as abbreviations in figures, tables, etc., where space is limited, but are usually not permitted in text. The use of abbreviations is permitted when necessary but is never required.]

1. Nucleotides (N = A, C, G, I, O, T, U, X, Ψ; see Symbols)

NMP    Nucleoside 5'-phosphate
NDP    Nucleoside 5'-di(or pyro)phosphate
NTP    Nucleoside 5'-triphosphate

Prefix d indicates deoxy.

2. Coenzymes, vitamins

CoA (or CoASH)    Coenzyme A
CoASAc            Acetyl Coenzyme A
DPN (a)           Diphosphopyridine nucleotide
FAD               Flavin-adenine dinucleotide
FMN               Riboflavin 5'-phosphate
GSH               Glutathione
GSSG              Oxidized glutathione
NAD (b)           Nicotinamide-adenine dinucleotide (cozymase, Coenzyme I, diphosphopyridine nucleotide)
NADP (b)          Nicotinamide-adenine dinucleotide phosphate (Coenzyme II, triphosphopyridine nucleotide)
NMN               Nicotinamide mononucleotide
TPN (c)           Triphosphopyridine nucleotide

3. Miscellaneous

ACTH              Adrenocorticotropin, adrenocorticotropic hormone, or corticotropin
CM-cellulose      O-(Carboxymethyl)cellulose
DEAE-cellulose    O-(Diethylaminoethyl)cellulose
DDT               1,1,1-Trichloro-2,2-bis(p-chlorophenyl)ethane
EDTA              Ethylenediaminetetraacetate
Hb, HbCO, HbO2    Hemoglobin, carbon monoxide hemoglobin, oxyhemoglobin
Pi                Inorganic orthophosphate
PPi               Inorganic pyrophosphate
TEAE-cellulose    O-(Triethylaminoethyl)cellulose
Tris              Tris(hydroxymethyl)aminomethane (2-amino-2-hydroxymethylpropane-1,3-diol)

(a) Replaced by NAD (also DPN+ by NAD+, DPNH by NADH).
(b) Generic term; oxidized and reduced forms are NAD+, NADH (NADP+, NADPH).
(c) Replaced by NADP (also TPN+ by NADP+, TPNH by NADPH).

4. Nucleic Acids

DNA, RNA                    Deoxyribonucleic acid, ribonucleic acid (or -nucleate)
hnRNA                       Heterogeneous RNA
mtDNA                       Mitochondrial DNA
cRNA                        Complementary RNA
mRNA                        Messenger RNA
nRNA                        Nuclear RNA
rRNA                        Ribosomal RNA
tRNA                        Transfer RNA (generic term; sRNA should not be used for this or any other purpose)
tRNA^Ala                    Alanine tRNA; tRNA1^Ala, tRNA2^Ala: isoacceptor alanine tRNAs
AA-tRNA                     Aminoacyl-tRNA; aminoacylated tRNA; "charged" tRNA (generic term)
Ala-tRNA or Ala-tRNA^Ala    Alanyl-tRNA
tRNA^Met                    Methionine tRNA (not enzymically formylatable)
tRNA^fMet or tRNAf^Met      Methionine tRNA, enzymically formylatable to . . .
fMet-tRNA                   Formylmethionyl-tRNA (small f, to distinguish from fluorine F)
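The nucleotide abbreviations in item 1 above (NMP, NDP, NTP, with the prefix d for deoxy) follow a pattern regular enough to expand mechanically. The sketch below is illustrative only: the function name expand_nucleotide is hypothetical, and the nucleoside names for A, C, G, and U are an assumption of this example, since the Handbook defines the full symbol set (including I, O, T, X, and Ψ) in its Symbols section.

```python
# Illustrative sketch. The four nucleoside names below are assumed for the
# example; the Handbook's Symbols section defines the complete set.
NUCLEOSIDES = {"A": "adenosine", "C": "cytidine", "G": "guanosine", "U": "uridine"}
PHOSPHATES = {"MP": "5'-phosphate", "DP": "5'-diphosphate", "TP": "5'-triphosphate"}


def expand_nucleotide(abbrev: str) -> str:
    """Expand an abbreviation such as 'ATP' or 'dGDP' into a descriptive name."""
    deoxy = abbrev.startswith("d")          # prefix d indicates deoxy
    core = abbrev[1:] if deoxy else abbrev  # e.g. 'GDP'
    if len(core) != 3 or core[0] not in NUCLEOSIDES or core[1:] not in PHOSPHATES:
        raise ValueError(f"not a recognized NMP/NDP/NTP abbreviation: {abbrev!r}")
    name = NUCLEOSIDES[core[0]]
    if deoxy:
        name = "deoxy" + name
    return f"{name} {PHOSPHATES[core[1:]]}"


if __name__ == "__main__":
    for abbreviation in ("ATP", "dGDP", "UMP"):
        print(abbreviation, "=", expand_nucleotide(abbreviation))
```

Anything outside the assumed four-base table, such as "TTP", raises ValueError rather than guessing a name.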

SYMBOLS

Symbols are distinguished from abbreviations in that they are designed to represent specific parts of larger molecules, just as the symbols for the elements are used in depicting molecules, and are thus rather systematic in construction and use. Symbols are not designed to be used as abbreviations and should not be used as such in text, but they may often serve this purpose when space is limited (as in a figure or table). Symbols are always written with a single capital letter, all subsequent letters being lower-case (e.g., Ca, Cl, Me, Ac, Gly, Rib, Ado), regardless of their position in a sequence, a sentence, or as a superscript or subscript.

Some abbreviations expressed in symbols (see also Section II F below), as examples of the use of symbols:

Dimethylsulfoxide                  Me2SO (a)
Tetranitromethane                  (NO2)4C (b)
Guanidine hydrochloride            Gdn·HCl (c)
Guanidinium chloride               GdmCl
Cetyltrimethylammonium bromide     CtMe3NBr (d)
Ethyl methanesulfonate             MeSO3Et
Methylnitronitrosoguanidine        MeN2O3Gdn
-nitrosourea                       -Nur (e)
-nitrosamine                       -Nam (f)
-fluorene                          -Fln
Aminofluorene                      NH2Fln
Acetylaminofluorene                AcNHFln (g)
Acetoxyacetylaminofluorene         Ac(AcO)NFln
N-Acetylneuraminic acid            AcNeu (h)

(a) Replaces DMSO.
(b) Replaces TNM.
(c) Replaces Gu, Gd, and G.
(d) Replaces CTAB (similarly for other ammonium compounds).
(e) Replaces NU.
(f) Replaces NA.
(g) Replaces AAF.
(h) Not NANA.


Handbook of Biochemistry and Molecular Biology

I. Phosphorylated Compounds (Reference A1)

-P (or P-) ("p" in Nucleic Acids; see IV)          -PO3H2 (or its ions)
-P- (hyphen in Nucleic Acids; see IV)              -PO2H- (or its ion)
-P-P or -PP or PP- (cf. PPi in Abbreviations)      -PO2H-PO3H2 (or ions)

Examples: a

Glucose 6-phosphate                                Glucose-6-P (or Glc-6-P; see II below)
Phosphoenolpyruvate (pyruvenol phosphate)          P-enolPyruvate or enolPyruvate-P or ePrv-P b
Fructose 1,6-bisphosphate (not di)                 Fructose-1,6-P2 (or Fru-1,6-P2; see II below)
Creatine phosphate                                 Creatine-P
Phosphocreatine                                    P-Creatine

a Note that symbols are hyphenated even where names are not. b Recommended by OBN.

II. Peptides and Proteins (References A1-A5)

A. Symbols (References A2-A5)

1. Common amino acids

Name                   Three-letter a    One-letter b
Alanine                Ala               A
Arginine               Arg               R
Asparagine             Asn c,d,e         N
Aspartic acid          Asp d,e           D
Cysteine               Cys e             C
Glutamic acid          Glu f             E
Glutamine              Gln e,f,g         Q
Glycine                Gly               G
Histidine              His e             H
Isoleucine             Ile               I
Leucine                Leu               L
Lysine e               Lys               K
Methionine             Met               M
Phenylalanine          Phe               F
Proline                Pro               P
Serine e               Ser               S
Threonine e            Thr               T
Tryptophan e           Trp               W
Tyrosine e             Tyr               Y
Valine                 Val               V
Unknown or 'other'     AA h              X

a One capital, two small letters, at all times.
b For special uses and with special conventions; see III, I following.
c Or Asp(NH2); see Footnotes e and g.
d Uncertainty as between Asp and Asn may be designated by Asx (or B).
e Substitution on a functional group may be indicated, as shown in Footnotes c and g, by parenthesis following the symbol, e.g., Cys(Cme), Ser(P); see C 2 below.
f Uncertainty, as between Glu and Gln, may be designated by Glx (or Z); pyroglutamate is pGlu or . . .

[Structural diagrams involving Gly residues are not recoverable from the source.]

* The prolonged and well-entrenched ambiguity in the nomenclature of the imidazole nitrogens of histidine ( . . . N-1 being the biochemist's N-3 and vice versa) led to a new trivial system for designating these substances: The imidazole N nearer the alanine residue is designated pros (symbol π) and the one farther tele (symbol τ), to give the following names and symbols: prosmethylhistidine or π-methylhistidine, His(πMe); telemethylhistidine or τ-methylhistidine, His(τMe).

D. Polypeptides: Follow Rules for Substitution (C above) (Reference A2)

Glycylglycine                          Gly-Gly
N-α-Glutamylglycine                    Glu-Gly
N-γ-Glutamylglycine                    [branched bond diagrams joining Glu and Gly; not reproduced]
Glutathione                            [branched bond diagrams joining Glu and Cys-Gly; the forms marked "Not" are not recoverable]
N6-α-Glutamyllysine                    [bond diagrams joining Glu and Lys; not reproduced] or Glu-εLys
N6-γ-Glutamyllysine                    [bond diagrams joining Glu and Lys; not reproduced] or Glu(εLys)
Glycyllysylglycine dihydrochloride     +H2-Gly-Lys-Gly-OH · 2HCl
Its N6-formyl derivative               Gly-Lys-Gly with CHO drawn on Lys, or Gly-Lys(CHO)-Gly

E. Cyclic Polypeptides (Reference A2)

1. Homodetic: Gramicidin S

cyclo-Val-Orn-Leu-DPhe-Pro-Val-Orn-Leu-DPhe-Pro

[Two alternative ring diagrams of the same sequence, Val-Orn-Leu-DPhe-Pro-Val-Orn-Leu-DPhe-Pro, are not reproduced.]

2. Heterodetic: Oxytocin

Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly-NH2 [the bridge between the two Cys residues is drawn as a bond diagram in the original]

Cyclic ester of threonylglycylglycylglycine

Thr-Gly-Gly-Gly [ring-closing bond diagrams not reproduced]

F. Substituents (Reference A2)

1. NH2 protecting groups of the urethan type (partial list)

Benzyloxycarbonyl-                          Z- or Cbz- a
p-Nitrobenzyloxycarbonyl-                   Z(NO2)-
p-Bromobenzyloxycarbonyl-                   Z(Br)-
p-Methoxybenzyloxycarbonyl-                 Z(OMe)-
p-Methoxyphenylazobenzyloxycarbonyl-        Mz-
p-Phenylazobenzyloxycarbonyl-               Pz-
t-Butoxycarbonyl-                           Boc- b or ButOCO-
Cyclopentyloxycarbonyl-                     Poc- or cPeOCO-

2. Other N-protecting groups (partial list)

Acetyl-                                             Ac-
Benzoyl- (C6H5CO-)                                  PhCO- or Bz-
Benzyl- (C6H5CH2-)                                  PhCH2- or Bzl-
Benzylthiomethyl-                                   PhSCH2- or Btm-
Carbamoyl-                                          NH2CO- (preferred to Cbm)
1-Carboxy-2-nitrophenyl-5-thio-                     Nbs-
3-Carboxypropionyl- (HOOC-CH2-CH2-CO-)              Suc-
Dansyl- (5-dimethylaminonaphthalene-1-sulfonyl)     Dns- or dansyl a
Dinitrophenyl-                                      N2ph- a,c or Dnp
Formyl-                                             HCO- d or CHO-
p-Iodophenylsulfonyl (pipsyl)                       Ips or pipsyl
Maleoyl- (-OC-CH=CH-CO-)                            -Mal- e or Mal<
Maleyl- (HOOC-CH=CH-CO-)                            Mal-
Methylthiocarbamoyl-                                MeNHCS- a or Mtc- f
o-Nitrophenylthio-                                  Nps-
Phenylthiocarbamoyl-                                PhNHCS- a or Ptc- f
Phthaloyl-                                          -Pht- or Pht<
Phthalyl-                                           Pht-
Succinyl- (-OC-CH2-CH2-CO-)                         -Suc- or Suc<
Tetrahydropyranyl-                                  H4pyran- c,g
Tosyl (p-tolylsulfonyl)                             Tos- or tosyl
Trifluoroacetyl-                                    CF3CO- or F3Ac- a,h
Trityl (triphenylmethyl)                            Ph3C- a,i or Trt-

3. Substituents at carboxyl group

Benzyloxy- (benzyl ester)                   -OCH2Ph or -OBzl
Cyanomethoxy- (cyanomethyl ester)           -OCH2CN or -OMeCN
Diphenylmethoxy- (benzhydryl ester)         -OCHPh2 or -OBzh
Ethoxy- (ethyl ester)                       -OEt
Methoxy- (methyl ester)                     -OMe
p-Nitrophenoxy- (p-nitrophenyl ester)       -ONph
p-Nitrophenylthio-                          -SNph
Phenylthio- (phenylthiolester)              -SPh
1-Piperidino-oxy-                           -OPip
8-Quinolyloxy-                              -OQu
Succinimido-oxy-                            -ONSuc
Tertiary butoxy- (t-butyl ester)            -OBut

a Preferred.
b Not BOC or tBOC.
c The use of D for di and T for tri (or tetra) is discouraged. Recognized symbols with numerical subscripts are recommended.
d fMet is approved for formylmethionine.
e MalNEt is recommended for N-ethylmaleimide (not NEM).
f Mtc and Ptc have been used to denote methyl- and phenylthiohydantoins (e.g., Ptc-Leu). Since this incorrectly implies the substitution of an amino acid by a "phenyl (or methyl) thiohydantoyl" group, the correct representation (drawn in the original as bond diagrams linking CS, Leu, and NPh) or, in text, Leu>PhNCS, is recommended.
g Not THP or Thp (see Footnote c).
h Not TFA (see Footnote c).
i Or trityl.

4. Other substituents (and reagents)

Substituents:

2-Aminoethyl- a                             -(CH2)2NH2 or Aet- a,b
Carbamoylmethyl-                            -CH2CONH2 or Ncm-
Carboxymethyl-                              -CH2CO2H or Cxm-
p-Carboxyphenylmercuri-                     -HgBzOH
1-Carboxy-2-nitrophenyl-5-thio-             Nbs-
Diazoacetyl-                                N2CHCO- [alternative symbol garbled in source]
Diisopropylphosphoryl-                      iPr2P- e
Dinitrophenyl-                              N2ph- g
Hydroxyethyl-                               -(CH2)2OH or HOEt-
Trifluoroacetyl-                            F3Ac- m
Trimethylsilyl-                             Me3Si- n

Reagents:

Chloroethylamine                            Cl(CH2)2NH2 or AetCl
Chloroacetamide                             ClCH2CONH2 or NcmCl
Chloroacetic acid                           ClCH2CO2H or CxmCl
p-Chloromercuribenzoate                     ClHgBzO- c
5,5'-Dithiobis(2-nitrobenzoic acid)         Nbs2 d
Diisopropylfluorophosphate                  iPr2P-F f
Fluorodinitrobenzene                        N2ph-F h
Ethylene oxide                              (CH2)2O or Et>O
N-Ethylmaleimide                            MalNEt i
Tetrahydrofuran                             H4furan j
Tosyllysyl chloromethyl ketone              Tos-LysCH2Cl k
Tosylarginine methyl ester                  Tos-ArgOMe l
Trifluoroacetic acid                        F3AcOH
Tetramethylsilane                           Me4Si n

a For -ethylamine, -Etn; for -ethanolamine (see Lipids), -OEtn.
b Not AET.
c Replaces PCMB, pCMB, and CMB.
d Replaces DTNB.
e Replaces DIP and Dip.
f Replaces DPF, DFP, DIPF, etc.
g Replaces DNP and Dnp.
h Replaces FDNB.
i Replaces NEM.
j Replaces THF. Similarly, H4folate.
k Replaces TLCK (similarly for TPCK, etc.).
l Replaces TAME (similarly for other N-substituted amino-acid esters. See C1 above).
m Replaces TFA.
n Not TMS- or TMS. Similarly, Me2SO, not DMSO; NAc3, not NTA.

G. Polymerized Amino Acids (Synthetic Polypeptides) (Reference A4)

1. Linear polymers (only normal peptide links are involved).
a. Homopolymer: polylysine; poly(Lys) or (Lys)n (n may be replaced by a number).
b. Copolymer, alternating sequence: poly(alanine-lysine); poly(Ala-Lys) or (Ala-Lys)n.
c. Copolymer, random sequence, composition unspecified: poly(alanine, lysine); poly(Ala,Lys) or (Ala,Lys)n.
d. Copolymer, random sequence, molar percentages (Σ = 100%) known: poly(DLGlu56Lys38DTyr6) or (DLGlu56Lys38DTyr6)n (only lysine is L).
e. Block polymer of poly(Glu) linked via α-COOH to α-NH2 of poly(Lys): poly(Glu56)-poly(Lys44) or (Glu56) . . .

. . . hexapeptide amide

I. One-letter Notation a (Reference A5)

1. Symbols: see II.A.1 above (NH2 terminal at left, COOH terminal at right).
2. Known sequence: space b between symbols.
3. Unknown sequence: comma c between symbols, parentheses d enclosing.
4. Adjacent unknown sequences: = replaces )(.
5. Uncertainty as to sequence or terminus: / (see examples b and c).

Examples:
a. (Ala, Cys, Asp) (Arg, Ser) (Gly, His, Ile) Lys-Leu-Met-Asn-Pro-Gln becomes
   (A, C, D = R, S = G, H, I) K L M N P Q
b. A C D E F G H I K L M N P Q
c. (A. C. D = R, S = G. H. I) K L/M N/P Q/

In c., the tripeptides A . C . D and G . H . I are not of known sequence, but are inferred by analogy with the known peptide b.; the inference is expressed by periods instead of commas. The comma between R and S indicates that no inference as to sequence can be drawn for this dipeptide. The internal slashes indicate that no connection between L and M, and N and P, has been proven, although KL, MN and PQ are each of known internal sequence. The final slash indicates that Q has not been proven to be the COOH terminal residue of the entire peptide, although it is the terminus of the PQ dipeptide.

a For display of very long sequences or computer use only.
b In place of hyphen in three-letter system. Spaces must be equal to characters, as in typing. So must commas, dots, and all other symbols.
c As in three-letter system; becomes a dot (period) when sequence is inferred but not demonstrated (see example c).
d The double symbol, )(, is replaced by = (see 4) to preserve equal spacing.
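The conversion in example a can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the Recommendations: the function name `to_one_letter` and the lookup table `THREE_TO_ONE` are hypothetical, the table covers the twenty common amino acids plus Asx and Glx from the footnotes to II.A.1, and the period and slash conventions of example c are not handled.

```python
import re

# Three-letter to one-letter symbols from table II.A.1 (plus Asx/Glx from its footnotes).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Glu": "E", "Gln": "Q", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Asx": "B", "Glx": "Z",
}


def to_one_letter(sequence: str) -> str:
    """Rewrite a three-letter sequence in one-letter notation (rules 1-4 only)."""
    # Rule 4: adjacent unknown sequences, ")(" becomes "=".
    out = re.sub(r"\)\s*\(", " = ", sequence)
    # Rule 2: the hyphen of the three-letter system becomes a space.
    out = out.replace("-", " ")
    # Rule 1: replace each symbol (one capital, two small letters).
    # An unlisted symbol raises KeyError rather than being guessed.
    return re.sub(r"[A-Z][a-z]{2}", lambda m: THREE_TO_ONE[m.group(0)], out)


example_a = "(Ala, Cys, Asp) (Arg, Ser) (Gly, His, Ile) Lys-Leu-Met-Asn-Pro-Gln"
print(to_one_letter(example_a))
```

Applied to example a, the sketch yields the one-letter string given in the text, with commas kept inside the parentheses for the unknown sequences.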


NOMENCLATURE OF LABELED COMPOUNDS

The statement below was adopted by the IUB Commission of Editors of Biochemical Journals* (CEBJ) and appears, in the same or in similar form, in the Instructions to Authors of their journals. This system originated with the Chemical Society (London) and was subsequently adopted by the American Chemical Society (Handbook for Authors, 1967). It was adopted by CEBJ in 1971 and is the only system currently permitted in the pages of their journals.

ISOTOPICALLY LABELED COMPOUNDS

The symbol for the isotope introduced is placed in square brackets directly attached to the front of the name (word), as in [14C]urea. When more than one position in a substance is labeled by means of the same isotope and the positions are not indicated (as below), the number of labeled positions is added as a right-hand subscript, as in [14C2]glycollic acid. The symbol "U" indicates uniform and "G" general labeling, e.g., [U-14C]glucose (where the 14C is uniformly distributed among all six positions) and [G-14C]glucose (where the 14C is distributed among all six positions, but not necessarily uniformly); in the latter case it is often sufficient to write simply "[14C]glucose."

The isotopic prefix precedes that part of the name to which it refers, as in sodium [14C]formate, iodo[14C2]acetic acid, 1-amino[14C]methylcyclopentanol (H2N-14CH2-C5H8-OH), α-naphth[14C]oic acid (C10H7-14CO2H), 2-acetamido-7-[131I]iodofluorene, fructose 1,6-[1-32P]diphosphate, D-[14C]glucose, 2H-[2-2H]pyran, S-[8-14C]adenosyl[35S]methionine. Terms such as "131I-labeled albumin" should not be contracted to "[131I]albumin" (since native albumin does not contain iodine), and "14C-labeled amino acids" should similarly not be written as "[14C]amino acids" (since there is no carbon in the amino group).

When isotopes of more than one element are introduced, their symbols are arranged in alphabetical order, including 2H and 3H for deuterium and tritium, respectively. When not sufficiently distinguished by the foregoing means, the positions of isotopic labeling are indicated by Arabic numerals, Greek letters, or prefixes (as appropriate), placed within the square brackets and before the symbol of the element concerned, to which they are attached by a hyphen; examples are [1-2H]ethanol (CH3-C2H2-OH), [1-14C]aniline, L-[2-14C]leucine (or L-[α-14C]leucine), [carboxy-14C]leucine, [Me-14C]isoleucine, [2,3-14C]maleic anhydride, [6,7-14C]xanthopterin, [3,4-13C,35S]methionine, [2-13C; 1-14C]acetaldehyde, [3-14C; 2,3-2H; 15N]serine.

The same rules apply when the labeled compound is designated by a standard abbreviation or symbol, other than the atomic symbol, e.g. [γ-32P]ATP. For simple molecules, however, it is often sufficient to indicate the labeling by writing the chemical formulae, e.g. ¹⁴CO₂, H₂¹⁸O, ²H₂O (not D₂O), H₂³⁵SO₄, with the prefix superscripts attached to the proper atomic symbols in the formulae. The square brackets are not to be used in these circumstances, nor when the isotopic symbol is attached to a word that is not a chemical name, abbreviation or symbol (e.g. 131I-labeled).

*CEBJ consists of the Editors-in-Chief of the following journals: Archives of Biochemistry and Biophysics, Biochemical Journal, Biochemistry, Biochimica et Biophysica Acta, Biochimie, European Journal of Biochemistry, Hoppe-Seyler's Zeitschrift für Physiologische Chemie, Journal of Biochemistry, Journal of Biological Chemistry, Journal of Molecular Biology, and Molekulyarnaya Biologiya; corresponding members include Proceedings of the National Academy of Sciences (U.S.A.) and approximately 40 others.
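As a rough illustration of the prefix rules, the following Python sketch assembles an isotopic prefix from a list of labels. The helper names `isotope_prefix` and `labeled_name` are hypothetical, and the semicolon separator is an assumption taken from the acetaldehyde and serine examples (the methionine example uses a comma instead). Isotopes are ordered alphabetically by element symbol, then by mass number.

```python
import re


def isotope_prefix(labels):
    """Build a bracketed prefix from (positions, isotope) pairs.

    positions is a string such as "2,3", "U", "carboxy", or "" when no
    position is stated; isotope is a mass number plus element symbol, e.g. "14C".
    """
    def sort_key(label):
        _, isotope = label
        match = re.fullmatch(r"(\d+)([A-Z][a-z]?)", isotope)
        if match is None:
            raise ValueError(f"not an isotope symbol: {isotope!r}")
        # Alphabetical by element symbol, then by mass number.
        return (match.group(2), int(match.group(1)))

    parts = []
    for positions, isotope in sorted(labels, key=sort_key):
        # Positions are attached to the isotope symbol by a hyphen.
        parts.append(f"{positions}-{isotope}" if positions else isotope)
    return "[" + "; ".join(parts) + "]"


def labeled_name(labels, name):
    """Attach the prefix directly to the front of the (part of the) name."""
    return isotope_prefix(labels) + name


print(labeled_name([("2,3", "2H"), ("", "15N"), ("3", "14C")], "serine"))
```

With the serine labels supplied in arbitrary order, the sketch reproduces the ordering of the example in the text. It does not check whether a given prefix is chemically meaningful (for instance, the albumin and amino acid cases that the text rules out).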


THE CITATION OF BIBLIOGRAPHIC REFERENCES IN BIOCHEMICAL JOURNALS
RECOMMENDATIONS (1971)*

IUB Commission of Editors of Biochemical Journals (CEBJ)

These Recommendations were reviewed by the Commission in August 1972, when it was decided to publish them.

PREAMBLE

Two basic systems for the citation of references are used at present: the so-called Harvard System (where names of authors and the date are cited in the text, and the reference list is in alphabetical order) and the Numbering System (where numbers, but not necessarily names of authors, are cited in the text, and the reference list is in order of citation in the text). Several ways of quoting references in the list are in current use. The Commission is of the opinion, arrived at as a result of much consultation between many senior editors, that it is unlikely that all journals would accept a recommendation to use either the Harvard or the Numbering System to the exclusion of the other. It believes, however, that most biochemists will accept the need for, and indeed welcome, a substantial degree of unification of practices, there being no strong case for the individuality of each journal on this issue.

Accordingly, the Commission makes the following Recommendations to all biochemical journals; the reasons for some of them are given. The Recommendations deal first with the way in which references should be cited in the list; the proposal is suitable for journals adopting either the Harvard or the Numbering System. Secondly, there are Recommendations about the way in which each of these systems is used. Thirdly, abbreviations for titles of journals and a few other points are considered. Implementation of the Recommendations would mean that any very small differences between journals in their practices would be of the type that can be attended to at the redactory stage of preparation for press.

The Commission recognizes that it cannot deal with a number of smaller problems concerning citations that arise from time to time.

RECOMMENDATIONS

1. Citations of References in the List of References Should Be as Follows

Braun, A., Brown, B. & LeBrun, C. (1971) Journal, 11, 111-113.

Notes:
(a) This form can be used by both systems.
(b) Journals using the Numbering System should arrange the references in numerical order beside the number (which can be italicized or in brackets according to the house custom of the journal).
(c) Journals using the Harvard System should arrange the references in alphabetical order, whatever the language, except in certain situations (see Recommendation 4a below).
(d) This recommendation incorporates the following points:
i. Initials after surnames (full first names are not given in the list).
ii. The use of the symbol & is recommended if at all possible because of its widespread usage and the fact that it is independent of the language. No comma before &.

* From IUB Commission of Editors of Biochemical Journals (CEBJ), J. Biol. Chem., 248(21), 7279-7280 (1973). With permission.


iii. Year in parentheses (this follows immediately after the authors' names because it is essential to the Harvard System).
iv. Journal title (abbreviated). This can be in italics according to house practice (see Recommendation 7 below concerning journal title abbreviations).
v. Volume number. This can be in heavy type or italics according to house practice.
vi. A few journals do not have volume numbers, in which case the page numbers should follow immediately after the abbreviated journal title. If it is necessary to quote both a volume and a part number, the reference should read: Brown, B. (1971) Journal, 11, pt 1, 121-123.
vii. First and last pages should be given. The Commission decided to make this Recommendation mainly on the basis of evidence that the additional information provided by quoting the last page was being required increasingly in many types of library and information retrieval services. Citation of the last page (as well as the first) has been requested for some time by the secondary and abstracting journals. Citation of both first and last pages is also an aid in the prevention of errors.
viii. The number of stops and commas is kept as small as possible.
(e) Authors' names and the abbreviated name of the journal when repeated in the next reference should be spelled out in full; ibid. and similar terms should not be used.
(f) Recommendations of the IUPAC-IUB Commission on Biochemical Nomenclature (CBN) and similar documents should be referred to as: Commission on Biochemical Nomenclature (1970) followed by a journal reference.
(g) Junior should be abbreviated to "Jr," not "jun."

2. Numbering System in the Text

The use of authors' names is permissible as authors wish; only the initial letter of the name should be in capital type. Numbers can be inserted in parentheses or as superscripts according to house custom. The printing of references at the foot of the page on which they are first quoted is considered to be helpful with the Numbering System but is not part of the Recommendation because the extra cost is generally considered to be prohibitive.

3. Harvard System in the Text

For multi-author papers, it is recommended that:
a. Not more than two authors be named either on the first or any subsequent occasion;
b. et al. be used for three or more authors on every occasion;
c. Each name have the initial letter in capital type only.

Examples (Harvard System style): Braun et al. (1969) did some work that was confirmed by LeBrun (1970). These results (Braun et al., 1969; LeBrun, 1970) have been discussed by Brown & Braun (1971).

The same Recommendation (without the year) applies when authors are quoted in the text in the Numbering System.
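The naming rule in Recommendation 3 is mechanical enough to sketch in code. The following Python fragment is only an illustration under stated assumptions: the function name `cite_in_text` is hypothetical, authors are passed as a list of surnames, and only the "Author (year)" form is produced, not the parenthetical "(Author, year)" form.

```python
def cite_in_text(surnames, year=None):
    """Format an in-text citation following Recommendation 3.

    One or two authors are named in full; three or more become
    "First et al.". Pass year=None for the Numbering System, where
    the same rule applies without the year.
    """
    if not surnames:
        raise ValueError("at least one author is required")
    if len(surnames) == 1:
        who = surnames[0]
    elif len(surnames) == 2:
        who = f"{surnames[0]} & {surnames[1]}"
    else:
        who = f"{surnames[0]} et al."
    return who if year is None else f"{who} ({year})"


print(cite_in_text(["Braun", "Brown", "LeBrun"], 1969))
print(cite_in_text(["Brown", "Braun"], 1971))
```

The two calls correspond to the citations "Braun et al. (1969)" and "Brown & Braun (1971)" used in the example sentences of the Recommendation.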

4. Harvard System in the List of References a. A special problem arises in the list when there are several papers by, e.g., Green et al. in the same or over several years. While the list could be in strict alphabetical order of the full reference, the reader will find no clue in the text to the alphabetical status of the names of the second and subsequent authors (see Recommendations 3a and 3b). It is therefore recommended that all the papers by Green et al. (that is by Green and more than one co-author) should be arranged, irrespective of the names of the other

authors, in chronological order (over many years if necessary) and designated a, b, c, etc.

Examples:
Green, G. (1970) etc.
Green, G. & Brown, B. (1971) etc.
Green, G. & White, W. (1969) etc.
Green, G., White, W. & Black, B. (1968a) etc.
Green, G., Brown, B. & Black, B. (1968b) etc.
Green, G., White, W., Black, B. & Brown, B. (1969) etc.
Green, G., Black, B. & Brown, B. (1970) etc.

For the papers by Green and more than one co-author, the sequence is governed by order or date of publication, as far as can be ascertained.

b. Names beginning with "Mc" should be listed under "Mc" and not under "Mac," to decide alphabetical order.
c. Names beginning with "De," "Van," or "von," etc. should be arranged under D or V/v, etc.

5. Reference to Books

These should appear in text like any reference to a journal paper. The reference in the list should read:

Brown, B. & Braun, A. (1971) in Book Title (LeBrun, C., ed.), pp. 1-20, Publisher, Town.

Notes:
a. If a volume number has to be quoted, this would appear before the pp. as, e.g., "vol. 2," with the number in Arabic numerals (even when Roman numerals are printed on the cover of the book).
b. Where an author wishes to refer to a specific page within a book reference, this should be given in the text. Example (in text): ". . . discussed on p. 21 of Braun et al. (1971)."

6. Other Forms of References

a. In the press. It is recommended that (i) this should mean that the paper has been finally accepted by a journal, (ii) it is quoted in the text (both systems) just as any other paper, (iii) the year quoted should be the best estimate, revised if necessary at proof stage, and (iv) the full citation in the list should read: Braun, A. & Brown, B. (1971) Journal, in the press.
b. Submitted for publication should be used in a typescript only when it is reasonable to expect that it will be possible to alter the quotation to a final form at a stage before publication; if such alteration cannot be made then the name of the journal involved should be stated.
c. The use of in preparation and private communication should not be allowed because they have no real value.
d. Personal communication and unpublished work should be permitted in the text only, i.e., not in the list of references. Editors may require to see written evidence of the former.
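The ordering described in Recommendation 4a can be sketched as a sort key. The Python below is an illustrative sketch only: the function names are hypothetical, each reference is reduced to a list of surnames and a year, and the input order of same-year papers is assumed to reflect their order of publication (Python's sort is stable, so that order is preserved).

```python
from collections import Counter
from string import ascii_lowercase


def _sort_key(ref):
    authors, year = ref
    first = authors[0].lower()
    if len(authors) == 1:
        return (first, 0, "", year)                   # sole author first
    if len(authors) == 2:
        return (first, 1, authors[1].lower(), year)   # then by second author
    return (first, 2, "", year)                       # "et al." papers: by year only


def harvard_order(refs):
    """Return (authors, year_label) pairs in the order of Recommendation 4a."""
    ordered = sorted(refs, key=_sort_key)
    # Same-year "et al." papers by one first author get suffixes a, b, c, ...
    # (the sketch supports at most 26 such papers).
    clashes = Counter((a[0], y) for a, y in ordered if len(a) >= 3)
    used = Counter()
    result = []
    for authors, year in ordered:
        label = str(year)
        key = (authors[0], year)
        if len(authors) >= 3 and clashes[key] > 1:
            label += ascii_lowercase[used[key]]
            used[key] += 1
        result.append((authors, label))
    return result


refs = [
    (["Green", "Black", "Brown"], 1970),
    (["Green", "White"], 1969),
    (["Green", "White", "Black"], 1968),
    (["Green"], 1970),
    (["Green", "Brown", "Black"], 1968),
    (["Green", "White", "Black", "Brown"], 1969),
    (["Green", "Brown"], 1971),
]
for authors, label in harvard_order(refs):
    print(", ".join(authors), f"({label})")
```

Run on the Green papers in shuffled order, the sketch gives the same sequence as the example list in 4a. Plain alphabetical comparison of lower-cased surnames also places "Mc" under "Mc" and "von" under v, in line with 4b and 4c.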


7. Abbreviations for Journal Titles

Most biochemical journals use the Chemical Abstracts* system but a few use the World List, 4th Edition. The Commission noted that the latest information available (International List of Periodical Title Word Abbreviations prepared for the UNISIST/ICSU-AB Working Group on Bibliographical Descriptions) suggests that the abbreviations that will be recommended finally by ICSU will be very similar to those now used by Chemical Abstracts. Believing that complete uniformity on this issue is highly desirable now and estimating that it may be a few more years before ICSU finally reports, the Commission recommends that all biochemical journals should now use the Chemical Abstracts (American Chemical Society) system. The Commission believes that any changes that will be required when ICSU eventually issues recommendations on this point will be comparatively minor ones.

8. Implementation of these Recommendations

The Commission at its meeting in Menton, May 7 to 8, 1971, has taken the view that the degree of uniformity envisaged in the Recommendations is highly desirable and therefore further recommends to all biochemical journals that the changes required should be made as soon as possible. The Commission recognizes that all journals will have to make some changes (in most cases these are minor) from their present established practices to implement these Recommendations in full. It considers that the possible objections or difficulties even for a commercial publisher with an established "house style" are outweighed by the advantage that conformity of style in the citation of references will prove to the authors, editors, and readers upon whom all journals depend for their existence.

*The journal-title abbreviations in Biological Abstracts are essentially the same as in Chemical Abstracts. A List of Serials with Title Abbreviations is available from BioSciences Information Service of Biological Abstracts, 2100 Arch Street, Philadelphia, PA 19103.


IUPAC TENTATIVE RULES FOR THE NOMENCLATURE OF ORGANIC CHEMISTRY
SECTION E. FUNDAMENTAL STEREOCHEMISTRY*

International Union of Pure and Applied Chemistry

INTRODUCTION

This Section of the IUPAC Rules for Nomenclature of Organic Chemistry differs from previous Sections in that it is here necessary to legislate for words that describe concepts as well as for names of compounds.

At the present time, concepts in stereochemistry (that is, chemistry in three-dimensional space) are in the process of rapid expansion, not merely in organic chemistry, but also in biochemistry, inorganic chemistry, and macromolecular chemistry. The aspects of interest for one area of chemistry often differ from those for another, even in respect to the same phenomenon. This rapid evolution and the variety of interests have led to development of specialized vocabularies and definitions that sometimes differ from one group of specialists to another, sometimes even within one area of chemistry.

The Commission on the Nomenclature of Organic Chemistry does not, however, consider it practical to cover all aspects of stereochemistry in this Section E. Instead, it has two objects in view: To prescribe, for basic concepts, terms that may provide a common language in all areas of stereochemistry; and to define the ways in which these terms may, so far as necessary, be incorporated into the names of individual compounds. The Commission recognizes that specialized nomenclatures are required for local fields; in some cases, such as carbohydrates, amino acids, peptides and proteins, and steroids, international rules already exist; for other fields, study is in progress by specialists in Commissions or Subcommittees; and further problems doubtless await identification. The Commission believes that consultations will be needed in many cases between different groups within IUPAC and IUB if the needs of the specialists are to be met without confusion and contradiction between the various groups.

The Rules in this Section deal only with Fundamental Stereochemistry, that is, the main principles. Many of these Rules do little more than codify existing practice, often of long standing; however, others extend old principles to wider fields, and yet others deal with nomenclature that is still subject to controversy.

Rule E-0

The stereochemistry of a compound is denoted by an affix or affixes to the name that does not prescribe the stereochemistry; such affixes, being additional, do not change the name or the numbering of the compound. Thus, enantiomers, diastereoisomers, and cis-trans isomers receive names that are distinguished only by means of different stereochemical affixes. The only exceptions are those trivial names that have stereochemical implications (for example, fumaric acid, cholesterol).

Note: In some cases (see Rules E-2.23 and E-3.1) stereochemical relations may be used to decide between alternative numberings that are otherwise permissible.

E-1. Types of Isomerism

E-1.1. The following nonstereochemical terms are relevant to the stereochemical nomenclature given in the Rules that follow.

* From IUPAC Inf. Bull. Append. Tentative Nomencl. Sym. Units Stand., No. 35, August 1974, pp. 36-80. With permission.


(a) The term structure may be used in connection with any aspect of the organization of matter. Hence: structural (adjectival) (b) Compounds that have identical molecular formulas but differ in the nature or sequence of bonding of their atoms or in arrangement of their atoms in space are termed isomers. Hence: isomeric (adjectival) isomerism (phenomenological) Examples:

(In this and other Rules a broken line denotes a bond projecting behind the plane of the paper, and a thickened line denotes a bond projecting in front of the plane of the paper. In such cases a line of normal thickness denotes a bond lying in the plane of the paper.)

(c) The constitution of a compound of given molecular formula defines the nature and sequence of bonding of the atoms. Isomers differing in constitution are termed constitutional isomers.
Hence: constitutionally isomeric (adjectival)
constitutional isomerism (phenomenological)
Example: H3C-O-CH3 is a constitutional isomer of H3C-CH2-OH.
Note: Use of the term "structural" with the above connotation is abandoned as insufficiently specific.

E-1.2. Isomers are termed stereoisomers when they differ only in the arrangement of their atoms in space.
Hence: stereoisomeric (adjectival)
stereoisomerism (phenomenological)
Examples:


is a stereoisomer of

E-1.3. Stereoisomers are termed cis-trans isomers when they differ only in the positions of atoms relative to a specified plane in cases where these atoms are, or are considered as if they were, parts of a rigid structure. Hence: cis-trans isomeric (adjectival) cis-trans isomerism (phenomenological) Examples:

[Structural formulas of the example pairs are not reproduced.]

E-1.4. Various views are current regarding the precise definition of the term "configuration." (a) Classical interpretation: The configuration of a molecule of defined constitution is the arrangement of its atoms in space without regard to arrangements that differ only as after rotation about one or more single bonds. (b) This definition is now usually limited so that no regard is paid also to rotation about π bonds or bonds of partial order between one and two. (c) A third view limits the definition further so that no regard is paid to rotation about bonds of any order, including double bonds.

Molecules differing in configuration are termed configurational isomers.
Hence: configurational isomerism

Notes: (1) Contrast conformation (Rule E-1.5). (2) The phrase "differ only as after rotation" is intended to make the definition independent of any difficulty of rotation, in particular independent of steric hindrance to rotation. (3) For a brief discussion of views (a) to (c), see Appendix 1. It is hoped that a definite consensus of opinion will be established before these Rules are made "Definitive."

Examples: The following pairs of compounds differ in configuration:


These isomers (iv) are configurational in view (a) or (b) but are conformational (see Rule E-1.5) in view (c)

E-1.5. Various views are current regarding the precise definition of the term "conformation." (a) Classical interpretation: The conformations of a molecule of defined configuration are the various arrangements of its atoms in space that differ only as after rotation about single bonds. (b) This is usually now extended to include rotation about π bonds or bonds of partial order between one and two. (c) A third view extends the definition further to include also rotation about bonds of any order, including double bonds.

Molecules differing in conformation are termed conformational isomers.
Hence: conformational isomerism

Notes: All the Notes to Rule E-1.4 apply also to E-1.5.

Examples: Each of the following pairs of formulas represents a compound in the same configuration but in different conformations.


See Example (iv) to Rule E-1.4.

E-1.6. The terms relative stereochemistry and relative configuration are used with reference to the positions of various atoms in a compound relative to one another, especially, but not only, when the actual positions in space (absolute configuration) are unknown.

E-1.7. The terms absolute stereochemistry and absolute configuration are used with reference to the known actual positions of the atoms of a molecule in space.*

E-2. cis-trans Isomerism†

Preamble. The prefixes cis and trans have long been used for describing the relative positions of atoms or groups attached to nonterminal doubly bonded atoms of a chain or attached to a ring that is considered as planar. This practice has been codified for hydrocarbons by IUPAC.** There has, however, not been agreement on how to assign cis or trans at terminal double bonds of chains or at double bonds joining a chain to a ring. An obvious solution was to use cis and trans where doubly bonded atoms formed the backbone and were nonterminal and to enlist the sequence-rule preferences to decide other cases; however, since the two methods, when generally applied, do not always produce analogous results, it would then be necessary to use different symbols for the two procedures. A study of this combination showed that both types of symbols would often be required in one name and, moreover, it seemed wrong in principle to use two symbolisms for essentially the same phenomenon. Thus it seemed to the Commission wise to use only the sequence-rule system, since this alone was applicable to all cases. The same decision was taken independently by Chemical Abstracts Service who introduced Z and E to correspond more conveniently to seqcis and seqtrans of the sequence rule. It is recommended in the Rules below that these designations Z and E based on the sequence rule shall be used in names of compounds, but Z and E do not always correspond to the classical cis and trans which show the steric relations of like or similar

* Determination of absolute configuration became possible through work by Bijvoet, J. M., Peerdeman, A. F., and van Bommel, A. J., Nature, 168, 271 (1951); cf. Bijvoet, J. M., Proc. Kon. Ned. Akad. Wetensch., 52, 313 (1949).
† These Rules supersede the Tentative Rules for olefinic hydrocarbons published in the Comptes rendus of the 16th IUPAC Conference, New York, N.Y., 1951, pp. 102-103.
** Blackwood, J. E., Gladys, C. L., Loening, K. L., Petrarca, A. E., and Rush, J. E., J. Amer. Chem. Soc., 90, 509 (1968); Blackwood, J. E., Gladys, C. L., Petrarca, A. E., Powell, W. H., and Rush, J. E., J. Chem. Doc., 8, 30 (1968).

Handbook of Biochemistry and Molecular Biology

groups that are often the main point of interest. So the use of Z and E in names is not intended to hamper the use of cis and trans in discussions of steric relations of a generic type or of groups of particular interest in a specified case (see Rule E-2.1 and its Examples and Notes, also Rule E-5.11). It is also not necessary to replace cis and trans for describing the stereochemistry of substituted monocycles (see Subsection E-3). For cyclic compounds the main problems are usually different from those around double bonds; for instance, steric relations of substituents on rings can often be described either in terms of chirality (see Subsection E-5) or in terms of cis-trans relationships, and, further, there is usually no single relevant plane of reference in a hydrogenated polycycle. These matters are discussed in the Preambles to Subsections E-3 and E-4.

E-2.1. Definition of cis-trans. Atoms or groups are termed cis or trans to one another when they lie respectively on the same or on opposite sides of a reference plane identifiable as common among stereoisomers. The compounds in which such relations occur are termed cis-trans isomers. For compounds containing only doubly bonded atoms, the reference plane contains the doubly bonded atoms and is perpendicular to the plane containing these atoms and those directly attached to them. For cyclic compounds, the reference plane is that in which the ring skeleton lies or to which it approximates. When qualifying another word or a locant, cis or trans is followed by a hyphen. When added to a structural formula, cis may be abbreviated to c, and trans to t (see also Rule E-3.3). Examples: (Rectangles here denote the reference planes and are considered to lie in the plane of the paper.)

The groups or atoms a,a are the pair selected for designation but are not necessarily identical; b,b are also not necessarily identical but must be different from a,a.

cis or trans according as a or b is taken as basis of comparison

Notes: The formulas above are drawn with the reference plane in the plane of the paper, but for doubly bonded compounds it is customary to draw the formulas so that this plane is perpendicular to that of the paper; atoms attached directly to the doubly bonded atoms then lie in the plane of the paper and the formulas appear as, for instance

Cyclic structures, however, are customarily drawn with the ring atoms in the plane of the paper, as above. However, care is needed for complex cases, such as

The central five-membered ring lies (approximately) in a plane perpendicular to the plane of the paper. The two a groups are trans to one another; so are the b groups; the outer cyclopentane rings are cis to one another with respect to the plane of the central ring.

cis or trans (or Z or E; see Rule E-2.21) may also be used in cases involving a partial bond order when a limiting structure is of sufficient importance to impose rigidity around the bond of partial order. An example is

trans (or E)

E-2.2. cis-trans Isomerism around Double Bonds.

E-2.21. In names of compounds steric relations around one or more double bonds are designated by affixes Z and/or E, assigned as follows. The sequence-rule-preferred* atom or group attached to one of a doubly bonded pair of atoms is compared with the sequence-rule-preferred atom or group attached to the other of that doubly bonded pair of atoms; if the selected pair are on the same side of the reference plane (see Rule E-2.1) an italic capital letter Z prefix is used; if the selected pair are on opposite sides an italic capital letter E prefix is used.† These prefixes, placed in parentheses and followed by a hyphen, normally precede the whole name; if the molecule contains several double bonds, then each prefix is immediately preceded by the lower or less primed locant of the relevant double bond. Examples:

(E)-2-Butene

(Z)-2-Methyl-2-butenoic acid** or (Z)-2-methylisocrotonic acid (see Exceptions below)

(E)-2-Methyl-2-butenoic acid†† or (E)-2-methylcrotonic acid (see Exceptions below)

* For sequence-rule preferences see Appendix 2.
† These prefixes may be rationalized as from the German zusammen (together) and entgegen (opposite).
** The name angelic acid is abandoned because it has been associated with the designation trans with reference to the methyl groups.
†† The name tiglic acid is abandoned because it has been associated with the designation cis with reference to the methyl groups.

(Z)-1,2-Dibromo-1-chloro-2-iodoethylene (By the sequence rule, Br is preferred to Cl, but I to Br)

(E)-(3-Bromo-3-chloroallyl)benzene
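
The comparison prescribed by Rule E-2.21 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the Rules: each doubly bonded atom is described by a hypothetical two-entry mapping from the side of the reference plane ("up" or "down") to the element symbol of the directly attached atom, and sequence-rule preference is decided by the atomic number of that first atom alone. Ties at the first atom, which the full sequence rule (Appendix 2) resolves by exploring further along the chain, are simply rejected.

```python
# Atomic numbers of the directly attached atoms used in the examples below.
ATOMIC_NUMBER = {"H": 1, "C": 6, "Cl": 17, "Br": 35, "I": 53}


def preferred_side(end):
    """Return the side ('up' or 'down') holding the sequence-rule-preferred atom.

    `end` is a hypothetical two-entry dict: side -> element symbol.
    Only the first atom is compared (a simplification of the sequence rule).
    """
    (side1, atom1), (side2, atom2) = end.items()
    z1, z2 = ATOMIC_NUMBER[atom1], ATOMIC_NUMBER[atom2]
    if z1 == z2:
        raise ValueError("tie at the first atom; the full sequence rule is needed")
    return side1 if z1 > z2 else side2


def ez_descriptor(end1, end2):
    """Z if the preferred atoms lie on the same side of the reference plane, else E."""
    return "Z" if preferred_side(end1) == preferred_side(end2) else "E"


# 1,2-Dibromo-1-chloro-2-iodoethylene drawn with Br (on C-1) and I (on C-2)
# on the same side of the reference plane.
print(ez_descriptor({"up": "Br", "down": "Cl"}, {"up": "I", "down": "Br"}))  # Z
# 2-Butene drawn with the two methyl carbons on opposite sides.
print(ez_descriptor({"up": "C", "down": "H"}, {"up": "H", "down": "C"}))  # E
```

The first call mirrors the parenthetical remark in the example above: Br is preferred to Cl on one atom, I to Br on the other, and the descriptor follows from whether those two preferred atoms share a side.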

Exceptions to Rule E-2.21. The following are examples of accepted trivial names in which the stereochemistry is prescribed by the name and is not cited by a prefix.

* Systematic names are recommended for derivatives of these compounds formed by substitution on carbon.

E-2.22 (Alternative to Part of E-2.21). (a) When more than one series of locants starting from unity is required to designate the double bonds in a molecule, or when the name consists of two words, the Z and E prefixes together with their appropriate locants may be placed before that part of the name where ambiguity is most effectively removed.

(b) [Alternative to (a)] When several Z or E prefixes are required they are arranged in order as follows: Of the four atoms or groups attached to each doubly bonded pair of atoms, that one preferred by the sequence rule is selected; the single atoms or groups thus selected are then arranged in their sequence rule order (determined in respect of their position in the whole molecule), and the prefixes Z and/or E for the respective double bonds are placed in that order, but without their locants.

Note: In method (a) the final choice is left to an author or editor because of the variety of cases met and because the problems are not always the same in different languages. The presence of the locants usually eases translation from the name to a formula, but this method (a) may involve the logical difficulty explained for the third example below. Method (b) always gives a single unambiguous order and is not subject to the logical difficulty just mentioned, but translation from the name to the formula is harder than for method (a). Method (a) may be more suitable for cursive text, and method (b) for compendia. If method (b) is used it should be used whenever more than one double bond is involved, but method (a) is to be used only under the special conditions detailed in the rule. Examples:

[The last example shows the disadvantages of both methods. In method (a) there is a fault of logic, namely, the 3Z,5E are not the property of the unsubstituted heptadienoic acid chain, but the 3Z arises only because of the side chain that is cited before the 3Z,5E. In method (b) it is some trouble to assign the E,Z,E to the correct double bonds.]

E-2.23. When Rule C-13.1 or E-2.22(b) permits alternatives, preference for lower locants and for inclusion in the principal chain is allotted as follows, in the order stated, so far as necessary: Z over E groups; cis over trans cyclic groups; R over S groups (also r over s, etc., as in the sequence rule); if the nature of these groups is not decisive, then the lower locant for such a preferred group at the first point of difference. Examples:

(a) (2Z,5E)-2,5-Heptadienedioic acid
(b) (E,Z)-2,5-Heptadienedioic acid
[The lower numbers are assigned to the Z double bond.]

* The terms syn, anti, and amphi are abandoned for such compounds.

[According to Rule C-13.1 the principal chain must include the C=C—CH3 group because this gives lower numbers to the double bonds (1,3 rather than 1,4); then the Cl-containing Z group is chosen for the remainder of the principal chain in accord with Rule E-2.23.]

[The principal chain is chosen to include the (R)-group, and the prefix Z refers to the (R)-group.]

E-3. Relative Stereochemistry of Substituents in Monocyclic Compounds†

Preamble. The prefixes cis and trans are commonly used to designate the positions of substituents on rings relative to one another; when the ring is, or is considered to be, rigidly planar or approximately so and is placed horizontally, these prefixes define which groups are above and which below the (approximate) plane of the ring. This differentiation is often important, so this classical terminology is retained in Subsection E-3; since the difficulties inherent in end groups do not arise for cyclic compounds, it is unnecessary to resort to the less immediately informative E/Z symbolism. When the cis-trans designation of substituents is applied, rings are considered in their most extended form; reentrant angles are not permitted; for example

The absolute stereochemistry of optically active or racemic derivatives of monocyclic compounds is described by the sequence-rule procedure (see Rule E-5.9 and Appendix 2). The relative stereochemistry may be described by a modification of sequence-rule symbolism as set out in Rule E-5.10. If either of these procedures is adopted, it is then superfluous to use also cis or trans in the names of individual compounds.

† Formulas in Examples to this Rule denote relative (not absolute) configurations.

E-3.1. When alternative numberings of the ring are permissible according to the Rules of Section C, that numbering is chosen which gives a cis attachment at the first point of difference; if that is not decisive, the criteria of Rule E-2.23 are applied. The prefixes cis and trans may be abbreviated to c and t, respectively, in names of compounds when more than one such designation is required. Examples:

E-3.2. When one substituent and one hydrogen atom are attached at each of two positions of a monocycle, the steric relations of the two substituents are expressed as cis or trans, followed by a hyphen and placed before the name of the compound. Examples:

E-3.3. When one substituent and one hydrogen atom are attached at each of more than two positions of a monocycle, the steric relations of the substituents are expressed by adding r (for reference substituent), followed by a hyphen, before the locant of the lowest numbered of these substituents and c or t (as appropriate), followed by a hyphen, before the locants of the other substituents to express their relation to the reference substituent. Examples:

r-1,t-2,c-4-Trichlorocyclopentane (not r-1,t-2,t-4, which would follow from the alternative direction of numbering; see Rule E-3.1)

t-5-Chloro-r-1,c-3-cyclohexanedicarboxylic acid
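
The bookkeeping of Rules E-3.3 and E-3.1 can be illustrated with a short Python sketch. The data layout is an assumption made for the illustration only: a hypothetical dict maps each substituted locant to the face of the ring ("up" or "down") on which the substituent lies. The sketch reproduces the trichlorocyclopentane example, including the rejected alternative numbering; it does not attempt the further criteria of Rule E-2.23.

```python
def ring_descriptors(faces):
    """Build the r/c/t prefix string for a monocycle (Rule E-3.3).

    `faces` is a hypothetical dict: locant -> 'up' or 'down', giving the face
    of the ring on which the substituent at that locant lies.
    """
    locants = sorted(faces)
    reference = locants[0]  # lowest-numbered substituent is the reference (r)
    parts = [f"r-{reference}"]
    for locant in locants[1:]:
        relation = "c" if faces[locant] == faces[reference] else "t"
        parts.append(f"{relation}-{locant}")
    return ",".join(parts)


def prefer_cis(candidates):
    """Rule E-3.1: choose the numbering giving cis at the first point of difference."""
    # 'c' sorts before 't', so comparing the descriptor letters in order is enough.
    return min(candidates, key=lambda s: [part[0] for part in s.split(",")])


# The same trichlorocyclopentane numbered in the two permissible directions.
one_way = ring_descriptors({1: "up", 2: "down", 4: "up"})
other_way = ring_descriptors({1: "down", 2: "up", 4: "up"})
print(one_way)                           # r-1,t-2,c-4
print(other_way)                         # r-1,t-2,t-4
print(prefer_cis([one_way, other_way]))  # r-1,t-2,c-4
```

Only the relation of each substituent to the reference substituent matters, so exchanging "up" and "down" throughout leaves every descriptor unchanged.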

E-3.4. When two different substituents are attached at the same position of a monocycle, then the lowest numbered substituent named as suffix is selected for designation as reference group in accordance with Rule E-3.2 or E-3.3; or, if none of the substituents is named as suffix, then of the lowest numbered pair that one preferred by the sequence rule is selected as reference group; and the relation of the sequence-rule preferred group at each other position, relative to the reference group, is cited as c or t (as appropriate). Examples:

E-4. Fused Rings Preamble. In simple cases the relative stereochemistry of substituted fused-ring systems can be designated by the methods used for monocycles. For the absolute stereochemistry of optically active and racemic compounds the sequence-rule procedure can be used in all cases (see Rule E-5.9 and Appendix 2), and for related relative stereochemistry the procedure of Rule E-5.10 can be applied. Sequence-rule methods are, however, not descriptive of geometrical shape for other than quite simple cases. There is as yet no generally acceptable system for designating in an immediately interpretable manner the stereochemistry of polycyclic bridged ring compounds (for instance, the endo-exo nomenclature, which should solve one set of problems, has been used in different ways). These and related problems (e.g., cyclophanes, catenanes) will be considered in a later document. E-4.1. Steric relations at saturated bridgeheads common to two rings are denoted by cis or trans, followed by a hyphen and placed before the name of the ring system, according to the relative positions of the exocyclic atoms or groups attached to the bridgeheads. Such rings are said to be cis fused or trans fused. Examples:

E-4.2. Steric relations at more than one pair of saturated bridgeheads in a polycyclic compound are denoted by cis or trans, each followed by a hyphen and, when necessary, the corresponding locant of the lower numbered bridgehead and a second hyphen, all placed before the name of the ring system. Steric relations between the nearest atoms* of cis- or trans-bridgehead pairs may be described by affixes cisoid or transoid, followed by a hyphen and, when necessary, the corresponding locants and a second hyphen, the whole placed between the designations of the cis- or trans-ring junctions concerned. When a choice remains among nearest atoms, the pair containing the lower numbered atom is selected; cis and trans are not abbreviated in such cases. In complex cases, however, designation may be more simply effected by the sequence-rule procedure (see Appendix 2). Examples:

* The term “nearest atoms” denotes those linked together through the smallest number of atoms, irrespective of actual separation in space. For instance, in the second Example to this Rule, the atom 4a is “nearer” to 10a than to 8a.
† For the designation rel see Rule E-5.10.

E-5. Chirality

E-5.1. The property of nonidentity of an object with its mirror image is termed chirality. An object, such as a molecule in a given configuration or conformation, is termed chiral when it is not identical with its mirror image; it is termed achiral when it is identical with its mirror image.

Notes: (1) Chirality is equivalent to handedness, the term being derived from the Greek χείρ = hand. (2) All chiral molecules are molecules of optically active compounds, and molecules of all optically active compounds are chiral. There is a 1:1 correspondence between chirality and optical activity. (3) In organic chemistry the discussion of chirality usually concerns the individual molecule or, more strictly, a model of the individual molecule. The chirality of an assembly of molecules may differ from that of the component molecules, as in a chiral quartz crystal or in an achiral crystal containing equal numbers of dextrorotatory and levorotatory tartaric acid molecules. (4) The chirality of a molecule can be discussed only if the configuration or conformation of the molecule is specifically defined or is considered as defined by common usage. In such discussions structures are treated as if they were (at least temporarily) rigid. For instance, ethane is configurationally achiral although many of its conformations, such as (A), are chiral; in fact, a configuration of a mobile molecule is chiral only if all its possible conformations are chiral; and conformations of ethane such as (B) and (C) are achiral.

(D) and (E) are mirror images and are not identical, not being superposable. They represent chiral molecules. They represent (D) dextrorotatory and (E) levorotatory glyceraldehyde.

(F) is identical with its mirror image. It represents an achiral molecule, namely, a molecule of 1,2,3-propanetriol (glycerol).

E-5.2. The term asymmetry denotes absence of any symmetry. An object, such as a molecule in a given configuration or conformation, is termed asymmetric if it has no element of symmetry. Notes: (1) All asymmetric molecules are chiral, and all compounds composed of them are therefore optically active; however, not all chiral molecules are asymmetric since some molecules having axes of rotation are chiral. (2) Notes (3) and (4) to Rule E-5.1 apply also in discussions of asymmetry. Examples:

E-5.3. (a) An asymmetric atom is one that is tetrahedrally bonded to four different atoms or groups, none of the groups being the mirror image of any of the others.

(b) An asymmetric atom may be said to be at a chiral center since it lies at the center of a chiral tetrahedral structure. In a general sense, the term “chiral center” is not restricted to tetrahedral structures; the structure may, for instance, be based on an octahedron or tetragonal pyramid.

(c) When the atom by which a group is attached to the remainder of a molecule lies at a chiral center, the group may be termed a chiral group.

Notes: (1) The term “asymmetric,” as applied to a carbon atom in Rule E-5.3(a), was chosen by van’t Hoff because there is no plane of symmetry through a tetrahedron whose corners are occupied by four atoms or groups that differ in scalar properties. For differences of vector sense between the attached groups, see Rule E-5.8. (2) In Subsection E-5 the word “group” is used to denote the series of atoms attached to one bond. For instance, in (i) the groups attached to C* are —CH3, —OH, —CH2CH3, and —COOH; in (ii) they are —CH3, —OH, —COCH2CH2CH2, and —CH2CH2CH2CO.

(3) For the chiral axis and chiral plane (which are less common than the chiral center), see Appendix 2. (4) There may be more than one chiral center in a molecule and these centers may be identical, or structurally different, or structurally identical but of opposite chirality; however, the presence of an equal number of structurally identical chiral groups of opposite chirality, and no other chiral group, leads to an achiral molecule. These statements apply also to chiral axes and chiral planes. Identification of the sites and natures of the various factors involved is essential if the overall chirality of a molecule is to be understood. (5) Although the term “chiral group” is convenient for use in discussions it should be remembered that chirality attaches to molecules and not to groups or atoms. For instance, although the sec-butyl group may be termed chiral in dextrorotatory 2-sec-butylnaphthalene, it is not chiral in the achiral compound (CH3CH2)(CH3)CH—CH3. Examples:

In this chiral compound there are two asymmetric carbon atoms, marked C*, each lying at a chiral center. These atoms form part of different chiral groups, namely, —CH(CH3)COOH and —CH(CH3)CH2CH3.

In this molecule (meso-tartaric acid) the two central carbon atoms are asymmetric atoms and each is part of a chiral group —CH(OH)COOH. These groups, however, although structurally identical, are of opposite chirality, so that the molecule is achiral.

E-5.4. Molecules that are mirror images of one another are termed enantiomers and may be said to be enantiomeric. Chiral groups that are mirror images of one another are termed enantiomeric groups.

Hence: enantiomerism (phenomenological)

Note: Although the adjective enantiomeric may be applied to groups, enantiomerism strictly applies only to molecules [see Note (5) to Rule E-5.3]. Examples: The following pairs of molecules are enantiomeric.

The sec-butyl groups in (vi) are enantiomeric.

E-5.5. When equal amounts of enantiomeric molecules are present together, the product is termed racemic, independently of whether it is crystalline, liquid, or gaseous. A homogeneous solid phase composed of equimolar amounts of enantiomeric molecules is termed a racemic compound. A mixture of equimolar amounts of enantiomeric molecules present as separate solid phases is termed a racemic mixture. Any homogeneous solid containing equimolar amounts of enantiomeric molecules is termed a racemate.

Examples: The mixture of two kinds of crystal (mirror-image forms) that separate below 28° from an aqueous solution containing equal amounts of dextrorotatory and levorotatory sodium ammonium tartrate is a racemic mixture. The symmetrical crystals that separate from such a solution above 28°, each containing equal amounts of the two salts, provide a racemic compound.

E-5.6. Stereoisomers that are not enantiomeric are termed diastereoisomers.

Hence: diastereoisomeric (adjectival), diastereoisomerism (phenomenological)

Note: Diastereoisomers may be chiral or achiral. Examples:

E-5.7. A compound whose individual molecules contain equal numbers of enantiomeric groups, identically linked, but no other chiral group, is termed a meso compound. Example:

E-5.8. An atom is termed pseudoasymmetric when bonded tetrahedrally to one pair of enantiomeric groups (+)-a and (-)-a and also to two atoms or groups b and c that are different from group a, different from each other, and not enantiomeric with each other. Examples:

Notes: (1) The orientation, in space, of the atoms around a pseudoasymmetric atom is not reversed on reflection; for a chiral atom (see Note to Rule E-5.3) this orientation is always reversed. (2) Molecules containing pseudoasymmetric atoms may be achiral or chiral. If ligands b and c are both achiral, the molecule is achiral as in the first example to this Rule. If either or both of the nonenantiomeric ligands b and c are chiral, the molecule is chiral, as in the second example to this Rule, that is, the molecule is not identical with its mirror image. A molecule (i) is also chiral if b and c are enantiomeric, that is, if the molecule can be symbolized as (ii), but then, by definition, it does not contain a pseudoasymmetric atom.

(3) Compounds differing at a pseudoasymmetric atom belong to the larger class of diastereoisomers. (4) In example (A), interchange of H and OH on C* gives a different achiral compound, which is an achiral diastereoisomer of (A) (see Rule E-5.6). In example (B), diastereoisomers are produced by inversion at C* or °C, giving in all four diastereoisomers, all chiral because of the —CH(CH3)CH2CH3 group.

E-5.9. Names of chiral compounds whose absolute configuration is known are differentiated by prefixes R, S, etc., assigned by the sequence-rule procedure (see Appendix 2), preceded when necessary by the appropriate locants. Examples:

E-5.10. (a) Names of compounds containing chiral centers, of which the relative but not the absolute configuration is known, are differentiated by prefixes R*, S* (spoken R star, S star), preceded when necessary by the appropriate locants, these prefixes being assigned by the sequence-rule procedure (see Appendix 2) on the arbitrary assumption that the prefix first cited is R.

(b) In complex cases the stars may be omitted and, instead, the whole name is prefixed by rel (for relative).

(c) When only relative configuration is known, enantiomers are distinguished by a prefix (+) or (-), referring to the direction of rotation of plane-polarized light passing through them (wavelength, temperature, solvent, and/or concentration should also be specified, particularly when known to affect the sign).

(d) When a substituent of known absolute chirality is introduced into a compound of which only the relative configuration is known, then starred symbols R*, S* are used and not the prefix rel.

Note: This Rule does not form part of the procedure formulated in the sequence-rule papers by Cahn, Ingold, and Prelog (see Appendix 2). Examples:

E-5.11. When it is desired to express relative or absolute configuration with respect to a class of compounds, specialized local systems may be used. The sequence rule may, however, be used additionally for positions not amenable to treatment by the local system.

Examples:
gluco, arabino, etc., combined when necessary with D or L, for carbohydrates and their derivatives [see IUPAC-IUB Tentative Rules for Carbohydrate Nomenclature; see also J. Org. Chem., 28, 281 (1963)].
D, L for amino acids and peptides [see Comptes rendus of the 16th IUPAC Conference, New York, N.Y., 1951, pp. 107-108; also published in Chem. Eng. News, 30, 4522 (1952)].
D, L, and a series of other prefixes and trivial names for cyclitols and their derivatives [see IUPAC-IUB Tentative Rules for the Nomenclature of Cyclitols, 1967, IUPAC Inf. Bull., No. 32, 51 (1968); also published in J. Biol. Chem., 243, 5809 (1968)].
α, β, and a series of trivial names for steroids and related compounds [see IUPAC-IUB Revised Tentative Rules for the Nomenclature of Steroids, 1967, IUPAC Inf. Bull., No. 33, 23 (1968); also published in J. Org. Chem., 34, 1517 (1969)].

The α, β system for steroids can be extended to other classes of compounds such as terpenes and alkaloids when their absolute configurations are known; it can also be combined with stars or the use of the prefix rel when only the relative configurations are known. In spite of the Rules of Subsection E-2, cis and trans are used when the arrangement of the atoms constituting an unsaturated backbone is the most important factor, as, for instance, in polymer chemistry and for carotenoids. When a series of double bonds of the same stereochemistry occurs in a backbone, the prefix all-cis or all-trans may be used.

E-5.12. (a) An achiral object having at least one pair of features that can be distinguished only by reference to a chiral object or to a chiral reference frame is said to be prochiral, and the property of having such a pair of features is termed prochirality. A consequence is that, if one of the paired features of a prochiral object is considered to differ from the other, the resultant object is chiral.

(b) In a molecule an achiral center or atom is said to be prochiral if it would be held to be chiral when two attached atoms or groups, that taken in isolation are indistinguishable, are considered to differ.

Notes: (1) For a tetrahedrally bonded atom this requires a structure Xaabc (where none of the groups a, b, or c is the enantiomer of another). (2) For a fuller exploration of this concept, which is of particular importance to biochemists and spectroscopists, and for its extension to axes, planes, and unsaturated compounds, see Hanson, K. R., J. Am. Chem. Soc., 88, 2731 (1966). Examples:

In both examples (A) and (B), the methylene carbon atom is prochiral; in both cases it would be held to be at a chiral center if one of the methylene hydrogen atoms were considered to differ from the other. An actual replacement of one of these protium atoms by, say, deuterium would produce an actual chiral center at the methylene carbon atom; as a result, compound (A) would become chiral, and compound (B) would be converted into one of two diastereoisomers.

E-5.13. Of the identical pair of atoms or groups in a prochiral compound, that one which leads to an (R) compound when considered to be preferred to the other by the sequence rule (without change in priority with respect to other ligands) is termed pro-R, and the other is termed pro-S.

Example:
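
The test described in Rule E-5.13 can be sketched numerically. The following Python fragment is an illustration only, built on assumptions that are not part of the Rules: ligands are given as hypothetical bond vectors from the central atom in a roughly tetrahedral arrangement, sequence-rule ranks are supplied by hand (1 = most preferred, the paired ligands sharing a rank), and the descriptor is read from the sign of a scalar triple product of the three most preferred bond vectors (negative corresponding to a clockwise, R, arrangement when the fourth ligand points away from the viewer). The coordinates and the labels Ha and Hb are invented for the example.

```python
def triple_product(a, b, c):
    """Scalar triple product a . (b x c) of three 3-vectors."""
    bxc = (b[1] * c[2] - b[2] * c[1],
           b[2] * c[0] - b[0] * c[2],
           b[0] * c[1] - b[1] * c[0])
    return a[0] * bxc[0] + a[1] * bxc[1] + a[2] * bxc[2]


def pro_label(vectors, rank, promoted):
    """Return 'pro-R' or 'pro-S' for one member of the identical pair.

    vectors:  ligand name -> bond vector from the prochiral center (hypothetical)
    rank:     ligand name -> sequence-rule rank, 1 = most preferred;
              the two paired ligands carry the same rank
    promoted: the paired ligand considered to be preferred to its twin
    """
    # The promoted ligand is placed just ahead of its twin; ranks relative to
    # the other ligands are left unchanged, as the Rule requires.
    order = sorted(vectors, key=lambda name: (rank[name], name != promoted))
    a, b, c = (vectors[name] for name in order[:3])
    return "pro-R" if triple_product(a, b, c) < 0 else "pro-S"


# A methylene carbon of the type X(OH)(CH3)HH with idealized tetrahedral bonds.
vectors = {"OH": (1, 1, 1), "CH3": (1, -1, -1), "Ha": (-1, 1, -1), "Hb": (-1, -1, 1)}
rank = {"OH": 1, "CH3": 2, "Ha": 3, "Hb": 3}
print(pro_label(vectors, rank, "Ha"))  # pro-S
print(pro_label(vectors, rank, "Hb"))  # pro-R
```

Because promoting one twin or the other yields mirror-related arrangements, the two members of the pair always receive opposite labels.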

E-6. Conformations E-6.1. A molecule in a conformation into which its atoms return spontaneously after small displacements is termed a conformer. Examples:

E-6.2. (a) When, in a six-membered saturated ring compound, atoms in relative positions 1, 2, 4, and 5 lie in one plane, the molecule is described as in the chair or boat conformation according as the other two atoms lie, respectively, on opposite sides or on the same side of that plane. Examples:

Note: These and similar representations are idealized, minor divergences being neglected.

(b) A molecule of a monounsaturated six-membered ring compound is described as being in the half-chair or half-boat conformation according as the atoms not directly bound to the doubly bonded atoms lie, respectively, on opposite sides or on the same side of the plane containing the other four (adjacent) atoms. Examples:

(c) A median conformation through which one boat form passes during conversion into the other boat form is termed a twist conformation. Similar twist conformations are involved in conversion of a chair into a boat form or vice versa. Examples:

E-6.3. (a) Bonds to a tetrahedral atom in a six-membered ring are termed equatorial or axial according as they or their projections make a small or a large angle, respectively, with the plane containing a majority of the ring atoms.* Atoms or groups attached to such bonds are also said to be equatorial or axial, respectively. Notes: (1) See, however, pseudoequatorial and pseudoaxial [Rule E-6.3(b)]. (2) The terms equatorial and axial may be abbreviated to e and a when attached to formulas; these abbreviations may also be used in names of compounds and are there placed in parentheses after the appropriate locants, for example, 1(e)-bromo-4(a)-chlorocyclohexane. Examples:

(b) Bonds from atoms directly attached to the doubly bonded atoms in a monounsaturated six-membered ring are termed pseudoequatorial or pseudoaxial according as the angles that they make with the plane containing the majority of the ring atoms approximate those made by, respectively, equatorial or axial bonds from a saturated six-membered ring. Pseudoequatorial and pseudoaxial may be abbreviated to e′ and a′, respectively, when attached to formulas; these abbreviations may also be used in names, then being placed in parentheses after the appropriate locants. Example:

* The terms axial, equatorial, pseudoaxial, and pseudoequatorial [see Rule E-6.3(b)] may be used also in connection with other than six-membered rings if, but only if, their interpretation is then still beyond dispute.

E-6.4. Torsion angle: In an assembly of attached atoms X—A—B—Y, where neither X nor Y is collinear with A and B, the smaller angle subtended by the bonds X—A and Y—B in a plane projection obtained by viewing the assembly along the axis A—B is termed the torsion angle (denoted by the Greek lower case letter theta θ or omega ω). The torsion angle is considered positive or negative according as the bond to the front atom X or Y requires rotation to the right or left, respectively, in order that its direction may coincide with that of the bond to the rear selected atom Y or X. The multiplicity of the bonding of the various atoms is irrelevant. A torsion angle also exists if the axis for rotation is formed by a collinear set of more than two atoms directly attached to each other. Notes: (1) It is immaterial whether the projection be viewed from the front or the rear. (2) For the use of torsion angles in describing molecules see Rule E-6.6. Examples: (For construction of Newman projections, as here, see Rule E-7.2.)
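
For readers who work from atomic coordinates, the signed torsion angle of Rule E-6.4 can be computed with the usual vector formula. The Python sketch below is an illustration, not part of the Rules; the function name and the coordinates are assumptions chosen so that the sign convention (rotation of the front bond to the right counted as positive) and Note (1) can both be checked directly.

```python
import math


def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _dot(p, q):
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2]


def _cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def torsion_angle(x, a, b, y):
    """Signed torsion angle X-A-B-Y in degrees, in the range -180 to +180."""
    b1, b2, b3 = _sub(a, x), _sub(b, a), _sub(y, b)
    n = _cross(b2, b3)
    sin_part = math.sqrt(_dot(b2, b2)) * _dot(b1, n)
    cos_part = _dot(_cross(b1, b2), n)
    return math.degrees(math.atan2(sin_part, cos_part))


# Viewer on the +z axis looking along A -> B; the front bond A-X points "up".
A, B, X = (0.0, 0.0, 0.0), (0.0, 0.0, -1.0), (0.0, 1.0, 0.0)
s, c = math.sin(math.radians(60)), math.cos(math.radians(60))
Y_right = (s, c, -1.0)   # rear bond lies 60 degrees clockwise from the front bond
Y_left = (-s, c, -1.0)   # rear bond lies 60 degrees counterclockwise
print(round(torsion_angle(X, A, B, Y_right), 1))  # 60.0
print(round(torsion_angle(X, A, B, Y_left), 1))   # -60.0
# Note (1): viewing from the rear (Y-B-A-X) gives the same signed angle.
print(round(torsion_angle(Y_right, B, A, X), 1))  # 60.0
```

The formula depends only on the four positions, in keeping with the statement that the multiplicity of the bonding is irrelevant.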

E-6.5. If two atoms or groups attached at opposite ends of a bond appear one directly behind the other when the molecule is viewed along this bond, these atoms or groups are described as eclipsed, and that portion of the molecule is described as being in the eclipsed conformation. If not eclipsed, the atoms or groups and the conformation may be described as staggered. Examples:

Projection of CH3CH2CHO. The CH3 and the H of the CHO are eclipsed. The O and H’s of CH2 in CH2CH3 are staggered.

E-6.6. Conformations are described as synperiplanar (sp), synclinal (sc), anticlinal (ac), or antiperiplanar (ap) according as the torsion angle is within ±30° of 0°, ±60°, ±120°, or ±180°, respectively; the letters in parentheses are the corresponding abbreviations. Atoms or groups are selected from each set to define the torsion angle according to the following criteria: (1) if all the atoms or groups of a set are different, that one of each set that is preferred by the sequence rule; (2) if one of a set is unique, that one; or (3) if all of a set are identical, that one which provides the smallest torsion angle. Examples:

In the above conformations, all of CH2Cl–CH2Cl, the two Cl atoms decide the torsion angle.

[Further example projections not reproduced.] Criterion applied (rear atom, front atom) for each projection, in the order shown: (2, 2), (2, 2), (1, 1), (3, 2); and for the second group of projections: (2, 2), (2, 2), (1, 1).

E-7. Stereoformulas

E-7.1. In a Fischer projection the atoms or groups attached to a tetrahedral center are projected on to the plane of the paper from such an orientation that atoms or groups appearing above or below the central atom lie behind the plane of the paper and those appearing to left and right of the central atom lie in front of the plane of the paper, and that the principal chain appears vertical with the lowest numbered chain member at the top. Examples:

Notes: (1) The first of the two types of Fischer projection should be used whenever convenient. (2) If a Fischer projection formula is rotated through 180° in the plane of the paper, the upward and downward bonds from the central atom still project behind the plane of the paper, and the sideways bonds project in front of that plane. If, however, the formula is rotated through 90° in the plane of the paper, the upward and downward bonds now project in front of the plane of the paper and the sideways bonds project behind that plane.

E-7.2. To prepare a Newman projection, a molecule is viewed along the bond between two atoms; a circle is used to represent these atoms with lines from outside the circle toward its center to represent bonds to other atoms; the lines that represent bonds to the nearer and the further atom end at, respectively, the center and the circumference of the circle. When two such bonds would be coincident in the projection, they are drawn at a small angle to each other.†

*The lone pair of electrons (represented by two dots) on the nitrogen atoms are the unique substituents that decide the description of the conformation (these are the "phantom atoms" of the sequence-rule symbolism).

†Cf. Newman, M. S., Rec. Chem. Progr., 13, 111 (1952); J. Chem. Educ., 33, 344 (1955); Steric Effects in Organic Chemistry, John Wiley & Sons, New York, 1956, 5.
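Note (2) to Rule E-7.1 on rotating a Fischer projection can be checked numerically. The following Python sketch is an illustration only: the coordinates, the group labels a to d, and the function names are assumptions chosen here. It places the four groups in space according to the Fischer convention and compares the sign of the volume spanned by groups a, b, and c before and after rotating the drawing. If a 90° rotated drawing were reread with the usual convention it would describe the mirror image, which is why the Note requires the convention to be reversed in that case.

```python
# Fischer convention (Rule E-7.1): vertical bonds point behind the paper
# (z < 0), horizontal bonds point in front of it (z > 0).
COORDS = {
    "top": (0, 1, -1),
    "bottom": (0, -1, -1),
    "left": (-1, 0, 1),
    "right": (1, 0, 1),
}
CLOCKWISE = ["top", "right", "bottom", "left"]


def rotate_in_plane(projection, quarter_turns):
    """Rotate the drawing clockwise by quarter_turns * 90 degrees and
    reread the result as an ordinary Fischer projection."""
    return {
        CLOCKWISE[(CLOCKWISE.index(pos) + quarter_turns) % 4]: group
        for pos, group in projection.items()
    }


def handedness(projection):
    """Sign (+1 or -1) of the signed volume spanned by groups a, b, c."""
    where = {group: COORDS[pos] for pos, group in projection.items()}
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = where["a"], where["b"], where["c"]
    volume = (ax * (by * cz - bz * cy)
              + ay * (bz * cx - bx * cz)
              + az * (bx * cy - by * cx))
    return 1 if volume > 0 else -1


original = {"top": "a", "right": "b", "bottom": "c", "left": "d"}
print(handedness(original))                       # reference handedness
print(handedness(rotate_in_plane(original, 2)))   # 180 degrees: unchanged
print(handedness(rotate_in_plane(original, 1)))   # 90 degrees: inverted
```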


Examples:

E-7.3. General Note: Formulas that display stereochemistry should be prepared with extra care so as to be unambiguous and, whenever possible, self-explanatory. It is inadvisable to try to lay down rules that will cover every case, but the following points should be borne in mind.

A thickened line (━) denotes a bond projecting from the plane of the paper toward an observer, a broken line (- - -) denotes a bond projecting away from an observer, and, when this convention is used, a full line of normal thickness (–) denotes a bond lying in the plane of the paper. A wavy line may be used to denote a bond whose direction cannot be specified or, if it is explained in the text, a bond whose direction it is not desired to specify in the formula. Dotted lines (· · · · · ·) should preferably not be used to denote stereochemistry, and never when they are used in the same paper to denote mesomerism, intermediate states, etc. Wedges should not be used as complement to broken lines (but see below). Single large dots have sometimes been used to denote atoms or groups attached at bridgehead positions and lying above the plane of the paper, with open circles to denote them lying below the plane of the paper, but this practice is strongly deprecated.

Hydrogen or other atoms or groups attached at sterically designated positions should never be omitted.

In chemical formulas, rings are usually drawn with lines of normal thickness, that is, as if they lay wholly in the plane of the paper even though this may be known not to be the case. In a formula such as (I) it is then clear that the H atoms attached at the A/B ring junction lie further from the observer than these bridgehead atoms, that the H atoms attached at the B/C ring junction lie nearer to the observer than those bridgehead atoms, and that X lies nearer to the observer than the neighboring atom of ring C.

However, ambiguity can then sometimes arise, particularly when it is necessary to show stereochemistry within a group such as X attached to the rings that are drawn planar. For instance, in formula (II), the atoms O and C*, lying above the plane of the paper, are attached to ring B by thick bonds, but then, when showing the stereochemistry at C*, one finds that the bond from C* to ring B projects away from the observer and so should be a broken line. Such difficulties can be overcome by using wedges in place of lines, the broader end of the wedge being considered nearer to the observer, as in (III). In some fields, notably for carbohydrates, rings are conveniently drawn as if they lay perpendicular to the plane of the paper, as represented in (IV); however, conventional formulas such as (V), with the lower bonds considered as the nearer to the observer, are so well established that it is rarely necessary to elaborate this to form (IV).

By a similar convention, in drawings such as (VI) and (VII), the lower sets of bonds are considered to be nearer than the upper to the observer. In (VII), note the gaps in the rear lines to indicate that the bonds crossing them pass in front (and thus obscure sections of the rear bonds). In some cases, when atoms have to be shown as lying in several planes, the various conventions may be combined, as in (VIII). In all cases the overriding aim should be clarity.

APPENDIX 1. CONFIGURATION AND CONFORMATION

See Rules E-1.4 and E-1.5. Various definitions have been propounded to differentiate configurations from conformations. The original usage was to consider as conformations those arrangements of the atoms of a molecule in space that can be interconverted by rotation(s) around a single bond, and as configurations those other arrangements whose interconversion by rotation requires bonds to be broken and then re-formed differently. Interconversion of different configurations will then be associated with substantial energies of activation, and the various species will be separable, but interconversion of different conformations will normally be associated with less activation energy, and the various species, if separable, will normally be more readily interconvertible. These differences in activation energy and stability are often large.

Nevertheless, rigid differentiation on such grounds meets formidable difficulties. Differentiation by energy criteria would require an arbitrary cut in a continuous series of values. Differentiation by stability of isolated species requires arbitrary assumptions about conditions and half-lives. Differentiation on the basis of rotation around single bonds meets difficulties connected both with the concept of rotation and with the selection of single bonds as requisites, and these need more detailed discussion here. Enantiomeric biaryls are nowadays usually considered to differ in conformation, any difficulty in rotation about the 1,1′ bond due to steric hindrance between the neighboring groups being considered to be overcome by bond bending and/or bond stretching, even though the movements required must closely approach bond breaking if these substituents are very large. Similar doubts about the possibility of rotation occur with a molecule such as (A), where rotation of the benzene ring around the oxygen-to-ring single bonds affords easy interconversion if x is large but appears to be physically impossible if x is small; and no critical size of x can be reasonably established. For reasons such as this, Rules E-1.4 and E-1.5 are so worded as to be independent of whether rotation appears physically feasible or not (see Note 2 to those Rules).

The second difficulty arises in the many cases where rotation is around a bond of fractional order between one and two, as in the helicenes, crowded aromatic molecules, metallocenes, amides, thioamides, and carbene-metal coordination compounds (such as B). The term conformation is customarily used in these cases and that appears a reasonable extension of the original conception, though it will be wise to specify the usage if the reader might be in doubt. When interpreted in these ways, Rules E-1.4 and E-1.5 reflect the most frequent usage of the present day and provide clear distinctions in most situations. Nevertheless, difficulties remain and a number of other usages have been introduced. It appears to some workers that once it is admitted that change of conformation may involve rotation about bonds of fractional order between one and two, it is then illogical to exclude rotation about classical double bonds because interconversion of open-chain cis-trans isomers depends on no fundamentally new principle and is often relatively easy, as for certain alkene derivatives such as stilbenes and for azo compounds, by irradiation. This extension is indeed not excluded by Rules E-1.4 and E-1.5, but if it is applied that fact should be explicitly stated. A further interpretation is to regard a stereoisomer possessing some degree of stability (that is, one associated with an energy hollow, however shallow) as a configurational isomer, the other arrangements in space being termed conformational isomers; the term conformer (Rule E-6.1) is then superfluous. This definition, however, requires a knowledge of stability (energy relations) that is not always available. 
In another view, a configurational isomer is any stereoisomer that can be isolated or (for some workers) whose existence can be established (for example, by physical methods); all other arrangements then represent conformational isomers; but it is then impossible to differentiate configuration from conformation without involving experimental efficiency or conditions of observation. Yet another definition is to regard a conformation as a precise description of a configuration in terms of bond distances, bond angles, and dihedral angles.

In none of the above views except the last is attention paid to extension or contraction of the bond to an atom that is attached to only one other atom, such as –H or =O. Yet such changes in interatomic distance due to nonbonded interactions may be important, for instance, in hydrogen bonding, in differences due to crystal form, in association in solution, and in transition states. This area may repay further consideration.

Owing to the circumstances outlined above, the Rules E-1.4 and E-1.5 have been deliberately made imprecise, so as to permit some alternative interpretations, but they are not compatible with all the definitions mentioned above. The time does not seem ripe to legislate for other than the commoner usages or to choose finally between these. It is, however, encouraging that no definition in this field has (yet) involved atomic vibrations for which, in all cases, only time-average positions are considered. Finally it should be noted that an important school of thought uses conformation with the connotation of "a particular geometry of the molecule, i.e., a description of atoms in space in terms of bond distances, bond angles, and dihedral angles," a definition much wider than any discussed above.

APPENDIX 2. OUTLINE OF THE SEQUENCE-RULE PROCEDURE

The sequence-rule procedure is a method of specifying the absolute molecular chirality (handedness) of a compound, that is, a method of specifying in which of two enantiomeric forms each chiral element of a molecule exists. For each chiral element in the molecule it provides a symbol, usually R or S, which is independent of nomenclature and numbering. These symbols define the chirality of the specific compound considered; they may not be the same for a compound and some of its derivatives; and they are not necessarily constant for chemically similar situations within a chemical or a biogenetic class.
The procedure is applied directly to a three-dimensional model of the structure, and not to any two-dimensional projection thereof. The method has been developed to cover all compounds with ligancy up to four and with ligancy six,* and for all configurations and conformations of such compounds. The following is an outline confined to the most common situations; it is essential to study the original papers, especially the 1966 paper,† before using the sequence rule for other than fairly simple cases.

General Basis

The sequence rule itself is a method of arranging atoms or groups (including chains and rings) in an order of precedence, often referred to as an order of preference; for discussion this order can conveniently be generalized as a > b > c > d, where > denotes "is preferred to." The first step, however, in considering a model is to identify the nature and position of each chiral element that it contains. There are three types of the chiral element, namely, the chiral center, the chiral axis, and the chiral plane. The chiral center, which is very much the most commonly met, is exemplified by an asymmetric carbon atom with the tetrahedral arrangement of ligands, as in (1). A chiral axis is present in, for instance, the chiral allenes such as (2) or the chiral biaryl derivatives. A chiral plane is exemplified by the plane containing the benzene ring and the bromine and oxygen atoms in the chiral compound (3), or by the underlined atoms in the cycloalkene (4). Clearly, more than one type of chiral element may be present in one compound; for instance, group "a" in (2) might be a sec-butyl group which contains a chiral center.

*Ligancy refers to the number of bonds from an atom, independently of the nature of the bonds.

†Cahn, R. S., Ingold, C., and Prelog, V., Angew. Chem. Int. Ed., 5, 385 (1966); errata, 5, 511 (1966); Angew. Chem., 78, 413 (1966). Earlier papers: Cahn, R. S. and Ingold, C. K., J. Chem. Soc. (Lond.), 612 (1951); Cahn, R. S., Ingold, C., and Prelog, V., Experientia, 12, 81 (1956). For a partial, simplified account see Cahn, R. S., J. Chem. Educ., 41, 116 (1964); errata, 41, 503 (1964).

The Chiral Center

Let us consider first the simplest case, namely, a chiral center (such as carbon) with four ligands, a, b, c, and d, which are all different atoms tetrahedrally arranged as in CHFClBr. The four ligands are arranged in order of preference by means of the sequence rule; this contains five subrules, which are applied in succession so far as necessary to obtain a decision. The first subrule is all that is required in a great majority of actual cases; it states that ligands are arranged in order of decreasing atomic number, in the above case (a) Br > (b) Cl > (c) F > (d) H. There would be two (enantiomeric) forms of the compound and we can write these as (5) and (6). In the sequence-rule procedure the model is viewed from the side remote from the least-preferred ligand (d), as illustrated. Then, tracing a path from a to b to c in (5) gives a clockwise course, which is symbolized by (R) (Latin rectus, right; for right hand); in (6) it gives an anticlockwise course, symbolized as (S) (Latin sinister, left). Thus (5) would be named (R)-bromochlorofluoromethane, and (6) would be named (S)-bromochlorofluoromethane. Here already it may be noted that converting one enantiomer into another changes each R to S, and each S to R, always. It will be seen also that the chirality prefix is the same whether the alphabetical order is used, as above, for naming the substituents or whether this is done by the order of complexity (giving fluorochlorobromomethane).
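For this simplest case the viewing procedure can be expressed numerically. The Python sketch below is an illustration under stated assumptions, not part of the sequence rule itself: it applies only the first subrule (decreasing atomic number), expects ligand positions relative to the chiral center in a roughly tetrahedral arrangement, and uses the sign of the volume spanned by a, b, and c in place of "looking from the side remote from d." The function name and the coordinates are hypothetical.

```python
def chirality_symbol(ligands):
    """ligands: four (atomic_number, (x, y, z)) pairs, positions relative
    to the chiral center. Returns "R" or "S" using subrule 1 only."""
    ranked = sorted(ligands, key=lambda item: item[0], reverse=True)
    numbers = [z for z, _ in ranked]
    if len(ranked) != 4 or len(set(numbers)) != 4:
        raise ValueError("sketch needs four ligands of different atomic number")
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = (pos for _, pos in ranked[:3])
    # Signed volume a . (b x c). With d pointing away from the viewer,
    # a negative value means a -> b -> c runs clockwise.
    volume = (ax * (by * cz - bz * cy)
              + ay * (bz * cx - bx * cz)
              + az * (bx * cy - by * cx))
    if volume == 0:
        raise ValueError("ligands a, b, c are coplanar with the center")
    return "R" if volume < 0 else "S"


# Idealized tetrahedral directions (assumed coordinates, arbitrary units).
down = (0.0, 0.0, -1.0)
p1 = (0.943, 0.0, 0.333)
p2 = (-0.471, 0.816, 0.333)
p3 = (-0.471, -0.816, 0.333)

one_form = [(35, p1), (17, p2), (9, p3), (1, down)]     # Br, Cl, F, H
mirror_form = [(35, p1), (17, p3), (9, p2), (1, down)]  # Cl and F exchanged
print(chirality_symbol(one_form), chirality_symbol(mirror_form))
```

Exchanging any two ligands converts one form into its enantiomer, so the two calls return opposite symbols.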

Next, suppose we have H3C–CHClF. We deal first with the atoms directly attached to the chiral center; so the four ligands to be considered are Cl > F > C (of CH3) > H. Here the H's of the CH3 are not concerned, because we do not need them in order to assign our symbol. However, atoms directly attached to a center are often identical, as, for example, the underlined C's in H3C–CHCl–CH2OH. For such a compound we at once establish a preference (a) Cl > (b, c) C,C > (d) H. Then to decide between the two C's we work outward, to the atoms to which they in turn are directly attached, and we then find sets which we can conveniently write as C(H,H,H) and C(O,H,H). We have to compare H,H,H with O,H,H, and since oxygen has a higher atomic number than hydrogen we have O > H and thence the complete order Cl > C (of CH2OH) > C (of CH3) > H, so that the chirality symbol can then be determined from the three-dimensional model.

We must next meet the first complication. Suppose that we have a molecule (7).

To decide between the two C's we first arrange the atoms attached to them in their order of preference, which gives C(Cl,C,H) on the left and C(F,O,H) on the right. Then we compare the preferred atom of one set (namely, Cl) with the preferred atom (F) of the other set, and as Cl > F we arrive at the preferences a > b > c > d shown in (7) and chirality (S). If, however, we had a compound (8) we should have met C(Cl,C,H) and C(Cl,O,H) and, since the atoms of first preference are identical (Cl), we should have had to make the comparisons with the atoms of second preference, namely, O > C, which leads to the different chirality (R) as shown in (8).
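The comparison of two sets of attached atoms at the first point of difference amounts to a lexicographic comparison of atomic numbers. The following Python sketch shows this for the cases discussed above; the function names and the small element table are illustrative assumptions, and only the first subrule is covered.

```python
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "F": 9, "Cl": 17, "Br": 35}


def substituent_key(atoms):
    """Atomic numbers of a set of attached atoms, most preferred first."""
    return sorted((ATOMIC_NUMBER[a] for a in atoms), reverse=True)


def preferred(left, right):
    """Compare two sets at the first point of difference.

    Returns "left", "right", or "tie" (a tie means exploration must
    continue further outward)."""
    key_left, key_right = substituent_key(left), substituent_key(right)
    if key_left == key_right:
        return "tie"
    return "left" if key_left > key_right else "right"


print(preferred(["H", "H", "H"], ["O", "H", "H"]))    # CH3 against CH2OH
print(preferred(["Cl", "C", "H"], ["F", "O", "H"]))   # the sets of (7)
print(preferred(["Cl", "C", "H"], ["Cl", "O", "H"]))  # the sets of (8)
```

In (7) the decision is reached at the first atoms compared (Cl against F), so the O on the right never comes into play; in (8) the first atoms tie and the second atoms decide.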

Branched ligands are treated similarly. Setting them out in full gives a picture that at first sight looks complex but the treatment is in fact simple. For instance, in compound (9) a first quick glance again shows (a) Cl > (b, c) C,C > (d) H. When we expand the two C's we find they are both C(C,C,H), so we continue exploration. Considering first the left-hand ligand we arrange the branches and their sets of atoms in order thus: C(Cl,H,H) > C(H,H,H). On the right-hand side we have C(O,C,H) > C(O,H,H) (because C > H). We compare first the preferred of these branches from each side and we find C(Cl,H,H) > C(O,C,H) because Cl > O, and that gives the left-hand branch preference over the right-hand branch. That is all we need to do to establish chirality (S) for this highly branched compound (9). Note that it is immaterial here that, for the lower branches, the right-hand C(O,H,H) would have been preferred to the left-hand C(H,H,H); we did not need to reach that point in our comparisons and so we are not concerned with it; but we should have reached it if the two top (preferred) branches had both been the same CH2Cl. Rings, when met during outward exploration, are treated in the same way as branched chains.
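The branch-by-branch comparison for compound (9) can be sketched in Python as follows. This is a simplified illustration that looks only one sphere beyond the branching atoms; the function name and the encoding of each branch as a tuple of atomic numbers are assumptions made here.

```python
def ligand_key(branches):
    """branches: one tuple of atomic numbers per branch of a ligand.

    Each branch set is put in order of preference, then the branches
    themselves are ordered, preferred branch first."""
    sets = [tuple(sorted(branch, reverse=True)) for branch in branches]
    return sorted(sets, reverse=True)


# Left-hand ligand of (9): branches C(Cl,H,H) and C(H,H,H).
left = ligand_key([(1, 1, 1), (17, 1, 1)])
# Right-hand ligand of (9): branches C(O,C,H) and C(O,H,H).
right = ligand_key([(8, 1, 1), (8, 6, 1)])

print(left)           # preferred branch listed first
print(right)
print(left > right)   # decided at the preferred branches: Cl > O

# Hypothetical variant in which both preferred branches are CH2Cl:
# only then do the lower branches decide, and the right-hand side wins.
right_variant = ligand_key([(17, 1, 1), (8, 1, 1)])
print(right_variant > left)
```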


With these simple procedures alone, quite complex structures can be handled; for instance, the analysis alongside Formula (10) for natural morphine explains why the specification is as shown. The reason for considering C-12 as C(C,C,C) is set out in the next paragraphs.

Now, using the sequence rule depends on exploring along bonds. To avoid theoretical arguments about the nature of bonds, simple classical forms are used. Double and triple bonds are split into two and three bonds, respectively. A >C=O group is treated as (i) (below) where the (O) and the (C) are duplicate representations of the atoms at the other end of the double bond. –C≡CH is treated as (ii) and –C≡N is treated as (iii).

Thus in D-glyceraldehyde (11) the CHO group is treated as C(O,(O),H) and is thus preferred to the C(O,H,H) of the CH2OH group, so that the chirality symbol is (R).
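The splitting of multiple bonds into duplicate atoms can be sketched as below. The Python function is hypothetical and handles only the atoms directly attached to one ligand atom; a bond of order n is taken to contribute the real atom plus n - 1 duplicates, as described above.

```python
def expanded_set(neighbors):
    """neighbors: (atomic_number, bond_order) pairs for the atoms attached
    to a ligand atom, leaving out the bond back to the chiral center.

    Returns the attached atomic numbers, duplicates included, most
    preferred first."""
    atoms = []
    for atomic_number, order in neighbors:
        atoms.extend([atomic_number] * order)   # real atom + duplicates
    return sorted(atoms, reverse=True)


cho = expanded_set([(8, 2), (1, 1)])              # C(O,(O),H)
ch2oh = expanded_set([(8, 1), (1, 1), (1, 1)])    # C(O,H,H)
nitrile = expanded_set([(7, 3)])                  # C(N,(N),(N))

# Ligands of D-glyceraldehyde: (atomic number of first atom, its set).
ligands = {
    "OH": (8, [1]),
    "CHO": (6, cho),
    "CH2OH": (6, ch2oh),
    "H": (1, []),
}
order = sorted(ligands, key=lambda name: ligands[name], reverse=True)
print(cho, ch2oh, nitrile)
print(" > ".join(order))
```

The resulting order of preference, together with the three-dimensional arrangement, is what leads to the symbol (R) quoted for D-glyceraldehyde.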

Only the doubly bonded atoms themselves are duplicated, and not the atoms or groups attached to them; the duplicated atoms may thus be considered as carrying three phantom atoms (see below) of atomic number zero. This may be important in deciding preferences in certain complicated cases. Aromatic rings are treated as Kekulé structures. For aromatic hydrocarbon rings it is immaterial which Kekulé structure is used because "splitting" the double bonds gives the same result in all cases; for instance, for phenyl the result can be represented as (12a) where "(6)" denotes the atomic number of the duplicate representations of carbon.

(13)

For aromatic hetero rings, each duplicate is given an atomic number that is the mean of what it would have if the double bond were located at each of the possible positions. A complex case is illustrated in (13). Here C-1 is doubly bonded to one or other of the nitrogen atoms (atomic number 7) and never to carbon, so its added duplicate has atomic number 7; C-3 is doubly bonded either to C-4 (atomic number 6) or to N-2 (atomic number 7), so its added duplicate has atomic number 6½, that is, (6 + 7)/2; so has that of C-8; but C-4a may be doubly bonded to C-4, C-5, or N-9, so its added duplicate has atomic number 6.33, that is, (6 + 6 + 7)/3.

One last point about the chiral center may be added here. Except for hydrogen, ligancy, if not already four, is made up to four by adding "phantom atoms" which have atomic number zero and are thus always last in order of preference. This has various uses but perhaps the most interesting is where nitrogen occurs in a rigid skeleton, as, for example, in α-isosparteine (14). Here the phantom atom can be placed where the nitrogen lone pair of electrons is; then N-1 appears as shown alongside the formula; and the chirality (R) is the consequence. The same applies to N-16. Phantom atoms are similarly used when assigning chirality symbols to chiral sulfoxides (see example to Rule E-5.9).

SOME COMMON GROUPS IN ORDER OF SEQUENCE-RULE PREFERENCEᵃ

A. Alphabetical Order (Higher Number Denotes Greater Preference)

64 Acetoxy                  38 Carboxyl             9 Isobutyl             55 Nitroso
36 Acetyl                   74 Chloro               8 Isopentyl             6 n-Pentyl
48 Acetylamino              17 Cyclohexyl          20 Isopropenyl          61 Phenoxy
21 Acetylenyl               52 Diethylamino        14 Isopropyl            22 Phenyl
10 Allyl                    51 Dimethylamino       69 Mercapto             47 Phenylamino
43 Amino                    34 2,4-Dinitrophenyl   58 Methoxy              54 Phenylazo
44 Ammonio +H3N–            28 3,5-Dinitrophenyl   39 Methoxycarbonyl      18 Propenyl
37 Benzoyl                  59 Ethoxy               2 Methyl                4 n-Propyl
49 Benzoylamino             40 Ethoxycarbonyl      45 Methylamino          29 1-Propynyl
65 Benzoyloxy                3 Ethyl               71 Methylsulfinyl       12 2-Propynyl
50 Benzyloxycarbonylamino   46 Ethylamino          66 Methylsulfinyloxy    73 Sulfo
13 Benzyl                   68 Fluoro              72 Methylsulfonyl       25 m-Tolyl
60 Benzyloxy                35 Formyl              67 Methylsulfonyloxy    30 o-Tolyl
41 Benzyloxycarbonyl        63 Formyloxy           70 Methylthio           23 p-Tolyl
75 Bromo                    62 Glycosyloxy         11 Neopentyl            53 Trimethylammonio
42 tert-Butoxycarbonyl       7 n-Hexyl             56 Nitro                32 Trityl
 5 n-Butyl                   1 Hydrogen            27 m-Nitrophenyl        15 Vinyl
16 sec-Butyl                57 Hydroxy             33 o-Nitrophenyl        31 2,6-Xylyl
19 tert-Butyl               76 Iodo                24 p-Nitrophenyl        26 3,5-Xylyl

B. Increasing Order of Sequence-Rule Preference

 1 Hydrogen       20 Isopropenyl          39 Methoxycarbonylᵇ          58 Methoxy
 2 Methyl         21 Acetylenyl           40 Ethoxycarbonylᵇ           59 Ethoxy
 3 Ethyl          22 Phenyl               41 Benzyloxycarbonylᵇ        60 Benzyloxy
 4 n-Propyl       23 p-Tolyl              42 tert-Butoxycarbonylᵇ      61 Phenoxy
 5 n-Butyl        24 p-Nitrophenyl        43 Amino                     62 Glycosyloxy
 6 n-Pentyl       25 m-Tolyl              44 Ammonio +H3N–             63 Formyloxy
 7 n-Hexyl        26 3,5-Xylyl            45 Methylamino               64 Acetoxy
 8 Isopentyl      27 m-Nitrophenyl        46 Ethylamino                65 Benzoyloxy
 9 Isobutyl       28 3,5-Dinitrophenyl    47 Phenylamino               66 Methylsulfinyloxy
10 Allyl          29 1-Propynyl           48 Acetylamino               67 Methylsulfonyloxy
11 Neopentyl      30 o-Tolyl              49 Benzoylamino              68 Fluoro
12 2-Propynyl     31 2,6-Xylyl            50 Benzyloxycarbonylamino    69 Mercapto HS–
13 Benzyl         32 Trityl               51 Dimethylamino             70 Methylthio CH3S–
14 Isopropyl      33 o-Nitrophenyl        52 Diethylamino              71 Methylsulfinyl
15 Vinyl          34 2,4-Dinitrophenyl    53 Trimethylammonio          72 Methylsulfonyl
16 sec-Butyl      35 Formyl               54 Phenylazo                 73 Sulfo HO3S–
17 Cyclohexyl     36 Acetyl               55 Nitroso                   74 Chloro
18 1-Propenyl     37 Benzoyl              56 Nitro                     75 Bromo
19 tert-Butyl     38 Carboxyl             57 Hydroxy                   76 Iodo

ᵃ ANY alteration to structure, or substitution, etc., may alter the order of preference.
ᵇ These groups are ROC(=O)–.

(14) (1R,6R,7S,9S,11R,16R)-Sparteine

Symbolism

In names of compounds, the R and S symbols, together with their locants, are placed in parentheses, normally in front of the name, as shown for morphine (10) and sparteine (14), but this may be varied in indexes or in languages other than English. Positions within names are required, however, when more than a single series of numerals is used, as for esters and amines. When relative stereochemistry is more important than absolute stereochemistry, as for steroids or carbohydrates, a local system of stereochemical designation may be more useful and sequence-rule symbols need then be used only for any situations where the local system is insufficient.

Racemates containing a single center are labeled (RS). If there is more than one center the first is labeled (RS) and the others are (RS) or (SR) according to whether they are R or S when the first is R. For instance, the 2,4-pentanediols CH3–CH(OH)–CH2–CH(OH)–CH3 are differentiated as

one chiral form      (2R,4R)-
other chiral form    (2S,4S)-
meso compound        (2R,4S)-
racemic compound     (2RS,4RS)-
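The labeling of racemates with more than one center follows mechanically from the symbols of either enantiomer. A minimal Python sketch, with hypothetical function names, is given below; it assumes the input really describes one enantiomer of a racemate (a meso compound such as the (2R,4S)-diol is outside its scope).

```python
def enantiomer(centers):
    """Invert every center: each R becomes S and each S becomes R."""
    flip = {"R": "S", "S": "R"}
    return [(locant, flip[symbol]) for locant, symbol in centers]


def racemate_label(centers):
    """centers: (locant, "R" or "S") pairs describing ONE enantiomer.

    The first center is labeled RS; each other center is RS or SR
    according to whether it is R or S when the first center is R."""
    if centers[0][1] == "S":
        centers = enantiomer(centers)   # use the form whose first center is R
    parts = [f"{locant}{'RS' if symbol == 'R' else 'SR'}"
             for locant, symbol in centers]
    return "(" + ",".join(parts) + ")"


print(racemate_label([(2, "R"), (4, "R")]))   # the racemic 2,4-pentanediol
print(racemate_label([(2, "S"), (4, "S")]))   # same racemate, other enantiomer
print(racemate_label([(2, "S"), (3, "R")]))   # hypothetical unlike pair
```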

Finally the principles by which some of the least rare of other situations are treated will be very briefly summarized.

Pseudoasymmetric Atoms

A subrule decrees that R groups have preference over S groups and this permits pseudoasymmetric atoms, as in abC(c-R)(c-S), to be treated in the same way as chiral centers, but as such a molecule is achiral (not optically active) it is given the lower case symbol r or s.

Chiral Axis

The structure is regarded as an elongated tetrahedron and viewed along the axis (it is immaterial from which end it is viewed); the nearer pair of ligands receives the first two positions in the order of preference, as shown in (15) and (16).

(16)

Chiral Plane

The sequence-rule-preferred atom directly attached to the plane is chosen as "pilot atom." In compound (3) this is the C of the left-hand CH2 group. Now this is attached to the left-hand oxygen atom in the plane. The sequence-rule-preferred path from this oxygen atom is then explored in the plane until a rotation is traced which is clockwise (R) or anticlockwise (S) when viewed from the pilot atom. In (3) this path is O → C → C(Br) and it is clockwise (R).

Other Subrules

Other subrules cater for new chirality created by isotopic labeling (higher mass number preferred to lower) and for steric differences in the ligands. Isotopic labeling rarely changes symbols allotted to other centers.

Octahedral Structures

Extensions of the sequence rule enable ligands arranged octahedrally to be placed in an order of preference, including polydentate ligands, so that a chiral structure can then always be represented as one of the enantiomeric forms (17) and (18). The face 1-2-3 is observed from the side remote from the face 4-5-6 (as marked by arrows), and the path 1 → 2 → 3 is observed; in (17) this path is clockwise (R), and in (18) it is anticlockwise (S).

Conformations

The torsion angle between selected bonds from two singly bonded atoms is considered. The selected bond from each of these two atoms is that to a unique ligand, or otherwise to the ligand preferred by the sequence rule. The smaller rotation needed to make the front ligand eclipsed with the rear one is noted (this is the rotatory characteristic of a helix); if this rotation is right-handed it leads to a symbol P (plus); if left-handed to M (minus). [Example structures not reproduced.]
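Once the torsion angle between the selected bonds is known, the P or M symbol follows from its sign. The Python sketch below is illustrative only: the function name is hypothetical, the angle is assumed to be given in the range (-180°, +180°] with right-handed rotation counted positive, and rejecting exactly 0° and 180° (where no helical sense can be read off) is an assumption made here.

```python
def helicity_symbol(torsion_deg: float) -> str:
    """Return "P" (plus) for a right-handed, positive torsion angle and
    "M" (minus) for a left-handed, negative one."""
    if not -180.0 < torsion_deg <= 180.0:
        raise ValueError("expected an angle in the range (-180, +180]")
    if torsion_deg in (0.0, 180.0):
        raise ValueError("no helical sense at exactly 0 or 180 degrees")
    return "P" if torsion_deg > 0 else "M"


print(helicity_symbol(60.0))     # right-handed rotation
print(helicity_symbol(-120.0))   # left-handed rotation
```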


Details and Complications

For details and complicating factors the original papers should be consulted. They include treatment of compounds with high symmetry or containing repeating units (e.g., cyclitols), also π bonding (metallocenes, etc.), mesomeric compounds and mesomeric radicals, and helical and other secondary structures.

ABBREVIATIONS AND SYMBOLS FOR THE DESCRIPTION OF THE CONFORMATION OF POLYPEPTIDE CHAINS
TENTATIVE RULES (1969)*

IUPAC-IUB Commission on Biochemical Nomenclature

Preamble

These Rules are based on "A Proposal of Standard Conventions and Nomenclature for the Description of Polypeptide Conformation" (Edsall et al.),⁸ and have been prepared by a subcommission set up by the IUPAC-IUB Commission on Biochemical Nomenclature in 1966.† The original proposals have been modified so as to bring them as far as possible into line with the system of nomenclature current in the fields of organic and polymer chemistry. Two Recommendations are appended to the Rules, the first dealing with the terms configuration and conformation, and the second with primary, secondary, and tertiary structure. These are formulated as recommendations rather than rules because there is at present no general agreement about their definition.

Note: Two alternative notations are recommended throughout. That with superscripts and subscripts may be used when it is unlikely to cause confusion, e.g., in printed or manuscript material; that without is to be used where superscripts or subscripts may cause confusion, or are technically difficult or impossible, e.g., in computer outputs. In the latter connection the following Roman equivalents of Greek letters are recommended:

Rule 1. General Principles of Notation

1.1. Designation of Atoms. The atoms of the main chain are denoted thus:

Where confusion might arise the following additional symbolism may be used:

1.2. Amino-acid residues, –NH–CHR–CO–, are numbered sequentially from the amino-terminal to the carboxyl-terminal end of the chain, the residue number being denoted i. Example:

†The members of the subcommission were J. C. Kendrew (Chairman), W. Klyne, S. Lifson, T. Miyazawa, G. Némethy, D. C. Phillips, G. N. Ramachandran, and H. A. Scheraga. In addition, the following assisted in the work of the subcommission: R. S. Cahn, R. Diamond, J. T. Edsall, P. J. Flory, C. K. Ingold, A. Liquori, V. Prelog, and J. A. Schellman.

*From IUPAC-IUB Commission on Biochemical Nomenclature, Pure Appl. Chem., 40(3), in press. With permission.

1.3. For some purposes it is more convenient to group together the atoms –CHR–CO–NH–. These groups are described as "peptide units," and the peptide unit number, like the residue number, is denoted i. It will be noted that the two numbers are identical for all atoms except NH; generally there will be no confusion, because a single document will use either "residues" alone, or "peptide units" alone, but in the latter case explicit reference must be made to this usage at the beginning. If confusion might arise, the symbols N*ᵢ and H*ᵢ are to be used for these atoms in the ith peptide unit, Nᵢ and Hᵢ in the ith residue (so that N*ᵢ = Nᵢ₊₁). Example:

Notes: (i) Residue notation is used throughout these Rules. (ii) Whether "residues" or "peptide units" are being used, φᵢ and ψᵢ always refer to torsion angles about the same Cᵢᵅ.

1.4. Bond Lengths. If a bond A–B be denoted Aᵢ–Bⱼ or Aᵢ (see Rules 3.1, 4.5), the bond length is written b(Aᵢ, Bⱼ) [or b(Ai, Bj)], or bᵢᴬ [or bA(i)]. An abbreviated notation for use in side chains is indicated in Rule 4.5.

Note: The symbol previously recommended for bond length was l. This symbol is no longer recommended, partly because it is easily confused with 1 in many type fonts, and partly because it is also used for vibration amplitude in electron diffraction and spectroscopy.

1.5. Bond Angles. The bond angle included between three atoms Aᵢ–Bⱼ–Cₖ is written τ(Aᵢ, Bⱼ, Cₖ), which may be abbreviated, if there is no ambiguity, to τ(Bⱼ) or τⱼᴮ [or τB(j)].

1.6. Torsion Angles. If a system of four atoms A–B–C–D is projected onto a plane normal to bond B–C, the angle between the projection of A–B and the projection of C–D is described as the torsion angle* of A and D about bond B–C; this angle may also be described as the angle between the plane containing A, B, and C, and the plane containing B, C, and D. The torsion angle is written in full as θ(Aᵢ, Bⱼ, Cₖ, Dₗ), which may be abbreviated, if there is no ambiguity, to θ(Bⱼ, Cₖ), θ(Bⱼ), or θⱼᴮ, etc. In the eclipsed conformation in which the projections of A–B and C–D coincide, θ is given the value 0° (synplanar conformation). A torsion angle is considered positive (+θ) or negative (−θ) according as, when the system is viewed along the central bond in the direction B → C (or C → B), the bond to the front atom A (or D) must be rotated to the right or to the left, respectively, in order that it may eclipse the bond to the rear atom D (or A); note that it is immaterial whether the system be viewed from one end or the other. These relationships are illustrated in Figure 1.
*The terms dihedral angle and internal rotation angle are also used to describe this angle, and may be regarded as alternatives to torsion angle though the latter has been used throughout these Rules.
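The definition in Rule 1.6 can be checked numerically. The following is a minimal sketch, assuming Cartesian coordinates for the four atoms A, B, C, and D; the function name `torsion_angle` and the coordinate tuples are illustrative and are not part of the Rules. It uses the usual cross-product formula, which gives 0° for the eclipsed (synplanar) arrangement and a positive value when the front bond must be turned to the right to eclipse the rear bond.

```python
import math


def torsion_angle(a, b, c, d):
    """Torsion angle of A and D about bond B-C, in degrees, in (-180, 180]."""
    def sub(u, v):
        return tuple(x - y for x, y in zip(u, v))

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def cross(u, v):
        return (u[1] * v[2] - u[2] * v[1],
                u[2] * v[0] - u[0] * v[2],
                u[0] * v[1] - u[1] * v[0])

    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)   # bond vectors A->B, B->C, C->D
    n1, n2 = cross(b1, b2), cross(b2, b3)          # normals to planes ABC and BCD
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    angle = math.degrees(math.atan2(y, x))
    # atan2 can return -180 for a negative-zero numerator; fold it onto +180.
    return angle + 360.0 if angle <= -180.0 else angle


# Hypothetical coordinates: A on +x, bond B->C along +z, D on +y at the far end.
A, B, C, D = (1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)
theta = torsion_angle(A, B, C, D)
theta_reversed = torsion_angle(D, C, B, A)   # same system viewed from the other end
```

In this sketch both calls give the same sign and magnitude, which illustrates the remark that it is immaterial from which end the system is viewed.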


FIGURE 1. Newman and perspective projections illustrating positive and negative torsion angles. Note that a right-handed turn of the bond to the front atom about the central bond gives a positive value of θ from whichever end the system is viewed.

Notes: (i) Angles are measured in the range −180° < θ ≤ +180°, rather than from 0° to 360°, so that the relationship between enantiomeric configurations or conformations can be readily appreciated. (ii) The symbols actually used to describe the various torsion angles important in polypeptides are φ, ψ, ω, ν, and χ (see Rules 3.2, 4.5.2). In the above, θ is used simply as an illustrative generic symbol covering all these.
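Note (i) can be illustrated with a short sketch that folds an angle given on a 0° to 360° scale onto the signed range. It assumes that the range is half-open, with +180° included and −180° excluded; the helper names `normalize_angle` and `mirror_image` are hypothetical.

```python
def normalize_angle(theta):
    """Map an angle in degrees onto the interval (-180, 180]."""
    wrapped = theta % 360.0            # Python's % yields a value in [0, 360)
    return wrapped - 360.0 if wrapped > 180.0 else wrapped


def mirror_image(theta):
    """Torsion angle of the enantiomeric conformation: same magnitude, opposite sign."""
    return normalize_angle(-theta)


examples = [normalize_angle(t) for t in (300.0, 180.0, -180.0, 60.0)]
```

On the signed scale an enantiomeric pair differs only in sign (for example +60° and −60°), whereas on a 0° to 360° scale the same pair would read 60° and 300°.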

Rule 2. The Sequence Rule, and Choice of Torsion Angle

2.1. The Rules here enunciated for use in the field of synthetic polypeptides and proteins are in general harmony with the Sequence Rule of Cahn, Ingold, and Prelog,* with the exceptions of Rules 2.1.1 and 2.2.2 (cases II and III), and later Rules dependent upon these. The Sequence Rule was formulated as a universal and unambiguous means of designating the “handedness” or chirality of an element of asymmetry. It includes Subrules for the purpose of arranging atoms or groups in an order of precedence or preference, and this system may conveniently be used in the description of steric relationships across single bonds (see Klyne and Prelog).¹¹ Here its function is to determine the priority of precedence of different atoms or groups attached to the same atom. However, Rule 2.1.1 below overrides the precedences of the Sequence Subrules, providing a new “local” (specialist) system for use with the general Sequence Rule.† After application of Rule 2.1.1, the normal procedure of the Sequence Rule is applied, but modified by Rule 2.2.2; in this connection the only parts of the Sequence Rule required are given in Rules 2.1.2 to 2.1.5.

2.1.1. The main chain is given formal priority over branches, notwithstanding any conflict with the following rules. Thus the main chain has precedence at C^α over the side chain, and at C′ over O.

Note: This rule has not yet been formally accepted except in the present context.

*See Cahn, Ingold, and Prelog;⁷ IUPAC Tentative Rules for the Nomenclature of Organic Chemistry, Section E, in IUPAC Information Bulletin No. 35, pp. 36–80.¹⁰ Earlier papers: Cahn and Ingold;⁵ Cahn, Ingold, and Prelog.⁶ For a partial, simplified account see Cahn;⁴ Eliel.⁹
†Other local systems are available analogously for steroids, carbohydrates and cyclitols, where the Sequence Rule is applied when the local system does not suffice.


Handbook of Biochemistry and Molecular Biology

2.1.2. The order of (decreasing) priority is the order of (decreasing) atomic number. Example: In CHBrCl—CH3, where the central carbon carries Br, Cl, CH3, and H, the order of priority is Br, Cl, CH3, H.
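Rule 2.1.2 amounts to a sort on atomic number. A minimal sketch follows, assuming each ligand is described only by the element of the atom directly attached to the central atom; the dictionary layout and the function name `rank_ligands` are illustrative.

```python
# Atomic numbers for the elements used in the example (standard values).
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "Cl": 17, "Br": 35}


def rank_ligands(ligands):
    """Return ligand labels in order of decreasing priority (Rule 2.1.2 only).

    `ligands` maps a label to the element symbol of the directly attached atom.
    Ties between identical attached atoms are not resolved here.
    """
    return sorted(ligands, key=lambda label: ATOMIC_NUMBER[ligands[label]], reverse=True)


# The four ligands on the central carbon of the example in Rule 2.1.2.
order = rank_ligands({"H": "H", "CH3": "C", "Cl": "Cl", "Br": "Br"})
```

The sketch deliberately stops at the first atom of each ligand; ligands that tie at this stage need the further comparison described in Rule 2.1.3.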

2.1.3. If two atoms attached to the central atom are the same, the ligands attached to these two atoms are used to determine the priority: Examples: Cl

precedence over CyH3 because Cx is bonded to C, H, H, and Cy to H, H, H).

(iii) In CH3—CH2—CH(OH)—CH(CH3)2 the order is OH, CH(CH3)2, CH2—CH3, H.

2.1.4. A double bond is formally treated as though it were split. Thus >C=O is treated as >C—O, with a duplicate atom (O) attached to the carbon and a duplicate atom (C) attached to the oxygen. Example: In CH3—CO—OH the order is =O, —OH, CH3.
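Rules 2.1.3 and 2.1.4 can be sketched together by comparing, for ligands whose first atoms tie, the set of atoms attached to that first atom, with the duplicate atoms of a split double bond counted as ordinary substituents. The data layout below is an assumption made for illustration, and the sketch looks only one sphere beyond the first atom, whereas the full Sequence Rule continues outward until a decision is reached.

```python
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8}


def priority_key(ligand):
    """Sort key: atomic number of the first atom, then of its substituents.

    `ligand` is (first_atom, substituents); substituents exclude the central
    atom itself and include duplicate atoms from split double bonds.
    """
    first_atom, substituents = ligand
    return (
        ATOMIC_NUMBER[first_atom],
        tuple(sorted((ATOMIC_NUMBER[s] for s in substituents), reverse=True)),
    )


def rank(ligands):
    """Return ligand labels in order of decreasing priority."""
    return sorted(ligands, key=lambda name: priority_key(ligands[name]), reverse=True)


# Carboxyl carbon of CH3-CO-OH: the doubly bonded O carries a duplicate (C).
acetic_acid = {
    "=O": ("O", ["C"]),
    "-OH": ("O", ["H"]),
    "CH3": ("C", ["H", "H", "H"]),
}

# Central carbon of CH3-CH2-CH(OH)-CH(CH3)2, example (iii) of Rule 2.1.3.
example_iii = {
    "H": ("H", []),
    "CH2-CH3": ("C", ["C", "H", "H"]),
    "CH(CH3)2": ("C", ["C", "C", "H"]),
    "OH": ("O", ["H"]),
}

acetic_order = rank(acetic_acid)
example_iii_order = rank(example_iii)
```

Both results agree with the orders stated in the examples of Rules 2.1.3 (iii) and 2.1.4.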

2.1.5. If two ligands are distinguished only by having different masses (e.g., deuterium and hydrogen), the heavier takes precedence. # Example: D

Note: This rule is to be used only if the two previous rules do not give a decision.

2.2 Choice of Torsion Angle and Numbering of Branches (Tetrahedral Configurations)

2.2.1. If, in a compound (A, P, Q)B—C(D, E, F), the Sequence Rule gives the priorities A > P, Q and D > E > F, then the Principal Torsion Angle, θ, is that measured by reference to the atoms A—B—C—D as in Rule 1.6 above. The branches beginning at C are numbered 1 (C—D), 2 (C—E), and 3 (C—F).

2.2.2. If two branches are identical, and the third is different (or nonexistent), they are numbered in a clockwise sense when viewed in the direction B → C, as follows (see Figure 2).

FIGURE 2. Tetrahedral configurations. Case I: D > E = E. D has the highest priority and is given the smallest number (1). Case II: D = D > E. E has the lowest priority and is given the largest number (3). Case III: D = D, numbered 1 and 2 (E is nonexistent). In each case the Principal Torsion Angle is measured between A—B and Branch 1.

Notes: (i) The rule given in Case II differs from Conformational Selection Rule (b) of the Sequence Rule (see Cahn, Ingold, and Prelog, p. 406),⁷ according to which if an identity among the groups of a set leaves one group unique, the unique group is Fiducial. The reason for the difference is that the Sequence Rule would define Principal Torsion Angle in terms of a hydrogen atom whenever a single such atom formed part of the set; in the X-ray technique, nearly always used to establish structures of the type under discussion, hydrogen atoms are usually unobservable, and even at best not accurately locatable, so that the position of one used to define a Principal Torsion Angle could only be established by calculation based on (perhaps unjustified) assumptions about the bond angles concerned. These considerations apply with even more force to Case III, where one branch is nonexistent; the “phantom atom” of zero atomic number would be given highest priority because it is unique. (ii) In Case III the clockwise passage from CD1 to CD2 shall be by the shorter of the two possible routes.

2.2.3. If all three branches are identical, that giving the smallest positive or negative

value of the Principal Torsion Angle is normally* assigned the highest priority and the lowest number (1) (see Figure 3, IV, V); if two branches have torsion angles respectively +60° and −60°, the former is chosen (see Figure 3, VI). The others are numbered in a clockwise sense when viewed in the direction B → C.

Note: Rule 2.2.3 introduces a new principle, not invoked in 2.2.1 or 2.2.2, that the precedence depends on the conformation. This must necessarily be done since in this case the branches are distinguishable only in this respect. (The same applies to Rule 2.3.2 below.)

*The qualification “normally” is added to avoid the need to renumber the branches, if by chance the rule would demand this in consequence of a movement during refinement of a structure. In this or similar cases, the symbolism should remain unchanged.

2.3 Choice of Torsion Angle and Numbering of Branches (Planar Trigonal Configurations)

2.3.1. If, in a compound (A, P, Q)B—C(D, E) such that B, C, D, and E are coplanar, or nearly so, the Sequence Rule gives the priorities A > P, Q and D > E, then the Principal Torsion Angle is that measured by reference to atoms A—B—C—D as in Rule 1.6 above. The branches beginning at C are numbered 1 (C—D) and 2 (C—E).

2.3.2. If the branches are identical, that giving the smallest positive or negative value of

the Principal Torsion Angle is normally assigned the highest priority and the lowest number (1); if the two branches have torsion angles respectively +90° and −90°, the former is chosen (see Figure 4).

FIGURE 3. Tetrahedral configurations. Three identical branches: IV - general case, θ positive; V - general case, θ negative; VI - θ = +60°.

FIGURE 4. Planar trigonal configurations. Identical branches: VII - θ positive; VIII - θ negative; IX - θ = +90°.

Rule 3. The Main Chain (or Polypeptide Backbone)

3.1. Designation of Bonds. Bonds between main-chain atoms are denoted by the symbols of the two atoms terminating them, e.g., N_i—C^α_i, C^α_i—C_i, C_i—N_{i+1}, C_i—O_i, N_i—H_i. Abbreviated symbols should not be used. Bond lengths are written b(C_i, N_{i+1}), etc.

3.2. Torsion Angles

3.2.1. The Principal Torsion Angle describing rotation about N—C^α is denoted by φ, that describing rotation about C^α—C is denoted by ψ, and that describing rotation about C—N is denoted by ω. The symbols φ_i, ψ_i, and ω_i are used to denote torsion angles of bonds within the ith residue in the case of φ and ψ, and between the ith and (i + 1)th residues in the case of ω; specifically, φ_i refers to the torsion angle of the sequence of atoms C_{i−1}, N_i, C^α_i, C_i; ψ_i to the sequence N_i, C^α_i, C_i, N_{i+1}; and ω_i to the sequence C^α_i, C_i, N_{i+1}, C^α_{i+1} (see Figure 5). In accordance with Rules 1.6 and 2.1.1, these torsion angles are ascribed zero values for eclipsed conformation of the main-chain atoms N, C^α, and C, that is, for the so-called cis-conformations (see Table 1).

Notes: (i) This convention differs from that proposed by Edsall et al.⁸ The new designation of angles may be derived from the old by adding 180° to, or subtracting 180°

Table 1
MAIN-CHAIN TORSION ANGLES FOR VARIOUS CONFORMATIONS IN PEPTIDES OF L-AMINO ACIDS

Torsion angle   Rotation about N—C^α (φ)     Rotation about C^α—C (ψ)
0°              C^α—C trans to N—H           C^α—N trans to C—O
+60°            C^α—H cis to N—H             C^α—R cis to C—O
+120°           C^α—R trans to N—H           C^α—H trans to C—O
+180°           C^α—C cis to N—H             C^α—N cis to C—O
−120°           C^α—H trans to N—H           C^α—R trans to C—O
−60°            C^α—R cis to N—H             C^α—H cis to C—O

Notes: (i) trans to N_i—H_i is the same as cis to N_i—C_{i−1}; trans to C_i—O_i is the same as cis to C_i—N_{i+1} (see Figure 5). (ii) For the description of D-amino acids, interchange C^α—H and C^α—R in the Table.
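The atom sequences that define the main-chain torsion angles in Rule 3.2.1 can be written out for any residue number i. The sketch below only encodes those sequences; the labels "N", "CA", and "C" (for N, C^α, and the carbonyl carbon) and the function name are assumptions made for illustration.

```python
def main_chain_torsions(i):
    """Atom quadruples defining phi_i, psi_i, and omega_i for residue i.

    Each atom is given as (label, residue_number); "CA" stands for the
    alpha-carbon and "C" for the carbonyl carbon.
    """
    return {
        "phi":   [("C", i - 1), ("N", i), ("CA", i), ("C", i)],
        "psi":   [("N", i), ("CA", i), ("C", i), ("N", i + 1)],
        "omega": [("CA", i), ("C", i), ("N", i + 1), ("CA", i + 1)],
    }


torsions = main_chain_torsions(5)
```

Here φ and ψ for residue 5 share the central atom C^α of that residue, while ω spans the peptide bond between residues 5 and 6, in line with the statement that ω lies between the ith and (i + 1)th residues.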

FIGURE 5. Perspective drawing of a section of polypeptide chain representing two peptide units. The limits of a residue are indicated by dashed lines, and recommended notations for atoms and torsion angles are indicated. The chain is shown in a fully extended conformation (

PhNCS in textual material).
Not succinoyl.⁶


Comment

Many reagents used in peptide and protein chemistry for the modification (protection) of amino, carboxyl and side-chain groups in amino-acid residues have been designated by a variety of acronymic abbreviations, too numerous to be listed here. Extensive and indiscriminate use of such abbreviations is discouraged, especially where the accepted trivial name of a reagent is short enough, e.g., tosyl chloride, bromosuccinimide, trityl chloride, dansyl chloride, etc., or may be formulated in terms of the group transferred, e.g., N2ph-F instead of FDNB for 1-fluoro-2,4-dinitrobenzene, Dns-Cl or dansyl-Cl in place of DNS, Nbs2 in place of DTNB for 5,5'-dithio-bis(2-nitrobenzoic acid) (Ellman’s reagent), (PrⁱO)2PO-F, Prⁱ2P-F, iPr2P-F, or Dip-F instead of DFP for diisopropylfluorophosphate. Other commonly used substances that may be expressed more clearly in terms of symbols are MalNEt (instead of NEM) for N-ethylmaleimide, Tos-PheCH2Cl (instead of TPCK) for L-1-tosylamido-2-phenylethyl chloromethyl ketone, Tos-Arg-OMe (instead of TAME) for tosyl-L-arginine methyl ester, Me3Si- (instead of TMS-) for trimethylsilyl, CF3CO- (instead of TFA) for trifluoroacetyl (see 5.2), H4furan (instead of THF), etc. Some additional symbolic terms for substituents (and reagents), as examples, are

2-Aminoethyl-              -(CH2)2NH2 (preferred to Aet)
Carbamoylmethyl-           -CH2CONH2 (preferred to Cam)
Carboxymethyl-             -CH2CO2H (preferred to Cm)
Chloroethylamine           Cl(CH2)2NH2
Ethyleneimine              (CH2)2NH
Chloroacetamide            ClCH2CONH2
Chloroacetic acid          ClCH2CO2H
p-Carboxyphenylmercuri-    -HgBzOH
p-Chloromercuribenzoate    pCl-HgBzO−
Diazoacetyl-               N2CHCO-
Hydroxyethyl-              -(CH2)2OH
Ethylene oxide             (CH2)2O
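The replacements recommended in the Comment above can be collected in a small lookup table, for instance to screen a manuscript for discouraged acronyms. This is only a sketch: the dictionary name, the function name, and the choice of one preferred symbol where the text offers several alternatives are assumptions.

```python
# Discouraged acronym -> symbol recommended in the Comment above.
RECOMMENDED_SYMBOL = {
    "FDNB": "N2ph-F",
    "DNS": "Dns-Cl",
    "DTNB": "Nbs2",
    "DFP": "Dip-F",
    "NEM": "MalNEt",
    "TPCK": "Tos-PheCH2Cl",
    "TAME": "Tos-Arg-OMe",
    "TMS": "Me3Si-",
    "TFA": "CF3CO-",
    "THF": "H4furan",
}


def recommend(acronym):
    """Return the recommended symbol, or the input unchanged if none is listed."""
    return RECOMMENDED_SYMBOL.get(acronym.upper(), acronym)


suggestions = [recommend(a) for a in ("dfp", "NEM", "Boc")]
```

Abbreviations that the Comment does not mention, such as the last entry in the example, pass through unchanged.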

6. Polypeptides

6.1. Polypeptide Chains.⁵ Polypeptides may be dealt with in the same manner as substituted amino acids, e.g.,

Glycylglycine Af-Aoc

Ahx(A6 ,«M e)

A hx(A ?,7 Me)

Ahx(7Me)

Bz(4AM e); B z (4CH 2 N H 2

Ams

Ala(aMe)

Apr(0Im-2)

7Ahp(0OH, eMe)

eAhxc

Ala((3Et2)

acA 2 h x C A ^ M e ,)

Symbol

STRUCTURE AND SYMBOLS FOR SYNTHETIC AMINO ACIDS INCORPORATED INTO SYNTHETIC POLYPEPTIDES (continued)

STRUCTURE AND SYMBOLS FOR SYNTHETIC AMINO ACIDS INCORPORATED INTO SYNTHETIC POLYPEPTIDES (continued)

No.  Name/reference                                Symbol                 Structure
29   3-Amino-4-phenylbutyric acid                  βAbu(γPh)              [phenyl ring]—CH2CH(NH2)CH2COOH
30   2-Aminopimelic acid                           αApm                   HOOCCH(NH2)(CH2)4COOH
31   2-Amino-3-(2-pyridyl)propionic acid           Apr(βPrd-2)            [2-pyridyl ring]—CH2CH(NH2)COOH
32   2-Amino-3-(2-pyrimidyl)propionic acid         Apr(βPyr-2)ᶜ           [2-pyrimidyl ring]—CH2CH(NH2)COOH
33   2-Amino-6-(trimethylammonio)hexanoic acid     α,εA2hx(N^εMe3)        (CH3)3N(CH2)4CH(NH2)COOH
34   3-Aminotyrosine                               Tyr(3NH2)              [3-amino-4-hydroxyphenyl ring]—CH2CH(NH2)COOH
35   Aspartic α-methylamide                        Asp-NHMe               HOOCCH2CH(NH2)CONHCH3
36   Aspartic β-methylamide                        Asn(Me); Asp(NHMe)     CH3NHCOCH2CH(NH2)COOH
37   Aziridinecarboxylic acid                      Azr                    [three-membered NH ring]—COOH
38   Azetidine-2-carboxylic acid                   Azt                    [four-membered NH ring]—COOH

STRUCTURE AND SYMBOLS FOR SYNTHETIC AMINO ACIDS INCORPORATED INTO SYNTHETIC POLYPEPTIDES (continued)

No.  Name/reference                                                  Symbol                  Structure
39   Aziridinonecarboxylic acid (6)                                  Azro                    [aziridinone ring]—COOH
40   (α-Benzyl)phenylalanine (7)                                     Phe(αBzl)               [phenyl ring]—CH2C(CH2C6H5)(NH2)COOH
41   3-Benzyltyrosine (8)                                            Tyr(3Bzl)               [3-benzyl-4-hydroxyphenyl ring]—CH2CH(NH2)COOH
42   (4-Chloro)phenylalanine (5)                                     Phe(4Cl)                [4-chlorophenyl ring]—CH2CH(NH2)COOH
43   Citrullineᵇ                                                     Ctr                     H2NCONH(CH2)3CH(NH2)COOH
44   β-Cyanoalanine                                                  Ala(βCN)                NCCH2CH(NH2)COOH
45   N-(2-Cyanoethyl)glycine                                         (CNEt)Gly; CNEt-Gly     NCCH2CH2NHCH2COOH
46   β-(1,4-Cyclohexadienyl)alanine (9) (dihydrophenylalanine)       Ala(βcHxΔ); Phe(H2)     [1,4-cyclohexadienyl ring]—CH2CH(NH2)COOH
47   β-(Cyclohexyl)alanine (10, 20) (hexahydrophenylalanine)         Ala(βcHx); Phe(H6)      [cyclohexyl ring]—CH2CH(NH2)COOH
48   α-(Cyclohexyl)glycine                                           Gly(cHx)                [cyclohexyl ring]—CH(NH2)COOH

STRUCTURE AND SYMBOLS FOR SYNTHETIC AMINO ACIDS INCORPORATED INTO SYNTHETIC POLYPEPTIDES (continued)

No.  Name/reference                                                                          Symbol             Structure
49   β-(Cyclopentyl)alanine                                                                  Ala(βcPe)          [cyclopentyl ring]—CH2CH(NH2)COOH
50   α-(Cyclopentyl)glycine                                                                  Gly(cPe)           [cyclopentyl ring]—CH(NH2)COOH
51   2,4-Diaminobutyric acid                                                                 A2buᶜ              H2NCH2CH2CH(NH2)COOH
52   3,6-Diaminohexanoic acid (isolysine;ᵇ β-lysine)                                         βεA2hx; βLys       H2N(CH2)3CH(NH2)CH2COOH
53   2,6-Diamino-4-hexynoic acid (11)                                                        αεA2hx(Δ)          H2NCH2C≡CCH2CH(NH2)COOH
54   2,2'-Diaminopimelic acid                                                                A2pmᶜ              HOOCCH(NH2)(CH2)3CH(NH2)COOH
55   2,3-Diaminopropionic acid                                                               A2prᶜ              H2NCH2CH(NH2)COOH
56   3,4-Dihydroxy-(α-methyl)phenylalanine [β-(3,4-dihydroxyphenyl)-α-methylalanine]         Dopa(αMe)          [3,4-dihydroxyphenyl ring]—CH2C(CH3)(NH2)COOH
57   3,4-Dihydroxyphenylalanineᵇ                                                             Dopaᵇ              [3,4-dihydroxyphenyl ring]—CH2CH(NH2)COOH
58   (3,4-Dihydroxyphenyl)serine                                                             Dopa(βHO)          [3,4-dihydroxyphenyl ring]—CH(OH)CH(NH2)COOH

H2 NC(=NH)NHCH2CO—

H2 NC(=NH)NHCH2CH(NH2 )COOH

H2 NC(=NH)NH(CH2 )4 COOH

(CF3 )2CHCH(NH2 )COOH

H2 NC(=HN)NH(CH2 )4 CH(NH2 )COOH

68

69

70

71

1 1 Et

67

0

c h 3c h 2 n h c h 2c o o h

65

66

CH3CH2 SCH2CH2 CH(NH2)COOH

64

N -----T " CHa CH(NH2)COOH

(Ctf Hs )2C(NH2 )COOH

C(CH3 )(NHJ )COOH

63

HO

(CH3 )2CHCH(CH3 )CH(NHCH3)COOH

61

62

(CH3 )2C(SH)CH(NH2 )COOH

60

ch3

HOOCCH2CH(Mea N -0 )C 0 0 H

Structure

59

No.

Homoarginineb

y t -Hexafluorovaline [0,0-bis(trifluoromethyl)alanine]

5-Guanidinovaleric acid

0-Guanidinoalanine

Guanidinoacetyl (Af-amidinoglycyl; glycocy amine)

te/eEthylhistidine;b’c “ 1-Ethylhistidine” (14 ) (cf. 88, 89)

Ar-Ethylglycine

Ethionine^

a ,a-Diphenylgly cine

Harc

Val(7 F6)

Vlr(SGdn)

Ala(/3Gdn)

GdnAc-; AmdGly—

His(>Et)b vo


1 «

CM r-~

1

VO r-

rr-

Ov t-

«o t-~

VO

tr-

l

VO r-

00

ON

VO t—

00 r-

1

1

| oo

oo

(N 00

ed v

a

a u a

d ic a rb o x y lii

t

v > 5 3

u

* € «

ri

X

a

a m in o p ro p i

l

z

u

o ua z^ o u

a” z a

o v in e b r a in

i^ 11 i l l J1 IS

ed u

X

u

So

a*

oo

u

a th yru s sativus

8 g

- z - -u

trep to m y c e s

a th y ru s pu sillus

S o u rce

u

u

• , J 1 N n h 2

SH

i

H C = C - C H 2C H ( N H 2) C O O H 1 1 N ^ /N H

H C ------- N ' II H Cs — C H ( N H 2) C H 2C O O H

h

S e C H 2 C H (N H 2 )C O O H

H 3 C S e (C H 2 ) 2 C H (N H 2 )C O O H

Escherichia coli

( h y d r o ly z a te )

C H 2 —S e —S e —C H 2 1 1 2 c h n h 2 1 1 COOH COOH

c h n h

H O O C C H (N H 2 )C H 2C H 2 S eC H 2C H (N H 2 )C O O H

S tr u c tu r e

Astragalus p e ctin a tus

S ta n leya pinnata

S o u rc e

258

s u lf o n ic a d d )

(2 -a m in o -e th a n e -

S e le n o c y s ta th io n in e

(sy n o n y m )

A m in o a c id

254

N o.

2h , n o 3s ( 1 2 5 .1 5 )

9 h 14n 2o 4 s ( 2 4 6 .3 2 )

6 h 9 n 2o 2s ( 1 7 3 .2 3 )

C9H m N 0 6 S ( 2 6 1 .2 6 )

c

c

C 4 H s n 20 2s ( 1 7 2 .2 2 )

c

( 1 6 7 .0 3 )

C 3 H 6 N 0 2 Se

C s H j , N 0 2 Se ( 1 9 6 .1 3 )

C6H l 2 N 20 4 Se2 ( 3 3 4 .1 1 )

C , H , 4 N 20 4 Se (2 6 9 .0 7 )

Formula (mol wt)

-

-

-

1 9 7 .5 2 0 1 .5 ° (3 3 7 )

(2 9 0 )

3 2 0 ° (d e c )

-

-

2 6 3 -2 6 5 ° (3 3 4 )

-

Melting point °c a

-

-

-

-

Db

-

- 4 2 5 (c 1, 1 N a c e tic a c id ) ( 1 0 2 )

(3 3 9 )

- 1 0 25 (c 2, 1 N HC1)

H

DATA ON THE NATURALLY OCCURRING AMINO ACIDS (continued)

-

-

(2 9 0 )

8 .4 7 11 .4

1.8 4

"

9 .0 6 (2 9 0 )

- 0 .3

-

-

-

PKa

341

102

338

337

3 35

-

334a

334

327

342

10 2

337

292

33 5

-

334a

327

-

10 2

339

337

-

335

-

3 34

327



-

1 02

337

-

-

-

'

-

References Isolation and puri­ Chroma­ Chem­ Spectral data fication tography ' istry


L. L-Halogen-Containing Amino Acids

No.    Amino acid (synonym)                        Source                      Formula (mol wt)          Structure
262    2-Amino-4,4-dichlorobutyric acid            Streptomyces armentosus     C4H7Cl2NO2 (172.02)       Cl2CHCH2CH(NH2)COOH
262a   5-Chloropiperidazine-3-carboxylic acid      Monamycin hydrolyzate       C5H9ClN2O2 (164.61)       [chlorinated six-membered ring containing N—NH, bearing —COOH]
263    3,5-Dibromotyrosine                         Gorgona species             C9H9NO3Br2 (339.01)       [3,5-dibromo-4-hydroxyphenyl ring]—CH2CH(NH2)COOH
263a   2,4-Diiodohistidine                         Human urine                 C6H7I2N3O2 (406.96)       [2,4-diiodoimidazole ring]—CH2CH(NH2)COOH
264    3,3'-Diiodothyronine                        Bovine thyroid gland        C15H13I2NO4 (525.11)      [iodinated hydroxyphenoxy-phenyl rings]—CH2CH(NH2)COOH
265    3,5-Diiodotyrosine (iodogorgoic acid)       Coral protein               C9H9I2NO3 (433.01)        [3,5-diiodo-4-hydroxyphenyl ring]—CH2CH(NH2)COOH

(2 9 0 )

1 9 4 ° (d e c )

(d e c ) (3 4 5 )

2 3 3 -2 3 4 °

-

245° (3 4 4 )

(2 0 4 a)

8 3 -8 5 ° (D N P d e riv )

°ca

M e ltip g p o in t Db

( 1. 1 TV H C 1X 1)

+ 2 .9

-

-

-

+ 1 5 7 33 (c 0 .1 8 , CHC13 ) (2 0 4 b )

(6 4 )

(c 0 .7 4 , H 20 ) + 2 6 .2 2 5 (c 0 .7 4 , 1 N HC1)

+ 6 .7 2 5

w

DATA ON THE NATURALLY OCCURRING AMINO ACIDS (continued)

(2 9 0 )

(O H ) 7 .8 2

6 .4 8

2. 1 2

-

-

6 .4 5 (O H ) 7 .6 0 ( 20 )

2 .1 7

P*a

346

345

344a

344

204 b

64

f ic a tio n

Is o la tio n a n d p u r i­

347

345

344a

-

204b

64

C h ro m a ­ to g r a p h y

348

3 45

-

344

204b

204a,

64

d a ta

is tr y

64

S p e c tra l

C hem -

R e fe re n c e s


R a t th y r o id g la n d

5 -M o n o c h lo r o ty r o s in e

2 -M o n o io d o h is tid in e

M o n o io d o ty r o sin e

T h y r o x in e

S ^ ^ ’- T riio d o -

266b

267

268

269

270

th y r o n in e

B uccin u m u n d a tu m

3 -M o n o b ro m o -5 m o n o c h lo r o ty r o s in e

266a

Phaseolus vulgaris

T h y r o id g la n d

(a n a lg a)

N ereocystis luetkeana

B u ccin u m u n d a tu m

sponges

S e a fa n s a n d

Source

3 - M o n o b r o m o ty r o s in e

Amino acid (synonym)

266

No.

HC

I

C H a C H (N H 2

- C H 2 C H ( N H 2) C O O H

H O -/

\ - 0 - /

I

V - C H 2CH(NH2)COOH

I

k—^

I

\ - C H 2 C H ( N H 2) C O O H

I

- /

\—^

I

\ - 0

I

J--- y

1

^ - C H 2C H ( N H 2) C O O H

J------ y

H O -/

ycoon

' V - C H 2 — C H (N H 2 )C O O H

\—%

1 I

=C

H O —/

HN'

Cl

f,

Cl

//

1

Br ____

H O —/ '

HO —

\ - C H 2C H ( N H 2) C O O H

\= /

H O -/

J-— ^

Br

Structure

( 6 5 0 .9 8 )

C , SH , 2 I 3 N 0 4

( 7 7 6 .8 8 )

C , 5 H , , I4 N 0 4

( 3 0 7 .1 1 )

C 9H 1 0 N O 3

C 6H 8IN 3 0 2 ( 2 8 1 .0 2 )

( 2 1 5 .6 5 )

C 9 H , 0 C lN O 3

C 9 H 9C l N 0 3 B r ( 2 9 4 .5 5 )

( 2 6 0 .1 0 )

C9H 1 0 N O 3 B r

Formula (mol wt)

N

E tO H ) (2 9 0 )

HC1(2 9 0 )

+ 2 3 .6 2 4

(2 9 0 )

10.1

(O H )

+ 1 5 (c 5 , 1 HC1,

(c5,\N

'

-

w Db

(d e c )

2 3 3 -2 3 4 °

95% E tO H ) (2 9 0 )

2 3 6 ° (d e c ) (2 9 0 )

-

Melting point oc a

DATA ON THE NATURALLY OCCURRING AMINO ACIDS (continued)

(2 9 0 )

10.1

8 .4 0 (O H )

2.2

6 .4 5

2.2

-

pKa

355

352

351

350

349b

349a

349

3 47

353

351

350

349b

349a

349

3 48

-

348

349b

349a

-

References Isolation and puri­ Chroma- Chem- Spectral fication tography istry data


C ilia tin e

272

l-H y d r o x y - 2 - a m in o e t h y lp h o s p h o n ic a c id

L o m b ric in e ( 2 -a m in o - 2 - c a rb o x y -

273a

274

S ea an e m o n e

O - P h o s p h o s e rin e

2 -T rim e th y l-

2 77

278

a m in o e th y l- b e ta in e p h o s p h o n ic a c id

C a se in

0 - P h o s p h o h o m o s e r in e

276

L a ctobacillus

2 -M e th y la m in o e th y lp h o s p h o n ic a c id

S ea an e m o n e

E a r th w o rm

(p la sm a m e m b ra n e )

3

o

2 0 3 P O C H 2 C H (N H 2 )C O O H

O -

II ( C H 3) 3 + N C H 2C H 2I^ O H

h

H 2 0 3 P O C H j C H 2 C H (N H 2 )C O O H

H 20 3 P 0 C H 2C H 2C H ( N H 2 ) C 0 0 H

c h

1

H N C H 2C H 2 P 0 3 H 2

NH O II II H 2N C N H ( C H 2 ) 2O P O C H 2C H ( N H 2) C O O H 1 OH

o

3p

( 2 7 0 .2 1 )

6h 1 s n 4o 6 p

2h 8n o 4 p ( 1 4 1 .0 8 )

( 1 5 3 .1 3 )

4h 1 2 n

2 h 8n o 3 p ( 1 2 5 .0 7 )

c

o

( 1 6 7 .1 6 )

5h 1 4 n

3p

C 3H 8N 0 6 P ( 1 8 5 .0 8 )

C 4H 1 0 N O 6 P ( 1 9 9 .1 1 )

C 3 H 1 0 N O 3P ( 1 3 9 .1 0 )

c

c

c

H 2N -C H 2- C H (0 H )P 0 3h 2

2c h 2p o 3h 2

A ca n th a m o eb a castellanii

ch

C 2H 8N O s P ( 1 5 7 .0 7 )

c

2n

(m o l w t)

F o r m u la

M. L-Phosphorus-Containing Amino Acids

(C H 3 ) 2 N C H 2 C H 2 P 0 3 H 2

h

H 2 0 , P C H 2 C H (N H 2 )C O O H

S tr u c tu r e

S ea an e m o n e

Sea an e m o n e

T etrahym ena p yrifo rm is, Z o a n th u s sociatus

S o u rc e

275

p h o s p h a te )

e t h y l- 2 -g u a n id in e e t h y l h y d ro g e n

2 - D im e th y la m in o e th y lp h o s p h o n ic a c id

273

p h o s p h o n ic a c id )

( 2 - a m in o e th y l

a-A m ino-/3p h o s p h o n o p r o p io n ic a c id

(sy n o n y m )

A m in o ac id

271

N o.

2 5 0 -2 5 2 ° (3 6 0 )

-

178° (d e c ) (3 6 4 )

(3 6 0 )

291°

(3 6 1 )

2 2 3 -2 2 4 °

(3 6 0 )

2 4 9 .5 °

(3 5 9 )

2 8 0 -2 8 1 ° (d e c )

2 2 8 ° (d e c ) (3 5 7 )

M e ltin g p o in t oc a

+ 7 .2 ( H 2 0 ) (3 6 6 )

H 20 ) (3 6 4 )

+ 6 .2 5 2 2 -5 (c 2 .4 ,

(3 6 2 )

(c 0 .9 3 , H 20 )

+ 1 4 .5 2 3 5

-

-

MDb

DATA ON THE NATURALLY OCCURRING AMINO ACIDS (continued)

-

8 .9 ( 20 )

-

(3 5 9 )

6 .4

(3 5 7 )

8.8 1 1. 0

4 .5

2.2

pK a

Is o la tio n

360

365

364

360

361

360a

360

359

358

a n d p u r i­ f ic a tio n

360

366

364

360

361

360a

360

359

3 58

C h ro m a­ to g r a p h y

360

366

364

360

362

360

35 9

357

C hem ­ is try

R e fe re n c e s

360

-

360

362

360a

360

359

S p e c tra l d a ta


H e rc y n in ( h is tid in e b e ta in e )

H o m o b e ta in e ( 0-a la n in e b e ta in e )

285

286

M u s h ro o m s

E rg o t

E r g o th io n in e (b e ta in e o f th io l

284

h is tid in e )

B o v in e e la s tin

D e s m o s in e

b e ta in e )

( 7 -a m in o - 0h y d r o x y b u ty r i c a c id

C a r n itin e

b e ta in e )

V e r te b r a te m u s c le

R a t b r a in

7 - B u ty r o b e ta in e ( 7 - a m in o b u ty r ic ac id

b e ta in e )

Sta ch ys (.B eto nica ) officinalis

T o b a c c o leav es

S ource

B e to n ic in e (4 - h y d r o x y p r o lin e

c a rb o x y - p r o p y l) - 0c a rb o x y p y rid in iu m b e ta in e

iV -(3 -A m ino -3 -

A m ino acid (sy n o n y m )

283

282

281

280

279

N o.

1

c o o ‘

n h

3+

C H 2C H 2C H C O C T

l

= c c h

= c c h

h c o o

h c o o

*

-

N ( C H 3)3

2c

tW

F orm ula (m ol w t)

2 4 h 3 9 n 5o , ( 8 7 9 .4 3 )

7h 1 5 n o 3 ( 1 6 1 .2 1 )

7h 1 5 n o 2 ( 1 4 4 .2 0 )

7h 13n o 3 ( 1 5 9 .1 9 )

c

6 h 13n o 2 ( 1 3 1 .1 8 )

C 9 H 1 SN 3 0 2 ( 2 0 3 .2 2 )

C 9 H 1 SN 3 0 2 ( 2 2 9 .2 9 )

c

c

c

c

C l 0H l 4 N 2 O 5 ( 2 4 2 .2 4 )

N. L-Betaines

C H 2C H ^ H C O O -

C H ( N H 2) C O O -

1

Y

( C H 2)4

N ( C H ,) ,

2c

^

^

(C H 3 ) 3 f i c H 2 C H 2 C O O -

H

N ^C ^N

h c

N^ C xN H

h c

n h >*

O O C H C H j C H 2C

C H 2 ( C H 2) 2C H C O O

(C H , )3 N *C H 2 C H (O H )C H j C O O '

(CHj) , N C H a C H , C H , C O O "

( C H 3)2

|

H 2C ^ ♦ . C H C O O N

H O H C ------ C H 2

O

S tru ctu re

-

290° (3 4 0 )

(3 7 1 )

1 9 5 -1 9 7 °

(3 7 0 )

1 8 0 -1 8 4 °

(d e c ) (3 6 9 )

2 4 3 -2 4 4 °

2 4 1 -2 4 3 ° (d e c ) (3 6 8 )

M elting p o in t oc a

(2 9 0 )

-

(c l , H a O )

+ 1 1 5 2 7 *5

(3 7 4 )

(c 5, H 2 0 )

+ 2 3 .5 ”

-

-3 6 .6 1 5 (3 6 9 )

H 20 ) (3 6 8 )

+ 2 4 24 (c 2 ,

w Db

DATA ON THE NATURALLY OCCURRING AMINO ACIDS (continued)

-

-

-

(3 7 5 )

8 .8 0 9 .8 5

2 .4 0

1 .7 0

pK a

324

332

336

375

372

370

369

368

Isolation and p u ri­ ficatio n

340

373

370

368

-

-

C h ro m a­ tog rap h y

-

3 33

340

375

257,

371

367,

-

369

368

C h em ­ istry

R eferences

340

-

-

-

375

368

S pectral d a ta


L a m in in e (a -a m in o -c -trim e th y la m in o a d ip ic a c id )

L y c in (g ly c in e b e ta in e )

M io k in in e ( o r n ith in e b e ta in e )

N ic o tia n in e ( A ^ - a m in o - S -

291

292

293

294

c a rb o x y p ro p y l0 -c a r b o x y p y r id in iu m b e ta in e )

I s o d e s m o s in e

290

T o b a c c o le av es

H u m a n s k e le ta l m u s c le

L y c iu m barbarum

Lam inaria angustata

B o v in e e la s tin

E rythrin a sub um b ra ns

virgata

(3 -h y d r o x y p r o lin e b e ta in e )

H y p a p h o r in ( t r y p t o p h a n b e ta in e )

C ourbonia

A lfa lfa

S o u rc e

3 - H y d r o x y s ta c h y d r in e

(p ip e c o lic a c id b e ta in e )

H o m o s ta c h y d r in e

(sy n o n y m )

A m in o a c id

289

288

287

N o.

H2

1

( C H i) «

L

" '



1

nh3

1 C H 2C H 2C H C O C r

J

( C H 3) 3N + ( C H 2) 3C H C O O 1 N * ( C H 3) 3

(C H 3 ) 3 b T C H , C O O

N H j’

2)2C H C O O “

W

i C H j) 3 C H C O ° ”

H

H 2N C H C O O -

T

(C H 3 ) 3 b T (C H 2 ) 4 C H (N H 2 )C O O

NH

" O O C C H ( C H 2)2 _ ^ \ , ( C

I

--------- :t- C H 2C H C O O 1 M C H ,)3

I

H C + CHCOO" 2 ''N " ' 1 ( C H 3)2

H 2C -------C

( C H 3) 2

H C ^ C v CH 21 | 2 H 2C ^ C H C O O N

S tr u c t u r e

F o r m u la

C 1 0 H , 4 N 2O s

C i i H 2 S N 20 2 ( 2 1 7 .3 3 )

C 5H n N 0 5 ( 1 1 8 .1 6 )

C 9 H 2 0 N 2O 4 ( 2 2 0 .2 9 )

( 8 7 9 .4 3 )

Cj 4H39NsO ,

C 14H 1 8 N 20 2 ( 2 4 6 .2 9 )

( 1 5 9 .2 0 )

C 7H , 3N 0 3

( 1 5 6 .2 1 )

C8H, 4 N 0 2

( m o l w t)

2 4 1 -2 4 3 ° (d e c ) (1 5 6 )

(3 0 7 )

2 1 0 -2 1 2 °

+ 2 8 .4 24 (c 5 , H 2 O ) (1 5 6 )

(3 0 7 )

(c 2 .9 , H a O )

+ 1 0 30

-

15 6

232

234

245

375

58

307

315

f ic a tio n

pK a

°ca

[b 0.4/HCONMe2 0.9/Me2SO l/HCONMe 2 l/HCONMe 2 0.4/HCONMe2

0.9/HCONMe2 0.7/HCONMe2 0.8/HCONMe2 0.5-0.6/HCONM e 2 0.6-0.7/HCONM e 2 l/HCONMe 2

-OPcp

-ONSu

-OPhOH

-OTcp(2,4,5)

-OTcp(2,4,5)

-ONSu

-ONSu

-OTcp(2,4,5)

-OTcp(2,4,5)

-OTcp(2,4,5)

-OPcp -OTcp(2,4,5) -OTcp(2,4,5)

-OTcp(2,4,5) -NHNHj/Bu1NO,

Lys-Ala-Gly

Lys-Ala2

Lys-Ala 2

Lys(Cbz)-Ala2

Lys(Cbz)-Ala2

Lys-Ala-Glu

Lys-Ala-Glu

[Lys(Cbz)] 2-Gly

[Lys(Cbz)] 2-Gly

[ Lys(Cbz) ] 2 -Ala

Lys2-Pro Lys2-Pro Lys-Arg-Gly

Lys-Arg-Ala Lys-Glu-Ala

301b

302

303

304

305

306

307

308

309

310

310a 310b 311

312 313

/benzene

0.9/HCONMe2 0.03/benzene: HCONMe2 (6 :1)

0 .1

0.9/HCONMe2 l/HCONMe2a

-OPcp -ONSu

Lys-Gly-Ala Lys-Ala-Gly

300 301a

E t 3 N/2.5 MeMorph/1.1

Et3N/1.25

E tjN /1.3

Et 3 N/4

E t 3 N/2.5 (Pr‘ ) 2 EtN/1

Et 3 N/2

-ONSu

Et 3 N /l Et 3 N /-

His-Ala-Glub

_ -/HCONMe 2 : Me2SO l/HCONMe2a

299

-ONp -OTcp(2,4,5)

Agent

Base/ equivalents

His-Gly2 His(Cbz)-Gly2

Poly

Concentration (M)/solvent

297 298

Entry No.

Polymerizing conditions

5 days/20° 2 weeks/ room temp.

7 days/ room temp. 5 days/20° 4 days/ room temp. 28 h/ room temp.b 7 days/ room temp. 4 days/ room temp. 1 0 days/ room temp. 1 0 days/ room temp. 6 days/ room temp. 6 days/ room temp. 1 2 days/ room temp. 1 2 days/ room temp. 1 0 days/ room temp. 1 0 d a y s/1 0 d ay s/5 days/20°

_ 5 - 7 days/

Time/ temperature

SEQUENTIAL POLYPEPTIDES (continued)

37

66

3 0 -6 0 3 0 -6 0 95

33

46

52

100

100

70

81

40

88.5

57

55 72

88

/V.S.

1 4 /3 /21/V.S. and 19/1R 15/V.S. and IR 9/[rj]

10/V.S.

9/V.S.

10/V.S.

Mostly low/Gj q

Mostly low/G S 0

8

25 n and 41fw/ P-100 22/V.S.

48/CM-CeU.

143/G-25, 119/jt 51/ jt

-

30/Vel., Eq . 7



120 12 1

120

— -

35a 35a -

-

10 0



10 0

10 0

, 102

, 102

, 102

, 102

, 102

119

119

10 0

10 0









117 , 118

116

96 115a 115b

99.3b/H+ 100/H’'

>99.5 ala/G.C.b —

114

32 70

Ref.

100/H+

_ —

— —



Optical purity (%)



Yield (%)

Degree of polymerization/ method

Product


Asp(OMe)-Gly2

Asp(OMe)-Gly2

Asp(OMe)-Gly2

Asp(OMe)-Gly2

Asp(OMe)-Gly2 Asp(OMe)-Gly2

Asp(OMe)-Gly2

Asp(OMe)-Gly2

Asp(OMe)-Gly2

A§?Gly-Glyb Asp-Ser(Ac)-Gly Asp(OMe)-Ser-Gly

Asp(OMe)-Ser-Gly

Asp(OMe)-Ser-(Ac)-Gly D CC /1-2 Asp-Ser-Gly -ONpb Asp-Cys-Gly -OPcp

Asp(Cys-Gly)-OH

316

317

318

319

320 321

322

323

324

325 326 327

328

329 330 331

332

-OPcp

-ONp

-ONp -ONp -ONp

-ONp

-ONp

-ONp

-ONp -ONp

-ONp

-ONp

-ONp

l/HCONMe2

Et3N/2

48 hi room temp. 48 h/ room temp.

E t3N/2

10 days/— 10 d a y s/Overnight/ room temp. Overnight/ room temp. Overnight/ room temp. Overnight/ room temp. Overnight/ room temp. Overnight/ room temp. Overnight/ room temp. Overnight/ room temp. Overnight/ room temp. Overnight/ room temp. 3 days 50 h/ room temp. 4 days/ room temp.

Time/ temperature

NA

E t3N/1.25

1.2-0.6/M e2SOa 0 .1 2 -0 .1 8/HCONMe2 l/HCONMe2

E t3N/1.5 E t3N /NaONp/1

Et3 N /l

E tjN /l

Et3 N /l

E tjN /l Et3N /l

E tjN /l

E tjN /l

E tjN /l

Et3 N /1

E tjN /l

E tjN /l E tjN /l E tjN /l

Base/ equivalents

0.9/Me2 SO —/Me2 SO l/M e2SO

1.78/OP(NMe2 )3

1.78/MePdn

0.26/HCONMe2

-/M e 2SO l.l/HCONM e2

0.66/Me2 SO

0.83/Me2SO

0.9/Me2SO

1.55/Me2SO

1.8/Me2SO

0.6-0.7/HCONM e2 0.7-0.9/HCONM e, 2.4/Me2 SO

Concentration (M)l solvent

73b

39b

-

-

84

100 75

50

65

25/Eq.b

13/Dnp; 21/Vel. 22/Eq.b

21/Dnp; 18/Vel.

24/Dnp 18/Vel. 33/Dnp; 37/Vel.

17/[r,]

33/[rj]

8/[r)l

15/N 14/[r,]

60 30

21/[rj]

3 3 /h ]

40/Dnp

26/In]

2 3 /h l

1 4 /6 1 /30 /[»,]

40

45



70

65

3 0 -6 0 3 0 -6 0 —

Yield (%)

Degree of polymerization/ method

Product





< 100/[a]

100/[a]

100/ [or]

122

25 25 122

25

14 25 25

14

14

100/H+ 100/H+

14

100/H+

100/H+

32 14

14

io o /tr -

14

100/H*

100/H+

14

14

100/H+

35a 35a 14

Ref.

14

100/H+

_

Optical purity (%)

+ o’ o

-ONp

-ONp

Asp(OMe)-Gly2

315

-OPcp -OTcp(2,4,5) -ONp

Agent

Lys-Pro2 Lys-Pro2 Asp(OMe)-Gly2

Poly

3i3 a 313b 314

Entry No.

Polymerizing conditions

SEQUENTIAL POLYPEPTIDES (continued)


E t3N /l Et3N /l E tjN /1.4 E tjN /1.5

0.6/H4 furana 0.8/dioxanea 2.5 - 5/HCONMe j 2.5/HCONMe2

-/HCONMe2 or Me2SO 0.7/Me2 SO

-ONSu

-ONSu

-ONp -ONp

-OPcp -ONp -ONpb

-OTcp(2,4,6)

-OPcp

Asp2-DGlu

Asp-DGlu-Glu

Glu(OEt)-Gly2 Glu(OEt)-Gly2

Glu-Gly-Ala Glu-Gly-Ala Glu (OBzl)-Gly-Ala

335

336

337 338

339 340 341

342

22 h/ room temp.

E t3 N/2

E t3 N/2 Et3N /l E t3N /lb Et 3N/1.3

0.75/Me2 SO 0.23/Me2SO l/M e2SO 0.7/HCONMe2 0.5/Me2SOa 1.2/HCONMe2

-OPhOH

-OPhOH

-OPcp

-ONp

-ONp

-ONp

Glu-Ala2

Ala-Glu-Ala

Glu-Ala-Glu

Glu-Ala-Glu

Glu-Ser-Gly

Glu(OMe)-Ser(Ac)Glu(OMe)

347

348

349

350

351

352

E t3N /l

15-38/D np, Vel. and Eq. 15-20/O N p

45 100

61/Dnp

44/Dnp, 76/Vel.

-aminobenzoylglycy 1histidylglycyl-/>-aminobenzoyl-Eaminocaproyl] Glutamylaspartylisoleucylalanylmethionyl glutamyl lysine [Ile5|-angiotensin II [P-Asp1, lie5[-angiotensin II lAsn1,lie5[-angiotensin II [Ala3,lie5[-angiotensin II Des-Arg1- [Thr 6[-bradykinin Des-Arg1- [Phe6[-bradykinin Des-Arg1- [Ly s(T os)6[-bradykinin Des-Arg l- [D-Pro2[-bradykinin Des-Arg *-[D-Pro3[-bradykinin Des-Arg1- [D-Pro 7j-brady kinin Des-Arg*- [D-Arg9[-bradykinin Des-Arg ^[Ty^M e)5 8[-bradykinin Des-Arg^(Ty^M e)5,Thr6,Leu8[-bradykinin Des-Arg *- [Tyr(Me)5,8,Gly6|-bradykinin Des-Arg^jD-Pro2,3,7,Tyr(Me)5,8[bradykinin

cyclo-[Gly-Pab-Gly-His-Gly-Pab-Acap-]

H-Gly-Ala-Phe-Val-Gly-Leu-Met-NH2 H-Lys-Glu-Leu-Gly-Tyr-Gln-Gly-OH H-Glu-Thr-Leu-Asp-Ala-Thr-Arg-OH H-Met-Glu-His-Phe-Arg-Trp-Gly-OH H-Arg(N02)-Val-Tyr-Ile-His(Bzl)-Pro-Phe-OH H-DL-Gly-Phe-Leu-Phe-Leu-Gly-Phe-OH H-Val-Leu-Ser-Pro-Ala-Asp-Lys-OH H-Glu-Arg-Val-Glu-Trp-Leu-Arg-OH

H-Glu-Asp-lle-Ala-Met-Glu-Lys-OH H-Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-OH H-)5-Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-OH H-Asn-Arg-Val-Tyr-Ile-His-Pro-Phe-OH H-Asp-Arg-Ala-Tyr-Ile-His-Pro-Phe-OH H-Pro-Pro-Gly-Phe-Thr-Pro-Phe-Arg-OH H-Pro-Pro-Gly-Phe-Phe-Pro-Phe-Arg-OH H-Pro-Pro-Gly-Phe-Lys(Tos)-Pro-Phe-Arg-OH H-D-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg-OH H-Pro-D-Pro-Gly-Phe-Ser-Pro-Phe-Arg-OH H-Pro-Pro-Gly-Phe-Ser-D-Pro-Phe-Arg-OH H-Pro-Pro-Gly-Phe-Ser-Pro-Phe-D-Arg-OH H-Pro-Pro-Gly-T yr( Me)-Ser-Pro-T yr( Me)-Arg-O H H-Pro-Pro-Gly-Tyr(Me)-Thr-Pro-Leu-Arg-OH H-Pro-Pro-Gly-Tyr(Me)-Gly-Pro-Tyr(Me)-Arg-OH H-D-Pro-D-Pro-Gly-Tyr(Me)-Gly-D-Pro-Tyr(Me)Arg-OH


PEPTIDES PREPARED BY SOLID PHASE PEPTIDE SYNTHESIS (continued)

Entry number  Reference  Solid support    Type of bond  Monomer protection  Coupling reagent  Deprotection   Cleavage
109           5          S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       NaOH
110           14         S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HF
111           15         HOCH2-S-DVB      Bzl           Boc                 ONp               HCl-HOAc       NH3-MeOH
112           38         S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HBr-TFA
113           88         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
114           93         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HF
115           134        S-DVB-2%         Bzl           Boc                 ONp, OSu          HCl-Diox.      NH2NH2-DMF
116           69         Phenol-trioxane  Phenyl        Aoc                 DCC, ONp          HCl-MeOH       NH3-MeOH
117           69         Phenol-trioxane  Phenyl        Aoc                 DCC, ONp          HCl-MeOH       NH3-MeOH
118           69         Phenol-trioxane  Phenyl        Aoc                 DCC, ONp          HCl-MeOH       NH3-MeOH
119           69         Phenol-trioxane  Phenyl        Aoc                 DCC, ONp          HCl-MeOH       NH3-MeOH
120           93         S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HBr-TFA
121           64         S-DVB-2%         Phenacyl      Boc                 OSu               TFA            STP
122           135        S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HBr-TFA
123           135        S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HBr-TFA

124

94

S-DVB-2%

Bzl

Boc

DCC

HCl-HOAc

nh

125

98

S-DVB-2%

Bzl

Boc

DCC

HCl-HOAc

OH-AER-MeOH

126

69

S-DVB-2%

Bzl

Boc

HCl-Diox.

HBr-TFA

127 128 129

62 85 109

S-DVB-2% S-DVB-2% S-DVB-2%

Bzl Bzl Bzl

Boc Boc Boc

NEPIS, DCC-HoSu OSu DCC DCC

HCl-HOAc HCl-HOAc HCl-HOAc

HBr-TFA NH2N H 2-DMF HBr-TFA

130 131 132 133 134 135

16 89 38 57 7 46

S-DVB-2% S-DVB-2% S-DVB-2% S-DVB-2% S-DVB-2% Phenol-CH20

Bzl Bzl Bzl Bzl Bzl Bzl

Boc Boc Boc Boc Boc Boc

DCC DCC,ONp DCC DCC DCC DCC

HCl-HOAc HCl-HOAc HCl-HOAc HCl-Diox. HCl-HOAc HCl-HOAc

HBr-TFA HBr-TFA HBr-TFA HBr-TFA HBr-TFA HBr-TFA

136 137

99 121

S-DVB-2% S-DVB-2%

Bzl Bzl

Boc DCC Boc,Bpoc DCC

HBr-TFA HF

138

117

S-DVB-2%

Bzl

Boc

DCC

HCl-HOAc HCl-Diox.,15 % TFA/CH2C12 HCl-HOAc

HBr-TFA

139

109

S-DVB-2%

Bzl

Boc

DCC

HCl-HOAc

HBr-TFA

Entry number  Reference  Solid support    Type of bond  Monomer protection  Coupling reagent  Deprotection   Cleavage
140           7          S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HBr-TFA
141           7          S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HBr-TFA
142           7          S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HBr-TFA
143           17         S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HBr-TFA
144           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
145           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
146           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
147           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
148           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
149           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
150           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
151           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
152           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
153           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
154           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA

3

PEPTIDES PREPARED BY SOLID PHASE PEPTIDE SYNTHESIS (continued)

Entry number  Number of residues  Trivial name or source                                      Sequence
155           8                   Des-Arg¹-[Leu⁵,⁸,Thr⁶]-bradykinin                           H-Pro-Pro-Gly-Leu-Thr-Pro-Leu-Arg-OH
156           8                   Des-Arg¹-[Leu⁵,⁸,Gly⁶]-bradykinin                           H-Pro-Pro-Gly-Leu-Gly-Pro-Leu-Arg-OH
157           8                   Des-Arg¹-[Tyr(Me)⁵,⁸,Gly⁶]-bradykinin                       H-Pro-Pro-Gly-Tyr(Me)-Gly-Pro-Tyr(Me)-Arg-OH
158           8                   Des-Arg¹-[Tyr(Me)⁵,⁸,Asn⁶]-bradykinin                       H-Pro-Pro-Gly-Tyr(Me)-Asn-Pro-Tyr(Me)-Arg-OH
159           8                   Des-Arg¹-[Tyr(Me)⁵,⁸,Thr⁶]-bradykinin                       H-Pro-Pro-Gly-Tyr(Me)-Thr-Pro-Tyr(Me)-Arg-OH
160           8                   Nylon 6 oligomers                                           H-[Cap]n-OH; n = 2-8
161           8                   TMV protein-(105-112)                                       H-Ala-Glu-Thr-Leu-Asp-Ala-Thr-Arg-OH
162           8                   Histidylphenylalanylvalylglutamyllysylalanylalanylalanine   H-His-Phe-Val-Glu-Lys-Ala-Ala-Ala-OH
163           8                   Staphylococcal nuclease-(132-139), protected                Boc-Glu(OBzl)-Ala-Gln-Ala-Lys(TFA)-Leu-Ile-OH

164 165 166 167 168 169

8 8 8 8 8 8

170

8

171 172 173 174 175 176 177

8 8 8 8 8 8 8

178 179 200

8 8 8

201 202 203 204 205

8 8 9 9 9

[lie ^De5]-angiotensin 11 |Ala3,Ile5)- angiotensin 11 [lie5,Ala81- angiotensin 11 [Alas ]- angiotensin 11 [Val5,Phlac8|-angiotensin II Glycylprolylglycylprolylprolylglycylalanyllysine Phenylalanylphenylalanylglycylleucylleucylleucylglycylphenylalanine [D-Val5]-angiotensin II [Val5,D-Pro7l-angiotensin II [Deuterated-Valsl angiotensin II Lysozyme-(64-71) [Asn^HyVs[-angiotensin II [Asn1,HyPhlac4,Vals[-angiotensin II Leucylalanyltyrosyllysyllysyllysyllysyllysine, protected [Pro3,He5J-angiotensin II Polyglutamyl folic acid Prolylglycylthreonyllysylmethionylisoleucylphenylalanylalanine [Gly^Gly2,He5[-angiotensin II Glucagon(22-29) Bradykinin Bradykinin D-Bradykinin

206 207 208 209 210 211 212 213 214 215

9 9 9 9 9 9 9 9 9 9

Acetyl-bradykinin [Thr6[-bradykinin jphe6[-bradykinin (Lys(Tos)6[-bradykinin [D-Pro2[-bradykinin [D-Pro3[-bradykinin [D-Pro7j-bradykinin Acetyl- [D-Pro7J-bradykinin [D-Arg9J-bradykinin [Ty r(Me) 5’8[-bradykinin

216

9

Acetyl- [Tyr(Me) 5•8J-bradykinin

217 218 219 220 221 222 223

9 9 9 9 9 9 9

[Thr6,Leu8J-bradykinin |Gly6,Tyr(Me)8[-bradykinin Acetyl- [Gly6,T yr(Me)8[-bradykinin [D-Pro2 3,7 J-bradykinin [Leu5 8,Thr6J-bradykinin [Leu5*8,Gly6j-bradykinin [Tyr(Me)5,8,Gly6|-bradykinin

224

9

[Tyr(Me)5,8, Asn6J-bradykinin

225

9

[D-Arg1,9,D-Phe5,8,D-Ser6|-bradykinin

226

9

Des-Arg1-endo-Phe8"-bradykinin

H-Ile-Arg-Val-Tyr-Ile-His-Pro-Phe-OH H-Asp-Arg-Ala-Tyr-Ile-His-Pro-Phe-OH H-Asp-Arg-Val-Tyr-Ile-His-Pro-Ala-OH H-Asp-Arg-Val-Tyr-Ala-His-Pro-Phe-OH H-Asp-Arg-Val-Tyr-Val-His-Pro-Phlac-OH H-Gly-Pro-Gly-Pro-Pro-Gly-Ala-Lys-OH H-DL-Phe-Phe-Gly-Leu-Leu-Leu-Gly-Phe-OH H-Asp-Arg-Val-Tyr-D-Val-His-Pro-Phe-OH H-Asp-Arg-Val-Tyr-Val-His-D-Pro-Phe-OH H-Asp-Arg-Val-Tyr-(2H)-Val-His-Pro-Phe-OH H-Cys-Asn-Asp-Gly-Arg-Thr-Pro-Gly-OH H-Asn-Arg-Val-Tyr-HyV-His-Pro-Phe-OH H-Asn-Arg-Val-HyPhlac-Val-His-Pro-Phe-OH H-Leu-Ala-Tyr-[Lys(TFA)]5-OH H-Asp-Arg-Pro-Tyr-Ile-His-Pro-Phe-OH Pteroyl-(7 -Glu)n-Glu-OH, n = 0 - 6 H-Pro-Gly-Thr-Lys-Met-Ile-Phe-Ala-OH H-Gly-Gly-Val-Tyr-lle-His-Pro-Phe-OH H-Phe-Val-Gln-Try-Leu-Met-Asn-Thr-OH H-Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg-OH H-Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg-OH H-D-Arg-D-Pro-D-Pro-Gly-D-Phe-D-Ser-D-ProD-Phe-D-Arg-OH Ac-Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg-OH H-Arg-Pro-Pro-Gly-Phe-Thr-Pro-Phe-Arg-OH H-Arg-Pro-Pro-Gly-Phe-Phe-Pro-Phe-Arg-OH H-Arg-Pro-Pro-Gly-Phe-Lys(Tos)-Pro-Phe-Arg-OH H-Arg-D-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg-OH H-Arg-Pro-D-Pro-Gly-Phe-Ser-Pro-Phe-Arg-OH H-Arg-Pro-Pro-Gly-Phe-Ser-D-Pro-Phe-Arg-OH Ac-Arg-Pro-Pro-Gly-Phe-Ser-D-Pro-Phe-Arg-OH H-Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-D-Arg-OH FI-Arg-Pro-Pro-Gly-Tyr(Me)-Ser-Pro-Tyr(Me)Arg-OH Ac-Arg-Pro-Pro-Gly-Tyr(Me)-Ser-Pro-Tyr(Me)Arg-OH H-Arg-Pro-Pro-Gly-Phe-Thr-Pro-Leu-Arg-OH H-Arg-Pro-Pro-Gly-Phe-Gly-Pro-Tyr(Me)-Arg-OH Ac-Arg-Pro-Pro-Gly-Phe-Gly-Pro-Tyr(Me)-Arg-OH H-Arg-D-Pro-D-Pro-Gly-Phe-Ser-D-Pro-Phe-Arg-OH H-Arg-Pro-Pro-Gly-Leu-Thr-Pro-Leu-Arg-OH H-Arg-Pro-Pro-Gly-Leu-Gly-Pro-Leu-Arg-OH H-Arg-Pro-Pro-Gly-Tyr(Me)-Gly-Pro-Tyr(Me)Arg-OH H-Arg-Pro-Pro-Gly-Tyr(Me)-Asn-Pro-Tyr(Me)Arg-OH H-D-Arg-Pro-Pro-Gly-D-Phe-D-Ser-Pro-D-PheD-Arg-OH H-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Phe-Arg-OH


PEPTIDES PREPARED BY SOLID PHASE PEPTIDE SYNTHESIS (continued)

Entry number  Reference  Solid support    Type of bond  Monomer protection  Coupling reagent  Deprotection   Cleavage
155           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
156           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
157           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
158           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
159           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
160           19         S-DVB-2%         Bzl           Boc                 DCC               TFA            HBr-HOAc
161           38         S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HBr-TFA
162           93         S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HBr-TFA
163           132        S-DVB-2%         Bzl           Boc                 OSu, ONp          HCl-Diox.      HBr-TFA

164 165 166 167 168

71 17 91 71 40

S-DVB-2% S-DVB-2% S-DVB-2% S-DVB-2% S-DVB-2%

Bzl Bzl Bzl Bzl Bzl

Boc Boc Boc Boc Boc

HBr-TFA HBr-TFA HBr-TFA HBr-TFA HBr-TFA

169 170

59 45

S-DVB-2% S-DVB-2%

Bzl Bzl

Boc Boc

DCC HCl-HOAc DCC HCl-HOAc DCC HCl-HOAc NEPIS HCl-HOAc DCC, HCl-HOAc Mixed anh. DCC HCl-HOAc DCC HCl-HOAc

171 172 173 174 175 176 177

92 92 92 107 97 97 69

S-DVB-2% S-DVB-2% S-DVB-2% S-DVB-2% S-DVB-2% S-DVB-2% S-DVB-2%

Bzl Bzl Bzl Bzl Bzl Bzl Bzl

Boc Boc Boc Boc Boc Boc Boc

HCl-HOAc HCl-HOAc HCl-HOAc HCl-HOAc HCl-HOAc HCl-HOAc HCl-Diox.

HBr-TFA HBr-TFA HBr-TFA HBr-TFA HBr-TFA HBr-TFA HBr-TFA

178 179 200

71 103 108

S-DVB-2% S-DVB-2% S-DVB-2%

Bzl Bzl Bzl

Boc Boc Boc

DCC DCC DCC DCC,ONp DCC DCC NEPIS, Azide DCC Mixed anh. DCC

HCl-HOAc TFA-CH2C12 HCl-HOAc

HBr-TFA HBr-TFA HBr-TFA

Entry number  Reference  Solid support    Type of bond  Monomer protection  Coupling reagent  Deprotection   Cleavage
201           125        S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HBr-TFA
202           60         S-DVB-2%         Bzl           Boc                 DCC, ONp          HCl-HOAc-BME   HF
203           20, 21     S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HBr-TFA
204           14         S-DVB-2%         Bzl           Boc                 DCC               HCl-HOAc       HF
205           22         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
206           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
207           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
208           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
209           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
210           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
211           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
212           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
213           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
214           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
215           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
216           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
217           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
218           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
219           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
220           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
221           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
222           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
223           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
224           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
225           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA
226           18         S-DVB-2%         Bzl           Boc                 DCC               HCl-Diox.      HBr-TFA

HBr-TFA HBr-TFA


PEPTIDES PREPARED BY SOLID PHASE PEPTIDE SYNTHESIS (continued) Entry number

Number of residues

Trivial name or source

Sequence

227 228 229

9 9 9

Des-Arg^bradykininyl-glycine H-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg-Gly-OH Des-Arg'-bradykininyl-arginine H-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg-Arg-OH Des-Arg1- |Tyr(Me)5 8)-bradykininyl-glycine H-Pro-Pro-Gly-Tyr(Me)-Ser-Pro-Tyr(Me)-Arg-

230

9

D-Retrobradykinin

231

9

Lysine vasopressin, protected

232 233

9 9

Chymotrypsin (193-201) nonapeptide |Tyr(Me)5,Thr6,Tyr(Me)8J-bradykinin

234 235 236 237

9 9 9 9

Oxytocin Oxytocin Oxytocin Oxytocin, protected

238

9

Oxytocin, protected

239 240 241

9 9 9

Oxytocin, protected Deamino-oxytocin Deamino-oxytocin, protected

242 243 244

9 9 9

|Lys8|-vasopressin ILys8|-vasopressin JLys8|-vasopressin, protected

245

9

(Lys81-vasopressin, protected

246 247

9 9

(Ser4,Gln8|-oxytocin [Ser4,Gln8|-oxytocin, protected

248

9

Oxytocinoic acid, protected

249 250 251 252 253

9 9 9 9 9

TMV protein— .—


H 20 if n < 4 H 20

(Qy-Pro-Gly)4 (Gly-Pro-Pro)_, 6 Harn

LVsn

^Pn

(DL-HSe)n

h 2o

H 20

H20 < pH 6

H20 a 2AcOH, FjA cO H, Cone. Ligand N H / halides h 2o h 2o

Glu-Gly(Pro-Pro-Gly) s G lyn (Gly-Ala-Pro)4 (Gly-Pro-Ala)n

Hi*n

H jO > pH 4

H jO > pH 4, HCONMe2 (continued)

S oluble inc

(D-Glu)n

Glun (continued)

Substrate**’0

Ficin Human thyroid proteinase 6 Keratinase Leucine aminopeptidase Leucine aminopeptidase I from Aspergillus oryzae Leucine aminopeptidase 11 from Aspergillus oryzae Leucine aminopeptidase III from Aspergillus oryzae Pancreatic protease 1 Pancreatic protease 2 Papain Penicillium cyaneofulvum protease Pepsin Pronase Rennin Staphylococcus aureus 6 Subtilisin T rypsin Yeast proteinase C Chymotrypsin Pepsin “ Pronase” Prolyl hydroxylase Keratinase Prolyl hydroxylase Collagenase Prolyl hydroxylase Prolyl hydroxylase Prolyl hydroxylase Chymotrypsin Trypsin Chymotrypsin Lactoperoxidase Pepsin Takadiastase6 Chymotrypsin Pepsin “ Pronase” Trypsin Carboxypeptidase C Pepsin Acid protease from germinated sorghum A rthrobacter proteinase Carboxypeptidase*A Carboxypeptidase-B Cathepsin D 2e Chymotrypsin Elastase, Fibrinolysin (bovine) Ficin Human thyroid proteinase Keratinase

Enzyme 4 .0 -7 .5 4 .3 -7 .7 9.0 5 .0 -7 .7 6.5 6.5 6.0 5.3 5.3 4 .0 -8 .0 4.5 2 .0 -5 .0 7.6, 8.0 2 .0 -5 .0 4.8 3 .5 -7 .5 5 .0 -8 .6 4.2 7.7, 8.0 2.3, 4.2 7.6, 8.0 7.8 9.0 7.2 7.0 7.8 7.8 7.8 8.0 8.0 7.7, 8.0 6.0 2.3, 4.2 5.3 7.7, 8.0 2.3, 4.2 7.6, 8.0 7.7, 8.0 5.3 2.3 3.6 8.0 5 .6 -9 .5 7 .9 -9 .3 4.7 7.7, 8.0 7 .0 -1 0 7.4 7 .0 -1 2 4 .3 -9 .7 9.0

pH

+ + + + + + + +

-

+ slow slow + -

-

+ + slow slow + + + + + + + + + + + + + very slow + slow slow +

Reacted (+) or not (-)

Lys2, Lys3, Lys4

Lys2, Lys3, Lys4 LySj, Lys4 , Lyss

Lys

Oligopeptides

His

Iodinated polyhistidine

H ar2, H ar,

Gly-Pro-Ala

Glu

Glu 3, G1u4

Glu

Glu, GlUj

M ajor p ro d u c ts 0

POLY(a-AMINO ACIDS), THEIR SOLUBILITY AND SUSCEPTIBILITY TO ENZYMATIC ACTIVITIES (continued)

9

10 14 9 18 21 22 23 52 52 16, 18, 24 11 2, 4 , 25 2 10 14 18 2, 18 12 2 2 ,4 2 53 9 55 26 59 59 59 61 61 2 27 2 28 2 2 2 2 29 4 8 30 31, 32 3 1 ,3 2 14 2, 14, 32 , 33 32 28 1 0 , 32 14

R eference *


33% HBr/AcOH, Hot AcOH

H 20 , AcOH

*°n

h 2o

h 2o

h 2o

O m n(38) Kien

(Pro-Gly-Ala)n (Pro-Gly-Gly)n

h 2o

C H Q j , Cl, AcOH h 2o

Metn [Lys(M e)Jn

H 20

(DLys)n

Soluble in c

(Lys-Ala-Ala)n

H2O (continued)

Lysn (continued)

Substrate**’0

Carboxypeptidase B Trypsin Trypsin Chymotrypsin Keratinase Acid protease from germinated sorghum Aminopeptidase P Carboxypeptidase C Chymotrypsin Clostridial aminopeptidase Dipeptido carboxypeptidase from E. coli Prolyl hydroxylase Leucine aminopeptidase from bovine lens Leucine aminopeptidase I from Aspergillus oryzae Leucine aminopeptidase II from Aspergillus oryzae Leucine aminopeptidase III from Aspergillus oryzae Pepsin Penicillium cyaneofulvum protease Proline iminopeptidase “ Pronase” Prolyl hydroxylase Trypsin X-Prolyl aminopeptidase Yeast proteinase C Collagenase Collagenase

Leucine aminopeptidase Leucine aminopeptidase I from Aspergillus oryzae Leucine aminopeptidase II from Aspergillus oryzae Leucine aminopeptidase III from Aspergillus oryzae Pancreas-protease Papain Penicillium cyaneofulvum protease Pepsin “ Pronase” Pronase E Staphylococcus aureus e Subtilisin Takadiastate Thrombin Trypsin Carboxypeptidase-B a-Chym otrypsin Pancreas powder extract Pancreas-protease Pronase E Takadiastase Trypsin Elastase Trypsin

Enzyme

susp. 9.0 3.6 8.6 5.3 7.7, 8.0 8.6 8.1 7.5 9.1 6.5 6.5 6.0 2 .3 ,4 .2 4 .0 -1 0 .7 7 .8 -9 .S 7.6, 8.0 7.5 7.7, 8.0 7.7 5 .0 -8 .0 7.0 7.0

7.5 7.2

4 .3 -9 .7 6.5 6.5 6.0 7.65 7 .0 -1 2 10.7 2 .3 -4 .6 7.6, 8.0 7.0, 8.3 3 .2 -6 .1 , 7 .1 -9 .5 6 .5 -1 0 4 .8 5 ,5 .0 7.8 6 .0 -1 0 7.65 7.65 6.0 7.65 7.00, 8.3 4.85 8.0 8.6 7.5

pH

♦ + very slow +

-

+ + + + very slow very slow very slow -

+ +

-

+ -

-

-

+

-

+ + + slow + + + + + + +

Reacted (+) or not (-)

Pro-Gly, Gly-Pro-Gly, Gly-Pro-Gly-Gly

Pro

Pro

Lys(Me)

Lys-Ala-Ala, (Lys-Ala-Ala), Ala-Ala-Lys

Lys2, LySj

Lys

Lys

Lys, Lys2, Lys3 Oligopeptides Oligopeptides

Lys

Major products0

POLY(a-AMINO ACIDS), THEIR SOLUBILITY AND SUSCEPTIBILITY TO ENZYMATIC ACTIVITIES (continued)

13 51 39 40 9 8 41 42 2 43 44 4 8 ,5 8 45 21 22 23 2 ,4 11 4 6 ,4 7 2 48 2 49 12 26 26

32 21 22 23 60 1 0 ,3 2 11 2, 32 , 34 2 , 14 60 14 32 2 8 ,6 0 28 32 , 34 , 35 60 60 36 60 60 60 36 37 37

Referenc 1


H20 for n < 10, 10% AcOH or 50% EtOH for n = 20 Cone. LiBr in H20 Pyr, HCONMe2, Cl2AcOH H20 > pH 9, Pyr, HCONMe2, Me2SO

H20 > pH 5.5 1 3AcOH

(Pro-Pro-Gly)n f

(Tyr-Ala-Glu) Valn

Carboxypeptidase C C o s t rid ial aminopeptidase Collagenase Dipeptido carboxypeptidase from E. coli Prolyl hydroxylase Prolyl hydroxylase Acid protease from germinated sorghum Chymotrypsin Chymotrypsin Chymotrypsin Keratinase Pepsin Trypsin Acid proteinase from germinated sorghum

Enzyme

susp. 7.7, 8.0 7 .3 -7 .5 , 8.3 9.0 2.3, 4.2 7.7, 8.0 3.6

5.3 8.6 7.0 8.1 7.2, 7.5, 7.6, 7.8 7.5, 7.8 3.6, susp.

pH

Compiled by Arieh Yaron.

f The nonapeptides, n = 3, with N-tert-pentyloxycarbonyl and benzyl ester blocked N- and C-terminal groups, respectively, were also hydroxylated.

T y rn

T rPn

Sern

H20

Soluble in c

(Pro-Gly-Pro)n

Substrate * 1’0

+

-

-

+

+ + +

+

+

_

Reacted (+) or not (-)

Oligopeptides with N-terminal tyrosine

Pro, Gly-Pro-CPro-Gly-Proln.j Gly-Pro, Gly-Pro-Pro, Pro-Gly-Pro-Pro Gly-Pro, (Pro-Gly-Pro)n_ ,-Pro Hydroxylated poly-(Pro-Gly-Pro)

M ajor p ro d u c ts 0

POLY(a-AMINO ACIDS), THEIR SOLUBILITY AND SUSCEPTIBILITY TO ENZYMATIC ACTIVITIES (continued)

40 2 50 9 2 ,4 2 8 2

29 43 26 44 4 8 ,5 4 ,5 5 ,5 8 5 6 ,5 7 8

Reference *


POLY(α-AMINO ACIDS), THEIR SOLUBILITY AND SUSCEPTIBILITY TO ENZYMATIC ACTIVITIES (continued)

REFERENCES

1. Linderstrøm-Lang, Acta Chem. Scand., 12, 851 (1958).
2. Simons and Blout, Biochim. Biophys. Acta, 92, 197 (1964).
3. Katchalski, Sela, Silman, and Berger, in The Proteins, Vol. II, 2nd ed., Neurath, Ed., Academic Press, New York, 1964, 524.
4. Neumann, Sharon, and Katchalski, Nature, 195, 1002 (1962).
5. Lindley, Nature, 178, 647 (1956).
6. Kaye and Sheratzky, Biochim. Biophys. Acta, 190, 527 (1969).
7. Ariely, Wilchek, and Patchornik, Biopolymers, 4, 91 (1966).
8. Garg and Virupaksha, Eur. J. Biochem., 17, 13 (1970).
9. Nickerson and Durand, Biochim. Biophys. Acta, 77, 87 (1963).
10. Katchalski, Levin, Neumann, Riesel, and Sharon, Bull. Res. Counc. Isr. Sect. A, 10, 159 (1961).
11. Ankel and Martin, Biochem. J., 91, 431 (1964).
12. Hayashi and Hata, Biochim. Biophys. Acta, 263, 673 (1972).
13. Seely and Benoiton, Biochem. Biophys. Res. Commun., 37, 771 (1969).
14. Lundblad and Johansson, Acta Chem. Scand., 22, 662 (1968).
15. Martin and Jonsson, Can. J. Biochem., 43, 1745 (1965).
16. Green and Stahmann, J. Biol. Chem., 197, 771 (1952).
17. Avrameas and Uriel, Biochemistry, 4, 1750 (1965).
18. Miller, J. Am. Chem. Soc., 86, 3913 (1964).
19. Folk and Schirmer, J. Biol. Chem., 240, 181 (1965).
20. Gjessing and Hartnett, J. Biol. Chem., 237, 2201 (1962).
21. Nakadai, Nasuno, and Iguchi, Agric. Biol. Chem., 37, 757 (1973).
22. Nakadai, Nasuno, and Iguchi, Agric. Biol. Chem., 31, 161 (1973).
23. Nakadai, Nasuno, and Iguchi, Agric. Biol. Chem., 37, 775 (1973).
24. Miller, J. Am. Chem. Soc., 83, 259 (1961).
25. Simons, Fasman, and Blout, J. Biol. Chem., 236, PC64 (1961).
26. Harper, Berger, and Katchalski, Biopolymers, 11, 1607 (1972).
27. Holohan, Murphy, Flanagan, Buchanan, and Elmore, Biochim. Biophys. Acta, 322, 178 (1973).
28. Rigbi, Ph.D. thesis, Hebrew University, Jerusalem, 1957.
29. Nordwig, Hoppe-Seyler's Z. Physiol. Chem., 349, 1353 (1968).
30. Hofsten and Reinhammar, Biochim. Biophys. Acta, 110, 599 (1965).
31. Gladner and Folk, J. Biol. Chem., 231, 393 (1958).
32. Miller, J. Am. Chem. Soc., 86, 3918 (1964).
33. Katchalski, Adv. Protein Chem., 6, 123 (1951).
34. Waley and Watson, Biochem. J., 55, 328 (1953).
35. Katchalski, Grossfeld, and Frankel, J. Am. Chem. Soc., 70, 2094 (1948).
36. Tsuyuki, Tsuyuki, and Stahmann, J. Biol. Chem., 222, 479 (1956).
37. Yaron, Tal, and Berger, Biopolymers, 11, 2461 (1972).
38. Debabov, Davidov, and Morozkin, Izvest. Akad. Nauk SSSR Ser. Khim., p. 2153 (1966).
39. Katchalski, Sela, Silman, and Berger, in The Proteins, Vol. II, 2nd ed., Neurath, Ed., Academic Press, New York, 1964, 521.
40. Rigbi and Gros, Bull. Res. Counc. Isr., 11A, 44 (1962).
41. Yaron and Mlynar, Biochem. Biophys. Res. Commun., 32, 658 (1968); Yaron and Berger, Methods in Enzymology, Perlman and Lorand, Eds., Vol. 19, Academic Press, New York, 1970, 521.
42. Nordwig, Hoppe-Seyler's Z. Physiol. Chem., 349, 1353 (1968).
43. Kessler and Yaron, Biochem. Biophys. Res. Commun., 50, 405 (1973); Methods in Enzymology, Perlman and Lorand, Eds., Academic Press, New York, in press.
44. Yaron, Mlynar, and Berger, Biochem. Biophys. Res. Commun., 47, 897 (1972).
45. Wiederanders, Lasch, Kirschke, Bohley, Ansorge, and Hanson, Eur. J. Biochem., 36, 504 (1973).
46. Sarid, Berger, and Katchalski, J. Biol. Chem., 234, 1740 (1959).
47. Sarid, Berger, and Katchalski, J. Biol. Chem., 237, 2207 (1962).
48. Kivirikko and Prockop, J. Biol. Chem., 242, 4007 (1967).
49. Dehm and Nordwig, Eur. J. Biochem., 17, 364 (1970).
50. Rigbi, Seliktar, and Katchalski, Bull. Res. Counc. Isr., 6A, 313 (1957).
51. Paik and Kim, Biochemistry, 11, 2589 (1972).
52. Uriel and Avrameas, Biochemistry, 4, 1740 (1965).
53. Kivirikko, Kishida, Sakakibara, and Prockop, Biochim. Biophys. Acta, 271, 347 (1972).
54. Prockop, Juva, and Engel, Hoppe-Seyler's Z. Physiol. Chem., 348, 553 (1967).
55. Rhoads and Udenfriend, Arch. Biochem. Biophys., 133, 108 (1969).
56. Kikuchi, Fujimoto, and Tamiya, Biochem. J., 115, 569 (1969).
57. Suzuki and Koyama, Biochim. Biophys. Acta, 177, 154 (1969).
58. Hutton, Jr., Marglin, Witkop, Kurtz, Berger, and Udenfriend, Arch. Biochem. Biophys., 125, 779 (1968).
59. Kivirikko and Prockop, J. Biol. Chem., 244, 2755 (1969).
60. Darge, Sass, and Thiemann, Z. Naturforsch., 28c, 116 (1973).
61. Rigbi and Elzab, 6th FEBS Meet., Madrid, Abstr. No. 730, 1969; Rigbi, Segal, Kliger, and Schwartz, Bayer Symp. V, Fritz, Tschesche, Greene, and Truscheit, Eds., Springer-Verlag, Berlin, 1974, 541.
62. Katchalski, Sela, Silman, and Berger, in The Proteins, Vol. II, 2nd ed., Neurath, Ed., Academic Press, New York, 1964, 436.
63. Bamford, Elliott, and Hanby, Synthetic Polypeptides, Academic Press, New York, 1956.
64. Katchalski and Sela, Adv. Protein Chem., 13, 243 (1958).
65. Szwarc, Adv. Polymer Sci., 4, 1 (1965).
66. Stahmann, Polyamino Acids, Polypeptides and Proteins, Wisconsin Press, Madison, 1962.
67. Sela and Katchalski, Adv. Protein Chem., 14, 391 (1959).
68. Katchalski, Proc. VI Int. Congr. Biochem., p. 80 (1965).
69. Katchalski, Harvey Lect., 59, 243 (1965).
70. Katchalski, in New Perspectives in Biology, Sela, Ed., Elsevier, New York, 1964, 67.
71. Kauzmann, Annu. Rev. Phys. Chem., 8, 413 (1957).
72. Scheraga, Annu. Rev. Phys. Chem., 10, 191 (1959).
73. Leach, Rev. Pure Appl. Chem., 9, 1 (1959).
74. Urnes and Doty, Adv. Protein Chem., 16, 401 (1961).
75. Harrap, Gratzer, and Doty, Annu. Rev. Biochem., 30, 269 (1961).
76. Schellman and Schellman, in The Proteins, Vol. II, 2nd ed., Neurath, Ed., Academic Press, New York, 1964, 1.
77. Fasman, Tooney, and Shalitin, in Encyclopedia of Polymer Science and Technology, Bikales, Ed., John Wiley & Sons, New York, 1965, 837.
78. Fasman, in Biological Macromolecules, Poly-α-Amino Acids, Protein Models for Conformational Studies, Timasheff and Fasman, Eds., Marcel Dekker, New York.
79. Goodman, Verdini, Choi, and Masuda, Top. Stereochem., 5, 69 (1970).
80. Scheraga, Chem. Rev., 71, 195 (1971).
81. Johnson, J. Pharm. Sci., 63, 313 (1974).
82. Blout, Bovey, Goodman, and Lotan, Peptides, Polypeptides and Proteins, John Wiley & Sons, New York, 1974.

Index

INDEX

A

Abbreviations
  amino acids, table of, 78
  distinguished from symbols, 6
  polypeptide chains, nomenclature rules, 59-73
  polypeptides, sequential, 333
  use of, in biochemical nomenclature, 6-7
Abrine
  physical and chemical properties, 148
Absorbances, molecular, see Molecular absorbances
2-Acetamidoacrylic acid
  structure and symbols for those incorporated into synthetic polypeptides, 97
β-Acetamido-L-alanine, see β-N-Acetyl-α,β-diaminopropionic acid
Acetoacetic acid decarboxylase
  average hydrophobicity value, 210
Acetylcholine receptor
  average hydrophobicity value, 210
N-Acetylalanine
  physical and chemical properties, 148
N-Acetylarginine
  physical and chemical properties, 132
N-Acetylaspartic acid
  physical and chemical properties, 148
Acetylcholinesterase
  average hydrophobicity value, 210
N-Acetylcysteine
  UV absorption characteristics, 186
β-N-Acetyl-α,β-diaminopropionic acid
  physical and chemical properties, 119
N-Acetyldjenkolic acid
  physical and chemical properties, 151
N-Acetyl-2-fluorophenylalanine
  structure and symbols for those incorporated into synthetic polypeptides, 97
N-Acetyl-β-D-glucosaminidase
  average hydrophobicity value, 210
N-Acetyl-β-D-glucosaminidase-A, average hydrophobicity value, 210
N-Acetyl-β-D-glucosaminidase-B, average hydrophobicity value, 210
Acetylenic dicarboxylic acid diamide
  physical and chemical properties, 121
N-Acetylglutamic acid
  physical and chemical properties, 148
O-Acetylhomoserine
  physical and chemical properties
N-Acetyllysine-N-methylamide
  structure and symbols for those incorporated into synthetic polypeptides, 97
N-Acetylornithine
  physical and chemical properties, 119
Acetylornithine γ-transaminase
  average hydrophobicity value, 210
N-Acetyl-DL-β-phenylalanine methyl ester
  UV spectra of, 193
O-Acetylserine sulfhydrylase A
  average hydrophobicity value, 210
N-Acetyl-DL-tryptophan methyl ester, UV spectra, 192
N-Acetyl-L-tyrosine ethyl ester, UV spectra, 194
N-Acetyl-L-tyrosine ethyl ester anion, UV spectra, 195
Achiral molecule, nomenclature for, 36-58
Acidemia
  organic acid metabolism errors, characteristics, 326-327
α1-Acid glycoprotein
  average hydrophobicity value, 210
Acid phosphatase
  wheat bran
    luminescence of, table, 206
Aciduria
  amino acid metabolism errors, characteristics, 317-324
  organic acid metabolism errors, characteristics, 326-327
Actin
  average hydrophobicity value, 210, 211
F-Actin
  luminescence of, table, 205
Actomyosin
  luminescence of, table, 205
Acylcarrier protein
  average hydrophobicity value, 211
Acylphosphatase
  average hydrophobicity value, 211
Adenosine
  average hydrophobicity value, 211
Adenosine deaminase, average hydrophobicity value, 211
Adenosine monophosphate deaminase
  average hydrophobicity value, 211
Adenosine monophosphate nucleosidase
  average hydrophobicity value, 211
Adenosine triphosphate, average hydrophobicity values, 211
Adenylate cyclase
  average hydrophobicity value, 211
Adenylosuccinate-AMP-lyase
  average hydrophobicity value, 212
Adrenodoxin
  average hydrophobicity value, 212
Adrenodoxin reductase
  average hydrophobicity value, 212
Aequorin
  average hydrophobicity value, 212
Agglutinin
  average hydrophobicity value, 212
Alanine
  far ultraviolet absorption spectra
    aqueous solution at pH 5, 184
    neutral water, table, 185
    0.1 M sodium dodecyl sulfate, table, 185
  free acid in amniotic fluid in early pregnancy and at term, 327
  free acid in blood plasma of newborn infants and adults, 328
  physical and chemical properties, 112
  specific rotatory dispersion constants, 0.1 M solution, 244
  spectra, far UV, 184
  symbols for atoms and bonds in side chains, 69
α-Alanine
  antagonists of, 177
β-Alanine
  antagonists of, 177
  free acid in blood plasma of newborn infants and adults, 328
  hyperbetaalaninemia, effect of, 317
  physical and chemical properties, 112
  structure and symbols for those incorporated into synthetic polypeptides, 97
D-Alanine
  antagonists of, 177
Alanine amino transferase
  average hydrophobicity value, 212
Alanosine
  physical and chemical properties, 134
2-Alanyl-3-isoxazolin-5-one
  physical and chemical properties, 135
Albizzine
  physical and chemical properties, 132
Albumin
  average hydrophobicity value, 212
  bovine serum
    luminescence of, table, 205
  human serum
    luminescence of, table, 205
Alcohol dehydrogenase
  average hydrophobicity value, 212
Aldehyde dehydrogenase, protein A
  average hydrophobicity value, 212
Aldolase
  average hydrophobicity value, 212
  luminescence of, table, 205
Aldolase C, average hydrophobicity value, 212
Aldolase, fructose diphosphate, average hydrophobicity value, 212
Aldolase, fructose-1,6-diphosphate, average hydrophobicity values, 212
Alkaline phosphatase
  average hydrophobicity value, 213
Alkaline proteinase-B
  average hydrophobicity value, 213
Alliin
  physical and chemical properties, 151
D-Alloenduracididine
  physical and chemical properties, 164
Allohydroxyproline
  physical and chemical properties, 135
D-Allohydroxyproline betaine, see Turcine
Alloisoleucine
  structure and symbols for those incorporated into synthetic polypeptides, 97
D-Alloisoleucine
  physical and chemical properties, 164
Allokainic acid
  physical and chemical properties, 135
D-Allothreonine
  physical and chemical properties, 164
S-Allylcysteine
  physical and chemical properties, 151
Allylglycine
  antagonism to Cysteine, 177
S-Allylmercaptocysteine
  physical and chemical properties, 151
ω-Amidase
  average hydrophobicity value, 213
Amides
  far ultraviolet absorption spectra
    neutral water, table, 185
    0.1 M sodium dodecyl sulfate, table, 185
    aqueous solution at pH 5, 184
N-Amidinoglycyl, see Guanidinoacetyl
α-Aminoacetic acid, see Glycine
α-Amino-γ-N-acetylaminobutyric acid
  physical and chemical properties, 119
Amino acid decarboxylase, aromatic
  average hydrophobicity value, 213
D-Amino acid oxidase
  average hydrophobicity value, 213
Amino acids, see also specific acids
  abbreviations, table of, 78
  aminoacidurias, characteristics of, table, 317-324
  antagonists, table of, 177-180
  aromatic
    luminescence of, table, 200
    luminescence of derivatives of, table, 201-203
  average hydrophobicity values for side chains, 209
  circular dichroism spectra of metal complexes of, 245-315
  enzymatic activities, susceptibility of poly(α-amino acids), 393-396
  free acid in blood plasma of newborn infants and adults, table, 328
  α-Keto acid analogs of, properties, 181-182
  metabolism, errors of, table, 317-324
  molecular absorbances of aromatic types, data, 187-189
  naturally occurring
    physical and chemical properties, table, 111-166
    synonyms for, 111-166
  L-phosphorus-containing, physical and chemical properties, 161-164
  plasma, normal or low, renal aminoacidurias, 323-324
  poly(α-amino acids)
    enzymatic activities, susceptibility to, 393-396
    solubility, table, 393-396
  polymerized
    abbreviated nomenclature for, 91-95
    symbols for, use of, 13
  residue notation, 59-60
  sequences, rules for one-letter notation, 75-78
  solubility of poly(α-amino acids), table, 393-396
  spectra, far UV absorption, data, 184-185
  symbols for derivatives and peptides, 79-90
  symbols for, use of, 8-9
  L-N-substituted, physical and chemical properties, 148-151
  L-sulfur and selenium-containing, physical and chemical properties, 151-158
  synthetic
    structures and symbols for those incorporated into synthetic polypeptides, 96