Measurement, Control, and Communication Using IEEE 1588 (Advances in Industrial Control) 1846282500, 9781846282508

A common sense of time among the elements of a distributed measurement and control system allows the use of new techniqu

English Pages 302 [291] Year 2006


Advances in Industrial Control

Other titles published in this Series:

Digital Controller Implementation and Fragility
Robert S.H. Istepanian and James F. Whidborne (Eds.)

Optimisation of Industrial Processes at Supervisory Level
Doris Sáez, Aldo Cipriano and Andrzej W. Ordys

Robust Control of Diesel Ship Propulsion
Nikolaos Xiros

Hydraulic Servo-systems
Mohieddine Jelali and Andreas Kroll

Strategies for Feedback Linearisation
Freddy Garces, Victor M. Becerra, Chandrasekhar Kambhampati and Kevin Warwick

Robust Autonomous Guidance
Alberto Isidori, Lorenzo Marconi and Andrea Serrani

Dynamic Modelling of Gas Turbines
Gennady G. Kulikov and Haydn A. Thompson (Eds.)

Control of Fuel Cell Power Systems
Jay T. Pukrushpan, Anna G. Stefanopoulou and Huei Peng

Fuzzy Logic, Identification and Predictive Control
Jairo Espinosa, Joos Vandewalle and Vincent Wertz

Optimal Real-time Control of Sewer Networks
Magdalene Marinaki and Markos Papageorgiou

Process Modelling for Control
Benoît Codrons

Computational Intelligence in Time Series Forecasting
Ajoy K. Palit and Dobrivoje Popovic

Modelling and Control of mini-Flying Machines
Pedro Castillo, Rogelio Lozano and Alejandro Dzul

Rudder and Fin Ship Roll Stabilization
Tristan Perez

Hard Disk Drive Servo Systems (2nd Edition)
Ben M. Chen, Tong H. Lee, Kemao Peng and Venkatakrishnan Venkataramanan
Publication due March 2006

Piezoelectric Transducers for Vibration Control and Damping
S.O. Reza Moheimani and Andrew J. Fleming
Publication due March 2006

Windup in Control
Peter Hippe
Publication due April 2006

Manufacturing Systems Control Design
Stjepan Bogdan, Frank L. Lewis, Zdenko Kovačić and José Mireles Jr.
Publication due May 2006

John C. Eidson

Measurement, Control, and Communication Using IEEE 1588
With 38 Figures (including 4 in Color)

John C. Eidson, PhD Agilent Technologies, Inc. MS 24M-A 3500 Deer Creek Road Palo Alto, CA 94304 USA

British Library Cataloguing in Publication Data Eidson, John C. Measurement, control, and communication using IEEE 1588. (Advances in industrial control) 1.Automatic control - Standards 2.Automatic control I.Title 629.8’0218 ISBN-10: 1846282500 Library of Congress Control Number: 2006921167 Advances in Industrial Control series ISSN 1430-9491 ISBN-10: 1-84628-250-0 e-ISBN 1-84628-251-9 ISBN-13: 978-1-84628-250-8

Printed on acid-free paper

© Springer-Verlag London Limited 2006 CableLabs® is a registered trademark of Cable Television Laboratories, Inc., 858 Coal Creek Circle, Louisville, CO 80027-9750, U.S.A. http://www.cablelabs.com/ CompactPCI® is a registered trademark of PCI Industrial Computer Manufacturers Group, Inc., PICMG, c/o Virtual, Inc., 401 Edgewater Place, Suite 600, Wakefield, MA 01880, U.S.A. http://www.picmg.org/ DeviceNet™ is a trademark of the ODVA, ODVA Headquarters, Technology and Training Center, 1099 Highland Drive, Suite A, Ann Arbor, MI 48108-5002, U.S.A., www.odva.org® Ethernet® is a registered trademark of the Xerox Corporation, 800 Long Ridge Road, Stamford, CT 06904, U.S.A. http://www.xerox.com/go/xrx/template/013.jsp?Xcntry=USA&Xlang=en US MATLAB® is a registered trademark of The MathWorks, Inc., 3 Apple Hill Drive Natick, MA 01760-2098, U.S.A. http://www.mathworks.com QNX® is a registered trademark of QNX Software Systems Ltd., 175 Terence Matthews Crescent, Ottawa, Ontario, Canada, K2M 1W8. http://www.qnx.com/ Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers. The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use. 
The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made. Printed in Germany 987654321 Springer Science+Business Media springer.com

Advances in Industrial Control Series Editors Professor Michael J. Grimble, Professor of Industrial Systems and Director Professor Michael A. Johnson, Professor (Emeritus) of Control Systems and Deputy Director Industrial Control Centre Department of Electronic and Electrical Engineering University of Strathclyde Graham Hills Building 50 George Street Glasgow G1 1QE United Kingdom

Series Advisory Board Professor E.F. Camacho Escuela Superior de Ingenieros Universidad de Sevilla Camino de los Descobrimientos s/n 41092 Sevilla Spain Professor S. Engell Lehrstuhl für Anlagensteuerungstechnik Fachbereich Chemietechnik Universität Dortmund 44221 Dortmund Germany Professor G. Goodwin Department of Electrical and Computer Engineering The University of Newcastle Callaghan NSW 2308 Australia Professor T.J. Harris Department of Chemical Engineering Queen’s University Kingston, Ontario K7L 3N6 Canada Professor T.H. Lee Department of Electrical Engineering National University of Singapore 4 Engineering Drive 3 Singapore 117576

Professor Emeritus O.P. Malik Department of Electrical and Computer Engineering University of Calgary 2500, University Drive, NW Calgary Alberta T2N 1N4 Canada Professor K.-F. Man Electronic Engineering Department City University of Hong Kong Tat Chee Avenue Kowloon Hong Kong Professor G. Olsson Department of Industrial Electrical Engineering and Automation Lund Institute of Technology Box 118 S-221 00 Lund Sweden Professor A. Ray Pennsylvania State University Department of Mechanical Engineering 0329 Reber Building University Park PA 16802 USA Professor D.E. Seborg Chemical Engineering 3335 Engineering II University of California Santa Barbara Santa Barbara CA 93106 USA Doctor K.K. Tan Department of Electrical Engineering National University of Singapore 4 Engineering Drive 3 Singapore 117576 Doctor I. Yamamoto Technical Headquarters Nagasaki Research & Development Center Mitsubishi Heavy Industries Ltd 5-717-1, Fukahori-Machi Nagasaki 851-0392 Japan

To my loving family, and to my many friends and colleagues worldwide who have contributed much to IEEE 1588

Series Editors’ Foreword

The series Advances in Industrial Control aims to report and encourage technology transfer in control engineering. The rapid development of control technology has an impact on all areas of the control discipline. New theory, new controllers, actuators, sensors, new industrial processes, computer methods, new applications, new philosophies..., new challenges. Much of this development work resides in industrial reports, feasibility study papers and the reports of advanced collaborative projects. The series offers an opportunity for researchers to present an extended exposition of such new work in all aspects of industrial control for wider and rapid dissemination.

In an increasingly complex technological world, standards of all different types play an important role in ensuring the uniform comprehension of technology and aiding compatibility in different technological domains. The ISO series of international standards covers a wide range of technology including, for example, assistive technological devices for accessibility. In the construction industry, the International Building Code gives standards on all aspects of buildings and their internal layout. For the electrical and electronic industries the IEEE issues a series of technological standards and the particular standard IEEE 1588 for real-time applications in measurement, control and communications systems is the subject of this new Advances in Industrial Control monograph by John Eidson (Agilent Technologies, Inc., USA).

Engineers involved with the application of standards are well acquainted with their value, but one suspects that many in the engineering community are not very familiar with either the standards system or its value as a pedagogical resource. Standards are usually the distillation of the expertise and knowledge of a group of world-leading experts in a technological field.
Thus, standards almost always contain invaluable information about categories, classification, definitions and technological description. Whilst this information may be given in condensed form it is a valuable resource for study. If you are a lecturer, using a standard may give you the framework that you need to explain a technology clearly and precisely to an engineering audience.

John Eidson’s approach to his presentation of the IEEE 1588 standard follows three themes: the background and context to the standard, the detail of the standard and, finally, the application of the standard. In this way a very rich picture emerges from what might have been thought to be an arcane subject. The book opens with an exploration of the background to the tasks of time-keeping and synchronization and an authentic historical context is presented to the problems involved in these tasks. The middle section of the book is devoted to the IEEE 1588 standard per se where an analysis of the standard is presented along with a discussion of practical implementation issues. One item of special note for the control specialist is the control design for a clock servo system that turns out to be a discrete PI control loop design problem (page 146 onward). The later chapters of the book report actual or proposed applications in a series of case studies from large turbine operations, power systems, instrumentation systems, robotics, motion control and communication systems.

Overall this text is a very welcome addition to the Advances in Industrial Control monograph series. John Eidson has ensured that this monograph offers material for both the general and the specialist reader alike. Whilst the specialist reader will find the explanation of the IEEE 1588 standard invaluable and instructive, the general reader will undoubtedly enjoy the historical background to timekeeping problems and gain real insight through a perusal of the applications and case study chapters.

M.J. Grimble and M.A. Johnson
Industrial Control Centre
Glasgow, Scotland, U.K.

Preface

This book is about the use of IEEE 1588, and the explicit representation of time in the design and operation of measurement, control, and communication systems. In a larger sense, it is about the combination of the explicit representation of time, and the use of networking and distributed system technology in solving hard real-time application problems. IEEE 1588 is a new standard that was published in November 2002. It has attracted the attention of technologists worldwide in the fields of industrial automation, test and measurement, and telecommunications. Products and installed systems based on the standard are beginning to appear, and both the protocol and its implications are attracting the attention of university researchers. As with all new technologies, there is a learning curve that must be overcome by potential users. Reading a standard has never proved to be a particularly pleasant task, and is certainly not the ideal vehicle for an introduction to a technology. This book is intended to make climbing this learning curve both easier and, hopefully, more interesting. The book is organized in three major parts. The first part provides an introduction to the field, some background on timekeeping and synchronization, and a high-level overview of the IEEE 1588 standard. The second part consists of a detailed analysis of IEEE 1588 and a discussion of the more important practical issues in implementing the standard. The third part begins with a general discussion of system architectures based on IEEE 1588 technology, and then provides examples of actual or proposed applications in the fields of industrial automation and power, test and measurement, and communications. The last part consists of appendices giving more detailed information concerning IEEE 1588 messages and data sets. Readers whose primary interest is in the application of IEEE 1588 will find that Part II can be skipped on first reading. 
IEEE 1588, like all technologies, is the result of the efforts of many people. The cited references represent a partial list of the researchers on whose work this book is based, and to whom I am indebted. Credit must also be given to the past and current members of the IEEE P1588 committee that produced the original version of the IEEE 1588 standard, and who are currently hard at work on the first revision. Everyone using IEEE 1588, including myself, owes special thanks to Kang Lee of NIST not only for his work as the IEEE sponsor of the standard, but also for his tireless efforts in its promotion. Thanks are also due to NIST for its support of the standard, and for hosting the first two conferences on IEEE 1588. These conferences brought together many researchers, developers, and users of the technology, and did much to foster the cooperative spirit that has enabled its rapid development.

On a more personal note, I would like to thank Hermann Kopetz of the Technical University of Vienna, and Edward Lee of the University of California at Berkeley for many inspirational discussions on topics related to IEEE 1588 technology. I would also like to thank the management of Agilent Technologies, and earlier of Hewlett-Packard, for their continued support of the technology, the standards activities, and my work on this book. Those in large organizations appreciate that the management always points to a few individuals willing to champion a technology in the face of very good arguments for its termination. Simply put, this technology would never have seen the light of day but for the extraordinary support of Randy Coverstone, Jon Kim, Bill Shreve, and, most especially, Jay Warrior, who shared the dream and promoted it both inside and outside the corporation.

It has been my privilege over the years to work with a particularly stimulating group of colleagues at Agilent Technologies and Hewlett-Packard. Many continue to contribute to the development of this technology. I would like to especially acknowledge Jeff Burch, Bruce Hamilton, and Stan Woods, who have been my closest collaborators.
The technical presentation in this book has benefited greatly from discussions with Danny Abramovitch of Agilent Technologies, Glenn Algie of Nortel, Galina Antonova of General Electric, Doug Arnold of Symmetricom, Len Cutler of Agilent Technologies, Pat Diamond of Semtech, Michael Gerstenberger of KUKA Robotics, Dirk Mohl of Hirschmann Automation and Control, Anatoly Moldovansky of Rockwell Automation, David Petticord of Complete Networks, Dan Pleasant of Agilent Technologies, Silvana Rodrigues of Zarlink Semiconductor, Mark Shepard of General Electric, Dave Tonks of Semtech, Veselin Skendzic of Schweitzer Engineering, and Miao Zhu of Agilent Technologies. Thanks are due to Anthony Doyle, Oliver Jackson, and the staff at Springer for their encouragement and support in preparing and publishing this book.

Palo Alto, California January 2006

John C. Eidson

Credits

The author is grateful to the following organizations for granting permission to quote from their documents and publications.

Agilent Technologies, Inc.
Table 5.1 is reproduced from Agilent Laboratories Report AGL-2002-13, and Section 8.2 is reproduced from Agilent Laboratories Report AGL-2005-1, both courtesy of Agilent Technologies, Inc.

Bosch Rexroth, AG
Figure 7.10 is reproduced, with kind permission, from a photograph supplied by Bosch Rexroth.

Elsevier
Figures 2.8 and 5.18 are reprinted from Figures 14 and 20 of Frequency Control Devices, John R. Vig and Arthur Ballato in Ultrasonic Instruments and Devices, E.P. Papadakis, ed., pages 637–701, 1999 Academic Press, with permission from Elsevier.

Hirschmann Automation and Control, GmbH
Figures 7.6 and 7.7 are provided courtesy of Hirschmann Automation and Control, GmbH.

International Telecommunication Union (ITU)
Equations 2.1 through 2.11, Tables 2.1 through 2.5, and Figure 2.1 are reproduced or adapted from ITU-T Recommendation G-810, with the kind permission of the ITU.

Institute of Electrical and Electronics Engineers (IEEE)
The material in the following list is reprinted with permission from IEEE 1588-2002, IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems, Copyright 2002, by IEEE. The IEEE disclaims any responsibility or liability resulting from the placement and use in the described manner. The items directly copied or copied with minor modification from IEEE 1588-2002 are:
• Figures 4.1, 4.3, 4.5, 4.14, 4.15, 4.16, 4.17, 4.18, 4.19, 4.20, 4.25, 4.26, and 4.27.
• Equations 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 4.10, and 4.11.

KUKA Robotics
Figures 7.8, 7.9, 7.11, and 7.12 and the quotations in Section 7.4 are reproduced from drawings and material provided courtesy of KUKA Robotics.

LXI Consortium
Section 8.1.1 is reproduced from the LXI Standards Definition, Preliminary revision 0.93, 8 April 2005, courtesy of the LXI Consortium.

Nortel Corporation
Figure 9.1 is based on a slide provided courtesy of the Nortel Corporation.

Semtech, Ltd.
Figures 9.8, 9.9, and 9.10 are based on drawings provided courtesy of Semtech.

Symmetricom, Inc.
Figure 9.2 is based on drawings provided courtesy of Symmetricom.

Zarlink Semiconductor
Figures 9.3, 9.4, 9.5, and 9.6 are based on drawings provided courtesy of Zarlink Semiconductor.

Contents

Part I Background

1 Introduction to Time-based Measurement, Control, and Communication
  1.1 Temporal Specifications in Systems Based on Modern Computer Technology
  1.2 The State of the Art in Implementing Real-time Systems
  1.3 IEEE 1588

2 The Evolution of Clocks and Clock Synchronization
  2.1 The Influence of Time and Its Measurement on Our Lives
  2.2 The Measurement of Time and Time Intervals
    2.2.1 Clocks Through the Ages
    2.2.2 Characterization of Oscillators
    2.2.3 Properties of Modern Oscillators
  2.3 Time Scales and Calendars
  2.4 Synchronization Protocols
    2.4.1 Early Synchronization and Distribution Protocols
    2.4.2 IRIG-B
    2.4.3 Loran-C
    2.4.4 NTP
    2.4.5 GPS
    2.4.6 IEEE 1588

3 An Overview of Clock Synchronization Using IEEE 1588
  3.1 The History of the Development and the Objectives of IEEE 1588
  3.2 Overview of the Standard
  3.3 Fundamental Operation of the Protocol
    3.3.1 System Boundaries and Communications
    3.3.2 Master-slave Synchronization Hierarchy
    3.3.3 Startup and Reconfiguration
    3.3.4 Synchronization Overview
    3.3.5 System Management Overview

Part II IEEE 1588

4 A Detailed Analysis of IEEE 1588
  4.1 System Boundaries and Communications
  4.2 Master-slave Synchronization Hierarchy
    4.2.1 States in the Master-slave Hierarchy
    4.2.2 The PTP State Machine
    4.2.3 The State Decision Algorithm and Data Set Updates
    4.2.4 Data Set Comparison Algorithm
    4.2.5 Clock Characterization
  4.3 Startup and Reconfiguration
    4.3.1 Powerup and Initialization
    4.3.2 Changes in Clock Characteristics or Default Data Sets
    4.3.3 Changes in the Underlying Network Topology
    4.3.4 Fault Management
  4.4 Synchronization
    4.4.1 The One and Two Message Synchronization Models
    4.4.2 Message Timestamp Point and Internal Latency
    4.4.3 Slave Clock Synchronization
    4.4.4 Burst Mode Operation
    4.4.5 External Timing Signals
  4.5 System Management
    4.5.1 Clock and Topology Discovery
    4.5.2 Clock Initialization
    4.5.3 Data Set Configuration and Visibility
    4.5.4 Management Message Communications
  4.6 Application Support
    4.6.1 Performance Monitoring Features
    4.6.2 Time Scale Support Features
  4.7 Likely Extensions to IEEE 1588

5 Practical Issues in Implementing IEEE 1588
  5.1 Clock and Boundary Clock Design
    5.1.1 The Hardware Clock, Oscillator, and Clock Adjustment Blocks
    5.1.2 The Packet Recognition, Identification, and Timestamp Capture Blocks
    5.1.3 Boundary Clock Design
    5.1.4 IEEE 1588 Code
  5.2 Clock Servo Design
    5.2.1 Model of the Slave Servo
    5.2.2 Determining the Stability of the Slave Servo
    5.2.3 Summary of Servo Design Issues
  5.3 Oscillator Selection and Environmental Issues
  5.4 IEEE 1588 in Non-UDP/IP Ethernet Systems
    5.4.1 DeviceNet
    5.4.2 Wireless Networks
    5.4.3 Non-UDP/IP Ethernet
  5.5 Synchronizing to UTC Sources

Part III Applying IEEE 1588

6 System Architecture Based on Synchronized Clocks
  6.1 Partitioning in IEEE 1588 Systems
    6.1.1 Application Partitioning
    6.1.2 Execution Partitioning
    6.1.3 The Boundary Between Soft and Hard Real-time Execution
  6.2 Module Design Supporting Time-based Partitioning
  6.3 Network Design for IEEE 1588 Systems

7 Case Studies in Industrial Automation and Power
  7.1 The Monitoring and Control of Large Turbines
  7.2 Power System Monitoring and Control
  7.3 Boundary Clocks for Industrial Applications
  7.4 Robotics
  7.5 Motion Control and General Plant Automation

8 Case Studies in Instrumentation Systems
  8.1 The LXI Specification
    8.1.1 LXI Device Synchronization and LAN-based Triggering
    8.1.2 Module-to-module Data Communication
  8.2 LXI Module Design and System Programming Practices
  8.3 LXI System Debugging
  8.4 Data Acquisition Applications Using LXI
  8.5 General Applications Using LXI
    8.5.1 Characteristics of Future LXI-based Test Systems
    8.5.2 Integration of LXI-based Test Equipment with VXI and PXI Equipment
    8.5.3 Examples of Future LXI-based Test Systems

9 Case Studies in Communications
  9.1 Background on the Application of IEEE 1588 in Telecommunications Systems
  9.2 Proposed Telecommunications Applications Using IEEE 1588
    9.2.1 Wireless Networks
    9.2.2 Linking SONET Rings via Ethernet
    9.2.3 Timing in Cable TV Infrastructure
    9.2.4 Timing Distribution in Central Offices
    9.2.5 Circuit Emulation
    9.2.6 Internal Timing in Telecommunications Equipment
  9.3 Proposed Techniques to Enable IEEE 1588 in Telecommunications
    9.3.1 Asymmetry
    9.3.2 Latency Fluctuations
    9.3.3 IEEE 1588 Timing Redundancy
  9.4 Early Measurements of IEEE 1588 Operating on Metropolitan Networks

10 The Future of IEEE 1588 and Time-based Applications
  10.1 Specific Concerns and Likely Outcomes
  10.2 What Will the Future Bring?
  10.3 Final Thoughts

Part IV Appendices

A Field Definitions for IEEE 1588 Messages
  A.1 Message Fields Common to All PTP Messages
  A.2 Sync and Delay_Req Message Fields
  A.3 Follow_Up Message Fields
  A.4 Delay_Resp Message Fields
  A.5 Management Message Fields

B IEEE 1588 Data Sets
  B.1 Default Data Set
  B.2 Current Data Set
  B.3 Parent Data Set
  B.4 Global Time Properties Data Set
  B.5 Port Configuration Properties Data Set
  B.6 Foreign Master Data Set

References
Index

Part I

Background

1 Introduction to Time-based Measurement, Control, and Communication

This book is about the use of IEEE 1588, and the explicit representation of time in the design and operation of measurement, control, and communication systems. These systems share a common feature in that they interact with devices or processes that themselves operate based on real-world time. Such systems are often termed “hard real-time systems” because their actions must meet time constraints imposed by the application space, rather than by the operation of the measurement or control devices.

Examples of such hard real-time systems abound. In the measurement world, complex test systems composed of many electronic instruments operating in concert are used to verify the performance of even more complex electronic or electromechanical devices such as radar systems, electronic engine controls, power generators, and heart pacemakers. In the field of control, combinations of computers, controllers, sensors, and actuators collaborate to regulate printing presses, oil refineries, packaging machines, traffic lights, and home heating equipment. Communications systems operate to pass information from source to destination, and are governed by the laws of physics, the operation of the communication protocols, and the vagaries of users that result in unpredictable traffic flows.

The use of time in measurement and control is actually very familiar, as our everyday life seems to be governed by the clock. Alarm clocks have been with us for ages. Does there exist a businessman that can function without the ever-present meeting reminder pop-ups on the computer screen? Lawns are watered, coffee is brewed, and favorite television programs are recorded, all based on some sort of mechanical or electronic clock. In programming these devices, the time at which something is to happen is entered, as well as the time at which the activity is to cease, or in some cases, the duration of the activity. Never is there a list of activities executed based on the speed at which the device operates.

1.1 Temporal Specifications in Systems Based on Modern Computer Technology

It may come as a surprise to those not familiar with the field that in programming test, measurement, and control systems, it is usually difficult or impossible to specify actions based on time. It is true that some systems provide crude schedulers for starting a task based on time. Others are based on timers that may be used to provide notification for a fixed time after the timer is set. However, there are few examples in which detailed specifications of the system behavior as a function of time can be expressed, and even fewer mechanisms for enforcing such behavior, or even recognizing when the specifications have not been met.

The reason is that most modern test, measurement, control, and communication systems are operated by computers based on the latest microprocessors, operating systems, and programming languages. Microprocessors provide little support for real-time specifications. Typically, support is limited to timed interrupts that can be used to form a clock or to implement timers. These interrupts invariably operate based on the oscillator used to drive the local microprocessor, and therefore will be asynchronous to similar devices in other microprocessors in the system. To make matters worse, microprocessors implement numerous optimizations that make their operation non-deterministic in time. Examples are memory cache, speculative execution, and interrupt priority.

Given the available support in hardware, even the so-called real-time operating systems are best-effort systems with only statistical guarantees of temporal behavior. System clocks are typically implemented at the operating system level based on timed interrupts from the supporting hardware. In most cases, the granularity of these clocks is not sufficient for applications of interest in this book.
There are synchronization protocols that enable clocks in one system to agree with other clocks in the system at accuracies useful in business and commerce. These will be discussed further in Section 2.4. The common languages used in programming these systems do not support any form of temporal semantics. These languages are designed for data processing for business and commerce, and mathematical manipulations for the scientific community. There are simply no mechanisms in these languages for specifying actual simultaneity, parallel execution, temporal deadlines, or other time-related concepts. If time must be referenced, then this is always as an operating system call outside of the programming language.

1.2 The State of the Art in Implementing Real-time Systems

To overcome the lack of support for time-based specification in modern computing environments, a number of techniques have evolved and form the basis for the practice usually referred to as embedded systems programming. These techniques include:

• Ignoring high-level languages and operating systems, and programming the microprocessors in a low-level or assembly language where more explicit control of the underlying hardware is possible. System-wide timing is often accomplished by means of special purpose hardware support accessible to the control microprocessors, such as the IRIG-B and GPS protocols to be discussed in Section 2.4.
• Using the time support available in the general computing environment. In general, this tactic requires trial-and-error adjustment of the code to produce the desired temporal behavior. This technique is fragile in that the results are dependent on the speed of execution of code in the microprocessor, which is subject to all manner of abuse from the underlying hardware. If communication between computers or devices is required, then additional time variations will be introduced, making it even harder to hold to tight temporal specifications.
• Generating a system-wide ordering mechanism for enforcing temporal relationships. These systems are called time-slotted systems. The time-slots may, or may not, be tied to real-world time.

Three distinct design patterns have emerged from these options for enforcing real-time behavior on the components of a distributed system.

The first of these design patterns is the message-based system. In this pattern, coordination among components is achieved by message exchange. These messages contain no explicit reference to time, and the usual semantics is that any implicit time (for example, the time at which a command is to be executed) is based on the time of message receipt. In effect, the message contains two components: the payload explicitly representing data or a command, and the relevant time implicitly represented by the time of receipt. This is the dominant pattern in general computing. Examples are remote procedure calls, publish-subscribe mechanisms, and blackboard systems. Message-based systems are used almost exclusively in current measurement systems, and in many control systems as well. While a message-based paradigm works very well for conveying state, it is a poor mechanism for specifying time, due to the fluctuations in the latencies associated with the network transport layer and operating systems. Commercial systems typically have a global time scale established by the NTP protocol. However, time is invariably used as data (for example, in timestamping a financial transaction) and rarely for control.

The second design pattern is the time-slotted system. In this pattern, system-wide operations are broken into a succession of time-slots. Typically, these time-slots are defined by the time-slots of a time division multiplexed network transport protocol. The time-slots may or may not be uniform, and may or may not be based on a global time scale. Instead of using a global time scale, the time-slots may be state-based, using a token passing protocol, or they may be managed by a central authority that notifies all participating devices when to move to the next slot. The slots are typically numbered based either on a time scale or on sequence numbers. This allows ordering of events within the granularity of the time-slots. Slots may be assigned to functions, but usually are assigned to devices. This design pattern is heavily used in industrial automation. Examples are the SERCOS and Profibus protocols, and the TTP protocol of Kopetz. Provided the functions of each participating device are constrained to complete execution within a time-slot, this technique is very good at enforcing real-time performance. The time granularity is set by the duration and consistency of the time-slots, and is very well matched to systems with periodic behavior.

The final pattern is the time-based system. In this pattern, a system-wide time scale is established, and time-critical system actions are based on this time scale, rather than on message receipt times or the boundary of a time-slot. Global time scales are typically provided by local clocks participating in a synchronization protocol. Messages passed between devices can contain a timestamp defining the occurrence time of some meaningful event, allowing the recipient to base action on the received timestamp and the global time scale. This pattern is relatively rare due to the lack of sufficiently accurate system-wide time scales, and the lack of programming tool support for the pattern. However, there are several proprietary time-based systems that have been used quite successfully in the industrial automation industry. A time-based system should be a good match for problems with irregular or episodic time behavior, or systems where a finite number of events occur within a time too short to permit disambiguation using time-slots. These problems occur quite commonly in the test and measurement field, and in certain applications in industrial automation.

The characteristics of each of these design patterns are summarized in Table 1.1.

Table 1.1.
Summary of hard real-time design patterns [1]

Characteristic                  Message-based        Time-slotted         Time-based
Message semantics               Command/value        Command/value and    Command/value and
                                                     execution time       execution time
Timing accuracy limitations     Message latency      Time-slot            Clock synchronization
                                fluctuation          fluctuation          accuracy
Temporal ordering limitations   Transport protocol   Quantized by         Quantized by clock
                                dependent            time-slot length     synchronization accuracy
                                                                          and precision
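The difference in timing accuracy between the message-based and time-based patterns can be illustrated with a small simulation. The following Python sketch is not from the source: the function names, the 1 to 5 ms latency range, the 10 ms lead time, and the assumption of perfectly synchronized clocks are all hypothetical choices made for illustration.

```python
import random

def message_based_execution_times(send_times, latencies):
    """Message-based pattern: the action runs when the message arrives."""
    return [s + d for s, d in zip(send_times, latencies)]

def time_based_execution_times(send_times, latencies, lead_time):
    """Time-based pattern: each message carries an execution timestamp
    (send time + lead_time). The receiver waits until its clock, assumed
    here to be perfectly synchronized, reaches that timestamp. A message
    that arrives late is acted on immediately."""
    times = []
    for s, d in zip(send_times, latencies):
        timestamp = s + lead_time
        arrival = s + d
        times.append(max(timestamp, arrival))
    return times

def spread(exec_times, send_times):
    """Peak-to-peak variation of the delay between sending and execution."""
    offsets = [e - s for e, s in zip(exec_times, send_times)]
    return max(offsets) - min(offsets)

rng = random.Random(1)                                  # arbitrary seed
send_times = [0.1 * k for k in range(1000)]             # one command every 100 ms
latencies = [rng.uniform(0.001, 0.005) for _ in send_times]  # assumed 1 to 5 ms

msg = message_based_execution_times(send_times, latencies)
tb = time_based_execution_times(send_times, latencies, lead_time=0.010)

msg_jitter = spread(msg, send_times)
tb_jitter = spread(tb, send_times)
print(f"message-based jitter: {msg_jitter * 1e3:.3f} ms")
print(f"time-based jitter:    {tb_jitter * 1e3:.3f} ms")
```

In this idealized setting the message-based jitter tracks the latency fluctuation, while the time-based jitter is limited only by floating-point rounding. In a real system it would be limited by the clock synchronization accuracy, as Table 1.1 indicates.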

1.3 IEEE 1588

IEEE 1588 provides a way to easily synchronize clocks in a distributed system, thereby establishing a system-wide accurate and precise time scale. It is logical to expect this to lead to increased usage of the time-based design pattern discussed in Section 1.2, and to more precisely defined time-slotted systems. Since it is a rare application that perfectly matches any design paradigm, it is likely that all three of the design patterns will continue to be used, but in different proportions and in new and interesting combinations. The goal of this book is to explore these possibilities.

2 The Evolution of Clocks and Clock Synchronization

The theme of this book is how a common sense of time may be used to enhance the performance and simplify the design of distributed systems used in measurement, control, and communications. A common sense of time is established in such systems by means of a clock synchronization protocol. Given their importance in these discussions, it is appropriate to take a closer look at time, clocks, and clock synchronization protocols. This chapter is organized into four sections covering different aspects of these topics, including a:

• Brief look at the influence of time and its measurement on our lives,
• Discussion of the measurement of time and time intervals,
• Discussion of time scales and calendars, and
• Survey of several clock synchronization protocols.

Each section presents a brief overview of the topic, and provides sufficient background for the remainder of the book. Each of these topics is fascinating in its own right, and interested readers are urged to consult the references provided. The first section on the influence of time in our lives is clearly not directly related to the main topic of this book. It is included to raise the question of why an understanding of time, so clearly a major factor in the development of civilization and such an integral part of our daily lives, is conspicuously absent from the current programming methodologies employed in major branches of measurement and control.

2.1 The Influence of Time and Its Measurement on Our Lives

There can be no doubt that questions concerning time have been a subject of human inquiry from the dawn of civilization to the present day. Time has been studied by scientists and philosophers, has been a preoccupation of
religion and art, and has had a pronounced influence on our daily lives and thought. One of the most dramatic changes over the course of recorded history is the accuracy and precision to which we are able to measure time. Although we cannot say for sure what motivated the early pioneers in the field, it is reasonable to assume that there have always been a few that pushed the frontiers of timekeeping simply for the challenge, or because it was intellectually interesting in its own right. For much of human history, our lives have been closely aligned with natural cycles. Whether we were farmers or villagers, most of our activities followed the patterns set by the seasons, the needs of livestock, birth, and death. Even today, most of us can probably conduct the majority of our activities with the same timekeeping instruments available several hundred years ago. For whatever reason, over the last millennium our ability to measure time has improved by about 15 orders of magnitude. Most of this improvement occurred within the last 400 years, starting with the introduction of the pendulum clock by Huygens around 1656 [2]. Historically, there has always been a close association between time and religion. Early art artifacts portray gods related to the Sun and Moon, which were also the primary measure of time. These were matters of considerable religious importance, and it is no coincidence that most of the early progress in the field of timekeeping was made by clerics and their astronomers. In most faiths, there are certain time constraints that must be met by the faithful. From the medieval monk rising in time for matins prayer to the modern Muslim praying precisely at noon, the timekeeping must be up to the task. On a larger scale, the religious holidays must align with the correct astronomical events, which perhaps explains why the West operates on the Gregorian calendar, rather than one named after Galileo. 
Readers interested in this aspect of time will enjoy the beautiful and very readable book assembled by the National Maritime Museum in England, titled The Story of Time [3].

2.2 The Measurement of Time and Time Intervals

A clock consists of an oscillator and a counter. The function of the oscillator is to establish a repeatable interval of time, and by counting these intervals, it is possible to create a time scale useful for the intended application. In this section, the development of increasingly better oscillators is explored. In the next section, the time scales based upon these oscillators will be studied.

2.2.1 Clocks Through the Ages

The earliest oscillators used in timekeeping were the motion of the Sun and the Moon and, to a lesser extent, the stars and planets. An astonishing variety
of instruments evolved to measure these motions. Given the times, the most sophisticated of these was the astrolabe, which by the Middle Ages was in common use in Europe. While few early examples have survived, there are reports indicating that the astrolabe had its origin in Greek times [2]. The sundial, familiar to every school child, had its origin between 500 and 600 BC, although there are Egyptian artifacts indicating that measuring time based on shadows was understood much earlier [2]. At night the movement of the stars can be used, and devices designed for this purpose, known as nocturnals, were in common use by the 17th century. Of course, it is not necessary to refer to the heavens at all to measure time. Anything that establishes a consistent and convenient interval will work. It is known that the Babylonians, Egyptians and Chinese all used the flow of water to establish time intervals. For example, a device called a clepsydra was used in Egypt at least as far back as 1400 BC. The clepsydra is simply a vessel with a hole in it to allow water to gradually escape, thus establishing a repeatable time interval. Like its more modern manifestation, the hour glass, the clepsydra required human action at the end of each time interval to establish a time scale. Human ingenuity found other interval mechanisms to establish time, of which the burning of incense sticks was one of the more interesting. Water-driven geared mechanisms have been used for timekeeping since at least 300 BC, and water-driven mechanical devices continued to be used at least until 1000 AD in Islamic cultures [2]. These devices were apparently rediscovered by Europeans, because by the 11th century water-driven clocks were common in European cities. By the 14th century, weight-driven mechanisms had taken hold, and a robust clock industry began to develop in Europe. In 1656, the Dutchman Christiaan Huygens introduced the pendulum clock. 
This event sparked a renewed interest in precision timekeeping, and marked the beginning of a period of rapid improvement in the precision and accuracy of clocks. During the few decades following Huygens, the accuracy of household clocks improved from about ±15 min per day to perhaps ±15 s per day [2]. The measurement of time began to assume more commercial importance, as well. The outstanding problem of the 18th century was, of course, the determination of longitude. The existing methods based on astronomy were simply not reliable enough to prevent significant losses due to ships that went off course. To address this situation, the English Board of Longitude, on 8 July 1714, set a prize of £20,000 (several million dollars in today’s currency) for a seaworthy clock that permitted longitude to be determined to 0.5° over a great circle, roughly 30 miles. The accuracy required of the clocks was better than 3 s per day over a period of 6 weeks, this being the approximate duration of a voyage to the New World. Most people have heard of John Harrison, who labored for decades producing a series of clocks culminating in the famous H-4 that met the requirements for the £20,000 prize. Lesser known are the trials and tribulations that went into this effort, and Harrison’s difficulties in overcoming political opposition to achieve full recognition
for his work. Those interested in this fascinating story should read the very accessible account by Dava Sobel, appropriately titled Longitude [4]. Those interested in the technical details of mechanical oscillators will find the book by Landes [5] interesting.

In our era, the most significant change in timekeeping was the introduction of watches and clocks based on a quartz oscillator. Not only did this oscillator provide better timekeeping, it also reduced the cost of the timekeeping part of the clock or watch to very low levels. One result was the greatly reduced importance of the Swiss watchmaking industry. It is now possible to buy a watch for a few dollars that outperforms the best of the mechanical watches of a few years ago. While it is still possible to pay many thousands of dollars for a watch, the money is really buying a piece of jewelry or art or designer name. The actual timekeeper in such a watch is not significantly better than that in an inexpensive watch.

2.2.2 Characterization of Oscillators

The characterization of oscillators is an extremely complex subject. This section presents a summary of the terms and characterization techniques that are useful in selecting components for the application domains discussed in this book. There is extensive literature and numerous standards documents on this subject. A good compilation of relevant papers on the subject may be found in NIST Technical Note 1337 [6]. Interested readers should also review IEEE Standard 1139-1999 [7] and ITU-T Recommendation G.810 [8].

Oscillators can only be characterized by measuring them against a reference oscillator. If the errors in the reference oscillator are negligible with respect to the oscillator under test, then the error in the time scale produced by counting periods of the measured oscillator may be represented as shown in Equation 2.1 [8].

$$x(t) = x_0 + y_0 t + \frac{D}{2} t^2 + \frac{\varphi(t)}{2\pi f_{\mathrm{nom}}} \tag{2.1}$$

In this equation,

• $x_0$ represents the origin of the time scale relative to the origin of the reference time scale,
• $y_0$ represents the fractional frequency offset between the reference and the measured oscillator,
• $D$ represents the linear fractional frequency drift of the measured oscillator,
• $\varphi(t)$ represents the random phase deviations of the measured oscillator, and
• $f_{\mathrm{nom}}$ represents the nominal frequency of the reference oscillator.
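To get a feel for the relative size of the deterministic terms in Equation 2.1, they can be evaluated for a one-day interval. The following Python sketch is illustrative only: the function name `time_error`, the 1 ppm frequency offset, and the drift of $10^{-9}$ per day are assumed values, and the random phase term $\varphi(t)$ is left out.

```python
SECONDS_PER_DAY = 86_400.0

def time_error(t, x0=0.0, y0=0.0, drift=0.0):
    """Deterministic part of Equation 2.1, in seconds:
    x(t) = x0 + y0*t + (D/2)*t**2, with the random phase term omitted."""
    return x0 + y0 * t + 0.5 * drift * t * t

y0 = 1e-6                      # assumed fractional frequency offset (1 ppm)
D = 1e-9 / SECONDS_PER_DAY     # assumed drift of 1e-9 per day, converted to per second

err_offset = time_error(SECONDS_PER_DAY, y0=y0)    # offset term alone
err_drift = time_error(SECONDS_PER_DAY, drift=D)   # drift term alone

print(f"error after one day from offset: {err_offset * 1e3:.1f} ms")
print(f"error after one day from drift:  {err_drift * 1e6:.1f} us")
```

With these assumed values the frequency offset dominates over one day, but because the drift term grows as $t^2$ it eventually overtakes the linear term on long enough intervals.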

The linear fractional frequency drift, $D$, represents both environmental and inherent effects. The environmentally induced effects are due to changes in external variables such as temperature, pressure, and supply voltage, while the inherent effects are the normal aging processes of oscillators. It is customary to separate these two causes, and oscillator data sheets almost always provide the sensitivities to environmental parameters and aging.

For precision oscillators, the specification of the random variations represented by $\varphi(t)$ is usually given in terms of one of the following measures, depending on the intended application:

• ADEV: the Allan deviation,
• MDEV: the modified Allan deviation,
• TDEV: the time deviation,
• TIErms: the root mean square of the time interval error (TIE), and
• MTIE: the maximum time interval error.

ADEV, MDEV, and TDEV measurements are all subject to error resulting from residual systematic effects that have not been removed from the data. Diurnal wander, unrecognized systematic frequency drift due to temperature change, and unexpected aging are examples of potential sources of these errors. ADEV, MDEV, and TDEV are typically used to characterize oscillators when designing stable sources for instrumentation and metrology applications. They also are useful in applications where frequency stability is important, such as certain forms of wireless communication. TIE and MTIE are typically used by the telecommunications industry in designing and qualifying time and frequency distribution systems. In particular, MTIE is quite useful in determining the needed buffer sizes in telecommunications networks.

The definitions of these measures, shown below, follow the form shown in ITU-T Recommendation G.810 [8], and are reproduced here with the kind permission of the ITU. The ITU specifications are rather terse, and a more complete discussion of each of the measures, along with alternate but equivalent representations, will be found in the references cited, and in NIST Technical Note 1337 [6]. In all cases, it is assumed that deterministic variations arising from the constant, linear, and quadratic terms of Equation 2.1 have been removed from the data. For all of these measurements, it is assumed that the function $x(t)$, representing the time error, is sampled with $N$ equally spaced samples, $x_i = x(i\tau_0)$, for $i = 1, 2, \ldots, N$, and with a sampling interval $\tau_0$. The observation interval, $\tau$, is given by $\tau = n\tau_0$.

Allan Deviation

The first measure, ADEV, was devised by David Allan to overcome shortcomings in the usual measures of standard deviation. Over the long term, the time scales of most clocks will diverge, leading to standard deviations that depend on the length of the observation. ADEV converges for the types of noise present in oscillators used for timing purposes.

ADEV, the Allan deviation, is defined as [8]:

$$\mathrm{ADEV}(\tau) = \sqrt{\frac{1}{2\tau^2}\left\langle \left[x(t+2\tau) - 2x(t+\tau) + x(t)\right]^2 \right\rangle} \tag{2.2}$$

The angle brackets denote an ensemble average. ADEV($n\tau_0$) may be computed using the following estimator [8]:

$$\mathrm{ADEV}(n\tau_0) \cong \sqrt{\frac{1}{2n^2\tau_0^2(N-2n)}\sum_{i=1}^{N-2n}\left(x_{i+2n} - 2x_{i+n} + x_i\right)^2}, \quad n = 1, 2, \ldots, \text{integer part of } \frac{N-1}{2} \tag{2.3}$$

The Allan deviation does converge for the types of noise commonly associated with timing oscillators. Amplitude modulation is not generally a consideration, since oscillators typically run in a saturated mode. However, several types of phase modulation can occur. With the exception of white noise phase modulation and flicker phase modulation, the Allan deviation exhibits a different slope for each of these types of noise modulation. The common types of noise, their abbreviations, and the slope of the Allan deviation for each are shown in Table 2.1.

Table 2.1. Common noise types and Allan deviation slopes [8]

Type of noise modulation   Abbreviation   Slope of phase noise spectral density   Slope of ADEV($\tau$)
White phase                WPM            0                                       $\tau^{-1}$
Flicker phase              FPM            −1                                      $\tau^{-1}$
White frequency            WFM            −2                                      $\tau^{-1/2}$
Flicker frequency          FFM            −3                                      $\tau^{0}$
Random walk frequency      RWFM           −4                                      $\tau^{1/2}$
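The estimator of Equation 2.3 translates almost directly into code. The following is a minimal Python sketch rather than a reference implementation; the function name `adev` and the zero-based indexing (so that `x[0]` holds the sample the text calls $x_1$) are choices made here for illustration.

```python
import math

def adev(x, tau0, n):
    """Estimate ADEV(n*tau0) from time-error samples x (Equation 2.3).

    x    -- sequence of N time-error samples, in seconds
    tau0 -- sampling interval, in seconds
    n    -- observation interval in units of tau0, 1 <= n <= (N - 1) // 2
    """
    N = len(x)
    terms = N - 2 * n                      # number of second differences
    if n < 1 or terms < 1:
        raise ValueError("need 1 <= n <= (N - 1) // 2")
    total = sum((x[i + 2 * n] - 2.0 * x[i + n] + x[i]) ** 2
                for i in range(terms))
    return math.sqrt(total / (2.0 * n * n * tau0 * tau0 * terms))

# A constant frequency offset appears as a linear ramp in x and drops out,
# because the second difference of a ramp is zero.
ramp = [1e-6 * i for i in range(100)]
adev_ramp = adev(ramp, 1.0, 5)

# A quadratic time error (linear frequency drift) does not drop out.
parabola = [float(i * i) for i in range(100)]
adev_parabola = adev(parabola, 1.0, 2)

print(adev_ramp, adev_parabola)
```

The second example shows why the text requires the quadratic term of Equation 2.1 to be removed before the measures are computed: residual drift produces an ADEV that grows in proportion to the observation interval.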

A complete discussion of these noise types, the expected spectral densities, and Allan deviations may be found in Stein [9], the CCIR [10], and Lance et al. [11].

Modified Allan Deviation

MDEV, the modified Allan deviation, is defined as [8]:

$$\mathrm{MDEV}(n\tau_0) = \sqrt{\frac{1}{2(n\tau_0)^2}\left\langle \left[\frac{1}{n}\sum_{i=1}^{n}\left(x_{i+2n} - 2x_{i+n} + x_i\right)\right]^2 \right\rangle} \tag{2.4}$$

The angle brackets denote an ensemble average. MDEV($n\tau_0$) may be computed using the following estimator [8]:

$$\mathrm{MDEV}(n\tau_0) \cong \sqrt{\frac{1}{2n^4\tau_0^2(N-3n+1)}\sum_{j=1}^{N-3n+1}\left[\sum_{i=j}^{n+j-1}\left(x_{i+2n} - 2x_{i+n} + x_i\right)\right]^2}, \quad n = 1, 2, \ldots, \text{integer part of } \frac{N}{3} \tag{2.5}$$

In contrast to the Allan deviation, the modified Allan deviation exhibits a different slope for each type of noise modulation. The slopes of the modified Allan deviation for each type of noise are shown in Table 2.2.

Table 2.2. Common noise types and modified Allan deviation slopes [8]

Type of noise modulation   Slope of MDEV($\tau$)
White phase                $\tau^{-3/2}$
Flicker phase              $\tau^{-1}$
White frequency            $\tau^{-1/2}$
Flicker frequency          $\tau^{0}$
Random walk frequency      $\tau^{1/2}$
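The double sum of Equation 2.5 can be coded as two nested loops. The Python sketch below follows the estimator term by term; the function name `mdev` and the zero-based indexing are assumptions made for this illustration, and no attempt is made at efficiency.

```python
import math

def mdev(x, tau0, n):
    """Estimate MDEV(n*tau0) from time-error samples x (Equation 2.5).

    x    -- sequence of N time-error samples, in seconds
    tau0 -- sampling interval, in seconds
    n    -- observation interval in units of tau0, 1 <= n <= N // 3
    """
    N = len(x)
    outer = N - 3 * n + 1                  # number of terms in the outer sum
    if n < 1 or outer < 1:
        raise ValueError("need 1 <= n <= N // 3")
    total = 0.0
    for j in range(outer):
        # Inner sum: n consecutive second differences starting at sample j.
        inner = sum(x[i + 2 * n] - 2.0 * x[i + n] + x[i]
                    for i in range(j, j + n))
        total += inner * inner
    return math.sqrt(total / (2.0 * n ** 4 * tau0 * tau0 * outer))

samples = [0.0, 1.0, 4.0, 2.0]
mdev_n1 = mdev(samples, 1.0, 1)            # for n = 1 the inner sum has one term

parabola = [float(i * i) for i in range(100)]
mdev_parabola = mdev(parabola, 1.0, 2)
print(mdev_n1, mdev_parabola)
```

For $n = 1$ the inner sum contains a single second difference, so the result coincides with the ADEV estimator of Equation 2.3. The additional averaging for $n > 1$ is what gives MDEV distinct slopes for white and flicker phase noise.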

A complete discussion of the modified Allan deviation and its dependency on these noise types may be found in Stein [9] and Howe and Vernotte [12].

Time Deviation

TDEV, the time deviation, is defined as [8]:

$$\mathrm{TDEV}(n\tau_0) = \sqrt{\frac{1}{6n^2}\left\langle \left[\sum_{i=1}^{n}\left(x_{i+2n} - 2x_{i+n} + x_i\right)\right]^2 \right\rangle} = \frac{n\tau_0}{\sqrt{3}}\,\mathrm{MDEV}(n\tau_0) \tag{2.6}$$

The angle brackets denote an ensemble average. TDEV($n\tau_0$) may be computed using the following estimator [8]:

$$\mathrm{TDEV}(n\tau_0) \cong \sqrt{\frac{1}{6n^2(N-3n+1)}\sum_{j=1}^{N-3n+1}\left[\sum_{i=j}^{n+j-1}\left(x_{i+2n} - 2x_{i+n} + x_i\right)\right]^2} \cong \frac{n\tau_0}{\sqrt{3}}\,\mathrm{MDEV}(n\tau_0), \quad n = 1, 2, \ldots, \text{integer part of } \frac{N}{3} \tag{2.7}$$

The slopes of the time deviation for each type of noise are shown in Table 2.3. Additional discussion on the calculation of the time deviation may be found in Li et al. [13].

Table 2.3. Common noise types and time deviation slopes [8]

Type of noise modulation   Slope of TDEV($\tau$)
White phase                $\tau^{-1/2}$
Flicker phase              $\tau^{0}$
White frequency            $\tau^{1/2}$
Flicker frequency          $\tau^{1}$
Random walk frequency      $\tau^{3/2}$
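Because consecutive inner sums in Equation 2.7 overlap in all but one term, the estimator can be evaluated with a sliding window instead of a nested loop. The Python sketch below takes that approach; the function name `tdev`, the zero-based indexing, and the sliding-window arrangement are choices made here, not something prescribed by the source.

```python
import math

def tdev(x, n):
    """Estimate TDEV(n*tau0) from time-error samples x (Equation 2.7).

    x -- sequence of N time-error samples, in seconds
    n -- observation interval in units of tau0, 1 <= n <= N // 3
    The result is in seconds; tau0 does not appear explicitly.
    """
    N = len(x)
    outer = N - 3 * n + 1                  # number of terms in the outer sum
    if n < 1 or outer < 1:
        raise ValueError("need 1 <= n <= N // 3")
    # Second differences at lag n, one per starting sample.
    d = [x[i + 2 * n] - 2.0 * x[i + n] + x[i] for i in range(N - 2 * n)]
    window = sum(d[:n])                    # inner sum for the first j
    total = window * window
    for j in range(1, outer):
        window += d[j + n - 1] - d[j - 1]  # slide the inner sum by one sample
        total += window * window
    return math.sqrt(total / (6.0 * n * n * outer))

tdev_small = tdev([0.0, 1.0, 4.0, 2.0], 1)
parabola = [float(i * i) for i in range(100)]
tdev_parabola = tdev(parabola, 2)
print(tdev_small, tdev_parabola)
```

The sliding window reduces the work from roughly $N \cdot n$ to roughly $N$ operations per value of $n$, at the price of some accumulated floating-point rounding on long records.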

Time Interval Error

TIE and TIErms are defined as follows [8]:

$$\mathrm{TIE}(t, \tau) = x(t+\tau) - x(t), \qquad \mathrm{TIE}_{\mathrm{rms}}(\tau) = \sqrt{\left\langle \left[x(t+\tau) - x(t)\right]^2 \right\rangle} \tag{2.8}$$

The angle brackets denote an ensemble average. TIErms($n\tau_0$) may be computed using the following estimator [8]:

$$\mathrm{TIE}_{\mathrm{rms}}(n\tau_0) \cong \sqrt{\frac{1}{N-n}\sum_{i=1}^{N-n}\left(x_{i+n} - x_i\right)^2}, \quad n = 1, 2, \ldots, N-1 \tag{2.9}$$

The slopes of TIErms for each type of noise are shown in Table 2.4. TIErms does not converge when FFM or RWFM noise is present.

Table 2.4. Common noise types and TIErms slopes [8]

Type of noise modulation   Slope of TIErms($\tau$)
White phase                $\tau^{0}$
Flicker phase              $\tau^{0}$
White frequency            $\tau^{1/2}$
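The estimator of Equation 2.9 is the simplest of the five measures to code. A minimal Python sketch follows; the function name `tie_rms` and the zero-based indexing are assumptions made for illustration.

```python
import math

def tie_rms(x, n):
    """Estimate TIErms(n*tau0) from time-error samples x (Equation 2.9).

    x -- sequence of N time-error samples, in seconds
    n -- observation interval in units of tau0, 1 <= n <= N - 1
    """
    N = len(x)
    terms = N - n                          # number of first differences
    if n < 1 or terms < 1:
        raise ValueError("need 1 <= n <= N - 1")
    total = sum((x[i + n] - x[i]) ** 2 for i in range(terms))
    return math.sqrt(total / terms)

# A time error that alternates between 0 and 1 differs by 1 at odd lags
# and by 0 at even lags.
alternating = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
tie_lag1 = tie_rms(alternating, 1)
tie_lag2 = tie_rms(alternating, 2)
print(tie_lag1, tie_lag2)
```

For uncorrelated samples of standard deviation $\sigma$, the difference of two samples has variance $2\sigma^2$ at every lag, so the estimator should return about $\sqrt{2}\,\sigma$ regardless of $n$. This is consistent with the $\tau^{0}$ slope listed for white phase noise in Table 2.4.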

Maximum Time Interval Error

MTIE, the maximum time interval error, is a measure of the maximum accumulated phase error over a given observation time. The formal definition of MTIE involves probability computations on a random variable $X$ given in Equation 2.10 [8].

$$X = \max_{0 \le t_0 \le T-\tau}\left\{\max_{t_0 \le t \le t_0+\tau}\left[x(t)\right] - \min_{t_0 \le t \le t_0+\tau}\left[x(t)\right]\right\} \tag{2.10}$$

MTIE may be computed for a specific value of the observation interval, $\tau = n\tau_0$, using the following estimator [8]:

$$\mathrm{MTIE}(n\tau_0) \cong \max_{1 \le k \le N-n}\left\{\max_{k \le i \le k+n} x(i) - \min_{k \le i \le k+n} x(i)\right\}, \quad n = 1, 2, \ldots, N-1 \tag{2.11}$$

Equation 2.11 is best understood by reference to Figure 2.1, which illustrates the relationships of the various terms. These terms are defined in Table 2.5.

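Figure 2.1. Definition of MTIE terms [8] (plot of the time error x(t) versus sample index i = 1, 2, ..., N, marking the measurement period T = (N−1)τ0, an observation interval τ = nτ0 running from sample k to sample k+n, and the peak-to-peak value xppk within that interval)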

Table 2.5. Relationship of the quantities in Figure 2.1 and Equation 2.11 [8]

Quantity             Meaning
$\tau_0$             Sample period
$\tau = n\tau_0$     Observation time interval of interest
$T$                  Measurement period during which the $N$ samples were taken
$x_i$                The $i$th time error sample
$x_{ppk}$            The peak-to-peak value of $x_i$ within the observation time interval starting at $x_k$
MTIE($\tau$)         The maximum value of $x_{ppk}$ for all observations of length $\tau$ within $T$
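The quantities of Table 2.5 map onto a short loop: for each starting sample $k$, take the window of $n + 1$ samples, compute its peak-to-peak value $x_{ppk}$, and keep the largest. The Python sketch below implements Equation 2.11 in this direct way; the function name `mtie` and the zero-based indexing are assumptions, and the straightforward windowing is chosen for clarity rather than speed.

```python
def mtie(x, n):
    """Estimate MTIE(n*tau0) from time-error samples x (Equation 2.11).

    x -- sequence of N time-error samples, in seconds
    n -- observation interval in units of tau0, 1 <= n <= N - 1
    """
    N = len(x)
    if n < 1 or n > N - 1:
        raise ValueError("need 1 <= n <= N - 1")
    worst = 0.0
    for k in range(N - n):
        window = x[k:k + n + 1]            # n + 1 samples spanning n*tau0
        x_ppk = max(window) - min(window)  # peak-to-peak value in this window
        worst = max(worst, x_ppk)
    return worst

samples = [0.0, 3.0, 1.0, -2.0, 0.0, 1.0]
mtie_values = [mtie(samples, n) for n in range(1, len(samples))]
print(mtie_values)
```

Since a longer window always contains the shorter ones, the estimate can only grow or stay the same as $n$ increases, and for $n = N - 1$ it equals the peak-to-peak value of the whole record.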

Examples of the Characterization Methods

In this subsection, examples of each of the characterization techniques are illustrated. In all cases, the computations use the TIE data of Figure 2.2.

Figure 2.2. Portion of the TIE data (plot of TIE in nanoseconds, from −150 to 150, versus time in seconds × 50, from 0 to 200)

The TIE data were generated in Matlab using a random number generator based on a normal distribution with a standard deviation of 50 ns and with zero mean. The data set consists of 100,000 points with an assumed sampling interval of 0.02 s. The maximum value is 231.8 ns and the minimum value is −242.6 ns. The first 200 points of the TIE data are illustrated in Figure 2.2.

Figure 2.3 shows the Allan deviation computed from the TIE data of Figure 2.2. The slope of the curve is $\tau^{-1}$, which from Table 2.1 indicates either white phase or flicker phase modulation.

Figure 2.4 shows the modified Allan deviation computed from the TIE data of Figure 2.2. The slope of the curve is $\tau^{-1.5}$ (that is, $\tau^{-3/2}$). From Table 2.2, a slope of $\tau^{-1.5}$ indicates the presence of white phase noise and not flicker phase noise, which would have a slope of $\tau^{-1}$.

Figure 2.5 shows the time deviation computed from the TIE data of Figure 2.2. The slope of the curve is $\tau^{-1/2}$. From Table 2.3, a slope of $\tau^{-1/2}$ again indicates the presence of white phase noise.

Figure 2.6 shows the TIErms computed from the TIE data of Figure 2.2. The slope of the curve is approximately $\tau^{0}$, which, from Table 2.4, is again consistent with the presence of white phase noise.

Figure 2.7 shows the MTIE computed from the TIE data of Figure 2.2. Recall that the value of MTIE for a given observation interval represents the maximum differential time error observable over the data. Therefore, the maximum value of MTIE in Figure 2.7, approximately 475 ns, should equal the difference between the maximum and minimum values of the time interval error data, i.e., 231.8 − (−242.6) = 474.4 ns.
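A comparable experiment can be sketched in Python instead of Matlab. The code below is an illustration under stated assumptions, not the author's script: it uses 20,000 points rather than 100,000 to keep the run short, an arbitrary seed, and compact versions of the estimators of Equations 2.3 and 2.9 under the hypothetical names `adev` and `tie_rms`.

```python
import math
import random

def adev(x, tau0, n):
    """ADEV(n*tau0) from time-error samples x, per Equation 2.3."""
    terms = len(x) - 2 * n
    total = sum((x[i + 2 * n] - 2.0 * x[i + n] + x[i]) ** 2
                for i in range(terms))
    return math.sqrt(total / (2.0 * (n * tau0) ** 2 * terms))

def tie_rms(x, n):
    """TIErms(n*tau0) from time-error samples x, per Equation 2.9."""
    terms = len(x) - n
    total = sum((x[i + n] - x[i]) ** 2 for i in range(terms))
    return math.sqrt(total / terms)

SIGMA = 50e-9    # standard deviation of the time error: 50 ns, as in the text
TAU0 = 0.02      # sampling interval in seconds, as in the text
rng = random.Random(0)                     # arbitrary seed, for repeatability
x = [rng.gauss(0.0, SIGMA) for _ in range(20_000)]

# Slope of ADEV on a log-log plot, measured across one decade of tau.
adev_slope = math.log10(adev(x, TAU0, 10) / adev(x, TAU0, 1))

# Ratio of TIErms at two observation intervals; a tau^0 slope gives about 1.
tie_ratio = tie_rms(x, 10) / tie_rms(x, 1)

# MTIE over the longest interval is the peak-to-peak value of the record.
mtie_limit = max(x) - min(x)

print(f"ADEV slope per decade: {adev_slope:.2f}")
print(f"TIErms ratio:          {tie_ratio:.3f}")
print(f"TIErms(tau0):          {tie_rms(x, 1) * 1e9:.1f} ns")
print(f"peak-to-peak TIE:      {mtie_limit * 1e9:.1f} ns")
```

With white phase noise the ADEV slope should come out close to −1 and TIErms close to $\sqrt{2} \times 50$ ns, or about 70.7 ns, which matches the vertical scale of Figure 2.6. The exact figures depend on the seed and the record length.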

Figure 2.3. Allan deviation (ADEV) of the TIE data (log-log plot of Allan deviation, from 10^-10 to 10^-5, versus observation time in seconds, from 10^-2 to 10^3)

2.2.3 Properties of Modern Oscillators

For all practical purposes, timekeeping today relies on four oscillator technologies, which can be divided into two categories:

• Quartz crystal resonances, and
• Rubidium, cesium, and hydrogen atomic resonances.

Figure 2.4. Modified Allan deviation (MDEV) of the TIE data (log-log plot of modified Allan deviation, from 10^-13 to 10^-5, versus observation time in seconds, from 10^-2 to 10^3)

Figure 2.5. Time deviation (TDEV) of the TIE data (log-log plot of time deviation in seconds, from 10^-10 to 10^-7, versus observation time in seconds, from 10^-2 to 10^3)

Figure 2.6. TIErms of the TIE data (plot of TIErms in seconds, from 7.06 × 10^-8 to 7.13 × 10^-8, versus observation time in seconds on a logarithmic axis from 10^-2 to 10^4)

Figure 2.7. Maximum time interval error (MTIE) of the TIE data (plot of MTIE in seconds, from 3.4 × 10^-7 to 5 × 10^-7, versus observation time in seconds on a logarithmic axis from 10^-2 to 10^4)

Oscillators Based on Quartz Crystals

Quartz oscillators are based on the mechanical vibrations of a quartz crystal, and some form of quartz oscillator is used in almost every commercial clock and watch intended for general use. For precision timekeeping, the oscillator is usually in the form of a disk cut from a single crystal of quartz, and at a specific angle with respect to the crystallographic axes. Because the frequency of oscillation depends on the bulk properties of the quartz, these oscillators exhibit small but significant frequency variation with temperature, pressure, and acceleration. These environmental sensitivities vary with the specific cut of the crystal, which allows the properties of the oscillators to be optimized for the specific application. For example, Figure 2.8 shows the error in a common quartz-based wristwatch. When kept on the wrist at body temperature, the watch is actually quite accurate. However, if left outside at extreme ambient temperatures, the error rises significantly [14]. As will be seen below, the short-term stability properties of quartz oscillators are quite favorable. This favorable short-term stability, combined with economically reasonable methods of temperature control, has allowed the quartz oscillator to become the oscillator of choice for radios, cell phones, computers, and many other commercially significant applications.

Figure 2.8. Wristwatch accuracy as a function of temperature [14]
[Plot omitted: time error per day in seconds versus temperature in °C, with the temperature axis marked at −55 (military cold), −10 (winter), 28 (wrist temperature), 49 (desert), and 85 (military hot).]


There are various techniques for compensating for temperature. The simplest is to select the cut of the crystal to minimize the temperature coefficient at the normal operating temperature. These uncompensated crystal oscillators are usually designated as XOs. An XO will have a temperature coefficient on the order of a few parts per million per °C. It is also possible to select other components in the oscillator circuit, for example, a varactor, with a temperature coefficient that varies in such a way as to cancel much of the temperature variation of the quartz crystal. These devices are designated TCXOs, and provide stabilities on the order of 5 × 10⁻⁷ per °C. The MCXO, or microprocessor-compensated crystal oscillator, makes use of two resonances in selected cuts of quartz. These resonances are based on different modes of oscillation, and therefore have different temperature coefficients. By measuring the frequency difference of the two oscillating modes, the temperature can be measured and used by the microprocessor to produce compensating changes in the oscillator circuits. An MCXO can have temperature coefficients in the range of 10⁻⁸ per °C. A final technique is to use an external oven to maintain the temperature of the quartz, as well as other components of the oscillator, at the optimum operating point. These OCXOs can achieve temperature coefficients from 2 × 10⁻¹⁰ to 10⁻⁹ per °C, depending on whether a single or a double oven is used.

Of course, these improvements come with a cost. Although a simple XO may sell for less than a dollar, a high-quality OCXO may cost several hundred dollars. Quartz crystal oscillators can also be manufactured to operate at a variety of frequencies by varying the dimensions and the mode of oscillation. As will be discussed in Section 5.3, quartz crystal oscillators are a key component in IEEE 1588 technology.
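To give a feel for what these temperature coefficients mean for timekeeping, the following minimal sketch converts a coefficient and a temperature excursion into an accumulated time error per day. It rests on two assumptions that are not in the text: the frequency offset is modeled as linear in temperature (real crystal cuts follow more complicated curves), and the XO value of 1 × 10⁻⁶ per °C is an illustrative stand-in for "a few parts per million". All function and variable names are hypothetical.

```python
SECONDS_PER_DAY = 86_400

# Representative temperature coefficients (fractional frequency per degree C).
# TCXO, MCXO, and OCXO values are the figures quoted in the text; the XO value
# of 1e-6 is an assumed stand-in for "a few parts per million".
TEMPCO_PER_DEG_C = {
    "XO": 1e-6,
    "TCXO": 5e-7,
    "MCXO": 1e-8,
    "OCXO": 1e-9,
}


def fractional_offset(tempco_per_deg_c, delta_t_deg_c):
    """Fractional frequency offset for a temperature excursion (linear model)."""
    return tempco_per_deg_c * delta_t_deg_c


def time_error_per_day(offset):
    """Seconds gained or lost in one day at a constant fractional offset."""
    return offset * SECONDS_PER_DAY


def drift_table(delta_t_deg_c):
    """Daily time error in seconds for each oscillator class."""
    return {
        name: time_error_per_day(fractional_offset(k, delta_t_deg_c))
        for name, k in TEMPCO_PER_DEG_C.items()
    }


if __name__ == "__main__":
    for name, seconds in drift_table(10.0).items():
        print(f"{name:5s} {seconds:.3e} s/day")
```

Under these assumptions, a 10 °C excursion costs the XO about 0.86 s per day, while the OCXO figure of 10⁻⁹ per °C corresponds to somewhat under a millisecond per day.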
Readers interested in more details on quartz oscillator technology will find the tutorial by Vig and Ballato [14] an excellent starting point.

Oscillators Based on Atomic Resonances

Rubidium, cesium, and hydrogen oscillators are all based on atomic resonances, and as such they are much less susceptible to the environmental factors that plague quartz oscillators. The resonance between two carefully chosen quantum states in an atom of rubidium, cesium, or hydrogen has a very small dependence on environmental perturbations. However, the temperature-sensitive mechanisms required to interrogate the resonance frequency can perturb the resonance frequency. Therefore, the resulting devices still exhibit environmentally induced degradation, albeit at a much lower level.

Rubidium vapor cell devices make use of a transition between the hyperfine states |F = 1, m_F = 0⟩ and |F = 2, m_F = 0⟩ in the ground state of ⁸⁷Rb atoms. The cells are made such that the ⁸⁷Rb vapor pressure in the cell is about 10⁻⁵ torr, and a buffer gas is included with a vapor pressure of about 10 torr. The buffer gas confines the rubidium atoms, and reduces collisions with the cell walls that would otherwise broaden the resonance. The buffer gas is tailored to provide a zero temperature sensitivity at some operating temperature. The transition between the hyperfine states |F = 1, m_F = 0⟩ and |F = 2, m_F = 0⟩ has a resonance frequency of 6,834,682,610.9 Hz under vacuum. Typically, the collisions between the ⁸⁷Rb atoms and the buffer gas molecules shift the resonance frequency by a few kHz. The cell is optically pumped with a ⁸⁷Rb lamp filtered by a ⁸⁵Rb gas filter to provide the proper pumping spectrum. The transition between the F = 2 and F = 1 hyperfine states is induced by a microwave signal driven by a quartz oscillator. At resonance, the microwave signal causes a change in the absorption of the light detected by a photocell. This change is used to drive the servo that locks the microwave frequency to the atomic resonance.

Rubidium standards provide stabilities on the order of 10⁻¹¹ to 10⁻¹² with respect to both temperature and aging, which is intermediate between quartz and cesium beam devices. The appeal of the rubidium vapor cell device is the improved performance over quartz, and its smaller size, compared to both cesium beam devices and hydrogen masers. The cost of the rubidium vapor cell devices has been driven down to less than $1,000, due to the number of these devices used in the cellular telephone industry. There is a DARPA contract with the goal of producing a rubidium oscillator in a 1-cm³ package. NIST also has a program to develop these devices.¹ Interested readers will find a good account of this work in the paper by Knappe et al. [15]. If these devices can be made, and at a reasonable price point, they will make a major impact in many areas, including IEEE 1588 applications.

Cesium beam devices make use of an atomic transition at 9,192,631,770 Hz. Indeed, the international second (SI) is defined as 9,192,631,770 periods of the radiation emitted by this transition.
In the device, a beam of cesium atoms from an oven is passed through a series of magnets and microwave cavities that serve to separate the beam into two states. Transitions between these states are induced by a microwave field generated from a crystal oscillator controlled by a servo system. The error signal is derived from a detector that senses the change in the number of atoms reaching a hot wire ion detector. Although considerably larger than a rubidium vapor cell device, a cesium beam device provides accuracies on the order of 10⁻¹² to 10⁻¹⁴, with much lower sensitivity to environmental factors. Cesium beam devices are quite expensive, with costs on the order of $50,000. Cesium beam devices form the basis for the world's time standard, as discussed in Section 2.3. Readers interested in a more complete account of the history and technical details of cesium clocks will find the paper by Cutler [16] helpful.

The hydrogen maser, like the rubidium and cesium devices, makes use of an atomic transition. The transition frequency used in a hydrogen maser is at 1,420,405,751.8 Hz. As before, the transition is induced by a microwave field, and the microwave signal frequency is phase locked to the atomic transition. Hydrogen masers have much better short-term stability than do either the rubidium or cesium devices, but are quite large and very expensive. Hydrogen maser stabilities are on the order of 10⁻¹⁴ to 10⁻¹⁵.

Readers interested in learning more about atomic frequency standards will find numerous references and white papers on the web sites of suppliers of these devices, and in the literature. The article by Cantor [17] provides a good introductory survey of the field.

The Allan deviations for quartz, rubidium, and cesium oscillators are illustrated in Figure 2.9. The CTS CB3LV curve is for an inexpensive quartz oscillator often used to drive microprocessors, while the 10811D curve is for a precision ovenized quartz crystal oscillator typically found in high-quality instrumentation. The performance of the precision quartz oscillator is roughly three orders of magnitude better than that for the inexpensive version. In both cases, the short-term stability is good, but the long-term stability is poor. The rubidium oscillator is commonly used in telecommunication systems as a secondary reference. Rubidium oscillators are often combined with precision quartz crystal oscillators to take advantage of the better long-term performance of the rubidium and the short-term characteristics of the quartz oscillators. The final curve is for an Agilent 5071A cesium standard. These devices have the best long-term stability of any of the commonly used oscillators, and 5071A clocks make up the majority of the clocks used in the determination of the UTC time scale.

¹ The current URL of the NIST small-scale atomic clock program is http://tf.nist.gov/ofm/smallclock/.
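For readers who want to produce curves like those discussed above from their own measurements, the following is a minimal sketch of the standard non-overlapping Allan deviation estimator. It assumes the input is a list of fractional frequency samples taken at a fixed interval tau0, so that an averaging factor m corresponds to tau = m × tau0. The function name and signature are hypothetical, and production tools usually prefer overlapping estimators with confidence intervals.

```python
import math


def allan_deviation(y, m=1):
    """Non-overlapping Allan deviation of fractional-frequency samples.

    y is a sequence of fractional frequency values sampled at interval tau0;
    m is the averaging factor, so the result applies to tau = m * tau0.
    """
    n_blocks = len(y) // m
    if n_blocks < 2:
        raise ValueError("need at least two averaging blocks")
    # Average the samples in consecutive, non-overlapping blocks of length m.
    means = [sum(y[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
    # Half the mean squared difference of adjacent block averages.
    squared_diffs = [(means[i + 1] - means[i]) ** 2 for i in range(n_blocks - 1)]
    return math.sqrt(sum(squared_diffs) / (2 * (n_blocks - 1)))
```

Evaluating the function over a range of m values and plotting the result against tau on logarithmic axes gives a curve of the kind shown in Figure 2.9. A constant frequency offset contributes nothing, which is why the Allan deviation characterizes stability rather than accuracy.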

Figure 2.9. Allan deviations for quartz, rubidium, and cesium oscillators
[Plot omitted: Allan deviation versus tau in seconds on logarithmic axes, with curves for the CTS CB3LV, 10811D, rubidium, and 5071A cesium oscillators, and with one second, minute, hour, and day marked on the tau axis.]

2.3 Time Scales and Calendars

Calendars are our method of organizing events over long periods of time. In all calendars, the fundamental unit of measure is the day, but exactly what should constitute a day has been the subject of debate throughout history. The obvious choices are to begin the day at midnight, sunrise, noon, or sunset, and each possibility has its advocates. For example, both the Jewish and Islamic days start at sunset. At the International Meridian Conference in 1884, it was decided to use midnight for civil timekeeping purposes.

The longest unit of calendar measure is the year. Here again, there is a choice between basing the year on solar or lunar cycles. As usual, there are calendars for both choices, and both have the disadvantage of not containing an integral number of days. For example, the solar year is 365.2422 days long, which requires any solar-based calendar to have a mechanism for adjusting the number of days in a year, if the same date is always to fall at the same place in the seasonal cycle.

Once the basic structure of a calendar is determined, the epoch or starting point must be set. Here, astronomy offers little guidance, so some culturally dependent event is selected. It is no surprise, then, that the furor over exactly when the millennium started was not of universal concern. The millennium is an artifact of the Gregorian calendar, with no obvious meaning in the Julian, Jewish, Muslim, Chinese, or the religious or civil Hindu calendars.

The difficulty in keeping the calendar and the seasons in line has led to a series of modifications of the calendar structure, the epoch, or both. Thus, in the time of Julius Caesar, the older Roman calendar was about 90 days out of line with the seasons. The required adjustment gave rise to the Julian calendar, and the peculiarity that the current names for September, October, November and December (months 9, 10, 11, and 12) in the Julian calendar appear to reflect the numbers 7, 8, 9, and 10. By 1500, things were again out of alignment. This time the error was approximately 10 days, with Easter being the troublesome point. In 1582, Pope Gregory XIII, in a papal bull titled Inter gravissimas, removed 10 days from the Julian calendar, created a new algorithm for calculating the lunar age and Easter, and instituted a change in the leap year rule, thus establishing the Gregorian calendar, which is still in use. This event was not greeted with universal acceptance, with politics and religious differences delaying adoption for nearly four centuries. It was not until 1752 that the Gregorian calendar was accepted in England, and not until 1927 in Turkey. It is best not to judge this behavior too harshly. In our own times, not all countries have joyfully embraced the MKS system of units in everyday commerce. Readers interested in more details about the development of calendars will appreciate the very interesting account written by Silke Ackermann, titled The Principles and Uses of Calendars [18].

One of the side effects of the improvements in clocks was that astronomers found that the length of the day varied considerably over the seasons. This introduced the additional complication of reconciling the astronomical calendar with time maintained by the increasingly accurate clocks, and the needs of society. As one would expect, this has led to a bewildering array of time scales, each optimized for the needs of a particular application.

The first group of time scales (UT0, UT1, and UT2) is directly tied to astronomical motion. UT0 corrects a perfect clock for the apparent motion of the Sun observed at exactly noon, on a perfect clock, when the observer is on the Greenwich meridian. This observed pattern, the analemma that resembles a figure eight, is caused by the combination of the tilt of the Earth's axis and the motion of the Earth around the Sun. Superimposed on this motion is the wobble of the polar axis. The correction of UT0 for this wobble results in the UT1 time scale. UT2 is a refinement of UT1 to correct for very small annual and semi-annual variations in UT1. There remain random variations in UT2, resulting in unpredictable errors on the order of 10⁻⁹, or about 60 ms per year [19].

The second group of time scales, UTC and TAI, forms the basis for modern civil timekeeping. Since 1 January 1972, TAI has been the official basis for UTC. UTC, or coordinated universal time, is the official worldwide legal time scale, having replaced the older Greenwich mean time (GMT) in 1986. TAI, from the French temps atomique international, is a time scale based on the international second.
The SI second is based on the transition between the two hyperfine levels of the ground state of the cesium atom, and is defined as the duration of 9,192,631,770 periods of radiation from this transition. This definition was accepted at the 13th General Conference of Weights and Measures in 1967. In practice, TAI is based on a weighted average of more than 200 cesium clocks maintained by national standards laboratories around the world. This average is computed periodically by the Bureau International des Poids et Mesures (BIPM).² The BIPM estimates that TAI differs from an imaginary perfect clock by less than 100 ns per year, or about three parts in 10¹⁵. TAI is a continuous time scale.

The UTC and TAI time scales share the definition of a second, the SI second, but differ by a correction to UTC to keep it within 0.9 s of UT1. TAI was defined to be equal to UT1 on 1 January 1958. Between 1958 and 1972, the corrections between UTC and TAI were made in fractions of a second. Since 1972, the corrections between UTC and TAI have been made by the addition or subtraction of a whole number of seconds called leap seconds. If necessary, leap seconds are added at the end of any UTC month, with preference given to the end of June or December. Leap seconds are always added at the end of the last day of the month. For example, if a positive leap second is added, the addition is made at UTC 23:59:59. The next seconds tick would be 23:59:60, followed by 00:00:00 at the beginning of the next day. The printable representation of UTC is specified by international standard as YYYY-MM-DD for the date, and hh:mm:ss for the time of day [20].

² The current URL of the BIPM is http://www.bipm.org/en/home/.

The three time scales of most interest to the application areas discussed in this book are the time scales used by the NTP, GPS, and IEEE 1588 protocols. Of these, NTP is based on UTC, while GPS and IEEE 1588 are based on TAI. In an actual implementation, the time scale established by each is maintained by a counter, with the zero of this counter establishing the origin, or epoch, of the time scale. The epochs for the three protocols are shown in Table 2.6. In this table, MJD refers to the term modified Julian day.³

Table 2.6. Epochs of NTP, GPS and IEEE 1588 time scales

Time/MJD          UTC                    NTP seconds      GPS seconds      IEEE 1588 seconds
00:00:00 15020    00:00:00 1900-01-01    0 (NTP epoch)
00:00:00 40587    00:00:00 1970-01-01                                      0 (IEEE 1588 epoch)
00:00:00 44244    00:00:00 1980-01-06                     0 (GPS epoch)

Since the NTP counter is adjusted at each leap second correction, intervals computed by subtracting two NTP timestamps will be in error by the number of leap seconds that occurred between the two times. Neither GPS nor IEEE 1588 requires this correction. The conversion between NTP and UTC can be done without knowing the current number of leap seconds, but conversions from GPS or IEEE 1588 time to UTC require this information. Fortunately, both the NTP and GPS protocols provide the current number of leap seconds as part of their transmissions, and IEEE 1588 will distribute this information if either NTP or GPS is used as the primary source of time for IEEE 1588.
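The epoch relationships in Table 2.6 can be checked with a few lines of date arithmetic. The sketch below recomputes the MJD of each epoch and the nominal number of seconds separating the epochs, counting exactly 86,400 s per day. It deliberately ignores leap seconds and the offset between UTC and TAI, so it illustrates only the relationship between the counter origins, not a complete conversion between time scales. All names are hypothetical.

```python
from datetime import date

SECONDS_PER_DAY = 86_400
MJD_ORIGIN = date(1858, 11, 17)  # MJD 0 begins at midnight on this date

NTP_EPOCH = date(1900, 1, 1)
IEEE_1588_EPOCH = date(1970, 1, 1)
GPS_EPOCH = date(1980, 1, 6)


def mjd(d):
    """Modified Julian day number of a calendar date (at 00:00:00)."""
    return (d - MJD_ORIGIN).days


def epoch_offset_seconds(later, earlier):
    """Nominal seconds between two epochs, counting 86,400 s per day.

    Leap seconds are NOT accounted for; this is only the day-count offset.
    """
    return (mjd(later) - mjd(earlier)) * SECONDS_PER_DAY


if __name__ == "__main__":
    for name, epoch in [("NTP", NTP_EPOCH),
                        ("IEEE 1588", IEEE_1588_EPOCH),
                        ("GPS", GPS_EPOCH)]:
        print(f"{name:10s} MJD {mjd(epoch)}")
    print(epoch_offset_seconds(IEEE_1588_EPOCH, NTP_EPOCH))
    print(epoch_offset_seconds(GPS_EPOCH, IEEE_1588_EPOCH))
```

The MJD values reproduce the first column of Table 2.6, and the nominal offsets are 2,208,988,800 s between the NTP and IEEE 1588 epochs and 315,964,800 s between the IEEE 1588 and GPS epochs. Any real conversion to or from UTC must additionally apply the current leap second count, as described above.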

³ The Julian date (JD) is the Julian day number (JDN) followed by the fraction of the day elapsed since the preceding Greenwich mean noon. The JDN is a day count with the origin, JD = 0, at Greenwich mean noon on 1 January 4713 BC. The modified Julian date (MJD) is the JD less 2,400,000.5, which shifts the origin to midnight on 17 November 1858. For example: at 0 hours on 1 January 1900, JD = 2,415,020.5, and MJD = 15,020.


2.4 Synchronization Protocols

Applications making use of time and clocks fall into two categories. In the first, the passage of time is only relative to local events, and there is no need to tie local time to a more global time scale. The most familiar examples for engineers are the time scales supported by electronic instruments such as oscilloscopes, which are used to accurately measure the time evolution of signals or the relative time between two events. In the second category are applications in which local time must be coordinated with global time. The use of clocks in everyday commerce depends on the clocks being synchronized to a global time scale, and the operation of most of our infrastructure, from transportation to telecommunications, would be impossible without a common sense of global time. Increasingly, even test and measurement applications that formerly used a purely local sense of time are being tied to a global time scale. Although this is also good experimental practice, it is increasingly being required by government regulation (e.g., in the pharmaceutical industry), and makes data more useful in quality control procedures.

The technical solutions for meeting the requirement of a global time scale are either a method of distributing time or one of synchronizing clocks. In a time distribution system, an application needing the time either requests a value of the current time, or if the distribution uses a broadcast mechanism, it simply observes the time. In such a system, the user will know the time based on the time scale of the supplier. However, unless the locations of the observer and supplier of the time and the propagation time of the communication link are known, the observer cannot relate the civil time in the two locations. Familiar examples of time distribution systems are the time services provided by the phone companies or WWV broadcasts in the United States. Both examples provide time that is tied to a global time scale.
Consider a traveller taking a flight from San Francisco and falling asleep, so that upon landing he did not realize that he had been diverted to Denver, instead of landing in New York as planned. A call to a time service in San Francisco would provide Pacific standard time. The traveller, believing that he was in New York and making the appropriate 3 h correction to his watch, would be surprised that his watch was in error by 2 h with respect to the local clocks. In addition, his watch would be behind the global time scale by the propagation time of the message from San Francisco to Denver. A clock synchronization protocol calibrates a time distribution system in order to accurately synchronize the clocks in the system. There are many examples of both distribution and synchronization protocols. The remainder of this section discusses some early historical examples, and several of the more important modern protocols.
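The calibration step that distinguishes synchronization from distribution can be illustrated with a minimal sketch. It assumes the only uncorrected error is a one-way propagation delay over a path of known length, travelled at a known fraction of the speed of light. The velocity factor parameter and all function names are assumptions introduced for illustration, and real links add processing and queuing delays that this sketch ignores.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def propagation_delay(path_length_m, velocity_factor=1.0):
    """One-way delay in seconds over a path of the given length.

    velocity_factor is the signal speed as a fraction of c (1.0 for free
    space; lower for cables, where the exact value depends on the medium).
    """
    if not 0.0 < velocity_factor <= 1.0:
        raise ValueError("velocity_factor must be in (0, 1]")
    return path_length_m / (velocity_factor * SPEED_OF_LIGHT_M_PER_S)


def corrected_time(received_timestamp_s, path_length_m, velocity_factor=1.0):
    """Supplier's time at the instant of reception.

    The received value was correct when it was sent, so the receiver adds
    the path delay to recover the supplier's current time.
    """
    return received_timestamp_s + propagation_delay(path_length_m,
                                                    velocity_factor)
```

A receiver that skips this correction runs behind the supplier by the propagation delay, which is the situation of the traveller's watch in the example above.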


2.4.1 Early Synchronization and Distribution Protocols

As noted in Section 2.2.1, one of the most pressing commercial needs around 1700 was for accurate seaworthy clocks that could be synchronized to the time measured by the observatory in Greenwich. Clocks properly synchronized to Greenwich time enabled the accurate measurement of longitude. However, the clocks still needed to be set to Greenwich time at the outset of a voyage from England. This was accomplished by having a large ball, visible to sailors on the Thames, fall precisely at noon from the top to the bottom of a mast on the Greenwich observatory. The accuracy of this distribution system was limited by the reaction times of the sailors in observing the ball, and then setting the shipboard clock.

During the 19th century, both the means and the need to improve the global time scale were present. The need was the increasing requirements of commerce, in particular the coordinating of railroad schedules. Before the advent of railroads, time was a local issue, with little need for global time scale synchronization, except in the loosest sense. Elaborate systems of time distribution were common in major cities. For example, clocks in Paris used pneumatic signals transported over pipes to coordinate clocks throughout the region. The advent of telegraphy around 1840, and the wireless around 1901 provided the means to more accurately synchronize clocks, and these formed the basis for global timekeeping well into the last century. However, the question of what time scale to use, and who was to be the supplier became mired in the usual struggle between commercial and political powers. Those interested in the political and scientific intrigue behind these developments should read the account of the development of the theory of relativity by Galison [21].
2.4.2 IRIG-B

IRIG-B is a time distribution standard, and is the most commonly implemented form of a group of standards published by the Inter-Range Instrumentation Group of the Range Commanders Council located at the White Sands Missile Range in New Mexico.⁴ The IRIG-B standard specifies a method of encoding time in the form of day-of-year, hour, minute, and second using a 1-kHz carrier frequency. Additional encoding allows the transmission of supplemental information, such as pending leap seconds or shift to daylight saving time, time quality, offset from UTC, etc. The update rate is once per second. There are options for different physical encoding and media. IRIG-B is commonly used to distribute and report time in localized environments, such as a test system or military test range. Equipment for distributing IRIG-B is available from many manufacturers in a wide variety of form factors, from cards for PC backplanes to distribution amplifiers for cabled systems. The use of IRIG-B for precise time measurement will require additional calibration for the propagation time between the source and receiver.

⁴ The current URL for the council is http://www.jcte.jcs.mil/rcc. Their documents can be found at http://www.jcte.jcs.mil/rcc/PUBS/oldoc.htm.

2.4.3 Loran-C

Loran-C is a system of precisely controlled transmitters used for navigation, and is accessible primarily in United States coastal waters. Each Loran station transmits an encoded signal at a precise time in relation to similar signals from other Loran stations. By measuring the time of reception of signals from at least three Loran stations, it is possible to locate position within 0.25 nautical miles. Loran transmissions are controlled by very precise atomic clocks. Although no time-of-day information is carried on the transmission, a suitably designed receiver and clock can maintain a very precise time scale. If the clock is initially synchronized to UTC, the time scale can be maintained within a few microseconds over long periods of time. Due to its limited spatial coverage, Loran-C is not particularly useful for applications in industrial automation or test and measurement.

Efforts to upgrade the Loran system have been underway for the past several years. This enhanced Loran, often termed e-Loran, will have better coverage and accuracy and is intended to provide a backup service to the GPS system. If these efforts are successful, there will be many more applications that can use Loran. It has been projected that e-Loran will support frequency accuracy to a part in 10¹², and time accuracy to about 30 ns [22]. If so, it will be quite attractive to the telecommunications industry.

2.4.4 NTP

Network time protocol (NTP) is without a doubt the most widely used and successful time synchronization protocol available. NTP was developed by David L. Mills at the University of Delaware, and is designed to synchronize clocks by means of message passing over the internet. The nominal accuracy is in the low tens of milliseconds on wide area networks to sub-milliseconds on a local area network.
It uses a client-server architecture, with over 200 primary time servers located throughout the world. Many of these time servers are synchronized to atomic clocks maintained by various national time services, such as NIST. NTP is the basis for most modern transaction-based commercial systems, including stock market transaction timestamping, air traffic control, communication network monitoring, and distributed database and file system coordination and monitoring. It is available on essentially every computing platform and operating system in use today. The current version of NTP is version 4. Version 3 is specified in RFC 1305 [23], which has been in use since 1992. Reference implementations are available. NTP uses filtering algorithms to reduce network-induced jitter and oscillator wander. The protocol implements features to reject faulty servers, and makes use of redundant servers and network paths to improve reliability.
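The message-passing idea can be made concrete with the clock offset and round-trip delay calculation commonly described for NTP, which the text above does not spell out. The sketch assumes four timestamps from one request and reply exchange: t1 and t4 are read from the client clock at transmission and reception, and t2 and t3 from the server clock at reception and transmission. It also assumes the outbound and return path delays are equal, and any asymmetry appears directly as an offset error. The function name is hypothetical, and the filtering and server selection mentioned above are omitted.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from one exchange.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)

    Returns (offset, delay) in the same units as the inputs, where offset
    is the amount to add to the client clock to match the server clock.
    Assumes the two path delays are symmetric.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


if __name__ == "__main__":
    # Illustrative numbers: server clock 5 s ahead of the client, 10 ms path
    # delay in each direction, 2 ms of server processing time.
    offset, delay = ntp_offset_and_delay(100.000, 105.010, 105.012, 100.022)
    print(f"offset = {offset:.3f} s, delay = {delay:.3f} s")
```

Subtracting the server's processing interval (t3 − t2) from the client's total wait (t4 − t1) isolates the network delay, and averaging the two apparent one-way differences cancels that delay when the paths are symmetric.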


Since the primary NTP servers are linked to national standards, NTP time is traceable to UTC. Further information, along with pointers to reference implementations and documentation, may be found on the NTP Project web site.⁵

⁵ The current URL is http://www.ntp.org/.

2.4.5 GPS

The global positioning system (GPS) is a constellation of some 24 satellites in orbit around the Earth. It is operated by the United States Department of Defense, and its primary purpose is to provide time and location information for military purposes. In recent years, the receivers necessary to receive and process GPS signals have become so inexpensive that GPS time and position is used in a myriad of commercial products, including cell phones, automobile and personal navigation systems, tracking devices, and many others. It is now the most commonly used system for the distribution of highly accurate time traceable to UTC.

Each GPS satellite contains an atomic clock. Signals from the satellites are monitored by ground stations that compute correction information, which is then transmitted to each satellite. These corrections permit satellite timing and position information to be accurate to within a few nanoseconds and a few meters, respectively [19]. The satellites transmit additional information, allowing transformation between the GPS time scale and UTC.

Due to the high accuracies obtainable, the GPS system is used to establish time in a number of critical applications, including the synchronization of NTP time servers to UTC, and the synchronization of the primary reference clocks in telecommunication systems. It is also used to establish time and frequency in most base stations of cellular telephone systems. When averaged by a suitable oscillator, such as a rubidium oscillator, frequency accuracies on the order of a part in 10¹³ and time accuracies of a few nanoseconds can be realized.

The GPS system is highly reliable. It is, however, subject to the control of the United States Department of Defense, which, if it wishes, can reduce the accuracy of the system to users not in possession of the necessary information to remove the degradation. This has led to an understandable reluctance of many, particularly in the telecommunications industry, to place complete reliance on the GPS system. There are competing satellite systems being assembled. The GLONASS system of satellites is maintained by the Russian government, and has capabilities similar to the GPS system. There is a third system, the Galileo system, being developed by the European Community.

2.4.6 IEEE 1588

The final protocol considered in this section is IEEE 1588, the subject of this book. As noted, IEEE 1588 is designed to fill a niche not well served by either of the two dominant protocols, NTP and GPS. IEEE 1588 is designed for local systems requiring very high accuracies beyond those attainable using NTP. It is also designed for applications that cannot bear the cost of a GPS receiver at each node, or for which GPS signals are inaccessible.

3 An Overview of Clock Synchronization Using IEEE 1588

Chapter 1 outlined three distinct styles of measurement and control: message-based, periodic, and time-based. The time-based method depends on the presence of accurately synchronized clocks in each of the devices in the system. As noted in Chapter 2, there are a number of ways of synchronizing such clocks, each with advantages and disadvantages. However, none has proved satisfactory for use in modern network-distributed measurement and control systems. It was to meet the needs of this environment that the IEEE 1588 standard was developed.

The goal of this chapter is to provide the reader with a clear understanding of the context in which IEEE 1588 was developed, and a clear conceptual understanding of how the IEEE 1588 protocol synchronizes clocks. A standards document is notoriously difficult to read, and is usually a poor vehicle for gaining an understanding of the overall operation and performance of a device or system based on the standard. The principal purpose of a standard is to provide a minimal, but sufficient set of specifications for an implementation to ensure that the operation of the implementation meets the technical goals of the standard. Even for a standard of modest complexity such as IEEE 1588, the level of detail necessary for an adequate specification obscures the overall operation.

The presentation in this and subsequent chapters discusses material beyond what is in the actual text of the standard. These discussions are the views of the author, and should not be construed as an official interpretation of the standard. Hopefully, these discussions will increase the reader's understanding of the standard. However, it remains the responsibility of each implementor of the standard to use his or her own engineering judgement as to the meaning of each of the standard's specifications. The IEEE has specific procedures for providing official interpretations of the meaning of the standard. These procedures are described on the web site maintained by the IEEE Standards Association.¹ Since the standard was published, numerous questions have been received by the working group that developed the standard. The responses to these questions, along with a great deal of other information relevant to implementors or users of IEEE 1588, can be found on the IEEE 1588 web site.²

This chapter is divided into three sections, each describing a different aspect of IEEE 1588. The first section examines the objectives the IEEE 1588 standards committee sought to meet. It also provides a brief history of the development of the technology. The second section provides a general overview of the standard document itself. The third section presents the fundamental operations of the synchronization protocol defined by IEEE 1588. A detailed analysis of the major specifications of IEEE 1588, a description of ongoing standards revision activity related to IEEE 1588, and speculation on how proposed revisions will impact the standard will be found in Chapter 4.

It is strongly recommended that the reader obtain from the IEEE a copy of IEEE 1588-2002 for reference while reading this and subsequent chapters.³ IEEE 1588-2002 [24] is the version of the standard discussed in this book, and numerous references will be made to specific clauses, tables, and figures in this standard.

3.1 The History of the Development and the Objectives of IEEE 1588

IEEE 1588 is based on work begun around 1990 in the central research laboratories of the Hewlett-Packard Company, and continued at Agilent Technologies after the split from Hewlett-Packard in 1999. The technology was originally intended for use in instrumentation systems using network communication for control and data transport. Early public presentations of this technology attracted considerable interest from the industrial automation community, and by the fall of 2000 it was clear that there was sufficient interest in the technology to warrant a standardization effort. IEEE 1588 was developed under the rules of the IEEE Standards Association. Formal work on the standard began in the spring of 2001, and concluded with the publication of the standard in November 2002. The IEEE sponsoring organization is the TC-9 Technical Committee on Sensor Technology of the IEEE Instrumentation and Measurement Society. The standard has also been approved by the IEC as IEC 61588 [25].

¹ The current URL of the IEEE Standards Association is http://standards.ieee.org.
² The current URL is http://ieee1588.nist.gov.
³ IEEE publications are available from the Institute of Electrical and Electronics Engineers, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331, USA.


The standard committee’s objectives are found in Clauses 1.1 and 1.2 of the standard [24], and form the context needed to appreciate why certain specifications appear in the standard. These objectives are as follows:

• The protocol must enable real-time clocks in the components of a distributed network measurement and control system to be synchronized to sub-microsecond accuracy. A real-time clock in this context is a clock with a time scale approximately commensurate with the international second. Clocks synchronized using IEEE 1588 will have the same epoch, or time scale origin, to sub-microsecond accuracy. It was not an objective of the standard to synchronize these clocks to UTC, although this can easily be done, as discussed in Chapter 5.

• The protocol must operate over local area networks that support multicast communications. Ethernet, as realized in IEEE 802.3 [26], is the obvious target network for many applications of this standard. However, the intent of the standard is to also allow implementation on network technologies other than Ethernet.

• The protocol is designed to operate on relatively localized network systems typically found in test and measurement or industrial automation environments at the bench or work cell level. Such environments are usually contained within tens or, at most, a few hundred meters spatially, and with few network components, such as switches or routers, present. The protocol was not designed to operate over the internet or wide area networks.

• The protocol must accommodate clocks with a variety of accuracy, resolution, and stability specifications. The target applications almost always involve a mixture of high- and low-accuracy devices. For example, it is inappropriate to require a thermocouple to support the same clock accuracy as a high-speed digitizer.

• The protocol is designed to be administration-free, at least in the default mode of operation. The motivation for this objective is understandable in the context of test and measurement or industrial automation. The protocol is much more attractive if simply attaching a device to the network results in the automatic synchronization of its clock, without recourse to configuring address tables or other parameters. The multicast communication requirement on target networks is the enabler for this feature.

• Finally, the protocol is designed with minimal resource requirements both in terms of network bandwidth, and computational and memory capability in the devices. In both test and measurement and industrial automation applications, there will be many devices that require a synchronized clock but have cost constraints that must be respected.

Since the publication of the standard in November 2002, there have been three conferences focused on IEEE 1588. The 2003 and 2004 conferences were sponsored jointly by the Instrumentation and Measurement Society of the IEEE and the United States National Institute of Standards and Technology (NIST).


These conferences were held at the NIST campus in Gaithersburg, Maryland. The 2005 conference was hosted by the Zurich University of Applied Sciences. The proceedings of these conferences are available as NIST technical reports [27] [28] [29]. At the 2003 conference, barely 11 months after the standard was published, five organizations demonstrated working prototypes of IEEE 1588 implementations. At the 2004 conference, seven organizations participated in a plug-fest. The devices of all seven organizations successfully synchronized to each other to roughly 50 ns accuracy. The resulting time scale was linked to UTC via GPS. At the 2005 conference 13 organizations participated in a plug-fest, which tested the protocol in more detail than at the previous conferences. The prevailing sentiment of the attendees at these conferences suggests that the standard met its objectives. However, almost as soon as the ink was dry on the standard, it was being applied to application domains and with target specifications very different than those set forth in these objectives. As a result of discussions at the 2004 conference [28], the sponsoring society was asked to reopen the standard for extensions to accommodate these new applications and specifications. This work was ongoing when this book was published. Section 4.7 speculates on the likely impact of this work.

3.2 Overview of the Standard

This section describes the organization of the IEEE 1588-2002 standard document. The document contains preliminary matter followed by nine major clauses, five appendices, and a table of contents. A minimal overview of each clause is provided in Clause 1 of the standard.

The preliminary matter contains statements by the IEEE concerning the use of the standard. Implementors of this standard should read this material with care, since it contains information regarding issues such as warranty, copyright, standards life cycles, and patent issues. The life cycle and patent issues are particularly important. All IEEE standards are subject to mandatory reviews. The standard may be modified at any time following the procedures defined by the IEEE, and must be reviewed every 5 years. Implementors must realize that from the IEEE’s perspective, a standard is an evolving, not a static, document. To be sure, the rate of evolution set by the IEEE procedures is measured in years, but standards do evolve to meet the needs of markets and changes in technology. The IEEE, like many standards organizations, has strict rules on a standard requiring the use of patented technology. In the case of IEEE 1588, the preliminary material specifically says that statements by a patent holder that may bear on this standard are on file with the IEEE.

The first five clauses of the standard are fairly conventional. Clauses 1 and 2 define the scope of the standard and list references to other standards cited in IEEE 1588. Clause 3 contains definitions of terms used in the standard. Some of these terms are not in common usage and are unique to this standard (e.g., boundary clock). Others have specific meanings in the standard that may differ from the meaning in other contexts (e.g., subnet). Of particular note is Clause 3.18, which defines the acronym PTP (precision time protocol). PTP is the term used in IEEE 1588 as the name of the protocol defined by the standard. Historically, this term was used during the development of the technology prior to standardization, and was carried over into the text of the standard by the responsible working group. The acronym PTP will be used extensively throughout this book.

Clause 4 defines conventions used in the standard. Implementors should carefully study Clause 4.3 concerning state machines. A number of state machines are defined in the standard, and are to be interpreted as Mealy state transition diagrams. Clause 5 defines the datatypes used in the standard. Since the protocol is concerned with time and streams of messages between clocks, it is inevitable that any finite representation of these will at some point overflow. Thus, datatypes share with finite hardware implementations the potential for severe protocol errors if these rollover properties are not correctly handled by an implementation.

Clauses 6 and 7 constitute the actual operational specification of the PTP protocol. All relevant concepts, parameters, timing relationships, state behavior, message types, actions, and communication issues such as addressing are covered in Clauses 6 and 7. Clause 8 defines the format of the many PTP messages. Clause 8 includes the datatyping of messages, the required field definitions, and the order of fields within a message, but does not include on-the-wire specifications of the messages. Clauses 6, 7, and 8 are intended to be network-neutral. It is expected that for each network transport of interest, an annex specific to it will be developed. These transport-specific annexes define the addressing scheme, on-the-wire message formats, and any other network-specific manifestations of the general requirements of Clauses 6, 7, and 8 of the body of the standard. In the initial standard [24], the only such annex is Annex D, which defines the operation of PTP for a UDP/IP transport on an 802.3 network.
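The rollover concern raised in the discussion of Clause 5 can be illustrated with a short sketch. It assumes a hypothetical 16-bit message sequence counter; the width, the constant names, and the function names are illustrative and are not taken from the standard. The comparison uses modular arithmetic so that ordering survives a single wraparound, which a plain `>` comparison would not.

```python
# Hypothetical counter width, chosen only for illustration.
SEQ_BITS = 16
SEQ_MOD = 1 << SEQ_BITS
HALF_RANGE = SEQ_MOD // 2


def next_seq(seq: int) -> int:
    """Return the counter value that follows seq, wrapping at SEQ_MOD."""
    return (seq + 1) % SEQ_MOD


def is_newer(candidate: int, reference: int) -> bool:
    """True if candidate was issued after reference, tolerating one wraparound."""
    diff = (candidate - reference) % SEQ_MOD
    return 0 < diff < HALF_RANGE


last = SEQ_MOD - 1          # counter is about to roll over
current = next_seq(last)    # wraps to 0
print(current > last, is_newer(current, last))
```

The naive comparison reports the wrapped value as older, while the modular comparison still treats it as the newer message. The same idea applies to any finite time or counter representation, with the usual caveat that it only works while the two values are less than half the counter range apart.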

3.3 Fundamental Operation of the Protocol

The PTP protocol defined in IEEE 1588 is designed to synchronize clocks in devices in a distributed system. PTP is a distributed protocol. Every device in the system that implements IEEE 1588 executes exactly the same protocol. There is no central authority governing any aspect of the protocol, nor is there any need to configure nodes prior to operation, assuming that the default values of various parameters are instantiated in all IEEE 1588-enabled devices. The entire operation of the protocol is implemented using only information obtainable from the exchange of PTP-defined messages between these clocks.

There are five operational features of the protocol that together allow the synchronization of clocks in a system, and provide sufficient management capability to observe and tailor the system to meet specific application needs. These features are:

• Establishing boundaries and communications for the system to be synchronized,
• Establishing a master-slave synchronization hierarchy,
• Establishing orderly startup and reconfiguration of the system,
• Providing the necessary information to allow slave clocks to correctly synchronize to their master, and
• Providing system and clock management capability when needed by an application.

Each of these points is explored in more detail in the following sections.

3.3.1 System Boundaries and Communications

PTP is designed to operate on packet-based networks that support multicast communications. There are five message types defined by PTP:

• Sync
• Delay_Req
• Follow_Up
• Delay_Resp
• management

Message types Sync and Delay_Req are called “event messages”, since they are used as timing events by the PTP protocol. Message types Follow_Up, Delay_Resp, and management are called “general messages”. Follow_Up and Delay_Resp messages are used to convey timing information. Both event and general messages are to be communicated using a multicast model of communication to enable the self-configuration objective of the standard to be met. Management messages are normally communicated using a multicast model, but in addition are permitted to use a point-to-point model.

Figure 3.1 illustrates a typical IEEE 1588 system topology that allows independent synchronization systems to be maintained on the same communication network. Each system maintains its time scale independently of the others. These independent synchronization systems are called “subdomains”. Subdomains are implemented by defining a namespace, so that each subdomain is distinguished by a subdomain name. All PTP messages contain the name of the applicable subdomain. All interactions, communications, and other features of IEEE 1588 occur within a single subdomain, and are logically independent of similar operations in other subdomains. There may be performance impairments that result from multiple subdomains sharing a common communication fabric. These impairments result from communication or processor loading on system components, and are discussed further in Chapter 5.

The boundary of a subdomain is determined by the underlying communication fabric and how it responds to multicast messages. The PTP protocol will synchronize all clocks in a subdomain that receive and process PTP event and general messages. Therefore, to limit the extent of an IEEE 1588 subdomain, it is necessary to correctly limit the propagation of multicast messages in the underlying communication fabric. This may be done physically by isolating the subdomain, or logically by proper configuration of routers, switches, and similar network equipment.

A system such as the one shown in Figure 3.1 typically contains several end or terminal devices, and several network devices. An end device is a device with only a single network connection. End devices containing a PTP clock are termed “ordinary clocks” in the standard.
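The message classification and the subdomain namespace described in this section can be sketched in a few lines. This is a minimal illustration, not the message layout of Clause 8: the `PtpMessage` class, its two fields, the helper functions, and the subdomain names `lab-a` and `lab-b` are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class MessageType(Enum):
    SYNC = "Sync"
    DELAY_REQ = "Delay_Req"
    FOLLOW_UP = "Follow_Up"
    DELAY_RESP = "Delay_Resp"
    MANAGEMENT = "management"


# Event messages are the ones used as timing events; the rest are general messages.
EVENT_TYPES = frozenset({MessageType.SYNC, MessageType.DELAY_REQ})


@dataclass(frozen=True)
class PtpMessage:
    """Hypothetical, heavily reduced view of a PTP message."""
    message_type: MessageType
    subdomain: str  # every PTP message carries its subdomain name


def is_event(msg: PtpMessage) -> bool:
    """True for event messages, False for general messages."""
    return msg.message_type in EVENT_TYPES


def accept(msg: PtpMessage, local_subdomain: str) -> bool:
    """A clock only processes messages that belong to its own subdomain."""
    return msg.subdomain == local_subdomain


received = [
    PtpMessage(MessageType.SYNC, "lab-a"),
    PtpMessage(MessageType.FOLLOW_UP, "lab-a"),
    PtpMessage(MessageType.SYNC, "lab-b"),
]
for msg in received:
    if accept(msg, "lab-a"):
        kind = "event" if is_event(msg) else "general"
        print(msg.message_type.value, kind)
```

Filtering on the subdomain name is what keeps the synchronization systems logically independent even though all of the multicast traffic shares one network.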

Figure 3.1. Typical system of network devices, boundary clocks, and ordinary clocks

Depending on the network technology, a network may also contain nonterminal devices such as routers, switches, and repeaters. These devices are called network devices, and may or may not contain specialized IEEE 1588 functionality. These devices serve to pass a network message received on one port to one or more of the remaining ports of the device. Routers and switches make these forwarding decisions based on addressing information contained in protocol headers of the messages. Repeaters simply forward the message to the other ports on the device. The standard refers to any end or network device that can issue or receive PTP messages as a node (Clause 3.13 [24]). Strictly speaking, any end or network device performs this function, not only devices that support IEEE 1588. Since devices that do not implement the protocol have no direct effect on the protocol, no confusion arises. However, such non-IEEE 1588 devices may adversely influence the performance of the protocol. These performance impairments will be discussed in more detail in Chapter 5.

Network devices containing specialized IEEE 1588 functionality are called boundary clocks. The PTP protocol uses boundary clocks to logically segment the physical network to create a master-slave synchronization hierarchy of clocks. In the general case, a node may be connected to a boundary clock or a network device. The network, depicted by the cloud in Figure 3.1, may contain a mixture of boundary clocks and network devices, depending on the complexity of the application and the network technology. Note that not all potential network technologies have the equivalent of routers and switches. A network in such a technology would appear to PTP as a set of ordinary clocks communicating directly with each other in a multicast manner. An example of such a network would be a set of nodes that communicate using CAN.⁴

3.3.2 Master-slave Synchronization Hierarchy

All clocks in IEEE 1588 systems are organized into a master-slave hierarchy. In operation, the first task of the PTP protocol is to logically configure network communications used by IEEE 1588 into a tree structure supporting this hierarchy (see Clause 6.2.2.1 [24]). At the root of this hierarchy is a clock that the standard terms the “grandmaster clock”. The time scale of the entire IEEE 1588-enabled system will be traceable to the time scale of the grandmaster clock. PTP thus produces a common sense of time throughout the system. If the time scale of the grandmaster clock is UTC, then the entire system will have a UTC time scale. Otherwise, the time scale is relative to the epoch established by the grandmaster. Each boundary clock creates a branch point in the hierarchy tree, while the leaves of the tree are formed by ordinary clocks. Network devices that are not boundary clocks do not establish branch points in the hierarchy, and except for performance degradation, are invisible to the protocol.

Figure 3.2 illustrates several topological possibilities. The PTP master-slave hierarchy is established on top of whatever underlying communication topology is present. Thus, in Figure 3.2, if the network connections represented by the lettered lines are all present, then PTP will seek to logically partition the PTP message traffic in such a way as to produce a tree structure supporting the master-slave hierarchy. Many underlying communication systems run a minimum spanning tree or similar protocol, and will present a tree-structured communication topology to the PTP algorithms. The spanning tree algorithm used in Ethernet systems is defined in IEEE 802.1D [31]. A very complete discussion of spanning tree algorithms can be found in Perlman [32].

⁴ CAN (controller area network) is used in the automotive and automation industries. See [30] for additional details on CAN networks.

Figure 3.2. General network topology (ordinary clocks 1, 2, and 8 to 12; boundary clocks 3, 5, and 6; network devices 4 and 7; links A to H, J to N, and P)

An example of such a spanning tree structure is illustrated in Figure 3.2, if the segmented lines G, H, and J are removed. However, the PTP protocol assumes that in general the underlying communication is not tree-structured, and will present multiple paths between PTP nodes. Network devices such as an ordinary Ethernet switch not implementing IEEE 1588, but multicasting packets to all ports are invisible to the protocol. Thus, the network of Figure 3.2 would actually appear to PTP as illustrated in Figure 3.3. The link G of Figure 3.2 would not be visible to PTP, except possibly by observing fluctuations in latency due to the selection of the path between nodes, for example, between nodes 6 and 1. PTP functionality in nodes 6 and 3 would see nodes 1, 2, 11, and 12 as though they were on a multi-drop link, or connected via a repeater rather than a switch. PTP uses a device called a boundary clock to logically segment a network into a minimum spanning tree, thus reducing cyclic graphs, as shown in Figure 3.3, to acyclic graphs as illustrated in Figure 3.4. This is accomplished by removing one or more communication links, in this case links C and E. This network segmentation is implemented in PTP by means of an algorithm called the best master clock algorithm. This is a modified spanning tree algorithm, and operates based on information contained in the Sync messages. The details on how the best master clock algorithm works will be discussed in Chapter 4. In Figure 3.4, clock-5 is presumed to be the best clock in the system. The best clock in the system is termed the grandmaster clock in IEEE 1588. PTP places the grandmaster clock, in this case clock-5, at the root of the minimum spanning tree. Branch points in the tree occur only at boundary clocks. The

Figure 3.3. Topology of Figure 3.2 as it appears to PTP

actual PTP topology is determined by the state of each clock port in the system. PTP defines a number of states for each port of a clock. Ordinary clocks have a single port, while boundary clocks have multiple ports. The states relevant to this discussion are master, slave, and passive, indicated in Figures 3.4 and 3.5 by M, S, and P in the boxes representing the various clocks. In the example of Figure 3.4, all four ports of the grandmaster clock, clock-5, are in the master state. The single port of each ordinary clock directly connected to clock-5 is in the slave state, for example, clock-9 connected by path L. The PTP spanning tree algorithm then prunes the network so that the number of boundary clocks between any ordinary clock and the grandmaster is reduced to a minimum. This is done in a completely distributed manner by the PTP algorithm executing independently on the remaining boundary clocks, i.e., clock-6 and clock-3 in this example. These boundary clocks place one or more of their ports in the passive state, with the single port closest to the grandmaster in the slave state. Synchronization takes place only between pairs of clock ports in which one port is in the master state and the other is in the slave state. The passive state is used by PTP to create the spanning tree. Thus, in Figure 3.4, the ports of clock-3 connected to paths C and E are in the passive state.

The spanning tree algorithm works differently if there are two or more clocks in the system that are determined to be equivalent and, in addition, are accurately traceable to UTC. Figure 3.5 illustrates this topology. In this example, clock-8 and clock-12 are both designated as grandmaster clocks. The algorithm will then logically segment the network into disjoint subnetworks by

Figure 3.4. Topology of Figure 3.3 pruned by the boundary clocks (clock-5 is marked as the best clock; each port is labeled M, S, or P)

using the passive port state of selected boundary clocks. Each of these disjoint subnetworks will contain exactly one grandmaster clock. In this example, both clock-6 and clock-3 have ports in the passive state, effectively removing paths E, D, and H from the network seen by PTP.

The internal architecture of a boundary clock is illustrated in Figure 3.6. The figure shows a three-port boundary clock. Each port is effectively serviced by a multiplexer that separates PTP traffic from all other network traffic. Non-PTP traffic is directed to the usual switch or router functionality, while PTP traffic is directed to the block labeled “IEEE 1588 Functions” that implements the PTP boundary clock functions. This block maintains a separate section for each port, to allow each port to be in a different state. Generally, there is a single clock in the PTP block serving all ports. If one of the ports is in the slave state, then this clock will synchronize to a master clock connected to that port. If the clock in the boundary clock is selected as the grandmaster, then clocks connected to any of its ports will generally synchronize to the internal clock of the boundary clock. The exception to this is if one of the boundary clock ports, or the port of a clock connected to a boundary clock port is in

Figure 3.5. Topology of Figure 3.3 with two grandmaster clocks (clock-8 and clock-12 are marked as best clocks; each port is labeled M, S, or P)

the passive state for the purpose of segmenting the network, as discussed in connection with Figure 3.5. For example, in Figure 3.5:

• Clock-11 synchronizes to clock-12,
• Clock-6 (the internal clock of boundary clock-6) synchronizes to clock-12 via the port on clock-6 that is in the slave state,
• Clock-10 synchronizes to the internal clock of boundary clock-6 via a port on clock-6 that is in the master state, and
• Clock-6 does not synchronize to clock-5 because the port on clock-6 servicing link-H is in the passive state.
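The rule behind these examples, that synchronization takes place only between a port in the master state and a port in the slave state, can be written as a small sketch. The `PortState` enum and the `sync_direction` helper are hypothetical names, and the sketch looks at one pair of ports at a time. The state of the clock-5 port on link H is an assumption made for the example; the text states only that the clock-6 end is passive.

```python
from enum import Enum


class PortState(Enum):
    MASTER = "M"
    SLAVE = "S"
    PASSIVE = "P"


def sync_direction(end_a, end_b):
    """Given (clock, state) for two ports, return (master, slave) or None.

    None means no synchronization takes place between this pair of ports.
    """
    (clock_a, state_a), (clock_b, state_b) = end_a, end_b
    if state_a is PortState.MASTER and state_b is PortState.SLAVE:
        return clock_a, clock_b
    if state_a is PortState.SLAVE and state_b is PortState.MASTER:
        return clock_b, clock_a
    return None


pairs = [
    (("clock-12", PortState.MASTER), ("clock-6", PortState.SLAVE)),
    (("clock-10", PortState.SLAVE), ("clock-6", PortState.MASTER)),
    # Assumed: clock-5 end of link H is shown as master for illustration only.
    (("clock-5", PortState.MASTER), ("clock-6", PortState.PASSIVE)),
]
for end_a, end_b in pairs:
    print(end_a[0], end_b[0], sync_direction(end_a, end_b))
```

A passive port at either end yields no synchronization relationship, which is how the boundary clocks remove a path from the hierarchy without touching the physical network.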

The mechanism for determining which end of a link (for example, link-E of Figure 3.5) is in a passive, and which is in a master state for purposes of segmenting the network will be discussed in detail in Chapter 4. One of the key features of a boundary clock is that it does not forward Sync, Delay_Req, Follow_Up, or Delay_Resp messages. Instead, each port of a boundary clock intercepts these four message types. To devices connected

Figure 3.6. Internal view of a boundary clock (an “IEEE 1588 Functions” block containing the clock and sections PTP-1 to PTP-3; a “Switch Functions” block; one MUX for each of Port-1 to Port-3)

to each of these ports, the ports appear as an ordinary clock. However, the timestamps issued by the ports are based on the internal clock of the boundary clock, thus creating the clock hierarchy. A boundary clock will forward management messages so that a complex network of clocks may be managed using the multicast communication model. This forwarding is limited to a finite number of boundary clock retransmissions by a field in the management message. This feature is provided to contain the management messages within the IEEE 1588 system.

3.3.3 Startup and Reconfiguration

One of the objectives of IEEE 1588 is to enable administration-free systems. This means that there must be some automatic mechanism to handle occurrences such as a power-up of the entire system or of a single clock, the appearance or disappearance of a clock in the system, a change in the capabilities of a clock in the system, or a change in the underlying communication topology. Furthermore, this must be done in an autonomous and completely distributed manner, and in such a way as to minimize any disruption to an operating IEEE 1588 system. The mechanism for achieving this is the best master clock algorithm discussed in Section 3.3.2, coupled with timing constraints on state changes within each clock. The best master clock algorithm is based on the ability of pairs of clocks to decide which clock is better, based on information contained in Sync messages. The timing constraints derive from the rate at which Sync messages are sent, and rules on when they are to be processed by recipients. These rules are embodied in the PTP state machine (Clause 7.3 [24]). The combined effect of the best master clock algorithm, timing constraints, and state machine rules on startup and reconfiguration is best understood by considering several operational cases based on the system illustrated in Figure 3.7 and the greatly simplified version of the PTP state machine in Figure 3.8.

Figure 3.7. Simple network of clocks (boundary clock-1 and ordinary clocks 2, 3, and 4 attached to net-A)

Figure 3.7 shows a boundary clock-1, and three ordinary clocks 2, 3, and 4, each with a port connected to the multicast network, net-A. Each port executes an independent copy of the entire PTP protocol, including the state machine. In all cases, the descriptions convey the general outline and result of the operation of PTP. The exact details are considerably more complex.

Figure 3.8. Simplified PTP state machine (states Listening, Master, and Slave; transitions labeled A to E)

Case 1

In this case, all clocks are powered down and a single clock, clock-2, powers up. Clock-2 enters the listening state via transition E of the state machine in clock-2. Clock-2 listens for Sync messages from other clocks for a wait interval determined by a protocol parameter. In this case, clock-2 is the only active clock, and at the end of this interval will transition to the master state via transition A and will then begin transmitting Sync messages at a rate defined by the synchronization interval. The synchronization interval is set by a parameter termed the “sync interval”. The default value of the sync interval is 2 s. The wait interval is named the PTP_SYNC_RECEIPT_TIMEOUT interval, and is 10 times the sync interval. Thus, in this example, clock-2 will wait 20 s before the transition to the master state. The length of the startup interval is a compromise between rapid response and thrashing due to missing or corrupt Sync messages.

Case 2

In this case, clock-2 is in the master state, transmitting Sync messages when clock-3 is powered up and enters the listening state. If, after receiving at least two Sync messages from clock-2, clock-3 determines (using the best master clock algorithm) that it is inferior to clock-2, then clock-3 transitions into the slave state via transition B and proceeds to synchronize to clock-2. On the other hand, if clock-3 determines that it is better than clock-2, it transitions into the master state via transition A and will then start to transmit Sync messages. When clock-2 has received two of these messages, it will determine (using the best master clock algorithm) that it is inferior to clock-3, and will transition into the slave state via transition D and proceed to synchronize to clock-3.

Case 3

In this case, all clocks have been on for some time, with clock-4 the master and all others slaves. What happens if clock-4 is turned off or disconnected? This will result in the cessation of Sync messages being received by the other clocks. The remaining clocks 1, 2, and 3 do nothing during the wait interval, PTP_SYNC_RECEIPT_TIMEOUT, and then transition to the master state via transition C and start transmitting Sync messages. This will result in each of the three clocks receiving Sync messages from two other clocks. Each clock examines the Sync messages from the other two clocks, and determines (using the best master clock algorithm) whether it is a better or inferior clock. If it finds that it is inferior to any of the other clocks, it transitions to the slave state and proceeds to synchronize to that clock. Since this process occurs in all clocks, eventually only a single clock will be left in the master state, and all others will be in the slave state.


Case 4

This is identical to Case 3, except that instead of clock-4 being turned off, the network connection between clock-2 and clock-3 is severed. Since clock-3 is still receiving Sync messages from clock-4 (its master), no change occurs in either clock-3 or clock-4. Clock-1 and clock-2 cease receiving Sync messages from clock-4 due to the break in the network. After the wait interval, they both transition into the master state, and start to transmit Sync messages. As in previous cases, one of them will determine that it is inferior based on the received Sync messages, and will transition into the slave state. The net result is that after the break in the network and the passage of enough time for all of the transitions to occur, there will be two disjoint networks, each with a master clock.

These examples convey the general outline of the interaction of the best master clock algorithm, timing constraints, and the PTP state machine in a simple system either with no boundary clocks or when the boundary clocks have a connection to only a single port. The situation is considerably more complex in networks with multiple boundary clocks, or with active connections to the ports of a single boundary clock (for example, Figure 3.4). Recall that a boundary clock segments a complex network. The boundary clock treats each port separately by transmitting PTP event and general messages based on the state of the port. Synchronization of the several ports of a boundary clock is done internally. For example, in Figure 3.4 links D, F, and H all operate independently, as discussed in the four cases, with the ports on the boundary clock appearing to other ports on the same link as an ordinary clock. Understanding the internal operation of the boundary clock in determining its state, and therefore whether it transmits or receives Sync messages on a given port, requires the detailed analysis of the best master clock algorithm and PTP state machine given in Chapter 4.
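The cases above can be mimicked with a toy model of the simplified state machine of Figure 3.8. This is a sketch under stated assumptions, not the state machine of Clause 7.3: the `Port` class is hypothetical, the best master clock algorithm is replaced by a single made-up `quality` number (larger is better, no ties), and a port reacts to the first Sync message rather than waiting for two.

```python
from enum import Enum

SYNC_INTERVAL_S = 2       # default sync interval quoted in Case 1
TIMEOUT_MULTIPLE = 10     # wait interval is 10 times the sync interval
RECEIPT_TIMEOUT_S = SYNC_INTERVAL_S * TIMEOUT_MULTIPLE


class State(Enum):
    LISTENING = "listening"
    MASTER = "master"
    SLAVE = "slave"


class Port:
    """One clock port running a toy version of the Figure 3.8 state machine."""

    def __init__(self, name: str, quality: int) -> None:
        self.name = name
        self.quality = quality         # hypothetical stand-in for clock comparison
        self.state = State.LISTENING   # transition E: power-up
        self.silent_s = 0              # seconds since the last Sync was received

    def on_sync(self, sender_quality: int) -> None:
        """Process a Sync message received from another clock."""
        self.silent_s = 0
        if sender_quality > self.quality:
            self.state = State.SLAVE   # transition B (listening) or D (master)
        elif self.state is State.LISTENING:
            self.state = State.MASTER  # transition A: this clock is better

    def tick(self, elapsed_s: int) -> None:
        """Advance time; silence for the whole wait interval leads to master."""
        if self.state is State.MASTER:
            return
        self.silent_s += elapsed_s
        if self.silent_s >= RECEIPT_TIMEOUT_S:
            self.state = State.MASTER  # transition A (listening) or C (slave)
            self.silent_s = 0


clock2 = Port("clock-2", quality=5)
clock2.tick(RECEIPT_TIMEOUT_S)         # Case 1: alone on the network for 20 s
clock3 = Port("clock-3", quality=3)
clock3.on_sync(clock2.quality)         # Case 2: inferior clock joins
print(clock2.state.value, clock3.state.value)
clock3.tick(RECEIPT_TIMEOUT_S)         # Case 3: the master falls silent
print(clock3.state.value)
```

Each port makes its decision using only what it hears on the network and its own timer, which is the property that lets the real protocol reconfigure itself without a central authority.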
3.3.4 Synchronization Overview

One of the primary objectives of IEEE 1588 is to achieve sub-microsecond synchronization accuracy. To achieve this objective, the protocol must:

• Provide one or more events that can be timestamped and used as the basis for computing clock corrections,
• Communicate the needed timestamps to the clocks requiring this information, and
• Overcome timing impairments introduced by the various components of the system.

Figure 3.9 is a simplified diagram of the sequencing and critical timestamps associated with timing messages. The precise timing requirements are found in Clause 7.11 [24], and discussed in detail in Chapter 4.

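3.3 Fundamental Operation of the Protocol

[Figure 3.9: message sequence diagram with a master clock time axis and a slave clock time axis. The master sends a Sync message at t1, which the slave receives at t2 (t2m on the master's time scale), followed by a Follow_Up message containing the value of t1. The slave sends a Delay_Req message at t3 (t3m on the master's time scale), which the master receives at t4 and answers with a Delay_Resp message containing the value of t4. The data available at the slave grows from t2, to t1, t2, to t1, t2, t3, and finally to t1, t2, t3, t4.]

Figure 3.9. Timing diagram for synchronization messages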

The basic sequence in synchronizing a slave clock to a master clock is:

• The master clock sends a Sync message to all directly connected slave clocks. The master clock generates a timestamp t1 based on the master’s local clock, indicating the Sync message sending time at the master clock.
• A slave clock receives the Sync message and generates a timestamp t2 based on the slave’s local clock, indicating the Sync message receipt time at the slave.
• The master clock communicates the Sync message sending timestamp t1 to the slaves as a data field in a Follow_Up message.
• The slave clock sends a Delay_Req message to the master clock. The slave clock generates a timestamp t3 based on the slave’s local clock, indicating the Delay_Req sending time at the slave clock.
• The master clock receives the Delay_Req message and generates a timestamp t4 based on the master’s local clock, indicating the Delay_Req receipt time at the master clock.
• The master clock communicates the Delay_Req receipt timestamp t4 to the slave as a data field in a Delay_Resp message.
• The slave uses the four timestamps t1, t2, t3, and t4 to compute the offset between the slave and master clocks.

The offset between the time as seen by the master clock and a slave is computed based on the four timestamps present in the slave after an exchange of these four timing messages (Clause 7.8 [24]). This computation is based on the following definitions of the quantities illustrated in Figure 3.9:

• The measured difference between the time a Sync message is received at the slave and the time it was sent by the master:

\[ \text{ms difference} = t_2 - t_1 \tag{3.1} \]

• The measured difference between the time a Delay_Req message is received at the master and the time it was sent by the slave:

\[ \text{sm difference} = t_4 - t_3 \tag{3.2} \]

• The actual propagation time for a Sync message traveling between the master and slave clocks:

\[ \text{ms delay} = t_{2m} - t_1 \tag{3.3} \]

• The actual offset between the time observed on the master and slave clocks:

\[ \text{offset} = t_{\text{slave}} - t_{\text{master}} \tag{3.4} \]

• The actual propagation time for a Delay_Req message traveling between the slave and master clocks:

\[ \text{sm delay} = t_4 - t_{3m} \tag{3.5} \]

The following relationships are derived from first principles:

\[ \text{ms difference} = \text{offset} + \text{ms delay} \tag{3.6} \]

\[ \text{sm difference} = -\,\text{offset} + \text{sm delay} \tag{3.7} \]

Rearranging Equations 3.6 and 3.7 yields

\[ \text{offset} = \frac{(\text{ms difference} - \text{sm difference}) - (\text{ms delay} - \text{sm delay})}{2} \tag{3.8} \]

and

\[ \text{ms delay} + \text{sm delay} = \text{ms difference} + \text{sm difference} \tag{3.9} \]
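For readers who want the intermediate step, Equations 3.8 and 3.9 follow from subtracting and adding Equations 3.6 and 3.7. The lines below are a worked restatement using only the quantities already defined:

```latex
\begin{align*}
(3.6) - (3.7):\quad & \text{ms difference} - \text{sm difference}
    = 2\,\text{offset} + (\text{ms delay} - \text{sm delay}) \\
(3.6) + (3.7):\quad & \text{ms difference} + \text{sm difference}
    = \text{ms delay} + \text{sm delay}
\end{align*}
```

Solving the first line for offset gives Equation 3.8. The second line is Equation 3.9.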

Thus, there are two equations (Equations 3.8 and 3.9) involving two measured quantities, ms difference and sm difference, and three unknowns: offset, ms delay, and sm delay. It is clearly impossible to solve Equations 3.8 and 3.9 without additional independent measurements or assumptions. This problem is not confined to IEEE 1588, but is shared by all synchronization protocols based on the exchange of timing information over channels with unknown propagation delays. The assumption used by the PTP protocol (Equation 3.10) is that the two propagation times are equal. If the error caused by this assumption is significant relative to the required synchronization accuracy, then some sort of calibration correction must be applied. This issue will be discussed further in Chapters 4 and 5.

\[ \text{sm delay} = \text{ms delay} = \text{one way delay} \tag{3.10} \]

Combining Equations 3.1, 3.2, 3.8, 3.9, and 3.10 gives

\[ \text{offset} = \frac{\text{ms difference} - \text{sm difference}}{2} = \frac{(t_2 - t_1) - (t_4 - t_3)}{2} \tag{3.11} \]

\[ \text{one way delay} = \frac{\text{ms difference} + \text{sm difference}}{2} = \frac{(t_2 - t_1) + (t_4 - t_3)}{2} \tag{3.12} \]
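Equations 3.11 and 3.12 translate directly into a few lines of code. The following is a minimal sketch in Python, not taken from the standard. The function name, argument names, and example timestamps are illustrative assumptions.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Return (offset, one_way_delay) per Equations 3.11 and 3.12.

    t1 and t4 are read from the master's clock, t2 and t3 from the slave's
    clock. All four values must use the same unit (for example, nanoseconds).
    """
    ms_difference = t2 - t1                               # Equation 3.1
    sm_difference = t4 - t3                               # Equation 3.2
    offset = (ms_difference - sm_difference) / 2          # Equation 3.11
    one_way_delay = (ms_difference + sm_difference) / 2   # Equation 3.12
    return offset, one_way_delay


# Hypothetical symmetric example in nanoseconds: the slave is 500 ns ahead
# of the master and each direction takes 200 ns.
offset, one_way_delay = offset_and_delay(t1=1_000, t2=1_700, t3=5_000, t4=4_700)
```

Because the two assumed delays are equal, the sketch recovers the assumed offset and delay exactly. With unequal delays it would not.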

The effect of path asymmetry on the calculation of clock offset can be seen from the example in Table 3.1. In this example, it is assumed that the slave clock is 1 h ahead of the master, and that the master-to-slave and slave-to-master propagation delays are 30 and 40 min, respectively. In the table, entries enclosed in parentheses are not used in the computations but are for reference only. The computed values for offset and one way delay, 55 and 35 min, respectively, are in obvious disagreement with the assumed offset of 1 h and the assumed delays of 30 and 40 min. The disagreement in each case amounts to half of the asymmetry in the assumed delay values.

Table 3.1. Example of asymmetric delay on offset computation

Quantity        Time at master  Time at slave  Equation  Result
t1              9:00            (10:00)
t2                              10:30
t2m             (9:30)
t3                              11:00
t3m             (10:00)
t4              10:40
ms difference                                  3.1       1:30
sm difference                                  3.2       −0:20
offset                                         3.11      0:55
one way delay                                  3.12      0:35
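The numbers in Table 3.1 can be reproduced by simulating one message exchange. The sketch below is illustrative only: the function names are assumptions, and times are expressed as minutes after midnight.

```python
def exchange(true_offset, ms_delay, sm_delay, t1, t3m):
    """Simulate one Sync/Delay_Req exchange; all values in minutes.

    true_offset is slave time minus master time. t1 and t3m are the
    master-time instants at which the Sync and Delay_Req messages are sent.
    """
    t2 = t1 + ms_delay + true_offset   # Sync receipt, read on the slave clock
    t3 = t3m + true_offset             # Delay_Req sending, read on the slave clock
    t4 = t3m + sm_delay                # Delay_Req receipt, read on the master clock
    return t1, t2, t3, t4


def estimate(t1, t2, t3, t4):
    """Apply Equations 3.11 and 3.12 to the four timestamps."""
    offset = ((t2 - t1) - (t4 - t3)) / 2
    one_way_delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, one_way_delay


# Table 3.1: slave 60 min ahead, 30 min master-to-slave, 40 min slave-to-master.
t1, t2, t3, t4 = exchange(true_offset=60, ms_delay=30, sm_delay=40,
                          t1=9 * 60, t3m=10 * 60)
offset, one_way_delay = estimate(t1, t2, t3, t4)
offset_error = 60 - offset             # half of the 10 min asymmetry
```

The computed offset of 55 min differs from the assumed 60 min by 5 min, which is half of the difference between the 40 min and 30 min delays.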

It is clear that the timestamps associated with sending and receiving Sync and Delay_Req messages are critical to the operation of the protocol. Synchronization accuracy will depend in part on the accuracy and repeatability of timing information derived from these timestamps.

[Figure 3.10: block diagram of an IEEE 1588 clock (application code, IEEE 1588 code, operating system and protocol stack, MII, and physical layer (PHY)) connected by a network cable to an Ethernet switch (PHY, MII, switch fabric and queues, MII, PHY). The impairment sources marked are OS, PHY, Cable, and Switch.]

Figure 3.10. Timing impairments of system components

Figure 3.10 illustrates the principal sources of timing impairments for a typical Ethernet-based clock and switch. Approximate values for the impairments in Figure 3.10 for a 10/100 BaseT Ethernet system are shown in Table 3.2. The impairments from operating systems arise from queues in the protocol stack, interrupt service routine timing differences, context switches, and load- and application-specific variation in code execution times. Switch impairments arise from input and output queuing and variation in the access patterns to the switch fabric. Switches also have an additional impairment, in that the paths taken by the Sync and Delay_Req packets may not be the same, thereby introducing asymmetry. The PHY layer fluctuations and asymmetry arise primarily from the operation of the phase-locked loops in the receive path that are used to recover the received clock signal. Cable-induced fluctuations are typically small, and arise from microphonics and similar causes. Asymmetry in some cables, for example, the CAT-5 cable often used in Ethernet, is intentionally introduced to reduce crosstalk between wire pairs. The values for switch impairments, and to some extent operating system impairments, are network traffic-dependent. Similar timing impairments are introduced by the components of all transport technologies.

Table 3.2. Approximate Ethernet timing impairments

Quantity  Delay         Fluctuations  Asymmetry
OS        0.1–3,000 μs  0.1–3,000 μs  …
PHY       < 50 ns       < 1 ns        …
Cable     3 ns/m        …             …
Switch    0.4–3,000 μs  …             …

[Figure 4.15: flowchart continuing the DCA clock comparison. Its recoverable elements are the preferred-clock resolution (whether exactly one of the grandmasters of A and B is a preferred clock and, if so, whether it is the grandmaster of A), the results "Return A better than B" and "Return B better than A", and exits to the stratum > 2 variance comparison and to the path length comparisons.]

Figure 4.15. DCA clock comparison, part 2 (This figure is adapted from IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems, Copyright 2002, by IEEE. All rights reserved.)

4 A Detailed Analysis of IEEE 1588

[Figure 4.16: flowchart of the variance comparison for stratum numbers greater than 2. Its decision blocks test whether the GMVariance of A is similar to the GMVariance of B, whether the GMVariance of A is better than the GMVariance of B, whether exactly one of the grandmasters of A and B is a boundary clock and whether it is the grandmaster of A, and whether the GMUUID of A is less than the GMUUID of B. The outcomes pass through Preferred Resolution.]

Figure 4.16. DCA clock comparison, part 3 (This figure is adapted from IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems, Copyright 2002, by IEEE. All rights reserved.)

4.2 Master-slave Synchronization Hierarchy


If the data set references are found to reference the same clock at block-2 of Figure 4.14, the remainder of the comparison is based on path length and will be discussed later in this section. Figure 4.14 is typical of the logic used in comparing the attributes of the referenced clocks. The attributes are resolved in the precedence order listed above for the stratum number and identifier. The first difference in these two attributes detected, for example, at blocks 4 and 6 of Figure 4.14 or blocks 2 and 3 of Figure 4.15, results in immediate evaluation of the Preferred Clock attribute of the two clocks, as shown in the Preferred Resolution section of Figure 4.14. This Preferred Clock attribute is used as the basis for the final decision between two clocks otherwise perceived as different, based on one of the earlier attributes. The result is that if only one of two referenced clocks is preferred, it will be deemed better irrespective of any earlier comparisons. The concept of a set of preferred clocks allows system integrators to designate a set of one or more clocks from which the grandmaster will be selected. For example, in systems where devices are subject to removal for maintenance, it would be useful to have a special device that is not subject to frequent removal being designated as the grandmaster. For the case in which the stratum number is greater than 2, block-5 of Figure 4.15 causes the next test to be made on the variances, as shown in Figure 4.16. The variance comparison in block-2 of Figure 4.16 is a similarity comparison. In the case of variances, two clocks are deemed similar if their variances agree within a defined tolerance. Unlike stratum number and identifier, the variance may change frequently in implementations that compute it in real-time, rather than providing a fixed value. The similarity comparison is used to prevent thrashing due to minor statistical variations that may occur. 
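The two ideas in this passage, preferred-clock resolution and the similarity test on variances, can be sketched as follows. This is a hedged illustration rather than the logic of the standard: the function names and the `tolerance` parameter are hypothetical, and the rule that the earlier result stands when both or neither clock is preferred is an assumption drawn from the description above.

```python
def variances_similar(variance_a, variance_b, tolerance):
    """True when the two variances agree within the given tolerance.

    `tolerance` is a hypothetical parameter. The standard defines the actual
    similarity criterion, which is not reproduced here.
    """
    return abs(variance_a - variance_b) <= tolerance


def preferred_resolution(a_preferred, b_preferred, earlier_result):
    """Apply the Preferred Clock attribute on top of an earlier comparison.

    earlier_result is "A" or "B": the clock favored by the first attribute
    that differed. If exactly one clock is preferred, it wins outright.
    Otherwise (assumed here) the earlier result stands.
    """
    if a_preferred != b_preferred:          # exclusive or
        return "A" if a_preferred else "B"
    return earlier_result


# B is the only preferred clock, so it wins although A won the earlier test.
winner = preferred_resolution(a_preferred=False, b_preferred=True, earlier_result="A")
```

Treating nearly equal variances as similar means that small statistical changes in a computed variance do not flip the outcome from one comparison to the next.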
If the variances are similar, the next comparison is based on the boundary clock attribute. The final resolution is the tiebreaker at block-6 of Figure 4.16 based on UUID. Thus, comparisons on data sets referencing two different clocks with stratum numbers greater than 2 will always result in one or the other being deemed better, rather than better by path length. In this case, an examination of the SDA of Figure 4.5 shows that unless a cyclic topology exists that causes block-3 of Figure 4.14 to be invoked, there is no possibility of a port being in the PTP PASSIVE state. Thus, provided the logic following block-3 is handled correctly, there is no way a subdomain will be partitioned into disjointed regions unless there are at least two clocks with stratum numbers of 1 or 2. This explains why the examples discussed in case 2 of Section 4.2.3 resulted in a single tree structure, while the examples of case 3 resulted in disjointed trees. The logic of block-3 of Figure 4.14 is invoked when the underlying communication topology is cyclic, thereby allowing multiple Sync messages referencing the same grandmaster clock to reach a port.

As noted, clocks with stratum numbers of 1 or 2 are treated differently. The reason is that clocks with this attribute are extremely accurate and precise, and there is no reason to prefer one over the other. With this in mind, the SDA and DCA were designed to partition a subdomain containing multiple stratum 1 or 2 clocks into compact regions, each containing a single such clock. There is a subtle interaction between the SDA and the DCA algorithms, in the case where the stratum number is 1 or 2, that results in this partitioning. Recall that partitioning can happen only at a port that is in the PTP PASSIVE state. From Figure 4.5, this can happen in only two circumstances:

• At block-3 of the SDA. This comparison is between the data set D0 describing the stratum 1 or 2 clock for the port in question and a received Sync message. If D0 wins the comparison, then the port will be in the PTP MASTER state, and the partitioning will occur elsewhere as a result of the second circumstance. If D0 loses the comparison, the port will be in the PTP PASSIVE state and will result in partitioning.
• At block-10 of the SDA. The comparison of block-10 occurs only between two Sync messages received at two different ports of a boundary clock. If the comparison is not based on path length, then the port will be in the PTP MASTER state and some port, not the one represented by Erbest, will be in the PTP SLAVE state as a result of block-9. If the comparison at block-10 is based on path length, then the port receiving Erbest will be in the PTP PASSIVE state, which results in partitioning.

Block-3 of Figure 4.14 and block-11 of Figure 4.15 both lead to logic that compares two data sets A and B based on path length, as is illustrated in Figures 4.17, 4.18, 4.19, and 4.20. This logic introduces an attribute named “steps removed” (Clause 7.4.3.1 [24]). The steps removed attribute is a measure of how many communication paths have been traversed between a clock and the grandmaster clock, and is a member of the current data set of the clock. For a clock directly connected to the grandmaster, i.e., no intervening boundary clocks, the value of steps removed will be 1. For the grandmaster clock, the value will be 0. This attribute is updated as a result of the SDA only at block-5, block-7, and block-8 (see Figure 4.5). All other blocks of the SDA leave the attribute unchanged. Block-8 occurs when a clock port becomes a slave. The update is defined in Table 21 of Clause 7.6.5 [24]. Table 21 specifies that the new value of steps removed equals the value of the localStepsRemoved field of Ebest incremented by 1. When a Sync message is sent by a master port, the value of the localStepsRemoved field is set to the steps removed value of the current data set of the clock issuing the Sync message (Clause 8.3.1.16 [24]). This mechanism results in the steps removed attribute of the current data set increasing by 1 as each successive path between clocks is traversed. The value of steps removed in a clock is set to 0 as a result of either block-5 or block-7 of the SDA. This update is specified in Table 18 of Clause 7.6.5 [24]. Both block-5 and block-7 result from the comparison of the default data set of a clock, with Ebest causing the clock port to enter the PTP MASTER state. This can occur only if the clock is also the grandmaster clock of a subdomain.
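The way steps removed grows along a chain of clocks can be illustrated with a small sketch. The `Clock` class, the dictionary used for the Sync message, and the clock names are assumptions made for illustration. Only the two update rules come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Clock:
    name: str
    steps_removed: int = 0   # current data set attribute; 0 at the grandmaster


def make_sync(master):
    """A master port copies its clock's steps removed into localStepsRemoved."""
    return {"sender": master.name, "localStepsRemoved": master.steps_removed}


def become_slave(clock, ebest):
    """Slave update: steps removed = localStepsRemoved of Ebest + 1."""
    clock.steps_removed = ebest["localStepsRemoved"] + 1


gm = Clock("GM")            # grandmaster
bc1 = Clock("BC-1")         # boundary clock directly connected to the grandmaster
bc2 = Clock("BC-2")         # boundary clock one further path away
become_slave(bc1, make_sync(gm))
become_slave(bc2, make_sync(bc1))
```

After these two updates the grandmaster still holds 0, the directly connected clock holds 1, and the next clock holds 2, matching the values stated in the text.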

[Figure 4.17: flowchart of the first part of the path length comparison. Block-2 tests whether steps removed of A is within 1 of steps removed of B. If not, block-3 returns the data set with fewer steps removed as better (blocks 4 and 5). Otherwise block-6 tests whether the two values are equal. Unequal values lead through block-7 to "path length A closer than B" (block-8) or "path length B closer than A" (block-9). Equal values lead to block-10, which compares the UUIDs of the senders of A and B: identical senders lead to "path length same sender" (block-11), otherwise block-12 returns the data set with the smaller sender UUID as better by path length (blocks 13 and 14).]

Figure 4.17. DCA path length comparison, part 1 (This figure is adapted from IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems, Copyright 2002, by IEEE. All rights reserved.)

[Figure 4.18: flowchart for the "path length same sender" branch. Block-2 tests whether A and B were received on the same port number. If not, block-3 compares the receiving port numbers and returns A or B as better by path length (blocks 4 and 5). If the receiving ports are the same, blocks 6 to 12 compare GMSequenceId and sequenceId and return either "A same message as B" (block-8) or A or B as better by update.]

Figure 4.18. DCA path length comparison, part 2 (This figure is adapted from IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems, Copyright 2002, by IEEE. All rights reserved.)

[Figure 4.19: flowchart for the "path length A closer than B" branch. Block-2 tests whether the UUID of the receiver of B equals the UUID of the sender of B, followed by a test of whether the port number receiving B equals the port number sending B. If both hold, block-5 returns "message from self, ignore". The remaining blocks compare the receiver and sender UUIDs of B (block-3) and the receiving and sending port numbers of B (block-6), and return either "A better than B by path length" (block-7) or "A better than B" (block-8).]

Figure 4.19. DCA path length comparison, part 3 (This figure is adapted from IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems, Copyright 2002, by IEEE. All rights reserved.)

The logic following block-3 of Figure 4.14 and block-11 of Figure 4.15 is discussed as three separate cases. Recall that in applying the SDA at port r of a clock, only the most recent Sync message received from another clock port is to be considered. There is also a requirement that Sync messages received at port r that were also sent from port r are to be excluded (Clause 7.5.2 [24]). In the case of a boundary clock, this clause also requires that Sync messages received at port r sent by another port on the same boundary clock are to be excluded. The path length comparisons are not required for all computations. Table 4.3 summarizes the various conditions under which a path length comparison is required based on the operation of the DCA in Figures 4.14, 4.15, and 4.16.
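The exclusion rules described above (only the most recent Sync message from each sending port, and nothing sent by the receiving clock itself) can be sketched as a small filter. The `Sync` record, its field names, and the clock identifiers are hypothetical. Sequence-number wraparound is ignored for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sync:
    sender_clock: str    # hypothetical identifier of the sending clock
    sender_port: int     # port number of the sending port
    sequence_id: int     # larger means more recent (wraparound ignored)


def qualified_syncs(received, own_clock):
    """Return the Sync messages to be considered at one port of own_clock."""
    latest = {}
    for msg in received:
        # Exclude messages sent by this port or by another port of the same clock.
        if msg.sender_clock == own_clock:
            continue
        key = (msg.sender_clock, msg.sender_port)
        # Keep only the most recent message from each sending port.
        if key not in latest or msg.sequence_id > latest[key].sequence_id:
            latest[key] = msg
    return list(latest.values())


received = [
    Sync("clock-2", 1, 41),
    Sync("clock-2", 1, 42),   # supersedes the message with sequence_id 41
    Sync("clock-1", 2, 7),    # sent by another port of the receiving clock
    Sync("clock-3", 1, 9),
]
s_r = qualified_syncs(received, own_clock="clock-1")
```

In this example the filtered set contains one message from clock-2 (sequence 42) and one from clock-3.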

[Figure 4.20: flowchart for the "path length B closer than A" branch. It mirrors Figure 4.19 with the roles of A and B exchanged: the receiver and sender UUIDs and port numbers of A are compared, and the results are "message from self, ignore" (block-5), "B better than A by path length" (block-7), or "B better than A" (block-8).]

Figure 4.20. DCA path length comparison, part 4 (This figure is adapted from IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems, Copyright 2002, by IEEE. All rights reserved.)

Table 4.3. Computations requiring path length consideration in the DCA

Condition                               Topology             Erbest                                 Ebest            D0 vs. Ebest or Erbest
A and B have same GMUUID                Cyclic or multicast  Always required                        Always required  Never occurs
Attributes of A better than B           Any                  Never occurs                           Never occurs     Never occurs
A same capability as B, stratum 1 or 2  Any                  Only for cyclic or multicast topology  Always required  Always required


Case 1: Selecting Erbest at Port r Based on Path Length

The computation of Erbest at port r requires successive pair-wise comparison of the set Sr of Sync messages received at port r during the preceding sync interval, as illustrated at port-3 in Figure 4.4. The results of the previous computation of Erbest are also included under certain circumstances, as discussed in Section 4.2.2. The makeup of this set of Sync messages depends on the topology adjacent to port r. The two possibilities are illustrated in Figure 4.21. Figure 4.21A shows the ports of two clocks directly connected. The set Sr consists of, at most, two Sync messages received from port p on clock-2. This is because the sync interval is a subdomain-wide parameter (Clause 6.2.5.5 [24]). Normally, Sr consists of a single message, but it is possible that minor discrepancies between the interval timing in the two clocks can occasionally result in a second message being included in the set. However, in this case the requirement that only the most recent be considered will remove one of them.

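[Figure 4.21: two topologies adjacent to port r of clock-1. In A, port r is connected directly to port p of clock-2. In B, port r is connected through a multicast medium (for example, an Ethernet switch) to port p of clock-j and port p of clock-k. The remaining ports of the clocks are labeled s and q.]

Figure 4.21. Topologies adjacent to a clock port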

Figure 4.21B shows port r of clock-1 connected to a multicast medium. This allows Sync messages from multiple clock ports to reach port r in the same sync interval, unless some underlying spanning tree protocol has reduced the connectivity of the multicast medium, as discussed in Section 3.3.2. In general, the set Sr will contain Sync messages from several clocks. It is possible
that one of these messages may be from a different port on clock-1, although these must be discarded as discussed earlier. It is also possible that multiple copies of the same message can be received. In this case, the set Sr will consist of several Sync messages, possibly with duplicates, but with only the most recent message from a given clock port and with no messages from another port on the clock owning port r. Over multiple sync intervals, the size of set Sr will decrease, since all but at most one of these clocks will eventually enter either the PTP SLAVE or PTP PASSIVE states as a result of the operation of the best master clock algorithm in each of the clocks. The crucial point in this process is when Sr at port r of clock-1 contains only Sync messages from the one remaining foreign or current master clock, clock-k, and a similar situation holds for the set Sk . At this point, the DCA executing on the ports of the two clocks must agree on which Sync messages are deemed to represent the better grandmaster clock. The path length comparisons start at block-2 of Figure 4.17. From Table 4.3, this path length computation can occur only if data sets A and B have the same value of GMUUID, or if both A and B represent clocks that have the same stratum number and identifiers, and the stratum numbers are 1 or 2. If the steps removed field of the two Sync messages differ by more than 1, block-3 of Figure 4.17 deems the message with the smaller value of steps removed to be absolutely better. Remember that the goal is to ensure that the number of boundary clocks between any clock and the grandmaster is minimal. A topology with path differences of 2 or more can always be reduced to a difference of 1 by moving one or more boundary clocks along the longer path. At block-6 of Figure 4.17, the two paths differ by at most 1. If the paths are of equal length, then the tie is broken, based on other factors. 
The first tiebreaker, blocks 10 and 12, is to compare the UUIDs of the sending ports. If the sending port UUIDs are unequal, the data set with the smaller value is deemed better by path length. If the sending UUIDs are equal, then both messages emanate from the same clock port (see footnote 7). In this case, block-11 of Figure 4.17 leads to block-2 of Figure 4.18. Since all the messages in Sr are received at the same port, namely, port r, the decision at block-2 of Figure 4.18 is YES, and the remainder of the logic starting with block-6 of Figure 4.18 determines whether:

• The two messages are duplicates, blocks 7 and 8, or
• One is more recent than the other, and therefore better absolutely.
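The decisions of Figures 4.17 and 4.18 described so far can be gathered into one sketch. It is an illustration under stated assumptions, not the logic of the standard. The `DataSet` fields are hypothetical names, Python tuple ordering stands in for the UUID comparison of the standard, the smaller receiving port number is assumed to win, the larger sequence number is assumed to be the more recent, and wraparound is ignored. The two "closer" results name the branches that continue in Figures 4.19 and 4.20.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSet:
    steps_removed: int
    sender_uuid: tuple      # (technology, uuid_field, port_id_field)
    receiving_port: int
    gm_sequence_id: int
    sequence_id: int


def same_sender(a, b):
    """Figure 4.18: A and B were sent by the same clock port."""
    if a.receiving_port != b.receiving_port:
        # Assumed: the smaller receiving port number wins.
        if a.receiving_port < b.receiving_port:
            return "A better than B by path length"
        return "B better than A by path length"
    key_a = (a.gm_sequence_id, a.sequence_id)
    key_b = (b.gm_sequence_id, b.sequence_id)
    if key_a == key_b:
        return "A same message as B"
    # Assumed: the larger sequence numbers mark the more recent message.
    if key_a > key_b:
        return "A better than B by update"
    return "B better than A by update"


def path_length_compare(a, b):
    """Figure 4.17: compare two data sets by path length."""
    if abs(a.steps_removed - b.steps_removed) > 1:
        if a.steps_removed < b.steps_removed:
            return "A better than B"
        return "B better than A"
    if a.steps_removed != b.steps_removed:
        # Branch names only; the decision continues in Figure 4.19 or 4.20.
        if a.steps_removed < b.steps_removed:
            return "path length A closer than B"
        return "path length B closer than A"
    if a.sender_uuid == b.sender_uuid:
        return same_sender(a, b)
    if a.sender_uuid < b.sender_uuid:
        return "A better than B by path length"
    return "B better than A by path length"


far_apart = path_length_compare(DataSet(3, (1, 5, 1), 1, 10, 20),
                                DataSet(5, (1, 6, 1), 1, 10, 20))
tie_by_uuid = path_length_compare(DataSet(4, (1, 5, 1), 1, 10, 20),
                                  DataSet(4, (1, 6, 1), 1, 10, 20))
newer = path_length_compare(DataSet(4, (1, 5, 1), 1, 10, 21),
                            DataSet(4, (1, 5, 1), 1, 10, 20))
```

Comparing the (GMSequenceId, sequenceId) pairs as tuples follows the order of blocks 6 to 12: sequenceId is consulted only when the GMSequenceId values are equal.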

Returning to the other branch at block-6 of Figure 4.17, the case where the path lengths differ by 1 leads to block-7, and then to block-2 of either Figure 4.19 or Figure 4.20, depending on which path is shorter. The logic is symmetric from this point, so only one branch needs to be examined here. The case where steps removed of A is smaller leads to block-2 of Figure 4.19. There is clearly at least one flaw in this figure and its counterpart, in that a YES decision at block-2 always leads to block-5, since from Table 16 of Clause 7.6.4 [24] the UUID includes the port number. The other branch leads to a return deeming A better or better by path length, either one of which will enable the determination of Erbest. This flaw has no practical consequence since, as illustrated, the logic of block-6 never executes. If the UUID comparison in block-2 is interpreted as not including the port number, then block-6 executes with exactly the same result. Thus, all the logical possibilities of the path length comparisons of Figures 4.14, 4.15, and 4.16 lead to an unambiguous determination of Erbest from the set Sr.

Footnote 7: The UUID is an attribute of the clock port, not the clock (see Table 16, Clause 7.6.4 [24]).

Case 2: Selecting Ebest for a Clock Based on Path Length

The computation of Ebest for a clock, clock-1, requires successive pair-wise comparison of the set Srbest of Sync messages, each representing Erbest received at port r during the preceding sync interval, as illustrated by computation B in Figure 4.4. As discussed in Section 4.2.2, the set may be the empty set. This discussion applies only to boundary clocks, since for an ordinary clock Ebest is always equal to Erbest of its single port. The set Srbest does not include any duplicate or stale Sync messages, or any Sync messages sent from clock-1, since these were eliminated during the computation of each set member Erbest. As in the previous case, this discussion focuses on path length considerations starting with block-2 of Figure 4.17. From Table 4.3, this path length computation can occur only if data sets A and B have the same value of GMUUID, or if both A and B represent clocks that have the same stratum number and identifiers, and the stratum numbers are 1 or 2.
If either of the clocks represented by data sets A or B is better, based on consideration of stratum number, identifier, or in the case where both have stratum numbers of 1 or 2 by a comparison of variances, then one will be deemed better, with no need to consider path length. The selection of Ebest is the key to correctly partitioning a network to remove cyclic communication paths or to isolate multiple stratum 1 and 2 clocks. The logical possibilities are illustrated in Figures 4.22, 4.23, and 4.24. Figures 4.22 and 4.23 show several boundary clocks connected in a cycle. One of the clocks, labeled GM, is deemed to be the best clock in the system, and hence the grandmaster. The unlabeled clocks shown in dashed lines each represent multiple but equal numbers of clocks in series in the path. The clocks labeled A? and B? are the clocks performing the computation for Ebest. The number in the path is the value of steps removed to the grandmaster along the shortest route. Note that as far as the DCA is concerned, the topologies shown are equivalent to the case in which the path is broken at the GM, but with equally capable stratum 1 or 2 clocks terminating the breaks, as illustrated by the dashed clocks and connections at the top of each figure. Figure 4.24 illustrates the case where a multicast topology or a switch introduces multiple paths between a clock-C and several ports on clock-A.

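[Figure 4.22: two cyclic topologies of boundary clocks containing the grandmaster GM, with the equivalent pair of stratum 1 (S1) clocks shown dashed at the top of each. In A, both paths from clock A? to the GM have steps removed n. In B, the two paths have steps removed n and n+2.]

Figure 4.22. Canonical topologies requiring partitioning, part 1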

[Figure 4.23: cyclic topology in which the GM connects through clocks C and D to clocks A? (uuid_field = 1) and B? (uuid_field = 2), with the equivalent pair of stratum 1 (S1) clocks shown at the top. On each of these two clocks, port b (port_id_field 1) faces the grandmaster with steps removed n, and port a (port_id_field 2) faces the other clock with steps removed n+1.]

Figure 4.23. Canonical topologies requiring partitioning, part 2

Figure 4.22A illustrates the case where the path lengths defined by steps removed are exactly equal, and each Erbest is received from a different clock port. This leads to block-12 of Figure 4.17, with a return selecting one of the ports on A as better by path length. The assumption that clock-A is not itself better than the GM will lead to one slave and one passive port
by blocks 9 and 10, respectively, of the SDA in Figure 4.5. This partitions the subnet as desired.

Returning to block-10 of Figure 4.17: when the UUIDs of the senders of the messages being considered are the same, block-2 of Figure 4.18 is invoked, indicating a cyclic path connecting the two ports being considered. This can occur in topologies such as those shown in Figure 4.24. This figure illustrates how a Sync message issued by a port on clock-C can arrive unchanged at two different ports on the boundary clock-A by passing through an ordinary Ethernet switch. Since the two receiving port numbers will differ in this case, the selection is made at block-3 of Figure 4.18 based on receiving port number, and will return a determination of better by path length. This allows one port to be selected as a slave port, and the other as a passive port at blocks 9 and 10 of the SDA (Figure 4.5).

[Figure 4.24: topology in which the GM (steps removed 0) feeds clocks B and C, which reach several ports of clock A? through a multicast medium. Each of the paths into clock A? is labeled with steps removed n.]

Figure 4.24. Canonical topologies requiring partitioning, part 3

Figure 4.22B illustrates the case where the path lengths differ by 2 or more. As we saw in DCA case 1, this will lead to block-3 of Figure 4.17 with a return selecting one of the ports on A as better but not better by path length. Assuming that clock-A is not itself better than the GM, this will lead to one slave and one master port by blocks 9 and 10, respectively, of the SDA in Figure 4.5. This does not partition the network at clock-A but moves the partitioning point along the longer path toward the GM.

Figure 4.23 illustrates the case where the path lengths differ by exactly 1. For this case, Erbest for port-a is designated as A in the DCA, and Erbest for port-b is designated as B. The values of the relevant attributes in Erbest for each of the ports on clocks A and B are shown in Table 4.4. The construction of the UUIDs in the last two rows of the table is based on the definitions in Table 16 of Clause 7.6.4 [24]. In each case, the UUID consists of three fields:

• An attribute describing the communication technology of the links. For this example it is assumed that all links use the same technology (for example, UDP on Ethernet) and the attribute value is 1.
• An attribute based on the uuid_field characteristic of the clock sending or receiving the message.
• An attribute based on the port_id_field of the port sending or receiving the message.

As we saw in DCA case 1, this will lead to block-7 of Figure 4.17, since the values of steps removed for ports a and b for either clock differ by 1. Furthermore, since for both clocks port-b has the lowest value of steps removed, the next decision point in computing Ebest is block-2 of Figure 4.20, because in each case port-b is closer to the grandmaster.

Table 4.4. Attributes for the symmetric topology of Figure 4.23

Attribute           Clock-A-a  Clock-A-b  Clock-B-a  Clock-B-b
steps removed       n+1        n          n+1        n
Sender of Erbest    Clock-B    Clock-C    Clock-A    Clock-D
Receiver of Erbest  Clock-A-a  Clock-A-b  Clock-B-a  Clock-B-b
Sender UUID         1, 2, 2               1, 1, 2
Receiver UUID       1, 1, 2               1, 2, 2
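As a sketch of how the "path length B closer than A" branch (Figure 4.20) treats the port-a attributes of Table 4.4, the comparison of receiver and sender UUIDs can be written as below. The function name is hypothetical, and Python tuple ordering is used as a stand-in for the comparison algorithm of Clause 6.2.4.1 [24], which is an assumption of this sketch.

```python
def b_closer_than_a(receiver_uuid_a, sender_uuid_a):
    """Sketch of the "path length B closer than A" branch (Figure 4.20).

    Each UUID is a (technology, uuid_field, port_id_field) tuple. Tuple
    ordering stands in for the standard's comparison algorithm (assumed).
    """
    if receiver_uuid_a == sender_uuid_a:
        return "message from self, ignore"
    if receiver_uuid_a < sender_uuid_a:
        return "B better than A"                  # block-8
    return "B better than A by path length"       # block-7


# Port a of each clock, with the UUID values from Table 4.4.
clock_a = b_closer_than_a(receiver_uuid_a=(1, 1, 2), sender_uuid_a=(1, 2, 2))
clock_b = b_closer_than_a(receiver_uuid_a=(1, 2, 2), sender_uuid_a=(1, 1, 2))
```

The two clocks reach different results from the same logic: clock-A obtains "B better than A" and clock-B obtains "B better than A by path length". Only the second result leads to a passive port, so the cycle is broken on one side of the link between clocks A and B.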

Consider first the calculation of Ebest for clock-A shown in Figure 4.23. The logic at block-2 of Figure 4.20 makes the determination based on considering attributes of Erbest for port-a. The decision at block-2 of Figure 4.20 will be NO, with the result that the decision is based on comparing the UUIDs of the sender and receiver of Erbest at port-A-a in the figure, namely, 1, 2, 2 and 1, 1, 2, respectively. The comparison of the respective UUIDs based on the comparison algorithm of Clause 6.2.4.1 [24] results in the UUID of the receiver being deemed less than that of the sender. This produces a return of B better than A at block-8 of Figure 4.20. Refer to the SDA of Figure 4.5, and assume that clock-A is not better than the grandmaster. The result of the SDA at block-8 is to make port-A-b a slave port. Port-A-a becomes a master port via SDA block-12.

Now consider the same computation for clock-B of Figure 4.23. This computation is identical to the computation for clock-A until the comparison of the sender and receiver UUIDs at block-2 of Figure 4.20. In this case, the sender and receiver UUIDs are 1, 1, 2 and 1, 2, 2, respectively. Here, the UUID of the sender is deemed less than that of the receiver. This produces a return of B better than A by path length at block-7 of Figure 4.20. Again refer to the SDA of Figure 4.5, and assume that clock-B is not better than the grandmaster. The result of the SDA at block-8 is to make port-B-b a slave port. Port-B-a becomes a passive port via SDA block-11. This is exactly the behavior desired, since it produces an unambiguous partitioning of the subnet between clocks A and B.

Case 3: Comparing D0 and Ebest or Erbest for a Clock Based on Path Length

The comparison of Ebest and the attributes of the clock itself is illustrated by computation D in Figure 4.4. This discussion applies to both boundary and ordinary clocks. The value of steps removed for the data set D0 characterizing the clock, for example, clock-1, is always 0.
The port id field of clock-1 is also 0 from Table 16 of Clause 7.6.4 [24]. Therefore, the UUID of data set D0 will never match that of an incoming Sync message Ebest , since the port id field of the Sync message will be at least 1. For the rest of this discussion, assume


that in the DCA data set A is set D0 , and data set B represents Ebest or Erbest . Since the GMUUIDs will never be identical, an analysis of Figures 4.14, 4.15, and 4.16 shows that the only way that path length computations can occur is if both A and B are stratum 1 or 2, and have identical stratum numbers and identifiers. For all other cases, either D0 or Ebest or Erbest as appropriate, will be deemed better based solely on considerations of grandmaster characterization fields, without recourse to path length considerations. The SDA of Figure 4.5 shows that under these circumstances all ports of clock-A will be master ports if D0 were deemed better than Ebest . If Ebest were deemed better than D0 , and if A is not a stratum 1 or 2 clock, then clock-A ports will be masters, except for the port represented by Ebest , which will be a slave port. If clock-A is a stratum 1 or 2 clock, then its ports will be either master or passive ports, depending on whether D0 is better or worse than Erbest. In the case in which D0 is deemed worse, at least the port represented by Ebest will be passive. The remainder of this discussion assumes that the direct comparison of the data sets D0 and Ebest or Erbest did not resolve based on considerations other than path length. Path length comparisons for any port r of clock-1 that lead to block-3 of Figure 4.17 will always result in A representing D0 being deemed better, since steps removed for D0 is 0 and the difference must be greater than 1 to reach block-3. In this case, there must be a boundary clock, clock-x, between the port-r represented by Erbest and the clock sending the message as Erbest . Any partitioning of the network will take place at, or beyond clock-x. This results in port-r being a master port either from block-5 or block-7 of the SDA (Figure 4.5). If the path lengths are equal, and therefore 0 since steps removed for D0 is 0, then block-6 of Figure 4.17 will lead to block-10 and will be resolved based on the UUIDs. 
This case occurs when the two clocks A and B are directly connected to each other, with no intervening boundary clock. Table 16 of Clause 7.6.4 [24] defines the UUID of Sender of A based on three fields in the default data set D0 of clock-A:

• The clock communication technology field,
• The clock uuid field, and
• The clock port field, which by definition has a value of 0 for D0.

These values are compared to similar fields from Erbest that characterize the clock port sending Erbest . Since the clock port field of D0 is 0 and the sourcePortId field of Erbest is always greater than 0, these two data sets can never have the same UUID. Therefore, equal path lengths always lead to block-12 of Figure 4.17. The UUID comparison algorithm of Clause 6.2.4.1 [24] will select either the one or the other to be better by path length at blocks 13 or 14 of Figure 4.17.
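The equal-path-length comparison can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that UUIDs are ordered lexicographically over the three fields (communication technology, uuid, port id); the authoritative ordering is the one defined in Clause 6.2.4.1 [24]. The names PtpUuid and order, the technology code 1, and the MAC-like values are hypothetical.

```python
from typing import NamedTuple


class PtpUuid(NamedTuple):
    """Hypothetical three-field UUID: technology code, uuid bytes, port id."""
    comm_technology: int
    uuid: bytes
    port_id: int


def order(a: PtpUuid, b: PtpUuid) -> int:
    """Return -1, 0, or 1 as a sorts before, equal to, or after b.

    Assumption: plain lexicographic tuple ordering stands in for the
    comparison algorithm of Clause 6.2.4.1.
    """
    return (a > b) - (a < b)


# Two directly connected clocks with illustrative (made-up) MAC values.
mac_a = bytes.fromhex("00a0c9000001")
mac_b = bytes.fromhex("00a0c9000002")

# As seen inside clock-A: its own D0 (port field 0) versus the sending port of clock-B.
d0_a = PtpUuid(1, mac_a, 0)
sender_b = PtpUuid(1, mac_b, 1)

# As seen inside clock-B: its own D0 versus the sending port of clock-A.
d0_b = PtpUuid(1, mac_b, 0)
sender_a = PtpUuid(1, mac_a, 1)

# D0 always has port field 0 and a sender has a port field >= 1,
# so the two data sets can never compare equal.
print(order(d0_a, sender_b))  # -1
print(order(d0_b, sender_a))  # 1
```

Because the first two fields already differ between the two clocks, the port field never decides the outcome in this sketch, and the two clocks see mirror-image results.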


A similar computation will occur in clock-B represented by Erbest. If clock-B determines that Sync messages from clock-A are the best received on any port (i.e., represent Ebest on clock-B), then the computation at block-12 of Figure 4.17 will, with one difference, compare the same UUIDs as in the computation in clock-A. The difference between the computations will occur in the port identifier field. The port identifier fields are the last to be considered by the UUID comparison algorithm of Clause 6.2.4.1 [24]. The combination of the clock communication technology and clock uuid fields is unique to each clock. This results in opposite conclusions in the evaluation of block-12 of Figure 4.17 in clocks A and B. In this example, the two clocks involved are directly connected, and one will be a master and the other either slave or passive, depending on whether one or both is a stratum 1 or 2 clock based on the SDA of Figure 4.5.

The final case is when the path lengths differ by 1, leading to block-7 of Figure 4.17. This will lead to block-8 of Figure 4.17 and block-3 of Figure 4.19, since the steps removed of D0 is 0, and the UUIDs of sets A and B can never be equal. The comparison at block-3 of Figure 4.19 involves the UUIDs of the sender and receiver of set B representing Erbest. As in the computation of Ebest itself discussed earlier, the sender UUID references fields in Erbest characterizing the sender, and the receiver UUID references similar fields from the port database of the receiving clock. One or the other will be deemed better, resulting in a return of D0 better, or D0 better by path length. A similar computation in the other clock of the pair will produce the opposite result, again assuming the comparison reached block-3 of Figure 4.19.

4.2.5 Clock Characterization

This section discusses the principal attributes of a clock, clock port, and of an IEEE 1588 system that are used in the operation of the PTP protocol. The definitions of the attributes are quite clear in the standard. This discussion concentrates on the significance of each attribute in the operation of the protocol.

Some of these attributes are static properties, while others may be modified either by the protocol or by a system integrator. For each of the attributes that can be modified by the protocol or the user, the standard defines a default value. These default values are used by the initialization procedures in each clock. If all clocks in a system have the default values for each attribute, the system will perform correctly and will satisfy the objectives of the protocol, as discussed in Section 3.1.

The current values of the attributes are stored in one of the data sets defined by the protocol and maintained in each node. A more detailed discussion of these data sets is found in Appendix B. Attributes in the data sets that characterize an implementation, rather than the protocol or its use, will not be discussed here.


Attributes Used by the Protocol to Establish the Master-slave Hierarchy

These attributes are presented below. In each case the formal name of the attribute in the data set is given along with the clause in the standard.

Clock and Port UUID: clock communication technology, clock uuid field, and port uuid field, Clause 6.2.4.1 [24]. The clock UUID attribute is a member of the default data set of a clock, while the port UUID is a member of the port configuration data set. Both clocks and ports of all devices participating in the PTP protocol have universally unique identifiers. These identifiers are used by the best master clock algorithm, in particular the data set comparison algorithm, to differentiate messages from different sources and in the tiebreaking mechanism for placing clocks in the master-slave hierarchy. These UUIDs are also used to target management messages to specific clocks. Failure to conform to the uniqueness requirements in constructing the UUID will have disastrous consequences in system operation.

The construction of the UUID was described briefly in Section 4.2.2. The first step in constructing the various UUIDs is to construct the UUID for each port on an IEEE 1588 clock. The UUID for a clock is then derived from the UUIDs of its ports. The first element of a port UUID, the port communication technology, characterizes the network technology served by the port. The standard defines the values for this field in Clause 6.2.4.1 [24]. The second field, the port uuid, is a UUID whose definition is specific to the communication technology. The combination of these two fields must be guaranteed to be unique for each clock in all IEEE 1588 systems. In Ethernet systems, the port uuid is specified in Annex D [24] to be a MAC address permanently associated with the clock. The MAC address space is administered by the IEEE, and is guaranteed to be unique. The final field, the port id field, distinguishes the individual ports on a boundary clock. The combination of these three fields will be unique for every port on every clock in all IEEE 1588 systems, provided the requirements of Clause 6.2.4.1 [24] are met.

Some network technologies may not have the equivalent of the MAC address on which to base the definition of the port uuid. In this case, the standards committee writing the annex to IEEE 1588 covering the network technology may be forced to compromise on the global uniqueness requirement by limiting uniqueness to a specific system in which uniqueness is achieved by configuration. This clearly opens up an avenue for error, but may be the best that can be done in some technologies.

The UUID for the clock itself is derived from the smallest port UUID using the algorithm defined in Clause 6.2.4.1 [24]. This algorithm favors Ethernet ports, based on the assumption that these will be most prevalent, and that the use of the IEEE-maintained uniqueness of the MAC addresses will


prevent uniqueness problems in boundary clocks serving communication technologies for which the uuid field is less well controlled. The algorithm first specifies how to order IEEE 1588 UUIDs. This ordering procedure is used extensively in the data set comparison algorithm discussed in Section 4.2.4. The first two fields of the clock UUID are identical to the first two fields of the smallest of the port UUIDs of the ports on the clock. In the case of an ordinary clock, the first two fields of the clock and port UUIDs will therefore be the same. The final field, the port id field of the clock UUID, is 0. As noted in case 3 of Section 4.2.4, the 0 value for the port id field of the clock is critical to the operation of the data set comparison algorithm.

Stratum: clock stratum, Clause 6.2.4.3 [24]. The clock stratum attribute is a member of the default data set. This attribute is the primary attribute used by the best master clock algorithm in ordering clocks to determine the most suitable candidate for the grandmaster clock. The clock stratum attribute defines the UTC traceability properties of the clock.

A value of 1 indicates that the clock is a primary reference standard directly traceable to a recognized source of UTC. An example of a PTP stratum 1 clock would be a GPS receiver-clock combination that also supports PTP and can act as a grandmaster clock. An atomic clock calibrated to UTC and supporting PTP is another possibility. The meaning of stratum number in the telecommunications field and in the timekeeping community is not the same as the PTP definition. In these other fields, stratum number conveys specific accuracy and holdover properties of the clock. In PTP, this information is conveyed by the clock identifier attribute.

A PTP stratum 2 clock is a secondary standard clock. It is synchronized to a PTP stratum 1 clock by means other than PTP. A PTP clock can also be a stratum 2 clock if in the past it has been synchronized to a PTP stratum 1 clock and its holdover properties ensure that it still meets the accuracy requirements specified by its clock identifier attribute. Holdover is a measure of the ability of a clock to maintain its time scale when communication with its primary standard is suspended by a communication breakdown or other cause. The stratum 2 value allows a PTP stratum 1 clock that loses communication with its source of UTC to degrade itself to a stratum 2 clock, but still retain the grandmaster designation in the best master clock algorithm.

The value of UTC traceability of PTP stratum 1 and 2 clocks is the basis for giving them preference in the operation of the best master clock algorithm. Since there does not appear to be any inherent basis for discriminating between two stratum 1 or two stratum 2 clocks, the presence of multiple stratum 1 or 2 clocks in a subdomain will result in the partitioning of the subdomain, as discussed in Section 3.3.2.


Stratum numbers 3 and 4 indicate that the clock is not necessarily traceable to UTC. A PTP stratum number of 3 indicates that the clock implements a PTP feature called the “external timing signal” (Clauses 6.2.3 and 7.5.20 [24]). The external timing signal is a 10-MHz signal derived from the local clock, and distributed to other clocks in the system on a separate communication medium from that used to transmit PTP messages. Devices capable of issuing external timing signals are placed higher in the hierarchy than ordinary devices, so that the external timing signal distributed by a stratum 3 clock serving as a grandmaster is synchronized to the time scale established by the PTP protocol.

Finally, stratum 4 is used for all other clocks. Establishing an ordering of stratum 4 clocks depends on other clock attributes.

Identifier: clock identifier, Clause 6.2.4.5 [24]. The clock identifier attribute is a member of the default data set. This is the second attribute used by the best master clock algorithm in ordering clocks to determine the most suitable candidate for the grandmaster clock. The clock identifier attribute specifies the accuracy of the clock with respect to the primary time scale. There are six possible values of this attribute: ATOM, GPS, NTP, HAND, INIT, and DFLT.

ATOM and GPS are applicable only to clocks with a stratum number of 1 or 2. In the case of a stratum 1 clock, this attribute indicates the accuracy to which the clock is synchronized to the primary source of UTC, either a calibrated atomic clock for the value ATOM or a GPS source for the value GPS. In the case of a stratum 2 clock, these values indicate either the accuracy to which the clock is synchronized to a PTP stratum 1 clock, or that its holdover properties are such that the defined accuracy is still within specification.

An attribute value of NTP is reserved for stratum 2 clocks, and indicates that the UTC source is the NTP protocol [23].

An attribute value of HAND is reserved for stratum 2 or greater clocks for which the epoch of the time scale has been set to UTC by an administrative procedure, such as the use of the PTP management message PTP MM SET TIME (Clause 7.12.28 [24]). Since such procedures will often require human intervention and reaction times, the UTC accuracy specification is 10 s for an attribute value of HAND.

The attribute value INIT indicates that the time scale of the clock is set to some other source of time. The accuracy is consistent with that time scale, and the nature of the procedure used to set the clock. This attribute is designed for cases in which PTP is used to distribute time based on some application-specific process, such as the local time within an independent electronic system. The time scale may be completely arbitrary with no explicit reference to UTC, for example, a time scale measuring elapsed


seconds since the system was turned on.

The final value, DFLT, is used when none of the others apply. Clause 6.2.4.5 [24] also provides the ordering of these attributes for use by the best master clock algorithm.

Variance: clock variance, Clause 6.2.4.8 [24]. The clock variance attribute is a member of the default data set. This is the third attribute used by the best master clock algorithm in ordering clocks to determine the most suitable candidate for the grandmaster clock. The value of this attribute is a measure of the inherent time precision and stability of the local clock when it is serving as a grandmaster clock. The value of this attribute is based on the Allan variance discussed in Section 2.2.2 and defined in Clause 7.7 [24]. It includes the effects of the finite resolution of the clock, the stability and noise properties of the local oscillator driving the clock, and measurement errors in generating timestamps based on the local clock. This attribute is the primary determinant of clock ordering for clocks of stratum 4. Since it is expected that for many applications, for example, instruments on a bench, all clocks will have stratum number 4, it is important that the value of the clock variance attribute correctly characterize each clock. Failure in this regard can easily result in clocks in a system basing their time on a distinctly inferior clock.

Preferred Status: preferred, Clause 6.2.4.4 [24]. The preferred attribute is a member of the default data set. It is the first of the attributes used by the best master clock algorithm that characterizes the role of the clock in a system, rather than being an inherent property of the clock. A TRUE value of this attribute designates the clock as a member of a set of clocks from which the grandmaster will be selected, irrespective of any other considerations of the best master clock algorithm. Stratum 1 or 2 clocks are designated as preferred by default.

Clock Type: is boundary clock, Clause 6.2.4.2 [24]. The attribute is a member of the default data set indicating whether a clock is a boundary clock or an ordinary clock. In the operation of the best master clock algorithm, boundary clocks are preferred over other clocks, all other attributes being equal. Topologically, this will place a boundary clock at the root of the master-slave hierarchy, unless there is a better ordinary clock in the system.

Path Length: steps removed, Clause 7.4.3.1 [24]. This attribute is a member of the current data set, and is a measure of the number of boundary clocks traversed from the currently selected grandmaster clock. As such, it is a property of the position of the clock in the system, rather than an inherent property of the clock. This attribute plays a major role in the path length portions of the best master clock algorithm.
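The restrictions that tie the clock identifier attribute to the stratum number can be collected into a small consistency check. The following Python sketch encodes only the rules stated in the text for ATOM, GPS, NTP, and HAND; treating INIT and DFLT as unrestricted is an assumption, since the text states no stratum limit for them. The function name identifier_consistent is hypothetical and is not taken from the standard.

```python
# Stratum values permitted for each clock identifier, as stated in the text.
# INIT and DFLT are left unrestricted here; that is an assumption of this sketch.
ALLOWED_STRATA = {
    "ATOM": lambda stratum: stratum in (1, 2),
    "GPS": lambda stratum: stratum in (1, 2),
    "NTP": lambda stratum: stratum == 2,
    "HAND": lambda stratum: stratum >= 2,
    "INIT": lambda stratum: True,
    "DFLT": lambda stratum: True,
}


def identifier_consistent(identifier: str, stratum: int) -> bool:
    """Return True if the identifier is permitted for the given stratum (1 to 4)."""
    if stratum not in (1, 2, 3, 4):
        raise ValueError("this sketch only handles stratum numbers 1 to 4")
    try:
        return ALLOWED_STRATA[identifier](stratum)
    except KeyError:
        raise ValueError(f"unknown clock identifier: {identifier!r}") from None


print(identifier_consistent("GPS", 1))   # True
print(identifier_consistent("GPS", 4))   # False
print(identifier_consistent("HAND", 4))  # True
```

A check of this kind could be run by a configuration tool before writing the default data set, so that an inconsistent pair is caught before it influences the best master clock algorithm.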


External Timing Signal: external timing, Clause 7.4.2.10 [24]. This attribute is a member of the default data set and indicates whether the clock supports the external timing signal feature of PTP. This is a static property of the clock determined by the manufacturer.

Port State: port state, Clause 7.4.6.1 [24]. This attribute is a member of the port configuration data set of the clock. It reflects the current state of the port, as defined by the protocol state machine discussed in Section 4.2.2. The value of this attribute determines which operations a clock can execute, and which messages it can send and receive.

Attributes Used in Defining the Scope and Extent of an IEEE 1588 System

The use of these attributes is discussed in Section 4.1. These attributes are:

Subdomain Name: subdomain name, Clause 7.4.2.13 [24]. This attribute is a member of the default data set.

Subdomain Address: subdomain address, Clause 7.4.6.4 [24]. This attribute is a member of the port configuration data set.

Port Addresses: event port address, general port address, Clauses 7.4.6.5 and 7.4.6.6 [24]. These attributes are members of the port configuration data set.

Attributes Used in the Synchronization Process

The use of these attributes is discussed in Section 4.4. These attributes are:

Follow Up Capability: clock followup capable, Clause 7.4.2.7 [24]. This attribute is a member of the default data set. It indicates whether the clock will send a Follow Up message associated with each Sync message.

Latency: Clause 6.2.4.9 [24]. This implementation-specific property of a clock is not reflected as a formal attribute in any of the data sets. However, it is extremely important in generating correct values for the timestamps used by the protocol. This is discussed in Section 4.4.2.

Synchronization Interval: sync interval, Clause 7.4.2.12 [24]. This attribute is a member of the default data set, and characterizes the basic timing cycle of the operation of the PTP protocol.

Clock Offset: offset from master, Clause 7.4.3.2 [24]. This attribute is a member of the current data set, and represents the current value of the computed time offset between the master and slave clocks. It is the value of this attribute that drives the clock servo in a slave clock.


Path Delay: one way delay, Clause 7.4.3.3 [24]. This attribute is a member of the current data set, and represents the current value of the measured propagation delay between the master and slave clocks. The value of this attribute is used in correcting the measured offset to produce the offset from master attribute of the clock.

Burst Capability: burst enabled, Clause 7.4.6.10 [24]. This attribute is a member of the port configuration data set. It signifies whether the port is able to service the burst mode of operation described in Section 4.4.

Attributes Used Primarily for Other Purposes

The bulk of the standard specifies how to synchronize real-time clocks in a distributed system. Applications using these clocks generally require more information and services than only the current time. The attributes providing this information are:

Initializable: initializable, Clause 7.4.2.9 [24]. This attribute is a member of the default data set, and indicates the behavior to be expected when a clock is rebooted or initialized by means of a management message.

Epoch: epoch number, Clauses 6.2.5.6 and 7.4.5.4 [24]. This attribute is a member of the global time properties data set. The representation for time used by IEEE 1588 is defined in Clause 5.2.2 [24] as 32 bits of seconds and 32 bits of nanoseconds (see footnote 8). This definition results in a seconds field overflow approximately every 136 years. The epoch number represents 16 more significant bits to the seconds field, resulting in an overflow approximately every 9 million years. This attribute was included in the standard to prevent Y2K-like problems that made everyone nervous at the turn of the century.

Footnote 8: There is a flaw in the definition in Clause 5.2.2 [24]. The recommended correction is for the nanoseconds member of the datatype to be an unsigned integer, with the most significant bit being the sign bit of the entire time representation.

Variance: observed variance, Clause 7.4.4.9 [24]. This attribute of the parent data set represents the observed variance (as defined in Clause 7.7 [24]) of the master clock to which a slave clock is synchronizing, as seen by the slave clock. It is provided as a service to applications or tools in evaluating the quality of the system time scale.

Drift: observed drift, Clause 7.4.4.10 [24]. This attribute of the parent data set represents the observed drift of the master clock to which a slave clock is synchronizing, as seen by the slave clock. It is provided as a service to applications or tools in evaluating the quality of the system time scale.

Reasonable Performance Values: utc reasonable, Clause 7.4.4.11 [24]. This attribute of the parent data set indicates whether the clock deems the


values of the other attributes of this section are reasonable. The standard is silent concerning how to determine whether the attributes in question are in fact reasonable. However, frequent changes in epoch number or current utc offset based on the local time scale would be cause for declaring the value of this attribute to be FALSE.

UTC Offset: current utc offset, Clause 7.4.5.1 [24]. This attribute of the global time properties data set is the current value of the number of leap seconds in the difference between UTC and TAI. This information is available from the GPS system, and many other recognized UTC sources of time. In the case where the epoch of the grandmaster is set to UTC by an operator, it is the responsibility of the operator to keep this information current based on reliable sources. For example, on 1 June 2005, the value of UTC - TAI was negative 32 s [34]. This attribute is distributed throughout the subdomain for use by applications.

Leap Second Warning: leap 59, leap 61, Clauses 7.4.5.2 and 7.4.5.3 [24]. These attributes of the global time properties data set are the current value of the leap seconds flags provided by many UTC sources. These flags indicate whether there will be a leap second correction at the end of the current day. These attributes are not generated by PTP but must be obtained from an external source, such as a GPS receiver in the case of stratum 1 clocks. These attributes are distributed throughout the subdomain for use by applications.

Parent Statistics: parent stats, Clause 7.4.4.8 [24]. This attribute is a member of the parent data set. With the exception of the epoch number, which must be maintained, PTP does not require a clock to compute and maintain the values of any of the attributes discussed in this section. The value of the parent stats attribute indicates whether the values of these other attributes have been computed.
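The overflow intervals quoted for the epoch number attribute can be checked with a few lines of arithmetic. The Python sketch below assumes a mean year of 365.25 days; the helper total_seconds, which combines an epoch number with the 32-bit seconds field, is a hypothetical illustration of "16 more significant bits" and not a function defined by the standard.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # mean Julian year, an assumption of this sketch

# 32-bit seconds field alone.
years_32 = 2**32 / SECONDS_PER_YEAR

# Seconds field extended by the 16-bit epoch number (48 bits in total).
years_48 = 2**48 / SECONDS_PER_YEAR


def total_seconds(epoch_number: int, seconds: int) -> int:
    """Combine a 16-bit epoch number and a 32-bit seconds value (hypothetical helper)."""
    if not 0 <= epoch_number < 2**16:
        raise ValueError("epoch_number must fit in 16 bits")
    if not 0 <= seconds < 2**32:
        raise ValueError("seconds must fit in 32 bits")
    return (epoch_number << 32) | seconds


print(f"{years_32:.1f} years")          # roughly 136 years
print(f"{years_48 / 1e6:.2f} million")  # roughly 8.9 million years
print(total_seconds(1, 0) == 2**32)     # True: one epoch step equals one seconds-field rollover
```

Both results agree with the approximate figures of 136 years and 9 million years given in the Epoch entry.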

4.3 Startup and Reconfiguration

Section 3.3.3 provided a high-level view of how startup and reconfiguration of an IEEE 1588 system occur. This section explains in detail how the protocol implements these functions. During the normal operation of an IEEE 1588 system, the states of all clock ports in a subdomain will be stable with the specific states being determined, as discussed in Section 4.2. There are several circumstances that may result in a local or subdomain-wide transition from one set of stable states to another, including:

• The powerup or initialization of one or more clocks in the system,
• An internal change in the characteristics of one or more of the participating clocks, or the intentional external reconfiguration of data sets in one or more clocks by operators or automatic tools using the management messages,
• A change in the underlying network topology resulting in a change in the communication links available to the PTP protocol, and
• Internal faults in one or more of the clocks, or operator or tool-generated fault management procedures on one or more of the clocks using the management messages.

4.3.1 Powerup and Initialization

Powerup and initialization events cause a clock to enter the PTP INITIALIZATION state. The powerup condition occurs as a result of an intentional power cycling of a clock or the system by the user, or as a result of an unexpected power outage. Initialization is typically the result of user intervention via a management message. The PTP INITIALIZATION state allows a clock to be placed in a known internal state before making a transition to the PTP LISTENING state.

The PTP LISTENING state is provided to allow a clock to determine whether it is joining a subdomain that has already begun or finished the process of establishing the master-slave hierarchy, as discussed in Section 3.3.2. A clock in the PTP LISTENING state will normally transition to either the PTP PRE MASTER state or to the PTP UNCALIBRATED state, in preparation for entering the PTP MASTER and PTP SLAVE states, respectively. A SYNC RECEIPT TIMEOUT EXPIRES event causes a transition to the PTP MASTER state, indicating that there is no other clock in the PTP MASTER state transmitting Sync messages. The generation of this event is specified in Clause 7.5.14 [24], and is a timeout mechanism. The timeout mechanism is started when entering the PTP LISTENING state. The timeout mechanism expires after PTP SYNC RECEIPT TIMEOUT seconds, which from Clause 7.9 [24] is 10 times the synchronization interval.

There is an error in Table 15 of the standard [24] that under certain circumstances results in an incorrect update of the data sets maintained in the clock, leading to an overwrite of the parent data set. This problem occurs in the listening state of a boundary clock if another port is already in the slave state. The likely recommendation from the committee is to delete the reference to the PTP LISTENING state from the last row of Table 15 of the standard, and to add a new row for the PTP LISTENING state specifying the following actions:

• For an ordinary clock, update the port’s data sets to the PTP MASTER state configuration as specified for decision M1.
• For a boundary clock with no port in the PTP SLAVE state, update the port’s data sets to the PTP MASTER configuration as specified for decision M1.
• For a boundary clock with a port in the PTP SLAVE state, update the port’s data sets to the PTP MASTER state configuration as specified for decision M3.
• Set the port’s state to PTP MASTER.
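The recommended handling of a sync receipt timeout in the PTP LISTENING state, as listed above, can be summarized in a short Python sketch. It is only an illustration of the four listed actions and of the timeout length of 10 synchronization intervals; the function names, the use of plain strings for decisions and states, and the representation of the synchronization interval as a number of seconds are all assumptions of this sketch.

```python
def sync_receipt_timeout_s(sync_interval_s: float) -> float:
    """Timeout length: 10 times the synchronization interval (per the text)."""
    return 10 * sync_interval_s


def listening_timeout_action(is_boundary_clock: bool, has_slave_port: bool) -> tuple[str, str]:
    """Return (data set update decision, new port state) for a port in PTP LISTENING.

    Follows the likely committee recommendation described in the text:
    ordinary clock -> M1; boundary clock without a slave port -> M1;
    boundary clock with a slave port -> M3; the port then becomes a master.
    """
    if is_boundary_clock and has_slave_port:
        decision = "M3"  # avoids overwriting the parent data set owned by the slave port
    else:
        decision = "M1"
    return decision, "PTP_MASTER"


print(sync_receipt_timeout_s(2.0))            # 20.0
print(listening_timeout_action(False, False)) # ('M1', 'PTP_MASTER')
print(listening_timeout_action(True, True))   # ('M3', 'PTP_MASTER')
```

Keeping the decision separate from the state change mirrors the structure of the list: the first three items select how the data sets are updated, and the last item applies in every case.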

This timeout mechanism is also used to detect events such as a change in the underlying topology or the failure of a master clock, as discussed in Section 4.3.3. The length of this timeout is a compromise between short reconfiguration times and the avoidance of thrashing due to clocks changing state as a result of an occasional missed Sync message.

The initialization process is specified in Clause 7.5.1 [24], and requires all data sets to be initialized and any implementation-specific state, hardware, or communication mechanisms to be initialized. The initialization of data sets is specified in Clause 7.4.1 [24], and provides two possible sets of values to be used during initialization. The two options are:

• Use the default values specified for each data set member. For each member, either a specific value is specified or else the default value for the datatype is to be used. This set is termed the specification initialization set.
• Some data set members are classified as modifiable. These members can be modified during normal operation either as a result of management messages or by internal processes within the clock. These modifications are presumably stored in non-volatile storage to be used when needed, for example, after a powerup.

Initialization by means of a management message is specified in Clause 7.12.4 [24]. Provision is made to select either the specification initialization set, or this set modified by the values in non-volatile storage.

4.3.2 Changes in Clock Characteristics or Default Data Sets

A change in members of the default data set describing a clock may result in a change of state the next time the best master clock algorithm is evaluated. This evaluation is initiated by the STATE CHANGE EVENT that occurs at least once every synchronization interval when the clock is not in the PTP INITIALIZATION state (Clause 7.5.8 [24]). However, from Figure 4.1, no state changes can result from this evaluation, except in the normal active states of the system. This mechanism allows an IEEE 1588 system to reconfigure the master-slave hierarchy if the properties of the clocks change or, as will be seen in Section 4.3.3, if path latencies change.

Changes in the default data set typically are the result of direct operator action via a management message. For example, a user may designate a clock as being in the preferred set of clocks. Sophisticated implementations may be able to detect degradation or improvement in clock characteristics, and to internally make the appropriate changes. For example, a clock may change its


stratum number depending on the status of a synchronization link to a GPS receiver, or a clock may change its variance based on an internal model of stability as a function of warm-up or changes in ambient temperature.

4.3.3 Changes in the Underlying Network Topology

A change in the underlying network topology can introduce two forms of reconfiguration. If the change results in the partitioning of a subdomain or requires a change in the master-slave hierarchy topology produced by PTP, then the changes of state will occur during successive evaluations of the best master clock algorithm. These changes in network topology may result from physical breaks or changed connections of network links, configuration of network switches or routers, or removal, connection, or failure of network components or IEEE 1588 clocks. These topological changes are discovered by the PTP protocol by one or more of the following mechanisms:

• A removal of a link causes slave ports on the link to cease receiving Sync messages, thus invoking the SYNC_RECEIPT_TIMEOUT_EXPIRES event discussed in Section 4.3.1. A failure of a master clock can also produce this same result.
• A new connection causes ports to receive Sync messages from a clock port already in the PTP_MASTER state. For both master and slave ports, these previously unknown Sync messages (foreign masters) are passed through a qualification process that requires the receipt of a number (PTP_FOREIGN_MASTER_THRESHOLD) of these messages within a time window of length PTP_FOREIGN_MASTER_TIME_WINDOW (Clause 7.6.2 [24]). The number of required messages is two, and the time window is four synchronization intervals (Clause 7.9 [24]) to prevent thrashing due to transient connections.
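The foreign master qualification described above can be illustrated with a minimal sketch. It assumes a sliding window measured from the most recent Sync arrival; the class name `ForeignMasterRecord`, the method `on_sync`, and the choice of a `deque` are hypothetical and are not taken from the standard, which should be consulted for the exact windowing rules.

```python
from collections import deque

PTP_FOREIGN_MASTER_THRESHOLD = 2    # Sync messages required (value quoted in the text)
PTP_FOREIGN_MASTER_TIME_WINDOW = 4  # window length in synchronization intervals (from the text)


class ForeignMasterRecord:
    """Tracks Sync arrivals from one previously unknown master (hypothetical helper)."""

    def __init__(self, sync_interval_s=2.0):
        # Window length in seconds: four synchronization intervals.
        self.window_s = PTP_FOREIGN_MASTER_TIME_WINDOW * sync_interval_s
        self.arrivals = deque()

    def on_sync(self, now_s):
        """Record a Sync arrival at time now_s; return True if the master is qualified."""
        self.arrivals.append(now_s)
        # Discard arrivals that have aged out of the sliding window.
        while now_s - self.arrivals[0] > self.window_s:
            self.arrivals.popleft()
        return len(self.arrivals) >= PTP_FOREIGN_MASTER_THRESHOLD
```

With the default 2 s synchronization interval, a single stray Sync message never qualifies a foreign master; a second message must arrive within 8 s of the first, which is how a transient connection is filtered out.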

It is also possible that changes in the network topology do not result in state changes but do produce a change in path latencies between masters and slaves. For example, these latency changes can be caused by changes in physical path length, or longer delays through network elements. These changes will be detected as part of the normal master-slave synchronization process, as discussed in Section 4.4.

4.3.4 Fault Management

There are two states, PTP_FAULTY and PTP_DISABLED, provided to allow management of faults, or the designed removal and addition of clocks. Entry into or exit from the PTP_DISABLED state occurs only as the result of a management message. Re-enabling a clock in this state requires that the clock first transition to the PTP_INITIALIZING state.


Entry into the PTP_FAULTY state can result from a management message, or from internal clock mechanisms that generate a FAULT_DETECTED event. Exit from the PTP_FAULTY state requires the clock to enter either the PTP_INITIALIZING or PTP_DISABLED states, and results from a management message, or from mechanisms internal to a clock that generate a FAULT_CLEARED event.

The generation of the FAULT_DETECTED and FAULT_CLEARED events is discussed in rather general terms in Clause 7.5.18 [24]. Faults are defined in terms of their consequences, rather than their causes, which of course leaves the burden on the implementor to determine when and how to generate these events. The stated symptoms (if persistent) that should generate a fault are:

• Circumstances that prevent the correct operation of the protocol, both internally and externally. Possible examples include:
  – The detection of a denial of service attack or other inability of the clock to process the incoming messages at the rates required by the standard in Clause 7.11 [24].
  – The detection of unresolved conflicts between information in the received messages and internal data sets. The obvious example cited in Clause 7.5.18 [24] is a conflict in synchronization interval.
  – The persistent failure of a foreign master to transition into either the PTP_SLAVE or PTP_PASSIVE state when the local clock deems itself better than the foreign master. This implies that one of the clocks has a faulty implementation of the best master clock algorithm.
• Anything that prevents the correct operation of a standard-defined application programmer interface (API). Version 1 [24] of the standard does not define any APIs, although it does specify on-the-wire formats in Annex D.
• Problems that prevent correct internal operation of the clock. Again, nothing is specifically cited by the standard, but possible candidates include:
  – Failure to process messages at the required rates, as discussed in the first bullet.
  – Inability to synchronize to the master clock. This fault could be with either clock, but lacking other information, it should be assumed to be the fault of the slave.
  – Repeated short-term power cycling of the clock.
  – Detected changes in the environment, for example, temperature, beyond the range for which the clock can maintain its specified characteristics.
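The entry and exit rules for the two fault-management states given at the start of this section can be sketched as a small transition table. This is only an illustration under stated assumptions: `ACTIVE` is a stand-in for all normal operating states, the event names `MGMT_DISABLE` and `MGMT_ENABLE` are hypothetical labels for the relevant management messages, and the pairing of FAULT_CLEARED with PTP_INITIALIZING is one plausible reading of the text rather than a statement of the standard.

```python
from enum import Enum, auto


class State(Enum):
    PTP_INITIALIZING = auto()
    PTP_FAULTY = auto()
    PTP_DISABLED = auto()
    ACTIVE = auto()  # assumption: stand-in for the normal operating states


class Event(Enum):
    FAULT_DETECTED = auto()
    FAULT_CLEARED = auto()
    MGMT_DISABLE = auto()  # hypothetical name for a disabling management message
    MGMT_ENABLE = auto()   # hypothetical name for an enabling management message


# Only the transitions described in the text are listed; all others are ignored.
TRANSITIONS = {
    (State.ACTIVE, Event.FAULT_DETECTED): State.PTP_FAULTY,
    (State.ACTIVE, Event.MGMT_DISABLE): State.PTP_DISABLED,
    (State.PTP_FAULTY, Event.FAULT_CLEARED): State.PTP_INITIALIZING,
    (State.PTP_FAULTY, Event.MGMT_DISABLE): State.PTP_DISABLED,
    (State.PTP_DISABLED, Event.MGMT_ENABLE): State.PTP_INITIALIZING,
}


def next_state(state, event):
    """Return the new state, or the current state if the event does not apply."""
    return TRANSITIONS.get((state, event), state)
```

Note that a disabled clock ignores FAULT_CLEARED in this sketch, reflecting the rule that PTP_DISABLED is left only through a management message.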

4.4 Synchronization

Section 3.3.4 provided a high-level view of how the clocks in an IEEE 1588 system synchronize. This section explains in detail how the protocol implements synchronization.


The specific items discussed are:

• The one and two message synchronization models,
• Message timestamp points and internal latency,
• Slave clock synchronization,
• Burst mode, and
• External timing signals.

4.4.1 The One and Two Message Synchronization Models

There are two mechanisms for synchronization supported by the PTP protocol. The critical clock attribute, indicating which of these models is to be used, is the value of the clock_followup_capable member of the default data set. A TRUE value of this attribute in the default data set of the master causes the two message model to be used. A FALSE value causes the one message model to be used.

Two Message Model

As noted in Section 3.3.4, one of the attractive features of the PTP protocol is the two message mechanism that allows the generation and communication of critical timestamps to be separate operations. In the two message model, the master sends a Sync message to the slave, as illustrated in Figure 3.9. The Sync message is timestamped at transmission, t1, by the master and at reception, t2, by the slave. The timestamp generated at the master is sent to the slave in a Follow_Up message. A similar operation occurs when the slave sends a Delay_Req message to the master that is used to generate sending, t3, and receiving, t4, timestamps. The master sends the master-generated timestamp to the slave in a Delay_Resp message. The message fields used to communicate these timestamps are shown in Table 4.5. Note that timestamps t2 and t3 do not appear in any of the messages, since both are generated and used within the slave clock.

Table 4.5. Synchronization message fields for the two message model

Timestamp  Description                  Message     Field name
t1         Estimated Sync sending time  Sync        originTimestamp
t1         Sync sending time            Follow_Up   preciseOriginTimestamp
t4         Delay_Req receipt time       Delay_Resp  delayReceiptTimestamp
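To show how a slave might combine the four timestamps, the following minimal sketch computes the one-way delay and the slave's offset from the master. It assumes symmetric path latency in the two directions and timestamps expressed as plain numbers in nanoseconds; the function name `offset_and_delay` is hypothetical, and the servo that actually corrects the slave clock is not shown.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Return (offset, delay) from the four synchronization timestamps.

    t1: Sync sending time, master clock (preciseOriginTimestamp)
    t2: Sync receipt time, slave clock
    t3: Delay_Req sending time, slave clock
    t4: Delay_Req receipt time, master clock (delayReceiptTimestamp)

    Assumption: the path delay is the same in both directions.
    offset is slave time minus master time.
    """
    master_to_slave = t2 - t1  # equals delay + offset
    slave_to_master = t4 - t3  # equals delay - offset
    delay = (master_to_slave + slave_to_master) / 2
    offset = (master_to_slave - slave_to_master) / 2
    return offset, delay


# Example: slave runs 5000 ns ahead of the master, one-way delay is 2000 ns.
offset_ns, delay_ns = offset_and_delay(t1=100_000, t2=107_000, t3=200_000, t4=197_000)
```

Only t1 and t4 have to travel over the network; t2 and t3 are already in the slave, which is why they appear in no message field.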

The originTimestamp field of the Sync message contains an estimated time for t1 (Clause 8.3.1.2 [24]). In the two message model, this timestamp is not actually used by the slave in computing the corrections to the slave clock. The standard allows a default value of 0 to be assigned to this field in


the two message model. Typically, the preciseOriginTimestamp, t1, and the delayReceiptTimestamp, t4, fields are generated in hardware, as discussed in Section 3.3.4 and illustrated in Figures 3.11 and 3.12.

The two message format allows the actual sending of these four message types to be non-time-critical operations. The timing of both Sync and Delay_Req messages is tied to the synchronization interval, but the accuracy is nominal and easily handled as an ordinary message transmission by the IEEE 1588 code. The IEEE 1588 code can easily place the timestamps t1 and t4 in the Follow_Up and Delay_Resp messages, respectively, and still meet the timing requirements.

One Message Model

When high accuracy and precision are not an issue, it is possible to implement IEEE 1588 purely in software. Under these circumstances, the implementation of a master clock is allowed to dispense with the Follow_Up message by including an estimate of the sending time in the Sync message. The slave then uses this estimated timestamp, rather than the more accurate timestamp normally found in a Follow_Up message. The use of the Delay_Req and Delay_Resp messages is unchanged, although the Delay_Req receipt timestamp, t4, will also be generated in software with reduced precision. A slave may also be implemented purely in software. However, slaves are still required to make use of the preciseOriginTimestamp in Follow_Up messages, if provided by the master.

The one message model can also be used if the implementation inserts an accurate timestamp into the Sync message as the message is placed on the network. This normally requires hardware support, including the recomputation of the message CRC. This is a much more difficult implementation than the simple hardware assist used in the two message model.

Synchronization Message Timing

The timing of the four synchronization messages Sync, Follow_Up, Delay_Req, and Delay_Resp is specified in Clause 7.11 [24].
The timing for the Sync and Follow_Up messages issued by a master clock is illustrated in Figure 4.25, which is a simplified version of Figure 25 in Clause 7.11 [24]. The value of DeltaT in these figures is the timeout value PTP_SYNC_INTERVAL_TIMEOUT divided by the value of the PTP_RANDOMIZING_SLOTS parameter. From Table 24 of Clause 7.9 [24], the value of PTP_SYNC_INTERVAL_TIMEOUT is the synchronization interval and the value of PTP_RANDOMIZING_SLOTS is 18. Since the default value of the synchronization interval is 2 s, DeltaT is 2 s / 18 ≈ 0.111 s in the default case. The expiration of the sync-event-interval-timeout mechanism defined in Clause 7.5.15 [24] signals the start of a synchronization period. The master clock must issue the Sync message within about 55 ms and the corresponding

Master Clock Time

Slave Clock Time sync-event-interval timeout expires and is restarted Sync message