English Pages 343 Year 2010
Copyright © 2010. Nova Science Publishers, Incorporated. All rights reserved. Computer Science Research and Technology, edited by Karl C. Verdinand, Nova Science Publishers, Incorporated, 2010. ProQuest Ebook Central,
COMPUTER SCIENCE, TECHNOLOGY AND APPLICATIONS
COMPUTER SCIENCE RESEARCH AND TECHNOLOGY
No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in rendering legal, medical or any other professional services.
COMPUTER SCIENCE, TECHNOLOGY AND APPLICATIONS

Additional books in this series can be found on Nova’s website under the Series tab.
Additional E-books in this series can be found on Nova’s website under the E-book tab.
COMPUTER SCIENCE, TECHNOLOGY AND APPLICATIONS
COMPUTER SCIENCE RESEARCH AND TECHNOLOGY
KARL C. VERDINAND
EDITOR

Nova Science Publishers, Inc.
New York
Copyright © 2011 by Nova Science Publishers, Inc.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical photocopying, recording or otherwise without the written permission of the Publisher.

For permission to use material from this book please contact us:
Telephone 631-231-7269; Fax 631-231-8175
Web Site: http://www.novapublishers.com

NOTICE TO THE READER

The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained in this book. The Publisher shall not be liable for any special, consequential, or exemplary damages resulting, in whole or in part, from the readers’ use of, or reliance upon, this material. Any parts of this book based on government reports are so indicated and copyright is claimed for those parts to the extent applicable to compilations of such works.

Independent verification should be sought for any data, advice or recommendations contained in this book. In addition, no responsibility is assumed by the publisher for any injury and/or damage to persons or property arising from any methods, products, instructions, ideas or otherwise contained in this publication.

This publication is designed to provide accurate and authoritative information with regard to the subject matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in rendering legal or any other professional services. If legal or any other expert assistance is required, the services of a competent person should be sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS.

Additional color graphics may be available in the e-book version of this book.
LIBRARY OF CONGRESS CATALOGING-IN-PUBLICATION DATA

Computer science research and technology / editor, Karl C. Verdinand.
p. cm.
Includes index.
ISBN: (eBook)
1. Computer science--Research. 2. Information technology--Research. I. Verdinand, Karl C.
QA76.27.C6734 2010
004--dc22
2010025395

Published by Nova Science Publishers, Inc. New York
CONTENTS

Preface

Chapter 1
Medium Access Control Protocols in Passive Optical Networks Based on Ethernet (EPONs)
Noemí Merayo, Rubén M. Lorenzo, Tamara Jiménez, Ramón J. Durán, Patricia Fernández, Ignacio de Miguel and J. Evaristo Abril

Chapter 2
A Non-Routine Problem Solving Mechanism for a Comprehensive Cognitive Agent Architecture
S. Aregahgen Negatu, Stan Franklin and Lee McCauley

Chapter 3
Speech Acts and Prosodic Modeling in Service-Oriented Dialog Systems
Christina Alexandris

Chapter 4
NCTUns Tool for IEEE 802.16j Mobile WiMAX Relay Network Simulations
Shie-Yuan Wang, Hsin-Yu Chen and Shih-Wei Chuang

Chapter 5
Mobile Multicast Protocols in Wireless Networks
Young-Joo Suh

Chapter 6
On Strong User Authentication for Remote Access: Combining Authentication and 3D Secure E-Commerce
Milan Marković

Chapter 7
New Scalable Varied Density Clustering Algorithm for Large Datasets
Ahmed Fahim, Abdel-badeeh Salem and Gunter Saake

Chapter 8
Secure Communication between STB and Smart Card in IPTV Broadcasting
Song-Hee Lee, Nam-Sup Park and Jin-Young Choi

Chapter 9
Stick-Slip Motion Control of a Wood-Stick Tool for Lapping a LED Lens Mold
Fusaomi Nagata and Takanori Mizobuchi

Chapter 10
Attacking Smart Cards: Side Channel and Fault Analysis
Philipp Grabher, Neil Hanley and Michael Tunstall

Chapter 11
Multicast Transport Protocols: Review and New Trends
Pilar Manzanares-Lopez, Josemaria Malgosa-Sanahuja, Juan Pedro Munoz-Gea and Juan Carlos Sanchez-Aarnoutse

Chapter 12
Automated Diagnosis of Active Systems
Gianfranco Lamperti and Marina Zanella

Index
PREFACE

Like them or hate them, computers are here to stay. This new book presents leading-edge research from across the globe in the field of computer science research, technology and applications. Each contribution has been carefully selected for inclusion based on the significance of the research to this fast-moving and diverse field. Some topics included are Ethernet Passive Optical Networks; Human-Computer Interaction Systems; mobile multicast protocols in wireless networks; and STB and smart cards in IPTV broadcasting.

Chapter 1 discusses Ethernet Passive Optical Networks (EPONs), which are an excellent technology to develop access networks, as they provide both high bandwidth and Quality of Service (QoS). EPON technology is seen as the future global standard due to several factors. First of all, it is relatively simple and economical, easy to implement and inexpensive to maintain. Secondly, Ethernet components have followed a cost reduction curve during the last 25 years. Moreover, IP (Internet Protocol) packets are natively supported by EPON without any type of conversion. Finally, the transition from other existing PON technologies to EPON technology is very easy, since only the ends of the EPON must be changed, whereas the interfaces in the central office and the users would stay intact. EPON infrastructure uses a single wavelength in each of the two directions and both wavelengths are multiplexed on the same fiber. Since all users share the same wavelength in the upstream direction, a Medium Access Control (MAC) is necessary to avoid collisions between packets from different Optical Network Units (ONUs). Time Division Multiple Access (TDMA) is the most widespread control scheme in these networks, although it is inefficient because the nature of network traffic is neither homogeneous nor continuous [31-32].
Consequently, Dynamic Bandwidth Allocation (DBA) algorithms, based on the TDMA protocol, are the best choice as they dynamically distribute the available bandwidth depending on the current demand of ONUs. Therefore, the current MAC protocols in Ethernet PONs are based on a dynamic distribution of the upstream bandwidth among the connected ONUs. Although EPON infrastructures are able to provide enough bandwidth for today’s applications, both the gradual increase in the number of users and the bandwidth requirements of new emerging services demand an upgrade of such access networks. This can be achieved by adding new wavelengths to be shared in the upstream and downstream directions in EPON networks, which leads to the so-called Wavelength Division Multiplexed EPONs (WDM-EPONs). WDM technology in EPONs is thus viewed as the future medium access control protocol to overcome the lack of capacity in access networks. Pure WDM-EPON architectures assign one dedicated wavelength per ONU, which implies more dedicated bandwidth and more security in the system. However, the high cost of such a deployment means that WDM migration will be limited by technological costs and driven by the needs of service providers. In the short term, more flexible WDM-EPON architectures are needed, which could be upgraded in a cost-effective way. The combination of WDM technology with Time Division Multiplexing (TDM) techniques is likely the best near-future approach. These hybrid architectures combine the advantages of the wavelength assignment of WDM techniques with the power splitting of TDM methods. Recently, there has been increasing interest in the development of optically amplified EPONs in order to extend the reach (up to 100 km) and the split ratio of these networks, thus leading to the so-called Long-Reach EPONs. These network architectures are quite cost-effective as they simplify the infrastructure by combining the access and the metro networks into a single network. That characteristic, together with WDM-EPON infrastructures, allows an increase in the potential number of users in the access network. Although the Long-Reach EPON is a very promising architecture, very little research has focused on the development of reliable DBA protocols able to cope with the increase in propagation time. When classic algorithms developed for typical EPONs are applied to Long-Reach EPONs, bandwidth utilization is inefficient due to the increased packet propagation time. Hence, adequate DBA protocols are needed for these new network architectures.

One aspect of human intelligence is its ability to achieve goals by devising unexpected and even creative solutions to problems that have never before been encountered.
This ability to explore and construct solutions to non-routine problems is central to the development of science and technology. Replicating Non-Routine Problem Solving (NRPS) capability in an agent architecture would allow intelligent software agents and/or cognitive robots to deal more intelligently with highly complex and dynamically changing environments, and to cope with situations unforeseen by their designers. Chapter 2 describes an NRPS mechanism in the cognitive architecture framework called Intelligent Distribution Agent (IDA) and its learning incarnation LIDA (Learning IDA). LIDA is a hybrid architecture that integrates different mechanisms for modules and their processes including perception, emotion, memories (sensory, perceptual, episodic, procedural, working), action selection, expectation, learning, deliberation, problem solving, metacognition, and selective attention (or functional consciousness). The authors briefly discuss the LIDA mechanisms that are relevant to NRPS: perception, selective attention, expectation, procedural memory, and action selection. The authors’ general approach is that an NRPS mechanism should involve recruiting and activating all the available knowledge pieces and processes, so that a search for a novel solution takes place over the entire solution space of the agent. Besides the lower-level entities that the authors call behaviors, LIDA’s procedural memory stores high-level procedural constructs called behavior streams or goal hierarchies (hierarchical, partially ordered action plans). If a behavior stream becomes relevant based on the content of selective attention, a copy of it is instantiated with its variables bound and becomes part of the dynamics in the action selection module. While active in the action selection system, it competes for control of the agent’s behavior over multiple cognitive cycles and executes an associated task.
LIDA’s cognitive cycle is an iterative, continually active process that brings about the interplay among the various components of the architecture, resulting in an action being selected and executed. In particular, the authors view a non-routine problem solving mechanism as a special goal hierarchy whose task is to generate a new behavior stream that handles a novel situation that could not be handled by available routine solutions. Non-routine problem solving in LIDA is a deliberative process over multiple cognitive cycles. The authors describe the details of the interaction of the components in the architecture, and the control that the special non-routine problem solving goal hierarchy exerts over the solution search process.

Language-specific as well as culture-specific factors are observed to play a decisive role in User Specifications for spoken Human-Computer Interaction (HCI) Systems. Chapter 3 aims to determine and define a finite set of reusable, transferable and language-independent specifications for prosodic modeling, used as general parameters for the Speech Component in HCI Systems and, specifically, in Service-Oriented Dialog Systems, an application field of HCI usually directed to the general public as a user group. Factors related to special applications, such as emotion recognition, and/or special user groups, such as children or handicapped users, are not included in the present analysis. The present specifications aim to limit empirical prosodic modeling and to provide a general framework for facilitating both the construction and the evaluation of prosodic modeling, independently of the sublanguage-specific parameters chosen for the System. The proposed specifications target the features of Comprehensibility and User-friendliness in the spoken output produced by the System’s Speech Component, as well as the overall efficiency and reliability of the System’s performance.

An IEEE 802.16j mobile WiMAX relay network is a next-generation mobile wireless broadband network.
Compared with IEEE 802.16e, which also supports mobility, IEEE 802.16j introduces relay stations to the network to help relay packets between the base station and a mobile station. When a mobile station is shadowed by a building and thus has a bad-quality channel to the base station, such a relay design can help it achieve a higher throughput from/to the base station. Since IEEE 802.16j is a new standard and no such products are yet available on the market for researchers to evaluate its performance, developing a network simulator that supports IEEE 802.16j network simulations is very valuable. In Chapter 4, the authors present how they extend NCTUns, a network simulator and emulator that directly uses the real-life TCP/IP protocol stack and applications to generate accurate simulation results, to support IEEE 802.16j network simulations. NCTUns supports the two relay modes defined in IEEE 802.16j: the transparent mode and the non-transparent mode. More information about NCTUns is available at http://NSL.csie.nctu.edu.tw/nctuns.html.

Providing multicast service in wireless networks is a challenging issue due to frequent location changes of mobile nodes. If conventional multicast routing protocols, which assume that multicast members are topologically stationary, are used in wireless networks, mobile nodes may experience several problems such as long service disruption periods, packet losses during handovers, and increased propagation delay due to non-optimal multicast routing paths. To overcome such problems, there have been active research efforts to develop an efficient multicast protocol for mobile wireless networks. In Chapter 5, the authors provide a comprehensive overview of mobile multicast protocols. For each protocol, the authors describe the system architecture and protocol operation, and discuss the advantages and the limitations. Then, the authors provide a qualitative comparison of the protocols in terms of delivery path length, the number of packet losses, multicast service disruption period, etc.

In Chapter 6, an overview of today’s most popular user authentication methods for remote access to central site facilities is given. In addition, a case study of a possible combination of strong user authentication with secure payment in a specific banking environment (the Bank) is evaluated. The main characteristics of the specific PKI system in the Bank are described. This PKI system is used for electronic banking services for external users (home banking for physical persons and e-banking for legal persons), as well as a part of the identity management system (Windows logon, single sign-on) and other security services for the Bank’s internal users (secure e-mail, SSL client authentication). One of the main features of this system is the introduction of the EMV CDA MasterCards in the Bank. These are PKI Multos cards which have three applications on them: the M/Chip Select 4 and CAP payment applications, as well as the PKI application. In this way, Maestro users could, for example, make payments at POS terminals and ATMs by using the M/Chip payment applications, via the Internet in unconnected mode by using an external CAP reader and the CAP application on the card, as well as at some Internet web shops by using the secure e-commerce system (3D Secure). Payments could be made in connected mode via the Bank’s web home banking portal by using a connected smart card reader and the PKI application on the card, with an X.509 digital certificate issued by the Bank’s PKI system, which is based on a fully customized domestic PKI solution.

Finding clusters in data is a challenging problem, especially when the clusters are of widely varied shapes, sizes, and densities. Herein a new scalable clustering technique which addresses all these issues is proposed.
In data mining, the purpose of data clustering is to identify useful patterns in the underlying dataset. Within the last several years, many clustering algorithms have been proposed in this area of research. Among all these proposed methods, density clustering methods are the most important due to their high ability to detect arbitrarily shaped clusters. Moreover, these methods often show good noise-handling capabilities, where clusters are defined as regions of typical densities separated by low or no density regions. In Chapter 7, the authors aim at enhancing the well-known DBSCAN algorithm to make it scalable and able to discover clusters from uneven datasets in which clusters are regions of homogeneous densities. The authors achieve the scalability of the proposed algorithm in three stages: using the k-means algorithm to get an initial partition of the dataset, applying the enhanced DBSCAN to each partition, and then using a merging process to get the actual natural number of clusters in the underlying dataset. Experimental results using synthetic datasets show that the proposed clustering algorithm is faster and more scalable than the enhanced DBSCAN counterpart.

In internet protocol television (IPTV) broadcasting, service providers charge a subscription fee by scrambling programs with a conditional access system (CAS) using control words (CWs). A smart card is used to decrypt the CWs and transfer them back to the set-top box (STB) to descramble the scrambled program. Secure communication between the STB and the smart card is closely tied to the interests of service providers and the legal rights of users. In addition, secure key exchange with mutual authentication in the communication between the STB and the smart card is an essential part of secure communication that will significantly improve the security of the system. Several schemes exist to provide secure communication with mutual authentication in IPTV broadcasting.
These schemes propose secure and efficient methods for communication between the STB and the smart card. Unfortunately, the schemes still have some security flaws. In Chapter 8, the authors show that the previous schemes are vulnerable to several attacks. Further, the authors recommend some modifications to the schemes to correct these security flaws.

Friction, manifested as frictional heat, noise, wear, force and so on, is an undesirable physical phenomenon in almost all machine systems. In Chapter 9, however, a friction force generated by a stick-slip motion is introduced to improve the lapping quality of small LED lens molds. Here, lapping means the finishing or polishing of metallic molds by using diamond paste. A novel desktop orthogonal-type robot, which is capable of compliant motion and stick-slip motion, is first presented for lapping small metallic molds with a curved surface, such as an LED lens mold. The robot consists of three single-axis devices with a high position resolution of 1 μm built in Cartesian space. A thin wood stick tool is attached to the tip of the z-axis. The tool tip has a small ball-end shape with a diameter of 1 mm. The control system is composed of a force feedback loop, a position feedback loop and a position feedforward loop. The force feedback loop controls the polishing force, which consists of the contact force in the normal direction and kinetic friction forces in the tangent direction. It is assumed that the kinetic friction forces are generated by Coulomb friction and viscous friction. The position feedback loop controls the position in the pick feed direction, e.g., the z-direction. The position feedforward loop leads the tool tip along a desired trajectory called cutter location data (CL data). The CL data forming a spiral path are generated from the main-processor of a CAD/CAM system. The proposed robot realizes the compliant motion required for surface following control along a spiral path.
Surface following control is the basic strategy for simply constructing an automatic lapping system. In order to improve the lapping performance, a small stick-slip motion control strategy is further added to the control system. The small stick-slip motion is generated orthogonally to the tool moving direction. Generally, stick-slip motion is an undesirable phenomenon and should be eliminated in precision machinery. However, the proposed robot employs a small stick-slip motion so that the polishing energy can be finely changed to partially improve the lapping quality. The effectiveness of the robot is examined through an actual lapping test of an LED lens mold with a diameter of 4 mm.

In Chapter 10, the authors describe the attacks and countermeasures that apply to secure smart card applications. The authors focus on the attacks that can affect cryptographic algorithms, since the security of many smart card applications is dependent on the security of these algorithms. Specifically, they describe Power, Electromagnetic and Fault analysis in the context of smart cards and provide an overview of the necessary countermeasures. The aim of this chapter is to demonstrate that a careful evaluation of embedded software is required to produce a secure smart card application.

IP multicast technology needs various components to offer adequate service to multipoint communications. On the one hand, the IGMP protocol (Internet Group Management Protocol) is in charge of establishing and managing multicast groups. On the other hand, multicast IP addressing (class D IP addresses) together with multicast routing protocols (such as DVMRP, MOSPF or PIM) define the network infrastructure required to allow for the distribution of multicast traffic.
However, for user applications to take maximum advantage of IP-multicast technology, it is essential to use an adequate transport protocol. Most multicast transport protocols were designed to be used in multimedia applications, because the best results regarding performance are achieved through multicast technology. A review of the most interesting multicast transport protocols is thus presented. Chapter 11 also describes the latest R&D achievements in the field, such as the new Fountain codes, thanks to which the techniques and algorithms required to implement loss recovery mechanisms in broadcast-multicast environments are being greatly simplified. Codes of this sort (some of them available under the GPL license) are in great demand for multicast streaming applications. On the other hand, IP-multicast technologies have not been widely accepted by Internet providers, partly because of the complexity of multicast group management (particularly its dynamism). However, IP-multicast technology can be very useful in smaller or more limited scenarios, as is the case of a few LAN networks, either wired or wireless, with various transmission capacities and features. This is the setting under which MUST (MUlticast Synchronous Transfer protocol) is introduced, a multicast transport protocol that quickly and efficiently replicates a great amount of information.

Chapter 12 addresses the task of monitoring-based diagnosis of active systems by means of a method called reactive diagnosis. The approach is inspired by the bridged diagnostic method [17] and by the notion of an uncertain observation [15]. The former deals with monitoring-based diagnosis of polymorphic systems, that is, discrete-event systems (DESs) integrating synchronous and asynchronous behavior, and takes into account completely certain observations. The current work, instead, focuses on asynchronous DESs with uncertain temporal observations.
No contribution in the literature to monitoring-based diagnosis of DESs considers uncertain temporal observations prior to [19]. The work presented in this chapter takes a step forward in this direction by providing a comprehensive theoretical framework for the task at hand, by refining the problem-solving method that performs the task so as to improve its efficiency, and by showing the results of tests run on an implementation of the method itself. The remainder of the chapter is organized as follows. Section 2 introduces a sample application domain that will be exploited in subsequent examples. Section 3 recalls the class of DESs that reactive diagnosis is meant for, namely active systems. Section 4 deals with the graph-based representation of the dynamic evolution of active systems. Section 5 defines the notions of a reactive-diagnosis problem and of its solution, which are based on the concept of a fragmented observation. Section 6 explains how a reactive-diagnosis problem can be solved. Section 7 illustrates how to reduce the size of the search. Section 8 substantiates the diagnostic method with algebraic, pseudo-coded algorithms. Experimental results based on an implementation of reactive diagnosis are provided in Section 9. Section 10 relates the current approach to other works in the literature. Finally, Section 11 draws some final remarks and hints at future extensions and refinements.
In: Computer Science Research and Technology Editor: Karl C. Verdinand
ISBN: 978-1-61728-688-9 © 2009 Nova Science Publishers, Inc.
Chapter 1
MEDIUM ACCESS CONTROL PROTOCOLS IN PASSIVE OPTICAL NETWORKS BASED ON ETHERNET (EPONS)

Noemí Merayo*, Rubén M. Lorenzo, Tamara Jiménez, Ramón J. Durán, Patricia Fernández, Ignacio de Miguel and J. Evaristo Abril
University of Valladolid, Valladolid, Spain
* Corresponding author: E-mail Address: [email protected], Phone Number: +34 983 423000 ext. 5549, Fax Number: +34 983 423667
ABSTRACT
Ethernet Passive Optical Networks (EPONs) are an excellent technology for deploying access networks, as they provide both high bandwidth and Quality of Service (QoS). EPON technology is emerging as a likely global standard for several reasons. First, it is relatively simple and economical, easy to implement and inexpensive to maintain. Second, Ethernet components have followed a cost reduction curve during the last 25 years. Moreover, IP (Internet Protocol) packets are natively supported by EPON without any type of conversion. Finally, the transition from other existing PON technologies to EPON is straightforward, since only the ends of the EPON must be changed, whereas the interfaces in the central office and at the users stay intact. The EPON infrastructure uses a single wavelength in each of the two directions, and both wavelengths are multiplexed on the same fibre. Since all users share the same wavelength in the upstream direction, a Medium Access Control (MAC) protocol is necessary to avoid collisions between packets from different Optical Network Units (ONUs). Time Division Multiple Access (TDMA) is the most widespread control scheme in these networks, although it is inefficient because network traffic is neither homogeneous nor continuous [31-32]. For this reason, Dynamic Bandwidth Allocation (DBA) algorithms, based on the TDMA protocol, are the best choice, as they dynamically distribute the available bandwidth depending on the current demand of the ONUs. Therefore, the current MAC protocols in Ethernet PONs are based on a dynamic distribution of the upstream bandwidth among the connected ONUs. Although EPON infrastructures are able to provide enough bandwidth for today's applications, both the gradual increase in the number of users and the bandwidth requirements of new emerging services demand an upgrade of such access networks. This can be achieved by adding new wavelengths to be shared in the upstream and downstream directions, which leads to the so-called Wavelength Division Multiplexed EPONs (WDM-EPONs). WDM technology in EPONs is thus viewed as the future medium access control approach to overcome the lack of capacity in access networks. Pure WDM-EPON architectures assign one dedicated wavelength per ONU, which implies more dedicated bandwidth and more security in the system. However, because of the high cost of such a deployment, the WDM migration will be limited by technological costs and driven by the needs of service providers. In the short term, more flexible WDM-EPON architectures are needed, which can be upgraded in a cost-effective way. The combination of WDM technology with Time Division Multiplexing (TDM) techniques is likely the best near-future approach. These hybrid architectures combine the wavelength assignment of WDM techniques with the power splitting of TDM methods. Recently, there has been an increasing interest in the development of optically amplified EPONs in order to extend the reach (up to 100 km) and the split ratio of these networks, thus leading to the so-called Long-Reach EPONs. These network architectures are quite cost-effective, as they simplify the infrastructure by combining the access and metro networks into a single network. That characteristic, together with WDM-EPON infrastructures, allows an increase in the potential number of users in the access network.
Although the Long-Reach EPON is a very promising architecture, very little research has focused on the development of reliable DBA protocols able to face the impact of the increase in the propagation time. When classic algorithms developed for typical EPONs are applied to Long-Reach EPONs, the bandwidth utilization is inefficient due to the increase of the packet propagation time. Hence, adequate DBA protocols are highly required for these new network architectures.
Keywords: Access network, Ethernet, Passive Optical Network (PON), Medium Access Control Protocol (MAC), Time Division Multiple Access (TDMA), Wavelength Division Multiple Access (WDMA), Service Level Agreement (SLA), Class of Service (CoS), Dynamic Bandwidth Allocation (DBA), Long-Reach.
1. INTRODUCTION
The access network, also called the first mile or last mile, connects the service provider central offices to residential or business customers. The demanded services differ considerably depending on the type of customer. Residential users demand applications related to leisure activities, such as broadband Internet, television or interactive games, whereas companies demand multimedia services for the bidirectional transmission of all kinds of information. In recent years, network traffic has been increasing at very high rates, which has driven an important evolution of the transport network. However, the access network has not undergone any comparable evolution. Besides, new emerging services and the growth of Internet traffic have accentuated the lack of access network capacity. As a consequence, operators have deployed different technologies to support such demand. Telephone operators have tended to deploy Digital Subscriber Line (DSL) technology, which uses the same twisted pair as telephone lines and therefore requires a DSL modem at the customer premises and a Digital Subscriber Line Access Multiplexer (DSLAM) in the central office. The data rate offered by DSL technologies is not enough to support integrated voice, data and video services. Furthermore, the area that can be covered by DSL is limited by distance, which means that it cannot reach every subscriber. On the other hand, cable television companies integrate data services over their coaxial cable networks. Usually, these architectures combine both fibre and coaxial cable, resulting in a Hybrid Fibre Coaxial (HFC) network, where the fibre runs from the head-end to a curbside optical node, and the coaxial cable covers the rest of the path to the final subscriber. The main problem of this architecture is that each shared node has a limited effective data throughput, which is divided among many homes. Consequently, each subscriber obtains a very low speed during peak hours. In conclusion, the deployed technologies, DSL and coaxial cable, are not able to provide the bandwidth necessary to support the new highly demanding services. In addition, up to now, the access network has been tied to the type of information delivered: the copper pair for telephony and the coaxial cable for television. From now on, however, new access networks should be unified and should deliver voice, data and video over the same platform. Thus, an access technology that is simple, scalable and capable of transporting voice, data and video over the same network is strongly needed. Optical fibre is expected to be the best option to deal with the existing first mile challenges. The deployment of fibre in the access network is known as FTTX (Fiber To The X). However, the FTTX term groups different categories depending on the portion of fibre included in the access network. The most typical FTTX infrastructures are FTTH (Fibre To The Home), FTTB (Fibre To The Business), FTTmdu (Fibre To The multi-tenant building), FTTN (Fibre To the Node) and FTTC (Fibre To the Curb).
The chapter is organized as follows. Section 2 describes the Passive Optical Network infrastructures based on the Ethernet standard. Section 3 presents the main challenges in EPON networks regarding the contention methods used in these architectures; it focuses on the Time Division Multiple Access (TDMA) and Wavelength Division Multiple Access (WDMA) protocols, as they are the most widespread medium access control protocols used in EPONs. Section 4 explains the future trends based on the deployment of large-split extended EPONs, also called Long-Reach EPONs. Finally, the last section presents the most relevant conclusions of this chapter.
2. ETHERNET PASSIVE OPTICAL NETWORKS
2.1. Introduction to Passive Optical Networks
As fibre is viewed as the most suitable transmission medium in the access network, many topology alternatives have been studied, such as the point-to-point, the curb-switched and the point-to-multipoint architectures. The point-to-point topology extends one fibre per user and therefore needs N or 2N fibres and 2N optical transceivers, N being the number of users. On the other hand, the curb-switched topology uses only one trunk fibre, which saves fibre at the Central Office (CO), but it needs 2N+2 optical transceivers and electrical power in the field. Finally, the point-to-multipoint topology is the most cost-saving architecture, as it uses one trunk fibre from the CO to one splitter and only needs N+1 optical transceivers. Besides, it requires no electrical power in the field, as it uses an optical splitter to divide the signal into as many branches as there are users connected to the network. This network infrastructure is called Passive Optical Network (PON), and it is a very attractive solution to the first mile problem, as it can provide both high bandwidth and class of service differentiation. This access technology is based on bidirectional communication between the Optical Line Termination (OLT), located inside the Central Office, and several Optical Network Units (ONUs), located inside or near the end subscribers' premises [1-3], as Figure 1 shows. The passive optical network architecture defines a shared point-to-multipoint infrastructure. It is typically based on a tree topology which covers distances up to twenty kilometres. This network infrastructure is partially shared among all users, as all of them use the same equipment at the OLT, which permits an important reduction in operation and maintenance costs. Normally, PON technology uses a single wavelength in each of the two directions (Figure 1), one for the downstream direction (from the OLT to the ONUs) and another for the upstream direction (from the users to the OLT). Both wavelengths are multiplexed on the same fibre by means of Wavelength Division Multiplexing (WDM). However, some configurations implement one additional wavelength to transport video in the downstream channel. Finally, PON technology does not require active components in the outside plant. As a consequence, the active components are located only at both ends of the access network.
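The transceiver counts quoted above can be compared with a short Python sketch. The function name and the choice of 32 users are illustrative assumptions; the formulas (2N, 2N+2 and N+1) are the ones given in the text.

```python
def transceiver_count(topology: str, n_users: int) -> int:
    """Optical transceivers needed for n_users, using the counts quoted in the text."""
    if n_users < 1:
        raise ValueError("n_users must be at least 1")
    counts = {
        "point-to-point": 2 * n_users,        # 2N
        "curb-switched": 2 * n_users + 2,     # 2N+2
        "point-to-multipoint": n_users + 1,   # N+1 (the PON case)
    }
    if topology not in counts:
        raise ValueError(f"unknown topology: {topology}")
    return counts[topology]


# Example with a hypothetical network of 32 users.
for name in ("point-to-point", "curb-switched", "point-to-multipoint"):
    print(f"{name}: {transceiver_count(name, 32)} transceivers")
```

For any N greater than 1 the point-to-multipoint topology needs roughly half the transceivers of the other two, which is the cost argument made above.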
Figure 1. EPON network operation in both directions, upstream and downstream using the TDMA protocol
2.2. EPON Standards
Several alternatives have been proposed to standardize PON networks, such as APON (ATM PON), BPON (Broadband PON), GPON (Gigabit PON) and EPON (Ethernet PON). Among them, the two most important are GPON and EPON. The first one, GPON, defined by the FSAN (Full Service Access Network) consortium, is viewed as a good future solution, since it can support multiple classes of service very efficiently. However, Ethernet PONs (EPONs), based on the Ethernet protocol, are considered a good choice for today's optimized access networks, as Ethernet is a well-known, cheap technology that is interoperable with a variety of legacy equipment. For this reason, many studies currently focus on this technology. EPON technology is emerging as a likely global standard for several reasons. First, it is relatively simple and economical, easy to implement and inexpensive to maintain. Second, Ethernet components have followed a cost reduction curve during the last 25 years. Moreover, IP (Internet Protocol) packets are natively supported by EPON without any type of conversion. Finally, the transition from other existing PON technologies to EPON is straightforward, since only the ends of the EPON must be changed, whereas the interfaces in the central office and at the users stay intact. In fact, EPON is by far the most popular FTTH/B technology worldwide. Of the 29 million FTTH/B users at the end of 2008, 60% corresponded to EPON deployments and only 17% to GPON [4]. EPON is deployed mainly in the Asia-Pacific area. In Japan, which leads the FTTH market in the world, the service operator NTT (Nippon Telegraph and Telephone) had more than 10 million users connected by EPON/GEPON technology at the end of 2008 [5], and the number of subscribers is expected to grow to 20 million by 2010 [6]. In South Korea there were more than 5 million EPON/GEPON users with the operators KT and SK Broadband [5] by the end of 2008. However, the Japanese and South Korean markets will be eclipsed by the new FTTx deployments in China, with EPON as the dominant architecture. In fact, it is expected that by the end of 2012 the number of users will grow by 50 million [6]. Moreover, the success of EPON in China could both reduce the cost of EPON and influence the decisions of other emerging markets. For instance, eastern European countries such as Russia and Belarus are following the Chinese example [7]. Regarding the United States, although the main technology used is GPON, some EPON deployments can also be found. Some operators, such as US SONET, Cable ONE or CC Communications, support triple play services in some states by means of EPON technology [8]. In contrast, Europe, whose FTTH/B leaders are Sweden, Norway and Slovenia [9], is deploying mainly GPON technology, whereas EPON deployments are limited to small-scale projects. Finally, it is expected that the number of worldwide FTTH/B subscribers will grow to 140 million by 2014 [4]. Asia will remain the leader of the FTTH/B market, but the gap between the United States and Europe will narrow.
2.2.1. The EPON standard
The main standardization force behind EPON is the IEEE 802.3ah Task Force [10]. The standard was designed to support a symmetrical bit rate of 1 Gbit/s in the upstream and downstream channels. Hence, the typical configuration uses a symmetric line rate of 1.25 Gbit/s in both channels. As it is based on the Ethernet protocol, the EPON transmission unit is a common Ethernet frame with some modifications in the preamble field. A Logical Link ID (LLID) label is inserted in the preamble of each Ethernet frame (Figure 2), which locally identifies each ONU inside the EPON. ONUs are polled periodically to check whether new ONUs have been connected, in which case parameters such as the local address (LLID) inside the network have to be established. In the downstream direction, packets are broadcast from the OLT to every ONU using packets of variable size from 64 to 1518 bytes, according to the IEEE 802.3 standard. Thus, as every ONU receives the same data, its MAC layer has to determine which packets are addressed to it. To carry out this task, the MAC layer of the ONU extracts the LLID label of every Ethernet frame and checks whether it matches the LLID of the ONU [11-12]. In the upstream direction, each ONU typically sends its data following a TDMA scheme, to avoid collisions between packets from several ONUs, as all of them share the upstream channel. At the end of every cycle each ONU reports its buffer status in order to request bandwidth for the next cycle. TDMA is the most widespread method in EPONs, as the Carrier-Sense Multiple Access (CSMA) protocol is very difficult to implement in such infrastructures. With CSMA, ONUs are not able to detect a collision at the OLT due to the directional property of the optical splitter/combiner. Moreover, Chae et al. [13] demonstrated that, although the OLT can detect collisions and inform the ONUs by sending them a detection signal, the propagation delay in the EPON greatly reduces the network efficiency.
Besides, as Ethernet PON networks reach distances up to 20 km, the CSMA protocol cannot be used due to its limitations when the distance increases [14]. As a consequence, TDMA is one of the most used access control schemes in EPONs. In a synchronous time division multiplexing scheme, time is divided into fixed-length frames called cycle times. Each frame is divided into as many time slots as there are ONUs sharing the transmission channel, and each slot is dedicated to one ONU. Each slot periodically transports packets of one specific ONU. Therefore, the OLT distributes the available bandwidth in each cycle among the ONUs and sends each of them a separate grant message with the new slot length and the time at which its transmission starts. In order to support the management and the allocation of the time slots in the TDMA protocol, the EPON specification defines the so-called Multi-Point Control Protocol (MPCP), which operates inside the MAC control sublayer [15].
[Figure 2: frame fields, in order: Preamble (7 bytes), SOF (1 byte), Destination (6 bytes), Source (6 bytes), Length (2 bytes), Data (0-1500 bytes), Pad (0-46 bytes), FCS (4 bytes). The preamble region is shown expanded into the subfields Reserved, SLD, Reserved, LLID and CRC.]
Figure 2. General structure of an IEEE 802.3 Ethernet frame
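As a rough illustration of the synchronous scheme described above, the following Python sketch divides a cycle into equal slots and builds one grant (start time and length) per ONU. The 2 ms cycle, the 16 ONUs and the names Grant and fixed_tdma_grants are assumptions made for this example; guard times and the actual MPCP message format are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Grant:
    onu_id: int
    start_us: float   # start of the transmission window within the cycle
    length_us: float  # duration of the transmission window


def fixed_tdma_grants(num_onus: int, cycle_time_us: float) -> List[Grant]:
    """Split one cycle into num_onus equal slots, one slot per ONU."""
    if num_onus < 1 or cycle_time_us <= 0:
        raise ValueError("need at least one ONU and a positive cycle time")
    slot = cycle_time_us / num_onus
    return [Grant(i, i * slot, slot) for i in range(num_onus)]


# Hypothetical example: 16 ONUs sharing a 2 ms cycle.
for grant in fixed_tdma_grants(16, 2000.0)[:3]:
    print(grant)
```

Every ONU receives the same slot in every cycle, whatever its load, which is the property that dynamic allocation schemes later relax.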
Regarding authentication and encryption in EPONs [16], the IEEE 802.3ah standard specifies that authentication has to be performed at the logical level by using the IEEE 802.1x family of standards [17]. Encryption is also necessary, and it is highly recommended that it be performed at the logical level as well. The initial exchange of the encryption key and its periodic update should follow the IEEE 802.1x and 802.11i standards [18-19]. The emulation sublayer carries out the encryption and decryption tasks, whereas the MAC control layer is responsible for updating the encryption key. Finally, the EPON standard does not allow frame fragmentation, and hence the time slot length allocated to each ONU should match the length of its packets in order to fully use the capacity of the assigned time slots.
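The effect of the no-fragmentation rule can be seen with a small Python sketch. It assumes a first-in, first-out queue and counts only frame bytes (preamble and inter-frame gap are ignored); the function name and the sample sizes are hypothetical.

```python
from typing import List, Tuple


def fill_slot(queue_bytes: List[int], slot_bytes: int) -> Tuple[List[int], int]:
    """Send whole frames in FIFO order; return (frames sent, unused bytes)."""
    sent = []
    remaining = slot_bytes
    for size in queue_bytes:
        if size > remaining:
            break              # the head-of-line frame does not fit and cannot be split
        sent.append(size)
        remaining -= size
    return sent, remaining


# Hypothetical queue of four frames and a slot worth 3000 bytes.
sent, unused = fill_slot([1518, 64, 1000, 1518], 3000)
print(sent, unused)
```

In this example 418 bytes of the slot stay idle because the next frame is too long, which is why the allocated slot length should track the queued packet lengths.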
2.2.1.1. The Multi-point control protocol (MPCP) in EPON networks
The Multi-Point Control Protocol specifies a control mechanism between a master unit and several slave units connected along a point-to-multipoint network segment to allow an efficient transmission in a shared channel. EPON uses MPCP to control the point-to-multipoint communication inside the access network. This protocol is implemented in the MAC layer and it carries out the allocation of the available bandwidth to each ONU, the discovery of newly connected ONUs and the synchronization of the ONUs inside the EPON. The protocol accesses and controls the multipoint topology by means of messages, clocks and the state of every ONU [20-21]. MPCP consists of a set of messages that schedule the communication in both upstream and downstream channels. Several messages control the state of every connected ONU (Register, Register_ack, Register_req) [22], whereas other messages control the communication between the OLT and the ONUs in order to distribute the available bandwidth (Gate, Report). To request bandwidth, each ONU sends a Report message, typically at the end of its transmission window, with the bandwidth demanded for the next cycle. Once the OLT receives this control message, it keeps track of every demand and assigns bandwidth to every ONU according to the bandwidth allocation policy. Then, the OLT informs the ONUs about their transmission window for the next cycle by means of Gate messages, in which it also specifies the start time of the transmission. The Gate control message is shown in Figure 3 (a). It has a length of 64 bytes and it is used to inform ONUs about the transmission window allocated for the next cycle [22]. According to the IEEE 802.3ah standard, each Gate can include up to four different transmission windows, which is useful when ONUs support different queues. Therefore, the OLT can specify the bandwidth allocated to each priority queue associated with a different service. The Gate message includes the field “Grant # Start Time”, which indicates the beginning of the transmission window, and the field “Grant # Grant Length”, which gives the duration of the transmission window. On the other hand, a Report message has 64 bytes and its structure is shown in Figure 3 (b) [22]. Among the most relevant fields, the first two indicate the destination and source addresses. As EPON networks are able to support different applications, each ONU is equipped with several queues to store packets of different priorities. The Report message can therefore report the state of up to eight different queues, each of them related to a different application, by means of the field “Queue # Report”. Besides, the “Report bitmaps” field shows the order in which the state of every queue appears in the control message.
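The Report and Gate exchange described above might be modelled as follows. This is a minimal Python sketch, not the IEEE 802.3ah encoding: the class and function names, the use of bytes as the unit, and the policy of granting a single window capped at a maximum size are all assumptions made for illustration. Only the two limits (eight queue reports per Report, four grants per Gate) come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

MAX_QUEUES_PER_REPORT = 8   # from the text: up to eight queue reports
MAX_GRANTS_PER_GATE = 4     # from the text: up to four transmission windows


@dataclass
class Report:
    onu_id: int
    queue_bytes: List[int]           # backlog of each priority queue (assumed unit: bytes)

    def __post_init__(self) -> None:
        if len(self.queue_bytes) > MAX_QUEUES_PER_REPORT:
            raise ValueError("a Report carries at most eight queue reports")


@dataclass
class Gate:
    onu_id: int
    grants: List[Tuple[int, int]]    # (start time, grant length) pairs

    def __post_init__(self) -> None:
        if len(self.grants) > MAX_GRANTS_PER_GATE:
            raise ValueError("a Gate carries at most four grants")


def build_gate(report: Report, start_time: int, max_window: int) -> Gate:
    """Hypothetical policy: grant one window, capped at max_window."""
    length = min(sum(report.queue_bytes), max_window)
    grants = [(start_time, length)] if length > 0 else []
    return Gate(report.onu_id, grants)


# An ONU reports three queues; the OLT answers with one capped grant.
print(build_gate(Report(3, [500, 0, 1200]), start_time=1000, max_window=1500))
```

A real OLT would apply its own allocation policy at the point where build_gate computes the length; the cap used here is only one simple possibility.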
[Figure 3 (a), Gate message fields with their size in octets: Destination Address (DA), 6; Source Address (SA), 6; Length/Type=88-08, 2; Opcode=00/02, 2; TimeStamp, 4; Number of Grants/Flags, 1; for each of Grant #1 to Grant #4: Start Time, 0/4, and Grant Length, 0/2; Synchronization Time, 0/2; Pad/Reserved, 13/39; FCS, 4.]
[Figure 3 (b), Report message fields: Destination Address (DA), 6; Source Address (SA), 6; Length/Type=88-08, 2; Opcode=00/03, 2; TimeStamp, 4; Number of queue sets, 1; Report bitmaps; Queue #0 Report to Queue #7 Report; Pad/Reserved; FCS, 4.]
Figure 3. (a) Packet format of the Gate message in Ethernet PONs (b) Packet format of the Report message in Ethernet PONs
The MPCP protocol is also used to perform power ranging and distance ranging. Distance ranging allows the propagation delay between the OLT and every ONU to be kept up to date. This propagation time is known as the Round Trip Time (RTT) or round-trip delay, and it is essential when using TDMA in EPONs [20] in order to keep the network synchronized. This delay is the time required for a packet to travel from the OLT to a specific ONU and back again. On the other hand, the power ranging process is necessary to avoid power saturation at the receivers. This effect is eliminated by optimizing the sensitivity of every receiver in both upstream and downstream channels. Finally, the MPCP protocol handles the periodic auto-discovery of newly connected ONUs in an operating EPON. When new ONUs are connected, the MPCP protocol negotiates parameters such as the distance to the ONU, the gain control time, the clock synchronization time and the laser on/off times.
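To give an order of magnitude for the RTT, the following Python sketch estimates it from the fibre length alone. The group index of 1.5 is an assumed round value for silica fibre, and processing delays at the OLT and the ONU are ignored; the function name is hypothetical.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458   # speed of light in vacuum
FIBRE_GROUP_INDEX = 1.5                 # assumed round value for silica fibre


def round_trip_time_us(distance_km: float) -> float:
    """Propagation-only RTT between the OLT and an ONU, in microseconds."""
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    one_way_s = distance_km * FIBRE_GROUP_INDEX / SPEED_OF_LIGHT_KM_PER_S
    return 2.0 * one_way_s * 1e6


# A typical 20 km EPON versus a 100 km Long-Reach EPON.
for distance in (20.0, 100.0):
    print(f"{distance:.0f} km: RTT of about {round_trip_time_us(distance):.0f} microseconds")
```

Under these assumptions the RTT is roughly 200 microseconds at 20 km and about 1 ms at 100 km, which illustrates why the OLT must know the RTT of each ONU to keep upstream slots aligned.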
2.2.1.2. Point to Point Emulation (PtPE) and Shared Medium Emulation (SME) layers in EPON networks
Apart from the MPCP protocol, EPON technology also requires additional mechanisms in order to achieve full compatibility with the Ethernet standard IEEE 802.1D [10]. EPONs must support point-to-point communication between ONUs at level two (link layer) without involving level three (network layer). To do that, two additional sublayers have to be added, the Shared Medium Emulation (SME) sublayer and the Point to Point Emulation (PtPE) sublayer [23]. These two sublayers are located under the MAC layer, leaving the functionality of that layer unchanged. Both sublayers rely on the labelling of the Ethernet frame, assigning to each ONU one 16-bit LLID label to identify the ONU inside the EPON. In order to emulate a point-to-point medium in the downstream direction, the OLT is equipped with as many MAC ports as there are ONUs connected to the EPON. When the OLT sends a frame in the downstream direction (Figure 4 (a)), the PtPE sublayer inserts the LLID label associated with the MAC port on which the frame arrived, which corresponds to the ONU that should receive the frame. Although the optical splitter delivers the frame to every ONU, only one of them has the address that matches the LLID label contained in the incoming frame. Therefore, the PtPE sublayer of an ONU passes the frame to the upper MAC layer only if both addresses agree. Otherwise, the PtPE sublayer discards the frame and the upper MAC layer cannot access its contents. Consequently, the PtPE sublayer emulates point-to-point behaviour in the downstream channel. In the upstream channel (Figure 4 (a)), the PtPE sublayer of each ONU inserts its LLID label in every transmitted frame. Therefore, when the PtPE sublayer of the OLT receives a frame, it demultiplexes the signal and sends it to the MAC port related to the LLID contained in the received frame. On the other hand, the SME sublayer was developed to emulate a direct connection between ONUs [23]. As can be observed in Figure 4 (b), the OLT sends a frame with an LLID label of broadcast type, so every ONU accepts it. If one ONU wants to communicate with another one, the source ONU creates a frame with the LLID label of the destination ONU and sends it to the OLT. When the OLT receives the frame, its SME sublayer transmits this frame to every ONU in the downstream channel. The destination ONU is then the only one that accepts the frame, as the LLID label contained in the frame matches its own LLID label.
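The downstream filtering performed by the emulation sublayer can be sketched in a few lines of Python. The class names, the deliver helper and the broadcast value 0xFFFF are hypothetical (the text does not give the encoding of the broadcast label); only the accept-or-discard rule follows the description above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

BROADCAST_LLID = 0xFFFF   # hypothetical marker; the real broadcast encoding is not given here


@dataclass(frozen=True)
class Frame:
    llid: int
    payload: bytes


class OnuEmulationSublayer:
    def __init__(self, own_llid: int) -> None:
        self.own_llid = own_llid

    def receive(self, frame: Frame) -> Optional[bytes]:
        """Pass the payload to the MAC layer only if the LLID matches or is broadcast."""
        if frame.llid in (self.own_llid, BROADCAST_LLID):
            return frame.payload
        return None   # frame discarded; the MAC layer never sees it


def deliver(frame: Frame, onus: List[OnuEmulationSublayer]) -> Dict[int, bytes]:
    """The splitter hands the same frame to every ONU; collect what each one accepts."""
    accepted = {}
    for onu in onus:
        payload = onu.receive(frame)
        if payload is not None:
            accepted[onu.own_llid] = payload
    return accepted


onus = [OnuEmulationSublayer(llid) for llid in (1, 2, 3)]
print(deliver(Frame(2, b"unicast"), onus))
print(deliver(Frame(BROADCAST_LLID, b"to all"), onus))
```

Although every ONU physically receives every frame, only the addressed ONU (or all of them, for a broadcast label) hands the payload to its MAC layer.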
[Figure 4: protocol stacks (MAC layers over the emulation sublayer) of the OLT and of ONU 1, ONU 2 and ONU 3, with the downstream and upstream directions indicated. (a) Point to Point Emulation (PtPE): frames not addressed to an ONU are discarded at its PtPE layer (marked X). (b) Shared Medium Emulation (SME).]
Figure 4. (a) Emulation of a point-to-point communication at both upstream and downstream channels in EPONs (b) Emulation of a shared medium communication at both upstream and downstream channels in EPONs
2.2.2. The 10 Gigabit EPON standard (10GEPON)
This standard is an evolution of the current EPON systems, namely the IEEE 802.3ah standard, intended to greatly increase their capacity. Interest in this new system began in 2006 and it is currently being standardized by the IEEE 802.3av Task Force [24-25]. The new standard aims to be fully compatible with the currently deployed EPON technology. Besides, it has been developed to support several configurations regarding the upstream and downstream transmission rates, such as the symmetric capacities of 1 Gbit/s and 10 Gbit/s. In order to do that, two options are being considered:
• Upstream and downstream channels are controlled by means of wavelength division multiplexing. This configuration uses independent wavelengths in both upstream and downstream channels.
• The upstream channel is controlled by time division multiplexing and the downstream channel by wavelength division multiplexing.
The roadmap of the next system architecture indicates that GEPON equipment will evolve gradually from the current 1 Gbit/s equipment. As a consequence, both asymmetric and symmetric transmission rates should be supported in the upstream and downstream channels. Some network operators, such as NTT, specify in [26] the implementation issues and requirements of a GEPON system. Since this evolution implies the coexistence of both systems (EPON and GEPON), some technical issues have to be solved:
• The coexistence of both systems is compulsory in order to ensure a smooth transition from the 1 Gbit/s equipment to the 10 Gbit/s equipment. The wavelength allocation in the 10 Gbit/s systems has to take into account the existence of 1 Gbit/s equipment in the same outside plant in both channels.
• The coexistence in the upstream channel has to be achieved by using TDMA. The OLT will then receive multiple packet bursts with different transmission rates, which should be properly distinguished and processed. Current approaches have to face several technical problems, such as transmission rate detection or gain adjustment.
• The downstream channel has to be multiplexed by means of WDM, due to the incompatibility between the transmission rates of the EPON and GEPON standards. In particular, EPON operates at 1 Gbit/s or 1.25 Gbit/s, whereas GEPON supports rates up to 10.3125 Gbit/s.
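The 1.25 Gbit/s and 10.3125 Gbit/s figures quoted in this chapter are consistent with the overhead of the line coding of each system. The Python sketch below assumes 8B/10B coding for the 1 Gbit/s system and 64B/66B coding for the 10 Gbit/s system; these coding schemes are not named in the chapter and are stated here as an assumption, and the function name is hypothetical.

```python
from fractions import Fraction


def line_rate_gbps(data_rate_gbps: int, coded_bits: int, data_bits: int) -> Fraction:
    """Line rate obtained when every data_bits of data are sent as coded_bits."""
    return Fraction(data_rate_gbps) * Fraction(coded_bits, data_bits)


epon_rate = line_rate_gbps(1, 10, 8)       # assumed 8B/10B coding
ten_g_rate = line_rate_gbps(10, 66, 64)    # assumed 64B/66B coding
print(float(epon_rate), float(ten_g_rate))
```

The large difference between the two line rates is one reason why both systems cannot simply share one downstream wavelength.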
As the transition towards a symmetric 10 Gbit/s system is intended to be gradual, it may require a significant replacement of the currently deployed EPON active devices. Therefore, it would be sensible to replace only some components at a time, so that operators can spread the CAPEX (CAPital EXpenditures) investment over time. On the other hand, the 1 Gbit/s and 10 Gbit/s equipment must be able to work at exactly the same time. This, however, requires a careful study of the wavelengths supported by each system. In the upstream channel both systems share the same transmission window (near 1310 nm), which will be moved to 1270 nm. In contrast, in the downstream channel the wavelengths should be moved to the 1570-1600 nm window, depending on the laser sources in the OLT and the design of new optical filters [27]. In relation to the quality of service in GEPONs, the dynamic bandwidth allocation schemes applied to the upstream channel are fundamental to properly deliver the applications used by subscribers. Some algorithms, such as the one proposed by Yoshihara et al. [28], achieve a good trade-off between delay and bandwidth efficiency.
3. CONTENTION METHODS IN EPON NETWORKS
Passive Optical Networks are typically point-to-multipoint high-capacity access networks based on a tree topology between the Optical Line Terminal (OLT) and the Optical Network Units (ONUs). In the downstream direction (from the OLT to the ONUs) the OLT controls the traffic which arrives from outside the network. In this direction packets do not suffer any contention, as the OLT decides which packets are sent to the different ONUs in a broadcast way. Therefore, all ONUs receive the same traffic and each of them has to determine whether a packet is addressed to it. Otherwise, the ONU should discard the incoming packet. The main problem in PON networks occurs in the upstream direction (from the ONUs to the OLT), because all users share the same wavelength, so a MAC protocol is necessary to avoid collisions between packets from different ONUs. Several medium access control protocols can be applied to PON networks, such as WDMA (Wavelength Division Multiple Access), O-CDMA (Optical Code Division Multiple Access) and TDMA (Time Division Multiple Access). Regarding O-CDMA, current technology is not mature enough to apply it in PON networks. TDMA is today's most widespread control scheme in these networks, as it needs only a single upstream wavelength and is very easy to implement. Although WDMA may currently be an expensive alternative for access networks, as it requires several wavelengths operating in the upstream, it is a very promising alternative for near-future PONs.
3.1. Time Division Multiple Access (TDMA)
In a TDMA scheme, time is divided into periodic cycles, which in turn are divided into as many time slots as there are ONUs sharing the channel. Each slot is dedicated to one ONU, and every cycle is organized in such a way that one slot periodically transports packets from one ONU, as shown in Figure 5. The performance of TDMA has been studied in some works [29-30], which model this scheme by means of a switch. When one specific ONU starts its transmission, it can be viewed as if the ONU were visited by a server, which stays at the ONU taking out packets during its time slot. However, TDMA may use the bandwidth inefficiently when the traffic is not homogeneous, as it always allocates the same bandwidth regardless of the demand of each ONU. In fact, it is well known that real traffic is neither homogeneous nor continuous, and therefore TDMA is not the most suitable scheme to apply.
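The inefficiency of fixed slots under non-homogeneous traffic can be illustrated with a minimal sketch. The function name, the byte-based units and the example demands are assumptions made for illustration; they do not come from the cited works.

```python
def tdma_utilization(demands_bytes, slot_bytes):
    """Fraction of a fixed TDMA cycle that actually carries data.

    Each ONU owns one slot of slot_bytes per cycle, whatever its demand
    (hypothetical model: demand and slot size are both given in bytes).
    """
    if not demands_bytes or slot_bytes <= 0:
        raise ValueError("need at least one ONU and a positive slot size")
    # An ONU can never send more than its own slot, and idle slots are lost.
    sent = sum(min(demand, slot_bytes) for demand in demands_bytes)
    return sent / (slot_bytes * len(demands_bytes))


# Hypothetical example: one busy ONU and three nearly idle ones.
print(tdma_utilization([15000, 500, 0, 500], slot_bytes=5000))
```

With these illustrative numbers the busy ONU is limited to its own slot while most of the other slots stay empty, which is the situation that motivates dynamic allocation.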
Noemí Merayo, M. Rubén Lorenzo, Tamara Jiménez et al.
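[Figure 5 diagram: the OLT is connected through an optical splitter/combiner to ONU 1, ONU 2, ..., ONU N; the numbered packets of each ONU travel upstream in that ONU's own time slot.]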
Figure 5. Implementation of the TDMA protocol in EPON networks
To deal with this problem, algorithms which distribute the available bandwidth dynamically have been proposed, called Dynamic Bandwidth Allocation (DBA) algorithms. They adapt the network capacity to the traffic conditions by changing the bandwidth assigned to each ONU depending on its current requirements.
3.1.1. Deployment of the TDMA protocol in Ethernet PONs
The TDMA protocol requires strong synchronization within the system. When it is applied to EPONs, a guard time must be added between the transmissions of consecutive ONUs to avoid collisions caused by different propagation times and by inaccuracies in the global synchronization of the system. The IEEE 802.3ah standard [10] specifies that the guard time in EPONs depends on the Physical Medium Dependent (PMD) sublayers. The most relevant parameters are the laser on and off times (ON/OFF), the Automatic Gain Control (AGC) and the Clock and Data Recovery (CDR). The range of values for each of these parameters is specified in [33]:
• AGC (tAGC) and CDR (tCDR) parameters. The value of both parameters can be chosen among 96 ns, 192 ns, 288 ns and 400 ns.
• ON (tON) and OFF (tOFF) laser times (ON/OFF). The typical value is fixed at 512 ns.
• Dead Zone (tdead_zone). The guard time should include a fixed time of 120 ns in order to account for possible inaccuracies in the MPCP protocol. For instance, the interval during which the laser of one ONU is being switched off can overlap with part of the interval in which the laser of the next ONU is being switched on.
The guard time is the same for every ONU; it is fixed and independent of every network parameter. Moreover, the guard time is equal to the sum of all the previous time components, so it lies between a minimum and a maximum value depending on the values chosen for the different parameters. The guard time is considered wasted time and thus the network efficiency (η) partly depends on this value. The network efficiency can therefore be expressed by Equation 1:
$$\eta = 1 - \frac{\left(t_{dead\_zone} + t_{ON} + t_{AGC} + t_{CDR}\right) N_{ONUs}}{T_{cycle}} \tag{1}$$
Equation 1 shows that, for a fixed cycle time, the impact of the guard time on the efficiency grows with the number of ONUs (NONUs). Consequently, the network efficiency tends to decrease with the number of connected ONUs. In the same way, the lower the cycle time (Tcycle) is, the lower the network efficiency becomes. Since the guard time is bounded between a minimum and a maximum value, so is its influence on the network efficiency. In addition, the efficiency also depends on the time consumed by the MPCP protocol, as it is considered wasted time.
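As a rough numerical illustration of Equation 1, the sketch below evaluates the efficiency for the smallest and largest guard times allowed by the parameter ranges listed above. It uses the four terms that appear in Equation 1; the 16 ONUs and the 2 ms cycle time are assumed values chosen only for the example, and the function names are hypothetical.

```python
def guard_time_ns(t_agc, t_cdr, t_on=512, t_dead_zone=120):
    """Guard time in ns, built from the four terms of Equation 1."""
    return t_dead_zone + t_on + t_agc + t_cdr


def network_efficiency(guard_ns, n_onus, t_cycle_ns):
    """Equation 1: share of the cycle that is not spent on guard times."""
    return 1.0 - guard_ns * n_onus / t_cycle_ns


# Assumed scenario (not from the source): 16 ONUs and a 2 ms cycle.
N_ONUS = 16
T_CYCLE_NS = 2_000_000

shortest = guard_time_ns(t_agc=96, t_cdr=96)    # smallest AGC/CDR values
longest = guard_time_ns(t_agc=400, t_cdr=400)   # largest AGC/CDR values

print(shortest, network_efficiency(shortest, N_ONUS, T_CYCLE_NS))
print(longest, network_efficiency(longest, N_ONUS, T_CYCLE_NS))
```

Halving the assumed cycle time or doubling the number of ONUs doubles the guard-time overhead, which is the trend described in the text.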
3.1.2. Dynamic Bandwidth Allocation (DBA) algorithms in Ethernet PONs
In the downstream direction an EPON may be viewed as a point-to-multipoint network, and in the upstream direction as a multipoint-to-point network. As a result, the upstream and downstream channels are not directly coordinated with each other. To address this situation, the IEEE 802.3ah Task Force developed the Multipoint Control Protocol (MPCP), which controls the communication between the downstream and the upstream channels. The application of a dynamic bandwidth assignment therefore requires the MPCP protocol to distribute the upstream bandwidth to each ONU, controlling the data transmission from the ONUs to the OLT. This protocol is implemented in the MAC layer located inside the OLT and uses five control messages. Two of them, called Report and Gate, arbitrate the communication between the OLT and the ONUs in order to assign bandwidth to each ONU. The upstream bandwidth is divided into bandwidth units via TDMA and each unit is assigned to a different ONU. The length of these units is not fixed; it is calculated dynamically by the applied DBA algorithm. Report messages are used by the ONUs to request bandwidth, whereas Gate messages are used by the OLT to notify the bandwidth allocated for the next cycles. At the end of its time slot, each ONU sends a Report message to the OLT to request bandwidth for the next cycle. The OLT assigns bandwidth based on the demand of the ONUs and sends Gate control messages to inform the ONUs about the bandwidth allocated in the next cycle. The timestamp carried in the Gate messages is used as a global time reference. Each ONU updates its local clock by means of the timestamp contained in each control message, which keeps every ONU synchronized with the whole system. Moreover, each Gate message is able to support up to four transmission grants.
Each transmission grant specifies the transmission length and the transmission start time of a particular ONU. The transmission start time of each ONU is expressed as an absolute timestamp according to the global synchronization of the system. Each ONU then sends packets during its time slot according to its intra-ONU scheduler, which controls the packet transmission from various local queues, each of them belonging to a different application. Dynamic Bandwidth Allocation algorithms can be classified into offline (centralized) or online (polling) scheduling methods.
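The Report/Gate exchange described above can be sketched with a few simple data structures. This is only an illustrative model: the class and field names are hypothetical, the fields are reduced to what the text mentions, and the clock update ignores propagation delay; it is not the frame layout defined by IEEE 802.3ah.

```python
from dataclasses import dataclass, field
from typing import List

MAX_GRANTS_PER_GATE = 4  # per the text: a Gate carries up to four grants


@dataclass
class Report:
    onu_id: int
    queue_bytes: int  # bandwidth requested for the next cycle


@dataclass
class Grant:
    start_time: int  # absolute start time on the global (OLT) clock
    length: int      # granted transmission length


@dataclass
class Gate:
    onu_id: int
    timestamp: int   # OLT clock value when the Gate is sent
    grants: List[Grant] = field(default_factory=list)

    def __post_init__(self):
        if len(self.grants) > MAX_GRANTS_PER_GATE:
            raise ValueError("a Gate supports at most four grants")


class Onu:
    def __init__(self, onu_id: int):
        self.onu_id = onu_id
        self.local_clock = 0
        self.queue_bytes = 0

    def on_gate(self, gate: Gate) -> List[Grant]:
        """Resynchronise to the Gate timestamp and return the grants to honour.

        Simplification: the propagation delay is ignored in the clock update.
        """
        self.local_clock = gate.timestamp
        return gate.grants

    def make_report(self) -> Report:
        """Built at the end of the time slot to request bandwidth."""
        return Report(self.onu_id, self.queue_bytes)
```

A DBA algorithm would sit between the two message types at the OLT: it consumes Report objects and produces the Grant values placed in the next Gate.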
3.1.2.1. Centralized or offline scheduling methods in DBA algorithms
In centralized algorithms (Figure 6) the OLT distributes the bandwidth for the next cycle once it has received the updated requirements of every ONU in the current cycle. This allows the OLT to allocate bandwidth knowing the total demand of every connected ONU, but the waiting period wastes time [34-37]. Besides, once the OLT has received every Report message, the DBA algorithm is invoked and needs some computation time to allocate the bandwidth and to generate the grant table for the next cycle. This further increases the idle time in which the upstream channel is not used. The algorithm DMB (Dynamic Minimum Bandwidth) [35] and the algorithm proposed by Choi et al. [36] (referred to here as DBA-Choi) follow this policy. However, some algorithms in the literature, such as the Enhanced DBA [34], apply a gate-ahead mechanism. In this method, the OLT sends some Gate messages for the next cycle N+1 while it is still receiving Report messages from the current cycle N (Figure 7). In particular, the OLT applies an early bandwidth allocation scheme in which an ONU requesting less than a minimum guaranteed bandwidth can be scheduled immediately, without waiting for the arrival of every Report. However, if an ONU requests more than this minimum guaranteed bandwidth, it has to wait until every Report message has arrived and the DBA algorithm has assigned bandwidth for the next transmission cycle. This mechanism is expected to increase the channel throughput and to reduce the waiting delay, which could improve the delay of the highest priority traffic. However, this scheme only recovers the wasted bandwidth at low and medium loads, since at those loads ONUs are likely to demand less bandwidth than the minimum guaranteed bandwidth.
In contrast, as the load increases ONUs will probably demand more bandwidth than the guaranteed bandwidth. In this case, no ONU can transmit in the interval between consecutive cycles and some bandwidth is wasted. Other algorithms, such as the Intra-ONU M-SFQ algorithm (Intra-ONU Modified Start-time Fair Queueing) proposed by Ghani et al. [38], follow the same bandwidth allocation policy at the OLT.
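The early allocation idea can be sketched as follows. This is a simplified illustration, not the algorithm of [34]: ONUs requesting no more than the guaranteed minimum are granted at once, and the rule used here to share the unused bandwidth among the remaining ONUs (proportional to their excess demand) is an assumption, since the text does not specify it. All names are hypothetical.

```python
def gate_ahead_allocation(reports, b_min):
    """Split ONUs into early grants and deferred grants.

    reports: list of (onu_id, demand) in arrival order.
    b_min:   minimum guaranteed bandwidth per ONU (same units as demand).
    Returns (early, deferred) dictionaries mapping onu_id to the grant.
    """
    early, pending, unused = {}, [], 0
    for onu_id, demand in reports:
        if demand <= b_min:
            # Lightly loaded ONU: scheduled without waiting for other Reports.
            early[onu_id] = demand
            unused += b_min - demand
        else:
            # Heavily loaded ONU: must wait until every Report has arrived.
            pending.append((onu_id, demand))

    # Assumed sharing rule: unused bandwidth is split in proportion to the
    # excess demand, and no ONU receives more than it asked for.
    total_excess = sum(demand - b_min for _, demand in pending)
    deferred = {}
    for onu_id, demand in pending:
        excess = demand - b_min
        share = unused * excess / total_excess if total_excess else 0
        deferred[onu_id] = b_min + min(excess, share)
    return early, deferred


early, deferred = gate_ahead_allocation(
    [(1, 400), (2, 3000), (3, 1000), (4, 1600)], b_min=1000)
print(early)
print(deferred)
```

At high load every ONU ends up in the deferred group, so nothing can be granted early, which corresponds to the wasted bandwidth noted in the text.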
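[Figure 6 diagram: OLT and ONU timelines for Cycle N and Cycle (N+1); the ONUs send Data + Report, and the upstream channel stays idle between cycles during the DBA computation time plus the RTT.]
Figure 6. Operation of an EPON network implementing a DBA algorithm based on a centralized scheme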
[Figure 7 diagram: OLT and ONU timelines for Cycle N and Cycle (N+1), showing the DBA computation time and the Data + Report transmissions of ONUs (ONU i, ONU i+1) whose bandwidth demand is lower than the guaranteed bandwidth.]
Figure 7. Transmission in the upstream channel of the algorithm Enhanced DBA proposed by Assi et al. [34]
In order to solve this problem, Zheng et al. proposed in [39] a new transmission control mechanism to improve bandwidth utilization under high traffic loads. In this algorithm, called n-DBA, each time the OLT receives a Report message from one ONU, it permits the ONU to transmit data immediately if its bandwidth demand is lower than its guaranteed bandwidth. At the same time, the guaranteed bandwidth not used by the ONU is added to the total unused bandwidth of the current cycle, and a time variable is updated with the arrival instant of the control message. If the bandwidth demanded by the ONU is higher than its guaranteed bandwidth, the OLT checks whether the value of the variable is lower than the beginning of the next cycle, and it also checks whether the next Report arrives before that value minus the propagation time of the control message. If both conditions hold, the ONU waits for the end of the cycle, which is determined when the value of the time variable becomes higher than the beginning of the next cycle minus the propagation time of the message. However, if the second condition does not hold, that is, if the next Report message arrives after the current control message, the ONU is able to transmit immediately with its guaranteed bandwidth. Once the cycle ends, the remaining bandwidth of the current cycle is assigned to those ONUs with high demand which have not been served before. On the other hand, there are centralized DBA algorithms which try to improve the bandwidth utilization by dynamically changing the transmission order of the ONUs in every cycle. Typical DBA algorithms keep a fixed transmission scheme, and ONUs transmit cycle after cycle following a fixed round robin discipline. In contrast, other algorithms, such as those proposed by Sherif et al. [37] and Ma et al. [40], dynamically change the transmission order of the ONUs depending on their state. In particular, the algorithm TOEDBA (Transmission Order-based Enhanced DBA), proposed by Sherif et al.
[37], develops two different variants to determine the transmission order of the ONUs in every cycle. The first option applies the scheme called SRF (Shortest Request First), in which the transmission order starts with the ONU with the lowest bandwidth demand and continues with the remaining ONUs in increasing order of demand. In this way, the ONU with the lowest bandwidth demand is the first ONU to transmit in the next cycle. In contrast, in the other proposal, called LRF (Longest Request First), the transmission order starts with the ONU with the highest bandwidth demand and continues with the other ONUs in decreasing order of demand. Hence, the ONU with the highest demand transmits first in
the next cycle. On the other hand, in the algorithm DPOA (Dynamic Polling Order Arrangement), designed by Ma et al. [40], the OLT fixes the transmission order for the next cycle N+1 at the end of cycle N. ONUs with high traffic load transmit first in the next cycle N+1. Moreover, as the OLT also decides which set of ONUs can transmit in each cycle, it may lead to a better utilization of the available upstream bandwidth. This method benefits ONUs with heavy traffic load or with high real-time traffic demand. The algorithm DPOA selects the ONUs and their transmission order according to the following rules. First of all, once the OLT has collected the queue state of every ONU, it arranges the polling order according to the queue lengths of the ONUs in decreasing order, so the ONU with the longest queue is polled first. After that, the OLT chooses those ONUs whose queue length is lower than one fifth of the first ONU. Finally, the OLT assigns a transmission window to these ONUs for the next cycle in the previously arranged order.
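The ordering rules discussed above reduce to sorting the ONUs by their reported demand or queue length. The sketch below shows SRF, LRF and the decreasing queue-length arrangement used as the first step of DPOA; the one-fifth selection rule of DPOA is deliberately left out. Function names and the tie-breaking by ONU identifier are assumptions.

```python
def srf_order(demands):
    """Shortest Request First: lowest demand transmits first."""
    return sorted(demands, key=lambda onu: (demands[onu], onu))


def lrf_order(demands):
    """Longest Request First: highest demand transmits first."""
    return sorted(demands, key=lambda onu: (-demands[onu], onu))


def dpoa_polling_order(queue_lengths):
    """First DPOA step: poll the ONUs in decreasing queue-length order."""
    return lrf_order(queue_lengths)


# Hypothetical demands in bytes, keyed by ONU identifier.
demands = {1: 500, 2: 100, 3: 900}
print(srf_order(demands))
print(lrf_order(demands))
```

Because the order is recomputed from the latest Reports, it can change from one cycle to the next, unlike the fixed round robin discipline.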
3.1.2.2. Polling or online scheduling methods in DBA algorithms
In polling or online schemes, in contrast, the OLT allocates bandwidth to each ONU for the next cycle before the last packet of the previous cycle arrives, resulting in efficient upstream channel utilization [31, 41-42]. However, these algorithms do not consider the global state of the network. Among polling algorithms, one of the most efficient is the Interleaved Polling with Adaptive Cycle Time (IPACT) [41-42], an adaptive cycle time approach in which ONUs are polled individually in a round robin fashion, as shown in Figure 8. This figure represents three connected ONUs located at the same distance from the OLT. First, assume an instant t at which the OLT knows the updated queue length (in bytes) of every ONU and the distance to each of them. At this time t the OLT sends a Gate message to ONU1 to inform it about its transmission window for the next cycle (in bytes) and the instant at which it has to start its transmission. Once the ONU receives the Gate message, it begins its transmission at the specified instant. At the end of its transmission, this ONU sends a Report message to the OLT with its current queue length to request bandwidth for the next cycle. However, before the OLT receives this Report message from ONU1, it already knows exactly when ONU1 sends its last bit, using the following information:
1. The first bit arrives at the OLT after an interval equal to the RTT (Round Trip Time), which also includes the Gate processing time at the ONU. Therefore, this time is the interval from the instant the OLT sends a Gate message to one ONU until the OLT receives the first bit from that ONU.
2. As the OLT has previously allocated the transmission window to ONU1, it knows exactly when the last bit of that ONU is going to arrive. As it also knows the next ONU allowed to transmit (ONU2) and the distance to it, the OLT schedules the bandwidth allocation of ONU2 by sending it a Gate message timed so that the transmission of ONU2 arrives just after the end of the transmission of the previous ONU (ONU1).
3. In the same way, the OLT can calculate the arrival of the last bit of ONU2. Therefore, the OLT will know when to send the Gate message which contains the bandwidth allocation of ONU3.
Once the OLT has distributed the bandwidth among the ONUs, it updates a table that contains the current state of every ONU.
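The timing reasoning of the three steps above can be sketched as a small calculation. The model below is a simplification under stated assumptions: each Gate is sent so that the first bit of the next ONU reaches the OLT one guard time after the last bit of the previous ONU, the RTT values already include the Gate processing time, and the Report is counted inside the transmission window. The function name, the guard parameter and the example numbers are hypothetical.

```python
def ipact_gate_times(t0, windows_bytes, rtts, rate, guard):
    """Return (gate_send_time, first_bit_arrival, last_bit_arrival) per ONU.

    All times share one unit (microseconds in the example below) and rate is
    given in bits per that unit.
    """
    schedule = []
    channel_free = None  # arrival time of the last bit of the previous ONU
    for window, rtt in zip(windows_bytes, rtts):
        if channel_free is None:
            gate = t0
        else:
            # Aim the first bit of this ONU just after the previous last bit.
            gate = max(t0, channel_free + guard - rtt)
        first_bit = gate + rtt
        last_bit = first_bit + 8 * window / rate
        schedule.append((gate, first_bit, last_bit))
        channel_free = last_bit
    return schedule


# Assumed example: three ONUs at the same distance (RTT 200 us), 1 Gbit/s
# upstream (1000 bits per us) and a 1 us guard time.
for row in ipact_gate_times(0, [12500, 25000, 12500], [200, 200, 200], 1000, 1):
    print(row)
```

In this example the Gate for the second ONU is sent while the first ONU is still transmitting, which is what lets polling schemes avoid the idle period of centralized schemes.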
[Figure 8 diagram: timelines of the OLT and of three ONUs (ONU1, ONU2, ONU3) over Cycle N and Cycle N+1; starting at instant t the OLT sends Gate messages downstream and each ONU answers with Data + Report in its turn.]
Figure 8. Operation of an EPON network implementing the IPACT algorithm based on an interleaved polling policy
On the other hand, if the OLT permitted each ONU to empty its buffer in every cycle, the ONUs with high bandwidth demand would monopolize the upstream channel. In order to avoid this situation, the OLT should limit the Maximum Transmission Window (MTW). Several bandwidth allocation schemes have been studied for the algorithm IPACT, such as the fixed, the limited, the constant credit, the linear credit and the elastic methods:
• Fixed service. The OLT does not take into consideration the updated bandwidth demand of the ONUs and always assigns the MTW. In this way, the cycle period is constant and this scheme is similar to the TDMA method.
• Limited service. The OLT gives the required bandwidth to each ONU as long as the demand is lower than an imposed maximum bandwidth. When the demand is higher than this bandwidth, the OLT grants this maximum. This makes the cycle time adaptive, depending on the updated demand of each ONU.
• Credit service. The OLT assigns to each ONU its demanded bandwidth plus an additional credit. The size of the credit can be fixed or proportional to the requested window size. This credit takes into account the packets which may arrive at the ONU from the moment it sends the Report message in one cycle until it is allowed to transmit in the next cycle.
• Elastic service. The OLT fixes the maximum window size depending on the maximum cycle time considered. The MTW is granted in such a way that the accumulated size of the last N grants (including the one being granted) does not exceed N×MTW (N represents the number of ONUs). Hence, if only one ONU has data to transmit, it can be allocated a maximum window of N×MTW.
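The four disciplines listed above can be written as small grant functions. This is a minimal sketch under the descriptions given in the list: names and units are hypothetical, and whether the credit grant is additionally capped by the MTW is not stated in the text, so no cap is applied here.

```python
from collections import deque


def fixed_service(demand, mtw):
    """Always grant the maximum transmission window, ignoring the demand."""
    return mtw


def limited_service(demand, mtw):
    """Grant the demand, but never more than the maximum window."""
    return min(demand, mtw)


def credit_service(demand, constant=0, factor=0.0):
    """Grant the demand plus a constant and/or proportional (linear) credit."""
    return demand + constant + factor * demand


class ElasticService:
    """Keep the sum of the last N grants (current one included) <= N * MTW."""

    def __init__(self, n_onus, mtw):
        self.budget = n_onus * mtw
        self.recent = deque(maxlen=n_onus - 1)  # the previous N-1 grants

    def grant(self, demand):
        granted = max(0, min(demand, self.budget - sum(self.recent)))
        self.recent.append(granted)
        return granted


# Hypothetical example: three ONUs, MTW of 1000 bytes, one active ONU.
elastic = ElasticService(n_onus=3, mtw=1000)
print([elastic.grant(d) for d in (5000, 0, 0, 5000)])
```

The example prints the grants for a round in which only the first ONU has data, so it can use the windows that the two idle ONUs leave free.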
Polling algorithms are a good choice to allocate bandwidth to each ONU, as they improve the channel utilization. Among the different bandwidth assignment disciplines which may be applied in polling methods, the limited scheme offers the best performance, as demonstrated in [42]. One disadvantage of polling schemes is that the Report sent by each ONU at the end of its transmission window only reports the current queue length. The ONU does not consider the traffic which arrives in the interval between sending the control message in cycle N and being allowed to transmit in the next cycle N+1. As a consequence, this traffic will not be reported until cycle N+1, and it will not be sent out until cycle N+2 at the earliest, so it suffers an additional delay of at least one cycle. In order to solve this problem, Byun et al. [31] proposed a new algorithm based on IPACT (New-DBA). This algorithm estimates the quantity of traffic which may arrive during the waiting time between the consecutive Report and Gate messages of one specific ONU. This traffic estimation is included in the Gate message sent to the ONU in the next cycle. In order to estimate the traffic between consecutive cycles, the algorithm defines a term called $R^{onu_i}(N)$, which determines the extra bandwidth allocated to ONUi for the next cycle. This term is the difference between the bandwidth allocated to ONUi for cycle N and the updated queue length of that ONU. The algorithm predicts the bandwidth assigned to ONUi for the next cycle N+1, called $B_{alloc}^{onu_i}(N+1)$, following Equation 2:
$$B_{alloc}^{onu_i}(N+1) = B_{alloc}^{onu_i}(N) - \alpha\, R^{onu_i}(N) \tag{2}$$
where the parameter α is the control gain which fixes the estimated traffic for the next cycle, and $B_{alloc}^{onu_i}(N)$ is the bandwidth assigned in the current cycle N. If the currently allocated bandwidth, $B_{alloc}^{onu_i}(N)$, is higher than the amount of packets stored in the queues, the term $R^{onu_i}(N)$ becomes positive and the bandwidth allocated for the next cycle decreases. In contrast, if the currently allocated bandwidth is not enough to send out all the stored packets, the term $R^{onu_i}(N)$ becomes negative and consequently the bandwidth assigned for the next cycle increases. The simulation results show a slight improvement of this algorithm over IPACT in the mean packet delay for low and medium loads. However, the prediction makes the algorithm more complex to implement. Other algorithms, such as the Dynamic Bandwidth Allocation algorithm with Multiple services (DBAM) implemented by Luo et al. [32], also apply traffic prediction. In order to explain the algorithm, Figure 9 shows the transmission of two ONUs in several consecutive cycles. The cycle is defined as the interval between the starts of two consecutive transmissions of one ONU. Thus, cycle N for ONU1 goes from t1 to t5, and it is defined as $T_{1,N} = t_5 - t_1$. As shown in Figure 9, ONU1 transmits packets during its transmission window (from t1 to t2), and at the end of the transmission it sends a Report message. The interval from t2 to t5 is the waiting period in which packets may arrive at the queues of ONU1, defined as $T^{w}_{1,N} = t_5 - t_2$. For the next cycle N+1, ONU1 starts its transmission at t5 with a transmission length from t5 to t6. As can be noticed, this allocation is based on the information of the Report message sent at t2, and the packets which arrive at ONU1 during the interval $T^{w}_{1,N}$ are not taken into account for the next cycle. This behaviour holds for every cycle and for every ONU connected to the EPON network. However, the algorithm DBAM applies Equation 3 to assign the bandwidth for the next cycle N+1, called $B_{alloc}^{onu_i}(N+1)$, depending on the demanded bandwidth sent in the last Report message:
[Figure 9 diagram: timelines of the OLT and of ONU1 and ONU2 over Cycle N and Cycle N+1, with the data and Report transmissions marked at instants t1 to t9, the cycle time T1,N and the waiting time Tw1,N.]
Figure 9. Operation of an EPON network implementing the DBAM (Dynamic Bandwidth Allocation with Multiple services) algorithm [32]
$$B_{alloc}^{onu_i}(N+1) = \left(1 + \frac{T^{w}_{i,N}}{T_{i,N}}\right) B_{demand}^{onu_i}(N) \tag{3}$$
where $B_{demand}^{onu_i}(N)$ represents the real queue length of ONUi when it sends the Report message in cycle N. As a consequence, the bandwidth assigned for the next cycle is the updated bandwidth demand plus a term based on the traffic received in the previous waiting period. The simulation study shows that DBAM improves the mean packet delay of the real traffic when compared to centralized algorithms which apply a fixed or a limited bandwidth allocation policy. The algorithm proposed by Kim et al. [43], called SLIding Cycle Time based DBA (SLICT), is a polling algorithm which applies the limited allocation method. In this scheme, the OLT gives the required bandwidth to each ONU as long as the demand is lower than an imposed maximum bandwidth. When the demand is higher than this bandwidth, the OLT grants this maximum plus a proportional part of the remaining bandwidth which has not been used by the previous ONUs in that cycle. This additional bandwidth is multiplied by a factor whose value is random between zero and one. The OLT allocates to the ONU the minimum between the previously calculated bandwidth (the maximum plus the additional bandwidth multiplied by the factor) and the demanded bandwidth. This makes the cycle time adaptive, depending on the updated demand of each ONU. Moreover, this algorithm shows a strong dependence on the factor. If the factor value is very high, the bandwidth assigned to one particular ONU can be so high that the next ONUs to transmit do not have the chance to obtain more than their imposed maximum bandwidth. On the other hand, there are other polling schemes like the Cycling Polling-Based Dynamic Bandwidth Allocation (CP-DBA) algorithm proposed by Choi et al. [44]. This algorithm combines a pure TDMA scheme with a DBA method in which ONUs are polled periodically. Hence, ONUs are not polled in every cycle, and during the cycles without polling they have to transmit following a fixed bandwidth allocation policy.
This scheme aims to avoid overloading the downstream channel with many Gate control messages at low and medium traffic loads, so that this channel capacity can be used to send more data. The simulation results show that the new algorithm achieves higher downstream channel throughput than IPACT. In contrast, it obtains a higher mean packet delay and mean queue size than IPACT. A summary of the main characteristics of the algorithms presented in this section regarding the TDMA protocol applied to EPON networks is shown in Table 1.
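The allocation rules of the prediction-based polling algorithms discussed in this section (Equation 2 for New-DBA, Equation 3 for DBAM) and the SLICT limited allocation can be sketched in a few lines. The function and parameter names are hypothetical; in the New-DBA rule the term R is taken as the allocated bandwidth minus the queue length, which matches the sign behaviour described in the text, and the SLICT factor is passed in as a parameter instead of being drawn at random.

```python
def new_dba_next_alloc(b_alloc, queue_len, alpha):
    """Equation 2: B(N+1) = B(N) - alpha * R(N), with R = B(N) - queue."""
    r = b_alloc - queue_len  # positive when the last grant exceeded the queue
    return b_alloc - alpha * r


def dbam_next_alloc(b_demand, t_wait, t_cycle):
    """Equation 3: scale the reported demand by (1 + T_wait / T_cycle)."""
    return (1 + t_wait / t_cycle) * b_demand


def slict_alloc(demand, b_max, unused, factor):
    """SLICT-style limited allocation with a share of the unused bandwidth.

    unused: bandwidth left over by the ONUs already served in this cycle.
    factor: value between zero and one applied to the unused bandwidth.
    """
    if demand <= b_max:
        return demand
    return min(demand, b_max + factor * unused)


# Hypothetical values in bytes and milliseconds.
print(new_dba_next_alloc(b_alloc=1000, queue_len=1400, alpha=0.5))
print(dbam_next_alloc(b_demand=1000, t_wait=1.5, t_cycle=2.0))
print(slict_alloc(demand=3000, b_max=2000, unused=800, factor=0.5))
```

In the first call the queue exceeds the previous grant, so R is negative and the next allocation grows, as described for Equation 2.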
Table 1. Comparison of the main characteristics of the algorithms developed for EPONs based on the TDMA protocol
Algorithm | Policy | Properties
Enhanced DBA [34] | Centralized | Permits the transmission between consecutive cycles of ONUs with low bandwidth demand; improves the performance parameters for low loads; wastes bandwidth between consecutive cycles for medium and high loads
DMB [35] | Centralized | Wastes bandwidth between consecutive cycles; achieves worse performance parameters than polling schemes
DBA-Choi [36] | Centralized | Wastes bandwidth between consecutive cycles; achieves worse performance parameters than polling schemes
TOEDBA [37] | Centralized | Changes the ONUs transmission order in consecutive cycles; chooses only a subset of ONUs to transmit in each cycle
Intra-ONU M-SFQ [38] | Centralized | Similar in performance to the Enhanced DBA
n-DBA [39] | Centralized | Permits the transmission of ONUs which comply with some premises; improves the performance of the Enhanced DBA and Intra-ONU M-SFQ for every load; complex to implement
DPOA [40] | Centralized | Changes the ONUs transmission order in consecutive cycles; chooses only a subset of ONUs to transmit in each cycle
IPACT [41-42] | Polling | Allows ONUs to transmit just after the previous ONU transmits its last bit; permits no idle time between two consecutive cycles; better performance than centralized algorithms
New-DBA [31] | Polling | Applies traffic prediction between consecutive transmissions; adds the traffic estimation to the demanded bandwidth of ONUs for next cycles; complex to implement when compared to traditional polling algorithms
DBAM [32] | Polling | Applies traffic prediction between consecutive transmissions; adds this traffic estimation to the demanded bandwidth for next cycles; complex to implement when compared to other polling algorithms
SLICT [43] | Polling | Allows ONUs with high demand to be assigned more bandwidth than their maximum, multiplied by a numeric factor; shows a strong dependency on the multiplication factor
Cycling DBA [44] | Polling | Consists of a periodic combination of fixed and dynamic bandwidth allocation policies; improves the capacity of the downstream channel but achieves higher delay than traditional polling algorithms
3.1.3. Quality of Service (QoS) in Ethernet PON Networks
One of the most important challenges in EPON networks is their capacity to offer Quality of Service (QoS). Network providers and operators have to offer a minimum guarantee to customers regarding their supported services. For example, it is important that the available resources are properly assigned according to the priority of certain kinds of services and customers. Besides, it is essential that EPONs efficiently provide both class of service and subscriber differentiation, as current access networks deal with different subscriber profiles with very restrictive and heterogeneous traffic requirements.
3.1.3.1. Class of service (CoS) differentiation
Dynamic bandwidth allocation algorithms should cope with class of service differentiation, and it is necessary to define methods for differentiating traffic into Classes of Service (CoS). There are several schemes to provide service differentiation in PONs. In the priority queue method, used in some algorithms [34, 42], packets belonging to different classes of service are inserted into their corresponding priority queue. All supported services share the same buffer in order to improve the performance of the highest priority traffic. Therefore, when the buffer is full, incoming packets with higher priority are inserted into the shared buffer if other packets with lower priority can be discarded. In other related studies, packets are differentiated by keeping separate queues of fixed length, one for each class of service [45]. In that case, when a queue is full, incoming packets are dropped regardless of their priority, as it is not permitted to replace packets of lower priority. The priority queue and the separate queue methods, used for categorizing packets inside the queues, are combined with a scheduling scheme to transmit packets out of the priority queues.
Some proposals, such as the algorithms in [34, 42], apply the Strict Priority Queue method defined in IEEE 802.1D, in which each ONU is equipped with a number of virtual queues equal to the number of supported services. A scheduler located inside the ONU takes out packets from one queue only if the queues with higher priority are empty, as shown in Figure 10 (in this case P0 is the highest priority service and P2 is the lowest priority service).
[Figure 10 diagram: the three queues P0, P1 and P2 of an ONU are served in priority order to fill the transmission window.]
Figure 10. ONU scheduler equipped with three priority queues, P0, P1 and P2, which applies the strict priority queue method
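The strict priority discipline illustrated in Figure 10 can be sketched as a loop over the queues in priority order. This is an illustrative model only: queues hold packet sizes in bytes, index 0 is the highest priority (P0), and the choice to stop as soon as the head packet of a non-empty higher priority queue does not fit in the remaining window is an assumption based on packets not being fragmented.

```python
from collections import deque


def strict_priority_schedule(queues, window_bytes):
    """Serve queues in priority order within one transmission window.

    queues: list of deques of packet sizes, index 0 = highest priority.
    Returns the list of (priority, size) packets sent; queues are updated.
    """
    sent = []
    remaining = window_bytes
    for priority, queue in enumerate(queues):
        while queue and queue[0] <= remaining:
            size = queue.popleft()
            remaining -= size
            sent.append((priority, size))
        if queue:
            # A higher priority queue is not empty: lower ones must wait.
            break
    return sent


# Hypothetical packet sizes for P0, P1 and P2 and a 2200-byte window.
queues = [deque([500, 500]), deque([1000]), deque([1500])]
print(strict_priority_schedule(queues, window_bytes=2200))
```

Packets that arrive in a higher priority queue before the window starts are always served first under this rule, whether or not they were included in the last Report.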
However, with this scheduling method, new high priority packets may arrive at the ONU during the interval between the moment the ONU sends its Report message and the start of its transmission in the next cycle. These high priority packets, which arrive during the waiting time, are sent out ahead of the lower priority packets reported in the previous cycle. This behaviour increases the mean packet delay and the packet loss of the low priority traffic. In order to reduce this problem, other works implement the Fair Nonstrict Priority Scheduling scheme [37], which works as the Strict Priority Queue method, except that only packets reported in the last Report message can be sent out in the current cycle. Hence, if high priority packets arrive at one ONU just after the beginning of its current time slot, they have to wait until the reported low priority packets are transmitted. If the time slot can still carry more traffic afterwards, the remaining capacity is used to transmit high priority packets. This method provides fairness among every supported class of service. In the algorithms presented in [32, 36], the OLT allocates the exact bandwidth to each service in a centralized way, depending on its priority and its demand, as represented in Figure 11 (in this case P0 is the highest priority service and P2 is the lowest priority service). In particular, in the Dynamic Bandwidth Allocation algorithm with Multiple services (DBAM) [32], each class of service is assigned a maximum permitted bandwidth in each cycle. In this algorithm, the OLT allocates bandwidth first to the highest priority services, since they require bandwidth guarantees, and the remaining bandwidth is given to the low priority traffic. The bandwidth assigned to the highest priority services is upper bounded by the smaller of their requested bandwidth and their maximum permitted bandwidth. This prevents them from monopolizing the whole upstream channel.
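The per-class upper bound described for DBAM can be illustrated with the following sketch. It is not the algorithm of [32] itself: the function name `allocate_per_class`, the use of one cap per class and the simple sequential order are assumptions made for this example; only the rule "grant the smaller of the request and the permitted maximum, highest priority first" comes from the description above.

```python
def allocate_per_class(requests, max_permitted, cycle_capacity):
    """Grant bandwidth class by class, highest priority first.

    requests[i] and max_permitted[i] refer to class Pi (P0 is the highest
    priority); all quantities use the same unit, e.g. bytes per cycle.
    """
    grants = []
    remaining = cycle_capacity
    for requested, cap in zip(requests, max_permitted):
        # Upper bound: the smaller of the request and the permitted maximum,
        # further limited by what is still free in the cycle.
        grant = min(requested, cap, remaining)
        grants.append(grant)
        remaining -= grant
    return grants
```

For requests of 300, 500 and 900 units, caps of 400, 400 and 1000 units and a cycle of 1000 units, the sketch grants 300, 400 and 300 units. If P0 requests 600 units while the other classes request 100 units each, P0 is held at its cap of 400 units in this sketch even though 400 units of the cycle stay unused.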
However, since the maximum guaranteed bandwidth assigned to the high priority services is fixed, the algorithm may not cover their bandwidth requirements at high network loads. This bandwidth scheduling problem is also exhibited by the Service Level Agreement Dynamic Bandwidth Allocation algorithm (SLA-DBA) proposed by Nowak et al. [45]. In contrast, the service scheduling scheme proposed in [36] (which we call DBA-Choi) assumes a fixed bandwidth assignment for the highest priority services regardless of whether they have packets to send or not. The remaining bandwidth is distributed to the medium priority traffic depending on the updated demand of this service at each ONU.
Figure 11. ONU scheduler equipped with three priority queues, P0, P1 and P2, where the OLT allocates the exact bandwidth for every class of service
Medium Access Control Protocols in Passive Optical Networks Based on Ethernet… 23
Finally, if there is extra bandwidth, it is allocated to the lowest priority services as well as to the medium priority traffic. This bandwidth allocation scheme has the same deficiencies as the DBAM and SLA-DBA algorithms. Moreover, it wastes additional resources when the highest priority service is underloaded, because this service does not use its fixed allocated bandwidth and this bandwidth cannot be redistributed among the remaining services. In [44], Choi et al. proposed a new algorithm based on IPACT called Cycling DBA. This algorithm implements the same service differentiation policy as the previous one and, as a consequence, shows the same drawbacks. Other algorithms, such as BP-DDBA (Burst-Polling based Delta Dynamic Bandwidth Allocation) designed by Yang et al. [46], distribute the available bandwidth to each class of service depending on fixed weights. Each weight is chosen according to the priority of the service. In fact, the authors proposed different weight profiles depending on the congestion of the ONU. If the available bandwidth is enough to cover every bandwidth requirement, the algorithm applies an aggressive scheme which increases the weight given to the sensitive services. In contrast, if the demanded bandwidth exceeds the existing resources, the algorithm applies a more conservative scheme among the supported services.
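The idea of switching between weight profiles according to congestion can be sketched as follows. The weight values, the names `AGGRESSIVE`, `CONSERVATIVE` and `class_shares`, and the congestion test (total demand compared with the available bandwidth) are purely hypothetical and chosen for illustration; the actual profiles of BP-DDBA are defined in [46].

```python
# Hypothetical weight profiles in percent for (P0, P1, P2); not from [46].
AGGRESSIVE = (60, 30, 10)     # favours the delay-sensitive class P0
CONSERVATIVE = (40, 35, 25)   # flatter split used under congestion

def class_shares(available, demands):
    """Split the available bandwidth among classes using a weight profile.

    The aggressive profile is used when the total demand fits in the
    available bandwidth, the conservative one otherwise.
    """
    profile = AGGRESSIVE if sum(demands) <= available else CONSERVATIVE
    return [available * weight // 100 for weight in profile]
```

With 1000 units available, demands of 200, 300 and 400 units select the aggressive profile (shares of 600, 300 and 100 units), whereas demands of 500 units per class select the conservative one (400, 350 and 250 units).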
3.1.3.2. Customer differentiation (service level agreement)
Access networks also have to face customer differentiation. End users contract a Service Level Agreement (SLA) with a provider, which contains technical specifications called service level specifications (SLSs). This forces the network to treat the subscribers of each SLA differently, and algorithms have to take into account the requirements of every SLA. Some related studies focus on a fair bandwidth distribution among various service providers which offer the same services on the same upstream channel to different users (DUAL-SLA algorithm) [47]. Other works consider a single service provider which offers multiple service levels according to subscribers’ requirements [34-35, 48]. As an example, the Bandwidth Guaranteed Polling (BGP) method proposed in [48] divides ONUs into two sets, the bandwidth guaranteed ONUs and the best effort ONUs. Bandwidth guaranteed ONUs always receive the guaranteed bandwidth according to their SLA, whereas the other ONUs receive the remaining bandwidth. The associated SLA specifies the quantity of bandwidth that has to be guaranteed. The upstream channel is divided into equal bandwidth units, chosen in such a way that the total number of units is larger than the number of ONUs supported in the access network. Moreover, the OLT keeps two entry tables, one for the bandwidth guaranteed ONUs (Entry Table) and another for the best effort ONUs. The first table has a number of entries equal to the number of bandwidth units in the upstream channel, whereas the best effort table is not fixed in size. Those bandwidth units which are not used by the guaranteed ONUs can be dynamically allocated to the best effort ONUs. Each bandwidth guaranteed ONU can be allocated one or more bandwidth units according to its SLA or its demanded bandwidth.
Therefore, guaranteed ONUs with more than one entry are polled by the OLT more than once in a round of polling. The OLT polls every ONU in the order of the Entry Table by means of a pointer which indicates the current entry. If an entry is not assigned to a guaranteed ONU, it is offered to a best effort ONU in the order listed in the best effort table. Another common way to provide client differentiation is to use a weighted factor assigned to each ONU according to its SLA. Then, the bandwidth is allocated depending on these fixed weights. In the method presented in [34], each ONU is assigned a minimum guaranteed bandwidth based on its associated weight, so that the upstream channel is divided among ONUs in proportion to their SLAs. In particular, the weights are assigned depending on the priority of the SLA of each ONU, and the sum of the weights has to be one. Consequently, the sum of all minimum guaranteed bandwidths is equal to the total available bandwidth in one cycle. At the end of every transmission, each ONU requests bandwidth for the next cycle. If the requested bandwidth is lower than the minimum guaranteed bandwidth, the ONU is allocated the demanded bandwidth. The resulting surplus is pooled with the excess bandwidth of the other ONUs which do not request their whole minimum guaranteed bandwidth. In the second stage of the algorithm, the total excess bandwidth is distributed among those ONUs whose bandwidth demand is higher than their guaranteed bandwidth, depending on the requested bandwidth of each highly loaded ONU. On the other hand, in the Dynamic Minimum Bandwidth (DMB) algorithm [35], the OLT distributes the available bandwidth by assigning different weights to each client depending on its SLA. Therefore, ONUs associated with a higher weight are allocated more bandwidth. In this algorithm, the bandwidth allocation in every cycle is made in two steps. In the first step, a minimum guaranteed bandwidth is assigned to each ONU depending on the priority of its SLA; it consists of two components. The first component, called basic bandwidth, is a fixed bandwidth allocated to every ONU independently of the priority of its service level. The second component, the extra bandwidth, is calculated taking into account the priority of the service level and the requested bandwidth of every ONU.
Once the first step is completed, if some ONUs do not use their entire guaranteed minimum bandwidth, in the second step the algorithm distributes the unused bandwidth among those ONUs which demand more bandwidth than their guaranteed minimum bandwidth. As quality of service can be offered by means of class of service or client differentiation, a suitable combination of both is necessary to fully support quality of service in the upstream channel. However, none of the previously proposed methods which apply client differentiation checks that the supported services comply with the standard constraints. Thus, they do not consider the final performance of the supported services of every subscriber. This is a very important aspect that should be covered by a DBA algorithm, since service providers have to ensure that the highest priority traffic complies with constraints in terms of mean packet delay or packet loss rate. In this respect, the algorithm proposed in [49] provides both service and client differentiation but, unlike previous works, it dynamically ensures that each subscriber satisfies all the class of service requirements. Therefore, the algorithm ensures that the most sensitive services keep the mean packet delay below the maximum upper bound permitted for every priority customer. The algorithm is based on a set of weights to distribute the bandwidth, as in other works [34-35], but in contrast with them, it dynamically changes the value of the weights in order to achieve the best performance under every load situation. A summary of the main characteristics of the class of service and subscriber differentiation methods applied to EPON networks based on the TDMA protocol is presented in Table 2.
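The two-stage weighted allocation described above for the method in [34] can be sketched as follows. This is a hedged illustration, not the published algorithm: the function name `two_stage_allocation` is hypothetical, and the rule used in the second stage (sharing the excess in proportion to the unmet demand of each highly loaded ONU, never granting more than requested) is an assumption, since the text only states that the excess is assigned depending on the requested bandwidth.

```python
def two_stage_allocation(requests, weights, cycle_bandwidth):
    """Weighted minimum guarantees followed by excess redistribution.

    weights are the SLA weights of the ONUs and are assumed to sum to one,
    so the guarantees add up to the bandwidth of one cycle.
    """
    guaranteed = [w * cycle_bandwidth for w in weights]
    # Stage 1: lightly loaded ONUs get what they ask for, the others
    # get their minimum guaranteed bandwidth.
    grants = [min(r, g) for r, g in zip(requests, guaranteed)]
    excess = cycle_bandwidth - sum(grants)
    unmet = [r - g for r, g in zip(requests, grants)]
    total_unmet = sum(unmet)
    # Stage 2 (assumed rule): share the excess among highly loaded ONUs
    # in proportion to their unmet demand.
    if excess > 0 and total_unmet > 0:
        for i, u in enumerate(unmet):
            grants[i] += min(u, excess * u / total_unmet)
    return grants
```

For three ONUs with weights 0.5, 0.25 and 0.25 and a cycle of 1000 units, requests of 200, 600 and 600 units yield grants of 200, 400 and 400 units: the 300 units left unused by the first ONU are split evenly between the two overloaded ONUs.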
Table 2. Comparative table of the class of service and subscriber differentiation methods applied in EPON networks based on the TDMA contention scheme

CLASS OF SERVICE DIFFERENTIATION
Policy | Properties | Algorithms
Strict Priority Queue Method | Packets of one queue are sent out only if queues with higher priority are empty | IPACT [41-42]
Fair Strict Priority Method | Only the packets of each queue reported in the last updated message can be sent out | TOEDBA [37]
Exact bandwidth allocation to each class of service | The OLT allocates the exact bandwidth to each service depending on the algorithm’s policy | SLA-DBA [45], DBAM [32], DBA-Choi [36], Cycling DBA [44], BP-DBA [46]

CUSTOMER DIFFERENTIATION (SERVICE LEVEL AGREEMENT)
Policy | Properties | Algorithms
Various service providers on the same channel | Various service providers support the same services on the same upstream channel | DUAL-SLA [47]
Upstream channel divided into bandwidth units | Upstream channel divided into equal bandwidth units. Units are distributed among guaranteed and non-guaranteed ONUs | BGP [48]
Based on static weights | One service provider. Upstream channel divided among ONUs based on fixed weights according to their SLAs | DMB [34]
Based on dynamic weights | One service provider. Set of dynamic weights to distribute the upstream channel so that the highest priority traffic complies with the constraints of the mean packet delay for every subscriber | DySLa [49]
3.2. Wavelength Division Multiple Access (WDMA)
WDMA-PON architectures are viewed as a solution to the scalability problem of traditional EPONs. In these traditional EPONs, the offered power limits the maximum number of ONUs connected to one OLT, and only one wavelength is shared by all users. Thus, the bandwidth available to each subscriber decreases with the number of subscribers. One solution is to increase the transmission rate of the upstream channel of the deployed single-channel EPONs. However, this approach greatly increases the costs, since the current transceivers have to be replaced by more powerful transceivers which operate at higher transmission rates. In contrast, traditional EPONs may be selectively upgraded by adding multiple wavelengths over the same fiber infrastructure according to the bandwidth requirements of the access network. Therefore, only those nodes which need more capacity are upgraded, separately and in a cost-effective way. WDM-PON architectures and the WDM-DBA algorithms for such architectures are currently under intensive study. However, no predominant or mandated architecture exists yet. Therefore, the gradual WDM upgrade will be limited by technological costs and driven by the needs of service providers. Flexible WDM-PON architectures which can be migrated in a cost-effective way are preferable.
3.2.1. WDM-EPON deployed architectures
Several WDM-TDM architectures have been proposed recently, although the deployment of WDM technology in the access network is still in its first stages. Many proposed WDM-PON architectures depend on network components, such as the remote node (RN) deployed in the field. WDM-PON prototypes tend to use splitters or AWGs (Arrayed Waveguide Gratings) as remote nodes. Among the WDM-PON technologies based on splitters, one of the simplest architectures is shown in Figure 12. In this configuration, the sources of the OLT and the ONUs are able to transmit at several wavelengths. This solution permits a progressive upgrade of the architecture, as new wavelengths can be added for those subscribers with higher bandwidth demand, whereas the remaining subscribers continue with the previous configuration [50]. Another widespread WDM-PON approach employs one separate wavelength for the transmission from the OLT to each of the supported ONUs. In general, this architecture does not allow bandwidth redistribution and has high deployment costs. A more sophisticated architecture, based on splitters combined with an AWG, is shown in Figure 13. This configuration makes it possible to increase both the number of users and the offered bandwidth. As the splitter typically limits the number of subscribers to 64, this solution proposes the deployment of new TDM-PON infrastructures. However, if all TDM-PONs operated at the same wavelengths, the available bandwidth would decrease significantly. To avoid this problem, this architecture incorporates an AWG in the OLT, which allows each deployed TDM-PON infrastructure to be managed independently.
Figure 12. Example of a WDM-PON architecture with a splitter as remote node (RN)
Figure 13. Example of a WDM-PON architecture with a splitter as remote node (RN) and an incorporated AWG
The previous configuration, based on a single AWG, efficiently increases the available bandwidth in a PON. However, there are important technological constraints that do not allow the expansion of the network. In order to solve this problem, an architecture consisting of several levels of AWGs (typically two) has been proposed. This architecture takes advantage of the cyclic properties of the AWG to simultaneously increase the available bandwidth and the number of users in the PON. The first-level AWG (KxM) is located in the OLT and distributes the generated laser signals towards each of the M branches of the external network infrastructure. The second-level AWG (LxN) routes each of the N wavelengths to the corresponding ONU [51-52]. To reduce costs, the authors in [53] proposed hybrid WDM-TDM access architectures with reflective ONUs, an arrayed waveguide grating (AWG) in the outside plant and a tunable laser stack at the OLT. The AWG is wavelength selective and exhibits lower losses than optical splitters. This architecture has been designed to decrease the number of lasers at the OLT, and it also greatly improves the security at each ONU. Besides, it reduces the cost of the ONUs by using reflective devices to modulate the upstream data onto an optical carrier sent by the OLT. Therefore, ONUs do not require any light source and all of them are identical. However, dynamic wavelength assignment is not permitted, so the architecture cannot take advantage of inter-channel statistical multiplexing to fairly redistribute the available bandwidth. In contrast, the WDM-PON architecture called SUCCESS [54] permits a gradual migration from TDM-PON to WDM-PON. This prototype adds tunable transmitters and receivers to the OLT, which are shared by every ONU to reduce the number of expensive DWDM transceivers.
It allows multiple tunable transceivers to be shared among several independent PONs. In addition, the ONUs use optical modulators instead of DWDM light sources for the upstream transmission. The SUCCESS prototype was improved by deploying a ring topology in the so-called SUCCESS-HPON [55]. The OLT communicates with users using a ring topology with
several distribution stars connecting the OLT located at the central office with all ONUs. The semi-passive behaviour of the remote nodes (RNs), combined with the TDM/WDM topology, makes it possible to support different types of subscribers on the same network infrastructure and also provides protection and restoration capabilities. The remote node can be an optical splitter or an AWG. If the remote node contains an AWG, each ONU has one dedicated wavelength on a DWDM grid to communicate with the OLT. In contrast, if the remote node contains a splitter, one wavelength is used to broadcast the downstream packets to all ONUs connected to this remote node. The authors in [56] developed a dynamic wavelength allocation algorithm which lets the bandwidth be shared across multiple physical PONs, thanks to a common tunable transceiver at the OLT. It enhances the performance of the architecture and reduces the related costs. The novel hybrid WDM-TDM architecture proposed in [57] uses a transmitter without wavelength selectivity based on an uncooled Fabry-Pérot Laser Diode (FP-LD). The study demonstrated that a single FP-LD can be used at any wavelength channel without wavelength tuning over a temperature range from 0 to 60 ºC. The architecture has a double-star topology with a cascade of several arrayed waveguide gratings (AWGs). Each AWG is shared by a number of users by means of splitters via TDM techniques. The experiments achieved a WDM/TDM architecture with 128 connected subscribers at rates of 1.25 Gbit/s for the downstream and 622 Mbit/s for the upstream. The architecture has 16 WDM channels with a channel spacing of 100 GHz, each of them with eight subscribers connected via TDM. However, no DBA algorithms were discussed for this architecture. On the other hand, WDM-PON architectures depend strongly on the light sources used at both ends of the access infrastructure.
Hence, the architectures proposed in [58-60] consider a smooth upgrade of TDM-PONs, allowing several wavelengths for the upstream transmission. The authors in [58-59] proposed that the OLT consist of an array of fixed lasers/receivers and that the ONUs consist of either an array of fixed lasers/receivers or several tunable lasers/receivers. However, from the providers’ point of view, it is more likely that either tunable lasers/receivers or fixed laser/receiver arrays will be used, but not both simultaneously. In the prototype proposed in [60], every ONU employs one or more fixed transceivers, permitting a gradual upgrade depending on the traffic demand of the ONUs. Then, the OLT assigns bandwidth to each ONU on the wavelengths it supports. In addition, the fixed transceivers at the ONU can be replaced by a fast tunable laser. This solution is very attractive, as it provides several wavelengths by means of a single optical device. In that case, however, the OLT could only transmit on a single wavelength at any given time, which may lead to poor bandwidth utilization due to the dead tuning time incurred at every wavelength switch. Other architectures propose dividing the ONUs into multiple subsets [61]. As each subset is allocated a fixed wavelength channel for the upstream transmission, each ONU is equipped with a fixed transceiver and the OLT with a stack of fixed transceivers. However, this architecture has limited flexibility, as it does not allow dynamic wavelength allocation. In these infrastructures it is necessary to install a light source with a specific wavelength. As a consequence, the wavelengths of the upstream transmitters have to be chosen to match the filters. Finally, other architectures focus on reducing the cost of each ONU by means of reflective devices which modulate the upstream data onto optical carriers. Consequently, ONUs do not require any light source and all of them are identical.
Therefore, in each ONU a directly-modulated Reflective Semiconductor Optical Amplifier (RSOA) is implemented at
the upstream transmitter. Some WDM-PON prototypes [53] modulate the downstream data using optical frequency-shift keying (FSK) techniques. The modulated downstream wavelength is injected into the RSOA located at each ONU with direct modulation for the upstream transmission. The FSK modulation reduces the light coherence and the Rayleigh scattering crosstalk.
3.2.2. Dynamic bandwidth allocation (DBA) protocols in WDM-EPONs
In WDM-EPONs there are two essential approaches to distribute the available bandwidth among users: separate wavelength and time allocation, or joint wavelength and time assignment. Most of the proposed studies consider joint time and wavelength assignment, as it permits multidimensional scheduling. There are multiple schemes to dynamically allocate the supported wavelengths in WDM-EPON architectures. Some of them are extensively used in the transport network, such as the fixed, the random, the least assigned, the least loaded or the first fit techniques. The OLT keeps track of the utilization of each wavelength and uses this information to decide which ONUs will have their wavelength assignment changed. In the fixed scheme, once a wavelength has been assigned for the transmission of one ONU, this assignment never changes. This makes the wavelength allocation very simple to implement, but it lacks the advantages of statistical multiplexing in the wavelength domain. On the other hand, the random, the least assigned and the least loaded methods tend to excessively overload certain wavelengths, as demonstrated in [58-59]. In the least loaded scheme, the OLT assigns the least loaded wavelength to the next ONU able to transmit. In the least assigned scheme, the OLT allows the next ONU to transmit on the least used wavelength. In contrast, the first fit scheme, in which ONUs transmit on the first free wavelength, leads to an efficient solution [62]. Independently of the wavelength allocation scheme applied in WDM-EPONs, algorithms can follow two policies to distribute the bandwidth on all supported wavelengths: the offline policy and the online policy. In the former, the OLT has to wait for the Report of every ONU in order to allocate bandwidth for the next cycle.
On the contrary, in the online policy, the OLT assigns bandwidth to each ONU for the next cycle just after receiving its Report message, independently of the status of the remaining ONUs. This is possible in WDM-EPON infrastructures since ONUs are permitted to transmit as soon as one of the supported wavelengths becomes free. Regarding the bandwidth distribution in each separate wavelength, several schemes can be applied. The authors in [60] developed three different algorithms, called DWBA-1, DWBA-2 and DWBA-3, with different bandwidth distribution policies. The main factor which distinguishes them is the offline or online policy used by each of them. In the first algorithm, DWBA-1, the OLT waits until all Report messages from one cycle are received before applying the allocation algorithm for the next cycle. In contrast, in the other two approaches (DWBA-2 and DWBA-3) the OLT lets ONUs with low traffic demand transmit before every Report message has been received. In particular, in DWBA-2, the OLT immediately allows the transmission of the lightly loaded ONUs whose demanded bandwidth is lower than an imposed minimum bandwidth. Otherwise, the OLT waits for every Report before assigning bandwidth for the next cycle. This behaviour permits a reduction in the mean
packet delay but it increases the complexity of the system, as the OLT has to keep extensive information about the status of every ONU. On the other hand, in the DWBA-3 algorithm, the OLT permits the transmission of lightly loaded ONUs with a transmission window equal to their demanded bandwidth, provided that this demand is lower than the stipulated minimum bandwidth. Otherwise, ONUs transmit with a transmission window equal to this minimum bandwidth. In both situations, once the cycle ends, the excess bandwidth is allocated among the highly loaded ONUs according to one of three bandwidth distribution variants developed by the authors. Simulation results showed a better performance of DWBA-2 over DWBA-3. This happens because in DWBA-3 the OLT assigns the excess bandwidth to an overloaded ONU in a separate window in the same cycle, so that the allocated bandwidth may be underutilized. As mentioned before, the authors in [60] proposed three variants to assign the excess bandwidth at the end of every cycle. Each of the three algorithms (DWBA-1, DWBA-2 and DWBA-3) can apply any of these variants to distribute the excess bandwidth among the highly loaded ONUs. Among these three schemes, namely uncontrolled (UE), controlled (CE) and fair (FE), the former improves the bandwidth utilization and the overall network performance. In the UE variant, the OLT collects from every Report the excess bandwidth for the next cycle and assigns it to the overloaded ONUs in a proportional way. This method may give more bandwidth than necessary to those ONUs which are not highly overloaded. In contrast, in the CE scheme, the OLT assigns the extra bandwidth to each overloaded ONU depending on its bandwidth demand.
As this method is applied in a round robin fashion, some overloaded ONUs may not receive any extra bandwidth, because the excess bandwidth has been given to previously served overloaded ONUs and is close to zero when their turn comes. This means that some overloaded ONUs which are polled at the end of the cycle cannot be given any extra bandwidth. On the contrary, the FE policy always reserves a part of the excess bandwidth for every overloaded ONU according to its bandwidth demand. Therefore, every overloaded ONU is always allocated some excess bandwidth. Finally, the dynamic channel allocation used by the three methods is based on the first-fit technique. The authors demonstrated that static wavelength assignment penalizes ONUs with high traffic demand and may underutilize the channel capacity if the load is not symmetric. On the other hand, the algorithm proposed in [62], called WDM-IPACT, is an extension of the Interleaved Polling with Adaptive Cycle Time (IPACT) algorithm developed for EPON access networks. Like the previous method [60], it applies the first-fit technique to dynamically select each wavelength channel. Hence, the algorithm permits each ONU to transmit on the first available upstream wavelength following a round robin discipline. To do so, the OLT dynamically calculates when each upstream channel becomes idle, since it keeps track of the round trip time of every ONU and its currently allocated bandwidth on the different upstream channels. Moreover, it follows a polling policy that allows the OLT to schedule the transmission of each ONU as soon as it receives its Report message, instead of waiting for all the Report messages in the current cycle. Regarding quality of service, the algorithm provides class of service differentiation by using the strict priority queue scheme.
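The first-fit channel selection with online scheduling described above can be sketched as follows. This is a simplified illustration rather than the WDM-IPACT algorithm of [62]: the function name `first_fit_grant`, its parameters and the timing model (the data of an ONU reaches the OLT one round trip time after the Gate message is sent, with an optional guard time between transmissions) are assumptions made for this example.

```python
def first_fit_grant(channel_free_at, now, rtt, tx_time, guard=0.0):
    """Schedule one ONU on the upstream channel that becomes free first.

    channel_free_at: list with the time at which each wavelength becomes
    idle at the OLT; it is updated in place. Returns (channel, start_time).
    """
    # First fit: pick the wavelength with the earliest idle time.
    channel = min(range(len(channel_free_at)),
                  key=lambda c: channel_free_at[c])
    # The transmission cannot reach the OLT before the Gate has travelled
    # to the ONU and the data has travelled back (one round trip time).
    start = max(channel_free_at[channel], now + rtt)
    channel_free_at[channel] = start + tx_time + guard
    return channel, start
```

Because the OLT calls such a routine as soon as a Report message arrives, each ONU is scheduled individually (online policy) instead of waiting for the Report messages of all ONUs in the cycle.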
The work in [59] developed an extension to the Multi-Point Control Protocol (MPCP) for WDM-EPONs to provide dynamic bandwidth allocation. The authors implemented two scheduling paradigms for WDM-EPONs, namely online and offline
schemes. In the former, the OLT applies bandwidth and wavelength allocation based on the individual request of each ONU. In contrast, in the offline policy the OLT makes scheduling decisions taking into account the bandwidth requirements of all ONUs. It is assumed that each ONU can transmit on all supported wavelengths in the WDM-EPON by means of a tunable transceiver with negligible tuning time. The simulations demonstrated that online scheduling obtains lower delays than offline scheduling, especially at medium and high ONU loads. This happens because the offline schemes introduce an inter-scheduling cycle gap between consecutive cycles, during which ONUs are not able to transmit. The authors proposed, as future work, that the scheduler of the OLT could combine the online and the offline policies. Such a scheme may achieve a low mean packet delay and ensure quality of service levels inside the WDM-EPON. Other algorithms, such as the one presented in [63], support QoS in a differentiated service framework. The algorithm, called QoS-DBA-2, allows each ONU to transmit simultaneously on two channels, where each channel is dedicated to a different type of traffic. The authors considered a WDM-EPON architecture in which each ONU is equipped with a set of fixed transceivers, so that a group of ONUs share the same wavelength using TDM techniques. In particular, the proposed architecture uses two fixed transceivers. Then, each ONU is permitted to use both wavelengths at the same time, but each wavelength is dedicated to one class of service. One wavelength channel is used for the highest priority traffic, and the other one for the medium and low priority traffic. Each ONU sends two Report messages at the end of its transmission, one demanding bandwidth for the highest priority services and the other for the remaining types of services.
In this way, the OLT has to wait for the Report messages of all ONUs in one cycle before applying the DBA algorithm to allocate bandwidth for the next cycle on each wavelength. However, this scheme may underutilize the channel capacity if there is not enough high priority traffic to take advantage of the total capacity. In that case, the cycle on the high priority channel becomes so short that the continuous polling overhead (Report and Gate messages) wastes a large share of the bandwidth. To overcome this issue, the authors proposed a new algorithm which allows ONUs to transmit part of their low priority traffic on the high priority wavelength channel, but only if the minimum quality of service requirements of the high priority traffic are not affected.

As QoS has to be supported by access networks, the Dynamic Wavelength aSsignment to support multi-Service Level Agreement (DyWaS-SLA) algorithm proposed in [64] has been designed to provide class of service and subscriber differentiation simultaneously. Its most important strength is that it ensures a minimum guaranteed bandwidth to every profile depending on the contracted SLA. Hence, as the network supports different service level profiles, it treats subscribers differently depending on their priority. To do so, the algorithm assigns a fixed weight to each profile and distributes the bandwidth according to this set of weights, so that each priority profile is offered a guaranteed bandwidth. If the available bandwidth is not enough to cover the total bandwidth demanded by all subscribers, the algorithm still ensures these minimum bandwidth levels. On the other hand, as the first fit wavelength scheme, in which ONUs transmit on the first free wavelength, leads to an efficient solution, the algorithm adopts this method to assign wavelengths dynamically.
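The weighted guarantee and the first fit rule described for DyWaS-SLA can be illustrated with a short Python sketch. It is not the published algorithm from [64]: the function names, the capping rule (each grant is the smaller of the demand and the weighted share) and the fallback to the earliest-free wavelength are assumptions made purely for illustration.

```python
from typing import Dict, List


def weighted_grants(capacity: float,
                    demands: Dict[str, float],
                    weights: Dict[str, float]) -> Dict[str, float]:
    """Cap each ONU's grant at its weighted share of the cycle capacity.

    Hypothetical rule: guaranteed_i = capacity * w_i / sum(w). An ONU that
    asks for less than its guarantee simply receives what it asked for.
    """
    total_weight = sum(weights[onu] for onu in demands)
    grants = {}
    for onu, demand in demands.items():
        guaranteed = capacity * weights[onu] / total_weight
        grants[onu] = min(demand, guaranteed)
    return grants


def first_fit_wavelength(free_at: List[float], now: float) -> int:
    """Return the index of the first wavelength that is free at time `now`.

    Assumption: if no wavelength is free yet, fall back to the one that
    becomes free earliest.
    """
    for index, free_time in enumerate(free_at):
        if free_time <= now:
            return index
    return min(range(len(free_at)), key=lambda i: free_at[i])


if __name__ == "__main__":
    # Hypothetical demands (arbitrary bandwidth units) and SLA weights.
    demands = {"onu_gold": 500, "onu_silver": 100, "onu_bronze": 400}
    weights = {"onu_gold": 3, "onu_silver": 2, "onu_bronze": 1}
    print(weighted_grants(600, demands, weights))
    print(first_fit_wavelength([5.0, 2.0, 0.0], now=3.0))
```

In this simplified form, bandwidth left unused by a lightly loaded ONU is not redistributed; a fuller scheme would presumably hand that excess to the overloaded ONUs.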
Moreover, the DyWaS-SLA algorithm assumes that each ONU is equipped with several fixed transceivers or a tunable transceiver for upstream transmission. This equipment was chosen because flexible WDM-EPON architectures that can be upgraded in a cost-effective way are preferable, as they allow dynamic allocation in both time and wavelength. A summary of the main characteristics of the algorithms presented in this section regarding the WDMA access protocol applied to WDM-EPON networks is shown in Table 3.

Table 3. Comparative table of the main characteristics of developed algorithms in WDM-EPONs based on the WDMA protocol
WAVELENGTH ALLOCATION SCHEMES

| Policy | Properties | Algorithms |
|---|---|---|
| Least loaded wavelength | The next assigned wavelength is the least loaded wavelength. Inefficient as it overloads certain wavelengths. | |
| Least assigned wavelength | The next assigned wavelength is the lowest used wavelength. Inefficient as it overloads certain wavelengths. | |
| First fit method | The next assigned wavelength is the first free wavelength. The most efficient method to fairly distribute the supported wavelengths. | DWBA-1, DWBA-2, DWBA-3 [60]; WDM-IPACT [62]; DyWaS-SLA [64] |
| Separate wavelengths to each service | Each wavelength is dedicated to different classes of services depending on their priority. | QoS-DBA-2 [63] |

BANDWIDTH ALLOCATION SCHEMES

| Policy | Algorithms | Properties |
|---|---|---|
| Centralized | DWBA-1 [60] | The OLT waits for every Report to assign bandwidth for the next cycle. The excess bandwidth is distributed among the highly overloaded ONUs at the end of the cycle. |
| Centralized | DWBA-2 [60] | ONUs with low demand transmit before the reception of every Report. The excess bandwidth is distributed among the highly loaded ONUs in a round robin fashion. |
| Centralized | DWBA-3 [60] | ONUs with low demand transmit before the reception of every Report. It ensures a part of the excess bandwidth to every highly loaded ONU depending on its bandwidth demand. |
| Polling | WDM-IPACT [62] | The OLT schedules the transmission of every ONU just after receiving its Report message in every cycle. It supports class of service differentiation using the strict priority queue method. |
| Polling | DyWaS-SLA [64] | The OLT schedules the transmission of every ONU just after receiving its Report message in every cycle. It ensures a guaranteed bandwidth by means of weighted factors. |
4. FUTURE APPROACH: LONG-REACH PASSIVE OPTICAL NETWORKS

There is an increasing interest in the development of optically amplified PONs with larger split ratios, also called Long-Reach PONs. Many components have been developed to deploy this type of optical access network, such as optical amplifiers which extend the network coverage from 20 km to 100 km. Moreover, advances in WDM make it possible to offer more bandwidth, which can be used efficiently only if more users are integrated into the access network. This integration may be possible by using Long-Reach architectures. Therefore, the WDM technology used in traditional PONs could allocate more bandwidth if end subscribers demand it. To serve more users in an optical access network, the network span needs to be increased to cover more subscribers, and there is a strong tendency to use Long-Reach PONs to extend the initial reach up to 100 km [65]. These network architectures also provide high cost efficiency by simplifying the network, since the access and the metro networks can be combined into a single network by using 100 km of fibre. Hence, the cost associated with electronic interfaces between the access and the outer metro backhaul network is eliminated. The OLT of traditional PONs can be replaced at the local exchange by some elementary hardware, such as optical amplifiers. As a result, a network operator may need only a few central offices, reducing the related cost [65]. Due to these excellent characteristics, several Long-Reach PON prototypes have been developed in recent years. Among them are the SuperPON prototype developed by British Telecom, the evolved Long-Reach architecture developed by Shea et al. and the Hybrid DWDM-TDMA PON proposed by Townsend and Talli of University College Cork.
4.1. Long-Reach Architectures

Planet SuperPON

The SuperPON (Super Passive Optical Network) architecture is based on the BPON (Broadband PON) standard and was developed to include a large splitting factor to support from 32 to 1024 subscribers [66]. It was designed to cover a larger distance of up to 100 km. To overcome the added attenuation, the SuperPON architecture placed several optical amplifiers in both the upstream and downstream channels. However, in the upstream channel the optical amplifiers introduced unwanted optical noise, called Amplified Spontaneous Emission (ASE) noise, which degraded the performance of this channel. As a result, the maximum split size achieved with a good SNR (Signal to Noise Ratio) was only 1024. To increase the splitting ratio, optical amplifiers were placed in parallel between split stages. Although these optical amplifiers improved the SNR, they greatly increased the effect of the ASE noise. A complex gating protocol was therefore introduced, which switched on an optical amplifier only when a signal needed to be amplified [67]. This architecture applied the TDMA protocol in combination with WDMA so that it could be used when the demanded bandwidth increased considerably. The upstream channel capacity was raised to 311 Mbit/s (shared among all users by means of TDMA), while the downstream channel reached transmission rates of up to 2.5 Gbit/s.
Long-reach PON architecture

This optically amplified PON architecture extends the reach up to 100 km. Although it supported half as many subscribers as SuperPON, it used only six optical amplifiers for the upstream and downstream channels, in contrast to the 39 amplifiers required by SuperPON. This architecture was designed to avoid the noise funnelling present in SuperPON, as an intermediate amplification stage was placed at the local exchange, just after the 1024 split stage in the distribution section. Hence, no optical amplifiers operated in parallel. Consequently, this architecture improved on the SuperPON prototype and reduced costs [68-69]. The Long-Reach PON architecture was able to achieve symmetric transmission rates of up to 10 Gbit/s over distances of 100 km while supporting up to 1024 subscribers. This was done using low cost optical transceivers in each ONU.

Long-reach PON architectures based on a ring topology

A typical Long-Reach PON deployment based on a ring topology, proposed by Song et al. [70], is shown in Figure 14. This configuration is well suited to network resilience and bidirectional transmission. At each node of the ring, an Optical Add-Drop Multiplexer (OADM) is used to insert and drop one wavelength to reach the end subscribers. These end subscribers are connected to each OADM by means of tree PON topologies, and all subscribers attached to the same OADM use the same wavelength. Because of the long distance covered by the ring, the OADM also compensates for the signal power loss along the long-reach transmission. As the OADMs require a power supply, the interfaces are not totally passive, and therefore the term “Long-Reach PON” is not fully accurate. However, such PON systems include only a very limited number of active components. Other ring architectures, such as the one proposed by Santos et al. [71], use a remote node instead of OADMs.
This node consists of optical amplifiers, optical splitters and optical combiners to extract and insert the optical signals, plus two optical switches for the upstream and downstream channels. The aim of this configuration is to provide protection if a service alert is detected. Similarly, An et al. proposed in [72] a semi-passive node able to support network protection and resilience inside the ring. To do so, they used AWGs and a circuit able to detect failures in the network.
Figure 14. Example of a Long-Reach PON (LR-PON) in a ring topology
In these architectures, the OADM at each node of the ring extracts one wavelength and assigns it to one independent TDM-PON. However, all wavelengths are carried on the same fibre to span distances of up to 100 km. Therefore, each TDM-PON can be treated as an independent Long-Reach PON at the access level.
Long-reach PON architectures based on hybrid combination DWDM-TDM

Other Long-Reach PON architectures are based on a combination of DWDM and TDM techniques [65] [73], which use DWDM (Dense Wavelength Division Multiplexing) to support several TDM PONs. Each TDM PON works on a different wavelength, but all of them share the same amplifier and backhaul fibre. This hybrid architecture is a significant improvement for deployed access networks, as it combines long-reach transmission, thanks to the optical amplifiers, with an increase in the number of users, thanks to DWDM techniques. In these DWDM-TDM architectures, a number of powered PONs, each with a specific wavelength for upstream transmission, are combined using DWDM multiplexers. Thus, each of them can be viewed as an independent Long-Reach PON. This DWDM-TDM architecture can support symmetric rates of 10 Gbit/s in the upstream and downstream channels over 100 km. The number of subscribers supported depends on the number of PONs that each backhaul section supports. In the distribution section, between the ONUs and the local exchange, there are several PONs. All these independent PONs are combined at the local exchange into a single backhaul fibre using a DWDM multiplexer/demultiplexer. As an example, the architecture proposed by Talli et al. [73] uses the C-band with channels spaced 100 GHz apart. This results in a set of upstream wavelengths from 1547.2 nm to 1560.1 nm and a set of downstream wavelengths from 1529 nm to 1541.6 nm, giving 17 independent TDM-PONs. Each PON has a split ratio of 256, so that up to 4352 subscribers are supported, each of them with a transmission rate of 39 Mbit/s.
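The figures quoted for the architecture of Talli et al. can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check, not part of [73]: the function names are hypothetical, the channel count assumes an ideal 100 GHz grid between the two band edges, and the per-subscriber rate assumes that the 10 Gbit/s of one PON is shared equally among its ONUs with no protocol overhead. The total of 4352 subscribers corresponds to 17 PONs with a 256-way split (17 × 256 = 4352).

```python
SPEED_OF_LIGHT_NM_THZ = 299_792.458  # c expressed in nm * THz


def channels_in_band(lambda_min_nm: float, lambda_max_nm: float,
                     spacing_ghz: float = 100.0) -> int:
    """Count grid channels between two band-edge wavelengths (inclusive)."""
    f_high_thz = SPEED_OF_LIGHT_NM_THZ / lambda_min_nm
    f_low_thz = SPEED_OF_LIGHT_NM_THZ / lambda_max_nm
    gaps = round((f_high_thz - f_low_thz) * 1000.0 / spacing_ghz)
    return gaps + 1


def per_subscriber_rate_mbps(line_rate_gbps: float, split_ratio: int) -> float:
    """Equal share of one PON's line rate, ignoring overhead (assumption)."""
    return line_rate_gbps * 1000.0 / split_ratio


if __name__ == "__main__":
    upstream = channels_in_band(1547.2, 1560.1)
    downstream = channels_in_band(1529.0, 1541.6)
    print("upstream channels:", upstream)
    print("downstream channels:", downstream)
    print("total subscribers:", upstream * 256)
    print("Mbit/s per subscriber:", per_subscriber_rate_mbps(10.0, 256))
```

Both bands span roughly 1.6 THz, which is 16 gaps of 100 GHz and therefore 17 channels, matching the 17 TDM-PONs mentioned in the text.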
4.2. DBA Algorithms in Long-Reach Ethernet PON Architectures

Long-Reach PONs, like conventional PON architectures, are based on a tree topology between the OLT and the ONUs. In the upstream direction, as the transmission is multipoint-to-point, a medium access control protocol is needed. These protocols have to be very efficient because of the large increase in the propagation delay between the OLT and the ONUs. When the round trip time (RTT) increases due to a longer distance, more packets are kept buffered in each ONU, and consequently the mean packet delay also increases. To overcome this high propagation delay, several dynamic bandwidth allocation algorithms have been proposed for Long-Reach PONs, some of them based on centralized policies and others based on polling policies. In centralized schemes, as the OLT assigns the bandwidth for the next cycle once it has received the updated demand of every ONU in the current cycle, ONUs have to wait a time equal to the round trip time before transmitting in the next cycle. Since in Long-Reach PONs the distance between the OLT and the ONUs is extended up to 100 km, the RTT during which ONUs cannot transmit increases to 1 ms. To take advantage of this long idle period, it is necessary to implement some scheme that allows ONUs to transmit during it. The TSD (Two-State DMB) algorithm proposed in [74] treats this round trip time as virtual fixed cycles in which ONUs are able to transmit, as shown in Figure 15.
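The 1 ms figure follows directly from the fibre length. As a rough check, the sketch below assumes that light travels through fibre at about 2 × 10^8 m/s (a refractive index near 1.5); this constant, the function names and the 1 Gbit/s example line rate are assumptions for illustration and are not taken from [74].

```python
FIBRE_SPEED_M_PER_S = 2.0e8  # assumed propagation speed in fibre


def round_trip_time_s(distance_km: float) -> float:
    """Round trip propagation time between the OLT and an ONU."""
    return 2.0 * distance_km * 1_000.0 / FIBRE_SPEED_M_PER_S


def bits_per_rtt(distance_km: float, line_rate_bps: float) -> float:
    """Upstream capacity that goes unused if nobody transmits for one RTT."""
    return round_trip_time_s(distance_km) * line_rate_bps


if __name__ == "__main__":
    for distance in (20, 100):
        rtt_ms = round_trip_time_s(distance) * 1_000.0
        unused_kbit = bits_per_rtt(distance, 1.0e9) / 1_000.0
        print(f"{distance} km: RTT = {rtt_ms:.2f} ms, "
              f"unused capacity at 1 Gbit/s = {unused_kbit:.0f} kbit")
```

Under these assumptions the RTT grows from 0.2 ms at 20 km to 1 ms at 100 km, which is why waiting one RTT per cycle becomes so costly in a Long-Reach PON.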
Figure 15. Example of the TSD (Two-State DMB) algorithm in a Long-Reach PON [74]
Since the real bandwidth demand is not available at the beginning of every virtual cycle, the OLT distributes the bandwidth in these cycles using a prediction of the required bandwidth obtained by traffic estimation, as can be observed in Figure 15. Therefore, centralized methods in Long-Reach PONs can become complex and require traffic prediction, which may be complicated due to the bursty nature of the traffic. In other algorithms based on polling policies, such as the Long-reach Interleaved Polling algorithm with Service level Agreement (LIPSA) proposed in [75], the OLT allocates bandwidth to each ONU when its Report message arrives at the OLT. Therefore, one ONU is allowed to transmit just after the previous ONU has finished its transmission, leading to efficient bandwidth utilization. This scheme applied to Long-Reach EPONs is much simpler than centralized ones, and consequently requires less computing time. Simulation results show that the polling algorithm applied to a Long-Reach EPON of 100 km achieves the same efficiency as when it is applied to a typical EPON of 20 km. In addition, the algorithm does not require traffic prediction and little time is wasted between consecutive transmissions. In order to improve the delay performance, in the adaptive cycle polling algorithm proposed in [70], an ONU is permitted to send its Report message before its previous Gate message is received. This scheme creates a new “thread” of signalling between the ONU and the OLT and lets parallel “polling processes” run at the same time. The number of threads is not limited, and it depends on the system environment. The computational complexity of the multi-thread polling scheme is not increased compared with the single-thread scheme.
This similarity arises because in both cases the Report messages arrive at the OLT at a similar rate, due to the trade-off between the increased RTT and the increased number of threads in the Long-Reach EPON.

On the other hand, there is currently great interest in extending EPON distances beyond 100 km. Some prototypes focus not only on increasing the distance coverage, but also on optimizing the optical components needed in such a deployment. Shea et al. have demonstrated [76] Long-Reach PONs able to operate at 10 Gbit/s over distances of up to 110 km using a lower number of EDFA amplifiers in the backhaul section. In addition, Machale et al. [77] have proposed hybrid architectures based on the combination of DWDM and TDM able to transmit at 10 Gbit/s over 116 km. Following the same philosophy, Kjaer et al. [78] have proposed a bidirectional Long-Reach PON that operates at symmetric rates of 10 Gbit/s over 120 km. There are other proposals, such as the prototype of Davey et al. [79], that have succeeded in deploying the GPON standard in Long-Reach PONs of 135 km. Similarly, Shea et al. have demonstrated in [80] an experimental Long-Reach PON infrastructure with wavelength conversion based on the GPON standard. This prototype was able to support 1280 subscribers, providing each user with nearly 38.5 Mbit/s over distances of 120 km. In this infrastructure, optical wavelength converters in the upstream channel transfer the transmitted data from the wavelength used by the ONU to a standard DWDM wavelength. In this way, a set of PONs can be grouped into the same backhaul fibre, with each PON converted to a separate wavelength. This architecture reduces the cost of the backhaul fibre, and the same platform is shared by a large number of subscribers.

As mentioned before, polling schemes lead to efficient bandwidth utilization and are simple to implement in Long-Reach EPONs. However, their efficiency can be improved for the low and medium network loads at which service operators are likely to work. These loads are especially interesting for operators, as the network does not suffer packet losses and thus operates under good conditions. Hence, it is sensible to focus on improving the network performance as much as possible for such loads. For this range of loads, there may be idle time between the transmissions of consecutive ONUs due to the long distance and the low amount of bandwidth demanded by the ONUs. Therefore, in the algorithm proposed in [81], called LOng-reach Highly Efficient Dynamic bandwidth Assignment (LOHEDA), this idle time is used to anticipate the transmission of some packets before they have been granted, thus reducing their delay, as shown in Figure 16.
The anticipation of some packets becomes more relevant when Long-Reach EPONs extend their distances well beyond 100 km, as mentioned before. Under this circumstance, the idle time between consecutive transmissions appears more frequently as the distance increases.

Figure 16. Example of the LOHEDA (LOng-reach Highly Efficient Dynamic Bandwidth Assignment) algorithm in a Long-Reach EPON [81]
Simulation results showed that, in terms of mean packet delay, LOHEDA achieved improvements for the low and medium loads at which the network usually works. This improvement is especially noticeable over long distances. It has been demonstrated that, as the distance increases, LOHEDA obtains lower delays than conventional polling algorithms in Long-Reach EPONs for an extended range of network loads.
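The origin of the idle time that LOHEDA exploits can be seen in a deliberately simplified model of single-thread interleaved polling. The model is an assumption made here for illustration and is not the algorithm of [81]: an ONU's next Gate is issued only when its Report reaches the OLT, so its next burst cannot arrive earlier than one RTT later, and in the meantime the channel is used only by the other ONUs. The function name, the equal window sizes, the guard time and the neglect of OLT processing time are all hypothetical.

```python
def idle_time_per_cycle_us(rtt_us: int, n_onus: int,
                           window_us: int, guard_us: int) -> int:
    """Idle upstream time per polling cycle in a simplified model.

    After ONU i reports, its next burst reaches the OLT no sooner than one
    RTT later. During that RTT the other (n_onus - 1) ONUs each occupy the
    channel for window_us plus guard_us. Whatever is left over is idle.
    """
    busy_us = (n_onus - 1) * (window_us + guard_us)
    return max(0, rtt_us - busy_us)


if __name__ == "__main__":
    # Hypothetical light load: 16 ONUs, 20 us bursts, 1 us guard time.
    # RTTs of 200 us and 1000 us correspond to roughly 20 km and 100 km
    # if light propagates through fibre at about 2e8 m/s.
    for label, rtt in (("20 km", 200), ("100 km", 1000)):
        idle = idle_time_per_cycle_us(rtt, n_onus=16, window_us=20, guard_us=1)
        print(f"{label}: idle time per cycle = {idle} us")
```

With these example numbers the channel is never idle at 20 km, whereas at 100 km most of each cycle is unused, which is the kind of gap that anticipating packet transmissions is intended to fill.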
5. CONCLUSION

New emerging services and the inefficiency of deployed access technologies have accentuated the lack of access capacity. Optical fibre is expected to deal satisfactorily with the existing first mile challenges. The Passive Optical Network (PON) is the most promising technology to be implemented in the access network segment. In addition, Ethernet PONs (EPONs), based on the Ethernet protocol, are considered a good choice for the optimized access network of the near future, as Ethernet is a well-known, cheap technology that is interoperable with a variety of legacy equipment. As a consequence, EPON technology is emerging as a future global standard, and many studies are currently focused on it. EPONs typically use a point-to-multipoint tree topology between the OLT and the final subscribers (ONUs), resulting in a simple and cost-saving infrastructure. However, in the upstream direction, as all users share the same channel, a medium access control protocol is needed to avoid collisions between simultaneous transmissions. Among the medium access control protocols which can be applied in EPONs, TDMA (Time Division Multiple Access) is one of the most widespread since it is very easy to implement. However, it may be inefficient because network traffic is neither homogeneous nor continuous. Algorithms which distribute the available bandwidth dynamically, called Dynamic Bandwidth Allocation (DBA) algorithms, are necessary to adapt the network capacity to the traffic conditions by changing the bandwidth assigned to each ONU depending on the current requirements. Although EPON infrastructures based on TDMA are able to provide enough bandwidth for current applications, the gradual increase in the number of users and in the bandwidth requirements of new services demands an upgrade of such access networks.
The addition of new wavelengths to be shared in the upstream and downstream directions in EPON infrastructures leads to the so-called Wavelength Division Multiplexing EPONs (WDM-EPONs). The pure WDM-EPON architecture assigns one dedicated wavelength per ONU, which implies more dedicated bandwidth and more security in the system. However, the cost associated with such a deployment makes pure WDM-EPONs a next-generation architecture. Hence, the combination of WDM technology with Time Division Multiplexing (TDM) techniques is likely to be the best near-future approach. These hybrid architectures combine the wavelength assignment of WDM techniques with the power splitting of TDM methods. As a consequence, DBA algorithms are extensively used in these WDM-TDM EPON architectures, as in EPON networks, to distribute the available bandwidth among ONUs on each wavelength. DBA algorithms can follow an offline (centralized) policy or an online (polling) policy to allocate the available bandwidth. In the former, the OLT allocates bandwidth with knowledge of the total network state, but this policy wastes time, as the OLT needs to receive every updated demand. In contrast, under the online policy, one ONU can transmit just after the previous ONU has finished, which results in efficient channel utilization. However, the OLT does not know the total network status. Regardless of the allocation method followed, one of the most important challenges in EPONs is Quality of Service (QoS). Network operators have to provide minimum guarantees to customers in relation to the priority of the offered services (CoS) and the priority of the customer (SLA). Thus, the access network has to support both class of service and subscriber differentiation efficiently. Consequently, DBA algorithms have to guarantee that the highest priority services comply with the standard constraints for every kind of subscriber, regardless of their priority. Finally, there is an increasing interest in the development of optically amplified PONs with larger split ratios, called Long-Reach EPONs. These network architectures provide high efficiency by simplifying the network, combining the access and the metro networks into a single network by using 100 km of fibre. Although the Long-Reach EPON is a very promising architecture, very little research has focused on the development of robust DBA protocols that address the impact of the increase in propagation time. As a consequence, if classic DBA algorithms developed for typical EPONs are applied to Long-Reach EPONs, they may suffer from inefficient bandwidth utilization due to the large increase in the packet propagation time. Hence, adequate DBA algorithms are needed for these networks.
REFERENCES
[1] Kramer, G; Mukherjee, B; Maislos, A. Ethernet Passive Optical Networks. In Multiprotocol over DWDM: Building the Next Generation Optical Internet; Publisher: Sudhir Dixit, John Wiley & Sons, 2003, 229-275.
[2] Pesavento, M; Kelsey, A. PONs for the Broadband Local Loop. Lightwave, 1999, 16(4), 68-74.
[3] Lung, B. PON architecture Futureproofs FTTH. Lightwave, 1999, 16(10), 104-107.
[4] IDATE Consulting & Research. (2009). World FTTx Markets. FTTx Market Report.
[5] IDATE Consulting & Research. (2009). FTTx: global operator rankings.
[6] Hutcheson, L. (2009). FTTH/FTTH in Asia-Pacific. FTTH Council Asia-Pacific.
[7] Teknovus. (2009). Teknovus Announces EPON FTTx in Russia and Belarus. Available at: http://www.teknovus.com/News-Events/Press-Releases/2009/Teknovus-AnnouncesEPON-FTTx-in-Russia-and-Belarus.
[8] Zager, M. (2007). Independent Telcos Continue Rolling Out Fiber. Broadband Properties.
[9] Press Release of FTTH Council. (2009). Ranking of European FTTH penetration shows Scandinavia and smaller economies still ahead.
[10] IEEE 802.3ah Ethernet in the First Mile Task Force. (2004). IEEE 802.3ah Standard. Available at: http://www.ieee802.org/3/efm/public.
[11] IEEE 802.3 Ethernet in the First Mile Study Group. (2001). Ethernet in the First Mile: Point to Multipoint Ethernet Passive Optical Network (EPON) tutorial. Available at: http://www.ieee802.org/3/efm/public/jul01/tutorial/.
[12] Metro Ethernet Forum. (2005). Ethernet Passive Optical Network (EPON) (tutorial). Available at: http://metroethernetforum.org/PDFs/EFMA/.
[13] Chae, CH; Wong, E; Tucker, R. Optical CSMA/CD Media Access Scheme for Ethernet over Passive Optical Network. IEEE Photonics Technology Letters, 2002, 14(5), 711-713.
[14] Frazier, H; Pesavento, G. Ethernet takes on the First Mile. IT Professional, 2001, 3(4), 17-22.
[15] IEEE 802.1D. (2006). Call For Interest: 10 Gbps PHY for EPON. Available at: http://www.ieee802.org/1/pages/802.1D.html.
[16] Murakami, K. (2002). Authentication and Encryption in EPON. Available at: http://www.ieee802.org/3/efm/public/jul02/p2mp/f/.
[17] IEEE 802.1. (2004). 802.1X - Port Based Network Access Control. Available at: http://www.ieee802.org/1.
[18] IEEE 802.11. (2004). IEEE 802.11i: Amendment 6: Medium Access Control (MAC) Security Enhancements. Available at: http://www.ieee802.org/11.
[19] IEEE 802.11. (2007). IEEE 802.11-2007: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications. Available at: http://www.ieee802.org/11.
[20] Gummalla, A; Maislos, A; Horne, DM; Pesavento, G; Kaaja, H; Song, J; Limb, SJ; Khermosh, L; Hiironen, OP; Haran, O; Yoshihara, O; Hirth, R; Bemmel, V; Fujimoto, Y. (2002). MPCP State of the Art. Available at: http://www.ieee802.org/3/efm/public/jan02/.
[21] Maislos, A. (2002). MPCP General Description. Available at: http://www.ieee802.org/3/efm/baseline/.
[22] Haran, O; Gummalla, A; Maislos, A; Horne, DM; Song, J; Limb, J; Khermosh, L; Yoshihara, O; Bemmel, V; Fujimoto, Y. (2002). MPCP: Messages Format. Available at: http://grouper.ieee.org/groups/802/3/efm/public/jan02/.
[23] Algie, G; Bemmel, V; Brand, R; Gaglianello, B; Gummalla, A; Haran, O; Hirth, R; Horne, D; Khermosh, L; Suzuki, H; Limb, J; Maislos, A; Sala, D; Song, J; Yoshihara, O. (2002). MPCP Baseline Proposal Architecture and Layering Model. Available at: http://www.ieee802.org/3/efm/baseline/.
[24] IEEE 802.3. (2006). Call For Interest: 10 Gbps PHY for EPON. Available at: http://www.ieee802.org/3/cfi/.
[25] IEEE 802.3av TF. (2007). Baseline Proposals. Available at: http://www.ieee802.org/3/av/public/baseline.html.
[26] Ochiai, K; Tatsuta, T; Tanaka, T; Yoshihara, O; Oota, N; Miki, N. Development of a Gigabit Ethernet Passive Optical Network (GE-PON) System. NTT Technical Review, 2005, 3(5), 51-56.
[27] Tatsuta, T; Oota, N; Miki, N; Kumozaki, K. Design philosophy and performance of a GE-PON system for mass deployment. IEEE/OSA Journal of Optical Networking, 2007, 6(6), 689-700.
[28] Yoshihara, O; Oota, N; Miki, N. Dynamic Bandwidth Allocation Algorithm for GE-PON. IEICE Technical Report (Institute of Electronics, Information and Communication Engineers), 2002, 102(20), 1-4.
[29] Brunnel, H. Message Delay in TDMA Channels with contiguous output. IEEE Transactions on Communications, 1986, 34(7), 681-684.
[30] Ko, KT; Davis, BR. Delay Analysis for a TDMA Channel with contiguous output and Poisson Message Arrival. IEEE Transactions on Communications, 1984, 32(6), 707-709.
[31] Byun, HJ; Nho, JM; Lim, JT. Dynamic bandwidth allocation algorithm in Ethernet Passive Optical networks. Electronics Letters, 2003, 39(13), 1001-1002.
[32] Luo, Y; Ansari, N. Bandwidth allocation for multiservice access on EPONs. IEEE Communications Magazine, 2005, 43(2), 16-21.
[33] Kramer, G. (2004). How efficient is EPON? IEEE P802.3ah Ethernet in the First Mile Task Force. Available at: http://www.ieee802.org/3/efm/public/.
[34] Assi, C; Ye, Y; Dixit, S; Ali, MA. Dynamic Bandwidth Allocation for Quality-of-Service over Ethernet PONs. IEEE Journal on Selected Areas in Communications, 2003, 21(9), 1467-1477.
[35] Chang, CH; Kourtessis, P; Senior, JM. GPON service level agreement based dynamic bandwidth assignment protocol. Electronics Letters, 2006, 42(20), 1173-1174.
[36] Choi, SI; Huh, J. Dynamic bandwidth allocation algorithm for multimedia services over ethernet PONs. ETRI Journal, 2002, 24(6), 465-468.
[37] Sherif, SR; Hadjiantonis, A; Ellinas, G; Assi, C; Ali, M. A novel decentralized Ethernet-Based PON Access Architecture for Provisioning Differentiated QoS. IEEE/OSA Journal of Lightwave Technology, 2004, 22(11), 2483-2497.
[38] Ghani, N; Shami, A; Assi, C; Raja, M. Intra-ONU Bandwidth Scheduling in Ethernet Passive Optical Networks. IEEE Journal on Selected Areas in Communications, 2004, 8(11), 683-685.
[39] Zheng, J. Efficient bandwidth allocation algorithm for Ethernet passive optical networks. IEE Proceedings Communications, 2006, 153(3), 464-468.
[40] Ma, M; Liu, L; Cheng, TH. Adaptive scheduling for differentiated services in an Ethernet Passive Optical Network. IEEE Journal on Selected Areas in Communications, 2005, 4(10), 661-670.
[41] Kramer, G; Mukherjee, B; Pesavento, G. Interleaved Polling with Adaptive Cycle Time (IPACT): A Dynamic Bandwidth Distribution Scheme in an Optical Access Network. Photonic Network Communications, 2002, 4(1), 89-107.
[42] Kramer, G; Mukherjee, B; Ye, Y; Dixit, S; Hirth, R. Supporting differentiated classes of service in Ethernet passive optical networks. Journal of Optical Networking, 2002, 1(8), 280-298.
[43] Kim, H; Park, H; Kang, C; Kim, C; Yoo, G. Sliding Cycle Time-Based MAC protocol for Service Level Agreeable Ethernet Passive Optical Networks. In Proceedings of the IEEE International Conference on Communications (ICC 2005), May 2005, Seoul, Korea, 1848-1852.
[44] Choi, SI. Cycling Polling-Based Dynamic Bandwidth Allocation for Differentiated Classes of Service in Ethernet Passive Optical Networks. Photonic Network Communications, 2004, 7(1), 87-96.
[45] Nowak, D; Perry, P; Murphy, J. A Novel Service Level Agreement Based Algorithm for Differentiated Services Enabled Ethernet PONs. In Proceedings of the IEICE Press, 3rd International Conference on Optical Internet, 2004, vol. 1, 598-599.
[46] Yang, YM; Ahn, B; Nho, J. Supporting quality of service by using delta dynamic bandwidth allocations in Ethernet Passive Optical Network. Journal of Optical Networking, 2005, 4(2), 68-81.
Computer Science Research and Technology, edited by Karl C. Verdinand, Nova Science Publishers, Incorporated, 2010. ProQuest Ebook Central,
Copyright © 2010. Nova Science Publishers, Incorporated. All rights reserved.
42
Noemí Merayo, M. Rubén Lorenzo, Tamara Jiménez et al.
[47] Banerjee, A; Kramer, G; Mukherjee, B. Fair Sharing Using Dual Service-Level Agreements to Achieve Open Access in an Ethernet Passive Optical Network (EPON). IEEE Journal on Selected Areas in Communications, 2006, 24(8), 32-43. [48] Ma, M; Zhu, Y; Cheng, TH. A bandwidth guaranteed polling MAC protocol for ethernet passive optical networks. Proceedings of the Twenty Second Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2003). 2003, vol. 1, 22-31. [49] Merayo, M; Durán, R; Fernández, P; Lorenzo, RM; de Miguel, I; Abril, AJ. Bandwidth allocation algorithm based on automatic weight adaptation to provide client and service differentiation. Photonic Network Communications., 2009, 1(1), 119-128. [50] McGarry, MP; Reisslein, M; Maier, M. WDM Ethernet passive optical networks. IEEE Communications Magazine., 2006, 44(2), 15-22. [51] Mayer, G; Martinelli, M; Pattavina, A; Salvadori, E. Design and cost performance of the multistage WDM PON access networks. IEEE/OSA Journal of Lightwave Technology, 2000, 18(2), 125-143. [52] Bock, C; Prat, J; Walker, S.D. Hybrid Wdm/Tdm PON Using the AWG FSR and Featuring Centralized Light Generation and Dynamic Bandwidth Allocation. IEEE/OSA Journal of Lightwave Technology., 2005, 23(12), 3981-3988. [53] Segarra, J; Sales, V; Prat, J. An All-Optical Access-Metro Interface for Hybrid WDM/TDM PON Based on OBS. IEEE/OSA Journal of Lightwave Technology., 2007, 25(4), 1002-1016. [54] An, F; Kim, K.S; Gutierrez, D; Yam, S; Hu, E; Shrikhande, K; Kazovsky, L.G. Success: A next-generation hybrid WDM/TDM optical access network architecture. IEEE/OSA Journal of Lightwave Technology., 2004, 22(11), 2557-2569. [55] An, F; Kim, KS; Gutierrez, D; Yam, S; Hu, E; Shrikhande, K; Kazovsky, L.G. SUCESS-HPON: a next-generation optical access architecture for smooth migration from TDM-PON to WDM-PON. IEEE Communication Magazine. 2005, 43(11), 40-47. [56] Kim, KS; Gutierrez, D; An, F; Kazovsky, LG. 
Design and performance analysis of scheduling algorithms for WDM-PON under SUCESS-HPON architecture. IEEE/OSA Journal of Lightwave Technology., 2005, 23(11), 3716-3731. [57] Shin, DJ; Jung, DK; Shin, HS; Kwon, JW; Hwang, S; Oh, Y; Shim, C. Hybrid WDM/TDM-PON with wavelength-selection-free transmitters. IEEE/OSA Journal of Lightwave Technology, 2005, 23(1), 187-195. [58] McGarry, MP; Reisslein, M; Maier, M. WDM Ethernet Passive Optical Networks (EPONs). IEEE Communications Magazine., 2006, 44(2), 15-22. [59] McGarry, MP; Reisslein, M. Bandwidth Management for WDM EPONs. Journal of Optical Networking., 2006, 5(9), 637-654. [60] Dhaini, AR; Assi, CM; Maier, M; Shami, A. Dynamic Bandwidth Allocation Schemes in Hybrid TDM/WDM Passive Optical networks. IEEE/OSA Journal of Lightwave Technology., 2007, 25(1), 277-286. [61] Hsueh, YL; Rogge, MS; Yamamoto, S; Kazovsky, LG. A highly flexible and efficient passive optical network employing dynamic wavelength allocation. IEEE/OSA Journal of Lightwave Technology., 2005, 23(1), 277-286. [62] Kwong, KH; Harle, D; Andonovic, I. Dynamic Bandwidth Allocation Algorithm for Differentiated Services over WDM EPONs. In Proceedings of the IEEE International Conference on Communications Systems (ICCS), 2004, 116-120.
Computer Science Research and Technology, edited by Karl C. Verdinand, Nova Science Publishers, Incorporated, 2010. ProQuest Ebook Central,
Copyright © 2010. Nova Science Publishers, Incorporated. All rights reserved.
Medium Access Control Protocols in Passive Optical Networks Based on Ethernet… 43 [63] Dhani, AR; Assi, CM; Shami, A. Quality of service in TDM/WDM Ethernet Passive Optical Networks (EPONs). In Proceedings of the 11th IEEE Symposium on Computers and Communications (ISCC’06). 2006, vol. 1, 621-626. [64] Merayo, N; González, R; de Miguel, I; Jiménez, T; Durán, RJ; Fernández, P; Aguado, JC; Lorenzo, RM; Abril, EJ. Hybrid dynamic bandwidth and wavelenght allocation algorithm to support Multi-Service Level profile in a WDM-EPON. In Proceedings of the 4th International Conference (AccessNet 2009). 2009, 1-13. [65] Shea, D; Mitchell, JE. Long-Reach Optical Access Technologies. IEEE Magazine., 2007, 21(5), 5-11. [66] ACTS Project AC050. (2000). Photonic Local Access Networks (PLANET). Available at: ftp://ftp.cordis.europa.eu/pub/infowin/docs/fr-050.pdf. [67] Voorde, I; Martin, C; Vandewege, J; Qiu, X. The SuperPON demonstrator: an exploration of possible evolution paths for optical access networks. IEEE Communications Magazine., 2000, 38(2), 74-82. [68] Shea, D; Mitchell, J. A 10 Gbit/s 1024-way-split 100 km Long-Reach Optical Access Network. IEEE/OSA Journal of Lightwave Technology. 2007, 25(3), 685-693. [69] Shea, D; Ellis, A; Payne, D; Davey, R; Mitchell, J. 10 Gbit/s PON with 100 km reach and 1x1024 split. In Proceedings of the 29th European Conference and Exhibition on Optical Communications (ECOC 2003), 2003, Rimini, Italy, 850-851. [70] Song, H; Banerjee, A; Kim, B.-W; Mukherjee, B. Multi-Thread Polling: A Dynamic Bandwidth Distribution Scheme in Long-Reach PON. IEEE Journal on Selected Areas in Communications., 2009, 27(2), 134-142. [71] Santos, J; Pedro, J; Monteiro, P; Pires, J. Self-protected Long-Reach 10 Gbit/s EPONs based on a ring architecture. Journal of Optical Networking., 2008, 7(5), 467-486. [72] An, FT; Kim, K; Gutierrez, D; Yam, S; Hu, E; Shrikhande K; Kazovsky, L. SUCCESS: a next-generation hybrid WDM/TDM optical access network architecture. 
IEEE/OSA Journal of Lightwave Technology., 2004, 22(11), 2557-2569. [73] Talli, G; Townsend, PD. Hybrid DWDM-TDM Long-Reach PON for Next-Generation Optical Access. IEEE/OSA Journal of Lightwave Technology., 2006, 24(7), 2827-2834. [74] Chang, CH; Merayo, N; Kourtessis, P; Lorenzo, RM; Senior, JM. Full Service MAC Protocol of Metro-Reach GPONs. IEEE/OSA Journal of Lightwave Technology. 2010, (Accepted for Publication). [75] Merayo, N; Jimenez, T; Durán, R; Fernández, P; Lorenzo, RM; de Miguel, I; Abril, EJ. Adaptive polling algorithm to provide subscriber and service differentiation in a LongReach EPON. Photonic Network Communications. 2010, (Accepted for publication) DOI 10.1007/s11107-009-0230-x. [76] Shea, D; Mitchell, J. Operating Penalties in Single-Fiber Operation 10 Gb/s, 1024-way split, 110-km Long-Reach Optical Access Networks. IEEE Photonics Technology Letters., 2006, 18(25), 2463-2465. [77] MacHale, E; Talli, G; Townsend, P. 10 Gbit/s Bidirectional Transmission in a 116 km reach Hybrid DWDM-TDMPON. In Proceedings of the 2nd International Conference on Access Technologies. 2006, 37-40, Cambridge, United Kingdom. [78] Kjaer, R; Monroy, I; Oxenloewe, L; Jeppesen, P; Palsdottir, B. Bi-directional 120 km Long-Reach PON Link Based on Distributed Raman Amplification. In Proceedings of the IEEE Lasers and Electro Optics Society Annual Meeting (LEOS’O6). 2006, Montreal, Canada (Paper WEE3).
Computer Science Research and Technology, edited by Karl C. Verdinand, Nova Science Publishers, Incorporated, 2010. ProQuest Ebook Central,
44
Noemí Merayo, M. Rubén Lorenzo, Tamara Jiménez et al.
Copyright © 2010. Nova Science Publishers, Incorporated. All rights reserved.
[79] Davey, R; Healey, P; Hope, I; Watkinson, P; Payne, D; Marmur, O; Ruhmann, J; Zuiderveld, Y. DWDM reach extension of a GPON to 135 km. IEEE/OSA Journal of Lightwave Technology., 2006, 24(1), 29-31. [80] Shea, D; Mitchell, J. Experimental Upstream demonstration of a Long-Reach Wavelength-Converting PON with DWDM Backhaul. In Proceedings of the Optical Fiber Communication and the National Fiber Optic Engineers Conference (OFC/NFOEC), 2007, 1-3, Anaheim, United States. [81] Jiménez, T; Merayo, N; Durán, R.J; Fernández, P; Lorenzo, RM; de Miguel, I; Ramírez, M; Abril, EJ. Polling algorithm with adaptive cycle to enhance efficiency in QoS Long-Reach EPONs. In Proceedings of the 14th European Conference on Networks and Optical Communicatios, 2009, 519-527, Valladolid, Spain.
Computer Science Research and Technology, edited by Karl C. Verdinand, Nova Science Publishers, Incorporated, 2010. ProQuest Ebook Central,
In: Computer Science Research and Technology Editor: Karl C. Verdinand, pp. 45-70
ISBN: 978-1-61728-688-9 © 2010 Nova Science Publishers, Inc.
Chapter 2
A NON-ROUTINE PROBLEM SOLVING MECHANISM FOR A COMPREHENSIVE COGNITIVE AGENT ARCHITECTURE
S. Aregahgen Negatu, Stan Franklin and Lee McCauley
Department of Computer Science and the Institute for Intelligent Systems, The University of Memphis, Memphis, Tennessee, USA
Abstract
One aspect of human intelligence is its ability to achieve goals by devising unexpected and even creative solutions to problems that have never before been encountered. This ability to explore and construct solutions to non-routine problems is central to the development of our sciences and technologies. Replicating Non-Routine Problem Solving (NRPS) capability in an agent architecture would allow intelligent software agents and/or cognitive robots to deal more intelligently with highly complex and dynamically changing environments, and to cope with situations unforeseen by their designers. This chapter describes an NRPS mechanism in the cognitive architecture framework called Intelligent Distribution Agent (IDA) and its learning incarnation LIDA (Learning IDA). LIDA is a hybrid architecture that integrates different mechanisms for modules and their processes including perception, emotion, memories (sensory, perceptual, episodic, procedural, working), action selection, expectation, learning, deliberation, problem solving, metacognition, and selective attention (or functional consciousness). Relevant to the NRPS mechanism, we will briefly discuss LIDA’s perception, selective attention, expectation, procedural memory, and action selection mechanisms. Our general approach is that an NRPS mechanism should involve recruiting and activating all the available knowledge pieces and processes, so that a search for a novel solution takes place over the entire solution space of the agent. LIDA’s procedural memory stores, besides the lower-level entities that we call behaviors, high-level procedural constructs called behavior streams or goal hierarchies (hierarchical partially-ordered action plans). If a behavior stream becomes relevant based on the content of selective attention, a copy of it gets instantiated with its variables bound, and becomes part of the dynamics in the action selection module. While active in the action selection system it competes for control of the agent’s behavior over multiple cognitive cycles, and executes an associated task. LIDA’s cognitive cycle is an iterative, continually active process that brings about the interplay among the various components of the architecture, resulting in an action being selected and executed. In particular, we view a non-routine problem solving mechanism as a special goal hierarchy whose task it is to generate a new behavior stream that handles a novel situation that could not be handled by available routine solutions. Non-routine problem solving in LIDA is a deliberative process over multiple cognitive cycles. We will describe the details of the interaction of the components in the architecture, and the control of the special non-routine problem solving goal hierarchy system over the solution search process.
1. Introduction
Advances in intelligent machine problem solving can come from understanding and modeling cognitive functions and their integration in humans. An autonomous agent (Franklin & Graesser, 1997), whether it is human, animal or artificial, must continually select its next action and, to do so, it must frequently sense its environment and select the appropriate response that satisfies or leads towards satisfying one or more of its primary motivators (drives or feelings). As agents negotiate environments, they encounter a multitude of challenges or problems. Looking at it differently, an autonomous agent engages in a continuous process of solving problems. A problem situation is defined by the agent’s current environmental and internal states, along with its current goals that could satisfy at least one of its agenda items or motivators. Finding a solution (producing the appropriate action in the given situation) is a deliberative process that could involve multiple mental modules. The level of familiarity with the problem situation determines whether a problem is routine or non-routine, and the amount of cognitive load in the deliberation. Routine problem solving is a process of applying an established procedure as a solution to the problem, given that there is past experience of using this solution to the problem. Non-routine problem solving is a process of devising a new procedure that solves a problem in a way that it has not been solved previously. The latter is our main focus in this chapter.
We humans, as agents with high cognitive capacity, have the ability to devise unexpected, and often clever, solutions to problems we’ve never before encountered. Sometimes they are even creative. This ability to solve non-routine problems has played a central role in the development of our sciences and technologies.
It would be useful to replicate this ability in software agents and robots, both for its practical value, and for the light it would shed on human problem solving. On the practical side, agent architectures capable of non-routine problem solving would allow software agents and/or robots to deal more intelligently with highly complex and dynamically changing environments, and to cope with situations unforeseen by their designers. Applications might include unmanned vehicles for exploratory or military uses, as well as industrial control systems. The tasks of many human information agents could be automated (Franklin, 2001). On the science side, each design decision taken in pursuit of such an agent architecture translates immediately into a hypothesis, hopefully testable, for cognitive scientists and neuroscientists (Franklin, 1997, 2000b; Franklin & Graesser, 2001; Baars & Franklin, 2003; Franklin, Baars, Ramamurthy, & Ventura, 2005).
Baars (1988, 1997, 2002) in his Global Workspace Theory (GWT) presented strong psychological and neuroscience evidence that shows the important role of consciousness in integrating the many specialized cognitive processes so that humans can deal with novelty in general and with non-routine problem solving in particular. Mithen (1996, 1999) also affirms, in presenting his notion of cognitive fluidity in the evolution of mind, that modern humans’ highest form of mental activity takes place as a result of cognitive fluidity (consciousness): the working together or integration of the various mental modules and a fluid flow of knowledge and ideas among them. Consciousness’s role in fluid flow, or broadcast of information, enables the modules to influence each other (cooperation and competition), and this in turn results in a boundless capacity for learning, imagination, creativity, and problem solving. Mithen explicitly mentions the need to evolve a conscious mind in order to handle non-routine problem solving such as tool making (e.g. the handaxe), in which unexpected contingencies arise and plans need to be continually modified. Many others also argue that human intelligence would not be where it is now without the evolution of consciousness to handle novelty (Baars, 1988; Arp, 2007; Bogdan, 1994; Cosmides & Tooby, 1992; Gardner, 1993; Humphrey, 1992; Pinker, 1997). Searle (1992) points out that our much greater flexibility, sensitivity, and creativity are derived from the evolutionary advantages of attaining high-level consciousness. Crick (1994) also asserts that without consciousness, you can deal only with familiar, rather routine situations or respond only to very limited information in novel situations (p. 20).
Thus there is ample reason to pursue agent architectures with a capability for non-routine problem solving. But can such pursuits succeed? We argue here that there is a good chance that they can. The major source of our optimism is the software agent technology of the Learning Intelligent Distribution Agent (LIDA), an autonomous agent that aspires to model several facets of human (and animal) cognition. LIDA is the partially conceptual, learning extension of the original IDA system, which was implemented computationally as a software agent (D’Mello et al., 2006; Franklin, Kelemen & McCauley, 1998). The original IDA system was designed as an autonomous agent, and performed personnel work for the US Navy in a human-like fashion (Franklin, 2001).
Although the design of IDA was inspired by several theories of human and animal cognition, it did not learn. The LIDA system adds three fundamental forms of learning to IDA: perceptual, procedural, and episodic learning.
Over the years, several AI researchers and cognitive scientists have developed cognitive models designed around some unified theory of cognition (Newell, 1990). Some of these well known models include SOAR (Laird, Newell, & Rosenbloom, 1987), ACT-R (Lebiere & Anderson, 1993), and Clarion (Sun, 1997). Most of these are based on some extension of Post production systems (Post, 1943). In contrast, the LIDA model is a comprehensive, conceptual and computational model covering a large portion of human cognition. The model is primarily based on Baars’ Global Workspace Theory (GWT) (1988) and it implements and fleshes out a number of mostly psychological and neuropsychological theories of cognition. These include: situated or embodied cognition (Varela, Thompson, & Rosch, 1991; Glenberg, 1997; Barsalou, 2008; de Vega, Glenberg & Graesser, 2008), Barsalou’s theory of perceptual symbol systems (1999), working memory (Baddeley & Hitch, 1974; 2007), Glenberg’s theory of the importance of affordances to understanding (1997), and Sloman’s architecture for a human-like agent (1999). There are architectures that aspire to model consciousness, and a comparison of LIDA with some of them can be found elsewhere (Franklin et al., 2007; Sun & Franklin, 2007).
To implement the LIDA cognitive model, its computational architecture employs several modules that are designed using computational mechanisms drawn from the “new AI.” These include variants of the Copycat Architecture (Hofstadter and Mitchell 1995, Marshall 2002), Sparse Distributed Memory (Kanerva, 1988; 2009; Rao and Olac, 1998), the Schema Mechanism (Drescher 1991, Chaput et al. 2003), the Behavior Net (Maes 1989), and the Subsumption Architecture (Brooks 1991).
We argue that basing non-routine problem solving and other components of intelligent behavior on functional aspects of human cognition may serve dual purposes by yielding both engineering and scientific gains. We expect engineering improvements because we are basing our computational mechanism on the best known example of intelligence, that is, humans. Scientific gains can be achieved by using computer systems to test and perhaps augment psychological and/or neuroscience theories of non-routine problem solving (NRPS) and other mental decision making processes. One potential pitfall of relying on psychological theories is that they typically model only small pieces of cognition. In contrast, by its very nature the control system of any autonomous agent or cognitive robot must be comprehensive and fully integrated. That is, it must choose its actions based on real world sensation and perception, along with incoming endogenous stimuli, utilizing all needed internal processes. Once again, the use of the LIDA system, as an integrative comprehensive intelligent architecture, helps to overcome this problem, which is a major contribution towards explaining a plausible cognitive approach to non-routine problem solving.
As the focus of this chapter, we will discuss a non-routine problem solving mechanism in the LIDA agent framework, which incorporates much of what seems to be necessary for non-routine problem solving, including functional consciousness (Franklin, 2003). In pursuing such a non-routine problem solving architecture, we’ll need help from many sources. Our basic approach is based on what psychologist Arthur Glenberg refers to as meshing, a process of finding unexpected and sometimes clever solutions to non-routine problems (as cited in Glenberg & Robertson, 2000). Meshing is typically accomplished in humans by putting together bits and pieces of knowledge and techniques that have been stored and, perhaps, used in the past to help solve other problems.
Finding a solution often begins with identifying these bits and pieces, then typically continues with finding ways to mesh them. We will use these observations as a guideline. Consciousness provides the means to extensively mobilize the unconscious bits and pieces of knowledge that could contribute towards constructing a solution.
This chapter is organized as follows. In the next section, we will discuss LIDA, our agent architecture, and its relevant components to support non-routine problem solving as well as its cognitive processing cycle. In section three, we will present issues that need to be addressed in non-routine problem solving. In particular, we will discuss an approach for detection of non-routine problem situations. In section four we describe the main contribution of this chapter, a mechanism for non-routine problem solving in our cognitive agent architecture, LIDA. We conclude the chapter with a brief discussion of related work and a summary of this chapter.
2. Architectural Support for Non-routine Problem Solving
The LIDA architecture is hybrid (partly symbolic and partly connectionist) with all symbols being grounded in the physical world in the sense of Brooks (1986). The fundamental computational mechanism of the LIDA system is the codelet (Hofstadter & Mitchell, 1994), a small piece of code executing as an independent thread that is specialized for some relatively simple task. There are many types of specialized codelets; for instance, some are specialized to pay attention to a particular situation (attention codelets), others are specialized to execute low-level procedural actions (behavior codelets) and so on. The LIDA architecture integrates different mechanisms for several cognitive modules and their processes including perception, emotion, memories (sensory, perceptual, episodic, procedural, working), action selection, expectation, learning, deliberation, problem solving, metacognition, and selective attention (or functional consciousness). Relevant to the NRPS mechanism, we will briefly discuss LIDA’s perceptual associative memory, workspace, selective attention, expectation, procedural memory, and action selection mechanisms.
2.1. Perceptual Associative Memory
The perceptual knowledge-base takes the form of a semantic net with activation (called the slipnet) motivated by Hofstadter and Mitchell’s Copycat architecture (1994). Nodes of the slipnet constitute the agent’s perceptual symbols (Barsalou, 1999), representing individuals, categories, feelings, actions, and events. The perceptual symbols are grounded in the real world by their ultimate connections to various primitive feature detectors having their receptive fields among the sensory receptors or other parts of sensory memory. An incoming stimulus, say a visual image, is descended upon by a horde of feature detectors. Feature detectors respond to specific features from the various sensory streams and perform perceptual tasks such as recognition and identification. Each of these feature detectors is looking for some particular feature (a certain color, an edge at a particular angle, etc.) or more complex features (a T junction, a red line). Upon finding a feature of interest to it, the feature detector’s activation will increase. Activation is then passed from node to node through the slipnet. Nodes with activations over threshold, along with their links, are taken to provide the constructed meaning of the stimulus, the percept (see Figure 2).
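The path from feature detection to percept can be illustrated with a small sketch. This is not the LIDA implementation: the class name `Slipnet`, the `spread` factor, the threshold value and the example nodes are all assumptions chosen for illustration, and the percept here keeps only nodes, whereas the text also includes their links.

```python
from collections import defaultdict


class Slipnet:
    """Toy semantic net with activation passing (all names are hypothetical)."""

    def __init__(self, threshold=0.5, spread=0.8):
        self.threshold = threshold      # assumed percept threshold
        self.spread = spread            # assumed fraction of activation passed per link
        self.links = defaultdict(list)  # node -> downstream nodes
        self.activation = defaultdict(float)

    def link(self, source, target):
        self.links[source].append(target)

    def detect(self, feature, strength):
        """A feature detector found its feature: raise that node's activation."""
        self.activation[feature] += strength

    def pass_activation(self, rounds=1):
        """Pass a fraction of each node's activation along its outgoing links."""
        for _ in range(rounds):
            incoming = defaultdict(float)
            for node, targets in self.links.items():
                for target in targets:
                    incoming[target] += self.spread * self.activation[node]
            for node, extra in incoming.items():
                self.activation[node] = min(1.0, self.activation[node] + extra)

    def percept(self):
        """Nodes whose activation is over threshold."""
        return {n for n, a in self.activation.items() if a > self.threshold}


net = Slipnet(threshold=0.5, spread=0.8)
net.link("red", "red line")
net.link("line", "red line")
net.detect("red", 0.6)    # strong evidence for the color
net.detect("line", 0.3)   # weak evidence for the edge
net.pass_activation(rounds=1)
percept = net.percept()
```

In this toy run the weakly detected "line" node stays below threshold on its own, while the more complex "red line" node crosses it because it collects activation from both of its features.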
2.2. Workspace
LIDA’s workspace is analogous to the preconscious buffers of human working memory. The percept is written to the workspace. Attention codelets watch what is written in the workspace in order to react to it. Items in the workspace decay over time, and may be overwritten. Another pivotal role of the workspace is the building of temporary structures over multiple cognitive cycles (see below). Perceptual symbols from the slipnet are assimilated into existing relational and situational templates while preserving spatial and temporal relations between the symbols. The structures in the workspace also decay rapidly.
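The decay and overwrite behavior of workspace items might be sketched as follows. The chapter does not specify a decay law, so the multiplicative `decay` factor, the removal `floor`, and the `Workspace` class itself are assumptions made purely for illustration.

```python
class Workspace:
    """Toy preconscious buffer whose items fade each cycle (hypothetical design)."""

    def __init__(self, decay=0.5, floor=0.1):
        self.decay = decay   # assumed multiplicative decay per cognitive cycle
        self.floor = floor   # assumed strength below which an item is dropped
        self.items = {}      # item name -> current strength

    def write(self, name, strength=1.0):
        """Write an item; writing an existing name overwrites it."""
        self.items[name] = strength

    def tick(self):
        """Advance one cycle: decay every item and drop those that faded out."""
        decayed = {name: s * self.decay for name, s in self.items.items()}
        self.items = {name: s for name, s in decayed.items() if s >= self.floor}


ws = Workspace(decay=0.5, floor=0.1)
ws.write("cup")
for _ in range(3):
    ws.tick()
still_present = "cup" in ws.items   # strength is 0.125 after three cycles
ws.tick()
gone = "cup" not in ws.items        # 0.0625 falls below the floor
```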
2.3. Selective Attention
Selective attention (functional consciousness) in LIDA is an implementation of Global Workspace Theory (Baars, 1988) with hosts of attention codelets, each playing the role of a daemon, watching for an appropriate condition under which to act. Each attention codelet watches for some particular situation that might call for selective attention (e.g. novelty, changes, etc.). Upon encountering such a situation, the attention codelet is associated with a few nodes (from the slipnet) carrying a description of the situation. A coalition of codelets (collection of related codelets) is thus formed. During any given cycle the coalition with the highest average activation is considered relevant and broadcasts its information to every other codelet or processor (Baars, 1988). This broadcast is used to recruit schemes (see below) and perform various types of learning (D’Mello et al., 2006; Franklin & Patterson, 2006).
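The competition and broadcast described above can be pictured with a minimal sketch. The `Coalition` class, the `compete` and `broadcast` helpers, and the representation of listeners as relevance predicates are all hypothetical; only the "highest average activation wins" rule comes from the text.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Coalition:
    description: str    # what the coalition's nodes say about the situation
    activations: list   # activation of each member codelet

    @property
    def average_activation(self):
        return mean(self.activations)


def compete(coalitions):
    """Return the coalition with the highest average activation."""
    return max(coalitions, key=lambda c: c.average_activation)


def broadcast(winner, listeners):
    """Send the winner's content to every listener; return those that respond."""
    return [name for name, is_relevant in listeners.items()
            if is_relevant(winner.description)]


coalitions = [
    Coalition("novel object on table", [0.9, 0.7]),  # average 0.8
    Coalition("door opened", [0.95, 0.2]),           # average 0.575
]
# Hypothetical schemes that listen for content relevant to them.
listeners = {
    "grasp scheme": lambda content: "object" in content,
    "exit scheme": lambda content: "door" in content,
}
winner = compete(coalitions)
recruited = broadcast(winner, listeners)
```

Note that the second coalition contains the single most active codelet yet still loses, because the comparison is on the average over the whole coalition.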
2.4. Procedural Memory
Procedural memory in LIDA is a modified and simplified form of Drescher’s schema mechanism (1991), the scheme net. The scheme net is a directed graph whose nodes are (action) schemes and whose links represent the ‘derived from’ relation. A scheme consists of an action, together with its context and its result. Built-in primitive (empty) schemes contain an empty context and an empty result. The context and results of the schemes are represented by perceptual symbols (Barsalou, 1999) for objects, categories, and relations in perceptual associative memory. In order for a scheme to act, it first needs to be instantiated and then selected for execution in accordance with the action selection mechanism (discussed next). The action of an instantiated scheme consists of one or more behavior codelets that execute the actions in parallel, or a behavior stream (partially ordered and/or plan of behaviors) (see Figure 1).
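A scheme and its instantiation could be represented along the following lines. This is a sketch under assumptions: the `Scheme` dataclass, the `?x` variable notation and the string-substitution binding are illustrative choices, not the data structures of the scheme net.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    context: frozenset   # conditions under which the action applies
    action: str
    result: frozenset    # conditions expected to hold after the action


def instantiate(scheme, bindings):
    """Return a copy of the scheme with its variables bound (e.g. '?x' -> 'cup')."""
    def bind(text):
        for variable, value in bindings.items():
            text = text.replace(variable, value)
        return text

    return Scheme(
        context=frozenset(bind(c) for c in scheme.context),
        action=bind(scheme.action),
        result=frozenset(bind(r) for r in scheme.result),
    )


# A primitive (empty) scheme has an empty context and an empty result.
primitive = Scheme(context=frozenset(), action="move_arm", result=frozenset())

# A scheme template with one unbound variable.
template = Scheme(
    context=frozenset({"near(?x)"}),
    action="grasp(?x)",
    result=frozenset({"holding(?x)"}),
)
instance = instantiate(template, {"?x": "cup"})
```

Keeping the template immutable and returning a bound copy mirrors the idea that the stored scheme stays in procedural memory while only its instance takes part in action selection.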
2.5. Action Selection
The LIDA architecture employs an enhancement of Maes’ behavior net (1989) for high-level action selection in the service of drives or feelings (including emotions) acting as primary and internal motivators (Cañamero, 1997; Franklin & McCauley, 2004; Franklin & Ramamurthy, 2006). The behavior net is a digraph (directed graph) composed of behavior codelets (a single action), behaviors (multiple behavior codelets operating in parallel), and behavior streams (multiple behaviors operating in some partial order) and their various links. These three entities all share the same representation in procedural memory (i.e., a scheme). As in connectionist models, this digraph spreads activation. The activation comes from three sources: from pre-existing activation stored in the behaviors, from the environment, and from motivators (feelings). To be acted upon, a behavior must be executable (preconditions satisfied), must have activation over threshold, and must have the highest such activation.
LIDA’s action selection mechanism incorporates five major enhancements over Maes’ behavior net: (i) Variables: while Maes’ behavior net operates on the basis of boolean propositions only, LIDA’s mechanism supports variables that get bound during the instantiation of procedural schemes; (ii) Restricted search space: during the action selection phase Maes’ mechanism performs a global search over all the available competency modules, while the enhanced behavior net restricts its search to relevant (instantiated) goal hierarchies, which are a subset of the available competencies; (iii) Failure handling: Maes’ mechanism assumes that the result of a selected action is deterministic in that every action produces its expected outcome. Therefore, this mechanism is unable to handle execution failures, which frequently occur in any real system. On the other hand, LIDA’s enhanced behavior net is endowed with a degree of fault tolerance via its expectation mechanism; (iv) Priority control: Maes’ mechanism modulates the priorities of competing goals by building static causal links among competence modules, while LIDA’s mechanism provides parametric control to dynamically change goal priorities at run time; (v) Planning and subgoaling: Maes’ mechanism does not support classic AI planning and subgoaling but LIDA’s mechanism, as a collection of goal structures, supports both (see Negatu, 2010; Negatu & Franklin, 2002).
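The three conditions for a behavior to be acted upon (executable, over threshold, highest activation) can be written down as a short sketch. It covers only that selection rule, not activation spreading or the five enhancements; the `Behavior` class, `select_behavior` and the example values are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Behavior:
    name: str
    preconditions: set   # propositions that must hold for the behavior to be executable
    activation: float


def select_behavior(behaviors, state, threshold):
    """Pick the executable, over-threshold behavior with the highest activation.

    Returns None when no behavior qualifies in this cycle.
    """
    candidates = [
        b for b in behaviors
        if b.preconditions <= state      # executable: all preconditions satisfied
        and b.activation > threshold     # activation over threshold
    ]
    return max(candidates, key=lambda b: b.activation, default=None)


behaviors = [
    Behavior("grasp", {"near(cup)"}, 0.7),
    Behavior("drink", {"holding(cup)"}, 0.9),   # most active, but not executable yet
    Behavior("look_around", set(), 0.4),        # executable, but under threshold
]
chosen = select_behavior(behaviors, state={"near(cup)"}, threshold=0.5)
nothing = select_behavior(behaviors, state=set(), threshold=0.5)
```

Returning None rather than the best under-threshold behavior reflects the possibility that no action is selected in a given cycle, leaving activation to keep accumulating.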
Figure 1. An example of a goal context hierarchy with multiple behavior streams working together. Stream 1 handles its problem only after stream 2 satisfies its subgoal (at B5) and another stream binds and satisfies the unbound subgoal (at B3).
Expectation Codelet
In the LIDA architecture, high-level constructs such as behaviors are underlain by specialized processors called codelets. In particular, one or more behavior codelets (executing in parallel) and at least one expectation codelet underlie each behavior. One of the functions of behavior codelets is to realize a low-level primitive action that is equivalent to a neuronal group controlling primitive motor activity. Expectation codelets, as anticipatory behavior controllers, play an important role in the action selection system. Their controlling functions are monitoring behavioral effects, evaluating outcomes, and reporting possible failures resulting from the execution of a behavior.
Expectation codelets always attempt to provide feedback based on a comparison of the actual internal sensory consequences (proprioceptive, tactile, etc.) and the sensed environmental effects of a behavioral action with the desired effects. The feedback compilation process may take one or more cognitive cycles (see below). When a behavioral action fails, an expectation codelet tries to bring the detected failure to consciousness. The function of watching the events associated with the action of its behavior makes the expectation codelet a special case of an attention codelet. Expectation codelets have a role in automatization (Negatu, 2009), non-routine problem solving, and procedural learning (D'Mello et al., 2006). Selected behaviors, via the expectation mechanism, also produce effects akin to the preafference process in the brain (Freeman, 1995) and to preparatory attention (LaBerge, 1995).
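The comparison of desired and actual effects can be sketched as a set difference over propositions. This is only an illustration under simplifying assumptions: effects are treated as plain strings, feedback is compiled in a single step rather than over several cognitive cycles, and `FailureReport` and `check_expectation` are hypothetical names, not part of LIDA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureReport:
    unfulfilled: frozenset  # expected effects that were not observed
    fulfilled: frozenset    # expected effects that were observed


def check_expectation(expected_effects, observed_state):
    """Compare desired effects with what was sensed; return a report on failure, else None."""
    expected = frozenset(expected_effects)
    observed = frozenset(observed_state)
    unfulfilled = expected - observed
    if not unfulfilled:
        return None  # the behavior produced everything that was expected
    return FailureReport(unfulfilled=unfulfilled, fulfilled=expected & observed)


# Hypothetical outcome: the key went into the lock, but the door did not open.
report = check_expectation({"door_open", "key_in_lock"}, {"key_in_lock", "standing"})
print(report)
```

A non-`None` report is what the expectation codelet would then try to bring to consciousness.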
Goal Hierarchy and Subgoaling
Subgoaling and the facilitation of goal hierarchies in the behavior streams are particularly important to the non-routine problem solving mechanism that we address in this chapter. According to global workspace theory, a goal context is a non-qualitative mental representation of one's own future actions that can dominate central limited capacity, or consciousness; it constrains conscious goal images without itself being conscious. Goal contexts perform the following major functions.
(1) They help to evoke, shape and produce actions.
(2) They can be part of hierarchical structures that allow the agent to deal with tasks that extend over more than one conscious event (cognitive cycle; see below).
(3) They represent future states of the agent, serving to constrain the processes that can help reach those future states.
(4) They can be part of hierarchical structures that can constrain a stream of "consciousness."
In this section we discuss the high-level implementation of IDA's action selection module that comprises its goal context hierarchy. Figure 1 illustrates an example of such a goal context hierarchy. Drives or feelings (including emotions), as primary motivation processes, are at the top of the hierarchy, where behavior streams serve one or more of these motivators directly or indirectly. Behavior stream 1 satisfies the motivator directly, while stream 2 satisfies it indirectly through behavior stream 1. Such structures are not explicitly programmed; they are constructed at run time in an instantiated behavior network system. The figure shows a goal node, which is a special case of a behavior node without an explicit action. Goal nodes simply state or infer when a particular state of affairs is desired and/or achieved. The organization of smaller behavior streams into larger goal hierarchies (Figure 1) provides an effective subgoaling mechanism for our action selection system.
The subgoaling mechanism brings a regressive reasoning capability to the decision making or action selection process. Small behavior streams, which are well-defined partial-order plans, are solutions to simple routine tasks. Such small behavior streams can be brought together to handle larger routine and non-routine problems. Referring to figure 1, a deliberative process with backward reasoning can recruit the appropriate behavior stream that satisfies and binds the unbound subgoal of behavior stream 1 at its behavior B3. The subgoaling process is the one that plays the role of 'bringing together' (that is, the construction of the goal hierarchy). In this way, subgoaling allows goal-directed problem solving.
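The backward recruitment of behavior streams for unbound subgoals can be pictured with a small recursive sketch. It assumes, purely for illustration, that each stream advertises one goal it achieves and a list of subgoals it leaves open; `Stream`, `recruit`, and the stream and goal names are hypothetical, and neither activation passing nor variable binding is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    name: str
    achieves: str               # goal the stream satisfies
    open_subgoals: tuple = ()   # subgoals it cannot satisfy by itself


def recruit(goal, library, depth=0, max_depth=5):
    """Backward-chain from a goal; return a nested (stream, children) hierarchy or None."""
    if depth > max_depth:
        return None  # guard against circular subgoal dependencies
    for stream in library:
        if stream.achieves != goal:
            continue
        children = [recruit(g, library, depth + 1, max_depth)
                    for g in stream.open_subgoals]
        if all(child is not None for child in children):
            return (stream.name, children)
    return None


# Hypothetical library of small routine streams.
library = [
    Stream("stream_1", achieves="drive_satisfied", open_subgoals=("door_open",)),
    Stream("stream_2", achieves="door_open", open_subgoals=("has_key",)),
    Stream("stream_3", achieves="has_key"),
]
print(recruit("drive_satisfied", library))
# -> ('stream_1', [('stream_2', [('stream_3', [])])])
```

The nested result mirrors the shape of a goal hierarchy: the top stream serves the motivator directly, and the others serve it indirectly through their parents.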
2.6. Cognitive Cycle
An autonomous agent (Franklin & Graesser, 1997) copes with its changing environment through its continuous, cyclic chore of "sense-select-act." LIDA's cognitive cycle (Franklin et al., 2005) is the cycle of refined cognitive steps (starting after sensation and ending with action) that bring about an appropriate selection for a specific situation. As Franklin and Baars (2009) put it, "A cognitive cycle can be thought of as a moment of cognition - a cognitive moment; higher-level cognitive processes are composed of many of these cognitive cycles, each a cognitive 'atom'." In this metaphor, the steps in a cognitive cycle correspond to the various atomic or sub-atomic particles in an atom. There is evidence to suggest that in humans five to ten cognitive cycles happen per second. Since the LIDA architecture is composed of several specialized mechanisms, a continual process that causes the functional interaction among the various components is essential. The cognitive cycle is such an iterative, cyclical, continually active process; it brings about the interplay among the various components of the architecture. The nine steps of the cognitive cycle are shown in figure 2 and described below.
Figure 2. LIDA's cognitive cycle.
1. Perception. Sensory stimuli, external or internal, are received and interpreted by perception, producing the beginnings of meaning.
2. Percept to preconscious buffers. The percept, including some of the data plus the meaning, as well as possible relational structures, is stored in the preconscious buffers of LIDA's working memory (workspace). Temporary structures are built.
3. Local associations. Using the incoming percept and the residual contents of working memory, including emotional content, as cues, local associations are automatically
retrieved from transient episodic memory and from declarative memory, and stored in long-term working memory.
4. Competition for consciousness. Attention codelets view long-term working memory, and bring novel, relevant, urgent, or insistent events to consciousness.
5. Conscious broadcast. A coalition of codelets, typically an attention codelet and its covey of collected information content, gains access to the global workspace and has its contents broadcast. In humans, this broadcast is hypothesized to correspond to phenomenal consciousness.
6. Recruitment of resources. Relevant schemes in Procedural Memory respond to the conscious broadcast. These are typically schemes (underlain by behavior codelets) whose context is relevant to information in the conscious broadcast. Thus consciousness solves the relevancy problem in recruiting resources.
7. Setting goal context hierarchy. The recruited schemes use the contents of consciousness, including feelings/emotions, to instantiate new goal context hierarchies (copies of themselves) into the behavior net (action selection system), bind their variables, and increase their activation. Other, environmental, conditions determine which of the earlier behaviors (goal contexts) also receive variable binding and/or additional activation.
8. Action selected. The behavior net selects a single behavior (scheme, goal context), from a just-instantiated behavior stream or possibly from a previously active stream. Each selection of a behavior includes the generation of an expectation codelet (see the next step).
9. Action taken. The execution of a behavior (goal context) results in the behavior codelets performing their specialized tasks, having external or internal consequences, or both. LIDA is taking an action. The acting codelets also include at least one expectation codelet whose task is to monitor the action, bringing to consciousness any failure in the expected results.
As shown in figure 2, multiple learning mechanisms are initiated following the broadcast of conscious content. In Perceptual Associative Memory, new entities and associations are learned and old ones are reinforced; events are encoded in Transient Episodic Memory; and in Procedural Memory new schemes may be learned and old schemes reinforced. In all of these learning processes, the conscious content determines what is learned.
Several high-level tasks and cognitive processes operate over multiple cognitive cycles. In particular, deliberative processes consist of the continuous interplay and mutual influence of behavior streams (goal context hierarchies) and selective attention (consciousness). Non-routine problem solving in LIDA is such a deliberative process, spanning multiple cognitive cycles. We will describe the details of the interaction of the components in the
architecture, and the control of the special non-routine problem solving goal hierarchy system over the solution search process.
3. Approach for Non-routine Problem Solving
In this section we restate what the problem is and how we approach the modeling and design of a non-routine problem solving module in the LIDA architectural framework.
3.1. Routine and Non-Routine Problems
Routine problems are the ones that have readily available applicable solutions. Non-routine problems are novel problem situations that arise when no routine solution is available to handle them. More specifically, non-routine problems cannot be categorized as any of the routine classes of problems, and thus none of the available behavior streams is applicable.
In general, a problem solving process may include the capabilities of noting differences, similarities and, to some extent, analogies between a non-routine problem and well-established routine problems. The resulting information is used to construct new solutions by modifying existing solutions. A mechanism may need to break down complex structures of procedural knowledge into lower-level components, and then use the available knowledge to construct new behavior streams as solutions. The breakdown of structures can continue down to the simplest base units, or collections of them, which Edelman (1987) called primary repertoires. Like traditional AI planning, problem solving, whether routine or non-routine, is always specified by a goal state and an initial state.
There are two main issues to be considered in handling a non-routine problem situation. The first involves the recognition of a novel situation and of the relevant information contained in the pieces of percepts. While this is, indeed, an important and challenging issue, it is not the sense of non-routine problem solving that we are addressing here. We consider this first issue one of perceptual learning, rather than non-routine problem solving. The second issue is to determine how to respond to the situation in an appropriate manner. This dynamic adaptation of procedural knowledge is the sense of non-routine problem solving that we are concerned with here.
3.2. Detecting Non-routine Problems
In LIDA, understanding, or giving meaning to, the current situation happens at the end of the third step of the cognitive cycle. That is, incoming stimuli activate low-level feature detectors in Sensory Memory. The output is sent to Perceptual Associative Memory, where higher-level feature detectors feed into more abstract entities such as objects, categories, actions, events, etc. The resulting percept is sent to the Workspace, where it cues both Transient Episodic Memory and Declarative Memory, producing local associations. These local associations are combined with the percept to generate a current situational model: the agent's understanding of what is going on right now.
In the "consciousness" phase, parts of the situational model are picked up by attention codelets, and the resulting coalitions move to the Global Workspace. There a competition selects the most salient (the most relevant, important, and urgent) coalition, whose contents become the contents of consciousness and are broadcast globally. Using the conscious contents, possible action schemes (mainly behavior streams) are recruited from Procedural Memory. A copy of each such scheme is instantiated with its variables bound and sent to the Action Selection system (behavior net), where its behaviors compete to be the behavior selected for this cognitive cycle. The selected behavior triggers Sensory-Motor Memory to produce a suitable algorithm for the execution of the behavior. Its execution completes the cognitive cycle.
In LIDA, there are two scenarios in which a non-routine problem can arise. (1) LIDA encounters an unexpected type of situation that cannot be categorized exactly in any one of the known classes of problems; this is a non-routine problem situation, since no known behavior stream can handle it. (2) In an active behavior stream (dominant goal hierarchy), an executing behavior (dominant goal context) fails to achieve its expected outcome, which happens if LIDA encounters a perturbation in its environment or a variation in its domain, or if the selected execution algorithm malfunctions. This becomes a non-routine problem if there is no alternative routine solution that corrects the novel error situation. In both scenarios, non-routine problems arise from instances of unexpected situations, which are eventually reflected in the workspace.
Considering the first scenario, once the understood novel situation is written to the workspace, there may not be an attention codelet that will respond to it. This indicates that non-routine problem solving requires a special attention codelet whose task is to respond to unknown situations.
But how is such a codelet to recognize such novel problem situations without knowing about all the routine occurrences? The problem seems similar to the one faced by the immune system and may well require the same kind of solution. Just as a human would, LIDA learns new procedural methods incrementally based on related, known concepts. This is more commonly described as "learning at the fringe" (Doignon & Falmagne, 1985). Let us see how this could be done. A reasonably short time after a novel problem situation is written to the workspace, a special non-routine problem solving attention codelet notices that no other attention codelet is attempting to deal with this new situation, and it competes to bring this information to consciousness. Assuming that the non-routine problem solving attention codelet wins the competition for consciousness, its content, a non-routine situation, is broadcast globally, thus recruiting appropriate mental resources (schemes from Procedural Memory) that can handle the situation.
How does LIDA detect non-routine problems that arise from execution errors in running behavior streams? Each behavior, besides its behavior codelets, has one or more expectation codelets under it. When a behavior starts execution and sends its action to Sensory-Motor Memory, it activates an expectation codelet at the same time. The role of an expectation codelet includes watching, evaluating and, perhaps, reporting on the outcome of the action. The function of watching events that are associated with the action of its behavior makes an expectation codelet a particular kind of attention codelet. The monitored changes resulting from a behavioral action are characterized and formulated as an outcome or result. If the actual outcome differs from the expected outcome, the expectation codelet detects an error situation.
The execution error, as a non-routine problem, is described by the unfulfilled part of the outcome (the goal), the fulfilled part of the outcome (the state to be protected), if there is
any, and the contextual state at the point of failure. The contextual state includes the perceptual context from working memory, long-term memory and the internal state in the behavior net. The expectation codelet that detects an execution error moves to the Global Workspace and competes to bring the error information to consciousness.
While there would likely be no schemes in Procedural Memory that respond precisely to the broadcast non-routine problem situation, various schemes that recognize pieces of the information will respond in varying degrees, depending on how much of the information they are familiar with, that is, on the intersection of the context of the scheme and the conscious contents. These initial respondents become the base, or seed, from which the search for possible solutions is initiated. Within the sequence of cognitive cycles, the search space is constrained by what is available in the conscious content. In addition, the schemes' activation levels at the time, corresponding to each scheme's assessment of its applicability, are used to further bias the search for a solution. The core of this chapter is the presentation of a meshing (Glenberg, 1997) mechanism that builds new goal structures (behavior streams) that could handle a novel situation.
Following global workspace theory, we conjecture that creative solving of non-routine problems by humans is accomplished using consciousness in the role of a recruiter of relevant resources. We further conjecture that such creative problem solving also requires a meshing mechanism capable of cutting and pasting various goal contexts along with their associated processors. Finally, we conjecture that some sort of mechanism (one or more processors) is needed for recognizing such problems and training attention on them.
Translating all this into computational terms, we conjecture that if LIDA is provided with attention codelets that recognize non-routine problems and bring them to consciousness, and with a sufficiently capable cut-and-paste mechanism for behavior streams and scripts, it will be capable of providing clever, possibly creative, solutions to non-routine problems from its domain. In general, LIDA's perception and memory modules may need to be involved in the recognition of non-routine situations and in the process of constructing creative solutions. But the scope of the research here is limited to the integration of non-routine problem solving mechanisms with the action selection and "consciousness" modules. In the coming sections of this chapter we present the details of a mechanism that could build a novel solution to a problem that is recognized to be non-routine in a given situation.
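The idea that schemes respond "in varying degrees" according to the intersection of their context with the conscious contents can be made concrete with a small sketch. The scoring formula (the fraction of a scheme's context present in the broadcast), the `min_relevance` cutoff, and all names below are assumptions introduced for illustration; the chapter does not prescribe a particular measure.

```python
def relevance(scheme_context, conscious_contents):
    """Fraction of a scheme's context found in the broadcast (0.0 for an empty context)."""
    context = set(scheme_context)
    if not context:
        return 0.0
    return len(context & set(conscious_contents)) / len(context)


def initial_respondents(schemes, conscious_contents, min_relevance=0.5):
    """Return (name, score) pairs for the schemes that respond, strongest first."""
    scored = [(name, relevance(ctx, conscious_contents))
              for name, ctx in schemes.items()]
    responders = [(name, score) for name, score in scored if score >= min_relevance]
    return sorted(responders, key=lambda pair: pair[1], reverse=True)


# Hypothetical schemes (name -> context) and a hypothetical broadcast.
schemes = {
    "turn_handle": {"door_closed", "at_door"},
    "use_key": {"door_locked", "has_key"},
    "call_elevator": {"at_elevator"},
}
broadcast = {"door_locked", "at_door", "door_closed"}
print(initial_respondents(schemes, broadcast))
# -> [('turn_handle', 1.0), ('use_key', 0.5)]
```

The returned scores play the part of the activation levels that bias the subsequent search toward the more applicable respondents.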
4. Non-routine Problem Solving Mechanism
When we humans are faced with a problem to solve, we often create in our minds different strategies or possible solutions. We imagine the effects of executing each strategy or trial solution without actually doing so. This is similar to a kind of internal virtual reality or a simulated run. Eventually, we decide upon one strategy or trial solution, and try solving the problem using it. This process is called deliberation (Sloman, 1999). Non-routine problem solving is a deliberative process to devise solutions for novel problems. As Glenberg (1997) suggests, such a process calls for meshing: the use of chunks of prior knowledge towards obtaining solutions to novel problems.
In this section we present the mechanism that searches for a solution (a behavior stream) built out of existing schemes, using consciousness to guide the search over multiple cognitive cycles.
4.1. Basis for the Mechanism
We would like to point out two driving principles for building LIDA's non-routine problem solving mechanism. The first is that complex processes are built from simple and primitive processes. The complexity arises from the structure that governs the interaction of the primitive processes in time and space. The second is that creativity is an important feature in handling non-routine problems; we believe that creativity is the process of using simple units to build larger and more complex structures such as behaviors and behavior stream templates (procedural schemes). An important capability of non-routine problem solving is to create new schemes in Procedural Memory using refined domain knowledge, but this level of creativity needs further investigation and we do not address it here. That is, we limit the primary repertoire to the level of behavior codelets and behaviors, which can be considered as neuronal groups (Edelman, 1987).
An attention or expectation codelet that has detected a non-routine problem may win the competition for "consciousness," and the problem information content is then broadcast. Procedural schemes that find themselves relevant to pieces of the broadcast information instantiate the associated behavior streams. In particular, there is a special non-routine problem solving (NRPS) behavior stream that becomes relevant to every broadcast related to a non-routine problem situation. The NRPS behavior stream activates the NRPS module to facilitate the deliberative process of searching for solutions.
Non-routine situations should be recognized and brought to "consciousness" by non-routine attention codelets. Our approach to managing non-routine problems is functionally equivalent to what Shallice (1988) calls a Supervisory Attention System (SAS), which is associated with the prefrontal cortex. The SAS can correct errors and determine novel action plans, and it has a number of functions (Shallice, 1988).
(i) SAS Monitor: This can be considered a non-routine situation detection system. As is the case in LIDA, the SAS Monitor is connected to long-term working memory (LTWM) (Ericsson & Kintsch, 1995). Novelty detection, which may take multiple cognitive cycles, notices departures from what is expected while looking at LTWM. Looking is done by cuing the LTWM with the content of current percepts and with content retrieved from different memories, which in turn are cued by the current situation. In LIDA, this role should be addressed in the perception and the different memory modules;
(ii) SAS Modulator: The SAS must activate relevant actions and goal structures while attenuating irrelevant actions, particularly to try out available solutions to the novel situation. In LIDA, this functionality is realized by procedural schemes that respond to pieces of information from broadcasts related to non-routine problem handling, and that instantiate relevant behavior streams. This is similar to adaptation by assimilation (Piaget, 1983) in the sense of using existing solutions to handle novel situations;
(iii) SAS Generator: The SAS is used to generate novel strategies (such as altering goal structures or action plans) to solve non-routine problems. Our research contribution in this chapter is to present a design for a mechanism that realizes the function of the SAS Generator.
Under the LIDA framework, a Non-Routine Problem Solving (NRPS) mechanism should be a consciously mediated reasoning system that breaks down and combines pieces of procedural knowledge to produce new behavior streams. An AI planner is a type of problem solver and a deliberative action selection system. In our system, however, deliberative action requires "consciousness," which is not assumed in AI planners. In particular, we propose the non-routine problem solving module of LIDA as a special planning (regressive and dynamic) behavior stream (goal structure), called the NRPS behavior stream. Like any other behavior stream, the NRPS behavior stream is instantiated and executed over multiple cognitive cycles. But its particular task is to construct a new behavior stream as a solution for a given non-routine problem situation. As such, the NRPS behavior stream can be called a meta-behavior stream, since it has knowledge about other behavior streams.
4.2. The Mechanism: NRPS Behavior Stream
To incorporate a non-routine problem solving mechanism as a special goal structure, we assume two constraints on schemes in Procedural Memory and their behaviors.
(i) Each selected behavior codelet (single action) is associated with a corresponding expectation codelet.
(ii) Based on the broadcast content, a recruited scheme instantiates itself and relevant behavior streams in one of two modes: (a) Execution mode – the instantiations become part of the action selection dynamics of an agent; that is, when a behavior codelet executes, its associated expectation codelet knows how to monitor the execution effects and how to report the outcome. (b) Non-routine problem solving or regressive mode – the instantiations become part of the pool of actions that is available for the solution searching process.
The first constraint allows the NRPS behavior stream to cut and paste behavior codelets, together with their coupled expectation codelets, as atomic units. A behavior codelet and its corresponding expectation codelet are reorganized under a different behavior as a unit during the solution search process. The second constraint differentiates the roles that behavior codelets play after instantiation. According to Baars (1988), an intention is a future-directed, non-qualitative mental representation of one's own action with the potential to dominate selective attention, thereby becoming conscious content. In the regressive reasoning process during non-routine problem solving, behavior codelets could be called intention codelets (behavior codelets in goal image or intention mode). Intention codelets are a special case of attention codelets, as they compete for consciousness with the goal image of a future state as content, which in turn enables the recruitment of processors and subgoals (representing procedural constructs) that help to achieve the future goal state.
Similarly, in non-routine problem solving or regressive mode, behavior nodes and goal nodes are underlain by intention codelets and expectation codelets. Intention codelets, representing schemes, are part of LIDA's primary repertoire and, as a special case of attention codelets, compete to bring the corresponding goal images to consciousness. The NRPS mechanism, during its regressive solution search process, involves the intention codelets corresponding to the top goal and the subgoals. Our approach hypothesizes that there is an intention codelet, as part of a primary repertoire, corresponding to and representing every scheme node (goal state), including those at the highest level of abstraction.
The NRPS behavior stream is shown in figure 3. The figure shows the instantiated copy of the stream following the understanding of the problem situation, selective attention (conscious broadcast), the recruitment of schemes as relevant resources from procedural memory, and their instantiation and variable binding as they become active in the action selection system. The NRPS mechanism is based on a partial-order planning (POP) system (Sacerdoti, 1977; Tate, 1990; Wilkins, 1990) and on the service of "consciousness" to recruit relevant resources in its solution searching process.
Variations of partial-order planners allow rich domain knowledge representation and are used in real-world domains. O-Plan (Tate, Drabble, & Kirby, 1994) and Sipe-2 (Wilkins, 1990; Wilkins et al., 1995) are domain-independent partial-order planners that have been used in a number of real-world domains. Such planners allow plan repair and interaction with humans, features that are particularly important for the LIDA architecture. Erol, Hendler, and Nau (1994) presented a formal definition of partial-order planning and proved that it is sound and complete. Wilkins (2001) discussed the comparative advantages of partial-order planners.
Standard POP systems usually take the initial state, the goal state, and all available actions of an agent, and return a plan if one is found. For our NRPS mechanism, the POP system should be modified so that it can start planning with a limited set of actions and allow "conscious" deliberation if an impasse or a contentious point is reached in the planning process. An impasse is reached if the planner cannot find an action (in the current set of actions) that could satisfy an open subgoal, which we call an impasse subgoal (or contentious subgoal).
Figure 3. The non-routine problem solving (NRPS) module is a special behavior stream (goal-context hierarchy), which guides a deliberative process for problem solving over multiple cognitive cycles.
The NRPS behavior stream uses a modified POP system (the detailed algorithm can be found in Negatu, 2009) that takes a starting-plan, a current-pool-of-actions (possible steps) and a possible impasse subgoal, and returns a resulting-plan and an impasse-subgoal. A "consciously" mediated planning process continues until a consistent plan is obtained as a solution (no impasse-subgoal) or no solution is found (the impasse-subgoal exists and no new relevant action could clear it).
As we discussed before, the non-routine problem situation becomes conscious content once the coalition of the NRPS attention codelet wins the competition in the Global Workspace and its content (the problem description) is globally broadcast. This broadcast recruits or primes the NRPS behavior stream (in procedural memory) and instantiates it for action in the Action Selection (behavior net) system. Once instantiated, it guides a solution search spanning multiple cognitive cycles. Next, we discuss the role of each behavior in the NRPS behavior stream (figure 3).
i. Initiate Problem Solving
This behavior sets the problem-solving context. Its variables are bound by the underlying NRPS procedural scheme, which responds to a broadcast non-routine problem description (initial state and goal state). Along with the instantiation of the NRPS behavior stream, relevant schemes (mostly single behavior codelets) also respond to the broadcast of the same non-routine problem description. These instantiated and bound schemes form the initial set of actions (operators) available to the planner. When this behavior executes, it asserts the readiness of the initial state, the goal state, and the initial actions (operators) to the other behaviors in the NRPS behavior stream. As such, it sets the context for solving the problem at hand.
ii. Initialize Planner
Using the problem context set by the Initiate Problem Solving behavior, this behavior compiles the planner initialization parameters. In particular, on its execution it does the following:
(i) Using the initial and goal states, it creates the initial empty plan (in which the Start dummy plan action has nodes and links representing the initial state of the problem as its effects, and the Finish dummy plan action has nodes and links representing the goal state of the problem as its preconditions).
(ii) Nondeterministically, it chooses an open subgoal from the goal state of the problem and sets it as an impasse or contentious subgoal, which may or may not be an actual impasse but needs to be set for the proper initialization of the planner.
(iii) It initializes an empty list, which is used to store "consciously" deliberated impasse subgoals. Each detected impasse subgoal is registered in this list.
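The three initialization steps above can be sketched as follows. This is a minimal illustration, not the algorithm of Negatu (2009): the names `Action`, `Plan`, and `initialize_planner` are hypothetical, states are assumed to be sets of proposition strings, and the nondeterministic choice of the initial impasse subgoal is replaced by a deterministic pick so that the sketch is reproducible.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Action:
    """A plan step; the field names are illustrative assumptions."""
    name: str
    preconditions: frozenset = frozenset()
    effects: frozenset = frozenset()


@dataclass
class Plan:
    steps: list
    orderings: set = field(default_factory=set)     # (before, after) step names
    causal_links: set = field(default_factory=set)  # (producer, condition, consumer)


def initialize_planner(initial_state, goal_state):
    # (i) Empty plan: Start carries the initial state as its effects,
    #     Finish carries the goal state as its preconditions.
    start = Action("Start", effects=frozenset(initial_state))
    finish = Action("Finish", preconditions=frozenset(goal_state))
    plan = Plan(steps=[start, finish], orderings={("Start", "Finish")})

    # (ii) The text chooses an open subgoal nondeterministically; taking the
    #      first one in sorted order is an assumption made for reproducibility.
    impasse_subgoal = sorted(goal_state)[0]

    # (iii) Empty list of "consciously" deliberated impasse subgoals.
    deliberated = []
    return plan, impasse_subgoal, deliberated
```

In the empty plan every precondition of Finish is still open, so any goal proposition is a valid initial choice in step (ii).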
iii. Planner
This behavior executes to build a new plan based on the previous plan and the current pool of actions. At the end of the planning process, this behavior asserts the new plan and an impasse subgoal (set to null if there is none). For the details of the planning algorithm, refer to Negatu (2009).
iv. Evaluate Solution
This behavior evaluates the solution search state. In particular, it checks whether the search for a solution should continue, whether a solution has been found, or whether the search has failed to find a solution. The checking process is simple and based on the asserted value of the impasse subgoal and on whether this impasse has been deliberated in earlier cycles. If no impasse subgoal is left to be resolved, then the evaluation is a success in finding a solution, with the current plan being the solution to the problem. If an impasse subgoal exists and is in the list of deliberated impasse subgoals, then it has been dealt with before, and therefore the evaluation is a failure to find a solution to the problem. If an impasse subgoal exists but is not in the list of deliberated impasse subgoals, then the evaluation is to continue the search for a solution. The actions of behavior codelets under this behavior produce an internal state change that asserts one of the three evaluation statuses (success in finding a solution, continue the search for a solution, or failure to find a solution) and possibly the impasse subgoal. Once the internal state change is perceived, the evaluation content (as a percept) and whatever it cues from memory are, ideally, copied to the Global Workspace. Relevant attention codelets, which watch for the solution evaluation in the Global Workspace, try to bring this information to "consciousness." Upon winning the competition, the content of the attention codelet and its coalition is broadcast. Relevant NRPS related procedural schemes respond to the broadcast and bind the evaluation content to the appropriate behaviors in the NRPS behavior stream. If an impasse subgoal exists, its broadcast prompts schemes that could satisfy the subgoal, or that have the subgoal in their effect/result list, to respond. Thus, they become the new additions or recruits in the set of actions available to the Planner behavior for its next cycle of search for a solution.
That is, consciousness is used as a filter and recruiter in the solution searching process, and this is, we think, the main advantage of this mechanism over other planning systems.
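The three-way check described above is small enough to write out. The following is a sketch under stated assumptions: the names `Status` and `evaluate_solution` are hypothetical, a missing impasse subgoal is represented by `None`, and the sketch registers a newly seen impasse subgoal at evaluation time, which is one possible reading of "each detected impasse subgoal is registered in this list."

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "success in finding a solution"
    CONTINUE = "continue the search for a solution"
    FAILURE = "failure to find a solution"


def evaluate_solution(impasse_subgoal, deliberated):
    """Return the evaluation status asserted by Evaluate Solution.

    impasse_subgoal: the subgoal asserted by the Planner, or None.
    deliberated: list of impasse subgoals already deliberated (mutated here).
    """
    if impasse_subgoal is None:
        # No impasse left: the current plan is the solution.
        return Status.SUCCESS
    if impasse_subgoal in deliberated:
        # Already deliberated once and still unresolved: give up.
        return Status.FAILURE
    # New impasse: register it (assumed timing) and keep searching.
    deliberated.append(impasse_subgoal)
    return Status.CONTINUE
```

Because a subgoal can enter the list only once, the same impasse can trigger at most one additional round of recruitment before the search is declared a failure.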
v. Continue Search
This behavior is underlain by a procedural scheme that responds to a broadcast of an evaluation asserting that the search for a solution should continue, an evaluation that has been understood as such and reflected in the Global Workspace as a result of the action of the Evaluate Solution behavior. After its variables are bound by its underlying scheme, and upon its execution, this behavior compiles a new set of actions, consisting of the union of the existing actions and the newly arrived actions. On its execution, this behavior asserts the preconditions that allow the Planner behavior to continue the search with the possible addition of newly recruited actions.
vi. Terminate with Failure
This behavior is underlain by a procedural scheme that responds to a broadcast of an evaluation with the content of failure to find a solution to the problem. After its variables are bound by its underlying behavior codelet, this behavior executes to terminate, or give up on, the solution search process for the problem.
vii. Terminate with Success
This behavior is underlain by a procedural scheme that responds to a broadcast of an evaluation with the content of success in finding a solution to the problem. The last plan produced by the planner is a solution. Upon its execution, this behavior converts the plan into a corresponding behavior stream template or scheme. The conversion rests on two facts: (a) the solution, as a complete plan, is a totally ordered set of actions (behavior codelets and associated expectation codelets) that can be applied sequentially to an initial state to produce a goal state; (b) the solution, as a totally ordered plan, could be specified by a set of partially ordered complete plans (behavior streams). For the current version of LIDA's action selection mechanism, the conversion should also satisfy a critical criterion: behavior codelets that underlie a behavior should be independent, that is, they should be able to execute in parallel or in any order without affecting each other. In planning terms, two actions are independent of each other if the plan remains consistent under any change of order between them: no cycle is introduced in the ordering constraints of the plan and no conflict arises in any of the causal links of the plan. The simplest conversion from the plan solution to a behavior stream template is to have each action (behavior codelet and expectation codelet) underlie a behavior of its own. This is a correct solution and may be the only valid conversion, but it is the worst case, particularly in execution efficiency. We suggest the following heuristics for a better conversion, to be applied in the given order whenever doing so keeps the partial plan consistent:
(a) Map a set of behavior codelets to an existing behavior.
(b) Replace independently executable behavior codelets in the plan solution with a new behavior, limiting the maximum number of behavior codelets that can underlie a new behavior.
(c) Map each action (behavior codelet and coupled expectation codelet) that is not converted by the previous two heuristics to underlie a behavior by itself.
Reusing knowledge pieces is an important feature of agents; the first heuristic allows the reuse of existing structures in the behavior network. Having multiple behavior codelets underlie a single behavior improves performance, since the selection of a behavior in the behavior network dynamics leads to the parallel execution of the underlying codelets. The actual solution to the non-routine problem at hand is a novel behavior stream template or scheme, with the possibility of a new goal structure built using only existing behaviors, only new behaviors, or a mix of new and old behaviors. The Terminate with Success behavior effects the storing of the new behavior stream template in procedural memory: its action changes the internal state in a way that is understood as the availability of a solution to the non-routine problem at hand, and consequently the success information becomes content in the Workspace. When an appropriate attention codelet wins the competition in the Global Workspace, the broadcast of its content (success in finding the solution) causes the incorporation of the new scheme into procedural memory. Behavior streams are goal structures with hierarchy. Although the POP algorithm is known to produce flat plans, the free interconnection of schemes (behaviors and behavior streams) based on their causal links, together with the heuristics of constructing new schemes by using already existing schemes as components, maintains the hierarchical nature of behavior streams and the reusability of schemes. This is possible because schemes at different levels of abstraction are all treated the same way, as schemes.
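Heuristics (b) and (c) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Codelet` type with STRIPS-style add and delete sets, the `independent` test (a simplified stand-in for the ordering-constraint and causal-link conditions above), and the greedy strategy of grouping only adjacent codelets of the totally ordered plan. Heuristic (a), reuse of existing behaviors, is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Codelet:
    """A behavior codelet with STRIPS-style conditions (illustrative)."""
    name: str
    preconditions: frozenset = frozenset()
    adds: frozenset = frozenset()
    deletes: frozenset = frozenset()


def independent(a, b):
    """True if a and b can execute in either order without affecting each other."""
    def interferes(x, y):
        # x supplies or removes something y needs, or removes something y adds.
        touches_needs = bool((x.adds | x.deletes) & y.preconditions)
        undoes_adds = bool(x.deletes & y.adds)
        return touches_needs or undoes_adds
    return not interferes(a, b) and not interferes(b, a)


def group_into_behaviors(ordered_codelets, max_size=3):
    """Greedily pack adjacent, mutually independent codelets into behaviors.

    Heuristic (b): a group becomes one new behavior, capped at max_size.
    Heuristic (c): a codelet that fits no group underlies a behavior alone.
    """
    behaviors, current = [], []
    for codelet in ordered_codelets:
        fits = (current
                and len(current) < max_size
                and all(independent(codelet, member) for member in current))
        if fits:
            current.append(codelet)
        else:
            if current:
                behaviors.append(current)
            current = [codelet]
    if current:
        behaviors.append(current)
    return behaviors
```

Setting `max_size=1` reproduces the simplest conversion, in which every action underlies its own behavior. Restricting groups to adjacent steps is a conservative design choice: it never reorders steps across a dependency, at the cost of missing some groupings a full partial-order analysis would find.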
NRPS Behavior Stream at Work
We have discussed the role of each behavior in the NRPS behavior stream in the solution search process for a given non-routine problem. Here, we point out the flow of information processing in the instantiation and execution of the NRPS behavior stream. To start with, a non-routine problem situation is perceived and associated with what could be remembered; the problem description is then eventually stored in the Workspace. LIDA does this in its understanding phase, the first three steps of its cognitive cycle (CC-1 to CC-3). An attention codelet that watches for a non-routine problem situation in the Workspace becomes active, with its coalition of information codelets, and the coalition competes in the Global Workspace to get access to "consciousness" (CC-4). If and when the competition is won, the coalition gains access to the Global Workspace and the problem description is broadcast (CC-5). The NRPS behavior stream responds to the broadcast of the non-routine problem description. Other behaviors or behavior streams may find themselves relevant to pieces of information in the same broadcast, and can recruit resources that could help to construct a solution for the problem (CC-6). The responding behaviors and behavior streams bind their variables with the relevant information in the broadcast. The underlying NRPS procedural scheme instantiates and binds the NRPS behavior stream (CC-7). The other recruited behavior codelets (in NRPS or planning mode), with the behavior streams they prime, become part of the initial set of actions (operators) for the problem solving process. Once the NRPS behavior stream is instantiated, it is part of the dynamics in the Action Selection system (behavior network); the dynamics there choose a behavior for control of action (CC-8), and the chosen behaviors are executed (CC-9) one at a time over multiple cognitive cycles.
The NRPS behavior stream guides the search for a solution to the problem with the help of "conscious" mediation when an impasse arises. The Initiate Problem Solving and the Initialize Planner behaviors are executed only to initialize the problem solving process. While the Planner behavior has the task of planning in the solution search process, the Evaluate Solution behavior has the task of evaluating the output of the Planner behavior, and its execution produces a change in the Workspace reflecting the search state. The standard "conscious" mediation process (CC-4 and CC-5) feeds back the evaluation content and possibly recruits additional resources (schemes). If the evaluation is to continue the problem solving process, a relevant behavior codelet binds (or rebinds) and activates the Continue Search behavior, which has the task of setting up and asserting the parameters required by the Planner behavior. As long as the evaluation is to continue the search and the NRPS behavior stream is the dominant goal context (with one of its behaviors being a winner in the behavior net dynamics), the stream executes iteratively: execution of the Continue Search behavior is followed by execution of the Planner behavior, followed by execution of the Evaluate Solution behavior, followed by "conscious" mediation, followed again by execution of the Continue Search behavior. The NRPS behavior stream terminates with either failure or success. To do so, a "consciously" mediated result processing, which involves all or most of the steps of LIDA's cognitive cycle, takes place. The process involves many codelets that are designed specifically for the non-routine problem solving mechanism. Besides the behavior codelets that underlie the NRPS behavior stream, there are many attention codelets with the role of watching for or recognizing Workspace content that relates to non-routine problem solving. Among those are attention codelets that watch for a result status of failure or success in the
solution search process. If consciously mediated evaluation content asserts failure in finding a solution to the problem, the result reporting action is executed by the Terminate with Failure behavior. If the consciously mediated evaluation content asserts success in finding a solution to the problem, the result reporting action is executed by the Terminate with Success behavior with the task of saving the solution as a behavior stream template in the procedural memory or scheme net and writing the availability of a solution to the stated problem in the workspace. The termination state written in the workspace could be brought to consciousness by relevant attention codelets. Consequently, success states will allow the agent to apply the new solution.
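The iterative order described in this section (Planner, Evaluate Solution, "conscious" mediation, Continue Search) can be sketched as a single loop. This is only an illustration under several assumptions: `toy_planner` is a deliberately naive backward-chaining stand-in, not the modified POP algorithm of Negatu (2009); it replans from scratch each cycle instead of extending the previous plan; actions are plain dictionaries; and the broadcast that recruits schemes is modeled as a filter over a hypothetical `procedural_memory` list.

```python
def toy_planner(initial_state, goal_state, actions):
    """Naive stand-in for the planner: returns (plan, impasse_subgoal)."""
    plan, supported = [], set(initial_state)
    open_subgoals = sorted(goal_state)
    while open_subgoals:
        subgoal = open_subgoals.pop()
        if subgoal in supported:
            continue
        provider = next((a for a in actions if subgoal in a["effects"]), None)
        if provider is None:
            return plan, subgoal              # no action satisfies it: impasse
        plan.insert(0, provider["name"])      # providers run before consumers
        supported |= provider["effects"]
        open_subgoals.extend(sorted(provider["preconditions"]))
    return plan, None


def solve_non_routine(initial_state, goal_state, initial_actions, procedural_memory):
    actions = list(initial_actions)
    deliberated = []                          # deliberated impasse subgoals
    while True:
        # Planner behavior
        plan, impasse = toy_planner(initial_state, goal_state, actions)
        # Evaluate Solution behavior
        if impasse is None:
            return "success", plan            # Terminate with Success
        if impasse in deliberated:
            return "failure", None            # Terminate with Failure
        deliberated.append(impasse)
        # "Conscious" mediation: broadcasting the impasse subgoal recruits
        # schemes that list it among their effects.
        recruits = [s for s in procedural_memory
                    if impasse in s["effects"] and s not in actions]
        # Continue Search behavior: union of existing and recruited actions
        actions = actions + recruits
```

For instance, with a goal of `{"coffee"}`, an initial action pool containing only a `brew` action that needs `water_hot` and `grounds`, and a procedural memory that also holds `boil` and `grind`, the loop hits two impasses, recruits one action for each, and succeeds on the third planning cycle. If `grind` is missing from memory, the `grounds` impasse recurs and the loop terminates with failure.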
5. Conclusion
Non-routine problem solving is the process of devising an appropriate stream of actions as a response to novel or unexpected situations. Devising solutions is a type of learning or adaptation, which requires focus of attention, or involvement of consciousness, in handling the situation (Baars, 1997). Norman and Shallice (1986) and Shallice (1988) have proposed a functional model for the control of both routine (automatic) and non-routine behavior. This functional model is broadly equivalent to LIDA's decision making mechanism. In Shallice's mechanism, handling a non-routine problem is attention-based learning managed by a mechanism called the Supervisory Attention System (SAS), which is mapped to the prefrontal cortex (Shallice, 1988). Focusing on the deliberate action of the mechanism, working memory gives the attention system access to the current intentions, goals, and relevant past episodes. The SAS enables error correction, troubleshooting, and the handling of non-routine problems using supervisory sub-functions such as the SAS Monitor, SAS Modulator, and SAS Generator. The SAS Monitor is a detector of non-routine problem situations or novelty. It is primarily connected to working memory. Non-routine situation detection is the detection of a departure or mismatch between known contents and what is perceived in the world, the intended action, or the outcome of an executed action. These are related to what is stored in episodic, procedural, and working memory. The monitoring sub-function is a trigger mechanism for the other supervisory sub-functions. Although we do not address the function of the SAS Monitor here, we think that this function in LIDA is connected to internal perception and the different memory types the agent may have. The SAS Modulator function attenuates the activation of inappropriate actions and enhances the activation of alternate but attended goal structures. Shallice suggests a few possible modulatory responses, which we do not cover here.
The SAS Modulator function applies existing solutions or behavior streams to new problems, that is, adaptation by assimilation (Piaget, 1983). The SAS Generator is triggered if the modulatory response by the SAS Modulator does not succeed; this asserts the unavailability of a recognized strategy for the situation at hand. The SAS Generator must be able to produce a novel strategy (new goal hierarchy or partial plan) that could handle the novel or non-routine problem situation. As in the case of Global Workspace Theory, the SAS framework suggests that learning is triggered and happens with the investment of attentional resources or the involvement of consciousness. Our presentation in this chapter addresses the role of the SAS Generator: devising new behavior streams as a response to non-routine problems. While the SAS Generator (Shallice, 1988) is a functional model to devise solutions to non-routine problems,
the mechanism presented here is a more concrete and detailed specification of the function of Shallice's SAS Generator; the specification is also unique in the sense that it tries to realize the function as per Baars's Global Workspace Theory in general, and in the context of the LIDA integrative cognitive agent framework in particular. To manage a non-routine problem, the problem must first be recognized, and then a novel action plan (if there is one to find) should be produced as a solution to the problem. We presented a mechanism that addresses the latter with the help of conscious mediation. The hypothesis is that consciousness provides an effective resource recruitment mechanism in the search process for a novel solution. This non-routine problem solver can be seen from the point of view of the human developmental process. We are born with basic primitive capabilities, which could correspond to the processes in our mechanism. We inherit some behaviors and develop other behaviors as experience in the environment progresses. The new behaviors are developed out of the established behaviors and the underlying primitive capabilities. To start with, most problems are non-routine. Progressively, we accumulate routine problem solving capabilities in different ways: by instructions from other humans, by direct feedback from interactions in the environment, by imitation, etc. In our mechanism, routine problem solving capabilities correspond to ready-made behavior streams (partial plans) in procedural memory (scheme database in the plan space). The behavior streams, the behaviors in them, and the underlying behavior codelets are building blocks with which to generate solutions to non-routine problems. As experience grows, the non-routine problem becomes routine.
If a routine task that uses a routine solution (behavior stream) is performed repetitively (or overlearned), the routine task may become an automatic task; this is an automatization process, which is presented elsewhere (Negatu et al., in review; Negatu, 2009). Automatization can be used as a foundation to chunk overlearned sequences of actions into a larger unit, such as a behavior. The presented non-routine problem solver is a type of adaptation or learning system that follows Edelman's (1987) Neuronal Group Selection mechanism, which itself is inspired by Piaget's (1952) theory of cognitive development. In general, planners produce a sequence of actions. In our case, the non-routine problem solving module, with a planner at the center, is used to produce a behavior stream template (or task network) that could be instantiated to perform actions in a certain sequence. The actual sequencing of the actions and the conflict resolutions happen in the dynamics of the behavior net, LIDA's action selection module, where instantiated streams reside. So, in LIDA, the NRPS stream and the behavior net contribute to plan generation, and the behavior net performs the plan execution. Both the plan generation and the plan execution are edited by consciousness. Non-routine problem solving, as a deliberative process in LIDA, is consciously mediated. It is nearly impossible to preprogram the complete domain knowledge; so, for LIDA and for any other agent system, it is important to be designed or evolved with capabilities for different learning processes to encode, store, retrieve, and use experiences, which are acquired via its interaction with its environment. Non-routine problem solving in the LIDA architecture is the construction of a new goal hierarchy under the control of a unique behavior stream, itself a goal hierarchy, operating over multiple cognitive cycles, with the shaping of partial plans of action at each cycle.
We believe that the integration of the AI planning system and LIDA's consciousness process, along with the human interaction capability in LIDA, contributes to the development of effective non-routine problem solving systems.
References
Arp, R. (2007). Scenario visualization: An evolutionary account of vision-related creative problem solving. Cambridge, MA: MIT Press.
Baars, B. J. (1988). A Cognitive Theory of Consciousness. Cambridge: Cambridge University Press.
Baars, B. J. (1997). In the Theater of Consciousness. Oxford: Oxford University Press.
Baars, B. J. (2002). The conscious access hypothesis: origins and recent evidence. Trends in Cognitive Sciences, 6, 47-52.
Baars, B. J. & Franklin, S. (2003). How conscious experience and working memory interact. Trends in Cognitive Sciences, 7, 166-172.
Baars, B. J. & Franklin, S. (2009). Consciousness is computational: The LIDA model of global workspace theory. International Journal of Machine Consciousness.
Baddeley, A. D. & Hitch, G. J. (1974). Working memory. In The psychology of learning and motivation, ed. G. H. Bower: 47-89. New York: Academic Press.
Baddeley, A. D. & Hitch, G. J. (2007). Working memory: Past, present...and future? In N. Osaka, R. Logie & M. D'Esposito (Eds.), Working Memory - Behavioural & Neural Correlates. Oxford: Oxford University Press.
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22, 577-609.
Barsalou, L. W. (2008). Grounded Cognition. Annual Review of Psychology, 59, 617-645.
Bogdan, R. (1994). Grounds for cognition: How goal-guided behavior shapes the mind. Hillsdale, NJ: LEA Publishers.
Brooks, R. A. (1986). A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, RA-2, 14-23.
Brooks, R. A. (1991). Intelligence without representation. Artificial Intelligence, 47, 139-159.
Cañamero, D. (1997). Modelling Motivations and Emotions as a Basis for Intelligent Behavior. In Proceedings of the First International Symposium on Autonomous Agents, AA'97, Marina del Rey, CA, February 5-8. The ACM Press.
Chaput, H. H., Kuipers, B. & Miikkulainen, R. (2003). Constructivist learning: A neural implementation of the schema mechanism. In Proceedings of WSOM '03: Workshop for Self-Organizing Maps. Kitakyushu, Japan.
Cosmides, L. & Tooby, J. (1992). The psychological foundations of culture. In J. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind (19-136). New York: Oxford University Press.
Crick, F. (1994). The astonishing hypothesis. New York: Simon & Schuster.
Ericsson, K. A. & Kintsch, W. (1995). Long-term working memory. Psychological Review, 102, 211-245.
de Vega, M., Glenberg, A. & Graesser, A. (Eds.). (2008). Symbols and Embodiment: Debates on meaning and cognition. Oxford: Oxford University Press.
D'Mello, S. K., Franklin, S., Ramamurthy, U. & Baars, B. J. (2006). A Cognitive Science Based Machine Learning Architecture. AAAI Spring Symposia Technical Series, Stanford CA, USA. Technical Report SS-06-02 (40-45). AAAI Press.
Doignon, J. P. & Falmagne, J. C. (1985). Spaces for the assessment of knowledge. International Journal of Man-Machine Studies, 23, 175-196.
Drescher, G. (1991). Made Up Minds: A Constructivist Approach to Artificial Intelligence. Cambridge, MA: MIT Press.
Edelman, G. M. (1987). Neural Darwinism: the theory of neuronal group selection. New York: Basic Books.
Erol, K., Hendler, J. & Nau, D. S. (1994). UMCP: A Sound and Complete Procedure for Hierarchical Task-Network Planning. In Proc. AIPS. Morgan Kaufmann.
Franklin, S. (1997). Autonomous Agents as Embodied AI. Cybernetics and Systems, 28, 499-520.
Franklin, S. (2001b). An Agent Architecture Potentially Capable of Robust Autonomy. AAAI Spring Symposium on Robust Autonomy; American Association for Artificial Intelligence; Stanford, CA; March.
Franklin, S. (2001). Automating Human Information Agents. In Practical Applications of Intelligent Agents, ed. Z. Chen, and L. C. Jain. Berlin: Springer-Verlag.
Franklin, S. (2003). IDA: A Conscious Artifact? Journal of Consciousness Studies, 10, 47-66.
Franklin, S., Baars, B. J., Ramamurthy, U. & Ventura, M. (2005). The Role of Consciousness in Memory. Brains, Minds and Media, Vol. 1 (bmm150) (urn:nbn:de:0009-31505).
Franklin, S. & Graesser, A. C. (1997). Is it an Agent, or just a Program?: A Taxonomy for Autonomous Agents. In Intelligent Agents III. Berlin: Springer Verlag.
Franklin, S. & Graesser, A. (2001). Modelling Cognition with Software Agents. In CogSci2001: Proceedings of the 23rd Annual Conference of the Cognitive Science Society, ed. J. D. Moore, and K. Stenning. Mahwah, NJ: Lawrence Erlbaum Associates; August 1-4.
Franklin, S., Kelemen, A. & McCauley, L. (1998). IDA: A Cognitive Agent Architecture. IEEE Conf on Systems, Man and Cybernetics (2646-2651). IEEE Press.
Franklin, S. & McCauley, L. (2004). Feelings and Emotions as Motivators and Learning Facilitators. Architectures for Modeling Emotion: Cross-Disciplinary Foundations, AAAI 2004 Spring Symposium Series (Technical Report SS-04-02, 48-51). Stanford University, Palo Alto, California, USA: American Association for Artificial Intelligence.
Franklin, S. & Patterson, F. G. J. (2006). The LIDA Architecture: Adding New Modes of Learning to an Intelligent, Autonomous, Software Agent. IDPT-2006 Proceedings (Integrated Design and Process Technology): Society for Design and Process Science.
Franklin, S. & Ramamurthy, U. (2006). Motivations, Values and Emotions: Three sides of the same coin. Proceedings of the Sixth International Workshop on Epigenetic Robotics (Vol. 128, pp. 41-48). Paris, France: Lund University Cognitive Studies.
Franklin, S., Ramamurthy, U., D'Mello, S. K., McCauley, L., Negatu, A., Silva L., R. & Datla, V. (2007). LIDA: A computational model of global workspace theory and developmental learning. In AAAI Fall Symposium on AI and Consciousness: Theoretical Foundations and Current Approaches. Arlington, VA: AAAI.
Freeman, W. J. (1995). Societies of Brains. A Study in the Neuroscience of Love and Hate. Hillsdale NJ: Lawrence Erlbaum.
Gardner, H. (1993). Multiple intelligences: The theory in practice. New York: Basic Books.
Glenberg, A. (1997). What memory is for. Behavioral and Brain Sciences, 20, 1-19.
Hofstadter, D. R. & Mitchell, M. (1994). The Copycat Project: A model of mental fluidity and analogy-making. In Advances in connectionist and neural computation theory, Vol. 2: Analogical connections, eds. K. J. Holyoak & J. A. Barnden. Norwood N.J.: Ablex.
Humphrey, N. (1992). A history of the mind: Evolution and the birth of consciousness. New York: Simon & Schuster.
Kanerva, P. (1988). Sparse Distributed Memory. Cambridge MA: The MIT Press.
Kanerva, P. (2009). Hyperdimensional Computing: An Introduction to computing in distributed representation with high-dimensional random vectors. Cognitive Computation, 1(2), 139-159.
LaBerge, D. (1995). Attentional processing: the brain's art of mindfulness. Harvard University Press.
Laird, J. E., Newell, A. & Rosenbloom, P. S. (1987). SOAR: An Architecture for General Intelligence. Artificial Intelligence, 33, 1-64.
Lebiere, C. & Anderson, J. R. (1993). A connectionist implementation of the ACT-R production system. In Proceedings of the Fifteenth Annual Meeting of the Cognitive Science Society, 635-640. Hillsdale, NJ: Erlbaum.
Maes, P. (1989). How to do the right thing. Connection Science, 1, 291-323.
Marshall, J. (2002). Metacat: A self-watching cognitive architecture for analogy-making. In 24th Annual Conference of the Cognitive Science Society, 631-636.
Negatu, A. S. (2009). Decision Making System for Cognitive Machines. Saarbrücken, Germany: LAP Lambert Academic Publishing.
Negatu, A. S. & Franklin, S. (2002). An action selection mechanism for 'conscious' software agents. Cognitive Science Quarterly, 2, 363-386.
Negatu, A. S., McCauley, T. L. & Franklin, S. (in review). Automatization for Software Agents.
Newell, A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press.
Norman, D. A. & Shallice, T. (1986). Attention to action: Willed and automatic control of behaviour. In R. J. Davidson, G. E. Schwartz, & D. Shapiro, editors, Consciousness and Self-Regulation: Advances in Research and Theory. Plenum Press.
Piaget, J. (1952). The Origins of Intelligence in Children. New York: International Universities Press.
Piaget, J. (1983). Piaget's theory. In P. Mussen (ed.), Handbook of Child Psychology. 4th edition. Vol. 1. New York: Wiley.
Pinker, S. (1997). How the mind works. New York: W. W. Norton & Company.
Post, E. L. (1943). Formal Reductions of the General Combinatorial Decision Problem. American Journal of Mathematics, 65(2), 197-215.
Rao, R. P. N. & Fuentes, O. (1998). Hierarchical learning of navigational behaviors in an autonomous robot using a predictive sparse distributed memory. Machine Learning, 31, 87-113.
Sacerdoti, E. D. (1977). A Structure for Plans and Behavior. Elsevier-North Holland.
Searle, J. (1992). The rediscovery of the mind. Cambridge, MA: MIT Press.
Shallice, T. (1988). From Neuropsychology to Mental Structure. Cambridge: Cambridge University Press.
Sloman, A. (1999). What Sort of Architecture is Required for a Human-like Agent? In Foundations of Rational Agency, ed. M. Wooldridge, and A. Rao. Dordrecht, Netherlands: Kluwer Academic Publishers.
Sun, R. (1997). Learning, action, and consciousness: A hybrid approach towards modeling consciousness. Neural Networks, 10(7), 1317-1331.
Sun, R. & Franklin, S. (2007). Computational Models of Consciousness: A Taxonomy and some Examples. In P. D. Zelazo & M. Moscovitch (Eds.), Cambridge Handbook of Consciousness (pp. 151-174). New York: Cambridge University Press.
Tate, A. (1990). Generating Project Networks. In J. Allen, J. Hendler, & A. Tate, editors, Readings in Planning. Morgan Kaufmann, 291-296.
Tate, A., Drabble, B. & Kirby, R. B. (1994). O-Plan2: an open architecture for command, planning, and control. In M. Fox & M. Zweben, editors, Intelligent Scheduling. Morgan Kaufmann, San Francisco, CA. 213-239.
Varela, F. J., Thompson, E. & Rosch, E. (1991). The Embodied Mind. Cambridge, MA: MIT Press.
Wilkins, D. E. (1990). Domain-independent Planning: Representation and Plan generation. In J. Allen, J. Hendler, & A. Tate, editors, Readings in Planning. Morgan Kaufmann, 319-335.
Wilkins, D. E. (2001). A Call for Knowledge-based Planning. AI Magazine, Spring, Vol. 22, No. 1.
Wilkins, D. E., Myers, K. L., Lowrance, J. D. & Wesley, L. P. (1995). Planning and reacting in uncertain and dynamic environments. Journal of Experimental and Theoretical AI, 1, 197-227.
Computer Science Research and Technology, edited by Karl C. Verdinand, Nova Science Publishers, Incorporated, 2010. ProQuest Ebook Central,
In: Computer Science Research and Technology Editor: Karl C. Verdinand, pp. 71-98
ISBN: 978-1-61728-688-9 © 2010 Nova Science Publishers, Inc.
Chapter 3
SPEECH ACTS AND PROSODIC MODELING IN SERVICE-ORIENTED DIALOG SYSTEMS
Christina Alexandris
National and Kapodistrian University of Athens, Hellas, Greece
1. Introduction
1.1. Specifications for Service Oriented Dialog Systems
Language-specific as well as culture-specific factors are observed to play a decisive role in User Specifications for spoken Human-Computer Interaction (HCI) Systems. The present approach aims to determine and define a finite set of re-usable, transferable and language-independent specifications for prosodic modeling, to be used as general parameters for the Speech Component in HCI Systems and, specifically, in Service-Oriented Dialog Systems, an application field of HCI usually directed to the General Public as a user group. Factors related to special applications, such as emotion recognition, and/or special user groups, such as children or handicapped users, are not included in the present analysis. The present specifications aim to limit empirical prosodic modeling and to provide a general framework facilitating both the construction and the evaluation of prosodic modeling, independently of the sublanguage-specific parameters chosen for the System. The proposed specifications target the features of Comprehensibility and User-friendliness in the spoken output produced by the System’s Speech Component, as well as the overall efficiency and reliability of the System’s performance. The definition of language-independent and language-related specifications requires a framework defining the purpose and the intended user group of the HCI System. Systems intended for Experts are characterized by controlled input, while Systems intended for the General Public entail culture-specific factors and hence pose restrictions in regard to input. The intended user group determines the User-System Relationship, which is here defined as a basic parameter of an HCI System. In the present approach, the HCI System’s purpose,
intended user group and the User-System Relationship are related to the definition of the Speech Acts, the dialog structure and the dialog content of the System. The dialog structure and dialog content are, in turn, related to the prosodic modeling of the System’s spoken output. This set of relations is summarized in Table 1. From a psychological perspective, previous studies of Systems involving Human-Computer Interaction (HCI) have demonstrated that the User is more likely to use the System and to overlook minor flaws if the System is user-friendly and if its spoken output is characterized by naturalness (Nottas et al., 2007). In this case, the User is also more likely to make the effort to gain the most benefit from the System’s capabilities, disregarding the System’s shortcomings.
1.2. The User-System Relationship and Related Parameters
The User-System Relationship is differentiated in respect to the physical form of the System (Base-Parameter 1). In particular, when the User communicates with a Robot, the User-System Relationship differs from the case in which the User communicates with a two-dimensional (2D) or three-dimensional (3D) Conversational Agent appearing on a screen. In the latter case, applicable in the present approach, the Conversational Agent is virtual, while in the former case the System is real.
Table 1. Relation of User-group, the User-System Relationship, Speech Acts, Dialog Structure and Prosodic Modeling
User-psychology may be considered an additional parameter determining the User-System Relationship. User-psychology (Base-Parameter 2) in respect to a System may be categorized into three basic types, excluding cases in which the User considers the System to be equal to a human, such as a friend or a relative. The three basic User-psychology types are the following: (a) the System as an “Instrument” User-psychology, (b) the System as a “Servant” User-psychology and (c) the System as a “Pet” User-psychology. A further type that may be identified, containing characteristics of types (a), (b) and (c), is the “Car Owner” User-psychology (d). The latter is more characteristic of “real” (tangible) HCI Systems such as Robots.
The User-System Relationship is also affected by Task-type (Base-Parameter 3), depending on whether (a) the System’s function is only the mechanical or program execution of a task, (b) the tasks performed by the System are of Service quality, or (c) special tasks, such as entertainment and social company, are assigned to the System. Furthermore, the User-System Relationship is influenced by Language and Culture-specific parameters (Base-Parameter 4), namely how familiar the Users are with the Systems and what type of language style and form of communication is expected to be used in the interaction. An additional factor determining the User-System Relationship is the Initiative factor (Base-Parameter 5). In some Systems, tasks are initiated by the System and controlled by the User: the User’s role is mainly that of providing responses, confirmations, approval and disapproval of the tasks executed by the System. In other cases, all tasks are initiated by the User and controlled by the System. In these cases, the System checks whether the task has been performed successfully and without errors.
1.3. Base Parameters of Present Analysis
In the present approach, involving Service-Oriented dialog systems and interaction with a virtual speaker or Conversational Agent (Base-Parameter 1), we focus on experience gained from four European Union (EU) projects involving Speech Technology for social services and Human-Computer Interaction. Therefore, in the present approach, Language and Culture-specific parameters (Base-Parameter 4) are related to most European countries, including variations related to the Mediterranean. The EU Projects concerned are the SOPRANO Project (involving smart environments and services for the General Public, http://www.soprano-ip.org/), the HEARCOM Project (speech technology applications for Users with hearing problems, http://hearcom.eu/main.html), the ERMIS Project (emotionally sensitive HCI systems with Conversational Agents, http://www.image.ntua.gr/ermis/) and the AGENT-DYSL Project (involving speech technology applications and dyslexia, http://www.agent-dysl.eu/). Data is obtained from User Requirements Analysis in Work Packages, recorded data, questionnaires distributed to Users, dialog modeling corpora and studio recordings. We note that all the above-mentioned projects involve as a Project Partner the Institute for Language and Speech Processing (ILSP), Athena Research and Innovation Center in Information, Communication and Knowledge Technologies, in Athens. In relation to the above-presented Language and Culture-specific parameters (Base-Parameter 4), the following requirements are specified in the present approach: a combination of (a) the System as an “Instrument” User-psychology and (b) the System as a “Servant” User-psychology (Base-Parameter 2), as well as the Task-types (Base-Parameter 3), specified as a combination of (a) mechanical or program task execution and (b) tasks of service performance quality.
Following the typical strategy employed for achieving a maximum recognition rate in applications with highly varied user-input, as is the case with the General Public as a user group, the dialog is controlled by the System. The Initiative factor (Base-Parameter 5) involved in the present framework is therefore the “System Initiative” strategy.
Table 2. Base Parameters of present approach
General Specifications: Base Parameters
Base-Parameter 1, Physical Form of System: Virtual speaker / Conversational Agent
Base-Parameter 2, User-Psychology: (a) the System as an “Instrument” and (b) the System as a “Servant” User-psychology
Base-Parameter 3, Task-types: (a) mechanical or program task execution, (b) service performance
Base-Parameter 4, Language and Culture-Specific factors: European (with variations)
Base-Parameter 5, Initiative: “System-Initiative”
The above-presented Base-Parameters (1-5) and respective features constitute the basic framework for the development and application of the proposed specifications for prosodic modeling. These Base-Parameters (1-5) are summarized in Table 2.
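The five Base-Parameters can be read as a small configuration record. The following is a minimal sketch in Python, assuming hypothetical names (`BaseParameters`, `Initiative`, `PRESENT_APPROACH`) and string labels that are not part of the original specification; it only restates the values given for the present approach.

```python
from dataclasses import dataclass
from enum import Enum


class Initiative(Enum):
    """Base-Parameter 5: who initiates the tasks (labels are illustrative)."""
    SYSTEM = "system-initiative"
    USER = "user-initiative"


@dataclass(frozen=True)
class BaseParameters:
    """Hypothetical container for Base-Parameters 1-5."""
    physical_form: str        # Base-Parameter 1
    user_psychology: tuple    # Base-Parameter 2
    task_types: tuple         # Base-Parameter 3
    language_culture: str     # Base-Parameter 4
    initiative: Initiative    # Base-Parameter 5


# Values chosen in the present approach, as stated in the text.
PRESENT_APPROACH = BaseParameters(
    physical_form="virtual speaker / conversational agent",
    user_psychology=("instrument", "servant"),
    task_types=("mechanical or program task execution", "service performance"),
    language_culture="European (with variations)",
    initiative=Initiative.SYSTEM,
)
```

A frozen record is used here only to reflect that these parameters are fixed before prosodic modeling begins; other representations would serve equally well.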
2. Speech Acts and Dialog Structure: The Speech Act – Oriented Approach in Dialog Systems
2.1. Introduction
In the present approach, prosodic modeling specifications are directly related to Speech Act type, which determines the dialog structure and the dialog content of the HCI System. Within the framework of Service-Oriented dialog systems, Speech Acts are differentiated into (I) Task-related Speech Acts and (II) Non-Task-related Speech Acts. In Task-related Speech Acts, the content is usually standard and relatively language-independent. Furthermore, User expectations are predefined and also relatively language- and culture-independent. In Task-related Speech Acts, a Controlled language-like approach can be applied to facilitate Speech Recognition and semantic processing and to standardize System input and output (Alexandris, 2009). Moreover, it has been observed that controlled language-like specifications originating from English (Smart, 2006; Wojcik and Holmback, 1996) and German (Lehrndorfer, 1996) are mostly applicable in Greek (Alexandris, 2009). In Non-Task-related Speech Acts, the content is often observed to be non-standard, may be highly dependent on user requirements and acceptable socio-cultural norms, and is therefore mostly language-dependent. Specifically, in the languages concerned, namely English, German and Greek, a balance between differences and similarities has been observed. In this case, a “lax” Controlled language-like approach can be adopted. Both the Controlled language-like approach for Task-related Speech Acts and the “lax” Controlled language-like approach for Non-Task-related Speech Acts presented here can be applied in Work Packages for User Requirements and Dialog Modeling and may also provide a basis for prosodic modeling processes in languages other than English, German and Greek.
2.2. Relating Speech Acts and Steps in Task-Related Dialog Structure
Speech Acts for Task-related Dialogs (Heeman et al., 1998) can be related to utterances produced by a Conversational Agent in Dialog Systems. Specifically, in Task-related dialogs (for example, spoken technical texts), the content of the utterances produced by the Conversational Agent is related to the Speech Acts for Task-related Dialogs (Heeman et al., 1998; Alexandris, 2009). The Speech Acts for Task-related Dialogs involve speech acts related to user-input recognition (“Acknowledge”), confirmation of user-input (“Confirm”), checking task completion or task success requested or activated by the user (“Check”), providing the user with necessary information or informing the user about requested data, task success or failure, or the current status of the process or system (“Inform”), and handling of waiting time (“Filled Pause”) (Heeman et al., 1998). The System may ask the user to provide specific input (“Request”) and expect the user’s response (“Respond”) (Heeman et al., 1998). For reasons of efficiency, in many dialog systems a considerable percentage of the questions asked by the System constitute “Yes/No Questions” (“Yes/No Question”), requiring a “Yes” or a “No” as an answer from the user (“Yes/No Answer”) (Heeman et al., 1998). In dialog systems involving Task-related Dialogs, for example, spoken technical texts, steps in the dialog structure may be related to more than one Speech Act. Specifically, steps in the dialog structure involving the recognition of the user’s answer and/or keyword recognition in user-input may be related to the “Acknowledge”, “Request” or “Y/N Question” Speech Acts, as in the respective examples of utterances produced by the System (or the System’s Conversational Agent), namely “You have chosen the ‘Abort’ option” (“Acknowledge”), “Please enter the requested date. Please press ‘1’” (“Request”) (Nottas et al., 2007) and “Do you wish to execute the program?” (“Y/N Question”).
Problems in the processing of user-input and/or errors in keyword recognition in user-input may be related to both the “Check” and the “Request” Speech Acts, as in the examples of the produced utterances “Input cannot be processed” and “Your input cannot be processed. Please repeat” (“Check”/“Request”).
Table 3. Relation of Step in Task-related Dialog Structure and Speech Act
Speech Act: Step(s) in Dialog Structure
Y/N Question: Answer / Keyword Recognition
Inform: Problems or errors in Answer / Keyword Recognition; Free Input; Close Dialog
Request: Answer / Keyword Recognition; Problems or errors in Answer / Keyword Recognition; Free Input
Check: Problems or errors in Answer / Keyword Recognition; Free Input
Confirm: Close Dialog
Filled Pause: All steps in Dialog Structure
Acknowledge: Answer / Keyword Recognition
Input provided by the user that does not constitute a “Yes/No Answer” or is not related to keyword recognition (Free input) can be followed by the Speech Acts “Check”, “Inform” or
“Request” as in the respective examples of utterances produced by the System “We assume that you have completed the process” (“Check”) (Nottas et al., 2007), “You still have 30 seconds to file your complaint” (“Inform”) and “Please add any further information you consider important” (“Request”) (Nottas et al., 2007). The Speech Acts “Confirm” and “Inform” may concern the closing of the dialog between System and User, as shown in the respective examples “Your entry has been successfully registered” (“Confirm”) (Nottas et al., 2007) and “Your entry has been registered as No IE6780923478” (“Inform”). Waiting time for the processing of user-input or for the completion of a process is handled by appropriate messages produced by the System such as “Please wait for two seconds” (Nottas et al., 2007), identified as a “Filled Pause” Speech Act.
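The relation between steps in the dialog structure and Speech Acts described in this section can be sketched as a lookup table. The snippet below is an illustrative Python sketch, not part of the original systems. The step identifiers and function names are hypothetical. The assignment of “Inform” to recognition problems follows one reading of Table 3 and the “Inform/Check” label used for “Your input cannot be processed”.

```python
# Step in dialog structure -> Speech Acts that may realise it.
# Step identifiers are hypothetical labels for the steps named in the text.
STEP_TO_SPEECH_ACTS = {
    "answer_keyword_recognition": {"Y/N Question", "Request", "Acknowledge"},
    "recognition_problem": {"Inform", "Request", "Check"},
    "free_input": {"Inform", "Request", "Check"},
    "close_dialog": {"Inform", "Confirm"},
}

# "Filled Pause" handles waiting time and may occur at any step.
ALL_STEP_ACTS = {"Filled Pause"}


def speech_acts_for(step):
    """Return the Speech Acts available at a given dialog step, sorted by name."""
    return sorted(STEP_TO_SPEECH_ACTS[step] | ALL_STEP_ACTS)


def steps_for(speech_act):
    """Return the dialog steps at which a given Speech Act may be used."""
    if speech_act in ALL_STEP_ACTS:
        return sorted(STEP_TO_SPEECH_ACTS)
    return sorted(s for s, acts in STEP_TO_SPEECH_ACTS.items() if speech_act in acts)
```

For instance, `steps_for("Check")` lists the free-input and recognition-problem steps, in line with the examples “We assume that you have completed the process” and “Your input cannot be processed”.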
3. Prosodic Modeling and Speech Acts for Task-Related Dialog
3.1. Prosodic Emphasis and Word Category
Prosodic modeling constitutes one of the typical strategies used in Dialog Systems (Kellner, 2004), contributing to the task efficiency and service efficiency of the system (Moeller, 2005). In task-related applications for the General Public, prosodic modeling is observed to contribute to the achievement of clarity, lack of ambiguity and a user-friendly style (Alexandris, 2007; Alexandris, 2008). For the requirements of the General Public as a User-Group, these targets are summarized as “Comprehensibility” and “User-Friendliness”. In the present approach, the above-presented Base-Parameters (1-5) and respective features constitute the basic framework for the development and application of the proposed prosodic modeling specifications, with data obtained from European Union Projects (English and German data) and National Projects (Greece). In the proposed approach, prosodic modeling of all utterances related to the Speech Acts for Task-related Dialogs is based on the use of prosodic emphasis on the sublanguage-specific elements constituting the most important information in the sentence’s semantic content, as well as on sublanguage-independent elements such as negations and elements expressing time, space (movement), quality and quantity (Alexandris, 2009). Specifically, prosodic emphasis on negations and on elements expressing time, space (movement), quality and quantity is used for the achievement of Precision (Alexandris, 2008), while prosodic emphasis on sublanguage-specific expressions and terminology is used for the achievement of Comprehensibility resulting in Directness (Alexandris, 2009). In both Task-related and Non-Task-related Speech Acts, prosodic emphasis is used on the sublanguage-independent elements constituting negations and elements expressing time, space (movement), quality and quantity.
Sublanguage-specific elements receiving prosodic emphasis in monolingual and multilingual Task-related Speech Acts may constitute keywords grouped under ACTION-TYPE (Malagardi and Alexandris, 2009). These word groups involve expressions related to activities such as activating a program or controlling the environment, for example checking whether the power supply is turned off. Examples of keywords grouped under ACTION-TYPE are expressions contained within the Task-related Speech Acts and directly related to the execution of tasks, such as “open”, “close”, “closed”, “turn-on”, “turned-on”, “turn-off”, “turned-off”, “running”, “start”, “started”, “stop”, “stopped”, “pause”, “paused”,
“answer”, “accept”, “accepted”, “reject”, “rejected” and “lock” (data from the Speech Component of the SOPRANO Project, http://www.soprano-ip.org/). ACTION-TYPE expressions also include keywords related to the Task-related Speech Acts but not directly related to the execution of tasks, such as the expressions “understand” and “repeat”. We note that expressions composed of more than one word that have to be processed by the system as a single expression are presented with a dash “-” between the components. In some applications involving multilingual Task-related dialogs, prosodic emphasis may be used on sublanguage-specific elements that can be categorized as OBJECT-TYPE (Malagardi and Alexandris, 2009). Keywords grouped under OBJECT-TYPE comprise expressions related to objects involved in the activities concerned. Examples of such expressions are the words “door”, “oven”, “tap”, “television”, “lights”, “air-conditioner” and “thermostat”. The keyword group OBJECT-TYPE also includes expressions denoting predefined small objects, such as “pill”, or non-object-like entities such as “message”, “dinner” and “phone-call”. Therefore, utterances in the System’s Spoken Output constituting Task-related Speech Acts are characterised by prosodic emphasis given to at least two elements in the sentence, if it is considered that keywords constituting ACTION-TYPE are always related to keywords constituting OBJECT-TYPE (Malagardi and Alexandris, 2009), as in the examples “The air-conditioner is turned on” (“Inform” Speech Act) and “Shall I vacuum the floor?” (“Yes/No Question”). Examples from recorded data from the Speech Component of the SOPRANO Project are presented in Example 1.
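The grouping of keywords under ACTION-TYPE and OBJECT-TYPE can be sketched as a simple lexicon lookup that returns the elements of an utterance that are candidates for prosodic emphasis. The following Python sketch is illustrative only. The function name `tag_keywords`, the tokenisation and the joining of adjacent words with a dash are assumptions. The lexicon contains only the keywords listed above, so unlisted words such as “vacuum” are not tagged.

```python
import re

# Keywords listed in the text; multi-word expressions are joined with a dash.
ACTION_TYPE = {
    "open", "close", "closed", "turn-on", "turned-on", "turn-off", "turned-off",
    "running", "start", "started", "stop", "stopped", "pause", "paused",
    "answer", "accept", "accepted", "reject", "rejected", "lock",
    "understand", "repeat",
}
OBJECT_TYPE = {
    "door", "oven", "tap", "television", "lights", "air-conditioner",
    "thermostat", "pill", "message", "dinner", "phone-call",
}
LEXICON = {**{w: "ACTION-TYPE" for w in ACTION_TYPE},
           **{w: "OBJECT-TYPE" for w in OBJECT_TYPE}}


def tag_keywords(utterance):
    """Return (keyword, group) pairs in order of appearance."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", utterance.lower())
    tagged = []
    i = 0
    while i < len(tokens):
        # Try a two-word expression first, e.g. "turned on" -> "turned-on".
        pair = "-".join(tokens[i:i + 2])
        if i + 1 < len(tokens) and pair in LEXICON:
            tagged.append((pair, LEXICON[pair]))
            i += 2
        elif tokens[i] in LEXICON:
            tagged.append((tokens[i], LEXICON[tokens[i]]))
            i += 1
        else:
            i += 1
    return tagged
```

Applied to “The air-conditioner is turned on”, the sketch yields one OBJECT-TYPE and one ACTION-TYPE keyword, which corresponds to the observation that at least two elements in such utterances receive prosodic emphasis.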
It may, therefore, be concluded that prosodic modeling of the utterances related to the Speech Acts for Task-related Dialogs is based on the use of prosodic emphasis on the sublanguage-specific elements constituting the most important information in the sentence’s semantic content, as well as sublanguage-independent elements such as negations and elements expressing time, space (movement), quality and quantity (Alexandris, 2008).
Example 1. Recorded data from the Speech Component of the SOPRANO Project. Short pauses are indicated as [Srt-P]
Speech Act: Y/N Question
“Do you wish to [Srt-P] answer [Srt-P] the door?”
“Would you like to [Srt-P] watch [Srt-P] the news?”
“Shall I [Srt-P] turn [Srt-P] on [Srt-P] the dishwasher?”
Speech Act: Inform
“The [Srt-P] air-conditioner is switched [Srt-P] on.”
“The [Srt-P] tap is [Srt-P] running.”
“You have [Srt-P] two [Srt-P] messages”
Speech Act: Request
“Please [Srt-P] take [Srt-P] your pill.”
“Would you like to [Srt-P] watch the [Srt-P] news or [Srt-P] the sports section?”
Speech Act: Check
“I [Srt-P] cannot [Srt-P] understand you. Please [Srt-P] repeat.”
Table 4. Relation of prosodic emphasis in Task-related Dialog Speech Acts and purpose of utterance
Parameter type: Sublanguage-independent
Elements receiving prosodic emphasis in Task-related Dialog Speech Acts: spatial, temporal and quantitative expressions; expressions related to manner and quality
Purpose: Achievement of precision
Parameter type: Sublanguage-specific
Elements receiving prosodic emphasis in Task-related Dialog Speech Acts: sublanguage-specific lexicon, expressions and terminology
Purpose: Achievement of directness
Specifically, prosodic emphasis on the sublanguage-independent elements constituting negations and elements expressing time, space (movement), quality and quantity is used for the achievement of Precision (Alexandris, 2008), while prosodic emphasis on sublanguage-specific expressions and terminology is used for the achievement of Comprehensibility resulting in Directness (Table 4). These specifications may be incorporated in a general framework for controlling spoken output, similar to strategies employed in Controlled Languages (Alexandris, 2009). For example, for the efficient handling of semantic content and/or for precision and directness in the interactions, the words “yes”, “no”, “packaging”, “execute”, “code” (sublanguage-specific expressions), “two minutes”, “thirty seconds” (quantity, time) and “cannot” (negation) receive prosodic emphasis in the respective sentences from the Spoken Output of the Speech Component of the CitizenShield dialog system for consumer complaints (National Project), presented in Example 2. We note that all translations from Modern Greek are rendered with proximity to original syntactic structure.
Example 2
1. SYSTEM: Please answer the following questions with a “yes” or a “no”. Was there a problem with the packaging? (Speech Act: Yes/No Question)
2. SYSTEM: Do you wish to execute the program? (Speech Act: Yes/No Question)
3. SYSTEM: What is the code of the container? (Speech Act: Request)
4. SYSTEM: Wait for two minutes (Speech Act: Filled Pause)
5. SYSTEM: You still have 30 seconds to file your complaint (Speech Act: Inform)
6. SYSTEM: Your input cannot be processed (Speech Act: Inform/Check)
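The distinction between emphasis for Precision and emphasis for Directness can be sketched as a small selection routine. The Python sketch below is an illustration under stated assumptions: the word lists `NEGATIONS` and `QUANTITY_TIME_WORDS` are small illustrative samples rather than a complete inventory, the set of sublanguage-specific terms is supplied by the caller, and the function name `emphasis_plan` is hypothetical.

```python
import re

# Illustrative samples only; a real system would need fuller inventories.
NEGATIONS = {"cannot", "not", "never"}
QUANTITY_TIME_WORDS = {"two", "thirty", "seconds", "minutes"}


def emphasis_plan(utterance, sublanguage_terms):
    """Return (word, purpose) pairs for the words that receive prosodic emphasis.

    Sublanguage-independent elements (negations, quantity and time
    expressions) serve precision; sublanguage-specific terms serve directness.
    """
    plan = []
    for word in re.findall(r"[A-Za-z0-9]+", utterance):
        lowered = word.lower()
        if lowered in NEGATIONS or lowered.isdigit() or lowered in QUANTITY_TIME_WORDS:
            plan.append((word, "precision"))
        elif lowered in sublanguage_terms:
            plan.append((word, "directness"))
    return plan
```

With the sublanguage-specific terms `{"packaging", "execute", "code"}`, the utterance “Do you wish to execute the program?” yields emphasis on “execute” for directness, while “Your input cannot be processed” yields emphasis on “cannot” for precision.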
In the languages concerned, prosodic emphasis is related to Comprehensibility and is observed to follow the same patterns in respect to the word category that is emphasized. This may be related to the fact that in Task-related Speech Acts the content is fixed (standard) and therefore specifications tend to be more language-independent. This is confirmed by the present data from the above-presented European Union Projects. In this respect, the proposed prosodic modeling may act as a Controlled language-like approach applied to facilitate Speech Recognition and semantic processing and to standardize System input and output. Furthermore, Controlled language-like specifications regarding written and spoken technical texts (Task-Oriented Dialogs) originating from English and German have been observed to be generally applicable in Greek (Alexandris, 2009).
3.2. Language-Specific Parameters in Respect to Prosodic Emphasis, Word Category and Semantic Content
3.2.1. Language-Specific Parameters in Greek
Previous studies have demonstrated a differentiation between word categories in which prosodic emphasis does not determine their semantic content (I) and word categories whose semantic content may be determined by prosodic emphasis (II) (Alexandris, 2008). In the first case, the semantic interpretation of the entire phrase or sentence may be determined by the type of element receiving prosodic emphasis, but the semantic content of the emphasized element itself is not affected. The group of word categories whose semantic content may be determined by prosodic emphasis, namely (1) spatial and temporal expressions, (2) a subgroup of quantifiers and numericals and (3) a subgroup of discourse particles identified as “politeness markers” (Alexandris and Fotinea, 2004), is classified as Category A or “Prosodically Determined” words (Table 5). For spatial and temporal expressions, and for the subgroup of quantifiers and numericals, the presence of prosodic emphasis signals an indexical interpretation (“exactly”), as opposed to a vague interpretation (Schilder and Habel, 2001) or a fixed expression (Alexandris, 2008), both of which are characterized by the absence of prosodic emphasis. For example, with prosodic emphasis there is an indexical interpretation of the spatial expression “'dipla” as “along” in the sentence “the crack was exactly along (parallel) to the band in the packaging”, as opposed to its vague interpretation as “next-to” in the same sentence. The same is observed for the temporal expression “'oso”, with its indexical interpretation as “for as long as” in the sentence “the array is created for as long as the loop is running”, as opposed to its vague interpretation as “while” in the same sentence.
Similarly, the numerical or quantificational expression “two” (“d'yo”) is used in its indexical and literal meaning when it receives prosodic emphasis in the sentence “wait for two minutes”, while, in the same sentence without prosodic emphasis, it is perceived as a fixed expression (“wait a moment”). For discourse particles identified as “politeness markers”, the absence of prosodic emphasis signals their use as politeness markers, while with the presence of prosodic emphasis they only have the property of discourse particles. Thus, absence of prosodic emphasis on the discourse particles “Tell me” ('pite mou) and “Maybe” ('mipos) signals positive politeness and friendliness towards the User in the following utterances produced by the Conversational
Agents in Task-related Dialog Systems: “Tell me ('pite mou), what is the product” (preferred utterance by Users) and “Maybe ('mipos) you want me to check the kitchen?” (Alexandris, 2007; Alexandris, 2008).
Table 5. Relation of Words of Category II to Prosodic Emphasis (prosody and semantics of individual words)
Spatial and Temporal Expressions:
Presence of Prosodic Emphasis = indexical interpretation (= “exactly”)
Absence of Prosodic Emphasis = vague interpretation
Quantifiers and numericals:
Presence of Prosodic Emphasis = indexical interpretation (= “exactly”)
Absence of Prosodic Emphasis = fixed expression
Discourse Particles used as Politeness Markers:
Presence of Prosodic Emphasis = discourse particles, not associated with politeness
Absence of Prosodic Emphasis = discourse particles as politeness markers
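The relations in Table 5 amount to a lookup from word class and presence of prosodic emphasis to an interpretation. The following minimal Python sketch encodes that lookup. The names `WordClass` and `interpret` and the interpretation labels are hypothetical and are chosen only for illustration.

```python
from enum import Enum


class WordClass(Enum):
    """Word classes whose semantic content may be determined by prosody."""
    SPATIAL_TEMPORAL = "spatial or temporal expression"
    QUANTIFIER_NUMERICAL = "quantifier or numerical"
    POLITENESS_PARTICLE = "discourse particle used as politeness marker"


# (word class, emphasis present?) -> interpretation, following Table 5.
INTERPRETATION = {
    (WordClass.SPATIAL_TEMPORAL, True): "indexical",
    (WordClass.SPATIAL_TEMPORAL, False): "vague",
    (WordClass.QUANTIFIER_NUMERICAL, True): "indexical",
    (WordClass.QUANTIFIER_NUMERICAL, False): "fixed expression",
    (WordClass.POLITENESS_PARTICLE, True): "discourse particle",
    (WordClass.POLITENESS_PARTICLE, False): "politeness marker",
}


def interpret(word_class, emphasized):
    """Return the interpretation of a word given its class and emphasis."""
    return INTERPRETATION[(word_class, emphasized)]
```

For example, “two” in “wait for two minutes” would be looked up as `interpret(WordClass.QUANTIFIER_NUMERICAL, True)` for the literal reading and with `False` for the fixed-expression reading (“wait a moment”).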
The group of word categories where prosodic emphasis may emphasize or intensify, but may not determine, the semantic content is classified as Category B or “Prosodically Sensitive” words. This group involves (1) adjectives expressing quality and (2) adverbs expressing mode perceptible to the senses, used in a literal, non-metaphorical way. For example, prosodic emphasis on the adjective “round” (“strogi 'lo”) in the sentence “It was in a round box” signals the meaning “truly/par excellence round”. Similarly, prosodic emphasis on the adverb “upside down” (“an'apoda”), for example in the sentence “I turned it upside down”, signals the meaning “completely upside down”. Both Category A and Category B type prosodic emphasis may be used (a) for the correct interpretation of Speaker Input in the respective Automatic Speech Recognition (ASR) Modules and (b) for achieving user-friendliness in man-machine communication, in the sense of “accuracy” and “directness” (Hausser, 2006) towards the user in the Conversational Agent’s spoken output. The rest of the word categories, which are not affected by prosodic emphasis in respect to their semantic content, are classified as Category C or “Prosodically Independent” words. The presence or absence of prosodic emphasis on words of Category C only affects the semantic interpretation of the entire phrase or sentence to which they belong. A significant percentage of these words are nouns or verbs, and they may constitute sublanguage-specific keywords. Prosodic emphasis on keywords focuses on the basic content of the utterance, for example, whether it is an action in question, in the case of a verb, or a specific object in question, in the case of a noun. Prosodic emphasis on Category C words is sentence-dependent and highly sublanguage- and application-specific.
Prosodic emphasis on elements of Category C is used (a) for determining the basic content of the Speaker’s input, (b) for directing the Speaker’s input towards a keyword-specific answer and (c) for achieving accuracy and directness in the Conversational Agent’s output. For example, in the sentence “Please tell us any additional information you wish about the product or about your transaction” the keywords “additional”, “product” and “transaction” receive prosodic
emphasis for clarity towards the Users and simultaneously direct the User towards a respective keyword-specific answer, in this case “product-type” and “transaction-type”. Similarly, a “Yes/No” Answer is requested with the use of prosodic emphasis either on “check” or on “thermostat” in the question “Shall I check the thermostat?” In contrast to both A and B word categories, or “Prosodically Determined” and “Prosodically Sensitive” words, whose plus or minus (±) prosodic emphasis features can be systematically used in various Speech Technology Applications, including Text-to-Speech (TTS) and Automatic Speech Recognition (ASR), the prosodic modeling of Category C or “Prosodically Independent” words is highly sublanguage-dependent and application-specific.
3.2.2. Prosodic Emphasis and Word Category in Other Languages
The above-described relationship between prosodic emphasis and the semantic content of word and word category is not observed to be compatible with English and German, at least in the dialogs in Service-Oriented HCI and in respect to the above-presented European Union projects. (1) Language use and (2) differences in respect to morphosyntactic features can account for a different relationship between prosodic emphasis and the above-presented word categories in English and German. In this case, prosodic emphasis does not influence the actual semantic content and, from an application perspective, may be classified as prosodic modelling of Category C for “Prosodically Independent” words. Specifically, in regard to spatial and temporal expressions, quantifiers and numerals, in both English and German, prosodic modelling does not influence semantic content, in this case the indexical versus vague interpretation, and the indexical interpretation is achieved with the use of the respective adverbial modifiers. In particular, a wider use of indexical-type expressions is observed, either with the more extensive use of modifiers (adjectives or adverbs), for example “right above”, or with the morphological structure and semantic content of the individual words themselves, for example “darüber” (“right above”) in German.
Exceptions, however, do exist in respect to the relationship between prosodic emphasis and semantic content of word and word category. For instance, some compatibility is observed, mostly in American English, in respect to the adjectives in Category B as “Prosodically Sensitive” words, for example the expression “big” (which, if receiving prosodic emphasis, may be equivalent to “really big”).
It is also observed that, in a similar way to Greek, the presence of prosodic emphasis on English and German expressions partially equivalent to the Politeness Markers in Greek gives a rather unfriendly effect to spoken utterances produced by the System. For example, prosodic emphasis on the expressions “Tell me”, “Please” or “Bitte” (“Please”) in German, equivalent to the Politeness Markers in Greek, is observed to render them harsh and unfriendly, at least in regard to the above-mentioned data from the European Union projects in Speech Technology for social services and Human-Computer Interaction.

Table 6. Speech Acts and Goals in Service-Oriented Applications for the General Public
Task-related Speech Acts: Goal: (1) Perform Task successfully
Non-Task-related Speech Acts: Goal: (1) Perform Task successfully; (2) User satisfaction / User-friendliness
It can, therefore, be noted, at least for the data concerned, that in English and German, in the above-presented word categories, prosodic emphasis is basically used for emphasis but does not interfere with the actual semantic content of the expressions.
4. Prosodic Modeling and Non-task-related Speech Acts in Service-Oriented Dialogs
4.1. Introduction
In Task-related Speech Acts, the Human-Computer interaction has essentially one Goal, namely the successful performance of the activated or requested task. In Non-Task-related Speech Acts, the Human-Computer interaction taking place is directed towards two Goals, namely (1) the successful performance of the activated or requested task and (2) User satisfaction and User-friendliness. The second Goal is related to requirements on the Satisfaction Level in respect to a System’s evaluation criteria, namely perceived task success, comparability of human partner and trustworthiness (Moeller, 2005), and constitutes a basic issue in Non-Task-related Speech Acts. It should be noted that the more Goals are to be achieved, the more parameters are to be considered in the System Design and System Requirements, and subsequently in the Dialog Design (Wiegers, 2005). Prosodic modelling for the Non-Task-related Speech Acts may, therefore, be characterized as a complex task.
4.2. Defining Non-task-related Speech Acts in Dialog Structure
Data from European Union projects in Speech Technology for social services and Human-Computer Interaction in English, German and Greek allow the formulation of a general categorization scheme of Non-Task-related Speech Acts. Specifically, Non-Task-related Speech Acts can be divided into three main categories: Speech Acts constituting an independent step in dialog structure (Category 1), Speech Acts attached to other Speech Acts and constituting with them a single step in dialog structure (Category 2) and Speech Acts constituting an optional step in dialog structure in Service-Oriented dialogs (Category 3). Speech Acts of Category 3 constitute a marginal category, between Non-Task-related Speech Acts and the strictly Task-related Speech Acts.
Non-Task-related Speech Acts of Category 1 involve (1.1) the “Open Dialog-Greeting” and (1.2) the “Close Dialog” Speech Acts. Examples of Non-Task-related Speech Acts of Category 1 are the utterances “Hello” (“Open Dialog-Greeting”) and “Thank you for using the Quick-Serve Interface” (“Close Dialog”).
Non-Task-related Speech Acts of Category 2, attached to other Speech Acts, are the Speech Acts related to the “Error” concept (2.1), namely (a) “Apologize” and (b) “Justify”, the Speech Act “Introduce-new-task” (2.2) and Speech Acts related to the “Delay” concept (2.3), namely (a) “Inform-delay” and (b) “Manage-waiting-time”. Examples of Speech Acts in Category 2.1 are utterance (a) “I am sorry” (“Apologize”) following or preceding the Task-Related Speech Act (“Inform”): “Your input cannot be processed”, utterance (b) “I cannot understand your request” (“Justify”) following or
preceding the Task-Related Speech Act (“Request”): “Please repeat”. An example of the “Introduce-new-task” Speech Act (2.2) is the utterance “To provide better services for you, I will ask you a few more questions.” following or preceding the Task-Related Speech Act (“Y/N Question”): “Would you like to create a member’s account?” Examples of Speech Acts in Category 2.3 are utterance (a) “This might take a few seconds” (“Inform-delay”) following or preceding the Task-Related Speech Act (“Filled pause”): “Please wait. Your input is being processed”, and utterance (b) “Do you wish to proceed?” (“Manage-waiting-time”) following or preceding the Task-Related Speech Act (“Filled pause”): “If you have finished with your input, please press OK”.
A Non-Task-related Speech Act of Category 2 that usually follows a Task-related Speech Act is the “Thank” Speech Act (2.4). Typical examples of the “Thank” Speech Act are the utterances “Thank you for your input” and “Thank you for using Quick-Serve”. We note that the “Thank” Speech Act may in some cases be optional and that in other cases it may coincide with the “Close-Dialog” Speech Act (1.2). A Non-Task-related Speech Act of Category 2 that usually precedes a Task-related Speech Act is the “Attention-alert” Speech Act (2.5), alerting the User to the content of the Task-related Speech Act that is going to follow. Similarly to the “Thank” Speech Act, the use of the “Attention-alert” Speech Act is optional and dependent on User Requirements (for example, Elderly Users with hearing problems and/or a tendency to forgetfulness). The “Attention-alert” Speech Act may involve the calling of the User’s name or the production of utterances requiring the User’s attention such as “Attention” and “Mr. X, this is important”.
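Since Category 2 Speech Acts are attached to a Task-related Speech Act and form a single dialog step with it, the attachment can be sketched as follows. This is only an illustration under assumptions: the function `compose_step` and its `position` parameter are hypothetical and do not come from the projects described.

```python
def compose_step(task_utterance, attached=None, position="before"):
    """Build one dialog step from a Task-related utterance and an optional
    Category 2 (Non-Task-related) utterance that precedes or follows it."""
    if attached is None:
        # Optional acts such as "Thank" or "Attention-alert" may be omitted.
        return task_utterance
    if position == "before":
        return f"{attached} {task_utterance}"
    if position == "after":
        return f"{task_utterance} {attached}"
    raise ValueError("position must be 'before' or 'after'")


# "Apologize" preceding the Task-related "Inform" Speech Act.
print(compose_step("Your input cannot be processed.", "I am sorry."))
# "Thank" following a Task-related Speech Act.
print(compose_step("Your input is being processed.", "Thank you for your input.",
                   position="after"))
```

Whether an optional act is attached at all would depend on User Requirements, which this sketch leaves to the caller.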
Table 7. Non-Task-related Speech Acts
Category 1: Speech Acts constituting an independent step in dialog structure
(1.1) “Open Dialog-Greeting”
(1.2) “Close Dialog”
Category 2: Speech Acts attached to other Speech Acts
(2.1) “Error”: (a) “Apologize”, (b) “Justify”
(2.2) “Introduce-new-task”
(2.3) “Delay”: (a) “Inform-delay”, (b) “Manage-waiting-time”
(2.4) “Thank”
(2.5) “Attention-alert”
Category 3: Speech Acts constituting an optional step; marginal category between Task-related and Non-Task-related Speech Acts
(3.1) “Optional-Information”: (a) “Offer”, (b) “Reminder”
(3.2) “Initiate-Conversation”
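The categorization in Table 7 could be held in a dialog manager as a small lookup structure. The Python sketch below transcribes the table; the class `SpeechAct`, the code strings such as "2.1a" and the helper functions are hypothetical conveniences, not part of the original scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechAct:
    code: str      # numbering from Table 7, with (a)/(b) appended, e.g. "2.1a"
    name: str      # Speech Act label used in the text
    category: int  # 1, 2 or 3


NON_TASK_SPEECH_ACTS = [
    SpeechAct("1.1", "Open Dialog-Greeting", 1),
    SpeechAct("1.2", "Close Dialog", 1),
    SpeechAct("2.1a", "Apologize", 2),
    SpeechAct("2.1b", "Justify", 2),
    SpeechAct("2.2", "Introduce-new-task", 2),
    SpeechAct("2.3a", "Inform-delay", 2),
    SpeechAct("2.3b", "Manage-waiting-time", 2),
    SpeechAct("2.4", "Thank", 2),
    SpeechAct("2.5", "Attention-alert", 2),
    SpeechAct("3.1a", "Offer", 3),
    SpeechAct("3.1b", "Reminder", 3),
    SpeechAct("3.2", "Initiate-Conversation", 3),
]

# Role of each category in dialog structure, paraphrased from the table.
CATEGORY_ROLE = {
    1: "independent step",
    2: "attached to another Speech Act",
    3: "optional step",
}


def acts_in_category(category):
    """Names of all Non-Task-related Speech Acts in the given category."""
    return [act.name for act in NON_TASK_SPEECH_ACTS if act.category == category]


def role_of(name):
    """Dialog-structure role of a Speech Act, looked up by its label."""
    for act in NON_TASK_SPEECH_ACTS:
        if act.name == name:
            return CATEGORY_ROLE[act.category]
    raise KeyError(name)


print(acts_in_category(3))
print(role_of("Thank"))
```

Because the text allows for more specialized subsets and additional categories, a flat list like this is easy to extend per HCI System type.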
Non-Task-related Speech Acts of Category 3, constituting an optional step in dialog structure in Service-Oriented dialogs, involve the Speech Acts related to the “Optional-Information” concept (3.1), namely (a) the “Offer” and (b) the “Reminder” Speech Acts. An additional Speech Act type that may be classified under Category 3 is the “Initiate-Conversation” Speech Act (3.2). In the Non-Task-related Speech Acts of Category 3, the System may take its own initiative and “barge in” on the standard dialog. Non-Task-related Speech Acts of Category 3 are optional steps in dialog structure and are directly dependent on User Requirements. Examples of Speech Acts in Category 3.1 are utterance (a) “Let me assist you with this process” (“Offer”) and utterance (b) “You still have two minutes to complete this transaction” (“Reminder”). The “Initiate-Conversation” Speech Act may ask the User if additional information is requested on topics such as the weather and news reports. Additionally, the “Initiate-Conversation” Speech Act may also be used as a form of managing waiting time by receiving free input from the User, which may then be filed in the System’s database. For example, the System may ask “Please tell me what you believe must be done to improve our services?”
Finally, it should be noted that variations or more specialized subsets of the present Non-Task-related Speech Acts categorization may be identified, according to HCI System type and User Requirements. Furthermore, possible additional categories may be included. All Non-Task-related Speech Acts are presented in Table 7.
4.3. Prosodic Modeling and Non-task-related Speech Acts in Service-Oriented Applications
4.3.1. Introduction
Prosodic modeling for utterances produced by the Conversational Agent that constitute Non-Task-related Speech Acts, namely Speech Acts that are not directly related to the Speech Acts for Task-related Dialogs, is observed not to be entirely empirical, although it is highly dependent on the style and type of Conversational Agent chosen for an HCI system. Specifically, it has been observed that some elements related to User-friendliness are not system-specific and can be transferred to other HCI applications involving Service-Oriented dialog systems.
4.3.2. Prosodic Modeling and the User-System Relationship in Greek
It has been observed that for languages like Greek, where friendliness is related to directness and spontaneity, constituting features of Positive Politeness (Sifianou, 2001), User-friendliness can be achieved, from a prosodic aspect, with prosodic emphasis on expressions related to the User-System Relationship. These types of expressions can be subsumed under the general category of expressions involving the System’s or User’s positive intention or cooperation and may be related to respective Speech Acts. Thus, the focus is placed on the content of the Non-Task-related Speech Acts based on the User-System Relationship. This is unlike the standard content of Task-related Speech Acts, which, in contrast, focus on the actual task to be executed.
Specifically, in Greek Service-oriented HCI applications, prosodic emphasis is given to words signalizing the User-System Relationship (“Usr-Sys-Rel” words), namely verbs expressing the system’s intention (action) to serve the user, expressions (usually verbs) indicating the system’s apologies, failure or error in respect to a task executed to serve the user, verbs expressing the user’s actions or intentions, nouns expressing a task related to the actual interaction involving good intentions (“cooperation”) and nouns expressing a task related to the system’s services. These word categories may be described as expressions related to the System’s positive attitude toward the User. In the following examples (Example 2 and Example 3) from the CitizenShield dialog system (Nottas et al., 2007), the above-listed types of words signalizing the User-System Relationship receive prosodic emphasis (“Usr-Sys-Rel prosodic emphasis”, indicated in bold print). These words are the expressions “sorry” (a system-intention noun in Greek), “ask” (Greek: “make”; system-service verb), “thank” (system-intention verb) and “completed” (user-action verb). At this point, it is important to stress that not all Non-Task-related Speech Acts necessarily contain “Usr-Sys-Rel” expressions. It should, additionally, be noted that expressions signalizing negations, temporal and spatial information, quantity and quality, as well as sublanguage-specific task-related expressions categorized under “ACTION-TYPE” and “OBJECT-TYPE” (as demonstrated in 3.1), receive prosodic emphasis by default (“default prosodic emphasis”, indicated in italics). In Example 3, these are the expressions “not”, “some”, “correctly”, “very much”, “more” and “additional” (Alexandris, 2007).
Example 3
1. Δεν είμαι σίγουρος ότι κατάλαβα σωστά (“Justify”)
2. Συγγνώμη, δεν σας άκουσα (“Justify”)
3. Θα σας κάνω μερικές ερωτήσεις ακόμα (“Introduce-new-task”)
4. Σας ευχαριστούμε πολύ για την συνεργασία σας (“Thank”)
5. Σας ευχαριστώ για τα επιπλέον στοιχεία (“Thank”)
6. Προφανώς ολοκληρώσατε με τις επιπλέον πληροφορίες (“Reminder”)
Translations close to the syntax of original spoken utterances
1. I am not sure that I understood correctly (“Justify”)
2. I am sorry, I could not hear you. (“Justify”)
3. I will (Greek: make) you some more questions (“Introduce-new-task”)
4. We thank you very much for your cooperation (“Thank”)
5. I thank you for the additional information (“Thank”)
6. You have obviously completed providing the additional input (“Reminder”)
Therefore, in contrast to Task-related Speech Acts, where Prosodic Emphasis for Prosodic Modeling is given to keywords from the Sublanguage-specific lexicon, expressions and terminology, in Non-Task-related Speech Acts Prosodic Emphasis for Prosodic Modeling is additionally given to specific Word-groups signalizing the User-System Relationship, namely (1) system-service verbs, (2) system-intention verbs, (3) system-service nouns, (4) system-intention nouns, (5) user-intention verbs and (6) user-action verbs. We note that, in
accordance with the complex nature of the Goals that need to be reached in Non-Task-related Speech Acts, prosodic modeling specifications are equally complex.
Relation of “Usr-Sys-Rel prosodic emphasis” and “default prosodic emphasis”
Additionally, it should be stressed that in Non-Task-related Speech Acts, “Usr-Sys-Rel prosodic emphasis” has priority over “default prosodic emphasis” in respect to amplitude. Specifically, in Non-Task-related Speech Acts, the amplitude of the prosodic emphasis on Usr-Sys-Rel expressions is intended to be slightly higher than the amplitude of the prosodic emphasis on expressions receiving default prosodic emphasis. This specification is in accordance with the Goal of User Satisfaction and User-friendliness for Non-Task-related Speech Acts presented in Table 6. We also note that the above-described features constitute basic specifications and are subject to further “streamlining” when processed by prosodic modeling tools. The sublanguage-specific User-System Relationship Word groups observed in Greek Service-oriented HCI applications are presented in Table 8. Beyond the above-presented framework, it may be added that, in the Non-Task-related Speech Act “Attention-alert” (Category 2.5), prosodic emphasis is also given to a small group of directives such as “listen” and “take” and to exclamations, including the calling of the User’s name.

Table 8. User-System Relationship Word groups (Greek)
Verbs:
(1) verbs expressing the system’s intention (action) to serve the user (system-service verbs)
(2) verbs expressing the system’s apologies, failure or error in respect to a task executed to serve the user (system-intention verbs)
(3) verbs expressing the user’s actions or intentions (user-action/intention verbs)
Nouns:
(4) nouns expressing a task related to the actual interaction involving good intentions (e.g. “cooperation”) (system-intention nouns)
(5) nouns expressing a task related to the system’s services (system-service nouns)
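The amplitude priority of Usr-Sys-Rel emphasis over default emphasis can be sketched as a simple per-word gain assignment. The text only specifies that the Usr-Sys-Rel amplitude is “slightly higher”; the numeric gain values, the function names and the word sets below are therefore assumptions chosen for illustration, using utterance 5 of Example 3 in translation.

```python
# Hypothetical amplitude factors; the text only requires
# USR_SYS_REL_GAIN to be slightly higher than DEFAULT_GAIN.
NEUTRAL_GAIN = 1.0
DEFAULT_GAIN = 1.2
USR_SYS_REL_GAIN = 1.3


def emphasis_gain(word, usr_sys_rel_words, default_words):
    """Return the amplitude factor for one word.

    Usr-Sys-Rel membership is checked first, so it takes priority if a word
    appears in both groups."""
    key = word.lower()
    if key in usr_sys_rel_words:
        return USR_SYS_REL_GAIN
    if key in default_words:
        return DEFAULT_GAIN
    return NEUTRAL_GAIN


def plan_utterance(words, usr_sys_rel_words, default_words):
    """Pair every word of an utterance with its amplitude factor."""
    return [(w, emphasis_gain(w, usr_sys_rel_words, default_words)) for w in words]


# "thank" is a system-intention verb (Usr-Sys-Rel); "additional" receives
# default emphasis as a quantity expression.
plan = plan_utterance(
    "I thank you for the additional information".split(),
    usr_sys_rel_words={"thank"},
    default_words={"additional"},
)
for word, gain in plan:
    print(word, gain)
```

The resulting plan would still be subject to the further “streamlining” by prosodic modeling tools that the text mentions.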
In respect to the Non-Task-related Speech Acts of Category 3, for the Non-Task-related Speech Act “Initiate-Conversation”, in which the User is asked whether information on additional topics is requested, a strategy followed in Task-related Speech Acts is employed. Specifically, in this case, prosodic emphasis is also used on a small group of keywords signalizing information type, for example, “weather-report”, “news” and “newspaper”. This small set of words may be labeled as the “INFORMATION-TYPE” word group.
For the Non-Task-related Speech Acts of Category 1, namely the “Open-Dialog-Greeting” (1.1) and “Close-Dialog” (1.2) Speech Acts, it should be noted that they are mostly fixed expressions such as “Hello” and “Goodbye”. This observation applies to all three languages concerned, namely Greek, English and German.
4.3.3. Prosodic Modeling and the User-System Relationship in English and German
Despite the fact that English and German are not characterized by Positive Politeness, expressions signalizing the User-System Relationship (Usr-Sys-Rel) are observed to play a significant role in Non-Task-related Speech Acts, even though the above-described relationship between prosodic emphasis and word category in Non-Task-related Speech Acts is only partially applicable in English and German. Specifically, prosodic emphasis on “Usr-Sys-Rel” expressions is observed to contribute to the achievement of clarity and User-friendliness in English and German spoken utterances in Service-Oriented dialog systems, at least in respect to the present data acquired from European Union projects in Speech Technology for social services and Human-Computer Interaction. Examples of respective expressions receiving prosodic emphasis in HCI applications in English and in German, from data received from European Union Projects (i.e. the SOPRANO Project, http://www.soprano-ip.org/), are the words “sorry”, “apologize” (German: “entschuldigen”) and “help”, “assist” (German: “helfen”, “behilflich”).
We also note that negation, spatial, temporal and quantitative expressions, expressions related to manner and quality, as well as sublanguage-specific task-related expressions categorized under “ACTION-TYPE” and “OBJECT-TYPE” (as demonstrated in 3.1), receive prosodic emphasis by default (“default prosodic emphasis”). Default prosodic emphasis is henceforth indicated in italics.
Prosodic modeling in English and German may produce similar examples in the two languages if a translation close to a word-for-word translation of the previous (Greek) examples is performed. Usr-Sys-Rel expressions in English and German may correspond to the same or to a different grammatical category in relation to Greek Usr-Sys-Rel expressions.
For example, a Usr-Sys-Rel expression in Greek may constitute a system-service verb, while in English or in German the equivalent expression may constitute a system-service noun. We also note that, similarly to Greek, not all Non-Task-related Speech Acts necessarily contain Usr-Sys-Rel expressions. The sublanguage-specific User-System Relationship Word groups observed in English and German Service-oriented HCI applications are presented in Table 9. In the respective English and German examples that will be presented, all words related to the User-System Relationship (Usr-Sys-Rel), and therefore receiving prosodic emphasis, are signalized in bold writing, regardless of their grammatical category. Usr-Sys-Rel expressions in English and German compatible with the respective above-presented Greek utterances are illustrated by the following examples of Category 2 in English and German (Example 4 and Example 5). We note that, similarly to Greek, in English and German Non-Task-related Speech Acts, the amplitude of the prosodic emphasis on Usr-Sys-Rel expressions is intended to be slightly higher than the amplitude of the prosodic emphasis on expressions receiving default prosodic emphasis. Since the above-described features constitute basic specifications and are subject to
further “streamlining” when processed by prosodic modeling tools, necessary adaptations are not symmetrical in the languages concerned and are also related to variations in syntax as well as in the vocabulary (lexicon) of the respective sublanguages.
Examples in English of Non-Task-related Speech Acts of Category 2.1 and 2.2 (Example 4) compatible with the above-presented Greek examples are utterance (a) “I am sorry (adjective)” (“Apologize”), utterance (b) “I cannot understand your request (noun)” (“Justify”) and utterance (Category 2.2) “To provide better services (noun) for you, I will ask you a few more questions” (“Introduce-new-task”), with a system-service noun and a system-service verb. Examples of Speech Acts in Category 2.3 (Example 4) are utterance (a) “This might take a few seconds” (“Inform-delay”) and utterance (b) “Do you wish to proceed?” We note here that the Non-Task-related “Inform-delay” Speech Act presented here does not contain Usr-Sys-Rel expressions. Compatibility regarding prosodic emphasis and the User-System Relationship between Greek and English is also observed in respect to Speech Acts in Category 2.4 (“Thank”) and Category 2.5 (“Attention-alert”), such as in the respective examples “Thank you for your input” (2.4) and “(Your) Attention (noun) please.” We note that the expression “please” is related to pragmatic politeness and does not receive prosodic emphasis by default, as described in 3.2.2.

Example 4: (Usr-Sys-Rel prosodic emphasis (higher amplitude) indicated in bold, default emphasis (lower amplitude) indicated in italics)
Category 2.1
(a) “I am sorry” (“Apologize”)
(b) “I cannot understand your request” (“Justify”)
Category 2.2
“To provide better services for you, I will ask you a few more questions.” (“Introduce-new-task”)
Category 2.3
(a) “This might take a few seconds” (“Inform-delay”)
(b) “Do you wish to proceed?” (“Manage-waiting-time”)
Category 2.4 (“Thank”)
“Thank you for your input”
Category 2.5 (“Attention-alert”)
“(Your) Attention please”

Similar examples in German of Non-Task-related Speech Acts of Category 2.1 and 2.2 (Example 5) are utterance (a) “Ich bitte um Entschuldigung (noun)” or “Ich möchte mich entschuldigen (verb)” (“Apologize”), utterance (b) “Ich kann Ihre Eingabe (noun) nicht verstehen (verb)” or “Ihre Eingabe (noun) wurde nicht richtig verstanden (verb)” (“Justify”) and utterance (Category 2.2) “Wir werden Ihnen einige weitere Fragen stellen (verb)” (“Introduce-new-task”). In the latter example, prosodic emphasis is given to the system-service noun and system-service verb “Fragen stellen”. We note here that, depending on the
specifications of the sublanguage concerned, the expression “Fragen stellen” may be considered either a Usr-Sys-Rel expression or an ACTION-TYPE expression receiving default prosodic emphasis.

Table 9. Prosodic Emphasis in User-System Relationship Word groups (Usr-Sys-Rel) (German and English)
(1) expressions (nouns/verbs/adjectives) indicating the system’s intention (action) to serve the user / a task related to the system’s services (system-service expressions)
(2) expressions (nouns/verbs/adjectives) indicating the system’s apologies, failure or error in respect to a task executed to serve the user / a task related to the actual interaction involving good intentions (system-intention expressions)
(3) expressions (nouns/verbs/adjectives) indicating the user’s actions or intentions (user-action/intention expressions)
Examples of Speech Acts in Category 2.3 are utterance (a) “Das kann einige Minuten dauern” (“Inform-delay”) and utterance (b) “Möchten Sie weitermachen (verb)?” (user-intention verb). As in English, compatibility with respective utterances in Greek is also observed in Speech Acts of Category 2.4 (“Thank”) and Category 2.5 (“Attention-alert”), such as in the respective examples “Danke für Ihre Eingabe (noun)” (2.4) and “Ich bitte um Ihre Aufmerksamkeit (noun)” and “(Ihre) Achtung (noun).” Similarly to the respective example in English, we note that the expression “bitte” (“please”), related to pragmatic politeness, does not receive prosodic emphasis by default, as described in 3.2.2.

Example 5: (Usr-Sys-Rel prosodic emphasis (higher amplitude) indicated in bold, default emphasis (lower amplitude) indicated in italics)
Category 2.1
(a) (i) “Ich bitte um Entschuldigung” (“Apologize”)
(a) (ii) “Ich möchte mich entschuldigen” (“Apologize”)
(b) (i) “Ich kann Ihre Eingabe nicht verstehen” (“Justify”)
(b) (ii) “Ihre Eingabe wurde nicht richtig verstanden” (“Justify”)
Category 2.2
“Wir werden Ihnen einige weitere Fragen stellen” (“Introduce-new-task”)
Category 2.3
(a) “Das kann einige Minuten dauern” (“Inform-delay”)
(b) “Möchten Sie weitermachen?” (“Manage-waiting-time”)
Category 2.4 (“Thank”)
“Danke für Ihre Eingabe”
Category 2.5 (“Attention-alert”)
“Ich bitte um Ihre Aufmerksamkeit”
“(Ihre) Achtung”
4.4. Language-Specific Features in Prosodic Modeling and Non-task-related Speech Acts
4.4.1. Introduction
As demonstrated by the above-presented examples, expressions signalizing the User-System Relationship play a significant role in Non-Task-related Speech Acts in English and German. However, in the languages concerned, namely Greek, English and German, only a partial compatibility is observed in respect to prosodic emphasis and expressions signalizing the User-System Relationship in Non-Task-related Speech Acts. Specifically, this partial compatibility in respect to “Usr-Sys-Rel” expressions is observed to be related to two basic factors: the Grammatical and Lexical Parameter and the Pragmatic and Sociolinguistic Parameter.
4.4.2. The Grammatical and Lexical Parameter
The Grammatical and Lexical Parameter may be described as differences at the morphological and the syntactic level and differences in respect to the semantic equivalence of expressions.
Differences in respect to the Morphosyntactic Level typically occur in verbs receiving prosodic emphasis and expressing the system’s intention (action) to serve the user (system-service verbs), verbs expressing the system’s apologies, failure or error in respect to a task executed to serve the user (system-intention verbs) and verbs expressing the user’s actions or intentions (user-action/intention verbs). In Greek, as a verb-framed and pro-drop language (like Spanish or Italian), the prosodic emphasis is directly matched to the finite verb, which contains the features of the verb’s subject, in this case the System or the User. This difference in respect to English and German may also influence the process of identifying Usr-Sys-Rel expressions. For example, the Greek verb “΄kano” (“I-do/make”) in the context of “make questions” (in a literal transfer from Greek) (Example 6) is a primitive verb, containing the features of the verb’s subject, and does not have the task-specific semantics of the English equivalent verb “ask”. Therefore, in the context of Service-oriented HCI applications, the Greek verb “΄kano” may be identified as a Usr-Sys-Rel expression, in the sense that the System is “doing something” for the User, while the respective English equivalent verb “ask” is identified as an ACTION-TYPE expression. In the German equivalent expression “Fragen stellen” (“pose/put questions”), the verb “stellen” contains more specific semantic features than the Greek verb “΄kano” and less specific semantic features than the English verb “ask”
and can be classified either as a Usr-Sys-Rel expression or as an ACTION-TYPE expression receiving default prosodic emphasis. In another example, the Greek verb “olokli΄rosate” (“completed”) corresponds to the verb “finished” in English, but to the adjective “fertig” (default emphasis) in German (Example 6). We note that the semantics of the Greek verb “olokli΄rosate” (“completed”) allows it to be classified as a Usr-Sys-Rel expression, whereas the verb “finished” in English is classified as an ACTION-TYPE task-related expression receiving default emphasis. In this respect, differences at the Morphosyntactic Level in the languages concerned may, in some cases, also be reflected in the semantic aspect.
According to the above-presented data, both in English and in German, the prosodic emphasis in Usr-Sys-Rel expressions and “default-emphasis” words (ACTION-TYPE) is observed to be directly matched to the finite verb; however, the subject is phonologically realized as a separate word. In this case, it may be concluded that the semantic feature of the subject is not included in the semantics of the emphasized Usr-Sys-Rel expression and that only the verb content-specific semantic features are actually emphasized. If prosodic emphasis is given to the subject (a pronoun, for instance “you” or “Sie” in German), the result may have a harshly deictic effect.
The relation of the Morphosyntactic Level and the semantic and lexical aspect is also evident both in respect to the surface structure of the utterances and in the expressions used. For example, in respect to the surface structure of the equivalent utterances, the Greek verb “΄kanete” (“do”) corresponds to the finite verb “machen” (“do”) in German, but to the verb “wish” in English.
In a similar example, in the utterance “Please proceed”, the Greek verb “sine΄xiste” (“proceed”), preceded by “parakal΄o” (“please”), is equivalent to the respective verb “proceed” in English, but to the adverb “weiter” (default emphasis) in German (“Bitte machen/sprechen Sie weiter”). The relation of the Morphosyntactic Level and the Lexical aspect is also evident in the expressions used, such as in the typically occurring examples of the “Apologize” and “Thank” Non-Task-related Speech Acts, where the word category of the Usr-Sys-Rel expressions is not always identical in the languages concerned. For example, apologies may be expressed with a noun in German and in Greek but with a verb or a fixed expression in English, and the verb phrase “Thank you”, a fixed expression in English, corresponds to a verb-sentence in Greek. Differences in respect to the Grammatical and Lexical Parameter in Greek, English and German are presented in Example 6.

Example 6: Grammatical and Lexical Parameter (Usr-Sys-Rel prosodic emphasis indicated in bold, default emphasis indicated in italics)
(1) Morphosyntactic Level:
i. […..] ολοκληρώσατε […..] (verb = “[you] have-completed”) – (olokli΄rosate – “completed”)
ii. […..] you have […..] finished
iii. […..] Sie sind […..] fertig (Adjective, default emphasis)
(2) Morphosyntactic Level:
i. […..] κάνετε […..] (verb = “[you] do”) – (΄kanete – “do”)
ii. […..] you wish to do
iii. […..] möchten Sie […..] machen
(3) Morphosyntactic Level:
i. Παρακαλώ συνεχίστε (verb = “(you) proceed”) – (parakal΄o [“please”] sine΄xiste)
ii. Please proceed
iii. Bitte machen (= do)/sprechen (= speak) Sie weiter (Adverb, default emphasis)
(4) Equivalence of expressions:
i. Συγγνώμη (noun) (syg΄nomi – “apology”)
ii. I apologize (verb), I am sorry (adjective, with use of a fixed expression)
iii. Entschuldigung (noun)
(5) Equivalence of expressions:
i. Ευχαριστώ, Ευχαριστούμε (verb = “[I] thank”, “[we] thank”) (efxari΄sto, efxari΄stume – “thank”)
ii. Thank you (fixed expression)
iii. Danke (fixed expression), Vielen Dank (fixed expression)
iv. möchten uns bedanken (= wish we say-thanks)
4.4.3 The Pragmatic and Sociolinguistic Parameter

The Pragmatic and Sociolinguistic Parameter may indirectly influence prosodic emphasis in Non-Task-related Speech Acts. In particular, it involves the differences between English, German and Greek observed in the perception of the actual content of the utterances produced, as well as differences in the necessity or optionality of the actual speech acts. In Example 7, the utterances directly translated from Greek (the CitizenShield Project), “You have obviously finished with your additional input” and “I will ask you one more time”, may seem too direct or even rude in English and in German. However, it is observed that, in Greek, the use of prosodic emphasis has a direct influence on the User-friendliness feature of a sentence. Specifically, in Example 7, the expression becomes polite if prosodic emphasis is given to the respective verbs “completed” (user-action verb) and “ask” (system-service verb) along with the expressions receiving “default emphasis”, namely, the numerical expression and the adverbs related to quantity “additional” and “one more”. We note here that the effect will be harsher if only the “default emphasis” expressions, “obviously” as an adverbial and “one” as a numerical expression, are emphasized.

Example 7: Pragmatic and Sociolinguistic Parameter (Usr-Sys-Rel prosodic emphasis indicated in bold, default emphasis indicated in italics)
Category 3.1 (“Optional Information”: “Reminder”)
i. Προφανώς ολοκληρώσατε με τις επιπλέον πληροφορίες
ii. Greek (in English): You have obviously finished (Greek: completed) with your additional input
iii. Greek (in German): Sie sind offensichtlich mit Ihrer zusätzlichen Eingabe fertig (Greek: vervollständigt – “completed”)
Category 3.1 (“Optional Information”: “Reminder”)
i. Θα σας ρωτήσω ακόμη μια φορά
ii. Greek (in English): I will ask you one more time
iii. Greek (in German): Ich werde Ihnen noch ein Mal fragen

In Example 8, the System may, in the case of Greek, appear helpful to the User by introducing the new task, with the respective emphasis on Usr-Sys-Rel expressions. On the other hand, in English and in German, the System may seem too harsh and authoritative. We note that in English and in German, the addition of the adverbials “now” and “gleich” (German: “(right) now”) seems to soften any harsh effect. In contrast, in the Greek example, friendliness does not appear to be affected by the presence or absence of the adverbial “now” (΄tora).
Example 8: Pragmatic and Sociolinguistic Parameter
Category 2.3 (“Introduce-new-task”)
1. English:
I will ask you some more questions
I will ask you some more questions now (Adverb)
2. German:
Ich werde Ihnen einige weitere Fragen stellen
Ich werde Ihnen gleich (Adverb) einige weitere Fragen stellen
3. Greek:
Θα σας κάνω μερικές ερωτήσεις ακόμα – (Tha sas ΄kano meri΄kes ero΄tisis a΄koma)
I will ask (“make”) you some more questions
Θα σας κάνω τώρα (adverb – “now”) μερικές ερωτήσεις ακόμα – (Tha sas ΄kano ΄tora (=now) meri΄kes ero΄tisis a΄koma)
Τώρα (adverb) θα σας κάνω μερικές ερωτήσεις ακόμα – (΄Tora (=now) tha sas ΄kano meri΄kes ero΄tisis a΄koma)
I will ask (“make”) you some more questions now
4.5. General Observations

As indicated in Section 3, in the present prosodic modeling framework, it is observed that in English, German and Greek, “default-emphasis” words, typical in Task-related Speech Acts, can be mapped to similar grammatical categories or word categories. In contrast, expressions signalizing the User-System Relationship (“Usr-Sys-Rel” words) do not always demonstrate full equivalence in the languages concerned. It can thus be observed that, for Non-Task-related Speech Acts, the proposed approach is only partially language-independent. This is in contrast to the proposed prosodic modeling for Task-related Speech Acts, where the proposed prosodic modeling may act as a Controlled language-like approach. The parameters presented for the prosodic modeling of Non-Task-related Speech Acts may, therefore, only be applied as a “lax” Controlled language-like strategy.

Table 10. Relation of speech acts and elements receiving prosodic emphasis for prosodic modelling in a service-oriented dialog system
Languages: English, German, Greek

| | 1. Task-related Speech Acts | 2. Non-Task-related Speech Acts |
|---|---|---|
| Goal | Precision, Directness | Directness, User-friendliness |
| Sublanguage-specific expressions | Task-related expressions: sublanguage-specific lexicon, expressions and terminology (ACTION-TYPE, OBJECT-TYPE) | User-System Relationship expressions: system-service expressions, system-intention expressions, user-action/intention expressions |
| Emphasis | Default Emphasis | Usr-Sys-Rel Emphasis, higher amplitude |
| Sublanguage-independent expressions (Greek: prosodically determined) | Spatial, temporal and quantitative expressions; expressions related to manner and quality | Spatial, temporal and quantitative expressions; expressions related to manner and quality |
| Emphasis | Default Emphasis | Default Emphasis, lower amplitude |
Table 10 summarizes the observed similarities and differences in the relation of prosodic emphasis and word category for the Prosodic Modeling of Task-related and Non-Task-related Speech Acts in Service-oriented Dialogs.
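As an illustration only, the mapping summarized in Table 10 could be encoded as a small lookup structure for a dialog system's output component. The sketch below is a minimal Python example; the string labels, the `Emphasis` class and the function name `emphasis_for` are hypothetical names introduced here and are not part of the original framework. Amplitude is marked "unspecified" where Table 10 gives no value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Emphasis:
    kind: str       # "default" or "usr-sys-rel"
    amplitude: str  # "higher", "lower" or "unspecified"


# Keys are (speech-act type, expression class). The labels are hypothetical
# names chosen for this sketch; the values follow the rows of Table 10.
EMPHASIS_RULES = {
    ("task-related", "sublanguage-specific"): Emphasis("default", "unspecified"),
    ("task-related", "sublanguage-independent"): Emphasis("default", "unspecified"),
    ("non-task-related", "sublanguage-specific"): Emphasis("usr-sys-rel", "higher"),
    ("non-task-related", "sublanguage-independent"): Emphasis("default", "lower"),
}


def emphasis_for(speech_act: str, expression_class: str) -> Emphasis:
    """Look up the emphasis assigned to an expression class in a speech act."""
    try:
        return EMPHASIS_RULES[(speech_act, expression_class)]
    except KeyError:
        raise ValueError(
            f"unknown combination: {speech_act!r}, {expression_class!r}"
        ) from None


if __name__ == "__main__":
    for key in EMPHASIS_RULES:
        print(key, "->", emphasis_for(*key))
```

A table-driven lookup keeps the language-independent part of the specification in one place; language-specific exceptions, such as the prosodically determined expressions in Greek, would need additional rules that this sketch does not attempt to model.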
5. Language-Specific Tone and Style

Although prosodic emphasis provides the basis for achieving the targeted effect in the sentence's prosodic structure, the appropriate tone of voice in the System's utterances plays a final and decisive role both in respect to Comprehensibility and in respect to User-Friendliness. We note that adaptation and fine-tuning in respect to the appropriate nuance of tone is observed to be more easily accomplished by pre-recorded output produced by a trained speaker than by Speech Synthesis. Regardless of the user-friendly content of the utterance produced, an unfriendly tone neutralizes any attempt at User-friendliness. In contrast to the specifications presented above, which are based on the role of prosodic emphasis in prosodic modeling, tone cannot be fully specified as a set of stand-alone parameters and is strongly related to the quality of the Speaker's (or the System's) voice and the training (or fine-tuning) of the Speaker (or System).

As a basic framework, the combination of the “Instrument” and “Servant” User-Psychology in relation to Task Type does help eliminate cases in which the System mimics specific male or female prototypes within a culture-specific framework. The characteristics of
specific culturally determined male or female prototypes are related to a complex matrix of various features, also associated with tone of voice. In the present approach, the combination of Task Type and the “Instrument” and “Servant” User-Psychology allows a relatively neutral baseline on which the overall tone of the System's utterances can be adapted according to the preferences of the particular User or User-Group.

However, even in this case, the acceptable tone of voice in the utterances produced by the System's Spoken Output may vary according to culture and language type. For example, in some languages, such as spoken Italian, a characteristically vivid and expressive tone is maintained in many types of transactions and related Speech Acts. In other languages, such as spoken German, a tone associated with reliability and responsibility is preferred in a considerable number of transactions. In most cases, at least for the languages and their Native Speakers in Europe, developers of Human-Computer Interaction systems have to strike a balance: the tone should be associated with reliability, responsibility and, in some task types, even authority, while User-friendliness and naturalness must be preserved at the same time. This is a necessary process for languages such as Greek, where Positive Politeness markers integrated in the prosodic modeling process prevent a tone associated with reliability and responsibility from being perceived as authoritative and rude by Greek Native Speakers (Alexandris and Fotinea, 2004). On the other hand, some languages appear to have “ready-made” varieties in tone and style that balance the above-described characteristics, such as the so-called “BBC English” or “Queen's English” in British English. The acceptable tone of voice is, therefore, directly related to socio-cultural factors and cannot be excluded from the prosodic modeling process for applications targeted at User-Groups such as the General Public.
6. Conclusions and Further Research

The target of the present approach is to provide a general framework of re-usable, transferable and language-independent specifications for the Speech Component in Service-Oriented Dialog Systems for the General Public as a user group. It should be stressed that the above-described features constitute basic specifications and are subject to further “streamlining” when processed by prosodic modeling tools. The present approach concerns specifications based on prosodic modeling directly related to Speech Act type, which is, in turn, related to dialog type and dialog structure. This relation is determined by the overall framework of the Parameters (1-5) of the HCI System in respect to the User-System Relationship. It may additionally be noted that the proposed prosodic-emphasis-based approach can constitute a stand-alone specification for prosodic modeling, although it should be taken into account that the overall tone in respect to the mode of articulation plays a decisive role.

In respect to the language-independent feature, the proposed prosodic modeling approach can be characterized as partially language-independent, depending on Speech Act type, namely Task-related or Non-Task-related Speech Acts. Specifically, within the framework of the present approach, based on the relation of Speech Act type and Prosodic Modeling, it is observed that Task-related Speech Acts display compatibility in respect to prosodic emphasis and word category in the three languages concerned, namely, English, German and Greek. In
English and in German, in contrast to specific word categories in Greek, it is observed that prosodic emphasis does not usually affect the actual semantic content of words. Contrary to Task-related Speech Acts, it is observed that in the three languages English, German and Greek, Non-Task-related Speech Acts display only partial compatibility in respect to prosodic emphasis and word category. Partial compatibility is shown to result from differences at the morphosyntactic and semantic level (expressions) and is also related to variations in the content and acceptability of the Speech Act in the languages concerned. These parameters influencing the form and content of Non-Task-related Speech Acts are summarized as the Grammatical and Lexical Parameter and the Pragmatic and Sociolinguistic Parameter.

The above-described differences between Task-related Speech Acts and Non-Task-related Speech Acts, at least for the languages concerned, are observed to be related to the typically standard and relatively language-independent content of Task-related Speech Acts, in contrast to Non-Task-related Speech Acts, where the content is often not standard, may be highly dependent on user requirements and acceptable socio-cultural norms and is, therefore, mostly language-dependent.

From the aspect of implementation, in Task-related Speech Acts, where User expectations are predefined and also relatively language- and culture-independent, the proposed prosodic modeling may act as a Controlled language-like approach applied to facilitate Speech Recognition and semantic processing and to standardize System input and output. We note here that controlled language-like specifications originating from English and German have been observed to be mostly applicable in Greek (Alexandris, 2009).
In contrast to the proposed prosodic modeling for Task-related Speech Acts, the parameters presented for the prosodic modeling of Non-Task-related Speech Acts can be applied as a “lax” Controlled language-like approach. It can, therefore, be concluded that the Controlled language-like specifications of the proposed approach for prosodic modeling can be characterized by features ranging from full to partial reusability and transferability to other applications. The approaches presented for prosodic modeling in Service-Oriented HCI Systems, namely the Controlled language-like approach for Task-related Speech Acts and the “lax” Controlled language-like approach for Non-Task-related Speech Acts, can be applied in Work Packages for User Requirements and Dialog Modeling. We also note here that the typically standard and relatively language-independent content of Task-related Speech Acts is easily adaptable to the language-independent features typically integrated in an Interlingua (ILT).

As a final note, in respect to the relation of prosody and content of utterance within the framework of Service-Oriented dialogs, and from the present data and respective European Union projects, it may be observed that Greek appears to be more sensitive to prosody, whereas English and German rely more substantially on content such as modifiers and particles. Specifically, from the present data, prosody is observed to play a more intense role in respect to word semantics and overall User-friendliness in Greek, whereas in English and in German, both word semantics and User-friendliness are concretely defined and realized by vocabulary.
Specifically, it is observed that, in addition to the expressions signalizing the User-System Relationship (“Usr-Sys-Rel” words) receiving prosodic emphasis (and resulting in a User-friendly effect), in English and in German there is extensive use of modifiers, especially adverbials (which also accompany most “default-emphasis” words in Task-related Speech Acts), and more extensive use of particles and pragmatically polite expressions.
The relation of prosody, word semantics and overall User-friendliness in Greek, English and German beyond the framework of Service-Oriented HCI Systems remains an issue for further investigation. However, it may be observed that, in respect to prosodic emphasis, language-specific as well as culture-specific factors play a substantial role in Service-Oriented HCI Systems, at least for the languages English, German and Greek. Compatibility of other languages with the present prosodic modeling approach is an issue for further research, especially if the other languages in question display substantial differences in respect to the Task-related and Non-Task-related Speech Act types and the overall framework of the User-System Relationship.
References
[1] Alexandris, C. (2009). A Speech-Act Oriented Approach for User-Interactive Editing and Regulation Processes Applied in Written and Spoken Technical Texts. In: Human-Computer Interaction. HCI Intelligent Multimodal Interaction Environments, Vol. 2, LNCS_5611.
[2] Alexandris, C. (2008). Word Category and Prosodic Emphasis in Dialog Modules of Speech Technology Applications. In: Botinis, A. (ed) Proceedings of the 2nd ISCA Workshop on Experimental Linguistics, ExLing2008, Athens, Greece, August, 5-8.
[3] Alexandris, C. (2007). “Show and Tell”: Using Semantically Processable Prosodic Markers for Spatial Expressions in an HCI System for Consumer Complaints. In: Jacko, J. A. (ed) Human-Computer Interaction. HCI Intelligent Multimodal Interaction Environment, vol. 4552/2007, Springer, New York, 13-22.
[4] Alexandris, C. & Fotinea, S. E. (2004). Discourse Particles: Indicators of Positive and Non-Positive Politeness in the Discourse Structure of Dialog Systems for Modern Greek. In: International Journal for Language Data Processing “Sprache & Datenverarbeitung”, vol. 1-2/2004, 19-29.
[5] Hausser, R. (2006). A Computational Model of Natural Language Communication, Interpretation, Inference and Production in Database Semantics. Springer, Berlin.
[6] Heeman, R., Byron, D. & Allen, J. F. (1998). Identifying Discourse Markers in Spoken Dialog. In: Proceedings of the AAAI Spring Symposium on Applying Machine Learning to Discourse Processing, Stanford, March.
[7] Kellner, A. (2004). Dialogsysteme. In: Computerlinguistik und Sprachtechnologie, Eine Einführung, Carstensen, K.U., Ebert, C., Endriss, C., Jekat, S., Klabunde, R., Langer, H. (eds.), 2nd revised edition, München: Spektrum Akademischer Verlag.
[8] Lehrndorfer, A. (1996). Kontrolliertes Deutsch: Linguistische und Sprachpsychologische Leitlinien für eine (maschinell) kontrollierte Sprache in der technischen Dokumentation. Narr, Tuebingen.
[9] Malagardi, I. & Alexandris, C. (2009). Verb Processing in Spoken Commands for Household Security and Appliances. In: Universal Access in Human-Computer Interaction, Vol. 6, LNCS_5615.
[10] Moeller, S. (2005). Quality of Telephone-Based Spoken Dialogue Systems. Springer, New York.
[11] Nottas, M., Alexandris, C., Tsopanoglou, A. & Bakamidis, S. (2007). A Hybrid Approach to Dialog Input in the CitizenShield Dialog System for Consumer Complaints. In: Proceedings of HCI 2007, Beijing, China.
[12] Sifianou, M. (2001). Discourse Analysis. An Introduction. Athens: Leader Books.
[13] Schilder, F. & Habel, C. (2001). From Temporal Expressions to Temporal Information: Semantic Tagging of News Messages. In: Proceedings of the ACL-2001, Workshop on Temporal and Spatial Information Processing, Pennsylvania, 1309-1316.
[14] Smart, J. (2006). SMART Controlled English. In: Proceedings of the 5th International Workshop on Controlled Language Applications (CLAW 2006), Cambridge, MA, USA, August 12, 2006.
[15] Wiegers, Karl E. (2005). Software Requirements. Redmond, WA: Microsoft Press.
[16] Wojcik, R. H. & Holmback, H. (1996). Getting a Controlled Language Off the Ground at Boeing. In: Proceedings of CLAW-1996, Leuven, Belgium, 22-31.
In: Computer Science Research and Technology Editor: Karl C. Verdinand, pp. 99-129
ISBN: 978-1-61728-688-9 © 2011 Nova Science Publishers, Inc.
Chapter 4
NCTUNS TOOL FOR IEEE 802.16J MOBILE WIMAX RELAY NETWORK SIMULATIONS
Shie-Yuan Wang*, Hsin-Yu Chen and Shih-Wei Chuang
Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan
Abstract

An IEEE 802.16j mobile WiMAX relay network is a next-generation mobile wireless broadband network. Compared with IEEE 802.16e, which also supports mobility, IEEE 802.16j introduces relay stations to the network to help relay packets between the base station and a mobile station. When a mobile station is shadowed by a building and thus has a bad-quality channel to the base station, such a relay design can help it achieve a higher throughput from/to the base station. Since IEEE 802.16j is a new standard and no such products are yet available on the market for researchers to evaluate its performance, a network simulator that supports IEEE 802.16j network simulations is very valuable. In this chapter, we present how we extend NCTUns, a network simulator and emulator that directly uses the real-life TCP/IP protocol stack and real-life applications to generate accurate simulation results, to support IEEE 802.16j network simulations. NCTUns supports the two relay modes defined in IEEE 802.16j: the transparent mode and the non-transparent mode. More information about NCTUns is available at http://NSL.csie.nctu.edu.tw/nctuns.html.
1. Introduction to IEEE 802.16j Multi-hop Relay Networks

IEEE 802.16j [1][2], which is an amendment to IEEE 802.16e-2005, is the standard for mobile multi-hop relay networks. It is fully compatible with IEEE 802.16e MSs (Mobile Stations), but the IEEE 802.16e BS (Base Station) needs to be modified to support relay operations. In such a system, a cell is composed of one MR-BS (Multi-hop Relay Base Station), several RSs (Relay Stations), and several MSs.
* E-mail address: [email protected] (Corresponding author)
The standard specifies two different modes of resource allocation and scheduling for MSs: (1) the centralized scheduling mode and (2) the distributed scheduling mode. In the former mode, the bandwidth/resource allocation for all nodes (including both RSs and MSs) is determined at the MR-BS. In the latter mode, a part of the bandwidth/resource allocation for RSs and MSs is determined at RSs while another part is determined at the MR-BS.

Two different relay modes are defined in the IEEE 802.16j standard: the transparent mode and the non-transparent mode. Table 1 lists the differences between these two modes.

Table 1. Comparison of the transparent and non-transparent relay modes
| | Transparent Mode | Non-transparent Mode |
|---|---|---|
| Scheduling | Centralized | Centralized / Distributed |
| Number of hops | 2 | 2 or more |
| Performance | High | Low |
| Coverage extension | No | Yes |
| Cost / Complexity | Low | High |
| Forward framing info. | No | Yes |
Figure 1.1. The frame structure used in the transparent relay mode.
Transparent Mode

A transparent mode RS (T-RS) does not forward framing information but simply forwards data for the MR-BS and its MSs. For this reason, it does not extend the coverage of
the MR-BS. In this mode, the path between the MR-BS and an MS is at most two hops. That is, at most one T-RS can participate in relaying packets between the MR-BS and an MS. Due to this simple design, this mode has a lower complexity. The main objective of deploying T-RSs in the network is to increase network capacity within the MR-BS coverage rather than to increase the coverage of the MR-BS.

The frame structure used in the transparent mode is shown in Figure 1.1. Both the downlink subframe and the uplink subframe are divided into two zones: the access zone and the relay zone. The DL access zone is defined for the MR-BS to communicate with MSs or T-RSs. The DL relay zone, which is called the transparent zone in the standard, is defined for T-RSs to communicate with MSs. When MSs communicate with the MR-BS through T-RSs in the upstream direction, they transmit data in the UL access zone and then the T-RSs relay their data to the MR-BS in the UL relay zone.
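The zone assignment described above can be sketched as a small lookup table. The following Python fragment is an illustrative model only, not NCTUns code: the `Zone` type, the constant `TRANSPARENT_FRAME` and the function `zones_for_relayed_path` are hypothetical names, and only the relayed (two-hop) case described in the text is modeled.

```python
from typing import List, NamedTuple, Tuple


class Zone(NamedTuple):
    name: str
    senders: Tuple[str, ...]
    receivers: Tuple[str, ...]


# Who transmits to whom in each zone of a transparent-mode frame, following
# the description in the text. Hypothetical model for illustration only.
TRANSPARENT_FRAME = (
    Zone("DL access", ("MR-BS",), ("MS", "T-RS")),
    Zone("DL relay", ("T-RS",), ("MS",)),   # the "transparent zone"
    Zone("UL access", ("MS",), ("T-RS",)),  # relayed case only
    Zone("UL relay", ("T-RS",), ("MR-BS",)),
)


def zones_for_relayed_path(direction: str) -> List[str]:
    """Return the zones a packet uses on a two-hop path through a T-RS."""
    if direction == "downlink":
        prefix, hops = "DL", [("MR-BS", "T-RS"), ("T-RS", "MS")]
    elif direction == "uplink":
        prefix, hops = "UL", [("MS", "T-RS"), ("T-RS", "MR-BS")]
    else:
        raise ValueError("direction must be 'downlink' or 'uplink'")
    used = []
    for sender, receiver in hops:
        # Pick the zone of the right subframe in which this hop is allowed.
        zone = next(
            z for z in TRANSPARENT_FRAME
            if z.name.startswith(prefix)
            and sender in z.senders
            and receiver in z.receivers
        )
        used.append(zone.name)
    return used


if __name__ == "__main__":
    print(zones_for_relayed_path("downlink"))
    print(zones_for_relayed_path("uplink"))
```

In both directions the first hop uses the access zone and the second hop uses the relay zone of the same subframe, which is why a relayed packet can cross both hops within one frame in this mode.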
Non-Transparent Mode

The main difference between a T-RS and a non-transparent mode RS (NT-RS) is that an NT-RS transmits the framing information at the beginning of a frame while a T-RS does not. In this mode, when an MS is out of range of the MR-BS and thus cannot receive the downlink information from the MR-BS, an NT-RS plays the role of the MR-BS and forwards the downlink information sent from the MR-BS to the MS. The main objective of using NT-RSs is to extend the coverage of the MR-BS rather than to increase network capacity.

When scheduling bandwidth, an NT-RS can operate either in the centralized scheduling mode or in the distributed scheduling mode. In the former mode, the resource allocations for all nodes in the network are scheduled at the MR-BS. In the latter mode, NT-RSs can make their own scheduling decisions for the MSs associated with them. Due to this design, an IEEE 802.16j non-transparent mode network can have one or more NT-RSs participating in relaying packets on the path between the MR-BS and an MS.

The frame structure used in this mode is shown in Figure 1.2. Both the MR-BS and the NT-RS transmit control messages and map information at the beginning of a frame. With this design, an MS can view an NT-RS as the MR-BS and synchronize to it. However, this design has the problem that NT-RSs and the MR-BS transmit their messages at the same time, which causes message collisions. The DL relay zone is defined for the MR-BS to communicate with NT-RSs, and hence the MS should be idle in the DL relay zone.

This frame structure has a further problem when more than one NT-RS relays packets on the path between the MR-BS and an MS. Since the relay zone can only be in one of the transmit (Tx), receive (Rx) or idle states, and an NT-RS must be in the receive state in the DL relay zone, an NT-RS cannot communicate with another NT-RS in the DL relay zone. There are two methods to solve this problem.
The first method, called the multi-frame approach, groups multiple frames together into a multi-frame with a repeating pattern that specifies in which frame of the group an NT-RS should transmit, receive or stay idle in the relay zone of that frame. The other method, called the single-frame approach, further splits a relay zone into multiple sub-relay zones and alternates the state of the sub-relay zones according to the hop count from the MR-BS to an MS.
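To make the alternation idea concrete, the sketch below assigns a downlink relay-zone state to each node on a relay chain using a simple parity rule. This rule is an assumption made for illustration; the actual patterns are configured as specified by the standard and may differ. Here `slot` stands for either the frame index within a multi-frame (multi-frame approach) or the sub-relay zone index within one relay zone (single-frame approach), hop 0 is the MR-BS, and `last_hop` is the hop count of the last NT-RS on the path. All function names are hypothetical.

```python
from typing import List


def dl_relay_state(hop: int, slot: int, last_hop: int) -> str:
    """Downlink relay-zone state of the node `hop` hops away from the MR-BS.

    Assumed parity rule: a node may transmit when its hop count and the
    slot index have the same parity, and may receive otherwise.
    """
    if not 0 <= hop <= last_hop or slot < 0:
        raise ValueError("hop must be in [0, last_hop] and slot must be >= 0")
    if hop % 2 == slot % 2:
        # The last NT-RS has no downstream RS to relay to in the relay zone.
        return "Tx" if hop < last_hop else "Idle"
    # The MR-BS has no upstream sender in the downlink direction.
    return "Rx" if hop > 0 else "Idle"


def dl_relay_schedule(last_hop: int, slots: int) -> List[List[str]]:
    """States of all nodes (hop 0 .. last_hop) for each slot."""
    return [
        [dl_relay_state(hop, slot, last_hop) for hop in range(last_hop + 1)]
        for slot in range(slots)
    ]


if __name__ == "__main__":
    # Chain: MR-BS -> NT-RS (hop 1) -> NT-RS (hop 2)
    for slot, states in enumerate(dl_relay_schedule(last_hop=2, slots=2)):
        print(slot, states)
```

Under this rule every transmitting node has its next-hop neighbor in the receive state, so no node is asked to transmit and receive in the same slot, which is the constraint both approaches are meant to satisfy.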
Figure 1.2. The frame structure used in the non-transparent relay mode.
Figure 2 (a). A single-hop TCP/IP network to be simulated. (b) By using tunnel interfaces, only the two links need to be simulated. The complicated TCP/IP protocol stack need not be simulated. Instead, the real-life TCP/IP protocol stack in the kernel is directly used in the simulation.
2. Introduction to NCTUns

NCTUns [3][4][5][6] is a powerful tool for simulations and emulations. It has two unique features. First, it uses the real-life TCP/IP (or UDP/IP) protocol stack in the Linux kernel to conduct simulations and emulations. Second, it can run any real-life application program on simulated nodes during a simulation to generate realistic network traffic. These capabilities enable NCTUns to generate high-fidelity simulation results and to evaluate the performance of real-life applications under various network conditions.
2.1. Simulation Methodology

NCTUns utilizes the kernel re-entering methodology to directly use the real-life TCP/IP protocol stack in the Linux kernel to simulate an IP network. The key facility in the kernel re-entering methodology is the tunnel interface. A tunnel interface is a pseudo network interface that does not have a real physical network attached to it. However, from the kernel's point of view, this pseudo network interface's functionality is exactly the same as that of a normal network interface (e.g., an Ethernet interface). That is, a network application program can send its packets to its destination host through a tunnel network interface or receive packets from a tunnel network interface, just as if these packets were sent to or received from a normal Ethernet interface.

Figure 2(a) illustrates how to simulate a one-hop TCP/IP network by using tunnel network interfaces. A TCP sender application program running on host 1 is sending its TCP packets to a TCP receiver application program running on host 2. One can set up the virtual simulated network by performing the following two operations. First, one configures the kernel routing table of the simulation machine so that tunnel network interface 1 is chosen as the outgoing interface for the TCP packets sent from host 1 to host 2 and tunnel network interface 2 is chosen for the TCP packets sent from host 2 to host 1. Second, for the two links to be simulated, one runs a simulation engine process to simulate them. For the link from host i to host j (i = 1 or 2 and j = 3 - i), the simulation engine opens the special files of tunnel network interfaces i and j in /dev and then executes a loop until the total simulated time has elapsed.
In each step of this loop, it simulates a packet’s transmission on the link from host i to host j by reading a packet from the special file of tunnel interface i, waiting for the link’s propagation delay time plus the packet’s transmission time on the link (in virtual time) to elapse, and then writing this packet to the special file of tunnel interface j. While the simulation engine is running, the virtual simulated network is constructed and alive. Figure 2(b) depicts this simulation scheme. Since replacing a real link with a simulated link happens outside the kernel, the kernels on both hosts do not know that their packets actually are exchanged on a virtual simulated network. The TCP sender and receiver programs, which run on top of the kernels, of course do not know the fact, either. As a result, all existing real-life network application programs can run on the simulated network, all existing real-life network utility programs can work on the simulated network, and the TCP/IP network protocol stack used in the simulation is the real-life working implementation, not just an abstract or a ported version of it. Note that the kernels on the sending and receiving hosts are the same one --- the kernel of the simulation machine. This is why this simulation methodology is named “kernel re-entering methodology.”
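The per-link loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: plain Python lists stand in for the special files of tunnel interfaces i and j, no kernel is involved, and the function name `simulate_link` and its parameters are hypothetical rather than part of NCTUns.

```python
def simulate_link(packets, bandwidth_bps, prop_delay_s, sim_end_s):
    """Simulate the one-way link from host i to host j in virtual time.

    packets: iterable of (send_time_s, size_bytes), standing in for the
             packets read from tunnel interface i (an assumption).
    Returns a list of (arrival_time_s, size_bytes), standing in for the
    packets written to tunnel interface j.
    """
    clock = 0.0          # virtual simulation clock, in seconds
    delivered = []
    for send_time, size_bytes in sorted(packets):
        # "Read" the packet: it cannot be read before it was sent.
        clock = max(clock, send_time)
        # Wait (in virtual time) for propagation delay plus transmission time.
        tx_time = size_bytes * 8 / bandwidth_bps
        clock += prop_delay_s + tx_time
        if clock > sim_end_s:
            break        # total simulated time has elapsed
        # "Write" the packet to the receiving side.
        delivered.append((clock, size_bytes))
    return delivered


if __name__ == "__main__":
    # Two 1250-byte packets on a 10 Mbps link with 10 ms propagation delay.
    for arrival, size in simulate_link([(0.0, 1250), (0.0, 1250)],
                                       10_000_000, 0.010, 1.0):
        print(f"{size} bytes arrive at t = {arrival:.3f} s")
```

Because the loop handles one packet at a time, the second packet is not read until the first has been written, which mirrors the simple initial design described in the text rather than a pipelined link model.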
The description above covers the initial design and implementation of the kernel re-entering methodology. NCTUns now uses a more advanced and efficient design and implementation of this methodology, the details of which can be found in [5].
2.2. The Architecture of NCTUns
NCTUns uses a distributed architecture to support remote simulations and concurrent simulations. The GUI (Graphical User Interface) program and the simulation engine program are implemented separately and communicate using the client-server paradigm. A remote user using the GUI environment can submit his or her simulation job to a server running the simulation engine. The server runs the submitted simulation job and later returns the results to the remote GUI environment for analysis. This scheme can easily support multiple simulation jobs that run concurrently on different servers. Functionally, NCTUns can be divided into the six components described below.
I. Graphical User Interface (GUI)
The GUI is a highly integrated environment that enables a user to edit a network topology, configure the protocol modules of a network node, set the parameter values of a protocol module, specify network traffic, plot performance curves, play back animations of logged packet transfers, etc. During a simulation, the user can query or set an object’s value at any time. For example, one can set the routing table of a router during a simulation. The GUI uses Internet TCP/IP sockets to communicate with other components. One can use it to submit a simulation job to a remote simulation machine for execution, and the simulation results will be transferred back to the GUI when the simulation is finished.
II. Simulation Engine (S.E.)
The Simulation Engine (S.E.) is the core of NCTUns. It is a user-level program that provides a module-based platform for users to develop their protocols and integrate them into the NCTUns simulator. In addition, important services such as simulation clock maintenance, timer management, event scheduling, and variable registration are all handled in the simulation engine.
III. Dispatcher
The dispatcher program supports concurrent simulations on multiple simulation machines. It should be executed and remain alive to manage multiple simulation machines. When a user submits a simulation job to the dispatcher, the dispatcher will select an available simulation machine to execute this job. If no machine is available, the submitted job can be queued and managed by the dispatcher as a background job. Later on, when a simulation machine becomes available, the dispatcher will automatically send a background job to it for execution on behalf of the user.
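The dispatcher behaviour described above (pick an idle machine, otherwise queue the job and hand it out later) can be illustrated with a small in-memory sketch. The class and method names here are hypothetical; the real dispatcher is a separate program that talks to the other components over sockets, which this sketch does not attempt to model.

```python
from collections import deque


class Dispatcher:
    """Toy model of job dispatching over a farm of simulation machines."""

    def __init__(self):
        self.idle = deque()        # machines that have reported "idle"
        self.busy = {}             # machine -> job currently running on it
        self.background = deque()  # jobs waiting for a free machine

    def register(self, machine):
        # A machine joins the farm in the idle state.
        self.idle.append(machine)
        self._assign()

    def submit(self, job):
        # Every job passes through the queue; it is assigned at once
        # if some machine is idle.
        self.background.append(job)
        self._assign()

    def report_idle(self, machine):
        # A machine reports that its simulation has finished.
        self.busy.pop(machine, None)
        self.idle.append(machine)
        self._assign()

    def _assign(self):
        # Hand queued jobs to idle machines in first-come, first-served order.
        while self.idle and self.background:
            machine = self.idle.popleft()
            self.busy[machine] = self.background.popleft()


if __name__ == "__main__":
    d = Dispatcher()
    d.register("machine-1")
    d.submit("job-a")
    d.submit("job-b")            # no idle machine left, so it waits
    print(d.busy, list(d.background))
    d.report_idle("machine-1")   # job-b is now sent automatically
    print(d.busy, list(d.background))
```

First-come, first-served ordering is an assumption made for the sketch; the text does not say how the dispatcher orders its background jobs.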
IV. Coordinator
Every simulation machine has a coordinator program to communicate with the GUI and the dispatcher. The coordinator is responsible for the following tasks:

1. Forking a simulation engine process to perform a simulation. When the coordinator receives a simulation job from the dispatcher, it forks (executes) a simulation engine process to simulate the specified network and protocols. The forked simulation engine process terminates itself when its simulation is finished.

2. Reporting the status of the simulation machine to the dispatcher. The coordinator informs the dispatcher whether this machine is currently busy running a simulation. When executed, it first registers itself with the dispatcher to join the dispatcher’s simulation machine farm. Later on, when its status (idle or busy) changes, it notifies the dispatcher of the new status. Based on this machine status information, the dispatcher can choose an available machine from its machine farm to service a job.

3. Communicating with the GUI and the dispatcher. The simulation engine process periodically sends the current simulation time of the simulated network to the coordinator. The coordinator then forwards this information to the GUI to inform the GUI user of the simulation progress. During a simulation, the user can also set or retrieve an object’s value on-line (e.g., to query or set a switch’s switch table). Message exchanges between the simulation engine process and the GUI program are performed via the coordinator.
V. Real-life Application Program
Any real-life application program can be run to generate realistic network traffic, configure networks, monitor network traffic, etc. For example, in NCTUns, the tcpdump program can be run on a simulated node to capture packets flowing over a link, and the traceroute program can be run on a simulated node to find the routing path traversed by a packet.
VI. Kernel Patches
NCTUns uses the real-life network protocol stack in the Linux kernel to simulate transport-layer and network-layer protocols such as TCP, UDP, and IP. Very minor modifications to Linux kernel timers are made so that the timers used by the protocol stack in the kernel for simulated nodes can advance their times based on the simulation clock (which is controlled by NCTUns) rather than the real-world clock.
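The idea of timers that follow a simulation clock instead of the wall clock can be shown with a user-space analogy. The sketch below is not the kernel patch and makes no claim about how NCTUns modifies the Linux timers; `SimClock`, `add_timer`, and `advance_to` are hypothetical names chosen for illustration.

```python
import heapq


class SimClock:
    """Timers that expire according to a simulation clock, not real time."""

    def __init__(self):
        self.now = 0.0
        self._timers = []   # min-heap of (expiry_time, sequence, callback)
        self._seq = 0       # tie-breaker so callbacks are never compared

    def add_timer(self, delay, callback):
        # The expiry is computed from the simulation clock.
        heapq.heappush(self._timers, (self.now + delay, self._seq, callback))
        self._seq += 1

    def advance_to(self, t):
        # Fire every timer whose expiry is not later than the new time.
        while self._timers and self._timers[0][0] <= t:
            expiry, _, callback = heapq.heappop(self._timers)
            self.now = expiry
            callback(expiry)
        self.now = max(self.now, t)


if __name__ == "__main__":
    clock = SimClock()
    clock.add_timer(0.2, lambda t: print(f"retransmission timer fired at {t}"))
    clock.advance_to(1.0)   # however long this takes in real time
```

However fast or slowly the host executes the simulation, a timer here fires only when the simulator advances its own clock past the expiry time, which is the property the kernel modification is meant to provide for the protocol stack.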
3. Design and Implementation of IEEE 802.16j Transparent Mode Networks over NCTUns
In this section, we first describe the NCTUns module-based platform and then present the design and implementation of IEEE 802.16j transparent mode networks over NCTUns.
Figure 3.1. The NCTUns module-based platform.
3.1. Module-Based Platform
NCTUns provides a module-based platform. A module corresponds to a layer of a protocol stack. With this platform, one can easily insert a new module into NCTUns and replace the default module in the protocol stack with the new one. Figure 3.1 shows an example of protocol module combinations in NCTUns and illustrates how a packet traverses the modules from Host1 to Host2. In this case, two hosts are connected via a switch. Each host node has an IEEE 802.3 interface. Its protocol stack is composed of an Interface module, an ARP module, a FIFO module, an 802.3 module, a PHY module, and a LINK module, in sequence from top to bottom.

In the module-based platform, NCTUns chains all modules in a node together to form a stream. The downstream direction is used to simulate a node’s packet transmissions and the upstream direction is used to simulate a node’s packet receptions. A module sends a packet to its next module by calling the send() function. For a module to receive a packet from its previous module, it calls the recv() function. For a packet reception at a node, the protocol processing starts at the layer-1 protocol. The LINK module is created for specifying the connectivity among nodes in a simulated network. The PHY module is the first module to process incoming packets. The Interface module is the final module to process an incoming packet before this packet enters the kernel for IP and higher-layer protocol simulations.
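The chaining of modules into a downstream and an upstream direction can be sketched as follows. This is a Python analogy only: the actual NCTUns module API is not reproduced here, and the `Module` class, the `trace` argument, and `build_stack` are hypothetical. The module names are the ones listed for the IEEE 802.3 host stack above.

```python
class Module:
    """One layer of a node's protocol stack."""

    def __init__(self, name):
        self.name = name
        self.below = None   # next module in the downstream direction
        self.above = None   # next module in the upstream direction

    def send(self, packet, trace):
        # Downstream: process the packet, then pass it to the module below.
        trace.append(("send", self.name))
        if self.below is not None:
            self.below.send(packet, trace)

    def recv(self, packet, trace):
        # Upstream: process the packet, then pass it to the module above.
        trace.append(("recv", self.name))
        if self.above is not None:
            self.above.recv(packet, trace)


def build_stack(names):
    """Chain modules top to bottom and return them in that order."""
    modules = [Module(n) for n in names]
    for upper, lower in zip(modules, modules[1:]):
        upper.below = lower
        lower.above = upper
    return modules


if __name__ == "__main__":
    stack = build_stack(["Interface", "ARP", "FIFO", "802.3", "PHY", "LINK"])
    trace = []
    stack[0].send("packet", trace)   # transmission starts at the Interface module
    stack[4].recv("packet", trace)   # reception starts at the PHY module
    for direction, name in trace:
        print(direction, name)
```

Replacing a default module with a user-supplied one then amounts to putting a different object at the same position in the chain, which is the flexibility the platform is described as offering.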
3.2. Supported IEEE 802.16j Transparent Mode Network Topologies in NCTUns
To support the IEEE 802.16j transparent mode in NCTUns, we added three types of nodes: (1) the Transparent Mobile Relay Base Station (TMR-BS), (2) the Transparent Relay Station (T-RS), and (3) the Transparent Mobile Station (T-MS). A TMR-BS plays the same role as the base station in a conventional WiMAX PMP network. It is the central controller in the network and allocates link bandwidth for the T-RSs and T-MSs that it manages. A T-RS, on the other hand, simply forwards incoming data for its subordinate T-MSs and leaves the scheduling of these data to the TMR-BS. A T-MS, which is fully compatible with IEEE 802.16e networks, can work normally without any modifications.

Figure 3.2.1 shows the topology of the IEEE 802.16j transparent mode network that is supported in NCTUns. The TMR-BS provides services through a wired backhaul network and therefore has two interfaces: one for the wired network and the other for wireless communications with T-RSs and T-MSs. In contrast, both the T-RS and the T-MS have only one interface, which is a wireless interface used to communicate with the TMR-BS. When a T-MS wants to join the IEEE 802.16j transparent mode network, the TMR-BS is responsible for choosing an access station for the T-MS. The access station can be the TMR-BS or a T-RS. One can use a T-RS as the access station to make line-of-sight (LOS) transmission (shown in Figure 3.2.2) possible both between the TMR-BS and the T-RS and between the T-RS and the T-MS.
Figure 3.2.1. The supported IEEE 802.16j transparent mode network topology in NCTUns.
Figure 3.2.2. NLOS transmission and LOS transmission.
3.3. Protocol Stacks of IEEE 802.16j Transparent Mode Networks in NCTUns
Here we present the protocol stacks of the TMR-BS node, the T-RS node, and the T-MS node in NCTUns. We also present the relationship between them and how they can connect with each other. Their protocol stacks are shown in Figure 3.3.
Figure 3.3. The protocol stacks of the nodes used in IEEE 802.16j transparent mode networks.
3.3.1. IEEE 802.16j TMR-BS Node
The TMR-BS provides services for T-MSs and connects with a backbone network. It has two interfaces. One is an IEEE 802.3 Ethernet interface for connecting with the backbone network and the other is an IEEE 802.16j radio interface for communicating with T-RSs and T-MSs. Its IEEE 802.3 Ethernet interface must have enough bandwidth to support the entire IEEE 802.16j network. This IEEE 802.3 Ethernet interface uses the IEEE 802.3 protocol stack, which is composed of the Interface, ARP, FIFO, MAC8023, and PHY modules.

The IEEE 802.16j radio interface needs to work in accordance with the IEEE 802.16j standard. In the NCTUns implementation, an IEEE 802.16j interface has the following modules: an Interface module, a MAC802_16J_PMPBS module, an OFDMA_PMPBS_MR module, a CM module, and a LINK module. The main modules in the protocol stack of the TMR-BS are the MAC802_16J_PMPBS module and the OFDMA_PMPBS_MR module. The MAC802_16J_PMPBS module performs the functions of the MAC layer of a TMR-BS and the OFDMA_PMPBS_MR module performs the physical-layer functions, which use the OFDMA technology. The Channel Model (CM) module simulates various channel conditions such as signal power attenuation, shadowing, and multi-path fading effects.
3.3.2. IEEE 802.16j T-RS Node
The protocol stack of a T-RS is very similar to that of a T-MS because it acts as a T-MS from the TMR-BS’s point of view. Because a T-RS does not transmit framing messages such as the preamble and the DL-MAP, T-MSs will not notice the existence of a T-RS. This is why a T-RS is called a “transparent” RS. The T-RS has only one interface, an IEEE 802.16j wireless interface, which is used to communicate with the TMR-BS and T-MSs. The protocol stack of a T-RS is composed of an Interface module, a MAC802_16J_PMPRS module, an OFDMA_PMPRS_MR module, a CM module, and a LINK module. The MAC802_16J_PMPRS module performs the functions of the MAC layer of a T-RS, which include exchanging messages and relaying data between a TMR-BS and a T-MS. The OFDMA_PMPRS_MR module performs the physical-layer functions. It encodes and decodes the data transferred to the TMR-BS and T-MSs. Similarly, the Channel Model (CM) module is added to the protocol stack of a T-RS to simulate various channel conditions.
3.3.3. IEEE 802.16j T-MS Node
The T-MS has one interface, which is an IEEE 802.16j wireless interface, to communicate with the TMR-BS and T-RS. The protocol stack of the T-MS is composed of an Interface module, a MAC802_16J_PMPMS module, an OFDMA_PMPMS_MR module, a CM module, and a LINK module. The MAC802_16J_PMPMS module performs the functions of the MAC layer of a T-MS. These functions include receiving/transmitting messages from/to its TMR-BS and T-RS. The OFDMA_PMPMS_MR module performs physical-layer functions.
3.4. Design of IEEE 802.16j Transparent Mode Modules
3.4.1. MAC-Layer Module
In NCTUns, the functionalities and procedures of the IEEE 802.16j MAC layer can be partitioned into the following parts: the initial ranging procedure, the network entry procedure, management message negotiation, network management, connection management, and packet scheduling.

Initial ranging is the first procedure used by a T-MS to join an IEEE 802.16j transparent mode network. The main objective of ranging is to synchronize with the TMR-BS so that the T-MS can decode the received frame correctly. If this procedure succeeds, a T-MS/T-RS can start sending or receiving management messages. During the network entry procedure, the TMR-BS assigns connection identifiers (CIDs) to the T-MSs and T-RSs that have just attached themselves to it. After the network entry procedure is done, each T-MS or T-RS establishes three connections with the TMR-BS: a basic connection, a primary connection, and a data connection. The basic and primary connections are used to transfer management messages. The data connection is used to transmit data packets. The data packets sent from upper-layer protocols are classified into one of these connections and wait to be scheduled.

In an IEEE 802.16j transparent mode network, a T-RS can only operate in the centralized scheduling mode. In this mode, the bandwidth allocation for a T-RS’s subordinate stations (T-MSs) is determined at the TMR-BS. A T-RS only relays the data and management messages between the TMR-BS and a T-MS and does not perform bandwidth allocation. Figure 3.4.1 presents the procedure of BS packet scheduling. As presented in Figure 1.1, the transparent mode frame structure is divided into four zones: the DL access zone, the DL transparent zone, the UL access zone, and the UL relay zone. The purpose of BS packet scheduling is to schedule packet bursts in these zones.
3.4.2. PHY-Layer Module
At the physical layer, the Orthogonal Frequency Division Multiple Access (OFDMA) technique and the TDD frame structure are used. In NCTUns, the IEEE 802.16j OFDMA PHY is responsible for (1) processing the frame control header (FCH), (2) simulating the channel model, (3) performing channel encoding/decoding, and (4) performing modulations.
Figure 3.4.1. The procedure of packet scheduling in the base station.
The FCH is generated at the physical layer and transmitted at the beginning of each frame. It stores the Downlink Frame Prefix (DLFP), which specifies the length of the DL-MAP that immediately follows the DLFP and the repetition coding used for the DL-MAP. When a T-RS or T-MS receives a packet from the TMR-BS, it must be able to decode the DLFP at the physical layer so that it knows how long the DL-MAP is. With this length information, it can understand the resource allocation for the frame.

In the NCTUns implementation, the DL-MAP is first generated at the MAC layer and then sent to the physical layer. The DL-MAP and DLFP are then sent to T-RSs and T-MSs. When a T-RS or a T-MS receives the packet from the TMR-BS, it first decodes the DLFP and DL-MAP and then sends them to the MAC layer. With this information, the MAC layer understands the resource allocation for the DL sub-frame. The MAC layer then sends this information back to the physical layer so that the physical layer can decode the remaining zones.

Currently, NCTUns supports three types of modulation for IEEE 802.16j: QPSK, 16-QAM, and 64-QAM. The IEEE 802.16j standard allows dynamic selection of the modulation type depending on the channel quality. This is called the Adaptive Modulation and Coding (AMC) scheme. By adopting this technology, the data rate can be increased. Currently, NCTUns only supports the basic coding, which is the (tail-biting) Convolutional Code, with coding rates of 1/2, 2/3, and 3/4. The tail-biting convolutional code works by initializing the encoder with the last 6 bits of the block so that no additional padding bits are needed. The OFDMA PHY also provides repetition coding, which can be used to increase the correctness of data transmission. In NCTUns, the repetition code is only applied to the DLFP.
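To make the benefit of adaptive modulation and coding concrete, the following sketch computes how many information bits one modulated symbol carries for a given modulation and coding rate. It relies only on the standard bits-per-symbol values of QPSK, 16-QAM, and 64-QAM (2, 4, and 6) and the three coding rates named above. The function name is hypothetical, and the sketch does not check which modulation/rate pairs are actually permitted, because this section does not list them.

```python
from fractions import Fraction

# Coded bits carried by one symbol of each supported modulation.
BITS_PER_SYMBOL = {"QPSK": 2, "16-QAM": 4, "64-QAM": 6}

# Coding rates of the convolutional code mentioned in the text.
CODING_RATES = (Fraction(1, 2), Fraction(2, 3), Fraction(3, 4))


def info_bits_per_symbol(modulation, rate):
    """Information bits per modulated symbol after channel coding."""
    return BITS_PER_SYMBOL[modulation] * rate


if __name__ == "__main__":
    for modulation in BITS_PER_SYMBOL:
        for rate in CODING_RATES:
            bits = info_bits_per_symbol(modulation, rate)
            print(f"{modulation:7s} rate {rate}: {float(bits):.2f} bits/symbol")
```

Moving from the most robust combination to the least robust one raises the number of information bits per symbol several times over, which is why selecting the scheme according to channel quality can increase the data rate.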
4. Usage Examples of IEEE 802.16j Transparent Mode Network in NCTUns
In this section, we illustrate step by step how to conduct an IEEE 802.16j transparent mode network simulation in NCTUns. Five important tasks need to be performed for a simulation: (1) topology construction, (2) power setting, (3) channel setting, (4) QoS setting, and (5) mobility setting.
Topology Construction
The first step is to specify the desired network topology in the GUI environment. As shown in Figure 4.1, there are several node icons on the GUI’s tool bar at the top. In this case, we choose to create one host (Node 1), one TMR-BS (Node 2), one T-RS (Node 3), and two T-MSs (Node 4 and Node 5). The TMR-BS connects with the host (on the backhaul network) through a wired link and communicates with the other nodes in this topology through an IEEE 802.16j wireless interface.
Figure 4.1. An example 802.16j transparent mode network topology.
Figure 4.2. Selecting a group of nodes to form an IP subnet.
The GUI needs to generate an IP address for each node in the topology. To let the GUI know that Node 2 (TMR-BS), Node 3 (T-RS), Node 4 (T-MS), and Node 5 (T-MS) are on the same IP subnet, we need to group them together in the GUI. We use the following steps to form a subnet. (1) First, we click the “Form subnet” icon on the GUI’s tool bar and then left-click the four nodes that we want to be on the same subnet. (2) Then, we right-click any blank place in the GUI to end this grouping action. A pop-up dialog box will appear and show the IDs of the selected nodes. Figure 4.2 shows this dialog box.
Power Setting
The next step is to specify physical-layer parameter values. The GUI provides the tool “Specify physical-layer and channel model parameters” for a user to specify physical-layer attributes and the parameters of wireless channel models. It is used to specify (1) physical-layer/antenna attributes, (2) channel model attributes, (3) connectivity relationships, and (4) antenna gain pattern and directivity. One can launch this tool by first clicking the tool and then clicking the node that one wants to configure. Figure 4.3 shows the dialog box of this tool.

In part 3 of this dialog box, one can select one of two modes to display the connectivity relationship among nodes. One mode is “Use the transmitting node perspective” and the other is “Use the receiving node perspective.” In the former mode, the GUI uses dashed lines to show which other nodes in the topology will be interfered with when the chosen node is transmitting a packet. In the latter mode, the GUI uses dashed lines to show which other nodes in the topology will interfere with the chosen node’s packet reception if any of them transmits a packet at the same time.

If one chooses the “Use the transmitting node perspective” mode, one can specify physical-layer parameter values such as the transmit power (dBm), the antenna height (m), and many others on the left of the dialog box. If one instead chooses the “Use the receiving node perspective” mode, the “Node Connectivity Determination” column will be enabled, as shown in Figure 4.4. In this column, there are two options for determining node connectivity: (1) “Determined by power threshold” and (2) “Determined by distance.” With the first option, the GUI determines node connectivity by comparing the receive power value at the chosen node with a pre-defined receive power threshold value. The second option is designed for simplified routing modules that determine node connectivity by simply comparing the distance between nodes with a pre-defined distance value.

At the top-right corner of the dialog box is the channel model selection column. Two classes of channel models are supported: (1) Theoretical Channel Model and (2) Empirical Channel Model. The “Theoretical Channel Model” class collects the channel models that are developed using theoretical formulas. In this class, one should first select the path-loss model to be used in the simulation and then optionally select a collaborative fading model to simulate the fading effect more realistically. The “Empirical Channel Model” class collects various channel models that are developed based on real-life measurement results. So far, NCTUns supports 23 empirical channel models, among which the “COST_231_Hata” channel model is usually used for simulating WiMAX channels. It is also the channel model used in the following 802.16j network simulations.

The “Antenna Gain Pattern and Directivity” button is used to specify the gain pattern of the chosen node’s antenna and its directivity setting. Because we do not use this functionality in our simulations, we do not present this tool here. Detailed descriptions of these physical-layer tools are available in the NCTUns GUI user manual [7].

In an 802.16j transparent mode network simulation, correctly setting the power of each node is very important. The TMR-BS and T-RS should have a transmit power large enough to cover many T-MSs. The T-MS should have a transmit power that enables it to communicate with both the TMR-BS and the T-RS.
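The “Determined by power threshold” option can be illustrated with the widely published COST-231 Hata path-loss formula. The sketch below is an assumption-laden illustration, not the NCTUns implementation: the carrier frequency (2000 MHz) and the receive power threshold are hypothetical values chosen for the example, antenna gains and fading are ignored, and the formula is nominally specified for 1500 to 2000 MHz and distances of 1 to 20 km, so applying it to a link of a few hundred metres is an extrapolation.

```python
import math


def cost231_hata_loss_db(freq_mhz, dist_km, h_base_m, h_mobile_m,
                         metropolitan=False):
    """COST-231 Hata median path loss in dB (medium-city mobile correction)."""
    log_f = math.log10(freq_mhz)
    # Mobile antenna height correction for small and medium-sized cities.
    a_hm = (1.1 * log_f - 0.7) * h_mobile_m - (1.56 * log_f - 0.8)
    # Environment correction: 3 dB for metropolitan centres, otherwise 0 dB.
    c_m = 3.0 if metropolitan else 0.0
    return (46.3 + 33.9 * log_f - 13.82 * math.log10(h_base_m) - a_hm
            + (44.9 - 6.55 * math.log10(h_base_m)) * math.log10(dist_km)
            + c_m)


def is_connected(tx_power_dbm, loss_db, rx_threshold_dbm):
    """Power-threshold test: received power must reach the threshold."""
    return tx_power_dbm - loss_db >= rx_threshold_dbm


if __name__ == "__main__":
    # Hypothetical link: 43 dBm transmitter, 30 m base antenna,
    # 1.5 m mobile antenna, 430 m apart, 2000 MHz carrier (assumed).
    loss = cost231_hata_loss_db(2000.0, 0.43, 30.0, 1.5)
    print(f"path loss: {loss:.1f} dB, received: {43 - loss:.1f} dBm")
    print("connected at -90 dBm threshold:", is_connected(43, loss, -90.0))
```

The “Determined by distance” option would replace the power comparison with a direct comparison of the node distance against a pre-defined distance value.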
Figure 4.3. The dialog box for specifying physical-layer parameter values.
Figure 4.4. The parameter settings in the “receiving node perspective” mode.
Channel Setting
In the NCTUns design, the default channel ID chosen for the TMR-BS is the same as its Node ID. For example, because the TMR-BS in Figure 4.2 is Node 2, its channel ID is set to 2. To ensure that T-RSs and T-MSs can communicate with the TMR-BS on the same channel, one should set the channel ID of the T-RSs and T-MSs to the channel ID of their TMR-BS. This can be done with the following steps: (1) Double-click a T-RS or a T-MS node in the GUI and then left-click the “Node Editor” button in the pop-up dialog box. The “Node Editor” window for this node will appear; it is shown in Figure 4.5. (2) Inside the “Node Editor” window, double-click the PHY module box. The name of the PHY module box is OFDMA_PMPXX_MR_WIMAX, where XX may be “BS,” “RS,” or “MS,” depending on the node type. A dialog box for this PHY module will pop up, and inside this dialog box one can specify or modify the channel ID or other parameter values.
Figure 4.5. The node editor and the PHY module dialog box.
QoS Setting
The IEEE 802.16j standard defines five scheduling services: (1) Unsolicited Grant Service (UGS), (2) Real-time Polling Service (rtPS), (3) Non-real-time Polling Service (nrtPS), (4) Best Effort (BE), and (5) Extended real-time Polling Service (ertPS). At present, NCTUns only supports UGS scheduling, which provides a fixed uplink bandwidth for a T-MS. Figure 4.6 shows how to set the QoS provisions for T-MSs. In the pop-up dialog box, one can click the “Add” button to set the maximum uplink sustained rate (in Kbps) for every T-MS.
Figure 4.6. Setting the QoS provision for T-MSs.
Mobility Setting
The IEEE 802.16j standard supports MS mobility. The standard defines three kinds of handover mechanisms: hard handover, macro diversity handover (MDHO), and fast BS switching (FBSS). Since the hard handover mechanism is mandatory while the macro diversity handover and fast BS switching mechanisms are optional in the IEEE 802.16j standard, NCTUns at present only supports the hard handover mechanism for IEEE 802.16j networks, which is described below.
Hard Handover
The hard handover mechanism uses the principle of break-before-make. That is, the MS breaks the connection with the original BS before making a new connection with another BS. Although this may lower the handover quality, the MS does not need to maintain a list of BSs or spend radio resources maintaining it. Figure 4.7 shows the hard handover mechanism.
Figure 4.7. The hard handover mechanism. When the MS moves from BS1 to BS2, it has to disconnect the original connection with BS1 before it can make a new connection with BS2.
In the following, we present two example cases to illustrate the capability of NCTUns in dealing with handovers in an IEEE 802.16j network. The first example demonstrates a T-MS performing a handover from a T-RS to a TMR-BS. The second example demonstrates a T-MS performing a handover between two TMR-BSs, which requires the Mobile IP protocol.
Example 1. Handover from a T-RS to the TMR-BS
Figure 4.8 shows the topology used to conduct the simulation. In this simulation, we want T-MS1 to hand over from the T-RS to the TMR-BS. To make this happen, we purposely let T-MS1 move from its initial location towards the TMR-BS during the simulation at a speed of 10 m/sec. In the GUI, one can use the moving path tool icon to specify the moving path of T-MS1. Table 4.1 shows the parameter values used in this case.

Table 4.1. The parameter settings for example 1

Parameter              Item                 Value
Distance               TMR-BS – T-RS        270 m
                       TMR-BS – T-MS1       430 m
                       TMR-BS – T-MS2       200 m
                       T-RS – T-MS1         180 m
Antenna height         TMR-BS               30 m
                       T-RS                 20 m
                       T-MS1/T-MS2          1.5 m
Transmitting power     TMR-BS               43 dBm
                       T-RS                 43 dBm
                       T-MS1/T-MS2          35 dBm
Channel model                               COST_231_Hata
T-MS1 moving speed                          10 m/sec
Figure 4.8. An example where T-MS1 hands over from the T-RS to the TMR-BS.
Without using a T-RS, the TMR-BS and a T-MS need to exchange their packets directly. This may result in a low throughput between them when the transmission path between them is non-line-of sight (NLOS). The reason is that in such a condition the signal received by the T-MS and TMR-BS is very weak and this forces them to use a more robust but lowerefficiency modulation/coding scheme to transmit data. Deploying a T-RS between the TMRBS and the T-MS can solve this NLOS problem because now there is a LOS path between the TMR-BS and the T-RS and a LOS path between the T-RS and the T-MS. The result is that on both paths a less robust but higher-efficiency modulation/coding scheme can be used to transmit data. Therefore, the end-to-end throughput achieved on the TMR-BS -> T-RS -> TMS path can be higher than that achieved on the TMR-BS -> T-MS direct path. For a T-MS, depending on the quality of the path between it and the TMR-BS, it is not always better to use a T-RS to relay its packets. Whether to use a T-RS is determined by the path selection algorithm, which is presented below. MS Path Selection Algorithm To decide the access station for a T-MS, NCTUns implements a path selection algorithm proposed in [8]. The TMR-BS decides the T-MS’s access station according to the weight of the link between the T-MS and T-RS and the weight of the link between the T-RS and TMRBS. The weight of a link corresponds to the used modulation and coding rate used on that
link. The TMR-BS will choose the T-RS as the T-MS’s access station if the following condition is met:

Wr + Wp < Ws

Wr: the weight of the uplink from the T-MS to the T-RS
Wp: the weight of the uplink from the T-RS to the TMR-BS
Ws: the weight of the uplink from the T-MS to the TMR-BS

The relationship between the weights and the modulation and coding schemes (MCSs) is shown in Table 4.2. When the TMR-BS or the T-RS receives a signal from a T-MS, it compares the signal’s SNR with the values listed in Table 4.2 and chooses the row whose SNR is less than the received SNR and closest to it. For example, if the SNR received at the TMR-BS from a T-MS is 19, the TMR-BS chooses 64-QAM 2/3 as the modulation and coding scheme for this T-MS, and the corresponding weight is 1/4.

Using this algorithm, we can determine whether T-MS1 in this example uses the T-RS as its relay station at the beginning of the simulation. Table 4.3 shows the weights of the links among T-MS1, the T-RS, and the TMR-BS. From this table one sees that:
Wr + Wp = 1/4 + 2/9 ≈ 0.47 < Ws = 1

Hence, the TMR-BS chooses the T-RS as T-MS1’s access station at the beginning of the simulation. During the simulation, as T-MS1 moves towards the TMR-BS, it hands over from the T-RS to the TMR-BS at the 18th second. This can be explained by the weight changes shown in Figure 4.9. During the first 18 seconds, Wr + Wp is less than Ws, so T-MS1’s access station is the T-RS. After the 18th second, Ws is less than Wr + Wp, so T-MS1 hands over from the T-RS to the TMR-BS.

Table 4.2. The relationship between MCSs and weights

Modulation  Coding rate  Received SNR  Bits per symbol  Weight (symbols per bit)
QPSK        1/2          5             1                1
QPSK        3/4          8             3/2              2/3
16-QAM      1/2          10.5          2                1/2
16-QAM      3/4          14            3                1/3
64-QAM      1/2          16            3                1/3
64-QAM      2/3          18            4                1/4
64-QAM      3/4          20            9/2              2/9
Table 4.3. The weights of the links among T-MS, T-RS, and TMR-BS

Link             Symbol  Received SNR  Weight
T-MS1 – T-RS     Wr      19.28         1/4
T-RS – TMR-BS    Wp      29.61         2/9
T-MS1 – TMR-BS   Ws      3.23          1
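The selection rule and the numbers in Tables 4.2 and 4.3 can be checked with a short sketch. The Python below is an illustration only, not NCTUns code: the names `MCS_TABLE`, `link_weight`, and `use_relay` are hypothetical, and the fallback to the most robust scheme when the SNR lies below every listed value is an assumption based on the weight of 1 that Table 4.3 gives for an SNR of 3.23.

```python
from fractions import Fraction

# Rows of Table 4.2: (SNR value, MCS name, bits per symbol).
MCS_TABLE = [
    (5.0, "QPSK 1/2", Fraction(1)),
    (8.0, "QPSK 3/4", Fraction(3, 2)),
    (10.5, "16-QAM 1/2", Fraction(2)),
    (14.0, "16-QAM 3/4", Fraction(3)),
    (16.0, "64-QAM 1/2", Fraction(3)),
    (18.0, "64-QAM 2/3", Fraction(4)),
    (20.0, "64-QAM 3/4", Fraction(9, 2)),
]


def link_weight(snr):
    """Return the link weight (symbols per bit) for a received SNR."""
    # Keep the rows whose SNR value is less than the received SNR;
    # the last of them is the closest one.
    eligible = [row for row in MCS_TABLE if row[0] < snr]
    # Assumption: below every listed value, fall back to the most robust MCS.
    _, _, bits_per_symbol = eligible[-1] if eligible else MCS_TABLE[0]
    return 1 / bits_per_symbol


def use_relay(snr_ms_rs, snr_rs_bs, snr_ms_bs):
    """Return True when Wr + Wp < Ws, i.e. the T-RS is chosen."""
    wr = link_weight(snr_ms_rs)  # T-MS to T-RS uplink
    wp = link_weight(snr_rs_bs)  # T-RS to TMR-BS uplink
    ws = link_weight(snr_ms_bs)  # T-MS to TMR-BS uplink
    return wr + wp < ws
```

With the SNR values of Table 4.3, `use_relay(19.28, 29.61, 3.23)` compares 1/4 + 2/9 = 17/36 with 1 and therefore selects the T-RS, which agrees with the result stated in the text.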
Figure 4.9. Weight changing of T-MS1 when it moves towards the TMR-BS.
As noted above, relaying the packets exchanged between the TMR-BS and a T-MS through a T-RS is not always better. The reason is that when a T-RS is used, the downlink sub-frame of the IEEE 802.16j transparent mode must be divided into two parts of equal size, which reduces the total downlink bandwidth to one half of its original value. Path selection is therefore an important research issue in IEEE 802.16j transparent mode networks.
Example 2. Handovers between Two TMR-BSs

In this example, we show a T-MS handing over from one TMR-BS to another TMR-BS. In Figure 4.10, there are two TMR-BSs, TMR-BS1 and TMR-BS2, each of which forms a different IP subnet. Subnet1 is composed of TMR-BS1, T-RS1, and the T-MS, and Subnet2 is composed of TMR-BS2 and T-RS2. T-RS1 is attached to TMR-BS1, while T-RS2 is attached to TMR-BS2. The T-MS is attached to T-RS1 at the beginning of the simulation and moves towards T-RS2 during the simulation. When it moves out of the transmission range of T-RS1, the first handover occurs: the T-MS hands over from T-RS1 to TMR-BS1. As the T-MS continues to move towards T-RS2, the second handover occurs, this time from TMR-BS1 to TMR-BS2. As it moves further towards T-RS2, the third handover occurs, from TMR-BS2 to T-RS2. The occurrences of these handovers are shown in Figure 4.11.

Note that the IP subnet of TMR-BS1 is different from that of TMR-BS2. Therefore, if the Mobile IP protocol is not used, the T-MS cannot join the IP subnet of TMR-BS2 after it hands over from TMR-BS1 to TMR-BS2. To solve this problem, NCTUns provides the Mobile IP protocol to support roaming between different TMR-BSs. Two kinds of nodes should enable their Mobile IP functions: the TMR-BS and the T-MS. The T-RS does not need to be involved in the Mobile IP protocol. To enable the Mobile IP function, one first double-clicks the T-MS or TMR-BS and then clicks the “Mobile IP” tab in the dialog box. Next, one chooses the “Enable Mobile IP” option to configure the Mobile IP setting of the TMR-BS and the T-MS. The required steps are described below.
Figure 4.10. The T-MS hands over between two T-RSs.
Figure 4.11. The T-MS hands over from T-RS1 to TMR-BS1 and then to TMR-BS2.
Figure 4.12 shows that the dialog boxes of the T-MS and the TMR-BS are different. For the TMR-BS, one needs to fill in four columns: (1) Administered Mobile Node’s IP Address, (2) Wireless Interface IP Address, (3) port, and (4) Care-of-address. The Administered Mobile Node’s IP Address should be set to the IP address of the T-MS that is administered by this TMR-BS. In this case, because the T-MS is administered by TMR-BS1, this column should be filled with the T-MS’s IP address. Note that a TMR-BS has two IP addresses: one on the
connected wired network and the other on the wireless network. The “Wireless Interface IP Address” column should be filled with the TMR-BS’s wireless interface IP address. The “port” can be any port number. Finally, the “Care-of-address” should be filled with the TMR-BS’s IP address on its connected wired network. In contrast, the T-MS’s Mobile IP setting is simpler. One only needs to fill in the “Home Agent IP Address” and “My own IP Address.” In this case, the Home Agent IP Address should be set to TMR-BS1’s IP address on its wireless network.
Figure 4.12. The Mobile IP setting dialog boxes of the T-MS and TMR-BS.
5. Simulation Result Analyses and Verifications

In this section, we validate the simulation results generated by NCTUns in terms of achieved throughput and latency. Table 5.1 shows the basic OFDMA parameters used in the simulation studies.
5.1. Validation of Achieved Throughput of Greedy UDP Flows

Here we verify the throughput achieved in NCTUns simulations of an IEEE 802.16j transparent mode network. We set up a greedy UDP traffic flow with 1400-byte packets on the downlink and compare the achieved application-layer throughput with the theoretical throughput. To measure the maximum achievable throughput, we disabled bit errors in the channel model to avoid packet losses. We divide the validation into two cases: in the first, a T-MS communicates directly with the TMR-BS; in the second, a T-MS communicates with the TMR-BS through a T-RS.
Table 5.1. The basic OFDMA parameter values used in the simulations
Figure 5.1.1. The simulation topology for a T-MS to communicate directly with the TMR-BS.
Table 5.1.1. Greedy UDP throughputs under different modulation and coding schemes (without T-RS)

FEC  Modulation/Code rate  Slot size (bytes)  Scheduled slots  Theoretical throughput (Mbps)  Simulated throughput (Mbps)  Header overhead
0    QPSK 1/2              6                  623              5.840625                       5.6771                       2.80%
1    QPSK 3/4              9                  620              8.71875                        8.486                        2.67%
2    16QAM 1/2             12                 620              11.625                         11.3313                      2.53%
3    16QAM 3/4             18                 620              17.4375                        16.9472                      2.81%
4    64QAM 1/2             18                 620              17.4375                        16.952                       2.78%
5    64QAM 2/3             24                 620              23.25                          22.5985                      2.80%
6    64QAM 3/4             27                 620              26.15625                       25.4421                      2.73%
Figure 5.1.1 depicts the topology used for the first case, which is composed of one Host on the left, one TMR-BS in the middle, and one T-MS on the right. The greedy UDP flow is sent from the Host, through the TMR-BS, to the T-MS. We tested different modulation and coding schemes, and the validation results are shown in Table 5.1.1. The theoretical throughput can be calculated as follows:
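\[ \text{Theoretical throughput} = \frac{\text{Scheduled slots} \times \text{Slot size}}{\text{Frame duration}} \]

For example, the QPSK 1/2 throughput can be calculated as:

\[ \frac{623\ \text{slots} \times 6\ \text{bytes}}{5\ \text{ms}} = 747.6\ \text{KB/sec} = 5.840625\ \text{Mbps} \]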
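The theoretical throughput column of Table 5.1.1 can be reproduced with a few lines of Python. This is a sketch under stated assumptions rather than NCTUns code: a frame duration of 5 ms (200 frames per second) and a conversion of 1000 bytes per KB and 1024 Kb per Mb are inferred from the QPSK 1/2 figures of 747.6 and 5.840625, and the function name is hypothetical.

```python
# Assumption: 5 ms frames, i.e. 200 frames per second.
FRAMES_PER_SECOND = 200


def theoretical_throughput_mbps(scheduled_slots, slot_size_bytes):
    """Return the theoretical throughput for one table row."""
    # Bytes carried per second by the scheduled slots.
    bytes_per_second = scheduled_slots * slot_size_bytes * FRAMES_PER_SECOND
    # Assumed unit convention: 1 Kb = 1000 bits, 1 Mb = 1024 Kb.
    kilobits_per_second = bytes_per_second * 8 / 1000
    return kilobits_per_second / 1024
```

For the QPSK 1/2 row, 623 slots of 6 bytes give 747,600 bytes per second, which this function converts to approximately 5.840625.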
From this table, one sees that the simulated throughput is about 2.5% to 2.8% lower than the theoretical throughput. One reason for this difference is that when the TMR-BS schedules slots, it reserves some symbol allocations for broadcast messages and T-MS management messages. Because these symbols cannot be used by the upper layers, such as UDP or the application layer, the achieved UDP throughput cannot reach 100% of the theoretical throughput. Another reason is the overhead of the IP and UDP headers, which reduces the channel throughput available for carrying the data payload of the UDP flow. The following formulas show that, because of this header overhead, the maximum channel bandwidth available to the application layer is only about 97.3% of the theoretical channel bandwidth. This explains the difference of about 2.7% between the simulated UDP throughput and the theoretical throughput. With these two factors considered, the validation results presented in this table show that NCTUns generates precise IEEE 802.16j simulation results.
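The percentages in the last column of Table 5.1.1 are consistent with the relative gap between the theoretical and simulated throughputs. The following Python sketch recomputes that gap for each row; the names `TABLE_5_1_1` and `relative_difference_percent` are hypothetical, and the data are copied from the table.

```python
# (MCS, theoretical throughput, simulated throughput) from Table 5.1.1.
TABLE_5_1_1 = [
    ("QPSK 1/2", 5.840625, 5.6771),
    ("QPSK 3/4", 8.71875, 8.486),
    ("16QAM 1/2", 11.625, 11.3313),
    ("16QAM 3/4", 17.4375, 16.9472),
    ("64QAM 1/2", 17.4375, 16.952),
    ("64QAM 2/3", 23.25, 22.5985),
    ("64QAM 3/4", 26.15625, 25.4421),
]


def relative_difference_percent(theoretical, simulated):
    """Return how far the simulated value falls below the theoretical one."""
    return (theoretical - simulated) / theoretical * 100


if __name__ == "__main__":
    for mcs, theoretical, simulated in TABLE_5_1_1:
        gap = relative_difference_percent(theoretical, simulated)
        print(f"{mcs}: {gap:.2f}%")
```

Every row falls between roughly 2.5% and 2.8%, which matches the range reported in the table.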
Figure 5.1.2. The topology for a T-MS to communicate with the TMR-BS through a T-RS.
Table 5.1.2. Greedy UDP throughputs under different modulation and coding schemes (with T-RS)

FEC  Modulation/Code rate  Slot size (bytes)  Scheduled slots  Theoretical throughput (Mbps)  Simulated throughput (Mbps)  Header overhead
0    QPSK 1/2              6                  267              2.503125                       2.435                        2.72%
1    QPSK 3/4              9                  268              3.76875                        3.665                        2.75%
2    16QAM 1/2             12                 268              5.025                          4.887                        2.75%
3    16QAM 3/4             18                 269              7.565625                       7.358                        2.74%
4    64QAM 1/2             18                 269              7.565625                       7.358                        2.74%
5    64QAM 2/3             24                 269              10.0875                        9.812                        2.73%
6    64QAM 3/4             27                 269              11.3484375                     11.043                       2.69%

34546748
9 : 9;: