
International Conference of Computational Methods in Sciences and Engineering 2004 (ICCMSE 2004)

Lecture Series on Computer and Computational Sciences I

International Conference of Computational Methods in Sciences and Engineering 2004 (ICCMSE 2004)

Edited by Theodore Simos* and George Maroulis**

* Department of Computer Science and Technology, University of the Peloponnese, Tripolis, Greece
** Department of Chemistry, University of Patras, Patras, Greece

First published 2004 by VSP Published 2018 by Routledge 2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN 52 Vanderbilt Avenue, New York, NY 10017 Routledge is an imprint of the Taylor & Francis Group, an informa business

Copyright © 2004 by Koninklijke Brill NV, Leiden, The Netherlands. All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

A C.I.P. record for this book is available from the Library of Congress

ISBN 13: 978-90-6764-418-1 (pbk)
Series ISSN 1573-4196

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. i-iii

Preface for the Proceedings of the

International Conference of Computational Methods in Sciences and Engineering 2004 (ICCMSE 2004)

Recognised Conference by the European Society of Computational Methods in Sciences and Engineering (ESCMSE)

The International Conference of Computational Methods in Sciences and Engineering 2004 (ICCMSE 2004) takes place at the Hotel Armonia between 19 and 23 November 2004.

The aim of the conference is to bring together computational scientists from several disciplines in order to share methods and ideas. Topics of general interest are: Computational Mathematics, Theoretical Physics and Theoretical Chemistry, Computational Engineering and Mechanics, Computational Biology and Medicine, Computational Geosciences and Meteorology, Computational Economics and Finance, Scientific Computation, High Performance Computing, Parallel and Distributed Computing, Visualization, Problem Solving Environments, Numerical Algorithms, Modelling and Simulation of Complex Systems, Web-based Simulation and Computing, Grid-based Simulation and Computing, Fuzzy Logic, Hybrid Computational Methods, Data Mining, Information Retrieval and Virtual Reality, Reliable Computing, Image Processing, Computational Science and Education, etc.

The International Conference of Computational Methods in Sciences and Engineering (ICCMSE) is unique in its kind. It brings together original contributions from all fields of the traditional Sciences, Mathematics, Physics, Chemistry, Biology, Medicine and all branches of Engineering. It would perhaps be more appropriate to define the ICCMSE as a Conference on Computational Science and its applications to Science and Engineering. Based on the universality of mathematical reasoning, the ICCMSE favours the interaction of various fields of knowledge to the benefit of all. Emphasis on the multidisciplinary character of the Conference and the opening to new


forms of interaction was central to ICCMSE 2003, held at Kastoria, in the north of Greece. It would suffice to give here a powerful example of the new spirit championed by the ICCMSE. In Quantum Chemistry one applies Quantum Physics to the study of the electronic structure, properties and interactions of atoms, molecules and more complex systems. In order to obtain explicit results one needs efficient mathematical methods and algorithms. The application of these methods and algorithms depends almost exclusively on the use of high-performance computers. What is more, with the aid of modern technology it is possible to create particularly powerful visualization tools. Thus we have been able to study closely and to visualize fundamental chemical phenomena. Our ability to simulate the behaviour of molecules in condensed phases has opened new horizons in Medicine and Pharmacology. One easily perceives the interaction of Chemistry, Physics, Mathematics, Computer Science and Technology, Medicine and Pharmacology. Computational Science occupies a privileged position in this continuum. The principal ambition of the ICCMSE is to promote the exchange of novel ideas through the close interaction of research groups from all Sciences and Engineering.

In addition to the general programme the Conference offers an impressive number of Symposia. The purpose of this move is to define more sharply new directions of expansion and progress for Computational Science. We note that ICCMSE is co-sponsored by the American Chemical Society.

More than 490 extended abstracts have been submitted for consideration for presentation at ICCMSE 2004. From these we have selected 289 extended abstracts after international peer review by at least two independent reviewers. These accepted papers will be presented at ICCMSE 2004. After ICCMSE 2004 the participants can send their full papers for consideration for publication in one of the nine journals that have agreed to publish selected Proceedings of ICCMSE 2004. We would like to thank the Editors-in-Chief and the Publishers of these journals. The full papers will be considered for publication based on international peer review by at least two independent reviewers.

We would also like to thank:

• The Scientific Committee of ICCMSE 2004 (see page iv for the Conference Details) for their help and important support. We must note here that it is a great honour for us that leaders in Computational Sciences and Engineering have accepted to participate in the Scientific Committee of ICCMSE 2004.



• The Symposium Organisers for their excellent editorial work and their efforts for the success of ICCMSE 2004.



• The invited speakers for accepting to give keynote lectures on Computational Sciences and Engineering.




• The Organising Committee for their help and activities for the success of ICCMSE 2004.



• Special thanks to the Secretary of ICCMSE 2004, Mrs Eleni Ralli-Simou (who is also the Administrative Secretary of the European Society of Computational Methods in Sciences and Engineering (ESCMSE)) for her excellent work.

Prof. Theodore Simos
President of ESCMSE, Chairman ICCMSE 2004, Editor of the Proceedings
Department of Computer Science and Technology, University of the Peloponnese, Tripolis, Greece

Prof. George Maroulis
Co-Editor of the Proceedings
Department of Chemistry, University of Patras, Patras, Greece

September 2004

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. iv-v

Conference Details

International Conference of Computational Methods in Sciences and Engineering 2004 (ICCMSE 2004), Hotel Armonia, 19-23 November, 2004. Recognised Conference by the European Society of Computational Methods in Sciences and Engineering (ESCMSE)

Chairman and Organiser
Professor T.E. Simos, President of the European Society of Computational Methods in Sciences and Engineering (ESCMSE), Active Member of the European Academy of Sciences and Arts and Corresponding Member of the European Academy of Sciences, Department of Computer Science and Technology, Faculty of Sciences and Technology, University of Peloponnese, Greece.

Scientific Committee
Prof. H. Agren, Sweden; Prof. H. Arabnia, USA; Prof. J. Vigo-Aguiar, Spain; Prof. Dž. Belkić, Sweden; Prof. K. Belkić, USA; Prof. E. Brandas, Sweden; Prof. G. Maroulis, Greece; Prof. R. Mickens, USA; Dr. Psihoyios, UK; Prof. B. Wade, USA; Prof. J. Xu, USA; Prof. Risto M. Nieminen, Finland

Invited Speakers
Prof. J. Vigo-Aguiar, Spain; Prof. Dž. Belkić, Sweden; Prof. K. Belkić, USA; Prof. Paul G. Mezey, Canada; Dr B. Champagne, Belgium; Prof. V. Barone, Italy; Prof. S. Farantos, Greece; Prof. R. Fournier, Canada


Prof. M. Kosmas, Greece; Prof. P. Kusalik, Canada; Dr S. Pal, India; Dr M.G. Papadopoulos, Greece; Prof. C. Pouchan, France; Prof. B.M. Rode, Austria; Prof. K. Szalewicz, USA; Prof. A.J. Thakkar, Canada


VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. vi-vii

European Society of Computational Methods in Sciences and Engineering (ESCMSE)

Aims and Scope

The European Society of Computational Methods in Sciences and Engineering (ESCMSE) is a non-profit organization. The URL address is:

The aims and scope of the ESCMSE are the construction, development and analysis of computational, numerical and mathematical methods and their application in the sciences and engineering. In order to achieve this, the ESCMSE pursues the following activities:
• Research cooperation between scientists in the above subject.
• Foundation, development and organization of national and international conferences, workshops, seminars, schools, symposiums.
• Special issues of scientific journals.
• Dissemination of the research results.
• Participation and possible representation of Greece and the European Union at the events and activities of international scientific organizations on the same or similar subject.
• Collection of reference material relative to the aims and scope of ESCMSE.

Based on the above activities, ESCMSE has already developed an international scientific journal called Applied Numerical Analysis and Computational Mathematics (ANACM). This is in cooperation with the leading international publisher, Wiley-VCH. ANACM is the official journal of ESCMSE. As such, each member of ESCMSE will receive the volumes of ANACM free of charge.

Categories of Membership

Initially the categories of membership will be:
• Full Member (MESCMSE): PhD graduates (or equivalent) in computational or numerical or mathematical methods with applications in sciences and engineering, or others who have contributed to the advancement of computational or numerical or


mathematical methods with applications in sciences and engineering through research or education. Full Members may use the title MESCMSE.
• Associate Member (AMESCMSE): Educators, or others, such as distinguished amateur scientists, who have demonstrated dedication to the advancement of computational or numerical or mathematical methods with applications in sciences and engineering may be elected as Associate Members. Associate Members may use the title AMESCMSE.
• Student Member (SMESCMSE): Undergraduate or graduate students working towards a degree in computational or numerical or mathematical methods with applications in sciences and engineering or a related subject may be elected as Student Members as long as they remain students. Student Members may use the title SMESCMSE.
• Corporate Member: Any registered company, institution, association or other organization may apply to become a Corporate Member of the Society.

Remarks:
1. After three years of full membership of the European Society of Computational Methods in Sciences and Engineering, members can request promotion to Fellow of the European Society of Computational Methods in Sciences and Engineering. The election is based on international peer review. After the election of the initial Fellows of the European Society of Computational Methods in Sciences and Engineering, another requirement for election to the category of Fellow will be the nomination of the applicant by at least two (2) Fellows of the European Society of Computational Methods in Sciences and Engineering.
2. All grades of members other than Students are entitled to vote in Society ballots.
3. All grades of membership other than Student Members receive the official journal of the ESCMSE, Applied Numerical Analysis and Computational Mathematics (ANACM), as part of their membership. Student Members may purchase a subscription to ANACM at a reduced rate.

We invite you to become part of this exciting new international project and participate in the promotion and exchange of ideas in your field.

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. viii-xxi

Table of Contents Μ M Aghdam, J.P. Vafa Bending Analysis of Thick Rectangular Plates with Various Boundary Conditions Using Extended Kantorovich Method Μ. M. Aghdam, V. Erfanian Application of the Extended Kantorovich Method to Bending Analysis of Sector Plates G. Ala, F. Viola An Efficient Solver for Electromagnetic Transient Simulation T. Albaret, A. De Vita, P. Gumbsch Brittle Fracture in Silicon Studied by an Hybrid Quantum/Classical Method (LOTF) Harley Souza Alencar, Marco Antonio do Nascimento, Helcio Villanova Preliminary Application of CFX as Tool in the Aerodynamic Study of Combustion Chamber for Gas Micro Turbine Y Y. AL-Obaid, Y F. AL-Obaid Three-dimensional Finite Element Mesh Generation for Steel Pipeline Used for Natural Gas Transport Sergio Amat, Sonia Busquier, J. Manuel Gutierrez An Adaptive Version of a Fourth Order Iterative Method for Quadratic Equations Sergio Amat, Rosa Donat, Jacques Liandrat, J. Carlos Trillo A Fully Adaptive PPH Multiresolution Scheme for Image Processing Z A. Anastassi and T.E. Simos Trigonometrically Fitted Runge-Kutta Methods of Order Five for the Numerical Solution of the Schrbdinger Equation T. Arima, Y Matsuura, S. Oharu Numerical Flow Analysis Around Structures in Low Speed Fluids an its Application to Environmental Sciences P. I. Arsenos, C. Demakos, Ch. Tsitouras, D. G. Pavlou and M. G. Pavlou Derivation and Optimization of the Material Cost Function for Natural Gas PipeLines J. Avellar, L.G.S. Duarte, S.E.S. Duarte, L.A.C.P. da Mota A Semi-Algorithm to Find Elementary First Order Invariants of Rational Second Order Differential Equations A. Azimi, S. Kazemzadeh Hannani andB. Farhanieh Simultaneous Estimation of Heat Source and Boundary Condition in TwoDimensional Transient Inverse Heat Conduction Problem P. G. Bagos, Th. D. Liakopoulos, S.J. Hamodrakas Efficient Training of Hidden Markov Models for Protein Sequence Analysis L. Bayon, J.M. Grau, M.M. Ruiz, P.M. Sudrez A Bolza’s Problem in Hydrothermal Optimization Yasar Becerikli, B. Koray Celik Fuzzy Control of Inverted Pendulum and Concept of Stability Based on Java Karen Belkic The Need for High-Resolution Signal Processing in Magnetic Resonant Spectroscopy for Brain Tumor Diagnostics F. Bouchelaghem, A. Ben Hamida, H. Dumontet A Study of the Mechanical Behavior of an Injected Sand by a Numerical Homogenization Approach

1-5

6-9

10-13 14-15

16-19

20-24

25-28 29-32 33-36

37-40

41-44

45-48

49-52

53-56 57-60 61-64 65-69

70-74


K. Bousson and S. D. Correia Densiflcation and Dynamic Canonical Descent: An Optimization Algorithm Yiannis S. Boutalis New Kernel Definition for Robust to Noise Morphological Associative Memories A. G. Bratsos, D. P. Papadopoulos and Ch. Skokos A numerical solution of the Boussinesq equation using the Adomian method A. G. Bratsos, I. Th. Famelis and Ch. Tsitouras Phase-Fitted Numerov type methods P. Buffel, G. Lagae, R. Vanlmpe, W. Vanlaere Determination of elastic buckling loads for lateral torsional buckling of beams including contact P. Caballero-Gil Probability to Break Certain Digital Signature Schemes P. Caballero-Gil, C. Herndndez-Goya and C. Bruno-Castaheda A Rational Approach to Gryptographic Protocols X Cai, Y. J. Li, F. S. Tu An Improved Dynamic Programming Algorithm for Optimal Manpower Planning 0. Ceylan, O. Kalenderli Parallel Computation of Two Dimensional Electric Field Distribution Using PETSC M. Chalaris, J. Samios Hydrogen Bonding In Aqueous Mixtures Containing Aprotic Solvents: A Computer Simulation Study Benoit Champagne Quantum Theory and Simulation Aspects of the Design of Molecules for Nonlinear Optics Applications D. Chen, X. L. Ma Electronic and Structural Properties of the Bl-Aln/Tin Superlattices Jian Chuanwen, Etorre Bompard A Self-Adaptive Chaotic Particle Swarm Algorithm for Short-Term Hydro System Scheduling in Deregulate Environment Gilles Cohen, Patrick Ruch Model Selection for Support Vector Classifiers via Direct Search Simplex Method Z Cournia, G. M. Ullmann andJ. C. Smith Molecular Dynamics of Cholesterol in Biomembranes L. Dedik and M. Durisovd Advanced System-Approach Based Methods for Modeling Biomedical Systems J. J. del Coz Diaz, L. Pehalver Lamarca, P. J. Garcia Nieto, J. L. Sudrez Sierra, F. J. Sudrez Dominguez and J. A. Vildn Vildn Non-linear Analysis of Sheet Cover (Umbrella) of Reinforced Concrete of 40 m diameter J. J. del Coz Diaz, P. J. Garcia Nieto, J. A. Vildn Vildn, A. Martin Rodriguez, F. J. Suarez Dominguez, J. R. Prado Tamargo and J. L. Sudrez Sierra Non-linear Analysis and Warping of Tubular Pipe Conveyors S. Delage, S. Vincent, J. P. Caltagirone andJ. P. Heliot An hybrid linking approach for solving the conservation equations with an Adaptive Mesh Refinement method 1. Dorta, C. Leon, C. Rodriguez Performance Analysis of Branch-and-Bound Skeletons M. El-Cheikh, J. Lamb, N Gorst Risk Quantitative Analysis Using Fuzzy Sets Theory C. Erdonmez, H. Saygin Neutron Diffusion Problem Solutions Using The Method of Fundamental Solutions with Dual Reciprocity Method M. R. Eslami and H. Mahbadi A Numerical Method for Cyclic Loading Analysis of Beams M. R. Eslami and A. Bagri Higher Order Elements for the Analysis of the Generalazied Thermoelasticity of Disk Based on the Lord Shulman Model

75-78 79-82 83-86 87-90 91-94

95-97 98-101 102-105 106-109

110-114

115-117

118-121 122-126

127-130 131-135 136-139 140-144

145-149

150-153

154-158 159-162 163-168

169-172 173-176


E. D. Farsirotou, J. V. Soulis, V. D. Dermissis Numerical Simulation of Two-Layered, Unsteady, Stratified Flow Gonzalo Cerruela Garcia, Irene Luque Ruiz and Miguel Angel Gomez-Nieto Clustering Chemical Data Bases Through Projection of MOS Similarity Measures on Multidimensional Spaces I. G. Georgoudas, G. Ch. Sirakoulis, I. Andreadis A Potential - Based Cellular Automaton Model for Earthquake Simulation D. A. Goussis, G. Skevis and E. Mastorakos On Transport-Chemistry Interaction in Laminar Premixed Methane-Air Flames Yousong Gu, Yue Zhang and Zhen Ji Ab inito Calculations of High Pressure Effect on F e3 Pt by CASTEP

177-180

Serge Hayward Building Financial Time Series Predictions with Neural Genetic System Boshu He, Meiqian Chen, Laiyu Zhu, Jianmin Wang, Shumin Liu, Lijuan Fan, Qiumei Yu Numerical Solutions to the Reheater Panel Overheating of a Practical Power Plant T. Hoseinnejad and H Behnejad Direct Determination of Pair Potential Energy Function from Extended Law of

198-199

Corresponding States and Calculation of Thermophysical Properties for CO 2 -N 2 MZubaer Hossain Variable Reduction Scheme for Finite-Difference Modeling of Elastic Problems of Solid Mechanics M Zubaer Hossain Generalized Kinematics for Energy Prediction of N-link Revolute Joint Robot Manipulator Wei Hua, Xueshen Liu and Peizhu Ding Stationary Solutions of the Gross-Pitaevskii Equation for Neutral Atoms in Harmonic Trap R. I. Il'kaev, V. T. Punin, A. Ya. Uchaev, S. A. Novikov, N. I. S el’ chenkova Formalism of Second-Kind Phase Transitions in the Dynamic Failure Phenomenon S. Jalili and F. Moradi Theoretical Investigation of Conductance Properties of Molecular Wires L. A. A. Nikolopoulos and A. Maras Information Leakage in a Quantum Computing Prototype System: Stochastic Noise in a Microcavity Alessandra Jannelli and Riccardo Fazio Adaptive Stiff Solvers at Low Accuracy and Complexity P. Johnson, K. Busawon and S. Danaher Refining Existing Numerical Integration Methods Byoung-Hyun Ju and Kyungsook Han Visualization of Signal Transduction Pathways Byong-Hyon Ju and Kyungsook Han A Heuristic Algorithm for Finding Cliques and Quasi-Cliques in Molecular Interaction Networks Z. Kalogiratou, Hi. Monovasilis, T.E. Simos Construction of Assymptotically Symplectic Methods for the numerical solution of the Schrddinger Equation - application to 5th and 7th order Agnes Kapou, Nikos Avlonitis, Anastasia Detsi, Maria Koufaki, Theodora Calogeropoulou, Nikolas P. Benetis and Thomas Mavromoustakos CoMFA and CoMSIA 3D-Quantitative Structure-Activity Relationships of New Flexible Antileishmanial Ether Phospholipids Athanassios Katsis, Hector E. Nistazakis Sample Size Criteria for Estimating the Prevalence of a Disease G. Papakaliatakis The Influence of Contact Interaction Between the Debonded Surfaces on Debonding Arrest in Graphite/Aluminum Composite A.N.F. Klimowicz and M.D. Mihajlovic Accurate real-time simulation of semi-deformable tubes

181-184

185-189 190-193 194-197

200-203

204-206

207-210

211-214

215-218

219-222 223-225 226-229

230-233 234-238 239-243 244-247

248-252

253-256

257-260 261-264

265-268


S. Korotov and P. Turchyn A posteriori error estimation of goal-oriented quantities for elliptic type BVPs H. Koshigoe Direct solver for an inverse problem of the type of the transmission M. Kosmas An Equation of State of Polymeric Melts C. Koutitas, M. Gousidou A Model and a Numerical Solver for the Flow Generated by an Air-Bubble Curtain in Initially Stagnant Water A. P. Kurmis, JP Slavotinek, K J Reynolds, TC Hearn Advanced magnetic resonance-based computational modeling in medicine: technology contributing to improved clinical outcomes Bjorn Kvamme, Tatyana Kuznetsova and Kjetil Aasoldsen Molecular Dynamics Simulations for Selection of Kinetic Hydrate Inhibitors Bjorn Kvamme, Tatyana Kuznetsova, Andreas Hebach, Alexander Oberhof, Eivind Lunde Measurements and modeling of interfacial tension for water+carbon dioxide systems at elevated pressures E. Lamkanfi, L. Vermaere, R. Van Impe, W. Vanlaere, P. Buffel Optimization of a Space Truss Dome -Evolution Strategies Wang Zunyao, Zhai Zhicai, Wang Liansheng Thermodynamic Property and Relative Stability of 76 Polychlorinated Naphthalene by Density Functional Theory Bo Liao, Tianming Wang, Kequan Ding On A Seven-Dimensional Representation of RNA secondary Structures W. S. Lin, N. Cassaigne Frameworks for Intelligent Shopping Support George P. Lithoxoos andJannis Samios Monte Carlo Simulation of Gas Adsorption in Single Walled Nanotubes F. X. Long and B. S. Gevert Dynamic Simulation of Diffusion and Sequential Reactions in Hydrodemetallization Catalyst Particles D. Lopez-Rodriguez and E. Merida-Casermeiro Matrix Bandwidth Minimization: A Neural Approach Jean M.-S. Lubuma andKailash C. Patidar Non-standard Finite Difference Method for Self-adjoint Singular Pertubation Problems S. Mahjoob, M. Mani The Performance of Cfd Methods in Aerodynamic Estimation of Circular and Non-Circular and Non-Circular Bodies P. Manetos, Y. N. Photis Integrating Data Mining Methods for Modeling Urban Growth Dynamics Nikos D. Margetis and Andreas D. Koutselos Nonequilibrium Molecular Dynamics Simulation of Diatomic Ions in Supercritical Gases in an Electrostatic Field F. Martinez, R. Garcia, M. Munuera, A. Guillamon & I. Rodriguez Largest Lyapunov Exponent Application to the Pollen Time-Series and Related Meteorological Time-Series F. Martinez, L Rodriguez, A. Guillamon, & M. Munuera Using Fractal Dimension to Characterize Pollen Time-Series J. Martin-Vaquero and J. Vigo-Aguiar On the Stability of Exponential Fitting BDF Algorithms: Higher-Order Methods Hitoshi Maruyama, Yoshihiro Asai and Koichi Yamashita Quantum Conductivity of Single Organic Molecules J. Mateu and J. A. Lopez Modeling spatial clustering through point processes: a computational point of view J. M. Mafias, W. Gonzdlez-Manteiga, J. Taboada and C. Ordonez Managing Heterogeneity in Time Series Prediction


269-273 274-278 279-282 283-287

288-293

294-297 298-301

302-305 306-309

310-312 313-316 317-319 320-323

324-327 328-331

332-336

337-340 341-342

343-346

347-350 351-353 354-356 357-365 366-369


Rolando Menchaca-Mendez, Leandro Ballladares Ocaha, Bruces Campbell, Ruben Peredo Valderrama Web-Browsing Collaborative 3-D Virtual Worlds: A Framework for Individuals, Artifacts, and Decorations E. Merida-Casermeiro andD.Lopez-Rodriguez Multivalued Neural Network for Graph MaxCut Problem G. B. Mertzios, P. K Sotiropoulos An Algorithmic Solution of Parameter-Varying Linear Matrix Inequalities in Toeplitz Form Gergely Meszdros-Komdromy Simulation-based Choice of Strategies in the Development of Regions Khaireel A. Mohamed Increasing the Accuracy of Anticipation with Lead-Lag Timing Analysis of Digital Freehand Writings for the Perceptual Environment Th. Monovasilis, Z. Kalogiratou, T. E. Simos Fourth order Trigonometrically-fitted and Exponentially-fitted Symplectic Methods for the Numerical Integration of the Schrddinger Equation A. Mora, P. Cordero, M. Enciso, I. P. de Guzmdn Minimal generator of a non-deterministic operator Abdul Hafid M. Elfaghi and Ali Suleiman M. Bohr Enel Computations of Gas Turbine Blades Heat Transfer Using Two- Equation Turbulence Models Marek Naroznik, Jan Niedzielski Analytical Methods in Computer Modeling of Weakly Bound Hydrocarbon Radicals in the Gas Phase D. G. Natsis, A. G. Bratsos andD. P. Papadopoulos A Numerical Scheme for a Shallow Water Equation S. Oharu, Y. Oharu, D. Tebbs A Time-Dependent Product Formula and its Application to an HIV Infection Model N. A. Panayiotou., N. P. Evangelopoulos., S. T. Ponis Multi Level Decision Support in a Soccer Ticket Club Call Centre with the Use of Simulation W. G. Park, Η H Chun, M. C. Kim Flow Analysis of Flush Type Intake Duct of Waterjet G. Petihakis, G. Triantafyllou, G. Korres and A. Theodorou An Integrated Mathematical Tool. Part-I: Hydrodynamic Modeling G. Petihakis, G. Triantafyllou, G. Korres and A. Theodorou An Integrated Mathematical Tool. Part-II: Ecological Modeling Helfried Peyrl, Florian Herzog and Hans P. Geering Numerical Solution of the Hamilton-Jacobi-Bellman Equation in Stochastic Optimal Control with Application of Portfolio Optimization G. Psihoyios An Enhanced MEBDF Approach for the Numerical Solution of Parabolic Partial Differential Equations G. Psihoyios Towards a General Formula for the Stability Functions of a Family of Implicit Multistep Methods Jalil Rasekhi, Jalil Rashed-Mohassel Centre of Excellence and Wild life : A New Feature of Genetic Algorithm Md. Abdur Razzaque, Md. Ahsan Habib, Md. Arman Hossain, Md. Mahjuzur Rahman An Efficient Disconnected Operation Protocol in SAN Distributed File System M. Remesikovd Numerical Solution of Two-Dimensional Contaminant Transport Problem with Dispersion and Adsorption in a Groundwater Layer Piotr Romiszowski and Andrzej Sikorski Monte Carlo Study of Polymer Translocation Through a Hole P. Romiszowski and A. Sikorski Properties of Star-Branched and Linear Chains in Confined Space. A Computer Simulation Study

370-374

375-378 379-382

383-386 387-390

391-395

396-399 400-403

404-405

406-409 410-413

414-418

419-423 424-427 428-431 432-435

436-439

440-443

444-447 448-452 453-456

457-460 461-463


D. P. Sahas and T. E. Simos A Trigonometrically Fitted Method of Seventh Algebraic Order for the Numerical Solution of the One-Dimensional Schrbdinger Equation D. Satkovskieni, P. Pipiraite, R. Jankauskas and J. Sulskus Computational Methods in Solving the Additivity Problem in Chemistry: Conformational Energies of Chloroalkanes A. Schug, T. Merges, A. Verma, W. Wenzel All-Atom Protein Structure Prediction with Stochastic Optimization Methods Elena Sendroiu Presheaf and Sheaf Computation Ioannis Skarmoutsos and Jannis Samios Molecular Dynamics Simulation of Cis-Trans N-Methylformamide (NMF) Liquid Mixture. Structure and Dynamics J. V. Soulis and A. J. Klonidis Numerical Simulation of Two-Dimensional Dam-Break Flows S. Spartalis, L. Iliadis, F. Maris Using Fuzzy Sets, Fuzzy Relations, Alpha Cuts and Scalar Cardinality to estimate the Fuzzy Entropy of a Risk evaluation System: (The case of Greek Thrace Torrential Risk) V. Van Speybroeck, G. B. Marin, M. Waroquier Calculation of Hydrocarbon Bond Dissociation Energies in a Computationally Efficient Way Yu. I. Sukharev, B. A. Markov, I. Yu. Sukhareva Ion Pulsation in Oxyhydrate Gel System G. N. Triantafyllou, I. Hoteit, A. I. Pollani Toward a Pre-Operational Data Assimilation System for the E. Mediterranean using Kalman Filtering Techniques T. N. Truong Computational Science and Engineering Online Ch. Tsitouras Stage Reduction on P-Stable Numerov Type Methods of Eighth Order Evangelos Tzanis Designing Recurrence Sequences: Properties and Algorithms A. Ya. Uchaev, V.T. Punin, S.A. Novikov, Ye. V. Kosheleva, L.A. Platonova, N. I. Selchenkova Nonergodic Behavior of Dissipative Structures - The Cascade of Fracture Centers in Dynamic Fracture of Metals A. Ya. Uchaev, R.I. Ilkayen, V.T. Punin, A.P. Morovov, S.A. Novikov, N. I. Selchenkova, N. A.. Yukika, L.A. Platonova, Ye. V. Kosheleva, V.V. Zhmailo Scaling Properties of Dissipative Structures Produced at the Metal Nanolevel Under Effects of Ultrashort Laser Radiation Pulses Over a Nanosecond Longevity Range A. Ya. Uchaev, R.I. Ilkayen, V.T. Punin, S.A. Novikov, Ye. V. Kosheleva, Ye. V. Kosheleva, N.I. Zavada,, L.A. Platonova, N. I. Selchenkova Universal Properties of Metal Behavior in Dynamic Fracture Over a Wide Longevity Range Shown Under Effects of High-Power Penetrating Radiation Pulses R. Peredo Valderrama and L. Balladares Ocaha Proposal for Development of Reusable Learning Materials for Wbe Using Irlcoo and Agents Z. A. Anastassi and T. E. Simos A Trigonometrically Fitted Runge-Kutta Pair of Orders Four and Five for the Numerical Solution of the Schrddinger Equation F. J. Veredas A Web-Based Simulator of Spiking Neurons for Correlated Activity Analysis L. Vermaere, E. Lamkanfi, R. Van Impe, W. Vanlaere, P. Buffel Optimization of a Space Truss Dome - Genetic Algorithms G. D. Verros, T. Latsos and D. S. Achillas Development of a Unified Mathematical Framework for Calculating Molecular Weight Distribution in Diffusion Controlled Free Radical Homo-polymerization

464-466

467-470

471-474 475-478 479-482

483-486 487-490

491-495

496-499 500-505

506-509 510-513 514-518 519-522

523-526

527-529

530-534

535-538

539-542 543-546 547-550


D. S. Vlachos Numerical and Monte-Carlo Calculation of Photon Energy Distribution in Sea Water D. S. Vlachos and A. C. Xenoulis Monte-Carlo Simulation of Clustering in a Plasma-Discharge Source D. S. Vlachos and T. E. Simos On Frequency Determination in Exponential Fitting Multi-step Methods for ODEs D. S. Vlachos A Hybrid Adaptive Neural Network for Sea Waves Forecasting C. Voglis and I. E. Lagaris A Rectangular Trust-Region Approach for Unconstrained and Box-Constrained Optimization Problems Vassilios Vonikakis, Ioannis Andreadis and Antonios Gasteratos A New Approach to Machine Contour Perception Xiongwu Wu and Bernard R. Brooks Isotropic Periodical Sum: An Efficient and Accurate Approach to the Calculation of Long-Range Interactions Pang Xiao-feng and Feng Yuan Ping Nonlinear Excitation and Dynamic Features of Deoxyribonucleic ACID (DNA) Molecules Pan Xiao-feng and Yu Jia-feng and Lao Yu-hui Influences of Structure Disorders in Protein Molecules on the Behaviors of Soliton Transferring Bio-Energy Yiming Li, Cheng-Kai Chen and Shao-Ming Yu A Two-Dimensional Thin-Film Transistor Simulation Using Adaptive Computing Algorithm Kailiang Yin, Heming Xiao, Jing Zhong and Duanjun Xu A New Method for Calculation of Elastic properties of Anisotropic Material by Constant Presssure Molecular Dynamics Kailiang Yin, Duanjun Xu, Chenglung Chen A Brand New Reactive Potential RPMD Based on Transition State Theory Developed for Molecular Dynamics on Chemical Reaction Shao-Ming Yu, Shih-Ching Lo, Yiming Li, and Jyun-Hwei Tsai Analytical Solution of Nonlinear Poisson Equation for Symmetric Double-Gate Metal-Oxide-Semiconductor Field Effect Transistors Younsuk Yun, Hanchul Kim, Heemoon Kim, Kwangheon Park Atomic and Electronic Structure of Vacancies in U 0 2 : LSDA+U Approach R. J. Zhang, C. H Hu, M. Q. Lu, Y. M. Wang, K Yang, D. S. Xu First-Principles Investigations of the Electronic Structure of B-Lani 4 alh x (X=4, 4.5, 5,7) Xi Hong Zhao, Bei Li A Nonlinear Elastic Unloading Damage Model for Soft Soil and its Application to Deep Excavation Engineering X. Zhao, J. S. Liu, M. Brown, C. Adams Computational Aided Analysis and Design of Chassis X. Zhao, T. G. Karayiannis, M. Collins A Transient 3-D Numerical Model for Fluid and Solid Interactions Zhigen Zhao, Yingping Zheng, Wei-ping Pan Reliability Analysis of Concrete Structure by Monte Carlo Simulation Zhigen Zhao, Boshu He, Wei-ping Pan Multiple Regression Analysis of Coal's Calorific Value Using Data of Proximate Analysis Wei-Dong Zou, Minghu Wu A Possible Organic Half-Metallic Magnet: 2-(5-pyrimidinyl)-4,4,5,5-tetramethyl4,5-dihydro-lH-3-oxoimidazol-l-oxyl Fragiskos Batzias Symposium on Industrial and Environmental Case Studies - Preface F. A. Batzias An Algorithmic Procedure for Fault Dimension Diagnosis in Systems with Physical

551-552

553-555 556-557 558-561 562-565

566-568 569-572

573-577

578-581

582-585

586-588

589-591

592-595

596-599 600-603

604-607

608-611 612-615 616-619 620-623

624-627

628-629 630-634


and Economic Variables - The Case of Industrial Heat Exchange F. A. Batzias, A. S. Kakos, N. P. Nikolaou Integration of a Reactor/Mixer/Internal-Inventories Subsystem with Upstream External Inventories Distributed Over a Time/Space Domain A . F. Batias,, N. P. Nikolaou, D. K. Sidiras GIS-based Discrimination of Oil Pollution Source in an Archipelago - The Case of the Aegean C. G. Siontorou, A. S. Kakos, G. Batis GIS-based Computer Aided Air Pollution Biomonitoring for Impact Assessment Application in the Case of Materials Deterioration M. Badell, E. Ferndndez, J. Bautista, L. Puigjaner Empowering Financial Tradeoff in Joint Financial & Supply Chain Scheduling & Planning Modeling M. Yuceer, R. Berber and Z. Ozdemir Automatic Generation of Production Scheduling Problems in Single Stage MultiProduct Batch Plants: Some Numerical Examples M. Yuceer, E. Karadurmus, R. Berber Simulation of River Streams: Comparison of a New Technique to QUAL2E Jully Tan, Dominic Chwan Yee Foo, Sivakumar Kumaresan, Ramlan Abdul Aziz, Alexandros Koulouris Evaluation of Debottlenecking Strategies for a Liquid Medicine Production Utilising Batch Process Simulation Nida SHEIBA T-OTHMAN, Rachid A MARI, Sami OTHMAN Receding Horizon Control for Polymerization Processes M. Shacham, H. Shore and N. Brauner A General Procedure for Linear and Quadratic Regression Model Identification A. Dietz, C. Azzaro-Pantel, L. Pibouleau, S. Domenech Optimal Design of Batch Plants under Economic and Ecological Considerations : Application to a Biochemical Batch Plant D. Vakalis, H. Sarimveis, C. T. Kiranoudis, A. Alexandridis, G. Bafas Modeling and Simulation of Wildfires Based on Artificial Intelligence Techniques Christos Makris and Athanasios Tsakalidis Computational Methods in Molecular Biology and Medicine - Preface E. Giannoulatou, K. Perdikuri and A. Tsakalidis Concept Discovery and Classification of Cancer Specific Terms from Scientific Abstracts of Molecular Biology G. Anogianakis, A. Anogeianaki, V. Papaliagkas Do Synonymous Codons Point Towards a Thermodynamic Theory of Tissue Differentiation? K. O. Shohat-Zaidenraise, A. Shmilovici, /. Ben-Gal Gene-Finding with the VOM Model M. Christodoulakis, C. Iliopoulos, K. Tsichlas, K. Perdikuri Searching for Regularities in Weighted Sequences C. Makris, Y. Panagis and E. Theodoridis Efficient Algorithms for Handling Molecular Weighted Sequences and Applications D. Wu, K. Womer Preface to Symposium - Performance Measurement Based on Complex Computing Techniques Y Li, D. Wu Workforce Schedule and Roster Evaluation Using Data Envelopment Analysis Wei Jiang, Changan Zhu, Yi Zhang, Xiaoqiang Zhong, Desheng Wu Isometry Algorithm for the Plane Loop in Shape Machining Desheng Wu, Liang Liang, Hongman Gao and Y. Li A Strategy of Optimizing Neural Networks by Genetic Algorithms and its Application on Corporate Credit Scoring Li ShuJin, Li ShengHong, Desheng Wu The Valuation of Quanto Options Under Stochastic Interest Rates Yanping Zhao and Donghua Zhu Evaluation of the Collected Pages and Dynamic Monitoring of Scientific


635-641

642-646

647-652

653-656

657-660

661-665 666-670

671-673 674-676 677-682

683-686 687- 687 688- 691

692-695

696-700 701-704 705-708

709-710

711-713 714-716 717-721

722-725 726-729


Information Weidong Zhu and Yong Wu Research on Expert’s Colony Decision Method of Gradual Refinement Based on Theory of Evidence D. T. Hristopulos Preface of the Symposium : Stochastic Methods and Applications D. T. Hristopulos Effects of Uncorrelated Noise on the Identification of Spatial Spartan Random Field Parameters Manolis Varouchakis and Dionissios T. Hristopulos An Application of Spartan Spatial Random Fields in Geostatistical Mapping of Environmental Pollutants A. Shmilovici, I. Ben-Gal On the Application of Statistical Process Control to the Stochastic Complexity of Discrete Processes T. E. Simos Preface of the Symposium: Mathematical Chemistry Sonja Nikolic, Ante Milicevic and Nenad Trinajstic Graphical Matrices as Sources of Double Invariants for USE in QSPR Guillermo Restrepo, Eugenio J. Llanos and H iber Mesa Chemical Elements: A Topological Approach D. Janezic and M. Pensa Molecular Simulation Studies of Fast Vibrational Modes Benoit Champagne Preface for the Mini-Symposium on: “Simulations O f Nonlinear Optical Properties For The Solid State ” Maxime Guillaume Modeling of the (Non)Linear Optical Properties of an Infinite Aggregate of AllTrans Hexatriene Chains by the Supermolecule Approach and the “Electrostatic Interactions” Model M. Veithen and Ph. Ghosez, X. Gonze First-principles study of non-linear optical properties of ferroelectric oxides Edith Botek, Jean-Marie Andre, Benoit Champagne, Thierry Verbiest and Andre Persoons Mixed Electric-Magnetic Second Order Response of Helicenes M. Rerat, R. Dovesi Determination of the macroscopic electric susceptibilities from the microscopic (hyper) polarizabilities a , β and γ Y. Aoki, F. L. Gu and J. Korchowiec Elongation Method at Semi-empirical and ab initio Levels for Large Systems Feng Long Gu, Benoit Champagne and Yuriko Aoki Evaluation of Nonlinear Susceptibilities of 3-Methyl-4-nitropyridine 1-oxide Crystal: An Application of the Elongation Method to Nonlinear Optical Properties F. Castet, L. Ducasse, M. Guillaume, E. Botek, B. Champagne Evaluation of Second-Order Susceptibilities of 2-methyl-4-nitroaniline (MNA) and 3-methyl-4-nitropyridine-l-oxyde (POM) Crystals R. Kishi, M. Nakano, T. NittaandK . Yamaguchi Structural Dependence of Second Hyperpolarizability of Nanostar Dendritic Systems M. Nakano, R. Kishi, T. Nitta and K. Yamaguchi Quantum Master Equation Approach to the Second Hyperpolarizability of Nanostar Dendritic Systems Marjorie Bertolus, Mireille Defranceschi Multiscale Modeling of Irradiation Effects in Solids and Associated Out-ofEquilibrium Processes E. Cances Some Mathematical Issues Arising in the Ab Initio Modeling and Simulation of Crystalline Materials

730-733

734-736 737-740

741-744

745-748

749- 749 750- 752 753-755 756-759 760-761

762-764

765-768 769-770

771-774

775-778 779-782

783-786

787-790

791-794

795- 795

796- 796


C. Coudray Some Properties of Isolated Defects in Charged Mgo Clusters Louise Dash Electronic Excitations: ab initio Calculations of Electronic Spectra and Application to Zirconia, ZrO 2 M. Freyss, T. Petit Ab initio Modeling of the Behaviour of Helium in Americium and Plutonium Oxides G. Robert, G. Jomard, B. Amadon, B. Siberchicot, F. Joliet and A. Pasturel Specific Problems Raised by the M oderation of Actinides under Irradiation : Case of Delta Plutonium D. Mathieu Computational Methods for Atomistic Simulation of the Molecular and Electron Dynamics in Shocked Materials Fabienne Ribeiro, Marjorie Bertolus, Mireille Defranceschi Investigation of Amorphization-Induced Swelling in SiC : A Classical Molecular Dynamics Study Claude Pouchan Computational Vibrational Spectroscopy Vincenzo Barone Accurate Vibrational Spectra and Magnetic Properties of Organic Free Radicals in The Gas Phase and in Solution D. Begue, N. Gohaud and C. Pouchan A Parallel Approach for a Variational Algorithm in the Vibrational Spectroscopy Field P. Carbonniere and V. Barone Vibrational Computation Beyond the Harmonic Approximation: An Effective Tool for the Investigation of Large Semi-Rigid Molecular Systems Benoit Champagne Hyper-Raman Spectroscopy and Quantum Chemistry Simulations N. Gohaud, D. Begue and C. Pouchan How to Process a Vibrational Problem for a Medium Size Molecule: Application To CH 3 N, CH 3 Li AND (CH 3 Li) 2 Christophe lung, Fabienne Ribeiro, Claude Leforestier A Jacobi Wilson description coupled to a Block-Davidson Algorithm: An efficient scheme to calculate highly excited vibrational spectrum J. Lievin and P. Cassam-Chenai The Vibrational Mean Field Configuration Interaction (VMFCI) Method: A Flexible Tool for Solving the Molecular Vibration Problem Michael N. Vrahatis, George D. Magoulas, Gerasimos C. Meletiou, Vassilis P. Plagianakos Computational Approaches to Artificial Intelligence: Theory, Methods and Applications A. D. Anastasiadis, G. D. Magoulas, Μ. N. Vrahatis A Globally Convergent Jacobi-Bisection Method for Neural Network Training Miguel Couceiro, Stephan Foldes, Erkko Lehtonen On Compositions of Clones of Boolean Functions V. L. Georgiou, N. G. Pavlidis, K. E. Parsopoulos, Ph. D. Alevizos, Μ. N. Vrahatis Evolutionary Adaptive Schemes of Probabilistic Neural Networks Gheorghita Ghinea and George D. Magoulas Integrating Perceptech Requirements through Intelligent Computation of Priorities in Multimedia Streaming E. C. Laskari, G. C. Meletiou, Y. C. Stamatiou and Μ. N. Vrahatis Assessing the Effectiveness of Artificial Neural Networks on Problems Related to Elliptic Curve Cryptography E. C. Laskari, G. C. Meletiou, Y. C. Stamatiou and Μ. N. Vrahatis Applying Evolutionary Computation Methods for the Cryptanalysis of Feistel Ciphers


797-799 800-801

802-805

806-807

808-810

811-811

812-812 813-816

817-819

820-824

825-827 828-830

831-833

834-837

838-842

843-848 849-851 852-855 856-859

860-863

864-867


K. E. Parsopoulos and Μ. N. Vrahatis UPSO : A Unified Particle Swarm Optimization Scheme K. E. Parsopoulos, V. A. Kontogianni, S. I. Pytharouli, P. A. Psimoulis, S. C. Stiros and Μ. N. Vrahatis Nonlinear Data Fitting for Landslides Modeling N. G. Pavlidis, Μ. N. Vrahatis, P. Mossay Economic Geography : Existence, Uniqueness and Computation of Short-Run Equilibria D. K. Tasoulis and Μ. N. Vrahatis Unsupervised Clustering Using Semi-Algebraic Data Structures Tai-hoon Kim Approaches and Methods of Security Engineering Eunser Lee, Sunmyoung Hwang Security Considerations for Securing Software Development Sang ho Kim, Choon seong Leem Common Development Security Requirements to Improve Security of IT Products Sangkyun Kim, Choon Seong Leem Decision Supporting Method with the Analytic Hierarchy Process Model for the Systematic Selection of COTS-based Security Controls Sangkyun Kim, Hong Joo Lee, Choon Seong Leem Applying the ISO17799 Baseline Controls as a Security Engineering Principle under the Sarbanes-Oxley Act Chaetae Im, HyoungJong Kim, GangShin Lee Cost Efficient Group Communication & Management for Intrusion Tolerant System Tai-hoon Kim, Chang-wha Hong, Sook-hyun Jung Countermeasure Design Flow for Reducing the Threats of Information Systems Deok-Gyu Lee, Hee-Un Park, Im-Yeong Lee Authentication for Single Domain in Ubiquitous Computing Using Attribute Certification Hyung-Woo Lee, Sung-Hyun Yun, Nam-Ho Oh, Hee-Un Park, Jae-Sung Kim Authenticated IP Traceback Mechanism on IPv6 for Enhanced Network Security Against DDoS Attack Sung-Hyun Yun, Hyung-Woo Lee, Hee-Un Park, Nam-Ho Oh, Jae-Sung Kim The Undeniable Digital Copyright Protection Scheme Providing Multiple Authorship of Multimedia Contents Seungyoun Lee, Myong chul Shin, Sang yule Choi, Seung won Kim, Jae sang Cha Field Test-based Propagation Path-Loss Models for Terrestrial ATSC Digital TV Jaesang Cha, Kyungsup Kwak, Jeongsuk Lee and Chonghyun Lee Application of Binary Zero-correlation-duration Sequences to Interferencecancelled WPAN system Chong Hyun Lee Numerically Stable and Efficient Algorithm for Vector Channel Parameter and Frequency Offset Estimation Eunser Lee, Sunmyoung Hwang Security Requirements of Development Site Dimitrios D. Thomakos Advances in Financial Forecasting Timotheos Angelidis, Stavros Degiannakis Modeling Risk in Three Markets: VaR Methods for Long and Short Trading Positions S. Bekiros & D. Georgoutsos Comparative Evaluation of Technical Trading Rules: Neurofuzzy Models vs. Recurrent Neural Networks B. Brandi Machine Learning in Economic Forecasting and the Usefulness of Economic Theory: the Case of Exchange Rate Forecasts Gikas A. Hardouvelis, Dimitrios Malliaropoulos The Yield Spread as a Symmetric Predictor of Output and Inflation

868-873 874-879

880-883

884-887 888-889 890-892 893-895 896-899

900-903

904-907

908-910 911-914

915-918

919-922

923-928 929-936

937-951

952-954 955-957 958-960

961-963

964-966

967-969


H. Heidari An Evaluation of Alternative VAR Models for Forecasting Inflation Burg Kayahan, Thanasis Stengos Testing of Capital Asset Pricing Model with Local Maximum Likelihood Method Constantina Kottaridi, Gregorios Siourounis A Flight to Quality! International Capital Structure Under Foreign Liquidity Constraints M. Koubouros, D. Malliaropulos and E. Panoloulou Temporary and Permanent Long-Run Asset-Specific and Market Risks in the Cross-Section of US Stock Returns S. Papadamou, G. Stephanides Improving Technical Trading Systems by Using a New Matlab based Genetic Algorithm Procedure George A. Papanastasopoulos, Alexandros V. Benos Extending the Merton Model: A Hybrid Approach to Assessing Credit Quality P. L. Rinderu. Gh. Gherghinescu, O. R. Gherghinescu Modeling the Stability of the Transmission Mechanism of Monetary Policy A Case Study for Romania, 1993-2000 Leonidas S. Rompolis Estimating Risk Neutral Densities of Asset Prices based on Risk Neutral Moments: An Edgeworth Expansion Approach A. Shmilovici, Y. Kahiri, s. Hauser Forecasting with a Universal Data Compression Algorithm: The Forex Market Case Dimitrios D. Thomakos Functional Filtering, Smoothing and Forecasting Dimitrios Thomakos Market Timing and Cap Rotation George Maroulis Computational Molecular Science: From Atoms and Molecules to Clusters and Materials Per-OlofAstrand Molecular Mechanics Model for Second Hyperpolarizabilities and Macroscopic Susceptibilities: Applications to Nanostructures and Molecular Materials N. C. Bacalis and A. Metropoulos, D. A. Papaconstantopoulos Lowest Energy Path of Oxygen near CH : A Combined Configuration Interaction and Tight-Binding Approach S. C. Farantos Molecular Reaction Pathways and Elementary Bifurcation Tracks of Periodic Orbits R. Fournier Global Optimization Methods for Studying Clusters M. Menadakis and G. Maroulis, P. G. Koutsoukos A Quantum Chemical Study of Doped CaCO 3 (calcite) Krzysztof Szalewicz and Rafal Podeszwa Density-functional-based methods for calculations of intermolecular forces AjitJ. Thakkar Computational challenges in the determination of structures and energetics of small hydrogen-bonded clusters Amlan K. Roy, Ajit J. Thakkar, B. M. Deb Numerical solution of the Schrodinger equation for two-dimensional double-well oscillators T. A. Wesolowski Applications of the orbital-free embedding formalism to study the environmentinduced changes in the electronic structure of molecules in condensed phase M. Yu. Balakina, S. E. Nefediev Application of Purpose-Oriented Moderate-Size GTO Basis Sets to Nonempirical Calculations of Polarizabilities and First Hyperpolarizabilities of Conjugated Organic Molecules in Dielectric Medium


970-973 974-975 976-978

979-983

984-987

988-990 991-993

994-997

998-1001

1002-1005 1006-1008 1009-1010

1011-1014

1015-1021

1022-1024

1025-1028 1029-1032 1033-1036 1037-1041

1042-1045

1046-1050

1051-1053


S. S. Batsanov Electronic Polarizabilities of Atoms, Molecules and Crystals, Calculated by Crystal Chemical Method H. Berriche, F. Spiegelman and Μ. B. El Hadj Rhouma

1054-1056

1057-1060

A Quantitative Study of the Rydberg States of N aArw(n=l-12) Clusters C. Ghanmi, H. Berriche and H. Ben Ouada Evaluation of the Atomic Polarisabities of Li And K using the Long Range Lik+ Electronic States Behaviour A. Chrissanthopoulos and G. Maroulis Electric Properties of Boron and Aluminum Trihalides M. S. A. El-Kader and S. M. El-Sheikh Collision-Induced Light Scattering Spectra of Mercury Vapor at Different Temperatures & M. El-Sheikh Empirical Pair-Polarizability Models from Collision-Induced Light Scattering Spectra for Gaseous Argon A. Haskopoulos and G. Maroulis Interaction Dipole Moment and (Hyper) Polarizability in Rg-Xe A . Hatzis, A. Haskopoulos and G. Maroulis A Database of Gaussian-Type Basis Sets Applied to the Calculation of Electric Properties of the Azide Anion P. Karamanis and G. Maroulis

1061-1064

1065-1068 1069-1072

1073-1076

1077-1080 1081-1084

1085-1087

Static (Hyper) Polarizability of CH 3 X, X=F, Cl, Br and I P. G. Kusalik, A. V. Gubskaya and L. Hemdndez de la Pena The Interplay Between Molecular Properties and the Local Environment in Liquid Water C. Makris and G. Maroulis Electric Properties of F 2 , CIF, BrF and IF G. Maroulis and A. Haskopoulos Electric Hyperpolarizability of Small Copper Clusters. The Tetramer Cu 4 as a Test Case Agnie M. Kosmas and Evangelos Drougas The Quantum Mechanical Investigation of Various I-Containing Polyoxides SouravPal, Nay ana Vaval, K. R. Shamasundar and D. Ajitha Molecular Electric Properties using Coupled-Cluster Method M. G. Papadopoulos, A. Avramopoulos and H Reis On the Vibrational Polarizabilities and Hyperpolarizabilities. Analysis of Some Specific Examples : Pyrrole and Harf B em d M. Rode Ab initio QMM/MM Simulations: A Powerful Tool for the Study of Structure and Ultrafast Dynamics of Electrolyte Solutions I. Rivalta, N Russo and E. Sicilia Two-State Reaction Paradigm in Transition Metals Mediated Reactions Z Slanina, K. Kobayashi and S. Nagase Computations of Endohedral Fullerenes as Agents of Nanoscience: Gibbs Energy Treatment for Ca@C72, Ca@C82, La@C82, N2@C60, and La@C60 G. Vlachopoulos, G. A. Spyroulias, P. Cordopatis NMR Solution Structure Determination of Biomolecules and NMR-driven Docking Simulations of Biomolecular complexes Kailiang Yin, Duanjun Xu, Chenglung Chen Solvent Effects on the B-Cyclodextrin Inclusion Complexes with M-Cresol and Dynamic Hydrophobicity: Molecular Dynamics Bo Zhao, Zhihua Zhou Theoretical Study on the Coumarins for Designing Effective Second-Order Nonlinear Optical Materials Vassilis Kodogiannis, Ilias Petrounias Preface for the Symposium: Intelligent Information Systems

1088-1091

1092-1095 1096-1100

1101-1104 1105-1107 1108-1111

1112-1114

1115-1118 1119-1120

1121-1125

1126-1128

1129-1131

1132-1133


Emanuele Della Valle, Nahum Korda, Stefano Ceri and Dov Dori Gluing Web Services through Semantics: The COCOON Project Panagiotis Chountas, Vassilis Kodogiannis, Ilias Petrounias, Boyan Kolev, Krassimir T. Atanassov On Probability, Null Values, Intuitionistic Fuzzy, & Value Imperfection M. De Beule, E. Maes, R. Van Impe, W. Vanlaere and O. De Winter Artificial Neural Networks and Cardiology: a Fertile Symbiosis? Mingwei Chen, Ilias Petrounias A Short Survey of Multi-Agent System Architectures Simon Courtenage and Steven Williams The Forward Web: A New Web Architecture for Organizing Expanding Knowledge T. E. Simos Preface of the Symposium: Multimedia Synchronization Model and The Security in the Next Generation Mobile Information Systems Hyuncheol Kim, Jongkyung Kim, Seongjin Ahn, Jinwook Chung Design and Implementation of Session Initiation Protocol (SIP) Based Multi-Party Secure Conference System Gi Sung Lee Multimedia Synchronization Algorithm in All-IP Networks Keunwang Lee, Jonghee Lee, HyeonSeob Cho Design and Implementation of Mobile-Learning system Using Learner-oriented Course Scheduling Agent Hyonmin Kong, Youngmi Kwon, Geuk Lee An Extension of BGP Algorithm for End-to-End Traffic Engineering Yong-Hwan Lee, Yu-Kyong Lee, Young-Seop Kim, Sang-Burm Rhee Real Time Face Detection in Mobile Environments Dong Chun Lee, Jung-Doo Koo, Hongjin Kim New Mobility Management for Reducing Location Traffic in the Next Generation Mobile Networks Hong-Gyoo Sohn, Yeong-Sun song, Gi-Hong Kim, Hanna Lee, and Bongsoo Son Automatic Identification of Road Signs for Mobile Vision Systems Sang-Young Lee, Cheol-Jung Yoo, Dong Chun Lee Mobile-Based Healthcare B2B Workflow System for Efficient Communication Sang K Lee and Young J. Moon Intelligent All-Way Stop Control Sustem at Unsignalized Intersections


1134-1138 1139-1143

1144-1147 1148-1152 1153-1155

1156- 1156

1157- 1160

1161-1164 1165-1168

1169-1172 1173-1176 1177-1180

1181-1184 1185-1188 1189-1192

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 1-5

Bending Analysis of Thick Rectangular Plates with Various Boundary Conditions Using Extended Kantorovich Method

M. M. Aghdam1, J.P. Vafa
Department of Mechanical Engineering, Amirkabir University of Technology, Hafez Ave., Tehran, Iran

Received 7 August, 2004; accepted in revised form 30 August, 2004

Abstract: Bending of thick rectangular plates with various boundary conditions is studied using the Extended Kantorovich Method (EKM). The governing equations, based on the Reissner shear deformation theory, include a system of eight first order partial differential equations (PDEs) with eight unknowns. Application of the EKM to the governing equations yields a double set of eight ordinary differential equations (ODEs) in terms of x and y. These equations were then solved iteratively until a prescribed level of convergence was achieved. It is shown that the convergence of the method is very fast. Results demonstrate good agreement with those of finite element analysis.

Keywords: Bending analysis; Reissner plate theory; Extended Kantorovich Method; Iterative procedure

1. Introduction and theory background

One of the approximate methods to solve the governing equations of plates is the EKM [1-7]. It was mainly used for various analyses of thin isotropic and orthotropic plates where the governing equation is a single fourth order PDE [2-3], as well as eigenvalue problems [4]. However, there are also studies [5-7] dedicated to thick plates where a system of three second order PDEs was solved. In most of these studies, clamped boundary conditions were considered due to the nature of the governing equations and the ease of applying clamped boundary conditions in the EKM. In this study, the first order shear deformation theory of Reissner for thick plates [8] is employed. In order to apply various combinations of BCs, the governing equations in their complete form, which consist of eight first order PDEs with eight unknowns, are considered. This offers a simple procedure to apply any combination of boundary conditions for the plate. For a rectangular thick plate (b/h =

$$\int_{\Omega} \nabla \times \left[ \mathbf{f}(\mathbf{r}')\, W(\mathbf{r}-\mathbf{r}', h) \right] d\mathbf{r}' - \int_{\Omega} \nabla' W(\mathbf{r}-\mathbf{r}', h) \times \mathbf{f}(\mathbf{r}')\, d\mathbf{r}' \tag{5}$$

where the prime in $\nabla'$ indicates that the derivative is taken with respect to $\mathbf{r}'$. By using the curl theorem, equation (5) can be expressed as follows:


> 0. We divide [0, ∞) into subintervals [a_i, b_i] so that W(x) is a constant with value W_i. After this the problem (1) can be expressed by the approximation

1 President of the European Society of Computational Methods in Sciences and Engineering (ESCMSE)
2 Active Member of the European Academy of Sciences and Arts
3 Corresponding author. Please use the following address for all correspondence: Dr. T.E. Simos, 26 Menelaou Street, Amfithea - Paleon Faliron, GR-175 64 Athens, GREECE, Tel: 0030 210 94 20 091
4 E-mail: [email protected]

y_i'' = (W_i − E) y_i,

whose solution is

y_i(x) = A_i exp(√(W_i − E) x) + B_i exp(−√(W_i − E) x),   A_i, B_i ∈ R.        (2)

This form of the Schrödinger equation reveals the importance of exponential fitting when constructing new methods. In the next section we will present the most important parts of the theory used.

2 Basic theory

2.1 Explicit Runge-Kutta methods

An s-stage explicit Runge-Kutta method used for the computation of the approximation of y_{n+1}(x), when y_n(x) is known, can be expressed by the following relations:

y_{n+1} = y_n + Σ_{i=1}^{s} b_i k_i

k_i = h f(x_n + c_i h, y_n + h Σ_{j=1}^{i−1} a_ij k_j),   i = 1, ..., s        (3)

where in this case f(x, y(x)) = (W(x) − E) y(x). Actually, to solve the second order ODE (1) using the first order numerical method (3), (1) becomes:

z'(x) = (W(x) − E) y(x)
y'(x) = z(x)

while we use two pairs of equations (3): one for y_{n+1} and one for z_{n+1}. The method shown above can also be presented using the Butcher tableau below:

0
c_2 | a_21
c_3 | a_31  a_32
...
c_s | a_s1  a_s2  ...  a_s,s−1
    | b_1   b_2   ...  b_{s−1}  b_s        (4)

Coefficients c_2, ..., c_s must satisfy the equations:

c_i = Σ_{j=1}^{i−1} a_ij,   i = 2, ..., s        (5)

Definition 1 [1] A Runge-Kutta method has algebraic order p when the method's series expansion agrees with the Taylor series expansion in the first p terms: y^(n)(x) = ŷ^(n)(x), n = 1, 2, ..., p, where ŷ denotes the numerical approximation. A convenient way to obtain a certain algebraic order is to satisfy a number of equations derived from tree theory. These equations will be shown during the construction of the new methods.
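To make relations (3)-(5) concrete, the sketch below advances the first order system z' = (W(x) − E) y, y' = z by one step of a generic explicit Runge-Kutta method given by its Butcher tableau. The classical RK4 tableau and the constant-W right-hand side at the end are placeholders chosen only for illustration; the trigonometrically fitted coefficients constructed in the paper are not reproduced here.

```python
import numpy as np

def rk_step(f, x, u, h, A, b, c):
    """One explicit Runge-Kutta step for u' = f(x, u), following relations (3)-(5)."""
    s = len(b)
    k = []
    for i in range(s):
        acc = np.zeros_like(u)
        for j in range(i):
            acc = acc + A[i][j] * k[j]          # strictly lower-triangular tableau
        k.append(f(x + c[i] * h, u + h * acc))
    u_new = u.copy()
    for i in range(s):
        u_new = u_new + h * b[i] * k[i]
    return u_new

def schrodinger_rhs(W, E):
    """Right-hand side of the first order system: u = (y, z), y' = z, z' = (W(x) - E) y."""
    return lambda x, u: np.array([u[1], (W(x) - E) * u[0]])

# Classical RK4 tableau, used here only as an example method.
A = [[0.0, 0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0, 0.0],
     [0.0, 0.5, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
b = [1/6, 1/3, 1/3, 1/6]
c = [0.0, 0.5, 0.5, 1.0]

f = schrodinger_rhs(lambda x: -50.0, E=25.0)   # constant W on a subinterval, as in Section 1
u_next = rk_step(f, 0.0, np.array([0.0, 1.0]), 0.01, A, b, c)
```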

2.2 Exponentially fitted Runge-Kutta methods

The method (3) is associated with the operator

L(x) = u(x + h) − u(x) − h Σ_{i=1}^{s} b_i u'(x + c_i h, U_i)

U_i = u(x) + h Σ_{j=1}^{i−1} a_ij u'(x + c_j h, U_j),   i = 1, ..., s        (6)

where u is a continuously differentiable function.

Definition 2 [2] The method (6) is called exponential of order p if the associated linear operator L vanishes for any linear combination of the linearly independent functions exp(v_0 x), exp(v_1 x), ..., exp(v_p x), where v_i, i = 0(1)p are real or complex numbers.

Remark 1 [3] If v_i = v for i = 0, 1, ..., n, n ≤ p, then the operator L vanishes for any linear combination of exp(vx), x exp(vx), x² exp(vx), ..., x^n exp(vx), exp(v_{n+1} x), ..., exp(v_p x).

Remark 2 [3] Every exponentially fitted method corresponds in a unique way to an algebraic method (by setting v_i = 0 for all i).

Definition 3 [2] The corresponding algebraic method is called the classical method.

3 Construction of the new trigonometrically fitted Runge-Kutta methods

The first method we construct will integrate exactly the functions {1, x, x², x³, x⁴, exp(Iwx)}, or equivalently {1, x, x², x³, x⁴, cos(wx), sin(wx)}, where w is a real number called the frequency and I = √−1.

The second method we construct will integrate exactly the functions {1, x, x², exp(Iwx), x exp(Iwx)}, or equivalently {1, x, x², cos(wx), sin(wx), x cos(wx), x sin(wx)}.

4 Numerical results - The resonance problem

The efficiency of the two newly constructed methods will be measured through the integration of problem (1) with l = 0 on the interval [0, 15] using the well-known Woods-Saxon potential, with

q = exp((x − x_0)/a),   u_0 = −50,   a = 0.6,   x_0 = 7   and   u_1 = −u_0/a,        (7)


and with boundary condition y(0) = 0. The potential V(x) decays more quickly than l(l+1)/x², so for large x (asymptotic region) the Schrödinger equation (1) becomes (8). The last equation has two linearly independent solutions k x j_l(k x) and k x n_l(k x), where j_l and n_l are the spherical Bessel and Neumann functions. When x → ∞ the solution takes the asymptotic form

y(x) ≈ A k x j_l(k x) − B k x n_l(k x) ≈ D [sin(k x − π l/2) + tan(δ_l) cos(k x − π l/2)],        (9)

where δ_l is called the scattering phase shift and it is given by the following expression: (10)

where S(x) = k x j_l(k x), C(x) = k x n_l(k x), x_i < x_{i+1} and both belong to the asymptotic region. Given the energy, we approximate the phase shift, the accurate value of which is π/2 for the above problem. As regards the frequency w we will use the suggestion of Ixaru and Rizea [4]:

w = √(E + 50) for x ∈ [0, 6.5],   w = √E for x ∈ [6.5, 15]

We compare the two new trigonometrically fitted methods to a variety of well known classical Runge-Kutta methods. The new methods, and especially the second one that integrates {1, x, x², exp(Iwx), x exp(Iwx)}, are more efficient than the classical ones. The main reason for this high efficiency is that there are lower powers of the energy E in the local truncation error of the new methods than in the error of the other methods.
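As a small illustration of the problem set-up, the sketch below evaluates the Woods-Saxon potential with the parameter values quoted in (7) and the piecewise fitting frequency of Ixaru and Rizea [4]. The closed form of the potential used here is the one customarily paired with these parameters in the exponential-fitting literature and is assumed, since equation (7) is only partially legible in this reproduction.

```python
import numpy as np

def woods_saxon(x, u0=-50.0, a=0.6, x0=7.0):
    """Woods-Saxon potential with the parameter values of (7).

    The explicit closed form below is assumed (standard in this literature)."""
    q = np.exp((x - x0) / a)
    u1 = -u0 / a
    return u0 / (1.0 + q) + u1 * q / (1.0 + q) ** 2

def fitting_frequency(x, E):
    """Piecewise frequency suggested by Ixaru and Rizea [4] for this problem."""
    return np.sqrt(E + 50.0) if x <= 6.5 else np.sqrt(E)

print(woods_saxon(7.0), fitting_frequency(3.0, E=341.5))
```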

References
[1] Hairer E., Nørsett S.P., Wanner G., Solving Ordinary Differential Equations I, Nonstiff Problems, Springer-Verlag, Berlin Heidelberg, 1993.
[2] Simos T.E., An exponentially-fitted Runge-Kutta method for the numerical integration of initial-value problems with periodic or oscillating solutions, 115, 1-8 (1998).
[3] Lyche T., Chebyshevian multistep methods for Ordinary Differential Equations, Num. Math. 19, 65-75 (1972).
[4] Ixaru L.Gr., Rizea M., A Numerov-like scheme for the numerical solution of the Schrödinger equation in the deep continuum spectrum of energies, Comp. Phys. Comm. 19, 23-27 (1980).


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 37-40

Numerical Flow Analysis Around Structures in Low Speed Fluids and its Application to Environmental Sciences

T. Arima¹, Y. Matsuura², S. Oharu³

¹ Wako Research Center, Honda R&D Co., Ltd., Saitama, Japan
² Faculty of Information Sciences, Hiroshima City University, Hiroshima, Japan
³ Department of Mathematics, Chuo University, Tokyo, Japan

Received 13 August, 2004; accepted in revised form 31 August, 2004 Abstmct: Importance and application of numerical flow analysis to environmental science and technology are outlined. Fluid phenomena in the ocean, rivers, atmosphere and the ground are investigated by means of numerical methods and in turn proposals for the control, restoration and counterplans against the so-called environmental disrupters which destroy natural environment as well as ecological systems in nature. All such environmental disrupters diffuse in and are transported by environmental fluids. Those disrupters sometimes react on some other chemicals to generate more poisonous materials. Environmental fluid dynamics is effective for the evaluation, prediction and restoration of the environmental damage. In this paper a mathematical model of environmental fluid is presented and results of numerical simulations based on the model are exhibited. Keywords: Environmental fluid, computational fluid dynamics, numerical simulation, environmental restoration technology, three dimensional visualization. Mathematics Subject Classification: 39Al2, 62Pl2, 65M06, 65Ml2, 76005

1 Environmental fluids and environmental restoration technology

Fluid dynamical technologies are increasingly becoming important in the field of environmental science and technology. Evaluation of environmental fluid flows using numerical methods is particularly useful to understand the complex fluid motion and make it possible to control the flow fields from the point of view of environmental restoration. The environmental fluid problems may be classified by three types of applications. The first application is concerned with ultimate use of exergy. This is the most important subject for existing engines that use fossil fuel for combustion. Secondly, new energy sources such as wind and wave power generation should be extensively researched and developed. Thirdly, it is indispensable to develop not only efficient and harmless energy sources but also technologies to restore the environment which has already been polluted by exhaust gases through combustion of fossil fuel. It is also important to develop effective methods for protecting the environmental fluids against pollutants. In this paper the field of studies in evaluation, control and prediction of transport phenomena which arise in a variety of environmental problems is called environmental fluid dynamics. Obviously, the environmental fluid dynamics is one of the key theories to invent efficient technologies for the preservation and restoration of the natural environment. Here we focus our attention on dynamical analysis of diffusion and transport processes of pollutants in environmental fluids. Mathematical models of environmental fluids are 2 Corresponding

author. E-mail: [email protected]


presented and results of simulations based on the models are exhibited. It is then expected that new environmental restoration technology will be extensively developed by means of the computational fluid dynamics.

2 A mathematical model of environmental fluids

As a mathematical model describing the motion of environmental fluids, we employ the compressible Navier-Stokes system. It is known (see [1]), however, that the application of numerical methods for the compressible Navier-Stokes system to low-speed flows does not necessarily provide satisfactory results in the numerical computation. This would imply that numerical simulations become inefficient and the associated computational results turn out to be inaccurate. In this paper we apply the Boussinesq approximation to the compressible Navier-Stokes system and formulate the following system of equations (1)-(3) as our mathematical model for describing the motion of environmental fluids:

∇ · v = 0        (1)

ρ [v_t + (v · ∇)v] = −∇p + μΔv − ρβ(T − T_0)g        (2)

ρC_p [T_t + (v · ∇)T] = ∇·(κ∇T) + S_c        (3)

Here the parameters v, ρ, p, μ, β, g, T and T_0 represent the velocity vector, density, pressure, viscosity coefficient, rate of volume expansion, the acceleration of gravity, temperature and its reference temperature, respectively. The third term on the right-hand side of (2) represents the thermal effect on buoyancy. Also, the coefficient κ means the thermal conductivity and S_c stands for the sum of heat sources in the fluid under consideration. Our main objective here is to obtain numerical data describing the flow field around bodies in an environmental fluid under consideration. For this purpose we impose Dirichlet boundary conditions for v and T and homogeneous Neumann boundary conditions for p on the inflow boundary. On the outflow boundary we impose homogeneous Neumann conditions for v, T and Dirichlet boundary conditions for p. On the surface of each body standing in the fluid, we impose the nonslip condition for v and a homogeneous Neumann condition for T and employ an inhomogeneous Neumann boundary condition for p which is obtained from equation (2). In this paper a new numerical scheme is proposed such that a fully implicit Euler scheme for the velocity is involved.

3 Numerical Models

Making discretization in time of (1)-(3) by use of the Euler implicit method, we obtain the following system of equations:

∇ · v^{n+1} = 0        (4)

(v^{n+1} − v^n)/Δt = −(v^{n+1} · ∇)v^{n+1} − (1/ρ)∇p^{n+1} + (μ/ρ)Δv^{n+1} − β(T^{n+1} − T_0)g        (5)

(T^{n+1} − T^n)/Δt = −(v^{n+1} · ∇)T^{n+1} + (1/(ρC_p))∇·(κ∇T^{n+1}) + S_c/(ρC_p)        (6)

In what follows, we regard equations (4)-(6) as the governing equations for the motion of numerical fluids and then take the standpoint that the numerical solvability of this basic model should be investigated. In view of the unilateral positive direction of time, we linearize the nonlinear term on the right-hand side of (5) as (v^{n+1} · ∇)v^{n+1} ≈ (v^n · ∇)v^{n+1} and we obtain the following Helmholtz equation for the velocity vector v^{n+1}:


Substitution of equation (5) into equation (4) implies Poisson's equation for the pressure field p^{n+1}:

Δp^{n+1} = −ρ [ ∇·{(v^{n+1} · ∇)v^{n+1}} − ∇·v^n/Δt ] − ρβ ∇·(T^{n+1} g)        (8)

Also, the Helmholtz equation for the temperature field is obtained from equation (6) in the following form:

[1 + Δt(v^n · ∇)] T^{n+1} − (κΔt/(ρC_p)) ΔT^{n+1} = T^n + Δt S_c/(ρC_p)        (9)

Elliptic equations (7), (8) and (9) are discretized with respect to the space variables and the resultant system of finite-difference equations is solved by applying an appropriate iteration method. Namely, given the velocity field v^n, pressure field p^n and temperature field T^n at the nth time step, the velocity field v^{n+1}, pressure field p^{n+1} and temperature field T^{n+1} are obtained at the (n + 1)th step through the iteration procedures, respectively. As to the discretization of the physical space Ω, we employ a collocation form such that the scalar and vector quantities are defined on the same grid points. For the discretization of the convective terms, we apply an up-wind scheme of third order accuracy which is proposed in Kawamura [2]:

v_i (∂v/∂x)_i = v_i [−v_{i+2} + 8(v_{i+1} − v_{i−1}) + v_{i−2}] / (12Δx) + |v_i| [v_{i+2} − 4v_{i+1} + 6v_i − 4v_{i−1} + v_{i−2}] / (4Δx)        (10)

For the other terms we employ central difference schemes of second order accuracy. It should be mentioned here that the above-mentioned finite-difference scheme is unconditionally stable.
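A minimal sketch of the convective discretization (10) on a one-dimensional array is given below; the periodic boundaries and the function name are assumptions made only for illustration.

```python
import numpy as np

def kawamura_upwind(v, dx):
    """Third-order upwind approximation of v * dv/dx, as in equation (10).

    Periodic boundaries are assumed here purely for illustration."""
    vp1 = np.roll(v, -1)   # v_{i+1}
    vp2 = np.roll(v, -2)   # v_{i+2}
    vm1 = np.roll(v, 1)    # v_{i-1}
    vm2 = np.roll(v, 2)    # v_{i-2}
    central = (-vp2 + 8.0 * (vp1 - vm1) + vm2) / (12.0 * dx)
    dissip = (vp2 - 4.0 * vp1 + 6.0 * v - 4.0 * vm1 + vm2) / (4.0 * dx)
    return v * central + np.abs(v) * dissip
```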

4 Results of Numerical Simulations

Computation is started with an adequite initial data and qualitative features are investigated by analyzing the numerical results of the simulation at time steps at which the flow field is well developed and approaches a quasi-stationary state. Figure l(a) depicts the contours of the pressure on the cross section containing the axes of the two cylinders. In the figure of contours of the pressure it is observed that a vertical sequence of separate regions like cells of negative pressure are formed. This is due to the presence of nonstationary vortices of Karmann-type. Furthermore, a vertical sequence of regions of positive pressure are observed in the front of the rear cylinder. This phenomenon suggests that the nonstationary vortices generated by the front cylinder interact the regions of stagnation existing in the front of the rear cylinder and deteriorate the stagnation pressure. Figure 1(b) depicts the iso-surfaces of vorticity and illustrates the 3D structure of the pressure field and vorticity distribution.

Figure 1: Computed pressure contour and vorticity iso-surfaces ((a) pressure contour, (b) absolute vorticity)

On the other hand, concerning the vorticity distribution, regular variation due to the formation of vortex pairs of Karman type is observed in the top part of the rear cylinder, in the same way as in the pressure field. Vorticity distribution is concentrated in the back of the middle part of the rear cylinder, although the vorticity distribution in the bottom part is comparatively diffusive.




Figure 2: Upward flow motion observed behind two circular cylinders ((a) streamlines, (b) particle trajectories)

This suggests that the flows running around the bottom parts of the cylinders form longitudinal vortices. It is then inferred that these longitudinal vortices would motivate the upward flows behind the two cylinders. In Figure 2 the stream lines and trajectories of particles in the fluid are depicted. Trajectories of particles are drawn in the following way: We released the particles from the back of each cylinder and traced the trajectories forward and backward in time until the particles reach the boundaries of the computational domain and those of the bodies in the fluid. It is seen from Figure 2(a) that upward flows behind the cylinders are rolling up towards the top. Furthermore, the motion of longitudinal vortices around the bottom sides can be observed, as inferred from the iso-surfaces of vorticities. Figure 2(b) is obtained by arranging particles on the same trajectories as in Figure 2(a) at regular time intervals. From this it is seen that the particle distribution represents how long a particle stays in the flow, and that particles are concentrated in the back of the front cylinder. These results of numerical simulations may have applications to

Figure 3: Structures in the ocean and atmosphere ((a) oyster raft, (b) street trees)

various environmental problems. The farming part of an oyster raft may be regarded as a regular arrangement of cylinders with bottom ends. Our results suggest that red tide plankton would stay in the upward vortices rolling up along the back of the cylinders provided that the oyster raft is moved or water currents exist against the raft. Another application is that street trees are regarded as a sequence of cylinders with top ends, and that automobile exhaust fumes may be caught in the back of each tree, which assimilates those poisonous gases. See Figure 2(a) and 2(b). It is then expected that new environmental restoration technology will be significantly improved by applying the results of numerical simulations of the motion of environmental fluids.

References [1] E. Turkel, "Preconditioned Methods for Solving the Incompressible and Low Speed Compressible Equations," J. Comput. Phys., Vol. 72, No. 2, pp. 277-298, 1987. [2] T. Kawamura and K. Kuwahara, "Computation of High Reynolds Number Flow around a Circular Cylinder with Surface Roughness," AIAA Paper No. 84-0340, 1984.


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 41-44

Derivation and Optimization of the Material Cost Function for Natural Gas Pipe-Lines P.l. Arsenos 1, C. Demakos2, Ch. Tsitouras3, D.G. Pavlou 1 and M.G. Pavlou4 Received 5 September, 2004; accepted in revised form 12 September, 2004 Abstract: Pipe-lines used for natural gas transfer are mainly operating at high pressure. The latter condition

is associated with their wall thickness which determines construction cost. Currently, it is recognized that specific construction techniques that are based in reinforcing the pipe-line with rigid rings can result in significant reduction of material cost. This cost is strongly influenced by the mechanical parameters, material strength and material price. The objective of the present study is the derivation and optimization of the cost function [I ,2] versus mechanical parameters. In order to obtain the function that is to be optimized, a numerical procedure based on matrix analysis and theory of elasticity will be used. Keywords: Pressure pipes, beam on elastic foundations, transfer matrices. Mathematics Subject Classification: 74805, 74P05, 74G99

1. Formulation of the mechanical problem A long axisymmetric pipe of radius R and wall thickness t is reinforced by rigid rings of diameter d. The rings are uniformly distributed along the pipe-line and the distance between two rings is L. The pipe is loaded by internal static pressure Po due to natural gas flow. Under these conditions, at every point x along the wall, a radial displacement w=w(x, Po, t, L) is developed. The boundary conditions of the problem are:

w(0) = 0,   w(L) = 0,   w'(0) = 0,   w'(L) = 0        (1)

2. Relation of the material cost with the construction parameters In order to develop the basic equation that describes the radial displacement distribution w(x) of the pipe's wall, the equilibrium of a longitudinal strip oflength Land width a.M=l of the pipe is considered [3]. It is well known by the mechanics that the longitudinal displacement u(x,y) of a point A(x,y) of a beam is given by the simple equation [4]: u (x,y) = yw' (2) where w' is the slope of the centroidal axis after bending. According to the definition of strain it can be written:

ε = ∂u/∂x = y w''        (3)

Taking into account the relation between stress σ and strain ε given by Hooke's law, σ = Eε, where E is the elasticity modulus, the equations (2) and (3) result in:

σ = E y w''

1 TEI of Chalkis, Department of Mechanical Engineering, GR34400 Psahna, Greece
2 TEI of Piraeus, Department of Civil Engineering, GR122 44 Athens, Greece. E-mail: [email protected]
3 TEI of Chalkis, Department of Applied Sciences, GR34400 Psahna, Greece. E-mail: [email protected]
4 ERGOSE S.A., Works of Greek Federal Organization, Karolou 27, 10437 Athens, Greece

Using the above expression, the equilibrium equation

M = ∫_F y σ dF

results in M = E w'' J, where J = ∫_F y² dF. Considering the equilibrium equations Q = ∂M/∂x and q = ∂Q/∂x, the following differential equation is obtained:

E J w'''' = q        (4)

The distributed load q is the superposition of the action q** of the internal pressure P_0 and the reaction q* of the wall [5], so that:

q = q** − q*        (5)

where

q** = P_0        (6)

q* = (Et/R²) w        (7)

Combining equations (4)-(7) the following fundamental differential equation is derived:

E J w'''' + (Et/R²) w = P_0

The general solution of the above equation [6] is:

w = B_1 chcos + B_2 chsin + B_3 shcos + B_4 shsin − P_0/k

where k = Et/R², β = (k/(4EJ))^{1/4} and

chcos = ((e^{βx} + e^{−βx})/2) cos βx,   chsin = ((e^{βx} + e^{−βx})/2) sin βx,
shcos = ((e^{βx} − e^{−βx})/2) cos βx,   shsin = ((e^{βx} − e^{−βx})/2) sin βx.

Taking into account the boundary conditions (1), the state vector of the beam at the ring locations (i.e. x = 0 and x = L) can be obtained by a matrix equation of the following form:

[w(L), w'(L), M(L), Q(L)]ᵀ = [A_ij] · [w(0), w'(0), M(0), Q(0), 1]ᵀ

or, combined with the boundary conditions w(0) = w(L) = 0 and w'(0) = w'(L) = 0, as a single linear system for the unknown state components at x = 0 and x = L.

The solution of the above linear system results in:

M(0) = M(L) = 12 β⁴ E J L² (12 + β⁴L⁴) P_0 / [k (432 − 72 β⁴L⁴ + 12 β⁵L⁵ + β⁸L⁸)]

Q(0) = Q(L) = −24 β⁴ E J L (36 − 2 β⁴L⁴ + 3 β⁵L⁵) P_0 / [k (432 − 72 β⁴L⁴ + 12 β⁵L⁵ + β⁸L⁸)]

The thickness of the pipe's wall and the diameter of each ring can be determined by the following well-known equations:

6 M_max / t² = S_y / N   and   4 Q_max / (π d²) = S_y / N

or

t = √(6 M_max N / S_y)   and   d = √(4 Q_max N / (π S_y))        (8)

The cost c of the required material is proportional to the weight. Therefore:

c = λ ( 2πRLt + 2πR · πd²/4 )

The normalized unit cost c* = c/(2πRλL) can be written:

c* = (1/L) ( Lt + πd²/4 )

Taking into account equations (8), the normalized unit cost function c* can be expressed as:

c* = (1/L) [ L √(6 M_max N / S_y) + Q_max N / S_y ]        (9)

In order to minimize the required cost [7,8] for the pipe-line construction, the distance L between the rigid rings can be determined by the following condition:

∂c*/∂L = 0        (10)
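A small numerical sketch of condition (10) is shown below: it evaluates the normalized unit cost (9) and minimizes it over the ring spacing L with a bounded scalar search. The moment and shear models passed in are hypothetical stand-ins; in the paper M_max and Q_max follow from the transfer-matrix solution and equation (10) is solved with Mathematica instead.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def normalized_unit_cost(L, Mmax, Qmax, Sy, N):
    """Normalized unit cost c* of equation (9); Mmax and Qmax are callables
    returning the governing bending moment and shear force for spacing L."""
    t = np.sqrt(6.0 * Mmax(L) * N / Sy)      # wall thickness from (8)
    ring_term = Qmax(L) * N / Sy             # pi * d^2 / 4 from (8)
    return (L * t + ring_term) / L

# Hypothetical moment/shear models used only to exercise the routine;
# in the paper they follow from M(0), M(L/2) and Q(0) computed above.
Mmax = lambda L: 1.0e3 * L**2 / (1.0 + 1.0e-6 * L**4)
Qmax = lambda L: 5.0e2 * L / (1.0 + 1.0e-6 * L**4)

res = minimize_scalar(normalized_unit_cost, bounds=(100.0, 20000.0),
                      args=(Mmax, Qmax, 240.0, 5), method="bounded")
print("optimal ring spacing L (mm):", res.x)
```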

3. Numerical example
A long pipe-line with mechanical parameters EJ = 9.84x10⁹ N·mm²/mm, k = 0.168 N·mm²/mm is constructed of steel with yield stress Sy = 240 MPa and elasticity modulus E = 2.1x10⁵ N/mm². Within the pipe, natural gas with pressure Po = 3x10⁵ Pa flows. For the reinforcement of the pipe-line a uniform distribution of rigid rings with diameter d has been applied. During loading, two sections along the length are critical for the construction's strength: (a) the section located at the rings (x = 0 or x = L) and (b) the section located at the middle between two rings (x = L/2). The stressing of the section of case (b) becomes predominant as the distance between the rings increases. The bending moment M(L/2) at this section is determined by a matrix equation of the following form:

[w(L/2), w'(L/2), M(L/2), Q(L/2)]ᵀ = [B_ij] · [w(0), w'(0), M(0), Q(0), 1]ᵀ

For the calculation of the normalized unit cost c* given by (9), the following results are used: M_max = max[M(0), M(L/2)], Q_max = Q(0). The normalized cost versus the parameter L indicates a strong influence of the construction technique. Equation (10) is solved numerically using the commercial program Mathematica, taking the value N = 5 for the safety factor. The solution of the above equation resulted in:

L_optimum ≈ 6743 mm.

References [1] Cook, W.D. and Kress, M. (1991), A multiple criteria decision model with ordinal preference data, European Journal of Operational Research, 54, 191-198. [2] Doumpos, M, Zopounidis, C. and Pardalos, P.M. (2000a), Multicriteria sorting methodology: Application to financial decision problems, Parallel Algorithms and Applications, 15/1-2, 113129. [3] Wu, T.Y. and Liu, G.R. (2000), Axisymmetric bending solution of shells of revolution by the generalized differential quadrature rule, International Journal of Pressure Vessels and Piping, 77, 149-157. [4] Timoshenko, S.P. and Gpodier, J.N. (1970), Theory of elasticity, 3'd Ed., McGraw-Hill, New York. [5] Boresi, A.P. and Sidebottom, O.M. (1985), Advanced Mechanics of Materials, 41h Ed., Wiley, New York. [6] King, A.C., Billingham, J., and Otto, S.R. (2003), Differential Equations, Cambridge Univ. Press, Cambridge. [7] Koehler, G.J. and Erenguc, S.S. (1990), Minimizing misclassifications in linear discriminant analysis, Decision Sciences, 21, 63-85. [8] Starn, A. (1990), Extensions of mathematical programming-based classification rules: A multicriteria approach, European Journal of Operational Research, 48, 351-361.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 45-48

A Semi-Algorithm to Find Elementary First Order Invariants of Rational Second Order Ordinary Differential Equations J. Avellar 1, L.G.S. Duarte 2, S.E.S. Duarte 3, L.A.C.P. da Mota 4 Universidade do Estado do Rio de Janeiro, Instituto de Fisica, Departamento de Fisica Teorica, R. Sao Francisco Xavier, 524, Maracana, CEP 20550-013, Rio de Janeiro, RJ, Brazil. Received 2 September, 2004; accepted in revised form 17 September, 2004 Abstract: Here we present a method to find elementary first integrals of rational second order ordinary differential equations (SOODEs) based on a Darboux type procedure [3, 4, 5]. Apart from practical computational considerations, the method will be capable of telling us (up to a certain polynomial degree) if the SOODE has an elementary first integral and, in the positive case, find it via a single quadrature. Keywords: Elementary first integrals, semi-algorithm, Darboux, Lie symmetry PACS: 02.30.Hq

1 Earlier Results

In the paper [1], one can find an important result that, translated to the case of SOODEs of the form

y'' = M(x, y, y') / N(x, y, y') = φ(x, y, y'),        (1)

where M and N are polynomials in (x, y, y'), can be stated as:

Theorem 1: If the SOODE (1) has a first order invariant that can be written in terms of elementary functions, then it has one of the form:

I = w_0 + Σ_{i=1}^{m} c_i ln(w_i),        (2)

where m is an integer and the w's are algebraic functions⁵ of (x, y, y').

1 E-mail: [email protected]  Fundacao de Apoio a Escola Tecnica, E.T.E. Juscelino Kubitschek, 21311-280 Rio de Janeiro - RJ, Brazil
2 E-mail: [email protected]
3 E-mail: [email protected]
4 E-mail: [email protected]
5 For a formal definition of algebraic function, see [2].


The integrating factor for a SOODE of the form (1) is defined by:

R(φ − y'') = dI(x, y, y')/dx        (3)

where d/dx represents the total derivative with respect to x. Below we will present some results and definitions (previously presented in [6]) that we will need. First let us remember that, on the solutions, dI = I_x dx + I_y dy + I_y' dy' = 0. So, from equation (3), R(φ dx − dy') = I_x dx + I_y dy + I_y' dy' = dI = 0. Since y' dx = dy, we have

R[(φ + S y') dx − S dy − dy'] = dI = 0,        (4)

adding the null term S y' dx − S dy, where S is a function of (x, y, y'). From equation (4), we have: I_x = R(φ + S y'), I_y = −RS, I_y' = −R, which must satisfy the compatibility conditions. Thus, defining the differential operator D ≡ ∂_x + y' ∂_y + φ ∂_{y'}, after a little algebra, that can be shown to be equivalent to:

D[R] = ...        (5)

D[RS] = ...        (6)

2 The theoretical foundations for the algorithm

Let us start this section by stating (without presenting the demonstration here in the extended abstract) a corollary to theorem 1 concerning S and R.

Corollary: If a SOODE of the form (1) has a first order elementary invariant then the integrating factor R for such an SOODE and the function S defined in the previous section can be written as algebraic functions of (x, y, y'). Besides that, working on equations (5) and (6), we get: (7)

That equation will be regarded as a first order ordinary differential equation (FOODE) for S(x) over the solutions of the SOODE (1). Concerning eq. (7) we can demonstrate the following theorem:

Theorem 2: Consider the operator

D_S = ( (N S)² + (N M_{y'} − M N_{y'}) S − (N M_y − M N_y) ) ∂_S + N² D.        (8)

If P is an eigenpolynomial of D_S (i.e., D_S[P] = λP) that contains S, then P = 0 is a particular solution of eq. (7). Since the existence of an elementary first order invariant I (first integral) for the SOODE (1) implies that S can be written as an algebraic function, we have the result:

Theorem 3: If the SOODE (1) has an elementary first integral, then the operator D_S defined above has an eigenpolynomial P that contains S. This result provides us an algorithm to find S. In order to obtain the integrating factor R, let us show a relation between the function S and a symmetry of the SOODE (1). Making the following transformation
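The eigenpolynomial condition of Theorems 2 and 3 can be tested mechanically with a computer algebra system. The sketch below builds the operators D and D_S of equation (8) and checks whether a candidate polynomial P(x, y, y', S) satisfies D_S[P] = λP; the SOODE and the candidate used at the end are hypothetical and serve only to exercise the routine (the paper's own implementation is in MAPLE, not in Python/sympy).

```python
import sympy as sp

x, y, yp, S = sp.symbols("x y yp S")   # yp stands for y'

def D(expr, M, N):
    """Total derivative operator D = d/dx + y' d/dy + phi d/dy', with phi = M/N."""
    phi = M / N
    return sp.diff(expr, x) + yp * sp.diff(expr, y) + phi * sp.diff(expr, yp)

def D_S(P, M, N):
    """Operator of equation (8) applied to P(x, y, y', S)."""
    coeff = (N * S) ** 2 + (N * sp.diff(M, yp) - M * sp.diff(N, yp)) * S \
            - (N * sp.diff(M, y) - M * sp.diff(N, y))
    return sp.expand(coeff * sp.diff(P, S) + N ** 2 * D(P, M, N))

def is_eigenpolynomial(P, M, N):
    """True if Ds[P] = lambda * P with lambda polynomial (Theorem 2); also returns the ratio."""
    ratio = sp.cancel(D_S(P, M, N) / P)
    return ratio.is_polynomial(x, y, yp, S), ratio

# Hypothetical rational SOODE y'' = M/N and candidate P, for illustration only.
M_ex, N_ex = y * yp, sp.Integer(1)
print(is_eigenpolynomial(S * y - yp, M_ex, N_ex))
```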


S = − D[η] / η        (9)

eq. (7) becomes

        (10)

From Lie theory we can see that eq. (10) represents the condition for a SOODE (1) to have a symmetry [0, η]. So, from (9) we can find a symmetry given by

        (11)

3 Finding the integrating factor and the first integral

Looking at eq. (4) we can infer that R is also an integrating factor for the auxiliary FOODE defined by

dz/dy = S,        (12)

where x is regarded as a parameter. Besides, for this FOODE, [η, D[η]] is a point symmetry. So, R is given by

R = 1 / (η S + D[η]).        (13)

If in (13) we use η defined by (11) we would get a singular R. However, eq. (10) has another solution, independent from η, given by

η̄ = η ∫ ( e^{∫ ... dx} / η² ) dx.        (14)

Using this η̄ in (13) we get R and, once this is done, we can calculate the first integral I by using simple quadratures.

4 Example and conclusions

Consider the SOODE (15)

Constructing the Ds operator we get that (16)

is an eigenpolynomial of it. Then S is given by S = z/y + √(x² − 1) and

1)

=

..jx + y'x"2-=1 x/X9

ye

2

' 'fi =

1),

r; are respectively

J x + v'x"2-=1 f ~fJ!;~; dx

------,-/X9-;x=;o2=_,'-1..:r..:::.__:c.__ _

ye

(17)

2

R can be written as (18)


leading to the following first integral: I-

-

Jx

+ ../x2 -

1 (2zy + ~y2) ../x 2 -1e~ .

--------'-----=~-----'-

{19)

Here we presented a semi-algorithm to find an elementary first integral for a rational SOODE. We are working on a full implementation of it in the MAPLE system.

References [1] M Prelle and M Singer, Elementary first integral of differential equations. Trans. Amer. Math. Soc., 279 215 {1983).

[2] Davenport J.H., Siret Y. and Tournier E. Computer Algebra: Systems and Algorithms for Algebraic Computation. Academic Press, Great Britain {1993). [3] Y K Man and M A H MacCallum, A Rational Approach to the Prelle-Singer Algorithm. J. Symbolic Computation, 11 1-11 {1996), and refferences therein.

[4] L.G.S. Duarte, S.E.S.Duarte and L.A.C.P. da Mota, A method to tackle first order ordinary differential equations with Liouvillian functions in the solution, in J. Phys. A: Math. Gen. 35 3899-3910 {2002). [5] L.G.S. Duarte, S.E.S.Duarte and L.A.C.P. da Mota, Analyzing the Structure of the Integrating Factors for First Order Ordinary Differential Equations with Liouvillian Functions in the Solution, J. Phys. A: Math. Gen. 35 1001-1006 {2002) [6] L G S Duarte, S E S Duarte, L A C P da Mota and J E F Skea, Solving second order ordinary differential equations by extending the Prelle-Singer method, J. Phys. A: Math. Gen., 34 30153024 {2001).


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 49-52

Simultaneous Estimation of Heat Source and Boundary Condition in Two-Dimensional Transient Inverse Heat Conduction Problem A. Azimi, '· 2 S. Kazemzadeh Hannani and B. Farhanieh School of Mechanical Engineering, Sharif University ofTechnology, Tehran- Iran 1 Energy Research Centre, Research Institute of Petroleum Industry, Tehran- Iran Received 2 August, 2004; accepted in revised form 15 August, 2004 Abstract: In this research, a simultaneous estimation of heat source and boundary conditions using

parameter and function-estimation techniques is presented for the solution of two-dimensional transient inverse heat conduction problem (IHCP). The IHCP involves simultaneous unknown time varying heat generation and time-space varying boundary conditions estimation. Two techniques are considered, Levenberg-Marquardt scheme for parameter estimation and adjoint conjugate gradient method for function estimation. To have fewer numbers of unknown coefficients for estimation, polynomials are used in the parameter estimation scheme. The measured transient temperature data needed in the inverse solution are made by exact or inexact (noisy) data. The results of the present study are compared to exact heat source and temperature (heat flux) boundary conditions.

Keywords: Inverse Heat Conduction- Parameter and Function Estimations- Simultaneous Estimation

1. Introduction

The development of engineering softwares for the study of Inverse Heat Conduction Problems (IHCPs) has widely investigated for applications in industrial and research areas. Time-space IHCPs have been used to estimate time or time-space varying unknown heat generation, surface boundary conditions and thermophysical properties using one or more temperatures measured by sensors inside or on the boundaries of the physical domain. Mathematically, IHCPs, unlike the direct heat conduction problems, belong to a class of "ill-posed" problems which do not satisfy the "well-posed" conditions introduced by Hadamard[!]. In fact, the IHCPs are very sensitive to random errors in the measured temperature data, thus requiring special techniques for their solutions in order to satisfy the stability condition, one of the "well-posed" conditions. Minimization of error is an aim of the inverse Analyses. Inverse analysis is related to analytical design theory. The concept of choosing the best experiments in the inverse analysis and minimizing it is common to choose the best cost function in the analytical design theory. A number of parameter and function-estimation schemes for inverse analysis have been proposed to treat the illposed nature ofiHCPs [2, 3]. In this research, a simultaneous estimation of unknown heat source and boundary conditions is presented for the solution of two-dimensional transient inverse heat conduction problem (IHCP). The IHCP involves simultaneous unknown time varying heat generation and time-space varying boundary conditions estimation. Two techniques Levenberg-Marquardt scheme for parameter estimation and adjoint conjugate gradient method for function estimation are considered.
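One of the two techniques named above, the Levenberg-Marquardt scheme, amounts to a damped nonlinear least-squares fit of the unknown coefficients to the measured temperatures. The sketch below illustrates that idea with SciPy's Levenberg-Marquardt driver; the toy direct solver and the linear source model are hypothetical stand-ins for the two-dimensional transient conduction solver of the paper.

```python
import numpy as np
from scipy.optimize import least_squares

def estimate_source_coefficients(direct_solver, measured, t_sensors, p0):
    """Levenberg-Marquardt style parameter estimation for the IHCP (sketch).

    direct_solver(p, t) must return simulated sensor temperatures for heat-source
    polynomial coefficients p at measurement times t."""
    def residuals(p):
        return direct_solver(p, t_sensors) - measured
    result = least_squares(residuals, p0, method="lm")
    return result.x

# Hypothetical linear-in-time source model used only to exercise the routine.
def toy_solver(p, t):
    return p[0] + p[1] * t            # stands in for the transient conduction solution

t = np.linspace(0.0, 1.0, 20)
data = toy_solver([1.0, 0.5], t) + 0.01 * np.random.randn(t.size)   # noisy measurements
print(estimate_source_coefficients(toy_solver, data, t, p0=[0.0, 0.0]))
```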

2. Mathematical Formulation The governing equation in non-dimensional form is the two-dimensional transient heat conduction equation with heat source. The governing equation expressed in Cartesian coordinate system is then:

t > 0, 0 <

y_{-i}(q) > 0. In this work we assume the closeness of the protocols, since in the definition of the payoff function we always consider both the wish of one party to know the other's secret and the wish of the other party to prevent that from happening. According to the aforementioned functional definition of a two-party cryptographic protocol f, at the end of the execution, party i should receive the output of f_i on secrets M_A and M_B. Depending on whether f_A = f_B we may distinguish between symmetric and asymmetric protocols. From the first group we will study the protocols of Fair Exchange, Secure Two-Party Computation and Coin Flipping. On the other hand, representative protocols of the group of asymmetric protocols are Oblivious Transfer, Bit Commitment and Zero Knowledge Proof. This classification is important for the proposed game theoretic model because it implies the translation to a symmetric game, where possible payoffs and outputs of both parties coincide, or to asymmetric games, where that does not occur. For every analyzed protocol we give formal definitions of income, expense and payoff functions for each party in every possible combination of behaviours and misbehaviours of parties, and

A Rational Approach to Cryptographic Protocols

101

obtain necessary hypothesis to guarantee the existence of a honest strategy profile being a Nash equilibrium. Also we rank properties of exclusivity, voyeurism, correctness and privacy. Although we consider the possibility of misbehaviours by both parties, in this paper we analyze specially the case when exactly one of them is dishonest. Concretely, we deal with the idea of modeling cryptographic protocols design as the search of an equilibrium in order to defend honest parties against all possible strategies of malicious parties. So, our main objective is to illustrate the close connection between protocols and games and to use game theoretic techniques for the definition and analysis of cryptographic protocols. Since this work is in progress, many questions are still open. On the one hand, one direction for further investigation involves the use of a similar game-theoretic approach to describe and analyze multiparty cryptographic protocols. On the other hand, the relationship between properties defined here and other properties described by different authors, and a possible extension of this study to stronger and weaker definitions of essential properties may also deserve further study.
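The central design goal stated above, that the honest strategy profile be a Nash equilibrium, can be checked mechanically once payoff matrices for the two parties are written down. The sketch below performs this check for a hypothetical 2x2 game (strategy 0 = honest, 1 = malicious); the payoff values are illustrative only and are not taken from the paper.

```python
import numpy as np

def is_nash(payoff_a, payoff_b, i, j):
    """Check whether profile (i, j) is a Nash equilibrium of a two-player game
    given by payoff matrices payoff_a[i, j] (player A) and payoff_b[i, j] (player B)."""
    best_a = payoff_a[:, j].max()     # best A can do against B's fixed strategy j
    best_b = payoff_b[i, :].max()     # best B can do against A's fixed strategy i
    return payoff_a[i, j] >= best_a and payoff_b[i, j] >= best_b

# Hypothetical payoffs where honest behaviour is mutually best.
A = np.array([[3.0, -1.0],
              [2.0,  0.0]])
B = np.array([[3.0,  2.0],
              [-1.0, 0.0]])
print(is_nash(A, B, 0, 0))   # is (honest, honest) an equilibrium for these payoffs?
```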

References [1] Buttyan, L., Ph.D.Thesis, Building Blocks for Secure Services: Authenticated Key Transport and Rational Exchange Protocols, Laboratory of Computer Communications and Applications, Swiss Federal Institute of Technology- Lausanne (EPFL), 2002.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 102-105

An Improved Dynamic Programming Algorithm for Optimal Manpower Planning 1

X. Cai 2, Y.J. Li 2, F.S. Tu

Department of Systems Engineering & Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
School of Economics and Management, Tsinghua University, Beijing 100084, P. R. China
Department of Automation, Nankai University, Tianjin 300071, P. R. China

Received 4 August, 2004; accepted in revised form 22 August, 2004 Abstract: We consider a manpower planning problem (MPP) over a long planning horizon. Dynamic demands for manpower must be satisfied by allocating enough number of employees, under the objective to minimize the overall cost including salary, recruitment cost, and dismissal cost. We first formulate the problem as a multi-period decision model. We then reveal several properties of the optimal solution and develop an improved dynamic programming algorithm with polynomial computational complexity. Keywords: Manpower planning, dynamic program, computational complexity

Mathematics Subject Classification: 93B40, 37N40

1 Introduction

With the rapid development of economy, manpower planning has become an important problem in today's business world, especially in labor-intensive corporations, where the workforce plays a prominent role in determining the effectiveness and cost of the organization. As such, studies of optimal manpower planning have received extensive attention in the last two decades; see, e.g., [1], [2], and [3]. Considering the dynamic fluctuations of manpower demands, it is natural for an organization to determine the optimal size of its workforce by making proper and dynamic decisions on recruitment and dismissal over different periods of time. Such models, however, have not received much attention in the literature, due to properly the inherent complexity in deriving the optimal dynamic solutions. It is often that such a problem becomes a dynamic optimization model, which requires very sophisticated computational algorithms to search for the optimal or near-optimal solutions. In this article we will first model the manpower planning problem with dynamic demands as a multi-period decision process with constraints. We will then propose an improved dynamic 1 This research was supported in part by NSFC Research Funds No. 60074018 and No. 70329001. The second author would also like to acknowledge the support of the China Postdoctoral Science Foundation No. 2003034020 and the Tsinghua-Zhongda Postdoctoral Science Foundation (2003). 2 Corresponding authors. E-mail: [email protected] (X. Cai), [email protected] (Y.J. Li)


programming algorithm to derive the optimal solution. Our approach will be devised based on an analysis of the properties of the optimal decisions. Our approach is not only computationally efficient, but also capable of revealing some useful insights on the desirable solutions with respect to the data, therefore allowing management to have a better understanding of the impacts and benefits of the solutions.

2 The Model

We consider the following manpower planning problem for an organization. Suppose that based on forecast of its business, the number of employees required in each time period has been specified in advance. The organization can decide on the number of the employees to be recruited or dismissed at the end of every period, subject to the constraint that the workforce available in the coming period would meet the demand for manpower. The objective is to find a series of optimal decisions on recruitment/dismissal in all time periods over the planning horizon, so that the total manpower-related cost is minimized.

Notation:
T: The number of time periods being considered.
α: The salary per employee in each time period.
β⁺ / β⁻: The recruitment/dismissal cost when an employee is recruited/dismissed.
X[t]: The number of employees available in period t, which is called the state in period t.
u[t]: The number of employees being recruited (u[t] > 0) / dismissed (u[t] < 0) at the end of period t.
D_t: The manpower demand in period t, where t = 0, ..., T (assume D_0 = 0).

The problem under consideration can be formulated as

MP1:   min J = α X[T] + Σ_{t=0}^{T−1} [ α X[t] + β⁺ max{u[t], 0} + β⁻ max{−u[t], 0} ]        (1)

s.t.   X[t + 1] = X[t] + u[t],   t = 0, 1, ..., T − 1        (2)

       X[t] ≥ D_t,   t = 1, 2, ..., T        (3)

       X[0] = X_0        (4)

The constraint (2) shows the dynamics of the workforce available in the organization. The constraint (3) indicates that the demand must be met by the available employees in each period. The constraint (4) specifies the value of initial state, where Xo is a given constant.

3 A Standard Dynamic Programming Approach

One can see that MP1 is a dynamic optimization problem. For each pair (t, x), let f_t(x) be the minimum total cost from period t to T, subject to the constraint that there are x employees in period t. For a feasible decision u[t] = u, the number of employees in period t + 1 is x_{t+1} = x + u. Letting D_max = max{D_t, t = 1, 2, ..., T}, we have the following dynamic program:

DP1:

f_T(x) = α x,   D_T ≤ x ≤ D_max        (5)

f_t(x) = α x + min_{x_{t+1}} { β⁺ max{x_{t+1} − x, 0} + β⁻ max{x − x_{t+1}, 0} + f_{t+1}(x_{t+1}) },   D_t ≤ x ≤ D_max        (6)


The optimal solution is obtained when f_0(X_0) is computed. We can show that the time complexity of the algorithm DP1 is bounded above by O(T·D²_max). This time complexity is too high, in particular when we have a large D_max or a large T. In the following we present our improved algorithm, which requires substantially less computational time.
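For reference, the standard recursion (5)-(6) can be coded directly as below. The inclusion of the cost-to-go term f_{t+1}(x_{t+1}) inside the minimization is assumed (it is implicit in any dynamic program of this type), and the demand data at the end are hypothetical.

```python
def dp1(D, alpha, beta_plus, beta_minus, X0):
    """Standard dynamic program DP1 for MP1 (sketch).

    D[t] is the demand for periods t = 0..T (D[0] is unused, X0 is given);
    states are restricted to D[t] <= x <= Dmax as in the text."""
    T = len(D) - 1
    Dmax = max(D[1:])
    f = {x: alpha * x for x in range(D[T], Dmax + 1)}      # terminal costs f_T(x)
    for t in range(T - 1, 0, -1):
        f_new = {}
        for x in range(D[t], Dmax + 1):
            best = min(beta_plus * max(x1 - x, 0) + beta_minus * max(x - x1, 0) + f[x1]
                       for x1 in range(D[t + 1], Dmax + 1))
            f_new[x] = alpha * x + best
        f = f_new
    # period 0: the state is fixed at X0; its salary term alpha * X0 is added at the end
    return alpha * X0 + min(beta_plus * max(x1 - X0, 0) + beta_minus * max(X0 - x1, 0) + f[x1]
                            for x1 in range(D[1], Dmax + 1))

# hypothetical data, for illustration only
print(dp1(D=[0, 3, 5, 4, 6], alpha=1.0, beta_plus=2.0, beta_minus=1.5, X0=2))
```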

4 Improved Dynamic Programming Approach for MP1

In the algorithm DP1, for any given state x (Dt :'S x :'S Dmax) we compute the value ft(x) over all possible Dmax - Dt+l· This is actually not necessary, if we can utilize some properties of the problem MPl. Some analysis is given below. Let us look at the case (Dt :'S Dt+l) as illustrated in Fig. 1. We can first define dt = max{Dt+l,Dt+2···· ,Dt+d, where L = 13 1rl {fxl is defined as the smallest integer greater than or equal to x), and then divide the possible X[t] = x into two values dt and Dt+l· Further, since Dt :'S x :'S Dmax• the entire interval of the value x can be divided into three parts: [Dt, Dt+I], (Dt+r,dt) and [Dt+r.dt]· Thus, we can get the states Xt+I of equation {6) as follows: When Dt :'S x :'S Dt+l• Xt+l = Dt+l; When Dt+l < x :'S dt, Xt+l = x; When dt -dt, then we get X (tl [t]' -

X^(t)[t'] = d_t,   if t + 1 ≤ t' < t + L,
X^(t)[t'] = X^(t+L)[t'],   if t + L ≤ t' ≤ T.

• If t ≥ T − L and D_t ≥ D_{t+1}, then X^(t)[t'] = D_t for t ≤ t' ≤ T.

Step 3. The optimal states over the entire planning horizon are X^(0)[t] (0 ≤ t ≤ T). Correspondingly, we can get the optimal value f_0(X_0).

Theorem 2. The algorithm IDP2 can generate an optimal solution of MP1 in O(T) time.

5 Concluding Remarks

We have modelled a dynamic manpower planning problem as a multi-period decision process, and developed an improved dynamic programming algorithm to compute its optimal solution. Our proposed algorithm is efficient, requiring only a time complexity O(T), which is independent of the magnitude of the manpower demands. The reduction of computing time of our algorithm has been achieved by analysis of the optimal properties of the problem. Topics for further investigation include consideration of multiple skills, training requirements and costs, promotion, leave, etc.

References [1] D.J. Bartholomew, A.F. Forbes and S.I. McClean, Statistical techniques for manpower planning (2nd ed.). Wiley, New York, 1991. [2] A. M. Bowey, Corporate manpower planning, Management Decision, 15 (1977) 421-469. [3] H.K. Alfares, Survey, Categorization, and Comparison of Recent Tour Scheduling Literature, Annals of Operations Research, 127 (2004) 145-175. [4] P.P. Rao, A dynamic programming approach to determine optimal manpower recruitment policies, Journal of the Operational Research Society, 41 (1990) 983-988.

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 106-109

Parallel Computation of Two Dimensional Electric Field Distribution Using PETSC

O. Ceylan¹, Ö. Kalenderli²

¹ Computational Science and Engineering Program, Informatics Institute, Istanbul Technical University, 34469, Maslak, Istanbul, Turkey
² Department of Electrical Engineering, Istanbul Technical University, 34469, Maslak, Istanbul, Turkey

Received 4 August, 2004; accepted in revised form 25 August, 2004
Abstract: Parallel programming is an efficient way of solving large scale problems. In this study the electric field distribution of an example system is approximately computed using the finite difference method with PETSC. Results are given and compared for different grid sizes and processor numbers. With the help of PETSC features, such as parallel preconditioners and Krylov subspace methods, different solution techniques are applied and their performances are compared. Keywords: Electric field, finite difference method, parallel computation, PETSC

Mathematics Subject Classification: 65Y05. PACS: 02.70.Bf

1 Introduction

Parallel computation is an efficient approach when one deals with large scale problems. Electric field problems can be given as an example of such large scale problems. In the numerical computation of these fields, for example, they can be modelled in two dimensional Cartesian coordinates with equal or non-equal sized grids. There are several methods used for the numerical computation of electric fields depending on the solution of the Laplace and Poisson equations. One of these methods is the finite difference method (FDM). By using the finite difference method, which is an approximate method for solving partial differential equations, good results can be obtained if the number of grid points is chosen large enough. The finite difference method is widely used in scientific problems. These include linear and nonlinear, time independent and dependent problems. This method can be applied to problems with different boundary shapes, different kinds of boundary conditions, and for a region containing a number of different materials [1]. Some finite difference applications can be found in [2-6].

author: E-mail: [email protected], Phone: +902122857077, Fax: +90 212 2857073 author: E-mail: [email protected], Phone: +902122856759, Fax: +90 212 2856700


In this study PETSC (the Portable, Extensible Toolkit for Scientific Computation) is used. PETSC is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations. It employs the MPI (Message Passing Interface) standard for all message-passing communication. PETSC software is a freely available tool and provides a rich environment for writing parallel codes. The modules in PETSC are appealing as it integrates a hierarchy of components, including data structures, Krylov solvers and preconditioners, that can be embedded into various application codes [7, 8]. The program was run on a SunFire 12K high-end server. This server gives researchers a shared memory environment. It has 16 900-MHz UltraSPARC III Cu CPUs with 32 GB memory and 16 1200-MHz UltraSPARC III Cu CPUs with 32 GB memory.

2 Electric Field Computation

The steady state distribution of the electric potential V(x, y) on a two dimensional plane can be given by the Laplace equation:

∂²V(x, y)/∂x² + ∂²V(x, y)/∂y² = 0        (1)

Equation (1) can be discretized, on two dimensional Cartesian coordinates with equal sized grids, by using central differences obtained from the Taylor series as follows:

∂²V(x, y)/∂x² ≈ [V(x + h, y) − 2V(x, y) + V(x − h, y)] / h²        (2)

∂²V(x, y)/∂y² ≈ [V(x, y + h) − 2V(x, y) + V(x, y − h)] / h²        (3)

By summing Equation (2) and Equation (3):

V(x + h, y) + V(x − h, y) + V(x, y + h) + V(x, y − h) − 4V(x, y) = 0        (4)

and leaving V(x, y) alone:

V(x, y) = (1/4) [V(x + h, y) + V(x − h, y) + V(x, y + h) + V(x, y − h)]        (5)

By using Equation (5) for all grid points, one can write as many equations as there are grid points for this system. This yields a linear system, which can be solved by numerous numerical methods (direct methods and iterative methods). On the other hand, as shown in Figure 1 for the 2D plane, one grid point has four neighbour points. Hence the system has a sparse structure and the constructed matrix is a five banded matrix.
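The sketch below illustrates this assembly: it builds the five banded matrix implied by the stencil (4)-(5) for an n x n grid of interior nodes and solves the resulting system with a Krylov (conjugate gradient) method. It is a serial scipy stand-in for the parallel PETSC/SLES solution described in the next section; the boundary voltages follow the values shown in Figure 2.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

def solve_laplace(n, v_top=100.0, v_left=80.0, v_right=60.0, v_bottom=40.0):
    """Assemble the five banded system from the 5-point stencil on an n x n grid
    of interior points and solve it with a Krylov method (conjugate gradients)."""
    N = n * n
    main = 4.0 * np.ones(N)
    offs = -1.0 * np.ones(N - 1)
    offs[np.arange(1, N) % n == 0] = 0.0        # no coupling across row boundaries
    far = -1.0 * np.ones(N - n)
    A = sp.diags([far, offs, main, offs, far], [-n, -1, 0, 1, n], format="csr")

    b = np.zeros(N)
    for i in range(n):                           # i: grid row, j: grid column
        for j in range(n):
            k = i * n + j
            if i == 0:      b[k] += v_bottom
            if i == n - 1:  b[k] += v_top
            if j == 0:      b[k] += v_left
            if j == n - 1:  b[k] += v_right
    v, info = cg(A, b)
    return v.reshape(n, n)

V = solve_laplace(50)
```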

3 Numerical Example

An example system from [9] is used for the electric field computation. Boundary conditions are given in Figure 2. While writing the code using PETSC, the petscsles header file must be included, so that SLES (Scalable Linear Equation Solvers) solvers can be used. With the MatCreate command the Laplace matrix A was created and, in a loop, its appropriate entries were filled with contiguous chunks of rows across the processors. Considering the boundary conditions, the vector b was created. The system Ax = b was solved by Krylov solvers with the help of PETSC features.

Figure 1: One point and its neighbour points

Figure 2: 3x3 grid and boundary values for the example system (boundary potentials of 100 V, 80 V, 60 V and 40 V)

Table 1: Wall clock time (s) for different mesh sizes and number of processors

Mesh Size    1 proc     2 proc    4 proc    8 proc    12 proc   16 proc
100 x 100    3.429      1.852     1.757     1.325     1.510     1.565
200 x 200    33.715     20.044    10.918    6.08      4.406     3.860
300 x 300    117.23     89.917    40.05     23.155    15.224    14.104
400 x 400    409.909    212.011   109.869   65.322    42.352    30.131
500 x 500    910.686    460.202   229.922   139.284   101.271   58.748
600 x 600    1616.726   788.18    316.280   255.186   196.386   146.062

Figure 3: Two dimensional electric field distribution for 200x200 mesh size

Figure 4: Speedup vs. mesh size graphic for different numbers of processors

An example result for a 200x200 grid is shown in Figure 3. Wall clock time (s) for the solution of the electric field distribution problem for different sizes of grids and processors is shown in Table 1. The speedup vs. mesh size graphic is plotted in Figure 4.
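The speedup values plotted in Figure 4 can be recomputed directly from Table 1 as S_p = T_1 / T_p; the short sketch below does exactly this with the tabulated times.

```python
import numpy as np

# Wall clock times from Table 1 (rows: mesh sizes 100..600, columns: 1, 2, 4, 8, 12, 16 processors)
times = np.array([
    [3.429,    1.852,   1.757,   1.325,   1.510,   1.565],
    [33.715,   20.044,  10.918,  6.08,    4.406,   3.860],
    [117.23,   89.917,  40.05,   23.155,  15.224,  14.104],
    [409.909,  212.011, 109.869, 65.322,  42.352,  30.131],
    [910.686,  460.202, 229.922, 139.284, 101.271, 58.748],
    [1616.726, 788.18,  316.280, 255.186, 196.386, 146.062],
])
speedup = times[:, [0]] / times      # S_p = T_1 / T_p, one row per mesh size
print(np.round(speedup, 2))
```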

4 Conclusion

In this study we focused on an important engineering problem, the computation of the electric field using finite difference methods. We computed the electric field in parallel by using the freely available tool PETSC. As shown in Figure 4, the speedup increases with the number of processors. From Table 1 it is easily seen that, for a specific number of grid points, the solution time decreases as the processor number increases. For small sized experiments this decrease is not as large as for big sized experiments. When the problem gets bigger in size, the solution by the finite difference method is more accurate, but the computation time and computation cost also increase.

References
[1] P. B. Zhou, Numerical Analysis of Electromagnetic Fields, Springer-Verlag, 1993.
[2] M. V. Schneider, Computation of Impedance and Attenuation of TEM-Lines by Finite Difference Methods, IEEE Transactions on Microwave Theory and Techniques, Vol. 13, pp. 792-800, November, 1965.
[3] P. Basappa, V. Lakdawala, G. Gerdin, Computation of the Electric Field Around a Wet Polluted Insulator by Analytic and Finite Difference Techniques, Conference on Electric Insulation and Dielectric Phenomena, 2000.
[4] E. A. Erdelyi, E. F. Fuchs, Fields in Electric Devices Containing Soft Nonlinear Magnetic Materials, IEEE Transactions on Magnetics, Vol. 4, pp. 1103-1108, 1974.
[5] H. I. Saleheen, K. T. Ng, Parallel Finite Difference Solution of General Inhomogeneous Anisotropic Bio-Electrostatic Problems, IEEE-EMBC and CMBBC, pp. 247-248, 1995.
[6] J. Duncan, The Accuracy of Finite-Difference Solutions of Laplace's Equation, IEEE Transactions on Microwave Theory and Techniques, Vol. 15, No. 10, pp. 575-582, 1967.
[7] S. Balay, K. Buschelman, W. D. Gropp, D. Kaushik, M. Knepley, L. C. McInnes, B. F. Smith, H. Zhang, PETSC users manual, Technical Report ANL-95/11 Revision 2.1.3, Argonne National Laboratory, 2002.
[8] S. Balay, W. D. Gropp, L. C. McInnes, B. F. Smith, PETSC web page, http://www.mcs.anl.gov/petsc.
[9] D. Trybus, Z. Kucerovsky, A. Ieta, T. E. Doyle, Distributed Electric Field Approximation, Proceedings of the 16th Annual International Symposium on High Performance Computing Systems and Applications, 2002.

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 110-114


Hydrogen Bonding In Aqueous Mixtures Containing Aprotic Solvents: A Computer Simulation Study M. Chalaris 1, J. Samios2 Laboratory of Physical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis 15771 Athens, Greece

Abstract: Molecular dynamics simulations have been performed to investigate the thermodynamic and structural properties as well as the self-diffusion coefficients and hydrogen bonding in aqueous (binary) mixtures containing the aprotic solvents acetone, acetonitrile, dimethyl sulfoxide (DMSO) and dimethylformamide (DMF). The self-diffusion coefficients of both components in the mixtures were calculated and compared. The results show a non-linear dependence on composition. In each case, this property also reveals a peculiar minimum at mole fractions in the range x_AS = 0.35 - 0.40 (AS = aprotic solvent). The diffusion coefficients of water-acetone are found to be anomalous, and the diffusion coefficients of water show a different behavior in this mixture compared with that in the other two mixtures. The intermolecular structure of each mixture was investigated and a number of interesting structural effects have been noted. In the case of acetone and acetonitrile water mixtures, there is a loss of tetrahedral water coordination in the systems. In the case of DMSO and DMF water mixtures, the analysis of several site-site pair distribution functions reveals that the average tetrahedral coordination of water is preserved for mole fractions of DMSO or DMF up to 0.4. Finally, the hydrogen bonding statistics are obtained and compared with available experimental results.
Keywords: Statistical Mechanics, MD simulation; aprotic solvents; diffusion coefficients; hydrogen bonding
Mathematics Subject Classification: 82B05, 82D15, 82C70
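Self-diffusion coefficients of the kind reported above are typically extracted from an MD trajectory through the Einstein relation D = lim_{t→∞} MSD(t)/(6t); the paper does not spell out its estimator, so the sketch below is only a generic illustration of that route, applied to a hypothetical unwrapped-coordinate trajectory array.

```python
import numpy as np

def self_diffusion(positions, dt):
    """Rough self-diffusion estimate from an MD trajectory via the Einstein relation.

    positions: array of shape (n_frames, n_molecules, 3), unwrapped coordinates;
    dt: time between stored frames."""
    disp = positions - positions[0]                  # displacement from the first frame
    msd = (disp ** 2).sum(axis=2).mean(axis=1)       # mean-square displacement per frame
    t = np.arange(len(msd)) * dt
    half = len(msd) // 2                             # fit only the long-time, linear part
    slope = np.polyfit(t[half:], msd[half:], 1)[0]
    return slope / 6.0
```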

1. Introduction
Binary mixtures in which water is one of the components are interesting molecular liquid systems due to their importance in many fields of solution chemistry. Following the literature, we can notice that considerable effort has been made so far to clarify the non-ideal behaviour of the physicochemical properties of such systems. In particular, over the last two decades the problem of the microstructure (local order) in aqueous solutions of organic molecules has attracted much experimental and theoretical interest. This intensive scientific interest is due to the fact that many questions regarding the properties of such associated hydrogen-bonded solutions have not yet been definitively answered. Notice that hydrogen bonding is known to be one of the most important weak interactions between molecules, leading to the formation of well-defined molecular aggregates. It should be emphasized that enhanced intermolecular structure is presumably responsible for the non-ideal mixing behaviour in many physicochemical properties of such solutions. Acetonitrile (AN), acetone (Ac), dimethylsulfoxide (DMSO) and dimethylformamide (DMF) belong to a class of aprotic (AS) polar organic solvents which are miscible with water in all proportions. Furthermore, the AS-water mixtures are particularly interesting systems because of their potential applicability in many fields such as organic chemistry, liquid chromatography and solvent extraction. The present work describes the use of the Molecular Dynamics (MD) simulation technique to study the properties of these mixtures at ambient conditions over a wide range of mole fractions.

Figure 3: Total displacements in a first load step (left) and a second load step (right).

In figure 4, the von Mises stress [3] in the ribs is shown as well as the normal stress [3] in a section near the central column. In figure 5, we show the average stress in the cloths, principal stresses S1, S2 and S3, and von Mises stress [2].

Non-linear Analysis and Warping of Tubular Pipe Conveyors

J. J. del Coz Diaz(1), P. J. Garcia Nieto(2), J. A. Vilan Vilan(3), A. Martin Rodriguez(1), F. J. Suarez Dominguez(1), J. R. Prado Tamargo(1) and J. L. Suarez Sierra(1)
(1) Department of Construction Engineering, High Polytechnic School, University of Oviedo, Edificio Departamental No. 7, 33204 Gijon (Spain)
(2) Department of Mathematics, Faculty of Sciences, University of Oviedo, C/ Calvo Sotelo s/n, 33007 Oviedo (Spain)
(3) Department of Mechanical Engineering, High University School of Industrial Engineering, University of Vigo, Campus de Lagoas-Marcosende, 36200 Vigo (Spain)

Abstract: In this paper, the distribution of strains and stresses and the warping effect throughout a tubular pipe conveyor, due to the geometry and local effects, are evaluated by the finite element method (FEM) [1, 5]. The non-linearity is due to the 'contact problem' that governs the phenomenon. Finally, the forces and moments on the different elements of the tubular pipe conveyor are determined, giving rise to the conclusions exposed in this study.
Keywords: Finite element modeling, numerical methods, contact problems
PACS: 74S05, 74M10, 74M15

1. Introduction
The commercial use of the tubular pipe conveyor is relatively recent, and many of the design, engineering and maintenance techniques are common to conventional systems. The tubular pipe conveyor is a natural evolution of the conventional belt conveyor, so that it conserves all the advantages of the latter and eliminates many of its inconveniences, becoming a very attractive option for the design of material transport systems. However, this type of transport has presented important problems in its practical application, such as the warping [5] of the belt in the curved sections. Consequently, a more exhaustive theoretical study of this problem is necessary. In order to address this question, we have developed a numerical model by means of the finite element method (FEM) [2, 7], taking into account the contact behavior with friction between the belt and the rollers.

2. Geometry
Defining the geometry of a tubular pipe conveyor in a finite element analysis program is complex (see figure 1). For this reason, an advanced parametric design language (APDL) was used [1]. Firstly, we build each of the sections that describe the belt of the pipe conveyor. Then we introduce the coordinates of the roller stations and join them to the belt by means of contact elements with friction, obtaining the graphics shown in figure 2. With this procedure a total of 15 cases were modeled, with a diameter of 0.3 m in each section, curves comprised between 6° and 90° (angle α in figure 1) and total belt lengths ranging from 240 to 390 m.


Figure 1: Installation scheme (left) and typical section (right).

Figure 2: Geometry of the finite element model.
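The parametric construction described in Section 2 can be illustrated outside the APDL environment with the following minimal sketch. It is only an illustration under assumed values (section lengths, curvature radius and roller-station pitch are not taken from the 15 modeled cases): it generates centreline coordinates for a straight-curved-straight layout, at which roller stations would then be placed and joined to the belt by contact elements.

import numpy as np

def pipe_conveyor_centreline(l_straight=100.0, radius=300.0, alpha_deg=30.0, pitch=2.0):
    """Centreline points of a straight-curved-straight pipe conveyor path.

    l_straight : length of each straight section [m] (assumed value)
    radius     : curvature radius of the curved section [m] (assumed value)
    alpha_deg  : total angle of the curved section [deg]
    pitch      : spacing between roller stations along the path [m] (assumed value)
    """
    alpha = np.radians(alpha_deg)
    l_curve = radius * alpha                      # arc length of the curved section
    total = 2.0 * l_straight + l_curve
    s = np.arange(0.0, total + pitch, pitch)      # station positions along the path

    pts = []
    for si in s:
        if si <= l_straight:                      # first straight section (along x)
            pts.append((si, 0.0))
        elif si <= l_straight + l_curve:          # circular arc of total angle alpha
            t = (si - l_straight) / radius
            pts.append((l_straight + radius * np.sin(t), radius * (1.0 - np.cos(t))))
        else:                                     # second straight section, rotated by alpha
            d = si - l_straight - l_curve
            x0 = l_straight + radius * np.sin(alpha)
            y0 = radius * (1.0 - np.cos(alpha))
            pts.append((x0 + d * np.cos(alpha), y0 + d * np.sin(alpha)))
    return np.array(pts)

stations = pipe_conveyor_centreline()
print(stations.shape)   # one (x, y) coordinate per roller station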

3. Mathematical model
A particularly difficult nonlinear behavior to analyze is the contact between two or more bodies. Contact problems range from frictionless contact in small displacements to contact with friction in general large-strain inelastic conditions. Although the formulation of the contact conditions is the same in all these cases, the solution of the nonlinear problems can in some analyses be much more difficult than in others. The nonlinearity of the analysis problem is now decided by the contact conditions. The objective is to briefly state the contact conditions in the context of a finite element analysis and to present a general approach for solution. Let us consider N bodies that are in contact at time t. Let $S_c^t$ be the complete area of contact for each body L, L = 1...N; then the principle of virtual work for the N bodies at time t gives [2-3, 7]:

$$\sum_{L=1}^{N}\left\{\int_{V^t}\tau_{ij}\,\delta e_{ij}\,dV^t\right\}=\sum_{L=1}^{N}\left\{\int_{V^t}\delta u_i\,(f_i^B)^t\,dV^t+\int_{S_f^t}\delta u_i^S\,(f_i^S)^t\,dS^t\right\}+\sum_{L=1}^{N}\int_{S_c^t}\delta u_i\,(f_i^c)^t\,dS^t \qquad (1)$$

where the part given in brackets corresponds to the usual terms:


$\tau_{ij}$ = Cartesian components of the Cauchy stress tensor (forces per unit area in the deformed geometry),
$\delta e_{ij}$ = strain tensor corresponding to the virtual displacements,
$\delta u_i$ = components of the virtual displacement vector imposed on the configuration at time t, a function of $x_j^t$, j = 1, 2, 3,
$x_i^t$ = Cartesian coordinates of a material point at time t,
$V^t$ = volume at time t,
$(f_i^B)^t$ = components of the externally applied forces per unit volume at time t,
$(f_i^S)^t$ = components of the externally applied surface tractions per unit surface area at time t,
$S_f^t$ = surface at time t on which the external tractions are applied,
$\delta u_i^S$ = $\delta u_i$ evaluated on the surface $S_f^t$ (the $\delta u_i$ components are zero at, and corresponding to, the prescribed displacements on the surface $S_u^t$).

The last summation sign in equation (1) gives the contribution of the contact forces. The contact force effect is included as a contribution to the externally applied tractions. The components of the contact tractions are denoted as $(f_i^c)^t$ and act over the areas $S_c^t$ (the actual area of contact for each body at time t), and the components of the known externally applied tractions are denoted as $(f_i^S)^t$ and act over the areas $S_f^t$. It is possible to assume that the areas $S_f^t$ are not part of the areas $S_c^t$, although such an assumption is not necessary.

Figure 3: Bodies in contact at time t (left) and contact elements (right).

Figure 3 (left) illustrates schematically the case of two bodies, which are now considered in greater detail. In this paper, the two bodies in contact are denoted as body I and body J. Note that each body is supported such that, without contact, no rigid body motion is possible. Let $t^{IJ}$ be the vector of contact surface tractions on body I due to contact with body J; then $t^{JI}=-t^{IJ}$. Hence, the virtual work due to the contact tractions in (1) can be written as:

$$\int_{S^{IJ}}\delta u_i^{I}\,t_i^{IJ}\,dS^{IJ}+\int_{S^{JI}}\delta u_i^{J}\,t_i^{JI}\,dS^{JI}=\int_{S^{IJ}}\delta u_i^{IJ}\,t_i^{IJ}\,dS^{IJ} \qquad (2)$$

where $\delta u_i^{I}$ and $\delta u_i^{J}$ are the components of the virtual displacements on the contact surfaces of bodies I and J, respectively, and:

$$\delta u_i^{IJ}=\delta u_i^{I}-\delta u_i^{J} \qquad (3)$$


The pair of surfaces $S^{IJ}$ and $S^{JI}$ are termed a 'contact surface pair'; note that these surfaces are not necessarily of equal size. However, the actual area of contact at time t for body I is $S_c^t$ of body I, and for body J it is $S_c^t$ of body J, and in each case this area is part of $S^{IJ}$ and $S^{JI}$. It is convenient to call $S^{IJ}$ the 'contactor surface' and $S^{JI}$ the 'target surface'. Therefore, the right-hand side of (2) can be interpreted as the virtual work that the contact tractions produce over the virtual relative displacements on the contact surface pair. These conditions can now be imposed on the principle of virtual work equation using a Penalty Approach (PA), the Lagrange Multiplier Method (LMM) or the Augmented Lagrangian Method (ALM). This work uses the PA technique, since it proves to be more efficient from a numerical point of view [4, 6].
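To make the penalty idea concrete, a minimal single-degree-of-freedom sketch is given below (illustrative stiffness, load, gap and penalty values; not the formulation actually implemented for the conveyor model): penetration of a rigid obstacle is penalised by a very stiff spring, so the contact condition is enforced approximately within the same Newton-type solve.

def solve_bar_with_penalty(k=1000.0, f=50.0, gap=0.02, k_pen=1.0e7, tol=1.0e-10, max_it=50):
    """Single-DOF elastic bar (stiffness k, load f) contacting a rigid wall at distance gap.

    If the displacement exceeds the gap, a penalty spring of stiffness k_pen
    penalises the penetration g = u - gap (all numerical values are assumptions)."""
    u = 0.0
    for _ in range(max_it):
        g = u - gap                                        # penetration (positive means overlap)
        active = g > 0.0
        K = k + (k_pen if active else 0.0)                 # tangent stiffness
        r = f - k * u - (k_pen * g if active else 0.0)     # residual force
        du = r / K
        u += du
        if abs(du) < tol:
            break
    return u

u = solve_bar_with_penalty()
print(f"displacement = {u:.6f} m")   # slightly beyond the gap, by the residual penetration f_c/k_pen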

4. Analysis of the results
As can be observed in figure 4 (upper and lower), the values of the warping moment in the contact elements nearest to the transition from curved to straight sections increase with the belt conveyor angle α and with the percentage loading. In the cases studied, this rise in value shows a non-linear oscillatory behavior, but is increasing in all cases. In our case, the maximum reaction in the roller station is 1,057.7 N (80% percentage loading), precisely at the transition from the curved to the straight section.

Figure 4: Roller station reactions in the 'curved to straight' transition as a function of the belt conveyor angle α and the percentage loading (50%, upper, and 80%, lower).


5. Summary and conclusions
In summary, the governing equations to be solved for the two-body contact problem are the usual principle of virtual work equation (1), with the effect of the contact tractions included through externally applied (but unknown) forces, plus the constraint equation. The finite element solution of the governing continuum mechanics equations is obtained by using the discretization procedures for the principle of virtual work, and in addition now also discretizing the contact conditions. Thanks to the distribution of the load inside the tubular pipe and the form of contact between material and belt, it is possible to obtain slopes of the tubular pipe conveyor larger than 30°, as opposed to 17°-20° for a conventional belt. This characteristic is very outstanding when the installation is situated in mines or in plants in which it is necessary to achieve important slopes. Besides, it is possible to design paths with curved sections. In this way we can carry out curved sections greater than 45°, so that the number of transfer points required is reduced. Nowadays it is possible to carry out curved sections up to 90°. The finite element method has been shown to be a suitable tool for the modeling and nonlinear analysis of tubular pipe conveyors.

Acknowledgments
The authors express deep gratitude to the Construction Department and the Department of Mathematics at Oviedo University and to the Department of Mechanical Engineering at Vigo University for useful assistance. Helpful comments and discussion are gratefully acknowledged. We thank Swanson Analysis Inc. for the use of the ANSYS University Intermediate program.

References
[1] ANSYS User's Manual: Procedures, Commands and Elements, Vols. I, II and III. Swanson Analysis Systems, 2004.
[2] K. Bathe, Finite Element Procedures, Prentice-Hall, Englewood Cliffs, New Jersey, 1996.
[3] T. Chandrupatla and A. Belegundu, Introduction to Finite Elements in Engineering, Prentice-Hall, Englewood Cliffs, New Jersey, 1991.
[4] J.J. del Coz Diaz, P.J. Garcia Nieto, F. Rodriguez Mazon and F.J. Suarez Dominguez, Design and finite element analysis of a wet cycle cement rotary kiln, Finite Elements in Analysis and Design 39, 17-42 (2002).
[5] E.P. Popov and T.A. Balan, Engineering Mechanics of Solids, Prentice-Hall, New Jersey, 1999.
[6] J.C. Simo and T.A. Laursen, An augmented Lagrangian treatment of contact problems involving friction, Computers and Structures 42, 97-116 (1992).
[7] O.C. Zienkiewicz and R.L. Taylor, The Finite Element Method: Solid and Fluid Mechanics and Non-linearity, McGraw-Hill, London, 1991.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 150-153

A hybrid linking approach for solving the conservation equations with an Adaptive Mesh Refinement method

S. Delage*,1, S. Vincent*, J.P. Caltagirone* and J.P. Heliot+
* Laboratoire TREFLE, University of Sciences Bordeaux I, 16 avenue Pey-Berland, 33607 Pessac, France.
+ CEA-CESTA, BP n° 2, 33114 Le Barp, France.

1 Corresponding author. E-mail: [email protected]

Received 22 July, 2004; accepted in revised form 15 August, 2004

Abstract: Solving the conservation equations for complex incompressible flows on a single grid turns out to be hardly possible. Numerical diffusion of the schemes and phenomena on different scales compel us to use very fine meshes. Therefore, obtaining an accurate solution requires expensive calculation time and computer memory. We aim at evaluating the efficiency of an implicit adaptive mesh refinement method so as to partially circumvent these problems.
Keywords: Conservation equations, Implicit, Adaptive Mesh Refinement (AMR), Interpolation, Connections.

Mathematics Subject Classification: 02.70.-c, 02.60.-x, 02.60.Cb. PACS: 47, 47.11.+j, 47.27.Eq, 47.55.Kf, 02.70.-c, 02.60.-x.

1 Introduction

Understanding the phenomena which occur in incompressible turbulent and multiphase flows requires an accurate solution. On the one hand, we have to capture phenomena on different scales, so using a very fine mesh is essential to catch the small ones, even if they appear only locally in space and time. On the other hand, the use of monotone schemes for advection and diffusion problems prevents us from correctly interpreting results, given that physical diffusion is covered up by numerical diffusion in most real configurations. Thereby, based on the ideas of [1] for compressible fluids, an original One-Cell Local Multigrid method (OCLM) has been developed by [5] to solve the Navier-Stokes equations for two-phase flows. The equations are approximated by Finite Volumes on a MAC grid. This method consists in generating fine grids from a coarser one by means of a gradient criterion, i.e. if a point M on level G_{l-1} verifies the criterion, the control volume around M is refined and level G_l is built (the refined control volume is called an AMR cell). An odd cutting is needed to ensure a conforming connection between the fine grids of level G_l. When a third level G_{l+1} is generated, it is embedded in level G_l, which is itself embedded in level G_{l-1}. The solution on points of level G_l generated by a point from level G_{l-1} is treated as follows: for the limit points, the coarse solution is prolongated on level G_l using a classical Q1 interpolation procedure and, for the interior points, the conservation equations are solved on level G_l. Then, the discrete solution on level G_l is restricted to level G_{l-1} using a direct injection procedure or a Full Weighting Interface Control Volume (FWICV) [2]. However, this method presents some shortcomings. Firstly, level G_l has to be considered as a series of independent AMR cells. In fact, the information does not pass from cell to cell but from level G_{l-1} to level G_l. Hence, the numerous interpolated points make the solution less accurate. Secondly, the classical Q1 interpolation procedure coupled with an implicit solver does not verify the incompressibility condition. Given the fact that a penalty method is used to take the interpolation into account, the flow turns out to be constrained. Here we propose an improvement of this method to address these two problems, namely a linking method between the AMR cells of a single level G_l and an implicit interpolation procedure.
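The flagging step that drives the refinement can be sketched as follows (an illustration only, with an assumed scalar field and threshold, not the OCLM implementation): coarse cells whose local gradient magnitude exceeds the criterion are marked, and the control volume around each marked point would then be subdivided by an odd factor to build the next level.

import numpy as np

def flag_cells_for_refinement(field, dx, threshold):
    """Boolean mask of coarse cells whose gradient magnitude exceeds threshold.

    field is a 2D array of a transported scalar on level G_{l-1};
    the threshold value is an assumption of this sketch."""
    gy, gx = np.gradient(field, dx)
    return np.hypot(gx, gy) > threshold

# Example: a smeared circular interface on a 16x16 coarse grid
n, dx = 16, 1.0 / 16
x, y = np.meshgrid((np.arange(n) + 0.5) * dx, (np.arange(n) + 0.5) * dx)
c = 0.5 * (1.0 - np.tanh((np.hypot(x - 0.5, y - 0.5) - 0.2) / 0.05))

mask = flag_cells_for_refinement(c, dx, threshold=2.0)
print(mask.sum(), "cells flagged; each would be split by an odd ratio (e.g. 3x3) to form level G_l")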

2 Connections

When several adjoining points of level G_{l-1} generate AMR cells on level G_l, there is overlapping. As a matter of fact, the points belonging to an AMR cell limit are overlapped by interior points of another AMR cell, to which they are connected. Hence, these limit points can be solved instead of being interpolated. These connections enable a good transmission of information from cell to cell and preserve the solution accuracy. As a result, level G_l has to be considered as a single block, and not as a series of independent AMR cells, where the block's interior points are solved and the limit ones are interpolated. Figure 2 presents connected (and therefore solved), solved and interpolated points on level G_l. We can notice that the number of interpolated points is small compared with the solved ones on a single level G_l.

Figure 1: Linking procedure on level G_l (connected points, solved points, and connections between the limit point of one AMR cell and the interior of its neighbour).
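The point classification described above can be sketched in one dimension (an illustration under assumed patch index ranges, not the data structure actually used): a limit point of one AMR cell that falls inside the interior of a neighbouring cell is 'connected' and solved, a limit point with no such overlap is interpolated, and the remaining points are interior solved points.

def classify_points(patches):
    """Classify fine-grid points of overlapping 1D AMR patches (assumed index layout).

    'connected'   : limit point of one cell overlapped by the interior of another -> solved
    'interior'    : interior point of some cell -> solved
    'interpolated': limit point of the block, prolongated from the coarse level"""
    limits, interiors = set(), set()
    for a, b in patches:
        limits |= {a, b}
        interiors |= set(range(a + 1, b))
    status = {}
    for p in limits | interiors:
        if p in limits and p in interiors:
            status[p] = 'connected'
        elif p in interiors:
            status[p] = 'interior'
        else:
            status[p] = 'interpolated'
    return status

# Two adjoining AMR cells sharing an overlap region on the fine level
print(classify_points([(0, 6), (4, 10)]))
# points 4 and 6 come out 'connected', 0 and 10 'interpolated', the rest 'interior'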

3 Interpolation and Implicit resolution

An implicit solver can be used to solve a part or the whole of the conservation equations. The discretized equations can then be put in the form $A^{n+1}X^{n+1}=B^{n}$, where $B^{n}$ is the right-hand side at iteration n, $X^{n+1}$ the unknown at iteration n+1 and $A^{n+1}$ the matrix to invert. The rows i of the matrix which correspond to points detected to be interpolated are replaced by the interpolation coefficients. Different classical interpolation procedures, namely Q3, Q2 and Q1, have been tested to prolongate the coarse solution on level G_l for the limit points. We have noted oscillations when working with strong gradients and high-order interpolation such as Q3 or Q2 (Gibbs phenomena), whereas a Q1 interpolation remains monotone. So, a Q1 interpolation will be used for discontinuous solutions, whereas a Q3 interpolation will be implemented for more regular ones.

S. Delage, S. Vincent, J.P. Caltagirone, J.P Heliot

152

Once the matrix is filled, a new matrix $\tilde A^{n+1}$ is obtained with the same dimensions as $A^{n+1}$, and the system to solve becomes $\tilde A^{n+1}X^{n+1}=\tilde B^{n}$ (figure 3). A zero is put in the second member $\tilde B_j$ for each interpolated point j. Hence, the adaptive mesh refinement is treated in an implicit way, which avoids time-lags between solved points and interpolated ones. Thus the flow is not constrained anymore (see also [3]).

Figure 2: Typical shape of $A^{n+1}$ (left) and $\tilde A^{n+1}$ (right).
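A minimal sketch of this idea is given below (an assumed 1D diffusion problem with a Q1-like averaging stencil, not the actual 2D MAC-grid solver of the paper): the rows of the system matrix corresponding to interpolated points are replaced by interpolation coefficients with a zero right-hand side, so that interpolated and solved unknowns are obtained in the same implicit solve.

import numpy as np

def assemble_with_interpolation(n=9, interp_at=(4,)):
    """1D diffusion system A u = b where some rows enforce an interpolation constraint.

    Points listed in interp_at do not receive the discretised equation; their row
    is replaced by u_j - 0.5*(u_{j-1} + u_{j+1}) = 0 (coefficients are assumptions)."""
    A = np.zeros((n, n))
    b = np.ones(n)                       # unit source term on solved points
    for j in range(n):
        if j in (0, n - 1):              # Dirichlet boundaries u = 0
            A[j, j], b[j] = 1.0, 0.0
        elif j in interp_at:             # interpolation row, zero right-hand side
            A[j, j - 1], A[j, j], A[j, j + 1] = -0.5, 1.0, -0.5
            b[j] = 0.0
        else:                            # standard second-order diffusion row
            A[j, j - 1], A[j, j], A[j, j + 1] = -1.0, 2.0, -1.0
    return A, b

A, b = assemble_with_interpolation()
u = np.linalg.solve(A, b)
print(u[3:6])    # u[4] is exactly the average of u[3] and u[5], obtained implicitly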

4 Explicit resolution

The transport or advection equations cannot be solved in an implicit way, because the discretization schemes used in the implicit solver are either diffusive, when they are monotone, or dispersive. So we have no choice but to use explicit TVD or WENO schemes to discretize the advection terms [4].
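For illustration, a minimal explicit TVD update for 1D linear advection with a minmod limiter is sketched below (a generic scheme of this family, not the particular TVD/WENO discretization implemented by the authors; the pulse, grid size and CFL number are assumptions).

import numpy as np

def minmod(a, b):
    """Minmod slope limiter."""
    return np.where(a * b > 0.0, np.sign(a) * np.minimum(np.abs(a), np.abs(b)), 0.0)

def tvd_advect(u, c):
    """One explicit MUSCL/TVD step of 1D linear advection (velocity > 0) at CFL number c, periodic domain."""
    du_l = u - np.roll(u, 1)              # backward differences
    du_r = np.roll(u, -1) - u             # forward differences
    slope = minmod(du_l, du_r)            # limited slope in each cell
    u_face = u + 0.5 * (1.0 - c) * slope  # upwind-biased value at the right face of each cell
    flux = c * u_face
    return u - (flux - np.roll(flux, 1))

# Advect a square pulse once around a periodic domain
n, c = 200, 0.5
u = np.where((np.arange(n) > 40) & (np.arange(n) < 80), 1.0, 0.0)
for _ in range(int(n / c)):
    u = tvd_advect(u, c)
print(u.min(), u.max())   # the solution stays within the initial bounds: no new extrema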

5 Results

The performances of the AMR method are evaluated on the solution of a scalar advection-diffusion equation. A disc of concentration 1, submerged in another fluid, is sheared by a rotating velocity field of intensity 2 rad.s⁻¹. The evolution of the concentration field is calculated either on a classical refined mesh (144 × 144) or on a coarse one (16 × 16) with several AMR levels (here 2), which corresponds to a locally refined mesh (144 × 144). The efficiency of AMR in reducing numerical diffusion is tested first. On the warped disc (figure 3), we can note that the higher the grid level, the weaker the numerical diffusion is. So our AMR method allows us to control the numerical diffusion induced by the discretization scheme. Now, if the diffusion coefficient of 10⁻¹⁰ m².s⁻¹ is raised up to 10⁻⁴ m².s⁻¹, it is shown on the left and centre plots of figure 4 that the AMR solution captures the effect of the molecular diffusion. Concerning the performances of the AMR method, it requires less computer memory than the use of a classical refined mesh (35% to 90% less) (figure 4, right). However, given that it requires expensive calculations in time and has not yet been optimized with respect to time, the rate of profit is about 6% in time. In 3D and for coupled vectorial equations, the benefit will be greater.

6 Conclusion

A new hybrid linking technique dedicated to AMR methods has been proposed. Its efficiency has been demonstrated on scalar equation solving concerning the decrease in numerical diffusion, computer memory and time cost. Vectorial conservation equation solving is being achieved and will be presented in the congress and the full article version.


Figure 3: Numerical diffusion on level G_0 (left), G_1 (center) and G_2 (right) - t = 2 s, timestep 10⁻³ s, diffusion coefficient 10⁻¹⁰ m².s⁻¹.

Figure 4: Effect of the molecular diffusion when 2 AMR levels are considered, in the xz plane (left) and in a z-slice (center) concentration fields; comparison of the memory performances between the AMR procedure and the regular grid one (right) - t = 2 s, timestep 10⁻³ s, diffusion coefficient 10⁻⁴ m².s⁻¹.

References
[1] M.J. Berger and P. Colella, Local adaptive mesh refinement for hyperbolic partial differential equations, J. Comput. Phys., 82, 64-84, 1989.
[2] W. Hackbusch, Multi-Grid Methods and Applications, SCM, vol. 4, Springer, Berlin, 1985.
[3] C. Rome and S. Glockner, An implicit multiblock coupling for the incompressible Navier-Stokes equations. Accepted for publication in Journal for Numerical Methods in Fluids.
[4] S. Vincent and J.P. Caltagirone, Solving incompressible two-phase flows with a coupled TVD interface capturing / local mesh refinement method, in: Godunov Methods: Theory and Applications (E.F. Toro, ed.), Kluwer Academic/Plenum Publishers, New York, 1007-1014, 2001.
[5] S. Vincent and J.P. Caltagirone, One Cell Local Multigrid method for solving unsteady incompressible multiphase flows, J. Comput. Phys., 163, 172-215, 2000.


Lecture Series on Computer and Computational Sciences

Volume 1, 2004, pp. 154-158

Performance Analysis of Branch-and-Bound Skeletons

I. Dorta, C. Leon, C. Rodriguez1
Departamento de Estadistica, I.O. y Computacion, Universidad de La Laguna, E-38271 La Laguna, Tenerife, Spain
http://nereida.deioc.ull.es

Received 2 August, 2004; accepted in revised form 15 August, 2004

Abstract: This article proposes a study of load balancing for Branch-and-Bound algorithms. Concretely, sequential and parallel generic skeletons that implement this algorithmic technique are presented. To accomplish the work, the CALL tool is used. CALL allows the code to be annotated by hand, at special points, with a complexity function. Some preliminary computational results are also presented.
Keywords: Algorithmic Skeletons, Branch-and-Bound Technique, Work Load Balancing
Mathematics Subject Classification: 68R99

1 Introduction

The Branch-and-Bound technique is a general method to solve combinatorial optimization problems. The MaLLBa [2] skeleton provides a set of classes to implement this technique automatically. In this paper, a sequential implementation of the Knapsack Problem is presented, using the MaLLBa pattern. The complexity analysis of an algorithm produces as a result a "complexity function" that provides an approximation of the number of operations to accomplish. The CALL tool allows the C code that implements the algorithm to be instrumented with that complexity expression. The LLAC [1] tool is used to analyze the results obtained from the execution of the experiments with CALL.

This article is organized as follows. In the second section, a sequential implementation of the Knapsack Problem using the skeleton for the Branch-and-Bound technique is analyzed using the CALL tool, and a parallel skeleton is briefly described. In the third section the preliminary computational results are presented.

2 Using the CALL tool to accomplish performance analysis

The algorithm presented in Figure 1 shows the recursive code used to solve the Knapsack Problem. The number of objects for insertion into the knapsack is stored in the variable N, the variables w and p store the weights and the benefits of each of them, while the variable M represents the capacity. Since it is a maximization problem, the initial value -∞ is assigned to the variable that stores the best solution found until that moment, bestSol (lines 1 to 4). Between lines 8 and 17 the bound function (lowerUpper) is defined. We use the same function to calculate the lower and upper bounds. The lower bound is defined as the maximum benefit that can be obtained from a

1 E-mail: {isadorta, cleon, casiano}@ull.es
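A hedged sketch of the kind of bound the text describes is given below: the greedy fractional relaxation commonly used in knapsack Branch-and-Bound (the skeleton's actual lowerUpper routine in Figure 1 may differ; the example data are assumptions).

def fractional_upper_bound(k, capacity, profit, w, p):
    """Upper bound for a knapsack node: the profit accumulated so far plus the best
    that the remaining objects k..N-1 can add, allowing one fractional object.

    w, p are weights and profits assumed sorted by decreasing p/w ratio;
    capacity is the remaining capacity and profit the profit collected so far."""
    bound = profit
    for i in range(k, len(w)):
        if w[i] <= capacity:
            capacity -= w[i]
            bound += p[i]
        else:
            bound += p[i] * capacity / w[i]   # take a fraction of the next object
            break
    return bound

# Example node: objects from index 1 pending, 7 units of capacity left, 10 profit collected
print(fractional_upper_bound(1, 7, 10, w=[2, 3, 4, 5], p=[10, 9, 8, 4]))   # -> 27.0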


/* main files */
number N, M;
number w[MAX], p[MAX];
number bestSol = -INFINITY;
...
#pragma cll code double numvis;
void lowerUpper(number k, number C, number P, number *L, number *U) {
    number i, weig, prof;
    if (C ...

The number of earthquakes F with magnitude M larger than m is given by:

$$\log F(M>m)=a-b\,m \qquad (1)$$

The parameter b presents a wide range of values that depends on the fault. For small earthquakes it was found that b ∈ [0.80, 1.06], while for large ones b ∈ [1.23, 1.54] [3]. This equation is known as the Gutenberg-Richter law. On the other hand, the energy E released during an earthquake is believed to have the following behaviour:

$$\log E(M>m)=c+d\,m \qquad (2)$$

1 Corresponding author. E-mail: gsirak@ee.duth.gr


where d = 1 and d = 3/2 for small and large earthquakes respectively [3]. By combining these equations:

$$F(E>e)\propto E^{\beta} \qquad (3)$$

As a result, relatively simple models can properly capture the essential physics responsible for these behaviours [1]. The use of minimalistic models began when Burridge and Knopoff published results indicating GR-like power-law behaviour from a simple chain of blocks and springs being pulled across a rough surface. In such a model, springs connect blocks representing contiguous sections of a fault to provide a linear elastic coupling [4]. Seismologists became interested in cellular automata as possible analogues of earthquake fault dynamics when Bak and Tang demonstrated that even highly simplified, nearest-neighbour automata produce power-law event size distributions [5]. The sand-pile automaton of Bak et al. [6] was employed to demonstrate the emergence of power-law statistics as the result of the combined action of a large number of simple elements which interact only with nearby cells via a pre-specified interaction rule. Bak et al. [6] termed such behaviour Self-Organised Critical behaviour. Carlson and Langer proposed a 1D dynamical version of the Burridge and Knopoff model (BK model) [7], while Olami, Feder and Christensen introduced a generalized, continuous, nonconservative cellular automaton for modelling earthquakes, known as the OFC earthquake model [8]. Here, a simple cellular automaton model with continuous states and discrete time is described, constituted of cells-charges, which aims to simulate earthquake activity with the use of potentials. The produced simulation results are found to be in good quantitative and qualitative agreement with the Gutenberg-Richter (GR) scaling relation predictions, while numerical results for various cascade (earthquake) sizes and different critical states are presented and found to be in good agreement with other CA earthquake models. It should be mentioned that, in the literature, the BK model is treated as an example of a weakly driven dissipative system with many meta-stable states. In most of these studies a rough, purely velocity-weakening friction law is used. But any phenomenological friction law has to be velocity strengthening for large velocities. In the presented model the use of potentials is able to produce a more realistic friction for small velocities. Additionally, the physics of the model using potential dynamics reveals a more natural structure than that described in the literature [4], [8]. Finally, the parameter of a different neighbourhood, i.e. the use of the Moore neighbourhood instead of the von Neumann neighbourhood, is also examined, resulting in the corresponding measurements.

2. CA model description
The potential-based model for earthquake simulation is a two-dimensional dynamic system constituted of cells-charges. Its aim is the simulation of seismic activity with the use of potentials. It is assumed that the system balances through the action of electrostatic Coulomb forces among the charges, without the existence of any other form of interconnection between them. This kind of force is also responsible for this level being bonded to a rigid but moving plane below. Cells are recomposed by the alteration of this plane's potential. The major characteristic of this model is that the whole study of the seismic activity stands on a potential-based analysis for each cell and not on the electrostatic forces that develop among them. The equivalence of the potential-based study with the force study is ensured under the condition that the system is conservative. Obviously, the fact that only conservative forces exist in our system (Coulomb forces among the charges) satisfies this condition very well. Furthermore, this is confirmed by the results, which show that the system presents self-organized criticality [9]. The advantage of this method is that vector analysis, which would require additional compromises bringing a degradation of the results obtained, is avoided. On the other hand, the manipulation of magnitude-only quantities provides computational simplification as well as an increased level of reliability. As far as the dynamics of the potential-based model is concerned, it is ensured by the existence of simple update rules that take place in discrete steps. In fact, the model is a cellular automaton (CA). Specifically, if at time moment t = 0 the potential $V_{i,j}$ of a cell placed at (i,j) exceeds the threshold value $V_{th}$ of the level below, the balance is disturbed and the cell is moved. The moving cell is transferred to a point of lower energy, hence to a state of lower but nonzero potential. The removal of the cell reorders the values of the potential at its nearest neighbours. The model has been studied taking into consideration four (von Neumann neighbourhood) or eight active neighbours (Moore neighbourhood). In both cases, the moving cells, wherever placed in the lattice, interact with a constant number of neighbours, contributing to the system's homogeneity.


Potential value conversion results in new cells becoming unstable, driving the appearance of the well-known cascade phenomenon, which is the earthquake equivalent of the proposed model [1]. The updated potential values following the local displacement of the (i,j) cell-charge are proven to be given by the relationships below:

$$V_{i\pm1,j}\leftarrow V_{i\pm1,j}+(a\,r)\,V_{i,j}$$
$$V_{i,j\pm1}\leftarrow V_{i,j\pm1}+(a\,r)\,V_{i,j}$$
$$V_{i\pm1,j\pm1}\leftarrow V_{i\pm1,j\pm1}+(a\,r)\,V_{i,j}\quad\text{(Moore)} \qquad (4)$$

k

k

xi,j :S:x,h. =>-~-=> V;,j ~ V,h' wherek >O,constant xi,j x,h.

(5)

It is obvious that the displacement condition implies an additional one for the respective potential value. Consequently, the potential at each place (iJ) should be constantly larger than the value of the lower potential limit, V,h·· The dynamics of the proposed model is defined by the use of the parallel updating of the lattice [1]. All cells are being tested and change their potential values until none of them has a value greater than V,h· As soon as the whole procedure is completed, the earthquake phenomenon, which has been generated by the presence of value V;J>V,h, has been fully simulated. The successive discrete step, which also implies the recurrence of an earthquake process, initiates with the application of the so-called increment method. According to this method, the cell with its value closer to the limit V,h is being found. This value minimizes the difference (V,h-V;,;). which is then being added to the potential value of each cell. Finally, as far as it concerns the boundary conditions the use of closed (zero) boundary conditions has been preferred. This decision relies on the fact that the model should present strong forms of in-homogeneity at the boundaries, since in-homogeneity is an attribute of earthquake faults at the surface.

3. Measurements and simulation results There have been made two kinds of measurements so far. They are defined as follows [3]:



Critical state:

It presents the state of the system after a large number of earthquake simulations, which practically takes place after the successive addition of the quantity (V,h- V;j)min at the value of the potential of each cell-charge.



Cascade (earthquake) size:

It is defined as the total number of cells that participate at a single earthquake procedure, which stands as long as the condition V;J> V,h is true. This magnitude can be treated as a measure of the total energy released during the evolution of the earthquake or even to be interpreted as a measure of the earthquake's magnitude. The results received reveal for the model the validity of the Gutenberg-Richter scaling law. Below follow results for a variety of single parameters, given in forms of contours as far as it concerns critical state measurements or plots regarding measurements that prove the validity of the Gutenberg-Richter law.

188 _ _ _ __ _ _ _ _ _ _ _ _ _ __ __ _ _ _ _ _ _ _ _ /.G. Georgoudas el a/.

Figure 1: (a) Semi-log plot of the cascade size vs. the number of earthquakes for N=64, V_th=2 (von Neumann neighbourhood, 4 active neighbours) and 1000 events. (b) Semi-log plot of the cascade size vs. the number of earthquakes for N=128, V_th=30 (Moore neighbourhood, 8 active neighbours) and 10000 events.

In the above plots, the dependence of the number of seismic procedures with a certain 'magnitude' M, named N(M), on this 'magnitude' M is depicted on a semi-logarithmic scale. It is obvious that these two quantities [log N(M), M] hold a linear relationship with a constant slope -β. Consequently, this is a result-based confirmation of the existence of the fundamental Gutenberg-Richter law in this model (Figure 1). The corresponding results for an increased number of active neighbours (Moore neighbourhood, with eight active neighbours) also display the persistence of this substantial relationship [Figure 1(b)]. It should be noted that the larger the number of interacting neighbours and the lattice size, the more accurate the results are. The drawback of such an approach is that the accumulation of neighbours widens the range of 'magnitudes', without though severely destroying the linear dependence. In any case, this is successfully confronted by increasing the value of the potential limit, V_th. The following contours depict the form of the critical state for different values of the system's parameters. Of major concern are the size of the lattice, the threshold value V_th, and the number of active neighbours.

Figure 2: Critical state contours for N=64 and V_th=2, V_th=4, V_th=8, V_th=10 (from upper left to lower right; von Neumann neighbourhood, 4 active neighbours), and 10000 events.

The figures above disclose the complicated structure of the final state. The main feature is the reduction of the number of cells that reach their maximum value (depicted by bright colours) as the threshold of the potential value V_th increases. A rational explanation is based on the fact that the growth of the value V_th attenuates the probability that a potential value close to the threshold one appears in the active neighbourhood. The shape is very similar to that observed in discrete sand-pile automata, thus proving the existence of self-organised criticality in the model [6]. Similar results are obtained following the application of the Moore neighbourhood with eight active neighbour cells (Figure 3).


Figure 3: Critical state contours for N=64 and V_th=4, V_th=10 (from left to right; Moore neighbourhood, 8 active neighbours), and 10000 events.

4. Conclusions
The use of simple models, amenable to making the best use of today's computing power as well as yielding to analytical treatment, is without a doubt a powerful approach to difficult nonlinear problems. The presented potential-based CA model was constructed in order to simulate earthquake activity in correspondence to the quasi-static two-dimensional version of the BK spring-block model of earthquakes, as well as to the OFC earthquake model. Numerical results for various earthquake sizes and different critical states were presented, and the parameter of a different neighbourhood in the proposed model was also explored. The simulation results were found to be in good quantitative and qualitative agreement with the GR scaling relation predictions. Consequently, complex real systems, like earthquakes, can be studied with simple cellular automata models.

References
[1] E.F. Preston, J.S. Sa Martins, J.B. Rundle, M. Anghel and W. Klein, Models of Earthquake Faults with Long-Range Stress Transfer, IEEE Computing in Science and Engineering, 2, 34-41 (2000).
[2] B. Gutenberg and C.F. Richter, Magnitude and Energy of Earthquakes, Ann. Geophys., 9, 1-15 (1956).
[3] G. Hernandez, Parallel and distributed simulations and visualizations of the Olami-Feder-Christensen earthquake model, Physica A, 313, 301-311 (2002).
[4] R. Burridge and L. Knopoff, Model and theoretical seismicity, Bulletin of the Seismological Society of America, 57(3), 341-371 (1967).
[5] P. Bak and C. Tang, Earthquakes as a Self-organised Critical Phenomenon, J. Geophys. Res., 94(B11), 15635-15637 (1989).
[6] P. Bak, C. Tang and K. Wiesenfeld, Self-organised Criticality: an Explanation of 1/f Noise, Phys. Rev. Lett., 59, 381-384 (1987).
[7] J.M. Carlson and J.S. Langer, Mechanical model of an earthquake fault, Phys. Rev. A, 40, 6470-6484 (1989).
[8] Z. Olami, H.J.S. Feder and K. Christensen, Self-Organized Criticality in a Continuous, Nonconservative Cellular Automaton Modelling Earthquakes, Phys. Rev. Lett., 68(8), 1244-1247 (1992).
[9] T. Hwa and M. Kardar, Dissipative transport in open systems: An investigation of self-organized criticality, Phys. Rev. Lett., 62, 1813-1816 (1989).


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 190-193

On Transport-Chemistry Interaction in Laminar Premixed Methane-Air Flames

D.A. Goussis†,1, G. Skevis† and E. Mastorakos#
† Institute of Chemical Engineering and High Temperature Processes, FORTH/ICE-HT, 26500 Rio, Patras, Greece
# Engineering Department, University of Cambridge, Cambridge CB2 1PZ, UK

1 Corresponding author. E-mail: [email protected]

Received 6 August, 2004; accepted in revised form, 18 August 2004

Abstract: A laminar premixed CH4-Air flame is analyzed with tools from the Computational Singular Perturbation method. It is shown that in the flame zone, transport has no influence in the directions in the phase space along which the fastest chemical time scales act. Diffusion starts becoming important in the directions along which the intermediate chemical time scales act, while along the slowest directions both diffusion and convection dominate. It is also demonstrated that the stoichiometry of the mixture has no significant effect on the dynamics along the fastest chemical directions, influencing the slow ones, along which chemistry interacts with transport. The analysis presented here is based on numerical CSP data. The ease of their interpretation makes CSP a well-suited tool for the study of complex and unexplored chemical mechanisms.

Keywords: Singular Perturbations, Computational Methods, Transport-Chemistry Interaction.

1. Introduction
A thorough understanding of the coupling between chemistry and molecular transport processes is essential for the accurate description of laminar and turbulent flame structures and of limiting phenomena such as ignition and extinction. Traditional flame analysis methods include reaction path and sensitivity analyses [e.g., 1, 2] and asymptotic expansion methods [e.g., 3]. Here, the algorithmic local Computational Singular Perturbation (CSP) analysis [4-5] will be employed for the analysis of laminar flames. CSP has been applied to a wide range of combustion problems [e.g., 6-8] and is based on the realization that the solution of the species equations follows a trajectory which lies on a manifold in the mass fraction space [4-6]. This manifold is created by the equilibration of the "fast" part of chemistry, due to the action of fast chemical time scales acting perpendicular to the manifold; directions along which the effect of the diffusion and convection time scales is negligible [7]. The "slow" part of chemistry and the other physical processes present, such as convection and diffusion, are responsible for moving the solution along the manifold. CSP will be employed here to study the transport-chemistry interaction in laminar premixed CH4-Air flames. The aim of this study is to investigate the extent to which transport couples with chemistry and the physical mechanism by which such a coupling is possible.

2. The CSP Tools
Given a kinetic mechanism consisting of N species, E elements and K elementary reactions (forward and backward reactions counted separately), the species N-dim. system of equations is:

$$\frac{dy}{dt}=L_c(y)+L_d(y)+\frac{1}{\rho}\,W\sum_{k=1}^{K}S_k R^k \qquad (1)$$

where y is the vector of the species mass fractions, $L_c$ and $L_d$ are the convective and diffusive spatial vector differential operators, $\rho$ is the mixture density, W is a diagonal matrix with the species' molecular weights as entries, $S_k$ is the stoichiometric vector of the k-th elementary reaction and $R^k$ is the k-th reaction rate. Equation (1) can be cast as:

$$\frac{dy}{dt}=a_1 h^1+\dots+a_N h^N \qquad (2)$$

where

$$h^i=f^i+b^i\cdot L_c(y)+b^i\cdot L_d(y) \qquad (3)$$
$$f^i=b^i\cdot\frac{1}{\rho}\,W\sum_{k=1}^{K}S_k R^k,\qquad b^i\cdot L_c(y)=\sum_{n=1}^{N}b^i_n L^n_c,\qquad b^i\cdot L_d(y)=\sum_{n=1}^{N}b^i_n L^n_d \qquad (4a\text{-}c)$$

$a_i$ and $b^i$ are the N-dim. CSP basis vectors ($b^i a_k=\delta^i_k$), $h^i$ are the related amplitudes, $b^i_n$ is the n-th component of $b^i$, and $L^n_c$ and $L^n_d$ are the n-th components of $L_c$ and $L_d$ [5, 9]. The signs of $b^i$ and $a_i$ are adjusted so that $f^i>0$ for i = 1, N-E, while $f^i=0$ for i = (N-E+1), N due to the conservation of elements in each elementary reaction. The CSP modes $a_i h^i$ are ordered in eq. (2) according to the magnitude of the chemical time scales they relate to; i.e. the i = 1 term relates to the fastest scale $\tau_{chem,1}$, etc. Here, the CSP basis vectors $a_i$ and $b^i$ are approximated by their leading-order terms: the right and left, respectively, eigenvectors of the Jacobian of the source term in eq. (1) [5, 7]. The chemical time scales are defined as $\tau_{chem,i}=1/|\lambda_i|$, where $\lambda_i$ is the i-th eigenvalue acting along the direction $a_i$.

f -1

- -NJ

-dy = -I W a 1h + ... + aNh dt p

(5)

where iii= W 1ai and jii =phi, so that K elementary reactions, convection and diffusion in eq. (I) can all be replaced by N non-physical reactions, the stoichiometric vector and rate of which are iii and jli (i-th CSP reaction and rate). Inspecting the elements in iii and the major contributors in the expression

for jli , valuable information can be extracted regarding the most important chemical paths, which either are in equilibrium or drive the system.
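In this leading-order approximation the fast/slow decomposition reduces to an eigen-analysis of the chemical Jacobian. The sketch below shows how the basis vectors and the time scales $\tau_{chem,i}=1/|\lambda_i|$ would be obtained; the 3x3 Jacobian is an arbitrary illustrative matrix, not one evaluated from the GRI 3.0 mechanism.

import numpy as np

def csp_leading_order(J):
    """Leading-order CSP quantities from the Jacobian J of the chemical source term:
    right eigenvectors a_i (columns of A), dual left vectors b^i (rows of inv(A))
    and chemical time scales tau_i = 1/|lambda_i|, ordered from fastest to slowest."""
    lam, A = np.linalg.eig(J)
    B = np.linalg.inv(A)                 # B A = I, so the rows of B are the b^i vectors
    order = np.argsort(-np.abs(lam))     # largest |lambda| corresponds to the fastest scale
    lam, A, B = lam[order], A[:, order], B[order, :]
    return lam, A, B, 1.0 / np.abs(lam)

# Illustrative stiff three-variable Jacobian (assumed numbers)
J = np.array([[-1.0e4,  2.0e3,  0.0],
              [ 1.0e4, -2.0e3,  1.0],
              [ 0.0,     1.0,  -0.5]])
lam, A, B, tau = csp_leading_order(J)
print(tau)   # widely separated values indicate fast, exhausted modes and slow, driving ones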

3. Chemistry-Transport Interaction
Steady, freely propagating, one-dimensional, laminar, premixed atmospheric CH4-Air flames were simulated with the RUN-1DL code [10], using the detailed GRI 3.0 mechanism [11], which incorporates 53 species (N=53), 325 reversible reactions (K=650) and 5 elements (E=5). In the flames reported here the inlet mixture temperature was considered at T0 = 300 K, while the exit product temperature was computed as T∞ = 1670 K for φ = 0.6, T∞ = 2231 K for φ = 1.0 and T∞ = 1909 K for φ = 1.5. Since the problem considered here is steady, all CSP rates $\bar h^i$ are equilibrated:

$$\bar h^i=0,\qquad i=1,\dots,N-E \qquad (6a)$$
$$\bar h^i=0,\qquad i=N-E+1,\dots,N \qquad (6b)$$

The N-E eqs. (6a) are projections of the N-dim. system of eqs. (1) on the N-E directions in the phase space along which chemistry is active and is allowed to interact with transport. The E eqs. (6b) govern the conserved scalars, for which by definition chemical activity is absent. The equilibration of each CSP rate $\bar h^i$ is associated with a time scale, the magnitude of which depends on both transport and chemistry. In particular, the time scales characterizing the fastest CSP rates are dominated by the fast components of chemistry only. The time scales of the next CSP rates are influenced by both chemistry and transport, the effect of the latter becoming more important as the time scales become slower, and dominating completely those of the E CSP rates, eqs. (6b). In order to relate the above with the different processes contributing to the cancellations in $\bar h^i=0$, we introduce the quantities:

$$I^i_{src}=\frac{f^i}{D^i},\qquad I^i_{con}=\frac{b^i\cdot L_c(y)}{D^i},\qquad I^i_{dif}=\frac{b^i\cdot L_d(y)}{D^i},\qquad D^i=|f^i|+|b^i\cdot L_c(y)|+|b^i\cdot L_d(y)| \qquad (7a\text{-}c)$$


which, since $|I^i_{src}|+|I^i_{con}|+|I^i_{dif}|=1$, are indicative of the chemical, convective and diffusive activity levels taking place in each CSP rate $\bar h^i$.
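A schematic of how such indices can be evaluated once the CSP vectors and the three contributions of eq. (1) are available is sketched below; the vectors and numbers are illustrative assumptions, not values from the flame computation, and the normalisation simply mirrors the requirement that the absolute values of the three indices sum to one.

import numpy as np

def csp_activity_indices(b_i, source, convection, diffusion):
    """Relative chemical / convective / diffusive contributions to the i-th CSP rate.

    b_i is the i-th left CSP vector; the three terms are the corresponding
    N-dimensional contributions appearing in eq. (1) (a sketch of eqs. (7a-c))."""
    parts = np.array([b_i @ source, b_i @ convection, b_i @ diffusion])
    return parts / np.abs(parts).sum()

# Illustrative 3-species example (assumed numbers)
b1 = np.array([0.9, -0.4, 0.1])
I_src, I_con, I_dif = csp_activity_indices(
    b1,
    source=np.array([5.0, -3.0, 0.2]),
    convection=np.array([0.3, 0.1, -0.05]),
    diffusion=np.array([0.8, -0.2, 0.1]),
)
print(I_src, I_con, I_dif, abs(I_src) + abs(I_con) + abs(I_dif))   # the last value is 1.0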

The quantities $I^i_{src}$, $I^i_{con}$ and $I^i_{dif}$ for all i = 1, N-E = 48 modes that allow transport-chemistry interaction were computed at a point where the temperature attains an average value, $T_a=(T_0+T_\infty)/2$, for the three stoichiometries φ = 0.6 ($T_a$ = 1000 K),

Figure 4: The lattice constants of the magnetic and non-magnetic phases (left) and the total energy (right) as functions of pressure.

4. Conclusion
Density functional calculations have been performed on Fe3Pt under various high pressures, in order to study the pressure effect on the structural, electronic and magnetic properties. In the case of the cubic, magnetic phases, the linear compression rate is 1.39×10⁻³/GPa. As pressure increases, the s band population increases, the p band population decreases, and the d band population increases. The magnetic moment of the Fe atoms decreases while that of the Pt atoms increases. The cell moment increases while the saturated magnetization decreases. Calculations performed on the tetragonal phases show that the cell volumes and magnetizations are smaller than those of the corresponding cubic phases; the total energy is almost the same. Two phases with different cell volume and magnetization could exist. The results on the non-magnetic phase showed a small cell volume and a high total energy, indicating that the ferromagnetic state is the ground state of Fe3Pt. Geometry optimization of the non-magnetic tetragonal phase yields the same values for a and c, thus reducing to the cubic phase. This shows that the cubic-to-tetragonal martensitic transformation could not exist.

Acknowledgments This work was supported by the National Science Fund for Distinguished Young Scholars and the National Natural Science Foundation of China (No. 50325209, 50172006, 50232030).



Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 198-199

Building Financial Time Series Predictions with Neural Genetic System

Serge Hayward1
Department of Finance, Ecole Superieure de Commerce de Dijon, 29, rue Sambin, 21000 Dijon, France

Received 2 August, 2004; accepted in revised form, 18 August 2004

Keywords: Artificial Neural Network; Genetic Algorithm; Performance Surface; Evaluation Criteria; Summary Statistics; Economic Profitability; Stock Trading Strategies

Problems with applications of computational methods in finance are often due to the lack of a common methodology and of statistical foundations for its numerous techniques. At the same time, the relationships between the summary statistics used for evaluating predictions and the profitability of investment decisions based on these predictions are not straightforward in nature. The importance of the latter is particularly evident for applications of an evolutionary/artificial neural network (E/ANN) under supervised learning, where the process of network training is based on a chosen statistical criterion, but where economic performance is generally sought. This paper is a step towards the econometric foundation of computational methods in finance. For our experiment we develop a dual network structure, where the forecasting network feeds into the action network. The model is evolutionary in the sense that it considers a population of networks (individual agents facing identical problems/instances) that generate different solutions, which are assessed and selected on the basis of their fitness. Considering traditional performance measures, such as common errors, (directional) accuracy, correlation (between the desired and the ANN output) and improvement over 'efficient prediction', their relationships with the profitability of stock trading strategies were found to be of a complicated and non-conclusive nature. Only the degree of improvement over 'efficient prediction' shows some robust links with return measures. The experiment establishes that training an ANN with the performance surface optimised with a genetic algorithm (GA) for directional accuracy (DA), discounting least recent values or minimizing the number of large errors generally improves the strategies' profitability. The simulation shows that, among the three optimisations of the performance surface considered, strategies trained on learning the sign of the desired output were generally superior to those trained to reduce the number of large errors or to focus learning on recent values. The results also demonstrate that DA (alone or always) does not guarantee the profitability of a trading strategy trained with that criterion. Addressing the issue of ANN topology dependency, the simulations reveal network settings optimal for financial applications. The optimality of the discovered ANN topologies is explained through their links with ARMA processes, thus presenting the identified structures as nonlinear generalizations of such processes. Examination of the optimal settings demonstrates the weak relationship between statistical and economic criteria. Model discovery and performance surface optimisation with GA demonstrate a profitability improvement with an inconclusive effect on statistical criteria. A trading strategy choice (with regard to the time horizons) is a function of market conditions, which themselves are a function of the strategies used by the agents populating this market.

1 Corresponding author. E-mail: [email protected]

In these settings market


conditions (or the strategies used by the dominant type of traders) determine the optimal memory length. This approach is an effort towards considering the market environment endogenously in financial data mining. To model turmoil in an economic system with frequent shocks, short memory horizons are considered optimal, as older data is not necessarily informative for modelling/forecasting the current/future state. In thinner markets there is a higher likelihood that short-memory-horizon agents take the dominant position, influencing price movements. The dominance of thin markets by a particular type of traders facilitates a better environment for learning, with agent-based methods, the profitable strategies used by the dominant traders. The research demonstrates that the performance surface set-up is a crucial factor in the search for a profitable prediction with an agent-based model. Evaluation criteria used to assess the predictive power of stock trading strategies generated with agent-based models might significantly differ from criteria leading to their profit maximization. The choice of evaluation criteria combining statistical and economic qualities is viewed as essential for an adequate analysis of social dynamics. Fine-tuning the ANN settings is considered to be an important stage in the computational model set-up for improving results and understanding mechanisms. GA is proposed to be used for model discovery, making technical decisions less arbitrary and adding additional explanatory power to the analysis of social dynamics with agent-based methods and tools.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 200-203

Numerical Solutions to the Reheater Panel Overheating of a Practical Power Plant

Boshu He A,1, Meiqian Chen A, Laiyu Zhu B, Jianmin Wang B, Shumin Liu A, Lijuan Fan A, Qiumei Yu A
A School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing 100044, China
B Dagang Power Plant, Tianjin 300272, China

1 Corresponding author. E-mail: [email protected], [email protected]

Received 17 June, 2004; accepted in revised form 12 August, 2004

Abstract: The commercially available CFD package FLUENT was utilized to numerically solve the metal surface overheating issues of the reheater pendants that exist in the full-scale No. 3 boiler of Dagang Power Station, Tianjin, China. Some factors that may affect the velocity and temperature distributions at the final reheater inlet section (final superheater outlet) had been taken into account when the designated coal was burned, such as the quantity and fashion of counter-flow in the operation, the pressure difference in the air box, and the downward inclination of the secondary air injection. The basic conclusion is that some corresponding measures must be taken to rebuild the flow constructions to effectively avoid the boiler reheater pendant metal overheating. To obtain background detail, eight reformation cases were arranged on the main field-influencing factors to diagnose this boiler numerically. Compared to the base case, all of the reformation cases had some emendatory effects on the flow and temperature distributions. The most outstanding among the reformations was Case I, where the secondary air (OFA and the upper secondary air of primary air burner D) was operated in counter-flow with a downward angle, and the pressure difference in the air box was increased. It can then be concluded that Case I can most efficiently modify the velocity and temperature deviations in the overheating regions, to ensure the furnace will operate within stable and safe conditions. Undoubtedly, these conclusions are of value to the other units of this power plant, and to other power plant furnaces throughout China that have similar construction and capacity.

Keywords: FLUENT; numerical retrofitting; tangentially fired furnace; residual swirling; velocity deviation; temperature deviation; secondary air counter-flow
Mathematics Subject Classification: 62P30, 76F70

1. Introduction
The No. 3 boiler of Dagang Power Station is a tangentially fired oil furnace designed and manufactured by FRANCO TOSI INDUSTRIALE S.p.A., Italy. Panel overheating of the reheater pendants appeared after the fuel was switched to pulverized Datong coal in 2001. This overheating of the tube surface of the reheater pendants greatly affects the operation of the boiler, in both safety and economic respects. After retrofitting with the introduction of counter-flow of the over fire air (OFA), the overheating was preliminarily controlled and occurred less frequently. It has been recognized that residual swirling widely exists in tangentially fired pulverized coal furnaces and results in velocity and temperature deviations; these effects become more severe as the capacity of the boiler increases [1-5]. At the same time, this kind of furnace is common throughout China because of the many advantages it possesses, as previously mentioned [3]. The overheating of the tube surface in the No. 3 boiler of Dagang Power Station, Tianjin, China, originates from the velocity and temperature deviations, viewed from the thermo-mechanical aspect. Even though the overheating was greatly decreased by the introduction of OFA counter-flow, it still appeared frequently with the BCD grinder combination in operation, and the thermo-mechanism was not very clear, for example, how to obtain the most reasonable distribution of the velocity and temperature fields for the designated coal under different

1 Corresponding author. E-mail: [email protected], [email protected]


conditions of the flow rates, or when is the optimal time to introduce the counter-flow into the furnace, and in what directions the overheating will shift when the coal is switched away from the designated fuel. These rules could, of course, be established through actual operation tests, but at higher cost and with tremendous risk. The above influences can be simulated economically, faster and more effectively, with modern numerical techniques. Over the last 20 years, numerical techniques have gained the reputation of being effective tools for identifying and solving problems related to pulverized coal combustion [3, 6]. The numerical calculations generally require the application of computational fluid dynamics (CFD), which is not an easy task, especially for three-dimensional engineering modeling. For this reason, the research undertaken herein has been performed using the commercial CFD package from Fluent Inc. [7]. It was employed to numerically retrofit the furnace through 9 cases, including one baseline case (Case A), in order to find a solution to the reheater pendant overheating problem. The results obtained will undoubtedly help this boiler operate in a more economical way, and serve as a reference for other units of this station and other power stations throughout China.

2. Models and numerical methods
The simulation of an operating furnace is complex and exacting, and many mathematical models and simplifications must be introduced. These models include turbulent flow, coal devolatilization, coal combustion, and radiative heat transfer. The CFD software from Fluent was used to numerically retrofit the operation of the full-scale No. 3 boiler of Dagang Power Station. The gas-phase turbulent reacting flow was predicted by solving the steady-state conservation equations for continuity, momentum, pressure, enthalpy, mixture fractions, turbulence energy, and dissipation. The gas and particle phases are coupled through the particle source terms. The gas-phase combustion was modeled using the fast-chemistry mixture fraction approach. The mixture fraction is defined as the local mass fraction of burned and unburned fuel stream elements (C, H, etc.), and its values are then used to compute the individual species molar fractions, density, and temperature with equilibrium chemistry and PDF approaches [6, 7]. Moreover, a group of coal particle injectors was used, enabling the coal particles to be tracked during the solution procedure. The models nested within the software were selected for the simulation work.

3. Retrofitting cases, results and analysis
The overheating of the metal surfaces of the reheater pendants always occurs with the BCD grinder combination in operation, and it is rare with the other grinder combinations such as ABC, ACD, ABD, etc. This indicates that the BCD combination should be analyzed thoroughly. Nine CFD retrofitting cases, labeled A to I, were established for this furnace with the BCD combination. Each is identified by the counter-flow secondary air jets, the pressure difference in the air box, and the axis tilt-angle of the counter-flow jets. Case A is set as the base case, with no counter-flow jets, the designed air-box pressure difference, and horizontal secondary air jets; the details for the other cases can be found in Table 1. CFD-based solutions have been calculated for Cases B to I, where the counter-flow jets are put into operation, and the results are compared with those of base case A. The overheating occurs most frequently in the L2 vertical plane, which is the outlet of the final superheater and the inlet of the final reheater, shown in Figure 1, where a, b and c are the positions partitioning the horizontal flue into 4 equal segments in the vertical plane. The position at which overheating occurs most frequently is around position a. Therefore, the focus of this CFD-based retrofitting analysis is the distribution of the velocities and temperatures at position a.


Table 1: Retrofitting cases and explanations

Case   Counter-flow combination   Air-box pressure difference   Secondary air injection
A      none (base case)           designed                      horizontal
B      OFA                        designed                      horizontal
C      OFA & 4.2                  designed                      horizontal
D      OFA                        designed                      20 degrees downward
E      OFA & 4.2                  designed                      20 degrees downward
F      OFA                        increased                     horizontal
G      OFA & 4.2                  increased                     horizontal
H      OFA                        increased                     20 degrees downward
I      OFA & 4.2                  increased                     20 degrees downward

The velocity magnitudes at position L2-a are compared in Figure 2 for all 9 cases. The x-component of velocity has a profile very similar to the velocity magnitude, since this component is the dominant one at these positions for all cases. From this figure one can observe that the velocity distributions have similar profiles for all of the retrofitting cases, i.e. the velocities near the right wall are higher than those near the left wall at position L2-a, and the relatively low velocity magnitudes occur near the left wall. Among all of the cases, the greatest velocity difference occurs in case A: as indicated by Figure 2, a distinctly high velocity is present near the right side, while a relatively low value is present close to the central and left regions. The differences between the high and low velocities inevitably produce correspondingly high and low heat transfer coefficients, and hence differences in the heat absorbed by the water vapor. Therefore, case A is indeed the most dangerous one, as substantiated by actual operation, and effective measures must be taken to rebuild the velocity fields towards the levels of cases B to I. Compared to case A, the velocity difference between the right and left side walls becomes appreciably smaller in the other cases, where the retrofitting techniques have been put into operation (details in Table 1), and the smallest difference occurs in case I.


Furnace air box with designed pressure difference (1000 Pa): primary air velocity 28 m/s, secondary air velocity 50 m/s. Furnace air box with increased pressure difference (2000 Pa): primary air velocity 28 m/s, secondary air velocity 55 m/s.

Figure 1: Geometric diagram, burner and injection arrangements for the numerical retrofitting. P1: panel superheater 1; P2: panel superheater 2; P3: platen superheater; P4: final superheater; P5: final reheater; P6: intermediate reheater; primary air burners A, B, C and D.


Figure 2: Velocity distributions at line L2-a for the retrofitting cases. Figure 3: Temperature distributions at line L2-a for the retrofitting cases.
The temperature magnitudes at position L2-a are compared in Figure 3 for all 9 cases. It can be concluded that the temperature levels for all of the retrofitting cases show a substantial decrease at this location compared to base case A. The smallest temperature deviation is, fortunately, also found in case I.


Both the velocity and temperature distributions at the L2-a location for all of the numerical cases, shown in Figures 2 and 3, indicate that case I improves those distributions relative to base case A more than any of the other cases. Among all of the numerical cases, the smallest variations in the velocity and temperature distributions are obtained with case I. This means that case I may effectively eliminate the residual swirling that originates from the Concentric Firing System (CFS) of this tangentially fired furnace. Case I has been recommended to, and accepted by, the power station owner for future operation.

4. Conclusions
The numerical simulations, based upon one base case and 8 retrofitting cases for this boiler with the designated coal, indicate:
• Large deviations of velocity and temperature exist between the two side walls near the reheater pendants in base case A, where no retrofitting is applied. Particular and effective measures must be incorporated to rebuild the velocity and temperature fields in order to guarantee that the furnace operates in a steady and safe manner.
• Both the unreasonable velocity field and the temperature profile can be rectified to some extent by introducing counter-flow secondary air; the final effect is influenced by the amount and direction of the counter-flows and by the pressure difference in the air box.
• Among the 8 retrofitting cases, it is case I, with counter-flow jets tilted 20 degrees downward and a doubled air-box pressure difference, that most effectively modifies the fields and minimizes the velocity and temperature deviations close to the reheater pendants. This case can therefore most probably prevent the metal surface of the reheaters from overheating.

Acknowledgments
The authors gratefully acknowledge financial support from the Dagang Power Station Project foundation and the BJTU (Beijing Jiaotong University) paper foundation.

References
[1] B.S. He, On the separated vortices around the panels and the evolution of the swirl flow in tangentially fired furnaces, Ph.D. Dissertation, Xi'an: Xi'an Jiaotong University, 1999 (in Chinese).
[2] B.S. He, S.F. Ding, Y.F. Diao, J.Y. Xu, C.H. Chen, Numerical Study on the Contrary Modes of Air Jets in a Large Utility Furnace, Proc. Chinese Society for Electrical Engineering, 21 (2001), 60-64 (in Chinese).
[3] B.S. He, M.Q. Chen, Q.M. Yu, S.M. Liu, L.J. Fan, S.G. Sun, J.Y. Xu, W.P. Pan, Numerical Study of the Optimum Counter-flow Mode of Air Jets in a Large Utility Furnace, Computers and Fluids, 33 (2004), 1201-1223.
[4] C.G. Yin, S. Caillat, J.L. Harion, B. Baudoin, E. Perez, Investigation of the flow, combustion, heat-transfer and emission from a 609 MW utility tangentially fired pulverized-coal boiler, Fuel, 81 (2002), 997-1006.
[5] C.G. Yin, L. Rosendahl, T.J. Condra, Further study of the gas temperature deviation in large-scale tangentially coal-fired boilers, Fuel, 82 (2003), 1127-1137.
[6] A.M. Eaton, L.D. Smoot, S.C. Hill, C.N. Eatough, Components, formulations, solutions, evaluation, and application of comprehensive combustion models, Prog. Energy Combust. Sci., 25 (1999), 387-436.
[7] FLUENT 6.1 User's Guide, Fluent Inc., 2003.


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 204-206

Direct Determination of the Pair Potential Energy Function from the Extended Law of Corresponding States and Calculation of Thermophysical Properties for CO2-N2
T. Hoseinnejad1 and H. Behnejad
Department of Chemistry, Tehran University, Tehran, Iran
Received 23 June, 2004; accepted in revised form 15 August, 2004
Abstract: The isotropic reduced intermolecular potential energy function for CO2-N2 has been determined

using a direct inversion of the experimental reduced viscosity collision integrals obtained from the corresponding-states correlation. These results were then used in conjunction with the second virial coefficient to calculate the outer branch of the potential well. The results are fitted to obtain a best MSV potential model. The obtained interaction potential function has been used to predict thermophysical data over a wide temperature and composition range, providing reasonable agreement with the experimental data.
Keywords: intermolecular potential energy for CO2-N2, direct inversion method, collision integrals,

corresponding states principle, thermophysical properties

Mathematics Subject Classification: Thermodynamics of mixtures PACS: 80A10

1. Inversion Procedure
The prediction and interpretation of most phenomena involving atoms and molecules depend on knowledge of the intermolecular pair potential function. In the early 1970s most information concerning intermolecular forces was inferred from the study of thermophysical properties, but the functions resulting from this extremely difficult approach do not appear to be unique. The basic purpose of the inversion method is to obtain the potential energy directly from the experimental data instead of fitting the data to a constrained potential model having a few parameters [1]. The direct inversion procedure for the viscosity is based on the idea that, at a given reduced temperature T*, the values of the reduced orientation-averaged viscosity collision integral

(7)

(8)

Taking into consideration the normalization of the NLSE, the criterion for our SSSM at x_max is

(meso-level II). Such a dependence means that the ratio of the number of clusters of one size to the number of clusters of another size depends not only on their dimension but also on the ratio of the sizes. The same power-law relations describe the size distributions of glide lines and strips occurring near the sites where failure centers form, characterizing the turbulent mixing of the crystal lattice (meso-level I), and the size distribution of the "mountain-relief roughness" of the failure-center surface (nanolevel). The physical preconditions for applying percolation models to the description of the metal failure process in the dynamic longevity range were studied. In the loaded state the failure-center density p increases, and on reaching the critical density p_c connectivity originates in the system of failure centers, changing the connectivity of the body, i.e. macro-failure occurs. At the final stage of dynamic failure the process is controlled by concentration criteria, when the failure-center dimension and the average distance between the centers are connected by a definite relation. Critical phenomena are conditioned by the properties of the whole complex of system particles, not by the individual properties of each particle. The foregoing determines the universal properties of metal behavior in the dynamic failure phenomenon. The unified mechanism of the dynamic failure process - the loss of connectivity of the system through clusterization of the failure-center cascade (equal dimensionality of the order parameter, the same universality class) and the equal dimensionality of the space where the process occurs - makes it possible to predict the behavior of unstudied metals under extreme conditions and to "construct" by computer new materials resistant to definite types of exposure. Application of the apparatus of critical phenomena theory and the theory of second-kind phase transitions to the dynamic failure process at its final stage allowed the universal properties of metal behavior in the dynamic failure phenomenon, conditioned by self-organization and instability in dissipative structures, to be determined.
Keywords: apparatus of critical phenomena theory, second-kind phase transitions theory, dissipative

structure, failure centers cascade, processes of dynamic failure PACS: Here must be added the AMS-MOS or PACS Numbers

The paper studies the possibility of applying the apparatus of critical phenomena theory and second-kind phase transition theory to the description of the failure process at its final stage in the dynamic longevity range (t ~ 10^-6 to 10^-10 s). Critical phenomena are cooperative phenomena: they are conditioned by the properties of the whole collection of system particles, not by the individual properties of each particle. The correlation radius characterizes the distance over which the structure elements influence each other and are thus interdependent. For all fractal systems this correlation radius is described by a power law [1].

1 Corresponding author. Head of Department RFNC-VNIIEF. E-mail: uchaev@expd.vniief.ru

As a result of a large volume of calculational-theoretical and experimental studies [2-4] it was shown that the originating dissipative structure - the failure-center cascade - provides the body's resistance to the external action in the dynamic longevity range. The cascade of failure centers is a fractal cluster whose distribution of failure centers by size is determined by the power law N(D) ~ D^-a, where N(D) is the number of failure centers of dimension D and a > 1 (meso-level II). Such a dependence means that the ratio of the number of clusters of one size to the number of clusters of another size depends not on the sizes themselves but on their ratio. The same power relations describe the size distributions of strips and slip bands emerging near the sites of failure-center formation, characterizing the turbulent mixing of the crystal lattice (meso-level I), and the size distribution of the "mountain-relief roughness" of the failure-center surface (nanolevel) [2-4]. In contrast to the theory of temperature phase transitions, where the transition between two phases occurs at a critical temperature, the percolation transition is a geometric phase transition. The percolation threshold separates two phases: in one phase there exist only finite clusters, in the other a single infinite cluster. Consider, for example, the magnetic phase transition. At low temperatures some materials have non-zero spontaneous magnetization; as the temperature increases the spontaneous magnetization continuously decreases and disappears at the critical temperature. In percolation theory the concentration of occupied sites plays the same part as the temperature in temperature phase transitions. The probability that a site belongs to the infinite cluster is the analogue of the order parameter in the theory of temperature phase transitions. Many important cluster characteristics (correlation length, average site number) near the transition are described by power functions with different critical indices (see expressions (1) and (2)) [5]. Thus the infinite cluster strength is described by an expression of the form
P(x) ~ |x - x_c|^a,    (1)
where x_c is the percolation threshold at which the infinite cluster emerges. The average number of finite-cluster sites (the susceptibility analogue) as x - x_c -> 0 behaves as
S(x) ~ |x - x_c|^{-y},    (2)
where y is a critical index [5, 6]. The universal indices [5, 6] do not depend on the lattice type or percolation type but only on the dimensionality of the space of the problem. The results of the studies given in papers [7, 8] show that with increasing lattice scale L the mass M of the percolation cluster (the "mountain-relief" roughness of the fracture surface was considered) grows as M(L) ~ L^D, where D is the fractal dimensionality. Fig. 1a, b presents sample fracture surfaces under the action of high-current beams of relativistic electrons at energy input rates dE/dt ~ 10^5 to 10^11 J/(g·s), in the initial temperature range T_0 ~ 4 K to 0.8 T_melt, and in the longevity range t ~ 10^-6 to 10^-10 s.


Figure 1: View of fracture surface: (a) Cu, Δ = 0.48 mm; (b) Ti, Δ = 0.05 mm; (c) masses of percolation clusters of fracture surfaces as the lattice scale increases for some metals, where Δ is the sample thickness.

Fig. 1c gives the mass of the percolation cluster of the fracture surface as the lattice scale increases for some metals under the action of high-current beams of relativistic electrons.



The results of the processing show that the fractal dimensionality D obtained by processing the fracture surfaces of some metals (see Fig. 2) is close to the value reported in the modern literature [7, 8] and obtained by calculation, D_p = 1.89. The fractal dimensionality (see Fig. 1) of the mass of percolation clusters of the fracture surface has the value D ~ 1.8 as the lattice scale increases for some metals. Numerical simulation of percolation clusters on a 50x50 lattice was performed with different values of x (x > x_c and x < x_c), where x_c is the percolation threshold. The mass M(L) of the clusters, given in Fig. 2c for different values of the lattice scale L, was determined. Fig. 2 presents a percolation cluster on a 50x50 square lattice for the occupation probabilities p = 0.3 (Fig. 2a) and p = 0.593 (Fig. 2b). Fig. 2c presents the "mass" of the percolation cluster on the square lattice for the occupation probabilities 0.3 and 0.593. The fractal dimensionality D takes the value (for x > x_c) D = 1.6 for the occupation probability p = 0.3 and D = 1.79 for the occupation probability p = 0.593. The values of D obtained by processing the computer-generated clusters and the fracture surfaces are close. That is, the results of the studies presented in Figs. 1 and 2 prove that the fracture surface roughness of a number of the studied structural materials under the action of high-current beams of relativistic electrons is a percolation cluster.


Figure 2: Percolation cluster of a 50x50 square lattice for occupation probability (a) p = 0.3 and (b) p = 0.593; (c) growth of the mass of the percolation cluster for the occupation probabilities p = 0.3 and 0.593.
Thus, application of the apparatus of the critical phenomena theory and the theory of second-kind phase transitions to the dynamic failure process at its final stage allowed the universal properties of metal behavior in the dynamic failure phenomenon, conditioned by self-organization and instability in dissipative structures, to be determined. The unified mechanism of the dynamic failure process of a number of the studied metals - the loss of connectivity of the system through clusterization of the failure-center cascade (equal dimensionality of the order parameter, the same universality class) and the equal dimensionality of the space where the process occurs - makes it possible to forecast the behavior of unstudied metals under extreme conditions, as well as to "construct" by computer new materials resistant to certain kinds of exposure. The above defines the universal properties of metal behavior in the dynamic failure phenomenon.
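As a rough illustration of the kind of computation described above, the sketch below (not the authors' code) generates a site-percolation configuration on a 50x50 square lattice, extracts the largest cluster, and estimates its fractal dimensionality D from the scaling of the cluster mass M(L) with the window size L; the use of scipy, the window-growth scheme and all numerical details are assumptions made only for the example.

```python
# Minimal percolation sketch: largest cluster and mass-scaling estimate of D.
import numpy as np
from scipy import ndimage

def largest_cluster(p, size=50, seed=0):
    """Occupy sites with probability p; return a boolean mask of the largest 4-connected cluster."""
    rng = np.random.default_rng(seed)
    occupied = rng.random((size, size)) < p
    labels, n = ndimage.label(occupied)
    if n == 0:
        return np.zeros_like(occupied)
    sizes = ndimage.sum(occupied, labels, index=range(1, n + 1))
    return labels == (np.argmax(sizes) + 1)

def fractal_dimension(cluster):
    """Fit log M(L) vs log L, with M(L) the cluster mass inside an L x L window at the centre of mass."""
    cy, cx = np.argwhere(cluster).mean(axis=0)
    scales = np.arange(4, cluster.shape[0] // 2, 2)
    masses = []
    for L in scales:
        y0, y1 = int(cy - L / 2), int(cy + L / 2)
        x0, x1 = int(cx - L / 2), int(cx + L / 2)
        masses.append(max(int(cluster[max(y0, 0):y1, max(x0, 0):x1].sum()), 1))
    D, _ = np.polyfit(np.log(scales), np.log(masses), 1)
    return D

for p in (0.3, 0.593):   # below and near the site-percolation threshold of the square lattice
    print(f"p = {p}: estimated D ~ {fractal_dimension(largest_cluster(p)):.2f}")
```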

References
[1] M. Shreder, Fractals, Chaos, Power Laws. Izhevsk: RKhD, 2001, 528 p.
[2] E.K. Bon'ushkin, N.I. Zavada, S.A. Novikov, A.Ya. Uchaev, Kinetics of dynamic metal failure in the mode of pulse volume heating-up. Scientific edition, edited by R.I. Il'kaev. Sarov, 1998, 275 p.
[3] R.I. Il'kaev, A.Ya. Uchaev, S.A. Novikov, N.I. Zavada, L.A. Platonova, N.I. Sel'chenkova, Universal metals properties in the dynamic failure phenomenon. Reports of the Academy of Sciences (RAS), 2002, v. 384, No. 3, pp. 328-333.
[4] R.I. Il'kaev, V.T. Punin, A.Ya. Uchaev, S.A. Novikov, E.V. Kosheleva, L.A. Platonova, N.I. Sel'chenkova, N.A. Yukina, Time regularities of the process of dynamic metals failure conditioned by hierarchic properties of dissipative structures - failure centers cascade. Reports of the RAS, 2003, vol. 393, No. 3.
[5] Yu.Yu. Tarasevich, Percolation: theory, applications, algorithms. Moscow: Editorial URSS, 2002, 112 p.
[6] I.M. Sokolov, Dimensionalities and other geometric critical indexes in the percolation theory. UFN, 1986, vol. 150, No. 2, pp. 221-255.
[7] E. Feder, Fractals. Moscow: Mir, 1991, 264 p.
[8] D. Stauffer, Scaling theory of percolation clusters. Phys. Reports, Vol. 54, No. 1, 1-74 (1979).


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 223-225

Theoretical Investigation of Conductance Properties of Molecular Wires
S. Jalili 1,2,* and F. Moradi 1
1 Department of Chemistry, K. N. Toosi University of Technology, P.O. Box 16315-1618, Tehran, Iran
2 Computational Physical Sciences Research Laboratory, Department of Nano-Science, Institute for Studies in Theoretical Physics and Mathematics (IPM), P.O. Box 19395-5531, Tehran, Iran
Received 2 August, 2004; accepted in revised form 25 August, 2004
Abstract: Quantum-mechanical methods, such as Density Functional Theory and ab initio approaches, in

conjunction with the Non-equilibrium Green's Function method, are used to obtain the current-voltage (I-V) characteristic curves of a molecular nano-wire bridging two metallic electrodes. Using quantum chemistry software, the Hamiltonians of the three main parts of the system, i.e. the right lead, the device and the left lead, and the conductance properties of a molecular wire were calculated.
Keywords: Conductance properties, Molecular wire, ab initio, Non-equilibrium Green's Function, Density

Functional Theory.

PACS: 77.63.-b, 73.40.-c, 85.65.+h

A molecular nano-wire normally refers to a system composed of a molecule bridging two electron reservoirs. The emerging field of Molecular Electronics is concerned with constructing information-processing devices by coupling single molecules with electronic functionalities together and connecting the resulting nano-wire to external electrodes. The design of such a system poses several theoretical, computational and experimental challenges. The ability to transport electrons can be theoretically observed in almost all aromatic systems, such as benzene, thiophene, etc. Bithiolate derivatives of these molecules connected to suitable metallic contacts, such as gold leads, appear to be an ideal configuration for studying their conductance properties. Several research groups have studied the phenyl bithiol molecule as a molecular wire bridging two gold contacts; some recent researchers, including ourselves, are interested in studying a thiophene bithiol molecule (TBT) as a molecular wire using computational approaches such as quantum-mechanical methods. In this work we have assumed that the terminal S atoms of the TBT molecule are connected to the triangular hollow sites of the two gold electrodes. Recent experiments have suggested that binding to a hollow site is energetically more favorable, while others suggest this to be true for a single-atom connection; consequently, the geometry of a molecule-metal contact is, as yet, not well understood. The TBT contact geometry and the associated bond lengths of the TBT molecule were determined from Density Functional Theory (DFT) calculations. By dividing the main system into three parts, that is, the left lead, the device (TBT and some of the surface atoms from the leads) and the right lead, we calculated the Hamiltonian of each part independently and then combined them to obtain the total Hamiltonian of the system. The obtained Hamiltonian is added to the self-consistent potential to get the Fock matrix of the whole system. This matrix includes the effects of external fields, the kinetic energy of the electrons, electron-electron interactions, and electron-nuclear interactions. The matrix was calculated directly by a self-consistent procedure using the DFT approach with the Becke-3 exchange and Perdew-Wang 91 correlation functionals and the LANL2DZ basis set, which incorporates relativistic core pseudopotentials. Consequently, we can describe the Green's function of the system according to the Non-equilibrium Green's Function (NEGF) formalism,

* Corresponding author. E-mail address: [email protected]


G = (ES - F)^-1 = [ [ES_DD - F_DD,  τ] ; [τ†,  D] ]^-1.    (1)

The first (device) block of this matrix is the device Green's function G_DD, in which S_DD and F_DD are the overlap and Fock matrices of the device, respectively, and τ D^-1 τ† is the self-energy term, which describes the effect of the contacts on the device. Because many elements of the coupling matrix τ are zero, only the surface term of D^-1, that is the surface Green's function g, is needed, and the reduced device Green's function can be written as

G_DD = [ ES_DD - F_DD - Σ_1 - Σ_2 ]^-1,    (2)

where the self-energies Σ_1 and Σ_2 are non-Hermitian matrices related to the non-zero parts of the corresponding τ and g through

Σ_i = τ_i g_i τ_i†,    i = 1, 2.    (3)

Using the anti-Hermitian components of the self-energies,

Γ_i = i(Σ_i - Σ_i†),    i = 1, 2,    (4)

and the device Green's function, we calculated the transmission function

T(E) = trace[ Γ_1 G Γ_2 G† ].    (5)

In order to obtain the I-V curve of the TBT molecular wire the following equation was used:

I = (2e/h) ∫ dE T(E) [ f_1(E) - f_2(E) ],    (6)

where f_1,2(E) are the Fermi functions with electrochemical potentials μ_1,2 (which are taken equal to each other and to gold's Fermi energy, i.e. -5.31 eV, at equilibrium),

f_1,2(E) = [ 1 + exp( (E - μ_1,2) / k_B T ) ]^-1.    (7)

The obtained I-V curve (Fig. 1) indicates a resonant mechanism for TBT, similar to the results reported for other aromatic rings such as phenyl bithiol. This representation, among others, can be used to support the non-ohmic current-voltage relation in these systems. In this work, other properties such as the DOS, the transmission function, the voltage drop through the TBT molecule, etc. will also be discussed.
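The following sketch (not the authors' DFT-based implementation) illustrates Eqs. (2)-(7) on a toy three-site tight-binding device with wide-band-limit self-energies; the Fock matrix, overlap, broadening, bias window and temperature are illustrative assumptions, with only the gold Fermi energy of -5.31 eV taken from the text.

```python
# Toy NEGF/Landauer sketch: transmission T(E) and current I(V).
import numpy as np

e, h, kB = 1.602e-19, 6.626e-34, 8.617e-5       # C, J*s, eV/K
mu_eq = -5.31                                    # gold Fermi energy (eV), as quoted in the text

F = np.diag([-5.0, -4.5, -5.0]) + np.diag([-2.0, -2.0], 1) + np.diag([-2.0, -2.0], -1)
S = np.eye(3)                                    # orthogonal basis assumed for simplicity
eta = 0.05                                       # lead broadening (eV), wide-band limit

def transmission(E):
    Sigma1 = np.zeros((3, 3), complex); Sigma1[0, 0] = -1j * eta    # left lead couples to site 0
    Sigma2 = np.zeros((3, 3), complex); Sigma2[2, 2] = -1j * eta    # right lead couples to site 2
    G = np.linalg.inv(E * S - F - Sigma1 - Sigma2)                  # Eq. (2)
    Gamma1 = 1j * (Sigma1 - Sigma1.conj().T)                        # Eq. (4)
    Gamma2 = 1j * (Sigma2 - Sigma2.conj().T)
    return np.trace(Gamma1 @ G @ Gamma2 @ G.conj().T).real          # Eq. (5)

def fermi(E, mu, T=300.0):
    return 1.0 / (1.0 + np.exp((E - mu) / (kB * T)))                # Eq. (7)

def current(V, npts=400):
    mu1, mu2 = mu_eq + V / 2, mu_eq - V / 2
    E = np.linspace(mu_eq - 2.0, mu_eq + 2.0, npts)
    integrand = [transmission(x) * (fermi(x, mu1) - fermi(x, mu2)) for x in E]
    return 2 * e / h * np.trapz(integrand, E) * e                   # Eq. (6), result in amperes

for V in (0.5, 1.0, 2.0):
    print(f"V = {V} V  ->  I ~ {current(V):.3e} A")
```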


Figure 1: I-V curve of thiophene bithiol connected to two gold leads.


References
[1] S. Datta, Electronic Transport in Mesoscopic Systems, Cambridge University Press, Cambridge, 1995.
[2] P. Damle, A.W. Ghosh, S. Datta, Chem. Phys. 281 (2002) 171.
[3] M. Di Ventra, N.D. Lang, S.T. Pantelides, Chem. Phys. 281 (2002) 189.
[4] W. Tian, S. Datta, S. Hong, R. Reifenberger, J.I. Henderson, C.P. Kubiak, J. Chem. Phys. 109(7) (1998) 2874.
[5] E. Emberly, G. Kirczenow, Phys. Rev. B 58 (1998) 10911.
[6] Y. Xue, S. Datta, M. Ratner, J. Chem. Phys. 115 (2001) 4292.
[7] C. Majumder, H. Mizuseki, Y. Kawazoe, J. Chem. Phys. 118(21) (2003) 9809.
[8] N.B. Larsen, H. Biebuyck, E. Delamarche, B. Michel, J. Am. Chem. Soc. 119 (1997) 3107.
[9] N. Camillone, C.E.D. Chidsey, G.Y. Liu, G. Scoles, J. Phys. Chem. 98 (1993) 3503.
[10] P. Hay, W. Wadt, J. Chem. Phys. 82 (1985) 270.
[11] P. Hay, W. Wadt, J. Chem. Phys. 82 (1985) 284.
[12] A.D. Becke, J. Chem. Phys. 98 (1993) 5648.
[13] J.P. Perdew, in: P. Ziesche, H. Eschrig (Eds.), Electronic Structure of Solids, Akademie Verlag, Berlin, 1991, pp. 11-20.
[14] L.A. Bumm et al., Science 271 (1996) 1705.
[15] M.A. Reed et al., Science 278 (1997) 252.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 226-229

Information Leakage in a Quantum Computing Prototype System: Stochastic Noise in a Microcavity
L.A.A. Nikolopoulos1 and A. Maras
Department of Communication Sciences and Technology, Faculty of Sciences and Technology, University of Peloponnese, GR-22100 Tripolis, Greece
Received 1 July, 2004; accepted in revised form 15 July, 2004
Abstract: One serious problem in the physical implementation of quantum computing is that of maintaining quantum coherence. Noise (decoherence) of a quantum system, due to its interaction with the surrounding environment, is the main source of error in the execution of a quantum algorithm. In this work we investigate quantum noise using methods from classical and quantum information theory. Decoherence is seen as the noise effect of a channel operation on a single quantum bit (qubit), and its properties are investigated through the evolution of the quantum entropy and the transmission fidelity. One of the fundamental physical systems that appears promising for quantum computation is that of an atom and an electromagnetic (E/M) field in a cavity [1].
Keywords: Quantum information, quantum computation, quantum communication, entanglement, quantum entropy, decoherence, density matrix
PACS: 03.67.Pp, 03.67.Kx, 05.40.-a

1 Introduction
Quantum computation is a rapidly growing field of modern science [2] which has attracted a large number of scientists from areas other than physics, such as mathematics, computing, information and communication theory. A quantum computer is nothing other than a physical system whose evolution can be manipulated in such a way as to perform specific computational tasks. Current technology is concerned with the implementation of practical quantum computation algorithms which, when compared with their classical counterparts (e.g. prime factorization, discrete Fourier transform), achieve an amazing computational performance [3, 4]. One of the main obstacles for quantum computational systems and algorithms is the inevitable imprecision due to the interaction of the underlying quantum system with its environment. Decoherence appears in any quantum computation, resulting in a loss of information when a quantum signal is sent in time and/or space through noisy quantum channels. Theoretical and experimental methods have to be developed for the control of such problems, which either have a purely quantum-mechanical origin or have their analogue in classical information theory.
Noise (decoherence) in a quantum physical system
An important topic of research for an open quantum system is the dynamics induced by the surrounding environment. Noise in such quantum systems, as in the system we are going to consider, can be
1 Corresponding author. E-mail: [email protected]


described by a master equation for the density operator of the system. Assuming that the total Hamiltonian (system + environment) is given by H = H_S + H_E + H_SE, the time evolution of the total density operator is given as the solution of the Liouville equation, where the initial density operator is considered to be in factorized form, ρ(0) = ρ_s(0) ⊗ ρ_e(0). Here H_SE denotes the system-environment interaction and ρ_s, ρ_e the density operators of the system and the environment, respectively. Tracing out the environmental degrees of freedom, together with a Markov and a Born approximation [5], results in the following reduced density matrix equation for the system alone:

∂ρ_s(t)/∂t = -i[H_s, ρ_s] + Σ_{i=a,c} [ L_i ρ_s L_i† - (1/2){ L_i† L_i , ρ_s } ].    (1)

The first term on the right-hand side describes the coherent evolution of the system (noiseless channel), while the second and third terms (i = a, c) describe the noise inserted into the system. The Lindblad operators L_i = √(k_i) σ_i, i = a, c [6], describe the decoherence of the system due to its interaction with the environment. The decoherence time, in a first approximation, is given by the quantity 1/k_i, i = a, c. The operators σ_a, σ_c represent the destruction operators in the Hilbert space of the system.
Noise in quantum computation
Assume that the initial state |ψ_s(0)⟩ of the system is the input (qubit) of some arbitrary quantum computing operation. The system's evolution, in the absence of noise, represents the desired quantum operation (e.g. AND, NOT, XOR, CNOT). The final state of the system |ψ_s(t_f)⟩ represents the value of the qubit after the operation. Noise (decoherence) is inserted because no quantum system is completely isolated from the environment; this affects the final state of the system (the final value of the qubit) in an unpredictable way. Let us now consider the case we are going to examine:
• Evolution of a pure initial state to a final (mixed or not) state. Assume that the initial state of the system |ψ_s(0)⟩ = |ψ_a(0)⟩ ⊗ |ψ_c(0)⟩ is pure and the atom and the cavity field are uncorrelated. As the system evolves according to Eq. (1), in the presence of the environment, it becomes entangled (quantum correlated) in such a way that the state of the system can no longer be written in factorized form (the uncorrelated case). One interesting question is under what conditions, in terms of the input (initial) states of the field and the atom, the degree of entanglement can become maximal.
Noise examined with quantum information concepts
From the quantum information point of view, noise effects may be investigated through the concepts of quantum entropy and the density matrix of the state. Moreover, the fidelity of entanglement, as a measure of the purely quantum (non-classical) correlation between the two parts (qubits) of the system, is considered. The (von Neumann) quantum entropy S of the bipartite system is defined as

S = -Tr[ρ_s log_2 ρ_s] = -Σ_i r_i log_2 r_i,    (2)

with r_i being the eigenvalues of the density matrix of the system. The classical counterpart is the Shannon entropy [7], S_c = -Σ_i p_i log_2 p_i. It has been shown that in general S ≤ S_c. The evolution of the state of the system ρ_s(t) may be modeled as a quantum channel: the initial state of the system ρ_s(0) is the state 'sent' by the source, and the final state ρ_s(t_f) is the information 'received' by the receiver. Environmental noise (quantum decoherence) may again affect the final state of the system (the receiver readout) in an unpredictable way, thus leading to loss of information. Here we are going to examine the evolution of the linear entropy S_l and the fidelity of entanglement F_e, defined by:


Figure 1: Transmission fidelity and linear entropy for two different constant values of the external E/M field. We have used κ = 2 and γ = 0.2 in a.u.

S_l = 1 - Tr_s[ρ_s²(t)],    F_e = ⟨ψ_s(t)|ρ_s(0)|ψ_s(t)⟩ = Tr_s[ρ_s(t) ρ_s(0)].    (3)
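A minimal sketch (not the authors' code) of the measures in Eqs. (2) and (3) for a single qubit follows; the partially decohered density matrix used at "time t" is an invented illustrative example.

```python
# Von Neumann entropy, linear entropy and fidelity of a qubit against its pure initial state.
import numpy as np

def von_neumann_entropy(rho):
    """S = -sum_i r_i log2 r_i over the eigenvalues r_i of rho (Eq. 2)."""
    r = np.linalg.eigvalsh(rho)
    r = r[r > 1e-12]                       # discard numerically zero eigenvalues
    return float(-np.sum(r * np.log2(r)))

def linear_entropy(rho):
    """S_l = 1 - Tr[rho^2] (first part of Eq. 3)."""
    return float(1.0 - np.trace(rho @ rho).real)

def fidelity(rho_t, rho_0):
    """F_e = Tr[rho(t) rho(0)] for a pure initial state rho(0) (second part of Eq. 3)."""
    return float(np.trace(rho_t @ rho_0).real)

rho0 = np.array([[1.0, 0.0], [0.0, 0.0]])          # |0><0| (alpha = 1, beta = 0, as in the text)
rho_t = np.array([[0.9, 0.05], [0.05, 0.1]])       # illustrative, partially decohered state

print("S   =", von_neumann_entropy(rho_t))
print("S_l =", linear_entropy(rho_t))
print("F_e =", fidelity(rho_t, rho0))
```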

Atom and photon in an E/M cavity: quantum computation in a noisy environment

The system is considered to be bipartite, constituted by a typical two-level atomic system (TLS), with Hamiltonian h_a = ω_0 σ_z, inside an E/M cavity modeled by the Hamiltonian h_c = ω_c a†a. Their interaction is given by the operator h_ac = ig(σ_+ a + a† σ_-). In addition the system is driven by an external E/M field of frequency ω_L with Hamiltonian h_l = ε_L(t)(a† e^{iω_L t} + a e^{-iω_L t}). The atom-cavity field Hamiltonian is given by H = h_a + h_c + h_ac + h_l. In the interaction picture and by making the rotating wave approximation, we end up with the following transformed form of the system Hamiltonian:

H = -Δ_a σ_z - Δ_c a†a + ig(σ_+ a + a† σ_-) + ε_L (a† + a),    (4)

with Δ_a = ω_L - ω_0 and Δ_c = ω_L - ω_c the detunings. The environment, for this bipartite system, is the continuum of E/M field modes surrounding the atom and the cavity walls. Noise is inserted through the purely quantum-mechanical phenomenon of spontaneous emission from the excited state of any atomic system: the excited state of the TLS decays into the ground state with a rate γ. Moreover, coupling of the intracavity E/M field with the walls (non-ideal mirrors) of the cavity causes photons to leak out through the cavity mirrors with a rate κ. Noise for the atomic observables is modeled with the Pauli destruction (lowering) operator, while noise for the cavity E/M mode is modeled by the cavity destruction operator, namely L_a = √γ σ_- and L_c = √κ a.

The number N_c is the number of photons for the cavity mode, here kept equal to five, N_c = 5. The characteristic decoherence times are of the order τ_i = 1/k_i, i = a, c. Let us now assume the initial state of the intracavity field to be the vacuum state |0_c⟩, and the initial state of the atomic system to be the ground state |0_a⟩. In other words, if we consider the initial qubit to be of the form ψ_a(0) = α|0_a⟩ + β|1_a⟩, then α = 1, β = 0. The initial state of the bipartite system (atom + intracavity field) is given by ψ_s(0) = |ψ_a(0)⟩ ⊗ |0_c⟩ = |0_a 0_c⟩. The initial entropy of the system is S(0) = 0: we have complete information about both subsystems independently, and the two systems are initially uncorrelated. After sufficient time the system has evolved into its steady state ψ_ss under the influence of the external E/M field of strength ε_L and the noise induced by the environment, characterized by the parameters κ, γ. The evolution is governed by the differential equation for the density matrix elements (Eq. 1). At that


time, the two systems are correlated in a non-classical way and their joint (non-separable) state is called an entangled state. Of crucial importance is the entropy that this entangled state has acquired, since it represents the amount of information that we can extract from that state. In addition, the fidelity of the transmission gives us a quantitative measure of how well the information has been sent from the 'source' to the 'receiver' or, equivalently, from the quantum computation point of view, the probability that no error has occurred during the execution of the quantum logical algorithm (gate). In the present case we assume κ = 2, γ = 0.2 (characterizing the coupling of the noisy environment to the channel). The coupling between the atom and the cavity E/M field has been set to g = 1 (characterizing the transmission channel). We also assume that the detunings are zero (Δ_c = Δ_a = 0), so we have resonant conditions. For the case of a weak external field, ε_L = 0.1 a.u., we see that the linear entropy S_l remains very close to zero while the fidelity remains very close to one. This suggests that a small amount of information has been lost to the environment while, at the same time, the system has become entangled. We have almost complete information both for the joint system and for the subsystems independently; the two systems are almost uncorrelated. When we increase the external field to the value ε_L = 0.5 a.u., the steady-state entropy increases but still stays very small (S_l → 0.029), while the fidelity decreases more rapidly with the field strength (F_e → 0.854). This is a consequence of the fact that, while the information loss for the joint system is small, the available information for the two subsystems separately has decreased: the degree of entanglement (degree of quantum correlation) between the two subsystems has increased. Results for different initial states and other environmental parameters have also been obtained.
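The following sketch (not the authors' code) assembles the driven atom-cavity model described above - the reconstructed rotating-frame Hamiltonian of Eq. (4), Lindblad operators L_a = √γ σ_- and L_c = √κ a, and a cavity truncated at N_c = 5 photons - and integrates Eq. (1) with a plain RK4 stepper; the Hermitian sign convention in the interaction term, the time step and the integration length are assumptions of the example.

```python
# Crude master-equation integration for the driven atom-cavity system.
import numpy as np

Nc = 5                                              # cavity truncation, as in the text
kappa, gamma, g, eps_L = 2.0, 0.2, 1.0, 0.1         # parameter values quoted in the text
Da = Dc = 0.0                                       # resonant conditions

Ia, Ic = np.eye(2), np.eye(Nc + 1)
sm = np.array([[0.0, 1.0], [0.0, 0.0]])             # atomic lowering operator |0><1|
sz = np.diag([-0.5, 0.5])                           # ground state at index 0
a = np.diag(np.sqrt(np.arange(1, Nc + 1)), 1)       # cavity annihilation operator

A = np.kron(Ia, a)                                  # operators on the joint atom-cavity space
Sm = np.kron(sm, Ic)
# Rotating-frame Hamiltonian of Eq. (4); the relative minus sign in the interaction
# is the Hermitian convention assumed here.
H = (-Da * np.kron(sz, Ic) - Dc * A.conj().T @ A
     + 1j * g * (Sm.conj().T @ A - A.conj().T @ Sm)
     + eps_L * (A + A.conj().T))
L_ops = [np.sqrt(gamma) * Sm, np.sqrt(kappa) * A]   # Lindblad operators L_a, L_c

def drho(rho):
    """Right-hand side of the master equation, Eq. (1)."""
    out = -1j * (H @ rho - rho @ H)
    for L in L_ops:
        out += L @ rho @ L.conj().T - 0.5 * (L.conj().T @ L @ rho + rho @ L.conj().T @ L)
    return out

dim = 2 * (Nc + 1)
rho = np.zeros((dim, dim), complex)
rho[0, 0] = 1.0                                     # initial state |0_a 0_c><0_a 0_c|
rho0, dt = rho.copy(), 0.005
for _ in range(4000):                               # crude RK4 march towards the steady state
    k1 = drho(rho); k2 = drho(rho + 0.5 * dt * k1)
    k3 = drho(rho + 0.5 * dt * k2); k4 = drho(rho + dt * k3)
    rho = rho + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

S_l = 1.0 - np.trace(rho @ rho).real                # linear entropy, Eq. (3)
F_e = np.trace(rho @ rho0).real                     # fidelity against the initial state
print(f"S_l ~ {S_l:.3f},  F_e ~ {F_e:.3f}")
```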

References
[1] B.B. Blinov, D.L. Moehring, L.-M. Duan, and C. Monroe, Observation of entanglement between a single trapped atom and a single photon. Nature 428 (2004), 153-157.
[2] C. Monroe, Quantum information processing with atoms and photons. Nature 416 (2002), 238-246.
[3] P. Shor, Scheme for reducing decoherence in quantum computer memory. Phys. Rev. A 52:R2493, 1995.
[4] Lov K. Grover, Quantum mechanics helps in searching for a needle in a haystack. Phys. Rev. Lett. 78:4709-4712, 1997.
[5] N.G. Van Kampen, Stochastic Processes in Physics and Chemistry. Elsevier Science B.V., Amsterdam, 1992.
[6] G. Lindblad, On the generators of quantum dynamical semigroups. Commun. Math. Phys. 48:119-130, 1976.
[7] C.E. Shannon, A mathematical theory of communication. Bell Syst. Tech. J. 27:379-423, 623-656, 1948.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 230-233

Adaptive Stiff Solvers at Low Accuracy and Complexity Alessandra Jannelli and Riccardo Fazio 1 Department of Mathematics, University of Messina Salita Sperone 31, 98166 Messina, Italy Received 12 July, 2004; accepted in revised form 18 August, 2004 Abstract: We consider here adaptive stiff solvers at low accuracy and complexity for systems of ordinary differential equations. For the adaptive algorithm we propose to use a novel monitor function given by the comparison between a measure of the local variability of the solution times the used step size and the order of magnitude of the solution. The considered stiff solvers are: a special second order Rosenbrock method with low complexity, and the classical BDF method of the same order. We use a reduced model for the production of ozone in the lower troposphere as a test problem for the proposed adaptive strategy. Keywords: Stiff ordinary differential equations, linearly-implicit and implicit numerical methods, adaptive step size. Mathematics Subject Classification: 65L05, 65L07.

1 Introduction.

The main concern of this work is to study the most promising adaptive solvers at low accuracy and complexity that can be used for the numerical integration of stiff systems of ordinary differential equations (ODEs) written here, without loss of generality, in autonomous form:

dc/dt = R(c)    (1)

where c ∈ R^n and R(c): R^n → R^n. Adaptive solvers can be used to automatically adjust the step size according to user-specified criteria. Accepted strategies for variable step-size selection are based mainly on monitoring the local truncation error: Milne's device in the implementation of predictor-corrector methods; embedded Runge-Kutta methods as developed by Sarafyan [4], Fehlberg [3], Verner [5] and Dormand and Prince [2]; and Richardson local extrapolation proposed by Bulirsch and Stoer [1]. The simple and inexpensive approach to adaptive step-size selection proposed here is to require that the change in the solution be monitored in order to define a suitable local step size. This results in a simpler algorithm than the classical ones.

2 A simple adaptive step-size strategy.

In this section, we present a simple adaptive procedure for determining the local integration step size according to user-specified criteria. First of all, the user has to choose bounds on the
1 Corresponding author. E-mail: [email protected] or [email protected]


allowable tolerance, say 0 < η_min < η_max, for a monitor function η_n to be defined below. Moreover, a range for the permissible step size is also required from the user (Δt_min ≤ Δt_n ≤ Δt_max). Large enough tolerance intervals for Δt_n and η_n should be used, otherwise the adaptive procedure might get caught in a loop, trying repeatedly to modify the step size at the same point in order to meet bounds that are too restrictive for the given problem. However, in general, the step size should not be too small, because the number of steps will then be large, leading to increased round-off error and computational inefficiency. On the other hand, the step size should not be too large either, because the truncation error will be large in that case. We consider first the simple scalar case. Given a step size Δt_n and an initial value c_n at time t_n, the method computes an approximation c_{n+1} at time t_{n+1} = t_n + Δt_n, so that we can define the monitoring function

η_n = |c_{n+1} - c_n| / (|c_n| + ε_M),

where ε_M > 0 is of the order of the machine precision. Now, we can require that the step size be modified as needed in order to keep η_n between the tolerance bounds. The basic guidelines for setting the step size are given by the following algorithm:
1. Given a step size Δt_n and an initial value c_n at time t_n, the method computes a value c_{n+1}

and, consequently, the monitoring function η_n.

2. If η_min ≤ η_n ≤ η_max, then t_n is replaced by t_n + Δt_n; the step size Δt_n is not changed and the next step is taken by repeating Step 1 with initial value c_n replaced by c_{n+1}.
3. If η_n < η_min, then t_n is replaced by t_n + Δt_n and Δt_n is replaced by 10 Δt_n; the next integration step, subject to the check at Step 5, is taken by repeating Step 1 with initial value c_n replaced by c_{n+1}.
4. If η_n > η_max, then t_n remains unchanged; Δt_n is replaced by Δt_n/2 and the next integration step, subject to the check at Step 5, is taken by repeating Step 1 with the same initial value c_n.

5. If Δt_min ≤ Δt_n ≤ Δt_max, return to Step 1; otherwise Δt_n is replaced by Δt_max if Δt_n > Δt_max and by Δt_min if Δt_n < Δt_min, then proceed with Step 1.

Recall the definition of the monitoring function η_n:

η_n = |c_{n+1} - c_n| / (|c_n| + ε_M) = (|c_{n+1} - c_n| / Δt_n) · Δt_n / (|c_n| + ε_M) ≈ |dc/dt(t_n)| · Δt_n / (|c_n| + ε_M),

and note that this can be considered a measure of the suitability of the step size used. In fact, we can consider |dc/dt(t_n)| a measure of the increase or decrease of the solution, Δt_n the grid resolution, and |c_n| + ε_M the order of magnitude of the solution, so that in the above formula the product of the derivative and the grid resolution is compared with the order of magnitude of the solution. When the numerical solution increases or decreases too much, that is, when the monitor function exceeds the upper bound, our algorithm chooses to reduce the time step. On the other hand, if the solution varies slowly with respect to the grid resolution relative to the order of magnitude of the solution, then: if the value of η_n is within the chosen range, the step size is unchanged; otherwise the step size is magnified by one order of magnitude. It is evident that this adaptive approach can be used with any single method. In the vector case, a norm has to be considered instead of the absolute values, for instance the two-norm ||·||_2 or the infinity norm ||·||_∞.



Figure 1: Results obtained by ROS2 (left) and BDF2 (right). Top: c_1; middle-up: c_3; middle-down: step-size selection; bottom: relative error.

3 A test case: ozone production in the lower troposphere.

Ozone is an unsafe gas for human beings and animals even during short-term exposures, and it can damage crops when its levels are too high over long periods. Ozone is formed in many different reactions. A reduced model used for the production of ozone in the lower troposphere is given below; see [6]:

c1' = μ1 c3 - μ2 c1
c2' = μ1 c3 - μ3 c2 c4 + s2
c3' = μ3 c2 c4 - μ1 c3
c4' = μ2 c1 - μ3 c2 c4    (2)


with the initial conditions c(0) = [0, 1.3·10^8, 5·10^11, 8·10^11]^T.

Here ' denotes the derivative with respect to the independent variable; the concentrations c_i, i = 1, 2, 3, 4, are given in molecules per cm^3 and the time in seconds. The involved parameters are given by

μ1 = 10^-40 during the night and μ1 = 10^-5 e^{7 sec(t)} during the day, where sec(t) = ( sin( (π/16)(t_h - 4) ) )^{0.2},

μ2 = 10^5,   μ3 = 10^-16,   s2 = 10^6,

t_h = t/3600,   t_h = t_h - 24 ⌊t_h/24⌋,

where ⌊·⌋ stands for the floor function. Figure 1 shows two components of the numerical solution, the step-size selection and the monitor function obtained with the low-complexity second-order Rosenbrock (ROS2) method and the BDF2 (Backward Differentiation Formulas) method of the same order. We set η_max = 10^-3 and η_min = η_max/10, Δt_min = 1 and Δt_max = 1000. The ROS2 method calculates the solution in 0.7 seconds after 1983 steps, using a maximum time step equal to 1000. The BDF2 method obtains the numerical solution in 18 seconds with 2429 steps and a maximum time step equal to 390.
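A sketch of the right-hand side of system (2) with the diurnal rate μ1(t) is given below (not the authors' code); the daytime window 4 < t_h < 20, the species labels in the comments and the placement of the emission s2 in the c_2 equation follow the reconstruction above and are assumptions where the source is ambiguous.

```python
# Reduced ozone model: right-hand side and diurnal photolysis rate.
import numpy as np

mu2, mu3, s2 = 1.0e5, 1.0e-16, 1.0e6

def mu1(t):
    """Diurnal photolysis rate; daytime assumed to be 4 h < t_h < 20 h."""
    th = (t / 3600.0) % 24.0
    if 4.0 < th < 20.0:
        return 1.0e-5 * np.exp(7.0 * np.sin(np.pi * (th - 4.0) / 16.0) ** 0.2)
    return 1.0e-40

def R(c, t):
    """Right-hand side of system (2); note the model is non-autonomous through mu1(t)."""
    c1, c2, c3, c4 = c
    k1 = mu1(t)
    return np.array([
        k1 * c3 - mu2 * c1,               # c1'  (assumed: atomic oxygen)
        k1 * c3 - mu3 * c2 * c4 + s2,     # c2'  (assumed: NO, with emission s2)
        mu3 * c2 * c4 - k1 * c3,          # c3'  (assumed: NO2)
        mu2 * c1 - mu3 * c2 * c4,         # c4'  (assumed: O3)
    ])

c0 = np.array([0.0, 1.3e8, 5.0e11, 8.0e11])
print(R(c0, 0.0))   # right-hand side at the initial time (night)
```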

Acknowledgment Work supported by a grant from the Messina University and partially by the Italian "MIUR".

References
[1] R. Bulirsch and J. Stoer, Numerical treatment of ordinary differential equations by extrapolation methods, Num. Math. 8, 1-13 (1966).
[2] J.R. Dormand and P.J. Prince, A family of embedded Runge-Kutta formulae, J. Comp. Appl. Math. 6, 19-26 (1980).
[3] E. Fehlberg, Classical fifth-, sixth-, seventh- and eighth order formulas with step size control, Computing 4, 93-106 (1969).
[4] D. Sarafyan, Error estimation for Runge-Kutta methods through pseudoiterative formulas, Techn. Rep. No. 14, Louisiana State Univ., New Orleans, 1966.
[5] J.M. Verner, Explicit Runge-Kutta methods with estimates of the local truncation error, SIAM J. Num. Anal. 15 (1978), 772-790.
[6] J.G. Verwer, W.H. Hundsdorfer and J.G. Blom, Numerical time integration for air pollution models, Sur. Math. Ind. 2, 107-174 (2002).


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 234-238


Refining Existing Numerical Integration Methods P. Johnson, K. Busawon and S. Danaher School of Engineering and Technology, Northumbria University, Newcastle-Upon-Tyne, NE1 8ST, U.K. Received 12 July, 2004; accepted in revised form 13 August, 2004 Abstract: In this paper we present a refinement to the single-step numerical methods of solving ordinary differential equations (ODEs). We propose an adaptive algorithm for generating the step length between successive numerical approximations to the solution of an ODE. The algorithm is designed such that more accurate approximations to the solution are made when the solution gradient is high, whilst accuracy is dropped in favour of economising computation effort when the gradient is low. It is demonstrated how this adaptive algorithm can be incorporated into the forward Euler method as well as all other single-step methods. An example is studied and simulations carried out comparing the single-step Euler, Heun and Runge-Kutta methods with their adaptive method counterparts. Analysis of the simulations is carried out for each case. Through the simulations and analysis it is demonstrated that the proposed algorithm does indeed refine the existing single-step methods. Keywords: numerical integration, adaptive step, differential equation Mathematics Subject Classification: 74H15, 65D30

1 Introduction

Consider the ODE

dx(t)/dt = f(x(t), t);   x(t_0) = x_0    (1)

where x ∈ R. We assume that the relevant hypotheses for the existence and uniqueness of the solution of (1) are satisfied. In addition, there exists a constant M > 0 such that |f(x(t), t)| ≤ M for all x ∈ R and t ≥ 0. Throughout the paper, the set of real numbers is denoted by R and the set of natural numbers by N. We are concerned here with finding the numerical solution of (1) over a finite time interval [a, b]. The simplest of all single-step numerical integration methods is the forward Euler method. For this reason we will examine the problem associated with single-step numerical integration techniques using the forward Euler method as an example. First of all, recall that the forward Euler method consists of generating two sequences:

t_{n+1} = t_n + T;   n = 0, 1, 2, ..., N    (2)


x_{n+1} = x_n + T f(x_n, t_n)    (3)

where N ∈ N, t_0 = a, t_N = b, T = (b - a)/N > 0 is the constant integration step and x_n is an approximation of x(t_n). Note that equation (2) defines a monotone increasing arithmetic sequence {t_n}_{n=0}^N, which we call the integration sequence, and hence gives rise to uniformly spaced integration instances. It is clear that with a small T one may make good approximations to a solution of (1), though at the expense of computational effort, whereas a large T will economise effort at the expense of accuracy. Hence, one is forced to make a trade-off between accuracy and computation if a solution demands accuracy and economy in different sub-intervals of [a, b]. We show, by the use of a variable integration step, that the error between the numerical and the analytical solution can be significantly reduced without the extra computational effort that the single-step integration methods would require.

2 An adaptive numerical algorithm

We now present a refinement of the forward Euler method. Again, for reasons of ease of illustration, Euler's method is examined as it is the simplest of all single-step methods. Consider the ODE described by (1). As before, we are concerned with finding the numerical solution of (1) over a finite interval [a, b]. We propose the following numerical integration algorithm defined by the sequences:

$$t_{n+1} = t_n + \frac{T_0}{\gamma f^2(x_n, t_n) + 1}; \qquad n = 0, 1, 2, \ldots, p \qquad\qquad (4)$$
$$x_{n+1} = x_n + (t_{n+1} - t_n)\, f(x_n, t_n) \qquad\qquad (5)$$

where p ∈ N, t_0 = a, t_p ∈ [b, b + T_0[ and both T_0 > 0 and γ ≥ 0 are real constants. First of all, note that, by comparing (2)-(3) with (4)-(5), if T_0 = T and γ = 0 or f(x_n, t_n) = 0, then the Euler algorithm and the above proposed algorithm are identical. Otherwise, the integration step
$$\beta_n = \frac{T_0}{\gamma f^2(x_n, t_n) + 1}$$
is inversely proportional to the approximation of the square of the gradient ẋ = f(x(t), t) of the solution x(t). Consequently, the integration steps will be relatively small during regions where x(t) contains rapid changes. As a result, one can ensure that more accurate approximations of x(t) will be obtained in intervals of x(t) that exhibit rapid change. On the other hand, if approximations to x(t) are being made in an interval of x(t) which exhibits little or slow change (i.e. f(x(t), t) ≈ 0), then β_n will increase. Consequently, the integration steps will grow and relatively few computations will be made and stored in such a region. It is clear then that this method is adaptive in nature, rather than single-step, due to its variable integration step size. The constant γ is introduced to add a weight factor to the gradient. In other words, it allows β_n to have a tunable gradient sensitivity. Furthermore, since |f(x(t), t)| ≤ M, we can see that T_0/(γM² + 1) ≤ β_n ≤ T_0. Also, since β_n > 0, the sequence {t_n}_{n=0}^p is monotone increasing and hence divergent. This means that {t_n}_{n=0}^p is indeed a valid integration sequence. It is important to realise that the divergence property of {t_n}_{n=0}^p is necessary so as to ensure that t_n covers the whole interval [a, b].
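As an illustration of equations (4)-(5), the following Python sketch implements the adaptive-step forward Euler update; the function name and parameter defaults are illustrative additions, not part of the original paper.

```python
import numpy as np

def adaptive_euler(f, x0, a, b, T0=0.1, gamma=1.0):
    """Adaptive-step forward Euler with beta_n = T0 / (gamma*f^2 + 1)."""
    ts, xs = [a], [x0]
    t, x = a, x0
    while t < b:
        g = f(x, t)                        # gradient approximation f(x_n, t_n)
        beta = T0 / (gamma * g**2 + 1.0)   # equation (4): small step where |f| is large
        t = t + beta
        x = x + beta * g                   # equation (5)
        ts.append(t)
        xs.append(x)
    return np.array(ts), np.array(xs)

# Example from the paper: dx/dt = -t*x, x(0) = 1, exact solution exp(-t^2/2)
ts, xs = adaptive_euler(lambda x, t: -t * x, x0=1.0, a=0.0, b=4.0)
print(len(ts), np.max(np.abs(xs - np.exp(-ts**2 / 2.0))))
```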

3 Comparison with Euler method

The adaptive numerical integration method that is the extension to the Euler method outlined above is compared with the forward Euler method for a particular function f(x(t), t) in (1) over a fixed interval [t_0, t_p] = [t_0, t_N]. In the example we studied two cases (more cases are studied in the full paper): Case 1, T = T_0 = 0.10 and γ = 1; Case 2, T = T_0 = 0.05 and γ = 1. To compare the two techniques objectively we adopt two different approaches. Firstly we use the square error, computed by (Δx)_n² = (x(t_n) − x_n)², over the interval [t_0, t_p] to measure the relative mean square error Λ_{[t_0,t_p]} = Γ_adaptive / Γ_euler, where Γ denotes the mean square error for a particular

method. In the second approach we define a variable S_{[t_0,t_p]}, referred to as the S factor, which represents the computation effort with respect to the error over the time interval [t_0, t_p], and is defined as follows:

where
$$R_q = \frac{1}{P_q + 1} \sum_{n=0}^{P_q} (\Delta x)_n^2, \qquad q = 1, 2,$$
is the mean square error per point for the adaptive numerical solution in case q and P_q + 1 is the number of points in the solution for case q. Also defined is the mean square error for the adaptive numerical solution in case q. It can be shown that if S_{[t_0,t_p]} > 1, then we may say that the adaptive method has improved upon the method of Euler. The improvement being that the number of points required to reduce the error between the numerical and real solutions by a given factor using the adaptive method is less than that which would be required by the Euler method. The following example was used in the comparison:

$$\frac{dx(t)}{dt} = -t\,x; \qquad t_0 = 0; \qquad x(0) = 1$$
which has the solution x(t) = e^{−t²/2}, and we set t_p = t_N = 4. The results are summarised in the table and figure below.

The results of the Euler comparison:

Example   T_0    γ    Λ_{[t_0,4]}    S_{[t_0,4]}
Case 1    0.1    1    0.9502
Case 2    0.05   1    0.9574         2.116

The above data illustrates several points. In both cases the mean square error for the adaptive method never exceeds that of the Euler method, i.e. Λ_{[t_0,4]} < 1. Also, one can see that for case 2, S_{[t_0,4]} > 1, which demonstrates the low cost at which this additional accuracy comes.

4 Further extensions

It has been shown in the above how one can refine/extend the Euler method for numerically solving (1). The extension is to replace the static integration step size in the Euler algorithm with an adaptive step size. It stands to reason that the same extension may be made to, not just Euler's method, but to any single-step method. Every single-step method employs the same algorithm for generating its integration sequence. The algorithm is of the form (2) and can be written generally as t_{n+1} = t_n + h; n = 0, 1, 2, ..., Q, where h is a real positive constant. As with the extension of Euler's method, we propose that one can replace this algorithm with that of (4), i.e. t_{n+1} = t_n + β_n; n = 0, 1, 2, ..., P. In making this substitution we can justifiably expect that in order to reduce the


Figure: Adaptive solution and Euler solution.

Plugging this expansion into the PDE, multiplying with a test function φ_i, integrating over Ω, and applying Green's formula and the boundary conditions yield equation (6).

If we define the following notions:
$$M_{i,j} = \int_\Omega \phi_j \phi_i \, dx \qquad\qquad (8)$$
$$D_{i,j} = \int_\Omega (D_e \nabla\phi_j)\cdot\nabla\phi_i \, dx \qquad\qquad (9)$$
$$K_{i,j} = \int_{\partial\Omega} a\,\phi_j \phi_i \, ds \qquad\qquad (10)$$
$$F_j = \int_\Omega f\,\phi_j \, dx \qquad\qquad (11)$$
$$G_j = \int_{\partial\Omega} a\,g\,\phi_j \, ds \qquad\qquad (12)$$
Equation (7) can be rewritten as
$$M\,\frac{dU}{dt} + (D + K)\,U = (F + G) \qquad\qquad (13)$$

which is a large and sparse system of ordinary differential equations (ODEs) with the initial value U_i(0) = 0. Further, in order to solve Equation (13) numerically, we discretize the time in continuous piecewise linear functions as well, and obtain
$$U_i(t) = U_i^{n-1}(x)\,\Psi_{n-1}(t) + U_i^{n}(x)\,\Psi_{n}(t) \qquad\qquad (14)$$
Plugging this expansion into equation (13) and integrating over each time interval lead to the following sequence of systems of equations (15), where dt is the time step chosen. The ODE system is an initial value problem (IVP) and is ill conditioned. Therefore, adaptive time step control, involving automatic choice of the time step to satisfy a tolerance on the error, is employed in this work. Moreover, since the coefficient matrix F is a function of U, which is time dependent, the ODE system is also nonlinear. Thereby, re-evaluating and re-factorizing the matrix F are necessary while the fixed point iteration method is used to solve the nonlinear system. For the systems of PDEs in our case, we first arrange U in the form U = [U_{VO-EP}, U_{VO-EPH2}, U_{deposits}]^T and then the coefficient matrices M, D, K, F and G are changed correspondingly.

The key issues for developing the codes in MATLAB are: (i) defining the geometry and meshing; (ii) factorizing the coefficient matrices that are not time dependent; (iii) time evolving while re-evaluating and re-factorizing the matrix that is time dependent in each time step; (iv) solving U at the point of time; (v) recalculating the time step by satisfying a tolerance on the error; (vi) plotting and visualizing. A sketch of this time-stepping loop is given below.
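The Python fragment below illustrates steps (ii)-(iv) under stated assumptions: the matrices M, D, K and the vectors G, F(U) are assumed to be already assembled (here they are small placeholders), an implicit Euler step stands in for the scheme of equation (15) which is not reproduced above, and the adaptive choice of dt (step v) is omitted. It is an added sketch, not code from the paper.

```python
import numpy as np

def evolve(M, D, K, G, F_of_U, U0, t_end, dt=1e-3, tol=1e-8, max_fp=50):
    """Implicit Euler steps for M dU/dt + (D + K) U = F(U) + G, with
    fixed-point iteration on the U-dependent load F(U)."""
    A = M / dt + D + K                    # constant while dt is fixed; factorize once in practice
    U, t = U0.copy(), 0.0
    while t < t_end:
        rhs_const = M @ U / dt + G
        U_new = U.copy()
        for _ in range(max_fp):           # fixed-point iteration for the nonlinear term
            U_next = np.linalg.solve(A, rhs_const + F_of_U(U_new))
            converged = np.linalg.norm(U_next - U_new) < tol
            U_new = U_next
            if converged:
                break
        U, t = U_new, t + dt
    return U

# tiny smoke test with 2x2 placeholder matrices and a mild nonlinearity
I = np.eye(2)
print(evolve(I, 0.5 * I, 0.1 * I, np.ones(2), lambda u: -0.1 * u**2,
             U0=np.zeros(2), t_end=0.01))
```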


Figure 1: Concentration profile of vanadyl etioporphyrin (VO-EP) after 1 h on stream.

Figure 2: Concentration profile of the metal deposits after 1 h on stream.

4. Results and Discussion

After inputting the corresponding parameter values adopted from the experimental results of Run VE-6 [3, 4], the process of simultaneous diffusion and sequential reactions in a catalyst particle was vividly visualized, which was helpful to understand the whole HDM process. At the end, the concentration changes inside the particle were animated and the final concentration profiles of VO-EP and metal deposits were plotted, as shown in Figure 1 and Figure 2, respectively. The simulation results indicated that the hydrodemetallization of vanadyl etioporphyrin was a diffusion-limited reaction, since VO-EP was unable to penetrate the catalyst particle deeply. The concentrations of VO-EPH2, with the maximum inside the boundary, did not build up during the process, demonstrating that VO-EPH2 was a reaction intermediate. M-shaped deposition profiles in the radial direction were obtained, as shown in Figure 2, confirming the suggestion that the M-shaped profile was a result of a series reaction. Furthermore, since dense deposition inside particles could decrease the effective diffusivity, catalysts with a large pore size and a high pore volume were demanded. Finally, for short cylindrical particles the diffusion from the ends was significant, so the end effects could not be neglected.

References
[1] M.C. Tsai and Y.W. Chen, Restrictive diffusion under hydrotreating reactions of heavy residue oils in a trickle bed reactor, Ind. Eng. Chem. Res. 32 1603-1609 (1993).
[2] P.W. Tamm, H.F. Hamsberger and A.G. Bridge, Effects of feed metals on catalyst aging in hydroprocessing residuum, Ind. Eng. Chem. Process Des. Dev. 20 262-273 (1981).
[3] R. Agrawal and J. Wei, Hydrodemetalation of nickel and vanadium porphyrins. 1. Intrinsic kinetics, Ind. Eng. Chem. Process Des. Dev. 23 505-514 (1984).
[4] R. Agrawal and J. Wei, Hydrodemetalation of nickel and vanadium porphyrins. 2. Intraparticle diffusion, Ind. Eng. Chem. Process Des. Dev. 23 515-522 (1984).


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 324-327

Matrix Bandwidth Minimization: A Neural Approach

D. López-Rodríguez¹ and E. Mérida-Casermeiro²
Department of Applied Mathematics, E.T.S.I. Informática, University of Málaga, 29071 Málaga, Spain

Received 30 July, 2004; accepted in revised form 28 August, 2004

Abstract: The aim of this paper is to present a neural technique to tackle the bandwidth minimization problem. This neural model, successfully applied to other problems, presents better solutions than those of other classical approaches to this problem, as shown by the simulations performed.

Keywords: Bandwidth Minimization problem, multivalued neural network, combinatorial optimization.

Mathematics Subject Classification: 05C78 (Graph labelling), 05A05 (Combinatorial choice problems).

PACS: 120304

1 Introduction

Bandwidth Minimization Problem (BMP) is one of the most recurrent themes in the literature, since it has many applications in science and engineering, not only in linear algebra. The aim of BMP is to find a reordering scheme of rows and columns of a matrix, i.e. a permutation of its rows and columns, such that nonzero elements get as close as possible to the main diagonal. This problem can be derived from graph theory in terms of minimizing the bandwidth of the adjacency matrix associated to a given graph. So, BMP has indeed many optimization applications [6], from both linear algebra and graph theory, including circuit design, large hypertext media storage, VLSI design, finite element methods for partial differential equations and resolution of large linear systems. In this last case, if the matrix size is N × N, Gaussian elimination can be carried out in O(Nb²) if the matrix bandwidth is b < N, instead of a much slower O(N³). BMP is also interesting due to its NP-completeness, proved in the general case in the mid-70s [10]. In the case of minimizing a graph bandwidth, Garey et al. [4] showed that the problem is NP-complete even if the maximum vertex degree in the graph is 3. So, the intractability of this problem is assured and thus there is need for algorithms to achieve near optimal solutions. In the late 60s, E. Cuthill and J. McKee [2] introduced the first algorithm, CM, to approach BMP by constructing a level structure of the associated graph. Subsequently, Liu and Sherman [7] showed that reversing Cuthill-McKee's ordering can never increase the bandwidth of the matrix.

1 Corresponding author. E-mail: [email protected]
2 E-mail: [email protected]


Another approach based on the level structure of the associated graph is the Gibbs-Poole-Stockmeyer (GPS) algorithm [5], which achieved results comparable to those of CM, with the advantage of being less time-consuming. A Simulated Annealing procedure was presented by Dueck and Jeffs (1995). Although it finds better results than GPS or RCM in some particular cases, it should be noted that it takes up to 2000 times longer. Some other approaches consist of a spectral decomposition of the matrix [1], although they are focused on minimizing the 'work-bound' of the matrix, i.e. the cost of a Cholesky factorization, and the bandwidth of the permuted matrix can be high. In 2001, Corso and Romani modified this spectral technique to reduce both bandwidth and 'work-bound', but in general their algorithm performed better than RCM only when RCM was used at a preprocessing stage. Since neural networks have been successfully applied to many other combinatorial optimization problems [8, 9, 11], improving the solutions given by classical algorithms, we expect that the model proposed in this paper achieves better results than other techniques.
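For reference, the reverse Cuthill-McKee reordering mentioned above is available in SciPy; the short sketch below (an added illustration, not part of the paper) computes the bandwidth of a symmetric 0-1 matrix before and after applying it.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee

def bandwidth(A):
    """w(A) = max |i - j| over nonzero entries a_ij."""
    rows, cols = np.nonzero(A)
    return int(np.max(np.abs(rows - cols))) if rows.size else 0

# small symmetric adjacency matrix with zero diagonal
A = np.array([[0, 0, 0, 1],
              [0, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])

perm = reverse_cuthill_mckee(csr_matrix(A), symmetric_mode=True)
A_perm = A[np.ix_(perm, perm)]
print(bandwidth(A), bandwidth(A_perm))
```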

2 Definition of the Problem

Bandwidth Minimization Problem (BMP) can be stated from two closely related points of view, one being matrix theory and the other being graph theory. Let M = (m_{i,j}) be an N × N real symmetric matrix with zero diagonal. The bandwidth of M is defined as w(M) = max_{i,j=1,...,N} {|i − j| : m_{i,j} ≠ 0}. BMP consists in searching for a permutation of the rows and the columns of M such that its nonzero elements lie as close as possible to the main diagonal, making the bandwidth of the new matrix minimum. Denoting by S_N the set of permutations of {1, ..., N} and by M_σ = (m_{σ(i),σ(j)}) (σ ∈ S_N) the permuted matrix, we look for a permutation σ* ∈ S_N such that w(M_{σ*}) = min_{σ∈S_N} w(M_σ). From the viewpoint of graph theory, BMP arises when all nodes in a given graph must be optimally labelled. So, let G = (V, E) be an undirected graph with no self-connections, where V = {v_i} is the set of vertices and E is the edge set. A labelling (or numbering) of G is a bijective function from V to {1, 2, ..., N} (where N is the cardinality of V) that assigns an index to each vertex. Then, the adjacency matrix of G with respect to

Fig. 2: Conductance of the model system of Fig. 1.


Fig. 3: Analysis of the transmission probability in terms of the molecular orbitals of the system (panels: molecular orbitals of C6H4S2 and of Au4C6H4S2Au4; horizontal axis: transmission probability; E_F = −5.3 eV).

Acknowledgments

This study was supported by a Grant-in-Aid for the 21st Century COE Program for Frontiers in Fundamental Chemistry, and by a Grant for Scientific Research (B) (No. 14340174) from the Ministry of Education, Culture, Sports, Science and Technology of Japan. Some of the numerical calculations were carried out at the Research Center for Computational Science of Okazaki National Research Institute, Japan.

References
[1] S. Sanvito, C.J. Lambert, J.H. Jefferson and A.M. Bratkovsky, Phys. Rev. B, 59, 11936 (1999).
[2] M.A. Reed, C. Zhou, C.J. Muller, T.P. Burgin, J.M. Tour, Science, 272, 252 (1997).


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 357-365

Modelling spatial clustering through point processes: a computational point of view

J. Mateu¹ and J.A. López
Department of Mathematics, Campus Riu Sec, University Jaume I, E-12071 Castellón, Spain

Received 10 August, 2004; accepted in revised form 28 August, 2004

Abstract: Spatial point process models provide a large variety of complex patterns to model particular cluster situations. Usually, the main tools to compare theoretical results with observations in many varied scientific fields such as engineering or cosmology are statistical, so the new theories and observations also initiated an active use of spatial statistics in these fields. However, due to model complexity, spatial statistics often rely on MCMC computational methods, a backbone for using these techniques in practice. In this paper we introduce point field models, such as the continuum random-cluster process, the area-interaction process and interacting neighbour point processes, that are able to produce clustered patterns of great variety. In particular, we focus on computational aspects for simulation and statistical inference while analyzing the flexibility of these models when used in practical modelling.

Keywords: Area-interaction point processes, clustered point patterns, continuum random-cluster processes, interacting neighbour point processes, spatial statistics.

¹ Corresponding author. E-mail: [email protected], Fax: +34.964.728429

Mathematics Subject Classification: 60G12, 60G55.

1 Introduction

Stochastic geometry is the study of random patterns, whether of points, line segments, or objects. In particular, point processes are random point patterns in the plane or in space. The simplest example of these is the Poisson point process, which models a "completely random" distribution of points in the plane or space. It is a basic building block in stochastic geometry and elsewhere. Its many symmetries typically reduce calculations to computations of area or volume. In stochastic geometry the classic generalization of the Poisson process is the Boolean model; the set-union of (possibly random) geometric figures or grains located one at each point or germ of the underlying Poisson process. Much of the amenability of the Poisson process is inherited by the Boolean model, though nevertheless its analysis can lead to very substantial calculations and theory as discussed by Hall (1988) and Molchanov (1996). It is often convenient to think of Boolean models as being derived from marked Poisson processes, in which the points are supplemented by marks encoding the geometric data (such as disk radius) of the grains; in effect the Poisson process now lives on R² × T where T is the mark space. Poisson cluster processes form a special case, in which the grains are simply finite clusters of points. The famous Neyman-Scott processes arise when the clusters are produced by independent sampling. How can we escape the limitations of the independence

property of Poisson processes? One possibility is to randomize the intensity of the Poisson process, which produces the Cox process (also termed the doubly stochastic Poisson process). Since the introduction of Markov point processes in spatial statistics, attention has focused on the special case of pairwise interaction models. These provide a large variety of complex patterns starting from simple potential functions which are easily interpretable as attractive and/or repulsive forces acting among points. However, these models do not seem to be able to produce clustered patterns in sufficient variety. And this is the reason why other families of Markov point process models, able to produce clustered patterns, are introduced in this paper. The area-interaction process (Baddeley and van Lieshout, 1995), which uses the area of an associated Boolean model, the quermass-weighted point processes (Kendall, et al. 1999), which use perimeter length or total curvature characteristics, the continuum random-cluster model (Haggstrom, et al. 1999) and the penetrable spheres mixture model (Haggstrom, et al. 1999) are of interest in spatial statistics in situations where the independence property of the Poisson process needs to be replaced either by attraction or by repulsion between points. They are also highly relevant in statistical physics, where the first and third models provide the most well-known example of a phase transition in a continuous setting. Ripley and Kelly (1977) introduced Markov processes with fixed finite range interactions. Continuum random-cluster processes, on the other hand, allow interactions with arbitrarily large range. Two points of a continuum random-cluster process are neighbours if they belong to the same connected component. Therefore, two points that are far away from each other interact if there is a chain of other points connecting them. A new class of processes, which is between Ripley-Kelly Markov and connected component Markov processes (Baddeley and M0ller, 1989), is that when we allow points to interact if they are close enough (neighbours) or if there is a third point (but not a longer chain of points) connecting them. This new class is named interacting neighbour processes. One example of such a process is Geyer's saturation process (Geyer, 1999). It resembles the Strauss process, but has an extra parameter which puts an upper bound on the contribution to the density of any single point, and which, therefore, overcomes the normalizing problem in the clustered case. Because of this extra parameter, the saturation process is no longer a Markov process with respect to r-close neighbours but it is a Markov process with respect to r-close neighbours and the neighbours of these. Another example is the isolated-point-penalization point process (Hayat and Gubner, 1999) in which the interaction term controls the number of isolated points. In this paper we introduce the main point field models that are useful as particular models of cluster situations within the context of spatial statistics. We further analyze, through a simulation study, the flexibility of these models when used in practical modelling and focus on inferential and simulation computational tools.

2 Spatial point process methodological setup

Markov point processes are a rich class of stochastic models for spatial patterns, with the virtue of being relatively tractable. They are defined to satisfy one of several spatial counterparts of the Markov conditional independence property. The likelihood takes a simple explicit form, apart from a difficult normalising factor. Indeed typically the likelihood is an exponential family, and the canonical sufficient statistic is often closely related to nonparametric spatial statistics. Typically each process is the equilibrium measure of an associated (space-time) Markov process; thus it is amenable to MCMC simulation and inference. Accordingly there is much current interest in exploring the potential applications of Markov point processes, which include spatial statistics, digital image analysis and geostatistics. In order that likelihoods may exist, we shall restrict attention to finite simple point processes whose distributions are absolutely continuous with respect to the distribution of the Poisson process. Such a process may be visualised very easily as a random finite number of points at random locations in a space S. A realisation of the point process X is a finite unordered set of points,

$$\mathbf{x} = \{x_1, \ldots, x_n\} \qquad\qquad (1)$$
with x_i ∈ S. The space S in which the points lie is typically a subset of ℝ^d but may be any Polish space. Let N be the space of all such realisations. Suppose that all point process models X here are absolutely continuous with respect to the distribution of the Poisson point process with intensity measure μ on S, where μ is a fixed, nonatomic, finite Borel measure. Then X has a probability density f : N → [0, ∞] such that
$$P(X \in A) = \exp(-\mu(S)) \sum_{n=0}^{\infty} I_n(f, A) \qquad\qquad (2)$$
for each A ∈ F, where I_0(f, A) = 1{∅ ∈ A} f(∅) and for n ≥ 1
$$I_n(f, A) = \frac{1}{n!} \int_S \cdots \int_S 1\{\{x_1, \ldots, x_n\} \in A\}\, f(\{x_1, \ldots, x_n\})\, d\mu(x_1) \cdots d\mu(x_n) \qquad (3)$$
In the simple case where S is a bounded subset of ℝ^d and μ is the restriction to S of the Lebesgue measure,
$$f(\{x_1, \ldots, x_n\})\, dx_1 \cdots dx_n \qquad\qquad (4)$$
is the probability that the process consists of a point near each of the locations x_1, ..., x_n and no other points.

Example 1: Poisson process. A point process is a homogeneous planar Poisson process of intensity λ if:
PP1: the number, n(S), of events in any planar region S follows a Poisson distribution with mean λ|S|, where |S| denotes the area of S;
PP2: given n(S) = n, the n events in S form an independent random sample from the uniform distribution on S;
PP3: for any two disjoint regions S_1 and S_2, the random variables n(S_1) and n(S_2) are independent.
Let n(x) denote the number of points in a realisation x ∈ N. If α, β > 0 are constants,

$$f(\mathbf{x}) = \alpha\, \beta^{n(\mathbf{x})} \qquad\qquad (5)$$
defines the density of the Poisson process with intensity measure βμ(·), and the normalising constant α equals
$$\alpha = \exp\{-(\beta - 1)\,\mu(S)\} \qquad\qquad (6)$$

Example 2: Inhomogeneous Poisson process. For a function β : S → [0, ∞),
$$f(\mathbf{x}) = \alpha \prod_{i=1}^{n(\mathbf{x})} \beta(x_i) \qquad\qquad (7)$$
is the density of the "inhomogeneous" Poisson process with intensity measure κ(B) = ∫_B β(u) dμ(u) on S and the normalising constant is
$$\alpha = \exp\Big\{-\int_S (\beta(u) - 1)\, d\mu(u)\Big\} \qquad\qquad (8)$$
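A direct simulation of Example 1 on a rectangular window, following PP1-PP3, can be written in a few lines of Python; this sketch is an added illustration, not part of the original paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_poisson(lam, width, height):
    """Homogeneous planar Poisson process on [0, width] x [0, height]."""
    n = rng.poisson(lam * width * height)   # PP1: Poisson count with mean lambda * |S|
    xs = rng.uniform(0.0, width, size=n)    # PP2: given the count, points are uniform
    ys = rng.uniform(0.0, height, size=n)
    return np.column_stack([xs, ys])

pts = simulate_poisson(lam=100.0, width=1.0, height=1.0)
print(pts.shape)
```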


Example 3: Cox process. A point process is a Cox process if:
CP1: Λ(x) is a non-negative valued stochastic process and
CP2: conditional on the realisation of Λ(x), the point process is an inhomogeneous Poisson process with intensity function Λ(x).

Example 4: Poisson cluster process. A point process is a Poisson cluster process if:
PCP1: parents form a homogeneous Poisson process of intensity ρ;
PCP2: the number of offspring per parent is a random variable M, realised independently for each parent;
PCP3: the position of each offspring relative to its parent is a bivariate random variable Y, realised independently for each offspring;
PCP4: the observed point process consists of the superposition of offspring from all parents.

2.1 Interpoint interactions

Definition. A finite Gibbs point process is a finite simple point process with a density satisfying the positivity condition
$$f(\mathbf{x}) > 0 \implies f(\mathbf{y}) > 0 \qquad\qquad (9)$$

for all y ⊆ x. By an application of the Möbius inversion formula or inclusion/exclusion, the density of any finite Gibbs point process can be written in the form
$$f(\mathbf{x}) = \exp\Big\{ V_0 + \sum_{i=1}^{n(\mathbf{x})} V_1(x_i) + \sum_{i<j} V_2(x_i, x_j) + \cdots \Big\} \qquad\qquad (10)$$

Example 5: Pairwise interaction process. A pairwise interaction process has density
$$f(\mathbf{x}) = \alpha \prod_{i=1}^{n(\mathbf{x})} b(x_i) \prod_{i<j} h(x_i, x_j) \qquad\qquad (11)$$
where α > 0 is the normalising constant. The terms b(x_i) influence the intensity and location of points, while the terms h(x_i, x_j) introduce dependence or interaction between different points of the process. Typically, α is not known explicitly. Assume now that the interaction function h has finite range r > 0, in the sense that h(u, v) = 1 whenever ‖u − v‖ > r. Declare two points u, v ∈ S to be neighbours, and write u ∼ v, if they are closer than r units apart:

$$u \sim v \quad \text{iff} \quad \|u - v\| < r \qquad\qquad (12)$$
Then interactions occur only between neighbours, and the density becomes
$$f(\mathbf{x}) = \alpha \prod_{i=1}^{n(\mathbf{x})} b(x_i) \prod_{i<j:\; x_i \sim x_j} h(x_i, x_j) \qquad\qquad (13)$$
A Markov point process on S with respect to a symmetric, reflexive relation ∼ is a finite Gibbs point process whose conditional intensity λ(u; x) = f(x ∪ u)/f(x) depends only on u and {x_i ∈ x : x_i ∼ u}.

Example 6: Widom-Rowlinson penetrable sphere model. The Widom-Rowlinson penetrable sphere model, or area-interaction process (Baddeley and van Lieshout, 1995), has density
$$f(\mathbf{x}) = \alpha\, \beta^{n(\mathbf{x})}\, \gamma^{-A(\mathbf{x})} \qquad\qquad (14)$$
where β, γ > 0 are parameters, α > 0 is the normalising constant, and A(x) is the area of
$$U_r(\mathbf{x}) = \Big( \bigcup_{i=1}^{n(\mathbf{x})} B(x_i; r) \Big) \cap S \qquad\qquad (15)$$
where B(x_i; r) is the disc of radius r centred at x_i. The above density is integrable for all values of γ > 0. The process produces clustered patterns when γ > 1, ordered patterns when γ < 1, and reduces to a Poisson process when γ = 1. This model has interactions of infinite order.

Example 7: Continuum random-cluster process. The continuum random-cluster process has density given by (16), where β, γ > 0 are parameters, α > 0 is the normalising constant, and c(x) denotes the number of connected components. This model exhibits regularity for γ < 1, and clustering for γ > 1.

Example 8: Triplets process. The idea of Markov point processes suggests adding the clique of next higher order to get a process that permits positive attraction of pairs of points. Define
$$W(\mathbf{x}) = \frac{1}{6} \sum_{i=1}^{n} \sum_{j \ne i} \sum_{k \ne i,\, k \ne j} 1[\|x_i - x_j\| \le r]\; 1[\|x_j - x_k\| \le r]\; 1[\|x_i - x_k\| \le r] \qquad (17)$$
the number of triplets of points that are mutual neighbours. Define t(x) to be the 3-dimensional vector (n(x), S(x), W(x)), where S(x) is the same as in the Strauss process. Then, the triplets process (Geyer, 1999) has a density depending on three parameters (θ = (β, γ, η)) and takes the form
$$f(\mathbf{x}) = \alpha\, \beta^{n(\mathbf{x})}\, \gamma^{S(\mathbf{x})}\, \eta^{W(\mathbf{x})} \qquad\qquad (18)$$
The normalizing constant α is finite if and only if γ ≤ 1 and η ≤ 1, or if η < 1. Thus the canonical parameter space of the family is Θ = {θ : η < 1}, indicating that this is a good model for clustering.


Example 9: Saturation process. Another simple way of obtaining a model for clustering is by defining, for each point s ∈ x,
$$m_s(\mathbf{x}) = \sum_{u \ne s,\, u \in \mathbf{x}} 1[\|s - u\| \le r]. \qquad\qquad (19)$$
Then Σ_{s∈x} m_s(x) is just twice the neighbour pair statistic S(x). Now, we put an upper bound d > 0 on the influence of any single point and define
$$U(\mathbf{x}) = \sum_{s \in \mathbf{x}} \min(d, m_s(\mathbf{x})). \qquad\qquad (20)$$
Define t(x) to be the 2-dimensional vector (n(x), U(x)). The saturation process (Geyer, 1999) has a density depending on the parameters (θ = (β, γ)) and takes the form
$$f(\mathbf{x}) = \alpha\, \beta^{n(\mathbf{x})}\, \gamma^{U(\mathbf{x})} \qquad\qquad (21)$$
Now the normalizing constant α is finite for all values of γ.
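To make the statistics in (19)-(21) concrete, the following Python helper (an added illustration, not taken from the paper) computes m_s(x), the neighbour pair statistic S(x) and the saturation statistic U(x) for a planar point pattern.

```python
import numpy as np

def saturation_statistics(points, r, d):
    """Return (m, S, U) for a pattern given as an (n, 2) array of coordinates."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    close = (dist <= r) & ~np.eye(len(points), dtype=bool)  # exclude u = s
    m = close.sum(axis=1)          # m_s(x): number of r-close neighbours of each point
    S = m.sum() / 2                # neighbour pair statistic, since sum_s m_s(x) = 2 S(x)
    U = np.minimum(d, m).sum()     # saturation statistic of equation (20)
    return m, S, U

pts = np.array([[0.0, 0.0], [0.05, 0.0], [0.07, 0.02], [0.9, 0.9]])
print(saturation_statistics(pts, r=0.1, d=2))
```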

Example 10: Isolated-point-penalization point process. Hayat and Gubner (1999) introduce the isolated-point-penalization point process, which has the density
$$f(\mathbf{x}) = \alpha\, \beta^{|\mathbf{x}|}\, \gamma^{I(\mathbf{x})}$$
with 0 < γ < 1, where I(x) is the number of isolated points with respect to the r-close neighbour relation. Note that two points ξ and η are neighbours if 0 < d(ξ, η) ≤ r, where d(ξ, η) is the distance between ξ and η.

Example 11: Interacting neighbour point process. A point process X is an interacting neighbour point (INP) process with respect to a given neighbour relation ∼ if its density has the form
$$f(\mathbf{x}) = \alpha \prod_{x_i \in \mathbf{x}} g(x_i, N_{\mathbf{x}}(x_i)) \qquad\qquad (22)$$
where α is a normalizing constant and g a measurable function. This class of models was defined by Grabarnik and Särkkä (2001). The conditional intensity can be written as
$$\lambda(\xi; \mathbf{x}) = g(\xi, N_{\mathbf{x}\cup\{\xi\}}(\xi)) \prod_{x_t \in N_{\mathbf{x}}(\xi)} \frac{g(x_t, N_{\mathbf{x}\cup\{\xi\}}(x_t))}{g(x_t, N_{\mathbf{x}}(x_t))}$$
where c > 0 is an arbitrary constant and N_x(x_i) consists of the points that are within distance r from x_i. In the case γ > 1, the interaction term γ^{max{0, |N_x(x_i)|(c − |N_x(x_i)|)}} obtains its largest value as the number of neighbours equals c/2, i.e., the cluster size c/2 + 1 is favoured. Thus, we can control the size of clusters by choosing the constant c appropriately. Since for large values of the interaction parameter γ all clusters are of the same size, we call this process a twin cluster process. The case 0 < γ < 1 is interesting as well. Then, the interaction term gets its smallest value when the number of neighbours of each point equals c/2 and its largest value if points have either no neighbours or c or more neighbours. Hence, instead of being a repulsive model, it is a specific cluster model defining processes with a combination of regular and clustered structures. This is called a bipattern process.

3 Inference: Pseudolikelihood approach

Given a point pattern x = {x_1, x_2, ..., x_n}, the conditional intensity function of an event at u, given the point pattern x in the remainder of S, is defined as
$$\lambda^*(u; \mathbf{x}) = \frac{f(\mathbf{x} \cup u)}{f(\mathbf{x})}.$$
The general log-pseudo-likelihood function for a Gibbs process is given by
$$pl(\mathbf{x}; \theta) = \sum_{i=1}^{n} \log\{\lambda^*(x_i; \mathbf{x} \setminus \{x_i\}; \theta)\} - \int_S \lambda^*(u; \mathbf{x}; \theta)\, du. \qquad\qquad (23)$$
The following relation is satisfied for a general Gibbs process

λ*(u; x; θ) = β exp{− ···

$$\sum_{i=1}^{k} a_{k-i+1}(t)\,x_i = a_k(t)x_1 + a_{k-1}(t)x_2 + \cdots + a_1(t)x_k \le b(t), \quad \forall t \in T, \quad k \ge 2, \qquad (3)$$
which are:
$$\sum_{i=1}^{k-1} a_{k-i+1}(t)\,x_i \le b(t), \quad \forall t \in S_1', \qquad\qquad (4)$$
$$\qquad\qquad (5)$$


Using the equivalence of 2., where only one variable is eliminated, we lose information about the conditions that this variable should satisfy. Indeed, in this elimination the variable x_k has been removed and thus the information about the range of the values that x_k may take in an eventual solution of (3) is lost. The idea which is used in order to reinstate the information about x_k is the additional elimination of another variable, say x_{k−1}, so that a second couple of equations similar to (4) and (5) is derived, which have a solution if and only if (3) has a solution. Thus, considering the elimination of two variables x_k and x_{k−1}, we arrive at 3., which describes the decomposition of the initial inequality (3) into a set of four equivalent inequalities, each one of them including k − 1 variables, without losing information about the range of the variables in the solution.

3. Geometrical representation of the main results

For the illustration of the above, consider the following linear system:
$$a_1(t)\,x_1 + a_2(t)\,x_2 + a_3(t)\,x_3 \le b(t)$$
where a_1(t) is piecewise constant, taking the values 1, 0 and −1.

Y Ndo 1 (A) is the closure operator given by F(x) = CI'{F,v,!}({x}).

2 Minimal Generator of an nd.ideal-o

Definition 2.1 Let (A, ≤) be a lattice and F an nd.ideal-o in A. We say that G ∈ Ndon(A) is a generator of F if Ḡ = F. Particularly, for all F ∈ Ndon(A), F is a generator of F̄. Now, we are interested in a generator without redundancy, therefore we search for minimal generators.

Theorem 2.2 Let (A, ≤) be a finite Boole algebra and I ⊆ A an ideal in (A, ≤). Then, the ideal generated by I ∩ Atom(A) is I: (I ∩ Atom(A)] = I.

Corollary 2.3 Let (A, ≤) be a finite Boole algebra and F an nd.ideal-o in A. Then, there exists G : A → 2^{Atom(A)} generator of F. Corollary 2.3 can help us in the search of the minimal generator.

Example 2 Let D_30 be the Boole algebra of the positive divisors of 30 with the divisibility relation, and let the nd.ideal-o F : D_30 → 2^{D_30} be defined by:
$$F(x) = \begin{cases} (x] & \text{if } x \in \{1,2,3,5\} \\ D_{30} & \text{in other case} \end{cases}$$
The ndo G(x) = F(x) ∩ Atom(D_30) is described by
$$G(x) = \begin{cases} \emptyset & \text{if } x = 1 \\ \{x\} & \text{if } x \in \{2,3,5\} \\ \{2,3,5\} & \text{in other case} \end{cases}$$

² If F, G ∈ Ndon(A) then (F ∩ G)(a) = F(a) ∩ G(a).

(Hasse diagram of D_30, showing the elements 2, 3, 5, 6, 10, 15 and 30.)


But G has unnecessary information, because the ndo H(6) = {5}; H(10) = {3}; H(15) = {2}; H(x) = ∅ in other case, satisfies H ⊂ G and H̄ = Ḡ = F̄.

Proposition 2.4 Let (A, ≤) be a finite Boole algebra and F an nd.ideal-o in A. If H : A → 2^{Atom(A)} is the ndo defined by H(x) = (F(x) ∖ (x]) ∩ Atom(A), then H is a generator of F.

Definition 2.5 Let (A, ≤) be a lattice and F, G ∈ Ndon(A). We say that F and G are equivalents if F̄ = Ḡ. In this case, we say that F is a cover of G and vice versa.
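The D_30 example above can be checked mechanically; the following Python sketch (added for illustration, not part of the paper) builds F, G and H on the divisors of 30 and reproduces, e.g., H(6) = {5}.

```python
D30 = [d for d in range(1, 31) if 30 % d == 0]
ATOMS = {2, 3, 5}                      # atoms of the divisor lattice of 30

def down(x):                           # principal ideal (x] under divisibility
    return {d for d in D30 if x % d == 0}

def F(x):                              # the nd.ideal-o of Example 2
    return down(x) if x in {1, 2, 3, 5} else set(D30)

def G(x):                              # generator G(x) = F(x) ∩ Atom(D30)
    return F(x) & ATOMS

def H(x):                              # smaller generator H(x) = (F(x) \ (x]) ∩ Atom(D30)
    return (F(x) - down(x)) & ATOMS

print(G(6), H(6), H(10), H(15))        # {2, 3, 5} {5} {3} {2}
```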

Now, we formalize in the following definition the idea of "to have less information than".

Definition 2.6 Let (A, ≤) be a poset and F, G : A^{ar(F)} → 2^A; we define:
1. F ⊑ G if Grafo(F) ≤ Grafo(G), that is, if for all (a, b) with b ∈ F(a) there exists (a', b') with b' ∈ G(a') satisfying a ≤ a' and b ≤ b'. Particularly, if F ⊑ G, we have that F̄ ⊑ Ḡ.
2. F ≺ G if F ⊑ G and F ≠ G.

Definition 2.7 Let (A, ≤) be a lattice and F, G ∈ Ndon(A). We say that F is redundant if there exists G equivalent to F such that G ≺ F. Therefore, an ndo is redundant if there exists another equivalent ndo with less information. The following example shows ndo's with redundancy.

Example 3 Let U = {a, b, c} and consider (2^U, ⊆):
1. The ndo F : 2^U → 2^(2^U), F({a}) = {{a},{c}}, is redundant because the ndo G : 2^U → 2^(2^U), G({a}) = {{a},{c}},

F({a,b})={U},

G(X) = 0 in other case

satisfies that G-< F and, as F({a,b}) ~ G({a,b}) 2. The ndo F : 2u -> 22u, F( {a}) = { {a, c}}, the ndo G : 2u -> 22u

G({a}) = {{c}}, satisfies that G-< F and, as F( {a})~

= 2u, we have that G=F.

F( X) = 0 in other case is redundant because

G(X) = 0 in other case

G( {a})= {0, {a}, {c}, {a, c}},

3. The ndo F: 2u -> 22u, F( {a}) = {{c}}, redundant because the ndo G : 2u -> 22u G( {a})= {{c}, {b} }, satisfies that G-< F and

F(X)=0inothercase

F( {a, c}) = {{b} },

we have that

G =F.

F(X) = 0 in other case is

G(X) = 0 in other case

G=F.

The following proposition summarize the cases presented in the below example. Proposition 2.8 1. there exists a E A and b E F(a) such that b E ~(a), where Fab is given by Fab(a) F(a) '- {b} and Fab(x) = F(x) otherwise.



2. there exists a, b' E A and b E F( a) such that b' < b and b E Fa;; (a) where Faw is given by Faw(a) = (F(a)' {b}) U {b'} and Faw(x) = F(x) otherwise. 3. there exists a, a' E A and b E F( a) such that a' < a, b E F( a') and b E ~ (a) where Faba' is given by Faba'(a) = F(a)' {b}, Faba'(a') = F(a') U {b} and Faba'(x) = F(x) otherwise. Finally we remark the following characterization of minimal generator of a ndo. Definition 2.9 Let be (A, :'S) a lattice andF E Ond 1(A). We say thatG E Ond1(A) is a minimal generator ofF if:

• G is equivalent to F, • G-< F and • G is not redundant.

3

Functional Dependencies in lattice theory

In a large amount of research on databases, the study of FDs is based on the notion of f-family (Armstrong's relation) that we characterize in [1] in the framework of lattice theory. In [1] we prove the following theorem, which turns the proof of well-known properties of FDs into a trivial matter.

Theorem 3.1 Let A be a non-empty set and F a relation in 2^A. F is an f-family over A if and only if F is a ndo in (2^A, ⊆).

In this paper we formalize the database concept of a non-redundant family of FDs (minimal closure) with the concept of minimal generator of the corresponding ndo.

References [1] P. CORDERO, M. ENCISO, I. P. DE GUZMAN, AND A. MORA, SLFD Logic: Elimination of data redundancy in Knowledge Representation, LECTURE NOTES- LNAI 2527, Advances in AI, Iberamia 2002, pp. 141-150 [2] J. MARTINEZ, P. CORDERO, G. GuTIERREZ AND I. P. DE GUZMAN, A new algebraic tool for Authomatic Theorem Provers, Annals of Mathematics and Artificial Intelligence. Kluwer Academic Publisher. To appear. [3] J. MARTINEZ, P. CORDERO, G. GUTIERREZ AND I. P. DE GUZMAN Generalizations of Lattices Looking at Computation, Discrete Mathematic., Elsevier. To appear. [4] G. GRATZER, General lattice theory, Birkhauser Verlag. Second Edition. 1998 [5] J. DEMETROVICS AND D. T. Vu, Some results about normal forms for functional dependency in the relational datamodel, Discrete Applied Mathematics 69, 1996, pp. 61-74 [6] J. DEMETROVICS, L. LIBKIN, AND I. B. MUCHNIK, Functional dependencies in relational databases: A lattice point of view, Discrete Applied Mathematics 40, 1992, pp. 155-185 [7] J. PAREDAENS, P. DE BRA, M. GYSSENS, AND D. V. VAN GUCHT, The Structure of the Relational Database Model, EATCS Monographs on Theoretical Computer Science, 1989


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 400-403

Computations of Gas Turbine Blades Heat Transfer Using Two-Equation Turbulence Models

Abdul Hafid M. Elfaghi† and Ali Suleiman M. Bahr Enel
Aeronautical Engineering Department, Engineering Academy Tajoura, Tajoura, Libya

Received 8 June, 2004; accepted in revised form 25 September, 2004

Abstract: Turbine blade heat transfer is a very important engineering problem characterized by high turbulence levels and very complex flow fields. The prediction of turbine blade heat transfer is very important, especially when the turbine inlet temperature increases. This paper is about numerical predictions of flow fields and heat transfer to gas turbine blades using different two-equation turbulence models. Four two-equation turbulence models were used: the standard k-ε model, the modified Chen-Kim k-ε model, the RNG model and the Wilcox standard k-ω turbulence model. These models are based on the eddy viscosity concept, which determines the turbulent viscosity through time-averaged Navier-Stokes differential equations. The simulation was performed at the Aerospace Engineering Department, University Putra Malaysia (UPM), using the general-purpose computational fluid dynamics code PHOENICS, which solves the governing fluid flow and heat transfer equations. An H-type, body-fitted-coordinate (BFC) grid is used and upstream with downstream periodic conditions specified. The results are compared with the available experimental measurements obtained from a research carried out at the Von Karman Institute of Fluid Dynamics (VKI). A comparison between the turbulence models and their prediction of heat flux on the blade is carried out.

Keywords: CFD, turbine, heat transfer, turbulence models, PHOENICS

1. Introduction

The development of high performance gas turbines requires high turbine inlet temperatures that can lead to severe thermal stresses in the turbine blades, particularly in the first stages of the turbine. Also, the efficiency of a gas turbine depends strongly on the momentum and heat transfer characteristics of the turbulent boundary layers developed on the blade surfaces. Therefore, a major objective of gas turbine design is to determine the thermal and aerodynamical characteristics of the turbulent flow in the turbine cascade. The two-equation turbulence models have been and will likely continue to be among the most widely used turbulence models for flow predictions in engineering applications. In particular, the k-ε model is the most popular in commercial software.

2. Governing Equations

The averaged continuity equation (1), momentum equation (2) and energy equation (3) for a compressible fluid, which are obtained by averaging the instantaneous equations, can be written as follows:

$$\frac{\partial \rho}{\partial t} + \frac{\partial (\rho u)}{\partial x} = 0 \qquad\qquad (1)$$

† Email: [email protected]


(2) (3)

The heat flux, q, is given by Fourier's law:
$$q = -k\,\frac{\partial T}{\partial x} = -C_p\,\frac{\mu}{Pr}\,\frac{\partial T}{\partial x} \qquad\qquad (4)$$
The heat flux is treated according to the Reynolds analogy. First the heat flux is divided into laminar and turbulent parts,
$$q = q_{lam} + q_{turb} \qquad\qquad (5)$$

3. Turbulence Models

A turbulence model can be described as a set of relations and equations needed to determine the unknown turbulent correlations that have arisen from the averaging process. In the two-equation models, the Reynolds stresses are given by Boussinesq's assumption:
$$-\overline{u_i u_j} = \nu_t\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right) - \frac{2}{3}\,\delta_{ij}\left(\nu_t\,\frac{\partial u_k}{\partial x_k} + k\right) \qquad\qquad (6)$$
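As a small illustration of the eddy-viscosity idea behind equation (6), the sketch below evaluates the turbulent viscosity of the standard k-ε model, ν_t = C_μ k²/ε with C_μ = 0.09, and the resulting Boussinesq shear stress for a simple 1-D shear layer. The numerical inputs are assumed, illustrative values, not data from the paper.

```python
def eddy_viscosity_k_epsilon(k, eps, c_mu=0.09):
    """Standard k-epsilon eddy viscosity nu_t = C_mu * k^2 / eps."""
    return c_mu * k * k / eps

def turbulent_shear_stress(k, eps, dudy):
    """Boussinesq shear stress -u'v' = nu_t * du/dy for a thin shear layer."""
    return eddy_viscosity_k_epsilon(k, eps) * dudy

# assumed values: k [m^2/s^2], eps [m^2/s^3], du/dy [1/s]
print(turbulent_shear_stress(k=1.5, eps=250.0, dudy=5.0e3))
```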

4. Simulation Procedure

The turbine blade chosen for this work is a transonic highly loaded cascade from the von Karman Institute of Fluid Dynamics (VKI). Flow and heat transfer experimental measurements at many different operating conditions and for different Mach and Reynolds numbers are available in Arts et al. [1]. The case chosen here is referred to as Mur218. It has an inlet free stream turbulence level of 4%, an outlet Mach number of 0.76 and a chord Reynolds number based on the outlet conditions of 10^6. A two-dimensional grid, non-orthogonal to the x- and z-co-ordinates, was laid over the flow domain. A fine grid of 125 cells in the streamwise direction was incorporated with 50 cells in the pitchwise direction using the PHOENICS input language (PIL). The grid is made to be fine near the wall surface and the inlet and outlet boundaries. An H-type, body-fitted co-ordinate (BFC) grid was used and upstream and downstream periodic conditions were specified. Cyclic boundary conditions were used at the upstream and downstream regions. The turbulence models are applied using their corresponding commands and boundary conditions. The computational domain and grid system are shown in Figure 1.

Figure 1: Turbine blade H-grid.


5. Grid Independence and Convergence

In order to assess the grid independence of the results, a grid resolution study was undertaken. A grid of 125 cells in the chordwise direction and 50 in the pitchwise direction was found to be sufficient for this type of flow. A grid system of 210×70 was tested and it was found that the results obtained by both grid systems are close to each other. The time taken by the 125×50 grid is reduced by 40% compared with the 210×70 grid.

6. Results and Discussion

Figure 2 shows that the predicted surface pressure distributions along the blade surface are in good agreement with the experimental data. A strong pressure gradient is observed along the suction side: the pressure falls rapidly up to S/S_max = 0.5 and then increases up to the trailing edge. Figure 3 shows the velocity field along the blade surface. On the suction side, the flow is accelerated up to s/S_max = 0.5 and then decelerated up to the trailing edge. On the pressure side, the flow is slightly accelerated from the leading edge up to s/S_max = 0.6 and then strongly accelerated.

Figure 2: Pressure distribution

Figure 3: Mach number distribution

Figure 4 shows the predicted wall heat flux distribution on the suction side compared with the measurement results of Arts et al. (1990). On the initial part of the blade, the flow is laminar and the acceleration leads to a decrease of the heat transfer rate. The standard model starts well, but then gives an over-prediction of heat transfer. The Chen-Kim and RNG models capture the overall behaviour quite well, although the predicted heat transfer level is a bit too high. The Wilcox standard k-ω model works better in the leading edge region. Figure 5 shows the wall heat transfer for the pressure side. The results agree well with the measurements. There are still problems at the leading edge, where the predicted heat transfer is too high.


Figure 4: Wall heat flux (suction side)

Figure 5: Wall heat flux (pressure side)

7. Conclusion

The following points could be made as concluding remarks of this work:
1. All models give good heat transfer predictions for the pressure side, except close to the leading edge.
2. The standard model slightly under-predicts the heat flux on the pressure side, whereas the Chen-Kim and RNG models give high values.
3. All k-ε models generate too high turbulence levels in the stagnation point region, which gives rise to excessive heat transfer rates.
4. The Wilcox standard k-ω model gives better heat transfer predictions in the leading edge region.

Acknowledgements

This work was carried out at the Department of Aerospace Engineering, University Putra Malaysia (UPM), which is gratefully acknowledged.

References [I] T. Arts., Lmbert M. Rouvriot and A. W. Rutherford, "Aero-thermal investigation of high loaded transonic linear turbine guide vane cascade", Technical report 174, Von Karman Institute for Fluid Dynamics. (1990) [2] Y.S. Chen and S.W.Kim, "Computation of turbulent flows using an extended k- e turbulence closure model", NASA CR-179204 ( 1987) [3] J. Larson, L. Eriksson and U. Hall, "External Heat Transfer Prediction in Supersonic Turbine Using the Reynolds Averaged Navier-Stokes Equations", 12th ISABE conference Melbourne, vol. 2, pp. 1102-1112. (1995) [4] B.E.Launder. and D.B. Sharma, "The Numerical Computation of Turbulent Flow", Methods in Appl. Mech. & Eng., Vol.3, pp.269 (1974) [5] V. Yakhot and S.A. Orszag, "Renormalization group Analysis of Turbulence ", Journal of Sci. Comput., Vol.l, pp.3. (1996)


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 404-405

Analytical Methods in Computer Modelling of Weakly Bound Hydrocarbon Radicals in the Gas Phase

Marek Narożnik¹
Faculty of Sciences, Department of Computing, University of Podlasie, 08-110 Siedlce, Poland

Jan Niedzielski
Chemistry Department, Warsaw University, 02-089 Warsaw, Poland

Received 5 July, 2004; accepted in revised form 8 August, 2004

Abstract: The analytical forms for interaction potentials were used to estimate the matrix elements that are necessary to use the perturbation calculus in investigations of the weakly bound hydrocarbon radicals.

Keywords: Perturbation calculus, matrix elements, Schrödinger equation

Mathematics Subject Classification: AMS-MOS: 68-04, 65D05, 65D07, 65D10, 81Q15

1. Theory

An analytical form of the potential interaction energy of hydrocarbon molecules and radicals at long distances is developed. For instance, the interaction energy of methyl radicals can be given in the form [1]
$$V = \varepsilon_0 - \varepsilon_1\cos^2\theta_1 - \varepsilon_2\cos^2\theta_2 - A\,e^{-r/\rho}\left[\,f_1(r)\cos\theta_{12} - f_2(r)\cos\theta_1\cos\theta_2\,\right]^2 \qquad (1)$$
where the parameters ε₀, ε₁, ε₂, f₁, f₂ are spherically symmetrical functions of the distance r between the mass centers of the reactants, while cosθ₁, cosθ₂, cosθ₁₂ are the directional cosines for mutual orientations of the reactants. Using this analytical form we can readily estimate the matrix elements [2]
$$V_{nm} = \int \Psi_n^*\, V\, \Psi_m\, d\tau \qquad\qquad (2)$$
where dτ is the element of volume while Ψ_n and Ψ_m are the wave functions of the Hamiltonian of the system associated with the rotational states. Using formulae (1) and (2) we can effectively estimate the energy levels of the interacting radicals, E_n. To this end the Brillouin-Wigner series is needed [2,3]. The adiabatic curves that are obtained, E_n(r), reveal the presence of loosely bound physicochemical states such as CH₃···CH₃ [4] and CH₃···H [5]. The prerequisite for the existence of such states is associated with the number l that quantizes the orbital momentum of the reactants: l must be smaller than a limiting value; for instance, in the case of the methyl radical interaction this limit equals 39 [4]. The

¹ Corresponding author: Faculty of Sciences, University of Podlasie, 08-110 Siedlce, Poland. E-mail address: [email protected]


energy levels E0 (r) can also be used to estimate the mean radical interaction potential in the hard sphere approximation. [6]

2. Modelling of radical recombinations

The latter method was used to study the following recombinations: CH₃ + H, CH₃ + CH₃, allyl + allyl, tert-C₄H₉ + tert-C₄H₉ [6, 7]. The presence of loosely bound states can explain the reasons for the negative temperature dependence of the recombination rate constants, k. For the system CH₃ + CH₃ we obtain the dependence k ∼ T^(−1/2), while for the system tert-C₄H₉ + tert-C₄H₉ the dependence k ∼ T^(−3/2) is obeyed. Both are in perfect agreement with experiment. The statistical sums for the bound states of radicals that are indispensable in such calculations are estimated with the use of spline function interpolations. The existence of loosely bound states in radical recombination can throw some light on the little known process of recombination rate constant decrease under very high pressures [8]. The above mentioned methods were used to study many systems, most recently the interactions and the bound states of recombining allyl radicals [9].

References [I] M.Naro:Znik and J.Niedzielski, Analytical form of the interaction energy of radicals at short and long distances, Theoretica Chimica Acta 94 257-269 (1996). [2] M.Naro:Znik and }.Niedzielski, Energy levels of the weakly interacting radicals, Theoretica Chimica Acta 94 271-285 (1996). [3] A.G.Makhaneck and W.S.Korolkov, Analytical Methods in Quantum Mechanical Theory of Interactions. Nauka y Technika, Minsk, 1982 (in Russian). [4] M.Naro:Znik, Recombination of radicals in the high-pressure and high temperature limit. Part I Reaction CH 3 + CH 3. Journal of Chemical Society, Faraday Transactions 94 2531-2539 ( 1998). [5] M.Naro:Znik and J.Niedzielski, Recombination of radicals in the high-pressure and high temperature limit. Part 2 Reaction CH 3 +H. Journal of Chemical Society, Faraday Transactions 94 2541-254 7 ( 1998). [6] M.Naro:Znik, Interactions between isobutane molecules and between tert-butyl radicals estimated onb the basis of second virial coefficients. Journal of Molecular Structure (Theochem) 624 267278 (2003). [7] M.Naro:Znik and J.Niedzielski, Recombination of tert-butyJ radicals: the role of weak van der Waals interactions. Theoretical Chemistry Accounts 108 103-112 (2002). [8] A.G.Zawadzki and J.T.Hynes, Radical recombination rate constants from gas to liquid phase. Journal of Physical Chemistry 93 7031-7036 ( 1989). [9] M.Naro:Znik and J.Niedzielski, to be published


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 406-409

A Numerical Scheme for a Shallow Water Equation

D.G. Natsis¹, A.G. Bratsos and D.P. Papadopoulos
Department of Mathematics, Technological Educational Institution (T.E.I.) of Athens, GR-122 10 Egaleo, Athens, Greece.

Received 30 July, 2004; accepted in revised form 25 August, 2004

Abstract: A finite-difference method is presented for the solution of the two-dimensional Boussinesq-type set of equations as those were introduced by Madsen et al [6] in the case of a constant-depth environment. The application of the finite-difference scheme results in an initial value problem and it is proposed the unknown quantities to be calculated implicitly by solving a linear system of equations. The numerical treatment of the system is briefly discussed, while numerical results are the subject of a following work.

Keywords: Boussinesq model; wave breaking; finite-difference method.

Mathematics Subject Classification: 65M06, 35L05, 35L45
PACS: 02.60.Lj

1 Introduction

A finite-difference scheme is proposed for the solution of the two-dimensional Boussinesq-type set of equations in the case of constant depth environment. Following physical laws (see for example Madsen et al [6] etc.) the problem in its preliminary form is constructed by the following system of nonlinear equations. The application of the finite-difference scheme results in an initial value problem and it is proposed a method of implicit calculation of the unknown functions by solving a linear system of equations.

(1)

=

-~ ~ (_!!!!__)~ P) ox (L)h+w oy h+w ox [J'(w) (ex- _ h+w

2

]

(2)

¹ Corresponding author. E-mail: [email protected]


(3)

We are working in the region 0 = { (x, y); L~ < x < L~, L~ < y < L~} for t > to, where w = w (x, y, t) is the surface elevation, p = p (x, y, t) and q = q (x, y, t) are the density of the flow to the directions x and y respectively, h = h (x, y) the depth of the bottom of the sea, d = d (x, y) is the instantaneous depth with d = h + w, /5 = /5 (x, y, t) is the thickness of the surface roller, (ex, cy) the components of the roller celerity, which

B=

B

8(w)

=

81 (1 - h~w),

and B is the dispersion coefficient in

+ ~· 2

The numerical solution

To obtain a numerical solution the region R = n x [t >to] with its boundary 8R consisting of the lines x = L~, L~, y = L~, L~ and t = to, is covered with a rectangular mesh, G, of points with coordinates (x, y, t) = (xk. Ym, tn) = (L~ + khx, L~ + mhy, to+ nf) with k, m = 0, 1, ... , N + 1 and n = 0, 1, ... , in which hx = (L~- L~) I (N + 1) and hy = (L~- L~) I (N + 1) represent the discretization into N + 1 subintervals of the space variables, while f represents the discretization of the time variable. The solution for the unknown functions w, p and q of an approximating finite-difference scheme at the same point will be denoted by w';: m, p';: m and q';: m respectively, while for the purpose of analyzing stability, the numerical value of actu~lly obtain~d (subject, for instance, to computer round-off errors) will be denoted by w';: m, fi'k m and ii'k m. The solution vectors will be ' ' '

wn

=

w(tn)

=

[w~. 0 ,w~,l, ... ,w~.N+l; w~.o,wf,I, ... ,wf,N+I; n

n

n

.. ; WN+I,O,WN+I,I• ... ,WN+I,N+I

]T '

(4)

Pn = P (tn) = [P~.o,PK!, ... ,pKN+I; P~.o,P~,I, ... ,p~.N+I; n n n ]T

(5)

qn = q (tn) = [qQ',o, Qo,l, ... , Qo,N+I; q?,o, q?,1, ... , q?,N+I; n n n ]T

(6)

.. ; PN+I,O• PN+I,I, ... , PN+I,N+I

.. ; qN+I,O• qN+I,I' ... , qN+I,N+I

,

·

T denoting transpose. Then there are (N + 2) 2 values to be determined at each time step.

2.1 The boundary conditions

In Eqs. (1)-(3) the boundary conditions will be assumed to be of the form U

and

(L~ -

hx, y, t)

::::J

~ [ -3u (L~, y, t) + 6u (L~ + hx, y, t) -

U

(L~ + 2hx, y, t)] ,

u(L~+hx,y,t)::::J~[-3u(L~,y,t)+6u(L~-hx,t)-u(L~-2h,y,t)].

(7) (8)

408

D. G. Natsis, A. G. Bratsos and D. P. Papadopoulos

2.2 The finite-difference scheme

Replacing the space derivatives in Eqs. (1)-(3) with appropriate finite-difference approximations we obtain the following system in matrix-vector form

w'(t) = F_1( p(t), q(t) ),   (9)

A_1 p'(t) + A_2 q'(t) = F_2( w(t), p(t), q(t) ),   (10)

A_3 p'(t) + A_4 q'(t) = F_3( w(t), p(t), q(t) ),   (11)

where

w'(t) = dw(t)/dt,   p'(t) = dp(t)/dt   and   q'(t) = dq(t)/dt,   (12)

which finally leads to the following initial value problem

w'(t) = F̃_1,   (13)

A_1 p'(t) + A_2 q'(t) = F̃_2,   (14)

A_3 p'(t) + A_4 q'(t) = F̃_3,   (15)

with w(0) = w_0, p(0) = p_0, q(0) = q_0 and t > 0, in which A_i; i = 1, 2, 3, 4 are block matrices of order (N + 2)². This system will give rise to a numerical method for the evaluation of the unknown functions w, p and q. The method is under investigation and the results will be the subject of a following work.
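The time integration itself is deferred to that later work; purely as an illustration of how the block system (13)-(15) could be handled, the sketch below solves for (p'(t), q'(t)) with a sparse factorization and advances all unknowns by one step of length ℓ (SciPy assumed; F̃_1, F̃_2, F̃_3 are user-supplied callables and A_1, ..., A_4 pre-assembled sparse blocks — this is an illustration, not the authors' scheme).

```python
import numpy as np
from scipy.sparse import bmat
from scipy.sparse.linalg import splu

def advance(w, p, q, ell, A1, A2, A3, A4, F1, F2, F3):
    """One explicit step of length ell for the system (13)-(15).

    A1..A4     : sparse blocks of order (N+2)^2
    F1, F2, F3 : callables returning the right-hand sides at the current state
    """
    A = bmat([[A1, A2], [A3, A4]], format="csc")
    lu = splu(A)                                  # factor the block matrix
    rhs = np.concatenate([F2(w, p, q), F3(w, p, q)])
    dpq = lu.solve(rhs)                           # stacked (p'(t), q'(t))
    n = p.size
    return (w + ell * F1(p, q),                   # advance w, p, q in time
            p + ell * dpq[:n],
            q + ell * dpq[n:])
```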

Acknowledgment

This research was co-funded 75% by the European Union and 25% by the Greek Government under the framework of the Education and Initial Vocational Training Program - Archimedes, Technological Educational Institution (T.E.I.) of Athens project 'Computational Methods for Applied Technological Problems'.

References

[1] A. Bayram and M. Larson, Wave transformation in the nearshore zone: comparison between a Boussinesq model and field data, Coastal Eng. 39 149-171 (2000).

[2] S. Beji and K. Nadaoka, A formal derivation and numerical modelling of the improved Boussinesq equations for varying depth, Ocean Eng. 23 691-704 (1996).

[3] J. Larsen and M. Dancy, Open boundaries in short wave simulations - a new approach, Coastal Eng. 1 285-297 (1983).

[4] P. A. Madsen, R. Murray and O. R. Sørensen, A new form of the Boussinesq equations with improved linear dispersion characteristics (Part 1), Coastal Eng. 15(4) 371-388 (1991).

[5] P. A. Madsen and O. R. Sørensen, A new form of the Boussinesq equations with improved linear dispersion characteristics, Part 2: A slowly-varying bathymetry, Coastal Eng. 18(1) 183-204 (1992).

[6] P. A. Madsen, O. R. Sørensen and H. A. Schäffer, Surf zone dynamics simulated by a Boussinesq type model. Part I. Model description and cross-shore motion of regular waves, Coastal Eng. 32 255-287 (1997).

[7] K. Nadaoka and K. Raveenthiran, A phase-averaged Boussinesq model with effective description of carrier wave group and associated long wave evolution, Ocean Eng. 29 21-37 (2002).

[8] F. S. B. F. Oliveira, Improvement on open boundaries on a time dependent numerical model of wave propagation in the nearshore region, Ocean Eng. 28 95-115 (2000).

[9] D. H. Peregrine, Long waves on a beach, J. Fluid Mech. 27 Part 4 (1967).

[10] H. A. Schäffer, P. A. Madsen and R. Deigaard, A Boussinesq model for waves breaking in shallow water, Coastal Eng. Conf. 547-566 (1980).

[11] Z. L. Zou, High order Boussinesq equations, Ocean Eng. 26 767-792 (1999).


Lecture Series on Computer and Computational Sciences

Volume 1, 2004, pp. 410-413

A Time-Dependent Product Formula and its Application to an HIV Infection Model

S. Oharu 1, Y. Oharu 2, D. Tebbs 3

1,3 Department of Mathematics, Chuo University, Tokyo, Japan
2 Department of Physics and Astronomy, Arizona State University, Tempe, Arizona, U.S.A.

Received 13 August, 2004; accepted in revised form 29 August, 2004

Abstract: In order to obtain a comprehensive form of mathematical models describing nonlinear phenomena such as HIV infection processes, a general class of time-dependent evolution equations is introduced in such a way that the nonlinear operator is decomposed into the sum of a relatively good operator and a perturbation which is nonlinear in general and satisfies no global continuity condition. An attempt is then made to combine the implicit approach (usually adapted for convective diffusion operators) and the explicit approach (more suited to treating continuous-type operators representing various physiological and reaction processes), resulting in a semi-implicit product formula. Decomposing the operators in this way and considering their individual properties, it is seen that approximation-solvability of the original equation is verified under suitable conditions. Once appropriate terms are formulated to describe treatment by antiretroviral therapy, the time dependence of the reaction terms appears, and such a product formula is useful for generating approximate numerical solutions to the original equation. With this knowledge, a continuous model for HIV infection processes is formulated and physiological interpretations are provided. The abstract theory is then applied to show existence of unique solutions to the continuous model describing the behavior of the HIV virus in the human body and its reaction to treatment by antiretroviral therapy. The product formula suggests appropriate discrete models describing the HIV infection mechanism and allows numerical simulations to be performed on the basis of the model of the HIV infection processes. Finally, the results of our numerical simulations are visualized, and it is observed that our results agree with medical aspects in a qualitative way and on a physiologically fundamental level.

Keywords: Mathematical model, time-dependent evolution equation, semi-implicit product formula, continuous model, discrete model, HIV infection processes, antiretroviral therapy.

Mathematics Subject Classification: 35K57, 47H14, 47J35, 58D25, 62P10, 92C55

1 An HIV infection model

The model given here is based upon the original models treated in [2] and [3]. We here make an attempt to present a mathematical model such that the effects of treatment are taken into account and the suggested mathematical algorithms are used to address practical clinical issues. Once this form of treatment by antiretroviral therapy is introduced into the model, time-dependence of the reaction terms appears and the equations take the form stated below:

1 Corresponding author. E-mail: [email protected]


∂_t u = d_1 Δu + b_1(t)·∇u + S − αu + p_1 u (w_s + w_r) − ( η_1(t) γ_s w_s + γ_r w_r ) u,

∂_t v_s = d_2 Δv_s + b_2(t)·∇v_s + η_1(t) γ_s w_s u − β v_s − p_2 v_s (w_s + w_r),   (HIV)

together with corresponding equations for v_r, w_s and w_r.

The function u represents the density of healthy cells, as yet uninfected by the virus. v_s and v_r are the densities of infected cells that are sensitive and resistant, respectively, to the treatment. Finally, w_s and w_r represent the concentration of the virus itself, again divided into sensitive and resistant strains. All cells have a natural mortality, represented by the constants α and β, and we assume that there exists a supply of healthy cells described by S. The supply term S depends on the amount of virus present and takes the form S = S_0 − S*(w_s + w_r)/(B_s + w_s + w_r) ≤ S_0. The virus also decays in the presence of the immune system, and the terms −δuw_s and −δuw_r account for this. The infection mechanism is described by means of the constants γ_s and γ_r, which specify the relative proportions of sensitive and resistant cells that are produced when infection occurs. Treatment is represented using the functions η_1 and η_2. These restrict both the reproduction of treatment-sensitive infected cells and the regeneration of the HIV virus. Saturation terms p_k, k = 1, ..., 4 are given by

p_k = p_k* / ( c_k + w_s + w_r ),   k = 1, ..., 4,      p_5 = g_r(w_s + w_r) / ( B + w_s + w_r ).

The constants p_k* and c_k stand for the first and second saturation constants, respectively. In this model it is assumed that the external input of resistant virus from the lymphoid compartment is specified by the threshold function g_r(w): g_r(w) = 0 if w is less than the threshold w_0 and g_r(w) = p_4* if w ≥ w_0. This means that the capacity of the resistant virus to establish itself requires that the total population remains above the threshold w_0. It should be noted that p_k·(w_s + w_r) ≤ p_k* for k = 1, 2, 3, 4, when w_s, w_r ≥ 0. The treatment functions η_1(·) and η_2(·) are given as decaying exponential functions of t.

Theorem. For any v ∈ D there exists a strong generalized solution u(·) to (HIV). Moreover, for v having component functions taking nonnegative values almost everywhere, the strong solution u(·) is also nonnegative in the same sense, for all time t.
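The semi-implicit product formula treats the "good" diffusion-convection part implicitly and the nonlinear reaction part explicitly in each fractional step. The sketch below illustrates the idea on a one-dimensional caricature of the u-equation of (HIV): the grid, boundary treatment and reaction term are illustrative assumptions, not the authors' discretization.

```python
import numpy as np

def product_formula_step(u, react, d, dx, dt):
    """One semi-implicit step: explicit reaction, then implicit diffusion.

    u     : values on a uniform 1-D grid (zero-flux ends assumed)
    react : callable giving the nonlinear reaction term f(u)
    """
    n = u.size
    u_star = u + dt * react(u)                   # explicit reaction part
    r = d * dt / dx**2                           # implicit diffusion part:
    A = np.zeros((n, n))                         # (I - dt*d*Laplacian) u_new = u_star
    np.fill_diagonal(A, 1.0 + 2.0 * r)
    A[np.arange(n - 1), np.arange(1, n)] = -r
    A[np.arange(1, n), np.arange(n - 1)] = -r
    A[0, 1] = A[-1, -2] = -2.0 * r               # mirrored ghost nodes (zero flux)
    return np.linalg.solve(A, u_star)

# toy reaction standing in for the source/decay terms of the u-equation
u = np.full(50, 0.5)
u = product_formula_step(u, lambda v: 0.1 - 0.05 * v, d=1e-3, dx=0.1, dt=0.1)
```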

3 Numerical simulation

The primary purpose of the numerical simulation is to analyze changes in time of the levels of virus and rates of disease progression, the rates of viral turnover, the relationship between immune system activation and viral replication, and the time to development of drug resistance. The CD4+ T cell count is understood to be the best indicator of the state of immunologic competence of the patient with HIV infection. Our HIV infection model encompasses a spectrum ranging from an acute syndrome associated with primary infection to a prolonged asymptomatic state to advanced disease. An attempt has been made to simulate the following four stages of the HIV infection processes: the early and later stages of HIV infection, the median and most crucial stage after antiretroviral therapy has started, and the state in the case that treatment stops. More detailed and other important physiological aspects will be taken into account in subsequent studies. Figure 1 illustrates the primary stage after infection has been established. With no treatment, infection is taking place. It is observed that treatment-sensitive cells are more prolific than resistant cells. Figure 2 shows the later stage of infection. Treatment-resistant cells are still not as prolific as sensitive cells. This eventually evolves into a stable state. As mentioned in Section 2, it is inferred that mutation of sensitive virus occurs in this stage after the establishment of chronic infection. Figure 3 depicts a state after antiretroviral therapy has started. A stable state is quickly reached, in which the resistant infected cells appear to be extremely dominant. The treatment seems to be effective in the sense that it results in the almost complete destruction of sensitive cells, but the resistant cells remain without competition from the immune system. Figure 4 suggests a state after treatment has been stopped. This result is obtained by means of the same model under the assumption that treatment stops at a specific time. Once treatment stops, resistant cells are still present and drive a rapid increase of sensitive cells. Thus it is likely to reach a state in which both types of cells coexist, and this may cause further treatment to be less effective. These results of our numerical simulations agree with medical aspects. A critical question is when to


Figure 1: Infection without treatment

Figure 2: Later stage of infection

Figure 3: After antiretroviral therapy starts

Figure 4: Treatment stops


start antiretroviral therapy in asymptomatic patients who are infected with the HIV virus [1], and our mathematical approach is expected to help in decision-making about this question. It is best to regard HIV disease as beginning at the time of primary infection and progressing through various stages. Active virus replication and progressive immunologic impairment occur throughout the course of HIV infection in most patients. Our next plan is to develop a new extended model such that all of these aspects are taken into account.

References

[1] Fauci et al. in: E. Braunwald et al., Harrison's Principles of Internal Medicine, 15th Edition, McGraw-Hill, New York, pp. 1852-1913.

[2] D. E. Kirschner and G. F. Webb, Resistance, Remission, and Qualitative Differences in HIV Chemotherapy, Emerging Infectious Diseases, Vol. 3, No. 3, July-September 1997, pp. 273-283.

[3] D. Tebbs, On the product formula approach to a class of quasi-linear evolution equations, Tokyo J. Math., to appear.

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 414-418


Multi Level Decision Support in a Soccer Ticket Club Call Centre with the Use of Simulation

Panayiotou N.A. 1, Evangelopoulos N.P. 2, Ponis S.T. 3

Department of Mechanical Engineering, Sector of Industrial Management and Operational Research, National Technical University of Athens, 15780 Zografos, Athens, Greece

Received 6 August, 2004; accepted in revised form 21 August, 2004

Abstract: This paper presents the results of a study carried out in the call centre of a ticket-selling company. The general purpose of this project was the improvement of the service quality in an environment of highly variable demand. Simulation was initially applied to analyze the present operation of the call centre and extract conclusions concerning its performance. Thereafter, it was used as a support tool for the evaluation of various technological solutions and for the redesign of the call centre's organizational structure. In addition, the staffing needs, the number of trunk lines and the agents' schedules for the following year's anticipated demand were estimated by simulation.

Keywords: Call Centre, Simulation, Decision Support, Case Study

Mathematics Subject Classification: 00A72, 37M05, 68U20

1. Introduction

Customer service using the phone has become a very important issue during the last few years. According to the Gartner Group [1], more than 70% of business transactions take place over the telephone. As a result, the presence of call centers in the economic life of our society is indispensable for doing business. A call centre is defined as any group of employees whose principal business is talking on the telephone to customers or prospects [2]. The decisions involved in the successful operation of a call centre are both strategic and tactical. The most important and difficult decision that has to be made by the Top Management is the level of customer service that will be offered compared with the necessary resources, both human and technological, to achieve this level of service. The high cost of the trained

1 Corresponding author. Lecturer at the National Technical University of Athens. E-mail: [email protected] 2 Corresponding author. Mechanical Engineer, NTUA, Greece. E-mail: [email protected] 3 Corresponding author. Research Engineer at the National Technical University of Athens. E-mail: [email protected]


call centre workforce, which can amount to up to 65% of the overall budget [1], makes the decision even more difficult. On the other hand, the Middle Level Management has to utilize the available resources in the best possible way, no matter how complicated and dynamic the operation of the call centre may be, in order to satisfy the strategic constraints imposed by the Top Management. Therefore, the success or failure of a company's customer relationships depends on whether the call center management team coordinates all the resources and the technologies available in service of the company's strategic goals [3]. Call centers have relied historically on Erlang-C based estimation formulas to help determine the number of agent positions and queue parameters [4]. These estimators have worked fairly well in traditional call centers; however, recent trends such as skill-based routing, electronic channels and interactive call handling demand more sophisticated techniques [5]. Discrete event simulation provides the necessary techniques to gain insight into these new trends, helping to shape their current and future designs [6]. Discrete event simulation is a proven methodology that allows all of the activities, resources, business rules, workload, assumptions, and other characteristics of a process to be incorporated into one model, and the impact of changes in assumptions or other elements on the behavior and performance of the process to be tested. Virtually any performance criterion can be examined with simulation [7]. This paper presents a case study demonstrating the use of discrete event simulation in the call center of a company operating in the market of athletic tickets, aiming at the improvement of Management's decision making at both the strategic and tactical levels. The peculiarities of the sector are discussed and specific propositions are made for the improvement of decision making with the use of a simulation-based software tool.
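For reference, the Erlang-C estimate mentioned above gives the probability that an arriving call has to wait when c agents carry an offered load of a erlangs, and a service level can be derived from it; the implementation below is the standard textbook formula, not material from the paper.

```python
from math import factorial, exp

def erlang_c(a, c):
    """Probability of waiting in an M/M/c queue with offered load a (erlangs)
    and c agents; requires a < c for a stable queue."""
    num = a**c / factorial(c) * c / (c - a)
    den = sum(a**k / factorial(k) for k in range(c)) + num
    return num / den

def service_level(a, c, t, mean_service):
    """Fraction of calls answered within time t (same units as mean_service)."""
    return 1.0 - erlang_c(a, c) * exp(-(c - a) * t / mean_service)

# e.g. 30 calls/hour with 6-minute calls (a = 3 erlangs), 5 agents,
# target answer time of 30 seconds:
print(service_level(3.0, 5, t=0.5, mean_service=6.0))
```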

2. The Case Study

The company under study is a small-sized unit operating in the market of athletic tickets and specifically in the soccer market. Although this is a common application in the USA and in many European countries, in Greece this is the first such venture. The field of selling soccer tickets via call centers and the internet is a growing market and, especially after the recent success of the Greek national soccer team, the prospects are extremely promising. The company collaborates with major Greek soccer clubs and administrates part of, and occasionally the whole number of, their available tickets. Its main target is to offer high quality services in order to attract social groups who typically abstain from such events. For that purpose, the first step of the company was the development of a web-based application which enables its customers to buy tickets via the internet, and its agents to sell tickets via a call centre and a sales branch. Consequently the organizational framework of the company was built around the information system, forcing the service processes in the call centre to become of secondary priority. The ticket market is characterized by extravagant variation of the demand, which depends on several factors, some visible and others not. Every soccer match acts as a trigger event causing customers to contact the three service points (the call centre, internet and sales branch). The volume of the contacts and their distribution throughout a period of time or among the three service points vary, depending on factors such as the popularity of the event, the general performance of the soccer club, the weather etc. Having to deal with a plethora of different events, some of them scheduled and others not, the demand figures over a year show various periods of average and high demand and even periods of almost no demand. Trying to provide a high level of service under these circumstances, without inflating the operational cost, is a tricky and challenging venture.

3. Developing a Flexible Model

The methodological approach which was applied during the project consisted of three major steps:
• Process analysis and data gathering
• Modeling and verification
• Simulations and output analysis
The processes were recorded along with observations, after conducting interviews with the managers, the call centre supervisor and the agents. The data collection for the call centre was relatively easy, as there was a call recorder system and also a database including the customers' requests. On the other hand there was inadequate data concerning the branch and the internet; therefore, they were excluded from the simulation study. During this phase it was revealed that the design of the call centre was far from that of a typical call centre. No type of automation was installed apart from the computers and the phones. Furthermore, the number of trunk lines was equal to the number of vacancies, leaving


occasionally nil queue space. Some other facts which had a serious impact on the modeling process are listed below:
• The same employees had to work in the call centre and the sales branch, which are placed in distant locations. No changes could be made during the day.
• The types of customers' requests were: Information, Member Registration, Ticket Purchase and Complaints, which had different process times.
• In case there was a ticket sale, the agents had to prepare it for dispatch with a courier. That process had a low priority and was interrupted in cases of incoming calls.
Passing on to the next phase, the goal was to develop a model representing the initial operation of the company, validating it and exporting tangible conclusions concerning the service quality. Owing to the particularities of our case, and in order to easily persuade the management about the efficiency of the simulation study, we decided to use detailed modeling techniques in combination with eye-catching animation. The testing runs of the model confirmed the suspicions about the large number of balked customers due to the lack of queue space. The phenomenon was intense, causing the number of customers who were serviced by the clerks to be notably lower than the number of those who attempted to call. In reality the balked customers will probably call again, depending on how much they desire to buy a ticket. But from a quality standpoint this is unacceptable, especially in the case of phone sales. When attempting to model the recall process, difficulties were encountered in the estimation of factors such as the average waiting time before recalling and the customers' "tolerance", which is the average maximum number of attempts customers would make. A series of simulations was executed for a range of the above factors and a typical sample of our conclusions is shown in Figure 1.

Figure 1: Percentage of aborted customers in relation to the customers' tolerance

As the typical performance measurements (i.e. the percentage of customers reaching an agent within a specific time) could not be applied due to the lack of queue space, the recall study proved a useful tool to persuade the management towards organizational changes.
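A minimal Monte-Carlo sketch of the recall behaviour just discussed: callers who find all trunk lines busy are lost immediately (no queue space) and retry after a random delay, up to a given "tolerance" of attempts. The arrival, talk and retry figures are illustrative assumptions, not the company's data.

```python
import heapq, random

def aborted_fraction(lines=6, rate=60.0, mean_talk=4/60, mean_retry=10/60,
                     tolerance=3, horizon=8.0, seed=1):
    """Fraction of callers who give up after exhausting their attempts.
    Times are in hours; blocked calls retry after an exponential delay."""
    random.seed(seed)
    events, busy, aborted, total = [], 0, 0, 0
    t = random.expovariate(rate)
    while t < horizon:                            # seed first-attempt arrivals
        heapq.heappush(events, (t, "arrival", 1))
        total += 1
        t += random.expovariate(rate)
    while events:
        t, kind, attempt = heapq.heappop(events)
        if kind == "departure":
            busy -= 1
        elif busy < lines:                        # a trunk line is free
            busy += 1
            heapq.heappush(events, (t + random.expovariate(1 / mean_talk),
                                    "departure", 0))
        elif attempt < tolerance:                 # blocked: try again later
            heapq.heappush(events, (t + random.expovariate(1 / mean_retry),
                                    "arrival", attempt + 1))
        else:                                     # tolerance exhausted
            aborted += 1
    return aborted / total

print(aborted_fraction(tolerance=2), aborted_fraction(tolerance=4))
```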

4. Strategic Decisions

After the recognition of the quality problems in the operation of the call centre, simulation was used to plan the next strategic movements of the company, and specifically to compare technological solutions. The key point for the success of such studies is the forecast. The management had strong reasons to believe that the demand would increase; nevertheless, uncertainty remained high. Therefore, all the solutions to be examined aimed at increased flexibility and a limited payback period. The major operational cost in call centers, estimated at up to 65% of the total cost, is related to the personnel. After the essential changes, the initial simulation model was used to compare the staffing needs for three scenarios. The first one was the installation of a single call waiting system. All types of calls would still be handled by the agents. The second one suggested the installation of a VoIP (Voice over IP) IVR (Interactive Voice Response) system which would handle part of the calls requiring information. The third scenario consisted of the connection of the above IVR system to the information system of the company, giving the customers the possibility to serve themselves. For each case quality measurements, such as the ASA (Average Speed of Answer) and the percentage of aborted calls, were used as criteria. The scenarios were tested for several volumes of calls and the conclusions are listed below:




• For similar levels of demand compared to those of the previous year, the single IVR solution requires fewer agents than the call waiting system and is a viable investment. The payback period is estimated to be less than two years.
• The second IVR solution is strongly dependent on the customers' acceptance of the automated services. Generally, for it to be viable it requires a significantly increased volume of calls.
In addition, sensitivity analysis was conducted to verify that changes in variables such as the acceptance of recorded information did not have an impact on the above results. Therefore, the second solution was considered more suitable for the company, enabling it to gradually set up a connection with the information system if this was justified by the number of calls.

5. Tactical Decisions

Tactical decisions are closely related to strategic decisions. Having chosen which technological investment would be utilized, simulation was applied to estimate the staff size on different occasions over a horizon of a year, the number of trunk lines, as well as the schedules of the agents. For the estimation of the staff size, the fact that the management expected an increase in demand of about 50% was taken into account. Under the given circumstances it was concluded that the staff available at present, with the aid of the IVR, would be adequate to offer a high level of service to customers. On peak days the above staff could service the increased volume of calls, but the percentage of reneged customers would increase. Assuming that dissatisfied customers recalled, 80% would be serviced after only one attempt. Concerning the trunk lines, and given the fact that they are an operational cost, their number was estimated so as to eliminate the possibility of balked customers. During the previous year all agents had the same schedule. An alternative schedule was tested in which it was assumed that half the staff would start two hours later every morning. Having no data concerning demand during the two hours after the working schedule, a hypothesis based on the tendency of in-day seasonality was used. The conclusion was that using a rolling schedule would not have any impact on the service level, while the operational hours of the call centre would increase along with the volume of serviced customers.

6. Conclusions and Further Research

Discrete event simulation proved to be an important support tool for decision making at different levels in the first call centre for soccer ticket selling. Having only the essential statistical data available, simulation is a reliable means to extract valuable conclusions about the call centre operation and its service level. Ordinary call centre measurements, such as agent utilization, the average speed of answer and the abandonment rates, are evaluated. In addition, several technological investments, along with alternative operational scenarios, are easily tested by simulation, detecting simultaneously the potential bottlenecks and the critical processes of the operation. Furthermore, the existing model, in combination with a forecast application, can evolve into an operational decision support tool. Decisions such as the agents' days off and their extra working hours, or the necessity of hiring seasonal staff, can be evaluated by simulation experiments. Modern business processes, including the operations in call centers, tend to be more and more complicated. A flexible and easily applied tool is required to analyze them thoroughly, and simulation proved to be an exceptionally efficient solution.

References

[1] Gartner Group, The Corporate Call Center: Much More Than Call Handling, 1996.

[2] Mehrotra, V., Ringing Up Big Business, OR/MS Today, Vol. 24, No. 4, August 1997, pp. 18-24, 1997.

[3] Dawson, K., The Call Center Handbook: The Complete Guide to Starting, Running and Improving Your Call Center, CMP Books, 3rd Edition, Gilroy, CA, 2001.

[4] Bodin M., Dawson K., Call Center Handbook, New York: Flatiron Publishing, Inc., 1996.


[5] Cleveland B., Mayben J., Call Center Management - On Fast Forward, Maryland, Call Center Press, 1997.

[6] Tanir O., Modelling Complex Computer and Communication Systems - A Domain-Oriented Design Framework, New York, McGraw-Hill, 1997.

[7] Sadowski, R.P., Shannon, R.E., Pegden, D.P., Introduction to Simulation: Using SIMAN, McGraw-Hill, 1990.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 419-423

Flow Analysis of Flush Type Intake Duct of Waterjet

W. G. Park 1

School of Mechanical Engineering, Pusan National University, Pusan 609-735, Korea

Department of Naval Architecture and Ocean Engineering, Pusan National University

Received 5 August, 2004; accepted in revised form 20 August, 2004

Abstract: The waterjet propulsion system is widely used to propel high speed marine vessels in excess of 30-35 knots by virtue of high propulsive efficiency, good maneuverability, and less vibration. Since approximately 7-9% of the total power is lost in the intake duct due to flow separation, nonuniformity, etc., a detailed understanding of the flow phenomena occurring within the intake duct is essential to reduce the power loss. The present work solved the 3-D incompressible RANS equations on a multiblocked grid system of the flush type intake duct of a waterjet. The numerical results for pressure distributions, velocity vectors, and streamlines were compared with experiments, and good agreement was obtained for three jet velocity ratios.

Keywords: RANS equations; Waterjet; Intake duct; Flush type; Boundary layer ingestion

1. Introduction

The waterjet propulsion system is widely used to propel high speed marine vessels in excess of 30-35 knots by virtue of high propulsive efficiency, good maneuverability, and less vibration. Also, cavitation can be delayed or reduced by increasing the static pressure at the impeller face through diffusing the cross-sectional area of the intake duct of the waterjet. Besides application to high speed ferries, waterjets are installed in military amphibious vehicles for crossing rivers or landing on the shore. From the aspect of power loss, approximately 7-9% of the total power is lost in the intake duct due to local flow separation, nonuniformity, etc. Thus, a detailed understanding of the flow phenomena occurring within the intake duct is essential to reduce the power loss, as well as noise and vibration. The waterjet intake is typically classified into two types: ram (also called pod or strut) and flush intake. The ram intake is used on hydrofoil craft. The flush intake is widely used on monohulls, planing craft, and catamarans. Although a lot of information about the flush intake exists, most of it is confined to manufacturers and is not open to the public domain. The objective of the present work is to gain detailed flow information on the flush type waterjet intake duct by solving the incompressible RANS equations. The present calculation is also compared with experiments.

2. Governing equations and numerical method

The 3-D incompressible RANS equations in a curvilinear coordinate system may be written as:

∂q/∂t + ∂(E − E_v)/∂ξ + ∂(F − F_v)/∂η + ∂(G − G_v)/∂ζ = 0   (1)

where q = [0, u, v, w]^T / J; E, F, and G are the convective flux terms, and E_v, F_v, and G_v are the viscous flux terms. Eq. (1) is solved by the so-called "Iterative Time Marching Method" [1-3].

1 Corresponding author. Pusan National University. E-mail: [email protected]
2 Coauthors. Pusan National University. E-mail: [email protected], [email protected]

In this method, the


continuity equation is solved by the MAC method and the momentum equation is solved by a time marching scheme. The spatial derivatives of the convective flux terms are differenced with the QUICK scheme. To capture the turbulent flows, a low Reynolds number k-ε model [4] was implemented. The initial condition is set by solving the equation of fully developed duct flow at each cross-sectional plane. At the nozzle exit boundary, the pressure is obtained from p|_{imax} = p|_{imax−1} + Δx (∂p/∂x)|_{imax−1}. The velocity is extrapolated from interior nodes and then weighted by a factor to satisfy mass conservation. On the body surface, the no-slip condition is applied for the velocity components. The surface pressure is determined by setting a zero normal pressure gradient. The turbulent velocity profile in the wind tunnel was given by the 1/7th-power law.
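As an illustration of QUICK differencing for the convective terms, the sketch below interpolates a variable to interior cell faces on a uniform one-dimensional grid with the usual upstream-weighted quadratic (6/8, 3/8, −1/8) stencil; the paper's multiblocked curvilinear implementation is, of course, more involved.

```python
import numpy as np

def quick_face_values(phi, u_face):
    """QUICK interpolation of cell-centred phi to the faces between cells.

    phi    : cell values, shape (n,)
    u_face : velocity at face i (between cells i and i+1), shape (n-1,)
    Uses 6/8 of the upstream-adjacent cell, 3/8 of the downstream cell and
    -1/8 of the far-upstream cell; end faces fall back to central averaging.
    """
    n = phi.size
    face = np.empty(n - 1)
    for i in range(1, n - 2):
        if u_face[i] >= 0.0:        # flow from cell i towards cell i+1
            face[i] = 0.75 * phi[i] + 0.375 * phi[i + 1] - 0.125 * phi[i - 1]
        else:                       # flow from cell i+1 towards cell i
            face[i] = 0.75 * phi[i + 1] + 0.375 * phi[i] - 0.125 * phi[i + 2]
    face[0] = 0.5 * (phi[0] + phi[1])
    face[-1] = 0.5 * (phi[-2] + phi[-1])
    return face
```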

4. Results and discussion

The present method has been applied to the flow within the intake duct of a waterjet, as shown in Figure 1, by carrying out three computations at JVR = 6, 7, and 8. Here, the Jet Velocity Ratio (JVR) is defined as JVR = V_j / V_∞, where V_j is the jet velocity at the nozzle exit and V_∞ the vessel velocity. Figures 2 and 3 show the surface pressure distribution on the ramp and lip sides, compared with the experiment [5]. The digits in Figures 2(a) and 3(a) indicate the locations of the pressure taps. The pressure coefficient is defined as C_p = (p − p_IN) / (ρ V_IN² / 2), where the subscript "IN" denotes the value at the inlet of the duct. These figures show good agreement between the present computation and the experiment. In both figures the pressure rapidly increases from tap numbers 35 and 16, respectively. This is due to the bottleneck effect of the converging nozzle area. This pressure increase becomes more severe as JVR increases. This tendency is expected, because a high JVR means a large mass flow rate through the duct, i.e., a strong bottleneck effect. Figure 4 shows velocity vectors and streamlines at the cross-section of the inlet plane at JVR=8, compared with the PIV measurement [5]. In these figures, the flow is strongly swallowed through the inlet of the duct, even along both sides of the inlet. The vortex induced by the separation along the corner of the side wall of the inlet is also clearly shown. This flow feature has the same tendency as the flow sketch by Allison [6] for the case JVR > 1.0. From Figure 4(c), the present work gives an ingestion streamtube width around 2 times the physical width of the duct inlet. Figure 5(a) shows the two locations, "frame 1 and 2", of the PIV measurements in the vertical plane. Figure 5(b) shows the streamlines in the symmetry plane at JVR=8. In particular, Figure 5(b) shows the stagnation point on the lip side. The position of the stagnation point (or stagnation line in Figure 4(c)) is an important factor in computing the mass flow rate and the performance of the intake duct, because only the flow region above the streamline A-A connecting with the stagnation point enters the intake duct. Therefore, the location of the stagnation point directly affects the mass flow rate of the waterjet. Figures 6 and 7 show streamlines in frames 1 and 2, compared with experiments at JVR=6, 7, and 8. All panels of Figures 6 and 7 show good agreement between the present calculation and the experiment. Figure 8 shows the pressure contours and streamlines at JVR=7. Since the flow fields at the other JVR values are very similar to those of Figure 8, the figures for JVR=6 and 8 are omitted here.

Figure 1: Grid system: (a) whole grid; (b) grid for intake duct.


"'

1" 100 90

JVR6

Experiment --Calculation

80 10

8"

60

_]

50

"

30

..........

20

" 1 !

1

10

12

IB

90

120

JVR7

"'

Expe~ment

100

--Calculation

90

80

tS

22 211

3D 32 SB 4D 41 .C4

(b) atNR=6

(a) Location of Pressure tabs

"' "' ""

20

Pressure tap no.

JVRB

Experiment --Calculation

00

10

a.

10

U"

60

50

"

30 20

20

"

" 1 !I

1

10

12

Ill

20

22 21

30 S2 !IB 4D 41 44

1 9

1

10

12

Pressure tap no.

Ill

20 22 28

30 32 98 40 .tl

«

Pressure tap no.

(c) at NR=7

(d) at NR=8

Figure 2: Pressure distribution on the ramp 120

"' 1

oo

90

JVR6

• Experiment --Calculation

80 10

8-sa

50

30 20

" Pressure tap no.

(b)atNR=6

(a) Location of Pressure tabs 120

"' 100

90

120

JVR7

"'

Experiment --Calculation

90

80

a.

ltoo

u

70 60

50

50

"

"

20

"

Experiment --Calculation

80

70

30

JVRB

30

..

20

" 10

18

Hi

17 2C

10

2 ..

(c) atNR=7

11

lEi

Pressure tap no.

Pressure tap no.

(d) at NR=8 Figure 3: Pressure distribution on the lip



Figure 4: Velocity vectors and streamlines at the inlet (JVR=8): (a) experiment; (b) calculation; (c) streamlines.

Figure 5: Location of PIV experimentation: (a) locations of PIV measurements; (b) streamlines in the symmetry plane.

Figure 6: Streamlines in frame 1 (JVR=8): (a) experiment; (b) calculation.


Figure 7: Streamlines in frame 2 (JVR=8): (a) experiment; (b) calculation.

Figure 8: Pressure contours and streamlines at JVR=7: (a) pressure contour; (b) streamlines.

5. Conclusions

The numerical flow analysis was performed to provide a detailed understanding of the complicated flow phenomena in the intake duct of a waterjet. The 3-D incompressible RANS equations were successfully solved on a multiblocked grid system for JVR=6, 7, and 8. The present calculations were compared with experimental data of surface pressure distribution and PIV measurements, and good agreement with experiments was attained for all values of JVR. From this calculation, the strong suction flow through the inlet of the intake duct was shown, and the vortex induced by the separation along the corner of the side wall of the inlet was also clearly shown. The location of the stagnation point on the lip side, an important factor for the performance of the intake duct, was well predicted by the present calculation.

References

[1] Park, W. G. and Sankar, L. N., An iterative time marching procedure for unsteady viscous flows, ASME-FED 20 (1993).

[2] Park, W. G., Jung, Y. R., and Ha, S. D., Numerical viscous flow analysis around a high speed train with crosswind effects, AIAA Journal 36 477-479 (1998).

[3] Park, W. G., Jung, Y. R., and Kim, C. K., Numerical flow analysis of single-stage ducted marine propulsor, Ocean Engineering, to be published (2004).

[4] Chien, K. Y., Prediction of channel and boundary-layer flows with a low-Reynolds-number turbulence model, AIAA Journal 20 (1982).

[5] Kim, K. C., Experimentation of waterjet intake duct flow, PNU Report (in Korean) (2003).

[6] Allison, J., Marine Waterjet Propulsion, SNAME Transactions 101 275-335 (1993).


Lecture Series on Computer and Computational Sciences

Volume 1, 2004, pp. 424-427

An Integrated Mathematical Tool. Part-I: Hydrodynamic Modeling

G. Petihakis a,b, G. Triantafyllou a,1, G. Korres a and A. Theodorou b

a Hellenic Center for Marine Research, Institute of Oceanography, PO BOX 712, 19013, Anavyssos, Greece
b University of Thessaly, Dept of Agriculture and Water Environment, Fytoko, Nea Ionia Magnisias (38446), Greece

Received 8 September, 2004; accepted in revised form 20 September, 2004

Abstract: Numerical ecosystem models, as operational tools, may be applied directly to problems of environmental management, predicting the response of natural systems to perturbations and modifications of various kinds. Since the dynamics are described in some detail, the model may provide insight into the governing mechanisms influencing the functioning of the ecosystem described in a companion paper [1]. Pagasitikos is a semi-enclosed gulf situated in the western part of the Aegean Sea north of the island of Evia, connected at the south with the Aegean Sea through the narrow, 5.5 km wide channel of Trikeri. The predominant weak winds of the area result in small to moderate water currents, while renewal occurs mainly through the deep-water layer of the communication channel with the Aegean Sea. During the winter months the water mass of Pagasitikos is fairly well mixed, forming a two-layer thermocline structure for the rest of the year, with the exception of August when three layers are observed. Inflows of fresh water in the areas of Volos and Almiros are also observed during winter and spring, adding to the complexity of the system. In this work the capability of a high-resolution hydrodynamic model based on the Princeton Ocean Model (POM) to describe the phenomenology of Pagasitikos is investigated. Comparison with direct current measurements shows good agreement, indicating that the model can successfully reproduce the general circulation characteristics of the area, aiding in the understanding of the complicated hydrological structure and circulation patterns appearing within the area. In addition the effect of high frequency wind forcing is explored through comparison with surface forcing specified from monthly mean climatological data.

Keywords: Ocean Modeling, Hydrodynamics, Water masses, Circulation

1 Introduction

Understanding the dynamics of marine ecosystems is a fundamental issue for the exploration, management and protection of coastal areas. Although the study of the underlying processes is rather challenging for the scientific community, this often proves to be a difficult task due to the significant complexity of the system. Coastal areas in particular are characterized by a high degree of complexity, requiring intensive long-term studies capable of revealing relations between the numerous variables and parameters. Pagasitikos is a semi-enclosed system in the western Aegean Sea north of Evia Island, surrounded by high mountains draining their rainwater into the gulf. It is a shallow gulf with a mean depth of 69 m, with the deepest area at the eastern-central part (108 m). The total area is 520 km² and the total volume 36 km³. An important feature is the connection with the Aegean Sea and north Evoikos through the narrow (3-10 km) and relatively deep (80 m) Trikeri channel, modulating the hydrology patterns of the gulf. The microclimate is typical of the Mediterranean basin with two major wind groups, although both are particularly weak, resulting in the formation of a prolonged thermocline. The etesians blow from July to September from a north-west direction, exhibiting maximum values during the afternoon and minimum during the night. The second group is the southerly warm and dry winds. The mean annual air temperature is 16.5°C, with a maximum of 31.0°C in July and a minimum of 11.0°C in January. In this work the phenomenology of Pagasitikos is described using the data from the latest research project "Development of an Integrated Policy for the Sustainable Management of Pagasitikos Gulf", and the capability of a high-resolution hydrodynamic model based on the Princeton Ocean Model (POM) to describe the phenomenology of Pagasitikos gulf is investigated. In addition, the hydrodynamic model results are used to provide the physical background for an ecosystem model implemented in the same area, developing thus an integrated ecosystem operational tool for Pagasitikos gulf. The latter work is presented in a separate scientific article.

1 Corresponding author. E-mail: [email protected]

2 Materials and Methods

2.1 Model description

The hydrodynamic model (POM) is a primitive-equations finite-difference model which makes use of the hydrostatic and Boussinesq approximations. The model solves the 3-D Navier-Stokes equations on an Arakawa-C grid with a numerical scheme that conserves mass and energy. The spatial differences are central and explicit in the horizontal and central and implicit in the vertical. Centered differences are used for the time integration (leapfrog scheme) of the primitive equations. In addition, since the leapfrog scheme has a tendency for the solution to split at odd and even time steps, an Asselin filter is used at every time step. The numerical computation is split into an external barotropic mode with a short time step (dictated by the CFL condition), solving for the time evolution of the free surface elevation and the depth averaged velocities, and an internal baroclinic mode which solves for the vertical velocity shear. Horizontal mixing in the model is parameterized according to [2], while vertical mixing is calculated through the Mellor and Yamada 2.5 turbulence closure scheme. The reader is referred to [3] for a detailed description of the POM model equations and the numerical algorithm. The model state vector contains all prognostic (state) variables of the model at each sea grid point. The state variables consist of the sea surface elevation, the zonal and meridional components of velocity, potential temperature, salinity, the turbulent kinetic energy and the turbulent kinetic energy times the turbulent length scale. The computational domain for the Pagasitikos gulf model covers the area between 22.8125E to 23.3025E and 39.0N to 39.43N. It uses a Cartesian coordinate system consisting of 49 x 45 elements with a horizontal grid resolution of 0.01 degrees both in latitude and longitude. In the vertical there are 25 elements of variable thickness with a logarithmic distribution near the surface and the bottom for greater accuracy where velocity gradients are larger. For the model's bathymetry direct measurements and naval charts were used. A third-order filter [4] was also applied to the interpolated bathymetry in order to perform the necessary smoothing (elimination of small-scale noise). The model's climatological run was initialized with spring-period objectively analysed hydrological field data, while initial velocities were set to zero. The model was integrated using a 'perpetual year' atmospheric forcing data set. This data set was derived from the 1979-1993 6-hour European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis data (horizontal


resolution 1° × 1°) by proper averaging in time to get the monthly mean values [5]. It consists of the longitudinal and meridional components of the wind stress, air temperature, air humidity, cloud cover, net upward heat flux, evaporative heat flux and solar insolation at the sea surface. Additionally, the precipitation data needed for the freshwater budget were taken from the Jaeger (1976) monthly data set (horizontal resolution 5° × 2.5°).
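The leapfrog/Asselin time stepping described above can be illustrated on a generic tendency equation dφ/dt = R(φ); the filter coefficient below is an illustrative choice, not the value used in the model.

```python
def leapfrog_asselin(phi_old, phi_now, tendency, dt, alpha=0.05):
    """One leapfrog step with a Robert-Asselin filter.

    phi_old, phi_now : fields at time levels n-1 and n
    tendency         : callable R(phi) evaluated at level n
    Returns the filtered level-n field and the new level n+1 field."""
    phi_new = phi_old + 2.0 * dt * tendency(phi_now)            # leapfrog step
    phi_now_filtered = phi_now + alpha * (phi_old - 2.0 * phi_now + phi_new)
    return phi_now_filtered, phi_new

# toy usage on d(phi)/dt = -phi
phi_prev, phi_curr = 1.0, 0.99
for _ in range(100):
    phi_prev, phi_curr = leapfrog_asselin(phi_prev, phi_curr, lambda p: -p, dt=0.01)
print(phi_curr)
```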

3 Results and Discussion

Model results indicate that the general circulation of the basin is a combination of thermohaline circulation, driven by surface buoyancy fluxes, river discharges and exchanges through the open boundary, and wind-driven circulation. The general circulation of the basin is an interplay between a pure cyclonic structure due to the buoyancy forcing and an anticyclonic one triggered by the action of the wind forcing. During winter and spring the barotropic circulation involves a dipole of a cyclonic eddy in the western part of the basin and an anticyclone in the eastern part. During spring the anticyclone weakens while the cyclone intensifies. During summer the anticyclonic circulation prevails in the central part of the basin, accompanied by a squeezed cyclonic eddy in the western part of the basin. As already discussed in our previous analysis, the energetics of the basin during summer are dictated by the action of the wind stresses at the surface. The anticyclone is completely absent and the rather intense cyclone developed during spring has taken its place. During autumn the barotropic circulation is very similar to the summer one, although the anticyclone progressively moves eastwards and weakens. The temperature distribution along the vertical involves a sharp contrast between the surface and the bottom of the basin during spring and summer and a vigorous homogenization during winter, when a temperature inversion is evident at the bottom of the basin. During spring and summer a triple-layer system develops, consisting of a warm and shallow surface mixed layer, an intermediate layer of intense stratification (10-40 m) and a deep cold layer. The salinity vertical distribution shows a contrast of a fresh surface layer on top of a more saline intermediate and bottom layer. This is actually a prominent characteristic of the basin successfully reproduced by the model run. The salinity contrast is more intense during autumn, when the bottom layer salinity is almost 38.0 psu. During autumn and winter the halocline is located at approximately 50 m depth, while during spring and summer the halocline rises to 30 m. Surface salinity minima during autumn and winter are located at the western flank of the basin and can be attributed to fresh water discharges in the Almiros area.

Acknowledgment This work has been supported by the MFSTEP EU research project. The authors would like to thank A. Pollani for her help during the preparation of this work.

References

[1] G. Petihakis, G. Triantafyllou, G. Korres, and A. Theodorou. An integrated mathematical tool. Part-II: Ecosystem modeling. Lecture Series on Computer and Computational Sciences, 1 (submitted), 2004.

[2] J. Smagorinsky. General circulation experiments with the primitive equations, I. The basic experiment. Mon. Weather Rev., 91:99-164, 1963.


[3] A. F. Blumberg and G. L. Mellor. A description of a three-dimensional coastal ocean circulation model. In N.S. Heaps, editor, Three-Dimensional Coastal Ocean Circulation Models, Coastal Estuarine Science, pages 1-16. American Geophysical Union, Washington, D.C., 4 edition, 1987.

[4] R. Shapiro. Smoothing, filtering, and boundary effects. Reviews of Geophysics and Space Physics, 8(2):359-387, 1970.

[5] G. Korres and A. Lascaratos. An eddy resolving model of the Aegean and Levantine basins for the Mediterranean Forecasting System Pilot Project (MFSPP): Implementation and climatological runs. Annales Geophysicae, 21:205-220, 2003.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 428-431

An Integrated Mathematical Tool. Part-II: Ecological Modeling

G. Petihakis a,b, G. Triantafyllou a,1, G. Korres a and A. Theodorou b

a Hellenic Center for Marine Research, Institute of Oceanography, PO BOX 712, 19013, Anavyssos, Greece
b University of Thessaly, Dept of Agriculture and Water Environment, Fytoko, Nea Ionia Magnisias (38446), Greece

Received 8 September, 2004; accepted in revised form 20 September, 2004

Abstract: Pagasitikos gulf is a semi-enclosed basin highly influenced both by anthropogenic activities (inflow of nutrients at the north and west parts) and by water exchange between the gulf and the Aegean Sea at its south part (Trikeri channel), resulting in the development of functional sub-areas within the gulf. Thus the inner part is characterized by eutrophic conditions with sporadic formation of harmful algal blooms, while the central part acts as a buffer with mesotrophic characteristics influenced by the oligotrophic outer area. In a companion paper [1] the circulation fields and the development of water masses in the Pagasitikos gulf were explored. The aim of this study is to investigate the interactions between the physical and biogeochemical systems in the Pagasitikos gulf by coupling advanced hydrodynamic and ecological models. The simulation system comprises two on-line coupled sub-models: a three-dimensional hydrodynamic model based on the Princeton Ocean Model (POM) and an ecological model adapted from the European Regional Seas Ecosystem Model (ERSEM) for the particular ecosystem. A cost function is used for the validation of model results against field data. Simulation results are in good agreement with in-situ data, illustrating the role of the physical processes in determining the evolution and variability of the ecosystem.

Keywords: Numerical modeling, Ecosystem modeling

1 Introduction

Pagasitikos is a rather sensitive ecosystem due to its semi-enclosed nature and the shallow depths. The human activity in the coastal areas is not significant, with agricultural farming being the major occupation. However, during the last years there has been a shift towards intensive production of cereal and cotton with the use of large quantities of fertilizers rich in nitrogen, phosphate and sulphur. A significant proportion of these chemicals finds its way into the marine ecosystem, carried by rain waters through a network of periodic small torrents. The only major city is Volos at the north part of the gulf, with a population of 120,000 inhabitants and a well-developed industrial sector. It was during the 60's that the first heavy industries were built, attracting workers from the surrounding areas and leading to a population explosion. The fast growth of the area and the absence of the necessary infrastructure caused significant problems to the gulf, as it became the recipient of major quantities of rural and industrial effluents. A sewage treatment plant for the domestic effluents was planned as early as 1964, but it took 23 years to become operational.

1 Corresponding author. E-mail: [email protected]


An extreme perturbation occurred in the early 60's during the draining program of lake Karla, when large quantities of waters enriched with nutrients were channeled into Pagasitikos via an aqueduct at the north of the gulf. Although this lasted only a few years, the larger part of the lake was given over to cultivation and, as a result, during winter rainwater washes the soil, becoming enriched with fertilizers, pesticides and particulate material, a proportion of which is finally poured into Pagasitikos. During 1982 dense mucilage composed of phytoplankton cells, bacteria, zooplankton excretions and detritus covered large areas in the north part of the gulf, causing significant problems to the fishing community and to tourism. This phenomenon was greatly reduced both in space and time in the following years, only to return with greater severity in 1987, the worst year ever recorded.

2 Materials and Methods

2.1 Model description

An essential feature of aquatic ecosystem models is the combination of biology and physics, which cannot be completely separated (migration with currents, sedimentation due to biological actions, sinking of senescent phytoplankton); however, a rough separation is possible. Thus the 3D ecosystem model initially developed for the Cretan Sea [2] consists of two highly portable, on-line coupled sub-models: the three-dimensional Princeton Ocean Model (POM) [3] described in Part I [1], and the ecological model. The physical model describes the hydrodynamics of the area and provides the background physical information to the ecological model, which describes the biogeochemical cycles. The ecosystem model is based on the 1D European Regional Seas Ecosystem Model [4] and is an extension of the 1D model applied to Pagasitikos [5]. It is a generic model that can be applied in a wide range of ecosystems [6], [7], [5], with a significant degree of complexity allowing a good representation of all those processes that are significant in the functioning of the system. A coherent system behavior is achieved by considering the system as a series of interacting physical, chemical and biological complex processes [4]. A 'functional' group approach has been adopted for the description of the ecosystem, while organisms are separated into groups according to their trophic level and subdivided into similar size classes and feeding methods. The dynamics between the functional groups include physiological (ingestion, respiration, excretion, egestion, etc.) and population processes (growth, migration and mortality), which are described by fluxes of carbon and nutrients. The physical processes affecting the biological constituents are advection and dispersion in the horizontal, and sedimentation and dispersion in the vertical, with the horizontal processes operating on scales of tens of kilometers and the vertical processes on tens of meters [4]. Biologically driven carbon dynamics are coupled to the chemical dynamics of nitrogen, phosphate, silicate and oxygen. The food web is divided into three main groups, the producers, the decomposers and the consumers, each of which may be defined as having a standard set of processes. The first group includes the phytoplankton, which is further divided into functional groups based on size and ecological properties. These are diatoms P1 (silicate consumers, 20-200 μm), nanophytoplankton P2 (2-20 μm), picophytoplankton P3 (< 2 μm) and dinoflagellates P4 (> 20 μm). In the 3D code the following equation is solved for the concentration C of each functional group of the pelagic system:

∂C/∂t + U ∂C/∂x + V ∂C/∂y + W ∂C/∂z = ∂/∂x( A_H ∂C/∂x ) + ∂/∂y( A_H ∂C/∂y ) + ∂/∂z( K_H ∂C/∂z ) + Σ BF


where U, V, W represent the velocity field, A_H the horizontal viscosity coefficient, and K_H the vertical eddy mixing coefficient, provided by the POM. Σ BF stands for the total biochemical flux, calculated by the model, for each pelagic group. This equation is approximated by a finite-difference scheme analogous to the equations of temperature and salinity [2] and is solved in two time steps [8]: an explicit conservative scheme [9] for the advection, and an implicit one for the vertical diffusion [10]. Due to the heavy computational cost, the benthic-pelagic coupling is described by a simple first-order benthic returns module, which includes the settling of organic detritus into the benthos and diffusional nutrient fluxes into and out of the sediment.
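The implicit vertical-diffusion step mentioned above reduces to a tridiagonal solve in each water column; a minimal sketch follows (uniform spacing, constant K_H and zero-flux ends are simplifying assumptions, unlike the model's variable-thickness layers).

```python
import numpy as np

def implicit_vertical_diffusion(C, Kh, dz, dt):
    """Backward-Euler vertical diffusion of a tracer column C (Thomas algorithm)."""
    n = C.size
    r = Kh * dt / dz**2
    lower = np.full(n, -r)
    upper = np.full(n, -r)
    diag = np.full(n, 1.0 + 2.0 * r)
    diag[0] = diag[-1] = 1.0 + r          # zero-flux top and bottom
    rhs = C.astype(float).copy()
    for k in range(1, n):                 # forward elimination
        m = lower[k] / diag[k - 1]
        diag[k] -= m * upper[k - 1]
        rhs[k] -= m * rhs[k - 1]
    out = np.empty(n)                     # back substitution
    out[-1] = rhs[-1] / diag[-1]
    for k in range(n - 2, -1, -1):
        out[k] = (rhs[k] - upper[k] * out[k + 1]) / diag[k]
    return out
```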

2.2 Model validation

The validation of a 3D ecosystem model is usually a difficult task due to the scarcity of in-situ data. A possible solution is the use of a cost function as described by [11]. It is a mathematical function enabling the comparison of model results with field measurements by estimating a non-dimensional value which is indicative of how close or how distant two particular values are:

$$C_{x,t} = \frac{M_{x,t} - D_{x,t}}{sd_{x,t}}$$

where C_{x,t} is the normalised deviation between model and data for box x and season t, M_{x,t} the mean value of the model results within box x and season t, D_{x,t} the mean value of the in-situ data within box x and season t, and sd_{x,t} the standard deviation of the in-situ data within box x and season t. The cost function results give an indication of the goodness of fit of the model, by providing a quantitative measure of deviation, normalised in units of standard deviation of the data. The lower the value of the cost function, the better the agreement between model and data. In this work the same categories of cost function results described by [11] (< 1 = very good, 1-2 = good, 2-5 = reasonable, > 5 = poor) were used.
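A minimal sketch of this calculation in Python (illustrative only, not the authors' code; the numerical values below are hypothetical). Following common practice, the absolute value of the deviation is binned into the categories listed above.

    def cost_function(model_mean, data_mean, data_std):
        """Normalised deviation |C| = |M - D| / sd for one box and season."""
        return abs(model_mean - data_mean) / data_std

    def classify(c):
        if c < 1.0:
            return "very good"
        if c < 2.0:
            return "good"
        if c < 5.0:
            return "reasonable"
        return "poor"

    # hypothetical values for one box and one season
    c = cost_function(model_mean=1.8, data_mean=1.5, data_std=0.4)
    print(c, classify(c))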

3 Results and Discussion

Initially all results from the cost function were grouped (all stations - all seasons - all depths), with 43% falling into the first category (very good) and 23% into the good category. Additionally, only 11% of the model results are poor, indicating that overall the model simulates the system very satisfactorily. Considering the expected variability of the system in time and space, model results were also validated for the different seasons and stations. Spring is the season with the most information, since there are measurements at all stations and at all depths. The model simulates the ecosystem of Pagasitikos very well, with 55% of the model results being very good and only 2% falling into the poor category. The validation results indicate that the mathematical model adapted to the marine ecosystem of Pagasitikos gulf is a close representation of the real system, with the only exception being the highly variable channel area, where detailed information on the important processes is necessary. This ecosystem model can serve as a useful tool producing significant information and knowledge on the dynamics and functioning of Pagasitikos gulf under both natural conditions and perturbations. It can be used to explore the response to reasonable changes in the parameters or processes included in its structure. Such simulations are in any case performed during sensitivity testing and, under appropriate management, can provide very useful information. Of course, in order to forecast the evolution of the ecosystem after a disturbance (e.g. inputs of enriched waters), a monitoring system should be established providing frequent, detailed information on the major ecosystem variables. A Pagasitikos nowcasting-forecasting ecosystem model should be the goal of a future operational forecasting system. To achieve this goal a numerical tool was developed. This work can be viewed as a first step towards establishing a system for understanding the dynamics of the ecosystem which, in a second step, through the development of data assimilation techniques, will lead to operational applications. Work in this area is in progress and will be reported later within the project Data Integration System for Eutrophication Assessment in Coastal Waters (InSEA), recently funded by the EU.

Acknowledgment This work has been supported by the MFSTEP EU research project. The authors wish to thank A. Pollani for her help during the preparation of this work.

References
[1] G. Petihakis, G. Triantafyllou, G. Korres, and A. Theodorou. An integrated mathematical tool. Part I: Hydrodynamic modeling. Lecture Series on Computer and Computational Sciences, 1:(submitted), 2004.
[2] G. Petihakis, G. Triantafyllou, J.I. Allen, I. Hoteit, and C. Dounas. Modelling the spatial and temporal variability of the Cretan Sea ecosystem. Journal of Marine Systems, 36(3-4):173-196, 2002.
[3] A.F. Blumberg and G.L. Mellor. A description of a three-dimensional coastal ocean circulation model. In N.S. Heaps, editor, Three-Dimensional Coastal Ocean Circulation Models, Coastal and Estuarine Sciences, vol. 4, pages 1-16. American Geophysical Union, Washington, D.C., 1987.
[4] J.W. Baretta, W. Ebenhoh, and P. Ruardij. The European Regional Seas Ecosystem Model, a complex marine ecosystem model. Netherlands Journal of Sea Research, 33:233-246, 1995.
[5] G. Petihakis, G. Triantafyllou, and A. Theodorou. A numerical approach to simulate nutrient dynamics and primary production of a semi-enclosed coastal ecosystem (Pagasitikos Gulf, Western Aegean, Greece). Periodicum Biologorum, 102(1):339-348, 2000.
[6] J.I. Allen. A modelling study of ecosystem dynamics and nutrient cycling in the Humber plume, UK. Journal of Sea Research, 38:333-359, 1997.
[7] G. Triantafyllou, G. Petihakis, and J.I. Allen. Assessing the performance of the Cretan Sea ecosystem model with the use of high frequency M3A buoy data set. Annales Geophysicae, 21:365-375, 2003.
[8] G.L. Mellor. An equation of state for numerical models of oceans and estuaries. Journal of Atmospheric and Oceanic Technology, 8:609-611, 1991.
[9] H.J. Lin, S.W. Nixon, D.I. Taylor, S.L. Granger, and B.A. Buckley. Responses of epiphytes on eelgrass, Zostera marina L., to separate and combined nitrogen and phosphorus enrichment. Aquatic Botany, 52:243-258, 1996.
[10] R.D. Richtmyer and K.W. Morton. Difference Methods for Initial-Value Problems. Krieger Publishing Company, Malabar, Florida, 1994.
[11] A. Moll. Assessment of three-dimensional physical-biological ECOHAM1 simulations by quantified validation for the North Sea with ICES and ERSEM data. ICES Journal of Marine Science, 57:1060-1068, 2000.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 432-435

Numerical Solution of the Hamilton-Jacobi-Bellman Equation in Stochastic Optimal Control with Application to Portfolio Optimization

Helfried Peyrl¹, Florian Herzog², and Hans P. Geering³

Measurement and Control Laboratory, Swiss Federal Institute of Technology, CH-8092 Zürich, Switzerland

Received 4 August, 2004; accepted in revised form 15 August, 2004

Abstract: This paper provides a numerical solution of the Hamilton-Jacobi-Bellman (HJB) equation for stochastic optimal control problems. The computational difficulty is due to the nature of the HJB equation: it is a second-order partial differential equation coupled with an optimization. By using a successive approximation algorithm, the optimization is separated from the boundary value problem, which makes the problem solvable by standard numerical methods. The method's usefulness is shown in an example of portfolio optimization with no known analytical solution.

Keywords: Stochastic Optimal Control; Computational Methods; Computational Finance

1 Introduction

A necessary condition for a solution of stochastic optimal control problems is the Hamilton-Jacobi-Bellman (HJB) equation, a second-order partial differential equation that is coupled with an optimization. Unfortunately, the HJB equation is difficult to solve analytically, and analytical solutions are known only for a few special cases, e.g. the LQ regulator problem.

2 Problem Formulation

Consider the stochastic process $x \in \mathbb{R}^n$ which is governed by the stochastic differential equation

$$dx = f(t, x, u)\,dt + g(t, x, u)\,dZ, \qquad (1)$$

where dZ denotes k-dimensional uncorrelated standard Brownian motion. The vector u denotes the control variables contained in some compact, convex set $U \subset \mathbb{R}^m$. Our problem starts at arbitrary time $t \in (0, T)$ and state $x \in G$, where G is an open and bounded subset of $\mathbb{R}^n$. The final time of our problem, denoted by $\tau$, is the time when the solution $(t, x(t))$ leaves the open set $Q = (0, T) \times G$: $\tau = \inf\{s \geq t \mid (s, x(s)) \notin Q\}$.

¹ Now at Automatic Control Laboratory, Swiss Federal Institute of Technology. E-mail: [email protected]
² E-mail: [email protected]
³ E-mail: [email protected]


Our aim is now to find the admissible feedback control law u which solves the following stochastic optimal control problem, leading to the cost-to-go (or value) function J(t, x):

$$J(t,x) = \max_{u(t,x)\in U} \mathbb{E}\left\{\int_t^\tau L(s,x,u)\,ds + K(\tau, x(\tau))\right\} \qquad (2)$$
$$\text{s.t.}\quad dx = f(t,x,u)\,dt + g(t,x,u)\,dZ,$$

where $\mathbb{E}$ denotes the expectation operator and L, K are scalar functions. The HJB equation turns out to be a necessary condition for the cost-to-go function J(t, x). For a detailed derivation, the reader is referred to [1, 2], or [3]. Using the differential operator

$$\mathcal{A}(t,x,u) = \frac{1}{2}\sum_{i,j=1}^{n} \sigma_{ij}\,\frac{\partial^2}{\partial x_i \partial x_j} + \sum_{i=1}^{n} f_i\,\frac{\partial}{\partial x_i},$$

where the symmetric matrix $\sigma = (\sigma_{ij})$ is defined by $\sigma(t,x,u) = g(t,x,u)\,g^T(t,x,u)$, the HJB equation can be written as follows:

$$J_t + \max_{u\in U}\{L(t,x,u) + \mathcal{A}(t,x,u)J\} = 0, \qquad (t,x)\in Q, \qquad (3)$$
$$J(t,x) = K(t,x), \qquad (t,x)\in \partial^*Q,$$

where $\partial^*Q$ denotes a closed subset of the boundary $\partial Q$ such that $(\tau, x(\tau)) \in \partial^*Q$ with probability one: $\partial^*Q = ([0,T]\times\partial G)\cup(\{T\}\times G)$. The HJB equation (3) is a scalar linear second-order PDE which is coupled with an optimization over u. This is what makes solving the problem difficult (apart from computational issues arising from problem sizes in higher dimensions).

3 Successive Approximation of the HJB Equation

In this section we present a numerical approach for solving the HJB equation. Solving the PDE and the optimization problem at once would lead to unaffordable computational costs. Chang and Krishna propose a successive approximation algorithm which will be used in the following [4]:

1. k = 0; choose an arbitrary initial control law $u^0 \in U$.

2. Solve the following boundary value problem to compute the problem's value $J^k(t,x)$ for the fixed control law $u^k$:
$$J_t^k + L(t,x,u^k) + \mathcal{A}(t,x,u^k)J^k = 0, \qquad (t,x)\in Q, \qquad (4)$$
$$J^k(t,x) = K(t,x), \qquad (t,x)\in \partial^*Q.$$

3. Compute the succeeding control law $u^{k+1}$ by solving the optimization problem
$$u^{k+1} = \arg\max_{u\in U}\{L(t,x,u) + \mathcal{A}(t,x,u)J^k\}. \qquad (5)$$

4. k = k + 1; go back to step 2.
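The following Python sketch illustrates the structure of this successive approximation for a simple one-dimensional problem. It is not the authors' solver: the dynamics dx = u dt + sigma dZ, the running reward L = -(x^2 + u^2), the terminal reward K = -x^2, the time-independent control, the explicit backward stepping and the crude boundary handling are all assumptions made to keep the example short.

    import numpy as np

    T, sigma = 1.0, 0.5
    nx, nt = 81, 400
    x = np.linspace(-2.0, 2.0, nx)
    dx = x[1] - x[0]
    dt = T / nt
    U = np.linspace(-1.0, 1.0, 21)             # discretised control set

    def L(xv, u):                              # running reward (assumed)
        return -(xv**2 + u**2)

    def K(xv):                                 # terminal reward (assumed)
        return -xv**2

    def generator(J, u):
        """A(t,x,u)J = f J_x + 0.5 g^2 J_xx with f = u and g = sigma."""
        Jx = np.gradient(J, dx)
        Jxx = np.gradient(Jx, dx)
        return u * Jx + 0.5 * sigma**2 * Jxx

    def solve_pde(u_pol):
        """Step 2: backward explicit Euler for J_t + L + A J = 0 with fixed control."""
        J = K(x)
        for n in reversed(range(nt)):
            J = J + dt * (L(x, u_pol) + generator(J, u_pol))
            J[0], J[-1] = K(x[0]), K(x[-1])    # clamp lateral boundary (assumed)
        return J

    u_pol = np.zeros(nx)                       # step 1: arbitrary initial control law
    for k in range(5):                         # steps 2-4, a few sweeps
        J = solve_pde(u_pol)                   # step 2: value for the fixed control
        H = np.stack([L(x, u) + generator(J, u) for u in U])
        u_pol = U[np.argmax(H, axis=0)]        # step 3: pointwise maximisation over U
    print("approximate value at x = 0:", J[nx // 2])

In the paper's own implementation the inner solve is the implicit upwind scheme of Section 4.1 and the maximisation of Section 4.2; the explicit stepping here only keeps the sketch short.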

To prove convergence of the above algorithm, the following lemmas and theorems are used:

Lemma 1. Let $J^k$ be the solution of the boundary value problem corresponding to the arbitrary but fixed control law $u^k \in U$:
$$J_t^k + L(t,x,u^k) + \mathcal{A}(t,x,u^k)J^k = 0, \qquad (t,x)\in Q, \qquad (6)$$
$$J^k(t,x) = K(t,x), \qquad (t,x)\in \partial^*Q.$$
Then
$$J^k(t,x) = J(t,x,u^k), \qquad (7)$$
where $J(t,x,u^k)$ denotes the value of our problem for a fixed control law $u^k$.

Lemma 2. Let the sequences of control laws $u^k$ and their affiliated value functionals $J^k$ be generated by the successive approximation algorithm. Then the sequence $J^k$ satisfies
$$J^{k+1}(t,x) \geq J^k(t,x). \qquad (8)$$

Theorem 1. Let the sequences of control laws $u^k$ and their corresponding value functionals $J^k$ be defined as above. Then they converge to the optimal feedback control law u(t, x) and the value function J(t, x) of our optimal control problem (2), i.e.:
$$\lim_{k\to\infty} u^k(t,x) = u(t,x) \quad\text{and}\quad \lim_{k\to\infty} J^k(t,x) = J(t,x).$$

4 Computational Implementation and Issues

4.1 Numerical Solution of the HJB-PDE

Boundary value problem (4) is a scalar second-order PDE with non-constant coefficients and hence can be tackled by standard methods for linear parabolic PDEs. Since the HJB-PDE has a terminal condition $J^k(T,x) = K(T,x)$ rather than an initial condition, it must be integrated backwards in time. We use finite difference schemes, as they are both well suited for simply (rectangularly) shaped domains Q and rather easy to implement. Our solver employs an implicit scheme and uses upwind differences for the first-order derivatives for stability reasons. Second-order and mixed derivatives are approximated by central space differences.

4.2 Optimization

For computing the succeeding control law, the nonlinear optimization problem (5) must be solved. Since we approximate $J^k(t,x)$ on a finite grid, the optimization must be solved for every grid point. This can be accomplished by standard optimization tools. For problems with simple functions it may be possible to obtain an analytical solution for the optimization step, which is to be preferred as it saves computational time.

4.3 Numerical Issues

Since the number of unknown grid points at which we approximate Jk(t, x) grows by an order of magnitude with dimension (Bellman's curse of dimensionality) and grid resolution we have to face issues of memory limitations and computation time and accuracy. The PDE solvers outlined in Section 4.1 require the solution of large systems of linear equations. The coefficient matrix of these linear systems is banded and therefore strongly encourages the use of sparse matrix techniques to save memory. Furthermore, applying indirect solution methods for linear systems such as successive overrelaxation provides higher accuracy and memory efficiency than direct methods.
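As a small illustration of these points (not the authors' code; the matrix size and diffusion number are placeholders), the banded system of one implicit time step can be stored in a sparse format and solved either directly or iteratively. SciPy has no built-in successive overrelaxation, so a standard library Krylov solver stands in for the iterative option here.

    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n = 1000                                   # unknowns per time level (assumed)
    lam = 0.1                                  # diffusion number (assumed)
    main = np.full(n, 1.0 + 2.0 * lam)
    off = np.full(n - 1, -lam)
    A = sp.diags([off, main, off], offsets=[-1, 0, 1], format="csc")
    b = np.random.rand(n)                      # right-hand side from the previous level

    x_direct = spla.spsolve(A, b)              # sparse direct solve
    x_iter, info = spla.gmres(A, b)            # an iterative Krylov alternative
    print(info, np.max(np.abs(x_direct - x_iter)))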

5 Case Study

The case study presents a portfolio optimization problem in continuous-time with no known analytical solution.

5.1 Portfolio Optimization Problem

We consider a portfolio optimization problem where an investor has the choice of investing in the stock market or putting his money in a bank account. The objective of the investor is to maximize the power utility of his wealth at a finite fixed time horizon T: $\max \frac{1}{\gamma}W^{\gamma}(T)$. Thus the portfolio optimization problem is

$$\max_{u\in[-1,1]} \frac{1}{\gamma}W^{\gamma}(T)$$
$$\text{s.t.}\quad dW(t) = W(t)\big(r + u(t)(Fx(t) + f - r)\big)\,dt + W(t)\,u(t)\sqrt{v(t)}\,dZ_1 \qquad (9)$$
$$dx(t) = (a_1 + A_1 x(t))\,dt + v\,dZ_2$$
$$dv(t) = (a_2 + A_2 v(t))\,dt + \sigma\sqrt{v(t)}\,dZ_3,$$

where $\gamma < 1$ is the coefficient of risk aversion and $u(t) \in [-1, 1]$. We make the assumption that both of the processes x(t) and v(t) are measurable and that we have both time series available to estimate the model parameters. In Fig. 1 the optimal investment policy as a function of x(t) and v(t) is shown.

Figure 1: Optimal investment policy as a function of x(t) and volatility v(t).
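A short Python sketch (not the authors' code) of simulating the dynamics of problem (9) by the Euler-Maruyama method under a fixed control, which can be useful for sanity-checking a computed policy; every parameter value below is a hypothetical placeholder.

    import numpy as np

    rng = np.random.default_rng(0)
    T, n = 1.0, 1000
    dt = T / n
    r, F, f = 0.03, 0.5, 0.05                 # interest rate and factor loading (assumed)
    a1, A1 = 0.0, -1.0                        # factor dynamics (assumed)
    a2, A2, sigma = 0.02, -0.5, 0.3           # volatility dynamics (assumed)
    u = 0.5                                   # fixed control in [-1, 1]

    W, x, v = 1.0, 0.0, 0.04                  # initial wealth, factor, volatility
    for _ in range(n):
        dZ = rng.standard_normal(3) * np.sqrt(dt)
        W += W * (r + u * (F * x + f - r)) * dt + W * u * np.sqrt(max(v, 0.0)) * dZ[0]
        x += (a1 + A1 * x) * dt + v * dZ[1]
        v += (a2 + A2 * v) * dt + sigma * np.sqrt(max(v, 0.0)) * dZ[2]
    print("terminal wealth:", W)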

References
[1] Jiongmin Yong and Xun Yu Zhou, Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer-Verlag, New York, 1999.
[2] B. Hanzon, H. Schumacher and M. Vellekoop, Finance for Control Engineers. Tutorial Workshop at the European Control Conference, Porto, Portugal, 2001.
[3] Wendell H. Fleming and Raymond W. Rishel, Deterministic and Stochastic Optimal Control. Springer-Verlag, 1975.
[4] M.H. Chang and K. Krishna, A Successive Approximation Algorithm for Stochastic Control Problems. Applied Mathematics and Computation, vol. 18, no. 2, 155-165 (1986).

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 436-439


An Enhanced MEBDF Approach for the Numerical Solution of Parabolic Partial Differential Equations

G. Psihoyios¹

Department of Mathematics and Technology, Anglia Polytechnic University, East Road, Cambridge CB1 1PT, United Kingdom

Received 21 June 2004; accepted in revised form 12 August 2004

Abstract: In [1] the known MEBDF scheme had been used to solve ordinary differential equations (ODEs) that arise in the method of lines solution of time dependent partial differential equations (PDEs). There, it was shown that MEBDF is superior to the widely used BDF approach in certain important classes of problems. This short paper serves as an introduction to the different logic we have followed in order to improve the stability characteristics of MEBDF. Here we also refer briefly to the fact that our new and enhanced approach produced a marked positive effect on both accuracy and efficiency compared to [1]. Due to space limitations we are unable to present the numerical results that support our claims.

Keywords: Stability, MEBDF, Parabolic partial differential equations, Method of lines.

Mathematics Subject Classification: 65M12, 65L05, 65M20, 65L06, 65L20. PACS: 02.60.Lj, 44.05.+e

1. Introduction

In [1] the author presented an account of the MEBDF applied to the solution of IVPs that result from time dependent partial differential equations (PDEs). A robust technique for solving time dependent PDEs is the numerical method of lines or simply the method of lines. Let us consider the one-dimensional time dependent PDE:

$$\frac{\partial}{\partial t}u(x,t) = g(x, t, u, u_x, u_{xx}),$$

w1 e- s.(I) I '1'1 [fi.h ....Jkn "" "" { (jj,h, ... Jk) E" s.(I) I VlsijSk V+: s.; ---+Snj. +tifJ=jj} [6].

rr.

3.4 Non-interference of Goguen

Now we present the non-interference definition of Goguen [2]. Given a system S with objects $S_n$ for $n \in N$, with morphisms $\phi_e : S_n \to S_{n'}$ for $e : n \to n'$, and with behaviour (i.e. limit) L having projections $\tau_n : L \to S_n$, then $S_m$ is non-interfering with $S_k$ if and only if

$$\tau_k(L) = \tau'_k(L'), \qquad (1)$$

where L' is the limit of the subsystem of S which does not contain $S_m$ and all morphisms to and from $S_m$, and $\tau'_n : L' \to S_n$, for $n \in N\setminus\{m\}$, are its projections.

3.5 Non-interference of Goguen in the Internal Language

Next, we express the non-interference of Goguen [2] in the internal language. So, $\tau_k(L)$ and $\tau'_k(L')$ are subobjects of $S_k$ and thus we can suppose them isomorphic (not necessarily equal). By considering the property of the subobject classifier $\Omega$ [4], their characteristic morphisms (i.e. predicates) are thus equal

∃x∈L(eq

δ denotes central differences, I is the 3×3 identity matrix, and $A = \partial F/\partial Q$, $B = \partial G/\partial Q$ are the Jacobian matrices resulting from the linearization procedure. Approximate factorisation of equation (7) yields the implicit form of equations (6), which can be implemented in the following sequence (Klonidis et al. [6]):

$$\Big(I + \frac{\Delta t}{2}\frac{\delta A^n}{\Delta\xi}\Big)\Delta Q^{*} = -\frac{\Delta t}{2}\Big(\frac{\delta F^n}{\Delta\xi} + \frac{\delta G^n}{\Delta\eta} - W^n\Big) \qquad \text{1st step} \qquad (8)$$

$$\Big(I + \frac{\Delta t}{2}\frac{\delta B^n}{\Delta\eta}\Big)\Delta Q^{n+1} = \Delta Q^{*} \qquad \text{2nd step} \qquad (9)$$

$$Q^{n+1} = Q^{n} + \Delta Q^{n+1} \qquad \text{3rd step} \qquad (10)$$

The values of the unknown variables at every point of the field are obtained by solving a block tridiagonal system.

4. Validation of the Model

In this section numerical results are compared with measured data obtained from a series of experiments performed in a rectangular-section converging-diverging flume by Bellos [2]. Figure 1 shows the plan view of the flume configuration. A hypothetical dam consisting of a sluice gate was placed at the throat of the flume. With the gate initially closed, the upstream basin was filled with water to a depth of 30.0 cm. A sudden raising of the gate caused a flood wave to surge, overwhelming the downstream dry bed. Measured stage hydrographs at x = 8.5 m along the centre-line of the flume are compared with equivalent model results for the test case of h1 = 0.30 m, S0x = S0y = 0.0; see Figure 2. The Manning's roughness coefficient was set equal to 0.012, which is close to the glass-steel material of the tested flume. In all cases computed results and measured data seem to agree well.

(> 20 μm). All phytoplankton groups contain internal nutrient pools and have dynamically varying C:N:P ratios. The nutrient uptake is controlled by the difference between the internal nutrient pool and the external nutrient concentration. The microbial loop contains bacteria B1, heterotrophic flagellates Z6, and microzooplankton Z5, each with dynamically varying C:N:P ratios. Bacteria act to decompose detritus and can compete for nutrients with phytoplankton. Heterotrophic flagellates feed on bacteria and picophytoplankton, and are grazed by microzooplankton. Microzooplankton also consumes diatoms and nanophytoplankton and is grazed by mesozooplankton. The parameter set used in this simulation is the same as in the 3D Cretan Ecosystem Model (Petihakis et al., 2002). The biogeochemical model coupled to the physical model consists of 4-D advection-diffusion-reaction equations, and is solved for the concentration C of each functional group of the pelagic system:

$$\frac{\partial C}{\partial t} = -U\frac{\partial C}{\partial x} - V\frac{\partial C}{\partial y} - W\frac{\partial C}{\partial z} + \frac{\partial}{\partial x}\Big(A_H\frac{\partial C}{\partial x}\Big) + \frac{\partial}{\partial y}\Big(A_H\frac{\partial C}{\partial y}\Big) + \frac{\partial}{\partial z}\Big(K_H\frac{\partial C}{\partial z}\Big) + \Sigma BF$$

where U, V, W represent the velocity field, A_H the horizontal viscosity coefficient, and K_H the vertical eddy mixing coefficient, provided by the POM. ΣBF stands for the total biochemical flux, calculated by ERSEM, for each pelagic group. The benthic-pelagic coupling is described by a simple first-order benthic returns module, which includes the settling of organic detritus into the benthos and diffusional nutrient fluxes into and out of the sediment.

3. Model Set-up

The hydrodynamic model used is the ALERMO (Korres and Lascaratos, 2003), which has one open boundary located at 20° E. The computational grid has a horizontal resolution of 1/10° x 1/10° and 30 sigma layers in the vertical with a logarithmic distribution near the sea surface. The model's bathymetry was obtained from the US Navy Digital Bathymetric Data Base - DBDB5 - (with a nominal resolution of 1/12° x 1/12°) by bilinear interpolation. A Shapiro filter of third order was also applied to the interpolated bathymetry in order to perform the necessary smoothing. The model's climatological run was initialized with the Mediterranean Ocean Data-Base, which contains seasonal profiles of temperature and salinity mapped on a 1/4° x 1/4° horizontal grid. Additionally, initial velocities were set to zero. The temperature and salinity profiles at the open boundaries were also derived from the same database. Atmospheric data sets consist of monthly values for the longitudinal and meridional components of the wind stress, air temperature, air humidity and cloud cover. These monthly values were derived from the 1979-1993 6-hour ECMWF re-analysis data (horizontal resolution 1° x 1°) by proper averaging in time. Additionally, the precipitation data needed for the freshwater budget were taken from the Jaeger (1976) monthly data set (horizontal resolution 5° x 2.5°). This set of atmospheric data is then used by the air-sea interaction scheme of the model for the estimation of heat, freshwater and momentum fluxes at the sea surface. The initial conditions for the nutrients are taken from Levitus (1982), while the other biogeochemical state variables are taken from the 3D ecosystem model for the Cretan Sea (Petihakis et al., 2002). The ecosystem pelagic state variables along the open boundary are described by solving water-column 1D ecosystem models at each surface grid point on the open boundary. The integration starts from spring initial conditions (15th of May). The model was run perpetually for four years to reach a quasi steady state and to obtain inner fields fully coherent with the boundary conditions.

4. The assimilation scheme: The SEEK filter

The extended Kalman (EK) filter is an extension of the Kalman filter to nonlinear systems. However, its implementation in realistic ecosystems is not feasible because of its high computational cost. Different degraded forms of the extended Kalman filter have been proposed, which reduce the dimension of the system (n) through some kind of projection onto a low-dimensional sub-space (Cane et al., 1995; Fukumori and Malanotte-Rizzoli, 1995; Hoang et al., 1997). In this study, we used the Singular Evolutive Extended Kalman (SEEK) filter, which has been developed by Pham et al. (1997). This filter uses low-rank estimation error covariance matrices, which make possible the implementation of the EK filter with realistic ocean models. With this assumption, the correction of the filter is made only along the directions for which the error was not sufficiently attenuated by the dynamics of the system. These directions can further be allowed to evolve in time to follow changes in the model dynamics. The SEEK filter has been successfully implemented in different realistic ocean applications, e.g. Pham et al. (1997) and Hoteit et al. (2001) with physical ocean circulation models, and Triantafyllou et al. (2003) and Carmillet et al. (2001) with ecosystem models. A schematic diagram of the filter's algorithm is illustrated in Figure 2. The computational cost of the SEEK filter is mainly due to the evolution of its 'correction directions'. Recently, a further simplification of the SEEK has been broadly used, examining the asymptotic approximation of the error covariance matrix. Results from the implementation of the SEEK filter show an immediate reduction of the error level after the first correction. Thus the use of the SFEK filter, a variant of SEEK which maintains the initial correction directions invariant, is expected to produce reasonable results as well. When the initial error covariance is decomposed into empirical orthogonal functions, it is not easy to determine the truncation level suitable to conduct the assimilation experiments for the given ecosystem. The SFEK filter can therefore also be appropriate for performing sensitivity analysis regarding the relevance of the empirical sub-space for propagating surface observations to the deeper layers, avoiding the heavy computational load of the SEEK filter.
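A minimal numerical sketch of a reduced-rank analysis step of the kind used by the SEEK/SFEK family is given below (it is not the authors' implementation: the state size, the observation operator, the error levels and the use of fixed correction directions are all assumptions). The error covariance is carried as P = L U L^T with a few correction directions L, and the forecast is corrected only along those directions.

    import numpy as np

    rng = np.random.default_rng(0)
    n, r, m = 500, 15, 40                              # state size, rank, number of observations
    L = np.linalg.qr(rng.standard_normal((n, r)))[0]   # correction directions (e.g. EOFs)
    U = np.eye(r)                                      # reduced error covariance
    H = np.zeros((m, n))
    H[np.arange(m), np.arange(m)] = 1.0                # observe the first m state entries
    R_inv = np.eye(m) / 0.1**2                         # inverse observation error covariance

    x_f = rng.standard_normal(n)                       # forecast state
    y = H @ x_f + 0.1 * rng.standard_normal(m)         # synthetic observations

    HL = H @ L
    U_new = np.linalg.inv(np.linalg.inv(U) + HL.T @ R_inv @ HL)
    gain = L @ U_new @ HL.T @ R_inv                    # reduced-rank Kalman gain
    x_a = x_f + gain @ (y - H @ x_f)                   # analysis: correction along span(L)
    print(np.linalg.norm(x_a - x_f))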

5. Determination of the (initial) correction sub-space

The (initial) correction subspace was determined using an empirical orthogonal functions (EOF) analysis (also known as principal components analysis). This analysis is widely used in meteorology and oceanography to determine the principal modes of a model's variability. It provides the best approximation of a set of state vectors (and of their sample covariance matrix) in a low-dimensional sub-space (with a singular low-rank matrix). Since the ecosystem variables are of different nature (not homogeneous), a metric was used to make these variables independent of their units. We actually applied the so-called multivariate EOF analysis. The computation of the EOFs was made through a simulation of the model itself. After three years of model spin-up run to reach a statistically quasi-steady state, an integration of about 2 years was carried out to generate a historical sequence Hs of model realizations. A sequence of 130 state vectors was retained by storing one every six days to reduce the calculations, because successive states are quite similar. The filter's 'directions of correction' were then obtained via a multivariate EOF analysis on the sample Hs. It was found that the first 15 EOFs explain more than 90% of the variance of the system. This is a strong indication that the state of our system evolves in a subspace with a dimension much smaller than the dimension of the original system, which supports the filter's corrections in the directions of the leading modes of the system.

Figure 2. Schematic diagram of the assimilation cycle. In the forecast module the state and the evolution of the correction directions evolve with the model dynamics. In the correction module the forecast is corrected using the new observations and the updated error covariance matrix.
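A minimal sketch of the multivariate EOF construction described in Section 5 above, using an SVD of normalised anomalies (not the authors' code; the state dimension and the synthetic snapshots are placeholders):

    import numpy as np

    rng = np.random.default_rng(0)
    n_state, n_snapshots = 5000, 130
    Hs = rng.standard_normal((n_state, n_snapshots))   # historical sequence of model states

    mean = Hs.mean(axis=1, keepdims=True)
    scale = Hs.std(axis=1, keepdims=True) + 1e-12      # per-variable metric (assumed diagonal)
    A = (Hs - mean) / scale                            # normalised anomalies

    U, s, _ = np.linalg.svd(A, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    rank = int(np.searchsorted(explained, 0.90)) + 1   # smallest rank explaining 90% variance
    eofs = U[:, :rank]                                 # correction directions (columns)
    print(rank, explained[rank - 1])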

6. Twin experiment results - Conclusions

Prior to the assimilation of real data, the twin experiment approach was used to validate our assimilation system. Within this approach, the true state X^t is assumed to be provided by the model itself. A model run is then performed to generate a set of X^t. These X^t are then used twice: to generate the (pseudo-)observations, and as a reference state to assess the quality of the filter's analysis on non-observed variables. A random error was added to the pseudo-observations to be as close as possible to the real situation. Then the model is initialized with the mean state vector, and the performance of the filter can be assessed by its ability to bring the model back onto its 'true' trajectory while using ocean colour data only. Therefore, following the two-year simulation for the generation of the Hs sequence, the model was run for half a year to produce the reference states set. The assimilation experiments were performed on a three-day basis using surface chlorophyll pseudo-measurements, which were extracted from the reference states. The performance of the filter was measured through the comparison of the relative root mean square (RRMS) error for each state variable over the whole simulation domain. The definition of the RRMS can be found in Triantafyllou et al. (2003). The RRMS of the filter using 15 and 25 EOFs are illustrated in Figure 3 and compared with the RRMS of the model free-run for phosphate, nitrate, diatoms, picoplankton, mesozooplankton and bacteria. It can be seen that the error is efficiently reduced and remains relatively low for all the variables. After 17 analysis steps, the estimation error reaches a saturation value for all variables. Regarding the behaviour of the filter in relation to the number of adopted EOFs, the filter is shown to provide very satisfactory results while using 15 EOFs only. The filter's degradation that can be seen in the RRMS of picoplankton, mesozooplankton and bacteria between the 10th and the 17th analysis steps when using 15 EOFs suggests that more EOFs must be used during some periods, particularly when the model shows instabilities.

Figure 3. Evolution in time of the RRMS for phosphate, nitrate, diatoms, picoplankton, mesozooplankton, and bacteria from the model free-run (+) and the SFEK filter with rank 15 (*) and 25 (x). The horizontal axes show time in days.

The filter was shown to be very efficient in estimating the state of the ecosystem model. The experiments have been carried out under a "perfect model" assumption. This is very optimistic, since in practice any ecosystem model will have deficiencies. When real data are assimilated, the performance of the filter will depend strongly on the incorporation of a realistic model error, so that the filter does not completely follow the forecast model. The estimation of this error is however very difficult because of the huge dimension of the system and the severe lack of observations. Another point that might improve the filter's behaviour when real data are used is to find a way to assimilate ocean colour data directly instead of converting them to phytoplankton biomass through chlorophyll. This can be done by modifying the model in order to predict chlorophyll from phytoplankton values and a bio-optical algorithm. The mismatch between observed and predicted surface chlorophyll values would then be used to drive the filters. These problems will be investigated in the near future within the framework of the MFSTEP project.

Acknowledgments This work was carried out within the framework of the project Mediterranean Forecasting System Toward Environmental Predictions (MFSTEP).

References

Baretta, J. W., W. Ebenhoh and P. Ruardij (1995). The European Regional Seas Ecosystem Model, a complex marine ecosystem model, Netherlands Journal of Sea Research, 33: 233-246.
Baretta-Bekker, J. G., J. W. Baretta and E. Rasmussen (1995). The microbial foodweb in the European Regional Seas Ecosystem Model, Netherlands Journal of Sea Research, 33: 363-379.
Blumberg, A. F. and G. L. Mellor (1987). A description of a three-dimensional coastal ocean circulation model, in Three-Dimensional Coastal Ocean Circulation Models, N. S. Heaps (ed.), AGU, Washington, D.C., 4: 1-16.
Cane, M. A., A. Kaplan, R. N. Miller, B. Tang, E. C. Hackert and A. J. Busalacchi (1995). Mapping tropical Pacific sea level: data assimilation via a reduced state Kalman filter, Journal of Geophysical Research, 101(10): 599-617.
Carmillet, V., J. M. Brankart, P. Brasseur, H. Drange, G. Evensen and J. Verron (2001). A singular evolutive extended Kalman filter to assimilate ocean color data in a coupled physical-biochemical model of the North Atlantic ocean, Ocean Modelling, 3(3-4): 167-192.
Ebenhoh, W., J. G. Baretta-Bekker and J. W. Baretta (1997). The primary production module in the marine ecosystem model ERSEM II with emphasis on the light forcing, Journal of Sea Research, 38: 173-193.
Fukumori, I. and P. Malanotte-Rizzoli (1995). An approximate Kalman filter for ocean data assimilation: an example with an idealized Gulf Stream model, Journal of Geophysical Research, 100(C4): 6777-6793.
Hoang, H. S., P. De Mey, O. Talagrand and R. Baraille (1997). Adaptive filtering: Application to satellite data assimilation in oceanography, Dynamics of Atmospheres and Oceans, 27: 257-281.
Hoteit, I., D. T. Pham and J. Blum (2001). A semi-evolutive partially local filter for data assimilation, Marine Pollution Bulletin, 43: 164-174.
Jaeger, L. (1976). Monatskarten des Niederschlags fur die ganze Erde, Ber. Dtsch. Wetterdienste, 18(1839): 1-38.
Korres, G. and A. Lascaratos (2003). An eddy resolving model for the Aegean and Levantine basins for the Mediterranean Forecasting System Pilot Project (MFSPP): Implementation and climatological runs, Annales Geophysicae, 21: 205-220.
Levitus, S. (1982). Climatological Atlas of the World Ocean, No. 13, Washington.
Petihakis, G., G. Triantafyllou, J. I. Allen, I. Hoteit and C. Dounas (2002). Modelling the Spatial and Temporal Variability of the Cretan Sea Ecosystem, Journal of Marine Systems, 36: 173-196.
Pham, D. T., J. Verron and M. C. Roubaud (1997). Singular evolutive Kalman filter with EOF initialization for data assimilation in oceanography, Journal of Marine Systems, 16: 323-340.
Triantafyllou, G., I. Hoteit and C. Petihakis (2003). A singular evolutive interpolated Kalman filter for efficient data assimilation in a 3D complex physical-biogeochemical model of the Cretan Sea, Journal of Marine Systems, 40-41: 213-231.
Varela, R. A., A. Cruzardo and J. E. Gabaldon (1995). Modelling the primary production in the North Sea using ERSEM, Netherlands Journal of Sea Research, 33: 337-361.

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 506-509


Computational Science and Engineering Online

T. N. Truong¹

Henry Eyring Center for Theoretical Chemistry, Department of Chemistry, University of Utah, 315 South 1400 East, rm 2020, Salt Lake City, UT 84112, USA

Received 6 August, 2004; accepted in revised form 18 August, 2004

Abstract: We present the development of an integrated, extendable Web-based simulation environment for computational science and engineering called Computational Science and Engineering Online (CSEO). CSEO is a collaboratory that allows computational scientists to perform research using state-of-the-art application tools, to query data from personal or public databases, to document results in an electronic notebook, to discuss results with colleagues using different communication tools, and to access grid-computing resources from a web browser. Currently, CSEO provides an integrated environment for multi-scale modeling of complex reacting systems and biological systems. A unique feature of CSEO is its framework that allows data to flow from one application to another in a transparent manner.

Keywords: Web-based Simulation, Grid computing, Problem Solving Environment.

The discovery of the World Wide Web (often referred to as the web) has been revolutionizing the way we communicate since the last decade. Scientists are also looking into web technology to help revolutionize the way science is being conducted and taught. The last several years have seen tremendous efforts in the creation of 'collaboratories'. These are laboratories without walls, in which researchers can take advantage of web technology to expand their research capabilities and to collaborate in solving complex scientific problems [1-4]. These collaboratories can be classified into two types: data sharing oriented or remote access scientific instrument oriented. The Research Collaboratory for Structural Bioinformatics is an excellent example of the data-oriented collaboratory that provides access to databases of biological structures and tools for determining and analyzing these structures. Most existing collaboratories are instrument driven and provide the capability for real-time data acquisition from remote research instruments through web-accessible servers in a seemingly transparent way, and are often focused specifically on a particular complex scientific problem. An example is the Space Physics and Aeronomy Research Collaboratory (SPARC), which provides an internet-based collaborative environment for studies of space and upper atmospheric science, facilitating real-time data acquisitions from a remote site in Greenland. The establishment of these collaboratories has undisputed potential for making significant impacts on science and technology in the 21st century. The possibility of using web technology to provide a new framework for simulation attracted a lot of interest in the mid-1990s [5-9]. However, the progress so far has not made any significant impact on the scientific computing community, since it has not been user driven, as assessed by Kuljus and Paul [8]. In this study, we describe our current efforts in developing an integrated web-based simulation environment for multi-scale computational modeling of chemical and biological systems. The new environment is called Computational Science and Engineering Online (CSEO). The goal is to create a web-based simulation environment that would benefit the greater simulation community of computational science and engineering while gradually introducing the psycho-social changes to

1 Corresponding author. E-mail: [email protected].


facilitate the paradigm shift in scientific research. To do so, we would focus our attention on how research is currently being done and then design the environment so that it enhances current research capabilities while not significantly altering the research culture. The main goal of CSEO is to provide a Web-based grid-computing environment in which a researcher can perform the following functions:

1. Research using a variety of state-of-the-art scientific application tools.
2. Access and analyze information from databases and electronic notebooks.
3. Share and discuss results with colleagues.
4. Access computational resources from the computing grids that are far beyond those available locally.
5. Master subjects in other areas of computational science and engineering asynchronously, without regard to geographical location and schedule.

In designing CSEO, our vision is for it not to be a central web portal but rather a world-wide extensive network of many mirror sites, hosted by different universities, computer centers, national laboratories, even industries, that share their public databases via a secure network. This will maximize resource utilization, data generation and sharing.


quasi-static (t > 10^-3 s longevity) and dynamic (t ~ t_h ~ L/c, hydrodynamic times, where c is the sound velocity and L is the dimension of the laser radiation area). If the radiation intensity J is sufficient, a crater is produced. Shock-wave phenomena are not observed.

r, (

1 Corresponding author. Head of Department RFNC-VNIIEF. E-mail: [email protected]


2. τ_i ~ τ_h (τ_i ~ 10^-10 – 10^-11 s). In the case of sufficient intensity J a crater is produced and destructive processes result from shock-wave and thermal shock phenomena.

3. τ_i ~ 10^-13 – 10^-14 s. In the case of a definite laser radiation intensity J, electrons leave the area of laser radiation effects due to the light (Stoletov) pressure, which may lead to Coulomb explosion of the ion core.

After the laser radiation pulse stops acting (condition 2), a negative pressure is generated due to unloading in the nonequilibrium area, equal to P ~ -Γρc_pT_v + ρc·v_M, which may result in the formation of a cascade of fracture centers having a fractal dimension over a picosecond longevity range. The loading parameters are as follows: the radiation duration is t_0 ~ 10^-10 – 10^-11 s, the quantum frequency is ν ~ 10^14 – 10^15 Hz, the number of free electrons is n_e ~ 10^22 cm^-3, and the velocity of radiation-thermalized electrons is v_M ~ 10^7 cm/s. Using the methods of quantitative fractography and computation facilities, dissipative structures - the cascade of fracture centers - were revealed, which shows the possibility of the dynamic fracture process proceeding over a nanosecond longevity range. Figs. 1a, b show the sections of copper samples after laser radiation effects.

Figure 1: Copper sample sections after laser radiation effects: (a), (b).

Fig. 2 shows the section of a copper sample after effects of high-current beams of relativistic electrons (thermal shock) [1-4].

Figure 2: Copper sample section after effects of high-current relativistic electron beams (thermal shock).

The data in Fig. 1b and Fig. 2 point to the analogy of the process of dynamic fracture of metals. Thus, the application of lasers with picosecond, and especially femtosecond, pulse duration opens up new possibilities to study the dynamic fracture process over a picosecond range of negative pressure effects, which greatly expands the area of nonequilibrium states under study. The research carried out has made it possible to establish scaling (fractal) scale-invariant properties of dissipative structures at the nanolevel. By now a great number of phenomena and tasks have been determined where a fractal structure (dimension) serves as the basic system characteristic. This approach is successfully applied [2] for specifying quantitative characteristics of dissipative structures produced in dynamic fracture of metals and explosives, and during structural modification of metals and alloys affected by pulsed high-current relativistic electron beams over a dynamic longevity range (t ~ 10^-6 – 10^-10 s). According to the literature data, the systems formed under highly nonequilibrium conditions are fractal systems characterized by a fractal dimension [1-4]. Using a specialized mathematical package of programs of the interactive image analysis system (IIAS), the damaged surface images were processed. The roughness size distribution and the fractal dimension of roughnesses were determined. A typical view of the copper sample damaged surface after exposure to USP of laser radiation is shown in Fig. 3. Its view after effects of high-current relativistic electron beams is shown in Fig. 4.

Figure 3: Damaged surface of the copper sample exposed to pulsed laser radiation.

Figure 4: Damaged surface of the copper sample exposed to pulsed effects of high-current relativistic electron beams.

Fig. 5 shows the copper sample damaged surface exposed to pulsed effects of high-current relativistic electron beams. The results of the damaged surface processing after effects of USP of laser radiation and high-current beams of relativistic electrons are shown in Figs. 5 and 6.


Figure 5: Roughness size distribution of the copper sample exposed to USP of laser radiation. D - roughness dimension; <D> - average roughness dimension; N(D)/N(<D>) - number of roughnesses of dimension D (<D>).

526 ___________________________________________ A.Ya. Uchaevet. a/.

Figure 6: Roughness size distribution of the copper sample exposed to high-current relativistic electron beams.

Thus, ultrashort pulses of laser radiation open up new possibilities to study the behavior of metals under extreme conditions. Application of electron microscopy techniques allows quantitative characteristics of dynamic fracture to be established over a picosecond range of the loading pulse with amplitudes of up to several tens of GPa.

References
[1] R.I. Ilkayev, V.T. Punin, A.Ya. Uchaev, S.A. Novikov, Ye.V. Kosheleva, L.A. Platonova, N.I. Selchenkova, N.A. Yukina. Time regularities of the dynamic fracture of metals caused by hierarchic properties of dissipative structures - the cascade of fracture centers // Report of Academy for Science (RAS), 2003, v. 393, No 3.
[2] Ye.K. Bonyushkin, N.I. Zavada, S.A. Novikov, L.A. Platonova, N.I. Selchenkova, A.Ya. Uchaev. Fractals in applied physics. Sarov: RFNC-VNIIEF, 1995, pp. 123-174.
[3] Ye.K. Bonyushkin, N.I. Zavada, S.A. Novikov, A.Ya. Uchaev. Kinetics of dynamic fracture of metals under pulsed volumetric heating conditions. Scientific edition. Ed. by R.I. Ilkayev, Sarov, 1998, 275 p.
[4] Ye.K. Bonyushkin, B.L. Glushak, N.I. Zavada, S.A. Novikov, L.A. Platonova, N.I. Selchenkova, A.Ya. Uchaev. // PMTF, 1996, No 6, pp. 105-115.

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 527-529


Universal Properties of Metal Behavior in Dynamic Fracture Over a Wide Longevity Range Shown Under Effects of High-Power Penetrating Radiation Pulses

A.Ya. Uchaev¹, R.I. Ilkayev, V.T. Punin, S.A. Novikov, Ye.V. Kosheleva, N.I. Zavada, L.A. Platonova, N.I. Selchenkova

Russian Federal Nuclear Center - VNIIEF, Mira 37, Sarov, Nizhni Novgorod region, Russia, 607188

Received 7 July, 2004; accepted in revised form 1 August, 2004

Abstract: The invariants of metal behavior in dynamic fracture (t ~ 10^-6 – 10^-10 s longevity range) have been found. The dynamic invariant allows the behavior of untested metals and alloys to be predicted under extreme conditions and new alloys resistant to certain types of pulse effects to be constructed using numerical methods.

Keywords: Thermal shock (TS), quasi-static and dynamic longevity ranges, dissipative structures, fractographic studies, percolation cluster, universal coordinates.

During thermal shock (TS) caused by high-current relativistic electron beams, which is detailed in [1-3], the fracture process proceeds within a time t < 10^-8 s, in contrast to conventional loading methods. The rate of energy introduction is dT/dt ~ 10^12 K/s and the temperature changes up to T_melt. The basic objective is to find general regularities in the behavior of metals which are invariants with respect to changes in the external environment. The size distribution of fracture centers for various materials in the sections of loaded samples, when presented in lg(D/<D>), lg(N(D)/N(<D>)) coordinates, is produced by the similarity conversion [1-3]. This testifies to the fact that the process of dynamic fracture in metals proceeds within one primary process - accumulation and growth of fracture centers - which accounts for the basic part of the longevity. The spectral size distribution of fracture centers has the form N(D) ~ D^-α, where D is the fracture center size and α > 1 (Fig. 1a). The fracture centers accumulation rate may be described by an evolution equation (Fig. 1b) of the type dN/dt ~ t^p, p > 1. The data given in Fig. 1 show correlated behavior and initiation of self-organization of the fracture centers cascade within the destructed sample. The results of fractographic studies pointed to the initiation of plastic flow areas (similar to turbulence, even at initial temperatures T ~ 4 K) causing loss of the long-range lattice order near the forming and growing fracture centers. These are deformation micro- and mesolevels (Fig. 2). Fig. 3 shows the result of mathematical simulation of the volumetric percolation cluster of fracture centers and the computer section produced by sectioning the volumetric percolation cluster with the Q plane [2, 4]. Taking into account that there is an influence sphere of size R around fracture centers, which is connected with the plastic flow fields, and the fractographic analysis data, the following expression may be obtained:

$$N^{-1/3} \approx 1.2\,R, \qquad (1)$$

where N is the density of fracture centers. It allows the transition from micro- to macrofracture to be estimated not only qualitatively but also quantitatively (concentration criterion) [1].
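As a small illustration of estimating such a power-law exponent from measured fracture-centre sizes (not the authors' analysis code; the sample below is synthetic), a log-log least-squares fit can be used, here applied to the cumulative count as a simple stand-in for the spectral distribution:

    import numpy as np

    rng = np.random.default_rng(1)
    D = np.sort(rng.pareto(a=1.5, size=500) + 1.0)     # synthetic fracture-centre sizes
    N = np.arange(len(D), 0, -1)                       # cumulative count N(>= D)

    slope, intercept = np.polyfit(np.log(D), np.log(N), 1)
    alpha = -slope
    print("estimated power-law exponent alpha:", round(alpha, 2))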

1 Corresponding author. Head of Department RFNC-VNIIEF. E-mail: [email protected]


Figure 1: Size distribution of fracture centers in Fe (Δ = 4·10^-4 m thickness, dark markers) and Cu (Δ = 10^-3 m) samples in the sections parallel to the damaged surface, where δ is the depth from the damaged surface (a), and the fracture centers accumulation rate (calculations and experiments) for various metals on the fracture time scale t_f (b): Cu Δ = 4·10^-4 m; Cu Δ = 2·10^-4 m; Pb Δ = 4·10^-4 m; Pb Δ = 3·10^-4 m; Pb Δ = 2.3·10^-4 m; Cu Δ = 5·10^-4 m; σ1 > σ2; the curve is the solution of the equation dN/dt = N.

Based on the structure-energy analogy of the metal behavior after introduction of thermal and mechanical energy, which in both cases results in the violation of the lattice long-range order, the longevity data on some metals are presented in universal coordinates (Fig. 4). The universal coordinates make it possible to establish the time dependence of the relation between the critical density of the absorbed energy causing fracture and the lattice energy parameters (enthalpy H and melting heat L_m). The data in Fig. 4 give the absolute values of dissipative unloading wave losses over the dynamic fracture range; they are close to a single curve and determine the boundary above which there is a fracture area. The insets in Fig. 4 show the change in the fracture mechanism from one-site to many-site. The systematized data over the quasi-static range are given for pure metals [5]. Taking into consideration the self-similar character of the vulnerability to damage accumulation over the dynamic longevity range, the relation between the critical energy density and the longevity may be obtained: E^γ · t = const [1-3], where γ ≈ 3.8. The derived expression determines the amplitude and time coordinate irrespective of the loading method and geometry.

Figure 2: Structurization of the lattice slip bands near the growing fracture centers and tangents to the slip bands


Figure 3: Percolation cluster of the fracture centers and percolation computer section produced by Q plane sectioning.

Table: ρ(T,P), H(T), Γ(T,P), L_m. Measured: E = P/(Γρ). L_m - melting heat, H - enthalpy, Γ - Grüneisen parameter, ρ - material density.

Figure 4: Dependences of t longevity for some metals in universal coordinates (t is in seconds).

References
[1] Ye.K. Bonyushkin, N.I. Zavada, S.A. Novikov, A.Ya. Uchaev. Kinetics of dynamic fracture of metals under pulse volumetric heating conditions. Scientific edition. Ed. by R.I. Ilkayev, Sarov, 1998, 275 p.
[2] R.I. Ilkayev, A.Ya. Uchaev, S.A. Novikov, N.I. Zavada, L.A. Platonova, N.I. Selchenkova // Report of Academy for Science (RAS), 2002, v. 384, No 3, pp. 328-333.
[3] R.I. Ilkayev, V.T. Punin, A.Ya. Uchaev, S.A. Novikov, Ye.V. Kosheleva, L.A. Platonova, N.I. Selchenkova, N.A. Yukina // DAN, 2003, v. 393, No 3. I. Prigogine, Stuart Rice (Eds): Advances in Chemical Physics Vol. 93: New Methods in Computational Quantum Mechanics, John Wiley & Sons, 1997.
[4] Ye.K. Bonyushkin, N.I. Zavada, S.A. Novikov, L.A. Platonova, N.I. Selchenkova, A.Ya. Uchaev. Fractals in applied physics. Sarov: RFNC-VNIIEF, 1995, pp. 123-174.
[5] V.R. Regel, A.I. Slutsker, E.I. Tomashevsky. Kinetic nature of solid strengths. M.: Nauka, 1974, 560 p.


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 530-534

Proposal for Development of Reusable Learning Materials for WBE Using IRLCOO and Agents

R. Peredo Valderrama and L. Balladares Ocana

Agents Laboratory, Center for Computing Research, IPN, Mexico, D.F. Av. Juan de Dios Batiz Esq. Othon de Mendizabal s/n, Col. Nueva Industrial Vallejo, 07738, Mexico, D.F., Mexico

Received 6 August, 2004; accepted in revised form 26 August, 2004

Abstract: The research in WBE systems is centered on the reusability, accessibility, durability and interoperability of didactic materials and environments of virtual education. In this article we make a development proposal based on a special type of labeled materials called Intelligent Reusable Learning Components Object Oriented (IRLCOO). These components form a fundamental part of the course authoring CASE tool called Authoring IRLCOO, which produces learning materials with interfaces and functionality standardized to SCORM 1.2 and rich in multimedia, interactivity and feedback. The structuring model for dynamic composition of these components is based on the concept graph knowledge representation model. A multiagent architecture, acting as a middleware for an open WBE system, is developed for the sequencing and delivery of learning materials composed of IRLCOOs.

Keywords: WBE, intelligent reusable learning components object oriented, intelligent components, agents

1. From RLO to Intelligent Components

In software engineering, a component is a reusable program building block that can be combined with other components on the same or other computers in a distributed network to form an application. More specifically, a component is a piece of software, reusable in binary form, that you can connect to other components with relatively little effort [13]. Examples of a component include: a single button in a graphical user interface, a small interest calculator, or an interface to a database manager. A component runs within a context called a container, and it can be deployed on different servers in a network and communicate with other components for needed services.

1.1 RLCOO

Applying these ideas to the development of RLO, Macromedia's Flash components have been used as the basis of the content elements in the format of the Reusable Learning Components (RLC) [14], because Flash is an integrator of media and has a powerful, completely object-oriented programming language called ActionScript 2.0; it is a client that supports multimedia and the loading of certain media at run-time, providing a programmable environment that can adapt to the student's needs at run-time. Flash already has Smart Clips for the learning elements called LI, so the original idea was to generate a multimedia library of RLC for WBE. Different templates for RLC, as well as for programmable buttons with Smart Clips and the Clip Parameters interface, were generated, for example generic buttons for audio, video, exams, charts, etc. In the first versions the interfaces remained as static .swf files. New learning templates were also created to add functionality or to improve the existing one. To load components, a new mechanism was generated based on the integration of loadVars and loadMovie for the personalization of the materials at run-time, seeking to separate the content from the control of the component by using different levels inside the Flash player; this allows more specialized, smaller and reusable components to be generated and integrated at run-time inside a bigger component. The release of ActionScript version 2.0 inside Flash MX 2004 allowed the RLC to be redefined as Reusable Learning Components Object Oriented (RLCOO), making it possible to implement the object-oriented paradigm and to include certain common functionalities inside the Application Programming Interface (API), such as: communication with the LMS, communication with the agents, and dynamic loading of Assets at run-time. As we will see later, this mechanism is also used to provide communication of the RLCOO with the LMS. A general scheme of the RLCOO with its classes, and the API for dynamic loading of Assets, is shown in Figs. 1 and 2, respectively.

Figure 1. Basic diagram of classes of RLCOO written in ActionScript 2.0.

Figure 2. The general schema of RLCOO with dynamic load of Assets.

1.2 Intelligent RLCOO A new type of components, called IRLCOO, was developed for the advanced templates, adding nonlinear feedback from the users and the functionality of dynamic component generation for the autoconfigurable labeled materials. A general scheme of the IRLCOO with nonlinear feedback is shown in Fig. 3. For the nonlinear feedback, a small Neural Network (NN) of the Bidirectional Associative Memory (BAM) type was used. The NN is composed of two layers, identified by two groups of neurons {a., a2, . . . , a,} and {b., b2, .. . , bn} . The nxp generated connections conform a correlation matrix, represented by W [15]: m

W='L,A[Bk k

This matrix is generated by means of the vectors A= {a., a2, ... , a, } and 8 ={b., b2, ... , b0 }, being A the input vector that can be composed of different factors that a teacher wants to measure, as for example: time, score and number of intents; while the vector 8 represents the output or feedback that the student will receive in function of the input vector A (Fig. 4). This NN was programmed as a part of IRLCOO. Since in Flash two-dimensional matrices do not exist, one dimensional matrix was used adding some logic to work appropriately.

.....

Tilt: !CORE OI:PL'IDL'IC ON COa lltct"St:.SS. Jllo1.1N.aO Of' ATTI.MI"TS A.'li"O TI)Q

... "' eeee

Figure 3. The general schema of IRLCOO with Figure 4. An assessment lRLCOO of the Hot Spot e before evaluation. d amic feedback.

1.3 Separation of content and control with IRLCOO The components have three fundamental phases, the first of development, where it is implemented the different components for a course, bound to media, where the goal is to have separate the media of the component, functionality, metadata, and control, using the Authoring IRLCOO by means of files XML

532

- - - - - - - - - - - - - - - - - - - - R. Pereda Valderrama and L.Bal/adares Ocana

like static parameters. The second phase is when the component is mounted in the server and making use of the API for dynamic load of Assets in Run-Time maybe take the control of the component as much to level media as to level control, depending on the metric of the learners in the LMS. The third phase refers to that these components have functionalities to measure metric of the learners in RunTime without one has to program this. Achieving component the reusables and flexibility in Run-Time.

.

~·-··- - -- - --.

1

:

j

:

i ! ·-------------'

_

____ _ ____

,.,..,.,..,

,..... ..,..

Content objec

..

Standard Communication Methods

RTS/LMS

RunTime Enviroment

Spot

1.4 Communication between IRLCOO and LMS The common implementation of this model is in an asynchronous RTE (browser-based delivery) and one where a content object initiates all communication. As depicted in Fig. 6, a content object and the LMS implement communication adapters in the content object's RTE. The SCORM communications standard prescribes a rich language whereby the management system and module can communicate. Four of the most important SCORM commands: LMSinitialize, LMSFinish, LMSGetValue, and LMSSetValue. LMSinitialize marks the begin of a module, says to the LMS, "Starting up. Start your clock and begin tracking me". LMSFinish marks the end of a module, says to the LMS, "Stopping up. Stop the clock and cease tracking me". LMSGetValue command enables the module to request information from the LMS. LMSSetValue command can be used to send a wide variety of data to the LMS.

2. Authoring IRLCOO Once we have components and models for their structuring, development of easy-to-use authoring tools put in the hands of subject matter experts, is imperative to the success of the WBE. In this way, knowledge can be packaged and distributed very quickly. During the development of the didactic materials for EVA, it was decided to develop templates of the components and an authoring tool, called IRLCOO Factory, to facilitate their deployment, since many teachers are not experts in handling of multimedia tools. We have developed the structurer and packer of content multimedia using the IRLCOO as the lowest level of content granularity. The integration of the RLCOO and content aggregation is carried out on the base of the concept graph model that allows to generate materials at different granularity levels. It is important to mention that the IRLCOO are basically components ActiveX. As it can be seen in the Fig. 8.

3. Communication between Agents In this architecture, agents are embedded in the applications running both at the client and server (modeling the functionality of the LMS) sides and implemented as programmes in Visual Basic .NET (Fig. 9). Each one of these applications contain an agent based on an ActiveX control, by means of which communication among the applications is supported. In the case of a client, this agent is called Personal Assistant (PA) and in the case of a server, Sequencing Agent (SA). These agents inherit the functionality of the BasicAgent of the Component Agent Platform (CAP). The SA agent has an API to manage the RLOs repository (DB), implementing "sequence", "retrieve" and "delivery" services. Hosting embedded ActiveX agent, client's application has the following additional components. Its essential part is a browser invoked as a COM server that is used to visualize the IRLCOOs. Another component of the client application is an XMLDOM object to manipulate the IRLCOOs required by the user. The interaction between the user and the RLC is provided by the client's GUI. When the user requests an IRLCOO, the client's agent is activated and it generates a FIPA-ACL message (generally using the "request" performative) to be sent to the corresponding LMS 's agent. LMS is composed of a

Proposal for Development of Reusable Learning Materials for Wbe Using Irlcoo and Agents _ _ _ __

533

set of (distributed) SAs, each of them controlling a particular domain of the knowledge space defined the

9. Agen t-based architecture of the WBE

Acknowledgments The authors of this article would like to thank the IPN and CIC for partial support for this work within the project: 20040858. The authors would like to acknowledge all their colleagues and students participating in the design and development of the software and learning materials described in this article.

References [I) W. Veen, B. Collis, S. Santema, & R. Genang. Leamer led learning in an online community, Proc. of the World Conf. on ED-MEDIA 2001, Tampere, Finland, June 24--30,2001. [2] Colin Steed, Web-based training (Gower Publishing Limited, 1999). [3) Reusable Learning Object Strategy Definition, Creation Process and Guidelines for Building, v3.1 CiscoSystems,Inc.,URL:htto:/1 www.cisco.com/wam/public/ l 0/wwtraining/eleaming/implement/rlo strategy v3-l.pdf. [4) Open Knowledge Initiative, MIT, URL: http://web.mit.edu/oki/. [5) Advanced Distributed Learning Initiative, http://www.adlnet.org. [6] Global IMS Learning Consortium, URL: http://www. imsproject.org/. [7] L. Sheremetov & R. Peredo Valderrama, Development of labeled multimedia for WBT using SCORM, Proc. of the lASTED International Conf. CATE '02, Cancun, Mexico, 2002. [8] L. Sheremetov & V. Uskov, Hacia Ia nueva generacion de sistemas de aprendizaje basado en Ia Web (Towards the New Generation of Web-based Learning Environments), Computaci6n y Sistemas, 5 (4), 2002, 356-367 (in Spanish). [9] L. Sheremetov & R. Peredo Valderrama, Development of Reusable Learning Materials for WBE using Intelligent Components and Agents, International Journal of Computers & Applications, Volumen 25 I Number 3 ACTA Press, 2003, 170-178. [IO)D. Cowan, An object-oriented framework for University of Waterloo, Ontario, Canada, 1998.

liveBOOKs, Technical Report, CS-98,

[ll)L. Sheremetov & R. Quintero, EVA: Collaborative distributed learning environment based in agents, Proc. of the World Conf. ED-MEDIA 2001, Tampere, Finland, June 24--30,2001. [12] L. Sheremetov & A. Guzman Arenas, EVA: An interactive Web-based collaborative learning environment, Computers and Education, 39 (2), 2002, 161-182.

534 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ R. Peredo Valde"ama and L.Bal/adares Ocana [13] G. Eddon & H. Eddon, Inside distributed COM (Microsoft Press, 1998).

[14]Macromedia, Inc., URL: http://www.macromedia.com/. [15]M. Partida, R. Peredo, & F. Cordova, Simulacion de redes neuronales, Memoria Bidireccional Bivalente Adaptativa: BAM, Polibits, I (14), 1995, A ~no VII (in Spanish). [ 16] IEEE 1484.2, Draft Standard for ECMAScript API for Content to Runtime Services Communication. [17]IEEE 1484.11.1, Working Draft 14, Draft Standard for Data Model for Content Object Communication. [18]U. Vladimir & U. Maria, Reusable learning objects approach to Web-based education, Proc. ofthe lASTED International Conf. CATE '02, Cancfut, Mexico, 2002 [19]S. Franklin & A. Graesser, Is it an agent, or just a program?, in J.P. Mueller, M.J. Wooldridge, & N.R. Jennings (Eds.), Intelligent agents III (Berlin: Springer, 1997), 21-36. [20]M. Wooldridge, Agent-based software engineering, lEE Proceedings of the Software Engineering, 144 (I), 1997, 26[21] N.R. Jennings, On agent-based software engineering, Artificial Intelligence, 117, 2000, 277296. [22] FIPA XC00061, FIPA ACL message structure specification, 200 I. [23]L. Sheremetov & M. Contreras, Component agent platform. Proc. of the 2nd International Workshop of Central and Eastern Europe on Multi-Agent Systems, CEEMAS '01, Cracow, Poland, September 26-29, 2001,395-402. [24] IMS Simple Sequencing Specification, Version 0.7.5 Public Draft Specification, IMS Global Learning Consortium, Inc., April 2002. [25]A. Gordon & L. Hall, Collaboration with agents in a virtual world, Proc. of the Workshop on Current Trends and Artificial Intelligence in Education, 4, World Congress on Expert Systems, Mexico City, Mexico, 1998. [26]C.I. Peila de Carrillo, R. Fabregat Gesa, & J.L. Marzo Lazaro, WWW-based tools to manage teaching units in the PLAN-G distance learning platform, Proc. of the EDMEDIA 2000, World Conf. on Educational Multimedia, Hypermedia and Telecommunications, AACE, Montreal, July 2000.

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 535-538

A Trigonometrically Fitted Runge-Kutta Pair of Orders Four and Five for the Numerical Solution of the Schrodinger Equation Z.A. Anastassi and T .E. Simos

1 2 3 4

Department of Computer Science and Technology, Faculty of Sciences and Technology, University of Peloponnese, GR-221 00 Tripolis, Greece Received 21 September, 2004 Abstract: We are constructing a trigonometrically fitted pair of explicit Runge-Kutta methods of orders four and five. The pair has a variable step size which is determined by a specific algorithm. The algorithm is based on the error as that is expressed by the absolute difference of the values computed by each method separately. The step size control in combination with the trigonometrical fitting result in a very efficient method when compared to well known methods. Keywords: Runge-Kutta pairs, trigonometrical fitting, radial Schrodinger equation, resonance problem. PACS: 0.260, 95.10.E

1

Introduction

The radial Schriidinger equation has the following form:

y"(x) =

c(l;

1)

+ V(x)-

(1)

E) y(x)

where l(~-1; 1 ) is the centrifugal potential, V(x) is the potential, E is the energy and W(x) = l(~-1; 1 ) V(x) is the effective potential. It is valid that lim V(x) = 0 and therefore lim W(x) = 0. x-oo

x-oo

If we divide [O,oo] into subintervals [a;,b;] so that W(x) is a constant with value

problem (1) bedomes

y;' =

(W-E) y;,

y; ( x) = A; exp ( 1 President

whose solution is

JW -

Ex)

+ B;

exp (-

+

JW -

Ex),

A;, B; E

~.

W;,

then

(2)

of the European Society of Computational Methods in Sciences and Engineering (ESCMSE) Member of the European Academy of Sciences and Arts 3 Corresponding author. Please use the following address for all correspondence: Dr. T.E. Simas, 26 Menelaou Street, Amfithea- Paleon Faliron, GR-175 64 Athens, GREECE, Tel: 0030 210 94 20 091 4 E-mail: [email protected] 2 Active

536

Z.A. Anastassi and T.E. Simos

2 2.1

Basic theory

Explicit Runge-Kutta methods

An s-stage explicit Runge-K utta method used for the computation of the approximation of Yn+I (x), when Yn(x) is known, can be expressed by the following relations: s

Yn+! = Yn

+ Lb;k; i=l

(3)

where in this case

f

(x, y(x)) = (W(x)- E) y(x).

When solving a second order ODE such as (1) using first order numerical method (3), then problem ( 1) becomes: z'(x) = (W(x)- E) y(x) y'(x) = z(x)

while we use two pairs of equations (3): one for Yn+! and one for Zn+!· The method shown above can also be presented using the Butcher table below: 0

Coefficients

c2, .•• ,

c2

a21

C3

a31

a32

Cs

as!

as2

as,s-1

b!

b2

bs-1

(4)

b.

c8 must satisfy the equations: i-1

Ci

=L

(5)

aij, i = 2, ... , s

j=l

Definition 1 [1] A Runge-Kutta method has algebraic order p when the method's series expansion agrees with the Taylor series expansion in the p first terms: y(nl(x) = yf,~b.(x), n = 1, 2, ... ,p. 2.2

Step size control

The control of the step size is managed by the following algorithm: We choose two one-step methods of orders q9 and q9 , with q9 ~ q9 + 1. Let Y; and Y; be the approximate values of the two one-step methods for the exact solution y at the grid point x;. • We compute the approximate solutions Y and • We computeS= 0.9h

(

IIY~YII )

1/{qg+l)

Y;

, where

E

at x; =

-

+ h.

IIYII RELERR + ABSERR

537

A Trigonometrically Fitted Runge-Kutta Pair of Orders Four and Five

• If

IIY -

Yll

Xi+ 1 = X;

• If

< c:, then Yi+1

+ h.

= Y is accepted as the approximated value at the grid point The next step size will be h = min {S, 4h}.

IIY- Yll > c:,

then the step must be repeated with the new step size h

= max{S, 1/4h}.

See also [5]. 2.3

Exponentially fitted Runge-Kutta methods

Method (3) is associated with the operator

L b; u' (x + c;h, U;) s

L(x) = u(x +h)- u(x)- h

i=l i-1

U; = u(x)

+ hL

a;iu' (x

+ Cjh, Ui ),

(6)

i = 1, ... , s

j=1

where u is a continuously differentiable function. Definition 2 [2] Method (6) is called exponential of order p if the associated linear operator L vanishes for any linear combination of the linearly independent functions exp(v 0 x), exp(v 1x), ... , exp(vpx), where v;li = 0(1)p are real or complex numbers. Remark 1 [3] If v; = v for i = 0, 1, ... , n, n ::; p, then the operator L vanishes for any linear combination of exp(vx), xexp(vx), x 2 exp(vx), ... , xnexp(vx), exp(vn+ 1x), ... , exp(vpx). Remark 2 [3] Every exponentially fitted method corresponds in a unique way to an algebraic method (by setting v; = 0 for all i) Definition 3 [2] The corresponding algebraic method is called the classical method.

3

Numerical results- The resonance problem

In this paper we will study the case of E > 0. We will integrate the problem (1) with l = 0 at the interval [0, 15] using the well known Woods-Saxon potential UQ

V(x) = 1 + q

U1

q

uo =-50,

x~ xo),

q = exp (

+ (1 + q)2' a= 0.6,

xo = 7

and

where

(7)

uo

U1=--

a

and with boundary condition y(O) = 0. The potential V(x) decays more quickly than Schrodinger equation (1) becomes

y"(x)=

c(l;

1 (~1 1 ),

so for large x (asymptotic region) the

1) -E)y(x)

(8)

The last equation has two linearly independent solutions k x j1(k x) and k x n1(k x), where j1 and n1 are the spherical Bessel and Neumann functions. When x -+ oo the solution takes the asymptotic form

Z.A. Anastassi and T.E. Simas

538

y(x) ~ Akxj,(kx)- B kxnz(kx) ~ D[sin(k x- 1r l/2) +tan( 51) cos (k x-

1r

l/2)],

{9)

where 81 is called scattering phase shift and it is given by the following expression: {10)

where S(x) = kxj1(kx), C(x) = kxn1(kx) and X; < x;+l and both belong to the asymptotic region. Given the energy we approximate the phase shift, the accurate value of which is 1r /2 for the above problem. As regards to the frequency w we will use the suggestion of Ixaru and Rizea

[4]:

w=

VE

- 50,

vE,

x E [0, 6.5] X E

[6.5, 15]

References [1] Hairer E., N!llrsett S.P., Wanner G., Solving Ordinary Differential Equations I, Nonstiff Problems, Springer-Verlag, Berlin Heidelberg, 1993 [2] Simos T.E., An exponentially-fiited Runge-Kutta method for the numerical integration of initial-value problems with periodic or oscillating solutions, 115 1-8{1998). [3] Lyche T., Chebyshevian multistep methods for Ordinary Differential Eqations, Num. Math. 19 65-75{1972). [4] Ixaru L.Gr., Rizea M., A Numerov-like scheme for the numerical solution of the Schri:idinger equation in the deep continuum spectrum of energies, Comp. Phys. Comm. 19 23-27 1980 [5] Avdelas G., Simos T.E. and Vigo-Aguiar J., An embedded exponentially-fitted Runge-Kutta method for the numerical solution of the Schri:idinger equation and related problems, Comp. Phys. Comm. 131 52-67, 2000.

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 539-542

A Web-Based Simulator of Spiking Neurons for Correlated Activity Analysis F.J. Veredas 1 Departamento de Lenguajes y Ciencias de Ia Computacion, ETSI Telecomunicacion, Universidad de Malaga, 29071 Malaga, Spain Received 19 July, 2004; accepted in revised form 14 August, 2004 Abstract: A computational tool to study correlated neural activity is presented. SENNECA (Simulating Elementary Neural NEtworks for Correlation Analysis) is a web-based simulator specifically designed to simulate small networks of integrate-and-fire neurons. It implements a model neuron that can reproduce a wide scope of integrate-and-fire models by adjusting the parameter set. Launched simulations run remotely on a cluster of computers. The main features of the simulator are explained, and an example of neural activity analysis is given to illustrate the potential of this tool. Keywords: Spiking neuron; correlated firing; web-based simulator; computers cluster

1

Introduction

Some general-purpose simulators offer the neuroscientist the possibility of implementing highly realistic models of the membrane electrical behavior, with multicompartmental neuron models like in GENESIS [1, 8] and NEURON [2, 5], among the most popular tools. However, these models are computationally very costly. In particular, tasks that involve analysis of correlated activity with cross-correlograms imply long simulations, since the neurons under study must generate enough spikes for the correlogram shape to emerge from background activity. In these cases, dedicated tools clearly outperform popular simulators. In this paper we describe SENNECA [9], a specific-purpose web-based neural simulator dedicated to the study of pairwise correlated activity. SENNECA is based on a highly parameterized neuron model to allow the independent study of a wide number of physiological factors, and also to account for most variants of realistic neuron models -leaky vs. non-leaky, instantaneous integration vs. EPSP (excitatory postsynaptic potential), etc.

Methods The structure of a simulation has been divided in four sets of parameters that allow the experimenter to flexibly define (i) the network topology, (ii) the cellular electrical behavior, (iii) the synaptic transmission, and (iv) the simulator environmental setting. 'Corresponding author. Tel.: +34-952-137155; fax: +34-952-131397. E-mail: [email protected]

F.J. Veredas

540

Network parameters A network of neurons is defined by specifying the number of neurons, and the synaptic strength of the connections among them (given by W;j in equation 2), that can take positive or negative values for excitatory and inhibitory synapses, respectively. Neuronal parameters The model neuron embodied in SENNECA is a general-purpose integrate-and-fire model [7]. It has been designed according to two main objectives: (i) to allow the simulation of a wide range of models by setting parameters to proper values, and (ii) to provide the user with enough degrees of freedom to independently study the influence of physiological characteristics in pairwise analysis of neural activity. The electrical properties of the neuron membrane are expressed in equation 1, parameterized by the following constants: resting potential (v), threshold potential (8), afterhyperpolarization (v), membrane time constant (r) and refractory period (p). The behavior of the dendritic tree depends on five parameters: (i) background noise (r in equation 1) obtained from a distribution selected from a list of different noise sources: uniform noise, Gaussian noise, 1/f noise, Poissonian noise [10], and custom noise; (ii) post-synaptic current injection rise time (a), (iii) decay time (,B) and (iv) amplitude (w); and (v) axonal transmission delay and synaptic jitter. The model that interrelates all these magnitudes is expressed by three equations that govern the injected synaptic current (1, 2) and the membrane potential (3). (1)

{

w· · ..!!.. e-. ~ lJ Oj

W ij

Vi(t

+ ~t)

=

{

{3

- o +u

i>J-a,+u e-~ 11, J

vV;(i;)

if u ::; ai .;f u > ~J. • ~

(1- ~.t) Vj(t) + '2:t I;(t) +~.tv

if t ::0:: p+t; if t > p + i;

(2)

(3)

where ii is the step where neuron j fired last time, and Dji is the overall delay of the last spike in neuron j to reach neuron i, where both, constant transmission delay and variable synaptic jitter, contribute. C is de capacitance of the membrane. Some parameters are not expressed in this formulation for the sake of simplicity. Simulation parameters The environment is also parameterized to define the conditions under which the simulation will take place. Two halting conditions are provided: the number of spikes in the target neuron for a correlogram to be significant, and the maximum simulation time, given in simulation steps. The output shape is also affected by a different parameter, the correlogram window size. User interface The simulator home-page can be found at http: I /senneca. geb. uma. es and is structured in several pages for the interaction with the user. The different parameters are grouped into pages according with their functionality into the simulator: ( i) topological (number of neurons, and connection weights), (ii) neuronal (membrane and other physiological parameters), (iii) synaptic

541

Web-based spiking neurons simulator

(background noise and transmission delay and jitters) and (iv) simulation parameters (see above). It is also likely to introduce comments (free-style text) about the simulation. When a simulation is launched from SENNECA's graphical user interface, it will run remotely on a heterogeneous cluster of computer based on Mosix 7.0. The state of each simulation can be monitored and controlled from the web and the results are reported by email and also accessible at SENNECA web pages. SENNECA is also provided with a simulation data base to file the simulations of each user, which includes the possibility of sharing simulation results and information among different users. Furthermore, the simulator integrates a web-forum to allow scientific discussions among remote users. SENNECA is compatible with most web browsers (Explorer 5.0, Mozilla 5.0, Netscape 7.0 and Opera 5.0).

Simulation results A

B 250

"'

250

200

"'

·c.

"' 100

"' 100

50

50

~ 150

·c. monosynaptic

@---il

.. =....

.c:

.. =.. = ::E

l:>il

~

I

lr

I I)

~

=

-

Experimental

I,\~

b)

~

0.05

-

::E ·:;:

0.10

l:>il

·:;:

~

.c:

Model#2 -

Q

·:;: ~

Model# I

0.15

.c:

0.00

l:>il

IE+2 IE+J IE+4 IE+5 IE-tii IE+7 IE+8

Molecular Weight

0.05

0.00

~

IE+2 IE+J IE+4 IE+5 IE-tii IE+7 IE+8

Molecular Weight

Figure I: MWD of final product at polymerization temperature 70°C a) Feed initiator (AIBN) concentration 0.3% wt b) a) Feed initiator (AIBN) concentration 0.5% wt. Model#!: free volume. Model#2: chain entanglements. Experimental data obtained by GPC [18].

References [I] G.T. Russell, D.H. Napper and R.G. Gilbert Macromolecules 21 2141 (1988). [2] D.S. Achilias and C. Kiparissides, Macromolecules 25 3739 ( 1992). [3] P.G. deGennes, J. Chern. Phys., 55 572 (1971). [4] P.G. deGennes, Macromolecules 9 587 (1976). [5] P.G. deGennes, Macromolecules 9 594 (1976). [6] P.G. deGennes, Nature (London) 282 267 (1979). [7] J.S. Vrentas and J.L. Duda, J. Polym. Sci., Part B: Polym. Phys. 15 403 (1977). [8] J.S. Vrentas and J.L. Duda, J. Polym. Sci., Part B: Polym. Phys. 15 417 (1977). [9] W.H. Ray,J. Macromol. Sci. Rev. Macromol. Chern. C8 I (1972). [IO]D.S. Achilias and C. Kiparissides, J. Macromol. Sci. Rev. Macromol. Chern. C32 183 (1992). [II] C. Kiparissides, G. Verros and J.F. MacGregor, J. Macromo/. Sci. Rev. Macromol. Chern. C33 437 (1993). [ 12] D.S. Achi lias and C. Kiparissides, J. App/. Polym. Sci. 35 1303 ( 1988). [13] M. Smoluchowski, Z. Phys. Chern. 92 129 (1918). [14] S.K. Soh and D.C. Sundberg, J. Polym. Sci., Polym. Chern. Ed. 20 1299 (1982). [15]T.J. Tulig and M. Tirrell, Macromolecules 14 1501 (1981). [16]D.J. Coyle, T.J. Tulig and M. Tirrell, Ind. Engng Chern. Fundam. 24 343 (1985). [17]G.D. Verros, Polymer, 44 7021 (2003). [ 18] S. T. Balke and A.E. Hamielec, J. App/. Polym. Sci. 19 903 ( 1972).

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 551-552

Numerical and Monte-Carlo Calculation of Photon Energy Distribution in Sea Water D.S. Vlachos 1 Hellenic Center for Marine Research, PO BOX 712, GR-190 13 Anavissos, Greece Received 5 September, 2004; accepted in revised form 20 September, 2004 Abstract: Photons, when emitted in sea water, are subject to multiple scattering mechanisms, resulting in a shift in their initial energy. Consequently, a problem arises when measured spectra of radioactivity in sea water are processed, because the measured values do not reflect the initial photon energy but a distribution of energies. In this work a detailed analytical formulation of this distribution is performed and numerical results are compared to results from Monte-Carlo calculation. Keywords: Marine radioactivity, photon energy distribution PACS: 29.30.Kv, 29.40.Mc

1

Formulation

Assume that radioactive sources are distributed uniformly in sea water. This is a reasonable simplification if we consider the averaging effect of radioactivity over time. Let n( a) be the concentration of photons with energy E and a = E/m 0 c2 ( m 0 is the rest mass of electron and c the speed of light). The change of n(a) over time is given:

where

R(a)

the rate of photon generation at energy a& the probability per second for a photon with energy a to scatter to the energy a 1 due to Compton scattering is the probability per second for a photon with energy a to be absorbed leaving in its place an electron-positron pair. Finally, the positron will interact with an electron producing two photons with energy a = 1 the probability per second for a photon with energy a 1 to scatter to a different energy due to all scattering mechanisms.

'Corresponding author. E-mail: [email protected], [email protected]

552

Vlachos

For clarity, we assume that photons are generated only in one energy, namely ao. By setting:

n(a) =A( a)+ k · 8(a- ao) + m · 8(a- 1) we find for k and m k

m=2·

= _!!____

(2) (3)

S(ao)

k · R(ao) +go A( a)· R(a) · da

S(1)

(4)

and substituting in (1), we find for A:

G. C(a 0 ,a1 ) s(~o) +go A( a)· R(a) · da f 00 S(ao) +2· S( 1) ·C(1,at)+ Jo A(a)·C(a,at)da-A(a1 )·S(at) = 0

(5)

where in steady-state dn(a)fdt = 0. The above equation can be solved numerically. The functions R(a) and S(a)) are calculated from the XCOM software [1]. The function C(a,at) is calculated by the Klein-Nishima formula [2]which gives for the angular distribution of scattered photons:

p(B) =

1r •

where

2

r0

a ) ao

( -

a a0

2 ( -

+ -ao a

2

-sin B)

ao 1 + ao · (1 -cosO)

a = .,.------:-'---=

(6)

(7)

Changing the variable B to a we have:

C(o:o,o:) =

7r •

1 +1 ) ] · [ u(a- - ao- ) - u(o:- o:o) ] r 02 21 [ -a +ao -- 1 + ( 1-a 0 ao o: o: ao 1 + 2o:o

(8)

where u is the step function. Equation (8) is solved numerically. The results are compared with those produced from MonteCarlo simulation of the distribution of photons in sea water.

References

[1] M.J. Berger and J.H. Hubell: XCOM: photon cross sections with a personal computer, NBSIR 87-3597 (1987). [2] N. Tsoulfanidis: Measurement and Detection of Radiation, Hemisphere Publishing Corporation, ISBNJ 0-8916-523-1 {1983).

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 553-555

Monte-Carlo Simulation of Clustering in a Plasma-Discharge Source D.S. Vlachos 1 Hellenic Center for Marine Research, PO BOX 712, GR-190 13 Anavissos, Greece and

A.C. Xenoulis N.C.S.R. Demokritos, GR-153 10 Agia Paraskevi, Greece Received 5 September, 2004; accepted in revised form 20 September, 2004 Abstract: The assumption that the ionization and coagulation of metal plasma-generating cluster source can be described by the orbital limited motion theory, when it was examined by Monte Carlo calculations, was found inadequate. That assumption cannot firstly predict the modes of cluster ionization observed experimentally. Secondly, the effectiveness of coagulation was found to depend on the initial size of the condensing particles. Specifically, the coagulation of dust particles proceeds in a satisfactory degree. The coagulation of single atoms in a cluster source, however, saturates at about 140 atoms per cluster, in serious underestimation of relevant experimental data. Keywords: Monte-Carlo, Clustering, Plasma-Discharge Mathematics Subject Classification: 65Z05 PACS: 02.70.Uu, 81.07.Bc

1

Introduction

Certain of the most important cluster sources (such as the laser-ablation, the magnetron Hnd the hollow-cathode source) generate plasma during particle coagulation. Nevertheless, clustering under plasma conditions is very poorly studied and understood. On the experimental front it has been observed that negatively as well as positively ionized clusters coexist during coagulation [1], suggesting that electromagnetic interactions may be responsible for clustering. On the theoretical front, the coagulation of lOA-radius dust particles in a steady-state plasma was successfully described in terms of electrostatic and dipole interactions [2]. The untested use of this model to describe clustering in a gas-aggregation source, however, is not safe because in the source the coagulation conditions are significantly less favorable than those previously assumed [2]. The purpose of the present study is to examine whether or under what conditions the model of Ref. [2] is applicable to a plasma-discharge cluster source. The relevant realistic conditions tested in the 1 Corresponding

author. E-mail: [email protected], [email protected]

554

Vlachos and Xenoulis

present Monte Carlo calculation is that clustering starts from atoms and that the interaction time is about 1msec [1].

2

The Simulation Algorithm

It may be instructive first to describe qualitatively the evolution of events taking place when Cu atoms (or dust particles) and plasma are mixed. As soon as the interaction is turned on, all the present (initially neutral) Cu atoms are negatively ionized. This happens because the number of collisions with electrons is overwhelming compared to those with Ar+ ions. As a consequence, a short-lived outburst of clustering observed at the very outset, caused by Cu0 - Cu- dipole interaction, stops because, as already mentioned, all the particles become negatively charged and repulse each other. Nevertheless, a few dimmers manage to be formed in the mean time. As the particleplasma interaction continues, some of the negatively charged metal particles will turn to neutral when they collide with Ar+. These fresh neutrals are very short lived. They almost instantly interact with either other negatively charged Cu particles leading to larger negative clusters, or with electrons reverting to negatively ionized clusters of the same mass. This clustering process, however, soon fades out because the density of scattering centers is reduced due to clustering. Nevertheless, because in the mean time the dimensions of the clusters have increased, a second coagulation stage, supported by multiple ionization, becomes possible. During that stage, the mean negative charge per cluster increases significantly causing a commensurate increase in the strength of the dipole interaction giving a fresh outburst of clustering events. A sample volume, zo- 15 m 3 , of the experimental apparatus has been considered with periodic boundary conditions. Monte Carlo calculations take into account the interaction of metal particle with each other and with the plasma components, i.e. electrons and ions. A constant electron density is assumed and the overall neutrality is obtained by adjusting the ion density. All plasma components and metal particles are considered in thermodynamic equilibrium. The scattering rate of a particle with a specific type of scattering center is calculated using the formulae:

(1) where N is the density of scattering centers, v the relative velocity of the particle and the scattering center and u the cross section which is calculated in the center of mass reference system. Both coagulation and ionization cross sections are calculated taking into account electrostatic forces. In particular, a neutral particle interacts with a charged particle via a dipole moment induced by the latter to the former. Finally, the temperature of clusters is considered to be size independent, while the density of all plasma components follows a Maxwell-Boltzmann distribution over energy. More details may be found in Ref.[1]. The calculations were performed for two kinds of particles, differentiated by their initial size. A radius of 1.5A represents Cu atoms, while a radius of lOA represents the metal dust particles used in the previous calculation [2].

3

Simulation Results

Figure la shows the evolution in time of the mean number of atoms per cluster and Figure lb of the mean charge per cluster, for two kinds of particles with initial radius 1.5A and lOA. The results of Figure la clearly demonstrate that the efficiency of coagulation depends significantly on the initial dimensions of the particles involved. In fact, while the lOA particles, in agreement with Ref. [2], coagulate efficiently, the coagulation initiated by single atoms stops relatively soon reaching a plateau at about 140 atoms per clusters. This is a serious size underestimation since the experimental sizes range between 1,000 and 300,0000 atoms per cluster [1].

Monte-Carlo Simulation of Clustering in a Plasma-Discharge Source

'! ti

'E

200 180 ,8160 ! 140 13 120

a.

~(too c . 80

1e

::~I

60

I

40 20

0 0

I

I

I

.......... _. :

- .. . ,. •

.· ...

••

555

Br---------~



I

.•

••

0.3 0,6 0,9 1,2 l,S Tlrne(m•c)

0 ........................ 0 .., 80 120 180 3X)

...... ........,.,llelmlper c:lu*r

Figure 1: The dependence of mean number of particles per cluster on (a) time and (b) on mean negative charge per cluster for two different atomic radius: circles correspond to 1.5A and squares to lOA.

It should be pointed out that several contradictory effects are competing with each other, and depending on the particular circumstances the one or the other may assume ascendancy. Such a case is, for instance, in the very early stages of clustering, when as Figure Ia reveals single atoms coagulate faster than dust particles. To understand this feature which seems inconsistent with the final outcome, it must be taken into account that, as Figure lb shows, the small clusters built by atoms sustain smaller negative charge than the clusters composed of dust particles. Consequently, the probability for the former clusters to get neutralized via collisions with Ar+, and as a consequence to coagulate, is larger than that for the later. This coagulation, however, reduces the number of the atoms and of the atom-composed clusters, leading to the plateau observed in fig la. On the other hand, the charge on the dust- composed clusters, as Figure lb shows, increases faster with size than on the atom-composed clusters. Apparently, this faster increase is sufficient to compensate for the decreasing number of the dust-composed clusters (see Equation 1), ensuring the unhindered coagulation of these particles.

References [1] A.C. Xenoulis, G. Doukelis, P. Tsouris, A. Karydas, C. Potiriadis, A.A. Katsanos and Th. Tsakalakos: Vacuum 49-113 (1998). [2] V.A. Schweigert and LV. Schweigert: J. Phys. D: Appl. Phys. 29-655 (1996).

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 55&-557

On Frequency Determination in Exponential Fitting Multi-step Methods for ODEs D.S. Vlachos Hellenic Center for Marine Research, PO BOX 712, GR-190 13 Anavissos, Greece and

T.E. Simos 1 Department of Computer Science and Technology, Faculty of Sciences and Technology, University of Peloponnese, GR-221 00 Tripolis, Greece Received 5 September, 2004; accepted in revised form 20 September, 2004 Abstmct: The frequency determination in an exponential fitting multi-step method is a question without a definite answer. In most of the cases, the estimation of the frequency arises from the nature of the problem, as in the Schr&linger equation. Another approach is to select a frequency which increases the order of the method by zeroing the first non vanishing term of the linear truncation error. In this work, two general methods are exploited: the first is applicable to equations of the form y"(x) = f(x,y) and relates the frequency with the coefficient of y in J(x, y) while the other connects the frequency with the curvature of the solution. Keywords: Exponential fitting, multi-step methods Mathematics Subject Classification: 65L06

1

Introduction

Consider a differential equation of the form:

y" = f(x,y)

(1)

where the first derivative of y is not explicitly appear in f. Now let an exponential fitting method is employed for the numerical integration of the above equation. Recently, Ixaru et a! have proposed in [1], that an optimal way to select the frequency for the exponential fitting method is such that the first non vanishing term of the linear truncation error is zeroing. Although this is reasonable, it comes out that this operation needs the evaluation of high order derivatives of the unknown function y(x) making it impractical especially in high order methods. 1 Corresponding author. Active Member of the European Academy of Sciences and Arts. [email protected], [email protected]

E-mail: simos-

Frequency Determination in Exponential Fitting Multi-step Methods

557

On the other hand, Simos in [2] proposed that if

f(x, y)

= c · y + ...

(2)

then we can select for the frequency in the exponential fitting method the parameter: w

=vfcl

(3)

which gave very accurate results without further computational effort. In this work, a different approach is proposed for the calculation of the frequency. Consider for clarity the following differential equation: d ( x(t) ) ( fx(t) ) y(t) = /y(t)

dt

(4)

and an exponential fitting method to solve it. It is clear then that the solution (x( t), y( t)) T can be represented in the xy-plane as a curve. It is well known, that the circle that best approximates the curve(x(t), y(t))T at the point t =to has radius equal with the inverse of the curvature at the same point, defined as:

lr x rl

K=~

(5)

_ ( x(t) )

(6)

where r-

y(t)

It follows now, that the frequency for the exponential fitting method can be estimated by:

w=

K ·

. lrl

=

lr x rl Jj:j2

(7)

The basic advantage of the proposed method is twofold: first that only one derivative has to be calculated and second that the motion of the center of the osculating circle can be similarly considered in order to produce a second frequency and so on. The proposed method is tested and compared to a set of problems usually encountered in the bibliography.

References [1] L.Gr. Ixaru, G. Vanden Berghe and H. De Meyer, Journal of Computational and Applied Mathematics, 140-423 (2002). [2] T.E. Simos and Jesus Vigo-Aguiar, Computer Physics Communications, 140-358 (2001). York, 1965.

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 558-561

A Hybrid Adaptive Neural Network for Sea Waves Forecasting D.S. Vlachos 1 Hellenic Center for Marine Research, PO BOX 712, GR-190 13 Anavissos, Greece Received 5 September, 2004; accepted in revised form 20 September, 2004 Abstmct: The physical process of generation of wind waves is extremely complex, uncertain and not yet fully understood. Despite a variety of deterministic models presented to predict the heights and periods of waves from the characteristics of the generating wind, a large scope still exists to improve on the existing models or to provide alternatives to them. In this work, a hybrid adaptive neural network has been designed and used in order to predict the wave height. The system has been proved to produce a 95% successful24 hours prediction of wave height after 2 months of operation. Keywords: Neural networks, fuzzy controller, wave forecast Mathematics Subject Classification: 68T05 PACS: 84.35.+i, 92.10.Hm

1

Introduction

The knowledge of heights and periods of oscillatory short waves is essential for almost any engineering activity in the ocean. These waves are generated by the action of wind through pressure as well as shear mechanism. Wind-wave relationships have been explored over a period of five decades in the past by establishing empirical equations and also by numerically solving the equations of wave growth [1], [2]. However, the complexity and uncertainty of the wave generation phenomenon is such that despite considerable advances in computational techniques, the solutions obtained are neither exact nor uniformly applicable at all sites and at all times. Moreover, since the numerical models that have been developed for wave prediction are strongly depend on the wind forecasting, any errors or divergences in the wind model have been founded that dramatically amplified in the results of the wave model. On the other hand, many of the most important properties of biological intelligence arise through a process of self-organization whereby a biological system actively interacts with a complex environment in real time. The environment is often noisy and non-stationary, and intelligence capabilities are learned autonomously and without benefit of an external teacher. For example, children learn to visually recognize and manipulate complex objects without being provided with explicit rules for how to do so. The main problem encountered in the design of an intelligent system 1 Corresponding

author. E-mail: [email protected], [email protected]

A Hybrid Adaptive Neural Network for Sea Waves Forecasting

559

capable of autonomously adapting in real time to changes in his world, is called the plasticitystability dilemma, and a theory called adaptive resonance theory is being developed that suggests a solution to this problem (3]. The plasticity-stability dilemma asks how a learning system can be designed to remain plastic, or adaptive, in response to certain events, yet also remain stable in response to irrelevant events. In particular, how can it preserve its previously learned knowledge while continuing to learn new things. The proposed neural network based wave height prediction system has been designed taking into account the above considerations. The second important consideration of the proposed system is that, since wind forecast is necessary to obtain accurate predictions for the wave height, there must be a mechanism to handle for the reliability of the wind model. The solution that we adopted is to design a hybrid network, in which the wind forecast is used as an input to a fuzzy controller and the overall prediction is produced by a three-layer perceptron.

2

Network Architecture

Figure l shows a schematic diagram of the neural system used for the wave height prediction and its subsystems. The input of the system is the significant wave height, the wind speed and the wind direction time series. The data are coming every three hours and already predicted values of wind speed and wind direction are included. These values are produced by the POSEIDON system [4]. The different subsystems are explained in detail in the following.

WindS peed

Fuzzy

--+I Controller

WlndDir

Figure 1: Schematic diagram of the hybrid adaptive predicting system.

The role of the Sl Switch is to feed the actual value of the wave significant height either to the long-term memory or the error control subsystem. When a new value of the significant wave height (HmO) is coming, the 51-Switch feeds this value to the Error Control Subsystem in order

560

Vlachos

to correct the previous predictions of the system. After the correction is performed, the Sl-Switch feeds the incoming value to the S2-Switch. The role of the S2-Switch is to feed the long-term memory either with the actual value of the wave height or the predicted one in order to obtain a deeper prediction. More precisely, the value of wave height at the time tn can be used to predict the wave height at time tn+l· Then this value can be used in order to predict the wave height at time tn+2 and so on. In our case the prediction is performed up to time tn+8 which corresponds to twenty-four hours prediction. The wind speed and wind direction forecasts are the inputs of the fuzzy controller. The output of the controller, which is a fuzzifiedversion of the wind speed and direction, is analyzed in u- v components (WindN, WindE). The neural network used to predict the wave height is a three layer back propagation network [5]. The first layer consists of neurons that simply hold the input values. These values are passed through the weighted connections to the second layer, which consists of neurons that sum their input and pass it to their activation system. The output of the second layer is passed through the weighted connection to the third layer, which have only one neuron. Let In be the network input, Wnk the weights of the connections between the first and second layer, fk the activation functions of the neurons of the second layer, 9k the weights of the connections between the second and third layer, N the number of input neurons and K the number of the neurons of the second layer. The activation function of the neurons of the second layer is given by the formulae: (1) The slope of the activation function is very important because the sensitivity of the system depends on it. Moreover, there is a different slope between positive and negative values in order to account for the effect of wind direction in coastal zones. In these cases, the direction of the wind is critical for the wave growth. The weights Wnk and 9k and the parameters and c; are the long-term memory of the system. These (N + 3) * K parameters are modulated during adaptive learning. In every correction step, each parameter is modified in such a way that the overall error in prediction will be decreased. As far as the fuzzy controller is concerned, its role is to distribute the wind in fuzzy bands and register its tension. In this way, the exact values for the wind speed and direction, as they are indicated by the numerical model, is not so important in the dynamics of the system. On the other hand, the tension of the wind in conjunction with the already register values of the wave height is the key to the prediction.

ct

3

Results

Figure 2 shows the prediction results of the system after three months of operation. The results are compared with the measured values of weight height, the values calculated by the arithmetic model of the POSEIDON system and the values produced by an adaptive system without the fuzzy treatment of the wind data [6]. It is clear from this figure, that errors in wind speed and wind direction prediction reflect directly to the wave height prediction by the numeric model. On the other hand, the adaptive system is less sensitive to these errors while the fuzzy treatment of the wind data improves further the predictions.

561

A Hybrid Adaptive Neural Network for Sea Waves Forecasting

3,-------------------------------------r===~

-HmO --- Model

2.5

t-----------------7+----------------------------~-*-ANN

--+-HANN

2

l:

...."'

-;;

.t:

1,5

>

~

0+-----~------~----~------~------r------r----~ 70 60 50 40 30 20 10 0

Time

Figure 2: Schematic diagram of the hybrid adaptive predicting system. The solid line is the actual values, the empty squares are the model predictions, the asterisks are the values predicted with the adaptive system and the empty circles are the values predicted by the hybrid adaptive neural network.

References [1] B. Kissman: Wind Waves, Prentice Hall, Englewood Cliffs, NJ, 1965. [2] World Meteorological Organization, 1988, Guide to Wave Analysis and Forecasting, WMO no. 72. Secretariat of the World Meteorological Organazation, Geneva, Switzerald.

[3] D.S. Vlachos, D.K. Fragoulis and J.N. Avaritsiotis, Sensors and Actuators B, 43-1 (1998). [4] T . Soukissian, G. Chronis and K. Nittis, Sea Technology, 40(7)-31 (1999). [5] B. Kosko: Neural Networks and Fuzzy Systems, Prentice Hall, Englewood Cliffs, NJ, 1992. [6] D.S. Vlachos and A. Papadopoulos, Elsevier Oceanographic Series, 69-403, (2003).

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 562-565

A Rectangular Trust-Region Approach for Unconstrained and Box-Constrained Optimization Problems C. Voglis 1 and I.E. Lagaris 2 Department of Computer Science, University of Ioannina, P.O. BOX 1186 - GR 45110 Ioannina, Greece Received 4 August, 2004; accepted in revised form 21 August, 2004 Abstmct: A convex quadratic programming problem with bound constraints is solved at

first, using a Lagrange multiplier approach. Then a trust region method for non-linear optimization with box constraints is developed, where the trust region is a hyperbox, in contrast to the usual hyperellipsoid choice. The resulting subproblem is solved using our above mentioned QP technique. Keywords: Trust-Region, Quadratic programming, Lagrange multipliers, box constraints Mathematics Subject Classification: 90C20, 90C30, 90C53

1

Introduction

Non-linear optimization plays an important role in many fields of science and engineering, in the industry, as well as in a plethora of practical problems. Frequently the optimization parameters are constrained inside a range imposed by the nature of the problem at hand. Developing methods for bound constrained optimization is hence quite useful. The most efficient optimization methods are based on Newton's method where a quadratic model is adopted as a local approximation to the objective function. Two general approaches have been followed. One uses a line-search along a properly selected descent direction, while the other permits steps of restricted size in an effort to maintain the reliability of the quadratic approximation. The approaches in this second class, bear the generic name Trust-Region techniques a brief description of which is given in Section 2. In this article we will deal with a method of that type. We develop a method that adopts a hyperbox geometry for the trust region. This offers the obvious advantage of dealing with linear instead of quadratic constraints that are imposed by spherical trust regions. In addition it allows effortless adaptation to bound constrained optimization problems. We analyze the approach for the bound-constrained quadratic programming in Section 3, and we present an algorithmic solution in Section 3.1. In section 4, we embed this QP technique in the general setting of the trust region approach, for both unconstrained and bound-{;onstrained problems.

2

Trust Region Methods

Trust region methods[!] fall in the category of sequential quadratic programming. The algorithms in this class are iterative procedures in which the objective function f(x) is represented by a 'Corresponding author. E-mail: [email protected] [email protected]

2 E-mail:

A Rectangular Trust-Region Approach for Unconstrained and Box-Constrained Optimization Problems

563

quadratic model inside a suitable neighborhood (the trust region) of the current iterate, as implied by the Taylor series expansion. This local model of J(x) at the kth iteration can be written as: f(xk

+ s), mk(s)

= J(xk)

+ sT g(k) + ~sT B(k)s

(1)

2

where g = 0.0447 A at T=310K. This show that in normal biological condition DNA could melt partially and go on normal transcription. The critical value of< u2 > of a hydrogen bond[JJ, which designates beginning of melting or transcription of DNA, is chosen as < u2> = 02

0.040 A . Thus we can find down-critical temperature is Tdown =285K in our model. If assuming that 0 2

at< u2 >., =0.06 A all base- pairs are melted, then we can find up-critical temperature is Tup ~ 400K.

Acknowledgments *One of authors, Pang Xiao-feng, would like to acknowledge National Natural Science Foundation of China (grant No:l9974034).

References [1], S.Yomosa,Phys.Rev.A27(1983)2120;30(1984)474; 2186;62( 1993)1 075;59(1990)3765:

J.Phys.Soc.JPN

64(1995)1917;62(1993)

[2],S.Takeno and S.Homma, Prog.Theor.Phys.70(1983)30877(1987)548; 59(1990)1890; Homma,physica 0114(1998)202; J.Biol. Phys. 24(1999)115;Phys. Lett.Al33(1988)275

S.

[3]E.W.Prohofsky, Statistical mechanics and stability of macromolecules, Cambridge Univ. Press,Cambridge, 1995; Phys.Rev.A38(1988)1538;Comments Moi.Ceii.Biophys.2(1983)65; E.W.Prohofsky,K.C.Lu,L.Van Zandt and B.E.Putnam, Phys.Lett.70A(l979)492 [4]M.Peyrard and A.R.Bishop, Phys.Rev.Lett.62( 1989)2755 [5], M.Barbi, S.Cocco and M.Peyrard, Phys. Lett.A253(1999)358;S.Cocco, M.Barbi and M.Peyrard, Phys. Lett.A253(1999)161; M.Barbi, S.Cocco, M.Peyrard and S.Ruffo, J.Biol. Phys.24( 1999)97;S.Cocco and R.Monasson, Phys.Rev.Lett.83( 1999)5178 [6]Pang Xiao-feng, Chin. J. Atom. Mol. Phys. 19(2002)417; Physics, Prog. Phys. (Chinese) 22(2002)218 [7],S.W.Englander and J.J.Englander, Proc.Nati.Acad.Sci.USA,53(1965)370 [8],J .J.Englander and P.H. Von.Hippel, J .Mod.Bio.63( 1972) 171

Nonlinear Excitation and Dynamic Features ofDeoxyribonucleic ACID (DNA) Molecules _ _ _ _ __

[9],S.W.Englander and J.J.Englander, methods Enzymol.49( 1978)24 [IO],G.M.Mrevlislivil, Sov. Phys. USP.22(1979)433;

577

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 578-581

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Influences of Structure Disorders in Protein Molecules on the Behaviors of Soliton Transferring Bio-Energy Pang Xiao-feng( 1X2 l, and Yu Jia-feng0 l and Lao Yu-hui(J) (lllnstitute of High-Energy Electronics, University of Electronic Science and Technology of China, Chengdu 610065, P.R. China and (2llntemational Center for Materials Physics, Chinese Academy of Sciences, Shenyang 110015, P.R.China Received 11 July, 2004; accepted in revised form 8 August, 2004 Abstract: Collective effects of the structure disorder of the protein molecules, containing inhomogeneous distribution of masses for the amino acids and fluctuations of the spring constant, of dipole-dipole interaction constant and of exciton-phonon coupling constant and diagonal disorder, resulting from nonuniformities and aperriodicities, on the soliton in the improved model have been numerically simulated. The results obtained shows that the structure disorders can change the states of the solitons, but the soliton is quite robust against these disorder effects, it is only dispersed or disrupted at larger structure disorders. According to properties of structure of normal proteins we can conclude from these results that the soliton in the improved model is stable, it is possibly a carrier of bio-energy transport in the protein molecules. Keywords: Protein molecule, structure disorder, collective effect , siliton , bioenergy transport PACS numbers: 87. 22As, disorder 03. 65-w, 05. 40+j, 71.38+j

We know that Bio-energy transport is a fundamental process in the biology, a lot of biological phenomena, for example, muscle contraction, DNA replication, nervous information transfer along cell membrane and work of sodium and calcium pumps, are associated with it, where the energy is released by hydrolysis of adenosin triphosphate. However understanding the mechanism of the transport is a long-standing problem that remains of great interest up to now. Following Davydov's idea[l 1, proposed first by Davydov in the 1970s[ll,one can take into account the coupling between the amide-I vibrational quantum (exciton) and the deformation of amino acid lattice. Through the coupling, nonlinear interaction appears in the motion of the vibrational quanta, which could lead to a self-trapped state of the vibrational quantum, thus a soliton occurs in the systems11 1. It can move together with deformational lattice over macroscopic distances along the molecular chains retaining the wave shape and energy and momentum and other properties of quasiparticle. The Davydov model have been extensively studied by many scientists from the 1970s121 • However, characteristic of Davydov's solitonlike quasiparticles occur only at low temperatures about T=I a(t) >I fJ(t)

>=I [1+~a. (t)B; + ~! . (t)B; )

*

exp{- ~[q. (t)P. -Jr" (t)u.J} I0

>ph

2

}

o>..... (I)

Influences ofStructure Disorders in Protein Molecules on the Behaviors ofSoliton Transfe"ing Bio-Energy _ 579

(2)

s:

n

and Bn are Boson creation and annihilation operators for the exciton, u. and Pn are the displacement and momentum operators of lattice oscillator in site n, respectively. A. is a normalization constant, we here choose A. =1. Where & 0= 1i m 0=1665cm·'=0.2035ev is excitation energy of an isolated amide-! oscillator or energy of the exciton (the C=O stretching mode). Present non-linear coupling constants are X 1 and z 2 , they represent the modulations of the one-site energy and resonant (or dipole-dipole) interaction energy for the excitons caused by the molecular displacements, respectively. M is the mass of an amino acid molecule, and W is the elasticity constant of the protein. J is the dipole-dipole interaction energy between neighboring sites. Average values of the parameters are where

J =0.967 meV, W =13 N/m, M =114mp, Co= & 0 , %1 = 62 pN and %2 =10-15 pN, respectively. From our Hamiltonian and wave function in E~s. (1)-(2) we obtained that the new soliton is thermal stable, has so enough long lifetime (about to· -10- 10 ps) at biological temperature 300K, its critical temperature is 320K. This shows that the new soliton is possibly a carrier ofbio-energy transport in the protein. However, the proteins, which consist of 20 different amino acid residues with molecular weights between 75 mp (glycine) and 204 mp (tryptophane), are not periodic, but nonuniform, in which there is structure disorder. The inhomogeneous distribution of masses for amino acid residues result necessarily in fluctuations of the spring constant, dipole-dipole interaction constant and exciton-phonon coupling constant and ground state energy or diagonal disorder of the protein molecule. Thus, it is very necessary to study influences of different structure disorders on the solitons. We now investigate this problem by fourth order Runge-Kutta way£41 . From Eqs. (l)-(2)and Schrodinger and Heisenbeger equations we can get the following simulation equations (3) arn =-J(ain+l +ain_,)+ z,(qn+l -qn_,)ain + X2(qn+l-qn)(ain+l +ain_,) -1i ain = -J(arn+l + arn-l) +X, (qn+l- qn_,)arn + X2 (qn+l -qn)(arn+l + arn-l)

(4) (5)

qn=7rn/M

~n = W(qn+l- 2qn + qn_ 1 ) + 2z1{arn2+1 + ai;+, - arn2_1 -ai;_,)

(6)

+4z 2[arn(arn+l -arn_ 1 )+ain(ain+l -ain_1 )] where a,.(t)=a(t)r.+ia(t)i.,

ian 12 = larn 12 + lail, a{t)r. and a(t)in are real and imaginary parts of a,.(t).

In the simulation by the fourth order Runge-Kutta way we should first make discrete treatment for the variables in above equations, the time be represented by j, the step size of the spacing variable is 0

denoted by h. The system of units ev for energy, A for length, and ps for time proved to be suitable for the numerical solutions with a time step size of0.0195. Note that we here used fixed chains and at the size N, where N is the number of units, we choose to be N=200 or 50 .The initial excitations are a,.(o)=Asech[(n-no) ( 1 + z 2 ii4JW] (where A is normalization constant), q.(O)= 7r .(0)=0. The simulation was performed by data parallel algorithm and MALAB langrage. In the case of periodicity proteins. Using above average values for the physical parameters, numerically simulation shows that the solution can always retain basically constant to move without dispersion along the molecular chains in the range of spacings of 200 amino acid residues and the time of 50ps, i.e. Eqs.(3)-(6) have exactly soliton solution. In order to study influence of a random series of masses on the soliton, we introduce a parameter a k> which is a random -number generator with equal probability within a prescribed interval and can denote the mass at each point in the chain, i. e., Mk= a

z

k M. We show numerically that up to a larger intervals for a k> for example, 0.67,; a k,;300, the stability of the soliton does not change, but in the case of the large intervals as 0.67 ~ a k~ 700, the vibrational energy is dispersed.

580 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Pang Xiao-feng, and Yu Jia-feng and Lao Yu-hui

In general, the disorder of masses of amino acids will result in the changes of the spring constant W, dipole-dipole interaction constant J, and coupling constant( X 1 +X2 ) and ground state energy E 0 which are represented by Ll W=W- W , Ll J=J -

J , Ll ( X 1 + X 2 )=( X 1 + X 2 )-( X1 + X2 )

and

li &o=& 0 -co=t::IPnl• respectively, here f3.is a random number generators to designate the random features of the ground state energy. We see that up to a random variation of ±40% W or LlJ :$ 9%], or Ll(X 1 + X 2 )..JZ(m),

(6)

where Z is the unknown vector, F is the nonlinear vector form , a nd D , L , U, I , and mare diagonal, lower triangular, upper triangular, identity mat rices, and index of monotone iteration, respectively.

584

Li, Chen, and Yu

The monotone iterative parameter is determined node-by-node depending on the device structure, doping concentration, grain boundary concentration, bias condition, and nonlinear property of each decoupled equation. The monotone iterative method applies here for semiconductor device simulation is a global method in the sense that it does not involve any Jacobian matrix [5].

++ ~ + +~ +

Figure 2: The left figure is the initial mesh which contains 52 nodes, the middle one is the third refined mesh containing 1086 nodes, and the right one contains 3555 nodes is the 5th mesh.

4

Results and Discussion

We now present several simulation results to demonstrate accuracy and performance of the proposed numerical algorithm. The device is a typical poly-TFT with 100 nm gate oxide thickness, and the junction depth is 50 nm. It has a LDD elliptical Gaussian profile with 2 * 10 20 cm -J . Fig. 2 shows the process of mesh refinements. The mechanism of one-irregular mesh refinement is based on the estimation of solution error element by element. Fig. 3-(a) reports the relationship of the number of nodes (and elements) versus the refinement levels. The increasing rate of the number of nodes (and elements) gradually becomes slow when the refinements increase. Fig. 3-(b) shows the monotone iterative convergence behavior when solving the electrostatic potential with (w/) and without (w jo) grain boundary traps in the simulated device, and the biasing conditions are Vv = 1.0 V and Va = 1.0 V. Fig. 4 shows the computed potential and electron density for the device under bias conditions Vv = 1.0 V and Va = 1.0 V, respectively. In the channel region of the device, obviously, the simulated electron density reveals the grain boundary effects.

5

Conclusions

In this paper, we have successfully applied our adaptive computing methodology to the poly-TFT simulation. This solution scheme mainly relies on adaptive one-irregular mesh, finite volume, and monotone iterative methods. Numerical results and benchmarks for a TFT are also presented to show the robustness and efficiency of the method.

Acknowledgment This work is supported in part by the National Science Council of Taiwan under contracts NSC-932215-E-429-008 and NSC 93-2752-E-009-002-PAE, the grant of the Ministry of Economic Affairs, Taiwan under contract No. 92-EC-17-A-07-Sl-0011, and the grant from Toppoly Optoelectronics Corp, Miao-Li County, Taiwan.

585

Thin-Film Transistor Simulation

~6000 t:

Q)

E Q)

(a)

1e+2 r-----------------------~

Elements ~0

~ :::-1 e+O ~

~4000 E.

e

\;_ -~

-

~1e-2

t:

Ul

~> .

'-- "

w

Nodes

(b) ········ w/ grain boundary traps - - w/o grain boundary traps

Q)

Q)

!:" Q)

"0

g 2000

..

1: 1e-4 -

0

-

...._,

....__"

0

'\

u

\

\ __, I ...... I _ __,_ 1e-6 ..__......I_ _.....

2

6

4

Refinement level

10

8

1000 100 10 # of Iteration (Log)

1

10000

Figure 3: (a) The number of nodes and elements versus the refinement levels. (b) A monotone iterative convergence behavior w/ and wjo applying grain boundary traps. o.5o

(a)

(b)

0.50

0.40

0.40

-

§

-

0.30

!>-

>0.20

0.30

0.20 1.598

0.10

0.00 0.00

0.80

1.60

2.40

3.20

20.41

1.095 0.592 0.089

0.10

-0.414

0.00

4.00

15.50 10.59 5.680 0.769 0.00

0.80

1.60

2.40

3.20

4.00

X (um)

X (um)

Figure 4: Contours of (a) the potential and (b) electron density for the TFT with Vv

= Vc = 1.0 V .

References [1] R. W. Dutton and A. J. Strojwas, Perspectives on Technology and Technology- Driven CAD, IEEE Trans. CAD 19, 2, 1544-1560(2000).

[2] D. L. Scharfetter, H. K. Gummel, Large-Signal Analysis of a Silicon Read Diode Oscillator, IEEE Trans. Elec. Dev. ED-16, 66-77(1969). [3] Y. Li, S-M Yu, and P. Chen, A Parallel Adaptive Finite Volume Method for Nanoscale Double Gates MOSFETS Simulation, in Proceedings of the International Conference of Computational Methods in Sciences and Engineering, 12-16 September, 382-386(2003) . [4] Y. Li, S. M. Sze, and T-S Chao, A Practical Implementation of Parallel Dynamic Load Balancing for Adaptive Computing in VLSI Device Simulation, Engineering with Computers 18, 2, 8(2002), 124-137. [5] Y. Li, A Parallel Monotone Iterative Method for the Numerical Solution of Multidimensional Semiconductor Poisson Equation, Computer Physics Communications 153, 3, 7(2003), 359-372.

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 586-588

A New Method for Calculation of Elastic properties of Anisotropic Material by Constant Pressure Molecular Dynamics Kailiang Yin· 1•2, Heming Xiao3, Jing Zhong 1 and Duanjun Xu2 I. 2. 3.

Department of chemical engineering, Jiangsu Polytechnic University, Changzhou 213016, PRC; Department of chemistry, Zhejiang University, Hangzhou 310027, PRC; Department of chemistry, Nanjing University of Science and Technology, Nanjing 210094, PRC.

Received 30 July, 2004; accepted in revised form 27 August, 2004 Abstract: I ,3,5-tri-amino-2,4,6-tri-nitrobenzene (TATB) is a typical and wide- studied explosive molecular and its single crystal is a typical triclinic lattice with a, b, c respectively 9.010, 9.028, 6.812 angstrom and a, p, y 108.59, 91.82, 119.97. This explosive crystal is a stiff and anisotropic material. Packing of polymer on its crystal surface can obviously improve its mechanics properties. As a powerful tool, molecular dynamics (MD) simulation can be used to calculate the mechanics properties mainly elastic properties. But when applied the software package of Materials Studio (MS) to carry out MD, only the isotropic elastic properties which were averaged in x, y and z directions of materials can be obtained. To calculate the elastic properties in one direction of anisotropic materials, we developed a method which is similar to the experimental determination of elastic properties in one direction. Firstly, a PI periodic super cell ofTATB with 34.06x36.04 x28.838 angstroms which ab plane was designated as (0 I 0) planar of the crystal was constructed. After several fixing and relaxing steps, the cell was pre-equilibrated 500 ps and performed I 00 ps MD at 298 K within NVT ensemble. The averaged isotropic tensile modulus and Poisson' ratio were calculated by MS analysis module of elastic properties (static) and they were 1796±90 GPa and 0.231 respectively. Secondly, to obtain the elastic properties in each direction, NPT ensemble was chosen and different stresses in six different directions were added through many tries to keep the cell parameters fluctuating around those values in NVT ensemble while performing constant pressure MD. The MS averaged isotropic tensile modulus and Poisson' ratio of well equilibrated NPT's system were then 2361±30 GPa and 0.273 respectively. Finally, while carrying out subsequent constant pressure MD, different magnitude of compressive or tensile stress was applied to the cell in one direction. The elastic properties were then obtained via the strain-stress profile. The calculated anisotropic tensile modulus was separately 1423, 2558 and 1955 GPa in x, y and z direction and the averaged value was 1979 GPa while averaged Poisson's Ratio was 0.217. These two averaged values were well agreed with the MS calculated ones within about 20% deviation. The result revealed that our method can be applied to calculate elastic properties of anisotropic materials.

Keywords: TA TB, constant pressure molecular dynamics, anisotropic, elastic properties PACS:

46. 15. -x

Theoretic Section I.

Calculation of the elastic properties of isotropic material by MS

As we known, the static elastic properties of isotropic material are same in each direction and can be calculated by the software package of Materials Studio (MS). The method is as follow: for each configuration submitted for static elastic constants analysis, a total of 13 minimizations are performed. The first consists of a conjugate gradients minimization of the undeformed amorphous system. The target minimum derivative for this initial step is 0.1 kealiA. Following this initial stage, three tensile and three pure shear deformations of magnitude ±0.0005 are applied to the minimized undeformed system and the system is reminimized following each deformation. The internal stress tensor is then obtained from the analytically-calculated virial and used to obtain estimates of the six columns of the elastic stiffuess coefficients matrix. If more than a single configuration has been analyzed, the

Corresponding author.. E-mail: mat [email protected], [email protected]

A New Method for Calculation of Elastic properties of Anisotropic Material_ _ _ _ _ _ _ _ _ __ 587

averages and standard deviations for all the elastic constants and averages for all elastic properties such as tensile or Young modulus, Poisson's Ratio, shear modulus, bulk modulus, A. and Ji. For an anisotropic material, the MS considers it as an isotropic material and the averaged elastic properties in each direction but not the elastic properties in one direction will be obtained. 2.

Calculation of the elastic properties of anisotropic material by constant pressure MD

To calculate the elastic properties of anisotropic materials by MS, we developed a method which is similar to the experimental determination of elastic properties in one direction. The key of this method is that the simulated cell must be firstly adjusted to equilibrium in NPT ensemble by applying six different external stress ax, ay. a" axy, axz· ayz to the cell. After the equilibration, different external stress in one direction was applied to the cell and a constant pressure MD within certain time period was performed. By averaging the equilibrated box length in each direction, averaged 1.. h, lc were obtained, then corresponding change of each of three box length Ill and the strain ll///0 could be calculated. By plotting the strain-stress profile and fitting line by least squares technique, we can calculate the tensile modulus and Poisson's ratio in one direction though the slopes of three lines according to the following formula (x as the direction which external stress ax applied to):

!ila

Longitudinal strain E: = - Lateral strain

(I)

loa

n

/),[b

(2)

E:b =--,

lob Tensile or Young modulus E a

= -CTX E:n

CTX

(3)

/),fa/ / 0 a

(4)

Table 1: Calculated anisotropic elastic tensile modules and Poisson' ratios ofTATB in PI periodic cell with (0 I 0) planar and calculated isotropic ones of equilibrated T ATB in NPT ensemble by MS Direction a

E. (GPa) 1423

Direction b Eb (GPa) 2558

Direction c Ec (GPa) 1955

Averaged E(GPa) 1979

MS value

E (GPa) 2395 0.273

Acknowledgments The author wishes to thank the anonymous referees for their careful reading of the manuscript and their fruitful comments and suggestions. This work was supported by NSFC 29836150 and JSNFS BK2003402.

588 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ K.L.Yinet. a/.

strain-stress profile of TATB with

a~

0. 0700 0. 0600 0. 0500 0. 0400

§

!l

0. 0300

...;

o. 0200

i

rb

k,

is C-C bond force constant, r0 is 'equilibrium' or referenced bond length, r 1 is the limited bond length beyond which the bond will be considered as near breaking In equation (2), the former two parts are bond stretching terms,

de is an alterable displacement constant which can be used to adjust the depth of potential energy V1( r) trap at r = r0 so can be used to control the dissociation activate energy. The r ~ r 1 region is considered as the

state,

bonded region. The third and forth regions are transition region, in which two-body potential transfers from bonded to non-bonded. In this region,

v; (r) reaches its maximum when r = r

energy of activated dissociation or recombination transition state. Where, the height of energy barrier. The last two parts are which

v; (r)

ka

2 ,

which corresponds to the

is also a parameter used to control

v; (r) in non-bonded i.e. van der Waals interaction region, in

Here, V0 = &A - & , &A is another In CRACK force field, non-bond cutoff is used and worked by multiplying a switch

includes the normal Leonard-Jones potential (U 12-6).

adjustable parameter. function:

V(r) = (v;(r)+ V0 )SW(r,r,,r,)

r,

~

r

~

rc

where r, and rc are respective cut-on and cut-off distance. The values of boundary points in set as follow according to the properties of carbon particles:

r,

r1

= 0.6 0" ,

r2

=2.30" , rc = 2.50" . To ensure the continuity of potential function v; (r)

should be

(4)

V ( r)

= 0.80" , rmin =Vl.o-,

at every boundary point, here

24&A

ka = ---"-0"(0"- r.) Table 2. The number of pyrolyzed molecules of n-decane at different simulation time under varied temperatures (de= 125kJ/mol )

Force Field

Toxvaerd CRACK CRACK CRACK CRACK

Temperature I K

500-900 700 710 715 800

Time/ps

5

10

15

20

25

30

0 0 I 3 6

0 0 0 4

0 0 I 6 II

0 0 0

0 0 I 6 12

0 0 I 7 14

8

5

10

are primarily

A Brand New Reactive Potential RPMD Based on Transition State T h e o r y - - - - - - - - - - -

Figure I. Illustration of

591

V1( r) potential curve

Acknowledgments The author wishes to thank the anonymous referees for their careful reading of the manuscript and their fruitful comments and suggestions.

References [I] Leach, A. R. Molecular Modelling-Principles and Applications. Pearson Education Limited: Harlow, England, (200 I) [2] Lipkowitz, K. B. Chem. Rev. 5, 1829 (1998). [3] Hummer, G, Rasaiah, J. C. & Noworyta J.P. Nature 414, 188 (2001). [4] Yin, K. L., Xu, D. J. & Chen, C. L. ChineseJ. Inorg. Chem. 19,480 (2003). [5] Yin, K. L., Xia, Q., Xu, D. J., Xi, H. T., Sun, X. Q. & Chen, C. L. Macromol. Theor. Simul. 12, 593 (2003). [6] Yin, K. L., Xu, D. J., Xia, Q., Ye, Y. J., Wu, G Y. & Chen, C. L. Acta Phys.-Chim. Sin. 20, 302 (2004). [7] Yin, K. L., Xia Q., Xi, H. T., Xu, D. J., Sun, X. Q. & Chen, C. L. J. Mol. Struct. (Theochem) 674, 157 (2004). [8] Bounaceur, R., Warth, V., Marquaire, P., Scacchi, G., Domine, F., Dessort, D., Pradier, B. & Brevart, 0. J. Anal. Appl. Pyrolysis 64, 103 (2002). [9] Vandenbroucke, M., Behar, F. & Rudkiewicz, J.L. Org. Geochem. 30, 1105 (1999). [I 0] Savage, P. E. J. Anal. Appl. Pyrolysis 54, I 09 (2000). [II] Franz, J. A., Camaioni, D. M., Autrey, T., Linehan, J. C. & Alnaijar M.S. J. Anal. Appl. Pyrolysis 54, 37 (2000). [12] Fierro, V., Schuurman, Y., Mirodatos, C., Duplan, J.L. & Verstraete, J. Chem. Eng. J. 90, 139 (2002). [13] Brenner, D. W., Shenderova, 0. A., Harrison, J. A., Stuart, S. J., Ni, B. & Sinnott, S. B. J. Phys.:Condens. Matter 14, 783 (2002). [14] Toxvaerd, S. J. Chem. Phys. 87,6140 (1984). [ 15] Van Gunsteren, W. E. & Berendsen, H .. J C. Angew. Chem., Int. Ed. 29, 992 ( 1990). [16] Brown, D. & Clarke, H. R. Mol. Phys. 51, 1243 (1984). [17] Zhang, J. F. & Shan, H. H. Fundamentals of Oil Refining Technics (China Petrochemical Press, Beijing, Chinese) (1994).

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 592-595

Analytical Solution of Nonlinear Poisson Equation for Symmetric Double-Gate Metal-Oxide-Semiconductor Field Effect Transistors Shao-Ming Yu", Shih-Ching Lob, Yiming Lied I, and Jyun-Hwei Tsaib "Department of Computer and Information Science, National Chaio Tung University, Hsinchu 300, Taiwan bNational Center for High-Performance Computing, Hsinchu 300, Taiwan c Department of Computational Nanoelectronics, Nano Device Laboratories, Hsinchu 300, Taiwan d Microelectronics and Information Systems Research Center, National Chaio Tung University, Hsinchu 300, Taiwan Received 31 July, 2004; accepted in revised form 28 August, 2004 Abstract: In this paper, an analytical solution of Poisson equation for double-gate MOSFET is presented. An explicit surface potential function is also derived to make the whole solution be a fully analytical one. The resulting solution provides an accurate description for partially and fully depleted devices in regions of operation. Doping concentration in silicon region is also considered. A comparison with numerical data shows that the solution gives an accurate approximation of potential distribution for a nano-scale doublegate MOSFET in all regions of operation. Keywords: analytical solution, Poisson equation, surface potential, double-gate MOSFET.

Mathematics Subject Classification: 34815, 35F30 PACS: 02.60.-x, 02.60.Cb, 02.60.Lj, 85.30.De, 85.30.Tv

1. Introduction Double gate metal oxide semiconductor field effect transistor (DG-MOSFET) are of interest today mainly because of their inherent suppression of short-channel effects (SCEs), high transconductance and ideal subthreshold swing (S-swing) [1-4]. Thus, the scalability of semiconductor devices is intimated to nanoscale. A key question is how to calculate the characteristics of device efficiently and accurately, especially for circuit simulation. Among the characteristics of devices, potential is a basic and important one. As potential is obtained, electron density is derived. The purpose of this work is to introduce a new 1-D closed-form analytical approximation of potential distribution in the silicon film of a DG-MOSFET in regions of operation (depletion, weak inversion and strong inversion). Previous studies can only considered analytical solution of DG-MOSFET with given surface potential [2-3] or undoped silicon film [4]. To make the solution be a fully analytical one, the explicit form of surface potential is also derived. A comparison with numerical data shows that the solution gives an accurate approximation of potential distribution for a nano-scale double-gate MOSFET in all regions of operation.

2. Explicit Solution of Surface Potential In this study, 1-D symmetric double-gate MOSFET, which is illustrated in Fig. 1, is considered. The Poisson equation and the boundary conditions can be written as

lltP = qNj -exp(-IP/V,)+(n,fNj(exp(IP/V,)-IWc,, ,-t,,/2 < x < t,,/2,

1 Corresponding author. Postal address: P.O. BOX 25-178, Hsinchu 300, Taiwan, E-mail: [email protected]

(1)

Analytical Solution ofNonlinear Poisson Equation _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 593

¢= ¢, and o¢/ax =±C.,(vc- v,, -¢.}/c,, for x =±t,, / 2. For x = 0, ¢ =¢, and o¢/ox =o. ¢ is potential, ¢, is surface potential, ¢6 is potential at the center of silicon region. Vc is gate voltage, VF8 is flat-band voltage and V, is thermal voltage, which is considered as a constant. n; is intrinsic concentration and Na is doping concentration. 10 , is thickness of oxide, 1,; is thickness of silicon film, & 0 , is the dielectric constant of oxide, &,; is dielectric constant of silicon and c., is the capacitance of oxide. q is electron charge. 8¢/iJx = 0 is obtained from the assumption of symmetric applied bias [4].

F11 I

¢1

oxide

¢z

silicon

(A

oxide

Ij I

t ox tsi

1ox

Figure I : The profile of double-gate MOSFET Multiplying (8¢/iJx}dx on both side of the Poisson equation ofEq. (I) and integrating from the surface toward the center of silicon film, i.e., x = -1,, /2 to 0, and ignoring ¢, -¢., which is much smaller than

V, exp(¢./V, )- V, exp(¢)V,), one obtains

(vc- V,8 - ; . ) ' /r' =(¢, -¢, )+ [v, exri.,-¢,/V,)- V, exri.,-¢,/V,)]+ [v, exp({¢,- 2¢1 )jv,)- V, exp((;, -2¢1 )jv,)]. (2)

y =~2qc,,N. jc., and ¢1 = ln(N./n,). To solve for¢,, another expression relating ¢, and ¢6 is required. The relation is obtained from the condition of full depletion for body charge and the simplification of discretization of Eq. (I) [5-7]. It is given as ¢6 =¢, - at,,c.,(vc- v,, - ¢,)f2e,,- Vxd, where a is a fitting parameter. The critical voltage Vc=Vc at which is changed from partially depleted (PO) to the fully depleted (FO) device needs to be calculated before deriving the analytical solution. It will be derived from the condition that ¢6 =0 and ¢, =V,d .

v,

where

(3)

Before deriving the analytical solution of surface potential, two cases are discussed. Case (I): depletion therefore, exp(- ¢./V, ) "' 0 and and weak inversion, in this case, o < ¢, < 2¢1 , ¢, »

v, ,

¢, » V,, exp(- ¢./V,)"' 0 - 2¢tfV, )"' o. and exp(¢, _ z¢1 ;v,)"' o. Firstly, surface potential of PO devices is discussed, i.e., ¢6 = 0. In case (1), Eq. (2) can be approximated and solved as ¢/; = vc, - v,, + r' /2- r~r ' / 4 + vc - v,, +A , where A= v, (1 + exp(- 2¢tfV, )) . According to the studies [6-7], surface potential of case (II) is given by As +48 , ;:, =f; +V,Inl(vc - v,, - J;f /r' - !; +llfv,} , where /; =(z;, +¢:, - J(;:. exp(¢,

Case (II): strong inversion, in this case 0 < ¢, < 2¢1 ,

-z;J

)jz .

mentioned above, surface potential equation can be linked by the following smooth function [3, 4], which is given as ¢:, =¢:,- V, ln{l + exp((¢:, - ¢/; )jv, )}. Next, surface potential of fully depleted device is derived. In case (I), Eq. (2) can be approximated and . In case (II), the surface potential is given as fV "'' = vGB - vFB - y"\\r solved as 'l'sd xd

¢/;

= 2¢1

+V,InMvc-v,, -JJ/ r' -v,dlfv,[l-exp(- V,d/V,)]}· Also, surface potential equation can be

linked by the following smooth function, which is given as ¢[, = ¢!, - v, In {1 + exp((¢!, - ¢~ )jv.)}. If a device can change the operation mode from PD device to FO device as V G increases. Then the smooth function is employed to link the operation mode [4]: ¢,M=

> ¢!, + ;:, I+ /1 exp((VG- Vc )/n,V, ) I+ I, exp(- (VG- Vc )/ n,V,) where 11 and n1 are fitting parameters.

(4)

594 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Y. Li, S.C. Lo, S. M Yu andJ. H. Tsai

3. Analytical Solution of Poisson equation After deriving the explicit function of surface potential, Poisson equation of Eq. (I) could be solved. Integrating it from surface to center of silicon film (considering x =- t ,, /2 to 0), we have

The positive sign is for

o~ x ~ t ,, /2

and the negative sign is for - t ,, /2 ~ x ~ 0. In case (I), Eq. (5) can

be simplified and rewritten as follows (6)

Integrating Eq. (II) from top surface to center of the silicon film, we have ; = ;. + {;, -;.{t- ~qN.[l- (n,jN.fY(2(;, -;.)e,){x+ t,,/2)]'- E',(x) for -t,,/2 ~ x ~ 0,

(7)

; = ;. + (;,- ;.{t + ~qN.[l- (n,/N.)Y(2(;,- ;.)e,,)(x- t,,/2)]'- E',(x) for 0 ~ x ~ t,,j2,

(8)

where E',(x)= (a,x 6 +~x' }tnlc,/(x+d;}j and E,(x)= (a,x 6 +~x' ftnlc,/(x+d,}j. E,(x) and E2 (x) are the adjust term, which are introduced into the solution of Eq. (7). The function form is obtained by comparing Eqs (5) and (6), we know the slope ofEq. (5) is larger (smaller) than or equal to the slope of Eq. (6) for x > 0 (x < 0). ~ ~ c; , 2 , J, and J, are fitting parameters. In case (II), Eq. (5) can be approximated and solved as follows

a;, a,, , , c

; = ;. - V,ln[ cos(~qn,' /2V,c,,N. exp(;./2V, )x ;

J]- £, (x), for - t,,/2 ~ x ~ 0,

=;.-v,m[cos(~qn,'/2V,c,,N.exp{;./2V,)xJ]-£,{x)•

for

O~x~t,,j2,

(9)

(10)

.E,(x)=(a,x6 +~x'Pn[c,/(x+d,)] and .E,(x)=(a,x6 +~x'pn[c,/(x+d,)] . .E,(x) and .E,(x) are the adjust term, a, , a,, ~ , b, , c, , c, , J, and J, are fitting parameters. Then, we have analytical solution

where

and surface potential of 1-D DG-MOSFET.

4. Calibration and Verification In this study, a 20nm double-gate NMOSFET with lox = 2nm, t,; = 20nm and Na = 10 17 cm·3 is simulated for Va = 0.05, 0.1, 0.7 and 1.0 V. Eq. (4) is considered to estimate the surface potential. From the results, the analytical solution of surface potential has an approximation with error, which is small than 0.005% in depletion and weak inversion region. In the strong inversion region, a larger error, which is smaller than 0.0 I% is obtained. The results, which are given in Table I, show that the explicit function gives good approximations of surface potential. With the explicit surface potential, potential distributions are also simulated under four given gate applied biases, i.e., Va = 0.05, 0.1, 0.7 and 1.0 V. The results are illustrated in Figs 2. Table I: Numerical and analytical surface potential of the simulated device

Va=0.05 V Va= 0.1 V Va= 0.7 V Va= 1.0 V

Numerical (A) Analytical (B) Difference (C)=(B)-(A) 0.940459 0.9405 0.000041 0.953597 0.953588 -0.000009 1.03872 1.03866 -0.00006 1.06509 1.06498 -0.00011

Error (C)/(A) 0.00436% -0.00094% -0.00578% -0.010%

Analytical Solution ofNonlinear Poisson Equation _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 595

0.98.---------------., Tox = 2 nm, Tsi =20 nm

1.10.-----------------, Tox = 2 nm, Tsi = 20 nm

0.96

1.05

-0.94

]!c:

til

~

0.92

~ Q.

0

Q. 0.90

0.88

-

Numerical phi Analytical phi

• 1.00 0.95 0.90

0.86

0.85 &..-J-_ -10

-5

0

Distance (m)

5

10

-10

__._ __.._ _..__ _...__,

-5

0

Distance (m)

10

(b)

(a)

Figure 2: Comparison of numerical and analytical results, (a) Vc;=0.05 and 0.1 V; (b) VG =0.7 and 1.0 V.

5. Conclusion In this study, a 1-D analytical solution of Poisson equation for DG-MOSFET is derived successfully. The solution can be used for PO and FD devices in all regions of operation. In addition, the solution is workable for devices with doping concentration in silicon region. According to the numerical comparison, the results are very accurate. Therefore, the solution can be employed to couple with quantum effects so as to apply to simulation of transport characteristics for devices and circuit simulation.

Acknowledgments The work is partially supported by the National Science Council (NSC), Taiwan, under Contracts NSC92-2215-E-429-010, NSC-93-2215-E-429-008, and NSC 93-2752-E-009-002-PAE. It is also partially supported by Ministry of Economic Affairs, Taiwan, under contract No. 92-EC -17-A-07-S 1-00 II.

References [I] D. Hisamoto, et al., FinFET-A Self-Aligned Double-Gate MOSFET Scalable to 20nm, IEEE Transaction on Electron Device 47 2320-2325 (2000). [2] G. Baccarani and S. Reggiani, A Compact Double-Gate MOSFET Model Comprising Quantum-Mechanical and Nonstatic Effects, IEEE Transaction on Electron Device 46 16561666 (1999). [3] A. Rahman and M. S. Lundstrom, A Compact Scattering Model for the Nanoscale DoubleGate MOSFET, IEEE Transaction on Electron Device 49 481-489 (2002). [4] Y. Taur, Analytical solutions of charge and capacitance in symmetric and asymmetric doublegate MOSFETs, IEEE Transactions on Electron Device 48 2861-2869 (200 I). [5] S. -L. Jang; B. -R. Huang; J. -J. Ju, A unified analytical fully depleted and partially depleted SOl MOSFET model, IEEE Transactions on Electron Devices 46 1872-1876 ( 1999). [6] R. van Langevelde and F. M. Klaassen, An explicit surface-potential-based MOSFET model for circuit simulation, Solid-State Electronics 44 409-418 (2000). [7] Y. S. Yu, S. W. Hwang, and D. Ahn, A unified analytical SOl MOSFET model for fully- and partially-depleted SOl devices, 2001 Asia-Pacific Workshop on Fundamental and Application ofAdvanced Semiconductor Devices (200 I), 329-334.

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 596-599

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Atomic and Electronic Structure of Vacancies in UO 2 : LSDA+U Approach Younsuk Yun 1\ Hanchul Kim2, Heemoon Kim3, Kwangheon Park 1 1 Department of Nuclear Engineering, Kyung-Hee University, Suwon,, 449-701, Korea 2 Korea Research Institute of Standard and Science, P.O.Box 102, Yuseong, Daejeon, 305-600 Korea 3 Korea Atomic Energy Research Institute, P.O.Box 105, Yuseong, Daejeon, 305-353 Korea

Received 5 August, 2004; accepted in revised form 24 August, 2004 Abstract: We present the density functional theory calculations of U02 within the LSDA+U approach. For the bulk U02, the electronic structures of U02 obtained both the LDA and the LSDA+U approaches in order to elucidate the strong correlation effect. Then we performed supercell calculations to investigate the structure of different defects in uo2, the defect formation energy, and the defect-induced changes in the electronic structure. Finally, we deduce the activation energy of Xe diffusion, which is one of fission products, through the defects in U02 • Keywords: Sf electrons, LSDA+U approach, defect formation energy, fission products

Uranium dioxide is an important fuel material for nuclear industry and the electronic structure of 5f electrons of uranium has not been reliably described so far. The 5f electrons of uranium is one of the representative strongly-correlated systems, and thus a proper treatment of electron-electron correlations in essential for the description of the electronic structure on the antiferromagnetic (AFM) state ofU0 2• During irradiation U0 2 undergo nuclear fission and as a result, the noble gas Xe is formed. Xe diffuses into the gap between the cladding and pellets of the fuel, and cause swelling of the fuel. Point defects are thought to be the major channel of Xe diffusion in U0 2 [3,4]. Therefore, it is important to understand defects in order to investigate the diffusion mechanism in nuclear reactor. The density functional theory (OFT) calculations within the local spin density approximation (LSDA) often fail to describe systems with strongly correlated electrons, predicting to metallic behaviour contrary to the observed insulating behaviour [5]. Orbital-dependent functionals like the LSDA+U approach are known to correct this kind of problems. The LSDA+U method was proposed by Anisimov eta/. in order to bridge the gap between the OFT-LOA and the many-body approaches for the strongly correlated electronic systems [6,7]. Dudarev eta!. also showed that the LSDA+U method of Anisimov eta/. improves the calculated ground state [8-10]. In this paper, we investigate the localized nature and the effect of the Coulomb repulsion in the U-5/ shells on the electronic structure of U0 2 using LSDA+U method. We used the simplified LSDA+U energy functional, due to Dudarev eta/., which is expressed the following form ELSDA+U

where

nm,a

=

ELSDA

I- -" 2 +l(U -J)L.,(nm,u -nm.u)

(I)

"

is the occupation number of the mthfstates, a is the projection of spin. [] and ] are the

spherically averaged matrix elements of the screened Coulomb interaction between f electrons. This approach is to add a penalty functional to the LSDA total energy, which forces the on-site Coulomb repulsion. It is important to be aware of the fact that when using the LSDA+U, the total energy depends on the parameters[] and]. In Eq. (1), the parameters[] and] do not enter separately, only the difference (U -])is meaningful [8-10]. The 5/Coulomb correlation energy[] was determined to be 4.6±0.8eV from the energy difference in the two spectroscopies of X-ray Photoemission Spectroscopy and Bremsstrahlung Isochromat Spectroscopy [I]. We chose the parameters of [J = 4.5eV and] = 0.5 from Dudarev eta/. and compare them with other experimentally observed data [1,10]. All the t Corresponding author. E-mail: [email protected]

Atomic and Electronic Structure of Vacancies in UO 2 - - - - - - - - - - - - - - - -

S97

calculations have been performed using the V ASP package, that is based on projected-augmented-wave and the plane wave basis set (11-13].

---, 3.------------------------------------------0-2p(up) :E 2 :

f

1 O+---~~~:= =:r-~----~~~------~~------~

-1

-2

0-2p(dn)

-3~~--~~T---~~T-r-r-r-r-r-~~~,~~-,~~

2

U-6d(up)

1

0+---~=-=-~----~----~~?~

jf ~~~~-r~-,-,r-r-~o-~-r~~-,--r-~~~-r-r~~-,~~ U-6d(dn)

~ ~

00

8

U-5f (up)

0~---------6~--~~--~~

-3

-6

U-5f (dn)

38~~--T-~T---~~r-.-r-r-r-r-~~~,~~_,~_,

30 25

Total DOS

20 15 10

5

0;-~.-~--~~~--~~~~r---~~~~~~~,_,

0

2

4

6

8

10

12

14

16

18

20

22

24

Energy (eV) Figure 1. The spin polarized density of states ofU02 The calculated equilibrium lattice constant ofU02 is found to be S.44 A. It is underestimated by about O.SS % of the experimental value of S.47 A [14]. The calculated cohesive energy per U02 molecule is 26.9 eV, that is larger than experimental value of 22.3 eV. It seems to be overestimated about 20 %, because it does not contains spin-polarization energy of atoms. The underestimation of lattice constant and the overestimation of cohesive energy are typical of DFT-LDA calculations (IS]. The spin polarized local density of states (LDOS) and total density of states (DOS) of AFM bulk U02 obtained from LSDA+U calculations shown Fig. I. The LSDA+U calculations predict the correct insulating ground states with the band gap of 1.9 eV. In contrast, calculations employing original LDA result in metallic electronic structure where the uranium Sf bands are partially occupied. The insulating nature of the ground state originates from the presence of Hubbard-type correlations in the bands of uranium Sf states. The band gap of 1.9 eV is opened up due to the split Sf bands, and agrees with the experimental result of2.0 eV [16]. The LDOS in Fig. I shows noticeable difference between up- and down-spins only Sf states. This implies that the AFM ground state of U02 is governed by the Sf electrons.

598 - - - - - - - - - - - - - - Younsuk Yun, Hanchul Kim, Heemoon Kim, Kwangheon Park

+ - ao-----+

e : uranium atom

0

: oxygen atom

C :uranium vacancy

Figure 2. Oeft) Anitiferromagnetic ordering in the (100) direction of bulk U0 2 and (right) 2x2x2 supercell containing uranium vacancy In order to investigate the defects in U0 2, we used the 2x2x2 supercells containing 98 atoms as shown in Fig, 2. We calculated the atomic and electronic structure ofU02 containing the four kinds of defects which are an uranium vacancy, an oxygen vacancy, di-vacancy of uranium-oxygen and trivacancy of uranium-oxygen-oxygen vacancies.

References [I] Y. Baer and J. Schoenes, Phys. Rev. B 47,888 (1980) [2] A.P. Cracknell, M.R. Daniel, Proc. Phys. Soc. 92, 705 (1967) [3] J.P. Crocombette, F. Joliet, L.Thien Nga, and T.Petit, Phys. Rev. B 64, 104107 [4] Richard G. J. Ball, Robin W. Grimes, J. Chern. Soc. Faraday Trans., 1990, 86(8), 1257-1261 [5]. G. A. Sawatzky and J. W. Allen, Phys. Rev. Lett. 53,2339 (1984) [6] V.I. Anishimov, J. Zaanen, and 0. K. Andersen, Phys. Rev. B 44, 943(1991) [7] V.I. Anishimov, I. V. Solovyev, M.A. Korotin, M. T. Czyzyk, and G. A. Sawatzky, Phys. Rev. B 48, 16929(1993) [8] S.L. Dudarev, D. Nguyen-Manh and A.P. Sutton, Philos. Mag. B. vol. 75, 613(1997) [9] S. L. Dudarev, A. I. Liechtenstein, M. R. Castell , G. A. D. Briggs , and A. P. Sutton , Phys. Rev. B 56(8), 4900( 1997). [10]. S. L. Dudarev, G. A. Botton, S. Y. Savrasov, C. J. Humphreys, A. P. Sutton, Phys. Rev. B, 57(3), 1505( 1998). [II] G. Kresse and J. Hafuer, Phys. Rev. B 47, RC558 (1993)

Atomic and Electronic Strncture of Vacancies in UO 2 - - - - - - - - - - - - - - - -

[12] G. Kresse and J. Hafner, Phys. Rev. B 48, 13115 (1993) [13] G. Kresse and J. Hafner, Phys. Rev. B 47, 14251 (19931994) [14] L. Lynds, J.Inorg. Nucl. Chern. 24, 1007 (1962) [15] Paxton, A. T., Methfesse1, M., and Po1atog1ou, H. M., 1990. Phys. Rev. B 41, 8127 [16] Schoenes, J., 1990, J. Solid st. Chern., 88, 2.

599

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 600-603

First-Principles Investigations of the Electronic Structure of B -Lan~alhx (X=4, 4.5, 5, 7) R.J. Zhang 1, C.H. Hu, M.Q. Lu, Y.M. Wang, K. Yang, D.S. Xu Institute of Metal Research, Chinese Academy of Sciences, Shenyang 110016, China Received 6 September, 2004; accepted in revised form 20 September, 2004 Abstract: The electronic structures ofll-LaN4A1Hx (x=4, 4.5, 5, 7) have been investigated using a plane-

wave pseudo-potential method. The results of calculated formation energy and equilibrium unit cell volume show that LaNi~l~.s is the most favorable structure to the real composition of LaN4AIH•. LaNi~lH 7 should not be formed as a stable hydride due to the reduction in the number of H-Ni bonding states associated with the reduction in the number of d states near the Fermi surface in LaN4AIH 7• LaNi~l can not absorb hydrogen as much as LaNi 5 . Keywords: electronic structure; plane-wave pseudo-potential method; formation energy

1. Introduction A large number of investigations on LaNi 5 and related compounds as representative rare alloys have attracted much attention all over the world since Philips Company in Holland firstly found them in 1959. And the study about them has acquired important developments since the early 70's in relation to their exceptional hydriding properties11 · 21. In recent years, hydrogen storage alloys based on La are commercially used as negative electrode materials for nickel-metal hydride (Ni-MH.) battery13•41 • For practical application, substitutions at the La and Ni sites have been extensively used to improve the hydrogen absorption and desorption characteristics, since they affect the stability and the hydrogen content of the hydride. As observed in experiments, the substitutions of aluminum for nickel induce a decrease of the hydrogen content for the hydrides. In this paper, in order to investigate the role played by AI atoms in the decrease of the hydrogen content, we performed a first-principles calculations for flLaN4AIH. (x=4, 4.5, 5, 7), in which imaginary LaN4AIH 7 is at the structure ofLaNi5H 7•

2. Computational details The calculations presented in this work were performed using a plane-wave pseudo-potential (PW-PP) method 151 within the generalized-gradient approximation (GGA) of Perdew-Wang forml 61 to density-functional theory. A finite basis set correction was applied to the total energy and stress tensor when a geometric optimization of variable cell parameters was made. The pseudopotentials were constructed for neutral atoms as described in the report' 71 • The La-4forbital was therefore not included. In fact, the contributions of La-4 fto the interactions between atoms were proved to be so small that they were ignored in the PW-PP calculations and the covalent density analysis in their studyl 81. The much more transferable ultrasoft pseudopotentialsl91 were constructed for all ions, i.e., H, La, Ni, and AI. The ultrasoft potential for these elements was recently reported to have a great advantage for an accurate description of them in complex solidsl81 • With these pseudopotentials, the plane-wave cutoff, Ecut was chosen to be 400 eV in the present study. This was confirmed to achieve a good convergence with respect to the total energy E, and the heat of formationl 7• SJ. The formation energy Ml ofLaN4AIH. can be calculated by the following expression:

till= E,[LaNi4 A/Hx]-( E,[LaNi

4 Al]+~E,[H 2 ])

where E, is the total energy calculated for the equilibrium unit formula shown in the parenthesis.

1 Corresponding author. Ph.D. candidate. Tel.: +86-242-397-1641; fax: +86-242-389-1320. E-mail: [email protected]

(I)

First-Principles Investigations of the Electronic Structure ofB-Lani4alh, (X=4, 4.5, 5, 7) _ _ _ _ _ _ _ 601

We chose four structures of !3 -LaNi4 AIH, (x=4, 4.5, 5, 7) for study the hydrogen content of Calculations of these hydrides were made by optimizing all degrees of freedom including cell parameters and internal coordinates within a given space group. For carrying out efficient calculation, the supercells were made; of its sizes were multiple of the unit cell. Zero-point energies were not taken into account because their effects on the total energy were smaller than the computational accuracy, in agreement with those reported by the theoretical calculation in Ref.1 10l. LaN~IH,.

3. Results and discussion 3.1. Favorable structure of 13 -LaNi~IHx In the structures ofLaNi4AIH., AI atoms only occupy the 3g sites[JJ, 121• The hydrogen in LaN~I has been reported to occupy only two types of sites, which are 6m and 12n113l. Obviously, AI atoms additive induce the decrease in the number of the Ni atoms and Ni-H bonds. And since the deviation of H from the 3fto the off-center 12n position is driven by the increase in number of Ni-H bondsl11, the occupied sites of hydrogen in LaNi~l can be considered as 6m and 3f for the structures of LaN~IH. (x=4, 4.5, 5) for simRlicity. LaNi~IH7 was supposed to have the same structure as LaNi 5H7 with the space group P63mc11 for comparing them each other. The occupied sites of hydrogen in LaN~IH, (x=4, 4.5, 5) and LaNi~IH1 are shown in Fig. I. The calculated formation energy, equilibrium unit cell volume of !3 -LaNi~IH. (x=4, 4.5, 5, 7) and its error to the experimental value are listed in Table I. It can be seen that both the formation energy and the volume of LaNi~IH4 . 5 are most in agreement with the value of experiments. It indicates that LaNi~IH4. 5 is closer and more favorable to the real composition of LaNi4 AIH, than the others, though the hydrogen content of LaNi~IH 5 looks more similar than LaN~IH4 5 to that of LaN~IH48 observed in experiment.

Fig. I. Schematic supercell of the LaNi.tAIH4.s and LaN~IH 7 . LelfThe structure with two different H sites (6m and 3f) and a double unit cell (Z=2). Right: The structure reported by Kazuyoshi Tatsumi eta/. (Ref. 22) with three different H sites (6m, 3f and 4h) and a double unit cell (Z=2). From the largest to smallest atoms denote La, Ni, AI and H atoms, respectively. Table I The computed formation energy ( AH ), equilibrium unit cell volume and its error to the experimental value of !3 -LaNi4AIH, (x=4, 4.5, 5, 7) LaN~I~

LaNi4AIH4s

AH (kJ/moiH2) Calc. AH (kJ/molH2) Expt.[J 4J

- 63.33

- 46.22

Volume (A3) Volume (A3) Expt.[JSJ

100.52

LaNi4AlHs

LaN~IH1

- 33.78

- 33.88

104.81

110.41

- 47.88 102.86 103.3

6 0 2 - - - - - - - - - - - - - - - - - - - - - - - - - - R.J. Zhanget. a/. 3.2. Electronic structures offJ-LaNi,AIHx The total electronic densities of states (TDOSs) for P-LaN4AIH. are plotted in Fig.2. The partial density of states (PDOS) plots for a H Is part, a Ni 3d part, a La 5d part, and an AI 3sp part, which are the main contributors to the local density of states (LDOS), are given in Fig.3. There are two strong PDOS peaks originating from Ni 3d and about five PDOS peaks originating from H Is. Moreover, the positions of the two strong PDOS peaks from Ni 3d and the strong H I sa peaks (approximately -5.4 eV) for LaN4AIH, (x=4, 4.5, 5) are not quite different, while those for LaN4AIH7 has an obvious shift to the low energy region from the others. The anti-bonding state is formed near but below Fermi level EF(that is set to zero as a reference) by the interaction of H I sa· band with a hybridization orbital of metal d, s and p orbitals. The scaled bond order, BO' H-Ni (or La, All =BOH-Ni (or La, AI) I BLH-Ni (or La, All between H-Ni, H-La, and H-Al for LaN4AIH., which is used to evaluate the covalent bonding strength, are listed in Table 2. BL is the average bond length between H and Ni (or La, AI) atoms. We can consider that the smallest BO'H-Ni in LaN4AIH 7 just confirms the decreasing levels of covalency. According to the hydrogen content increasing with the number of H-Ni bonding states, LaN4AIH. (x=4, 4.5, 5, 7) should have more H-Ni bonding states with increasing x. However, in the two strong PDOS of Ni 3d, the DOS of the one near to the Fermi surface (approximately -1.4 eV) increases from LaNi~IH4 to LaNi~IH 5 and then decreases to LaN4AIH 7. Therefore, we can conclude that LaN4AIH7 should not be formed as a stable hydride due to the reduction in the number of H-Ni bonding states associated with the reduction in the number of d states near the Fermi surface in LaN4AIH7. So, LaN4Al can not absorb hydrogen as much as LaNi 5• 4 LaNi4AIH.(Cmmm) 3 -------·· Ni-3d

~ __ $

-

. /\L f:i·

(a)

ot=====~~~~~==~

·;::

li ~==~~~~~--~--~--~~~~ :::J

en 0

:::::::- 4 LaNi4AIH4 _5(Cmmm) 3 --······· Ni-3d

8

~

$

J\. ::i\

(b)

•§ __ ~u 01=~~:;==1:~~::~==~=::=~:_~ LaNi AJH (Cmmm) t: S"'

oo

4 4 5 3 ········· N1'-3d ·· La-5d 2

/j \ i:

(c)

8 Al-3sp ~ 0 1=~-=~~H-:1:s:=J:~~:--:·-~·:··:j:_~\·:;;~::··:::·:·::~

c

1

I--

4 LaNi4AIH7(Cmc21) 3 ---------Ni-3d La-5d 2

Al-3sp --H-1s

--o

0

~

-o

Fig.2 The total densities of states (TDOSs) off3LaN4AIH. (x=4, 4.5, 5, 7). EF is referenced as zero.

ti\

(d)

Ot=~==~==/~:···=·=~~-~-=--==~~ -20 -10 0

10

Energy(eV) Fig.3 The projected densities of states (PDOSs) off3 -LaN4AIH. (x=4, 4.5, 5, 7). EF is referenced as zero.

First-Principles Investigations of the Electronic Structure of8-Lani,alh, (X=4, 4.5, 5, 7} _ _ _ _ _ _ _ 603

Table 2 The average bond order (80), bond length (BL) and scaled bond order (BO') and H-La for il -LaNi4AIH, (x=4, 4.5, 5, 7) H-Ni H-La BO' BL BO' BO BO BL BO -0.167 -0.066 --{).000 0.196 2.017 0.097 2.535 LaNi~IH4 -0.072 0.200 0.184 1.971 0.093 -0.188 2.618 LaNi~IH4.s 1.877 0.109 -0.182 2.673 -0.068 0.200 0.204 LaNi4AIHs -0.124 2.423 -0.051 0.227 0.183 2.064 0.089 LaNi~IH7

between H-Ni H-Al BL 3.206 2.029 2.006 2.010

BO' 0.000 0.099 0.100 0.113

4. Conclusions The electronic structures ofJ3-LaN~IH, (x=4, 4.5, 5, 7) have been investigated by first-principles calculations. The results show as following: (I) LaNi~IH 45 is the most favorable structure to the real composition ofLaN~IH,. (2) LaNi4AIH 7should not be formed as a stable hydride due to the reduction in the number of HNi bonding states associated with the reduction in the number of d states near the Fermi surface in LaNi4AIH 7. LaNi4Al can not absorb hydrogen as much as LaNi 5•

Acknowledgments The authors are grateful to the anonymous referees for their careful reading of the manuscript and their fruitful comments and suggestions.

References [I] H.H. Van Mal, K.H.J. Bushow, F.A. Kuijpers, Hydrogen absorption and magnetic properties of LaCo 5xNi 5 •5x compounds, J. Less-Common Met 32 (1973) 289-296 [2] H.H. Van Mal, K.H.J. Bushow, A.R. Miedema, Hydrogen absorption in LaNi 5 and related compounds: Experimental observations and their explanation, J. Less-Common Met 35 (1974) 65-76 [3] K.H.J. Buschow, P.C.P. Bouten, A.R. Miedema, Hydrides formed from intermetallic compounds of two transition metals: a special class ofternary alloys, Rep. Prog. Phys. 45 (1982) 937-1039 [4] A. Anani, A. Visintin, K. Petrov, et a/., Alloys for hydrogen storage in nickel/hydrogen and nickel/metal hydride batteries, J. Power Sources 47 ( 1994) 261-275 [5] MD Segall, Philip J D Lindan, M J Probert, eta/, First-principles simulation-ideas, illustrations and the CASTEP code, J. Phys.: Condens. Matter 14 (2002) 2717-2744 [6] J.P. Perdew, J.A. Chevary, S.H. Vosko, eta/, Atoms, molecules, solids, and surfaces: Applications of the generalized gradient approximation for exchange and correlation, Phys. Rev. B 46 ( 1992) 66716687 [7] Kazuyoshi Tatsumi, Isao Tanaka, Haruyuki Inui, et a/, Atomic structures and energetics of LaNi 5-H solid solution and hydrides, Phys. Rev. B 64 (200 I) 184105 [8] Kazuyoshi Tatsumi, Isao Tanaka, Katsushi Tanaka, et a/, Elastic constants and chemical bonding of LaNi 5 and LaNi 5H 7 by first principles calculations, J. Phys.: Condens. Matter 15 (2003) 6549-6561 [9] D. Vanderbilt, Soft self-consistent pseudopotentials in a generalized eigenvalue formalism, Phys. Rev. B 41 (1990) 7892-7895 [10] L.G. Hector Jr., J.F. Herbst, T.W. Capehart, Electronic structure calculations for LaNi 5 and LaNi 5H1-energetics and elastic properties, J Alloy Comp 353 (2003) 74-85 [II] A. Percheron-Guegan, C. Lartigue, J.C. Achard, Neuron and x-ray diffraction profile analyses and structure of LaNi 5,LaNi 5.,Al., and LaNi 5.,Mnx intermetallics and their hydrides (deuterides), J LessCommon Met 74 (1980) 1-12 [12] C. Lartigue, A. Percheron-Guegan, J.C. Achard and F. Tasset, Thermodynamic and structural properties of LaNi 5.,Mn, compounds and their related hydrides, J Less-Common Met 75 (1980) 23-29 [13] A. Percheron-Guegan, C. Lartigue, J.C. Achard, Correlations between the structural properties, the stability and the hydrogen content of substituted LaNi 5 compound, J Less-Common Met 109 (1985) 287-309 [14] H. Diaz, A. Percheron-Guegan, J.C. Achard, et a/, Thermodynamic and structural properties of LaNi 5.yAly compounds and their related hydrides, Int. J. Hydrogen Energy 4 ( 1979) 445-454 [15] R.T.Walters, Helium dynamics in metal tritides I. The effect of helium from tritium decay on the desorption plateau pressure for La-Ni-Al tritides, J Less-Common Met !57 (1990) 97-108

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 604-607

VSP International Science Publishen; P.O. Box 346, 3700 AH Zeist The Netherlands

A Nonlinear Elastic Unloading Damage Model for Soft Soil and its Application to Deep Excavation Engineering Xi Hong Zhao Department of Geotechnical Engineering, Tongji University, Shanghai 200092, China BeiLi Taihu Basin Authority of the Ministry of Water Resources of China Shanghai 200434, China Received 14 July, 2004; accepted in revised form 14 August, 2004

This paper is divided into two parts: I. First part, based on damage soil mechanics 111 , combined with test results for unloading stress path in Shanghai soft soil, presents a non-linear elastic unloading damage model for soft soil. This part deals with a nonlinear elastic unloading model for soft soil and a nonlinear elastic unloading damage model for soft soil in detail. The constitutive equation for nonlinear elastic unloading model for soft soil is expressed as follows:

{a}= [cHc} in which

k}={&x

&y &z Yxy Yyz rzxr,

{a}={ax

O"y O"z Txy Tyz Tzx}T

andelasticmatrix

[c]

The constitutive equation for nonlinear elastic unloading damage model for soft soil is expressed in incremental form as follows:

in which C is effective elastic matrix for damage soil, and dC is its increment. According to the principle of virtual work, this incremental FEM equation could be expressed as:

where {M }" = increment of node force vector for an element, [B] = strain matrix of an element, and {~o }' = increment of node displacement vector for an element.

Nonlinear Elastic Unloading Damage Model for Soft Soil _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 605

and

{M>}=

Thus, the incremental FEM equation can be also expressed as:

Now, let

[.Kf = b[Bf[c}B]an, the element stiffness matrix coupling with damage,

[.Kd f {&o}', the damage dissipation matrix due to nonlinear elastic damage, in which

then, the incremental FEM equation can be simplified as:

It is the incremental FEM equation for non-linear elastic unloading damage model. According to the incremental FEM equation, an FEM program can be complied. Thus, this program can be applied to unloading problem, such as excavation engineering. 2. Second part mainly describes the application of non-linear elastic unloading damage model to an engineering case with the depth of 30.4m. This case is one of most important parts of the project "The Outer Ring Tunnel Project of Shanghai", whose size is the first one in Asia and the second one in the world. In this part FEM is introduced to analyze this specially big and deep excavation engineering in deformation of wall and earth pressure on wall etc. Next, the outline and assumption ofFEM model are introduced as follows: (I) The constitutive relation of soil is adopted as the nonlinear elastic unloading damage model. Soil behind wall is assumed as unloading horizontally and load invariable vertically, while soil beneath excavation bottom in front of wall is assumed as unloading vertically and load invariable horizontally. The damage of soil is assumed as across-anisotropic with damage variables in horizontal direction as D 1= D 2 and damage variable in vertical direction as D 3 • (2) The diaphragm wall is considered as a linear elastic material while the braces are simulated as spring elements. (3) The excavation is analyzed as a plane strain problem. (4) The depth of calculation field is taken as 2 times the depth of the excavation under its bottom. The width of calculation field is taken as half the width of the excavation inside and 2 times the depth of the excavation outside. (5) The displacement boundary condition is that the bottom boundary is a fixed one and the lateral boundary, fixed horizontally and free vertically. (6) The water level is assumed consistent with ground level outside the excavation and consistent with the excavation surface inside the excavation. This assumption keeps a constant in one working case. (7) Initial displacement of the soil is assumed to be zero, i.e. the stress-strain state of the soil is assumed not changed during the construction of diaphragm wall before excavating.


(8) An 'air element' is adopted to simulate any removed element: the original element is retained, but the soil is replaced by air with a very small modulus.
(9) An incremental method is used to simulate excavation in steps. In every step the element stiffness matrix is calculated using the current stress level of the soil.
(10) The load of excavation is calculated as follows: the incremental displacement caused by the previous excavation step is used to calculate the corresponding incremental stress, and accordingly the corresponding incremental node load is obtained as the incremental excavation load of the current step.
From the FEM results of the macro-analysis and micro-analysis, the following conclusions can be drawn:
a) Based on damage soil mechanics combined with test results for the unloading stress path in Shanghai soft soil, this paper presents a nonlinear elastic unloading damage model for soft soil. It should be pointed out that this model is the companion model of the non-linear elastic damage model for soft soil.
b) The FEM results are reasonably consistent with the measured wall deformations in every step of excavation, which confirms the feasibility and applicability of the unloading damage model of soil to deep excavation engineering.
c) The FEM calculation and analysis show that the distribution of earth pressure along depth and the relationship between earth pressure and wall displacement can be well described qualitatively, and that the reinforced soil beneath the excavation bottom can be simulated by increasing the initial unloading modulus of the soil. All these facts illustrate the broad scope for applying the unloading damage model of soil to deep excavation engineering.
d) The FEM results for the damage variables show that the damage develops with the excavation steps, displayed in the expansion of the damaged area and the increase of the damage variable values. A concentrated damage zone occurs near the corner of the excavation and extends to a certain depth; the maximum value of the damage variable and the measured maximum horizontal displacement occur in these regions.
e) Different evolution rules of damage appear in different directions and different regions of the excavation. That is, the different damage of the soil in different directions and regions reflects the notable influence of the stress path on soil deformation.
f) In general, the degree of damage in the passive region in front of the wall is more serious than that in the active region behind the wall. Therefore, the method of macro-analysis integrated with micro-analysis may be a good way of studying deep excavation engineering.
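As a minimal numerical illustration of the element-level quantity used above, [K]^e = ∫ [B]^T [C̄] [B] dΩ, the sketch below assembles the stiffness of one plane-strain, constant-strain triangle element with an elastic matrix degraded by damage variables D1 (= D2) and D3. The modulus, Poisson's ratio, damage values and the degradation rule are invented for illustration; they are not the effective matrix [C̄] actually derived in [1].

```python
import numpy as np

def elastic_matrix_plane_strain(E, nu):
    """Plane-strain elastic matrix [C] for an isotropic material."""
    c = E / ((1.0 + nu) * (1.0 - 2.0 * nu))
    return c * np.array([[1.0 - nu, nu, 0.0],
                         [nu, 1.0 - nu, 0.0],
                         [0.0, 0.0, (1.0 - 2.0 * nu) / 2.0]])

def damaged_matrix(C, D1, D3):
    """Illustrative cross-anisotropic degradation of [C]: horizontal terms are
    reduced by (1 - D1), vertical terms by (1 - D3), shear by their average.
    This scaling rule is an assumption, not the effective matrix of ref. [1]."""
    w = np.array([1.0 - D1, 1.0 - D3, 0.5 * ((1.0 - D1) + (1.0 - D3))])
    return C * np.sqrt(np.outer(w, w))

def cst_stiffness(xy, C, thickness=1.0):
    """Element stiffness [K]^e = A * t * B^T C B for a constant-strain triangle
    (the integrand is constant, so the integral over the element is exact)."""
    (x1, y1), (x2, y2), (x3, y3) = xy
    A = 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))   # element area
    b = np.array([y2 - y3, y3 - y1, y1 - y2]) / (2.0 * A)
    c = np.array([x3 - x2, x1 - x3, x2 - x1]) / (2.0 * A)
    B = np.zeros((3, 6))
    B[0, 0::2] = b                      # row for eps_x
    B[1, 1::2] = c                      # row for eps_y
    B[2, 0::2], B[2, 1::2] = c, b       # row for gamma_xy
    return A * thickness * B.T @ C @ B

# Example: one soil element before and after damage (all numbers hypothetical)
C0 = elastic_matrix_plane_strain(E=20.0e6, nu=0.35)   # assumed unloading modulus (Pa)
Cd = damaged_matrix(C0, D1=0.2, D3=0.05)              # assumed damage state, D1 = D2
xy = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
K0, Kd = cst_stiffness(xy, C0), cst_stiffness(xy, Cd)
print("relative stiffness after damage:", np.linalg.norm(Kd) / np.linalg.norm(K0))
```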

References
[1] X.H. Zhao, H. Sun and K.W. Lo, Damage Soil Mechanics (in Chinese and English). Tongji University Press, Shanghai, China (2000).
[2] G.B. Liu and X.Y. Hou, The unloading stress strain characteristic for soft soil. Chinese J. China Underground Engineering and Tunnel (in Chinese), 2 16-23 (1997).
[3] X.H. Hu, Study on unloading time effect of soft soil in Shanghai, Dissertation for M.S., Tongji University (in Chinese), Shanghai, China, 16-23 (1999).
[4] S.D. Li, S.Q. Zhang, B.T. Wang et al., Modulus formula for foundation soil under lateral unloading during excavation, Chinese J. China Civil Engineering (in Chinese), 35(5) 70-74 (2002).


[5] G.B. Liu, A study on unloading deformation characteristic for soft soil, "Tongji Geotechnical Engineering Facing to 21st Century - Proceedings of the 90th Anniversary of the Founding of Tongji University" (in Chinese), 320-329 (1997).
[6] D.E. Daniel and R.E. Olson, Stress-strain properties of compacted clay. Proceedings of ASCE, JGTD, 100(GT10), 1123 (1974).
[7] C.L. Chow and J. Wang, An anisotropic theory of elasticity for continuum damage mechanics. International Journal of Fracture, 33 3-16 (1987).
[8] J. Lemaitre, Evaluation of dissipation and damage in metals submitted to dynamic loading. Proc. I.C.M.-1, Kyoto, Japan (1971).
[9] X.H. Zhao, B. Li, K. Li, et al., A study on theory and practice for specially big and deep excavation engineering - deep excavation engineering in Puxi, Outer Ring Tunnel Project of Shanghai. Chinese J. Geotechnical Engineering (in Chinese), 25(3) 258-263 (2003).
[10] G.X. Yang, K. Li, X.H. Zhao, et al., A study on IT construction for specially big & deep excavation engineering - deep excavation engineering of Outer Ring Tunnel Project in Puxi, Shanghai. Chinese J. of Geotechnical Engineering (in Chinese), 25(4) 483-487 (2003).
[11] B. Li, Theory and practice on specially big & deep excavation engineering in soft soil areas. Doctoral degree dissertation (in Chinese), Tongji University, Shanghai, China (2003).


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 608-611

Computational Aided Analysis and Design of Chassis J.S. Liu*

M. Brown**

*Department of Engineering, University of Hull, HU6 7RX Hull, UK

C. Adams**

**Bankside-Patterson Ltd, Barmston Road, Swinemoor Road, HU17 0LA Beverley, UK

Received 6 August, 2004; accepted in revised form 26 August, 2004

Abstract: In this work, a computer-aided design and finite element analysis methodology has been developed for the caravan chassis. 3-D models are created with the software ANSYS to predict the stress and deflection of the chassis under various working conditions. The optimal design techniques developed by the group help to find the best shapes and sizes of the chassis elements. Computer-aided manufacturing has been employed for the components and assembly procedures. The computational techniques have been applied to both the traditional and the new generation chassis. For the traditional chassis, the weak parts of the structure have been discovered, with improvements proposed to enhance the four ground beams and to reduce the material in other beams. Pre-stress has been reduced for the manufacturing process. For the new generation chassis, a versatile, high quality and cost effective chassis has been successfully designed and launched into the UK market. It is envisaged that benefits will be gained from the improved products in terms of better performance and lower cost, and that the computational techniques will play a key role in the product design.
Keywords: computer-aided design, finite element analysis, optimal design, caravan chassis
Mathematics Subject Classification: 65F30 73K25

1. Introduction

The old design method for caravan chassis is based on experience and tests, which can be inefficient, conservative, unreliable and costly. With the development of high performance computing techniques, numerical simulation of the caravan chassis is envisaged as an efficient and accurate method to assist design. Market analysis of the chassis shows a trend towards a versatile chassis with less production time and cost, better performance and longer warranty, and easier assembly and transportation. Therefore, the development of suitable computational methods for the chassis is proposed to meet the market request [1].

The traditional chassis is the main product in the UK caravan market. It employs rolled-steel section beams and welding techniques to assemble the beams into the structure. Those beams are composed of four groups: longitudinal channel, cross member, brace (under, side and knee brace) and subsidiary beam. The subsidiary beams consist of the tow bar, axle and corner steadies. A 2-year warranty is generally provided by painting or galvanizing the chassis. As a result of computer-aided engineering, the new generation chassis aims at the standardization of the design and manufacture for the product catalogue. By employing galvanized beams and riveting techniques, the NG chassis is composed of a middle longitudinal channel, tapered wings and cross beams. It would be much stronger than the traditional product and a longer warranty could be offered [2].

2. Computer aided analysis and optimal design

In the caravan industry, computer-aided design techniques such as AutoCAD drawings are widely used. However, the information on geometry and configuration provided by AutoCAD is not adequate to predict the structural performance under certain load conditions. Theoretical calculation is not applicable for the complex geometry of the chassis. A computational model has more advantages in accuracy and efficiency. Finite element analysis has been used in this paper for the simulation. The numerical method is to replace the analytical equations describing the solid material (stress-strain equations) by discretised approximations, and the structural response is obtained numerically at a finite number of chosen distributed points. The possibility of achieving a design that efficiently optimizes multiple performances, coupled with the difficulty of selecting the values of a large set of design variables, makes structural optimization an important tool for the design of the chassis.

1 Corresponding author. Dept of Engineering, University of Hull, E-mail: [email protected], [email protected]

3-D models have been created with the commercial software ANSYS and applied to the caravan chassis for the first time. Carbon steel BS EN 10025 has been selected for the chassis. Its Young's modulus is 2.05x10^11 N/m^2 and its Poisson's ratio is 0.285 [3]. Beam element 188 [4] is selected for the simulation. Generally, a computer model is composed of 14 types of beam sections, around 100 nodes and 130 elements. A multi-objective optimisation computer program, MOST (multifactor optimisation of structures technique), has been used to accommodate and implement the optimization [5]. The optimization methodology can be stated as: minimize compliance f1(Ti) and/or mass f2(Ti), subject to geometric and response constraints. For the design of the chassis elements, although the constraints and loads differ, the objective of the design is the same: find the best possible shapes and sizes of the structural members, i.e. minimise stress, deformation and mass [5].
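As a small illustration of this optimisation statement (minimise mass subject to geometric and response constraints on stress and deflection), the sketch below sizes the rectangular cross-section of a single cantilevered beam with an SQP-type solver. The span, load, section bounds and deflection limit are invented, and the formulation is a deliberately reduced stand-in for the MOST program used in the paper.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative data (assumed, not taken from the paper)
E, rho = 2.05e11, 7850.0            # Young's modulus (Pa) and steel density (kg/m^3)
L, P = 2.0, 1.0e4                   # beam length (m) and tip load (N)
sigma_allow, defl_allow = 275.0e6, 0.005   # response constraints: stress (Pa), deflection (m)

def response(x):
    """Tip deflection, root bending stress and mass of a solid rectangular
    cantilever with width b and height h under a tip load P."""
    b, h = x
    I = b * h**3 / 12.0                       # second moment of area
    defl = P * L**3 / (3.0 * E * I)           # tip deflection
    sigma = P * L * (h / 2.0) / I             # bending stress at the fixed end
    mass = rho * b * h * L
    return defl, sigma, mass

mass_obj = lambda x: response(x)[2]           # objective: minimise mass
cons = [{"type": "ineq", "fun": lambda x: sigma_allow - response(x)[1]},
        {"type": "ineq", "fun": lambda x: defl_allow - response(x)[0]}]
bounds = [(0.01, 0.10), (0.02, 0.30)]         # geometric constraints on b and h (m)

res = minimize(mass_obj, x0=[0.05, 0.10], method="SLSQP",
               bounds=bounds, constraints=cons)
b_opt, h_opt = res.x
defl, sigma, mass = response(res.x)
print(f"b = {1e3*b_opt:.1f} mm, h = {1e3*h_opt:.1f} mm, mass = {mass:.1f} kg")
print(f"deflection = {1e3*defl:.2f} mm, stress = {sigma/1e6:.0f} MPa")
```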

3. Results and discussions

The performance of the traditional chassis of Bankside-Patterson Ltd [1] has been studied. A uniformly distributed load (UDL) is applied on the cross beams of the chassis to represent the 5.5 ton weight of the caravan. In order to meet the Code of Caravan [6] and the customers' request for less pre-stress, 3-D models have been tested under 5 working conditions [1]. Calculation shows that the working condition with 3-point support (caravan on wheels and tow bar) is the worst case, with the biggest deformation and the highest stress. Weak parts are found on the ground brace near the axle (see Figure 1). Although the maximum stress of 239 N/mm2 is below the material safety stress (275 N/mm2), this indicates that the chassis performance may be enhanced by improving these four beams, while other beams may work far below their full capacity. Material can be saved from the longitudinal channel and cross beams. In order to meet the customer request for less pre-stress during the manufacturing process, the existing traditional chassis has been tested under no pre-stress, the current pre-stress and other pre-stress situations, with the results shown in Table 1. It is realized that some pre-stress is necessary, but it can be reduced from 20 mm to 10 mm.

Figure 1: The maximum stress distribution at the under brace near the axle

Table 1: The deformation and highest stress in the 3-point support traditional chassis

Case                         Max stress (N/mm2)   Front/back pre-stress (mm)   Front/back final deformation (mm)
1. No pre-stress             285                  0                            -60/-117
2. Less pre-stress           250                  13/13                        -41/-104
3. Uneven less pre-stress    261                  2/22                         -75/-95
4. Uneven less pre-stress    256                  5/20                         -53/-97
5. Existing pre-stress       236                  19/20                        -38/-97


The effects of the design parameters on the traditional chassis performance have been studied (i.e., the beam cross-section types and sizes) and applied in finding the optimal design for the new generation chassis. The numerical analysis shows that the G shape would be the best cross-sectional shape for the longitudinal channels and the wings, considering load bearing and anti-buckling capabilities [7]. Increasing the height and width of the cross sections and reducing the plate thickness is found to be efficient in improving the stiffness with less material. Punched holes are designed in the longitudinal channels and wings to assemble the cross members. This reduces the weight of the chassis and the galvanising cost. An angle bracket is designed to joint the cross members to the longitudinal channels. Four choices in vertical position and various choices in horizontal position have been provided for the cross beams, to give flexibility in the heights and positions of the cross members. A series of holes is punched near the bottom flange of the longitudinal channel to offer flexible bolted joints for one to three axles. To join the middle longitudinal channel, the wing is flapped and folded. Thus, two G sections are jointed using the riveting technique in cooperation with a specialist in fastening systems. It is a cold-formed joint which creates a very tight and strong mechanical interlock with no clearance around the fastener. It is a sealed joint without heating or holes [8]. Figure 2 shows the deflection of the NG chassis under a 7 ton load. The maximum deflection occurs at both ends of the chassis. The final catalogue of the NG chassis has been formed with a standard length of middle longitudinal channel, a set of wing lengths, and the cross members. It covers all the existing chassis, sized from 7 to 14 meters in length, weighing from 4.2 to 7 tons, and with 1 to 3 axles. Any customer request can be fitted into the 4 catalogues of the chassis and therefore no further design work is needed. Current prototype production samples indicate that the production speed has been greatly increased with cost reduction [9]. Figure 3 shows the final product.


Figure 2: The deflection of chassis under 7 ton load on 5 supports (unit mm)

Figure 3: The picture of the NG chassis


4. Conclusions

The computational technology (i.e. computer-aided design, finite element analysis and optimal design) has been applied to the caravan chassis design. For the traditional product, the weak parts of the chassis have been found in the ground brace by the computer model. Improvements have been proposed to enhance the under brace beams and reduce material in other elements. A detailed pre-stress has been proposed to meet the customer request. The design helps to cut down manufacturing time and provides competitive market advantages. Furthermore, the design criteria are derived and applied in finding an optimal design. A new generation chassis is proposed with better performance at competitive cost. It is composed of G-section longitudinal beams, tapered wings and C-section cross beams jointed by the rivet technology. The improved product makes it possible to form a chassis catalogue covering a suitable range of caravan sizes and loads, and it also shortens the design and production process. The design of flexible axle and cross beam positions will provide customers with more choices. Two transportation methods can be selected for the NG chassis: (a) shipping the assembled chassis, or (b) shipping the components and assembling them at the customer's site. An additional benefit is that the pre-stress has been significantly reduced from an industry norm of 20 to 25 mm down to 5 mm currently, with the prospect of being eliminated entirely. Customers' responses to date have been very positive and the product has been viewed as a significant advance in the caravan industry.

Acknowledgments
The authors wish to thank the KTP and Bankside-Patterson Ltd for funding this project.

References
[1] X. Zhao, J. Liu, K. Swift, M. Brown and C. Adam, Finite element analysis and optimal design for caravan chassis, the 4th International Conference on Advanced Engineering Design, 5th-8th Sept 2004, 1-6, Glasgow, UK.
[2] D.A. Brewer, A Chassis for use with Static caravan homes, UK patent Application, Bankside-Patterson Ltd., GB 2371026A, 1-9 (2001).
[3] S.P. Timoshenko, Mechanics of Materials, Van Nostrand Reinhold Company, 1972.
[4] ANSYS User Manual, version 6.1, ANSYS Company, 2004.
[5] J.S. Liu and L. Hollaway, Design optimization of composite panel structures with stiffening ribs under multiple loading cases, Computers and Structures, 78, 637-647 (2000).
[6] Code of Practice 501 Draft, Specification for undergear of caravan holiday homes and residential park homes, National Caravan Council Limited, 2, 1-12 (2003).
[7] X. Zhao, NG II structural analysis, Technical report, Bankside-Patterson Ltd, 1-48, 2004.
[8] X. Zhao, J. Liu, K. Swift, M. Brown and C. Adam, The design of new generation chassis by advanced techniques, the 4th International Conference on Advanced Engineering Design, 5th-8th Sept 2004, 1-6, Glasgow, UK.
[9] C. Adam, New Generation (NG) Chassis, Launch presentation, Beverley, UK, March 13, 2004.

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 612-615


A Transient 3-D Numerical Model for Fluid and Solid Interactions

X. Zhao1,*, T. G. Karayiannis**, M. Collins**

*Department of Engineering, University of Hull, HU6 7RX Hull, UK
**Department of Engineering Systems, London South Bank University, SE1 0AA London, UK

Received 6 August, 2004; accepted in revised form 26 August, 2004
Abstract: In this paper, a transient 3-D numerical model has been developed for general use on fluid-solid interaction problems. An iterative coupling method has been employed to retain the advantages of the finite element method (FEM) and the finite volume method (FVM) by using the commercial software ANSYS 6.0 and CFX4.4. The data exchange and mesh control at the fluid-solid interface have been developed through user subroutines and FORTRAN programs. The model has been applied to the study of water flow in flexible tubes, blood flow in arteries and single-phase flow in Coriolis mass flowmeters, and satisfying results have been achieved that agree with theoretical, numerical and experimental data.
Keywords: Fluid solid interaction, iterative coupling method, finite element method, finite volume method
Mathematics Subject Classification: 65F30 73K25

1. Literature review

The numerical method for fluid-solid interaction is to replace the analytical equations describing the flow (e.g. the Navier-Stokes equations) or the solid material (stress-strain equations) by discretised approximations; the solutions for the flow properties are obtained by numerical means at a finite number of chosen distributed points. Typically, the numerical schemes for fluid-solid coupling can be classified as one of three types [1]: simultaneous, hybrid and iterative methods. Both simultaneous and hybrid methods limit the involvement of commercial packages and restrict their extension to complex wall behaviour. The iterative method ensures that the fluid and solid equations are solved separately and then coupled externally in an iterative manner, using the boundary solution from one as a boundary constraint for the other. In its latest developments, Penrose et al. [2] developed a piece of software, BLOODSIM, to control fluid simulation by CFX5.5 and structural analysis by ANSYS. Further application of BLOODSIM to arteries can be found in [3] by the author. However, its limitation to elastic materials and its inconvenience for user subroutines led the group to develop a novel coupled fluid-flow and stress-analysis numerical model for general use. Based on the work in [1-7], a novel coupling method using CFX4.4 and ANSYS was created. Data transfer and the iterative simulation were realized by manual operation between ANSYS and CFX4.4.

2. Formulation of the computing

The key in coupling the fluid flow discretized equations to the structural discretized equations lies in how the interface conditions are imposed. According to ref. [8], the equations (1) of displacement compatibility, traction (velocity) equilibrium and stress equilibrium are applied on the interface:

d_f^S = d_s^S,   τ_f^S = τ_s^S,   σ_f^S = σ_s^S                    (1)

where d_f^S, τ_f^S and σ_f^S (d_s^S, τ_s^S and σ_s^S) are the displacement, tractions and stress of the fluid (solid) on the interface, respectively.

1 Corresponding author. Dept of Engineering, University of Hull, E-mail: [email protected], [email protected]


The numerical model developed in this paper is based on the Arbitrary Lagrangian-Eulerian method and designed for the general treatment of FSI problems, with no limitation on the deformation or the material properties. The interface of the fluid and solid domains was selected as the basis for data transfer. Hexahedral elements were employed for both the fluid and solid domains, with identical meshes on the interface. The fluid mesh follows the solid deformation on the moving boundary. Mesh movement of the interface and inside the fluid domain is controlled by the CFX4.4 subroutine USRGRD, data are transferred by USRTRN, and boundary conditions are set by USRBCS. Particular FORTRAN programs (Fromctoa.f and Fromatoc.f) were created to transfer the data between the CFX and ANSYS formats, i.e. the stresses on interface element faces and the interface node relocations. Figure 1 shows the details of coupling method 1.
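The outer iteration of such a partitioned coupling (solve the fluid with the current wall position, pass the interface traction to the solid, solve the solid, pass the interface displacement back, and repeat until the interface conditions (1) are satisfied) can be sketched generically as below. The two single-degree-of-freedom "solvers" and the under-relaxation factor are invented stand-ins for CFX4.4 and ANSYS; only the structure of the iteration reflects the method described in the paper.

```python
def fluid_solve(d_interface):
    """Stand-in fluid solver: interface traction produced by the flow for a
    given interface displacement (a toy linear relation, not CFX)."""
    p_inlet, k_fluid = 1000.0, 400.0
    return p_inlet - k_fluid * d_interface

def solid_solve(traction):
    """Stand-in structural solver: interface displacement of the wall under
    the applied traction (a toy linear spring, not ANSYS)."""
    k_wall = 2000.0
    return traction / k_wall

def coupled_step(omega=0.5, tol=1e-10, max_iter=100):
    """Outer coupling iteration with under-relaxation of the displacement."""
    d = 0.0
    for it in range(1, max_iter + 1):
        t = fluid_solve(d)                        # fluid step at the current wall position
        d_new = solid_solve(t)                    # solid step under the fluid traction
        if abs(d_new - d) < tol:                  # interface compatibility reached
            return d_new, it
        d = (1.0 - omega) * d + omega * d_new     # relaxed update of the interface
    return d, max_iter

d_conv, iters = coupled_step()
print(f"converged interface displacement {d_conv:.6f} after {iters} iterations")
print("fixed point of the two toy solvers:", 1000.0 / (2000.0 + 400.0))
```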

3. Results and discussions

The first application of the numerical model is to study wave propagation in an elastic tube. A water flow in an elastic tube is selected in which the fluid inlet pressure is proportional to time. Figure 2 compares the numerical model presented in this paper with the wave speed calculated analytically [9] and with a published numerical result [2]. The calculation shows that the numerical result of this model under-predicts the speed, while the published numerical model over-predicts the speed, compared with the analytical result.

Figure 1: The coupling method 1 (data exchange via the CFX4.4 subroutines Usrgrd.f, Usrtrn.f and Usrbcs.f)

Figure 5: The stress of the meter at a quarter of the period

Figure 2 Comparison of predictions of wave speed in an elastic tube


The second application of the numerical model is to study pulsatile blood flow in a simple artery. The artery is assumed to have a simple axisymmetric geometry with inner radius 5 mm, outer radius 6 mm and length 50 mm. Blood is treated as an incompressible, laminar, Newtonian fluid, with a density of 10^3 kg/m^3 and a viscosity of 4x10^-3 kg/(s·m). The decoupled solid and fluid models were first validated before running the coupled calculation. The coupling model has been tested for the following benchmarks: (1) steady flow in a rigid artery, (2) steady flow in a linear elastic artery, (3) pulsatile flow in a rigid artery, (4) pulsatile flow in a linear elastic artery and (5) pulsatile flow in a nonlinear elastic artery [2, 7]. Figure 3 shows the comparison of the numerical results with published data. A satisfying agreement has been achieved.

Axial velocity versus radial position (x/r) at times 0, 0.25, 0.5 and 0.75 of the 0.75 s period: present numerical results compared with published results [1].

Figure 3: Velocity comparison for transient flow in a linear elastic artery

Finally, the single-phase flow in a straight Coriolis meter has been simulated [6]. Figure 4 shows the time delay versus the flow speed in comparison with the numerical result, where D is the sensor distance of the meter. A linear relationship between the flow rate and the time delay between the sensors can be seen. The numerical results agree with the experimental [10] and analytical data [11]. Figure 5 shows the maximum stress distribution of the meter at a quarter of the vibration period, which agrees with the published numerical result [12]. Furthermore, the coupling model has been applied to study the influence of the design and application parameters on the meter performance. It helps to investigate the transient fluid pressure and solid stress distributions. The results show that the meter's accuracy is restricted to a certain range of flow conditions.

Time delay versus flow rate (m/s) for sensor distances D = 0.4, 0.6 and 0.9 m: experimental data [10], analytical results [11] and present numerical results.

Figure 4 The time delay varying with flow speed and sensor locations of the meter


4. Conclusions

A comprehensive numerical method combining two commercial codes (CFX4.4 and ANSYS) for solving coupled solid/fluid problems has been developed and applied to the simulation of water flow in a tube, blood flow in arteries and single-phase flow in Coriolis meters. These benchmarks have demonstrated the capacity of the transient 3-D model for accurately modelling fluid-solid interaction over a wide range of applications. As a novel research result, the fluid pressure and solid stress distributions of the Coriolis meter have been investigated, and the design and application factors of the meter have been tested with the numerical model.

References
[1] S.Z. Zhao, The numerical study of fluid-solid interactions for modelling blood flow in arteries, Ph.D. thesis, City University, UK (1999).
[2] J.M.T. Penrose, D.R. Hose, C.J. Staples, I.S. Hamill, I.P. Jones and D. Sweeney, Fluid structure interactions: coupling of CFD and FE, 18. CAD-FEM Users' Meeting, Internationale FEM-Technologietage, Graf-Zeppelin-Haus, Friedrichshafen, 20-22 September (2000).
[3] X. Zhao, The numerical study of moving boundary problems: application in arteries and Coriolis meters, Ph.D. thesis in preparation, London South Bank University, UK (2004).
[4] S.Z. Zhao, X.Y. Xu and M.W. Collins, The numerical analysis of fluid-solid interactions for blood flow in arterial structures, part 2: development of coupled fluid-solid algorithms. Proc. Instn Mech. Engrs, 212, 241-252 (1998).
[5] S.Z. Zhao, X.Y. Xu and M.W. Collins, A novel numerical method for analysis of fluid and solid coupling, Numerical Methods in Laminar and Turbulent Flow, 10, Editor: C. Taylor, 525-534 (1997), also in the Proceedings of the Tenth International Conference held in Swansea, 21-25 July (1997).

[6] X. Zhao, M. Collins and T.G. Karayiannis, Numerical coupling method for moving boundary problems: application in Coriolis meters, Proceedings of the WSEAS and IASME International Conferences on Fluid Mechanics, Corfu, Greece, August 17-19, 1-6 (2004).
[7] X. Zhao, J.M.T. Penrose, S.Z. Zhao, T.G. Karayiannis and M. Collins, Numerical study of fluid solid interaction (FSI) in arteries, Proceedings of the WSEAS and IASME International Conferences on Fluid Mechanics, Corfu, Greece, August 17-19, 1-6 (2004).
[8] K.J. Bathe, H. Zhang and S.H. Ji, Finite element analysis of fluid flows fully coupled with structural interactions, Computers and Structures, 72, 1-16 (1999).
[9] McDonald, Blood Flow in Arteries, Edward Arnold (1990).
[10] G. Sultan, Theoretical and Experimental Studies of the Coriolis Mass Flowmeter, Ph.D. Thesis, Cranfield University, UK (1990).
[11] H. Raszillier and F. Durst, Coriolis-effect in mass flow metering, Archive of Applied Mechanics, 61, 192-214 (1991).
[12] R.M. Watt, Modelling of Coriolis Mass Flowmeters Using ANSYS, ANSYS Users Conf., Pittsburgh, U.S., No. 10, 67-78 (1991).


Lecture Series on Computer and Computational Sciences

Volume I, 2004, pp. 616-619

Reliability Analysis of Concrete Structure by Monte Carlo Simulation

Zhigen Zhao a,b,1, Yingping Zheng a, Wei-ping Pan b

a. Department of Resources and Environmental Engineering, Anhui University of Science and Technology, Huainan, P.R. China 232001; b. Material Characterization Center, Western Kentucky University, Bowling Green, KY, USA 42101 Received 6 August, 2004; accepted in revised form 15 August, 2004 Abstract: Traditionally, factor of safety has always been used to evaluate the stability of a given structure. However, factor of safety is a definite value and is established with no consideration of any uncertainties and variability of the concerned objective. In this paper, the difference between reliability and factor of safety is briefly introduced. In addition, taking a given concrete structure as an example, the Monte Carlo simulation method is introduced for the reliability analysis. This research may be helpful to provide a method for evaluating the safety degree more effectively and completely. Keywords: safety degree, factor of safety, reliability, Monte Carlo simulation, concrete structure Mathematics Subject Classification: 65C05

1 Introduction

In the past, the factor of safety has always been used to evaluate the safety degree of a given structure. However, the factor of safety is limited to a definite value and is established with no consideration of the uncertainties and variability of the concerned objective. Therefore, the factor of safety cannot effectively and completely reflect the safety degree of a given structure. For example, the safety degree is not 120% when the factor of safety is 1.2. Moreover, the same factors of safety may not mean the same degrees of safety for different structures. In fact, the safety degree is a random variable with uncertainty and variability. Therefore, it is important to take this kind of randomness into consideration when the safety degree of a given structure is studied. In recent years, reliability analysis has developed rapidly and is widely used to evaluate the safety degree in many fields, and it is generally accepted as an effective way to make up for the shortcomings of the factor of safety in addressing the randomness of the safety degree. The theory of reliability analysis is a frontier science based on probability and mathematical statistics. Because the reliability of a given structure can be measured by probability, mathematical methods can be used to evaluate the safety degree of a concerned project. In this paper, the difference between reliability and factor of safety is briefly introduced. In addition, taking a given concrete structure as an example, the Monte Carlo simulation method of reliability analysis is introduced.

2 Reliability and Factor of Safety

2.1 Reliability

Reliability (R) can be defined as the probability of a device performing its purpose adequately for the period of time intended under the specified conditions encountered. This probability is actually called the probability of success (Ps), while the opposite is the probability of failure (Pf). Their relationship

1 Corresponding author. A PhD and a Professor of Anhui University of Science and Technology. And at present, also a visiting scholar at Western Kentucky University. E-mail: [email protected] or [email protected].


is R = Ps = 1 - Pf. A device as mentioned above can be anything, and in this paper it refers to a given concrete structure. The reliability of a given concrete structure can be defined as the probability that the random variable of Strength (S) is greater than the random variable of Load (L) of this given concrete structure. That is, R = Ps = P(S > L) = P(S - L > 0).

2.2 Factor of Safety

The factor of safety can be defined as the ratio of the available to the required, or of the Capability of a device to the Demand on this device. In this paper, the available or the Capability means the Strength of a given concrete structure, while the required or the Demand means the Load on this given concrete structure. Therefore, the factor of safety of a given concrete structure (Fs) can be defined as the ratio of the average Strength (S̄) to the average Load (L̄). That is, Fs = S̄ / L̄.

3 A Given Concrete Structure

3.1 Distribution of the Strength and the Load

Listed in Table 1 and Figure 1 are the distributions of the Strength and the Load of a given concrete structure.

Table 1: Distribution of the Strength and the Load of a given concrete structure

Class interval (MPa)   Midpoint (MPa)   Load (L) rel. freq. (f_i)   Load cum. freq. (F_i)   Strength (S) rel. freq. (f_i)   Strength cum. freq. (F_i)
14.5-15.5              15.0             0.000                       0.000
15.5-16.5              16.0             0.078                       0.078
16.5-17.5              17.0             0.095                       0.173
17.5-18.5              18.0             0.118                       0.291
18.5-19.5              19.0             0.131                       0.422
19.5-20.5              20.0             0.126                       0.548                   0.000                           0.000
20.5-21.5              21.0             0.112                       0.660                   0.075                           0.075
21.5-22.5              22.0             0.101                       0.761                   0.092                           0.167
22.5-23.5              23.0             0.091                       0.852                   0.107                           0.274
23.5-24.5              24.0             0.080                       0.932                   0.123                           0.397
24.5-25.5              25.0             0.068                       1.000                   0.143                           0.540
25.5-26.5              26.0                                                                 0.150                           0.690
26.5-27.5              27.0                                                                 0.131                           0.821
27.5-28.5              28.0                                                                 0.101                           0.922
28.5-29.5              29.0                                                                 0.078                           1.000

Figure 1: Relative frequency distribution of Strength and Load (relative frequency versus strength or load, MPa)


3.2 Factor of Safety

Based on Table 1, when the midpoint stands for the corresponding interval, the mean values of the Load and the Strength can be calculated as follows:
L̄ = Σ L_i × f_i = 16.0×0.078 + 17.0×0.095 + ... + 25.0×0.068 = 20.283
S̄ = Σ S_i × f_i = 21.0×0.075 + 22.0×0.092 + ... + 29.0×0.078 = 25.114
Therefore, the factor of safety is Fs = S̄ / L̄ = 25.114 / 20.283 = 1.238.

3.3 About the Probability of Failure

Although the factor of safety is 1.238 based on the above calculation, a probability of failure obviously exists whenever the Strength is less than the Load. In the following, the Monte Carlo method is used to calculate this probability of failure, and hence to obtain the probability of success.

4 Reliability Analysis by Monte Carlo Simulation

The term "Monte Carlo" was introduced by von Neumann and Ulam during World War II as a code word for the secret work at Los Alamos; it was suggested by the gambling casinos in the city of Monte Carlo in Monaco. However, Monte Carlo simulation goes far beyond gambling applications. Currently, Monte Carlo simulation is widely used in many fields. The principle behind the method is to develop a computer-based analytical model that predicts the behavior of a system. That is, the method provides approximate solutions to a variety of mathematical problems by performing statistical sampling experiments on a computer.

4.1 Random Number Generation

Many techniques for generating random numbers have been suggested, tested, and used in past years. The most commonly used present-day method for generating pseudorandom numbers is one that produces a nonrandom sequence of numbers according to a recursive formula based on calculating the residues modulo an integer m of a linear transformation. Although these processes are completely deterministic, it has been shown that the numbers generated by the sequence appear to be uniformly distributed and statistically independent. Congruential methods are based on a fundamental congruence relationship, which may be expressed as X_{i+1} = (a X_i + c) (mod m), i = 1, ..., n, where the multiplier a, the increment c, and the modulus m are nonnegative integers. Given an initial starting value X_0 (also called the seed), the sequence {X_i} can be generated for any value of i. Random numbers on the unit interval (0, 1) are then obtained by U_i = X_i / m.
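The congruential generator just described can be written down directly, as in the short sketch below; the multiplier, increment and modulus used here are the classic "minimal standard" choices and are assumptions, not values taken from the paper.

```python
def lcg(seed, n, a=16807, c=0, m=2**31 - 1):
    """Linear congruential generator X_{i+1} = (a*X_i + c) mod m, returning
    n pseudorandom numbers U_i = X_i / m on the unit interval (0, 1)."""
    x, numbers = seed, []
    for _ in range(n):
        x = (a * x + c) % m
        numbers.append(x / m)
    return numbers

print(lcg(seed=12345, n=5))
```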

4.2 Random Variables ({S_i}, {L_i}) Generation

Take the Strength as an example to illustrate how to obtain the random variable {S_i}. Based on Table 1, the relative cumulative frequencies {F_k} are known. A given U_i falls into an interval [F_k, F_{k+1}) corresponding to class No. (k+1), whose lower and upper class boundary values are M_lower and M_upper. Then
S_i = M_lower + (M_upper - M_lower) × (U_i - F_k) / (F_{k+1} - F_k).
For instance, suppose U_i = 0.600; it falls into [0.540, 0.690), corresponding to M_lower = 25.5 and M_upper = 26.5. So, S_i = 25.5 + (26.5 - 25.5) × (0.600 - 0.540) / (0.690 - 0.540) = 25.9000.
Likewise, the Load is treated in the same way as the Strength. Suppose U_i = 0.600; it falls into [0.548, 0.660), corresponding to M_lower = 20.5 and M_upper = 21.5. Therefore,
L_i = 20.5 + (21.5 - 20.5) × (0.600 - 0.548) / (0.660 - 0.548) = 20.9643.

4.3 Flow Chart of the Monte Carlo Method

The flow chart of Monte Carlo Simulation is given in Figure 2.

Input the number of simulation runs N; set the initial number of failures Nf = 0.
Loop: FOR i = 1 TO N:
- produce the i-th random number for the Strength, US(i);
- get the i-th random Strength value S(i) based on US(i) and the cumulative frequency distribution;
- produce the i-th random number for the Load, UL(i);
- get the i-th random Load value L(i) based on UL(i) and the cumulative frequency distribution;
- compare S(i) with L(i).
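A minimal sketch of the complete simulation follows: it samples Strength and Load values by the inverse-transform interpolation of Section 4.2 using the cumulative frequencies of Table 1, counts the failures with S(i) < L(i), and reports R = 1 - Nf/N. The use of Python's built-in generator in place of the congruential formula, the number of runs, and the final counting step (which is cut off in the flow chart above) are assumptions.

```python
import random

# Cumulative frequencies (F) and class boundaries (MPa) taken from Table 1
load_cum    = [0.000, 0.078, 0.173, 0.291, 0.422, 0.548, 0.660, 0.761, 0.852, 0.932, 1.000]
load_bounds = [14.5, 15.5, 16.5, 17.5, 18.5, 19.5, 20.5, 21.5, 22.5, 23.5, 24.5, 25.5]
str_cum     = [0.000, 0.075, 0.167, 0.274, 0.397, 0.540, 0.690, 0.821, 0.922, 1.000]
str_bounds  = [19.5, 20.5, 21.5, 22.5, 23.5, 24.5, 25.5, 26.5, 27.5, 28.5, 29.5]

def sample(u, cum, bounds):
    """Inverse-transform sampling: linear interpolation inside the class whose
    cumulative-frequency interval [F_k, F_{k+1}) contains u (Section 4.2)."""
    for k in range(len(cum) - 1):
        if cum[k] <= u < cum[k + 1]:
            lower, upper = bounds[k + 1], bounds[k + 2]
            return lower + (upper - lower) * (u - cum[k]) / (cum[k + 1] - cum[k])
    return bounds[-1]

def reliability(n_runs, seed=1):
    """Estimate R = 1 - Nf/N by counting the runs in which Strength < Load."""
    rng = random.Random(seed)
    n_fail = 0
    for _ in range(n_runs):
        s = sample(rng.random(), str_cum, str_bounds)    # random Strength S(i)
        l = sample(rng.random(), load_cum, load_bounds)  # random Load L(i)
        if s < l:
            n_fail += 1
    pf = n_fail / n_runs
    return 1.0 - pf, pf

R, Pf = reliability(100000)
print(f"estimated reliability R = {R:.4f}, probability of failure Pf = {Pf:.4f}")
```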




Lecture Series on Computer and Computational Sciences

Volume I, 2004, pp. 661-665

Simulation of River Streams: Comparison of a New Technique to QUAL2E

M. Yuceer1, E. Karadurmus2, R. Berber1,*

1Department of Chemical Engineering, Faculty of Engineering, Ankara University, 06100 Ankara, Turkey
2Corum Engineering Faculty, Gazi University, 19200 Corum, Turkey

Abstract: Predictions and quality management issues for environmental protection in river basins rely on water-quality models. These models can be used to simulate conditions in or near the range of the calibrated or verified conditions. In this respect, estimation of parameters, which is still practiced by heuristic approaches (i.e. manually), seems to be the point where attention needs to be focused. The authors' research group developed a systematic approach for dynamic simulation and parameter estimation in river water quality models [1,2], which eliminated the cumbersome trial-and-error method. The approach is based on simulating the river stream as a series of CSTRs with embedded parameter estimation capability using dynamic data. For the implementation of the suggested technique, they later reported a user-interactive software package named RSDS (River Stream Dynamics and Simulation) [3]. This study provides a comparative investigation of the suggested modeling approach against a well established and worldwide known water quality software, QUAL2E. Experimental data collected in field observations along the Yesilirmak river basin in Turkey were checked against the predictions from both software packages. The results indicated that much better agreement with the experimental data could be obtained from RSDS than from QUAL2E. Thus, the systematic procedure suggested in the present work provides an effective means for reliable estimation of model parameters and dynamic simulation for river basins, and therefore contributes to the efforts for predicting the extent of the effect of possible pollutant discharges in river basins.
Keywords: River water quality, parameter estimation, dynamic simulation

1. Introduction

Water quality models generally require a relatively large number of parameters to define their functional relationships, and since prior information on parameter values is limited, they are commonly defined by fitting the model to observed data. The model can then be used to simulate conditions in or near the range of the calibrated or verified conditions. In this respect, estimation of parameters, which is still practiced by heuristic approaches (i.e. manually), seems to be the point where attention needs to be focused. The state of the art in river water quality modeling was summarized by Rauch et al. [4], who addressed some issues related to the practical use of river quality models; upon comparison of 10 important software products, they indicated that only two of them offered limited parameter estimation capability. Mulligan et al. [5] also noted that practitioners often resorted to manual trial-and-error curve fitting for calibration. The generally accepted software for river water quality modeling worldwide is the U.S. EPA's

1,* Corresponding author. E-mail: [email protected]


QUAL2E [6]. However, this software does not address a number of practical problems, such as the issue of parameter estimation. In a previous study [1], we suggested a dynamic simulation and parameter estimation strategy so that the heavy burden of finding reaction rate coefficients was overcome. Modeling was based on the assumption that the segment of river between sampling stations behaves as a completely stirred tank reactor (CSTR). This work was later extended to the series-of-CSTRs approach [2], and furthermore a MATLAB-based user-interactive software package was developed for easy implementation of the technique [3]. The program, named RSDS (River Stream Dynamics and Simulation), was coded in the MATLAB 6.5 environment. This work extends the previous work in the sense that a comparative study is provided to assess the capabilities and effectiveness of the suggested technique with respect to the QUAL2E software. The model was constituted from the dynamic mass balances for different forms of nitrogen and phosphorus, biological oxygen demand, dissolved oxygen, coliforms, a non-conservative constituent and algae for each computational element. Model parameters, conforming to those in the QUAL2E water quality model, were estimated by an SQP algorithm by minimizing an objective function. As the QUAL2E model is almost the standard for river water quality modeling, we have chosen it for comparing the predictions from the suggested new methodology.

2. Computing Methodology

In dynamic modeling, serially connected CSTRs are assumed to represent the behavior of the river stream. Each reactor forms a computational element and is connected sequentially to the similar elements upstream and downstream. The following assumptions were employed for model development:
- well-mixed dendritic stream;
- good mixing in cross sections of the river;
- constant stream flow and channel cross section;
- constant chemical and biological reaction rates within the computational element.
Physical, chemical and biological reactions and interactions that might occur in the stream have all been considered. The modeling strategy stems from that of the QUAL2E water quality model [6]. The dynamic mass balances for 11 constituents (i.e. state variables) were written, and additionally several algebraic equations describing various phenomena, such as the conversion of the different forms of nitrogen, were involved. A nonlinear constrained parameter estimation strategy has been incorporated into the simulation so that a number of techniques (involving Gauss-Newton, Levenberg-Marquardt and Sequential Quadratic Programming, SQP, algorithms) can be employed. SQP, being probably the most effective of all, updates the Hessian matrix of the Lagrangian function, solves a quadratic programming subproblem and uses a line search and merit function calculation at each iteration. The estimation strategy was based on minimizing an objective function defining the difference between the predictions and the observed data during the transient period of observations. In many practical applications of water quality models, parameter values are chosen manually rather than by automated numerical techniques, or heuristic approaches are resorted to [4]. It is known that automated methods are associated with some difficulties depending on the model structure, optimization method, number of variables and parameters, type of measurements and so on. So far no automated estimation of parameters for river water quality models has been reported. In a recent work, Sincock et al. [7] reported a detailed study involving the identification of model parameters. However, a relatively short data record was considered, and although temperature was predicted accurately, the nitrate, BOD and DO predictions were less so. Therefore, this work provides an extension over the previous approaches, particularly in terms of automatically generating reliable estimates of water quality model parameters without resorting to trial-and-error simulations. A sketch of the underlying idea is shown below.
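The sketch is a heavily reduced stand-in for the approach: a few CSTRs in series carrying only BOD and DO, with the two rate constants estimated by an SQP-type optimiser from synthetic "observations" at the downstream reactor. The balance equations are a textbook Streeter-Phelps-style simplification rather than the full eleven-constituent formulation of RSDS, and every number (flows, volumes, rate constants, sampling times) is invented for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

N, V, Q = 4, 5.0e4, 2.0                   # number of CSTRs, volume (m3) and flow (m3/s), assumed
bod_in, do_in, do_sat = 18.0, 7.0, 9.0    # inlet BOD, inlet DO and saturation DO (mg/L), assumed

def rhs(t, y, kd, ka):
    """Dynamic mass balances for BOD and DO in N well-mixed reactors in series."""
    bod, do = y[:N], y[N:]
    dbod, ddo = np.zeros(N), np.zeros(N)
    for i in range(N):
        b_up = bod_in if i == 0 else bod[i - 1]
        d_up = do_in if i == 0 else do[i - 1]
        dbod[i] = Q / V * (b_up - bod[i]) - kd * bod[i]
        ddo[i] = Q / V * (d_up - do[i]) + ka * (do_sat - do[i]) - kd * bod[i]
    return np.concatenate([dbod, ddo])

def simulate(kd, ka, t_obs):
    y0 = np.concatenate([np.full(N, bod_in), np.full(N, do_in)])
    sol = solve_ivp(rhs, (0.0, t_obs[-1]), y0, t_eval=t_obs, args=(kd, ka))
    return sol.y[N - 1], sol.y[-1]        # BOD and DO in the most downstream reactor

t_obs = np.linspace(0.0, 2.0e5, 20)       # observation times (s), assumed
bod_meas, do_meas = simulate(2.0e-5, 5.0e-5, t_obs)   # synthetic "field data"

def objective(p):
    """Sum of squared deviations between model predictions and observations."""
    bod_sim, do_sim = simulate(p[0], p[1], t_obs)
    return np.sum((bod_sim - bod_meas) ** 2) + np.sum((do_sim - do_meas) ** 2)

res = minimize(objective, x0=[1.0e-5, 1.0e-4], method="SLSQP",
               bounds=[(1e-7, 1e-3), (1e-7, 1e-3)])
print("estimated decay and reaeration constants:", res.x)
```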

3. Experimental measurements


For both field observations and data collection, the concentrations of many water-quality constituents indicative of the level of pollution in the river were determined either on-site by portable analysis systems or in the laboratory after careful conservation of the samples. The experimental data for parameter estimation were obtained from two sampling stations along the Yesilirmak River around the city of Amasya in Turkey. The concentrations of many water-quality constituents, corresponding to the state variables of the model and indicative of the level of pollution in the river, were determined at 30-minute intervals either on-site by portable analysis systems or in the laboratory after careful conservation of the samples. Water quality constituents of the river were determined at various locations along a 7 km long section of the river. The sampling locations matched the river stream such that the volume element of water whose quality was sampled at location zero (i.e. the starting point, the Durucasu gauging station) was followed with the stream. This was just like dynamically keeping track of an element of the river flowing at the same velocity as the main stream. The locations of the sampling points were determined such that the available measurements would suffice to make such a study possible. After the starting point, sampling was done at locations 0.225 km, 4 km, 5 km and 7 km downstream. The industrial wastewater of a baker's yeast production plant was being discharged right after the starting point. Therefore, the results of the study indicate the extent of the pollution caused by the discharge from this industrial plant. In the simulations the addition of this discharge was considered as a continuous disturbance to the system, and its effect on the water quality was thus determined.

4. Results

Predictions from RSDS and QUAL2E were then compared to field data for the river reach of 7 km. Figures 1 and 2 show the profiles of the two most important pollution variables, BOD and Dissolved Oxygen (DO), in a 7 km section of the river after the point source input. For quantitative evaluation, Absolute Average Deviation (AAD) values were calculated for both software packages; they are given for some pollution variables as follows:

AAD (%)     Dissolved Phosphorus   DO     BOD5
RSDS        5.26                   1.61   1.0
QUAL2E      6.92                   3.14   0.85

The predictions from the RSDS indicated a much better agreement with the experimental data, compared to QUAL2E. Thus, the systematic procedure suggested in the present work provides an effective means for reliable estimation of model parameters and dynamic simulation for river basins, and therefore, contributes to the efforts for predicting the extent of the effect of possible pollutant discharges in river basins.


Figure 1. BOD (BOD-5) profile in a 7 km section of the river, plotted against distance (m): experimental data and model predictions.

Figure 2. DO profile in a 7 km section of the river, plotted against distance (m): experimental data and model predictions (RSDS, Qual2E).

References

[1] E. Karadurmus and R. Berber, Dynamic simulation and parameter estimation in river streams, Environmental Technology 25 471-479 (2004).
[2] M. Yuceer, E. Karadurmus and R. Berber, Dynamic modelling of river streams by a series of CSTR approach, AIChE Annual Meeting, 16-21 November 2003, San Francisco (Session: CAST 10 - Environmental Performance Monitoring and Metrics), Paper 455e.
[3] M. Yuceer and R. Berber, Effective verification of river water quality models through optimum parameter estimation: A new software. Joint Conf. of American & Indian Inst. of Chem. Engineers, Session on Environment, Mumbai, India, December 28-30, 2004 (Submitted for presentation).


[4] W. Rauch, M. Henze, L. Koncsos, P. Reichert, P. Shanahan, L. Somlyody and P. Vanrolleghem, River water quality modelling: I. State of the art, IAWQ Biennial Int. Conf., Vancouver, Canada, 21-26 June 1998.
[5] A.E. Mulligan and L.C. Brown, Genetic algorithms for calibrating water quality models, J. Environmental Engineering 124(3) 202-211 (1998).
[6] L.C. Brown and T.O. Barnwell Jr., The enhanced stream water quality model QUAL2E and QUAL2E-UNCAS, Documentation No. EPA/600/3-87/007, Environmental Research Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, Athens, Georgia, 1987.
[7] A.M. Sincock, H.S. Wheather and P.G. Whitehead, Calibration and sensitivity analysis of a river water quality model under unsteady state conditions, J. of Hydrology 277 214-229 (2003).

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 666-670


Evaluation of Debottlenecking Strategies for a Liquid Medicine Production Utilising Batch Process Simulation

Jully Tan, Dominic Chwan Yee Foo1, Sivakumar Kumaresan, Ramlan Abdul Aziz
Chemical Engineering Pilot Plant, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia. Tel: +607-5531662, Fax: +607-5569706. E-mail addresses: [email protected], [email protected], [email protected], [email protected]

Alexandros Koulouris
Intelligen, Inc., Thessaloniki Technology Park, PO Box 328, 57001 Thermi-Thessaloniki, Greece. Tel: +30-2310-498292, Fax: +30-2310-498280. E-mail address: [email protected]

Abstract: Computer Aided Process Design (CAPD) and simulation tools are used to debottleneck an integrated pharmaceutical production process producing a liquid medicine, LIQMED. The process comprises several sections: 3 pre-blending sections A, B and C, a syrup making section, a main blending section, an intermediate storage and a packaging section. The bottleneck was found to be the cartoning process in the packaging section. Eight equipment alternatives were proposed to remove this bottleneck and were evaluated based on the process throughput and economic criteria such as Cost Benefit Ratio (CBR) and cost of investment. The best debottlenecking alternative was found to have the highest cost benefit ratio of 1.4 and the lowest cost of investment of $1.2 M.
Keywords: Batch process, pharmaceutical production, modelling and optimisation, throughput analysis, debottlenecking.

1. Introduction

Computer Aided Process Design (CAPD) and simulation tools have been successfully used in the bulk chemical industry since the early 1960s (Westerberg et al., 1979). However, the use of these tools has only begun in the biochemical-related production industry throughout the past decade (Ernst et al., 1997). Process simulation can be applied in several stages of process development. Once process ideas are conceived, process simulators can be used for project screening and selection for strategic planning based on economic analysis or any other critical process requirement. When process development nears completion at the pilot level, simulation tools are used to systematically design and optimise the large-scale process for the commercialisation of products. Good process simulators can facilitate the transfer of process technology as well as facilitate design. They can be used to estimate the required capital investment of the process. In large-scale batch manufacturing, process simulation is primarily used for process scheduling, debottlenecking and on-going process optimisation. It is capable of tracking equipment use for overlapping batches and identifying process bottlenecks (Petrides et al., 2002).

2. Case study

1 Corresponding author. Postgraduate Researcher, Chemical Engineering Pilot Plant, Universiti Teknologi Malaysia. E-mail: [email protected]


A case study is modelled based on the operating conditions of an existing pharmaceutical process for the production of an oral liquid medicine, LIQMED. Figure 1 shows the simulation flowsheet for the case study modelled in SuperPro Designer 5.0. Fifty-two weeks of operation (with five working days a week and sixteen hours a day) are taken as the basis of this work. The dispensing and blending process is divided into 5 main parts, i.e. pre-blending sections A, B and C, the syrup making section and the main blending section. The batch size of the process is 2000 L, with the final product being packed into 90 mL bottles. Twelve ingredients are used to produce LIQMED. Each of these ingredients plays its own role either as an active reagent or as an excipient. Active ingredients 1, 2 and 3 are the ingredients responsible for LIQMED's therapeutic action. Excipients such as food additives and colouring are inactive or inert substances which are added to provide stability of the drug formation in LIQMED.


Figure 1: Simulation flowsheet for the base case of production of LIQMED.

The syrup making procedure is the leading process in the production of LIQMED. The syrup ingredients (sugar and water) are charged into a 2000 L blending tank (V-105) and agitated for 3 hours to ensure a well-mixed solution. The syrup is then transferred into the main blending tank (V-101), followed by the food additives. While the mixture is stirred in tank V-101, the pre-blending procedures take place. Three different pre-blending procedures are carried out. In pre-blending A, the blending of the sweetener, brewing agent, active ingredient 1 and active ingredient 2 takes place in tank V-102. The colouring agent is heated and melted in the presence of deionised water at 100°C and is blended in V-103 in section B. This colouring agent requires a separate blending procedure as the raw material is in creamy form. Pre-blending of the preservative agents, anaesthetic agents, solvent and flavouring agents takes place in tank V-104 in section C. When all pre-blending procedures are completed, the ingredients are transferred into tank V-101, starting with the mixture in V-102, followed by the mixture in V-103. After the completion of the V-104 product transfer, agitation is repeated for 10 minutes to ensure all ingredients are well mixed. Finally, deionised (DI) water is added to adjust the final mixture volume to the required amount and the mixture is stirred again for 15 minutes. Upon completion of the main blending process, the mixture in V-101 is transferred to the intermediate tank V-106 before packaging takes place. In the base case study, the filling machine has a capacity of 28 bottles per minute. The labeller, on the other hand, has a labelling speed of 30 bottles per minute. The cup placing procedure is carried out manually in the base case. The cartoning

machine is able to carton up to 24 bottles per minute. Finally, the products are packed into boxes of 72 bottles each for shrink wrapping. Figure 2 shows the operation Gantt chart for the case study. Note that the operating time of the labeller is shorter than that of the filler, due to the different speeds of the two machines. Thus, the cartoning machine, with the longest operating time of 15.69 h, is the overall process bottleneck.
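As a rough plausibility check of the reported bottleneck, the following Python sketch computes the per-batch occupancy of each packaging machine from the rates quoted above and the 2000 L batch packed into 90 mL bottles. It is only a back-of-the-envelope illustration: setup, changeover and transfer times are ignored, which is why the cartoner figure comes out slightly below the 15.69 h read off the Gantt chart.

```python
# Back-of-the-envelope check of the packaging bottleneck (illustrative only;
# setup and changeover times are ignored).
batch_volume_l = 2000.0          # batch size from the case study
bottle_volume_l = 0.090          # 90 mL bottles
bottles_per_batch = batch_volume_l / bottle_volume_l

rates_bottles_per_min = {        # throughput of each packaging step (from the text)
    "filler": 28,
    "labeller": 30,
    "cartoner": 24,
}

occupancy_h = {name: bottles_per_batch / rate / 60.0
               for name, rate in rates_bottles_per_min.items()}

bottleneck = max(occupancy_h, key=occupancy_h.get)
for name, hours in occupancy_h.items():
    print(f"{name:9s}: {hours:5.2f} h per batch")
print("bottleneck:", bottleneck)   # the cartoner, the slowest machine
```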

[Figure 2 (operation Gantt chart for the base case; time axis in hours): the complete recipe comprises P-2 in V-102 (5.69 h, pre-blending A tank), P-3 in V-103 (5.65 h, pre-blending B tank), P-4 in V-104 (5.42 h, pre-blending C tank), P-5 in V-105 (5.29 h, syrup making tank), P-1 in V-101 (food additive charge, agitations, transfers in from pre-blends A, B and C, CHARGE-6, TRANSFER-OUT-1 and CIP-1), P-6 in V-106 (intermediate storage) and the packaging procedures P-7 in FL-101, P-8 in LB-101, P-9 in BX-101 and P-10 in BX-102.]

Figure 1: System Architecture. [Diagram: processing pipeline producing document clusters that feed the concept discovery step.]

Step 1: Abstract Retrieval. A set of abstracts is retrieved through PubMed Central using a keyword-based query.

Step 2: Tokenization. Every abstract is divided into a set of lexical units, called tokens. The purpose of this stage is to determine the lexical units of each retrieved abstract, and to filter out punctuation and symbols such as brackets and quotation marks, as well as numbers. Special consideration was given to the identification of special tokens such as chemical formulas and equations.

Step 3: Part-of-Speech Tagging. In this step we identify the morpho-syntactic category of each word in the abstract (noun, verb, adjective, etc.). The overall objective of this step is to filter out non-significant words on the basis of their morpho-syntactic category (i.e. prepositions, pronouns, articles, etc.).

Step 4: Stemming. In this step we restrict the morphological variation of the textual data by reducing each of the different inflections of a given word form to a unique canonical representation (or lemma). For example, the words treatment and treat, identified as NOUN and VERB respectively (from the previous step), could be assigned the same morpho-syntactic tag VERB. This process is user-driven.

Step 5: Filtering out words. To eliminate common English words we have employed two well-known term weighting schemes, the TF-IDF metric and the Shannon metric. The first scheme is based on the assumption that terms that appear frequently in a document (TF = Term Frequency), but rarely in the collection of abstracts (IDF = Inverse Document Frequency), are more likely to be specific to the document. For the IDF we have used the following variant: w_i = log2(N_i / n_i), where w_i is the weight of term i in the abstract, N_i is the frequency of term i in the collection of abstracts L, and n_i is the number of documents in L in which term i occurs. Terms with a high TF-IDF value (the product TF*IDF) are retained for further processing.


For the Shannon metric, we have used the following variant: w_i = N_ij * s_i, where w_i is the weight of term i, N_ij is the frequency of term i in abstract j and s_i is the signal of term i in the set of abstracts. The signal and the noise of a term are calculated on the basis of its frequency and its probability of occurring. Terms that appear with the same probability throughout the set of abstracts have high noise, while terms with different probabilities of appearance in the set of abstracts are meaningful.
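A minimal Python sketch of the two weighting schemes follows. The TF-IDF variant is the one stated above; the signal/noise computation is only summarised in the text, so the classical Salton-style definitions are used here as one plausible reading rather than the authors' exact implementation, and the toy abstracts are invented for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return {(doc_index, term): TF * log2(N_i / n_i)} as described in step 5."""
    n_docs_with = Counter()              # n_i: abstracts containing term i
    coll_freq = Counter()                # N_i: frequency of term i in the collection
    tfs = [Counter(doc) for doc in docs]
    for tf in tfs:
        coll_freq.update(tf)
        n_docs_with.update(tf.keys())
    return {(j, term): f * math.log2(coll_freq[term] / n_docs_with[term])
            for j, tf in enumerate(tfs) for term, f in tf.items()}

def shannon(docs):
    """Return {(doc_index, term): N_ij * s_i} with a Salton-style signal s_i."""
    tfs = [Counter(doc) for doc in docs]
    coll_freq = Counter()
    for tf in tfs:
        coll_freq.update(tf)
    signal = {}
    for term, total in coll_freq.items():
        noise = sum((tf[term] / total) * math.log2(total / tf[term])
                    for tf in tfs if tf[term] > 0)
        signal[term] = math.log2(total) - noise   # zero when the term is spread evenly
    return {(j, term): f * signal[term]
            for j, tf in enumerate(tfs) for term, f in tf.items()}

docs = [["tamoxifen", "tamoxifen", "receptor"], ["androgen", "receptor", "prostate"]]
print(tf_idf(docs))    # "tamoxifen" scores high, "receptor" scores zero
print(shannon(docs))   # same qualitative behaviour for the signal-based weight
```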

Step 6: Creation of a go-list. In order to extract the go-list (final vocabulary of terms) we have used the union of the results produced by the application of the above two metrics. Thus we have tried to retain the most representative terms for each abstract, namely the frequently occurring terms with zero noise. To further reduce the vocabulary size, the user has the ability to select the word categories identified by the part-of-speech tagging and restrict the analysis to specific word categories (i.e. nouns, verbs, adjectives). We subsequently encode the textual information (obtained from the previous steps) into a lexical table, where each row corresponds to a scientific abstract and each column to a term of the go-list. The lexical table is represented as a frequency table, where cell (i, j) contains the number of occurrences of term j in abstract i.

Step 7: Clustering. In the clustering process we have used the K-means clustering algorithm. In more detail, having as input a set of K abstracts to be clustered, the clustering process starts by randomly assigning each point to a specific cluster. In each step the distances between the points in each cluster and the centroid of the cluster are computed. Every point (abstract) is assigned to the nearest cluster. The process is repeated until no regrouping of the abstracts takes place.

Step 8: Concept Discovery. In order to obtain the final set of terms W that are specific and highly descriptive for a given cluster of documents, we employ the well-known log-odds formula: Q_ij = log2(f_ij / f_i), where Q_ij represents the preference of term i in a document cluster j, f_ij represents the frequency of term i in cluster j and f_i represents the frequency of term i in the total set of abstracts. A term is considered significant for a cluster if its presence in the cluster is more significant than its presence in the total population.
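The clustering and concept-discovery steps can be illustrated with a short sketch over a small lexical table. The K-means loop, its initialisation and the log-odds ranking below are simplifications of what the text describes, and the terms and counts are invented for illustration (values of Q_ij closest to zero indicate terms whose occurrences are concentrated in that cluster).

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain K-means over the (abstract x term) frequency table (step 7)."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=len(X))              # random initial assignment
    for _ in range(iters):
        centroids = np.array([X[labels == c].mean(axis=0) if np.any(labels == c)
                              else X[rng.integers(len(X))] for c in range(k)])
        new = np.argmin(((X[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
        if np.array_equal(new, labels):                    # no abstract changed cluster
            break
        labels = new
    return labels

def cluster_terms(X, labels, terms, k, top=2):
    """Rank terms per cluster by Q_ij = log2(f_ij / f_i) (step 8)."""
    total = X.sum(axis=0)                                  # f_i: frequency in all abstracts
    for c in range(k):
        f_c = X[labels == c].sum(axis=0)                   # f_ij: frequency inside cluster c
        q = np.where(f_c > 0, np.log2(np.maximum(f_c, 1e-12) / total), -np.inf)
        best = np.argsort(-q)[:top]                        # Q_ij nearest 0 = most cluster-specific
        print(f"cluster {c}:", [terms[i] for i in best])

terms = ["tamoxifen", "estrogen", "androgen", "prostate"]
X = np.array([[3, 2, 0, 0], [2, 3, 0, 1], [0, 0, 4, 3], [0, 1, 3, 2]], dtype=float)
labels = kmeans(X, k=2)
cluster_terms(X, labels, terms, k=2)
```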

3 Results

To evaluate the reliability of our approach, we have performed various case studies, one of which is presented here. We retrieved 158 abstracts from PubMed Central using the keyword query "corepressor AND co-activator". After the abstract retrieval process we applied the linguistic processing methodology. The frequency threshold for the TF-IDF technique was set to 15, and for the Shannon metric we retained the terms with positive signal and zero noise. In this particular experiment we kept the word categories of nouns, verbs and adjectives. The iterative K-means clustering produced two very distinct clusters, with descriptive terms for two important types of cancer: breast cancer and prostate cancer. Characteristic terms with high log-odds values are shown in Table 1. It is interesting to note that many terms describe the clusters with high fidelity and immediately imply the transcription factors and the biochemical pathways that modulate different types of cancer. In more detail, Cluster 1 contains descriptive terms of endocrine (hormonal) therapies in breast cancer, while Cluster 2 contains descriptive terms of the biology and therapeutic (hormonal) interventions in prostate cancer. These results are promising for the use of our methodology, since the research concerning the biological mechanisms of carcinogenesis is closely related to altered functions of co-repressor and co-activator gene-proteins, which are revealed in this case study.


Table 1: Representative terms describing two clusters in a case study.

Cluster 1         | Cluster 2
ablation          | antagonist
agonist-dependent | antiestrogenic
breast            | androgen
receptor-beta     | ARNIP
tamoxifen         | prostate
tumor             | PCa
ERalpha-positive  | AR
ERalpha           | androgen-independent

4 Discussion and Further Work

It appears that term co-occurrence and the processing steps that we have implemented generate reliable document clusters that not only associate PubMed Central abstracts into meaningful groups but also provide the labels for a shallow content analysis in a rapid and reliable way. The above methodology is based on the statistical treatment of words. All the threshold values that have been used in our case studies were optimized empirically, by extensive experimentation, and can be set as parameters by the user. Thus the method is sufficiently flexible and parameter-based to allow extensive exploration of various document collections. Our approach was to keep the methodology as general as possible, without encoding facts pertinent to a specific biological process. We are particularly interested in ontology induction experiments for transcription profiling, as the textual analysis of scientific abstracts on Transcription Factors has not been carried out yet.

References
[1] T. Ono, H. Hishigaki, A. Tanigami, T. Takagi: Automated extraction of information on protein-protein interactions from the biological literature. Bioinformatics 17, 155-161 (2001).
[2] T. Sekimizu, H. Park, J. Tsujii: Identifying the interaction between genes and gene products based on frequently seen verbs in MEDLINE abstracts. Genome Informatics (1998).
[3] C. Friedman, P. Kra, H. Yu, M. Krauthammer, A. Rzhetsky: "GENIES": a natural-language processing system for the extraction of molecular pathways from journal articles. In Proceedings of the 10th International Conference on Intelligent Systems for Molecular Biology (ISMB 2001), S74-S82.
[4] A. Renner, A. Aszodi: High-throughput functional annotation of novel gene products using document clustering. In Proceedings of the Pacific Symposium on Biocomputing (PSB 2000), 54-68 (2000).
[5] A.G. Papavassiliou: Transcription Factor-Based Drug Design in Anticancer Drug Development. Molecular Medicine 3(12), 799-810 (1997).


Lecture Series on Computer and Computational Sciences, Volume 1, 2004, pp. 692-695

Do Synonymous Codons Point Towards a Thermodynamic Theory of Tissue Differentiation?

G. Anogianakis 1, A. Anogeianaki, V. Papaliagkas
Department of Physiology, Faculty of Medicine, University of Thessaloniki, GR-54124 Thessaloniki, Greece

Abstract: Deciphering the human genome has held a number of surprises, including an unexpectedly low number of genes to account for our preconceptions about human structure and function, especially when one has to take the existence of control mechanisms into consideration. Although a DNA molecule can be thought of as a string over an alphabet of four characters (the nucleotides) and proteins as strings over an alphabet of twenty characters (amino acids), recent discoveries have brought forth the possibility that control mechanisms of tissue differentiation which are based on purely thermodynamic principles may exist. It is proposed that the definition and quantification of such potential mechanisms requires algorithms that combine string computation with thermodynamic evaluation of the DNA translation and transcription processes.

Keywords: Codon, differentiation, strings, synonyms, thermodynamics

1. Introduction

A DNA strand can be thought of as a string over an alphabet of four characters, the nucleotides, i.e., molecular entities consisting of a phosphate group and a pentose sugar (deoxyribose) linked to a nitrogenous base. As only four nitrogenous bases (Adenine or "A", Guanine or "G", Cytosine or "C" and Thymine or "T") are used in the DNA structure, it is the bases that give the nucleotides their individuality (and name). Codons, i.e. triplets of nucleotides that code for an amino acid, are the "words" in this "four letter" language. One of the corollaries of the DNA coding mechanism is that DNA sequence alterations in a gene can change the structure of the protein that it codes for. Indeed, given the fact that the genetic alphabet uses four letters, that the words it uses all have three letters each, but that the language of genetics contains twenty-one "semantically different words" (i.e., the twenty amino acids and a "terminating codon" which serves as the punctuation mark), it is evident that there must be many different ways of "spelling" at least some of the words of the language of genetics. The actual correspondences are tabulated in Table 1. Thus the codon UCC codes for the amino acid Serine, CCU for Proline etc., but also CCU, CCC, CCA and CCG all code for Proline. The corresponding frequencies are presented in Table 2.

2. Implications of synonymous codons

The genomes of species from bacteria [1] to Drosophila [2] show unique biases for particular synonymous codons and, recently, it was shown that such codon preferences exist in mammals [3]. Systematic differences in synonymous codon usage between genes selectively expressed in six adult human tissues were reported, while the codon usage of brain-specific genes is, apparently, selectively

1 Corresponding author. E-mail: [email protected]


preserved throughout the evolution of human and mouse from their common ancestor [3]. In particular, when genes that are preferentially expressed in human brain, liver, uterus, testis, ovary, and vulva were analyzed, synonymous codon biases between gene sets were found. The pairs that were compared were brain-specific genes to liver-specific genes; uterus-specific genes to testis-specific genes; and ovary-specific genes to vulva-specific genes. All three pairs differed significantly from each other in their synonymous codon usage, raising the possibility that codon biases may be partly responsible for determining which genes are expressed in which tissues. Such a determination may, of course, take place at the level of transcriptional control. However, given the relatively low number of genes identified in the human genome, "vis-a-vis" our preconceptions about human structure and function, one is tempted to explore whether other control mechanisms operate (alone, in tandem or in parallel with transcriptional level mechanisms) in tissue differentiation.

Table 1: The 64 possible combinations of the four bases and the codons they represent. X stands for the terminating codon.

1st base | 2nd base U | 2nd base C | 2nd base A | 2nd base G | 3rd base
U        | PHE        | SER        | TYR        | CYS        | U
U        | PHE        | SER        | TYR        | CYS        | C
U        | LEU        | SER        | X          | X          | A
U        | LEU        | SER        | X          | TRP        | G
C        | LEU        | PRO        | HIS        | ARG        | U
C        | LEU        | PRO        | HIS        | ARG        | C
C        | LEU        | PRO        | GLN        | ARG        | A
C        | LEU        | PRO        | GLN        | ARG        | G
A        | ILE        | THR        | ASN        | SER        | U
A        | ILE        | THR        | ASN        | SER        | C
A        | ILE        | THR        | LYS        | ARG        | A
A        | MET        | THR        | LYS        | ARG        | G
G        | VAL        | ALA        | ASP        | GLY        | U
G        | VAL        | ALA        | ASP        | GLY        | C
G        | VAL        | ALA        | GLU        | GLY        | A
G        | VAL        | ALA        | GLU        | GLY        | G

Table 2: The number of codons used to code for each amino acid and the relative frequency of codons used by each amino acid.

AMINO ACID | CODONS USED | FREQUENCY (%)
ALA        | 4           | 6.2500
ARG        | 6           | 9.3750
ASN        | 2           | 3.1250
ASP        | 2           | 3.1250
CYS        | 2           | 3.1250
GLN        | 2           | 3.1250
GLU        | 2           | 3.1250
GLY        | 4           | 6.2500
HIS        | 2           | 3.1250
ILE        | 3           | 4.6875
LEU        | 6           | 9.3750
LYS        | 2           | 3.1250
MET        | 1           | 1.5625
PHE        | 2           | 3.1250
PRO        | 4           | 6.2500
SER        | 6           | 9.3750
THR        | 4           | 6.2500
TRP        | 1           | 1.5625
TYR        | 2           | 3.1250
VAL        | 4           | 6.2500
X          | 3           | 4.6875
TOTAL      | 64          | 100.0000
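For concreteness, the short Python sketch below rebuilds Table 2 from the codon table of Table 1: it enumerates the 64 codons, counts how many code for each amino acid (with X for the terminating codons) and prints the relative frequencies. The layout of the code table follows Table 1; the variable names are ours.

```python
from collections import Counter

# Standard genetic code laid out as in Table 1: for each (2nd base, 1st base)
# pair, the four entries correspond to 3rd base U, C, A, G. 'X' marks stop codons.
code_by_second_base = {
    "U": {"U": "PHE PHE LEU LEU", "C": "LEU LEU LEU LEU",
          "A": "ILE ILE ILE MET", "G": "VAL VAL VAL VAL"},
    "C": {"U": "SER SER SER SER", "C": "PRO PRO PRO PRO",
          "A": "THR THR THR THR", "G": "ALA ALA ALA ALA"},
    "A": {"U": "TYR TYR X X",     "C": "HIS HIS GLN GLN",
          "A": "ASN ASN LYS LYS", "G": "ASP ASP GLU GLU"},
    "G": {"U": "CYS CYS X TRP",   "C": "ARG ARG ARG ARG",
          "A": "SER SER ARG ARG", "G": "GLY GLY GLY GLY"},
}

bases = "UCAG"
codon_aa = {}
for second in bases:
    for first in bases:
        aas = code_by_second_base[second][first].split()
        for third, aa in zip(bases, aas):
            codon_aa[first + second + third] = aa

counts = Counter(codon_aa.values())          # codons used per amino acid (Table 2)
for aa, n in sorted(counts.items()):
    print(f"{aa:3s} {n:2d} {100 * n / 64:8.4f} %")
print("total codons:", sum(counts.values()))  # 64
```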


Assuming that some kind of parsimony principle governs the development of control mechanisms at the DNA transcription level (a very strong but necessary assumption that is required in order to limit and focus the subsequent discussion), there are two obvious mechanisms that can be used to link synonymous codon choice and tissue-specific gene expression:
• The first mechanism depends on local transfer RNA abundance. The tRNA pools in the brain, e.g., may differ from the pools in the liver, and so if the codon usage of a gene is calibrated to the tRNA pools that exist in the brain, that gene will be translated more efficiently in the brain.
• The second mechanism would make use of the different chemical affinities between different tRNAs coding for the same amino acid and the underlying DNA structure. In other words, if certain codons have larger affinities with their corresponding tRNAs than their synonymous codons, it stands to reason that they will be expressed more readily. A corollary of this argument is that differentiation will occur when the appropriate genes find themselves in an energetically appropriate environment to be expressed.

3. Ways to approach the problem

It is interesting that the scientists who announced the discovery of tissue-specific codon usage and the expression of human genes offer the first mechanism as an explanation of their observations [3, 4]. However, this is still speculative and, although there is evidence for it [5], it may be thought of as leading to a circular argument: a gene is expressed (to abundantly produce specific tRNA) so that another gene may be "induced to induce" yet another gene, and so on ad infinitum. Although such a process cannot be ruled out without in vitro identification of gene induction sequences and experimentation on their functioning under different tRNA pool compositions, it appears to be an extremely inefficient, albeit accurate, procedure that sacrifices an inordinate number of genes for control purposes. In order to resolve the question of whether the second mechanism that was proposed as the link between synonymous codon choice and tissue-specific gene expression has any theoretical (or practical) merit, it is necessary to first associate each codon with a value reflecting its potential for expression. To illustrate this point let us assume, e.g., that a gene is coding a peptide with the sequence ALA-ALA-ALA-ALA-ALA-ALA-ALA-ALA-ALA-ALA. Let us further assume that codon GCU has twice the affinity for its corresponding tRNA that codon GCC has for its own corresponding tRNA. Similarly, codon GCC has twice the affinity for its corresponding tRNA that codon GCA has for its own corresponding tRNA and, finally, codon GCA has twice the affinity for its corresponding tRNA that codon GCG has for its own corresponding tRNA. It is evident that, if chemical affinities alone were the determining factor and if gene expression was a linear function of the product of its codons' affinities for their corresponding tRNAs, then a gene represented by the sequence GCUGCUGCUGCUGCUGCUGCUGCUGCUGCU would be expressed 2^30 times more readily than the gene represented by the sequence GCGGCGGCGGCGGCGGCGGCGGCGGCGGCG, despite the fact that these two hypothetical genes are synonymous. Real life, of course, is not usually that generous to scientists, or simple enough to be easily simulated, but the example above serves as an illustration of the approach that is required to resolve the problem of amino acid coding redundancies, codon synonyms and their possible involvement in tissue differentiation. However, once the problem is formulated, it will be possible to utilize computer models of tRNAs to estimate the thermodynamics of the codon-tRNA interaction and to proceed to derive reasonable estimates of the propensity of a gene to express itself. The second step of the proposed approach to the problem is to translate gene sequences into sequences of "gene potentials for expression" and to attempt to characterize the different tissues by those genes (and their potential for expression) that are actually expressed in them. In order to do this, databases of all possible permutations of a gene sequence (in terms of all the synonymous codons to the codons that are actually used) should be calculated and compared to the actually occurring codon sequences.


We could speculate that, given the immense number of permutations in terms of its composing codons that a gene represents (i.e., the potentially synonymous genes), sequences that actually occur will have specific, thermodynamic (?) reasons for being selected. The final step of the proposed approach is to attempt to resolve the inverse problem, i.e., based on protein structure, to attempt to identify which genes give rise to particular proteins. Although, conceptually, resolving the inverse problem does not differ from the translation of gene sequences into sequences of "gene potentials for expression", this step is much more specific in that it requires protein sequencing before one proceeds to take it. On the other hand, given that many proteins are produced by enzymatic sectioning of precursor molecules, it also represents a unique opportunity to link genes coding for such protein molecule precursors with the tissues where they are expressed.
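The hypothetical affinity example of this section can be restated numerically with a few lines of Python. The 8:4:2:1 relative affinities of GCU, GCC, GCA and GCG are the illustrative assumption made above, not measured quantities, and the product rule reflects the linear-expression assumption of the example.

```python
# Relative affinities GCU : GCC : GCA : GCG = 8 : 4 : 2 : 1, as assumed above.
affinity = {"GCU": 8.0, "GCC": 4.0, "GCA": 2.0, "GCG": 1.0}

def propensity(gene, codon_affinity):
    """Product of codon-tRNA affinities over the gene (linear-expression assumption)."""
    product = 1.0
    for i in range(0, len(gene), 3):
        product *= codon_affinity[gene[i:i + 3]]
    return product

best = "GCU" * 10     # ten ALA codons spelled with the highest-affinity synonym
worst = "GCG" * 10    # the same peptide spelled with the lowest-affinity synonym
print(propensity(best, affinity) / propensity(worst, affinity))   # 8**10 = 2**30
```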

4. Significance of the problem

We should, at this point, underscore that linking tissue differentiation with thermodynamic constraints on synonymous codon expression has profound philosophical and even aesthetic implications: it reduces intellectual dependence on models of molecular evolution where extremely detailed designs in outcome have to be "designed" by blind evolutionary forces, and it introduces a view whereby some of the most basic and best understood natural principles (those of thermodynamics) undertake to assist evolution in shaping the biological order at a more fundamental level. At the same time, its aesthetic implications sprout from the fact that it introduces a vision of differentiation very similar to that observed in a garden in spring. Apart from its philosophical significance, however, the answer to the problem of whether the existence (and expression characteristics) of synonymous codons also implies a thermodynamic theory of tissue differentiation will have a significant impact on cancer research. Indeed, a central issue of cancer research has to do with the mechanisms by which cancer cells dedifferentiate, lose the ability to perform the normal functions of the normal cell type from which they mutated, and often come to resemble embryonic cells. It is plausible, therefore, that if cancer development can be associated with usage of the wrong codon synonyms, gene therapies can be devised in a much more logical fashion.

Acknowledgments

The corresponding author, a non-mathematician, very early in his adult life left the green pastures of molecular biology for brain electrophysiology. From this place, he wishes to thank Joshua B. Plotkin of the Bauer Center for Genomic Research at Harvard and his colleagues for rekindling his, by now ancient, interest in synonymous codons with their recent paper which appeared in PNAS [3]. He also wants to thank Prof. Athanasios Tsakalidis, Head of the Computer Engineering and Informatics Dept. at the University of Patras (Greece), for constantly stimulating his interest in strings and databases.

References
[1] T. Ikemura, Codon usage and tRNA content in unicellular and multicellular organisms. Mol Biol Evol 2(1), 13-34 (1985).
[2] J.R. Powell and E.N. Moriyama, Evolution of codon usage bias in Drosophila. PNAS 94, 7784-7790 (1997).
[3] J.B. Plotkin, H. Robins and A.J. Levine, Tissue-specific codon usage and the expression of human genes. PNAS 101, 12588-12591 (2004).
[4] M. Phillips, Different codons, same amino acid. http://www.biomedcentral.com/news/20040817/01 (last visited Sept 02, 2004).
[5] D.B. Carlini and W. Stephan, In vivo introduction of unpreferred synonymous codons into the Drosophila Adh gene results in reduced levels of ADH protein. Genetics 163(1), 239-243 (2003).


Lecture Series on Computer and Computational Sciences, Volume 1, 2004, pp. 696-700

Gene-Finding with the VOM Model

K.O. Shohat-Zaidenraise**, A. Shmilovici* 1, I. Ben-Gal**
* Dept. of Information Systems Eng., Ben-Gurion University, P.O. Box 653, Beer-Sheva, Israel
** Department of Industrial Engineering, Tel-Aviv University, Ramat-Aviv, Tel-Aviv 69978, Israel

Accepted 31 August, 2004

Abstract: We present the architecture of an elementary gene-finding algorithm that is based on the Variable Order Markov model (VOM). Experiments with the gene-finder on three Prokaryotic genomes indicate that it has an advantage in the detection of short genes.

Keywords: Gene Finding, Variable Order Markov Models, Sequence Analysis

Mathematics Subject Classification: 92020, 60J20, 91 882, 60-08

1. Introduction

Though many gene-finder programs exist, there are numerous unresolved issues, such as uncovering genes in recently sequenced organisms when the gene-finding program is matched to the characteristics of different known sequences. Here, we present a new approach for gene finding based on a Variable-Order Markov (VOM) model. The VOM model is a generalization of the traditional Markov model which is more efficient in terms of its parameterization and can thus be trained on relatively short sequences. As a result, the proposed VOM gene-finder outperforms traditional gene-finders that are based on fifth-order Markov models for newly sequenced bacterial genomes. This paper presents two contributions: first, it demonstrates the use of VOM models to annotate DNA sequences into coding and non-coding regions; second, we showcase the ability to predict short genes that may be undetectable by other gene-finder programs.

2. Introduction to the VOM Algorithm

The VOM model is a close relative of the context tree algorithm [1] that was recently demonstrated [2] as a universal compression algorithm that operates in the non-asymptotic domain. A VOM construction algorithm is an algorithm that computes, from a given training sequence, the probability estimates for every context. The VOM model is used to evaluate the probability of any testing sequence (generated by the source that generated the training sequence). A context in the context tree is represented by a node, where the lengths (depths) of the various contexts (branches in the tree) do not need to be equal. The context-tree algorithm of [3] contains two distinct phases: in the tree growing phase, the counts of all the sub-sequences that are shorter than a predefined depth K_max are used to update the symbol counters in the nodes; in the tree pruning phase, probability estimates are computed for every context and pruning rules keep a descendant node only if the entropy of its symbols is sufficiently different from the entropy of the symbols of its parent node. The distribution of symbols in the nodes of the pruned tree defines the VOM model that is used to estimate P(X_1 | X_{-K}^0), the probability of a symbol X_1 given the context string X_{-K}^0 = X_{-K}, X_{-K+1}, ..., X_0. Note that for a Markov chain model the order is fixed to K_max, so there is no pruning phase. The Markov chain model suffers from

1 Corresponding author. E-mail: [email protected]


exponential growth of the number of parameters to be estimated. For small data-sets this results in over-fitting to the training set and a poor variance-bias tradeoff. The VOM algorithm was implemented in the MATLAB scripting language, setting K_max = 9.
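The following Python sketch illustrates the idea of a variable-order context model: counts are collected for every context up to K_max and prediction backs off to the longest context actually seen in training. It is not the authors' MATLAB implementation, and the entropy-based pruning of the context tree is omitted; it is only meant to show how a variable-length context replaces the fixed order of a Markov chain.

```python
from collections import defaultdict

K_MAX = 9  # same maximum depth as used in the paper

def train(sequence, k_max=K_MAX):
    """Count, for every context of length 0..k_max, which symbol follows it."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, symbol in enumerate(sequence):
        for k in range(min(k_max, i) + 1):
            counts[sequence[i - k:i]][symbol] += 1
    return counts

def prob(counts, context, symbol, alphabet="ACGT"):
    """Back off to the longest context seen in training; Laplace smoothing."""
    for k in range(len(context), -1, -1):
        ctx = context[len(context) - k:]
        if ctx in counts:
            table = counts[ctx]
            total = sum(table.values())
            return (table[symbol] + 1) / (total + len(alphabet))
    return 1.0 / len(alphabet)

model = train("ACGTACGTACGAACGT")
print(prob(model, "ACG", "T"))   # 0.5: T follows ACG in three of four training occurrences
```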

3. Introduction to Gene-finding

A gene recognition algorithm takes as input a DNA sequence and produces as output a feature table describing the location of all genes present in the sequence. Nevertheless, the reliability of the gene prediction must be questioned, since only a relatively small number of genes have been verified in laboratories. A gene-finder usually contains two steps: a) coding-region recognition - recognizing Open Reading Frames (ORFs), sections of DNA that contain a series of codons (base triplets) coding for amino acids, located between the start codon (initiation codon) and a stop codon; an ORF represents a candidate gene that encodes a protein; b) gene parsing through the recognition of motifs and start/stop codons.

3.1. The VOM Based Gene-Finder

The key idea behind the VOM based gene-finder is to use alternative VOM models to compress sliding windows of DNA sequences. DNA compressibility is used as the measure of interest. It is well known that DNA sequences are neither chaotic nor random; thus, DNA sequences should be reasonably compressible. However, it is also well known that the compression of DNA sequences is a very difficult task [4]. The VOM-based gene-finding algorithm includes the following steps (a schematic sketch of the window-based labelling in step 1 is given after the list):
1. The annotation algorithm was adapted from [5]. Given a sequence and four VOM models - three phased non-homogeneous models (since the first codon can start from the first, second or third nucleotide) and one non-coding homogeneous model - the sequence is annotated such that each nucleotide receives a symbol 1, 2, 3, or N, corresponding to the coding phases 1, 2 or 3, or to a non-coding region N. The four VOM models are constructed from the training genome. A running window of W=54 nucleotides (starting 26 nucleotides upstream and ending 27 nucleotides downstream of the current nucleotide) is compressed with the four possible VOM models. Neighbouring transitions in the reading frames are eliminated with a Viterbi-like algorithm: each nucleotide is annotated according to the best compression (the compression which achieves the minimum number of bits per sequence window of size 54). The annotation of a sequence is achieved by the path that minimizes the total compression for all the nucleotides in the sequence plus the transition penalties. A penalty cost P=100 is used to eliminate frequent frame shifts.
2. Identify the boundaries of the predicted coding segments at the transitions in the annotation.
3. Find the locations of potential start and stop codons in the three different phases.
4. Use search algorithms to match each putative coding region found in step 1 with proximal start and stop codons.
5. Create the complementary string in the translating direction (as if it were the main string) by converting every 'A' to 'T' and vice versa, and every 'C' to 'G' and vice versa. Match each putative coding region to the proximal start and stop codons. No information concerning potential promoter sites or other motifs is used.
6. Repeat steps 1-4 for the complementary sequence.
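The window-based labelling of step 1 can be sketched as follows. Real phased VOM models would supply the code length of each window; here each model is reduced to a simple order-0 symbol distribution so that the example runs on its own, and the Viterbi-like smoothing with the frame-shift penalty P=100 is omitted. The model names and distributions below are illustrative assumptions, not trained models.

```python
import math

W_BEFORE, W_AFTER = 26, 27            # 54-nucleotide running window

def code_length(dist, window):
    """Bits needed to encode `window` under an order-0 distribution `dist`."""
    return -sum(math.log2(dist[base]) for base in window)

def annotate(sequence, models):
    """Label each nucleotide with the model giving the shortest code for its window."""
    labels = []
    for i in range(len(sequence)):
        window = sequence[max(0, i - W_BEFORE): i + W_AFTER + 1]
        best = min(models, key=lambda name: code_length(models[name], window))
        labels.append(best)            # one of '1', '2', '3', 'N' in the full scheme
    return "".join(labels)

# Hypothetical stand-in distributions: 'coding' slightly GC-rich, 'N' uniform.
models = {"1": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
          "N": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}}
print(annotate("ATGGCGGCGCGCTAAATATATATAT", models))
```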

3.2. Optimization Experiments for the VOM Based Gene-Finder

The VOM based gene-finder contains several algorithms that need to be optimized. Here, we briefly describe the results of several optimization experiments; the full details are presented in [6]. The purpose of the first set of experiments was to improve the accuracy of distinguishing coding and non-coding DNA segments. It was conducted on the dataset of [7], which contains representative segments of the human genome. First, the effect of the pruning constant C in the VOM algorithm was investigated. It was concluded that a pruning constant of C=2 provides superior accuracy to the Hexamer model (Markov 5) while using only about half the number of parameters. The purpose of the second set of experiments was to optimize the gene annotation algorithm. The experiments were conducted on the genome of Synechocystis PCC6803. Each version of the VOM based gene-finder annotated the genome, and the Sensitivity (Sn) and Specificity (Sp) of each annotation were computed by comparing it to the annotation published in GenBank, which is considered accurate. Experiments were conducted with three running window sizes (54, 78, 102), four penalty costs


(50, 100, 200, 300), two methods for considering neighbouring coding segments, and four methods for matching the start/stop codons with the coding segments. It was concluded that the most accurate annotation is produced with a window size of 54, a penalty of 100, and start/stop codons that are the closest to the coding segment boundary.

4. Comparative Genome Experiments

In the following experiments the VOM based gene-finder - with the best parameters found above - was used to annotate the five bacterial genomes presented in Table 1, whose annotated sequences are available in GenBank. In the first set of experiments, the VOM based gene-finder with a VOM trained on the GENIE [8] human genome dataset was compared with two other gene-finders. The dicodon gene-finder of [9] is based on an algorithm similar to ours, except that there the VOM is replaced by a dicodon (Hexamer, Markov 5) distribution for the coding segments. GeneMark.fbf [10] is a state-of-the-art gene-finder that combines various algorithms, including motif identification. The results of the first group of experiments are presented in Table 2. The inferiority of the VOM based gene-finder can be attributed to the lack of a learning mechanism that the others have, and to the use of an inappropriate VOM trained on the human genome.

Table 1: Genomes used for comparison experiments.

Species                         | Accession Number | Length
Synechocystis PCC6803 (SP)      | NC_000911        | 3,573,470
Pyrococcus horikoshii (PH)      | NC_000961        | 1,738,505
Mycobacterium tuberculosis (MT) | AL123456         | 4,411,529
Helicobacter pylori (HP)        | AE000511         | 1,667,867
Bacillus subtilis (BS)          | AL009126         | 4,214,814

Table 2: Comparing to [9] and to GeneMark.fbf [10].

Genome | Gene-finder | Sn    | Sp    | Sn+Sp
SP     | VOM G-F     | .9627 | .8774 | 1.8401
SP     | Kim [9]     | .9640 | .9850 | 1.9490
SP     | GeneMark    | .9696 | .9970 | 1.9666
PH     | VOM G-F     | .7356 | .8769 | 1.6125
PH     | Kim [9]     | .9730 | .9390 | 1.9120
PH     | GeneMark    | .9777 | .9027 | 1.8808
BS     | VOM G-F     | .7971 | .9630 | 1.7601
BS     | Kim [9]     | .9750 | .9770 | 1.9520
BS     | GeneMark    | .9554 | .9917 | 1.9471

Table 3: Short Genes Summary.

Genome | Identification         | The VOM gene-finder | GeneMark
SP     | Full identification    | 14                  | 3
SP     | Partial identification | 1                   | 0
SP     | Un-identified genes    | 0                   | 12
PH     | Full identification    | 24                  | 0
PH     | Partial identification | 17                  | 0
PH     | Un-identified genes    | 7                   | 48
BS     | Full identification    | 65                  | 5
BS     | Partial identification | 28                  | 1
BS     | Un-identified genes    | 12                  | 99

Table 4: Comparing the FP Rates.

          | SP: FP | SP: FN | PH: FP | PH: FN | BS: FP | BS: FN
GeneMark  | 0.26   | 2.65   | 8.87   | 1.87   | 0.72   | 3.99
VOM G-F   | 11.7   | 3.24   | 8.68   | 22.25  | 2.74   | 18.13


Comparing the genes detected to the true genes in GenBank (Table 3) revealed an advantage for short genes. Many more short genes were discovered by our gene-finder that went undetected by GeneMark.fbf. Table 4 compares the false positive (FP) and the false negative (FN) rates of the VOM-based gene-finder and GeneMark.fbf. The rate is computed by dividing the FP count by the total examined length (number of nucleotides for each genome). The FP rate of the VOM gene-finder is higher than or equal to that of GeneMark.fbf; the FN rate is much higher. Therefore, it is recommended to combine the current gene-finding algorithm with other identification features (such as motif detection) to improve the total accuracy. In the analysis of a new genome, the best statistical models of the coding and non-coding regions are unknown. Using an inappropriate model reduces the accuracy of the gene-finder. The purpose of the second set of experiments is to analyze the performance of prokaryotic VOMs applied to a new prokaryotic genome. The following cross-validation experiment was performed for each genome in Table 1: a VOM is trained on the combination of four genomes and the output of the VOM based gene-finder on the fifth genome is compared to the true genes in GenBank. Surprisingly, comparing the results to those of Table 2, it turns out that the VOM trained on the human dataset produced higher accuracy than the VOMs trained on the prokaryotic datasets. We postulate that one possible reason for the poor performance in these experiments is that the diversity between two prokaryotic genomes might be greater than the diversity between prokaryotic and eukaryotic genomes. To verify this assumption, further experiments were conducted in which we cross-tested the VOM models for two different genomes - Mycobacterium tuberculosis and Bacillus subtilis - belonging to the same family of bacteria (gram-positive bacteria). A significant improvement was detected for Bacillus subtilis, and no improvement for Mycobacterium tuberculosis. Note that the training sets for training the different VOMs do not have the same sizes, which may affect the accuracy of the VOM models.

5. Conclusions

We presented a new approach for gene finding based on the VOM model. Although the VOM-based gene-finder is simple - it does not consider motifs other than the start and stop codons - we demonstrated that the VOM model can be applied to gene finding instead of the commonly used Markov models. The VOM based gene-finder excels in detecting short genes when only a small training set is available; therefore, it is suggested that it be used as a complementary tool to other gene-finders that excel in detecting longer genes. The current version of the gene-finder is only a prototype; integrating it with other learning algorithms is expected to further improve its accuracy.

References
[1] J. Rissanen, "A Universal Data Compression System", IEEE Transactions on Information Theory, 29(5), 656-664 (1983).
[2] J. Ziv, "A Universal Prediction Lemma and Applications to Universal Data Compression and Prediction", IEEE Transactions on Information Theory, 47(4), 1528-1532 (2000).
[3] I. Ben-Gal, G. Morag, A. Shmilovici, "CSPC: A Monitoring Procedure for State Dependent Processes", Technometrics, 45(4), 293-311 (2003).
[4] S. Grumbach, F. Tahi, "A new challenge for compression algorithms: genetic sequences", J. of Information Processing and Management, 30(6), 866-875 (1994).
[5] A. Shmilovici, I. Ben-Gal, "Using a Compressibility Measure to Distinguish Coding and Non-coding DNA", Far East Journal of Theoretical Statistics, 13(2), 215-234 (2004).
[6] K.O. Shohat-Zaidenraise, "Gene finding via Context Learning Models", Thesis submitted to Tel-Aviv University, Israel, February 2004.
[7] J.W. Fickett, C.S. Tung, "Assessment of Protein Coding Measures", Nucleic Acids Research, 20(24), 6441-6450 (1992).
[8] GENIE data-sets, from GenBank version 105, 1998. Available: www.fruitfly.org/seq tools/datasets!Human/intron v 105/
[9] J. Kim, "A Study on Dicodon-oriented Gene Finding using Self-Identification Learning", Thesis submitted to the School of Knowledge Science, Japan Advanced Institute of Science and Technology, February 2000.
[10] A.M. Shmatkov, A.A. Melikyan, F.L. Chernousko, M. Borodovsky, "Finding Prokaryotic Genes by the 'Frame-by-Frame' Algorithm: Targeting Gene Starts and Overlapping Genes", Bioinformatics, 15(11), 874-886 (1999).


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 701-704

Searching for Regularities in Weighted Sequences

M. Christodoulakis, C. Iliopoulos, K. Tsichlas
Department of Computer Science, King's College, Strand, London WC2R 2LS, England
{manolis, csi, kostas}@dcs.kcl.ac.uk

K. Perdikuri
Department of Computer Engineering and Informatics, University of Patras, 26500 Patras, Greece
[email protected]

Abstract: In this paper we describe algorithms for finding regularities in weighted sequences. A weighted sequence is a sequence of symbols drawn from an alphabet Σ that have a prespecified probability of occurrence. We show that known algorithms for finding repeats in solid sequences may fail to do so for weighted sequences. In particular, we show that Crochemore's algorithm for finding repetitions cannot be applied in the case of weighted sequences. However, one can use Karp's algorithm to identify repeats of specific length. We also extend this algorithm to identify the covers of a weighted sequence. Finally, the implementation of Karp's algorithm brings up some very interesting issues.

1 Introduction

Weighted sequences are used for representing relatively short sequences such as binding sites, as well as long sequences such as profiles of protein families (see [2], 14.3). In addition, they are also used to represent complete chromosome sequences ([2], 16.15.3) that have been obtained using a whole-genome shotgun strategy with an adequate cover. The cover is the average number of fragments that appear at a given location. Usually, the cover is large enough so that errors as well as SNPs are clearly spotted and removed by the consensus step. By keeping all the information the whole-genome shotgun produces, we would like to dig out information that has previously gone undetected after being faded during the consensus step (for example, where the consensus step wrongly chooses one symbol for a specific position over another). As a result, errors in the genome are not removed by the consensus step but remain, and a probability is assigned to them based on the frequency of symbols in each position. In this paper we present efficient algorithms for finding repetitions and covers in a weighted sequence. For solid sequences the algorithms of Crochemore [1] and Karp [5] are well known and have an O(n log n) time complexity. Their difference is that the first algorithm computes repetitions of all possible lengths while the second can compute repetitions of a prespecified length. There has already been an attempt [4] to apply Crochemore's algorithm to weighted sequences. However, as we show in this paper, the algorithm fails to find repetitions in O(n log n) time; in fact it needs O(n^2) time to be able to compute all repetitions. However, Karp's algorithm has already been applied successfully to this problem [3]. In this paper we extend this algorithm to compute covers on weighted strings, while we experimentally investigate its efficiency. The structure of the paper is as follows. In Section 2 we give the basic definitions to be used in the rest of the paper. In Section 3 we argue why Crochemore's algorithm is not suitable for weighted sequences, while in Section 4 we sketch the algorithm for finding repetitions and covers based on Karp's algorithm. Finally, in Section 5 we provide experimental results.


2 Preliminaries

In this work we concentrate on the identification of repetitions and covers of fixed length in a weighted biological sequence with probability of appearance ≥ 1/k, where k is a small fixed constant determined by biologists (for example k ≤ 10). The size of k is chosen small in order to represent the restricted ambiguity in the appearance of several characters in a biological sequence. Assume an alphabet Σ = {1, 2, ..., σ}. A word s of length n is represented by s[1..n] = s[1]s[2]...s[n], where s[i] ∈ Σ for 1 ≤ i ≤ n, and n = |s| is the length of s. A factor f of length p is said to occur at position i in the word s if f = s[i..i+p-1]. A word has a repetition when it has at least two equal factors. A repetition is a cover when each position of the word s belongs to this repetition. A weighted sequence is defined as follows:

Definition 1: A weighted sequence s = s_1 s_2 ... s_n is a set of couples (q, π_i(q)), where π_i(q) is the occurrence probability of character q ∈ Σ at position i. For all positions 1 ≤ i ≤ n, Σ_{q∈Σ} π_i(q) = 1.

A factor is valid when its probability of occurrence is ≥ 1/k, where k is a small fixed constant. The probability of occurrence of a factor f = f[1]...f[m] occurring at position i in a weighted sequence s is the product of the probabilities of occurrence of the respective symbols of f in s, i.e. ∏_{j=1}^{m} π_{i+j-1}(f[j]). A weighted sequence has a repetition when it has at least two identical occurrences of a factor (weighted or not). In biological problems scientists are interested in discovering all the repetitions as well as the covers of all possible words having a probability of appearance larger than a predefined constant. In the algorithms we provide we can always find the largest repetition or cover in a weighted sequence by a simple exponential and binary search on possible lengths. As a result we focus only on a prespecified length d.
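A direct Python transcription of these definitions is given below: the weighted sequence is represented as a list of symbol-to-probability dictionaries, the probability of a factor is the product over its positions, and validity is the comparison against 1/k. The toy sequence is invented for illustration.

```python
def factor_probability(weighted_seq, factor, start):
    """Product of the per-position probabilities of the factor's symbols."""
    prob = 1.0
    for j, symbol in enumerate(factor):
        prob *= weighted_seq[start + j].get(symbol, 0.0)
    return prob

def is_valid(weighted_seq, factor, start, k=10):
    """A factor is valid when its probability of occurrence is at least 1/k."""
    return factor_probability(weighted_seq, factor, start) >= 1.0 / k

# Toy weighted sequence (0-based positions): A, C, {A or C with prob. 0.5}, T
s = [{"A": 1.0}, {"C": 1.0}, {"A": 0.5, "C": 0.5}, {"T": 1.0}]
print(factor_probability(s, "ACAT", 0))   # 0.5
print(is_valid(s, "ACAT", 0, k=4))        # True: 0.5 >= 1/4
```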

3 Why Crochemore's Algorithm Fails for Weighted Sequences

This section assumes that the reader is familiar with the algorithm of Crochemore for finding repetitions [1]. The algorithm uses integers to represent factors. The E_i vector holds the factors of length i that start at each position of the text. The algorithm works in stages, each of which corresponds to the computation of repetitions of length larger by one with respect to the previous stage, while at the first stage repetitions of length 1 are computed. At each stage the algorithm chooses small classes to work on, without processing large classes. All classes that have not been processed in stage i implicitly specify longer factors at stage i+1. This is the crucial property of this algorithm that results in an O(n log n) time complexity. The problem with weighted sequences is clear: we cannot increment factors implicitly; we have to update their probabilities of occurrence at each step so that we know whether these repetitions have a probability ≥ 1/k. As a result, we are obliged to process all classes, which leads to an O(n^2) time complexity. The authors of [4] did not notice this problem, so they claimed that the complexity is O(n log n), which is wrong. Alternatively, one could try to find all the repetitions without computing probabilities, and then compute the probabilities of the actual repetitions. This is no better, because the length of each repetition can be O(n) (thus, O(n) multiplications for each repetition). Moreover, if we do not compute probabilities throughout the algorithm we might end up with O(|Σ|^n) factors. As a result, it seems that adopting this approach for finding repetitions in weighted sequences will probably not lead to an o(n^2) algorithm.



4 Karp's algorithm

Karp's algorithm computes equivalence classes, similar to Crochemore's, but it computes them using log n steps of O(n) time each. It has been successfully applied to weighted sequences [3]. The following lemma is the basic mechanism of Karp's algorithm.


Lemma 1: For integers i, j, a, b with b ≤ a we have i E_{a+b} j precisely when i E_a j and i+b E_a j+b (1) or, equivalently, when i E_a j and i+a E_b j+a (2).

Based on Karp's algorithm, we will briefly sketch how it is applied to weighted sequences.

Definition 2: Given a weighted sequence s, positions i and j of s are k-equivalent (k ∈ {1, 2, ..., n} and i, j ∈ {1, 2, ..., n-k+1}) - written i E_k j - if and only if there exists at least one factor f of length k that appears at (starts at) both positions i and j.

The equivalence class E_k is represented as a vector v_1^(k) v_2^(k) ... v_{n-k+1}^(k) of sets of integers, where each set v_i^(k) contains the labels of the equivalence classes of E_k to which the factors starting at position i belong. The implementation of the algorithm is based on Equation (2) of Lemma 1, while its description in [3] was based on Equation (1). A detailed description of the algorithm follows (a simplified illustration on a solid string is given after the steps). The new algorithm uses e_a + e_b pushdown stores, P(1), ..., P(e_a) and Q(1), ..., Q(e_b).
1. Sort the vector v^(a) using the P-pushdown stores; that is, run through v^(a), and for each factor x at each position i, push i into P(x). Note that the same position i may be pushed into more than one stack. So far, having the same position i in more than one P-stack causes no problem, since these stacks are distinct. But, for the sake of the explanation, let us distinguish them in the following way: we will use i_x to denote the position i when it refers to the factor x starting at i; thus, i_x' will denote the same position i but referring to its second factor x' (≠ x).
2. In succession, pop each P(x) until it is empty. As the number i_x is popped from P(x), push it into the Q-pushdown stores Q(y), y ∈ v_{i+a}^(b), provided that i+a ≤ n-(a+b-1). Note that there may be more than one factor y of length b starting at position i+a; therefore, position i_x will be pushed into all appropriate Q-stacks. However, when another P-stack, say P(x'), is popped, it is possible that the same position i_x' (referring to a different factor) will be pushed into the same Q-stacks as i_x. Had we not distinguished i_x from i_x', we would end up with the same position i appearing more than once in the same stack Q(y), and of course the ambiguity as to which factor starting at position i this refers to would make it impossible to go on to the third step.
3. Finally, construct v^(a+b): successively pop each Q-stack until it is empty. Start with a variable class counter c initially set to 1. As each i_x is popped from a given stack Q(y), test whether or not x is equal to x', where x' is the factor represented by the position j just previously popped from the same stack; that is, element j_x' was previously popped. If this is so, then i E_a j and i+a E_b j+a, so i E_{a+b} j; therefore, insert c in the set v_i^(a+b). Otherwise we have i+a E_b j+a but not i E_a j; therefore, insert c+1 into v_i^(a+b) and increment c to c+1. When stack Q(y) is exhausted, increment c to c+1 before beginning to pop the next Q-stack. Whenever i_x is the first element popped from a Q-stack, insert c into v_i^(a+b) automatically.
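As promised, here is a simplified illustration of the doubling step on an ordinary (solid) string, where each position holds exactly one factor: dictionaries replace the pushdown stores, and positions get the same label in E_{a+b} exactly when their labels agree in E_a at offset 0 and in E_b at offset a (Equation (2)). The weighted-sequence version described above additionally keeps a set of candidate factors, and their running probabilities, per position.

```python
def initial_classes(s):
    """E_1 labels: one class per distinct symbol."""
    symbols = sorted(set(s))
    return [symbols.index(c) for c in s]

def combine(e_a, e_b, a, b, n):
    """Build E_{a+b} labels from E_a and E_b via the pair (class at i, class at i+a)."""
    labels = {}
    combined = []
    for i in range(n - (a + b) + 1):
        key = (e_a[i], e_b[i + a])
        combined.append(labels.setdefault(key, len(labels)))
    return combined

s = "abaababaab"
e1 = initial_classes(s)
e2 = combine(e1, e1, 1, 1, len(s))   # E_2 from E_1
e4 = combine(e2, e2, 2, 2, len(s))   # E_4 from E_2
print(e4)   # equal labels mark positions where the same length-4 factor starts
```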

4.1 Covers

Covers in weighted sequences fall into two categories: (a) factors that overlap are allowed to pick different symbols from a single position, and (b) factors that overlap must choose the same symbols in the overlapping regions. Notice that the first kind of cover allows border-less covers to overlap while the second does not. Assume that we are interested only in length-d covers. This problem is solved in O(n) extra time (thus O(n log n) total time) for either case (a small sketch of the distance check in (1) is given at the end of this subsection):
1. Scan E_d for every factor occurring at position 1 (there is a constant number of them); if the distance between consecutive occurrences of the same factor is always ≤ d, then a cover has been found.
2. Compute the border array of the candidate cover (O(d) time). Scan E_d, as in (1), only now reject occurrences that start at positions other than some border of the previous occurrence.
Obviously we can do this for every class E_i that is computed on the way to creating E_d. The computation of type (a) covers is straightforward. On the other hand, type (b) covers face a difficulty:



how can the border array of a factor be computed, since factors are only represented by integers, and the actual factors (strings) are never stored? Obviously, there is no other way than storing the actual strings that correspond to the numbers that represent factors. The space complexity for this is O(nd), since there are at most O(n) factors of length d. The time complexity remains unaffected by the fact that the actual factors are identified and stored. For example, consider that we combine the equivalence relations E_a and E_b to obtain E_{a+b}. The identification of a factor in E_{a+b} takes only constant time, since it can be constructed by the concatenation of one factor from E_a with one factor from E_b.

Figure 1: Weighted covers. Left: the running time with respect to n. Right: the running time with respect to log2 d.
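A small sketch of the distance check used for type (a) covers (step (1) above) is given below, using 0-based positions; the occurrence lists of a candidate length-d factor would come from the E_d classes and are passed in directly here.

```python
def is_cover(occurrences, d, n):
    """True if a length-d factor with these occurrences covers a word of length n."""
    occ = sorted(occurrences)
    if not occ or occ[0] != 0:                      # must occur at the first position
        return False
    if any(b - a > d for a, b in zip(occ, occ[1:])):
        return False                                # a gap longer than d leaves a hole
    return occ[-1] + d >= n                         # last occurrence must reach the end

print(is_cover([0, 2, 5, 7], d=3, n=10))   # True: gaps 2, 3, 2 and 7+3 >= 10
print(is_cover([0, 5, 7], d=3, n=10))      # False: the gap 5-0 exceeds d
```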

5 Experimental Results

The algorithms were implemented in C++ using the Standard Template Library (STL), and run on a Pentium-4M 1.7 GHz system with 256 MB of RAM, under the Red Hat Linux operating system (v9.0). The datasets used for testing the performance of our algorithms consisted of many copies of a small random weighted sequence. We chose this repeated structure, rather than totally random files, in order to get a fair comparison of the running times. The running time for locating weighted covers is shown in Figure 1. As expected, weighted covers of type (b) need more time to be computed since, in contrast with type (a) covers, a border array has to be constructed and overlapping between consecutive occurrences of the same factor needs to be tested. Nevertheless, the asymptotic growth is still O(n log d). An interesting aspect of the algorithm is revealed in the right graph: the running time tends to become constant for larger values of d. The reason is simple: as the length of the factor increases, there is a point at which the number of factors (with adequate probability) drops to zero.

References
[1] M. Crochemore. An Optimal Algorithm for Computing the Repetitions in a Word. Information Processing Letters, 12:244-250, 1981.
[2] D. Gusfield. Algorithms on Strings, Trees, and Sequences. Cambridge University Press, 1997.
[3] C. Iliopoulos, K. Perdikuri, A. Tsakalidis and K. Tsichlas. The Pattern Matching Problem in Biological Weighted Sequences. In Proc. of FUN with Algorithms, 2004.
[4] C. Iliopoulos, L. Mouchard, K. Perdikuri, A. Tsakalidis. Computing the repetitions in a weighted sequence. In Proc. of the Prague Stringology Conference (PSC), pp. 91-98, 2003.
[5] R. Karp, R. Miller, A. Rosenberg. Rapid Identification of Repeated Patterns in Strings, Trees and Arrays. In Proc. of the Symposium on Theory of Computing (STOC), 1972.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 705-708

Efficient Algorithms for Handling Molecular Weighted Sequences and Applications

C. Makris 1, Y. Panagis 2 and E. Theodoridis 2

[1] Department of Applied Informatics in Management & Finance, Technological Educational Institute of Mesolonghi, Greece. [email protected]
[2] Computer Engineering & Informatics Dept. of the University of Patras and Research Academic Computer Technology Institute (RACTI), Rio, Greece, P.O. BOX 26500. {panagis, theodori}@ceid.upatras.gr

Abstract: In this paper we present the Weighted Suffix Tree, an efficient data structure for computing string regularities in weighted sequences of molecular data. Molecular weighted sequences can model important biological processes such as the DNA Assembly Process or the DNA-Protein Binding Process. Thus pattern matching and the identification of repeated patterns in biological weighted sequences are very important procedures in the translation of gene expression and regulation. We present time and space efficient algorithms for constructing the weighted suffix tree and some applications of the proposed data structure to problems such as pattern matching, repeats discovery, and discovery of the longest common subsequence of two weighted sequences.

Keywords: Molecular Weighted Sequences, Suffix Tree, Pattern Matching, Identification of repetitions, Covers.

1 Introduction

Molecular weighted sequences appear in various applications of Computational Molecular Biology. A molecular weighted sequence is a molecular sequence (either a sequence of nucleotides or amino acids), where each character in every position is assigned a certain weight. This weight could model either the probability of appearance of a character or the stability that the character contributes to a molecular complex.

Definition: A weighted word w = w[1]w[2]...w[n] is a sequence of positions, where each position w[i] consists of a set of ordered pairs. Each pair has the form (s, π_i(s)), where π_i(s) is the probability of having the character s at position i. For every position w[i], 1 ≤ i ≤ n, Σ_{∀s} π_i(s) = 1.

Position:   1  2  3  4  5                           6  7  8                           9  10  11
Character:  A  C  T  T  (A,0.5)(C,0.5)(G,0)(T,0)    T  C  (A,0.5)(C,0.3)(G,0)(T,0.2)  T  T   T

Figure 1: A weighted word w.
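To make the definition above concrete, the following small Python sketch stores the weighted word of Figure 1 as a list of per-position probability dictionaries, with a helper that returns the probability of a solid factor occurring at a given position. The representation and the name subword_probability are ours, purely for illustration, and are not taken from the paper.

```python
# Minimal, illustrative encoding of the weighted word of Figure 1.
from math import prod

# One dict per position, mapping character -> probability (positions 5 and 8 are weighted).
w = [
    {'A': 1.0}, {'C': 1.0}, {'T': 1.0}, {'T': 1.0},
    {'A': 0.5, 'C': 0.5, 'G': 0.0, 'T': 0.0},
    {'T': 1.0}, {'C': 1.0},
    {'A': 0.5, 'C': 0.3, 'G': 0.0, 'T': 0.2},
    {'T': 1.0}, {'T': 1.0}, {'T': 1.0},
]

def subword_probability(w, factor, start):
    """Probability of the solid string `factor` occurring at position `start`
    (0-based): the product of the per-position character probabilities."""
    return prod(w[start + i].get(ch, 0.0) for i, ch in enumerate(factor))

print(subword_probability(w, "ACTTA", 0))   # 0.5
print(subword_probability(w, "TCATTT", 5))  # 0.5
```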


Thus, in the first case, a molecular weighted sequence can be the result of a DNA assembly process. The key problem today in sequencing a large string of DNA is that only a small amount of DNA can be sequenced in a single read. That is, regardless of whether the sequencing is done by a fully automated machine or by a more manual method, the longest unbroken DNA substring that can be reliably determined in a single laboratory procedure is about 300 to 1000 (approximately 500) bases long [1],[2]. A longer string can be used in the procedure, but only the initial 500 bases will be determined. Hence, to sequence long strings or an entire genome, the DNA must be divided into many short strings that are individually sequenced and then used to assemble the sequence of the full string. The critical distinction between different large-scale sequencing methods is how the task of sequencing the full DNA is divided into manageable subtasks, so that the original sequence can be reassembled from sequences of length 500.

Reassembling DNA substrings introduces a degree of uncertainty for various positions in a biosequence. This notion of uncertainty was initially expressed with the use of "don't care" characters, denoted by "*". A "don't care" character has the property of matching against any symbol in the given alphabet. For example, the string p = AC*C* matches the pattern q = A*GCT under the alphabet Σ = {A, C, G, T, *}. In some cases, though, scientists are able to go one step further and determine the probability of a certain character appearing at a position previously characterised as a wildcard. In other words, a "don't care" character is replaced by a probability of appearance for each of the characters of the alphabet. Such a sequence is modelled as a weighted sequence.

In the second case, a molecular weighted sequence can model the binding site of a regulatory protein. Each base in a candidate motif instance makes some positive, negative or neutral contribution to the binding stability of the DNA-protein complex [4],[10]. The weights assigned to each character can be thought of as modelling those effects. If the sum of the individual contributions is greater than a threshold, the DNA-protein complex can be considered stable enough to be functional. Thus we need new and efficient algorithms in order to analyse molecular weighted sequences.

A fundamental problem in the analysis of molecular weighted sequences is the computation of significant repeats, which represent functional and structural similarities among molecular sequences. In [7] the authors presented a simple algorithm for the computation of repeats in molecular weighted sequences. Although their algorithm is simple and easy to implement, it is not space-efficient. In this paper we present an algorithm, efficient in both time and space, to construct the Weighted Suffix Tree, an efficient data structure for computing string regularities in biological weighted sequences. The Weighted Suffix Tree was first introduced in [6]. In this work, which is primarily motivated by the need to efficiently compute repeats in a weighted sequence, we further extend the use of the Weighted Suffix Tree to other applications on weighted sequences.

2 The Weighted Suffix Tree

In this section we present a data structure for storing the set of suffixes of a weighted sequence with probability of appearance greater than 1/k, where k is a given constant. We use as fundamental data structure the suffix tree, incorporating the notion of probability of appearance for every suffix stored in a leaf. Thus, the introduced data structure is called the Weighted Suffix Tree (abbrev. WST). The weighted suffix tree can be considered as a generalisation of the ordinary suffix tree to handle weighted sequences. We give a construction of this structure in the next section. The constructed structure inherits all the interesting string manipulation properties of the ordinary suffix tree. However, it is not straightforward to give a formal definition as with its ordinary


counterpart; a quite informal definition appears below.

Figure 2: A Weighted Suffix Tree example.

Definition. Let S be a weighted sequence. For every suffix starting at position i we define a list of possible weighted subwords, so that the probability of appearance of each one of them is greater than 1/k. Denote each of them by S_{i,j}, where j is the subword rank in an arbitrary numbering. We define WST(S), the weighted suffix tree of a weighted sequence S, as the compressed trie of a portion of all the weighted subwords starting within each suffix S_i of S$, $ ∉ Σ, having a probability of appearance greater than 1/k. Let L(v) denote the path-label of node v in WST(S), which results from concatenating the edge labels along the path from the root to v. Leaf v of WST(S) is labeled with index i if there exists j > 0 such that L(v) = S_{i,j}[i..n] and π(S_{i,j}[i..n]) ≥ 1/k, where j > 0 denotes the j-th weighted subword starting at position i. We define the leaf-list LL(v) of v as the list of leaf-labels in the subtree below v.

We will use an example to illustrate the above definition. Consider again the weighted sequence shown in Fig. 1 and suppose that we are interested in storing all suffixes with probability of appearance greater than a predefined parameter. We construct the suffix tree for the sequence, incorporating the notion of probability of appearance for each suffix. For the above sequence and 1/k = 1/4, we have the following possible prefixes for every suffix: prefixes for suffix x[1..11]: S_{1,1} = ACTTATCATTT, π(S_{1,1}) = 0.25, and S_{1,2} = ACTTCTCATTT, π(S_{1,2}) = 0.25; prefixes for suffix x[2..11]: S_{2,1} = CTTATCATTT, π(S_{2,1}) = 0.25, and S_{2,2} = CTTCTCATTT, π(S_{2,2}) = 0.25; etc. The weighted suffix tree for the above subwords appears in Fig. 2. In [11] we present an O(N)-time algorithm for constructing the Weighted Suffix Tree of a weighted sequence of length N.
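The WST itself is built by the O(N) algorithm of [11]; purely as an illustrative stand-in, the sketch below naively enumerates, for each starting position, the maximal solid prefixes whose probability stays at least 1/k and stores them in a plain dictionary trie (reusing the weighted word w from the earlier sketch). It shows what the structure contains, not how the paper builds it efficiently; function names are ours.

```python
def weighted_prefixes(w, start, threshold):
    """Maximal solid prefixes of the suffix w[start..] whose probability of
    appearance stays >= threshold (naive enumeration, illustrative only)."""
    results = []
    def extend(pos, word, prob):
        extended = False
        if pos < len(w):
            for ch, p in w[pos].items():
                if prob * p >= threshold:
                    extend(pos + 1, word + ch, prob * p)
                    extended = True
        if not extended:            # cannot be extended above the threshold
            results.append((word, prob))
    extend(start, "", 1.0)
    return results

def build_wst_trie(w, k):
    """The compressed suffix tree is replaced here by a plain dict trie;
    'leaves' records the starting positions of the stored weighted suffixes."""
    root = {}
    for i in range(len(w)):
        for word, _prob in weighted_prefixes(w, i, 1.0 / k):
            if not word:
                continue            # nothing above the threshold at this start
            node = root
            for ch in word + "$":
                node = node.setdefault(ch, {})
            node.setdefault("leaves", []).append(i)
    return root

trie = build_wst_trie(w, k=4)       # threshold 1/4, as in the example above
```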

3 Applications of the WST

In this section we present concisely some applications of the Weighted Suffix Tree.

Pattern matching in weighted sequences: in this problem, given a pattern p and a weighted sequence x, we want to find the starting positions of p in x, each with probability of appearance greater than 1/k. First, we build the WST for x with parameter k. If p consists entirely of non-weighted positions, we spell p from the root of the tree until, at an internal node v, either we have spelled the entire p, in which case we report all items in LL(v), or we cannot proceed further and thus we report failure. If p contains weighted positions, we decompose it into solid patterns, each with Pr{occurrence} > 1/k, and match each one of them using the above procedure. This solution takes O(m + α) time, m = |p|, where α is the output size, with O(n) preprocessing.

Computing the repeats: in this problem, given a weighted sequence x and an integer k, we are


searching for all the repeats of all possible words having a probability of appearance greater than 1/k that do not overlap. We build again a WST with parameter k and traverse it bottom-up. At each internal node v that has more than one leaf in its subtree, we report the leaves beneath it, in pairs. This process requires O(n log n + α) time, where α is the output size, with O(n) preprocessing. When we allow overlaps, the repetitions have to present the same characters at the positions that overlap. For this case we need nearest common ancestor queries to check this restriction in constant time. This process requires O(n² + α) time, where α is the output size, with O(n) preprocessing.

Longest common substring in weighted sequences: in this problem, given two weighted strings S₁ and S₂, find the longest common substring with probability of appearance greater than 1/k in both strings. To find the longest common substring of two given weighted strings S₁ and S₂, a generalised weighted suffix tree for S₁ and S₂ is built. The path label of any internal node is a substring common to both S₁ and S₂ with probability of appearance greater than 1/k. The algorithm merely finds the node with the greatest string-depth using a preorder traversal. The above procedure runs in O(n) time.
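Spelling a solid pattern from the root, as in the pattern-matching application above, reduces in the toy trie of the previous sketch to a walk followed by collecting the leaf-list of the node reached. The sketch below (again with our own function name) illustrates only that idea, not the paper's compressed-tree implementation.

```python
def occurrences(trie, pattern):
    """Spell `pattern` from the root; on success report all leaf labels
    (starting positions) in the subtree below the node reached."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return []               # cannot proceed further: report failure
        node = node[ch]
    found, stack = [], [node]       # collect the leaf-list LL(v)
    while stack:
        v = stack.pop()
        found.extend(v.get("leaves", []))
        stack.extend(child for key, child in v.items() if key != "leaves")
    return sorted(set(found))

print(occurrences(trie, "TCA"))     # 0-based positions where TCA occurs with prob >= 1/4
```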

References
[1] Celera Genomics: The Genome Sequence of Drosophila melanogaster. Science, Vol. 287 (2000) 2185-2195.
[2] Celera Genomics: The Sequence of the Human Genome. Science, Vol. 291 (2001) 1304-1351.
[3] Crochemore, M.: An Optimal Algorithm for Computing the Repetitions in a Word. Inf. Proc. Lett., Vol. 12 (1981) 244-250.
[4] G. Grillo, F. Licciuli, S. Liuni, E. Sbisa, G. Pesole: PatSearch: a program for the detection of patterns and structural motifs in nucleotide sequences. Nucleic Acids Res., 31 (2003) 3608-3612.
[5] Gusfield, D.: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, New York (1997).
[6] Iliopoulos, C., Makris, Ch., Panagis, Y., Perdikuri, K., Theodoridis, E., Tsakalidis, A.: Computing the Repetitions in a Weighted Sequence using Weighted Suffix Trees. In European Conference on Computational Biology (ECCB 2003), Posters' Track.
[7] Iliopoulos, C., Mouchard, L., Perdikuri, K., Tsakalidis, A.: Computing the repetitions in a weighted sequence. Proceedings of the Prague Stringology Conference (PSC 2003), 91-98.
[8] Kolpakov, R., Kucherov, G.: Finding maximal repetitions in a word in linear time. In Proc. FOCS99, pp. 596-604 (1999).
[9] Kurtz, S., Schleiermacher, C.: REPuter: fast computation of maximal repeats in complete genomes. Bioinformatics, Vol. 15 (1999) 426-427.
[10] H. Li, V. Rhodius, C. Gross, E. Siggia: Identification of the binding sites of regulatory proteins in bacterial genomes. Genetics, 99 (2002) 11772-11777.
[11] Iliopoulos, C., Makris, C., Panagis, Y., Perdikuri, K., Theodoridis, E., Tsakalidis, A.: IFIP International Conference on Theoretical Computer Science, Toulouse, August 2004 (to appear).


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 709-710

Preface to Symposium: Performance Measurement Based on Complex Computing Techniques

D. Wu¹ and K. Womer²

¹ School of Business, University of Science and Technology of China, 230026 Jinzhai Road, Hefei, PR China
² College of Business Administration, University of Missouri-St. Louis, One University Blvd, St. Louis, MO 63121-4400

Abstract: This preface presents the purpose, content and results of one of the ICCMSE 2004 symposia organized by Dr. D. Wu and Professor K. Womer, who is also Dean of the College of Business Administration, University of Missouri-St. Louis.

Keywords: Performance evaluation, Complex computing

Performance evaluation is a very complex task, since economic organizations are complex multidimensional systems [1]. They may comprise different sub-systems characterized by different measures of effectiveness and cost. Moreover, information is inherently incomplete and uncertain in these organizations. For example, the tradeoffs and relationships among different measures are often completely unknown. Any successful method has to handle the complexity of the system and the inherently uncertain and subjective information. Analyzing performance factors in such systems has proved to be a challenging task that requires innovative performance analysis tools and methods to keep up with the rapid evolution and ever-increasing complexity of such systems. Three major evaluation techniques are important here: measurement, analytic modeling and simulation. Complex computing methodologies such as neural networks have been widely adopted for this purpose. They are systems of adaptable nodes that learn by example, store the learned information, and make it available for later use [2,3,4]. Programming techniques such as Data Envelopment Analysis (DEA) are more transparent techniques that are also promising ways to deal with this task [5]. We feel that the time is ripe to offer, in an international symposium, a selection of good papers having, as their principal theme, performance evaluation theory, practice, or impact in a large-scale application. This symposium will cover various topics on quantitative modeling, simulation and measurement/testing of complex systems. The topics of interest include, but are not limited to:
· Integrated performance tool environments
· Analytical modeling
· Performance measurement and monitoring using neural networks
· Performance metrics
· Performance predictions
· Performance of memory systems
· Performance-directed system design
· Performance implications of parallel and distributed systems
· Case studies of in-depth performance analysis on existing systems
· System case studies showing the role of performance tools in the design of systems

¹ Corresponding author. E-mail: [email protected], [email protected]  ² [email protected]

The manuscripts submitted were reviewed by highly qualified expert referees in a thorough two-stage review procedure. The articles that we have been happy to accept all enjoy a combination of originality, high technical merit, and relevance to the topic. We would like to thank all the authors who submitted manuscripts to this special issue. We regret that only a fraction of the submissions, all of them interesting, could be included. We owe a special debt of gratitude to the many able reviewers who generously commented, often in extraordinary detail, on the submissions. Finally, we would like to acknowledge the good-hearted patience of the authors and of the organizers in spite of this international symposium's long gestation.

References
[1] O. Marta, Lorenzo D'Ambrosio, Raffaele Pesenti, Walter Ukovich. Multiple-attribute decision support system based on fuzzy logic for performance assessment. European Journal of Operational Research, in press, 2003.
[2] Dimla, S. (1999). Application of perceptron neural networks to tool-state classification in a metal-turning operation. Engineering Applications of Artificial Intelligence, 2(4), 471-477.
[3] Kusiak, A., & Lee, H. (1996). Neural computing-based design of components for cellular manufacturing. International Journal of Production Research, 34(7), 1777-1790.
[4] L. Liang and D. Wu (2003). An application of pattern recognition on scoring Chinese corporations' financial conditions based on backpropagation neural network. Computers & Operations Research, in press, available online 14 November 2003.
[5] A. Charnes, W.W. Cooper, and E. Rhodes (1978). Measuring the efficiency of decision making units. European Journal of Operational Research 2, 429-444.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 711-713

Workforce Schedule and Roster Evaluation Using Data Envelopment Analysis

Y. Li¹ and D. Wu²

¹ School of Business, University of Alberta, Edmonton, AB, Canada T6G 2R6. E-mail: [email protected]
² School of Business, University of Science and Technology of China, Hefei, AnHui, P.R. China, 230052. Corresponding author. E-mail: [email protected]

Abstract: A framework composed of simulation modeling and data envelopment analysis (DEA) is proposed to evaluate the overall service quality of various schedules and rosters that satisfy a time-varying demand. Workforce scheduling is a process of generating employee shifts to match customer demands for service while keeping costs under control and satisfying all applicable regulations (e.g., shift lengths and spacing of breaks). Workforce rostering is to assign shifts to employees, preferably with consideration of employees' satisfaction with their working time. The proposed approach provides an effective and innovative tool to estimate the performance of schedules and rosters through the multiple objectives that management is concerned with, rather than merely the service level. A cone-ratio DEA model is used to differentiate various schedules or rosters according to a decision maker's preferences over their performance indicators produced by a simulation model.

Keywords: Workforce scheduling; Multiple objectives; DEA; cone ratio

1. Introduction

Workforce scheduling is a process of generating employee shifts to match customer demands for service while keeping costs under control and satisfying all applicable scheduling regulations (e.g., shift lengths and spacing of breaks). One of the reasons that make this problem difficult to solve is the complexity of evaluating service quality. Traditional workforce scheduling approaches use service level, the percentage of customers served within a predetermined threshold time, as the only target to achieve while minimizing the staffing cost. Glover and McMillan (1986) employed techniques that integrate management science methods and artificial intelligence to solve general shift scheduling problems. Balakrishnan and Wong (1990) solved a rotating workforce scheduling problem by modeling it as a network flow problem. Smith and Bennett (1992) combined constraint satisfaction and local improvement algorithms to develop schedules for anesthetists. Schaerf and Meisels (1999) proposed a general local search method for employee scheduling problems. Recently, Laporte (1999) considered developing rotating workforce schedules by hand and showed how the constraints can be relaxed to get acceptable schedules. All of these, among many other proposed workforce scheduling algorithms, used only one criterion, service level, to indicate service quality. However, unless the service level is set to be one hundred percent, which is not common in practice, it is never clear what happens to the customers who wait more than the proposed threshold time. There certainly exist more indicators than service level to reveal service quality from different angles (Castillo, Joro and Li 2003). For example, maximum wait time indicates the worst scenario; balking and reneging rates show the percentage of the customers who attempted to but did not receive service; average and maximum queue length provide information on the physical capacity of service facilities, and so on. Using only service level to represent service quality, at least the above-mentioned information has to be ignored. The service quality indicators considered in this paper include, but are not limited to, service level, average wait time, maximum wait time, average queue length, maximum queue length, balking

rate, reneging rate, and personnel utilization. A simulation model is used to generate these indicators for the various proposed schedules, for two main reasons. First, simulation models are able to accommodate complicated realistic situations without imposing strong assumptions such as M/M/s queues or the stationary independent period-by-period (SIPP) approach. In addition, most analytical models have difficulty generating certain indicators such as maximum queue length. Considering currently available techniques, we decided to adopt simulation instead of analytical models to achieve high flexibility across situations and performance indicators, at the expense of speed.

Workforce rostering assigns shifts in a schedule to employees while satisfying various constraints, such as one employee working at most one shift in a day and employee availability. In many operations, hospitals in particular, satisfying employees' preferences for working time has been a chronic problem in workforce rostering. Poor employee satisfaction often leads to high personnel turnover, absenteeism, resentment, poor job performance and unfit mental and physical conditions, situations that translate into loss of productivity, quality and even safety. Therefore, indicators denoting employees' satisfaction with working time should be considered when evaluating workforce rosters, as compared to schedules. An aggregated employee preference score is used in the data sample to represent overall employee preference for a roster; however, more indicators can be included if necessary.

A framework composed of simulation modeling and data envelopment analysis (DEA) is proposed to evaluate the overall service quality of various schedules and, additionally, the overall employee satisfaction for rosters. This approach provides an effective and innovative tool to estimate the performance of schedules and rosters through multiple objectives that management is concerned with, rather than merely the service level. The cone-ratio DEA model is used here to aggregate the decision maker's preference information over the simulated performance indicators and the employee satisfaction indicators.
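Purely as a toy illustration of a simulation model producing some of the indicators named above (the paper's own model is more general and deliberately avoids M/M/s-type assumptions), the sketch below simulates a multi-server exponential queue and reports service level, average and maximum wait, and utilization. The function name and all parameter values are invented for the example.

```python
# Minimal single-scenario sketch of a service-quality simulation (illustrative only).
import random

def simulate_queue(arrival_rate, service_rate, servers, n_customers, threshold, seed=1):
    rng = random.Random(seed)
    free_at = [0.0] * servers              # next time each server becomes free
    t, waits, busy = 0.0, [], 0.0
    for _ in range(n_customers):
        t += rng.expovariate(arrival_rate)          # arrival time
        service = rng.expovariate(service_rate)
        k = min(range(servers), key=lambda i: free_at[i])
        start = max(t, free_at[k])
        waits.append(start - t)
        free_at[k] = start + service
        busy += service
    horizon = max(free_at)
    return {
        "service level": sum(wt <= threshold for wt in waits) / len(waits),
        "average wait": sum(waits) / len(waits),
        "maximum wait": max(waits),
        "utilization": busy / (servers * horizon),
    }

print(simulate_queue(arrival_rate=5.0, service_rate=2.0, servers=3,
                     n_customers=10_000, threshold=0.5))
```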

2. Cone-ratio DEA

The methodology we use here is cone-ratio Data Envelopment Analysis (DEA), a weight-restricted DEA developed on the basis of the CCR model (Charnes et al. 1978). DEA is considered a robust tool for the evaluation of relative efficiencies as well as for the establishment of goals (or benchmarks) for the entities outside the efficiency border (or envelope). The analyzed entities or DMUs (Decision Making Units) are compared under Farrell's concept of efficiency (Farrell et al., 1962), which consists of a ratio of the weighted sum of the outputs over the weighted sum of the inputs of each DMU. Suppose that there are n DMUs, that is, DMU_j (j = 1, 2, ..., n), to be evaluated. Each DMU_j has m different inputs x_{ij} and s different outputs y_{rj}. Let the observed input and output vectors of

DMU_j be $x_j = (x_{1j}, x_{2j}, \ldots, x_{mj})^T > 0$ and $y_j = (y_{1j}, y_{2j}, \ldots, y_{sj})^T > 0$, $j = 1, 2, \ldots, n$, respectively. The cone-ratio model can be written as

$$\max \ \frac{u^T y_0}{v^T x_0} \quad \text{s.t.} \quad \frac{u^T y_j}{v^T x_j} \le 1, \quad j = 1, 2, \ldots, n,$$

with the multiplier vectors $u$ and $v$ restricted to given closed convex cones.
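The ratio model above is usually solved after the Charnes-Cooper transformation, which turns it into a linear program. The sketch below solves the plain CCR multiplier form with scipy as an illustration; a genuine cone-ratio run would add further linear constraints expressing the weight cones, which are not spelled out here. The data and function name are invented.

```python
# CCR multiplier model (Charnes-Cooper linearization), illustrative only.
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, j0, eps=1e-6):
    """Input-oriented CCR efficiency of DMU j0.
    X: (m, n) inputs, Y: (s, n) outputs; columns are DMUs."""
    m, n = X.shape
    s = Y.shape[0]
    # decision variables: u (s output weights) then v (m input weights)
    c = np.concatenate([-Y[:, j0], np.zeros(m)])               # maximize u'y_0
    A_ub = np.hstack([Y.T, -X.T])                              # u'y_j - v'x_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.zeros(s), X[:, j0]])[None, :]    # v'x_0 = 1
    b_eq = np.array([1.0])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(eps, None)] * (s + m), method="highs")
    return -res.fun

# toy data: 2 inputs, 1 output, 4 DMUs (columns)
X = np.array([[2.0, 4.0, 3.0, 5.0],
              [3.0, 1.0, 2.0, 4.0]])
Y = np.array([[1.0, 1.0, 1.0, 1.0]])
print([round(ccr_efficiency(X, Y, j), 3) for j in range(4)])
```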

(λ_i > 0, λ_{i+1} < 0, and λ_i·λ_{i+1} > 0) and deal with them

respectively.

(3) Dealing with self-intersecting isometry loops: Taking the isometry (offset) operation on a closed contour may result in self-intersection, that is, several minimal closed loops may occur. When self-intersection happens with the isometry loop, we first subdivide the self-intersecting contour in the isometry loop according to the self-intersections and pick up the information of the new closed loops; we then eliminate the loops whose orientation is opposite to that of the original loop by judging their orientation. It is also worth pointing out that it is easy to judge the orientation of a convex closed loop: the only thing needed is to calculate the cross product of the vectors of two arbitrary adjacent contour segments and check the sign of the result. It is more complex with a concave loop, and we adopt the corresponding algorithm.

(4) Dealing with mutual intersection of the isometry loops: A common case is that a real part has a mould pocket inside which there is more than one island. When taking the isometry operation on the mould pocket and the contour of an island at the same time, the isometry loops may intersect, either between the mould pocket and an island or between two islands. In this case, an aggregation operation on the intersecting closed loops is needed. In order to do that efficiently, we let the outer loop be counter-clockwise (CCW) and the inner loops clockwise (CW). Also, we attach to each intersection point of the two directed lines a denotation eigenvalue, in addition to its geometric coordinates, which corresponds to the geometry of the intersecting elements.

(5) The ending condition of the isometry operation: All the inside contours will shrink after the isometry operation; even if an island contour grows, it will join with the outer contour and decrease in the end. Hence, the operation must be stopped when the result of the isometry operation on the inner contour reduces to a point. However, the common case is not reducing to a point after the inner contour loop isometry operation, but an unexpected contour loop, so it is meaningless to continue the isometry operation based on that. We can thus find the condition for ending the isometry operation: the orientation of the isometry loop becomes opposite to that of the original loop.

(6) The examination and prevention of chopping leakage: The cutter removes the remainder along the isometry loops in contour-parallel isometry machining of the 2D mould pocket. In this process, chop leakage may occur in milling: if interspace occurs between the wrapped lines of the cutter feeding along two arbitrary adjacent isometry lines, it finally results in some unexpected islands. In IE-CAM pocket machining, we adopt a new method to examine and prevent the chopping leakage.

Acknowledgments
The research is supported by Professor Changan Zhu and Chuanqi Li, through the National Research Lab Program.




Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 717-721

A Strategy of Optimizing Neural Networks by Genetic Algorithms and its Application to Corporate Credit Scoring

Desheng Wu¹*, Liang Liang¹, Hongman Gao² and Y. Li³

¹ School of Business, University of Science and Technology of China, 230026 Jinzhai Road, Hefei, PR China
² School of Business, University of Mississippi, USA
³ School of Business, University of Alberta, Edmonton, AB, Canada T6G 2R6
* Corresponding author. E-mail: [email protected]

Abstract: When an ordinary backpropagation neural network is trained, the problem of being trapped in a serious local minimum often occurs. This leaves the training unfinished and the weights distributed in the network immature. The current paper proposes a strategy of optimizing neural networks by a genetic algorithm to deal with this problem. If the initially decided conditions (including training time and error precision) cannot be satisfied, i.e., the neural network settles in a local minimum, the unaccomplished network is optimized by the genetic algorithm through adjusting the immature weights. The evolutionary weights, which are believed to grasp the properties of the sample, then construct a new network model which shows strong discriminant power. Using the proposed evolutionary strategy, a credit scoring model for Chinese business corporations is developed and the scoring power of the model is tested in this study. Finally, conclusions are presented.

Keywords: Corporate credit scoring; backpropagation neural networks; genetic algorithms; weights

1. Introduction

The separation of ownership and management poses the problem of potential conflict between the shareholders, i.e. the owners of the firm, and the managers who run it. By exposing corporate information quickly and accurately, credit analysis is viewed as a good way to help solve the problems of moral hazard and adverse selection. It is widely believed that the neural network is an accurate technique for scoring credit conditions [1-6]. Since Odom first used neural networks to analyze credit risk in 1990, the application of neural networks in this field has attracted much attention from practitioners and researchers [1]. Coats and Fant contrast a neural network with linear discriminant analysis for a set of cases labeled viable or distressed obtained from COMPUSTAT for the period 1970-1989 [2]. They find the neural network to be more accurate than linear discriminant analysis. Tam and Kiang studied the application of a BP neural network model to Texas bank failure prediction for the period 1985-1987 [3]. The neural network prediction accuracy is compared to linear discriminant analysis, logistic regression, k-nearest-neighbor, and a decision tree model. Their results suggest that the multilayer perceptron is most accurate. Altman et al. employ both linear discriminant analysis and a neural network to score corporate credit for 1000 Italian firms; the authors conclude that neural networks are not a clearly dominant mathematical technique compared to traditional statistical techniques such as discriminant analysis [4]. Piramuthu studies credit scoring of Belgian corporations, comparing a BP neural network and a neuro-fuzzy model, preprocessing the data sample by principal component analysis (PCA) [5]. Mu-Chen Chen and Shih-Hsien Huang construct an NN-based credit scoring model, which classifies applicants as accepted (good) or rejected (bad) credits; a GA-based inverse classification technique is used to reassign rejected credits to the preferable accepted class [6]. All of the models in these studies are based on specific variables and data sets of their respective countries. Thus, the adaptation to Chinese conditions of the results for assessing the distress potential of


firms is questionable, because the representative credit ratios and data sets differ from country to country and even from company to company. Hence, it is critical to do empirical research on credit scoring with variables and data sets of Chinese firms. A few scholars in China, such as Baoan Yang, Hai Ji and Chengde Lin [8],[17], have dipped into this field. However, firm-specific factors, macroeconomic trends, and industrial factors are omitted from their models, as well as the decision maker's individual influence from experience, wisdom and information priority. Thus, the results of these studies are not generalizable, for these reasons as well as the data accuracy of their samples. Baoan Yang, Hai Ji et al. [8] suggested a BP model to diagnose credit risk conditions with a set of data from the credit bank of Suzhou city in China; the data were not selected carefully, so their study is not truly empirical research. Similarly to [8], cursory demonstration results obtained with neural networks, and comparisons with some other methods, are given in [17]. The tactics for dealing with commonly met problems, such as local minima, are seldom detailed in these previous studies. In order to overcome data redundancy, it is usually necessary to preprocess the data by principal component analysis [2,5], but this may cause serious local-minimum problems, as will be seen in this study. In this research, after a preprocessing step by principal component analysis, an evolutionary layer is suggested by the authors to optimize the BP neural network. In the evolutionary layer, a genetic algorithm is used to assist training the network by adjusting the weights and biases. Thus the deficiency of serious local minima is dealt with. Whilst the research work was undertaken in the field of credit scoring, many, if not all, of the procedures and findings described may be applicable to other areas of classification and trend analysis.

2. Variable Selection and Sample Design

We refer to the index system that is currently used in the relevant domestic industry. The representative variables employed in this study are zeroed in on four areas of a business that may need attention: variables estimating profitability (V1, V2, V3), variables estimating operational efficiency (V4, V5), variables estimating solvency power (V6, V7, V8) and variables estimating development ability (V9, V10), which are paid attention to by three different principal parties (shown in Table 1). These variables are therefore comprehensive enough to account for corporate credit conditions.

Table 1: Corporate credit scoring variables

Total object: Corporate credit scoring

Sub-object: profitability
  Variables: Earnings per stock (V1), Return on assets (V2), Net operating profit rate of return (V3)
  Principal caring part: examined carefully by the investors

Sub-object: operational efficiency
  Variables: Receivables turnover (V4), Inventory turnover (V5)
  Principal caring part: examined carefully by the loaner

Sub-object: solvency efficiency
  Variables: Total-debts-to-total-assets (V6), Current ratio (V7), Acid-test ratio (V8)
  Principal caring part: examined carefully by the loaner

Sub-object: development ability
  Variables: Market domain (V9), Final-equity-to-initial-equity (V10)
  Principal caring part: examined carefully by the nation

In the process of collecting data, the key problems are the representativeness and reliability of the sample, as well as sample errors. The case companies are therefore all public companies from Chinese business industry in 1996. 30 case companies were carefully selected, and balance sheet and income statement data as well as specialist scoring values were collected. The modeling data set is composed of 24 case companies, and the testing data set of the other 6 case companies.

3. Model

Generally speaking, the whole discriminating process mainly follows four stages: pre-processing


stage, network training stage, model testing (i.e., credit scoring) stage and post-processing stage. In particular, in the network training stage we propose an evolutionary network to improve the learning process. The so-called evolutionary network is a kind of neural network combined with a GA to deal with local minima. When the neural network is being trained, the problem of a local minimum often occurs and leaves the training of the neural network unfinished. The initially decided precision then cannot be achieved, and the unaccomplished network constructed from the immature weights cannot effectively grasp the genuine properties of the cases being scored. Evidently, the weights need to be further optimized, as do the biases. This is the focus of the current study. Figure 1 plots the scoring flow of the evolutionary network. It shows that if the initially decided conditions (including training time and error precision) cannot be satisfied, as indicated in the rhombus, the unaccomplished network is optimized by the GA through adjusting the immature weights. On the contrary, if the initially decided conditions are satisfied (either after GA optimization or before it), the scoring can be executed and results can then be yielded.

Figure 1: Flow chart of the evolutionary network (with a GA optimization layer).

Steps 1-6 below give the outline of the GA used in the evolutionary network.

Step 1: When the network settles in a local minimum, it enters the evolutionary layer following our algorithm, which randomly generates an initial population of chromosomes limited to the interval between the minimal and maximal weight values (these represent our trial weights, and are normally encoded as binary parameter strings).
Step 2: Transform the binary population into a population of weights and evaluate each chromosome to establish its fitness (success at solving the problem).
Step 3: Select pairs of individuals for recombination based on their fitness.
Step 4: Combine the individuals, using genetic operators such as crossover and mutation.
Step 5: Create a new population from the newly produced individuals.
Step 6: Repeat Steps 2 through 5 until a satisfactory solution is found, namely, optimized weights and biases.
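A stripped-down numerical illustration of Steps 1-6 follows. The paper encodes weights as binary strings; for brevity this sketch uses real-valued genes, and the tiny one-hidden-layer network, the fitness definition and all GA parameters are invented for the example rather than taken from the paper.

```python
# Illustrative GA layer optimizing the weights of a small feedforward network.
import numpy as np

rng = np.random.default_rng(0)

def forward(weights, X, n_in, n_hid):
    """Unpack a flat chromosome into (W1, b1, W2, b2) and run the network."""
    i = 0
    W1 = weights[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = weights[i:i + n_hid]; i += n_hid
    W2 = weights[i:i + n_hid]; i += n_hid
    b2 = weights[i]
    h = np.tanh(X @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))        # sigmoid output

def fitness(weights, X, y, n_in, n_hid):
    err = np.mean((forward(weights, X, n_in, n_hid) - y) ** 2)
    return 1.0 / (1.0 + err)                             # higher is better

def ga_optimize(X, y, n_in, n_hid, pop=40, gens=200, pc=0.8, pm=0.05):
    n_genes = n_in * n_hid + n_hid + n_hid + 1
    P = rng.uniform(-1, 1, size=(pop, n_genes))          # Step 1: initial population
    for _ in range(gens):
        f = np.array([fitness(ind, X, y, n_in, n_hid) for ind in P])   # Step 2
        parents = P[rng.choice(pop, size=pop, p=f / f.sum())]          # Step 3
        children = parents.copy()
        for i in range(0, pop - 1, 2):                                  # Step 4: crossover
            if rng.random() < pc:
                cut = rng.integers(1, n_genes)
                children[i, cut:] = parents[i + 1, cut:]
                children[i + 1, cut:] = parents[i, cut:]
        mask = rng.random(children.shape) < pm                          # Step 4: mutation
        children[mask] += rng.normal(0, 0.3, size=mask.sum())
        P = children                                                     # Step 5
    f = np.array([fitness(ind, X, y, n_in, n_hid) for ind in P])
    return P[f.argmax()]                                                 # Step 6: best weights

# toy data: 10 financial-ratio inputs, binary credit label
X = rng.normal(size=(24, 10)); y = (X[:, 0] + X[:, 1] > 0).astype(float)
best_weights = ga_optimize(X, y, n_in=10, n_hid=4)
```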


4. Summary and Conclusions

The current paper mainly focuses on the validation of a strategy combining BP neural networks and genetic algorithms for corporate credit scoring. This strategy seeks to deal with local minima occurring during net training. Using the strategy, the unaccomplished network is optimized by the genetic algorithm through adjusting the immature weights if the initially decided conditions (including training time and error precision) cannot be satisfied, i.e., if the neural network settles in a local minimum. The evolutionary weights, which are believed to grasp the properties of the sample, then construct a new neural network model, named the evolutionary neural network. It is demonstrated to be better than ordinary BP neural networks in scoring credit conditions. Also, the high accuracy rate of the proposed model validates the strong scoring power of the 10 variables used in this research. Whilst the research work was undertaken in the field of credit scoring, many, if not all, of the procedures and findings described may be applicable to other areas of classification and trend analysis.

References
[1] Odom, M.D., and Sharda, R. A Neural Network Model for Bankruptcy Prediction. Proceedings of the IEEE International Joint Conference on Neural Networks, 1990, 2: 163-168.
[2] Coats, P., Fant, L. Recognizing financial distress patterns using a neural network tool. Financial Management, 1993: 142-155.
[3] K. Tam and M. Kiang. Managerial applications of neural networks: The case of bank failure predictions. Management Science, 1992, 38: 416-430.
[4] E. Altman, G. Marco, and F. Varetto. Corporate distress diagnosis: Comparisons using linear discriminant analysis and neural networks. Journal of Banking and Finance, 1994, 18: 505-529.
[5] Piramuthu, S. Financial credit-risk evaluation with neural and neurofuzzy systems. European Journal of Operational Research, 1999, 112(2): 310-321.
[6] Mu-Chen Chen, Shih-Hsien Huang. Credit scoring and rejected instances reassigning through evolutionary computation techniques. Expert Systems with Applications, 2003, 24: 433-441.
[7] Jinmei Wu. Credit Analysis. Audit Publishing Company of China, 2001.
[8] Baoan Yang, Hai Ji. A Study of Commercial Bank Loans Risk Early Warning Based on BP Neural Network. Systems Engineering - Theory and Practice, 2001, 5.
[9] Brill, J. The importance of credit scoring models in improving cash flow and conditions. Business Credit, 1998(1): 16-17.
[10] J. E. Baker. Adaptive Selection Methods for Genetic Algorithms. Proc. ICGA 1, 1985: 101-111.
[11] Hecht-Nielsen, R. Neural computing. Addison-Wesley, 1990: 124-133.
[12] Matlab. Matlab User's Guide. The MathWorks Inc., South Natick, MA, USA, 1994.
[13] K. A. De Jong. Analysis of the Behaviour of a Class of Genetic Adaptive Systems. Dept. of Computer and Communication Sciences, University of Michigan, Ann Arbor, 1975.
[14] J. E. Baker. Reducing bias and inefficiency in the selection algorithm. Proc. ICGA 2, 1987: 14-21.
[15] L. Booker. Improving search in genetic algorithms. In Genetic Algorithms and Simulated Annealing, L. Davis (Ed.). Morgan Kaufmann Publishers, 1987: 61-73.
[16] H. Mühlenbein and D. Schlierkamp-Voosen. Predictive Models for the Breeder Genetic Algorithm. Evolutionary Computation, 1993, 1(1): 25-49.
[17] Chen Xionghua, Lin Chengde, Ye Wu. Credit risk assessment of enterprises based on neural network. Journal of Systems Engineering, 2002, 17(6): 570-576.
[18] Liang Liang, Desheng Wu. An Application of Pattern Recognition on Scoring Financial Conditions Based on Backpropagation Neural Network. Computers & Operations Research, under publication.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 722-725

The Valuation of Quanto Options Under Stochastic Interest Rates

Li ShuJin¹, Li ShengHong² and Desheng Wu³

¹,² Department of Mathematics, Zhejiang University, 310027 Zhejiang Province, Hangzhou, PR China
³ School of Business, University of Science and Technology of China, 230026 Hefei, PR China
[email protected]

Extended Abstract

The quanto option is a contingent claim whose investor has to hedge the risk from both the foreign stock price and the exchange rate simultaneously. This paper discusses two cases of quanto options: (1) an option with floating exchange rate, written on a foreign stock in terms of foreign currency, whose value is converted into domestic currency by the exchange rate at maturity; (2) an option with fixed exchange rate, whose value is converted into domestic currency at maturity by a pre-agreed rate. Quanto options were introduced by Reiner in 1992. When investing in a foreign stock, an investor is concerned not only about the risk of the foreign stock price but also about that of the exchange rate. Therefore, the investor demands to control the risk of the foreign stock price and the exchange rate simultaneously, due to unfavorable changes in the shapes of yield curves, or to avoid one of the risks. A number of authors have discussed the pricing of options on interest rates; for instance, Flesaker (1991) and Viswanathan (1991) derived closed-form solutions for a European option on an interest rate using variants of the one-factor Heath et al. (1992) model. Dravid et al. (1993) considered the pricing of foreign index contingent claims, and Reiner (1993) introduced the price of the quanto option under deterministic interest rates. To our knowledge, there is little literature on quanto option pricing under stochastic domestic and foreign spot rates.

We consider a financial market operating in continuous time and described by a probability space $(\Omega, \mathcal{F}, P)$, a time horizon $T$ and a filtration $\{\mathcal{F}_t\}_{0 \le t \le T}$ representing the information available at time $t$. One central problem in financial mathematics is the pricing and hedging of contingent claims by means of dynamic trading strategies based on $S$, the asset price available for trading. We assume the market is complete; that is, in this market there are neither transaction costs nor taxes, borrowing and short-selling are allowed without restrictions, and the borrowing rate is the same as the lending rate. A European contingent claim is an $\mathcal{F}_T$-measurable random variable describing the payoff at time $T$ of some derivative security. To address the importance of correlation, this paper employs a multi-factor valuation model. To begin with, we assume that, given the initial curves $f(0,\tau)$ and $g(0,\tau)$, $\forall \tau$, the domestic and foreign forward rates $f(t,T)$ and $g(t,T)$, the foreign stock price $S(t)$, and the exchange rate $C(t)$ follow diffusion processes of the following form under the measure $P$:

$$df(t,T) = \alpha(t,T)\,dt + \sigma_d(t,T)\,dW_1(t)$$
$$dg(t,T) = \beta(t,T)\,dt + \sigma_f(t,T)\,dW_2(t)$$
$$dS_t = S_t\left[\mu_S(t)\,dt + \sigma_S(t)\,dW_3(t)\right]$$
$$dC_t = C_t\left[\mu_C(t)\,dt + \sum_{i=1}^{4}\rho_i\,\gamma(t)\,dW_i(t)\right]$$

where $W(t) = (W_1(t), W_2(t), W_3(t), W_4(t))$ is a 4-dimensional standard $P$-Brownian motion; $\alpha(t,T)$, $\beta(t,T)$, $\mu_S(t)$ and $\mu_C(t)$ are the drift coefficients of the domestic and foreign forward rates $f(t,T)$ and $g(t,T)$, the foreign stock price $S(t)$, and the exchange rate $C(t)$, respectively; and $\rho_i$, $i = 1,2,3,4$, are constants satisfying
$$\rho_1^2 + \rho_2^2 + \rho_3^2 + \rho_4^2 = 1.$$
Furthermore, $\sigma_d(t,T)$, $\sigma_f(t,T)$, $\sigma_S(t)$ and $\gamma(t)$ are their volatilities, respectively. It is easy to verify that, under the measure $P$, $\rho_1 W_1(t) + \rho_2 W_2(t) + \rho_3 W_3(t) + \rho_4 W_4(t)$

is also a standard $P$-Brownian motion.

In Section 2 of this paper, we find a martingale measure $Q$ equivalent to $P$, i.e., a measure under which all the processes
$$X_t = B_d^{-1}(t)\,P(t,T), \qquad Y_t = B_d^{-1}(t)\,C_t\,Q(t,T), \qquad V_t = B_d^{-1}(t)\,C_t\,B_f(t), \qquad Z_t = B_d^{-1}(t)\,C_t\,S_t$$
are $Q$-martingales, where
$$B_d(t) = \exp\left(\int_0^t r_d(u)\,du\right), \qquad B_f(t) = \exp\left(\int_0^t r_f(u)\,du\right),$$
$$P(t,T) = \exp\left(-\int_t^T f(t,u)\,du\right), \qquad Q(t,T) = \exp\left(-\int_t^T g(t,u)\,du\right).$$
In effect, in order to find a measure $Q$ equivalent to $P$, we find a previsible vector $\lambda(t) = (\lambda_1(t), \lambda_2(t), \lambda_3(t), \lambda_4(t))$, called the market price of risk, and a new measure $Q$ under which $\widetilde W(t) = (\widetilde W_1(t), \widetilde W_2(t), \widetilde W_3(t), \widetilde W_4(t))$ is a new 4-dimensional standard Brownian motion, where
$$\widetilde W_i(t) = W_i(t) + \int_0^t \lambda_i(v)\,dv, \qquad i = 1,2,3,4.$$
At last, we find the martingale measure $Q$ under which the domestic and foreign forward rates $f(t,T)$ and $g(t,T)$, the foreign stock price $S(t)$, and the exchange rate $C(t)$ satisfy the following diffusion processes, respectively:
$$df(t,T) = \sigma_d(t,T)\,d\widetilde W_1(t) - \sigma_d(t,T)\,\Sigma_d(t,T)\,dt,$$
$$dg(t,T) = \sigma_f(t,T)\,d\widetilde W_2(t) - \sigma_f(t,T)\left\{\rho_2\,\gamma(t) + \Sigma_f(t,T)\right\}dt,$$
$$dS_t = S_t\left[\sigma_S(t)\,d\widetilde W_3(t) + \left\{r_f(t) - \rho_3\,\sigma_S(t)\,\gamma(t)\right\}dt\right],$$
where $\Sigma_d(t,T)$ and $\Sigma_f(t,T)$ denote the accumulated volatilities of the domestic and foreign forward rates.


In Section 3 of this paper, we discuss the two cases of quanto options.

(1) An option with floating exchange rate. The payoff of the call option at maturity time $T$ is
$$V_{QC}^1(T) = C_T \max\{S_T - K, 0\}.$$
In fact, its payoff at maturity time $T$ is that of a call option computed in terms of foreign currency and then transformed into domestic currency by the exchange rate. Therefore, the investor in this quanto option attaches importance not only to the risk of the foreign stock price but also to that of the exchange rate.

(2) An option with fixed exchange rate. The payoff of the call option at maturity time $T$ is
$$V_{QC}^2(T) = \bar C \max\{S_T - K, 0\},$$
where $K$ is the strike price in terms of foreign currency and $\bar C$ is the pre-agreed exchange rate by which the value of the call option is transformed into domestic currency value at maturity in the second case.

Making use of martingale theory, we derive closed-form pricing formulas for call quanto options and put-call parity formulas under stochastic domestic and foreign interest rates in a complete market. Furthermore, the pricing formulas of put quanto options are obtained. Our main results in this paper are the following.

Theorem 1. The price at time 0 of the first quanto call option, denoted by $V_{QC}^1$, with exercise price $K$ (in foreign currency) and maturity time $T$, is
$$V_{QC}^1 = C_0 S_0\, N\!\left(d_1 + \tfrac{1}{2}\sigma\right) - C_0 K\, Q(0,T)\, N\!\left(d_1 - \tfrac{1}{2}\sigma\right),$$
where $N(\cdot)$ is the cumulative probability function of a standard normal random variable, and
$$d_1 = \frac{-\ln Q(0,T) + \ln S_0 - \ln K}{\sigma}, \qquad \sigma^2 = \int_0^T \Sigma_f^2(v,T)\,dv + \int_0^T \sigma_S^2(v)\,dv.$$

Theorem 2 (parity formula). At time 0, the price of the first quanto put option, denoted by $V_{QP}^1$, and the price of the call, denoted by $V_{QC}^1$, which have the same strike price $K$ and maturity time $T$, satisfy the following relation:
$$V_{QC}^1 - V_{QP}^1 = C_0 S_0 - K C_0\, Q(0,T).$$
Furthermore, the price of the first quanto put option at time 0 is
$$V_{QP}^1 = C_0 K\, Q(0,T)\, N\!\left(-d_1 + \tfrac{1}{2}\sigma\right) - C_0 S_0\, N\!\left(-d_1 - \tfrac{1}{2}\sigma\right),$$
where $N(\cdot)$, $d_1$ and $\sigma$ refer to Theorem 1.

Theorem 3. The price at time 0 of the second quanto call option, denoted by $V_{QC}^2$, with exercise price $K$ (in foreign currency), maturity time $T$ and pre-agreed exchange rate $\bar C$, is
$$V_{QC}^2 = \bar C S_0\, \frac{P(0,T)}{Q(0,T)}\, e^{b}\, N\!\left(d_2 + \tfrac{1}{2}\sigma\right) - \bar C K\, P(0,T)\, N\!\left(d_2 - \tfrac{1}{2}\sigma\right),$$
where $N(\cdot)$ and $\sigma$ refer to Theorem 1 and
$$b = \int_0^T \Sigma_f^2(v,T)\,dv + \int_0^T \left\{\Sigma_f(v,T) - \rho_3\,\sigma_S(v)\right\}\gamma(v)\,dv, \qquad d_2 = \frac{-\ln P(0,T) + \ln C_0 S_0 - \ln K + b}{\sigma}.$$

Theorem 4 (parity formula). At time 0, the price of the second quanto put option, denoted by $V_{QP}^2$, and the price of the call, denoted by $V_{QC}^2$, which have the same strike price $K$ and maturity time $T$, satisfy the following relation:
$$V_{QC}^2 - V_{QP}^2 = \bar C S_0\, \frac{P(0,T)}{Q(0,T)}\, e^{b} - \bar C K\, P(0,T).$$
Furthermore, the price of the second quanto put option at time 0 is
$$V_{QP}^2 = \bar C K\, P(0,T)\, N\!\left(-d_2 + \tfrac{1}{2}\sigma\right) - \bar C S_0\, \frac{P(0,T)}{Q(0,T)}\, e^{b}\, N\!\left(-d_2 - \tfrac{1}{2}\sigma\right),$$
where $N(\cdot)$, $d_2$ and $\sigma$ refer to Theorem 3.

Only with the formulas for the quanto options can we know how to hedge the risk from the foreign stock price, the exchange rate, and the foreign and domestic spot interest rates. In this paper we do not discuss the hedging problem, due to the length of the article.
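As a quick numerical illustration of Theorem 1 (and the parity relation of Theorem 2) as reconstructed above, the sketch below evaluates the closed-form price for made-up inputs. Here Q(0,T) is the foreign zero-coupon bond price and sigma the accumulated volatility over [0,T]; all values and the function name are illustrative assumptions, not taken from the paper.

```python
# Evaluate the floating-exchange-rate quanto call and put of Theorems 1-2.
from math import log, exp
from statistics import NormalDist

N = NormalDist().cdf

def quanto_floating(C0, S0, K, QT, sigma):
    """C0: spot exchange rate, S0: foreign stock price, K: strike (foreign currency),
    QT: foreign zero-coupon bond price Q(0,T), sigma: total volatility over [0,T]."""
    d1 = (log(S0) - log(K) - log(QT)) / sigma
    call = C0 * S0 * N(d1 + sigma / 2) - C0 * K * QT * N(d1 - sigma / 2)
    put = call - (C0 * S0 - C0 * K * QT)      # put-call parity of Theorem 2
    return call, put

print(quanto_floating(C0=1.1, S0=100.0, K=95.0, QT=exp(-0.03 * 1.0), sigma=0.25))
```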


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 726-729

Evaluation of the Collected Pages and Dynamic Monitoring of Scientific Information

Yanping Zhao and Donghua Zhu
School of Management & Economics, Beijing Institute of Technology, Beijing 100081
E-mail: [email protected], [email protected]

Abstract: This paper deals with the problems faced in gathering and analyzing large amounts of topic-specific scientific information for enterprises, governments or militaries. We propose a framework for efficient use of the collected papers and for dynamically monitoring the changes occurring in the specific topic. A composite index is built for ranking the collected scientific papers with objective and authoritative criteria. The index can be automatically updated according to computer-collected parameters when they change, or on user demand. An assisting module for detecting the changes is discussed. Initial tests show that our prototype is promising and has potential value in many information gathering and retrieval applications. Keywords: technology forecasting, scientific information gathering, PageRank, dynamic algorithm.

1. Introduction

Fast development makes scientific and engineering literature grow explosively; traditional analysis of information that relies entirely on human experts will become impossible and may cause delays in decision making or technology assessment. Intelligent methods that assist human experts in doing the analysis work and in accessing and processing varieties of topics of scientific information are therefore becoming more welcome, and they face challenges. The challenges are: (1) New technology intercrosses with many other research areas. (2) The fast speed of change and the large amount of scientific information. (3) How to make the collected information possess authoritativeness, relevance and freshness, how to make the collection process efficient, and how to guarantee precision and comprehensiveness. (4) Published pages are often modified, and new hot spots emerge at any time and anywhere; how to track them. Therefore, we make some contributions by introducing intelligent methods to partly solve these problems, and we propose a method for arranging the collected information. This paper is organized as follows: Section 2, related works; Section 3, methods and techniques for establishing dynamic ranking and updating models; Section 4, construction of a dynamic monitoring module; Section 5, initial results and future work.

2. Related Works

Recently, research on search engine page ranking and on topic-specific page gathering has grown rapidly, and many new methods and systems have been developed. Chakrabarti (1999) proposed the focused crawling concept and built a system with a Bayesian classifier as the target document selector. Aggarwal et al. (2001) proposed a statistical learning model for crawlers to discover topic-specific resources. Kleinberg (1998) proposed the HITS algorithm, which can provide hub pages and authorities, but the method is designed for a general search engine and needs an initially large document collection and intensive iteration to compute. Page et al. (1998) presented a popular algorithm called PageRank, which has been very successful.


Haveliwala (1999) also provides an efficient algorithm. Recently, Taher H. Haveliwala (2002) proposed a topic-sensitive PageRank approach: by computing a set of PageRank vectors, each biased with a different topic, it can generate more accurate rankings than a single, generic PageRank vector, and he gave some suggestions for improving the results further. Nowadays, many authors or groups supply on-line services for tracking changes. AT&T's Internet Difference Engine (AIDE) is a system that can automatically compare a page before and after changes and then construct an "incorporated" page to show the specific change. There are also systems like NetMind that can detect new pages containing the keywords a user has given. Khoo Khyou Bun et al. (2003) proposed a method that summarizes the changes in a page; its advantage is that it not only tells the user the newly emerged hot spot, but also shows the newly emerged topic in the specific domain. We use and modify those models to detect topic change or shifting, and we arrange the collected papers for efficient use.

3. Dynamic Ranking and Updating Models

Users are not interested in all collected papers, but only in the pages that are relevant or deemed important. Moreover, papers from different channels will have different ranking policies, so they are not comparable to each other. We therefore have to define our own ranking measures as overall equity scores for all collected pages, and also provide different profiles of them. We considered four factors: (1) the context or semantic measure; (2) the link structure (i.e., citation) measure; (3) the official statistical data for authorities (such as survey papers, top-rated author(s), organization(s), journal(s)); (4) freshness and others (such as page age or publication date). These factors ensure that prioritized pages are ranked in the top list when a large amount of relevant material is available.

3.1 The Context or Semantic Measure

In the feature selection stage, one of the major problems is the treatment of terminology, which represents domain-specific knowledge or concepts of the documents. Therefore, the selection of target features is very important for collecting relevant documents. Here we take advantage of the keywords from authoritative scientific citation thesauri, such as EI or INSPEC, to make them most representative and efficient for a specific scientific topic. If the user cannot supply accurate search terms, our system has a learning ability for expanding the co-occurrence terms for suitable target feature generation. For Chinese documents, the Chinese Journal Network (CNKI) takes the place of EI: Chinese word groups are extracted from the CNKI database, and for Chinese document analysis an NLP module uses a Chinese segmentation program, the ICTCLAS system from the Institute of Computing Technology of the Chinese Academy of Sciences, to parse the co-occurrence terms from Chinese character documents. Each collected page whose IS(P) is greater than a threshold will then be saved.

3.2 The Link Structure Measure

The collected papers only have similarities to the target; similarity alone cannot tell which paper is more authoritative than the others. From scientometrics, the more citations to a paper, the more authoritative the paper. So we introduce IP(P), based on the famous Google PageRank algorithm with a personalized (topic-specific) feature. The efficient PageRank algorithm with our personalized initialization uses the following formula:

Rank = d · M · Rank + (1 − d) · Per    (1)

where Rank is the vector (PR(P_1), PR(P_2), ..., PR(P_n))^T, M is the link graph matrix whose entry m_ij is 1/n(P_j) if page P_j points to P_i and 0 otherwise, and Per is the personalized initial vector, with Per = [1/N]_{N×1} for a high-quality computation. Here we introduce a dynamic approach: we modify Per after one round of the collection and let the initial Per be the real scores obtained from cited pages, which our system can pick automatically from the web, such as the "cited by" number of the page from NEC's virtual library or the Web of Science, because these are authoritative scientific databases. The initial vector is standardized to the PageRank initial condition, so that the sum of its components is equal to 1. Since the PageRank scores are computed once and saved with the titles, periodic updating does not overburden the system.
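A small sketch of the personalized PageRank iteration of equation (1), with the personalization vector initialised from externally collected citation counts and normalised to sum to one, is given below. The toy link graph, counts and function name are invented for illustration.

```python
# Personalized PageRank: Rank = d*M*Rank + (1-d)*Per, Per from citation counts.
import numpy as np

def personalized_pagerank(links, cited_by, d=0.85, iters=100):
    """links[j] = list of pages that page j points to; cited_by[i] = citation count."""
    n = len(links)
    M = np.zeros((n, n))
    for j, outs in enumerate(links):
        for i in outs:
            M[i, j] = 1.0 / len(outs)          # column-stochastic link matrix
    per = np.array(cited_by, dtype=float)
    per = per / per.sum()                       # personalization vector, sums to 1
    rank = per.copy()
    for _ in range(iters):
        rank = d * M @ rank + (1 - d) * per
    return rank

links = [[1, 2], [2], [0], [0, 2]]              # page j -> pages it cites
cited_by = [12, 3, 7, 1]
print(personalized_pagerank(links, cited_by).round(3))
```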


3.3 The Official Statistical Data for Authorities

For this case, we use statistical data such as real scientific citations, journal impact factors from JCR, other relevance ranks from the searched scientific databases, and the survey papers and influential authors or affiliations on the top list fed back from our monitoring module. All of these are collected and associated with the paper title, and they can be combined as weighted indicators. For comparability of their contributions, they are normalized and combined with weights, for example:

IA(P) = β1·Citation(P) + β2·Survey(P) + β3·Author(P) + β4·Journal(P) + β5·Affil(P)    (2)

3.4 The Indicator of Freshness

For this factor we use the date of the paper's publication, with the index defined as

IF(P) = (published year − 1990) / (current year − 1990)    (3)

3.5 The Composite Indicator of Importance

We normalize the four indicators for comparability, with consideration of simplicity and speed of computation, and the individual indicators also give some profiles of the collected papers.

IC(P) = a_1 * IS(P) + a_2 * IP(P) + a_3 * IA(P) + a_4 * IF(P)    (4)

Some documents may not have scores for some of the indexes, such as IP(P) or IA(P); in that case we set the corresponding indicators to zero. At the least they have IS(P) for their contents, since they are similar to the topic.
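As a small illustration of Eq. (4), the sketch below combines whatever indicator scores are available for a page, treating missing indicators as zero as described above. The weight values are purely illustrative; the paper does not specify them.

```python
def composite_importance(indicators, weights):
    """IC(P) = a1*IS + a2*IP + a3*IA + a4*IF (Eq. 4), with missing scores as zero."""
    return sum(w * (indicators.get(k) or 0.0) for k, w in weights.items())

weights = {"IS": 0.4, "IP": 0.3, "IA": 0.2, "IF": 0.1}    # illustrative weights
paper = {"IS": 0.82, "IP": None, "IA": 0.40, "IF": 0.75}  # no link-structure score
print(round(composite_importance(paper, weights), 3))     # 0.483
```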

4. Dynamic Monitoring Module
We modified the approach proposed by Adam Jatowt and Khoo Khyou Bun et al. and constructed a module to accomplish the dynamic monitoring task. Our module contains three components: the new-domain viewer, the crawler, and the change extractor. First, the viewer combines our dynamic feature-generating algorithm, selects feature terms from the collected papers, sends them to EI, CNKI, or Google, and collects the results that meet our target. Secondly, the crawlers are set by a timer to crawl the pages in the domains containing these results, in order to collect the updated parts of the original pages or newly emerged pages. Thirdly, the change extractor compares the new and old versions of a page and extracts the changed part.

4.1 The Domain Fitness
The aim of the new-domain viewer is to find the domains that contain newly emerged topics of interest to the user, or entirely new domains. First, it uses the dynamic feature-selection method to send the user target to the search engines; it then assesses the significance of the returned sites by calculating the domain fitness scores of the returned results. The fitness score is defined as follows:

Fitness = #{pages containing feature terms} / Total{pages in the domain} + #{links to other domains} / Total{links of the domain}    (5)

where # stands for the number of items; if the fitness score is large enough, the site is chosen as a fitness domain.
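A minimal sketch of the fitness computation of Eq. (5) follows; the example counts are invented, and the paper does not specify the acceptance threshold.

```python
def domain_fitness(pages_with_terms, total_pages, external_links, total_links):
    """Fitness score of Eq. (5): topical-page ratio plus outward-link ratio."""
    if total_pages == 0 or total_links == 0:
        return 0.0
    return pages_with_terms / total_pages + external_links / total_links

# A domain where 30 of 120 pages contain the feature terms and
# 45 of 300 links point to other domains:
print(domain_fitness(30, 120, 45, 300))   # 0.4
```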

4.2 The Crawlers
We use a depth-first algorithm that goes three levels deep into the fitness domains. By sending an HTTP header request to each site and reading the "Last-Modified" value, the crawler judges whether a collected page has changed. If it has changed, the page is picked up and the changed part is located; if it contains new links, the new pages are fetched for a similarity check.
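The paper's system is implemented in VC++; the sketch below shows the same "Last-Modified" check in Python using the requests library, purely as an illustration of the idea. The URL and the stored value are placeholders.

```python
import requests

def page_changed(url, last_seen):
    """Issue an HTTP HEAD request and compare the Last-Modified header
    with the value stored at the previous crawl (both as raw strings)."""
    try:
        response = requests.head(url, timeout=10, allow_redirects=True)
    except requests.RequestException:
        return True            # unreachable: treat as changed and re-fetch later
    current = response.headers.get("Last-Modified")
    if current is None:        # server gives no header: fall back to a full fetch
        return True
    return current != last_seen

# if page_changed("http://example.org/paper.html", stored_value):
#     ...fetch the page and run the change extractor...
```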

4.3 Change Detection and Extraction Methods
Change detection is performed by a sentence extractor and a sentence analyzer. The sentence extractor finds the two sets of sentences in the new and old pages, and the sentence analyzer compares them and finds the sentences that are not inside the old page; the new ones are combined to construct the changed part. The TF-PDF algorithm is then used to find the important sentences. The formula is as follows:

W_j = \sum_d |F_{jd}| \exp(n_{jd} / N_d),  where  |F_{jd}| = F_{jd} / \sqrt{\sum_{k=1}^{K} F_{kd}^2}    (6)


Here W_j is the weight of the j-th term, F_{jd} is the frequency of the j-th term in domain d, n_{jd} is the number of documents in domain d containing the j-th term, N_d is the total number of documents belonging to domain d, and K is the total number of terms. Finally, the sentence extractor computes the average sentence weight for each sentence in the changed part by summing the term weights inside the sentence, and picks out the significant sentences to construct the final changed part for the topics.
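The sketch below implements the TF-PDF weighting as reconstructed from the symbol definitions above; the data structure (a mapping from domain name to tokenized documents) and the toy corpus are assumptions for illustration.

```python
import math
from collections import Counter

def tf_pdf_weights(domains):
    """TF-PDF weights (Eq. 6). `domains` maps a domain name to its list of
    documents, each document being a list of terms."""
    weights = Counter()
    for docs in domains.values():
        n_docs = len(docs)
        freq = Counter(t for doc in docs for t in doc)            # F_jd
        norm = math.sqrt(sum(f * f for f in freq.values()))        # sqrt(sum_k F_kd^2)
        doc_count = Counter(t for doc in docs for t in set(doc))   # n_jd
        for term, f in freq.items():
            weights[term] += (f / norm) * math.exp(doc_count[term] / n_docs)
    return weights

docs = {"news": [["pagerank", "web"], ["pagerank", "mining"]],
        "blogs": [["web", "mining"]]}
print(tf_pdf_weights(docs).most_common(2))
```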

5. Initial Results and Future Work
The system is implemented in VC++ 6.0 (and called BIT). We tracked some topics of interest, for instance "PageRank". From Fig. 1 we see that the top paper is by L. Page, who proposed the famous Google search algorithm, and the other indicators show some profiles of the collected papers. Fig. 2 shows the terms with the heaviest weights picked by the sentence extractor. It reveals the newly emerged top terms: a topic such as "PageRank" is related to "information retrieval" etc., and newly related topics such as "availability", "reliability", "data mining" and "personalization" also appear. This is meaningful for technology monitoring. These initial tests are promising, and we expect the system to become stronger and more efficient as it gathers more thorough collections of the topic-related areas. In this sense, change detection makes dynamic monitoring of a scientific topic possible.


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 730-733

Research on Expert's Colony Decision Method of Gradual Refinement Based on Theory of Evidence
Weidong Zhu1 and Yong Wu2
School of Management, Hefei University of Technology, Hefei, 230009, China

Abstract: This paper discusses a decision method based on the theory of evidence. It analyses the problems of the existing decision methods and proposes analysing the focal elements of the basic probability number by a gradual refinement method, in order to solve for the basic probability numbers of the state factors that form each focal element. The method establishes a connection between the solution of these basic probability numbers and the process of expert group decision, and it makes decision making based on the theory of evidence more scientific and more rational.
Keywords: Theory of evidence, Dempster combination rule, neural network

1. Introduction
In decision analysis it is generally very difficult to specify the future state of a system accurately. An expert analyses the system state according to the knowledge and experience he has mastered and provides a subjective estimate of the future state of the system. When an expert group is used to estimate the future state of the system, the prediction suggestions of the group must be combined. Combination methods based on the theory of evidence [1-9] and on neural networks offer an effective way to combine the prediction suggestions of an expert group. Such a method [10] combines the prediction suggestions of the expert group and expresses the result as a basic probability number M on Θ (Θ is the system state set). Suppose the decision set is D = {d_1, d_2, ..., d_m}, the system states are x_j, and the corresponding remuneration function is r(d_i, x_j) (i = 1, ..., m; j = 1, ..., n). Under these known conditions, how to carry out the decision is the problem to be solved on the basis of the theory of evidence.

2. Decision Method Based on the Theory of Evidence and Its Existing Problems
XinSheng Duan provides two kinds of decision methods in [11, 12]. The first is to use the plausibility function to obtain subjective probabilities of the different system states, and then work out the expected remuneration of each decision scheme. Using the formula of the plausibility function

Pl({x}) = \sum_{x \in A} m(A),

we work out the point plausibilities pl({x_1}), ..., pl({x_n}) and set

p(x_i) = pl({x_i}) / [ pl({x_1}) + ... + pl({x_n}) ] = pl({x_i}) / \sum_{i=1}^{n} pl({x_i}),   i = 1, ..., n.    (2.1)

1 Weidong Zhu, Ph.D., Professor, E-mail: [email protected]
2 Yong Wu, Postgraduate, E-mail: [email protected]


Here p(x_1), ..., p(x_n) are the subjective probabilities that the various states appear. We choose the decision scheme corresponding to the maximum expected remuneration as the best scheme, namely

V = max_i { \sum_{j=1}^{n} r(d_i, x_j) * p(x_j) },   i = 1, ..., m.    (2.2)

The second kind of decision method is known as the M decision method. The idea of this method is first to work out the remuneration function of each focal element, and then to regard the basic probability number as a subjective probability in order to obtain the expected remuneration of every decision scheme as the decision basis:

r*(d_i, A_j) = (1 / |A_j|) \sum_{x_k \in A_j} r(d_i, x_k)    (2.3)

V = max_i { \sum_j r*(d_i, A_j) * m(A_j) },   i = 1, ..., m    (2.4)

We choose the decision scheme corresponding to the maximum expected remuneration as the best scheme. When using the two decision methods above, we should pay attention to the assumption underlying each of them. The first method uses formula (2.1) to determine the subjective probability that each state appears, which means that the greater the plausibility, the greater the subjective probability. Generally speaking, however, the point function pl({x_i}) cannot convey all the information contained in the basic probability number M and the belief function Bel: a larger pl({x_i}) does not necessarily mean that m({x_i}) and Bel({x_i}) are large. The second method uses formula (2.3) to calculate the remuneration function of the corresponding focal element; this formula takes the simple average of the remuneration values of the state factors included in the focal element. It therefore assumes that every state factor included in a focal element appears with equal probability, whereas in general the appearance probabilities of the states are not equal. Because of these assumptions, the two decision methods above have limitations. To solve this problem scientifically, we must know how to work out the appearance probability and the basic probability number of each system state, given that the basic probability assignment of each focal element is already known.
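A small numeric illustration of the point made above, that a larger plausibility does not imply a larger basic probability number or belief, is sketched below; the mass assignment is invented for the example.

```python
def pl(singleton, masses):
    """Plausibility of a singleton: sum of m(A) over focal elements A containing it."""
    return sum(m for focal, m in masses.items() if singleton in focal)

def bel(singleton, masses):
    """Belief of a singleton: the mass assigned to exactly that singleton."""
    return masses.get(frozenset([singleton]), 0.0)

# m({x1}) = 0.3, m({x2, x3}) = 0.7
masses = {frozenset(["x1"]): 0.3, frozenset(["x2", "x3"]): 0.7}
print(pl("x2", masses), bel("x2", masses))   # 0.7 0.0 -> large plausibility, zero m and Bel
print(pl("x1", masses), bel("x1", masses))   # 0.3 0.3
```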

3. A Decision Method That Solves for the Basic Probability Number of Each State Based on Focal Element Analysis
From the above analysis we know that, if we want to carry out decision analysis with the theory of evidence, we should address the concrete problem and explore a new decision method in order to overcome the problems caused by the above assumptions. The method is to work out the basic probability number and the belief function of each state based on focal element analysis. We can use this method to analyse the focal elements of the basic probability number that combines the group's expert opinions. Two situations are distinguished. In the first situation we know the basic probability numbers of all state factors of the system, namely m({x_i}), i = 1, ..., n. In the second situation the basic probability numbers of the state factors are unknown.
The first kind of situation: the basic probability number of each system state can be regarded as the basis for analysing and calculating the appearance probability of each state. Here m(Θ) may not equal zero; m(Θ) expresses doubt about the evidence. When we fully believe the basic probability numbers of the system states, we can define the appearance probability of each state as follows:

p(x_i) = m({x_i}) / [ m({x_1}) + m({x_2}) + ... + m({x_n}) ],   i = 1, ..., n    (3.1)

When we are suspicious of the basic probability numbers of the system states, we can distribute m(Θ) equally among the states and define the appearance probability of each state as follows:


p(x_i) = [ m({x_i}) + (1/n) * m(Θ) ] / [ m({x_1}) + m({x_2}) + ... + m({x_n}) + m(Θ) ],   i = 1, ..., n    (3.2)

The second kind of situation: we analyse the focal elements of the basic probability number by the gradual refinement method, in order to solve for the basic probability numbers of the state factors that form each focal element. First, if a focal element is formed by an individual state factor, the basic probability number of that state factor is simply the basic probability number of the corresponding focal element. Secondly, if a focal element contains an individual state factor whose basic probability number is already known, we eliminate that state factor and its basic probability number from the focal element and its basic probability number. Finally, we introduce new information to judge and solve for the individual state factors whose basic probability numbers have not yet been determined; for this we can use the expert consultation method. The state factors included in a focal element A_i are much fewer than the state factors of Θ, and the method resolves the system step by step by gradual refinement, so the method is feasible. It establishes a connection between the solution of the basic probability numbers of the state factors forming the focal elements and the process of expert decision, and it makes decision making based on the theory of evidence more scientific and more rational. After the basic probability number m({x_i}) of each system state has been obtained, we can work out the probability that each state takes place by means of the first-situation method. After that, we can calculate the expected remuneration of each decision scheme and choose the optimum decision scheme. The calculation formula is as follows:

V = max_i { \sum_{j=1}^{n} r(d_i, x_j) * p(x_j) },   i = 1, ..., m    (3.3)
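A minimal sketch of Eqs. (3.1)-(3.3) follows: turning singleton basic probability numbers (and a residual mass on Θ) into appearance probabilities and picking the scheme with maximal expected remuneration. The numerical values are a toy example, not the paper's security-market data.

```python
def state_probabilities(m_singletons, m_theta=0.0):
    """Eqs. (3.1)/(3.2): appearance probabilities from the singleton basic
    probability numbers; m_theta is the mass m(Theta) expressing doubt."""
    n = len(m_singletons)
    total = sum(m_singletons.values()) + m_theta
    return {x: (m + m_theta / n) / total for x, m in m_singletons.items()}

def best_scheme(remuneration, probabilities):
    """Eq. (3.3): pick the decision scheme with maximal expected remuneration."""
    expectations = {d: sum(p * remuneration[d][x] for x, p in probabilities.items())
                    for d in remuneration}
    return max(expectations, key=expectations.get), expectations

# Toy example: two schemes, three market states.
m = {"up": 0.5, "flat": 0.2, "down": 0.2}       # basic probability numbers
p = state_probabilities(m, m_theta=0.1)          # residual doubt m(Theta) = 0.1
r = {"buy":  {"up": 10, "flat": 1, "down": -8},
     "hold": {"up":  2, "flat": 1, "down":  0}}
print(best_scheme(r, p))                         # ('buy', {...})
```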

4. Empirical Research
This paper takes expert prediction on the securities market as an instance. The evidence is combined by means of the combination method based on the theory of evidence and neural networks, yielding the basic probability number M that reflects the future state of the securities market. The expert group decision method of gradual refinement and the M decision method are then used to work out the best decision scheme separately. The calculation results show that the former scheme is clearly superior to the latter.

Acknowledgments I gratefully acknowledge the support of NSFC (70171033) and natural science fund of Anhui Province (00043607); I also wish to thank the anonymous referees for their careful reading of the manuscript and their fruitful comments and suggestions.

References
[1] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, New Jersey, 1976, pp. 1-16.
[2] F. Voorbraak, On the Justification of Dempster's Rule of Combination, Artificial Intelligence, 1991, 48: 171-197.
[3] Renbing Xiao, Xue Wang, Qi Fei, The research of relevant evidence combination method, Pattern Recognition and Artificial Intelligence, 1993, 9: 227-234.
[4] Chun Yang, Huaizu Li, An Evidence Reasoning Model with Its Application to Expert Opinions Combination, System Engineering - Theory and Practice, 2001, 4: 43-48.
[5] Wenji Du, Yanhui Chen, Weixin Xie, A Weighted Dempster's Rule of Combination, Journal of XiDian University, 1999, 10: 549-551.
[6] Ronald R. Yager, On the Dempster-Shafer framework and new combination rules, Information Sciences, 1987, 41: 93-137.
[7] Quan Sun, Xiuqing Ye, Weikang Gu, A New Combination Rules of Evidence Theory, Acta Electronica Sinica, 2000, 8: 117-235.
[8] Shanlin Yang, Weidong Zhu, Minglun Ren, Research on Combination Method for Unequal Conflicted Evidence Based Optimal Adjustment Coefficient, Chinese MIS, 2003, 3: 55-60.
[9] Weidong Zhu, Shanlin Yang, Minglun Ren, Research of Evidence Combination Based on Neural Network, Proceedings of the Eighth PACIS, 2004, 8.
[10] Weidong Zhu, Shanlin Yang, Minglun Ren, Expert's Group Forecasting System Based on Learning and Theory of Evidence, Forecasting, 2003, 1.
[11] Weidong Zhu, Research on Theory of Evidence Based and Internet Oriented Intelligent Decision Support Systems, Dissertation of Hefei University of Technology, 2003, 4: 42-45.
[12] Xinsheng Duan, Theory of Evidence and Decision, Artificial Intelligence, China Renmin University Press, 1993, 3: 13-19.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 734-736

Preface of the Symposium : Stochastic Methods and Applications D. T. Hristopulos 1 Department of Mineral Resources Engineering, Technical University of Crete, GR-73100 Chania, Greece When we express in mathematical language a physical process that involves random quantities, we use essentially a stochastic model. Many scientific fields employ such models, even if their name does not involve the word stochastic (e.g., the physics of disordered media, geostatistics). The mathematical background of stochastic analysis is well developed and described in excellent textbooks (e.g., [1, 7, 13]). There is today a strong interest in applications of stochastic methods, because all natural and industrial processes in non-idealized environments involve random components due to intrinsic variabilities, which defy complete characterization, as well as the inescapable presence of experimental noise. Stochastic analysis has applications in fields such as structural engineering [11], physical processes in environmental media [3, 9], natural resources exploration [5], environmental health [4], finance [10], materials fracture [2], filtering theory [6], signal analysis [8], image analysis [12], etc. This session aims to facilitate an exchange of ideas between such disciplines. In this spirit, I will briefly outline some issues of terminology and of current research interest. The term stochastic typically refers to processes that are distributed in time (e.g., financial option prices), space (e.g., mineral concentrations), or both (e.g., groundwater pollutant concentrations). For spatially distributed variables the term spatial random fields is used, while for time evolution the term stochastic process is preferred. For space-time dependence the term space-time random fields is common. I will use the general term random field (RF) for both space and time dependence. RFs are used in the study of random heterogeneity, i.e., spatial variability that can not be captured by deterministic models. In the physics literature, the word disorder is often used for the same purpose, and it indicates departure from the perfectly orderly arrangement of crystal structures. Various classifications of RFs exist, depending on the property of interest. Regarding spatial (temporal) dependence, three main categories exist: (1) Uncorrelated RFs, i.e., white noise or fluctuations that can not be resolved by the measurement procedures. (2) Weakly correlated RFs, in which the correlations decay 'fast' with the distance between points (lag), so that the integral of the correlation function remains finite. Then, the RF is said to have short-range order. Note that disorder and short-range order are not mutually exclusive, since it is possible to have the latter in a system that exhibits long-range disorder. (3) Long-range correlated RFs in which the correlations decay slowly (e.g., as a power-law function of the lag). Correlations are important for many engineering applications, because they allow statistical prediction at points where measurements are not available, giving engineers the ability to design optimal exploitation strategies for mines and to estimate the environmental risk from the dispersion of toxic pollutants in the groundwater. Hence, randomness does not exclude predictability but 1 Corresponding

author. E-mail: [email protected]


implies a statistical distribution of the predictions, which must be accompanied by a measure of the associated uncertainty. If the probability density function (pdf) of the estimated variable is not available, one is limited to estimates of the mean (expected) value and the standard deviation. Uncertainty quantification is crucial in the analysis of industrial and environmental, processes. The related term reliability, gives a measure of belief in the nominal value of the process. In connection with product quality the term uniformity is used. Uniformity is inversely proportional to the width of the pdf for the specific product property. Presence of long-range correlations is important, because it invalidates mean-field approaches and has a significant impact on the statistics of extreme events. A random process is stationary (in the weak sense) if its correlation function depends only on the time lag but not the specific times. Similarly, spatial RFs are called stationary or statistically homogeneous. A space-time RF is then characterized as, e.g., stationary and statistically homogeneous. Notably, stationarity in time does not imply statistical homogeneity and vice versa. There is currently strong interest in the development of space-time correlation functions for environmental and other geostatistical applications. Quenched randomness occurs if the fluctuations are frozen in the system, if they are invariant during the time scales of interest and can be probed only empirically (e.g., fluid permeability of geological formations). Solutions of stochastic partial differential equations represent RFs that can be static (e.g., fluid pressure in porous media) or dynamic (e.g., tracer concentrations in groundwater transport). A significant distinction is between Gaussian and non-Gaussian RFs. The former are standard for many engineering applications (e.g., geostatistics), while the latter are necessary if the details of the tail are important (e.g., in the weak-link scaling model of fracture). It is hoped that this symposium will provide a forum for the fruitful exchange of ideas on new developments and applications of stochastic methods.

References
[1] R.J. Adler, The Geometry of Random Fields, Wiley, New York (1981).
[2] B.K. Chakrabarti and L.G. Benguigui, Statistical Physics of Fracture and Breakdown in Disordered Systems, Clarendon Press, Oxford (1997).
[3] G. Christakos, Random Field Models in Earth Sciences, Academic Press, San Diego (1992).
[4] G. Christakos and D. T. Hristopulos, Spatiotemporal Environmental Health Modelling, Kluwer Academic Publ., Boston (1998).
[5] P. Goovaerts, Geostatistics for Natural Resources Evaluation, Oxford University Press, NY (1997).
[6] A. H. Jazwinski, Stochastic Processes and Filtering Theory, Academic Press (1970).
[7] B. Øksendal, Stochastic Differential Equations, Springer, Berlin (1998).
[8] A. Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, NY (2001).
[9] Y. Rubin, Applied Stochastic Hydrogeology, Oxford University Press, New York (2003).
[10] A. N. Shiryaev, Essentials of Stochastic Finance, World Scientific, Singapore (1999).
[11] J. Sólnes, Stochastic Processes and Random Vibrations, J. Wiley, New York (1997).
[12] G. Winkler, Image Analysis, Random Fields and Dynamic Monte Carlo Methods: A Mathematical Introduction, Springer-Verlag, New York (1995).
[13] M. Yaglom, Correlation Theory of Stationary and Related Random Functions I, Springer, New York (1987).


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 737-740

Effects of Uncorrelated Noise on the Identification of Spatial Spartan Random Field Parameters
D. T. Hristopulos1
Department of Mineral Resources Engineering, Technical University of Crete, GR-73100 Chania, Greece
Received 4 September, 2004; accepted in revised form September 10, 2004
Abstract: Spartan Spatial Random Fields (SSRF's) provide a new class of models for spatial dependence. Their main advantages compared to classical geostatistical techniques are computational efficiency and parametric parsimony. This paper focuses on the identification of the SSRF parameters from data contaminated with uncorrelated random noise, known by the name 'nugget effect' in geostatistical applications.
Keywords: geostatistics, inference, inverse problem, spatial dependence, optimization
Mathematics Subject Classification: 63M30, 62P12, 62P30, 62M40

1 Introduction

Spatial random fields (SRFs) are widely used as models of spatial dependence for environmental and geophysical processes [2, 3, 4, 11]. A spatial random field (SRF) [1, 13] includes a number of possible states (realizations), so that the observation frequency of a particular state is determined from a multivariate probability density function (pdf), which depends on the point values of the field but also on the spatial configuration of each state. In general, an SRF state can be decomposed into a trend m_x(s) and a fluctuation X'(s), so that

X(s) = m_x(s) + X'(s).    (1)

The trend represents the large-scale variations of the field, obtained by averaging the realizations over the ensemble, i.e. mx(s) = E[X(s)], while the fluctuation corresponds to fast variations, which may appear as random changes. For many applications in Geostatistics one can assume that the fluctuation is a second-order stationary SRF, or a random field with second-order stationary increments [13]. In the following, we will assume that the initial SRF has been detrended, and we will use the symbol X(s) for the fluctuation. In addition, for short we will refer to second-order stationary fields as stationary. Spartan spatial random fields (SSRFs) [6], belong to the class of Gauss-Markov random fields [12]. They aim to provide computationally efficient tools for geostatistical spatial distributions, including environmental pollutant or mineral concentrations, transport coefficients of heterogeneous media, environmental health factors, etc. The term Spartan indicates a parsimonious set of model parameters. In [6] the fluctuation-gradient-curvature (FGC) SSRF model was studied. Its energy 1 Corresponding

author. E-mail: [email protected]


functional embodies Gaussian fluctuations and involves three terms that measure the magnitude of the fluctuations, as well as the fluctuation gradient and the curvature. The main mathematical properties, including expressions for the covariance function, and permissibility conditions [13] for the model parameters were discussed for the continuum and square-lattice models. The covariance spectral density of the SSRFs is cut off by construction at a frequency that corresponds to physical resolution constraints. Since the SSRFs are band-limited, their realizations are differentiable in the mean square sense [9]. The probability density function (pdf) of SSRFs is determined from an energy functional H[X_λ(s)], according to the expression for the Gibbs pdf familiar from statistical physics:

f_x[X_λ(s)] = Z^{-1} exp{ -H[X_λ(s)] }.    (2)

The constant Z (partition function) is the pdf normalization factor, obtained by integrating exp(-H) over all degrees of freedom. The subscript λ denotes the fluctuation resolution scale. The energy functional determines spatial variability in terms of interactions between neighboring sites. The pdf of stationary Gaussian SRFs used in classical geostatistics can be expressed in a similar form as follows:

(3)

where G_{x;λ}(s, s') is the covariance function; the latter needs to be determined from the data for all distance vectors s - s'. In the following, we suppress λ for notational convenience. In SSRF models, spatial dependence is determined from interactions, which are physically motivated (e.g., in the case of space-time models that involve dynamic evolution of the pdf) or represent plausible geometric constraints (e.g., in the case of quenched geological disorder). The pdf of the FGC model involves three main parameters: the scale factor η_0, the covariance-shape parameter η_1, and the correlation length ξ. The introduction of these physically meaningful parameters simplifies the identification problem and allows intuitive initial guesses. A factor that adds flexibility is the coarse-graining kernel that determines the fluctuation resolution scale λ [6]. The resolution is directly related to the smoothness properties of the SSRF. In [6, 7], a kernel with a boxcar spectral density with a sharp cutoff in frequency/wavevector space at k_c ∝ λ^{-1} was assumed. While the cutoff frequency is treated as a constant, it is also possible to consider it as an additional model parameter. An implication of the interaction-based energy functional is that the model parameters follow from simple sample constraints. This feature permits fast, practically linear in N, solution of the parameter inference problem. For SSRFs sampled on inhomogeneous lattices, the interactions between 'near neighbors' are not uniquely defined. One alternative [6] is to superimpose a regular square lattice on the sample area. Each lattice cell then contains a variable number of sample points. The neighbor interactions in this scheme are between points in neighbor cells. Determining the neighbor structure increases the computational effort [6], but the inference process is still fast. Methods for non-constrained simulation of FGC SSRFs were presented in [7]. In both cases the simulations are exact, in the sense that the simulated states respect Gaussian statistics and the spatial structure by construction. In the lattice case, spatial correlations are imposed by filtering independent Gaussian random numbers with the square root of the covariance spectral density function. In the case of irregular distributions, correlations are imposed by sampling an adequate number (i.e., of the order of 10^4) of wavevectors from a probability distribution obtained by integrating the covariance spectral density. Geometric anisotropy has been addressed with a systematic method [5], which allows determining the main parameters of anisotropy and then transforming into an isotropic coordinate system. The method was applied to real data from a mining site in Greece in [8].


2 SSRF with Random Noise

Eq. (1) assumes that the SSRF represents a measurable property, the spatial variation of which consists of a deterministic trend and a correlated fluctuation. In practice, the observable X*(s) is likely to be contaminated by noise, which may represent unresolved fluctuations or measurement errors. Then the observable SRF is given by the sum of two terms,

X*(s) = X(s) + e(s)    (4)

where e(s) is a Gaussian white noise SRF that is statistically independent of the SSRF X(s). The covariance function of the observable field X*(s) is also additive due to the independence of the noise and property fields, i.e., G*_x(r) = σ_e^2 δ(r) + G_x(r). The additive property holds for the covariance spectral densities as well. The observable spectral density includes an additional parameter (the variance of the noise field). Based on [6], if d is the spatial dimension and K_λ(k) the kernel spectral density, it follows that (5)

3 Parameter Identification

To determine the model parameters, one needs to define experimental constraints that capture the main features of the SSRF and the noise. The constraints used in [6] are based on two-point products of the field in d orthogonal directions, i.e., S_0(s) = X_λ^2(s), S_1(s) = \sum_{i=1}^{d} [∇_i X_λ(s)]^2, and S_2(s) = \sum_{i,j=1}^{d} Δ_2^{(i)}[X_λ(s)] Δ_2^{(j)}[X_λ(s)], where Δ_2^{(i)} denotes the centered second-order difference operator. The experimental constraints are given by the sample averages of S_0(s) (sample variance), S_1(s) (average square gradient) and S_2(s). The procedure defined in [6] for the calculation of the experimental constraints does not change in the presence of noise. The respective stochastic constraints are E[S_m(s)], m = 0, 1, 2, where E[·] denotes the ensemble average. The goal is to obtain relations for the stochastic constraints in terms of the parameters of the observable covariance function. Let E*[S_m], m = 0, 1, 2 denote the moments of the noisy field. Since the noise term has an impact on the covariance only at zero lag, we obtain:

E*[S_0] = E[S_0] + σ_e^2    (6)
E*[S_1] = E[S_1] + 2 d σ_e^2 / a^2    (7)
E*[S_2] = E[S_2] + 4 d^2 σ_e^2 / a^4    (8)

where the SSRF moments E[S_m] are given in [6], and a is the lattice step. Matching the stochastic and experimental constraints is formulated as an optimization problem in terms of a distance functional that measures the deviation between the two sets of constraints [6]. Suitable grouping of the constraints allows eliminating η_0. This corresponds to using the normalized spectral density G*_x(k)/η_0. Minimization of the distance functional leads to a set of optimal values η_1, ξ, σ_e^2/η_0. The value of η_0 then follows from the ratio

(9)

Using k_c as an additional parameter in the optimization needs further investigation. The efficiency of the parameter identification procedure is due to two factors: firstly, formulation of the


FGC pdf in terms of a clearly defined and physically meaningful correlation length and a dimensionless shape coefficient implies straightforward interpretation of the model parameters. Rough estimates for the correlation length can be obtained either by visual inspection of preliminary maps or from the plot of the dispersion variance of block-averaged SRF values versus the block size [10]. Secondly, the distance functional involves scaled moments that are independent of the global scaling parameter η_0. This reduces the number of free parameters from four to three, i.e., η_1, ξ, σ_e^2/η_0. The parameters η_0 and σ_e^2 are then evaluated from the total variance of the sample. We will investigate the framework outlined above for the SSRF parameter identification problem in the presence of noise using synthetic SSRF realizations mixed with uncorrelated noise.
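A minimal numpy sketch of the experimental constraints S_0, S_1, S_2 on a square-lattice realization is given below. Boundary handling (periodic wrapping via np.roll) and the detrending step are simplifications introduced here for illustration and do not reproduce the exact procedure of [6].

```python
import numpy as np

def sample_constraints(field, a=1.0):
    """Sample averages of S0, S1, S2 for a 2-D lattice realization `field`.
    S0: squared (detrended) field, S1: centred first differences,
    S2: centred second differences, lattice step a."""
    x = field - field.mean()
    grads = np.gradient(x, a)                     # centred first differences per axis
    s0 = np.mean(x ** 2)
    s1 = np.mean(sum(g ** 2 for g in grads))
    # centred second differences along each axis (periodic boundaries for simplicity)
    d2 = [(np.roll(x, -1, axis=k) - 2 * x + np.roll(x, 1, axis=k)) / a ** 2
          for k in range(x.ndim)]
    s2 = np.mean(sum(d2i * d2j for d2i in d2 for d2j in d2))
    return s0, s1, s2

noisy = np.random.default_rng(0).normal(size=(64, 64))   # pure white noise
print(sample_constraints(noisy))
```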

References
[1] R.J. Adler, The Geometry of Random Fields, Wiley, New York (1981).
[2] G. Christakos, Random Field Models in Earth Sciences, Academic Press, San Diego, CA (1992).
[3] G. Christakos and D. T. Hristopulos, Spatiotemporal Environmental Health Modelling, Kluwer Academic Publ., Boston (1998).
[4] P. Goovaerts, Geostatistics for Natural Resources Evaluation, Oxford University Press, NY (1997).
[5] D. T. Hristopulos, New anisotropic covariance models and estimation of anisotropic parameters based on the covariance tensor identity, Stochastic Environmental Research and Risk Assessment, 16 (1) 43-62 (2002).
[6] D. T. Hristopulos, Spartan Gibbs random field models for geostatistical applications, SIAM Journal in Scientific Computation 24 2125-2162 (2003).
[7] D. T. Hristopulos, Simulations of Spartan random fields, in Proceedings of the International Conference of Computational Methods in Sciences and Engineering (Editor: T. E. Simos) 242-247, World Scientific, London, UK (2003).
[8] D. T. Hristopulos, Anisotropic Spartan Random Field Models for Geostatistical Analysis, in Proceedings of 1st International Conference on Advances in Mineral Resources Management and Environmental Geotechnology (Editors: Z. Agioutantis and K. Komnitsas) 127-132, Heliotopos Conferences (2004).
[9] D. T. Hristopulos, Spartan Random Fields: Smoothness Properties of Gaussian Densities and Definition of Certain Non-Gaussian Models, in Interfacing Geostatistics, GIS and Spatial Data Bases, Proceedings of International Workshop STATGIS 2003 (Editor: J. Pilz), Springer, Berlin, Heidelberg, forthcoming.
[10] C. Lantuejoul, Geostatistical Simulation: Models and Algorithms, Springer, Berlin, Germany (2002).
[11] Y. Rubin, Applied Stochastic Hydrogeology, Oxford University Press, New York (2003).
[12] G. Winkler, Image Analysis, Random Fields and Dynamic Monte Carlo Methods: A Mathematical Introduction, Springer-Verlag, New York (1995).
[13] M. Yaglom, Correlation Theory of Stationary and Related Random Functions I, Springer, New York (1987).


Lecture Series on Computer and Computational Sciences Volume I, 2004,pp. 741-744

An Application of Spartan Spatial Random Fields in Geostatistical Mapping of Environmental Pollutants
Manolis Varouchakis and Dionissios T. Hristopulos1
Department of Mineral Resources Engineering, Technical University of Crete, GR-73100 Chania, Greece
Received 8 August, 2004; accepted in revised form 16 September, 2004

Abstract: This paper presents a preliminary application of Spartan Spatial Random Fields (SSRFs) to a real geostatistical data set. The SSRFs bypass calculation and fitting of the experimental variogram and thus provide a computationally fast alternative to the classical geostatistical approach. The study focuses on the concentration of the heavy metal chromium (Cr) in the Jura region (Switzerland). A map of Cr concentration on a square prediction grid is generated based on a set of irregularly spaced measurements. A new estimation method that minimizes the SSRF interaction functional is applied instead of the commonly used kriging estimators.
Keywords: geostatistics, estimation, chromium concentration, prediction, validation
Mathematics Subject Classification: 63M30, 62P12, 62P30, 62M40

1. Introduction
The distribution of heavy metals in the soil over an area of 14.5 km2 in the Swiss Jura region [1] is studied. The available data consist of 359 samples that contain concentrations in parts per million (ppm) for seven heavy metals. The aim of this study is to generate from the data a map describing the dispersion of pollutant concentrations in the entire area. This can be accomplished with the aid of a geostatistical model. Such maps are useful tools for planning soil remediation actions in compliance with EU regulations. Typically, a variant of the kriging algorithm [1] is used to generate geostatistical maps. In contrast, we apply the recently introduced Spartan Spatial Random Field (SSRF) model. The initial data are segregated into two sets. The training set, used for the parameter identification (inference) process, contains 259 samples at the locations s_t, t = 1, ..., N_t. The validation set contains 100 points s_v, v = 1, ..., N_v, where the estimated concentrations are compared with the actual values, to obtain a quantitative measure of the method's accuracy. At this stage, the geostatistical analysis is applied to chromium (Cr), which follows approximately the normal distribution.

2. Geostatistical Model
Spartan Spatial Random Fields (SSRFs) have been introduced and their mathematical properties have been investigated [2]-[5], mostly for Gaussian probability density functions. However, the SSRF models have not been applied to real data sets. The SSRF models are determined by a small number of parameters, the inference of which from the sample does not require the variogram calculation. The SSRF parameter inference is based on relatively simple statistical constraints that are calculated efficiently. Let us assume that the variable of interest (e.g., concentration) is represented by the spatial random field X_λ(s), where λ is the resolution.
1 Corresponding author. E-mail: [email protected]


The SSRF probability density function (pdf) includes all the information regarding the spatial dependence. General SSRF models are subsets of Gibbs random fields and thus can be expressed by the following mathematical relation:

f_X[X_λ(s)] = Z^{-1} exp{ -H[X_λ(s)] }    (1)

In the above equation, f_X[X_λ(s)] is the pdf, Z is a normalization constant, and H[X_λ(s)] is a spatial interaction functional that expresses the dependence of the random field values at different locations in space. The innovation introduced by the SSRFs is the dependence of H[X_λ(s)] on physical quantities related to the spatial distribution of values, i.e., the local gradient and curvature of the "topography", as well as the complete determination of the interaction from a small number of parameters. An example of local interdependence is given by the following functional:

(2)

The integral implies that all the locations are included (in practice, discretized versions are used). The scale parameter η_0 determines the total variance of the fluctuations, the coefficient η_1 the shape of the covariance function, and ξ the correlation radius (i.e., the range of the spatial dependence). In geostatistical applications, the first objective is solving the inverse problem that leads to the identification of the model parameters from the available data. For the SSRF model, statistical constraints that can be calculated from the sample are defined. It is possible (but not obligatory) for the constraints to correspond to the terms of the interaction functional. For example, for the interactions given by equation (2), the constraints include the sample variance, as well as the average gradient square and curvature square. The respective ensemble (mean) values of these quantities are calculated from the pdf (1), and they involve the unknown parameters η_0, η_1, ξ. The optimal values of the parameters are determined by minimizing the difference between the sample and the ensemble values of the constraints [2]. The calculation of the statistical constraints and the solution of the optimisation problem are typically faster than the calculation of the variogram [2].

3. Estimation and Validation
It is assumed that the Cr distribution corresponds to a Gaussian, second-order stationary and isotropic random field, the spatial dependence of which can be modelled by means of the SSRF given by equations (1) and (2). The parameters η_0, η_1 and ξ are determined from the Cr concentration at the points of the training set, using the procedure described in [2]. The optimal values are η_1 = 394 and ξ = 1.3 (km). A correlation neighbourhood B(s_v) is defined around each validation point s_v, based on the correlation radius. In general, two options are available for the estimation: (i) the SSRF covariance function is used in a kriging algorithm; (ii) the estimates X^(s_v) are determined from local solutions of the equation H[X_λ(s)] = 0, s ∈ B(s_v), in the neighbourhood of s_v. The solutions of this equation can be expressed in closed form for the interaction functional of equation (2). Here we use approach (ii), which will be described in detail elsewhere. Explicit expressions for the uncertainty of the estimates can also be obtained. For values of the shape coefficient η_1 >> 2, as is the case here, the estimate is given by the equation

X^(s_v) = 4π (A_v + B_v),   v = 1, ..., N_v.    (3)

The local parameters A_v, B_v are determined by minimizing the average square error of the explicit solutions X^(s_i) with respect to the actual values X_λ(s_i). The average is calculated over all the training points s_i ∈ B(s_v) that are located inside B(s_v). The parameter η_0 does not influence the estimated concentration values, but it does affect the uncertainty of the estimates.


4. Analysis of SSRF Model Performance
A comparison between the estimated values (i.e., predictions) and the actual values of Cr concentration at the validation points is shown in Fig. 1. The plot reveals deviations between the SSRF estimates and the actual values, and the overall variability of the estimates is less than the actual one. The deviation is measured by means of the average absolute value of the relative error, ε_r, which is defined by means of (4)

The calculated value of ε_r is approximately 28%. Further examination of the training and validation sets shows that in the former the concentration values range between 18 ppm and 65 ppm, while in the latter between 3.3 ppm and 70 ppm. This observation explains why the estimated variability (based on the training set) underestimates the actual variability. The observed deviation between the actual values and the SSRF estimates is partly due to this factor. In particular, the SSRF predictions are significantly off at validation points where the concentration values are outside the range of the training set. For example, if the two points with the lowest and highest concentrations are removed from the validation set, the error decreases to 19%. The performance is reasonable, given the fact that a simple SSRF model is used, which corresponds to a normal (Gaussian), stationary and isotropic distribution. A map of Cr concentration on a regular grid that covers the study area is shown in Fig. 2.
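Equation (4) is not legible in this copy; the sketch below shows one common definition consistent with the phrase "average absolute value of the relative error", and may differ in detail from the authors' formula. The numerical inputs are invented for illustration.

```python
import numpy as np

def mean_abs_relative_error(actual, predicted):
    """Average over the validation points of |X*(s_v) - X(s_v)| / X(s_v)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs(predicted - actual) / actual)

print(mean_abs_relative_error([20.0, 40.0, 50.0], [25.0, 36.0, 55.0]))  # 0.15
```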

"E c. g

c:

.Q



c

g 0

(.)

10

00

20

40

60

80

100

Figure 1: Plot of the actual Cr concentration (*, continuous line) and the SSRF estimates (x, dashed line) at the locations of the validation set.

5. Conclusions
This paper presents a preliminary application of a specific SSRF model in environmental pollutant mapping, using a data set of Cr soil concentration from the Jura region of Switzerland. Reasonable agreement between the actual and estimated values at the points of the validation data set is observed, in spite of several conceptual simplifications implied by the specific model. This supports the feasibility of using SSRFs for the generation of geostatistical maps. Future research will focus on (i) testing the conceptual assumptions and (ii) methodological improvements of the SSRF inference and estimation algorithms, aiming at better discretization of the training and prediction grids and optimal values of the correlation radius ξ and the shape coefficient, which are expected to improve the prediction accuracy. In addition, the estimation uncertainty will be investigated, and comparisons with kriging methods will be conducted.



Figure 2: Gray-scale map of Cr concentrations generated by the SSRF method on a prediction grid. The training set locations are marked by diamonds, and the validation set locations are marked by squares.

References
[1] P. Goovaerts, Geostatistics for Natural Resources Evaluation, Oxford University Press, New York (1997).
[2] D. T. Hristopulos, Spartan Gibbs random field models for Geostatistical Applications, SIAM Journal in Scientific Computation, 24, 2125-2162 (2003).
[3] D. T. Hristopulos, Anisotropic Spartan Random Field Models for Geostatistical Analysis, Proceedings of 1st International Conference on Advances in Mineral Resources Management and Environmental Geotechnology (editors: Z. Agioutantis and K. Komnitsas, TUC), Heliotopos Conferences (2004), 127-132.
[4] D. T. Hristopulos, Numerical Simulations of Spartan Gaussian Random Fields for Geostatistical Applications on Lattices and Irregular Supports, Journal of Computational Methods in Sciences and Engineering, in print.
[5] D. T. Hristopulos, Spartan Random Fields: Smoothness Properties of Gaussian Densities and Definition of Certain Non-Gaussian Models, Interfacing Geostatistics, GIS and Spatial Data Bases, Proceedings of International Workshop STATGIS 2003 (Editor: J. Pilz, Klagenfurt), Springer, Berlin, in print.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 745-748

On the Application of Statistical Process Control to the Stochastic Complexity of Discrete Processes
A. Shmilovici*1, I. Ben-Gal"
*Dept. of Information Systems Engineering, Ben-Gurion University, P.O. Box 653, Beer-Sheva, Israel
"Department of Industrial Engineering, Tel Aviv University, Ramat-Aviv, Tel-Aviv 69978, Israel
Accepted 29 August, 2004
Abstract: A change in a process would change its description length, as measured by its stochastic complexity. The key idea here is to monitor the statistics of the stochastic complexity (the equivalent code length) of a data sequence. A context tree model is used as the universal device for measuring the stochastic complexity of a state-dependent discrete process. The advantage of this method is the expected reduction in the number of samples needed for reliable monitoring.
Keywords: Process Control, Control charts, Stochastic Complexity, Context Tree Algorithm
Mathematics Subject Classification: 62M02, 93E35, 62M10, 62P30

1. Introduction
Most statistical process control (SPC) methods construct a Shewhart-type control chart for the distribution of some attribute of the process (e.g., the mean) that is being monitored. Often the observations (or attributes) are assumed to be independent and identically distributed (i.i.d.), and traditionally they are assumed to be normally distributed. As evidenced by the widespread implementation of the Shewhart control charts, this practice has proved to be very useful. However, there are important situations where the i.i.d. and normality assumptions are grossly inaccurate and can lead to false alarms and late detection of faults. For example, due to the widespread use of digital controllers, many industrial processes are controlled by a feedback controller that intervenes whenever the process deviates significantly from a pre-specified set-point. This introduces dependency between consecutive samples and deviation from the normality assumption. The need to find substitutes for the traditional Shewhart control charts has been recognized in the literature. Model-specific methods (such as the Exponentially Weighted Moving Average) unfortunately need relatively large amounts of data to produce accurate results and cannot capture more complex process dynamics, such as Hidden Markov Models (HMM). Model-generic methods (such as the ITPC [1] and the CSPC [2]) use asymptotic properties from information theory as a replacement for explicit distribution assumptions about the process characteristics. In practice, large amounts of data are needed for the application of ITPC, either for deriving an empirically based control limit, or for using an analytically derived control limit that is based on asymptotic considerations. The stochastic complexity of a sequence is a measure of the number of bits needed to represent and reproduce the information in the sequence. It is a statistic commonly used as a yardstick for choosing between alternative models for an unknown time series. Universal coding methods have been developed to compress a data sequence without prior assumptions on the properties of the generating process. Universal coding algorithms - typically used for data compression - model the data for coding in a less redundant representation. The length of the data after compression is a practical measure of the stochastic complexity of the data. Some universal algorithms are known to have asymptotic performance as good as the optimal non-universal algorithms.
1 Corresponding author. E-mail: [email protected]


This means that for long sequences, the model provided by the universal source behaves like the "true" system for all tasks we wish to use it for, such as coding, prediction, and decision making in general. Here we use the context tree algorithm of Rissanen [3], which is a universal coding method, for measuring the stochastic complexity of a time series. The advantage of this specific algorithm, in contrast to other universal algorithms known only to have asymptotic convergence, is that it has also been shown to have the best non-asymptotic convergence rate. Thus, it can be used to compress even relatively short sequences - like the ones available from industrial processes. The key idea in this paper is to monitor the statistics of the stochastic complexity (the equivalent code length) of a data sequence. A context tree model is used as the universal device for measuring the stochastic complexity of a state-dependent discrete process. The advantage of this method is the expected reduction in the number of samples needed for reliable monitoring.

2. The Context Tree Method
Following the notation in [4], let us consider a discrete sequence with N+1 symbols, X^0_{-N} = X_{-N}, X_{-N+1}, ..., X_0, where each symbol X_i belongs to an alphabet A of cardinality |A|, and where the sequence is emitted by a stationary source. In the estimation problem, given X^0_{-N}, one needs to estimate P(X_1 | X^0_{-N}), the unknown conditional probability distribution of X_1 given X^0_{-N}. Consider the class of universal conditional probability measures that count the recurrence of the longest suffix of X_1 in X^0_{-N}. The suffix X^0_{-K_0}(X^0_{-N}) is a subsequence of the past sequence X^0_{-N}, termed also the context. In the context tree algorithm, probability estimates are computed for every context and a weighted probability is evaluated. A context is represented by the path of branches starting at the root until it reaches a specific node. The context order is reversed with respect to the order of observation, such that deeper nodes correspond to previously observed symbols. The lengths (depths) of the various contexts (branches in the tree) do not need to be equal. Given a context tree, compression can be obtained as a result of recurring patterns in the data. Each node in the tree is related to a specific recurring context (sub-sequence). Hence, the original sequence can be coded by the sub-sequences in the context tree. Using an arithmetic coder, it is guaranteed that the redundancy does not exceed two bits per sequence. A sequence that does not belong to the same class of sequences from which the context tree was generated (trained) is expected to obtain a lower compression rate when using the context tree probabilities from the training set.

3. SPC for the Stochastic Complexity Derived from the Context Tree Model
Within the context of developing an SPC procedure, we have to set up three parameters for the algorithm: N_0 - the series length from which the reference tree will be computed; K_max - the maximal context tree depth; and N - the sequence length for which the compression statistic is computed with the context tree. According to Ziv [5], if we set the depth of the context tree such that P(X_1 | X^{-1}_{-K_max}) ~ |A|^{-1/3}, then N_0 ~ K_max and N > N_0 |A|^{1/3}.

The stochastic complexity of any sequence x_1^N that was prefixed by the sequence x^0_{-D+1} (the context), and was generated by the same information source that generated the sequence y^0_{-N+1}, can be measured with the universal algorithm that uses the context tree T = T(S, Θ_S), which has a structure S and parameters Θ_S. The context tree is found from the sequence y^0_{-N+1} (the training sequence). There is a recursive method to calculate the stochastic complexity measure:

-log_2( Pr(x_1^N | x^0_{-D+1}, T) ) = -log_2( \prod_{j=1}^{N} Pr(x_j | x^{j-1}_{j-D}, T) ) = -\sum_{j=1}^{N} log_2( Pr(x_j | x^{j-1}_{j-K_j(j)}) )

where K_j(j) = K_j(x_j, x^{j-1}_{j-D}, T) is the length of the context used at position j.



The context tree T with |S| leafs describes a multinomial distribution with values Pr(x_j | x^{j-1}_{j-K_j(j)}), j = 1, ..., N. When the context tree model is estimated from a sufficiently long sequence y^0_{-N+1}, with N >> |S|, the Pr(x_j | x^{j-1}_{j-K_j(j)}) are also independent and identically distributed. In that case, the expression is approximately the sum of i.i.d. random variables. The value of the stochastic complexity of the sequence x_1^N prefixed by the sequence x^0_{-D+1} is a random variable with |S|^N possible discrete values. For a sufficiently large N, the distribution of the stochastic complexity can be approximated with the Central Limit Theorem. Now we are ready to devise a recipe for the SPC of a process, using its stochastic complexity as the measure of process stability:
• Use a sequence of observations y^0_{-N+1} from the in-control state of the system to develop the context tree model T = T(S, Θ_S).
• Calculate the first two moments from T.
• Denote by q_1 and q_2 the required False Alarm Rates (FARs) with respect to the Upper Probability Limit (UPL) and Lower Probability Limit (LPL). Construct the Shewhart control charts with q_1 = q_2 = 0.00135.



• For every sequence of length N, use the context tree to compute its stochastic complexity. Insert a point in the Shewhart control chart and observe the UCL and LCL.
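A minimal sketch of this monitoring loop is given below. For simplicity it replaces the paper's context tree with a smoothed first-order Markov model, and it estimates the control limits empirically from windows of the in-control sequence rather than analytically from the tree moments; all function names and parameter choices are illustrative assumptions.

```python
import numpy as np

def fit_markov(train, states, alpha=0.5):
    """Smoothed transition-probability estimates from an in-control sequence."""
    idx = {s: i for i, s in enumerate(states)}
    counts = np.full((len(states), len(states)), alpha)
    for a, b in zip(train[:-1], train[1:]):
        counts[idx[a], idx[b]] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def code_length(seq, trans, states):
    """Stochastic complexity (in bits) of `seq` under the fitted model,
    used here as a stand-in for the context tree code length."""
    idx = {s: i for i, s in enumerate(states)}
    return -sum(np.log2(trans[idx[a], idx[b]]) for a, b in zip(seq[:-1], seq[1:]))

def control_limits(train, trans, states, window, z=3.0):
    """Mean +/- z*sigma limits of the per-window code length (CLT approximation)."""
    scores = [code_length(train[i:i + window], trans, states)
              for i in range(0, len(train) - window, window)]
    return np.mean(scores) - z * np.std(scores), np.mean(scores) + z * np.std(scores)

# monitoring: flag a new window whose code length falls outside (lcl, ucl)
# lcl, ucl = control_limits(train_seq, fit_markov(train_seq, states), states, 50)
```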

4. Example: SPC for a Single-Stage Production System
Consider a system of two machines, M1 and M2, separated by a finite-capacity buffer B. Machine M2 attempts to process a part whenever it is not starved (i.e., whenever there are parts in its input buffer), and machine M1 attempts to process a part whenever it is not blocked. Figure 1 presents the state transition diagram for the system, with in-control production probabilities of the machines p1 = 0.9 and p2 = 0.8, respectively, and a buffer capacity limited to C = 3.
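The sketch below simulates buffer-level sequences for this two-machine line, producing the discrete process whose stochastic complexity is monitored. The exact transition mechanics are given only in Figure 1, which is not reproduced here, so the order in which production and consumption are applied within a time slot is an assumption made for illustration.

```python
import random

def simulate_buffer_levels(n_steps, p1=0.9, p2=0.8, capacity=3, seed=0):
    """Simulate the two-machine line: M1 adds a part with probability p1
    unless the buffer is full (blocked), M2 removes one with probability p2
    unless the buffer is empty (starved). Returns the buffer-level sequence."""
    rng = random.Random(seed)
    level, levels = 0, []
    for _ in range(n_steps):
        if level < capacity and rng.random() < p1:
            level += 1
        if level > 0 and rng.random() < p2:
            level -= 1
        levels.append(level)
    return levels

in_control = simulate_buffer_levels(1000)                        # p1 = 0.9, p2 = 0.8
out_of_control = simulate_buffer_levels(1000, p1=0.7, p2=0.9)    # shifted parameters
```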


Figure 1: The state transition diagram of the in-control production system.
Figure 2 presents the 'in-control' distribution of buffer levels in the referenced process, Pr_1(x_j | x_{j-1}), in the form of an analytical context tree T_a. Note that it is a single-level tree with |S| = 4 contexts and a symbol set of size |A| = 4. The root node presents the system steady-state probabilities, and the leaves present the transition probabilities given the current state.

Figure 2: In-control context tree T_a, based on the 'in-control' state.
Figure 3 presents a second context tree T_b, Pr_2(x_j | x_{j-1}), with different values of the production probabilities, namely p1 = 0.7 and p2 = 0.9. It is used for simulating the 'out of control' distribution. As we can see, it has captured the probability difference.


Figure 3: Context tree Tb for the modified system parameters p1 = 0.7, p2 = 0.8.
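For readers who want to generate sequences of the kind monitored in this example, the sketch below simulates buffer levels for the two-machine line. The slot-by-slot dynamics (M1 deposits a part with probability p1 when the buffer is not full, M2 removes one with probability p2 when it is not empty) are an assumed simplification used only for illustration; they do not reproduce the exact transition probabilities of Figure 1.

```python
import numpy as np

def simulate_buffer_levels(p1, p2, capacity=3, n_steps=1000, seed=0):
    """Generate a buffer-level sequence for the two-machine line M1 -> B -> M2.

    Assumed dynamics: in each time slot M1 deposits a part with probability p1
    unless the buffer is full (blocked), and M2 removes a part with probability
    p2 unless the buffer is empty (starved)."""
    rng = np.random.default_rng(seed)
    level, levels = 0, []
    for _ in range(n_steps):
        produced = level < capacity and rng.random() < p1
        consumed = level > 0 and rng.random() < p2
        level += int(produced) - int(consumed)
        levels.append(level)
    return np.asarray(levels)

in_control = simulate_buffer_levels(p1=0.9, p2=0.8)      # reference process
out_of_control = simulate_buffer_levels(p1=0.7, p2=0.9)  # shifted process, cf. Figure 3
```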

5. Conclusions We proposed a method for using the stochastic complexity of a sequence as the statistical measure for SPC. The stochastic complexity measure uses a context tree model as a universal model that can capture complex dependencies in the data. The advantage of the proposed method is two-fold: a) It is generic and suitable for many types of discrete processes with complex unknown dependencies. b) It is suitable for relatively short data sequences. The viability of the proposed SPC method and its advantages were demonstrated with numerical experiments.

References
[1] L.C. Alwan, N. Ebrahimi, E.S. Soofi, "Information Theoretic Framework for Process Control", European Journal of Operations Research, 111, 526-542 (1998).
[2] I. Ben-Gal, G. Morag, A. Shmilovici, "CSPC: A Monitoring Procedure for State Dependent Processes", Technometrics, 45(4), 293-311 (2003).
[3] J. Rissanen, "A Universal Data Compression System", IEEE Transactions on Information Theory, 29(5), 656-664 (1983).
[4] J. Ziv, "A Universal Prediction Lemma and Applications to Universal Data Compression and Prediction", IEEE Transactions on Information Theory, 47(4), 1528-1532 (2000).
[5] J. Ziv, "An Efficient Universal Prediction Algorithm for Unknown Sources with Limited Training Data", 2002. Available: www.msri.org/publications/ln/msri/2002/infotheory/ziv/1/


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 749-749

Preface of the Symposium: Mathematical Chemistry
T.E. Simos, Chairman of ICCMSE 2004
Department of Computer Science and Technology, Faculty of Sciences and Technology, University of Peloponnese, GR-22100 Tripolis, Greece

This symposium was created after a proposal by Sonja Nikolic of the Rugjer Boskovic Institute (Croatia). After a successful review, we accepted her proposal. The organizer of the symposium, Sonja Nikolic, has selected, after international peer review, three papers:
• "Graphical Matrices as Sources of Double Invariants for Use in QSPR" by Sonja Nikolic, Ante Milicevic and Nenad Trinajstic,
• "Chemical Elements: a topological approach" by Guillermo Restrepo, Eugenio J. Llanos and Heber Mesa, and
• "Molecular Simulation Studies of Fast Vibrational Modes" by D. Janezic and M. Penca.

I want to thank the symposium organizer for her activities and excellent editorial work.

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 750-752


Graphical Matrices as Sources of Double Invariants for Use in QSPR
Sonja Nikolic1, Ante Milicevic and Nenad Trinajstic
The Rugjer Boskovic Institute, P.O.B. 180, HR-10002 Zagreb, Croatia
Received 7 July, 2004; accepted in revised form 4 September, 2004
Abstract: Graphical matrices are presented. Their construction via selected sets of subgraphs and the replacement of the subgraphs by numbers representing graph invariants are discussed. The last step of the procedure is to apply the method of choice for obtaining the desired double invariant from the graphical matrix in numerical form. It is also pointed out that many so-called special graph-theoretical matrices from the literature are rooted in the corresponding graphical matrices.
Keywords: graphical matrices, chemical graph theory, double invariants, QSPR
Mathematics Subject Classification: 05C10, 68R10

Introduction
Graphical matrices are matrices whose elements are subgraphs of the graph rather than numbers. Since the elements of these matrices are (sub)graphs, they are called graphical matrices [1]. Thus far very little work has been done on these matrices [1,2]. However, many so-called special matrices [3], such as the Wiener matrices [4,5] and the Hosoya matrices [6], may be regarded as numerical realizations of the corresponding graphical matrices. Therefore, the graphical matrices appear to be a promising area of research in chemical graph theory. The advantage of a graphical matrix lies in the fact that it allows a great many possibilities of numerical realization. In order to obtain a numerical form of a graphical matrix, one needs to select a graph invariant and replace all the graphical elements (subgraphs of some form) by the corresponding numerical values of the selected invariant. In this way, the numerical form of the graphical matrix is established and one can select another or the same type of invariant - this time an invariant of the numerical matrix. Graph invariants generated in this way are double invariants [2], in view of the fact that two invariants are used in constructing the targeted molecular descriptor.

Construction of the graphical matrix
We present several graphical matrices that lead to Wiener-Wiener indices, Hosoya-Wiener indices, Hosoya-Balaban indices, Randic-Wiener indices, etc. These indices were considered because the values of the Randic indices, Wiener indices, Balaban indices and Hosoya indices for smaller acyclic fragments (trees) are readily available [e.g., 7]. As an example, we give the construction of the graphical matrix, based on the concept of the edge-Wiener matrix [8], that leads to the edge-Wiener-Wiener index eWW. It should be noted that the edge-Wiener graphical matrix is constructed by the consecutive removal of edges from the graph. This is shown below for the hydrogen-depleted graph G representing 2,2,3-trimethylpentane. Since a graphical matrix is a square, V x V, symmetric matrix, it is enough for demonstration purposes to give only the upper triangle of the matrix. For graphs without loops, the corresponding graphical matrices have zero diagonal elements.

1 Corresponding author. E-mail: [email protected]


[The original shows the hydrogen-depleted graph G of 2,2,3-trimethylpentane, followed by the upper triangle of its edge-Wiener graphical matrix, whose off-diagonal entries are the subgraphs obtained by the consecutive removal of single edges from G; all diagonal entries are zero.]

Numerical realization of the graphical matrix that leads to a double invariant
The next step is to replace the (sub)graphs with their Wiener numbers, obtained by summing up the Wiener numbers of the acyclic fragments. Below is given the numerical realization of the above edge-Wiener graphical matrix.

[Upper triangle of the numerical edge-Wiener graphical matrix of G; its seven nonzero entries are 19, 29 and five entries equal to 46, all remaining entries being zero.]

The summation of the matrix elements in the above matrix triangle gives the edge-Wiener-Wiener number eWW of 2,2,3-trimethylpentane (eWW = 278).


Conclusion
The application of this kind of molecular descriptor to QSPR modeling is described for octanes, since the modeling properties of this class of alkanes are well studied in the literature [e.g., 1,9,10] and thus we have a standard against which to test our models. Our modeling is based on the CROMRsel procedure, devised to give the best possible model for a given number of descriptors [11-13].

Acknowledgments This work was supported by Grant No. 0098034 from the Ministry of Science and Technology of Croatia.

References
[1] M. Randic, N. Basak and D. Plavsic, Novel Graphical Matrix and Distance-Based Molecular Descriptors, Croat. Chem. Acta 77 (2004).
[2] M. Randic, D. Plavsic and M. Razinger, Double Invariants, MATCH - Commun. Math. Comput. Chem. 35 (1997) 243-259.
[3] A. Milicevic, S. Nikolic, N. Trinajstic and D. Janezic, Graph-Theoretical Matrices in Chemistry, a review in preparation. A draft is available free of charge from: [email protected]
[4] M. Randic, Novel Molecular Descriptor for Structure-Property Studies, Chem. Phys. Lett. 211 (1993) 478-483.
[5] M. Randic, X. Guo, T. Oxley and H. Krishnapryan, Wiener Matrix: Source of Novel Graph Invariants, J. Chem. Inf. Comput. Sci. 33 (1993) 709-716.
[6] M. Randic, Hosoya Matrix - A Source of New Molecular Descriptors, Croat. Chem. Acta 67 (1994) 415-429.
[7] N. Trinajstic, S. Nikolic, J. von Knop, W. R. Muller and K. Szymanski, Computational Chemical Graph Theory - Characterization, Enumeration and Generation of Chemical Structures by Computer Methods, Horwood, New York, 1991, pp. 263-266.
[8] M. Randic, Novel Molecular Descriptor for Structure-Property Studies, Chem. Phys. Lett. 211 (1993) 478-483.
[9] M. Randic and N. Trinajstic, In Search for Graph Invariants of Chemical Interest, J. Mol. Struct. (Theochem) 300 (1993) 551-572.
[10] G. Rucker and C. Rucker, On Topological Indices, Boiling Points and Cycloalkanes, J. Chem. Inf. Comput. Sci. 39 (1999) 788-802.
[11] B. Lucic and N. Trinajstic, Multivariate regression outperforms several robust architectures of neural networks, J. Chem. Inf. Comput. Sci. 39 (1999) 121-132.
[12] B. Lucic, N. Trinajstic, S. Sild, M. Karelson and A.R. Katritzky, A new efficient approach for variable selection based on multiregression: Prediction of gas chromatographic retention times and response factors, J. Chem. Inf. Comput. Sci. 39 (1999) 610-621.
[13] B. Lucic, D. Amic and N. Trinajstic, Nonlinear multivariate regression outperforms several concisely designed neural networks on three QSPR data sets, J. Chem. Inf. Comput. Sci. 40 (2000) 403-413.


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 753-755

Chemical Elements: A Topological Approach
Guillermo Restrepo1, Eugenio J. Llanos and Heber Mesa
Laboratorio de Quimica Teorica, Universidad de Pamplona, Pamplona, Colombia
Observatorio Colombiano de Ciencia y Tecnologia, Bogota, Colombia
Received 28 August, 2004; accepted in revised form 4 September, 2004
Abstract: We carried out a topological study of 72 chemical elements taking advantage of chemometric tools and general topology. We defined every element as an ordered set of 128 properties (physicochemical and chemical), then applied PCA and cluster analysis, CA (4 similarity measures and 5 grouping methodologies). Afterwards we took the dendrograms (complete binary trees) of the CA and developed a mathematical methodology to extract neighbourhood relationships from those trees. By means of this approach we built up a basis for a topology on the set of chemical elements and then provided that set with a topology. Finally we calculated some topological properties (closures, derived sets, boundaries, interiors and exteriors) of several subsets of chemical elements, such as alkaline metals, alkaline earths, noble gases, metals and non-metals. We found that alkaline metals, alkaline earths and noble gases appear not related to the rest of the elements, and that the boundary of the non-metals consists of some semi-metals and its shape is like a "stair".
Keywords: Mathematical chemistry, Chemical topology, Chemical elements, Periodic Table
Mathematics Subject Classification: 92E99, 54A10

1. Extended Abstract

Mendeleev's work on the classification of the chemical elements was the result of taking the whole set of elements, not one by one [1]. In this way, looking at the set, he found several relationships among the properties of the chemical elements. A picture that represents these similarities is the periodic table, and in particular the groups of elements within it. Our aim in this work is to take the chemical elements, defined not by means of their electrons but by their properties, and study their similarity relationships. We collect a set of 128 physico-chemical and chemical properties and define every element as a 128-tuple; that is, every element is a point in a property space of 128 dimensions. Once we have a mathematical representation of the chemical elements, we calculate similarities among them by means of cluster analysis, CA [2,3]. This procedure starts by calculating a similarity matrix, which is commonly a distance matrix, and then builds up groups or clusters by means of a grouping methodology [4,5]. The CA procedure finishes with a 2-dimensional representation of the clusters called a dendrogram [6], which in mathematical terms is a complete binary tree. We use 4 similarity functions, 3 of them metrics [7], and 5 grouping methodologies. In this way we obtain 20 dendrograms that, from a mathematical point of view, show similarity relationships (neighbourhoods) among elements. Afterwards, with the aim of representing the information of the 20 dendrograms in a single mathematical object, we calculate the ultrametric matrix of every dendrogram and then sum all of them. Thus, we have an ultrametric matrix that represents all dendrograms. We then make a density plot of that matrix, in which the similarity among the elements of several groups of the conventional periodic table [8] is evident. On the other hand, not wanting to lose the tree character of the dendrograms, we calculated a consensus tree [9] (Adams consensus) that collects the information of the 20 dendrograms in a single tree. This consensus, like every dendrogram, shows similarity relationships. With this tree, and according to Villaveces' conjecture [1,6] and some of our results [10,11], we begin a mathematical description of the information shown by the consensus tree. This description takes advantage of cuts of "branches" of the tree. In this part, we describe a mathematical procedure that puts the intuitive idea of a cut on the tree into mathematical terms. The procedure starts by considering the tree as a graph and every branch as a subtree of restricted cardinality [10]. Once we have the collection of subtrees, or branches, we take them to build up a basis for a topology and thus provide the set of chemical elements with a topology. With this topology on the set of elements we study the following topological properties of several subsets of chemical interest [12]: closure, derived set, boundary, interior and exterior [13]. The subsets of chemical interest that we study are: all groups of the conventional periodic table, main-group elements, transition elements, metals, non-metals, semimetals, hydrogen, boron, carbon, silicon, the group of three germanium, tin and lead, nitrogen, phosphorus, the group of three arsenic, antimony and bismuth, oxygen, sulfur, and the group of three selenium, tellurium and polonium. Among the topological results it is important to remark that the boundary of the alkaline metals, alkaline earths, noble gases, hydrogen, the scandium group, the titanium group, boron, carbon, silicon and oxygen is empty, which means that these subsets are not related to other elements; in other words, these elements appear far from the others in the space of properties. On the other hand, we found that the boundary of the metals is made of some elements that have been considered semimetals for many years. We found, too, that this boundary has a "stair" shape in the conventional periodic table. According to these results we can talk about a mathematical structure on the set of chemical elements, a topological structure which reproduces several aspects of the chemical understanding of the chemical elements. Besides, the mathematical procedure for providing this set with a topology is not restricted to the chemical elements; it is possible to apply the methodology to any set, in particular a chemical set that can be defined by means of its properties, as we showed recently [14].

1 Corresponding author. Professor at Universidad de Pamplona, Colombia. E-mail: [email protected]
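The topological computations described above reduce, on a finite set, to elementary set operations once a basis of open sets (the tree branches) is given. The following Python sketch illustrates this; the element symbols and branch sets are toy assumptions chosen for the example, not the actual consensus-tree branches of the paper.

```python
def topology_operators(universe, basis):
    """Given a basis (collection of subsets) for a topology on a finite set,
    return interior, closure and boundary operators for subsets of `universe`."""
    universe = frozenset(universe)
    basis = [frozenset(b) for b in basis]

    def interior(A):
        A = frozenset(A)
        # union of all basis sets contained in A
        return frozenset().union(*(b for b in basis if b <= A))

    def closure(A):
        return universe - interior(universe - frozenset(A))

    def boundary(A):
        return closure(A) - interior(A)

    return interior, closure, boundary

# toy illustration with a handful of elements and assumed branches
elements = {"Li", "Na", "K", "Be", "Mg", "He", "Ne"}
branches = [{"Li", "Na"}, {"Li", "Na", "K"}, {"Be", "Mg"}, {"He", "Ne"}, elements]
interior, closure, boundary = topology_operators(elements, branches)
print(boundary({"Li", "Na", "K"}))   # empty: the alkali metals are isolated in this toy basis
```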

Acknowledgments
The authors wish to thank Dr. Villaveces at the Observatorio Colombiano de Ciencia y Tecnologia for his great help and excellent ideas, Dr. Isaacs at the Universidad Industrial de Santander for mathematical support, Dr. Sneath at Leicester University for having discussed his work, and Dr. Carbo-Dorca at the Universitat de Girona and Dr. Bultinck at Ghent University for their comments and suggestions. We thank the Universidad de Pamplona, and especially its rector Dr. Gonzalez, for all the support given during this research.

References
[1] J. L. Villaveces, Quimica y Epistemologia, una relacion esquiva, Rev. Colomb. Fil. Cienc. 1 926 (2000).
[2] P. H. A. Sneath, Numerical Classification of the Chemical Elements and its Relation to the Periodic System, Found. Chem. 2 237-263 (2000).
[3] P. Willett, Chemical Similarity Searching, J. Chem. Inf. Comp. Sci. 38 983-996 (1998).
[4] R. G. Brereton, Chemometrics: Applications of Mathematics and Statistics to Laboratory Systems, Ellis Horwood, Chichester, 1993.
[5] M. Otto, Chemometrics: Statistics and Computer Application in Analytical Chemistry, Wiley-VCH, Weinheim, 1999.
[6] G. Restrepo, H. Mesa, E. J. Llanos and J. L. Villaveces, Topological Study of the Periodic System, in: B. King and D. Rouvray (Eds.), The Mathematics of the Periodic Table, Nova, New York, 2004 (in press).
[7] B. Mendelson, Introduction to Topology, Dover, New York, 1990.


[8] W. C. Fernelius and W. H. Powell, Confusion in the Periodic Table of the Elements, J. Chem. Educ. 59 504-508 (1982).
[9] R. D. M. Page, COMPONENT User's Manual (Release 1.5), University of Auckland, Auckland, 1989 (http://taxonomy.zoology.gla.ac.uk/rod/cpw.html).
[10] G. Restrepo, H. Mesa, E. J. Llanos and J. L. Villaveces, Topological Study of the Periodic System, J. Chem. Inf. Comp. Sci. 44 68-75 (2004).
[11] G. Restrepo, Los elementos quimicos, su matematica y relacion con el sistema periodico, Bistua 1 91-98 (2004).
[12] N. N. Greenwood and A. Earnshaw, Chemistry of the Elements, Butterworth-Heinemann, Oxford, 1998.
[13] S. Lipschutz, General Topology, McGraw-Hill, New York, 1965.
[14] H. A. Contreras, M. C. Daza and G. Restrepo, Estudio Topologico de los Alcanos, Proceedings of the Primer Encuentro Nacional de Quimicos Teoricos, Universidad de Pamplona, Colombia, 2004.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 756-759

Molecular Simulation Studies of Fast Vibrational Modes
D. Janezic1 and M. Penca
National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia
Received 1 September, 2004; accepted in revised form 15 September, 2004
Abstract: A survey of our past and present work on molecular simulation studies of fast vibrational modes will be presented. In particular, new symplectic integration algorithms for the numerical solution of the molecular dynamics equations and methods for the determination of vibrational frequencies and normal modes of large systems will be described.
Keywords: molecular dynamics simulation, normal mode analysis, symplectic integration methods, Hamiltonian systems, Lie algebra, vibrational modes, large systems
Mathematics Subject Classification: 65C20, 65L20, 70H05, 70H15
PACS: 31.15.Qg

A Introduction Many physical problems, particularly in chemical and biological systems, involve processes that occur on widely varying time scales. Such problems have motivated the development of new methods for treating fast vibrational modes.

B Molecular Dynamics
Among the main theoretical methods of investigation of the dynamic properties of biological macromolecules, such as proteins, are molecular dynamics (MD) simulation and harmonic analysis. MD simulation is a technique in which the classical equations of motion for all atoms of a molecule are integrated over a finite period of time. The resulting trajectory is used to compute time-dependent properties of the system. Harmonic analysis is a direct way of analyzing vibrational motions. Harmonicity of the potential function is a basic assumption in the normal mode approximation used in harmonic analysis. This is known to be inadequate in the case of proteins because anharmonic effects, which MD has shown to be important in protein motion, are neglected. When anharmonic effects are incorporated, quasiharmonic analysis may be applied. In this method, the MD simulation is utilized to obtain effective modes of vibration from the atomic fluctuations about an average structure. These modes include the anharmonic effects neglected in a normal mode calculation [1].

1 Corresponding author. E-mail: [email protected]


C Harmonic Dynamics
Harmonic analysis - normal mode calculation - has been used for many years in the interpretation of vibrational spectra of small molecules. It provided the motivation for the application of the harmonic approximation to large molecules, particularly proteins. For macromolecules, normal mode analysis focuses on the low frequency modes, which are frequently associated with biological function. The role of low frequency normal modes involving global conformational changes, which have been theoretically determined for several proteins, is emphasized. Low frequency modes of proteins are particularly interesting because they are related to functional properties. The analysis of these motions in the limit of harmonic dynamics lends insight into the behavior and flexibility of these molecules. The modes presented here include the lowest modes of Bovine Pancreatic Trypsin Inhibitor (BPTI) [2, 3]. For a typical macromolecular system, the problem may be so intractable that it becomes desirable to reduce the computational cost of a harmonic analysis by making some approximations concerning the nature of the motions. One such approximation involves reducing the size of the secular equation by partitioning the Hessian matrix into relevant and irrelevant parts. This can be done through the use of an appropriate unitary transformation which approximately block-diagonalizes the full Hessian matrix. The irrelevant part is subsequently ignored, and the relevant part is represented by a smaller matrix. Another approximation is the reduced basis harmonic analysis technique, which allows the study of the motions of interest in the harmonic limit. The reduction of the problem can be viewed either as the removal of unwanted motion through the use of constraints, or as the inclusion of desired motion. Once the normal modes have been obtained, a great variety of analyses can be performed [1, 4].

Figure 1: Schematic representation of the procedures for performing harmonic analysis of large systems (starting from the X-ray structure of BPTI: minimizations, molecular dynamics, harmonic and quasiharmonic analysis with full or reduced bases, and frequency and eigenvector analysis).
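The core numerical step of such a harmonic analysis is the diagonalization of the mass-weighted Hessian. The sketch below illustrates it on a toy diatomic; it is a generic illustration with invented numbers, not the workflow of Figure 1.

```python
import numpy as np

def normal_modes(hessian, masses):
    """Diagonalize the mass-weighted Hessian to obtain harmonic frequencies
    (in the units implied by the Hessian) and normal mode vectors."""
    m = np.repeat(np.asarray(masses, dtype=float), 3)   # one mass per Cartesian coordinate
    mw_hessian = hessian / np.sqrt(np.outer(m, m))      # H_ij / sqrt(m_i m_j)
    eigvals, eigvecs = np.linalg.eigh(mw_hessian)
    freqs = np.sign(eigvals) * np.sqrt(np.abs(eigvals)) # imaginary modes appear as negatives
    return freqs, eigvecs

# toy diatomic along x: force constant k, masses m1 and m2 (illustrative values)
k, m1, m2 = 500.0, 1.0, 12.0
H = np.zeros((6, 6)); H[0, 0] = H[3, 3] = k; H[0, 3] = H[3, 0] = -k
freqs, modes = normal_modes(H, [m1, m2])
print(freqs.max(), np.sqrt(k * (1/m1 + 1/m2)))   # the stretch frequency equals sqrt(k/mu)
```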


D Symplectic Dynamics
Harmonic analysis also proved useful in developing efficient symplectic MD integration methods. Symplectic integration methods are often the right way of integrating the Hamilton equations of motion. Recent advances in the development of SISM (Split Integration Symplectic Method) and HANA (Hydrogens ANAlytical) for the combined analytical and numerical solution of the Hamiltonian system, based on a factorization of the Liouville propagator, are presented [5, 6]. These techniques, derived in terms of the Lie algebraic language, are based on the splitting of the total Hamiltonian of the system into two pieces, each of which can either be solved exactly or more conveniently than by using standard methods. The individual solutions are then combined in such a way as to approximate the evolution of the original equation for a time step, and to minimize errors [7]. The SISM and HANA use an analytical treatment of high frequency motions within a second order generalized leap-frog scheme. The computational cost per integration step for both methods is approximately the same as that of commonly used algorithms, and they allow an integration time step up to an order of magnitude larger than can be used by other methods of the same order and complexity [8]. The SISM and HANA have been tested on a variety of examples. In all cases they possess long-term stability and the ability to take larger time steps while being economical in computer time. The approach developed here is general, but is illustrated at present by application to the MD integration of a model system of linear chain molecules and a box of water molecules. Further improvements in efficiency were achieved by implementing the SISM on computers with highly parallel architecture [9]. The SISM parallelizes in the same way as the standard leap-frog Verlet method, and the speedup is gained due to the larger time step used [10].
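The splitting idea behind such integrators can be illustrated with a generic second-order split scheme for H(q, p) = T(p) + V(q). The sketch below is a minimal generalized leap-frog (velocity-Verlet) step, not the SISM or HANA implementation; the harmonic test system and its parameters are assumptions used only to show long-time stability.

```python
import numpy as np

def leapfrog_step(q, p, grad_V, masses, dt):
    """One second-order split step for H = p^T M^-1 p / 2 + V(q): kick-drift-kick."""
    p = p - 0.5 * dt * grad_V(q)   # half kick from the potential part
    q = q + dt * p / masses        # full drift from the kinetic part
    p = p - 0.5 * dt * grad_V(q)   # second half kick
    return q, p

# harmonic test system V(q) = 0.5 * k * q^2 (an illustrative model, not a biomolecule)
k, masses, dt = 1.0, np.array([1.0]), 0.1
q, p = np.array([1.0]), np.array([0.0])
for _ in range(10000):
    q, p = leapfrog_step(q, p, lambda x: k * x, masses, dt)
energy = 0.5 * p**2 / masses + 0.5 * k * q**2
print(energy)   # stays close to the initial 0.5 over long times (symplectic behaviour)
```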

E Conclusions The present work provides an overview of a variety of methods for the molecular, harmonic, and symplectic dynamics of large systems, such as biological macromolecules, in the case when all degrees of freedom were taken into account. If the same macromolecular system employing the same potential function is used in all of the calculations, then a direct comparison of all the methods is possible. Through the combination of methods presented, insight can be gained into the dynamics and flexibility of large molecular systems.

Acknowledgment This work was supported by the Ministry of Education, Science and Sports of Slovenia under grants No. P1-0002 and J1-6331.

References
[1] BROOKS Bernard R., JANEZIC Dusanka, KARPLUS Martin, Harmonic analysis of large systems. I. Methodology. J. Comput. Chem. 16 1522-1542 (1995).
[2] JANEZIC Dusanka, BROOKS Bernard R., Harmonic analysis of large systems. II. Comparison of different protein models. J. Comput. Chem. 16 1543-1553 (1995).
[3] JANEZIC Dusanka, VENABLE Richard M., BROOKS Bernard R., Harmonic analysis of large systems. III. Comparison with molecular dynamics. J. Comput. Chem. 16 1554-1566 (1995).


[4] JANEZIC Dusanka, Some multiple-time-scale problems in molecular dynamics. Cell. Mol. Biol. 7 78-81 (2002).
[5] JANEZIC Dusanka, MERZEL Franci, Split integration symplectic method for molecular dynamics integration. J. Chem. Inf. Comput. Sci. 37 1048-1054 (1997).
[6] JANEZIC Dusanka, PRAPROTNIK Matej, Symplectic Molecular Dynamics Integration Using Normal Mode Analysis. Int. J. Quant. Chem. 84 2-12 (2001).
[7] PRAPROTNIK Matej, JANEZIC Dusanka, Symplectic Molecular Dynamics Integration Using Normal Mode Analysis. Cell. Mol. Biol. 7 147-148 (2002).
[8] JANEZIC Dusanka, PRAPROTNIK Matej, Molecular Dynamics Integration Time Step Dependence of the Split Integration Symplectic Method on System Density. J. Chem. Inf. Comput. Sci. 43 1922-1927 (2003).
[9] BORSTNIK Urban, HODOSCEK Milan, JANEZIC Dusanka, Improving the performance of molecular dynamics simulations on parallel clusters. J. Chem. Inf. Comput. Sci. 44 359-364 (2004).
[10] TROBEC Roman, STERK Marjan, PRAPROTNIK Matej, JANEZIC Dusanka, Parallel programming library for molecular dynamics simulations. Int. J. Quant. Chem. 96 530-536 (2004).

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 760-761


Preface for the Mini-Symposium on: "Simulations Of Nonlinear Optical Properties For The Solid State"
Benoit Champagne1

1 Laboratoire de Chimie Theorique Appliquee, Facultes Universitaires Notre-Dame de la Paix, 61 rue de Bruxelles, B-5000 Namur (BELGIUM)

As reviewed recently [1], the calculation of the properties of individual molecules as found in an infinitely dilute gas has long been of great interest to quantum chemists. This curiosity has been spurred in recent decades by the increasing importance of the communications industry in the world and the parallel need for materials having specific properties for electronic, optical and other devices. In particular, the nonlinear-optical quantities, defined at the microscopic level as hyperpolarizabilities and at the macroscopic level as nonlinear susceptibilities, have played a key role in determining the suitability of substances for practical use, for example, in electro-optical switching and frequency mixing. With few exceptions, a useful nonlinear optical material will be in the solid phase, for example, a single crystal or a poled polymer embedded in a film. Ironically, the quantum chemical calculations of nonlinear optical properties have for the most part been concerned with a single microscopic species. Much has been learnt in this way about appropriate molecular construction, but the ultimate goal must be to investigate the nonlinear optical (NLO) properties in the solid phase. In the physics arena the theoretical determination of NLO properties of solids has been more advanced, though not to the degree that has been achieved for simple gas-phase molecules using modern quantum-chemical practices. For example, density functional theory in its crudest form has been frequently adopted to find some NLO properties for semiconductors. A glaring example of lack of progress is the third-order susceptibility of quartz. There is, as yet, no rigorous calculation of this quantity, even though it is the reference point for nearly all NLO measurements. It is our opinion that in the next few decades this situation is going to change, that the field of single molecule calculations will be saturated and attention will turn to the more practically relevant solid phase. This makes it an opportune time to review and discuss in this mini-symposium what computational strategies have already been developed. As the situation currently stands, there are two extreme approaches: (i) the oriented-gas model, and (ii) the supermolecule model - and everything in between. For (i), in its simplest disguise, single molecules lie side-by-side and the properties of the solid are just an appropriate combination of the molecular ones. For (ii), the whole solid is considered as a giant molecule and the computational approach is that of standard quantum-chemical calculations. An important variation of (ii) is to make use of the translational symmetry that exists in a crystal or a periodic polymer, and this leads to so-called crystal orbital methods. Here is an important break from the conventional molecular calculations and one that is likely to see a great deal more use in the NLO field in the near future. Refinement of the oriented-gas model, including more and more precise accounting for inter-species interactions, is another likely avenue for progress. The macroscopic optical responses of a medium are given by its linear and nonlinear susceptibilities, which are the expansion coefficients of the material polarization, P, in terms of the Maxwell fields, E.

1 Corresponding author. [email protected]

For a dielectric or ferroelectric medium under the influence of an applied electric field, the defining equation reads P = P0 + χ(1)E + χ(2)EE + ...

E[{ψn,k}] = E0 - Ω E·P    (1)

where E0 is the zero-field energy and P the polarization, which can be computed as a Berry phase of the field-polarized Bloch functions ψn,k. On the one hand, using a perturbation expansion of Eq. (1) [1], we obtain analytic expressions for the electric field derivatives of the energy and for related physical quantities such as NLO susceptibilities, Raman tensors or electrooptic (EO) tensors. On the other hand, for small fields, Eq. (1) can be minimized with respect to the ψn,k to study the response of a solid to a finite electric field [2]. Both approaches will be discussed below in the framework of Kohn-Sham density functional theory (DFT). Then, we use them in the study of ferroelectric ABO3 compounds, a class of materials with high EO and NLO coefficients that are of direct interest for various technological applications [3]. In addition, we present a method to compute the temperature dependence of the EO coefficients and the refractive indices of these materials within a first-principles effective Hamiltonian approach [4].

1 Corresponding author. E-mail: [email protected]


2. First-principles study of non-linear optical properties from density functional perturbation theory

Many important properties of solids can be expressed as derivatives of an appropriate thermodynamic potential with respect to perturbations such as atomic displacements, electric fields or macroscopic strains. For example, the NLO susceptibilities can be computed as third-order derivatives of the energy functional in Eq. (1) with respect to three electric fields [5], while the Raman susceptibilities are related to third-order derivatives of E with respect to two electric fields and one atomic displacement [6]. These derivatives can be computed from density functional perturbation theory by applying the so-called 2n + 1 theorem [7] to Eq. (1) [1]. This theorem says that it is possible to compute energy derivatives up to order 2n + 1 from the derivatives of the wavefunctions up to order n. As a consequence, third-order energy derivatives can be computed from the knowledge of the ground-state wavefunctions and of their first-order derivatives with respect to the corresponding perturbations. The NLO susceptibilities are determined by the response of the valence electrons to an electric field. In contrast, the electrooptic coefficients involve the response of both the electrons and the crystal lattice. In the Born-Oppenheimer approximation, these coefficients can be decomposed into (i) a bare electronic part, (ii) an ionic contribution and (iii) a piezoelectric contribution [8]. The electronic part describes the response of the valence electrons to a (quasi-)static macroscopic electric field when the ions are considered as artificially clamped at their equilibrium positions, and can be deduced from the NLO susceptibilities χ(2). The ionic contribution is produced by the relaxation of the atoms within the quasi-static electric field. It can be computed from the knowledge of the frequency, polarity and Raman susceptibility of the zone-center transverse optical phonon modes. Finally, the piezoelectric contribution is related to the modifications of the unit cell shape induced by the electric field. It can be computed from the piezoelectric strain coefficients and the elasto-optic coefficients. We apply this methodology to the study of the EO response of various ferroelectric oxides such as LiNbO3, BaTiO3 and PbTiO3, with the aim of identifying the origin of the large electrooptic coefficients of these compounds. At high temperature, these materials are in a highly symmetric paraelectric phase. As the temperature is lowered, they undergo one or several ferroelectric phase transitions driven by the condensation of an unstable phonon mode. At the phase transition, this unstable mode transforms into a low energy and highly polar mode in the ferroelectric phase that can strongly interact with an electric field. It is therefore apt to generate a large EO response if it exhibits, in addition, a large Raman susceptibility. This is the case in the ferroelectric phase of LiNbO3 and BaTiO3, where we observe a strong contribution of the successor of the soft mode to the coefficients r13 and r33. In contrast, in the tetragonal phase of PbTiO3, the successor of the soft mode plays a minor role due to its lower Raman susceptibility.
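As a purely schematic illustration of the mode-by-mode ionic sum just described, the snippet below accumulates contributions proportional to the Raman susceptibility times the mode polarity divided by the squared frequency. All prefactors (cell volume, refractive indices, unit conversions) are omitted and the numerical values are invented, so this is only a sketch of how a low-frequency, highly polar mode comes to dominate the sum.

```python
import numpy as np

def ionic_eo_contributions(raman_susc, mode_polarity, omega):
    """Schematic per-mode terms for the ionic part of an electro-optic coefficient:
    each zone-center TO mode contributes ~ (Raman susceptibility * polarity) / omega^2.
    Prefactors and units are deliberately omitted."""
    raman_susc, mode_polarity, omega = map(np.asarray, (raman_susc, mode_polarity, omega))
    terms = raman_susc * mode_polarity / omega**2
    return terms, terms.sum()

# illustrative numbers only: a soft-mode successor (first entry) dominates
terms, total = ionic_eo_contributions(raman_susc=[5.0, 1.0, 0.5],
                                      mode_polarity=[2.0, 0.8, 0.3],
                                      omega=[150.0, 400.0, 600.0])
print(terms / total)   # fractional contribution of each mode
```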

3. Finite electric field techniques

As mentioned in the introduction, because of interband tunneling, an insulator in an electric field does not have a true ground state. In many practical situations, however, tunneling is negligible on the relevant time scale and, for relatively small fields, the system remains in a polarized, long-lived metastable state. It is therefore possible to minimize the energy functional in Eq. (1) and to study the response of a solid to a finite electric field. This method offers an alternative approach to the one described in the previous section. In particular, it allows an accurate computation of the energy, the forces, the stress tensor and the polarization in the presence of an electric field [2]. It can therefore be applied to study the structural and electronic response of an insulator to an electric field and to compute electric field derivatives of the energy up to any order from finite differences.
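A minimal illustration of extracting response coefficients from finite differences is given below; the model energy function E(F) and its coefficients are assumptions used only to show the central-difference stencils, not a first-principles quantity.

```python
def second_derivative(f, x0=0.0, h=1e-3):
    """Central-difference estimate of f''(x0); with f a field-dependent energy,
    second derivatives give linear-response quantities."""
    return (f(x0 + h) - 2.0 * f(x0) + f(x0 - h)) / h**2

def third_derivative(f, x0=0.0, h=1e-3):
    """Central-difference estimate of f'''(x0); third field derivatives of the
    energy are related to second-order (chi^(2)-type) susceptibilities."""
    return (f(x0 + 2*h) - 2*f(x0 + h) + 2*f(x0 - h) - f(x0 - 2*h)) / (2 * h**3)

# toy model energy E(F) = -(1/2)*alpha*F^2 - (1/6)*beta*F^3, purely illustrative
alpha, beta = 4.0, 0.3
E = lambda F: -0.5 * alpha * F**2 - (1.0 / 6.0) * beta * F**3
print(second_derivative(E), third_derivative(E))   # ~ -4.0 and ~ -0.3, the model coefficients
```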

4. Temperature dependence of the EO coefficients

Based on the dominant contribution of the soft mode to the EO tensor at 0 K, we develop an effective Hamiltonian [4] approach to simulate its temperature dependence in BaTiO3. This approach is based on a Taylor expansion of the total energy and the electronic dielectric susceptibility around the paraelectric reference structure. The only degrees of freedom included in this expansion are the macroscopic strain and the atomic displacements along the lattice Wannier function associated with the soft mode. The finite-temperature behaviour of the model is studied from Monte Carlo simulations. In the tetragonal phase, we observe a divergence of the EO coefficients r13 and r33 at the transition to the cubic phase and a divergence of r42 at the transition to the orthorhombic phase. Moreover, the formalism used in the effective Hamiltonian provides a microscopic interpretation of the model of DiDomenico and Wemple [9], which relates the linear EO effect in ferroelectrics to a quadratic effect biased by the spontaneous polarization.

Acknowledgments MV and XG acknowledge financial support from the FNRS Belgium. This work was supported by the Volkswagen Stiftung project "Nano-sized ferroelectric Hybrids" (I/77 737), FNRS (grants 9.4539.00 and 2.4562.03), the Region Wallonne (Nomade, project 115012), the PAl 5.01 and EU Exciting network (European Union contract HPRN-CT-2002-00317).

References
[1] R. W. Nunes and X. Gonze, Berry-phase treatment of the homogeneous electric field perturbation in insulators, Phys. Rev. B 63 155107 (2001).
[2] I. Souza, J. Iniguez, and D. Vanderbilt, First-Principles Approach to Insulators in Finite Electric Fields, Phys. Rev. Lett. 89 117602 (2002).
[3] M. E. Lines and A. M. Glass, Principles and Applications of Ferroelectrics and Related Materials, Clarendon Press, Oxford, 1977.
[4] W. Zhong, David Vanderbilt and K. M. Rabe, Phase Transitions in BaTiO3 from First Principles, Phys. Rev. Lett. 73 1861-1864 (1994).
[5] A. Dal Corso, F. Mauri and A. Rubio, Density-functional theory of the nonlinear optical susceptibility: Application to cubic semiconductors, Phys. Rev. B 53 15638-15642 (1996).
[6] G. Deinzer and D. Strauch, Raman tensor calculated from the 2n + 1 theorem in density-functional theory, Phys. Rev. B 66 100301 (2002).

[7] X. Gonze and J.-P. Vigneron, Density-functional approach to nonlinear-response coefficients of solids, Phys. Rev. B 39 13120-13128 (1989). [8] M. Veithen, X. Gonze and Ph. Ghosez, unpublished.


[9] M. DiDomenico Jr. and S. H. Wemple, Oxygen-Octahedra Ferroelectrics. I. Theory of Electro-optical and Nonlinear Optical Effects, J. Appl. Phys. 40 720-734 (1969).

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 769-770


Mixed Electric-Magnetic Second Order Response of Helicenes
Edith Botek1*, Jean-Marie Andre1, Benoit Champagne1, Thierry Verbiest2, and Andre Persoons2
1 Laboratoire de Chimie Theorique Appliquee, Facultes Universitaires Notre-Dame de la Paix, rue de Bruxelles 61, B-5000 Namur (Belgium).
2 KU Leuven, Laboratory of Chemical and Biological Dynamics, University of Leuven, Celestijnenlaan 200D, B-3001 Leuven (Belgium)
Received June 26 2004; accepted July 15 2004
Abstract: The mixed electric-magnetic second-order NLO responses of helicenes are evaluated. The microscopic responses are determined at the RPA level, whereas the macroscopic responses are obtained by averaging the microscopic response over the various orientations of the molecule.

Keywords: second-order NLO response, mixed electric-magnetic response, helicenes
PACS: 31.15.-p, 31.25.-v, 33.15.Kr, 33.55.Fi

1. Introduction
Chiral systems are particularly interesting for studying second-order nonlinear optical (NLO) effects because their intrinsic non-centrosymmetry allows NLO processes to be observed even in highly symmetric media [1]. Hicks and co-workers and Persoons and co-workers independently demonstrated the presence of anomalously large chiral effects in NLO response measurements on surface films of chiral molecules and chiral polymers [2-3]. Schanne-Klein et al. have reported the presence of magnetic-dipole contributions of the same order of magnitude as the electric-dipole contributions in SHG-ORD/CD/LD experiments in a nonresonant configuration for an isotropic spin-coated layer of a chiral salt [4]. Hence, the possibility of enhancing NLO properties by properly optimizing such magnetic-dipole contributions appears to be an exploitable tool for obtaining new NLO materials. Persoons and co-workers have developed experimental techniques, based on SHG, to study magnetic-dipole contributions to the nonlinearity in thin films of chiral materials. They are able to detect magnetic-dipole contributions on the order of 5% of the highest electric-dipole counterparts [5-7]. In the present work the relation between the magnetic contributions (up to first order) to the first susceptibility and the pure electric contribution is investigated theoretically for a series of chiral systems.

2. Mixed electric-magnetic nonlinear responses
At the microscopic (molecular) level, the second-order electric-dipole moment is:

mu_i(2w) = Sum_jk [ beta_ijk^ee E_j(w) E_k(w) + beta_ijk^em E_j(w) B_k(w) ]    (1)

* Corresponding author. E-mail: [email protected]


and the second-order magnetic-dipole moment is:

m_i(2w) = Sum_jk beta_ijk^me E_j(w) E_k(w)    (2)

both acting as sources of radiation. The subscripts i, j and k are Cartesian coordinates, while the superscripts refer to the electric-dipole (e) and magnetic-dipole (m) interactions. beta_ijk is a component of the first hyperpolarizability and w is the frequency of the incident light. The microscopic hyperpolarizabilities are calculated employing the Green functions formalism at the Random Phase Approximation (RPA) level [8] with the 6-31G* basis set, by means of quadratic response functions [9]. The macroscopic response chi(2) is obtained by averaging the microscopic response over the various orientations of the molecule in space with respect to the laboratory coordinate system.
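A minimal sketch of such an orientational average is given below, using SciPy's uniformly random rotations. It is a generic isotropic Monte Carlo average, not the authors' averaging procedure, and the molecular-frame tensor values are assumptions chosen for illustration.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def rotate_beta(beta, R):
    """Transform a rank-3 hyperpolarizability tensor to the laboratory frame."""
    return np.einsum('ia,jb,kc,abc->ijk', R, R, R, beta)

def orientational_average(beta, component=(2, 2, 2), n_samples=5000, seed=1):
    """Monte Carlo isotropic average of one laboratory-frame component of beta,
    mimicking an ensemble of randomly oriented molecules."""
    rots = Rotation.random(n_samples, random_state=seed).as_matrix()
    i, j, k = component
    return np.mean([rotate_beta(beta, R)[i, j, k] for R in rots])

# toy molecular-frame tensor: only beta_zzz is nonzero (an assumption)
beta_mol = np.zeros((3, 3, 3))
beta_mol[2, 2, 2] = 10.0
print(orientational_average(beta_mol))   # ~ 0 for a fully isotropic, unweighted average
```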

3. Second-order NLO responses of helicenes
Families of substituted helicenes (as well as the corresponding non-substituted species) offering the possibility of optimizing the magnetic contribution in order to maximize the NLO response were considered as examples to investigate the relation between the pure electric and the mixed electric-magnetic macroscopic responses. The electric response of some of the helicenes investigated here was studied in a previous work [10] by tuning the first hyperpolarizability with a proper addition of donor/acceptor groups. For the present work the calculations predict mixed electric-magnetic responses which, in some cases, are not negligible with respect to the pure electric contributions. Such predictions could be useful for complementing investigations related to NLO processes in several fields of science.

References
[1] M. Kauranen, T. Verbiest, and A. Persoons, J. Nonlinear Optical Phys. & Materials 8, 171 (1999).
[2] T. Petralli-Mallow, T.M. Wong, J.D. Byers, H.I. Yee, and J.M. Hicks, J. Phys. Chem. 97, 1383 (1993).
[3] M. Kauranen, T. Verbiest, E.W. Meijer, E.E. Havinga, M.N. Teerenstra, A.J. Schouten, R.J.M. Nolte, and A. Persoons, Adv. Mater. 7, 641 (1995).
[4] M.C. Schanne-Klein, F. Hache, A. Roy, C. Flytzanis, and C. Payrastre, J. Chem. Phys. 108, 9436 (1998).
[5] M. Kauranen, J.J. Maki, T. Verbiest, S. Van Elshocht, and A. Persoons, Phys. Rev. B 55, 1985 (1997).
[6] S. Sioncke, T. Verbiest, and A. Persoons, Optical Materials 21, 7 (2002).
[7] T. Verbiest, S. Sioncke, and A. Persoons, J. Photochem. and Photobiology A: Chemistry 145, 113 (2001).
[8] T. H. Dunning and V. McKoy, J. Chem. Phys. 47, 1735 (1967); 48, 5263 (1968).
[9] See for example: W.A. Parkinson and J. Oddershede, J. Chem. Phys. 94, 7251 (1991).
[10] B. Champagne, J.M. Andre, E. Botek, E. Licandro, S. Maiorana, A. Bossi, K. Clays, and A. Persoons, CHEMPHYSCHEM, in press.


Lecture Series on Computer and Computational Sciences

Volume 1, 2004, pp. 771-774

Determination of the Macroscopic Electric Susceptibilities chi(n) from the Microscopic (Hyper)polarizabilities alpha, beta and gamma
M. Rerat1, Laboratoire de Chimie Theorique et de Physico-Chimie Moleculaire, UMR5624, Universite de Pau, FR-64000 Pau, France
R. Dovesi, Dipartimento di Chimica, IFM, Universita di Torino, I-10125 Torino, Italy
Received 31 July, 2004
Abstract: In this work, we will show that it is possible to evaluate the bulk dielectric constant epsilon(0) and the electric susceptibilities chi(n)

where the indices s and t refer to the non-degenerate and doubly-degenerate modes respectively, x and g are the constants of anharmonicity, which depend on the cubic and quartic force constants, and l is the quantum number associated with the vibrational angular momentum. If we take into account that each matrix element of the vibrational Hamiltonian can be evaluated directly with respect to the vibrational wavefunction, expression (3) is more useful:

E_{v,l} = Sum_s w_s <v_s|q_s^2|v_s> + Sum_t w_t <v_t,l_t|q_t^2|v_t,l_t> + Sum_{s,s'} k_{ss'} <v_s|q_s^2|v_s><v_s'|q_s'^2|v_s'> + Sum_s k_{ssss} <v_s|q_s^4|v_s> + ...    (3)

This procedure, easy to implement, has the strong advantage that the calculations for each vibrational level (E_v, l) are independent. Thus, the development of a parallel algorithm enables us to treat a problem as large as (CH3Li)2 (24 modes, about 26 million configurations with v_i(max) = 7) in a few seconds. Despite being very efficient, perturbation theory gives sensitive results, in particular with regard to the study of resonances such as, for example, Fermi's and Darling-Dennison's. When this type of accidental degeneracy occurs between two vibrational levels, it is no longer legitimate to use perturbation theory. To circumvent this problem, the energies of the two perturbed levels can be obtained by direct diagonalization of the corresponding part of the energy matrix corrected to first order, as mentioned in ref [6]. These conditions of calculation are sufficient to determine, from highly correlated wave functions using at least a triple-zeta basis set, the positions of the fundamental bands of small and medium-sized [3] organic compounds to better than 2%. Generally, this is not sufficient to study combination and hot bands. For a more comprehensive treatment of complete vibrational spectra, it is better to use a variational approach. However, for systems containing more than five atoms, it is generally difficult to use this solution because gigantic CI matrices must be treated. In recent years, an intermediate solution has been the use of mixed variation-perturbation methods [7], which iteratively select the vibrational states by using the perturbation method in a reduced subspace containing the useful information. Nowadays, thanks to the progress of computing resources, it is possible to work with the whole vibrational information and develop pure variational methods for the treatment of medium-sized systems. It is, on the other hand, necessary to optimize the variational algorithm in a different way. One way to meet this challenge is to use parallel calculations. All the details concerning the method are available in ref [8]. The approach consists of:
• taking an inventory of the vibrational states potentially needed for the description of the problem; these anharmonic vibrational states are constructed as linear combinations of products of wavefunctions expressed in the harmonic-oscillator basis;
• taking into account the symmetry of each state;
• cutting the process into several spectral windows;
• using a massively parallel algorithm for the variational treatment (a minimal illustration of this variational construction is sketched below).
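The sketch below is a toy, dense-matrix version of this variational step, assuming a basis of harmonic-oscillator product states and a constant stand-in coupling between states differing by one quantum. In the real method the matrix elements come from the anharmonic force field and the production code (P_Anhar) works in parallel on spectral windows; none of that is reproduced here.

```python
import numpy as np
from itertools import product

def build_vci_basis(n_modes, v_max, n_quanta_max):
    """Enumerate harmonic-oscillator product states |v1...vM> with a cap on the
    total number of quanta (a crude stand-in for the 'inventory' step)."""
    return [v for v in product(range(v_max + 1), repeat=n_modes) if sum(v) <= n_quanta_max]

def build_hamiltonian(basis, omega, coupling=5.0):
    """Toy vibrational CI matrix: harmonic energies on the diagonal plus a small
    constant coupling between states differing by one quantum in one mode."""
    dim = len(basis)
    H = np.zeros((dim, dim))
    for a, va in enumerate(basis):
        H[a, a] = sum(w * (v + 0.5) for w, v in zip(omega, va))
        for b, vb in enumerate(basis):
            if sum(abs(x - y) for x, y in zip(va, vb)) == 1:
                H[a, b] = coupling
    return H

omega = [3000.0, 1600.0, 1200.0]                    # cm-1, illustrative values
basis = build_vci_basis(n_modes=3, v_max=4, n_quanta_max=4)
levels = np.linalg.eigvalsh(build_hamiltonian(basis, omega))
print(levels[:5] - levels[0])                       # lowest transition energies
```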

3. Overall structure and method
We decided to run the calculations on a small heterogeneous cluster, in order to debug and optimize the code. This cluster runs under the Linux O/S. The ten-processor system used to carry out the calculations reported in this paper has the following configuration:
• 8 1 GHz Intel Pentium III processors on dual-processor motherboards, with 512 MB of RAM per motherboard;
• 2 2 GHz Intel Xeon processors on a dual-processor motherboard, with 4 GB of RAM;
• 100 Mbit/s fast Ethernet for network communications;
• Linux O/S (Red Hat Linux release 7.1 for the PIII and 7.2 for the Xeon, kernel 2.4.xx);
• LAM-MPI v7.0.4 software for parallel communications;
• PBS software for job submission;
• BLAS & LAPACK libraries (latest releases);
• Intel Fortran Compiler version 7.1.

The MPI message passing library was chosen for parallelisation, as this is currently the most portable parallelisation model and implementations are available for almost every computer on the market, as well as for clusters of workstations or personal computers. The overall structure of the code (P_Anhar) consists of a single program, made up of several modules and many other minor routines. All the details concerning the code are available in ref [8].

References
1. P.K. Berzigiyarov, V.A. Zayets, L.Y. Ginzburg, V.F. Razumov, E.F. Sheka, Int. J. of Quant. Chem. 96(2), 73-79 (2004); V. Barone, G. Festa, A. Grandi, N. Rega, N. Sanna, Chem. Phys. Lett. 388, 279-283 (2004).
2. P. Carbonniere, D. Begue, C. Pouchan, Chem. Phys. Lett. 393(1-3), 92-97 (2004) and references therein.
3. V. Barone, Chem. Phys. Lett. 383, 528-532 (2004).
4. P. Carbonniere, D. Begue, A. Dargelos, C. Pouchan, Chem. Phys. 300, 41 (2004).
5. A. Willetts, N.C. Handy, Chem. Phys. Lett. 235, 286 (1995); P. Carbonniere, V. Barone, Chem. Phys. Lett. 392, 365-371 (2004).
6. R. Burcl, S. Carter, N.C. Handy, Chem. Phys. Lett. 373, 357 (2003).
7. C. Pouchan, K. Zaki, J. Chem. Phys. 107, 342 (1997).
8. N. Gohaud, D. Begue, C. Darrigan, C. Pouchan, submitted.

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 820-824


Vibrational Computation Beyond the Harmonic Approximation: An Effective Tool for the Investigation of Large Semi-Rigid Molecular Systems
P. Carbonniere1 and V. Barone2*
1 Laboratoire de Chimie Theorique et Physico-Chimie Moleculaire, UMR 5624, Federation de recherche IPREM 2606, Universite de Pau et des Pays de l'Adour, IFR Rue Jules Ferry, F-64000 Pau, France.
2 Laboratorio di Struttura e Dinamica Moleculare, Dipartimento di Chimica, Universita "Federico II", Complesso Universitario Monte Sant'Angelo, Via Cintia, I-80126 Napoli, Italy

Received 2 August, 2004; accepted in revised form 20 August, 2004
Abstract: This paper deals with the computation of vibrational transitions by quantum mechanical methods. It shows the capability of density functional theory based methods (DFT) to determine anharmonic force fields able to provide a reliable interpretation of IR spectra via an effective second order perturbative procedure. The study, based on an investigation of 6 semi-rigid molecules containing from 4 to 12 atoms, reviews several pure and hybrid functionals as well as basis sets in order to propose an efficient route for the vibrational calculation of larger semi-rigid molecules.
Keywords: hybrid force fields, DFT, basis set, vibrational frequencies, performances.
Mathematics Subject Classification: Here must be added the AMS-MOS or PACS
PACS: Here must be added the AMS-MOS or PACS Numbers

1. Introduction
Infrared (IR) and Raman spectroscopies are among the most powerful techniques for characterizing medium size molecules, but the proper assignment of fundamental frequencies is often not straightforward due to the large number of (often overlapping) transitions present in high resolution experimental spectra. In recent years the computation of harmonic force fields by quantum mechanical methods has provided an invaluable aid in this connection, thanks to the development of more reliable models with good scaling properties. With the recent impressive progress in hardware and software, it is becoming feasible to investigate whether effective approaches going beyond the harmonic level can offer further significant improvements. For small molecules, converged ro-vibrational levels can be obtained by fully variational methods 1-3. However, for larger molecules some approximation becomes unavoidable concerning both the form of the potential and the ro-vibrational treatment. The most effective models are at present based on truncated two- or three-mode potentials followed by an effective second order perturbative treatment (PT2) 4-6 or by diagonalization of a configuration interaction matrix built on self-consistent vibrational states (VSCF-CI) 7 and/or guessed by a preliminary perturbative approach (CI-PT2) 8. Of course, the cheap PT2 approach cannot give the same results as a converged variational computation. For instance, the PT2 expressions are (at least in one dimension) exact for a Morse oscillator, and therefore certainly not correct for the incomplete quartic development, which is the most practical representation of the potential for large molecules. Thus, the PT2 predictions can be closer to experiment than their variational counterparts 9,10.


Nearly degenerate vibrational states (e.g. Fermi and Darling-Dennison resonances) are, of course, a problem for a straightforward PT2 approach, but a well-established improvement is obtained by removing the strongest interactions from the second order treatment and treating them in a proper fashion by diagonalizing the corresponding part of the Hamiltonian matrix. On these grounds, the effective PT2 approach can achieve an accuracy of the order of 10 cm-1 for fundamental transitions in the case of small size systems with high level quartic force fields (of CCSD(T)/spdf basis set quality) 6. As a matter of fact, a fully automated implementation of the highly cost-effective PT2 vibrational approach has recently been coded 4 in the Gaussian package 11. At this point, the only obstacle to investigating the vibrational spectra of large molecules comes from the determination of reliable quartic force fields, keeping in mind that the CCSD(T) approach is still limited to hexa-atomic systems, due to its very unfavourable scaling with the number of active electrons. One of the most significant outcomes of our most recent investigations is that reliable spectroscopic properties of semi-rigid molecules can be computed by last generation density functionals and that the results can be further improved by coupling structures and harmonic force fields computed at higher levels with DFT anharmonic corrections. Considering the very promising results obtained recently by DFT computations of the vibrational spectra of small 12,13 and medium size molecules 10,14, we have undertaken a series of systematic studies aimed at proposing the most effective computational model for semi-quantitative investigations of large molecules 15,16. The final results of this investigation are summarized in the present paper.

2. Method and computational details
DFT quartic force fields of H2CO (formaldehyde), H2CS (thioformaldehyde), H2CN (methylenimine), C2H4 (ethylene), s-tetrazine, and benzene have been computed with the Gaussian 03 package 11. The performance of the widely used B3LYP functional in the computation of harmonic and anharmonic frequencies has been investigated using 14 standard basis sets, summarized in Table 1. Other conventional (BLYP, HCTH, TPSS) and hybrid (B97-1, B1B95) functionals were also tested with 3 different basis sets: the cheap 6-31+G(d,p), the TZ2P, which performs fairly well with the B97-1 functional, and the cc-pVTZ, which is particularly accurate for post-Hartree-Fock methods. The quality of the results is assessed by comparison with the most reliable CCSD(T) computations available in the literature. Details about the implementation of effective PT2 anharmonic computations in the Gaussian 03 package are given elsewhere 4. Here, we only recall that, starting from an optimized geometry, it is possible to build automatically the third and semidiagonal fourth derivatives w.r.t. the normal coordinates Q for any computational model for which analytical second derivatives are available. Note that for non-linear molecules these computations require at most the Hessian matrices at 6N-11 different points, N being the number of atoms in the molecule (a schematic illustration of this numerical differentiation is sketched after Table 1). Note also that the best compromise between different error sources is obtained using a step size of 0.01 A in the numerical differentiation of the second derivatives, tight geometry optimizations and fine grids (at least 99 radial and 590 angular points) in both the SCF and CPKS computations. Next, fundamental transitions (vi), overtones (2vi) and combination bands (vi + vj) are evaluated using second order perturbation theory. Rotational contributions to anharmonicity, which cannot be neglected for quantitative studies, have been added to the procedure. As shown in a recent study, a perturbative evaluation of these terms leads to results in very close agreement with their variational counterparts 16. Furthermore, as mentioned above, standard PT2 computations are not sufficient for strongly interacting states. The usual empirical criterion based on their energetic difference (D1) does not take into account that the error in the perturbative treatment also comes from the strength of the coupling. A viable solution to this problem has been suggested by Martin and coworkers 6, who derived simple formulae giving fairly good estimates of the differences between explicitly including a Fermi resonance and absorbing it into the anharmonicity constant (D2). D2 = 1 cm-1 is the default value in the latest version of the implemented algorithm and may be manually set via the keyword DelVPr.

Table 1: Basis sets used in this study, together with cc-pVDZ and cc-pVTZ (+diffX: sp diffuse functions on each heteroatom, if any).

Table 1: Basis sets used in this study, together with cc-pVDZ and cc-pVTZ (+ diffX: sp diffuse functions on each heteroatom, if any).

  Acronym   Description          Acronym   Description
  dzp       6-31G*               tzp       6-311G*
  dzpd      6-31G* + diffX       tzpd      6-311G* + diffX
  dzpdT     6-31+G*              tzpdT     6-311+G*
  dzpp      6-31G**              tzpp      6-311G**
  dzppd     6-31G** + diffX      tzppd     6-311G** + diffX
  dzppdT    6-31+G**             tzppdT    6-311+G**


3. Results and discussion

Several studies have shown that diffuse basis functions are much more important for optimizing the performance of DFT than larger expansions of the valence orbitals and/or additional polarization functions [10,17,18]. Figure 1 shows the same trend in the case of harmonic computations with the B3LYP model, where the accuracy of the calculations is improved by a factor of 2 when using the cheap 6-31+G(d,p) basis set, compared to the cc-pVTZ results. Thus, the B3LYP/6-31+G(d,p) model leads to an average error approaching 10 cm-1 on harmonic frequency calculations for the trial set of molecules. It is noteworthy that the larger discrepancies generally come from the small size systems: considering only the medium size molecules (s-tetrazine and benzene), the average absolute error is lowered by about 30%.


Figure 1: Average absolute error and standard deviation w.r.t. CCSD(T)/cc-pVTZ results for B3LYP harmonic frequencies (cm-1) of H2CO, H2CS, H2CNH, C2H4, s-tetrazine and benzene.

Concerning the very costly calculations beyond the harmonic level, the question which arises is how well DFT anharmonic force fields fit their CCSD(T) counterparts. A first answer can be given by adding B3LYP anharmonic corrections to CCSD(T) harmonic frequencies and then comparing the resulting anharmonic frequencies to the most reliable full CCSD(T) anharmonic computations available in the literature. The results collected in Figure 2 show that the B3LYP functional is able to reproduce CCSD(T) anharmonicities with an error lower than 4 cm-1 and that basis set extension above the 6-31G(d) level leads to a modest variation of the computed values.


Figure 2: Average absolute error and maximum error w.r.t. CCSD(T)/cc-pVTZ results for B3LYP anharmonicities (cm-1) of the small size systems H2CO, H2CS, H2CNH and C2H4.

As a matter of fact, several medium size semi-rigid molecules (pyrrole, furan, uracil, 2-thiouracil, azabenzenes, benzene) have been investigated by hybrid force fields in which the harmonic and anharmonic parts are computed at different levels of theory [10,14]. In the framework of this study, Figure 3 shows that the combination B3LYP/6-31+G(d,p)//B3LYP/6-31G(d) yields results in fair agreement with experimental data, paving the route towards the semi-quantitative study of larger semi-rigid molecules. Although the CCSD(T) method is definitely more accurate, the cases of benzene and s-tetrazine confirm that this level of theory is required only at the harmonic level, at least for the precision (say 10 cm-1) we are pursuing.


[Figure 3 (bar chart): errors of the hybrid force fields for uracil, furan, pyrrole, 2-thiouracil, pyridine, pyridazine, pyrimidine, pyrazine, 1,2,3-triazine, 1,2,4-triazine, s-tetrazine and benzene.]

Figure 3: Average absolute error and maximum error w.r.t. experimental results for the fundamental transitions calculated by different hybrid force fields for larger semi-rigid molecules. The harmonic//anharmonic hybrid force fields are: H: CCSD(T)/ANO-4321; M: B3LYP/dzppdT; L: B3LYP/dzp. For clarity, maximum errors are given without inclusion of Kekulé and C=O modes when present.

B3LYP is the hybrid density functional most widely implemented in user friendly packages, and, therefore, the most widely used. Nevertheless, it is worth examining whether other DFT based models may outperform the B3LYP results. Systematic studies for small size systems have already shown that among current density functionals (LDA, BLYP, BP86, B3LYP, B97-1) the last two hybrid functionals perform a very good job. In our recent work, the analysis has been extended to medium size molecules with functionals based on the generalized gradient approximation, GGA (HCTH, TPSS, BLYP), and their hybrid counterparts (B97-1, B3LYP, B1B95). Figure 4 shows the relative accuracy of the different methods and points out that GGA functionals are not sufficiently accurate to drive vibrational computations. Furthermore, in view of the significantly lower errors of TPSS w.r.t. BLYP, one may expect very promising results from its corresponding hybrid. While work is in progress in this connection in our laboratories, the B3LYP and B97-1 functionals combined with the 6-31+G(d,p) or TZ2P basis sets already perform a remarkably good job and represent at present the best compromise between computation time and quality of the results.

[Figure 4 (bar chart): average absolute errors of the tested conventional and hybrid functionals with the different basis sets.]


Consider the minimization of a function f : D ⊂ R^n → R [15]. More specifically, starting from an arbitrary initial vector x^0 ∈ D, one can subminimize at the kth iteration the function f(x_1^k, ..., x_{i-1}^k, x_i, x_{i+1}^k, ..., x_n^k) along the ith direction and obtain the corresponding subminimizer x̂_i. Obviously, for the subminimizer x̂_i it holds that ∂f(x_1^k, ..., x_{i-1}^k, x̂_i, x_{i+1}^k, ..., x_n^k)/∂x_i = 0. This is a one-dimensional subminimization because all the components of the vector x^k, except for the ith component, are kept constant. Then the ith component is updated according to:

    x_i^{k+1} = x_i^k + τ_k (x̂_i - x_i^k),    (1)

for some relaxation factor τ_k. The objective function f is subminimized in parallel for all i. In neural network training we have to minimize the error function E with respect to the weights w_ij, where E is the batch error measure defined as the sum-of-squared-differences error function over the entire training set. Assume that along a weight's direction an interval is known which brackets a local minimizer w_ij^*. When the gradient of the error function is available at the endpoints of the interval of uncertainty along this weight direction, it is necessary to evaluate function information at an interior point in order to reduce this interval. This is because it is possible to decide whether between two successive epochs (t-1) and (t) the corresponding interval brackets a local minimum simply by looking at the function values E(t-1), E(t) and the gradient values ∂E(t-1)/∂w_ij, ∂E(t)/∂w_ij at the endpoints of the considered interval (see [11] for a general discussion of the problem). The conditions that have to be satisfied are [11]:

    ∂E(S1)/∂w_ij < 0  and  ∂E(S2)/∂w_ij > 0,
    ∂E(S1)/∂w_ij < 0  and  E(S1) < E(S2),
    ∂E(S1)/∂w_ij > 0  and  ∂E(S2)/∂w_ij > 0  and  E(S1) > E(S2),    (2)

where S1 and S2 denote the sets of weights for which the coordinate that corresponds to the weight w_ij is replaced by a_i = min{w_ij(t-1), w_ij(t)} and b_i = max{w_ij(t-1), w_ij(t)}, respectively. Notice that, at this instance, between two successive epochs (t-1) and (t) all the other coordinates remain the same. The above three conditions lead to the conclusion that the interval [a_i, b_i] includes a local subminimizer along the direction of the weight w_ij. A robust method


of interval reduction called bisection can now be used. We will consider here the bisection method which has been modified to the following version described in [14]:

    w_i^{p+1} = w_i^p + sign(∂_i E(w^p)) h_i / 2^{p+1},    (3)

where p = 0, 1, ... is the number of subminimization steps, w_i^0 = a_i and h_i = sign(∂_i E(w^0)) (b_i - a_i); w^0 determines the set of weights at the (t-1) epoch, while w^p is obtained by replacing the coordinate of w^0 that corresponds to the weight w_ij by w_i^p; sign denotes the well known triple valued sign function and ∂_i E denotes the partial derivative of E with respect to the ith coordinate. Of course, the iterations (3) always converge with certainty to w_i^* ∈ (a_i, b_i) if for some w_i^p, p = 1, 2, ..., the first one of the conditions (2) holds. The reason for choosing the bisection method is that it always converges within the given interval (a_i, b_i) and it is a globally convergent method. Also, the number of steps of the bisection method that are required for the attainment of an approximate minimizer w_i^* of (1) within the interval (a_i, b_i) to a predetermined accuracy ε is known beforehand and is given by ν = ⌈log2((b_i - a_i) ε^{-1})⌉. Moreover, it has a great advantage since it is worst-case optimal, i.e. it possesses asymptotically the best possible rate of convergence in the worst case [12]. This means that it is guaranteed to converge within the predefined number of iterations and, moreover, no other method has this property. Therefore, using the value of ν it is easy to know beforehand the number of iterations that are required for the attainment of an approximate minimizer w_i^* to a predetermined accuracy. Finally, it requires only the algebraic signs of the values of the gradient to be computed.
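As an illustration, the following is a minimal sketch of the bisection-based subminimization along a single weight direction, assuming the iteration has the form written in Relation (3) above; dE_i is a caller-supplied placeholder for the partial derivative of the error along that direction.

    import numpy as np

    def bisect_subminimize(dE_i, a_i, b_i, eps=1e-6):
        # Locate w_i in (a_i, b_i) where the directional derivative changes sign,
        # using only the sign of dE_i; the number of steps is known in advance:
        # nu = ceil(log2((b_i - a_i) / eps)).
        w = a_i
        h = np.sign(dE_i(a_i)) * (b_i - a_i)          # h_i = sign(d_i E(w^0)) (b_i - a_i)
        nu = int(np.ceil(np.log2((b_i - a_i) / eps)))
        for p in range(nu):
            w = w + np.sign(dE_i(w)) * h / 2 ** (p + 1)   # Relation (3)
        return w

    # toy usage: along this direction dE/dw = w - 0.3, so the subminimizer is 0.3
    w_star = bisect_subminimize(lambda w: w - 0.3, 0.0, 1.0)

The loop performs exactly ν sign evaluations, which is the worst-case optimal count quoted above.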

3 The Globally Convergent JRprop Method

The aim of the new algorithm is to improve the learning speed as well as to ensure subminimization of the error function along each weight direction. The term global convergence in our context is used "to denote a method that is designed to converge to a local minimizer of a nonlinear function, from almost any starting point" [3, p.5]. Dennis and Schnabel also note that "it might be appropriate to call such methods local or locally convergent, but these descriptions are already reserved by tradition for another usage". Moreover, Nocedal [7, p.200] defines a globally convergent algorithm as an algorithm whose iterates converge from a remote starting point. Thus, in this context, global convergence is totally different from global optimisation [13]. In our approach, JRprop's convergence to a local minimiser is treated using principles from unconstrained minimisation theory. Suppose that x^0 is the starting point of the iterative scheme:

    x^{k+1} = x^k + τ_k d^k,    k = 0, 1, ...    (4)

where d^k is the search direction and τ_k > 0 is a step-length. Suppose further that

(i) f : D ⊂ R^n → R is the function to be minimized and f is bounded below in R^n;

(ii) f is continuously differentiable in a neighborhood N of the level set C = {x : f(x) ≤ f(x^0)}; (iii) ∇f is Lipschitz continuous on R^n, that is, for any two points x and y ∈ R^n, ∇f satisfies the Lipschitz condition with constant L, ||∇f(x) - ∇f(y)|| ≤ L ||x - y||, for all x, y ∈ N. Convergence of the general iterative scheme (4) requires that the adopted search direction d^k satisfies the condition ∇E(w^k)ᵀ d^k < 0, which guarantees that d^k is a descent direction of f(x) at x^k. The step-length in (4) can be defined by means of a number of rules, such as Armijo's rule [3], Goldstein's rule [3], or Wolfe's rule [16], and guarantees convergence in certain cases. For example, when the step-length is obtained through Wolfe's rule [16]:

    f(x^k + τ_k d^k) - f(x^k) ≤ σ1 τ_k ∇f(x^k)ᵀ d^k,    (5)
    ∇f(x^k + τ_k d^k)ᵀ d^k ≥ σ2 ∇f(x^k)ᵀ d^k,    (6)


where 0 < σ1 < σ2 < 1, then a theorem by Wolfe [16] is used to obtain convergence results. Moreover, Wolfe's Theorem [3, 7] suggests that if the cosine of the angle between the search direction d^k and -∇f(x^k) is positive, then lim_{k→∞} ∇f(x^k) = 0, which means that the sequence of gradients converges to zero. For an iterative scheme (4), lim_{k→∞} ∇f(x^k) = 0 is the best type of global convergence result that can be obtained (see [7] for a detailed discussion). Evidently, no guarantee is provided that (4) will converge to a global minimiser, x*, but only that it possesses the global convergence property, [3, 7], to a local minimiser.

Theorem 1: Suppose that Conditions (i)-(iii) are fulfilled. Then, for any x^0 ∈ R^n and any sequence {x^k}_{k=0}^∞ generated by the JRprop scheme:

    x^{k+1} = x^k - τ_k diag{η_1^k, ..., η_i^k, ..., η_n^k} sign(∇f(x^k)),    (7)

where sign(∇f(x^k)) denotes the column vector of the signs of the components of ∇f(x^k), τ_k > 0, and η_m^k, m = 1, 2, ..., i-1, i+1, ..., n, are small positive real numbers generated by the JRprop learning rates' schedule:

    while f(x^k) ≥ f(x^{k-1}) then
        if (∂_m f(x^{k-1}) · ∂_m f(x^k) > 0) then η_m^k = min{η_m^{k-1} η^+, Δ_max},    (8)
        if (∂_m f(x^{k-1}) · ∂_m f(x^k) < 0) then [...]

[...] If f is bounded below along the ray {x^k + τ d^k : τ > 0}, then there always exist τ_k satisfying Relations (5)-(6) [3, 7]. Moreover, Wolfe's Theorem [3, 7] suggests that if the cosine of the angle between the descent direction d^k and -∇f(x^k) is positive, then lim_{k→∞} ∇f(x^k) = 0. In our case, indeed

    cos θ_k = -∇f(x^k)ᵀ d^k / (||∇f(x^k)|| ||d^k||) > 0.    □

In batch training, E is bounded from below, since E(w) ≥ 0. For a given training set and network architecture, if w* exists such that E(w*) = 0, then w* is a global minimiser; otherwise, the w with the smallest E(w) value is considered a global minimiser. Also, when using smooth enough activations (the derivatives of at least order p are available and continuous), such as the well known hyperbolic tangent, the logistic activation function etc., the error E is also smooth enough. The globally convergent modification of the JRprop, named Globally-Jacobi-Rprop (GJRprop), is implemented through Relations (7)-(11). The role of δ is to alleviate problems with limited precision that may occur in simulations, and it should take a small value proportional to the square root of the relative machine precision. In our tests we set δ = 10^{-6} in an attempt to test the convergence accuracy of the proposed strategy. Also, τ_k = 1 for all k allows the minimisation step


Table 1: Results for the cancer and diabetes problems.

                       Cancer                                              Diabetes
  Algorithm   Epochs   Class. Success (%)   Speed (sec)   Conv. (%)   Epochs   Class. Success (%)   Speed (sec)   Conv. (%)
  Rprop        225     97.60                1.58          100          312     75.60                1.90           90
  JRprop       194     97.64                1.36          100          290     75.64                1.77           95
  GJRprop      137     97.66                1.20          100          260     75.78                1.60          100

along the resultant search direction to be explicitly defined by the values of the local learning rates. The length of the minimisation step can be regulated through τ_k tuning to satisfy (5)-(6). Checking (6) at each iteration requires additional gradient evaluations; thus, in practice (6) can be enforced simply by placing a lower bound on the acceptable values of the learning rates [5, p.1772], i.e. Δ_min.
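For illustration, the following is a minimal sketch of a sign-based update with an Rprop-style learning-rate schedule in the spirit of Relations (7)-(8); the safeguard Relations (9)-(11), the Wolfe-type control of τ_k and the batch error of an actual network are not reproduced, and all parameter values (eta_plus, eta_minus, delta_max, delta_min) are illustrative defaults rather than the settings used in the experiments.

    import numpy as np

    def sign_based_update(grad_fn, x0, n_iter=100, eta0=0.1, tau=1.0,
                          eta_plus=1.2, eta_minus=0.5,
                          delta_max=50.0, delta_min=1e-6):
        # x^{k+1} = x^k - tau * diag(eta^k) * sign(grad f(x^k))   (cf. Relation (7))
        x = np.asarray(x0, dtype=float)
        eta = np.full_like(x, eta0)            # per-coordinate learning rates
        g_prev = grad_fn(x)
        for _ in range(n_iter):
            x = x - tau * eta * np.sign(g_prev)
            g = grad_fn(x)
            same = g_prev * g > 0              # no sign change: increase the rate
            flip = g_prev * g < 0              # sign change: decrease the rate
            eta[same] = np.minimum(eta[same] * eta_plus, delta_max)
            eta[flip] = np.maximum(eta[flip] * eta_minus, delta_min)
            g_prev = g
        return x

    # toy usage: minimise f(x) = ||x||^2 / 2, whose gradient is x
    x_min = sign_based_update(lambda x: x, np.array([3.0, -2.0]), n_iter=200)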

4 Experimental study and results

In this section, we evaluate the performance of the GJRprop algorithm and compare it with the Rprop and the JRprop algorithms. We have used well-studied problems from the UCI Repository of Machine Learning Databases of the University of California [6], as well as problems studied extensively by other researchers, in an attempt to reduce as much as possible biases introduced by the size of the weight space. In all cases we have used networks with classic logistic activations. Below, we report results from 50 independent trials for four UCI problems. These 50 random weight initializations are the same for the learning algorithms, and the training and testing sets were created according to Proben1 [8]. The first benchmark is the breast cancer diagnosis problem, which classifies a tumor as benign or malignant based on 9 features [6, 8]. We have used an FNN with 9-4-2-2 nodes (a total of 56 weights), as suggested in [8]. The termination criterion is E ≤ 0.02 within 2000 iterations. The second problem is the diabetes1 benchmark. It is a real-world classification task which concerns deciding whether a Pima Indian individual is diabetes positive or not [6, 8]. There are 8 features representing personal data and results from a medical examination. The Proben1 collection suggests an 8-2-2-2 FNN (34 weights overall). The termination criterion is E ≤ 0.14 within 2000 iterations.


Figure 1: GJRprop, JRprop and Rprop learning curves: Cancer (left) and Diabetes (right).


The results for the cancer and diabetes problems are summarized in Table 1. The new algorithm performs significantly better than the JRprop and Rprop. In Table 1, we report the average time ("Speed", measured in secs), the classification success with the testing set ("Class. Success", measured by the percentage of testing patterns that were classified correctly), and the convergence success in the training phase ("Conv.", measured by the percentage of simulation runs that converged to the error goal) for each algorithm. The increased convergence speed does not seem to affect the classification success of the new method in testing. Figure 1 illustrates a case where GJRprop converges to a minimiser faster than JRprop, while Rprop gets stuck at a minimiser with a higher error value, for the two tested problems.

References

[1] Anastasiadis A. D., Magoulas G. D., Vrahatis M. N., An efficient improvement of the Rprop algorithm, In: Proceedings of the 1st International Workshop on "Artificial Neural Networks in Pattern Recognition (IAPR 2003)", University of Florence, Italy, pp. 197-201, 2003.
[2] Anastasiadis A. D., Magoulas G. D., Vrahatis M. N., Sign-based learning schemes for pattern classification, Pattern Recognition Letters, submitted.
[3] Dennis J. E. and Schnabel R. B., Numerical Methods for Unconstrained Optimization and Nonlinear Equations, SIAM, Philadelphia, 1996.
[4] Magoulas, G. D., Vrahatis, M. N., and Androulakis, G. S., Effective back-propagation training with variable stepsize, Neural Networks, 10, 1, 1997, 69-82.
[5] Magoulas, G. D., Vrahatis, M. N., and Androulakis, G. S., Improving the convergence of the backpropagation algorithm using learning rate adaptation methods, Neural Computation, 11, 7, 1999, 1769-1796.
[6] Murphy, P. M. and Aha, D. W., UCI Repository of machine learning databases, Irvine, CA: University of California, Department of Information and Computer Science, 1994, http://www.ics.uci.edu/~mlearn/MLRepository.html
[7] Nocedal J., Theory of algorithms for unconstrained optimization, Acta Numerica, pp. 199-242, 1992.
[8] Prechelt, L., PROBEN1 - A set of benchmarks and benchmarking rules for neural network training algorithms, T.R. 21/94, Fakultät für Informatik, Universität Karlsruhe, 1994.
[9] Riedmiller, M. and Braun, H., A direct adaptive method for faster backpropagation learning: The RPROP algorithm, Proc. Int. Conf. Neural Networks, San Francisco, CA, 1993, 586-591.
[10] Rumelhart, D. E. and McClelland, J. L., Parallel distributed processing: Explorations in the microstructure of cognition, Cambridge, MIT Press, 1986, pp. 318-362.
[11] Scales, L. E., Introduction to Non-linear Optimization, MacMillan Publishers, 1985, pp. 34-35.
[12] Sikorski, K., Optimal Solution of Nonlinear Equations, Oxford Univ. Press, New York, 2001.
[13] Treadgold, N. K. and Gedeon, T. D., Simulated annealing and weight decay in adaptive learning: The SARPROP algorithm, IEEE Trans. Neural Networks, 9, 4, 1998, 662-668.
[14] Vrahatis, M. N., Solving systems of nonlinear equations using the nonzero value of the topological degree, ACM Trans. Math. Software, 14, 1988, 312-329; ibid: 14, 1988, 330-336.
[15] Vrahatis, M. N., Magoulas, G. D., and Plagianakos, V. P., From linear to nonlinear iterative methods, Appl. Numer. Math., 45, 1, 2003, 59-77.
[16] Wolfe P., Convergence conditions for ascent methods, SIAM Review, 11, 226-235, 1969; ibid: 13, 185-188, 1971.


Lecture Series on Computer and Computational Sciences

Volume 1, 2004, pp. 849-851

On Compositions of Clones of Boolean Functions

Miguel Couceiro 1, Stephan Foldes 2, Erkko Lehtonen 3

Received 31 July, 2004; accepted in revised form 5 September, 2004

Abstract: We present some compositions of clones of Boolean functions that imply factorizations of Ω, the clone of all Boolean functions, into minimal clones. These can be interpreted as representation theorems, providing representations of Boolean functions analogous to the disjunctive normal form, the conjunctive normal form, and the Zhegalkin polynomial representations.

Keywords: Function class composition, Clones, Boolean functions, Post Classes, Class factorization, Normal forms, DNF, CNF, Zhegalkin polynomial, Applications of universal algebra in computer science

Mathematics Subject Classification: 06E30, 08A70, 94C10

1 Introduction

It is a well-known fact that every Boolean function can be represented as a disjunction of conjunctions of literals, as a conjunction of disjunctions of literals, and as a multilinear polynomial over GF(2). These representations are called the disjunctive normal form (DNF), the conjunctive normal form (CNF), and the Zhegalkin polynomial, respectively. These facts can be reformulated by means of composition of clones of Boolean functions. Moreover, the clones occurring in these compositions are minimal. In other words, these theorems represent different factorizations of the clone Ω of all Boolean functions into minimal clones. In this work, we present more such factorizations of Ω. Furthermore, these factorizations can be interpreted as representation theorems similar to the DNF, CNF and Zhegalkin polynomial representation theorems.

2 Definitions

Let A, B, E, G be arbitrary non-empty sets. A function on A to B is a map f : A^n → B, for some positive integer n called the arity of f. A class of functions on A to B is a subset F ⊆ ∪_{n≥1} B^{A^n}. For a fixed arity n, the n different projection maps a = (a_t | t ∈ n) ↦ a_i, i ∈ n, are also called variables. For A = B = {0, 1}, a function on A to B is called a Boolean function. If f is an n-ary function on B to E and g_1, ..., g_n are all m-ary functions on A to B, then the composition f(g_1, ..., g_n) is an m-ary function on A to E, and its value on (a_1, ..., a_m) ∈ A^m

1 Department of Mathematics, Statistics and Philosophy, University of Tampere, FI-33014 Tampereen yliopisto, Finland. Partially supported by the Graduate School in Mathematical Logic MALJA. Supported in part by grant #28139 from the Academy of Finland.
2 Institute of Mathematics, Tampere University of Technology, P.O. Box 553, FI-33101 Tampere, Finland. Corresponding author. E-mail: [email protected]
3 Institute of Mathematics, Tampere University of Technology, P.O. Box 553, FI-33101 Tampere, Finland.


is f(g_1(a_1, ..., a_m), ..., g_n(a_1, ..., a_m)). Let I ⊆ ∪_{n≥1} E^{B^n} and J ⊆ ∪_{n≥1} B^{A^n} be classes of functions. The composition of I with J, denoted I ∘ J, is defined as

    I ∘ J = {f(g_1, ..., g_n) | n, m ≥ 1, f n-ary in I, g_1, ..., g_n m-ary in J}.

A clone on A is a set C ⊆ ∪_{n≥1} A^{A^n} that contains all projections and satisfies C ∘ C ⊆ C.
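As a concrete illustration of the composition f(g_1, ..., g_n) defined above, the following sketch builds an m-ary Boolean function by composing an n-ary outer function with n inner m-ary functions; the helper names are purely illustrative.

    from itertools import product

    def compose(f, gs):
        # f(g_1, ..., g_n): f is n-ary on B to E, each g_i is m-ary on A to B;
        # the result maps (a_1, ..., a_m) to f(g_1(a_1,...,a_m), ..., g_n(a_1,...,a_m)).
        return lambda *args: f(*(g(*args) for g in gs))

    # toy usage with Boolean functions: h(x, y) = AND(x, NOT y) built by composition
    AND = lambda x, y: x & y
    PROJ_1 = lambda x, y: x        # a projection (variable)
    NOT_2ND = lambda x, y: 1 - y   # a binary function ignoring its first argument
    h = compose(AND, [PROJ_1, NOT_2ND])
    assert [h(x, y) for x, y in product((0, 1), repeat=2)] == [0, 0, 1, 0]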


In case of inconsistent matrices, the hyperlines have no common intersection point, i.e. the intersection set is empty. Thus, the FPM represents the hyperlines as fuzzy lines and finds the solution


of the approximate priority assessment problem, as an intersection point of these fuzzy lines with values for the priorities that satisfy all judgements "as well as possible". Previous work gives evidence that the FPM is able to produce better results than other methods when the degree of inconsistency is high [7]; this is a valuable property in our case. By following [6], the problem can be formulated as the standard linear programming problem of Equation (2), where the objective is to maximise λ, a measure of the intersection region of the fuzzy lines, subject to a set of m = p(p-1) constraints, which are given in matrix form, R ∈ R^{m×p}, in Figure 1(b):

    maximize λ,
    subject to  λ d_k + R_k w ≤ d_k,  k = 1, ..., m,
                1 ≥ λ ≥ 0,  and  Σ_{i=1}^{p} w_i = 1,  w_i > 0,  i = 1, ..., p.    (2)

In Equation (2), p is the number of elements compared; w = (w_1, w_2, ..., w_p)ᵀ is the vector of priority weights; k indicates the k-th row of matrix R, and the values of the tolerance parameters d_k represent the admissible interval of approximate satisfaction of the crisp inequalities R_k w ≤ 0. For the practical implementation of the FPM, it is reasonable for all these parameters d_k to be set equal [7]. The optimal solution to the problem is a vector (w*, λ*), whose first component maximises the degree of membership of the fuzzy feasible region set, whilst the second gives the value of the maximum degree of satisfaction. The method is explained in detail in [6]. After deriving the underlying weights from the comparison matrices through the FPM technique, the weighted priority, w_i, and the relative weight, w_{j,i}, are synthesised using weighted sum aggregation in order to find the preference of microprotocol j with respect to all criteria/requirements simultaneously. Preference is denoted by W_j and determines the overall ranking priority, or weight, of microprotocol j (obviously, the microprotocol with the maximum overall value W_j will be chosen):

    W_j = Σ_{i=1}^{p} w_i · w_{j,i}.    (3)
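For illustration, a minimal sketch of problem (2) using scipy.optimize.linprog is given below (this is not the implementation used in the paper); R is the m x p constraint matrix and d the vector of tolerance parameters, both as numpy arrays, and the small positive lower bound on the weights is an arbitrary stand-in for the strict constraint w_i > 0.

    import numpy as np
    from scipy.optimize import linprog

    def fpm_priorities(R, d):
        # Variables: [lambda, w_1, ..., w_p].  Maximise lambda  <=>  minimise -lambda.
        R = np.asarray(R, dtype=float)
        d = np.asarray(d, dtype=float)
        m, p = R.shape
        c = np.zeros(p + 1)
        c[0] = -1.0
        A_ub = np.hstack([d.reshape(-1, 1), R])                  # lambda*d_k + R_k w <= d_k
        b_ub = d
        A_eq = np.hstack([np.zeros((1, 1)), np.ones((1, p))])    # sum_i w_i = 1
        b_eq = [1.0]
        bounds = [(0.0, 1.0)] + [(1e-9, None)] * p               # 0 <= lambda <= 1, w_i > 0
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
        return res.x[1:], res.x[0]                               # (w*, lambda*)

The overall priority of Equation (3) for microprotocol j is then simply the weighted sum of its relative scores with the returned priority vector w.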

3. Application example

We use a scenario which illustrates the ability of our approach to select appropriate microprotocols and construct a suitably-tailored protocol stack depending on the prevailing operating network environment and user perceptual preferences. The FPM was applied and the relative scores, w_{j,i}, thus obtained are presented in Table 1, where, for example, one can notice that the first four microprotocols considered have an equal importance with respect to managing segment loss (SL). However, the most important microprotocol for segment loss is microprotocol 6, which has the highest relative score.

Table 1: Relative scores, w_{j,i}, of the alternatives comparison matrices with respect to each criterion.

[Table 1 entries: microprotocols 1-9 scored against the criteria BER, SO, SL, DEL, JIT, V, A, T and D; the numeric values are not reliably recoverable from the scan.]

In this example, no assumptions are made about the underlying network conditions, or about the multimedia content to be transported over the network. As such, the a priori judgement values which arise from technical considerations, as well as user judgements resulting from our user-perceptual evaluation experiments [3], are used. As can be observed from the table, delay and audio are the most important criteria from a technical and user point of view, respectively. This is because distributed multimedia applications have an essential real-time characteristic, which makes delay the primordial factor from the technical point of view. On the other hand, our work on perceptual aspects of multimedia has confirmed previous experiments in highlighting that the most important medium in a multimedia presentation, from a user perspective, is the audio component.


Furthermore, from a technical angle, the Segment Loss (SL) criterion has the same importance as the Segment Order (SO) criterion, and thus in Figure 1(b) these two criteria are shown to be "equally important". Similarly, for the user, the Text (T) criterion is as important as Dynamism (D). Moreover, as can be observed, the QoS and perceptual parameters are considered to be equal to unity, which reflects a balance between perceptual and QoS considerations in this initial scenario.

By applying the FPM, the priority weights w_i are derived (see Table 2). Finally, by synthesising the relative scores and the priority vector, using Equation (3), the overall priorities of the alternatives are obtained (see Figure 2). Micro1 is suggested as the best alternative, as it has a high relative score for both delay (the most important parameter from a technical/QoS point of view) and audio (the most important parameter from a user point of view). Although micro7 is even better suited to manage these two parameters, if one considers the overall set of parameters, it is micro1 that achieves at least equal or higher relative scores than micro7 for eight out of the nine parameters/criteria.

[Figure 2 (bar chart): priority weights of microprotocols 1-9 for the General Case, a Visually-impaired user, and High Delay Levels.]

Figure 2: Microprotocol priority weights for three cases

References

[1] Bouch, A., Kuchinsky, A., and Bhatti, N., Quality is in the eye of the beholder, Proc. CHI 2000, The Hague, The Netherlands, 2000, pp. 297-304.
[2] Chen, S.J. and Hwang, C.L., Fuzzy Multiple Attribute Decision Making: Methods and Applications, Lecture Notes in Economics and Mathematical Systems, 375, 1992.
[3] Ghinea, G. and Thomas, J.P., Quality of Perception: User Quality of Service in Multimedia Presentations, accepted for publication, IEEE Transactions on Multimedia.
[4] Hikichi, K., Morino, H., Matsumoto, S., Yasuda, Y., Arimoto, I., Ijume, M. and Sezaki, K., Architecture of Haptics Communication System for Adaptation to Network Environments, Proc. IEEE Int. Conference on Multimedia and Expo, Tokyo, Japan, 2001, pp. 744-747.
[5] Krasic, C., Walpole, J., Feng, W., Quality-Adaptive Media Streaming by Priority Drop, Proc. NOSSDAV'03, Monterey, California, USA, 2003, pp. 307-310.
[6] Mikhailov, L., A fuzzy programming method for deriving priorities in the analytic hierarchy process, Journal of the Operational Research Society, 51, 341-349, 2000.
[7] Mikhailov, L. and Singh, M.G., Comparison Analysis of Methods for Deriving Priorities in the Analytic Hierarchy Process, Proc. IEEE International Conference on Systems, Man and Cybernetics, Tokyo, Japan, 1999, pp. 1037-1042.
[8] Sermadevi, Y., Masry, M.A., Hemami, S.S., MINMAX Rate Control with a Perceived Distortion Metric, Proc. SPIE - Visual Communications and Image Processing, San Jose, CA, January 2004.
[9] Varadarajan, S., Ngo, H.Q., Srivastava, J., An Adaptive, Perception-Driven Error Spreading Scheme in Continuous Media Streaming, Proc. ICDCS 2000, Taiwan, 2000, pp. 475-483.
[10] Varadarajan, S., Ngo, H.Q., Srivastava, J., Error spreading: a perception-driven approach to handling error in continuous media streaming, IEEE/ACM Transactions on Networking, 10(1), 139-152, 2002.
[11] Yadavalli, G., Masry, M.A., Hemami, S.S., Frame Rate Preferences in Low Bit Rate Video, Proc. IEEE ICIP 2003, Barcelona, Spain, 2003, pp. 441-444.
[12] Zimmermann, H.-J., Fuzzy Set Theory and Its Applications, Kluwer, 2nd ed., Boston, 1991.


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 860-863

Assessing the Effectiveness of Artificial Neural Networks on Problems Related to Elliptic Curve Cryptography

E.C. Laskari*,1, G.C. Meletiou†,2, Y.C. Stamatiou§,3 and M.N. Vrahatis*,‡,4

*Department of Mathematics, University of Patras, GR-26110 Patras, Greece

†A.T.E.I. of Epirus, P.O. Box 110, GR-47100 Arta, Greece
§Department of Mathematics, University of Aegean, GR-83200 Samos, Greece
‡University of Patras Artificial Intelligence Research Center (UPAIRC), University of Patras, GR-26110 Patras, Greece

Received 31 July, 2004; accepted in revised form 5 September, 2004

Abstract: Cryptographic systems based on elliptic curves have been introduced as an alternative to conventional public key cryptosystems. The security of both kinds of cryptosystems relies on the hypothesis that the underlying mathematical problems are computationally intractable, in the sense that they cannot be solved in polynomial time. Artificial neural networks are computational tools, motivated by biological systems, which have the inherent ability of storing and making available experiential knowledge. These characteristics give artificial neural networks the ability to solve complex problems. In this paper, we study the performance of artificial neural networks on the approximation of data derived from the use of elliptic curves in cryptographic applications.

Keywords: Artificial neural networks, approximation, elliptic curves, discrete logarithm.

Mathematics Subject Classification: 68T05, 68T20, 68T30, 33F05, 14H52, 11G20, 11T71.

1 Introduction

Cryptographic systems based on Elliptic Curves (ECs) have been proposed in [5, 12] as an alternative to conventional public key cryptosystems. Their main advantage is that they use smaller parameters compared to conventional cryptosystems (e.g. RSA). This is due to the apparently increased difficulty of the underlying mathematical problem, the Elliptic Curve Discrete Logarithm Problem (ECDLP). This problem is believed to require more time for its solution than the time required for the solution of its finite field analogue, the Discrete Logarithm Problem (DLP), which ensures the security of a number of cryptosystems (e.g. ElGamal). The security of cryptosystems that rely on both types of discrete logarithms is based on the hypothesis that the underlying mathematical problems are computationally intractable, in the sense that they cannot be solved in polynomial time. Numerous techniques have been proposed to speed up the solution of these two types of the discrete logarithm problem, relying on algebraic and number theoretic methods, software oriented methods, as well as approximation and interpolation techniques [2, 9, 10, 17].

1 Corresponding author. E-mail: [email protected]
2 E-mail: [email protected]
3 E-mail: [email protected]
4 E-mail: [email protected]


Artificial neural networks are computational tools that are motivated by biological systems and have the inherent ability of storing experiential knowledge and making it available for use. These characteristics give artificial neural networks the ability to solve complex problems. In this paper, we study the performance of artificial neural networks when applied to the approximation of data produced by cryptographic applications that employ elliptic curves.

2 A Brief Introduction to Elliptic Curves

An Elliptic Curve over a prime finite field F_p, p > 3 and prime, is denoted by E(F_p) and is defined as the set of all pairs (x, y) ∈ F_p (points in affine coordinates) that satisfy the equation y^2 = x^3 + ax + b, where a, b ∈ F_p, with the restriction 4a^3 + 27b^2 ≠ 0. These points, together with a special point denoted by O, called the point at infinity, and an appropriately defined point addition operation form an Abelian group. This is the Elliptic Curve group and the point O is its identity element (see [1, 16] for more details on this group). The order m of an elliptic curve is defined as the number of points in E(F_p). According to Hasse's theorem (see e.g. [1, 16]) it holds that

    p + 1 - 2√p ≤ m ≤ p + 1 + 2√p.

The order of a point P ∈ E(F_p) is the smallest positive integer n for which nP = O. From Lagrange's theorem, it holds that the order of a point cannot exceed the order of the elliptic curve. We will now describe the discrete logarithm problem. Let G be any group and y one of its elements. The discrete logarithm problem for G to the base g ∈ G consists in determining an integer x such that g^x = y when the group operation is written as multiplication, or xg = y when the group operation is written as addition. In groups formed by elliptic curve points the group operation is addition. Therefore, the definition of the discrete logarithm problem for elliptic curves is as follows. Let E be an elliptic curve over a finite field F_q, P a point on E(F_q) of order n and Q a point on E(F_q) such that Q = tP, with 0 ≤ t ≤ (n-1). The discrete logarithm problem for elliptic curves consists in determining the value of t. Groups defined on elliptic curves are special since the best algorithms that solve the discrete logarithm problem for them require an exponential expected number of steps. In contrast, for the discrete logarithm problem defined over the multiplicative group F_q^*, the best algorithms known today require sub-exponential time in the size of the used group.
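To make the group operation and the relation Q = tP concrete, the following is a minimal sketch of affine point addition and double-and-add scalar multiplication over F_p (illustrative only; it omits curve-membership checks and is not the arithmetic used in practical EC cryptosystems).

    def ec_add(P, Q, a, p):
        # Affine addition on y^2 = x^3 + a*x + b over F_p; the point at infinity O is None.
        if P is None:
            return Q
        if Q is None:
            return P
        (x1, y1), (x2, y2) = P, Q
        if x1 == x2 and (y1 + y2) % p == 0:
            return None                                     # P + (-P) = O
        if P == Q:
            lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
        else:
            lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
        x3 = (lam * lam - x1 - x2) % p
        return x3, (lam * (x1 - x3) - y1) % p

    def ec_mul(t, P, a, p):
        # t*P by double-and-add; recovering t from P and Q = t*P is the ECDLP.
        R, Q = None, P
        while t:
            if t & 1:
                R = ec_add(R, Q, a, p)
            Q = ec_add(Q, Q, a, p)
            t >>= 1
        return R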

3 Approximation through Artificial Neural Networks

Artificial Neural Networks (ANNs) have been motivated by biological systems and more specifically by the human brain. Formally, "an ANN is a massively parallel distributed processor made of simple processing units, the neurons, which has a natural propensity for storing experiential knowledge and making it available to use. It resembles the brain in two ways. First, knowledge is acquired by the network from its environment through a learning process, and second, interneuron connection strengths, called weights, are used to store the acquired knowledge" [3]. Each artificial neuron is characterized by an input/output (I/O) relation and implements a local computation. Its output is determined by its I/O characteristics, its interconnection to other neurons and possible external inputs. The overall functionality of a network is determined by its topology, the training algorithm applied, and its neuron characteristics. In this paper, from the wide family of neural network types, we focus on Feedforward Neural Networks (FNNs). In FNNs all neuron connections lead in one direction and the neurons can be partitioned in layers. This kind of network can be described by a series of integers that denote the number of neurons at each layer. The operation of such networks consists of iterative steps. The input layer neurons


are generally assigned the real inputs, while the remaining hidden and output layer neurons are passive. In the next step the neurons of the first hidden layer collect and sum their inputs and compute their output, which is applied to the second layer. This procedure is propagated to all layers until the final outputs of the network are computed. The computational power of neural networks derives from their parallel distributed structure and their inherent ability to adapt to specific problems, learn and generalize. These characteristics give ANNs the ability to solve complex problems [4, 14]. The training process involves modification of the network weights by presenting to the network training samples, called patterns, for which the desired outputs are a priori known. The ultimate goal of training is to assign to the weights (free parameters) W of the network values such that the difference between the desired output (target) and the actual output of the network is minimized. The adaptation process starts by presenting all the patterns to the network and computing a total error function E = Σ_{k=1}^{P} E_k, where P is the total number of patterns used in the training process and E_k is the partial network error with respect to the kth training pattern, computed by summing the squared discrepancies between the actual network outputs and the desired values of the kth training pattern. The training patterns can be applied several times to the network but in a different order. Each full pass of all the patterns that belong to the training set, T, is called a training epoch. If the adaptation method succeeds in minimizing the total error function then it is obvious that its aim has been fulfilled. Thus training is a non-trivial minimization problem. The most popular training method is the back propagation method [3], which is based on the steepest descent optimization method. The back propagation learning process applies small iterative steps which correspond to the training epochs. At each epoch t the method updates the weight values proportionally to the gradient of the error function E(w). The whole process is repeated until the network reaches a steady state, where no significant changes in the weights are observed, or until the overall error value drops below a pre-determined threshold. At this point we conclude that the network has learned the problem "satisfactorily". The total number of epochs required can be considered as the speed of the method. More sophisticated training techniques can be found in [3, 6, 7, 8, 15, 18].
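As a small concrete example of the batch error defined above, the following computes E = Σ_k E_k over a set of patterns, with E_k the sum of squared discrepancies between the network outputs and the targets; the network itself is a caller-supplied placeholder.

    import numpy as np

    def batch_error(network, patterns, targets):
        # E = sum_k E_k, where E_k is the squared output-target discrepancy for
        # the k-th pattern; `network` is any callable mapping an input pattern
        # to the vector of network outputs.
        return sum(float(np.sum((network(x) - t) ** 2)) for x, t in zip(patterns, targets))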

4 Problem Formulation and Preliminary Results

In this contribution, we consider the approximation of the elliptic curve discrete logarithm problem using ANNs. More specifically, we consider a specific elliptic curve along with several instances of the discrete logarithm over it and study the performance of ANNs in approximating the value t of the corresponding discrete logarithm. The training methods considered in this study are the Resilient Back Propagation (RPROP) [15] and the Scaled Conjugate Gradient (SCG) method [13]. Regarding the network architecture, we test a variety of topologies with different numbers of hidden layers and various numbers of neurons at each layer. To make the adaptation of the network easier, the data are normalized before the training. Assuming that the data presented to the network are in Z_p, where p is prime, the interval S = [-1, 1] is split into p subintervals. Thus, numbers in the data are transformed to analogous ones in S. To evaluate the network performance we measure the percentage of training data for which the network was able to compute the exact target value t. This measure is denoted as μ_0. Next, we employ the μ_{±ν} measure, which represents the percentage of the data for which the difference between desired and actual output does not exceed ±ν of the real target. As input pattern of the ANN, the tetrad of the components of the two points P, Q defined by the discrete logarithm over E(F_q) is presented to the network, and the corresponding value t of the discrete logarithm forms the target value of the network. Preliminary results for small primes p, i.e. primes from 1009 to 5003, indicate that ANNs are able to adapt to the training data, even for the measure μ_0.
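The normalisation step described above can be sketched as follows; mapping each residue to the midpoint of its subinterval of S = [-1, 1] is one plausible choice, since the paper does not state which representative point of each subinterval is used.

    import numpy as np

    def normalise_mod_p(values, p):
        # Map integers in Z_p = {0, ..., p-1} to the midpoints of p equal
        # subintervals of S = [-1, 1].
        values = np.asarray(values, dtype=float)
        width = 2.0 / p
        return -1.0 + (values + 0.5) * width

    # toy usage for p = 1009: the residues 0, 504 and 1008 map near -1, 0 and 1
    x = normalise_mod_p([0, 504, 1008], 1009)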


However, the topology and the number of epochs required for the adaptation to the data of the ECDLP are quite large in comparison with the corresponding requirements for the DLP [11]. Further examination of the ability of ANNs to adapt to the training data for larger primes, as well as a study of the generalization performance of the networks on unknown data, will be presented.

Acknowledgment We acknowledge the partial support by the "Archimedes" research programme awarded by the Greek Ministry of Education and Religious Affairs and the European Union.

References

[1] Blake, I., Seroussi, G., and Smart, N., Elliptic Curves in Cryptography, London Mathematical Society Lecture Notes Series 265, Cambridge University Press, 1999.
[2] Coppersmith, D. and Shparlinski, I., On polynomial approximation of the discrete logarithm and the Diffie-Hellman mapping, J. Cryptology, 13, pp. 339-360, 2000.
[3] Haykin, S., Neural Networks, Macmillan College Publishing Company, 1999.
[4] Hornik, K., Multilayer feedforward networks are universal approximators, Neural Networks, 2, pp. 359-366, 1989.
[5] Koblitz, N., Elliptic curve cryptosystems, Math. Comp., 48, pp. 203-209, 1987.
[6] Magoulas, G.D., Plagianakos, V.P., and Vrahatis, M.N., Adaptive stepsize algorithms for on-line training of neural networks, Nonlinear Analysis T.M.A., 47(5), pp. 3425-3430, 2001.
[7] Magoulas, G.D., Vrahatis, M.N., and Androulakis, G.S., Effective backpropagation training with variable stepsize, Neural Networks, 10(1), pp. 69-82, 1997.
[8] Magoulas, G.D., Vrahatis, M.N., and Androulakis, G.S., Increasing the convergence rate of the error backpropagation algorithm by learning rate adaptation methods, Neural Computation, 11(7), pp. 1769-1796, 1999.
[9] Maurer, U. and Wolf, S., The relationship between breaking the Diffie-Hellman protocol and computing discrete logarithms, SIAM J. Computing, 28, pp. 1689-1721, 1999.
[10] Meletiou, G.C. and Mullen, G.L., A note on discrete logarithms in finite fields, Appl. Algebra Engrg. Comm. Comput., 3(1), pp. 75-79, 1992.
[11] Meletiou, G.C., Tasoulis, D.K., and Vrahatis, M.N., Cryptography through interpolation approximation and computational intelligence methods, Bull. Greek Math. Soc., 2004, in press.
[12] Miller, V., Uses of elliptic curves in cryptography, LNCS, 218, pp. 417-426, 1986.
[13] Møller, M.F., A scaled conjugate gradient algorithm for fast supervised learning, Neural Networks, 6, pp. 525-533, 1993.
[14] Pincus, A., Approximation theory of the MLP model in neural networks, In Acta Numerica, Cambridge University Press, pp. 143-195, 1999.
[15] Riedmiller, M. and Braun, H., A direct adaptive method for faster backpropagation learning: The RPROP algorithm, In Proceedings of the IEEE International Conference on Neural Networks, pp. 586-591, San Francisco, CA, 1993.
[16] Silverman, J.H., The Arithmetic of Elliptic Curves, Springer-Verlag, 1986.
[17] Winterhof, A., Polynomial interpolation of the discrete logarithm, Des. Codes Cryptogr., 25(1), pp. 63-72, 2002.
[18] Vrahatis, M.N., Androulakis, G.S., Lambrinos, J.N., and Magoulas, G.D., A class of gradient unconstrained minimization algorithms with adaptive stepsize, J. Comput. Appl. Math., 114(2), pp. 367-386, 2000.

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 864-867


Applying Evolutionary Computation Methods for the Cryptanalysis of Feistel Ciphers

E.C. Laskari*,1, G.C. Meletiou†,2, Y.C. Stamatiou§,3 and M.N. Vrahatis*,‡,4

*Department of Mathematics, University of Patras, GR-26110 Patras, Greece
†A.T.E.I. of Epirus, P.O. Box 110, GR-47100 Arta, Greece
§Department of Mathematics, University of Aegean, GR-83200 Samos, Greece
‡University of Patras Artificial Intelligence Research Center (UPAIRC), University of Patras, GR-26110 Patras, Greece

Received 31 July, 2004; accepted in revised form 5 September, 2004

Abstract: Evolutionary Computation algorithms are stochastic optimization methods inspired by natural evolution and social behavior. These methods have been proven effective in tackling difficult problems that involve discontinuous objective functions and disjoint search spaces. In this contribution a problem introduced by the cryptanalysis of Feistel cryptosystems by means of differential cryptanalysis is formulated as an optimization problem, and the performance of Evolutionary Computation methods in addressing this problem is studied for a representative Feistel cryptosystem, the DES. This approach is applicable to all Feistel cryptosystems that are amenable to the differential cryptanalysis method.

Keywords: Computational Intelligence, Evolutionary Computation, DES, Differential Cryptanalysis, Feistel Ciphers

Mathematics Subject Classification: 90C59, 90C30, 90C26, 94A60.

1 Introduction

Evolutionary Computation algorithms are stochastic optimization methods inspired by natural evolution and social behavior. The most known paradigms of such methods are Genetic Algorithms, Evolution Strategies and the Differential Evolution algorithm, which are based on the principles of natural evolution, and the Particle Swarm Optimization, which is based on the simulation of social behavior [4, 5, 11]. These methods have been applied in several scientific fields that include optimization problems, such as mathematics, economy, medicine, engineering and others, and have proven effective in tackling difficult problems that involve discontinuous objective functions and disjoint search spaces. In this contribution we consider a problem introduced by the cryptanalysis of a Feistel cryptosystem and formulate it as an optimization one. Then the performance of Evolutionary Computation methods in addressing this problem is studied. More specifically, induced by Differential 1 Corresponding

author. E-mail: [email protected] [email protected] [email protected] 4 E-mail: [email protected]

2 E-mail: 3 E-mail:


Cryptanalysis, we investigate the problem of finding some bits of the key that is used in a simple Feistel cipher, the Data Encryption Standard (DES), with a reduced number of rounds. Our preliminary results on DES reduced to four rounds are encouraging, since the method managed to address the problem at hand using a smaller number of function evaluations than the brute force approach.

2 Background and Problem Formulation

A Feistel cipher is a cryptosystem based on a sequential r-fold repetition of a function, the round function, that maps an n-bit plaintext P to a ciphertext C. In a Feistel cipher the current n-bit word is divided into (n/2)-bit parts, the left part L_i and the right part R_i [3]. Then round i, 1 ≤ i ≤ r, has the following effect:

    L_i = R_{i-1},   R_i = L_{i-1} ⊕ F_i(R_{i-1}, K_i),

where K_i is the subkey used in the ith round (derived from the cipher key K), and F_i is an arbitrary round function for the ith round. The output of a Feistel cipher is ordered as (R_r, L_r), i.e. after the last round function has been applied, the two halves are swapped. An interesting characteristic of Feistel based cryptosystems is that the decryption function is identical to the encryption function except that the subkeys, K_i, and the round functions, F_i, are applied in reverse. This makes the Feistel structure an attractive choice for both software and hardware implementations. The best known and most widely used Feistel block-cipher cryptosystem is DES, which is the outcome of the collaboration between the government of the United States and IBM in the '70s. It is a symmetric algorithm, meaning that the parties exchanging information possess the same key. DES processes plaintext blocks of n = 64 bits, producing 64-bit ciphertext blocks, with key size k = 64 bits, 8 of which can be used as parity bits. The plaintext block is divided into the left and right parts of 32 bits each. The main part of the round function is the F function, which works on the right half of the data, using a subkey of 48 bits and eight S-boxes. The S-boxes are mappings that transform 6 bits to 4 bits in a nonlinear way and constitute the only nonlinear part of DES. The 32 output bits of the F function are XORed with the left half of the data and the two halves are exchanged. A more detailed description of the DES algorithm can be found in [10, 12]. Two of the most powerful cryptanalytic attacks for Feistel based ciphers, which were first applied with success to the cryptanalysis of DES, depend critically on the exploitation of specific weaknesses of the S-boxes of the target cryptoalgorithm. These attacks are Linear Cryptanalysis (see [9, 8]) and Differential Cryptanalysis (see [1, 2]). Differential Cryptanalysis (DC) is a chosen plaintext attack which uses only the resultant ciphertexts. The basic tool of the attack is the ciphertext pair, which is a pair of ciphertexts whose plaintexts have particular differences. The two plaintexts are chosen at random, as long as they satisfy the difference condition. DC analyzes the effect of particular differences in plaintext pairs on the differences of the resultant ciphertext pairs. These differences can be used to assign probabilities to the possible keys and to locate the most probable key. This method usually works on a number of pairs of plaintexts with the same particular difference, using only the resultant ciphertext pairs. For cryptosystems similar to DES, the difference is chosen as a fixed XORed value of the two plaintexts. The most important component in DC is the use of a characteristic, which can be informally defined as follows [1]: "Associated with any pair of encryptions are the XOR value of its two plaintexts, the XOR of its ciphertexts, the XORs of the inputs of each round in the two executions and the XORs of the outputs of each round in the two executions. These XOR values form an r-round characteristic. A characteristic has a probability, which is the probability that a random pair with the chosen plaintext XOR has the round and ciphertext XORs specified in the characteristic". Each characteristic allows the search for a particular set of bits in the subkey of the last round: the bits that enter some particular S-boxes, depending on the chosen characteristic.
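As an illustration of the round structure and of the fact that decryption reuses the same structure with the subkeys reversed, the following is a minimal generic Feistel sketch (the round function, key schedule and block sizes are toy placeholders, not the DES primitives).

    def feistel_encrypt(block_lr, subkeys, round_fn):
        # L_i = R_{i-1},  R_i = L_{i-1} XOR F_i(R_{i-1}, K_i); output ordered (R_r, L_r).
        L, R = block_lr
        for K in subkeys:
            L, R = R, L ^ round_fn(R, K)
        return R, L

    def feistel_decrypt(block_rl, subkeys, round_fn):
        # Same structure with the subkeys applied in reverse order.
        return feistel_encrypt(block_rl, list(reversed(subkeys)), round_fn)

    # toy usage with 16-bit halves and an arbitrary (non-DES) round function
    toy_round = lambda half, key: (half * 31 + key) & 0xFFFF
    ct = feistel_encrypt((0x1234, 0x5678), [7, 11, 13, 17], toy_round)
    assert feistel_decrypt(ct, [7, 11, 13, 17], toy_round) == (0x1234, 0x5678)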


The characteristics that are most useful are those that have a maximal probability and a maximal number of subkey bits whose occurrences can be counted. DC is a statistical method and can fail in rare instances. A more extended analysis of DC and its results on DES for different numbers of rounds can be found in [1]. For the Differential Cryptanalysis of DES reduced to four rounds, a one-round characteristic with probability 1 can be used. This characteristic, at the first step of the cryptanalysis, provides 42 bits of the subkey of the last round. In the case where the subkeys are calculated with the DES key scheduling algorithm, the 42 bits given by DC are actual key bits of the 56 key bits and there are 14 key bits still missing. A suggestion for finding these key bits was to try all the 2^14 possibilities in decrypting the given ciphertexts, using the resulting keys. The right key should satisfy the known plaintext XOR value for all the pairs that are used by DC. An alternative way is to use a second characteristic that corresponds to the missing bits and try a more careful counting on the key bits of the last two rounds. Instead of using the aforementioned approaches to find the missing key bits, we formulate the problem of the missing 14 bits as an optimization problem as follows. We consider each one of the 14 bits as a component of a 14-dimensional vector. Such a vector represents a possible solution of the problem. Assume that the right 42 key bits found by DC were suggested using np pairs. We use these np pairs for the evaluation of the possible solutions provided by the optimization method. More specifically, for each possible solution X_i suggested by the optimization algorithm, we construct the 56 bits of the key, using the 42 bits which are known from DC and the 14 components of X_i in proper order. With the resulting key, we decrypt the np ciphertext pairs that were used by DC and count the number of decrypted pairs that satisfy the known plaintext XOR value, denoted as cnp_{X_i}. Thus, the evaluation function f is the difference between the desired output np and the actual output cnp_{X_i}, i.e., f(X_i) = np - cnp_{X_i}. The global minimum of the function f is zero and the global minimizer provided will, with high probability, be the actual key.
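A minimal sketch of this evaluation function is given below; `decrypt` and `build_key` are caller-supplied placeholders standing in for the DES decryption routine and for the insertion of the 14 candidate bits into their proper key positions, neither of which is reproduced here.

    def make_fitness(known_bits, ciphertext_pairs, plaintext_xor, decrypt, build_key):
        # Returns f(X) = np - cnp_X: np pairs minus those whose decryptions under
        # the candidate key reproduce the required plaintext XOR.
        np_pairs = len(ciphertext_pairs)

        def fitness(candidate_bits):
            key = build_key(known_bits, candidate_bits)   # insert the 14 missing bits
            matches = sum(
                1 for c1, c2 in ciphertext_pairs
                if decrypt(c1, key) ^ decrypt(c2, key) == plaintext_xor
            )
            return np_pairs - matches                     # global minimum 0 at the true key
        return fitness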

3 Preliminary Results and Discussion

In a recent work [6], we have studied the performance of the PSO method formulated for Integer Programming [7], in addressing the prescribed problem of cryptanalysis. Some indicative results of this study using 20 ciphertext pairs, for both global (PSOCG) and local (PSOCL) neighborhood

Table 1: Results for six different keys using np = 20 ciphertext pairs.

  key   Method   Suc. Rate   Function Evaluations (min)   Function Evaluations (mean)
  k1    PSOCG     98%        200                          1146
  k1    PSOCL    100%        200                          2020
  k2    PSOCG     99%        200                           854
  k2    PSOCL    100%        200                          2079
  k3    PSOCG     97%        200                          1542
  k3    PSOCL    100%        200                          2300
  k4    PSOCG     97%        200                          1698
  k4    PSOCL    100%        300                          1884
  k5    PSOCG     93%        200                          1870
  k5    PSOCL    100%        300                          1788
  k6    PSOCG    100%        200                           740
  k6    PSOCL    100%        200                          1717


variants of the PSO method with constriction factor and population size of 100 particles, are reported in Table 1. The notation k_i, for i = 1, ..., 6, stands for the six different keys that were used in the experiments and, as a measure of the performance of the proposed approach, the number of function evaluations required by the method to locate the global minimum was counted. Each function evaluation corresponds to the decryption of all 20 ciphertext pairs, using a particular key. The success rate of each algorithm, that is, the proportion of runs in which it achieved the global minimizer within a prespecified threshold, is also reported. Our first results were encouraging, since the applied method was able to locate the 14 missing bits in an average of 1500 function evaluations, as opposed to the 2^14 required by brute force. Furthermore, the considered methods are simple and can be readily adapted to handle more complex Feistel-based cryptosystems. Thus, in this contribution, we extend our study providing more detailed results and interesting conclusions for the cryptanalysis of Feistel-based cryptosystems using Evolutionary Computation methods.

Acknowledgment We acknowledge the partial support by the "Archimedes" research programme awarded by the Greek Ministry of Education and Religious Affairs and the European Union.

References [1] Biham, E. and Shamir, A., Differential cryptanalysis of DES-like cryptosystems, Journal of Cryptology, 1991. [2] Biham, E. and Shamir, A., Differential Cryptanalysis of the Data Encryption Standard, Springer-Verlag, 1993. [3] Feistel, H., Cryptography and computer privacy, Scientific American, 1973. [4] Fogel, D.B., Evolutionary Computation: Towards a New Philosophy of Machine Intelligence, IEEE Press, Piscataway, NJ, 1995. [5] Kennedy, J. and Eberhart, R.C., Swarm Intelligence, Morgan Kaufmann Publishers, 2001. [6] Laskari, E.C., Meletiou, G.C., Stamatiou, Y.C., and Vrahatis, M.N., Evolutionary computation based cryptanalysis: A first study, In Proceedings of the World Congress of Nonlinear Analysts WCNA-2004, in press. [7] Laskari, E.C., Parsopoulos, K.E., and Vrahatis, M.N., Particle swarm optimization for integer programming, In Proceedings of the IEEE 2002 Congress on Evolutionary Computation, pp. 1576-1581, Hawaii, HI, 2002, IEEE Press. [8] Matsui, M., Linear cryptanalysis method for DES cipher, Lecture Notes in Computer Science, 765, pp. 386-397, 1994. [9] Matsui, M. and Yamagishi, A., A new method for known plaintext attack of FEAL cipher, Lecture Notes in Computer Science, pp. 81-91, 1992. [10] Menezes, H., van Oorschot, P., and Vanstone, S., Handbook of applied cryptography, CRC Press series on discrete mathematics and its applications, CRC Press, 1996. [11] Schwefel, H.-P., Evolution and Optimum Seeking, Wiley, New York, 1995. [12] Stinson, D., Cryptography: Theory and Practice (Discrete Mathematics and Its Applications), CRC Press, 1995.

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 868-873


UPSO: A Unified Particle Swarm Optimization Scheme K.E. Parsopoulos 1 and M.N. Vrahatis 2 Department of Mathematics, University of Patras Artificial Intelligence Research Center (UPAIRC), University of Patras, GR-26110 Patras, Greece Received 31 July, 2004; accepted in revised form 5 September, 2004 Abstract: We introduce Unified Particle Swarm Optimization, a new scheme that harnesses the local and global variants of the standard Particle Swarm Optimization algorithm, combining their exploration and exploitation abilities. Convergence in probability can be proved for the new approach in unimodal cases and preliminary results justify its superiority against the standard Particle Swarm Optimization. Keywords: Optimization, Particle Swarm Optimization, Stochastic Algorithms, Swarm Intelligence Mathematics Subject Classification: 90C26, 90C30, 90C59

1

Introduction

Particle Swarm Optimization (PSO) is a stochastic, population-based optimization method. Upto-date it has been applied successfully on a plethora of test problems in diverse scientific fields [1, 5, 6, 7]. Its efficiency can be attributed to the information exchange among the search points that constitute the population. There are two main variants of PSO with respect to the information exchange scheme that is used, each with different exploration and exploitation characteristics. Practitioners usually select the most proper variant based on their experience as well as on the special characteristics of the problem at hand. Unified Particle Swarm Optimization is a new scheme that harnesses the two variants of PSO in a unified scheme that combines their exploration and exploitation capabilities. Under assumptions, convergence in probability can be proved for the new approach. Preliminary results on a widely used set of benchmark problems are indicative of the new scheme's efficiency.

2

Unified Particle Swarm Optimization

Emergent behavior in socially organized colonies constituted a great source of inspiration for computer scientists. Ant colonies, bird flocks and fish schools that could tackle efficiently combinatorial and numerical optimization problems were modeled and applied successfully on numerous benchmark and real-life problems, giving rise to the class of swarm intelligence algorithms [3]. PSO is a swarm intelligence optimization algorithm developed by Eberhart and Kennedy [3]. It employs a population called a swarm, 𝒮 = {x_1, ..., x_N}, of search points called particles, x_i = (x_{i1}, x_{i2}, ..., x_{in})^T, i = 1, ..., N, which probe the search space, S ⊂ ℝ^n, simultaneously. The



algorithm works iteratively. Each particle is initialized to a random position in the search space. Then, at each iteration, each particle moves with an adaptable velocity, v_i = (v_{i1}, v_{i2}, ..., v_{in})^T, while retaining in a memory the best position, p_i = (p_{i1}, p_{i2}, ..., p_{in})^T ∈ S, it has ever visited in the search space. In minimization problems, best positions have lower function values. The particle's movement is also influenced by the experience of the rest of the particles, i.e., by their best positions. This is performed through the concept of neighborhood. More specifically, each particle is assigned a neighborhood which consists of some prespecified particles. Then, the particles that comprise the neighborhood share their experience by exchanging information. There are two main variants of PSO with respect to the number of particles that comprise the neighborhoods. In the global variant, the whole swarm is considered as the neighborhood of each particle, while, in the local variant, smaller neighborhoods are used. Neighboring particles are determined based on their indices rather than their actual distance in the search space. Let g_i be the index of the best particle in the neighborhood of x_i, i.e., the index of the particle that attained the best position among all the particles of the neighborhood. The particles are considered in a ring topology. Thus, their indices are considered in a cyclic order, i.e., 1 is the index that follows after N. At each iteration, the swarm is updated according to the equations [1, 7],

v_i^{(k+1)} = χ [ v_i^{(k)} + c_1 r_1 (p_i^{(k)} − x_i^{(k)}) + c_2 r_2 (p_{g_i}^{(k)} − x_i^{(k)}) ],   (1)

x_i^{(k+1)} = x_i^{(k)} + v_i^{(k+1)},   (2)

where i = 1, ..., N; k is the iterations' counter; χ is a parameter called constriction factor that controls the velocity's magnitude; c_1 and c_2 are positive acceleration parameters, called cognitive and social parameter, respectively; and r_1, r_2 are random vectors that consist of random values uniformly distributed in [0, 1]. All vector operations in Eqs. (1) and (2) are performed componentwise. A stability analysis of PSO, as well as recommendations regarding the selection of its parameters, are provided in [1, 7]. The performance of a population-based algorithm depends on its ability to perform global search of the search space (exploration) as well as more refined local search (exploitation). Proper balance between these two characteristics results in enhanced performance. In the global variant of PSO, all particles are attracted by the same overall best position, converging faster toward specific points. Thus, it has better exploitation abilities. On the other hand, in the local variant, the information of the best position of each neighborhood is communicated slowly to the other particles of the swarm through their neighbors. Therefore, the attraction to specific best positions is weaker, hindering the swarm from getting trapped in locally optimal solutions. Thus, the local variant of PSO has better exploration ability. Proper selection of the neighborhood's size affects the trade-off between exploration and exploitation. The selection of the most proper neighborhood size is an open problem. In practice, it is up to the practitioner and it is based solely on his experience. Unified Particle Swarm Optimization (UPSO) is a new scheme that harnesses the global and the local variant of PSO, thereby combining their exploration and exploitation capabilities. Let G_i^{(k+1)} denote the velocity update of the ith particle, x_i, in the global PSO variant, while L_i^{(k+1)} denotes the corresponding velocity update for the local variant. Then, according to Eq. (1),



G_i^{(k+1)} = χ [ v_i^{(k)} + c_1 r_1 (p_i^{(k)} − x_i^{(k)}) + c_2 r_2 (p_g^{(k)} − x_i^{(k)}) ],   (3)

L_i^{(k+1)} = χ [ v_i^{(k)} + c_1 r_1' (p_i^{(k)} − x_i^{(k)}) + c_2 r_2' (p_{g_i}^{(k)} − x_i^{(k)}) ],   (4)

where k denotes the iteration number; g is the index of the best particle of the whole swarm (global variant); and g_i is the index of the best particle in the neighborhood of x_i (local variant). These


two search directions can be combined in a single equation, resulting in the main UPSO scheme,

U_i^{(k+1)} = u G_i^{(k+1)} + (1 − u) L_i^{(k+1)},   (5)

x_i^{(k+1)} = x_i^{(k)} + U_i^{(k+1)},   (6)

where u ∈ [0, 1] is a parameter called the unification factor, which determines the influence of the global and local components in Eq. (5). For u = 1, Eq. (5) is equivalent to the global PSO variant, while for u = 0 it is equivalent to the local PSO variant. For all intermediate values, u ∈ (0, 1), we obtain composite variants of PSO that combine the exploration and exploitation characteristics of its global and local variants. UPSO can be further enhanced by incorporating a stochastic parameter in Eq. (5) that imitates the mutation of evolutionary algorithms; however, it is directed toward a direction which is consistent with the PSO dynamics. Thus, Eq. (5) can be written as

U_i^{(k+1)} = r_3 u G_i^{(k+1)} + (1 − u) L_i^{(k+1)},   (7)

which is mostly based on the local variant or, alternatively,

U_i^{(k+1)} = u G_i^{(k+1)} + r_3 (1 − u) L_i^{(k+1)},   (8)

which is mostly based on the global variant, where r_3 ∼ N(μ, σ²I) is a normally distributed parameter, and I is the identity matrix. A proof of convergence in probability can be given for the schemes of Eqs. (7) and (8). The proof follows the analysis of Matyas [4] for stochastic optimization algorithms. Assume that F : S → ℝ is a unimodal objective function, x_opt is its unique minimizer in S, and F_opt = F(x_opt). Also, let x_i^{(k)} be the ith particle of the swarm and p_i^{(k)} be its best position in the kth iteration. The proof does not take into consideration the index i, therefore we will refer to them as x^{(k)} and p^{(k)}, respectively. The level set of F at a constant value, K, is defined as G[K] = {x : F(x) < K}. We assume that G[K] ≠ ∅, for all K > F_opt. Let A^{(k+1)} = (1 − u) L^{(k+1)}, B^{(k+1)} = u G^{(k+1)}, and f(z) be the probability distribution of r_3 B^{(k)}, with r_3 ∼ N(μ, σ²I). The choice of the Normal distribution as the probability distribution of r_3 guarantees that f(z) ≠ 0 for all z, although the proof holds for any choice of probability distribution for r_3, as long as this relation holds. We define as a successful step of UPSO at iteration k the event that F(x^{(k+1)}) < F(p^{(k)}) − ε, for a prescribed ε > 0. The probability of a successful step from x^{(k)} is given by

P_F(x) = ∫_{G[F(p) − ε]} f(z − x) dz.

Then, based on the analysis of Matyas [4], the following theorem is straightforwardly proved:

Theorem 1. Let F(x) have a unique minimum in S, G[K] ≠ ∅ for all K > F_opt, and f(z) ≠ 0 for all z. Then, at least one sub-sequence of best positions, {p^{(k)}}, of any particle, x, of the swarm in UPSO tends in probability to x_opt.

Proof. Let δ(x) = {z : ρ(z, x) < δ}, δ > 0, be the δ-neighborhood of a point x. We will prove that for any δ > 0 it holds that

lim_{k→∞} P { ρ(p^{(k)}, x_opt) > δ } = lim_{k→∞} P { p^{(k)} ∉ δ(x_opt) } = 0,

i.e., the probability that the distance ρ(p^{(k)}, x_opt) > δ, or equivalently that p^{(k)} ∉ δ(x_opt), tends to zero. If we denote by F_0 the minimum value of F on the boundary of δ(x_opt), we shall have


F_0 > F_opt. We can now define ε = ε(δ) such that 0 < ε(δ) < F_0 − F_opt. For all previous best positions, p ∉ δ(x_opt), of the particle under consideration, the inequality F(p) − ε > F_opt is valid. Furthermore, from the assumptions of the theorem, G[F(p) − ε] is a non-empty region. Since f(z) > 0 for all z, there will exist an a > 0 such that P_F(x) ≥ a, i.e., the probability of a successful step from x is positive (although in some cases it may become very small). Let F(x^{(1)}) = F(p^{(1)}) be the initial function value of x and p, respectively (recall that the initial position and the initial best position of a particle coincide). We denote r = (F(p^{(1)}) − F_0)/ε, and m = ⌊r⌋, i.e., m is the largest integer less than r. From the design of the PSO and UPSO algorithm, if m + 1 steps turn out to be successful, then all the subsequent points of the sequence {p^{(k)}} lie in δ(x_opt). Consequently, the probability P {p^{(k)} ∉ δ(x_opt)} is less than or equal to the probability that the number of successful steps does not exceed m, i.e.,

P { p^{(k)} ∉ δ(x_opt) } ≤ P { Σ_{i=1}^{k} y^{(i)} ≤ m },

where y^{(i)} = 1 if there was a successful step in iteration i, and y^{(i)} = 0 otherwise. The latter probability increases with a decrease in the probability of successful steps and, since P_F(x) ≥ a, it obeys the well-known Newton's theorem (on the binomial probability distribution),

P { Σ_{i=1}^{k} y^{(i)} ≤ m } ≤ Σ_{i=0}^{m} \binom{k}{i} a^i (1 − a)^{k−i},

where k is the number of steps (iterations) taken. Further, when k > 2m and a < 0.5,

Σ_{i=0}^{m} \binom{k}{i} a^i (1 − a)^{k−i} < (m + 1) \binom{k}{m} (1 − a)^k = ((m + 1)/m!) k(k − 1)(k − 2) ··· (k − m + 1)(1 − a)^k < ((m + 1)/m!) k^m (1 − a)^k.

Consequently, P { ρ(p^{(k)}, x_opt) > δ } < ((m + 1)/m!) k^m (1 − a)^k. Thus, for a > 0, it is clear that

lim_{k→∞} k^m (1 − a)^k = 0,

and the theorem is proved. ∎ We must note that the best position p of the particle x may be replaced by the overall best position of the whole swarm, with minor modifications in the proof.
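To make the scheme of Eqs. (1)-(8) concrete, a minimal sketch of a single UPSO update for one particle is given below. It is only illustrative: the swarm bookkeeping (memories, ring neighborhood) is assumed to be handled elsewhere, and the function name, parameter defaults and calling convention are placeholders rather than the authors' implementation.

import numpy as np

def upso_step(x, v, p, p_local_best, p_global_best, u=0.5, chi=0.729,
              c1=2.05, c2=2.05, mu=0.0, sigma=0.01, mutate=None):
    # One UPSO velocity/position update for a single particle (Eqs. (3)-(8)).
    # x, v           : current position and velocity
    # p              : the particle's own best position
    # p_local_best   : best position in the particle's ring neighborhood
    # p_global_best  : best position of the whole swarm
    # u              : unification factor in [0, 1]
    # mutate         : None for Eq. (5); "local" for Eq. (7); "global" for Eq. (8)
    n = x.shape[0]
    r1 = np.random.uniform(0.0, 1.0, n)
    r2 = np.random.uniform(0.0, 1.0, n)
    r1p = np.random.uniform(0.0, 1.0, n)
    r2p = np.random.uniform(0.0, 1.0, n)
    G = chi * (v + c1 * r1 * (p - x) + c2 * r2 * (p_global_best - x))    # Eq. (3)
    L = chi * (v + c1 * r1p * (p - x) + c2 * r2p * (p_local_best - x))   # Eq. (4)
    if mutate == "local":                 # Eq. (7): stochastic term on the global part
        r3 = np.random.normal(mu, sigma, n)
        U = r3 * u * G + (1.0 - u) * L
    elif mutate == "global":              # Eq. (8): stochastic term on the local part
        r3 = np.random.normal(mu, sigma, n)
        U = u * G + r3 * (1.0 - u) * L
    else:                                 # Eq. (5)
        U = u * G + (1.0 - u) * L
    return x + U, U                       # Eq. (6): new position and new velocity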

3

Experimental Results and Discussion

The performance of UPSO was investigated on the test set used by Trelea in [7], which consists of the Sphere, Rosenbrock, Rastrigin, and Griewank functions in 30 dimensions, as well as Schaffer's function in 2 dimensions. The test functions are denoted as F1, F2, F3, F4, and F5, respectively. For comparison purposes, the PSO configuration reported in [7] was also adopted here. Specifically, two sets of parameters were used, denoted as Set 1 and Set 2, respectively. Set 1 consists of χ = 0.6 and c_1 = c_2 = 2.833, which is the equivalent of the set a = 0.6 and b = 1.7 in [7], and Set 2 consists of χ = 0.729 and c_1 = c_2 = 2.05, which is the equivalent of the set a = 0.729 and b = 1.494 in [7]. The maximum number of iterations was 10000. The swarm was initialized in the range [−100, 100]^30 for the Sphere function, [−30, 30]^30 for the Rosenbrock function, [−5.12, 5.12]^30 for the Rastrigin function, [−600, 600]^30 for the Griewank function, and [−100, 100]^2 for Schaffer's


Table 1: Results for UPSO. Success rates (in parenthesis) and expected number of function evaluations are reported. Success Rates & Expected Number of Function Evaluations 'P = 0.1 Jl.=O J1.=1 Jl.=O J1.=1 Set 1 Set 1 Set 2 Set 1 Set 2 Set 1 Set 2 Set 2 F1 (Sphere). Best value reported in [2, 7] is (1.00)10320 (1.00)3981 (1.00)5401 (1.00)3078 (1.00)3754 (1.00)5268 (1.00)2892 (1.00)3658 (1.00)3876 (1.00)6588 (1.00)6924 (1.00)6852 (1.00)8807 (1.00)6278 (1.00)7011 (1.00)6579 (1.00)8427 (1.00)12174 ( 1.00) 14679 (1.00)11370 (1.00)14238 ( 1.00) 12771 ( 1.00) 14640 (1.00)11433 (1.00)14202 F2 (Rosenbrock). Best value reported in [2, 7] is (0.50)15930 (1.00)2138 (1.00)2413 (1.00)4950 (0.95)10451 (1.00)2193 (1.00)2570 (1.00)3656 ( 1.00) 10333 (1.00)4385 (1.00)4835 (1.00)6765 (1.00)17714 (1.00)4223 (1.00)5330 (1.00)8601 (1.00)11466 (1.00)8505 (1.00)9936 (1.00)12186 (1.00)16626 (1.00)8040 (1.00)9765 ( 1.00) 15450 ( 1.00) 14688 F3 (Rastrigin). Best value reported in [2, 7] is (0.90)4667 (1.00)4373 (1.00)11315 (0.90) 16541 (1.00)8412 (1.00)3820 (1.00)2889 (1.00)6846 (1.00)7506 (1.00)8766 (1.00)7233 (1.00)8744 (1.00)11673 (1.00)5805 (1.00)9294 (1.00)8196 (1.00)10256 (1.00)11475 ( 1.00) 11640 (1.00)15882 (1.00)19437 (1.00)8433 (1.00)10293 (1.00)14013 (1.00)17934 F4 (Griewank). Best value reported in [2, 7] is (1.00)9390 (1.00)3161 (1.00)3275 (1.00)4157 (1.00)5473 (1.00)3312 (1.00)3620 (1.00)4343 (1.00)5383 (1.00)6507 (1.00)7229 (1.00)6914 (1.00)8114 (1.00)6516 (1.00)6009 (1.00)6870 (1.00)8300 (1.00)11844 (1.00)14019 ( 1.00) 10725 (1.00)13230 (1.00)12150 (1.00)14640 (1.00)10839 (1.00)13458 Fs (Schaffer). Best value reported in [2, 7] is (0.75)6440 (0.90)24153 (0.85)20927 (0.95)21922 (0.95)37092 (1.00)37790 (1.00)18591 ( 1.00)36664 (0.95)19234 (1.00)16716 (1.00)19437 ( 1.00)22254 (1.00)36020 ( 1.00)25433 ( 1.00)26211 (1.00)29874 (1.00)22680 (1.00)25425 (1.00)24072 (1.00)21531 (1.00)19203 (1.00)33411 (1.00)20115 (1.00)21456 (1.00)22764 'P = 0.9

N

15 30 60 15 30 60 15 30 60 15 30 60 15 30 60

function, while the corresponding error goals were 0.01, 100, 100, 0.1, and 10^{-5} [7]. For each function, 20 experiments were performed, using three different swarm sizes, 15, 30, and 60. The particles were allowed to move anywhere in the search space without constraints on their velocity. Regarding the configuration of UPSO, two cases were investigated, namely the case of Eq. (8) with unification factor u = 0.9, which is closer to the global PSO variant used in [7], and Eq. (7) with u = 0.1, which is closer to the local PSO variant. For both cases, two different configurations of the distribution of r_3 were considered, namely, one with mean value μ = 0, and one with mean value μ = 1. The standard deviation of r_3 was σ = 0.01 in all cases, to avoid wide deterioration of the PSO dynamics. The neighborhood radius for the determination of the local PSO search direction, L_i, in Eqs. (7) and (8), was 1, i.e., the neighbors of the ith particle, x_i, were the particles x_{i−1} and x_{i+1}. This selection was made in order to take full advantage of the properties of the local variant, since the larger the neighborhoods the closer the local variant is to the global one. For each parameter configuration and test function, the success rate of UPSO, namely the fraction of the experiments in which the error goal was achieved, as well as the corresponding expected number of function evaluations, defined as (Number of Particles) × (Average Number of Iterations)/(Success Rate) [7], were recorded and they are reported in Table 1. The results were very promising. The success rate never fell under 0.90 (i.e., 90%), while (in most cases) the expected number of function evaluations was smaller than the values reported in [7] and [2]. UPSO achieved success rates of 100% even in cases where the plain PSO had very low success rates. For example, in the cases of F4 and F5 and a swarm of 15 particles, the plain PSO with the parameters of Set 1 had a success rate of just 0.35 and 0.45, respectively, as reported in [7], while UPSO's success rate was higher than 0.90 for any set of parameters. Furthermore, the success rate and the lowest expected number of function evaluations for the test functions F1-F5, as reported in [7], were (1.00)10320, (0.50)15930, (0.90)4667, (1.00)9390, (0.75)6440, respectively. In UPSO, the


corresponding numbers were (1.00)2892, (1.00)2138, (1.00)2889, (1.00)3161, and (1.00)16716 (for each case the lowest expected number of function evaluations is bold faced in the table). We must notice that in the case of F5, the number 6440 reported in [7] corresponds to a success rate equal to 0.75, while the number 16716 of UPSO corresponds to a success rate equal to 1.00. Regarding the different configurations of UPSO, the version of Eq. (7) with u = 0.1 and μ = 0 had the better overall performance, probably due to the better exploration ability of the local PSO variant, which is favored in this scheme. Moreover, the Set 1 proposed by Trelea in [7] outperformed the (most popular) Set 2. Also, UPSO with μ = 0 outperformed that with μ = 1 in most problems. Further experiments were performed using the plain UPSO scheme of Eq. (5), both in static and dynamic optimization problems, revealing that the values u = 0.5 and u = 0.2 result in enhanced performance of the algorithm. The mathematical properties behind this effect are still under investigation, along with possible correlations between the UPSO scheme and PSO variants with adaptive neighborhood size.

4

Conclusions

A Unified Particle Swarm Optimization (UPSO) that aggregates the local and the global variant of PSO in a unified scheme has been introduced. The proposed approach seems to exploit the good properties of both variants and preliminary experiments on the test set used in [7] justify its efficiency. Further investigation is required to analyze the dynamics of UPSO. A self-adaptive scheme that will exploit knowledge of the characteristics of the objective function, as well as the performance of the algorithm, to control the unification factor is currently under development, along with an analysis of its convergence rates.

Acknowledgment The authors wish to thank the anonymous referees, as well as Dr. M. Clerc and Dr. I.C. Trelea for their careful reading of the manuscript and their fruitful comments and suggestions.

References [1] Clerc, M. and Kennedy, J., The particle swarm - explosion, stability, and convergence in a multidimensional complex space, IEEE Transactions on Evolutionary Computation, 6(1), pp. 58-73, 2002. [2] Eberhart, R.C. and Shi, Y., Comparing inertia weights and constriction factors in particle swarm optimization, In Proc. 2000 IEEE CEC, pp. 84-88, Piscataway, NJ, 2000, IEEE Service Center. [3] Kennedy, J. and Eberhart, R.C., Swarm Intelligence, Morgan Kaufmann Publishers, 2001. [4] Matyas, J., Random optimization, Automatization and Remote Control, 26, pp. 244-251, 1965. [5] Parsopoulos, K.E. and Vrahatis, M.N., Recent approaches to global optimization problems through particle swarm optimization, Natural Computing, 1(2-3), pp. 235-306, 2002. [6] Parsopoulos, K.E. and Vrahatis, M.N., On the computation of all global minimizers through particle swarm optimization, IEEE Transactions on Evolutionary Computation, 8(3), pp. 211-224, 2004. [7] Trelea, I.C., The particle swarm optimization algorithm: Convergence analysis and parameter selection, Information Processing Letters, 85, pp. 317-325, 2003.

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 874-879


Nonlinear Data Fitting for Landslides Modeling K.E. Parsopoulos*1, V.A. Kontogianni†2, S.I. Pytharouli†3, P.A. Psimoulis†4, S.C. Stiros†5 and M.N. Vrahatis*6 *Department of Mathematics, University of Patras Artificial Intelligence Research Center (UPAIRC), University of Patras, GR-26110 Patras, Greece †Department of Civil Engineering, University of Patras, GR-26500 Patras, Greece Received 31 July, 2004; accepted in revised form 5 September, 2004 Abstract: We consider data fitting schemes that are based on different norms to determine the parameters of curve-models that model landslides in dams. The Particle Swarm Optimization method is employed to minimize the corresponding error norms. The method is applied on real-world data with promising results. Keywords: Curve Fitting, Optimization, Particle Swarm Optimization, Swarm Intelligence, Landslides Modeling

Mathematics Subject Classification: 65D10, 90C26

1

Introduction

A common problem in physics, earth sciences and engineering is the optimal fitting of a curve to a set of observations of certain parameters versus time (or space). The observations are usually contaminated by various types of errors. The usual procedure that is followed to solve such problems is to test various empirically selected model-curves and estimate the parameters of the curve that minimize the difference between the values obtained through the model and the observed ones. Traditionally, this task is accomplished using the well-known Least Squares Method (LSQR). More specifically, linear or linearized equations are used and the sum of squares of differences among observations and the corresponding model-curve values is minimized. Therefore, the practitioner has to decide only regarding the most appropriate curve-model (e.g. polynomial, periodic, exponential, mixed, etc.) such that an acceptable fit is obtained. In some cases, however, the available data are noisy, unevenly distributed versus time, there is no a priori knowledge of the variance-covariance matrix, or they do not correspond to rather smooth curves (for instance they include offsets, a usual case in tectonic and geotechnical studies [4]). In such cases, the LSQR approach may not be successful, resulting in complex curve-models that lack physical significance and the ability to be incorporated into further modeling and analysis. In such cases, the use of different data fitting approaches has proved very useful [1].



Figure 1: The record of observations for the Polyfyto Dam (distance changes of points B1-B9 and Q12 versus time, 1977-2001).

Evolutionary and Swarm Intelligence algorithms have been successfully applied on several data fitting problems [5, 6]. Their ability to work using solely function values even for discontinuous and non-differentiable functions renders them a promising alternative in cases where traditional algorithms, such as LSQR, fail. The aim of this paper is to investigate alternative curve fitting techniques based on the Particle Swarm Optimization (PSO) algorithm and three different norms to cope with a real-life curve fitting problem from the field of Civil Engineering. Results are reported and discussed. Section 2 is devoted to the description of the problem, while the employed optimization algorithm, PSO, is briefly described in Section 3. Experimental results are reported and discussed in Section 4.

2

Description of the Curve Fitting Problem and Models

The problem investigated here is the monitoring of a landslide at the Polyfyto Dam on the Aliakmonas river in northern Greece. A record of observations has been collected in collaboration with the Greek Public Power Corporation S.A. The record consists of a large number of observations of distance changes obtained by monitoring 7 control points, denoted as B1-B7, on the landslide relative to a stable reference station on stable ground, over a period of 20 years. The record is depicted in Figure 1, along with observations for 3 auxiliary points, B8, B9 and Q12. As we can see, the control point B2 exhibited the largest displacement. The first step in the analysis of the landslide is the determination of a mathematical model, which captures the pattern of the landslide movement and can be used to estimate its future trends [7]. For this purpose, the movement of each control point was individually investigated. In Figure 1 it is clear that almost all points are moving faster in early years, while their movement tends to be stabilized in late years. This effect can be described using different mathematical models, although just a few models retain the physical meaning of the specific phenomenon. The simplest model that could be used is a polynomial of degree four. However, it exhibits some upward


and downward branches that do not fit the observations, and for this purpose, two types of an exponential decay model were adopted,

Model 1:  f(t) = A (1 − exp(−t/B)) + C,   (1)
Model 2:  f(t) = A (1 − exp(−t/B)) + Kt + C.   (2)

The next step in the analysis is the determination of the unknown parameters A, B, C and K, such that the error among the observations and the corresponding values provided by the model is minimized. For the error measurement, several norms can be used. The most common choices are the ℓ1, ℓ2 and ℓ∞ norms, which are defined as

‖e‖_1 = Σ_{i=1}^{m} |e_i|,   ‖e‖_2 = ( Σ_{i=1}^{m} e_i^2 )^{1/2},   ‖e‖_∞ = max_{1 ≤ i ≤ m} |e_i|,

respectively, where m is the number of observations and e_i = M_i − O_i, i = 1, ..., m, with O_i being the ith observed value and M_i the corresponding value implied by the model. The ℓ1-norm is the most "fair" norm since it uses the absolute values of the errors. However, it results in non-differentiable minimization problems, therefore, it cannot be used with traditional gradient-based minimizers. On the other hand, the ℓ2-norm results in differentiable minimization problems but the assumed error values are not always consistent with the actual ones. For example, an absolute error value equal to 10^{-3} becomes 10^{-6}, while an absolute error equal to 10^2 becomes 10^4. The ℓ∞-norm constitutes the most proper choice in cases where outliers that must be taken seriously into consideration appear in the set of observations, since it minimizes the maximum among all absolute errors. The performance of LSQR for the determination of the unknown parameters A, B, C and K is rather poor, with the deviation being larger at the edge of the curve where indeed a good fitting is sought. This happens due to the ℓ2-norm, on which LSQR is based. Thus, alternative fitting techniques that use different norms are of great interest in order to provide more reliable results.
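For illustration, the three error norms for the exponential-decay models of Eqs. (1)-(2) can be computed as in the sketch below; the observation arrays and parameter values are placeholders, not the Polyfyto record.

import numpy as np

def model2(t, A, B, C, K=0.0):
    # Eq. (2); setting K = 0 recovers Model 1 of Eq. (1).
    return A * (1.0 - np.exp(-t / B)) + K * t + C

def error_norms(t, obs, A, B, C, K=0.0):
    # Return the l1, l2 and l_infinity norms of e_i = M_i - O_i.
    e = model2(t, A, B, C, K) - obs
    return np.sum(np.abs(e)), np.sqrt(np.sum(e ** 2)), np.max(np.abs(e))

# illustrative call with made-up observations
t = np.linspace(0.0, 8000.0, 50)
obs = model2(t, 1600.0, 1400.0, 180.0, 0.1) + np.random.normal(0.0, 5.0, t.size)
print(error_norms(t, obs, 1600.0, 1400.0, 180.0, 0.1))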

3

Particle Swarm Optimization

Particle Swarm Optimization (PSO) is a swarm intelligence optimization algorithm developed by Eberhart and Kennedy [3]. It employs a population, called a swarm, 𝒮 = {x_1, ..., x_N}, of search points, called particles, x_i = (x_{i1}, x_{i2}, ..., x_{in})^T, i = 1, ..., N, which probe the search space, S ⊂ ℝ^n, simultaneously. The algorithm works iteratively. Each particle is initialized to a random position in the search space. Then, at each iteration, each particle moves with an adaptable velocity, v_i = (v_{i1}, v_{i2}, ..., v_{in})^T, while retaining in a memory the best position, p_i = (p_{i1}, p_{i2}, ..., p_{in})^T ∈ S, it has ever visited in the search space. In minimization problems, best positions have lower function values. The particle's movement is also influenced by the experience of the rest of the particles, i.e., by their best positions. This is performed through the concept of neighborhood. More specifically, each particle is assigned a neighborhood which consists of some prespecified particles. Then, the particles that comprise the neighborhood share their experience by exchanging information. There are two main variants of PSO with respect to the number of particles that comprise the neighborhoods. In the global variant, the whole swarm is considered as the neighborhood of each particle, while, in the local variant, smaller neighborhoods are used. Neighboring particles are determined based on their indices rather than their actual distance in the search space [6]. Let g_i be the index of the best particle in the neighborhood of x_i, i.e., the index of the particle that attained the best position among all the particles of the neighborhood. The indices of the particles are considered in a cyclic order, i.e., 1 is the index that follows after N. At each iteration,


Table 1: Computed solutions for the two models.

Model     Norm   A         B         C        K
Model 1   ℓ1     2377.10   2447.27   236.68   -
Model 1   ℓ2     2391.74   2468.11   246.08   -
Model 1   ℓ∞     2397.33   2588.62   283.42   -
Model 2   ℓ1     1626.50   1431.98   178.76   0.106
Model 2   ℓ2     1624.57   1464.39   187.87   0.106
Model 2   ℓ∞     1686.59   1582.49   202.47   0.096

the swarm is updated according to the equations [2, 8],

v_i^{(k+1)} = χ [ v_i^{(k)} + c_1 r_1 (p_i^{(k)} − x_i^{(k)}) + c_2 r_2 (p_{g_i}^{(k)} − x_i^{(k)}) ],   (3)

x_i^{(k+1)} = x_i^{(k)} + v_i^{(k+1)},   (4)

where i = 1, ..., N; k is the iterations' counter; χ is a parameter called constriction factor that controls the velocity's magnitude; c_1 and c_2 are positive acceleration parameters, called cognitive and social parameter, respectively; and r_1, r_2 are random vectors that consist of random values uniformly distributed in [0, 1]. All vector operations in Eqs. (3) and (4) are performed componentwise. A stability analysis of PSO, as well as recommendations regarding the selection of its parameters, are provided in [2, 8]. PSO has been applied on ℓ1-norm errors-in-variables data fitting problems with very promising results, exhibiting superior performance even over the well-known Trust Region methods [5]. Therefore it was selected for the error minimization in our problem using the ℓ1, ℓ2 and ℓ∞ norms.
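For illustration, a compact global-variant PSO with constriction factor (Eqs. (3)-(4)) applied to such an error function might look like the following sketch; the function names, parameter values, bounds and objective are placeholders, not the configuration used in this study.

import numpy as np

def pso_minimize(f, bounds, swarm=60, iters=5000, chi=0.729, c1=2.05, c2=2.05, seed=0):
    # Plain global-best PSO with constriction factor; f maps a parameter vector to an error value.
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    x = rng.uniform(lo, hi, size=(swarm, lo.size))
    v = np.zeros_like(x)
    p = x.copy()                                   # personal best positions
    pf = np.apply_along_axis(f, 1, x)              # personal best values
    for _ in range(iters):
        g = p[np.argmin(pf)]                       # overall best position
        r1 = rng.uniform(size=x.shape)
        r2 = rng.uniform(size=x.shape)
        v = chi * (v + c1 * r1 * (p - x) + c2 * r2 * (g - x))
        x = x + v
        fx = np.apply_along_axis(f, 1, x)
        improved = fx < pf
        p[improved], pf[improved] = x[improved], fx[improved]
    return p[np.argmin(pf)], pf.min()

# hypothetical l1 objective for Model 2 (t and obs would hold the observation record);
# bounds passed to pso_minimize should keep B away from zero.
def l1_error(q, t, obs):
    A, B, C, K = q
    return np.sum(np.abs(A * (1.0 - np.exp(-t / B)) + K * t + C - obs))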

4

Results and Discussion

The PSO algorithm was used for the determination of the parameters of the two models defined in Eqs. (1) and (2), minimizing the error defined through the ℓ1, ℓ2 and ℓ∞ norms, which will be denoted as L1, L2 and L3, respectively. We concentrated on the case of the control point B2, which had the largest displacement in our set of observations. The data set for B2 consisted of 404 observations. For the PSO, the default parameters, χ = 0.729 and c_1 = c_2 = 2.05, were used. The swarm size was equal to 60 for Model 1 and 80 for Model 2. The algorithm was let to run for 5000 iterations. We conducted 100 independent experiments for each model and norm. In all experiments, the same solutions (model parameters) were computed and they are reported in Table 1. The absolute error for each observation was also recorded for the detected model parameters. The mean value and the standard deviation of these absolute error values, as well as the typical error for a single observation,

σ_0 = ( Σ_{i=1}^{m} e_i^2 / (m − n) )^{1/2}, where m is the number of observations and n is the dimension of the problem, were computed for the three norms. For Model 1, the plot of the actual data along with the corresponding model values for each norm, a boxplot with the distribution of the absolute error for the 404 observations



Figure 2: Plot of the actual data along with the corresponding model values for each norm (left), boxplot with the distribution of the absolute error for all observations for the computed model parameters (center), and statistics of the absolute error (right) for Model 1. Labels L1, L2 and L3 correspond to the norms ℓ1, ℓ2 and ℓ∞, respectively.

JRn+l,

where f_{n+1} = y det J_{F_n}, and V_{n+1} is the direct product of the domain D_n with an arbitrary interval of the real y-axis containing the point y = 0. Then the zeros of the following system of equations:

f_i(x_1, x_2, ..., x_n) = 0,  i = 1, ..., n,
y det J_{F_n}(x_1, x_2, ..., x_n) = 0,   (4)

are the same as the zeros of F_n(x) = Θ_n provided that y = 0. Moreover, the determinant of the Jacobian matrix of (4) is equal to [det J_{F_n}(x)]^2, which is always nonnegative (positive at the simple zeros). Thus we may conclude:

Theorem [7]: The total number N_r of zeros of F_n(x) = Θ_n is given by

N_r = deg[F_{n+1}, V_{n+1}, Θ_{n+1}],   (5)

under the hypotheses that F_n is twice continuously differentiable and that all the zeros are simple and lie in the strict interior of V_{n+1}. Several methods for the computation of the topological degree have been proposed [5, 9, 8]. These methods are based on Stenger's method, which is an almost optimal complexity algorithm for some classes of functions [8].

2

Economic Geography

Lately the increasing interest in the field of economic geography has attracted many scientists from various disciplines ranging from economics to regional science and geography. There is no doubt that the building of the European Union and the several policy issues which come along have contributed to boost interest in the field. The New Economic Geography has emerged from the long-existing need to explain concentrations of economic activity and thus of people. The literature in the field provides a general equilibrium framework explaining the emergence of economic agglomerations as a trade-off between increasing returns at the firm level and transportation costs related to the shipment of goods. We consider a standard new economic geography model involving a finite number of regions (see [1]). This model can be viewed as the extension of Krugman's core-periphery model [3] to the case of a spatial economy consisting of N regions. Like in Krugman's original work, there are two sectors in the economy [3]. The agricultural sector employs farmers and produces a single homogeneous good under constant returns to scale. The manufacturing sector employs workers and produces a differentiated good, giving rise to manufacturing varieties. Consumers (workers and farmers) buy the agricultural good on a perfectly competitive national market and manufacturing varieties on monopolistically competitive regional markets. In addition, transporting manufacturing varieties from their production place to the place where they are consumed, is costly. Economic equilibria define economic allocations and prices derived from optimal behaviors of firms and consumers that are compatible with market clearing. On the one hand, short-run equilibria are obtained under the assumption of no spatial adjustment. These short-run equilibria are thus viewed as implicitly determined by some given spatial distribution of labor. On the other hand long-run equilibria refer to steady states of a spatial economy where workers are allowed to adjust their location over time. In the case of a spatial economy consisting of 2 regions, a short-run equilibrium has been shown to exist and to be unique (see [4]), and the number and stability of steady states have been studied (see [1]). However, in the case of 3 regions or more, no analytical result concerning short- or long-run equilibria has been derived so far.

3

Proposed Approach

In this paper we investigate short-run equilibria of the general N-regional model. Regions are denoted by i = 1, ..., N. Consider some spatial distribution of labor L_i across these regions. The proportion of the labor force in region i is denoted as

λ_i = L_i / Σ_{j=1}^{N} L_j.

The variables of the model are y_i, θ_i, and w_i, representing the income, the manufacturing price index, and the manufacturing wage in region i, respectively. The system of equations defining the short-run equilibria of the spatial economy can be written in the following reduced form:


y_i = μ λ_i w_i + (1 − μ)/N,

θ_i = { Σ_{j=1}^{N} λ_j w_j^{1−σ} exp[−τ(σ − 1) d(i,j)] }^{1/(1−σ)},

w_i = { Σ_{j=1}^{N} y_j θ_j^{σ−1} exp[−τ(σ − 1) d(i,j)] }^{1/σ},

where:

d(i,j) : distance between locations i and j,
σ : elasticity of substitution among manufacturing varieties,
μ : share of manufacturing expenditure,
τ : transportation cost per unit of distance for manufacturing goods.

In this contribution we propose to investigate the existence of a fixed point to this system of equations, to verify its uniqueness, and to apply efficient computational methods to determine it using tools from topological degree theory [5, 9, 8].
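As a rough numerical illustration of such a computation, successive substitution on the system above can be sketched as follows. This is not the method proposed in the paper: the parameter values, the uniform farmer distribution and the damping are assumptions made only to keep the example short.

import numpy as np

def short_run_equilibrium(lam, d, sigma=5.0, mu=0.4, tau=1.0, tol=1e-10, max_iter=10000):
    # Successive substitution on the income / price-index / wage system.
    # lam : labor shares lambda_i (summing to one); d : N x N distance matrix d(i, j).
    # The uniform farmer distribution (1 - mu)/N is an assumption of this sketch.
    N = lam.size
    phi = np.exp(-tau * (sigma - 1.0) * d)   # trade-friction factors exp[-tau(sigma-1)d(i,j)]
    w = np.ones(N)
    for _ in range(max_iter):
        y = mu * lam * w + (1.0 - mu) / N                                      # income
        theta = (phi @ (lam * w ** (1.0 - sigma))) ** (1.0 / (1.0 - sigma))    # price index
        w_new = (phi @ (y * theta ** (sigma - 1.0))) ** (1.0 / sigma)          # wage
        w_new = 0.5 * w + 0.5 * w_new          # damping improves the chance of convergence
        if np.max(np.abs(w_new - w)) < tol:
            return y, theta, w_new
        w = w_new
    return y, theta, w

# three regions on a line, with labor concentrated in the middle region
d = np.abs(np.subtract.outer(np.arange(3.0), np.arange(3.0)))
y, theta, w = short_run_equilibrium(np.array([0.2, 0.6, 0.2]), d)
print(w)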

References [1] Fujita, M., Krugman, P., and Venables, A., The spatial economy, cities, regions and international trade, MIT Press, 1999. [2] Hoenders, B.J. and Slump, C.H., On the calculation of the exact number of zeros of a set of equations, Computing, 30, 1983, pp.137-147. [3] Krugman, P., Increasing returns and economic geography, The Journal of Political Economy, 99, 3, (1991), pp.483-499. [4] Mossay, P., The core-periphery model: Existence of short-run equilibria, Technical Report, Universidad de Alicante, Spain, 2004. [5] Mourrain, B., Vrahatis, M.N., and Yakoubsohn, J.C., On the complexity of isolating real roots and computing with certainty the topological degree, J. Complexity, 18, 2, 2002, pp.612-640. [6] Ortega, J.M. and Rheinbolt, W.C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970. [7] Picard, E., Traite d'analyse, 3rd ed., chap. 4.7., Gauthier-Villars, Paris, 1922. [8] Sikorski, K., Optimal Solution of Nonlinear Equations, Oxford Press, 2000. [9] Stenger, F., Computing the topological degree of a mapping in lRn, Numer. Math., 25, 1975, pp.23-38.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 884-887

Unsupervised Clustering Using Semi-Algebraic Data Structures D.K. Tasoulis 1 and M.N. Vrahatis 2 Department of Mathematics, University of Patras Artificial Intelligence Research Center (UPAIRC), University of Patras, GR-26110 Patras, Greece Received 31 July, 2004; accepted in revised form 5 September, 2004 Abstract: A clustering algorithm named k-windows clustering algorithm has been recently proposed [14]. The k-windows algorithm attempts to enclose all the patterns that belong to a single cluster within a d-dimensional window. In this contribution we propose to modify this algorithm by using semi algebraic data structures instead of windows. Keywords: Unsupervised clustering, cluster analysis, range searching, semi-algebraic data structures.

Mathematics Subject Classification: 68P05, 68T10, 62H30.

1

Introduction

Clustering techniques were originally conceived by Aristotle and Theophrastos in the fourth century B.C. and in the 18th century by Linnaeus [8], but it was not until 1939 that one of the first comprehensive foundations of these methods was published [13]. Clustering can be defined as the process of partitioning a set of patterns into disjoint and homogeneous meaningful groups, called clusters. Clustering is fundamental in knowledge acquisition, and has been applied in numerous fields. A fundamental issue in cluster analysis, independent of the particular clustering technique applied, is the determination of the number of clusters present in a data set. This issue remains an unsolved problem in cluster analysis. For instance, well-known and widely used iterative techniques, such as the k-means algorithm [7], require the user to specify the number of clusters present in the data prior to the execution of the algorithm. Even the simplest clustering problems are known to be NP-Hard [1]. The Euclidean k-center problem in the plane is NP-Hard [9]. This problem can be defined as: given a set S of n points in a d-dimensional metric space (ℝ^d, ρ) and an integer k, compute a partition E of S into k subsets S_1, ..., S_k, such that E has the smallest possible size. We define the size of a cluster S_i to be the maximum distance (under the ρ-metric) between a fixed point c_i, called the center of the cluster, and a point of S_i. The size of a partition is defined to be the maximum size of a cluster in the partition. Recently the k-windows clustering algorithm [14] has been extended [3, 4, 11, 12] in order to be able to automatically determine the number of clusters present in a dataset. The k-windows algorithm attempts to place a window over all the patterns that belong to a single cluster. In this



contribution we propose to modify this algorithm by using semi-algebraic data structures instead of windows.

2

Unsupervised k-windows clustering algorithm

For completeness purposes we briefly describe the workings of the original k-windows algorithm and its extension to automatically determine the number of clusters. Intuitively, the k-windows algorithm tries to capture all the patterns that belong to one cluster within a d-dimensional window (frame, box) [14]. To meet this goal it uses two fundamental procedures, "movement" and "enlargement". During the movement procedure each window is centered at the mean of the patterns that are included in it. The movement procedure is iteratively executed as long as the distance between the new and the previous center exceeds the user-defined variability threshold, θv. On the other hand, the enlargement process tries to augment the window to include as many patterns from the current cluster as possible. To this end, enlargement takes place at each coordinate separately. Each range of a window is enlarged by a proportion θe/l, where θe is user-defined and l stands for the number of previous successful enlargements. To consider an enlargement successful, firstly the movement procedure is invoked, and after it terminates the proportional increase in the number of patterns included in the window is calculated. If this proportional increase exceeds the user-defined coverage threshold, θc, then the enlargement is considered successful. In this case, if the successful enlargement was for coordinate c' ≥ 2, then all coordinates c'', such that c'' < c', undergo enlargement assuming as initial position the current position of the window. Otherwise, the enlargement and movement steps are rejected and the position and size of the d-range are reverted to their values prior to enlargement. In Figure 1 the two processes are illustrated.
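A compact sketch of the two procedures is given below. It is only indicative: the thresholds follow the notation above, the brute-force in-window test stands in for the kd-tree range search used in practice, and the back-tracking over earlier coordinates is omitted.

import numpy as np

def points_in(window, X):
    # Return the patterns of X inside a window given as (center, ranges).
    center, ranges = window
    return X[np.all(np.abs(X - center) <= ranges / 2.0, axis=1)]

def move(window, X, theta_v=1e-3):
    # Movement: recenter at the mean of the enclosed patterns until stable.
    center, ranges = window
    while True:
        inside = points_in((center, ranges), X)
        if inside.size == 0:
            return center, ranges
        new_center = inside.mean(axis=0)
        if np.linalg.norm(new_center - center) <= theta_v:
            return new_center, ranges
        center = new_center

def enlarge(window, X, theta_e=0.2, theta_c=0.05, theta_v=1e-3):
    # Enlargement: grow one coordinate at a time while coverage keeps improving.
    center, ranges = move(window, X, theta_v)
    count = len(points_in((center, ranges), X))
    for c in range(ranges.size):
        successes = 0
        while True:
            trial = ranges.copy()
            trial[c] *= 1.0 + theta_e / (successes + 1)
            new_center, trial = move((center, trial), X, theta_v)
            new_count = len(points_in((new_center, trial), X))
            if count and (new_count - count) / count >= theta_c:
                center, ranges, count = new_center, trial, new_count
                successes += 1
            else:
                break        # revert: keep the previous window
    return center, ranges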


Figure 1: (a) Sequential movements M2, M3 of initial window Ml. (b) The enlargement process. E1 enlargement is rejected while E2 is accepted. A fundamental issue in cluster analysis, independent of the particular clustering technique applied, is the determination of the number of clusters present in a dataset. This issue remains an unsolved problem in cluster analysis. For instance well-known and widely used iterative techniques, such as the k-means algorithm [7], require from the user to specify the number of clusters present in the data prior to the execution of the algorithm. On the other hand, the unsupervised k-windows algorithm generalizes the original algorithm by endogenously determining the number of clusters. The key idea to achieve this is to apply the k-windows algorithm using a "sufficiently" large number of initial windows. The windowing technique of the k-windows algorithm allows for a large number of initial windows to be examined, without any significant overhead in time complexity. Once movement and enlargement of all windows terminate, all overlapping windows are considered for merging. The merge operation is guided by two thresholds the merge threshold, Om, and the similarity threshold, 08 • Having identified two overlapping windows, the number of patterns that lie in their intersection is computed. Next, the proportion of this number to the total patterns included in each window is calculated. If the mean of these two proportions exceeds 08 , then the two windows are considered to be identical and the one containing the smaller number of points


is deleted. Otherwise, if the mean exceeds θm, then the windows are considered to belong to the same cluster and are merged. This operation is illustrated in Figure 2; the extent of overlapping between windows W1 and W2 exceeds the threshold criterion and the algorithm considers both to belong to a single cluster, unlike windows W3 and W4, which capture two different clusters. On the other hand, the extent of overlapping of windows W5 and W6 exceeds the θs threshold, thus W6 is deleted. For a comprehensive description of the algorithm and an investigation of its capability to automatically identify the number of clusters present in a dataset, refer to [3, 4, 11, 12].


Figure 2: (a) W1 and W2 satisfy the similarity condition and W1 is deleted. (b) W3 and W4 satisfy the merge operation and are considered to belong to the same cluster. (c) W5 and W6 have a small overlap and capture two different clusters.

The computational complexity of the algorithm depends on the computational complexity of the range searches. To make this step time efficient, a technique from Computational Geometry [2, 10] is employed. This technique constructs a multi-dimensional binary tree (kd-tree) for the data at a preprocessing step and traverses this tree to solve the Orthogonal Range Search Problem. From the performance viewpoint, the kd-tree requires, optimally, O(dn) storage and can be optimally constructed in Θ(dn log n) time, where d is the dimension of the data and n is the number of patterns. The worst-case behavior of the query time is O(|A| + d n^{1−1/d}) [10], where A is the set containing the points belonging to the specific d-range.
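For illustration, an orthogonal range search over a kd-tree can be sketched as below; this toy version (median-split construction, axis-aligned box query) is not the implementation used by the authors, and in practice an existing computational-geometry library would normally be preferred.

def build_kdtree(points, depth=0):
    # Build a kd-tree by median splitting; points is a list of equal-length tuples.
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {"point": pts[mid], "axis": axis,
            "left": build_kdtree(pts[:mid], depth + 1),
            "right": build_kdtree(pts[mid + 1:], depth + 1)}

def range_search(node, lo, hi, out):
    # Collect every point p with lo[i] <= p[i] <= hi[i] for all coordinates i.
    if node is None:
        return out
    p, axis = node["point"], node["axis"]
    if all(lo[i] <= p[i] <= hi[i] for i in range(len(p))):
        out.append(p)
    if lo[axis] <= p[axis]:                 # the query box may reach into the left subtree
        range_search(node["left"], lo, hi, out)
    if p[axis] <= hi[axis]:                 # ... and/or into the right subtree
        range_search(node["right"], lo, hi, out)
    return out

# toy usage: points inside the 2-range [0.2, 0.8] x [0.2, 0.8]
tree = build_kdtree([(0.1, 0.9), (0.4, 0.3), (0.7, 0.6), (0.95, 0.2)])
print(range_search(tree, (0.2, 0.2), (0.8, 0.8), []))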

3

Clustering Using Semi-Algebraic Data Structures

The aim of this paper is to replace the windowing technique of the k-windows algorithm by using data structures different from d-ranges. Using semi-algebraic structures like simplices, spheres or ellipsoids, the clustering ability of the algorithm can be enhanced, since these kinds of structures are able to capture clusters with non-trivial shapes. The difficulty that arises, though, is the efficiency of the range searching. More specifically, unlike orthogonal range searching, no simplex range searching data structure is known that can answer a query in polylogarithmic time using near-linear storage. Thus,

Range     Storage     Query Time
Simplex   n log(n)    log n
Ball      m           (n / m^{1/⌈d/2⌉}) log^c(n)
          n           n^{1−1/(2d−3)}+

Fig. 1: The relationship between the customer's requirements and the implementation.

2.2 Append Security Requirements

For the development of software, the first objective is the perfect implementation of the customer's requirements. And this work may be done by very simple processes. However, if the software developed has some critical security holes, the whole network or systems on which that software is installed and run are very vulnerable. Therefore, developers or analyzers must consider some security-related factors and append a few security-related requirements to the customer's requirements. Fig. 2 depicts the idea behind this concept. The processes based on the refinement of the security-related requirements are considered together with the processes of software implementation.

2.3 Implementation of Security Requirements

Developers can reference ISO/IEC 15408, the Common Criteria (CC), to implement the appended security-related requirements. The multipart standard ISO/IEC 15408 defines criteria, which for historical and continuity purposes are referred to herein as the CC, to be used as the basis for evaluation of security properties of IT products and systems. By establishing such a common criteria base, the results of an IT security evaluation will be meaningful to a wider audience. The CC will permit comparability between the results of independent security evaluations. It does so by providing a common set of requirements for the security functions of IT products and systems and for assurance measures applied to them during a security evaluation. The evaluation process establishes a level of confidence that the security functions of such products and systems and the assurance measures applied to them meet these requirements. The evaluation results may help consumers to determine


whether the IT product or system is secure enough for their intended application and whether the security risks implicit in its use are tolerable.

Fig. 2: Append security-related requirements.

3. Conclusion and Future Work

This paper proposes a method for appending security-related requirements to the customer's requirements. For the development of software, the first objective is the perfect implementation of the customer's requirements. However, if the software developed has some critical security holes, the whole network or systems on which that software is installed and run may be very vulnerable. Therefore, developers or analyzers must consider some security-related factors and append a few security-related requirements to the customer's requirements. As future work, the processes based on the refinement of the security-related requirements must be considered together with the processes of software implementation.

References [1] ISO, ISO/IEC TR 15504-5:1998 Information technology - Software process assessment - Part 5: An assessment model and indicator guidance. [2] ISO, ISO/IEC 21827 Information technology - Systems Security Engineering Capability Maturity Model (SSE-CMM). [3] ISO, ISO/IEC 15408: Information technology - Security techniques - Evaluation criteria for IT security, 1999. [4] Tai-hoon Kim and Haeng-kon Kim: A Relationship between Security Engineering and Security Evaluation, ICCSA 2004, LNCS 3046, Part 4, 2004. [5] Eun-ser Lee, Kyung-whan Lee, Tai-hoon Kim and Il-hong Jung: Introduction and Evaluation of Development System Security Process of ISO/IEC TR 15504, ICCSA 2004, LNCS 3043, Part 1, 2004.


Lecture Series on Computer and Computational Sciences

Volume I, 2004, pp. 893-895

Common Development Security Requirements to Improve Security of IT Products
Sang ho Kim 1 and Choon seong Leem 2
1 KISA, 78, Garak-Dong, Songpa-Gu, Seoul, Korea
2 Yonsei University, 134, Shinchon-Dong, Seodaemun-Gu, Seoul, Korea

Abstract: Development security has become more important for making secure IT products by maintaining integrity and confidentiality during the development process. However, the standards that specify development security requirements are not detailed enough to use, and differences also exist among the requirements of each standard. In this paper, we present common development security requirements to maintain the confidentiality and integrity of the desired IT products in the development environment. These requirements are induced through the analysis of security standards such as the Common Criteria, BS 7799, and SSE-CMM. We believe that the common development requirements we suggest contribute to improving the security of IT products. Keywords: Security Requirements, Security Improvement, Common Criteria

1.

Introduction

Security has been a crucial issue for IT products. As the use of IT products rises, the demand to secure these products also rises. This is because IT products usually contain private data which have to be available only to authorized users. Developers usually define security functional requirements which should be included in IT products for a given particular environment against expected threats identified by risk analysis. However, non-functional security requirements like development security are still ignored. This causes security breaches in IT products. In this paper, we focus on development security among the non-functional security requirements. There are at least two reasons for the lack of development security in the development of IT products. Firstly, development security requirements are generally difficult to define. The standards that specify development security requirements are not detailed enough to use, and differences also exist among the requirements of each standard. The Common Criteria (CC, ISO/IEC 15408) [1], the international standard for the evaluation of security properties of IT products and systems, specifies development security requirements in the ALC_DVS component of the assurance requirements, but these requirements are not specific and are even ambiguous. BS7799-1 [2] and SSE-CMM [3] also state requirements for development security management, focusing on the security management of an organization. Those cannot be directly applied to the development security of IT products. Secondly, developers lack expertise for development security. Many developers tend to consider security functional features but rarely review development security issues. In this paper, we present common development security requirements to maintain the confidentiality and integrity of the desired IT products during the development process. These requirements are induced through the analysis of security standards such as the Common Criteria, BS 7799 and SSE-CMM.

2. Related Works
2.1 Common Criteria

The Common Criteria is an international standard used as the basis for the evaluation of security properties of IT products. It defines development security in the ALC_DVS family of the assurance requirements.

1 KISA, 78, Garak-Dong, Songpa-Gu, Seoul, Korea, E-mail: [email protected]
2 Yonsei University, 134, Shinchon-Dong, Seodaemun-Gu, Seoul, Korea


Items and requirements of ALC_DVS.2:
- ALC_DVS.2.1: Describe all the physical, procedural, personnel, and other security measures that are necessary to protect the confidentiality and integrity of the TOE (Target of Evaluation) in its development environment.
- ALC_DVS.2.2: Provide evidence that these security measures are followed during the development and maintenance of the TOE.
- ALC_DVS.2.3: Security measures provide the necessary level of protection to maintain the confidentiality and integrity of the TOE.

2.2 BS 7799-1
BS7799-1 is an international standard for controls of information security management of an organization. Requirements for development security are specified in the personnel security and the physical & environmental security sections, respectively.
Personnel security:
- Security roles and responsibilities of job & employment
- Personnel screening, confidentiality agreement
- Security education and training
Physical & environmental security:
- Physical security perimeter
- Physical entry controls
- Securing offices, rooms and facilities
- Working in secure areas
- Equipment security

2.3 SSE-CMM
SSE-CMM is a standard related to security engineering practices in the software life cycle. Development security requirements are described in the BPs (Base Practices) of PA (Process Area) 01, PA09, PA10 and PA20.
- PA01 Administer Security Controls: BP.01.01, BP.01.03, BP.01.04
- PA09 Provide Security Input: BP.09.05, BP.09.06
- PA10 Specify Security Needs: BP.10.02, BP.10.05
- PA20 Manage System Engineering Support Environment: BP.20.02, BP.20.03, BP.20.06

3. Common Requirements for Development Security
We propose common requirements for development security as the framework below (Figure 1). Development security requirements are composed of three categories and their elements.


Figure 1. Common development security framework (personal, physical, and procedural security categories and their elements).

3.1 Personal Security
Requirements of personal security are defined as PES1, PES2, and PES3.
- PES1. Security management: 1.1 Recruit & retire developer management; 1.2 Developer role & responsibility definition; 1.3 Outsourced developer management
- PES2. Security awareness: 2.1 Security awareness program; 2.2 Periodical program execution
- PES3. Security account management: 3.1 Account create/modification/delete; 3.2 Password guideline; 3.3 Access privileges

3.2 Physical Security
Requirements of physical security are defined as PHS1, PHS2, and PHS3.
- PHS1. Development site access control: 1.1 Access-limited area setting; 1.2 Access control equipment; 1.3 System protection measures
- PHS2. Network security: 1.1 Network security policy; 1.2 Network protection equipment; 1.3 Periodic vulnerability analysis & audit
- PHS3. Documents protection: 1.1 Access control equipment for the facility; 1.2 Periodic back-up & audit

3.3 Procedural Security
Requirements of procedural security are defined as PRS1, PRS2, and PRS3.
- PRS1. Visitor access control: 1.1 Visitor lists & escorts; 1.2 Security badges
- PRS2. Product distribution: 1.1 Production line security; 1.2 Secure distribution route; 1.3 Stock security management
- PRS3. Incident handling: 1.1 Emergency contact points; 1.2 Response & recovery procedure
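As an illustration of how the framework could be used in an assessment, the sketch below holds the PES/PHS/PRS elements as a simple checklist structure; the Python representation and the helper function are our own, not part of the paper.

```python
# Requirement elements of Figure 1, keyed by category (names follow the paper).
DEVELOPMENT_SECURITY_REQUIREMENTS = {
    "PES (personal security)": {
        "PES1 Security management": ["recruit & retire developer management",
                                     "developer role & responsibility definition",
                                     "outsourced developer management"],
        "PES2 Security awareness": ["security awareness program", "periodical program execution"],
        "PES3 Security account management": ["account create/modify/delete",
                                             "password guideline", "access privileges"],
    },
    "PHS (physical security)": {
        "PHS1 Development site access control": ["access-limited area setting",
                                                 "access control equipment",
                                                 "system protection measures"],
        "PHS2 Network security": ["network security policy", "network protection equipment",
                                  "periodic vulnerability analysis & audit"],
        "PHS3 Documents protection": ["access control equipment for the facility",
                                      "periodic back-up & audit"],
    },
    "PRS (procedural security)": {
        "PRS1 Visitor access control": ["visitor lists & escorts", "security badges"],
        "PRS2 Product distribution": ["production line security", "secure distribution route",
                                      "stock security management"],
        "PRS3 Incident handling": ["emergency contact points", "response & recovery procedure"],
    },
}

def unmet_elements(assessment):
    """Given {element: True/False}, list the elements an assessor marked as not satisfied."""
    return [name for name, satisfied in assessment.items() if not satisfied]
```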

4. Conclusions and Future Works
In this paper, common development security requirements have been provided to maintain integrity and confidentiality during the development process of IT products. We believe that the common development security requirements we suggest contribute to improving the security of IT products and reduce security breaches. More research on detailed metrics for development security is left as future work.

References
[1] ISO, ISO/IEC 15408-3:1999 Information technology - Security techniques - Evaluation criteria for IT security - Part 3: Security assurance requirements.
[2] BSI (UK), BS7799-1: Information security management - Part 1: Code of practice for information security management, 1999.
[3] Carnegie Mellon University, Systems Security Engineering Capability Maturity Model, Version 3.0, 2003.
[4] ISO, ISO/IEC TR 13335, Information Technology - Guidelines for the management of IT Security, 1996.
[5] Sang ho Kim, SSE-CMM BPs to Meet the Requirements of ALC_DVS.1 Component in CC, pp. 1069-1075, Springer LNCS, 2003.
[6] Sang ho Kim, Supplement of Security-Related Parts of ISO/IEC TR 15504, pp. 1084-1089, Springer LNCS, 2003.


Lecture Series on Computer and Computational Sciences, Volume 1, 2004, pp. 896-899

Decision Supporting Method with the Analytic Hierarchy Process Model for the Systematic Selection of COTS-based Security Controls
Sangkyun Kim 1
Somansa, Woolim e-Biz center, 16, Yangpyeongdong 3-ga, Yeongdeungpogu, Seoul 150-103, Korea
Choon Seong Leem 2
Department of Computer and Industrial Engineering, Yonsei University, 134, Shinchondong, Seodaemoongu, Seoul 129-749, Korea
Received 31 July, 2004; accepted in revised form 25 August, 2004
Abstract: The successful management of information security within an organization is vital to its survival and success. The effective implementation of COTS (Commercial, Off-The-Shelf)-based security controls is one of the critical success factors of information security management. This paper presents a formal method which provides a process of selection and criteria for evaluation of COTS-based security controls for the effectiveness and efficiency of decision making by corporate managers. A case study proves the practical value of this paper.
Keywords: AHP; Decision Support; COTS; Security Control
Mathematics Subject Classification: 68U99
PACS: 89.70.+c

1. Previous Researches
The previous methodologies which are well known as selection methodologies for information systems, such as METHOD/1, ASAP and VIP2000, do not consider security issues, but only focus on reliability and usability issues [1, 2, 3, 4]. TCSEC, ITSEC and CC deal with issues of functionality or effectiveness of the security product itself. Because the objective of these evaluation schemes is to supply official certification of particular security controls, it is difficult to use them when an organization evaluates and selects security controls which have similar levels of certification. Several methods have been proposed in previous research to characterize and to guide the selection of COTS-based security tools. A summary of these researches is shown in Table 1 [5, 6, 7, 8].
Table 1: Previous researches on the selection of security controls

Research | Objective | Limits
Gilbert, 1989 | Guide for selecting automated risk analysis tools | Only focus on specific security products
Polk, 1992 | Guide for selecting anti-virus tools and techniques | Only focus on specific security products
Barbour, 1996 | SCE (Software Capability Evaluation) implementation guide for supplier selection | Absence of evaluation and selection criteria
Henze, 2000 | Implementation procedures of IT security safeguards | Absence of evaluation and selection criteria

1 Corresponding author, Member of the Research Group of Security Engineering. E-mail: [email protected]
2 E-mail: [email protected]


2. Process Model
In this paper we propose a process model which includes eight steps. The logic of these steps, which provide tactics-level planning, is derived from the research of METHOD/1, ASAP and VIP2000 [1, 2, 3]. The activities and tasks which provide operational planning are derived from the research of Gilbert, Polk, Barbour and Henze [5, 6, 7, 8]. Key characteristics of the eight steps are:
Step 1) Requirement analysis: Technical and administrative requirements are consolidated. Technical requirements include the types of database, communication protocol, manipulation structure, interoperability, and functional list. Administrative requirements include the allocatable resources, legal liability, business needs, and constraints.
Step 2) Introduction planning: An introduction plan is based on the requirement analysis. It arranges resources and provides time plans from RFP development to the operation phase. The introduction team is organized with experienced and trained members. Roles and responsibilities of the team are defined considering the introduction plan and the characteristics of each member.
Step 3) RFP development: An RFP is developed considering internal requirements and market environments. Internal requirements are consolidated during the requirement analysis phase. Market environments include information on what kinds of vendors and products are available, and the industrial best practices. The evaluation criteria might be used in this phase.
Step 4) Proposal receipt: The introduction team identifies solution providers who have public confidence, and judges which bidders will receive the RFP. An NDA must be signed before sending the RFP and other related materials.
Step 5) Bidders' briefing: The introduction team conducts a review of the presented proposals, vendor presentations, interviews, and benchmarking tests.
Step 6) Judgement & contract: The introduction team consolidates data, makes a judgement on vendor and product, and produces a report. Finally, the introduction team makes a contract considering technical and administrative requirements.
Step 7) Introduction: A prototype may be required to assure operational performance in production environments. Installation and data migration are conducted with detailed testing and revising activities. The administrators should be trained before the operation phase. Finally, an accreditation is required based on contract terms.
Step 8) Operation: Awareness training of users, auditing of violations, compliance checking of legal liability and security policy, and change management of security-related features should be performed.

3. Evaluation Criteria
We make the first level of evaluation criteria with the concept of S3IE provided in VIP2000. The factors of the second and third levels of evaluation criteria are composed of the factors of Kavanaugh, Beall and Firth [9, 10, 11]. The software quality attributes of Mario and ISO 9126 are also included in the third level of evaluation criteria [12]. The evaluation criteria are described in Table 2.

Table 2: Evaluation criteria
1st level | 2nd level | 3rd level
Credibility of supplier | track record | market share, certification, relationship
Credibility of supplier | speciality | security expertise, solution lineup, best practice, turnkey solution
Credibility of supplier | coverage | geographic coverage
Competitiveness of product | sales condition | price, marketing program, maintenance, support services
Competitiveness of product | architecture | H/W requirement, OS supported, source language, source code available, NOS supported, protocols supported, component model supported
Competitiveness of product | function | preventive function, detective function, deterrent, recovery function, corrective function
Competitiveness of product | performance | functionality, reliability, usability, efficiency, maintainability, portability
Continuity of service | vendor stability | financial stability, vision and experience of the management staff
Continuity of service | contract terms | warranty, product liability
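For illustration only, Table 2 can be carried as a small nested structure from which the pairwise-comparison matrices of the AHP evaluation in the case study are later built; the Python representation below is our own sketch, not part of the paper.

```python
# Table 2 captured as first level -> second level -> third-level factors.
EVALUATION_CRITERIA = {
    "credibility of supplier": {
        "track record": ["market share", "certification", "relationship"],
        "speciality": ["security expertise", "solution lineup", "best practice", "turnkey solution"],
        "coverage": ["geographic coverage"],
    },
    "competitiveness of product": {
        "sales condition": ["price", "marketing program", "maintenance", "support services"],
        "architecture": ["H/W requirement", "OS supported", "source language",
                         "source code available", "NOS supported", "protocols supported",
                         "component model supported"],
        "function": ["preventive", "detective", "deterrent", "recovery", "corrective"],
        "performance": ["functionality", "reliability", "usability", "efficiency",
                        "maintainability", "portability"],
    },
    "continuity of service": {
        "vendor stability": ["financial stability", "vision and experience of the management staff"],
        "contract terms": ["warranty", "product liability"],
    },
}
```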

4. A Case Study
In this case study, the AHP method was applied to a particular project in which XYZ Co. Ltd. wanted to implement information security systems. XYZ planned to introduce a firewall. In this study,


there was no relationship between vendors and XYZ, and vendors and products were treated independently. The process model provided in Section 2 was used to support the introduction steps. XYZ used the selection criteria for RFP development, judgement, and making a contract. The first step was to develop a hierarchical structure of the problem, which classifies the goal, all decision criteria and variables into four major levels. The first level of the hierarchy is the overall goal: to select the best firewall. Level 2 and Level 3 represent the evaluation criteria for selecting the firewall. Level 4 contains the decision alternatives that affect the ultimate selection of the chosen firewalls. The judgements were elicited from security experts in security solution providers and government agencies. Expert Choice was used to facilitate comparisons of priorities. For example, the competitiveness of the product was the most important criterion in Level 2. After inputting the criteria and their importance into Expert Choice, the priorities of each set of judgements were found and recorded in Figure 1.
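The priorities reported here were obtained with the Expert Choice tool; as a sketch of the underlying AHP computation, the snippet below derives a priority vector and a consistency index from a pairwise comparison matrix using the principal-eigenvector method. The numerical judgements in the example matrix are hypothetical, not the experts' actual comparisons from the case study.

```python
import numpy as np

def ahp_priorities(pairwise):
    """Priority vector (principal eigenvector, normalized) and consistency index of an AHP matrix."""
    A = np.asarray(pairwise, dtype=float)
    eigvals, eigvecs = np.linalg.eig(A)
    k = int(np.argmax(eigvals.real))          # principal eigenvalue (Perron root)
    w = np.abs(eigvecs[:, k].real)
    w /= w.sum()                              # local priorities summing to 1
    n = A.shape[0]
    ci = (eigvals.real[k] - n) / (n - 1)      # consistency index
    return w, ci

# Hypothetical Level-2 judgements: credibility of supplier, competitiveness of product,
# continuity of service (competitiveness assumed dominant, as reported in the case study).
A = [[1.0, 1/3, 2.0],
     [3.0, 1.0, 4.0],
     [0.5, 1/4, 1.0]]
weights, ci = ahp_priorities(A)
print(weights.round(3), round(ci, 4))
```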

[Figure 1: AHP hierarchy and priorities computed with Expert Choice; goal: best security control; e.g., credibility of supplier L: 0.222, G: 0.222; track record L: 0.396, G: 0.088.]

... and marks this value in FF2, the second 8 bits of the FL field. After marking, the router sends the packet toward the destination. On the transmission path, an intermediate router does not perform marking if it finds that TM is set to 1, because the packet has already been marked by a previous router.

3.3 Authenticated Traceback Path Reconstruction
For a packet transmitted through the network, the victim system V reconstructs the malicious DDoS attack path with an authenticated verification process. First of all, let Pv be the set of packets that arrived at the victim system V and correspond to the DDoS attack, and let Mv be the set of packets within Pv which were marked by routers. First, the victim obtains B'x and Kx included in the authentication header; Kx can be obtained by generating B'x, the authentication information in the packet, and then extracting Kx using Ax and the timestamp information t,x.
Step 2: Authenticated Reconstruction. M_i^FF1 = H(f_{K_{t,x}}(M_i^FF); R_x) for R_x with D(M_i) = 2, and M_i^FF1 = H(f_{K_{t,x}}((HLo M_i AND 00111111) + 2); R_x) for R_x with D(M_i) = 2, where D(M_i) = M_i^FF - (HLo M_i AND 00111111). The victim system can then reconstruct the actual attack path through which the packets in the malicious DDoS attack packet set Pv were transmitted by repeating the same process for each M_j satisfying D(M_j) = n, with n >= 3. We can verify the packet's integrity with the authentication process on the IP traceback packet, and we can also verify the hashed MAC value with the chain keys K_{t,x} and K_{t,y} from the AH header.
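The marking and verification equations above are only partially legible in this copy; as a generic illustration of the underlying idea — a router authenticates its small marking value with a keyed hash that the victim can recheck once the corresponding chain key is available — a toy sketch follows. The field sizes, key handling and function names here are our assumptions, not the paper's exact construction.

```python
import hmac, hashlib

def make_marking(packet_digest: bytes, router_id: bytes, chain_key: bytes) -> bytes:
    """Router side: derive a 1-byte marking (fits an 8-bit field) from an HMAC over packet and router id."""
    tag = hmac.new(chain_key, packet_digest + router_id, hashlib.sha256).digest()
    return tag[:1]

def check_marking(packet_digest: bytes, router_id: bytes, chain_key: bytes, marking: bytes) -> bool:
    """Victim side: recompute the keyed hash with the disclosed chain key and compare."""
    return hmac.compare_digest(make_marking(packet_digest, router_id, chain_key), marking)
```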

4. Conclusions

We overviewed the possible attack mechanisms on IPv6 by considering the vulnerabilities of the proposed IPv6 structure. As a solution, we propose a new traceback function on the IPv6 packet. If a victim system is under attack, it identifies the spoofed source of the hacking attacks using the generated and collected traceback path information. Thus this study proposed an authenticated technique to trace back the source IP of spoofed DDoS IPv6 packets with a traceback function.

Acknowledgment This work is supported by KISA and University IT Research Center(ITRC) Project of Korea.

References
[1] S. Deering, R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, IETF, December 1998.
[2] Franjo Majstor, "Does IPv6 protocol solve all security problems of IPv4?", Information Security Solutions Europe, 7-9 October 2003.
[3] Pete Loshin, "IPv6: Theory, Protocol, and Practice", Second edition, Morgan Kaufmann, 2003.
[4] John Elliott, "Distributed Denial of Service Attack and the Zombie and Effect", IT Professional, March/April 2000.
[5] Timo Koskiahde, "Security in Mobile IPv6", 8306500 Security protocols, Tampere University of Technology, 2002.
[6] Tatsuya Baba, Shigeyuki Matsuda, "Tracing Network Attacks to Their Sources", IEEE Internet Computing, pp. 20-26, March 2002.
[7] Hassan Aljifri, "IP Traceback: A New Denial-of-Service Deterrent?", IEEE Security & Privacy, pp. 24-31, May/June 2003.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 919-922

The Undeniable Digital Copyright Protection Scheme Providing Multiple Authorship of Multimedia Contents
Sung-Hyun Yun 1, Hyung-Woo Lee 2, Hee-Un Park, Nam-Ho Oh, Jae-Sung Kim 3
(1) Div. of Info. and Comm. Engineering, Cheonan University, Anseo, Cheonan, 330-704, Korea
(2) Dept. of Software, Hanshin University, Yangsan, Osan, Gyunggi, 447-791, Korea
(3) Korea Information Security Agency, Garak, Songpa, Seoul, 138-803, Korea
Abstract: A digital copyright is a signature added to digital multimedia contents. The copyright has been used to establish authorship and to check whether the data has been tampered with. In existing copyright protection schemes based on ordinary signature schemes, someone can make a counterfeited copyright and embed it in the copyrighted digital contents to assert multiple claims of rightful authorship. In this case, an original author wants to make a copyright which cannot be verified without the help of the author, so that only the intended verifier can be convinced of its validity. In an undeniable signature scheme, the signature can be verified and disavowed only with the cooperation of the signer. If the copyright has the property of an undeniable signature, the verifier cannot distinguish between valid and invalid copyrights without the help of the copyright owner. Generally, digital multimedia contents are made by many authors. In applications that require many authors and a designated verifier, an algorithm for an undeniable multi-signature is needed. In this paper, a digital copyright protection scheme based on an undeniable multi-signature scheme is proposed. The copyright can be verified and disavowed only in cooperation with all signers. Our scheme is secure against active attacks such as signature modification and repudiation of the signature by signers. It can be applied to fair on-line sales of co-authored copyrighted digital multimedia contents.
Keywords: Digital Copyright Protection Scheme, Undeniable Signature, Undeniable Multi-Signature, Multimedia Contents Security

1 Introduction

A digital copyright is generally used to establish an author's right on digital multimedia contents. The digital copyright is a signature added to the digital multimedia contents so as to check whether the contents have been tampered with or to identify their intended recipient. In conventional digital signature schemes, a signer can make an original signature on digital data which the signer cannot repudiate. Everybody can certify the signature, since the ordinary signature has the self-verification property. Therefore, in existing copyright protection schemes based on ordinary signature schemes, someone can make a counterfeited copyright and embed it in the copyrighted digital multimedia contents to assert multiple claims of rightful authorship. In this case, an original author wants to make a copyright which cannot be verified without the help of the author, so that only the intended verifier can be convinced of its validity.

1 Corresponding author. Div. of I&C Engineering, Cheonan University, Korea. E-mail: [email protected]
2 Dept. of Software, Hanshin University, Korea. E-mail: [email protected]
3 Korea Information Security Agency, Seoul, Korea. E-mail: {hupark, nhooh, jskim}@kisa.or.kr


An undeniable digital signature scheme was first proposed by D. Chaum [2]. In the undeniable signature scheme, the digital signature can be verified and disavowed only with the cooperation of the signer. There are many applications to which conventional digital signature schemes cannot be applied. In on-line sales of digital contents, the owner of the digital contents wants to know whether a distributor sells the contents to customers fairly. In this case, the owner can be satisfied with an on-line sales model if it provides a mechanism by which the customer cannot buy the digital contents without the help of the owner. A digital copyright based on an undeniable signature scheme differs from an ordinary copyright in that the customer cannot distinguish between valid and invalid copyrights without the help of the copyright owner. Only the original owner can confirm the copyright as authentic to the customer. Existing copyright protection schemes are mainly focused on the protection of a single owner's authorship [7]. Generally, digital multimedia contents are made by many authors through cooperative work. A new digital copyright protection scheme is required to provide equal rights to the co-workers. In applications that require many authors and a designated verifier, an algorithm for an undeniable multi-signature is needed. In this paper, a digital copyright protection scheme providing multiple authorship of multimedia contents is proposed. The proposed scheme is based on an undeniable multi-signature scheme; the multi-signature can be verified and disavowed only in cooperation with all signers. In the proposed scheme, the El-Gamal signature scheme is modified to satisfy the properties of an undeniable signature and is extended to make a multi-signature [1,5]. Our scheme is secure against modification of the multi-signature and repudiation of it by signers. It can also be applied to the protection of digital copyright on co-authored digital multimedia contents. In case of a dispute between authors, the proposed scheme can resolve it by launching the disavowal protocol to identify whether authors have cheated.

2 The Undeniable Digital Copyright Protection Scheme

To make the proposed scheme, we modify the El-Gamal signature equation [1], letting k(mh + s) = x·r (mod p-1), and extend it to accept the undeniable properties of D. Chaum's scheme [2] and multi-signature properties [3,4]. The proposed scheme consists of multi-signature generation, multi-signature confirmation and disavowal protocols. In the multi-signature generation protocol, a copyright maker can make an undeniable digital copyright. In the multi-signature confirmation protocol, the digital copyright can be verified only with the help of all authors. If the copyright verification fails, the disavowal protocol is used to identify whether the copyright is invalid or some authors have cheated. The following parameters are used in the proposed scheme.
Authors: u1, u2, ..., un; Multimedia contents: m ∈ Z_{p-1}; Author i's private key: x_i ∈ Z_{p-1}; Author i's public key: y_i = g^{x_i} (mod p)

(1) Multi-signature Generation Protocol
A copyright maker sends the digital multimedia contents to all authors. Each author computes an undeniable signature and sends it to the copyright maker. The copyright maker uses each author's undeniable signature to compute the undeniable multi-signature.
Step 1: The copyright maker generates the hash value mh = h(m, hpr) and sends (m, hpr) to the first author u1. The hash parameter hpr is adjusted to make mh a primitive root of p.
Step 2: To make the common public key Y, u1 lets Y1 = y1. u1 chooses a random number k1 such that k1 and p-1 are relatively prime. u1 computes r1 = mh^{k1} (mod p) and sends it to the second author u2.
Step 3: The intermediate author ui (2 ≤ i ≤ n) receives (r_{i-1}, Y_{i-1}) from u_{i-1}. ui chooses a random number ki and computes r_i = r_{i-1}^{k_i} = mh^{∏_{j=1}^{i} k_j} (mod p) and Y_i = Y_{i-1}^{x_i} = g^{∏_{j=1}^{i} x_j} (mod p). ui sends (r_i, Y_i) to the next author u_{i+1}. If ui is the last author, it computes R = r_{n-1}^{k_n} = mh^{∏_{j=1}^{n} k_j} (mod p) and Y = Y_{n-1}^{x_n} = g^{∏_{j=1}^{n} x_j} (mod p) and sends them to all authors as well as the copyright maker.
Step 4: Each author ui (1 ≤ i ≤ n) computes the undeniable signature si and sends it to the copyright maker. Since ki and p-1 are relatively prime, there exists si satisfying the equation ki·si = xi·R - ki·mh (mod p-1).
Step 5: The copyright maker computes the undeniable multi-signature S = ∏_{j=1}^{n} (mh + sj) (mod p).

(2) Multi-signature Confirmation Protocol
Step 1: The verifier chooses two random numbers (a, b) and computes the challenge ch = R^{S·a} · Y^{R^n·b} (mod p). The challenge ch is delivered to the first author u1.
Step 2: u1 computes the response rsp1 = ch^{x1^{-1}} (mod p) and sends it to the second author u2.
Step 3: The intermediate author ui (2 ≤ i ≤ n) receives the response rsp_{i-1} from u_{i-1}. Then ui computes the response rsp_i = rsp_{i-1}^{x_i^{-1}} (mod p) and sends it to the next author u_{i+1}. If ui is the last author, the response rsp_n is delivered to the verifier.
Step 4: If the equation rsp_n = mh^{R^n·a} · g^{R^n·b} (mod p) holds, the verifier is convinced that the multi-signature is valid. Otherwise, the disavowal protocol is launched to identify whether the multi-signature is invalid or some authors have cheated.

(3) Disavowal Protocol
The verifier chooses two random numbers (c, d) with a·d ≠ b·c (mod p-1) and computes the second challenge ch' = R^{S·c} · Y^{R^n·d} (mod p). If the second response rsp'_n is not equal to mh^{R^n·c} · g^{R^n·d} (mod p), the additional step 5 is required.
Step 5: The verifier evaluates the discrimination equations. If R1 = (rsp_n · g^{-R^n·b})^c (mod p) equals R2 = (rsp'_n · g^{-R^n·d})^a (mod p), the verifier concludes that the multi-signature is invalid. Otherwise, at least one signer has cheated on a valid multi-signature.
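For intuition about the challenge-response pattern used above, the sketch below shows the classic single-signer Chaum-style confirmation on which it is built: the verifier blinds the signature with random exponents (a, b), and only a party holding the private key can return the expected response. The toy parameters and variable names are ours; this is not the paper's multi-signature construction.

```python
import secrets

# Toy parameters: p = 2q + 1 with q prime; g = 4 generates the order-q subgroup.
p, q, g = 1019, 509, 4

x = secrets.randbelow(q - 1) + 1                   # signer's private key
y = pow(g, x, p)                                   # public key y = g^x
m = pow(g, 123, p)                                 # message hash mapped into the subgroup (toy)
z = pow(m, x, p)                                   # undeniable signature z = m^x

# Confirmation: verifier's blinded challenge, signer's response, verifier's check.
a, b = secrets.randbelow(q - 1) + 1, secrets.randbelow(q - 1) + 1
ch = (pow(z, a, p) * pow(y, b, p)) % p             # challenge ch = z^a * y^b
rsp = pow(ch, pow(x, -1, q), p)                    # response ch^(x^-1); needs the private key
assert rsp == (pow(m, a, p) * pow(g, b, p)) % p    # valid signature is confirmed
```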

3 Undeniable Property Analysis
In this section, the undeniable properties of the proposed scheme are analyzed. In Theorems 1 and 2, we show the correctness of our disavowal protocol.
Definition 1: The valid multi-signature and the invalid multi-signature on the message m are defined as follows. X = ∏_{j=1}^{n} x_j (mod p-1) contains the private keys of all signers, and mh is the hash result on m.
- Valid multi-signature: R = mh^{∏_{j=1}^{n} k_j} (mod p), ∏_{j=1}^{n} k_j(mh + s_j) = R^n · X (mod p-1)
- Invalid multi-signature: R' = mh^{∏_{j=1}^{n} k_j} (mod p), ∏_{j=1}^{n} k_j(mh + s_j) ≠ R'^n · X (mod p-1). We also define X' satisfying the equation ∏_{j=1}^{n} k_j(mh + s_j) = R'^n · X' (mod p-1).

Theorem 1: The proposed disavowal protocol can identify that signers have computed an invalid response on a valid multi-signature.
Proof: If more than one signer has cheated during the multi-signature confirmation protocol, X^{-1} is modified. We assume that X^{-1} is the valid inverse of X and X'^{-1} is the invalid inverse of X. The verifier computes the challenge ch = R^{S·a} · Y^{R^n·b} (mod p). If more than one signer computes the response improperly on the valid multi-signature, the response made by all signers becomes rsp_n = ch^{X'^{-1}} (mod p). Since the response rsp_n is not equal to mh^{R^n·a} · g^{R^n·b} (mod p), the verifier launches the disavowal protocol with a new challenge value (c, d) as follows:

ch' = R^{S·c} · Y^{R^n·d} (mod p),  rsp'_n = ch'^{X'^{-1}} (mod p)

Since the second response rsp'_n is not equal to mh^{R^n·c} · g^{R^n·d} (mod p), the verifier evaluates the discrimination equations.


From the above equations, R1 is not equal to R2. Therefore, we prove the correctness of the proposed disavowal protocol in the case where more than one signer has cheated on a valid multi-signature. Q.E.D.

Theorem 2: The proposed disavowal protocol can identify that the multi-signature is invalid.
Proof: The first challenge and response on the invalid multi-signature are as follows.

The second challenge and the response on the invalid multi-signature are as follows.

The verifier evaluates the following discrimination equations; in this case, R1 is equal to R2.

Therefore, we prove the correctness of the proposed disavowal protocol in the case where the multi-signature is invalid. Q.E.D.

4 Conclusion

Many authors can jointly participate in the authoring of digital multimedia contents. In this case, the copyright of the digital contents must be shared by all participants. In this paper, we propose an undeniable digital copyright protection scheme based on an undeniable multi-signature scheme, and the undeniable property of the proposed scheme is proved. For on-line sales of digital multimedia contents, a customer can buy them by launching the multi-signature confirmation protocol. Without the consent of all authors, the customer cannot buy the digital contents. Especially, in case of a dispute between authors, the proposed disavowal protocol can discriminate whether authors have cheated or the digital copyright is invalid.

Acknowledgment This work is supported by KISA and University IT Research Center(ITRC) Project from Korea.

References
[1] T. Elgamal, "A Public Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms," IEEE Transactions on Information Theory, Vol. IT-31, No. 4, pp. 469-472, 1985.
[2] D. Chaum, "Undeniable Signatures," Advances in Cryptology, Proceedings of CRYPTO'89, Springer-Verlag, pp. 212-216, 1990.
[3] L. Harn, "(t,n) Threshold Signature and Digital Multisignature," Workshop on Cryptography & Data Security, pp. 61-73, 1993.
[4] S. H. Yun, T. Y. Kim, "A Digital Multisignature Scheme Suitable for EDI Message," Proceedings of 11th International Conference on Information Networking, pp. 9B3.1-9B3.6, 1997.
[5] S. H. Yun, T. Y. Kim, "Convertible Undeniable Signature Scheme," Proceedings of IEEE High Performance Computing ASIA'97, pp. 700-703, 1997.
[6] S. H. Yun, S. J. Lee, "An electronic voting scheme based on undeniable signature scheme," Proceedings of IEEE 37th Carnahan Conference on Security Technology, pp. 163-167, 2003.
[7] Andre Adelsbach, Birgit Pfitzmann, Ahmad-Reza Sadeghi, "Proving Ownership of Digital Content," 3rd International Information Hiding Workshop (IHW '99), LNCS 1768, Springer-Verlag, pp. 117-133, 1999.


Lecture Series on Computer and Computational Sciences, Volume 1, 2004, pp. 923-928

Field Test-based Propagation Path-Loss Models for Terrestrial ATSC Digital TV
Seung youn Lee 1, Myong chul Shin 1, Sang yule Choi 2, Seung won Kim 3, Jae sang Cha 4
1 SungKyunKwan Univ., Department of Information & Communication Eng., Kyonggi, 440-746, Korea
2 Induk Institute of Technology, San 76, Wolgye-dong, Nowon-Gu, Seoul 139-749, Korea
3 Radio and Broadcasting Research Lab., ETRI, Daejeon, 305-350, Korea
4 SeoKyeong Univ., Dept. of Information & Communication Eng., Seoul, 136-704, Korea
Abstract: In this paper, we propose propagation path-loss models for terrestrial ATSC (Advanced Television Systems Committee) DTV (Digital Television) using the results of ATSC DTV field-test measurements carried out in Seoul, Korea in 2001. Numerical formulae of the proposed path-loss models are derived from the measured values of received field strength. The newly proposed path-loss models agree more closely with the field-measured results than conventional path-loss models (e.g., the Free space model, Friss model, Hata model, 2-Ray model, ITU-R 526-3 model and Lee model) in the LOS (Line of Sight) and Non-LOS areas. The path-loss models presented in this paper can be readily utilized for efficient ATSC DTV system implementation requiring accurate link-budget calculation.
Keywords: ATSC DTV, path-loss model, link margin, link budget.

1. Introduction

The development of ATSC DTV technology has been studied actively in various areas of the world. In comparison with conventional NTSC (National Television System Committee) TV, ATSC DTV can implement HDTV (high definition television) broadcasting easily, with a robust signal propagation property. From the results of ATSC DTV field measurements [2][3], we confirmed that the propagation distance, in the sense of the cell-coverage-radius concept of mobile communication, can reach farther than 100 km when the transmitted output power is 1 kW. Since one of the most important characteristics of the ATSC DTV signal is its longer propagation distance (i.e., longer cell coverage) than that of cellular mobile communication or NTSC TV, the path-loss model, which describes the large-scale channel characteristics, is one of the most important components to be considered in system design or performance analysis. However, no path-loss model for terrestrial ATSC DTV has been proposed until now, because the DTV concept itself appeared only recently. Thus, in this paper, we show that a large deviation exists between the measured results and conventional path-loss models [4]-[8]. Moreover, we propose new propagation path-loss models for terrestrial ATSC DTV using the results of ATSC DTV field measurements, as shown in Table 1.

2. Proposed Field Test-based Path-loss Models for Terrestrial ATSC DTV
As shown in Figs. 1-4, we confirmed that a large deviation exists in the LOS and Non-LOS areas between the field-measured results and the theoretical values of conventional path-loss models [4]-[8], where the conventional path-loss models compared are those suited to NTSC TV or mobile communications. Table 1 summarizes the new path-loss models proposed in this paper. As shown in Figs. 1-4, the newly proposed path-loss models fit the field-measured results better than the conventional path-loss models in the LOS and Non-LOS areas.

2.1 Proposed LOS model
4 Corresponding author: Dept. of Information & Communication Engineering, Seokyeong Univ., E-mail: [email protected]


Measured path loss is compared with existing propagation models such as the Free space model, Friss model, Hata model and 2-Ray model. Fig. 1 shows the compared result. The Free space model is similar to the average of the measured path loss. The Friss model has a low average value of path loss, and the changing pattern of the Hata model is quite different [4][8]. There is a difference between the 2-Ray model and the measured path loss in numerical value, but both of them have a similar changing pattern. Therefore, by using this similarity of the changing pattern, we made our suggested LOS model based on various revisions of the 2-Ray model.

Table 1: Equations of path-loss models proposed in this paper

LOS area
- Basic Eq.: PL[dB] = 40 log(d[km]) - {10 log(Gt) + 10 log(Gr) + 20 log(ht) + 20 log(hr)} + 120 + a(k)
- Suburban Eq.: a(k) of Basic Eq. = 22.17 dB
- Urban Eq.: a(k) of Basic Eq. = 37.32 dB
- Massed Urban Eq.: a(k) of Basic Eq. = 55.58 dB

Non-LOS area
- Single knife-edge diffraction model:
  v = h * sqrt( (2/lambda) * (1/d1 + 1/d2) )
  L = 0 (dB)                                                 for v <= -1
  L = 20 log(0.5 - 0.62 v) (dB)                              for -1 <= v <= 0
  L = 20 log(0.5 e^(-0.95 v)) (dB)                           for 0 <= v <= 1
  L = 20 log(0.4 - sqrt(0.1184 - (0.1 v - 0.38)^2)) (dB)     for 1 <= v <= 2.4
  L = 20 log(0.225 / v) (dB)                                 for v >= 2.4
- Multiple knife-edge diffraction model:
  Ld = sum_{i=1}^{N} L'_i + L''(wx)_1 + sum_{i=1}^{N} L''(yz)_i - 10 log C_N (dB)

Pr: received power; Pt: transmitted power; ht: height of transmitter (Tx); hr: height of receiver (Rx); Gt: Tx antenna gain; Gr: Rx antenna gain; lambda: wavelength; d (m): distance between Tx and Rx; f: frequency; h: (height of knife-edge obstacle) - (height of the line between Tx and Rx antennas); d1: distance between Tx antenna and knife-edge obstacle; d2: distance between knife-edge obstacle and Rx antenna; L'_i: diffraction loss over the i-th cylinder (dB); L''(wx)_1: sub-path diffraction loss for the section between points w and x for the first cylinder; L''(yz)_i: sub-path diffraction loss for the section between points y and z for all cylinders; C_N: correction factor to account for spreading loss of diffraction over successive cylinders.

The conventional 2-Ray model is a path-loss model that considers the direct wave as well as the reflected wave. When the receiver is at distance d from the transmitter, the path-loss equation is as follows:

PL(dB) = 40 log d - (10 log Gt + 10 log Gr + 20 log ht + 20 log hr)

where d is the distance from transmitter to receiver (m), Gt the transmitter antenna gain, Gr the receiver antenna gain, ht the height of the transmitter antenna, and hr the height of the receiver antenna. The numerical formula of the conventional 2-Ray model considers only the distance and the heights of the transmitter and receiver; frequency and other environmental factors are left out of consideration. Based on the basic equation, various new LOS models are proposed, which can be applied to suburban, urban, and massed urban areas. According to the regional characteristics, we followed a process in which various average constants, calculated with consideration of the obstacles of each regional group, are added to the basic equation.
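As a concrete reading of the LOS entries of Table 1, the sketch below evaluates the basic equation with the region-dependent constant a(k); the function name, the use of linear antenna gains and metre antenna heights are our assumptions for illustration.

```python
import math

AREA_CORRECTION_DB = {"suburban": 22.17, "urban": 37.32, "massed_urban": 55.58}  # a(k) from Table 1

def proposed_los_path_loss_db(d_km, g_t, g_r, h_t, h_r, area):
    """Basic Eq. of Table 1: 2-Ray form with d in km (hence the +120 dB term) plus a(k)."""
    return (40 * math.log10(d_km)
            - (10 * math.log10(g_t) + 10 * math.log10(g_r)
               + 20 * math.log10(h_t) + 20 * math.log10(h_r))
            + 120 + AREA_CORRECTION_DB[area])

# Example: 30 km link, unity antenna gains, 100 m transmit / 10 m receive antenna heights, urban area.
print(round(proposed_los_path_loss_db(30, 1, 1, 100, 10, "urban"), 2))
```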


Figure 1. Comparison of the measured values and the theoretical values of the path-loss models in the LOS areas.
The criteria for the classification of regional groups are defined as follows in this paper:
- Massed urban: a downtown area where high-rise apartment buildings and skyscrapers higher than 15 stories are clustered close together around the measurement area.
- Urban: a downtown area where lower apartment buildings and residential buildings lower than 15 stories are scattered around the measurement area.
- Suburban: a suburb where buildings higher than 3 stories seldom come into sight around the measurement area.
Fig. 1 shows the comparison between the measured values and the theoretical values of the path-loss models in the LOS areas. In Fig. 1, we can see that the path-loss pattern of the proposed model is closer to the actual measured result than that of the other conventional path-loss models. Fig. 2 compares the standard deviation, with respect to the measured path loss, of each propagation model. The standard deviation of the conventional path-loss models lies between 8.43 dB and 13.52 dB, whereas the standard deviation of the model suggested in this paper is only 5.45 dB. So we can confirm that the model suggested in this paper corresponds to the measured result better than the conventional path-loss models.


Figure 2. Standard deviation with respect to the measured values for the propagation models in the LOS areas.

2.2 Proposed Non-LOS model
The Non-LOS area is the area where the wave does not arrive directly but only after being diffracted, because there are obstacles between receiver and transmitter. Particularly in Korea, which has many mountainous regions, we can easily find many Non-LOS regions. In this paper, the measured path-loss data were compared to conventional Non-LOS models such as the ITU-R 526-3 model [5] and the Lee model [7].



Figure 3. Comparison of the measured values and the theoretical values of the path-loss models in the Non-LOS areas.
As shown in Fig. 3, the changing patterns of the Lee model and the ITU-R 526-3 model are similar to that of the measured path loss. The reason why they do not match completely is that only topographical information is considered, without factors like artificial constructions. The path loss of the Lee model fluctuates a lot; on the other hand, the path loss of the ITU-R 526-3 model is similar to the measured path loss and changes little compared to the Lee model. In this paper, we present a Non-LOS model obtained by altering the distance element in the Lee model, in order to reflect the fact that the terrestrial DTV transmitter antennas in this country are located much higher than others. The numerical equation of the presented Non-LOS model corresponds to the Lee model [7]; however, the distance factors of the presented model are obtained by a different method compared to the conventional Lee model [7]. In the presented Non-LOS model, the distance factors are obtained by the following method: draw a straight line between transmitter and receiver, and then draw an orthogonal line from the top of the obstacle to the line connecting transmitter and receiver. Let the effective distances d1 and d2 be the lengths from the orthogonal point to the transmitter and receiver, respectively, and let the effective height h be the length from the orthogonal point to the top of the obstacle.
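Using the effective height h and effective distances d1, d2 defined above, the single knife-edge entries of Table 1 can be evaluated directly; the sketch below is an illustration with our own function names, and it keeps the table's sign convention (20 log of a factor below one, i.e. a negative dB value representing attenuation).

```python
import math

def knife_edge_v(h, d1, d2, wavelength):
    """Diffraction parameter v = h * sqrt((2/lambda) * (1/d1 + 1/d2)); lengths in metres."""
    return h * math.sqrt((2.0 / wavelength) * (1.0 / d1 + 1.0 / d2))

def knife_edge_term_db(v):
    """Piecewise single knife-edge diffraction term of Table 1 as a function of v."""
    if v <= -1:
        return 0.0
    if v <= 0:
        return 20 * math.log10(0.5 - 0.62 * v)
    if v <= 1:
        return 20 * math.log10(0.5 * math.exp(-0.95 * v))
    if v <= 2.4:
        return 20 * math.log10(0.4 - math.sqrt(0.1184 - (0.1 * v - 0.38) ** 2))
    return 20 * math.log10(0.225 / v)
```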


- If J_new > Threshold, then go to Procedure 1 and repeat Procedure 2;
- If J_new <= Threshold, then terminate.

Algorithm for Vector Channel Parameter and Frequency Offset Estimation


End of Procedure 2

C. Efficient Estimation Algorithm
The major computational load for the channel estimation comes from the eigen-decompositions of Q1(θ, Δf) corresponding to θ. On the other hand, in order to find the calibration vector d, an eigen-decomposition or matrix inverse of Q2 is needed as well as a few matrix-vector manipulations. For each θ iteration, O(N^3) flops are needed for the eigen-decomposition of Q1(θ, Δf) and O(M^3) flops for the eigen-decomposition or the inverse of Q2 [9]. This amount of complexity is tolerable since the calibration process could be done once in a while. However, if we assume that the curve obtained by plotting the minimum eigenvalue λ_min(Q1(θ, Δf)) for each θ is smooth, then we can find a fast algorithm for computing v_min(Q1(θ, Δf)), the eigenvector associated with the minimum eigenvalue of Q1(θ, Δf). The power method applied to Q1(θ, Δf)^{-1} is one such fast method [9]. However, direct application of the power method is not desirable, since it requires the matrix inverse of Q1(θ, Δf). Suppose that the vectors v_i and v_{i+1} are the eigenvectors associated with the minimum eigenvalues of Q1(θ_i, Δf) and Q1(θ_{i+1}, Δf), respectively. By using the assumption of smoothness of the curve, we may write the relationship of the two vectors as

Q1(θ_{i+1}, Δf) v_{i+1} ≈ λ_min v_i.   (11)

Suppose that v_i and Q1(θ_{i+1}, Δf) are already obtained; then v_{i+1} can be found by solving the linear equation in (11). To obtain a fast adaptive solution, we use the Gauss-Seidel iteration for solving the linear equation, and we summarize the procedure as follows:

Procedure 3
A. Initialization
1. θ_0 = 0;
2. θ = θ_0;
3. λ_0 = minimum eigenvalue of Q1(θ, Δf);
4. v_old = eigenvector corresponding to λ_0.
B. Loop
1. for (k = 1 : Maximum Iteration)
2.   for (i = 1 : N)
       v_i^{k+1} = ( v_i^{old} - Σ_{j=1}^{i-1} a_{i,j} v_j^{k+1} - Σ_{j=i+1}^{N} a_{i,j} v_j^{k} ) / a_{i,i}
     where v_i^{old} and a_{i,j} are the i-th and (i, j) elements of v_old and Q1(θ, Δf), respectively.
C. Termination Check
If θ ≤ θ_max, the maximum angle, then compute the following:
1. θ = θ + Δθ, where Δθ is the increment in angle;
2. compute Q1(θ, Δf);
4. go to Procedure [B].
If θ > θ_max, then terminate.
End of Procedure 3

In order to compare the computational cost between the algorithm based on eigen-decomposition and the one based on Gauss-Seidel, we examine the number of multiplications in computing Q1(θ, Δf) for each θ. The number of multiplications required to obtain the matrix is expressed as follows:

T(n) = 4N n^2 + (2N^2 + 2N - 4NK) n - 2(N^2 + N) K   (12)

where n = NM and K is the dimension of the noise subspace in R_zz. By setting K to be 90% of n, we plot the number of multiplications according to the number of antennas and N, the length of the code sequence, in Figure 2.
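The inner step of Procedure 3 [B] amounts to one Gauss-Seidel solve of the linear system in (11) followed by normalisation. The sketch below is a minimal illustration under the assumption that Q1(θ, Δf) is available as an N×N complex matrix; the function and variable names are ours, not the authors'.

```python
import numpy as np

def gauss_seidel_track(Q1, v_old, n_sweeps=20):
    """Track the minimum-eigenvalue eigenvector of Q1 by Gauss-Seidel sweeps on Q1 v = v_old."""
    n = Q1.shape[0]
    v = v_old.astype(complex).copy()
    for _ in range(n_sweeps):
        for i in range(n):
            sigma = Q1[i, :i] @ v[:i] + Q1[i, i + 1:] @ v[i + 1:]
            v[i] = (v_old[i] - sigma) / Q1[i, i]
    return v / np.linalg.norm(v)

# Assumed usage for one step of the angle sweep:
# v_new = gauss_seidel_track(Q1_next, v_prev)
# lam_min_estimate = np.real(np.vdot(v_new, Q1_next @ v_new))   # Rayleigh quotient
```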


Figure 2: Number of multiplications in computing Q1(θ, Δf).
By simple evaluation of the Gauss-Seidel algorithm in [C], we can find that the algorithm requires N^2 multiplications for a single iteration k. This amount of computation is fairly small, since 12N^3 multiplications are required to perform the eigen-decomposition of Q1(θ, Δf). By adding these numbers to (12), we plot the number of multiplications required for a single angle θ and for the total angle range θ_0 ≤ θ ≤ θ_max in Figures 3 and 4. We can observe that the fast algorithm works with a fairly small amount of computation.
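A small sketch of this cost comparison, combining Eq. (12) with the 12N^3 eigen-decomposition cost and the N^2 per Gauss-Seidel sweep quoted above (K = 0.9 n and 20 sweeps are the assumptions used in this section; names ours):

```python
def matrix_formation_mults(N, M, K_ratio=0.9):
    """Eq. (12): multiplications to form Q1(theta, df), with n = N*M and K = K_ratio * n."""
    n = N * M
    K = int(K_ratio * n)
    return 4 * N * n**2 + (2 * N**2 + 2 * N - 4 * N * K) * n - 2 * (N**2 + N) * K

def mults_per_angle_eig(N, M):
    """Matrix formation plus a full eigen-decomposition (about 12 N^3 multiplications)."""
    return matrix_formation_mults(N, M) + 12 * N**3

def mults_per_angle_gs(N, M, sweeps=20):
    """Matrix formation plus Gauss-Seidel sweeps (about N^2 multiplications per sweep)."""
    return matrix_formation_mults(N, M) + sweeps * N**2

print(mults_per_angle_eig(31, 6), mults_per_angle_gs(31, 6))
```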

4 EXPERIMENTAL RESULTS
The performance of the proposed algorithm with either one of the constraints in (10) is similar, and therefore the result for the case of d(1) = 1 is presented in this chapter. We use a uniform circular array with six antennas separated by half a wavelength. We use random BPSK-modulated data streams and Gold codes with a processing gain of N = 31. We assume that 15 users (K = 15) each produce two multi-path signals (Lk = 2). For simplicity, we consider the azimuthal angle only. The DOAs of the reference user are assumed to be [40°, 85°] and the delays to be [19.3, 25.3]


Figure 3: Number of multiplications using antenna number = 4

Figure 4: Number of multiplications using antenna number = 6


chips. The DOAs and delays of the rest of the users are randomly generated in [0, 180] and [0, 31], respectively. The frequency offset Δf is assumed to be 0.1. We assume that the number of multi-path signals of the reference user is known. We use 400 observation symbols with no over-sampling. The signal-to-noise ratio (SNR) is assumed to be 20 dB, and the total power of an interfering user is twice that of the reference user. The gain-phase α_m e^{jψ_m} of each antenna is chosen as

α_m = 1 + sqrt(12) σ_α u_m,   ψ_m = sqrt(12) σ_ψ u_ψ

where u_m and u_ψ are uniformly distributed in [-0.5, 0.5], and σ_α = 0.2, σ_ψ = 20°. By fixing the iteration number in the Gauss-Seidel algorithm to 20, we plot the minimum eigenvalues. The result obtained when the frequency offset is not compensated is shown in Figure 5 and Figure 6.
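A minimal sketch of how these gain-phase perturbations could be drawn in a simulation; the complex form α_m e^{jψ_m} returned as a single vector, and the function name, are our assumptions.

```python
import numpy as np

def gain_phase_errors(m, sigma_alpha=0.2, sigma_psi_deg=20.0, rng=None):
    """Per-antenna gains alpha_m = 1 + sqrt(12)*sigma_alpha*u and phases psi_m = sqrt(12)*sigma_psi*u."""
    rng = np.random.default_rng() if rng is None else rng
    u_alpha = rng.uniform(-0.5, 0.5, m)
    u_psi = rng.uniform(-0.5, 0.5, m)
    alpha = 1.0 + np.sqrt(12.0) * sigma_alpha * u_alpha
    psi = np.sqrt(12.0) * np.deg2rad(sigma_psi_deg) * u_psi
    return alpha * np.exp(1j * psi)        # complex gain-phase vector (one entry per antenna)

print(gain_phase_errors(6))
```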

Figure 5: Minimum eigenvalues obtained using the un-calibrated array with un-compensated frequency offset

Figure 6: Time delays obtained using the un-calibrated array with un-compensated frequency offset
The result obtained when the frequency offset is compensated but the array is not calibrated is shown in Figure 7 and Figure 8. The result obtained when the array is calibrated and the frequency offset is compensated is shown in Figure 9 and Figure 10.


Figure 7: Minimum eigenvalues obtained using the un-calibrated array with frequency-offset compensation

Figure 8: Time delays obtained using the un-calibrated array with frequency-offset compensation

Figure 9: Minimum eigenvalues obtained using the calibrated array with frequency-offset compensation


Figure 10: Time delays obtained using the calibrated array with frequency-offset compensation

In these figures, we observe that when the frequency offset is not compensated, we cannot estimate the channel parameters in either space or time, nor the calibration vector d. The convergence behavior of the DOA according to iteration is shown in Figure 11.

Figure 11: DOA estimation vs. iteration

Figure 12 shows the normalized calibration error defined as ||d_j - d_t||_2 / ||d_t||_2, where d_j and d_t are the gain-phase vector at the j-th iteration and the true gain-phase vector, respectively. From these figures, we observe that the proposed algorithm performs well even when there is a carrier offset error. Finally, we present the results obtained by setting the iteration number (maximum iteration) in the Gauss-Seidel algorithm to 2 and 20, respectively. The minimum eigenvalues at the first iteration and the fifth iteration are shown in Figure 13 and Figure 14. The estimated time delays obtained using 2 iterations are shown in Figure 15. From the results shown in the figures, we verify that the proposed fast algorithm performs well even with a small number of iterations in the Gauss-Seidel algorithm.


Figure 12: Normalized calibration error vs. iteration

Figure 13: Minimum eigenvalues obtained at first iteration


Figure 14: Minimum eigenvalues obtained at fifth iteration



5

Conclusion

In this paper, we presented a numerically efficient and stable estimation algorithm that can be used in Spread spectrum system in which frequency offset error exists. The proposed algorithm is not dependent on the structure of data and array geometry and requires only binary code sequence of an arbitrary reference. By using the binary sequence, the algorithm provides us with vector channel estimates of the frequency offset, the DOA and delay of multi-path signals. The efficient algorithm is based on Gaus&.Seidal algorithm rather than using eigen-decomposition or SVD in computing eigen-values and eigen-vectors at each iteration. The algorithm is based on the two step procedures, one for estimating both channel and frequency offset and the other for estimating the unknown array gain and phase. Consequently, estimates of the DOAs, the multipath impulse response of the reference signal source, and the carrier frequency offset as well as the calibration of antenna array are provided. The performance of the proposed algorithm is investigated by means of computer simulations. The analytic and simulation results reveals that proposed algorithm is reduces the number of multiplications by order of one.

Acknowledgment This work was supported in part by University IT Research Center Project (INHA UWB-ITRC) in Korea

References [1) G. Tsoulos and M. Beach, "Calibration and linearity issues for an adaptive antenna system," In Proc. 47th IEEE Vehicular Technology Conference, pp. 1597 -1600 Volume: 3 , May 1997. [2) B. Friedlander and A. Weiss, "Eigenstructure Methods for Direction Finding with Sensor Gain and Phase Uncertainties," Proc. IEEE ICASSP, pp. 2681-2684, January 1988. [3) A. Paulraj, T . Shan, V. Reddy and T. Kailath, "A subspace approach to determining sensor gain and phase with applications to array processing" , In SPIE, Adv. Algorithms Architectures Signal Process. , , vol. 696 , pp 102-109, Aug. 1986.


[4] V. Soon, L. Tong, Y. Huang, and R. Liu, "A subspace method for estimating sensor gains and phase", IEEE Trans. Signal Processing, , vol. 42 , pp 973-976, Apr. 1994. [5] D. Fuhrmann, "Estimation of sensor gain and phase", IEEE Trans. Singal Processing, vol. 42, pp 77-87, 1994 [6] M. Hebley and P. Taylor, "The effect of diversity on a burst-mode carrier-frequency estimator in the frequency selective mulipath channel", IEEE Tras. Comm., Vol. 46, pp 553- 560, Apr. 1998. [7] M. Morelli and U. Mengali, "Carrier frequency estimation for transmissions over selective channels", IEEE Tras. Comm., Vol. 48, pp 1580 - 1589, Sep. 2000. [8] M. Eric, S. Parkvall, M. Dukic and M. Obradovic, "An Algorithm For Joint Direction Of Arrival, Time-Delay and Frequency-Shift Estimation in Asynchronous DS-CDMA Systems", 1998 IEEE 5th International Symposium on Spread Spectrum Techniques and Applications, Vol. 2, pp 595 -598, Sep. 1998. [9] G. Golub and C. Van Loan, Matrix Computations, Baltimore, Johns Hopkins Uni. Press, 1996 [10] S. Bensley and Behnaam Aazhang, "Subspace-Based Channel Estimation for Code Division Multiple Access Communication Systems", IEEE Tras. Comm., Vol. 44, pp 1009- 1020, Aug. 1996. [11] A. Graham, Kronecker Products and Marix Calculus: with Applications, New York, John Wiley & Sons, 1981 [12] D. Astely, A. Lee Swindlehurst and Bjorn Ottersten, "Spatial Signature Estimation for Uniform Linear Arrays with Unknown Receiver Gains and Phases", IEEE Trans. Signal Processing, Vol. 47, NO. 8, pp 2128 - 2138, Aug. 1999. [13] C. H. Lee, S. Kim and J. Chun, "An Online Calibration Algorithm for the CDMA based Adaptive Antenna Array," in Proc. 34th Asilomar Conf., Pacific Grove, CA, Oct. 2000. [14] S. Kobayakawa, M. Tsutsui, and Y. Tanaka, "A Blind Calibration Method for an Adaptive Array Antenna in DS-CDMA Systems Using an MMSE Algorithm," Proc. IEEE VTC, May 2000. [15] Ying-Chang Liang and Francois P.S. Chin, " Coherent LMS algorithms," IEEE Signal Proc. letter, pp. 92-94, March 2000. [16] Simon Haykin, Adaptive Filter Theory, 3rd ed., Prentice-Hall, Englewood CLiffs, N.J. 1996.


Lecture Series on Computer and Computational Sciences, Volume 1, 2004, pp. 952-954

Security Requirements of Development Site
Eunser Lee 1, Sunmyoung Hwang 2
1 Chung-Ang University, 221, Huksuk-Dong, Dongjak-Gu, Seoul, Korea
2 Daejeon University, 96-3 Yongun-dong, Tong-gu, Taejon 300-716, South Korea

Abstract: IT products such as firewalls, IDS (Intrusion Detection System) and VPN (Virtual Private Network) are made to perform special functions related to security, so the developers of these products or systems should consider many security-related issues, not only in the design itself but also in the development environment, to protect the integrity of the products. When we are making these kinds of software products, ISO/IEC TR 15504 may provide a framework for the assessment of software processes, and this framework can be used by organizations involved in planning, monitoring, controlling, and improving the acquisition, supply, development, operation, evolution and support of software. But in ISO/IEC TR 15504, the considerations for security are relatively weak compared to other security-related criteria such as ISO/IEC 21827 or ISO/IEC 15408. In fact, security related to software development is concerned with many kinds of measures that may be applied to the development environment or the developers to protect the confidentiality and integrity of the IT product or system being developed. This paper proposes some measures related to development process security by analyzing ISO/IEC 21827, the Systems Security Engineering Capability Maturity Model (SSE-CMM), and ISO/IEC 15408, the Common Criteria (CC), and we present a security process for ISO/IEC TR 15504.

Keywords: Site Security, Software Development, Common Criteria, Security Process.

1. Introduction
ISO/IEC TR 15504, the Software Process Improvement and Capability Determination (SPICE), provides a framework for the assessment of software processes [1]. This framework can be used by organizations involved in planning, monitoring, controlling, and improving the acquisition, supply, development, operation, evolution and support of software. But in ISO/IEC TR 15504, the considerations for security are relatively weak compared to others; for example, considerations for security related to software development and developers are lacking. When we are making these kinds of software products, ISO/IEC TR 15504 may provide a framework for the assessment of software processes, and this framework can be used by organizations involved in planning, monitoring, controlling, and improving the acquisition, supply, development, operation, evolution and support of software. But in ISO/IEC TR 15504, the considerations for security are relatively weak compared to other security-related criteria such as ISO/IEC 21827 or ISO/IEC 15408 [2-3]. In fact, security related to software development is concerned with many kinds of measures that may be applied to the development environment or the developers to protect the confidentiality and integrity of the IT product or system being developed. In this paper, we propose a process related to security by comparing ISO/IEC TR 15504 to ISO/IEC 21827 and ISO/IEC 15408. The proposed scheme may contribute to the improvement of security for IT products or systems. We also propose some measures related to development process security by analyzing ISO/IEC 21827, the Systems Security Engineering Capability Maturity Model (SSE-CMM), and ISO/IEC 15408, the Common Criteria (CC), and we present a security process for ISO/IEC TR 15504.

1 Chung-Ang University, 221, Huksuk-Dong, Dongjak-Gu, Seoul, Korea, E-mail: [email protected]
2 Daejeon University, 96-3 Yongun-dong, Tong-gu, Taejon 300-716, South Korea, E-mail: [email protected]


2. A New Process for Development Site Security

As an example, we want to deal with security for the site where the software is developed. In ISO/IEC TR 15504-5 there is the Engineering process category (ENG), which consists of processes that directly specify, implement or maintain the software product, its relation to the system and its customer documentation. In circumstances where the system is composed totally of software, the Engineering processes deal only with the construction and maintenance of such software. The processes belonging to the Engineering process category are ENG.1 (Development process), ENG.1.1 (System requirements analysis and design process), ENG.1.2 (Software requirements analysis process), ENG.1.3 (Software design process), ENG.1.4 (Software construction process), ENG.1.5 (Software integration process), ENG.1.6 (Software testing process), ENG.1.7 (System integration and testing process), and ENG.2 (Development process). These processes commonly contain the 52nd work product (Requirement specification), and some of them have the 51st, 53rd and 54th work products separately. Therefore, each process included in the ENG category may contain the condition 'Identify any security considerations/constraints'. But the phrase 'Identify any security considerations/constraints' applies to the 'software or hardware (possibly including firmware) development process' and not to the 'development site' itself. In this paper we will present a new process applicable to the software development site. In fact, the process we propose could be included in the MAN or ORG categories, but this is not the main point of this paper, and that will be future work.

We can find the requirements for development security in ISO/IEC 15408, as follows: Development security covers the physical, procedural, personnel, and other security measures used in the development environment. It includes physical security of the development location(s) and controls on the selection and hiring of development staff. Development security is concerned with physical, procedural, personnel, and other security measures that may be used in the development environment to protect the integrity of products. It is important that this requirement deals with measures to remove and reduce threats existing in the development site (not in the operation site). The contents of the phrase above are not perfect, but they suggest at least a guide for development site security.

The individual processes of ISO/IEC TR 15504 are described in terms of six components: Process Identifier, Process Name, Process Type, Process Purpose, Process Outcomes and Process Notes. The style guide in annex C of ISO/IEC TR 15504-2 provides guidelines which may be used when extending process definitions or defining new processes. The Development Security process we suggest is as follows.

(1) Process Identifier: ENG.3
(2) Process Name: Development Security process
(3) Process Type: New
(4) Process Purpose: The purpose of the Development Security process is to protect the confidentiality and integrity of the system components (such as hardware, software, firmware, manuals, operations and network, etc.) design and implementation in its development environment.
(5) Process Outcomes: As a result of successful implementation of the process:
- an access control strategy will be developed and released to manage records of entrance to and exit from the site, and of logon and logout of system components, according to the released strategy;
- roles, responsibilities, and accountabilities related to security are defined and released;
- training and education programs related to security are defined and followed;
- a security review strategy will be developed and documented to manage each change step.
(6) Base Practices:
ENG.3.BP.1: Develop physical measures. Develop and release the physical measures for protecting access to the development site and the product.


ENG.3.BP.2: Develop personnel measures. Develop and release the personnel measures for selecting and training staff.
ENG.3.BP.3: Develop procedural measures. Develop the strategy for processing changes of requirements with security in mind.

The ENG.3 Development Security process may have more base practices (BPs), but we think these BPs will be the base for future work. For the new process, some work products must be defined as soon as possible. The following items are the base for the definition of work products (a schematic encoding of the process definition is sketched after the table).

WP category number | WP category  | WP classification number | WP classification | WP type
1                  | ORGANIZATION | 1.1                      | Policy            | Access control to site and so on
                   |              | 1.2                      | Procedure         | Entrance and so on
                   |              | 1.3                      | Standard          | Coding and so on
                   |              | 1.4                      | Strategy          | Site open and so on
2                  | PROJECT      | Future work              | Future work       | Future work
3                  | RECORDS      | 3.1                      | Report            | Site log and so on
                   |              | 3.2                      | Record            | Entrance record and so on
                   |              | 3.3                      | Measure           | Future work
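As noted above, and purely for illustration, the proposed ENG.3 definition and its base practices could be captured in a machine-readable form along the following lines. The encoding, the field names and the use of Python are assumptions of this sketch; they are not part of ISO/IEC TR 15504 or of the proposal itself.

```python
# Hypothetical, simplified encoding of the proposed ENG.3 process (field names assumed).
eng3_process = {
    "process_identifier": "ENG.3",
    "process_name": "Development Security process",
    "process_type": "New",
    "purpose": ("Protect the confidentiality and integrity of the system components "
                "(hardware, software, firmware, manuals, operations, network) design "
                "and implementation in the development environment."),
    "outcomes": [
        "Access control strategy developed and released",
        "Security roles, responsibilities and accountabilities defined and released",
        "Security training and education programs defined and followed",
        "Security review strategy developed and documented for each change step",
    ],
    "base_practices": {
        "ENG.3.BP.1": "Develop physical measures",
        "ENG.3.BP.2": "Develop personnel measures",
        "ENG.3.BP.3": "Develop procedural measures",
    },
}
```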

3. Conclusion and Future Work

This paper proposed a new process applicable to the software development site. In fact, the process we proposed is not yet perfect, and research to improve it is ongoing; work on the expression of the Base Practices and the development of the Work Products should be continued. Nevertheless, the work in this paper may serve as the base for the consideration of security in ISO/IEC TR 15504. ISO/IEC TR 15504 provides a framework for the assessment of software processes, and this framework can be used by organizations involved in planning, monitoring, controlling, and improving the acquisition, supply, development, operation, evolution and support of software. Therefore, it is important to include considerations for security in the Process dimension. In this paper we did not include or explain any component for the Capability dimension, so the ENG.3 process we suggest may conform to capability level 2; more research effort is needed here. As the number of assessment cases using ISO/IEC TR 15504 increases, processes concerning security are needed and should be included in ISO/IEC TR 15504.

References
[1] ISO, ISO/IEC TR 15504: Information technology - Software process assessment (SPICE).
[2] ISO, ISO/IEC 21827: Information technology - Systems Security Engineering Capability Maturity Model (SSE-CMM).
[3] ISO, ISO/IEC 15408: Information technology - Security techniques - Evaluation criteria for IT security, 1999.
[4] Eun-ser Lee, Kyung-whan Lee, Tai-hoon Kim and Il-hong Jung: Introduction and Evaluation of Development System Security Process of ISO/IEC TR 15504, ICCSA 2004, LNCS 3043, Part 1, 2004.
[5] Tai-hoon Kim and Haeng-kon Kim: A Relationship between Security Engineering and Security Evaluation, ICCSA 2004, LNCS 3046, Part 4, 2004.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 955-957

Advances in Financial Forecasting

Dimitrios D. Thomakos 1
Department of Economics, School of Management and Economics, University of Peloponnese, GR-22100 Tripolis, Greece

Accepted: August 8, 2004

Abstract: This Symposium covers a variety of papers that present new methods, and applications of existing state-of-the-art methods, in three broad categories of financial forecasting. In the first category we have five papers that contribute to the theory and applications, to economic and financial data, of forecasting methods, filtering and smoothing. In the second category we have three papers that deal with the construction and evaluation of successful trading rules that can be used in real-life transactions. Finally, in the third category we have seven papers that cover different aspects of the theory and applications of dynamic macroeconomics, asset pricing and risk management.

Keywords: filtering, forecasting, trading methods, asset pricing, risk management.

1 Symposium organizer. E-mail: [email protected]

1. Summary

In the last decade or so the field of financial forecasting has become very active, both in producing new methods and in incorporating state-of-the-art methods from other fields. In this Symposium we have fifteen papers that cover a variety of methods and applications in financial forecasting. We classified them into three broad categories, based on their content and potential applicability. These categories are: methods and applications in forecasting; methods for and evaluation of trading rules; and dynamic macroeconomics, asset pricing and risk management.

The first category includes five papers. Brandl [3] considers the combination of machine learning methods, such as genetic algorithms and neural nets, and economic theory in improving exchange rate forecasts. Heidari [5] provides a comparative analysis of several alternative vector autoregressive models for forecasting inflation, including classical and Bayesian models. Rinderu et al. [11] consider the problem of the stability of the monetary transmission mechanism in a developing country and how it can be modeled in the context of a dynamical system. Shmilovici et al. [13] use the context tree algorithm of Rissanen, for compression and prediction of a time series, to analyze 12 pairs of international intra-day currency exchange rates. Finally, Thomakos [14] proposes a new method for filtering, smoothing and forecasting based on the use of a particular causal filter with time-varying, data-dependent and adaptable weights.

The second category includes three papers. Angelidis and Degiannakis [1] test the accuracy of parametric, non-parametric and semi-parametric methods in predicting the one-day-ahead Value-at-Risk (VaR) of perfectly diversified portfolios in three types of markets (stock exchanges, commodities and exchange rates), both for long and short trading positions. Bekiros and Georgoutsos [2] investigate the nonlinear predictability of technical trading rules based on a recurrent neural network as well as a neurofuzzy model, with corresponding trading rules and an application to the NASDAQ, NYSE and NIKKEI markets. Papadamou and Stephanides [9] explore the potential power of digital trading and present a new Matlab tool based on genetic algorithms, which specializes in parameter optimization of technical rules.

The third category includes seven papers. Hardouvelis and Malliaropulos [4] present evidence that the predictive ability of the yield spread for short-run inflation is related to its predictive ability for economic activity. Kayahan and Stengos [6] test the conditional version of the Sharpe-Lintner CAPM by adopting Local Maximum Likelihood as their nonparametric methodology. Kottaridi and Siourounis [7] consider an econometric framework where macroeconomic monetary volatility is linked to the probability distribution of liquidity shocks hitting an international investor, providing evidence for what is called "flight to quality". Koumbouros [8] proposes a new method that decomposes the overall market risk into parts reflecting long-run market uncertainty related to the dynamics of the present value of revisions in expectations about future asset-specific and market cash-flows and discount-rates. Papanastasopoulos and Benos [10] design a hybrid model to act as an early warning system to monitor changes in the credit quality of corporate obligors. Rompolis and Tzavalis [12] propose a new nonparametric approach to estimating the risk neutral density (RND) of asset prices or log-returns that exploits a relationship between the call and put prices and the conditional characteristic function of the asset price and the log-return. Finally, Wang et al. [15] examine the predictability of stock index returns (S&P500, S&P400 and Russell 2000) using short-term interest rates as predictors, and suggest that trading strategies based on their findings can be profitable.

Acknowledgment The organizer wishes to express his appreciation to Professor T. E. Simos for his encouragement, comments and recommendations in putting together this Symposium. Any errors or omissions are solely the responsibility of the organizer.

References [1] T. Angelidis and S. Degiannakis. "Modeling Risk in Three Markets: VaR Methods for Long and Short Trading Positions". [2] S. Bekiros and D. Georgoutsos. "Comparative evaluation of technical trading rules: neurofuzzy models vs. recurrent neural networks". [3] B. Brandl. "Machine Learning in Economic Forecasting and the Usefulness of Economic Theory: the Case of Exchange Rate Forecasts" . [4] G. A. Hardouvelis and D. Malliaropulos. "The Yield Spread as a Symmetric Predictor of Output and Inflation". [5] H. Heidari. "An Evaluation of Alternative VAR Models for Forecasting Inflation". [6] B. Kayahan and T. Stengos. "Testing of Capital Asset Pricing Model with Local Maximum Likelihood Method". [7] C. Kottaridi and G. Siourounis. "A Flight to Quality! International Capital Structure Under Foreign Liquidity Constraints".


[8] M. Koumbouros. "Temporary and Permanent Long-Run Asset-Specific and Market Risks in the Cross-Section of US Stock Returns". [9] S. Papadamou and G. Stephanides. "Improving Technical Trading Systems by Using a New Matlab based Genetic Algorithm Procedure". [10] G. Papanastasopoulos and A. Benos. "Extending the Merton Model: A Hybrid Approach to Assessing Credit Quality". [11] P.L. Rinderu, Gh. Gherghinescu and O.R. Gherghinescu. "Modeling the Stability of the Transmission Mechanism of Monetary Policy: A Case Study for Romania, 1993-2000". [12] L. Rompolis and E. Tzavalis. "Estimating Risk Neutral Densities of Asset Prices based on Risk Neutral Moments: An Edgeworth expansion approach". [13] A. Shmilovici, Y. Kahiri and S. Hauser. "Forecasting with a Universal Data Compression Algorithm: The Forex Market Case". [14] D. Thomakos. "Functional Filtering, Smoothing and Forecasting". [15] D. Thomakos, T. Wang and J. T. Wu. "Market Timing and Cap Rotation".


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 958-960

Modeling Risk in Three Markets: VaR Methods for Long and Short Trading Positions

Timotheos Angelidis 1, Stavros Degiannakis
Department of Banking and Financial Management, University of Piraeus, GR-185 34, Piraeus, Greece
Athens Laboratory of Business Administration, GR-166 71, Vouliagmeni, Greece
Department of Statistics, Athens University of Economics and Business, GR-104 34, Athens, Greece

Accepted: August 8, 2004

Abstract: The accuracy of parametric, non-parametric and semi-parametric methods in predicting the one-day-ahead Value-at-Risk (VaR) of perfectly diversified portfolios in three types of markets (stock exchanges, commodities and exchange rates) is investigated, both for long and short trading positions. The risk management techniques are designed to capture the main characteristics of asset returns, such as leptokurtosis and asymmetry of the distribution, volatility clustering, the asymmetric relationship between stock returns and conditional variance, and the power transformation of conditional variance. Based on backtesting measures and a loss function evaluation method, we find that modeling the main characteristics of asset returns produces accurate VaR forecasts. Especially for the high confidence levels, a risk manager must employ different volatility techniques in order to forecast the VaR for the two trading positions.

Keywords: Asymmetric Power ARCH model, Skewed-t Distribution, Value-at-Risk, Volatility Forecasting.

Mathematics Subject Classification: 62P20.

1 Corresponding Author. Tel.: +30-210-8964-736.

1. Empirical Investigation

Value-at-Risk (VaR) at a given probability level α is the predicted amount of financial loss of a portfolio over a given time horizon. Given that asset returns are not normally distributed, since they exhibit skewness and excess kurtosis, it is plausible to employ volatility forecasting techniques that accommodate these characteristics. The one-step-ahead volatility of daily returns is estimated by a set of Autoregressive Conditional Heteroskedasticity models (GARCH, EGARCH, TARCH and APARCH) under three distributional assumptions (Normal, Student-t and Skewed Student-t), by historical and filtered-historical simulation, and by the commonly used variance-covariance method. We aim to evaluate the predictive accuracy of the various models under a risk management framework. We employ a two-stage procedure to investigate the forecasting power of each volatility forecasting technique. In the first stage, two backtesting criteria (Kupiec (1995), Christoffersen (1998)) are implemented to test the statistical accuracy of the models, which serves as the final diagnostic check in order to judge the "quality" of the VaR forecasts. Moreover, the purpose of the



backtesting measures used is twofold. First, to test whether the average number of VaR violations 2 over an out-of-sample period is statistically equal to the expected one. It is important to note that the estimated VaR must neither overestimate nor underestimate the "true" but unobservable value: in the former case the financial institution does not use its capital efficiently, while in the latter case it cannot cover future losses. Second, given that an adequate model must widen the VaR forecasts during volatile periods and narrow them otherwise, it is necessary to examine whether the violations are also randomly distributed. However, in most cases there is more than one risk model that satisfies both backtesting measures, and therefore a risk manager cannot select a unique volatility forecasting technique. Hence, in order to achieve this goal, we compare the best performing models via a loss function, in an attempt to select one model among the various candidates. In the second stage, we employ standard forecast evaluation methods in order to examine whether the differences between models (which have converged sufficiently) are statistically significant. We focus on out-of-sample evaluation criteria because we believe that a model that may be inadequate according to some in-sample evaluation criterion can still yield "better" forecasts in an out-of-sample framework than a correctly specified model.

We generate out-of-sample VaR forecasts for two equity indices (S&P500, FTSE100), two commodities (Gold Bullion $/Troy Ounce, London Brent Crude Oil Index U$/BBL) and two exchange rates (US$ to Japanese ¥, US$ to UK £), obtained from Datastream for the period January 3rd 1989 to June 30th 2003. For all models, we use a rolling sample of 2000 observations in order to calculate the 95% and the 99% VaR(t+1|t) for long and short trading positions. Under the framework of the loss function approach, we evaluate all the models with p-value greater than 10% for both the unconditional and the conditional coverage tests. A high cut-off point is preferred in order to ensure that the successful risk management techniques will neither (a) statistically over- or underestimate the "true" VaR nor (b) generate clustered violations. With a smaller cut-off point, an incorrect model could not be easily rejected, which might turn out to be costly for a risk manager.

Table 1 summarizes the two-stage model selection procedure 3. In the first stage (columns 2 and 3) the models that have not been rejected by the statistical backtesting procedures are presented, while in the second stage (column 4) the volatility methods that are preferred over the others, based on the loss function approach, are exhibited. For example, in panel A, for the S&P500 index, the GARCH(1,1)-normal model achieves the smallest value of the loss function, while its forecasting accuracy is not statistically different from that of the EWMA, EGARCH(1,1) and APARCH(1,1) models with normally distributed innovations. Our study sheds light on volatility forecasting methods under a risk management framework, since it juxtaposes the performance of the most well known techniques for different markets (stock exchanges, commodities and exchange rates) and trading positions.

2 A violation occurs if the predicted VaR is not able to cover the realized loss.
3 Tables with detailed results are available upon request.

Although the normal distribution produces adequate one-day-ahead VaR forecasts at the 95% confidence level, models that parameterize the leverage effect for the conditional variance, the leptokurtosis and the asymmetry of the data forecast the VaR at the 99% confidence level accurately. Moreover, short trading positions should be modeled using volatility specifications different from those of portfolios with long trading positions. Specifically, more sophisticated techniques that accommodate the features of the financial time series are needed in order to calculate the one-day-ahead VaR. Brooks and Persand (2003) pointed out that models which do not allow for asymmetries either in the unconditional return distribution or in the volatility specification underestimate the "true" VaR. Giot and Laurent (2003) proposed the skewed Student-t distribution and pointed out that it performed better than the purely symmetric one, as it reproduced the characteristics of the empirical distribution more accurately. These views are confirmed for both confidence levels and trading positions, as most of the selected models parameterize these features. At the 95% confidence level, specifications with normally distributed errors achieve the lowest loss function values. In most cases, the techniques that produce the most accurate VaR predictions are the same for both long and short trading positions. On the other hand, the volatility specifications that parameterize the leverage effect for the conditional variance and the asymmetry of the innovations' distribution forecast the VaR at the 99% confidence level more adequately. However, the models that must be employed for modeling the short and the long trading positions are not the same. This finding is in contrast with that of Giot and Laurent (2003), who argued that the APARCH model based on the skewed Student-t distribution forecasts the VaR adequately for both trading positions. Finally, for the long position on the OIL index (95% VaR) and the short position on the GOLD index (99% VaR), there are no models that produce adequate VaR forecasts. Given that in these cases the models have been rejected by the conditional coverage test, there is evidence that clustered violations were generated.
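To make the first-stage backtesting concrete, the following is a minimal sketch of Kupiec's (1995) unconditional coverage (proportion-of-failures) test applied to hypothetical violation indicators. It is not the authors' implementation; the function name, the guard against degenerate cases and the example numbers are assumptions of this sketch.

```python
import numpy as np
from scipy.stats import chi2

def kupiec_pof(violations, p):
    """Kupiec (1995) proportion-of-failures test for unconditional coverage.
    violations : 0/1 array with 1 where the realized loss exceeded the predicted VaR.
    p          : nominal violation probability (0.05 for the 95% VaR, 0.01 for the 99%).
    Returns the likelihood-ratio statistic and its chi-square(1) p-value."""
    T = len(violations)
    N = int(np.sum(violations))
    eps = 1e-12                                   # guard against log(0) in edge cases
    pi = min(max(N / T, eps), 1.0 - eps)
    ll_null = (T - N) * np.log(1.0 - p) + N * np.log(p)
    ll_alt = (T - N) * np.log(1.0 - pi) + N * np.log(pi)
    lr = -2.0 * (ll_null - ll_alt)
    return lr, 1.0 - chi2.cdf(lr, df=1)

# Hypothetical check: 28 violations over 2000 out-of-sample days against a 1% VaR.
hits = np.zeros(2000)
hits[:28] = 1
lr_stat, p_value = kupiec_pof(hits, p=0.01)       # keep the model if p_value > 0.10
```

A conditional coverage test in the spirit of Christoffersen (1998) would additionally examine whether the violations are independently distributed over time, as described above.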

References
[1] Brooks, C., Persand, G., 2003. The effect of asymmetries on stock index return Value-at-Risk estimates. The Journal of Risk Finance, Winter, 29-42.
[2] Christoffersen, P., 1998. Evaluating interval forecasts. International Economic Review, 39, 841-862.
[3] Giot, P., Laurent, S., 2003. Value-at-Risk for Long and Short Trading Positions. Journal of Applied Econometrics, 18, 641-664.
[4] Kupiec, P.H., 1995. Techniques for Verifying the Accuracy of Risk Measurement Models. Journal of Derivatives, 3, 73-84.


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 961-963

Comparative Evaluation of Technical Trading Rules: Neurofuzzy Models vs. Recurrent Neural Networks

S. Bekiros 1 & D. Georgoutsos
Department of Accounting and Finance, Athens University of Economics and Business, 76 Patission str., 104 34 Athens, GREECE

Accepted: August 8, 2004

Abstract: This paper investigates the nonlinear predictability of technical trading rules based on a recurrent neural network as well as a neurofuzzy model. The efficiency of the trading strategies is assessed on the prediction of the direction of the market for NASDAQ, NYSE and NIKKEI returns. The sample extends over the period 2/8/1971 - 4/7/1998, while the sub-period 4/8/1998 - 2/5/2002 has been reserved for out-of-sample testing purposes. Our results suggest that, in the absence of trading costs, the return of the proposed neurofuzzy model is consistently superior to that of the recurrent neural model as well as of the buy & hold strategy for bear markets. On the other hand, we found that the buy & hold strategy in general produces higher returns than the neurofuzzy model or the neural network for bull periods. The proposed neurofuzzy model, which outperforms the neural network predictor, allows investors to earn significantly higher returns in bear markets.

Keywords: Technical trading rules, Neurofuzzy models, Neural networks

Mathematics Subject Classification: 89.65, 42.79, 47.52, 05.45

1 Corresponding author. E-mail: [email protected]

1. Extended Abstract

The recurrent network uses the tansig transfer function (G) in its hidden (recurrent) layer and the purelin function (S) in its output layer. The output y stands for the forecasted stock index return and is computed from past values x_{1,t}, ..., x_{n,t} of the stock index return.

In the neurofuzzy model, y also stands for the forecasted stock index return and x_{1,t}, ..., x_{n,t} are past values of the stock index return. The output is a combination of the rule consequents in which the w_i are the firing strengths of the membership function grades and c, d, h are coefficients from the general first-order Sugeno model of the form:

IF x is A AND y is B THEN z = h + c·x + d·y
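As an informal illustration of how such a first-order Sugeno output can be evaluated, the sketch below computes a firing-strength-weighted average of the rule consequents for two lagged-return inputs. The Gaussian membership functions, the number of rules and all numeric values are assumptions of this sketch; it is not the authors' fitted neurofuzzy model.

```python
import numpy as np

def gauss(x, center, spread):
    """Gaussian membership grade."""
    return np.exp(-0.5 * ((x - center) / spread) ** 2)

def sugeno_forecast(x, y, rules):
    """First-order Sugeno output for two inputs x and y.
    Each rule is ((cx, sx), (cy, sy), (h, c, d)): membership parameters for x and y
    and the consequent coefficients of z = h + c*x + d*y. The firing strength is
    w_i = mu_A(x) * mu_B(y); the output is the w-weighted average of the z_i."""
    w, z = [], []
    for (cx, sx), (cy, sy), (h, c, d) in rules:
        w.append(gauss(x, cx, sx) * gauss(y, cy, sy))
        z.append(h + c * x + d * y)
    w, z = np.asarray(w), np.asarray(z)
    return float(np.sum(w * z) / np.sum(w))

# Two illustrative rules on lagged returns x = r_{t-1}, y = r_{t-2}.
rules = [((-0.01, 0.02), (-0.01, 0.02), (0.0, 0.4, 0.2)),
         ((0.01, 0.02), (0.01, 0.02), (0.0, 0.1, 0.05))]
y_hat = sugeno_forecast(0.004, -0.002, rules)
```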

References
[1] Elman, J. L., 1990, "Finding structure in time," Cognitive Science, 14, 179-211.
[2] Jang, J., 1993, "Adaptive-Network-Based Fuzzy Inference Systems", IEEE Transactions on Systems, Man and Cybernetics, 23 (3), 665-685.
[3] Masters, T., 1993, "Advanced Algorithms for Neural Networks", John Wiley.
[4] Pesaran, M.H., and Timmermann, A., 1992, "A simple nonparametric test of predictive performance", Journal of Business and Economics Statistics, 10, 461-465.
[5] Sugeno, M., 1985, "Industrial applications of fuzzy control", Elsevier Science Publications Co.
[6] Sugeno, M., 1988, "Fuzzy Control", Nikkan Kougyou, Shinbunsha.

TABLE 1: Statistical results for the trading models. For each market (NASDAQ, NYSE, NIKKEI), architecture (RNN, NF) and sub-period (Bull, Bear), the table reports: Total Return, B&H Return, Sign Rate, PT test, MSE, Sharpe Ratio and Ideal Profit.

Notation: RNN: Recurrent Neural Network; NF: Neurofuzzy model; B&H Return: total return from the buy & hold strategy; Sign Rate: proportion of the cases in which the strategy predicts the correct sign of the market; PT test: Pesaran & Timmermann (1992) test; MSE: Mean Square Error; Sharpe Ratio: return from the strategy per unit of risk; Ideal Profit: compares the forecasting system return against the perfect forecaster.


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 964-966

Machine Learning in Economic Forecasting and the Usefulness of Economic Theory: the Case of Exchange Rate Forecasts

B. Brandl 1
Department of Government, Faculty of Business, Economics, and Computer Science, University of Vienna, A-1210 Vienna, Austria

Accepted: August 8, 2004

Abstract: This paper focuses on the integration of economic theory in a machine learning process for the purpose of exchange rate forecasting. Since the early 1980s the literature has stressed the weakness of economic theory in forecasting exchange rates. Consistent with these results, this paper asks how machine learning can increase the forecasting performance of theoretical models. Therefore, structural exchange rate models are implemented in a machine learning process as a framework in which, and among which, the possibilities of machine learning are exploited not only to identify further sources of influence but also to test alternative variables for the aggregates suggested by exchange rate theory, and thus to serve as a tool for model selection. The applied approach uses a Genetic Algorithm for model selection and Neural Networks for the generation of the forecasts. It is shown that this combination of economic theory and machine learning not only increases the "fitness" (a popular term in the Genetic Algorithm literature) of theoretical exchange rate models but is also fruitful for the effectiveness and correctness of machine learning processes. As experience has shown, relationships derived from machine learning techniques are often not convincing as regards their correctness and effectiveness, and most machine learning approaches do not contribute much to persuade otherwise. In this paper we try to overcome part of this limitation. The approach is illustrated in some detail for five exchange rates at a monthly frequency.

Keywords: Forecasting, Exchange Rate Theory, Genetic Algorithm, Neural Network

Mathematics Subject Classification: 62P05, 91864, 62M45

1 Corresponding author. Assistant Professor at the University of Vienna. E-mail: [email protected]

1. Combining machine learning with economic theory

The basic idea of this paper is to combine economic theory with machine learning to forecast exchange rates, and thus to increase the "fitness" of theoretically derived exchange rate models. The reason for this combined approach is that relying merely on economic theory on the one hand, or on machine learning on the other, has its problems; both approaches have advantages but also disadvantages. The main advantage of forecasting exchange rates using economic theory is that relatively stable relationships are considered. Particularly with regard to forecasting performance, a main disadvantage of theoretical approaches is that these stable relationships provide a low statistical fit on actual data. One reason for this is that they abstract from many other (temporary and case-specific) sources of influence on current exchange rate fluctuations, and also from psychological dynamics which can be modeled by technical indicators. In this paper such indicators (among other fundamentals) are added to the structural models by using machine learning. See the appendix for a discussion of the models used. On the other hand, the problem with most pure machine learning approaches is that they are not able to distinguish between basic, fundamental relationships and temporary relationships. Moreover, machine learning may be "blinded" by spurious causality, so that it can only model exchange rate behaviour over a specific (relatively short) time span. In contrast, a principal advantage of machine learning is that over these specific time spans a considerably high goodness of fit can be achieved, as such methods have the ability to detect and trace influences on exchange rates which are not describable by economic theory.


2. Structuring the GA search space

The combination of theoretical relationships with additional relationships has the advantage that the goodness of fit of the theoretical models can be raised without the risk that relationships offered by economic theory get lost. To search among combinations of these two sorts of influences the GA is applied. However, to constrain the GA, the search space is divided into several clusters or factor groups (F), from each of which the GA has to select variables to build forecast models; usually one series is taken from each factor group, while the groups are assembled according to theoretical considerations. The advantage of using a GA for this optimization task is that many combinations of series from different factor groups can be evaluated automatically without losing the structure of the theoretical models. The number of factor groups depends on the theory used, and the number of series in a factor group depends on how abstractly the theory is formulated, as well as on issues such as the availability of data at different frequencies. See the appendix for the theoretical aspects of the models. Especially at higher frequencies, such as the monthly frequency, many theories are only applicable by using proxies. However, as mentioned, in addition to considering the series suggested by economic theory, other sources of influence are evaluated. Usually other exchange rates, technical indicators and financial market series such as stock market indices have been taken into account, because they are not only said to have the capacity to proxy (or even anticipate) real economic activity but also map capital movements between countries, which in turn affect exchange rates. This idea of constraining the GA can be expressed analytically. The general representation of the forecast equation to be optimized, with the search space divided into several clusters, is:

s_{t+1} = \sum_{j=0}^{k_1}\sum_{i=1}^{n_1} \beta^{1}_{ij} F^{1}_{i,t-j} + \cdots + \sum_{j=0}^{k_m}\sum_{i=1}^{n_m} \beta^{m}_{ij} F^{m}_{i,t-j} + \sum_{j=0}^{k_X}\sum_{i=1}^{n_X} \alpha_{ij} X_{i,t-j}   (1)

with F indicating series from the factor groups according to economic theory and X_i indicating all other series used. Accordingly, \beta denotes the coefficients of the theoretical variables and \alpha those of all others; m stands for the number of theoretical factor groups, n for the number of variables in one factor group and k for the number of lags considered. The GA is used for the model selection, and it is constrained to select exactly one variable from every theoretical factor group, which means

The long-lived investor demands a higher risk price for risks associated with market cash-flow uncertainty (β_CC,i and β_DC,i) than for risks linked to shocks to market returns (β_CD,i and β_DD,i), since any positive (negative) shock to wealth coming from discount rates is accompanied by worse (better) future investment opportunities, whereas the investor is never compensated later for a positive (negative) shock to dividends. Our four-beta model in (7) can be written in the restricted form (8), in which, with λ_M = Var(e_{m,t+1}), the beta prices of market cash-flow risk, λ_CC and λ_DC, are a γ multiple of the beta prices of market discount-rate risk, λ_CD and λ_DD, respectively.

3. Data, Methodology and Empirical Analysis

3.1 Data and Methodology

In order to derive the cash-flow and discount-rate "news" series N^CF_{t+1} and N^DR_{t+1} we employ a VAR-GARCH(1,1) methodology. A VAR-GARCH(1,1) system is estimated for each asset i and for the market portfolio m. Our monthly data consist of 3 sets of 10 portfolios sorted according to market value, book-to-market and dividend-price ratio, and 3 US-economy-wide variables that serve as instruments, from December 1928 to December 2001. Following common practice, the state variables have been selected under the assumption that they can successfully forecast future portfolio returns, and are (a) the log market price-earnings ratio, p-e, (b) the term yield spread, TY, and (c) the small-stock value spread, VS. The VAR-GARCH approach enables us to construct the news series as linear combinations of the standardized innovations of the state variables. Then, using the sample covariances of these standardized residuals, we calculate the cash-flow and discount-rate betas in (3) and (4), and we estimate the prices of beta risks by running a cross-sectional OLS regression of an unconditional version of (7) and (8).
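For intuition on how such news series can be backed out of a first-order VAR, the following is a minimal sketch of the identity in Campbell (1991) [1]. The companion matrix, the innovations and the discount coefficient rho are placeholders, and the authors' VAR-GARCH(1,1) filtering and standardization of the innovations is not reproduced here.

```python
import numpy as np

def var_news(Gamma, u, rho=0.95 ** (1.0 / 12.0)):
    """Cash-flow and discount-rate news from a VAR(1) (Campbell, 1991).
    Gamma : (k, k) VAR(1) companion matrix; the first state variable is the market return.
    u     : (T, k) array of VAR innovations (placeholders for the standardized residuals
            produced by a VAR-GARCH(1,1) filtering step).
    rho   : log-linearization discount coefficient (a monthly value is assumed here)."""
    k = Gamma.shape[0]
    e1 = np.zeros(k)
    e1[0] = 1.0
    # lambda' = e1' (rho * Gamma) (I - rho * Gamma)^(-1)
    lam = e1 @ (rho * Gamma) @ np.linalg.inv(np.eye(k) - rho * Gamma)
    n_dr = u @ lam              # discount-rate news N^DR
    n_cf = u @ (e1 + lam)       # cash-flow news N^CF = return shock + N^DR
    return n_cf, n_dr
```

In the paper, sample covariances of the standardized residuals then deliver the cash-flow and discount-rate betas in (3) and (4).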

3.2 Empirical Analysis

Table 1 reports the empirical findings. Our extended model improves the ability of the disappointing static CAPM and of the two-beta I-CAPM of Campbell and Vuolteenaho (2004) to capture the spread in mean asset premia. The proportion of cross-sectional variability explained increases from 46.4% (two-beta model) to a high 83%, while the pricing error is still highly insignificant (t = -0.566). Most importantly, even when the insignificant constant is included in the regression, all the slope coefficients except λ_CD (t = 1.69) are significant, indicating that the approach of decomposing cash-flow and discount-rate market risks into two components each, reflecting the sensitivities of an asset's dividend growth news and future return news, yields interesting insights for the determination of average risk premia. Once the insignificant λ_0 is removed, all four risk prices are highly significant, and the high estimated values for λ_CC and λ_DC (0.019 and 0.023 respectively) provide further support for the results presented by Campbell and Vuolteenaho (2004) and Campbell, Polk and Vuolteenaho


(2004). They argued that value and small stocks have considerably higher market cash-flow betas than large and growth stocks. We extend their results by showing that the sign and magnitude of our estimated beta-risk prices of the decomposed cash-flow market risk are in line with a rational asset pricing model for a long-lived investor. This investor requires a higher premium per unit of market cash-flow risk than for market discount-rate risk, and the factor of proportionality, restricted to be equal to the coefficient of relative risk aversion, is both economically and statistically significant. For this group of portfolios, although we reject the equality hypothesis for the market discount-rate premia (λ_CD and λ_DD), we can safely accept it for the market cash-flow premia (λ_CC and λ_DC). Overall, for our four-beta specification in (7) and (8), the spread in the average excess returns of size, value and dividend-price sorted portfolios does not appear puzzling.

Table 1: Asset pricing results: Cross-sectional regressions of average excess returns on cash-flow and discount-rate betas of 25 book-to-market, 10 size and 10 dividend-price portfolios.

The table reports, for the static CAPM, the two-beta I-CAPM and the four-beta I-CAPM (with and without a constant), the estimated risk prices (λ_0, λ_M, λ_C, λ_D, λ_CC, λ_CD, λ_DC, λ_DD) with standard errors and t-statistics, the adjusted R², F- and χ²-tests of zero and equality restrictions on the risk prices, and the implied coefficient of relative risk aversion, CRRA (γ), estimated at 5.304 (t = 9.713). The adjusted R² rises from -2.7% (CAPM) and 45.6-46.4% (two-beta I-CAPM) to 82.7-83.1% (four-beta I-CAPM).

Acknowledgments The author wishes to thank Dimitrios Thomakos (University of Peloponnese, Dept. of Economics) for his careful reading of the manuscript and his helpful comments.

References
[1] J. Campbell, A variance decomposition of stock returns, Economic Journal 101 157-179 (1991).
[2] J. Campbell, Understanding risk and return, Journal of Political Economy 104 298-345 (1996).
[3] J. Campbell and J. Mei, Where do betas come from? Asset pricing dynamics and the sources of systematic risk, Review of Financial Studies 6 567-592 (1993).
[4] J. Campbell, J. Polk and T. Vuolteenaho, Growth or glamour, unpublished paper, Harvard University (2003).
[5] J. Campbell and R. Shiller, Stock prices, earnings and expected dividends, Journal of Finance 43 661-676 (1988).
[6] J. Campbell and T. Vuolteenaho, Bad beta, good beta, forthcoming, American Economic Review (2004).
[7] R. Merton, An intertemporal capital asset pricing model, Econometrica 41 867-887 (1973).

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 984-987


Improving Technical Trading Systems by Using a New Matlab based Genetic Algorithm Procedure

S. Papadamou 1, G. Stephanides +
Department of Economics, University of Thessaly, 382 21 Volos, Greece
+ Department of Applied Informatics, University of Macedonia Economic and Social Sciences, 54006, Thessaloniki, Greece

Accepted: August 8, 2004

Abstract: Recent studies in financial markets suggest that technical analysis can be a very useful tool in predicting the trend. Trading systems are widely used for market assessment; however, parameter optimization of these systems has received little attention. In this paper, to explore the potential power of digital trading, we present a new Matlab tool based on genetic algorithms, which specializes in parameter optimization of technical rules. It uses the power of genetic algorithms to generate fast and efficient solutions in real trading terms. Our tool was tested extensively on historical data of a UBS fund investing in emerging stock markets through a specific technical system. Results show that our proposed GATradeTool outperforms commonly used, non-adaptive software tools with respect to the stability of return and time saving over the whole sample period.

Keywords: financial markets; prediction; genetic algorithms; investment; technical rules

Mathematics Subject Classification: 91899, 91828, 46N60, 92015.

1 Corresponding author. Lecturer in Monetary & Banking Economics, Department of Economics, University of Thessaly, Argonauton & Filelinon, Volos, Greece. E-mail: [email protected] [email protected]

1. Introduction

Nowadays traders and investment analysts require fast and efficient tools in a ruthless financial market. Battles in trading are now mainly waged at computer speed. The development of new software technology and the appearance of new software environments like Matlab provide the basis for solving difficult financial problems in real time. Technical analysis has been a part of financial practice for many decades, but in many studies researchers have ignored the issue of parameter optimization, leaving them open to criticism of data snooping and the possibility of survivorship bias (Lo and MacKinlay [1], Brock et al. [4] and Papadamou et al. [6]). Traditionally, researchers used ad hoc specifications of trading rules. Papadamou and Stephanides [5] implemented a new Matlab based toolbox for computer aided technical trading that included a procedure for the parameter optimization problem. However, the weak point of their optimization procedure is time: the objective function (e.g. profit) is not a simple squared error function but a complicated one (each optimization iteration goes through the data, generates trading signals, calculates profits, etc.). When the data sets are large, and you would like to re-optimize your system often and need a solution as soon as possible, trying out all the possible solutions to find the best one becomes a very tedious task.

The aim of this study is to show how genetic algorithms (GAs) (Holland [3], Goldberg [2]), a class of algorithms in evolutionary computation, can be employed to improve the performance and the efficiency of computerized trading systems. Bauer [8, 9] in his research offered practical guidance concerning how GAs might be used to develop attractive trading strategies based on fundamental information. According to Allen and Karjalainen [10], the genetic algorithm is an appropriate method to discover technical trading rules. Fernandez-Rodriguez et al. [7], by adopting genetic algorithm optimization in a simple trading rule, provide evidence for the successful use of GAs from the Madrid Stock Exchange.

2. Methodology

Our methodology is conducted in several steps. Firstly, we have to implement our trading system based on technical analysis. In developing a trading system, one needs to determine when to enter and when to exit the market. If the trader is in the market the binary variable F_t is equal to one, otherwise it is zero. As position traders we base the majority of our entry and exit decisions on daily charts by constructing a trend-following indicator (Dimbeta), which measures the deviation of the current price from its moving average of length \theta_1. The indicators used in our trading system can be formalized as follows:

Dimbeta_t = \frac{Close_t - MovAv_t(Close, \theta_1)}{MovAv_t(Close, \theta_1)}   (1)

where Close_t is the closing price of the fund at time t and the function MovAv calculates the simple moving average of the variable Close with time length \theta_1:

MovAv_t(Close, \theta_1) = \frac{1}{\theta_1} \sum_{i=0}^{\theta_1 - 1} Close_{t-i}, \quad t = \theta_1, \theta_1 + 1, \ldots, N   (2)

Our trading system consists of two indicators, the Dimbeta indicator and the moving average of Dimbeta given by the following equation:

MovAv_t(Dimbeta, \theta_2) = \frac{1}{\theta_2} \sum_{i=0}^{\theta_2 - 1} Dimbeta_{t-i}, \quad t = \theta_2, \theta_2 + 1, \ldots, N   (3)

If MovAv_t(Dimbeta, \theta_2) crosses upward through Dimbeta_t, then enter long into the market (i.e. buy signal). If MovAv_t(Dimbeta, \theta_2) crosses downward through Dimbeta_t, then close the long position in the market (i.e. sell signal).

Secondly, we have to optimize our trading strategy. It is well known that trading systems can be optimized by maximizing objective functions such as profit or wealth. The most natural objective function for a risk-insensitive trader is profit. In our software no short sales are allowed and the leverage factor is set fixed at v = 1, so the wealth at time T is given by the following formula:

W(T) = W_0 \prod_{t=1}^{T} (1 + F_{t-1} \cdot r_t) \cdot \{1 - \delta |F_t - F_{t-1}|\}   (4)

where r_t = (Close_t / Close_{t-1}) - 1 is the return realized for the period ending at time t, \delta denotes the transaction costs and F_t is the binary dummy variable indicating a long position or not (i.e. 1 or 0). The profit is given by subtracting the initial wealth from the final wealth, Pr = W(T) - W_0. Optimizing a system involves performing multiple tests while varying one or more parameters (\theta_1, \theta_2) within the trading rules. In this paper we investigate the possibility of solving the optimization problem by using genetic algorithms. Our proposed GATradeTool operates on a population of encoded candidate solutions. Each decision variable in the parameter set is encoded as a binary string and these are concatenated to form a chromosome. The algorithm begins with a randomly constructed population of initial guesses. These solution candidates are evaluated in terms of our objective function (equation 4). In order to approach optimality, the chromosomes exchange information by using operators (i.e. crossover and mutation) borrowed from natural genetics to produce better solutions (Whitley [11]).
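For illustration only, here is a minimal sketch of the Dimbeta rule and the wealth recursion in equations (1)-(4). The function name, the default parameter values and the use of a pandas price series are assumptions of this sketch; it is not the GATradeTool implementation.

```python
import numpy as np
import pandas as pd

def dimbeta_profit(close, theta1=91, theta2=10, delta=0.0, w0=1.0):
    """Dimbeta trend-following system of equations (1)-(4).
    close          : pandas Series of daily closing prices.
    theta1, theta2 : moving-average lengths (optimized by the GA in the paper;
                     the defaults here are only placeholders).
    delta          : proportional transaction cost; w0 : initial wealth."""
    ma_close = close.rolling(theta1).mean()
    dimbeta = (close - ma_close) / ma_close                  # eqs. (1)-(2)
    ma_dimbeta = dimbeta.rolling(theta2).mean()              # eq. (3)
    F = (ma_dimbeta > dimbeta).astype(float)                 # long (1) or flat (0)
    r = close / close.shift(1) - 1.0                         # one-period return
    gross = 1.0 + F.shift(1).fillna(0.0) * r.fillna(0.0)
    cost = 1.0 - delta * (F - F.shift(1)).abs().fillna(0.0)
    wealth = w0 * (gross * cost).cumprod()                   # eq. (4)
    return wealth.iloc[-1] - w0                              # profit Pr = W(T) - W0
```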

3. Empirical Results We apply our methodology in a UBS Mutual Fund investing in emerging stock markets3. The data analyzed consists of 2800 observations on daily closing prices of that fund for the period 1/5/98 25/6/04.The optimization period is defined between 115/98 to 25/6/03. The optimized system was evaluated through the extended period 115/98 to 25/6/04. Firstly, the effect of different GA parameter configurations will be studied by changing one parameter while keeping the remaining one's fixed at default values. More specifically we are interested to measure the effect of the population size and the crossover parameter in the performance of the genetic algorithm based optimization procedure. Secondly, we compared the solutions of optimization problem conducted by different software tools in order to measure the validity of the GATradeTool proposed.


Looking at Table 1, we can say that as the population size increases, the best and the average solutions improve. However, after a population size of 30 the performance decreased. In order to take into account the computational cost of increasing the population size, we calculate the time needed to solve the problem: a low population size leads to low performance but also a low completion time. According to the efficiency index (maximum return per unit of standard deviation), the best solution is the one given by a population size of 20. The results were analogous in the case of the crossover parameter investigation.

Table 1: Population size effect
Population size     | 40      | 5       | 10      | 20      | 30
Dimbeta/MADimbeta   | 72/135  | 203/202 | 53/197  | 71/141  | 138/206
Completion time     | 36.97   | 4.68    | 9.00    | 17.57   | 26.66
Max return          | 88.83%  | 95.35%  | 121.09% | 126.39% | 108.87%
Min return          | -53.00% | -87.52% | -15.26% | -40.00% | -0.34%
Avg return          | 59.72%  | 68.25%  | 69.74%  | 76.07%  | 71.77%
St. dev. of returns | 42.33%  | 39.09%  | 29.08%  | 33.03%  | 36.97%
Max ret./st. dev.   | 3.05    | 2.89    | 2.79    | 3.28    | 2.99

Looking at Table 2, one can compare the results of optimizing our trading system with three different software tools. The first row gives the result for the GATradeTool against Metastock and the FinTradeTool (Papadamou and Stephanides [5]). Our proposed software tool (GATradeTool) can solve the optimization problem very fast, without any specific restrictions on the total number of tests. The maximum number of tests that can be performed in the Metastock software is 32000. The FinTradeTool needs much more time in order to find an optimal solution, which is close to the best solution provided by the GATradeTool. The optimum parameters found in the period 1/5/98-25/6/03 were then used in the extended period 1/5/98-25/6/04.

Table 2: Comparison of three different software tools
Software tool | Optimized parameters | Total tests | Completion time (min) | Return, optimization period (1/5/98-25/6/03) | Return, evaluation period (1/5/98-25/6/04) | % change in return | Return/time (min)
GATradeTool   | (73, 135)            |             | 17.81                 | 120.0%                                       | 127.0%                                     | 5.9%               | 6.7%
FinTradeTool  | (75, 129)            | 39601       | 67.15                 | 126.4%                                       | 141.2%                                     | 11.7%              | 1.9%
Metastock     | (60, 111)            | 32000       | 30.3                  | 116.9%                                       | 122.1%                                     | 4.5%               | 3.9%
Note: the optimized parameters are θ1 and θ2.

Figure 1 depicts the evolution of the maximum, minimum and average return across the 300 generations for the Dimbeta trading system. It can be observed that the maximum return has a positive trend. It appears to be relatively stable after 150 generations and moves in the range between 1.2 and 1 (i.e. 120%-100% return). For the minimum fitness no pattern seems to exist. For the average population return a clear upward trend can be found in the first 180 generations; this is an indication that the overall fitness of the population improves over time. Concerning the volatility of the solutions, the standard deviation of the solutions, after an increase in the first generations, stabilizes in a range between 0.3 and 0.6, providing evidence of a stable and efficient set of solutions. In conclusion, genetic algorithms are well suited to this task since they perform random searches in a structured manner and converge very fast to populations of near optimal solutions. The GA will give you a set (population) of "good" solutions; analysts are interested in getting a few good solutions as fast as possible rather than the globally best solution. A new trading system can easily be implemented in our procedure by making only a few changes in the code producing the trading signals.


Figure 1: Evolution of several statistics over 300 generations.

References
[1] A.W. Lo and A.C. MacKinlay, When are contrarian profits due to stock market overreaction? Review of Financial Studies 3 175-206 (1990).
[2] D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989.
[3] J.H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, 1975.
[4] W. Brock, J. Lakonishok and B. LeBaron, Simple technical trading rules and the stochastic properties of stock returns, Journal of Finance 47 1731-1764 (1992).
[5] S. Papadamou and G. Stephanides, A New Matlab-Based Toolbox For Computer Aided Dynamic Technical Trading, Financial Engineering News, May/June Issue No. 31 (2003).
[6] S. Papadamou and S. Tsopoglou, Investigating the profitability of Technical Analysis Systems on foreign exchange markets, Managerial Finance 27(8) 63-78 (2001).
[7] F. Fernandez-Rodriguez, C. Gonzalez-Martel and S. Sosvilla-Rivero, Optimisation of Technical Rules by Genetic Algorithms: Evidence from the Madrid Stock Market, Working Papers 2001-14, FEDEA (ftp://ftp.fedea.es/pub/Papers/2001/dt2001-14.pdf).
[8] R.J. Bauer and G.E. Liepins, Genetic Algorithms and Computerized trading strategies, in Expert Systems in Finance (Editors D.E. O'Leary and P.R. Watkins), Amsterdam, The Netherlands: Elsevier Science Publishers, 1992.
[9] R.J. Bauer, Genetic Algorithms and Investment Strategies, New York: John Wiley & Sons Inc., 1994.
[10] F. Allen and R. Karjalainen, Using genetic algorithms to find technical trading rules, Journal of Financial Economics 51 245-271 (1999).
[11] D. Whitley, The Genitor algorithm and selection pressure: why rank-based allocations of reproductive trials are best, in Proceedings of the Third International Conference on Genetic Algorithms, 116-121 (1989).


Lecture Series on Computer and Computational Sciences

Volume I, 2004, pp. 988-990

Extending the Merton Model: A Hybrid Approach to Assessing Credit Quality

George A. Papanastasopoulos 1, Alexandros V. Benos
Department of Economics, School of Management and Economics, University of Peloponnese, GR-22100 Tripolis Campus, Greece
Department of Banking and Financial Management, University of Piraeus, GR-18534 Piraeus, Greece

Accepted: August 8, 2004

Abstract: In this paper we design a model to act as an early warning system to monitor changes in the credit quality of corporate obligors. The structure of the model is hybrid in that it combines the two credit risk modeling approaches: (a) a structural model based on Merton's contingent claim view of firms, and (b) a statistical model determined through empirical analysis of historical data. Specifically, we extend the standard Merton approach to estimate a new risk-neutral distance to default metric, allowing liabilities and the corresponding default point to be stochastic. Then, using financial ratios, other accounting based measures and the risk-neutral distance to default metric from our structural model as explanatory variables, we calibrate the hybrid model with an ordered-probit regression method. Using the same econometric method, we also calibrate a model using our risk-neutral distance to default metric as the unique explanatory variable. Then, using cumulative accuracy plots, we test the classification power of these models in predicting default events out of sample. Finally, we use the optimized model to predict expected default probabilities for industrial companies listed on the Athens Stock Exchange.

Keywords: credit risk, distance to default, financial ratios, accounting variables

Mathematics Subject Classification: 62P20, 62H30.

1 Corresponding author. Lecturer of the University of Peloponnese. E-mail: [email protected].

1. Extended Abstract Credit risk refers to the risk due to unexpected changes in the credit quality of a counter party or issuer and its quantification is one of the major frontiers in modern Finance. Credit risk measurement depends on the likelihood of default of a firm to meet its a required or contractual obligation and on what will be lost if default occurs. In this paper we have designed a model to act as an early warning system to monitor changes in the credit quality of corporate obligors. The structure of the model is hybrid in that it combines the two credit risk modeling approaches: (a} a structural model based on Merton's contingent claim view of firms, and (b) a statistical model determined through empirical analysis of historical data. Central to our hybrid model is a variant of Merton's analytical model of firm value. Fundamental, to Merton's model is the idea that corporate liabilities (equity and debt) could be considered as contingent claims on the value of the firm's assets. To see this, consider the case of a simple firm with market value assets equal to A , representing the expected discounted future cash flows and a capital structure with two classes of liabilities: equity with market value equal to E and zero-coupon debt with face value DT , maturity at timeT. The issue of the debt prohibits the payment of dividends 8 until the face value is paid at maturity T . Now consider the position of the 1 Corresponding author. Lecturer of the University of Peloponnese. E-mail: [email protected].


equityholders and the debtholders of this simple firm. On the debt's maturity T, if the market value of the firm's assets A exceeds the face value of debt, the debtholders will receive the promised payment and the equityholders will receive the residual claim A_T - D_T. If the market value of the firm's assets does not exceed the face value of debt, the equityholders will find it preferable to exercise their limited liability rights, default on the promised payment and surrender the firm's ownership to its debtholders. These payoffs imply that equity and debt possess option-like features with respect to the solvency of the firm. The payoff of the equityholders is equivalent to that of a European call option with underlying asset the firm's asset value A, strike price equal to D_T (the firm's default boundary) and maturity T, and thus its value at time T equals E_T = max(A_T - D_T, 0). The payoff of the debtholders is equivalent to that of a portfolio composed of default-free debt with face value D_T and maturity T plus a short European put position on the firm's assets A with strike D_T and maturity T:

min(D_T, A_T) = D_T - max(D_T - A_T, 0). Therefore, the equilibrium theory of option pricing proposed by Black and Scholes (1973) can be used to price corporate liabilities (equity and debt) and to estimate the underlying default probability of a public firm. However, some of the underlying assumptions of the original model serve to facilitate its mathematical representation and can be considerably weakened (Merton 1974, p. 450). Specifically, in the original model and most of its modified versions, the default boundary beneath which promised payments to debtholders are not made and default occurs is assumed to be constant. Hence, the estimated risk-neutral expected default probabilities cannot capture changes in the relationship of asset value to the firm's default point caused by changes in the firm's leverage. However, these changes are critical in the determination of the actual default probability. Moreover, the simplifying assumption of a constant default boundary is the major reason that the model results in unrealistic estimated short-term credit spreads that differ from those observed empirically. It is common that firms adjust their liabilities as they near default. Empirical studies have shown that the liabilities of commercial and industrial firms increase as they near default, while the liabilities of financial institutions often decrease. This difference reflects the ability of firms to liquidate their assets and adjust their leverage as they encounter difficulties. In order to capture the uncertainty associated with leverage, we have introduced randomness into the default boundary. Therefore, we have developed a new Merton-type approach to estimate default probabilities and value corporate liabilities. Our modified version of the Merton model is based, like all structural default risk models, on the idea that corporate liabilities (debt and equity) can be valued as contingent claims on the firm's assets. The model is forward looking since it uses current market information regarding the future prospects of the underlying firm. Moreover, it relates different credit risk factors in an analytical way and allows for non-linear effects and interactions among them. Its basic output, the risk-neutral distance to default, equals:

RNDD_T = [ ln(A/DP) + (r - δ - σ_A²/2 - λ²/2)·T ] / √(σ_A²·T + λ²·T)

It is straightforward that the risk-neutral distance to default measure and the risk-neutral expected default probability depend on:
• the current market value of the firm's assets A;
• the asset volatility σ_A, which is a measure of business risk;
• the initial level of the default boundary DP;
• the default boundary volatility λ, which captures the uncertainty about changes in the firm's leverage;
• the continuously compounded risk-free rate r;
• the stream of expected cash dividends δ;
• the length of the time horizon T.
Our risk-neutral distance to default metric does not take into account credit risk factors such as cash flow adequacy, asset quality, earning performance and capital adequacy, which are evaluated by


financial ratios and accounting-based measures. Therefore, we calibrate a hybrid model with an ordered-probit regression to explain credit ratings and rating transitions, using our risk-neutral distance to default metric, financial ratios and other accounting-based measures as explanatory variables. Using the same econometric method, we calibrate a model using our risk-neutral distance to default metric as the unique explanatory variable. Then, using cumulative accuracy plots, we test the classification power of those models to predict default events out of sample and out of time. We have found that by enriching the risk-neutral distance to default metric with financial ratios and accounting variables in the hybrid model, we can improve both the in-sample fit of credit ratings and the out-of-sample predictability of default events. Finally, we have used the hybrid model to predict expected default probabilities for industrial companies listed on the Athens Stock Exchange.
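For illustration only, the following minimal Python sketch computes a risk-neutral distance to default and the corresponding risk-neutral default probability, assuming the formula as reconstructed above; the function name and all numerical inputs are hypothetical and are not taken from the paper.

```python
import numpy as np
from scipy.stats import norm

def risk_neutral_distance_to_default(A, DP, sigma_A, lam, r, delta, T):
    """Sketch of a risk-neutral distance to default with a stochastic default boundary.

    A: market value of the firm's assets; DP: initial default boundary;
    sigma_A: asset volatility; lam: default-boundary volatility;
    r: risk-free rate; delta: expected dividend yield; T: horizon in years.
    """
    num = np.log(A / DP) + (r - delta - 0.5 * sigma_A**2 - 0.5 * lam**2) * T
    den = np.sqrt(sigma_A**2 * T + lam**2 * T)
    return num / den

# illustrative numbers only
rndd = risk_neutral_distance_to_default(A=120.0, DP=80.0, sigma_A=0.25,
                                        lam=0.10, r=0.04, delta=0.02, T=1.0)
risk_neutral_pd = norm.cdf(-rndd)   # risk-neutral expected default probability
```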

Acknowledgments The authors gratefully acknowledge professor Gikas A. Hardouvelis, associate professor Dimitrios Thomakos, lecturer George Skiadopoulos and Elias Panagiotidis for their helpful comments and suggestions. All errors are our own.

References
[1] Altman E., "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy", Journal of Finance, (1968), vol. 23, pp. 589-609.
[2] Crouhy M., Galai D., Mark R., "A Comparative Anatomy of Current Credit Risk Models", Journal of Banking and Finance, (2000), vol. 24, pp. 57-117.
[3] Greene W. H., "Econometric Analysis", (Prentice Hall, 2000).
[4] Harrison J.M., Kreps D., "Martingales and arbitrage in multi-period securities markets", Journal of Economic Theory, (1979), vol. 20, pp. 381-408.
[5] Merton R.C., "Theory of Rational Option Pricing", Bell Journal of Economics and Management Science, (1973), vol. 4, pp. 141-183.
[6] Merton R.C., "On the Pricing of Corporate Debt: The Risk Structure of Interest Rates", Journal of Finance, (1974), vol. 29, pp. 449-470.
[7] Rubinstein M., "Great Moments in financial economics: II. Modigliani-Miller Theorem", Journal of Investment Management, (2003), vol. 1(2).
[8] Vassalou Maria, Xing Yuhang, "Default Risk in Equity Returns", Columbia University Working Paper, (2003).


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 991-993

Modeling the Stability of the Transmission Mechanism of Monetary Policy: A Case Study for Romania, 1993-2000
P.L. Rinderu 1 Gh. Gherghinescu O.R. Gherghinescu
University of Craiova, 13 Al. I. Cuza Street, 200585 Craiova, Romania
Accepted: August 8, 2004

1. Introduction
Most of the mathematical models used for approaching problems in the field of economics should be considered and analyzed under the assumption of dynamic systems. In systems theory the concept of a dynamic system relates input functions to output functions via the concept of state. The most often used examples of dynamic systems are finite-dimensional linear systems and finite-dimensional Gaussian systems. There are, though, economic problems that cannot be modeled in an adequate way by linear systems. Given the fact that the use of non-linear approaches implies difficulties in obtaining the solution, the most appropriate tool under these circumstances is to create a linear model and to use it as a first, acceptable proxy for the real system.

2. Theoretical assumptions
The chosen model should meet some requirements:
- It should be realistic for the phenomenon to be modeled. This requires deep understanding of and experience with the phenomenon.
- It should be analytically, algebraically and computationally tractable.
Some other points should also be taken into consideration:
- Characterization of finiteness properties of outcomes.
- Characterization of decidability and complexity of solving problems regarding system identification and control.
- Decomposition of dynamic systems into irreducible components.
- Generating abstractions of systems for distinguishing the levels in hierarchical systems.
- Characterization of classes of control laws for specific control objectives, when appropriate.
Generally speaking, the existence of a control law for a particular control objective may be formulated even if a recognition problem is not well understood. We assume that the control objective is described as a subset of input-output functions of another dynamic system, to be called the control objective system. The control objective is then to establish the existence of a controller in a prescribed pattern of dynamic systems such that the set of input-output functions of the associated closed-loop system would be equal to or included in that of the control-objective system. The formulation of such a problem using a mathematical formalism has the advantage of ensuring a high degree of generality for almost all classes of deterministic systems. The basic formulation, considering a dynamic system, has the form:

ẋ(t) = f(x(t), u(t)),  x(t₀) = x₀,

1 Corresponding author. E-mail: [email protected]


y(t) = h(x(t), u(t)),  z(t) = J(x(t), u(t)),
where y(t) is the observation function and z(t) is the function that should be controlled.

If we consider dynamic feedback based on partial observations, the following relation represents, de facto, a control law:
ẋ_c(t) = f_c(x_c(t), y(t), v(t)),  x_c(t₀) = x_c,0,
where the function v(t) represents the external input. A basic problem for the correctness of the model is the identification of its different characteristics, a process that requires the availability of an observation-based realization or filter system with specified finiteness properties. System identification therefore requires restriction to a class of dynamic systems for which the associated class of observation-based realizations or filter systems has a specified finiteness property.
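As a purely illustrative aside, the closed-loop structure described above (plant, observation, dynamic output-feedback controller with external input) can be sketched in a few lines of Python; the matrices, the Euler discretization and all numerical values below are hypothetical placeholders, not part of the original study.

```python
import numpy as np

# Illustrative linear plant x' = Ax + Bu with observation y = Cx, and a
# dynamic output-feedback controller xc' = Ac*xc + Bc*y, u = Cc*xc + v,
# simulated with a simple Euler discretization (all matrices are placeholders).
A  = np.array([[0.0, 1.0], [-1.0, -0.5]])
B  = np.array([[0.0], [1.0]])
C  = np.array([[1.0, 0.0]])
Ac = np.array([[-2.0]]); Bc = np.array([[1.0]]); Cc = np.array([[-1.5]])

dt, n_steps = 0.01, 2000
x  = np.array([[1.0], [0.0]])        # plant state
xc = np.array([[0.0]])               # controller state
for _ in range(n_steps):
    y = C @ x                        # partial observation
    v = np.array([[0.0]])            # external input
    u = Cc @ xc + v                  # control law
    x  = x  + dt * (A @ x + B @ u)   # Euler step for the plant
    xc = xc + dt * (Ac @ xc + Bc @ y)  # Euler step for the controller
```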

3. Methods
The proposed model aims at simulating the transmission mechanism of monetary policy in Romania in the framework of monetary targeting (in which the central bank has been exerting its control over the monetary basis as operational objective in order to attain a certain level for broad money as intermediary objective and thus contribute to price stability, as the fundamental objective of its policy). The proposed interval for the analysis is 1993-2000, given the fact that before 1993 direct control was exerted upon broad money, whereas after 2003 the role of monetary aggregates in the transmission mechanism has been continuously decreasing in favor of that of the interest rate. We propose a non-linear model with double feedback reaction loops for the transmission mechanism of monetary policy during the above-mentioned interval, as presented in Fig. 1.

Fig. 1 Block diagram of the model (non-linear block with an "M2" feedback loop and an "i" feedback loop), where: NL - non-linear block; M0 - monetary basis as operational objective for monetary policy; M2 - broad money as intermediary objective for monetary policy; i - inflation rate measured by the consumer price index as a proxy for price stability.
Using linearizations and taking into account the qualitative aspects of the inner processes, after simple algebraic calculus we deduce the partial transfer functions that link M2 to M0 via the money multiplier and i to M2 via money velocity, respectively. The next step is to determine the global transfer function of the system, which allows for considerations on the stability of the simulated process of regulating i as a function of M0 and M2.
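As an illustration of the kind of stability check this enables, the sketch below composes two first-order transfer functions in a loop and inspects the closed-loop poles. The transfer functions, gains and time constants are hypothetical stand-ins, not the ones estimated in the paper, and only a single unity-feedback loop around the cascade is used for simplicity.

```python
import numpy as np

# Illustrative first-order links: G1(s) = k1/(T1*s + 1) for the M2/M0 link and
# G2(s) = k2/(T2*s + 1) for the i/M2 link (placeholder values only).
k1, T1 = 1.2, 0.8
k2, T2 = 0.9, 1.5

# Unity feedback around the cascade G1*G2 gives the characteristic polynomial
# (T1*s + 1)(T2*s + 1) + k1*k2 = 0; stability requires all roots in Re(s) < 0.
char_poly = np.polymul([T1, 1.0], [T2, 1.0]) + np.array([0.0, 0.0, k1 * k2])
poles = np.roots(char_poly)
is_stable = bool(np.all(poles.real < 0))
print(poles, is_stable)
```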


4. Conclusions
The study presents a double feedback model for controlling price stability by appointing the monetary basis as operational objective and broad money as intermediary objective for monetary policy. The model allows for quantifying the stability of the transmission mechanism of monetary policy within the monetary targeting regime in Romania between 1993 and 2000.

Acknowledgments The authors wish to thank the anonymous referees for their careful reading of the manuscript and their fruitful comments and suggestions.

References
[1] D. Antohi, I. Udrea and H. Braun, Transmission Mechanism of Monetary Policy in Romania, Research Papers of the National Bank of Romania, No. 13, Bucharest, 2003.
[2] C. Belea, Non-linear Automation, Technical Publishing House, Bucharest, 1983.
[3] R.E. Kalman, P.L. Falb and M.A. Arbib, Topics in Mathematical Systems Theory, McGraw-Hill Book Co., New York, 1969.
[4] J.H. van Schuppen, System theory for system identification, Journal of Econometrics 118, 313-339, 2004.
[5] -, The Transmission Mechanism of Monetary Policy, Bank of England, 2000.


Lecture Series on Computer and Computational Sciences

Volume I, 2004, pp. 994-997

Estimating Risk Neutral Densities of Asset Prices based on Risk Neutral Moments: An Edgeworth Expansion Approach
Leonidas S. Rompolis 1 Elias Tzavalis
Department of Accounting and Finance, Athens University of Economics and Business
Department of Economics, Queen Mary & Westfield College, University of London, London E1 4NS
Accepted: August 8, 2004
Abstract: In this paper we suggest a new nonparametric approach of estimating the risk neutral density (RND) of asset prices or log-returns. Our approach exploits a relationship between the call and put prices and the conditional characteristic function of the asset price and the log-return. The latter is used to nonparametrically estimate the risk neutral moments of the underlying asset price or its log-return. These moments' estimates can be employed to estimate the RND using the generalized Edgeworth series expansion of a density. We then evaluate the performance of our approach by estimating the RND for the asset price and/or its associated log-return for three popular option pricing models: the Black-Scholes model, the stochastic volatility model and the stochastic volatility jump diffusion model. Using S&P 500 option prices and implied volatilities we estimate the implied RND.
Keywords: risk neutral densities, conditional characteristic function, risk neutral moments, Edgeworth series expansion. Mathematics Subject Classification: 60E10, 60G44, 91B28.

1. Introduction
In the last several years many studies have tried to investigate the expectations and preferences of investors that are implied by the option prices given by the market. The most important feature of these studies is the estimation of the risk neutral density (RND). Under the no-arbitrage condition there exists an equivalent probability measure under which all assets (including options) discounted at the risk-free rate of interest are martingales. Options are valued under risk neutrality; therefore the RND is the density implied by the option prices. The existing methods for estimating the RND can be distinguished in three categories. The first category of methods claims that asset returns are governed by a known parametric model, which implies a specific RND.² The second category of methods makes use of the Cox and Ross (1976) formula, typically using a nonlinear optimization method to find the exact form of the RND that produces predicted option prices that are "close" to the observed option prices. Among these techniques some place more structure on the RND to be derived,³ others place less structure, and can therefore be viewed as nonparametric.⁴ The last category consists of methods that first try to construct an option pricing formula from the observed option prices, and then exploit the Breeden-Litzenberger (1978) formula to derive the RND.

1 E-mail: [email protected], [email protected]
2 See for example Heston (1993) for a stochastic volatility model and Bates (1996) for a model with stochastic volatility and jumps.
3 See for example Madan and Milne (1994) and Melick and Thomas (1997).
4 See for example Rubinstein (1994) and Buchen and Kelly (1996).


Shimko (1993), Malz (1997) and Campa et al. (1998) use tools of numerical analysis, like quadratic polynomials and cubic splines, in order to construct the option pricing function. On the other hand, Ait-Sahalia and Lo (1998) and Ait-Sahalia and Duarte (2003) utilize kernel regression to estimate the functional form that relates the option price to the strike price. In this paper we propose a new nonparametric method for estimating the RND based on option data. Based on Bakshi and Madan's (2000) fundamental theorem stating that put and call option prices span the conditional characteristic function (CCF) of the underlying asset price (or return), we derive closed-form equations for the conditional risk neutral moments (RNM), based on the out-of-the-money (OTM) call and put option prices. We then employ the generalized Edgeworth series expansion in order to approximate the implied RND. This approach gives us the opportunity, without estimating the RND, to obtain the moments of the asset price distribution or of the asset price return. We can therefore estimate the mean (considered as the expected value of the asset return), the variance, the skewness (measuring the asymmetry of the market expectations) and the kurtosis.

2. Extracting the RNM general formula from option prices
Our analysis is motivated by the result of Bakshi and Madan's (2000) fundamental theorem stating that the continuum of characteristic functions and the continuum of options are equivalent classes of spanning securities under the risk-neutral measure. We can therefore calculate the risk-neutral CCF of the asset price (or return) as a function of option prices. This result can be directly used to extract the RND by the inverse Fourier transform. However, this may lead to problems in estimating the RND, as the CCF implied by option prices is defined only in the neighborhood of zero. We thus follow another approach which relies on the RNM of the RND. We write the CCF as a Taylor series expansion around zero, and we then extract the coefficients of the polynomial, which are exactly the RNM. This approach has its own merits as, apart from estimating the RND, it enables us to directly compare the risk-neutral and physical moments [see Bakshi, Kapadia and Madan (2003)]. The above general formula for the RNM cannot be directly used when we have market data. First, an option pricing function must be constructed from the available option data. To do so, we follow the technique of Campa et al. (1998). Option prices are first converted to implied volatilities, using the Black-Scholes formula. Then the new data are interpolated by cubic splines. Finally the continuous implied volatility function is converted back into a continuous option pricing function. The second part of the RNM numerical approximation is the calculation of the integrals involved in the formula. A Gaussian quadrature procedure is used to evaluate the integrals numerically.
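For concreteness, a minimal Python sketch of this preprocessing pipeline is given below. The strike grid, the option prices and the weight function g are illustrative assumptions, and the sketch does not reproduce the paper's closed-form RNM formulas; it only shows the price-to-implied-vol conversion, the cubic-spline interpolation and a Gauss-Legendre quadrature over strikes.

```python
import numpy as np
from scipy.stats import norm
from scipy.interpolate import CubicSpline

def bs_call(S, K, T, r, sigma):
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

def implied_vol(price, S, K, T, r, lo=1e-4, hi=5.0, tol=1e-8):
    # simple bisection on the Black-Scholes price (illustrative only)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) > price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# observed strikes and call prices (illustrative numbers)
S0, T, r = 100.0, 0.25, 0.03
strikes = np.array([80.0, 90.0, 100.0, 110.0, 120.0])
calls   = np.array([21.0, 12.2, 5.6, 1.9, 0.5])

ivs = np.array([implied_vol(c, S0, K, T, r) for c, K in zip(calls, strikes)])
iv_curve = CubicSpline(strikes, ivs)            # continuous implied-vol function

def call_price(K):
    # continuous option-pricing function recovered from the fitted vol curve
    return bs_call(S0, K, T, r, iv_curve(K))

# Gauss-Legendre quadrature of an option-weighted integrand over strikes; the
# weight g(K) stands in for the paper's RNM kernels and is only an example.
nodes, weights = np.polynomial.legendre.leggauss(64)
a, b = strikes[0], strikes[-1]
K = 0.5 * (b - a) * nodes + 0.5 * (b + a)
g = lambda K: 2.0 * (1.0 - np.log(K / S0)) / K**2   # example weight only
integral = 0.5 * (b - a) * np.sum(weights * g(K) * call_price(K))
```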

3. Estimating the RND from the RNM through the Edgeworth series expansion
Using the RNM that we calculated in the previous section we can approximate the RND by exploiting the Edgeworth series expansion up to order four. In order to apply the Edgeworth expansion an approximating density must be chosen. The natural choice is the log-normal density when we approximate the RND of the asset price, and the normal density when we approximate the RND of the asset price log-return. The coefficients of these "prior" densities are given by the first two RNM. Namely, what we really do is to adjust the approximating density for the risk-neutral skewness and excess kurtosis that appear in the RND [see Ait-Sahalia and Lo (1998)]. The method that we have just described has its own advantages. It is a fully nonparametric approach because it does not assume any additional structure on the strike-call/put pricing relationship to estimate the RND, and it is not data intensive. The main disadvantage is that the Edgeworth expansion can give us negative values. Nevertheless, we can control for that before applying the method using the ranges derived by Jondeau and Rockinger (2001).
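As an illustration, the following sketch implements a fourth-order expansion of a density around a normal baseline. A Gram-Charlier (Type A) form is used here as a stand-in for the generalized Edgeworth expansion discussed above, and all numerical inputs are hypothetical.

```python
import numpy as np

def expansion_density(x, mu, sigma, skew, exkurt):
    """Fourth-order expansion around a normal baseline (Gram-Charlier Type A form).

    mu and sigma would come from the first two risk-neutral moments; skew and
    exkurt are the risk-neutral skewness and excess kurtosis.
    """
    z = (x - mu) / sigma
    phi = np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))
    he3 = z**3 - 3.0 * z                 # Hermite polynomial He3
    he4 = z**4 - 6.0 * z**2 + 3.0        # Hermite polynomial He4
    f = phi * (1.0 + skew / 6.0 * he3 + exkurt / 24.0 * he4)
    return np.clip(f, 0.0, None)         # crude guard against the negative values noted above

grid = np.linspace(-0.4, 0.4, 401)
dens = expansion_density(grid, mu=0.01, sigma=0.08, skew=-0.6, exkurt=1.2)
```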

4. Numerical evaluation and empirical example

We evaluate the performance of the method suggested in the previous section, to accurately approximate the RND. To this end, we conduct a set of numerical exercises evaluating the ability of the method to accurately approximate the theoretical RNM and RND for the log-returns implied by the Black-Scholes, stochastic volatility and stochastic volatility jump diffusion model. The last two models are used extensively in the literature to improve upon the empirical performance of the Black-Scholes

model. In all the examples we conducted, the theoretical and estimated RNM are very close, and so are the approximated RND and the density implied by the model. To assess the empirical relevance of our method we present an application to estimating the log-return distribution of the S&P 500 index using the S&P 500 European option prices obtained from the CBOE. We calculate the RND at two different days of 2002, and for different maturities.

Acknowledgments The authors wish to thank the anonymous referees for their careful reading of the manuscript and their fruitful comments and suggestions.

References
[1] Ait-Sahalia, Yacine and Andrew W. Lo, Nonparametric estimation of state-price densities implicit in financial asset prices, Journal of Finance 53 499-548 (1998).
[2] Ait-Sahalia, Yacine and Jefferson Duarte, Nonparametric option pricing under shape restrictions, Journal of Econometrics 116 9-47 (2003).
[3] Bakshi, Gurdip and Dilip Madan, Spanning and derivative-security valuation, Journal of Financial Economics 55 205-238 (2000).
[4] Bakshi, Gurdip, Nikunj Kapadia and Dilip Madan, Stock return characteristics, skew laws, and the differential pricing of individual equity options, The Review of Financial Studies 16 101-143 (2003).
[5] Bates, David S., Jumps and stochastic volatility: exchange rate processes implicit in Deutsche Mark options, Review of Financial Studies 9 69-107 (1996).
[6] Breeden, Douglas and Robert H. Litzenberger, Prices of state-contingent claims implicit in option prices, Journal of Business 51 621-651 (1978).
[7] Buchen, Peter W. and Michael Kelly, The maximum entropy distribution of an asset inferred from option prices, Journal of Financial and Quantitative Analysis 31 143-159 (1996).
[8] Campa, Jose M., P.H. Kevin Chang and Robert L. Reider, Implied exchange rate distributions: Evidence from OTC option markets, Journal of International Money and Finance 17 117-160 (1998).
[9] Cox, John and Stephen Ross, The valuation of options for alternative stochastic processes, Journal of Financial Economics 3 145-166 (1976).
[10] Heston, Steven L., A closed-form solution for options with stochastic volatility with applications to bond and currency options, Review of Financial Studies 6 327-343 (1993).

[11] Jondeau, Eric and Michael Rockinger, Gram-Charlier densities, Journal of Economic Dynamics and Control 25 1457-1483 (2001).
[12] Madan, Dilip B. and Frank Milne, Contingent claims valued and hedged by pricing and investing in a basis, Mathematical Finance 4 223-245 (1994).
[13] Malz, Allan, Estimating the probability distribution of future exchange rate from option prices, The Journal of Derivatives 20-36 (1997).
[14] Melick, William R. and Charles P. Thomas, Recovering an asset's implied PDF from option prices: An application to crude oil during the Gulf crisis, Journal of Financial and Quantitative Analysis 32 91-115 (1997).
[15] Rubinstein, Mark, Implied binomial trees, Journal of Finance 49 771-818 (1994).


[16] Shimko, David, Bounds of probability, Risk 6 33-37 (1993).


Lecture Series on Computer and Computational Sciences

Volume I, 2004, pp. 998-1001

Forecasting with a Universal Data Compression Algorithm: The Forex Market Case
A. Shmilovici 1, Y. Kahiri, S. Hauser
Dept. of Information Systems Eng., Ben-Gurion University, P.O.Box 653, Beer-Sheva, Israel
School of Management, Ben-Gurion University, P.O.Box 653, Beer-Sheva, Israel
Accepted 8 August, 2004
Abstract: We use the context tree algorithm of Rissanen for compression and prediction of time series. The weak form of the EMH is tested for 12 pairs of international intra-day currency exchange rates, for one-year series sampled at 1, 5, 10, 15, 20, 25 and 30 minute intervals. Statistically significant compression is detected in all the time series, yet the Forex market turns out to be efficient most of the time, and the short periods of inefficiency are not sufficient for generating excess profit.
Keywords: Efficient Market Hypothesis, Context Tree, Forex Intra-day Trading, Stochastic Complexity
Mathematics Subject Classification: 62P05, 91B84, 62M20

1. Introduction
The Efficient Market Hypothesis (EMH) states that the current market price fully reflects all available information. Assuming that all information is widely spread and accessible to investors, no prediction of future changes in prices can be made. New information is immediately discovered and quickly disseminated to produce a change in the market price. The weak form of the EMH considers only past price data and rules out predictions based on the price data only. The prices follow a random walk, where successive changes have zero correlation. The EMH has been investigated in numerous papers. Thorough surveys, such as [1] and [2], present conflicting results. Most arguments in favor of the EMH rely on statistical tests that show no predictive power for the tested models. However, claims for successful predictions with nonlinear models such as neural networks may not contradict the EMH if the trading community does not immediately assimilate new useful methods. In any event, predicting stock prices is generally accepted to be a difficult task, since stock prices do behave very much like a random-walk process most of the time. Universal coding methods [3],[4] were developed within the context of coding theory to compress a data sequence without any prior assumptions about the statistics of the generating process. The universal coding algorithms - typically used for file compression - construct a model of the data that will be used for coding it in a less redundant representation. The crucial part in coding is to come up with a conditional probability for the next outcome given the past. This can be done, for example, by calculating sequentially empirical probabilities in each "context" of the data. A prediction algorithm that uses these conditional probabilities to make predictions will work, due to the statistical analysis performed by the universal data compression algorithm.

1 Corresponding author. E-mail: [email protected]


A connection between compressibility and predictability exists in the sense that sequences which are compressible are easy to predict and, conversely, incompressible sequences are hard to predict. Merhav and Feder [4] found an upper and a lower bound for the connection between compressibility and predictability. Since the compressibility of a data set indicates that the data is not random, we can use the model that was used for compressing the data for prediction of a future outcome. Here we use the context tree algorithm of Rissanen [5], which is a universal coding method, for measuring the stochastic complexity of a time series. The advantage of this specific algorithm, in contrast to other universal algorithms known only to have asymptotic convergence, is that Rissanen's context-tree algorithm has been shown [6] also to have the best asymptotic convergence rate. Thus, it can be used to compress even relatively short data sets - like the ones available from economic time series. We use this algorithm as described in [7] to examine the weak form of the market efficiency hypothesis. The idea is that, in an efficient market, compression of the time series is not possible, because there are no patterns and the stochastic complexity is high. Periods of reduced stochastic complexity indicate times where a model (or patterns) is found, potentially allowing prediction and obtaining abnormal financial gains. This would indicate periods of potential inefficiency in a financial market.
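To make the idea of context-conditional empirical probabilities concrete, the sketch below encodes returns into the ternary {low, stable, high} alphabet and predicts the next symbol from fixed-depth context frequencies. This is a simplified stand-in for Rissanen's context-tree algorithm (it uses a fixed context depth rather than an adaptively grown tree), and the thresholds, depth and data are illustrative assumptions.

```python
import numpy as np
from collections import Counter, defaultdict

def encode_ternary(returns, eps=1e-4):
    # map each return to {0: low, 1: stable, 2: high}
    return [0 if r < -eps else 2 if r > eps else 1 for r in returns]

def context_counts(symbols, depth=3):
    # empirical counts of the next symbol for every context of length `depth`
    counts = defaultdict(Counter)
    for t in range(depth, len(symbols)):
        ctx = tuple(symbols[t - depth:t])
        counts[ctx][symbols[t]] += 1
    return counts

def predict_next(symbols, counts, depth=3):
    ctx = tuple(symbols[-depth:])
    c = counts.get(ctx)
    if not c:
        return 1, 1.0 / 3.0            # fall back to "stable" with a flat probability
    sym, n = c.most_common(1)[0]
    return sym, n / sum(c.values())    # forecast and its empirical probability

# illustrative usage on synthetic returns
rng = np.random.default_rng(0)
rets = rng.normal(0.0, 3e-4, size=5000)
syms = encode_ternary(rets)
counts = context_counts(syms[:-1], depth=3)
forecast, prob = predict_next(syms[:-1], counts, depth=3)
```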

2. Numerical Experiments

The weak form of the EMH is tested, for one year, for 12 pairs of international intra-day currency exchange rates. The currencies are described in Table 1. The intra-day currency exchange rates were encoded, for series of 1, 5, 10, 15, 20, 25 and 30 minutes, into a tri-nary string indicating a {low, stable, high} trend. A context tree was computed for each sliding window. First, using the context tree, the compression was computed for each sliding window. Statistically significant compression above random is detected in all of those series for all the different periods. The difference between the compression of different series indicates that it takes some time for the forex market to absorb new information. This indicates potential market short-term inefficiency that can be used for forecasting. Second, using the context tree, forecasts were computed for each sliding window, e.g. a forecast of the 51st day was generated based on the previous 50 days. The forecasts were compared with the actual next trends in the time series. Attempts to use the compression for prediction of the future tri-nary trend {low, stable, high}, such that the {high, low} trends are above the trading costs, failed to produce consistent results. Third, the forecasts were compared only to values following a compressible window, to assess the theory that compressibility indicates the possibility for prediction. Several statistical tests were used to check the quality of the prediction. The confusion matrix measures the match between the forecast and the real data. The context tree supplies a forecast together with the probability that the forecast is correct. Therefore, the prediction quality was measured for the 0%, 50%, 70% and 80% estimated probability of a correct forecast. Then, we used the sign test to verify whether a prediction made with an 80% chance of being correct produces better results than other predictions with a lower probability of being accurate. For example, Table 1 below presents the sign test between forecasts with 70% probability and with no probability threshold of being correct, for 1 and 5 minute intervals. We did not find that forecasts given with a higher probability of being correct were significantly better than forecasts that were given with lower probabilities. The Kappa statistic coefficient was computed to measure the degree of compatibility between the forecasted data and the real data. Table 2 below illustrates the results of this test on forecasts that were given with 0% or more chance of an accurate prediction. The bold numbers indicate that the coefficient is significantly greater than zero. The Kappa coefficients we found can be described as poor to medium correspondence. The final part of the research was a simulation of opening and closing positions. The rule for opening a position was given by the context tree; the position was closed at the end of the period the forecast was given for. We found that for some of the currencies there is a profit before paying commission, but in almost all cases there was a loss after paying commission. Table 3 below illustrates an example of the simulation results when positions were taken only for forecasts with an 80% chance of being correct.


Table 1: The sign test for 70% vs. 0% prediction accuracy, for 1 and 5 minute intervals
Currencies: EURUSD, GBPUSD, USDJPY, AUDUSD, CHFJPY, EURCAD, EURCHF, EURGBP, EURJPY, GBPCHF, GBPJPY, USDCHF. For each of the 1 minute and 5 minute columns, six of the twelve currencies show a positive sign (Sum = 6), with a sign-test statistic of 0.387 in both cases.

Table 2: The Kappa coefficient for several currency series (Kappa, standard error and 95% CI, by moving window)

1 Minute series:
Currency  Window  Kappa   Std. Error  95% CI
EURUSD    100     0.01    0.015       -0.015 to 0.04
EURUSD    75      0.014   0.015       -0.015 to 0.043
EURUSD    50      0.021   0.014       -0.007 to 0.05
GBPUSD    100     0.007   0.013       -0.019 to 0.033
GBPUSD    75      0.011   0.013       -0.011 to 0.036
GBPUSD    50      0.017   0.013       -0.008 to 0.043
AUDUSD    100     0.011   0.025       -0.038 to 0.059
AUDUSD    75      0.012   0.024       -0.036 to 0.06
AUDUSD    50      0.021   0.024       -0.026 to 0.068
CHFJPY    100     0.003   0.017       -0.03 to 0.037
CHFJPY    75      0.007   0.017       -0.026 to 0.041
CHFJPY    50      0.015   0.017       -0.018 to 0.047

10 Minute series:
Currency  Window  Kappa   Std. Error  95% CI
EURUSD    100     0.056   0.005       0.046 to 0.065
EURUSD    75      0.052   0.005       0.042 to 0.062
EURUSD    50      0.054   0.005       0.045 to 0.064
GBPUSD    100     0.039   0.005       0.03 to 0.049
GBPUSD    75      0.027   0.005       0.018 to 0.036
GBPUSD    50      0.034   0.005       0.024 to 0.043
AUDUSD    100     0.044   0.007       0.03 to 0.058
AUDUSD    75      0.04    0.007       0.026 to 0.054
AUDUSD    50      0.044   0.007       0.031 to 0.058
CHFJPY    100     0.019   0.005       0.009 to 0.029
CHFJPY    75      0.015   0.005       0.006 to 0.025
CHFJPY    50      0.015   0.006       0.006 to 0.025

Table 3: Summary of a trading strategy for some currency series

            EURUSD                              GBPUSD
Minutes     Profit  Loss  Commission  Summary   Profit  Loss  Commission  Summary
1           17      11    34          -28       8       4     20          -16
5           4       20    22          -38       39      12    32          -5
10          40      40    22          -22       51      18    30          3

3. Conclusions
Statistically significant compression is detected in all the time series. The compression is detected even for the 5-minute and 30-minute interval series, though to a lesser extent. It seems that it takes some


time for the forex market to absorb new information. This indicates potential market short-term inefficiency. We found that though a good compression capability should indicate a good prediction capability, the accuracy of the prediction was poor. There was no significant difference in the quality of prediction between predictions with a high degree of success (as predicted by the context tree) and predictions with a low degree of success. A simulation of opening and closing positions demonstrated no profit beyond the commission for the intra-day trade. Our conclusion is that though the context tree is a useful tool for forecasting time series, the Forex market is efficient most of the time, and the short periods of inefficiency are not sufficient for generating excess profit.

References
[1] E.F. Fama, "Efficient Capital Markets: II", Journal of Finance, 46 1575-1611 (1991).
[2] T. Hellstrom and K. Holmstrom, "Predicting the Stock Market", Technical Report ImaTOM-1997-07, Umea University, Sweden - available on the web.
[3] M. Feder, N. Merhav and M. Gutman, "Universal Prediction of Individual Sequences", IEEE Transactions on Information Theory, 38(4) 1258-1270 (1992).
[4] N. Merhav and M. Feder, "Universal Prediction", IEEE Transactions on Information Theory, IT-44, 2124-2147 (1998).
[5] J. Rissanen, "A Universal Data Compression System", IEEE Transactions on Information Theory, 29(5), 656-664 (1983).
[6] J. Ziv, "A Universal Prediction Lemma and Applications to Universal Data Compression and Prediction", IEEE Transactions on Information Theory, 47(4) 1528-1532 (2000).
[7] M. Weinberger, J. Rissanen and M. Feder, "A Universal Finite Memory Source", IEEE Transactions on Information Theory, 41(3) 643-653 (1995).
[8] F.M.J. Willems, Y.M. Shtarkov and T.J. Tjalkens, "The context-tree weighting method: Basic properties", IEEE Transactions on Information Theory, 41(3) 653-664 (1995).


Lecture Series on Computer and Computational Sciences Volume l, 2004, pp. 1002-1005

Functional Filtering, Smoothing and Forecasting
Dimitrios D. Thomakos 1
Department of Economics, School of Management and Economics, University of Peloponnese, GR-22100 Tripolis, Greece
Accepted: August 8, 2004
Abstract: A new method for filtering and forecasting univariate time series is proposed, which is based on a particular type of a causal filter with functional, time-varying coefficients. The main novelty of the proposed method is the nature of the filtering coefficients, which are data-dependent and local in character. The filtering and forecasting procedures can incorporate two different delay parameters, one applied to the functional coefficients and the other to the data used in the filter; a data-dependent methodology is also presented for selecting these delays and the order of the filter. Various simpler cases of the general method are presented and the potential of the method is illustrated using simulated and real-world time series.
Keywords: filtering, forecasting, functional coefficients, nonparametric. Mathematics Subject Classification: 60G35, 62G99, 62M10, 62M20. PACS: to be completed

1 Presentation of the Problem

Suppose that we are given a univariate, real-valued time series x_t and we are interested in filtering and forecasting based on a causal filter of the generic form:

f_t ≜ Σ_{j=0}^{k-1} τ_j(z_{t-d_1}) x_{t-d_2-j}    (1)

where k ∈ N+ is the order of the filter, z_{t-d_1} ≜ (x_{t-d_1}, x_{t-d_1-1}, ..., x_{t-d_1-ℓ+1})^T is an (ℓ × 1) vector of past observations with ℓ ∈ N, and (d_1, d_2) ∈ N are two delay parameters. The filtering coefficients τ_j(·) are time-varying functions of the vector z_{t-d_1} and we call them the filtering functionals. Note that the filtering coefficients differ in two respects (besides being time-varying) from the coefficients found in other filtering equations: first, they incorporate delays and, second, they are assumed to be functions of a subset of recent observations. Our aim is to use sample information in computing these coefficients. In particular, we are interested in adjusting the filtering coefficients based on the spatial distance between values of the time series that are h time units apart, where h is to be determined by the model. Therefore, in computing the filtering coefficients we take into account both the time distance and the spatial distance of the observations, in anticipation that the coefficients will be more adaptable to changes in the evolution of the time series. To address these requirements we propose that the filtering functionals are constructed from kernel functions that are widely used in nonparametric time series analysis. We note that the proposed model would belong to the class of functional coefficient autoregressive models if the data generating process were of the form:

x_t = Σ_{j=0}^{k-1} τ_j(z_{t-d_1}) x_{t-d_2-j} + n_t    (2)

where n_t is a zero-mean (white) "noise" time series. In this case f_t would be the minimum mean-squared error forecast for x_t: assuming that d_2 > 0 and letting Ω_t ≜ σ(x_s; s ≤ t - d_2) be the information set based on observations up to and including time t - d_2, we would have E[x_t | Ω_t] = f_t.

1 Corresponding author. E-mail: [email protected]

2 A Set of Filtering Functionals

Define the sets A ⊂ R^ℓ and B ⊂ (0, +∞); for an (ℓ × 1) vector ψ ≜ (ψ_1, ψ_2, ..., ψ_ℓ)^T let the function w(ψ): A → B denote a kernel function such that it satisfies the conditions w(0) ≥ w(ψ), w(ψ) = w(-ψ) and ∫_A w(ψ) dψ = 1. An example of such a kernel function is the multivariate Gaussian kernel given by:

w(ψ) = Π_{i=1}^{ℓ} (1/√(2π)) exp(-ψ_i²/2) = (2π)^{-ℓ/2} exp(-ψ^T ψ / 2)    (3)

Next, define the (ℓ × 1) vector ψ_{t-d_2-j} = ψ_{t-d_2-j}(x_{t-d_1}) ≜ (x_{t-d_2-j} - x_{t-d_1})/σ, where σ > 0 is a scaling factor and j = 0, 1, ..., k-1. Note that the sth element of the vector ψ_{t-d_2-j}, say ψ_{t-d_2-j,s}, measures the spatial distance between the observation x_{t-d_2-j} and the observation x_{t-d_1,s} in terms of the scaling factor σ. If ψ_{t-d_2-j} were used as input in the kernel function w(·) above, then the value of w(ψ_{t-d_2-j}) would change with the distance between the observations x_{t-d_2-j} and x_{t-d_1,s} and with the scaling factor σ. The scaling factor σ essentially controls the height of the kernel function, a feature which is useful in controlling the degree of smoothing and therefore allows for explicit training in filtering and forecasting. Using the above kernel function we can now construct filtering functionals that satisfy the necessary conditions on all filtering coefficients, namely that they are positive and that they sum up to unity. We thus assume that the filtering functionals τ_j(·) take the following form:

τ_j(x_{t-d_1}) ≜ w(ψ_{t-d_2-j}) / Σ_{i=0}^{k-1} w(ψ_{t-d_2-i})    (4)

and we clearly have that τ_j(·) ≥ 0, for all j = 0, 1, ..., k-1, and Σ_{j=0}^{k-1} τ_j(·) = 1. The filtered value f_t is then given by:

f_t = Σ_{j=0}^{k-1} [ w(ψ_{t-d_2-j}) / Σ_{i=0}^{k-1} w(ψ_{t-d_2-i}) ] x_{t-d_2-j}    (5)

The weight attached to the value x_{t-d_2-j} depends on the spatial distance between the value itself and the values in the evaluating vector x_{t-d_1}, based on observations that are h = |d_1 - d_2 - j| time units apart. It is clear from the above formula that the proposed filtering functionals are similar (but not identical) to a local nonparametric regression; note that, in this context, the epithet local refers both to the local character of the filtering functionals and to the use of a limited number of observations in forming the filtered value f_t; this is in contrast to standard nonparametric regression.
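A minimal Python sketch of the filter defined by equations (3)-(5), as reconstructed above, is given below; the simulated series and all parameter values are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(psi):
    # multivariate Gaussian kernel of equation (3)
    psi = np.asarray(psi, dtype=float)
    return (2.0 * np.pi) ** (-psi.size / 2.0) * np.exp(-0.5 * psi @ psi)

def functional_filter(x, t, k, ell, d1, d2, sigma):
    """Filtered value f_t of equations (4)-(5); indices follow the text above."""
    z = x[t - d1 - ell + 1: t - d1 + 1][::-1]      # evaluating vector (x_{t-d1}, ..., x_{t-d1-ell+1})
    weights = np.empty(k)
    lags = np.empty(k)
    for j in range(k):
        x_lag = x[t - d2 - j]
        psi = (x_lag - z) / sigma                  # scaled spatial distances
        weights[j] = gaussian_kernel(psi)
        lags[j] = x_lag
    weights /= weights.sum()                       # tau_j: positive, summing to one
    return np.dot(weights, lags)

# illustrative usage on a simulated AR(1) series (all values are arbitrary)
rng = np.random.default_rng(1)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.7 * x[t - 1] + rng.normal(scale=0.5)
f_t = functional_filter(x, t=250, k=5, ell=3, d1=1, d2=1, sigma=1.0)
```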

3 Selecting the Model's Parameters

There are five parameters in the model that need to be determined by the data: the order of the filter k, the order of the evaluating vector ℓ, the delay parameters (d_1, d_2) and the scaling factor σ. Let θ ≜ (k, ℓ, d_1, d_2, σ)^T denote the (5 × 1) vector containing these parameters. The choice of the delay parameters and the order of the filter depends on the problem at hand. For smoothing and filtering problems the order of the filter will depend on the degree of smoothing that is required. For forecasting problems the order of the filter as well as the delays will be chosen to optimize the predictive ability of the model. For all of the above, the performance of the smoothing, filtering or forecasting operation will depend on the selected value of the scaling factor σ: for a given sample of observations, as σ increases the filtering coefficients become less and less varying and approach w(0) (i.e. larger and constant, equal weights); conversely, as σ decreases the filtering coefficients become more and more varying (the kernel values themselves approach zero). Therefore, a balance between larger and smaller values of σ is required. For filtering and smoothing problems a simple grid search over the allowable range of the elements of θ should be sufficient. However, for forecasting problems (where the forecasting performance can be explicitly evaluated) selection of the parameters can be successfully achieved by training the model using a variant of the generalized cross-validation (GCV) approach. The generic GCV approach that we propose is the following: suppose that we have available a sample of T observations and let T_1 < T be a subset of training observations. For a suitably chosen objective function that evaluates forecasting performance, say g(θ), use the first T_1 training observations to compute the v-step ahead forecast, say x̂_{T_1+v}. Then, advance the time index by one and use the next T_1 training observations to compute the next v-step ahead forecast, and so on until we have a sequence of T - T_1 forecasts. Using these forecasts evaluate g(θ) for the chosen θ. Repeating the above procedure for a sequence of N values of θ, say Θ ≜ (θ_1, ..., θ_N), select the "optimal" value of θ as:

θ* ≜ argmin_{θ ∈ Θ} g(θ)    (6)
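The rolling-origin evaluation and grid search can be sketched as follows, reusing the functional_filter function and the simulated series x from the earlier sketch; the grid, the training size and the objective are illustrative choices, with g(θ) taken here to be a mean squared forecast error.

```python
import numpy as np
from itertools import product

def rolling_mse(x, theta, T1):
    """Illustrative g(theta): with d2 > 0 the filtered value f_t uses data only
    up to t - d2, so it serves as a forecast of x_t over a rolling origin."""
    k, ell, d1, d2, sigma = theta
    start = max(d2 + k, d1 + ell, T1)
    errors = [(x[t] - functional_filter(x, t, k, ell, d1, d2, sigma)) ** 2
              for t in range(start, len(x))]     # functional_filter from the sketch above
    return np.mean(errors)

# grid search over a small illustrative grid Theta = (theta_1, ..., theta_N)
grid = product([3, 5, 8],          # k
               [2, 3],             # ell
               [1, 2],             # d1
               [1, 2],             # d2
               [0.5, 1.0, 2.0])    # sigma
theta_star = min(grid, key=lambda th: rolling_mse(x, th, T1=150))
```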

Acknowledgment The author wishes to thank the anonymous referees for their careful reading of the manuscript and their fruitful comments and suggestions.

References
[1] Z. Cai, J. Fan and Q. Yao, 2000. "Functional coefficient models for nonlinear time series", Journal of the American Statistical Association, 95, pp. 941-956.
[2] R. Chen and R. S. Tsay, 1993. "Functional coefficient autoregressive models", Journal of the American Statistical Association, 88, pp. 298-308.
[3] J. Fan and E. Masry, 1997. "Local polynomial estimation of regression functions for mixing processes", Scandinavian Journal of Statistics, 24, pp. 165-179.
[4] X. Li and N. E. Heckman, 1996. "Local linear forecasting", Technical Report 167, Department of Statistics, University of British Columbia, Vancouver.


[5] D. Thomakos and J. Guerard, 2004. "Naive, ARIMA, Nonparametric, Transfer Function and VAR Models: A Comparison of Forecasting Performance", International Journal of Forecasting, 20, pp. 53-67.
[6] Q. Yao and H. Tong, 1998. "Cross-validatory bandwidth selections for regression estimation based on dependent data", Journal of Statistical Planning and Inference, 68, pp. 387-415.

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 1006-1008


Market Timing and Cap Rotation
Dimitrios Thomakos
Department of Economics, University of Peloponnese, End of Karaiskaki Street, 22100, Tripolis, Greece
Tao Wang 1
Department of Economics, Queens College, 65-30 Kissena Blvd, Flushing, NY 11367, U.S.A.
Jingtao Wu
Department of Economics, 260 Heady Hall, Iowa State University, Ames, IA 50011, U.S.A.
Accepted: August 8, 2004
Abstract: We examine the predictability of stock index returns (S&P500, S&P400 and Russell 2000) using the short-term interest rate. We find that the short-term interest rate has predictive power over the relative performance of stock index returns. Trading strategies based on the finding prove to be profitable.
Keywords: Market timing, short-term interest rate, trading strategy
Mathematics Subject Classification: 62P20, 62J05.

1. Introduction
Both practitioners and academics have devoted considerable time to studying the effect of market timing for returns of different asset classes in asset allocation strategies as well as long-short market neutral strategies. By now, few would disagree that returns are at least partly predictable, although there is generally no agreement on the reasons for predictability. One of the most commonly used input variables in market-timing techniques is the short-term interest rate. Among the studies that document predictability of returns by short-term interest rates, Breen, Glosten, and Jagannathan (1989) investigate the performance of a market-timing strategy in shifting funds between Treasury bills and stocks. They find that during 1954 to 1987, a portfolio managed by predictions of a three-year rolling regression of excess stock return on the one-month risk-free rate has a variance of monthly returns about 60% of the variance of monthly returns on the value-weighted stock index, with an average return 2 basis points higher. Such a strategy is found to be worth an annual management fee of 2% of the value of the assets managed. On the other hand, Lee (1997) found that the stock return (S&P500) becomes insensitive to the risk-free rate over time, especially after April 1989. However, the reason for the disappearance of the interest rate effect is not clear. In this paper, we examine the relationship between indices of large cap, mid cap and small cap stocks and the short-term interest rate. In particular, we are interested in the relation between the relative performance of different cap indices and the short-term interest rate.

1 Corresponding author. E-mail: tao [email protected], phone: +212-718-997-5445


One development over the past 20 years is the growth of the corporate bond market. For large corporations, most of their financing needs can be fulfilled through the bond market by using corporate bonds or commercial paper. The cost of external finance through the bond market in some ways could be cheaper than that from the bank-lending channel. Small firms, however, are still affected by changes in the real short-term interest rate since it is still more difficult for them to finance their investment through the bond market, especially the short-term commercial paper market. Based on this development, large cap stocks should be affected by bond yields while small cap stocks are affected by the real short-term interest rate. This observation is consistent with the Lee (1997) study that the S&P500 return becomes insensitive to the interest rate over time, as the S&P500 is basically a large-cap index. Therefore, a market-timing strategy based purely on the short-term interest rate would not be profitable for large-cap stocks but could be profitable for small-cap stocks. It is based on this argument that we want to study the relative performance of these three cap-stock indices. We seek to identify which cap is likely to outperform the others, which cap is likely to underperform the others, and eventually create a managed portfolio to make full use of our results.

2. Data
We use the S&P500, the S&P Mid-Cap 400 and the Russell 2000 to represent the large, mid and small-cap stock indices. We collect the one-month T-bill return as the short-term interest rate from Ibbotson and Associates.

3. Methodology
Our base regression includes the following specification, using the S&P500 and the Russell 2000 as an example:

r_t^{SP500} - r_t^{Rus2000} = a_0 + a_1 r_t^{f} + ε_t    (1)

where r_t^{SP500} is the return for the S&P500, r_t^{Rus2000} is the return for the Russell 2000 and r_t^{f} is the risk-free

rate as represented by Ibbotson and Associates. We also run regressions using returns from the S&P500 and S&P400, and returns from the S&P400 and Russell 2000. The data period is from December 1978 to February 2004. In the regression, we split our sample into two periods: 1978:12-1987:12 and 1988:1-2004:2, and present our results in the next section.
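A minimal sketch of the rolling-regression forecasting step is given below; the three-year window corresponds to 36 monthly observations, and the data are synthetic placeholders rather than the actual index returns and T-bill rates.

```python
import numpy as np

def rolling_spread_forecasts(spread, rf, window=36):
    """Regress the return spread on the risk-free rate over a rolling window
    and produce one-step-ahead forecasts of the spread (illustrative sketch).

    spread, rf : aligned monthly series; window : 36 months (three years).
    """
    forecasts = []
    for t in range(window, len(spread)):
        y = spread[t - window:t]
        X = np.column_stack([np.ones(window), rf[t - window:t]])
        a0, a1 = np.linalg.lstsq(X, y, rcond=None)[0]
        forecasts.append(a0 + a1 * rf[t])          # forecast for month t
    return np.array(forecasts)

# synthetic illustration (real inputs would be the index return spread and T-bill rate)
rng = np.random.default_rng(2)
rf = 0.003 + 0.001 * rng.standard_normal(300)
spread = 0.5 * rf + 0.01 * rng.standard_normal(300)
fc = rolling_spread_forecasts(spread, rf, window=36)
```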

4. Results
In the following table, we present our results from the above regression.

Table 1. Regression coefficients from Equation (1) (t-statistics are in parentheses)

Dependent variable             Period             a_1                R²
r^{SP500} - r^{Rus2000}        1978:12-1987:12    -0.7108 (-0.54)    0.0027
                               1988:1-2004:2       3.4446 (2.79)     0.0246
r^{SP500} - r^{SP400}          1978:12-1987:12    -0.3440 (-0.43)    0.0018
                               1988:1-2004:2       1.9619 (2.38)     0.0164
r^{SP400} - r^{Rus2000}        1978:12-1987:12    -0.3667 (-0.34)    0.0013
                               1988:1-2004:2       1.4827 (1.85)     0.0136

The above results support the argument that after the 1980s, because of the change in the financial sector, it became easier for large-sized firms to finance through different channels, so they do not rely on the bank lending channel as much as before. As a result, the a_1 estimates all become positive and statistically significant. Large cap tends to outperform mid-cap, while mid-cap outperforms small cap stocks, when the interest rate rises. We then estimated the three-year rolling regression from the above specification. From the rolling regression results, we make predictions of the relative performance of the three index portfolios. From the predictions, we construct two different managed portfolios. One is the long-only portfolio, in which we only invest in the index portfolio with the highest forecasted performance. The other one is the market-neutral portfolio, constructed by longing 100% of the index portfolio with the highest forecasted performance and shorting 100% of the combination of the remaining two assets. For example, if the model predicts that the S&P500 will outperform the other two portfolios in the next month, and is neutral on the prospects of the S&P

400 and Russell 2000, we would long 100% S&P 500 and short 50% S&P 400 and 50% Russell 2000. In the whole sample period 1982:1-2004:2, our long-only portfolio has an average monthly return of 1.4% with standard deviation 4.85%. It outperforms all the index portfolios in terms of return and return-standard deviation ratio. The market-neutral portfolio has an average monthly return of 0.52% with standard deviation 2.54%. Its correlation with the S&P500 return is 0.032. In the subsample period 1987:1-2004:2, our long-only portfolio also outperforms the other index portfolios. It has an average monthly return of 1.36% with standard deviation 4.85%. The market-neutral portfolio has an average monthly return of 0.6% with standard deviation 2.66%.
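The mapping from forecasts to portfolio weights described above can be sketched as follows; the scoring rule used to pick the best index is an illustrative simplification, not the exact procedure of the paper.

```python
import numpy as np

def portfolio_weights(pred_large_vs_small, pred_large_vs_mid, pred_mid_vs_small):
    """Map forecasted return spreads into the two managed portfolios described above.

    Returns (long_only, market_neutral) weight vectors over (SP500, SP400,
    Russell 2000). A simple ranking rule is used here for illustration.
    """
    # score each index by the signs of its forecasted relative performance
    score = np.array([
        np.sign(pred_large_vs_small) + np.sign(pred_large_vs_mid),    # SP500
        np.sign(pred_mid_vs_small) - np.sign(pred_large_vs_mid),      # SP400
        -np.sign(pred_large_vs_small) - np.sign(pred_mid_vs_small),   # Russell 2000
    ])
    best = int(np.argmax(score))
    long_only = np.zeros(3); long_only[best] = 1.0                    # 100% in the best index
    market_neutral = np.full(3, -0.5); market_neutral[best] = 1.0     # long 100%, short 50% each
    return long_only, market_neutral

lo, mn = portfolio_weights(0.8, 0.3, 0.5)   # illustrative forecasts
```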

5. Conclusion
In this paper, we examine whether the short-term interest rate can be used to predict the relative performance of three capitalization indices: S&P500, S&P400 and Russell 2000. We find some evidence that the short-term interest rate has some predictive power for the relative performance of these indices. Trading strategies based on this finding prove to be profitable.

Table 2: Summary statistics for monthly returns

1982:1-2004:2
Portfolio                  Mean     Standard Deviation    Mean/S.D.
Market Neutral Portfolio   0.52%    2.54%                 0.20
Long-Only Portfolio        1.40%    4.85%                 0.29
Russell 2000               0.93%    5.72%                 0.16
SP400                      1.03%    4.70%                 0.22
SP500                      1.20%    4.56%                 0.26

1987:1-2004:2
Portfolio                  Mean     Standard Deviation    Mean/S.D.
Market Neutral Portfolio   0.60%    2.66%                 0.22
Long-Only Portfolio        1.36%    4.85%                 0.28
Russell 2000               0.85%    5.92%                 0.14
SP400                      0.95%    4.87%                 0.19
SP500                      1.08%    4.68%                 0.23

Acknowledgments The authors wish to thank the anonymous referees for their careful reading of the manuscript and their fruitful comments and suggestions.

References
[1] Breen, W., L. R. Glosten, and R. Jagannathan, "Economic significance of predictable variations in stock index returns," Journal of Finance, 44, 1177-1189 (1989).
[2] Lee, W., "Market timing and short-term interest rate," Journal of Portfolio Management, Spring, 35-46 (1997).


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 1009-1010

Computational Molecular Science: From Atoms and Molecules to Clusters and Materials
The aim of this Symposium is to bring forth the essential contribution of Computational Science to modern Molecular Science. The contributed papers represent computational efforts in a wide spectrum of systems and phenomena. The relevance of the reported results extends from atoms and small molecules to biomolecules, clusters and materials. A strikingly large variety of subjects is present: theoretical calculations of electric properties, intermolecular interactions and the structure of weakly bonded molecules, collision-induced phenomena, solvent effects on molecular properties, molecular dynamics and more. A total of thirty-three papers are presented: eight invited lectures, seventeen oral presentations and eight posters. Farantos proposes a new approach to the analysis of molecular reaction pathways and elementary bifurcation tracks of periodic orbits. This point of view opens the way to the rigorous study of highly excited vibrational states. The analysis of saddle-node (SN) bifurcations of periodic orbits is expected to offer new insights into fundamental spectroscopic observations of chemical processes. Fournier presents a new method, Tabu Search in Descriptor Space (TSDS), for the investigation of potential surfaces of homoatomic clusters. The determination of stable molecular geometries for clusters of some size has attracted particular attention in recent years. This is mainly due to the emergence of important applications in Materials Science. The study of chemical bonding and basic physicochemical properties of clusters represents a major challenge to fundamental Molecular Science. Results of the application of the TSDS method to the study of Ar, Li and Si clusters are presented and discussed. Kusalik presents and discusses a new method for the determination of the properties of water in the liquid state. A mean-field approach is used to describe the electrostatic environment acting on the water molecule. This allows the calculation of the electric multipole moments and (hyper)polarizability. A new technique, Centroid Molecular Dynamics (CMD), is applied to the study of quantum dynamics in liquid water, leading to interesting new results. This includes a novel characterization of the phenomenon of "effective tunneling" in liquid water. Pal presents a multireference version of Coupled-cluster theory, MRCC. This version is expected to perform better than standard single-reference Coupled-cluster theory in the description of near-degenerate states and molecules away from their equilibrium geometry. Some explicit results are presented for the dipole moment of test systems. Papadopoulos focuses on the calculation of vibrational polarizabilities and hyperpolarizabilities of pyrrole and HArF. The accurate determination of these vibrational contributions is of capital importance to our understanding of the nonlinear optics of molecules. The successful design of NLO materials depends on fundamental estimations of the relative magnitude of the electronic and vibrational contributions to the electric (hyper)polarizability. Rode's contribution demonstrates fully the power and significance of modern QM/MM molecular dynamics simulation techniques applied to problems pertaining to the determination of the microscopic structure and ultrafast dynamics of ions in liquids. The presented examples are of importance to both Chemistry and Biology. They include the solvation structure and dynamics of several hydrated ions.
Particular attention is paid to the visualization of these effects via the newly designed MOLVISION software tool.

Szalewicz presents new density functional theory (DFT) based methods for the calculation of intermolecular forces. Conventional ab initio methods are of high predictive capability when applied to such problems, but their applicability is limited to relatively small systems. DFT offers an attractive alternative. The methods have been successfully applied to the interaction of two molecules containing twelve atoms each.

Thakkar presents an analysis of the computational aspects of the determination of molecular structures and energetics in small hydrogen-bonded clusters. The systems of interest are the formic acid, nitric acid, glycolic acid and the water molecule. The quantum chemical methods used in the


calculation extend from semiempirical approaches and density functional theory to Møller-Plesset perturbation theory and Coupled-cluster methods. The study of these systems reveals important challenges for computational Quantum Chemistry.

George Maroulis Department of Chemistry University of Patras Greece


Lecture Series on Computer and Computational Sciences

Volume 1, 2004, pp. 1011-1014

Molecular Mechanics Model for Second Hyperpolarizabilities and Macroscopic Susceptibilities: Applications to Nanostructures and Molecular Materials Per-Olof Astrand 1 Department of Chemistry, Norwegian University of Science and Technology, 7491 Trondheim, Norway Received 7 March, 2004; accepted in revised form 10 March, 2004 Abstract: The point-dipole interaction model for the dipole-dipole polarizability and its second hyperpolarizability is discussed. Extensions regarding frequency-dependence, damping of interatomic interactions and macroscopic polarization are included. Results for carbon fullerenes and nanotubes are discussed. Keywords: Polarizability, hyperpolarizability, point-dipole interaction, fullerene, nanotube

PACS: 31.70-f, 33.15-e, 36.40-c, 41.90-e, 78.20-e, 82.90+j

1 Introduction

The electronic structure of molecular systems may be represented in different ways. In molecular orbital and density functional theory, the wave function or the electronic density are normally expanded in terms of atomic basis functions and the resulting orbital coefficients are obtained by solving the Schrödinger or Kohn-Sham equations. On the other hand, in molecular mechanics the electronic structure is represented by atom-type parameters such as atomic charges and van der Waals parameters, and the interaction energy is calculated by adopting the atom-type parameters in analytic functions of interatomic distances [1, 2]. Atom-type parameters may be determined either empirically by fitting to experimental data or from quantum chemical calculations. In principle, to calculate the interaction energy by molecular mechanics force fields only takes a small fraction of the time it takes to carry out a quantum chemical calculation. Therefore, quantum chemical calculations are normally carried out on single molecules or relatively small molecular systems whereas force fields are adopted in simulations of liquids and solutions. It should be noted, however, that Car-Parrinello simulations propagate the electronic density as described by density functional theory and such simulations have been adopted for example for liquid water [3]. In quantum chemistry, molecular response properties such as (hyper)polarizabilities are calculated conveniently by adopting response theory. It should also be expected that the atom-type parameters in a force field mimic the corresponding molecular properties [2]. For example, if the electrostatics is represented by atomic charges, the correct molecular charge, dipole moment and possibly also the molecular quadrupole moment should be described by the atomic charges. In most force fields, however, the atomic charges are
1 Corresponding author. E-mail: [email protected]


obtained by parametrizing the electrostatic potential around the molecule. The representation of the electrostatics may be extended to atomic dipole moments, which has been investigated in detail recently [4]. Equivalently, atomic polarizabilities should represent the molecular polarizability and possibly also higher-order quadrupole polarizabilities. One way to model the molecular polarizability according to classical electrostatics is the point-dipole interaction (PDI) model [5, 6]. This model will be described here with extensions to frequency-dependence [8], damping of interatomic interactions [9, 10, 11], to molecular second hyperpolarizabilities [12, 13], and to macroscopic polarization [14].

2 Theoretical Background

The PDI model was first introduced by Silberstein [5] and to a large extent exploited by Applequist and coworkers [6]. For a system of N atomic polarizabilities, $\alpha_{I,\alpha\beta}$, in an external electric field, $E^{\text{ext}}_{I,\beta}$, the atomic induced dipole moment, $\mu^{\text{ind}}_{I,\alpha}$, for atom I is given as

$$\mu^{\text{ind}}_{I,\alpha} = \alpha_{I,\alpha\beta}\left(E^{\text{ext}}_{I,\beta} + \sum_{J \neq I} T^{(2)}_{IJ,\beta\gamma}\,\mu^{\text{ind}}_{J,\gamma}\right), \qquad (1)$$

where the second term on the right-hand side is the electric field from all other atomic induced dipole moments. Here and in the following, the Einstein summation convention is adopted for repeated Greek subscripts. This results in 3N coupled equations, which may be solved by standard matrix techniques as [6, 9]

$$\boldsymbol{\mu}^{\text{ind}} = \left(\boldsymbol{\alpha}^{-1} - \mathbf{T}\right)^{-1}\mathbf{E}^{\text{ext}} \qquad (2)$$

where $\boldsymbol{\mu}^{\text{ind}}$ and $\mathbf{E}^{\text{ext}}$ are 3N vectors and $\boldsymbol{\alpha}$ and $\mathbf{T}$ are 3N x 3N matrices. A two-atom relay tensor, $B^{(2)}$, may be defined as

$$B^{(2)} = \left(\boldsymbol{\alpha}^{-1} - \mathbf{T}\right)^{-1}, \qquad (3)$$

resulting in that the molecular polarizability, $\alpha^{\text{mol}}_{\alpha\beta}$, is given as

$$\alpha^{\text{mol}}_{\alpha\beta} = \sum_{I,J}^{N} B^{(2)}_{IJ,\alpha\beta}. \qquad (4)$$
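As a minimal illustration of Eqs. (1)-(4), the sketch below assembles the 3N x 3N matrix (alpha^-1 - T) for isotropic atomic polarizabilities, inverts it to obtain the relay tensor B^(2), and contracts over atom pairs to obtain the molecular polarizability. It is not code from the paper; the atomic polarizabilities and geometry are made-up numbers, and the damping and frequency dependence discussed below are omitted.

    import numpy as np

    def dipole_tensor(r):
        """Undamped dipole-dipole interaction tensor T^(2) for separation vector r."""
        d = np.linalg.norm(r)
        return (3.0 * np.outer(r, r) - d**2 * np.eye(3)) / d**5

    def molecular_polarizability(coords, alphas):
        """Molecular polarizability from the point-dipole interaction model, Eqs. (2)-(4).

        coords : (N, 3) atomic positions (a.u.); alphas : (N,) isotropic atomic polarizabilities.
        """
        n = len(alphas)
        a_inv_minus_t = np.zeros((3 * n, 3 * n))
        for i in range(n):
            a_inv_minus_t[3*i:3*i+3, 3*i:3*i+3] = np.eye(3) / alphas[i]   # alpha^-1 blocks
            for j in range(n):
                if i != j:
                    a_inv_minus_t[3*i:3*i+3, 3*j:3*j+3] = -dipole_tensor(coords[i] - coords[j])
        relay = np.linalg.inv(a_inv_minus_t)                  # B^(2), Eq. (3)
        # Sum the 3x3 blocks over all atom pairs I, J, Eq. (4)
        return relay.reshape(n, 3, n, 3).sum(axis=(0, 2))

    # Two-atom example with illustrative (made-up) parameters
    coords = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 3.0]])
    alphas = np.array([1.5, 1.5])
    print(molecular_polarizability(coords, alphas))

Damping of the interatomic interaction (Ref. [9]) would enter by modifying dipole_tensor; it is left out here to keep the sketch short.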

In our work, a spherically symmetric polarizability was included for each element, and these atom-type polarizabilities were parametrized against quantum chemical calculations of molecular polarizabilities [8]. The results are improved dramatically by including a damping of the interatomic interactions by modifying the T-tensor in Eq. (1) [9]. We regarded the overlap between two classical charge distributions, which resulted in that the distance between two atoms in the system is replaced by a scaled distance [11]. Furthermore, it was assumed that the frequency-dependence of the molecular polarizability may be modelled with an Unsöld approximation as [8]

$$\alpha_P(-\omega;\omega) = \alpha_P(0;0) \times \left(\frac{\omega_P^2}{\omega_P^2 - \omega^2}\right), \qquad (5)$$

where $\omega$ is the frequency and $\omega_P$ is an atom-type parameter to be determined. It is obviously a crude approximation only valid far from absorption since it is the atomic parameter $\omega_P$ with the lowest value that determines where the first excitation is located irrespective of the surroundings in the molecule. Hyperpolarizabilities may be obtained by extending Eq. (1) with higher order terms,

$$\mu^{\text{ind}}_{I,\alpha} = \alpha_{I,\alpha\beta}E^{\text{tot}}_{I,\beta} + \frac{1}{6}\,\gamma_{I,\alpha\beta\gamma\delta}\,E^{\text{tot}}_{I,\beta}E^{\text{tot}}_{I,\gamma}E^{\text{tot}}_{I,\delta}, \qquad (6)$$

where $E^{\text{tot}}_{I,\beta}$ is the total electric field given as the sum of the external field and the field from the other atomic induced dipole moments in the system. $\gamma_{I,\alpha\beta\gamma\delta}$ is an atomic second hyperpolarizability; an atomic first hyperpolarizability, $\beta_{I,\alpha\beta\gamma}$, has not been included because it is zero for spherically symmetric particles. By adopting the approach by Sundberg [12], the molecular second hyperpolarizability, $\gamma^{\text{mol}}_{\alpha\beta\gamma\delta}$, is given in terms of a four-atom relay tensor, $B^{(4)}_{IJKL,\alpha\beta\gamma\delta}$, as

$$\gamma^{\text{mol}}_{\alpha\beta\gamma\delta} = \sum_{I,J,K,L}^{N} B^{(4)}_{IJKL,\alpha\beta\gamma\delta}, \qquad (7)$$

where the four-atom relay tensor may be expressed as [13]

(8)

where

$$\tilde B^{(2)}_{IJ,\alpha\beta} = \delta_{IJ}\,\delta_{\alpha\beta} + \sum_{K \neq I}^{N} T^{(2)}_{IK,\alpha\gamma}\,B^{(2)}_{KJ,\gamma\beta}. \qquad (9)$$

The Lorentz local-field model for macroscopic polarization has been extended by including also the electric fields of the surrounding atomic induced dipole moments according to Eq. (1). The Lorentz-Lorenz equation is not modified, but the polarizability included in the equation becomes the effective polarizability of a particle in a cluster instead of the polarizability of the isolated particle [14].
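For orientation, the Lorentz-Lorenz relation referred to here has its standard textbook form (quoted here for reference, not taken from the paper); the modification described above amounts to evaluating it with the effective in-cluster polarizability $\alpha^{\mathrm{eff}}$ of Ref. [14] instead of the isolated-particle value:

$$\frac{n^2 - 1}{n^2 + 2} = \frac{4\pi}{3}\,\mathcal N\,\alpha^{\mathrm{eff}},$$

where $n$ is the refractive index and $\mathcal N$ the number density of polarizable particles.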

3 Results

The model as developed by us has been used for the frequency-dependent polarizability tensors for carbon nanotubes [15] and boron nitride nanotubes [16] as well as the polarizability of molecular clusters [11]. The extension of the model to second hyperpolarizabilities has been used to study the properties of carbon fullerenes and nanotubes [17, 18]. In particular, the saturation length of $\gamma$ for carbon nanotubes has been studied for tubes up to a length of 75 nm [17]. It is found that carbon nanotubes are comparable to conjugated polymers with respect to the magnitude of $\gamma$, which indicates that they are promising candidates for future optical materials. Finally, the macroscopic polarization of C60 fullerenes has been calculated [14].

Acknowledgment P.-O.A. has received support from the Norwegian Research Council (NFR) through a Strategic University Program (Grant no 154011/420), a NANOMAT program (Grant no 158538/431) and a grant of computer time from the Norwegian High Performance Computing Consortium (NOTUR)

References [1] A. K. Rappe, C. J. Casewit, Molecular mechanics across chemistry, University Science Books, Sausalito, 1997.


[2] O. Engkvist, P.-O. Astrand, G. Karlstrom, Accurate intermolecular potentials obtained from molecular wave functions: Bridging the gap between quantum chemistry and molecular simulations, Chem. Rev. 100 (2000) 4087-4108.
[3] P. L. Silvestrelli, M. Parrinello, Structural, electronic, and bonding properties of liquid water from first principles, J. Chem. Phys. 111 (1999) 3572-3580.
[4] H. Solheim, K. Ruud, P.-O. Astrand, Atomic dipole moments calculated using analytical molecular second-moment gradients, J. Chem. Phys. 120 (2004) 10368-10378.
[5] L. Silberstein, Molecular refractivity and atomic interaction, Phil. Mag. 33 (1917) 92-128.
[6] J. Applequist, J. R. Carl, K.-F. Fung, An atom dipole interaction model for molecular polarizability. Application to polyatomic molecules and determination of atom polarizabilities, J. Am. Chem. Soc. 94 (1972) 2952-2960.
[7] J. Applequist, An atom dipole interaction model for molecular optical properties, Acc. Chem. Res. 10 (1977) 79-85.
[8] L. Jensen, P.-O. Astrand, K. O. Sylvester-Hvid, K. V. Mikkelsen, Frequency-dependent molecular polarizability calculated within an interaction model, J. Phys. Chem. A 104 (2000) 1563-1569.
[9] B. T. Thole, Molecular polarizabilities calculated with a modified dipole interaction, Chem. Phys. 59 (1981) 341-350.
[10] R. R. Birge, Calculation of molecular polarizabilities using an anisotropic atom point dipole interaction model which includes the effect of electron repulsion, J. Chem. Phys. 72 (1980) 5312-5319.
[11] L. Jensen, P.-O. Astrand, A. Osted, J. Kongsted, K. V. Mikkelsen, Polarizability of molecular clusters as calculated by a dipole interaction model, J. Chem. Phys. 116 (2002) 4001-4010.
[12] K. R. Sundberg, A group-dipole interaction model of the molecular polarizability and the molecular first and second hyperpolarizabilities, J. Chem. Phys. 66 (1977) 114-118.
[13] L. Jensen, K. O. Sylvester-Hvid, K. V. Mikkelsen, P.-O. Astrand, A dipole interaction model for the molecular second hyperpolarizability, J. Phys. Chem. A 107 (2003) 2270-2276.
[14] L. Jensen, P.-O. Astrand, K. V. Mikkelsen, Microscopic and macroscopic polarization in C60 fullerene clusters as calculated by an electrostatic interaction model, J. Phys. Chem. B 108 (2004) 8226-8233.
[15] L. Jensen, O. H. Schmidt, K. V. Mikkelsen, P.-O. Astrand, Static and frequency-dependent polarizability tensors for carbon nanotubes, J. Phys. Chem. B 104 (2000) 10462-10466.
[16] J. Kongsted, A. Osted, L. Jensen, P.-O. Astrand, K. V. Mikkelsen, Frequency-dependent polarizability of boron nitride nanotubes: A theoretical study, J. Phys. Chem. B 105 (2001) 10243-10248.
[17] L. Jensen, P.-O. Astrand, K. V. Mikkelsen, Saturation of the third-order polarizability of carbon nanotubes characterized by a dipole interaction model, Nano Lett. 3 (2003) 661-665.
[18] L. Jensen, P.-O. Astrand, K. V. Mikkelsen, The static polarizability and second hyperpolarizability of fullerenes and carbon nanotubes, J. Phys. Chem. A xxx (2004) yy-zz, ASAP article.

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 1015-1021


Lowest Energy Path of Oxygen near CH: A Combined Configuration Interaction and Tight-Binding Approach N.C. Bacalis and A. Metropoulos Theoretical and Physical Chemistry Institute, National Hellenic Research Foundation, Vasileos Constantinou 48, GR- 116 35 ATHENS, Greece

D.A. Papaconstantopoulos Center for Computational Materials Science, Naval Research Laboratory, Washington DC 20375-5345 USA

Abstract: It is demonstrated that the lowest energy path for the formation of a polyatomic molecule (applied to the HC-O formation) is easily calculated via a geometry-independent tight binding Hamiltonian fitted to accurate ab-initio configuration interaction (CI) total energies. This Hamiltonian not only reproduces the CI calculations accurately and efficiently, but also effectively identifies any CI energies happening to erroneously converge to excited states. PACS: 31.10.+z, 31.15.-p, 31.50.-x, 82.20.Kh

The question
At present, an accurate ab-initio determination of the reaction path in a chemical reaction, needing the detailed knowledge of the pertinent potential energy surface (PES) (diabatic or adiabatic), may be prohibitively time consuming. For the ground state, the time problem is already traditionally overcome via density functional theory (DFT) [1], which self-consistently approximates the many-electron problem by a one-electron problem. However, DFT calculations sometimes fail to explain experimentally observed features of the PES [2]. Thus, accurate CI calculations are more or less indispensable, even if performed in a rather limited, but representative, set of molecular geometries. Therefore, a reliable interpolation scheme for the pertinent PES, based on CI calculations, and overcoming the problem of wrong CI convergence, is desirable.

The purpose
It is shown that such an interpolation scheme is possible, based on a spin-polarized [3] geometry-independent [4] Slater-Koster (SK) parametrization [5] of ab-initio CI total energies [6]. As a demonstration, the method is applied to the construction of the potential energy surface (PES) of the HCO(X 2A') state [7] (a state without a barrier). The lowest energy path of the formation of HCO (in 2A' symmetry), while O approaches HC, is also computed using the interpolated PES.


The procedure
First, several (724, compared to 71969 for H3 [8]) accurate CI total energies, based on (less accurate) multi-configuration self-consistent field (MCSCF) orbitals, are calculated at selected geometries of the H, C, O atoms in the A' symmetry of the Cs group. Most of them (508) are fitted to the interpolation scheme, the remaining serving to check the quality of the fit. For the fit a nonorthogonal spin-polarized tight binding (TB) Hamiltonian is formed, whose matrix elements, along with those of the overlap matrix, are expressed as functions of the bond direction, according to the SK scheme [5], and of the bond length, according to the Naval Research Laboratory (NRL) technique [4], i.e.: The functions are generally polynomials of the interatomic distance, within exponential envelopes, the coefficients and the exponents being varied as parameters. For two adiabatic states near some (avoided) crossing the TB Hamiltonian naturally produces two diabatic PESs in nearby extrapolation, and predicts to which diabatic PES, ground-state or excited, nearby CI energies belong. Among these, the appropriate ones can be used to extend the fit beyond the (avoided) crossings, around which two sets of parameters are needed for the two PESs. If it happens, as with HCO, that the ground and excited state energies beyond the crossing lie close to each other, the adiabatic PES can be fitted as well, with comparable accuracy. Using at each point of the DOF space the lowest lying TB-fitted PES, the adiabatic path can be found: For each value of a desired degree of freedom (in our case for each C-O distance) the energy minimum is searched [9] in the space of the remaining degrees of freedom (C-H distance and H-C-O angle). Having the parametrized tight binding Hamiltonian, any property can be trivially computed.

Methodology
For the CI energies the correlation consistent aug-cc-pVTZ basis set was used [10, 11] in conjunction with the complete active space self-consistent field (CASSCF) + 1 + 2 multi-reference CI method (MRCI) employed in the MOLPRO package [6] (the four electrons in the 1s orbitals of C and O were unexcited). The CASSCF calculations were state-averaged, and the active space was limited to the 9 valence orbitals among which the remaining 11 electrons were distributed. In the subsequent MRCI calculations the uncontracted configurations were around 50 million, internally contracted to about one million. Calculations between C-O distances of 1.7 and 6 bohr were done for several H-C-O angles between 50° and 180° and several C-H distances between 1.7 and 4.5 bohr, most around the C-H equilibrium distance of 2.12 bohr. The three lowest roots of the secular equation were computed to increase the accuracy of the calculation. By an analytic gradient optimization at the MCSCF level, an approximate (MCSCF) equilibrium geometry was found at the DOF space point $(r_{HC}, r_{CO}, \theta_{H\text{-}C\text{-}O}) = (2.12,\ 2.2,\ 126°)$ (in a.u.). Because it is not evident whether the aforementioned points are beyond any avoided crossing, where the role of the ground and the excited states would be interchanged, first several DOF points near equilibrium were obtained by employing a generalization of the 3-dimensional sphere to the generally multi-dimensional (in this case also 3-dimensional) DOF space: $x_i = r_i/\bar r_i - 1$, $i = \{\mathrm{HC,\ CO}\}$, and $x_3$ from the H-C-O angle, where generally for n degrees of freedom, points belonging to an n-dimensional hypersphere of radius r and center $(\bar x_i,\ i = 1, \ldots, n)$ are obtained by

$$\begin{aligned} x_n - \bar x_n &= r\cos\theta_n \\ x_{n-1} - \bar x_{n-1} &= r\sin\theta_n\cos\theta_{n-1} \\ &\;\;\vdots \\ x_1 - \bar x_1 &= r\sin\theta_n\sin\theta_{n-1}\cdots\cos\theta_1 \end{aligned} \qquad (1)$$


where the first $\theta_1$ = 0 or 180° (the two points of a "1-dimensional sphere"), and the other $0 < \theta_i < 180°$ are the "azimuthal" hypersphere angles (incidentally, a variable-dimensional do-loop code was invented, needed to treat any larger molecule). Thus, first points with small r were fitted, and gradually the fit was extended to more remote DOF points. The formalism of the NRL geometry-independent TB parametrization is described in detail in Ref. [4]; here only an essential summary is presented. The total energy is written as

$$E[n(\mathbf r)] = \sum_{i;\,s=1,2} f\!\left(\frac{\mu - \varepsilon_{i,s}}{T}\right)\varepsilon_{i,s} + F[n(\mathbf r)] = \sum_{i;\,s=1,2} f\!\left(\frac{\mu' - \varepsilon'_{i,s}}{T}\right)\varepsilon'_{i,s} \qquad (2)$$

where [12] $f(x) = 1/(1 + e^x)$, $T = 0.005$ mRy, and

$$\varepsilon'_{i,s} = \varepsilon_{i,s} + V_0\,; \qquad \mu' = \mu + V_0\,; \qquad V_0 = F[n(\mathbf r)]/N_e \qquad (3)$$

with $N_e = \sum_{i;\,s=1,2} f((\mu - \varepsilon_{i,s})/T)$ being the number of electrons; $i$ counts the states and $s = 1, 2$ counts the spin. Since the total energy is independent of the choice of zero of the potential, it is sufficient to determine the shift $V_0$ by the requirement that the $\varepsilon'_{i,s}$ are the eigenvalues of the generalized eigenvalue problem $(H - S\,\varepsilon'_{i,s})\,\psi_{i,s} = 0$, where $H$ is the TB Hamiltonian and $S$ is the overlap matrix in an atomic s- and p-orbital basis representation $\{\phi_a\}$. Thus, a non-orthogonal TB calculation uses on-site, hopping and overlap parameters. Demanding that only the on-site SK parameters are affected by the shift $V_0$, for atom $I$ in a spin-polarized structure the matrix elements are expressed as

$$h^{I}_{l,s} = \sum_{n=0}^{3} b^{I}_{l,n,s}\,\varrho_{I,s}^{\,2n/3}\,, \qquad l = s, p \qquad (4)$$

where

$$\varrho_{I,s} = \sum_{J \neq I} e^{-\lambda^2_{ij,s} R_{IJ}}\; f\!\left(\frac{R_{IJ} - R_0}{r_c}\right) \qquad (5)$$

is a generalized pair potential ("density"), with $R_0$ = 15 bohr, $r_c$ = 0.5 bohr; $R_{IJ}$ is the internuclear distance between atoms $I$ and $J$, $i$ ($j$) denote the type of atom on the site $I$ ($J$), while $\lambda_{ij,s}$, depending on the atom type, and $b^{I}_{l,n,s}$ are the on-site NRL geometry-independent parameters (GIP). It is found sufficient to keep the hopping and overlap parameters spin independent, of the form

$$P_\gamma(R) = \left(\sum_{n} e_{\gamma,n} R^{\,n}\right) e^{-g_\gamma^2 R}\, f\!\left(\frac{R - R_0}{r_c}\right) \qquad (6)$$

where $\gamma$ indicates the type of interaction (i.e. ss$\sigma$, sp$\sigma$, pp$\sigma$, pp$\pi$ and ps$\sigma$). The NRL GIPs are $e_{\gamma,n}$ and $g_\gamma$, $R$ is the interatomic distance, and $R_0$ and $r_c$ are as in Eq. (5). Within the context of the NRL code [4], written primarily for solids, the molecule was treated as a basis to a large cubic lattice unit cell (lattice constant = 100 a.u.), ensuring vanishing interaction between atoms in neighboring cells. Thus, the PES was described in terms of the following NRL GIPs for each spin polarization. On-site: s: H, C, O, (H depending on C), (C on H), (H on O), (O on H), (C on O), and (O on C); p: C, O, (C on H), (O on H), (C on O), and (O on C). Hopping and overlap parameters: ss$\sigma$: H-C, H-O, C-O; sp$\sigma$: H-C, H-O, C-O and O-C (denoted as ps$\sigma$); pp$\sigma$ and pp$\pi$: C-O. For HCO, since similar atoms are well separated, the H-H, C-C and O-O parameters vanish. We fitted 508 CI points and checked the resulting PES against 216 more CI energies not included in the fit. The error was less than $10^{-3}$ a.u., which is within the ab-initio PES


Table 1: Geometric characteristics of HCO around equilibrium, along the reaction path, in a.u. (H-C-O angle in degrees). The last three columns indicate the minimum energy molecular geometry.

C-O distance   Total Energy   C-H distance   H-C-O angle
2.6            -113.6328      2.069          117.53
2.5            -113.6485      2.071          118.69
2.4            -113.6610      2.077          119.77
2.3            -113.6685      2.088          120.84
2.2            -113.6687      2.107          121.91
2.1            -113.6583      2.130          122.98
2.0            -113.6326      2.153          124.09

accuracy (starting from different initial guesses the MCSCF calculation may converge to slightly different results, by $10^{-3}$ a.u.). To ensure obtaining physically meaningful TB parameters, for a very limited number of molecular geometries the Hamiltonian eigenvalues were also fitted, while the total energy was fitted for all 508 structures. Finally, for the reaction path we used a non-linear energy minimization technique employing Powell's conjugate directions method [13], modified to be restricted to closed intervals of the DOF space [9]. For comparison, each of the 724 ab-initio CI calculations needs 3 hours of CPU time, each n-dimensional hypersphere radius increase, to fit more remote points (with 10 such hypersphere radial extensions all points can be covered), needs 2-3 hours, and each 2-dimensional energy minimization, using the final TB parameters (i.e. the reaction path determination), needs a few seconds.
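To make this last step concrete, the sketch below scans the C-O distance and, for each value, minimizes the energy over the remaining two degrees of freedom with a bounded Powell search. It is not the authors' code: the fitted TB energy function is replaced by a made-up analytic stand-in, and the bounds and starting values are only illustrative.

    import numpy as np
    from scipy.optimize import minimize

    def tb_energy(r_co, r_ch, theta):
        """Stand-in for the fitted TB total energy E(r_CO, r_CH, theta); illustrative only."""
        return (0.3 * (r_co - 2.2)**2 + 0.5 * (r_ch - 2.1)**2
                + 0.002 * (theta - 122.0)**2 - 113.67)

    def reaction_path(co_grid):
        """For each C-O distance, minimize E over (r_CH, theta) within closed intervals."""
        path = []
        guess = np.array([2.1, 122.0])              # start near the MCSCF equilibrium
        for r_co in co_grid:
            res = minimize(lambda x: tb_energy(r_co, x[0], x[1]), guess,
                           method="Powell",
                           bounds=[(1.7, 4.5), (50.0, 180.0)])
            guess = res.x                           # follow the path continuously
            path.append((r_co, res.fun, *res.x))
        return path

    for r_co, e, r_ch, th in reaction_path(np.arange(2.0, 2.7, 0.1)):
        print(f"C-O = {r_co:.1f}  E = {e:.4f}  C-H = {r_ch:.3f}  angle = {th:.2f}")

With the real TB energy in place of tb_energy, each such 2-dimensional minimization is the few-second step quoted above.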

Results
The fitted TB Hamiltonian could correctly predict total energy curves for points not included in the fit, as shown for example in Fig. 1. Since it naturally produces the diabatically extended branch of the energy, it could distinguish to which adiabatic state near an avoided crossing the CI values belong. Classifying such CI points may sometimes be misleading or unrecognizable by mere observation of the MCSCF orbitals. An example is shown in Fig. 2. However, the most impressive aspect was that we realized, through the fit, that at some points (about 10 in 700) the CI calculation had converged to excited energies (which ought to be disregarded, otherwise they would destroy the fit). An example is given in Fig. 3. Finally, Fig. 4 shows the lowest energy path for the formation of HCO, as HC approaches O. For a triatomic molecule the figure contains the whole information: For each C-O distance the minimum energy and the corresponding C-H distance and H-C-O angle are displayed via two arrows starting at C (on the curve) and ending one at O (horizontal) and the other at H (oblique). As seen from the figure, at large C-O distances, O is more attracted toward H, but, in approaching equilibrium, O binds mainly with C, the H-C-O angle gradually becoming ≈ 122° (representing the CI value). Around equilibrium (cf. Table 1), the angle changes slightly and monotonically by 1-2°, but because, in increasing the C-O distance, the C-H distance decreases, predominantly an antisymmetric stretching vibration occurs. To our knowledge there is no experimental confirmation of the reaction path of this intermediate molecule.

Acknowledgment We wish to thank Dr. M.J. Mehl for many useful discussions.


Figure 1: Predicted total energy E in a.u. (Above:) vs C-O distance for C-H distance = 3.01 bohr, and various H-C-O angles. (Below:) vs C-H distance for various C-O distances, and H-C-O angle = 100°.

Fig. 2: Total energy curves for C-H = 1.71 a.u. (1st CI root and fit).

$$\varphi_\mu(\mathbf r; \mathbf k) = \sum_{\mathbf T} \varphi_\mu(\mathbf r - \mathbf A_\mu - \mathbf T)\, e^{i\mathbf k\cdot\mathbf T} \qquad (2)$$

where $\mathbf A_\mu$ defines the nucleus coordinates in the reference cell on which $\varphi_\mu$ is centered. If translational symmetry is taken into account and Bloch functions are used to form representative sets, which are the bases for the irreducible representations of the translation group, the problem is factorized into n problems of dimension m, where n is the number of the irreducible representations of the translational group and m is the number of basis functions in the unit cell [2]. The experimental structural data for calcite were taken from Markgraf [3]. The basis adopted for calcite is that used by Catti [4]. The outer sp and d orbitals were optimized in the crystalline environment. All calculations were performed at the HF level of theory. The irreducible part of the Brillouin zone was sampled at 32 points using a shrinking factor of 6. Other computational parameters such as the overlap and the penetration tolerance for the Coulomb and exchange integrals were kept to their default values [5]. The basis sets used for the substitutional cations were among those implemented in the CRYSTAL98 code and have been tested in various crystalline compounds.
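A shrinking factor of 6 corresponds to a uniform 6 x 6 x 6 net of k-points in the first Brillouin zone, which is then reduced by symmetry to the 32 irreducible points quoted above. The sketch below is only an illustration of that counting (it is not the CRYSTAL98 implementation, and the symmetry reduction specific to the calcite space group is not reproduced); it generates the full fractional-coordinate net for a given shrinking factor.

    import numpy as np

    def kpoint_net(shrink):
        """Uniform k-point net for a given shrinking factor, in fractional coordinates.

        Returns the shrink**3 points k = (i, j, l)/shrink folded into [-1/2, 1/2).
        """
        grid = np.arange(shrink) / shrink
        kx, ky, kz = np.meshgrid(grid, grid, grid, indexing="ij")
        kpts = np.stack([kx, ky, kz], axis=-1).reshape(-1, 3)
        return (kpts + 0.5) % 1.0 - 0.5   # fold into the first Brillouin zone

    kpts = kpoint_net(6)
    print(len(kpts), "k-points before symmetry reduction")   # 216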

3 Results

We have used the supercell approach to study the incorporation of foreign ions into the calcite bulk structure. As a consequence, all symmetry operations with translational components were removed, thus reducing the symmetry and increasing the computational cost. Four cations isovalent to Ca2+ were used as impurities, namely Mg2+, Sr2+, Ba2+ and Zn2+. The exact position of the impurity ions was the Ca site. The degree of contamination ranged down to 25%. All structures were allowed to relax by changing the position of the oxygen atom. Table 1 summarizes the effect of the presence of a foreign ion into calcite.

Table 1: Energetic and structural data for the substitution of Ca2+ of calcite with a foreign isovalent cation. Unrelaxed and Relaxed refer to the defect formation energy of the corresponding structures, given in eV. X-O is the distance in Å between the substitutional ion and the O atom in the relaxed structure. ΔR is the difference in X-O distance between the relaxed and unrelaxed structure.

             Mg               Sr               Ba               Zn
            50%     25%      50%     25%      50%     25%      50%     25%
Unrelaxed  -2.8744 -2.8740   2.3005  2.3011   6.4686  6.4707  -3.2430 -3.2428
Relaxed    -3.3507 -3.4278   1.9887  1.9775   4.3665  4.3068  -3.6994 -3.7182
X-O         2.2467  2.2457   2.4717  2.4689   2.7161  2.7173   2.2597  2.2559
ΔR         -0.1252 -0.1262   0.0998  0.0970   0.3442  0.3454  -0.1122 -0.1160

It is obvious from Table 1 that the substitution of a Ca atom by Mg2+ and Zn2+ is energetically favourable. For Mg2+ this could be expected, since the existence of dolomites may suggest that type of substitution. On the other hand, the addition of Mg2+ ions into a growth solution leads to the formation of aragonite. At this point it must be noticed that the degrees of contamination are


relatively high. The structures of biogenic Mg-calcites reported in the literature [6] contain 6-13% Mg2+. In addition, as the percentage of Mg2+ in the host lattice changes there is also a change in the size of the cell, which in our calculation was kept fixed. All these could explain the overestimated Mg-O distance (approximately 2.10 Å in MgCO3 and CaMg(CO3)2). The incorporation of the Zn2+ cation into bulk calcite has been confirmed by Reeder [7] using XAFS spectroscopy. The observed Zn-O distance in their experiments is 2.14-2.15 Å, to be compared with our calculated value of 2.25 Å corresponding to 25% contamination. For Sr2+ and Ba2+ our calculations show that they cannot incorporate into bulk calcite. Contrary to our results, Reeder [7] showed that Ba2+ can enter into the calcite lattice occupying a Ca site. The Ba-O distance observed is 2.68 Å, which compares well with our calculated value. So the positive value of the defect formation energy could be attributed to the relatively small Ba-Ba distance compared to the large size of the Ba2+ cation. In fact, Table 1 suggests that the incorporation of foreign ions into calcite is size dependent. This is evident in Figs. 1 and 2, where the effect of the cation size on the defect formation energy and the local environment is clear.


Figure 1: Defect formation energy versus ionic radii for the doped calcite.

Ions smaller than Ca2+, like Mg2+ and Zn2+ (ionic radii 0.72 Å and 0.74 Å), can be incorporated into the host lattice, while larger ions like Sr2+ and Ba2+ (ionic radii 1.18 Å and 1.35 Å) cannot. Also, the key factor that controls the distortion of the lattice and the local environment around the impurity is the cation size. However, in order to have the complete picture of the modes of incorporation, lower degrees of contamination and variation of the size of the cell have to be considered.


Figure 2: Left: Electron density map on the Ca-Mg-O plane of a relaxed Mg-doped calcite. Right: Electron density map on the Ca-Ba-O plane of a relaxed Ba-doped calcite. Contour lines in [0.005, 0.5] $e/a_0^3$ are separated by 0.025 $e/a_0^3$.

Acknowledgment M. Menadakis gratefully acknowledges a scholarship from the Institute of Chemical Engineering and High Temperature Chemical Processes of the Foundation of Research and Technology-Hellas (FORTH/ICE-HT).

References
[1] Y. J. Han and J. Aizenberg, Effect of Magnesium Ions on Oriented Growth of Calcite on Carboxylic Acid Functionalized Self Assembled Monolayers, J. Am. Chem. Soc. 125 4032-4033 (2003).
[2] C. Pisani, Quantum-Mechanical Ab-initio Calculation of the Properties of Crystalline Materials, Springer-Verlag, Berlin Heidelberg, 1996.
[3] S. A. Markgraf and R. J. Reeder, High-temperature structure refinements of calcite and magnesite, Am. Mineral. 70 590-600 (1985).
[4] M. Catti, A. Pavese, E. Apra and C. Roetti, Quantum Mechanical Hartree-Fock Study of Calcite (CaCO3) at Variable Pressure and Comparison With Magnesite (MgCO3), Physics and Chemistry of Minerals 20 104-110 (1993).
[5] V. R. Saunders, R. Dovesi, C. Roetti, M. Causa, N. M. Harrison, R. Orlando and C. M. Zicovich-Wilson, CRYSTAL98 User's Manual, Torino: Universita di Torino, 1999.
[6] J. Paquette and R. J. Reeder, Single-crystal X-ray structure refinements of two biogenic magnesian calcite crystals, Amer. Mineral. 75 1151-1158 (1990).
[7] R. J. Reeder, G. M. Lamble and P. A. Northrup, XAFS study of the coordination and local relaxation around Co2+, Zn2+, Pb2+ and Ba2+ trace elements in calcite, Amer. Mineral. 84 1049-1060 (1999).


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 1033-1036

Density-functional-based methods for calculations of intermolecular forces Krzysztof Szalewicz 1 and Rafal Podeszwa Department of Physics and Astronomy, University of Delaware, Newark, Delaware 19716

Alston Misquitta Cambridge University, Chemical Laboratories, Cambridge, CB2 1EW England

Bogumil Jeziorski Department of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland

Abstract: Ab initio wave-function-based methods can predict intermolecular force fields very accurately for monomers containing a few atoms. These methods are, however, too time consuming for larger molecules, in particular for any molecules of biological interest or molecules forming energetic solid state materials. The density-functional theory (DFT) methods would be fast enough, but are currently not able to predict the very important dispersion component of the force field. A new perturbational method that is capable of computing the force fields accurately will be presented. This method is based on a DFT description of isolated molecules but computes intermolecular forces using expressions beyond DFT. Calculations for model compounds have shown that the new method reproduces all components of the intermolecular force, including dispersion, extremely well, in fact challenging the accuracy of wave-function-based methods. At the same time, the computer resources required by this method are similar to those of the standard DFT. The method has already been applied to interactions of two monomers containing 12 atoms each, and it should be able to handle interactions of molecules containing 20 and more atoms. Keywords: Intermolecular forces; Density-functional theory; Dispersion energy; Perturbation theory

PACS: 34.20.Gj

Symmetry-adapted perturbation theory (SAPT) provides not only a conceptual basis for understanding intermolecular interactions but also an efficient computational framework for accurate predictions of interaction energies [1, 2]. The pair interactions and three-body nonadditive interactions of arbitrary closed-shell molecules can now be computed using the program SAPT2002 [3]. This approach has provided some of the most accurate intermolecular potentials for various dimers and trimers, as confirmed by comparisons of the computed spectra with experiment. In particular, the water dimer and trimer spectra agreed with experiment very well [4] and the predictions of the SAPT potential are competitive with those provided by empirical potentials fitted to the spectra [5]. Simulations of liquid water with SAPT potentials [6, 7] provided the first quantitative determination of the role that the three-body effects play in this system. 1 e-mail:

[email protected]


Applications of wave-function-based ab initio methods to interactions of molecules containing ten or more atoms have not been possible, since calculations employing SAPT or any other electronic structure method that includes correlation effects at a level adequate for describing intermolecular interactions require relatively significant computer resources. On the other hand, although the existing density-functional theory (DFT) methods are fast enough for such calculations, these methods are known to fail to describe an important part of the van der Waals forces, the dispersion interaction. In fact, two of us have shown [8] that supermolecular DFT calculations lead to large errors also in other interaction energy components (the electrostatic, induction, and exchange interactions) due to an incorrect behavior of electron densities at distances from nuclei that are relevant for intermolecular interactions.

Table 1: Interaction energies (in kcal/mol) for the DMNA dimer at a near minimum geometry.

Hartree-Fock           2.25
frozen-core:
  MP2                 -7.90
  MP4(SDTQ)           -7.85
  CCSD                -5.31
  CCSD(T)             -6.85
full-core CCSD(T)     -6.86
SAPT                  -7.36
SAPT(DFT)/PBE0        -6.22
SAPT(DFT)/B97-2       -6.56

It has recently been shown that a solution to this difficulty is a SAPT approach utilizing the DFT description of monomers [8, 9]. Such a method, now called SAPT(DFT), was first proposed by Williams and Chabalowski [10] and later developed by some of the present authors [8, 9] and independently by Hesselmann and Jansen [11]. The method does not rely on asymptotic expansions and therefore is applicable for all separations between the interacting molecules. The SAPT(DFT) approach avoids the problems of supermolecular DFT by using this method only to describe each monomer, but calculating the interaction energies from expressions beyond DFT. In addition, the wrong long-range behavior of monomer densities is fixed by applying an asymptotic correction to the exchange-correlation potential of DFT. SAPT(DFT) calculations require only a small fraction of computer resources used by the regular SAPT and converge much faster in the size of the basis sets. Moreover, although initially SAPT(DFT) was expected to be a method providing medium quality results for very large molecules, it turned out that at least in some cases the accuracy of SAPT(DFT) surpasses that which can be reached with the currently programmed regular SAPT and basis sets of a reasonable size. Our most recent results for several dimers show that in all cases when there were significant discrepancies between the results from the two approaches, these were resolved in favor of SAPT(DFT), i.e., were resulting from theory level truncations and basis set incompleteness in the regular SAPT calculations. All the individual physical components of the intermolecular force, including the dispersion energy, have been reproduced by SAPT(DFT) very well. The initial SAPT(DFT) calculations were performed for fairly small systems, such as He, Ne, H2O, and CO2 homogeneous dimers. Recently, we were able to compute interaction energies for the dimethylnitramine (DMNA) dimer containing 24 atoms. DMNA is an important model compound for energetic materials and was investigated by SAPT in the past [12]. In Table 1, we show the


Table 2: Individual components of the interaction energy for SAPT and SAPT(DFT) with the PBE0 and B97-2 functionals. Energies in kcal/mol. The value in parentheses following the SAPT dispersion energy is this quantity computed with neglect of intramonomer correlation effects.

Component               SAPT              PBE0     B97-2
electrostatic          -10.51            -10.25   -10.05
1st-order exchange      18.28             17.43    16.85
induction               -6.07             -6.54    -6.35
exchange-induction       4.49              5.02     4.82
dispersion             -13.50 (-12.83)   -11.83   -11.75
exchange-dispersion      1.34              1.34     1.30
δHF                     -1.38             -1.38    -1.38
total                   -7.36             -6.22    -6.56
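As a quick consistency check on Table 2 (an illustrative calculation, not part of the paper), the components can be summed to recover, within rounding, the totals in the last row, and the Hartree-Fock value from Table 1 plus the uncorrelated SAPT dispersion energy reproduces the -10.58 kcal/mol quoted in the discussion below.

    # Components from Table 2 (kcal/mol): electrostatic, 1st-order exchange, induction,
    # exchange-induction, dispersion, exchange-dispersion, delta(HF)
    components = {
        "SAPT":  [-10.51, 18.28, -6.07, 4.49, -13.50, 1.34, -1.38],
        "PBE0":  [-10.25, 17.43, -6.54, 5.02, -11.83, 1.34, -1.38],
        "B97-2": [-10.05, 16.85, -6.35, 4.82, -11.75, 1.30, -1.38],
    }
    for method, terms in components.items():
        print(f"{method:6s} total = {sum(terms):6.2f} kcal/mol")   # -7.35, -6.21, -6.56

    # Hartree-Fock interaction energy (Table 1) + uncorrelated SAPT dispersion (Table 2)
    print(f"HF + disp(uncorr) = {2.25 + (-12.83):.2f} kcal/mol")   # -10.58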

total interaction energies at a near minimum geometry computed using SAPT, SAPT(DFT), and several supermolecular methods. We have used the M1 geometry defined in Table 3 of Ref. [12] and the basis set was also taken from that reference. This basis set is of double-zeta quality and includes bond functions. The monomer-centered "plus" basis sets (MC+BS) were used in the SAPT and SAPT(DFT) calculations, whereas the dimer-centered "plus" basis sets (DC+BS) were used in the supermolecular calculations. The latter calculations were performed in the counterpoise corrected way. The "plus" denotes the use of bond functions in both approaches and in the MC+BS case also the use of the isotropic part of the basis set of the interacting partner. The supermolecular calculations employ the many-body perturbation theory with the Møller-Plesset decomposition of the Hamiltonian (results denoted by MP) or the coupled-cluster (CC) methods with various levels of electron excitations: single (S), double (D), triple (T), quadruple (Q). The regular SAPT results given in Table 1 employ the complete standard set of corrections, in contrast to the calculations of Ref. [12] which used the sum of the Hartree-Fock interaction energy and of the dispersion energy with neglect of intramonomer correlation effects. Table 1 shows first that the higher-order terms neglected in Ref. [12] are quite important and decrease the magnitude of the interaction energy by more than 3 kcal/mol. SAPT(DFT) gives interaction energies within about 1 kcal/mol of the regular SAPT and about 0.5 kcal/mol of the CCSD(T) method, the most advanced of practically applicable electronic structure approaches. This is an excellent agreement taking into account that both the regular SAPT and CCSD(T) are much more computer intensive than SAPT(DFT). The SAPT(DFT) calculations were performed with two very different functionals: PBE0 [15, 16] and B97-2 [13, 14], which give results within 0.3 kcal/mol of each other, showing that SAPT(DFT) is only weakly dependent on the choice of the functional. The framework of SAPT provides insights into the physical structure of the interaction energy. In Table 2, the individual components of the interaction energy: the electrostatic, induction, dispersion, and exchange contributions are shown. The interaction energy at the same level as used in Ref. [12] can be obtained by adding the Hartree-Fock interaction energy from Table 1 and the dispersion energy with neglect of intramonomer correlation effects listed in Table 2. The sum of these two quantities, equal to -10.58 kcal/mol, differs from the minimum energy of -11.06 kcal/mol given in Table 3 of Ref. [12] since the latter result was computed in the DC+BS approach whereas the present results used the MC+BS scheme. It can be seen that, as already pointed out in Ref. [12], our results do not support the conventional description of interactions of large molecules which includes only the electrostatic component. Clearly, the first-order exchange and the dispersion energies are actually larger in magnitude than the electrostatic interactions. An


attempt to describe the DMNA dimer at the Hartree-Fock level, as is often done for large molecules, would lead to completely wrong conclusions, as the interaction energy at this level is positive. One can see in Table 2 a generally good agreement between the individual SAPT and SAPT(DFT) components. If the findings of Ref. [17] extend to the DMNA dimer, the SAPT(DFT) components can actually be more accurate than the SAPT ones. This research was supported by grants from the Army Research Office and the National Science Foundation.

References
[1] B. Jeziorski, R. Moszynski, and K. Szalewicz, Chem. Rev. 94, 1887 (1994).
[2] B. Jeziorski and K. Szalewicz, in Handbook of Molecular Physics and Quantum Chemistry, edited by S. Wilson, Wiley, 2002, Vol. 3, Part 2, Chap. 9, p. 232.
[3] SAPT2002: An Ab Initio Program for Many-Body Symmetry-Adapted Perturbation Theory Calculations of Intermolecular Interaction Energies, R. Bukowski et al., University of Delaware and University of Warsaw; http://www.physics.udel.edu/~szalewic/SAPT/SAPT.html.
[4] G. C. Groenenboom, E. M. Mas, R. Bukowski, K. Szalewicz, P. E. S. Wormer, and A. van der Avoird, Phys. Rev. Lett. 84, 4072 (2000).
[5] F. N. Keutsch, N. Goldman, H. A. Harker, C. Leforestier, and R. J. Saykally, Mol. Phys. 101, 3477 (2003).
[6] E. M. Mas, R. Bukowski, and K. Szalewicz, J. Chem. Phys. 118, 4386 (2003).
[7] E. M. Mas, R. Bukowski, and K. Szalewicz, J. Chem. Phys. 118, 4404 (2003).
[8] A. J. Misquitta and K. Szalewicz, Chem. Phys. Lett. 357, 301 (2002).
[9] A. J. Misquitta, B. Jeziorski, and K. Szalewicz, Phys. Rev. Lett. 91, 033201 (2003).
[10] H. L. Williams and C. F. Chabalowski, J. Phys. Chem. A 105, 646 (2001).
[11] A. Hesselmann and G. Jansen, Chem. Phys. Lett. 357, 464 (2002); ibid. 362, 319 (2002); ibid. 367, 778 (2003).
[12] R. Bukowski, K. Szalewicz, and C. Chabalowski, J. Phys. Chem. A 103, 7322 (1999).
[13] A. D. Becke, J. Chem. Phys. 107, 8554 (1997).
[14] P. J. Wilson, T. J. Bradley, and D. J. Tozer, J. Chem. Phys. 115, 9233 (2001).
[15] J. P. Perdew, K. Burke, and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996).
[16] C. Adamo, M. Cossi, and V. Barone, J. Mol. Struct. (Theochem) 493, 145 (1999).
[17] A. J. Misquitta et al., to be published.


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 1037-1041

Computational challenges in the determination of structures and energetics of small hydrogen-bonded clusters Ajit J. Thakkar 1 Department of Chemistry, University of New Brunswick, Fredericton, New Brunswick E3B 6E2, Canada Received 25 July, 2004 Abstract: Our laboratory has been involved in the determination of the structures and energetics of small hydrogen-bonded clusters by various quantum chemical methods including semiempirical, density functional, and ab initio Møller-Plesset perturbation theory and coupled cluster methods. Clusters involving formic acid, nitric acid, glycolic acid and water molecules have been studied. The computational challenges encountered, the strategies used to face them, and some of the results obtained are surveyed. Keywords: Hydrogen-bonded clusters, energy minimization, Gaussian-type functions, Hartree-Fock method, density functional theory, Møller-Plesset perturbation theory, coupled cluster approach. PACS: 36.40.-c, 31.25.Qm, 31.15.Ew

1 Cluster chemistry

The properties of a piece of bulk crystal do not change dramatically as we repeatedly subdivide it until the piece reaches molecular dimensions or, in other words, the nanometer scale. Particles of a material consisting of a few to a few thousand atoms are called clusters. The properties of clusters often show dramatic size and shape dependence. Clusters of metals, semiconductors, ionic solids, rare gases, and small molecules have been studied using both theoretical and experimental methods. Intense interest in clusters arises because they can be used to investigate surface properties including mechanisms of heterogeneous catalysis [1], and because clusters can serve as building blocks for new materials and electronic devices. An outstanding example of the fruits of cluster chemistry is fullerene chemistry [2, 3] which grew out of the study of carbon clusters, and has become important in nanotechnology [4]. Recent books on clusters include a monograph on metal clusters [5], and edited collections on molecular clusters [6] and metal nanoparticles [7] . Molecular clusters are held together by relatively weak intermolecular forces [8, 9] or by hydrogen bonds [10]. Hydrogen-bonded clusters are an important class of molecular clusters. Small clusters of water molecules have received a lot of attention; see, for example, the experimental work of Saykally and coworkers [11], and the computational investigations of Xantheas and coworkers [12]. We have studied several types of hydrogen-bonded clusters as outlined next. 1 E-mail:

ajit@unb.ca, Web page: http://www.unb.ca/chem/ajit/



Figure 1: The Z and E rotamers of formic acid, and the gas-phase dimer structure F2.

2 Hydrogen-bonded clusters

Formic acid is present in clouds and fog, and plays an important role in human metabolism. Formic acid exhibits rotational isomerism between the Z and E forms shown in Fig. 1. Both rotamers have been well characterized through spectroscopic techniques. Both experimental and theoretical studies indicate that the Z rotamer is more stable than the E rotamer by about 4.0 kcal/mol [13, 14]. The structure of formic acid differs markedly in different phases. In the gas phase, formic acid forms the cyclic C2h dimer, depicted as D in Fig. 1, with two strong, nearly linear, equivalent O-H···O=C H-bonds as shown by spectroscopic [15, 16, 17] and quantum chemical methods [18, 19]. Like acetic and glycolic acid but unlike many other carboxylic acids whose crystal structures consist of associated dimers, formic acid crystallizes in long catemeric chains [20, 21] in which H-bonds link each molecule to two neighbors. Chains of the Z form are found in the low temperature (4.5 K) crystal structure [21] whereas chains of the E form are found at higher temperatures [20]. The structure of liquid formic acid remains a subject of debate but probably consists of short chains similar to those observed in the solid [22]. Hence it is of some interest to determine how the structures of small clusters of formic acid molecules evolve with cluster size. Roy, McCarthy and I have performed density functional theory (DFT) computations to study the structures of trimers [23], tetramers [24], pentamers [25] and hexamers [26] of formic acid.

Glycolic acid, CH2OHCOOH, plays an important role in dermatology and the cosmetics industry [27, 28], and it is involved in several life processes [29]. The glycolic acid molecule has rich functionality that allows it to simultaneously form intra- and intermolecular O-H···O hydrogen bonds, and also allows for weaker C-H···O interactions. The growth of clusters of glycolic acid is of interest for the same reasons as apply to clusters of formic acid. Kassimi and I have investigated the competition between various types of hydrogen bonding in dimers of glycolic acid [30]. Raman and infrared spectroscopy studies [31] suggest that glycolic acid exists in a monomeric form in dilute aqueous solution. A first step towards understanding such a solution is to consider small gas-phase clusters consisting of a glycolic acid molecule and a few water molecules. My group has performed semiempirical, DFT, Møller-Plesset perturbation theory and coupled cluster computations on clusters of glycolic acid with 1-6 [32, 33], and 16 and 28 water molecules [34].

Nitric acid, HNO3, is widely used to manufacture explosives such as nitroglycerin and fertilizers such as ammonium nitrate. Nitric acid plays an important role in atmospheric chemistry because it acts as a stratospheric reservoir for NOx. Hart and I have examined the structural isomers of the dimers of nitric acid [35] with DFT and Møller-Plesset perturbation theory.

The computational challenges involved in these studies include the choice of computationally efficient but sufficiently accurate quantum chemical methods and basis sets. Since the number of structural isomers grows exponentially with cluster size, a major challenge is to find all low-lying structures including the global minimum. The purpose of this talk is to survey the strategies we have adopted to face these challenges, and some of the results we have obtained. As an example of


Figure 2: A quasi-spherical, structural isomer of a cluster of glycolic acid and 28 water molecules.

our results, a quasi-spherical isomer of a cluster of glycolic acid and 28 water molecules is shown in Figure 2. It fits standard conceptions of a solvated glycolic acid molecule. A stacked isomer of a cluster of glycolic acid and 28 water molecules is shown in Figure 3. It can be described as a glycolic acid molecule attached to the side of a small crystal of ice made up of a stack of cubic arrangements of water molecules!
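The global-minimum search problem mentioned above is commonly attacked with stochastic strategies. The sketch below is a generic random-restart local-optimization example on a toy Lennard-Jones cluster; it is my illustration of the idea, not the procedure used in these studies, which relied on quantum chemical energies rather than a pair potential.

    import numpy as np
    from scipy.optimize import minimize

    def lj_energy(flat_coords):
        """Total Lennard-Jones energy (epsilon = sigma = 1) of a cluster; a toy stand-in
        for the quantum chemical energies used in the actual cluster studies."""
        xyz = flat_coords.reshape(-1, 3)
        e = 0.0
        for i in range(len(xyz)):
            for j in range(i + 1, len(xyz)):
                r = np.linalg.norm(xyz[i] - xyz[j])
                e += 4.0 * (r**-12 - r**-6)
        return e

    def random_restart_search(n_atoms=7, n_trials=30, seed=1):
        """Crude global-minimum search: many random starting geometries, each locally relaxed."""
        rng = np.random.default_rng(seed)
        best = None
        for _ in range(n_trials):
            start = rng.uniform(-2.0, 2.0, size=3 * n_atoms)
            res = minimize(lj_energy, start, method="BFGS")
            if best is None or res.fun < best.fun:
                best = res
        return best

    best = random_restart_search()
    print(f"lowest energy found: {best.fun:.4f}")

Because the number of isomers grows so quickly, such blind restarts become inefficient for larger clusters, which is exactly the challenge the survey addresses.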


Figure 3: A stacked, structural isomer of a cluster of glycolic acid and 28 water molecules.


Acknowledgments This lecture would not have been possible without the contributions of my coworkers: Amlan K. Roy, Noureddin El Bakali Kassimi, Shaowen Hu, James R. Hart, Edet F. Archibong and Shane P. McCarthy. I thank the Natural Sciences and Engineering Research Council of Canada for their continuing support.

References
[1] G. Ertl and H. J. Freund. Catalysis and surface science. Physics Today 52, 32-38 (1999).
[2] R. Taylor. Lecture Notes on Fullerene Chemistry: A Handbook for Chemists (Imperial College, London, 1999).
[3] A. Hirsch. The Chemistry of the Fullerenes (Wiley-VCH, New York, 2002).
[4] E. Osawa. Perspectives of Fullerene Nanotechnology (Kluwer Academic, New York, 2002).
[5] W. Ekardt. Metal Clusters (Wiley, New York, 1999).
[6] M. Driess and H. Noth (Eds.). Molecular Clusters of the Main Group Elements (Wiley, New York, 2004).
[7] D. L. Feldheim and C. A. Foss, Jr. (Eds.). Metal Nanoparticles (Marcel Dekker, New York, 2001).
[8] A. J. Stone. The Theory of Intermolecular Forces (Oxford, New York, 1996).
[9] A. J. Thakkar. Intermolecular interactions. In Encyclopedia of Chemical Physics and Physical Chemistry, (eds.) J. Moore and N. Spencer (Institute of Physics Publishing, Bristol, 2001), vol. I. Fundamentals, chap. A1.5, pp. 161-186.
[10] S. Scheiner. Hydrogen bonding: A theoretical perspective (Oxford, New York, 1997).
[11] F. N. Keutsch and R. J. Saykally. Water clusters: Untangling the mysteries of the liquid, one molecule at a time. Proc. Nat. Acad. Sci. U.S.A. 98, 10533-10540 (2001).
[12] S. S. Xantheas, C. J. Burnham, and R. J. Harrison. Development of transferable interaction models for water. II. Accurate energetics of the first few water clusters from first principles. J. Chem. Phys. 116, 1493-1499 (2002).
[13] W. H. Hocking. The other rotamer of formic acid, cis-formic acid. Z. Naturforsch. 31A, 1113-1121 (1976).
[14] M. Pettersson, J. Lundell, L. Khriachtchev, and M. Räsänen. IR spectrum of the other rotamer of formic acid, cis-HCOOH. J. Am. Chem. Soc. 119, 11715-11716 (1997).
[15] G. H. Kwei and R. F. Curl, Jr. Microwave spectrum of O18 formic acid and structure of formic acid. J. Chem. Phys. 32, 1592-1594 (1960).
[16] A. Almenningen, O. Bastiansen, and T. Motzfeldt. Reinvestigation of the structure of monomer and dimer formic acid by gas-phase electron diffraction technique. Acta Chem. Scand. 23, 2848-2864 (1969).
[17] A. Almenningen, O. Bastiansen, and T. Motzfeldt. Influence of deuterium substitution on the hydrogen bond of dimer formic acid. Acta Chem. Scand. 24, 747-748 (1970).


[18] L. Turi. Ab initio molecular orbital analysis of dimers of cis-formic acid. Implications for condensed phases. J. Phys. Chem. 100, 11285-11291 (1996).
[19] W. Qian and S. Krimm. Spectroscopically determined molecular mechanics model for the intermolecular interactions in hydrogen-bonded formic acid dimer structures. J. Phys. Chem. A 105, 5046-5053 (2001).
[20] I. Nahringbauer. Hydrogen bond studies: CXXVII. A reinvestigation of the structure of formic acid (98 K). Acta Crystallogr. B 34, 315-318 (1978).
[21] A. Albinati, K. D. Rouse, and M. W. Thomas. Neutron powder diffraction analysis of hydrogen-bonded solids. II. Structural study of formic acid at 4.5 K. Acta Crystallogr. B 34, 2188-2190 (1978).
[22] P. Jedlovszky, I. Bako, G. Palinkas, and J. C. Dore. Structural investigation of liquid formic acid. X-ray and neutron diffraction, and reverse Monte Carlo study. Mol. Phys. 86, 87-105 (1995).
[23] A. K. Roy and A. J. Thakkar. Structures of the formic acid trimer. Chem. Phys. Lett. 386, 162-168 (2004).
[24] A. K. Roy and A. J. Thakkar. Formic acid tetramers: A structural study. Chem. Phys. Lett. 393, 347-354 (2004).
[25] A. K. Roy and A. J. Thakkar. Pentamers of formic acid. To be published (2005).
[26] A. K. Roy, S. P. McCarthy, and A. J. Thakkar. Hexamers of formic acid. To be published (2005).
[27] L. S. Moy, K. Howe, and R. L. Moy. Glycolic acid modulation of collagen production in human skin fibroblast cultures in vitro. Dermatol. Surg. 22, 439-441 (1996).
[28] R. G. Males and F. G. Herring. A 1H-NMR study of the permeation of glycolic acid through phospholipid membranes. Biochim. Biophys. Acta-Biomembr. 1416, 333-338 (1999).
[29] I. Zelitch. Plant respiration. In McGraw-Hill Encyclopedia of Science and Technology (McGraw-Hill, New York, 1992), vol. 13, pp. 705-710. 7th ed.
[30] N. E.-B. Kassimi, E. F. Archibong, and A. J. Thakkar. Hydrogen bonding in the glycolic acid dimer. J. Mol. Struct. (Theochem) 591, 189-197 (2002).
[31] G. Cassanas, M. Morssli, E. Fabreque, and L. Bardet. Etude spectrale de l'acide glycolique, des glycolates et du processus de polymerisation. (Spectral study of glycolic acid, glycolates and of the polymerization process). J. Raman Spectrosc. 22, 11-17 (1991).
[32] A. J. Thakkar, N. E.-B. Kassimi, and S. Hu. Hydrogen-bonded complexes of glycolic acid with one and two water molecules. Chem. Phys. Lett. 387, 142-148 (2004).
[33] A. K. Roy, S. Hu, and A. J. Thakkar. Clusters of glycolic acid with three to six water molecules. To be published (2004).
[34] A. K. Roy, J. R. Hart, and A. J. Thakkar. Clusters of glycolic acid with 16 and 28 water molecules. To be published (2005).
[35] J. R. Hart and A. J. Thakkar. Nitric acid dimer structures. To be published (2004).


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 1042-1045

Numerical solution of the Schrödinger equation for two-dimensional double-well oscillators

Amlan K. Roy, Ajit J. Thakkar¹
Department of Chemistry, University of New Brunswick, Fredericton, New Brunswick E3B 6E2, Canada

B. M. Deb
Department of Chemistry, Panjab University, Chandigarh 160 014, India

Received 27 July, 2004

Abstract: Wave functions, energies and selected expectation values of the low-lying stationary states of two-dimensional double-well potentials are obtained from the long-time solutions of the corresponding time-dependent Schrödinger equation. The latter is transformed to a diffusion-like equation which is then solved by an alternating-direction, implicit, finite-difference method.

Keywords: Double-well oscillators, two-dimensional Schrödinger equation, numerical solution, finite differences.

PACS: 36.40.-c, 31.25.Qm, 31.15.Ew

1 Introduction

A double-well oscillator is described by a potential function that has two minima separated by a barrier. Problems which are modeled with the help of double-well potentials include the inversion of ammonia, tunneling of protons in hydrogen-bonded systems, structural phase transitions, and quantum coherence in Josephson junction superconductors. Thus, it is not surprising that one-dimensional quantum systems with double-well potentials, particularly the anharmonic potential function V(x) = −Z²x² + Ax⁴, have been studied extensively. Relatively little work has been done on double-well potentials in two and three dimensions. Progress has been made by Witwit and coworkers [1, 2, 3, 4, 5] on the computation of energy levels for such potentials, but wave functions and properties other than the energy remain unexamined to our knowledge. In this work we examine energies, wave functions and properties of the three lowest states of the two-dimensional double-well potential given by

    V(x, y) = −Z_x² x²/2 − Z_y² y²/2 + A(a_xx x⁴ + 2 a_xy x²y² + a_yy y⁴)/2                    (1)

Atomic units (ħ = m_e = e = 1) are used throughout.

¹Corresponding author. E-mail: ajit@unb.ca, Web page: http://www.unb.ca/chem/ajit/


2 Method

The quantities of interest are solutions of the time-independent Schrödinger equation

    Ĥ φ_{m,n}(x, y) = E_{m,n} φ_{m,n}(x, y)                                                     (2)

where the time-independent Hamiltonian is given by

    Ĥ = −(1/2)(D_x² + D_y²) + V(x, y)                                                           (3)

in which D_x² = ∂²/∂x², D_y² = ∂²/∂y², the potential V is given by Eq. (1), and two quantum numbers {m, n} are used to label the solutions of the two-dimensional Schrödinger equation (2). In this work, we find the stationary-state wave functions φ_{m,n}(x, y) as long-time limits of solutions of the time-dependent Schrödinger equation

    Ĥ ψ(x, y; t) = i ∂ψ(x, y; t)/∂t                                                             (4)

where Ĥ is the time-independent Hamiltonian of Eq. (3). As in the work of Anderson [6], we write Eq. (4) in imaginary time τ, substitute τ = −it, and let D_t ≡ ∂/∂t, to obtain a diffusion-like equation in two space dimensions

    Ĥ ψ(x, y; t) = −D_t ψ(x, y; t)                                                              (5)

or simply Ĥ = −D_t in operator form. Eq. (5) resembles a diffusion quantum Monte Carlo equation. One may express ψ(x, y; t) as [7]

    ψ(x, y; t) = C_{0,0} φ_{0,0}(x, y) + Σ_{m,n>0} C_{m,n} φ_{m,n}(x, y) e^{−(E_{m,n} − E_{0,0}) t}      (6)

from which it is apparent that

    lim_{t→∞} ψ(x, y; t) = C_{0,0} φ_{0,0}(x, y)                                                (7)

so that numerically propagating ψ(x, y; t) to a sufficiently long time t will give us the ground-state time-independent wave function apart from a normalization constant. Expectation values of properties, including the energy, can be obtained as mean values of the pertinent operator Â,

    ⟨A⟩ = lim_{t→∞} ⟨ψ(x, y; t)| Â |ψ(x, y; t)⟩                                                 (8)

where the angular brackets indicate integration over the entire domain of the spatial variables {x, y}. Excited states can be treated in the same way provided that one ensures that they stay orthogonal to all lower states at each time step.

A finite-difference method is developed for the numerical solution of Eq. (5) using Peaceman-Rachford splitting [8]. This is an unconditionally stable and convergent method that falls in the category of alternating-direction, implicit, finite-difference methods [9]. In the method, if the spatial grid consists of N_x × N_y points, then each time step requires the solution of N_y − 2 sets of tridiagonal linear equations of dimension N_x − 2, followed by the solution of N_x − 2 sets of tridiagonal linear equations of dimension N_y − 2. A standard LU decomposition technique [10] is used for the solution of the tridiagonal systems. N_x = N_y = 1951 was used in our final computations. The energy calculated at each time step is used to monitor convergence of the time-dependent wave function to the desired stationary-state wave function. The number of time steps required for convergence varied from a few hundred to a few thousand. Details of the method will be presented in the lecture and will also be published in full elsewhere.
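The following minimal Python sketch illustrates the scheme just described: imaginary-time propagation of Eq. (5) with Peaceman-Rachford (alternating-direction implicit) splitting on a small grid (201×201 rather than the 1951×1951 grid used in the final computations), zero boundary conditions, and the decay rate of the norm as the ground-state energy estimate. It is not the authors' code; the function names, grid size, time step and tolerance are illustrative choices, and the default potential parameters are those of the left panels of Figure 1 below.

```python
import numpy as np

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system by the standard LU (Thomas) sweep."""
    n = len(rhs)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * cp[i - 1]
        cp[i] = upper[i] / m
        dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def double_well(x, y, Zx2=5.0, Zy2=5.0, A=5.0, axx=1.0, axy=1.0, ayy=1.0):
    """Potential of Eq. (1); defaults correspond to the 'shallow well' case of Fig. 1."""
    X, Y = np.meshgrid(x, y, indexing="ij")
    return (-0.5 * (Zx2 * X**2 + Zy2 * Y**2)
            + 0.5 * A * (axx * X**4 + 2.0 * axy * X**2 * Y**2 + ayy * Y**4))

def ground_state(L=5.0, N=201, dt=2e-3, steps=4000, tol=1e-9):
    """Propagate Eq. (5) to large imaginary time; psi is held at zero on the boundary."""
    x = np.linspace(-L, L, N)
    h = x[1] - x[0]
    V = double_well(x, x)
    psi = np.exp(-(x[:, None] ** 2 + x[None, :] ** 2))   # crude starting guess
    psi /= np.sqrt(np.sum(psi ** 2) * h * h)
    r = dt / (4.0 * h * h)                                # (dt/2) * 1/(2 h^2)
    E_old = None
    for _ in range(steps):
        # half step: implicit in x, explicit in y
        half = psi.copy()
        for j in range(1, N - 1):
            rhs = (psi[1:-1, j]
                   + r * (psi[1:-1, j + 1] - 2 * psi[1:-1, j] + psi[1:-1, j - 1])
                   - 0.25 * dt * V[1:-1, j] * psi[1:-1, j])
            diag = 1.0 + 2.0 * r + 0.25 * dt * V[1:-1, j]
            off = np.full(N - 2, -r)
            half[1:-1, j] = thomas(off, diag, off, rhs)
        # half step: implicit in y, explicit in x
        new = half.copy()
        for i in range(1, N - 1):
            rhs = (half[i, 1:-1]
                   + r * (half[i + 1, 1:-1] - 2 * half[i, 1:-1] + half[i - 1, 1:-1])
                   - 0.25 * dt * V[i, 1:-1] * half[i, 1:-1])
            diag = 1.0 + 2.0 * r + 0.25 * dt * V[i, 1:-1]
            off = np.full(N - 2, -r)
            new[i, 1:-1] = thomas(off, diag, off, rhs)
        norm = np.sqrt(np.sum(new ** 2) * h * h)
        E = -np.log(norm) / dt        # norm decays (or grows) as exp(-E0*dt)
        psi = new / norm
        if E_old is not None and abs(E - E_old) < tol:
            break
        E_old = E
    return x, psi, E

if __name__ == "__main__":
    x, psi, E = ground_state()
    print("estimated ground-state energy:", E)
```

Excited states would additionally require reorthogonalization against all lower states after each time step, as noted above.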


3 Sample Results

Our results are illustrated in Figure 1 for two choices of the potential parameters. Both potentials have four wells separated by ridges near the perimeter, and a large maximum centered at the origin. The wells on the left are relatively shallow and the ground-state wave function is peaked at the origin, whereas the wells on the right are relatively deep and so show localization at the wells, manifested as four maxima in the ground-state wave function. This trend continues in the (1, 0) state, with only two extrema seen on the left but four on the right. In the (1, 1) state, we see that the probability density for crossing between the peaks is higher for the shallower wells.

Figure 1: The bottom row shows the potentials and the next three rows show the corresponding wave functions for the (0,0), (1,0) and (1,1) states, respectively, as one moves up the page. The left panels correspond to Z_x² = Z_y² = A = 5 and a_xx = a_yy = a_xy = 1. The right panels correspond to Z_x² = Z_y² = 10, A = 3/2, a_xx = a_yy = 1, and a_xy = 1/2.


Acknowledgments

The Natural Sciences and Engineering Research Council of Canada provided support for this work.

References

[1] M. R. M. Witwit and J. P. Killingbeck. A Hill-determinant approach to symmetric double-well potentials in two dimensions. Can. J. Phys. 73, 632-637 (1995).
[2] M. R. M. Witwit. Energy levels for nonsymmetric double-well potentials in several dimensions: Hill determinant approach. J. Comp. Phys. 123, 369-378 (1996).
[3] M. R. M. Witwit. Application of the Hill determinant approach to several forms of potentials in 2-dimensional and 3-dimensional quantum systems. J. Math. Chem. 20, 273-283 (1996).
[4] M. R. M. Witwit and N. A. Gordon. Calculating energy levels of a double-well potential in a two-dimensional system by expanding the potential function around its minimum. Can. J. Phys. 75, 705-714 (1997).
[5] M. R. M. Witwit. Inner-product perturbation theory. Energy levels of double-well potentials for 2-dimensional quantum systems by expanding the potential functions around their minima. J. Math. Chem. 22, 11-23 (1997).
[6] J. B. Anderson. A random walk simulation of the Schrödinger equation: H₃⁺. J. Chem. Phys. 63, 1499-1503 (1975).
[7] B. L. Hammond, W. A. Lester Jr., and P. J. Reynolds. Monte Carlo Methods in Ab Initio Quantum Chemistry (World Scientific, Singapore, 1994).
[8] D. W. Peaceman and H. H. Rachford. The numerical solution of parabolic and elliptic differential equations. J. Soc. Ind. Appl. Math. 3, 28-41 (1955).
[9] J. W. Thomas. Numerical Partial Differential Equations: Finite Difference Methods (Springer, New York, 1995).
[10] G. Dahlquist and Å. Björck. Numerical Methods (Prentice-Hall, London, 1974).


Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 1046-1050

Applications of the orbital-free embedding formalism to study the environment-induced changes in the electronic structure of molecules in condensed phase

T.A. Wesolowski¹
Department of Physical Chemistry, University of Geneva, 30, quai Ernest-Ansermet, CH-1211 Geneva 4, Switzerland

Received 31 July, 2004

Abstract: We outline the key elements of the universal orbital-free first-principles-based embedding formalism applicable in theoretical studies of the electronic structure of atoms, molecules, intermolecular complexes, etc. in the presence of the environment [Wesolowski and Warshel, J. Phys. Chem., 97 (1993) 8050]. So far, most of the applications of this formalism have concerned studies of the potential energy surface (geometries, IR spectra, etc.) of embedded molecules. Here, we review the past and current applications of the orbital-free embedding formalism to study the electronic structure of embedded molecules.

Keywords: Density functional theory, embedding, electronic structure, kinetic energy functional

PACS: 31.15.Ew, 31.15.Bs, 71.10.Ca, 71.70.-d

1 Introduction

Embedded systems (also called confined systems) are becoming objects of great interest in physics, chemistry, and materials science. The theoretical description of an ion, atom, molecule, etc. embedded in a microscopic environment in condensed matter represents a serious challenge for theory (for a review of various quantum mechanical approaches to the study of confined systems, see Ref. [1] for instance). One of the strategies is based on the idea of an embedding (or confining) potential. It traces its origin to the ideas of Sommerfeld and Welker [2]. Although the basic physical laws are known, a first-principles-based, universally applicable embedding potential suitable for practical computer modelling studies has not yet been formulated. Instead, another category of methods, using system-tailored empirical embedding potentials, is widely used in studies of liquids, solids, and biomolecules [3]. In this work, we outline the density functional theory route to deriving a first-principles-based embedding potential. It was initially used by Wesolowski and Warshel to study solvated molecules [4]. Its applications to potential-energy-surface-related properties (physisorption, proton transfer reactions in liquids and enzymes, IR spectra of probe molecules in zeolites, etc.) were reviewed elsewhere [5]. This work concerns the methodological and computer-implementation issues relevant to its applications in studies of the electronic structure of embedded systems. They are illustrated by the past and current applications.

¹Corresponding author. E-mail: [email protected]

2 Hohenberg-Kohn theorems in the subsystem-based formulation of density functional theory

In this section, we outline the basic ideas of the subsystem-based formulation of DFT originally proposed by Cortona [6]. It will be given for the case of two interacting subsystems, relevant for the subsequent considerations concerning orbital-free embedding [4]. For the sake of simplicity, the formalism is given for the spin-compensated case in all formulas throughout this work. Following the Hohenberg-Kohn theorems [7], the ground-state electron density of a given system comprising N_AB electrons can be derived from the minimization of the following total-energy functional:

    E_0 = min_ρ E_v[ρ]                                                                          (1)

where

    E_v[ρ] = T_s[ρ] + (1/2) ∬ ρ(r')ρ(r)/|r' − r| dr' dr + ∫ v(r) ρ(r) dr + E_xc[ρ]              (2)

where v is the external potential, E_xc[ρ] is the exchange-correlation functional defined by Kohn and Sham [8], and T_s[ρ] is the kinetic energy functional of a reference system of non-interacting electrons, defined in the Levy "constrained search" [9]:

    T_s[ρ] = min_{Ψ_s → ρ} ⟨Ψ_s| T̂ |Ψ_s⟩                                                        (3)

where Ψ_s denotes the trial functions of the single-determinant form. Alternatively, E_0 can be obtained from a step-wise search:

    E_0 = min_{ρ_A} min_{ρ_B} E_v[ρ_A, ρ_B]                                                     (4)

where

    E_v[ρ_A, ρ_B] = T_s[ρ_A + ρ_B]
                    + (1/2) ∬ (ρ_A(r') + ρ_B(r'))(ρ_A(r) + ρ_B(r))/|r' − r| dr' dr
                    + ∫ v(r)(ρ_A(r) + ρ_B(r)) dr + E_xc[ρ_A + ρ_B]                              (5)

In the above formula, the kinetic energy T_s[ρ_A + ρ_B] is expressed in a hybrid way:

    T_s[ρ_A + ρ_B] = T_s[ρ_A] + T_s[ρ_B] + T_s^nad[ρ_A, ρ_B]                                    (6)

3 Orbital-free embedding

Following the same ideas as those leading to the Kohn-Sham equations leads to one-electron equations which allow one to derive ρ_A for a given ρ_B (the electron density of the environment) [4]:

    [ −(1/2)∇² + V_eff^KSCED[ρ_A, ρ_B; r] ] φ_i^(A) = ε_i^(A) φ_i^(A),   i = 1, ..., N_A/2      (7)

where ρ_A = 2 Σ_i |φ_i^(A)|². The superscript KSCED (Kohn-Sham Equations with Constrained Electron Density) is used to indicate the difference between the effective potential in Eq. 7 and that in the Kohn-Sham formalism [8]. Eq. 8 leads not only to the electron density which minimizes the bi-functional E_v[ρ_A, ρ_B] for a given ρ_B, but also to a set of one-electron functions (embedded orbitals {φ_i^(A)}) and the corresponding eigenvalues {ε_i^(A)}. If the two sets of orbitals ({φ_i^(A)} and {φ_i^(B)}), one for each subsystem, are obtained from coupled calculations in which ρ_A and ρ_B switch their roles [18], they are not orthogonal except in some particular cases. Therefore, they cannot generally be considered equivalent to the Kohn-Sham orbitals of the whole system. The total effective potential V_eff^KSCED[ρ_A, ρ_B; r] can be conveniently split into one component, the Kohn-Sham effective potential for the isolated subsystem (V^KS[r; ρ_A]), and the remaining part representing the environment (V_emb[r; ρ_A, ρ_B]), which reads:

    V_emb[r; ρ_A, ρ_B] = −Σ_{I∈B} Z_I/|r − R_I| + ∫ ρ_B(r')/|r − r'| dr'
                         + δE_xc[ρ_A + ρ_B]/δρ_A − δE_xc[ρ_A]/δρ_A + δT_s^nad[ρ_A, ρ_B]/δρ_A    (8)

where the first term shows the explicit form of the contributions of the atomic nuclei of the environment to the effective potential. Eq. 8 provides a general form of the Hohenberg-Kohn-theorem-based embedding potential. Several authors adapted the orbital-free embedding potential of the form given in Eq. 8 and used it in various contexts [10]. It is worthwhile to note that using explicit orbitals to evaluate T_s[ρ_A] and T_s[ρ_B], and an explicit but approximate bi-functional of the two electron densities ρ_A and ρ_B to evaluate T_s^nad[ρ_A, ρ_B], situates this approach somewhere between the Kohn-Sham formalism, where orbitals are used for the whole system, and the orbital-free strategy originally introduced to quantum mechanics by Thomas and Fermi [11].
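As a hedged illustration of how the embedding potential of Eq. 8 is assembled in practice (this is not the author's implementation, which relies on GGA-type functionals and the deMon code discussed below), the sketch below evaluates Eq. 8 on a set of real-space grid points using the Thomas-Fermi kinetic functional for the non-additive kinetic term and LDA (Dirac) exchange only; all function and parameter names are illustrative.

```python
import numpy as np

# Thomas-Fermi constant for the kinetic-energy functional (atomic units)
C_TF = (3.0 / 10.0) * (3.0 * np.pi ** 2) ** (2.0 / 3.0)

def v_tf(rho):
    """Functional derivative dT_TF/drho of the Thomas-Fermi functional."""
    return (5.0 / 3.0) * C_TF * np.maximum(rho, 1e-12) ** (2.0 / 3.0)

def v_x_lda(rho):
    """Functional derivative of LDA (Dirac) exchange; correlation omitted for brevity."""
    return -(3.0 / np.pi) ** (1.0 / 3.0) * np.maximum(rho, 1e-12) ** (1.0 / 3.0)

def embedding_potential(grid, rho_a, rho_b, nuclei_b, weights):
    """
    Approximate KSCED embedding potential of Eq. (8) on grid points.

    grid     : (N, 3) array of grid-point coordinates (not coinciding with nuclei)
    rho_a    : (N,) density of the embedded subsystem A on the grid
    rho_b    : (N,) density of the environment B on the grid
    nuclei_b : list of (charge, position) tuples for the environment nuclei
    weights  : (N,) quadrature weights for the Coulomb integral
    """
    v = np.zeros(len(grid))
    # nuclear attraction from the environment nuclei
    for charge, pos in nuclei_b:
        v -= charge / np.linalg.norm(grid - pos, axis=1)
    # Coulomb potential of rho_B (naive O(N^2) quadrature, fine for a sketch)
    dist = np.linalg.norm(grid[:, None, :] - grid[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)          # skip the self-interaction point
    v += (rho_b * weights / dist).sum(axis=1)
    # non-additive exchange(-correlation) term
    v += v_x_lda(rho_a + rho_b) - v_x_lda(rho_a)
    # non-additive kinetic term, Thomas-Fermi approximation
    v += v_tf(rho_a + rho_b) - v_tf(rho_a)
    return v
```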

4 Approximating the total energy bi-functional E[ρ_A, ρ_B]

Applications of Eqs. 7-8 in computer modelling rely on approximations to the relevant functionals: T_s^nad[ρ_A, ρ_B] and E_xc[ρ]. In this work, we use approximate functionals chosen on the basis of dedicated studies [12, 13]. The chosen approximations lead to an approximate bi-functional E[ρ_A, ρ_B] which was shown to reproduce very accurately the energetics of weakly overlapping electron densities ρ_A and ρ_B, such as those occurring for weak intermolecular complexes close to the equilibrium geometry [14].

5 Applications

The orbital-free embedding potential of Eq. 8 was originally used to derive the solvation free-energy difference between water and methane [15]. The overall accuracy of the results in such calculations depends on both the accuracy of the approximate functional derivative δT_s^nad[ρ_A, ρ_B]/δρ_A and that of the approximate bi-functional T_s^nad[ρ_A, ρ_B]. In these applications, they were approximated using the gradient expansion approximation of the kinetic energy functional [16] truncated to the second order. Although this approximation proved to be quite satisfactory for evaluating the energies, the potential was found to have a serious flaw: it manifests itself in a significant underestimation of the effect of the solvent on the dipole moment of the solute.


The magnitude of the instantaneous value of the dipole moment of the embedded water molecule, evaluated along the molecular dynamics trajectory, was shown to oscillate with an amplitude of about 0.05 Debye around a mean value which was about 0.2 Debye larger than that of the isolated water molecule. The orbital-free embedding thus reproduced the effect of the solvent qualitatively, but the solvation-induced increase of the dipole moment was underestimated by a factor of 3-4. This result indicated that, for studies of electronic-structure-related properties, the approximation to the embedding potential based on the gradient expansion approximation is inadequate. Indeed, dedicated studies on the accuracy of various approximations to T_s^nad[ρ_A, ρ_B] resulted in the construction of an approximate bi-functional T_s^nad[ρ_A, ρ_B] of the generalized gradient approximation form [13], which was shown to be significantly superior to the one derived from the gradient expansion approximation [12]. In particular, the interaction-induced shifts of the dipole moment in such systems as hydrogen-bonded intermolecular complexes were improved considerably. This methodological development, together with a new implementation of Eqs. 7-8 [18] based on the computer code deMon [19], made it possible to return to our interest in the electronic structure of embedded systems. However, the new implementation has been used mainly in various studies related to the potential energy surface; such studies have been reviewed elsewhere [5].

The first practical application of this new implementation to study electronic-structure-dependent properties was the evaluation of the effect of a noble-gas (either Ne or Ar) matrix on the isotropic component of the hyperfine tensor (A_iso) of the embedded magnesium cation [17]. In these studies, the electron density of the noble-gas atoms of the first coordination shell was taken as ρ_B in Eq. 8. The numerical results showed that the approximations used in Eq. 8 were indeed encouraging. For instance, the calculated values, 222.9 gauss (Ar) and 210.4 gauss (Ne), were in excellent agreement with the experimental measurements (222.4 gauss for Ar and 211.6 gauss for Ne). It is worthwhile to note that these results were obtained with a rather large basis set. For smaller basis sets, the numerical values agree less well with experiment, but the relative shifts of A_iso (free cation vs. embedded cation) were found to depend only negligibly on the size of the basis set. Since the contribution of the electrostatic components of the embedding potential in Eq. 8 can be expected to be negligible when the embedding density is that of one or more noble-gas atoms, the excellent agreement of the calculated shifts of A_iso with experiment indicates that the approximations used for the other terms in Eq. 8 are sufficiently accurate to describe the sum of the remaining non-electrostatic terms.

Recently, we started several new types of applications of the orbital-free embedding formalism to study such electronic-structure-dependent properties as: i) the hyperfine structure of Mn impurities in perovskites, ii) the orbital-level splitting of lanthanide ions in a crystalline environment, iii) complexation-induced shifts of localized excitations in hydrogen-bonded dimers, and iv) the spin state of transition-metal centres in metalloenzymes. The last type of application became possible owing to the merger of two formalisms [20]: the orbital-free embedding and the linear-response time-dependent density-functional theory route to investigating electronic excitations.

Acknowledgment

This work is supported by the Swiss National Science Foundation (Project 21-63645.00).

References

[1] W. Jaskolski, Phys. Rep., 271 (1996) 1.
[2] A. Sommerfeld and H. Welker, Ann. Phys., 32 (1938) 56.


[3] There is a vast literature concerning the 'embedded molecule approach'. For pioneering papers in chemistry and solid state physics, see: (a) A. Warshel and M. Karplus, J. Am. Chem. Soc., 94 (1972) 5612; (b) J.L. Whitten and T.A. Pakkanen, Phys. Rev. B, 21 (1980) 4357. For a review, see for instance: (c) J. Gao, in "Reviews in Computational Chemistry", vol. 7, Eds. K.B. Lipkowitz and D.B. Boyd, VCH, New York (1995), p. 119.
[4] T.A. Wesolowski and A. Warshel, J. Phys. Chem., 97 (1993) 8050.
[5] T.A. Wesolowski, CHIMIA, 58 (2004) 311; 56 (2002) 707.
[6] P. Cortona, Phys. Rev. B, 44 (1991) 8454.
[7] P. Hohenberg and W. Kohn, Phys. Rev., B136 (1964) 864.
[8] W. Kohn and L.J. Sham, Phys. Rev., 140 (1965) A1133.
[9] M. Levy, Proc. Natl. Acad. Sci. USA, 76 (1979) 6062.
[10] (a) H.T. Stokes, L.L. Boyer, M.J. Mehl, Phys. Rev. B, 54 (1996) 7729; (b) E.V. Stefanovich, T.N. Truong, J. Chem. Phys., 104 (1996) 2946; (c) N. Govind, Y.A. Wang, E.A. Carter, J. Chem. Phys., 110 (1999) 7677; (d) O. Warschkow, J.M. Dyke, D.E. Ellis, J. Comput. Phys., 143 (1998) 70; (e) J.R. Trail and D.M. Bird, Phys. Rev. B, 62 (2000) 16402.
[11] (a) L.H. Thomas, Proc. Camb. Phil. Soc., 23 (1927) 542; (b) E. Fermi, Z. Physik, 48 (1928) 73.
[12] T.A. Wesolowski, J. Chem. Phys., 106 (1997) 8516.
[13] T.A. Wesolowski, H. Chermette, and J. Weber, J. Chem. Phys., 105 (1996) 9182.
[14] T.A. Wesolowski and F. Tran, J. Chem. Phys., 118 (2003) 2072.
[15] T.A. Wesolowski and A. Warshel, J. Phys. Chem., 98 (1994) 5183.
[16] D.A. Kirzhnits, Sov. Phys. JETP, 5 (1957) 64.
[17] T.A. Wesolowski, Chem. Phys. Lett., 311 (1999) 87.
[18] T.A. Wesolowski, J. Weber, Chem. Phys. Lett., 248 (1996) 71.
[19] (a) A. St-Amant and D.R. Salahub, Chem. Phys. Lett. 169 (1990) 387; (b) Alain St-Amant, Ph.D. Thesis, University of Montreal (1992); (c) deMon-KS version 3.5, M.E. Casida, C. Daul, A. Goursot, A. Koester, L.G.M. Pettersson, E. Proynov, A. St-Amant, and D.R. Salahub, principal authors; S. Chretien, H. Duarte, N. Godbout, J. Guan, C. Jamorski, M. Leboeuf, V. Malkin, O. Malkina, M. Nyberg, L. Pedocchi, F. Sim, and A. Vela, contributing authors; deMon Software, 1998.
[20] M. Casida and T.A. Wesolowski, Intl. J. Quant. Chem., 96 (2004) 577.

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 1051-1053

Application of Purpose-Oriented Moderate-Size GTO Basis Sets to Nonempirical Calculations of Polarizabilities and First Hyperpolarizabilities of Conjugated Organic Molecules in Dielectric Medium

M.Yu. Balakina¹ᵃ, S.E. Nefedievᵇ
ᵃA.E. Arbuzov Institute of Organic and Physical Chemistry of Kazan Scientific Center, Russian Academy of Sciences, Arbuzov str., 8, 420088, Kazan, Russia
ᵇKazan State Technological University, K.Marx str., 68, 420015, Kazan, Russia

Received 7 March, 2004; accepted in revised form 10 March, 2004

Abstract: In the present work the moderate-size basis sets constructed earlier are used for SCF-level calculations of static molecular polarizabilities and hyperpolarizabilities of conjugated organic molecules in dielectric medium. It is shown that, studying the solvent effect on the nonlinear optical properties of conjugated organic molecules with dominant π-electron polarization at the [4s3p2d/3s] level of theory, one can obtain rather reliable results.

Keywords: ab initio calculations, basis sets, molecular polarizabilities and hyperpolarizabilities, solvent effect

PACS: 31.15.Ne, 31.15.Ar, 31.70.Dk, 42.70.Jk

1. Introduction

Molecular design of promising nonlinear optical (NLO) chromophores should incorporate the medium effects in the theoretical modeling, as the dielectric environment significantly affects the electric properties of the chromophores [1-3]. An additional motivation for studying the medium effects is that the experimental measurements are mostly performed in the condensed phase. Quantum chemical calculations present a natural alternative to the experimental investigation of the chromophores, allowing the estimation of their NLO characteristics both in the gas phase and in the dielectric environment, and providing insight into the origin of their NLO activity. However, the theoretical estimation of chromophore electric properties, particularly the molecular hyperpolarizability, even in the gas phase, is a nontrivial task demanding a special methodology, primarily an adequate basis set [4-6]. The choice of an adequate basis set for the estimation of the NLO properties in solution calls for additional examination. It is obvious that for computational reasons moderate-size basis sets are preferable. Recently, purpose-oriented basis sets were suggested in [7,8] for the calculations of molecular polarizability and hyperpolarizability at the SCF level. The aim of the present work is to study the adequacy of the basis sets elaborated in [8] for the calculation of static molecular polarizabilities and hyperpolarizabilities of conjugated organic molecules in solution. For this purpose we apply the Time-Dependent Hartree-Fock (TDHF) method [9] for the calculation of the electric properties (α_ij and β_ijk), the solvent effect being taken into account with the Polarizable Continuum Model (PCM) [10,11]. The basis sets are tested here on the example of the p-nitroaniline (PNA) molecule, an NLO chromophore widely studied both theoretically and experimentally (see, for example, [12,13]).

¹Corresponding author. E-mail: [email protected]

2. Results and Discussion


The geometrical parameters of PNA were optimized in the 6-31G** basis set both in the gas phase and in two solvents with different dielectric constants (chloroform with ε=4.9 and acetone with ε=20.7). The effect of the solvent on the geometrical parameters was shown to be insignificant (CN bond lengths are shortened by 0.01 Å, ONC valence angles are increased by 0.3-0.4°, the change in the other parameters being negligible). Thus, for the basis set tests we have used the gas-phase geometry of PNA. The electric properties are calculated for the PNA molecule embedded in chloroform. According to the PCM methodology, the chromophore occupies in the dielectric medium a cavity defined in terms of interlocking spheres with radii equal to 1.2 times the corresponding van der Waals atomic radii [14]; the resulting values used here are R_N = 1.52 Å, R_C = 2.04 Å, R_O = 1.8 Å, R_H = 1.44 Å.

The energy-optimized (9s5p/4s) Huzinaga atomic basis set [15] was used in [8] as the sp-substrate; for the first-row atoms it was augmented with diffuse functions of s- and p-types and two sets of primitive d-functions to account for the polarization of the electron density in covalent bonds and in the outer region of the atomic space, the exponent of the corresponding function being optimized in the presence of the applied electric field. Similar assumptions were used for the construction of the hydrogen basis functions. Thus, with various contractions by Dunning [16], the following basis set families were generated: [4s3pNd/3sNp], [5s3pNd/3sNp] and [6s4pNd/4sNp]; for N=2 the exponents of the 2d/2p polarization functions are: α₁=0.72, α₂=0.115 for C; α₁=0.98, α₂=0.144 for N; α₁=1.28, α₂=0.225 for O; α₁=1.0, α₂=0.125 for H [7,8]. To provide a better description of the atomic outer region most influenced by the polarization effects of interest, the 3d/3p polarization set was also used here, with the following exponent values: α₁=0.72, α₂=0.182, α₃=0.073 for C; α₁=0.98, α₂=0.233, α₃=0.089 for N; α₁=1.28, α₂=0.347, α₃=0.146 for O; α₁=1.0, α₂=0.210, α₃=0.074 for H [7,8].

The dependence of the calculated (hyper)polarizabilities on the contraction flexibility of the basis and on the dimension of the polarization set is studied. The analysis of the results demonstrates that, as one would expect, the β_ijk values are much more sensitive to the extension of the basis set than the α_ij ones. For a given basis family, convergence of the values of the components of the (hyper)polarizability tensors is shown to be achieved, with the uncertainties in the results (estimated as half the maximal difference between the data falling in the convergence domain [8]) not exceeding 0.1·10⁻²⁴ esu for α_ij and 0.1·10⁻³⁰ esu for β_ijk, which is notably larger than in the case of the gas-phase calculations. The use of contractions of different flexibility is shown to affect both the α_ij and β_ijk values only insignificantly, slightly increasing the calculated data with increasing contraction flexibility of the sp-substrate; this tendency is particularly noticeable for the absolute value of the dominant longitudinal component β_zzz. The effect of an additional polarization function on the first-row atoms is slightly pronounced, causing a small decrease of the β_ijk values (about 0.3·10⁻³⁰ esu for β_zzz). The effect of the polarization functions on hydrogen on both the polarizability and hyperpolarizability values is negligible.
Thus here, much as in the case of the gas phase, even the [4s3p2d/3s] basis set provides data that reproduce fairly well the results obtained with the [6s4p3d/4s3p] basis, the largest basis set examined here, the error being 0.5% for α_ij and 1% for the dominant longitudinal hyperpolarizability β_zzz. We therefore conclude that, studying the solvent effect on the NLO properties of conjugated organic molecules with dominant π-electron polarization at the [4s3p2d/3s] level of theory, one can obtain rather reliable SCF results.

Acknowledgments

The support of the Russian Foundation for Basic Research (Project No. 02-03-32897) is gratefully acknowledged.

References

[1] A. Willetts, J.E. Rice, D.M. Burland, D.P. Shelton, Problems in the comparison of theoretical and experimental hyperpolarizabilities, Journal of Chemical Physics 97 7590-7599 (1992).
[2] K.V. Mikkelsen, Y. Luo, H. Ågren, P. Jørgensen, Solvent induced polarizabilities and hyperpolarizabilities of para-nitroaniline, Journal of Chemical Physics 100 8240 (1994).
[3] Y. Luo, P. Norman, H. Ågren, A semiclassical approximation model for properties of molecules in solution, Journal of Chemical Physics 109 3589-3595 (1998).


[4] P.R. Taylor, T.J. Lee, J.E. Rice, J. Almlöf, The polarizabilities of Ne, Chemical Physics Letters 163, 359-365 (1989).
[5] G. Maroulis, Accurate electric multipole moment, static polarizability and hyperpolarizability derivatives for N2, Journal of Chemical Physics 118 2673-2687 (2003).
[6] G. Maroulis, C. Pouchan, Molecules in static electric fields: linear and nonlinear polarizability of HC≡N and HC≡P, Phys. Rev. A 57 2440-2447 (1998).
[7] M.B. Zuev and S.E. Nefediev, Rational Design of Atomic Gaussian Basis Sets for Ab Initio Calculations of the Dipole Polarizabilities and Hyperpolarizabilities. I. Optimized Polarization Sets for the First-Row Atoms from B to F, Journal of Computational Methods in Sciences and Engineering, Special issue (2003).
[8] M.B. Zuev, M.Yu. Balakina, S.E. Nefediev, Rational Design of Atomic Gaussian Basis Sets for Ab Initio Calculations of the Dipole Polarizabilities and Hyperpolarizabilities. II. Moderate-size Optimized Sets for the First-Row Atoms from B to F, Journal of Computational Methods in Sciences and Engineering, Special issue (2003).
[9] S.P. Karna and M. Dupuis, Frequency Dependent Nonlinear Optical Properties of Molecules: Formulation and Implementation in the HONDO Program, Journal of Computational Chemistry 12 487-504 (1991).
[10] R. Cammi, M. Cossi, J. Tomasi, Analytical derivatives for molecular solutes. III. Hartree-Fock static polarizability and hyperpolarizabilities in the polarizable continuum model, Journal of Chemical Physics 104 4611-4620 (1996).
[11] R. Cammi, M. Cossi, B. Mennucci, J. Tomasi, Analytical Hartree-Fock calculation of the dynamical polarizabilities α, β, and γ of molecules in solution, Journal of Chemical Physics 105 10556-10564 (1996).
[12] M. Stähelin, D.M. Burland, J.E. Rice, Solvent dependence of the second order hyperpolarizability in p-nitroaniline, Chemical Physics Letters 191, N3/4, 245-250 (1992).
[13] P. Salek, O. Vahtras, T. Helgaker, H. Ågren, Density-functional theory of linear and nonlinear time-dependent molecular properties, Journal of Chemical Physics 117 9630-9645 (2002).
[14] A. Bondi, van der Waals Volumes and Radii, Journal of Physical Chemistry 68 441-451 (1964).
[15] S. Huzinaga, Gaussian-type functions for polyatomic systems. I, Journal of Chemical Physics 42 1293-1302 (1965).
[16] T.H. Dunning, Gaussian basis functions for use in molecular calculations. I. Contraction of (9s5p) atomic basis sets for the first-row atoms, Journal of Chemical Physics 53 2823-2833 (1970).

Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 1054-1056


Electronic Polarizabilities of Atoms, Molecules and Crystals, Calculated by Crystal Chemical Method

S.S. Batsanov¹
Center for High Dynamic Pressure, Mendeleevo, Moscow Region 141570, Russia

Received 1 August, 2004; accepted 27 August, 2004

Abstract: New empirical methods are proposed for calculating the refractions of free atoms, atoms involved in covalent bonds, atoms in elemental solids, and atoms in inorganic compounds with polymeric structures.

Keywords: electronic polarizability, refraction, atom, electronegativity

PACS: 32.10.Dk, 33.15.Kr

The full system of electronic polarizabilities (refractions) of free atoms has been derived from direct measurements and quantum-mechanical calculations (Table 1, upper lines). Refractions of atoms in covalent bonds have been experimentally determined only for nonmetals and for metals of the 1a and 4b-7b subgroups of the Periodic Table. In addition, polarizabilities of some metals have been calculated from the experimental molecular refractions of their organometallic compounds by the additive method. However, these data are not available for all elements. To fill all the gaps in Table 1, it is necessary to develop new methods for determining the covalent refraction of atoms.

Hohm [1] and Maroulis et al. [2] have shown that at the dissociation of a molecule the change of the enthalpy D and the change of the refraction ΔR are related by the equation

    D = A + B·ΔR                                                                                (1)

which can be used to calculate atomic refractions. R is proportional to the atomic volume (and hence to r³) and D is proportional to Z*/r, where Z* is the effective nuclear charge and r is the covalent radius; hence D/ΔR is proportional to Z*/r⁴. If D is half the dissociation energy of the M-M bond and ΔR = R(M) − ½R(M₂), then

    Z*/r⁴ = k·D/ΔR                                                                              (2)

Using the experimentally measured values of the refractions of free atoms and molecules of alkali metals and copper, the present author has calculated the factor k of equation (2) and found it to be fairly stable for these elements. Then, assuming k to be constant for all elements, one can obtain the values of ΔR and, using the known magnitudes of R for the isolated atoms, derive the covalent refractions of metals [3]. The results of these calculations are listed in the middle lines of Table 1.

Taking into account that the refractive indices n of metals for λ → ∞ are very high, and hence that the Lorentz-Lorenz function in formula (3),

    R_M = [(n² − 1)/(n² + 2)] V_M                                                               (3)

is close to unity, we can assume the refractions of metals R_M to be simply equal to their atomic volumes [4]. Refractions of atoms in solid metals are presented in the lower lines of Table 1.
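A small sketch may help make the use of equation (2) concrete; the routine names and any input values are hypothetical and not taken from the paper. The factor k is first calibrated on an element with known free-atom and molecular refractions (alkali metals and copper in the text), and the calibrated k then yields ΔR, and hence the covalent refraction R(M) − ΔR, for another element.

```python
def k_factor(z_eff, r_cov, d_half, delta_r):
    """Calibrate k from Eq. (2) rearranged: k = (Z*/r^4) * (Delta_R / D)."""
    return (z_eff / r_cov ** 4) * (delta_r / d_half)

def predicted_delta_r(z_eff, r_cov, d_half, k):
    """Delta_R = k * D * r^4 / Z*; the covalent refraction is then R(M) - Delta_R."""
    return k * d_half * r_cov ** 4 / z_eff

# Hypothetical illustration: calibrate k on a reference metal, apply to another.
k = k_factor(z_eff=1.3, r_cov=1.23, d_half=0.55, delta_r=10.0)
print(predicted_delta_r(z_eff=1.3, r_cov=1.54, d_half=0.38, k=k))
```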

¹Corresponding author. E-mail: [email protected]


Table 1: Refractions (cm³) of free atoms, atoms involved in covalent bonds, and atoms in elemental solids (upper, middle and lower lines for each element, respectively).
since P(W₃) ⊆ P(W₂) ⊆ P(W₁). In relational databases with null values it is recognised that there are many different types of null values, each of which reflects different intuitions about why a particular piece of information is unknown. In [11] five different types of nulls are suggested. Their labels and semantics are defined as follows. Let V be a function which takes a label and returns the set of possible values that the label may have.

Table 2: Null Markers

Label (X)        V(X)
Ex-mar           D
Ma-mar           D ∪ {⊥}
Pl-mar           {⊥}
Par-mar (V_s)    V_s
Pm-mar (V_s)     V_s ∪ {⊥}

Intuitively, V(Ex-mar) = D says that the actual value of an existential marker can be any member of the domain D. Likewise, V(Ma-mar) = D ∪ {⊥} says that the actual value of a maybe marker can be either any member of D, or the symbol ⊥, denoting a non-existent value. Similarly, V(Par-mar(V_s)) = V_s says that the actual value of a partial null marker of the form Par-mar(V_s) lies in the set V_s, a subset of the domain D. A controversial issue is the use of ⊥, which denotes that an attribute is inapplicable. Certainly this is the interpretation of an algebraic manipulation of the unknown information, instead of a conceptual manipulation and interpretation. Assuming the sample fact spouse, the individual Tony is a bachelor, and hence the wife field is inapplicable to him, ⊥. Conceptually the issue can be resolved with the use of subtypes (e.g. married, unmarried) as part of the entity class Person. A subtype is introduced only when there is at least one role recorded for that subtype. In the general case the algebraic issue under the use of subtypes is whether the population of the subtypes, in relationship to the supertype, is:

• Total and Disjoint: populations are mutually exclusive and collectively exhaustive.
• Non-Total and Disjoint: populations are mutually exclusive but not exhaustive.
• Total and Overlapping: common members between subtypes and collectively exhaustive, in relationship to the supertype.
• Non-Total and Overlapping: common members between subtypes and not collectively exhaustive, in relationship to the supertype.

The conceptual treatment of null permits us to reduce the table in "Table 2" using only two types of null markers.

Table 3: Eliminated Null Markers

Label (X)        V(X)
V-mar (V)        {V}
P-mar (V_s)      {V_s}
O-mar (D−V_s)    {D−V_s}
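A minimal sketch of the markers of Table 3 (the explicit value marker plus the two reduced null markers), assuming a Python representation with illustrative class names; with an ordered, enumeration-like base type, the alternatives of a P-mar inherit the partial order discussed in the next paragraph.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet

@dataclass(frozen=True)
class VMar:
    """Value marker: the attribute holds a single known element of the domain."""
    value: Any

@dataclass(frozen=True)
class PMar:
    """P-mar: the actual value is one member of the restricted set V_s."""
    alternatives: FrozenSet[Any]

@dataclass(frozen=True)
class OMar:
    """O-mar: the value is any domain element outside the probable set V_s."""
    domain: FrozenSet[Any]
    excluded: FrozenSet[Any]

    def possible_values(self) -> FrozenSet[Any]:
        return self.domain - self.excluded

# With an ordered base type, sorting the alternatives recovers the partial order a1 <= a2 <= a3.
marker = PMar(frozenset({3, 1, 2}))
print(sorted(marker.alternatives))   # [1, 2, 3]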


In [6] a connection between the measure of randomness (p) of an observation and compatibility (π) can be achieved. In this way a fact presents information that is observed and testified by one or more information sources; therefore a set of alternatives is defined, with p constrained by 0 ≤ p ≤ 1 and Σp = 1. In this case the P-mar marker is used to denote that one of the members of the restricted set V_s is the actual value of a label. In the latter case, the O-mar marker denotes that a label value can be any value derived from the label domain D, excluding the set of probable values V_s. In the case where information is explicit, a value marker is any single element (V) from the domain D of the label value.

The probability measure stresses a partial order of the possible worlds: W₃ ⊆ W₂ ⊆ W₁, since P(W₃) ⊆ P(W₂) ⊆ P(W₁) (Table 1). The challenge is to keep this partial ordering as part of the P-mar marker, which holds a set of alternatives but with no explicit ordering. A set type is defined as follows: set type = Set of Base Type. The identifier set type is defined over the values specified in the base type. The base type can be, for example, Boolean, character, an enumeration type or a sub-range. Choosing the base type for the P-mar constraint to be an enumeration type (a₁, a₂, a₃), a partial order is established since a₁ ⊆ a₂ ⊆ a₃; therefore the ordering defined by the probability distribution is assured, and value imperfection can be expressed and accommodated in the database in a simple way. Depending on the modelled enterprise, modellers may decide to use either the weighted approach (probabilistic values) or the unweighted approach. The two approaches do not negate each other; on the contrary, a parallelism can be drawn through the partial-order property. Furthermore, probabilistic repositories imply the semantics of the CWA, since every instance of fact F not explicitly represented in the relation F has probability 0. In an effort to accommodate null values, maybe fact instances or tuples are utilised. In this way a relation F is composed of a sure component and a maybe component, which may hold or not. The CWA is utilised with the formulation of possible worlds, where in each situation every tuple is certain.

However, the use of either probabilistic or null values does not let the modeller express indefiniteness, that is, certainty or uncertainty with respect to the likelihood of occurrence of a possible event. For example, an information source may not be reliable (i.e. the expert who enters the information has a particular reliability coefficient), or the data may not be quite current, etc. That is why we develop a model of an intuitionistic fuzzy repository, which can store data with a certain degree of reliability or non-reliability.

3. An Intuitionistic Fuzzy Relational Repository

An intuitionistic fuzzy relational repository for treating unknown information with respect to the semantics of intuitionistic validity ⟨μ_F⟩ and falsity ⟨ν_F⟩ is defined. Using the ideas of intuitionistic fuzzy expert systems [7], we can estimate any fact F, and it can obtain intuitionistic fuzzy truth-values V(F) = ⟨μ_F, ν_F⟩ such that μ_F, ν_F ∈ [0, 1] and μ_F + ν_F ≤ 1. Therefore, the above fact can be represented in the form ⟨F, μ_F, ν_F⟩.

Let R be an intuitionistic snapshot fuzzy relation: R = { ⟨x, μ_R(x), ν_R(x)⟩ | x ∈ X }, where x = ⟨col₁, ..., colₙ⟩ is an ordered tuple belonging to a given universe X. A projection operation over R defines a relation which is a vertical subset of R, containing the values of the specified attributes: Π_{col1,...,coln}(R) = { ⟨⟨x.col₁, ..., x.colₙ⟩, μ_R(x), ν_R(x)⟩ | x ∈ X }, where x.col_k is the k-th column (attribute) of the ordered couple x. The projection retains the degrees of membership and non-membership of R. A selection operation defines a relation which contains only those tuples from R for which a certain predicate is satisfied. We can say that the selection modifies the degrees of membership and non-membership of R depending on the corresponding value of the predicate: σ_P(R) = { ⟨x, min(μ_R(x), μ_P(x)), max(ν_R(x), ν_P(x))⟩ | x ∈ X }, where P is the predicate, i.e. the elements of the result relation have a degree of membership which is logically AND-ed with the corresponding value of the predicate P. A Cartesian product of two relations R×S is identical to the Cartesian product operation defined in intuitionistic fuzzy set theory [1], which uses the logical AND between the degrees of membership. Let S be another intuitionistic fuzzy relation: S = { ⟨y, μ_S(y), ν_S(y)⟩ | y ∈ Y }; then R×S = { ⟨⟨x, y⟩, min(μ_R(x), μ_S(y)), max(ν_R(x), ν_S(y))⟩ | ⟨x, y⟩ ∈ X×Y }.
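The relational operators just defined can be sketched directly. In the hedged example below, a relation is represented as a list of (tuple, μ, ν) triples with μ + ν ≤ 1, the predicate itself returns an intuitionistic pair, and all names and sample data are illustrative.

```python
def project(relation, columns):
    """Projection keeps the selected attributes and the degrees of R unchanged."""
    return [(tuple(x[c] for c in columns), mu, nu) for x, mu, nu in relation]

def select(relation, predicate):
    """Selection ANDs the degrees with the predicate degrees (min/max combination)."""
    out = []
    for x, mu, nu in relation:
        mu_p, nu_p = predicate(x)
        out.append((x, min(mu, mu_p), max(nu, nu_p)))
    return out

def cartesian(r, s):
    """Cartesian product RxS with min of memberships and max of non-memberships."""
    return [((x, y), min(mu_r, mu_s), max(nu_r, nu_s))
            for x, mu_r, nu_r in r
            for y, mu_s, nu_s in s]

# Example: tuples recorded with a degree of reliability of the information source.
R = [(("Tony", "Sales"), 0.8, 0.1), (("Mary", "IT"), 0.6, 0.3)]
print(select(R, lambda x: (1.0, 0.0) if x[1] == "Sales" else (0.0, 1.0)))
```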


If we further wish to represent temporal information about a particular enterprise, then a temporal relation can be defined as follows. The elements of the temporal information (we can call them temporal data, or temporal facts, too) can be represented in the form ⟨F, t_F^s, t_F^e⟩, where [t_F^s, t_F^e] is a time interval. Using the ideas of intuitionistic fuzzy expert systems [7], we can estimate any fact F, and it can obtain intuitionistic fuzzy truth-values V(F) = ⟨μ_F, ν_F⟩ such that μ_F, ν_F ∈ [0, 1] and μ_F + ν_F ≤ 1. Therefore, the above fact can be represented in the form ⟨F, t_F^s, t_F^e, μ_F, ν_F⟩.

This form of the fact corresponds to the case in which the fact is valid in the interval t_F = [t_F^s, t_F^e] and at every moment of that interval the fact has the truth-value ⟨μ_F, ν_F⟩. Each of the functions μ and ν will be regarded as having two arguments i, l, where i is the index of the corresponding fact instance and l is an element of the interval T = [t₁, t₂]. Now, the truth and falsity degrees of the fact instances can be estimated as above, but the pair converts to

    ⟨μ'_i(l), ν'_i(l)⟩ = ⟨ (∫_{t₁}^{t₂} μ_i(l) dl)/(t₂ − t₁), (∫_{t₁}^{t₂} ν_i(l) dl)/(t₂ − t₁) ⟩

Now, if the fact is valid in a set of intervals t_F^1, t_F^2, ..., t_F^n with the forms t_F^j = [t_F^{j,L}, t_F^{j,R}], and if its truth-values for these intervals are, respectively, ⟨μ_F^1, ν_F^1⟩, ⟨μ_F^2, ν_F^2⟩, ..., ⟨μ_F^n, ν_F^n⟩, then our fact will have the representation

    ⟨F, t_F^1, μ_F^1, ν_F^1, t_F^2, μ_F^2, ν_F^2, ..., t_F^n, μ_F^n, ν_F^n⟩.

Now, for the set of so constructed facts we can define quantors ∀ and ∃ having truth-value estimations as follows:

    V(∀t F) = ⟨ min_{j=1,2,...,n} μ_F^j , max_{j=1,2,...,n} ν_F^j ⟩,
    V(∃t F) = ⟨ max_{j=1,2,...,n} μ_F^j , min_{j=1,2,...,n} ν_F^j ⟩.

The first estimation determines the truth-value of the expression "Up to now F has always been valid" and the second one that of the expression "In some interval of time F has been valid". The advantage of this alternative is that temporal ignorance can be expressed using the generic semantics "it is true up to now" or "it is true at some point", recording both the belief and the falsity of these propositions. It is assumed that negative information will be stored by default in the equivalent database representation. In this sense it is expected that the database will obey the open world assumption (OWA). This implies that integrity constraints have to be redefined, a problem that still remains unsolved for the database community.
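A small sketch of these temporal estimates, assuming the per-interval degrees are stored with the fact; the class and function names are illustrative, and the interval-averaged pair is evaluated by simple numerical quadrature.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TemporalFact:
    name: str
    intervals: List[Tuple[float, float]]   # validity intervals [t_L, t_R]
    mu: List[float]                        # membership degree per interval
    nu: List[float]                        # non-membership degree per interval

def v_forall(f: TemporalFact) -> Tuple[float, float]:
    """'Up to now F has always been valid': min of mu, max of nu over the intervals."""
    return min(f.mu), max(f.nu)

def v_exists(f: TemporalFact) -> Tuple[float, float]:
    """'In some interval of time F has been valid': max of mu, min of nu."""
    return max(f.mu), min(f.nu)

def averaged_degree(mu_of_t, nu_of_t, t1, t2, steps=1000):
    """Numerical version of the interval-averaged pair <mu'_i, nu'_i> defined above."""
    dt = (t2 - t1) / steps
    ts = [t1 + (k + 0.5) * dt for k in range(steps)]
    mu_avg = sum(mu_of_t(t) for t in ts) * dt / (t2 - t1)
    nu_avg = sum(nu_of_t(t) for t in ts) * dt / (t2 - t1)
    return mu_avg, nu_avg

fact = TemporalFact("F", [(0, 1), (2, 3)], mu=[0.7, 0.9], nu=[0.2, 0.05])
print(v_forall(fact), v_exists(fact))
```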

4. Conclusions

In this paper three distinct architectural approaches for building a data repository to accommodate unknown information have been presented. The first architecture is based on an unweighted approach and is utilised with the aid of null values. The second is based on a weighted approach and is utilised with the aid of probabilistic values. Both approaches obey the CWA assumption. We finally proposed a new architectural approach for building a data repository to accommodate unknown information, based on intuitionistic fuzzy logic, that follows the OWA. The intuitionistic model has two basic goals: first, to be able to store data with a particular degree of reliability (degree of membership/non-membership of the elements) and, second, to be able to process queries which contain indefiniteness.

References

[1] C.E. Dyreson, R.T. Snodgrass, Supporting Valid-Time Indeterminacy, ACM Transactions on Database Systems, Vol. 23, No. 1, pp. 1-57, 1998.
[2] F. Kabanza, J-M. Stevenne, P. Wolper, Handling Infinite Temporal Data, Proceedings of the ACM Symposium on Principles of Database Systems (PODS), 1990.
[3] D. Dey, S. Sarkar, A Probabilistic Relational Model and Algebra, ACM Transactions on Database Systems, Vol. 21, No. 3, 1996.
[4] L. Lakshmanan, N. Leone, R. Ross, V. Subrahmanian, ProbView: A Flexible Probabilistic Database System, ACM Transactions on Database Systems, Vol. 22, No. 3 (Sep. 1997).
[5] K. Candan, J. Grant, V. Subrahmanian, A Unified Treatment of Null Using Constraints, Computer Science Technical Report Series, University of Maryland, USA, CS-TR-3456, UMIACS-TR-95-47, April 1995.
[6] M. Delgado, S. Moral, On the concept of Possibility-Probability Consistency, in Fuzzy Sets for Intelligent Systems, D. Dubois, H. Prade and R. Yager (eds), Morgan Kaufmann Publishers, pp. 247-250, 1993.
[7] K. Atanassov, Intuitionistic Fuzzy Sets, Springer-Verlag, Heidelberg, 1999.


Lecture Series on Computer and Computational Sciences Volume I, 2004, pp. 1144-1147

Artificial Neural Networks and Cardiology: a Fertile Symbiosis?

M. De Beule°¹, E. Maes°, R. Van Impe°, W. Vanlaere° and O. De Winter*

°Department of Structural Engineering, Faculty of Engineering, Ghent University, 9052 Zwijnaarde, Belgium
*Department of Radiotherapy and Nuclear Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium

Abstract: The technique of Artificial Neural Networks (ANN's) is applied to model the risk stratification according to d'Agostino et al. [1]. The performance of the network proves its ability to find non-linear relationships in (medical) data, and some important factors in accomplishing an accurate and reliable network are derived. At the end an ANN is designed to investigate the predictive quality of certain well-chosen risk factors for secondary prevention. The performance of the resulting network is put in the right perspective and some aspects that need further study are mentioned.

Keywords: Neural networks, diagnosis, cardiology, Framingham and d'Agostino.

1. Introduction

Epidemiologists seek to understand the predictive quality of certain risk factors in risk prevention, using, in general, classical determination methods. Since the 1960's a lot of mathematical predictive models have been created for primary prevention (prediction of the risk of a new cardiovascular disease (CHD)) based on the Framingham study. In secondary prevention, the risk for a new CHD-event is calculated for persons with a history of coronary or cardiovascular disease. By evaluating afterwards which risk factor affected the global risk for each patient the most, it is possible to support secondary prevention in an effective way. The risk factors considered in the analyses of d'Agostino are age, the log ratio of total to HDL cholesterol, and diabetes for men. Log-transformed Systolic Blood Pressure (SBP) and smoking are also included in the model for women, in addition to the risk factors for men.

Some important factors in accomplishing an accurate and reliable network are derived from a basic ANN modelling the risk stratification according to d'Agostino. Some of those recommendations are used in the design of a second network. Using classical epidemiological insights, the authors determined 10 (secondary and possible) risk factors: age, diabetes, smoking, personal history, heart rate at rest, systolic blood pressure at rest, total cholesterol value, left ventricular ejection fraction (LVEF), end systolic volume (together with LVEF determined by QGS myocardial gated SPECT software) and defect extent in stress minus defect extent at rest. The idea grew to see whether these factors could be used in a neural network to predict the outcome of the patients. The reader is presumed to have some basic knowledge of the technique of ANN's; if this knowledge is lacking, the authors strongly recommend consulting the work of e.g. Rojas [2].

1 Corresponding author. E-mail: [email protected]


2. Artificial Neural Networks and d'Agostino

One way to prove the ability of ANN's to deal with complex non-linear datasets is to see whether an ANN is able to predict the calculated risk scores (in %) reasonably accurately (e.g. absolute error < 3 %). For this reason a dataset of 273 patients of the Ghent University Hospital is studied. The data are collected over a time period of about four years, and all patients had a history of ischemic heart disease and an ejection fraction of less than 40 %. In a first phase the secondary risk is calculated for each patient, and then a network is trained, validated and tested (with the software package JavaNNS [3]) to see whether the network's risk score (output) corresponds with the calculated risk values. The total database is divided randomly into a training set of 200 persons and a validation set of 73 patients. The application range of the network for the non-discrete parameters is as follows: age [39,7;89,1] (years), TC/HDL [2,017;10,05], SBP [80;190] (mmHg) and secondary risk [4,02;33,49] (%). All data are scaled as symmetrically as possible within the range [-1;1].

In the training phase there are two possible options: searching for a network with an acceptable accuracy or seeking the optimal network with the highest accuracy possible. For both options it is necessary to vary all parameters (e.g. number of hidden nodes, number of hidden layers, training algorithm, learning rate, ...) when searching for the lowest error on the output. To find the optimal network, it is necessary to take many more combinations of those parameters into consideration than for the search for an acceptable network. For this study we have chosen to search for an acceptable network, as it already gave reasonably accurate results. In order to have an independent performance check, it is necessary to test the network with 'virgin' data (not used for training and validation). As all patient data from the dataset have already been used, the authors chose to generate the test set as follows. For every parameter two values (lying within the application range) are considered, as can be seen in Table 1. By combining all possible combinations of these parameters and calculating the secondary risk, a test set with 64 cases is generated.

Table 1: Values for the parameters used in the test set.

Parameter     Value 1   Value 2
Gender        Male      Female
Age (year)    50        70
Diabetes      Yes       No
Smoking       Yes       No
TC/HDL        4         6
SBP (mmHg)    120       140
140

In Figure I the target values are compared to the output values of the network. In this figure the bisector represents the perfect match between the target- and output values. The figure clearly proves that the network performs very accurately in predicting the secondary risk at risks lower than 17 %. For higher risks the maximum absolute error is about 3 % and the resulting network can thus be stated as acceptable for the prediction of the secondary risk according to d'Agostino's method. 30 25

::;? 20

e...

g_ 15 :; 0

10 5 0 0

5

10

20 15 target(%)

25

30

Figure I: Performance of the resulting network. The authors would like to draw the attention to the fact that the absolute errors of the test set are much higher than those for the validation set. A plausible cause for this discrepancy is that the data are not

I I 46 _ _ _ _ _ _ _ _ _ _ _ M De Beu/e, E. Maes, R. Van lmpe, W Vanlaere and 0. De Winter

very well spread over the whole range (e.g. 14,5 % women in the training set, 2,8 % women in the validation set and 50 % women in the test set). To make an accurate prediction with ANN's, it is necessary to train the network with examples which are distributed evenly over the range of application for each input parameter. If this is not the case, the network can generate results which cannot be trusted in areas where few examples were available. A possibility to cover that lack in distribution is a profound study of the spreading of the data and a good (re)defining the range of applicability of the network (e.g. exclude all women from the data set).

3. Artificial Neural Networks and Risk Stratification A second ANN is developed to predict the occurrence of a total event (i.e. cardiac death, non fatal myocardial infarct, Coronary Artery Bypass Grafting, Percutaneous Transluminal Coronary Angioplasty or hospitalisation after heart failure). Using this ANN, each patient can be categorised as a high risk (possible occurrence of an event) or a low risk person (no predicted event). Considering the determinant factors for the collected database of the Ghent University Hospital, the designed network is a tool for secondary prevention. To make an accurate prediction with ANN's, it is necessary to train the network with examples which are evenly distributed over the range of application for each input parameter as was mentioned earlier. For this reason 200 male patients with an age between 45 and 85 years were selected from the 273 persons database. Their heart rhythm (in rest) varies from 40 to 100 beats/minute, the systolic blood pressure has a minimum value of 90 and a maximum value of I 80 mmHg and the total cholesterol value lies between 120 and 280 mg!dl. All men have an ejection fraction between I 5 and 40 % and an end systolic volume of 50 to 300 mi. These ranges determine the field of applicability of the network. A useful tool to define those ranges are scatterplots. In Figure 2 such a plot is given to determine the range of applicability for the ejection fraction (E F). The different slopes of the curve are an indication of the spreading of the data and are an explanation for the exclusion of the patients with an EF less than I 5%.

"""

i:l..l

0.45 0.4 0.35 0.3 0.25 0.2 0.15 0.1 0.05 0 0

50

100

!50

200

250

Figure 2: Spread of the ejection fraction for all male data (EF versus patient nr).

The feedforward network is trained with the backpropagation algorithm with a momentum term, using the software package JavaNNS. In the quest for a network with an acceptable accuracy, numerous training procedures are carried out while varying several parameters (e.g. number of hidden layers, nodes per hidden layer, momentum term, ...). The characteristics of the most accurate network (the one that produces the highest sensitivity and specificity for the validation set) are shown in Table 2.

Table 2: Parameters of the resulting network.
Number of data:            150 (training set), 50 (validation set)
Number of hidden nodes:    5 (layer 1), 2 (layer 2)
Learning rate:             0,3
Momentum term:             0,1
Number of training cycles: 1500
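For readers who prefer code to a parameter table, the following sketch reproduces the Table 2 configuration in PyTorch. It is not the authors' implementation (they used JavaNNS), and the six input features and the random data are assumptions made only to keep the example self-contained.

```python
# Minimal training sketch using the Table 2 hyper-parameters. The six input
# features (age, heart rate, blood pressure, cholesterol, EF, end-systolic
# volume) and the random data are assumptions, not the paper's dataset.
import torch
from torch import nn

torch.manual_seed(0)
n_inputs = 6                                   # assumed number of risk factors
x = torch.rand(150, n_inputs)                  # placeholder training set (150 patients)
y = (torch.rand(150, 1) > 0.7).float()         # placeholder event labels

model = nn.Sequential(
    nn.Linear(n_inputs, 5), nn.Sigmoid(),      # hidden layer 1: 5 nodes
    nn.Linear(5, 2), nn.Sigmoid(),             # hidden layer 2: 2 nodes
    nn.Linear(2, 1), nn.Sigmoid(),             # output: probability of a total event
)
optimiser = torch.optim.SGD(model.parameters(), lr=0.3, momentum=0.1)
loss_fn = nn.BCELoss()

for cycle in range(1500):                      # 1500 training cycles
    optimiser.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()                            # backpropagation
    optimiser.step()                           # gradient step with momentum
```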


The ratio of the number of training examples to the degrees of freedom of the designed network is 2,5, which approaches the value of 3 recommended in Freeman et al. [4]. Since all data are used for the training and validation sets, the network cannot be validated with an independent test set. The performance (in %) of the network is summarised in Table 3. The accuracy (Acc) is the percentage of correct classifications; the sensitivity (Se) is the fraction of high-risk persons who are tested positive (i.e. classified as high risk, occurrence of an event); the specificity (Sp) is the fraction of low-risk persons who are tested negative (i.e. classified as low risk, no event). The Positive Predictive Value (PPV) is the ratio of the number of correct high-risk classifications to all high-risk classifications; the Negative Predictive Value (NPV) is the ratio of the number of correct low-risk classifications to all low-risk classifications.

Table 3: Performance of the trained and validated network.
Dataset          Acc (%)   Se (%)   Sp (%)   PPV (%)   NPV (%)
Training set     89,33     66,67    98,15    93,33     88,33
Validation set   82,00     64,29    88,89    69,23     86,49
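The five measures can be written directly in terms of confusion-matrix counts. The sketch below is illustrative; the counts are chosen so that they reproduce the validation-set row of Table 3 (they are consistent with, but not reported in, the paper).

```python
# Sketch of the performance measures of Table 3 computed from a confusion
# matrix. The counts below reproduce the validation-set row (50 patients) and
# are inferred for illustration only.
def classification_metrics(tp, fn, tn, fp):
    """Return Acc, Se, Sp, PPV and NPV (in %) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "Acc": 100.0 * (tp + tn) / total,   # correct classifications
        "Se":  100.0 * tp / (tp + fn),      # high-risk patients tested positive
        "Sp":  100.0 * tn / (tn + fp),      # low-risk patients tested negative
        "PPV": 100.0 * tp / (tp + fp),      # correct high-risk classifications
        "NPV": 100.0 * tn / (tn + fn),      # correct low-risk classifications
    }

# 9 of 14 high-risk and 32 of 36 low-risk patients classified correctly
# gives Acc 82,00, Se 64,29, Sp 88,89, PPV 69,23 and NPV 86,49.
print(classification_metrics(tp=9, fn=5, tn=32, fp=4))
```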

The designed ANN successfully recognises the low-risk patients, with a specificity of 88,89 % on the validation set. The support for the remaining group could then be intensified in secondary prevention, so that health care resources are spent in a more efficient and economical way. In order to use this model in clinical practice, a reliable evaluation of the network should be carried out. This involves an external and a temporal evaluation. The external check can be done by testing the performance on another database, which indicates whether the network remains valuable for slightly different populations. A temporal evaluation means that the network has to be tested after a certain period of time to see whether the model remains valid. In addition, a comparison of the network with existing classical methods for secondary prevention has to be made. If all tests are positive, the effect of the network on the decision-making strategy of the physicians has to be investigated. When this last evaluation is also positive, one could start thinking of implementing such a network in clinical practice.

The obtained sensitivity of our network is rather low. The technique of (k-fold) cross-validation may offer a way to increase the performance, as the number of training examples is rather low compared to the number of input parameters. As mentioned in the introduction, the input parameters (or risk factors) of our network are chosen using classical epidemiological insights. An alternative would be a full investigation (e.g. Principal Component Analysis) of the complete dataset to see whether the same risk factors are obtained. If such a study were to reveal new risk factors, their predictive value could be tested with ANNs. Another possibility would be to decrease the number of input parameters and to study the effect on the network. Further research can lead to a network with two output nodes: one that predicts the occurrence of total events and another that determines the time to this event.
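As a sketch of the suggested (k-fold) cross-validation, the snippet below uses scikit-learn with a small multilayer perceptron as a stand-in for the JavaNNS network; the feature matrix and labels are placeholders for the patient data.

```python
# Illustrative k-fold cross-validation: every patient is used for validation
# exactly once, which matters when the dataset is small. The classifier and
# data are stand-ins, not the authors' JavaNNS network or clinical records.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)
X = rng.random((200, 6))                 # placeholder: 200 patients, 6 risk factors
y = (rng.random(200) > 0.7).astype(int)  # placeholder event labels

sensitivities = []
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, val_idx in kfold.split(X, y):
    clf = MLPClassifier(hidden_layer_sizes=(5, 2), max_iter=1500, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    sensitivities.append(recall_score(y[val_idx], clf.predict(X[val_idx])))

print("mean sensitivity over 5 folds:", np.mean(sensitivities))
```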

References
[1] R.B. D'Agostino, M.W. Russell and D.M. Huse, Primary and subsequent coronary risk appraisal: new results from the Framingham study, American Heart Journal 139 272-281 (2000).
[2] R. Rojas: Neural networks: a systematic introduction. Springer-Verlag, Berlin Heidelberg, 1996.
[3] Java Neural Network Simulator, User Manual, Version 1.1, Fischer, Hennecke, Sannes and Zell, University of Tübingen.
[4] R.V. Freeman, K.A. Eagle and E.R. Bates, Comparison of artificial neural networks with logistic regression in prediction of in-hospital death after percutaneous transluminal coronary angioplasty, American Heart Journal 140 511-520 (2000).

Lecture Series on Computer and Computational Sciences Volume 1, 2004, pp. 1148-1152

VSP International Science Publishers P.O. Box 346, 3700 AH Zeist The Netherlands

A Short Survey of Multi-Agent System Architectures

Mingwei Chen, Ilias Petrounias 1
Department of Computation, UMIST, PO Box 88, Manchester M60 1QD, UK
1 E-mail: [email protected]

Received 3 September, 2004; accepted in revised form 4 September, 2004

Abstract: This paper presents a review of multi-agent system architectures. The importance of these architectures is discussed first and a set of functional and non-functional criteria is identified. Different architectures from the literature are presented and evaluated against this set of criteria.

Keywords: multi-agent systems, intelligent systems, survey, architectures

1. Introduction

First, we clarify the meaning of the terminology used in MAS research [2]:

Agent Architectures analyze agents as independent reactive/proactive entities. Agent architectures conceptualize agents as being made of perception, action, and reasoning components [3]. Agent architectures concern how individual agents are constructed internally.

MAS Infrastructure: An infrastructure is a technical and social substrate that stabilizes and enables instrumental (domain-centric, intentional) activity in a given domain [1]. We can generalize MAS infrastructure as composed of a series of regulations, specifications, standardizations, implemented system components and agent services that serve to solve the "typical, costly, commonly-accepted" [1], non-domain-specific issues in most MASs.

MAS Development Frameworks are building toolkits for constructing MASs. They should provide support for basic agent communication and for higher-level agent interaction needs, agent management, security and coordination services.

Multi-Agent System Architectures analyze agents as interacting service provider/consumer entities. Architectures facilitate agent operations and interactions under environmental constraints, and allow them to take advantage of available services and facilities [3].

1.2 Criteria for MAS architectures

On the functional side, multi-agent system architectures are responsible for providing the typical, commonly-used, non-domain-specific functionalities, also called basic agent services. The lack of basic agent services in development frameworks is the major obstacle to the long-expected prosperity of multi-agent applications. Examples of such services, as shown in Fig. 1, include facilitation, coordination, message transport, agent management, security, etc. The criteria for good MAS architectures are classified into two categories: functional and non-functional. Functional criteria: facilitation capabilities, coordination capabilities, agent management capabilities, security services. Non-functional criteria: modelling capabilities, naming & addressing dependency, performance influence, socialization capabilities, extensibility, scalability. The functional criteria consist of the basic agent services that an MAS architecture is expected to offer or is capable of incorporating. Here the following questions should always be asked: which agent services specified by the particular criterion does the system architecture provide? How good are they in terms of effectiveness and efficiency? What are the pros and cons of the way they are incorporated or implemented?

1.3 Classification of MAS architectures

We classify MAS architectures into four categories according to the coordination (especially facilitation) approaches they use: Federation, Matchmaker, Shared Data Space Coordination Models, and Organisational Structuring Architectures. The reasons for choosing coordination approaches as the classifying criterion are twofold. Coordination services (especially facilitation) are the most basic, indispensable functionalities in an MAS architecture: any MAS architecture has to provide at least facilitation services in one way or another. Furthermore, the coordination model of a system architecture effectively determines the fundamental organisational structure of any MAS based on that architecture. It must be noted that this classification is not disjoint.

Figure 1: Basic agent services offered by an MAS architecture (facilitation, coordination, message transport, agent management, security).
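To make the facilitation role described above concrete, the following minimal sketch (purely hypothetical, not drawn from any of the surveyed architectures) shows a matchmaker-style registry: provider agents advertise services and requesters query for matching providers, which they then contact directly.

```python
# Minimal, hypothetical sketch of a facilitation (matchmaker) service.
# All names are illustrative and not taken from any surveyed system.
from collections import defaultdict

class Matchmaker:
    def __init__(self):
        self._registry = defaultdict(set)   # service name -> set of agent names

    def register(self, agent, service):
        """A provider agent advertises a service it can perform."""
        self._registry[service].add(agent)

    def find(self, service):
        """A requester asks which agents currently provide a service."""
        return sorted(self._registry[service])

mm = Matchmaker()
mm.register("agent-A", "route-planning")
mm.register("agent-B", "route-planning")
mm.register("agent-C", "weather-forecast")
print(mm.find("route-planning"))   # ['agent-A', 'agent-B']
# Unlike a federation/broker, the matchmaker only returns candidate providers;
# the requester then interacts with a chosen provider directly.
```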