Parallel Algorithms for Numerical Linear Algebra [1st Edition] 9781483295732

This is the first in a new series of books presenting research results and developments concerning the theory and applic


English Pages 338 [319] Year 1990


Table of contents:
Advances in Parallel Computing. Page ii
Front Matter. Page iii
Copyright page. Page iv
Editorial. Page v. Manfred FEILMEIER, Gerhard R. JOUBERT, Udo SCHENDEL, Frans J. PETERS
Preface. Pages vii-viii. Henk VAN DER VORST, Paul VAN DOOREN
A quadratically convergent parallel Jacobi process for diagonally dominant matrices with distinct eigenvalues. Pages 3-16. M.H.C. PAARDEKOOPER
A Jacobi-like algorithm for computing the generalized Schur form of a regular pencil. Pages 17-36. J.-P. CHARLIER, P. VAN DOOREN
Canonical correlations and generalized SVD: applications and new algorithms. Pages 37-52. L. Magnus EWERBRING, Franklin T. LUK
From Bareiss' algorithm to the stable computation of partial correlations. Pages 53-91. Jean-Marc DELOSME, Ilse C.F. IPSEN
A recursive doubling algorithm for solution of tridiagonal systems on hypercube multiprocessors. Pages 95-108. Ömer EGECIOGLU, Cetin K. KOC, Alan J. LAUB
Least squares modifications with inverse factorizations: parallel implications. Pages 109-127. C.-T. PAN, R.J. PLEMMONS
Solution of sparse positive definite systems on a hypercube. Pages 129-156. Alan GEORGE, Michael HEATH, Joseph LIU, Esmond NG
Some aspects of parallel implementation of the finite-element method on message passing architectures. Pages 157-187. I. BABUŠKA, H.C. ELMAN
An overview of parallel algorithms for the singular value and symmetric eigenvalue problems. Pages 191-213. Michael BERRY, Ahmed SAMEH
Block reduction of matrices to condensed forms for eigenvalue computations. Pages 215-227. Jack J. DONGARRA, Danny C. SORENSEN, Sven J. HAMMARLING
Multiprocessing a sparse matrix code on the Alliant FX/8. Pages 229-239. Iain S. DUFF
Vector and parallel methods for the direct solution of Poisson's equation. Pages 241-263. Paul N. SWARZTRAUBER, Roland A. SWEET
Factoring with the quadratic sieve on large vector computers. Pages 267-278. Herman TE RIELE, Walter LIOEN, Dik WINTER
Efficient vectorizable PDE solvers. Pages 279-297. W. SCHÖNAUER, R. WEIß
Vectorizable preconditioners for elliptic difference equations in three space dimensions. Pages 299-321. O. AXELSSON, V. EIJKHOUT
Solving 3D block bidiagonal linear systems on vector computers. Pages 323-330. J.J.F.M. SCHLICHTING, H.A. VAN DER VORST


ADVANCES IN PARALLEL COMPUTING VOLUME 1

Bookseries Editors:

Manfred Feilmeier Prof. Dr. Feilmeier, Junker & Co. Institut für Wirtschafts- und Versicherungsmathematik GmbH Munich, F.R.G.

Gerhard R. Joubert (Managing Editor) Corporate CAD Centre, Philips Eindhoven, The Netherlands

Udo Schendel Institut für Mathematik III Freie Universität Berlin F.R.G.

Assistant Editor:

Frans J. Peters Corporate CAD Centre, Philips Eindhoven, The Netherlands

NORTH-HOLLAND AMSTERDAM · NEW YORK · OXFORD · TOKYO

PARALLEL ALGORITHMS FOR NUMERICAL LINEAR ALGEBRA Edited by

Henk A. VAN DER VORST Delft University of Technology The Netherlands and

Paul VAN DOOREN Philips Research Laboratory Brussels, Belgium

1990 NORTH-HOLLAND AMSTERDAM · NEW YORK · OXFORD · TOKYO

ELSEVIER SCIENCE PUBLISHERS B.V. Sara Burgerhartstraat 25 P.O. Box 211, 1000 AE Amsterdam, The Netherlands

Distributors for the United States and Canada: ELSEVIER SCIENCE PUBLISHING COMPANY INC. 655 Avenue of the Americas New York, NY 10010, U.S.A.

Library of Congress Cataloging-in-Publication Data

Parallel algorithms for numerical linear algebra / edited by Henk A. van der Vorst and Paul van Dooren. p. cm. - (Advances in parallel computing: v. 1) "Reprinted from the Journal of computational and applied mathematics, vol. 27, numbers 1 & 2 (September 1989)" - T.p. verso. Includes bibliographical references. ISBN 0 444 88621 4 1. Parallel processing (Electronic computers). 2. Algebra, Linear. 3. Numerical calculations. I. Vorst, H.A. van der, 1944- . II. Dooren, Paul van. III. Series. QA76.5.P31458 1990 004'.35'015125 -dc20 89-71001 CIP

Reprinted from the Journal of Computational and Applied Mathematics, Vol. 27, Numbers 1 & 2 (September 1989) ISBN: 0 444 88621 4 © ELSEVIER SCIENCE PUBLISHERS B.V., 1990 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science Publishers B.V./ Physical Sciences and Engineering Division, P.O. Box 211, 1000 AE Amsterdam, The Netherlands. Special regulations for readers in the U.S.A. - This publication has been registered with the Copyright Clearance Center Inc. (CCC), Salem, Massachusetts. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the U.S.A. All other copyright questions, including photocopying outside of the U.S.A., should be referred to the publisher, unless otherwise specified. No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Printed in The Netherlands


Advances in Parallel Computing

Editorial

The international journal "Parallel Computing" was launched in 1983 and the first issue appeared in 1984. The planning at that time was to publish one volume consisting of four issues annually. Original plans were to use one issue per year to publish collected papers on a special topic and/or conference proceedings. The demand for the publication of papers in the journal was, however, such that it soon became necessary to increase the number of volumes to the present four per year. In spite of this growth the sharp (and unexpected) rise in the number of papers submitted for publication resulted in the time from submission to publication becoming unacceptably long. The editors were thus forced to cancel all plans for the future publication of special issues and conference and workshop proceedings. These measures succeeded in reducing the average waiting time for the publication of papers, but left the editors of Parallel Computing with the problem of accommodating the many requests received for the publication of papers on special topics as well as conference and workshop proceedings.

Following extended discussions with the publishers of Parallel Computing, it was decided to launch a monograph series "Advances in Parallel Computing". This series will be devoted to special topics and conference or workshop proceedings in the parallel computing field. In order to maintain the high standard set by the journal Parallel Computing it was furthermore decided to have all contributions, including conference proceedings, reviewed by special editorial boards appointed for each volume.

Applications for the publication of material in "Advances in Parallel Computing" can be directed towards the Managing Editor Gerhard R. Joubert, the Assistant Editor Frans Peters, or the publishers.

Manfred FEILMEIER, Gerhard R. JOUBERT, Udo SCHENDEL
Editors-in-Chief

Frans J. PETERS
Assistant Editor


Preface

During the last decade, parallel computing has become a hot topic within computational and applied mathematics. This is, of course, heavily influenced by the fact that several parallel architectures have become commercially available, which has led to a demand for efficient parallel algorithms. Parallel architectures have been developed because early parallel research indicated it to be profitable, but in turn they have also influenced the directions in which parallel research is evolving. Nevertheless, this special field is still at an early stage and one may assume that, pending future technical and software improvements, certain forms of parallelism have not yet reached maturity. One might think specifically of dataflow-oriented algorithms or systolic-array algorithms.

For many important large-scale computing applications, production codes have already been developed, based upon carefully designed algorithms that exploit the capabilities of a particular parallel architecture. A typical example is the code used for numerical weather prediction by ECMWF-Reading, which is run in parallel on a 4-processor CRAY X-MP. The present status of parallel computing is such that a number of successful approaches to parallelism have already emerged, and are still being investigated thoroughly, while some other approaches are being researched for technologies of a more advanced nature than those presently available.

A great deal of the activity is in fact concentrated in the field of numerical linear algebra. This is only to be expected, since linear algebra is at the basis of most solution methods for which one would consider the use of a parallel computer these days: here, we think of linear algebra problems such as linear least squares, SVD, sparse linear systems, discretized PDEs, fast Poisson solvers and eigenvalue problems.

With this background in mind we considered the idea of collecting a number of representative contributions in a book on "Parallel Algorithms for Numerical Linear Algebra". To that purpose an outline of the contents was made and selected authors were invited to submit contributions. Although we asked the authors to focus on a selected topic, they were free in their presentation. Hence the book is composed intentionally of papers, rather than chapters. Right from the start we have striven for contributions in four different classes, which, in our opinion, reflect the main streams in parallel linear algebra. Indeed, all aspects of this field have been covered, from theoretically oriented analysis of algorithms to practical implementations for specific computers.

The first part contains papers on systolic array algorithms. Paardekooper analyzes a new Jacobi method for computing eigenvalues of general matrices and proves its quadratic convergence when applied to diagonally dominant matrices. Charlier and van Dooren look at new Jacobi-like methods for computing generalized eigenvalues of an arbitrary regular pencil and prove their quadratic convergence when applied to 'normal' pencils. Ewerbring and Luk develop a Jacobi-like algorithm for computing the generalized singular value decomposition and apply this to the computation of canonical correlations. Delosme and Ipsen develop systolic algorithms for the stable computation of partial correlation coefficients.


The second part looks at message-passing systems. Egecioglu, Koc and Laub look at efficient implementations of the recursive doubling algorithm for tridiagonal matrices on hypercube systems. Pan and Plemmons develop a framework for the analysis of updating and downdating techniques for recursive least squares problems in view of their use on a hypercube architecture. George, Heath, Liu and Ng discuss the role of elimination trees in the exploitation of sparsity and the identification of parallelism in the direct solution of sparse positive definite systems. Implementation aspects for the finite-element method on parallel computers with local memory and message passing are considered by Babuška and Elman.

In the third group we have papers focussing on algorithms for parallel shared-memory systems. Berry and Sameh present an overview of parallel algorithms for the singular value and symmetric eigenvalue problems for dense matrices. The reduction of a dense matrix to either tridiagonal or Hessenberg form by Householder transformations is described by Dongarra, Hammarling and Sorensen. Duff discusses several issues concerning multiprocessing codes for the direct solution of sparse linear systems. Swarztrauber and Sweet consider recent developments in the area of fast direct Poisson solvers in the context of parallel computing.

Finally, the last group contains contributions which consider the design of fast algorithms and implementations for vector supercomputers. Te Riele, Lioen and Winter show how they have factored a 92-decimal-digit number on the NEC SX-2 vector computer. Design considerations for the package FIDISOL are discussed by Schönauer and Weiß. Axelsson and Eijkhout analyze the behavior of suitable line block factorizations for solving elliptic difference equations in three space dimensions. The efficient vectorization of the solution of block bidiagonal systems is studied by Schlichting and van der Vorst.

Acknowledgements

We would like to thank all people who have contributed to the successful completion of this book: Luc Wuytack for suggesting it and giving us the opportunity to be the editors, the authors for their contributions and, last but not least, the referees for their careful reading and their constructive criticisms.

Henk VAN DER VORST
Delft University of Technology, The Netherlands

Paul VAN DOOREN
Philips Research Lab., Brussels, Belgium

October 23rd, 1989

Journal of Computational and Applied Mathematics 27 (1989) 3-16 North-Holland

A quadratically convergent parallel Jacobi process for diagonally dominant matrices with distinct eigenvalues*

M.H.C. PAARDEKOOPER
Department of Econometrics, Tilburg University, 5000 LE Tilburg, The Netherlands

Received 23 August 1988 Revised 14 November 1988

Abstract: This paper discusses a generalization for non-Hermitian matrices of the Jacobi eigenvalue process (1846). In each step $\frac{1}{2}n$ pairs of nondiagonal elements are annihilated in an almost diagonal matrix with distinct eigenvalues. We prove that the recursively constructed sequence of matrices converges to a diagonal matrix. As in the classical Jacobi method the convergence is quadratic and the process is adapted to parallel implementation on an array processor or a hypercube.

Keywords: Eigenvalues, non-Hermitian matrices, diagonal dominance, quadratic convergence, parallel algorithm, Jacobi methods, parallel processors.

1. Introduction

Some algorithms, considered unfavorable on a single-processor sequential computer, may be excellent on a distributed computing system. In the Jacobi algorithm [8] the computational capacity can be increased by exploiting its inherent parallelism. The Jacobi algorithms for the symmetric eigenvalue problem [3,8] and those for the singular-value problem [2,4] are particularly amenable to parallel implementation. This paper describes a natural, yet new, Jacobi-like process for the diagonalization of a nonnormal diagonally dominant matrix $A \in \mathbb{C}^{n \times n}$ with distinct eigenvalues. This process is quadratically convergent. In the description of the algorithm we assume, without loss of generality, $n$ to be even. In each iteration

$$A^{(k+1)} = S_k^{-1} A^{(k)} S_k, \qquad k \ge 0, \qquad (1.1)$$

$\frac{1}{2}n$ pairs of symmetrically placed off-diagonal elements are annihilated in a parallel fashion.

* This research is part of the VF-program "Parallelle Algoritmiek", THD-WI-08185-25, which has been approved by the Netherlands Ministry of Education and Sciences.
0377-0427/89/$3.50 © 1989, Elsevier Science Publishers B.V. (North-Holland)


M.H.C. Paardekooper / Parallel Jacobi process

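Each $S_k$ is a direct sum of $\frac{1}{2}n$ unimodular shears $T_{ik}$, $i = 1, \ldots, \frac{1}{2}n$:

[Equation (1.2), which displays the shear $T_{ik}$ in terms of the pivot indices $l(i,k)$ and $m(i,k)$, is illegible in the source.]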
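Why a direct sum of shears lends itself to parallel execution can be illustrated with a minimal sketch. It assumes 0-based indices, pivot pairs that are pairwise disjoint, and arbitrary 2x2 blocks with determinant 1 as stand-ins for the shears; the actual annihilators of (2.8) are not reproduced here, and the name `similarity_step` is hypothetical.

```python
def similarity_step(A, pivots, blocks):
    """Return S^{-1} A S, where S is the direct sum of 2x2 blocks.

    pivots: disjoint index pairs (l, m), 0-based (an assumption of this sketch).
    blocks: matching 2x2 blocks ((a, b), (c, d)) with determinant 1;
            these are illustrative stand-ins, not the shears of (2.8).
    """
    n = len(A)
    B = [row[:] for row in A]

    # Column update B <- A S: the block at (l, m) touches only columns l and m,
    # so blocks with disjoint pivots could be applied concurrently.
    for (l, m), ((a, b), (c, d)) in zip(pivots, blocks):
        if abs(a * d - b * c - 1) > 1e-12:
            raise ValueError("each block must have determinant 1")
        for r in range(n):
            x, y = B[r][l], B[r][m]
            B[r][l] = x * a + y * c
            B[r][m] = x * b + y * d

    # Row update B <- S^{-1} B: for determinant 1 the inverse block is
    # ((d, -b), (-c, a)); it touches only rows l and m.
    for (l, m), ((a, b), (c, d)) in zip(pivots, blocks):
        for s in range(n):
            x, y = B[l][s], B[m][s]
            B[l][s] = d * x - b * y
            B[m][s] = -c * x + a * y
    return B
```

Because every block reads and writes only its own pair of rows and columns, the loop bodies for different pivot pairs are independent within each of the two phases, which is the property a processor array or hypercube can exploit.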

In (1.2) the index pair $(i, k)$ refers to the $i$th shear in the $k$th iteration. The annihilator $T_{ik}$ (see (2.8)) acts as in the classical Jacobi method for a symmetric matrix. Evidently the annihilators are not unitary, for the matrices $A^{(k)}$ are non-Hermitian. Consequently the monotonic decrease of the Frobenius norms of the nondiagonal parts of $A^{(k)}$ cannot be guaranteed. In [13] the same lack of monotonicity caused genuine difficulties in the proof of the quadratic convergence of the Eberlein algorithm [5,11] for the algebraic eigenproblem.

An appropriate strategy for the pivots $(l(i,k), m(i,k))$ aims to annihilate each off-diagonal element exactly once in $n - 1$ successive steps. The caterpillar permutation [3] generates a pivot strategy that annihilates each off-diagonal element exactly once in a sweep of $\frac{1}{2}n(n - 1)$ shear transformations. These are performed in $n - 1$ steps. The caterpillar permutation $K$ can be read off from

    1        3 → 5 → ... → n−3 → n−1
           ↗                        ↓
         2 ← 4 ← 6 ← ... ← n−2  ←  n

For the pivots $(l(i,k), m(i,k))$ of $T_{ik}$ holds

$$1 \le l(i,k) < m(i,k) \le n, \qquad l(i,k) < l(j,k) \quad \text{for } 1 \le i < j \le \tfrac{1}{2}n. \qquad (1.3)$$

If $b^{(0)} = (1, 2, \ldots, n)^t$, then … [a case-by-case expression with the values $i + 2$, $i - 2$, $i - 1$, $i$; its conditions are illegible in the source]. With $b^{(k)} = K^k b^{(0)}$ we have $\{l(i,k), m(i,k)\}$ …
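The pivot strategy described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's definition: the map `K` below is read off the cycle 2 → 3 → 5 → ... → n−1 → n → n−2 → ... → 4 → 2 with 1 held fixed, $K$ is applied entry-wise to $b^{(k)}$, and the $i$th pivot pair of step $k$ is taken to be the $i$th pair of consecutive entries of $b^{(k)}$. The function names are hypothetical.

```python
from itertools import combinations


def caterpillar(n):
    """Return the assumed caterpillar permutation K on {1, ..., n} as a dict."""
    if n < 4 or n % 2:
        raise ValueError("n must be even and at least 4")
    K = {1: 1, 2: 3, n - 1: n}      # 1 is fixed; 2 and n-1 switch rows
    for i in range(3, n - 2, 2):    # odd 3, 5, ..., n-3 move up by two
        K[i] = i + 2
    for i in range(4, n + 1, 2):    # even 4, 6, ..., n move down by two
        K[i] = i - 2
    return K


def pivot_schedule(n):
    """Return n-1 steps, each a sorted list of n/2 disjoint pivot pairs (l, m)."""
    K = caterpillar(n)
    b = list(range(1, n + 1))       # b^(0) = (1, 2, ..., n)
    steps = []
    for _ in range(n - 1):
        pairs = [tuple(sorted((b[2 * i], b[2 * i + 1]))) for i in range(n // 2)]
        steps.append(sorted(pairs))  # ordering required by (1.3)
        b = [K[v] for v in b]        # b^(k+1) = K b^(k), entry-wise (assumed)
    return steps


def covers_every_pair_once(n):
    """Check that one sweep visits each pair l < m exactly once."""
    visited = [p for step in pivot_schedule(n) for p in step]
    return sorted(visited) == list(combinations(range(1, n + 1), 2))
```

For $n = 4$ the schedule is `[[(1, 2), (3, 4)], [(1, 3), (2, 4)], [(1, 4), (2, 3)]]`: three steps of two disjoint pairs each, covering all six off-diagonal pairs once.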

… and

$$\|T\|_2^2 = \tfrac{1}{2}\left[x + y + \left((x + y)^2 - 4\right)^{1/2}\right]. \qquad (2.13)$$

With (2.8) we get x + y − 2 = 2(| …
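Formula (2.13) can be checked numerically. The sketch below assumes that $x + y$ is the trace of $T^{*}T$, that is, the squared Frobenius norm of $T$; the definitions of $x$ and $y$ are not legible in this excerpt, but for a 2x2 matrix with $|\det T| = 1$ the two squared singular values have product 1 and sum equal to that trace, which gives exactly the expression in (2.13). The function names are hypothetical, and the sampled check covers real matrices only.

```python
import math


def unimodular_norm_sq(T):
    """Squared spectral norm of a 2x2 matrix with |det T| = 1, via (2.13)."""
    (a, b), (c, d) = T
    if abs(abs(a * d - b * c) - 1.0) > 1e-12:
        raise ValueError("expected |det T| = 1")
    # s plays the role of x + y, assumed here to be trace(T* T).
    s = sum(abs(t) ** 2 for t in (a, b, c, d))
    return 0.5 * (s + math.sqrt(max(s * s - 4.0, 0.0)))


def sampled_norm_sq(T, samples=20000):
    """Lower estimate of ||T||_2^2 for a real 2x2 matrix by sampling unit vectors."""
    (a, b), (c, d) = T
    best = 0.0
    for j in range(samples):
        theta = math.pi * j / samples
        u, v = math.cos(theta), math.sin(theta)
        best = max(best, (a * u + b * v) ** 2 + (c * u + d * v) ** 2)
    return best
```

For the shear with rows (1, 2) and (0, 1) the sum $s$ equals 6, so the formula yields $3 + 2\sqrt{2} = (1 + \sqrt{2})^2$, and the sampled maximum approaches the same value from below.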


[The formulas of the following passage are illegible in the source. It contains the statement of a result ("… then …"), a proof that begins "Proof. With (2.9) and (3.2), …", and a definition involving the pivot pairs $\{l(i,k), m(i,k)\}$.]