
What Can We Learn from Our Experience with Parallel Computation up to Now?

Jack T. Schwartz

Jack T. Schwartz is Professor of Mathematics and Computer Science at New York University's Courant Institute of Mathematical Sciences. Trained as a mathematician at Yale University, he has worked in a variety of subjects, including functional analysis, mathematical economics, computer design, parallel architectures, programming-language design, and robotics. Currently, he is interested in the design of interactive computer systems and interfaces.

In this talk I wish to raise a few questions rather than try to answer any. My first question is, what does the way in which parallel computers are currently being used tell us about how they will be used in the future? In this connection, I would like to distinguish among a number of computational paradigms into which particular problems might fit.

The most successful paradigm so far has been the SIMD paradigm, which exists in a number of versions, all of them still evolving. It is worth distinguishing between two types of SIMD computation: the "lockstep" kind (which the hardware forces on you if it is vector hardware or some other form of centrally driven hardware) and the "relaxed SIMD" paradigm (which you would have on a machine that is basically SIMD, at least in its software structure, but that permits independent branching within a DOALL loop). The latter sort of machine will handle complex loops with subroutine calls much more comfortably than a lockstep machine, but in both cases the SIMD software organization will leave one still thinking of problems in a SIMD mode.
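
To make the distinction concrete, here is a minimal sketch of such a DOALL-style loop. It is not from the talk itself; C with OpenMP is assumed purely as an illustration vehicle. The iterations are independent, but each may take its own branch, which strict lockstep vector hardware cannot accommodate.

    /* "Relaxed SIMD": a DOALL loop with independent branching per iteration.
       Compile, for example, with: cc -fopenmp -O2 doall.c -lm            */
    #include <math.h>
    #include <stdio.h>

    #define N 1000000

    int main(void) {
        static double x[N], y[N];        /* static: too large for the stack */
        for (int i = 0; i < N; i++)
            x[i] = (double)i / N;

        /* Iterations are independent, but each may branch differently
           or call a subroutine -- the "relaxed" part of relaxed SIMD. */
        #pragma omp parallel for
        for (int i = 0; i < N; i++) {
            if (x[i] < 0.5)
                y[i] = sin(x[i]);        /* one branch */
            else
                y[i] = exp(-x[i]);       /* another branch, different cost */
        }

        printf("y[N-1] = %f\n", y[N - 1]);
        return 0;
    }

Compiled without OpenMP support, the pragma is simply ignored and the loop runs serially, which is one reason this software organization still leaves one thinking in a SIMD mode.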

Along the spectrum that runs from "relaxed SIMD" to a true MIMD paradigm, further distinctions arise among a number of types of MIMD calculation. The first of these paradigms is the Monte Carlo class of MIMD calculations, which represents a particularly advantageous pole within the MIMD world. In Monte Carlo, one uses what are essentially independent complex processes, which need to interact only when their results are combined at the end of what may be a very long calculation. One generates almost independent parallel computations at arbitrary scale, the ideal case for a MIMD machine.
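
As a sketch of the Monte Carlo case, again in C with OpenMP and purely illustrative, the following toy program estimates pi: each thread runs its own stream of random samples from its own seed, and the streams interact only in the single reduction that combines the counts at the very end.

    /* Monte Carlo paradigm: independent streams, combined only at the end.
       Requires OpenMP (compile with -fopenmp) for omp_get_thread_num().  */
    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        const long samples_per_thread = 1000000;
        long inside = 0, total = 0;

        #pragma omp parallel reduction(+:inside) reduction(+:total)
        {
            /* a simple per-thread linear congruential generator */
            unsigned long long s = 1234ULL + (unsigned long long)omp_get_thread_num();
            for (long i = 0; i < samples_per_thread; i++) {
                s = s * 6364136223846793005ULL + 1442695040888963407ULL;
                double x = (double)(s >> 11) / (double)(1ULL << 53);
                s = s * 6364136223846793005ULL + 1442695040888963407ULL;
                double y = (double)(s >> 11) / (double)(1ULL << 53);
                if (x * x + y * y <= 1.0)
                    inside++;
                total++;
            }
        }

        /* the only interaction between the streams is this final combination */
        printf("pi is roughly %f\n", 4.0 * (double)inside / (double)total);
        return 0;
    }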

A second class of MIMD computations, which requires more communication, is typified by chaotic relaxation schemes. Here communication is required, but the numerical method used does not always have to have absolutely up-to-date information about what's going on at the neighbors of a given data point, as long as the available information is updated often. The broad applicability of this general idea, i.e., the fact that it can be expected to lead to stable and convergent calculations, is reflected in the familiar fact that I need read a newspaper only once a day instead of every minute and nevertheless can feel this keeps me reasonably current about what's happening in the world. This remark on communication within an algorithmic process reflects a very fundamental point: communication latencies within parallel systems are likely to increase, because it may simply not be feasible, as machines get larger and larger, to provide absolutely current remote information to all their processors or to keep caches fully current. Indeed, this consideration may define a whole class of computations that theorists haven't considered in depth but that deserve attention: computations that simply ignore the noncurrency of data but work out well anyhow.
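
The following sketch, also hypothetical C/OpenMP rather than anything from the talk, shows the chaotic-relaxation idea on a one-dimensional grid: there is no barrier between sweeps, so a thread may read neighbor values that are one or more sweeps out of date. The races on the shared array are left in deliberately, since tolerating them is the point being illustrated; a careful implementation would read and write the grid with relaxed atomics.

    /* Chaotic (asynchronous) relaxation on a 1-D Laplace problem.
       With "nowait" there is no barrier between sweeps: threads may see
       slightly stale neighbor values, yet the iteration still converges. */
    #include <stdio.h>

    #define N      256
    #define SWEEPS 200000

    double u[N];                         /* shared grid */

    int main(void) {
        u[0] = 0.0;                      /* fixed boundary values */
        u[N - 1] = 1.0;
        for (int i = 1; i < N - 1; i++)
            u[i] = 0.5;                  /* arbitrary interior start */

        #pragma omp parallel
        {
            for (int s = 0; s < SWEEPS; s++) {
                /* Races on u[] are deliberate: they model working with
                   noncurrent data, the essence of chaotic relaxation.   */
                #pragma omp for nowait
                for (int i = 1; i < N - 1; i++)
                    u[i] = 0.5 * (u[i - 1] + u[i + 1]);
            }
        }

        /* exact limit is u[i] = i/(N-1); at i = N/4 that is about 0.251 */
        printf("u[N/4] = %f\n", u[N / 4]);
        return 0;
    }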

Another class of algorithms that has stood very much in the forefront of the thinking of people who begin from the MIMD end of the parallel computer design spectrum is the "workpile" kind of computation. Let me distinguish between two types of such algorithms. The first is the class of "compute-driven" workpile computations, as typified by Lisp theorem-prover searches. Such searches find more and more branches to explore and simply throw those branches onto a pile; the branches are subsequently picked up by some processor whose prior exploration has ended. The second is the class of "data-driven" workpile algorithms, typified by a large commercial database system in which only one processor can do a particular piece of work, that processor being the one on whose disk the data required for the work in question is resident. Parallel applications of this sort suggest the use of highly parallel collections of "servers."
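
For the compute-driven case, here is a minimal sketch, assuming OpenMP tasks as the "pile": each branch of an artificial search tree is thrown onto the shared task pool and picked up by whichever thread becomes free, standing in for the theorem-prover-style searches mentioned above.

    /* Compute-driven workpile: branches become tasks on a shared pile,
       picked up by whichever thread has finished its prior exploration. */
    #include <stdio.h>

    static long leaves = 0;              /* count of "results" found */

    static void explore(int depth) {
        if (depth == 0) {
            #pragma omp atomic
            leaves++;
            return;
        }
        /* each branch is thrown onto the workpile as a new task */
        #pragma omp task
        explore(depth - 1);
        #pragma omp task
        explore(depth - 1);
        #pragma omp taskwait             /* wait for this node's subtree */
    }

    int main(void) {
        #pragma omp parallel
        {
            #pragma omp single           /* one thread seeds the pile */
            explore(16);
        }
        printf("leaves explored: %ld (expected 2^16 = 65536)\n", leaves);
        return 0;
    }

The artificial binary tree and the depth of 16 are placeholders; the data-driven case would instead route each piece of work to the one server holding the relevant data, which this sketch does not attempt to show.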

I hope that presentations at this conference can help to answer the following questions, which seem to me to be strategic for the development of parallel computers and computation over the next few years:

• What are the basic paradigms of parallel computation? Will both parallel architectures and the languages that constitute the user-hardware interface evolve around those paradigms? Are there other paradigms that can be defined as broadly as those I have listed and that are equally important?

• Are all the paradigms outlined above really important? Are they all of roughly equal importance, or does our experience to date suggest that some are much more important than others? For example, there seems to have been much more success with parallel computation in the "relaxed SIMD" mode than in the Monte Carlo mode. Is this true? If so, what does it tell us about the future?

• Are all of the cases that I have listed really populated? For example, are there really any important compute-driven workpile algorithms, or is their existence a myth propagated by people who have thought about these algorithms from a theoretical point of view?

I raise these questions because I believe that they can help give some shape to the issue of where parallel computer hardware and software design is going.


